Articles | Volume 18, issue 5
https://doi.org/10.5194/essd-18-3125-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
GEOXYGEN: a global long-term dissolved oxygen dataset based on biogeochemistry-aware machine learning framework and multi-source observations
- Final revised paper (published on 12 May 2026)
- Preprint (discussion started on 01 Dec 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on essd-2025-699', Anonymous Referee #1, 06 Jan 2026
- RC2: 'Comment on essd-2025-699', Anonymous Referee #2, 27 Jan 2026
- AC1: 'Comment on essd-2025-699', Zhenguo Wang, 26 Feb 2026
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Zhenguo Wang on behalf of the Authors (26 Feb 2026)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (26 Feb 2026) by Xingchen (Tony) Wang
RR by Anonymous Referee #1 (15 Mar 2026)
RR by Anonymous Referee #2 (21 Apr 2026)
ED: Publish subject to minor revisions (review by editor) (23 Apr 2026) by Xingchen (Tony) Wang
AR by Zhenguo Wang on behalf of the Authors (24 Apr 2026)
Author's response
Author's tracked changes
Manuscript
ED: Publish as is (25 Apr 2026) by Xingchen (Tony) Wang
AR by Zhenguo Wang on behalf of the Authors (25 Apr 2026)
Manuscript
Post-review adjustments
AA – Author's adjustment | EA – Editor approval
AA by Zhenguo Wang on behalf of the Authors (08 May 2026)
Author's adjustment
Manuscript
EA: Adjustments approved (10 May 2026) by Xingchen (Tony) Wang
The authors created a 4-D global ocean dissolved oxygen atlas at 0.5° × 0.5° resolution using multiple data sources and machine learning approaches. The ensemble machine learning submodel framework used to derive this data product is interesting, and the new product shows a slight improvement over previous products. The dataset has the potential to be used by the oceanography community to assess oxygen evolution under a changing climate. My recommendation is minor revision.
General comments:
Specific Comments:
Figure 1a: I was confused by this figure. OSD, CTD and Argo are different approaches for measuring DO concentration, whereas CCHDO, GLODAP, GEOTRACES, etc. are different sources of DO data. They should not be mixed in the same figure. It is unclear whether the authors intend to show changes in sampling approaches over time or changes in data sources.
Sections 2 and 3: It would be helpful to add a flow chart showing the data cleaning and model training pipeline.
Lines 108-109: I would be more cautious about this outlier detection approach, especially in dynamic regions like ENTP, where the local DO concentration could change substantially within a 10-day window.
Figure 3 and related text: The partitioning of the global ocean into provinces is central to the machine learning method used in the manuscript, as the submodels are trained by province. However, this partitioning is not clearly justified. Some questions: 1) The partitioning does not distinguish OMZs from other regions; for example, the ETSP OMZ is included within the whole South Pacific province. 2) The authors state that the province partitioning is based on Fay and McKinley (2014), but the equatorial biomes identified in Fay and McKinley (2014) are not included. 3) This global dissolved oxygen product is 4-D, yet the province partitioning is the same for every year and does not account for shifts over time, which is also an important point in Fay and McKinley (2014). 4) How sensitive is the machine learning approach to the province selection?
Line 218: Why is longitude not included as a predictor while latitude is? Please note that the coordinate information (longitude and latitude) might also need to be transformed, as time is, to represent true geographical distances.
Figure 8: The red dashed lines in the figure are misleading; they should not cross land.
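To illustrate the coordinate-transformation point raised for Line 218, here is a minimal sketch of one common encoding: mapping longitude and latitude to a point on the unit sphere, so that the distance between encoded features tracks geographical separation. This is an assumption about what such a transform could look like, not the authors' method; the function names `lonlat_to_unit_vector` and `chord_distance` are hypothetical and introduced only for this example.

```python
import math


def lonlat_to_unit_vector(lon_deg, lat_deg):
    """Map (longitude, latitude) in degrees to (x, y, z) on the unit sphere.

    Hypothetical helper for illustration; not taken from the manuscript.
    """
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return (x, y, z)


def chord_distance(a, b):
    """Straight-line (chord) distance between two encoded points."""
    return math.dist(a, b)


# Two equatorial points on either side of the antimeridian,
# one degree of longitude apart on the real Earth.
west = lonlat_to_unit_vector(179.5, 0.0)
east = lonlat_to_unit_vector(-179.5, 0.0)

# Raw longitude treats them as 359 degrees apart.
raw_gap = abs(179.5 - (-179.5))

# The encoded features place them close together (roughly 0.0175).
encoded_gap = chord_distance(west, east)

print(f"raw longitude gap: {raw_gap} degrees")
print(f"encoded chord distance: {encoded_gap:.4f}")
```

With raw longitude as a predictor, a model sees an artificial jump at ±180°; the three encoded components remove that discontinuity and also account for meridians converging toward the poles. A sine/cosine pair applied to longitude alone is a lighter alternative that addresses only the wrap-around.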