the Creative Commons Attribution 4.0 License.
Global Monthly Ocean Dissolved Oxygen (1960–2023) Reconstructed to 5,902 m with BLENDR, a Bayesian-Optimized Ensemble Learning Framework
Abstract. Oceanic oxygen levels, crucial for marine ecosystems and biogeochemical cycles, have declined significantly over the past few decades due to climate change, posing severe environmental risks. However, historical dissolved oxygen (DO) measurements, especially below 2,000 m, remain sparse, limiting comprehensive annual and seasonal analyses. Here, we introduce the BLENDR framework (Bayesian-optimized Learning and ENsemble modeling for Data Reconstruction), a Bayesian-optimized ensemble of six machine-learning models (Random Forest, XGBoost, LightGBM, CatBoost, Extremely Randomized Trees and Histogram-based Gradient Boosting) fused via dynamic weighting, to reconstruct global monthly DO distributions at a 1° × 1° resolution from the surface to 5,902 m from 1960 to 2023. Validation against an independent dataset demonstrated that BLENDR achieves lower errors than any individual model. Our dataset captures depth-dependent deoxygenation, with the most pronounced decline occurring between 150 and 200 m, and reveals severely accelerated oxygen loss in the Arctic Ocean and North Atlantic over the past decade. This work provides the first long-term, global monthly DO product from the ocean surface to 5,902 m. The bathypelagic DO data provided in this work is a significant contribution to deep ocean oxygen dynamics and global biogeochemical cycles.
Status: open (extended)
- RC1: 'Comment on essd-2025-781', Anonymous Referee #1, 10 Mar 2026
- AC1: 'Reply on RC1', Mingyu Han, 27 Mar 2026
  The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2025-781/essd-2025-781-AC1-supplement.pdf
Data sets
Global Monthly Dissolved Oxygen Reconstruction via Bayesian Ensemble Machine Learning. Mingyu Han and Yuntao Zhou. https://doi.org/10.5281/zenodo.17548659
Viewed
- HTML: 381
- PDF: 189
- XML: 33
- Total: 603
- BibTeX: 30
- EndNote: 73
General Comments
In this manuscript, Han et al. present a global, monthly ocean dissolved oxygen (DO) dataset covering 1960 to 2023 and extending down to 5,902 m depth, produced with a Bayesian-optimized ensemble learning framework named BLENDR. Reconstructing DO in the bathypelagic zone is an ambitious and valuable contribution to the oceanographic community.
However, I have several major concerns regarding the methodology and the validation of the results. The mathematical formulation of the dynamic weighting strategy contains physical inconsistencies, and the manuscript lacks essential spatial representativeness proofs for the validation set. Furthermore, for a dataset claiming to reconstruct deep-ocean oxygen, it lacks a comparison with existing deep-ocean products. Therefore, major revisions are required to address these methodological flaws and to fully validate the reliability of the reconstructed fields before this manuscript can be considered for publication.
Main Issues
1. Lack of Intercomparison with Existing Data Products
The manuscript currently lacks a robust comparison with other widely used ocean DO data products. To establish the reliability of this new product, it is essential to contextualize its performance against existing datasets. I strongly recommend adding a comprehensive data intercomparison section to validate the BLENDR outputs. The authors should refer to and compare their results within the context of recent multi-product coordinated intercomparisons, such as the one presented by Ito et al. (2025). This will significantly enhance the credibility of your product.
Reference: Ito, T., Garcia, H. E., Wang, Z., et al. (2025). Assessing the observational uncertainties of dissolved oxygen climatology and seasonal cycle through a coordinated intercomparison project. Global Biogeochemical Cycles, 39(11), e2025GB008751.
2. Spatial-Temporal Representativeness of the Validation Set
The authors state that for each profile in GLODAPv2, they searched the CTD and OSD records for matches within ±1° and the same month, excluding those that matched. This filtered the dataset down to 8,020 profiles. However, the manuscript lacks a spatial-temporal distribution map of this filtered validation set. It is critical to prove the coverage and representativeness of these remaining 8,020 profiles. Without a distribution map showing the cruise tracks or sampling locations, readers cannot determine whether the validation set represents a global oceanic assessment or if it is merely biased toward a few localized, data-rich sub-regions. Please provide maps and temporal histograms of the validation set.
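The matching rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the array layout, and the reading of "same month" as the same year and month (encoded as `YYYYMM` integers) are all assumptions, and the coordinates are made up. The per-decade count at the end hints at the kind of temporal histogram requested.

```python
import numpy as np

def independent_mask(val_lat, val_lon, val_ym, trn_lat, trn_lon, trn_ym, tol_deg=1.0):
    """Return True for each validation profile that has no training profile
    within +/- tol_deg in latitude and longitude during the same year-month.
    All names here are hypothetical; year-month is encoded as YYYYMM."""
    val_lat, val_lon, val_ym = map(np.asarray, (val_lat, val_lon, val_ym))
    trn_lat, trn_lon, trn_ym = map(np.asarray, (trn_lat, trn_lon, trn_ym))
    keep = np.ones(val_lat.shape[0], dtype=bool)
    for k in range(val_lat.shape[0]):
        same_month = trn_ym == val_ym[k]
        dlat = np.abs(trn_lat - val_lat[k])
        # Wrap the longitude difference so that 179.5 and -179.8 count as close.
        dlon = np.abs((trn_lon - val_lon[k] + 180.0) % 360.0 - 180.0)
        if np.any(same_month & (dlat <= tol_deg) & (dlon <= tol_deg)):
            keep[k] = False
    return keep

# Made-up example: three validation profiles and three training profiles.
val_lat = np.array([0.0, 10.0, 50.0])
val_lon = np.array([0.0, 179.5, 0.0])
val_ym = np.array([199001, 199001, 199002])
trn_lat = np.array([0.5, 10.0, 50.0])
trn_lon = np.array([0.5, -179.8, 0.0])
trn_ym = np.array([199001, 199001, 199001])

keep = independent_mask(val_lat, val_lon, val_ym, trn_lat, trn_lon, trn_ym)

# Count retained profiles per decade as input for a temporal histogram.
years = val_ym // 100
decades, counts = np.unique(years[keep] // 10 * 10, return_counts=True)
```

Plotting the retained latitude and longitude pairs on a map, together with the per-decade counts, would show directly whether the remaining profiles cover the global ocean or cluster in a few regions.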
3. Potential Spatial Discontinuity in the Weight Allocation Strategy
While the dynamic weighting strategy is conceptually interesting, its current mathematical formulation may benefit from further justification. The transition mechanism between dynamic and static weights could potentially lead to spatial discontinuities. For instance, suppose grid cell A contains an observation, and the adjacent grid cell B does not. In cell A, the dynamic weight might heavily favor a specific model that perfectly fits the local observation; however, in cell B, the weight instantaneously reverts to the global average static weights (w_i) of the 6 models. This abrupt transition ("hard switch") between observed and unobserved regions might produce artificial gradients or step-changes at the boundaries, which may not fully align with the continuous nature of oceanographic variables. I recommend the authors discuss this potential limitation to ensure physical continuity.
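The cell A / cell B scenario can be illustrated with a toy calculation. The sketch below does not reproduce the manuscript's weighting equations; the prediction values, the per-model offsets, and the extreme dynamic weight vector are hypothetical and chosen only to make the step-change visible.

```python
import numpy as np

def blend(preds, weights):
    """Weighted ensemble mean; preds and weights have shape (n_cells, n_models)."""
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (preds * weights).sum(axis=1)

n_cells, n_models = 5, 6

# Hypothetical model predictions along a row of adjacent grid cells:
# a smooth background gradient plus a constant offset per model.
base = np.linspace(200.0, 204.0, n_cells)
offsets = np.array([-6.0, -3.0, -1.0, 1.0, 3.0, 6.0])
preds = base[:, None] + offsets[None, :]

# Static weights: equal global averages (assumed for illustration).
static_w = np.full(n_models, 1.0 / n_models)
# Dynamic weights: entirely favor the last model where an observation exists.
dynamic_w = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0])

# Only the middle cell contains an observation.
observed = np.array([False, False, True, False, False])

# Hard switch: dynamic weights in observed cells, static weights elsewhere.
weights = np.where(observed[:, None], dynamic_w, static_w)
field = blend(preds, weights)

# Differences between neighboring cells; the background gradient is 1 per cell.
jumps = np.diff(field)
```

In this toy setup the blended field is 200, 201, 208, 203, 204, so the cell-to-cell differences are 1, 7, -5, 1: the observed cell stands out against a background gradient of 1 per cell, which is the kind of artificial step the comment describes.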
4. Insufficient Assessment of Deep-Ocean Accuracy
A major selling point of this dataset is its extension to 5,902 m depth. However, the manuscript lacks a rigorous, depth-specific accuracy assessment for the deep ocean. Validating the entire water column collectively obscures potential biases in the bathypelagic zone. To substantiate the claims regarding deep-ocean reconstruction, I suggest conducting a direct comparison of your deep-ocean results with the DIVA-based dataset by Roach and Bindoff (2023). This comparison will help verify if your machine-learning ensemble correctly captures the subtle deep-water mass structures compared to variational analysis methods.
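A depth-specific assessment of the kind requested could be as simple as reporting bias and RMSE per depth layer rather than for the whole water column. The sketch below assumes matched pairs of reconstructed and observed DO values; the function name, the layer edges, and the sample values are illustrative only and are not taken from the manuscript.

```python
import numpy as np

def depth_binned_errors(depth_m, predicted, observed, edges_m):
    """Return (top, bottom, n, bias, rmse) for each layer [top, bottom).
    Layers without any sample report NaN for bias and RMSE."""
    depth_m, predicted, observed = map(np.asarray, (depth_m, predicted, observed))
    err = predicted - observed
    # np.digitize returns i with edges[i-1] <= depth < edges[i]; shift to 0-based.
    idx = np.digitize(depth_m, edges_m) - 1
    rows = []
    for b in range(len(edges_m) - 1):
        sel = idx == b
        if not sel.any():
            rows.append((edges_m[b], edges_m[b + 1], 0, float("nan"), float("nan")))
            continue
        bias = float(err[sel].mean())
        rmse = float(np.sqrt((err[sel] ** 2).mean()))
        rows.append((edges_m[b], edges_m[b + 1], int(sel.sum()), bias, rmse))
    return rows

# Made-up matched pairs (depth in m, DO in arbitrary consistent units).
depth = [10.0, 100.0, 1500.0, 3000.0, 5000.0]
pred = [210.0, 205.0, 150.0, 160.0, 170.0]
obs = [208.0, 207.0, 150.0, 157.0, 174.0]

# Illustrative layer edges; the last edge exceeds the deepest level (5,902 m).
edges = [0.0, 200.0, 1000.0, 2000.0, 6000.0]
rows = depth_binned_errors(depth, pred, obs, edges)
```

Reporting the sample count alongside each layer's errors also makes clear how many independent observations support the statistics below 2,000 m.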
Reference: Roach, C. J., & Bindoff, N. L. (2023). Developing a New Oxygen Atlas of the World’s Oceans Using Data Interpolating Variational Analysis. Journal of Atmospheric and Oceanic Technology.
Other comments:
Lines 34-35: The word "biogeochemical" is used multiple times within a single or adjacent sentence. Please rephrase to avoid redundancy and improve the flow of the text.
Line 87: Typographical error. There appears to be an extraneous space after the degree symbol (°). Please remove the space for proper formatting.
Lines 125-126: Typo in the dataset name. The text reads "...use this filtered GLODA v2...". Please correct this misspelling to "GLODAPv2".
Line 306 (Equation 11): The equation for is written as . Without parentheses, this mathematically implies that is subtracted after the summation of is complete. Please add parentheses to ensure mathematical correctness: .
Lines 456-457: The sentence states, "However, the expansion rates increased again beyond 1,600 m because of expansion in the NP". Using the word "expansion" twice in such close proximity makes the sentence clunky; please rephrase.
Lines 624-625 (Data Availability): While the authors provided URL links for the Argo and WOD databases, the reference link or DOI for the GLODAPv2 dataset is missing. Please add the official link or DOI for GLODAPv2 to maintain consistency and adhere to ESSD's data accessibility standards.