the Creative Commons Attribution 4.0 License.
A historical nutrient dataset (1895–2024) for the North Pacific: reconstructed from machine learning and hydrographic observations
Abstract. Nutrients play a critical role in oceanic primary productivity and the biological pump. However, compared to hydrographic parameters such as temperature and salinity, nutrient observations are limited because their measurement is labor-intensive and costly. As a result, nutrient observations are several orders of magnitude sparser than hydrographic observations. In this study, we first established a rigorous data quality control procedure to clean the hydrographic and nutrient (including NO₃⁻, NO₂⁻, DIP, and Si(OH)₄) observations collected from the World Ocean Database (WOD) and the CLIVAR and Carbon Hydrographic Data Office (CCHDO) in the North Pacific. Subsequently, the cleaned, high-quality CCHDO dataset was used to train three machine learning models – Random Forest, Light Gradient Boosting Machine (LightGBM), and Gaussian Process Regression – to establish relationships between nutrient concentrations and key variables, including spatial coordinates (longitude, latitude, and depth), time variables (year and month), and water mass properties (indexed by potential temperature and salinity). Validation shows that the reconstruction closely matches the observations, with RMSEs of <1.41, <0.071, <0.089 and <3.07 μmol kg⁻¹ for NO₃⁻, NO₂⁻, DIP, and Si(OH)₄, respectively. The validated models were then applied to reconstruct nutrient concentrations from the hydrographic observations in WOD, most of which lacked direct nutrient measurements. This resulted in ~473 million reconstructed nutrient data points across 1.92 million stations for each nutrient, spanning 1895 to 2024, a 2,127- to 2,393-fold increase over the original nutrient observations in the North Pacific (197,539 to 222,234). This new dataset will be valuable for studying nutrient variability under climate change and anthropogenic influences, and for providing transient boundary conditions in ocean biogeochemical models.
The dataset generated in this study is openly available on Zenodo at https://zenodo.org/records/17451417.
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-654', Anonymous Referee #1, 22 Dec 2025
- RC2: 'Comment on essd-2025-654', Anonymous Referee #2, 16 Jan 2026
Review Du et al.
General comments
The authors’ goal is to reconcile the current imbalance in data availability between hydrographic tracers (T and S), which are widely measured, and nutrients, which are much sparser. They (1) compile nutrient data from available datasets, (2) develop a pipeline for quality control and filtering, and (3) use the “cleaned” nutrient compilation with an ensemble of machine-learning methods to build a predictive model that infers nutrient concentrations as a function of T, S, and time. I think this is valid, well-written work and it is worthy of publication.
One criticism is that the current version of the manuscript does not clearly state the nature of the final product generated using the trained model. Initially, I thought the authors were aiming to produce a fully time-resolved climatology (e.g., monthly means for each year spanning 1895–2024 or 1973–2022). Then, I interpreted the product as being analogous to WOA23 (i.e., monthly climatological means that are not explicitly resolved through time). I then downloaded the data from Zenodo using the link included in the abstract. However, the Zenodo download page is also somewhat unclear, with acronyms that are not explained (e.g., ABP, GLD, PFL, UOR), each associated with different files. Based on a preliminary inspection of some of these files, it appears that the authors provide time-resolved predictions (potentially at ~2-day resolution from 2004 to 2024). This makes the dataset more interesting, but the authors should do a better job—both in the manuscript and on Zenodo—of clearly describing the characteristics of the product they are publishing.
In addition, I think the comparison with WOA23 should be more quantitative. The main manuscript and supplementary information include many maps from WOA23 and from the authors’ reconstructed dataset, but the comparison would be more effective if it also included maps of the differences between the two products. The text describing the advantages of the new dataset relative to WOA23 is also somewhat generic (e.g., “seasonal patterns are similar but concentration is lower”; how much lower? see lines 441–449).
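The difference maps requested above could be produced along the following lines. This is only a minimal sketch, not the authors' workflow: it assumes both products have already been interpolated to a common grid, uses small made-up fields, and the names `difference_map` and `summarize` are hypothetical. The summary is unweighted; a real comparison would weight cells by area.

```python
import math

def difference_map(recon, reference):
    """Cell-by-cell difference (recon minus reference); None where either value is missing."""
    return [
        [None if (a is None or b is None) else a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(recon, reference)
    ]

def summarize(diff):
    """Unweighted mean bias and RMS difference over all cells with data."""
    vals = [v for row in diff for v in row if v is not None]
    n = len(vals)
    return {
        "n": n,
        "mean_bias": sum(vals) / n,
        "rms_diff": math.sqrt(sum(v * v for v in vals) / n),
    }

# Hypothetical 2 x 3 surface nitrate fields (umol/kg), invented for illustration only.
recon = [[1.0, 2.0, None], [4.0, 5.0, 6.0]]
woa = [[1.5, 2.5, 3.0], [4.0, None, 7.0]]

diff = difference_map(recon, woa)
stats = summarize(diff)
print(diff)
print(stats)
```

Reporting the mean bias and RMS difference alongside each map would also answer the "how much lower?" question directly.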
Specific technical comments
- The reported RMSE values are low when compared with the concentration ranges of these nutrients, but the authors should provide additional detail on how the error varies with depth. For instance, an error of ~1.5 μmol kg⁻¹ for NO₃⁻ may be acceptable in deep waters, but it would be a very large error across much of the surface ocean. The same issue applies across different biogeochemical provinces (e.g., nutrient-rich upwelling regions versus nutrient-poor subtropical gyres). The applicability of the dataset will depend strongly on the vertical and lateral structure of the reconstruction error.
- A potential concern is that nutrient data quality and sampling density have changed substantially over time. In addition, summer observations are three times as numerous as winter observations. I think the authors should comment on whether this could introduce time-dependent biases in a fully time-resolved reconstruction. Are uncertainty and model skill explicitly evaluated by era, depth, and region?
- My understanding is that the authors train the models on nutrient data from 1973 to 2022 and then apply the trained models to salinity and temperature data from 1895–2024. The implicit assumption appears to be that the relationship between predictors (T and S, physical tracers) and targets (nutrients, biological) remains the same between 1895–1973 and 1973–present. If so, the authors should explain why they think this assumption is well-founded.
- Line 99: What is striking about the Pacific (for example, relative to the Atlantic) is its longitude range, not its latitude range.
- Lines 99–118: One aspect of the Pacific that is unique, and perhaps should be highlighted here, is that, unlike the Atlantic, it hosts all major N fluxes (including water-column denitrification and N₂ fixation).
- Lines 319–337: I think the authors have a good error-estimation strategy; however, I don’t think that the validation splits explicitly test for time-dependent changes in data quality (e.g., through era-based validation) or quantify how reconstruction error varies systematically with depth and biogeochemical province, where model skill and sampling density can differ substantially. Temporal validation is performed at Station ALOHA, but it covers a “short” time range relative to that of the hydrographic archive (1988–2021 vs 1895–2024) and only one biogeochemical province.
- Figure 9: I think this figure would benefit from a plot of the residuals (predicted minus observed).
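Several of the comments above ask for errors broken down by depth, era, and region, and for residuals (predicted minus observed). A minimal sketch of such a stratified evaluation follows. The layer edges, the year-2000 era boundary, the record field names, and the four validation records are all hypothetical choices for illustration, not values from the manuscript.

```python
import math
from collections import defaultdict

def depth_bin(depth_m):
    """Assign a depth to a layer; the edges are illustrative, not the authors' choice."""
    if depth_m < 200:
        return "0-200 m"
    if depth_m < 1000:
        return "200-1000 m"
    return ">1000 m"

def stratified_errors(records, key_fn):
    """Group residuals (predicted minus observed) by key_fn and report n, bias, and RMSE."""
    groups = defaultdict(list)
    for rec in records:
        groups[key_fn(rec)].append(rec["pred"] - rec["obs"])
    out = {}
    for key, res in groups.items():
        n = len(res)
        out[key] = {
            "n": n,
            "bias": sum(res) / n,
            "rmse": math.sqrt(sum(r * r for r in res) / n),
        }
    return out

# Invented validation records (nitrate in umol/kg), for illustration only.
records = [
    {"depth": 10, "year": 1990, "obs": 1.0, "pred": 2.0},
    {"depth": 50, "year": 2010, "obs": 2.0, "pred": 1.0},
    {"depth": 500, "year": 1990, "obs": 30.0, "pred": 31.0},
    {"depth": 2000, "year": 2010, "obs": 40.0, "pred": 40.0},
]

by_depth = stratified_errors(records, lambda r: depth_bin(r["depth"]))
by_era = stratified_errors(records, lambda r: "pre-2000" if r["year"] < 2000 else "2000 onward")
print(by_depth)
print(by_era)
```

In this toy case the surface layer has an RMSE of 1 μmol kg⁻¹ against observed values of 1–2 μmol kg⁻¹, which illustrates why a single basin-wide RMSE can hide large relative errors in nutrient-poor waters.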
Citation: https://doi.org/10.5194/essd-2025-654-RC2
Data sets
Validated temperature and salinity data, and reconstructed nutrient concentrations in the North Pacific (1895–2024) C. Du et al. https://zenodo.org/records/17451417
Viewed
- HTML: 412
- PDF: 158
- XML: 33
- Total: 603
- Supplement: 83
- BibTeX: 31
- EndNote: 39
Overall comment:
This manuscript presents a valuable contribution to the field of chemical oceanography. The authors have reconstructed a massive database of historical nutrient data points for the North Pacific, greatly expanding original observations. The rigorous four-level quality control and the use of multiple machine learning (ML) architectures make this a strong candidate for the journal Earth System Science Data.
Overall, this is a good paper. However, I have some concerns regarding the temporal extrapolation, which must be addressed to ensure the dataset's reliability for historical hindcasts. In addition, the paper would benefit from a more in-depth discussion of long-term nutrient trends, which would strengthen the utility of such a historical dataset.
I recommend minor revisions to further strengthen the methodology and discussion before publication.
Major comments:
1. Temporal extrapolation robustness: The use of three validation strategies (sample-random, station-random, and cruise-random) provides a transparent view of error, and the cruise-random approach should be the most convincing for validating spatial extrapolation. However, in terms of temporal extrapolation, the training dataset (CCHDO) spans only 1973 to 2022, yet the model is applied to reconstruct nutrients back to 1895. The inclusion of year as a predictor might introduce bias if a trend learned from 1973–2022 does not map onto the 1895–1972 era. The authors need to justify that their approach can be extrapolated not only spatially but also temporally, perhaps through some discussion of whether the water mass-nutrient relationships remained relatively stationary over the last century (but is this really true given the acceleration of anthropogenic forcing?), or validate with some "time-slices" to prove the temporal predictor is robust.
2. Missing long-term analyses: A major selling point of this paper is the temporal extent of the reconstruction (1895–2024). However, the Results section is dominated by climatological maps (Figs. 10–13), which effectively collapse the temporal dimension that the authors worked so hard to reconstruct. Providing 130 years of data without showing a single long-term trend analysis (e.g., decadal shifts in the nutricline depth, or basin-scale nutrient inventory changes) undermines the claim that this dataset is ‘historical.’ I suggest the authors add a section analyzing a long-term trend or any regime shift using their reconstructed nutrient data. This would serve as a proof of concept that the reconstruction captures low-frequency climate variability and is not just a high-resolution climatology.
3. Elaboration on potential future applications: I think the reconstructed datasets would be impactful and have broad utility, but their applications are written in a generic way. To increase the impact of this paper, I recommend expanding the discussion to explicitly list potential future applications of this dataset. Specific examples could be to use this 4D dataset to spin up ocean biogeochemical models, or investigate nutrient stoichiometric changes, etc.
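The "time-slice" validation suggested in major comment 1 can be sketched as an era-based hold-out: fit on the later years only, then score the predictions on the earlier years that the model never saw. The example below is a sketch under stated assumptions. A one-variable least-squares line stands in for the manuscript's machine learning models, the field names `theta` and `no3` are hypothetical, and the data are synthetic, built so that the early era is offset by 1 μmol kg⁻¹ from the relationship in the later era.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def time_slice_validation(records, cutoff_year):
    """Train on years >= cutoff_year, evaluate on the held-out earlier years."""
    train = [r for r in records if r["year"] >= cutoff_year]
    test = [r for r in records if r["year"] < cutoff_year]
    slope, intercept = fit_line([r["theta"] for r in train], [r["no3"] for r in train])

    def predict(r):
        return slope * r["theta"] + intercept

    return {
        "n_train": len(train),
        "n_test": len(test),
        "rmse_train": rmse([predict(r) for r in train], [r["no3"] for r in train]),
        "rmse_held_out": rmse([predict(r) for r in test], [r["no3"] for r in test]),
    }

# Synthetic records: later era follows no3 = 40 - 2 * theta; earlier era is shifted by +1.
records = [
    {"year": 2000, "theta": 2.0, "no3": 36.0},
    {"year": 2005, "theta": 10.0, "no3": 20.0},
    {"year": 2010, "theta": 20.0, "no3": 0.0},
    {"year": 1975, "theta": 5.0, "no3": 31.0},
    {"year": 1980, "theta": 15.0, "no3": 11.0},
]

result = time_slice_validation(records, cutoff_year=1990)
print(result)
```

A held-out error well above the training error, as in this constructed case, would indicate that the predictor-nutrient relationship is not stationary across eras. Repeating the split at several cutoff years would show how quickly skill degrades with temporal distance from the training period.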
Minor comments:
- Section 2.1: Oxygen is a fundamental tracer for remineralization and is physically coupled with nutrients via the Redfield ratio and AOU, but is not included in the predictors. Can the authors explain why it is not included? Is it because many datasets lack this property?
- Table 1: The salinity data count increases after quality control. Is this a typo?
- Figure 3a: The low station counts in the open ocean are hard to see. Consider plotting the colorbar on a log scale?
- Line 373: The model performance for NO₂⁻ is notably lower (R² = 0.32–0.72) than for the other nutrients. Given that NO₂⁻ is biologically dynamic, the utility of a T/S-based reconstruction is questionable. Consider removing NO₂⁻ from the primary dataset or flagging it with a high-uncertainty warning?
- Line 417: The manuscript notes that "most data points are located above 2,000 m." How should we interpret the deep data then? Do they have larger RMSE? If so, to what extent can the reconstructed deep nutrient fields be considered reliable for full-depth modeling applications?