Jingwei-Nutrients: A global spatiotemporal reconstruction of ocean nutrients (1965–2023) using multi-task deep learning
Abstract. Dissolved nitrate, phosphate, and silicate are fundamental drivers of marine primary productivity and the biological carbon pump. However, the development of continuous, long-term global nutrient datasets has been hindered by sparse historical observations and complex biogeochemical dynamics. Statistical interpolation methods struggle to fill large data gaps while also capturing non-linear interactions, which motivates artificial intelligence (AI) approaches that explicitly learn and exploit the underlying relationships. Nevertheless, most existing AI methods reconstruct each nutrient independently (i.e., Single-Task Learning) and therefore fail to exploit the synergies inherent in cross-nutrient stoichiometry. In this study, we present Jingwei-Nutrients, a global monthly dataset at resolution from 0 to 2000 m depth spanning 1965 to 2023, reconstructed using a Transformer-based Multi-Task Learning (MTL) framework trained on a comprehensive, quality-controlled multi-source observational database. Evaluation on the validation set yields R² values of 0.980, 0.961, and 0.983, with RMSEs of 2.21, 0.23, and 6.35 for nitrate, phosphate, and silicate, respectively. Temporal K-fold cross-validation shows that the MTL framework consistently achieves higher R² and lower RMSE than single-task models for all three nutrients, with larger accuracy gains in data-sparse early decades such as 1965–1975. The dataset reproduces global climatological patterns and seasonal cycles consistent with the World Ocean Atlas (WOA). Furthermore, independent evaluations against long-term monitoring stations (HOT and KERFIX) and GO-SHIP cruise sections (P16N, P16S, and P06E) demonstrate that it captures multi-decadal temporal trends, spatial variability, and vertical variations.
Additionally, an ensemble-based uncertainty analysis reveals interpretable spatial heterogeneities and a long-term decreasing trend in global uncertainty, which directly mirrors the historical transition from sparse early sampling to modern observing networks. This dataset fills a critical gap in historical ocean biogeochemical observations, providing a reliable, physically consistent foundation for marine biogeochemical modeling and climate change studies. The dataset is openly available at https://doi.org/10.5281/zenodo.19491198.
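As an illustration of the evaluation protocol summarized above, the following is a minimal sketch of temporal K-fold splitting together with the two skill scores. It is not the authors' implementation. The abstract does not say how the temporal folds are constructed, so the sketch assumes each fold holds out one contiguous block of years. The function names `rmse`, `r2`, and `temporal_folds` are hypothetical, and the coefficient of determination is assumed to be the skill score reported alongside RMSE.

```python
import numpy as np


def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def temporal_folds(years, k):
    """Yield (train_idx, test_idx) pairs for temporal K-fold validation.

    Assumption: each fold withholds one contiguous block of years, so
    samples from the held-out period never appear in the training set.
    """
    years = np.asarray(years)
    for block in np.array_split(np.unique(years), k):
        test_mask = np.isin(years, block)
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Usage: two synthetic samples per year over the dataset's 1965-2023 span.
sample_years = np.repeat(np.arange(1965, 2024), 2)
folds = list(temporal_folds(sample_years, k=5))
```

Holding out whole blocks of years, rather than random samples, keeps observations from the same cruise or season out of both sets at once, which is one common way to test skill in sparsely sampled decades.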