Reconstruction of δ13CDIC in the Atlantic Ocean: A Probabilistic Machine Learning Approach for Filling Historical Data Gaps
Abstract. The stable carbon isotope composition of marine dissolved inorganic carbon (DIC), δ13CDIC, is a valuable tracer of oceanic carbon cycling. However, its observational coverage remains much sparser than that of DIC and other physical or biogeochemical variables, which limits its usefulness. Here, we reconstruct δ13CDIC in the Atlantic Ocean using a probabilistic machine learning framework, Gaussian Process Regression (GPR). We compiled data from 51 historical cruises, including a high-resolution 2023 A16N section, and applied secondary quality control via crossover analysis, retaining 37 cruises for model training, validation, and testing. The trained GPR model achieved an average bias of −0.007 ± 0.082 ‰ and an overall uncertainty of 0.11 ‰, arising from measurement (0.07 ‰), mapping (0.08 ‰), and negligible input-variable (3.77 × 10⁻¹⁴ ‰) errors. Using the GLODAPv2.2023 Atlantic dataset as predictors, the reconstruction expanded the number of acceptable δ13CDIC samples by a factor of 7.65, from 8,941 to 68,435 across the Atlantic basins. The resulting dataset markedly improves spatial resolution in longitude, latitude, and depth, and provides enhanced temporal continuity over the past four decades. Compared with the sparse original measurements, the reconstruction reduces spatial discontinuities and reveals finer vertical structures consistent with other high-resolution biogeochemical observations. This reconstructed δ13CDIC dataset provides new opportunities to resolve regional carbon cycle dynamics, validate Earth system models, refine estimates of oceanic carbon uptake, and extend climate reanalysis records. The data are publicly available on Zenodo at https://doi.org/10.5281/zenodo.16907402 (Gao et al., 2025).
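As a quick consistency check on the figures quoted above, the following minimal Python sketch recombines the three error terms and recomputes the sample expansion factor. It assumes the error components are independent and combine in quadrature (root sum of squares); the abstract does not state the combination rule, so this is an illustrative assumption rather than the authors' documented procedure. The variable names are hypothetical.

```python
import math

# Error components quoted in the abstract, in per mil.
measurement = 0.07
mapping = 0.08
input_variable = 3.77e-14

# Assumption: the components are independent and add in quadrature.
overall = math.sqrt(measurement**2 + mapping**2 + input_variable**2)
print(f"overall uncertainty: {overall:.2f} per mil")

# Sample counts quoted in the abstract.
n_measured = 8941
n_reconstructed = 68435
expansion = n_reconstructed / n_measured
print(f"expansion factor: {expansion:.2f}")
```

Under this assumption the combined value rounds to the reported 0.11 ‰, and the ratio of sample counts rounds to the reported factor of 7.65. The input-variable term is many orders of magnitude smaller than the other two and has no visible effect on the total.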