A machine-learning reconstruction of sea surface pCO2 in the North American Atlantic Coastal Ocean Margin from 1993 to 2021
Abstract. Insufficient spatiotemporal coverage of partial pressure of CO2 (pCO2) observations has hindered precise studies of the coastal carbon cycle along the North American Atlantic Coastal Ocean Margin (NAACOM). Earlier pCO2 products have struggled to capture the heterogeneity of regional variations and the decadal trends of pCO2 in the NAACOM. This study developed a regional reconstructed pCO2 product for the NAACOM (Reconstructed Coastal Acidification Database-pCO2, or ReCAD-NAACOM-pCO2) using a two-step approach that combines random forest regression and linear regression. The product provides monthly pCO2 data at 0.25° spatial resolution from 1993 to 2021, enabling investigation of regional spatial differences, seasonal cycles, and decadal changes in pCO2. The reconstruction was trained on Surface Ocean CO2 Atlas (SOCAT) observations as ground-truth values, with satellite-derived and reanalysis environmental variables known to control sea surface pCO2 as model inputs. The product shows high accuracy in the model training, validation, and independent test phases, which indicates that it is robust and can reconstruct pCO2 in regions or periods of the NAACOM that lack direct observations. Evaluated against all observation samples from SOCAT, the product yields a coefficient of determination of 0.83, a root-mean-square error of 18.64 µatm, and an accumulative uncertainty of 23.83 µatm. The ReCAD-NAACOM-pCO2 product resolves seasonal cycles, regional-scale variations, and decadal linear trends of pCO2 along the NAACOM, and it provides reliable pCO2 data for more precise studies of coastal carbon dynamics in the region. The dataset is publicly accessible at https://doi.org/10.5281/zenodo.11500974 (Wu et al., 2024a) and will be updated regularly.
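For readers who want to reproduce this kind of skill assessment, the two headline metrics can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names `rmse` and `r_squared` and the sample values are hypothetical, and the coefficient of determination is assumed here to be 1 - SS_res/SS_tot, which may differ from the definition used in the study (for example, a squared Pearson correlation).

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observations have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up pCO2 values in µatm, for illustration only (not SOCAT data).
observed = [350.0, 360.0, 370.0, 380.0, 390.0]
predicted = [352.0, 358.0, 371.0, 377.0, 392.0]

print(f"RMSE = {rmse(observed, predicted):.2f} µatm")
print(f"R2   = {r_squared(observed, predicted):.3f}")
```

In practice, `observed` would hold the SOCAT pCO2 values and `predicted` the reconstructed values matched to the same months and grid cells.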