the Creative Commons Attribution 4.0 License.
A surface ocean pCO2 product with improved representation of interannual variability using a vision transformer-based model
Abstract. The ocean plays a crucial role in regulating the global carbon cycle and mitigating climate change, with the spatial distribution and temporal variations of ocean surface partial pressure of CO2 (spCO2) directly determining the air-sea CO2 flux. However, constructing a global spCO2 data product that is able to resolve interannual and decadal variability remains a challenge due to the spatial sparsity and temporal discontinuity of observational data. This study presents an approach based on the Vision Transformer (ViT) model, combining high-quality observational data from the Surface Ocean CO2 Atlas (SOCAT) with results from multiple advanced global ocean biogeochemical models to reconstruct a global monthly spCO2 dataset (SJTU-AViT) at 1° resolution from 1982 to 2023. The approach employs the self-attention mechanism of the ViT model to enhance the modeling of the spatial and temporal variations of spCO2, and incorporates physical-biogeochemical constraints from the derivative of spCO2 with respect to key controlling factors as additional features. The incorporation of advanced ocean biogeochemical models during the training process allows the ViT-based model to capture more accurate spCO2 variability in data-sparse regions. Evaluations demonstrate that the new data product effectively captures spCO2 variability at both global and regional scales, showing good consistency with SOCAT observations, long-term ocean station data, and global atmospheric CO2 trends. The reconstructed spCO2 demonstrates strong capability in reproducing spCO2 anomalies during El Niño-Southern Oscillation (ENSO) events, particularly in the eastern Pacific Ocean, where it shows a correlation of 0.81 with the Niño 3.4 index and demonstrates high consistency with cruise data. Based on the SJTU-AViT dataset, the estimated global air-sea CO2 flux patterns are consistent with known regional features such as strong uptake in the Southern Ocean and outgassing in the tropical Pacific. 
This study not only provide a new 42-year data product for advancing understanding of the ocean carbon cycle and global carbon budget assessments, but also introduces a new Transformer-based deep learning framework for Earth system data reconstruction. The data product is publicly accessible at https://doi.org/10.5281/zenodo.15331978 (Zhang et al., 2025) and will be updated regularly.
Status: final response (author comments only)
RC1: 'Comment on essd-2025-286', Anonymous Referee #1, 07 Sep 2025
This manuscript introduces a novel machine learning framework (SJTU-AViT) for reconstructing global sea surface pCO₂ at 1°×1° monthly resolution over the period 1982–2023. By incorporating physical–biogeochemical constraints as derived features, the approach enhances the quality of ocean carbon data reconstruction. The evaluation is comprehensive, covering mean states, seasonal cycles, and interannual variability, and shows strong skill in reproducing ENSO-related signals. This study makes a substantial contribution by providing a valuable new ocean carbon data product for the ocean carbon community and a useful machine learning framework in the field of ocean data reconstruction. The subject is highly relevant to the scope of Earth System Science Data. However, I have several general and specific comments and suggestions that should be addressed before the manuscript can be considered for publication.
General comments
- The Methods section (Model training & testing) should more clearly describe how the data were split into training and testing sets, along with the sample size distribution. This information is essential for evaluating the model’s generalization ability. The authors should specify whether the split was random, temporal, or spatial (e.g., by cruise lines or fixed stations). They should also report the number or proportion of samples in each subset, ideally stratified by time (e.g., decades) and/or region. Such details would improve transparency and reproducibility.
- The manuscript fills gaps in SOCAT observations using long-term trends and seasonal cycles from SJTU-AViT, followed by residual analysis to assess interannual variability. This procedure raises concerns about the lack of independence between the model and the validation data, since part of the evaluation relies on model-derived estimates. The authors should clarify and quantify the impact of this approach. For instance, they could limit the analysis to grid points or stations with continuous records, or apply long-term trends and seasonal cycles from an independent product to compare robustness. Demonstrating consistent results across methods would enhance the credibility of the conclusions.
- To help readers better understand the implementation of the spCO2 data reconstruction, I recommend adding a schematic figure in the main text that illustrates the reconstruction process based on the ViT model. Such a figure would improve both the readability of the manuscript and the clarity of the methodology.
- It is recommended that the authors include skill distribution tables in the Supplement, stratified by ocean basin and latitude band. These tables should report, for each group, the sample size (N), R², RMSE, MAE, and MBE. Such quantitative evidence would support the statement that “biases are larger at high latitudes” and clearly demonstrate regional and latitudinal variations in model performance.
- Regarding the independent test sites, it is recommended to provide a clear description in the main text along with detailed information. In the appendix, the nine observation stations used for independent testing should be listed, including their names, geographic locations, observation periods, and the number of samples at each site. Since the BAT site does not have direct pCO2 observations, please clarify the method used to calculate its monthly mean pCO2 and specify the data sources for all sites.
- The manuscript states that SJTU-AViT outputs were interpolated to the spatiotemporal locations of SOCAT for comparison, but it does not specify the interpolation method used (e.g., bilinear, nearest neighbor, or other), nor whether any temporal or spatial smoothing was applied. The authors should provide these details. For example, “When comparing with SOCAT, model values were interpolated to observation locations using bilinear interpolation in space and linear interpolation in time.”
- The temporal coverage of Chl-a spans 1997–2022, whereas the product extends from 1982 to 2023. It is recommended to clarify how the periods prior to 1997 and for 2023 were handled (e.g., climatology, interpolation, gap-filling, or inference from other variables) to avoid any misunderstanding that the time spans are fully consistent.
- The manuscript employs a 2° monthly climatological MLD (WOCE). It is recommended to explain why a climatological mean was used instead of incorporating interannual and monthly variability, as this choice may affect the representation of temporal dynamics.
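To illustrate the stratified skill table requested in the general comments above, the following is a minimal sketch, not the authors' evaluation code. It assumes paired observed and reconstructed spCO2 values tagged with latitude. The function names, the 30° band width, and the use of the squared Pearson correlation for R² are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def skill_metrics(obs, pred):
    """Return N, R2, RMSE, MAE and MBE for paired observed/predicted values."""
    n = len(obs)
    errors = [p - o for o, p in zip(obs, pred)]
    mbe = sum(errors) / n                      # mean bias error (predicted minus observed)
    mae = sum(abs(e) for e in errors) / n      # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    # Assumption: R2 is the squared Pearson correlation; undefined for constant series.
    r2 = (cov * cov) / (var_o * var_p) if var_o > 0 and var_p > 0 else float("nan")
    return {"N": n, "R2": r2, "RMSE": rmse, "MAE": mae, "MBE": mbe}

def latitude_band(lat, width=30):
    """Hypothetical banding: label a latitude by its `width`-degree band."""
    # Clamp so that lat = 90 falls in the northernmost band.
    south = min(math.floor(lat / width) * width, 90 - width)
    return f"{south:+d} to {south + width:+d}"

def stratified_skill(samples, width=30):
    """samples: iterable of (lat, observed, predicted) tuples."""
    groups = defaultdict(lambda: ([], []))
    for lat, o, p in samples:
        obs, pred = groups[latitude_band(lat, width)]
        obs.append(o)
        pred.append(p)
    return {band: skill_metrics(o, p) for band, (o, p) in sorted(groups.items())}
```

The same grouping could be keyed on an ocean-basin mask instead of latitude. Reporting N alongside each statistic matters because high-latitude bands typically contain far fewer samples.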
Specific comments:
- Please clarify whether the input data were standardized during the model training process, and specify the method used (e.g., variable-wise mean–variance normalization, min–max scaling, or other approaches).
- Please clearly indicate the flux sign convention in the caption of Figure 13, for example: “Negative = ocean uptake (sink), Positive = release to the atmosphere (source).” Ensure that this convention is consistent with the main text, equations, and color bar.
- In Supplement Figure S6, the legend is labeled “spCO2”, which should be “spCO₂”. Please ensure consistency of the symbol and formatting throughout the manuscript (e.g., uniformly using the subscript form “spCO₂” instead of “spCO2”) and apply the same convention across all figures, captions, and text. In addition, please indicate the appropriate units (e.g., μatm) where relevant to avoid confusion.
- Line 113 — abbreviation usage: The term “Sea surface salinity (SSS)” repeats a definition already given earlier. Please use the abbreviation SSS here. A full-text check is recommended to correct similar inconsistencies.
- Regarding xCO2 (MBL), please clarify how the meridional band product was mapped onto the 1° × 1° grid (e.g., through band replication, interpolation, or another approach). Providing this detail would improve the transparency of the data processing procedure.
- In the Methods section, please specify the training setup, including the maximum number of epochs and/or the early stopping patience (e.g., “trained for up to 200 epochs with early stopping, patience = 20”), to improve the reproducibility of the approach.
- It is recommended to indicate the sample size for each data point or category in Figure 3, allowing readers to more clearly understand the data coverage and the reliability of the statistics.
- It is recommended to review the entire manuscript and ensure that all instances of “CO2” use a subscript for the number 2, maintaining consistency and adhering to scientific writing conventions.
- Line 31: In the abstract, change “This study not only provide…” to “This study not only provides…”. It is recommended to review the entire manuscript for grammatical errors.
- Line 31: In the abstract, change “Earth system” to “Earth-system” when used as a compound adjective, for clarity.
- Line 170: It is recommended to revise the sentence to: “The ViT-based model contains approximately 115 million parameters and was trained in parallel on eight NVIDIA RTX 4090 GPUs; each training epoch required roughly 10 minutes.”
- Line 260: It is recommended to revise the sentence to: “Most predicted values lie close to the 1:1 line, particularly within the climatologically common spCO2 range (300–420 µatm), as indicated by the high-density regions in Fig. 2.”
- Line 312: Ensure there is a space before “µatm,” e.g., “-12 µatm to +10 µatm.”
- It is recommended to standardize the number of decimal places throughout the manuscript (e.g., consistently using two or three decimal places).
- For the air–sea flux calculation, the parameterization of Wanninkhof (2014) requires the Schmidt number, wind speed source, and resolution (you used ERA5). It is recommended to specify in Section 2.4 the temporal and spatial resolution of ERA5 and the formula or reference used for computing the Schmidt number.
- Table 1: It is recommended to change “12 month” to “12-month.”
- In the Abstract, it is stated that the model shows a correlation of 0.81 with the Niño 3.4 index, whereas Section 3.4 reports a correlation of –0.81. This inconsistency in the sign of the correlation may confuse readers. Please verify the original calculation and ensure that the values and their signs are reported consistently throughout the manuscript.
Citation: https://doi.org/10.5194/essd-2025-286-RC1
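For readers following the comment on the air–sea flux calculation, the quantities it mentions can be sketched as below. This is an illustrative sketch under stated assumptions, not the manuscript's implementation. The Schmidt number polynomial and the coefficient 0.251 are quoted from memory of Wanninkhof (2014) for CO2 in seawater and should be checked against that paper. The CO2 solubility `k0` is supplied by the caller rather than computed. The function names, the 365-day year, and the sign convention (positive = outgassing) are assumptions.

```python
def schmidt_number_co2(sst_c):
    """Schmidt number of CO2 in seawater as a 4th-order polynomial in SST (deg C).

    Coefficients as recalled from Wanninkhof (2014); verify before use.
    """
    t = sst_c
    return (2116.8 - 136.25 * t + 4.7353 * t**2
            - 0.092307 * t**3 + 0.0007555 * t**4)

def gas_transfer_velocity(wind_speed, sst_c):
    """Gas transfer velocity k in cm/h: k = 0.251 * U^2 * (Sc / 660)^-0.5.

    wind_speed is the 10 m wind in m/s. The relation is defined on the mean of
    U^2, so squaring a monthly-mean wind underestimates k; this is why the
    temporal resolution of the wind product matters.
    """
    sc = schmidt_number_co2(sst_c)
    return 0.251 * wind_speed**2 * (sc / 660.0) ** -0.5

def air_sea_flux(wind_speed, sst_c, k0, pco2_sea, pco2_air):
    """Bulk flux in mol m^-2 yr^-1; positive = outgassing (assumed convention).

    k0: CO2 solubility in mol L^-1 atm^-1 (caller-supplied, e.g. from a
    solubility formulation of the caller's choice).
    pco2_sea, pco2_air: partial pressures in microatmospheres.
    """
    k_m_per_yr = gas_transfer_velocity(wind_speed, sst_c) * 0.01 * 24 * 365
    k0_mol_m3_atm = k0 * 1000.0                 # mol L^-1 atm^-1 -> mol m^-3 atm^-1
    delta_atm = (pco2_sea - pco2_air) * 1e-6    # microatm -> atm
    return k_m_per_yr * k0_mol_m3_atm * delta_atm
```

With this convention a negative spCO2 bias lowers `pco2_sea - pco2_air` and therefore shifts the flux toward stronger ocean uptake.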
RC2: 'Comment on essd-2025-286', Anonymous Referee #2, 07 Sep 2025
Zhang et al. present a global monthly surface ocean pCO2 dataset (SJTU-AViT) and corresponding air-sea CO2 fluxes spanning 1982-2023 at 1° resolution, developed using a Vision Transformer-based deep learning model. The approach combines SOCAT observations and observational climate data with multiple ocean biogeochemical models and incorporates physical-biogeochemical constraints. The authors show that their product successfully captures the spatial and temporal variations of observed pCO2 patterns, from seasonal cycles to interannual variability. The product shows more realistic small-scale spatial variability and temporal interannual variability than previous pCO2 products. The resolved air-sea CO2 fluxes agree with other estimates based on pCO2 observations. The paper is well written, the methodology is robust, and the line of thought is mostly clear to me. I only have minor comments regarding some of the technical details and presentation.
Main comments:
- The description of the methodology is overall complete. However, certain technical details are still missing. It is not clear how pre-training on CMIP6 models contributes to the final model. It is not clear what the fine-tuning on MOM6 really does. Are your results sensitive to the choice of CMIP6 models and the fine-tuning? How do SOCAT data fold into your refinement? For the physical-biogeochemical constraints, are you only using what is derived from MOM6, or also from the CMIP6 models? How are your results, particularly on the seasonal cycle, impacted by these physical-biogeochemical constraints? In other words, if you exclude these constraints, how is the representation of the seasonal pCO2 cycle affected?
- The uncertainty quantification might benefit from more detail. For u_map, what if there are no observations in a grid cell? How do you then quantify u_map there? Have you conducted an analysis on the spatial heterogeneity of the dominant source of uncertainty? In addition, I think it would be more appropriate to replace u_map with "algorithm uncertainty." Perhaps this can be done by generating a large ensemble of spCO2 reconstructions. Alternatively, this can be done by using synthetic data. You might consider subsampling one of your ocean models at the SOCAT observation locations and then applying the ML model to the subsampled model fields to generate an spCO2 map. Then you can compare the absolute differences between pCO2 from the ocean model and the ML reconstruction.
Minor comments:
L15-16: The statement that the ocean surface partial pressure of CO2 (spCO2) directly determines the air-sea CO2 flux is not exactly correct. The flux is set by the air-sea pCO2 difference, modulated by surface wind speed and the gas exchange velocity.
Introduction: Perhaps it is also worth mentioning that previous ML-interpolation of pCO2 overly smooths the spatial patterns and interannual variability.
L195: Is the interpolation based on inverse distance weighted average? How do you deal with the fine-resolution time (i.e., not monthly average)?
Figure 3: Systematic biases are clear at Iceland and Irminger, with SJTU-AViT underestimating the pCO2. Any clues why?
Figure 5: The negative bias would lead to an overestimation of global ocean CO2 uptake through the bulk equation. Might be worth mentioning when you talk about the flux.
Fig. 6b: Seems like the bias PDF is wider in certain years. Speculation?
L369-372: The section title is on the seasonal cycle, but the first few sentences focus on variability at all time scales. Might consider moving this to a later section. Also, the trend should be removed beforehand in calculating STD in Fig. 7.
L391-396: A presentation issue. The seasonal changes are, physically, attributed to these factors you mentioned. This is based on our understanding of the ocean carbon dynamics rather than being directly learned from ML output. The sentences read like you confirm these dominant factors from your model output. Might consider making it clear that these are not model results. Or, indeed, you could do factor contribution analysis.
Figure 9: I think what is missing here is to show whether the seasonal phases are consistent compared to SOCAT.
Figure 11: Linearly detrended spCO2?
L568-571: PDO-related SST patterns are already used in your training; would incorporating other indices (e.g., directly using the PDO) amount to double counting?
Citation: https://doi.org/10.5194/essd-2025-286-RC2
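The point about removing the trend before computing a standard deviation (the comments on L369-372 and Figure 11) can be made concrete with a small sketch. It assumes an evenly spaced time series and a simple least-squares linear trend. The function names are hypothetical, and the manuscript may use a different detrending method.

```python
import math

def linear_detrend(values):
    """Remove the least-squares linear trend from an evenly spaced series."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    # Subtract the fitted line; the residuals have zero mean by construction.
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(values)]

def population_std(values):
    """Population standard deviation (divides by N)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

# Synthetic example: a 2 uatm per step rise plus a +/-1 uatm oscillation.
series = [350 + 2 * t + (1 if t % 2 else -1) for t in range(10)]
raw_std = population_std(series)                        # dominated by the trend
detrended_std = population_std(linear_detrend(series))  # reflects the oscillation
```

In this synthetic case the raw standard deviation is several times larger than the detrended one, so a map of raw STD over 1982-2023 would mostly reflect the anthropogenic rise in spCO2 rather than variability.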
Data sets
A surface ocean pCO2 product with improved representation of interannual variability using a vision transformer-based model. Xueying Zhang, Enhui Liao, Wenfang Lu, Zelun Wu, Guansuo Wang, and Shiyu Liang. https://doi.org/10.5281/zenodo.15331978
Viewed
- HTML: 515
- PDF: 88
- XML: 25
- Total: 628
- Supplement: 36
- BibTeX: 15
- EndNote: 28