17 Nov 2023
Status: this preprint is currently under review for the journal ESSD.

A long-term (2000–2020) global 0.05° continuous atmospheric carbon dioxide dataset (GCXCO2) combining OCO-2 observations and model simulations based on stack learning

Xiaobin Guan, Zhihao Sun, Dong Chu, Guanglei Xie, Yuchen Wang, and Huanfeng Shen

Abstract. High-accuracy atmospheric carbon dioxide (CO2) concentration data are critical to understanding the global carbon cycle, but a high-resolution CO2 product with long-term, globally seamless coverage is still lacking. In this study, a global continuous 8-day XCO2 (column-averaged CO2 dry air mole fraction) product (GCXCO2) was reconstructed at a spatial resolution of 0.05° from 2000 to 2020, based on OCO-2 satellite data. An ensemble machine learning stacking regression model, which combines light gradient boosting machine (LGBM), extreme gradient boosting (XGB), extremely randomized trees (ETR), gradient boosting regression (GBR), and random forest (RF), was used to model the relationships between XCO2 data and auxiliary satellite, simulation, and meteorological data. A dynamic normalization strategy was developed to handle the large temporal variation in the data and to allow the prediction model to be extended in time. Multiple validation methods were applied to comprehensively evaluate the spatial and temporal generalization ability of the model and product. The 10-fold cross-validation shows an overall satisfactory result at a global scale, with R2 = 0.974 and root-mean-square error (RMSE) = 0.551 ppm (parts per million). Further spatial extension and temporal prediction experiments also showed that dependable results could be obtained in regions and time periods without valid OCO-2 satellite observations (R2 = 0.958 and R2 = 0.886, respectively). Compared with Total Carbon Column Observing Network (TCCON) ground station observations, the GCXCO2 product performs better than the model simulation data, with better accuracy and a higher spatial resolution. Based on the GCXCO2 product, an upward annual trend of approximately 2.09 ppm/year is found for global XCO2 between 2000 and 2020, and significant differences are found between the Northern and Southern hemispheres in different seasons.
This product may well be the first remote sensing-based, global, high-precision, long-term XCO2 dataset, and it will help advance the understanding of climate change and carbon balance. The dataset is freely available (Guan and Sun, 2023).
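
The accuracy and trend figures quoted in the abstract (R2, RMSE, ppm/year) can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not say which definition of R2 or which trend estimator was used, so the coefficient of determination (1 - SS_res/SS_tot) and an ordinary least-squares slope are assumptions here. The function names are hypothetical and the input values are synthetic.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the inputs (ppm for XCO2)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (ppm/year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Synthetic illustration only: these are NOT values from the GCXCO2 product.
years = list(range(2000, 2021))
annual_mean = [370.0 + 2.0 * (y - 2000) for y in years]  # made-up annual means
held_out_obs = [400.1, 401.3, 399.8, 402.0]              # made-up observations
held_out_pred = [400.4, 401.0, 400.1, 401.6]             # made-up predictions

print("trend (ppm/year):", round(linear_trend(years, annual_mean), 3))
print("RMSE (ppm):", round(rmse(held_out_obs, held_out_pred), 3))
print("R2:", round(r_squared(held_out_obs, held_out_pred), 3))
```

In a 10-fold cross-validation, `rmse` and `r_squared` would be computed on the held-out samples of each fold, using whatever definitions the paper actually specifies.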

Xiaobin Guan et al.

Status: open (until 07 Jan 2024)


Data sets

Global continuous 0.05 degree atmospheric carbon dioxide dataset (GCXCO2) based OCO-2 satellite, CAMS and CarbonTracker simulation data from 2000 to 2020 Guan Xiaobin


Total article views: 61 (including HTML, PDF, and XML)
  • HTML: 46
  • PDF: 14
  • XML: 1
  • Total: 61
  • Supplement: 2
  • BibTeX: 0
  • EndNote: 0

Latest update: 01 Dec 2023
Short summary
Although various XCO2 products exist, all are limited in spatial resolution or spatiotemporal coverage. In this study, the first global 0.05° XCO2 product (GCXCO2) covering 21 years is generated by combining OCO-2 satellite observations and model simulations. A dynamic normalization strategy is applied to improve the temporal extensibility of the stacking learning model, and the product is superior to the model simulations while showing characteristics similar to those of the OCO-2 observations.