An AI-Driven Reconstruction of Global Surface Temperature with Emphasis on Refining the Antarctic Record

Ouyang, Chenxi; Li, Qingxiang; Li, Zichen; Wei, Sihao

doi:10.5194/essd-2025-717

14 Jan 2026

Status: this preprint is currently under review for the journal ESSD.

Abstract. Accurate estimates of long-term surface temperature (ST) changes are fundamental not only for assessing observed warming, but also for improving the reliability of future climate projections. However, substantial missing information in global ST datasets remains a major source of uncertainty in estimating global or regional temperature changes. Recent advances in artificial intelligence (AI) have promoted the effective application of deep learning approaches, such as image inpainting and transfer learning, in reconstructing incomplete geophysical datasets. In this study, partial convolutional neural network (PConv) models were trained using the 20CR reanalysis data and CMIP6 climate model outputs as training samples, with the aim of achieving a proper reconstruction of the global surface temperature dataset. To address differences among existing sea surface temperature (SST) datasets, we reconstruct global monthly ST fields since 1850 by merging the China global Land Surface Air Temperature (C-LSAT2.1) dataset with the Extended Reconstructed Sea Surface Temperature (ERSSTv6) dataset and the Met Office Hadley Centre's sea surface temperature (HadSST4) dataset, respectively. Although both reconstructions reliably reproduce large-scale spatial patterns and long-term variations, the merge of C-LSAT2.1 with HadSST4 exhibits greater physical consistency and is therefore adopted as our preferred reconstruction. In particular, validation against station observations indicates that the reconstructions perform well over Antarctica after 1961, where observational coverage is extremely sparse. Based on this framework, we developed the China global Artificial Intelligence Reconstructed Surface Temperature_20CR/CMIP6 (C-AIRST_R/M) datasets, providing spatially complete global monthly ST anomaly reconstructions since 1850 with a spatial resolution of 5° × 2.5°.
These datasets offer improved support for extending long-term climate records and for applications in polar climate assessment, as well as in climate monitoring, detection, and attribution studies. The C-AIRST_R/M datasets can be downloaded at https://doi.org/10.6084/m9.figshare.30663797.v1 (Ouyang et al., 2025). They are also available from http://www.gwpu.net/en/h-col-103.html (last access: 21 November 2025).
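
To illustrate the partial-convolution idea that PConv inpainting models build on, the following is a minimal pure-Python sketch of a single layer step. It is not the authors' code and not the climatereconstructionAI implementation: the function name `partial_conv2d`, the list-of-lists data layout, and the fixed averaging kernel are assumptions made for illustration (in a trained network the kernel weights are learned). The step uses the commonly cited formulation in which only observed cells contribute to each window, the result is rescaled by the ratio of window size to the number of observed cells, and the mask is updated so that any window containing at least one observation becomes valid.

```python
def partial_conv2d(field, mask, kernel, bias=0.0):
    """Apply one partial-convolution step to a 2D field with missing cells.

    field  : list of rows of floats (values at masked-out cells are ignored)
    mask   : list of rows of 0/1, where 1 marks an observed cell
    kernel : square list of rows with odd side length (illustrative weights)
    Returns (output, new_mask), both with the same shape as `field`.
    """
    n_rows, n_cols = len(field), len(field[0])
    k = len(kernel)
    half = k // 2
    window_size = k * k
    output = [[0.0] * n_cols for _ in range(n_rows)]
    new_mask = [[0] * n_cols for _ in range(n_rows)]

    for i in range(n_rows):
        for j in range(n_cols):
            weighted_sum = 0.0
            n_valid = 0
            for di in range(k):
                for dj in range(k):
                    r, c = i + di - half, j + dj - half
                    # Cells outside the grid are treated as missing.
                    if 0 <= r < n_rows and 0 <= c < n_cols and mask[r][c]:
                        weighted_sum += kernel[di][dj] * field[r][c]
                        n_valid += 1
            if n_valid > 0:
                # Rescale to compensate for the missing cells in the window.
                output[i][j] = weighted_sum * window_size / n_valid + bias
                new_mask[i][j] = 1  # window saw at least one observation
    return output, new_mask


# Hypothetical toy example: one observed anomaly in a 3 x 3 grid.
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
toy_field = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
toy_mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
filled, updated_mask = partial_conv2d(toy_field, toy_mask, mean_kernel)
```

With an averaging kernel, the single observed value is propagated to every neighbouring cell whose window contains it, and those cells are marked valid for the next layer. Stacking such steps is what lets the valid region grow into data-sparse areas.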

Received: 21 Nov 2025 – Discussion started: 14 Jan 2026

Competing interests: At least one of the (co-)authors is a member of the editorial board of Earth System Science Data.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 3724 KB)

Supplement (4784 KB)

Status: open (until 01 Apr 2026)

CC1:
'Comment on essd-2025-717', David Bromwich, 24 Jan 2026

I enjoyed reading your analysis and seeing the results of the AI-based 2-m temperature reconstruction, especially those for Antarctica, which are close to the results of Bromwich et al. (2025a). This must be because the actual observations and their spatial and temporal variations form the basis of the reconstructions.
There has to be a strong trend in time in the availability of observations away from the Antarctic Peninsula that must strongly influence your results. AWS started to be deployed in Antarctica in 1980. The annual time series plots in Fig. S1 have many errors. All need rechecking. Bonaparte Point, Briana, Cape Bird, Cape Phillips, Cape Ross, Concordia, D-47, Dome C, etc. do not start around 1960.
Specifically regarding your Antarctic reconstruction results, the paper by Bromwich et al. (2024, https://doi.org/10.1029/2024GL111907) is relevant to your discussion of the ERA5 results shown in Figs. 6 and 7. That manuscript makes the case that ERA5 warming for Antarctica prior to 1979 is much too rapid because of a cold bias in the ECMWF model that assimilation of limited satellite data over the Southern Ocean cannot overcome. Further, even after 1979, anomalous warming occurs in ERA5 along the coast near 0°E and the Filchner Ice Shelf and in Marie Byrd Land (near Siple Coast). These anomalies can be clearly seen in Fig. 7(h) and significantly contribute to the strong warming in ERA5 1979-2024 shown in Fig. 6(c).
David Bromwich Jan. 23, 2026

Citation: https://doi.org/10.5194/essd-2025-717-CC1
- AC1: 'Reply on CC1', Ouyang Chenxi, 02 Feb 2026
  
  Dear Professor David Bromwich,
  
  Thank you very much for your interest in our study, for carefully reviewing our manuscript, and for your important suggestions regarding the station records in Fig. S1 and the interpretation of the anomalous temperature trends in ERA5 over Antarctica.
  
  The early starting dates in Fig. S1 arise from the way the station time series were constructed for the purpose of defining the climatological reference period, rather than from the use of actual observations prior to station deployment. Specifically, part of the station data used in this study was taken from the GHCNm dataset, including both QCF and QFE data. The QFE records are statistically infilled using the Pairwise Homogenization Algorithm (PHA), and many stations therefore provide continuous values over the 1961–2010 period (Williams et al., 2012).
  
  We adopted GHCNm data because surface air temperature observations in Antarctica are extremely sparse, particularly before the satellite era. To maximize spatial representativeness, we deliberately incorporated all available homogenized and reconstructed information, rather than relying only on the limited raw in situ observations.
  
  Our intention in using these records was to maximize station coverage when calculating the climatological mean (1961–1990) and to provide the reconstruction model with more spatially complete initial data. Consequently, Fig. S1 displays the full station series as used in the reconstruction framework, including statistically reconstructed (QFE) values, which explains why many station series appear to start around 1960.
  
  We would like to emphasize that these early segments do not represent direct in situ observations before the actual station installation dates, but rather statistically reconstructed values. They were included deliberately as part of the input data to improve spatial representativeness, given that the reconstruction performance depends strongly on the availability of such initial fields.
  
  Fig. S1, as currently presented, may be misleading in this respect, and we thank you for pointing this out. In the revised version, we will clarify the data sources and explicitly distinguish between actual observations and statistically reconstructed values in both the figure caption and the text.
  
  We thank you for drawing our attention to the recent study by Bromwich et al. (2024), which is highly relevant to the interpretation of the ERA5 Antarctic temperature trends discussed in our manuscript. This study provides important evidence that the pronounced warming in ERA5 over Antarctica prior to 1979 is likely exaggerated, largely due to a cold bias in the ECMWF forecast model that could not be adequately constrained by the very limited observational coverage and early satellite data assimilation over the Southern Ocean. After 1979, the anomalous warming trends in ERA5 along the coastal regions near 0°E, the Filchner Ice Shelf, and Marie Byrd Land further amplify its overall temperature trend.
  
  We agree that these findings provide valuable context for understanding the limitations of ERA5 in the Antarctic region. In the revised manuscript, we will cite Bromwich et al. (2024) and expand the discussion to clarify that ERA5 temperature trends over Antarctica—particularly prior to the satellite era and in certain coastal sectors—should be interpreted with caution.
  
  We sincerely thank you again for these constructive suggestions, which have greatly contributed to improving the quality of our study.
  
  Sincerely,
  Chenxi Ouyang Feb. 2, 2026
  
  Reference:
  Bromwich, D., Ensign, A., Wang, S., and Zou, X.: Major Artifacts in ERA5 2-m Air Temperature Trends Over Antarctica Prior to and During the Modern Satellite Era, Geophys. Res. Lett., 51(21), https://doi.org/10.1029/2024GL111907, 2024.
  Williams, C. N., Menne, M. J., and Lawrimore, J. H.: Modifications to the Pairwise Homogeneity Adjustment (PHA) software to address coding errors and improve run-time efficiency, NOAA National Climatic Data Center Tech. Rep. GHCNM-12-02, 2012.
  
  Citation: https://doi.org/10.5194/essd-2025-717-AC1
CC2:
'Comment on essd-2025-717', David Bromwich, 01 Feb 2026

Continuing my commentary of Jan. 24.
A reconstruction such as yours depends on the data fed into it. Fig. S1 is the particular concern. A quick count indicates that at least half of the station records are wrongly depicted as starting around 1960. To give some related examples: the Concordia staffed station started in 2005, the Dome C AWS started in 1980, and the Dome C II AWS started in ~1995. The early AWS on the Ross Ice Shelf (Elaine, Lettau, Gill, Schwerdtfeger) did not start until ~1985.
I trust that this situation is just a plotting mistake and that these records as depicted were not used to reconstruct the near-surface temperatures.

Citation: https://doi.org/10.5194/essd-2025-717-CC2
- AC2: 'Reply on CC2', Ouyang Chenxi, 02 Feb 2026
  
  Dear Professor David Bromwich,
  
  We sincerely thank you for your additional comments and for highlighting this important issue concerning Fig. S1. We also greatly appreciate the opportunity to clarify it.
  
  The issue regarding Fig. S1 has been explained in detail in our response to CC1.
  
  Thank you again for your constructive comment, which helps us improve the clarity and transparency of our work.
  
  Sincerely,
  Chenxi Ouyang Feb. 2, 2026
  
  Citation: https://doi.org/10.5194/essd-2025-717-AC2

Supplement

https://doi.org/10.5194/essd-2025-717-supplement

Data sets

China global Artificial Intelligence Reconstructed Surface Temperature_20CR/CMIP6 (C-AIRST_R/M) Chenxi Ouyang https://doi.org/10.6084/m9.figshare.30663797.v1

Model code and software

climatereconstructionAI Naoto Inoue et al. https://github.com/FREVA-CLINT/climatereconstructionAI

Viewed

Total article views: 404 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
226	158	20	404	74	12	27

Views and downloads (calculated since 14 Jan 2026)

Month	HTML	PDF	XML	Total
Jan 2026	120	52	13	185
Feb 2026	106	106	7	219

Cumulative views and downloads (calculated since 14 Jan 2026)

Month	HTML	PDF	XML	Total
Jan 2026	120	52	13	185
Feb 2026	226	158	20	404

Viewed (geographical distribution)

Total article views: 454 (including HTML, PDF, and XML) Thereof 454 with geography defined and 0 with unknown origin.

Latest update: 28 Feb 2026

Short summary

This study developed spatially complete global monthly ST anomaly datasets for 1850–2024 with a spatial resolution of 5° × 2.5°, termed the China global Artificial Intelligence Reconstructed Surface Temperature_20CR/CMIP6 (C-AIRST_R/M) datasets, which are reconstructed independently using the 20CR-AI and CMIP6-AI schemes based on the merged C-LSAT2.1 and HadSST4.
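
Because the datasets are distributed as ST anomalies, a short sketch of how monthly anomalies can be derived from a station or grid-cell series may be useful. This is a generic illustration, not the authors' processing chain: the function names `monthly_climatology` and `to_anomalies`, the dictionary layout, and the `min_years` threshold are all hypothetical. The 1961–1990 default follows the reference period mentioned in the authors' reply to CC1.

```python
def monthly_climatology(series, start_year=1961, end_year=1990, min_years=15):
    """Mean value per calendar month over a reference period.

    series    : dict mapping (year, month) -> temperature in deg C;
                missing months are simply absent from the dict
    min_years : hypothetical threshold; months with fewer reference-period
                values than this get no climatology
    """
    sums, counts = {}, {}
    for (year, month), value in series.items():
        if start_year <= year <= end_year:
            sums[month] = sums.get(month, 0.0) + value
            counts[month] = counts.get(month, 0) + 1
    return {m: sums[m] / counts[m] for m in sums if counts[m] >= min_years}


def to_anomalies(series, climatology):
    """Subtract the matching monthly climatology from every value.

    Months without a climatology are dropped rather than guessed.
    """
    return {
        (year, month): value - climatology[month]
        for (year, month), value in series.items()
        if month in climatology
    }


# Hypothetical toy series with two reference-period Januaries.
toy_series = {(1961, 1): -10.0, (1990, 1): -12.0, (2000, 1): -9.0, (2000, 7): -30.0}
toy_clim = monthly_climatology(toy_series, min_years=2)
toy_anom = to_anomalies(toy_series, toy_clim)
```

A station with no values inside the reference period yields no climatology under this scheme and therefore no anomalies, which is one reason reference-period coverage matters in data-sparse regions such as Antarctica.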

