Energy-conservation datasets of global land surface radiation and heat fluxes from 2000–2020 generated by CoSEB

Wang, Junrui; Tang, Ronglin; Liu, Meng; Li, Zhao-Liang

doi:10.5194/essd-2025-456

Preprints

https://doi.org/10.5194/essd-2025-456

13 Aug 2025

Status: a revised version of this preprint was accepted for the journal ESSD and is expected to appear here in due course.

Abstract. Accurately estimating global land surface radiation [including downward shortwave radiation (SW_IN), downward longwave radiation (LW_IN), upward shortwave radiation (SW_OUT), upward longwave radiation (LW_OUT) and net radiation (Rn)] and heat fluxes [including latent heat flux (LE), soil heat flux (G) and sensible heat flux (H)] is essential for quantifying the exchange of radiation, heat and water between the land and atmosphere under global climate change. This study presents the first energy-conservation datasets of global land surface radiation and heat fluxes from 2000 to 2020, generated by our model of Coordinated estimates of land Surface Energy Balance components (CoSEB) that was renewed with a combination of GLASS and MODIS remote sensing data, ERA5-Land reanalysis datasets, topographic data, CO₂ concentration data, and observations at 258 eddy covariance sites worldwide from AmeriFlux, FLUXNET, EuroFlux, OzFlux, ChinaFLUX and TPDC. The developed CoSEB-based datasets are strikingly advantageous in that [1] they are the first RS-based global datasets that satisfy both surface radiation balance (SW_IN - SW_OUT + LW_IN - LW_OUT = Rn) and heat balance (LE + H + G = Rn) among the eight fluxes, as demonstrated by both the radiation imbalance ratio [RIR, defined as 100 % × (SW_IN - SW_OUT + LW_IN - LW_OUT - Rn)/Rn] and energy imbalance ratio [EIR, defined as 100 % × (Rn - G - LE - H)/Rn] of 0, [2] the radiation and heat fluxes are characterized by high accuracies, where (1) the RMSEs for daily estimates of SW_IN, SW_OUT, LW_IN, LW_OUT, Rn, LE, H and G from the CoSEB-based datasets were 28.51 W/m², 10.39 W/m², 14.29 W/m², 10.62 W/m², 22.40 W/m², 24.38 W/m², 22.67 W/m² and 6.77 W/m², respectively, and those for 8-day estimates were 12.81 W/m², 7.08 W/m², 9.22 W/m², 8.34 W/m², 13.38 W/m², 19.99 W/m², 17.44 W/m² and 4.25 W/m², respectively, (2) the CoSEB-based datasets, in comparison to the mainstream products/datasets (i.e. GLASS, BESS-Rad, BESSV2.0, FLUXCOM, MOD16A2, PML_V2 and ETMonitor) that generally separately estimated subsets of the eight flux components, better agreed with the in situ observations. Our developed datasets hold significant potential for application across diverse fields such as agriculture, forestry, hydrology, meteorology, ecology, and environmental science, which can facilitate comprehensive studies on the variability, impacts, responses, adaptation strategies, and mitigation measures of global and regional land surface radiation and heat fluxes under the influences of climate change and human activities. The CoSEB-based datasets are open access and available through the National Tibetan Plateau Data Center (TPDC) at https://doi.org/10.11888/Terre.tpdc.302559 (Tang et al., 2025a) and through the Science Data Bank (ScienceDB) at https://doi.org/10.57760/sciencedb.27228 (Tang et al., 2025b).
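As a minimal sketch of how the two closure metrics in the abstract can be checked for a single set of fluxes, the Python snippet below computes RIR and EIR. It assumes that RIR is the residual of the radiation balance relative to Rn (the form under which a balanced dataset gives 0, as reported), and the flux values are hypothetical numbers chosen for illustration, not values taken from the CoSEB datasets.

```python
def imbalance_ratios(sw_in, sw_out, lw_in, lw_out, rn, le, h, g):
    """Return (RIR, EIR) in percent for one set of fluxes in W/m^2.

    RIR is written here as the radiation-balance residual over Rn,
    which is an assumption consistent with a reported value of 0.
    """
    if rn == 0:
        raise ValueError("Rn must be non-zero to form a ratio")
    rir = 100.0 * (sw_in - sw_out + lw_in - lw_out - rn) / rn
    eir = 100.0 * (rn - g - le - h) / rn
    return rir, eir


# Hypothetical daily-mean fluxes chosen so that both balances close.
balanced = imbalance_ratios(200.0, 40.0, 350.0, 400.0, 110.0, 60.0, 40.0, 10.0)

# Same fluxes with LE raised by 11 W/m^2, so the heat balance no longer closes.
unbalanced = imbalance_ratios(200.0, 40.0, 350.0, 400.0, 110.0, 71.0, 40.0, 10.0)

print("balanced   (RIR, EIR):", balanced)
print("unbalanced (RIR, EIR):", unbalanced)
```

In the second case the radiation balance still closes, so only EIR departs from zero. This is the kind of imbalance that can arise when flux components are estimated separately.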

Received: 30 Jul 2025 – Discussion started: 13 Aug 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 3273 KB)

Supplement (1129 KB)

Status: closed

RC1:
'Comment on essd-2025-456', Anonymous Referee #1, 03 Sep 2025
This paper presents an energy-conservation dataset of global land surface radiation and heat fluxes from 2000 to 2020. The dataset is generated by the model of Coordinated estimates of land Surface Energy Balance components (CoSEB), with a combination of GLASS and MODIS remote sensing data, ERA5-Land reanalysis datasets, topographic data, CO2 concentration data, and observations at 258 eddy covariance sites worldwide from AmeriFlux, FLUXNET, EuroFlux, OzFlux, ChinaFLUX and TPDC. The primary merit of this new model is energy conservation.
Although the dataset might be useful, it is not the first energy-conservation dataset of global land surface radiation and heat fluxes, as the authors claim. Therefore, major revisions are required before the paper is accepted.
Specific comments:
The authors claim that “This study presents the first energy conservation datasets of global land surface radiation and heat fluxes”, but reanalysis datasets such as ERA5, which is used as an input to this new dataset, also provide energy-conserving surface fluxes. Perhaps the authors mean that this is the first remote sensing-based dataset? But ERA5 radiative fluxes, which are not remote sensing-based, are used to generate the surface fluxes in this paper, so this dataset is not the first remote sensing-based dataset either.

The merit of this new dataset is still unclear to me. According to Lines 171-180, ERA5 downward solar radiation and net thermal radiation at the surface are used in this paper, so why not simply use the ERA5 fluxes if someone needs surface fluxes? The new dataset might be more accurate than ERA5 in places where ground-based observations are used to generate the new dataset, but the ground sites are sparse. To address this, the authors should compare in situ measurements with both the new data and the ERA5 data at independent sites (i.e., sites that are not used in the generation of the new dataset).

The abstract is not well formatted. An abstract usually provides a brief and comprehensive summary, so trivial details in brackets, such as [including downward shortwave radiation (SW_IN), downward longwave radiation (LW_IN), upward shortwave radiation (SW_OUT), upward longwave radiation (LW_OUT) and net radiation (Rn)], [including latent heat flux (LE), soil heat flux (G) and sensible heat flux (H)], and (SW_IN - SW_OUT + LW_IN - LW_OUT = Rn), might be deleted. Internet links (https://doi.org/10.11888/Terre.tpdc.302559) and citations (Tang et al., 2025a) should be removed from the abstract. On the other hand, the authors should briefly describe how these data sources are used to generate the new dataset.
Citation: https://doi.org/10.5194/essd-2025-456-RC1
- AC2: 'Reply on RC1', Ronglin Tang, 04 Nov 2025
  
  Please see our responses in the attached file. We sincerely appreciate your valuable comments and suggestions, which have greatly helped us improve the manuscript.
  
  Citation: https://doi.org/10.5194/essd-2025-456-AC2
RC2:
'Comment on essd-2025-456', Anonymous Referee #2, 01 Oct 2025
I have also attached my comments as a document for the convenience of the authors and editor.
Review of Energy-conservation datasets of global land surface radiation and heat fluxes from 2000-2020 generated by CoSEB

Summary and recommendation- In this paper, the authors apply a model of Coordinated estimates of land surface energy balance components (CoSEB) to generate estimates of surface radiation and heat fluxes from 2000 to 2020. An advantage of the CoSEB-based approach is that estimates of radiation and heat are in “harmony”, as opposed to generating independent estimates of each. The authors compare their estimates against observations from eddy covariance sites, other individual estimates and other individual observations. The paper is generally well written, and the results are presented clearly. However, I had several questions about the CoSEB framework itself and about the validations applied in the manuscript. Hence I recommend major revisions. I have presented major comments and specific comments below.

Major comments-
Explanation of updates to the CoSEB framework- While reading the manuscript I realized that this paper not only applies the existing, already published CoSEB framework but also updates it to estimate radiation (previously the model estimated only land surface energy components, not shortwave and longwave radiation). Therefore, the authors need to discuss the effect of adding predicted variables on the equations and on the results of the random forest. In particular, can the authors discuss which predictors were found to be the most important, and how this differs from their previous publication? Also, can the authors discuss generic details such as how many splits were generated by the random forest before and after the updates? The authors should also discuss the directionality of the effects of different predictor variables based on the revised random forest.

Multi-collinearity amongst predictor variables- The authors should also discuss how multi-collinearity amongst predictor variables is handled, given the large number of predictors. As far as I understand, random forests do not explicitly deal with multi-collinearity, unlike a PCA-based approach for example. This can affect variable importance significantly. I would suggest the authors explore this in detail.

Effect of autocorrelation- Given the temporal nature of several predictor variables, can authors confirm that autocorrelation does not exist or is minimized in their framework? What tests were performed to check for this? In particular I would recommend authors add lagged variables to the model to make sure that this is not the case. I believe several models constructed for earth system variables tend to ignore aspects such as autocorrelation and therefore this is an important point to address.

Effect of downscaling ERA5-Land datasets- The authors note on lines 195-197 that the ERA5-Land datasets used here have been downscaled from a resolution of ~9 km to ~500 m. This is a significant level of downscaling, performed using a rather simple cubic convolution method. Several variables related to the land cover (such as the LAI) are used as predictor variables in the authors' framework. Can the authors address the uncertainty that such large downscaling between scales introduces into their results? On the one hand, based on the results, it seems that the model has produced reliable results compared to observations and other datasets even after such large downscaling. Is it that the land cover related variables do not play an important role in the predictions?

In sample vs out of sample testing- While the authors present significant comparisons with observations and other datasets to validate their model (e.g. Figure 3, Figure 4 and Figure 5), it seems the authors have not checked for overfitting of their approach by splitting the dataset into a training vs testing dataset. This is especially important since as mentioned in Major comment 1., the CoSEB framework itself has been updated. Authors should address this in detail. In fact, looking at Figure 3, it seems that the R squared values for G and H are on the lower side. I am curious as to what the values look like when out of sample testing is conducted?
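One way to make the out-of-sample check requested above concrete is to hold out whole sites rather than individual samples, so that no tower contributes to both training and testing. The snippet below is a minimal sketch of such a split using only the Python standard library. The function name `site_holdout_split` and the site labels are hypothetical, and this is not a description of the procedure used in the manuscript.

```python
import random


def site_holdout_split(site_ids, test_fraction=0.2, seed=0):
    """Split sample indices so every site falls entirely in train or test."""
    sites = sorted(set(site_ids))
    rng = random.Random(seed)
    rng.shuffle(sites)
    # Hold out at least one site, otherwise roughly test_fraction of sites.
    n_test = max(1, round(test_fraction * len(sites)))
    held_out = set(sites[:n_test])
    train_idx = [i for i, s in enumerate(site_ids) if s not in held_out]
    test_idx = [i for i, s in enumerate(site_ids) if s in held_out]
    return train_idx, test_idx


# Hypothetical site label for each daily sample (two samples per site).
site_ids = ["US-A", "US-A", "CN-B", "CN-B", "AU-C",
            "AU-C", "DE-D", "DE-D", "BR-E", "BR-E"]

train_idx, test_idx = site_holdout_split(site_ids, test_fraction=0.2, seed=0)
train_sites = {site_ids[i] for i in train_idx}
test_sites = {site_ids[i] for i in test_idx}

print("held-out sites:", sorted(test_sites))
print("train samples:", len(train_idx), "test samples:", len(test_idx))
```

A sample-based split would instead shuffle individual rows, which lets temporally adjacent observations from the same tower appear on both sides and can make accuracy look better than it would be at an unseen location.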

Specific comments-
Abstract lines 31-36- The RMSEs presented here do not make any sense at this point since the reader has no sense of scale of values to expect. I recommend authors report the R squared values here instead. Also make sure to report whether the R squared is based on pooled data or just the testing data (See Major comment 5)

Introduction lines 74-75- Can the authors differentiate between the citations for physical methods and those for statistical methods?

Introduction line 92- “impending” is an awkward word here. I would just say “It was imperative”.

Data lines 131-132- Why could a simple interpolation not be applied for missing half hourly data? Is the data extremely sensitive to time? Some clarification is needed here.

Data lines 138-139- Can the authors clarify why this criterion was applied for screening outliers?

Mainstream datasets/products for intercomparison- I was curious as to why the authors do not compare their estimates with heat and radiation estimates from popular earth system modelling systems such as CESM and CTSM (https://www.cesm.ucar.edu/). In fact, if the authors' approach can produce estimates similar to those of earth system models, this would be a huge benefit to the community (since these models are laborious to run).

Methods lines 243-244- Once again, the usage of RMSEs here does not make much sense. Can the authors just report the R squared values instead?

Methods lines 269-270- Just to confirm, the RF based uncoordinated models are models where only individual variables are estimated rather than the simultaneous calculation of several variables? This should be clarified.

Results Lines 306-309- I was curious looking at Figure 4 whether there were correlations or relationships between the EIR or RIR values and any of the other predictor variables? Is the shape of that distribution affected by any particular variables?

Results Lines 311-312- Can the authors clarify the differences between site-based validation vs sample-based validation?

Results lines 381-382- Once again, the RMSE values don’t make a lot of sense here. Authors should report the R squared values instead.

Section 4.2- When discussing the differences between the CoSEB model estimates and other estimates, can the authors also describe why the differences occur? A detailed discussion is not warranted here. Rather, I am interested in the authors' perspective on why their approach produces some differences from existing approaches.
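Several of the specific comments above ask for R squared alongside or instead of RMSE. As a small illustration of the difference, the sketch below computes both metrics for a hypothetical set of observed and estimated fluxes, and again after scaling every value by 0.1. It assumes R squared means the coefficient of determination (1 - SS_res/SS_tot); a squared Pearson correlation would be a different, also common, choice. The numbers are made up for illustration.

```python
import math


def rmse(obs, est):
    """Root-mean-square error, in the same units as the inputs."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def r_squared(obs, est):
    """Coefficient of determination, 1 - SS_res / SS_tot (unitless)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - e) ** 2 for o, e in zip(obs, est))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and estimated fluxes in W/m^2.
obs = [50.0, 100.0, 150.0, 200.0]
est = [60.0, 90.0, 160.0, 190.0]

# The same data with every value scaled by 0.1, mimicking a smaller flux.
obs_small = [0.1 * v for v in obs]
est_small = [0.1 * v for v in est]

print("RMSE:", rmse(obs, est), "vs scaled:", rmse(obs_small, est_small))
print("R2:  ", r_squared(obs, est), "vs scaled:", r_squared(obs_small, est_small))
```

The RMSE shrinks with the magnitude of the flux while R squared is unchanged, which is why an RMSE for G cannot be compared directly with one for SW_IN without a sense of each variable's typical range.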
Citation: https://doi.org/10.5194/essd-2025-456-RC2
- AC1: 'Reply on RC2', Ronglin Tang, 04 Nov 2025
  
  Please see our responses in the attached file. We sincerely appreciate your valuable comments and suggestions, which have greatly helped us improve the manuscript.
  
  Citation: https://doi.org/10.5194/essd-2025-456-AC1

Supplement

https://doi.org/10.5194/essd-2025-456-supplement

Data sets

Energy-conservation datasets of global land surface radiation and heat fluxes from 2000-2020 generated by CoSEB R. Tang et al. https://doi.org/10.11888/Terre.tpdc.302559

Energy-conservation datasets of global land surface radiation and heat fluxes from 2000-2020 generated by CoSEB R. Tang et al. https://doi.org/10.57760/sciencedb.27228

Viewed

Total article views: 1,415 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
1,299	84	32	1,415	80	33	54

Views and downloads (calculated since 13 Aug 2025)

Month	HTML	PDF	XML	Total
Aug 2025	270	13	1	284
Sep 2025	808	8	12	828
Oct 2025	82	12	5	99
Nov 2025	73	24	8	105
Dec 2025	58	26	6	90
Jan 2026	8	1	0	9

Viewed (geographical distribution)

Total article views: 1,393 (including HTML, PDF, and XML) Thereof 1,393 with geography defined and 0 with unknown origin.

Latest update: 04 Jan 2026

Short summary

Existing remote sensing datasets could not provide all land-atmosphere radiation/heat flux components while satisfying energy balances. This study generates the first global dataset (2000–2020) based on our renewed Coordinated estimates of land Surface Energy Balance model, providing all high-accuracy components with perfect energy balance. This advancement enhances the study of Earth’s surface energy dynamics, enables better water management, and improves renewable energy planning.

