13 Jan 2023
Status: this preprint is currently under review for the journal ESSD.

Generation of global 1-km daily soil moisture product from 2000 to 2020 using ensemble learning

Yufang Zhang1, Shunlin Liang2, Han Ma2, Tao He1, Qian Wang3, Bing Li4, Jianglei Xu1, Guodong Zhang1, Xiaobang Liu1, and Changhao Xiong1
  • 1School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
  • 2Department of Geography, The University of Hong Kong, Hong Kong 999077, China
  • 3State Key Laboratory of Remote Sensing Science, Beijing Normal University, Beijing 100875, China
  • 4Key Research Institute of Yellow River Civilization and Sustainable Development & Collaborative Innovation Center on Yellow River Civilization of Henan Province, Henan University, Kaifeng 475001, China

Abstract. Motivated by the lack of long-term global soil moisture products with both high spatial and high temporal resolution, a global 1-km daily spatiotemporally continuous soil moisture product (GLASS SM) was generated for 2000 to 2020 using an ensemble learning model (eXtreme Gradient Boosting, XGBoost). The model was developed by integrating multiple datasets, including albedo, land surface temperature, and leaf area index products from the Global Land Surface Satellite (GLASS) product suite, the European reanalysis (ERA5-Land) soil moisture product, the in situ soil moisture dataset from the International Soil Moisture Network (ISMN), and auxiliary datasets (Multi-Error-Removed Improved-Terrain DEM and SoilGrids). Given the relatively large scale differences between point-scale in situ measurements and the other datasets, the triple collocation (TC) method was adopted to select representative soil moisture stations and their measurements for creating the training samples. To fully evaluate the model performance, three validation strategies were explored: random, site-independent, and year-independent. For the random test samples, the XGBoost model trained with representative stations selected by the TC method achieved the highest accuracy, with an overall correlation coefficient (R) of 0.941 and a root mean square error (RMSE) of 0.038 m³ m⁻³. For both the site- and year-independent test samples, the overall model performance was comparatively lower, but training the model with representative stations still considerably improved its overall accuracy. Compared with the model developed without station filtering, the validation accuracy of the model trained with representative stations improved significantly at most stations, with the median per-station R increasing from 0.64 to 0.74 and the median per-station unbiased RMSE (ubRMSE) decreasing from 0.055 to 0.052 m³ m⁻³.
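The three scores quoted above (R, RMSE, and ubRMSE) can be illustrated with a minimal sketch. It assumes the standard definitions, with ubRMSE taken as the RMSE after removing the mean bias. The function name `soil_moisture_scores` and the sample values are hypothetical and are not taken from the preprint.

```python
import numpy as np


def soil_moisture_scores(predicted, observed):
    """Return (R, RMSE, ubRMSE) for paired soil moisture series in m3 m-3."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)

    # Keep only pairs where both values are present.
    valid = np.isfinite(predicted) & np.isfinite(observed)
    predicted, observed = predicted[valid], observed[valid]

    diff = predicted - observed
    r = np.corrcoef(predicted, observed)[0, 1]
    rmse = np.sqrt(np.mean(diff ** 2))
    # ubRMSE: subtract the mean bias before computing the RMSE.
    ubrmse = np.sqrt(np.mean((diff - diff.mean()) ** 2))
    return r, rmse, ubrmse


# Hypothetical values for a single station, for illustration only.
observed = np.array([0.10, 0.15, 0.20, 0.25, 0.30])
predicted = observed + 0.05  # constant wet bias of 0.05 m3 m-3
r, rmse, ubrmse = soil_moisture_scores(predicted, observed)
```

In this constructed case the prediction tracks the observations perfectly apart from a constant offset, so the bias appears in RMSE but not in ubRMSE. This is why ubRMSE is often reported alongside R when the interest is in temporal dynamics at individual stations.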
Further validation of the GLASS SM product across four independent soil moisture networks revealed its ability to capture the temporal dynamics of measured soil moisture (R = 0.69–0.89; ubRMSE = 0.033–0.048 m³ m⁻³). Lastly, an inter-comparison between the GLASS SM product and two global microwave soil moisture datasets (the 1-km Soil Moisture Active Passive/Sentinel-1 L2 Radiometer/Radar soil moisture product and the European Space Agency Climate Change Initiative combined soil moisture product at 0.25°) indicated that the derived product maintained more complete spatial coverage and exhibited high spatiotemporal consistency with both products. The annual average GLASS SM dataset from 2000 to 2020 can be freely downloaded (Zhang et al., 2022a), and the complete product at daily scale is available at

Status: open (until 10 Mar 2023)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CC1: 'Comment on essd-2022-348', Noemi Vergopolan, 13 Jan 2023

Data sets

A global 1-km surface soil moisture product from 2000 to 2020. Yufang Zhang, Shunlin Liang, Han Ma, Tao He, Qian Wang, and Bing Li


Total article views: 379 (including HTML, PDF, and XML)
  • HTML: 300
  • PDF: 76
  • XML: 3
  • Total: 379
  • BibTeX: 2
  • EndNote: 3
Views and downloads, per period and cumulative (calculated since 13 Jan 2023)

Viewed (geographical distribution)

Total article views: 272 (including HTML, PDF, and XML), of which 272 have a defined geography and 0 are of unknown origin.
Latest update: 01 Feb 2023
Short summary
Soil moisture observations are important for a range of Earth system applications. This study generated a long-term (2000–2020) global seamless soil moisture product with both high spatial and high temporal resolution (1 km, daily) using an XGBoost model and multi-source datasets. Evaluation against dense in situ soil moisture datasets and microwave soil moisture products showed that the product has reliable accuracy and more complete spatial coverage.
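The abstract names three validation strategies: random, site-independent, and year-independent. The sketch below shows one way such splits could be built, by holding out individual samples, whole stations, or whole years. The helper names, the test fractions, and the toy sample table are assumptions made for illustration; the preprint's actual split proportions are not stated on this page.

```python
import numpy as np


def random_holdout(n_samples, test_fraction=0.2, seed=0):
    """Boolean test mask that assigns individual samples at random."""
    rng = np.random.default_rng(seed)
    return rng.random(n_samples) < test_fraction


def holdout_by_group(groups, test_fraction=0.2, seed=0):
    """Boolean test mask that holds out whole groups (stations or years)."""
    groups = np.asarray(groups)
    unique = np.unique(groups)
    rng = np.random.default_rng(seed)
    # Hold out at least one group.
    n_test = max(1, int(round(test_fraction * unique.size)))
    test_groups = rng.choice(unique, size=n_test, replace=False)
    return np.isin(groups, test_groups)


# Hypothetical sample table: five stations, two years each.
station_id = np.array(["A", "A", "B", "B", "C", "C", "D", "D", "E", "E"])
year = np.array([2001, 2002] * 5)

random_test = random_holdout(station_id.size)          # random samples
site_test = holdout_by_group(station_id)               # unseen stations
year_test = holdout_by_group(year, test_fraction=0.5)  # unseen years
```

Holding out whole stations or years means the test samples share no station or no year with the training samples, which is consistent with the abstract reporting lower accuracy for these two strategies than for the random split.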