the Creative Commons Attribution 4.0 License.
Fusing ERA5-Land and SMAP L4 for an Improved Global Soil Moisture Product
Abstract. Accurate, high-resolution soil moisture data are critical for hydrological modeling, climate studies, and ecosystem management. However, existing global products suffer from inconsistencies, coverage gaps, and biases. In this study, we evaluated the surface layers of three widely used soil moisture products (ERA5-Land, ESA-CCI v09.1 Combined, and SMAP L4), with resolutions ranging from 0.1° to 0.25°, against in situ measurements from 1,615 stations across five networks: ISMN, CMA, Cemaden, COSMOS-Europe, and SONTE-China. To our knowledge, this in situ dataset represents the most extensive global soil moisture compilation to date. ERA5-Land exhibits a high correlation between measured and predicted soil moisture, but the data also show significant bias. SMAP L4 provides the highest accuracy, exhibiting low bias and root mean square error (RMSE), but its temporal coverage is limited to 2015 to the present. To address these gaps, we developed an adjusted ERA5-Land dataset by fusing ERA5-Land and SMAP L4 using a mean-variance rescaling method optimized for long time-series alignment, which enhanced the spatiotemporal coverage and reduced bias. Validation against measured data demonstrates an increase in the correlation coefficient (r) of ~5 %, an RMSE reduction of ~20 %, and an NNSE improvement of ~15 % compared to the original products. The adjusted ERA5-Land dataset, which is publicly available, can be used as a benchmark for future research and can support drought monitoring, weather prediction, and water resource management, contributing to global climate resilience and informed decision-making across diverse ecosystems. The dataset is provided for the surface layer with global coverage at a spatial resolution of 0.1° and daily temporal resolution, spanning from 2015 to 2020, at https://zenodo.org/records/15816832.
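As a point of reference, mean-variance rescaling is commonly written in the textbook form below. The symbols are chosen here for illustration, and the abstract does not specify the "optimized" variant the authors use, so this is a sketch of the general idea rather than the paper's exact formulation. Here $\theta_{\mathrm{ERA}}(t)$ is the ERA5-Land soil moisture at one grid cell, and $\mu$ and $\sigma$ are the temporal mean and standard deviation of each product over a common reference period:

$$
\theta_{\mathrm{adj}}(t) = \mu_{\mathrm{SMAP}} + \frac{\sigma_{\mathrm{SMAP}}}{\sigma_{\mathrm{ERA}}}\,\bigl(\theta_{\mathrm{ERA}}(t) - \mu_{\mathrm{ERA}}\bigr)
$$

By construction, the adjusted series has the mean and standard deviation of SMAP L4 over the reference period, while its correlation with any other series is unchanged because the transform is linear with a positive gain.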
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-410', Anonymous Referee #1, 04 Sep 2025
- RC2: 'Comment on essd-2025-410', Anonymous Referee #2, 20 Sep 2025

By using soil moisture from ERA5-Land, SMAP L4, and in situ measurements from four different network sources, this study has developed a soil moisture data product at 0.1 degree and daily spatiotemporal resolution for 2015–2020. Including ESA-CCI shows readers the limitations of ESA-CCI, but these do not help the data fusion. In other words, the investigation of ESA-CCI is a parallel storyline alongside the ERA5-Land and SMAP data fusion. Performing a similar analysis for SMAP L2/L3 would be more beneficial for understanding the features and limitations of the SMAP L4 data. The short period of the data record (2015–2020) hinders decadal-scale investigation of soil moisture dynamics. Besides the statistical analysis, the performance of the new data product in capturing soil moisture dynamics under drought conditions is of more interest, since the authors mention in the Abstract that this product could benefit drought monitoring. Additionally, a workflow diagram representing the data processing and fusion is needed.

The detailed comments are:

- L23: What are the specific decision-making activities? How can a six-year dataset (2015–2020) at 0.1 degree spatial resolution benefit decision-making activities given the spatial heterogeneity of landscapes?
- L27-29: “The water cycle” and “the hydrological cycle” cover some of the same processes, making this sentence redundant.
- L34-35: In this case, I’d include “the carbon and nitrogen cycles” in the topic sentence along with references.
- L79: The ESA-CCI soil moisture also has limitations in tropical regions due to the dense tree cover, which is shown in Gruber et al. (2019) and is only mentioned in the next paragraph. I’d update the logic and structure of these two paragraphs by introducing each data type individually, with the advantages and limitations of each data type discussed at the same time. To further address this comment, the authors might want to discuss the limitations of soil moisture measurements/datasets across regions (e.g., tropical vs high-latitude regions).
- L86: The essential role of soil moisture is discussed in the previous paragraphs. I’d not repeat this.
- L87-89: Many other soil moisture datasets have been used for similar purposes, e.g., land model evaluation and water management. This discussion is not necessary.
- L98 vs L125: The short time frame is also a limitation of this study, which develops soil moisture data for the period 2015–2020. I’d rephrase this sentence. In other words, it does not necessarily mean that this study can address all the limitations discussed in the Introduction.
- L145: Does the “1.9 million soil moisture content” refer to 1.9 million measurement records? The authors might want to mention the temporal resolution of the measurements (i.e., hourly) in the main text (rather than only mentioning it in the figure legend).
- L142 vs L188: The SONTE-China dataset has measurement depths of 5, 10, 20, and 40 cm. While L124 mentions data retrieval for the 0-10 cm depth, how did the authors handle the inconsistency of layer depths? For the data fusion, did the authors calculate the arithmetic mean of the 5 cm and 10 cm layers, or something else?
- L239-240: This part is redundant.
- Figure 2: This study has developed soil moisture data for 2015–2020, but the plots show soil moisture time series for different periods. Additionally, the reason for showing these four sites among all the ISMN sites globally is not clear to me. Do the four sites represent locations under four different climatological conditions, or something else?
- Figure 3: I’d use more contrasting colors to represent “Type 2” and “Type 3”. For the selected sites (for Figure 2 as well), I’d include the latitude and longitude information.
- L285-287: Is this statement based on the authors' analysis or on existing studies? If it is the latter, what are the references? Is it true everywhere globally, or are there differences in correlation and accuracy across space?
- L290-293: If ESA-CCI has issues over dense forests, why is it selected for data evaluation? Is it only because of the multi-sensor feature? The obstacles of optical measurements in the tropics are well recognized. However, choosing a dataset with better spatial coverage could still help improve uncertainty estimates in the tropics given that this study spans the entire globe.
- L409-417: This part belongs in the Discussion.
- Figure 8: Given the small number of days with data availability in the tropics and in high-latitude regions (for 2015; the coverage in 2016 might be similar?), how do the authors get the global coverage of ESA-CCI in Figure 7?
- Section 3.3.1: How do the datasets perform in agricultural regions, which are largely affected by food demands and human activities that are not sufficiently represented by the process-based models of reanalysis systems, i.e., ERA5-Land here?
- L585-588: Compared to the time frame of SMAP products (April 2015–present), this study (adjusted ERA5-Land) is limited in representing soil moisture over the long term. I'd at least develop a dataset for 2015-2024.
- L604-612: This part shows that various soil moisture datasets have different accuracy across space, i.e., CONUS vs Europe. In this case, why not combine SMAP L4 and ESA-CCI to adjust ERA5-Land in regions with reasonable ESA-CCI coverage?
- Conclusion: The key features and advantages of “adjusted ERA5-Land” are expected. However, the authors provide excessive details and repeatedly revisit points that have been discussed in earlier sections, making it difficult to understand the key messages of this research.

Citation: https://doi.org/10.5194/essd-2025-410-RC2
Data sets
Fusing ERA5-Land and SMAP L4 for an Improved Global Soil Moisture Product Yonggen Zhang https://doi.org/10.5281/zenodo.15816832
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 1,202 | 91 | 16 | 1,309 | 21 | 26 |
This paper presents a new gridded dataset, an evaluation of multiple products based on a newly compiled set of in situ observations, and interesting results. It is also very well written. The only major flaw seems to be the temporal coverage of the new product. The new dataset spans only 2015-2020, which is the same temporal coverage as SMAP L4. Yet the short temporal coverage of SMAP L4 is one of the reasons for developing this new product. It should be straightforward to create a 1950-present dataset (or whatever maximum duration is feasible under storage and data download speed constraints) using the current mean and variance scaling coefficients and ERA5-Land. The pre-SMAP period will be less accurate, but it has a good chance of being better than the original ERA5-Land when compared to pre-2015 in situ observations, and the expanded temporal coverage will make this new dataset a much more significant addition to the many gridded soil moisture datasets already available.
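The extension suggested above could be sketched roughly as follows: fit a gain and an offset on the period where both products overlap, then apply them to the whole ERA5-Land record. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the standard mean-variance form, handles a single grid cell, and uses synthetic series. The names `fit_mean_variance` and `apply_rescaling` are hypothetical.

```python
import numpy as np


def fit_mean_variance(source, reference):
    """Fit gain/offset so the rescaled `source` matches the mean and
    standard deviation of `reference` over their paired valid samples."""
    source = np.asarray(source, dtype=float)
    reference = np.asarray(reference, dtype=float)
    valid = np.isfinite(source) & np.isfinite(reference)
    if valid.sum() < 2:
        raise ValueError("need at least two paired samples")
    src_std = source[valid].std()
    if src_std == 0:
        raise ValueError("source series is constant; gain is undefined")
    gain = reference[valid].std() / src_std
    offset = reference[valid].mean() - gain * source[valid].mean()
    return gain, offset


def apply_rescaling(series, gain, offset):
    """Apply previously fitted coefficients to any period of the source."""
    return gain * np.asarray(series, dtype=float) + offset


# Synthetic single-cell example (all values are made up for illustration).
rng = np.random.default_rng(0)
truth = 0.25 + 0.05 * np.sin(np.linspace(0, 20, 400))
# "ERA5-Land-like" series: full record, but biased and over-dispersed.
era_full = 1.4 * truth + 0.08 + rng.normal(0, 0.005, truth.size)
# "SMAP-like" series: only available for the last 100 time steps.
smap_overlap = truth[300:] + rng.normal(0, 0.005, 100)

# Fit on the overlap only, then apply to the entire record.
gain, offset = fit_mean_variance(era_full[300:], smap_overlap)
era_adjusted_full = apply_rescaling(era_full, gain, offset)
```

The coefficients are fitted once and reused outside the overlap, which is why accuracy before the overlap period depends on the bias of the source product being stationary in time.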
Other minor comments are as follows:
1. line 20-21: The 5%, 20%, and 15% numbers are hard for readers to infer from Fig. 11 or Table 1. Please either give accompanying percentages in Table 1, or give absolute values in the abstract. Also, please spell out the NNSE abbreviation.
2. line 93: Cheng et al. 2017 did not discuss ERA5-Land. Please delete.
3. Fig. 6: The comparison is for a single day. It will be more informative if the comparison covers all days, perhaps showing the per-grid RMSE during the entire overlapping period.
4. Fig. 7: The labels of the ESA-CCI and SMAP L4 rows are swapped. Also, the monochrome colorbar makes it difficult to see seasonality; please change it to something easier to read.
5. The discussion on ESA-CCI data gaps around lines 420-430 is unnecessarily long. The nature of the gaps (high latitudes, vegetated zones, and alpine regions) is well reported in the original ESA-CCI paper and understood to be related to microwave sensor limitations. The authors should condense the text substantially and either remove Fig. 8, or replace the 2015 information with more comprehensive information such as the percentage of available days in each season during the entire 2015-present period. Fig. 9 and its related description are okay, because they add new information based on the new in situ dataset provided in this study.
6. Fig. 12: It is very difficult to see regional variations due to overlapping dots. Perhaps summary graphs can be made for each continent (North America, Europe, Asia, South America, Africa) mentioned in the text description of this figure.
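Regarding comment 1, the metrics named in the abstract could be reported as absolute values alongside relative changes. The sketch below assumes NNSE stands for the normalized Nash-Sutcliffe efficiency in its common form, NNSE = 1 / (2 − NSE); the manuscript's definition should be checked against this. The function names `evaluate` and `percent_change` and the sample numbers are hypothetical.

```python
import numpy as np


def evaluate(obs, sim):
    """Return r, RMSE, bias, NSE, and NNSE for paired observed/simulated
    values, ignoring pairs where either value is missing."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    valid = np.isfinite(obs) & np.isfinite(sim)
    obs, sim = obs[valid], sim[valid]
    err = sim - obs
    # Nash-Sutcliffe efficiency: 1 minus error variance relative to
    # the variance of the observations around their mean.
    nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    return {
        "r": np.corrcoef(obs, sim)[0, 1],
        "rmse": np.sqrt(np.mean(err ** 2)),
        "bias": err.mean(),
        "nse": nse,
        # Assumed definition: maps NSE in (-inf, 1] onto (0, 1].
        "nnse": 1.0 / (2.0 - nse),
    }


def percent_change(before, after):
    """Relative change of a metric in percent, signed."""
    return 100.0 * (after - before) / abs(before)


# Made-up station values: the simulation has a constant +0.1 offset.
obs = np.array([0.1, 0.2, 0.3, 0.4])
sim = obs + 0.1
metrics = evaluate(obs, sim)
```

Reporting both `metrics` for each product and `percent_change` between the original and adjusted product would let readers trace a headline percentage back to the tabulated values.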