A new dataset of satellite observation-based global surface soil moisture covering 2003~2018

Soil moisture is an important variable linking the atmosphere and the terrestrial ecosystems. However, long-term satellite monitoring of surface soil moisture is still lacking at global scale. In this study, we conducted data calibration and fusion of 11 well-acknowledged microwave remote sensing-based soil moisture products since 2003 through neural network 10 approach, with SMAP soil moisture data applied as the fundamental training target. The training efficiency proves to be high (R2 =0.95) due to the selection of 9 quality impact factors of microwave soil moisture products and the elaborate organization structure of multiple various neural networks(5 rounds of simulation; 8 substeps; 74 independent neural networks; and >106 regional subnetworks). We achieved global satellite monitoring of surface soil moisture during 2003~2018 at 0.1° resolution. This new dataset, once validated against the International Soil Moisture Network (ISMN) 15 records, is supposed to be superior to the existing products (ASCAT-SWI, GLDAS Noah, ERA5-Land, CCI/ECV and GLEAM), and is applicable to studying both the spatial and temporal patterns. It suggests an increase in global mean surface soil moisture, and reveals that the surface moisture decline on rainless days is highest in summers over the low-latitudes but highest in winters over most mid-latitude areas. Notably, the error propagation with the extension of the simulation period to the past is well controlled, indicating that the fusion algorithm will be more meaningful in future when more advanced 20 sensors are in operation. The dataset can be accessed at https://doi.pangaea.de/10.1594/PANGAEA.912597 (Chen, 2020). https://doi.org/10.5194/essd-2020-59 O pe n A cc es s Earth System Science Data D icu ssio n s Preprint. Discussion started: 8 May 2020 c © Author(s) 2020. CC BY 4.0 License.

retrieved through mature algorithms (Table 1) are directly applied here, instead of Tb. Though the drawback is that the final soil moisture products may inherit the uncertainties associated with each retrieval method, this problem can be generally solved by including quality impact factors (see section 2.2 for details). acquired from the ascending and descending passes of both TERRA and AQUA. The 4th is the 'land cover factor', which is added because the parameters essential for soil moisture retrieval (vegetation effect correction) are set based on land use filled by sequentially searching and averaging nearby valid values (Chen et al., 2019). While the snow/ice mask of the 235 ASCAT-SWI product can be transferred to the simulation output, the potential snow or ice cover before 2007 should be identified. For a grid in a specific ten-day period, if ice cover is reported by ASCAT-SWI in most years, it is also supposed to be potentially covered by snow/ice. However, if the thaw state is observed according to the MEaSUREs Global Record of Daily Landscape Freeze/Thaw Status V4 dataset, the grid is not masked. The simulated soil moisture in the rainforests identified in the 'ASCAT Constant' are retained but not recommended due to large uncertainty. On the other hand, to avoid 240 error propagation with training times by ensuring a high-quality training target for the next round's simulation, for every simulated result, we performed a preprocessing step called suspicious value removal. We obtained the maximal and minimum values of SMAP_E soil moisture in each pixel. If the simulated value is out of the range of the SMAP data during 2015~2018, the value is suspicious and thus not included in the training target. Subsequently, '3σ denoise' is performed again before the simulated soil moisture becomes secondary training targets, which are referred to as SIM-1T, SIM-2T, and 245 so on ('SIM' stands for the simulated soil moisture, the number after the hyphen indicates the round of simulation, and 'T' means it is applied as training target).

Neural network design (2)-five rounds of simulations
A key characteristic of this study is that not only the 11 available microwave soil moisture data with different temporal spans are all incorporated, but they are also utilized as fully as possible through up to 5 rounds of neural network-based 250 simulations, with at least four different soil moisture products retrieved from three sensors applied as predictors in each round (see details below). While increasing the sources of soil moisture data inputs can be beneficial to training efficiency, the spatial coverage of the simulation output is sacrificed because the overlapping area decreases with the increase in soil moisture data varieties. After all, most products have missing data in specific regions (e.g., mountains, wetlands and urban settlements), and some sensors are even unable to produce data at global scale (TMI is limited to [N40°, S40°]; SMOS lacks 255 data in Asia). To solve that dilemma, we classified all pixels according to the available soil moisture products with valid data during a 10-day period. However, to avoid soil moisture simulation under snow or ice cover, not all combinations are https://doi.org/10.5194/essd-2020-59 considered (see Section 2.5) while some combinations are optional. Then, corresponding to the selected combinations, different independent neural networks are trained. Moreover, for a pixel, if the neural network driven by multiple predictor soil moisture has no subnetwork in the zone where it is located due to limited valid pixels, an alternative subnetwork driven 260 by the combination of fewer soil moisture data inputs should be applied instead. Hence, it is a question that which neural network is the best choice for a specific pixel. Apart from applicability, the relative priority of different neural networks should be determined by comprehensively considering the number and quality of input soil moisture products, the variety of sensors, the quantity of training samples indicated by the number of 10-day periods, and the accuracy of training targets (the training target quality declines monotonously: SMAP>SIM-1T>SIM-2T>SIM-3T>SIM-4T). Sometimes, the priority order 265 of independent networks cannot be determined easily, then various orders are provided, leading to various substeps, with the respective simulation results integrated later. Specifically, when the LAI data source changes, the division of a single round is also essential. Based on these principles, five rounds of neural network operations containing a total of 74 independent neural networks could be designed as follows.
The first round of simulation has two substeps. In substep 1, GEOV2 PROBA-V LAI data are used for soil moisture 270 simulation during 2014~2018 (note: the other 8 quality impact factors are applied as well) whereas in substep 2, the GLASS LAI are applied, and the simulation period is 2012D19~2013D36 (i.e., September 2012 to 2013; D is the ordinal of the ten-day period). SMAP soil moisture is the training target, while ASCAT-SWI10 (abbreviated as ASCAT), SMOS-IC (SMOS), AMSR2-JAXA and AMSR2-LPRM-X (AMSR2-LPRM) are the four predictor soil moisture products (details are in Table S1~S2). The simulation results of the two substeps are combined and then transformed into the training target 275 (SIM-1T). In the second round of simulation, the training target can either be SMAP or SIM-1T, while the input soil moisture data are ASCAT, SMOS, TMI-LPRM-X (TMI) and FY-3B-NSMC (FY). The simulation output, SIM-2, spans from 2011D20 to 2012D18 (Table S3~S4). In the third round (2010D16~2011D19), SMAP, SIM-1T and SIM-2T are all used as training targets, while the predictor soil moisture data are ASCAT, SMOS, TMI and WindSat-LPRM-X (WINDSAT). There are two substeps in round 3, distinguished by whether the priority of the neural networks is determined by the training 280 sample quantity and the training target quality first (SIM-3-1) or the number of input soil moisture products is considered https://doi.org/10.5194/essd-2020-59  Table S5~S8). Because these two methods emphasize different aspects of network quality, in some pixels.
SIM-3-1 will be advantageous, but in others, SIM-3-2 is better. Hence, an algorithm is proposed to combine the advantages of both simulations, which is described in Table S9. Next, the 4th round is for simulation during 2007D1~2010D15. With SIM-2T and SIM-3T being the training targets, ASCAT, WINDSAT, TMI, AMSRE-JAXA, AMSRE-LPRM-X 285 (AMSRE-LPRM) and AMSRE-NSIDC are all applied as predictors (LAI now comes from SPOT-VGT). Two substeps are also included. In the first substep, neural networks are sorted by paying more attention to the number of soil moisture inputs and the sensors they are derived from, while the training sample size and training target quality are prioritized to make an alternative estimation (Table S10~S13). Afterwards, SIM-4 is obtained by reasonably fusing these two results. In the final round, the soil moisture simulation is extended to as early as 2003. SIM-2T, SIM-3T and SIM-4T are combined to be the 290 training target, while the soil moisture data entering the neural networks consist of WINDSAT, TMI, AMSRE-JAXA, AMSRE-LPRM and AMSRE-NSIDC (Table S14~S15).

Validation of surface soil moisture products
For the evaluation of global soil moisture data, the International Soil Moisture Network (ISMN) dataset (Dorigo et al., 2011;Dorigo et al., 2013) is the most frequently used (Al-Yaari et al., 2019;Karthikeyan et al., 2017b). Because SMAP, the 295 training target, is the soil moisture within 0~5 cm, the simulated soil moisture is supposed for that soil layer as well.
Accordingly, the measurements used for validation are limited to ≤ 5 cm in depth. The quality flags of ISMN(Dorigo et al., 2013) are also checked to retain only the 'good quality' data. After data screening and processing (see Text S2), more than 100,000 10-day averaged soil moisture records acquired from 728 stations of 29 networks are selected for validation of the soil moisture product whose temporal resolution is 10 days. The detailed information of these stations and the periods of the 300 data used are listed in Table S16, while the spatial distribution of these stations is shown in Figure S2. The major climate types of the sites are determined from the updated map of the Köppen-Geiger climate classification (see Table 2 for the descriptions of each type (Kottek et al., 2006)).
Most ISMN networks are dense networks, as the stations are very close to each other, probably within the same 0.1°grid, https://doi.org/10.5194/essd-2020-59 whereas some others are sparse networks ( Figure S2). In addition, various sensors are simultaneously operated at some 305 stations. Hence, to make full use of all available records, the site-scale 10-day averaged soil moisture data are aggregated to 0.1°grid-scale by averaging the same period's values within the same grid. Specifically, if soil moisture is not simulated due to snow or ice cover, the corresponding measurement is excluded. This resulted in a final collection of~40,000 grid-scale 10-day period soil moisture records within the validation dataset.
The soil moisture datasets to be evaluated include the simulated (satellite-based) soil moisture product in this study 310 (hereinafter 'SIM', covering 2003~2018), SMAP_E (the training target, covering April 2015~2018), the longest existing satellite-derived soil moisture record: ASCAT-SWI (converted to volumetric fraction; data period is 2007~2018), the reanalysis-based soil moisture: GLDAS Noah V2.1 and ERA5-Land (data were resampled to 0.1°by bilinear interpolation, averaged to 10-day scale and then evaluated during 2003~2018) as well as the soil moisture data developed by combining both satellite observation and model simulation: CCI and GLEAM soil moisture products (note that the difference between 315 GLEAM v3.3a and v3.3b is that for v3.3a, the radiation and air temperature forcing data comes from ERA5, whereas for v3.3b, all meteorological data are satellite-based, yet the time series is only updated to September 2018). The overall performance of any soil moisture product is first evaluated using all of the validation dataset, with R square (R 2 ) and RMSE values (unit: m 3 m -3 ) adopted as the main indicators. The next step is the temporal pattern validation. For grids with enough (>20) 10-day average records, we compared the estimated soil moisture during all periods against the corresponding 320 measurements, with the correlation coefficient (R) and RMSE calculated. Several supplementary indexes are also added, including bias, unbiased RMSE (ubRMSE) and the correlation between the anomalies (anomalies R, abbreviated here as 'A.R', can better indicate the prediction accuracy of interannual changes). Then, we compared the mean/median of the above evaluation indexes for different soil moisture products and tested whether the differences are significant. Moreover, the relative performances of various products in different climate zones are analyzed. Last, we performed spatial pattern 325 validation. For every 10-day period, the soil moisture data in all grids with stations are evaluated by the R, RMSE, bias and ubRMSE values. The relative superiority of all products during different time periods in a year, and the changes of data coverage as well as data quality with time were also investigated.

Intra-annual variation analysisof surface soil moisture
It should be noted that although the nominal spatial resolution of the surface soil moisture product (SIM) is 0.1°, the 330 intra-annual variation analysis would be more robust if we aggregate SIM to 0.5°resolution because the original resolution of SMAP soil moisture is~0.4°while the resolution of most soil moisture products to be calibrated and fused is 0.25°. Then, we excluded the high latitude areas (60°N~90°N) where the available data are limited due to frequent ice cover. For the remaining areas (60°S~60°N), based on a total of 36×16 (years)=576 data points, we fitted the intra-annual cycle of soil moisture by using the Fourier function (probably the best choice for periodic function fitting) with the period fixed to 1 year 335 (36 10-day periods). The number of terms is set to 1 unless the intra-annual cycle is obviously asymmetrical and can be much better characterized by a two-term Fourier function. Subsequently, the highest peak and lowest trough values of surface soil moisture as well as the corresponding locations in time (the ordinal of 10 days) are exported.
The direct driving factor of the variation in surface soil moisture is precipitation, for which we adopted the GPM IMERG Precipitation V06-final run data (Huffman et al., 2019). Apart from a direct correlation analysis, we also used the Fourier 340 function to characterize the intra-annual cycle of precipitation and then explored the relationship between the cycles of precipitation and surface moisture (the derived fitting function is dropped if the adjusted R 2 is lower than 0.1), with the peak time difference in each 0.5°grid calculated (if both cycles have two peaks, the average locations of the two peaks are calculated). Because the simulated surface soil moisture only represents the average condition during each 10-day period while the accuracy is not high enough, we only studied the surface soil moisture decline after ten consecutive rainless days.

345
Effective precipitation is calculated by precipitation minus canopy interception that is estimated by combining the modified Merriam canopy interception model (Kozak et al., 2010;Merriam, 1960) (the vegetation canopy cover and LAI is acquired from the GEOV2 dataset). If the total effective precipitation within two consecutive 10-day periods (20 days) is less than a given threshold (initially set to 10 mm), we suppose that the soil moisture change in the latter period compared to the previous period is mostly due to surface evaporation and percolation (capillary rise can be negligible ( temporal accuracy, but the absolute values and spatial patterns are relatively unreliable, whereas SIM shows good performances in all aspects. Generally, this study indicates that the expected order of data applicability among various global long-term surface soil moisture products is SIM (applicable to all studies)> GLEAM (suitable for studies of temporal variation)> ERA5-Land (applicable to temporal pattern studies)> GLDAS Noah V2.1 (somewhat applicable to all studies)> 450 ASCAT-SWI> CCI. The training R 2 of the previous neural networks designed for global surface soil moisture mapping is 0.45~0.55, while the temporal R and RMSE values against measurements are 0.52 and 0.084 (Yao et al., 2017), and the overall R and RMSE are 0.44 and 0.113 (Yao et al., 2019). In this study, by elaborating the neural network, the training R 2 is elevated to 0.95, with the temporal R and RMSE (0.69 and 0.08) as well as overall R and RMSE (0.65 and 0.087) values also improved. In addition, our 10-day period average product is both spatially and temporally continuous during 16 years, 455 with a high spatial resolution, and it covers all land except for frozen ground (data are available in rainforests, but not recommended). Hence, our product is probably more useful than previous machine-learned products.

The spatial and temporal patterns of the calculated surface soil moisture
For the calculated global surface soil moisture, the spatial pattern averaged during 2003~2018 is shown in Figure 12a (the maps for separate months are shown in Figure S12a). Validation shows that apart from our calculations, GLDAS has the 460 highest spatial accuracy, so the spatial map of GLDAS surface moisture is attached below ( Figure S12b). By comparison, the spatial patterns of SIM and GLDAS are similar, but there also exist some differences (see the regions circled in red).
Obviously, SIM has a higher spatial heterogeneity, probably more reflections on wetlands and irrigated fields (e.g., the Hetao Irrigation Area in China), whereas GLDAS appears patchy in arid places. The latitudinal pattern comparison in Figure   S13a also implies that SIM contains more detailed spatial information.

465
As for the interannual variation, because the GLEAM v3.3a product proves to have the best accuracy in characterizing the temporal anomalies of soil moisture, which also covers the globe, this product is selected as the reference to justify our calculation. According to Figure S13b important role than surface moisture in ecosystems, which cannot be obtained from microwave remote sensing. Hence, combining the advantages of observation and model simulation helps to improve the data accuracies of both surface and root-zone soil moisture. Unfortunately, while the CCI algorithm integrates the disadvantages of both methods, GLEAM only incorporated very limited observed information. We propose that one possible approach is to use the grid-specific 520 confidence range and the spatial pattern of satellite-based soil moisture (e.g., our product: SIM) to constrain the model parameters or add supplementary modules if necessary. In detail, SIM can be used as the initial surface soil moisture map.
Then, after each time of soil moisture simulation in multiple layers (both root-zone and surface), the model efficiency is examined through a spatial correlation test between the simulated surface moisture and SIM. In addition, whether the simulated value falls within the confidence range (e.g., ±20%) of that reported by SIM should also be tested. By recurrent 525 adjustments, the model parameters in each grid can be optimized. For irrigated croplands, if irrigation is not considered in models, the simulated surface soil moisture will soon fall below the confidence range, and the spatial correlation will also decline no matter which parameters are given. Therefore, a well-designed irrigation module (Chen et al., 2019) should be introduced over there. Finally, for regions with massive human-induced land cover changes (e.g., afforestation), optical remote sensing should be applied for better estimation of evapotranspiration.

Author contributions
Yongzhe Chen conducted the research, completed the original draft and revised it. The correspondence author, Xiaoming  Tables   Table 1: The global surface soil moisture products used in this study.