ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates
Abstract. The ESA CCI Soil Moisture multi-satellite climate data record is a widely used dataset for large-scale hydrological and climatological applications and studies. However, data gaps in the record can affect derived statistics such as long-term trends and, if not taken into account, can lead to inaccurate conclusions. Here, we present a novel gap-free dataset covering the period from January 1991 to December 2023. Our dataset differs from other gap-filled products in two respects: it is based purely on the available soil moisture measurements, without relying on ancillary variables to make predictions, and it includes uncertainty estimates for all interpolated data points.
Our gap-filling framework is based on a well-established univariate Discrete Cosine Transform with Penalized Least Squares (DCT-PLS) algorithm. This ensures that the dataset remains fully independent of other soil moisture and biogeophysical datasets, and eliminates the risk of introducing non-soil moisture features from other variables. We apply DCT-PLS in a spatial moving window to predict missing data points from temporal and regional neighbourhood information. The challenge of providing gap-free estimates during extended periods of frozen soils is addressed by applying a linear interpolation for these periods, which approximates the retention of frozen water in the soil. To quantify the inherent uncertainties in our predictions, we developed an uncertainty estimation model that considers the quality of the input observations and the performance of the gap-filling algorithm under different surface conditions. We evaluate our algorithm through performance metrics against independent in situ reference measurements and by its ability to restore GLDAS Noah reanalysis data in artificially introduced satellite-like gaps. We find that the gap-filled data perform comparably to the original observations in terms of correlation and ubRMSD with in situ data (global median R = 0.72, ubRMSD = 0.05 m³ m⁻³). However, in some complex environments with sparse observation coverage, performance is lower.
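For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how the Pearson correlation R and the unbiased root-mean-square difference (ubRMSD) between a satellite series and an in situ series can be computed. It is an illustration under stated assumptions, not the evaluation code used for the dataset: the function names pearson_r, ubrmsd and valid_pairs are hypothetical, missing values are assumed to be encoded as None, and the sample values are invented for demonstration only.

```python
import math


def valid_pairs(x, y):
    """Keep only time steps where both series have a value (None = missing)."""
    return [(a, b) for a, b in zip(x, y) if a is not None and b is not None]


def pearson_r(x, y):
    """Pearson correlation coefficient between two series."""
    pairs = valid_pairs(x, y)
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_x * var_y)


def ubrmsd(x, y):
    """Unbiased RMSD: the RMSD after removing each series' mean,
    so that a constant offset between the two series is not penalised."""
    pairs = valid_pairs(x, y)
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    squared = [((a - mean_x) - (b - mean_y)) ** 2 for a, b in pairs]
    return math.sqrt(sum(squared) / n)


# Invented example values in m3 m-3; None marks a missing satellite observation.
satellite = [0.20, 0.25, None, 0.30, 0.22]
in_situ = [0.25, 0.31, 0.33, 0.34, 0.28]
print(pearson_r(satellite, in_situ), ubrmsd(satellite, in_situ))
```

Because ubRMSD removes the mean of each series before differencing, it measures how well the temporal dynamics agree rather than the absolute soil moisture level, which is why it is commonly reported alongside R.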
The new ESA CCI SM v09.1 GAPFILLED dataset is publicly available at https://doi.org/10.48436/s5j4q-rpd32 (Preimesberger et al., 2024) and will be updated yearly as part of the operational ESA CCI SM production.