Global snow water equivalent product derived from machine learning model trained with in situ measurement data
Abstract. Snow water equivalent (SWE) quantifies the volume of water stored in snowpacks and therefore contributes critically to the timing and amount of water discharged into groundwater sources and rivers. SWE has been estimated using various methods, including in situ measurements, remote sensing, and physics-based models. However, each of these methods presents certain limitations, including high costs, low spatiotemporal resolution, and uncertainty in model representation and parameter calibration. To address these challenges, we developed a machine learning-based daily global gridded SWE (SWEML) product with a spatial resolution of 0.25°, covering the period from 1980 to 2020. To develop this product, we first applied the k-means clustering algorithm to topographical and climatic variables to classify global in situ SWE measurements into 13 clusters. Subsequently, we adopted the random forest algorithm to relate daily in situ SWE measurements (n = 11,653) to meteorological forcing and terrain attributes. We compared SWEML with other SWE datasets, including the GlobSnow dataset from the European Space Agency, the Global Land Data Assimilation System dataset, and SWE estimates from the Advanced Microwave Scanning Radiometer for the Earth Observing System. Globally, the overall root mean square error (RMSE) was 10.80 mm and the overall bias was -6.89 mm. Accuracy was particularly high in mountainous and high-elevation areas, such as the Rocky Mountains in the U.S., with a Pearson correlation coefficient (R) of 0.99 and an RMSE of 16.88 mm. Furthermore, both snow accumulation during winter and snowmelt during spring were well depicted in SWEML, which is only possible with a high-temporal-resolution product.
Overall, the daily gap-free global SWEML product introduced in this study can contribute significantly to water resource management efforts in snow-dominant regions and provide a robust reference for data assimilation in global-scale land surface modeling. SWEML is available at https://doi.org/10.5281/zenodo.14195794 (Seo et al., 2024).
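For readers who want to reproduce the kind of error statistics quoted in the abstract (RMSE, bias, and Pearson R), the following is a minimal sketch using only the Python standard library. It is not the authors' evaluation code. The function names and the sample values are hypothetical, and the sign convention for bias (estimate minus reference, so a negative value indicates underestimation) is an assumption, since the abstract does not state it.

```python
import math


def bias(estimates, references):
    """Mean signed error, assumed here to be estimate minus reference."""
    return sum(e - r for e, r in zip(estimates, references)) / len(estimates)


def rmse(estimates, references):
    """Root mean square error between paired estimates and references."""
    squared = sum((e - r) ** 2 for e, r in zip(estimates, references))
    return math.sqrt(squared / len(estimates))


def pearson_r(estimates, references):
    """Pearson correlation coefficient between paired samples."""
    n = len(estimates)
    mean_e = sum(estimates) / n
    mean_r = sum(references) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimates, references))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_r = sum((r - mean_r) ** 2 for r in references)
    return cov / math.sqrt(var_e * var_r)


# Hypothetical SWE values in mm (illustrative only, not taken from the product).
swe_estimated = [10.0, 20.0, 30.0, 40.0]
swe_observed = [12.0, 22.0, 32.0, 42.0]

print(f"bias = {bias(swe_estimated, swe_observed):.2f} mm")
print(f"RMSE = {rmse(swe_estimated, swe_observed):.2f} mm")
print(f"R    = {pearson_r(swe_estimated, swe_observed):.2f}")
```

In this toy case the estimates are uniformly 2 mm below the observations, so the correlation is perfect even though bias and RMSE are nonzero. This illustrates why a high R and a nonzero RMSE, as reported for the mountainous areas, are not contradictory.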