A global dataset of standardized moisture anomaly index incorporating snow dynamics from 1948 to 2010

Drought indices are hard to balance in terms of versatility (effectiveness for multiple types of drought), flexibility 10 of timescales, and inclusivity (to what extent they include all physical processes). A lack of consistent source data increases the difficulty of quantifying drought. Here, we present a global monthly drought dataset from 1948 to 2010 based on a multitype and multiscalar drought index, the standardized moisture anomaly index incorporating snow dynamics (SZIsnow), driven by systematic fields from an advanced data assimilation system. The proposed SZIsnow dataset includes different physical water‒energy processes, especially snow processes. Our evaluation of the dataset demonstrates its ability to 15 distinguish different types of drought across different timescales. Our assessment also indicates that the dataset adequately captures droughts across different spatial scales. The consideration of snow processes improved the capability of SZIsnow, and the improvement is particularly evident over snow-covered high-latitude (e.g., Arctic region) and high-altitude areas (e.g., Tibetan Plateau). We found that 59.66% of Earth’s land area exhibited a drying trend between 1948 and 2010, and the remaining 40.34% exhibited a wetting trend. Our results also show that the SZIsnow dataset successfully captured the large20 scale drought events that occurred across the world; there were 525 drought events with an area larger than 500,000 km2 globally during the study period, of which nearly 70% had a duration longer than 6 months. Therefore, this new drought dataset is well suited to monitoring, assessing, and characterizing drought, and can serve as a valuable resource for future drought studies.


Introduction 25
Drought is one of the most costly and complex natural hazards, commonly causing significant and widespread adverse impacts on many sectors of society (Aghakouchak et al., 2015;He et al., 2020). The severity, extent, and duration of drought are likely to intensify across the world under the effects of climate change (Ault, 2020;Mann and Gleick, 2015). There has been increasing global interest in measures to improve the capability of drought quantification and various drought indices have been proposed over the past several decades (Liu et al., 2018;Esfahanian et al., 2017). However, current drought 30 indices struggle to reconcile versatility (ability to quantify multiple types of drought), flexibility of temporal scale (effective across different timescales), and comprehensiveness (to what extent they include all hydrological processes). Additionally, these drought indices are derived from multifarious data sources, rather than systematic and consistent physical data from the same source (Ahmadalipour and Moradkhani, 2017;Hoffmann et al., 2020). As a result, different sectors of society have rarely collaborated to synergistically fight against drought using a comprehensive drought index. 35 The propagation of drought is related to changes in numerous interconnected variables of hydrometeorological processes (e.g., precipitation, evapotranspiration, streamflow, and soil moisture). Yet a major portion of currently available drought indices focus on only one aspect of drought evolution. For example, the Rainfall Anomaly Index (RAI; Zhu et al., 2021), Streamflow Drought Index (SDI; Nalbantis and Tsakiris, 2009), and Soil Moisture Deficit Index (SMDI; Narasimhan and Srinivasan, 2005) focus only on precipitation (meteorological drought), streamflow (hydrological drought), and soil moisture 40 (agricultural drought), respectively ( Fig. 1, top row). Additionally, these indices merely consider water supply in drought and neglect water demand, but a drought is a condition of the water deficit between water supply and demand (Mishra and Singh, 2010). Thus, these indices do not provide sufficient information to enable decision-makers to organize a comprehensive antidrought approach that balances all sectors of society affected by drought.
Some indices were developed with the purpose of application to all types of droughts ( Fig. 1, second row). The Palmer 45 Drought Severity Index (PDSI; Alley, 1984;Wells et al., 2004) can be applied to different types of drought by considering water supply and demand with a simplified two-layer bucket model, but it has a fixed temporal scale and does not work well over snow-covered areas (Dai, 2011a). Although the Standardized Precipitation Evapotranspiration Index (SPEI; Vicente-Serrano et al., 2010) overcomes the PDSI's weakness of fixed temporal scale, it oversimplifies complex relationships and neglects several important hydrological processes associated with the development of drought (Zhang et al., 2015). 50 Moreover, current indices usually use physical variables from different data sources, which inevitably introduces bias and leads to an imbalanced calculation (Naumann et al., 2014). The development of the data assimilation system brings an option of systematic input for drought indices (Xu et al., 2020). Therefore, there is a need to develop a multitype and multiscalar drought index that considers various key processes related to drought and can take full advantage of output from the data assimilation system. 55 covered regions with considerable amounts of snowfall (Huning and Aghakouchak, 2020;Staudinger et al., 2014). To 65 address this deficiency, Zhang et al. (2019) modified the SZI by adding snow dynamics for drought characterization into a new version of the drought index called SZI snow (Fig. 1, final row). This is the drought index used to construct the drought dataset in this study.
Drought is mainly characterized by severity, spatial extent, duration, and timing. The traditional method for drought investigation is to explore the variability of its severity over a fixed study area (Hao et al., 2017). This method is also widely 70 used to evaluate the ability of a drought index (Peng et al., 2020). Although this method can provide certain information regarding the regional drought condition, it cannot analyze the change of spatial extent with time for a drought event (Zhai et al., 2017). As drought is a spatiotemporal process, the ability of a drought index to explore the joint evolution of drought events in space and time should be given increased attention (Herrera-Estrada et al., 2017). Thus, we utilized the severityarea-duration method (Andreadis et al., 2005;Sheffield et al., 2009), which can monitor drought in space and time, to 75 comprehensively evaluate the drought index dataset proposed by our work.
This work aims to construct a long-term global SZI snow dataset for various temporal scales. The SZI snow is here developed to characterize multitype and multiscalar drought by accounting for different physical water-energy processes, especially snow processes. This paper is organized as follows. We first introduce the data and metrics for forcing and evaluation of the proposed drought dataset (Sect. 2). The method behind the derivation of the SZI snow is summarized in Sec. 3. In Sect. 4 we 80 present a comprehensive evaluation of the SZI snow to assess its ability to capture different drought types across the world, particularly over the Arctic region and Tibetan Plateau. Based on the dataset, we further analyze the spatiotemporal changes of global drought and focus on the variability of large-scale drought events. Section 5 briefly introduces how to download the proposed drought index dataset. Finally, in Sect. 6, we discuss the advancement of the SZI snow and its potential applicability and implications, and present our conclusions.   Table S1.

Data for producing the SZIsnow drought index
Hydrometeorological variables from numerical models are commonly used as the source data to compute drought indices at the global scale, due to limited observational data exist. (Sawada and Koike, 2016). Thus, in this study, the Global Land Data Assimilation System (GLDAS) provided variables to calculate the SZI snow globally. The SZI was also calculated for the 95 purpose of comparison. The GLDAS is a state-of-the-art assimilation system using advanced land surface modeling and data assimilation techniques. It incorporates satellite-and ground-based monitoring data and aims to produce optimal land surface and flux variables (Rodell et al., 2004). Currently, two versions of the GLDAS product are available: GLDAS version 1 (GLDAS-1) and GLDAS version 2 (GLDAS-2). The better performance of GLDAS-2 compared to that of GLDAS-1 has been verified by previous work (Wang et al., 2016;Zhang et al., 2019), therefore we adopted GLDAS-2 to provide water-100 energy related variables to derive the SZI snow .
The GLDAS-2 drives the Noah land surface model (LSM), forced by the global Princeton meteorological forcing data, to approximate the observed land surface state (Rui and Beaudoing, 2011). The fields of land surface states and fluxes of GLDAS-2 in this study have a 0.25 spatial resolution and monthly temporal resolution. The ability of the assimilation system to capture the real state of the land surface is a main concern of its users. Numerous studies have assessed the 105 meteorological forcing fields (e.g., precipitation and near-surface temperature) and modeling outputs (e.g., soil moisture and evapotranspiration) of GLDAS-2 over different regions around the world (Bi et al., 2016;Spennemann et al., 2015). The GLDAS-2 product has generally been recognized as acceptable in spite of any biases and uncertainties. Additionally, GLDAS-2 provides abundant hydrometeorological information to areas with limited observations or ungauged areas. In particular, it bridges the gap between the scarce data available for the three poles (i.e., North Pole, South Pole, and Tibetan 110 Plateau) and the increasing attention of the science community on these areas because of their crucial role in Earth system science.

Data for evaluating the performance of the SZIsnow
We firstly assessed the capability of the SZI snow at a basin scale across the world by using the closed terrestrial water budget dataset developed by Pan et al. (2012). This is a monthly dataset for 32 major river basins, measured globally from 1982 to 115 2006. The drainage areas of these basins range from 230,000 to 600,000 km 2 , and their locations are shown in Fig. S1. This dataset was produced based on multisource data including in situ observations, remote sensing products, land surface model simulations, and reanalysis datasets. Through a systematic assimilation strategy, the errors and biases of the multisource data were greatly compensated, which guarantees the assimilated data has the highest possible confidence. This dataset has thus served as a baseline dataset for large basin-scale studies related to water and energy cycles and has been widely used by 120 previous researchers (Zeng and Cai, 2016). Additionally, the variables in this dataset include precipitation, evapotranspiration, streamflow, total terrestrial water storage, and snow depth. The comprehensive variables in the dataset facilitate the calculation of different drought indices as references to evaluate the SZI snow .
We applied a drought index, the Standardized Wetness Index (SWI), to evaluate the performance of the SZI and SZI snow at the global scale. The details of the SWI will be introduced in Sect. Version 3.1a. The CRU TS supplies monthly precipitation (P) and potential evapotranspiration (PET) data at a spatial resolution of 0.5 (https://crudata.uea.ac.uk/cru/data/hrg/). This PET data is computed by the Penman-Monteith equation.
The GLEAM provides monthly actual evapotranspiration (ET) data at a spatial resolution of 0.25 (https://www.gleam.eu/). Both the CRU TS and GLEAM cover a period from 1980 to 2010. In addition, the GLEAM dataset was interpolated from 130 the spatial resolution of 0.25 to that of 0.5 to facilitate the computation of the SWI.

Evidence of different drought types for the SZIsnow evaluation
We evaluated the ability of the SZI snow and SZI to capture different types of droughts based on drought evidence. First, meteorological, hydrological, and agricultural drought evidence was identified based on precipitation, streamflow, and soil water storage, respectively, from the dataset of Pan et al. (2012) (as mentioned in Sect. 2.2) over the 32 large basins. Then, 135 the evidence was compared with the SZI snow and SZI, calculated based on the GLDAS-2 product. In addition, for the convenience of comparison, we adopted the log-logistic distribution to standardize precipitation, streamflow, and soil water storage for the computation of the Standardized Precipitation Index (SPI; Mckee et al., 1993), Standardized Streamflow Index (SSI; Vicente-Serrano et al., 2012) and Standardized Water Storage Index (SWSI; Aghakouchak, 2014).
We also selected the residual water-energy ratio (WER) as a comprehensive drought indicator to evaluate the SZI . The 140 WER is defined as the ratio of residual available water (P-ET) to residual energy (PET-ET), and can integrally reflect drought conditions by depicting variation in waterenergy balance. The WER was first suggested by Liu et al. (2017) as many studies found that the ratio of sensible heat (the residual energy supply after dissipating through latent heat) to net radiation (total energy supply) is always raised under drought, meanwhile the ratio of residual available water to precipitation (total water supply) is always lowered under drought. Consequently, the WER is lowered during drought and 145 can be used as a comprehensive drought indicator. Again, we used independent datasets (i.e., the CRU TS and GLEAM datasets) to globally calculate the WER and compare it with the SZI snow and SZI. As for the SSI and SWSI, the WER was standardized for the calculation of the SWI.

Metrics for the SZIsnow evaluation
This study applied the SPI (a meteorological drought index), SSI (a hydrological drought index), SWSI (an agricultural 150 drought index), and SWI (a comprehensive drought index) as references to evaluate the performance of the SZI snow and SZI.
The four referenced drought indices were computed with datasets that were independent from the dataset used for calculating the SZI snow and SZI. We utilized Pearson correlation coefficients (r) of SPI-SZI/SZI snow , SSI-SZI/SZI snow , SWSI-SZI/SZI snow , and SWI-SZI/SZI snow to compare the performance of the SZI and SZI snow in terms of their capacity to capture multitype and multiscalar drought across different climate zones.

Hydrologic accounting
The physical processes included in the construction of the SZI snow are shown in Fig. 2. Six water budget components are involved in the procedure of hydrological accounting to determine the water demand over a region. The related variables 160 comprise ET, PET, runoff, potential runoff, soil infiltration, potential soil infiltration, soil moisture loss, potential soil moisture loss, snow water equivalent (SWE) accumulation, potential SWE accumulation, snowmelt, and potential snowmelt.
The monthly values of these variables were derived from land surface models, for instance, the GLDAS-2 Noah LSM in the present study. The prominent improvement of the SZI snow is that it accounts for the influence of snowfall on hydrological processes, which was completely ignored in the SZI (Zhang et al., 2019;Zhang et al., 2015). 165  Both the soil moisture storage and snow storage are considered as reservoirs in the SZI snow , which is different from the SZI that solely considered the former. Changes in soil moisture storage (soil infiltration or soil moisture loss) and snow (SWE accumulation or snowmelt) can alter the regional water balance (water supply or water demand), and then affect the drought condition. Consequently, the SZI snow contains more comprehensive hydrological processes than the SZI. The improvement of the SZI snow makes it applicable to a wider variety of climatic regions, especially for regions that belong to the three poles 175 where more snow is stored than at any other place on Earth. Detailed equations for the hydrologic accounting in the SZI snow are listed in Table 1. All full names for abbreviations contained in the equations are supplied in Table S2.

Procedures Variables Equations
Hydrological accounting The regional water supply firstly meets water demand from soil layers. The soil infiltration (R) was estimated by monthly changes (∆S and ∆S ) in available soil moisture of the top (S ) and bottom (S ) soil layers. The potential soil infiltration (PR) 180 was calculated as the difference between the available soil water capacity (AWC) and the available soil moisture of the entire soil. AWC is estimated as the maximum soil water of the two soil layers (Fig. 2) in the Noah LSM. Then, the rest of the regional water supply satisfies water demand from runoff (RO). RO consists of surface runoff (RO , baseflow (RO ), and snowmelt runoff (RO ), which are directly obtained from the GLDAS-2. The potential runoff (PRO) is the difference between AWC and PR, because soil moisture storage is considered as a water reservoir in the SZI . Additionally, water 185 supply is partly consumed by ET, including bare soil evaporation (E ), transpiration (E ), and canopy water evaporation (E ) that can be found in the output of the GLDAS-2. The PET is computed with output fields from GLDAS-2 using the PenmanMonteith equation. Moreover, the moisture loss (L) from the soil layers is considered in the SZI snow . The equations for L and its potential values are shown in Equation 4 of Table 1. Lastly, calculations of variables related to snow processes underscored by the SZI snow are given in Equations 5-6 of Table 1. The potential snow accumulation (PSA) equals the 190 monthly amount of snowfall (P ), and the monthly SWE change completely reflects snow accumulation (SA) and snowmelt (SM).

Climatic coefficients and precipitation that is climatically appropriate for existing conditions (CAFEC)
Similar to the PDSI, the SZI snow applies the precipitation amount that is climatically appropriate for existing conditions (CAFEC, referred as P ) to quantify the regional water demand. The amount of P is the result of interaction among 195 the six water budget components as shown in Equation 8 of Table 1. The weighting factor for each component is the climatic coefficient, which is defined as the ratio of the monthly climatic averages of actual (water supply) to potential (water demand) values. The equations used to compute these climatic coefficients are listed in Equation 7 of Table 1. The j in Equation 7 denotes months of a year, that is, each water budget component has 12 values of climatic coefficient covering all months.

Standardization of moisture anomaly 200
The comparison between the actual precipitation (P) and P can reflect the drought condition. When the actual P is less than P , the regional water supply will remain in deficit, and vice versa for a surplus. Thus, the difference between the actual P and P is an appropriate indicator for regional water deficiency or surplus. Such difference is defined as the moisture anomaly Z (Equation 9  SZI is 0, and the standard deviation is 1. Finally, we scaled the SZI categorization levels to the corresponding SPEI 210 drought severity categories in Table 2 because the same standardization method is used for both.

Identification of large-scale drought events in space and time
Using the SZI dataset constructed by this study, we performed a global and continental drought analysis for the period 1948-2010. We focused on the temporal variability of large-scale drought events through a severity-area-duration (SAD) 215 drought diagnosis method (Andreadis et al., 2005;Herrera-Estrada et al., 2017). In contrast to traditional studies which analyze the intensity, severity, and duration of drought over a fixed region, the SAD method specializes in simultaneously tracking the development of droughts in space and time based on a gridded dataset. This method proposes a Lagrangian approach by aggregating grids (under specified drought levels) of contiguous areas into clusters. These clusters are then tracked and archived as they propagate through space and time. The main steps of the SAD method are outlined as follows. 220 The SAD method firstly uses a monthly three-dimensional (3D) gridded drought index dataset to identify two-dimensional (2D) drought clusters in each time step. This drought cluster identification procedure is built on a clustering algorithm that merges spatial contiguity. Then, a median filter is utilized to smooth out noise (i.e., small-scale heterogeneity) in the 2D clusters in each time step. Specifically, we regarded a grid with a SZI value below -1.0 as being under drought and considered connected areas within which all the grids had a SZI below -1.0 as a drought cluster. The last step of the 225 clustering procedure is to remove clusters with an area less than 500,000 km 2 . The remaining clusters are regarded as separate drought events for each time step. More importantly, these identified drought clusters are allowed to split or merge through time in the SAD method. The tracking algorithm links clusters that have overlapping grid cells and records the merging or splitting date, areas, and centroids of clusters. These functions in the SAD method make it possible for us to monitor the spatiotemporal evolution of large-scale drought events. A large-scale drought event in North America identified 230 by the SAD method is given as an example in Fig. S2 to illustrate this method.

Evaluation of the SZIsnow for different drought types
We firstly evaluated the capability of the SZI snow to capture meteorological drought at different temporal scales over the 32 235 large basins from 1948 to 2010, and compared this to the capability of the SZI. The observation-based meteorological drought index, SPI, was used as a reference. As shown in Fig. 3a, the blue boxes represent the statistical distribution of the Pearson correlation coefficient (r) between the SZI and SPI for 1-48 month timescales in each basin, while the red boxes represent that between the SZI snow and SPI. The SZI snow generally outperformed the SZI over these large basins with an average improvement of 3.19% (ranging from 0.01% to 6.45%). It is clear that the extent of improvement in the SZI snow 240 increases as the SWE increases. For instance, the r of the SZI snow -SPI is 5.51% higher than that of the SZI-SPI in the Pechora basin that has an SWE of 47.5 mm yr 1 . In contrast, the r of the SZI snow -SPI is only 0.11% higher than that of the SZI-SPI in the Indus basin that has an SWE of 3.70 mm yr 1 . The relationship between the enhancement in the SZI snow and SWE implies that the SZI snow brings advantages in accounting for snow processes. It also demonstrates that the SZI snow appropriately reflects the fact that snow accumulation and melt have considerable impacts on the seasonal and inter-annual 245 variation of streamflow in snow-covered areas. In summary, the SZI snow has a satisfactory performance to capture meteorological drought.
We then evaluated the capability of the SZI snow to capture hydrological drought (Fig. 3b). The observation-based hydrological drought index, SSI, was used as a reference. The SZI snow generally outperformed the SZI over these large basins, with an average improvement of 3.13% (ranging from 0.25% to 17.53%). The extent of improvement in the SZI snow increases 250 as the SWE increases. For instance, the r of the SZI snow -SSI is 17.53% higher than that of the SZI-SSI in the Pechora basin that has an SWE of 47.5 mm yr -1 . In contrast, the r of the SZI snow -SSI is only 1.18% higher than that of the SZI-SSI in the Indus basin that has an SWE of 3.70 mm yr -1 . Moreover, among the multiple temporal scales over which it was tested, the SZI snow performs best at the 12-month scale for hydrological droughts. At the 12-month scale, the SZI snow performs 17.53%, 11.46%, 19.40%, and 4.88% better than the SZI in the Pechora, Northern Dvina, Yenisei, and Kolyma basins, respectively. 255 Thus, the SZI snow performs well in the context of capturing hydrological drought.
The capability of the SZI snow and SZI to capture agricultural drought was also assessed in our study. We conducted the same steps of assessment as those for assessing hydrological drought, but the reference drought index was an agricultural drought index, SWSI, derived from a dataset based on observations. As shown in Fig. 3c, the average r of the SZI snow -SWSI for all the basins is 0.51 (ranging from 0.19 to 0.79), which indicates the ability of the SZI snow to reliably capture agricultural 260 drought. Additionally, the SZI snow performs better than the SZI in almost all basins, and the average improvement of the SZI snow is 6.46% (ranging from 0.14% to 38.96%). Again, larger improvements occurred in basins with a larger SWE. This comparison, in terms of agricultural drought, again emphasizes the strength of the SZI snow in high-latitude and high-altitude regions with relatively greater SWE. In summary, the SZI snow performs sufficiently well to capture agriculture drought.

Evaluation of the SZIsnow across different spatial scales
The SZI snow can be computed and used to characterize drought for an individual grid, although we evaluated its capability at a basin scale in Sect. 4.1.1. In this section we apply the SWI drought index as a reference to assess the SZI snow at the global scale (Figs. 4a4b). The Hovmöller diagram (Fig. 4a) shows the distribution of the difference between the r of the SZI -SWI and that of the SZI-SWI for 1-48 month timescales across different latitudes. It is clear that the high-value 280 zonal mean difference mainly centers in the interval of 50-65 N. This indicates that the SZI snow outperformed the SZI within this 15 interval in high latitude areas. In contrast, the remaining regions, outside of this interval, show only small magnitude differences. In addition, as shown in Fig. 4b, the improvement of the SZI snow varies over different timescales; it performs better over timescales in the range of 3-12 months. Such spatial patterns, as shown in Figs. 4a4b, emphasize the physical improvement in terms of snow processes in the SZI snow construction compared to the SZI. This evaluation shows the 285 appropriate performance of the SZI snow at the global scale.
As the three-pole region is a focus of this study, we specifically compared the SZI snow and SZI over the Arctic region, where the latitude is larger than 66° 33' N. Figs. S3aS3b presents the spatial distributions of the r of the SZI snow -SWI and SZI-SWI, respectively, over a 12-month timescale. The two maps show similar spatial patterns for the SZI snow and SZI, yet the r of the SZI snow -SWI is larger than that of the SZI-SWI over the majority of the Arctic region, indicated by the positive 290 difference shown in Fig. 4c. Once again, the SZI snow is seen to outperform the SZI over the Arctic region, which is consistent with the findings from the global evaluation shown in Figs. 4a4b. Additionally, the relationship of area-averaged r and timescales is shown in Fig. 4d. The maximum r appears when the timescale is 12-months, and the relative difference between the r of the SZI snow -SWI and that of the SZI-SW (i.e., the improvement of the SZI snow ) shows a rapid growth moving from 1-to 12-month timescales (Fig. 4d, insert plot). The results demonstrate that the SZI snow dataset performs well 295 over the Arctic region.

55˚S to 85˚N for timescales ranging from 1 to 48 months. (b) Distribution of the difference for specific timescales (6, 9, 12, and 15 months) with changing latitude. (c) Spatial distribution of the differences between the correlation coefficients of the SZIsnow-SWI and those of the SZI-SWI over a 12-month timescale in the Arctic region. (d) Variations of correlation coefficients averaged over the Arctic region for various temporal scales. The inset shows the change of relative difference (%) for these temporal scales.
The risk of drought on the Tibetan Plateau, the world's third pole, can affect the water supplies of billions of people. Figure 5  305 shows the capability of the SZI snow and SZI to capture drought at various temporal scales over the Tibetan Plateau. Both the SZI snow and SZI have high r values with the SWI over a large part of the Tibetan Plateau. The r of the SZI snow -SWI is larger than 0.6 across 68.96% of the entire Tibetan Plateau, and for the SZI-SWI this value is 61.93% (Fig. 5, left and central   columns). The area-averaged r of the SZI snow -SWI is 0.72 and that of the SZI-SWI is 0.65 over a 12-month timescale, equating to an improvement of 10.77% for the SZIsnow. Moreover, the phenomenon that the SZI snow outperforms the SZI is 310 clearly shown in the right column of Fig. 5. The largest improvement is seen mainly in the northwest corner and southeastern part of the Tibetan Plateau, where the largest snow depths are also seen (Dai et al., 2017). Thus, the SZI snow dataset is a reliable resource to quantify drought across the Tibetan Plateau.

Historical trends in global drought
The proposed SZI snow dataset was applied to investigate the historical changes in global drought between 1948 and 2010. The spatial distribution of the linear trend in the SZI snow over different timescales (i.e., 3-, 6-, 12-, 15-months) is shown in Fig. 6. 320 The SZI snow at each temporal scale demonstrates a similar global pattern, except for differences in the magnitude of dryness or wetness trends. Overall, 59.66% of the land area of the Earth displays a drying trend, and the remaining 40.34% exhibits a wetting trend. As shown in Fig. 6, the SZI snow shows a drying trend over eastern Asia, northern India, most of the Arabian Peninsula and Africa, eastern Australia, and central and southern Europe; increased wetness was found over most of the United States, a large part of South America, and central Australia. Additionally, the drying trend tends to increase as the 325 timescale becomes longer. For instance, the drying rate of the SZI snow over eastern Asia becomes larger as its timescale increases. Moreover, our results are broadly consistent with the findings of Dai (2013) who analyzed the trend of global drought using the self-calibrated PDSI. This also implies that the SZI snow is a useful proxy of aridity changes. We further examined variations in the area of global land under drought (Fig. 7a). The area under drought shows an increasing trend with an average rate of increase of 0.05% yr -1 . Large fluctuations began to emerge from 1975, and Earth's drought area increased rapidly in the early 1980s. This growth was largely attributed to the leap in temperature caused by the 335 1982-1983 El Niño (Timmermann et al., 1999;Dai, 2011b). The maximum extent of drought area appeared in 1991.
Moreover, the temporal change in the global moisture anomaly Z is shown in Fig. 7b. The Z displays a global downward trend of 0.11 mm yr -1 for the period 1948-2010, which indicates the increasing global deficit between water supply and water demand. Overall, our analysis based on the SZI snow dataset revealed increased aridity over many land areas, and severe and widespread droughts over the Earth since 1948.

Statistics of large-scale drought events 345
Using the SZI snow dataset proposed in this study, we analyzed global and continental large-scale drought events (hereinafter referred to as drought) from 1948 to 2010 by leveraging the SAD drought diagnosis method. There have been 525 droughts with an area larger than 500,000 km 2 globally during the study period, as shown in Table 3. Also outlined in Table 3 is detailed information for the droughts with the longest duration and the largest area, respectively, for each continent.
Droughts with a duration longer than 6 months account for nearly 70% of all droughts. The longest drought that occurred in  Table 3. Summary of large-scale drought occurrence for each continent. In the fourth column, the duration of the drought is shown in months, and the period is listed in parentheses. In the final column, the spatial extent given as a percentage of the total continental area, and the date at which the maximum spatial extent occurred, is listed in parentheses.

Region
Number We further ranked the top five droughts in terms of duration and maximum spatial extent for each continent (Table 4). For Asia, the longest drought lasted 28 months, and its droughts commonly extend across larger areas compared to other 360 continents. The top five longest droughts in Europe had a relatively short duration compared to other continents. In Africa, the longest drought lasted 27 months, and the maximum extent was 10 million km 2 ; of all analyzed droughts, 60% occurred in the period from the mid-1980s to the mid-1990s; it is clear that there was a prolonged drought spell over this period.
Moreover, the droughts in North America always have a longer duration compared to other continents.  Table 4. Top five drought events in each continent, ranked by duration, or by maximum spatial extent. The duration and spatial 365 extent are listed in parentheses after the period of each drought event.

Region
Duration (

Temporal variability of large-scale drought events
The temporal variation of area-averaged SZI snow , the area under drought (pixels with SZI snow less than -1.0), and contiguous areas under drought are shown and analyzed in Fig. 8, in which the vertical pink dashed lines mark the top five most 370 extensive droughts in each continent. We also selected three of the top five most extensive droughts to show their spatial distribution (Fig. 9). The global averaged SZI snow displays a significant downward trend of 0.02 decade -1 (95% confidence level, Fig. 8a), which indicates a global drying trend. This drying trend was closely related to increases in temperature over the study period. Accordingly, the global area under drought shows an upward trend (0.31% decade -1 ) and approaches a plateau over the period 1985-1995. It is clear that the contiguous area under drought demonstrates a similar pattern of 375 variability to the area under drought for each continent and globally. Such similarity implies the large-scale drought identified by the SAD method can largely reflect the variability of the global area under drought.
Asia experienced a drying trend, based on the area-averaged SZI snow , during the period 1948-2010 (Fig. 8b); the contiguous area under drought ranges from 0% to 29.30%, with an average of 10.14%. With large fluctuation, droughts in early 1990s are salient features of the time series of Asia, and three of the five droughts with the largest spatial extent occurred during the 380 1990s. The drying trend in east Asia was mainly caused by weakening summer monsoons owing to changes in the El Niño-Southern Oscillation (ENSO) and the Pacific Decadal Oscillation (Zhang and Zhou, 2015). The large-scale severe droughts in the Middle East and southwest Asia were closely related to La Niña (Barlow et al., 2016). Additionally, the temporal variability within Asia is comparably small, mainly due to the dampening effect of its large spatial scale (Sheffield et al., 2009). In Europe, high variability of the contiguous area in drought was detected in the first half of the 1950s (Fig. 8c). The 385 drought condition alleviated somewhat between the mid-1950s and the mid-1970s. The high variation was repeated in the 1990s, and was associated with multiple periods of droughts with large spatial extent. In particular, large-scale droughts identified by the SZI snow occurred with a greater frequency over central Europe compared to other parts of Europe. The leading driver behind this pattern was the significant increase in potential evapotranspiration (Spinoni et al., 2015a). The findings in Europe, based on SZI snow , are broadly in agreement with other studies (Lloyd-Hughes and Saunders, 2002;390 Spinoni et al., 2015b).
In Africa, the area-averaged SZI snow exhibits a visible drying trend from 1948 to 2010 (Fig. 8d). The time series of drought areas underwent a gradual climb and achieved a maximum value in the mid-1980s, with a severe drought period then lasting until the mid-1990s. All the top five spatially extensive droughts are concentrated within this period and are commonly located to the south of the Sahara Desert (Figs. 9g-9i). Our results for Africa are generally similar to previous studies, which 395 concluded that ENSO and sea surface temperature (SST) are the main driving forces of droughts across the entire continent (Masih et al., 2014). For Oceania, strong drought spells occurred frequently from the 1950s to the 1970s (Fig. 8e). This continent is characterized by its high percentage of large-scale drought areas, and multiple droughts account for more than of global climate, for instance, the Interdecadal Pacific Oscillation (IPO) and Southern Annular Mode (Askarimarnani et al., 400 2021;Kiem et al., 2016).
In North America (Fig. 8f), the evident drought spells in the 1950s were captured by the SZI snow , and the largest drought area covered 37% of the entire continent. As shown in Fig. 9o, the drought in March 1964 covered most of the United States.
Previous studies confirmed that the tropical part of the SST anomalies was primarily related to the most notable droughts of the 1950s in the United States (Schubert et al., 2004). Another two obvious drought signals are found in the late 1970s and 405 1990s. The droughts detected here with the SZI snow show close correspondence to the findings of previous studies (Su et al., 2021;Andreadis et al., 2005). Moreover, notable distinct dry spells emerged in the 1960s and 1990s in South America (Fig.   8g). For instance, the largest drought in October 1963 covered up to 54% of this continental area (Fig. 9r)

Discussion and conclusions
This study proposes a drought index dataset on the basis of a new drought index, SZI snow , by incorporating snow dynamics into the SZI. Results from the evaluation of the SZI snow dataset suggest that consideration of snow processes can improve the 420 performance of the SZI snow . The improvement is remarkable when the SZI snow is applied in snow-covered areas, including high-latitude and high-altitude areas. Our results highlight the importance of snow in drought development because it can greatly affect the onset, cessation, severity, location, and duration of drought (Huning and Aghakouchak, 2020;Staudinger et al., 2014). Snow serves as the main water resource for many regions of the world (e.g., western United States) through its accumulation in the cold season and melting in the warm season. However, climate change is altering the effect of snow on 425 the availability of water resources. Increasing temperature leads to less snowfall and earlier snowmelt, and further results in a mismatch between the peak of streamflow and that of water demand, which can increase the drought risk over these regions (Adam et al., 2009;Özdoğan, 2011). The results of the present work underscore the importance of considering snow processes in drought quantification under global climate change.
Using the proposed SZI snow dataset, this study emphatically analyzed the severity-area-duration of global and continental 430 large-scale drought. The SZI snow dataset achieved a satisfactory performance in monitoring the propagation of large-scale contiguous droughts through space and time. Using the SAD drought diagnosis method, the SZI snow dataset appropriately captures the numbers and variability of historical large-scale contiguous droughts for each continent. These captured drought events are broadly aligned with findings from previous research (Zhang and Zhou, 2015;Mctainsh et al., 1989;Kiem et al., 2016;Lloyd-Hughes and Saunders, 2002). Such performance implies the present dataset can be applied globally to 435 understand the mechanisms behind large-scale droughts. It also raises confidence in the ability of the SZI snow to predict drought events, especially those with extensive spatiotemporal influence. Moreover, our results indicate that large-scale contiguous droughts control, to a large extent, the character of the variation of global drought. Thus, the capacity to track the evolution of large-scale droughts in space and time is a crucial aspect for the assessment of a drought index.
The SZI snow absorbs the advantages of both the PDSI and SPEI and can be used to monitor multitype droughts at various 440 temporal scales. Compared to the PDSI, it considers more hydrological components related to water supply and demand, and quantifies their contribution to water demand by weight. Such consideration enhances the physical realism of drought quantification, particularly over high-latitude and high-altitude regions that usually receive substantial snowfall (Zhang et al., 2019). The enhancement achieved by the SZI snow implies that more key physical processes should be considered when constructing a drought index, rather than using a simple generalization, although we admit that a sophisticated index is 445 always limited by insufficient observation to some extent. However, data assimilation serves as a new way to overcome the difficulty of insufficient observation (Mishra and Singh, 2011). This new method combines a multi-source dataset and an advanced land surface model to provide optimal values of variables related to drought, which is the reason why we used GLDAS-2 as the forcing means of SZI snow calculation. Therefore, the improvement of SZI snow indicates that more attention should be paid to the combination between the drought index and the data assimilation system (DAS) or LSM. The combination between the SZI snow and the DAS provides the possibility to track droughts over ungauged areas. As more models (e.g., crop model, wildfire model, root model) have been coupled with the DAS, the combination between the SZI snow and the DAS has become more physically realistic. Yet, uncertainties from the DAS will inevitably be introduced into the SZI snow , which undermines the reliability of the SZI snow . Previous studies have often obtained dissatisfactory results during the validation of the GLDAS-2 (e.g., Fatolazadeh et al., 2020). These uncertainties originate from incomplete model 455 structure, forcing data biases, and biases in parameter estimation (Qi et al., 2020). However, recent developments in LSMs, DAS techniques, and computational power are helpful in resolving issues associated with uncertainty. Thus, determining how to introduce uncertainty quantification when utilizing the SZI snow to assess drought is a future goal of ours.
The SZI snow is a comprehensive drought index because it incorporates different aspects of the hydrologic cycle, which provides a clear-cut way to synthesize different kinds of information related to drought into a simple message. Such 460 synthesizing capacity is particularly crucial because droughts have a broadly adverse influence on agricultural water, municipal water, energy supply (hydropower), and human and animal safety. Thus, the SZI snow has a high potential to be utilized for drought management. Currently, however, the SZI snow is mostly used only by the scientific community (Lu et al., 2020;Ayantobo and Wei, 2019) and rarely used by decision-and policy-makers. One reason for this is that the acquisition of best-fit thresholds in the SZI snow , for one type of drought over an area with a specific climate regime, requires a trial-and-465 error approach and takes time. On the other hand, drought management is a synergistic effort involving a variety of sectors and requires joint operations of these sectors. Additionally, the complexity of calculations is a limitation of the SZI snow . Therefore, it is necessary to strengthen the user-friendliness of the SZI snow and collaborate closely with government departments related to drought management.

Data availability 470
All datasets used in this work are freely available. The SZI snow dataset proposed by this work is a good contribution to the study of climate change, ecology, and hydrology. It is especially helpful for research focusing on spatiotemporal dynamics of drought, the underlying mechanisms of drought evolution, and the development of drought indices. The dataset contains 48 individual files with timescales of 1-48 months and has been archived in the Network Common Data Form (NetCDF) format.
The monthly SZI snow in each file covers the Earth's land area and has a spatial resolution of 0.25. The SZI snow dataset is 475 freely downloadable from the Zenodo repository at the following URL: http://doi.org/10.5281/zenodo.5627369 (Wu et al., 2021). In addition, we also published the dataset to the National Tibetan Plateau/Third Pole Environment Data Center, which has been accredited by the Earth System Science Data, and specializes in collecting, integrating, and publishing geoscientific data on and surrounding the Tibetan Plateau and the three poles. The SZI snow dataset can be downloaded from this data center at the following URL: http://data.tpdc.ac.cn/en/disallow/b039fde6-face-4d24-af45-d238a6af18b7/.

Summary
In the current study, we have produced a global monthly SZI snow dataset over 1-48 month timescales from 1948-2010. This dataset is an important contribution to drought quantification and development of drought indices because it is built on the SZI snow , a multitype and multiscalar drought index absorbing the strengths of the SPEI and PDSI. Our SZI snow dataset has achieved a remarkable improvement in drought assessment across the world, particularly for high-latitude and high-altitude 485 areas. This improvement implies that consideration of snow processes can improve the performance of a drought index.
Moreover, the SZI snow dataset can successfully monitor the spatiotemporal propagation of large-scale drought events. We expect this dataset could serve as a valuable resource for drought studies, further contributing to promoting our understanding of the mechanisms behind global drought dynamics.