STAR NDSI collection: A cloud-free MODIS NDSI dataset (2001– 2020) for China

. Snow dynamics are crucial in ecosystems, affecting radiation balance, hydrological cycles, biodiversity, and human activities. Snow areas with notably diverse characteristics are extensively distributed in China, mainly including Northern Xinjiang (NX), Northeast China (NC), and Qinghai-Tibet Plateau (QTP). Spatio-temporal continuous snow monitoring is 10 indispensable for ecosystem maintenance. Nevertheless, the formidable challenge of cloud obscuration severely impedes data collection. In the past decades, abundant binary snow cover area (SCA) maps have been retrieved from moderate resolution imaging spectroradiometer (MODIS) datasets. However, the integrated normalized difference snow index (NDSI) maps containing additional details on snow cover extent are still extremely scarce. In this study, a recent 20-year stretch seamless Terra–Aqua MODIS NDSI collection in China is generated using a Spatio-Temporal Adaptive fusion method with erroR 15 correction (STAR), which comprehensively considers spatial and temporal contextual information. Evaluation tests confirm that the cloud-free STAR NDSI collection is superior to two baseline datasets. The omission error decreased by 10% in NX compared to the snow cover extent product, and the average correlation coefficient increased by 0.11 compared to the global cloud-gap-filled MODIS NDSI product. Consequently, this collection can serve as a basic dataset for hydrological and climatic modeling to explore various critical environmental issues in China. Typical long-term cloud-free snow cover products covering most snow-dominated regions in China.

on the quantity and quality of training data.
Increasing studies have moved to the MODIS C6 suite since its release in 2016. In C6 data, snow cover is reported as 65 NDSI rather than SCA and FSC. NDSI is an index that is related to the snow presence in a pixel and is a more accurate description of snow fraction as compared to SCA and FSC (Riggs and Hall, 2015;Riggs et al., 2017). The clear-sky accuracy of C6 NDSI datasets is robust compared to higher resolution remote-sensed images (such as Landsat and Sentinel series) and in-situ measurements (Crawford, 2015;Zhang et al., 2019;Aalstad et al., 2020). As a basic data, it has the significant advantage of allowing users to more accurately determine SCA or FSC for their particular study areas and application requirements (Hall 70 et al., 2019). For example, several optimal classification thresholds for SCA Malmros et al., 2018;Tong et al., 2020) and specially tuned mapping methods for FSC (Kuter et al., 2018;Hou et al., 2020;Zhang et al., 2021) were designed to generate regional SCA and FSC datasets from NDSI snow cover datasets, which were superior to the globally harmonized algorithms in C5 data. However, severe cloud contamination also limits the application of NDSI datasets, resulting in many studies only considering cloud-free areas (Kuter et al., 2018;Malmros et al., 2018;Tong et al., 2020;Hou et al., 2020;75 Zhang et al., 2021). Therefore, a cloud-free NDSI dataset is significant for in-depth snow cover research. Since the aforementioned cloud removal methods were generally designed for binary SCA, their applicability to NDSI with more complicated spatio-temporal characteristics should be improved. Thus, several gap-filling methods with an associated concern of spatial and temporal correlations of snow presence were proposed to remove clouds from NDSI Chen et al., 2020;Li et al., 2020). Among these methods, the spatio-temporal feature-based methods with relatively high robustness 80 are more effective for improving NDSI datasets .
Many studies on snow monitoring in China are available, and most of these studies focus on binary SCA mapping. On the regional scale, QTP, which is known as the world's third pole, plays a key role in the global climate system. Nevertheless, snow cover mapping is particularly challenging over QTP due to the frequent cloud cover resembling fragmented snow. A large number of studies have demonstrated that the snow cover variability over QTP is extremely complex, with significant 85 spatio-temporal heterogeneity (Gao et al., 2012;Tang et al., 2013;Yu et al., 2016;Liang et al., 2017;Zhang et al., 2012). NX (Wang et al., 2008) and NC (Che et al., 2016) located in mid-latitude areas are dominated by seasonal snow cover. Che et al. (2019) presented an integrated snow cover dataset from a distributed hydrometeorological observation network in the Heihe River Basin, which achieved a prominent demonstration of data synthesis at a watershed scale. In addition, the large-scale transient snow cover areas increase the level of challenge for generating high-quality snow cover datasets. On the national 90 scale, Huang et al. (2016) obtained a long-term cloud-removed SCA product using a multi-source data fusion method. Despite many relevant studies, only a few cloud-free snow cover datasets have been released publicly.
Several typical long-term cloud-free snow cover products available online are listed in Table 1 (datasets are referenced via DOI), which cover most snow-dominated regions in China. Huang (2020) provided MODIS daily cloudless SCA products Thapa (2020Thapa ( , 2021 obtained eight-day/daily MODIS SCA and glacier composite datasets for High Mountain Asia by aggregating seasonal, temporal, and spatial filters, which can serve as a valuable input for hydrological and glaciological investigations. Hao et al. (2021;2022) yielded two long-term daily SCA datasets over China through a series of processes such as quality control, cloud detection, snow discrimination, and gap-filling (including hidden Markov random field and snowdepth interpolation techniques). Their releases and updates promoted the research of snow cover characteristics in China. Qiu 100 et al. (2017) yielded a daily FSC dataset with detailed snow cover information over High Mountain Asia with MDC and spatial filtering. Additionally, the global cloud-gap-filled MODIS NDSI dataset (MOD10A1F) is available online since 2020, where cloud-covered grids in the MODIS Terra NDSI product are filled by retaining clear-sky observations from previous days (Hall and Riggs, 2020). However, this dataset performs poorly in China, where periodic and transient snow is dominant. In general, cloud-free SCA datasets produced by composite algorithms are frequently released, while high-quality cloud-free NDSI 105 datasets are still scarce.
To this end, this study generates a spatiotemporally continuous Terra-Aqua MODIS NDSI product with satisfactory accuracy for China, fully considering the spatio-temporal characteristics of regional snow cover variability. A Spatio-Temporal Adaptive fusion method with erroR correction (STAR) improved from our previous work  is utilized to eliminate cloud obscuration. The long-term detailed snow cover extent dataset will facilitate snow-related scientific studies 110 and practical applications in China.
The rest of this paper is arranged as follows. Firstly, Section 2 describes the input data and the proposed cloud removal method. Then, Section 3 presents the verification accuracy of STAR NDSI collection, with a subsequent analytical application.
The cloud removal effectiveness under different cloud coverages is discussed in Section 4. Finally, the data availability and the conclusions are provided in Section 5 and Section 6, respectively. 115 https://search.earthdata.nasa.gov/). The NDSI_Snow_Cover (hereafter referred to as NDSI) scientific data set with a range of 0, 10 to 100 was used in this study. As shown in Fig. 1, the NDSI of 19 tiles covering China (excluding sea area) from 1 August 2000 to 31 July 2020, was acquired to generate snow cover maps. The 90 m digital elevation model (DEM) dataset of Shuttle 125 Radar Topographic Mission (SRTM) was obtained from the United States Geological Survey (USGS). In addition, two existing cloud-free snow cover products, including the daily snow cover extent at a 5 km resolution (NIEER AVHRR SCE, https://doi.org/10.11888/Snow.tpdc.271381, Hao et al., 2021) and the daily cloud-gap-filled MODIS NDSI at a 500 m resolution (MODIS CGF NDSI/MOD10A1F, https://doi.org/10.5067/MODIS/MOD10A1F.061, Hall and Riggs, 2020), were used for comparison. For reference data, the snow depth measurements respectively derived from 49 and 102 meteorological 130 stations in NX and QTP (Tibet Meteorological Bureau and National Meteorological Information, 2018) were used for stationbased validation. Since the snow depth data can only assess the classification performance of MODIS NDSI retrievals, the NDSI maps derived from Landsat OLI images were utilized for comprehensive validation.

Algorithm description
MODIS NDSI datasets are unable to represent the daily conditions of snow accumulation and ablation accurately because the optical remote-sensed images are subject to severe cloud pollution. Therefore, a Spatio-Temporal Adaptive fusion method with erroR correction (STAR), which is derived from our two-stage spatio-temporal fusion method , is presented to produce a spatio-temporal continuous snow collection. As shown in Fig. 2, the generation procedure comprises the pre-140 process TAC and the key-process STAR. Then, a quality assessment (QA) approach is presented to provide a data reliability profile for users. On this basis, post-processing is used to further improve the data quality in individual abnormal areas.

Terra and Aqua combination (TAC)
TAC blends the same-day snow maps deriving from MODIS sensors onboard Terra and Aqua satellites. Its cornerstone is the unlikely significant changes of the snow pattern within the data-acquired time interval (approximately 3 h). Since TAC can efficiently decrease the cloud fraction by 5%-20% with negligible precision sacrifice , it is introduced as a preprocessing to reduce cloud coverage preliminarily. Its priority scheme is determined as high value > low value > cloud. 150 the pre-processed NDSI maps after TAC (referred to as TAC NDSI dataset in subsequent sections). The snow in low altitude and low latitude areas during summer is reversed to no snow to alleviate commission errors inherited from the original data.
In addition, since the Aqua dataset is available since July 2002, the key-process STAR is directly used to remove clouds from 155 Terra MODIS NDSI dataset between August 2000 and May 2002. Particularly, the improved Aqua MODIS C6 NDSI dataset significantly enhances the effectiveness of TAC due to the successful restoration of the absent Aqua MODIS band 6 data by the quantitative image restoration method (Gladkova et al., 2012).

Spatio-Temporal Adaptive fusion with erroR correction (STAR)
Many regions with persistent clouds are out of the scope of TAC. To this end, an advanced STAR method, which 160 comprehensively utilizes spatio-temporal contextual information, is proposed to remove the clouds thoroughly. As shown in

165
The first pass involves the generation of new NDSI maps by adaptively merging the spatio-temporal contextual information, including space partition, adaptive space-time block determination, and Gaussian kernel function (GKF)-based fusion. The research area is first segmented into dozens of partitions considering the spatial heterogeneity of snow patterns.
Thus, the subsequent processes can be performed on a partition basis. Moreover, the optimal query partitions (Q) to each target partition (T) are determined by a comprehensive consideration of temporal distance (t), regional correlation (r), and cloud-free 170 fraction (f) concerning the temporal complexity of snow variations. The following optimal parameters are derived from the extensive experiments.
( 2 ) where the regional correlation between the candidate partition (C) within an eight-day window and the target partition is considered representative if the fraction of the intersecting cloud-free areas ( & C T f ) is higher than 0.3. The candidate partition 175 is then determined as a query partition according to Scheme 1 when the regional correlation is larger than 0.7. Otherwise, Scheme 2 is activated. Two query partitions with short distance and high cloud-free fraction are identified within the preceding eight days and the backward eight days, respectively. Subsequently, the 3 × 3 neighborhoods for each pixel of the target partition in all the associated query partitions are determined as the space-time reference block. Last, each pixel is reassigned a fused value from the related space-time block, as expressed in Eq. (3): 180 ), which is then normalized to ( , ) i st w . σ is the standard deviation of GKF.  characterizes the dimensional difference, which is equal 185 to t s   with an expression of each single-dimensional GKF. t r represents the regional correlation between the query and target partitions if Scheme 1 works; otherwise, it is ignored (i.e., 1 t r  ). The constant term ( 2 (2 )   ) of GKF is ignored due to the normalization process. The important parameters in STAF are listed in Table 2.
The second pass corrects the fused NDSI maps considering the spatial correlation within a partition. Specifically, the residual errors of the intersecting cloud-free areas of the pre-processed and fused NDSI maps ( P NDSI after TAC and F NDSI after STAF) are diffused to other cloud-free areas of the fused NDSI maps using the triangulation-based natural neighbor interpolation (Sibson, 1981). Then, the high-quality NDSI maps ( H NDSI ) can be generated by removing all errors from the 195 fused NDSI maps. The process is formulated as follows: where R indicates the reference area which is the boundary of the intersecting cloud-free areas. T indicates the target area. The dynamic weights in the error diffusion from R E to T E are based on the Voronoi diagrams. As expressed in Fig. 3

the original Voronoi cells (bounded by red and gray solid lines) of the reference pixels (gray dots) intersect with the new 200
Voronoi cells (bounded by blue and gray solid lines) of the reference and target pixels. Taking the target pixel T 1 with the reference pixel R 1 as an example, the weight is assigned as the ratio of the area of the intersecting Voronoi cell ( dabch A ) to that of the new Voronoi cell ( defgh A ). (1,1) After all the partitions are processed in sequence, the next iteration of STAR begins until the clouds are completely removed. 205

Quality assessment (QA) approach
A revised QA approach for the gap-filled NDSI collection is proposed on the basis of the quality estimate of MODIS NDSI datasets (Riggs and Hall, 2015), and an example is presented in Fig. A1 (Appendix A). Users can examine the basic quality of the gap-filled NDSI collection considering cloud coverage and spatio-temporal consistency of the raw NDSI dataset by retrieving the bit flags from the integer stored in QA maps. The specific attributes are listed in Table 3. 210 The snow detection reversal of the pre-processed value in TAC is tracked in Bit 2, the post-processing (Sect. 2.2.4) is tracked in Bit 3, and the number of iterations primarily related to cloud coverage is indicated by Bit 4. If the range of reference 215 values is larger than 30, then Bit 5 flag is set; if the difference between pre-processed and fused values is larger than 20, then Bit 6 flag is set. Bit 7 reflects the cloud coverage of the space-time block. Furthermore, the combination of Bits 0 and 1 is a qualitative estimate of the cloud-removed NDSI collection based on the number of iterations (from here on referred to as NI) and spatio-temporal consistency. The comprehensive estimate is determined as follows: ─ if NI = 5 and Bit 5 or 6 is set to 1, then it is assigned "poor"; 220 ─ if NI = 4 and Bit 5 or 6 is set to 1, then it is assigned "OK"; ─ if NI = 3 and Bit 5 or 6 is set to 1, then it is assigned "good"; ─ otherwise, it is assigned "best".
The QA maps are recommended for in-depth application of the cloud-removed NDSI collection.

Post-processing 225
For areas with extremely rapid and fluctuating snow variation, the temporal contextual references are likely to introduce incorrect information and magnify errors in the iterative process. Post-processing is used in this study to reduce the "disorder" phenomenon referring to QA maps. Firstly, the NDSI map with the most consistent snow pattern in adjacent time is artificially identified as a reference. Subsequently, the aforementioned EC is applied to improve the spatial consistency between the postprocessing and original areas. Finally, the QA maps are updated. 230

Validation of the NDSI collection
The gap-filled NDSI collection is evaluated with the in-situ snow depth observations and Landsat NDSI maps considering classification and numerical accuracies according to the current mature verification methods. As shown in Table 4, the classification metrics based on the confusion matrix include overall accuracy (OA), commission error (CE), and omission error (OE) (Klein and Barnett, 2003). Moreover, three general metrics are introduced to measure numerical accuracy: correlation 235 coefficient (CC), absolute error (AE), and root-mean-square error (RMSE).

Results
As mentioned above, the generation procedure of continuous snow collection includes the pre-process TAC and the key-240 process STAR. The remainder clouds of 30.62% in the entire collection after TAC are completely removed by STAR. To elaborate the reliability of STAR NDSI collection, TAC NDSI, NIEER AVHRR SCE, and MODIS CGF NDSI products are used as baseline data. Specifically, based on in-situ snow depth measurements, the classification accuracy of STAR NDSI collection is compared with those of TAC NDSI and NIEER AVHRR SCE datasets. In addition, based on Landsat NDSI maps, its numerical accuracy is compared with those of TAC NDSI and MODIS CGF NDSI datasets. This section presents the 245 evaluation results, followed by an exemplary application.

Validation against in-situ snow depth measurements
As described above, the in-situ snow depth data in NX from 1 January 2001 to 31 August 2007 and on QTP from 1 August 2000 to 31 December 2013, were used as the ground truth to evaluate the classification accuracy of TAC NDSI, NIEER 550000 data pairs. Snow-covered pixels in NDSI datasets range from 10 to 100, whereas snow-free pixels are 0; thus, the classification threshold is set as 10 . The discriminant threshold for in-situ snow depth is set as 0 or 1 cm.
In addition, the cloud-covered areas in TAC NDSI dataset are considered to be snow-free. Table 5 demonstrates that NIEER AVHRR SCE and STAR NDSI datasets preeminently capture the snow dynamics in NX referring to the in-situ measurements, with OAs of more than 94%. However, TAC NDSI dataset is insufficient to 255 accurately describe the snow cover variability. Although CEs perform well regardless of the snow depth threshold, OEs of TAC NDSI collection are extremely high, indicating that many cloud-covered areas are dominated by snow. NIEER AVHRR SCE dataset partially retrieves snow pixel under cloud obstruction with an OE decreased by ~43%. STAR NDSI collection completely removes clouds and accurately presents snow distribution, with an OE further decreased from ~17% to ~7%. The generation procedure in NX has two strengths. Firstly, the satellite-borne sensors can accurately capture the snow events on 260 the ground due to the generally thick snow averaging approximately 20 cm. Secondly, the gap-filling approach with comprehensive consideration of spatial and temporal correlation has outstanding reliability due to the significant periodicity of snow variation. It can be inferred that the NDSI datasets in NC have high accuracy because of the similar snow conditions, despite the lack of in-situ data in this region.
By contrast, despite the satisfactory performance of OAs and CEs, the OEs of three snow cover datasets over QTP are as 265 remarkably high as 72%, 42%, and 39% even at the snow depth threshold of 1 cm (Table 6). This finding indicates the omission of a large number of snow-covered pixels. The specific reasons are as follows. Firstly, the original MODIS NDSI maps frequently underestimate the snow presence throughout the snow period because discriminating the shallow snow pixels with an averaged snow depth of approximately 4 cm over QTP is challenging. Secondly, the credibility of the spatio-temporal contextual information is relatively low because the patchy snow rapidly and irregularly varies due to the extremely complex 270 topographic and climatic conditions, leading to a further decrease in the accuracy of the gap-filled results. Lastly, the meteorological stations over QTP are unevenly distributed and are mostly located in low-and medium-altitude/latitude areas dominated by transient snow. Consequently, the evaluation results slightly exaggerate the real OEs.
For the in-depth analysis of the temporal characteristics, the monthly classification accuracies of TAC NDSI, NIEER AVHRR SCE and STAR NDSI products in NX and QTP are shown in Fig. 4 (the horizontal axis is the month in a hydrological 275 year). In NX (group a), the monthly snow fraction in the in-situ samples is greater than 85% from December to next February. Therefore, the clouds in TAC NDSI dataset seriously affect the snow cover estimation, while both cloud-free products exhibit superior OAs. Compared to NIEER AVHRR SCE product, STAR NDSI collection has slightly higher CEs but relatively lower OEs. The OEs of STAR NDSI collection typically occur during snow accumulation and ablation periods, and almost disappear during stable snow-cover and snow-free periods. In QTP, the snow period is generally from October to next May, with the 280 monthly snow fraction of less than 10% in the in-situ samples. Consequently, the underestimation of snow coverage caused by the clouds in TAC NDSI dataset is slight. All three products achieve outstanding OAs and CEs, but exhibit relatively poor OEs. In NIEER AVHRR SCE and STAR NDSI datasets, these OEs are generally observed outside the snow period. As mentioned above, there are three reasons for this phenomenon. Nonetheless, STAR NDSI collection presents superior classification accuracy to TAC NDSI and NIEER AVHRR SCE datasets. 285  Overall, STAR NDSI collection is capable of snow status estimation, eliminating cloud contamination in TAC NDSI dataset, and capturing more snow events than NIEER AVHRR SCE dataset. However, the accuracy of STAR NDSI collection has regional and temporal heterogeneity. Firstly, the accuracy over QTP is lower than that of NX, which is consistent with the 295 characteristic of the original MODIS NDSI maps. Then, the permanent and periodic snow regime regions reconstructed by STAR have prominently high accuracy, while the transient snow-covered regions are easily omitted. Fortunately, the monitoring of permanent and periodic snow plays a key role in most snow-related investigations. Finally, the accuracy of stable snow-cover and snow-free periods is slightly better than that of snow accumulation and ablation periods.

Validation based on Landsat NDSI maps
Only the classification accuracy can be evaluated by in-situ measurements due to the significant difference in the nature of the 305 snow depth and NDSI data. Therefore, Landsat images with finer spatial resolution were commonly adopted for the numerical evaluation of NDSI datasets (Crawford, 2015). NDSI values for the Landsat 8 dataset were calculated as follows: Band Band .Subsequently, the average of the 17 × 17 neighborhoods closest to the center of the MODIS NDSI pixel in the Landsat NDSI map was considered to be the reference of this MODIS NDSI pixel to match the spatial resolution of Landsat with that of MODIS. Specifically, the cloud-contaminated pixels marked by the quality band in 310 Landsat images were excluded, and the reference areas with cloud coverage larger than 30% were discarded. A total of 19 Landsat NDSI maps with different snow coverages from January to April in 2018, which are distributed in NC (4 scenes), Central China region (CCR, 2 scenes), QTP (8 scenes), and NX (5 scenes), were exploited as evaluation references for this validation experiment. Two evaluations including a cross-comparison of TAC NDSI, MODIS CGF NDSI, and STAR NDSI datasets and an internal comparison of clear-sky and cloud-cover areas are described in detail below. 315 For the cross-comparison, the visual effects of three NDSI datasets on 8 January 2018 and 3 February 2018 are shown in Fig. 5. TAC NDSI dataset is still heavily obscured by clouds. Although MODIS CGF NDSI dataset completely removes clouds from the MOD10A1 product, it is difficult to accurately retrieve periodic and transient snow cover areas due to the simplicity of the cloud-gap-filled method. Specifically, the gaps are filled by retaining clear-sky observations from previous days.
However, snow patterns under cloud cover are likely to change significantly during these days. Therefore, snow cover is 320 significantly underestimated during accumulation (Fig. 5, a2) and overestimated during ablation (Fig. 5, b2). By contrast, STAR NDSI collection preeminently captures the snow dynamics under temporally continuous clouds, attributing to the spatiotemporal adaptive fusion strategy. Furthermore, the three NDSI datasets are quantitatively assessed by Landsat NDSI maps.

(group a) and 3 February (group b).
As the average cloud cover is as high as 40.7%, TAC NDSI dataset has a low correlation with Landsat NDSI maps, with an average CC of 0.46 (Table 7). The cloud-free MODIS CGF NDSI dataset enhances the correlation with Landsat NDSI maps, with an average CC of 0.73. By contrast, the snow dynamics presented by the spatiotemporally continuous STAR NDSI dataset 330 are highly consistent with Landsat NDSI maps, with an average CC further increased to 0.84. Compared with TAC NDSI and MODIS CGF NDSI datasets, the average RMSE of STAR NDSI dataset is decreased by 9.06 and 5.58, respectively. The positive AEs reveal that NDSI values for snow pixels in MODIS CGF NDSI and STAR NDSI datasets are generally higher than those of Landsat NDSI maps. In terms of snow coverage, STAR NDSI dataset notably improves the detection of snow events compared to the other two datasets, with an average absolute SRD decreased by 31.1% and 2.4% (SRD indicates the 335 difference of snow rate between MODIS and Landsat NDSI datasets). Consequently, STAR NDSI collection is a more promising snow cover product than TAC NDSI and MODIS CGF NDSI datasets, contributing to related hydrological and meteorological studies. Note that TAC, CGF, and STAR represent TAC NDSI, MODIS CGF NDSI, and STAR NDSI datasets, respectively.
In addition to the cross-comparison of TAC NDSI, MODIS CGF NDSI, and STAR NDSI datasets, an internal comparison of STAR NDSI collection in clear-sky areas and cloud-cover areas was performed based on Landsat NDSI maps, to highlight the accuracy of the recovered pixels in STAR NDSI collection. As described above, clear-sky areas and cloud-cover areas 345 account for 59.3% and 40.7%, respectively. Table 8 indicates that the snow distribution of recovered areas in STAR NDSI collection is relatively consistent with that of Landsat NDSI maps. Although the average CC decreases from 0.85 to 0.73 and the average RMSE increases from 13.48 to 16.30 compared with clear-sky areas, the accuracy of recovered areas is satisfactory.
Since many recovered areas inherit errors from clear-sky areas because the cloud removal procedure completely relies on the original dataset, a slight decrease in accuracy is reasonable. In addition, the average AEs of clear-sky and recovered areas are 350 7.81 and 6.83, respectively, revealing the systematic overestimation of NDSI values in snow-cover areas (Landsat NDSI values are generally low). Except for a few areas, the snow conditions in most cloud-cover areas are well recovered, with an average SRD of -5.0%. This finding highlights that STAR NDSI collection can completely remove clouds with satisfactory accuracy. For in-depth verification analysis, Figure 6 shows the visual effects in four regions corresponding to four highlighted groups in Table 8. The accuracy of clear-sky areas in NC is prominently high with a CC of 0.92, while recovered areas notably 360 reduce the performance with a CC of 0.42. However, Figure 6a shows that clear-sky areas in TAC NDSI dataset cannot reflect the snow events, whereas STAR NDSI collection effectively retrieves these events. Inevitably, the snow edges are slightly inaccurate and blurry due to insufficient reference information. For group (b), the original accuracy of the NDSI dataset in CCR is relatively low, with high cloud coverage and false acceptance rate, while STAR NDSI collection presents a snow pattern consistent with Landsat NDSI. Nevertheless, CCR is a transient snow area with relatively low altitudes and latitudes. 365 Therefore, the gap-filled result has visible uncertainty, in which commission (black box) and omission (red box) frequently occur. As mentioned above, MODIS NDSI datasets generally perform poorly over QTP. However, Figure 6c demonstrates that TAC NDSI and STAR NDSI collections can accurately capture snow events despite a few omissions (red box). Similar to the scene with a remarkably low CC of 0.58 can effectively reflect the snow pattern. In addition, the NDSI dataset retrieved by 370 STAR inevitably has a few extremely abnormal areas during 20 years due to the vast territory of China; an example is presented in Fig. A2 (Appendix A). These areas have severe cloud contamination and irregular snow dynamics, contributing to the challenges in reconstruction and evaluation. Therefore, these areas are corrected by post-processing as described in Sect. 2.2.4. Overall, the numerical accuracy of STAR NDSI collection is rigorously evaluated based on fine-resolution Landsat NDSI maps. The cross-comparison indicates that STAR NDSI collection is superior to both TAC NDSI and MODIS CGF NDSI datasets. In addition, the internal comparison reveals that the effectiveness of cloud removal is satisfactory, although recovered 380 areas have slightly lower accuracy than clear-sky areas. Consequently, STAR NDSI collection has considerable application potential.

Application of STAR NDSI collection
In addition to quality evaluation, the exemplary analysis also contributes to understanding the potential of STAR NDSI collection for hydrological and climatic applications. Therefore, the spatial distribution and temporal variability of snow cover 385 in China are analyzed as two simple application demonstrations.
From a spatial perspective, the sequence shown in Fig. 7 indicates that the snow fraction first increases and then decreases in NC and NX regions from 5 December 2014 to 25 January 2015. However, due to the complex topographic and climatic conditions, the snow cover on QTP presents irregular distribution and considerable fluctuations. During this period, the snow cover pattern in China changed dramatically, with the overall snow fraction ranging from 19.25% to 35.42%. In addition, 390 another sequence of cloud-free NDSI maps in early 2020 is shown in Fig. 8. During this period, the snow cover in China presents a single-wave depletion curve. The peaks of snow fractions are observed in three major snow areas between 11 January 2020 and 21 January 2020. Compared with the previous sequence, the snow cover variation on QTP in this sequence has significantly enhanced regularity.   Figure 9 shows the daily average snow fraction in the three main subregions in China and the entire situation considering 400 temporal analysis. In terms of intra-annual variability, the snow dynamics periodically evolve in NX and NC but substantially fluctuate on QTP. NX and NC have similar snow depletion curves, demonstrating rapid accumulation and ablation in November and March, respectively. QTP has a relatively long snow period, with an average snow fraction varying from 20% to 40% from October to next May. Consequently, China is dominated by periodic snow. As for inter-annual variability, among the three major snow areas, the snow fraction in NC remarkably fluctuates with a standard deviation of 5.3%. The snow 405 coverage on QTP presented a slight decreasing trend from 2005 to 2017 but increased significantly in the past two years. In particular, rather than a significant rise in maximum snow coverage, the increase can be observed throughout the snow period with a slight reduction in intra-annual volatility. This finding implies that the regional climatic conditions tend to stabilize slightly. In addition, no significant trend has been detected in snow dynamics in China during the 20 years. Nevertheless, the Overall, STAR NDSI collection can accurately reflect the spatial and temporal dynamics of snow cover in China. It is promising for hydrological and meteorological applications.

Discussion
To elaborate the cloud removal effectiveness of the proposed STAR method, the performance statistics under different simulated cloud conditions are shown in Table 9. Four TAC NDSI maps with little cloud cover during the snow period were used in the simulated experiment. Cloud masks from other dates were added to the target maps, with different fractions of 420 about 20%, 50% and 80%. Subsequently, the artificially cloud-covered maps were recovered by STAR and validated by Landsat NDSI maps. The quantitative results indicate that the recovery effectiveness of STAR typically declines significantly when cloud coverage is greater than 80%. As a result, STAR can completely remove clouds with little loss of accuracy. Only in the NC4_20180318 scene, high overestimation occurs when cloud coverage reaches 55%. The phenomenon is caused by high cloud coverage and rapid snow variation in space and time. Therefore, users are recommended to refer to the QA maps of STAR NDSI collection during snow accumulation and ablation periods, in which Bit 7 reflects the cloud coverage of the space-430 time block.

Data availability
The improved cloud-free Terra-Aqua MODIS NDSI collection (STAR NDSI collection) for China from 1 August 2000 to 31 July 2020, including STAR NDSI and STAR QA data, is available for download at https://doi.org/10.5281/zenodo.5644386 (Jing et al., 2021). The dataset is provided using a WGS 84 / UTM zone 48N projection, with a tag image file format (TIFF). 435 Users can discuss and respond to issues that arise during the use of this dataset. New versions can be released in consideration of user comments. In addition, a source code for this collection is available at https://doi.org/10.5281/zenodo.6396149 (Jing, 2022).

Conclusions
STAR NDSI collection is derived from Terra-Aqua MODIS NDSI datasets using an optimized STAR from our last research 440 . The evaluation tests indicate that STAR NDSI collection is highly consistent with the in-situ snow depth measurements and higher resolution NDSI maps. STAR NDSI collection generally has the following strengths. (1) This collection has reached a continuous 20-year period, which is the minimum period of a dataset for long-term hydrological and climatic processes analysis. (2) The cloud-free collection can accurately estimate the snow dynamics, highly consistent with in-situ snow depth and Landsat NDSI maps. Specifically, STAR NDSI collection eliminates cloud contamination and 445 preeminently improves the overall performance of TAC NDSI dataset. Due to the higher spatial resolution and larger dynamic range, the classification accuracy of STAR NDSI collection is higher than that of NIEER AVHHR SCE dataset. In terms of numerical accuracy, it is superior to MODIS CGF NDSI dataset, since the spatio-temporal adaptive fusion method generally outperforms the simplified MDC method. Additionally, it has a satisfactory accuracy in original cloud-cover areas. (3) The collection provides a detailed snow cover dataset for China, accurately reflecting the snow conditions of the following three 450 major snow areas: NX, NC, and QTP. The collection is available at: https://doi.org/10.5281/zenodo.5644386 (Jing et al., 2021).
As discussed above, STAR NDSI collection still has some deficiencies. A future release should consider several issues: (1) the original accuracy of MODIS NDSI datasets reduced by factors such as complex climatic conditions and dense forest coverage; (2) the reconstruction accuracy of snow edges affected by mixed pixels and high cloud coverage; (3) the reconstruction accuracy of transient snow areas due to the inadequate spatio-temporal contextual information; and (4) the lack 455 of evaluation based on in-situ snow depth measurements in NC due to the limited access to climate station data.
Despite the aforementioned deficiencies, since snow is a pivotal driver and sensitive indicator for many hydrometeorological processes, the daily 500 m STAR NDSI collection for 20 years has various potential applications: (1) achieving a deep understanding of long-term snow cover variability in China, (2) providing effective forcing data for hydrological and meteorological models, and (3) supporting strategic decisions on water resources management, 460 environmental pollution governance, and related economic development.
Author contributions. All authors designed the methodology. YJ implemented the experiments. YJ maintained and refined STAR NDSI collection. YJ drafted the manuscript. XL and HS revised the whole manuscript. All authors provided suggestions for this manuscript.
Competing interests. The authors declare that they have no conflict of interest. 465