Articles | Volume 14, issue 7
Data description paper
11 Jul 2022
Data description paper |  | 11 Jul 2022

STAR NDSI collection: a cloud-free MODIS NDSI dataset (2001–2020) for China

Yinghong Jing, Xinghua Li, and Huanfeng Shen

Snow dynamics are crucial in ecosystems, affecting radiation balance, hydrological cycles, biodiversity, and human activities. Snow areas with notably diverse characteristics are extensively distributed in China, mainly including Northern Xinjiang (NX), Northeast China (NC), and the Qinghai–Tibet Plateau (QTP). Spatiotemporal continuous snow monitoring is indispensable for ecosystem maintenance. Nevertheless, the formidable challenge of cloud obscuration severely impedes data collection. In the past decades, abundant binary snow cover area (SCA) maps have been retrieved from moderate resolution imaging spectroradiometer (MODIS) datasets. However, the integrated normalized difference snow index (NDSI) maps containing additional details on snow cover extent are still extremely scarce. In this study, a recent 20-year stretch seamless Terra–Aqua MODIS NDSI collection in China is generated using a Spatio-Temporal Adaptive fusion method with erroR correction (STAR), which comprehensively considers spatial and temporal contextual information. Evaluation tests confirm that the cloud-free STAR NDSI collection is superior to the two baseline datasets. The omission error decreased by 10 % in NX compared to the snow cover extent product, and the average correlation coefficient increased by 0.11 compared to the global cloud-gap-filled MODIS NDSI product. Consequently, this collection can serve as a basic dataset for hydrological and climatic modeling to explore various critical environmental issues in China. This collection is available from (Jing et al., 2021).

1 Introduction

Snow is a fundamental component of the cryosphere, strongly interacting with global energy budgets and hydrological dynamics (Hall et al., 1995). Snow cover has a remarkable impact on the Earth's radiation balance due to its highly reflective nature, thus generating feedback in the global climate system (Konzelmann and Ohmura, 1995). Up to one-sixth of the world's population relies on meltwater from glaciers and snowpacks for drinking, irrigation, hydropower generation, and industrial production (Barnett et al., 2005). Therefore, snow dynamics have a profound impact on climate change and human activities. The snow cover extent of the Northern Hemisphere has continued to decrease since the mid-20th century (Pachauri and Meyer, 2014). However, regional-scale snow variations in different environmental conditions present mixed trends due to the strong sensitivity of snow cover to climate change (Bormann et al., 2018). The snow cover regions in China are extensively distributed with remarkable spatial and temporal heterogeneity (Wang et al., 2018), mainly in Northern Xinjiang (NX), Northeast China (NC), and the Qinghai–Tibet Plateau (QTP). Therefore, accurate snow cover acquisition in China is significant for snow pattern analysis, water resource management, climate change monitoring, etc.

China has conducted large-scale observations of snow parameters since the 1950s through meteorological stations, providing a valuable database for long-term snow-related studies. However, it is difficult to accurately depict the snow characteristics in China, especially in QTP which is dominated by patchy and shallow snow, due to the sparsely and unevenly distributed traditional in situ observations. Satellite-based remote sensing is a prominent alternative for continuous snow cover monitoring at meso- and macroscales. Moderate resolution imaging spectroradiometer (MODIS) snow cover datasets are extensively used for various hydrological and climatological applications due to their relatively high spatial and temporal resolutions. At present, the Collection 5 (C5) suite providing snow cover area (SCA) and fractional snow cover (FSC) data, and the Collection 6 (C6) suite providing normalized difference snow index (NDSI) data are the most appealing representatives (Riggs and Hall, 2015). However, the main constraint of optical remote-sensed datasets, including MODIS C5 and C6 snow cover datasets, is cloud persistence.

Numerous algorithms have been proposed in the past decades to improve the spatiotemporal continuity of MODIS C5 snow cover datasets. Cloud-removal algorithms can be categorized into single-source feature fusion methods and multi-source data fusion methods considering data sources. Single-source feature fusion methods fill the gaps based on homologous contextual information, relying on the spatiotemporal correlations of snow features. These methods have evolved from the classical Terra and Aqua combination (TAC; Parajka and Bloschl, 2008), multi-day combination (MDC; Gafurov and Bardossy, 2009), and snow-line method (SNOWL; Parajka et al., 2010) to complex spatiotemporal union methods. For example, Gafurov et al. (2015) proposed a four-step method to generate cloud-free MODIS SCA maps, successively combining TAC, neighborhood filtering, MDC, and a classification tree. Dariane et al. (2017) suggested the aggregation of TAC, MDC, SNOWL, and neighborhood filtering with elevation constraints to fill the cloud-covered gaps. Li et al. (2017) developed an adaptive spatiotemporal weighted method to reclassify the cloudy pixels. These methods for binary SCA mapping have achieved satisfactory cloud-removal effectiveness and accuracy. Multi-source data fusion methods (Akyurek et al., 2010; Brown et al., 2010; Chen et al., 2018; Gafurov et al., 2015; Gao et al., 2011; He et al., 2017) maximize the complementarity among heterogeneous datasets from optical, microwave, and/or station measurements. This type of method is effective for filling the continuous gaps in space and time when the supplementary data are of high quality in the cloud-obscured regions (Li et al., 2019). In addition to traditional methods, learning-based methods are increasingly applied to snow cover mapping due to their satisfactory capabilities for nonlinear expressiveness (Yuan et al., 2020). The SCA and FSC maps can be generated by exploring the relationship between snow coverage and MODIS reflectances combined with ancillary factors, including NDSI, temperature, vegetation, and terrain parameters. As a representative of learning-based methods, artificial neural networks have been successfully utilized to model the relationship (Dobreva and Klein, 2011; Hou and Huang, 2014; Moosavi et al., 2014; Çiftçi et al., 2017; Kuter, 2021). Such methods are relatively uncertain but promising because the accuracy substantially relies on the quantity and quality of training data.

Increasing studies have moved to the MODIS C6 suite since its release in 2016. In C6 data, snow cover is reported as NDSI rather than SCA and FSC. The NDSI is an index that is related to the snow presence in a pixel and is a more accurate description of snow fraction than SCA and FSC (Riggs and Hall, 2015; Riggs et al., 2017). The clear-sky accuracy of C6 NDSI datasets is robust compared to higher-resolution remote-sensed images (such as Landsat and Sentinel series) and in situ measurements (Crawford, 2015; Zhang et al., 2019; Aalstad et al., 2020). As a basic dataset, it has the significant advantage of allowing users to more accurately determine SCA or FSC for their particular study areas and application requirements (Hall et al., 2019). For example, several optimal classification thresholds for SCA (Huang et al., 2018; Malmros et al., 2018; Tong et al., 2020) and specially tuned mapping methods for FSC (Kuter et al., 2018; Hou et al., 2020; Zhang et al., 2021) were designed to generate regional SCA and FSC datasets from NDSI snow cover datasets, which were superior to the globally harmonized algorithms in C5 data. However, severe cloud contamination also limits the application of NDSI datasets, resulting in many studies only considering cloud-free areas (Kuter et al., 2018; Malmros et al., 2018; Tong et al., 2020; Hou et al., 2020; Zhang et al., 2021). Therefore, a cloud-free NDSI dataset is significant for in-depth snow cover research. Since the aforementioned cloud-removal methods were generally designed for binary SCA, their applicability to NDSI with more complicated spatiotemporal characteristics should be improved. Thus, several gap-filling methods with associated concern for spatial and temporal correlations of snow presence were proposed to remove clouds from NDSI (Jing et al., 2019; Chen et al., 2020; Li et al., 2020). Among these methods, the spatiotemporal feature-based methods with relatively high robustness are more effective for improving NDSI datasets (Jing et al., 2019).

Many studies on snow monitoring in China are available, and most of these studies focus on binary SCA mapping. On the regional scale, QTP, which is known as the world's third pole, plays a key role in the global climate system. Nevertheless, snow cover mapping is particularly challenging over QTP due to the frequent cloud cover resembling fragmented snow. A large number of studies have demonstrated that the snow cover variability over QTP is extremely complex, with significant spatiotemporal heterogeneity (Gao et al., 2012; Tang et al., 2013; Yu et al., 2016; Liang et al., 2017; Zhang et al., 2012). Located in mid-latitude areas, NX (Wang et al., 2008) and NC (Che et al., 2016) are dominated by seasonal snow cover. Che et al. (2019) presented an integrated snow cover dataset from a distributed hydrometeorological observation network in the Heihe River Basin, which achieved a prominent demonstration of data synthesis at a watershed scale. In addition, the large-scale transient snow cover regions increase the level of challenge for generating high-quality snow cover datasets. On the national scale, Huang et al. (2016) obtained a long-term cloud-removed SCA product using a multi-source data fusion method. Despite many relevant studies, only a few cloud-free snow cover datasets have been released publicly.

Several typical long-term cloud-free snow cover products available online are listed in Table 1 (datasets are referenced via DOI), which cover most snow-dominated regions in China. Huang (2020) provided MODIS daily cloudless SCA products with relatively accurate snow detection capabilities in the Northern Hemisphere based on multi-source data. Muhammad and Thapa (2020, 2021) obtained 8 d/daily MODIS SCA and glacier composite datasets for High Mountain Asia by aggregating seasonal, temporal, and spatial filters, which can serve as a valuable input for hydrological and glaciological investigations. Hao et al. (2021, 2022) yielded two long-term daily SCA datasets over China through a series of processes such as quality control, cloud detection, snow discrimination, and gap-filling (including hidden Markov random field and snow-depth interpolation techniques). Their releases and updates promoted the research of snow cover characteristics in China. Qiu et al. (2017) yielded a daily FSC dataset with detailed snow cover information over High Mountain Asia with MDC and spatial filtering. Additionally, the global cloud-gap-filled MODIS NDSI dataset (MOD10A1F) has been available online since 2020, where cloud-covered grids in the MODIS Terra NDSI product are filled by retaining clear-sky observations from previous days (Hall and Riggs, 2020). However, this dataset performs poorly in China, where periodic and transient snow is dominant. In general, cloud-free SCA datasets produced by composite algorithms are frequently released, while high-quality cloud-free NDSI datasets are still scarce.

To this end, this study generates a spatiotemporally continuous Terra–Aqua MODIS NDSI product with satisfactory accuracy for China, fully considering the spatiotemporal characteristics of regional snow cover variability. A Spatio-Temporal Adaptive fusion method with erroR correction (STAR), improved from our previous work (Jing et al., 2019), is utilized to eliminate cloud obscuration. The long-term detailed snow cover extent dataset will facilitate snow-related scientific studies and practical applications in China.

The rest of this paper is arranged as follows: first, Sect. 2 describes the input data and the proposed cloud-removal method. Section 3 then presents the verification accuracy of the STAR NDSI collection, with a subsequent analytical application. The cloud-removal effectiveness under different cloud coverages is discussed in Sect. 4. Finally, the data availability and the conclusions are provided in Sects. 5 and 6, respectively.

Table 1Typical long-term cloud-free snow cover products covering most snow-dominated regions in China.

* Cloud coverage is less than 10 %.

Download Print Version | Download XLSX

2 Data and methods

2.1 Data

MODIS sensors onboard Terra and Aqua satellites provide the global snow cover datasets. The daily MOD10A1 and MYD10A1 datasets of C6 are available through the website of the National Aeronautics and Space Administration (NASA;, last access: 6 July 2022). The NDSI_Snow_Cover (hereafter referred to as NDSI) scientific dataset, with a range of 0, 10 to 100, was used in this study. As shown in Fig. 1, the NDSI of 19 tiles covering China (excluding sea area) from 1 August 2000 to 31 July 2020, was acquired to generate snow cover maps. The 90 m digital elevation model (DEM) dataset of the Shuttle Radar Topographic Mission (SRTM) was obtained from the United States Geological Survey (USGS). In addition, two existing cloud-free snow cover products, including the daily snow cover extent at a 5 km resolution (NIEER AVHRR SCE,, Hao et al., 2021) and the daily cloud-gap-filled MODIS NDSI at a 500 m resolution (MODIS CGF NDSI/MOD10A1F,, Hall and Riggs, 2020), were used for comparison. For reference data, the snow-depth measurements respectively derived from 49 and 102 meteorological stations in NX and QTP (Tibet Meteorological Bureau and National Meteorological Information, 2018) were used for station-based validation. Since the snow depth data can only assess the classification performance of MODIS NDSI retrievals, the NDSI maps derived from Landsat OLI images were utilized for comprehensive validation.

Figure 1Topographic relief of China, meteorological stations in NX and QTP, and Landsat OLI scenes used for validation.

2.2 Algorithm description

MODIS NDSI datasets are unable to represent the daily conditions of snow accumulation and ablation accurately because the optical remote-sensed images are subject to severe cloud pollution. Therefore, STAR, which is derived from our two-stage spatiotemporal fusion method (Jing et al., 2019), is presented to produce a spatiotemporal continuous snow collection. As shown in Fig. 2, the generation procedure comprises the pre-process TAC and the key-process STAR. Then, a quality assessment (QA) approach is presented to provide a data reliability profile for users. On this basis, post-processing is used to further improve the data quality in individual abnormal areas.

Figure 2Schematic representation of the generation procedure of STAR NDSI collection.


2.2.1 Terra and Aqua combination (TAC)

TAC blends the same-day snow maps deriving from MODIS sensors onboard Terra and Aqua satellites. Its cornerstone is the unlikely significant changes of the snow pattern within the data-acquired time interval (approximately 3 h). Since TAC can efficiently decrease the cloud fraction by 5 %–20 % with negligible precision sacrifice (Li et al., 2019), it is introduced as a pre-processing combination to reduce cloud coverage preliminarily. Its priority scheme is determined as high value > low value > cloud.

(1) NDSI P = NDSI Terra IF NDSI Terra NDSI Aqua OR NDSI Aqua is cloud , NDSI P = NDSI Aqua IF NDSI Aqua > NDSI Terra OR NDSI Terra is cloud ,

where NDSITerra and NDSIAqua are MODIS NDSI datasets from Terra and Aqua satellites, respectively. The NDSIP represents the pre-processed NDSI maps after TAC (referred to as TAC NDSI dataset in subsequent sections). The snow in low-altitude and low-latitude areas during summer is reversed to no snow to alleviate commission errors inherited from the original data. In addition, since the Aqua dataset is available from July 2002, the key-process STAR is directly used to remove clouds from the Terra MODIS NDSI dataset between August 2000 and May 2002. Particularly, the improved Aqua MODIS C6 NDSI dataset significantly enhances the effectiveness of TAC due to the successful restoration of the absent Aqua MODIS band 6 data by the quantitative image restoration method (Gladkova et al., 2012).

2.2.2 Spatio-Temporal Adaptive fusion with erroR correction (STAR)

Many regions with persistent clouds are out of the scope of TAC. To this end, an advanced STAR method, which comprehensively utilizes spatiotemporal contextual information, is proposed to remove the clouds thoroughly. As shown in Fig. 3, the method performs in two passes: spatio-temporal adaptive fusion (STAF) and error correction (EC).

Figure 3Detailed flowchart of the Spatio-Temporal Adaptive fusion with erroR correction (STAR).

The first pass involves the generation of new NDSI maps by adaptively merging the spatiotemporal contextual information, including space partition, adaptive space–time block determination, and Gaussian kernel function (GKF)-based fusion. The research area is first segmented into dozens of partitions considering the spatial heterogeneity of snow patterns. Thus, the subsequent processes can be performed on a partition basis. Moreover, the optimal query partitions (Q) to each target partition (T) are determined by a comprehensive consideration of temporal distance (t), regional correlation (r), and cloud-free fraction (f) concerning the temporal complexity of snow variations. The following optimal parameters are derived from the extensive experiments:

(2) Scheme 1: r > 0.7 , if f C & T > 0.3 Scheme 2: max ( t - 1 + f C ) , others ,

where the regional correlation between the candidate partition (C) within an 8 d window and the target partition is considered representative if the fraction of the intersecting cloud-free areas (fC&T) is higher than 0.3. The candidate partition is then determined as a query partition according to Scheme 1 when the regional correlation is larger than 0.7. Otherwise, Scheme 2 is activated. Two query partitions, with short distance and high cloud-free fraction, are identified within the preceding 8 d and the backward 8 d, respectively. Subsequently, the 3×3 neighborhoods for each pixel of the target partition in all the associated query partitions are determined as the space–time reference block. Lastly, each pixel is reassigned a fused value from the related space–time block, as expressed in Eq. (3):

(3) NDSI i F = t = 1 M s = 1 N w ( i , st ) × NDSI ( i , st ) Q , where  W ( i , st ) = r t 2 × exp - ( ( ε × Δ s ( i , s ) ) 2 + Δ t ( i , t ) 2 ) ) 2 × σ 2 ,

where NDSIiF denotes the fused NDSI of pixel i in the target partition. The pre-processed NDSI is NDSI(i,st)Q in associated query partitions, and M is the number of query partitions, each of which contains N reference pixels. In addition, the weight W(i,st) is assigned by a two-dimensional GKF involving the spatial distance (Δs(i,s)) and the temporal distance (Δt(i,t)), which is then normalized to w(i,st). The standard deviation of GKF is σ. The dimensional difference, which is characterized by ε, is equal to σt/σs with an expression of each single-dimensional GKF. The regional correlation between the query and target partitions is represented by rt if Scheme 1 works; otherwise, it is ignored (i.e., rt=1). The constant term (ε/(2πσ2)) of GKF is ignored due to the normalization process. The important parameters in STAF are listed in Table 2.

Table 2Description and default values of STAF parameters.

Download Print Version | Download XLSX

The second pass corrects the fused NDSI maps considering the spatial correlation within a partition. Specifically, the residual errors of the intersecting cloud-free areas of the pre-processed and fused NDSI maps (NDSIP after TAC and NDSIF after STAF) are diffused to other cloud-free areas of the fused NDSI maps using the triangulation-based natural neighbor interpolation (NNI; Sibson, 1981). Then, the high-quality NDSI maps (NDSIH) can be generated by removing all errors from the fused NDSI maps. The process is formulated as follows:

(4) E R = NDSI R F - NDSI R P E T ( i ) = n = 1 N ϕ ( i , n ) E R ( i , n ) NDSI T H = NDSI T F - E T ,

where “R” indicates the reference area which is the boundary of the intersecting cloud-free areas. The target area is indicated by “T”. The dynamic weights in the error diffusion from ER to ET are based on the Voronoi diagrams. As expressed in Fig. 3b (left), the original Voronoi cells (bounded by red and gray solid lines) of the reference pixels (gray dots) intersect with the new Voronoi cells (bounded by blue and gray solid lines) of the reference and target pixels. Taking the target pixel T1 with the reference pixel R1 as an example, the weight is assigned as the ratio of the area of the intersecting Voronoi cell (Adabch) to that of the new Voronoi cell (Adefgh).

(5) ϕ ( 1 , 1 ) = A dabch A defgh .

After all the partitions are processed in sequence, the next iteration of STAR begins until the clouds are completely removed.

2.2.3 Quality assessment (QA) approach

A revised QA approach for the gap-filled NDSI collection is proposed on the basis of the quality estimate of MODIS NDSI datasets (Riggs and Hall, 2015), and an example is presented in Fig. A1 (Appendix A). Users can examine the basic quality of the gap-filled NDSI collection considering cloud coverage and spatiotemporal consistency of the raw NDSI dataset by retrieving the bit flags from the integer stored in QA maps. The specific attributes are listed in Table 3.

Table 3Bit flags indicating the retrieval conditions according to the raw NDSI dataset.

Download Print Version | Download XLSX

The snow detection reversal of the pre-processed value in TAC is tracked in Bit 2, the post-processing (Sect. 2.2.4) is tracked in Bit 3, and the number of iterations primarily related to cloud coverage is indicated by Bit 4. If the range of reference values is larger than 30, then the Bit 5 flag is set; if the difference between pre-processed and fused values is larger than 20, the Bit 6 flag is set. Bit 7 reflects the cloud coverage of the space–time block. Furthermore, the combination of Bits 0 and 1 is a qualitative estimate of the cloud-removed NDSI collection based on the number of iterations (hereafter referred to as NI) and spatiotemporal consistency. The comprehensive estimate is determined as follows:

  • if NI = 5 and Bit 5 or 6 is set to 1, then it is assigned “poor”;

  • if NI = 4 and Bit 5 or 6 is set to 1, then it is assigned “OK”;

  • if NI = 3 and Bit 5 or 6 is set to 1, then it is assigned “good”;

  • otherwise, it is assigned “best”.

The QA maps are recommended for in-depth application of the cloud-removed NDSI collection.

2.2.4 Post-processing

For areas with extremely rapid and fluctuating snow variation, the temporal contextual references are likely to introduce incorrect information and magnify errors in the iterative process. Post-processing is used in this study to reduce the “disorder” phenomenon referring to QA maps. Firstly, the NDSI map with the most consistent snow pattern in adjacent time is artificially identified as a reference. Subsequently, the aforementioned EC is applied to improve the spatial consistency between the post-processing and original areas. Finally, the QA maps are updated.

2.3 Validation of the NDSI collection

The gap-filled NDSI collection is evaluated with the in situ snow-depth observations and Landsat NDSI maps considering classification and numerical accuracies according to the current mature verification methods. As shown in Table 4, the classification metrics based on the confusion matrix include overall accuracy (OA), commission error (CE), and omission error (OE) (Klein and Barnett, 2003). Moreover, 3 general metrics are introduced to measure numerical accuracy: correlation coefficient (CC), absolute error (AE), and root-mean-square error (RMSE).

Table 4Confusion matrix and validation metrics.

Download Print Version | Download XLSX

3 Results

As mentioned above, the generation procedure of continuous snow collection includes the pre-process TAC and the key-process STAR. The remaining clouds of 30.62 % in the entire collection after TAC are completely removed by STAR. To elaborate the reliability of the STAR NDSI collection, TAC NDSI, NIEER AVHRR SCE, and MODIS CGF NDSI products are used as baseline data. Specifically, based on in situ snow-depth measurements, the classification accuracy of the STAR NDSI collection is compared to those of TAC NDSI and NIEER AVHRR SCE datasets. In addition, based on Landsat NDSI maps, its numerical accuracy is compared to those of TAC NDSI and MODIS CGF NDSI datasets. This section presents the evaluation results, followed by an exemplary application.

3.1 Validation against in situ snow depth measurements

As described above, the in situ snow-depth data in NX from 1 January 2001 to 31 August 2007 and in QTP from 1 August 2000 to 31 December 2013, were used as the ground truth to evaluate the classification accuracy of the TAC NDSI, NIEER AVHRR SCE, and STAR NDSI datasets. The nearest pixel was matched with each meteorological station, with a total of about 550 000 data pairs. Snow-covered pixels in NDSI datasets range from 10 to 100, whereas snow-free pixels are 0; thus, the classification threshold is set as 10 (Zhang et al., 2019). The discriminant threshold for in situ snow depth is set as 0 or 1 cm. In addition, the cloud-covered areas in the TAC NDSI dataset are considered to be snow-free.

Figure 4Monthly classification accuracy of TAC NDSI, NIEER AVHRR SCE, and STAR NDSI products in NX (group a) and QTP (group b). Note that the optimal values for OA, CE and OE are 100 %, 0 % and 0 %, respectively.


Table 5Confusion matrices between TAC NDSI, NIEER AVHRR SCE, STAR NDSI datasets and in situ snow-depth (SD) data in NX from 1 January 2001 to 31 August 2007. Note that bold values indicate classification accuracy, including OE, CE, and OA.

Download Print Version | Download XLSX

Table 5 demonstrates that NIEER AVHRR SCE and STAR NDSI datasets preeminently capture the snow dynamics in NX referring to the in situ measurements, with OAs of more than 94 %. However, the TAC NDSI dataset is insufficient to accurately describe the snow cover variability. Although CEs perform well regardless of the snow-depth threshold, OEs of the TAC NDSI collection are extremely high, indicating that many cloud-covered areas are dominated by snow. The NIEER AVHRR SCE dataset partially retrieves snow pixels under cloud obstruction with an OE decreased by ∼43 %. The STAR NDSI collection completely removes clouds and accurately presents snow distribution, with an OE further decreased from ∼17 % to ∼7 %. The generation procedure in NX has two strengths. Firstly, the satellite-borne sensors can accurately capture the snow events on the ground due to the generally thick snow averaging approximately 20 cm. Secondly, the gap-filling approach with comprehensive consideration of spatial and temporal correlation has outstanding reliability due to the significant periodicity of snow variation. It can be inferred that the NDSI datasets in NC have high accuracy because of the similar snow conditions, despite the lack of in situ data in this region.

By contrast, despite the satisfactory performance of OAs and CEs, the OEs of three snow cover datasets over QTP are as remarkably high as 72 %, 42 %, and 39 %, even at the snow-depth threshold of 1 cm (Table 6). This finding indicates the omission of a large number of snow-covered pixels. The specific reasons are as follows: firstly, the original MODIS NDSI maps frequently underestimate the snow presence throughout the snow period; because discriminating the shallow snow pixels with an averaged snow depth of approximately 4 cm over QTP is challenging. Secondly, the credibility of the spatiotemporal contextual information is relatively low because the patchy snow rapidly and irregularly varies due to the extremely complex topographic and climatic conditions, leading to a further decrease in the accuracy of the gap-filled results. Lastly, the meteorological stations over QTP are unevenly distributed and are mostly located in low- and medium-altitude/latitude areas dominated by transient snow. Consequently, the evaluation results slightly exaggerate the real OEs.

For the in-depth analysis of the temporal characteristics, the monthly classification accuracies of TAC NDSI, NIEER AVHRR SCE and STAR NDSI products in NX and QTP are shown in Fig. 4 (the horizontal axis is the month in a hydrological year). In NX (group a), the monthly snow fraction in the in situ samples is greater than 85 % from December to next February. Therefore, the clouds in the TAC NDSI dataset seriously affect the snow cover estimation, while both cloud-free products exhibit superior OAs. Compared to the NIEER AVHRR SCE product, the STAR NDSI collection has slightly higher CEs but relatively lower OEs. The OEs of the STAR NDSI collection typically occur during snow accumulation and ablation periods, and almost disappear during stable snow-covered and snow-free periods. In QTP, the snow period is generally from October to next May, with the monthly snow fraction of less than 10 % in the in situ samples. Consequently, the underestimation of snow coverage caused by the clouds in the TAC NDSI dataset is slight. All three products achieve outstanding OAs and CEs, but exhibit relatively poor OEs. In NIEER AVHRR SCE and STAR NDSI datasets, these OEs are generally observed outside the snow period. As mentioned above, there are three reasons for this phenomenon. Nonetheless, the STAR NDSI collection presents superior classification accuracy to TAC NDSI and NIEER AVHRR SCE datasets.

Table 6Confusion matrices between TAC NDSI, NIEER AVHRR SCE, STAR NDSI datasets and in situ snow-depth (SD) data in QTP from 1 August 2000 to 31 December 2013. Note that bold values indicate classification accuracy, including OE, CE, and OA.

Download Print Version | Download XLSX

Overall, the STAR NDSI collection is capable of snow status estimation, eliminating cloud contamination in the TAC NDSI dataset, and capturing more snow events than the NIEER AVHRR SCE dataset. However, the accuracy of the STAR NDSI collection has regional and temporal heterogeneity. Firstly, the accuracy over QTP is lower than that of NX, which is consistent with the characteristic of the original MODIS NDSI maps. Furthermore, the permanent and periodic snow regime regions reconstructed by STAR have prominently high accuracy, while the transient snow-covered regions are easily omitted. Fortunately, the monitoring of permanent and periodic snow plays a key role in most snow-related investigations. Finally, the accuracy of stable snow-covered and snow-free periods is slightly better than that of snow accumulation and ablation periods.

3.2 Validation based on Landsat NDSI maps

Only the classification accuracy can be evaluated by in situ measurements due to the significant difference in the nature of the snow depth and NDSI data. Therefore, Landsat images with finer spatial resolution were commonly adopted for the numerical evaluation of NDSI datasets (Crawford, 2015). The NDSI values for the Landsat 8 dataset were calculated as follows: Band3-Band6/Band3+Band6. Subsequently, the average of the 17×17 neighborhoods closest to the center of the MODIS NDSI pixel in the Landsat NDSI map was considered to be the reference of this MODIS NDSI pixel to match the spatial resolution of Landsat with that of MODIS. Specifically, the cloud-contaminated pixels marked by the quality band in Landsat images were excluded, and the reference areas with cloud coverage larger than 30 % were discarded. A total of 19 Landsat NDSI maps with different snow coverages from January to April in 2018, which are distributed in NC (four scenes), the Central China region (CCR, two scenes), QTP (eight scenes), and NX (five scenes), were exploited as evaluation references for this validation experiment. Two evaluations including a cross-comparison of TAC NDSI, MODIS CGF NDSI, and STAR NDSI datasets, and an internal comparison of clear-sky and cloud-covered areas are described in detail below.

For the cross-comparison, the visual effects of three NDSI datasets on 8 January 2018 and 3 February 2018 are shown in Fig. 5. The TAC NDSI dataset is still heavily obscured by clouds. Although the MODIS CGF NDSI dataset completely removes clouds from the MOD10A1 product, it is difficult to accurately retrieve periodic and transient snow cover regions due to the simplicity of the cloud-gap-filled method. Specifically, the gaps are filled by retaining clear-sky observations from previous days. However, snow patterns under cloud cover are likely to change significantly during these days. Therefore, snow cover is significantly underestimated during accumulation (Fig. 5a2) and overestimated during ablation (Fig. 5b2). By contrast, the STAR NDSI collection pre-eminently captures the snow dynamics under temporally continuous clouds, attributing to the spatiotemporal adaptive fusion strategy. Furthermore, the three NDSI datasets are quantitatively assessed by Landsat NDSI maps.

Figure 5Comparison of TAC NDSI (column 1), MODIS CGF NDSI (column 2), and STAR NDSI (column 3) products on 8 January 2018 (group a) and 3 February 2018 (group b).

As the average cloud cover is as high as 40.7 %, the TAC NDSI dataset has a low correlation with Landsat NDSI maps, with an average CC of 0.46 (Table 7). The cloud-free MODIS CGF NDSI dataset enhances the correlation with Landsat NDSI maps, with an average CC of 0.73. By contrast, the snow dynamics presented by the spatiotemporally continuous STAR NDSI dataset are highly consistent with Landsat NDSI maps, with an average CC further increased to 0.84. Compared to TAC NDSI and MODIS CGF NDSI datasets, the average RMSE of STAR NDSI dataset is decreased by 9.06 and 5.58, respectively. The positive AEs reveal that NDSI values for snow pixels in MODIS CGF NDSI and STAR NDSI datasets are generally higher than those of Landsat NDSI maps. In terms of snow coverage, the STAR NDSI dataset notably improves the detection of snow events compared to the other two datasets, with an average absolute SRD decreased by 31.1 % and 2.4 % (SRD indicates the difference of snow rate between MODIS and Landsat NDSI datasets). Consequently, the STAR NDSI collection is a more promising snow cover product than TAC NDSI and MODIS CGF NDSI datasets, contributing to related hydrological and meteorological studies.

Table 7Performance statistics for TAC NDSI, MODIS CGF NDSI, and STAR NDSI datasets against Landsat NDSI maps.

Note that TAC, CGF, and STAR represent TAC NDSI, MODIS CGF NDSI, and STAR NDSI datasets, respectively.

Download Print Version | Download XLSX

In addition to the cross-comparison of TAC NDSI, MODIS CGF NDSI, and STAR NDSI datasets, an internal comparison of the STAR NDSI collection in clear-sky and cloud-covered areas was performed based on Landsat NDSI maps, to highlight the accuracy of the recovered pixels in the STAR NDSI collection. As described above, clear-sky and cloud-covered areas account for 59.3 % and 40.7 %, respectively. Table 8 indicates that the snow distribution of recovered areas in the STAR NDSI collection is relatively consistent with that of Landsat NDSI maps. Although the average CC decreases from 0.85 to 0.73 and the average RMSE increases from 13.48 to 16.30 compared to clear-sky areas, the accuracy of recovered areas is satisfactory. Since many recovered areas inherit errors from clear-sky areas because the cloud-removal procedure completely relies on the original dataset, a slight decrease in accuracy is reasonable. In addition, the average AEs of clear-sky and recovered areas are 7.81 and 6.83, respectively, revealing the systematic overestimation of NDSI values in snow cover regions (Landsat NDSI values are generally low). Except for a few areas, the snow conditions in most cloud-covered areas are well recovered, with an average SRD of −5.0 %. This finding highlights that the STAR NDSI collection can completely remove clouds with satisfactory accuracy.

Table 8Performance statistics for STAR NDSI collection against Landsat NDSI maps in clear-sky and cloud-covered areas according to the TAC dataset. Note that ** and * respectively indicate an improvement and degradation of cloud-covered areas compared to clear-sky areas (corresponding to four groups in Fig. 6).

Download Print Version | Download XLSX

For in-depth verification analysis, Fig. 6 shows the visual effects in four regions corresponding to four highlighted groups in Table 8. The accuracy of clear-sky areas in NC is prominently high with a CC of 0.92, while recovered areas notably reduce the performance with a CC of 0.42. However, Fig. 6a shows that clear-sky areas in the TAC NDSI dataset cannot reflect the snow events, whereas the STAR NDSI collection effectively retrieves these events. Inevitably, the snow edges are slightly inaccurate and blurry due to insufficient reference information. For group (b), the original accuracy of the NDSI dataset in CCR is relatively low, with high cloud coverage and a false acceptance rate, while the STAR NDSI collection presents a snow pattern consistent with Landsat NDSI. Nevertheless, CCR is a transient snow area with relatively low altitudes and latitudes. Therefore, the gap-filled result has visible uncertainty, in which commission (black box) and omission (red box) frequently occur. As mentioned above, MODIS NDSI datasets generally perform poorly over QTP. However, Fig. 6c demonstrates that TAC NDSI and STAR NDSI collections can accurately capture snow events despite a few omissions (red box). Similar to the NC region, CCs provide a positive indication of overall performance in NX. As shown in Fig. 6d, even the NX4_20180103 scene, with a remarkably low CC of 0.58, can effectively reflect the snow pattern. In addition, the NDSI dataset retrieved by STAR inevitably has a few extremely abnormal areas during 20 years due to the vast territory of China; an example is presented in Fig. A2 (Appendix A). These areas have severe cloud contamination and irregular snow dynamics, contributing to the challenges in reconstruction and evaluation. Therefore, these areas are corrected by post-processing as described in Sect. 2.2.4.

Figure 6Comparison of TAC NDSI (column 1), Landsat NDSI (column 2), and STAR NDSI (column 3) products, and classification consistency (column 4) corresponding to NC3_20180311, CCR2_20180203, QTP2_20180225, and NX4_20180103 (groups a to d).

Overall, the numerical accuracy of the STAR NDSI collection is rigorously evaluated based on fine-resolution Landsat NDSI maps. The cross-comparison indicates that the STAR NDSI collection is superior to both TAC NDSI and MODIS CGF NDSI datasets. In addition, the internal comparison reveals that the effectiveness of cloud removal is satisfactory, although recovered areas have slightly lower accuracy than clear-sky areas. Consequently, the STAR NDSI collection has considerable application potential.

3.3 Application of the STAR NDSI collection

In addition to quality evaluation, the exemplary analysis also contributes to understanding the potential of the STAR NDSI collection for hydrological and climatic applications. Therefore, the spatial distribution and temporal variability of snow cover in China are analyzed as two simple application demonstrations.

From a spatial perspective, the sequence shown in Fig. 7 indicates that the snow fraction first increases and then decreases in NC and NX regions from 5 December 2014 to 25 January 2015. However, due to the complex topographic and climatic conditions, the snow cover in QTP presents irregular distribution and considerable fluctuations. During this period, the snow cover pattern in China changed dramatically, with the overall snow fraction ranging from 19.25 % to 35.42 %. In addition, another sequence of cloud-free NDSI maps in early 2020 is shown in Fig. 8. During this period, the snow cover in China presents a single-wave depletion curve. The peaks of snow fractions are observed in three major snow areas between 11 January 2020 and 21 January 2020. Compared to the previous sequence, the snow cover variation in QTP in this sequence has significantly enhanced regularity.

Figure 7A sequence of the STAR NDSI collection from 5 December 2014 to 25 January 2015.

Figure 8A sequence of the STAR NDSI collection from 1 January 2020 to 21 February 2020.

Figure 9Daily average snow fraction of NX, NC, QTP, and China.


Table 9The cloud-removal effectiveness of STAR compared to Landsat NDSI maps under different simulated cloud conditions. Note that bold values indicate a significant degradation of the accuracy under the current cloud cover compared to the previous one.

Download Print Version | Download XLSX

Figure 9 shows the daily average snow fraction in the three main subregions of China and the entire situation considering temporal analysis. In terms of intra-annual variability, the snow dynamics periodically evolve in NX and NC but substantially fluctuate in QTP. Similar snow depletion curves are present in NX and NC, demonstrating rapid accumulation and ablation in November and March, respectively. There is a relatively long snow period in QTP, with an average snow fraction varying from 20 % to 40 % from October to next May. Consequently, China is dominated by periodic snow. As for inter-annual variability, among the three major snow areas, the snow fraction in NC remarkably fluctuates with a standard deviation of 5.3 %. The snow coverage in QTP presented a slight decreasing trend from 2005 to 2017 but increased significantly in the past 2 years. In particular, rather than a significant rise in maximum snow coverage, the increase can be observed throughout the snow period with a slight reduction in intra-annual volatility. This finding implies that the regional climatic conditions tend to stabilize slightly. In addition, no significant trend has been detected in snow dynamics in China during the 20 years. Nevertheless, the significant fluctuation of maximum snow coverage in China indicates the presence of non-negligible large-scale transient snow cover regions. For example, Fig. A3 (Appendix A) shows the extreme snow event in southern China caused by the La Niña phenomenon, which resulted in heavy casualties and economic losses in the hydrological year 2007–2008.

Overall, the STAR NDSI collection can accurately reflect the spatial and temporal dynamics of snow cover in China. It is promising for hydrological and meteorological applications.

4 Discussion

To elaborate the cloud-removal effectiveness of the proposed STAR method, the performance statistics under different simulated cloud conditions are shown in Table 9. Four TAC NDSI maps with little cloud cover during the snow period were used in the simulated experiment. Cloud masks from other dates were added to the target maps, with different fractions of about 20 %, 50 % and 80 %. Subsequently, the artificially cloud-covered maps were recovered by STAR and validated by Landsat NDSI maps.

The quantitative results indicate that the recovery effectiveness of STAR typically declines significantly when cloud coverage is greater than 80 %. As a result, STAR can completely remove clouds with little loss of accuracy. Only in the NC4_20180318 scene, high overestimation occurs when cloud coverage reaches 55 %. The phenomenon is caused by high cloud coverage and rapid snow variation in space and time. Therefore, users are recommended to refer to QA maps of the STAR NDSI collection during snow accumulation and ablation periods, in which Bit 7 reflects the cloud coverage of the space–time block.

5 Data availability

The improved cloud-free Terra–Aqua MODIS NDSI collection (STAR NDSI collection) for China from 1 August 2000 to 31 July 2020, including STAR NDSI and STAR QA data, is available for download at (Jing et al., 2021). The dataset is provided using a WGS 84 / UTM zone 48N projection, with a tag image file format (TIFF). Users can discuss and respond to issues that arise during the use of this dataset. New versions can be released in consideration of user comments. In addition, a source code for this collection is available at (Jing, 2022).

6 Conclusions

The STAR NDSI collection is derived from Terra–Aqua MODIS NDSI datasets using an optimized STAR from our last research (Jing et al., 2019). The evaluation tests indicate that the STAR NDSI collection is highly consistent with the in situ snow-depth measurements and higher-resolution NDSI maps. The STAR NDSI collection generally has the following strengths: (1) This collection has reached a continuous 20-year period, which is the minimum period of a dataset for long-term hydrological and climatic processes analysis. (2) The cloud-free collection can accurately estimate the snow dynamics, highly consistent with in situ snow-depth and Landsat NDSI maps. Specifically, the STAR NDSI collection eliminates cloud contamination and pre-eminently improves the overall performance of the TAC NDSI dataset. Due to the higher spatial resolution and larger dynamic range, the classification accuracy of the STAR NDSI collection is higher than that of the NIEER AVHRR SCE dataset. In terms of numerical accuracy, it is superior to the MODIS CGF NDSI dataset, since the STAF method generally outperforms the simplified MDC method. Additionally, it has a satisfactory accuracy in original cloud-covered areas. (3) The collection provides a detailed snow cover dataset for China, accurately reflecting the snow conditions of the following three major snow areas: NX, NC, and QTP. The collection is available at: (Jing et al., 2021).

As discussed above, the STAR NDSI collection still has some deficiencies. A future release should consider several issues: (1) the original accuracy of MODIS NDSI datasets reduced by factors such as complex climatic conditions and dense forest coverage; (2) the reconstruction accuracy of snow edges affected by mixed pixels and high cloud coverage; (3) the reconstruction accuracy of transient snow areas due to the inadequate spatiotemporal contextual information; and (4) the lack of evaluation based on in situ snow-depth measurements in NC due to the limited access to climate station data.

Despite the aforementioned deficiencies, since snow is a pivotal driver and sensitive indicator for many hydrometeorological processes, the daily 500 m STAR NDSI collection for 20 years has various potential applications: (1) achieving a deep understanding of long-term snow cover variability in China, (2) providing effective forcing data for hydrological and meteorological models, and (3) supporting strategic decisions on water resources management, environmental pollution governance, and related economic development.

Appendix A

Figure A1The QA maps over Taklimakan Desert on 19 January 2008. The QA map (a). Comprehensive QA map (b).

Figure A2Post-processing over Taklimakan Desert on 19 January 2008. STAR result (a). Final result (b).

Figure A3Extreme snow event in southern China on 31 January 2008 (a), 5 February 2008 (b), and 10 February 2008 (c).

Author contributions

All authors designed the methodology. YJ implemented the experiments. YJ maintained and refined the STAR NDSI collection. YJ drafted the manuscript. XL and HS revised the whole manuscript. All authors provided suggestions for this manuscript.

Competing interests

The contact author has declared that none of the authors has any competing interests.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


This research was supported by the National Natural Science Foundation of China (NSFC) under grant no. 41701394 and no. 42171302. We would also like to thank Koi Lin from Wuhan University for his suggestions on the writing of this manuscript.

Financial support

This research was supported by the National Natural Science Foundation of China (NSFC) (grant nos. 41701394 and 42171302).

Review statement

This paper was edited by Yuyu Zhou and reviewed by three anonymous referees.


Aalstad, K., Westermann, S., and Bertino, L.: Evaluating satellite retrieved fractional snow-covered area at a high-Arctic site using terrestrial photography, Remote Sens. Environ., 239, 111618,, 2020. 

Akyurek, Z., Hall, D. K., Riggs, G. A., and Sensoy, A.: Evaluating the utility of the ANSA blended snow cover product in the mountains of eastern Turkey, Int. J. Remote Sens., 31, 3727–3744,, 2010. 

Barnett, T. P., Adam, J. C., and Lettenmaier, D. P.: Potential impacts of a warming climate on water availability in snow-dominated regions, Nature, 438, 303–309,, 2005. 

Bormann, K. J., Brown, R. D., Derksen, C., and Painter, T. H.: Estimating snow-cover trends from space, Nat. Clim. Change, 8, 923–927,, 2018. 

Brown, R., Derksen, C., and Wang, L. B.: A multi-data set analysis of variability and change in Arctic spring snow cover extent, 1967–2008, J. Geophys. Res.-Atmos., 115, D16111,, 2010. 

Che, T., Dai, L. Y., Zheng, X. M., Li, X. F., and Zhao, K.: Estimation of snow depth from passive microwave brightness temperature data in forest regions of northeast China, Remote Sens. Environ., 183, 334–349,, 2016. 

Che, T., Li, X., Liu, S., Li, H., Xu, Z., Tan, J., Zhang, Y., Ren, Z., Xiao, L., Deng, J., Jin, R., Ma, M., Wang, J., and Yang, X.: Integrated hydrometeorological, snow and frozen-ground observations in the alpine region of the Heihe River Basin, China, Earth Syst. Sci. Data, 11, 1483–1499,, 2019. 

Chen, S., Wang, X., Guo, H., Xie, P., and Sirelkhatim, A. M.: Spatial and temporal adaptive gap-filling method producing daily cloud-free NDSI time series, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 13, 2251–2263,, 2020. 

Chen, X. N., Long, D., Liang, S. L., He, L., Zeng, C., Hao, X. H., and Hong, Y.: Developing a composite daily snow cover extent record over the Tibetan Plateau from 1981 to 2016 using multisource data, Remote Sens. Environ., 215, 284–299,, 2018. 

Çiftçi, B. B., Kuter, S., Akyürek, Z., and Weber, G. W.: Fractional snow cover mapping by artificial neural networks and support vector machines, ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 4, 179–187,, 2017. 

Crawford, C. J.: MODIS Terra Collection 6 fractional snow cover validation in mountainous terrain during spring snowmelt using Landsat TM and ETM, Hydrol. Process., 29, 128–138,, 2015. 

Dariane, A. B., Khoramian, A., and Santi, E.: Investigating spatiotemporal snow cover variability via cloud-free MODIS snow cover product in Central Alborz Region, Remote Sens. Environ., 202, 152–165,, 2017. 

Dobreva, I. D. and Klein, A. G.: Fractional snow cover mapping through artificial neural network analysis of MODIS surface reflectance, Remote Sens. Environ., 115, 3355–3366,, 2011. 

Gafurov, A. and Bárdossy, A.: Cloud removal methodology from MODIS snow cover product, Hydrol. Earth Syst. Sci., 13, 1361–1373,, 2009. 

Gafurov, A., Vorogushyn, S., Farinotti, D., Duethmann, D., Merkushkin, A., and Merz, B.: Snow-cover reconstruction methodology for mountainous regions based on historic in situ observations and recent remote sensing data, The Cryosphere, 9, 451–463,, 2015. 

Gao, J., Williams, M. W., Fu, X. D., Wang, G. Q., and Gong, T. L.: Spatiotemporal distribution of snow in eastern Tibet and the response to climate change, Remote Sens. Environ., 121, 1–9,, 2012. 

Gao, Y., Xie, H. J., and Yao, T. D.: Developing Snow Cover Parameters Maps from MODIS, AMSR-E, and Blended Snow Products, Photogramm. Eng. Remote Sens., 77, 351–361,, 2011. 

Gladkova, I., Grossberg, M., Bonev, G., Romanov, P., and Shahriar, F.: Increasing the accuracy of MODIS/Aqua snow product using quantitative image restoration technique, IEEE Geosci. Remote Sens. Lett., 9, 740–743,, 2012. 

Hall, D. K. and Riggs, G. A.: MODIS/Terra CGF Snow Cover Daily L3 Global 500m SIN Grid, Version 61, NASA National Snow and Ice Data Center Distributed Active Archive Center [data set],, 2020. 

Hall, D. K., Riggs, G. A., and Salomonson, V. V.: Development of methods for mapping global snow cover using Moderate Resolution Imaging Spectroradiometer data, Remote Sens. Environ., 54, 127–140,, 1995. 

Hall, D. K., Riggs, G. A., DiGirolamo, N. E., and Román, M. O.: Evaluation of MODIS and VIIRS cloud-gap-filled snow-cover products for production of an Earth science data record, Hydrol. Earth Syst. Sci., 23, 5227–5241,, 2019. 

Hao, X., Huang, G., Che, T., Ji, W., Sun, X., Zhao, Q., Zhao, H., Wang, J., Li, H., and Yang, Q.: The NIEER AVHRR snow cover extent product over China – a long-term daily snow record for regional climate research, Earth Syst. Sci. Data, 13, 4711–4726,, 2021. 

Hao, X., Huang, G., Zheng, Z., Sun, X., Ji, W., Zhao, H., Wang, J., Li, H., and Wang, X.: Development and validation of a new MODIS snow-cover-extent product over China, Hydrol. Earth Syst. Sci., 26, 1937–1952,, 2022. 

He, G. J., Feng, X. Z., Xiao, P. F., Xia, Z. H., Wang, Z., Chen, H., Li, H., and Guo, J. J.: Dry and Wet Snow Cover Mapping in Mountain Areas Using SAR and Optical Remote Sensing Data, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 10, 2575–2588,, 2017. 

Hou, J. and Huang, C.: Improving mountainous snow cover fraction mapping via artificial neural networks combined with MODIS and ancillary topographic data, IEEE T. Geosci. Remote, 52, 5601–5611,, 2014. 

Hou, J., Huang, C., Zhang, Y., and Guo, J.: On the Value of Available MODIS and Landsat8 OLI Image Pairs for MODIS Fractional Snow Cover Mapping Based on an Artificial Neural Network, IEEE T. Geosci. Remote, 58, 4319–4334,, 2020. 

Huang, X.: MODIS daily cloudless binary snow products in Northern Hemisphere from 2000 to 2015, National Cryosphere Desert Data Center [data set],, 2020. 

Huang, X., Deng, J., Ma, X., Wang, Y., Feng, Q., Hao, X., and Liang, T.: Spatiotemporal dynamics of snow cover based on multi-source remote sensing data in China, The Cryosphere, 10, 2453–2463,, 2016. 

Huang, Y., Liu, H. X., Yu, B. L., We, J. P., Kang, E. L., Xu, M., Wang, S. J., Klein, A., and Chen, Y. N.: Improving MODIS snow products with a HMRF-based spatio-temporal modeling technique in the Upper Rio Grande Basin, Remote Sens. Environ., 204, 568–582,, 2018. 

Jing, Y., Shen, H., Li, X., and Guan, X.: A Two-Stage Fusion Framework to Generate a Spatio-Temporally Continuous MODIS NDSI Product over the Tibetan Plateau, Remote Sensing, 11, 2261,, 2019. 

Jing, Y. H.: A spatio-temporal adaptive fusion method with error correction for cloud-free MODIS NDSI estimation (Version 01), Zenodo [code],, 2022. 

Jing, Y. H., Li, X. H., and Shen, H. F.: STAR NDSI collection: A cloud-free MODIS NDSI dataset (2001–2020) for China (Version 01), Zenodo [data set],, 2021. 

Klein, A. G. and Barnett, A. C.: Validation of daily MODIS snow cover maps of the Upper Rio Grande River Basin for the 2000–2001 snow year, Remote Sens. Environ., 86, 162–176,, 2003. 

Konzelmann, T. and Ohmura, A.: Radiative fluxes and their impact on the energy-balance of the Greenland ice-sheet, J. Glaciol., 41, 490–502,, 1995. 

Kuter, S.: Completing the machine learning saga in fractional snow cover estimation from MODIS Terra reflectance data: Random forests versus support vector regression, Remote Sens. Environ., 255, 112294,, 2021. 

Kuter, S., Akyurek, Z., and Weber, G. W.: Retrieval of fractional snow covered area from MODIS data by multivariate adaptive regression splines, Remote Sens. Environ., 205, 236–252,, 2018. 

Li, M., Zhu, X., Li, N., and Pan, Y.: Gap-Filling of a MODIS normalized difference snow index product based on the similar pixel selecting algorithm: A case study on the Qinghai-Tibetan Plateau, Remote Sensing, 12, 1077,, 2020. 

Li, X., Jing, Y., Shen, H., and Zhang, L.: The recent developments in cloud removal approaches of MODIS snow cover product, Hydrol. Earth Syst. Sci., 23, 2401–2416,, 2019. 

Li, X. H., Fu, W. X., Shen, H. F., Huang, C. L., and Zhang, L. P.: Monitoring snow cover variability (2000–2014) in the Hengduan Mountains based on cloud-removed MODIS products with an adaptive spatio-temporal weighted method, J. Hydrol., 551, 314–327,, 2017. 

Liang, H., Huang, X. D., Sun, Y. H., Wang, Y. L., and Liang, T. G.: Fractional Snow-Cover Mapping Based on MODIS and UAV Data over the Tibetan Plateau, Remote Sensing, 9, 1332,, 2017. 

Malmros, J. K., Mernild, S. H., Wilson, R., Tagesson, T., and Fensholt, R.: Snow cover and snow-albedo changes in the central Andes of Chile and Argentina from daily MODIS observations (2000–2016), Remote Sens. Environ., 209, 240–252,, 2018. 

Moosavi, V., Malekinezhad, H., and Shirmohammadi, B.: Fractional snow cover mapping from MODIS data using wavelet-artificial intelligence hybrid models, J. Hydrol., 511, 160–170,, 2014. 

Muhammad, S. and Thapa, A.: An improved Terra–Aqua MODIS snow cover and Randolph Glacier Inventory 6.0 combined product (MOYDGL06*) for high-mountain Asia between 2002 and 2018, Earth Syst. Sci. Data, 12, 345–356,, 2020. 

Muhammad, S. and Thapa, A.: Daily Terra–Aqua MODIS cloud-free snow and Randolph Glacier Inventory 6.0 combined product (M*D10A1GL06) for high-mountain Asia between 2002 and 2019, Earth Syst. Sci. Data, 13, 767–776,, 2021. 

National Meteorological Information Center and Tibet Meteorological Bureau, China: Observational snow depth dataset of the Tibetan Plateau (Version 1.0) (1961–2013), National Tibetan Plateau Data Center [data set],, 2018. 

Pachauri, R., Meyer, L., Plattner, G., and Stocker, T.: Synthesis report. Contribution of working groups I, II and III to the fifth assessment report of the intergovernmental panel on climate change, Intergovernmental Panel on Climate Change, Geneva, Switzerland, (last access: 6 July 2022), 2014. 

Parajka, J. and Bloschl, G.: Spatio-temporal combination of MODIS images – potential for snow cover mapping, Water Resour. Res., 44, W03406,, 2008. 

Parajka, J., Pepe, M., Rampini, A., Rossi, S., and Bloschl, G.: A regional snow-line method for estimating snow cover from MODIS during cloud cover, J. Hydrol., 381, 203–212,, 2010. 

Qiu, Y., Wang, X., Han, L., Chang, L., and Shi, L.: Daily Fractional Snow Cover (FSC) Data set over High Asia, Science Data Bank [data set],, 2017. 

Riggs, G. A., Hall, D. K., and Román, M. O.: MODIS snow products collection 6 user guide, National Snow and Ice Data Center: Boulder, CO, USA, 66, (last access: 6 July 2022), 2015. 

Riggs, G. A., Hall, D. K., and Román, M. O.: Overview of NASA's MODIS and Visible Infrared Imaging Radiometer Suite (VIIRS) snow-cover Earth System Data Records, Earth Syst. Sci. Data, 9, 765–777,, 2017. 

Sibson, R.: A brief description of natural neighbor interpolation, chap. 2, in: Interpolating Multivariate Data, edited by: Barnett, V., John Wiley, Chichester, 21–36, 1981. 

Tang, Z. G., Wang, J., Li, H. Y., and Yan, L. L.: Spatiotemporal changes of snow cover over the Tibetan plateau based on cloud-removed moderate resolution imaging spectroradiometer fractional snow cover product from 2001 to 2011, J. Appl. Remote Sens., 7, 073582,, 2013. 

Tong, R., Parajka, J., Komma, J., and Blöschl, G.: Mapping snow cover from daily Collection 6 MODIS products over Austria, J. Hydrol., 590, 125548,, 2020.  

Wang, J., Che, T., Li, Z., Li, H., Hao, X., Zheng, Z., Xiao, P., Li, X., Huang, X., and Zhong, X.: Investigation on Snow Characteristics and Their Distribution in China, Adv. Earth Sci., 33, 12–16,, 2018. 

Wang, X. W., Xie, H. J., and Liang, T. G.: Evaluation of MODIS snow cover and cloud mask and its application in Northern Xinjiang, China, Remote Sens. Environ., 112, 1497–1513,, 2008. 

Yu, J. Y., Zhang, G. Q., Yao, T. D., Xie, H. J., Zhang, H. B., Ke, C. Q., and Yao, R. Z.: Developing Daily Cloud-Free Snow Composite Products From MODIS Terra-Aqua and IMS for the Tibetan Plateau, IEEE T. Geosci. Remote, 54, 2171–2180,, 2016. 

Yuan, Q., Shen, H., Li, T., Li, Z., Li, S., Jiang, Y., Xu, H., Tan, W., Yang, Q., Wang, J., Gao, J., and Zhang, L.: Deep learning in environmental remote sensing: Achievements and challenges, Remote Sens. Environ., 241, 111716,, 2020. 

Zhang, G., Xie, H., Yao, T., Liang, T., and Kang, S.: Snow cover dynamics of four lake basins over Tibetan Plateau using time series MODIS data (2001–2010), Water Resour. Res., 48, W10529,, 2012. 

Zhang, H., Zhang, F., Zhang, G., Che, T., Yan, W., Ye, M., and Ma, N.: Ground-based evaluation of MODIS snow cover product V6 across China: Implications for the selection of NDSI threshold, Sci. Total Environ., 651, 2712–2726,, 2019. 

Zhang, H., Zhang, F., Zhang, G., Yan, W., and Li, S.: Enhanced scaling effects significantly lower the ability of MODIS normalized difference snow index to estimate fractional and binary snow cover on the Tibetan Plateau, J. Hydrol., 592, 125795,, 2021. 

Short summary
Snow variation is a vital factor in global climate change. Satellite-based approaches are effective for large-scale environmental monitoring. Nevertheless, the high cloud fraction seriously impedes the remote-sensed investigation. Therefore, a recent 20-year cloud-free snow cover collection in China is generated for the first time. This collection can serve as a basic dataset for hydrological and climatic modeling to explore various critical environmental issues.
Final-revised paper