A new snow depth data set over northern China by developing a comprehensive framework using the complex GNSS station networks ( GSnow-CHINA v 1 . 0 , 2013-2020 )

1 Institute of Remote Sensing and GIS, School of Earth and Space Sciences, Peking University, Beijing 100871, China 2 College of Oceanography and Space Informatics, China University of Petroleum (East China), Qingdao 266580, 10 China 3 Key Laboratory of Remote Sensing of Gansu Province, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China 4 Meteorological Observation Center, China Meteorological Administration, Beijing 100081, China 5 Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China 15 6 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China


Introduction
Snow cover is one of the most active elements in the cryosphere, and the maximum snow area during winter 50 nearly occupies 50% of the total land surface area of the Northern Hemisphere (Frei and Robinson, 1999;Armstrong and Brodzik, 2001;Robinson et al., 1993). The snow change plays a significant role in the hydrological, ecological, and climatic systems (Henderson et al., 2018). Therefore, accurately estimating snow cover and snow depth and their variations is essential for studies on climate and hydrology.
Currently, snow cover products derived from optical remote sensing data present high accuracy, but snow 55 depth products show significant uncertainties. Snow depth can be measured at point-scale using ground-based ultrasonic snow depth sensors or laser snow depth sensors, and mainly include observations from meteorological stations, snow surveys, and hydrological stations (Kinar and Pomeroy, 2015). Large-scale snow depth can be retrieved from optical, passive microwave, and active remote sensing observations (Shi and Dozier, 2000;Guerreiro et al., 2016;Leinss et al., 2014;Che et al., 2016), yet currently operational observations have 60 shortcomings. Optical remote sensing is affected by solar radiation and cloud (Dai et al., 2017). Passive microwave remote sensing is with coarse spatial footprints (> 25 km), and the observations saturate in deep snow (> 0.8 m) (Lievens et al., 2019). Active microwave remote sensing is with a long revisiting period (> 20 days) and with a high cost (Lievens et al., 2019).
The available global/hemispheric/regional snow depth data sets are mainly derived from ground 65 observations, microwave remote sensing, model simulations, and reanalysis (Xiao et al., 2020). Representative snow depth data sets include: 1) In-situ measurements from ground networks such as SCAN and SNOTEL (pointscale, hourly/daily/7-day/monthly; http://www.wcc.nrcs.usda.gov), 2) Data sets derived from satellite passive microwave brightness temperatures, e.g., the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) and its follow-on the Advanced Microwave Scanning Radiometer-2 (AMSR2) (25 70 km, daily, global/regional, 2002-, https://nsidc.org/), and the Global Snow Monitoring for Climate Research (GlobSnow) data set produced from the data assimilation of microwave radiometer data and meteorological station data (25 km, daily, hemispheric, 1979-, https The aforementioned long-term snow depth data sets are either point-scale or gridded data with coarse spatial resolution. Previous studies also demonstrated that current snow depth data sets show significant inconsistencies 80 and uncertainties, which limit their applications in climate change projections and hydrological processes simulations (Xiao et al., 2020;Zhang et al., 2021). Due to the complex spatial-temporal variability and the limitations of the current observation approaches, it is still challenging to derive snow depth data set of long-term with high spatial-temporal resolution. In particular, it lacks detailed observations of snow depth on a regional scale, which limits the applications in climate models, hydrological models, and snow disaster monitoring. 85 Estimating snow depth using the Global Navigation Satellite System Interferometric Reflectometry (GNSS-IR) technique has become a popular topic in recent years, ever since the principle was proposed by (Larson et al., 2009). Snow depth is determined by calculating the relative change of the effective multipath reflector height (i.e., the snow surface) to the snow-free surface. This technique is cost-effective because it does not require an additional transmitter, and instead, it continuously receives L-band microwave signals transmitted by the GNSS 90 satellites. The temporal resolution for snow sensing is expected to be hourly, along with the increasing number of GNSS satellites in orbit (Tabibi et al., 2017). The spatial footprint of the GNSS-derived snow depth is recognized as ~ 1000 m 2 , which is a scale between point-scale and satellite-scale (i.e., from tens of meters to tens of kilometers) (Larson and Nievinski, 2013). Therefore, GNSS-IR could provide new snow depth data sets which could be supplementary to the current in-situ and satellite data sets. However, developing robust and operational 95 technology to produce long-term snow depth data sets using data from various GNSS station networks is still challenging due to complex environmental and observation conditions. This study, taking advantage of 80 sites from the GNSS station networks over northern China, develops a comprehensive framework to process raw data from various stations and subsequently develops a new GNSS-IR snow depth data set (GSnow-CHINA v1.0, 12h/24h, 2013. Northern China has a wide-distributed snow 100 cover from October to April next year. China's annual mean snow extent is greater than 9'000'000 km 2 , with a stable snow-covered area of ~ 4'200'000 km 2 . This region is the main snow-covered area in China, which also plays a vital role in the climate research of the Northern Hemisphere and the cryosphere. The unique characteristics of the GSnow-CHINA v1.0 and the framework to develop it are as follows: (1) GSnow-CHINA v1.0 is a snow depth data set developed using GNSS data source, independent from the 105 current satellite, modeled, reanalysis, and in-situ data sets. The spatial resolution of this data set is between the in-situ point-scale and the coarse-gridded data, which makes it a new data set suitable for validation purposes.
(2) GSnow-CHINA v1.0 is a long-term snow depth data set over China with high temporal and spatial resolution, which provides a new data source for regional and global climate research. The data set is also helpful for monitoring local snow disasters and water resource management. 110 (3) The proposed framework to develop the data set provides comprehensive and supportive information for users to process raw data of ground GNSS stations with complex environmental conditions and various observation conditions. The technique has the potential to provide finer resolution snow depth product (e.g., one to two hours) with adequate observations from multiple GNSS systems.

Study area
Northern China lies between latitudes 25°N and 55°N and longitudes 70°E and 140°E and include humid, semi-humid, semi-arid, and arid zones. Snow is the primary freshwater resource over this area and the main natural disaster in winter for the pastoral areas. The study area includes the three main stable snow accumulation areas over China, i.e., Northeast China and Inner Mongolia (NCM), North Xinjiang and Tianshan mountain 120 (NXT), and Qinghai-Tibet Plateau (QTP) (Figure 1).
The NCM region has various geomorphic types. Mountains and hills surround the east, west, and north of this region, and the middle of this region is plain. The mean minimum air temperature in January is below -30 °C.
The annual mean snow depth is greater than 5 cm with a maximum value of greater than 30 cm. The mean snow density of this area is ~0.15 g. cm -3 . The NXT region has abundant seasonal snow water resources, vital to local 125 irrigation and animal husbandry. The mean air temperature is -4 ~ 9 ℃ with a long winter period. The QTP region is the core region of "The Third Pole" with a mean altitude of ~4378 m. Rainfall of the QTP is concentrated chiefly from May to September, while snowfall usually starts from September to April of the following year.

Data
Observations from the GNSS station networks over northern China are the primary data source to produce the snow depth data set. The networks include two separate categories constructed by two organizations, i.e., the 135 network constructed by the China Meteorological Administration (CMA) and the Crustal Movement Observation Network of China constructed by the China Earthquake Administration (CEA). China started to construct ground GNSS stations in 2009, and the stations have turned into a certain amount since 2012. The CMA stations were built to observe precipitable water vapor, while the CEA stations were built to monitor crustal deformation.
As shown in Figure 1, raw data from all 174 CMA sites and 171 CEA sites are acquired from the Center of 140 Meteorological Observation, CMA, to initially evaluate the capability to retrieve snow depth site by site. The sites are divided into three categories, i.e., high quality, medium quality, and low quality, following the recognition rule used for site quality determination. The rule will be introduced in the following "Methods" Section. Overall, there are 55 high-quality sites (52 for CMA and 3 for CEA) and 25 medium-quality sites (22  Figure 2 shows the periods of the high-quality and medium-quality GNSS sites used for snow depth retrieval. For CMA, despite the possible raw data missing for some sites, the majority time spans for the high-quality sites 150 are 2013-2020, 2015-2020, and 2016-2020, and that for the medium-quality sites are 2015-2020. For CEA, the high-quality sites are merely from 2018/2019-2020, with one medium-quality site having the earliest record from the year 2010. Due to the irreplaceable value of each site because of its unique natural environment and characteristic of snow, we preserve the high-quality and medium-quality sites as much as possible during the production of this snow depth data set. 155 The Soil Moisture Active and Passive (SMAP) L3 36 km soil moisture data are used to estimate the penetration depth of GNSS signals to the soil layer. It is a quality control step to derive a more accurate reflector 165 height of the non-snow surface. The Moderate Resolution Imaging Spectroradiometer (MODIS) 1 km Normalized Difference Vegetation Index (NDVI) data are used to identify the vegetation effects on snow depth retrieval. Two independent snow depth data products are used to analyze the quality of the data set produced in this study. One is the 1979-2020 snow depth product using passive microwave remote sensing produced by Che et al., 2008;Dai et al., 2015) (daily, 25 km), named PMW hereafter for short. The snow depth 170 of this product is derived using the SMMR and SSMI/S microwave brightness temperature processed by the National Snow and Ice Data Center (NSIDC). The other is the daily in-situ snow depth measurements using laser snow depth sensors provided by the Meteorological Observation Center, CMA.

Methods
The flowchart to produce and validate the GSnow-CHINA data set is shown in Figure 3. The raw GNSS 175 data used for snow depth retrieval is the daily RINEX data derived directly from individual CMA/CEA GNSS sites. Significant steps to produce the data set are described as follows: 1) The observables for snow depth retrieval, i.e., satellite Pseudorandom Noise (PRN) numbers, observation time, satellite elevation angle, satellite azimuth angle, pseudorange, carrier phase (CP), Signal-to-Noise Ratio (SNR), are extracted or calculated from the raw data. 180 2) The Lomb-Scargle periodogram analysis is executed on several non-snow days to determine the mean reflector heights for each quadrant, used as reference height when calculating snow depth.
3) A comprehensive evaluation of the quality of all the GNSS sites is done based on the data quality of the non-snow surface reflector height, and the sites are divided into high-, medium-, and low-quality accordingly. 185 4) For high-and medium-quality sites, the model for deriving daily reflector height is established, and the raw snow depth for each GNSS satellite, each quadrant, and each GNSS frequency is subsequently calculated as the difference value of the referenced height in Step 3) and the height of this step. 7) The GSnow-CHINA data set is evaluated using the PMW product and the in-situ measurements. The advantages and limitations of the produced data set are further analyzed to provide supportive information for future method improvement or data set extension.

Snow depth retrieval model
The state-of-the-art GNSS-IR snow depth retrieving models can be divided into two categories according 205 to the two types of observables (i.e., the SNR and the carrier phase (CP)). The principle of the SNR model is to establish a linear relationship between the oscillation frequency of the SNR observation sequence of the reflected signal and the height of the reflection surface (Larson et al., 2009). This model was later derived into several variants: e.g., the triple-frequency SNR combination model (SNR_COM) (Zhou et al., 2019), the SNR model based on raw SNR sequences (Peng et al., 2016), the SNR model based on horizontal polarization antenna (Chen 210 et al., 2014), the SNR model considering the influence of construction facilities (Vey et al., 2016), and the SNR model considering the influence of terrain (Zhang et al., 2017). The carrier phase combination model was initially proposed to estimate snow depth when there were no SNR data in the raw GNSS observation file (Ozeki and Heki, 2012). The initial form of this model was using the geometry-free linear combinations of the phase measurements (L4), and (Yu et al., 2015;Yu et al., 2018) extended the model to use triple-frequency carrier 215 phase observations (F3) as well as the combination of pseudorange and carrier phase of dual-frequency signals (F2C).
The main formulas and applicability of the five models mentioned above to the data of GNSS sites in this study are analyzed in Table 1, and Table 2 further shows the meanings of variables for the models in Table 1. The SNR, L4, and F2C models are suitable for all sites because the observables used as inputs for these models 220 are available in the GNSS raw data. Even though, the SNR model has been verified to have higher accuracy than the L4 and F2C models . The applicability of the SNR_COM and F3 models is limited because most of the GNSS sites do not contain three SNR or CP observables in a single raw data file. Considering both the applicability and the accuracy, the SNR model is determined as the primary model used to produce the snow depth data set.
Suitable for all sites but with relatively lower accuracy Suitable for all sites but with relatively lower accuracy The geometry and principle of the SNR model are shown in Figure 4. As shown in Figure 4a, the snow 230 depth (ℎ UV>W ) is calculated using a simple equation: Where ℎ X is the reflector height of the non-snow surface, and ℎ is the reflector height of the snow-covered surface. The approaches to derive ℎ X and ℎ are similar, with Figure 4 b1~b3 showing the general technical process. First, the time series of GNSS SNR observation are shown as a function of sine (elevation angle), and 235 the direct signal is removed using the polynomial fitting method, and the remaining is treated to be the contribution of the reflected signal from the land surface. Second, the reflected signal is converted from dB-Hz to Volts/Volts. Third, a Lomb-Scargle analysis is executed to the reflected signal curve to find out the dominant frequency of the transformation. The ℎ X or ℎ can be calculated by (Larson et al., 2009): Where is the wavelength of the GNSS signal and is the dominant frequency.

Determination of the non-snow surface reflector height 245
For each site, ~ ten days of data with no snow on the ground are used to calculate the non-snow surface reflector height. According to the data availability, days of the year (DOYs) 110~119 or DOYs 274~283 are generally selected since these days have no snow according to historical in-situ data. The reflector height for each GNSS satellite, quadrant, and GNSS frequency band is calculated using the Lomb-Scargle spectrum, and it is just the initial height being used for the quality evaluation of the GNSS sites. Due to the complex natural 250 environment for various sites, it is not clear whether one site is suitable for snow depth retrieval. For sites suitable for snow depth retrieval, the finalized non-snow surface reflector height will be determined as the mean value of heights of the ten days. It is worth mentioning that, for GPS and GLONASS satellite, the reflector height is given per satellite, quadrant, and frequency band, while for BDS satellite, the reflector height is given by quadrant only because the BDS MEO satellite changes its trajectory day by day. 255

Quality evaluation of the GNSS sites
The CMA and CEA sites are built under various natural and manual environmental conditions.  Photos of typical GNSS sites. "bumz" and "bgfc" are two CMA sites, and "qhdl" is a CEA site.

265
A rigorous rule is defined to evaluate the quality of all the GNSS sites. Figure 6 shows the rule being applied to six individual sites with various surroundings, i.e., bumz, bfhr, bgfc, uqwl, qhdl, and qhbm. The top panel of each subfigure shows the environmental conditions around the station on Google Map, with different colors indicating the footprints for elevation angles of 10°, 15°, 20°, 25°, and 30°, respectively. The bottom panel of each subfigure shows the sorted 10-day reflector heights of non-snow surface (i.e., ℎ X ). The plots clearly show 270 the differences in the heights for different sites. The first two sites, i.e., bumz and bfhr, show relatively long and stable ℎ X values for all the GNSS satellites, quadrants, and frequency bands during the entire observation period.
It indicates that these sites are flat enough for all the orientations and are ideal for determining the initial range of the non-snow surface reflector height, i.e., 2.5 ~ 2.8 m for bumz and 2.8 ~ 3.1 m for bfhr. Unlike these two sites, the bgfc site has relatively stable ℎ X values only in specific orientation whose natural condition is open and 275 flat. This phenomenon can be verified from the photo of the bgfc site in Figure 5. This site is also good enough to determine the initial range of the non-snow surface reflector height, i.e., 3.6 ~ 4.1 m for bgfc. On the contrary, the three sites at the bottom of Figure 6, i.e., uqwl, qhdl, and qhbm, show continuously changed ℎ X values. It indicates that it is unreliable to determine a true ℎ X by the Lomb-Scargle spectrum due to complex environmental conditions. 280 For each site, the ℎ X plot is visually checked carefully using the rule mentioned above and determined whether it is suitable for snow depth retrieval. For those sites that are suitable for usage, a range of ℎ X is given manually to narrow the good ℎ X values. The difference of the minimum and maximum value of the range is set to be no more than 0.5 m. Afterward, the finalized non-snow surface reflector heights are determined as the mean value of heights of the ten days. It should be noted that during this processing step, it can only eliminate those 285 sites with poor data quality for snow depth retrieval rather than distinguishing high-and medium-sites. There are no apparent differences for the high-and medium-quality sites in terms of the natural environment. Instead, the medium-quality site is defined using two simple rules, i.e., one is the site has good-quality data, but there is no snow for almost all the years. The other is the site's lack of data for most of the years.

Deriving snow depth of finer resolution
The default temporal resolution of the snow depth data set is 24-hour. However, some sites have adequate satellite observations that make it possible to produce finer resolution snow depth data. We have two different 300 solutions to produce snow depth of finer temporal resolution. For most sites with only GPS observations, we try GLONASS systems to derive finer temporal resolution snow depth. Unlike the previous 12-hour maximum resolution, a 2-hour resolution can be achieved using compatible observations.

Quality control of the snow depth data set
Several postprocessing steps are executed to accomplish the quality control of the raw snow depth data set.
This section gives detailed information on these steps as follows: 310 (1) Moving average filtering For each site, as shown in Figure 7, the raw snow depth values over a snow season, i.e., from October 1st this year to April 30th the following year, are gathered together, and the moving average algorithm is executed to filter out the snow depth outliers. This moving average method is a traditional way to reject outliers (Wang et al., 2020;Tabibi et al., 2017;Nievinski and Larson, 2014). Snow depth values out of the 95% confidence interval 315 are smoothed over a sliding window of length ten across neighboring elements. In the finalized GSnow-China data set, we also provide the original data set without filtering to allow users to check the initial form of the data.
The following analyses in Sections 4 and 5 are based on the filtered data. (b) shows the relationship between penetration depth and soil moisture/soil components calculated using 325 parameters provided in (Hallikainen et al., 1985). The penetration depth is deeper than 10 cm when soil is very dry (i.e., volumetric soil moisture (VSM) < 0.1 cm 3 .cm -3 ). The penetration depth is around or shallower than 5 cm under normal soil moisture conditions. Soil components also affect the penetration depth slightly. In this study, for each site, since the soil components data is unavailable, the VSM value is used as the basis to define the error of the penetration depth for this site. The average VSM is calculated as the multiple-year mean value of 330 the SMAP VSM. To balance the effects from both the soil moisture and the soil components, we define a simple and quantified rule to distinguish the error caused by soil penetration, i.e., hp = 10 cm for VSM between 0 ~ 0.1 cm 3 .cm -3 , hp = 5 cm for VSM between 0.1 ~ 0.2 cm 3 .cm -3 , and hp = 2.5 cm for VSM higher than 0.2 cm 3 .cm -3 .
The ℎ X is modified as ℎ X − ℎ Z for the final production of the snow depth data set. Figure 8 (b) shows statistics of the number of GNSS sites categorized by the modification value of the soil penetration depth. The majority 335 has shallow penetration depth of 2.5 cm or 5 cm, with only a few has deep penetration depth of 10 cm. vegetation or snow. As for northern China, this phenomenon occurs mainly in October and early November. In 345 this study, for each site over from October 1st to November 15th, if there are snow depth records from the GNSS data, we use the NDVI from MODIS data and the historical weather report to determine whether it is actual snow or not. After this round of checking, to ensure the reliability of the snow depth, for 10 sites that probably have "fake snow depth" records, DOYs 270 ~ 300 are masked out from the data set.
(4) Quality flags 350 The number of GNSS satellites used for this calculation is used as a quality flag for each snow depth data record. In this study, we set the threshold to be 5 to preserve as much data as possible. According to this quality flag, the users can decide whether to use a snow depth data record with a low number of observations. For each snow depth data record, the standard error (STE) of the snow depth observations is treated as another qualifying flag. The users can also decide their own rules to filter the data according to this quality flag. The 8-day MODIS 355 NDVI is also involved as a quality flag in the data set to show the vegetation conditions of the site initially. The 8-day values are combinations of the MODIS MOD13Q1 and MYD13Q1 products.

Intra-comparisons of GNSS snow depth results
The intra-comparisons of the snow depths are executed from three aspects, i.e., comparison of different 360 GNSS systems, comparison of different frequency bands, and comparison of different GNSS receivers.

Comparison with in-situ measurements and the PMW products
The GNSS snow depth data set, the PMW data set, and the in-situ measurements are not consistent in terms of the spatial footprint. The GNSS and in-situ data are with closer footprint compared with the 25-km PMW data. The footprint of GNSS is approximately ~ 30 m x 30 m , as illustrated in the following Figure 16. Due to the discrepancy of footprint, it is impractical to give factual accuracies when comparing these three data sets. Instead, we present the 400 performance of the three data sets at daily scale, multi-year scale, and interannual variabilities. Figure 12 shows an example of the comparisons of daily snow depth derived from GNSS, in-situ, and PMW. The data used in this figure is from 17 GNSS sites with the most extended temporal coverage (i.e., from 2013 to 2020), and the daily mean snow depth of the three data sets is calculated and shown in the figure. As expected, the GNSS and insitu data have similar performance compared to the PMW data. The GNSS snow depths are generally lower than the 405 in-situ measurements, particularly at those with snow depths higher than 10 cm. In addition, the peak of the PMW snow trend for each snow season moves to the right, which is due to the change of snow grain size (Dai et al., 2012).

410
The annual mean and maximum snow depths are significant indicators that can reflect the overall data quality and the variation trend over multiple years. Sixteen sites with the least missing daily snow depth values are used to compare the multi-year averages of the annual maximum/mean snow depth derived from GNSS, in-situ, and PMW. This comparison period is from 2016 to 2020 due to the data discontinuity for some sites at other periods.
Coincidentally, all these sixteen sites are located in the NCM region, making it possible further to analyze the 415 interannual variability of the multi-year maximum or mean snow depth. Figure 13 shows a site-by-site comparison of the five-year average of the annual maximum /mean snow depth derived from GNSS, in-situ, and PMW, respectively. the average of the annual (a1) maximum and (b1) mean snow depth. The snow depth values are classified into five categories to show consistency and discrepancy better. It shows high consistency for the three data sets in general but 420 with discrepancies for some sites. Figure 13  The maximum values are consistent for the three data sets without regard to the in-situ data having one outlier at Site jldg. This data point is an outlier because the historical weather reports showed no significant snowfall events before or after these dates. This result indicates that the laser measurements in operational meteorological observations 425 are not always reliable. For the mean values shown in (b2), the GNSS and in-situ have a better agreement than the PMW because of the significant difference in their spatial footprint. Figure 13  The interannual variability of the multi-year average of the annual maximum (mean) snow depth using the same data in Figure 13 is further shown in Figure 14. The snow depth values in this figure are the mean values of all 16 sites. The maximum and mean achieve consistent interannual variabilities for all the three data sets, with the absolute maximums of the PMW being relatively higher than the other two. This result generally indicates that the GNSS data set in this study can be used as a new data source to monitor the interannual variability of 435 snow depth.

Reflection on extreme snow event
Real-time and accurate monitoring of extreme snow events is of vital practical value. To test if this new GNSS data set can provide supportive information for this application, we use the extreme snow event that 450 happened on February 21 ~ 22 in the year 2015 to analyze the performance of the GNSS, in-situ, and PMW data sets. The event is selected because we have overlapped GNSS data from two GPS/GLONASS compatible sites, i.e., bfqe and bttl, which can provide finer resolution snow depth observations. Figure 15 (a) shows the daily snow depth variations before and after the snow event. As expected, the GNSS and in-situ data have similar responses to the event, while the PMW data has a weak response. These two sites are located in the region with 455 evergreen coniferous forest, which prevents the PMW data from acquiring reliable snow depth values due to its wider observation extent of 25 km. Figure 15 (b) further shows the response of the 2-hour GNSS snow depth data during the week of the event. It captures the evolution of the event in a more detailed way from DOY 51 than that of the other two data sets. It provides a potential to increase the monitoring frequency of extreme weather in a cheap and effective way in the future, even with a higher resolution of 1-hour or better, particularly for those 460 sites that have compatible observations from more GNSS satellite systems such as GPS, GLONASS, BDS, and Galileo. the effects of vegetation). We will continue to maintain and update the algorithm and the data set as more years 470 of data become available in the future. The data set includes snow depth of 24-hour, 12-hour, and 2-hour temporal resolutions if possible, for 80 sites from 2013 ~ 2020 over northern China (25° ~ 55° N, 70° ~ 140° E). The sites over southern China are not included because there is most probably no snow in that region. The high and medium sites are all preserved in the data set with multiple quality flags for users to apply the data.
There are two folders in the data set, i.e., the SITE_INFO and the SNOW_DEPTH. The SITE_INFO folder 475 includes the general information of the 80 GNSS sites, with four separate sheets in one .XLS file corresponding to CMA high-quality, CMA medium-quality, CEA high-quality, and CEA medium-quality, respectively. The items in the file are listed as SITE_NAME, LAT (latitude), LON (longitude), ALT (altitude), RECEIVER_TYPE, GNSS_TYPE, ANTENNA_HEIGHT (in meter), and MEAN_VSM (Volumetric soil moisture in cm 3 cm -3 ; Mean value derived using SMAP soil moisture data of 2015-2020). The SNOW_DEPTH folder includes the snow 480 depth values for all available sites. The folder is structured by ~/site/. For example, ~/hltl/stores the snow depth data of the Site hltl. There are four sub-folders in the folder of each site, i.e., raw0, filtered0, raw, and filtered.
The "raw0" and "filtered0" folders store raw data and raw-but-filtered data for individual satellite/quadrant/frequency/time. The "raw" and "filtered" folders store 24-hour/12-hour data produced using raw data in the corresponding "raw0" and "filtered0" folders. The file names including *_24h.csv, *_12h.csv, 485 and *_02h.csv represent the 24-hour, 12-hour, and 2-hour resolution data. Each CSV file gathers this specific snow season (e.g., the 2019 file stores values from October 1, 2019, to April 30, 2020). We recommend using the snow depth data in the "filtered" folder for validation/application purposes while using the snow depth data in the "raw" folder for algorithm testing purposes.
Three quality flags are included in each snow depth file, i.e., the STE, NUM_OF_PRNS, and NDVI, 490 standing for the STE of snow estimations, the number of GNSS sites, and the MODIS NDVI value. These flags should be used to filter the data to balance the data volume and the snow depth accuracy. In addition, we do not recommend using the snow depth values of less than 5 cm in the data set, which is beyond the accuracy of the current GNSS-IR technology.

Extended analysis of the data set and method
Although this study releases a data set using the current GNSS sites, which are suitable for snow depth retrieval. Those sites that are not suitable for this purpose still deserve an extended analysis to promote this research domain's development further. Also, although the method to retrieve snow depth used in this data set is determined as the SNR model due to data availability, it deserves an extended discussion of the selection of the 510 method for interested readers who dedicates to developing their own data set. The following Sections 5.1 and 5.2 give an extended analysis of the two issues mentioned above.

Factors that affect the site quality for snow depth retrieval
(1) Natural surroundings. The natural environment within the footprint of the observations is the most significant factor that determines whether a specific GNSS site is suitable for snow depth retrieval or not. Open 515 and flat ground with no vegetation is the ideal environment to set up a snow site, although some previous studies investigated methods to eliminate the influence of terrain (Zhang et al., 2017;Zhang et al., 2020). Vegetation is another factor that needs to be considered for accurate retrieval of snow depth. Figure 17 shows an example of Site bfxc, which has serious vegetation effects on snow depth retrieval before DOY 300 for 2015 ~ 2019. The than snow for these data points. A previous study suggested that it is practical to use the amplitude of the GNSS SNR data to retrieve vegetation height for observations of 1-second sampling. Therefore, for GNSS observations at the sampling intervals, it may be possible to use the SNR amplitude to build a model to qualify the vegetation effect on snow depth retrieval. However, this is not practical for the CMA or CEA sites used in this study because 535 the sampling interval is 30 seconds which is impossible to model the SNR data series to derive the amplitude.
Future research will consider using other vegetation indicators to identify this issue.

540
(2) Quality of the observation data. The data quality is another critical factor that determines if a site is suitable for snow depth retrieval or not. First, the minimum elevation angle of GNSS satellites should be set to 5° ~ 15° to preserve the multipath effect as much as possible because only data with low elevation angles can show the surface reflection. Second, the observables used as inputs for the corresponding snow depth models should be stored in the raw RINEX file. If the stored observables satisfy conditions for multiple models, one can 545 choose the model according to their accuracy or combine to use all the models during the calculation. This issue will be further discussed in the next Section 5.2. Third, the cycle slip of GNSS observation can severely reduce the data quantity available for snow depth retrieval. One can use the quality control file generated during the https://doi.org/10.5194/essd-2021-432 observation to filter out these data. Finally, random errors, e.g., human activities at some point, may exist during the observation. Also, the snow depth results on snowy days could, to some extent, affect the accuracy. 550

Selection of snow depth models
Although there are many models to retrieve snow depth, as illustrated in Table 1, considering the availability of the observables and the accuracy of the models, not all models are applicable or optimal in practical application. Figure 19 shows an overall strategy of model determination for using GNSS data to retrieve snow depth. One should first consider if the SNR observable exists in the RINEX file since the carrier phase and pseudorange are 555 observations that generally exist for positioning. If the observables satisfy all the snow depth models, the optimal model is selected according to the number of frequencies in the RINEX file. If the frequencies received by the receiver are less than 3, the SNR model is the best choice since it is simple and has reliable accuracy (Plan A in the figure). If the received frequencies are equal to or are greater than 3, the SNR_COM and F3 models can be used (Plan B in the figure). However, one can still use Plan A to replace Plan B in practical applications. If the 560 SNR observable does not exist (Plan C), the F3 model is preferred when the number of CP is greater than 3, while the L4 or the F2C model is selected when the number of CP is less than 3. Nevertheless, the effects of the ionosphere delay on the L4 and F2C models are difficult to remove, which leads to the relatively low accuracy of these two models .

Conclusions
This study proposes a comprehensive framework using raw data of the complex GNSS station networks to automatically retrieve snow depth and control its quality. Based on this, this study further produces a long-term 575 snow depth data set over northern China (i.e., GSnow-CHINA v1.0, 12h/24h, 2013-2020) using the proposed framework and historical data from 80 stations. The data set has high internal consistency with regards to different GNSS systems (mean r = 0.97 & RMSD = 1.93 cm), different frequency bands ( mean r = 0.96 & RMSD = 2.73 cm), and different GNSS receivers (mean r = 0.88). The data set also has high external consistency with the insitu measurements and the PMW products, with a consistent illustration of the interannual snow depth variability. 580 The results also show the good potential of GNSS to derive hourly snow depth observations for better monitoring snow disasters. The proposed framework to develop the data set provides comprehensive and supportive information for users to process raw data of ground GNSS stations with complex environmental conditions and various observation conditions. The resulting GSnow-CHINA v1.0 data set is distinguished from the current point-scale in-situ data or coarse-gridded data, which can be used as an independent data source for validation 585 purposes. The data set is also useful for regional and global climate research and other meteorological and hydrological applications. Both the algorithm and the data set will be maintained and updated as more years of data become available in the future.
Author contributions: WW designed the study and wrote the manuscript. HL provided the GNSS raw data for 590 the production of this data set and co-designed the study. LD provided supportive information for the validation using the PMW snow depth product. LZ provided supportive information for the data filtering. JZ, BL, ZG, HH, and TY contributed to the data/codes accumulation. data filtering All authors contributed to the writing and editing of this paper.

595
Competing interests: The authors declare that they have no conflict of interest.
Acknowledgments: The first author would like to thank team members from the Meteorological Observation Center, China Meteorological Administration for producing, maintaining, and providing the raw GNSS RINEX data and the in-situ data. The authors would like to thank the SMAP team, the MODIS team, and the PMW team 600 for archiving and providing the data used in this study.