Theia Snow collection: high-resolution operational snow cover maps from Sentinel-2 and Landsat-8 data

Abstract. The Theia Snow
collection routinely provides high-resolution maps of the snow-covered area from Sentinel-2 and Landsat-8 observations. The collection covers
selected areas worldwide, including the main mountain regions in western
Europe (e.g. Alps, Pyrenees) and the High Atlas in Morocco. Each product of
the Theia Snow collection contains four classes: snow, no snow, cloud and no data.
We present the algorithm to generate the snow products and provide an
evaluation of the accuracy of Sentinel-2 snow products using in situ snow depth
measurements, higher-resolution snow maps and visual control. The results
suggest that the snow is accurately detected in the Theia snow collection
and that the snow detection is more accurate than the Sen2Cor outputs (ESA
level 2 product). An issue that should be addressed in a future release is
the occurrence of false snow detection in some large clouds. The snow maps
are currently produced and freely distributed on average 5 d after the image
acquisition as raster and vector files via the Theia portal
(https://doi.org/10.24400/329360/F7Q52MNK).



Introduction
The snow cover is an important driver of many ecological, climatic and hydrological processes in mountain regions and in high latitude areas. In these regions, in situ observations are generally insufficient to characterize the high spatial variability of the snowpack properties. The snow cover was included as one of the 50 essential climate variables (ECVs) to be monitored by satellite remote sensing by the Global Observing System for Climate (GCOS) in accordance with the Committee on Earth Observation Satellites (CEOS) agencies. ECVs are intended to support the work of the United Nations Framework Convention on Climate Change and the Intergovernmental Panel on Climate Change.
The snow cover is not a variable sensu stricto, but an object which can be characterized through many variables, including snow-covered area (SCA), fractional area (fSCA), albedo, liquid water content, snow depth and snow water equivalent (Frei et al., 2012). An international survey by the Cryoland consortium (http://cryoland.enveo.at/consortium, last access: 12 April 2019) shed light on the user requirements for snow services based on satellite remote sensing. SCA and fSCA products were ranked as the most important by the respondents (Malnes et al., 2015). The survey also indicated the need for operational products provided on a daily basis, with latency times shorter than 12 h. Highresolution data down to 50 m resolution were sought by road and avalanches authorities (Malnes et al., 2015). The respondents requested regional products, e.g. on the scale of entire mountain ranges like the Alps or even the whole of Europe, and preferred the Universal Transverse Mercator (UTM) projection (Malnes et al., 2015). At the national level in France there is also a need for operational high-resolution snowcovered area maps as revealed by the recent roadmap for satellite applications issued by the French Government (Plan d'applications satellitaires 2018, Commissariat général au développement durable, 2018). Based on a wide panel of end users, this plan selected the monitoring of the snow-covered area in French national parks as a one of the key actions Published by Copernicus Publications.
Operational SCA maps have been generated from satellite observations since the 1960s (Ramsay, 1998). Current SCA and fSCA products are mostly derived from MODIS data (Hall et al., 2002;Sirguey et al., 2008;Painter et al., 2009;Metsämäki et al., 2015), but their spatial resolution (1 km to 250 m) can be too coarse for various applications, especially in mountain regions where the snow cover properties vary on scales of 10 to 100 m (Blöschl, 1999). High-resolution (30 m) snow cover maps can be generated from Landsat images but the low temporal revisit of the Landsat mission (16 d) is an important limitation for snow cover monitoring, especially considering that the cloud probability can exceed 50 % in temperate mountains (Parajka and Blöschl, 2008;Gascoin et al., 2015). Since 2017, with the launch of Sentinel-2B, the Copernicus Sentinel-2 mission offers the unique opportunity to map the snow cover extent at 20 m resolution with a revisit time of 5 d (cloud permitting) (Drusch et al., 2012). The combination of Sentinel-2 and Landsat-8 data provides the opportunity for even more frequent observations of the snow cover with a global median average revisit interval of 2.9 d (Li and Roy, 2017).
The principles of snow detection from multispectral optical imagery is well established since the pioneering work of Dozier (1989) with the Landsat Thematic Mapper. Today, the challenge for scientists and end users is rather to cope with the large amount of data that is generated by a mission like Sentinel-2. The generation of a single snow map from a Sentinel-2 level-1C tiled product (monodate orthorectified image expressed in top-of-atmosphere reflectance) involves downloading a 700+ Mb zip file (once uncompressed, the product contains 12 folders and 108 files, including 15 raster files and 13 XML metadata files). Since March 2018, ESA began the operational processing of level 2A (L2A) products (monodate orthorectified images expressed in surface reflectance, including a cloud mask). Each ESA L2A product also includes a snow cover mask. However, the size of a single L2A product before unzipping can exceed 1 Gb (15 folders, 137 files). In addition, the quality of the ESA L2A snow mask can be improved for two reasons: (i) ESA's L2A algorithm is a general-purpose algorithm, which was not optimized for snow detection, and (ii) because ESA's L2A processor treats each image independently (i.e. mono-date approach, Louis et al., 2016), the output cloud mask has a lower accuracy than a cloud mask generated by a multi-temporal algorithm (Hagolle et al., 2010).
In this article we introduce the Theia Snow collection, a new collection of snow cover maps, which are derived from Sentinel-2 at 20 m resolution in an operational context. The Theia Snow collection was recently upgraded to also provide snow cover maps at 30 m resolution from Landsat-8 using the same algorithm. The data are currently being produced and freely distributed via the Theia land data centre. 1 The snow retrieval algorithm is based on the Normalized Difference Snow Index (Dozier, 1989) but also uses a digital elevation model to better constrain the snow detection. We first describe the algorithm (Sect. 2) and its implementation in the let-it-snow processor (LIS,Sect. 2.4.4). Then, we provide a detailed description of the product characteristics (coverage, period, format, etc.; Sect. 3). In Sect. 4, we present an evaluation of Theia snow products using in situ snow depth measurements, very high-resolution clear-sky snow maps and visual control. We finally discuss the main limitations and potential applications of the Theia Snow collection.

Scope
An algorithm was designed to determine the snow presence or absence from Sentinel-2 and Landsat-8 observations outside areas of dense forest with the following requirements: -It should be scalable, i.e. allow the processing of large areas (10 4 km 2 ) with a reasonable computation cost (typically less than 1 h for a single product).
-It should be robust to seasonal and spatial variability of the snow cover and land surface properties.
-It should maximize the number of pixels that are classified as snow or no snow.
-It is preferable to falsely classify a pixel as cloud than falsely classify a pixel as snow or no snow.

Input
The algorithm works with multispectral remotely sensed images, which include at least a channel in the visible part of the spectrum and a channel near 1.5 µm (referred to as shortwave-infrared or SWIR). It takes the following as input: a L2A product, including the cloud and cloud shadow mask (referred to as an "L2A cloud mask" in the following), the green, red and SWIR bands from the flat-surface reflectance product. These images are corrected for atmospheric and terrain slope effects (Hagolle et al., 2017). The slope correction is important in mountain regions since it enables us to use the same detection thresholds whatever the sun-slope geometry.

Pre-processing
The red and green bands are resampled with the cubic method to a pixel size of 20 m by 20 m to match the resolution of the SWIR band. The DEM is generated from the Shuttle Radar Topography Mission seamless DEM (Jarvis et al., 2008) by cubic spline resampling to the same 20 m resolution grid.

Snow detection
The snow detection is based on the Normalized Difference Snow Index (NDSI, Dozier, 1989) and the reflectance in the red band. The NDSI is defined as where ρ green (or ρ SWIR ) is the slope-corrected surface reflectance in the green band (or SWIR at 1.6 µm). The NDSI expresses the fact that only snow surfaces are very bright in the visible and very dark in the shortwave infrared. Turbid water surfaces like some lakes or rivers may also have a high NDSI value; hence a additional criterion on the red reflectance is used to avoid false snow detection in these areas. A cloud-free pixel is classified as snow if the following condition is true: (NDSI > n i ) and (ρ red > r i ) , where n i and r i are two parameter pairs (i.e. i ∈ {1, 2}) as explained below. Otherwise the pixel is marked as "no snow". The values of the parameters are provided in Table 1.

Snow line elevation
The algorithm works in two passes. The snow detection (Sect. 2.4.1) is performed a first time using parameters n 1 and r 1 , which are set in such way as to minimize the number of false snow detections (Table 1). This first pass enables us to estimate the minimum elevation of the snow cover z s , above which a second pass will be performed with less conservative parameters values n 2 and r 2 (Table 1). An example of the evolution of the snow line elevation over the tile 31TCH in the Pyrenees is shown in Fig. 1 and the corresponding snow maps are shown in Fig. 1. Based on regional analyses of the snow line elevation variability in mountain ranges, (e.g. Gascoin et al., 2015;Krajčí et al., 2016), we assume that large altitudinal variations of the snow line elevation are not likely on the scale of a tile of 110 km by 110 km. Therefore, the minimum snow elevation z s is considered uniform on the scale of the processed image. Before proceeding to pass 2, the total snow fraction in the image after pass 1 is computed. If this snow fraction is below f t (Table 1), then pass 2 is skipped, as the sample of snow pixels is not considered statistically significant for determining the snow line elevation. Otherwise, the pass 2 is activated. For that purpose, the DEM is used to segment the image in elevation bands with a fixed height of d z (Table 1). Then, the fraction of the cloud-free area that is covered by snow in each band (after pass 1) is computed. The algorithm finds the lowest elevation band b at which the snow cover fraction is greater than a given fraction f s (Table 1). The value of z s is the lower edge of the elevation band that is two elevation bands below band b.

Cloud mask processing
The cloud mask in the input L2A product is conservative because (i) it is computed at a coarser resolution and (ii) it was developed to remove surface reflectance variations due to cloud contamination. However, the scattering of some thin clouds is low in the SWIR, green and red bands, which are used for snow detection (Sect. 2.4.1). Hence, the human eye can see the snow (or the absence thereof) through these semitransparent clouds in a colour composite. In addition, the L2A cloud mask tends to falsely classify the edges of the snow cover as cloud. Therefore, the algorithm includes some additional steps to recover those pixels from the L2A cloud mask and reclassify them as snow or no snow. This step is important because it substantially increases the number of observations as specified in Sect. 2.1.
A pixel from the L2A cloud mask cannot be reclassified as snow or no snow if any of these conditions are satisfied: it is coded as "cloud shadow" in the L2A cloud mask; it is coded as "high-altitude cloud" (or "cirrus") in the L2A cloud mask; it is not a "dark cloud" (see below).
The cloud shadows are excluded because the signal-tonoise ratio can be very low in these areas. The high clouds are excluded because they can have a similar spectral signature as the snow cover, i.e. a high reflectance in the visible and a low reflectance in the SWIR. This type of cloud are detected in Sentinel-2 and Landsat-8 L2A products based on the spectral band centred on the 1.38 µm wavelength (Gao et al., 1993).
We select only the dark clouds as possible reclassification areas, because the NDSI test is robust to the snowcloud confusion in this case. The dark clouds are defined using a threshold in the red band after downsampling the red band by a factor r f using the bilinear method. This resampling is applied to smooth locally anomalous pixels, following the MAJA algorithm, which performs the cloud detection at 240 m for L2A products (Hagolle et al., 2017). Therefore, if a (non-shadow, non-high-cloud) cloud pixel has a red reflectance at this coarser resolution that is lower than r D (Table 1), then it is temporarily removed from the cloud mask  www.earth-syst-sci-data.net/11/493/2019/ and proceeds to the pass 1 snow detection. The new cloud mask at this stage is the pass 1 cloud mask in Fig. 3. After passing the passes 1 and 2, some former cloud pixels (pixels that were originally marked as cloud in the L2A cloud mask) will not be reclassified as snow because they did not fulfil the conditions of Eq. (2). These pixels are flagged as cloud in the final snow product if they have a reflectance in the red that is greater than r B (Table 1). Otherwise they are classified as no snow. Here, the red band at 20 m resolution is used to allow an accurate detection of the snow-free areas near the snow cover edges that the L2A product tend to falsely mark as cloud. The resulting cloud mask is the pass 2 cloud mask in Fig. 3.

Implementation
The algorithm was implemented in an open-source processor called let-it-snow (LIS). LIS is written in Python 2.7 and C++ and relies on the Orfeo ToolBox and GDAL libraries. GDAL is used for input and output operations, image resampling and metadata access (GDAL/OGR contributors, 2018). Orfeo Toolbox enables us to make the computations with good performances under memory constraints (i.e. the amount of available memory can be predefined), which is critical for operational production (Grizonnet et al., 2017). LIS takes as input a digital elevation model and a Theia L2A product from Sentinel-2 MSI, Landsat-8 OLI, SPOT-4 HRVIR or SPOT-5 HRG . The spatial resolution and central wavelength of the spectral bands used by LIS for each sensor are given in Table 2. It is also compatible with Landsat-8 USGS level 2 products (U.S. Geological Survey Earth Resources Observation And Science Center, 2014) and Sentinel-2 ESA's Sen2Cor level 2A products (Louis et al., 2016).
The parameter values were set based on previous studies with Landsat (Hagolle et al., 2010;Zhu et al., 2015;Gascoin et al., 2015) and by visually checking many snow maps and snow fraction histograms. From this set of a priori parameter values, only r 2 was adjusted based on the analysis of a first batch of products to enhance the snow detection on shaded slopes.

Data description
The Theia Snow collection data can be freely accessed using a web browser by connecting to http://theia.cnes.fr (last access: 12 April 2019) or using this free command-line tool: https://github.com/olivierhagolle/theia_download (last access: 12 April 2019). The user must first create an account in Theia to have the permission to download the data.
The Theia Snow collection is organized following the tiling system of Sentinel-2 (Fig. 4). Each tile covers a square area of 110 km by 110 km in the UTM coordinate system. There are currently 127 tiles in the Theia Snow collection, mostly over the mountain ranges of western continental Eu-rope (France, Spain, Switzerland, Italy, western Austria). The snow products are also provided for southern Quebec in Canada, the Issyk-Kul lake catchment in Kyrgyzstan and the Kerguelen Islands. These extra-European sites were selected to support specific ongoing projects in relation with Theia. An opportunistic tile is produced in central Nevada, USA, because this tile is already available as a level-2A version for calibration purposes. The Theia Snow collection covers a number of mountain ranges with seasonal snow cover (Table 3).
The Theia Snow collection products are available for 127 tiles, starting on December 2017. At the time of writing there were over 17 000 available products. A set of 1300 products tagged as version 1.0 are available between November 2015 and June 2017 for a subset of 15 tiles (Table 3). These products were produced in pre-operational using a different configuration of LIS, which underestimated the snow-covered area on shaded slopes. All other products were generated with the same configuration as presented in this article. The products after December 2017 can still have a different version tag because of changing versions in the upstream L2A product. However, the different L2A versions have a low impact on the snow products. It is expected that the cloud mask has improved gradually with time, due to algorithms enhancements in the successive L2A versions and also because of the increasing revisit frequency of the Sentinel-2 mission (the full 5 d revisit became nominal above all land masses in March 2018). The different versions of the Theia Snow collection are listed and updated on this page: http:// www.cesbio.ups-tlse.fr/multitemp/?page_id=13180 (last access: 12 April 2019).
The snow products are currently routinely generated at CNES using the MUSCATE scheduler, which also manages the L2A production for Theia. MUSCATE performs a series of operations in a high-performance computing environment: (i) download the level 1C product as soon it is available from the French mirror of the Sentinel products (PEPS, Plateforme d'Exploitation des Produits Sentinel), (ii) process and distribute the L2A product and (iii) process and distribute the snow product. This workflow is optimized to reduce the lag time between the acquisition and the distribution of the products. As an example, in April 2018, nearly 80 % of the snow products were made available online 5 d or less after the date of the Sentinel-2 acquisition (Fig. 6). It is expected that this time lag should reach a median value of 2 d thanks to the increasing performance of MUSCATE.
Each snow product is distributed as a zipped file which contains raster and vector files. The file base name is the product ID, i.e. productId.zip, where the productId is a character string which is constructed as follows: www.earth-syst-sci-data.net/11/493/2019/ Earth Syst. Sci. Data, 11, 493-514, 2019  Table 1 gives the description and value of the algorithm parameters (written in red in this chart). MUSCATE is the scheduler which manages the L2A production for Theia.
For example, the eastern snow product in Fig. 5  Minimum value of the red band reflectance to return a non-snow pixel to the cloud mask 0.100 -productID_SNW_R2.{shp,shx,dbf,prj} is a vector file in the ESRI shapefile format, which contains a vectorized version of the pro-ductID_SNW_R2.tif using the polygon geometry.
-productID_CMP_R2.tif is an 8 bit RGB GeoTiff image which shows the outlines of the snow mask (green lines) and cloud mask (magenta lines) on a false-colour composite of the input L2A image (RGB image made with SWIR, red and green bands). This composition was chosen because RGB composites using the SWIR-band image are useful for discriminating the snow cover from the snow-free areas and from the clouds (Vidot et al., 2017). Hence, this image mainly allows the user to visually check the consistency of the snow and cloud mask.
-productID_MTD_ALL.xml is a metadata file.
Earth Syst. Sci. Data, 11, 493-514, 2019 www.earth-syst-sci-data.net/11/493/2019/  For most applications, only files with the SNW suffix should be useful. The shapefile and raster images are referenced in WGS-84 UTM system with the zone number given by the first two digits of the tile name. All raster files have a 20 m resolution. Note that no-snow pixels can be any surface, including water surface. No data pixels are the pixels outside of the acquisition segment.

Evaluation
In this section we present an evaluation of the Theia Snow collection. We first present the methods developed to do this evaluation and then the results. The evaluation was based only on Sentinel-2 products, because the production for Landsat-8 had just started at the time of writing and thus the Landsat-8 products represented a very minor fraction of the Theia Snow collection.

Comparison with in situ snow depth measurements
We collected all available snow depth measurements within tiles 31TGM, 31TGL, 31TGK, 32TLS, 32TLR, 32TLQ, 31TDH, 31TCH, 30TYN and 30TNX from the Météo-France database between 1 September 2017 and 31 August 2018. These 10 tiles cover most of the French Alps and Pyrenees (Fig. 8). We obtained 120 snow depth time series with at least one measurement. We gathered all available Sentinel-2 snow products for these tiles over the same period, i.e. a total of 1134 products. Then, we extracted the pixel values at the location of each snow measurement station for all dates. When a station is located in two tiles we only kept the data from the first tile. We selected the snow depth measurements which were collected on the same day of a Theia snow product and discarded the measurements corresponding to a cloud detection in the Theia snow product. The snow depth measurements were converted to snow presence and absence using a threshold of SD 0 = 0 m. This eventually allowed us to compute a confusion matrix between a set of snow presence and absence data from in situ measurement and a set of simultaneous snow presence and absence data from the Theia Snow collection across the French Alps and Pyrenees. However, previous studies comparing MODIS binary snow products with ground measurements showed that a value of 0 m may not be optimal (Klein and Barnett, 2003;Gascoin et al., 2015); therefore the sensitivity of the results to SD 0 was tested by recomputing the confusion matrix for 1 cm increments of SD 0 from 0 to 1 m.

Comparison with snow maps of higher spatial resolution
We used SPOT-6 and SPOT-7 images with a resolution of 1.5 m in panchromatic and 6 m in multispectral (blue, green, red, near-infrared) to evaluate the accuracy of the snow detection in Theia snow products. For this comparison, the LIS processor was run with its current operational configuration (i.e. the configuration used to routinely produce the Theia snow products since December 2017 Table 1). SPOT-6 and SPOT-7 sensors acquire images with a large radiometric depth coded in 16 bits, which enables us to identify surface features in alpine regions with dark shaded slopes and bright snow surfaces. We searched the catalogue of the Kalideos database for available SPOT images that could match a cloud-free (or nearly cloud-free) Theia snow map over the French Alps region (https://alpes.kalideos.fr, last access: 12 April 2019). We identified six pairs of images in 2016 and 2017 with a maximum time lag of 6 d (Table 4, Fig. 9). The SPOT images were obtained as orthorectified Earth Syst. Sci. Data, 11, 493-514, 2019 www.earth-syst-sci-data.net/11/493/2019/ products from the Kalideos database. The SPOT images were used to generate reference snow maps by a pixel-based supervised classification. For each SPOT image, we manually drew about 15 polygons of homogeneous of snow and nosnow surfaces using the panchromatic image. The few cloud pixels in the SPOT images were also manually delineated with a large buffer to restrict the classification to strictly cloud-free areas. Colour composites made from the multispectral images were also used as a visual support to help the snow and cloud classification. These polygons were then used to extract training samples from the SPOT products. The samples were taken from both the panchromatic and the multispectral images. We tested a number of classification algorithms by splitting the polygons in calibration and validation data sets. The detail of this analysis is not shown here, but it allowed us to conclude that the random forest classifier was the best choice for our purpose, given its good accuracy and its numerical efficiency (Bouchet, 2018). In addition, we generated the snow maps from the same Sentinel-2 level 1 products using ESA's Sen2Cor version 2.5 (Louis et al., 2016). The Sen2Cor output snow masks are expected to be nearly identical to the ones included in the distributed L2A products by ESA, since the same processor is used. We have generated the L2A products ourselves with Sen2Cor for this study, because the L2A products were not yet available when we made this analysis (ESA started operational production of Sentinel-2 data at level 2A in 2017). The Theia and Sen2Cor snow maps were compared to the reference snow maps from SPOT images using standard met- Table 4. Pairs of Sentinel-2 and SPOT-6/7 products used for the evaluation of the snow detection accuracy (see also Fig. 9).

Sentinel-2 product
Reference product rics derived from the confusion matrix (accuracy, F1 score, kappa, false-detection rate and false-negative rate).

Evaluation metrics
From the confusion matrices of Sect. 4.1.1 and 4.1.2, we derived the following metrics: accuracy (the proportion of the total number of predictions that were correct), false-positive rate (FPR, i.e. the proportion of no-snow pixels that were incorrectly classified as snow), false-negative rate (FNR, i.e. the proportion of snow pixels that were incorrectly classified as no snow), F1 score (harmonic average of the precision and recall) and kappa coefficient (Cohen, 1960). Table 5. Confusion matrix between the Theia snow products and in situ snow depth data (SD).

Visual verification
The evaluation methods presented above undersample the actual resolution of the Theia snow collection, both in space (only 120 points of comparisons in Sect. 4.1.1) and time (only six dates of comparisons in Sect. 4.1.2). Given that we do not have a more extensive validation data set, we used a time series of 64 Theia snow products from 6 July 2015 to 2 July 2017 over tile 31TCH to control the consistency of the snow and cloud masks based on the visual inspection of the false-colour composites. This approach is efficient for evaluating the frequency of gross errors on the tile scale, i.e. large patches of false snow or false no-snow detection (Vidot et al., 2017).

Comparison with in situ snow depth measurements
The confusion matrix between the Theia snow products and in situ snow depth measurements is given in Table 5. We find 1414 pairs of data for the study period (i.e. there are 1414 individual snow depth measurements for which a cloud-free retrieval can be found in the Theia Snow collection on the same day). From this confusion matrix, the accuracy (proportion of correct classifications) is 94 % and the kappa coefficient is 0.83, which indicates excellent agreement according to Fleiss et al. (2013). The false-positive rate (2.8 %) is lower than the false-negative rate (6.7 %), which means that the Theia snow product is more likely to underestimate than to overestimate the snow detection at the station locations. The false-negative rate decreases if SD 0 is set to a higher value; however the false-positive rate also increases in such a way that an optimum is reached at SD 0 = 2 cm (Fig. 10). The comparisons of each individual time series before matching the data by date is shown in Appendix B. These plots illustrate that the high revisit time of Sentinel-2 enables us to capture the seasonal cycle of the snow cover well. Even intermediate melt-out events at lower elevation stations can be identified over the course of the snow season. Table 6 shows that the MAJA-LIS workflow provides a better detection of the snow cover than the Sen2Cor processor in Figure 9. Location of the five SPOT-6/7 products used for the evaluation of the snow detection accuracy and the corresponding Sentinel-2 tiles (see also Table 4).  all the studied cases. Although the improvement in the accuracy coefficient is low in some cases (31TGL/2017-03-11, 31TGK/2016-08-13) or even slightly negative in one case (32TLS/2016-10-12), the F1 score and kappa coefficient of the Theia snow products are always greater than or equal to those of Sen2Cor. The differences are significant for four products among the six tested. The lower accuracy in the case of the 32TLS/2016-10-12 product is due to a higher false-negative rate. The false-positive rates in the Theia snow products are generally lower than the false-negative rates, which is consistent with the previous evaluation using in situ data. The spatial analysis of the errors indicates that the false-positives (i.e. overdetection of snow pixels) in the Theia products are mostly located near the snow cover edges (Fig. A1). This suggests that these errors might not be due to a false detection of snow (e.g. confusion with a lake or a cloud surface) but are rather due to the resolution discrepancy with the reference data. By contrast, the spatial comparison of the Sen2Cor products with the reference data shows large patches of false negatives (omission of snow pixels) in the snow-covered area, which are typically due to the omission of snow pixels in shaded areas (Fig. A2).

Visual verification
We found 4 products among 64 with significant patches of false-positive pixels (i.e. pixels falsely classified as snow). These false snow areas are exclusively located in cloud areas (Fig. 11). False-negative pixels (i.e. pixels falsely classified as snow-free) can be found by looking at higher-resolution data along the snow cover edges, but we did not find large areas of false-negative pixels.

Discussion
The evaluation of the Theia Snow collection using both in situ and remote-sensing reference data sets indicates an excellent accuracy of the snow detection, albeit with a tendency to snow underdetection. The omission of snow pixels can be due to the presence of trees or shadows. In the case of the in situ comparison, part of the errors may be explained by the uncertainty in the geolocalization of the Sentinel-2 data, which is close to 11 m at 95 % for both satellites according to the latest Sentinel-2 L1C data quality report (MPC Team, 2018). In addition, in situ measurements may not be representative of the snow conditions within the Sentinel-2 pixel. However, the sensitivity analysis of the accuracy and kappa coefficient to the SD 0 value suggests that the discrepancy between the scale of the in situ observations and the scale of the pixel observation is not significant, since the optimal SD 0 is close to 0 m. In contrast, a similar analysis with lower-resolution snow products from MODIS found an optimal SD 0 of 0.15 m Gascoin et al. (2015). This illustrates how high-resolution snow products from Sentinel-2 enable us to partly resolve the typical scale issue between in situ and remote-sensing products (Blöschl, 1999). The visual inspection of the products reveals that the main issue in some Theia snow products is rather the occurrence of false positives due to the confusion with cloud surfaces. This problem can be due to two main factors: -Cold clouds: cold clouds have a spectral signature that is close to the snow cover since they contain ice crystals (high NDSI). If these clouds are not accurately detected by the level 2A processor then the LIS algorithm will also classify them as snow after pass 1 (Sect. 2). As a result the snow line elevation z s is not well estimated, which can generate erroneous snow patches within these clouds, even at low elevation (e.g. product of 20 July 2017 in Fig. 11). rely on our visual judgment using colour composites to assess the ability of the algorithm to discriminate the cloud and the snow cover. However, to further quantify the algorithm performance and eventually optimize its parameters, we would need a database of classified imagery with labelled regions of cloud and snow cover in many different cases. Both Sen2Cor and LIS use the NDSI to detect the snow cover; however the algorithms differ in many aspects. The better performances of LIS in Sect. 4.2.2 can be related to the mono-date approach for atmospheric correction and cloud detection in Sen2Cor, which can cause snow-cloud confusion in the L2A product, while LIS uses L2A products from the more accurate multi-temporal MAJA processor (Baetens et al., 2019). In addition, a unique feature of the LIS algorithm is the use of the snow line elevation concept to improve the robustness of the snow detection in mountain regions. In addition, LIS was designed to retrieve snow below thin clouds and to reclassify snow pixels which are frequently found near the snow cover edges in L2A products. Apart Earth Syst. Sci. Data, 11, 493-514, 2019 www.earth-syst-sci-data.net/11/493/2019/ from the MAJA processing, however, the effect of these algorithmic differences could be less evident in flat areas.

Code and data availability
The Theia snow collection can be accessed and cited using this digital object identifier: https://doi.org/10.24400/329360/F7Q52MNK . The let-it-snow source code is available under the GNU Affero General Public License v3.0 in this repository: https://gitlab.orfeo-toolbox.org/remote_modules/ let-it-snow/ (last access: 12 April 2019).

Conclusions
The Theia Snow collection is a free collection of snow products which indicate the presence or absence of snow on the land based on Sentinel-2 (20 m) and Landsat-8 (30 m) data. Most of the snow products are derived from Sentinel-2. However, in late September 2018, Landsat-8 snow products started being routinely added to the Theia Snow collection for the titles in France only. Previous Landsat-8 data were also reprocessed back to March 2017. At the time of writing (15 March 2019), for example in tile T31TCH in the Pyrenees, 20.5 % of all available products were derived from Landsat-8 data (63 among 244). In addition, the processing is done to facilitate the combination of Landsat-8 and Sentinel-2 snow products, since the data format and the tiling scheme are the same (only the pixel size differs). Although the co-registration of Sentinel-2 and Landsat-8 data can be still problematic (Storey et al., 2016), the combination of "analysis ready" Landsat-8 and Sentinel-2 is important for increasing the number of cloud-free observations.
The evaluation presented in this article indicate that the Theia Snow product allows an accurate detection of the snow cover (and of the absence of snow). However, it remains a preliminary assessment that should be extended in the future. For example, the evaluation was based only on Sentinel-2 products, since the integration of Landsat-8 data started during the writing of this paper and should be extended to Landsat-8 products. However, the processing is done with the same algorithm, which is based on the calculation of the NDSI from slope-corrected surface reflectance (level 2A data). In addition, the evaluation was limited to the French Alps and French Pyrenees, while the collection covers other mountain regions. Last, the spatial evaluation using higherresolution remote-sensing data was focused on the accuracy of the snow detection using very high-resolution clear-sky images, while, as discussed above, the main difficulty is not to detect the snow when the sky is cloud-free but rather to avoid false snow detection within clouds. The reduction of the false-positive rate is the main challenge for the future developments in the LIS algorithm. In the meantime, we also welcome feedback from other users.
The Theia Snow collection is based on optical observations; therefore it is not adapted to the detection of the snow cover in dense forest areas where the ground is obstructed by the canopy (Xin et al., 2012). This may typically occur in evergreen conifer forests of alpine regions (e.g. Alps, Pyrenees). We have noted that the snow detection can be successful even in alpine forests, but we lack data that can be used for quantifying its accuracy. Therefore it is recommended to use a land cover map to exclude these forest areas from the analysis. For the tiles in France, this can be done using the Theia land cover map , which is distributed at the same spatial resolution. In the Pyrenees, the mean altitude of zero-degree isotherm in winter is close to the treeline elevation at 1600 m; therefore the impact of the forest cover is limited (Gascoin et al., 2015). The LIS algorithm may also fail to detect the snow cover on steep, shaded slopes if the solar elevation is very low (typically below 20 • ). This can occur in midlatitude areas in winter. In this case, the slope correction in the L2A product is generally not applied. This is indicated in the L2A mask but not in the snow product. A future release of the Theia Snow collection should include this information. However, the visual inspection of many images in regions of complex topography gives us the impression that the snow detection algorithm already performs well on shaded slopes. We are planning to investigate this issue further using alternative approaches (Sirguey et al., 2009).
In spite of all these limitations, the Theia snow products have already been successfully used for the evaluation of the MODIS snow products (Masson et al., 2018) and the assimilation of snow-covered area data into a snowpack model in the High Atlas (Baba et al., 2018). Given that the snow cover is a key driver of many natural processes in mountains regions, we envision various potential applications of the Theia snow products, including the modelling of the distribution of the permafrost in mountain regions, the validation and calibration of hydrological models in snow-dominated catchments and the spatial modelling of biodiversity and productivity of ecosystems in mountain regions.