Canadian historical Snow Water Equivalent dataset (CanSWE, 1928-2020)

. In situ measurements of water equivalent of snow cover water equivalent (SWE) – the vertical depth of water that would be 10 obtainedproduced if all the snow cover melted completely– are used in many applications including water management, flood forecasting, climate monitoring, and evaluation of hydrological and land surface models. The Canadian historical SWE dataset (CanSWE) combines manual and automated pan-Canadian SWE observations collected by national, provincial and territorial agencies as well as hydropower companies. Snow depth (SD) and derived bulk snow density (defined as the ratio of SWE to SD) are also included when available. This new dataset supersedes the previous Canadian Historical Snow 15 Survey (CHSSD) dataset published by Brown et al. (2019), and this paper describes the efforts made to correct metadata, remove duplicate observations, and quality control records. The CanSWE dataset was compiled from 15 different sources and includes SWE information for all provinces and territories that measure SWE. Data were updated to July 2020 and new historical data from the Government of Northwest Territories, Government of Newfoundland and Labrador, Saskatchewan Water Security Agency, and Hydro Quebec were included. CanSWE includes over one million SWE measurements from 20 2607 different locations across Canada over the period 1928 at https://doi.org/10.5281/zenodo.4734372 https://doi.org/10.5281/zenodo.4734371 (Vionnet et 2021).


Introduction
Reliable in situ information of snow water equivalent (SWE) or more precisely water equivalent of snow cover according to WMO (2018)(SWE)the vertical depth of water that would be obtained if the snow cover melted completely, which equates 25 to the snow-cover mass per unit area the equivalent amount of liquid water stored in the snowpack (WMOFierz et al., 201809)is critical for flood and drought predictions (e.g., Jörg-Hess et al., 2015;Berghuijs et al., 2016;Vionnet et al., 2020), streamflow management of water supply for hydropower generation (e.g., Magnusson et al., 2020), irrigation planning (e.g., Biemans et al., 2019) and is a key environmental variable for climate monitoring and understanding (e.g., Clark et al., 2001;Brown et al., 2019). In situ SWE measurements can be made manually or via automatic sensors (Kinar 30 and Pomeroy, 2015). Manual SWE measurements typically consist of single point measurement (snow pit or single measurement carried out with a snow tube) or multi-point gravimetric snow surveys (also known as snow transects or snow courses) collected along a pre-determined transect (WMO, 20108, Lopez Moreno et al., 2020). Manual snow surveys are generally representative of the prevailing land cover and terrain but are time-consuming and expensive which limits their temporal frequency, especially in remote locations. Automatic stations can overcome this limitation and provide SWE 35 measurements at a higher temporal frequency but have the disadvantage of only measuring SWE at a single point. Snow pillows (Beaumont, 1965) and snow scales (Johnson, 2004, Smith et al, 2017 automatically measure SWE from the overlying pressure and weight of the snowpack, respectively. Indirect methods using passive radiation sensors installed below or above the snowpack have also been developed. They measure the attenuation by the snowpack of natural cosmic radiation (Kodaoma et al., 1979;Paquet et al., 2008) or naturally emitted gamma radiation from the soil (Choquette et al., 40 201308). Finally, SWE can be automatically derived by analysis of the signal from Global Navigation Satellite System receivers (Henkel et al., 2018, Steiner et al., 2019. or Global Positioning System receivers (Koch et al., 2019). SWE observation networks using different measurements methods have been deployed at a national scale in various countries to provide valuable in situ information. Russia maintains a vast long-term network of manual snow survey transects located in the vicinity of meteorological stations (Bulygina et al., 2011). National SWE measurements relying on 45 manual methodssnow survey are also available in several European countries: such as Finland, Estonia, Ukraine and Turkey use for example snow courses whereas countries such as Germany or Czech Republic rely on single-point measurements (Haberkorn, 2019). In the Western United States (US), manual SWE measurements are collected along permanent snow courses maintained by the US Department of Agriculture (Natural Resources Consvervsation Services, 1988) and in the Northeast by various state departments (McKay et al., 1994). Another source of SWE information in the Western US and 50 Alaska is the snowpack telemetry (SNOTEL) network using automatic snow pillows (Serreze et al., 1999). In situ SWE data from several of these networks are used for a number of research and developmentmany applications. For example, they serve as reference data for the evaluation of a variety of large-scale gridded SWE products (e.g., Mortimer et al., 2020) including (i) snowpack models driven by meteorological reanalysis (e.g., Brun et al., 2013) (ii) passive microwave estimates combined with surface snow depth observation such as the GlobSnow product (Pulliainen et al., 2020) and (iii) regional 55 climate models (e.g., Rasmussen et al., 2011). Gridded snow products can also be derived from manual and automatic in situ SWE measurements (e.g., Brown et al., 2019). In a hydrological context, SWE measurements from large-scale networks can inform the calibration of snow-related parameters in hydrological models (Sun et al., 2019) and the hydrologic design in snow-dominated environments (Yan et al., 2018). Studies on the impact of climate variability and change on snowpack evolution can also rely on snow measurements from national networks (e.g., Clark et al, 2001;Musselman et al., 2017). 60 Manual snow surveys and automatic SWE stations with collocated snow depth (SD) measurements can provide information on the bulk density of the snowpack. These data have been used to develop and evaluate methods to estimate bulk snow data availability and Sect. 5 offers concluding remarks and perspectives about future updates of CanSWE.

Creation of the CanSWE dataset
The creation of the new Canadian historical SWE (CanSWE) dataset from the most recent version of the CHSSD involved three main steps as detailed on Fig. 1: (i) correction and cleaning of the 2019 CHSSD update, (ii) update of this cleaned dataset to July 2020 and addition of snow data from new stations and agencies, and (iii) consistent quality control (QC) of 100 the final dataset. These steps are described in the next sections. Prior to adding new data, the existing data were scrutinized to identify and resolve several issues raised by researchers working with the 2019 update. A preliminary analysis consisted of identifying stations with erroneous or incomplete metadata: (i) blank station name, (ii) placeholder text for station name, (iii) missing latitude and/or longitude, (iv) obvious errors in latitude and/or longitude and/or elevation. A total of 91 such stations were identified and were manually checked. 110 Valid data for station name and/or coordinates were obtained from databases of the originating agencies for 28 stations and the corresponding changes were made to the CHSSD. The remaining 63 stations with erroneous/incomplete metadata and their corresponding records of snow data were excluded, leaving 3061 individual stations in the dataset.

Merging and removal of duplicates
A second analysis was then carried out to remove duplicates and improve the consistency of the database prior to adding any 115 new data. Duplicates are defined as stations with different station IDs and potentially with different metadata (station name and/or coordinates and/or elevation) having the same SWE observation for multiple dates (at least ten). Duplicates usually consist of a pair of stations but can also be formed of three or four stations. Duplicates were introduced in previous updates of the CHSSD when snow data from various agencies were added to the CHSSD without ensuring that incoming data were already present in the CHSSD under a different station ID. In particular, instances of data duplication were introduced when 120 the SCD books were digitized. Stations from these books were all assigned a unique ID (station with the prefix "SCD-") which differs from that of the agency of origin. This generated a substantial amount of duplicate data during the period 1956 -1986. Duplicates were also introduced in transboundary situations where a single station is archived by multiple agencies but under different station names and IDs.
Duplicates were identified through a combination of automated station selection and manual inspection. For each station in 125 the CHSSD (referred here as 'inspected station'), all stations within a 5-km radius were identified. Each group of neighbouring stations was then manually inspected for similarities in (i) snow measurements for matching dates (at least ten), (ii) station location and (iii) station name. In most cases, all three of the criteria were satisfied to trigger a decision on whether a duplicate was identified. When a duplicate was identified, the inspected station and its matching neighborsstations were assigned a unique merging key to be used in subsequent consolidation. If no similar stations to thea inspectedreference 130 station were identified in a group of neighbouring stations, the inspected reference station was assigned its own merging key to aid in future updates to the CHSSD. Isolated stations without neighbours in a 5-km radius and without having been assigned a merging key were then inspected. For these isolated stations, the five nearest stationsregardless of distancewere identified, and the same similarityies criteria were applied within each group of stations. As before, a unique merging key was assigned to each set of identified duplicate stations, or only to the reference station itself in the case of no duplicates 135 being identified. As a final check, for each station, a query over the full list of station names was carried out using a shortened version of the station name to identify stations in the CHSSD with similar names. These stations were then manually inspected for similarities as described above. In total, 842 groups of duplicate sites were identified: among them, 788 were comprised of two stations, 52 had three stations and two had four stations.
The final step consisted of removing the duplicates. For each merging key associated with a set of duplicate stations, a single 140 reference station ID was identified. When duplicates occurred between one or several IDs from the SCD books and an ID from an originating agency, the reference ID was taken as that of the originating agency. When duplicates occurred between IDs from several agencies (typical of transboundary situations), the station ID belonging to the provincial or territorial agency where the station is located was selected as the reference ID. Finally, when duplicates occurred between IDs in the SCD books or IDs from the same agency, the ID associated with the longest SWE record was selected as the reference ID. 145 Records of snow depth and SWE from the reference station were retained and records from the duplicate stations inserted on dates when no data were present in the records from the reference station. The metadata (coordinates and elevation) were taken from the reference station. The station IDs and names of the duplicate sites were retained as alternative IDs and names to facilitate future data enquiries using IDs and names present in the previous versions of the CHSSD. The duplicates' metadata and data were then removed from the CHSSD, for a total of 898 stations removed. Duplicated data were mostly 150 removed over the period 1956-1986 due to conflicts between the data from the SCD books and the data from the agency of origin. The cleaned version of the CHSSD contains 2163 individual stations and was used as the basis for the update presented in this paper.

Update of the CHSSD
Agencies collecting SWE measurements across Canada were contacted to obtain access to snow data (SWE and SD). Table   1 lists the twelve different agencies that contributed snow data to the update leading to the CanSWE dataset. These agencies correspond to provincial and territorial agencies responsible for streamflow forecasting and/or environmental monitoring and 160 hydropower companies. All Canadian provinces and territories are covered by this update, with the exception of Nova Scotia and Prince Edward Island where no snow measurement program is currently active at the provincial level. Nunavut is included through the manual snow survey data collected at stations managed by ECCC. Snow survey data were also provided by the Government of Manitoba, but their format precluded inclusion in CanSWE at this time.  The snow data provided by the different agencies consist of two types of measurements: (i) manual gravimetric snow surveys and (ii) automatic stations. Manual snow survey data were provided by the twelve originating agencies (Table 1).
These data are collected by field observers using snow corers typically at five to ten points along a pre-determined survey line of 150-300 m selected to be representative of the land cover and terrain, although the precise methodology varies by agency (Brown et al., 2019). Manual snow surveys are collected irregularly in time and the sampling frequency varies from 175 one agency to another (Fig. 3). A majority of agencies conduct snow surveys once or twice per month during the snow season (Fig 3b) but several (e.g., Saskatchewan, Newfoundland and Labrador, Northwest Territories, Fig. 3a) only conduct measurements close to the peak snow accumulation and during the melting period for hydrological purposes. Most of the agencies use the Federal snow sampler whereas the Prairie and the ESC-30 samplers are used in regions of shallow snowpack such as the Prairies or the Arctic ( Table 2). The Federal snow sampler is a small-diameter and multi-section 180 sampler design to aid sampling in deep snowpack whereas the Prairie and the ESC-30 samplers present large diameters tubes to maximize snow collection in shallow snow cover and increase measurement accuracy (Dixon and Boon, 2012). More details about the impact of sampler type on uncertainties in SWE measurements are given in Goodison et al. (1987) and Lopez Moreno et al. (2020). Automatic SWE measurements from snow pillows were provided by the British Columbia Quebec and the Government of Newfoundland and Labrador also provided hourly automatic SWE measurements from passive gamma radiation sensors (Choquette et al., 201308; Table 2). Most of these automatic stations are also equipped with automatic measurements of snow depth using ultrasonic ranging instrumentssensors. The snow data and the corresponding metadata from the different agencies were obtained by direct download from web pages or ftp servers, from requests on web data servers, or directly via email. Data were most often provided as csv or Excel files but were also received as text bulletins, zxrp files, and ESRI Shapefiles. Python routines specific to each agency and corresponding data format were written to process the data and metadata and arrange them in a consistent NetCDF format. 195 Snow depth and SWE data were included at a daily frequency. Hourly time series from automatic stations were first preprocessed with a 24-h median filter to remove noise (Stone, 1995), especially in the snow depth time series from ultra-sonic sensors. The filtered data corresponding to 18 UTC was then extracted from the hourly time series to get a daily value. 18 UTC was selected since it corresponds to daytime in Canada. When available, the quality control flags from the originating agency were added (see Sect. 3.3 for more details on QC). Finally, a station metadata record was constructed for each snow 200 survey site including station ID, data source agency, station name, latitude, longitude and elevation. This list of metadata variables corresponds to that used in the 2019 CHSSD update (Brown et al., 2019). When elevation was not present in the metadata from the originating agency it was extracted from the United States Geological Survey's National Elevation Dataset (USGS NED, Gesh et al., 2002) at the position corresponding to the location of the snow survey site. The USGS's NED covers all Northern America at 30-m resolution (exceptpect parts of Alaska) and has a vertical accuracy of 3.53 m over 205 Canada (Gesch et al., 2014). A new code was also added in the metadata to describe the method of SWE measurements at each snow survey site. This code follows the standards of the World Meteorological Organization (WMO, 2019a) and is described in Table 32. Information about the sitting of the snow measurement sites (e.g., open terrain, below forest, clearing) is not available in the present version of CanSWE and will be added to future version of the dataset.  (Table 1). They consisted of newly established snow survey sites over the period 2017-2020 and of historical snow survey sites that were not included in the 2019 CHSSD update. For example, historical manual snow survey data were added from Hydro Quebec, the Saskatchewan Water Agency, the Governments of Northwest Territories and the Government of Newfoundland and Labrador. The full historical archive of the snow pillow data from Alberta Environment and Parks was also added to CanSWE. Finally, new data from automated passive gamma radiation 220 sensors from Hydro Quebec and the Government of Newfoundland and Labrador were added. This is significant because no data from automatic stations from Eastern Canada wereas present in any previous version of the CHSSD. Duplicates created by the addition of new stations were identified and removed following the methods described in Sect. 2.1.2. Overall, 798 stations from the cleaned 2019 CHSSD update were updated to 2020 and 444 new stations were added. The CanSWE dataset contains snow data for 2607 sites across Canada (Table 1). Finally, where both SWE and SD measurements were available, 225 derived bulk snow density was calculated from the ratio of SWE toand SD and included in the final database.

Quality control of the final dataset
Quality control of CanSWE involved two main steps: (i) homogenization of data quality flags from the various reporting agencies, (ii) QC of the manual and automated SWE and SD records. Each of the twelve reporting agencies have their own data archiving and reporting system with many agencies using data flags to identify possibly erroneous or problematic 230 measurements. For example, it is not always possible to accurately measure trace amounts of snow or to estimate SWE in patchy snow conditions. In these instances, the measurement may be reported as 0 but a flag of T (trace) or P (patches) assigned. Most, but not all, agencies conduct their own internal quality control prior to releasing their data. Instances where  (Table 43), respectively, compared to 18 and 15 before homogenization. 245 Quality control (QC) of SWE and SD measurements included range thresholding and automated outlier detection. SWE and SD QC flag variables (qc_snw and qc_snd, respectively), which are separate and distinct from the agency flag variables were 250 added to the dataset (Table 54). The set of QC procedures implemented here is self-contained, applicable to the full dataset, and does not rely on any auxiliary data. Researchers using a subset of CanSWE for a local region or specific years may wish to conduct their own independent QC that may consider available temperature and precipitation information (e.g., Johnson and Marks, 2004;Yan et al., 2018).  Leys et al. (2018) for more details Range thresholds were used to identify spurious records in both automated and manual measurements. Brown et al. (2019) applied this method to remove outliers from the 2019 CHSSD update and only keep valid triplets of SWE, SD and bulk snow density. For CanSWE, wWe adopted the thresholds outlined in Brown et al. (2019) for SWE and SD (0 -3000 kg m -2 mm, 0 -8000 kg m -2 mm for mountain) but a slightly more restrictivestringent range of 25 -700 kg m -3 (as opposed to 50 -1000 260 kg m -3 ) for bulk snow density. These ranges are based on common ranges for SWE and, SD and density from the literature (see Braaten, 1998). The range thresholding applied to bulk snow density aims at identifying SWE-SD pairs that are likely erroneous. To maintain consistency of the long-term database we used the same definition for mountain as Brown et al. (2019) where mountain is defined as all land west of -113° longitude. This definition is very simple and more advanced definitions (e.g., Karagulle et al,, 2017) may be considered in future version of CanSWE. Measurements outside the 265 specified ranges were set to NaN and QC flags assigned according to Table 5. When a record failed the SWE (SD) threshold but not the SD (SWE) threshold only the SWE (SD) value was set to NaN; the corresponding density value was also set to NaN and a W or H flag assigned to these records (Table 5). When a record failed the bulk snow density threshold SWE, SD and bulk snow density were set to NaN; a D flag was assigned to these records (Table 5). Together, these steps masked one or both of SWE and SD in 0.17% and 5.5% of the manual and automated records, respectively. Table 65 lists the number and  270 percentage of records masked at each QC step. The available data before and after QC is shown in Fig. 3. The small number of records flagged using the range thresholds is not surprising given that much of the data underwent QC in previous updates. The SWE and SD ranges are unchanged from previous updates so only data added in the current update have the possibility of being flagged. The density range is slightly more conservative so both new and old data were removed.
Consequently, the density range flagged the most records when compared to the SWE and SD thresholds. Finally, when 275 SWE (SD) measurements were masked (set to NaN ) in previous CHSSD updates for any reason, the corresponding QC flag (qc_snw/qc_snd) was set to M (missing) in CanSWE. 0.3% and 1.6% of the manual and automated records, respectively, have M flags.  We applied Aadditional quality control measures were applied to the automated data but was applied to the manual data due to their low temporal sampling frequency. We used the robust sample Mahalanobis distance (RMD) (Leys et al. 2018) to identify spurious SWE -SD data pairs as in Hill et al (2019). The RMD method is based on the traditional Mahalanobis Distance (MD) (Mahalanobis, 1930), which is the distance of a point from the mean of a multivariate distribution. It relies on 290 the mean and covariance matricesx of the multivariate distribution, which are affected by outliers. TConversely, the RMD uses the Minimum Covariance Determinant (Rousseeuw, 1984) and is less sensitive to outliers than the MD (Leys et al., 2018). Because this method relies on a multivariate dataset only automated data with both SWE and SD observations were assessed. For each site with a minimum of 20 records, the RMD was calculated for each SWE -SD data pair. Following Hill et al. (2019), outliers were defined as a square RMD larger than the upper 0.001 quantile of a chi-squared distribution with p 295 degrees of freedom (X 2 p; where p = number of dimensions of the data) (Gnanadesikan and Kettenring, 1972). For these records SWE, SD, and density were set to NaN and QC flags (qc_flag snw, qc_flag_snd) assigned V (Table 4). This step masked an additional 0.16% of automated records. This method was not applied to the manual data due to their low temporal sampling frequency.
3 Spatial and temporal coverage of the final dataset 300   320 Figure 6 compares the distribution of the station elevation with the hypsometry in each province and territory. The hypsometry has been derived from the Global Multi-resolution Terrain Elevation Data 2010 (https://www.usgs.gov/corescience-systems/eros/coastal-changes-and-impacts/gmted2010, last access 21 July 2020) at 30 arc-seconds reprojected to the Canada Albers Equal Area Conic projection at 250-m grid spacing. Figure 6 shows that the elevation coverage provided by the stations varies greatly from one region to another. A representative coverage is found in provinces 325 of Eastern Canada (Quebec, New Brunswick, Nova Scotia). On the other hand, in British Columbia and Alberta, SWE measurement sites tend to be located at higher elevation than the average terrain to provide relevant information on snow cover in mountainous headwater catchments. Large differences between the station elevation coverage and the hypsometry are also found in Nunavut and Saskatchewan. They are associated with sparse spatial coverage in the elevated inland parts of Nunavut and in the low-elevation northern part of the province in Saskatchewan. 330

345
The first automatic stations measuring SWE (snow pillows) in Western Canada were deployed in British Columba in the late 60s and early 70s. In Eastern Canada, the installation of automatic GMON sensors is more recent and started in 2009 in Quebec. In the CanSWE dataset, measurements from automatic stations first outnumbered those from manual snow surveys in 1988 and accounted for 89% of total SWE records for snow season 2020 (Fig. 87). The higher proportion of automated data is largely due to their higher measurement frequency compared to manual snow surveys. Finally, the number and 350 frequency of manual snow survey observations varies over the course of the snow season and between reporting agencies (Fig. 5). The number of snow surveys increases over the accumulation season reaching a maximum during the period of peak the northern regions and in mountainous regions, but the seasonal peak shown in Fig. 5

Data availability
The CanSWE dataset is distributed as a single file in NetCDF format that follows the Climate and Forecasts (CF) Metadata 360 Conventions (Hassel et al., 2017). It is available at https://doi.org/10.5281/zenodo.4734372 (Vionnet et al., 2021). Table 76 describes the data and observational metadata contained in this file. Readme files in English and French are also included in the Zenodo data repository. Future versions of CanSWE will include updated names for the observational metadata to follow the WMO standards (WMO, 2019b) 365  Table 32 for more details; 2 see Table 43 for more details, 3 see Table 54 for more details.

Conclusion
The Canadian historical SWE dataset (CanSWE) contains measurements of snow water equivalent of snow cover (SWE) and 370 snow depth (SD) and derived bulk snow density for an ensemble of sites across Canada. This dataset includes the results of extensive cleaning and quality control of the existing Canadian Historical Snow Survey Dataset (CHSSD), the addition of new historical data sources, and an update to July 2020 with data from 12 organizations and their partners. New stations from Hydro Quebec, the government of Newfoundland and Labrador, the government of Northwest Territories and the Saskatchewan Water Security Agency were added and improved the spatial coverage. A systematic quality control was 375 applied to identify and remove outliers in SWE, SD, and bulk snowcorresponding density. The CanSWE dataset presented in this paper includes data from 2607 manual and automatic snow survey sites across Canada over the period 1928 -2020. We anticipate that these data will be used for (i) climate monitoring and research, (ii) evaluation of land surface and hydrological models, (iii) development and evaluation of snow products, and (iv) other snow-related activities. Regular updates are required to make such datasets useful for the community. Ideally, these updates should be carried out on a yearly basis at the 380 end of each snow season. The data ingestion routines and automated quality control procedures developed under this project will allow future updates to be carried out in a timely and systematic fashion. We also hope that these efforts will provide opportunities to include new sources of in situ SWE information such as data collected at long-term experimental sites maintained by academic partners.

Appendix A: Previous use of the 2019 CHSSD update 385
The 2019 CHSSD update has been produced by Brown et al. (2019). This dataset has been used by different research groups in support of model evaluation, climate monitoring and development of innovative algorithms. A search was carried out on Google Scholar (last access: 20 July 2020) to list all the studies that refer to the paper by Brown et al (2019). Each study was then considered and all the studies that used the 2019 CHSSD update were listed in Table A1. Development of snow density field to improve gridded SWE products over the Northern

Author contribution
VV, CM and RB initiated the 2021 update of the CHSSD leading to CanSWE. VV coordinated the update effort. VV and CM reached out partners agencies to obtain snow data and processed them. MB developed the routines for the automatic detection of duplicates and conducted systematic identification of duplicates. CM developed the quality control routines and 395 data flag consolidation. LA identified duplicates in the 2019 update of the CHSSD and systematically tested the intermediate versions of CanSWE and identified remaining issues, that were then corrected. All authors contributed to the preparation of the manuscript. We thank an anonymous reviewer and Charles Fierz for their careful reading and useful comments, which improved the manuscript.

Competing interest 400
The authors declare that they have no conflict of interest.