Hydrometeorological, glaciological and geospatial research data from the Peyto Glacier Research Basin in the Canadian Rockies

This paper presents hydrometeorological, glaciological and geospatial data from the Peyto Glacier Research Basin (PGRB) in the Canadian Rockies, collected over the past five decades since its foundation by the Government of Canada as part of its contribution to the UNESCO International Hydrological Decade (IHD, 1965-1974). Peyto Glacier has been of interest to glaciological and hydrological researchers since the 1960s, when it was chosen as one of five glacier basins in Canada for the study of mass and water balance during the IHD. Intensive studies of the glacier and observations of the glacier mass balance continued after the IHD, when the initial seasonal meteorological stations were discontinued, then restarted as continuous stations in the late 1980s. The corresponding hydrometric observations were discontinued in 1977 and restarted in 2013. The research basin now forms part of the University of Saskatchewan Centre for Hydrology's Canadian Rockies Hydrological Observatory and so has been extensively re-instrumented and subject to intensive scientific study in the last decade. Data sets presented in this paper include high-resolution, co-registered DEMs. The meteorological data are from six automatic weather station (AWS) sites, three on the glacier and three near the glacier. These stations are listed as CryoNet stations of the WMO GCW. Several examples of data cleaning approaches are presented. The Peyto Main station was operational during the summer months of the IHD and re-established as an AWS in 1987. New instruments and dataloggers were added in 2012-2013 by the Centre for Hydrology, University of Saskatchewan. The meteorological data include hourly air temperature, humidity, wind speed, incoming shortwave and longwave radiation, and precipitation. These data are available for a period longer than two decades from the Peyto Main station, and for longer than one decade from the on-ice stations. Bias-corrected ERA-Interim (European Centre for Medium-Range Weather Forecasts Interim reanalysis), WFDEI (Water and Global Change Forcing Data ERA-Interim), NARR (North American Regional Reanalysis), and CFSR (Climate Forecast System Reanalysis) data are also included for running hydrological models over longer periods.


Introduction
Glaciological mass balance measurements, using ablation stakes and snow pits, have been carried out continuously since the beginning of the IHD period, and a comprehensive account of the first 14 years of mass balance results appeared in Young (1981). Mass balance data reported from Peyto Glacier have been used by many researchers as reference data for the region (Bitz and Battisti, 1999; Demuth et al., 2008; Demuth and Keller, 2006; Letréguilly, 1988; Letréguilly and Reynaud, 1989; Marshall et al., 2011; Matulla et al., 2009; Menounos et al., 2019; Østrem, 1973; Schiefer et al., 2007; Shea and Marshall, 2007; Watson et al., 2006; Watson and Luckman, 2004; Zemp et al., 2015), but the collection of data that could be used for modeling purposes has never been assembled in a single description until now.

Peyto Glacier Research Basin (PGRB)
The PGRB is in the Canadian Rockies, on the eastern side of the Continental Divide, at latitude 51.67° N and longitude 116.55° W. This heavily glacierized basin is 23.6 km² in area, ranging in elevation from 1907 to 3152 m. It is located in a predominantly sedimentary geological region, with surrounding mountains formed from hard, resistant dolomite (Young and Stanley, 1976). The basin has been well monitored over a 55-year observational period (Shea et al., 2009). During the 1960s, the area of the glacier was 13.4 km², but it has been continuously losing mass and area since at least the 1920s (Tennant et al., 2012), shrinking to an area of 9.87 km² as of 2018 (Figure 1). Repeat ground-based photography (Figure 2) from 1902 and 2002 shows the glacier retreat that occurred over the 20th century. A new proglacial lake has since formed at the tongue of the glacier; it increases in size every year and has been informally named 'Lake Munro' by USask to honor D. Scott Munro's research contribution to the glacier basin. Peyto Creek flows out of Lake Munro, draining the PGRB into Peyto Lake, thus supplying water to the Mistaya River.

Hydrometeorological sites
Meteorological observations were taken over the summer months (June-September) during the UNESCO International Hydrological Decade (IHD) at the Peyto Creek Base Station adjacent to the glacier terminus, herein referred to as Peyto Main (Figure 1). After becoming dormant in 1974, the station was re-established at the same location in September 1987. Table 2 and Table 3 detail the meteorological variables and the instruments used to record them during the IHD and the post-IHD period. Three meteorological stations were also established on the glacier surface for post-IHD micrometeorological studies by D. Scott Munro in different elevation zones: the Lower, Middle and Upper Ice stations. These were originally positioned to represent different glacier net mass balance zones: the ablation zone, equilibrium line zone, and accumulation zone. Since 2012, USask has continued these stations with new instruments, but they have been relocated to accommodate changing glacier geometry and the rising elevation of the equilibrium line. These data, however, are not continuous because only the Lower Ice station was maintained after 2013, owing to rapid ice melt causing tower collapse and subsequent station burial at the higher elevation sites. Peyto Outlet is a hydrometric station that measures glacier meltwater runoff at the outlet of Lake Munro. [...] whereas the others are Contributing CryoNet Stations of the GCW. Figure 1 and Table 1 contain the locational information, data collection periods and data elements recorded at the stations, with selected stations shown in Figure 3. The stations are still collecting observations and our datasets will be periodically updated beyond what is described in this paper.

3 Data
Young and Stanley (1976) documented the glaciological and hydrometeorological data collected within the glacier basin during the IHD. Past studies over the glacier are also well documented in 'Peyto Glacier: One Century of Science', which provides details on the mass balance data until 1995, along with the hypsometry of the glacier.

3.1 Meteorological data: historical and present
Young and Stanley (1976) describe meteorological and mass balance data for the period 1965-1974. Air temperature, relative humidity, global radiation, hours of bright sunshine, cloud cover, wind speed, and precipitation were recorded during the summer months at a meteorological station located in the base camp (Figure 4a) and documented as 'Peyto Creek Base Station' observations. The data collection details and instruments used are described in publications of the Inland Waters Directorate of Environment Canada (Goodison, 1972; Young and Stanley, 1976). [...] (Table 3) and a new setting as a reference station for the PGRB in July 2013 within 20 m from Peyto Main Old (Figure 4c). It measures incoming and outgoing shortwave and longwave radiation, air temperature, humidity, wind speed, precipitation, and snow depth. Figure 5 presents daily averages of these variables for the period from July 2013 to September 2019.

Precipitation
Precipitation at the Peyto Main Old station was measured by a Geonor T-200B, a weighing precipitation gauge with an Alter wind shield, beginning in April 2002, with a CMS tipping bucket rain gauge (TBRG) operating nearby (Figure 4b and Table 2). [...] Problems with the old TB date from 2007, when a rapid decline in gauge response was noted (Munro, 2020), but the Geonor gauge response invited further investigation. Therefore, its records were first segregated into rainfall and snowfall by applying the precipitation phase determination algorithm developed by Harder and Pomeroy (2013). Snowfall was bias-corrected for wind-induced undercatch (Smith, 2007) and rainfall was corrected with a catch efficiency of 0.95 (Pan et al., 2016). Bow Summit data were accepted as recorded because the surrounding tall trees provide sheltering, but do not unload intercepted snow to the single Alter-shielded weighing precipitation gauge in the centre of the clearing at the site (Figure 8), thus making it ideal for precipitation measurements.
Daily precipitation sequences were averaged over seven years, 2010-2016 inclusive, and seasonally accumulated to compare Peyto Main Geonor and Bow Summit measurements (Figure 9). Observed precipitation accumulations are similar during the summer months between May and October, with mostly liquid precipitation occurring from June to September. Large differences, however, are found for the adjacent winter snowfall months of January-May and October-December, with cumulative winter precipitation recorded at Peyto being significantly less than that at Bow Summit. Therefore, the Peyto precipitation gauge may have been undercatching a large portion of the solid precipitation. It is also possible that the gauge undercatch correction procedure, originally developed to offset wind-induced undercatch of Canadian Prairie snowfall (Smith, 2007), may require modification for use in a high mountain environment with complex terrain wind flow. While the summer precipitation comparisons with the new TB are much closer (Figure 6), the Peyto Main Station is 160 m higher and 5.5 km closer to the continental divide and so would be expected to receive somewhat greater precipitation than a gauge at Bow Summit.
[...] from 1 January 2009 to 31 December 2019 are plotted in Figure 10; earlier data are not sufficiently continuous to be included.

Data cleaning and gap infilling
Meteorological data recording frequency was changed from hourly to half-hourly in September 2008, and to 15-minute intervals in 2013 with the new USask stations (Table 1). However, quarter-hourly and half-hourly data were aggregated to hourly intervals for archiving, thus corresponding to the AWS recording interval used prior to September 2008. Raw data were thoroughly checked for errors and erroneous data removed.
Missing data were filled in by either linear interpolation or linear regression to data from stations within the basin.
Linear interpolation was chosen when a data gap was less than or equal to 4 hours, and a regression method was applied when the gap was longer than 4 hours. These data cleaning processes were followed in sequence by applying various R functions, along with the CRHMr package (Shook, 2016a), for which guidance and installation details are available on GitHub at https://github.com/CentreForHydrology/CRHMr. The data processing steps for quality assurance and control are shown in Figure 11.
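The gap-filling rule described above can be illustrated with a short sketch. It is written in Python rather than with the CRHMr functions actually used, and the function names, the use of `None` for missing values, and the single neighbouring station are assumptions made for illustration only. It assumes an hourly series, so that gap length in samples equals gap length in hours, and at least two overlapping observations with differing neighbour values for the regression.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_gaps(target, neighbour, max_interp_hours=4):
    """Fill missing (None) hourly values in `target`.

    Gaps of up to `max_interp_hours` that have valid values on both sides
    are linearly interpolated; longer gaps are estimated from a linear
    regression on `neighbour` (hypothetical second station in the basin).
    """
    pairs = [(n, t) for t, n in zip(target, neighbour)
             if t is not None and n is not None]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])

    filled = list(target)
    size = len(target)
    i = 0
    while i < size:
        if target[i] is not None:
            i += 1
            continue
        j = i
        while j < size and target[j] is None:
            j += 1                      # j is the first valid index after the gap
        gap_hours = j - i
        bounded = i > 0 and j < size
        if gap_hours <= max_interp_hours and bounded:
            left, right = target[i - 1], target[j]
            for k in range(i, j):
                weight = (k - (i - 1)) / (j - (i - 1))
                filled[k] = left + weight * (right - left)
        else:
            for k in range(i, j):
                if neighbour[k] is not None:
                    filled[k] = slope * neighbour[k] + intercept
        i = j
    return filled
```

Values that are missing at both stations are left as `None` in this sketch; how such cases were treated in the published dataset is not stated here.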
Despite two data gaps 6-8 months long and five more that span periods of 15-45 days, the Peyto Main Old record is over 91% complete between 1987 and 2012. Gap fill-ins and corrections to key elements, such as air temperature and solar radiation, were done using expert judgement by D. Scott Munro, with flags inserted to aid judgement on data suitability (Munro, 2020). Table 4 shows the regression results and Figure 12 shows the systematic bias in Peyto Main air temperature data before and after a 10 °C correction. The erroneous humidity data were corrected from the Peyto Main Old station data using monthly regressions (Table 5). In addition, Peyto Main station data for all the variables were extended back to 2010 using monthly regressions with data from the Peyto Main Old station.

Bias-corrected reanalysis data are also included as model forcing data for running glacio-hydrological models over long periods. Four gridded reanalysis products were bias-corrected using in-situ observations at the PGRB: CFSR, the Climate Forecast System Reanalysis product (Saha et al., 2010); ERA-Interim (Dee et al., 2011); WFDEI (Weedon et al., 2011); and NARR (Mesinger et al., 2006). WFDEI is available at a spatial resolution of 0.5° × 0.5° from 1979 to 2016. NARR is available at 3-hourly temporal and 32 km spatial resolutions from January 1979 to January 2017. CFSR, developed by the National Center for Environmental Prediction and the National Center for Atmospheric Research (NCEP-NCAR), is available hourly, at a horizontal resolution of 0.5° × 0.5° from 1979 to 2009 (Saha et al., 2010). A comparison of three reanalysis products showed ERA-Interim to be better than NARR and WFDEI for air temperature, vapour pressure, shortwave irradiance, longwave irradiance and precipitation, while WFDEI was best for wind speed (Pradhananga, 2020).
All gridded reanalysis data were first extracted for the Peyto Main station coordinates. ERA-Interim, WFDEI, and NARR data were interpolated to hourly time periods. The R package Reanalysis (Shook, 2016b) was used for extracting and interpolating the ERA-Interim, WFDEI, and NARR datasets. Air temperature, vapour pressure, wind speed, precipitation, incoming longwave and incoming shortwave radiation data were interpolated linearly from 3- or 6-hourly to hourly time intervals. Total precipitation (3 or 6 hours) was distributed evenly to hourly time intervals.
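The two temporal disaggregation rules just described (linear interpolation for state variables and fluxes, even distribution for accumulated precipitation) can be sketched as follows. This is a minimal Python illustration with hypothetical function names; it is not the code of the Reanalysis R package.

```python
def interpolate_to_hourly(values, step_hours):
    """Linearly interpolate values given every `step_hours` hours to hourly values.

    The output starts at the first input time and ends at the last input time.
    """
    hourly = []
    for start, end in zip(values[:-1], values[1:]):
        for hour in range(step_hours):
            hourly.append(start + (end - start) * hour / step_hours)
    hourly.append(values[-1])
    return hourly


def distribute_precip_hourly(totals, step_hours):
    """Spread each 3- or 6-hour precipitation total evenly over its hours."""
    hourly = []
    for total in totals:
        hourly.extend([total / step_hours] * step_hours)
    return hourly
```

Even distribution conserves the accumulated precipitation total of each interval, which linear interpolation of totals would not.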

MATLAB (MATrix LABoratory) codes (Krogh et al., 2015) were used to extract CFSR values, which were already at hourly time intervals.
The hourly data were bias-corrected to the in-situ observations at the Peyto Main station for air temperature, vapour pressure, wind speed, incoming shortwave and longwave radiation and those at Bow Summit for precipitation.
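One common way to bias-correct reanalysis values against station observations, with a separate calibration for each calendar month, is empirical quantile mapping. The sketch below is a generic Python illustration under stated assumptions: the function names are hypothetical, each month needs at least two calibration values, and values outside the calibration range are clamped to its extremes. It does not reproduce the qmap R package or its parameter choices.

```python
from bisect import bisect_right


def _quantile(sorted_vals, prob):
    """Quantile of an ascending list by linear interpolation, 0 <= prob <= 1."""
    pos = prob * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def _probability(sorted_vals, value):
    """Empirical non-exceedance probability of `value`, clamped to [0, 1]."""
    if value <= sorted_vals[0]:
        return 0.0
    if value >= sorted_vals[-1]:
        return 1.0
    i = bisect_right(sorted_vals, value) - 1   # sorted_vals[i] <= value < sorted_vals[i + 1]
    frac = (value - sorted_vals[i]) / (sorted_vals[i + 1] - sorted_vals[i])
    return (i + frac) / (len(sorted_vals) - 1)


def fit_monthly_qm(model, obs, months):
    """Store sorted model and observed calibration samples for each month."""
    table = {}
    for month in set(months):
        mod = sorted(v for v, m in zip(model, months) if m == month)
        ob = sorted(v for v, m in zip(obs, months) if m == month)
        table[month] = (mod, ob)
    return table


def apply_qm(value, month, table):
    """Map a model value to the observed value with the same monthly quantile."""
    mod, ob = table[month]
    return _quantile(ob, _probability(mod, value))
```

A usage pattern would be to call `fit_monthly_qm` once on the overlapping model and observation period, then `apply_qm` on every hourly reanalysis value of the full record.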

Peyto Main precipitation data were not considered because they were unreliable, as detailed in section 3.2; precipitation data from Bow Summit were used instead. A quantile mapping technique was used for bias correction, with parameters calibrated for each month from corresponding data periods using the qmap package in R (Gudmundsson, 2016). Bias-corrected ERA-Interim data from January 1979 to August 2019 are presented in Figure 13.

Hydrological data: historical and present
[...] (Munro, 2011a). The gauge station (ID 05DA008) was established in 1966 for the IHD program and maintained by the WSC. It consisted of a float-activated continuous stage recorder (Table 6) mounted on a standpipe ~500 m from the glacier tongue at that time (Figure 14).
Historical discharge measurements at Peyto Creek are problematic due to unstable cross-sections, occasional flash floods and lack of direct discharge measurements during high flows. Goodison (1972) reported that the discharge [...]. Sudden drops in the stage were observed during the early season, likely due to temporary ice jamming as the stream channel was still partly snow-covered.
[...] for the melt season, when the stream was snow-free. For the 2018 season, the shift between early and late rating curves occurred on 12 June.
Using the strong correlation between the SR50A water level and the level logger for the period (r = 0.998, RMSE = 0.08 m), the SR50A water level was used to extend the water level record for the 2013-2018 melt seasons using the linear regression f(x) = 0.6518x + 0.2576, and streamflow was then calculated from the rating curve in Figure 15a. When using this rating curve to obtain streamflow for 2013-2018, only the late-season curve is used, as the SR50A site became snow-covered in winter and measurements were only available after snowmelt exposed the stream in spring. The daily mean basin runoff (streamflow discharge per unit area of the basin) averaged over the historical 11-year period (1967-1977) and the present 5-year period (2013-2018) is presented in Figure 16.
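The chain from SR50A water level to basin runoff can be sketched as below. Only the regression coefficients and the basin area of 23.6 km² come from the text; it is assumed here that x in f(x) is the SR50A water level in metres. The power-law form of the rating curve and its parameters `a`, `b` and `h0` are purely hypothetical placeholders, since the actual rating curve is given only graphically in Figure 15.

```python
BASIN_AREA_KM2 = 23.6  # basin area stated for the PGRB


def sr50a_to_logger_level(sr50a_level_m):
    """Regression from the text, assuming x is the SR50A water level (m)."""
    return 0.6518 * sr50a_level_m + 0.2576


def rating_curve(stage_m, a=5.0, b=2.0, h0=0.0):
    """Hypothetical power-law rating Q = a * (h - h0)^b, in m^3/s.

    The coefficients are placeholders, not the published rating curve.
    """
    return a * max(stage_m - h0, 0.0) ** b


def runoff_mm_per_day(discharge_m3s, area_km2=BASIN_AREA_KM2):
    """Convert discharge (m^3/s) to basin runoff depth (mm/day)."""
    seconds_per_day = 86400.0
    area_m2 = area_km2 * 1.0e6
    return discharge_m3s * seconds_per_day / area_m2 * 1000.0


def sr50a_to_runoff(sr50a_level_m):
    """Full chain: SR50A level -> logger-equivalent level -> discharge -> runoff."""
    stage = sr50a_to_logger_level(sr50a_level_m)
    return runoff_mm_per_day(rating_curve(stage))
```

Expressing streamflow as a depth over the basin makes the historical and present periods directly comparable with precipitation and mass balance, which are also reported as depths.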

Glaciological data
Glaciological mass balance measurements, using ablation stakes and snow pits, have been taken semi-annually by Canadian government agencies since 1965, when the IHD program began; the scheme for Peyto Glacier was first described by Østrem (1966). Mass balance data for 11 elevation bands, 100 m in width, are reported in several publications (Demuth and Keller, 2006; Dyurgerov, 2002; Ommanney, 1987; Young, 1981; Young and Stanley, 1976). Recent mass balance data are reported by the national glaciological programme of the Geological Survey of Canada to the WGMS and are available from it (http://www.wgms.ch). The WGMS (2020) has also compiled datasets from 1966 to 2018 that are plotted in Figure 17. The dataset does not include frontal variation, equilibrium line altitude (ELA), accumulation area ratio (AAR), glacier mass balance (winter, summer, annual) and repeat photographs, which were published by WGMS (2020) and are available at https://wgms.ch/. Radio detection and ranging (radar) measurements of ice thickness for Peyto Glacier in the 1980s were reported by Holdsworth et al. (2006). Ground-penetrating radar surveys of ice thickness across the glacier tongue in 2008-2010 were reported by Kehrl et al. (2014) in their study of volume loss from the lower Peyto Glacier area between 1966 and 2010. The dataset does not include these published ice thickness data.

It should be noted that in several instances the datasets feature variations in temporal subsets of the data. An example is the WGMS record which, for a portion of the record, utilizes data from the Dyurgerov (2002) synthesis rather than Environment Canada National Hydrology Research Institute observations compiled by Ommanney (1987).
Moreover, all datasets present a mix of reference-surface mass balance data, with hypsometry held constant, and conventional mass balance data, where hypsometric changes are reflected in mass balance accounting (Cogley et

Digital elevation models (DEMs)
Repeat DEMs can be used to quantify surface height changes through time, which are then converted to mass change.
Photogrammetric techniques have been used to construct a high-quality DEM from 1966, and airborne Light Detection and Ranging (LiDAR) surveys were used to collect DEMs for 2006 (Demuth and Hopkinson, 2013) (Table 7). However, there was extensive fresh snow cover in the images that resulted in poor contrast in the accumulation region of the glacier.
The DEM was generated using the Agisoft Metashape Professional (AMP) Edition, Version 1.5. All photos were assigned to the same camera group based on the focal length, pixel size and fiducial coordinates available from the camera calibration report. Then the photos were aligned by AMP and a sparse point cloud model was produced in which camera positions and orientations are indicated. To optimize the camera positions and orientation data, some reference points (GCPs) were identified from the stable terrain surrounding the glacier, over a range of elevations.
The GCP file was imported to AMP, and corresponding locations were marked on each of the photos. Finally, based on the estimated camera positions, AMP calculated depth information and a dense point cloud was generated. A DEM and an ortho image were produced from the dense point cloud.

Most of the accumulation zone of the glacier is missing from the dense point cloud because fresh snow cover resulted in poor contrast in this region. The interpolation feature available in AMP was not enabled whilst generating the DEMs, as it does not generate very accurate elevations. The spatial resolution of the DEM was chosen to be 10 m.

Generation of LiDAR DEM
Light Detection and Ranging (LiDAR) uses a laser pulse to calculate the distance of the target from the sensor. An airborne laser survey was conducted using a Riegl Q-780 full waveform scanner and Applanix POS AV Global Navigation Satellite System (GNSS) Inertial Measurement Unit (IMU). The laser survey trajectory data were processed using PosPac Mobile Mapping Suite (Applanix), resulting in horizontal and vertical positional accuracy typically better than ±15 cm. RiPROCESS was used to post-process the point clouds and export to LAS (LiDAR data exchange file) format, a binary file format for storing LiDAR data. LASTools, available from https://rapidlasso.com/lastools/, was used to process the point cloud and generate the DEM.

DEM co-registration
It is important to align the multi-temporal DEMs relative to one another so that the same point on the ground is represented at the same location in each DEM, thus enabling glacier elevation change to be measured as accurately as possible (e.g., Figure 18). The 2017 LiDAR DEM was taken as the master DEM and all other DEMs (Table 8) were co-registered with respect to this DEM following the Nuth and Kääb (2011) method. The 1966 ortho image was used to mask out all the unstable areas such as glaciers, fresh snow, or water bodies. All the pixels outside this mask were classified as stable terrain, which was primarily bedrock and so excluded trees, lakes/water bodies, glaciers, and snow cover, and were thus used for co-registration. The co-registration script available in the GitHub repository at https://github.com/GeoUtils was used to perform the co-registration. The statistics of the elevation difference for stable terrain after the co-registration are listed in Table 9.

Year | Resolution | Source and method
1966 | 10 m | This DEM was prepared from digital copies of diapositives, photogrammetrically scanned at 14 m resolution, obtained from the Canadian National Air Photo Library. A 10 m resolution DEM was generated using AMP Edition, Version 1.5.
2006 | 10 m | This DEM was prepared from LiDAR surveys taken in August 2006 (Demuth and Hopkinson, 2013). The DEM did not cover the whole area of the PGRB, so the northeast corner of the basin was mosaiced with 2014 DEM data to fill in the missing part.
2017 | 1 m | This DEM was prepared from LiDAR surveys taken on 17 September 2017 and is available in the archive of the University of Northern British Columbia (UNBC).
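The core of the Nuth and Kääb (2011) co-registration is a fit of the slope-normalized elevation difference over stable terrain to a cosine of terrain aspect, dh / tan(slope) = a cos(b - aspect) + c, where a and b describe the magnitude and direction of the horizontal shift and c relates to the mean vertical bias. The Python sketch below shows a single fitting step by linear least squares; the function names are hypothetical, flat pixels (slope of zero) are assumed to have been excluded, and it is not the GeoUtils script used for this dataset. In practice the fit is repeated after shifting the DEM until the estimated shift becomes negligible, and the sign convention depends on which DEM is moved.

```python
import math


def _solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    size = 3
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * size
    for row in reversed(range(size)):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, size))
        solution[row] = (aug[row][size] - known) / aug[row][row]
    return solution


def fit_nuth_kaab(dh, slope_deg, aspect_deg):
    """Fit dh / tan(slope) = a * cos(b - aspect) + c over stable-terrain pixels.

    Uses the identity a*cos(b - psi) = (a cos b) cos(psi) + (a sin b) sin(psi),
    which makes the problem linear in (a cos b, a sin b, c).
    Returns (a, b in degrees, c).
    """
    rows, targets = [], []
    for diff, slope, aspect in zip(dh, slope_deg, aspect_deg):
        psi = math.radians(aspect)
        rows.append((math.cos(psi), math.sin(psi), 1.0))
        targets.append(diff / math.tan(math.radians(slope)))
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    moment = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    a_cos_b, a_sin_b, c = _solve3(normal, moment)
    a = math.hypot(a_cos_b, a_sin_b)
    b = math.degrees(math.atan2(a_sin_b, a_cos_b)) % 360.0
    return a, b, c
```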

Landcover data
Landcover data of the PGRB were compiled from remotely sensed imagery and a topographic map. Landcover of 1966 was prepared from the georeferenced scanned topographic map of Peyto Glacier, produced from the aerial photographs from August 1966 (Sedgwick and Henoch, 1975), and landcover from 1984 to 2018 was prepared from Landsat imagery. Google Earth Engine (GEE), ESRI ArcMap, and R were used at various stages of the data preparation. GEE was also used for the initial spatial and temporal analysis of annual landcover mapping from Landsat images, and ArcMap and R were used in refinement and the database preparation. Landcover maps from the satellite images were prepared by classification in accordance with albedo, the normalized-difference snow index (NDSI), and the normalized-difference water index (NDWI). As datasets extracted from different sources have different projection systems, they were re-projected to WGS 84 / UTM zone 11N (EPSG: 32611).

Basin delineation and landcover classification
The PGRB drainage basin was delineated from the 1966 DEM. GEE was used for the landcover

The use of top-of-atmosphere (TOA) values was followed as a standard operating procedure in this work, with appropriate narrow-to-broadband conversion (Hall et al., 2002; Hall and Riggs, 2007; Liang, 2000; Smith, 2010). Although atmospheric backscatter will inflate surface reflectance values, ice albedo values measured on Peyto, as well as those obtained from atmospherically corrected satellite images of Peyto, range from 0.17 to 0.3 (Cutler, 2006), so backscatter inflation of albedo is unlikely to reach 0.4.

The NDSI, NDWI and albedo for the images were calculated on the GEE platform. The threshold of NDSI for snow cover was kept at ≥ 0.4 (Hall et al., 2002; Hall and Riggs, 2007). NDWI tends to have a dynamic threshold value (Ji et al., 2009). In our case, keeping the threshold at 0.4 gave the best classification for a waterbody, as a lower value tends to misclassify ice pixels as waterbodies. Similarly, albedo with the threshold of ≥ 0.4 was considered to [...]. After GEE export to Google Drive, the images were downloaded and converted to a shapefile using the Raster to Polygon tool in ArcMap. Noise in the landcover classification was cleaned with the elimination function in ArcMap, a few misclassified areas were corrected manually after visual inspection, and finally the files were clipped to the boundary of the PGRB.
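The index thresholds described above can be illustrated per pixel with a short Python sketch. The band combinations are assumptions, since the text does not give them: NDSI is taken as (green - SWIR) / (green + SWIR) and NDWI as (green - NIR) / (green + NIR), one of several NDWI formulations in use. The function names are hypothetical, and the flags are returned separately because the rule for combining them into landcover classes is not fully described here.

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), returning 0.0 when both reflectances sum to zero."""
    total = band_a + band_b
    return (band_a - band_b) / total if total != 0 else 0.0


def threshold_flags(green, nir, swir, albedo,
                    ndsi_min=0.4, ndwi_min=0.4, albedo_min=0.4):
    """Compute NDSI and NDWI for one pixel and apply the 0.4 thresholds.

    Band choices are assumed (see lead-in); inputs are reflectances in [0, 1].
    """
    ndsi = normalized_difference(green, swir)
    ndwi = normalized_difference(green, nir)
    return {
        "ndsi": ndsi,
        "ndwi": ndwi,
        "ndsi_flag": ndsi >= ndsi_min,        # candidate snow pixel
        "ndwi_flag": ndwi >= ndwi_min,        # candidate water pixel
        "albedo_flag": albedo >= albedo_min,  # bright surface
    }
```

Note that clear water can also exceed the NDSI threshold, so the flags overlap and some precedence rule is needed when building the final landcover map.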

Several examples of data cleaning approaches are presented. The Peyto Main station was operational during the summer months of the IHD and re-established as an AWS in 1987. New instruments and dataloggers were added in 2012-2013 by the Centre for Hydrology, University of Saskatchewan. The meteorological data include hourly air temperature, humidity, wind speed, incoming shortwave and longwave radiation, and precipitation. These data are available for a period longer than two decades from the Peyto Main station, and for longer than one decade from the on-ice stations. Bias-corrected ERA-Interim (European Centre for Medium-Range Weather Forecasts Interim reanalysis), WFDEI (Water and Global Change Forcing Data ERA-Interim), NARR (North American Regional Reanalysis), and CFSR (Climate Forecast System Reanalysis) data are also included for running hydrological models over longer periods.

Glaciological mass balance data are collected semi-annually by Natural Resources Canada's Geological Survey of Canada and partners, published by the WGMS, and updated annually. Details of these data have been described in several publications. Specific mass balance data at different elevation zones, available from 2007 to 2019, are included in this paper. On-ice station data include glacier surface elevation change due to ablation and accumulation, as measured by sonic rangers at three ice stations. The three ice stations, each in a different elevation zone, have been operational for various time periods, the first starting in 1995, with long gaps in the records becoming less frequent over time, especially after 2007. Geospatial data include information on basin boundary, drainage area, landcover (including snow, firn and ice on the glacier), and locations of hydrometric sites. Both historical and contemporary discharge data are included. The flow data and hourly surface elevation change data in different elevation zones can be useful for model validation. The long-term mass balance data are a valuable research asset for model development. The exceptionally long database is a testament to the dogged perseverance of scientists working for various entities with support from various research funding schemes, who kept their eyes on the science and so have produced a rare, detailed half-century documentation of the impacts of climate change on the cryosphere in a high mountain environment.

Author contribution
DP cleaned, organized, and corrected the data and wrote the first draft of the manuscript. JWP and DSM designed and instrumented the research basin. All the authors collected data and contributed to the paper writing.

Competing interests.
The authors declare that they have no conflict of interest.

Disclaimer
Any reference to specific equipment types or manufacturers is for informational purposes and does not represent a product endorsement.

Acknowledgements
This paper and sustained observations at the PGRB were made possible by funding from the Global Water Futures

perseverance, foresight and bloody-mindedness along with the physical fortitude to take scientific measurements in inclement weather. This paper is dedicated to the many brave scientists who have taken observations on Peyto Glacier and to Dr. Gordon Young, who has not only done all of that but continues to encourage scientific examination of the glacier and of the dynamic interface of the cryosphere and the hydrosphere.