Interactive comment on “Climate, snow, and soil moisture data set for the Tuolumne and Merced River watersheds, California, USA”

Abstract. We present hourly climate data to force land surface process models and assessments over the Merced and Tuolumne watersheds in the Sierra Nevada, California, for water years 2010–2014. Climate data (38 stations) include temperature and humidity (23), precipitation (13), solar radiation (8), and wind speed and direction (8) spanning an elevation range of 333 to 2987 m. Each data set contains raw data as obtained from the source (level 0), data that are serially continuous with noise and non-physical points removed (level 1), and, where possible, data that are gap-filled using linear interpolation or regression with a nearby station record (level 2). All stations chosen for this data set were known or documented to be regularly maintained, with components checked and calibrated during the period. Additional time-series data include available snow water equivalent records from automated stations (8) and manual snow courses (22), as well as distributed snow-depth and co-located soil-moisture measurements (2–6) from four locations spanning the rain-to-snow transition zone in the centre of the domain. Spatial data layers pertinent to snowpack modelling in this data set are basin polygons and 100-m resolution rasters of elevation, vegetation type, forest basal area, tree height, and forest canopy cover, transmissivity, and extinction coefficient. All data are available from online data repositories (https://doi.org/10.6071/M3FH3D).



Introduction
The snowpack of the Sierra Nevada provides at least 40 % of California's water supply (Roos, 1989) and has historically stored an amount of water equivalent to more than half of the available Sierra foothill reservoir storage (Bales et al., 2011a). Snowpack in the western US is highly vulnerable to climate warming, both in the recent past (Mote et al., 2005) and as expected in the coming decades, particularly at lower elevations (Fyfe et al., 2017; Miller et al., 2003; Young et al., 2009). Melting snow sustains soil moisture, streams, and other water sources well into the very dry and warm Mediterranean summer that typifies the area (e.g., Yarnell et al., 2010). Building our intuition about the sensitivity of the snowpack to current and future climates, as well as storm paths and timing, is critical to the future management of these areas. Snowpack water storage affects forest fire, forest health, invasive and threatened species, recreation, flooding, and local and downstream water supplies (Brekke et al., 2009; Dettinger, 2011; Ligare et al., 2012; Miller et al., 2009; Sala et al., 2000).
Soil moisture is the other major component of water storage in mountain ecosystems. As snowpack storage diminishes, it will be essential to understand changes in soil moisture as it pertains to plant-available water, evapotranspiration, and, ultimately, forest health (e.g., Bales et al., 2018; Asner et al., 2016). The 2012-2016 California drought, including the 2015 "snow drought" (Harpold et al., 2017), and associated large-scale forest mortality highlight the importance of assessments that investigate the coupled changes in snowpack and soil moisture in mountain forests.
The purpose of this paper is to introduce climate, soil moisture, snow, and spatial data that may be used for hydrologic or land surface assessments and modeling in the Tuolumne and Merced watersheds in the Sierra Nevada of central California (Fig. 1; Tables 1, 2, 3). Hourly climate data and snow and soil moisture measurements were derived from stations within and immediately adjacent to the basins. Spatial data include basin polygon files and 100 m resolution raster files of elevation and vegetation properties. We describe data sources, processing, limitations, and where to obtain the data. This data set complements stream and climate data compiled by Lundquist et al. (2016) for the upper parts of both watersheds as a part of the Yosemite Hydroclimate Network as well as meteorological and lidar-derived snow depth data compiled for a related snow-modeling study by Hedrick et al. (2018).

Area description
The study basins are west-draining watersheds on the broad western slope of the Sierra Nevada and ultimately tributaries to the San Joaquin River. The climate is generally characterized by cool, wet winters and long, warm, dry summers. Winter storms derive from large synoptic systems from the northern Pacific and more focused and moisture-laden atmospheric rivers from further south in the Pacific. Indeed, the latter may produce 20 %-50 % of annual precipitation for the area, and just a few storms may determine the difference between above-average water years and drought (Dettinger, 2011). Within the seasonal snow zone above 1800 m elevation, much of the landscape consists of broad interfluves between deep river canyons on the Merced and Tuolumne rivers, the area of Yosemite National Park. Most snowmelt runoff is generated between 2100 and 3000 m elevation, with up to 40 % of runoff originating from elevations greater than 3000 m, which is above existing measurements (Rice et al., 2011). Nearly 60 % of the snowpack zone lies between the elevations of 2000 and 3000 m (Rice et al., 2011), and small changes in temperature during storms can result in large changes in runoff due to shifts in precipitation phase. This is illustrated in Fig. 2, which shows that wet-season winter temperatures in this zone hover close to 0 °C in representative wet and dry years in the data set. Dominant vegetation ranges from moisture-limited grasslands and oak woodlands below 1000 m elevation through ponderosa, mixed conifer (sugar pine, incense cedar, Jeffrey pine, and white fir), and red fir forests, to energy-limited western white and lodgepole pine forests at and above 2500 m (Fites-Kaufman et al., 2007; Keeler-Wolf et al., 2012). Some of the largest and most productive forests in the world are located in the 1500-2000 m elevation range where there is neither moisture nor energy limitation (Kelly and Goulden, 2016; Matchett et al., 2015). Here, the mean winter temperature is a few degrees above freezing and precipitation averages 1100-1200 mm yr⁻¹ (PRISM Climate Group, 2012).
Like all major river basins in California, the Tuolumne and Merced are vitally important water sources to the economy of the region. The watersheds provide water for a large agricultural region of the Central Valley between Merced in the south and Modesto in the north, fed primarily by Lake McClure on the Merced and Lake Don Pedro on the Tuolumne River. Farther upstream on the Tuolumne River, the Hetch Hetchy water system supplies water to 2.6 million San Francisco and other Bay Area residents.

Climate data
The original intent of assembling this data set was to force the snow energy- and mass-balance model iSnobal (Marks et al., 1999) at an hourly time step. The data represent the required parameters to drive the model: incoming solar radiation, temperature, relative humidity, wind speed and direction, and precipitation. That modeling effort (Roche et al., 2018a) employed a subset of this data archive, which is described in succeeding sections (bold attributes in Table 2). Data were obtained from the California Data Exchange Center (CDEC) for California Department of Water Resources stations, Western Regional Climate Center for Fire Remote Access Weather Station (RAWS) network stations, and the Scripps Institution of Oceanography (SIO), which operates a transect of stations across the Sierra through the middle of the study domain. All raw data (Level 0) were processed to be serially continuous and to remove noise and nonphysical data (Level 1) and gap filled where possible using linear interpolation and regression with nearby stations (Level 2). Very few stations adequately measured all parameters and several stations have extensive periods with no data that precluded gap filling. As is typical in large mountain basins, instrumentation distribution is not uniform, often located where it is convenient to service, and heavily weighted to the lower elevations. More than two-thirds of the stations are below 2000 m elevation and no stations are located above 3000 m (Figs. 1, 2). Above 1800 m, where seasonal snowpack occurs, there are three precipitation measurement stations, two of which are rain-shadow affected (Fig. 1). For this paper we have added the additional meteorological station and soil moisture data available in the same area, which provides a more complete hydrologic data set.

Temperature and humidity
Paired temperature (°C) and relative humidity (kPa kPa⁻¹) used for snow modeling were measured at 23 stations in this data set. Stations were chosen for modeling given known maintenance records at each site that assured minimal drift and accurate subsequent calculation of dew point and vapor pressure. Figure 3 illustrates dew-point and air temperature variability as recorded at Crane Flat Lookout over a 2-week period in late 2012 and early 2013. Also shown is the dew-point lapse rate (using the methods of Marks et al., 1999).

Precipitation
Hourly precipitation (mm) was the most difficult parameter to obtain and process. The best quality records were those obtained from stations equipped with tipping-bucket gauges that were below 1000 m elevation where snow and ice are minimal. Weighing gauges in Yosemite Valley (1208 m),
Earth Syst. Sci. Data, 11, 101-110, 2019, www.earth-syst-sci-data.net/11/101/2019/
values were set to zero. For days with nonzero accumulation, we first set all negative incremental values to zero and then multiplied positive increments by a constant so that the sum equalled the daily total. While these gauges are representative of their respective PRISM grid cells, they recorded 50 %-60 % of PRISM estimates in their respective elevation bands in water years 2011 and 2013 (Fig. 2) because they are in a rain shadow. These records may be used primarily to derive precipitation estimates elsewhere in the basin by scaling 800 m PRISM climate normals, as done by Lundquist et al. (2016), or as simple measures of precipitation timing rather than quantity.
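The adjustment applied to the accumulation-gauge records (a daily total from midnight readings, then hourly increments rescaled to match it) can be illustrated with a minimal Python sketch. The function names and the choice to treat negative daily differences as zero are assumptions made for illustration; this is not the processing code used to build the data set.

```python
def daily_totals(midnight_values):
    """Daily accumulation (mm) from successive midnight gauge readings.

    Assumption: negative day-to-day differences are treated as zero.
    """
    return [max(b - a, 0.0) for a, b in zip(midnight_values, midnight_values[1:])]


def rescale_hourly(readings, daily_total):
    """Hourly precipitation (mm) for one day, scaled to sum to daily_total.

    readings: cumulative gauge readings (mm) bracketing the day,
              one more value than the number of hourly increments.
    """
    n = len(readings) - 1
    # Days with no accepted accumulation: every hourly value is zero.
    if daily_total <= 0:
        return [0.0] * n
    increments = [b - a for a, b in zip(readings, readings[1:])]
    # Negative increments (e.g. diurnal contraction) are set to zero.
    positive = [max(inc, 0.0) for inc in increments]
    total_positive = sum(positive)
    if total_positive == 0:
        return [0.0] * n
    # A single constant brings the positive increments to the daily total.
    scale = daily_total / total_positive
    return [inc * scale for inc in positive]


# Hypothetical four-hour "day" for illustration only.
hourly = rescale_hourly([100.0, 100.5, 100.2, 101.0, 102.0], daily_total=2.0)
```

In this toy case the raw positive increments sum to 2.3 mm, so each is scaled by 2.0/2.3 and the negative increment contributes nothing.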

Wind speed and direction
For snow modeling, we selected wind data from eight sites that were primarily located on open ridge lines in order to avoid terrain- or forest-influenced winds. Terrain and vegetation effects could then be modeled using methods such as those of Winstral et al. (2009). Additional stations such as Tioga Pass Entrance Station (TES) and Gin Flat (GIN) provided a reference for forest wind speeds.

Solar radiation
All stations measured solar radiation using pyranometers that introduce substantial aspherical effects at dawn and dusk. Moreover, their calibration history was not known. Hence, the sites chosen for snow modeling were those with a largely complete record that spanned the domain and that exhibited minimal vegetation and terrain shading. As such, this record is best used as an estimate of cloudiness when combined with an independent estimate of incoming clear-sky solar radiation at each site (see Roche et al., 2018a, for methods). Other stations in the data set exhibit substantially more terrain and vegetation shading influences. Records in the data set have not been corrected for shading.

Snow and soil moisture data

Snow water equivalent
We extracted all available monthly snow-course and daily snow-pillow data from CDEC for purposes of evaluating snow-modeling performance. Missing snow-course data were not gap filled, given substantial inter-site variability. Snow-pillow data were checked for serial completeness and outliers and gap filled using linear interpolation only.
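Gap filling by linear interpolation, as applied to the snow-pillow records, can be sketched as follows. This is an illustrative sketch only: the function name, the use of `None` for missing values, and the optional `max_gap` limit are assumptions, not part of the data set's documented processing.

```python
def fill_gaps_linear(values, max_gap=None):
    """Fill runs of None by linear interpolation between bracketing values.

    values:  evenly spaced series with None marking missing samples.
    max_gap: hypothetical limit on the gap length (in samples) to fill;
             longer gaps, and gaps at either end of the series, are left as None.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first valid value after the gap, or n
        gap = end - start
        if start == 0 or end == n:
            continue  # no bracketing value on one side
        if max_gap is not None and gap > max_gap:
            continue
        left, right = filled[start - 1], filled[end]
        for k in range(start, end):
            filled[k] = left + (right - left) * (k - start + 1) / (gap + 1)
    return filled


# Hypothetical daily snow water equivalent values (mm) with a two-day gap.
swe = fill_gaps_linear([100.0, None, None, 130.0])
```

Leaving leading and trailing gaps unfilled avoids extrapolating beyond the measured record.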

Snow depth and soil moisture
Snow-depth data were collected at four locations spanning the rain-snow transition zone along the Tioga Road at the Merced Grove (1810 m), Gin Flat (2149 m), Smoky Jack (2182 m), and Olmsted Quarry (2604 m). At each of the four locations, three to six snow depth sensor nodes were distributed over approximately 1-3 ha according to canopy coverage (drip edge, under canopy, open canopy) as well as aspect (Rice and Bales, 2010; Kerkez et al., 2012). Each node was instrumented with a Judd snow depth sensor mounted 3 m above the ground surface. Snow data were filtered to remove unrealistic depths and checked for serial continuity (Level 1) and then gap filled using linear interpolation for periods of a few hours and regression with adjacent stations for larger gaps (Level 2). Soil pits of 1 m depth were excavated at the drip edge, under canopy, and open canopy locations. At Merced Grove, Gin Flat, and Smoky Jack, the face of each pit was instrumented with soil moisture sensors at 10, 30, 60, and 90 cm depths. Olmsted Quarry soil pits were instrumented at depths of 10, 30, and 60 cm due to the shallow soil. The soil moisture sensors were installed in undisturbed soil. The soil profiles were then back filled and hand compacted to maintain the original soil horizons and density as much as possible.
The soil moisture sensors installed for this study were the 5TE (5.2 cm probe length), the successor to the family of Decagon ECH2O sensors studied by Kizito et al. (2008). That study evaluated the EC-5 and ECH2O-TE sensors for a wide range of soil solution salinity, temperature, and soil types. Their calibration measurements showed little probe-to-probe variability and demonstrated that a single calibration curve was sufficient for a range of mineral soils, suggesting there is no need for a soil-specific calibration (Bales et al., 2011b). To convert the Level 0 (raw data) to volumetric water content (VWC), the Topp equation (Topp et al., 1980) was applied: $\mathrm{VWC} = 4.3 \times 10^{-6}\,\varepsilon^{3} - 5.5 \times 10^{-4}\,\varepsilon^{2} + 2.92 \times 10^{-2}\,\varepsilon - 5.3 \times 10^{-2}$, where $\varepsilon$ is the dielectric permittivity, which is the raw value reported by the Decagon 5TE.
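As a worked illustration, the Topp polynomial quoted above can be evaluated with a few lines of Python. The function name is hypothetical, and the sketch assumes the input is already expressed as dielectric permittivity; it applies no clipping or quality control.

```python
def topp_vwc(permittivity):
    """Volumetric water content from dielectric permittivity (Topp et al., 1980)."""
    eps = permittivity
    return 4.3e-6 * eps**3 - 5.5e-4 * eps**2 + 2.92e-2 * eps - 5.3e-2


# A permittivity of 10 gives a volumetric water content of about 0.188.
vwc = topp_vwc(10.0)
```

Because the relation is an empirical polynomial, it returns slightly negative values for very low permittivity (for example, near 1); users may wish to screen such values.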

Spatial data
Spatial data included in this data set are basin polygons and raster files. All spatial data are in the Universal Transverse Mercator (UTM) zone 11 projection with the 1983 North American Datum. Basin polygons are in Environmental Systems Research Institute (ESRI) ArcGIS shapefile format, while raster files are in ESRI ArcGIS ASCII grid format. Raster files include 100 m resolution elevation (m), canopy cover (%), generalized vegetation type, derived tree height (m), derived canopy transmissivity (dimensionless), and canopy extinction coefficient (m⁻¹). The digital elevation model (DEM) was derived by resampling the 10 m U.S. Geological Survey National Elevation Dataset (NED) using bilinear interpolation. All other raster data sets were aligned with this DEM. The resulting raster contained 1296 columns and 1107 rows. Vegetation type, canopy cover, and tree basal area were derived from the U.S. Forest Service 30 m resolution California Region 5 Vegetation Maps (CALVEG, U.S. Forest Service, 2014) by determining the dominant overstory vegetation in each raster cell (Wildlife Habitat Relation (WHR) Lifeform), or spatially averaging canopy cover or basal area within each raster cell. We calculated tree height using basal area from the CALVEG data set and the allometric relation of Zhao et al. (2012). No attempt was made to compare our tree height grid with available lidar data. The WHR Lifeform designation was used to assign canopy transmissivity and extinction coefficients to each pixel based on the values from Link and Marks (1999). See Roche et al. (2018a) for more detail on the derivation of these layers. Basin polygons for the Merced and Tuolumne watersheds are in ESRI ArcGIS shapefile format.
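For readers unfamiliar with the ESRI ASCII grid format, the following Python sketch parses such a file from text. It assumes the common layout of a six-line header (ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value) followed by one raster row per line; the function name and the sample values are hypothetical and are not taken from the data set.

```python
def read_ascii_grid(text):
    """Parse an ESRI ASCII grid given as a string.

    Assumes a six-line header and one raster row per line.
    Returns (header dict, list of rows); NODATA cells become None.
    """
    lines = text.strip().splitlines()
    header = {}
    for line in lines[:6]:
        key, value = line.split()
        header[key.lower()] = float(value)
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    nodata = header["nodata_value"]
    rows = []
    for line in lines[6:6 + nrows]:
        values = [float(v) for v in line.split()]
        if len(values) != ncols:
            raise ValueError("row length does not match ncols")
        rows.append([None if v == nodata else v for v in values])
    if len(rows) != nrows:
        raise ValueError("number of rows does not match nrows")
    return header, rows


# Tiny made-up grid for illustration; real files here are 1296 x 1107 cells.
sample = """ncols 3
nrows 2
xllcorner 250000
yllcorner 4150000
cellsize 100
NODATA_value -9999
1500 1510 -9999
1490 1505 1520
"""
header, grid = read_ascii_grid(sample)
```

The first row in the file is the northernmost row of the raster, so `grid[0]` holds the top row of cells.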

Data availability
All data presented in this paper are available in the California Digital Library (https://doi.org/10.6071/M3FH3D, Roche et al., 2018b). Detailed metadata are associated with each file including contact information.

Summary
The data set assembled here represents the nature of data available in sparsely instrumented mountain basins coupled with the higher quality SIO Sierra transect and complementary snow depth and soil moisture data set that has undergone quality control and gap filling. While it was used for one snow-modeling effort (Roche et al., 2018a), there are many opportunities to use the data for other applications, combining available raster data sets (PRISM, Basin Characterization Model, etc.) and testing the sensitivity of using more or fewer stations for estimating the attribute of interest. One outstanding use of the data set is an assessment of the temporal evolution of soil moisture with respect to snow accumulation and ablation across the rain-snow transition zone. Given the stark lack of measured short- and long-wave radiation in the watershed, other estimates of these attributes may be used to explore the sensitivity of model results. It is important for these kinds of data to be available for longer periods of time and in other watersheds in order to apply data-driven land surface modeling efforts that seek to minimize calibration in order to more robustly assess stressors on ecosystems.

Figure 1. Hydrometeorological stations in and adjacent to the Merced and Tuolumne watersheds used in this data set. Co-located station types are offset for clarity. Yosemite National Park is demarcated by the green boundary.

Figure 2. Elevation transects of temperature and precipitation in (a) wet and cold water year 2011, and (b) dry and warm 2013. Temperatures are 3-month means and standard deviations during the main snowpack accumulation period (December-February) and the main snowmelt season (April-June). Precipitation and temperature station data were averaged by 100 m elevation band. The shaded area is the proportional basin area in each 100 m elevation band.
The dew-point lapse rate derived from these 23 stations averaged −0.0055 and −0.0065 °C m⁻¹ during and between precipitation events, respectively. Temperature gradient varied from −0.0075 to −0.0044 °C m⁻¹ during wet periods and −0.0079 to −0.0016 °C m⁻¹ during dry periods. These data in combination with those of Lundquist et al. (2016) offer an interesting opportunity to further the temperature and dew-point lapse rate analyses of Lundquist and Cayan (2007) and Feld et al. (2013), respectively.
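One simple way to estimate a lapse rate and its coefficient of determination from station data is an ordinary least-squares fit of temperature against elevation, sketched below in Python. This is an assumption made for illustration; the published lapse rates follow the methods of Marks et al. (1999), which may differ. The function name and the sample values are hypothetical.

```python
def lapse_rate(elevations_m, temperatures_c):
    """Least-squares slope (°C per m) of temperature versus elevation, with r².

    Returns (slope, r_squared). r_squared is NaN if all temperatures are equal.
    """
    n = len(elevations_m)
    if n < 2 or n != len(temperatures_c):
        raise ValueError("need two or more paired values")
    mean_z = sum(elevations_m) / n
    mean_t = sum(temperatures_c) / n
    szz = sum((z - mean_z) ** 2 for z in elevations_m)
    stt = sum((t - mean_t) ** 2 for t in temperatures_c)
    szt = sum((z - mean_z) * (t - mean_t)
              for z, t in zip(elevations_m, temperatures_c))
    if szz == 0:
        raise ValueError("stations must span more than one elevation")
    slope = szt / szz
    r_squared = float("nan") if stt == 0 else szt ** 2 / (szz * stt)
    return slope, r_squared


# Made-up dew-point values at three elevations for one hour.
slope, r2 = lapse_rate([1000.0, 2000.0, 3000.0], [10.0, 3.5, -3.0])
```

Applied hour by hour across the stations reporting both temperature and humidity, such a fit yields a time series of slopes and r² values comparable in form to Fig. 3b.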

Figure 3. (a) Hourly time series of air temperature, dew-point temperature, and precipitation as recorded at Crane Flat Lookout RAWS and Crane Flat CRN stations for a 2-week period from 21 December 2012 through 3 January 2013. (b) Dew-point lapse rate and corresponding coefficient of determination for the same period for 23 stations with air temperature and relative humidity data (parameters shown in bold in Table 2).

Table 1. Measurements, operator, and instrumentation at each site.

Table 2. …ons and data used to force model.
1 Three-letter station name abbreviations are derived from conventions in the California Data Exchange Center database (http://cdec.water.ca.gov/, last access: 10 December 2018). Abbreviations ending with "-SIO" indicate stations operated by Scripps Institution of Oceanography that are not currently available through CDEC. CFL-CRN indicates the NOAA Climate Reference Network Station located near the Crane Flat Lookout.
2 Geographic coordinates are in the Universal Transverse Mercator (UTM) projection, 1983 North American Datum, zone 11.
3 Variable abbreviations: p, precipitation; rh, relative humidity; sr, solar radiation; t, air temperature; w, wind speed and direction.
4 Operator abbreviations are given as follows: RAWS - Interagency Fire Remote Access Weather Station network managed by the Bureau of Land Management; SIO - Scripps Institution of Oceanography; UCM - University of California Merced; CA-DWR - California Department of Water Resources; MID - Merced Irrigation District; HHWP - Hetch Hetchy Water and Power; NRCS - Natural Resources Conservation Service; PGE - Pacific Gas and Electric; NOAA - National Oceanic and Atmospheric Administration Climate Reference Network.
5 Actually located in the town of Mariposa, CA.
Bold denotes parameters used in Roche et al., 2018a.
1 (1195 m) and the Crane Flat NOAA Climate Reference Network site (2017 m) were regularly maintained and appear to produce acceptable data. The only two high-elevation gauges were at Tuolumne Meadows (TUM) and Virginia Lakes Ridge (VLR) and both were accumulation-type gauges equipped with pressure transducers. The records from these gauges exhibit substantial diurnal expansion and contraction effects, adding uncertainty to the hourly records. To process these records, we first established a daily record by extracting the midnight value to minimize heating and cooling effects, differencing from the previous day and removing any negative values. For days with zero midnight values, all hourly

Table 3. Snow and soil moisture data sources.