The SUMup dataset: compiled measurements of surface mass balance components over ice sheets and sea ice with analysis over Greenland

Increasing atmospheric temperatures over ice cover affect surface processes, including melt, snowfall, and snow density. Here, we present the Surface Mass Balance and Snow on Sea Ice Working Group (SUMup) dataset, a standardized dataset of Arctic and Antarctic observations of surface mass balance components. The July 2018 SUMup dataset consists of three subdatasets, snow/firn density (https://doi.org/10.18739/A2JH3D23R), at least near-annually resolved snow accumulation on land ice (https://doi.org/10.18739/A2DR2P790), and snow depth on sea ice (https://doi.org/10.18739/A2WS8HK6X), to monitor change and improve estimates of surface mass balance. The measurements in this dataset were compiled from field notes, papers, technical reports, and digital files. SUMup is a compiled, community-based dataset that can be and has been used to evaluate modeling efforts and remote sensing retrievals. Active submission of new or past measurements is encouraged. Analysis of the dataset shows that Greenland Ice Sheet density measurements in the top 1 m do not show a strong relationship with annual temperature. At


Introduction and background
Earth's polar regions are warming at an accelerated rate. As increased air temperatures and associated feedbacks with radiative heating persist, the ice cover is changing, particularly at the ice-atmosphere interface (e.g., Vaughan et al., 2003; Serreze and Francis, 2006; Hall et al., 2013). This change is evident in declining Arctic sea ice extent (e.g., Richter-Menge et al., 2016) and the recent acceleration of total mass loss from the Greenland Ice Sheet (GrIS) and Antarctic ice sheets (AIS) (e.g., Velicogna et al., 2014; IMBIE Team, 2018), which contributed ∼ 11 mm to global sea levels between 1992 and 2011 (Shepherd et al., 2012). Surface change is particularly evident over the GrIS.
In 2012, at the Surface Mass Balance and Snow on Sea Ice Working Group (SUMup) meeting, the modeling and remote sensing communities clearly stated to observationalists that the lack of easy-to-access, standardized, in situ measurements hindered scientific achievement. They also emphasized the need for spatially extensive measurements and annual to sub-annual accumulation measurements to coincide with the spatial and temporal scales covered by modeling and remote sensing methods. A public, annual to decadal, standardized time series of measurements was recommended. Modeling and remote sensing studies require validation measurements (e.g., Arthern et al., 2006; Burgess et al., 2010; Kuipers Munneke et al., 2015; Koenig et al., 2016), ideally at the same spatial (typically tens of kilometers) and temporal (typically sub-annual) resolutions as the model. These observations are needed over large polar regions and are difficult for an individual researcher to compile. Today, most field measurements for validation are dispersed across multiple data centers and datasets in differing formats.
Some previous Arctic and Antarctic studies have compiled large sets of measurements, generally accumulation measurements (e.g., Mock, 1967a, b; Ohmura and Reeh, 1991; Vaughan and Russell, 1997; Favier et al., 2013; US ITASE, Mayewski et al., 2013; Wang et al., 2016; Machguth et al., 2016b; Thomas et al., 2017; Matsuoka et al., 2018), though most cover only a small region of the ice sheet, are not annually resolved, and/or are not publicly available through a data distribution center.
Here, we present the July 2018 SUMup dataset and its three subdatasets: density, accumulation, and snow depth on sea ice. This data paper fully describes the dataset and includes an analysis of the data over the GrIS, demonstrating how compiling previously dispersed measurements into a standardized dataset increases our knowledge of surface mass balance processes. Uses of SUMup include model validation, remote sensing validation and algorithm development, and long-term monitoring efforts. SUMup measurements should not be used to assess individual measurement errors or establish errors on specific retrieval methods. This is because (1) the spatial/temporal variability of snow depth on sea ice is naturally large due to atmospheric processes, including accumulation and aeolian processes, further increased by sea ice characteristics such as age, drift, and ridging (e.g., Warren et al., 1999; Sturm et al., 2002), and (2) the spatial/temporal variability of density and accumulation on land ice is also large due to atmospheric processes, including accumulation, temperature, solar radiation, and aeolian processes, further increased by ice elevation, topography, melt, and water flow processes (e.g., Alley, 1988; Courville et al., 2007; Laepple et al., 2016; Vandecrux et al., 2018). The field measurements in SUMup were not designed to and cannot control this naturally occurring variability.

Overview
The SUMup dataset is an expandable, community-based dataset of field measurements of surface mass balance components that is consistent in format, properly described through metadata, and publicly available. The July 2018 SUMup dataset contains three subdatasets that consist of measurements of snow/firn density (https://doi.org/10.18739/A2JH3D23R), snow accumulation on land ice (https://doi.org/10.18739/A2DR2P790), and snow depth on sea ice (https://doi.org/10.18739/A2WS8HK6X). The SUMup dataset is a living document, meant to be expanded as new measurements are taken or previous measurements discovered. The July 2018 release expands and replaces the three smaller previous releases of July 2013, July 2015, and July 2017. The measurements compiled in SUMup are from the polar regions and most date from 1950 to the present day. Some ice cores and radar data contained both post- and pre-1950 accumulation measurements, and we did not exclude the pre-1950 data in order to keep the records intact. Pre-1950 accumulation estimates represent ∼ 2 % of the accumulation subdataset. Figure 1 shows the locations of density and accumulation measurements represented by the July 2018 SUMup dataset. Locations of snow depth on sea ice measurements are not shown on this map due to the broad spatial sampling. Density and accumulation measurements are often co-located over the ice sheets where ice cores were collected (Fig. 1).

Sources
SUMup measurements were collected, formatted, and compiled primarily through two methods: (1) searching data archives that traditionally host cryospheric data, which included Pangaea (https://www.pangaea.de/), the Arctic Data Center (https://arcticdata.io/), the NOAA's National Climate Data Center (https://www.ncdc.noaa.gov/), and the National Snow and Ice Data Center (https://nsidc.org/), and (2) asking members of the cryospheric community to contribute field measurements. Keyword searches for the first method used the terms "density", "accumulation", and "snow depth on sea ice". For point measurements, annual and sub-annual accumulation measurements are included. For data from spatially extensive methods (e.g., from radar isochrones), near-annual resolution is required. Various data types are compiled into the SUMup dataset, including handwritten notes, technical reports, and digital files. Each measurement in the SUMup dataset contains a citation to the original source of the data. Based on keyword searches, data for this release, July 2018, should include most relevant measurements posted to the data archives listed above before May 2018. It is possible, however, that datasets were missed by keyword searches, and the community is encouraged to contact the authors directly about any missing datasets that should be included in future releases of SUMup. Specifically, we are working to standardize data from the Japanese Antarctic Research Expedition (JARE) to add in a future release of SUMup. New and unique data sources are included in the SUMup dataset. Notably, the snow density subdataset includes snow pit data from Carl Benson's Greenland traverses in the early 1950s and data from 1955 that previously had not been digitally scanned (Benson, 2013, 2017). The 1955 notebooks are only archived in the National Snow and Ice Data Center paper archives.
The SUMup dataset also includes snow accumulation measurements from Summit Station, Greenland's stake network called the Bamboo Forest (Dibb and Fahnestock, 2004), and corresponding density measurements at monthly temporal resolution (Dibb et al., 2007). Additionally, more widely used data sources are included, such as US International Trans-Antarctic Scientific Expedition (US ITASE, Mayewski et al., 2013) ice cores, the Program for Arctic Regional Climate Assessment (PARCA, Mosley-Thompson et al., 2001) ice cores, and the Greenland Inland Traverse (GrIT, Hawley et al., 2014) snow pits and ice cores. Section 2.4 provides more details on the specific sources for each of the three subdatasets, including the complete list of all citations.

Contributing to the dataset
The SUMup dataset will continue to expand on an annual basis as new measurements are taken and/or old measurements are discovered. Beyond expanding the current subdatasets, we expect to add further subdatasets on surface mass balance processes, which may include, but are not limited to, snow/ice albedo, snow temperature, and shortwave/longwave radiation measurements. The community is encouraged to contribute data or suggest missing data sources/types to add to SUMup by contacting the authors directly.

Structure and metadata
Each measurement contains common variables, including the date taken, latitude, longitude, surface elevation if on land, the measurement itself, the error associated with the measurement, the method by which the measurement was taken, and a citation to the original source of the measurement. By convention, negative latitudes represent south and negative longitudes represent west. For measurements that did not specify a month and day, but provided only the year ("yyyy"), the date was entered as "yyyy0000". A fill value of −9999 was used for unknown or unmeasured parameters. Measurements can be separated into direct measurements, when the instrument measures the desired parameter directly, and derived measurements, when the instrument measures a parameter related to the primary parameter and a known relationship is used to derive the desired measurement. In this paper, we refer to both direct and derived measurements as measurements. Measurement methods are fully listed in each dataset's metadata by number. For clarity, in the density dataset, methods (defined in the readme files) 1-4, 6-9, and 13 are direct measurements (e.g., density cutters, ice core sections), while methods 5, 10-12, and 14-15 are derived measurements (e.g., neutron density probe, X-ray microfocus computer tomography, gamma-ray attenuation). In the accumulation dataset, methods 1 and 3 are direct measurements (e.g., ice core sections and stake measurements), while method 2 is derived (radar isochrones). All snow depth on sea ice measurements are direct measurements. Uncertainties of measurements were only recorded if provided with the original measurement.
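The conventions above (year-only dates stored as "yyyy0000", the −9999 fill value, and signed coordinates) can be handled with a few small helpers. The following is a minimal sketch, not code distributed with SUMup: the function names are hypothetical, and it assumes that dates are stored as eight-digit yyyymmdd strings, which the text implies but does not state outright.

```python
from datetime import date
from typing import Optional, Tuple

FILL_VALUE = -9999  # fill value for unknown or unmeasured parameters (from the text)


def parse_sumup_date(raw: str) -> Tuple[int, Optional[date]]:
    """Return (year, full_date); full_date is None when month or day is '00'.

    Assumes an 8-digit 'yyyymmdd' layout (an assumption of this sketch).
    """
    raw = raw.strip()
    if len(raw) != 8 or not raw.isdigit():
        raise ValueError(f"expected an 8-digit date string, got {raw!r}")
    year, month, day = int(raw[:4]), int(raw[4:6]), int(raw[6:8])
    if month == 0 or day == 0:
        return year, None  # only the year is known
    return year, date(year, month, day)


def mask_fill(value: float) -> Optional[float]:
    """Map the fill value to None so it cannot enter statistics by accident."""
    return None if value == FILL_VALUE else value


def hemispheres(lat: float, lon: float) -> Tuple[str, str]:
    """Negative latitude is south, negative longitude is west (dataset convention)."""
    return ("S" if lat < 0 else "N", "W" if lon < 0 else "E")
```

Masking the fill value before any averaging matters in practice, because a single −9999 left in a column of densities or elevations would dominate a mean.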
More detail on general uncertainty for measurement methods used in SUMup is available for density cutters (Conger and McClung, 2009), a neutron density probe (Morris, 2008), density and conductivity mixed permittivity (DECOMP) (Wilhelms, 2005), and gamma-ray attenuation (Wilhelms, 1996). These can be applied to measurements as appropriate for individual scientific application. If any of the original measurements/metadata were unclear or non-existent, the original author of the data was contacted to clarify inconsistencies or questions. Snow density measurements outside the physically plausible range (< 0 or > 1000 kg m⁻³) were rejected. Specific details on measurement methods and citations for each subdataset are included in the SUMup metadata files hosted at the Arctic Data Center and described below.

Snow accumulation on land ice
The snow accumulation on land ice subdataset of SUMup contains over 230 000 unique measurements (Fig. 1). Table 2 describes the parameters for each accumulation measurement. The measurement methods include ice cores and/or boreholes, snow pits, radar isochrones, and stake measurements. Arctic measurements are predominantly from ice cores and stake measurements and include one near-annual radar transect in southeastern Greenland (Bolzan and Strobel, 1999a-g, 2001a, b; Mosley-Thompson et al., 2001; Dibb and Fahnestock, 2004; Miège et al., 2013). The Antarctic measurements are predominantly from ice cores and include two radar transects, one in West Antarctica and one in East Antarctica (Wagenbach et al., 1994a; Graf et al., 1999a-m; Schlosser and Oerter, 2002a, b; Spikes et al., 2005; Graf and Oerter, 2006a-y; Anschütz and Oerter, 2007a-f; Banta  Mayewski et al., 2013; Medley et al., 2013; Philippe et al., 2016). In most instances accumulation (in water equivalent, w.e.) was provided in the original measurement; however, the Summit Station, Greenland, Bamboo Forest measurements consist of weekly surface height change at 100 stakes along with snow density (Dibb and Fahnestock, 2004). We multiplied the height change by the coincident snow density and averaged across all stakes to get accumulation measurements for SUMup. Similarly, the Bolzan and Strobel data (1999a-g, 2001a, b) provided a snow pit depth, year, and density that were converted to accumulation. Most of the accumulation measurements are annually resolved, with the major exceptions being the radar measurements, which are approximately decadal, and Bamboo Forest data, which are approximately monthly.
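The Bamboo Forest conversion described above (height change multiplied by coincident snow density, then averaged across stakes) can be illustrated as follows. This is a sketch under stated assumptions rather than the processing code used for SUMup: the function name is hypothetical, and dividing by a water density of 1000 kg m⁻³ to express the result in metres of water equivalent is an assumption of this example.

```python
from statistics import mean
from typing import Sequence

WATER_DENSITY = 1000.0  # kg m^-3; assumed reference density for water equivalent


def stake_accumulation_we(height_changes_m: Sequence[float],
                          snow_density_kg_m3: float) -> float:
    """Mean accumulation (m w.e.) over a stake network for one interval.

    height_changes_m   : surface height change at each stake (m); may be negative
    snow_density_kg_m3 : coincident surface snow density (kg m^-3)
    """
    if not height_changes_m:
        raise ValueError("at least one stake measurement is required")
    # Convert each stake's height change to water equivalent, then average.
    per_stake = [dh * snow_density_kg_m3 / WATER_DENSITY for dh in height_changes_m]
    return mean(per_stake)


# Three hypothetical stakes and a hypothetical density of 300 kg m^-3.
example = stake_accumulation_we([0.05, 0.07, 0.03], 300.0)
```

A negative mean height change yields a negative value, which is how a stake network can report a small negative accumulation for an interval.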

Snow depth on sea ice
The snow depth on sea ice subdataset is the sparsest within SUMup, with ∼ 92 000 unique measurements. Table 3 describes the parameters for each snow depth measurement. The measurement methods include rulers, magnaprobes, avalanche probes, and snow corers. The Arctic measurements span from 1990 to 2018 and cover areas off the coast of Finland (Eero Rinne, personal communication, 2012), near Kotzebue Sound, and Utqiagvik, Alaska (Turner et al., 2018a, b), Eureka and Nunavut, Canada, from the Environment and Climate Change Canada campaign (ECCC 2014) (King et al., 2015), near Ellesmere Island from the CryoSat-2 Validation Experiment (CryoVEX) (Haas et al., 2017), Elson Lagoon from the Bromine, Ozone, and Mercury Experiment (BROMEX) (Webster et al., 2014), and Prudhoe Bay, Alaska, from the Ice and Climate Experiment (ICEX 2011) (Gardner et al., 2012). The Antarctic observations are from the Sea Ice Mass Balance in the Antarctic (SIMBA) dataset, which was collected from the research vessel/ice breaker N.B. Palmer in September and October 2007 in the Bellingshausen Sea (Lewis et al., 2011). We note that large, standardized datasets of radar-derived snow depth on sea ice are available through the IceBridge Sea Ice Freeboard, Snow Depth and Thickness product and similar products derived from the IceBridge Snow Radar (Kurtz, 2012; Kwok et al., 2017). These snow depths from Operation IceBridge are no longer included in the SUMup dataset as of July 2015, but are included in a separate archive due to their size.

Spatial and temporal data analysis
The SUMup dataset is intended for broad use by the scientific community in a variety of research studies. Tables 4 and 5 provide the basic descriptive statistics for each subdataset for the Arctic and Antarctic, respectively. These tables provide a coarse overview of the data; however, when using the SUMup datasets, subsetting by location, time, depth, etc. will likely be required for specific applications. The minimum value for accumulation in the Arctic is −0.004 m w.e. a⁻¹, which represents an ablation value from monthly Bamboo Forest measurements at Summit Station, Greenland. In total, there are 5 months, all occurring in separate years, with small negative accumulation measurements from Summit Station. These negative accumulation measurements are likely due to the ablation processes of sublimation or negative wind redistribution; however, a stake measurement alone cannot determine the underlying process of a surface height change.
Field data collected over the vast polar regions have spatial and temporal sampling bias, as the time, cost, and logistics required to systematically sample these regions are prohibitive. We describe the SUMup dataset here to elucidate possible bias. All the measurements in SUMup, with the exception of one location, were collected during the spring/summer season for that polar region, roughly April through August for the Arctic and October through February for the Antarctic. Summit Station, Greenland, the only GrIS station with year-round operations, is the one exception in the dataset where temporally consistent, year-round measurements are taken. Below, we summarize the spatial and temporal distributions of the SUMup dataset by subdatasets. For the two largest subdatasets, snow density and accumulation, we present analysis over the GrIS (Sect. 3.4). This analysis is meant to be an introduction to the dataset and is not exhaustive. We encourage the community to continue to use and more fully exploit this dataset. Figure 2 provides a bar graph of the measurement methods that make up each subdataset. It shows that measurement techniques with high spatial (e.g., radar isochrones) and high depth (e.g., neutron probes) resolution dominate the number of measurements in a subdataset; however, they often have limited spatial coverage with respect to the entire region.

Snow density
Measurements of snow/firn density were compiled from ∼ 306 sites in the Arctic and ∼ 164 sites across Antarctica (Fig. 1). The majority of these measurements come from snow pits and ice cores on the GrIS and AIS; however, there are seven locations of snow density measurements on sea ice in the Bellingshausen Sea. Here, we analyze only the ice sheet measurements. The density subdataset is dominated (98 % of data) by high vertical depth resolution measurements (millimeter scale for ∼ 100 m) from X-ray microfocus computer tomography, neutron density methods, gamma-ray attenuation, OPTV, and DECOMP measurements taken on cores or in boreholes at 24 locations in northern Greenland (Wilhelms, 2000a-d; Miller and Schwager, 2000a, b; Schaller et al., 2016, 2017) and 44 locations in Antarctica (Gerland and Wilhelms, 1999; Oerter et al., 1999a-h, 2000a; Graf et al., 2002a-o; Kreutz et al., 2011; Hubbard et al., 2013; Schaller et al., 2017) (Fig. 2). Because of the high depth resolution of these measurements at millimeter scale, compared to more typical density measurements at decimeter to meter scale, these data saturate histogram representations of this subdataset. For this reason, we do not use these measurements in the following analysis, thus providing a more realistic overview of the fraction of the density measurements taken throughout time and space. Figure 3 provides histograms showing the fraction of density measurements taken by year for Antarctica, Greenland excluding Summit Station, and Summit Station. (Summit Station was defined as a bounding box of 72 to 73° N and 38 to 39° W.) Summit Station measurements are plotted separately because this unique site provides the only location on the GrIS with year-round measurements over multiple years. The histograms for Antarctica and Greenland show spikes through time related to major collection campaigns.
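The Summit Station bounding box defined above translates directly into a coordinate test under the dataset's sign convention (negative longitudes are west). A minimal sketch follows; the function name is hypothetical, treating the box edges as inclusive is an assumption, and the example coordinates are approximate and for illustration only.

```python
def is_summit_station(lat: float, lon: float) -> bool:
    """True if a site falls in the 72 to 73 deg N, 38 to 39 deg W bounding box.

    West longitudes are negative, so 38 to 39 deg W is -39 <= lon <= -38.
    Inclusive edges are an assumption of this sketch.
    """
    return 72.0 <= lat <= 73.0 and -39.0 <= lon <= -38.0


# Approximate, illustrative coordinates (not taken from the dataset).
sites = {
    "near Summit Station": (72.58, -38.46),
    "elsewhere on the GrIS": (75.0, -42.0),
    "West Antarctica": (-79.47, -112.09),
}
summit_only = {name for name, (lat, lon) in sites.items()
               if is_summit_station(lat, lon)}
```

A common slip is to pass the west longitudes as positive numbers, which silently places the box in the wrong hemisphere, so checking the sign convention first is worthwhile.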
Antarctic density measurements peak from 1990 to 2000, related to a large number of ice cores taken in Dronning Maud Land and the Filchner Ronne Ice Shelf (Oerter et al., -g, 2000a; Graf et al., 2002a-o). Greenland measurements first peak in the early 1950s with measurements from Benson's traverses. They peak again in the late 1980s through the 1990s, related to the activities surrounding the GISP2 and PARCA ice cores (Alley, 1999; Bolzan and Strobel, 1999a-g; Mosley-Thompson et al., 2001), and peak for a third time in the early 2010s with the Greenland Inland Traverse and the Arctic Circle Traverse cores (Hawley et al., 2014). Measurements from Summit Station steadily increase in time from 1987 to 2014, with a slight peak in the late 1990s and early 2000s related to additional measurements surrounding the PARCA and GISP2 projects (Dibb and Fahnestock, 2004; Alley, 1999). Figure 4 provides an overview of the distributions of depths sampled by the density subdataset. Overall, the number of measurements decreases with depth. The Antarctic measurements decrease less uniformly with depth, which is related to the larger number of deeper ice cores. The majority of Greenland measurements are above 5 m and there are very few measurements below 20 m, demonstrating the large number of shallow cores collected across Greenland. At Summit Station, the majority of the measurements are taken above 1 m as a result of systematic tasking to dig ∼ 1 m snow pits at approximately monthly intervals since 2003. The deep (100 m plus) measurements at Summit come from the GISP2 ice core (Alley, 1999).

Accumulation
Measurements of accumulation over land ice were taken at ∼ 91 locations in Antarctica and ∼ 36 locations in Greenland. These include two radar traverses that span several hundreds of kilometers in Antarctica, and a 75 km radar traverse in southeastern Greenland (Fig. 1). In total, 62 % of the accumulation measurements are from the Arctic, all within Greenland, with < 1 % of the overall measurements coming from Summit Station. The Antarctic contributes the remaining 38 % of the measurements. The accumulation subdataset is dominated (96 % of data, Fig. 2) by high horizontal spatial resolution (tens of meters) radar accumulation measurements taken from three ice sheet transects (Fig. 1). These data saturate histogram representations of this subdataset and are not used in the following analysis to provide a more realistic overview of the fraction of accumulation measurements taken throughout time and space. Figure 5 provides histograms showing the fraction of accumulation measurements taken by year for Antarctica, Greenland excluding Summit Station, and Summit Station. Year, in this case, is defined as the year in which the ice core, snow pit, etc. were collected/dug. The histograms for the Antarctic and Greenland show sporadic spikes through time corresponding to major collection campaigns, similar to yet more exaggerated than in the density subdataset. Antarctic measurements peak in the early 2000s when US ITASE ice cores were collected in West Antarctica (Mayewski et al., 2013). Greenland accumulation measurements peak in the late 1980s with ice cores preparing for the GISP2 core and in the late 1990s when the PARCA ice cores (Mosley-Thompson et al., 2001) were collected. Summit Station has a constant monthly collection of accumulation measurements from August 2003 to August 2016 from the Bamboo Forest measurements (Dibb and Fahnestock, 2004) and represents the only year-round collection of accumulation measurements in the SUMup dataset.
While understanding that the date when accumulation measurements are taken is important, it is also important to understand the year represented by a sample, corresponding to the depth. Figure 6 provides the distribution of years when annual accumulation was measured from 1950 to present. Antarctica has a relatively even distribution of accumulation measurements until 2000 when the number of samples decreases. This decrease is due to the fact that many of the cores collected by US ITASE from 2006 to 2008 in East Antarctica could not be dated to determine accumulation and also shows that most of the firn cores collected date back to 1950 or later. The Greenland accumulation measurements peak between 1980 and 2000. The mostly shallow ice cores in Greenland, and relatively higher accumulation rates compared to Antarctica, result in less data from 1950 to 1980 in the ice cores. The sharp decline in the 2000s is due to a lack of coring efforts that occurred during that decade in Greenland. Summit Station has a consistent year-round sampling of accumulation from 2003 to 2016. These systematic measurements significantly outnumber the single measurements per year collected from ice cores at Summit Station that sample the decades before 2000.

Snow depth on sea ice
The ∼ 92 000 measurements of snow depth on sea ice are mostly from the Arctic, representing 97 % of the measurements, and the Antarctic represents the remaining 3 %. The Arctic measurements span from 1990 to 2018 (Fig. 7).

Analysis over the Greenland Ice Sheet
Recent warming over the GrIS, including a melt event in 2012 that covered nearly the entire surface (Nghiem et al., 2012), has increased both snow density and snow accumulation in recent decades (e.g., Morris and Wingham, 2014; Machguth et al., 2016a; Overly et al., 2016). Improved measurements, or models, of density and its evolution with time are needed to reduce uncertainties when converting altimetry measurements into total ice sheet mass balance (e.g., Zwally and Jun, 2002; Shepherd et al., 2012) and when converting radar isochrones into measurements of accumulation (e.g., Koenig et al., 2016). Many models use mean annual temperature and accumulation to model the spatial and temporal evolution of density (e.g., Herron and Langway, 1980; Reeh et al., 2005; Kuipers Munneke et al., 2015). Some studies, however, show that density models generally underestimate surface (< 1 m depth) density measurements (Koenig et al., 2016), while other studies point to the importance of the surface boundary condition for density models when comparing to measurements (Kuipers Munneke et al., 2015; Bellaire et al., 2017). Fausto et al. (2018) find that mean annual temperature is a poor predictor of snow density from 0 to 10 cm depth. Here, we look more closely at the density and accumulation measurements within the SUMup dataset over the GrIS and their sampling distributions with respect to temperature, elevation, and latitude.
Greenland density distributions with elevation, latitude, and temperature
Figure 8 shows the distribution of density measurements with elevation and latitude compared to the total distribution of elevations and latitudes for the entire GrIS. The fraction of the elevation in 250 m bins (red line of Fig. 8) for the Greenland Ice Sheet is derived from the CryoSat-2 Greenland digital elevation model (DEM; Helm et al., 2014a, b). Figure 8 uses similar graphing techniques to those of Fausto et al. (2018) to clearly show sampling bias in the observation dataset. If there were no sampling bias, the fraction of measurements would be similar to the fraction of values from the DEM. This is not the case. For elevation (Fig. 8a) we see that elevations below 3000 m are undersampled, with the exception of the 1750-2000 m bin, and elevations above 3000 m are largely oversampled. The measurements are therefore biased to higher, inland elevations which, if averaged, would likely cause a low bias in sampled densities. Figure 8b shows that our dataset is sampled best over central Greenland. More measurements are required from lower elevations and southern (< 70° N) and northern (> 78° N) latitudes to fill the gaps in the current dataset and reduce spatial bias. Because mean annual air temperature is a parameter often used to model density (e.g., Herron and Langway, 1980; Reeh et al., 2005), Fig. 9 shows the distribution of density measurements in Greenland in relation to 3 m mean annual air temperature estimated by the Modèle Atmosphérique Régional (MAR) model version 3.5 with a horizontal resolution of 25 km. We used the National Centers for Environmental Prediction-National Center for Atmospheric Research Reanalysis version 1 (NCEP-NCARv1) forced MAR 3.5 simulation (run from 1948 to 2015) to find the mean annual 3 m air temperature for the year corresponding to when the density measurement was taken.
The NCEP-NCAR forcing was chosen because it is more reliable than ERA forcings. The red line in Fig. 9 shows the distribution of annual average temperatures (derived from 1990 to 2015) for the entire GrIS. Figure 9 clearly shows a preferential sampling of GrIS regions with lower temperatures. Cold temperatures (−20 °C and below) are oversampled in the density dataset, while temperatures above −14 °C, which make up ∼ 30 % of the GrIS, make up less than 6 % of the sampled densities. As with elevation, the density sampling distribution by mean annual temperatures likely results in a low-density bias when trying to characterize the entire GrIS. In general, the density measurements in SUMup across the GrIS oversample cooler, inland regions and undersample warmer, coastal regions. Figure 10 plots all sites in Greenland with density measurements coincidently sampled to depths of 10, 25, 50, and 100 cm compared to the mean annual temperature. No clear relationship (squared Pearson correlation coefficient, R² = 0 to 0.137) between mean annual temperature and density is seen in our data until ∼ 1 m depth (R² = 0.272), where higher temperatures correspond to higher density. This result suggests that in the top 1 m of snow/firn on the GrIS, in the colder, more inland areas, temperature may not be the primary variable leading to densification. Solar radiation, layering in firn, and wind processes (e.g., Liston et al., 2007; Hörhold et al., 2011) are likely important in these regions and require snow density models that account for these processes. Due to the spatial sampling bias in this dataset, melt processes are likely not a primary process in determining snow density for these measurements; however, melt processes will contribute more in the future (Nghiem et al., 2012).
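For readers who wish to repeat the temperature and density comparison on their own subset of sites, the statistic involved is the square of the Pearson correlation coefficient. The sketch below computes it with the standard library only; the function names and the sample values are hypothetical, and it is not the analysis code behind Fig. 10.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation (fraction of variance in a linear fit)."""
    return pearson_r(x, y) ** 2


# Hypothetical values: mean annual temperature (deg C) and mean density (kg m^-3).
temperature = [-30.0, -25.0, -20.0, -15.0]
density = [300.0, 320.0, 340.0, 360.0]
perfect_fit = r_squared(temperature, density)
```

Because R² discards the sign of the correlation, the direction of any relationship (here, whether warmer sites are denser) has to be read from the sign of `pearson_r` or from the scatter plot itself.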

Accumulation distributions with elevation and latitude
Snowfall over the GrIS can also be parameterized by elevation and latitude. Figure 11 shows the distributions of the accumulation measurements over the GrIS by elevation and latitude. As with the density measurements, the accumulation measurements all come from high elevations on the GrIS (> 1750 m), with the highest elevations (> 3000 m) being largely oversampled. The sampling across latitudes is the most evenly distributed; however, latitudes above 78° N represent a gap in the dataset. We do not compare the accumulation subdataset with mean annual temperatures here because each year of accumulation has a different mean annual temperature associated with it. We deem it beyond the scope of this analysis and suggest this as a future study that could be researched using the SUMup dataset.
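The sampling-bias comparison used for elevation (fraction of measurements per bin against the fraction of the ice sheet surface in that bin) can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names are hypothetical, the 250 m bin width follows the elevation bins described for the DEM, and the sample elevations are invented.

```python
from collections import Counter
from typing import Dict, Sequence


def bin_fractions(values: Sequence[float], bin_width: float = 250.0) -> Dict[float, float]:
    """Fraction of values in each bin, keyed by the bin's lower edge."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return {edge: n / len(values) for edge, n in sorted(counts.items())}


def sampling_ratio(site_values: Sequence[float],
                   reference_values: Sequence[float],
                   bin_width: float = 250.0) -> Dict[float, float]:
    """Ratio of measured fraction to reference fraction per bin.

    > 1 means the bin is oversampled, < 1 undersampled, 0 not sampled at all.
    Bins that never occur in the reference (e.g. the DEM) are omitted.
    """
    measured = bin_fractions(site_values, bin_width)
    reference = bin_fractions(reference_values, bin_width)
    return {edge: measured.get(edge, 0.0) / frac for edge, frac in reference.items()}


# Invented elevations (m): measurement sites and DEM grid cells.
site_elevations = [3100.0, 3200.0, 3050.0, 2100.0]
dem_elevations = [500.0, 1200.0, 2100.0, 2200.0, 3100.0, 3150.0, 1900.0, 2600.0]
ratios = sampling_ratio(site_elevations, dem_elevations)
```

The same two functions apply to latitude or temperature by changing `bin_width`; the choice of reference distribution (a DEM for elevation, a model field for temperature) is what defines "unbiased" in each case.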

Year-round density and accumulation measurements from Summit Station
Summit Station is the only site in the dataset, and on the GrIS, that has been systematically sampled for density and accumulation on a nearly monthly basis. Hence, it is the only location on the GrIS where the seasonal evolution of snow surface density can be followed over the long term (more than a decade). Figure 12 shows the monthly mean surface density to depths of 10, 25, 50, and 100 cm. A seasonal cycle is evident in the 10 and 25 cm depth mean densities, with a decrease (trough) in density in late summer (August/September) and an increase (peak) in April. The decrease in summer density is likely due to surface hoar, a low-density snow crystal formation that is well known to form at Summit Station in the summer when wind speeds are low and humidity is relatively high (e.g., Alley et al., 1990; Albert and Schultz, 2002; Dibb and Fahnestock, 2004). As wind speeds increase and water vapor decreases in the winter, the surface snow increases in density. The seasonal signal in density is damped out by 1 m depth at Summit Station. Figure 12 also shows larger natural variability in average density measurements in the top 50 cm compared to the top 100 cm. This is expected as the deeper snow is more insulated from atmospheric and radiative processes in this dry-snow-zone location. Figure 13 shows the monthly mean accumulation at Summit Station. Accumulation is highly variable, with slightly lower values in the early summer months (May/June/July). Dibb and Fahnestock (2004) showed a similar trend in stake measurements at Summit Station from just 2 years of data and explained that the summer season may not actually be seeing a decrease in accumulation, but that thinning layers and densification may be causing the stake measurements to not rise as much in the summertime compared to the wintertime when a snowfall event occurs. Determining whether there is a true decrease in summer accumulation or an increase in snow/firn compaction rate at Summit Station requires additional research.
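A monthly mean surface density "to a depth" requires two steps: averaging each snow pit profile down to the chosen depth, then grouping the results by calendar month. The sketch below shows one way to do this. It assumes a thickness-weighted average over layered pit measurements, which the text does not specify, and all function names and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Sequence, Tuple

Layer = Tuple[float, float, float]  # (top_cm, bottom_cm, density_kg_m3)


def mean_density_to_depth(layers: Sequence[Layer], depth_cm: float) -> float:
    """Thickness-weighted mean density from the surface down to depth_cm."""
    covered = 0.0
    weighted = 0.0
    for top, bottom, rho in layers:
        overlap = min(bottom, depth_cm) - max(top, 0.0)
        if overlap > 0:
            covered += overlap
            weighted += overlap * rho
    if abs(covered - depth_cm) > 1e-9:
        raise ValueError("profile does not fully cover the requested depth")
    return weighted / depth_cm


def monthly_means(records: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Average (month, value) pairs by calendar month (1 to 12)."""
    by_month = defaultdict(list)
    for month, value in records:
        if not 1 <= month <= 12:
            raise ValueError(f"invalid month: {month}")
        by_month[month].append(value)
    return {m: mean(v) for m, v in sorted(by_month.items())}


# Invented pit: 0-10 cm at 280, 10-30 cm at 320, 30-100 cm at 360 kg m^-3.
pit = [(0.0, 10.0, 280.0), (10.0, 30.0, 320.0), (30.0, 100.0, 360.0)]
top_25cm = mean_density_to_depth(pit, 25.0)
```

Rejecting profiles that do not reach the requested depth, rather than averaging whatever is available, keeps shallow pits from biasing the deeper means toward surface values.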

Data availability
The SUMup dataset is currently available through the Arctic Data Center, which hosts our three subdatasets in both CSV and NetCDF formats along with metadata files that further explain the methods and citations. The dataset will be updated annually.

Discussion and conclusion
We present and describe the SUMup dataset, an expandable, community-based dataset of field measurements of surface mass balance components that is consistent in format, has clearly defined metadata, and is publicly available. The subdatasets include compiled measurements of snow/firn density (https://doi.org/10.18739/A2JH3D23R), accumulation on land ice (https://doi.org/10.18739/A2DR2P790), and snow depth on sea ice (https://doi.org/10.18739/A2WS8HK6X) from the Arctic and Antarctica. As seen in SUMup, the measurements over the GrIS and AIS are sporadic in time and space, peaking during specific field campaigns and lapsing in between, which makes it difficult to monitor change and to understand processes from field measurements. This is especially problematic for parameters like density and accumulation that change with both seasonal and climatic atmospheric conditions. Overall, there are gaps in density and accumulation data from ∼ 2000 forward and from locations on the periphery of the ice sheets. While there currently is a temporal gap in the most recent decades, we note that the GreenTrACS traverses collected cores across the GrIS in 2016 and 2017, including at previous PARCA sites. Once these cores are processed, they will help fill some of the time gaps for the GrIS (Robert Hawley, personal communication, 2017).
Density and accumulation measurements of the GrIS oversample cooler, inland regions and undersample coastal, warmer regions. Oversampling the cooler inland regions may lead to an underestimation of the total average surface density, especially in the summer season, when the measurements undersample regions with significant melt processes that increase density. No clear relationship between mean annual temperature and density is seen in the data at depths shallower than 1 m; at a depth of 1 m, a relationship between higher temperatures and increased density is observed. This suggests that additional parameters, such as wind speed and radiative balance, should be considered when modeling density for the GrIS at SUMup density locations and depths above 1 m. Summit Station, Greenland, is the only location with year-round density and accumulation measurements in the dataset, and on the GrIS, and seasonal cycles are evident in accumulation rate and in density for depths above 50 cm.
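One simple way to quantify the strength of a temperature-density relationship at a given depth is a Pearson correlation coefficient between site mean annual temperature and mean density over that depth interval. The sketch below is illustrative only: it is not necessarily the method used in this analysis, and the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / math.sqrt(sxx * syy)
```

Applied separately to each depth interval, with one (temperature, density) pair per site, a coefficient near zero would correspond to the absence of a clear relationship, while a value approaching 1 would indicate that warmer sites tend to have denser snow. Because the sites oversample cool inland regions, any such coefficient describes the sampled locations rather than the ice sheet as a whole.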
Our analysis of the SUMup dataset shows gaps in ice sheet measurements in recent decades and in low-elevation regions on the periphery of the ice sheets. These are the exact regions where climate change has had, and will continue to have, the largest effects on the Greenland and Antarctic ice sheets (e.g., Shepherd et al., 2012; Enderlin et al., 2014; IMBIE Team, 2018) and where additional future measurements are warranted.
We encourage the cryospheric community to contribute additional field data to the SUMup dataset. We also encourage the community, including modelers and scientists working in the field of remote sensing, to use this dataset for validating surface mass balance models and for developing satellite- or airborne-sensor algorithms. SUMup is a dynamic, living dataset and is expected to be expanded and released annually.
Author contributions. LM compiled the SUMup dataset into the July 2017 and July 2018 datasets, developed the metadata, and reformatted the dataset. She made all figures for this paper and co-wrote the paper. LK co-wrote this paper and developed the first SUMup dataset in 2013. PA helped with the development of the SUMup dataset, performed the initial comparison of the SUMup data to the MAR model, and contributed to the writing of this paper.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. Lynn Montgomery and Lora Koenig acknowledge National Science Foundation grant PLR 1603407 for funding this work. We thank our two anonymous reviewers and our editor, Reinhard Drews, for providing thorough insight and commentary that helped to greatly improve the quality of the manuscript. Publication of this article was funded by the University of Colorado Boulder Libraries Open Access Fund.