A synthesis dataset of permafrost-affected soil thermal conditions for Alaska, USA

Recent observations of near-surface soil temperatures over the circumpolar Arctic show accelerated warming of permafrost-affected soils. The availability of a comprehensive near-surface permafrost and active layer dataset is critical to better understanding climate impacts and to constraining permafrost thermal conditions and their spatial distribution in land system models. We compiled a soil temperature dataset from 72 monitoring stations in Alaska using data collected by the U.S. Geological Survey, the National Park Service, and the University of Alaska Fairbanks permafrost monitoring networks. The array of monitoring stations spans a large range of latitudes from 60.9 to 71.3°N and elevations from near sea level to ∼1300 m, comprising tundra and boreal forest regions. This dataset consists of monthly ground temperatures at depths up to 1 m, volumetric soil water content, snow depth, and air temperature during 1997–2016. These data were quality controlled during collection and processing. In addition, we evaluated the harmonization of the processed dataset. The final product (PF-AK, v0.1) is available at the Arctic Data Center (https://doi.org/10.18739/A2KG55).


Introduction
Permafrost is frozen ground that remains at or below 0 °C for at least two consecutive years and may be found within about a quarter of the terrestrial land area in the Northern Hemisphere and 80 % of the land area in Alaska (Brown et al., 1998; Zhang et al., 1999; Jorgenson et al., 2008). A continuous increase in near-surface air temperatures over the Alaskan Arctic (Wang et al., 2017) causes warming and thawing of permafrost, which is expected to continue throughout the 21st century with impacts on ecosystems and infrastructure (Callaghan et al., 2011; Hinzman et al., 2013; Liljedahl et al., 2016; Shiklomanov et al., 2017; Melvin et al., 2017). Thaw may have global consequences due to the potential for a significant positive climate feedback related to newly released carbon previously stored within the permafrost (Abbott et al., 2016; Schaefer et al., 2014; Knoblauch et al., 2018). Modeling studies indicate that greenhouse gas emissions following thaw would amplify current rates of atmospheric warming (McGuire et al., 2018). However, large uncertainties exist regarding the timing and magnitude of this permafrost-carbon feedback, in part due to challenges associated with the representation of permafrost processes in climate models and the lack of comprehensive permafrost datasets with which to test such models (Koven et al., 2015; McGuire et al., 2018). There is an immediate need for ready-to-use, reliable near-surface permafrost datasets, including ground temperatures, soil moisture, and related climatic factors (such as air temperature and snow depth), which can serve as benchmarks for the modeling community and help evaluate potential physical, societal, and economic impacts.
The permafrost extent map by Brown et al. (1998) is one of the most widely used metrics for comparing permafrost model results against ground-based data (Koven et al., 2015; McGuire et al., 2018). Another widely used dataset in model validation is the Russian soil temperature dataset of daily ground temperature measurements at different depths ranging from 0 to 3.2 m for 51 years (Sherstiukov, 2012). An additional ground temperature dataset includes daily-mean ground temperatures at various depths from 0 to 3.2 m at more than 800 stations in China, which for selected locations date back to the 1950s (Wang et al., 2015). In addition to shallow borehole ground temperature data (i.e., depths less than 3 m), there are datasets that archive temperatures from much deeper boreholes (generally > 5 m) (Clow, 2014; Biskaborn et al., 2015). Moreover, the Circumpolar Active Layer Monitoring network measures active layer thickness, the maximum soil depth above permafrost that thaws every summer and refreezes in the winter (Brown et al., 2000; Shiklomanov et al., 2008). Here, we consolidated data from shallow borehole ground monitoring stations across Alaska from multiple government agencies. Shallow borehole data are important because they record the most immediate response to changing environmental conditions, whereas deep ground temperatures take extensive time to respond.
A typical permafrost monitoring station consists of an air temperature sensor, a snow depth sensor, soil moisture sensors, and soil temperature sensors. In situ observations of ground temperatures from the Alaskan Arctic region have been dispersed over different monitoring efforts, which are spread over varying time spans, and are observed at nonstandardized depths. The maximum depth of a typical monitoring station ranges from 1 to 3 m below the ground surface. However, not all stations use this design. For example, the National Park Service of Alaska network does not collect soil moisture data. Also, data from permafrost monitoring stations are not archived in a common standardized format and are hosted by different academic and government agencies, such as the Arctic Data Center, the Global Terrestrial Network for Permafrost (GTN-P), the Long Term Ecological Research Network (LTER), and the U.S. Geological Survey (USGS). Thus, we compiled a ready-to-use permafrost dataset in order to allow for efficient data retrieval and processing for permafrost-related analyses.
We compiled the first integrated shallow ground temperature dataset for permafrost-affected soils across Alaska from the three most reliable monitoring networks operating over the past several decades: the Geophysical Institute Permafrost Laboratory at the University of Alaska Fairbanks (GI-UAF), the National Park Service in Alaska (NPS), and the USGS. This synthesis permafrost dataset for Alaska (PF-AK, version 0.1) includes measured air and ground temperatures at depth intervals up to 1.0 m, snow depth, and soil volumetric water content (VWC) for 72 permafrost monitoring stations across the state of Alaska. Detailed information and metadata are provided for the compiled dataset so that potential users can have a full understanding of the data and their associated limitations. Furthermore, two types of data evaluation were implemented: (i) testing for inconsistencies between air and ground temperature trends and (ii) the use of the snow and heat transfer metric (SHTM) to validate the relations between seasonal temperature amplitudes and snow depth. These technical evaluations help demonstrate that the data are harmonized and support their reuse.

Permafrost monitoring networks
Our synthesis permafrost dataset for Alaska (Fig. 1 and Table 1) is based on observed in situ data collected by the USGS, NPS, and GI-UAF teams. In the late 1990s, researchers at the GI-UAF established a near-surface permafrost monitoring system consisting of 27 stations across Alaska, primarily along the Trans-Alaskan Highway (Fig. 1). Similarly, the USGS installed permafrost stations to monitor permafrost conditions within the two federally managed areas on the North Slope, the National Petroleum Reserve Alaska and the Arctic National Wildlife Refuge. Since August 1998, the USGS has maintained 17 automated stations in the area, spanning latitudes from 68.5 to 70.5°N and longitudes from 142.5 to 161°W (Fig. 1) (Urban and Clow, 2017). NPS has monitored ground temperatures since 2004 at several sites in national parks (Hill and Sousanes, 2015). All monitoring stations are installed on undisturbed land (Fig. 2) at a minimum specified distance from nearby infrastructure. This installation protocol helps ensure that no biases arise from anthropogenic or ecosystem disturbances, which is one of the main differences from traditional meteorological stations, which are often associated with airstrips and villages in Alaska. A brief description of the environmental characteristics of each site, including dominant soil and vegetation type, is given in Table 2. Due to the differences in the station design and description used by the various teams, the soil and vegetation descriptions may not be fully comparable and are not available at all sites.
These networks use radiation-shielded thermistors (Campbell Scientific CSI 107 temperature probes) to monitor air temperature. In the GI-UAF and NPS networks, the air temperature sensors were installed at 1.5 or 2.0 m above the ground surface, whereas the USGS network monitors air temperature at 3.0 m above the ground surface in order to minimize damage by wildlife.
Instruments used in ground temperature monitoring are specified in Table 3. To monitor near-surface ground temperatures, the networks use either a probe with several thermistors embedded within a single rod, typically 1.0 to 1.5 m long, or several individual Campbell Scientific 107 thermistors anchored at specified depths within a single hole. The thermistor temperature sensors are designed to record temperatures ranging from −30 to 75 °C, with the exception of the 107 sensors, which record temperatures from −35 to 50 °C.
An ice-bath calibration is a required procedure before installation of the GI-UAF temperature probes. This calibration includes placing the sensors into an insulated container filled with a mixture of ice shavings and distilled water, measuring the temperature, and recording the offset from 0 °C. The measured offset is then used to correct the temperature measurements. The average accuracy of these sensors is ±0.01 °C (Romanovsky et al., 2008). For the USGS network, the thermistor sensors are installed inside a tight-fitting fluid-filled plastic tube, 1.25 m long, to measure ground temperatures at depths of 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.45, 0.70, 0.95, and 1.20 m (Urban and Clow, 2017). Newer USGS ground sensors are calibrated in the USGS temperature calibration facility, while the older ones were calibrated in situ using an inversion (Urban and Clow, 2017). The NPS has three to four soil temperature sensors (CSI-107) installed in individual holes at depths of 0.10, 0.20, and 0.50 m, and at several locations an additional sensor is located at 1.00 m. The ground-measurement depths vary station by station within the GI-UAF network, typically ranging from the ground surface (i.e., 0 m) to 1 m below the ground surface. It is important to note that most of the installed probes experience frost heave over time, and sensor depths are adjusted accordingly each year by subtracting the heave. The USGS and NPS teams estimate frost heave by using ground temperature data from the topmost thermistor (at a depth of 0.05 or 0.10 m). If the temperature of the top thermistor during the thaw period exceeds air temperature, then the sensor is considered exposed or partly exposed to solar radiation. The GI-UAF team measures frost heave at every site and then subtracts the heave depth from the known sensor depths to correct for heaving (Romanovsky et al., 2008). Each team corrects for heaving every summer, and corrections are applied before releasing data. Our presented data thus already account for frost heave and consist of corrected ground temperatures.
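The two corrections described above (the ice-bath offset and the yearly frost-heave depth adjustment) can be illustrated with a minimal sketch. The function names, the millimetre rounding, and the example values are assumptions made for illustration only; they are not taken from the networks' own processing code.

```python
def apply_ice_bath_offset(measured_c, ice_bath_reading_c):
    """Correct a temperature using the sensor reading recorded in the ice bath.

    The ice bath is at 0 °C, so the reading taken there is the sensor offset.
    """
    return measured_c - ice_bath_reading_c


def heave_corrected_depths(nominal_depths_m, heave_m):
    """Subtract the measured frost heave (m) from the nominal sensor depths (m).

    Depths are rounded to the nearest millimetre (an illustrative choice).
    A result <= 0 would indicate a sensor at or above the ground surface.
    """
    return [round(depth - heave_m, 3) for depth in nominal_depths_m]


# Hypothetical example: a sensor that read +0.02 °C in the ice bath,
# and a probe that heaved by 3 cm.
corrected_temp = apply_ice_bath_offset(-4.98, 0.02)
corrected_depths = heave_corrected_depths([0.10, 0.50, 1.00], 0.03)
```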
Both the USGS and the GI-UAF networks measure liquid soil moisture using a HydraProbe sensor developed by Stevens Water Monitoring Systems Inc. The Stevens HydraProbe has a reported accuracy of ±0.03 m³ m⁻³ (Bellingham, 2015). Each volumetric water content sensor was calibrated in accordance with the manufacturer's recommendations. Uncertainties associated with the sensor's sensitivity still exist under certain specific conditions, e.g., for peat. The measured liquid soil moisture from a HydraProbe cannot be directly compared with the total soil moisture content values produced by land system models because in most of the models, soil moisture includes both ice and liquid water, whereas HydraProbe sensors only measure liquid soil moisture. The USGS network measures soil moisture at one depth, approximately 0.15 m below the ground surface in all cases. The soil moisture sensor depths vary between stations for the GI-UAF network because they are installed at representative depths depending on the soil profile and texture within the active layer. The GI-UAF network typically measures soil moisture at three different depths within the active layer, ranging from 0.10 to 0.60 m. The NPS network does not include moisture probes at any of its monitoring stations. Our processed dataset only presents the upper layer (up to 0.25 m) soil water content.
Snow depth is measured once per hour with an SR50 or SR50A ultrasonic distance sensor (Campbell Sci. Inc.) at all of the available stations. This downward-looking sensor is mounted on a crossarm, typically at 2.5 m above the ground surface for the USGS and NPS networks and 1.5 m above the ground surface for the GI-UAF network. The factory-evaluated accuracy is ±0.01 m or 0.4 % of the distance to the ground surface. It is important to note that vegetation at the ground surface might influence shallow snow depth measurements.

Data processing workflow
All three networks apply data processing and quality-control checks before release. Typically, quality control occurs shortly after annual summer field campaigns; the fully processed and quality-controlled data become publicly available a year after the data collection. Figure 3 shows a schematic representation of the data processing workflow used to compile our synthesis dataset. To standardize the ground temperature depths in the dataset, we linearly interpolated ground temperatures to the target depths of 0.25, 0.50, 0.75, and 1.00 m. We only implemented interpolation for those stations with measurements at four or more depths, which assures a relatively small interval around the specified target depths. In addition, soil temperatures were not extrapolated beyond the maximum observed depth at any site; ground surface temperature is only calculated when supporting measurements are available. The calculated soil temperature at a specific depth therefore depends on the linear slope between the observations at adjacent depths, so the interpolated profile is piecewise linear rather than a single straight line from the ground surface to 1 m. We examined the uncertainty resulting from our linear interpolation method for the most data-sparse case, i.e., when we only have observations at four depths. To do so, we selected an entire year of data without any missing values or depths and used linear interpolation to predict temperatures at five depths. Then we randomly selected only four depths and interpolated again using these four depths. This analysis demonstrates that while missing depths reduce the number of available interpolation results, the influence of missing depths is limited.
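The depth standardization described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function name, the NaN convention for unavailable targets, and the example profile are assumptions. It interpolates between adjacent measured depths, skips profiles with fewer than four valid depths, and never extrapolates beyond the measured range.

```python
import math

TARGET_DEPTHS_M = (0.25, 0.50, 0.75, 1.00)
MIN_DEPTHS = 4  # profiles with fewer valid depths are not interpolated


def interpolate_profile(depths_m, temps_c, targets=TARGET_DEPTHS_M):
    """Linearly interpolate one temperature profile to the target depths.

    Returns {target_depth: temperature}. A target is NaN when it lies outside
    the measured depth range (no extrapolation) or when fewer than MIN_DEPTHS
    valid measurements exist.
    """
    pairs = sorted(
        (d, t) for d, t in zip(depths_m, temps_c)
        if t is not None and not math.isnan(t)
    )
    result = {target: math.nan for target in targets}
    if len(pairs) < MIN_DEPTHS:
        return result
    for target in targets:
        # Find the pair of adjacent measured depths that brackets the target.
        for (d0, t0), (d1, t1) in zip(pairs, pairs[1:]):
            if d0 <= target <= d1:
                weight = 0.0 if d1 == d0 else (target - d0) / (d1 - d0)
                result[target] = t0 + weight * (t1 - t0)
                break
    return result


# Hypothetical profile measured at 0.0, 0.2, 0.4, and 0.8 m.
profile = interpolate_profile([0.0, 0.2, 0.4, 0.8], [-2.0, -3.0, -4.0, -6.0])
```

In this example the 1.00 m target stays empty because the deepest measurement is at 0.8 m, which mirrors the no-extrapolation rule.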
The USGS and NPS networks release data at hourly resolution, whereas the GI-UAF network releases data at daily resolution. Since the most common output interval of land system and global climate models is monthly, monthly means were calculated for all variables, including air and ground temperatures, snow depth, and soil water content. In addition to monthly data, annual means were calculated to allow evaluation of the relationship between air and ground temperatures. Thus, the dataset also provides annual statistics, including mean-annual air temperature (MAAT); mean-annual ground surface temperature (MAGST); mean-annual ground temperature (MAGT) at 0.25, 0.50, 0.75, and 1.00 m; mean and maximum seasonal snow depth (SND); and maximum, mean, and minimum soil volumetric water content (VWC).
Data from many sites have gaps and discontinuities due to harsh environmental conditions and wildlife that may interrupt the monitoring. There are various methods for calculating monthly means from incomplete time series data. For example, the USGS standards allow only 5 % of missing values for both monthly and annual mean temperature data (Urban and Clow, 2017). The World Meteorological Organization (WMO) does not allow gaps of more than three consecutive days or more than 5 days total in each monthly data series (Plummer et al., 2003). Other researchers are more tolerant of missing data, acknowledging the difficulty of data collection in remote cold regions. Menne et al. (2009) allow up to 10 missing days in a monthly time series. Bieniek et al. (2014) calculated monthly averages using at least 15 days. Here we calculated monthly means for any station that has at least 20 days of measurements for that specific month. The annual means were calculated from daily data. Due to the scarcity of the data, we only calculated the annual means for those years with a coverage of at least 90 % of the daily data. For this reason, we present the annual means for air and ground temperatures as well as soil moisture, derived from daily data, separately from the monthly means.
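The completeness rules stated above (at least 20 days for a monthly mean, at least 90 % daily coverage for an annual mean) can be sketched in plain Python. The data layout (a dict keyed by `datetime.date`), the function names, and the synthetic example are assumptions for illustration; hourly records would first need to be reduced to daily values.

```python
import calendar
import math
from collections import defaultdict
from datetime import date, timedelta

MIN_DAYS_PER_MONTH = 20
MIN_ANNUAL_COVERAGE = 0.90


def _is_valid(value):
    return value is not None and not math.isnan(value)


def monthly_means(daily):
    """daily: {datetime.date: value}. Returns {(year, month): mean or NaN}."""
    buckets = defaultdict(list)
    for day, value in daily.items():
        bucket = buckets[(day.year, day.month)]
        if _is_valid(value):
            bucket.append(value)
    return {
        key: sum(vals) / len(vals) if len(vals) >= MIN_DAYS_PER_MONTH else math.nan
        for key, vals in sorted(buckets.items())
    }


def annual_means(daily):
    """daily: {datetime.date: value}. Returns {year: mean or NaN}."""
    buckets = defaultdict(list)
    for day, value in daily.items():
        bucket = buckets[day.year]
        if _is_valid(value):
            bucket.append(value)
    result = {}
    for year, vals in sorted(buckets.items()):
        days_in_year = 366 if calendar.isleap(year) else 365
        enough = len(vals) / days_in_year >= MIN_ANNUAL_COVERAGE
        result[year] = sum(vals) / len(vals) if enough else math.nan
    return result


# Synthetic example: a constant series for 2010 with 9 missing days in February.
series = {date(2010, 1, 1) + timedelta(days=i): 1.0 for i in range(365)}
for d in range(20, 29):
    series[date(2010, 2, d)] = math.nan
monthly = monthly_means(series)  # February has only 19 valid days
annual = annual_means(series)    # 356 of 365 days are valid
```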
During the dataset compilation, we identified similarly named sites with different installation times and locations that do not match precisely. It is important to note that these sites, even when located near each other, may have considerably different environmental conditions and thus different ground thermal regimes. A unique name is assigned to each site. The Deadhorse site, maintained by GI-UAF, and the Awuna site, maintained by USGS, have new monitoring stations, and the old ones have been decommissioned. The new and retired systems ran simultaneously for a few months in order to evaluate the data consistency. The environmental conditions for the newer Deadhorse station remained the same, assuring data consistency. Environmental conditions at the two Awuna monitoring stations are quite different: the original Awuna site was located on a ridge, whereas the new site is in a valley 1.9 km away. Nevertheless, the temperature data are consistent between the old and new stations at Awuna. The old site (Awuna1) did not monitor soil moisture, which would be expected to be more site-specific and spatially variable. Thus, in this dataset, we present the records of both the new and old sites.

Derived variables
We calculated three derived variables from the monthly temperature curve at each site: (i) degree days of freezing (DDF), (ii) degree days of thawing (DDT), and (iii) frost number (FN). Nelson and Outcalt (1987) and Zhang et al. (1996) have demonstrated that these variables calculated from monthly data closely correspond to those calculated from daily data. DDT and DDF are given by

\[ \mathrm{DDT} = \int T \,\mathrm{d}t, \quad T > 0\,^{\circ}\mathrm{C}, \]

and

\[ \mathrm{DDF} = \int |T| \,\mathrm{d}t, \quad T < 0\,^{\circ}\mathrm{C}. \]

The FN index was calculated for both air temperature and ground temperatures following Nelson and Outcalt (1987):

\[ \mathrm{FN} = \frac{\sqrt{\mathrm{DDF}}}{\sqrt{\mathrm{DDF}} + \sqrt{\mathrm{DDT}}}. \]

Here, T is the temperature, the integration runs over one year, and dt is a day. FN serves as a simplified index for the likelihood of permafrost occurrence. An FN index of 0.5 implies equal freezing and thawing indices. An FN index > 0.5 indicates that the annual period of freezing dominates thaw, implying climate conditions that promote permafrost.
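As a worked illustration of these derived variables, the sketch below approximates the degree-day sums from twelve monthly means by weighting each month by its number of days, and then forms the frost number of Nelson and Outcalt (1987) as the ratio of the square root of DDF to the sum of the square roots of DDF and DDT. The day-weighting of monthly means, the non-leap-year month lengths, and the function names are assumptions; the source states only that the variables were derived from the monthly temperature curve.

```python
import math

# Month lengths for a non-leap year (an illustrative simplification).
DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def degree_days(monthly_temps_c):
    """Return (DDT, DDF) in °C days from 12 monthly mean temperatures."""
    if len(monthly_temps_c) != 12:
        raise ValueError("expected 12 monthly means, January to December")
    ddt = sum(t * n for t, n in zip(monthly_temps_c, DAYS_IN_MONTH) if t > 0)
    ddf = sum(-t * n for t, n in zip(monthly_temps_c, DAYS_IN_MONTH) if t < 0)
    return ddt, ddf


def frost_number(ddt, ddf):
    """FN = sqrt(DDF) / (sqrt(DDF) + sqrt(DDT)); NaN if both sums are zero."""
    denominator = math.sqrt(ddf) + math.sqrt(ddt)
    return math.nan if denominator == 0 else math.sqrt(ddf) / denominator


# Hypothetical site: -10 °C from January to June, +10 °C from July to December.
ddt, ddf = degree_days([-10.0] * 6 + [10.0] * 6)
fn = frost_number(ddt, ddf)
```

In this example thawing slightly outweighs freezing because July to December contains three more days than January to June, so FN falls just below 0.5.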

Data evaluation
Although the individual station observations had already been quality controlled, we still needed to evaluate the harmonization of the compiled dataset. We implemented two methods of evaluation. The first compares the trends in air and ground temperatures, while the second examines the effects of snow on the ground's thermal state.
The primary objective of the trend analysis is to evaluate the consistency between trends at each station (for different depths) and between stations rather than to characterize interannual variability. Most of the estimated trends are based on a short observational period (see Table 1). We chose to show trends only for those stations with more than 10 available annual means. Currently, some of the time series are still too short to provide significant trends. As more data become available in the future, a more rigorous analysis will be possible. It is well known that climatic trend analysis requires more than 30 years of time series (IPCC, 2013). On the other hand, Box et al. (2005) showed that 15 years is sufficient for interannual variability diagnosis to be statistically significant. Since the time series for most of the stations do not exceed 15 years, we calculated trends for temperatures at different depths only to identify inconsistencies between air and ground temperature trends in terms of differences in sign.
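A sign-consistency check of this kind can be sketched with an ordinary least-squares slope. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the threshold is written as "at least 10 annual means", and the example series are synthetic rather than station data.

```python
import math

MIN_YEARS = 10  # assumed minimum number of annual means for a trend


def linear_trend(years, values):
    """Least-squares slope (units per year), or None for short records."""
    pairs = [
        (y, v) for y, v in zip(years, values)
        if v is not None and not math.isnan(v)
    ]
    if len(pairs) < MIN_YEARS:
        return None
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    sxx = sum((y - mean_y) ** 2 for y, _ in pairs)
    sxy = sum((y - mean_y) * (v - mean_v) for y, v in pairs)
    return None if sxx == 0 else sxy / sxx


def signs_consistent(slopes):
    """True when all available slopes are positive or all are negative."""
    available = [s for s in slopes if s is not None]
    return all(s > 0 for s in available) or all(s < 0 for s in available)


# Synthetic annual means: air warms by 0.1 °C per year, ground by 0.2 °C per year.
years = list(range(2001, 2013))
maat = [-10.0 + 0.1 * (y - 2001) for y in years]
magst = [-8.0 + 0.2 * (y - 2001) for y in years]
air_slope = linear_trend(years, maat)
ground_slope = linear_trend(years, magst)
consistent = signs_consistent([air_slope, ground_slope])
```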
The second evaluation effort examines the physical mechanism linking air temperature, snow cover, and ground thermal states, which is an auxiliary evaluation of the dataset. Seasonal snow cover keeps the ground warm by reducing cooling (or heat loss) during the winter (Yershov and Williams, 2004). Considering a semi-infinite column, the damping of the ground temperature annual cycle depends on both snow depth and soil thermal properties. In this study, the snow period is defined as October through March. We averaged the snow depth measurements over the period to obtain the effective snow depth (SND_eff) (Slater et al., 2017). The amplitudes of air temperature (Amp_air) and ground surface temperature (Amp_gnd) were calculated following Slater et al. (2017) for those stations with available snow depth data. The snow and heat transfer metric (SHTM) captures the correlation between the normalized temperature amplitude difference (Amp_norm) and the effective snow depth.

Results
Table 4 presents an overview of the data compiled in the dataset for Alaska. Our dataset comprises 41 667 data points in total. A substantial amount of data is missing (e.g., some stations do not have soil moisture sensors installed), and the observational periods differ between sensors (e.g., in some cases air temperature sensors were installed earlier than other sensors). Excluding the missing time series when certain instruments were not installed, the percentage of complete data is about 77 %. Figure 4 shows an annual summary of our core variables, including mean-annual air temperature, ground surface temperature, and ground temperatures at 0.25, 0.50, 0.75, and 1.00 m. Overall, mean-annual air temperatures are colder than −10 °C in the Alaskan Arctic, while in the southern mountain tundra regions they are close to the freezing point (−0.5 °C at the RUGA2 site).
Mean-annual ground surface temperatures for the 46 available sites range from −7.6 to 2.5 °C, which, as expected, is considerably warmer than the mean-annual air temperature. For most of the sites, ground temperatures could be determined at depths of 0.25 and 0.50 m (69 and 67 sites, respectively). Ground temperatures at depths of 0.25 and 0.50 m range roughly from −7.8 to 3.3 °C. Mean-annual ground temperature at 0.75 m varies from −7.5 to 1.2 °C over the 49 available sites. Ground temperatures at 1 m could only be determined at 32 sites, most of which are located in the southern portion of the Alaskan Arctic (∼62°N). Mean-annual ground temperatures at this depth range from −7.8 to 1.2 °C.

Overview of this dataset
The VWC shown in Table 4 is from the upper part of the soil (i.e., depth of up to 0.25 m). The VWC measurements are mainly available from the North Slope of Alaska. Maximum VWC is important for understanding active layer dynamics during summer. Notably, the spatial variance of the maximum VWC is 3 times larger than that of the annual means. Three sites, Chandalar Shelf, Pilgrim Hot Springs, and Red Sheep Creek, were much wetter than other sites (maximum VWCs exceeding 0.7 m³ m⁻³). This is mainly because these sites are close to a water body. Snow depth is spatially variable over Alaska, although with a general trend of increasing snow depth in the southern part of the state, according to the synthesis dataset (Fig. 5). In the Alaskan Arctic, snow cover is shallower than in the southeast region. The maximum seasonal snow depth was > 1.5 m at the Gates Glacier station (which is located near the glacier) in Wrangell St. Elias National Park. The lowest maximum snow depth occurs at West Dock near the Beaufort Sea in Prudhoe Bay, with only 0.09 m in 2010. Similar magnitudes of snow thickness were reported at West Dock during the period 1983–1993 (Zhang et al., 1997). Two other sites, Asik in Noatak National Park and Serpentine in Bering Land Bridge National Preserve, also showed a shallow snow cover in recent years. The thin snow cover is probably due to wind exposure.

Data evaluation
In this dataset, we derived the FN index for air and ground temperatures at various depths (Fig. 6 and Table 5). Because many stations do not have sensors at depths > 1 m, we report the DDT-DDF indices of air, ground surface, and 0.5 m below the ground surface in Fig. 6, with all available results listed in Table 5. Overall, almost all stations have an air FN above 0.5. Stations on the North Slope have both air and ground surface FNs exceeding 0.6. In interior and southern Alaska, air FNs are above 0.5, although the ground surface FNs are much lower due to the thicker snow cover in this region. In the Alaskan Arctic, DDTs at the ground surface are generally lower than those of air according to the station observations. There are 13 stations with a zero DDT based on ground temperature data at 0.5 m. These results indicate a shallow active layer (< 0.5 m) at these sites. Another five stations have a DDT for ground temperature at 0.5 m of less than 10 °C days. The calculated frost number indices are consistent with the existing permafrost distribution map over Alaska (Jorgenson et al., 2008).
We examined the consistency among the trends of MAAT, MAGST, and MAGT at 1 m depth. Typically, if MAAT has a long-term positive trend, then MAGST is expected to have a positive trend, even if the rate is dampened. Similarly, the signs of the trends in MAGST and MAGT at 1 m depth, and in MAAT and MAGT at 1 m depth, are hypothesized to be consistent. Here we show the annual mean temperatures at four stations, Drew Point, Fish Creek, Niguanak, and Tunalik, with 10 or more years of data (Fig. 7). Mean-annual air, ground surface, and ground temperatures at 1 m indicate consistent warming at rates of 0.07–0.18, 0.14–0.23, and 0.12–0.22 °C year⁻¹, respectively. A notable feature is that at Fish Creek, ground surface temperature and ground temperature at 1 m showed amplified warming rates compared to the magnitude of the air temperature increase, which can be explained by the significant increase in seasonal snow depth over the same period. There are six stations with relatively long records (≥ 10 years) of air, ground surface, and ground temperature at 0.5 m for the same period. In other words, at these sites, the data used to estimate linear trends of air, ground surface, and ground temperature at 0.5 m were collected over corresponding years. Figure 8 shows that air temperature, ground surface temperature, and ground temperature at 0.5 m have consistently positive trends. Furthermore, the trends at the ground surface and at 0.5 m were generally close.

Figure 6. Overview of spatial distribution of freezing-thawing index from air, ground surface temperature, and ground temperature at 0.50 m. Frost number (FN) was derived from the freezing-thawing index according to Nelson and Outcalt (1987).
There are several sites in a small area that indicated inconsistency in air temperature trends. The inconsistency is mainly due to different observational periods and the relatively short duration of records. For example, there are several Smith Lake (SL) permafrost monitoring stations, which are located north of the University of Alaska Fairbanks campus and west of Smith Lake, with varying environmental conditions. SL1 is in a white spruce forest with a high canopy; SL2 is in a dense diminutive black spruce forest; SL3 is located at the edge of the forest, surrounded by black spruce trees and tussock shrubs; and SL4 is characterized by hummocks of sedges (tussocks) and shrubby vegetation with sparse black spruce. The environmental conditions at the SL3 site are favorable for permafrost existence. The SL3 site has the longest air temperature record, indicating a cooling trend over the observational period (Fig. 9a). After calculating the differences between measured data for all three sites, we applied corresponding corrections and extended the data at all three sites. The overlap period (2006–2012) showed a consistent variation with a roughly constant offset between SL2 and SL3. By using the offset, we extended the records at SL3 to 2015. Figure 9b shows that extending the time series reduces the trend magnitude and changes the sign of the SL3 trend from negative to positive, demonstrating the important difference between trends derived from a complete longer time series and those derived from a sparse time series.
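The offset-based record extension described for SL2 and SL3 can be sketched as follows. This is a minimal illustration, assuming the offset is the mean difference between the two stations over their overlapping years; the function name and the annual values are hypothetical and are not the measured Smith Lake data.

```python
def extend_with_offset(target, reference):
    """Extend `target` with `reference` values shifted by their mean offset.

    target, reference: {year: annual mean}. Years already present in `target`
    are kept unchanged; years present only in `reference` are filled with the
    reference value plus the mean (target - reference) difference over the
    overlap. Returns (extended_record, offset).
    """
    overlap = sorted(set(target) & set(reference))
    if not overlap:
        raise ValueError("records do not overlap; offset cannot be estimated")
    offset = sum(target[y] - reference[y] for y in overlap) / len(overlap)
    extended = dict(target)
    for year, value in reference.items():
        if year not in extended:
            extended[year] = value + offset
    return extended, offset


# Hypothetical annual air temperatures (°C) for two nearby stations.
station_a = {2006: -3.0, 2007: -2.5, 2008: -3.5}
station_b = {2006: -2.0, 2007: -1.5, 2008: -2.5, 2009: -1.0}
extended_a, mean_offset = extend_with_offset(station_a, station_b)
```

A roughly constant offset over the overlap, as reported for 2006–2012, is what justifies this simple shift; a drifting offset would call for a different adjustment.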
Finally, we examined the physical relations among air temperature, snow cover, and ground thermal state (Fig. 10). Across stations, effective snow depth was generally less than 0.4 m. The normalized temperature amplitude difference (Amp_norm), which quantifies the temperature amplitude difference between air and ground surface, shows a positive linear relationship with effective snow depth. This correlation, the so-called SHTM (Slater et al., 2017), implies that snow insulation effects increase with effective snow depth, which is consistent with previous studies (Burn and Smith, 1988; Demezhko and Shchapov, 2001; Zhang, 2005; Morse et al., 2012; Slater et al., 2017). In addition, while snow is considered an important factor in winter ground temperature, vegetation can also affect the amplitude through its influence on summer temperature.

Conclusions
Figure 9. Comparison between trends calculated using measured data at SL1, SL2, and SL3 (a). Panel (b) shows merged data series and corrected trends at SL3. Shading shows the standard error of the linear regression estimates.

Changes in near-surface ground temperatures over time are important indicators of a changing climate because they provide vital information on the response of the permafrost to climate change. In this paper, we synthesize data from 72 monitoring stations in Alaska, spanning a large range of latitudes from 60.9 to 71.3°N and elevations from near sea level to 1327 m in tundra and boreal forest regions. This dataset consists of monthly ground temperatures at 0.25 m depth intervals up to 1 m, volumetric soil water content, snow depth, and air temperature during 1997–2016. The remoteness of the sites and the harsh environmental conditions inevitably result in missing data; our presented dataset is 77 % complete and consists of 41 667 data points. We describe the data compilation process, listing the workflow and the challenges associated with preparing the synthesis permafrost dataset for Alaska. These data were quality controlled during the data collection and processing stages. We also implemented a data harmonization evaluation for this compiled dataset. The PF-