the Creative Commons Attribution 4.0 License.
A Consolidated Database of Mercury Observations for Permafrost Regions
Abstract. Permafrost soils are one of the largest terrestrial pools of mercury (Hg) in the world, storing an estimated 500–1500 Gg of Hg in the top three meters of soil. Ongoing climate-driven thaw threatens to release this legacy Hg into the environment. Efforts to quantify and model this pool have been hindered by a lack of harmonized, spatially resolved observations. To address this, we compiled a database of 117,802 Hg observations collected between 1988 and 2022 from 59 studies across Arctic, sub-Arctic, and alpine permafrost regions of the Northern Hemisphere, including North America, northern Europe, Eurasia, and the Tibetan Plateau. The database includes Hg concentration measurements in solid materials (such as soil, leaves, roots, wood, and litter) as well as in water samples from soil porewater, lakes, and rivers across the Northern Hemisphere permafrost domain. The database enables cross-site synthesis, model calibration and evaluation, and environmental assessments by standardizing and harmonizing data from diverse sources. Data harmonization steps included unit conversion, categorization of observations by type, and quality control measures to ensure consistency across studies. Analytical uncertainty was preserved where reported in source studies, and qualitative uncertainty indicators and flags were applied where uncertainty information was incomplete or heterogeneous. Mercury concentrations vary widely across observations, with lake sediment showing the highest median values (70 ng g⁻¹, IQR: 45–116), followed by soil (50 ng g⁻¹, IQR: 32–90), and vegetation (15 ng g⁻¹, IQR: 9–33). Water observations had a median of 2 ng L⁻¹ (IQR: 2–6). Statistically significant differences in Hg concentrations among observation types were observed at both global and regional scales, consistently following the pattern: lake sediment > soil > vegetation.
These patterns, along with spatial and observation-type biases, highlight the need for improved coverage in underrepresented regions such as Eurasia. The database is freely accessible through Zenodo under the concept DOI 10.5281/zenodo.18300989 (all versions), to support ongoing research and model development in Arctic and sub-Arctic Hg cycle studies.
Status: final response (author comments only)
- AC1: 'Comment on essd-2025-640', Christine Olson, 05 Mar 2026
- RC1: 'Comment on essd-2025-640', Anonymous Referee #1, 07 Apr 2026
Comments to manuscript “essd-2025-640”
Overall, I find this manuscript to be a well-executed and highly valuable contribution to the field of permafrost mercury research. The authors have compiled an impressive database of 117,802 Hg observations from 59 studies spanning four decades, covering soil, sediment, vegetation, and water across the entire northern permafrost domain. The harmonization and quality control steps, including unit conversion, categorical classification, and the introduction of qualitative uncertainty flags, are thoughtful and largely appropriate, given the heterogeneity of the source data. The observed concentration patterns (lake sediment > soil > vegetation) are clearly presented and statistically robust at both global and regional scales. The database is freely accessible via a persistent DOI, which will greatly facilitate model calibration, cross-site synthesis, and environmental assessments. Despite a few areas for improvement (e.g., clarifying the water Hg fraction, addressing spatial biases), the overall quality and transparency of the data product are excellent. This paper represents a significant step forward in consolidating permafrost Hg observations and will undoubtedly become a foundational resource for the Arctic and alpine Hg cycling community. I therefore recommend publication after major revisions.
Below is a consolidated set of peer review comments for this ESSD manuscript. These comments range from brief clarifications to more detailed methodological and transparency requests. The authors should address them in a revised manuscript to strengthen the database’s reproducibility, usability, and scientific impact.
Major concerns:
- The abstract uses both “harmonization” and “standardization”. Please clarify exactly which variables were unit-converted, which were recalculated (e.g., dry-weight normalization), and whether any raw data underwent transformations beyond unit conversion (e.g., log transformation, outlier capping). A short table in the Methods would help clarify this.
- “Qualitative uncertainty indicators and flags” are mentioned but not defined. How were heterogeneous uncertainty reports from different studies aggregated? Please provide concrete examples of flags and their interpretation for users.
- The database covers North America, northern Europe, Eurasia, and the Tibetan Plateau, yet the abstract later calls Eurasia “underrepresented”, which indicates that spatial representativeness may be contradictory. Please clarify the following concerns: is all of Eurasia underrepresented, or only specific parts (e.g., Siberia vs. European Russia)? Quantify the sample imbalance (e.g., number of observations per 10⁶ km² by region).
- Observations span 1988-2022, but the abstract does not state whether repeat sampling at the same sites exists. If not, state this limitation explicitly. If yes, describe how users can identify time-series data (e.g., site ID flags). This is critical for studying accelerating thaw impacts, otherwise the temporal trends may not be addressable as currently structured.
- To the best of my knowledge, “vegetation” includes leaves, roots, wood, and litter. Litter is partially decomposed and may behave more like soil. Justify this grouping or, preferably, split litter into a separate observation type. At minimum, provide a breakdown of Hg concentrations by subcategory in the supplementary material of this manuscript.
- The abstract reports water Hg in ng L⁻¹ but does not state whether this is total Hg, methylmercury, dissolved, or unfiltered. These fractions have vastly different environmental interpretations. Please specify water Hg fraction clearly, and if multiple fractions exist, document how they were harmonized (or why they were pooled).
- Lines 230 and 242. The finding “lake sediment > soil > vegetation” is reported at global and regional scales. However, sample sizes are highly unbalanced (lake sediment dominates >90% of records according to one comment). Please re‑analyze using weighted statistics or randomly subsampled balanced datasets, or explicitly caution that regional patterns may reflect sampling bias rather than true environmental gradients.
- Line 124. Hg analysis techniques (CVAFS, CVAAS, ICP‑MS, etc.) differ in detection limits and accuracy. Was method type used as a quality control flag or harmonization step? If not, recommend adding a categorical variable “method category” to the database to allow users to filter by analytical rigor.
- Lines 124, 230, 255, and 368. Data provenance and source transparency should be described in more detail in the manuscript. For example, how were duplicate records across sources identified and removed? What were the data exclusion criteria, and what were the reasons for exclusion? How do the data break down by origin (peer-reviewed papers, open repositories, unpublished contributions), and how reliable is each source type?
- One comment notes that lake sediment comprises most of records (e.g., Table 1 and Figure 4). The authors should clarify the intended uses of this database: is it suitable for soil‑focused Hg modeling? For water‑vegetation interactions? A clear “limitations” subsection is therefore recommended, including guidance on which research questions the database should not be used for.
- Line 458. If possible, please provide a table summarizing, for each observation type, the percentage of records with: precise coordinates, sampling date, analytical uncertainty, soil horizon (if applicable), organic carbon content, and laboratory information. This allows users to filter data appropriately.
- The Zenodo DOI is provided in the manuscript. I would like to suggest outlining a formal versioning policy and long‑term stewardship plan, and describe how the community can contribute new data in the future.
Citation: https://doi.org/10.5194/essd-2025-640-RC1
- AC2: 'Reply on RC1', Christine Olson, 08 May 2026
Reviewer Comment 1:
The abstract uses both “harmonization” and “standardization”. Please clarify exactly which variables were unit-converted, which were recalculated (e.g., dry-weight normalization), and whether any raw data underwent transformations beyond unit conversion (e.g., log transformation, outlier capping). A short table in the Methods would help clarify this.
Response:
We have removed “harmonization” from the manuscript and revised Section 2.3 to clarify the definition of standardization. All Hg concentration values were standardized through unit conversion to common reporting units (ng g⁻¹ dry weight for solids and ng L⁻¹ for water). We further clarify that no transformations were applied to the original data beyond unit conversion, including log transformations, dry-weight recalculations, or normalization procedures. Outlier and range thresholds were applied only as quality-control flags (Sect. 2.4) and did not modify the underlying data values. A table of all variables and units is provided in the metadata file available through the GitHub project repository.
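The unit standardization described in this response can be illustrated with a minimal sketch. The conversion factors are standard metric relations; the set of source units, the matrix labels, and the function name `to_common_units` are assumptions made for illustration and are not taken from the PermHg code.

```python
# Conversion factors to the common reporting units named in the response:
# ng g^-1 (dry weight) for solids and ng L^-1 for water.
# The list of source units is an assumption; the source studies may use others.
SOLID_TO_NG_PER_G = {
    "ng/g": 1.0,
    "ug/kg": 1.0,     # 1 ug/kg equals 1 ng/g
    "ug/g": 1000.0,   # 1 ug/g equals 1000 ng/g
    "mg/kg": 1000.0,  # 1 mg/kg equals 1000 ng/g
}
WATER_TO_NG_PER_L = {
    "ng/L": 1.0,
    "pg/L": 0.001,
    "ug/L": 1000.0,
}


def to_common_units(value, unit, matrix):
    """Return (value, unit) in ng/g for solids or ng/L for water.

    Only a multiplicative unit conversion is applied; no log transform,
    dry-weight recalculation, or normalization takes place.
    """
    if matrix == "water":
        table, target = WATER_TO_NG_PER_L, "ng/L"
    else:
        table, target = SOLID_TO_NG_PER_G, "ng/g"
    if unit not in table:
        raise ValueError(f"unsupported unit {unit!r} for matrix {matrix!r}")
    return value * table[unit], target


print(to_common_units(0.05, "mg/kg", "soil"))
print(to_common_units(2.0, "ng/L", "water"))
```

Raising an error on an unknown unit, rather than passing the value through, keeps unconverted values from silently entering a standardized column.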
Reviewer Comment 2:
“Qualitative uncertainty indicators and flags” are mentioned but not defined. How were heterogeneous uncertainty reports from different studies aggregated? Please provide concrete examples of flags and their interpretation for users.
Response:
We have revised the manuscript to clarify the definition and use of uncertainty indicators and flags and to avoid ambiguity in terminology. Specifically, we replaced the phrase “qualitative uncertainty indicators” with “quality-control indicators” and expanded Section 2.4 to explicitly define the range and outlier flags used in the database. Range flags are assigned a value of “1” when observations exceed conservative screening thresholds (500 ng g⁻¹ for solids and 100 ng L⁻¹ for water), and “0” otherwise, while outlier flags identify values exceeding three times the interquartile range (3×IQR). We clarify that these flags are intended as screening tools to support user interpretation and do not imply data removal or modification. In addition, we now explicitly state that analytical uncertainty was preserved as reported in the original studies and was not aggregated or standardized across datasets due to heterogeneous reporting. These clarifications have been incorporated into Section 2.4 and the abstract.
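For users who want to reproduce flags of this kind on their own subsets, a minimal sketch follows. The thresholds (500 ng g⁻¹ and 100 ng L⁻¹) are those quoted in the response. The exact form of the 3×IQR rule is not stated here, so the sketch assumes the upper fence Q3 + 3×IQR; the quartile method (the default of Python's `statistics.quantiles`) and the function names are likewise assumptions.

```python
import statistics

# Screening thresholds quoted in the response (ng/g for solids, ng/L for water).
RANGE_THRESHOLD = {"solid": 500.0, "water": 100.0}


def range_flags(values, matrix):
    """Return 1 where a value exceeds the screening threshold, else 0."""
    limit = RANGE_THRESHOLD[matrix]
    return [1 if v > limit else 0 for v in values]


def outlier_flags(values, k=3.0):
    """Return 1 where a value lies above Q3 + k*IQR, else 0.

    Assumption: the 3xIQR rule is read as an upper fence above Q3.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    fence = q3 + k * (q3 - q1)
    return [1 if v > fence else 0 for v in values]


hg_solid = [10, 20, 30, 40, 50, 60, 70, 1000]  # synthetic values, ng/g
print(range_flags(hg_solid, "solid"))
print(outlier_flags(hg_solid))
```

Both functions return flag lists and leave the input values untouched, matching the statement that flags are screening tools and do not modify the data.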
Reviewer Comment 3:
The database covers North America, northern Europe, Eurasia, and the Tibetan Plateau, yet the abstract later calls Eurasia “underrepresented”, which indicates that spatial representativeness may be contradictory. Please clarify the following concerns: is all of Eurasia underrepresented, or only specific parts (e.g., Siberia vs. European Russia)? Quantify the sample imbalance (e.g., number of observations per 10⁶ km² by region).
Response:
We added Table 2 and accompanying text in Section 4.1 to quantify sampling density as the number of observations per 10⁶ km² of permafrost area by region and observation type. We also clarify in this section that Eurasia refers to Russia.
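The sampling-density metric is a simple ratio, sketched below. The region names, counts, and areas are placeholders chosen only to make the arithmetic visible; they are not values from Table 2 or from the database.

```python
def sampling_density(n_obs, permafrost_area_km2):
    """Observations per 10^6 km^2 of permafrost area."""
    if permafrost_area_km2 <= 0:
        raise ValueError("permafrost area must be positive")
    return n_obs / (permafrost_area_km2 / 1e6)


# Placeholder inputs (count, permafrost area in km^2), NOT values from Table 2.
regions = {"region_a": (1200, 2.0e6), "region_b": (90, 4.5e6)}
for name, (n_obs, area) in regions.items():
    print(name, round(sampling_density(n_obs, area), 1))
```

Normalizing by permafrost area rather than total land area keeps regions with little permafrost from appearing artificially undersampled.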
Reviewer Comment 4:
Observations span 1988-2022, but the abstract does not state whether repeat sampling at the same sites exists. If not, state this limitation explicitly. If yes, describe how users can identify time-series data (e.g., site ID flags). This is critical for studying accelerating thaw impacts, otherwise the temporal trends may not be addressable as currently structured.
Response:
We have clarified the treatment of repeat sampling in the manuscript. The database may include repeat observations at the same or nearby locations across different time periods; however, these are not explicitly flagged as time series. We have added text to Section 2.2 to note that users can identify potential repeat measurements using site identifiers, geographic coordinates, and sampling dates provided for each record. We also clarify that, while the dataset supports exploratory temporal analysis, it is not structured as a formal time-series dataset and should be used with caution for trend analysis.
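One way a user might look for potential repeat measurements is sketched below. The record layout (dictionaries with `site_id`, `lat`, `lon`, and an ISO-formatted `date`) and the coordinate rounding are assumptions for illustration; the actual field names are given in the database metadata.

```python
from collections import defaultdict


def find_repeat_sites(records, decimals=3):
    """Group records by site ID and rounded coordinates, and return the
    groups that were sampled in more than one calendar year."""
    years_by_site = defaultdict(set)
    for rec in records:
        key = (rec["site_id"], round(rec["lat"], decimals), round(rec["lon"], decimals))
        years_by_site[key].add(rec["date"][:4])  # assumes "YYYY-MM-DD" strings
    return {k: sorted(v) for k, v in years_by_site.items() if len(v) > 1}


# Synthetic records for illustration only.
records = [
    {"site_id": "A", "lat": 68.12345, "lon": -149.5, "date": "2005-07-01"},
    {"site_id": "A", "lat": 68.12349, "lon": -149.5, "date": "2015-08-01"},
    {"site_id": "B", "lat": 70.0, "lon": 100.0, "date": "2010-06-15"},
]
print(find_repeat_sites(records))
```

A match found this way is only a candidate: differences in depth, matrix, or method between visits still have to be checked before treating the pair as a time series.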
Reviewer Comment 5:
To the best of my knowledge, “vegetation” includes leaves, roots, wood, and litter. Litter is partially decomposed and may behave more like soil. Justify this grouping or, preferably, split litter into a separate observation type. At minimum, provide a breakdown of Hg concentrations by subcategory in the supplementary material of this manuscript.
Response:
We have retained litterfall within the vegetation category to preserve consistency with the original data sources, where it is reported as plant-derived material. However, litterfall represents a very small subset of the dataset (n = 8) and is not expected to influence overall vegetation statistics. We have revised the manuscript to clarify this classification and to de-emphasize litter in the abstract. Given the limited number of observations, we provide a brief description of vegetation subcomponents in the main text rather than introducing a separate category or supplementary table.
Reviewer Comment 6:
The abstract reports water Hg in ng L⁻¹ but does not state whether this is total Hg, methylmercury, dissolved, or unfiltered. These fractions have vastly different environmental interpretations. Please specify water Hg fraction clearly, and if multiple fractions exist, document how they were harmonized (or why they were pooled).
Response:
We have revised the manuscript to explicitly specify that water Hg concentrations reported in the abstract, figures, and tables refer to total Hg. While additional Hg species (e.g., dissolved Hg and methylmercury) are included in the database where available, these are not the focus of the analyses presented here. We have added clarifying text to the abstract, Section 2.2, and relevant figure and table captions to ensure this distinction is clear throughout the manuscript.
Reviewer Comment 7:
Lines 230 and 242. The finding “lake sediment > soil > vegetation” is reported at global and regional scales. However, sample sizes are highly unbalanced (lake sediment dominates >90% of records according to one comment). Please re‑analyze using weighted statistics or randomly subsampled balanced datasets, or explicitly caution that regional patterns may reflect sampling bias rather than true environmental gradients.
Response:
We have revised the manuscript to more explicitly acknowledge this limitation in Section 4.2 and to clarify that the comparison excluding Canada was intentionally used to evaluate the sensitivity of observed patterns to dataset composition, given that Canadian lake sediment observations dominate the database. We now note that while statistically significant differences among matrices persist, the ordering of Hg concentrations is sensitive to sampling distribution and may partially reflect sampling bias rather than intrinsic environmental gradients.
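The balanced-subsampling check suggested by the reviewer could be approximated by users as sketched below. This is not the analysis performed in the manuscript, which relied on the comparison excluding Canada; the function name, the number of draws, and the synthetic inputs are assumptions.

```python
import random
import statistics


def balanced_medians(groups, n_draws=200, seed=0):
    """For each group, draw repeated subsamples (without replacement) of the
    size of the smallest group and return the median of the draw medians."""
    rng = random.Random(seed)
    size = min(len(values) for values in groups.values())
    result = {}
    for name, values in groups.items():
        draw_medians = [
            statistics.median(rng.sample(values, size)) for _ in range(n_draws)
        ]
        result[name] = statistics.median(draw_medians)
    return result


# Synthetic concentrations (ng/g) with deliberately unequal group sizes.
groups = {
    "lake_sediment": [60, 70, 80] * 300,
    "soil": [40, 50, 60] * 20,
    "vegetation": [10, 15, 20],
}
print(balanced_medians(groups))
```

If the ordering of the subsampled medians differs from the ordering in the full dataset, the pattern is likely sensitive to sample size.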
Reviewer Comment 8:
Line 124. Hg analysis techniques (CVAFS, CVAAS, ICP‑MS, etc.) differ in detection limits and accuracy. Was method type used as a quality control flag or harmonization step? If not, recommend adding a categorical variable “method category” to the database to allow users to filter by analytical rigor.
Response:
Analytical method information is already included in the database through the “STHg_inst” field, which records the Hg measurement technique (e.g., ICP-MS, DMA, CVAAS) where available. We have clarified this in the manuscript (Sections 2.2 and 2.4). We agree that a standardized “method category” variable could be useful and will consider this for future database updates; however, harmonizing methods across studies would require additional interpretation beyond the scope of the current compilation.
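Until a standardized method category exists, users can filter on the reported technique directly. The sketch below uses the “STHg_inst” field named in the response; the dictionary record layout, the case-insensitive matching, and the function name are assumptions.

```python
def filter_by_method(records, methods, field="STHg_inst"):
    """Keep records whose reported technique matches one of `methods`.

    Matching ignores case and surrounding whitespace. Records with no
    technique reported are dropped.
    """
    wanted = {m.strip().upper() for m in methods}
    kept = []
    for rec in records:
        value = rec.get(field)
        if value and value.strip().upper() in wanted:
            kept.append(rec)
    return kept


# Synthetic records for illustration only.
records = [
    {"STHg_inst": "CVAAS", "hg": 42.0},
    {"STHg_inst": "icp-ms ", "hg": 55.0},
    {"STHg_inst": None, "hg": 61.0},
]
print(filter_by_method(records, ["ICP-MS", "DMA"]))
```

Exact string matching will miss spelling variants of the same technique, which is part of the interpretation problem the response describes.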
Reviewer Comment 9:
Lines 124, 230, 255, and 368. Data provenance and source transparency should be described in more detail in the manuscript. For example, how were duplicate records across sources identified and removed? What were the data exclusion criteria, and what were the reasons for exclusion? How do the data break down by origin (peer-reviewed papers, open repositories, unpublished contributions), and how reliable is each source type?
Response:
We have revised Section 2.2 to more clearly describe data provenance and screening procedures. All observations are linked to their original source through a paper identifier. Citation information is documented in structured BibTeX “.bib” files, a standard plain-text format for bibliographic records, organized by observation type and provided with the database via the Zenodo archive and project repository. Records were screened for duplication using site information, coordinates, and reported values; no duplicate records were identified. Data inclusion required sufficient metadata (e.g., location and units), and records lacking essential information were excluded during initial extraction.
The database is composed predominantly of peer-reviewed literature and publicly available datasets, with a small subset of unpublished contributions (approximately 68 sediment and 75 water observations), as noted in Section 4.1.
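A duplicate screen based on site information, coordinates, and reported values might look like the following sketch. The field names (`site_id`, `lat`, `lon`, `hg`) and the rounding tolerances are assumptions; the response does not state how closely values had to agree to count as duplicates.

```python
from collections import defaultdict


def find_duplicates(records, coord_decimals=4, value_decimals=2):
    """Return lists of record indices that share the same site ID,
    rounded coordinates, and rounded Hg value."""
    index_by_key = defaultdict(list)
    for i, rec in enumerate(records):
        key = (
            rec["site_id"],
            round(rec["lat"], coord_decimals),
            round(rec["lon"], coord_decimals),
            round(rec["hg"], value_decimals),
        )
        index_by_key[key].append(i)
    return [idx for idx in index_by_key.values() if len(idx) > 1]


# Synthetic records for illustration only.
records = [
    {"site_id": "S1", "lat": 65.0, "lon": -147.0, "hg": 50.0},
    {"site_id": "S2", "lat": 66.0, "lon": -148.0, "hg": 70.0},
    {"site_id": "S1", "lat": 65.00001, "lon": -147.0, "hg": 50.001},
]
print(find_duplicates(records))
```

Candidates returned this way still need manual review, since two genuine samples from one site can share a rounded value.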
Reviewer Comment 10:
One comment notes that lake sediment comprises most of records (e.g., Table 1 and Figure 4). The authors should clarify the intended uses of this database: is it suitable for soil‑focused Hg modeling? For water‑vegetation interactions? A clear “limitations” subsection is therefore recommended, including guidance on which research questions the database should not be used for.
Response:
We have expanded the limitations discussion in Section 5 to more clearly describe appropriate and inappropriate use cases of the database. Specifically, we now clarify that while the dataset is well suited for large-scale synthesis and model applications, its use for media-specific analyses (e.g., fine-scale process studies or co-located multi-compartment measurements) may be limited by imbalanced sampling across observation types and geographic regions.
Reviewer Comment 11:
Line 458. If possible, please provide a table summarizing, for each observation type, the percentage of records with: precise coordinates, sampling date, analytical uncertainty, soil horizon (if applicable), organic carbon content, and laboratory information. This allows users to filter data appropriately.
Response:
Rather than adding a summary table, we have expanded Section 6 to clarify that metadata availability varies across observation types and that users can filter the dataset using key fields such as coordinates, sampling date, analytical uncertainty, soil horizon (for soil data), organic carbon content, and laboratory information. A detailed description of all metadata fields and their availability is provided in the accompanying metadata documentation and repository files on GitHub. This approach maintains flexibility for users while avoiding oversimplification of metadata completeness across diverse observation types.
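Users who need the completeness summary the reviewer asked for can compute it from the records themselves. The sketch below assumes records are dictionaries with an `obs_type` field and that missing values appear as `None` or an empty string; both are illustrative assumptions, and the real field names are listed in the metadata documentation.

```python
def completeness(records, fields, type_field="obs_type"):
    """Percent of records per observation type with a non-missing value
    for each field in `fields`."""
    totals = {}
    present = {}
    for rec in records:
        obs = rec[type_field]
        totals[obs] = totals.get(obs, 0) + 1
        counts = present.setdefault(obs, {f: 0 for f in fields})
        for f in fields:
            if rec.get(f) not in (None, ""):
                counts[f] += 1
    return {
        obs: {f: 100.0 * present[obs][f] / totals[obs] for f in fields}
        for obs in totals
    }


# Synthetic records for illustration only.
records = [
    {"obs_type": "soil", "lat": 1.0, "horizon": "O"},
    {"obs_type": "soil", "lat": 2.0, "horizon": ""},
    {"obs_type": "water", "lat": None},
]
print(completeness(records, ["lat", "horizon"]))
```

The result is a nested mapping from observation type to field to percentage, which can be printed as a table or used to choose filter criteria.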
Reviewer Comment 12:
The Zenodo DOI is provided in the manuscript. I would like to suggest outlining a formal versioning policy and long‑term stewardship plan, and describe how the community can contribute new data in the future.
Response:
We have clarified and made more explicit the versioning, long-term stewardship, and contribution framework in Section 6. Specifically, we now describe that versioning is managed through the GitHub repository, with updates tracked and formal releases archived on Zenodo under unique DOIs to ensure reproducibility and long-term accessibility. We also clarify the role of the authors in maintaining the database and outline how community contributions can be submitted through the repository workflow or via direct contact, with further details provided in the project documentation.
Citation: https://doi.org/10.5194/essd-2025-640-AC2
- RC2: 'Comment on essd-2025-640', Anonymous Referee #2, 03 May 2026
The manuscript presents a consolidated database of mercury observations across permafrost regions. Overall, this is a valuable and timely contribution, with clear potential for mercury cycling studies, model calibration, spatial synthesis, and future monitoring efforts. The manuscript is generally well organized, and I recommend minor revision before publication. The following issues should be addressed to improve clarity, consistency, and reproducibility.
Lines 147–149 and 169–172: In the Methods, the authors state that observations were included from areas with a permafrost probability of ≥10% and a mean annual ground temperature below 0 °C. However, the caption of Fig. 1 also describes isolated permafrost patches with probabilities of 0.001–0.1. Please clarify whether these lower-probability areas are only shown as part of the background permafrost map, or whether they were also considered during data screening. This distinction is important for the reproducibility of the spatial filtering procedure. It would also be helpful to specify explicitly that these thresholds refer to the modeled probability of permafrost occurrence, rather than to other permafrost properties.
Lines 329–340 and 396–404: The description of the distributional shape should be checked and made consistent. In the text and caption of Fig. 2, the authors state that the Hg concentration distributions are “left-skewed”. However, the histograms appear to show positively skewed, or right-skewed, distributions, with most observations concentrated at relatively low Hg concentrations and a smaller number of high-concentration observations forming long right tails. This interpretation is also consistent with the later statement that Hg concentrations were “positively skewed” in Lines 398–399. Please correct the skewness terminology in the Fig. 2 caption and related text.
Line 115: “PemHg database” appears to be a typographical error and should be corrected to “PermHg database.”
Line 148: A formatting error in “below 0֩C”. Please correct this to “below 0 °C”.
Lines 57–61 and Introduction: A few recent works on Arctic landscape dynamics and river sediment changes, as well as their implications for Hg, could be mentioned to place this study in a wider framework, e.g., https://www.nature.com/articles/s41561-026-01960-z
Dongfeng
Citation: https://doi.org/10.5194/essd-2025-640-RC2
- AC3: 'Reply on RC2', Christine Olson, 08 May 2026
Reviewer Comment 1:
Lines 147–149 and 169–172: In the Methods, the authors state that observations were included from areas with a permafrost probability of ≥10% and a mean annual ground temperature below 0 °C. However, the caption of Fig. 1 also describes isolated permafrost patches with probabilities of 0.001–0.1. Please clarify whether these lower-probability areas are only shown as part of the background permafrost map, or whether they were also considered during data screening. This distinction is important for the reproducibility of the spatial filtering procedure. It would also be helpful to specify explicitly that these thresholds refer to the modeled probability of permafrost occurrence, rather than to other permafrost properties.
Response:
We clarified that the ≥10% threshold refers to the modeled probability of permafrost occurrence used for database screening, while lower-probability isolated permafrost patches shown in Fig. 1 were included only as part of the background map visualization.
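The screening rule discussed in this exchange can be written as a short predicate. The two thresholds (modeled permafrost probability of at least 10 % and mean annual ground temperature below 0 °C) are those quoted in the reviewer comment; combining them with a logical AND, expressing probability as a fraction, and the function name are assumptions.

```python
def passes_permafrost_screen(prob, magt_c, min_prob=0.10, max_magt_c=0.0):
    """True if a location meets both screening criteria.

    prob    modeled probability of permafrost occurrence, as a fraction (0 to 1)
    magt_c  mean annual ground temperature in degrees Celsius
    """
    return prob >= min_prob and magt_c < max_magt_c


# Synthetic locations: (probability, MAGT in deg C).
for prob, magt in [(0.85, -4.2), (0.05, -0.5), (0.40, 0.3)]:
    print(prob, magt, passes_permafrost_screen(prob, magt))
```

Under this reading, an isolated patch with a probability of 0.001 to 0.1 fails the screen and would appear only in the background map.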
Reviewer Comment 2:
Lines 329–340 and 396–404: The description of the distributional shape should be checked and made consistent. In the text and caption of Fig. 2, the authors state that the Hg concentration distributions are “left-skewed”. However, the histograms appear to show positively skewed, or right-skewed, distributions, with most observations concentrated at relatively low Hg concentrations and a smaller number of high-concentration observations forming long right tails. This interpretation is also consistent with the later statement that Hg concentrations were “positively skewed” in Lines 398–399. Please correct the skewness terminology in the Fig. 2 caption and related text.
Response:
We corrected the terminology throughout the manuscript and figure caption to “positively skewed.”
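The direction of skew can be checked numerically with the moment coefficient of skewness, which is positive for a long right tail and negative for a long left tail. The sketch below uses synthetic values and is not a calculation from the database.

```python
import statistics


def skewness(values):
    """Fisher-Pearson moment coefficient of skewness, g1 = m3 / m2**1.5,
    where m2 and m3 are the second and third central moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Synthetic concentrations: most values low, a few high (long right tail).
hg = [5, 8, 10, 12, 15, 20, 30, 250]
print(skewness(hg) > 0)
```

A histogram with most observations at low concentrations and a sparse high-concentration tail, as the reviewer describes for Fig. 2, yields a positive coefficient.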
Reviewer Comment 3:
Line 115: “PemHg database” appears to be a typographical error and should be corrected to “PermHg database.”
Response:
We corrected this typographical error in the revised manuscript.
Reviewer Comment 4:
Line 148: A formatting error in “below 0֩C”. Please correct this to “below 0 °C”.
Response:
We corrected this typographical error in the revised manuscript.
Reviewer Comment 5:
Lines 57–61 and Introduction: A few recent works on Arctic landscape dynamics and river sediment changes, as well as their implications for Hg, could be mentioned to place this study in a wider framework, e.g., https://www.nature.com/articles/s41561-026-01960-z
Response:
We added a brief discussion in the Introduction highlighting recent Arctic landscape change, erosion, and sediment transport processes as additional controls on Hg redistribution and export in northern systems.
Citation: https://doi.org/10.5194/essd-2025-640-AC3
Data sets
PermHg C. Olson et al. https://doi.org/10.5281/zenodo.18300989
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 502 | 187 | 30 | 719 | 32 | 45 |
The complete coauthor list will be as follows in the final publication:
Christine L. Olson¹, Kevin Schaefer¹, Alyssa Azaroff², Hélène Angot³, Sasiri Bandara⁴, Thomas A. Douglas⁵, Bo Elberling⁶, Maria Florencia Fahnestock⁷, Xinbin Feng⁸, Charlotte Haugk², Gustaf Hugelius², Erfan Jahangir³, Sofi Jonsson², Shichang Kang⁹,¹⁰, Adam Kirkwood¹¹, Jennifer Korosi¹², Igor Lehnherr¹³, Artem Lim¹⁴, Rinat Manasypov¹⁴, Dmitriy Moskovchenko¹⁵, Mina Nasr¹⁶, Daniel Obrist¹⁸, David Olefeldt¹⁷, Connor Olson¹,¹⁹, Oleg Pokrovsky²⁰, Laura Sereni³, Sarah Shakil²¹, M. Isabel Smith²², Jens Søndergaard²³, Jeroen Sonke²⁰, Kasia Staniszewska⁴, Jens Strauss²⁴, Kyra St. Pierre²⁵, Lauren Thompson¹⁷, Andrey Yurtaev¹⁴, Yanxu Zhang²⁶, and Scott Zolkos²⁷
1 University of Colorado, Boulder, USA
2 Stockholm University, Sweden
3 Univ. Grenoble Alpes, CNRS, INRAE, IRD, Grenoble INP, IGE, France
4 University of Alberta, Edmonton, Alberta, Canada
5 U.S. Army Cold Regions Research and Engineering Laboratory, Fort Wainwright, USA
6 University of Copenhagen, Denmark
7 University of New Hampshire, USA
8 Institute of Geochemistry, Chinese Academy of Sciences, China
9 Institute of Mountain Hazards and Environment, Chinese Academy of Sciences
10 University of Chinese Academy of Sciences, China
11 Carleton University, Canada
12 York University, Canada
13 Department of Geography, Geomatics and Environment, University of Toronto Mississauga, Canada
14 Tomsk State University, Russia
15 Tyumen Scientific Centre SB RAS, Russia
16 Environment and Protected Areas, Government of Alberta, Canada
17 University of California Agricultural and Natural Resources, USA
18 Department of Renewable Resources, University of Alberta, Edmonton, Alberta, Canada
19 Harvard University, USA
20 Géosciences Environnement Toulouse, CNRS/IRD/Université de Toulouse, 31400 France
21 Department of Ecology and Genetics; Limnology, Uppsala University, Uppsala, Sweden
22 University of Southern California, USA
23 Aarhus University, Denmark
24 Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Germany
25 University of Ottawa, Canada
26 Tulane University, USA
27 Woodwell Climate Research Center, USA