the Creative Commons Attribution 4.0 License.
A database of glacier microbiomes for the Three Poles
Abstract. Glaciers cover 10 % of Earth’s land area and are a pool of carbon and nitrogen for downstream ecosystems. Microbes, including bacteria, fungi, algae, and other microeukaryotes, are the primary inhabitants of glacier ecosystems and are key drivers of carbon and nitrogen transformation. Here, we present a dataset on communities of supraglacial bacteria and archaea (hereafter referred to as microorganisms) across Antarctic, Arctic, and Tibetan glaciers. The dataset comprises 815 amplicon sequencing datasets, 952 cultured bacterial genomes, and 208 metagenomes, covering ice, snow, and cryoconite habitats. The dataset contains 67,224 amplicon sequencing phylotypes, with a higher microbial diversity in the Tibetan glaciers than in the Antarctic and Arctic glaciers, which were respectively enriched with Gammaproteobacteria, Bacteroidota, and Alphaproteobacteria. Additionally, 2,517 potential pathogens were identified, accounting for 1.9 % of the total microorganisms identified. Snow and ice exhibited a higher relative abundance of pathogens than cryoconite, which could be attributed to the similar adaptation mechanisms for microbial survival in aerosols and immune evasion. The dataset contains 62,595,715 unique genes and 4,327 microbial genomes, a 34 % expansion from previous publications. Genes associated with carbohydrate-active enzymes, nitrogen cycling, methane cycling, antimicrobial resistance, and microbial virulence were annotated, revealing the dynamic microbial functions in glacial habitats. This comprehensive dataset provides standardized microbial diversity, taxonomy, community structure, and genetic functions of glacial microbiomes.
The data can be leveraged to elucidate ecological principles governing the distribution of microorganisms, to gain insights into the key functional genes for supraglacial microbiomes, to build mechanistic models, and to identify any potential biohazards for policymakers to make informed decisions regarding climate change. The dataset is available at the National Tibetan Plateau Data Center (Liu et al., 2023).
Status: final response (author comments only)
RC1: 'Comment on essd-2023-395', Anonymous Referee #1, 07 Jan 2024
MS Title: A database of glacier microbiomes for the Three Poles
Comments: This paper presents a dataset on supraglacial bacteria and archaea from the Three Poles, covering ice, snow, and cryoconite habitats. The authors released a comprehensive dataset, including 62,595,715 unique genes and 4,327 microbial genomes from the studied glaciers. It provides standardized and valuable microbial diversity, taxonomy, community structure, and genetic functions of glacial microbiomes. I believe these data can be leveraged to elucidate ecological principles governing the distribution of microorganisms, especially under the serious conditions of rapid climate warming and glacier melting. In my view, only a few minor points need to be revised before it can be accepted.
Line 35, add related references on “…… a pool of carbon and nitrogen”. How about the carbon and nitrogen storage in the glaciers?
Line 35, “The six Pg of carbon…with glacier runoff.” Do you really mean that six PgC can be released every year due to glacial runoff? This sentence is not clear.
Line 44-54, how about the “englacial and subglacial ecosystems” as you mentioned in the first sentence of this paragraph?
Line 55, Does the glacier surface meltwater belong to the “supraglacial ecosystem”?
Line 61, Add a “.” at the end of this sentence.
Figure 2(a), the legends for Ice (Inverted triangle) and Snow (Regular triangle) should be separated.
Figure 3, What are the captions for (a) and (b) respectively? Do you mean they have the same captions, and the meaning of a, b, c on the figure is also the same?
CC1: 'Reply on RC1', Mukan Ji, 25 Jan 2024
We are extremely thankful for these constructive comments. We have carefully addressed all comments and suggestions; please see the detailed revisions below.
Comment 1: Line 35, add related references on “…… a pool of carbon and nitrogen”. How about the carbon and nitrogen storage in the glaciers?
Response: The organic carbon stored in global glaciers has been estimated to be on the order of 6 Pg (Hood et al., 2015), but an estimate of the nitrogen stored in global glaciers is not yet available. We have amended the sentence as follows:
Amended manuscript: The organic carbon stored in global glaciers was estimated on the order of 6 Pg, while over 15 Tg of carbon will be liberated by 2050 with the majority coming from mountain glaciers (Hood et al., 2015).
Comment 2: Line 35, “The six Pg of carbon…with glacier runoff.” Do you really mean that six PgC can be released every year due to glacial runoff? This sentence is not clear.
Response: This is an estimated number of the total amount of carbon stored in global glaciers. Certainly, not all glaciers will melt and release the carbon stored within. We have rewritten the sentence to clarify this.
Amended manuscript: The organic carbon stored in global glaciers was estimated on the order of 6 Pg, while over 15 Tg of carbon will be liberated by 2050 with the majority coming from mountain glaciers (Hood et al., 2015).
Comment 3: Line 44-54, how about the “englacial and subglacial ecosystems” as you mentioned in the first sentence of this paragraph?
Response: We apologize for the confusion. Englacial and subglacial ecosystems are important components of the glacier ecosystem as a whole and contain diverse microorganisms. As the dataset described here mainly focuses on the microorganisms in supraglacial ecosystems, an additional introduction to microorganisms in the englacial and subglacial ecosystems would confuse the reader about the focus of the manuscript. Therefore, we have removed the first sentence and revised the following sentence as:
Amended manuscript: Compared with other glacier-related habitats, the microorganisms in supraglacial ecosystems are the most active, due to their exposure to the external environment and ambient temperature.
Comment 4: Line 55, Does the glacier surface meltwater belong to the “supraglacial ecosystem”?
Response: Glacier surface meltwater is a part of the supraglacial ecosystem. This is different from the glacier meltwater discharged into the proglacial stream.
Comment 5: Line 61, Add a “.” at the end of this sentence.
Response: We have added the missing "." at the end of the sentence.
Comment 6: Figure 2(a), the legends for Ice (Inverted triangle) and Snow (Regular triangle) should be separated.
Response: We have increased the space between the figure legends so that these symbols are separated. The modified figure is provided in the supplement.
Comment 7: Figure 3, What are the captions for (a) and (b) respectively? Do you mean they have the same captions, and the meaning of a, b, c on the figure is also the same?
Response: We have added additional captions for these figures.
Amended manuscript: Fig. 3 The relative abundance of potential pathogens identified in the glacier microbiomes across Antarctica, the Arctic, and the Tibetan Plateau. (a): Relative abundance comparison by habitat; and (b): Relative abundance comparison by region. The significance comparisons among regions (a) and habitats (b) are based on the Kruskal-Wallis one-way ANOVA; multiple comparisons were performed using Dunn’s post-hoc test. Significant differences are marked by different letters (a, b, c) above the boxes.
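The caption above names a Kruskal-Wallis test followed by Dunn's post-hoc test. As a minimal sketch of the first step only, the H statistic can be computed from pooled ranks as shown below. The abundance values are made up, the function names are hypothetical, tie correction and Dunn's test are not implemented, and none of this is the authors' code.

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    offset = 0
    for group in groups:
        size = len(group)
        rank_sum = sum(ranks[offset:offset + size])
        weighted_rank_sums += rank_sum ** 2 / size
        offset += size
    return 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)


# Hypothetical relative abundances (%) of potential pathogens per habitat.
abundance_by_habitat = {
    "snow": [2.9, 3.4, 3.1, 2.7],
    "ice": [2.2, 2.5, 2.0, 2.4],
    "cryoconite": [0.8, 1.1, 0.9, 1.3],
}
h_statistic = kruskal_wallis_h(list(abundance_by_habitat.values()))

# With three groups the chi-square approximation has 2 degrees of freedom,
# whose survival function is exp(-x / 2). It is rough for samples this small.
p_value = math.exp(-h_statistic / 2)
print(f"H = {h_statistic:.3f}, approximate p = {p_value:.4f}")
```

A significant H only indicates that at least one group differs; the letters above the boxes in the figure come from the pairwise post-hoc step, which this sketch leaves out.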
RC2: 'Comment on essd-2023-395', Anonymous Referee #2, 28 Jan 2024
The manuscript offers a dataset on microbial communities from glaciers in Antarctica, the Arctic, and Tibet, comprising 815 amplicon sequence data, 952 cultured bacterial genome data, and 208 metagenomic data. This dataset, covering diverse habitats like ice, snow, and cryoconite, is instrumental in understanding microbial diversity, taxonomy, community structure, and genetic function in glacial environments. However, there are critical issues concerning the dataset's completeness that need to be addressed.
One major concern is the apparent lack of comprehensiveness in the metagenomic data presented in Table S2. Notably, the authors' previous study (Zhang et al., 2023, Microbiome), which analyzed 88 metagenomes from 26 glacier cryoconites, is not included. This omission is surprising and detracts from the dataset's perceived completeness. Furthermore, the absence of citations for other significant metagenomics studies (below) is a critical oversight, especially given the reliance of the current paper's results on metagenomic analysis. Incorporating a broader range of relevant literature is essential to bolster the study's credibility and thoroughness.
Varliero et al. (2021) Frontiers in Microbiology
Hay et al. (2023) Microbial Genomics
Bellas et al. (2020) Nature Communications
Edwards et al. (2013) Environ. Res. Lett., DOI 10.1088/1748-9326/8/3/035003
Regarding the amplicon sequencing data in Table S1, the quantity appears insufficient for a study claiming to be a comprehensive database of DNA data from glaciers. To enhance the robustness of the dataset, it is advisable to include data from at least 10 amplicon sequencing studies focusing on glaciers. The current dataset's overrepresentation of data from the Tibetan Plateau might introduce a geographical bias in the study's findings.
Additionally, there is a need for accurate and comprehensive citation of all data sources in Table S1. Currently, it seems that only three references from the authors of this paper are cited. Ensuring that all data sources are correctly and comprehensively cited is crucial for maintaining the integrity of the research and providing clear, traceable scientific evidence.
RC3: 'Comment on essd-2023-395', Anonymous Referee #3, 04 Mar 2024
Some headers contain letters that aren’t bolded.
Table S2 has cited data from Trivedi et al. in the future.
Figure 1; The Alps are not considered the Arctic. Is the microbial richness calculated with all of the taxonomic identifications from metagenomes, cultures and amplicons?
In the sentence “Metagenomic assemblies were binned using MetaBAT 2 (v2.12.1) (Kang et al., 2019), MaxBin 2 (v2.2.7) (Wu et al., 2016), and VAMB (v2.0.1) (Nissen et al., 2021) respectively”, “respectively” should be replaced with “separately”.
The sentence “These MAG together with the downloaded isolate genomes were dereplicated using the thresholds of 30% aligned fraction and a genome-wide average nucleotide identity (ANI) threshold of 95% …” makes it sound like the MAGs and the cultured genomes were dereplicated against each other, instead of the method being applied to both sets of data, which I think is what is supposed to be conveyed.
In “Spatially, 69.7% of all samples (n = 568) were from Tibetan glaciers, 24% (n = 196) were from Antarctic glaciers, while those from Arctic glaciers were slightly under-represented (6.3%, n = 51).” Describing the Arctic as slightly under-represented is an understatement.
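The imbalance this comment points to can be checked directly from the quoted counts. The sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
# Sample counts per region, as quoted from the manuscript in the comment above.
samples_by_region = {"Tibetan": 568, "Antarctic": 196, "Arctic": 51}
total_samples = sum(samples_by_region.values())

# Share of each region in the full sample set, in percent.
shares = {
    region: 100 * count / total_samples
    for region, count in samples_by_region.items()
}
for region, share in shares.items():
    print(f"{region}: {share:.1f} % of {total_samples} samples")

# Ratio of the best-represented to the least-represented region.
imbalance = samples_by_region["Tibetan"] / samples_by_region["Arctic"]
print(f"Tibetan-to-Arctic sample ratio: {imbalance:.1f}")
```

The counts sum to 815 samples and reproduce the quoted percentages; Tibetan samples outnumber Arctic samples by roughly eleven to one.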
A main table displaying the number of samples in the cryoconite (sediment and water), ice and snow of the Arctic, Antarctica and the Tibetan Plateau would help understanding.
These sentences could be combined and refined for clarity. “We identified ubiquitous phylotypes for each region-habitat pair (i.e., identified in more than 55% of samples). There were five phylotypes identified as ubiquitous in all region-habitat pairs (Table S8), affiliated with Gammaproteobacteria (Comamonadaceae) or Actinobacteria (Microbacteriaceae).”
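The definition quoted above (a phylotype is ubiquitous in a region-habitat pair if it is found in more than 55% of that pair's samples, and ubiquitous overall if that holds for every pair) can be stated compactly in code. This is a minimal sketch with made-up presence data and hypothetical names, not the authors' pipeline.

```python
UBIQUITY_THRESHOLD = 0.55  # "more than 55% of samples", as quoted above


def ubiquitous_phylotypes(samples, threshold=UBIQUITY_THRESHOLD):
    """Return phylotypes detected in more than `threshold` of the samples.

    `samples` is a list of sets, one set of detected phylotype IDs per sample.
    """
    if not samples:
        return set()
    counts = {}
    for detected in samples:
        for phylotype in detected:
            counts[phylotype] = counts.get(phylotype, 0) + 1
    return {p for p, c in counts.items() if c / len(samples) > threshold}


# Hypothetical presence data keyed by (region, habitat); the IDs are invented.
pairs = {
    ("Arctic", "snow"): [{"P1", "P2"}, {"P1", "P3"}, {"P1", "P2"}],
    ("Tibetan", "ice"): [{"P1", "P4"}, {"P1"}, {"P2", "P4"}, {"P1", "P4"}],
}
per_pair = {pair: ubiquitous_phylotypes(s) for pair, s in pairs.items()}

# Phylotypes that pass the threshold in every region-habitat pair.
shared_by_all = set.intersection(*per_pair.values())
print(per_pair)
print(shared_by_all)
```

In this toy input only `P1` passes the threshold in both pairs, which mirrors how a short list of phylotypes can be ubiquitous everywhere while most are ubiquitous in only some pairs.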
Section 3.1.4 on Potential pathogens. Details are needed on what constitutes a pathogenic organism that made up this curated database.
Line 195 – 196, “We propose that this could be explained by the similar selection mechanisms for long-distance dispersal survival and host-immune evasion” is a bold claim without reporting how many of these phylotypes have the genes for teichoic acid or how many of these snow and ice pathogens were Staphylococcus.
Line 203 “Of these dereplicated ORFs, 47.8% (29,947,128) were functional annotations using eggnog” should be ‘functionally annotated’.
Line 212, “…likely representing novel species” this cannot be claimed from gOTUs.
Line 213-214. This is the first time South American glaciers are being considered alone.
Figure 4b is difficult to read, and it is hard to determine which habitat the data come from. 4d would be improved by adding the locations where each of these gOTUs is present and how they relate (at the phylum level) to the organisms in Figure 2b. 4e is written as 4f in the caption, and it is difficult to apply these virulence factors to the
Line 230 “relatively rare” add the percentage that these nitrogen fixation and nitrification genes make up.
The Antimicrobial resistance genes section is fascinating and a great addition to the study.
Line 268; The 4GDB link is inaccessible.
RC4: 'Comment on essd-2023-395', Anonymous Referee #4, 04 Mar 2024
This manuscript proposes a dataset for supraglacial prokaryotic communities from glaciers in the Arctic, Antarctica, and from the Tibetan Plateau. It comprises amplicon sequence data, cultured bacterial genome data, and metagenomic data from the three main supraglacial habitats: snow, ice and cryoconite holes. The authors’ research reveals a higher prokaryotic diversity on glaciers from Tibet in comparison to Arctic and Antarctic glaciers. Furthermore, the study of potential pathogens could help identify potential biohazards in glacial communities. This dataset is the first step in offering a study setting for a worldwide view of the prokaryotic community composition for supraglacial environments and covers microbial diversity, taxonomy, community structure, and genetic functions in glacial environments.
However, major concerns can be raised regarding the completeness, comparability and exploitation of the dataset. Among them is the lack of proper sourcing and credit attribution for data used but not produced by the authors. Table S1 should be completed by replacing “NCBI SRA database” with proper credit where possible, including Millar et al., 2021, Webster-Brown et al., 2015, and others. Some samples, such as s2016ZPGSN, cannot be found on NCBI.
Drawing conclusions on the bacterial composition and diversity from glacier samples all over the world, comparing studies performed over several years, seems risky without first addressing and assessing key points such as the use of different DNA extraction pathways, the evolution of sequencing techniques and the depth of sequencing they offer, and the different sequencing primer pairs used. The lack of proper credit to the authors and producers of the data retrieved from NCBI further complicates the reader's access to such discrepancies in the assembled dataset.
Furthermore, ~70% of the samples presented in this study come from the Tibetan Plateau, introducing a considerable bias in the estimation of bacterial diversity worldwide. PERMANOVA tests should be paired with beta dispersion tests to start addressing this issue. For a report on the Three Poles, the under-representation of Arctic samples is concerning for the interpretation of this study’s results.
L98: Could you provide rarefaction curves justifying the use of this threshold?
L148-150: a higher diversity found in Tibetan cryoconite holes compared to snow and ice could be due to a sampling bias as well (what proportion of the Tibetan samples are from cryoconite holes?), which should be addressed.
Minor revisions:
L15-16: as the authors study the prokaryotic community, it would be good to stick with the terms “bacterial and archaeal” or “prokaryotic” instead of “microbial” over the course of this manuscript.
L22: “which could be attributed to the similar adaptation mechanisms for microbial survival in aerosol and immune evasion” this seems like a far-fetched conclusion.
L35-36: to reformulate. 6 petagrams (Pg) is the total organic carbon contained in glacier ice, and not all is released during the yearly seasonal melting.
L41-43: grammatical changes needed
L49-50: citation needed
L55-56: what is the justification for the mean microbial abundance in surface meltwater to increase with enhanced glacier retreat?
L57-58: “which are not commonly monitored in the environment but have the potential to enter the environment” this needs enhanced clarity
L55-61: this part would benefit from a re-writing linking clearly the different findings to each other, to knowledge gaps and to the authors’ hypotheses.
L69-70: archiving biological data from endangered environments such as glacier ecosystems is invaluable. However (and unfortunately), describing it as a method for preserving biodiversity is highly debatable.
L267: the link provided is not accessible.
Data sets
A database of glacier microbiomes for the Three Poles Yongqin Liu, Songnian Hu, Tao Yu, Yingfeng Luo, Zhihao Zhang, Yuying Chen, Shunchao Guo, Qinglan Sun, Guomei Fan, Linhuan Wu, Juncai Ma, Keshao Liu, Pengfei Liu, Junzhi Liu, Ji Mukan