The Ant-Iso dataset: a compilation of Antarctic surface snow isotopic observations

Wang, Jiajia; Pang, Hongxi; Wu, Shuangye; Schoenemann, Spruce W.; Uemura, Ryu; Ekaykin, Alexey; Werner, Martin; Cauquoin, Alexandre; Goursaud Oger, Sentia; Rupper, Summer; Hou, Shugui

doi:10.5194/essd-2022-384

Preprints

30 Nov 2022

Status: this preprint was under review for the journal ESSD but the revision was not accepted.

Abstract. Stable water isotopic observations in surface snow over Antarctica provide a foundation for validating isotopic models and interpreting Antarctic ice core records. Here, we present a new, strictly quality-controlled compilation of Antarctic surface snow isotopic observations from published and unpublished sources, including measurements from snow pits, snow cores, ice cores, deep surface snow, and precipitation (multi-year average values). The dataset contains a total of 1867 data points, including 1604 locations with oxygen isotope ratios (δ¹⁸O) and 1278 locations with deuterium isotope ratios (δ²H). Of these, 1204 locations have both δ¹⁸O and δ²H, from which d-excess (d-excess = δ²H − 8 × δ¹⁸O) can be calculated. The dataset also contains geographic and climate information. It has a wide range of potential applications, such as studying the spatial distribution of water isotopes in Antarctica, evaluating climate models, and reconstructing and interpreting Antarctic ice core records. As an example of model evaluation, the compiled dataset is used to assess the performance of isotope-enabled atmospheric general circulation models (AGCMs) in simulating the spatial distribution of water isotopes over Antarctica. This dataset is the most comprehensive compilation so far of observed water isotope records at the multi-year average scale from multiple sources for Antarctica. It is available for download at https://doi.org/10.5281/zenodo.7294183 (Wang et al., 2022).
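
The d-excess definition quoted in the abstract can be illustrated with a short sketch. The function name and the sample values below are hypothetical and are not taken from the dataset; the only element assumed from the text is the formula d-excess = δ²H − 8 × δ¹⁸O, with both ratios expressed in per mil. Returning `None` when either ratio is missing mirrors the abstract's point that d-excess is available only where both δ¹⁸O and δ²H were measured.

```python
from typing import Optional


def d_excess(delta_2h: Optional[float], delta_18o: Optional[float]) -> Optional[float]:
    """Return deuterium excess (per mil) as d = delta_2H - 8 * delta_18O.

    Returns None when either ratio is missing, since d-excess needs both.
    """
    if delta_2h is None or delta_18o is None:
        return None
    return delta_2h - 8.0 * delta_18o


# Hypothetical values for illustration only (per mil); not Ant-Iso records.
sample_both = d_excess(-310.0, -40.0)      # both ratios available
sample_missing = d_excess(None, -40.0)     # delta_2H not measured

print(sample_both)
print(sample_missing)
```

A location with only one of the two ratios simply yields no d-excess value rather than an error, so a full table can be processed in one pass.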

Received: 10 Nov 2022 – Discussion started: 30 Nov 2022

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on essd-2022-384', Anonymous Referee #1, 24 Jan 2023
In the manuscript “The Ant-Iso dataset: a compilation of Antarctic surface snow isotopic observations”, Wang et al. present a compilation of published and unpublished isotopic data (delta 18O, delta 2H and D-excess). The manuscript is largely well written and follows a logical structure, but several points are unclear. For instance, the authors explain that the “MD08” isotope data were supplemented with 794 newly collected data (L89-93). However, the following were not explained:
which methods were used to find the different studies

which keywords were used

which time scales were considered

why certain studies were included while others may have been excluded

why specific water sources were included or excluded

which sampling methods were used

which data quality parameters were used to decide whether to include or exclude data

why certain metadata were presented while others were not

which data were collected by the authors and which were taken from the literature (in the database)

In addition, neither the database nor the text explains how data with different spatial and temporal resolutions, and the model time scale, were aggregated.

It seems that in the presented data (doi.org/10.5281/zenodo.7294183), figures, and tables, all data were shown at once without distinguishing between different spatial resolutions (horizontal and vertical), temporal resolutions (years; single time points vs. time series), and water sources. It would be good to present all “raw” data collected from the literature in the database and to explain in the manuscript how the data were averaged (e.g. Pang et al. 2019, snow pit 3 m 29 average values; Münch et al. 2017, snow pit 3.4 m 1329 average values 2014-2015). In addition, discuss the value of such averaged data and whether the collected data can be used to compare field data and model results that represent different time scales. Providing the full collected data allows users to apply the data for different purposes, while detailed information on sampling time and statistical weighting in the different statistical analyses, tests, and figures will help to establish whether the patterns or relations presented in the figures and tables hold or whether they appear by chance.

In addition:

L32 Clarify the difference between a data point and a location.

L35 Specify already here geographic and climate information.

L52 Include reference to literature.

L56 Clarify what "information and high fidelity " means.

L61 and L63 "spatial linear": remove "spatial" in L61, since the emphasis there is on the core, to avoid confusion with different cores located at various locations.

L62 and L62 Include references at the end of the sentence.

L65 "uneven" Clarify what uneven means.

L70 "high-resolution" Specify resolution.

L74 Include reference to literature.

L77 "high altitude": altitude and elevation are not used consistently.

L78 Include a figure to highlight the distribution.

L79 "researchers", which ones?

L89 Explain how the data was sourced.

L91 "strong support" clarify who gave support, the funding agency or colleagues?

L96 Give a brief introduction to "traverse sampling"; not everyone may be familiar with this type of sampling approach. Also describe the other sampling approaches, sampling methods, and devices used.

L97 Specify who collected this data, the authors or different authors from the literature.

L106 "we added two routes" Clarify whether own data or literature sourced data was used.

L108 "unreleased": explain what "unreleased" means here, since the data seem to have been published by Ekaykin et al. (2012). Please clarify.

L128 Explain what reliability means in this context.

L129 Explain what a seasonal bias is.

L132 Since the data was collected in the dry valley with little precipitation, the data could still be a multi-year average. Please clarify.

L137 State clearly why only the last few decades were included and not all.

L146 Please also include additional geographic factors such as slope and aspect …

L157 Add reference to original authors.

L196 Specify which correlation coefficient was used.

L223 Clarify how the comparison was performed (not described in text L190-onwards). It should be included in the method section.

From L190 onwards: the model was run for 1979-2018, but the data have different temporal resolutions. Is it fair to compare these data, and what can we learn from such an unequal comparison? Please clarify and discuss.

L268 "scientists who have provided help and support in data collection": these scientists should be explicitly named and acknowledged in the manuscript.

Table 1 Explain what a sufficient number of measurements is, and why and on what basis the threshold of 10 was chosen.

Figure 1c Choose different symbols, similar to Fig. 4, for data from the various data sources.

Figure 4 Rearrange the panels into one column for 18O and one column for D-excess. Include different symbols and colors indicating different time scales.

Figure 5 is hard to read: it contains too much data, and the color scheme and symbols are unclear.
Citation: https://doi.org/10.5194/essd-2022-384-RC1
- AC2: 'Reply on RC1', Wang Jia, 07 Apr 2023
  
  Dear reviewer,
  Thank you for your constructive comments, which are very helpful for improving the manuscript. In the following, your comments are addressed in the same order as in the review. The original comments are in black, and our responses are in red (italics mark our revisions to the manuscript). For specific responses and manuscript revisions, please refer to the attachment.
  Best regards
  All authors.
  
  Citation: https://doi.org/10.5194/essd-2022-384-AC2
RC2:
'Comment on essd-2022-384', Anonymous Referee #2, 03 Feb 2023

General comments
The presented dataset of stable isotope composition in precipitation, snow and ice in Antarctica is useful and novel but requires a better description in the article and careful editing of the dataset. The data have high potential to be useful in the future. The description of methods and materials should be improved. The dataset is available via the link and seems to be complete but does not follow all common standards.
Important characteristics such as time coverage, elevation range, and types of sample points should be provided in the abstract and dataset description.
Specific comments, questions and suggestions are listed below.
Specific comments:
Why are the authors of the Antarctic surface snow isotopic dataset available for download at https://doi.org/10.5281/zenodo.7294183 different from the authors of the submitted ESSD article?
It is not clear why the title of the ESSD paper is “The Ant-Iso dataset: a compilation of Antarctic surface snow isotopic observations” when the dataset contains 80 ice core points and 235 firn core points. You should either revise the title or exclude ice cores from the dataset.
The dataset needs to be revised and polished. You should thoroughly check the dataset to be sure that it follows the common requirements of a dataset.
At least 25 samples do not have coordinates, elevation and year of sampling. 117 samples do not have coordinates and year of sampling. I doubt that such values without any spatial and temporal references could be useful. If it is not possible to obtain the metadata, they should be excluded from the dataset. Incidentally, the column “Sample label” suggests a location for some of the samples without coordinates. For example, 753 Molodezhnaya, 754 Amery-G1, 755 GM7, 756 GM10, 757 GM13, 758 Dome C, 759 Mirny, 760 Pioneerskaya, 761 Vostok 1, etc. You can probably use it after a careful check.
You need to define the parameters in the dataset. It is not clear what the difference between published and calculated distance is, or why you needed to calculate it. The same applies to elevation.
It is also not clear how the quality flag was assessed.
Does “Firn temperature or surface air temperature” relate to exact location and time of sampling or is it somehow averaged? How was it calculated or assessed? What is “Accumulation of snow/ice per year” and how was it estimated?
Data in each column should be formatted in a single style. For example, the column “Averaging length (years or depth)” contains different kinds of data in very different styles, which prevents easy processing and analysis of the dataset. I suggest splitting the column into two (“Averaging years” and “Averaging depths”) and putting only numerical values in each of them. If additional explanation is needed, you could add a “Comments” column with text.
The column “Sampling date” contains not dates but years in different formats (both numeric and text), which prevents processing and filtering. You may consider splitting the column into two – “sampling year start” and “sampling year finish”.
The column “Sample type” contains spelling errors that prevent grouping samples by type. You should carefully check every type and provide the exact number of points of every type in the article.
Lines 68-74. Are you talking about the dataset by Masson-Delmotte et al. (2008), mentioned earlier, or your dataset? Clarify it and if it relates to the dataset by Masson-Delmotte et al. (2008), provide more details about its actual content rather than “it provides an observational basis”.
Lines 78-79 Provide sufficient references for “numerous new samples and measurements that have been acquired by different researchers”.
Lines 79-81 Add a quantitative estimate of the “additional observations” and describe in numbers the difference between MD08 and your dataset.
Lines 90-91 How many data points did you make publicly available for the first time? It is one of the most important things to show the value of your dataset.
Lines 119-121 Add the number of points to the figure caption.
Lines 118-123 Consider merging figures 1 and 2. You can add information from Fig.2 to Fig.1 (b)
Line 177-178 Include the same figure for δD, ‰
Line 214 Since you have several files in the Supplement, you should reference them more precisely.
Line 252 The content of the Word file in the Supplement differs from the supplement described here. You should provide a detailed description of the supplement files.
Technical corrections:
Affiliations and even country names have different formats
Line 174 Rewrite “…we do not quantitatively calculate the quantitative relationship…”
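
Several of the dataset comments above (samples without coordinates, mixed-format “Sampling date” values, and inconsistent “Sample type” spellings) describe cleanup steps that a data user could sketch as follows. This is a minimal sketch under stated assumptions, not the authors' processing: the dictionary keys, the example rows, the accepted year formats (a single four-digit year or a range such as "1990-1995"), and the canonical sample-type labels are all hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical accepted formats: "1995" or "1990-1995" (hyphen, en dash, or slash).
YEAR_OR_RANGE = re.compile(r"^\s*(\d{4})\s*(?:[-–/]\s*(\d{4}))?\s*$")

# Hypothetical canonical labels, keyed by the lower-case, letters-only spelling.
CANONICAL_TYPES = {
    "snowpit": "snow pit",
    "snowcore": "snow core",
    "firncore": "firn core",
    "icecore": "ice core",
    "surfacesnow": "surface snow",
    "precipitation": "precipitation",
}


def split_sampling_years(raw) -> Tuple[Optional[int], Optional[int]]:
    """Split a free-form sampling-date value into (start_year, end_year)."""
    if raw is None:
        return (None, None)
    match = YEAR_OR_RANGE.match(str(raw))
    if match is None:
        return (None, None)  # unrecognised format: leave for manual review
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return (start, end)


def normalize_sample_type(raw: str) -> Optional[str]:
    """Map spelling variants (case, spaces, hyphens) to one canonical label."""
    key = re.sub(r"[^a-z]", "", raw.lower())
    return CANONICAL_TYPES.get(key)  # None flags a label that needs checking


def has_coordinates(row: dict) -> bool:
    """True when both latitude and longitude are present."""
    return row.get("latitude") is not None and row.get("longitude") is not None


# Hypothetical example rows; not records from the Ant-Iso dataset.
rows = [
    {"label": "A", "latitude": -75.1, "longitude": 123.3,
     "sampling_date": "1990-1995", "sample_type": "Snow Pit"},
    {"label": "B", "latitude": None, "longitude": None,
     "sampling_date": 2005, "sample_type": "ice-core"},
    {"label": "C", "latitude": -66.5, "longitude": 93.0,
     "sampling_date": "early 1990s", "sample_type": "snwo pit"},
]

cleaned = []
for row in rows:
    start, end = split_sampling_years(row["sampling_date"])
    cleaned.append({
        "label": row["label"],
        "has_coordinates": has_coordinates(row),
        "sampling_year_start": start,
        "sampling_year_end": end,
        "sample_type": normalize_sample_type(row["sample_type"]),
    })

for record in cleaned:
    print(record)
```

Values that do not match an expected pattern are returned as `None` rather than guessed, so they can be listed and checked by hand, which is in line with the referee's request for a careful check of every entry.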

Citation: https://doi.org/10.5194/essd-2022-384-RC2
- AC1: 'Reply on RC2', Wang Jia, 07 Apr 2023
  
  Dear reviewer,
  Thank you for your constructive comments, which are very helpful for improving the manuscript. In the following, your comments are addressed in the same order as in the review. The original comments are in black, and our responses are in red (italics mark our revisions to the manuscript). For specific responses and manuscript revisions, please refer to the attachment.
  Best regards
  All authors.
  
  Citation: https://doi.org/10.5194/essd-2022-384-AC1

Supplement

https://doi.org/10.5194/essd-2022-384-supplement

Data sets

Antarctic surface snow isotopic dataset. Jiajia Wang, Hongxi Pang, and Shugui Hou. https://doi.org/10.5281/zenodo.7294183

Short summary

Stable water isotopic observations in surface snow over Antarctica provide a basis for validating isotopic models and interpreting Antarctic ice core records. This study presents a new compilation of Antarctic surface snow isotopic observations based on published and unpublished sources. The dataset has a wide range of potential applications in studying the spatial distribution of water isotopes, validating models, and reconstructing and interpreting Antarctic ice core records.

