the Creative Commons Attribution 4.0 License.
The PALMOD 130k marine palaeoclimate data synthesis version 2
Abstract. Palaeoclimate data hold the unique promise of providing a long-term perspective on climate change and as such can serve as an important benchmark for climate models. However, palaeoclimate data have generally been archived with insufficient standardisation and metadata to allow for transparent and consistent uncertainty assessment in an automated way. Thanks to improved computation capacity, transient palaeoclimate simulations are now possible, calling for data products containing multi-parameter time series rather than information on a single parameter for a single time slice. To confront transient simulations that span the last glacial-interglacial cycle with palaeoclimate data, we have compiled a multi-parameter marine palaeoclimate data synthesis that contains time series spanning 0 to 130,000 years ago. Jonkers et al. (2020) published the first version of the PALMOD 130k marine palaeoclimate data synthesis and described the data synthesis strategy and the contents and format of the data product in detail. Here we present a major update of the data product that markedly increases both the spatial and temporal coverage. Version 2 of the synthesis contains 2,286 time series of eight palaeoclimate parameters from 475 individual sites, each associated with rich metadata, age-depth model ensembles, and information to refine and update the chronologies. Version 2 contains 468 time series of benthic foraminifera δ18O; 357 of benthic foraminifera δ13C; 423 of near sea surface temperature; 482 and 273 of planktonic foraminifera δ18O and δ13C; and 128, 111 and 44 of carbonate, organic carbon and biogenic silica content, respectively. Compared to version 1, all radiocarbon ages have been recalibrated and the age-depth models updated. In addition, near sea surface temperature estimates based on planktonic foraminifera Mg/Ca and on UK37' have been recalculated using a single calibration, thus ensuring global comparability and comprehensive assessment of their uncertainty. The data product is available in two formats (R and LiPD), facilitating use across different software and operating systems, and can be downloaded at https://doi.pangaea.de/10.1594/PANGAEA.984602 (Jonkers et al., 2025b). This data descriptor presents our updating methodology, describes the contents and format of the data product in detail, and concludes with recommendations on palaeodata stewardship to increase the reusability of such data.
Status: closed
RC1: 'Comment on essd-2025-599', Anonymous Referee #1, 11 Nov 2025
Jonkers and co-authors give an update on the PALMOD database of marine paleotracers for the last 130 kyr. This is a necessary and significant update for the PALMOD database. The previous version was limited to d13C and d18O from benthic foraminifera; Jonkers et al. now include planktic foraminifera stable isotopes and other proxies such as Mg/Ca and carbonate and biogenic silica content. They also include temperature reconstructions for each of the sites. The authors claim to have merged age models with proxy data, assigning an age value to each downcore sample. This is a useful update, which saves users from having to interpolate the age models to the data depth scales.
I am worried by the presentation of the data. The authors chose two formats: R and LiPD files. However, the way they have structured the data makes it very hard to inspect at a glance. The R (.RDS) files can only be opened with R. I am an advanced Python programmer, and I wasn't able to quickly access the data. In both R and LiPD, the datasets for each coring site were saved without column names, and I found no easy way to see a depth, age, proxy list on screen. I thought LiPD would be easier, since I know that inside a LiPD folder the data are saved as .csv; however, the problem is the same: there is no reference in each file to tell me what I am seeing. Of course, it could be that I am not knowledgeable enough to open these files. But I am myself a data person, so if I had a problem accessing the database, it is reasonable to think that many other users will have issues too.
I recommend that the authors re-include netCDF files for each coring site as part of the database (they were included in the first version of the PALMOD database). These files are more universal than LiPD and R files, and readable by different types of software. In addition, I recommend that the authors produce a more human-readable version of the database. This could simply be the CSV files inside the LiPD directories, provided each file carries explicit information on what each column is. Otherwise, I am sad to say that this important data product will be useful to only a very few R and LiPD experts.
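To illustrate the column-name issue raised above, here is a minimal sketch using only the Python standard library. It assumes that the header-less CSV tables in a LiPD archive are described by an accompanying JSON metadata file, and that the key layout (`paleoData`, `measurementTable`, `columns` with `number`, `variableName` and `units`) follows the LiPD 1.3 convention; this has not been checked against the PALMOD files. The dataset name, file name, variable names and values are invented stand-ins, and `label_columns` is a hypothetical helper.

```python
import csv
import io
import json

# Hypothetical stand-in for the metadata file inside one LiPD archive.
# The key layout is an assumption based on the LiPD 1.3 convention.
METADATA_JSON = """
{
  "dataSetName": "EXAMPLE-CORE",
  "paleoData": [
    {"measurementTable": [
      {"filename": "paleo0measurement0.csv",
       "columns": [
         {"number": 1, "variableName": "depth", "units": "cm"},
         {"number": 2, "variableName": "age", "units": "kyr BP"},
         {"number": 3, "variableName": "benthic.d18O", "units": "permil"}
       ]}
    ]}
  ]
}
"""

# Hypothetical stand-in for the matching header-less CSV table.
CSV_TEXT = "10,1.2,3.41\n20,2.9,3.56\n30,4.4,3.70\n"


def label_columns(metadata, csv_text, paleo_index=0, table_index=0):
    """Return (column labels, numeric rows) for one measurement table.

    Assumes every column has a scalar 1-based "number"; ensemble tables,
    where one entry may span several CSV columns, are not handled here.
    """
    table = metadata["paleoData"][paleo_index]["measurementTable"][table_index]
    columns = sorted(table["columns"], key=lambda col: col["number"])
    names = [f'{col["variableName"]} [{col.get("units", "?")}]' for col in columns]
    rows = [[float(value) for value in row]
            for row in csv.reader(io.StringIO(csv_text)) if row]
    return names, rows


names, rows = label_columns(json.loads(METADATA_JSON), CSV_TEXT)
print("\t".join(names))
for row in rows:
    print("\t".join(str(value) for value in row))
```

For real archives, a dedicated reader such as the PyLiPD package mentioned in the community comment is likely the more robust route; the sketch only shows where the column descriptions would be expected to live.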
Citation: https://doi.org/10.5194/essd-2025-599-RC1
CC1: 'Reply on RC1', Julien Emile-Geay, 29 Nov 2025
Re: LiPD, perhaps this would help? https://pylipd.readthedocs.io/en/latest/. Also see examples of how to load the data into Pyleoclim for analysis here: https://linked.earth/PyleoTutorials/notebooks/L0_loading_to_series.html#loading-a-single-lipd-file
If, as you contend, the data tables are not properly formatted, then it might not. But otherwise, this will save all Python users a lot of trouble with this dataset.
Best,
J.E.G.
Citation: https://doi.org/10.5194/essd-2025-599-CC1
AC2: 'Reply on CC1', Lukas Jonkers, 30 Jan 2026
We would like to thank Julien Emile-Geay for pointing us to the new online documentation on how to handle LiPD files using Python. We will update the manuscript accordingly.
Citation: https://doi.org/10.5194/essd-2025-599-AC2
- AC1: 'Reply on RC1', Lukas Jonkers, 30 Jan 2026
RC2: 'Comment on essd-2025-599', Anonymous Referee #2, 02 Jan 2026
The article presents a major update to the first PALMOD synthesis, including many primary data compiled here for the first time. The manuscript is very well structured and the figures are very informative. I recommend publication after the following comments have been addressed.
Major Comment:
For the purpose of uncertainty propagation, it is tremendously useful that PALMOD 2 includes age-model ensembles, as well as “1000 ensemble time series of seawater temperature” whenever possible. However, a question arises as to how these two sources of uncertainty should be combined. Are all paths through the depth-age-SST space physically realizable, or do some result in unrealistically abrupt SST changes, or even reversals compared to what is observed in the raw data (for instance, Mg/Ca vs depth)? I do not know of a standard solution to this problem, but there are existing strategies to address it (e.g. Khider et al., 2017; doi:10.1002/2016PA003057); I believe it is worth raising here to ensure the best possible use of the data. It would be useful to add a figure showing the reconstructed SST history in one core, displaying quantiles of its distribution (e.g. 5%, 25%, 50%, 75%, 95%) through time, and sharing the associated code so that users can easily repeat this relatively rare analysis.
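The quantile analysis requested here could be sketched roughly as follows, using only the Python standard library. This is not the authors' method: the ensembles are synthetic, `interp` and `quantile_bands` are hypothetical helpers, and each draw pairs a randomly chosen age-model member with a randomly chosen SST member, which treats the two uncertainties as independent. That independence is exactly the assumption questioned above, so the sketch shows only the mechanics of getting from two ensembles to quantile bands on a common time grid.

```python
import bisect
import random
import statistics


def interp(x, xs, ys):
    """Linear interpolation at x; xs must be strictly increasing.

    Returns None when x lies outside the range covered by xs.
    """
    if x < xs[0] or x > xs[-1]:
        return None
    i = bisect.bisect_right(xs, x) - 1
    if i == len(xs) - 1:
        return ys[-1]
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def quantile_bands(age_ens, sst_ens, grid,
                   probs=(0.05, 0.25, 0.5, 0.75, 0.95),
                   n_draws=500, seed=0):
    """Quantiles of SST on a common age grid from two ensembles.

    Each draw pairs one age member with one SST member at random
    (independence assumed), then interpolates onto the grid.
    """
    rng = random.Random(seed)
    samples = [[] for _ in grid]
    for _ in range(n_draws):
        ages = rng.choice(age_ens)
        ssts = rng.choice(sst_ens)
        for k, t in enumerate(grid):
            value = interp(t, ages, ssts)
            if value is not None:
                samples[k].append(value)
    bands = []
    for values in samples:
        if len(values) < 2:
            bands.append(None)  # too few draws cover this grid point
            continue
        # 99 cut points; cut point p*100 - 1 is the p-th percentile
        cuts = statistics.quantiles(values, n=100, method="inclusive")
        bands.append([cuts[round(p * 100) - 1] for p in probs])
    return bands


# Synthetic stand-in for one core: 21 samples, 100 members per ensemble.
rng = random.Random(42)
depths = [10 * i for i in range(21)]  # cm


def make_age_member():
    age, member = 0.0, []
    for _ in depths:
        member.append(age)
        age += rng.uniform(0.3, 0.7)  # kyr per step, always positive
    return member


def make_sst_member():
    return [20.0 - 0.02 * d + rng.gauss(0.0, 0.5) for d in depths]


age_ens = [make_age_member() for _ in range(100)]
sst_ens = [make_sst_member() for _ in range(100)]
grid = [0.5 * k for k in range(1, 11)]  # 0.5 to 5 kyr

bands = quantile_bands(age_ens, sst_ens, grid)
for t, band in zip(grid, bands):
    print(t, [round(q, 2) for q in band])
```

Plotting the five columns of `bands` against `grid` would give the kind of figure suggested. Filtering out physically implausible paths, as in the strategies cited above, would need an additional step that is not attempted here.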
Minor Comments:
L113: “the updated radiocarbon curve”. We learn later in the text that it is IntCal20; it would be logical to cite it here.
L140: “estimates of their uncertainty” —> estimates of uncertainty
L227: “associated manuscript” -> original publications (there may be more than one, and “manuscript” typically refers to a pre-publication form)
L249: the reference to “convergence” will be unfamiliar to many readers; it may be worth explaining in more detail.
Table 1:
- “standardised parameter name” —> what standard vocabulary was used here? If none was available or relevant, how did the authors choose to standardise?
- “CalibrationUncertainty”: does this refer to 1sigma, 2sigma, or something else?
L273: “priority climate-relevant variables” —> who decided on such priorities?
L329: “14 time series shorter than 1 kyr are not shown.” —> why are such series included at all? Is there any scientific use to be made of them in the context of this compilation?
L359: “The majority … is based” -> The majority … ARE based
L363: “LDI”: this is a new acronym to me. Would it be useful to include a citation?
L376: “Data is freely available” —> Data ARE freely available.
L378/9: “We explicitly encourage users to also cite the original data when using this data product.” Please provide an explicit example or two here, as I am not sure everyone would get the hint otherwise.
L412/3: “To allow ..” -> this sentence is incomplete. It implies a second clause that never comes. Please rewrite.
L414: “different calibration (schemes)” -> please remove parentheses.
L418: “only in the form as reported” -> only in the form reported (no “as”)
L444: “modelling strategies like for instance done in the “ -> modelling strategies like for instance used for the …
L451/2: Note that for Python users, the relevant resource is now the PyLiPD package (https://pylipd.readthedocs.io/en/latest/) and its associated tutorials (https://linked.earth/pylipdTutorials/intro.html).
L456/7: “Users are encouraged to report those to the lead author so they can be corrected”. Can you perhaps add a line about the process for reporting errors (github issues?), the versioning scheme for corrections, and how they will be released?
Question for the editor: does ESSD have a mechanism for issuing dataset corrections distinct from the standard Copernicus author correction?
L464: “metadata and chronology data is” -> metadata and chronology data are
L474: “invariantly” -> invariably
L481: “The lack of a standardised vocabulary and consistent ontology” -> what about the LinkedEarth ontology (https://linked.earth/ontology/), which is aligned to the LiPD scheme?
Re: “The lack of a standardised vocabulary”, the PaST Thesaurus qualifies in my book as a “standardised vocabulary”. Perhaps the authors mean that the two major data repositories (WDS-Paleo and PANGAEA) have yet to adopt the same vocabulary? If so, I agree that it is worth calling out, in hopes that they work more closely together (and the rest of the community) to make that happen.
Citation: https://doi.org/10.5194/essd-2025-599-RC2
AC3: 'Reply on RC2', Lukas Jonkers, 30 Jan 2026