The PALMOD 130k marine palaeoclimate data synthesis version 2
Abstract. Palaeoclimate data hold the unique promise of providing a long-term perspective on climate change and as such can serve as an important benchmark for climate models. However, palaeoclimate data have generally been archived with insufficient standardisation and metadata to allow for transparent and consistent uncertainty assessment in an automated way. Thanks to improved computation capacity, transient palaeoclimate simulations are now possible, calling for data products containing multi-parameter time series rather than information on a single parameter for a single time slice. To confront transient simulations that span the last glacial-interglacial cycle with palaeoclimate data, we have compiled a multi-parameter marine palaeoclimate data synthesis that contains time series spanning 0 to 130,000 years ago. In 2020 Jonkers et al. (2020) published the first version of the PALMOD 130k marine palaeoclimate data synthesis and described our data synthesis strategy and the contents and format of the data product in detail. Here we present a major update of the data product that markedly increases both the spatial and temporal coverage. Version 2 of the synthesis contains 2,286 time series of eight palaeoclimate parameters from 475 individual sites, each associated with rich metadata, age–depth model ensembles, and information to refine and update the chronologies. Version 2 contains 468 time series of benthic foraminifera δ18O; 357 of benthic foraminifera δ13C; 423 of near sea surface temperature; 482 and 273 of planktonic foraminifera δ18O and δ13C; and 128, 111 and 44 of carbonate, organic carbon and biogenic silica content, respectively. Compared to version 1, all radiocarbon ages have been recalibrated and the age-depth models updated. 
In addition, near sea surface temperature estimates based on planktonic foraminifera Mg/Ca and on UK37' have been recalculated using a single calibration thus ensuring global comparability and comprehensive assessment of their uncertainty. The data product is available in two formats (R and LiPD) facilitating use across different software and operating systems and can be downloaded at https://doi.pangaea.de/10.1594/PANGAEA.984602 (Jonkers et al., 2025b). This data descriptor presents our updating methodology and describes the contents and format of the data product in detail and concludes with recommendations on palaeodata stewardship to increase the reusability of such data.
Jonkers and co-authors present an update of the PALMOD database of marine paleotracers for the last 130 kyr. This is a necessary and significant update. The previous version was limited to δ13C and δ18O from benthic foraminifera; Jonkers et al. now include planktonic foraminifera stable isotopes and other proxies such as Mg/Ca and carbonate and biogenic silica content. They also include temperature reconstructions for each of the sites. The authors state that they have merged the age models with the proxy data, assigning an age value to each downcore sample. This is a useful addition, which saves users from having to interpolate the age models onto the depth scales of the data.
I am worried by the presentation of the data. The authors chose two formats: R and LiPD files. However, the way the data are structured makes it very hard to inspect them at a glance. The R (.RDS) files require R. I am an advanced Python programmer, and I was not able to access the data quickly. In both the R and the LiPD files, the data set of each coring site is saved without column names, and I found no easy way to display a depth, age, proxy list on screen. I thought LiPD would be easier, since I know that inside a LiPD folder the data are saved as .csv. However, the problem is the same: there is nothing in any of the files that tells me what I am looking at. Of course, it could be that I am not knowledgeable enough to open these files. But I work with data myself, so if I had a problem accessing the database, it is reasonable to expect that many other users will have problems too.
I recommend that the authors re-include NetCDF files for each coring site as part of the database (they were included in the first version of the PALMOD database). These files are more universal than LiPD and R files and can be read by many different software packages. In addition, I recommend that the authors produce a more human-readable version of the database. This could simply be the csv files inside the LiPD directories, provided that each file states explicitly what each column contains. Otherwise, I am sad to say that this important data product will be useful only to a small number of R and LiPD experts.
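To make the last request concrete, here is a minimal Python sketch of the kind of labelling I have in mind: a header row is built from per-column metadata and written in front of an otherwise headerless csv table. The function name `add_header`, the metadata keys (`number`, `variableName`, `units`) and the sample values are my own assumptions for illustration; they are not taken from the data product and may not match its actual layout.

```python
import csv
import io


def add_header(csv_text, columns):
    """Return csv_text with a header row built from column metadata.

    `columns` is a list of dicts with the hypothetical keys 'number'
    (1-based column position), 'variableName' and, optionally, 'units'.
    """
    # Order the metadata by column position so labels line up with the data.
    ordered = sorted(columns, key=lambda c: c["number"])
    header = [
        f"{c['variableName']} ({c['units']})" if c.get("units") else c["variableName"]
        for c in ordered
    ]

    rows = list(csv.reader(io.StringIO(csv_text)))
    # Refuse to label a table whose width disagrees with the metadata.
    for row in rows:
        if len(row) != len(header):
            raise ValueError("column count does not match metadata")

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


# Made-up values for illustration only; not taken from the data product.
raw = "10,1.2,3.4\n20,2.5,3.1\n"
meta = [
    {"number": 2, "variableName": "age", "units": "ka"},
    {"number": 1, "variableName": "depth", "units": "cm"},
    {"number": 3, "variableName": "d18O", "units": "permil"},
]

labelled = add_header(raw, meta)
print(labelled)
```

With a header of this kind in every file, a user could open any single table in a text editor or spreadsheet and see immediately which column is depth, which is age and which is the proxy.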