Reply on RC2

The choice to include the Abelmann et al. (1999) dataset was dictated by the need for taxonomic consistency. Our co-author (Cortese), a Southern Ocean radiolarian specialist, worked as a post-doc for 9 years in Andrea Abelmann’s group at AWI. During his first postdoc, he was tasked with updating and expanding the Abelmann et al. dataset by including samples from the subtropical region of the Atlantic Sector. Cortese used this updated version of the Abelmann et al. (1999) dataset to reconstruct temperatures in core ODP1089, Leg 177 and in core PS2821-1 from radiolarian assemblages he counted (Cortese and Abelmann, 2002). At that time, he made sure that A. Abelmann’s taxonomic concepts and his own were the same. In our new paper, those concepts have been modified only slightly (e.g., the radiolarian category “Larcoid” was changed to Pylonoiidae/Litheliidae, and Antarctissa cylindrica was added to Antarctissa sp.), following studies published after the 1990s that updated modern concepts of radiolarian taxonomy (Suzuki and Aita, 2011; Lazarus et al., 2015; Matsuzaki et al., 2015). We therefore stress that the Abelmann et al. (1999) dataset was checked thoroughly and that its taxonomy is the same as that of the rest of SO-RAD. This need for taxonomic consistency is also the reason we decided not to include other datasets that were not counted by any group members, particularly if they were quite dated (e.g., CLIMAP, 1997; Pisias et al., 1997), as the different counting conventions and taxonomy would make it very difficult to harmonize those data with the rest of our dataset. The obvious solution would be to obtain the original samples/slides and recount them, but this may not be practicable as the authors of those datasets have retired. Incidentally, we perceive data accessibility as one of the main benefits of the present SO-RAD compilation.
We have adopted this recounting approach for two datasets for which slides were available to us: Hollis and Neil (2005) for samples around New Zealand, and Rogers’ samples from the Indian Sector (Rogers & De Deckker, 2007). For both of these, Cortese recounted all slides.

For the future expansion of the dataset, the intention is that the authors of the dataset will seek out existing samples, as well as samples from upcoming voyages, and add those data to improve the spatial coverage of the SO-RAD dataset.

Environmental data
The use of environmental data in this paper is not for analytical purposes, but to illustrate the environmental coverage of the dataset in each sector. The source of the environmental data used in this paper is referenced and freely available for download.
The SO-RAD dataset consists of radiolarian census counts only. The intention of the authors in building this dataset is not to include environmental data alongside the radiolarian census counts. Future users of the dataset who wish to perform environmental reconstructions, or use the data for other purposes, should select the environmental variables, depths and data source that best suit their needs. We do not want future users to think that the SO-RAD census counts should only be used with this one set of environmental data, which is why we have not included it in the SO-RAD dataset itself, or as a supplement to this paper.

Recommending to subset data by taxa depth
There are many views on subsetting a reference dataset for use in palaeoceanographic reconstructions. Some radiolarists have used species from all depths in their reconstructions, while others have removed species based on their known depth ranges to reconstruct environmental variables at a particular depth: e.g., Abelmann et al. (1999), Cortese and Abelmann (2002), and Matsuzaki and Itaki (2017). The authors of this paper are not prescribing hard and fast rules that future users of the dataset must follow in regard to subsetting radiolarian census count data, but are outlining a range of methods seen in radiolarian literature for future users to consider. They can then decide on the best method given their own project aims. The aim of SO-RAD is to provide as much raw data as possible with harmonized taxonomy and future users will be free to use and adapt SO-RAD to best suit their own needs.
After considering the reviewer's comment, we have amended the wording of the sentence 'It may be desirable to limit the reference dataset to taxa that are known to live at the depth for which a specific environmental variable is being reconstructed.' to 'It is possible to limit the reference dataset…' to demonstrate a more neutral approach regarding this matter.

Spatial coverage of dataset, i.e., the northern limit of the Southern Ocean
The reviewer has stated that 60°S should be considered the northern limit of the Southern Ocean. This 60°S definition of the Southern Ocean may be considered the 'political' boundary of the Southern Ocean. For example, the Antarctic Treaty generally covers the area south of 60°S but expands to 45°S around Kerguelen Island in the Indian sector of the Southern Ocean. The oceanographic northern boundary of the Southern Ocean has several definitions. It can be considered as being the northernmost extent of the Antarctic Circumpolar Current (~38°S), or as far north as 30°S to include all oceanographic conditions south of the Subtropical Front (Sokolov and Rintoul, 2002; Talley et al., 2011). Additionally, the westerly winds, drivers of the dominating zonal circulation in the Southern Ocean (the ACC), originate well north of 60°S. It is therefore beneficial to include samples located as far north as 30°S to encompass the full extent of the Southern Ocean.
Moreover, as the reviewer also mentions, if the dataset were used for palaeoceanographic reconstructions, it is important to incorporate a range of sites so that periods warmer than today can be accurately reconstructed. This translates to the need to capture radiolarian assemblages related to the western boundary currents (Agulhas system, East Australian Current) that feed into the Southern Ocean. Future users of the dataset can subset the sites based on the latitudinal and longitudinal boundaries that are most appropriate to their project aims.
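The kind of subsetting mentioned above could be done in any software. As a minimal sketch only, assuming each site is described by a latitude and longitude in decimal degrees (southern latitudes and western longitudes negative), a user might filter sites as follows. The `Site` class, the function name `subset_sites` and the three example sites are hypothetical and are not part of SO-RAD.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str   # hypothetical site label, not a real SO-RAD sample
    lat: float  # decimal degrees, negative south of the equator
    lon: float  # decimal degrees in the range -180..180, negative west


def subset_sites(sites, lat_min, lat_max, lon_min=-180.0, lon_max=180.0):
    """Return the sites inside a latitude/longitude box.

    If lon_min > lon_max, the box is taken to cross the 180° meridian.
    """
    def lon_in_box(lon):
        if lon_min <= lon_max:
            return lon_min <= lon <= lon_max
        return lon >= lon_min or lon <= lon_max

    return [s for s in sites if lat_min <= s.lat <= lat_max and lon_in_box(s.lon)]


# Invented example sites for illustration only.
sites = [
    Site("A", -35.0, 20.0),
    Site("B", -55.0, 150.0),
    Site("C", -65.0, -170.0),
]

# All sites south of 50°S, at any longitude.
south_of_50 = subset_sites(sites, lat_min=-90.0, lat_max=-50.0)

# Sites south of 30°S in a box crossing the 180° meridian (160°E to 160°W).
pacific_box = subset_sites(sites, -90.0, -30.0, lon_min=160.0, lon_max=-160.0)
```

The boundaries themselves remain the user's choice; the sketch only shows that a stricter or broader definition of the Southern Ocean can be applied after download without changes to the dataset.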
The authors of the dataset intend to expand the dataset by adding sites predominantly within the Southern Ocean's northern oceanographic boundary. There is no intention to ensure coverage of the dataset expands to include the tropics, which represent a different oceanographic and climatic realm, and therefore renaming the dataset as the 'Southern Hemisphere Radiolarian dataset' would be misleading.

Size fractions
The reviewer has noted that the size of the sieve used by Rogers and De Deckker (2007) to prepare the slides is in fact 63 µm rather than 45 µm. Our statement that all samples were prepared with a 45 µm sieve was therefore incorrect. This was an oversight by the authors and the methods section of the manuscript has been amended accordingly.

Additional column in dataset - source of data
The reviewer has recommended the inclusion of the source of the data in the SO-RAD database. This column has been added to the dataset and will be submitted to Pangaea for updating.
The reviewer has also requested the inclusion of the radiolarians/gram variable in the dataset. These data are not available for the majority of the samples included in this study and we have therefore not included this variable in the SO-RAD dataset. Moreover, the total number of radiolarians/gram of sediment is a different metric, used as an indication of radiolarian productivity, whereas this database is mainly focused on providing census counts of radiolarian assemblages.

Taxonomic harmonisation
As requested, an additional supplementary table will be submitted with the final manuscript. The supplement will list species names that appeared in previously published data but have been updated in the SO-RAD dataset to fix inconsistencies in naming, or to align with the most recently accepted species name.

Quantification of uncertainties from source data, environmental data, census counts
In relation to environmental data, future users of the dataset who wish to conduct palaeoenvironmental reconstructions would be expected to source their own environmental data. Part of their analysis would be to investigate the uncertainties associated with their chosen environmental data set.
Some uncertainties in census counts of microfossils are difficult to quantify as they can be affected by factors such as preferential dissolution of species, differing taxonomic concepts, and counting procedures. One aspect of uncertainty in census counts which may be quantifiable is how representative the census count is of the natural population. The error associated with dominant species will generally be lower than that associated with rare species. A higher number of counted individuals tends to minimise this type of error (Fatela and Taborda, 2002). These topics are covered briefly in sections 4.1 and 4.2.1 of the paper.
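To illustrate the counting-error point, a minimal sketch follows. It assumes simple binomial sampling, a common approximation for census counts; it is not taken from the paper and may not match the exact treatment in Fatela and Taborda (2002). The function names and the example proportions and count totals are ours, chosen for illustration only.

```python
import math


def proportion_std_error(p, n):
    """Binomial standard error of an observed proportion p from n counted specimens."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


def relative_error(p, n):
    """Standard error expressed as a fraction of the proportion itself (p > 0)."""
    return proportion_std_error(p, n) / p


# Illustrative values only: a dominant species (50 %) and a rare one (1 %).
dominant_400 = relative_error(0.50, 400)
rare_400 = relative_error(0.01, 400)
rare_1600 = relative_error(0.01, 1600)

print(f"dominant species, n=400:  relative error {dominant_400:.3f}")
print(f"rare species,     n=400:  relative error {rare_400:.3f}")
print(f"rare species,     n=1600: relative error {rare_1600:.3f}")
```

Under this assumption the relative error is larger for the rare species than for the dominant one at the same count total, and quadrupling the number of counted individuals halves it.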

Submission of R code
R code has not been submitted with this manuscript because no code was used in the construction of the dataset and none is required by future users to work with the dataset. Summary statistics, maps and figures were created using R; however, these could have been made using a variety of other point-and-click software, and providing the R code seemed unnecessary.

General manuscript comments
The reviewer has recommended additional references be included in the paragraph starting at L42, and in L53. Additional references have been added in those locations.
'L. 38. Subtropical Southern hemisphere assemblages are not dominated by siliceous microfossils. Subtropical (25° to 40° S).' We agree with this comment and, as the statement is not vital to the paper, we have removed it.
'L. 54. Sentence about the importance of siliceous microfossils is confusing.' We agree with this comment and, as the statement is not vital to the paper, we have removed it.
'L. 184. This sentence is confusing: …'are well removed from the direct influence of seasonal sea ice.'' We have modified the wording of this sentence (now line number 181) to '…may be affected by sea ice in certain years.'