Comment on essd-2021-259

This study presents a compilation of oceanic 234Th measurements made at the global scale over the past 50 years (1967-2018). The dataset is composed of several parameters, including total, dissolved, and particulate 234Th activity, POC:234Th and PON:234Th ratios, along with 238U activity and POC and PON concentrations when available. This set of parameters constitutes the basis for the use of 234Th as a proxy of carbon and nitrogen export fluxes from the oceanic upper water column. The data were obtained from several sources, the vast majority from peer-reviewed articles (214) and to a lesser extent from PhD manuscripts (4), public data repositories (8), and unpublished datasets (9). The database is composed of 223 Excel spreadsheets along with a compilation file hosted in the PANGAEA repository. For each dataset, relevant metadata (geographic location, sampling date, project, sampling and processing methodology, bloom stage, etc.) have been systematically included. The associated paper introduces the dataset, presents a global overview, and then discusses the timeline of 234Th measurements according to four periods covering the last 50 years. Finally, the authors discuss gaps in the dataset and present some perspectives on its future uses.

General comments
First, I would like to acknowledge the extensive work that has gone into this very comprehensive and clear compilation of over 50 years of research on the short-lived radionuclide 234Th. Such a compilation was lacking until now and represents an important new step in the use of 234Th as a proxy for the export of C and other elements (N, BSi, CaCO3, and trace elements) from the upper water column. The database is well structured and clearly described, with detailed metadata of significant importance. It also appears exhaustive, and I could only identify a few minor omissions or errors (see detailed comments below).
My first general concern is with the form, as the manuscript contains a significant number of typographical and editorial errors and would have benefited from careful review before its submission. This concerns in particular the list of bibliographic references, which contains a significant number of errors (author lists, author names, type of reference such as PhD manuscript, book chapter, or research article). Still on the form, there is a surprising confusion between concentration and activity throughout the manuscript. The 234Th and 238U data you report are activities, not concentrations.
On the content, I was pleased to read the timeline of the Th studies, which summarizes the major stages that contributed to the development of the method. For clarity, I would recommend indicating the corresponding years for each period, both in the subsection headings of the manuscript and in the figures.
About the different eras, I might have subdivided eras 1 and 2 a little differently by considering the JGOFS program as the beginning of era 2. In fact, it seems more logical to me to take into account the sampling periods corresponding to different programs rather than the date of publication of the resulting studies. This is illustrated further in Table 2, where most of the studies belonging to era 2 were performed in the framework of JGOFS.
Regarding the first era, you might consider introducing the GEOSECS program earlier than L365, after the description of the Coale and Bruland papers. I think it could be relevant to mention that it was in the framework of GEOSECS that the concept of scavenging by particles really emerged, especially for the open ocean. You may cite the seminal work of Turekian (1977), who introduced the concept of the "great particle conspiracy". Furthermore, it might be interesting to mention that during GEOSECS little attention was given to 234Th in comparison to 228Th and 230Th for studying scavenging processes in the open ocean (Broecker and Peng, 1982). It was only later, with the papers of Coale and Bruland, that the role of 234Th as a tracer of short-term particle dynamics was really highlighted.
Regarding era 4, related to the GEOTRACES program, I would set the starting year to 2008 or even 2007, which corresponds to the International Polar Year (2008) and the start of the GEOTRACES sampling program. Regarding this program, it would be relevant to include in Table 2 all cruises that have been sampled for 234Th so far. For consistency, I would recommend using the name of the section or process study as defined by GEOTRACES; you can then indicate which country was involved in the sampling (US/UK/Netherlands/Germany/India/France). This remark is also valid for the Th_database file: the projects performed within the framework of GEOTRACES should be named in a more consistent way, for instance GEOTRACES (section or process study number, country, and possibly project acronym).
Still on the GEOTRACES program, I think it is important to mention that not all sections analyzed for 234Th are available in the 2014 and 2017 Intermediate Data Products (L367). Even if data obtained as part of GEOTRACES are published in peer-reviewed journals, their inclusion in the IDP requires some additional steps (submission and acceptance by the GDAC). Also, you may indicate that the latest IDP (2021) was released very recently.
I also have some concerns regarding Section 5, "Significant gaps in the global dataset". It is not clear to me which gaps you want to discuss. Reading the first paragraph (L696-701), it seems the gaps you want to consider are related to the current understanding of the Biological Carbon Pump. On this topic, I would recommend including some more recent reviews (Henson et al., 2019; Boyd et al., 2019; Siegel et al., 2016), which detail some of the processes that require further consideration. Reading on, I notice you discuss two main points: the first is related to the spatiotemporal distribution of 234Th data, and the second, discussed at excessive length from L711 until the end of the section (L768), is related to the modeling approach (SS vs NSS) used for estimating the export fluxes of 234Th. It is surprising to note that this entire section is mostly discussed using only two references, both written by the first author of this review. In my view, this section needs to be reconsidered, first by giving more attention to the existing literature on the SS/NSS approach and the different oceanic contexts to which it has been applied (not only the North Atlantic), and second by addressing the other numerous issues related to the 234Th method. Among these, it is important to point out the role of physics (lateral and vertical advection and diffusion) (Buesseler et al., 1995; Savoye et al., 2006; Resplandy, 2012; Le Gland et al., 2019; Roca-Marti et al., 2017), the importance of the depth of integration (Buesseler et al., 2020), and finally all the other issues related to the conversion to carbon fluxes using the POC to Th ratio of sinking particles (the choice of the relevant size fraction, the interpolation methods, etc.). By following these guidelines, and considering that export fluxes have not been calculated or compiled in this review, you may be able to give recommendations to future users of this database.
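To illustrate why the terms listed above matter, the simplest form of the flux calculation can be sketched as follows. This is a minimal Python sketch and not the authors' method: it assumes steady state and neglects advection and diffusion, so that the sinking 234Th flux reduces to the decay constant times the depth-integrated 238U-234Th activity deficit. The function name, argument names, and units (depths in m, activities in dpm L-1) are assumptions made for illustration only; the 24.1-day half-life of 234Th is the only physical constant used.

```python
import math

# Half-life of 234Th in days; decay constant in d^-1.
TH234_HALF_LIFE_DAYS = 24.1
LAMBDA_TH = math.log(2) / TH234_HALF_LIFE_DAYS


def steady_state_th_flux(depths_m, th_dpm_per_l, u_dpm_per_l):
    """Steady-state 234Th sinking flux (dpm m^-2 d^-1) at the deepest depth.

    Hypothetical helper: assumes steady state and no advection or diffusion.
    The 238U-234Th deficit is integrated with the trapezoidal rule.
    """
    if not (len(depths_m) == len(th_dpm_per_l) == len(u_dpm_per_l)):
        raise ValueError("depth and activity profiles must have equal length")
    if len(depths_m) < 2:
        raise ValueError("at least two depths are needed to integrate")

    deficit_inventory = 0.0  # dpm m^-2
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        if dz <= 0:
            raise ValueError("depths must be strictly increasing")
        deficit_top = u_dpm_per_l[i - 1] - th_dpm_per_l[i - 1]
        deficit_bottom = u_dpm_per_l[i] - th_dpm_per_l[i]
        # 1000 L per m^3 converts dpm L^-1 * m into dpm m^-2.
        deficit_inventory += 0.5 * (deficit_top + deficit_bottom) * dz * 1000.0

    return LAMBDA_TH * deficit_inventory
```

Any NSS or physical correction adds further terms to this balance, and the result depends directly on the deepest depth supplied, which is why the model choice and the integration depth should be reported alongside any flux derived from the database.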
Finally, and still about the gaps in this dataset, one further point that could be considered concerns the quality of the available data. I understand that this is difficult to address, but given the successive evolutions of the methods used for the determination of 234Th, the data are probably not all of the same quality. I think this should at least be mentioned, or even taken into account in the form of a quality flag assigned to each dataset.

Detailed comments
L15: The 234Th-238U pair is primarily used for assessing export fluxes; to look at export efficiency, you need to compare with the net primary production, which is not included in the dataset. Please clarify.
L20: Replace "concentrations" with "activities", here and at all other occurrences in the manuscript, for both 234Th and 238U.
L29: The term "uptake" is a bit ambiguous and not directly related to the 234Th method or to export fluxes. Uptake can refer either to air-sea exchange or to biological assimilation; please clarify.
Table 2 L441-442: Correct relative to its parent nuclide or 238U.
L443: Clarify what you mean by "response time" and what the second equation corresponds to.
L455: Correct "with time".
L456: Here you keep focusing on the SS vs NSS approach. You need to better account for the physical terms (according to Buesseler et al., 1995, in specific ocean settings such as upwelling regions). Furthermore, vertical diffusion can be quantified from a single 234Th profile if the diffusivity coefficient is known or assumed.
L458: Check the sentence; you may change it to "when temporal fluctuations in 234Th activity can be assessed". In addition, as mentioned by Savoye et al. (2006), the NSS approach requires the same water mass to be sampled. This is another requirement of the NSS approach that can be difficult to meet in dynamic settings.
L466: Correct "0.001<colloids…".
L471: There is a confusion: the 3-D model was not built to estimate the gradients of 234Th activities but to estimate the physical components of the flux (V terms). Without these components, the sinking flux would have been largely underestimated in the equatorial upwelling region. The APERO project will run from 2022 to 2026, and the cruise is planned for 2023 in the western North Atlantic.
L706: Correct "gaps".
L722: Correct "with time".
L777-778: Correct "such as" and clarify the whole sentence.
Table 1. Check the start and end dates (sometimes inverted). The design needs to be improved, as it is very difficult to identify which parameters have been measured for each study.
For the Lemaitre et al. (2018) study, please correct the reference as follows: Lemaitre, N., Planchon, F., Planquette, H., Dehairs, F., Fonseca-Batista, D., Roukaerts, A., Deman, F., Tang, Y., Mariez, C., and Sarthou, G.: High variability of particulate organic carbon export along the North Atlantic GEOTRACES section GA01 as deduced from 234Th fluxes, Biogeosciences, 15, 6417-6437, https://doi.org/10.5194/bg-15-6417-2018, 2018.
The study of Maïti et al. (2016) reports total Th profiles.
Table 2. As for Table 1, the design needs some improvements. Column headings need to be clarified, and further details on programs other than JGOFS could be included. This could especially be the case for GEOTRACES as well as EXPORTS. Furthermore, I do not see why a given program is considered a major program. For instance, the HiLATS program is indicated only for 2001. Check the start date for DYFAMED.