Revisiting five decades of 234 Th data: a comprehensive global oceanic compilation

We present here a global oceanic compilation of 234 Th measurements that collects results from researchers and laboratories over a period exceeding 50 years. The origin of 234 Th sampling in the ocean goes back to 1967, when Bhat et al. (1969) initially studied the 234 Th distribution relative to its parent 238 U in the Indian Ocean. However, it was the seminal work of Buesseler et al. (1992), which proposed an empirical method to estimate export fluxes from 234 Th distributions, that drove the extensive use of the 234 Th-238 U radioactive pair to evaluate the organic carbon export out of the surface ocean by means of the biological carbon pump. Since then, a large number of 234 Th depth profiles have been collected using a variety of sampling instruments and strategies that have changed during the past 50 years. The present compilation comprises a total of 223 data sets: 214 from studies published in refereed journals, PhD theses or repositories, and 9 unpublished data sets. The data were compiled from over 5000 locations spanning all the oceans for total 234 Th profiles, dissolved and particulate 234 Th activity concentrations (in dpm L -1), and POC: 234 Th ratios (in µmol dpm -1) from both sediment traps and filtration methods. A total of 379 oceanographic expeditions and more than 56600 234 Th data points have been gathered in a single open-access, long-term and dynamic repository. This paper introduces the dataset along with informative and descriptive graphics. Appropriate metadata have been compiled, including geographic location, date, and sample depth, among others. When available, we also include water temperature, salinity, 238 U data (over 18200 data points) and particulate organic nitrogen data. Data sources and methods information (including 238 U and 234 Th) are also detailed, along with valuable information for future data analysis such as bloom stage and steady/non-steady state conditions at the time of sampling.
The data are archived on the PANGAEA repository, with the dataset's DOI doi.pangaea.de/10.1594/PANGAEA.918125 (Ceballos-Romero et al., 2021). This provides a valuable resource to better understand and quantify how the contemporary oceanic carbon uptake functions and how it will change in the future.


Introduction
For several decades, radioactive tracers have been used to gain a better understanding of different oceanographic processes. In the context of the Biological Carbon Pump (BCP) (Eppley and Peterson, 1979; Volk and Hoffert, 1985), radionuclides such as 210 Po and thorium isotopes are extensively used to study the various physical, chemical or biological processes involved in particle export and flux attenuation in the oceans (Cochran and Masque, 2003). Radionuclides are characterized by a unique property: their half-life, the time it takes for one-half of the atoms of a radioactive element to undergo radioactive decay and thus transform into a different isotope. Half-lives are not affected by temperature, physical or chemical state, or any other influence. As a result, the concentration of naturally occurring radioactive elements varies over time at well-characterized decay and production rates. Observations of radionuclide distributions in the water column, in time and space, provide valuable insights into the presence and rates of ocean processes on spatial scales from local to basin-wide and timescales of days to millennia, depending on the radionuclide employed.
The naturally occurring radioisotope 234 Th has been commonly used to understand natural aquatic processes in four major areas: particle cycling, horizontal transport, sediment dynamics, and vertical transport. 234 Th has been collected by a variety of sampling procedures since its initial use by Bhat et al. (1969). The chemistry of 234 Th dictates that it is adsorbed onto particle surfaces and effectively scavenged from the dissolved phase. Hence, when biological activity is high, 234 Th is removed from the surface ocean and transported downward by sinking particles, thereby generating a deficit relative to its soluble or "conservative" parent 238 U. This deficit can be used to calculate the downward flux of 234 Th. An excess in 234 Th activity (i.e., a daughter concentration higher than that of the parent) is attributed to fragmentation processes that result in the conversion of sinking to non-sinking particles, generically termed "remineralization" (Maiti et al., 2010). Due to its short half-life of T1/2 = 24.1 days (decay constant: λ = ln2/T1/2 = 0.02876 d -1, mean lifetime: τ = 34.8 d) and its particle reactivity in seawater, it is suitable to trace processes occurring in the upper ocean on time scales from days to months (Rutgers van der , or even shorter when there is high scavenging by particles ) (see Section 4.2). 234 Th has been an indispensable tool in many oceanographic field expeditions. The most widespread application of the 234 Th approach is to estimate the gravitational settling of carbon as Particulate Organic Carbon (POC) out of the surface layer, which results in a downward flux of POC (see e.g., review by Le Moigne et al., 2013b and references therein).
To a lesser extent, this radionuclide is also used to estimate the downward flux of other elements to the deep ocean, such as Particulate Inorganic Carbon (e.g., Le Moigne et al., 2013a; Wei et al., 2011), Biogenic Silica (e.g., Buesseler et al., 2005; Lemaitre et al., 2016; Le Moigne et al., 2013a), Particulate Organic Nitrogen (PON) (e.g., Buesseler et al., 1992; Charette and Buesseler, 2000; Murray et al., 1996), or trace metal fluxes (e.g., Black et al., 2018; Lemaitre et al., 2016, 2020; Weinstein and Moran, 2005).
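The decay constant and mean lifetime quoted above follow directly from the half-life. The short sketch below reproduces that arithmetic; the variable and function names are illustrative and are not part of the compilation.

```python
import math

HALF_LIFE_DAYS = 24.1  # 234Th half-life quoted in the text

# Decay constant (d^-1) and mean lifetime (d) derived from the half-life.
decay_constant = math.log(2) / HALF_LIFE_DAYS
mean_lifetime = 1.0 / decay_constant


def fraction_remaining(days: float) -> float:
    """Fraction of an initial 234Th activity left after `days`, with no new production."""
    return math.exp(-decay_constant * days)


print(f"lambda = {decay_constant:.5f} d^-1")
print(f"tau    = {mean_lifetime:.1f} d")
print(f"fraction left after one half-life = {fraction_remaining(HALF_LIFE_DAYS):.2f}")
```

Because roughly 97% of an unsupported 234 Th signal decays within five half-lives (about 120 days), the tracer integrates over days to months, consistent with the time scales stated above.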
It was in the 1990s that an increasing number of 234 Th studies took place. This increase in use is in part due to a variety of new sampling instruments and strategies that have changed over the years. In 2006, a special issue entitled Future Applications of 234 Th in Aquatic Ecosystems (FATE, https://www.sciencedirect.com/journal/marine-chemistry/vol/100/issue/3) was published in Marine Chemistry with the purpose of thoroughly reviewing the use of 234 Th in aquatic systems. Papers included reviews of the applications and future uses of this radiotracer (Benitez-Nelson and Moore, 2006; Waples et al., 2006), discussions on the techniques and methodologies used for 234 Th analyses (Rutgers van der , the impact of POC: 234 Th ratios and their sampling methodology on POC flux estimates, and 234 Th sorption and export models in the water column, among other topics. As evidence that 234 Th is one of the most actively used tracers in oceanography, Waples et al. (2006) already reported 237 papers dealing with 234 Th published in refereed journals. However, after 5 decades of extensive use, a single repository of 234 Th measurements has never been compiled, and 234 Th data remain scattered when they do not belong to major sampling programs (see Section 4 for details). Therefore, it is valuable to bring together all existing 234 Th data, along with appropriate metadata, in one repository to facilitate further use and analysis.
Previous compilations of 234 Th-based data have addressed 234 Th-derived POC fluxes (see Le Moigne et al., 2013b), total 234 Th activity from the surface to 1000 m depth (Le Gland et al., 2019) and, more recently, POC: 234 Th ratios (see Puigcorbé et al., 2020 and doi.pangaea.de/10.1594/PANGAEA.911424). In contrast, we have compiled the complete results of 234 Th measurements in sea water and particles at every depth, location, and time of sampling. The compilation can be found at doi.pangaea.de/10.1594/PANGAEA.918125. This article is the report of the compilation, a unique dataset to better understand and quantify how the contemporary oceanic carbon uptake functions and how it will change in the future.
The goal of this effort is to serve as a basis for an open-access, long-term and dynamic oceanic repository of 234 Th measurements and valuable metadata, to be used in an accessible, easily findable, and inter-operable way and to grow from now on through the contributions of other authors involved in 234 Th sampling. Moreover, given the great number of metadata and parameters compiled, the compilation offers multiple ways to use 234 Th, even opening the possibility of exploring new applications. For that reason, we have chosen not to compile the derived parameters reported by other authors, such as 234 Th downward fluxes or 234 Th-derived POC fluxes, but rather to provide the data necessary for others to make such analyses, open to the criteria, modeling, and interpretation chosen by each researcher.

Data organization
We have gathered data sets consisting of 234 Th measurements in oceanic waters sampled between 1967 and 2018. The compilation includes a total of 56633 data points for 234 Th activity concentrations (in dpm L -1, referred to simply as 234 Th concentrations from now on), distributed as follows: i) 21457 for total, ii) 6591 for dissolved, iii) 13977 for particulate 234 Th measurements, and iv) 10856 and v) 3750 for POC: 234 Th and PON: 234 Th ratios, respectively (in µmol dpm -1). Additionally, 238 U concentrations (18256 data), and POC (7651 data) and PON (1740 data) concentrations (in µmol L -1) are also reported.
POC and PON concentrations are referred to as "CHN" (Carbon-Hydrogen-Nitrogen) in the compilation. We are aware that CHN is not the only analytical method that can be used to determine POC and PON concentrations. An elemental analyzer-isotope ratio mass spectrometer (EA-IRMS) is also widely used for this purpose (see e.g., studies by Lemaitre et al., 2018; Planchon et al., 2015), but we have used this notation for the sake of simplicity. Please refer to the original source for the analytical method used to measure these data. Temperature and salinity values are also included when possible, making a total of 5652 values compiled for temperature and 12721 values for salinity. When fields are missing for a given record, the data are entered as "-999".
The data have been extracted from a total of 219 different studies published in refereed journals between 1969 and 2020, 5 PhD dissertations, 9 data sets accessible in 4 different public repositories, and 9 unpublished ones directly provided by authors (listed in Table 1) spanning all oceanic regions (Figure 1a) and compiled in a total of 223 data sets. The data repositories include: 1) BCO-DMO® ("Biological & Chemical Oceanography Data Management Office", https://www.bco-dmo.org/), 2) DARWIN® ("Data and Sample Research System for Whole Cruise Information in JAMSTEC", http://www.jamstec.go.jp/e/about/informations/notification_2021_maintenance.html), 3) EDI Data Portal® ("Environmental Data Initiative", https://portal.edirepository.org/nis/home.jsp), and 4) PANGAEA® (https://www.pangaea.de/). Additionally, many of the data corresponding to published GEOTRACES® (https://www.geotraces.org/) cruises can be found on the GEOTRACES website (see the most recent version of its intermediate data product, released in 2021 (GEOTRACES Intermediate Data Product Group, 2021)).
Each data set is uniquely identified by an integer record ID number (denoted as "Reference_USE") and consists of two tables: i) the "metadata" and ii) the "data".
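Because missing fields are stored as "-999", they should be masked before any statistics are computed. The following is a minimal sketch of one way to do this with the Python standard library. It assumes the data have been exported to CSV, and apart from "Reference_USE" the column names are hypothetical rather than the actual field names of the compilation.

```python
import csv
import io

MISSING = "-999"  # sentinel used in the compilation for missing fields


def read_records(text: str) -> list[dict]:
    """Parse CSV text and replace the missing-value sentinel with None."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({key: (None if value.strip() == MISSING else value.strip())
                     for key, value in row.items()})
    return rows


# Hypothetical two-row excerpt; column names other than Reference_USE are illustrative.
sample = """Reference_USE,depth_m,total_234Th,temperature
1,10,1.85,-999
1,50,2.31,12.4
"""
records = read_records(sample)
temperatures = [float(r["temperature"]) for r in records if r["temperature"] is not None]
print(len(records), temperatures)
```

If the sentinel appears in another form in a given export (for example "-999.0"), the comparison would need to be adapted accordingly.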

"Metadata" sheet
The "metadata" table summarizes at a glance the origin of the data in its broadest sense. It contains information structured in 6 categories: 1) "REFERENCE_USE", 2) "INFO", 3) "DATA", 4) "METHODS", 5) "ADDITIONAL_DATA", and 6) "DATA_SOURCE". The full list of metadata included in each dataset and a brief description of the table fields can be found in the Supplementary Material (see Table S1).
There are data from 379 cruises, covering 5134 locations spanning all oceanic regions (Figure 1a). Some stations were part of cruise transects whereas others were part of small-scale surveys or reoccupations of the same location in different seasons and years. In all cases, sampling region and period (including bloom stage at the time of sampling, when indicated by the authors) are given as metadata as part of the "INFO" section, where a total of 17 fields are reported (see Table S1).
Information such as the project name (when sampling took place within an observational program), cruise name, leg details, research vessel, and chief scientist is indicated. A summary of the sampling period, the maximum and minimum latitude and longitude of the region surveyed, and the maximum depth sampled for 234 Th is also provided. When available, to support better interpretation of the data for accurate POC flux assessment, the stage of the bloom has also been included (categorized as "bloom", "pre-bloom", "post-bloom" and "no bloom"), as recommended by e.g., Ceballos-Romero et al. (2016). Note that we do not assess the bloom stage; instead we include the information as indicated by the original authors when provided. To distinguish between the periods before and after the peak in primary production, we identify "development of the bloom" and similar expressions as the pre-bloom stage and "decline of the bloom" and similar expressions as the post-bloom stage. We assigned bloom or non-bloom conditions only when these were stated by the authors. When none of these stages is referred to, no information is included (noted as "-999"). We acknowledge that there could be inconsistencies in the way different authors have decided, over the years, whether conditions were non-bloom, pre-bloom, bloom, or post-bloom; these are beyond the scope of this compilation and were not evaluated. We plan to address this gap in future versions of the compilation (see Section 5).
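The categorization rule described above can be written as a simple keyword mapping. The sketch below is purely illustrative and is not the procedure used to build the compilation, where stages were assigned by reading each source. The function name and keyword list are assumptions.

```python
from typing import Optional


def bloom_stage(description: Optional[str]) -> str:
    """Map an author's wording to one of the four categories, or "-999" if none applies."""
    if not description:
        return "-999"
    text = description.lower()
    if "no bloom" in text or "non-bloom" in text:
        return "no bloom"
    if "development" in text or "pre-bloom" in text:
        return "pre-bloom"
    if "decline" in text or "post-bloom" in text:
        return "post-bloom"
    if "bloom" in text:
        return "bloom"
    return "-999"  # stage not stated by the original authors


for phrase in ["development of the bloom", "decline of the bloom", "bloom conditions", None]:
    print(repr(phrase), "->", bloom_stage(phrase))
```

The order of the checks matters: the more specific expressions are tested before the generic "bloom" keyword so that, for instance, "decline of the bloom" is not labeled as "bloom".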
The "DATA" section provides information on the data set contents at a glance, with YES/NO indicators for the basic data of 238 U, 234 Th (total, dissolved and particulate phases) and POC(PON): 234 Th ratios. Details of the size fraction(s) and the total number of stations and samples are also reported. A total of 13 parameters are detailed in this section.
The "METHODS" section is intended to provide, at a glance, useful interpretive information regarding how the sampling and/or measurements were accomplished. It provides basic information about i) 238 U determination: whether it was directly measured or salinity-derived, in which case the salinity relationship employed is specified; ii) total 234 Th sampling and radiochemical purification methods; sampling methods for the iii) dissolved and iv) particulate 234 Th phases; and v) the modeling approach followed when 234 Th data were used to estimate POC fluxes (i.e., the assumption of steady state (SS) or non-steady state (NSS) conditions). A final space for comments of any kind is included in this section.
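To illustrate what the steady-state (SS) assumption recorded in the "METHODS" section implies in practice, the sketch below integrates the 238 U-234 Th deficit over depth and multiplies it by the decay constant. It is a simplified one-dimensional formulation that assumes steady state and neglects advection and diffusion. It is not a method prescribed by the compilation, and the profile values are invented for illustration.

```python
import math

HALF_LIFE_DAYS = 24.1
DECAY_CONSTANT = math.log(2) / HALF_LIFE_DAYS  # d^-1


def steady_state_th_flux(depths_m, th_dpm_l, u_dpm_l):
    """Return the 234Th flux (dpm m^-2 d^-1) at the deepest sampled depth.

    Assumes steady state and neglects advection and diffusion, so that
    flux = lambda * integral of (238U - 234Th) over depth.
    """
    # Deficit converted from dpm L^-1 to dpm m^-3 (1 m^3 = 1000 L).
    deficit = [(u - th) * 1000.0 for u, th in zip(u_dpm_l, th_dpm_l)]
    integral = 0.0
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        integral += 0.5 * (deficit[i] + deficit[i - 1]) * dz  # trapezoidal rule
    return DECAY_CONSTANT * integral


# Hypothetical profile, for illustration only (not taken from the compilation).
depths = [0.0, 50.0, 100.0]
th = [1.8, 2.0, 2.4]
u = [2.4, 2.4, 2.4]
flux = steady_state_th_flux(depths, th, u)
print(f"{flux:.0f} dpm m^-2 d^-1")
```

A non-steady-state (NSS) treatment would additionally require repeated occupations of the same site to estimate the change in 234 Th activity over time, which is why the reoccupation information in the metadata is relevant.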
The potential of 234 Th data increases when combined with methods that account for export episodes over different timeframes of a bloom period, such as 210 Po-210 Pb or sediment traps (see e.g., Ceballos-Romero et al., 2016). For that reason, we included information regarding the availability of other techniques combined with 234 Th sampling in the "ADDITIONAL_DATA" section. Additional data of interest are specified with YES/NO indicators for the cases of i) 234 Th underway sampling, ii) sediment trap deployments, iii) 210 Pb-210 Po disequilibrium sampling paired with 234 Th, and iv) CHN data. Note that this section is merely intended to inform about the sampling methods used alongside the 234 Th technique. Both 234 Th underway data and POC: 234 Th and PON: 234 Th ratios from sediment traps are reported when available. However, 210 Pb-210 Po concentrations are not compiled in this dataset. The availability of these data is indicated as reported by authors. We have only consulted publications, and cruise reports when accessible, to gather information for these metadata, so we acknowledge that information as to the existence of these data might be missing. We therefore recommend treating an entry of "NO (available)" with caution, while an entry of "YES" can be fully trusted. To report an update to the metadata or an error in the data compiled, we encourage authors to contact us, and changes will be included in future versions of the compilation (see Section 5).
It is also worth mentioning here that the amount of data that could additionally be reported in the compilation is very extensive due to the wide applicability that has been shown for 234 Th over the years (see e.g., review by Waples et al., 2006). The inclusion of new parameters in future versions of the compilation will be discussed in Section 5 as part of our assessment of steps towards improving the global data set.
Finally, several details regarding the data source are included in the 8 fields specified in the "DATA_SOURCE" section, including a YES/NO indicator for the publication date. The data owners are always clearly indicated by the first author(s) of the publication or data set. In the case of data published in research articles, the journal and the publishing year are indicated, while in the case of unpublished data obtained from personal communication (as is the case for 9 datasets), "np" (standing for non-published) is given in the journal information and no publication year is provided (indicated as "-999"). If data had been assigned a Digital Object Identifier (DOI), this is included in DOI/others. When no DOI is available, the URL of the data source (either a database or a publication link) is included instead. Other URL sources or personal communication from the data source are keyed by the text variable "data_resource" in the "metadata" table, when available.
Most of the measurements were obtained from publications. In these cases, if the data were transcribed from tables, the table number is also given as "data localization". If data were only available graphically, a computer program was used to digitize the data from plots (WebPlotDigitizer, https://automeris.io/WebPlotDigitizer/) and the figure number is also given. In the rare cases in which data were not accessible through any of these procedures, the authors were directly contacted for data. For those cases, the author(s) contacted is (are) indicated in the "data localization" field.
Finally, a space for further links or information of interest is provided under the text variable "other DOI/resources". In the few cases in which the same data were reported in another publication (e.g., Murray et al., 1996, Dunne et al., 1997 and Murray et al., 2005 reported the same data from the U.S. JGOFS (Joint Global Ocean Flux Study) program during 1992 in the Equatorial Pacific), this was indicated under this text variable.

"Data" sheet
The "data" table is the core of the data set and contains all the data detailed above (more details in Table S2).
Each data point is accompanied by cruise and station IDs, location (latitude, longitude, depth, and bottom depth when available) and sampling date (in both month-date-year and Day of Year (DOY) formats). All 234 Th (total, dissolved and particulate) concentrations were converted to dpm L -1 (density = 1027 kg m -3) if not already reported in these units. POC: 234 Th and PON: 234 Th ratios are given in µmol dpm -1 and include samplings with sediment traps and filtration methods, in which we report i) bottles (Go-Flo and Niskin types), ii) filtration systems, iii) SPLITT (split flow-thin cell fractionation), and iv) (large or small volume) in-situ pumps, for either the entire particulate fraction or two size classes (preferably 1-53 µm and > 53 µm) when available. These size classes were chosen largely based on results of Bishop et al. (1977), Clegg and Whitfield (1990), and others, who assumed that the > 53 µm size class was responsible for most of the mass flux into traps. Other cut-offs of 51 µm or 70 µm and other size classes are found and are noted when different from 53 µm. The particle sampling method (categorized as "method 1" for filtration methods and "method 2" for sediment traps) and size fractions (categorized as "small" or "large") are specified as part of the data. Total, particulate, and sediment trap 234 Th sampling depths are indicated separately. When reported, water temperature, salinity, and 238 U measurements are included. Moreover, if accessible, POC and PON concentrations ("CHN" data in µmol L -1) are included.
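The unit harmonization and date handling described above can be sketched as follows. The seawater density is the value stated in the text. The choice of source units (dpm kg -1 and Bq m -3) is an assumption made for illustration, since the units used by each original study must be checked in its source, and the function names are hypothetical.

```python
from datetime import date

SEAWATER_DENSITY_KG_PER_L = 1.027  # 1027 kg m^-3, as stated in the text
DPM_PER_BQ = 60.0                  # 1 Bq = 1 decay per second = 60 decays per minute


def dpm_per_kg_to_dpm_per_l(activity_dpm_kg: float) -> float:
    """Convert a mass-based activity to a volume-based one using the fixed density."""
    return activity_dpm_kg * SEAWATER_DENSITY_KG_PER_L


def bq_per_m3_to_dpm_per_l(activity_bq_m3: float) -> float:
    """Convert Bq m^-3 to dpm L^-1 (1 m^3 = 1000 L)."""
    return activity_bq_m3 * DPM_PER_BQ / 1000.0


def day_of_year(sampling_date: date) -> int:
    """Day of Year (DOY), counting 1 January as day 1."""
    return sampling_date.timetuple().tm_yday


print(dpm_per_kg_to_dpm_per_l(2.0))
print(bq_per_m3_to_dpm_per_l(40.0))
print(day_of_year(date(1992, 3, 1)))
```

Using a single density for all samples is a simplification inherited from the text; users needing higher accuracy could recompute density from the temperature and salinity fields where these are available.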
In all cases we assume that the originating authors and editors have undertaken steps necessary to control data quality.
Measurement uncertainties in the data points are compiled as provided by the original authors (e.g., "uncert_total_ 234 Th").
Please refer to the original source for whether data uncertainty includes only the one sigma counting error, or other factors, such as uncertainty on volumes, detector efficiency, background, etc.

Data formats and availability
The data are archived on the PANGAEA repository, with the data set's DOI https://doi.pangaea.de/10.1594/PANGAEA.918125 (Ceballos-Romero et al., 2021). The data table is available for download either as a single merged file containing all data sets and metadata or as individual Excel files.
Moreover, the template followed to compile the data set (including the "metadata" and "data" tables), along with instructions to fill in this template, is made available in PANGAEA for any author who wants to review or complement a data set included in this compilation, or to contribute to its extension with a new data set. We strongly encourage authors to contact us to submit suggestions or requests to amend the data sets compiled.

Scope and introduction to the data set
In this article, we aim at providing a broad overview of the character of the data sets to be used for different purposes in future studies. We therefore provide several graphics to indicate the scope and nature of the data compiled. These include maps of sampling locations ( Figure 1a

General overview
In this section we give an overview of the compilation and some important aspects for its use. In Section 4 we present a historical review of the 50 years of 234 Th studies. In Section 5 we discuss steps towards improving the global data set, and in Section 6 we briefly suggest some future perspectives and applications for the data sets.
The sampling locations shown in Figure 1a are dominated by cruise tracks mostly in the Northern Hemisphere (NH) (Figure S1), although there are also a few locations in the Southern Hemisphere (SH) with significant repeated sampling, especially in the Southern Ocean (Figure 1b). Of the 379 cruises compiled, 294 took place exclusively in the NH (78%), only 53 in the SH (14%), and a total of 32 (8%) crossed the Equator and sampled locations in both hemispheres.
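A hemisphere breakdown of this kind can be derived from the minimum and maximum latitudes reported for each cruise in the metadata. The sketch below shows one possible classification rule and reproduces the percentages quoted above from the stated counts. The function name and the treatment of the Equator itself are assumptions.

```python
def hemisphere(min_lat: float, max_lat: float) -> str:
    """Classify a cruise from its latitude range (degrees north, negative = south)."""
    if min_lat >= 0:
        return "NH"
    if max_lat <= 0:
        return "SH"
    return "both"  # the cruise crossed the Equator


# Cruise counts stated in the text.
counts = {"NH": 294, "SH": 53, "both": 32}
total = sum(counts.values())
shares = {name: round(100 * n / total) for name, n in counts.items()}

print(total, shares)
print(hemisphere(10.5, 64.0), hemisphere(-60.0, -42.0), hemisphere(-5.0, 12.0))
```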
Many of these expeditions were carried out in the framework of larger research programs. For those cases, this information has been included as metadata in the compilation. During these 50 years of 234 Th data, we have selected a total of 13 ocean programs where 234 Th has been widely applied and that have provided invaluable knowledge to the 234 Th approach (see Table 2). Note that these programs have been chosen with the purpose of providing a wide global distribution and international representation of 234 Th from different institutions and organizations. A total of 68 (30%) of the data sets were collected during surveys as part of one of these 13 programs. Additionally, for cruises belonging to specific projects or experiments, this information has also been reported.
Sampling locations for 234 Th also include a number of long-term, high-frequency observations at fixed locations in the open ocean, referred to as (long-term) time-series stations (TSS) from now on. These stations offer a crucial observational strategy for capturing the dynamic temporal changes in ocean conditions and biogeochemical processes at seasonal and decadal scales. Moreover, they allow the investigation of short-term episodic events usually unaccounted for by oceanographic cruises. A great number of TSS that are either operational, registered (for future operation), inactive or closed are spread across the globe (source: https://www.ocean-ops.org). We have acknowledged the importance of sampling at these TSS by identifying matches with 234 Th sampling locations (see Figure 1c). Coincidences between locations with 234 Th data available and the position of a TSS have been reported in the metadata section with the label "time-series" in the "number of stations" field. Note that this should not be confused with repeated sampling of a location during a cruise, which has also been acknowledged with the label "reoccupied" in the same field. A total of 98 data sets (44%) reported samplings on at least one TSS location while only 33  (Figure 1c). More details of the history of these locations repeatedly sampled for 234 Th will be given in forthcoming sections.
Over time, the interdisciplinary scope of the studies and the number of techniques and parameters simultaneously assessed have gradually evolved. Field surveys have changed not only in terms of number but also in terms of strategy (e.g., duration, spatial resolution and repeated occupation of the same site). This has influenced not only the number of data points measured (with a sustained increasing trend) but also the type of 234 Th data reported (i.e., total, dissolved and particulate phases, and POC(PON): 234 Th ratios). Similarly, it has driven changes in the number of 234 Th studies published. We have identified 3 key indicators of these inflection points. In the 1990s the number of 234 Th studies dramatically increased after the publication of the first empirical routine method to estimate downward POC fluxes in the ocean by Buesseler et al. (1992). Another tipping point took place in the 2000s, likely due to significant improvements in the 234 Th methodology derived from the introduction of the small-volume technique (Benitez-Nelson et al., 2001b). Moreover, it is important to note that it was in the late 2000s that Yool et al. (2007) showed that other empirical methods, such as the 15 N new production techniques (f-ratio), largely overestimated the efficiency of carbon export. Finally, a later shift took place with the change in the way the ocean is currently explored and the expansion of applications to trace metals and isotopes introduced by GEOTRACES, whose first 234 Th-related publication was released in 2010 with the first version of the "cookbook" (sampling and sample-handling protocols) by Cutter et al. (2010). More details of these milestones and the consequences they brought to the 234 Th technique will be discussed in Section 4.
The overall temporal evolution of 234 Th measurements is depicted in Figure 2. Figure 2a shows the annual number of field expeditions including 234 Th measurements per sampling year, from the initial measurements in 1967 to 2018, separated by ocean. Figure 2b shows the equivalent figure for the annual oceanic distribution of 234 Th data points, and Figure 2c shows the histogram of 234 Th data points sampled since 1967, separated by data type. Additionally, the time distribution of annual 234 Th publications (in refereed journals, PhD theses or repositories) since the first reported publication in 1969, separated by ocean, is shown in Figure S2. There are several remarkable peaks in the annual record of 234 Th data points, which are related to dedicated carbon export programs and experiments that are detailed in the following sections. A delay of about 3-4 years relative to sampling is commonly observed for the publication of these 234 Th data and derived results, which started in 1969 (see Figure S2).
Namely, as shown in Figure 2b, peaks in the number of data points stand out in: i) 1992, when the JGOFS equatorial Pacific process study took place, published by several groups in 1995-1997 (Bacon et al., 1996; Buesseler et al., 1995; Dunne et al., 1997; Murray et al., 1996) and extended in 2001 with the publication of POC: 234 Th ratios from sediment trap results by Hernes et al. (2001); ii) 1997, when the JGOFS Southern Ocean study and the Arctic expedition ARK XIII/2 were carried out, published by Buesseler with VERTIGO voyages as the most productive ones in terms of data contribution, published by  and Lamborg et al. (2008b); the Shelf-Basin Interactions (SBI, https://arctic.cbl.umces.edu/sbi/web-content/) Phase II field program, published in 2007 by several authors Lepore et al., 2007 and; the EDDIES project, synthesized by Buesseler et al. (2008a); and the CROZEX project available in Morris et al. (2007); iv) 2010, when a total of 17 cruises took place, with the Southern Ocean cruises (Owens, 2013) as the most intense ones in terms of number of 234 Th samples per day and the several GEOTRACES section cruises: 2 cruises to the South Atlantic Ocean (GA10, UK) published by Le Moigne et al. (2014); 3 cruises to the Atlantic Ocean (GA03 U.S. and GA03 U.S. to the North Atlantic, GA02, Denmark to the South Atlantic) published by Owens et al. (2015); and 2 cruises in the Atlantic Ocean from 64°N to the equator (GA02, legs 1 and 2, Denmark) published by Puigcorbé et al. (2017a); and finally v) 2018, when 2 GEOTRACES cruises were carried out in the Pacific Ocean (GP 15 Leg 1, U.S., Pacific Meridional Transect, and GP 15 Leg 2, U.S.), unpublished until now and reported in this compilation by personal communication from J. Kenyon, and the first field expedition of the EXPORTS program took place (published by Buesseler et al. (2020)).

Total, particulate and dissolved 234 Th
Changes in the type of 234 Th data measured during field expeditions are depicted in Figure 2c. While within the first years of the 234 Th technique authors focused on the total, dissolved and particulate phases, mainly for the study of the scavenging of 234 Th and the partitioning of Th species between phases (see e.g., Bacon and Anderson, 1982; McKee et al., 1986; Murray et al., 1989; Nozaki et al., 1981), in the 1990s the broader potential of 234 Th and its application as a tracer for particle dynamics and export fluxes became evident, which drove the inclusion of sampling strategies for the determination of POC(PON): 234 Th ratios. Given the small annual number of data points in the 1970s and 1980s in comparison to subsequent years, it is difficult to perceive the contribution of each data type to the total number of data points within the first years of study. A detailed review of the evolution of the 234 Th data available through the years in the framework of the mentioned milestones is given in Section 4.
In summary, total 234 Th is almost always reported by all authors, either by direct measurement or determined as the sum of the dissolved and particulate phases. It is worth mentioning that the distinction between dissolved and particulate is operational.
Dissolved and particulate species have traditionally been discriminated by filter sizes, typically of 0.2-1 µm pore size (Moran and Buesseler, 1993), with the submicron colloidal matter (nanometer to submicrometer size range, ~0.001-1 µm; Stumm, 1977) included in the dissolved phase. Note that colloidal size ranges have not been included in the compilation but are indicated in the metadata section when available. The 0.7 µm cut-off for the dissolved-particulate fractions is very frequent since it is the nominal pore size of the Whatman glass microfiber filters (so-called GF/F) that have historically been used to collect particulate 234 Th. Nonetheless, the use of 1 µm pore size quartz microfiber (QMA) filters has been more common since the early 2000s due to their lower radioactivity blank with direct beta counting (e.g., Buesseler et al., 2001a). A general view of the 234 Th particulate data compiled is provided in Figure S3, which shows the locations with 234 Th concentrations sampled on either small (generally from 0.7-1 to 53-70 µm), large (>53 or >70 µm), or both size fractions. Overall, 75% of the studies compiled reported particulate 234 Th concentrations, and more than half of them (56%) reported data in 2 size fractions.
In a small number of studies, 234 Th concentrations were determined as a single vertically integrated sample, normally between 0 and 100 m depth (e.g., Buesseler et al., 1994, 1995, 1998; Charette and Buesseler, 2000; Cochran et al., 2000; Evangeliou et al., 2011; Hall et al., 2000; Ma et al., 2005; Maiti et al., 2008; Schmidt et al., 2002). This sampling approach was deliberately chosen to reduce sample numbers yet maximize the number of locations sampled, at a time when at-sea analyses were more difficult. These samples are indicated in both the metadata and data sections of each individual spreadsheet with the code number "-555" in the depth column (see Table S1 and Table S2).
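As a hedged illustration of how the "-555" depth code could be handled when reading the compilation, the following minimal sketch separates vertically integrated samples from discrete-depth samples. The record layout, the key names (`station`, `depth_m`, `total_th234_dpm_per_L`) and the example values are hypothetical and are not taken from the spreadsheets; only the code number -555 comes from the text above.

```python
# Code number used in the depth column for vertically integrated samples.
INTEGRATED_DEPTH_CODE = -555


def split_integrated(rows, depth_key="depth_m"):
    """Split records into (discrete-depth, vertically integrated) lists.

    `rows` is a list of dicts; `depth_key` is a hypothetical column name.
    """
    discrete = [r for r in rows if r[depth_key] != INTEGRATED_DEPTH_CODE]
    integrated = [r for r in rows if r[depth_key] == INTEGRATED_DEPTH_CODE]
    return discrete, integrated


# Hypothetical example records (values are illustrative only).
samples = [
    {"station": "A", "depth_m": 10, "total_th234_dpm_per_L": 1.8},
    {"station": "A", "depth_m": 50, "total_th234_dpm_per_L": 2.3},
    {"station": "B", "depth_m": -555, "total_th234_dpm_per_L": 2.0},
]

discrete, integrated = split_integrated(samples)
print(len(discrete), "discrete samples;", len(integrated), "integrated samples")
```

Filtering on the code before any depth-based calculation avoids treating -555 as a physical depth, for instance when integrating a profile.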
A total of 66% of the studies reported POC: 234 Th ratios measured with some of the filtration methods (described in detail in Section 2.1.2), while only ~16% reported PON: 234 Th ratios. In ~40% of the studies compiled, POC(PON): 234 Th ratios collected with sediment traps were reported from a great variety of trap designs: Indented Rotating Sphere (IRS), surface-tethered, free-floating, bottom-moored, automated, cylindrical, VERTEX-style, U-type, CLAP-type, RESPIRE-type, Free-drifting Lagrangian Sediment Traps (LST), High Frequency Flux (HFF), standard drifting particle interceptor traps (PITS) and, more recently, neutrally buoyant sediment traps (NBSTs) and Particle Export measurement using a LAGRAngian (PELAGRA) trap. An overview of the POC: 234 Th data compiled is shown in Figure S4.
Note that a specific, more detailed compilation of global POC: 234 Th ratios in the ocean has recently been published by Puigcorbé et al. (2020) (archived in the data repository PANGAEA ® under doi.org/10.1594/PANGAEA.911424) with the purpose of elucidating the spatial, temporal and depth variations of this crucial parameter. The authors present a database of 9318 measurements sampled on 3 size fractions (~> 0.7 µm, ~ 1-50 µm, ~> 50 µm) collected with in situ pumps, bottles, and sinking particles collected in sediment traps from the surface down to > 5500 m. Our compilation includes the studies gathered in Puigcorbé et al. (2020) and expands the data set to a total of 10851 POC: 234 Th ratios.

238 U measurements
Finally, in order to allow the assessment of POC fluxes, 238 U concentrations have also been included. These are generally reported by authors either by direct measurement or, most commonly, derived from salinity data by applying one of the available 238 U-salinity relationships. From the studies compiled, we have identified a total of 11 salinity-238 U relationships published in journal articles, chronologically ordered as follows: i) Ku et al. (1977); ii) Broecker and Peng (1982); iii) Coale and Bruland (1985); iv) Chen et al. (1986); v) Andersson et al. (1995)

Discussion: 234 Th timeline
The compilation covers a temporal range of 5 decades. Oceanic 234 Th measurements begin in 1969, with a study of its distribution in the hydrologic cycle (Bhat et al., 1969), and extend to the present day, when 234 Th has become an indispensable tool in oceanographic expeditions. We consider that it was not the passing of years but rather the publication of seminal studies that delineates the progression of 234 Th studies. For this reason, we have divided the history of the 234 Th technique into four well-distinguished eras, marked by four publications, summarized in

The first era within the 234 Th timeline was initiated by the study of Bhat et al. (1969), in which a total of 6 profiles of total 234 Th: 238 U ratios at depths down to 250 m were collected on board the U.S.C. and G.S.S. Oceanographer during 1967 at several sampling sites in the Indian Ocean to study scavenging processes. Polyvinylchloride (PVC) tube samplers of 30 L capacity designed by Shale J. Niskin (Niskin, 1962) were used.
It was in the framework of the GEOSECS (GEochemical Ocean SECtions Study) project (http://iridl.ldeo.columbia.edu/SOURCES/.GEOSECS/) that the concept of scavenging by particles really emerged, especially for the open ocean. The initiation of global ocean chemistry, hydrographic, and tracer survey efforts took place within this era, especially with GEOSECS, which was fundamentally different in style and scale from anything before. This project provided the first comprehensive data set for the distribution of chemical species in the world ocean between 1972 and 1978 (Moore, 1984). In 1977, the seminal work of Turekian introduced the concept of the "great particle conspiracy" to denote the fact that dissolved elements are, to a certain extent, particle reactive and eventually sink in the ocean. All this contributed to a further increase in 234 Th research and pointed towards the second era of the 234 Th technique.
The pioneering study by Bhat et al. (1969) was expanded by many other investigators (e.g., Kaufman et al., 1981; Knauss et al., 1978; Lee et al., 1991; Matsumoto, 1975; Tanaka et al., 1983; Tsunogai et al., 1986). All these studies analyzed total 234 Th (both particulate and soluble forms), giving different interpretations of the vertical distribution of total 234 Th. Bhat et al. (1969) assumed that essentially all 234 Th in seawater was in particulate form, whereas Matsumoto (1975) assumed particulate 234 Th to be an insignificant part of the total 234 Th. As a result, the scavenging models applied to the surface euphotic layer were not very informative until the sampling of 234 Th profiles in both the dissolved and particulate phases was introduced. The analysis of particulate 234 Th was incorporated by Krishnaswami et al. (1976) during a survey in the Pacific Ocean. Several authors combined total and particulate 234 Th concentrations (e.g., Minagawa and Tsunogai, 1980; Santschi et al., 1979), while the sampling of the dissolved 234 Th phase was initiated by McKee et al. (1984), who collected samples in coastal environments near a major sediment source (Yangtze River). From 1984 on, many studies sampled both dissolved and particulate phases (see e.g., Coale and Bruland, 1985; Huh and Beasley, 1987; Wei and Murray, 1991).

Initially in this era, little attention was given to 234 Th in comparison to 228 Th and 230 Th for studying scavenging processes in the open ocean (Broecker and Peng, 1982). It was only later, with the influential study of Coale and Bruland (1985), that the role of 234 Th as a tracer of short-term particle dynamics was really highlighted. In this study, several profiles of 234 Th in both dissolved and particulate form were presented and used to elucidate the partitioning of 234 Th between these phases. This allowed the authors to demonstrate that this radioisotope is an ideal particle-reactive tracer for studying the scavenging of thorium from surface waters.
In fact, a series of papers by Coale and Bruland in the mid-1980s (Bruland and Coale, 1986; Coale and Bruland, 1985, 1987) were key to establishing the baseline for future studies that would use 234 Th as a proxy for POC fluxes in the next era.
The authors discovered that the 238 U-234 Th disequilibrium is a direct tracer of the rates of sinking particles from the upper ocean, which led to the acknowledgment of the relevance of the downward 234 Th flux. Previously, in a one-year time-series study, Tanaka et al. (1983) found large variations in the total 234 Th concentrations, with the minimum in 234 Th inventory coinciding with the early spring bloom, which they proposed was related to biological activity in the surface waters. It was within this context that Eppley et al. (1989) proposed that, if 234 Th is scavenged by biogenic particles, 234 Th could be used as a tracer of export production.
A series of milestones have been identified as the most remarkable of this era, summarized as follows:
• 1969: initial measurement of total 234 Th and introduction of the co-precipitation of 234 Th with Fe(OH)3 (Bhat et al., 1969).
• 1977: introduction of the concept of the "great particle conspiracy" by Turekian (1977).
- introduction of the MnO2-impregnated filter cartridges technique (Mann et al., 1984), used with in-situ pumps.
• 1985: initial analysis of particulate and dissolved 234 Th in the open ocean, and link between biological processes and 234 Th deficits clearly demonstrated (Bruland and Coale, 1986; Coale and Bruland, 1985, 1987).
• 1989: 234 Th proposed to trace export production (Eppley et al., 1989).

Two distinct radioanalytical methods for the extraction and purification of 234 Th from water samples were initiated during this era: i) the co-precipitation of 234 Th with Fe(OH)3, proposed by Bhat et al. (1969), and ii) the scavenging of this nuclide onto MnO2 cartridges, introduced by Mann et al. (1984). A thorough review of these techniques can be found in Rutgers van der Loeff et al. (2006). Briefly, for the Fe(OH)3 technique, 20-30 L seawater samples are treated and beta-counted. The addition of Fe carrier forms a precipitate that removes Th (and other radionuclides) from solution, so ion-exchange purification procedures are required (at sea or quickly after return to shore) to separate 234 Th from its parent and other potential beta emitters. For the MnO2 cartridge technique, seawater is sequentially pumped through filters and two MnO2-impregnated cartridges connected in series to scavenge dissolved Th isotopes. This technique was often used with large-volume samples (10^2-10^4 L), needed primarily for 228 Th, 230 Th and 232 Th analyses, which required large amounts of ship time given the use of in-situ pumps for filtration, limiting the spatial coverage of the 234 Th profiles. MnO2 cartridges do not adsorb appreciable 238 U, which is another advantage, as ingrowth from 238 U after sampling can be neglected. The large samples represented by a MnO2 cartridge also allowed for direct gamma counting, thus eliminating the need for laborious radiochemical purification.

A wide variety of methods were used to sample the different 234 Th phases within era 1. For the total phase, PVC tube and Van Dorn samplers (i.e., horizontal water bottles) were the prevalent equipment (Bhat et al., 1969; Matsumoto, 1975; McKee et al., 1984; Minagawa and Tsunogai, 1980, among others), although a few studies used pumping systems (Lee et al., 1991; Tsunogai et al., 1986) and bottles (Bacon and Rutgers van der Loeff, 1989). In the majority of the studies measuring dissolved and particulate 234 Th phases, the total 234 Th concentration was estimated as their sum (see e.g., Coale and Bruland, 1985; Murray et al., 1989).
Regarding particulate 234 Th analyses, most of the studies used filtration systems on volumes between 30 and 700 L, a few studies used bottles (with the Go-Flo model more typical than the Niskin one), and one study introduced the use of pumps (Bacon and Rutgers van der Loeff, 1989). Note that the distinction between the dissolved and particulate phases is generally operational, with the term "dissolved" usually comprising all the phases passing through a pore size cut-off of 0.45 µm (see e.g., Bacon and Rutgers van der Loeff, 1989; Dominik et al., 1989; Rutgers van der Loeff and Berger, 1991; Santschi et al., 1979; Schmidt et al., 1990; Wei and Murray, 1991).
A total of 25 published works in refereed journals comprise the 234 Th studies of this era, compiled in a total of 24 datasets.
Sampling was characterized by cruises mostly in the Pacific Ocean (see Figure 2a and Figure 4a), with a small number of locations sampled (maximum of 29 locations and averaged value of 6 locations per cruise) and a small number of data points per study (rarely over 100, averaged value of 60 samples per cruise). A total of 36 cruises were compiled from this era, with a total of 1493 234 Th data points. Most surveys took place solely in the NH (29) between April and September (Figure 5a). Only 1 expedition surveyed the SH, but a total of 6 cruises crossed the Equator and sampled in both hemispheres.
Only 2 studies reported sampling that was part of a selected ocean program, namely the GEOSECS program (Krishnaswami et al., 1976) and the DYFAMED program (Schmidt et al., 1990) (see Table 2 for a summary). Another study took place within the framework of the Joint Chinese-American Field Program (JCAFP) (McKee et al., 1984). A total of 9 studies reported sampling at time-series stations (see e.g., Tanaka et al., 1983).
During this era, 234 Th studies were commonly focused on analyzing the parameters influencing the partitioning of a species between dissolved and particulate phases as a way to understand the mechanisms and rates of the scavenging of particle-reactive species. One way to quantify the scavenging of particle-reactive species is the use of distribution coefficients (Kd), which measure the partitioning of a species between dissolved and particulate phases (McKee et al., 1986). As the knowledge of scavenging increased, novel applications of 234 Th were developed, reducing the interest in the dissolved phase of 234 Th over time and driving changes in the 234 Th data type collected during field expeditions (see next sections).
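To make the notion of a distribution coefficient concrete, the sketch below uses the conventional definition, Kd = (particulate activity / dissolved activity) / suspended particle concentration. This formula is the standard textbook form and is an assumption here, since the text does not spell it out; the function name, argument names and example values are hypothetical.

```python
def distribution_coefficient(particulate_dpm_per_L, dissolved_dpm_per_L,
                             particle_conc_kg_per_L):
    """Return Kd in L kg^-1 (conventional definition, assumed here).

    particulate_dpm_per_L  : particulate 234Th activity per litre of seawater
    dissolved_dpm_per_L    : dissolved 234Th activity per litre of seawater
    particle_conc_kg_per_L : suspended particle concentration
    """
    if dissolved_dpm_per_L <= 0 or particle_conc_kg_per_L <= 0:
        raise ValueError("dissolved activity and particle concentration must be > 0")
    # Activity per kg of particles divided by activity per litre of solution.
    return (particulate_dpm_per_L / dissolved_dpm_per_L) / particle_conc_kg_per_L


# Illustrative values only: 0.4 dpm/L particulate, 1.6 dpm/L dissolved,
# 0.5 mg/L of suspended particles (5.0e-7 kg/L).
kd = distribution_coefficient(0.4, 1.6, 5.0e-7)
print(f"Kd = {kd:.3g} L/kg")
```

Because Kd depends on the operational dissolved/particulate cut-off, values computed from data sets filtered at different pore sizes are not directly comparable.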
The majority of the studies reported 238 U concentrations along with at least one 234 Th phase concentration (684 238 U data points reported in this era), but none of them measured POC: 234 Th (Figure 6a), as the importance of this parameter was not recognized until era 2 (see Section 4.2). 238 U was either measured (a total of 6 studies) or derived from salinity (a total of 10 studies), mostly using the relationship from Ku et al. (1977). It is also worth mentioning that very few studies reported simultaneous sampling of complementary non-thorium measurements, such as sediment traps (a total of 6 studies, see e.g., Coale and Bruland, 1987; Tsunogai et al., 1986) or 210 Pb-210 Po disequilibrium (5 studies, see Krishnaswami et al., 1976; Moran and Moore, 1989; Santschi et al., 1979, 1980; Tanaka et al., 1983). None of the studies reported CHN (i.e., POC and/or PON) data.
Finally, in terms of modeling 234 Th data, more than half of the studies (56%, a total of 14) did so, and most of them used a two-box model, following Coale and Bruland (1985). Except for Tanaka et al. (1983), who applied both a SS and a NSS model to estimate the residence time of 234 Th, all the studies that provided information in this regard assumed SS conditions during sampling, which is not surprising since stations were not usually reoccupied during cruises for the collection of time-series data. The advection term was neglected in most cases.

of 238 U in the ocean, any measurable deficit of 234 Th relative to its parent can be assumed to imply a significant removal by scavenging and particle sinking flux over a period of days to weeks before sampling, as the mean residence time of 234 Th is dictated by its decay constant and removal rate by particles (Coale and Bruland, 1985). Buesseler et al. (1992) proposed the following expression:

$$\mathrm{POC\ flux} = \left(\mathrm{POC}:{}^{234}\mathrm{Th}\right)_{z} \times {}^{234}\mathrm{Th\ flux}_{z} \qquad (1)$$

where POC: 234 Th is the ratio of POC to 234 Th measured on sinking particles at the desired depth z (in µmol dpm^-1), and 234 Th flux is the downward 234 Th flux measured at the same depth (in dpm m^-2 d^-1).

Eq. (1) is the basis of the so-called 234 Th method, and the POC flux obtained is referred to as the 234 Th-derived POC flux. Note that the concept for this empirical method was introduced in a conference abstract of a much earlier study in the North Pacific (Tsunogai et al., 1976). Buesseler et al. (1992) proposed that the sinking flux of any element, such as carbon, phosphorus or nitrogen, could be derived from the 234 Th flux if the ratio of this element to 234 Th on sinking particles is known. Element: 234 Th ratios are directly determined from in-situ measurements and vary with both depth and particle size (Buesseler et al., 2006).
The 234 Th flux can be calculated by evaluating the change in the corresponding total 234 Th concentration with time and the contributions due to horizontal and vertical advection and diffusion processes. The simplest way to estimate the 234 Th flux is to assume SS conditions and ignore advective and diffusive transport. These assumptions are the most commonly used (e.g., Le Moigne et al., 2013b and references therein), as generally only a single 234 Th profile can be measured. The neglect of the physical term became inadequate with the expansion of 234 Th research to coastal and more dynamic regimes. In the open ocean, the most relevant physical process is vertical upwelling, which typically results in underestimations of 234 Th export if it is not included (Buesseler et al., 1995). Furthermore, Dunne and Murray (1999) developed a model to estimate vertical advection and horizontal diffusion, concluding that vertical advection might be overestimated if horizontal advection is not considered. Alternatively, a NSS model can be applied when temporal fluctuations in 234 Th concentration can be assessed owing to repeated sampling, ideally over the course of 2 to 4 weeks (Resplandy et al., 2012), and only if the same water mass is sampled (i.e., the NSS approach is difficult to apply in dynamic settings).
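The steady-state case without physical transport can be sketched in a few lines. Under those assumptions, the 234 Th flux at the base of the profile equals the decay constant times the depth-integrated 238 U-234 Th deficit, and the POC flux follows by multiplying by the POC: 234 Th ratio at that depth. This is a minimal sketch, not the processing used for any data set in the compilation: the trapezoidal integration, the half-life of 24.1 d, the function names and the example profile are all assumptions made for illustration.

```python
import math

TH234_HALF_LIFE_DAYS = 24.1  # assumed half-life of 234Th
DECAY_CONSTANT = math.log(2) / TH234_HALF_LIFE_DAYS  # d^-1


def th234_flux_steady_state(depths_m, th234_dpm_per_L, u238_dpm_per_L):
    """234Th flux (dpm m^-2 d^-1) at the deepest sampled depth.

    Assumes steady state and no advection or diffusion; integrates the
    238U - 234Th deficit over depth with the trapezoidal rule.
    """
    if not (len(depths_m) == len(th234_dpm_per_L) == len(u238_dpm_per_L)):
        raise ValueError("all profiles must have the same length")
    deficit_integral = 0.0  # units: dpm L^-1 * m
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        upper = u238_dpm_per_L[i - 1] - th234_dpm_per_L[i - 1]
        lower = u238_dpm_per_L[i] - th234_dpm_per_L[i]
        deficit_integral += 0.5 * (upper + lower) * dz
    # 1 m^3 = 1000 L converts dpm L^-1 * m into dpm m^-2.
    return DECAY_CONSTANT * deficit_integral * 1000.0


def poc_flux(th234_flux_dpm_m2_d, poc_to_th234_umol_per_dpm):
    """POC flux (umol m^-2 d^-1) as the product of flux and ratio."""
    return th234_flux_dpm_m2_d * poc_to_th234_umol_per_dpm


# Hypothetical profile: depths in m, activities in dpm L^-1.
depths = [0.0, 50.0, 100.0]
th234 = [1.5, 2.0, 2.4]
u238 = [2.4, 2.4, 2.4]

flux = th234_flux_steady_state(depths, th234, u238)
print(f"234Th flux: {flux:.0f} dpm m^-2 d^-1")
print(f"POC flux:   {poc_flux(flux, 5.0):.0f} umol m^-2 d^-1")
```

A NSS estimate would add a term for the change in 234 Th inventory between two occupations of the same water mass, and upwelling regions would require the advective terms that this sketch omits.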
The study by Buesseler and co-authors motivated an increase in oceanic 234 Th measurements in the 1990s, and we consider it one of the main milestones of this era along with the following ones:
• 1992: introduction of the empirical method for the POC flux estimate from 234 Th concentrations, proposed by Buesseler et al. (1992).
• 1992: studies of the role of colloidal material (i.e., ~0.001 < colloids < 1 µm; Stumm, 1977) in 234 Th scavenging (Baskaran et al., 1992; Moran and Buesseler, 1993).
• 1995:
- introduction of the use of the 210 Pb-210 Po pair in a similar manner to the 238 U-234 Th pair by Shimmield et al. (1995);
- development of a regional 3-D 234 Th flux model to estimate the physical components of the flux (often referred to as V terms) by Buesseler et al. (1995). Without these components, the sinking flux would have been largely underestimated in the equatorial upwelling region.

This technique allowed the particulate and dissolved concentrations of 234 Th to be analyzed in a single aliquot, thus enabling the calculation of the total 234 Th concentration (particulate plus dissolved) as well as its residence time in the two phases. Moreover, it enabled on-board beta counting of 234 Th with a portable beta counter, which led to higher temporal and spatial resolution.
A total of 43 data sets comprise this era, extracted from a total of 48 publications. Sampling was focused on the Atlantic Ocean (see Figure 2a and Figure 4b). Surveys in the Pacific Ocean were limited, while sampling in the Southern Ocean increased and field expeditions to the Arctic were conducted. Both the number of expeditions and the locations sampled increased. Despite its short duration (less than a decade), a total of 70 cruises were reported within this era, indicative of the dedicated programs and experiments that marked it, resulting in 8739 and 1133 data points for 234 Th and 238 U respectively. Similar to the first era, cruises mainly took place exclusively in the NH (52 cruises), although expeditions solely to the SH and to both hemispheres increased, with a total of 7 and 11 cruises respectively. Sampling mainly took place in the first half of the year (Figure 5b).

As previously mentioned, this period of the 234 Th history partially overlapped with the golden years of the JGOFS program, and therefore many of the studies published during era 2 reported 234 Th measurements collected during field expeditions that took place in the frame of this international program. A summary of these activities is provided in Table 2. In the case of U.S. JGOFS, this included 234 Th measurements at the BATS TSS (see e.g., Buesseler et al., 1994, 2000) and during several process studies in well-defined areas at strategic oceanic locations: 1) NABE (North Atlantic Bloom Experiment, http://usjgofs.whoi.edu/research/nabe.html), which was one of the first major activities of JGOFS, with 3 cruises along longitude 20°W in 1989. It was published 3 years later by Buesseler et al. (1992); 2) the EqPac (Equatorial Pacific, http://usjgofs.whoi.edu/research/eqpac.html) process study, which was conducted along 140°W during 1992 and included a total of 4 cruises. It was published between 1994 and 1997 by Bacon et al. (1996), Buesseler et al. (1995), Dunne et al. (1997) and Murray et al. (1996), and extended in the next era by Hernes et al. (2001) and Murray et al. (2005); 3) the Southern Ocean expedition during October/November 1992, published by Rutgers van der Loeff et al. (1997) and expanded in era 3 by Friedrich and Rutgers van der Loeff (2002); 4) the Arabian Sea process study (http://usjgofs.whoi.edu/research/arabian.html), which began in October 1994, ended in January 1996, and was reported a few years later by Buesseler et al. (1998); 5) AESOPS (Antarctic Environment and Southern Ocean Process Study, http://usjgofs.whoi.edu/research/aesops.html), which carried out field work between August 1996 and April 1998 and was published within this era by Cochran et al. (2000) and extended in the next era by Buesseler et al. (2001b). Additionally, studies also reported data from other JGOFS expeditions from the: 6) Indian program, which completed three major sampling expeditions for 234 Th to the eastern and central Arabian Sea in April-May 1994, February-March 1995 and July-August 1995. It was reported by Sarin et al. (1996) and extended in PANGAEA in 2013 (https://doi.org/10.1594/PANGAEA.807500); 7) Canadian program in the northeast Pacific Ocean, which had two phases, from 1992 to 1994 and from 1995 to 1997, although 234 Th was only sampled during the second phase. Data were published by ; and 8) the French DYFAMED program, sampled in 1987 but published in this era by Schmidt et al. (1992).
Finally, field work also included 9) the Southern Ocean Iron RElease Experiment (SOIREE, https://www.bcodmo.org/project/2051), which was the first in situ iron fertilization experiment performed in the polar waters of the Southern Ocean. It took place in February 1999 south of the Polar Front in the Australian-Pacific sector of the Southern Ocean and was reported in Charette and Buesseler (2000). Data from this iron enrichment experiment were compiled along with others in a common open-access database during the Iron Synthesis program (FeSynth, https://www.bco-dmo.org/program/2017), started in 2007.
In addition to JGOFS, another major initiative that included 234 Th sampling was the Ocean Margins Program (OMEX, http://po.msrc.sunysb.edu/omp/), a large field-based study with extensive physical, chemical, biological and geological measurements, carried out on the northwest European shelf break in two phases, from 1993 to 1996 and from 1997 to 2000. 234 Th data were made available by Hall et al. (2000).
The number of data points measured per cruise significantly increased within this era, with an average of 125 234 Th data points reported per cruise in comparison to the average of 41 data points in era 1. The increase was particularly marked for the dissolved and particulate phases, whose measurements increased more than 5-fold relative to those from era 1 (see Figure 6b).
Additionally, measurements to determine POC and PON to 234 Th ratios became routine, therefore allowing the estimate of POC fluxes (see Figure 4b). However, only half of the studies reported 238 U concentrations along with 234 Th data, most of them using a variety of 238 U-salinity relationships, with Chen et al. (1986) as the prevalent one. Overall, 69% of the studies (33 out of the 48 that comprise this era) modeled 234 Th data. Time-series data were collected more often than during the first era, likely with the purpose of following the NSS approach (e.g., Buesseler et al., 1995). Nonetheless, the majority of the studies (70%) that modeled 234 Th data assumed SS conditions (see e.g., Bacon and Rutgers van der Loeff, 1989; Baskaran et al., 1992; Gustafsson et al., 1997b). A total of 10 studies applied the NSS model, with 7 of them combining it with the SS model (see e.g., Buesseler et al., 1994; Cochran et al., 2000; Kersten et al., 1998 for the combined use of both approaches). In terms of additional data, 1 study introduced underway 234 Th sampling (Hall et al., 2000), a total of 11 studies reported sampling with sediment traps (see e.g., Cochran et al., 2000; Murray et al., 1996; Smoak et al., 1999), 10 studies sampled for 210 Pb-210 Po disequilibrium (e.g., Moran and Moore, 1989; Santschi et al., 1999; Wei and Murray, 1992), and 13 studies reported CHN data (see e.g., Niven et al., 1995; Rutgers van der Loeff et al., 1997; Santschi et al., 1999, among others).

2001-2009: Improvements of the 234 Th technique.
"At sea, things appear different." -Nathaniel Philbrick, In the heart of the Sea.
The final boost to the widespread and increasing use of 234 Th as a particle flux tracer was motivated by the works of Buesseler et al. (2001) and , which dramatically increased the number of 234 Th measurements and marked the third era with the introduction of the small-volume (2-4 L) technique. This procedure modified the 20-L method developed by Rutgers van der Loeff and Moore (1999) in era 2 and uses the lowest sample volumes of all known 234 Th methods to date.
It not only allowed immediate on-board beta counting of 234 Th and avoided some tedious folding sessions, but also enhanced both the spatial and temporal resolution of particle export along with other biogeochemical parameters.
The revolution that this novel technique brought with it was substantial. The small-volume technique is essential to capture particle dynamics and export flux variations on scales that can be better related to local biogeochemical conditions. The main advantage is the convenience of handling small volumes and the more rapid processing times. Multiple sampling casts are often required for 20-L sample profiles, whereas the 4-L technique usually allows simultaneous sampling of 234 Th with other parameters on a single cast (e.g., nutrients, phytoplankton biomass, etc.). As a result, this method can be easily applied at sea using samples obtained through CTD rosette water samplers that are available on most research vessels. For that reason, we have chosen the introduction of the small-volume technique as a shifting milestone in the 234 Th timeline.

• 2006: reduction of the filtration time and introduction of the alpha spectrometric measurement of 230 Th recovery by means of a combination of water bath heating and a reduction in reagent quantities (Cai et al., 2006). This also modified the typical ion-exchange chemistry to allow for the alpha spectrometric measurement of 230 Th recovery.

Pump System) by Challenger Oceanic (Surrey, U.K.). Moreover, one study reported the use of a novel split flow-thin cell fractionation (SPLITT) (see Gustafsson et al., 2006).

This has been the golden age of 234 Th so far in terms of cruises carried out per year, with a total of 159 cruises reported in a period of 9 years. A total of 19676 and 4567 data points for 234 Th and 238 U respectively (see Figure 4c and Figure 6c) were compiled in 76 datasets, extracted from 87 studies (81 publications in refereed journals, 1 PhD thesis, and 5 repositories).
Sampling spanned the entire ocean and intensified in all regions, except the Indian Ocean (see Figure 2a and Figure 4c).
Expeditions to the SH increased, with 18 reported, but were still far below those in the NH, where a total of 134 cruises were undertaken (see Figure 5c). Additionally, a total of 7 surveys collected samples in both hemispheres. Stations included the long-term observatories previously sampled for 234 Th along with novel ones, such as the PAP site (Turnewitsch and Springer, 2001). Fieldwork took place throughout the entire year, with a remarkable number of cruises in March. Both the overall number of data points and the data points measured per year increased due to the large number of field expeditions carried out during this era and the widespread use of the small-volume technique. An average of 2183 234 Th data points was reported per year, in comparison to an average of 60 and 971 data points during eras 1 and 2 respectively. As for the 234 Th sample types in this era, the total number of measurements significantly increased for all phases (2-fold for the dissolved and particulate, and 4-fold for the total) and for POC: 234 Th ratios, and remained constant for PON: 234 Th (see Figure 6c).
A great increase in the number of data points also was detected for 238 U activities, with an increase of 4-fold in the data points.

2010-present: GEOTRACES program and a new way to study the ocean.
"The sea is everything. It covers seven tenths of the terrestrial globe. Its breath is pure and healthy. It is an immense desert, where man is never lonely, for he feels life stirring on all sides. The sea is only the embodiment of a supernatural and wonderful existence." -Jules Verne, Twenty thousand leagues under the sea.
The beginning of this era is identified with the first publication of 234 Th data from a GEOTRACES cruise, reported by Cai et al. (2010). GEOTRACES is an international study of the marine biogeochemical cycles of trace elements and their isotopes, which changed the way the oceans are explored by combining ocean sections, process studies, data synthesis, and modeling. Its launch marked the beginning of internationally dedicated large-scale collaborative projects characterized by long field surveys with very high spatial resolution and many different parameters measured simultaneously. For this reason, the initial 234 Th-related GEOTRACES publication was chosen as the milestone that starts the (so far) final era in the 234 Th timeline.

The GEOTRACES program brought a new philosophy that drove a shift from mostly deriving POC fluxes only to including trace metal fluxes (see e.g., Black et al., 2018), and the implementation of standards and intercalibration initiatives to establish procedures and protocols for sampling at sea, to ensure that samples are collected, handled, and stored without contamination or other sources of bias (see e.g., the versions of the cookbook "Sampling and sample-handling protocols for GEOTRACES Cruises" by the GEOTRACES Standards and Intercalibration Committee; Cutter et al., 2014, 2017). This era includes the development of new technologies to accelerate the collection and analysis of samples, the intercalibration of those technologies to ensure internal consistency among the participating labs, the development of a data management system to facilitate access to the results for the entire oceanographic community, and a broad collaborative effort to model, synthesize, and interpret the results.
The most remarkable milestones for this era are:
• 2010:
- initial publication of 234 Th data from a GEOTRACES cruise by Cai et al. (2010), which reported data from the ARK-XXII/2 expedition to the Arctic Ocean in 2007, carried out in the context of the IPY-GEOTRACES program (GIPY11, Germany);
- intercalibration initiative by Cutter et al. (2010) to ensure that 234 Th results produced by different groups were comparable and internally consistent, using deep waters or stored samples as standards where 234 Th and 238 U are known to be in secular equilibrium.
• 2012: intercalibration initiative by Maiti et al. (2012), which carried out an intercomparison of 234 Th measurements in both water and particulate samples between 15 laboratories worldwide.
• 2021:
- improvement of the small-volume technique by Clevenger et al. (2021), who introduced a revised protocol that decreases sample volumes to 2 L, shortens wait times between steps, and simplifies the chemical recovery process, expanding the ability to apply the 234 Th method more rapidly and safely;
- release of the 3rd GEOTRACES IDP (GEOTRACES Intermediate Data Product Group, 2021).

A total of 80 data sets were compiled from this era, including 9 previously unpublished ones, extracted from a total of 82 studies (65 publications in refereed journals, 4 PhD theses, and 4 data repositories). A total of 114 cruises and 26755 and 11872 data points for 234 Th and 238 U respectively were compiled, mostly distributed between the Atlantic and Pacific oceans and, to a lesser extent, the Southern Ocean (see Figure 2a). This era is the most intense in terms of sampling, with an average of 235 234 Th measurements reported per cruise and over 2400 234 Th data points reported per year. Once again, the NH dominated the surveys (79 cruises sampled exclusively above the Equator), which were particularly numerous in early spring and fall (see Figure S1). Nonetheless, this era reports the highest number of cruises to the SH, with a total of 27 sampling exclusively below the Equator and an additional 8 cruises sampling in both hemispheres.
More than half of the expeditions of this era reported data from cruises that took place in the framework of major ocean programs (see Table 2) and some minor projects, such as the i) Arctic Ocean 2001 (AO-01) expedition, reported by [...]. Furthermore, 46% of the data sets included 234 Th data from an established time-series location (see Figure 1c).
An interesting characteristic of this period is that a relatively smaller number of cruises produced a greater amount of data in comparison to previous eras. The most relevant peaks are found in 2013 and 2015, which coincide with the publication of several GEOTRACES transects (see Section 3). Dissolved 234 Th measurements decreased drastically (4-fold), while the rest of the measurement types increased to some extent, most notably total 234 Th and 238 U (see Figure 6d; note the change in the scale of this plot between eras). Once again, deriving 238 U from salinity was the most commonly used approach, with almost all the studies using the U-salinity relationship of Chen et al. (1986) or Owens et al. (2011).
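As an illustration of the salinity-based approach mentioned above, the following minimal sketch estimates 238 U activity from salinity with a linear relationship. The function name and the default coefficients are assumptions made for this example: the defaults are the values commonly quoted for Owens et al. (2011) and should be checked against the original publication before any quantitative use.

```python
def u238_from_salinity(salinity, slope=0.0786, intercept=-0.315):
    """Estimate 238U activity (dpm L^-1) from salinity using a linear fit.

    The default slope and intercept are ASSUMED here (values commonly
    quoted for Owens et al., 2011); verify them against the source and
    replace them if a different U-salinity relationship is preferred.
    """
    if salinity <= 0:
        raise ValueError("salinity must be positive")
    return slope * salinity + intercept


if __name__ == "__main__":
    # Hypothetical open-ocean salinity, for illustration only.
    print(f"238U at S = 35: {u238_from_salinity(35.0):.3f} dpm L^-1")
```

Exposing the coefficients as parameters keeps the sketch independent of any single published relationship.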
The number of studies carrying out 234 Th underway sampling increased to 7 (see e.g., Black et al., 2018; Estapa et al., 2015; Martin et al., 2013). More than 32% of the studies reported POC: 234 Th ratios collected with sediment traps (Haskell et al., 2013; Kawakami et al., 2010; Maiti et al., 2016, among others). Around a third of the studies (34%) reported CHN data (see e.g., Rosengard et al., 2015), while a much smaller number (~9%) analyzed 210 Po (such as Alkalay et al., 2020 or Wei et al., 2011).

Note that so far, only data from the first expeditions of the EXPORTS project have been published (Buesseler et al., 2020a) and are available in the compilation (see peak in Figure 2b and Figure 2d). The benefits to be derived from the success of these endeavors and the inclusion of the remaining data in the compilation could trigger the beginning of a new era for the 234 Th technique (i.e., era 5). Whether or not the publication of the 234 Th measurements collected during these dedicated large-scale collaborative projects will mark the beginning of era 5 is yet to be known.

5. Steps towards improving the global data set

We present here two perspectives that will improve the 234 Th global data set and broaden its applicability: detecting ocean regions with few 234 Th measurements, and identifying which parameters could be useful to include in future versions of the compilation for a wider application of 234 Th data.

Recommendations on 234 Th sampling
During the last few decades, considerable progress has been made towards unraveling the behavior of the BCP and understanding the factors influencing carbon dynamics and the ocean carbon cycle, e.g., primary production, aggregation, ballasting, and the activities of zooplankton and bacteria (see reviews by e.g., Buesseler and Boyd (2009); De La Rocha and Passow (2007); Sanders et al. (2014); Turner (2002)). Capturing the spatiotemporal variability of 234 Th concentrations could play a key role in the precise quantification of carbon uptake, storage rates, and subsequent ecosystem impacts. Strategies up to now have included time-series of process studies in key ocean regions, global surveys of carbon parameters, TSS, and models and databases built from field observations. However, there are gaps in the ship-based 234 Th sampling.
In terms of spatial distribution, the SH remains clearly undersampled. Of the total of 5134 locations compiled, 3351 belong to the NH (which accounts for 65% of the sampling locations) and only 1749 to the SH (35%), with the 34 remaining ones corresponding to the Equator (<1%). This is the result of the majority of expeditions taking place in the NH: 78% solely in the NH and 8% as part of surveys in both hemispheres. Figure 1a highlights an especially important gap in the South Pacific region and some additional notable gaps in the data set in the Benguela system (although this was included in the COMICS project and data will be available in the near future), the Mauritanian upwelling, and the southern Indian Ocean. More attention should be paid to these regions when planning future expeditions for 234 Th sampling. Moreover, Figure 1c shows numerous established time-series stations that have never been sampled for 234 Th, which we recommend visiting in future expeditions.
In terms of temporal distribution, research cruises are frequently conducted between January and October, although data are skewed to early spring and summer in both hemispheres (see peaks in March and October in Figure S1b). This is especially important for the SH, in the Southern Ocean, where tough wintertime conditions complicate shipboard operations and research cruises are therefore frequently skewed to the calmer spring and summer months. This means that sampling in the Southern Ocean mostly occurs during bloom conditions, which affects the reliability of the SS approach to quantify carbon export from 234 Th data (e.g., Buesseler et al., 1995; Ceballos-Romero et al., 2018; Maiti et al., 2008; Resplandy et al., 2012; Savoye et al., 2006). Therefore, results in the Southern Ocean are biased when compared to the results in other areas, as they correspond mainly to POC fluxes in bloom conditions, whereas in the rest of the oceans pre- and post-bloom conditions are also included in the global analyses.

Given that the sampling period cannot always be chosen, especially when surveying rough seas, in the next section we recommend a series of ancillary parameters to 234 Th data, often missing from the interpretation of 234 Th results, that will be useful to accurately interpret the results in those specific situations. Furthermore, these complementary parameters will ensure accuracy when i) interpreting 234 Th deficits, ii) temporally contextualizing 234 Th-derived export results, and iii) extrapolating export results to longer time scales for annual patterns and global estimates of the BCP.

Recommendations on 234 Th ancillary parameters
It is worth mentioning that there is a vast amount of data that can be useful to report in a 234 Th compilation, due to the wide applicability that has been shown for this technique over the years. However, for the sake of feasibility, we had to set boundaries and we left some parameters out of this first version of the compilation. Nevertheless, we would like to emphasize the dynamic character of this compilation, which has been conceived as an enduring compilation. Thus, the authors will keep the data set "alive" by means of i) identifying new 234 Th data sets to be included, ii) adding new data provided to us directly by 234 Th data measurers, iii) amending existing data sets if errors are detected or reported, and iv) extending the number of parameters included in the data sets.
Following the dynamic spirit behind this effort, we will periodically update the compilation in PANGAEA. Accordingly, we acknowledge ways to improve the compilation in future versions. Both new data and ancillary parameters to the 234 Th measurements will be included in future versions of the compilation, including new useful parameters arising from novel research and also following suggestions by researchers. We recommend that 234 Th data users consider the following parameters as an essential complement to 234 Th data and therefore report them when publishing 234 Th studies.
Necessary complementary data to be included in future versions of the compilation, for the application of the 234 Th method, are those from the many auxiliary sensors included on CTDs to measure parameters such as chlorophyll-a fluorescence and PAR (photosynthetically active radiation), only recently incorporated into the CTD (Jijesh et al., 2017). These parameters make it possible to track the variable depth of sinking POC production and to choose the integration depth to estimate 234 Th fluxes by using either the euphotic zone (see e.g., Giuliani et al., 2007), the PAR depth (e.g., Rosengard et al., 2015), or the primary production zone (e.g., Owens et al., 2015). The integration depth used to estimate POC fluxes has been shown to impact the quantification of the 'strength' (i.e., magnitude) of the BCP (Buesseler et al., 2020b). Therefore, fluorescence and/or PAR data combined with 234 Th data are expected to help resolve the mechanisms that control the transfer of carbon to depth by means of the BCP (see Section 6 for new studies proposed in this direction). However, as noted above, most of these auxiliary sensors have only recently been incorporated into the CTD. In contrast, temperature and salinity data have been a focal point of oceanography since the late 19th century and fully overlap with the 50-year lifespan of the 234 Th technique. It is for that reason that temperature and salinity were the only CTD parameters included in this initial version of the compilation. We acknowledge that fluorescence and PAR data are crucial parameters for an accurate use of 234 Th disequilibrium to evaluate POC export, and they will therefore be added in future versions of the data set. We recommend that 234 Th data users consider these data as essential ancillary parameters to 234 Th data and therefore report them when publishing 234 Th studies.
On the other hand, regarding gaps in parameters overall, only 36% of the studies compiled provided information regarding bloom conditions and bloom stage during sampling, which prevents us from assessing the reliability of the assumption of SS and NSS conditions to calculate 234 Th fluxes. Most of these studies (72%) followed the SS approach, while 24% of them combined it with the NSS one, and only 4% applied only the NSS model. The reliability of the SS/NSS approach applied to 234 Th deficits to estimate the POC flux depends on the sampling time in relation to the bloom dynamics. The available information on the bloom timing, and/or peak flux date, and/or duration detailed by station is a key requirement for a quantitative evaluation of the overall accuracy of current 234 Th-SS-derived POC flux estimates (see e.g., Ceballos-Romero et al., 2016). Therefore, we recommend providing as much of this information as possible when reporting future 234 Th data. Additionally, for future versions of the compilation, we will assess in the metadata the bloom stage at the sampling moment for each of the studies compiled, following a standard criterion based on up-to-date literature (e.g., Cole et al., 2012; Rumyantseva et al., 2019; Thomalla et al., 2015).
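To make the role of the integration depth and of the SS assumption concrete, the sketch below computes a steady-state 234 Th flux as the decay constant times the depth-integrated 238 U - 234 Th deficit, and converts it to a POC flux with a POC: 234 Th ratio. It is a minimal illustration, not the procedure of any particular study: it assumes steady state, neglects advection and diffusion, extends the shallowest sample to the surface, and uses trapezoidal integration; all function names and profile values are hypothetical.

```python
import math

TH234_HALF_LIFE_DAYS = 24.1  # half-life of 234Th in days
LAMBDA_TH234 = math.log(2) / TH234_HALF_LIFE_DAYS  # decay constant, d^-1


def steady_state_th_flux(depths_m, th234_dpm_l, u238_dpm_l, integration_depth_m):
    """Steady-state 234Th flux (dpm m^-2 d^-1) at the integration depth.

    Assumptions of this sketch: steady state, no advection or diffusion,
    depths sorted in increasing order, and the shallowest deficit
    extended unchanged up to the surface (0 m).
    """
    deficits = [u - th for u, th in zip(u238_dpm_l, th234_dpm_l)]
    z = [0.0] + list(depths_m)
    d = [deficits[0]] + deficits
    integral = 0.0  # depth-integrated deficit, in (dpm L^-1) * m
    for i in range(1, len(z)):
        z0, z1, d0, d1 = z[i - 1], z[i], d[i - 1], d[i]
        if z0 >= integration_depth_m:
            break
        if z1 > integration_depth_m:
            # Linearly interpolate the deficit at the integration depth.
            d1 = d0 + (d1 - d0) * (integration_depth_m - z0) / (z1 - z0)
            z1 = integration_depth_m
        integral += 0.5 * (d0 + d1) * (z1 - z0)  # trapezoidal rule
    return LAMBDA_TH234 * integral * 1000.0  # 1000 L per m^3


def poc_flux_mmol(th_flux_dpm_m2_d, poc_th_ratio_umol_dpm):
    """POC flux (mmol C m^-2 d^-1) from a 234Th flux and a POC:234Th ratio."""
    return th_flux_dpm_m2_d * poc_th_ratio_umol_dpm / 1000.0


if __name__ == "__main__":
    # Hypothetical profile, for illustration only.
    depths = [10.0, 50.0, 100.0, 150.0]
    th234 = [2.0, 2.0, 2.0, 2.5]
    u238 = [2.5, 2.5, 2.5, 2.5]
    flux = steady_state_th_flux(depths, th234, u238, integration_depth_m=100.0)
    print(f"234Th flux: {flux:.0f} dpm m^-2 d^-1")
    print(f"POC flux:   {poc_flux_mmol(flux, 5.0):.2f} mmol C m^-2 d^-1")
```

Changing `integration_depth_m` (for instance from a fixed 100 m to a light-based depth) changes the integrated deficit, which is why the ancillary data that constrain this depth matter for the resulting export estimate.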

Towards a better understanding of oceanic carbon uptake: data use perspectives
The efficiency of the BCP at transporting carbon to the deep ocean and regulating atmospheric CO2 levels is affected even by small perturbations to oceanic ecosystems, such as those in seawater chemistry or nutrient distribution (Kwon et al., 2009). Ongoing changes in temperature, oxygen concentration and stratification of the water column derived from anthropogenic warming are predicted to alter export and transfer efficiency by different means. Specifically, a reduction in carbon sequestration, and therefore an increase in oceanic emissions of CO2, is expected (Kwon et al., 2009; Le Quéré et al., 2009). More observational data and mechanistic models to describe the particle flux and its attenuation are required in order to unravel and properly quantify the current role of the BCP and provide a baseline for future estimates of oceanic CO2 uptake.
There are many analyses of BCP processes that 234 Th data could be used for. Figure 7a shows a global summary of 234 Th/ 238 U ratios in the upper 100 m of the water column in the data set (a total of 17370 data points). Ratios have been categorized as i) "deficit" for those values where 234 Th concentrations are lower than 238 U ones within a 10% uncertainty (ratio < 0.9), ii) "equilibrium" for those values that have similar concentrations of 234 Th and 238 U (0.9 < ratio < 1.1), and iii) "excess" for those values where 234 Th concentrations exceed 238 U ones (i.e., ratios > 1.1). This figure can be interpreted as the probability distribution of 234 Th reaching equilibrium (or not) with its parent at 100 m. Most of the data points fall within the equilibrium (53% of the data points) and deficit (39%) categories, while "excess" data points accounted for only 8%. A closer look at the deficit data indicates that most of the samples with ratios < 0.9 were very close to the equilibrium value. Therefore, most of the data points compiled reach equilibrium around 100 m depth (see peak with 250 samples with 234 Th/ 238 U ratios = 1). This is important because 100 m is the reference depth that many past studies used to calculate POC fluxes (see Figure 1 in Le Moigne et al. (2013b)). It is important to note that the depth where 234 Th equilibrium is reached does not necessarily agree with the euphotic zone. Equilibrium is usually re-established below the euphotic zone, at depths ranging from 50 m (e.g., Thomalla et al., 2006) to 500 m (e.g., Owens et al., 2015), for reasons not yet clear (see discussion by Rosengard et al. (2015)).
Therefore, the most recent literature recommends using depths relative to the euphotic zone as the reference depth (Buesseler et al., 2020b). Nonetheless, our data show that, as a first approximation, the choice of a fixed 100 m depth to integrate 234 Th fluxes was accurate in many cases.

Overall, Figure 7a shows at a glance the number of data points in the compilation that could be used to evaluate processes in the upper ocean such as export flux and export efficiency (e.g., Buesseler et al., 2020b), scavenging rates of trace metals (e.g., Black et al., 2018; Lemaitre et al., 2020), or particle sinking velocities by using "deficit" ratios; and those that could be used to study processes such as particle remineralization (e.g., Usbeck et al., 2002) by using the "excess" ratios. Recent studies have highlighted the role that disaggregation and fragmentation could play in setting the magnitude of flux attenuation (Baker et al., 2017; Briggs et al., 2020; Cavan et al., 2017) and pointed at it as the most important currently unaccounted-for process for improving modern-day export flux simulations. 234 Th excess could be useful to study such mechanisms by detecting slow-sinking (i.e., < 20 m d -1 ) POC flux below the mixed layer depth (Baker et al., 2017) through high vertical resolution sampling of 234 Th concentrations and POC: 234 Th ratios. Additionally, these data could be used to investigate alternative export pathways, such as active flux by zooplankton and fish diel vertical migration, which is referred to as "active transport" because some of the POC ingested at the surface is transferred to depth as part of their daily migrations and residence at depth (Saba et al., 2021; Steinberg and Landry, 2017).
Quantification of the impact of this so-called migration pump (Boyd et al., 2019) has been elusive, as it is only indirectly estimated or modeled, with active flux ranging widely as a percentage of the passive flux, from <4% (Le Borgne and Rodier, 1997) to equal to or larger than the total passive flux measured by classical methods (Yebra et al., 2018). 234 Th excess could shed light on this, since a local maximum in export should be detected around the deep scattering layer, where migrators reside at depth during the day, if migrators accounted for >40% of the gravitational POC flux (Buesseler and Boyd, 2009).
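Returning to the ratio categories used for Figure 7a, the minimal sketch below shows how 234 Th/ 238 U pairs could be sorted into "deficit", "equilibrium" and "excess" with the 10% tolerance described in the text. The function names and sample values are hypothetical, and assigning the exact boundary values (0.9 and 1.1) to "equilibrium" is an assumption of this example, since the text does not specify them.

```python
from collections import Counter


def classify_ratio(th234, u238, tolerance=0.10):
    """Classify a 234Th/238U activity ratio with a symmetric tolerance.

    Ratios below 1 - tolerance are "deficit", ratios above 1 + tolerance
    are "excess", and everything else (boundaries included, by assumption)
    is "equilibrium".
    """
    if u238 <= 0:
        raise ValueError("238U activity must be positive")
    ratio = th234 / u238
    if ratio < 1.0 - tolerance:
        return "deficit"
    if ratio > 1.0 + tolerance:
        return "excess"
    return "equilibrium"


def summarize(pairs, tolerance=0.10):
    """Return the percentage of (234Th, 238U) pairs in each category."""
    counts = Counter(classify_ratio(th, u, tolerance) for th, u in pairs)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical (234Th, 238U) activities in dpm L^-1.
    samples = [(1.5, 2.5), (2.4, 2.5), (2.5, 2.5), (3.0, 2.5)]
    print(summarize(samples))
```

Applied to the upper 100 m of the compilation, a tally of this kind would correspond to the category percentages reported for Figure 7a.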
On another note, data from the compilation could be used to provide revised and more robust global estimates of the ocean's BCP by different means, such as i) evaluating 234 Th-derived POC fluxes regionally, ii) studying the temporal evolution of the BCP strength through time-series data, iii) studying carbon export at shallow depths, and iv) globally analyzing carbon export patterns. As a future study, we propose using the compilation to integrate all data sets to a chosen depth (e.g., to the depth of the 0.1% light level, following the recommendation by Buesseler 2020), so that different regions of the upper ocean could be compared to each other across seasons in terms of the biological pump strength.
In a second example, 234 Th- 238 U disequilibrium could be combined with 210 Po- 210 Pb and/or sediment trap data to analyze simultaneously key parameters for the functioning of the BCP at different temporal and spatial scales (e.g., Ceballos-Romero et al., 2016; Roca-Martí et al., 2016). Finally, as an example of potential and powerful uses of this compilation, we show results from Davidson et al. (2021), where global 234 Th concentrations were estimated at the base of the regionally varying euphotic zone by using data from this data set. 234 Th data were regressed to a 2.8-degree grid using a suite of machine learning and minimum variance interpolation algorithms.
The gridded data were used to drive a 3D global model using a set of sparse, implicit, and explicit transport matrices derived from a global, coupled General Circulation Model (Khatiwala et al., 2005). The model captures known major features of 234 Th, with an average of 2.03 ± 0.18 dpm L -1 globally and the greatest activities in ocean gyres (Figure 7b).
Overall, the compilation presented here provides a valuable resource to better understand and quantify how the contemporary oceanic carbon uptake functions and how it will change in the future. It supports the multidisciplinary approach, drawing knowledge and methodologies from the fields of physics, mathematics, chemistry, biology, oceanography, and marine sciences, that is necessary to generate better mechanistic models of the BCP.

Data availability
Our data set is archived in the data repository PANGAEA® (http://www.pangaea.de), under the following DOI: [...]

[...] Loeff not only for the great amount of data shared, many of them from many years ago, but also for providing detailed explanations to extract the data of interest to the compilation, which has made this enormous undertaking much easier. Special appreciation to Dr. K. Bruland and Dr. K. Coale for, in addition to data, sharing beautiful memories of the 234 Th world.

Table 2 Selected Ocean programs from the 50 years of 234 Th data compiled and significant activities within these programs with 234 Th samplings chronologically ordered. Activities are categorized as follows: i) cruises (C), ii) [...]

*Observations at Station Papa (former Ocean Station Peter and referred to as OSP or Ocean Station "P") started in 1949, although the larger surveys became a focus of activities in support of JGOFS during the 1990s (Freeland, 2007).

[...] ii) published exclusively in repositories (dark blue squares), and iii) published in refereed journals (magenta dots). (b) Map showing data density by sampling location. (c) Map showing long-term, high-frequency time-series stations (TSS) either i) "operational" (blue diamonds), ii) "registered" for future operation (yellow diamonds), iii) "inactive" (red asterisks), or iv) "closed" (black crosses) (source: https://www.ocean-ops.org) and those locations including 234 Th sampling that match a TSS (light blue dots).

[...] as follows: i) dissolved plus particulate 234 Th (yellow diamonds), ii) 238 U plus total 234 Th (dark blue squares), iii) no 238 U (magenta dots); and for eras 2 (b), 3 (c) and 4 (d) distinguished by the type of data available as follows: i) total 238 U plus total 234 Th plus POC: 234 Th ratios (yellow diamonds), ii) 238 U not sampled (dark blue squares), iii) total 234 Th not sampled (magenta dots), and iv) POC: 234 Th ratios not sampled (light blue triangles).

[...] Ratios have been categorized as i) "deficit", ii) "equilibrium", and iii) "excess" for those values where 234 Th concentrations are lower than, equal to, and larger than 238 U ones, respectively, within a 10% uncertainty. (b) 234 Th activities at the base of the regionally varying euphotic zone (defined here as the depth of the 0.1% light level). Black dots indicate the observations near the euphotic zone that were used when producing the 234 Th concentration plot.