DINOSTRAT: A global database of the stratigraphic and paleolatitudinal distribution of Mesozoic-Cenozoic organic-walled dinoflagellate cysts

Abstract. Mesozoic–Cenozoic organic-walled dinoflagellate cyst (dinocyst) biostratigraphy is a crucial tool for relative and absolute age control in complex ancient sedimentary systems. However, stratigraphic ranges of dinocysts are found to be strongly diachronous geographically. A global compilation of state-of-the-art calibrated regional stratigraphic ranges could assist in quantifying regional differences and evaluating their underlying causes. For this reason, DINOSTRAT is here initiated – an open-source, iterative, community-fed database intended to house all regional chronostratigraphic calibrations of dinocyst events (https://github.com/bijlpeter83/DINOSTRAT.git). DINOSTRAT version 1.0 includes > 8500 entries of first and last occurrences (collectively called “events”) of > 1900 dinocyst taxa, and their absolute ties to the chronostratigraphic time scale of Gradstein et al. (2012). Entries are derived from 199 publications and 189 sedimentary sections. DINOSTRAT interpolates paleolatitudes of regional dinocyst events, allowing evaluation of the paleolatitudinal variability of dinocyst event ages. DINOSTRAT is openly accessible and searchable by region, age, and taxon. This paper presents a selection of the data in DINOSTRAT: (1) the (paleo)latitudinal spread and evolutionary history of modern dinocyst species; (2) the evolutionary patterns and paleolatitudinal spread of dinoflagellate cyst (sub)families; (3) a selection of key dinocyst events which are particularly synchronous. Although several dinocysts show – at the resolution of their calibration – quasi-synchronous event ages, many species show remarkable diachroneity. DINOSTRAT provides the data storage approach by which the community can now start to relate diachroneity to (1) inadequate ties to chronostratigraphic time scales; (2) complications in taxonomic concepts; and (3) ocean connectivity and/or the affinities of taxa to environmental conditions.


between authors and regions, which the current approach was unable to account for, and is considered a next step, when individual studies or sites are revisited.
For the subfamily Wetzelielloideae, DINOSTRAT deviates from the taxonomic index of Williams et al. (2017). The fundamental redefinition of species concepts in the taxonomic revision of the Wetzelielloideae (Williams et al., 2015) eliminates many stratigraphically useful Eocene dinocyst taxa (Bijl et al., 2016). Therefore, for this subfamily, the calibration of dinocyst species is presented in the taxonomic classification of Wetzelielloideae that predates Williams et al. (2015).
A decision tree is used to determine which papers to include in DINOSTRAT (Fig. 1). This tree first discards studies in which dinoflagellate cysts were the only stratigraphic tool used to date the sequence. Although these papers do provide valuable information on the stratigraphic order of events, discarding them from this review eliminates the risk of circular reasoning and inherited chronostratigraphic ties. Only those dinocyst events are included that could be calibrated against a stratigraphic tool that can be traced back to the bio-, magneto- or chronozones in the Geologic Time Scale 2012 (GTS2012; Gradstein et al., 2012). The decision tree distinguishes five tiers in these papers (Fig. 1):
• Tier 1 studies present dinocyst events along with magnetostratigraphic constraints obtained from the same sedimentary section. The interpretation of magnetochrons from the paleomagnetic signal was done without the use of dinoflagellate cyst biostratigraphy. Since magnetic reversals are globally synchronous, evaluating the synchroneity of dinocyst events against magnetostratigraphy is most robust.
• Tier 2 studies present dinocyst events calibrated along with compromised or problematic magnetostratigraphic constraints on the same sedimentary section, for instance when the inclination signal suffers from a strong overprint, or when the magnetochron assignment is not clear. Studies in which dinocyst events served as a biostratigraphic tool for magnetochron assignment are included in this tier as well.
• Tier 3 studies report dinocyst events together with biostratigraphic zones (from nannoplankton, foraminifer or ammonite zones) identified on the same sequence. These studies clearly report how these zones were identified in the sequence.
• Tier 4 studies report dinocyst events with biostratigraphy for which either the derivation or the tie to the GTS is unclear (e.g., for outdated ammonite zonations), or for which the biostratigraphic data do not come from the same sequence but are, e.g., interpreted from nearby outcrops.
• Tier 5 studies report dinocyst events with independent chronostratigraphy of which the derivation is unverifiable, or which represents a regional synthesis.
The absolute age of each dinocyst event is not explicitly entered into DINOSTRAT. Rather, its position within the zone it was calibrated to is entered. Ages are subsequently calculated via linear interpolation between these tie points, as follows:
[##]% [stratigraphic tool] [zone] (1)
in which [##]% is linearly interpolated between the base (0%) and top (100%) of the tie points, [stratigraphic tool] is the bio-, magneto- or chronozonation in the GTS2012, and [zone] is the name of the zone, chron or stage in which the dinocyst event falls. The rationale behind this approach, instead of simple entry of the age, is that while the absolute ages of dinocyst events depend on the evolving knowledge of the chronostratigraphic time scale, the stratigraphic position of the event relative to the tie points in the record is fixed. This approach makes it easier to update the ages of the dinocyst events when the ages of the chrono-, magneto- and biozones are updated in the future. If dinocyst events fall between two different stratigraphic ties, the event is noted as follows:

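The percentage-within-zone entry and the linear interpolation of event ages described above can be illustrated with a minimal Python sketch. This is not the code used to build DINOSTRAT: the `Zone` class, the function names, the zone name "X1n" and all ages are hypothetical and serve only to show why storing the relative position, rather than the age, makes a time scale update straightforward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A bio-, magneto- or chronozone with boundary ages in Ma (hypothetical structure)."""
    base_age_ma: float  # older boundary, the 0% position
    top_age_ma: float   # younger boundary, the 100% position


def event_age_ma(zone: Zone, percent_from_base: float) -> float:
    """Linearly interpolate an event age between the base (0%) and top (100%) of a zone."""
    if not 0.0 <= percent_from_base <= 100.0:
        raise ValueError("position must lie between 0 and 100 percent")
    fraction = percent_from_base / 100.0
    return zone.base_age_ma + fraction * (zone.top_age_ma - zone.base_age_ma)


def age_for_entry(entry, zones):
    """Look up the zone of an entry (percent, tool, zone name) and compute its age."""
    percent, tool, zone_name = entry
    return event_age_ma(zones[(tool, zone_name)], percent)


# Two hypothetical versions of a time scale; the zone boundaries are invented.
zones_v1 = {("magnetochron", "X1n"): Zone(base_age_ma=40.0, top_age_ma=38.0)}
zones_v2 = {("magnetochron", "X1n"): Zone(base_age_ma=40.4, top_age_ma=38.2)}

# The stored entry never changes: 25% above the base of (hypothetical) chron X1n.
entry = (25.0, "magnetochron", "X1n")

print(age_for_entry(entry, zones_v1))  # age under the first set of boundary ages
print(age_for_entry(entry, zones_v2))  # same entry, recalculated after an update
```

Because the entry only stores the relative position, recalculating all event ages after a time scale revision amounts to swapping the zone table; no individual entry needs to be edited.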
Each event entry in DINOSTRAT (Dinoevents_Jan2021.csv in Bijl, 2021) includes the (paleo)latitude of that event. This is interpolated using the age of the event and its location, which has a paleolatitude evolution through time (Paleolatitude.csv in Bijl, 2021; with use of Paleolatitude.org; Van Hinsbergen et al., 2015). Paleolatitudes of sites in mobile orogenic belts are interpolated using regional tectonic reconstructions, and as such are prone to additional latitudinal uncertainty.
economically interesting regions (e.g., for hydrocarbon industry). It may also in part reflect a bias towards 'western society' research, and poor accessibility of publications from non-western societies.
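For the paleolatitude assignment of events described above, the interpolation step can be sketched as follows. This is an illustration under stated assumptions, not the procedure implemented in DINOSTRAT: the representation of a site's paleolatitude evolution as a list of (age, paleolatitude) pairs, the function name and the example values are all hypothetical, and the actual layout of Paleolatitude.csv may differ.

```python
from bisect import bisect_left


def paleolatitude_at(age_ma, path):
    """Linearly interpolate a site's paleolatitude at a given age.

    `path` is a list of (age_ma, paleolatitude_deg) pairs sorted by
    increasing age (a hypothetical representation of one site's record).
    """
    ages = [age for age, _ in path]
    if not ages[0] <= age_ma <= ages[-1]:
        raise ValueError("age lies outside the reconstructed interval")
    i = bisect_left(ages, age_ma)  # first index with ages[i] >= age_ma
    if ages[i] == age_ma:
        return path[i][1]
    (age0, lat0), (age1, lat1) = path[i - 1], path[i]
    return lat0 + (age_ma - age0) / (age1 - age0) * (lat1 - lat0)


# Hypothetical paleolatitude evolution of one site (ages in Ma, latitudes in degrees).
site_path = [(0.0, 52.0), (10.0, 50.0), (20.0, 47.0)]

print(paleolatitude_at(15.0, site_path))  # event age between two reconstruction steps
print(paleolatitude_at(10.0, site_path))  # event age exactly on a reconstruction step
```

The sketch deliberately refuses to extrapolate beyond the reconstructed interval; whether and how that case is handled in the database is not stated in the text.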
The paleolatitudinal position of the sites through time confirms the strong overrepresentation of Northern Hemisphere mid-latitude sections (Fig. 3), and the underrepresentation of the tropical regions, Pacific Ocean and southern mid-latitudes. The Paleogene has the largest latitudinal spread of records, larger even than that of the Neogene. The Mesozoic in particular has few entries from the Southern Hemisphere and equatorial regions. The Mesozoic records are predominantly calibrated to ammonite stratigraphy (Tier 3 and 4), and on some occasions to magnetostratigraphy (Tier 1 and 2; Fig. 3). Ammonite zones presented in the papers often had to be converted to those in the GTS2012, which is not always straightforward, as the zone definitions have changed through time (Ogg and Hinnov, 2012a, b). The ammonite zonations are prone to regional diachroneity themselves, which was demonstrated particularly for the Late Jurassic (Ogg and Hinnov, 2012b). This may create a level of circular reasoning when dinocyst events are calibrated against these zones, because diachronous dinocyst events in DINOSTRAT may be the result of diachronous ammonite zones rather than diachronous dinocyst events.
The interpolated paleolatitudes for dinoflagellate cyst events in DINOSTRAT allow detailed evaluation of the latitudinal synchroneity of dinocyst events. This paper presents a selection of the data in DINOSTRAT, focusing on the stratigraphic and geographic range of modern dinocyst species, of dinocyst families/subfamilies and of a selection of quasi-synchronous dinocyst events. Users can filter DINOSTRAT per locality (to present the stratigraphic order of events per site) and/or per taxon (to see the geographic variability of the range of any taxon), to serve their purposes.
(Zonneveld et al., 2013). The database presented here allows comparison of the modern latitudinal spread of these species to that of the past, and shows their age and latitude of oldest first occurrence (Supplement 2, and a selection in Fig. 4). Most modern species that have entries in DINOSTRAT have originations in the mid-Cenozoic: Impagidinium species, Operculodinium centrocarpum, Tectatodinium pellitum, Tuberculodinium vancampoae (Fig. 4). Lingulodinium machaerophorum has a first occurrence around 60 Ma. The exception is Spiniferites ramosus, a generalist species with a robust morphology through time, which has a remarkably consistent FO in the Berriasian (~145 Ma; Fig. 4). The dinocyst species that have geographic distributions restricted to one hemisphere today were also latitudinally restricted in the geologic past (e.g.,

Dinocyst (sub-) families
Range charts of the sites in DINOSTRAT are provided in the Supplements (see "Sites" folder in Supplement 2). The age-versus-paleolatitude entries in DINOSTRAT allow evaluation of the latitudinal difference in event ages for each individual species in DINOSTRAT (n=1914), as well as for groupings per genus (n=460) and family (n=28) (Supplement 2). Users can produce or adapt these plots themselves with the help of the R markdown script "plot creator.Rmd" (Bijl, 2021). The most robust dinocyst events will have synchronous ages of FOs and LOs across paleolatitudes (i.e., vertical blue and red lines in the plots of Supplement 2). The FOs and LOs connected per species and grouped in (sub)families are plotted and described below, with particularly synchronous taxa highlighted. The purpose of these plots is threefold. First, they show the total stratigraphic range and latitudinal spread of these dinoflagellate (sub)families, and the time intervals and regions in which phases of strong diversification and extinction occur in each (sub)family. Second, as with the plots of modern species, they show at which paleolatitudes these supra-generic groups first appear, and where they finally go extinct. Although earlier compilations of the evolution of dinocyst families do exist (e.g., McRae et al., 1996), DINOSTRAT presents the fundamental spatio-temporal observations that underpin these compilations. Third, the plots present the database in such a way that the validity of extrapolating dinoflagellate cyst events on a supra-regional scale can be critically evaluated in the discussion.
Family uncertain (Fig. 17)
Remarks: This group, of which the family is uncertain, does contain several stratigraphically synchronous species (Fig. 17).

Geographic extrapolation of dinocyst events
A suite of dinocyst events throughout the entire stratigraphic record have quasi-synchronous ages across all latitudes (Figs. 5–28). The uneven geographic spread of data, with voids in the equatorial region and the Pacific Ocean, makes the global synchroneity of these events highly uncertain. Still, the synchronous events confirm the potential and value of dinocyst biostratigraphy to date complex sedimentary systems. They also imply that ocean connectivity did allow dinocyst species to migrate globally, as far as their environmental tolerances permitted.
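One simple way to screen for such quasi-synchronous events is to compare, per taxon and event type, the spread of calibrated ages across sites. The Python sketch below is only an illustration of that idea and not the procedure followed in this paper: the tuple layout, the taxon and site names, the ages and the 0.5 Myr threshold are all hypothetical, and no formal threshold for "quasi-synchronous" is defined in the text.

```python
from collections import defaultdict

# Hypothetical event records: (taxon, event type, site, age in Ma).
events = [
    ("Taxon A", "FO", "Site 1", 56.0),
    ("Taxon A", "FO", "Site 2", 56.2),
    ("Taxon A", "FO", "Site 3", 55.9),
    ("Taxon B", "LO", "Site 1", 34.0),
    ("Taxon B", "LO", "Site 2", 30.5),
]


def age_spread(records):
    """Return the age range (max minus min, in Myr) per (taxon, event type).

    Events recorded at fewer than two sites are skipped, because a single
    record says nothing about synchroneity.
    """
    ages = defaultdict(list)
    for taxon, kind, _site, age in records:
        ages[(taxon, kind)].append(age)
    return {key: max(vals) - min(vals) for key, vals in ages.items() if len(vals) >= 2}


def quasi_synchronous(records, max_spread_myr):
    """List the events whose age spread does not exceed an assumed threshold."""
    spreads = age_spread(records)
    return sorted(key for key, spread in spreads.items() if spread <= max_spread_myr)


print(age_spread(events))
print(quasi_synchronous(events, max_spread_myr=0.5))  # threshold is an assumption
```

A spread computed this way inherits the geographic bias of the records: a small spread among Northern Hemisphere mid-latitude sites alone does not demonstrate global synchroneity.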
Yet, the majority of dinocyst species have very diachronous ranges in DINOSTRAT, as well as a latitudinally restricted geographic spread, which confirms previous interpretations (Williams et al., 2004). With DINOSTRAT the underlying causes of this diachroneity can now be further explored. The shortness of some of the records used in this review may lead to 'false' events, i.e., those that represent re-appearance or temporary disappearance rather than 'true' first or last occurrences (FO and LO, respectively). The obvious false FOs and LOs have been removed from DINOSTRAT by omitting events that occur at the base or the top of the sections. Particularly rare species, or those occurring at the edge of their preferred environmental niche, come and go in stratigraphic sections, and these lead to 'false' events in DINOSTRAT. Although such 'false' FOs and LOs may obscure a uniform age of events over latitudes, they may still have important regional stratigraphic significance, which is why their entries are retained in DINOSTRAT. As a result, the age and region of the oldest FOs and youngest LOs have the most significance for the reconstruction of evolutionary patterns. Although caving of material typically falsely increases the age of oldest FOs, this is unlikely to have a large influence on the entries in DINOSTRAT, as most studies come from core or outcrop material, and not from ditch cuttings, for which caving is much more likely. Reworking could falsely extend the age of youngest LOs of species. Although species that were reported as reworked in the papers have been omitted from DINOSTRAT, some reworked dinocysts could have been falsely identified as in situ in the original papers. It cannot be excluded that this causes some level of diachroneity in LOs, although this is unlikely to be a large factor.
The complexity of taxonomic concepts in some dinocyst genera (species definitions, or morphological continua) hinders proper evaluation of the latitudinal synchroneity of events. The reviewed literature covers 50 years, during which taxonomic concepts of dinocyst species have iteratively evolved. The extensive synonymy database of Williams et al. (2017) does deliver crucial organization of the taxonomic framework. Still, some of the subtle morphological differences between species are apparent only to the expert eye of individual researchers, and these may not have been recognized by others (which occasionally led to the presentation of taxa on a genus level, instead of further specification to species level). Making the taxonomic framework consistent for all studies now included in DINOSTRAT would be a major effort and will be part of the iterative setup of DINOSTRAT. For example, reviews of dinocyst taxonomic frameworks on a per-family basis, such as has been initiated for the Spiniferites complex (e.g., Mertens and Carbonell-Moore, 2018), could help to resolve inconsistencies in species concepts and their stratigraphic occurrence. In any case, it must be stressed that the quality of any biostratigraphic marker is defined not only by the accuracy of the tie to the chronostratigraphic time scale, or the global consistency of the age of FOs or LOs, but also by its morphological distinctiveness.
Events may also appear diachronous in DINOSTRAT because of an inadequate or inaccurate tie to the chronostratigraphic time scale. In such cases, small diachroneity (~10^4 to 10^5 yr) may be related to the inherent assumption of linear sedimentation rates between age tie points. Larger diachroneity (~10^5 to 10^6 yr) may arise because the zonation through which dinocyst events were calibrated to the chronostratigraphic time scale is itself diachronous. For calibrations against magnetostratigraphy (Tier 1, 2) this is unlikely, and could occur only when magnetochrons were wrongly interpreted in the sites used. For events calibrated against Cenozoic nannoplankton and foraminifer zonations (in Tier 3, 4) this is also unlikely, as these events are relatively robustly calibrated to chronostratigraphy (Watkins and Raffi, 2020; Petrizzo et al., 2020). Less robust are the Mesozoic ammonite zonation schemes, which have been shown to be quite diachronous latitudinally themselves (e.g., Ogg and Hinnov, 2012a, b and references therein). The geographic variability in ages of zone boundaries, but also numerous adjustments of zone definitions throughout the past 50 years, further complicate an accurate tie of dinocyst events with ammonite data to the GTS2012. So far, the majority of Mesozoic dinocyst events were calibrated against these ammonite zonations, which makes their absolute tie to
Also, ecological reasons could cause geographically diachronous events. When local environmental or depositional conditions change, assemblages adjust, which leads to local and temporary (dis)appearances of species that may be falsely interpreted as extinction or origination events. If so, dinocyst taxa associated with the most dynamic environmental niches on the continental shelf are expected to have the most diachronous events. Indeed, there are particularly diachronous events in Goniodomaceae and Protoperidinioideae; both families are associated with near-shore depositional settings (Zonneveld et al., 2013; Sluijs et al., 2005; Frieling and Sluijs, 2018), which are the most environmentally dynamic. Settings in which these species occur offshore, such as upwelling regions (Sangiorgi et al., 2018) or hyperstratified waters (Reichart et al., 2004; Cramwinckel et al., 2019), are environmentally equally dynamic. In contrast, families typically associated with offshore conditions, such as the Wetzelielloideae (Frieling and Sluijs, 2018), reveal much more synchronous events. For regional stratigraphy, the diachroneity is of less concern because these events can still be used for regional stratigraphic correlation (e.g., as in Vieira and Jolley, 2020). It does mean that for such species, dinocyst biostratigraphy applies regionally, and caution should be taken when extrapolating event ages far outside of these regions. There are also species that clearly show regional inconsistency of origination or extinction ages as a result of climate change; e.g., Melitasphaeridium choanophorum had a much wider geographic distribution during warmer past climates and a progressively younger LO in lower latitudes as climate cooled (Fig. 4).
Diachroneity is usually larger between latitudinal bands than within latitudinal bands. The sparsity of records from the southern high latitudes complicates robust assessment of interhemispheric differences in dinocyst event ages. In the Mesozoic, the diachroneity is likely related to the inadequate tie of events to the international time scale. DINOSTRAT is short of Mesozoic records that are tied to stratigraphic tools other than ammonites. For the Cenozoic, the diachroneity between hemispheres cannot be explained by inadequate calibration, since many events are calibrated against magnetostratigraphy. For those, environmental reasons must be at play. While in the early Paleogene many dinocyst events are quasi-synchronous (events within the Wetzelielloideae, and of Cerodinium and Palaeoperidinium), in the late Paleogene and Neogene diachroneity seems to become stronger. This may be in part because of stronger latitudinal temperature gradients as global average climate cooled (Cramwinckel et al., 2018; Westerhold et al., 2020), which created more diverse ecological niches and complicated latitudinal migration.
Many dinoflagellate cyst species and supra-generic groups have their oldest first occurrence and youngest last occurrence in Northern Hemisphere mid-latitudes (see, e.g., Areoligeraceae, Cladopyxiaceae, Comparodiniaceae, Goniodomaceae, Nannoceratopsiaceae, Palaeoperidinioideae, Wetzelielloideae; Figs. 5, 7, 18, 8, 26, 22, 23). This may be because of a much higher density of records at those latitudes. However, the vast continental shelf area in Europe throughout the Mesozoic and much of the Cenozoic likely served as an ideal habitat in which taxa could find a new niche and linger on. A higher record density in the Southern Hemisphere and equatorial regions should shed light on this idea.

Evolutionary patterns in dinocyst (sub-) families
DINOSTRAT presents for the first time a quantitative overview of the stratigraphic and paleolatitudinal distribution of fossil and modern dinoflagellate cyst taxa. In doing so, it refines with coherent, independent, open-access data the evolutionary patterns presented previously (e.g., Fensome et al., 1993; McRae et al., 1996), and adds their latitudinal distribution through time.

Functionality of DINOSTRAT
Once downloaded, DINOSTRAT can be filtered by location, allowing users to compare newly generated dinocyst chronologies to nearby calibrated regional dinocyst events. DINOSTRAT can also be filtered by species, genus or higher taxonomic rank, for further evaluation of the latitudinal spread of any species of interest. The data in DINOSTRAT are readily visualized in Supplement 2, and these plots can be adjusted and reproduced using the R markdown file "plot creator.Rmd" (Bijl, 2021). The community is invited to contact the first author either via email or through GitHub, with suggestions, error reports, and/or additional papers or data to be entered, so that the data content of DINOSTRAT is iteratively improved.
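As an illustration of the filtering described above, the following Python sketch selects events per site or per taxon from a small in-memory table. It is a sketch under stated assumptions rather than a description of the released files: the column names (`taxon`, `genus`, `site`, `event`, `age_ma`, `paleolatitude`) and all values are hypothetical, and the actual column layout of Dinoevents_Jan2021.csv should be checked before adapting it.

```python
import csv
import io

# Hypothetical stand-in for the event table; real column names may differ.
SAMPLE_CSV = """taxon,genus,site,event,age_ma,paleolatitude
Genus1 speciesa,Genus1,Site 1,FO,56.0,45.0
Genus1 speciesa,Genus1,Site 2,FO,56.2,-60.0
Genus2 speciesb,Genus2,Site 1,LO,34.0,44.0
"""


def load_events(handle):
    """Read event rows from an open CSV file or file-like object."""
    return list(csv.DictReader(handle))


def filter_events(rows, **criteria):
    """Keep the rows whose columns equal all given values (exact string match)."""
    return [row for row in rows if all(row[col] == val for col, val in criteria.items())]


rows = load_events(io.StringIO(SAMPLE_CSV))

# Per site: the stratigraphic order of events, oldest first.
by_site = sorted(
    filter_events(rows, site="Site 1"),
    key=lambda row: float(row["age_ma"]),
    reverse=True,
)

# Per taxon: the geographic variability of one event.
by_taxon = filter_events(rows, taxon="Genus1 speciesa", event="FO")

for row in by_site:
    print(row["taxon"], row["event"], row["age_ma"])
for row in by_taxon:
    print(row["site"], row["age_ma"], row["paleolatitude"])
```

To apply the same functions to a downloaded file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle, with the filter keys adjusted to the real column names.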

Future directions
DINOSTRAT will be regularly updated. Annual minor updates include the addition of sites, adjustments in the current entries (e.g., through the feedback process), or minor revisions in taxonomy/stratigraphy. Major updates will occur in a 3-year cycle and are the result of new Geologic Time Scales, or profound revisions in dinocyst taxonomic concepts. Major updates will be accompanied by a short communication in this journal; minor updates will be communicated through the GitHub repository.
Updates of the Geologic Time Scale (e.g., to GTS2020; Gradstein et al., 2020) will be implemented once the metadata of that Geologic Time Scale have become available. All versions of DINOSTRAT will remain archived on GitHub.

Data availability
The database is available under a CC-BY 4.0 license on GitHub (Bijl, 2021;  modern dinocyst dataset, and (4) "Dinoevents_Jan2021.csv"; the calibrated dinocyst events. "Plot creator.Rmd" is an R markdown file to reproduce the figures presented in this paper.

Conclusions
This paper presents the database DINOSTRAT version 1.0 (Bijl, 2021), a database containing >8500 entries of regional dinoflagellate cyst first and last occurrences (events) from 1914 species, in 189 sites. The geographic distribution of sites used in DINOSTRAT is strongly concentrated in the Northern Hemisphere mid-latitudes, notably in Europe and the North Atlantic, and few sites are in the Pacific or Southern Hemisphere. Ages of events were calibrated using their tie to the Geologic Time Scale.
The paper presents the location and age of origin of modern dinoflagellate cyst species, reviews the age range and geographic spread of modern and extinct dinoflagellate cyst taxa and highlights the most latitudinally synchronous dinoflagellate cyst events.
Many dinocyst taxa show latitudinally quasi-synchronous events, which can be widely used to stratigraphically date complex sedimentary sequences. Latitudinal diachroneity in events can be the result of an inadequate tie to the chronostratigraphic time scale, false interpretations of 'true' events, complicated species concepts or paleoceanographic reasons. In any case, it calls for caution in extrapolating ages of dinocyst events over large distances, and demonstrates the need for regionally calibrated dinocyst zonations, which DINOSTRAT here provides. It further provides a solid foundation to review spatio-temporal patterns in dinoflagellate cyst evolution, dispersal, and extinction. DINOSTRAT is freely available under a CC-BY 4.0 license. It allows the user to filter by region, or by species, genus, or higher taxonomic rank.

Supplements
• Supplement 1: Table of conversions of published zones to those in GTS2012
• Supplement 2: Zip file containing ages and latitudes of events in individual dinoflagellate cyst species (1914 plots), grouped per genus (460 plots), per family (28 plots), of modern cyst species (92 plots), and the range charts for all sites (189 plots).

Competing interests
The author declares no conflict of interest.

Acknowledgements
The LPP Foundation has financially supported the development of DINOSTRAT. I thank Henk Brinkhuis, Bas vd Schootbrugge and Appy Sluijs for useful discussions. The 'Advanced course in organic-walled dinoflagellate cyst taxonomy, stratigraphy and paleoecology' has been a great 'playground' to discuss progress in the field, and for that I have Martin Head,