Varved lake sediments provide climatic records with seasonal to annual
resolution and low associated age uncertainty. Robust and detailed
comparison of well-dated and annually laminated sediment records is
crucial for reconstructing abrupt and regionally time-transgressive
changes as well as for validating spatial and temporal trajectories of
past climatic changes. The VARved sediments DAtabase (VARDA) presented
here is the first data compilation for varve chronologies and
associated palaeoclimatic proxy records. The current version 1.0
allows detailed comparison of published varve records from 95
lakes. VARDA is freely accessible and was created to assess outputs
from climate models with high-resolution terrestrial palaeoclimatic
proxies. VARDA additionally provides a technical environment that
enables us to explore the database of varved lake sediments using a
connected data model and can generate a state-of-the-art graphic
representation of a multisite comparison. This allows the reassessment
of existing chronologies and tephra events to synchronize and compare
even distant varved lake records. Furthermore, the present version of
VARDA permits the exploration of varve thickness data. In this paper, we
report in detail on the data-mining and compilation strategies for the
identification of varved lakes and assimilation of high-resolution
chronologies, as well as the technical infrastructure of the
database. Additional palaeoclimatic proxy data will be provided in
forthcoming updates. The VARDA graph database and user interface can
be accessed online at
A major challenge in simulating climate change is validating model outputs with palaeoclimatic data. Model–data comparisons on regional to global scales require the integration of palaeoclimatic data from single sites into multisite networks (e.g. Franke et al., 2017). Annually laminated lake sediments provide reliable data for such networks because they offer palaeoclimatic information in high temporal resolution with low associated age uncertainty. Due to their annual to seasonal resolution, multisite networks of varved lake sediments enable investigations of abrupt and regionally time-transgressive climate change on the continents (e.g. Lane et al., 2013; Rach et al., 2014), which are fundamental to understanding past climates, especially that of the last glacial cycle (Clement and Peterson, 2008), and to better assess spatial and temporal trajectories of future climate changes. Networks of varved lake sediments also provide means to test contrasted proxy responses to climate change (e.g. Ott et al., 2017; Ramisch et al., 2018; Roberts et al., 2016), further enhancing the robustness of palaeoclimatic reconstructions. However, despite their usefulness for the generation of highly resolved multisite networks, a global synthesis of varve-related palaeoclimatic data is still not available.
Various data providers have been developed which offer free access to
palaeoclimatic and palaeoenvironmental information including high-resolution terrestrial archives. These include (1) large-scale data
repositories, such as PANGAEA (
We assessed varve-related publications aided by the literature
database of the PAGES varve working group (
To ensure an unambiguous identification of a lake record corresponding
to a given dataset, we collected and reviewed the required information
of lake names and geographic coordinates from the published
literature. Table 1 lists required and additional information for lake
records included in the VARved sediments DAtabase (VARDA). To facilitate searches for lakes in an
alphabetically ordered list, the string “Lake” was removed from the
name if it appeared at the beginning of the lake name
(e.g. “Lake Ammersee” was changed to “Ammersee”). However,
exceptions were made if the string “Lake” is an essential feature of
the lake name (e.g. “Lake of the Clouds”) or if the name is in a
language other than English (e.g. “Lac d'Annecy”). Lake locations were
stored as WGS84-referenced geographical coordinates in decimal degrees
with four decimal places, which corresponds to a precision of
VARDA v01 data sheet for lake information (green field: required information; yellow field: additional information).
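The naming rule and coordinate format described above can be illustrated with a minimal Python sketch. It is not the code used to build VARDA: the function names and the exception list `ESSENTIAL_LAKE_NAMES` are hypothetical and serve only to show the stated rule (remove a leading “Lake”, keep essential or non-English names, store coordinates with four decimal places).

```python
import re

# Hypothetical exception list: names in which "Lake" is an essential feature.
ESSENTIAL_LAKE_NAMES = {"Lake of the Clouds"}


def normalize_lake_name(name: str) -> str:
    """Drop a leading "Lake " unless the name is a listed exception."""
    name = name.strip()
    if name in ESSENTIAL_LAKE_NAMES:
        return name
    # Only a leading English "Lake" is removed; "Lac", "Lago", or a
    # trailing "Lake" (e.g. "Bear Lake") are left untouched.
    return re.sub(r"^Lake\s+", "", name)


def round_coordinate(value: float) -> float:
    """Round a WGS84 coordinate in decimal degrees to four decimal places."""
    return round(value, 4)
```

For example, `normalize_lake_name("Lake Ammersee")` yields `"Ammersee"`, whereas `"Lac d'Annecy"` is returned unchanged.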
VARDA v01 data sheet for sediment composite profile information (green field: required information; yellow field: additional information).
VARDA v01 data sheet for
Sediment composite profiles that were collected from primary literature sources (see Table 2) only require a unique identifier (e.g. MON for Lago Grande di Monticchio) within the VARDA database that links a profile to a corresponding lake (Table 2). Additional information encompasses the geographical coordinates of coring location (fields: latitude, longitude), coring methods (e.g. piston corer), a coring date, water depths at the core location, and the total length of the sediment composite profile with an upper (field: depth start) and lower (field: depth end) depth.
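As an illustration of the split between required and additional information, a profile record could be represented as in the following Python sketch. The class name, attribute names, and the millimetre depth unit are assumptions made for this example and do not reproduce the actual VARDA schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompositeProfile:
    # Required: unique identifier linking the profile to a lake (e.g. "MON").
    profile_id: str
    # Additional information; every field may be missing.
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    coring_method: Optional[str] = None
    depth_start_mm: Optional[float] = None  # upper depth (assumed unit: mm)
    depth_end_mm: Optional[float] = None    # lower depth (assumed unit: mm)

    def __post_init__(self) -> None:
        if not self.profile_id:
            raise ValueError("profile_id is required")
        if (
            self.depth_start_mm is not None
            and self.depth_end_mm is not None
            and self.depth_start_mm > self.depth_end_mm
        ):
            raise ValueError("depth start must not exceed depth end")

    def length_mm(self) -> Optional[float]:
        """Total profile length, or None if a depth bound is missing."""
        if self.depth_start_mm is None or self.depth_end_mm is None:
            return None
        return self.depth_end_mm - self.depth_start_mm
```

A record that carries only the identifier, such as `CompositeProfile("MON")`, is valid in this sketch, and additional fields can be filled in where a publication reports them.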
The data compilation followed the basic strategy of collecting proxy data associated with a published sediment composite profile and information about age–depth models and event layers. A sediment composite profile may consist of either a single core section or several overlapping core sections combined into a composite profile. The depth scale within a sediment composite profile is referred to as composite depth. Since the availability of data and meta-information varied greatly between publications, we classified the available information into required and additional information. The category required encompasses all information that is necessary to (a) associate a proxy value at a given depth in a sediment composite profile with a corresponding age and to (b) uniquely identify a lake, sediment composite profile, and original publication for a given dataset. The category additional encompasses all information that extends the data pool for more comprehensive analyses and therefore improves reproducibility, the ability to filter data by specific properties, and the quantification of methodological uncertainties. We converted all datasets to default units to provide standardized and thus inter-comparable data formats. Tables 1–7 provide an overview of data categories and required and additional information properties including the default units.
VARDA v01 data sheet for chronological meta-information (green field: required information; yellow field: additional information).
VARDA v01 chronology data sheet (green field: required information; yellow field: additional information).
VARDA v01 data sheet for tephra layers (green field: required information; yellow field: additional information).
VARDA v01 data sheet for varve thickness (Green field:
Uncalibrated radiocarbon measurements were collected from the
published literature and adapted to the
Chronologies for varved lake sediments are commonly based on a
combination of different dating methods (Brauer et al., 2014), such as
varve counting, radiometric dating (e.g.
VARDA version 1.0 includes published chronologies that are available in public data repositories. Tables 4 and 5 provide an overview of the required and additional meta-information for storing chronologies in VARDA and the resulting chronological data sheet, respectively. The required information includes a label for the associated sediment composite profile and the corresponding data and publication DOI. Additional information will enable rapid reassessments of original chronologies.
Additional information reports (i) on age uncertainty; (ii)
presence, type, and age of anchor points for floating chronologies
(e.g. sediment surface for continuous varve chronologies,
Ideally, the chronological data sheet associates a given depth of a sediment composite profile with an age estimate and, if available, an uncertainty range expressed as minimum and maximum estimate as additional information (2 sigma as a default). If depth information for a sediment composite profile was not provided, we either reconstructed an auxiliary composite depth by cumulative sums of continuous varve thickness measurements (if available) or excluded the corresponding chronology from the present data compilation because time series without corresponding core depth cannot be updated. The default depth scale unit was set to millimetres to avoid excessive decimal places in depth reporting. The default age scale unit was set to a BP (year before present) with 1950 CE as zero age. The default age unit was restricted to annual precision, and ages are reported as integers (without decimal places).
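The reconstruction of an auxiliary composite depth and the age convention can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that thickness values are given in millimetres, ordered from the top of the sequence downwards, and free of gaps; it is an illustration of the stated conventions rather than the procedure used for VARDA.

```python
from itertools import accumulate
from typing import List


def auxiliary_composite_depth(thickness_mm: List[float]) -> List[float]:
    """Cumulative sum of varve thicknesses.

    Returns the depth (mm) of the lower boundary of each varve,
    measured from the top of the continuous sequence.
    """
    return list(accumulate(thickness_mm))


def ce_to_bp(year_ce: int) -> int:
    """Convert a calendar year CE to an integer age in a BP (1950 CE = 0)."""
    return 1950 - year_ce
```

Under these assumptions, three varves of 1.0, 2.0, and 0.5 mm give lower boundaries at 1.0, 3.0, and 3.5 mm, and the year 950 CE corresponds to 1000 a BP. Years after 1950 CE yield negative ages in this convention.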
Isochronous event layers provide precise tie points for the
synchronization of proxy time series from regionally different
locations and facilitate the construction of multisite
networks. Furthermore, the identification of layers corresponding to
dated events such as volcanic eruptions or geomagnetic excursions
provides additional information for the construction of robust
chronologies. For the first version of VARDA, we collected information
on reported tephra layers in the sediment composite profiles included
in the database. Table 6 provides an overview of required and
additional information of published tephra layers in VARDA. The
required information (composite depth, age, age error, and dating
method) is essential for assigning a tephra layer to a given depth in a
sediment composite profile and storing information on the age of the
layer as it has been reported. Since standards for age reporting of
tephra layers vary greatly between studies
(e.g. uncalibrated vs. calibrated), information on the dating method
and calibration is required for the field “dating
method or calibration”. The required field “dated in profile?”
provides information on whether the age of the tephra layer originates from
the corresponding sediment composite profile itself (field
The technical infrastructure of VARDA is intended to attribute a
down-profile record of palaeoclimatic proxy data to the corresponding
chronology of the sediment composite profile. Therefore, the required
information for proxy data sequences is the composite depth and a
corresponding proxy measurement, while additional information further
describes proxy specific measurement standards. We adapted the
variable controlled vocabulary of the PaST thesaurus for proxy data
(World Data Service for Paleoclimatology,
VARDA property graph model. Coloured circles represent nodes, and grey arrows represent edges between nodes. For an explanation, see the text.
VARDA is intended to offer a flexible generation of multisite networks with complex data relations for storing and organizing the collected information. To store and organize datasets from varved lake archives, we use a graph database. Graph technology in computer science has evolved as part of the NoSQL movement (meaning “not only SQL”; SQL: Structured Query Language) and is based on graph theory, a mathematical concept of expressing objects as interconnected entities, which dates back to the early works of Leonhard Euler in the 18th century (Euler, 1741). In contrast to fixed data schemes required by relational database management systems (RDBMS), a graph explicitly models relations between data by representing entities as nodes (or vertices) described by properties and connected through edges as shown in Fig. 1 (see also the property graph model). To categorize the nature of a particular entity, one or more labels can be added to the node. Edges can be distinguished by their type and may have properties just like nodes. The ability to add new labels, edges, and properties to any entity at any time enables developers to quickly adapt the data model to changing scientific or technical requirements. Neo4j's native query language Cypher is used to read and update the contents in the graph. It allows for an intuitive and flexible generation of queries that are short and readable even for complex patterns (many relationships, circular structures, variable-length paths).
The integration of palaeoenvironmental datasets from varved lakes into a graph database resulted in a flexible data structure, which allows for connected palaeoenvironmental datasets within a single lake as well as between different lakes. Figure 1 illustrates the VARDA property graph model schematically and visualizes connections between nodes. The VARDA data model associates each lake with one or more sediment composite profiles, which are connected to one or more datasets. Datasets, in turn, are connected to a publication, a category (chronology, tephra layer, radiocarbon date, or varve thickness record in version 1.0) and various category-specific attributes (as listed in Tables 1–7) that further describe a dataset. All these connections provide the necessary meta-information to the actual data points, which are included in a given dataset. Data points from the category tephra layer can additionally connect to an event that is described in more than one lake, e.g. the Laacher See tephra. The event node offers the possibility to connect datasets between different lakes, e.g. for synchronization.
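To make the property graph idea concrete, the following Python sketch builds a small in-memory graph and finds all lakes connected to a shared event node. It does not use Neo4j or Cypher, and it is not the VARDA implementation: the class names, node identifiers, labels, and edge types (`HAS_PROFILE`, `HAS_DATASET`, `HAS_POINT`, `RECORDS_EVENT`) are hypothetical, and the lakes “A”, “B”, and “C” are placeholders rather than real records.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Node:
    node_id: str
    labels: Set[str]
    properties: Dict[str, object] = field(default_factory=dict)


class PropertyGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        # Incoming edges: target id -> list of (edge type, source id).
        self.in_edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_node(self, node_id: str, *labels: str, **properties: object) -> None:
        self.nodes[node_id] = Node(node_id, set(labels), dict(properties))

    def add_edge(self, source: str, edge_type: str, target: str) -> None:
        self.in_edges[target].append((edge_type, source))


def lakes_linked_to(graph: PropertyGraph, event_id: str) -> List[str]:
    """Walk edges backwards from an event node and collect all Lake nodes."""
    seen, stack, lakes = {event_id}, [event_id], set()
    while stack:
        current = stack.pop()
        if "Lake" in graph.nodes[current].labels:
            lakes.add(current)
        for _edge_type, source in graph.in_edges[current]:
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return sorted(lakes)


def build_example() -> PropertyGraph:
    """Three placeholder lakes; only A and B record the shared tephra event."""
    g = PropertyGraph()
    g.add_node("event:LST", "Event", name="Laacher See tephra")
    for key in ("A", "B", "C"):
        g.add_node(f"lake:{key}", "Lake")
        g.add_node(f"profile:{key}", "Profile")
        g.add_node(f"dataset:{key}", "Dataset", category="tephra layer")
        g.add_node(f"point:{key}", "DataPoint")
        g.add_edge(f"lake:{key}", "HAS_PROFILE", f"profile:{key}")
        g.add_edge(f"profile:{key}", "HAS_DATASET", f"dataset:{key}")
        g.add_edge(f"dataset:{key}", "HAS_POINT", f"point:{key}")
    g.add_edge("point:A", "RECORDS_EVENT", "event:LST")
    g.add_edge("point:B", "RECORDS_EVENT", "event:LST")
    return g
```

Calling `lakes_linked_to(build_example(), "event:LST")` returns the two placeholder lakes whose tephra data points connect to the event, which mirrors how an event node can tie together records from different lakes. In a graph database, the same traversal would typically be written as a single pattern-matching query.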
VARDA provides fast access to palaeoclimatic data from varved lakes, irrespective of a user's technical background or operating system. Therefore, the user interface (UI) was designed to be intuitive and reactive with self-explanatory forms and components that immediately respond to the user's actions. It is implemented as an online service, which can be accessed permanently using a web browser.
Identified lakes, updated geographic coordinates, and datasets included in VARDA 1.0. Letters indicate data availability in data repositories. Table also includes varved lake sites without publicly available data (without letters and references).
Screenshot of the user interface in version 1.0, available online
at
Overall, the application consists of the web client, a server-side Neo4j graph database, and an Application Programming Interface (API) for communication of the client with the database. All software libraries that are integrated into VARDA have licenses that are free and permissive. The client is built with Vue.js, a JavaScript UI framework that has gained attention in the developer community since its launch in 2014 due to its versatility and runtime performance. Some features of VARDA integrate other well-documented third-party libraries, such as D3.js for data visualization and OpenLayers for rendering maps (e.g. from OSM) alongside vector layers with spatial data. The client state (e.g. user data and entity cache) and any transactions with the database are handled with Apollo GraphQL, a framework for API communication and state management. The client's component-oriented architecture enables fast development of new features with little interference with existing modules. All lines of source code required by the client are checked, minified, and bundled using webpack for use in the browser.
The web application offers a user interface with optional filters to explore and visualize multisite networks on demand (see Fig. 2). A universal search field (1 in Fig. 2) can be used to select filters either by region or proxy category. An interactive diagram (2 in Fig. 2) can be used to select a temporal filter by scrolling with the mouse or resizing the light-blue-coloured frame (3 in Fig. 2) underneath the main figure.
We add the iconic NGRIP oxygen-isotope (
We identified 186 lakes from the published literature, which are
described as exhibiting continuous or floating varve sequences in their
sediments. We additionally included unvarved sediments from Lake
Prespa (Europe), Lake Ohrid (Europe), Laguna Potrok Aike (South
America), and Bear Lake (North America) in the compilation due to their
long continuous chronologies and good age control from independent
dating techniques or the frequent occurrence of tephra layers. In
total, 261 datasets for 95 of the identified lakes are available
(September 2019) in public data repositories and were included in
VARDA version 1.0. The datasets comprise 70 individual chronologies
from 43 lakes, 146 tephra layers from 36 lakes, 118 uncalibrated
Figure 3 presents the spatial coverage of lakes and associated
datasets included in VARDA 1.0. The identified lakes are located on
all continents except Antarctica, with
Spatial distribution of identified lakes and collected datasets included in VARDA 1.0. Data availability is indicated by blue-coloured dots.
Temporal distribution of datasets in VARDA 1.0.
Figure 4 presents the temporal distribution of datasets included in
VARDA 1.0. The combined chronologies span the entire last glacial
cycle with a minimum age range of 87 years (from
All datasets are available online at
VARDA offers a user-friendly and time-efficient way to explore the multitude of palaeoenvironmental data from varved lake archives. Due to the integration of precise chronologies and isochrones from tephra event layers into a modern graph database, VARDA offers an easy way to construct regional to global networks of palaeoenvironmental information. These multisite networks can be used, e.g. to explore and analyse leads and lags of regional climate change, large-scale patterns in environmental variability, or differentiated proxy responses within and between archives. The first version of VARDA presented here includes all technological requirements and tools for future upgrades and developments. Presently, we are working on the integration of (1) an advanced visualization tool, (2) a user-friendly import application, and (3) additional proxy data such as stable isotopes and geochemical data, as priority goals for the next update. Additionally, the source code of the database application will be made available to the public in a separate contribution. In general, VARDA is intended to be a community-based effort, and we welcome and encourage the participation of varve specialists and the broader palaeoenvironmental community for the further development and application of this tool.
AR coordinated the manuscript writing and wrote most parts, except Sect. 3,
which was written by AlB and MD. All authors contributed to manuscript
writing. AlB, AR, and AcB carried out the data compilation and designed the
standardization scheme with contributions from IN, MJB, JM, and NN for
tephrochronological data; RT, JM, FO, BP, and CB for
The authors declare that they have no conflict of interest.
This article is part of the special issue “Paleoclimate data synthesis and analysis of associated uncertainty (BG/CP/ESSD inter-journal SI)”. It is not associated with a conference.
This work was supported by the German Federal Ministry of Education and Research
(BMBF) as a Research for Sustainability initiative (FONA;
This research has been supported by the Bundesministerium für Bildung und Forschung (grant no. 01LP1510A).
This paper was edited by David Carlson and reviewed by Pierre Francus and one anonymous referee.