The Eurasian Modern Pollen Database (EMPD), version 2

The Eurasian (née European) Modern Pollen Database (EMPD) was established in 2013 to provide a public database of high-quality modern pollen surface samples to help support studies of past climate, land cover, and land use using fossil pollen. The EMPD is part of, and complementary to, the European Pollen Database (EPD), which contains data on fossil pollen found in Late Quaternary sedimentary archives throughout the Eurasian region. The EPD is in turn part of the rapidly growing Neotoma database, which is now the primary home for global palaeoecological data. This paper describes version 2 of the EMPD, in which the number of samples held in the database has been increased by 60 % from 4826 to 8134. Much of the improvement in data coverage has come from northern Asia, and the database has consequently been renamed the Eurasian Modern Pollen Database to reflect this geographical enlargement. The EMPD can be viewed online using a dedicated map-based viewer at https://empd2.github.io and downloaded in a variety of file formats at https://doi.pangaea.de/10.1594/PANGAEA.909130 (Chevalier et al., 2019).


Introduction
Modern pollen samples provide an essential source of information for interpreting and understanding the fossil pollen record, which in turn provides one of the most important spatially resolved sources of information on Quaternary vegetation and climate. We use the term "fossil pollen" here as it is commonly used in the Quaternary sciences. The fossils in this sense can more accurately be described as sub-fossils since they have usually only undergone limited (if any) post-deposition mineralisation, while pollen is taken to include many spores as well as the pollen from flowering plants. Fossil pollen can be found preserved in sediments in lakes and bogs and other anaerobic environments throughout the Eurasian region extending back throughout the Quaternary. Modern pollen is simply the component of that fossil record found in the last 100-150 years, most often in the surface layers of lake and bog sediments, but also including comparable collectors of pollen such as moss polsters. Davis et al. (2013) include a comprehensive introduction to the different scientific uses of modern pollen samples. Modern pollen samples have been used to interpret many different environmental processes, such as past changes in land cover, land use, and human impact; the impact on vegetation of past edaphic and hydroseral changes; and the effects of past changes in fire, pests, and disease on vegetation. Modern samples have also been used to understand taphonomic problems with regard to pollen transport, deposition, and preservation. One of the early motivations for establishing large modern pollen datasets, and one that still remains important, is their use as calibration "training sets" for the quantitative reconstruction of past climate. This approach has also more recently been adapted to quantitative reconstructions of land cover, where a similar modelling approach to climate reconstruction is applied to determine, for instance, forest cover.
Similarly, modern samples have also been used to establish and model the relationship between vegetation and pollen assemblages based on the different pollen productivity of different taxa and thereby provide quantitative estimates of past vegetation composition in a landscape from records of fossil pollen.
Historically, modern pollen data were often gathered directly for a particular research project, but the data were rarely shared and, if published, often appeared only in grey literature such as a thesis, report, or monograph. Efforts to develop larger datasets at continental scales were pioneered in the 1990s, primarily by research groups looking to use these datasets as calibration datasets for quantitative climate reconstruction. Development, however, was haphazard, and the datasets had a reputation for being poorly documented and quality controlled, often containing duplicates, digitised data (not original raw counts), uncertain taxonomic standardisation, poor geolocation information, and loose definitions of "modern" that could embrace as much as the last 500 years. It became increasingly clear that a quality-controlled and standardised database of modern pollen samples was required, comparable to the European Pollen Database (EPD) for fossil pollen samples and reflecting the same open-access and community-based principles.
The Eurasian (née European) Modern Pollen Database (EMPD) was therefore established in 2013 as a complement to the European Pollen Database (EPD) for fossil pollen. The first version of the EMPD (referenced herein as the EMPD1) contained almost 5000 samples, submitted by over 40 individuals and research groups from all over Europe. Over the last 6 years more data have continued to be submitted, and additional efforts have been made to incorporate more data held in open data repositories such as PANGAEA and made available as a supplement in published studies. This paper documents the first update to the EMPD (referenced herein as EMPD2), in which the number of samples stored in the database has increased by around 60 %.
The EMPD remains the only open-access database of modern pollen samples covering the Eurasian continent. Smaller compilations of modern pollen samples exist for some regions, but these generally have limitations in terms of some or all of the following: (1) the extent of metadata provided, (2) the completeness of the taxa assemblage, (3) the standardisation of taxa nomenclature and hierarchy with respect to the EPD, (4) the inclusion of digitised rather than original raw count data, (5) the inclusion of percentages rather than raw counts, (6) information about the original source of the data and the analyst, and in some cases, (7) limitations to public access. Importantly, all of these aspects limit their compatibility with the EPD, where compatibility with the EPD is one of the primary objectives of the EMPD. The EMPD contains only the original raw count data (no percentage data) for the complete pollen assemblage. The EMPD also contains comprehensive and standardised metadata about the pollen sample location, the landscape and vegetation environment from which it was collected, the way it was collected, the year that it was collected, and who collected and analysed the sample and where it was published.
The EMPD has no formal spatial domain, but in general it covers the same geographic region as the EPD. This has traditionally been the Palearctic vegetation region of Eurasia excluding China, which has established its own semi-private regional database. As well as the terrestrial Eurasian landmass and associated islands, it also includes marine samples from coastal margins and enclosed seas. Increasingly however these geographical administrative boundaries have become blurred as regional pollen databases become integrated into the global Neotoma Palaeoecology Database (Williams et al., 2018), hereafter referred to as "Neotoma". While regional databases such as the EPD will outwardly retain their identity within Neotoma, internally the data will be completely integrated at a global level. It is also planned that the EMPD will become integrated into Neotoma in the near future, and with this in mind, the EMPD2 also includes data from outside of the traditional EPD region on the basis that it represented the most expeditious route to making these data publicly available within Neotoma. Consequently, this second version of the EMPD includes not only data from Europe and northern Asia, but also data from Greenland, India, China, and North Africa.

Methods
The structure and metadata of the database have already been described in detail by Davis et al. (2013). The list of metadata fields is shown in Table 1. We also include climate and vegetation data for each sample location. The climate data include mean monthly, seasonal, and annual temperature and precipitation climatology from WorldClim2 (Fick and Hijmans, 2017). The climate was assigned according to the nearest grid point within the 30 arcsec (approximately 1 km²) resolution of the WorldClim2 grid. The vegetation data include realm, biome, and ecoregion, taken from Olson et al. (2001). Note that all samples have been assigned a biome, including marine samples. The biome assigned to marine samples was based on the nearest point of land to the sample. No climate has been assigned to marine samples. The protocol for the database follows that of the European Pollen Database, with some additions. The EMPD only includes samples younger than 200 BP, and with a sampling resolution comparable with the fossil pollen in the EPD. For instance, the EMPD does not include pollen trap data gathered at monthly or annual resolution, but it does accept trap data averaged over a period of at least 10 years, which is more comparable with the time typically represented in a fossil pollen sample taken from a sediment core.
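The nearest-grid-point assignment of climate described above can be sketched as follows. This is a minimal illustration only, assuming a regular global latitude/longitude grid with 30 arcsec cells in which row 0 starts at 90° N and column 0 at 180° W; the function name `grid_index` and this layout are assumptions for illustration and are not taken from the EMPD code. On such a grid, the cell that contains a point is also the cell whose centre is nearest to it.

```python
CELL_DEG = 30.0 / 3600.0  # 30 arcsec expressed in degrees


def grid_index(lat, lon, cell=CELL_DEG):
    """Return (row, col) of the grid cell containing (lat, lon).

    Assumes row 0 starts at 90 N and column 0 at 180 W (illustrative layout).
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    n_rows = round(180.0 / cell)
    n_cols = round(360.0 / cell)
    # Clamp so that points on the southern or eastern edge fall in the last cell.
    row = min(int((90.0 - lat) / cell), n_rows - 1)
    col = min(int((lon + 180.0) / cell), n_cols - 1)
    return row, col
```

The returned indices could then be used to read the temperature and precipitation values from the corresponding climatology arrays.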
Like the EPD, the EMPD only includes raw count data representing the full pollen assemblage, and it does not contain percentage data or truncated or summary assemblages. Percentages are excluded because their calculation can vary from author to author, and therefore unlike raw count data it is not always possible to directly compare different samples from different sources with percentage data. This is an important data quality criterion, but it has led to the exclusion of some large regional modern pollen datasets that have been recently published. This is discussed in the next section.
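To illustrate why percentages from different sources are not directly comparable, the following minimal sketch computes percentages from the same raw counts using two different pollen sums. The counts and the choice of excluded taxon are hypothetical and serve only to show the effect.

```python
def percentages(counts, exclude=()):
    """Percentages relative to a pollen sum that omits the `exclude` taxa."""
    total = sum(n for taxon, n in counts.items() if taxon not in exclude)
    if total == 0:
        raise ValueError("empty pollen sum")
    return {t: 100.0 * n / total for t, n in counts.items() if t not in exclude}


# Hypothetical raw counts for a single sample.
counts = {"Pinus": 120, "Betula": 60, "Poaceae": 20, "Cyperaceae": 50}

all_taxa = percentages(counts)                             # sum = 250
no_sedges = percentages(counts, exclude={"Cyperaceae"})    # sum = 200
```

Here Pinus is 48 % of the full assemblage but 60 % when Cyperaceae is left out of the sum, even though the underlying count is identical. With raw counts, any user can recompute percentages on a common basis.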
Modern pollen samples have been gathered from a variety of depositional environments, and the type of environment is recorded for 75 % of the samples in the database. The most common environments are moss polsters (31 %), soil (21 %), and lake sediments (19 %).

Data sources
The pollen data for the latest update of the EMPD have come from a diverse range of sources, but mainly submissions from individual researchers and research groups. Most of this has been the result of published research (Table 2), but we also include unpublished data. Additional pollen data have come from open-access sources such as the PANGAEA data archive and data supplements to publications, as well as new fossil pollen data submitted to the EPD and Neotoma since EMPD1 where the sample age of a sediment core top fulfils the requirements of a modern pollen sample.
Some large independent surface sample datasets covering the Eurasian region have been published and made available since EMPD1, most notably Binney et al. (2017), Marinova et al. (2018), and Herzschuh et al. (2019). Both Binney et al. (2017) and Marinova et al. (2018) already include a large amount of data from the EPD and EMPD1, but also data that have not been publicly released before. This includes "heritage" data from earlier studies such as the Biome6000 project (Prentice and Webb, 1998) and PAIN project (Bigelow et al., 2003). These heritage data are mostly composed of percentages, at least some (unknown part) of which have been digitised, and whose origins, selection criteria, and context are rarely documented. Another problem with these heritage data apart from the limited metadata is the loose definition of a "modern sample" in these early projects, being defined in both PAIN and Biome6000 as anything younger than 500 BP. Unfortunately, the age criteria for selecting individual samples were not recorded when the datasets were compiled.
These problems also extend to the recent release of data by Herzschuh et al. (2019) from China and Mongolia. These data represent most of the modern pollen data held in the Chinese Pollen Database (CPD) (Ni et al., 2010; Zheng et al., 2014). The Herzschuh et al. (2019) dataset includes 2559 modern pollen samples and is of major importance as the first significant amount of publicly available data from this region. However, the data are only provided as percentages based on a summary of the taxa from each sample and also include digitised data. We were therefore unable to include them in the EMPD2. The Herzschuh et al. (2019) data are available from PANGAEA, along with the Tarasov et al. (2011) dataset of 798 samples mainly from Japan and eastern Russia, which are also provided as percentages for a limited selection of taxa. We hope that the raw count data for the full assemblage will be made available in the near future.
Other regional pollen databases that overlap with the EMPD include the Indian Pollen Database (IPD) and the African Pollen Database (APD). The IPD is still under development and is not publicly accessible, but it includes both fossil and modern pollen samples from the Indian subcontinent (Krishnamurthy and Gaillard, 2011). The EMPD also includes samples from North Africa, which overlaps with the APD (Vincens et al., 2007). Fossil pollen data from the APD are available as individual files and as a partially complete Paradox database from the APD website (Table 3), but the status of the modern pollen data held within the APD (Gajewski et al., 2002) remains somewhat unclear, since these data have not been made publicly available. At present the APD is being integrated into Neotoma, and it is hoped that once this is completed the modern pollen data from Africa will become more freely available.

Data processing
As with the EMPD1, the data submitted to the EMPD2 have come in a wide variety of data formats and with varying levels of metadata. All of these files had to be processed and a variety of quality control checks made before entry into the database (see also Davis et al., 2013). Figure 1 shows the steps taken in processing and quality-controlling the data. On receipt from the contributor, the data were entered into one of two standardised file formats according to whether they were pollen data or the associated metadata. Each of the two different types of data was then subject to a series of quality control checks to make sure they did not contain errors and that they conformed to data protocols. For instance, values in numerical fields in the metadata (shown in Table 1) had to fall within realistic boundaries expected for that field, such as for latitude, longitude, and altitude. Also, it had to be checked that controlled fields based on selection from a list of acceptable classes did not contain assignment errors, such as country name. Any missing entries were referred back to the contributor for completion, or else were completed from the original publication or other information source where available.
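The kind of range and controlled-field checks described above can be sketched as follows. This is an illustrative example only: the field names, the altitude bounds, and the truncated country list are assumptions made for the sketch and do not reproduce the actual EMPD metadata schema or checking scripts.

```python
# Plausible bounds for numerical fields (altitude bounds in metres are illustrative).
RANGES = {
    "latitude": (-90.0, 90.0),
    "longitude": (-180.0, 180.0),
    "altitude": (-500.0, 9000.0),
}
# Truncated, illustrative controlled list for the country field.
COUNTRIES = {"Italy", "Norway", "Russia", "Spain"}


def check_metadata(record):
    """Return a list of human-readable problems found in one metadata record."""
    problems = []
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not low <= value <= high:
            problems.append(f"{field}: {value} outside [{low}, {high}]")
    country = record.get("country")
    if country not in COUNTRIES:
        problems.append(f"country: {country!r} not in controlled list")
    return problems


good = {"latitude": 61.5, "longitude": 105.3, "altitude": 320.0, "country": "Russia"}
bad = {"latitude": 161.5, "longitude": 105.3, "country": "Rusia"}
```

In this sketch `check_metadata(good)` returns an empty list, while `check_metadata(bad)` reports the impossible latitude, the missing altitude, and the misspelled country, which are the kinds of entries that would be referred back to the contributor.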
One of the most time-consuming tasks with the pollen data was to ensure standardisation of the original taxon names submitted by the contributor. These all had to be checked for language, typographical errors, and other issues and then assigned an internationally accepted taxon name according to the EPD common taxa "p_vars" table. If the name did not exist in the EPD taxa table it was checked (using http://www.theplantlist.org/, last access: 20 January 2020) that it was spelled correctly and was not a synonym. It was then checked against the Neotoma pollen taxa table and assigned the Neotoma-accepted taxon name if there was a match. If it was not in the Neotoma taxa table, and it was established to be a genuine taxon name, then it was added to the EMPD taxa table as a new taxon. Note that although the EMPD is designed to be as compatible with the EPD as possible, the EMPD and EPD do not have a common taxa list, and the EMPD has many more taxa than appear in the EPD.
The accepted names for the fossil data in the EPD or Neotoma should be directly compatible with the accepted names in the EMPD, but some caution needs to be applied in integrating the two datasets since the EMPD contains additional accepted names that do not occur in the EPD or Neotoma. Where possible the EMPD assignment of accepted names respects the taxonomic resolution of the EPD- and Neotoma-accepted names. This means that where a new original taxon name is submitted to the EMPD that does not already occur in the existing databases, it is assigned the EPD- or Neotoma-accepted name according to the existing taxonomic hierarchy. For example, if the new submitted original taxon name is a new species that does not occur in the EPD or Neotoma, and there is an existing accepted name at genus level, then the new species name is assigned the accepted name at the genus level. The assignment of accepted names is complicated because it requires an appreciation of differences in pollen morphology and of the reliability of identification, which can vary given the differences in skill and experience of the different analysts who contribute to the database. In addition, there are also important geographical considerations to take into account. For instance, the EMPD conforms to the EPD-accepted names but these are heavily European orientated, while the EMPD has much more data from regions such as eastern Asia where some of the accepted names are not strictly appropriate. However, in all cases we have retained in the EMPD all of the original taxa names as they were submitted by the original contributor after cleaning for typographical errors.
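The order of look-ups described in the two preceding paragraphs can be sketched as a simple cascade. This is a simplified illustration, not the procedure actually used: the lookup tables are hypothetical stand-ins for the EPD "p_vars" and Neotoma taxa tables, the spelling and synonym check is omitted, placing the genus-level fallback before the creation of a new taxon is an assumption, and in practice the assignment also required expert judgement on pollen morphology.

```python
def accepted_name(original, epd, neotoma, empd_new):
    """Return (accepted_name, source) for a submitted taxon name.

    `epd` and `neotoma` map original names to accepted names (hypothetical
    tables); `empd_new` is a set collecting names added as new EMPD taxa.
    """
    parts = original.split()
    if not parts:
        raise ValueError("empty taxon name")
    name = " ".join(parts)  # collapse stray whitespace
    if name in epd:
        return epd[name], "EPD"
    if name in neotoma:
        return neotoma[name], "Neotoma"
    # Fall back to an existing genus-level accepted name for unknown species.
    if len(parts) > 1:
        genus = parts[0]
        for table, source in ((epd, "EPD"), (neotoma, "Neotoma")):
            if genus in table:
                return table[genus], source + " (genus level)"
    empd_new.add(name)
    return name, "EMPD (new taxon)"


# Hypothetical miniature tables for illustration.
epd = {"Pinus": "Pinus", "Betula": "Betula"}
neotoma = {"Larix sibirica": "Larix"}
new_taxa = set()

r_exact = accepted_name("Pinus", epd, neotoma, new_taxa)
r_genus = accepted_name("Betula  nana", epd, neotoma, new_taxa)
r_neotoma = accepted_name("Larix sibirica", epd, neotoma, new_taxa)
r_new = accepted_name("Nitraria", epd, neotoma, new_taxa)
```

In this sketch the original submitted name is never overwritten; only the accepted name is derived, which mirrors the retention of original taxa names in the EMPD.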
In the process of updating the EMPD we have harmonised as much as possible the taxa names in the EMPD with those found in the current EPD, including those names previously in the EMPD1 that have since been included in the EPD. When both the EPD and EMPD are included in the Neotoma database, then all of the taxa will exist in a single standardised taxa table consisting of all of the taxa in all of the databases.
Once the pollen data and metadata entry tables had been manually completed and checked, these were then uploaded into a Postgres database where a second series of automated quality control procedures were undertaken. These automated checks repeated many of the earlier manual checks, including ensuring that all open and closed fields were correctly completed and that the taxa names conformed to the database standardised taxa names (the "p_vars" table). In addition, it was also necessary to manually standardise worker names, address details, and data references across different datasets submitted to the database.
After the data had passed these database checks, each contributor was then asked to look again at their data as they were now stored in the database. Contributors were able to do this using the online data viewer, which provided an intuitive interface to the database that could be navigated without any prior experience of database systems. Locations for each site/sample could be checked using the viewer map interface, pollen data could be checked using a graphical (histogram) display, and metadata could be checked using a table view of all of the metadata fields. Any issues highlighted by the contributors were then corrected in the database. It was only after completing these final contributor checks that the EMPD2 database was deemed suitable for public release.
As well as adding new data, we also undertook a short review of the data in the original EMPD1. A cross-check between the country attributed to a site and the actual country where the site was located revealed that around 20 sites had either the wrong location or wrong country code. The geolocation data for around 250 samples in Morocco in EMPD1 have now been removed and placed in the information field. These were all highlighted in EMPD1 as having intractable geolocation errors, and it was felt that removing the corrupt information from the geolocation field would discourage their accidental use. In compensation the EMPD2 now includes new high-quality data from Morocco (see next section).

Spatial sampling
The amount of data in the database has increased by 60 %, and the EMPD2 now holds 8134 samples compared to 4826 samples in the EMPD1. The country that has experienced the largest increase in samples is Russia, which has gained 2274 more samples on top of the 379 samples already in the EMPD1 (Fig. 1). Other significant improvements in data coverage have been made in Italy, Norway, and Spain, while data are available for the first time from other countries such as Japan, Cyprus, and Kyrgyzstan. The increase in data from Russia reflects a general improvement in data coverage in EMPD2 from eastern Europe across to Asia (Fig. 2), prompting a renaming of the database from the "European" to the "Eurasian" Modern Pollen Database.
Countries where there are still relatively few or no samples despite being both relatively populous and having an active palynological community include Belgium, the Netherlands, Hungary, Czech Republic, and Slovakia. There are also virtually no samples from the Balkans. Despite the generally excellent coverage over Scandinavia, north-central Sweden remains poorly sampled, a feature that is also reflected in the lack of fossil pollen data from this area in the EPD. Further east, the distribution of samples tends to be best in the more populous regions and those with better transport infrastructure. Notable areas across northern Eurasia where we still lack samples include the steppes of Ukraine and Kazakhstan and the Central Siberian Plateau. Further south, most of China and Mongolia are well covered by the Chinese Pollen Database (now partly released by Herzschuh et al., 2019), and as mentioned earlier, there are efforts in India to improve data coverage in this region. A more difficult problem is the lack of samples from many of the Central Asian countries including Turkmenistan, Uzbekistan, Tajikistan, Afghanistan, and to some extent Pakistan, where access for scientists is currently difficult or hazardous, and where there are few locally trained scientists. The lack of modern pollen data from these regions is also reflected in a lack of fossil pollen studies from these countries.

Altitudinal sampling
The representativeness of the sample coverage in the vertical spatial domain is not easily discernible from a standard two-dimensional map such as that presented in Fig. 3. Vertical climate and vegetation gradients are much steeper than horizontal gradients, and hilly and mountainous terrain typically holds a greater variety of vegetation and climate types than can be shown on a continental-scale map. We make a better attempt to show this by plotting the distribution of samples by altitude on a hypsometric (or cumulative frequency) curve for the Palearctic study region (Fig. 4). This shows that the number of samples generally follows the proportion of land area represented at each elevation, with more samples at lower altitude, although samples are still present at higher altitudes. Data coverage has improved in particular in the 500-2500 m range between EMPD1 and EMPD2. The upper part of the altitudinal range above 3500 m is dominated by the Himalayas and the Tibetan Plateau, which is covered by the Chinese Pollen Database (Herzschuh et al., 2019).
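A cumulative frequency curve of the kind described above can be computed with a few lines of code. The following is a minimal sketch using hypothetical sample altitudes; the function name and thresholds are illustrative assumptions. Applying the same function to the elevations of land grid cells would give a comparable curve for the land surface, although a careful comparison would also need to weight cells by their area, which is not done here.

```python
from bisect import bisect_right


def cumulative_fraction(values, thresholds):
    """Fraction of `values` less than or equal to each threshold."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("no values supplied")
    return [bisect_right(ordered, t) / n for t in thresholds]


# Hypothetical sample altitudes in metres.
sample_altitudes = [5, 120, 450, 800, 1500, 2600, 3900, 40]
thresholds = [500, 2500, 3500]

curve = cumulative_fraction(sample_altitudes, thresholds)
```

For these made-up values, half of the samples lie at or below 500 m, three-quarters at or below 2500 m, and seven-eighths at or below 3500 m.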

Climate and vegetation sampling
The distribution of the EMPD2 samples across the vegetation biomes of the region (from Olson et al., 2001) is shown in Fig. 5a. Biomes that are well sampled within the Palearctic region include most of those that occur in Europe, namely Mediterranean scrub and temperate forests and the western range of the boreal forest/taiga and tundra. Less well sampled are the temperate shrub and grasslands and deserts of the Central Asian steppe, and the eastern range of the boreal forest/taiga and tundra. Again, the Chinese Pollen Database (Herzschuh et al., 2019) covers much of the montane biomes of the Himalayas and Tian Shan, the grasslands and deserts of the Gobi area and Mongolia, and temperate and tropical forest biomes of East Asia.
While a conventional map such as Fig. 5a can show how samples are distributed across different biomes in geographical space, it does not show how well those samples are distributed in climate space. Large areas of Earth may have the same or similar climate, and the distribution of samples in conventional space does not necessarily equate to how well climate space has been sampled. Climate space is important because pollen-based climate reconstructions depend on the use of modern pollen calibration datasets that fully sample the available climate space associated with any particular vegetation type. Figure 5b shows the same information as Fig. 5a, but this time in climate space. This indicates that the EMPD2 samples appear better distributed in climate space than geographical space, but that there are fewer samples to represent the more extreme climates found at the edges of the modern climate space (such as tundra, deserts, and xeric scrublands). This is shown more clearly in Fig. 6b, where the Euclidean distance is calculated between the climate of each of the pollen samples in EMPD2 and all of the available climate space of the Palearctic region. This was done using mean annual temperature and precipitation from the WorldClim2 modern climatology (Fick and Hijmans, 2017), normalised to make the different scales comparable. The climate of each pollen site was assigned according to the nearest grid point within the 30 arcsec (approximately 1 km²) resolution of the WorldClim2 grid, whilst the climate of the region was taken from the grid itself. The darker regions around the edges of the climate space show where in climate space the EMPD2 still lacks representative samples. These poorly represented climates are then shown in physical space in Fig. 6a.
This indicates poor representation in the North African and Persian deserts, which are outside the Palearctic study region, but also areas within the Palearctic region including the Central Asian steppe and more mountainous areas of the Central Siberian Plateau and Siberia east of Yakutsk (130° E).
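The climate-space distance analysis described above can be sketched as follows. The normalisation method and the exact form of the distance statistic are not specified in the text, so this sketch makes two assumptions: each variable is scaled to the range 0-1 using the minimum and maximum over the region, and each regional climate is scored by its Euclidean distance to the nearest sample. The function names and input values are hypothetical.

```python
import math


def _bounds(values):
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("variable has no range; cannot normalise")
    return low, high


def nearest_sample_distance(region, samples):
    """Distance from each regional climate to the nearest sample climate.

    `region` and `samples` are lists of (temperature, precipitation) pairs.
    Both variables are min-max scaled over the region (an assumption).
    """
    if not samples:
        raise ValueError("no samples supplied")
    t_lo, t_hi = _bounds([t for t, _ in region])
    p_lo, p_hi = _bounds([p for _, p in region])

    def scale(point):
        t, p = point
        return ((t - t_lo) / (t_hi - t_lo), (p - p_lo) / (p_hi - p_lo))

    scaled_samples = [scale(s) for s in samples]
    return [min(math.dist(scale(c), s) for s in scaled_samples) for c in region]


# Hypothetical climates: (mean annual temperature in deg C, precipitation in mm).
region = [(0.0, 0.0), (10.0, 1000.0), (20.0, 2000.0)]
samples = [(10.0, 1000.0)]

distances = nearest_sample_distance(region, samples)
```

Large values identify climates that are far from any sample in normalised climate space, which corresponds to the darker shading described for Fig. 6.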

Discussion
The increase in size of the EMPD in version EMPD2 has greatly improved the coverage of modern pollen samples across Eurasia in relation to geographical, vegetation, and climate space. This will make it possible to create more accurate reconstructions of past land cover and climate given the commensurate improvements in available climate and vegetation analogues of fossil pollen samples. The database continues to increase in size through a mixture of newly submitted samples from old studies that predate EMPD1 and more recent studies that have occurred since EMPD1 was first made available. It is still likely that older data will continue to be submitted to the database, especially as it becomes better known, but it is unlikely that the database will continue to grow at the present rate given that much of the available older data are now expected to have been submitted. However, surface sample work has traditionally been less likely to be published in international journals, often confined to Masters or PhD theses or other grey literature, and the amount of data in existence may therefore be difficult to estimate.
To help promote access and use of the EMPD, we have created an online data viewer https://empd2.github.io (last access: 20 January 2020) (Fig. 7) (Sommer et al., 2020). This allows the database to be viewed using an intuitive clickable map that displays the location of each sample, associated metadata, and a plot of the pollen data themselves. It is also possible to download the data associated with a sample and to make suggested corrections. Other options allow the user to select subsets of the database to be viewed, for instance associated with particular individuals, projects, or research groups. The EMPD viewer allows access to the database in an intuitive way without requiring any particular computer expertise. This has been very important in not only allowing the casual user to view and access the data in the database, but also in allowing the data submitters to view their data as they exist in the database after they have been processed, providing a further quality control check. The data viewer is open source and can be adapted for other uses.
The EMPD data viewer is embedded in a web framework that is based on the version control system GitHub, where users and data contributors can transparently submit new data or raise issues with the existing data. These can then be reviewed in an open discussion with the database managers. This framework allows ongoing development of the EMPD in the future, and the usage of a free version control system additionally ensures full transparency, stability, and maintainability of access to the data, independent of funding and changing collaborations.
As well as simply adding more samples as they are submitted, we hope that the future development of the EMPD will also be more targeted. It is clear that although sample coverage is much improved in EMPD2, gaps still exist in the data coverage for Eurasia that would be useful to fill. One way to do this is to encourage fieldwork to collect samples from these data-poor regions. This approach however is expensive, since the reason why many of these areas remain unsampled is precisely because of their remoteness and the difficulty and expense involved in accessing them. An alternative that has not been widely exploited is to analyse soil and sediment samples gathered as a result of fieldwork expeditions organised with a different objective in mind. We hope that by demonstrating the important sampling gaps in the database it will encourage individuals and research groups to consider fieldwork and data analysis in these underrepresented regions.

Ethical statement and how to acknowledge the database
Users of the database are expected to follow the guidelines of the EPD. These state that normal ethics apply to co-authorship of scientific publications. Palaeoecological datasets are labour intensive and complex, they take many years to generate, and they may have additional attributes and metadata not captured in the EMPD/EPD. Users of data stored in the EMPD/EPD should consider inviting the original data contributor to be a co-author of any resultant publications if that contributor's data are a major portion of the dataset analysed, or if a data contributor makes a significant contribution to the analysis of the data or to the interpretation of results.
For large-scale studies using many EMPD/EPD records, contacting all contributors or making them co-authors will not be practical, possible, or reasonable. Under no circumstance should authorship be attributed to data contributors, individually or collectively, without their explicit consent. In all cases, any use of EMPD data should include the following or similar text in the acknowledgements: "Pollen data were extracted from the Eurasian Modern Pollen Database (part of the European Pollen Database), and the work of the data contributors and the EMPD/EPD community is gratefully acknowledged." Upon publication, please send to the EMPD/EPD a copy of the published work or a link to the electronic resource. Your assistance helps document the usage of the database, which is critical to ensure continued support from funders and contributors.

Conclusions
The EMPD remains the only public, quality-controlled, and standardised database of modern pollen samples for the Eurasian region. This paper describes a recent update to the EMPD in which the database has increased almost 60 % in size, so that it now contains data on 8134 modern pollen samples. This reflects an expansion in spatial coverage across northern and eastern Asia, which has prompted a change in the name of the database from the European to the Eurasian Modern Pollen Database. The improvement in spatial coverage has increased the number of vegetation and climate analogues for fossil pollen samples in the region that will directly improve reconstructions of past vegetation and climate. However, areas of poor data coverage still exist, particularly in the more remote regions of central and northern Asia and the Middle East. Development of a new map-based online data viewer for the database is already helping improve access to, and participation in, the EMPD, as well as quality control. We expect the EMPD to continue to grow in the future, although probably at a slower rate given that most of the previously published "heritage" data have now been incorporated. At present the EMPD remains associated with, but physically independent of, the EPD. It is also subject to only periodic updates. In future we expect both the EPD and EMPD to become fully incorporated into the global Neotoma Palaeoecology Database, which will provide seamless integration of the fossil and modern data, whilst also allowing continual updates using Neotoma data management tools.
Figure 6. (a) The Euclidean distance between the climate of each modern pollen sample location (as shown in Fig. 3) and the climate of the entire Palearctic region. (b) The same as (a) but shown in climate space. Note that for clarity the values < 0.05 are shown by dark grey in (a), but white in (b). The darker the brown shading, the less well that climate is represented amongst the samples. The climate of each pollen site was assigned according to the nearest grid point within the 30 arcsec (approximately 1 km²) resolution of the WorldClim2 grid, whilst the climate of the region was taken from the grid itself.
Author contributions. BASD wrote the manuscript with input from all of the authors. BASD, MC, and PS designed and implemented the database and data viewer. BASD, MC, PS, MZ, WF, LNP, AM, and VC all helped with data processing. All of the remaining authors contributed pollen sample data and were involved in the original collection, preparation, identification, and counting of these data.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. The EMPD includes data obtained from the Neotoma Palaeoecology Database and the European Pollen Database. The work of the data contributors and the scientific community supporting these databases is gratefully acknowledged.
Financial support. This research has been supported by the Swiss National Science Foundation (grant no. 200021_169598), with additional support from the University of Lausanne.
Review statement. This paper was edited by Thomas Blunier and reviewed by two anonymous referees.