European pollen-based REVEALS land-cover reconstructions for the Holocene: methodology, mapping and potentials

Quantitative reconstructions of past land cover are necessary to determine the processes involved in climate-human-land-cover interactions. We present the first temporally continuous and most spatially extensive pollen-based land-cover reconstruction for Europe over the Holocene (last 11,700 cal yr BP). We describe how vegetation cover has been quantified from pollen records at a 1° × 1° spatial scale using the ‘Regional Estimates of VEgetation Abundance from Large Sites’ (REVEALS) model. REVEALS calculates estimates of past regional vegetation cover in proportions or percentages. REVEALS has been applied to 1128 pollen records across Europe and part of the Eastern Mediterranean-Black Sea-Caspian-Corridor (30°-75°N, 25°W-50°E) to reconstruct the percentage cover of 31 plant taxa assigned to 12 plant functional types (PFTs) and 3 land-cover types (LCTs). A new synthesis of relative pollen productivities (RPPs) for European plant taxa was performed for this reconstruction. It includes multiple RPP values (≥ 2 values) for 39 taxa, and single values for 15 taxa (total of 54 taxa). To illustrate this, we present distribution maps for five taxa (Calluna vulgaris, Cerealia-t., Picea abies, deciduous Quercus t. and evergreen Quercus t.) and three land-cover types (open land-OL, evergreen trees-ET and summer-green trees-ST) for eight selected time windows. The reliability of the REVEALS reconstructions and issues related to the interpretation of the results in terms of landscape openness and human-induced vegetation change are discussed. This is followed by a review of the current use of this reconstruction and its future potential utility and development. REVEALS data quality are primarily determined by pollen count data (pollen count/sample, pollen identification and chronology) and site type/number (lake or bog, large or small, 1 site vs multiple sites) used for REVEALS analysis (for each grid cell). A large number of sites with high quality pollen count data will produce more reliable land-cover estimates with lower standard errors compared to a low number of sites with lower quality pollen count data. The REVEALS data presented here can be downloaded from

pollen data from surface lake sediments have shown that pollen assemblages from lakes ≥50 ha are appropriate to estimate regional plant cover using the REVEALS model (e.g. tests by Hellman et al. (2008a, b) in southern Sweden and by Sugita et al. (2010) in northern America).

The REVEALS model (equation 1) calculates estimates of regional vegetation abundance in proportions or percentage cover using fossil pollen counts from "large lakes" (Sugita, 2007a).

For the application of REVEALS, an age-depth model (in cal yr BP) is required for each pollen record. We used the author's original published model, the model available in the contributing database or, where necessary, a new age-depth model constructed following the approach in Trondman et al. (2015). The age-depth model for each pollen record is used to aggregate RPP-harmonised pollen count data into 25 time windows throughout the Holocene, following a standard time division used in Mazier et al. (2012) and Trondman et al. (2015), which was later adopted by the Past Global Changes (PAGES) LandCover6k working group. The first three time windows (present-100 BP, where present is the year of coring; 100-350 BP; 350-700 BP) capture the major human-induced land-cover changes since the early Middle Ages. Subsequent time windows are contiguous 500-year long intervals (e.g. 700-1200 BP, 1200-1700 BP, 1700-2200 BP), with the oldest interval representing the start of the Holocene (11200-11700 BP). The use of 500-year long time windows is motivated by the necessity to obtain sufficiently large pollen counts for reliable REVEALS reconstructions. Since the size of the error on the REVEALS estimate partly depends on the size of the pollen count (Sugita, 2007a), the length of the time window should be a reasonable compromise to ensure both a useful time resolution of the reconstruction and an acceptable reliability of the REVEALS estimate of plant cover (Trondman et al., 2015).

Table 1: Land-cover types (LCTs) and Plant Functional Types (PFTs) according to Wolf et al. (2008) and their corresponding pollen morphological types. Fall speed of pollen (FSP) and the mean relative pollen productivity (RPP) estimates from the new RPP synthesis (see section 2.3 and Appendices A-C for details) with their standard errors in brackets (see text for more explanations). (Mazier, unpublished) are 0.015 and 0.051, respectively (see Appendix B, Table B.3). The values of 0.035 (FSP of deciduous Quercus t.) and 0.038 (FSP of boreal-temperate Ericaceae) were used instead (see discussion in section 4.2 for explanation). t = type, e.g. evergreen Quercus t. RPP used in this study are relative to grass pollen productivity where Poaceae = 1 (indicated in bold).
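The time-window scheme described in the text (three short recent windows followed by contiguous 500-year windows back to 11700 BP) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, "present" (the year of coring) is simplified to 0 BP, and the treatment of ages falling exactly on a boundary is an assumption.

```python
def holocene_time_windows(oldest=11700):
    """Return (young, old) bounds in cal yr BP for the time windows.

    The first three windows follow the text; 0 stands in for 'present'
    (the year of coring), which is a simplification.
    """
    windows = [(0, 100), (100, 350), (350, 700)]
    start = 700
    while start < oldest:
        # Contiguous 500-year windows: 700-1200, 1200-1700, ...
        windows.append((start, start + 500))
        start += 500
    return windows


def assign_window(age_bp, windows):
    """Return the index of the window containing age_bp, or None.

    Assumption: a window includes its younger bound and excludes its older one.
    """
    for index, (young, old) in enumerate(windows):
        if young <= age_bp < old:
            return index
    return None


windows = holocene_time_windows()
print(len(windows), windows[-1])
```

Pollen samples dated by the age-depth model would then be pooled per site and window index before running REVEALS.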

Model parameter setting
For the purpose of this study, a new synthesis of the RPP values available for European plant taxa was performed in 2018-2019 based on the synthesis by Mazier et al. (2012) and additional RPP studies published since then (Appendix A-C). This synthesis provides new alternative RPP datasets for Europe, including or excluding plant taxa with dominant entomophily, and with the important addition of plant taxa from the Mediterranean area (Appendix A, Table A1). The selection of RPP studies, RPP values (shown in Appendix B, Tables B1 and B2) and calculation of mean RPP and their standard deviation (SD) for Europe are explained in Appendix C. The location of studies included in the RPP synthesis is shown in Fig. C1 and related information is provided in Table C1. The synthesis includes a total of 54 taxa for which RPP values are available (Tables B1 and B2) Mazier et al. (2012) and from the recent synthesis by Wieczorek & Herzschuh (2020). Moreover, comparison with the RPP values of three studies not used in our synthesis is shown in Table A2. For the REVEALS reconstructions presented in this paper, we excluded strictly entomophilous taxa, which resulted in a total of 31 taxa (Table 1). The excluded taxa are Compositae (Asteraceae) SF Cichorioideae, Leucanthemum (Anthemis)-t., Potentilla-t., Ranunculus acris-t., and Rubiaceae. We included entomophilous taxa that are known to be characterised by some anemophily, e.g. Artemisia, Amaranthaceae/Chenopodiaceae, Rubiaceae, and Plantago lanceolata. We excluded plant taxa with only one RPP value except Chenopodiaceae, Urtica, Juniperus, and Ulmus, and the seven exclusively sub-Mediterranean and Mediterranean taxa mentioned above.
The FSP values (Tables 1 and A1) for boreal and temperate plant taxa were obtained from the literature (Broström et al., 2008; Mazier et al., 2012); these values were in turn extracted from Gregory (1973) for trees, and calculated based on pollen measurements and Stokes' law for herbs (Broström et al., 2004). FSPs for Mediterranean taxa (Buxus sempervirens, Castanea sativa, Ericaceae (Mediterranean species), Phillyrea, Pistacia, and Quercus evergreen type) were obtained by using pollen measurements and Stokes' law (Mazier et al., unpublished); the FSP of Carpinus betulus (Mazier et al., 2012) was used for Carpinus orientalis (Grindean et al., 2019).

The site radius was obtained from original publications where possible. Sites in the EMBSeCBIO were classified as small (0.01-1 km²), medium (1.1-50 km²) or large (50.1-500 km²). These were assigned radii of 399 m, 2921 m and 10000 m, respectively. Where a site's radius could not be determined from publication, it was geolocated in Google Earth and the area of the site was measured. A radius value was extracted assuming that a site shape is circular (Mazier et al., 2012). A constant wind speed of 3 m/s, assumed to correspond approximately to the modern mean annual wind speed in Europe, was used following Trondman et al. (2015). Zmax (maximum extent of the regional vegetation) was set to 100 km. The influence of Zmax and wind speed on REVEALS estimates has been evaluated earlier in simulation and empirical studies (Gaillard et al., 2008; Mazier et al., 2012; Sugita, 2007a), which support the values used for these parameters. Atmospheric conditions are assumed to be neutral (Sugita, 2007a).
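Converting a measured site area to a radius under the circular-site assumption is a one-line calculation, r = sqrt(A / π). The sketch below illustrates it; the function name is hypothetical and the example area is chosen only for illustration.

```python
import math


def radius_from_area_m(area_km2):
    """Radius in metres of a circle with the given area in km^2."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    area_m2 = area_km2 * 1e6  # 1 km^2 = 1,000,000 m^2
    return math.sqrt(area_m2 / math.pi)


# Illustrative value only: a 0.5 km^2 basin gives a radius of about 399 m.
print(round(radius_from_area_m(0.5)))
```

An area of 0.5 km² gives about 399 m, which coincides with the radius assigned to the small size class, although the text does not state how the class radii were derived.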

Implementation of REVEALS
REVEALS was implemented using the REVEALS function within the LRA R-package of Abraham et al. (2014) (see Code availability, section 6). The function enables the use of deposition models for bogs (Prentice's model) and lakes (Sugita's model), and two dispersal models (a Gaussian plume model, and a Lagrangian stochastic model taken from the DISQOVER package (Theuerkauf et al., 2016)). Within this study, the Gaussian plume model was applied. The REVEALS model was run on all pollen records within each 1° × 1° grid cell across Europe. The REVEALS function is applied to lake and bog sites separately within each 1° × 1° grid cell, and combines results (if there is more than one pollen record per cell) to produce a single mean cover estimate (in proportion) and mean standard error (SE) for each taxon. The formulation of the SE is found in Appendix A of Sugita (2007a). The REVEALS SE accounts for the standard deviations on the relative pollen productivities for the individual pollen taxa (Table 1) and the number of pollen grains counted in the sample (Sugita, 2007a). The uncertainties of the averaged REVEALS estimates of plant taxa for a grid cell are calculated using the delta method (Stuart and Ord, 1994), and expressed as the SEs derived from the sum of the within- and between-site variations of the REVEALS results in the grid cell. The delta method is a mathematical solution to the problem of calculating the mean of individual SEs (see Li et al., 2020, Appendix C, for formula and further details). Results of the REVEALS function are extracted by time window, producing 25 matrices of mean REVEALS land-cover estimates and 25 matrices of corresponding mean SEs for each of the 31 RPP taxa and each grid cell. The 31 RPP taxa are also assigned to 12 plant functional types (PFTs) and three land-cover types (LCTs) (Table 1), and their mean REVEALS estimates calculated. These PFTs follow Trondman et al. (2015), with the addition of two PFTs for Mediterranean vegetation not reconstructed in earlier studies: Mediterranean shade-tolerant broadleaved evergreen trees (MTBE) and Mediterranean broadleaved tall shrubs, evergreen (MTSE). The mean SEs for LCTs and PFTs including more than one plant taxon are calculated using the delta method (Stuart and Ord, 1994), as described above.
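To make the idea of combining within- and between-site variation concrete, the sketch below averages site-level estimates for one taxon in one grid cell and builds an SE from two variance components. It is a simplified stand-in under stated assumptions, not the formula of Li et al. (2020, Appendix C) and not the LRA package code: site estimates are treated as independent, the within-site term is the propagated variance of the mean, and the between-site term is the sample variance of the estimates divided by the number of sites. The function name is hypothetical.

```python
import math


def combine_sites(estimates, ses):
    """Mean estimate and an illustrative SE for one taxon in one grid cell.

    Assumption: independent sites; SE^2 = within-site + between-site variance
    of the mean. This is a simplified sketch, not the published formula.
    """
    n = len(estimates)
    if n == 0 or n != len(ses):
        raise ValueError("need matching, non-empty estimates and SEs")
    mean = sum(estimates) / n
    # Within-site component: variance of a mean of n independent estimates.
    within = sum(se ** 2 for se in ses) / n ** 2
    # Between-site component: sample variance of the estimates, per site.
    if n > 1:
        between = sum((x - mean) ** 2 for x in estimates) / (n - 1) / n
    else:
        between = 0.0
    return mean, math.sqrt(within + between)


mean_cover, mean_se = combine_sites([0.2, 0.4], [0.03, 0.04])
print(mean_cover, mean_se)
```

With a single site the between-site term vanishes and the site's own SE is returned, which is consistent with single-site grid cells carrying no information on spatial variability.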

Mapping of the REVEALS estimates
To illustrate the information that the new REVEALS reconstruction provides, we present and describe (section 3) maps of the REVEALS estimates (% cover) and their associated SEs for the three LCTs (Fig. 2 to 4) and five taxa for eight selected time windows: the five taxa are Cerealia-t. and Picea abies (Fig. 5 and 6), Calluna vulgaris, deciduous Quercus type (t.), and evergreen Quercus t. (Fig. D1-D3). The selection of the five taxa and eight time windows is motivated essentially by notable changes in the spatial distribution of these taxa through time, with higher resolution for recent times characterised by the largest and most rapid human-induced changes in vegetation cover. For visualisation purposes, the estimates are mapped in nine % cover classes. These classes are the same for the three LCTs (Figures 2-4), and the mapped output can therefore be directly compared. In contrast, the colour scales used for the five taxa vary between maps depending on the abundance of the PFT/taxon (Fig. 5 and 6, D1-D3). Different taxa thus have different scales and maps cannot be directly compared. We visualise uncertainty in our data by plotting a circle inside each grid cell whose size represents the coefficient of variation (CV, i.e. the standard error divided by the REVEALS estimate). Circles are scaled to fill the grid cell if the SE is equal to or greater than the mean REVEALS estimate (i.e. CV ≥ 1). Grid-based REVEALS results that are based on pollen records from just one large bog, or single small bogs or lakes, provide lower quality results (see section 2.1 on the REVEALS model, and discussion section 4.1). The quality of REVEALS land-cover estimates by grid cell and time window is provided in Table GC_quality_by_TW (see section 5, Data availability). The percentage scale ranges we use here are different from those used in the maps of Trondman et al. (2015) and, therefore, the data visualisation cannot be directly compared.
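The uncertainty symbol described above follows directly from the coefficient of variation. The sketch below computes the CV and a relative circle size capped at a full grid cell. The function names are hypothetical, and two details are assumptions rather than statements from the text: the circle diameter is taken to scale linearly with CV, and a zero estimate is drawn as a full circle.

```python
def circle_fraction(estimate, se):
    """Circle diameter as a fraction of the grid-cell width (1.0 fills the cell).

    Assumptions: diameter scales linearly with CV = SE / estimate, and a
    zero estimate is shown as a full circle.
    """
    if estimate < 0 or se < 0:
        raise ValueError("estimate and SE must be non-negative")
    if estimate == 0:
        return 1.0
    cv = se / estimate
    return min(cv, 1.0)  # CV >= 1 fills the grid cell


def differs_from_zero(estimate, se):
    """True when the SE is smaller than the estimate (CV < 1)."""
    return se < estimate


print(circle_fraction(10.0, 2.5), differs_from_zero(10.0, 2.5))
print(circle_fraction(0.8, 1.2), differs_from_zero(0.8, 1.2))
```

A cell with 10% cover and an SE of 2.5 gets a circle a quarter of the cell width, while a cell with 0.8% cover and an SE of 1.2 is filled and its estimate is treated as not different from zero.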

Results
The complete REVEALS land cover reconstruction dataset includes mean REVEALS values (in proportions) and their related mean SE for 31 individual tree and herb taxa, twelve PFTs and three LCTs for each grid cell in 25 consecutive time windows of the Holocene (11.7 k BP to present). Here, results are illustrated by maps of the three LCTs (Fig. 2-4) and five taxa (Fig. 5 and 6, D1-D3). The presented maps are not part of the published dataset archived in the PANGAEA online public database (see Data availability, section 5); they are examples of how the data can be visually presented and what they can be used for.

Land-cover types
The three land-cover types are evergreen trees (ET), summer-green trees (ST) and open land (OL). ET includes six PFTs which are composed of nine pollen-morphological types (hereafter referred to as taxa). ST includes three PFTs which are composed of twelve taxa, while OL includes three PFTs that are in turn composed of ten taxa (Table 1).

Open Land (OL)
At the start of the Holocene, open land (OL) (Fig. 2)

Evergreen Trees (ET)
The cover of evergreen trees (ET) (Fig. 3) at 9700-10200 BP is <30% across Europe, and by 7700-8200 BP fewer than 30 grid cells show ET >50%. ET cover slowly increases through the early Holocene and at 5700-6200 BP groups of grid cells in southern Europe record >80%, while in northern Europe ET cover ranges between 10% and 60%. There is a consistent increase in ET cover over Europe during the mid- and late Holocene with ET cover peaking at 2700-3200 BP before starting to decline. Across western parts of Europe, including the United Kingdom, western France, Denmark, and the Netherlands, ET never exceeds 20% cover.

Selected taxa
In terms of PFTs, Cerealia-type (t.) is assigned to agricultural land (AL), Picea abies to shade tolerant evergreen trees (TBE1: Picea abies is the only taxon in this PFT), Calluna vulgaris to low evergreen shrubs (LSE: Calluna vulgaris is the only taxon in this PFT), deciduous Quercus t. to shade tolerant summer-green trees (TBS), and evergreen Quercus t. to Mediterranean shade-tolerant broadleaved evergreen trees (MTBE) (Table 1).

Cerealia-type
Cerealia-t. (Fig. 5) is recorded throughout the Holocene with 10-15% as the maximum cover. Cerealia-t. is present in southern Europe at 9700-10200 BP with several grid cells recording >5 to 10%. Whilst scattered grid cells in central and western Europe record the presence of Cerealia-t. at very low levels (0.5-1%), these values have high SE (greater than the REVEALS estimate) and are therefore not different from zero; they correspond to single findings of Cerealia-t. By 5700-6200 BP, grid cells in Estonia and France record 3-5% cover, and several regions within central and western Europe record 0-5% (0.5-1%), although with high SEs. At 2700-3200 BP, Cerealia-t. is recorded across central and western Europe in the United Kingdom, France, Germany, and Estonia with low values. In Norway, Sweden and Finland it has 0-1% cover with high SEs. The highest cover (>5%) is observed across Europe from 1200 BP.

Figure 5. Grid-based REVEALS estimates of Cerealia-t. cover for eight Holocene time windows. Percentage cover in 0.5% intervals between 0 and 3%, 1% intervals between 3 and 5%, and 5% interval between 5 and 10%. Intervals represented by increasingly darker shades of green from 1-1.5%. Grey cells: cells without pollen data for the time window, but with pollen data in other time windows.

REVEALS estimate, the circle fills the entire grid cell and the REVEALS estimate is not different from zero. This occurs mainly where REVEALS estimates are low.

Picea abies
Picea abies cover (Fig. 6) is low (1-2%) at 9700-10200 BP, although a number of grid cells in central and eastern Europe record values between 30 and 50%. By 7700-8200 BP, grid cells recording 30-50% cover are observed in more regions of central and eastern Europe than earlier (Russia, Estonia, Romania, Slovakia and Austria). At 5700-6200 BP, almost all of central Europe has consistent but low cover of Picea abies; values are higher towards northeastern Europe (Russia, Estonia, Latvia, Belarus and Lithuania), up to 30-50%. By 2700-3200 BP the cover of Picea abies has increased across central (ca. 10%) and northeastern Europe (>30%). From 1200 BP, Picea abies is recorded in northern Europe, particularly in Norway and Sweden with some grid cells recording 25-50% cover.

Figure 6. Grid-based REVEALS estimates of Picea cover for eight Holocene time windows. Percentage cover in 1% interval between 0 and 2%, 3% interval between 2 and 5%, 5% intervals between 5 and 30%, and 20% interval between 30 and 50%. Intervals

Calluna vulgaris
During the Holocene, Calluna vulgaris cover (Fig. D1) peaks at 50%, and is largely distributed in a central European belt from the United Kingdom across to the southern Baltic States. At 9700-10200 BP, it is recorded in only a few grid cells, mostly in central and western Europe, and at levels <10%. Cover slowly increases and by 7700-8200 BP, there are several grid cells with cover >25% within the United Kingdom, and with 10-20% cover within Denmark. At 5700-6200 BP, grid cells in coastal locations in northwestern Europe (particularly France, Germany and Denmark) have 50% Calluna vulgaris cover. Cover steadily increases within the same grid cells and by 2700-3200 BP, cover has increased in northern and eastern Europe (e.g. Norway, Estonia), with values up to 20% cover. The highest cover of Calluna vulgaris is recorded in the last two millennia. Although some grid cells in southeast Europe record low cover values, these have high SE.

Deciduous Quercus type (t.)
Deciduous Quercus t. (Fig. D2) is recorded in central and western Europe at 9700-10200 BP at low levels (<10%), while in southern Europe (Italy) several grid cells record >20% cover. By 7700-8200 BP, cover in central and western Europe is between 1 and 10%, while in grid cells of northern and eastern Europe it is <2% with high SEs. During the mid-Holocene (5700-6200 BP) most of Europe, with the exception of some grid cells at the northern and southeast extremes, records deciduous Quercus t. cover values between 2 and 15%. By 2700-3200 BP, % cover in the same grid cells has decreased to values between 2 and 10%. Thereafter, the number of grid cells recording deciduous Quercus t. cover remains similar; however, the percentage cover slowly decreases and at 100-350 BP, the number of grid cells with deciduous Quercus t. cover above 5% is very low.

Evergreen Quercus type (t.)
The spatial distribution of evergreen Quercus t.

The results presented here are the first full-Holocene grid-based REVEALS estimates of land-cover change for Europe spanning the Mediterranean, temperate and boreal biomes, which highlight the spatial and temporal dynamics of 31 plant taxa, 12 PFTs and 3 LCTs across Europe over the last 11700 years. Previous studies have demonstrated major differences between REVEALS results and pollen percentages (Marquer et al., 2014; Trondman et al., 2015), and the differences between REVEALS results and other methods used to transform pollen data, including pseudobiomisation and MAT (Roberts et al., 2018). It is beyond the scope of this paper to evaluate the results in that context. This discussion focuses on the reliability and potential of this "second generation" of REVEALS land cover reconstruction for Europe for use by the wider science community.

Data reliability
The REVEALS results are reliant on the quality of the input datasets, namely pollen count data, chronological control for sequences, and the number and reliability of RPP estimates used (see discussion on RPPs under 4.2). The standard errors (SEs) can be considered a measure of the precision of the REVEALS results, and of reliability/quality (Trondman et al., 2015). Where SEs are equal to or greater than the REVEALS estimates (represented in the maps of Fig. 2-6 and D1-D3 as a circle that fills the grid), caution should be applied in the use of the REVEALS estimates, as it implies that they are not different from zero when taking the SEs into account. Whilst a range extending to or below zero is possible within an algorithmic approach that includes estimates of uncertainty, it is conceptually impossible to have negative vegetation cover. If SEs ≥ mean REVEALS value it is therefore uncertain whether the plant taxon has cover within the grid cell. Cover may either be very low or the taxon may be absent within the region (grid cell in this case).

The size of pollen counts impacts on the size of REVEALS SEs (Sugita, 2007a); larger counts result in smaller SEs. Aggregation of samples from pollen records to longer time windows results in larger count sizes and thus lower SEs (see sections 2.2 above and 4.2 below). Our input dataset includes more than 59 million individual pollen identifications, organised here into 16711 samples from 1128 sites, where a sample is an aggregated pollen count for RPP taxa for a time window at a site. Seventy-seven percent of samples have count sizes in excess of 1000, which is deemed most appropriate for REVEALS reconstructions (Sugita, 2007a). The mean count size across all samples is 3550. Samples with count sizes lower than 1000 are still used, but result in higher SEs. More than half of the pollen records used in the study were sourced from databases (see section 2.2). Note that the EMBSeCBIO taxonomy has been pre-standardised, and the data compilers have removed Cerealia-type (t.). This means that for grid cells within the Eastern Mediterranean-Black Sea-Caspian-Corridor, caution is advised in the interpretation of Cerealia-type. Nevertheless, pollen from ruderals that are often related to agriculture, for example, Artemisia, Amaranthaceae/Chenopodiaceae, and Rumex acetosa type, is included in the land-cover type open land (OL); therefore, changes in OL cover in the Eastern Mediterranean-Black Sea-Caspian-Corridor may be related to changes in agricultural land (see also the discussion of agricultural land in section 4.3).

Aggregation of pollen counts to time windows depends on age-depth models. We have used the best age-depth models available to us, based on the chronologies presented in Giesecke et al. (2014) for EPD sites, and through liaison with data contributors. Nevertheless, future REVEALS runs may draw on improvements to age-depth modelling, which may result in some original pollen count data being assigned to different time windows.

The REVEALS results presented here are provided for 1° × 1° grid cells across Europe. The size and number of suitable pollen records are important factors in the quality of the REVEALS estimates for each grid cell. The REVEALS model was developed for use with "large lakes" (≥ 50 ha; Sugita, 2007a) that represent regional vegetation. Grid cells with multiple large lakes will thus provide results with the highest level of certainty and reflect the regional vegetation most accurately. Results for grid cells comprising one or more large lakes, several small sites (lake or bog), or a mix of large and small sites are considered "high quality" (dark grey grids in figure 1B). It has been shown both theoretically (Sugita, 2007a) and empirically (Fyfe et al., 2013; Trondman et al., 2016) that pollen records from multiple smaller (<50 ha) lakes will also provide REVEALS estimates that reflect regional vegetation. However, SEs may be larger if there is high variability in pollen composition between records. We therefore also consider grid cells with multiple sites "high quality". Application of REVEALS to pollen records from large bogs violates assumptions of the model (see section 2.1 above). Therefore, REVEALS estimates for grid cells including large bogs or single small sites (lake or bog) may not be representative of regional vegetation, particularly in areas characterised by heterogeneous vegetation. We consider such estimates as "lower quality" (light grey grids in figure 1B), although they may still provide first-order indications of vegetation cover, and represent an improvement on pollen percentage data (Marquer et al., 2014). Our results provide REVEALS estimates for a maximum of 420 grid cells per time window. The number and type of pollen records in a grid cell can change between time windows: not all pollen records cover the entire Holocene. To assess the reliability of individual results it is important to consider not just the number and type of pollen records in the total dataset, but how these change between the time windows. Results for a maximum of 143 grid cells are based on three or more sites, 65 on two sites, and a minimum of 212 grid cells on a single site. The results of a maximum of 67 grid cells are based on single small bogs (<400 m radius), 68 on single small lakes (<400 m radius), and 82 on single large bogs.
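The quality rule described above can be written as a small decision function for one grid cell in one time window. This is a sketch of one reading of the text, not the procedure used to build Table GC_quality_by_TW: the names are hypothetical, the 50 ha threshold is applied to both lakes and bogs, and any cell with two or more sites is treated as high quality regardless of site type, which the text leaves ambiguous for cells containing large bogs.

```python
from dataclasses import dataclass

LARGE_SITE_HA = 50.0  # "large" threshold from the text (>= 50 ha)


@dataclass
class Site:
    kind: str       # "lake" or "bog"
    area_ha: float  # basin area in hectares


def grid_cell_quality(sites):
    """Return "high", "lower", or None (no pollen record in this window).

    Assumption: two or more sites of any type count as high quality;
    a single site is high quality only if it is a large lake.
    """
    if not sites:
        return None
    if len(sites) >= 2:
        return "high"
    site = sites[0]
    if site.kind == "lake" and site.area_ha >= LARGE_SITE_HA:
        return "high"
    return "lower"  # single small lake or bog, or single large bog


print(grid_cell_quality([Site("bog", 120.0)]))
print(grid_cell_quality([Site("lake", 5.0), Site("bog", 3.0)]))
```

Because records do not all span the whole Holocene, the function would need to be evaluated per time window with only the sites that contribute pollen counts to that window.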

Role of RPPs and FSP in REVEALS results
A key assumption of the REVEALS model is that RPP values are constant within the region of interest, and through time (Sugita, 2007a). Nevertheless, it has been suggested that RPPs may vary between regions, with the variation caused by environmental variability (climate, land use), vegetation structure, or methodological design differences (Broström et al., 2008; Hellman et al., 2008a; Mazier et al., 2012; Li et al., 2020; Wieczorek and Herzschuh, 2020). Wieczorek and Herzschuh (2020) under-representation of Ericaceae (Calluna excluded), in particular in boreal Europe, but perhaps also in temperate Europe. Using only the small value from boreal/temperate Europe may lead to an over-representation of Ericaceae in Mediterranean Europe.

Until we have more RPP values for each taxon, it is not possible to disentangle the effect of all factors influencing the estimation of RPPs and to separate the effect of methodological factors from those of factors such as vegetation type, climate and land use. The only way to evaluate the reliability of RPP datasets is to test them with modern or historical pollen assemblages and related plant cover (Hellman et al., 2008a, 2008b). We argue that RPP values of certain taxa may not vary substantially within some plant families or genera, while they might be variable within others, depending on the characteristics of flowers and inflorescences that may be either very different or relatively constant within families or genera (see discussion in Li et al. (2018)). Therefore, we advise using compilations of RPPs at continental or sub-continental scales rather than compilations at multi-continental scales such as the Northern Hemisphere dataset proposed by Wieczorek and Herzschuh (2020). We consider the RPP selection used within this work as the most suitable for Europe to date, but expect revised and improved RPP values as more RPP empirical studies are published. Moreover, experimentation in REVEALS applications will allow future studies to evaluate the effects of using different RPP datasets on land-cover reconstructions (e.g. Mazier et al., 2012).

The role of FSP values in the pollen dispersal and deposition function (g_i(z) in equation (1) of the REVEALS model, section 2.1) has been discussed by Theuerkauf et al. (2013). In this application of REVEALS we used the Gaussian Plume Model (GPM) of dispersion and deposition as most existing RPP values have been estimated using this model. The GPM approximates dispersal as a fast-declining curve with distance from the source plant, which implies short distances of transport for pollen grains with high FSP compared to other models of dispersion and deposition (Theuerkauf et al., 2012). We have used the FSP values obtained for deciduous Quercus type (t.) (0.035 m/s) and boreal-temperate Ericaceae (0.037 m/s) for evergreen Quercus t. and Mediterranean Ericaceae, respectively, although the FSP values of those two taxa were estimated at 0.015 and 0.051 in the Mediterranean study (Table 1 and A1). Whether using a lower FSP for evergreen Quercus t. (0.015 m/s) and a higher FSP for Mediterranean Ericaceae (0.051 m/s) will have an effect on the REVEALS results is not known and requires further testing.
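For orientation on how FSP values of this magnitude arise, the sketch below applies Stokes' law for a small sphere settling in still air, v = 2 (ρ_grain − ρ_air) g r² / (9 μ), which is the general relation the text cites for FSPs calculated from pollen measurements. The function name, the spherical-grain simplification and all default parameter values (grain density, air density, air viscosity) are illustrative assumptions, not the values or procedure used by the authors.

```python
def stokes_fall_speed(diameter_um, grain_density=1000.0,
                      air_density=1.2, air_viscosity=1.8e-5, g=9.81):
    """Terminal fall speed (m/s) of a spherical grain from Stokes' law.

    Assumed defaults: grain density 1000 kg/m^3, air density 1.2 kg/m^3,
    dynamic viscosity of air 1.8e-5 Pa s. Real pollen grains are not
    perfect spheres, so this is only a first-order illustration.
    """
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    radius_m = diameter_um * 1e-6 / 2.0
    return (2.0 * (grain_density - air_density) * g * radius_m ** 2
            / (9.0 * air_viscosity))


# Illustrative grain sizes only (not measurements from the study).
for diameter in (20, 30, 40):
    print(diameter, round(stokes_fall_speed(diameter), 4))
```

Under these assumptions a 30 µm grain falls at roughly 0.027 m/s, and fall speed grows with the square of grain diameter, which is why modest differences in measured grain size translate into noticeably different FSP values.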

Use of the REVEALS land cover reconstruction results
This second generation dataset of pollen-based REVEALS land cover in Europe over the Holocene is currently used in two 532 major research projects: LandClim, and PAGES LandCover6k. LandClim is a Swedish Research Council project studying the 533 difference in the biogeophysical effect of land-cover change on climate at 6000, 2500 and 200 BP (Fyfe et al., 2022;Githumbi 534 et al., 2019;Strandberg et al., 2014;Trondman et al., 2015). PAGES LandCover6k focuses on providing datasets on past land-535 cover/land-use for climate modelling studies (Dawson et al., 2018;Gaillard et al., 2018;Harrison et al., 2020). The first 536 generation REVEALS land-cover reconstruction (Marquer et al., 2014(Marquer et al., , 2017Trondman et al., 2015) were used to evaluate 537 other pollen-based reconstructions of Holocene tree-cover changes in Europe (Roberts et al., 2018) and scenarios of 538 anthropogenic land-cover changes (ALCCs) (Kaplan et al., 2017) (see also section 1). The Trondman et al. (2015)  539 reconstructions were used to create continuous spatial datasets of past land cover using spatial statistical modelling 540 (Pirzamanbein et al., 2014(Pirzamanbein et al., , 2018(Pirzamanbein et al., , 2020. 541 Spatially explicit datasets/maps based on these second generation of REVEALS reconstructions are currently being produced 542 within PAGES LandCover6k and used to evaluate and revise the HYDE (Klein Goldewijk et al., 2017) and KK10 (Kaplan et 543 al., 2009) ALCC scenarios. Moreover, LandCover6k archaeology-based reconstructions of past land-use change (Morrison et 544 al., 2021) will be integrated with the datasets of REVEALS land-cover. 
Besides the uses listed above, the second generation 545 of REVEALS reconstruction for Europe offers great potential for use in a large range of studies on past European regional 546 vegetation dynamics and changes in biodiversity over the Holocene ( Marquer et al., 2014( Marquer et al., , 2017 and the relationship between 547 regional plant cover, land use, and climate over millennial and centennial time scales. Since the reconstructions are of regional 548 plant cover they will have value in archaeological research when impacts are expected at the regional level (e.g. the impact of 549 early mining (Schauer et al., 2019)). Archaeological questions and research programmes that require information on local 550 vegetation cover will require the full application of the LRA (REVEALS and LOVE; Sugita, 2007a, b), such as the local 551 vegetation estimates presented from Norway focussing on cultural landscape development (Mehl et al., 2015). The same 552 approach of using the REVEALS results within the LOVE model is necessary for ecological questions that require local 553 vegetation estimates (Cui et al., 2013(Cui et al., , 2014Sugita et al., 2010). 554 Several papers have discussed in depth the issues that need to be taken into account when interpreting REVEALS 555 reconstructions of past plant cover, in particular Trondman et al. (2015) and Marquer et al. (2017). The interpretation in terms 556 of human-induced vegetation change is one of the major challenges. The cover of open land (OL) may be used to assess 557 landscape openness, but is not a precise measure of human disturbance. OL will include plant taxa characterizing both 558 naturally-open land and agricultural land that has been created by humans through the course of the Holocene with the 559 domestication of plants and livestock. Natural openness can occur in arctic and alpine areas, in wet regions, in river deltas and 560 around large lakes, as well as in eastern steppe areas. 
Separating natural from human-induced openness is a particular challenge in the Mediterranean region, where natural vegetation openness represents a larger fraction of the land cover than in temperate or boreal Europe (Roberts et al., 2019). Agricultural Land (AL; Trondman et al., 2015) is the only PFT that includes cultivars; nevertheless, it is restricted to cereal cropping, and many other cultivated crop types that can be identified through pollen analysis do not yet have RPP values (e.g. Linum usitatissimum (common flax), Cannabis (hemp), Fagopyrum (buckwheat), beans, etc.). Moreover, the Cerealia-t. pollen morphological type includes pollen from wild species of Poaceae, especially when identification relies essentially on measurements of the pollen grain and its pore and does not consider exine structure and sculpture (Beug, 2004; Dickson, 1988).

The maps presented and described in section 3 as an illustration of the results show similar changes in spatial distributions and quantitative cover of plant taxa and land-cover types through time, between 6000 BP and present, as the results published in Trondman et al. (2015). The much greater potential of the new REVEALS reconstruction resides in its larger spatial extent, covering not only boreal and temperate Europe but also southern and eastern Europe, and in its contiguous time windows across the entire Holocene, from 11700 BP to present. The quality of results is also higher than in Trondman et al. (2015) in a number of grid cells where new pollen records have been included, which may in several cases decrease the standard error on the REVEALS estimates.

Data availability
All data files reported in this work, which were used for calculations and figure production, are available for public download at https://doi.pangaea.de/10.1594/PANGAEA.937075 (Fyfe et al., 2022). Example code for data preparation and implementation of REVEALS, using two grid cells from SW Britain, is available at https://github.com/rmfyfe/landclimII.

Conclusions
The application of the REVEALS model to 1128 pollen records distributed across Europe has produced the first full-Holocene estimates of vegetation cover for 31 plant taxa in 1° × 1° grid cells. These data are made available for use by the wider science community, including aggregation of results to PFTs and LCTs. The REVEALS model assumptions are clearly stated to allow interpretation and assessment of our results, and several of the assumptions have been tested and validated. We can therefore use the land-cover reconstructions to test the role of climate and humans on Holocene plant cover at regional scales. The overview of land-cover change across Europe over the Holocene can be used to track the timing and rate of vegetation shifts. We can also determine the effect of human-induced changes in regional vegetation cover on climate, i.e. study land use as a climate forcing (Gaillard et al., 2010a; Harrison et al., 2020; Strandberg et al., 2014). Local reconstructions (LOVE) can be a complementary approach to archaeological surveys, as fine-scale human use of the landscape cannot be distinguished using REVEALS (regional estimates). The LOVE model requires that regional plant cover is known: the REVEALS reconstructions are therefore needed for this purpose as well, and gridded reconstructions may be a way to perform LOVE reconstructions, although other strategies can be chosen (Cui et al., 2013; Mazier et al., 2015). Questions about the degree of vegetation openness through the Holocene in Europe, or about changes in the relationship between summer-green and evergreen tree cover through time, can now and in the future be answered and validated with fossil pollen data via the REVEALS approach. We expect REVEALS estimates to improve in the future as more pollen records are incorporated and work on RPPs develops.

Appendices
Appendix A - New RPP dataset for Europe

A.1 New RPP synthesis for Europe
The most common method to estimate RPPs involves the application of the Extended R-Value (ERV) model to datasets of modern pollen assemblages and related vegetation cover. A summary of the ERV model and its assumptions, and an extensive description of standardised field methods for the purpose of RPP studies, are found in Bunting et al. (2013b). Estimation of RPPs in Europe started with the studies by Sugita et al. (1999) and Broström et al. (2004) in southern Sweden, and Nielsen et al. (2004) in Denmark. The first tests of the RPPs in pollen-based reconstructions of plant cover using the LRA's REVEALS (Regional Estimates of VEgetation Abundance from Large Sites) model (Sugita, 2007a) were published by Soepboer et al. (2007) in Switzerland and Hellman et al. (2008a, b) in southern Sweden. Over the last 15 years, a large number of RPP studies have been undertaken in Europe north of the Alps, but it is only recently that RPP studies were initiated in the Mediterranean area (Grindean et al., 2019; Mazier et al., unpublished). Two earlier syntheses of RPPs in Europe were published by Broström et al. (2008) and Mazier et al. (2012). From 2012 onwards, these RPP values have been used in numerous applications of the LRA's two models REVEALS and LOVE (LOcal Vegetation Estimates) (Sugita, 2007a, b) to reconstruct regional and local plant cover in Europe (Cui et al., 2013; Fyfe et al., 2013; Marquer et al., 2020; Mazier et al., 2015; Nielsen et al., 2012; Nielsen and Odgaard, 2010; Trondman et al., 2015).

Wieczorek and Herzschuh (2020) have RPP values for 7 plant taxa in common. These RPPs are compared to those from two syntheses published earlier, Mazier et al. (2012) and Wieczorek and Herzschuh (2020). The number of selected RPP values (n) for Poaceae is larger than the total number of RPP (tn), i.e. n = tn + 1. This is because the study of Bunting et al. (2005) does not include a value for Poaceae and its RPP values are related to Quercus (Bunting et al., 2005); therefore, RPPs related to Poaceae were calculated by assuming the RPP value for Quercus (related to Poaceae; Quercus(Poaceae)) was the same in this study region as the mean

The larger differences between the mean RPPs in New and W&H than between New and Maz have not been examined in detail. They are due in part to a slightly different selection of studies, i.e. the study of Theuerkauf et al. (2013) is not included in W&H, and we did not include in New (boreal and temperate Europe, Mediterranean area excluded) the studies of Bunting et al. (2013a), Kuneš et al. (2019) and Grindean et al. (2019). Another important influencing factor is the selection of RPP values for calculation of the mean RPP. Although the rules used to select RPP values are very similar between the syntheses, there are obvious differences between New and W&H that are sometimes very large (e.g. Juniperus). (Table A2)

A.3 Comparison of the new synthesis with three additional individual studies
The RPPs from Twiddle et al. (2012) (Twi) for Pinus, Betula and Calluna are considerably larger than the mean RPPs in our synthesis (New). This is probably due to the assumption made on the RPP of Picea related to Poaceae. The RPP of Picea varies greatly between the selected studies in New, from 0.57 to 8.43 (eight values available). If we assumed that the RPP of Picea related to Poaceae in the study region of Twi was the mean RPP of the five smallest RPPs, i.e. 1.57, the RPP of the three taxa would be 4.8 for Pinus, 3.4 for Betula, and 3.3 for Calluna, which is more comparable to the mean RPPs in New.

Three taxa in Bunting et al. (2013a) (Bun) have an RPP comparable to the mean RPP in New, i.e. Cyperaceae, Ranunculus acris-t., and Rumex acetosa-t. (R. acetosa in Bun). The other taxa have a smaller RPP in Bun than the mean RPP in New, except Plantago maritima, which has a larger RPP (5.8) in Bun than the mean RPP for P. lanceolata in New.

Of nine taxa, three have a RPP in Kuneš et al. (2019)

according to the information on models used provided in Appendix C (Table C1) with further explanations on selection of RPP studies. We followed similar procedures and rules as Mazier et al. (2012) and Li et al. (2018) to produce a new standard RPP dataset for Europe. We consider that there are still too few RPP values per taxon to disentangle variability in the RPP values for a particular taxon due to methodological issues, landscape characteristics, land use, or climate. We therefore use the mean of selected RPP values for each taxon in the new standard RPP dataset, following Broström et al. (2008) and Mazier et al. (2012). In boreal and temperate Europe, the number of RPP values per taxon varies between one and nine (Betula) (Table B1), and in Mediterranean Europe, there is only one value per taxon (Table B2). In general, all three sub-models of the ERV model were used in the RPP studies.
We selected the RPP values obtained with the ERV sub-model considered by the authors to have provided the best results (following the approach of Li et al., 2018). This is usually evaluated from the shape of the curve of likelihood function scores (LFS), or log likelihood (LL) (Twiddle et al., 2012) and the LFS and LL values themselves.

All RPPs selected for this synthesis are expressed relative to Poaceae (RPP=1). In studies that used another reference taxon and calculated an RPP for Poaceae, the RPPs were recalculated relative to Poaceae. In studies that did not include an RPP value for Poaceae, it was assumed that the reference taxon had an RPP related to Poaceae equal to the mean of the RPP values for that taxon in the other studies (Mazier et al., 2012). For simplicity, we used the value of Quercus (5.83) calculated by Mazier et al. (2012) for the study by Bunting et al. (2005) (Quercus as reference taxon, no RPP value for Poaceae). We could also have used the new mean RPP for Quercus (4.54) using our selected RPPs (five values, instead of three in Mazier et al. (2012)). The latter would not have changed our results significantly; the mean RPP for Quercus would have been 4.28 instead of 4.54 (Table A4). For the study by Baker et al. (2016), we used the RPP values obtained with Poaceae as the reference taxon, given that the RPPs relative to Quercus or Pinus were almost identical when ERV submodel 3 was used. The selection of RPP values in boreal and temperate Europe for the calculation of the mean RPP values of each taxon (values in bold and emphasized by a thick rectangle in Table B1, (A) and (B)) is based on the following rules:

1. We excluded the RPP values that were not significantly different from zero considering the lower bound of their SE, and values that were considered as uncertain by the authors of the original publications (e.g., Vaccinium for Finland (Räsänen et al., 2007), Pinus for Central Sweden (von Stedingk et al., 2008)). Moreover, some RPP values were excluded as they were assumed to be outliers or unreliable based on expert knowledge of the plants involved, the pollen-vegetation dataset, and the field characteristics of the related studies. For example, the RPPs for Cyperaceae, Potentilla-t and Rubiaceae obtained in SW Norway (Hjelle, 1998) and those for Salix and Calluna vulgaris from Central Sweden (von Stedingk et al., 2008) were assumed to be too low compared to the values obtained in other study areas (Mazier et al., 2012).
2. (i) When five or more RPP estimates (N≥5) were available for a pollen type, the largest and the smallest RPP values (generally outlier values) were excluded, and the mean was calculated using the remaining three or more RPP estimates; (ii) when N=4, the most deviating value was excluded, and the mean was calculated using the other three RPP values; (iii) when N=3, the mean was based on all values available except if one value was strongly deviating from the other two; and (iv) when N=2, the mean was based on the two values available. An exception is Ulmus, for which we excluded the value from Germany (Theuerkauf et al., 2013) given that several of the RPPs in this study are considerably higher than most values in the other available studies, i.e. for Betula (18.7), Quercus (17.85) and Tilia (12.38). The latter values were also excluded from the mean RPP, as well as the unusually high values found by Baker et al. (2016) for Betula (13.94), Pinus (23.12) and Quercus (18.47). Baker et al. (2016) argue that the high RPP values might be characteristic of temperate deciduous forests that were little impacted by human activities. More studies in this type of wooded environment would be needed to confirm this assumption. In the absence of such studies we consider these values as outliers.

The SDs for the mean RPP values were calculated using the delta method (Stuart and Ord, 1994), a mathematical solution to the problem of calculating the mean of individual SDs (see Li et al., 2020 for more details).

soil samples. All studies used distance-weighted vegetation except two, Hjelle et al.
(1998; SW Norway) and Sugita et al. (1999; S Sweden). The Gaussian Plume Model (GPM) was used for pollen dispersal and deposition to distance-weight vegetation, i.e. the Prentice bog model (Parsons and Prentice, 1981; Prentice and Parsons, 1983) in studies using pollen from moss polsters, and Sugita's lake model (Sugita, 1993) in studies using pollen from lake sediments (see also caption of Table C1). In the case of the study by Theuerkauf et al. (2013), the published RPP values were calculated using the Lagrangian Stochastic Model. For the purpose of this synthesis, Theuerkauf recalculated the RPPs using the GPM bog model in the application of the ERV model. The distribution of sites for collection of pollen samples and vegetation data within the study regions is random or random stratified in seven of the eleven studies using moss polsters; the five remaining studies used selected sites (or systematic distribution). Studies using lake sediments normally result in a systematic site distribution. Earlier studies (Broström et al., 2005; Twiddle et al., 2012) showed that random distribution of sites provided better estimates of the "relevant source area of pollen" (RSAP; sensu Sugita, 1994) and thus of RPPs, given that the reliable RPPs are those obtained at the RSAP distance and beyond. Both studies indicated that a systematic distribution of sites tends to result in curves of likelihood function scores that do not follow the theoretical behaviour, i.e. an increase of the scores with distance until the values reach an asymptote. However, the difference in RPPs between systematic and random sampling is generally not very large. Nonetheless, systematic sampling may lead to uncertainty about the reliability of RPPs, so random distribution of sites is recommended and has generally been used in studies using moss polsters or soil samples published from 2008 onwards.
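The rescaling to Poaceae and selection rule 2 described earlier in this appendix can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, "most deviating" (N=4) is taken here to mean furthest from the median, the expert-judgement exclusions of rule 1 and of case (iii) are not modelled, and the standard error of the mean uses a first-order propagation for independent estimates, which may differ from the delta-method calculation actually used.

```python
from math import sqrt
from statistics import mean, median


def rescale_to_poaceae(rpp_rel_ref, ref_rel_poaceae):
    """Re-express RPPs given relative to another reference taxon so that Poaceae = 1.

    ref_rel_poaceae is the RPP of the reference taxon relative to Poaceae.
    """
    return {taxon: value * ref_rel_poaceae for taxon, value in rpp_rel_ref.items()}


def select_values(values):
    """Apply rule 2 to RPP values that already passed the rule 1 screening."""
    vals = sorted(values)
    n = len(vals)
    if n >= 5:
        # (i) drop the smallest and the largest value
        return vals[1:-1]
    if n == 4:
        # (ii) drop the most deviating value (assumption: furthest from the median)
        centre = median(vals)
        worst = max(range(n), key=lambda i: abs(vals[i] - centre))
        return vals[:worst] + vals[worst + 1:]
    # (iii) and (iv): keep everything; a strongly deviating value among three
    # would have to be removed by hand before calling this function
    return vals


def mean_rpp(estimates):
    """Return (mean, SE of the mean) for a list of (rpp, se) pairs.

    Assumption: independent estimates, so SE_mean = sqrt(sum(se_i^2)) / n.
    """
    values = [v for v, _ in estimates]
    n = len(estimates)
    se_mean = sqrt(sum(se * se for _, se in estimates)) / n
    return mean(values), se_mean


if __name__ == "__main__":
    # 0.57 and 8.43 are the extremes quoted for Picea; the middle values are made up.
    print(select_values([0.57, 1.0, 2.0, 3.0, 8.43]))  # [1.0, 2.0, 3.0]
    # Hypothetical Pinus value relative to Quercus, rescaled with Quercus = 5.83.
    print(rescale_to_poaceae({"Pinus": 2.0}, 5.83))
```

Keeping the selection step separate from the averaging step makes it easy to swap in a different definition of "most deviating" or a different error propagation without touching the rest.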
vegetation data within a 10² m² (herb taxa) and 10³ m² quadrat (tree taxa) centred on the pollen sample was used (Sugita et al., 1999)

interval between 0 and 2%, 3% interval between 2 and 5%, 5% intervals between 5 and 35%, and 15% interval between 35 and 50%.