High resolution biogenic global emission inventory for the time period 2000-2019 for air quality modelling

Biogenic volatile organic compounds (BVOCs) emitted from terrestrial vegetation into the Earth’s atmosphere play an important role in atmospheric chemical processes. Gridded information on their temporal and spatial distribution is therefore needed for a proper representation of atmospheric composition by air quality models. Here we present three newly developed high-resolution global emission inventories of the main BVOC species, including isoprene, monoterpenes, sesquiterpenes, methanol, acetone and ethene. Monthly mean and monthly averaged daily profile emissions were calculated by the Model of Emissions of Gases and Aerosols from Nature (MEGANv2.1) driven by meteorological reanalyses of the European Centre for Medium-Range Weather Forecasts for the period 2000-2019. The dataset CAMS-GLOB-BIOv1.2 is based on ERA-Interim meteorology (0.5° x 0.5° horizontal spatial resolution); datasets CAMS-GLOB-BIOv3.0 and v3.1 were calculated with ERA5 (both 0.25° x 0.25° horizontal spatial resolution). Furthermore, European isoprene emission potential data were updated using high-resolution land cover maps and detailed information of tree species composition and emission


However, such measurements are unfortunately limited in space and time and are therefore not fully suitable for creating the long-term gridded inventory of BVOC emissions required by the models. Knowledge obtained from observations of the emission processes, speciation and evaluation of fluxes serves as a valuable baseline for the development of BVOC emission models, which are then able to simulate BVOC emissions for a specific time period and spatial domain based on defined input parameters.
Over time, a relatively long list of BVOC emission models has been developed. The models differ in the approach used to estimate BVOC emissions, in the complexity of the processes considered and in the factors affecting the emission. In general, there are two main approaches to BVOC modelling. The first is a so-called process-based model that simulates BVOC synthesis directly inside the plant (e.g. LPJ-GUESS (Lund-Potsdam-Jena General Ecosystem Simulator), JULES (Joint UK Land Environment Simulator)). The second is based on a semi-empirical algorithm described by Guenther et al. (1995), which defines the dependence of BVOC emissions from the plant on environmental factors, namely air temperature and solar radiation. The latter gave rise to the MEGAN model (Model of Emissions of Gases and Aerosols from Nature), widely used in the BVOC emission, atmospheric chemistry and climate modelling communities. The emission algorithms can either be used stand-alone or be embedded inside an Earth system, land surface or air quality model. Different BVOC emission models have been applied in the past to obtain estimates of BVOC emission levels on the global scale (e.g. Lathière et al., 2005; Müller et al., 2008; Arneth et al., 2007b; Schurgers et al., 2009; Pacifico et al., 2011; Guenther et al., 2012; Sindelarova et al., 2014; Messina et al., 2016). Similarly, there is a long list of studies focusing on the regional level (e.g. Simpson et al., 1995; Simpson et al., 1999; Steinbrecher et al., 2009; Karl et al., 2009; Oderbolz et al., 2013; Emmerson et al., 2018). These inventories are so-called 'bottom-up' inventories, i.e. calculated by the emission models based on surface input data.
With the emerging availability of satellite-based observations of the Earth's atmosphere, data retrieved from space have also started to be used in BVOC emission estimation. Space-borne measurements of suitable chemical species are used to constrain a priori emissions through an inversion technique applied in an atmospheric chemistry model. Such an approach has been applied, for example, to constrain emissions of isoprene, the most abundant BVOC species, with satellite measurements of isoprene's oxidation product formaldehyde (e.g. Palmer et al., 2006; Millet et al., 2008; Stavrakou et al., 2009; Curci et al., 2010; Bauwens et al., 2016; Kaiser et al., 2018). Emission inventories constrained by satellite observations through application of the model inversion are called 'top-down'. Recently, a methodology for direct measurement of isoprene emissions from space has been developed by identifying spectral signatures of isoprene in satellite-borne measurements of the Cross-track Infrared Sounder (Fu et al., 2019; Wells et al., 2020).
In the following section (Sect. 2) we describe the methodology of the emission calculation, including a description of the emission model and of the input meteorological, land cover and emission factor data. Sect. 3 presents the global and regional distribution of emission estimates, together with a comparison of the emission inventories with each other and with other available data. Information on data availability is given in Sect. 4, and conclusions and a summary are presented in Sect. 5.

Emission model
The presented emission datasets were calculated using the Model of Emissions of Gases and Aerosols from Nature (MEGANv2.1, Guenther et al., 2012). The MEGAN model was developed at the National Center for Atmospheric Research (NCAR, US) and is currently maintained and further improved by the Biosphere Atmosphere Interaction Group at the University of California, Irvine (https://bai.ess.uci.edu/).

It is an emission model extensively used in the atmospheric modelling community for simulation of biogenic VOC emissions from vegetation and soils at regional and global scales (e.g. Guenther et al., 2006; Heald et al., 2008; Arneth et al., 2011; Sindelarova et al., 2014; Seco et al., 2015; Emmerson et al., 2018; Kaiser et al., 2018; Huszar et al., 2018). Furthermore, the algorithm of the MEGAN model has been embedded into a number of Earth system and chemical transport models (e.g. Emmons et al., 2010; Lawrence et al., 2011; Keller et al., 2014; Henrot et al., 2017).
The model calculates an emission flux F (µg grid cell⁻¹ h⁻¹) of a specific BVOC species from a model grid cell as follows:
$$ F = \gamma \cdot \mathrm{EP} \cdot S \qquad (1) $$
where γ is a dimensionless factor accounting for the dependence of emissions on environmental factors (air temperature, solar radiation, ambient CO2 concentration, leaf age, etc.), EP (µg m⁻² h⁻¹) is the emission potential of a grid cell, i.e. a unit emission defined under standardized environmental conditions, and S (m²) is the grid cell surface area. MEGANv2.1 was applied with the full canopy module, which calculates meteorological conditions inside the forest canopy (e.g. leaf temperature, radiation on sunlit and shaded leaves). For the calculation of isoprene, the model took into account the inhibitory effect of CO2 concentration on isoprene emissions using the parametrization described in Heald et al. (2009). In our simulations, we did not consider the effect of soil moisture stress on the plant emissions. For more details on the MEGANv2.1 algorithm please see Guenther et al. (2006, 2012).
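As a minimal sketch of the flux definition above (activity factor times emission potential times grid cell area), the following Python snippet computes the area of a latitude-longitude cell on a sphere and the resulting hourly flux. The spherical-Earth radius, the function names and the numerical values of the activity factor and EP are assumptions chosen for illustration only; they are not taken from MEGANv2.1.

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumption: spherical Earth with mean radius


def grid_cell_area_m2(lat_south_deg, lat_north_deg, dlon_deg):
    """Area (m^2) of a latitude-longitude cell on a sphere."""
    phi_south = math.radians(lat_south_deg)
    phi_north = math.radians(lat_north_deg)
    dlon = math.radians(dlon_deg)
    return EARTH_RADIUS_M ** 2 * dlon * (math.sin(phi_north) - math.sin(phi_south))


def emission_flux_ug_per_h(gamma, ep_ug_m2_h, area_m2):
    """Flux per grid cell (ug h^-1): activity factor * emission potential * area."""
    return gamma * ep_ug_m2_h * area_m2


# Illustrative (hypothetical) values for one 0.25 x 0.25 degree cell at the equator.
area = grid_cell_area_m2(0.0, 0.25, 0.25)
flux = emission_flux_ug_per_h(gamma=0.8, ep_ug_m2_h=5000.0, area_m2=area)
print(f"cell area: {area:.3e} m2, flux: {flux:.3e} ug h-1")
```

Because the cell area shrinks towards the poles, the same EP and activity factor give a smaller per-cell flux at high latitudes on a regular latitude-longitude grid.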

Meteorology
Two sources of meteorological data were used for the calculation of the emission datasets. CAMS-GLOB-BIOv1.2 is based on the ERA-Interim data (Dee et al., 2011), and datasets CAMS-GLOB-BIOv3.0 and v3.1 were calculated with ERA5 (Hersbach et al., 2020), both meteorological reanalyses of the European Centre for Medium-Range Weather Forecasts (ECMWF).
The MEGAN model requires the following input parameters: 2 m air temperature, water mixing ratio, surface pressure, 10 m wind speed and photosynthetically active radiation (PAR). PAR is defined as solar radiation with wavelengths between 400 and 700 nm, which photosynthetic organisms are able to absorb during photosynthesis. Unfortunately, this parameter is not available in either the ERA-Interim or the ERA5 dataset (see Copernicus Knowledge Base, ERA-Interim: surface photosynthetically active radiation (surface PAR) values are too low, 2017). PAR was therefore approximated by the surface solar downward radiation divided by a factor of 2.2, as recommended by various studies (Olofsson et al., 2007; Jacovides et al., 2003; Escobedo et al., 2011). The water mixing ratio was calculated from the 2 m dew point temperature following the equations of Lowe and Ficke (1974).
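The two derived inputs described above can be sketched as follows. The PAR approximation uses the divisor quoted in the text. For the water mixing ratio, this sketch substitutes a Magnus-type formula for the saturation vapour pressure in place of the Lowe and Ficke (1974) polynomial, so it is an illustration of the principle rather than a reproduction of the authors' calculation. Function names are hypothetical.

```python
import math

PAR_DIVISOR = 2.2  # divisor quoted in the text


def par_from_ssrd(ssrd_w_m2):
    """Approximate PAR (W m^-2) from surface solar downward radiation."""
    return ssrd_w_m2 / PAR_DIVISOR


def vapour_pressure_hpa(dewpoint_c):
    """Vapour pressure (hPa) from dew point, Magnus-type approximation.

    Assumption: used here instead of the Lowe and Ficke (1974) polynomial.
    """
    return 6.112 * math.exp(17.62 * dewpoint_c / (243.12 + dewpoint_c))


def water_mixing_ratio_kg_kg(dewpoint_c, surface_pressure_hpa):
    """Water vapour mixing ratio (kg kg^-1) from dew point and surface pressure."""
    e = vapour_pressure_hpa(dewpoint_c)
    return 0.622 * e / (surface_pressure_hpa - e)


par = par_from_ssrd(440.0)
mixing_ratio = water_mixing_ratio_kg_kg(dewpoint_c=20.0, surface_pressure_hpa=1013.25)
print(f"PAR: {par:.1f} W m-2, mixing ratio: {mixing_ratio:.4f} kg kg-1")
```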
Since emissions are calculated on a monthly mean basis, the input meteorological data were synoptic monthly means of analyzed and forecast parameters. ERA-Interim data were available on a global grid with a horizontal spatial resolution of 0.5° x 0.5° and with 3 or 6 h time steps. The data were linearly interpolated in time in order to obtain a monthly averaged daily profile of each meteorological variable. ERA5 is the successor to ERA-Interim, with a higher horizontal spatial resolution of 0.25° x 0.25° and with 1 h time resolution. Interpolation between time steps was therefore no longer necessary in the case of ERA5.
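A monthly averaged daily profile, as used for the meteorological inputs, can be illustrated with a short Python sketch that averages all values sharing the same time of day. The function name and the two-day synthetic series are hypothetical; the actual inputs are synoptic monthly means provided by the reanalysis.

```python
def monthly_mean_daily_profile(values, steps_per_day=24):
    """Average consecutive sub-daily values into one mean value per time of day."""
    if len(values) % steps_per_day != 0:
        raise ValueError("series must cover a whole number of days")
    n_days = len(values) // steps_per_day
    profile = []
    for step in range(steps_per_day):
        # Collect the value at this time of day from every day in the series.
        total = sum(values[day * steps_per_day + step] for day in range(n_days))
        profile.append(total / n_days)
    return profile


# Synthetic two-day hourly series (illustrative numbers only).
series = [10.0 + hour for hour in range(24)] + [12.0 + hour for hour in range(24)]
profile = monthly_mean_daily_profile(series)
print(profile[0], profile[23])
```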

Vegetation description
The spatial distribution of vegetation in the MEGAN model is defined using plant functional types (PFTs). This is an alternative to describing vegetation by biomes (e.g. savanna, tundra). While biomes can consist of physiologically distinct vegetation types (e.g. grasses and trees), plant functional types group vegetation with similar leaf physiology. The use of PFTs leads to a less complex vegetation representation, but allows a physiologically based ecosystem description that is convenient for dynamic global vegetation models. The MEGAN model was designed to be coupled with the Community Land Model (CLM4) and therefore uses the same approach, i.e. representation of the global land cover with 16 PFT categories (Lawrence and Chase, 2007). Vegetation in each model grid cell is defined by the fractional coverage of each PFT. A list of the MEGANv2.1 PFT categories is given in Table 1.
Emissions in CAMS-GLOB-BIOv1.2 and v3.1 were calculated with a temporally invariant map of PFTs from the CLM4 model, representative of the year 2000. However, global land use and land cover are undergoing dramatic changes, e.g. deforestation in the tropical forests and replacement of forests by agricultural land (e.g. Song et al., 2018), which is expected to impact BVOC emissions. In order to capture the land cover change in MEGAN simulations, we replaced the static CLM4 PFT map with land cover data from the ESA-CCI (ESA, 2017). ESA-CCI data are provided by the Climate Change Initiative of the European Space Agency. The data consist of a time series of global annual mean land cover maps with high horizontal spatial resolution (300 m), based on satellite observations and available for the period 1992-2018. To be consistent with the MEGAN model, the ESA-CCI land cover categories were converted to PFT classes similar to those of CLM using the CCI-LC user tool v4.3 (Poulter et al., 2015). Emissions calculated with temporally varying land cover are included in the CAMS-GLOB-BIOv3.0 dataset.
Table 1 compares the global land areas covered by each PFT category in the CLM4 and ESA-CCI (year 2000, converted by the CCI-LC user tool) land cover maps. Note that although the Corn (Maize) category is included in the MEGAN PFT list, it is currently not distinguished from other crops and its spatial coverage is therefore zero. The two maps differ in the total area covered by vegetation, with ESA-CCI giving ~19% less vegetated area globally. In ESA-CCI, the extent of the tree and grass categories is ~25% lower, while the coverage by the crop category is almost 50% higher than in CLM.
Vegetation seasonality is represented by changes in leaf area index (LAI). LAI is a dimensionless parameter defined as the one-sided leaf area per area of the ground surface (m² m⁻²). The spatial and temporal distribution of LAI was obtained from processed observations of the MODIS instrument (Yuan et al., 2011). The 8-day observations were averaged to monthly means. The

Global emission potential data
Together with the vegetation description, emission potentials are a crucial parameter in BVOC emission estimation. In the following text we distinguish between emission factor (EF) and emission potential (EP). By EF we mean the emission of a chemical species from a specific plant or vegetation type under standard conditions of environmental parameters. EFs can be defined either as area-based values, i.e. an emission from a unit area covered by a specific plant or vegetation type (e.g. µg(species) m⁻²(ground cover) h⁻¹), or as mass-based values, i.e. an emission from a unit mass of the plant's dry leaf matter (e.g. µg(species) g⁻¹(dry leaf matter) h⁻¹). The EP of a grid cell is then a weighted sum over the plant or vegetation types it contains, written for area-based and mass-based EFs, respectively, as
$$ \mathrm{EP} = \sum_i f_i \, \mathrm{EF}_i \qquad \text{or} \qquad \mathrm{EP} = \sum_i f_i \, \mathrm{EF}_i \, D_i \qquad (2) $$
where f_i is the fraction of a grid cell covered by an individual plant or vegetation type and D_i (g(dry leaf matter) m⁻²) is the foliar density of the plant or vegetation type.
Emission factors in the MEGANv2.1 model are defined on a canopy-scale level as an emission under standard conditions from the full canopy. Above-canopy measurements of EFs are unfortunately limited; therefore, the canopy-scale EFs in MEGAN are still based on leaf- and branch-scale measurements which were extrapolated to the canopy level with a canopy environment model. MEGAN standard conditions are defined for a series of variables, such as LAI, leaf age composition of the canopy, and meteorological conditions (temperature, solar radiation, humidity, wind speed, soil moisture) of the current state and of the past (temperature and solar radiation). For more details see Guenther et al. (2006).

The MEGAN model has two options for defining emission potentials: either input emission potential maps are used for selected species, or the EP is calculated from vegetation coverage. These options are described in more detail in the two following sections.

Emission potentials from detailed global maps
The first option consists of the use of annual mean emission potential maps with high spatial resolution for the main BVOC species, i.e. isoprene, the main monoterpenes (α-pinene, β-pinene, myrcene, sabinene, limonene, trans-β-ocimene, Δ3-carene) and 2-methyl-3-buten-2-ol (MBO). The emission potential maps are available together with the MEGANv2.1 code (https://bai.ess.uci.edu/megan/data-and-code/megan21, access date 31/5/2021) and were created based on a detailed ecoregion description, combining information on species composition with species-specific emission factors and above-canopy flux measurements, where available (Guenther et al., 2012). Emission potentials for the rest of the modeled species were calculated based on the PFT coverage as described in the following section.

Emission potentials calculated from PFTs
The second option consists of calculating the EP from the vegetation composition of each grid cell. MEGAN uses 16 PFTs to describe vegetation in the model domain (listed in Table 1). Each PFT is assigned an emission factor value for each of the modeled species (see Table 2 in Guenther et al., 2012). The emission potential of each grid cell for a specific modeled chemical species is then calculated as the weighted sum defined in Eq. (2).
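The weighted sum over PFTs can be sketched in a few lines of Python. The PFT names, fractional coverages and emission factor values below are hypothetical placeholders chosen for illustration; they are not the values from Table 1 or from Guenther et al. (2012).

```python
def emission_potential_from_pfts(pft_fractions, pft_emission_factors):
    """Grid-cell EP (ug m^-2 h^-1) as the coverage-weighted sum of PFT EFs."""
    total_fraction = sum(pft_fractions.values())
    if total_fraction > 1.0 + 1e-9:
        raise ValueError("PFT fractions must not exceed the grid cell area")
    # Area-based EFs: each PFT contributes its EF scaled by its cell fraction.
    return sum(fraction * pft_emission_factors[pft]
               for pft, fraction in pft_fractions.items())


# Hypothetical grid cell: half broadleaf trees, a quarter crops, the rest bare.
fractions = {"broadleaf_deciduous_tree": 0.5, "crop": 0.25}
efs_ug_m2_h = {"broadleaf_deciduous_tree": 10000.0, "crop": 600.0}  # illustrative
ep = emission_potential_from_pfts(fractions, efs_ug_m2_h)
print(f"EP = {ep:.0f} ug m-2 h-1")
```

Swapping the `fractions` input for values derived from a different land cover product leaves the EFs untouched, which is what makes this option suitable for studying land cover change.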

We performed specific emission model runs to evaluate the difference in resulting emissions when emissions are calculated from EP detailed maps and from EP calculated based on the PFT coverage. All the other input parameters were kept the same.
Use of EPs calculated from the PFT coverage leads to a ~10% decrease of the global isoprene emission total when compared to the emission calculation based on the detailed EP maps. For β-pinene and other monoterpenes the difference is only 1-2%.
However, α-pinene emissions calculated from PFT coverage are more than 70% higher than emissions calculated from the EP maps. Therefore, the EFs assigned to each PFT tree category for α-pinene were revised based on recent updates of EFs for the ORCHIDEE model (Messina et al., 2016). The ORCHIDEE and MEGAN models differ in their definition of standard conditions, which means that the ORCHIDEE EFs needed to be converted to a MEGAN-suitable format. The conversion was done in a similar way as described in Sect. 2.5.2. The newly used α-pinene EFs are listed in Table 2.
Describing global vegetation by only 16 PFT categories is of course a simplification that inevitably brings inaccuracies, especially for categories such as broadleaf deciduous forest, which can contain tree species that are very low isoprene emitters alongside species such as oaks that are very strong isoprene emitters. On the other hand, such simplifications are often necessary due to a lack of detailed information on vegetation composition and/or on the associated emission factors, or are simply the result of a balance between the level of detail in the vegetation description and the ability of the model algorithm to digest such data.

Calculation of EPs from PFT coverage is to some extent inaccurate; on the other hand, it allows us to change the land cover description dataset (e.g. use ESA-CCI instead of CLM) and therefore to study the impact of land cover on the resulting emissions.

Update of isoprene emission potentials in Europe
The MEGAN global input emission potential maps for isoprene and the main monoterpenes were created based on information on global land cover distribution and vegetation composition in combination with an emission factor survey, incorporating results of flux measurement campaigns. Naturally, much of the information on emission factors and flux measurements originates in the tropics, as this is the region with the highest emission rates. As a result, the MEGAN emission potential maps are well suited for the tropical region, but may be less fitting in other parts of the world.
As detailed land cover data became available for Europe, studies focusing on the estimation of biogenic VOCs from a plant-specific vegetation description started to appear (e.g. Simpson et al., 1995; Karl et al., 2009; Oderbolz et al., 2013).
Several studies have shown large discrepancies between emissions calculated using species-specific emission factors and those calculated by MEGAN-based inputs (Rinne et al., 2009;Langner et al., 2012;Jiang et al., 2019). This motivated us to revise the input emission potential maps for isoprene in this region.

In this work, new maps of area-based isoprene emission potentials (EP, µg m⁻² h⁻¹) for the European area were created. These EP maps are based on detailed maps of forest species and other vegetation combined with Europe-specific emission factors for each species. The EP map update makes use of procedures developed over many years for the EMEP model (Simpson et al., 1995, 1999, 2012). The basic emission factors and LAI changes are taken from a high-resolution version of the EMEP model rv4.33 (Simpson et al., 2012).
The EMEP and MEGANv2.1 models differ in their definition of standard environmental conditions for emission factors. In MEGANv2.1, EFs are defined on a canopy-scale level, i.e. as an emission from the full canopy, under standardized canopy conditions of LAI, specific proportions of mature, growing and old foliage, current and previous air temperatures and radiation, humidity, wind speed and soil moisture (Guenther et al., 2012). The EMEP system, similar to the previous BVOC algorithms of Guenther et al. (1995), uses a leaf- and branch-level EF definition with standard conditions for leaf temperature (30°C) and photosynthetically active radiation (1000 µmol m⁻² s⁻¹) only. As canopy-scale EFs are not available for the vegetation species used for this isoprene EP update, a new map was created with leaf- and branch-level EFs. The new isoprene EP values were then converted to a MEGANv2.1-suitable format. There is unfortunately no accurate conversion equation that would satisfy all conditions. A rough conversion can be made following the recommendations of Arneth et al. (2011) and Messina et al. (2016) for conversion between the two systems. Details of the conversion of EMEP EFs to MEGANv2.1-suitable isoprene EPs are given in Sect. 2.5.2.

Land cover description and emission factors
The main basis for BVOC emissions in Europe in the EMEP system is a map of forest species generated by Köble and Seufert (2001), combined with species-specific EFs. The forest database provides maps for 115 tree species in 30 (mainly EU) European countries, based on a compilation of data from the ICP-forest network UN-ECE (1998). These data were further processed to the EMEP grid by the Stockholm Environment Institute at York (UK, S. Cinderby, pers. comm., 2004) in order to add data from other countries in the (2000-era) EMEP domain, and for non-forested vegetation.
More recently, the EMEP domain was significantly expanded to the east, and data for the expanded area, and indeed globally, make use of a merge of the GLC_2000 dataset (http://bioval.jrc.ec.europa.eu/products/glc2000/glc2000.php) and data from the Community Land Model (http://www.cgd.ucar.edu/models/clm/, Oleson et al., 2010; Lawrence et al., 2011) as described in Simpson et al. (2017). In order to provide a manageable number of PFTs for use in MEGAN, tree species were aggregated into six classes, as summarized in Table 3. For each grid cell, the grid-average emission potential of a specific PFT (EPPFT, µg m⁻² h⁻¹) was then calculated as a weighted average of all the individual tree species belonging to this PFT category:
$$ \mathrm{EP}_{\mathrm{PFT}} = \frac{\sum_i \mathrm{EF}_i \, D_i \, A_i}{\sum_i A_i} \qquad (3) $$
where i represents one of the many forest or vegetation species contained within that PFT, EF_i is the species foliage-level isoprene emission factor (µg g(dry leaf weight)⁻¹ h⁻¹), D_i is the species foliar density (g(dry leaf weight) m⁻²) and A_i is the species area (m²). Further details of this methodology, including the detailed composition of each PFT class as well as EF and D values for each considered tree species, can be found in Simpson et al. (2012) (Sect. 6.6, SI, S4.4).
For the European-domain runs used here, the EMEP model combines the PFT-specific EPs and max-LAI with latitude-dependent growing season dates as described in Simpson et al. (2012). For this work we have made use of a much finer grid resolution (0.1° x 0.1° latitude-longitude).
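The area-weighted average over tree species described above can be sketched as follows. This reflects our reading of the definitions given in the text (species EF multiplied by foliar density, weighted by species area); the species names and all numerical values are hypothetical and serve only to make the arithmetic concrete.

```python
def pft_emission_potential(species):
    """Area-weighted EP of one PFT in a grid cell (ug m^-2 h^-1).

    `species` is a list of (ef_ug_g_h, foliar_density_g_m2, area_m2) tuples.
    """
    total_area = sum(area for _, _, area in species)
    if total_area <= 0.0:
        raise ValueError("PFT must cover a positive area")
    # EF * D converts a mass-based EF into an area-based emission rate.
    weighted = sum(ef * density * area for ef, density, area in species)
    return weighted / total_area


# Hypothetical deciduous PFT: a strong emitter (oak-like) and a weak emitter.
deciduous = [
    (10.0, 300.0, 2.0e6),  # EF, foliar density, area of strong emitter
    (0.1, 500.0, 6.0e6),   # EF, foliar density, area of weak emitter
]
ep_pft = pft_emission_potential(deciduous)
print(f"EP_PFT = {ep_pft:.1f} ug m-2 h-1")
```

In this made-up case the strong emitter covers only a quarter of the PFT area but accounts for most of the resulting EP, which is the kind of within-PFT contrast a species-resolved map captures.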

For non-forest vegetation types (e.g. grasslands, seminatural vegetation) or for forest areas not covered by the Köble and Seufert (2001) maps (e.g. for eastern Russia), default emission factors taken from Simpson et al. (2012) were applied.
The crop category is the most difficult to deal with in terms of BVOC emissions, not least because the types of crops are not well known (and can change significantly over the years) and the growing seasons are almost impossible to specify. Here we used a simple system which defines the phenology and emission factors of crops using EMEP model definitions. For this study, the EMEP model's temperate crop (wheat, etc.), Mediterranean crop (maize, etc.) and root crop (potato, etc.) categories were aggregated into one Crop PFT.
The isoprene emission potential data (EPPFT) and the monthly changes in LAI for each of the six PFTs were provided for a European domain spanning 30.05° to 71.95° in latitude and -29.95° to 65.95° in longitude with 0.1° x 0.1° spatial resolution.

Conversion of isoprene emission potential map for MEGAN
In order to satisfy the MEGANv2.1 definition of standard conditions, the EMEP-based isoprene emission potential maps needed to be converted. As discussed earlier, there is unfortunately no precise way to perform such a conversion. According to where LAIstd is the standard leaf area index in the MEGAN model, equal to 5 m² m⁻², i is an index over the EMEP PFT categories (Table 3), f is the fraction of a grid cell covered by a specific PFT, LAI and LAImax are the monthly and maximum leaf area index of the PFT category (see Table 3), respectively, and EPPFT is the EMEP-based isoprene emission potential for a specific PFT category.
Monthly isoprene emission potential maps for Europe were then embedded into the global domain of the MEGAN gridded emission potential maps. These new global isoprene EPs were used in the calculation of the CAMS-GLOB-BIOv3.1 emission inventory.

Results and discussion
The following sections present examples of the spatial and temporal distribution of emissions in the CAMS-GLOB-BIO inventories on global and regional scales (Sect. 3.1 and 3.2). Section 3.3 focuses in more detail on the impact of land cover change on isoprene emissions, and Section 3.4 on the update of the isoprene emission potential data in Europe, showing the differences between emissions calculated with the MEGAN-default and the updated input EP maps. Section 3.5 presents a comparison of CAMS-GLOB-BIO isoprene and monoterpene emissions with other available datasets. Table 4 summarizes the data availability in each CAMS-GLOB-BIO inventory as well as the different input parameters used to calculate each dataset.

Global distribution of BVOC emissions
The annual global totals averaged over the respective periods are listed in Table 5 for the BVOC species available in the CAMS-GLOB-BIOv1.2, v3.0 and v3.1 datasets. Though the absolute values differ between the datasets, the species responsible for the majority of the global BVOC total are common to all three inventories. The most abundant species is isoprene (64%), followed by the sum of monoterpenes (13%), methanol (7%), acetone (4%), ethene (3.6%), sesquiterpenes (2.5%), propene (2%), acetaldehyde (1.4%) and ethanol (1.3%). The numbers in brackets represent the species contribution to the global BVOC total when expressed as Tg(C) yr⁻¹, averaged over the three datasets. The remaining species together contribute less than 10 Tg(C) yr⁻¹, i.e. less than 2%. Note that for monoterpenes we provide emissions of α-pinene, β-pinene and other monoterpenes.
ERA-Interim data are available with a 3 or 6 h time step; to obtain hourly input fields for the MEGAN model, the data therefore needed to be temporally interpolated. Such interpolation leads to an underestimation of meteorological parameters, especially of air temperature and solar radiation, as the interpolated fields do not capture the noon peak hours at locations where the peak falls between the model time steps. In the case of ERA5, the data are available with hourly time steps and temporal interpolation is no longer needed. The annual mean ERA5 values of air temperature and solar radiation are therefore higher than those of ERA-Interim, especially in the highly emitting regions of South America, central Africa, south-east Asia and Indonesia, which is reflected accordingly in higher modeled emissions. Similar effects of the temporal resolution of the input climate data on isoprene emissions were discussed by e.g. Ashworth et al. (2010).
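The clipping of the daily peak by temporal interpolation can be demonstrated with a small Python sketch. It samples a synthetic diurnal temperature cycle every 6 h, interpolates linearly back to hourly values, and compares the daily maxima. The sinusoidal cycle, its amplitude and its 14:00 peak are assumptions made for illustration and do not represent reanalysis data.

```python
import math


def true_temperature_c(hour):
    """Synthetic diurnal cycle peaking at 14:00 (illustrative only)."""
    return 20.0 + 8.0 * math.cos(2.0 * math.pi * (hour - 14.0) / 24.0)


def interpolate_linear(sample_hours, sample_values, hour):
    """Linear interpolation between the two samples that bracket `hour`."""
    pairs = list(zip(sample_hours, sample_values))
    for (h0, v0), (h1, v1) in zip(pairs, pairs[1:]):
        if h0 <= hour <= h1:
            weight = (hour - h0) / (h1 - h0)
            return v0 + weight * (v1 - v0)
    raise ValueError("hour outside sampled range")


sample_hours = [0, 6, 12, 18, 24]  # 6-hourly time steps
sample_values = [true_temperature_c(h) for h in sample_hours]

hours = range(25)
peak_true = max(true_temperature_c(h) for h in hours)
peak_interp = max(interpolate_linear(sample_hours, sample_values, h) for h in hours)
print(f"true peak: {peak_true:.2f} C, interpolated peak: {peak_interp:.2f} C")
```

In this synthetic case the peak at 14:00 falls between the 12:00 and 18:00 samples, so the interpolated series never reaches it.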
The largest difference in the global emission total can be observed between the CAMS-GLOB-BIOv3.1 and v3.0 inventories. The v3.0 emission total is more than 160 Tg(C) yr⁻¹ lower than in v3.1, with the most significant difference for isoprene estimates, which are more than 140 Tg(isoprene) yr⁻¹ lower in v3.0 than in v3.1. Both inventories are calculated with ERA5 meteorology, but they differ in the setup of the input emission potential data and, most importantly, in the underlying land cover description.
In the calculation of v3.0 we switched from using static CLM land cover maps to annually changing ESA-CCI land cover data in order to capture the effect of land cover change on emissions. To include the effect of changing land cover information in the model, the input gridded emission potential maps (described in Sect. 2.4.1) had to be replaced by the calculation of emission potentials from PFT distributions (described in Sect. 2.4.2). As discussed in Sect. 2.4.2, such a change in the model setup leads to a ~10% decrease of isoprene emissions on the global scale. The rest of the isoprene decrease can be explained by the different land cover distributions in the CLM and ESA-CCI datasets. Sensitivity emission model runs using exactly the same input data except for the definition of land cover distribution resulted in an isoprene annual global total of 427 Tg(isoprene) yr⁻¹ when using CLM and 316 Tg(isoprene) yr⁻¹ when using ESA-CCI, i.e. almost 30% difference. As shown in Table 1, the total vegetated area is more than 18 million km² (19%) smaller in the ESA-CCI than in the CLM maps. Significant differences between the two vegetation maps are visible in the tropical region, which is the source of ~80% of global isoprene emissions (Guenther et al., 2012; Sindelarova et al., 2014).
The extent of broadleaf evergreen and deciduous tree cover in ESA-CCI is about 25% lower than in CLM.

The global spatial distribution of CAMS-GLOB-BIOv3.1 emissions for selected species is presented in Fig. 1

Regional distribution of emissions
The CAMS-GLOB-BIOv3.1 emissions of the main BVOC species for the year 2000 were further analyzed to show their regional contribution to the global totals. We have used the regions defined under the GlobEmission project (https://www.globemission.eu/), which divide the globe into nine emitting areas. The spatial extent of the regions is given in Table 6 and shown in Fig. 3. Table 6 presents the annual emissions of isoprene, monoterpenes, methanol, acetone, sesquiterpenes and ethene from each of the regions together with their relative contribution to the global total. For all species except methanol, more than 70% of emissions originate in the tropical regions of South America, East Africa and Southeast Asia, and 10-18% of emissions have their source in the northern latitudes (North America, Europe and Russia). Isoprene production in Europe and Russia is especially low when compared to other species, with less than 1% and 2% of the global total, respectively. For methanol, the tropics contribute only 63%, and almost 25% of methanol is produced in the northern latitudes, mainly in North America and Russia.
Table 6. Regional annual emissions for CAMS-GLOB-BIOv3.1 isoprene, monoterpenes, methanol, acetone, sesquiterpenes and ethene expressed as Tg(species) yr⁻¹ and as a percentage of the global total.

Impact of land cover change on isoprene emissions
The impact of changing land cover on emissions is captured in the CAMS-GLOB-BIOv3.0 dataset. To illustrate the effect of changing land cover on isoprene, the 20-year time series of isoprene annual totals was fitted with a linear regression trend and compared to data from v3.1, for which a static CLM land cover map was used. When calculated with the static vegetation map, isoprene emissions increase globally by 0.35% yr⁻¹ due to temporal changes in meteorology. When annually changing ESA-CCI data are implemented, the trend decreases to 0.24% yr⁻¹. A similar observation was made by Opacka et al. (2021), who used modified MODIS land cover data in the MEGAN-MOHYCAN emission model to study the impact of land cover change on isoprene emissions. They found a 0.04 to 0.33% yr⁻¹ mitigating effect of land cover change on the generally positive trends of isoprene induced mainly by temperature and solar radiation.
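A relative linear trend of this kind can be computed as in the following Python sketch, which fits an ordinary least-squares line to annual totals and expresses the slope as a percentage of the series mean. Normalizing by the period mean is an assumption of this sketch, since the text does not state the reference value used, and the annual totals are synthetic numbers rather than inventory data.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope and intercept of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def relative_trend_percent_per_year(years, values):
    """Slope expressed as % of the series mean per year (assumed normalization)."""
    slope, _ = linear_trend(years, values)
    return 100.0 * slope / (sum(values) / len(values))


# Synthetic 20-year series of annual totals (illustrative only).
years = list(range(2000, 2020))
totals = [400.0 + 1.4 * (year - 2000) for year in years]
trend = relative_trend_percent_per_year(years, totals)
print(f"trend: {trend:.2f} % yr-1")
```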

Isoprene emission update in Europe
Updated isoprene emission potential values in Europe, described in more detail in Sect. 2.5, were used to calculate isoprene emissions in CAMS-GLOB-BIOv3.1. The spatial distribution of annual mean isoprene emissions in Europe is presented in Fig. 5, where CAMS-GLOB-BIOv3.1 emissions are compared with emissions obtained directly from the EMEP model (v4.33, Simpson et al., 2019) and with isoprene emissions calculated with MEGAN using the same settings as v3.1 (i.e. meteorology, PFT distribution, LAI), but with the MEGAN-default emission potential maps instead of the updated EPs.
Figure 5 shows good agreement in the spatial distribution and amount of calculated emissions between the EMEP and v3.1 emissions, which supports the approach of the updated EP calculation and of the conversion from EMEP inputs to MEGAN format. It can also be seen that the spatial distribution of isoprene emissions changes when the updated EPs are applied. Emissions calculated with MEGAN-default EPs are more uniformly distributed over the European domain, while v3.1 emissions are more localized, with isoprene hotspots in areas covered by highly emitting tree species, e.g. in Portugal, Spain, southern France and the Balkan peninsula.

Use of updated instead of MEGAN-default EPs in Europe leads to a 35% decrease in the isoprene annual total, from 10.03 Tg yr⁻¹ (v3.1 without EP update) to 6.55 Tg yr⁻¹ (v3.1). Isoprene monthly totals from these two datasets are compared to CAMS-GLOB-BIOv1.2 (annual total of 10.5 Tg yr⁻¹) and to isoprene estimated by EMEP (

Comparison of CAMS-GLOB-BIO emissions with other inventories
Time series of CAMS-GLOB-BIO emissions of isoprene and monoterpenes were compared to other available data. We focus on isoprene and monoterpenes as these are the two most abundant BVOC species and the two species for which time series from other sources are available. For the remaining species, time-varying data from other sources are unfortunately lacking.
Datasets gathered for this comparison are listed in Table 7. Each dataset is listed with basic information such as the model used for emission estimation and the driving meteorology. Most of the inventories are so-called 'bottom-up', i.e. modeled by an emission model based on meteorology, emission factors and vegetation distribution. There are two 'top-down' datasets, IASB-TD-OMI and IASB-TD-GOME2, which were calculated by an emission model and then constrained with satellite observations of formaldehyde (from OMI and GOME2) by applying an inversion technique in a chemical transport model (IMAGESv2) (Stavrakou et al., 2014, 2015). Most of the inventories were calculated with a 'MEGAN-like' emission model algorithm, except for the GUESS dataset, which was estimated by the process-based model LPJ-GUESS (Smith et al., 2001; Sitch et al., 2003).
The IASB datasets were obtained from the website of the GlobEmission project (http://www.globemission.eu/). The rest of the data were obtained from the ECCAD database (http://eccad.aeris-data.fr/).
In this data collection, the highest isoprene and monoterpene emissions are estimated by the MEGAN-MACC dataset. This dataset was calculated based on the MERRA/MERRA2 reanalyses (Rienecker et al., 2011) and therefore differs in its meteorological inputs from most of the remaining datasets, which used ERA meteorological fields (ERA-Interim and ERA5). The key role of meteorology in BVOC estimation is also supported by the relatively good agreement of CAMS-GLOB-BIOv1.2 isoprene emissions with the IASB datasets, especially IASB-BU-OMI and IASB-TD-GOME2, which are both calculated based on ERA-Interim fields. Furthermore, the impact of meteorological inputs can also be observed in the difference between the CAMS-GLOB-BIOv1.2 and v3.1 estimates, based on ERA-Interim and ERA5, respectively, which was already discussed in Sect. 3.1.
Inter-annual variability is similar for most of the datasets, as they are mostly driven by ERA meteorology. The exception is again MEGAN-MACC, based on the MERRA reanalysis, for which the amplitude is higher. For all datasets there is a clear link between isoprene emissions and El Niño / La Niña phenomena. As also presented in Fig. 2
For monoterpenes the inter-annual variability is not as pronounced as for isoprene. Similar to isoprene, monoterpenes are strongly emitted in the tropical region, but they also have significant sources in the temperate and boreal forests of the northern hemisphere. As a result, they are not as susceptible to atmospheric changes in the tropical band as isoprene and keep a rather stable inter-annual profile.
Monitoring Service project (CAMS, Global and Regional Emissions) as part of the European Union's Copernicus Earth Observation Programme.
CAMS-GLOB-BIOv3.1 estimates a global annual total BVOC emission of 591 Tg(C) yr⁻¹, with isoprene as the main contributing species (440.5 Tg(isoprene) yr⁻¹). Use of ERA5 meteorology in v3.1 leads to a slight increase of BVOC emissions compared to v1.2, which has a global BVOC total of 532 Tg(C) yr⁻¹ (including 385.2 Tg(isoprene) yr⁻¹). The total emission in the CAMS-GLOB-BIOv3.0 dataset is 424 Tg(C) yr⁻¹, with isoprene emissions of 299.1 Tg(isoprene) yr⁻¹. The difference between the v3.1 and v3.0 estimates can mostly be attributed to the use of an alternative land cover map for vegetation description.
CAMS-GLOB-BIOv3.1 includes isoprene estimates in Europe calculated with an updated map of emission potential values. These values are based on fine-scale land cover with detailed maps of tree species and should therefore better represent the composition of European forests than the global EP maps of the MEGAN model. Use of the updated isoprene EP maps led to a substantial decrease of the European isoprene emission total, by 35%, and changed the spatial distribution of emissions.
Isoprene emissions are concentrated in several emission hot spots in locations covered by highly-emitting tree species.
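The relative differences between the datasets can be checked directly from the isoprene totals quoted in the text. The short sketch below does this arithmetic. The helper name and variable names are hypothetical, and only numbers stated in this paper are used.

```python
def relative_change_percent(reference, value):
    """Change of `value` relative to `reference`, in percent."""
    return 100.0 * (value - reference) / reference


# European isoprene totals quoted in the text, in Tg per year.
europe_default_ep = 10.03  # v3.1 settings with MEGAN-default EPs
europe_updated_ep = 6.55   # v3.1 with updated EPs

# Global isoprene totals quoted in the text, in Tg(isoprene) per year.
global_isoprene = {"v1.2": 385.2, "v3.0": 299.1, "v3.1": 440.5}

europe_change = relative_change_percent(europe_default_ep, europe_updated_ep)
v31_vs_v12 = relative_change_percent(global_isoprene["v1.2"],
                                     global_isoprene["v3.1"])
v30_vs_v31 = relative_change_percent(global_isoprene["v3.1"],
                                     global_isoprene["v3.0"])

print(f"Europe, updated vs default EPs: {europe_change:.1f} %")
print(f"Global, v3.1 vs v1.2:           {v31_vs_v12:.1f} %")
print(f"Global, v3.0 vs v3.1:           {v30_vs_v31:.1f} %")
```

The European value rounds to the 35% decrease quoted above, and the v3.0 to v3.1 difference is consistent with the roughly 30% decrease in annual total isoprene reported for v3.0.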
Both the v3.1 and v1.2 estimates are based on a static land cover description obtained from the Community Land Model (CLM4). Since the world's vegetation is experiencing significant changes, such as deforestation in the tropical region, replacement of forests by agricultural land and afforestation efforts with fast-growing trees, we aimed to take this effect into account. Dataset CAMS-GLOB-BIOv3.0 considers changes in global land cover by using the ESA-CCI annual land cover maps for vegetation description in the MEGAN model. In order to use a new land cover input in the model, the emission potentials had to be calculated from the PFT distribution instead of using the high-resolution emission potential maps. This difference in input EP data, together with the different input land cover map (ESA-CCI instead of CLM4), caused a decrease of ~30% for annual total isoprene and ~20% for monoterpenes when compared to CAMS-GLOB-BIOv3.1. The linear trend analysis of the 20-year time series of global isoprene emissions showed that inclusion of time-varying land cover data reduces the overall increasing trend of isoprene from 0.35 % yr⁻¹ to 0.24 % yr⁻¹. The trend slowdown is even more pronounced in the tropical regions of South America and Southeast Asia, where according to ESA-CCI data a retreat of tropical broadleaf forest can be observed. On the other hand, due to expansion of broadleaf deciduous trees, the increasing isoprene trend is intensified in regions such as East and Central Africa or Russia.
Time series of CAMS-GLOB-BIO isoprene and monoterpene emissions were compared to other available data. The estimates fall well within the range of values from other studies. The comparison nevertheless shows that the uncertainty in emission estimates is quite large: on the global scale it can reach a factor of 2 to 3, with even higher values on the regional level. The differences between the estimates of the individual CAMS-GLOB-BIO versions point to the main drivers of this uncertainty, i.e. meteorological inputs, definition of input emission potentials and land cover distribution.
The presented CAMS-GLOB-BIO datasets provide high-resolution data of global BVOC emissions for a recent 20-year period based on up-to-date input data. The datasets are suitable for the purposes of air quality modelling, especially for models that do not include their own module for online BVOC emission estimation. Our general recommendation is to use the CAMS-GLOB-BIO dataset that was calculated with the same meteorology as the one driving the air quality model. If no such dataset is available, we recommend using the latest CAMS-GLOB-BIOv3.1 dataset. CAMS-GLOB-BIOv3.0 should be used for studies focusing on land cover change.
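The recommendation above can be written as a small decision helper. This is only a sketch of one possible reading: the function name, its arguments and the order of the checks (land cover change studies first, then matching meteorology) are assumptions for this example. The mapping of dataset versions to ERA-Interim and ERA5 follows the description of the datasets in this paper.

```python
def recommend_dataset(model_meteorology=None, land_cover_change_study=False):
    """Suggest a CAMS-GLOB-BIO version (hypothetical helper, not an official tool).

    model_meteorology: name of the reanalysis driving the air quality model,
        e.g. "ERA-Interim" or "ERA5"; None or any other value means no
        matching dataset exists.
    land_cover_change_study: True if the study focuses on land cover change.
    """
    if land_cover_change_study:
        # v3.0 is the only version using annually varying ESA-CCI land cover.
        return "CAMS-GLOB-BIOv3.0"
    if model_meteorology == "ERA-Interim":
        # v1.2 is the version driven by ERA-Interim.
        return "CAMS-GLOB-BIOv1.2"
    # ERA5-driven models, and models with any other meteorology,
    # fall back to the latest dataset.
    return "CAMS-GLOB-BIOv3.1"


print(recommend_dataset("ERA-Interim"))
print(recommend_dataset("ERA5"))
print(recommend_dataset("MERRA2"))
print(recommend_dataset("ERA5", land_cover_change_study=True))
```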

Author contribution
KS performed the emission model runs, created the emission inventories and analyzed the data. JM contributed to the preparation of model inputs and emission processing. DS provided data for the update of isoprene emission potentials in Europe and helped with their conversion to MEGAN model format and with the analysis of emissions. PH and JK participated in the emission modelling. SD and CG performed the emission data formatting and upload to the emission database. KS wrote the manuscript with contributions from all co-authors.