High-resolution land-use land-cover change data for regional climate modelling applications over Europe - Part 1: The plant functional type basemap for 2015

The concept of plant functional types (PFTs) is shown to be beneficial in representing the complexity of plant characteristics in land use and climate change studies using regional climate models (RCMs). By representing land use and land cover (LULC) as functional traits, responses and effects of specific plant communities can be directly coupled to the lowest atmospheric layers. To meet the requirements of RCMs for realistic LULC distribution, we developed a PFT dataset for Europe (LANDMATE PFT Version 1.0; Reinhart et al., 2021b). The dataset is based on the high-resolution ESA-CCI land cover dataset and is further improved through the additional use of climate information. Within the LANDMATE PFT dataset, satellite-based LULC information and climate data are combined to create the representation of the diverse plant communities and their functions in the respective regional ecosystems while keeping the dataset as flexible as possible for application in RCMs. Each LULC class of ESA-CCI is translated into a PFT or PFT fractions, including climate information by using the Holdridge Life Zone concept. Through the consideration of regional climate data, the resulting PFT map for Europe is regionally customized. A thorough evaluation of the LANDMATE PFT dataset is done using a comprehensive ground truth database over the European Continent. A suitable evaluation method has been developed and applied to assess the quality of the new PFT dataset. The assessment shows that the dominant LULC types, cropland and woodland, are well represented within the dataset, while uncertainties are found for some less represented LULC types. The LANDMATE PFT dataset provides a realistic, high-resolution LULC distribution for implementation in RCMs and is used as the basis for the LUCAS LUC dataset introduced in the companion paper by Hoffmann et al. (submitted), which is available for use as LULC change input for RCM experiment setups focused on investigating LULC change impact.


Datasets & concepts

ESA-CCI LC
The European Space Agency Climate Change Initiative (ESA-CCI) provides continuous global land cover maps (ESA-CCI LC) on ∼300 m horizontal grid resolution. The ESA-CCI LC maps are available for download in annual time steps for the years 1992-2018 (ESA, 2017). The classification of the LC maps follows the United Nations Land Cover Classification System (UN-LCCS) protocol (Di Gregorio, 2005) and consists of 22 level 1 classes and 14 additional level 2 classes, which include regional specifications. More information on ESA-CCI LC data processing can be found at maps.elie.ucl.ac.be/CCI/viewer/download/ESACCI-LC-Ph2-PUGv2_2.0.pdf. An overview of the satellite missions involved in the production of ESA-CCI LC is given in table 2. Besides systematic global validation efforts (ESA, 2017; Hua et al., 2018), a few regional approaches investigated the quality of ESA-CCI LC over Europe (Vilar et al., 2019; Reinhart et al., 2021a).

Table 2 footnotes (as recovered): (Hastings and Emery, 1992); 4 SPOT Vegetation satellite program (Maisongrande et al., 2004); 5 Project for On-Board Autonomy - Vegetation (Dierckx et al., 2014); 6 Ocean and Land Colour Instrument (OLCI) and Sea and Land Surface Temperature Radiometer (SLSTR) (Donlon et al., 2012).

E-OBS Climate data
The E-OBS dataset (Cornes et al., 2018) is a daily gridded observational dataset derived from station observations from European countries, covering the period from 1950 to 2020. The point observations are interpolated using a spline method with random perturbations in order to produce an ensemble of realizations. For the creation of the HLZs that are used for the conversion of ESA-CCI LC classes to PFTs (Section 2.2.5), the ensemble mean of the 2 m temperature (TG) and precipitation (RR) on a regular 0.1° grid from E-OBS version 19.0e is used. It covers most of Europe, some parts of the Middle East and a narrow strip of Northern Africa.

The Climate Research Unit (CRU) TS 4.03 dataset is a global gridded high-resolution climate dataset based on station observations, produced and maintained by the CRU of the University of East Anglia (Harris et al., 2014). The dataset provides global monthly means of climate parameters at 0.5° resolution from 1901 to 2019. In order to achieve the target resolution of 0.1° for the global LANDMATE PFT maps, the CRU climate data is downscaled using bilinear interpolation. Following Hoffmann et al. (2016), distance-weighted interpolation was applied to the atmospheric observation dataset CRU to extrapolate the climate data to the coastlines of the ESA-CCI LC maps in order to compensate for the different land-sea masks of the products.
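The bilinear downscaling step from a 0.5° to a 0.1° grid can be illustrated with a minimal sketch. The function name `bilinear_regrid`, the coordinates and the synthetic field are assumptions made for illustration and are not taken from the LANDMATE processing chain; the sketch also ignores the coastline extrapolation described above.

```python
import numpy as np

def bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a 2-D field from a coarse to a finer regular grid.

    All coordinate vectors are 1-D and ascending. Destination points are
    assumed to lie inside the source grid (no extrapolation is attempted).
    """
    # Linear interpolation along longitude for every source latitude row ...
    tmp = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # ... then along latitude for every destination longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in tmp.T]).T

# Hypothetical 0.5 degree field (a plane, so the result is easy to check)
src_lat = np.arange(45.0, 47.01, 0.5)
src_lon = np.arange(5.0, 7.01, 0.5)
coarse = src_lat[:, None] + 2.0 * src_lon[None, :]

# Target 0.1 degree grid covering the same area
dst_lat = np.linspace(45.0, 47.0, 21)
dst_lon = np.linspace(5.0, 7.0, 21)
fine = bilinear_regrid(coarse, src_lat, src_lon, dst_lat, dst_lon)
```

Applying two one-dimensional linear interpolations in sequence is equivalent to bilinear interpolation on a rectilinear grid, which keeps the sketch free of dependencies beyond NumPy.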

LUCAS - land use and land cover survey
The harmonized LUCAS in situ land cover and use database for field surveys from 2006 to 2018 (d'Andrimont et al., 2020) is the most consistent ground truth database for the European Continent. The survey was carried out at three-yearly intervals between 2006 and 2018. The systematic sampling design of the survey consists of a theoretical, regular grid over the European Continent with ∼2 km grid size. The reference point locations are the corner points of the theoretical grid. Not all locations within the survey were easily accessible. Therefore, the survey is supported by in situ photo interpretation, in-office photo interpretation and satellite data in the latest time steps 2015 and 2018 (table 4). However, the main proportion of the reference points was recorded through location visits at all time steps, which makes this land survey the most reliable and consistent ground truth database for Europe. [...] year 2015 are employed (Sect. 4). In order to avoid confusion between the FPS LUCAS and the LUCAS ground truth dataset, the latter is referred to as Ground Truth Survey or GT-SUR in the following.
Cross-walking procedure - ESA-CCI LC classes to PFTs
The cross-walking procedure (CWP) from ESA-CCI LC classes to PFTs presented in this article is generally based on (1) the translation introduced by Poulter et al. (2015) and (2) the translation by Wilhelm et al. (2014). The two translations are not just combined with each other but modified using additional data. The following sections introduce the PFTs of LANDMATE PFT aggregated into general LULC types and give an overview of the decisions on modifications that are made during the production process based on literature and additional data. The final LANDMATE PFT map is shown in fig. 3.

Trees and shrubs, tropical and temperate | PFT 1-8
The LANDMATE PFTs are more diversified regarding tree-PFTs than the generic ESA-CCI PFTs. While the generic ESA POULTER PFTs have four shrub-PFTs, the LANDMATE dataset has only two, and the tree-PFT count was increased to six. The increase of tree-PFT diversity is done in order to address the strong biogeophysical impacts of forested areas on regional and local climate, such as decreased albedo and increased roughness length (Bright et al., 2015). The effects of forested areas on near-surface climate are distinctly different from the effects of shrub or grass covered areas, and also depend strongly on tree species composition and latitudinal range (Bonan, 2008; Richardson et al., 2013). Another reason for the six tree-PFTs is the intended use of the PFT maps in RCMs. In the Land Surface Models (LSMs) of current-generation RCMs, a distinction is more commonly made between different tree or tree community types than between different shrub types. Therefore, and with regard to the implementation process that needs to be done for each RCM individually, an increase in the number of tree-PFTs and a decrease in the number of shrub-PFTs is considered to be convenient. Accordingly, the tree and shrub proportions were distributed following both the needleleaf and broadleaf definitions of the ESA-CCI LC classes and the HLZ map, where the HLZ map was decisive for an assignment of forest proportions to the temperate or tropical tree-PFT, respectively. Following a comparison with different forest datasets over Europe (not shown), the tree proportions in the translation of the mixed land cover classes (e.g. class 61 - Tree cover, broadleaved, deciduous, closed (>40%)) are increased to be in line with the indicated overall forest amount over Europe.
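The idea of a climate-dependent cross-walk can be sketched as a lookup table plus a rule that splits generic tree and grass proportions by climate. Everything in the following example is hypothetical: the fractions, the PFT names, the set `TROPICAL_ZONES` and the boolean C4 switch are placeholders chosen for illustration and are not the published LANDMATE values or rules.

```python
# Hypothetical cross-walking table: ESA-CCI LC class -> {generic PFT: fraction}.
# The fractions are illustrative placeholders, not the LANDMATE values.
CROSSWALK = {
    61: {"broadleaf_deciduous_tree": 0.8, "shrub": 0.1, "grass": 0.1},
    130: {"grass": 1.0},
}

# Assumed grouping of Holdridge Life Zone labels into "tropical"
TROPICAL_ZONES = {"tropical", "subtropical"}

def to_pft_fractions(lc_class, life_zone, c4_region):
    """Translate one land cover class into climate-dependent PFT fractions."""
    out = {}
    for pft, frac in CROSSWALK[lc_class].items():
        if pft.endswith("_tree"):
            # The life zone decides between the tropical and temperate tree-PFT
            prefix = "tropical" if life_zone in TROPICAL_ZONES else "temperate"
            name = f"{prefix}_{pft}"
        elif pft == "grass":
            # A map of potential C4 vegetation decides between C3 and C4 grass
            name = "c4_grass" if c4_region else "c3_grass"
        else:
            name = pft
        out[name] = out.get(name, 0.0) + frac
    return out

example = to_pft_fractions(61, "cool temperate", c4_region=False)
```

Keeping the table separate from the climate rules mirrors the description above, where the class-wise proportions and the HLZ-based assignment are distinct decisions.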

Grassland | PFT 9 & 10
The generic ESA POULTER PFTs include a natural grassland- and a managed grassland-PFT to include grassland and cropland, respectively. The LANDMATE PFTs include two grassland-PFTs, distinguishing between C3 and C4 grass. The contrasting photosynthetic pathways and therefore contrasting synthetic response to CO2 and temperature determine specific ecosystem functions for both PFTs. The main differences are found in global terrestrial productivity and water cycling (Lattanzi, 2010; Pau et al., 2013). The translation from the LULC classes that contain grassland proportions into C3 or C4 grass-PFTs is supported by a map of potential C4 vegetation by Wei et al. (2014), where the potential global distribution of C4 is estimated using bioclimatic parameters (Sect. 2.2.6).

Tundra and swamps | PFT 11 & 12
The specific vegetation PFTs tundra and swamps are treated individually in LANDMATE PFT. Tundra is mostly used for the [...] to an improved knowledge base on how to translate LULC classes into PFTs for climate models. Particular focus is laid on mosaic classes and the sparsely vegetated classes, which are numerous in ESA-CCI LC. Therefore, the CWP from Li et al. (2018) for cropland is adopted into the present CWP.

The irrigated cropland-PFT (PFT 14, see fig. 3) is currently empty in the LANDMATE PFT map Version 1.0. This decision is made following intense research on available irrigation information. The ESA-CCI LC map that is used as initial input contains an "irrigated cropland" class, but this information was not used in the process. The investigation on irrigated areas included the comparison of ESA-CCI LC to other available products, such as the irrigation map from the FAO (Siebert et al., 2005). Although the ESA-CCI LC quality assessment shows a very good agreement of the ESA-CCI LC irrigated cropland with the validation database (ESA, 2017), the comparison showed considerable differences between the products. The success of detection of irrigated areas is highly dependent on the correct detection of the crop types to infer the water needs of the respective crops, on atmospheric and environmental conditions and on the availability of multi-temporal, high-resolution imagery (Bégué et al., 2018; Karthikeyan et al., 2020). Further, most remote sensing applications depend highly on ground truth data and local knowledge. Applications using different satellite imagery to detect agricultural management practices, such as irrigation, are only successfully tested and applied in local spatial units (Rufin et al., 2019; Ottosen et al., 2019). Therefore, the irrigated cropland PFT remains unoccupied for now. Nevertheless, PFT 14 is defined within LANDMATE PFT Version 1.0 for the purpose of adding irrigated LULC fractions in the future. For the long-term LUCAS LUC dataset (Hoffmann et al., submitted), which is extended backward and forward based on the LANDMATE PFT map for Europe 2015, irrigated cropland areas are already implemented following the irrigated area definition of the Land Use Harmonization (LUH2) dataset (Hurtt et al., 2011).

Non-vegetated | PFT 15 & 16
The surface or as an equal to rock surface as done in several RCM approaches cannot account for the complex geobiophysical processes associated with an urban agglomeration (Daniel et al., 2019;Belda et al., 2018). Due to the distinction of the two surface types, the LANDMATE PFT map can be used for impact studies with an urban focus.

Water, permanent snow & ice
The LANDMATE PFTs do not include individual PFT definitions for water and snow/ice. Regarding the water representation, most currently used RCMs utilize a land-sea mask to account for oceans and inland water areas. Therefore, an explicit definition of water as an individual PFT has not been implemented. Consequently, all water fractions, such as marine water, lakes and rivers, are set to no data. In the present translation, the snow/ice grid cells from ESA-CCI land [...]

The LANDMATE PFT map is based on the ESA-CCI LC map, which was quality checked and compared to similar LULC products on a global (ESA, 2017; Yang et al., 2017; Hua et al., 2018; Li et al., 2018) and regional level (Reinhart et al., 2021a; Vilar et al., 2019). However, the translation from LULC classes to PFTs necessarily changes the map. The final product, the LANDMATE PFT map, is intended to be used in RCMs, which means the quality of the final product must be assessed in addition to the available quality assessments of the initial ESA-CCI LC map. In order to overcome the non-negligible resolution difference between LANDMATE PFT and the reference data GT-SUR, the LANDMATE PFT map is prepared on 0.018° horizontal resolution, which corresponds closely to the 2 km theoretical grid of GT-SUR.

[...] and the reference data are often different in structure and nomenclature, given that ground truth reference data is mostly collected as point data and independently from the assessed map product (Foody, 2002; Wulder et al., 2006; Olofsson et al., 2014). In order to produce reliable quality information for LANDMATE PFT, the present assessment closely follows the well-established good-practice recommendations. Nevertheless, adjustments are made to account for the fractional structure of LANDMATE PFT. Section 4.2 provides additional information on the requirements of a "good practice" accuracy assessment, the key components and the selected sampling design and metrics.

Research area
The coverage of GT-SUR in the year 2015 includes 28 countries, which are highlighted in dark grey in fig. 4. Figure 4 also shows the 2.5° grid that was used for the analysis of the accuracy assessment results (Sect. 5). Due to the fine scale and the high number of points over the whole research area, the visualization of the spatial analyses on continental scale is challenging. Therefore, the research area is split up through an overlay of a 2.5° grid (as shown in fig. 4). The overall and class-wise accuracy results for all points within each 2.5° grid cell are aggregated in order to identify large-scale spatial quality differences for the analyzed LULC types. Additionally, the total number of points for each LULC type per grid cell is displayed in Section 5.
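The aggregation of point-wise agreement onto the 2.5° analysis grid can be sketched as follows. The tuple layout of `points`, the function name `accuracy_per_tile` and the sample values are assumptions for illustration only.

```python
import math
from collections import defaultdict

def accuracy_per_tile(points, tile_deg=2.5):
    """Aggregate point-wise agreement into overall accuracy per analysis tile.

    points: iterable of (lat, lon, map_type, reference_type) tuples.
    Returns {(tile_row, tile_col): (n_points, overall_accuracy)}.
    """
    counts = defaultdict(lambda: [0, 0])  # tile -> [n_points, n_agreeing]
    for lat, lon, map_type, ref_type in points:
        tile = (math.floor(lat / tile_deg), math.floor(lon / tile_deg))
        counts[tile][0] += 1
        counts[tile][1] += int(map_type == ref_type)
    return {tile: (n, agree / n) for tile, (n, agree) in counts.items()}

# Made-up sample points
sample = [
    (50.1, 10.2, "CROPLAND", "CROPLAND"),
    (50.9, 11.0, "CROPLAND", "GRASSLAND"),
    (47.0, -1.0, "WOODLAND", "WOODLAND"),
]
tiles = accuracy_per_tile(sample)
```

Returning the point count together with the accuracy makes it possible to flag tiles whose accuracy rests on only a few reference points.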

Accuracy assessment - background & design
The key components of the accuracy assessment of a large-scale land cover product are objective, sampling design, response design and the final analyses and estimation (Wulder et al., 2006). All of the key components have great impact on the quality of the assessment and, further, on the final metrics, especially in the present assessment, where reference and assessed dataset differ widely in structure. LANDMATE PFT is a gridded dataset with fractional LULC classes but no information on the subgrid location within the grid cell. In contrast, the points of GT-SUR have fixed locations expressed through exact coordinates, but no (exact) information on the spatial extent of the recorded class. Another challenge is the fractional structure of LANDMATE PFT itself, where one unit (grid cell) possibly contains multiple fractions. Therefore, the design of the accuracy assessment needs to be customized to the objective, which is to determine the overall quality of the LANDMATE PFT map for Europe 2015 as well as the quality of individual LULC type representation within the map in order to derive recommendations for the use of LANDMATE PFTs in RCMs.
When it comes to the sampling design, sampling size, spatial distribution of the respective sample and the representation of each LULC type or class within the sample are crucial to produce reliable quality information about a LULC product (Stehman, 2009). The collection of ground truth data is a rather expensive procedure regarding time and money, which needs to be considered during the process. The sample size is therefore a compromise between size and cost. However, in the present assessment, we are able to rely on an existing ground truth database containing over 340,000 records, which eliminates the possible issue of a too small reference database. It is also known that all assessed LULC types are represented in a sufficiently high number (Table 6). Nevertheless, the present assessment is a special case, with every unit of LANDMATE PFT potentially containing more than one LULC type. Therefore, the subsets are selected through application of a filter to capture the map accuracy in a way that accounts for the fractional structure within the grid cells in the LANDMATE PFT map (see section 4.2.1).
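A filter of this kind can be sketched as a threshold on the fraction of the dominant LULC type per grid cell. The array layout, the function name `dominant_type_filter` and the example fractions are assumptions; the actual filter definition is given in section 4.2.1 of the source.

```python
import numpy as np

def dominant_type_filter(fractions, threshold):
    """Select grid cells whose dominant LULC type exceeds a fraction threshold.

    fractions: array-like of shape (n_cells, n_types) with per-cell fractions.
    Returns (indices of selected cells, dominant type index per selected cell).
    """
    fractions = np.asarray(fractions, dtype=float)
    dominant = fractions.argmax(axis=1)       # index of the largest fraction
    dominant_share = fractions.max(axis=1)    # size of the largest fraction
    keep = np.flatnonzero(dominant_share > threshold)
    return keep, dominant[keep]

# Made-up fractions for three grid cells and three LULC types
cells = [[0.6, 0.3, 0.1], [0.4, 0.35, 0.25], [0.0, 1.0, 0.0]]
keep_50, dom_50 = dominant_type_filter(cells, 0.5)
keep_20, dom_20 = dominant_type_filter(cells, 0.2)
```

Raising the threshold keeps only the more homogeneous grid cells, which reduces the sample size but makes the comparison with point data less ambiguous.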
The response design deals with the spatial support regions (SSR) and the labelling protocol or classification harmonization. The SSR is a buffer region around a sampling unit that is selected to account for small-scale landscape heterogeneity that is likely not captured by larger scale map products. In the present case, the sampling design is selected in a way that the grid cells of LANDMATE PFT serve as SSR for each GT-SUR point. A fraction is not located precisely at one location within the respective grid cell but is evenly distributed over the whole grid cell. Assuming that the uniformly distributed fraction can occur in small patches or in one large patch within the grid cell, the whole grid cell is defined as SSR for the respective LULC type. The labelling protocol needs to be determined to deal with the different legends of the reference and the assessed map. The harmonization of legends is selected with regard to the objective of the respective assessment, in this case to provide information about the quality of representation of the most dominant LULC types in LANDMATE PFT. The labelling protocol used in the present assessment is summarized in table 5.
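Using the grid cell as spatial support region amounts to assigning each reference point to the cell that contains it. A minimal sketch, assuming a regular 0.018° grid with a hypothetical south-west origin (`lat0`, `lon0`):

```python
import math

def grid_cell_index(lat, lon, lat0, lon0, res=0.018):
    """Return (row, col) of the regular grid cell that contains a point.

    lat0, lon0: southern and western edge of the grid (hypothetical origin).
    res: grid spacing in degrees.
    """
    row = math.floor((lat - lat0) / res)
    col = math.floor((lon - lon0) / res)
    return row, col

# A point 1.5 cells north and about 2.5 cells east of the assumed origin
cell = grid_cell_index(30.027, -9.9549, lat0=30.0, lon0=-10.0)
```

All fractions stored for the returned cell are then treated as potentially present at the point location, since the subgrid position of a fraction is unknown.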
The analyses and estimation used are error matrices that give an overview of the overall and LULC type-wise accuracy of the LANDMATE PFT map. For both resolutions of LANDMATE PFT, the error matrices and the resulting accuracy measures overall accuracy (OA), producer's accuracy (PA) and user's accuracy (UA) are calculated, where PA and UA are calculated type-wise. The error matrix is a cross-tabulation between map and reference of the size q x q, where q stands for the number of land cover classes or types. The map classes are placed in the rows and the reference classes in the columns so that the diagonal of the matrix gives the sum of the correctly classified map units. The off-diagonal cell values represent the disagreement between the map and the reference. The overall accuracy is calculated according to equation 1: the sum of the agreeing diagonal elements $n_{ii}$ of all LULC types is divided by the number of all observations $n$.

$$OA = \frac{\sum_{i=1}^{q} n_{ii}}{n} \tag{1}$$

The PA represents the accuracy from the view of the map producer. It stands for the probability that a LULC feature in the reference is classified as the respective feature by the map. The PA is calculated using equation 2, where the number of correctly classified units per LULC type $n_{ii}$ is divided by the total number of LULC type occurrences in the reference $n_{+i}$ (column sum):

$$PA_i = \frac{n_{ii}}{n_{+i}} \tag{2}$$

While the PA gives the proportion of features in the reference that are actually represented as those in the produced map, the UA is the accuracy from the perspective of the map user. It is the probability that a feature classified as such in the map is actually present in the reference. The UA is calculated using equation 3, where the number of correctly classified units $n_{ii}$ per LULC type is divided by the row sum $n_{i+}$:

$$UA_i = \frac{n_{ii}}{n_{i+}} \tag{3}$$
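The three accuracy measures described above can be computed from an error matrix in a few lines. The sketch below assumes the layout stated in the text (map classes in rows, reference classes in columns); the function name `accuracy_measures` and the counts are made up for illustration.

```python
import numpy as np

def accuracy_measures(error_matrix):
    """Compute OA, PA and UA from a q x q error matrix.

    Rows hold the map classes, columns hold the reference classes.
    """
    m = np.asarray(error_matrix, dtype=float)
    diag = np.diag(m)              # correctly classified units n_ii
    oa = diag.sum() / m.sum()      # overall accuracy
    pa = diag / m.sum(axis=0)      # divide by column sums (reference totals)
    ua = diag / m.sum(axis=1)      # divide by row sums (map totals)
    return oa, pa, ua

# Made-up counts for three LULC types
matrix = [[50, 10, 0],
          [30, 40, 10],
          [20, 0, 40]]
oa, pa, ua = accuracy_measures(matrix)
```

In this made-up example the first type has a PA of 0.5 but a UA of about 0.83, which illustrates how the two perspectives can diverge for the same type.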

The distribution of the varying dominant LULC group filter sets over the research area in Europe. Since the >10% and the >20% filter set share the same number of points the >10% filter set is not shown.
Overall accuracy for the full domain of the 10 filter sets as introduced in table ?? as function of filter set size. Filter set numbers are shown in grey boxes.
Producer's accuracy of the 10 filter sets (Filter set numbers in grey boxes) for the LULC group URBAN as a function of sample count per filter set.

[...] reveals the issue that leads to the overall low PA. Figure 11 shows four large URBAN agglomerations in different areas of Europe, where the red points represent GT-SUR urban points while the white points represent GT-SUR points representing non-urban LULC types. The grey-scaled squares represent the LANDMATE PFT URBAN fractions from zero (no coverage, white) to one (full coverage, black) within one grid cell. The LANDMATE PFT grid cells with a large urban fraction represent the city core of the selected example cities, while the GT-SUR points that are located within the city core are mostly not classified as URBAN. However, the GT-SUR points do not fail to represent the structure of urban areas, because these areas are characterized by a heterogeneous pattern of sealed surfaces, recreational areas (e.g. parks) and different building types and density, not by a homogeneous sealed area. The LANDMATE PFT map represents this heterogeneous structure through the varying fractions of non-urban PFTs within the grid cell. However, in order to make the impact of a larger city visible in an RCM simulation, it is beneficial for LANDMATE PFT to represent a larger city with a dense core structure. [...] PFT are directly adopted from the ESA-CCI LC dataset, which was thoroughly validated. Therefore, despite the low agreement with GT-SUR in the present assessment, the URBAN PFT of LANDMATE PFT 2015 is considered to be of sufficiently good quality and suitable to represent urban land cover in high-resolution (∼3 km) RCM simulations. Due to the aforementioned comparability issues, the UA of the LULC type URBAN is not further discussed in this assessment.

CROPLAND
The CROPLAND representation in LANDMATE PFT shows, together with WOODLAND, the highest PA for the research area. As shown in fig. 8, the PA for all ten groups is >80%, which is considered a very good agreement with the reference.
Producer's accuracy of the 10 filter sets (Filter set numbers in grey boxes) for the LULC group WOODLAND as a function of sample count per filter set.

∼4% of the total LANDMATE PFT cells representing WOODLAND are actually CROPLAND or OTHER.
Producer's accuracy of the 10 filter sets (Filter set numbers in grey boxes) for the LULC group GRASSLAND as a function of sample count per filter set.
The main reason for this low accuracy of LANDMATE PFT regarding GRASSLAND can be found by looking at the results of the LULC types CROPLAND and WOODLAND. The UA of CROPLAND and WOODLAND reveal that ∼36% of the LANDMATE PFT CROPLAND cells and ∼10% of the LANDMATE PFT WOODLAND cells actually represent GRASSLAND in the reference, which adds up to almost 55% of the total GT-SUR GRASSLAND points. Another reason is found in the dataset structure of LANDMATE PFT. A considerable amount of GRASSLAND is not part of the assessment because GRASSLAND is not the dominant but the second dominant PFT in many grid cells (∼45% of all LANDMATE PFT grid cells). Therefore, the seemingly weak GRASSLAND representation in LANDMATE PFT rather shows a weakness of the present assessment that is caused by the different dataset structures.

[...] 000 cells where SHRUBLAND is the dominant LULC type. Therefore, one reason for the poor SHRUBLAND representation lies within the base map (ESA-CCI LC) used for the creation of LANDMATE PFT, where the known low count of SHRUBLAND proportions was inherited by LANDMATE PFT. It must be noted that a large proportion of SHRUBLAND in ESA-CCI LC is part of the mixed LC classes, such as Shrubland/Cropland or Shrubland/Forest. The known deficit was partly compensated by the translation into the PFTs, where SHRUBLAND proportions were added to the total as proportions of the mixed ESA-CCI LC classes. Further, SHRUBLAND is the second dominant PFT in ∼20% of the total LANDMATE PFT grid cells in the assessment. Just like for GRASSLAND, these SHRUBLAND proportions cannot be addressed sufficiently within the present assessment.
Producer's accuracy of the 10 filter sets (Filter set numbers in grey boxes) for the LULC group SHRUBLAND as a function of sample count per filter set.

The LANDMATE PFT dataset for Europe 2015 is published with the Long Term Archiving Service (LTA) for large research datasets, which are relevant for climate or earth system research, of the German Climate Computing Center (DKRZ). As World Data Center for Climate (WDCC), the DKRZ LTA is accredited as a regular member of the World Data System. The LANDMATE PFT dataset for Europe 2015 is available within the LANDMATE project data at https://cera-www.dkrz.de/WDCC/ui/cerasearch/entry?acronym=LM_PFT_LandCov_EUR2015_v1.0_af (Reinhart et al., 2021b). Within the LANDMATE project, a short documentation summarizes the technical information corresponding to LANDMATE PFT.

Discussion & conclusion
The present work introduces the preparation of the LANDMATE PFT map 2015 for the European Continent based on several high-resolution LULC datasets and climate data.
The LANDMATE PFT map for 2015 Version 1.0 is prepared in order to provide realistic, high-resolution LULC representation for RCMs. The dataset includes LULC information from different, validated sources as well as regional climate information through involvement of the HLZs. A cross-walking procedure (CWP) is developed to translate the original LULC classes into PFTs. The various mixed LULC classes included in the base map ESA-CCI LC are extremely difficult to resolve within RCMs. Through the developed CWP, the mixed LULC classes can be disaggregated into PFT fractions, which improves the realistic representation of these LULC classes in RCMs.

The involvement of the climate data further allows a customized translation of LULC classes for individual regions. The 16 LANDMATE PFTs are selected to provide simple transferability into various RCM families in order to be able to conduct coordinated RCM experiments where the implementation of a common, high quality LULC map provides minimum uncertainty for a multi-model ensemble.
Within the accuracy assessment, the OA does not change considerably between the evaluable groups of the respective LULC types, which shows that the dataset structure has no noticeable impact on that accuracy measure. The highest PA is found for CROPLAND and WOODLAND, which are the dominant LULC types in the research area. The lowest PA is found for SHRUBLAND and BARE AREAS, which are also the LULC types with the lowest overall cell count. The UA is found to be highest for WOODLAND, followed by CROPLAND, GRASSLAND and BARE AREAS. Both accuracy measures, PA and UA, are influenced by grid cell heterogeneity, that is, by the proportion of the dominant LULC type within a grid cell. The difference between the groups for UA is 10 to 20% per group, while the difference for PA is noticeable but considerably lower, which means that the applied threshold range has a higher influence on the former.

The URBAN representation in LANDMATE PFT is a special case in the present assessment due to the heterogeneous structure of urban areas. Both datasets, GT-SUR and LANDMATE PFT, are able to represent the LULC type URBAN very well for their respective purpose. Nevertheless, the PA for URBAN reflects the limitations of the present assessment method. The fine-scale point data of GT-SUR represents the patchwork structure of recreational areas, building blocks and other urban elements at the location of the respective points, while LANDMATE PFT represents the urban area as an agglomeration of grid cells with URBAN as the dominant LULC type. The additional comparison with a high-resolution dataset (WSF2015) showed that not only large but also small agglomerations of urban areas are represented well in LANDMATE PFT. Therefore, and despite the accuracy assessment results for the LULC type URBAN, the LANDMATE PFT dataset can be recommended for use in RCMs that resolve urban features over the European Continent.

The structural differences of the datasets, where gridded data is compared to point data, are a major weakness of this assessment. Although the fractional structure does not have a major influence on the OA, the LULC type-wise PA and, even more, the UA are affected.
The present assessment takes into account the dominant LULC type per grid cell of LANDMATE PFT. Depending on the proportion of this LULC type, the second or third-most represented LULC type can occupy a considerable area of the respective grid cell. Therefore, a follow-up assessment, where these LULC type proportions are also considered and compared to the ground truth, is needed in order to investigate whether the PA of the less dominant LULC types GRASSLAND, SHRUBLAND and BARE AREAS is increased. The use of additional LULC data specialized on one LULC type, as was done for URBAN in this assessment, would be a useful step to validate the quality of GRASSLAND, SHRUBLAND and BARE AREAS representation in LANDMATE PFT 2015.

The results show that the LANDMATE PFT map is able to represent LULC over large parts of Europe in sufficient quality. Especially the dominant LULC types are represented well overall, which is highly beneficial for RCM experiments that require realistic, high-resolution LULC representation. Nevertheless, there are uncertainties found for the less represented LULC types. When using LANDMATE PFT in an RCM, it is crucial to consider these uncertainties when interpreting simulation results. Especially the spatial distribution of uncertainties in LANDMATE PFT needs to be considered when comparing simulation results to observations, because the input parameters in the employed land-surface schemes are influenced by the individual LULC, which subsequently has a considerable impact on lower-atmosphere processes, such as the intensity of heat and moisture exchange. Thus, by carefully considering the issue of uncertainty introduced by the LULC input, erroneous conclusions about RCM model performance and about small-scale interconnections can be reduced (Ge et al., 2007; Sertel et al., 2010; Santos-Alamillos et al., 2015; Reinhart et al., 2021a).

Besides the quality of the LULC product, the implementation process of each individual RCM is crucial for the realistic representation of LULC in regional climate model experiments. When translating a LULC product into the model-specific LULC classes and structure, modifications are made that can change the map characteristics. When the LANDMATE PFT product is used in an RCM that only uses the dominant LULC fraction per grid cell, the overall LULC proportions can change. The same applies when LANDMATE PFT is used in a model with limited fractions per grid cell or a different classification system. The present assessment gives a guideline on the quality of LANDMATE PFT (Version 1.0) when used unaltered. Through the involvement of the ground truth data, regional deficits of LANDMATE PFT are presented that can be compensated during the implementation process into the individual RCM or RCM family.
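The effect of a dominant-fraction implementation on the overall LULC proportions can be illustrated with a small sketch. The fractions are made up, the function names are hypothetical, and all grid cells are assumed to have equal area.

```python
import numpy as np

def proportions(fractions):
    """Domain-wide share of each LULC type (equal-area cells assumed)."""
    return np.asarray(fractions, dtype=float).mean(axis=0)

def dominant_only(fractions):
    """Assign each cell entirely to its dominant type (one-hot fractions)."""
    fractions = np.asarray(fractions, dtype=float)
    one_hot = np.zeros_like(fractions)
    one_hot[np.arange(len(fractions)), fractions.argmax(axis=1)] = 1.0
    return one_hot

# Made-up fractions of two LULC types in three grid cells
cells = [[0.6, 0.4], [0.55, 0.45], [0.3, 0.7]]
fractional_share = proportions(cells)
dominant_share = proportions(dominant_only(cells))
```

In this example the first type covers slightly less than half of the domain in the fractional data but two thirds of it after the reduction to dominant types, which is the kind of shift described above.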
The findings of the present assessment support the identification of uncertainties within the LANDMATE PFT map for Europe. Nevertheless, user feedback is crucial for the future overall improvement of LANDMATE PFT. The RCM community within the WCRP FPS LUCAS is already participating in the feedback process, where implementation of LANDMATE PFT and the LUCAS LUC time series into different RCMs is comprehensively documented. The future work on LANDMATE PFT also includes the extension of the dataset to other CORDEX regions. Although the dataset is based on various globally available datasets and can therefore be created globally, the introduced quality assessment method must be performed for each region individually, desirably using region-specific expert knowledge. Further, the assessment should be expanded in order to include the second or third-most represented LULC type per grid cell to possibly achieve more accurate quality information about LANDMATE PFT.

Figure B1. Total count of evaluated LANDMATE PFT grid cells per 2.5° grid cell (a-c; g-i) and producer's accuracy for the individual LULC types (d-f; j-l) for group 0.2 (dominant LULC type occupies >20% per LANDMATE PFT grid cell). Panel (l): BARE AREAS.

Figure B2. Total count of evaluated LANDMATE PFT grid cells per 2.5° grid cell (a-c; g-i) and producer's accuracy for the individual LULC types (d-f; j-l) for group 0.5 (dominant LULC type occupies >50% per LANDMATE PFT grid cell). Panel (l): BARE AREAS.

Table B10. Confusion matrix for LANDMATE PFT group 1.0 - dominant LULC type occupies 100% of a LANDMATE PFT grid cell.