An update and beyond: key landscapes for conservation land cover and change monitoring, thematic and validation datasets for the African, Caribbean and Pacific region

Abstract. Natural resources are increasingly being threatened in the world. Threats to biodiversity and human well-being pose enormous challenges to many vulnerable areas. Effective monitoring and protection of sites with strategic conservation importance require timely monitoring with special focus on certain land cover classes which are especially vulnerable. Larger ecological zones and wildlife corridors warrant monitoring as well, as these areas have an even higher degree of pressure and habitat loss as they are not "protected" compared to Protected Areas (i.e. National Parks, etc.). To address such a need, a satellite-imagery-based monitoring workflow to cover at-risk areas was developed. During the program’s first phase, a total of 560 442 km2 area in sub-Saharan Africa was covered. In this update we remapped some of the areas with the latest satellite images available, and in addition we added some new areas to be mapped. Thus, in this version we updated and mapped an additional 852 025 km2 in the Caribbean, African and Pacific regions with up to 32 land cover classes. Medium to high spatial resolution satellite imagery was used to generate dense time series data from which the thematic land cover maps were derived. Each map and change map were fully verified and validated by an independent team to achieve our strict data quality requirements. The independent validation datasets for each Key Landscape for Conservation (KLC) are also described and presented here (all presented datasets are available at https://doi.org/10.5281/zenodo.4621375, Szantoi et al., 2021).



Study area
The provided thematic datasets concentrate on sub-Saharan Africa with additional KLCs in the Caribbean and Pacific regions.
The selection of areas was based on present and future pressures envisioned and predicted by MacKinnon and colleagues (2015) and the Biodiversity and Protected Areas Management (BIOPAMA, https://biopama.org/) Programme. In this second phase (Phase 2), 10 large areas totalling 852 025 km2 were selected, mapped and/or updated, and validated (Fig. 1). These areas cover various ecosystems and generally lie in transboundary regions (Table 1, Fig. 1).

Data and method
The production workflow for the entire process is shown in Figure 2. Each stage is explained in detail in the sections below.

Data collection and mapping guidelines
Landsat TM, ETM+ and OLI imagery at Level1TP, Sentinel-2 imagery at Level1C, and SPOT 4, 5 and 6 imagery at Level1-B processing level were used in the production and update of the land cover and change maps. The Level1TP (Landsat), Level1C (Sentinel-2), and Level1-B (SPOT) data were further corrected for atmospheric conditions to produce surface reflectance products for the classification phase. The atmospheric correction module was based on 6S as a direct radiative transfer model for Landsat (Masek et al., 2006) and SPOT (Haifeng et al., 2010), and on the Sen2Cor processor (v2.8), which builds on the ATCOR model (Richter et al., 2012), for Sentinel-2. The Shuttle Radar Topography Mission (30 m or 90 m) Digital Elevation Model was used to estimate the target height and slope, as well as to correct the surface sun incidence angles for an optional topographic correction. Based on each area's meteo-climatic conditions (climate profile and precipitation patterns), season-specific satellite image data were selected for each KLC (Table 1). Due to data scarcity for many areas, especially for the change maps (i.e. year 2000), imagery was collected for a target year ± 3 years. In extreme cases, ± 5 years were allowed, or until four cloud-free observations per pixel for the specified date were reached.

Land cover classification system
All thematic maps were produced at both Dichotomous and Modular levels within the Land Cover Classification System (LCCS) developed by the Food and Agriculture Organization of the United Nations and the United Nations Environment Programme (Di Gregorio, 2005). The LCCS (ISO 19144-2) is a comprehensive hierarchical classification system that enables comparison of land cover classes regardless of geographic location or mapping date and scale (Di Gregorio, 2005) (Table 2). For the Caribbean (CAR01), Timor-Leste (PAC01), and Madagascar (SAF21) KLCs, we included an additional land cover class not present in other KLC map products: "Not Inland Cover". It was added because of the special location of the mapped areas (i.e. islands); this class is not present in LCCS and we only used it for our error assessment.

Automatic classification
Based on the pre-selected imagery data (Landsat, Sentinel-2, and SPOT), Dense Multitemporal Timeseries (DMT) based vegetation indices were generated to reduce data dimensionality and enhance the signal of the surface target. The DMT for each KLC was based on the pre-processed and geometrically coregistered data, forming a geospatial datacube (Strobl et al., 2017). In addition, three vegetation indices were calculated to aid the separation of terrestrial vs. aquatic (NDFI), vegetated vs. barren (SAVI), and evergreen vs. deciduous vegetation areas (NBR). The indices are (per Landsat spectral bands):
All the pre-processed data (spectral bands and the DMT based indices) were fed into the Support Vector Machine supervised classification model. The Support Vector Machine classifier can handle data with high dimensionality and performs well when mapping heterogeneous areas, including vegetation community types (Szantoi et al., 2013). To produce the thematic maps, the Minimum Mapping Unit concept used by Szantoi et al. (2016) was employed. Individual pixels (with corresponding land cover class information) were assigned to objects, where the minimum size of an object was set at 3 hectares (0.03 km2), as a compromise between technical feasibility (pixel size) and the general size of the observable features (various land cover classes). Still, classification errors (omission and commission of various classes) and false alarms (for land cover change) arose due to data availability (cloud cover, no data) and the seasonal behaviour of the land cover (e.g. rapid foliage change). To correct these errors, expert human image interpretation skills and knowledge were employed to improve the outputs of the automated process.
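As an illustration of the three indices named above, the following minimal sketch uses their commonly cited definitions. The band assignments (red, near-infrared, second shortwave-infrared), the reading of NDFI as the Normalized Difference Flood Index, and the soil adjustment factor of 0.5 are assumptions taken from the general literature, not from this text, and may differ from the exact formulation used in production.

```python
def ndfi(red, swir2):
    """Flood index (assumed form): higher over water, lower over dry land."""
    return (red - swir2) / (red + swir2)


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index; soil_factor=0.5 is an assumed default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def nbr(nir, swir2):
    """Normalized Burn Ratio, computed from NIR and SWIR2 reflectance."""
    return (nir - swir2) / (nir + swir2)


# Hypothetical surface reflectance values for a single vegetated pixel.
red, nir, swir2 = 0.05, 0.40, 0.10
pixel_indices = {
    "NDFI": ndfi(red, swir2),
    "SAVI": savi(nir, red),
    "NBR": nbr(nir, swir2),
}
```

Because the functions use only arithmetic operators, they should apply unchanged to array-like band data that supports element-wise operations.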

Land cover change detection
Land cover change (LCC) was interpreted as a categorical change in which a particular land cover (LC) was replaced by another land cover. Examples of conversion include the change of Cultivated and Managed Terrestrial Areas (A11) into Natural and Semi-Natural Terrestrial Vegetation (A12), or of Cultivated and Managed Terrestrial Areas (A11) into Artificial Surfaces and Associated Areas (B15). The basic condition for identifying LC changes was the detection of changes in spectral reflectance within specific image bands of the employed satellite imagery and in the generated indices, but such changes were further evidenced by other interpretation parameters such as shape and texture patterns. In our methodology, images acquired in two or more different timeframes were used in the identification process. Furthermore, land cover changes were restricted to changes with a periodicity longer than yearly and/or seasonal (dry/wet season). Urban sprawl, tree plantations (large or small) replacing herbaceous crops (large or small), tree covers (closed or open), or the creation of a new water reservoir are examples of long-term changes that qualify as actual LCCs. In our workflow, the LCC process followed the same image pre-processing steps as the LC method, and an independent classification (similar to the LC procedure) of the past date was performed. Finally, the LC and the LCC products were compared and change polygons (minimum change of 0.5 hectare) were extracted. As with the LC product, visual refinement was an important step in producing accurate LCC polygons.
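The comparison of two independently classified dates and the 0.5 hectare minimum change size can be sketched as follows. This is only an illustrative raster-based approximation: the function name, the 4-connected grouping, and the pixel area of 0.09 ha (a 30 m pixel) are assumptions, and the production workflow worked with polygons and included visual refinement.

```python
from collections import deque


def change_patches(lc_past, lc_recent, pixel_area_ha=0.09, min_change_ha=0.5):
    """Return (from_class, to_class, area_ha) for each 4-connected patch of
    identical categorical change that reaches the minimum change area."""
    rows, cols = len(lc_past), len(lc_past[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or lc_past[r][c] == lc_recent[r][c]:
                continue  # already grouped, or no categorical change here
            transition = (lc_past[r][c], lc_recent[r][c])
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and (lc_past[ny][nx], lc_recent[ny][nx]) == transition):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area_ha = size * pixel_area_ha
            if area_ha >= min_change_ha:
                patches.append((transition[0], transition[1], area_ha))
    return patches


# Hypothetical 3 x 4 pixel maps: six pixels convert from A11 to B15,
# and one isolated pixel converts from A11 to A12.
past = [["A11"] * 4 for _ in range(3)]
recent = [
    ["B15", "B15", "B15", "A11"],
    ["B15", "B15", "B15", "A11"],
    ["A11", "A11", "A11", "A12"],
]
patches = change_patches(past, recent)
```

In this toy case only the six-pixel A11 to B15 patch (about 0.54 ha) is kept; the single changed pixel falls below the minimum change size and is discarded.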

Validation dataset production
The validation datasets (Table 3, Figures 3 and 4) were created individually for each KLC. The validation datasets (points) were generated using a stratified random sampling procedure. This ensured a sufficient estimation for all land cover and land cover change classes according to their frequency of occurrence. The following formula (Gallaun et al., 2015) was used to determine the minimum number of validation points (per class per KLC):
At least two independent data analysts (blind and plausibility interpretation process) evaluated all accuracy points. Some points were excluded from the accuracy statistics due to an error/disagreement during the evaluation procedure (Table 3, "Number of points LC/LCC"). The blind process attempted to interpret all validation points based on available ancillary data (i.e. higher resolution imagery), without direct comparison to the generated LC/LCC maps. The plausibility process reviewed every point whose blind interpretation did not match the corresponding LC/LCC value (disagreement between the LC/LCC data and the blind interpretation). After this review, the final validation reference was established.
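A stratified random draw of validation points can be sketched as below. The sketch does not reproduce the Gallaun et al. (2015) formula: the parameter `min_per_class` is a hypothetical stand-in for the minimum number of points that formula would yield, and the proportional allocation rule, the function name, and the input layout are likewise assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, min_per_class, total_points, seed=0):
    """Draw validation units per class (stratum).

    units: list of (unit_id, class_label) pairs.
    Each class receives its proportional share of total_points, but never
    fewer than min_per_class (capped at the number of units available).
    """
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    by_class = defaultdict(list)
    for unit_id, label in units:
        by_class[label].append(unit_id)

    sample = {}
    for label, ids in by_class.items():
        proportional = round(total_points * len(ids) / len(units))
        n_points = min(len(ids), max(min_per_class, proportional))
        sample[label] = rng.sample(ids, n_points)
    return sample


# Hypothetical strata: a frequent class (A12) and a rare class (B15).
units = [(i, "A12") for i in range(90)] + [(i, "B15") for i in range(90, 100)]
points = stratified_sample(units, min_per_class=5, total_points=20)
```

The floor per class is what keeps rare classes from being under-sampled, which is the stated purpose of stratifying by class rather than sampling purely at random.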
The validation of the change maps (apart from CAF07, where we assessed all the LCCS modular classes) aimed to assess the accuracy of the change detection. Thus, the following change categories were evaluated for those land cover changes (i.e.
Figure 3. Spatial distribution of the validation datasets within the updated key landscapes for conservation.

Data quality assessment
We updated some of the most critical landscapes (KLCs), selected because of the various anthropogenic pressures driving land cover change relative to the base maps presented in Szantoi and colleagues (2020). These KLCs were Greater Virunga (CAF02), Salonga (CAF07), Upemba (CAF11), and Yangambi (CAF99). The Salonga KLC (CAF07) was initially mapped at the dichotomous LCCS level (Table 2, 8 land cover classes), but here we present both the base map (2016) and a change map (2019), mapped at the modular LCCS level. The new land cover and land cover change maps (CAF05, CAR01, EAF04, PAC01, SAF21, and WAF04) were all mapped at the modular level for land cover as well as for change.

Technical Validation
Spatial, temporal and logical consistency was assessed through a procedure independent from the producer to determine the products' positional accuracy, the validity of the data with respect to time (seasonality), and the logical consistency of the data (topology, attribution and logical relationships). A qualitative-systematic accuracy assessment was also performed wall-to-wall through a systematic visual examination of a) the global thematic assessment, b) the expected size of polygons (Minimum Mapping Unit (MMU)), c) seasonal effects, and d) spatial patterns (i.e. following correct edges).
The quantitative accuracy assessment (i.e. validation) results are shown in Table 4 (overall accuracies) and in the Appendix (thematic class accuracies per KLC, Appendix A). Generally, the program aimed at a minimum of 85% overall accuracy for each product (KLC) and a minimum of 75% thematic accuracy (Producer's and User's) for each class within each KLC. The land cover change (LCC) accuracy should be >72%. In exceptional cases, the thematic accuracies might be lower than the threshold due to the difficulty of discriminating a particular class in a certain KLC.
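The overall, User's and Producer's accuracies referred to above can be derived from pairs of mapped and reference labels, as in the following minimal sketch. It uses plain point counts; the function name and the toy labels are assumptions, and any area weighting of strata that the actual validation may have applied is not shown.

```python
from collections import Counter


def accuracy_report(pairs):
    """pairs: iterable of (map_class, reference_class), one per validation point.

    Returns overall accuracy plus per-class User's and Producer's accuracy,
    all in percent and based on unweighted point counts.
    """
    pairs = list(pairs)
    confusion = Counter(pairs)
    map_totals = Counter(mapped for mapped, _ in pairs)
    ref_totals = Counter(reference for _, reference in pairs)

    correct = sum(n for (mapped, ref), n in confusion.items() if mapped == ref)
    overall = 100.0 * correct / len(pairs)
    # User's accuracy: share of points mapped as a class that truly belong to it.
    users = {c: 100.0 * confusion[(c, c)] / n for c, n in map_totals.items()}
    # Producer's accuracy: share of reference points of a class mapped correctly.
    producers = {c: 100.0 * confusion[(c, c)] / n for c, n in ref_totals.items()}
    return overall, users, producers


# Hypothetical validation points for two classes.
pairs = (
    [("A11", "A11")] * 8 + [("A11", "A12")] * 2
    + [("A12", "A12")] * 9 + [("A12", "A11")] * 1
)
overall, users, producers = accuracy_report(pairs)
meets_targets = overall >= 85.0 and all(
    value >= 75.0 for value in list(users.values()) + list(producers.values())
)
```

The final check mirrors the stated targets of at least 85% overall accuracy and at least 75% per-class thematic accuracy.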

Discussion
There is a direct relationship between population growth, agricultural expansion, energy demand, and pressure on land. With the current state of development, population increase, and economic growth, a large portion of the sub-Saharan population depends on the remaining natural resources to meet their food and energy needs (Brink et al., 2012), while in the Caribbean (CAR01) urbanization puts pressure on the natural resources (Nathaniel et al., 2021). In the case of Timor-Leste (PAC01), the peacebuilding process has shaped the country's land cover and land use trends since 2006 (Ide et al., 2021). The demands of social and economic growth require additional land, typically at the expense of previously untouched areas. Areas under protection (i.e. national parks) that remain well preserved (see Figs. 5, 6 and 7) often have regions in close proximity under tremendous pressure. Such areas (often transboundary ones) need very accurate monitoring and base maps, which are provided through this work, especially as areas shared between and/or among countries are frequently not mapped with a common legend, if mapped at all. The presented KLC datasets can be used for continuous land cover and land use monitoring, evaluation of management practices and effectiveness, endowment for scientific counsel, habitat modeling, information dissemination, and capacity building in their corresponding countries, and to manage natural resources such as forests, soil, biodiversity, ecosystem services, and agriculture (Tolessa et al., 2017). Furthermore, regional climate change, biogeochemical, and hydrologic models are currently capable of using high-resolution LC data for predictions, both in general (Nissan et al., 2019) and with a spatial focus (i.e. Africa) (Sylla et al., 2016; Vondou and Haensler, 2017).
The validation datasets were independently collected and verified through a robust procedure. They can be used for additional land cover mapping, creating spectral libraries, and the validation of other local, regional, and global datasets. It is important that various land cover products can be used or compared against one another regardless of their geographic origins. Here, 10 land cover and land cover change maps were introduced for different areas in the Organisation of African, Caribbean and Pacific States (OACPS) where quality land cover products are missing (Marshall et al., 2017). All data were produced using the unified Land Cover Classification System. The LCCS's modular level can be applied to local scales through its very detailed classes (here 32).
Geist and Lambin (2002) describe the driving human forces of land cover changes as an interlinking of three key variables: expansion of agriculture, extraction of wood, and development of infrastructure (urbanization). The main land cover dynamic in sub-Saharan Africa can be explained by the first two variables, but increasingly by urbanization as well, just as in the other mapped areas (Caribbean, Timor-Leste) (Güneralp et al., 2017; Nathaniel et al., 2021; Hugo, 2019). Although the driving force behind the clearing of natural vegetation has traditionally been attributed predominantly to the expansion of new agricultural land areas (including investments in large-scale commercial agriculture) (Brink and Eva, 2009), firewood extraction and charcoal production are also key factors in forest, woodland, and shrubland degradation throughout the region.

Drivers of change
This land cover dynamic is not just a by-product of greater forces such as logging for timber and agricultural expansion but stems from a specific need to satisfy energy demand (European Commission, 2018); in fact, in sub-Saharan Africa, the main use of extracted wood is for energy production (Kebede et al., 2010). Although the region possesses a huge diversity of energy sources such as oil, gas, coal, uranium, and hydropower, the local infrastructure and use of these commercial energy sources are still somewhat limited. Traditional sources of energy in the form of firewood and charcoal account for over 75 % of the total energy use in the region (Kebede et al., 2010). Efforts to meet the population and economic demands in the OACPS while

Sources of errors
As the applied LCCS allows very detailed hierarchical classification, some classes can be difficult to distinguish from each other. This is especially true in Africa's vast and very heterogeneous landscapes where agricultural land use is mainly smallholder based (i.e., very small plots), while shifting cultivation is mostly due to the lack of fertilizers and weak soil, leading to land abandonment. Landscapes are generally not composed of clearly fragmented and well-identifiable cover formations. In this region, landscapes usually form a continuum of various cover (vegetation) formations that might include different layers of tree, shrub, and herbaceous vegetation. These variations, combined with differences in vegetation density (open vs. closed) and height, make class assignments challenging. Moreover, some specific agriculture classes distinguish even the cultivation type, e.g., differentiating between fruit tree plantations and tree plantations for timber. Thus, the discrimination of such classes is very difficult and might introduce classification errors. Apart from the land cover classification, errors could also be introduced due to climate-induced variability, such as leaf phenology where deciduous vegetation might appear bare during a dry period (season). At a more general level, difficulties in distinguishing between aquatic or regularly flooded surfaces and terrestrial areas have been observed in certain KLCs, especially when flooded periods are short.
As for Timor-Leste (PAC01), discriminating between evergreen and deciduous natural vegetation was particularly challenging across the seasonal variations.
Garamba, Salonga, Upemba, and the Yangambi biosphere), where they use the C-HSM products for planning and for investment strategies (i.e., hydropower). Thus, the aforementioned PAs were requested by EEAS to be updated in terms of land cover changes for 2019, which we present in this study. Another example comes from West Africa, where nongovernmental organizations (NGOs, e.g., Wild Chimpanzee Foundation), public-benefit enterprises (e.g., German Society for International Cooperation, GIZ), and national authorities (e.g., l'Office Ivoirien des Parcs et Réserves, OIPR) use the data to identify areas under pressure for agriculture (cocoa, oil palm, rubber, coconut) and human-wildlife conflicts in Cote d'Ivoire, Ghana, and Liberia. Additional areas (i.e. CAR01, PAC01) mapped and presented in this study can be used to help projects (e.g. BIOPAMA, https://biopama.org/) and countries to improve management and governance of their biodiversity and natural resources.

Data availability
The data are provided in shapefile (*.shp) format, with polygon geometry for the land cover and change datasets and point geometry for the validation datasets. The presented data are in the World Geodetic System 1984 geographic coordinate system (GCS) (EPSG:4326) and its datum (EPSG:6326). The validation data, besides using the same GCS, are also provided in the Africa Albers equal-area conic (EPSG:102022) projected coordinate system.
year the validation sets represent, as these can be different among KLCs; the exact year is always noted in the columns' names (e.g., plaus2000, plaus2016).
The naming of all attributes follows the same structure in all data. Please see the details in the Appendix.
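Since the year is encoded in the attribute column names (e.g., plaus2000, plaus2016), the years covered by a given validation file can be recovered from the column list alone. The sketch below is an assumption-based illustration: the regular expression, the function name, and the column names `id` and `blind2016` are hypothetical, and the authoritative naming scheme is the one given in the Appendix.

```python
import re

# Assumed pattern: an alphabetic prefix followed by a four-digit year.
YEAR_SUFFIX = re.compile(r"^([A-Za-z_]+?)_?((?:19|20)\d{2})$")


def columns_by_year(column_names):
    """Group attribute column names by the year embedded in their name."""
    by_year = {}
    for name in column_names:
        match = YEAR_SUFFIX.match(name)
        if match:
            by_year.setdefault(int(match.group(2)), []).append(name)
    return by_year


# 'id' and 'blind2016' are hypothetical column names used for illustration.
years = columns_by_year(["id", "plaus2000", "plaus2016", "blind2016"])
```

Columns without a year suffix are simply ignored, so the same helper can be applied to the full attribute table of any KLC.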
The complete package (all datasets together) is available for download at https://doi.org/10.5281/zenodo.4621375 (Szantoi et al., 2021), or individually as source datasets (each KLC) from the same web address.

Conclusions
The C-HSM service component is part of Copernicus Global Land, which produces near-real-time biophysical variables at medium scale, globally. In contrast, the C-HSM activity is an on-demand component that addresses specific user requests in the field of sustainable management of natural resources. The products presented here provide the second set of standardized land cover and land cover change datasets for 10 KLCs with their corresponding validation datasets in the African, Caribbean and Pacific regions. The geographic distribution covers the tropical and subtropical regions of west, central, and southeastern Africa as well as a large part of the Caribbean region and Timor-Leste in the Pacific region. The most recent land cover change might be reassessed periodically for selected already-mapped KLCs in order to generate longer-term time series information on land cover dynamics, as is the case in the currently presented data (CAF02, CAF07, CAF11, and CAF99; see the original LC/LCC data in Szantoi et al., 2020). While this is not done systematically but on specific customer requests, the C-HSM service encourages stakeholder cooperation and provides capacity building workshops around the globe. In-person training events provide an opportunity for new and existing users to learn how to use and interpret data, operate the web information system, and easily assess recent land cover change data using Sentinel-2 image mosaics. Here, we provide very-high-quality products, which can be used directly as base maps and for policy decisions, as well as for comparison and/or evaluation of other land cover products or the implementation of validation datasets for training and validation purposes.
Finally, the service has a high degree of confidence that the data presented here (and in the previous phase, Szantoi et al., 2020) are of the highest quality, regularly reaching above 90 % overall accuracy. This is guaranteed by a rigorous and independent production and validation mechanism and feedback loop, which does not stop until the required overall and per-class accuracy levels are reached.
Following the general European Commission's Copernicus Programme open-access policy, the data are distributed free to any user through a dedicated website (https://land.copernicus.eu/global/hsm, last access: 16 March 2021). This interactive online information system allows access to browse, analyze, and download the data, including the accuracy assessment information.

Appendix
Thematic class accuracies per KLC. Accuracy parameters are in percent; classes with fewer than 15 samples were not included in the overall accuracy calculation. Accuracy results are presented at the aggregated as well as at the modular LCCS levels, depending on the type of mapping (land cover map: modular; land cover change map: aggregated).
Class -corresponding class (see Table 2