Articles | Volume 13, issue 12
Earth Syst. Sci. Data, 13, 5951–5967, 2021
https://doi.org/10.5194/essd-13-5951-2021

Data description paper 23 Dec 2021

Harmonized in situ datasets for agricultural land use mapping and monitoring in tropical countries
Audrey Jolivot1,2, Valentine Lebourgeois1,2, Louise Leroux3,4,5, Mael Ameline1,2, Valérie Andriamanga1,6, Beatriz Bellón1,2, Mathieu Castets1,2, Arthur Crespin-Boucaud1,2, Pierre Defourny7, Santiana Diaz1,2, Mohamadou Dieye8, Stéphane Dupuy1,2, Rodrigo Ferraz9, Raffaele Gaetano1,2, Marie Gely1,2, Camille Jahel1,2, Bertin Kabore10, Camille Lelong1,2, Guerric le Maire11,12, Danny Lo Seen1,2, Martha Muthoni13, Babacar Ndao5,14, Terry Newby15, Cecília Lira Melo de Oliveira Santos16, Eloise Rasoamalala1,6, Margareth Simoes9, Ibrahima Thiaw14, Alice Timmermans7, Annelise Tran1,2, and Agnès Bégué1,2
  • 1CIRAD, UMR TETIS, 34398 Montpellier, France
  • 2TETIS, Univ Montpellier, AgroParisTech, CIRAD, CNRS, INRAE, Montpellier, France
  • 3CIRAD, UPR AIDA, Dakar, Senegal
  • 4AIDA, Univ Montpellier, CIRAD, Montpellier, France
  • 5Centre de Suivi Ecologique (CSE), Dakar, Senegal
  • 6Centre National de la Recherche Appliquée au Développement Rural (FOFIFA), Antsirabe, Madagascar
  • 7Université Catholique de Louvain (UCLouvain), Louvain-la-Neuve, Belgium
  • 8Institut Sénégalais de Recherches Agricoles (ISRA), Dakar, Senegal
  • 9Brazilian Agricultural Research Corporation (EMBRAPA), Rio de Janeiro, Brazil
  • 10independent consultant: Ouagadougou, Burkina Faso
  • 11CIRAD, UMR Eco&Sols, 34398 Montpellier, France
  • 12Eco&Sols, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montpellier, France
  • 13independent consultant: Nairobi, Kenya
  • 14Institute of Environmental Sciences, Université Cheikh Anta Diop de Dakar (UCAD), Dakar, Senegal
  • 15Agricultural Research Council (ARC), Pretoria, South Africa
  • 16Interdisciplinary Center on Energy Planning, NIPE, University of Campinas, UNICAMP, Campinas, São Paulo 13083-896, Brazil

Correspondence: Audrey Jolivot (audrey.jolivot@cirad.fr)

Abstract

The availability of crop type reference datasets for satellite image classification is very limited for complex agricultural systems as observed in developing and emerging countries. Indeed, agricultural land use is very dynamic, agricultural censuses are often poorly georeferenced and crop types are difficult to interpret directly from satellite imagery. In this paper, we present a database made of 24 datasets collected in a standardized manner over nine sites within the framework of the international JECAM (Joint Experiment for Crop Assessment and Monitoring) initiative; the sites are spread over seven countries of the tropical belt, and the number of data collection years depends on the site (from 1 to 7 years between 2013 and 2020). These quality-controlled datasets are distinguished by in situ data collected at the field scale by local experts, with precise geographic coordinates, and following a common protocol. Altogether, the datasets comprise 27 074 polygons (20 257 crops and 6817 noncrops), ranging from 748 plots in 2013 (one site visited) to 5515 in 2015 (six sites visited), documented by detailed keywords. These datasets can be used to produce and validate agricultural land use maps in the tropics. They can also be used to assess the performance and robustness of classification methods of cropland and crop types/practices in a large range of tropical farming systems. The dataset is available at https://doi.org/10.18167/DVN1/P7OLAP (Jolivot et al., 2021).

1 Introduction

Land use and land cover (LULC), and their changes, are key pieces of information for studying and monitoring carbon and water cycles and threats to biodiversity, and for establishing land use planning and public policies. In particular, accurate mapping of cropland and associated cropping practices is of primary importance for food security, agricultural and environmental monitoring and land management. However, cropland and crop-type mapping using Earth observation data is still challenging as it requires large sets of training and validation data, especially as the land use (field limits and content) generally changes annually, even seasonally. Large datasets on cropping practices are available in the Global North, mainly thanks to agricultural policies that support annual censuses and that provide tools for the digitization at field level using very high resolution remote sensing imagery (e.g., the Land Parcel Identification System designed to implement the common agricultural policy in the European Union or the Cropland Data Layer of the National Agricultural Statistics Service of the United States Department of Agriculture). Such datasets provide a very large number of annotated surface samples reporting yearly crop types, which can often easily be integrated in reference datasets for land cover mapping systems at the cost of a relatively simple “cleansing and harmonization” procedure (Inglada et al., 2017). Although the declarative nature of such annotations makes them error prone, such “noise” is typically compensated by the large number of available crop type samples. Arguably, no such large-scale database currently exists in most of the developing and emerging countries.
As a matter of fact, in these countries, cropland and crop types can be particularly difficult to map (Waldner et al., 2015) as the fields are often small to medium in size (Fritz et al., 2015), because crops are easily confused with natural vegetation and fallow and because cropping systems are typically highly variable in time and space. Each farming system has its own specificities in terms of crop type and composition, field size, cropping calendar, irrigated/rainfed mode, and other practices (Bégué et al., 2018). It is thus necessary to adapt the classification approaches (satellite data and algorithms as well as training and validation in situ data) to the large variability of farming systems in the world (Dixon et al., 2001) and thus to have access to appropriate training data.

The arrival of Sentinel-1 and Sentinel-2 satellite image time series, the emergence of new classification algorithms in the domain of machine learning and artificial intelligence, and easy access to preprocessed images and image processing tools on web platforms have democratized image processing and opened up new avenues for LULC mapping over large areas. Following this trend, large benchmark datasets acquired using annotation tools of satellite images all over the world have multiplied to train algorithms and validate remote-sensing-derived products (Long et al., 2020). However, these datasets have a broad LULC nomenclature, and agricultural land use is often reduced to a single class due to difficulties in discriminating cropping practices from satellite images. The main data sources currently available for agricultural land use mapping in southern countries are listed below.

At a global and continental scale, initiatives that freely distribute land cover reference datasets exist (see review by Tsendbazar et al., 2015). The GOFC-GOLD (Global Observation for Forest and Land Cover Dynamics; see http://www.gofcgold.wur.nl/sites/gofcgold_refdataportal.php, last access: 15 December 2021, for further details and access to data) portal gathers and consolidates existing reference datasets used for the validation of legacy global land cover products (prior to 2015) at moderate spatial resolution (300 m–1 km) such as GLC 2000 and GlobCover 2005. All referenced databases are provided at global scale, ranging from a few hundred to around 2000 samples each. Except for GlobCover 2005, which contains a “rainfed cropland” class, the referenced land cover nomenclatures only contain a single cropland class, sometimes referred to as “cultivated”.

Other data collection efforts reached an appreciably higher number of samples through the use of crowdsourcing campaigns, a notable example being the LULC reference dataset presented in Fritz et al. (2017) and its companion work from Laso Bayas et al. (2017b); thanks to the Geo-Wiki tool providing an easy-to-use interface for the photointerpretation of very high spatial resolution (VHSR) satellite images, it was possible to collect up to 150 000 samples of different LULC classes. This includes over 36 000 cropland locations, distributed over contrasted areas in terms of cropland density. As in the previous case, a single cropland class is referenced in the nomenclature, alone or mixed with natural vegetation (“mosaic” class). Although crowdsourcing proves to be a valuable strategy to collect reference cropland data at larger scales, it remains unsuited when precise information has to be collected, both spatially (resolution, plot boundaries, etc.) and in terms of crop type nomenclatures. As a matter of fact, most of the crowdsourcing initiatives are based on visual image interpretation, which prevents the precise localization and identification of cropping practices. Shifting to a crowdsourced field strategy would not be suitable either, both because of the very specific agronomic and GIS (geographic information system) competence needed and the limited accessibility to cultivated areas in tropical countries.

Recently, the LandCoverNet dataset was released for the African continent (Alemohammad et al., 2020), with the specific aim to foster the use of recent machine-learning and deep-learning approaches for automatic land cover classification. Here, samples are provided in the form of densely annotated image chips (256 × 256 pixels at 10 m resolution) accompanied by the corresponding Sentinel-2 observations over the reference year (2018). A total number of 1980 fully annotated chips, accounting for more than 30 million labeled pixels, are provided, spanning 66 tiles of Sentinel-2 over the entire African continent. Although such a dataset could allow for a finer spatial validation of LULC products at high resolution, it still provides a single “cultivated land” class, making it unsuitable for the assessment of LULC products specifically conceived for the monitoring of agricultural systems.

These data are used to validate global (Hoskins et al., 2016) or national cropland maps (Laso Bayas et al., 2017a), but they cannot support crop type mapping, as the nomenclature used for labeling the classes does not specify the crop type.

At a national scale, ground campaigns, such as those carried out as part of the Sen2Agri project in South Africa and Mali, collected data on the main crop types (Defourny et al., 2019). However, these data are generally not available to validate global maps or train new classification algorithms, as their release often falls under national sovereignty.

At a local scale, datasets on crop types have been acquired, and are still acquired, across multiple world regions within the context of the JECAM (Joint Experiment for Crop Assessment and Monitoring; available online: http://www.jecam.org/, last access: 10 February 2020) international network. The JECAM initiative was first developed under the GEO (Group on Earth Observations) umbrella and then became the research and development component of GEOGLAM (GEO Global Agricultural Monitoring), to enable the global agricultural monitoring community to carry out cross-site experiments and compare results based on disparate sources of data, using various methods, over a variety of local or regional cropping systems. Data are acquired following a given protocol and nomenclature (see Defourny et al., 2014). The experiment has been operating since 2013, and some in situ datasets produced at the field scale have been used in different benchmarking mapping studies (Waldner et al., 2016; Inglada et al., 2015). However, only a part of the collected ground data was used in these studies, and the databases are not publicly shared.

To make agricultural land use data publicly available to the remote sensing community, for classification algorithm benchmarking or LULC product validation, for example, an important work of harmonization of in situ JECAM and JECAM-like agricultural land use datasets was undertaken for nine sites located in the tropical belt. The acquisition protocol was adapted from Defourny et al. (2014) to take into account the characteristics of tropical agriculture (e.g., small field size, accessibility). At each site, information on crop type and cropping practices was collected locally, at the field level, with a detailed nomenclature. The acquisition period was between 2013 and 2020, and the number of monitoring years per site was between 1 and 7.

In this paper, we describe in detail the study sites, the data collection protocol and the structure of the final database. We then discuss how the harmonization of the dataset and the diversity of the studied agrosystems, including smallholder farming, make our dataset unique and valuable for applications in emerging/developing countries in the tropics.

2 Methods

2.1 Study sites

Except for Cambodia, the study sites belong to the JECAM network (http://www.jecam.org/, last access: 15 December 2021) and cover several hundred square kilometers each. The nine sites are spread over seven countries of the tropical belt (Fig. 1) and cover different farming systems (Fig. 2).

https://essd.copernicus.org/articles/13/5951/2021/essd-13-5951-2021-f01

Figure 1. Location map of the study sites and the associated number of collection years and sampled plots (symbolized by the size of the red circles).

https://essd.copernicus.org/articles/13/5951/2021/essd-13-5951-2021-f02

Figure 2. A 1 km2 sample of land showing the landscape variety across the sampled sites due to the farming system in place: (a) rainfed cereals in Burkina Faso; (b) rice systems in Madagascar; (c) agropastoral systems in Tocantins, Brazil; (d) mixed agriculture in São Paulo, Brazil; (e) rainfed groundnut and millet agropastoral systems in Niakhar and (f) in Nioro, Senegal; (g) irrigated rice systems and orchards in Cambodia; (h) agroforestry in Kenya; (i) mixed agriculture in South Africa. Images © Google Earth 2020.

The JECAM Burkina Faso study site is a 60 km × 60 km area located around the town of Koumbia, Tuy province, in the southwest of the country. The climate is tropical. The absence of significant relief and the relatively good conditions in terms of soil and climate favored the densification of cropped surfaces, which span the majority of the area: arable lands cover more than 60 % of the site, and the remaining surface is either unsuitable for cultivation (e.g., rocky) or protected areas for nature conservation. The landscape is characterized by an alternation of large cropland areas made up of a patchwork of diversified small cropped fields (approximately 1 ha) and areas covered by natural vegetation. With the exception of a few lowland rice plots, all crops are rainfed and hence cultivated during the rainy season that occurs from May to October (approximately 1000 mm average annual rainfall). The main crops are more or less equally distributed between cash crops (mainly cotton) and staple crops, with a significant predominance of cereal crops (maize, sorghum, millet and rice) over oleaginous (sesame, groundnuts) and leguminous (peas, cowpea, soybeans) crops.

The JECAM Madagascar study site is a 60 km × 60 km zone located in the Vakinankaratra region, around Antsirabe city, in the central highlands of the country. It is characterized by terraced mountainous terrain at 1200 to 1500 m of altitude, with rice-growing valleys positioned between grassy hills and rocky outcrops. The climate is subtropical, with a rainy season from December to February. The average annual precipitation is 1300 mm. The growing season occurs from October to June. Cultivated crops are diversified, although maize and rice predominate. Fruit production is also present in the area. The mean size of an agricultural field in the area is very small (approximately 0.05 ha), but contiguous fields of the same crop type occasionally give rise to larger single crop patches. Rice is mainly grown in irrigated areas but has recently mingled with other rainfed crops on slopes (called tanetys). Other main crops are carrots, potatoes, sweet potatoes, soybeans and cassava.

The JECAM São Paulo site in Brazil is a large area of 90 km × 130 km located in São Paulo State, close to Botucatu city. The relief is relatively smooth, with slopes mostly <5 %. The climate is classified as humid subtropical with a dry winter. The average temperature is 19 °C, and the average annual precipitation is 1400 mm with a rainy season from December to March. The area is diversified and can be divided into four main agricultural subregions: (1) in the southwest, annual crops (maize, wheat, soybean) including summer (growth cycle from October to May) and winter crops (June to September), some of them irrigated with center-pivot systems; (2) in the center, forest plantations for wood production; (3) in the east, pastures; and (4) in the north, sugarcane, which has variable planting and harvesting dates. The first sugarcane cycle is planted between September and March and is grown for approximately 12–18 months. Sugarcane reaches maximal growth in April in this region. After the first harvest, the cycle of the ratoon sugarcane starts, with the annual cut between April and December. Natural forests, mostly along rivers, and orange orchards are present in these four subregions. The field size is generally larger than 10 ha and can reach more than 200 ha for pastures and forest plantations. A detailed description of this site, including crop and rotation descriptions, is given in de Oliveira Santos et al. (2019).

The JECAM Tocantins site in Brazil is part of the MATOPIBA (Maranhão, Tocantins, Piauí and Bahia) region, a new agricultural frontier in Brazil. It is a 25 km × 25 km site situated in the municipality of Pedro Afonso and surroundings, in the Cerrado biome. The climate is tropical, with a rainy season from October to March and an annual rainfall between 1700 and 1800 mm. The landscape has mild relief and is composed of a mosaic of large fields (generally approximately 100 ha), native forest remnants and rangelands. The main agricultural systems are soybean single cropping, double cropping of summer soybean from November to February followed by a cereal crop (maize, millet or sorghum) from March to June, some sugarcane, and planted pastures that are increasingly being implemented in the region as part of integrated crop–livestock systems (soybean–corn–planted-pasture). Sugarcane crops are irrigated with center-pivot systems.

The Niakhar and Nioro Senegalese study sites are located in the Senegalese Peanut Basin, in the central western part of the country. The Niakhar site spans the districts of Fatick and Bambey in the northern part of the Peanut Basin, and the Nioro site is located in the district of Nioro du Rip at the border of The Gambia. Each site covers approximately 400 km2. The climate is Sahelo-Sudanian with one rainy season (400 to 600 mm) that lasts from July to October. The relief is relatively flat. As in many parts of the Sahelian zone, smallholder farming systems are dominated by tree-based agricultural landscapes, forming so-called parklands. The Niakhar site is dominated by Faidherbia albida trees, while the Nioro site is dominated by Cordyla pinnata trees. The livelihoods of rural populations are centered on small-scale rainfed agriculture, with low usage of mineral fertilizer. Pearl millet and groundnut are the main staple crops mainly cultivated in biennial rotation. Other crops are sorghum, cowpea, bissap and maize cultivated during the rainy season.

The JECAM Kenya study site is a 25 km × 10 km area located approximately 50 km north of Nairobi, including Kangema and Muranga towns, in the central province of Kenya. It is set in a very hilly landscape with steep slopes and strong local relief variations in a general toposequence trend following an east–west altitude gradient from 1000 to 2800 m. The climate is wet tropical, tempered by altitude and regulated by two rainy seasons (from March to May or June and from October to November) with 1200 to 2000 mm annual rainfall depending on the altitude. The permanent moisture and good natural drainage of a rich volcanic loam allow for intensive agriculture, mainly based on perennial crops (mostly banana, various fruits, coffee, and tea) associated with dairy farming and rainfed horticultural as well as food crops (e.g., French beans, cabbage, maize, cassava). The latter are cropped all year long, except in January and July which are dry months, and without a defined seasonal calendar (maize, for instance, can have three cycles per year). The mean size of an agricultural field in the area is very small (approximately 0.08 ha), resulting in a patchwork landscape of heterogeneous fields with a great diversity of structures.

The Cambodian study site corresponds to a 30 km radius buffer area around Wat Pi Chey Saa Kor, Kom Poung Kor village, Kandal province, where the ecology of fruit bats Pteropus lylei was recently investigated (Choden et al., 2019). The area is characterized by a tropical climate with a rainy season from May to October. The annual rainfall is between 1000 and 1500 mm. Two main rivers, the Mekong and the Bassac, cross the area. In this flat region, rice is the dominant crop, mainly grown in irrigated areas from May to October. Fruit plantations (mango, sapodilla) and natural wetlands are also present. The mean field size is small (approximately 1 ha). The population lives in villages along roads composed of small houses with fruit tree backyards.

The JECAM South African study area is a 60 km × 60 km site located in Mpumalanga province in the northeastern part of the country, close to the Mozambique border corresponding mostly to a subsistence agriculture area. The climate is subtropical with a rainy season from November to February. The annual rainfall is between 600 and 800 mm. The site is characterized by a bush-clad plain between the Drakensberg mountains (west) and savannahs (east) with several wildlife reserves (e.g., Kruger National Park). The study area is characterized by smallholder agriculture (generally less than 1 ha), with diversified crops: cereals, groundnuts, potatoes, vegetables and fruit crops. Important timber plantations are present on the western part of the site.

2.2 Data collection

The acquisition protocol is based on the JECAM guidelines (Defourny et al., 2014) with adaptations to consider some characteristics of tropical agriculture (mainly small field size and accessibility). Field surveys were conducted at least once in each study zone, with several sites revisited over multiple consecutive years (up to 7 years for the Burkina Faso site). Campaigns took place either around the growing peak of the cropping season, for the sites with a main growing season linked to the rainy season such as Burkina Faso, or seasonally, for the sites with multiple cropping (e.g., São Paulo site). Except for Senegal where a stratified sampling plan for field surveys was used (Ndao et al., 2021), the GPS waypoints were gathered following an opportunistic sampling approach (called the “windshield survey”) along the roads or tracks according to their accessibility (which can be difficult during the rainy season, leading to fewer surveys on secondary roads or tracks in some study areas) while ensuring the best representativity of the existing cropping systems in place (Defourny et al., 2014; Waldner et al., 2019). GPS waypoints were also recorded on different types of noncrop classes (e.g., natural vegetation, settlement areas, water bodies) to allow for differentiating crop and noncrop classes. Waypoints were only recorded for homogeneous fields/entities of at least 20 m × 20 m (versus a minimum sampling unit of 0.25 ha with a minimum width of 30 m in the JECAM guidelines). To facilitate the location of sampling areas and the remote acquisition of waypoints, field operators were equipped with GPS tablets (Trimble Yuma2 or Handheld Algiz 10X) providing access to a QGIS project with very high spatial resolution (VHSR) images (orthorectified Pleiades or SPOT 6/7 images ordered just before the surveys, or PlanetScope images).
This equipment allowed for the in situ recording of attributes relative to each waypoint on data entry forms, with the automatic filling of IDs or dates and scrollable lists for other attributes to avoid data entry errors (Fig. 3a and Table A1 in Appendix A). For each waypoint, a set of attributes corresponding to the cropping practices (crop type, cropping pattern, management techniques) was recorded. An attribute referred to as “KeyWords” was also created to associate various generic terms (land cover, crop group, crop type, cropping practice, etc.; see Appendix B) to each polygon. This attribute has two objectives: (i) facilitating keyword search for the user and (ii) allowing the user to create their own nomenclature (hierarchical or not) with different levels of detail so that the nomenclature can be dedicated to the user's needs. These terms are based on the FAO land use definitions (FAO, 2020) and JECAM hierarchical nomenclature (Defourny et al., 2014), which were adapted to take into account the diversity of the farming systems in the surveyed sites. All these attributes are described in Table 1.
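
As an illustration of objective (ii), the sketch below derives a small user-defined nomenclature from keyword strings. It is a minimal example under stated assumptions: the comma-separated keyword format, the sample keyword values, the rule list and the function name `classify` are all hypothetical; the actual terms are those listed in Appendix B.

```python
# Hypothetical records: the real keyword terms are listed in Appendix B,
# and the comma-separated format used here is an assumption.
records = [
    {"id": 1, "KeyWords": "cropland, cereals, maize, rainfed"},
    {"id": 2, "KeyWords": "cropland, oleaginous, groundnut"},
    {"id": 3, "KeyWords": "natural vegetation, savannah"},
]

# User-defined nomenclature: ordered (keyword, label) rules, first match wins,
# so more specific keywords must come before more generic ones.
RULES = [
    ("cereals", "Cereal crop"),
    ("cropland", "Other crop"),
]


def classify(keywords: str, rules=RULES, default: str = "Noncrop") -> str:
    """Map a keyword string to a user-defined class label."""
    terms = {term.strip().lower() for term in keywords.split(",")}
    for keyword, label in rules:
        if keyword in terms:
            return label
    return default


labels = {rec["id"]: classify(rec["KeyWords"]) for rec in records}
```

Changing the order or content of the rule list is enough to move between a coarse crop/noncrop legend and a more detailed one without touching the underlying records.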

https://essd.copernicus.org/articles/13/5951/2021/essd-13-5951-2021-f03

Figure 3. Workflow of the data acquisition: (a) field data form used on the GPS tablet; (b) GPS waypoints acquired in the field and (c) corresponding plots after digitization of the boundaries, displayed on a satellite image in false color (red: near-infrared band, green: red band, blue: green band).

Table 1. Description of the attributes recorded for each polygon of the database.

* For each field at the Tocantins site, the operator was able to record the crop type for the two cropping seasons by observing the crop residues in the field or by interviewing the farmers. Consequently, the acquisition date of those polygons does not always correspond to the actual land cover of the field. The user must refer to the SOS and EOS dates to identify the season corresponding to the crop type recorded.

In the specific case of the Burkina Faso, Niakhar (Senegal) and São Paulo (Brazil) sites, the same fields were revisited each year to study crop rotations and fallow practices in the region. For the South African site, some points were collected by helicopter using the Producer Independent Crop Estimates System (PICES; Fourie, 2009) method developed by the National Crop Statistics Consortium. Flights were performed at an average altitude of 500 feet and a low flying speed, allowing us to record GPS points and to determine land use using a GPS tablet associated with a GIS interface and a recent VHSR image. Only clearly identifiable land covers were kept in the database.

During a field mission, the team is composed of an agronomist with geoprocessing skills, accompanied by a national researcher or technician with expertise in the local farming systems and a local driver. In some countries (Burkina Faso, Senegal, Madagascar, Kenya), local partners were trained to collect data. The training sessions were carried out directly in situ to be as close as possible to real survey conditions. The data acquisition rate varies among the visited areas: in Brazil (large fields and good road infrastructure), 300 plots can be visited in a day, while for other sites (small to very small fields) it is possible to collect between 50 and 150 plots per day (depending on the road state and field accessibility). Usually, a mission for a 3600 km2 smallholder site lasts 1 week, with approximately 700 plots visited.
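
The daily rates quoted above translate into campaign lengths with simple arithmetic. The sketch below is only a planning aid that assumes a constant daily rate; the function name `field_days` is hypothetical and not part of the protocol.

```python
import math


def field_days(n_plots: int, plots_per_day: int) -> int:
    """Full field days needed to visit n_plots at a constant daily rate."""
    if plots_per_day <= 0:
        raise ValueError("plots_per_day must be positive")
    return math.ceil(n_plots / plots_per_day)


# Rates reported in the text for small-field sites: 50 to 150 plots per day.
slow = field_days(700, 50)
fast = field_days(700, 150)

# Rate reported for the Brazilian sites (large fields, good roads).
brazil = field_days(700, 300)
```

At these rates a 700-plot campaign takes between 5 and 14 field days on small-field sites, which brackets the one-week figure given in the text.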

2.3 Postprocessing

Once the waypoints were acquired (Fig. 3b), the boundaries of each field or noncrop entity were digitized on the VHSR images in the QGIS software, and the class labels (and other attributes; see Table 1) were attached to the polygon database (Fig. 3c). Additional noncrop polygons were added by CAPI (computer-assisted photointerpretation) of the VHSR images for the built-up areas, water bodies, wetlands, mineral surfaces, and natural forest classes (land covers clearly identifiable on images).

To avoid digitizing errors, this step was performed by the same operator as the one who performed the field surveys. Despite this, if there was doubt on the delineation of a given entity (e.g., fuzzy boundaries, high heterogeneity), the given entity was removed from the database. Finally, the topology of each entity was controlled externally.

3 Data records

This database, which contains 27 074 records, is a geographic layer in shapefile format. Each record corresponds to a polygon with 16 attributes (Table 1). Because of the dispersion of study sites on the globe, the layer is in a geographic coordinate system with datum WGS84. The distribution of the different records over the study sites is reported in Table 2, along with information on the temporal (corresponding years) and spatial coverage (source, number and average size of digitized polygons).
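
Because the layer is stored in geographic coordinates, polygon areas in hectares cannot be read directly from the vertex coordinates. The sketch below shows one simple way to approximate them, using a local equirectangular projection on a spherical Earth followed by the shoelace formula. This is an illustrative approximation suited to field-sized polygons, not the method used to compute the areas in Table 2; the function names are hypothetical, and the 0.04 ha threshold is simply the 20 m × 20 m minimum entity size from Sect. 2.2.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius (spherical approximation)
MIN_FIELD_AREA_HA = 20 * 20 / 10_000  # 20 m x 20 m minimum entity = 0.04 ha


def polygon_area_ha(lonlat):
    """Approximate area (ha) of a small polygon given as (lon, lat) in degrees."""
    # Project to local metres around the mean latitude of the vertices.
    lat0 = math.radians(sum(lat for _, lat in lonlat) / len(lonlat))
    xy = [
        (EARTH_RADIUS_M * math.radians(lon) * math.cos(lat0),
         EARTH_RADIUS_M * math.radians(lat))
        for lon, lat in lonlat
    ]
    # Shoelace formula over consecutive vertex pairs, wrapping to the first.
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(xy, xy[1:] + xy[:1])
    )
    return abs(twice_area) / 2 / 10_000


def meets_minimum_size(lonlat) -> bool:
    """True if the polygon is at least as large as the minimum entity size."""
    return polygon_area_ha(lonlat) >= MIN_FIELD_AREA_HA


# A square of 0.0009 degrees per side near the equator, roughly 100 m x 100 m.
square = [(0.0, 0.0), (0.0009, 0.0), (0.0009, 0.0009), (0.0, 0.0009)]
area = polygon_area_ha(square)
```

For rigorous area statistics, reprojecting each site to a suitable equal-area or UTM coordinate reference system in a GIS is preferable to this approximation.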

Table 2. Synthetic view of the final GIS database.

a Areas calculated on cropland polygons. b Sixteen field campaigns in 4 years. c The digitized boundaries of the polygons correspond to homogeneous crop areas (collections of adjacent small fields) and not necessarily to single fields.

Twenty different land cover types and 102 different crop types were observed. About three-quarters of the observations are agricultural land, and the most represented crop types are maize, rice and sugarcane. The distributions of the main land cover and crop types are represented in Figs. 4 and 5. Figure 6 summarizes the distribution of the data acquisition method by site and shows that 87 % of the data come from an in situ survey.
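
The crop and noncrop counts given in the abstract can be cross-checked and converted into class shares with a few lines of arithmetic. The sketch below uses only those two counts; treating the "crop" polygons as the agricultural share of the database is an assumption made for illustration.

```python
# Polygon counts as reported in the abstract.
n_crop = 20_257
n_noncrop = 6_817

n_total = n_crop + n_noncrop
crop_share = 100 * n_crop / n_total  # percentage of crop polygons

print(f"{n_total} polygons, {crop_share:.1f} % crop")
# prints "27074 polygons, 74.8 % crop"
```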

https://essd.copernicus.org/articles/13/5951/2021/essd-13-5951-2021-f04

Figure 4. Distribution of the main land cover types (in number of polygons).

https://essd.copernicus.org/articles/13/5951/2021/essd-13-5951-2021-f05

Figure 5. Distribution of the main crop types (in number of polygons).

https://essd.copernicus.org/articles/13/5951/2021/essd-13-5951-2021-f06

Figure 6. Distribution of the data sources, given in percentage of the total number of polygons per site.

4 Technical validation

4.1 Quality checking

Due to the nature of the dataset (in situ observation), validation is not possible. However, quality control was performed throughout the data chain, from acquisition to postprocessing, to ensure the quality of the datasets and their homogeneity throughout the sampled years and locations.

First, the acquisition protocol was described in a technical guide provided to the field teams so that nothing was forgotten during the campaigns. The dropdown list in the data entry form reduced input and postprocessing errors.

Second, during the postprocessing step, the orthorectification of the VHSR images used to digitize the fields was checked from one year to the next, for multiyear sites, and corrected if necessary by taking homologous points. The fields were then manually digitized on the VHSR images, and the photographs taken in situ were used whenever necessary. In the case of doubtful data, these data were discarded and removed from the dataset.

Finally, each site has a referee who knows the area very well and who supervises the entire chain from data collection to database integration. In this way, each step is conducted by a specialist (agronomy, GIS, database) working together with the referee to minimize errors and contribute to the overall quality of the datasets.

4.2 Representativeness of datasets

Because of their small size, these sites cannot be considered representative of the entire country in which they are located; however, they are claimed to be representative of an area that encompasses more than the JECAM site. To specify the extent of this representative area, we referred to existing zoning maps. We used the two reference maps available for southern countries: the FEWS-NET livelihood zones map (https://fews.net/fews-data/335, last access: 15 December 2021) and the FAO farming systems map (http://www.fao.org/farmingsystems/mapstheme_01_en.htm, last access: 15 December 2021). The livelihood zones are produced at the national scale and are available for 38 developing countries. The zones are defined as geographical areas within which people broadly share the same patterns of livelihood (i.e., broadly the same production system, the same income earning opportunities and patterns of trade) (see Grillo and Holt, 2009, for more details). Farming system maps are available for the Global South (covering 130 countries). The classes are defined as a population of individual farm systems that have broadly similar resource bases, enterprise patterns, household livelihoods and constraints (Dixon et al., 2001; Auricht et al., 2014).

Although these two maps were not produced for the same purposes, they were derived using similar criteria (agro-climatology, elevation, landscape, dominant pattern of farm activities, etc.) that are closely related to agricultural land use, as recorded in the database. For both maps, the type and extent of the zones corresponding to our JECAM study sites are given in Table 3. Unfortunately, livelihood maps are available only for four of the JECAM countries presented here.

Table 3. Agricultural types and extent of study site zones: FEWS-NET livelihood zones (source: https://fews.net/fews-data/335, last access: 15 December 2021) and FAO farming system zones (source: http://www.fao.org/farmingsystems/mapstheme_01_en.htm, last access: 15 December 2021).

With a mean zone size of approximately 20 000 km² (Table 3), we are confident that our JECAM sites are representative of the livelihood zones to which they belong. The datasets presented here can thus be used to train classifiers for, or to validate, land cover maps of the corresponding zones. The farming system zones are much larger (between 300 000 and 2 000 000 km²) and include a larger diversity of environmental and farming conditions, so the JECAM sites cannot be argued to be representative of such large areas. The JECAM datasets therefore need to be complemented with other datasets belonging to the same farming system class before being used to train land cover classification algorithms. However, they can still be used for algorithm/product validation or comparison.

It is also important to mention that other agroecological zonings (AEZs) can be used (even if only a few are directly related to agricultural land use) or that users can produce their own AEZ and use it to delineate the area in which the JECAM dataset can be used to train classification algorithms.

5 Dataset application study cases

The in situ JECAM dataset and its derived land-use/land-cover products have been used in a wide spectrum of studies covering several aspects linked to agricultural monitoring, attesting to the good quality of the dataset and good spatial representativeness of tropical country farming systems.

First, specific site studies have been conducted to test several methodological aspects. For instance, land use maps combining a supervised object-based approach with multisource high spatial resolution time series were developed in Madagascar (Lebourgeois et al., 2017) and in Brazil (de Oliveira Santos et al., 2019). The Brazilian site (São Paulo) was also included in a broader study presenting an intercomparison of several cropland mapping methodologies over five contrasting JECAM sites (Brazil, Ukraine, Russia, Argentina and China) in terms of growing conditions, characteristics and cropping practices (Waldner et al., 2016). Very recently, following the rapid dissemination of up-to-date artificial intelligence approaches, Gbodjo et al. (2020) and Ienco et al. (2020) proposed testing the potential of deep-learning architectures for land cover mapping in Senegal (Niakhar) and Burkina Faso, respectively.

Second, in situ data from the Burkina Faso site and the Madagascar site were included as test sites in the Sen2-Agri system. The Sen2-Agri system is an operational processing system that provides several agricultural products from Sentinel-2 and Landsat 8 time series during the cropping season. The two sites have been included in preliminary studies preparing the Sen2-Agri system processing chain (Bontemps et al., 2015; Valero et al., 2016), while the Madagascar site was considered later in the demonstration phase of the system at the local scale (http://www.esa-sen2agri.org/system-demonstration/local-sites/madagascar/, last access: 15 December 2021).

Last, the different in situ data and the derived products have been used in studies covering different aspects of agricultural monitoring. For instance, a semiautomated clustering approach has been proposed for cropping system mapping over the Tocantins region in Brazil (Bellón et al., 2018). Using the land use maps derived from the Burkina Faso site and the Senegal site (Niakhar), remote sensing-based statistical crop yield models have been proposed for maize (Leroux et al., 2019) and pearl millet (Leroux et al., 2020). Based on the land use maps derived from the Niakhar and Nioro sites in Senegal, Ndao et al. (2021) proposed an approach to characterize agricultural landscape heterogeneity in agroforestry parklands, which was then used to analyze how far agricultural landscape diversity contributes to household food security (Leroux et al., 2022).

6 Data availability

The dataset is ready for use in any GIS software and can be filtered by region, year or keywords. It is distributed under a CC-BY license. The database, as well as the KMZ file locating the study areas, is available online on the CIRAD Dataverse at https://doi.org/10.18167/DVN1/P7OLAP (Jolivot et al., 2021).
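As an illustration of the filtering mentioned above, the following minimal Python sketch selects records by site, year and keyword. It is only a sketch under stated assumptions: the field names (`site`, `year`, `land_cover`, `keywords`), the sample years and the function names are hypothetical and are not taken from the published database; only the keyword strings follow Appendix B.

```python
from typing import Dict, List, Optional, Set

# Hypothetical sample records. Field names and years are illustrative
# assumptions; the keyword strings follow the Appendix B keywords list.
RECORDS: List[Dict] = [
    {"site": "Burkina Faso", "year": 2016, "land_cover": "Maize",
     "keywords": "Agricultural land; Cropland; Arable land; Temporary crop; Cereals"},
    {"site": "Burkina Faso", "year": 2016, "land_cover": "Cotton",
     "keywords": "Agricultural land; Cropland; Arable land; Temporary crop; Cash crop; Fiber crop"},
    {"site": "Madagascar", "year": 2017, "land_cover": "Rice",
     "keywords": "Agricultural land; Cropland; Arable land; Temporary crop; Cereals"},
    {"site": "Madagascar", "year": 2017, "land_cover": "Forest",
     "keywords": "Natural vegetation"},
]


def split_keywords(raw: str) -> Set[str]:
    """Split a semicolon-separated keyword string into a set of clean keywords."""
    return {part.strip() for part in raw.split(";") if part.strip()}


def filter_records(records: List[Dict],
                   site: Optional[str] = None,
                   year: Optional[int] = None,
                   keyword: Optional[str] = None) -> List[Dict]:
    """Keep the records matching every criterion that is not None."""
    selected = []
    for rec in records:
        if site is not None and rec["site"] != site:
            continue
        if year is not None and rec["year"] != year:
            continue
        # Exact keyword match, so "Cereal" does not match "Cereals".
        if keyword is not None and keyword not in split_keywords(rec["keywords"]):
            continue
        selected.append(rec)
    return selected


cereals = filter_records(RECORDS, keyword="Cereals")
madagascar_cropland = filter_records(RECORDS, site="Madagascar", year=2017, keyword="Cropland")
```

Matching on whole keywords rather than substrings avoids confusing nested classes such as "Permanent crop" and "Temporary crop" with the broader "Cropland" keyword.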

7 Conclusion and perspectives

The accurate mapping of cropland and associated cropping practices in smallholder farming systems of tropical countries is crucial for the improvement of agricultural monitoring systems at local and/or global scales. An essential prerequisite for reaching such objectives is the availability of in situ datasets representative of the diverse agricultural practices in tropical countries. This paper presented a harmonized in situ crop type dataset acquired between 2013 and 2020 over nine sites spread over seven tropical countries. This dataset, collected in the framework of the JECAM initiative, is unique and very valuable because it is produced at the field scale, based on in situ observations, quality-controlled, and contains standardized observations for various tropical cropping systems, including smallholder farming systems. These characteristics allow this dataset to be used as a benchmark to assess the performance and robustness of newly developed classification algorithms for cropland, crop type or crop practice mapping in diverse and documented agricultural conditions. In addition, this dataset can be used to validate the cropland class of existing global or national LULC products, in particular those recently produced with Sentinel or Landsat image time series, as well as some crop type and practice (fallow, double cropping) classes. Ultimately, it should be part of publicly available online dataset and algorithm sharing platforms, as promoted by the JECAM network and by Long et al. (2020), who encourage the sharing of datasets for remote sensing applications, and more broadly be made available to the scientific community, land use planners and agricultural monitoring agencies.

Thanks to ongoing projects and funded initiatives in which our team is involved, we will provide updates to the presented dataset on a regular basis. Several field campaigns are already planned on some of the presented sites, and projects are being built which will lead to the inclusion of multiple new ones. Moreover, since the paper also proposes a set of technical guidelines for integrating data into the database, opening it up to external contributors may lead to a significant extension of its geographic coverage and hence of its representativeness with respect to the diversity of tropical agrosystems. As future work, we intend to study the development of a technical solution aimed at facilitating such external contributions (e.g., a compliant data collection tool and workflow).

Appendix A

Example of scrollable lists used in the form. The crop type list depends on the study site (it is not necessary to mention crops not present on the site). Here is an example for the Burkina Faso site.

Table A1. Example of scrollable lists used in the form.

Appendix B: Keywords list
LandCover | KeyWords
Agricultural bare soil | Agricultural land; Cropland; Arable land; Temporary crop
Albizia gummifera | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Annual crop | Agricultural land; Cropland; Arable land; Temporary crop
Apple tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Asparagus | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melon; Leafy or stem vegetables
Asphalt road | Built-up surface
Avocado tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Banana | Agricultural land; Cropland; Permanent crop; Fruit crop
Bare soil | Bare soil
Barley | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Bean | Agricultural land; Cropland; Arable land; Temporary crop; Leguminous
Beet | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Root, bulb or tuberous vegetables
Built-up surface | Built-up surface
Burn area | Bare soil; Permanent meadow and pasture; Naturally growing
Cabbage | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Leafy or stem vegetables
Cape mahogany | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Carrot | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Root, bulb or tuberous vegetables
Cash woody crop | Agricultural land; Cropland; Permanent crop; Cash woody crop
Cashew tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Cassava | Agricultural land; Cropland; Arable land; Temporary crop; Root/tuber crop with high starch or inulin content
Cereals | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Coffee | Agricultural land; Cropland; Permanent crop; Cash woody crop
Cordia Africana | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Cotton | Agricultural land; Cropland; Arable land; Temporary crop; Cash crop; Fiber crop
Cowpea | Agricultural land; Cropland; Arable land; Temporary crop; Leguminous
Croton | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Cucumber | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Fruit-bearing vegetables
Cucurbit | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Fruit-bearing vegetables
Cyprus | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Dirt track | Bare soil
Eucalyptus | Agricultural land; Cropland; Permanent crop; Cash woody crop
Fallow | Agricultural land; Cropland; Arable land; Fallow
Ficus lutea | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Forest | Natural vegetation
Forest plantation | Agricultural land; Cropland; Permanent crop; Cash woody crop
Fruit crop | Agricultural land; Cropland; Permanent crop; Fruit crop
Fruit-bearing vegetable | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Fruit-bearing vegetables
Gabon tulip tree | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Goat tree | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Gombo | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Fruit-bearing vegetables
Grasses and other fodder crop | Agricultural land; Cropland; Arable land; Temporary crop; Grasses and other fodder crop
Grassland | Agricultural land; Permanent meadow and pasture; Naturally growing
Grevillea | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Groundnut | Agricultural land; Cropland; Arable land; Temporary crop; Oilseed crop; Leguminous; Root, bulb or tuberous vegetables
Herbaceous savannah | Natural vegetation; Grass land; Savannah
Herbaceous vegetation | Natural vegetation; Grass land
Hibiscus | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Jatropha | Agricultural land; Cropland; Permanent crop; Cash woody crop
Leafy or stem vegetable | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Leafy or stem vegetables
Leguminous | Agricultural land; Cropland; Arable land; Temporary crop; Oilseed crop; Leguminous
Macadamia tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Maize | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Mango tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Market gardening | Agricultural land; Cropland; Arable land; Temporary crop
Mid fallow | Agricultural land; Cropland; Arable land; Fallow
Millet | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Mineral soil | Bare soil
Mixed annual crops | Agricultural land; Cropland; Arable land; Temporary crop
Mixed cereals | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Mixed trees | Agricultural land; Cropland; Permanent crop; Fruit crop; Natural vegetation; Forest
Napier grass | Agricultural land; Cropland; Arable land; Temporary crop; Grasses and other fodder crop
Oat | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Oilseed crop | Agricultural land; Cropland; Arable land; Temporary crop; Oilseed crop
Old fallow | Agricultural land; Permanent meadow and pasture; Naturally growing
Onion | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Root, bulb or tuberous vegetables
Orange tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Other crop | Agricultural land; Cropland
Papaya tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Pasture | Agricultural land; Permanent meadow and pasture; Naturally growing
Pea | Agricultural land; Cropland; Arable land; Temporary crop; Leguminous
Peach tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Pear tree | Agricultural land; Cropland; Permanent crop; Fruit crop
Pine | Agricultural land; Cropland; Permanent crop; Cash woody crop
Pineapple | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Fruit-bearing vegetables
Potato | Agricultural land; Cropland; Arable land; Temporary crop; Root/tuber crop with high starch or inulin content
Ravintsara | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Rice | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Root | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Root, bulb or tuberous vegetables
Root, bulb or tuberous vegetable | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Root, bulb or tuberous vegetables
Sapodilla tree | Agricultural land; Cropland; Permanent crops; Fruit crop
Savannah | Natural vegetation; Savannah
Savannah with shrubs | Natural vegetation; Shrub land; Savannah
Savannah with trees | Natural vegetation; Open forest; Savannah
Sesame | Agricultural land; Cropland; Arable land; Temporary crop; Oilseed crop
Shrub land | Natural vegetation; Shrub land
Shrub vegetation | Natural vegetation; Shrub land
Sorghum | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Soybean | Agricultural land; Cropland; Arable land; Temporary crop; Oilseed crop; Leguminous
Sugarcane | Agricultural land; Cropland; Arable land; Temporary crop; Sugar crop
Sunflower | Agricultural land; Cropland; Arable land; Temporary crop; Oilseed crop
Sweet potato | Agricultural land; Cropland; Arable land; Temporary crop; Root/tuber crop with high starch or inulin content
Taro | Agricultural land; Cropland; Arable land; Temporary crop; Root/tuber crop with high starch or inulin content
Tea | Agricultural land; Cropland; Permanent crop; Cash woody crop
Tomato | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Fruit-bearing vegetables
Vegetable and root | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Root, bulb or tuberous vegetables
Vegetables | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons
Vineyard | Agricultural land; Cropland; Permanent crop; Cash woody crop
Water body | Water body
Watermelon | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Fruit-bearing vegetables
Wattle tree | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Weakly vegetated agricultural | Agricultural land; Cropland; Arable land; Temporary crop
Wetland | Natural vegetation
Wheat | Agricultural land; Cropland; Arable land; Temporary crop; Cereals
Wild radish | Agricultural land; Cropland; Arable land; Temporary crop; Cover crop
Woodlot | Agricultural land; Cropland; Permanent crop; Multifunctional woody crop
Young fallow | Agricultural land; Cropland; Arable land; Fallow
Zucchini | Agricultural land; Cropland; Arable land; Temporary crop; Vegetables and melons; Fruit-bearing vegetables
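
The keyword strings in the list above are semicolon-separated and appear to run from general to specific categories. As a hedged illustration of how they might be exploited, the Python sketch below builds a reverse index from each keyword to the land cover labels that carry it. The sample rows are taken from the list above; the variable and function names (`APPENDIX_B_SAMPLE`, `parse_keywords`, `build_keyword_index`) are hypothetical and do not come from the published database.

```python
from collections import defaultdict
from typing import Dict, List

# A few rows taken from the Appendix B list: land cover label -> keyword string.
APPENDIX_B_SAMPLE: Dict[str, str] = {
    "Maize": "Agricultural land; Cropland; Arable land; Temporary crop; Cereals",
    "Sugarcane": "Agricultural land; Cropland; Arable land; Temporary crop ; Sugar crop",
    "Coffee": "Agricultural land; Cropland; Permanent crop; Cash woody crop",
    "Pasture": "Agricultural land; Permanent meadow and pasture; Naturally growing",
    "Forest": "Natural vegetation",
}


def parse_keywords(raw: str) -> List[str]:
    """Split on ';' and strip whitespace, preserving the original keyword order."""
    return [part.strip() for part in raw.split(";") if part.strip()]


def build_keyword_index(table: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each keyword to the alphabetically sorted land cover labels using it."""
    index = defaultdict(list)
    for label, raw in table.items():
        for keyword in parse_keywords(raw):
            index[keyword].append(label)
    return {keyword: sorted(labels) for keyword, labels in index.items()}


INDEX = build_keyword_index(APPENDIX_B_SAMPLE)
cropland_labels = INDEX["Cropland"]              # every sampled label tagged as cropland
temporary_crop_labels = INDEX["Temporary crop"]  # annual crops only
```

Such an index makes it straightforward to aggregate detailed crop labels into a coarser legend (for example, cropland versus non-cropland) before training or validating a classifier.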
Author contributions

AJ, AB, LL and VL wrote the paper with substantial contributions from the following principal investigators: VL (Madagascar); RG (Burkina Faso); LL (Senegal); GlM (Brazil – São Paulo); BB, RF and MS (Brazil – Tocantins); AnT (Cambodia); CL (Kenya); and TN, AJ and PD (South Africa). AJ, VL and RG designed the database. AJ harmonized and compiled the data. Ground data collection and preprocessing were carried out by the following authors: BN and MD (Senegal Niakhar); IT (Senegal Nioro); ACB, ER, MA, SaD, StD, VA and VL (Madagascar); AB, AJ, CJ, BK, DLS, LL, MC, RG and StD (Burkina); GlM and CLMdOS (Brazil – São Paulo); AB and BB (Brazil – Tocantins); CL and MM (Kenya); AJ and MG (Cambodia); and AJ and AlT (South Africa).

Competing interests

The contact author has declared that neither they nor their co-authors have any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

Thanks to Embrapa Pesca e Aquicultura (Palmas, TO, Brazil), Centre de Suivi Ecologique (Senegal), Agricultural Research Council (South Africa), Pasteur Institut (Cambodia), FOFIFA/DP SPAD (Madagascar), DP ASAP (Burkina Faso), ICRAF (World Agroforestry Center, Kenya), NIPE and FEAGRI (Brazil), and Eder Araujo da Silva (Floragro Apoio, Brazil) for their technical support.

Financial support

This research was mainly supported by CIRAD (scientist funding) and by various projects (CNES APR TOSCA projects, SIGMA FP7 grant no. 603719; ESA Sen2Agri grant no. ESRIN 400109979/14/I-AM; CNPq grant nos. 454292/2014-7 and 307560/2016-3; FAPESP-Microsoft Research grant no. 2014/50715-9, 25 – the SERENA project funded by the CIRAD-INRA metaprogram GloFoodS). The SPOT and Pleiades images were acquired through the GEOSUD program EQPX-20, funded by the French National Research Agency. The PlanetScope images were acquired through Planet's Science Ambassador Program.

Review statement

This paper was edited by Alexander Gruber and reviewed by two anonymous referees.

References

Alemohammad, S. H., Ballantyne, A., Bromberg Gaber, Y., Booth, K., Nakanuku-Diggs, L., and Miglarese, A. H.: LandCoverNet: A Global Land Cover Classification Training Dataset, Radiant MLHub [data set], available at: https://radiant-mlhub.s3-us-west-2.amazonaws.com/landcovernet/Documentation.pdf, last access: 7 September 2020. 

Auricht, C., Dixon, J., Boffa, J.-M., and Garrity, D.: Farming Systems of Africa, in: Atlas of African agriculture research and development: Revealing agriculture's place in Africa, 14–15, https://doi.org/10.2499/9780896298460_06, 2014. 

Bégué, A., Arvor, D., Bellon, B., Betbeder, J., de Abelleyra, D., Ferraz, R. P. D., Lebourgeois, V., Lelong, C., Simões, M., and Verón, S. R.: Remote Sensing and Cropping Practices: A Review, Remote Sens., 10, 99, https://doi.org/10.3390/rs10010099, 2018. 

Bellón, B., Bégué, A., Lo Seen, D., Lebourgeois, V., Evangelista, B. A., Simões, M., and Demonte Ferraz, R. P.: Improved regional-scale Brazilian cropping systems' mapping based on a semi-automatic object-based clustering approach, Int. J. Appl. Earth Obs. Geoinf., 68, 127–138, https://doi.org/10.1016/j.jag.2018.01.019, 2018.

Bontemps, S., Arias, M., Cara, C., Dedieu, G., Guzzonato, E., Hagolle, O., Inglada, J., Matton, N., Morin, D., Popescu, R., Rabaute, T., Savinaud, M., Sepulcre, G., Valero, S., Ahmad, I., Bégué, A., Wu, B., de Abelleyra, D., Diarra, A., Dupuy, S., French, A., Akhtar, I. U. H., Kussul, N., Lebourgeois, V., Le Page, M., Newby, T., Savin, I., Verón, S., Koetz, B., and Defourny, P.: Building a Data Set over 12 Globally Distributed Sites to Support the Development of Agriculture Monitoring Applications with Sentinel-2, Remote Sens., 7, 16062–16090, https://doi.org/10.3390/rs71215815, 2015. 

Choden, K., Ravon, S., Epstein, J. H., Hoem, T., Furey, N., Gely, M., Jolivot, A., Hul, V., Neung, C., Tran, A., and Cappelle, J.: Pteropus lylei primarily forages in residential areas in Kandal, Cambodia, Ecol. Evol., 9, 4181–4191, https://doi.org/10.1002/ece3.5046, 2019. 

Defourny, P., Jarvis, I., and Blaes, X.: JECAM Guidelines for cropland and crop type definition and field data collection, JECAM, available at: http://jecam.org/wp-content/uploads/2018/10/JECAM_Guidelines_for_Field_Data_Collection_v1_0.pdf (last access: 17 May 2021), 2014. 

Defourny, P., Bontemps, S., Bellemans, N., Cara, C., Dedieu, G., Guzzonato, E., Hagolle, O., Inglada, J., Nicola, L., Rabaute, T., Savinaud, M., Udroiu, C., Valero, S., Begue, A., Dejoux, J.-F., El Harti, A., Ezzahar, J., Kussul, N., Labbassi, K., Lebourgeois, V., Miao, Z., Newby, T., Nyamugama, A., Salh, N., Shelestov, A., Simonneaux, V., Traore, P. S., Traore, S. S., and Koetz, B.: Near real-time agriculture monitoring at national scale at parcel resolution: Performance assessment of the Sen2-Agri automated system in various cropping systems around the world, Remote Sens. Environ., 221, 551–568, https://doi.org/10.1016/j.rse.2018.11.007, 2019. 

de Oliveira Santos, C. L. M., Lamparelli, R. A. C., Dantas Araújo Figueiredo, G. K., Dupuy, S., Boury, J., dos Santos Luciano, A. C., da Silva Torres, R., and le Maire, G.: Classification of Crops, Pastures, and Tree Plantations along the Season with Multi-Sensor Image Time Series in a Subtropical Agricultural Region, Remote Sens., 11, 334, https://doi.org/10.3390/rs11030334, 2019. 

Dixon, J., Gulliver, A., and Gibbon, D.: Farming Systems and Poverty: Improving Farmers' Livelihoods in a Changing World, FAO, World Bank, Rome, Washington DC, USA, 49 pp., 2001. 

FAO: Land use, irrigation and agricultural practices – Definitions, available at: http://www.fao.org/fileadmin/templates/ess/ess_test_folder/Definitions/Land_Use_Definitions_FAOSTAT.xlsx, last access: 9 September 2020.

Fourie, A.: Better Crop Estimates in South Africa Integrating GIS with other business systems, in: GIS Best Practices – GIS for Agriculture, 9–13, available at: https://www.esri.com/content/dam/esrisites/sitecore-archive/Files/Pdfs/library/bestpractices/gis-for-agriculture.pdf (last access: 15 December 2021), 2009. 

Fritz, S., See, L., McCallum, I., You, L., Bun, A., Moltchanova, E., Duerauer, M., Albrecht, F., Schill, C., Perger, C., Havlik, P., Mosnier, A., Thornton, P., Wood-Sichra, U., Herrero, M., Becker-Reshef, I., Justice, C., Hansen, M., Gong, P., Abdel Aziz, S., Cipriani, A., Cumani, R., Cecchi, G., Conchedda, G., Ferreira, S., Gomez, A., Haffani, M., Kayitakire, F., Malanding, J., Mueller, R., Newby, T., Nonguierma, A., Olusegun, A., Ortner, S., Rajak, D. R., Rocha, J., Schepaschenko, D., Schepaschenko, M., Terekhov, A., Tiangwa, A., Vancutsem, C., Vintrou, E., Wenbin, W., Velde, M., Dunwoody, A., Kraxner, F., and Obersteiner, M.: Mapping global cropland and field size, Glob. Change Biol., 21, 1980–1992, https://doi.org/10.1111/gcb.12838, 2015. 

Fritz, S., See, L., Perger, C., McCallum, I., Schill, C., Schepaschenko, D., Duerauer, M., Karner, M., Dresel, C., Laso-Bayas, J.-C., Lesiv, M., Moorthy, I., Salk, C. F., Danylo, O., Sturn, T., Albrecht, F., You, L., Kraxner, F., and Obersteiner, M.: A global dataset of crowdsourced land cover and land use reference data, Sci. Data, 4, 170075, https://doi.org/10.1038/sdata.2017.75, 2017.

Gbodjo, Y. J. E., Ienco, D., Leroux, L., Interdonato, R., Gaetano, R., and Ndao, B.: Object-Based Multi-Temporal and Multi-Source Land Cover Mapping Leveraging Hierarchical Class Relationships, Remote Sens., 12, 2814, https://doi.org/10.3390/rs12172814, 2020. 

Grillo, J. and Holt, J.: Application of the Livelihood Zone Maps and Profiles for Food Security Analysis and Early Warning – Guidance for Famine Early Warning Systems Network (FEWS NET) Representatives and Partners, USAID FEWS NET, available at: https://fews.net/fews-data/335 (last access: 15 December 2021), 2009.

Hoskins, A. J., Bush, A., Gilmore, J., Harwood, T., Hudson, L. N., Ware, C., Williams, K. J., and Ferrier, S.: Downscaling land-use data to provide global 30” estimates of five land-use classes, Ecol. Evol., 6, 3040–3055, https://doi.org/10.1002/ece3.2104, 2016. 

Ienco, D., Gbodjo, Y. J. E., Gaetano, R., and Interdonato, R.: Weakly Supervised Learning for Land Cover Mapping of Satellite Image Time Series via Attention-Based CNN, IEEE Access, 8, 179547–179560, https://doi.org/10.1109/ACCESS.2020.3024133, 2020. 

Inglada, J., Arias, M., Tardy, B., Hagolle, O., Valero, S., Morin, D., Dedieu, G., Sepulcre, G., Bontemps, S., Defourny, P., and Koetz, B.: Assessment of an Operational System for Crop Type Map Production Using High Temporal and Spatial Resolution Satellite Optical Imagery, Remote Sens., 7, 12356–12379, https://doi.org/10.3390/rs70912356, 2015. 

Jolivot, A., Lebourgeois, V., Ameline, M., Andriamanga, V., Bellon, B., Castets, M., Crespin-Boucaud, A., Defourny, P., Diaz, S., Dieye, M., Dupuy, S., Ferraz, R., Gaetano, R., Gely, M., Jahel, C., Kabore, B., Lelong, C., Le Maire, G., Leroux, L., Lo Seen, D., Muthoni, M., Ndao, B., Newby, T., De Oliveira Santos, C. L. M., Rasoamalala, E., Simoes, M., Thiaw, I., Timmermans, A., Tran, A., and Begue, A.: Harmonized in situ JECAM datasets for agricultural land use mapping and monitoring in tropical countries, Cirad [data set], https://doi.org/10.18167/DVN1/P7OLAP, 2021. 

Laso Bayas, J. C., See, L., Perger, C., Justice, C., Nakalembe, C., Dempewolf, J., and Fritz, S.: Validation of Automatically Generated Global and Regional Cropland Data Sets: The Case of Tanzania, Remote Sens., 9, 815, https://doi.org/10.3390/rs9080815, 2017a. 

Laso Bayas, J. C., Lesiv, M., Waldner, F., Schucknecht, A., Duerauer, M., See, L., Fritz, S., Fraisl, D., Moorthy, I., McCallum, I., Perger, C., Danylo, O., Defourny, P., Gallego, J., Gilliams, S., Akhtar, I. ul H., Baishya, S. J., Baruah, M., Bungnamei, K., Campos, A., Changkakati, T., Cipriani, A., Das, K., Das, K., Das, I., Davis, K. F., Hazarika, P., Johnson, B. A., Malek, Z., Molinari, M. E., Panging, K., Pawe, C. K., Pérez-Hoyos, A., Sahariah, P. K., Sahariah, D., Saikia, A., Saikia, M., Schlesinger, P., Seidacaru, E., Singha, K., and Wilson, J. W.: A global reference database of crowdsourced cropland data collected using the Geo-Wiki platform, Sci. Data, 4, 170136, https://doi.org/10.1038/sdata.2017.136, 2017b. 

Lebourgeois, V., Dupuy, S., Vintrou, É., Ameline, M., Butler, S., and Bégué, A.: A Combined Random Forest and OBIA Classification Scheme for Mapping Smallholder Agriculture at Different Nomenclature Levels Using Multisource Data (Simulated Sentinel-2 Time Series, VHRS and DEM), Remote Sens., 9, 259, https://doi.org/10.3390/rs9030259, 2017. 

Leroux, L., Castets, M., Baron, C., Escorihuela, M.-J., Bégué, A., and Lo Seen, D.: Maize yield estimation in West Africa from crop process-induced combinations of multi-domain remote sensing indices, Eur. J. Agron., 108, 11–26, https://doi.org/10.1016/j.eja.2019.04.007, 2019. 

Leroux, L., Falconnier, G. N., Diouf, A. A., Ndao, B., Gbodjo, J. E., Tall, L., Balde, A. A., Clermont-Dauphin, C., Bégué, A., Affholder, F., and Roupsard, O.: Using remote sensing to assess the effect of trees on millet yield in complex parklands of Central Senegal, Agr. Syst., 184, 102918, https://doi.org/10.1016/j.agsy.2020.102918, 2020.

Leroux, L., Faye, N. F., Jahel, C., Falconnier, G. N., Diouf, A. A., Ndao, B., Tiaw, I., Senghor, Y., Kanfany, G., Balde, A., Dieye, M., Sirdey, N., Alobo Loison, S., Corbeels, M., Baudron, F., and Bouquet, E.: Exploring the agricultural landscape diversity-food security nexus: an analysis in two contrasted parklands of Central Senegal, Agr. Syst., 196, 103312, https://doi.org/10.1016/j.agsy.2021.103312, 2022. 

Long, Y., Xia, G., Li, S., Yang, W., Yang, M. Y., Zhu, X., Zhang, L., and Li, D.: DiRS: On Creating Benchmark Datasets for Remote Sensing Image Interpretation, available at: https://www.researchgate.net/publication/342377115_DiRS_On_Creating_Benchmark_Datasets_for_Remote_Sensing_Image_Interpretation, last access: 7 September 2020.

Ndao, B., Leroux, L., Gaetano, R., Diouf, A. A., Soti, V., Mbow, C., Bégué, A., and Sambou, B.: Landscape heterogeneity analysis using geospatial techniques and a priori knowledge in Sahelian agroforestry systems of Senegal, Ecol. Indic., 125, 107481, https://doi.org/10.1016/j.ecolind.2021.107481, 2021. 

Tsendbazar, N. E., de Bruin, S., and Herold, M.: Assessing global land cover reference datasets for different user communities, ISPRS J. Photogramm. Remote Sens., 103, 93–114, https://doi.org/10.1016/j.isprsjprs.2014.02.008, 2015. 

Valero, S., Morin, D., Inglada, J., Sepulcre, G., Arias, M., Hagolle, O., Dedieu, G., Bontemps, S., Defourny, P., and Koetz, B.: Production of a Dynamic Cropland Mask by Processing Remote Sensing Image Series at High Temporal and Spatial Resolutions, Remote Sens., 8, 55, https://doi.org/10.3390/rs8010055, 2016. 

Waldner, F., Fritz, S., Di Gregorio, A., and Defourny, P.: Mapping Priorities to Focus Cropland Mapping Activities: Fitness Assessment of Existing Global, Regional and National Cropland Maps, Remote Sens., 7, 7959–7986, https://doi.org/10.3390/rs70607959, 2015. 

Waldner, F., De Abelleyra, D., Verón, S. R., Zhang, M., Wu, B., Plotnikov, D., Bartalev, S., Lavreniuk, M., Skakun, S., Kussul, N., Le Maire, G., Dupuy, S., Jarvis, I., and Defourny, P.: Towards a set of agrosystem-specific cropland mapping methods to address the global cropland diversity, Int. J. Remote Sens., 37, 3196–3231, https://doi.org/10.1080/01431161.2016.1194545, 2016. 

Waldner, F., Bellemans, N., Hochman, Z., Newby, T., de Abelleyra, D., Verón, S. R., Bartalev, S., Lavreniuk, M., Kussul, N., Maire, G. L., Simoes, M., Skakun, S., and Defourny, P.: Roadside collection of training data for cropland mapping is viable when environmental and management gradients are surveyed, Int. J. Appl. Earth Obs. Geoinf., 80, 82–93, https://doi.org/10.1016/j.jag.2019.01.002, 2019.

Short summary
This paper presents nine standardized crop type reference datasets collected between 2013 and 2020 in seven tropical countries. It aims at participating in the difficult exercise of mapping agricultural land use through satellite image classification in those complex areas where few ground truth or census data are available. These quality-controlled datasets were collected in the framework of the international JECAM initiative and contain 27 074 polygons documented by detailed keywords.