EuroCrops v2.0: Multi-annual harmonized parcel level crop type data linked to European Union-wide survey, statistical and Earth Observation products

Claverie, Martin; Chan, Ayshah; See, Linda; Ramos, Helena; Koeble, Renate; Yordanov, Momchil; Skøien, Jon Olav; Urbano, Ferdinando; d'Andrimont, Raphael; Schneider, Maja; Körner, Marco; Van der Velde, Marijn

doi:10.5194/essd-2025-752

Preprints

26 Jan 2026

Status: this preprint is currently under review for the journal ESSD.

Abstract. As part of the Common Agricultural Policy (CAP) of the European Union (EU), farmers make annual declarations of the agricultural activities for which they receive subsidies. The declarations include the crops they grow at parcel level, referred to as Geo-Spatial Application (GSA) data. Paying Agencies (PA) of every EU Member State (MS) use specific crop classifications in their native language, and not all provide access to the GSA data. In the past, the EuroCrops initiative harmonized openly available GSA data for a single year (2021) using the Hierarchical Crop and Agriculture Taxonomy (HCAT), but multiple years are available depending on the country. Harmonizing a time series of farmers' crop declarations at parcel level would allow for comparative spatiotemporal analysis across the EU, the development of indicators that can be used for CAP and other policy monitoring purposes, and would provide data for training and validation of remotely sensed products. Here we have collected the GSA crop type declarations and parcel geometries that are publicly available from 18 PAs, the administrative bodies managing GSA data, for a minimum of three years. We have then harmonized the GSA data using HCAT v.4, a new version developed as part of this work. The data set includes nearly 47 million parcels covering 21 Mha. To facilitate integration and interoperability of the GSA data with other EU data sets containing spatial information on crops, we harmonized the crop classes used in the following data sets with HCAT v.4: 1) LUCAS, 2) the Integrated Farm Statistics/Farm Structure Survey, 3) the Farm Accountancy Data Network (FADN), and 4) the classification systems of the Copernicus High Resolution Layer on Crop Types. 
To demonstrate the potential of the multiannual, harmonized data set presented in this paper, the GSA data were aggregated to NUTS 2 regions and compared with Eurostat crop area statistics. The comparison shows good correspondence for many crops, identifies the crops and countries where agreement is weaker, and offers possible reasons for the discrepancies. The data can also be used to map crop rotations, an application illustrated here with a map of maize monoculture. Farmers' declarations will become increasingly available as MS are required to publish them under the High-Value Dataset regulation. The EuroCrops v2.0 data set is registered and publicly available under the DOI https://doi.org/10.2905/b9fb9e67-78a9-4327-9d59-39a928d812d3.

Received: 09 Dec 2025 – Discussion started: 26 Jan 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 17064 KB)

Supplement (993 KB)

Status: final response (author comments only)

RC1: 'Comment on essd-2025-752', Anonymous Referee #1, 25 Feb 2026

I recommend minor revision. The manuscript describes EuroCrops v2.0, a multi-annual harmonized parcel-level crop-type dataset built from publicly available EU Geo-Spatial Application (GSA) declarations (18 Paying Agencies; minimum three consecutive years including 2021), harmonized via the updated HCAT v4 taxonomy and linked to key EU statistical/survey and EO classification systems. The paper is timely and valuable for CAP-related analysis, crop-rotation studies, and as training/validation data for EO products; the processing chain and interoperability ambition are strong.
Handling missing crop codes: you gap-fill codes where possible and otherwise create new unique codes (starting at 10001) “in a random order.” Please clarify the randomization (seed; stability across releases) and provide a quick summary of how frequent this situation is by country/year (e.g., Estonia/Ireland name-only years).
Translation + matching quality: you translate via multiple MT APIs (Google/DeepL/OPUS-MT and sometimes ChatGPT), then manually check disagreements, and match via weighted Levenshtein plus rules; you note translation/matching challenges (e.g., ~58% for Finnish, ~51% matched for Brandenburg). Please add a small, systematic quality assessment so users understand typical error modes (spelling mistakes, colloquial names, mixtures).
Optionally, to extend the validation side, you may consider briefly positioning EuroCrops alongside global gridded crop-area products such as CROPGRIDS (173 crops circa 2020) as complementary yet solid benchmarks at coarser resolution (useful for sanity-checking aggregated totals, not as parcel-level substitutes):
https://www.nature.com/articles/s41597-024-03247-7
https://openknowledge.fao.org/items/ebc60e53-2b29-4173-b1fa-2320669b1312
Overall, I see these as documentation and reproducibility refinements rather than methodological blockers.
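
One way to make the assignment of new codes reproducible, as requested in the comment on missing crop codes, is to replace the random order with a deterministic one. The sketch below is illustrative only and is not the authors' procedure: it numbers the sorted, normalized crop names from 10001. The function name `assign_new_codes` and the example crop names are hypothetical.

```python
def assign_new_codes(names_without_code, start=10001):
    """Assign integer codes to crop names that lack an original code.

    Sorting the unique, normalized names removes any dependence on input
    order or on a random seed, so the same set of names always yields
    the same codes.
    """
    unique_names = sorted({name.strip().lower() for name in names_without_code})
    return {name: start + offset for offset, name in enumerate(unique_names)}


# Hypothetical name-only declarations; duplicates differ only in case/spacing.
codes = assign_new_codes(["Oats", "barley", "oats ", "Rye"])
print(codes)
```

Sorting alone does not keep codes stable when new names appear in a later release, because insertions shift the numbering. Stability across releases would additionally require persisting the mapping and only appending to it.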

Citation: https://doi.org/10.5194/essd-2025-752-RC1
RC2: 'Comment on essd-2025-752', Anonymous Referee #2, 03 Mar 2026
Reviewer’s Comments
Earth System Science Data Manuscript essd-2025-752
“EuroCrops v2.0: Multi-annual harmonized parcel level crop type data linked to European Union-wide survey, statistical and Earth Observation products”

This manuscript presents EuroCrops v2.0, a multi-annual harmonized dataset of parcel-level crop declarations derived from publicly available Geo-Spatial Application (GSA) data across 18 Paying Agencies in the European Union. The dataset is harmonized using an updated Hierarchical Crop and Agriculture Taxonomy (HCAT4), which introduces additional dimensions for seasonality and usage, and is linked to multiple European statistical and Earth Observation classification systems (AGRIPROD, LUCAS, IFS/FSS, FADN, HRL Crop Types). The manuscript fits well within the scope of Earth System Science Data. It describes a large, policy-relevant, open-access geospatial dataset with clear potential for reuse in Earth observation, agricultural monitoring, sustainability assessment, and CAP-related evaluation. The extension from a single-year dataset (EuroCrops v1) to a multi-annual harmonized dataset represents a significant and valuable advance. The manuscript is generally well structured, and the authors demonstrate considerable effort in data cleaning, semantic harmonization, and validation. The comparison with Eurostat statistics and gridded IFS data provides useful validation evidence. The discussion appropriately acknowledges limitations.
However, several aspects require clarification or strengthening, particularly regarding reproducibility of the harmonization workflow, quantification of methodological impacts (e.g., stacking and rasterization), and documentation of uncertainty and metadata.
Major Comments
1. Handling of Multiple Crops per Parcel

The manuscript describes rules for handling multiple crops listed within a single parcel label (e.g., hierarchy-based grouping, majority crop assignment, first-mentioned crop). However:
The proportion of parcels affected by multi-crop labels is not reported.

The potential bias introduced by assigning the “first mentioned” crop is not assessed.

Country-level differences in this issue are not quantified. Given that some countries (e.g., Portugal, Ireland) exhibit specific declaration complexities, this could introduce systematic distortions.
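
The share of affected parcels requested here could be reported with a few lines of code. The following is a minimal sketch under stated assumptions, not the authors' workflow: it assumes that multi-crop labels can be recognized by a separator (here `+`, `/` or `,`, which is a hypothetical choice) and applies the "first mentioned" rule described in the manuscript. All function names and example parcels are illustrative.

```python
import re
from collections import defaultdict

# Hypothetical separators between crops within a single parcel label.
SEPARATORS = re.compile(r"\s*[+/,]\s*")


def split_label(label):
    """Return the list of crop names found in one parcel label."""
    return [part for part in SEPARATORS.split(label.strip()) if part]


def first_mentioned(label):
    """Apply the 'first mentioned crop' rule to a non-empty parcel label."""
    return split_label(label)[0]


def multi_crop_share(parcels):
    """Fraction of parcels per country whose label names several crops.

    `parcels` is an iterable of (country, label) pairs.
    """
    totals = defaultdict(int)
    multi = defaultdict(int)
    for country, label in parcels:
        totals[country] += 1
        if len(split_label(label)) > 1:
            multi[country] += 1
    return {country: multi[country] / totals[country] for country in totals}


# Made-up example parcels, for illustration only.
parcels = [
    ("PT", "oats + vetch"),
    ("PT", "maize"),
    ("IE", "barley/peas"),
    ("IE", "grass"),
    ("IE", "wheat"),
    ("IE", "oats"),
]
shares = multi_crop_share(parcels)
print(shares)
print(first_mentioned("oats + vetch"))
```

Comparing the areas obtained with the first-mentioned rule against those obtained with an alternative rule, on the multi-crop subset only, would give a first estimate of the bias in question.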

2. Reproducibility of the Harmonization Workflow
The harmonization process is conceptually well described, including translation, semantic verification, matching to HCAT4, and iterative refinement. However, several components lack sufficient detail for full reproducibility. The manuscript indicates that crop names were translated using Google Translate, DeepL, OPUS-MT, and, where necessary, ChatGPT. Additionally, a large language model (LLM) was used to detect semantic discrepancies in crop code definitions over time. For ESSD, automated or semi-automated semantic decisions that affect taxonomy assignment should be reproducible. The manuscript should clarify:
Which LLM(s) and versions were used?

Were deterministic settings applied?

What prompts or criteria were used to detect semantic discrepancies?

How were disagreements between translation tools resolved?

What proportion of entries required manual intervention?

Without these details, it is difficult for future users to replicate or audit the harmonization procedure.
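
To make the kind of auditable matching step discussed here concrete, the sketch below implements a plain Levenshtein distance with configurable edit costs and a threshold that routes poor matches to manual review. It is an assumption-laden illustration, not the authors' implementation: the manuscript's actual weights and rules are not given here, so `sub_cost`, `indel_cost`, `max_distance` and the example taxonomy entries are all hypothetical.

```python
def levenshtein(a, b, sub_cost=1.0, indel_cost=1.0):
    """Edit distance between strings a and b with configurable costs."""
    previous = [j * indel_cost for j in range(len(b) + 1)]
    for i, char_a in enumerate(a, start=1):
        current = [i * indel_cost]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0.0 if char_a == char_b else sub_cost)
            deletion = previous[j] + indel_cost
            insertion = current[j - 1] + indel_cost
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def best_match(name, candidates, max_distance=2.0):
    """Return (candidate, distance) for the closest candidate.

    When no candidate is within max_distance, return (None, distance) so
    that the entry can be flagged for manual review.
    """
    scored = sorted((levenshtein(name.lower(), c.lower()), c) for c in candidates)
    distance, candidate = scored[0]
    return (candidate, distance) if distance <= max_distance else (None, distance)


# Hypothetical taxonomy entries and translated crop names.
taxonomy = ["winter wheat", "spring barley", "maize"]
print(best_match("winter whaet", taxonomy))  # spelling mistake
print(best_match("lupins", taxonomy))        # no acceptable match
```

Logging the returned distance and the review flag for every entry would also yield the proportion of entries that required manual intervention.
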
3. Uncertainty and Quality Indicators
The manuscript presents validation against Eurostat and IFS data, including R² and NSE metrics. This is commendable and strengthens confidence in the dataset. However, the dataset itself does not appear to provide:
Harmonization confidence indicators,

Translation reliability scores,

Flags for manually corrected entries,

Country-level reliability summaries.

Given the complexity of semantic harmonization across languages and classification systems, some form of quality metadata would enhance transparency.
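
For readers less familiar with the two validation metrics mentioned in this comment, the sketch below computes the Nash-Sutcliffe efficiency (NSE) and R² for a pair of area series. It rests on assumptions that are not confirmed by the manuscript: R² is taken as the squared Pearson correlation, the reference statistics play the role of the "observed" series, and the numbers are made up for illustration.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(x, y):
    """Squared Pearson correlation between two equally long series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov ** 2 / (var_x * var_y)


# Made-up crop areas for three regions; the aggregated values are 20% too high.
reference_area = [10.0, 20.0, 30.0]
aggregated_area = [12.0, 24.0, 36.0]

print(r_squared(reference_area, aggregated_area))
print(nse(reference_area, aggregated_area))
```

In this example the two series are perfectly correlated, so R² is 1, while NSE is lower because it penalizes the systematic overestimate. Reporting both per country would therefore separate correlation from bias in any reliability summary.
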
4. Metadata, FAIR Compliance, and Versioning
The dataset is openly available (with DOI), and code is provided via GitHub. This is highly positive. However, the manuscript would benefit from:
A clear data dictionary describing all attributes in the geoparquet files,

Explicit CRS documentation,

Description of file naming conventions,

A versioning and update policy (e.g., semantic versioning, expected update frequency),

Clarification on long-term repository archiving (e.g., Zenodo snapshot of code).

Minor Comments
Abstract: Consider explicitly stating the temporal coverage range (e.g., 2008–2023, where applicable) for clarity.

Consistency of terminology: Ensure consistent referencing of HCAT versions (HCAT2, HCAT3, HCAT4).

CRS specification: Explicitly define the coordinate reference systems used for storage and visualization.

Validation summary: A concise summary table of validation statistics per crop class would enhance readability.
Citation: https://doi.org/10.5194/essd-2025-752-RC2

Supplement

https://doi.org/10.5194/essd-2025-752-supplement

Data sets

EuroCropsV2 geodata Martin Claverie https://doi.org/10.2905/b9fb9e67-78a9-4327-9d59-39a928d812d3

Model code and software

EuroCropsV2 Scripts Martin Claverie and Momchil Yordanov https://github.com/Martincccc/EuroCropsV2

Viewed

Total article views: 596 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
368	201	27	596	68	29	44

Views and downloads (calculated since 26 Jan 2026)

Month	HTML	PDF	XML	Total
Jan 2026	82	45	5	132
Feb 2026	120	72	9	201
Mar 2026	166	84	13	263

Latest update: 31 Mar 2026

Short summary

The EuroCrops v2.0 dataset contains several years of farmer crop declarations from 16 Member States for the Common Agricultural Policy. These data were harmonized with a crop taxonomy and then linked to other classification systems, including agricultural statistics and remotely sensed products. This dataset can support researchers and policy makers in tracking changes over time, calculating crop rotations, making national comparisons, evaluating official statistics and developing remote sensing applications.

