the Creative Commons Attribution 4.0 License.
Advancements in LUCAS Copernicus 2022: Enhancing Earth Observation with Comprehensive In-Situ Data on EU Land Cover and Use
Abstract. The Land Use/Cover Area frame Survey (LUCAS) of the European Union (EU) presents a rich resource for detailed understanding of land cover and use, making it invaluable for Earth Observation (EO) applications. This manuscript discusses the recent advancements and improvements in the LUCAS Copernicus module, particularly the data collection process of 2022, its protocol simplifications, and geometry definitions compared to the 2018 survey and data. With approximately 150,000 polygons collected in 2022, an increase from 60,000 in 2018, the LUCAS Copernicus 2022 data provides a unique and comprehensive in-situ dataset for EO applications. The protocol simplification also facilitates a faster and more efficient data collection process. In 2022, there are 137,966 polygons generated, out of the original 149,408 LUCAS Copernicus points, which means 92.3 % of the points were actually surveyed. The data holds 82 land cover classes for the Copernicus module LUCAS level 3 legend (88 classes). For land use the data holds 40 classes, along with 18 classes of land use types. The dataset is available here for download (PID: http://data.europa.eu/89h/e3fe3cd0-44db-470e-8769-172a8b9e8874). The paper further elaborates on the implications of these enhancements and the need for continuous harmonisation to ensure semantic consistency and temporal usability of data across different periods. Moreover, it calls for additional studies exploring the potential of the collected data, especially in the context of remote sensing and computer vision. The manuscript ends with a discussion on future data usage and dissemination strategies.
Status: open (extended)
RC1: 'Comment on essd-2023-494', Kristof van Tricht, 01 Mar 2024
Manuscript summary
This paper presents the 2022 version of the LUCAS Copernicus dataset. It explains the differences with the 2018 version, most notably the significant increase in the number of surveyed points and the shapes of the polygons, which are now significantly larger. Some basic statistics on the new dataset are also provided. The main dataset is provided as a GPKG file. The authors conclude that further harmonization is needed to guarantee the semantic consistency of the coding and legend, as well as the temporal inter-usability of the 2018 and 2022 data.
Review summary
I should start by mentioning that the value of the LUCAS Copernicus dataset(s) cannot be overstated. A tremendous amount of work and dedication goes into the whole workflow of visiting points, interpreting land cover, making the necessary observations and finally processing everything into a consistent polygon-based dataset. The result is one of the most influential in situ datasets available to the Earth Observation community and an example to other countries/continents. The highly anticipated 2022 dataset will be of significant value to the community, and the exciting increase in both the number of observations and the size of the polygons will be received with acclaim.
Next to the unquestionable value of the dataset, the paper itself is generally well written. However, here and there it lacks some detail that I found essential to fully understand the nature of the dataset and that is required to make the bridge to EO applications. Therefore, I think some minor revisions are required to add more detail, after which I would be happy to recommend this paper and dataset for publication in ESSD. My comments and questions for clarification can be found below.
Comments
L8-9: confusing sentence where first 82 land cover classes are mentioned, and then 88 classes. Probably different things but it could use some rephrasing for clarity.
L49: is this homogeneity also interpreted on the ortho-photo when deciding on the land cover class?
L62: could the authors explain the rationale behind the polygons? What is the aim of providing homogeneous polygons on top of the (pure) observation of the point itself if e.g. only the respective Sentinel 10 m or 20 m pixel is confirmed to be homogeneous? L99 might provide a clue but it would be good to explain this rationale.
L65: this is a bit confusing as the minimum area to execute the Copernicus module is reported to be 25 m² while the MMU is about 79 m². Where does this difference come from?
L69-77: this section is not entirely clear to me. “The position they have reached” can deviate from the theoretical LUCAS point. Why and by how much? What does “cannot reach” mean? Why is a “linear feature narrower than 3m” the exception when no Copernicus-relevant information can be recorded? How many meters are “a few meters” that a surveyor can move?
L98-99: What is the aim of the quasi-circular polygon shape? Downstream applications will likely have to process the polygons further in order to be able to have e.g. pure Sentinel pixels and not a mix at the borders of the polygon where another LC could start.
Sect. 5.3: where does this preliminary assessment come from? Can the authors elaborate a bit more?
L117-118: does this mean future versions of the dataset are possible? How will versioning of the dataset in that case be treated?
L118: Could the authors elaborate a bit more on the (planned) compatibility between 2018 and 2022 survey? e.g. what are at the moment legend inconsistencies between 2018 and 2022 and how are users advised to cope with such difference?
Sect. 7: Are other dissemination methods considered for the future as well, such as an upload to Zenodo with a DOI, or an upload to Google Earth Engine for fast uptake by the community?
L130-131: link only works when copy-pasting manually; hyperlink behind the text does not.
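The further processing mentioned in the comment on L98-99 could look roughly like the sketch below. It is only an illustration under stated assumptions: the quasi-circular polygon is approximated by a true circle, coordinates are in a projected CRS in metres, and the pixel grid is aligned to multiples of the pixel size. The function name, the centre coordinates and the radii are hypothetical and are not taken from the dataset. Because a circle is convex, a square pixel lies entirely inside it exactly when all four of its corners do.

```python
import math


def pure_pixels_in_circle(cx, cy, radius, pixel=10.0):
    """Return lower-left corners of grid pixels lying entirely inside a circle.

    Assumptions (not from the dataset): the polygon is a perfect circle and
    the pixel grid is aligned to multiples of `pixel` in a metric CRS.
    """
    r2 = radius * radius
    # Index range of pixels whose extent can overlap the circle's bounding box.
    i_min = math.floor((cx - radius) / pixel)
    i_max = math.ceil((cx + radius) / pixel)
    j_min = math.floor((cy - radius) / pixel)
    j_max = math.ceil((cy + radius) / pixel)

    pure = []
    for i in range(i_min, i_max):
        for j in range(j_min, j_max):
            x0, y0 = i * pixel, j * pixel
            corners = [
                (x0, y0),
                (x0 + pixel, y0),
                (x0, y0 + pixel),
                (x0 + pixel, y0 + pixel),
            ]
            # A pixel is "pure" only if every corner falls within the circle.
            if all((x - cx) ** 2 + (y - cy) ** 2 <= r2 for x, y in corners):
                pure.append((x0, y0))
    return pure


# Hypothetical example: a circle of radius 15 m centred on a grid node.
print(len(pure_pixels_in_circle(0.0, 0.0, 15.0)))
```

With real polygons, a geometry library and the actual Sentinel grid would replace the circle approximation; the point of the sketch is only that border pixels have to be filtered out before the polygons can serve as pure training samples.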
Figures
Figure 1: things such as the legend are too small in the current version and therefore not readable.
Figure 2: in (a), what is the size of the resulting polygon? Does this match Sentinel-1 or Sentinel-2 data? In (b), the polygon seems to contain a mix of grass, bare and built-up. Aren’t the polygons supposed to contain one homogeneous land cover/use type?
Figure 3: The table shown in the figure should be explained in the caption.
Figure 5: Why is the minimum value 0, while L65 states the Copernicus module is not executed for areas smaller than 25 m²? The text on top of this figure should also be revisited, e.g. the far right part is truncated. Is such a high precision required for these numbers? Some rounding would probably increase readability.
Figure A1: the caption should contain a bit more information to be able to interpret what exactly is shown.
DATA
I checked the GPKG file and have a question after a quick look: some polygons contain hardly any data (not even landcover), e.g. point_id 38543138 has almost all “null” attributes while clearly being located in arable land. Why is that? In fact 2686 polygons have “null” in their “survey_lc1” attribute. Is this to be expected?
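A check like the one above can be reproduced with a few lines of Python. The sketch below relies only on the fact that a GeoPackage is an SQLite database, so the standard-library `sqlite3` module can read its attribute columns. The column names `survey_lc1` and `point_id` come from the comment above; the file name and the feature table name in the usage comment are hypothetical and must be replaced by the ones found in the actual file.

```python
import sqlite3
from contextlib import closing


def count_null_values(gpkg_path, table, column):
    """Return (null_count, total_rows) for `column` in a GeoPackage table."""
    # Identifiers cannot be bound as SQL parameters, so quote them explicitly.
    q_table = '"' + table.replace('"', '""') + '"'
    q_column = '"' + column.replace('"', '""') + '"'
    query = f"SELECT COUNT(*), COUNT(*) - COUNT({q_column}) FROM {q_table}"
    with closing(sqlite3.connect(gpkg_path)) as conn:
        # COUNT(column) skips NULLs, so the difference is the number of NULLs.
        total, nulls = conn.execute(query).fetchone()
    return nulls, total


# Hypothetical usage; file and table names are assumptions:
# nulls, total = count_null_values("lucas_copernicus_2022.gpkg",
#                                  "lucas_polygons", "survey_lc1")
# print(f"{nulls} of {total} polygons have no survey_lc1 value")
```

The available table names can be listed beforehand by querying the `gpkg_contents` table, which every GeoPackage is required to contain.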
Citation: https://doi.org/10.5194/essd-2023-494-RC1
Data sets
LUCAS Copernicus 2022. European Commission, Joint Research Centre (JRC). http://data.europa.eu/89h/e3fe3cd0-44db-470e-8769-172a8b9e8874
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
380 | 59 | 19 | 458 | 20 | 21 |