the Creative Commons Attribution 4.0 License.
Not just crop or forest: building an integrated land cover map for agricultural and natural areas
Abstract. Due to our increasing understanding of the role the surrounding landscape plays in ecological processes, a detailed characterization of land cover, including both agricultural and natural habitats, is ever more important for both researchers and conservation practitioners. Unfortunately, in the United States, different types of land cover data are split across thematic datasets that emphasize agricultural or natural vegetation, but not both. To address this data gap and reduce duplicative efforts in geospatial processing, we merged two major datasets, the LANDFIRE National Vegetation Classification (NVC) and USDA-NASS Cropland Data Layer (CDL), to produce an integrated land cover map. Our workflow leveraged strengths of the NVC and the CDL to produce detailed rasters comprising both agricultural and natural land-cover classes. We generated these maps for each year from 2012–2021 for the conterminous United States, quantified agreement between input layers and accuracy of our merged product and published the complete workflow necessary to update these data. In our validation analyses, we found that approximately 5.5 % of NVC agricultural pixels conflicted with the CDL, but we resolved most of these conflicts based on surrounding agricultural land, leaving only 0.6 % of agricultural pixels unresolved in our merged product. These ready-to-use rasters characterizing both agricultural and natural land cover will be widely useful in environmental research and management.
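The abstract states that conflicts between the two layers were resolved "based on surrounding agricultural land" but does not give the exact rule. The following is a minimal sketch of one such rule, not the authors' published workflow: a conflicting pixel takes the most common crop class among its eight CDL neighbors. The function name, the `UNRESOLVED` code, and the class codes in the usage note are all hypothetical, and the sketch assumes the NVC and CDL code ranges do not overlap.

```python
from collections import Counter

UNRESOLVED = -1  # hypothetical code for conflicts with no crop neighbors


def merge_pixelwise(nvc, cdl, nvc_ag_codes, cdl_crop_codes):
    """Merge two equally sized 2D grids (lists of lists) of class codes.

    Assumed rule (illustrative only): where NVC marks agriculture, use the
    CDL crop class; if CDL names no crop there, borrow the most common crop
    class among the 8 neighboring CDL pixels.
    """
    n_rows, n_cols = len(nvc), len(nvc[0])
    merged = [row[:] for row in nvc]  # start from the NVC map
    for r in range(n_rows):
        for c in range(n_cols):
            if nvc[r][c] not in nvc_ag_codes:
                continue  # natural or other cover: keep the NVC class
            if cdl[r][c] in cdl_crop_codes:
                merged[r][c] = cdl[r][c]  # layers agree this is agriculture
                continue
            # Conflict: NVC says agriculture, CDL does not name a crop.
            neighbors = Counter()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                        if cdl[rr][cc] in cdl_crop_codes:
                            neighbors[cdl[rr][cc]] += 1
            # Ties go to the crop class encountered first in the scan.
            merged[r][c] = (
                neighbors.most_common(1)[0][0] if neighbors else UNRESOLVED
            )
    return merged
```

For example, with a made-up agricultural NVC code of 7000, a natural code of 4001, a crop code of 1, and a non-crop CDL code of 141, calling `merge_pixelwise(nvc, cdl, {7000}, {1, 5})` on small grids assigns crop class 1 to a conflicting pixel surrounded by class-1 neighbors and `UNRESOLVED` to an isolated one.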
Status: closed
RC1: 'Comment on essd-2022-331', Anonymous Referee #1, 12 Dec 2022
This study tried to merge the information from two types of gridded images, i.e., the LANDFIRE National Vegetation Classification (NVC) and USDA-NASS Cropland Data Layer (CDL). The methods used for producing the integrated gridded maps are straightforward, but my concerns are mainly related to the novelty and the usefulness of such a dataset. Simply put, the authors allocated more specific crop information from the CDL onto NVC images. Thus, I doubt it is novel enough to be published in ESSD. Besides, the ground-truth data were unclear (how many samples there are, how they are distributed, how they were collected, etc.). Nonetheless, this does not matter, because the validation of the produced data was actually a validation of the NVC map, since no improvement was made to the NVC data.
Citation: https://doi.org/10.5194/essd-2022-331-RC1
AC1: 'Reply on RC1', Melanie Kammerer, 04 Jan 2023
These improved land cover data will be useful for many ecological and agricultural management, conservation, and research applications. To our knowledge, there are no public datasets that are equivalent, in terms of thematic, temporal, and spatial resolution, to the one we produced, establishing the novelty of our work. To illustrate a specific application of our data for modelling ecosystem services, we added the following paragraph to the manuscript introduction:
"An integrated dataset of land cover documenting specific agricultural and natural habitats is a critical input to predict biodiversity, ecosystem services, and climate adaptation and mitigation strategies. For example, the Integrated Valuation of Ecosystem Services and Tradeoffs (InVEST) is a set of widely used spatial models that predict ecosystem services based on land cover data. InVEST predictions of crop pollination services depend on accurate characterization of agricultural and natural habitats available in one spatial product. Within broad classes of agricultural, forest, and wetland habitats, floral resources for pollinators can vary more than 250, 750, and 40-fold, respectively (Iverson et al., in prep), necessitating a land cover map that identifies specific types of crop and natural vegetation. We developed this dataset of integrated land cover as an input for our models of pollination services, but expect models of carbon storage, crop disease, pest dynamics, and biocontrol, among other ecosystem services, would be improved with more detailed land cover data."
We also strengthened our discussion of the importance of retaining crop identity in land cover datasets by adding the following examples to paragraph two:
"…annual and perennial crops differ in frequency and intensity of tillage, which has significant implications for climate and soil health as intensive tillage adversely affects soil structure, chemical and biological processes, and increases emission of greenhouse gases (Mangalassery et al., 2015; Busari et al., 2015). Quantifying changes in agricultural land use and crop diversity also depends on land cover data that document specific crop types."
Though we would be willing to include a detailed description of the NVC ground truth data if the reviewers/editor feel we should, we believe it does not fit within the scope of our paper, as our goal was to illustrate how the classification accuracy of our combined product compares to accuracy of CDL and NVC, rather than validating the source datasets themselves. For more information on CDL and NVC validation procedures, we refer readers to CDL and LANDFIRE websites (https://www.nass.usda.gov/Research_and_Science/Cropland/sarsfaqs2.php#Section1_11.0 and https://landfire.gov/remapevt_assessment.php, respectively) and Lark et al. 2021 (https://doi.org/10.3390/rs13050968). We have added this information to the technical validation text, as well, to guide the reader who may be interested in these validation data.
Citation: https://doi.org/10.5194/essd-2022-331-AC1
RC2: 'Comment on essd-2022-331', Anonymous Referee #2, 27 Jan 2023
Review of Kammerer et al.: “Not just crop or forest: building an integrated land cover map for agricultural and natural areas” (essd-2022-331)
This manuscript describes a data integration effort, combining two existing vegetation/cropland cover datasets for the conterminous United States, namely the LANDFIRE National Vegetation Classification (NVC) and USDA-NASS Cropland Data Layer (CDL). As I am not an expert in the application domain for these datasets, I will focus on the integration process itself and on the validation of the resulting data product.
I think that the concept underlying this effort (“generating new knowledge by integrating existing datasets”) is very valid and timely – this is how we need to deal with the myriad of geospatial data out there in order to get the maximum value out of it. The paper is well written and, from my perspective, such an effort is of potential interest to the readers of ESSD. However, there are several concerns that should be addressed prior to publication. My biggest concern is the temporal mismatch of the data (i.e., vegetation data from 2016 are integrated with agricultural data at annual temporal resolution from 2012-2020) – why not use all available Landfire epochs, or constrain the dataset to 2016 only? This temporal mismatch should be addressed in the revised version and the rationale for this decision should be clarified.
Specific comments:
- The benefit / underlying motivation of this data integration effort needs to be clarified a little bit more. While the introduction refers to some examples in the literature that assess processes playing out at the interface of cropland and natural vegetation, the Authors should add a paragraph (maybe in a concluding section) to illustrate some technical examples how these data could be used to answer specific questions – for example, one could apply convolving focal windows to identify regions where specific crop / vegetation types co-occur within a given distance. Something like this would make the contribution/value of the integrated dataset clearer.
- Also, a vectorized version (polygons) of the integrated dataset could be very useful to assess topological relationships (e.g. adjacency) between different crop / vegetation types. If feasible, the Authors could provide such a vectorized polygonal dataset to complement their data. This will also enhance the usage of the dataset, as some researchers may prefer to work with vector rather than raster data for topological analyses, but may not have the resources to vectorize the data.
- The validation section should be expanded. While some sort of validation has been done, little information is given about the reference data used – please provide some information (maybe a map) on the sampling locations, data source of the reference data, etc.
- Only after one hour of reviewing this paper I realized the temporal mismatch in the data, when reading this sentence “Pixels of national vegetation are the same in all rasters provided here and represent land use in 2016.” on the data website (https://data.nal.usda.gov/dataset/data-not-just-crop-or-forest-building-integrated-land-cover-map-agricultural-and-natural-areas-spatial-files/resource/8c92879b-92cf-4e86-a3c4-0e672007a1df) - NVC data from 2016 is integrated with CDL data annually from 2012-2020? This is not clear from the manuscript. What are the implications of keeping vegetation cover stationary over time? Does cropland change faster than vegetation? How does this temporal mismatch affect the usability of the integrated dataset? Can it be used to assess recent processes at all? This issue needs to be highlighted and thoroughly discussed. When looking at this page: https://landfire.gov/data_overviews.php, I see that Landfire has been released in several years besides 2016 – why did you not integrate Landfire and CDL in annual pairs, for the years available? Please provide a rationale for this. This is probably my biggest concern about this manuscript.
- Same sentence on the data website: “Pixels of national vegetation are the same in all rasters provided here and represent land use in 2016.” --> isn’t vegetation land cover, instead of land use?
- If not done already, Authors should provide a spatial layer (raster dataset) of the mismatched / unresolved pixels, so that users can include these discrepant areas explicitly in their analyses.
- Related to that, it is not clear to me in which year the agreement assessment was conducted (sorry if I missed it), and whether the stats (e.g. 5% of conflicting pixels) refer to a specific year. Is it 2016?
- It seems that the union of agricultural land use and vegetation land cover is used as the analytical “universe” in this study. How do these areas relate to other land cover / land use types, such as urban areas / developed land? It would be great if the Authors could conduct a cross-comparison to a “spatially exhaustive” dataset such as the NLCD – how does the integrated dataset agree with the classes from NLCD? This would be some kind of “external” evaluation, while the agreement analysis would be an “internal” validation. Perhaps a cross-tabulation of the area proportions per crop/vegetation class and NLCD land cover class would be interesting (see e.g. Fig 4 in https://www.nature.com/articles/s41597-022-01591-0)
- I suggest renaming the dataset containing the uncertainty statistics. Please change “tabular data” to “uncertainty statistics” or similar.
- Lastly, I suggest coming up with a name for the integrated dataset. This will make it easier to refer to the dataset and, ultimately, increase the visibility of the product.
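The focal-window idea raised in the first specific comment can be sketched as follows. This is a minimal illustration, not part of the manuscript or its code: it flags every pixel of one class that has at least one pixel of a second class inside a square window. The function name and the class codes in the usage note are hypothetical, and the radius is expressed in pixels (at the 30 m resolution the referee assumes, a radius of 10 pixels would correspond to roughly 300 m).

```python
def cooccurrence_mask(grid, class_a, class_b, radius):
    """Flag class_a pixels with a class_b pixel inside a square focal window.

    grid   : 2D list of class codes
    radius : half-width of the window in pixels (window side = 2*radius + 1)
    Returns a 2D list of booleans with the same shape as grid.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    mask = [[False] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != class_a:
                continue
            # Clip the window at the raster edges.
            r0, r1 = max(0, r - radius), min(n_rows, r + radius + 1)
            c0, c1 = max(0, c - radius), min(n_cols, c + radius + 1)
            mask[r][c] = any(
                grid[rr][cc] == class_b
                for rr in range(r0, r1)
                for cc in range(c0, c1)
            )
    return mask
```

A call such as `cooccurrence_mask(grid, 1, 9, radius=1)` would mark the pixels of a hypothetical crop class 1 that touch a hypothetical vegetation class 9. On full 30 m national rasters a plain Python loop like this would be far too slow; an array-based moving-window filter would be the practical choice.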
Minor comments:
Sorry if I missed it, but what is the spatial resolution of the input data? I think it is 30 m for both of the datasets; this should be stated in your geoprocessing section. Also, it is important to know whether the raster grids align, or is there an offset between the two grids? Did you have to resample one of the layers to the grid of the other layer? If so, how was this done (nearest neighbor resampling?).
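To make the alignment question concrete, the sketch below checks whether two grids share a cell size and are offset by a whole number of cells, and shows what nearest neighbor resampling onto a target grid does. It is an illustration only and makes no claim about how the authors processed their data. All function names are hypothetical, and origins are assumed to be the (x_min, y_max) corner of the upper-left cell in a shared projected coordinate system.

```python
import math


def grids_aligned(origin_a, origin_b, cell_a, cell_b, tol=1e-6):
    """True if both grids share a cell size and a whole-cell offset."""
    if abs(cell_a - cell_b) > tol:
        return False
    for a, b in zip(origin_a, origin_b):
        offset = (a - b) / cell_a  # offset measured in cells
        if abs(offset - round(offset)) > tol:
            return False
    return True


def resample_nearest(src, src_origin, src_cell,
                     dst_origin, dst_cell, dst_shape, nodata=None):
    """Nearest neighbor resampling of a 2D list onto a target grid.

    Each target cell takes the value of the source cell that contains the
    target cell's center; cells outside the source get `nodata`.
    """
    out = []
    for r in range(dst_shape[0]):
        y = dst_origin[1] - (r + 0.5) * dst_cell  # center of target row
        src_r = math.floor((src_origin[1] - y) / src_cell)
        row = []
        for c in range(dst_shape[1]):
            x = dst_origin[0] + (c + 0.5) * dst_cell  # center of target column
            src_c = math.floor((x - src_origin[0]) / src_cell)
            inside = 0 <= src_r < len(src) and 0 <= src_c < len(src[0])
            row.append(src[src_r][src_c] if inside else nodata)
        out.append(row)
    return out
```

Nearest neighbor is the usual choice for categorical rasters such as land cover because it never invents class codes, unlike bilinear or cubic interpolation.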
Some terminology… AFAIK one would speak of “forest land cover” on the one hand, but “agricultural land use” on the other hand – in the title and in the manuscript you write “land cover” – the term “land use” does not occur in the manuscript. However, isn’t your integrated dataset truly a LULC (land use / land cover) dataset? I think the integration of land cover and land use should be highlighted more in the paper (and maybe even in the title).
Minor detail: Some of the maps in Figures 2, 3, and 4 show areas / area proportions and thus should use an equal-area projection rather than showing Lat-Lon in a cartesian coordinate system. Lat-Lon are angular coordinates and should IMHO not be shown in cartesian coordinate systems, in particular when it comes to mapping areas / densities / area proportions – as a pixel in Maine has a different area than a pixel in Florida. I suggest using the Albers Equal Area projection.
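The size of the effect the referee describes can be estimated with the standard formula for the area of a latitude-longitude cell on a sphere, A = R² · Δλ · (sin φ₂ − sin φ₁). The sketch below is an illustration under a spherical-Earth assumption; the function name is hypothetical, and the latitudes in the usage note (about 45° N for Maine and 28° N for Florida) are rough values chosen for illustration, not taken from the manuscript.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def cell_area_m2(lat_south_deg, cell_deg):
    """Area of a square lat-lon cell whose southern edge is at lat_south_deg.

    Uses A = R^2 * dlon * (sin(lat_north) - sin(lat_south)) on a sphere.
    """
    lat1 = math.radians(lat_south_deg)
    lat2 = math.radians(lat_south_deg + cell_deg)
    dlon = math.radians(cell_deg)
    return EARTH_RADIUS_M ** 2 * dlon * (math.sin(lat2) - math.sin(lat1))


# Ratio of cell areas at two illustrative latitudes (assumed values).
ratio = cell_area_m2(45.0, 0.01) / cell_area_m2(28.0, 0.01)
```

For small cells the ratio is close to cos(45°) / cos(28°), so under these assumptions a cell at the northern latitude covers roughly four fifths of the area of an equally sized cell in degrees at the southern one. This is why area proportions drawn on an unprojected grid can mislead.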
Fig. 4- top map: I don’t think it necessary to show an “empty” map?
The introduction should include a short synopsis of similar data integration efforts, to illustrate that this is a trending topic in general, and across disciplines. For example, a similar effort from the field of human settlement modelling would be this: https://www.tandfonline.com/doi/full/10.1080/17538947.2018.1550121
Citation: https://doi.org/10.5194/essd-2022-331-RC2
AC2: 'Reply on RC2', Melanie Kammerer, 01 Mar 2023
Data sets
Data from: Not just crop or forest: building an integrated land cover map for agricultural and natural areas (tabular files) Kammerer, Melanie; Iverson, Aaron L.; Li, Kevin; Goslee, Sarah C https://doi.org/10.15482/USDA.ADC/1527977
Data from: Not just crop or forest: building an integrated land cover map for agricultural and natural areas (spatial files) Kammerer, Melanie; Iverson, Aaron L.; Li, Kevin; Goslee, Sarah C https://doi.org/10.15482/USDA.ADC/1527978
Model code and software
Code from: Not just crop or forest: building an integrated land cover map for agricultural and natural areas Kammerer, M https://doi.org/10.5281/zenodo.6803199
Viewed
- HTML: 613
- PDF: 235
- XML: 51
- Total: 899
- Supplement: 84
- BibTeX: 52
- EndNote: 64