the Creative Commons Attribution 4.0 License.
Annual time-series 1-km maps of crop area and types in the conterminous US (CropAT-US): cropping diversity changes during 1850–2021
Abstract. Agricultural activities have been recognized as an important driver of land use and cover changes (LUCC) and have significantly impacted ecosystem feedback to climate, air, and water quality by altering land surface properties. A reliable historical cropland distribution dataset is crucial for understanding and quantifying the legacy effects of agriculture-related LUCC. While several LUCC datasets have the potential to depict cropland patterns in the conterminous US, a high-resolution dataset with crop type details over a long period is still lacking. To address this gap, we reconstructed historical cropland density and crop type maps from 1850 to 2021 at a resolution of 1 km × 1 km by integrating inventory datasets and gridded LUCC products. The results showed that the developed dataset is highly consistent with the county-level inventory data, with an R² approaching one and an RMSE of less than 3 Mha (million hectares) at the national level. Temporally, the US total crop acreage increased by 118 Mha from 1850 to 2021, primarily driven by corn (30 Mha) and soybean (35 Mha). Spatially, the hotspots of cropland shifted from the Eastern US to the Midwest and the Great Plains, and the dominant crop types (corn and soybean) moved toward the Northwest of the US. Moreover, we found that US cropping system diversity increased significantly from the 1850s to the 1960s, followed by a dramatic decrease over the most recent six decades under intensified agriculture. Overall, the developed dataset could facilitate spatial data development for delineating crop-specific management practices and enable the quantification of cropland change impacts.
Status: final response (author comments only)
RC1: 'Comment on essd-2023-195', Anonymous Referee #1, 31 Oct 2023
This manuscript derives CONUS 1 km × 1 km cropland density, crop type, and crop diversity maps for 1850 to 2021 by combining several historical cropland distribution datasets. The datasets include statistical data such as USDA Quickstat (county-level planted areas of different crops) and maps such as CDL, NLCD, and LCMAP generated from satellite data. The main technical approach is temporal linear interpolation to fill gaps in early years when no data were available.
The main validation was linear regression of derived crop areas against raw crop areas for 9 major crops at the county level, for four years (1920, 1960, 2000, and 2020).
In the results, the cropland density, crop type, and crop diversity maps for selected years in 1850-2021 were compared, and some patterns were found.
The findings are plausible, but the method needs clarification and justification, and the reliability of the derived maps was not sufficiently demonstrated. Detailed comments are given below.
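For readers unfamiliar with the approach summarized above, temporal linear interpolation between reporting years can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `fill_gaps_linear` and the sample values are hypothetical, and years outside the observed range are deliberately left unfilled because filling them would be extrapolation rather than interpolation.

```python
def fill_gaps_linear(observed, years):
    """Linearly interpolate a {year: value} series onto the requested years.

    Years before the first or after the last observation map to None,
    because no bracketing pair of observations exists for them.
    """
    known = sorted(observed)
    filled = {}
    for year in years:
        if year in observed:
            filled[year] = observed[year]
            continue
        earlier = [k for k in known if k < year]
        later = [k for k in known if k > year]
        if not earlier or not later:
            filled[year] = None  # outside the observed range: not interpolated
            continue
        y0, y1 = earlier[-1], later[0]
        weight = (year - y0) / (y1 - y0)
        filled[year] = observed[y0] + weight * (observed[y1] - observed[y0])
    return filled


# Hypothetical county crop area (thousand hectares) reported in two years.
census = {1920: 10.0, 1930: 20.0}
print(fill_gaps_linear(census, range(1918, 1932, 2)))
```

The `None` entries mark exactly the early years for which a different rule (for example, scaling by another dataset) would be needed.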
Major Comments
1. The validation should show regression residuals across different years in the study period, especially for the early years, when the gap-filling is less reliable. It is less reliable because linear interpolation is likely to propagate errors in early years, when gaps were filled using values from subsequent years only rather than from both previous and subsequent years. The validation should include more figures, for example with regression residuals on the y-axis and years (when raw data were available) on the x-axis for different crops.
2. The linear-regression-based validation should clarify how the derived values depend on the values they are compared with. If the derived values are highly dependent on the compared values, high correlations are to be expected.
3. The method uses 7 different datasets (Table 1). These datasets contain duplicated items, such as county-level crop areas, which exist in or can be derived from most of the datasets. The authors should provide information about the discrepancies among these duplicated items in the different datasets and how the method accounts for them. For example, where multiple choices exist, they should justify why particular datasets were selected as the basis for the linear interpolations. This is related to Minor Comments 9, 11, 15, 16, and 17.
4. In Figs. 8 and 9, the derived maps show apparent visual differences from HYDE. Explanations are needed for the differences and for their impact, given that HYDE was used as input to derive the maps (Fig. 1). Can their relationship be quantified by regression in the validation?
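The residual-by-year check requested in Major Comment 1 could be tabulated along the following lines. This is a hedged sketch rather than a prescribed procedure: the function name `residuals_by_year`, the record layout, and the sample numbers are hypothetical, and the residual is taken as inventory area minus map-derived area, the definition the authors give in their reply to RC2.

```python
from collections import defaultdict
from statistics import median


def residuals_by_year(records):
    """Summarize county residuals for each year.

    records: iterable of (year, inventory_area, map_area) tuples,
    one per county and year, all in the same area unit.
    """
    by_year = defaultdict(list)
    for year, inventory_area, map_area in records:
        by_year[year].append(inventory_area - map_area)
    return {
        year: {
            "median": median(values),
            "max_abs": max(abs(v) for v in values),
        }
        for year, values in sorted(by_year.items())
    }


# Hypothetical county records (thousand hectares).
sample = [
    (1920, 5.0, 4.0),
    (1920, 3.0, 3.5),
    (1920, 2.0, 2.0),
    (2020, 1.0, 1.0),
]
for year, stats in residuals_by_year(sample).items():
    print(year, stats)
```

Plotting these per-year summaries for each crop would give the figure the comment asks for, with years on the x-axis and residuals on the y-axis.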
Minor Comments
1. Consider changing “land use and cover changes” to “land cover land use change (LCLUC)”, which is a more widely used standard term in my experience.
2. L50. “History database of global environment (HYDE) (Goldewijk et al., 2017) dataset provides the cropland area in each grid cell from 1000 BC to 2017 AD at a resolution of 5 arc-min. Similarly, Zumkehr and Campbell (2013) developed a cropland distribution dataset at a 5 arc-min resolution from 1850 to 2000.”
- Should provide more information on how the early maps were made. That is information many readers will want.
- Should also clarify, with references, that crop areas obtained with different approaches can be meaningfully combined to make historical maps, for example USDA Quickstat compared with the satellite-derived NLCD, LCMAP, and CDL. Some datasets may have consistent overestimates or underestimates.
3. L53. “In contrast, the resolution of Cropland Data Layer (CDL), National Land Cover Database (NLCD), and Land Change Monitoring, Assessment, and Projection (LCMAP) is down to 30m.”
- References are needed.
4. L55. “However, their availability and continuity (available in the recent 40 years) are unable to provide historical cropland change patterns.”
-The statement regarding the 40-year availability of crop maps is ambiguous. It seems to be only about LCMAP, but the previous text also talks about NLCD and CDL.
5. L56. “The more recent studies, such as Cao et al. (2021) and Li et al. (2023), developed long-term LUCC datasets at 1 km by 1 km resolution… Monfreda et al. (2008) and Tang et al. (2023) generated a global crop type map with more than 170 crop types in the year of 2000 and 2020, and CDL provides the annual crop type distribution in the conterminous US with more than 50 crop types from 2008 to now”.
-Should provide information on how these datasets were made.
6. Fig. 1. What’s the purpose of deriving state-level information, given that county-level information is derived?
7. Table 1. “Linearly interpolation” -> “Linear interpolation”.
8. Table 1. What are the differences between linear interpolation and gap-filling in missing years?
9. Table 1. Do NASS-CPAS, NASS-COA, and USDA-NASS contain duplicated information, such as state-level harvest areas? Do the duplicated items have the same values across the three datasets?
10. L116. “NASS-CPAS reports the annual plant area of all principal crops for each state from 1909 to 2021”
-This is inconsistent with the information in Table 1 that NASS-CPAS provides “State-level total planted area”.
11. L119. “We computed the difference between these two datasets for available years and linearly interpolated unavailable years during 1909-2021. The interpolated difference was added to NASS-CPAS to generate the annual state-level total crop plant area from 1909 to 2021. We used the interannual variations of arable land of each state extracted from HYDE to interpolate the total planting area during 1850-1908 (Equation 1).”
- Does “difference between these two datasets” refer to state-level crop area?
- How big are the differences, e.g. as a relative percentage between NASS-CPAS and HYDE? See Major Comment 3.
- Why not use USDA-NASS Quickstat? See Major Comment 3.
12. L128. “For the period that the harvest areas are unavailable, we interpolated the plant area from 1850 to 2021 based on the total cropland area generated above (Equation 1 and 2)”.
-Unclear. Also Equations 1 and 2 are not clearly described.
13. L132. “We gap-filled the total county cropland from 1850 to 2021 by state total cropland area (Equation 1 and 2)”.
-Unclear.
14. L133. “Similar to the state-level crop-specific area, we converted the harvest areas to plant areas of 9 major crops in each county from USDA-NASS Quickstat, with varied availability (Table S1). For the period when harvest areas are unavailable, we gap-filled the plant areas during 1850-2021 based on the state-level crop-specific plant area generated above (Equation 1 and 2).”
- Clarify where the “state-level crop-specific area” was obtained.
- Clarify what “generated above” refers to.
15. What are the differences of crop areas between Quickstat and CDL for overlapping years?
16. Fig. 1. For crop density maps, CDL was used for 2010-2021, and LCMAP + NLCD were used for 1985-2009. Why not use LCMAP + NLCD for the whole period 1985-2021?
17. L147. “CDL, LCMAP, and HYDE were used to provide the potential cropland distribution in P2010, P1985, and P1850, respectively”.
- Why not use LCMAP for P2010?
18. L164. “Taking developing the density map in the year 2009 as an example, we first calculated the annual difference in each grid from 2009 to 2010 based on the LCMAP density maps. Then, we applied that difference to the adjusted CDL 2010 map to generate the density map 2009 with keeping the cropland area consistent with the inventory area. Following the same rule, the adjusted LCMAP 1985 was used to retrieve the density maps in P1850.”
-This text is incomprehensible.
19. L175. “By integrating resampled crop type maps and reconstructed cropland density maps, we counted the total area for each type at the county level and identified specific crop types with a greater area than the inventory data. We further converted the surplus area from these types to other types (Equation 4 and 5). In particular, considering the natural planting scenario, the surplus area was randomly selected for converting to other types to avoid a grid planted by a fixed type.”
-Incomprehensible, e.g. “identified specific crop types with a greater area than the inventory data”.
20. Fig. 4. The density maps may be related to the CONUS field size map in the following ref.
Yan, L. and Roy, D.P., 2016. Conterminous United States crop field size quantification from multi-temporal Landsat data. Remote Sensing of Environment, 172, pp.67-86
21. In Fig. 7, please clarify which datasets were not used in the study. Also, what is the purpose of showing datasets that were not used?
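The step quoted in Minor Comment 11 (compute the difference between two datasets in years where both exist, interpolate that difference linearly, and add it to the annual series) might look like the sketch below. It is an interpretation under stated assumptions, not the manuscript's Equation 1 or 2: the function name, the sign convention (benchmark minus annual series), the choice to hold the first and last known differences constant outside the benchmark years, and all sample values are hypothetical.

```python
def adjust_with_interpolated_difference(annual, benchmark):
    """Add a linearly interpolated (benchmark - annual) difference to `annual`.

    annual:    {year: area}, available every year (e.g. a survey series).
    benchmark: {year: area}, available only in some years (e.g. census years).
    """
    common = sorted(y for y in benchmark if y in annual)
    if not common:
        raise ValueError("the two series share no common year")
    diffs = {y: benchmark[y] - annual[y] for y in common}

    adjusted = {}
    for year, value in sorted(annual.items()):
        if year <= common[0]:
            diff = diffs[common[0]]    # assumption: hold first difference constant
        elif year >= common[-1]:
            diff = diffs[common[-1]]   # assumption: hold last difference constant
        else:
            y0 = max(y for y in common if y <= year)
            y1 = min(y for y in common if y > year)
            weight = (year - y0) / (y1 - y0)
            diff = diffs[y0] + weight * (diffs[y1] - diffs[y0])
        adjusted[year] = value + diff
    return adjusted


# Hypothetical state totals (million hectares).
survey = {2000: 100.0, 2001: 102.0, 2002: 104.0, 2003: 106.0, 2004: 108.0}
census = {2000: 110.0, 2004: 114.0}
print(adjust_with_interpolated_difference(survey, census))
```

In this form the adjusted series reproduces the benchmark exactly in benchmark years and keeps the year-to-year shape of the annual series in between, which is one way to read the quoted passage.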
Citation: https://doi.org/10.5194/essd-2023-195-RC1
AC1: 'Reply on RC1', Shuchao Ye, 21 Feb 2024
RC2: 'Comment on essd-2023-195', Anonymous Referee #2, 13 Nov 2023
Annual time-series 1-km maps of crop area and types in the conterminous US (CropAT-US): cropping diversity changes during 1850-2021
Dear Authors
The basic science of this paper is sound and of an appropriate standard. The authors have written the paper in line with the journal's scope and current trends. I have already published several papers in this domain. There are some flaws in this manuscript. I am glad to review this paper because it is very relevant to my research. In this study, the authors reconstructed the historical cropland density and crop type maps from 1850 to 2021 at a resolution of 1 km × 1 km by integrating inventory datasets and gridded LUCC products. The authors need more time to revise this paper. At this stage I am recommending major revision because I am concerned about the data structure and data interpretation. Recheck the language of the whole manuscript during revision. Overall, the quality of the manuscript is good.
Title
The title is not appropriate. There is no fluency in the title; secondly, RF is not a phenology-based method. Justify if you want to keep this title.
Abstract
- Rewrite the whole abstract because it does not reflect the whole study. Moreover, the abstract is very complex and the sentences lack continuity. Why did the authors conduct this study? The authors should focus on the main aim of the research.
- Objectives are not clear in the abstract section.
- Moreover, the abstract section does not reflect the whole research.
Introduction
The Introduction section is very lengthy and lacks a logical sequence.
- The Introduction section is not appropriate, and the problem statement, research questions, and hypothesis are missing from it.
- The Introduction section is very simple, and I did not find the research question, problem statement, or innovative idea of this research.
- Split the long paragraphs according to context.
- Some references are outdated. The authors should revise all references.
Material and methods
- Add the study area as a subsection and explain why you chose this study area.
- Follow the IMRaD rule for the international papers.
- Add figure of study area
- After study area start from the datasets
Datasets
Add links to all data sources from which you downloaded the statistical data. Do not give home page links; I need direct dataset links.
After the datasets, the authors should add the methodology section. Restructure this section (Material and methods).
Accuracy assessment
- The authors should add an accuracy assessment section at the end of this section. There are many accuracy assessment methods; why did the authors not use another, more appropriate accuracy assessment method instead of this one?
(Note: In my experience, the authors need to restructure this section.)
Results
Monitoring performance of random forest method
This is a good start to the results and I really appreciate your efforts, but I want to see your data. The authors should put all data in a repository and share the link with me. I would like to compare the whole dataset in raster and Excel form.
Discussion
Interpretation of Results:
- Begin by summarizing the main findings of your study.
- Clearly state whether your results align with or differ from your initial hypotheses.
Comparison with Previous Studies:
- Discuss how your results compare with existing literature. Reference relevant studies.
- Highlight similarities and differences between your findings and those of other researchers.
Explanation of Variances:
- If there are unexpected or inconsistent results, offer possible explanations.
- Consider alternative interpretations and discuss their feasibility.
Limitations:
- Acknowledge any limitations in your study that may affect the interpretation of results.
- Discuss how these limitations might have influenced the outcomes.
Implications:
- Explore the broader implications of your findings. How do they contribute to the field?
- Discuss the practical and theoretical implications of your results.
Avoid Repetition:
- Do not repeat information from the results section. Instead, focus on the interpretation.
- Keep the discussion concise and directly related to the research question.
- Remember to maintain a logical flow and use clear and concise language. Support your statements with evidence from your study and relevant literature. Additionally, consider the broader context of your research and how it contributes to the existing body of knowledge.
Conclusion
- Check for typos and syntax errors throughout the manuscript during revision.
Best Regards
Citation: https://doi.org/10.5194/essd-2023-195-RC2
AC3: 'Reply on RC2', Shuchao Ye, 21 Feb 2024
Dear Reviewer,
Thanks for your comprehensive comments on our manuscript. We have improved both the structural organization and the linguistic clarity of the manuscript. Specifically, the principal objective of this study is to reconstruct historical crop density and crop type maps, facilitating a nuanced examination of spatiotemporal crop diversity patterns. Consequently, the title “Annual time-series 1-km maps of crop area and types in the conterminous US (CropAT-US): Cropping diversity changes during 1850–2021” matches the scope of this study well. In the Introduction section, we summarized the characteristics of current crop-related spatial datasets and underscored the focal issue. Moreover, we revised our hypothesis for corn-soybean rotation and updated the crop type maps. As for datasets, we provided expanded details on the datasets used, including their characteristics, availability, and discrepancies across sources, in Section 2.1 Datasets. In addition, the methodology of this study is to use the inventory datasets to refine the gridded maps, and we improved the clarity of Section 2 Materials and method. Regarding the accuracy assessment and the performance of the method, we added two plots (Figure 2 and S3) to show the time series of residuals between the inventory data and the maps. Notably, the residual (the inventory-based crop area minus the rebuilt-map-based crop area) is less than 0.2 thousand hectares (Kha) for most counties (>75%) across the entire study period for nine crop types, underscoring the robustness of our method in capturing the interannual crop area variation. We further enhanced the comparison between our product and previous datasets (Figure S8 and S9). Overall, we strengthened the validation and the comparison and improved the structure and readability of the whole manuscript.
Best regards,
On behalf of all coauthors,
Shuchao Ye
Citation: https://doi.org/10.5194/essd-2023-195-AC3
RC3: 'Comment on essd-2023-195', Anonymous Referee #3, 14 Jan 2024
Review of “Annual time-series 1-km maps of crop area and types in the conterminous US (CropAT-US): cropping diversity changes during 1850-2021” for ESSD
Summary
Mapping the spatial distribution and temporal dynamics of crop area and crop types is essential for crop management. However, long-term, relatively high-spatial-resolution crop area data with detailed crop type information are lacking. Here, Ye et al. combined multiple sources of crop area and type information and generated 1-km maps of crop area and types in the conterminous US during 1850-2021. The manuscript is generally well organized and easy to follow. I have only a few specific concerns, listed below.
Specific comments
1) Line 8, ‘climate, air’: there could be some overlap between these two terms.
2) Line 11: whenever I mention ‘high resolution’, I am always very cautious. In remote sensing, high resolution can mean meter to submeter scale, so ‘relatively high resolution’ would be more rigorous.
3) Line 85: is there any justification for the linear assumption?
4) Line 87: references are needed following ‘diversity index’.
5) Line 109-110: please also briefly mention in the main text how you conducted the filtering operation.
6) Table 1: how did you resample the 30 m data to 1 km, and how did you interpolate the 5 arc-min data to 1 km?
7) Line 110: why not use the data from Zumkehr and Campbell (2013)? I assume differences exist between these two long-term datasets.
8) Line 119-121: please briefly explain in the main text what the differences between NASS-CPAS and USDA-COA mean.
9) Line 122: ‘interpolate’ -> ‘extrapolate’
10) Fig. 2: please explain in the caption what each point means.
11) Fig. 2: in addition to this kind of scatter plot, consider plotting the time series of crop-specific cropland area from the reconstructed maps and the raw inventory data to show that the reconstructed maps capture the interannual variations and trends of the raw data.
Citation: https://doi.org/10.5194/essd-2023-195-RC3
AC2: 'Reply on RC3', Shuchao Ye, 21 Feb 2024
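On RC3's specific comment 6, one common way to move from a fine binary cropland mask to a coarse cropland-density grid is block averaging, sketched below. This is not a description of what the authors did: the function name `aggregate_to_fraction` and the toy mask are hypothetical, and the sketch assumes the coarse cell is an exact integer multiple of the fine cell. Since 1 km is not an integer multiple of 30 m, a real workflow would need reprojection or area weighting.

```python
def aggregate_to_fraction(mask, block):
    """Aggregate a fine 0/1 mask into coarse cells holding the cropland fraction.

    mask:  2-D list of 0/1 values (1 = cropland) on the fine grid.
    block: number of fine cells along each side of one coarse cell.
    """
    rows, cols = len(mask), len(mask[0])
    if rows % block or cols % block:
        raise ValueError("grid size must be a multiple of the block size")
    coarse = []
    for r in range(0, rows, block):
        coarse_row = []
        for c in range(0, cols, block):
            crop_cells = sum(
                mask[i][j]
                for i in range(r, r + block)
                for j in range(c, c + block)
            )
            coarse_row.append(crop_cells / (block * block))
        coarse.append(coarse_row)
    return coarse


# Toy 4 x 4 fine mask aggregated into 2 x 2 coarse cells.
fine = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(aggregate_to_fraction(fine, 2))
```

Each coarse value is a density between 0 and 1, which is the kind of quantity a 1-km cropland density map reports.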
Data sets
Annual time-series 1-km maps of crop area and types in the conterminous US (CropAT-US) during 1850-2021 Shuchao Ye, Peiyu Cao, Chaoqun Lu https://doi.org/10.6084/m9.figshare.22822838.v1