the Creative Commons Attribution 4.0 License.
Mapping Complex Cropping Patterns in China (2018–2021) at 10 m Resolution: A Data-Driven Framework based on Multi-Product Integration and Google Satellite Embedding
Abstract. Mapping complex cropping patterns and temporal dynamics is of great significance for addressing the challenges faced by agricultural systems. However, in China, annual nationwide maps depicting multiple crops and rotation sequences are still lacking. In this study, we developed a data-driven crop mapping framework by integrating multiple existing crop products with the Google Satellite Embeddings derived from the AlphaEarth foundation model, and produced 10-meter resolution mapping of complex cropping patterns across China from 2018 to 2021. Firstly, we integrated multiple publicly available crop mapping products within a harmonized framework that applies a unified cropland extent and cropping intensity, providing a systematic assessment of their consistency at pixel level. Consistency analysis results classify the study area into areas with consistency and areas with confusion, the latter serving as the mapping focus. Then, by combining harmonized crop data layers with random forest classifiers trained on foundation-derived embedding features, our framework effectively enhanced the spatial coherence and temporal stability, achieving an overall accuracy of 92.60 % and an F1 score of 0.7584. Compared with ADM-2 statistics, the mapped cropping areas achieved high consistency (R² = 0.849, RMSE = 4.61, MAE = 2.07). The resulting datasets provide an integrated depiction of China’s complex cropping systems, enabling consistent interannual assessments of changes in crop types, cropping sequences, and spatial patterns at 10 m resolution, thereby establishing a robust foundation for refined agricultural management and policy decisions, while supporting climate-smart and sustainable land-use planning.
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-679', Anonymous Referee #1, 06 Jan 2026
- RC2: 'Comment on essd-2025-679', Anonymous Referee #2, 12 Feb 2026
The manuscript integrates embedding datasets with multiple cropland datasets to map cropping patterns through a data-driven framework. The work is novel and useful for understanding the dynamics of cropping patterns and guiding agricultural practices. I have two major comments:
- The impacts of spatiotemporal autocorrelation on accuracy should be considered when using RF.
- Since the study used multiple datasets, crop-specific area comparisons between the present study and other datasets would provide a more comprehensive view for guiding data application.
Minor comments:
Line 20: What are the ADM-2 statistics? Please state the full name of the dataset at first mention.
Line 20: What are the units of RMSE and MAE?
Line 29: Please add common before and growing … to make the usage consistent throughout the manuscript.
Line 50: You describe the need for mapping in China from a technical perspective, which is good. Adding some socio-economic background on China would help strengthen the context.
Line 65: Please also mention its social impacts.
Line 130: “Embeddings’ general”. Please check this throughout the manuscript to make the usage consistent.
Line 170: What is the threshold “t”? Does “t” equal 0.8?
Line 175: How do you process the OCTC when it is less than the estimated cropping intensity? Does the “unconfused area” mean that the OCTC is equal to the intensity? Please specify these categories.
Lines 165-178: Please reorganize these paragraphs to make the method clearer. If the CMCI is less than 0.8, the corresponding grid is assigned as low-consistency. Meanwhile, if the OCTC is greater than the estimated cropping intensity, is it then a “confused” area? Please add a figure linking consistency and confusion; I am a little confused about this.
Line 184: Delete one “.”
Lines 205-207: Please test the model performance with various combinations of ntree and mtry. I am not sure 500 is the best value.
Line 207: “To capture regional heterogeneity, the RF classification was partitioned using the Agro-Natural Regionalization of China, a climatic-geographic framework dividing the country into 38 sub-regions by temperature and precipitation” What does “the RF classification was partitioned” mean? Do you mean that the RF was trained separately in each sub-region? Also, please consider the influence of spatiotemporal autocorrelation on the accuracy evaluation.
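One common way to probe the autocorrelation concern is spatially blocked cross-validation, where nearby samples are kept together in the same fold so that test points are not near-duplicates of training points. The following is a minimal sketch of that idea only. It assumes sample coordinates in degrees and a hypothetical 1° block size; the function names and example coordinates are illustrative and are not taken from the manuscript.

```python
import math
from collections import defaultdict

def block_id(lon, lat, block_deg=1.0):
    """Return the index of the grid block that contains a coordinate."""
    return (math.floor(lon / block_deg), math.floor(lat / block_deg))

def spatial_block_folds(coords, n_folds=5, block_deg=1.0):
    """Assign a fold index to every sample so that a block never straddles folds."""
    blocks = defaultdict(list)
    for i, (lon, lat) in enumerate(coords):
        blocks[block_id(lon, lat, block_deg)].append(i)

    folds = [0] * len(coords)
    fold_sizes = [0] * n_folds
    # Greedy balancing: place the largest blocks first, each into the
    # currently smallest fold.
    for key in sorted(blocks, key=lambda k: (-len(blocks[k]), k)):
        target = fold_sizes.index(min(fold_sizes))
        for i in blocks[key]:
            folds[i] = target
        fold_sizes[target] += len(blocks[key])
    return folds

# Hypothetical sample locations (lon, lat); not real validation points.
coords = [(116.2, 39.9), (116.7, 39.1), (117.3, 39.5),
          (113.4, 23.1), (113.9, 23.6), (104.1, 30.6)]
folds = spatial_block_folds(coords, n_folds=3)
```

Accuracy estimated on folds built this way can then be compared with accuracy from a purely random split; a large gap would suggest that autocorrelation inflates the random-split estimate. The same grouping idea could be applied in time by holding out whole years.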
Line 229: Is the “adm-1 level” the same as the ADM-1 level? Please make all terminology consistent.
Lines 248-254: This section should be moved to the Methods section.
Lines 255-256: Please cite a figure showing the provincial boundaries and the Hu Huanyong line.
Fig. 4(f): It seems that most areas have an error of less than 0.25. Consider using a more detailed legend below 0.25 to show the spatial heterogeneity.
Lines 315-316: This sentence is confusing. The area of cotton is small. Although 98% of cotton remains stable, this cannot represent the stability of all cropping systems. Please revise.
Line 332: “there was a high degree of alternation from wheat–soybean to wheat–maize systems, accounting for 48% of the wheat–soybean area,”. What is the driver of this transition? Market or policy change?
Section 3.4: Please report the validation for each crop type. Although crop-specific validation might be poor, it is important for data application, especially for crop-specific applications. The ADM statistics report the harvested area, whereas the mapped area reflects the planted area. Therefore, the mapped area might be higher than the ADM area, which is reasonable.
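As an illustration of the per-crop reporting requested here, precision, recall, and F1 can be derived for each class from paired reference and mapped labels. This is a minimal sketch under stated assumptions: the label lists are hypothetical, the function name is illustrative, and it is not the authors' validation procedure.

```python
from collections import Counter

def per_class_scores(y_true, y_pred):
    """Compute precision, recall, F1, and support for every class label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p gains a false positive
            fn[t] += 1  # true class t gains a false negative

    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": tp[c] + fn[c]}
    return scores

# Hypothetical validation labels; not data from the manuscript.
y_true = ["rice", "rice", "maize", "maize", "wheat", "cotton"]
y_pred = ["rice", "maize", "maize", "maize", "wheat", "wheat"]
scores = per_class_scores(y_true, y_pred)
```

Reporting the support alongside each score helps readers judge how much weight a per-crop value can bear, which matters for crops with few validation samples.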
In addition, please conduct a crop-by-crop area comparison between your map results and the crop products used. This would give readers a more comprehensive view of the current dataset.
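A crop-by-crop comparison of this kind could report, for each crop, the agreement between mapped areas and a reference product across administrative units. The sketch below assumes paired area values in a common unit, takes R² as the squared Pearson correlation (the manuscript may define it differently), and uses hypothetical numbers and an illustrative function name.

```python
import math

def area_agreement(mapped, reference):
    """Return R2 (squared Pearson r), RMSE, MAE, and mean bias of paired areas."""
    n = len(mapped)
    diffs = [m - r for m, r in zip(mapped, reference)]
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mae = sum(abs(d) for d in diffs) / n
    bias = sum(diffs) / n  # positive when the map exceeds the reference

    mean_m = sum(mapped) / n
    mean_r = sum(reference) / n
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mapped, reference))
    var_m = sum((m - mean_m) ** 2 for m in mapped)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    r2 = cov * cov / (var_m * var_r) if var_m and var_r else float("nan")
    return {"r2": r2, "rmse": rmse, "mae": mae, "bias": bias}

# Hypothetical per-unit areas for one crop; not values from the manuscript.
mapped = [10.0, 20.0, 30.0]
reference = [12.0, 18.0, 33.0]
result = area_agreement(mapped, reference)
```

Including the mean bias makes a systematic offset visible, such as mapped planted area exceeding reported harvested area, which RMSE and MAE alone do not distinguish from random error.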
Line 480: Please specify what information each band carries, if available. For instance, A40 is important for rice; what kind of information does A40 reflect?
Citation: https://doi.org/10.5194/essd-2025-679-RC2
Data sets
Mapping Complex Cropping Patterns in China (2018–2021) at 10 m Resolution: A Data-Driven Framework based on Multi-Product Integration and Google Satellite Embedding Xiyu Li and Le Yu https://doi.org/10.6084/m9.figshare.30582161
This study presents a novel data-driven framework for annual mapping of cropping patterns and produces maps of China’s complex cropping systems. Overall, the study is carefully conducted and adds useful insights to the existing literature. However, several aspects of the manuscript require minor revision to improve the overall quality and interpretability. Detailed suggestions are listed below.