CCD-Rice: A long-term paddy rice distribution dataset in China at 30 m resolution

Shen, Ruoque; Peng, Qiongyan; Li, Xiangqian; Chen, Xiuzhi; Yuan, Wenping

doi:https://doi.org/10.5194/essd-2024-147

Preprints

09 Jul 2024

Status: this discussion paper is a preprint. It has been under review for the journal Earth System Science Data (ESSD). The manuscript was not accepted for further review after discussion.

Abstract. As one of the most widely cultivated grain crops, paddy rice is a vital staple food in China and plays a crucial role in ensuring food security. Over the past decades, the planting area of paddy rice in China has shown substantial variability. Yet, there are no long-term high-resolution rice distribution maps in China, which hinders our ability to estimate greenhouse gas fluxes and crop production. This study developed a new optical satellite-based rice mapping method using a machine learning model and appropriate data preprocessing strategies to address the challenges of cloud contamination and missing data in optical remote sensing observations. This study produced CCD-Rice (China Crop Dataset-Rice), the first high-resolution rice distribution dataset in China from 1990 to 2016. Based on 391,659 validation samples, the overall accuracy of the distribution maps in each provincial administrative region averaged 90.26 %. Compared with 20,759 county-level statistical data, the coefficients of determination (R²) of single- and double-season rice in each year averaged 0.84 and 0.80, respectively. The distribution maps can be obtained at https://doi.org/10.57760/sciencedb.15865 (Shen et al., 2024a).
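The two validation metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes that overall accuracy is the fraction of validation samples whose mapped class matches the reference class, and that R² is the squared Pearson correlation between mapped and statistical areas; the abstract does not state which R² definition the authors used, so that choice is an assumption. The function names and the toy inputs are hypothetical.

```python
from statistics import mean


def overall_accuracy(predicted, reference):
    """Fraction of validation samples whose predicted label equals the reference label."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(predicted)


def r_squared(mapped_area, statistical_area):
    """Squared Pearson correlation between mapped and statistical areas.

    Assumption: R² is taken as the squared correlation of a linear fit,
    not 1 - SS_res / SS_tot against the 1:1 line.
    """
    if len(mapped_area) != len(statistical_area) or len(mapped_area) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mx, my = mean(mapped_area), mean(statistical_area)
    sxy = sum((x - mx) * (y - my) for x, y in zip(mapped_area, statistical_area))
    sxx = sum((x - mx) ** 2 for x in mapped_area)
    syy = sum((y - my) ** 2 for y in statistical_area)
    if sxx == 0 or syy == 0:
        raise ValueError("R² is undefined when either series is constant")
    return sxy * sxy / (sxx * syy)


# Toy inputs (not from the paper): 1 = rice, 0 = non-rice.
toy_accuracy = overall_accuracy([1, 0, 1, 1], [1, 0, 0, 1])
# Toy county areas in arbitrary units (not from the paper).
toy_r2 = r_squared([1, 2, 3], [1, 3, 2])
```

In practice the accuracy would be computed per provincial administrative region and then averaged, and R² per year and rice type, which is how the abstract reports them.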

Received: 23 Apr 2024 – Discussion started: 09 Jul 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1: 'Comment on essd-2024-147', Wang Xiaobo, 03 Aug 2024

Dear Authors and Editor,
The development of a long-term and high-resolution time series dataset of rice planting area is crucial for estimating the response and adaptation of rice production to environmental changes at a regional scale. In this study, a satellite-based rice mapping method was put forward by combining a machine learning model with appropriate data preprocessing strategies to address the challenges of cloud contamination and missing data in optical remote sensing observations.
The manuscript features good language, appealing images, a well-structured layout, and methodology with a degree of innovation. Moreover, the authors have provided a commendable discussion on how different data pre-processing methods affect the results, which enhances the reliability of the chosen pre-processing approach.
Although this study utilized multi-source data to validate the accuracy of the dataset in terms of spatial distribution of rice area at multiple scales, more discussion is needed regarding the dataset's accuracy in annual variation and long-term trends, which is an indispensable aspect for long-term time series datasets.
Major concerns:
Lines 213-218: The method of dynamically adjusting the probability threshold for rice pixels is interesting. It would be beneficial to display the probability thresholds for each province and year, and to explore in the discussion section the possible reasons for variations in these thresholds across regions and years.
Lines 239-245: There are concerns about the reliability of the data post-processing method. Because the authors have not determined the reason for the excessive inter-annual variation in the detected rice planting area, using a low-pass filter to smooth the time series may not be the best choice. Is it possible that the large inter-annual variation in the detected rice planting area is induced by weather and data quality, so that only a portion of the rice fields is identified each year? Perhaps the authors could try taking the union of detected rice pixels every five years and examining whether the spatial distribution of rice area becomes more stable.
Section Results: The authors have provided a detailed analysis of the spatial consistency between the detected and statistical rice area, presented on a yearly basis. However, I wonder whether the results could be expanded to address the following point: How does the dataset perform in terms of inter-annual variation and long-term trends of rice area at the provincial or county level? Have any temporal variation characteristics been detected that align with the statistical data? It would be beneficial if the authors could elaborate on these temporal dynamics of rice area, as they are as important as the spatial consistency already demonstrated.
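The five-year union check suggested in the comment on lines 239-245 could be sketched roughly as follows. This is an illustration only, not the authors' or the referee's code. It assumes each year's map is reduced to a set of rice pixel coordinates, uses non-overlapping windows, and measures stability with the Jaccard index between consecutive windows; the function names and the choice of Jaccard index are assumptions.

```python
def windowed_union(masks_by_year, window=5):
    """Union of rice pixel sets over consecutive, non-overlapping windows of years.

    masks_by_year maps a year to the set of (row, col) pixels mapped as rice.
    Returns a dict keyed by (first_year, last_year) of each window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    years = sorted(masks_by_year)
    unions = {}
    for start in range(0, len(years), window):
        block = years[start:start + window]
        merged = set()
        for year in block:
            merged |= masks_by_year[year]
        unions[(block[0], block[-1])] = merged
    return unions


def jaccard(a, b):
    """Jaccard index of two pixel sets; two empty sets count as identical."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Toy masks (hypothetical pixels, not from the dataset).
toy_masks = {
    1990: {(0, 0)},
    1991: {(0, 1)},
    1992: {(0, 0)},
    1993: set(),
    1994: {(1, 1)},
    1995: {(0, 0)},
}
toy_unions = windowed_union(toy_masks, window=5)
```

If the Jaccard index between consecutive windowed unions is clearly higher than between consecutive single years, that would be consistent with the referee's hypothesis that only part of the rice area is detected in any one year.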
Minor concerns:
Line 95: It is suggested to add a technical flowchart to help readers better understand the production process of this dataset. It would be beneficial if the added flowchart could clearly annotate the training data, validation data, and other auxiliary data used in identifying rice pixels.
Line 155: It should be noted that the ChinaCropPhen1km dataset is not a crop area dataset, but rather a crop phenology dataset. The crop planting areas extracted in this dataset only represent high-quality pixels that can reflect the key phenological stages of crops.
Figure 3: Shanghai might be rather unique and not typical within Subregion 2. It would be better to replace it with SWIR1 time series curves from other provinces in Subregion 2.
I hope these comments help to improve the manuscript.

Citation: https://doi.org/10.5194/essd-2024-147-RC1
RC2: 'Comment on essd-2024-147', Anonymous Referee #2, 26 Sep 2024
This study, titled “CCD-Rice: A long-term paddy rice distribution dataset in China at 30 m resolution,” developed a high spatial resolution (30 m) rice distribution dataset covering the period from 1990 to 2016, including both single-season and double-season rice cultivation in China. While the authors present a substantial amount of data, there are concerns with the methods and validation that need to be addressed. Unfortunately, I am unable to recommend the manuscript for publication at this time.

Major comments:
The structure of the introduction is unclear and fails to highlight the innovative aspects of this paper. The authors should reorganize the introduction to better emphasize their points.

Although this study and the previous paper (Shen et al., 2023) used different methods and covered a different time frame, I believe this work should be incorporated into the previous data rather than published as a separate article.

The authors mentioned cloud cover as a limiting factor in rice mapping but did not adequately address this issue. Instead, they set the missing values to 0, which is very confusing.

The authors mentioned the use of Landsat-8 & -9 and Sentinel-2 to generate 8-day composite images, but there are variations in reflectance between the sensors. It is unclear whether inter-sensor calibration was performed, as this is not detailed in the methodology.

There is a mismatch between the number of validations at the county level and the numbers presented in Figure 2. The authors need to clarify whether the screening was conducted based on specific rules or if there is another explanation for the discrepancy.

There is a lack of citations in parts of the introduction. For example, lines 48-50 and 55-56 should be supported by references.

Although there are fewer samples for single- and double-season rice, it is still necessary to validate the available samples. If not, no distinction should be made between single- and double-season rice.

The authors compared their results with coarser-resolution products, which is not meaningful when evaluating high-resolution products like those used in this study. The selected product, limited to Heilongjiang, lacks data for other regions and is more accurate than this study. The authors should select different high-resolution products for comparison. This raises another question: why does the dataset only extend to 2016 when most available rice datasets are from after 2016?

Many awkwardly written sentences in the first draft require careful review, and the overall quality of the language needs improvement. For example, lines 21-22, 29-30, 102-104, 239-240, and 260-262.
Citation: https://doi.org/10.5194/essd-2024-147-RC2

Data sets

CCD-Rice: A paddy rice distribution dataset in China from 1990 to 2016 at 30 m resolution. Ruoque Shen, Qiongyan Peng, Xiangqian Li, Xiuzhi Chen, and Wenping Yuan. https://doi.org/10.57760/sciencedb.15865

Viewed

Total article views: 2,612 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
2,034	369	209	2,612	52	70

Views and downloads (calculated since 09 Jul 2024)

Month	HTML	PDF	XML	Total
Jul 2024	243	60	11	314
Aug 2024	158	59	8	225
Sep 2024	110	27	3	140
Oct 2024	89	15	34	138
Nov 2024	110	20	49	179
Dec 2024	106	16	47	169
Jan 2025	86	28	43	157
Feb 2025	83	12	3	98
Mar 2025	111	27	1	139
Apr 2025	83	18	1	102
May 2025	85	14	2	101
Jun 2025	120	17	0	137
Jul 2025	127	23	6	156
Aug 2025	132	12	0	144
Sep 2025	350	16	1	367
Oct 2025	41	5	0	46

Viewed (geographical distribution)

Total article views: 2,551 (including HTML, PDF, and XML), of which 2,551 have a defined geography and 0 are of unknown origin.

Cited

Latest update: 18 Oct 2025

Short summary

Rice is a vital staple crop that plays a crucial role in food security in China. However, long-term high-resolution rice distribution maps in China are lacking. This study developed a new rice mapping method, using a machine learning model and appropriate data preprocessing strategies, to address the challenges of cloud contamination and missing data in optical remote sensing observations. The resulting dataset, CCD-Rice (China Crop Dataset-Rice), achieved high accuracy and showed strong correlation with statistical data.


Cited

1 citation as recorded by Crossref.