the Creative Commons Attribution 4.0 License.
CCD-Rice: A long-term paddy rice distribution dataset in China at 30 m resolution
Abstract. As one of the most widely cultivated grain crops, paddy rice is a vital staple food in China and plays a crucial role in ensuring food security. Over the past decades, the planting area of paddy rice in China has shown substantial variability. Yet, there are no long-term high-resolution rice distribution maps in China, which hinders our ability to estimate greenhouse gas fluxes and crop production. This study developed a new optical satellite-based rice mapping method using a machine learning model and appropriate data preprocessing strategies to address the challenges of cloud contamination and missing data in optical remote sensing observations. This study produced CCD-Rice (China Crop Dataset-Rice), the first high-resolution rice distribution dataset in China from 1990 to 2016. Based on 391,659 validation samples, the overall accuracy of the distribution maps in each provincial administrative region averaged 90.26 %. Compared against 20,759 county-level statistical records, the annual coefficients of determination (R²) for single- and double-season rice averaged 0.84 and 0.80, respectively. The distribution maps can be obtained at https://doi.org/10.57760/sciencedb.15865 (Shen et al., 2024a).
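As an illustration of the county-level comparison described in the abstract, the following is a minimal sketch of one way to compute R² between mapped and statistical rice areas. It assumes R² is taken as the squared Pearson correlation, which the abstract does not specify; the function name `r_squared` and the sample areas are hypothetical and are not values from the dataset.

```python
def r_squared(mapped, reported):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: R^2 is defined as the squared correlation coefficient;
    the abstract does not state which definition the authors used.
    """
    if len(mapped) != len(reported) or len(mapped) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(mapped)
    mean_m = sum(mapped) / n
    mean_r = sum(reported) / n
    # Covariance and variance sums (the 1/n factors cancel in the ratio).
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mapped, reported))
    var_m = sum((m - mean_m) ** 2 for m in mapped)
    var_r = sum((r - mean_r) ** 2 for r in reported)
    if var_m == 0 or var_r == 0:
        raise ValueError("R^2 is undefined when either series is constant")
    return cov * cov / (var_m * var_r)


# Hypothetical county areas (arbitrary units) for a single year.
mapped_area = [12.0, 30.5, 8.2, 45.1]
reported_area = [11.0, 33.0, 9.0, 41.7]
print(round(r_squared(mapped_area, reported_area), 3))
```

Computing this value separately for each year and rice type, then averaging across years, would mirror the "averaged" figures quoted in the abstract, under the stated assumption about the R² definition.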
Status: closed
RC1: 'Comment on essd-2024-147', Wang Xiaobo, 03 Aug 2024
Dear Authors and Editor,
The development of a long-term and high-resolution time series dataset of rice planting area is crucial for estimating the response and adaptation of rice production to environmental changes at a regional scale. In this study, a satellite-based rice mapping method was put forward by combining a machine learning model with appropriate data preprocessing strategies to address the challenges of cloud contamination and missing data in optical remote sensing observations.
The manuscript features good language, appealing images, a well-structured layout, and methodology with a degree of innovation. Moreover, the authors have provided a commendable discussion on how different data pre-processing methods affect the results, which enhances the reliability of the chosen pre-processing approach.
Although this study utilized multi-source data to validate the accuracy of the dataset in terms of spatial distribution of rice area at multiple scales, more discussion is needed regarding the dataset's accuracy in annual variation and long-term trends, which is an indispensable aspect for long-term time series datasets.
Major concerns:
Lines 213-218: The method of dynamically adjusting the probability threshold for rice pixels is interesting. It would be beneficial to display the probability thresholds for each province and year, and to explore in the discussion section the possible reasons for variations in rice probability thresholds across different regions and years.
Lines 239-245: There are concerns about the reliability of the data post-processing method. Because the authors do not determine the reason for the excessive inter-annual variation in the detected rice planting area, using a low-pass filter to smooth the time series may not be the best choice. Is it possible that the large inter-annual variation of the detected rice planting area is induced by weather and data quality, resulting in only a portion of rice fields being identified each year? Perhaps the authors could try taking the union of detected rice pixels every five years and examining whether the spatial distribution of rice area becomes more stable.
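The five-year union check suggested in this comment could be sketched as follows. This is only an illustration, not the authors' post-processing: it assumes each year's map is reduced to a set of (row, column) pixel indices classified as rice, uses Jaccard similarity as one possible stability measure, and the names `windowed_union` and `jaccard` are hypothetical.

```python
def windowed_union(rice_by_year, window=5):
    """Union rice pixels over consecutive, non-overlapping blocks of years.

    rice_by_year: dict mapping year -> set of (row, col) rice pixels.
    Blocks are formed by position in the sorted list of years, so the
    years are assumed to be contiguous. Returns a dict mapping
    (first_year, last_year) -> set of pixels detected in any year of the block.
    """
    years = sorted(rice_by_year)
    unions = {}
    for start in range(0, len(years), window):
        block = years[start:start + window]
        merged = set()
        for year in block:
            merged |= rice_by_year[year]
        unions[(block[0], block[-1])] = merged
    return unions


def jaccard(a, b):
    """Jaccard similarity of two pixel sets; 1.0 means identical sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy example: two fields that are each detected only in alternate years.
rice = {year: {(0, year % 2)} for year in range(1990, 2000)}
unions = windowed_union(rice, window=5)
first, second = unions[(1990, 1994)], unions[(1995, 1999)]
print(jaccard(rice[1990], rice[1991]))  # similarity between single years
print(jaccard(first, second))           # similarity between five-year unions
```

If the similarity between successive five-year unions is much higher than between successive single years, that would be consistent with the reviewer's hypothesis that only part of the rice area is detected in any one year.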
Section Results: The authors have provided a detailed analysis of the spatial consistency between the detected and statistical rice area, presented on a yearly basis. However, I wonder if the results could be expanded to address the following point: How does the dataset perform in terms of inter-annual variation and long-term trends of rice area at provincial or county level? Have any temporal variation characteristics been detected that align with the statistical data? It would be beneficial if the authors could elaborate on these temporal dynamics of rice area, as they are as important as the spatial consistency already demonstrated.
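One simple way to examine the temporal agreement raised in this comment is to compare the long-term linear trend and the direction of year-to-year changes in mapped and statistical area for a given province or county. The sketch below is an assumption about how such a check could be set up, not an analysis from the manuscript; the function names and the sample series are hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(years)
    mean_y = sum(years) / n
    mean_v = sum(values) / n
    sxx = sum((y - mean_y) ** 2 for y in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((y - mean_y) * (v - mean_v) for y, v in zip(years, values))
    return sxy / sxx


def year_to_year_changes(values):
    """Differences between consecutive values in a series."""
    return [b - a for a, b in zip(values, values[1:])]


def sign_agreement(mapped, reported):
    """Fraction of year-to-year changes with the same sign in both series."""
    pairs = list(zip(year_to_year_changes(mapped), year_to_year_changes(reported)))
    if not pairs:
        raise ValueError("need at least 2 values per series")
    same = sum(1 for dm, dr in pairs if (dm > 0) == (dr > 0) and (dm < 0) == (dr < 0))
    return same / len(pairs)


# Hypothetical provincial rice areas (arbitrary units), not real data.
years = [1990, 1991, 1992, 1993, 1994]
mapped = [100.0, 104.0, 101.0, 98.0, 95.0]
reported = [102.0, 105.0, 103.0, 99.0, 97.0]
print(linear_trend(years, mapped), linear_trend(years, reported))
print(sign_agreement(mapped, reported))
```

Similar slopes and a high sign-agreement fraction would indicate that the mapped areas follow the statistical record over time as well as in space.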
Minor concerns:
Line 95: It is suggested to add a technical flowchart to help readers better understand the production process of this dataset. It would be beneficial if the added flowchart could clearly annotate the training data, validation data, and other auxiliary data used in identifying rice pixels.
Line 155: It should be noted that the ChinaCropPhen1km dataset is not a crop area dataset, but rather a crop phenology dataset. The crop planting areas extracted in this dataset only represent high-quality pixels that can reflect the key phenological stages of crops.
Figure 3: Shanghai might be rather unique and not typical within Subregion 2. It would be better to replace it with SWIR1 time series curves from other provinces in Subregion 2.
I hope these comments help improve the manuscript.
Citation: https://doi.org/10.5194/essd-2024-147-RC1
RC2: 'Comment on essd-2024-147', Anonymous Referee #2, 26 Sep 2024
This study, titled “CCD-Rice: A long-term paddy rice distribution dataset in China at 30 m resolution,” developed a high spatial resolution (30 m) rice distribution dataset covering the period from 1990 to 2016, including both single-season and double-season rice cultivation in China. While the authors present a substantial amount of data, there are concerns with the methods and validation that need to be addressed. Unfortunately, I am unable to recommend the manuscript for publication at this time.
Major comments:
- The structure of the introduction is unclear and fails to highlight the innovative aspects of this paper. The authors should reorganize the introduction to better emphasize their points.
- Although this study and the previous paper (Shen et al., 2023) used different methods and covered a different time frame, I believe this work should be incorporated into the previous data rather than published as a separate article.
- The authors mentioned cloud cover as a limiting factor in rice mapping but did not adequately address this issue. Instead, they set the missing values to 0, which is very confusing.
- The authors mentioned the use of Landsat-8 & -9 and Sentinel-2 to generate 8-day composite images, but there are variations in reflectance between the sensors. It is unclear whether inter-sensor calibration was performed, as this is not detailed in the methodology.
- There is a mismatch between the number of validations at the county level and the numbers presented in Figure 2. The authors need to clarify whether the screening was conducted based on specific rules or if there is another explanation for the discrepancy.
- There is a lack of citations in parts of the introduction. For example, lines 48-50 and 55-56 should be supported by references.
- Although there are fewer samples for single- and double-season rice, it is still necessary to validate the available samples. If not, no distinction should be made between single- and double-season rice.
- The authors compared their results with coarser-resolution products, which is not meaningful when evaluating high-resolution products like those used in this study. The selected product, limited to Heilongjiang, lacks data for other regions and is more accurate than this study. The authors should select different high-resolution products for comparison. This raises another question: why does the dataset only extend to 2016 when most available rice datasets are from after 2016?
- Many awkwardly written sentences in the first draft require careful review, and the overall quality of the language needs improvement. For example, lines 21-22, 29-30, 102-104, 239-240, and 260-262.
Citation: https://doi.org/10.5194/essd-2024-147-RC2
Data sets
CCD-Rice: A paddy rice distribution dataset in China from 1990 to 2016 at 30 m resolution Ruoque Shen, Qiongyan Peng, Xiangqian Li, Xiuzhi Chen, and Wenping Yuan https://doi.org/10.57760/sciencedb.15865