the Creative Commons Attribution 4.0 License.
High-resolution mapping of global winter-triticeae crops using a sample-free identification method
Abstract. Winter-triticeae crops, such as winter wheat, winter barley, winter rye, and triticale, are important in human diets and planted worldwide, and thus accurate spatial distribution information of winter-triticeae crops is crucial for monitoring crop production and food security. However, there is still a lack of global high-resolution maps of winter-triticeae crops because of the reliance of existing crop mapping methods on training samples, which limits their application at the global scale. In this study, we propose a new method based on the Winter-Triticeae Crops Index (WTCI) for global winter-triticeae crops mapping. This is a new sample-free method for identifying winter-triticeae crops based on differences in their normalized difference vegetation index (NDVI) characteristics from the heading to the harvesting stages and those of other types of vegetation. Based on this new method, we produced the first global 30 m resolution distribution maps of winter-triticeae crops from 2017 to 2022. Validation in 65 countries worldwide indicated that the method exhibited satisfying performance and stable spatiotemporal transferability, with producer’s accuracy, user’s accuracy and overall accuracy of 81.12 %, 87.85 % and 87.7 %, respectively. The identified area of winter-triticeae crops was consistent with the agricultural statistical area in almost all investigated counties or regions, and the correlation coefficient (R2) between the identified area and the statistical area was over 0.6, while the relative mean absolute error (RMAE) was less than 30 % in all six years. Overall, this study provides a reliable and automatic identification method for winter-triticeae crops without any training samples. The high-resolution distribution maps of global winter-triticeae crops are expected to support multiple agricultural applications. The distribution maps can be obtained at https://doi.org/10.57760/sciencedb.12361 (Fu et al., 2023a).
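The accuracy figures quoted in the abstract can be illustrated with a minimal sketch. The counts and areas below are invented, the function names are hypothetical, and the RMAE formula (sum of absolute differences divided by the sum of reported areas) is one common definition that is assumed here, not taken from the manuscript.

```python
def accuracy_metrics(tp, fn, fp, tn):
    """Producer's, user's and overall accuracy for the target class.

    tp: reference winter-triticeae samples mapped as winter-triticeae
    fn: reference winter-triticeae samples mapped as something else
    fp: other reference samples mapped as winter-triticeae
    tn: other reference samples mapped as something else
    """
    producers = tp / (tp + fn)                  # 1 - omission error
    users = tp / (tp + fp)                      # 1 - commission error
    overall = (tp + tn) / (tp + fn + fp + tn)
    return producers, users, overall


def r_squared(mapped, reported):
    """Squared Pearson correlation between mapped and reported areas."""
    n = len(mapped)
    mean_m = sum(mapped) / n
    mean_r = sum(reported) / n
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mapped, reported))
    var_m = sum((m - mean_m) ** 2 for m in mapped)
    var_r = sum((r - mean_r) ** 2 for r in reported)
    return cov * cov / (var_m * var_r)


def rmae(mapped, reported):
    """Relative mean absolute error (assumed definition, see lead-in)."""
    abs_err = sum(abs(m - r) for m, r in zip(mapped, reported))
    return abs_err / sum(reported)


# Invented example: 200 reference samples, two administrative units.
pa, ua, oa = accuracy_metrics(tp=80, fn=20, fp=10, tn=90)
print(f"PA={pa:.3f} UA={ua:.3f} OA={oa:.3f}")
print(f"RMAE={rmae([90, 110], [100, 100]):.2f}")
```

Producer's accuracy penalises omitted fields and user's accuracy penalises false detections, so the two should be read together.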
Status: closed
RC1: 'Comment on essd-2023-432', Anonymous Referee #1, 03 Jan 2024
This is needed and important research. However, some of the methods used are not clear enough, and the accuracy assessment may not be reliable. This makes the manuscript unsuitable for publication in ESSD in its current form.
My main concerns:
- The national datasets used for validation are not described at all, and I am not sure whether they are reliable sources. Are these datasets robust and detailed enough to support an accuracy assessment of the presented map? Did you only compare the area reported in country statistics with the area obtained from your maps? If so, that is not enough. It may be better to include the USA CDL dataset as a validation dataset.
- Another dataset for comparison/validation comes from Google Earth imagery. However, how is it possible to check whether winter crops are actually present if, for some years, only a single image is available, and it may not have been acquired during a period when the crops can be assessed?
- The methodology is sometimes not clear. Just as importantly, the data should be described first, before the methods! For example, the methodology behind the integration of Sentinel-2 and Landsat imagery is not clear. Did you use any harmonization techniques, which are needed when combining two satellite sources?
- Checking the dataset for my country shows that a large part of the mapped area is in fact located in agricultural areas (however, I cannot say whether these are winter, non-winter or non-triticeae crops). However, large parts are also located in forests, and there are large areas with “stripes”, probably related to improper processing of Landsat 7 imagery. This should certainly be addressed in the future, and the methods should be refined. I also checked the area around the Mediterranean Sea, where many areas of maquis/shrubland were mapped as winter crops.
Some other comments related to specific lines:
Line 28 – this sentence should be rephrased; mapping cannot monitor something.
Line 58 – state which satellite imagery you used.
Figure 1 – samples should have different, more distinguishable colours.
Line 80 – As mentioned above, data should be described first, before methodology.
Lines 88–91 – what about evergreen forests? They are not described or shown in Figure 2, although they are also usually characterized by high values during winter, for example. I think they should be taken into consideration when determining thresholds/methodology. Mediterranean vegetation such as maquis could also be examined. Furthermore, what about the impact of snow on the index values?
Figure 2 – what about the southern hemisphere?
Lines 172–176 – the use of SAR VH-derived thresholds is not clear.
Line 184 and further – what method did you use to harmonize the Sentinel-2 and Landsat data? Which GEE collections were used? How did you remove cloudy pixels?
Line 200 – how did you distinguish winter crops based on Google Earth imagery?
Equations 5–9 are redundant; they are commonly used and well known.
Citation: https://doi.org/10.5194/essd-2023-432-RC1
AC1: 'Reply on RC1', Yangyang Fu, 09 Apr 2024
The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2023-432/essd-2023-432-AC1-supplement.zip
RC2: 'Comment on essd-2023-432', Anonymous Referee #2, 14 Jan 2024
This manuscript deals with the important challenge of mapping winter triticeae crops at a global scale, a group of crops which is crucial for ensuring global food security. Conventional methods typically require a large number of reference samples to train supervised models that learn to map these crops. In many regions around the world, the availability of such samples is limited to non-existent. The proposed approach works independently of any reference data and therefore does not suffer from this drawback, providing an interesting alternative. The method is based on the temporal behavior of NDVI, where winter triticeae crops are said to have unique characteristics that allow them to be distinguished from other crops or land cover.
While this is an attractive idea, the authors do not provide sufficient proof that such a simple approach can really produce high-quality maps at the global scale. There are several major shortcomings and a lack of methodological detail, because of which I cannot recommend the manuscript for publication in its present form. Given its submission to ESSD, I would also expect more attention to the published data itself.
Major comments:
- The main methodology is based on NDVI values of bare land vs. vegetation and the timing of these events. Looking at equations (2), (3) and (4), it seems that the only timing-related requirement is that the max NDVI should occur before the min NDVI. In Fig. 2 the winter triticeae temporal behavior is only compared to natural vegetation such as forest and grass. The classes most easily confused with winter triticeae crops are of course other crop types! Why were these omitted from Fig. 2? How much of the reasoning still holds when compared to other crops? For example, maize would be slightly delayed with respect to winter cereals in the Northern Hemisphere, but as far as I can tell from the provided equations, maize pixels would also have a high WTCI because their max NDVI occurs before the min NDVI and those values will be similar to vegetation and bare signals, respectively. What am I missing here?
- The method relies on max NDVI occurring before min NDVI. But how do you decide on the reference period over which to analyse the curve? This should be different for the northern and southern hemispheres at least. This reference period is a crucial choice for the outcome of the method.
- This study uses agricultural statistical data to determine the threshold of WTCI where statistical data is available. This way, mapping is tuned towards matching these statistical numbers. In the validation results, comparisons are made between the resulting maps and the same statistical data, where correlation coefficients are reported between mapped area and the reported area. This is not an independent analysis, and if thresholds were tuned to match statistical numbers, high correlation coefficients with these same numbers seem obvious. In addition, how are planted areas computed from the resulting maps? Area estimates from maps have to be made carefully to avoid bias. This is not discussed here.
- The USA is excluded from the analysis because "highly accurate and annually CDL" is already available. This seems odd. A study aiming for global mapping should include the USA for completeness and consistency of the maps as well. In fact, the USA would be an excellent place to compare your results with the CDL and report agreement and differences, also with respect to proving that your method is not triggered by crops other than winter triticeae.
- The study discusses (also in the title) global mapping, while the study area actually contains just 65 countries (Fig. 1). It is stated that 99% of the global winter triticeae crops are covered, referring to FAO 2020, which does not appear in the reference list. How did the authors determine the 99% in the first place? Russia is not included, although it accounts for a major part of global wheat production. I would recommend in any case being more careful with the "global" terminology.
- The validation section is insufficient. A field survey was conducted in China, but for all other countries, visual interpretation of Google Earth images was performed. How can the latter be done reliably? How do you identify winter triticeae from a Google Earth picture? Why can't this be another crop? In regions such as Europe, many LPIS datasets are freely available, providing an excellent validation resource for the maps. I strongly suggest that the authors compare with such data instead of interpreting Google Earth pictures, especially where such high-quality data is available. This is crucial to prove that the method doesn't detect other crops.
- Satellite data is hardly described. Why is Landsat 7 still part of the analysis, knowing its striping issues and knowing that for the temporal range of the study it's not essential? How were reflectance data from the different sensors harmonized? Is this an existing collection? If not, more detail is required here. On what basis were clouds masked for the different sensors?
- The section on Data should really come before the explanation of the methodology.
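The referee's concern about the reference period can be illustrated with a small sketch. This is not the WTCI from the manuscript (equations 2 to 4 are not reproduced on this page). It is only a generic "green peak followed by a bare trough" check on invented monthly NDVI values, with hypothetical thresholds `veg` and `bare`.

```python
def peak_then_trough(ndvi, start, end):
    """Indices of the NDVI peak and of the lowest value after it,
    searching only inside the half-open window [start, end)."""
    window = ndvi[start:end]
    peak_off = max(range(len(window)), key=window.__getitem__)
    after = window[peak_off:]
    trough_off = peak_off + min(range(len(after)), key=after.__getitem__)
    return start + peak_off, start + trough_off


def green_then_bare(ndvi, start, end, veg=0.7, bare=0.3):
    """True if a vegetated peak is followed by a bare trough in the window.
    veg and bare are hypothetical thresholds chosen for illustration."""
    peak, trough = peak_then_trough(ndvi, start, end)
    return trough > peak and ndvi[peak] >= veg and ndvi[trough] <= bare


# Invented monthly NDVI (index 0 = January), Northern Hemisphere.
winter_cereal = [0.40, 0.45, 0.60, 0.80, 0.85, 0.60,
                 0.25, 0.30, 0.30, 0.25, 0.30, 0.35]
maize = [0.20, 0.20, 0.20, 0.25, 0.30, 0.50,
         0.75, 0.85, 0.70, 0.40, 0.20, 0.20]

# Whole year as reference period: both series satisfy the ordering rule.
print(green_then_bare(winter_cereal, 0, 12), green_then_bare(maize, 0, 12))
# April to August only: the maize peak has no trough left inside the window.
print(green_then_bare(winter_cereal, 3, 8), green_then_bare(maize, 3, 8))
```

With the whole year as the window, the ordering rule alone cannot separate the two synthetic curves. Restricting the window to the expected heading-to-harvest period does, which is why the choice of reference period (and its shift for the Southern Hemisphere) matters.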
Data comments:
- GeoTIFF files are called e.g. France_classify_2021_WTCI_Bline20_Vline05 or Belgium_classify_2021_WTCI_Bline60_Vline80. This naming convention needs to be explained. For two neighboring countries, the Vline value changes from 05 to 80? What are these values and why do they differ so much?
- For the files I checked, I was unable to get correct georeferencing (cf. the screenshot in the original comment, not reproduced here). More checks on the data format are required so that users can actually make use of the files.
- Even within single files, strange spatial artefacts appear, like one in Northern France. Where does this come from?
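For users of the files, the two names quoted above can be split into their parts with a short sketch. The pattern is inferred only from those two examples and may not cover every file in the dataset. The function name `parse_map_name` is hypothetical, and the Bline and Vline fields are kept as raw strings because their meaning is not explained on this page.

```python
import re

# Pattern inferred from the two file names quoted in the referee comment.
NAME_PATTERN = re.compile(
    r"^(?P<country>.+)_classify_(?P<year>\d{4})_WTCI"
    r"_Bline(?P<bline>\d+)_Vline(?P<vline>\d+)$"
)


def parse_map_name(stem):
    """Split a file name (without extension) into its components."""
    match = NAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {stem!r}")
    return {
        "country": match["country"],
        "year": int(match["year"]),
        "bline": match["bline"],  # kept as text: meaning undocumented here
        "vline": match["vline"],  # kept as text: "05" may not mean 5
    }


for name in ("France_classify_2021_WTCI_Bline20_Vline05",
             "Belgium_classify_2021_WTCI_Bline60_Vline80"):
    print(parse_map_name(name))
```

Keeping the two suffix values as text avoids silently turning "05" into 5 before the authors have documented what the numbers represent.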
Other comments:
L28: area estimates do not follow naturally from crop maps. I suggest using a term other than "crop area".
L33: Rephrase "warning global food security".
L35-36: Please have a look at the recently released WorldCereal global 10 m maps of winter triticeae. Some of the statements in the manuscript are outdated. Please revise the introduction accordingly.
L67: see main comments: why these 65 countries? It's a bit strange that this country is not mapped because CDL is available in the US. This would actually be an excellent reason to map it nonetheless and compare your results with the CDL.
L72: see main comments. Please provide more information on where this number comes from.
L91: what about evergreen trees? Also, grassland does not necessarily exhibit this continuous decrease in NDVI.
L120: See main comments. Max before min with respect to what? How do you enforce this? How do you treat calendar years? A maximum will always be before some minimum later on. What's your reference period in which you make this analysis?
L119: how is this done in practice for the southern hemisphere, i.e. how do you take into account cyclical dates?
L157: which accuracy is meant here based on which to decide the optimal threshold?
L160: why only these two crops?
L172-176: this lacks detail. The thresholds seem arbitrary, and what exactly are they computed on?
L366: Or other land covers
Citation: https://doi.org/10.5194/essd-2023-432-RC2
AC2: 'Reply on RC2', Yangyang Fu, 09 Apr 2024
The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2023-432/essd-2023-432-AC2-supplement.zip
Data sets
Global 30-m resolution distribution maps of winter-triticeae crops from 2017 to 2022. Yangyang Fu, Xiuzhi Chen, Chaoqing Song, Xiaojuan Huang, Jie Dong, Qiongyan Peng, and Wenping Yuan. https://doi.org/10.57760/sciencedb.12361
Viewed
- HTML: 657
- PDF: 166
- XML: 50
- Total: 873
- Supplement: 55
- BibTeX: 45
- EndNote: 41