GlobalWheatYield4km: a global wheat yield dataset at 4-km resolution during 1982–2020 based on deep learning approach
Abstract. Accurate and spatially explicit information on global crop yield is paramount for guiding policy-making and ensuring food security. However, most public datasets are at coarse resolution in both space and time. Here, we used data-driven models to develop a 4-km dataset of global wheat yield (GlobalWheatYield4km) from 1982 to 2020. First, we proposed a phenology-based approach to map the spatial distributions of spring and winter wheat. Then, we determined the optimal grid-scale yield estimation model by comparing the performance of two data-driven models, Random Forest (RF) and Long Short-Term Memory (LSTM), using publicly available data: satellite and climatic data from the Google Earth Engine (GEE) platform, soil properties, and subnational-level census data covering ~11,000 political units. The results showed that GlobalWheatYield4km captured 82 % of yield variations with an RMSE of 619.8 kg/ha across all subnational regions and years. In addition, our dataset had a higher accuracy (R² ~ 0.71) than the Spatial Production Allocation Model (SPAM) (R² ~ 0.49) across all subnational regions and three years. The GlobalWheatYield4km dataset might play an important role in modelling crop systems and assessing climate impacts over larger areas (DOI of the referenced dataset: https://doi.org/10.6084/m9.figshare.10025006; Luo et al., 2022b).
Yuchuan Luo et al.
This paper proposes using machine learning (ML) and deep learning (DL) methods to generate a global wheat yield dataset at 4-km resolution for 1982-2020. The dataset has a range of potential applications in the agricultural sciences. However, several sections of the paper lack clarity and require further elaboration. I recommend that the authors address the following points before the manuscript proceeds to the next stage.
Abstract - “The results showed that GlobalWheatYield4km captured 82% of yield variations with RMSE of 619.8 kg/ha.” It is unclear which reference data these yield variations refer to. Is it the SPAM data?
Introduction - The authors only briefly mention the phenology-based method for deriving the global spatial distribution of wheat, at the end of the Introduction. It would be beneficial to introduce this method earlier and to explain its advantages over other methods.
Fig 1 - Add the data source.
Section 2.2.3 - Please fix the typo on line 102: “maximum temperature (Tmin), minimum temperatures (Tmax)”. The abbreviations appear to be swapped.
Section 2.3 - Please add a description of the Global Wheat Production Mapping System (GWPMS); at present it is only cited as Luo et al. (2022a).
Section 2.3.1 - “we compared the cropland map derived from the GFSAD1KCM with statistics”. It is unclear which statistics are used here.
Fig 4 - Add the country name as a figure title or legend.
Discussion - (i) “In future studies, we will attempt to map the spatial distribution of wheat using remote sensing images with finer spatial resolutions.” Please specify which finer-resolution datasets you plan to use. (ii) I am also interested in whether the methodology employed in this study can be extended to other crop species. It may be worthwhile to add a paragraph discussing the potential applications or challenges of applying similar methods to other crops, giving readers a broader perspective on the topic.
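To make the Abstract comment above concrete: R² and RMSE only have meaning relative to a stated reference dataset. The minimal Python sketch below shows one common way the two metrics are computed from paired reference and estimated yields. It is an illustration only, not the authors' evaluation code. The function names `rmse` and `r_squared`, the use of the coefficient of determination (rather than a squared Pearson correlation), and the sample yield values are all assumptions made for this example.

```python
import math


def rmse(reference, estimated):
    """Root-mean-square error between paired reference and estimated values."""
    squared_errors = [(e - r) ** 2 for r, e in zip(reference, estimated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(reference, estimated):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: the manuscript may instead report a squared Pearson
    correlation, which can give a different value.
    """
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimated))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Hypothetical yields in kg/ha for four subnational units (not real data).
census_yield = [2000.0, 3000.0, 4000.0, 5000.0]   # reference, e.g. census statistics
gridded_yield = [2200.0, 2900.0, 4100.0, 4800.0]  # estimates aggregated to the same units

print("RMSE (kg/ha):", rmse(census_yield, gridded_yield))
print("R2:", r_squared(census_yield, gridded_yield))
```

Both functions take the reference series as their first argument, so replacing census statistics with another reference such as SPAM would change the reported values. This is why the manuscript should name the reference explicitly.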