the Creative Commons Attribution 4.0 License.
NortheastChinaSoybeanYield20m: an annual soybean yield dataset at 20 m in Northeast China from 2019 to 2023
Abstract. Accurate monitoring of crop yield is important for ensuring food security. However, existing yield datasets with a coarse spatial resolution are inadequate for capturing small-scale spatial heterogeneity. Current yield estimation methods, such as machine learning models or the assimilation of remotely sensed biophysical variables into crop growth models, depend heavily on ground observations and involve significant computational costs. To solve these problems, a hybrid framework coupling the World Food Studies Simulation Model (WOFOST) and the Gated Recurrent Unit model (GRU) was proposed to generate a 20 m soybean yield dataset in Northeast China from 2019 to 2023 (NortheastChinaSoybeanYield20m). A soybean growth dataset was first generated with WOFOST by simulating various production scenarios (climates, crop varieties, soil types and agro-managements). The GRU model was then trained to characterize the relationship between model-simulated LAI and soybean yield. The trained model was then applied to soybean yield estimation in Northeast China using time series of LAI at different growth stages derived from Sentinel-2. The accuracy of the dataset was evaluated against in-situ measurements and statistical data. The overall root mean squared error (RMSE) was 287.44 kg ha⁻¹ at the field scale and 272.36 kg ha⁻¹ at the regional scale. Results were stable across years, with an average mean relative error (MRE) of 11.46 % at the municipal scale and 7.94 % at the provincial scale. The results demonstrate that NortheastChinaSoybeanYield20m captures the spatial-temporal variation of soybean yield and can be applied to optimizing soybean production distribution and guiding agricultural decision-making. The NortheastChinaSoybeanYield20m dataset can be downloaded from https://doi.org/10.5281/zenodo.14263103 (Xu et al., 2024).
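The two accuracy metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions (RMSE as the square root of the mean squared difference, MRE as the mean absolute error relative to the reference value, in percent); the abstract does not state the exact formulas used, and the function names and sample values below are hypothetical.

```python
import math


def rmse(estimated, observed):
    """Root mean squared error, in the same unit as the inputs (e.g. kg/ha)."""
    if len(estimated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - o) ** 2 for e, o in zip(estimated, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mre_percent(estimated, observed):
    """Mean relative error in percent (assumed definition: mean of |e - o| / o)."""
    if len(estimated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    relative = [abs(e - o) / o for e, o in zip(estimated, observed)]
    return 100.0 * sum(relative) / len(relative)


if __name__ == "__main__":
    # Hypothetical yields in kg/ha, for illustration only (not from the dataset).
    estimated = [2100.0, 1850.0, 2400.0]
    observed = [2000.0, 2000.0, 2500.0]
    print(f"RMSE = {rmse(estimated, observed):.2f} kg/ha")
    print(f"MRE  = {mre_percent(estimated, observed):.2f} %")
```

RMSE keeps the unit of the yield values, while MRE is unitless, which is why the abstract can compare MRE across municipal and provincial scales with very different yield totals.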
Status: open (until 01 Mar 2025)
RC1: 'Comment on essd-2024-586', Anonymous Referee #1, 23 Jan 2025
I am very familiar with the WOFOST model and the datasets used by the author. This is not a good simulation project: the simulation accuracy does not meet industry standards, and the author withheld many critical WOFOST details and settings from the manuscript, which makes it difficult for me to assess the rationality and scientific validity of the simulation. Earth System Science Data, as the name suggests, focuses on datasets, yet the author's description and processing of the dataset fall short of a professional standard. Moreover, the description of the GRU is severely inadequate. After reading the entire manuscript, I still do not understand the role the GRU plays in this study.
- The study spans 2019 to 2023, but the sampling data are only from 2022 and 2023 (Fig. 1). The author should explain this in the text.
- The soil data should be described in more detail, for example, which soil parameters were used in this study.
- The author used statistical data from 1980 to 2022, but the study's time scale is from 2019 to 2023. This is confusing for the readers. Please provide an explanation.
- The technology roadmap needs improvement. 1) The author mentions agro-management data in Figure 2, but these data are not described in Section 2.2 Data collections. 2) The sampling data described in the Data collections section are not reflected in the figures, nor are the meteorological data from the National Meteorological Information Center. 3) The method of combining remote sensing data and model output through the GRU is described too simplistically. 4) The author allocates a large proportion of the figures to how WOFOST conducts simulations, but this is not the focus of the study. The focus should be on how models and remote sensing are coupled for yield estimation, as the author states in the research objective: "Designing a hybrid model coupling crop growth model and deep learning model for soybean yield estimation." The technology roadmap should display the research focus in more detail.
- Which sub-model of PCSE did the author use? LINTUL3 or Wofost72_PP?
- As far as I know, VAP is not included in the ERA5 dataset. How did the author obtain the VAP data?
- Lines 209-215: The description of the calculation process for soil parameters is too simplistic; a detailed calculation process should be provided. For example, which parameters from the Chinese soil database were used in the study, and what theories or formulas were used to calculate the SMW, SMFCF, SM0, and K0 required by the WOFOST model? Is Table 2 a lookup table? Where did it come from?
- Line 215: The description of Table 3 is redundant. It suffices to directly list the values and sources of the WOFOST crop parameters. Table 4 should list all crop parameters in WOFOST, not just the main crop parameters.
- Line 235: What are the settings for fertilizer application rate and timing in WOFOST?
- Line 244: After reading Section 3.2 Development of the Gated Recurrent Unit model (GRU), I am still unclear about the role of the GRU in this study. The author's explanation of the principles of the GRU is unclear. It does not directly describe how the GRU combines the output of the WOFOST model with remote sensing data, as shown in the technical roadmap. Figure 3 is not self-explanatory, leaving it unclear what exactly the inputs and outputs of the GRU are.
- Why is MODIS data mentioned again in Line 315? MODIS data was not mentioned in the data collection section.
- As shown in Figures 6, 7, and A2, the model simulation accuracy is below industry standards.
- By the way, in Line 240 ("3.1.2 Multi-scenarios crop simulations"), the author states: "The four different types of model parameters were arranged and combined to generate various simulation scenarios". Where can I find the scenario settings and the results of this part in the manuscript?
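On the vapour pressure question raised in the list above, one common route when only dewpoint is available is to evaluate the FAO-56 saturation vapour pressure formula at the dewpoint temperature. The sketch below is an assumption about how VAP could be derived, not a description of what the manuscript did: it assumes a 2 m dewpoint temperature in kelvin as input (a variable ERA5 does provide) and returns hPa. The function name is hypothetical.

```python
import math


def vapour_pressure_hpa(dewpoint_kelvin):
    """Actual vapour pressure (hPa) from dewpoint temperature (K).

    Uses the FAO-56 saturation vapour pressure formula
    e(T) = 0.6108 * exp(17.27 * T / (T + 237.3)), with T in deg C and e in kPa,
    evaluated at the dewpoint temperature.
    """
    t_dew_c = dewpoint_kelvin - 273.15  # kelvin to degrees Celsius
    e_kpa = 0.6108 * math.exp(17.27 * t_dew_c / (t_dew_c + 237.3))
    return e_kpa * 10.0  # 1 kPa = 10 hPa


if __name__ == "__main__":
    # Hypothetical dewpoint of 15 deg C, for illustration only.
    print(f"{vapour_pressure_hpa(288.15):.2f} hPa")
```

Whichever derivation is used, the manuscript would need to state it, together with the unit expected by the crop model's weather input.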
Citation: https://doi.org/10.5194/essd-2024-586-RC1
Data sets
NortheastChinaSoybeanYield20m: an annual soybean yield dataset at 20 m in Northeast China from 2019 to 2023. Jingyuan Xu, Xin Du, Taifeng Dong, Qiangzi Li, Yuan Zhang, Hongyan Wang, Jing Xiao, Jiashu Zhang, Yunqi Shen, and Yong Dong. https://doi.org/10.5281/zenodo.14263103
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
172 | 24 | 5 | 201 | 5 | 6 |