https://doi.org/10.5194/essd-2024-586
08 Jan 2025
Status: this preprint is currently under review for the journal ESSD.

NortheastChinaSoybeanYield20m: an annual soybean yield dataset at 20 m in Northeast China from 2019 to 2023

Jingyuan Xu, Xin Du, Taifeng Dong, Qiangzi Li, Yuan Zhang, Hongyan Wang, Jing Xiao, Jiashu Zhang, Yunqi Shen, and Yong Dong

Abstract. Accurate monitoring of crop yield is important for ensuring food security. However, existing yield datasets have a coarse spatial resolution and are inadequate for capturing small-scale spatial heterogeneity. Current yield estimation methods, such as machine learning models or the assimilation of remotely sensed biophysical variables into crop growth models, depend heavily on ground observations and involve significant computational costs. To address these problems, a hybrid framework coupling the World Food Studies Simulation Model (WOFOST) and the Gated Recurrent Unit (GRU) model was proposed to generate a 20 m soybean yield dataset in Northeast China from 2019 to 2023 (NortheastChinaSoybeanYield20m). A soybean growth dataset was first generated with WOFOST by simulating various production scenarios (climates, crop varieties, soil types and agro-managements). The GRU model was then trained to characterize the relationship between model-simulated LAI and soybean yield. The trained model was then applied to estimate soybean yield in Northeast China using time series of LAI at different growth stages derived from Sentinel-2. The accuracy of the dataset was evaluated against in-situ measurements and statistical data. The root mean squared error (RMSE) was 287.44 kg ha⁻¹ at the field scale and 272.36 kg ha⁻¹ at the regional scale. Results were stable across years, with a mean relative error (MRE) averaging 11.46 % at the municipal scale and 7.94 % at the provincial scale. These results demonstrate that NortheastChinaSoybeanYield20m captures the spatial-temporal variation of soybean yield and can be applied to optimizing the distribution of soybean production and guiding agricultural decision-making. The NortheastChinaSoybeanYield20m dataset can be downloaded from https://doi.org/10.5281/zenodo.14263103 (Xu et al., 2024).
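The two accuracy metrics quoted in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: RMSE is taken as the square root of the mean squared difference between estimated and observed yield, and MRE as the mean of the absolute error divided by the observed value, expressed in percent. The function names and the sample yields are hypothetical and are not taken from the dataset.

```python
import math


def rmse(estimated, observed):
    """Root mean squared error, in the same unit as the inputs (e.g. kg ha-1)."""
    if len(estimated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - o) ** 2 for e, o in zip(estimated, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mre_percent(estimated, observed):
    """Mean relative error in percent (assumed definition: mean of |e - o| / o)."""
    if len(estimated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    relative_errors = [abs(e - o) / o for e, o in zip(estimated, observed)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical yields in kg ha-1, invented purely for illustration.
observed_yield = [2000.0, 2500.0, 3000.0]
estimated_yield = [2100.0, 2400.0, 3300.0]

print(f"RMSE: {rmse(estimated_yield, observed_yield):.2f} kg ha-1")
print(f"MRE:  {mre_percent(estimated_yield, observed_yield):.2f} %")
```

Under these assumed definitions, RMSE carries the unit of yield while MRE is dimensionless, which would explain why the abstract reports the former in kg ha⁻¹ and the latter in percent.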

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 14 Feb 2025)


Data sets

NortheastChinaSoybeanYield20m: an annual soybean yield dataset at 20 m in Northeast China from 2019 to 2023
Jingyuan Xu, Xin Du, Taifeng Dong, Qiangzi Li, Yuan Zhang, Hongyan Wang, Jing Xiao, Jiashu Zhang, Yunqi Shen, and Yong Dong
https://doi.org/10.5281/zenodo.14263103

Latest update: 08 Jan 2025
Short summary
This study presents a 20 m soybean yield dataset for Northeast China (NortheastChinaSoybeanYield20m) from 2019 to 2023, produced with a hybrid framework that couples a crop growth model with a deep learning algorithm. Results were stable across years. The root mean squared error of the dataset was 287.44 kg ha⁻¹ at the field scale and 272.36 kg ha⁻¹ at the regional scale. The dataset addresses the urgent demand for precise crop yield information.