the Creative Commons Attribution 4.0 License.
CropPlantHarvest: A 500 m annual dataset of crop planting and harvesting dates (2001–2024) of the U.S. Midwest
Abstract. As key components of agricultural management, planting and harvesting schedules strongly influence crop production by defining the length of the crop growing season and shaping the environmental conditions crops experience. Accurate knowledge of these management data is crucial for enhancing crop yield estimates by capturing the timing of crop development relative to weather and soil conditions, assessing climate adaptation by tracking shifts in farming practices over time, and supporting agricultural carbon accounting. Yet, existing planting and harvesting date datasets are largely based on state-level statistics or rule-based calendars that overlook intra-regional variability and the influence of human decision-making. The absence of long-term, high-resolution planting and harvesting date information hinders our ability to reconstruct historical agricultural practices and assess their agronomic and environmental consequences. In this study, we introduce CropPlantHarvest, the first dataset of annual corn and soybean planting and harvesting dates across the U.S. Midwest at 500 m resolution from 2001 to 2024. Planting dates are estimated using CropSow, an integrative remotely sensed crop modeling system that aligns simulated crop growth trajectories with satellite observations to retrieve field-level planting dates. Harvesting dates are retrieved using the Normalized Harvest Phenology Index (NHPI), a novel index that integrates Normalized Difference Vegetation Index (NDVI) and near-infrared (NIR) reflectance to detect harvesting events by capturing the distinct spectral transition from senescent crops to exposed crop residues. Validation against USDA crop progress reports and a field-level dataset demonstrates the high accuracy of CropPlantHarvest, with a mean absolute error of approximately 5 days for both crop species.
This large spatial and temporal dataset captures management-driven variability in crop season timing and duration, supporting improved modeling of crop yields, greenhouse gas emissions, and resource use. It could also serve as a benchmark for refining remote-sensing phenology products and evaluating the agro-environmental impacts of evolving crop management decisions. CropPlantHarvest is available at https://doi.org/10.5281/zenodo.16967482 (Liu and Diao, 2025).
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-526', Anonymous Referee #1, 08 Dec 2025
- AC1: 'Comment on essd-2025-526', Chunyuan Diao, 27 Jan 2026
- RC2: 'Comment on essd-2025-526', Anonymous Referee #2, 15 Dec 2025
The manuscript presents a valuable, novel, and well-executed dataset with clear societal and scientific impact. The methodology is sound, validation is robust, and the discussion contextualizes the work effectively. The manuscript requires major clarifications and improvements to enhance its clarity, accessibility, and technical rigor.
1. The development of a long-term, high-resolution (500m, annual) dataset for both planting and harvesting dates addresses a critical gap in agricultural remote sensing and management studies. The integration of the CropSow modeling system and the novel NHPI index is a significant methodological advance over traditional phenology-based approaches.
2. The use of "pure" MODIS pixels (≥90% CDL agreement) is a good practice to minimize mixed signals. However, the manuscript should more explicitly discuss the potential implications of this filtering. What percentage of the total agricultural area in the Midwest is excluded? Could this introduce a spatial bias towards larger, more homogeneous fields?
3. The calibration of the tt_emerg_to_Greenup parameter using state-level CPRs and 1000 random pixels is appropriate. However, the process for ensuring this single, state-year-level parameter is representative of sub-state variability in soils, cultivars, and microclimate could be elaborated.
4. Defining the harvesting window from MOS (50% senescence) to MOS+60 days is logical. However, in regions or years with very late harvest or early frost, could this window be truncated? Is the t_end parameter ever adjusted based on ancillary data (e.g., first freeze date)?
5. The threshold of 0.6 for NHPI is stated as being calibrated using field-level data. It would be helpful to see a brief sensitivity analysis (e.g., in supplement) showing how the MAE changes with thresholds around 0.6 (e.g., 0.55, 0.65). This demonstrates the stability of the chosen value.
6. The effort to select MODIS pixels entirely within a single Beck's field is excellent and mitigates scale mismatch. The resulting field-level MAEs of ~6 days are very good. However, the text should more directly acknowledge that some of the remaining error is inherently due to the 500m vs. field-scale discrepancy and possible geolocation errors.
7. Figures 3 and 5 (scatter plots) are clear. Figure 6 effectively shows spatial patterns. However, Figure 6 could be more informative if it included a panel showing the standard deviation (interannual variability) of planting/harvesting dates alongside the mean.
8. The non-linear relationships with latitude are interesting. The discussion attributing soybean patterns to double-cropping in the south is plausible. Is there any data (e.g., from CDL on winter wheat prevalence) that could be cited to support this claim more strongly?
9. The observed trends (delayed corn planting, stable/shortened seasons) are intriguing and well-discussed. The analysis would be even more powerful if linked directly to the driver analysis. For example, could the panel regression model be run on detrended data to separate interannual weather effects from long-term technological/management trends?
10. The fixed-effects panel regression is a suitable method. The interpretation of coefficients (e.g., minimum temperature delaying harvest due to humidity) is reasonable. To bolster causality, consider if leading/lagging variables were tested (e.g., does spring precipitation affect planting, or precipitation in the weeks immediately prior to the median planting date?).
11. The limitation regarding MODIS 500m resolution is appropriately noted, with a future outlook towards data fusion. This section could be slightly expanded to quantify the potential error. For example, in highly fragmented landscapes, what is the typical sub-pixel heterogeneity?
12. The Zenodo repository is provided. Excellent. To maximize utility, please ensure the data is provided in a widely accessible, cloud-optimized format (e.g., GeoTIFF or Zarr for each year/crop/variable) with clear, machine-readable metadata.
13. The manuscript is generally well-written. There are a few minor grammatical hiccups and slightly long sentences, particularly in the Method section.
14. The Abstract and Conclusion accurately summarize the work. The Conclusion could be slightly strengthened by reiterating the immediate applications enabled by this dataset, echoing the discussion.
Citation: https://doi.org/10.5194/essd-2025-526-RC2
- AC1: 'Comment on essd-2025-526', Chunyuan Diao, 27 Jan 2026
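Point 5 of RC2 asks how the MAE changes for NHPI thresholds around 0.6. A minimal sketch of how such a check could be organized is given here; it is not the authors' implementation. It assumes, hypothetically, that each pixel has a daily NHPI series over its harvesting window, that harvest is flagged on the first day NHPI reaches or exceeds the threshold (the crossing direction is an assumption, since the comments do not state it), and that reference harvest dates are available as day of year. The function names and the synthetic logistic series are illustrative only.

```python
import numpy as np

def first_crossing_doy(nhpi, doys, threshold):
    """First day of year on which nhpi >= threshold, or NaN if it never does.

    The ">=" crossing rule is an assumption made for this sketch.
    """
    nhpi = np.asarray(nhpi, dtype=float)
    hits = np.flatnonzero(nhpi >= threshold)
    return float(doys[hits[0]]) if hits.size else np.nan

def mae_by_threshold(series, doys, reference_doy, thresholds):
    """Map each candidate threshold to the MAE (days) against reference dates."""
    reference_doy = np.asarray(reference_doy, dtype=float)
    result = {}
    for t in thresholds:
        est = np.array([first_crossing_doy(s, doys, t) for s in series])
        valid = ~np.isnan(est)  # skip pixels where no crossing was detected
        if valid.any():
            result[t] = float(np.mean(np.abs(est[valid] - reference_doy[valid])))
        else:
            result[t] = float("nan")
    return result

# Synthetic illustration only: 50 pixels, each with a logistic NHPI-like
# rise centered on a made-up "true" harvest day inside a 61-day window.
doys = np.arange(260, 321)
rng = np.random.default_rng(0)
true_harvest = rng.integers(275, 305, size=50)
series = [1.0 / (1.0 + np.exp(-(doys - h) / 3.0)) for h in true_harvest]

sensitivity = mae_by_threshold(series, doys, true_harvest, [0.55, 0.60, 0.65])
for t, mae in sensitivity.items():
    print(f"threshold={t:.2f}  MAE={mae:.2f} days")
```

With real data, `series` and `true_harvest` would be replaced by the per-pixel NHPI time series and the field-level reference dates; a flat MAE across neighboring thresholds would indicate that the chosen value is stable.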
Data sets
CropPlantHarvest: A 500 m annual dataset of crop planting and harvesting dates (2001–2024) of the U.S. Midwest Yin Liu and Chunyuan Diao https://zenodo.org/records/16967482
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 524 | 132 | 32 | 688 | 32 | 48 |
The authors propose CropPlantHarvest, a 500 m annual dataset of corn and soybean planting and harvesting dates for 12 U.S. Midwest states from 2001 to 2024. Planting dates are derived using CropSow, and harvesting dates are calculated with the Normalized Harvest Phenology Index (NHPI), which is based on NDVI and NIR reflectance. Field- and state-level evaluations are used to demonstrate the performance of CropPlantHarvest.
Strengths:
Weaknesses: