the Creative Commons Attribution 4.0 License.
A Forty-Four-Year Comprehensive Dataset of Maize Phenology in China's Huang-Huai-Hai Plain
Abstract. This study presents a FAIR (Findable, Accessible, Interoperable, and Reusable), comprehensive, long-term dataset documenting maize phenology dynamics across China's Huang-Huai-Hai Plain (HHHP), a critical area for national grain production. Spanning the period 1981–2024, the dataset integrates observations from 101 agrometeorological stations across eight provinces and municipalities, capturing ten key phenological stages, from sowing to maturity, and deriving four critical growth lengths. A multi-tiered quality control protocol, including automated consistency checks, climate data cross-referencing, and expert arbitration, was applied to ensure data integrity. Analytical outputs include kernel density estimation for characterizing probability distributions and univariate linear regression for quantifying decadal trends. The dataset comprises 1,616 diagnostic plots in JPEG format and two core data tables in XLSX format, with a total uncompressed volume of 1.50 GB.
The dataset is publicly available via the Science Data Bank under the accession code https://doi.org/10.57760/sciencedb.32076 and supports diverse applications in climate impact assessment, crop model improvement, and adaptation strategy development.
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-726', Anonymous Referee #1, 13 Apr 2026
- RC2: 'Comment on essd-2025-726', Anonymous Referee #2, 08 Jun 2026
The manuscript presents a comprehensive and valuable long-term data infrastructure covering a forty-four-year period from 1981 to 2024. By integrating paper-based archives and modern digital records across 101 agrometeorological stations in China's Huang-Huai-Hai Plain, the authors capture ten essential developmental stages and derive four growth lengths. The scientific value of this dataset for crop modeling, agricultural climate change assessment, and adaptation studies is substantial. However, several critical methodological and technical aspects require substantial refinement before publication in Earth System Science Data.
First, a major challenge in multi-decadal crop phenology analysis is distinguishing climate-driven physiological shifts from human-induced agronomic adaptations, particularly changes in maize varieties. In the introduction, the authors explicitly state that process-based models often fail to account for temporal changes in crop varieties. Yet, the current dataset description and the core data tables (such as Maize phenology period data.xlsx) categorize the metadata broadly under "crop type" without specifying structural changes in maturity groups or varieties over the 44-year span. Because farmers frequently adopt long-growth cultivars to exploit increased thermal resources, treating varieties as static across four decades could significantly confound linear trend interpretations. The authors must detail how variety updates are documented within the dataset or discuss how these shifts affect the detected trends.
Second, the dataset construction methodology implements a statistical threshold flagging records beyond three standard deviations from station-specific means alongside logical consistency checks. While a statistical window is useful for detecting extreme values, crop phenology is strictly bound by biological thermal accumulation. A record might fall within three standard deviations but still violate physiological logic due to insufficient growing degree days between two subsequent stages. The authors need to clarify whether their consistency checks incorporated active physiological parameters, such as minimum thermal thresholds or heat sum requirements, specifically for compressed intermediate stages like the transition from tasseling to flowering and silking.
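To illustrate the distinction drawn in this comment, the following is a minimal sketch of how a three-standard-deviation screen and a thermal-time check could sit side by side. It is not the authors' procedure: the function names, the 10 °C base temperature, and the `min_gdd` requirement are assumptions made for the example, and real thresholds would have to come from the maize physiology literature or from the authors.

```python
from statistics import mean, stdev


def flag_three_sigma(doys, k=3.0):
    """Return indices of day-of-year values more than k sample SDs from the station mean."""
    m = mean(doys)
    s = stdev(doys)
    if s == 0:
        return []
    return [i for i, d in enumerate(doys) if abs(d - m) > k * s]


def growing_degree_days(tmean, t_base=10.0):
    """Accumulate daily thermal time above t_base (assumed base temperature, deg C)."""
    return sum(max(t - t_base, 0.0) for t in tmean)


def violates_thermal_minimum(tmean_between, min_gdd, t_base=10.0):
    """True if the heat sum between two stages is below a hypothetical minimum."""
    return growing_degree_days(tmean_between, t_base) < min_gdd


if __name__ == "__main__":
    # Hypothetical tasseling dates (day of year) for one station.
    tasseling = [198, 199, 200, 201, 202] * 3 + [199, 200, 201, 200, 240]
    print("3-sigma flags:", flag_three_sigma(tasseling))

    # Hypothetical daily mean temperatures between two recorded stages.
    temps = [24.0, 26.0, 22.0]
    print("thermal violation:", violates_thermal_minimum(temps, min_gdd=50.0))
```

The two checks are independent by design: a date can pass the statistical screen and still fail the thermal one, which is the case the comment asks the authors to address. Note also that with short series the sample standard deviation is inflated by the outlier itself, so a three-sigma rule can miss extreme values when only a few years are available.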
Third, the authors employ a hierarchical strategy utilizing a five-year moving window centered on the missing year to impute missing observations, which constitute approximately 1.1% of the total records. While the overall missing percentage is low, crop phenology exhibits high inter-annual sensitivity to episodic climate anomalies such as extreme drought or spring cold spells. Imputing a missing year based solely on the temporal proximity of adjacent years assumes a high degree of local stationarity. The authors should justify why a purely temporal moving window was preferred over a spatial interpolation method leveraging concurrent observations from neighboring stations within the same province, given that stations are densely distributed with 28 in Hebei and 24 in Henan.
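The contrast between the two imputation strategies can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the manuscript's hierarchical scheme: the data layout (a dict mapping year to day of year, with `None` for gaps), the function names, and the anomaly-transfer form of the spatial estimate are all hypothetical.

```python
from statistics import mean


def impute_temporal(series, year, half_window=2):
    """Mean of available values within +/- half_window years (a five-year window)."""
    vals = [
        series[y]
        for y in range(year - half_window, year + half_window + 1)
        if y != year and series.get(y) is not None
    ]
    return mean(vals) if vals else None


def station_mean(series, exclude_year):
    """Station mean over all available years except the one being imputed."""
    return mean(v for y, v in series.items() if y != exclude_year and v is not None)


def impute_spatial(target, neighbours, year):
    """Target-station mean plus the mean same-year anomaly of neighbouring stations."""
    anomalies = [
        s[year] - station_mean(s, year)
        for s in neighbours
        if s.get(year) is not None
    ]
    if not anomalies:
        return None
    return station_mean(target, year) + mean(anomalies)


if __name__ == "__main__":
    # Hypothetical sowing dates; 2002 is missing at the target station and
    # both neighbours recorded an unusually late year.
    target = {2000: 160, 2001: 162, 2002: None, 2003: 161, 2004: 161}
    nb_a = {2000: 158, 2001: 160, 2002: 170, 2003: 159, 2004: 159}
    nb_b = {2000: 165, 2001: 165, 2002: 175, 2003: 165, 2004: 165}
    print("temporal:", impute_temporal(target, 2002))
    print("spatial: ", impute_spatial(target, [nb_a, nb_b], 2002))
```

In an anomalous year the temporal estimate stays near the station's recent average, while the spatial estimate inherits the regional shift seen at neighbouring stations. Comparing the two on held-out observations would be one way for the authors to justify their choice.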
Fourth, Section 2.2 lists the distribution of the 101 agrometeorological stations across eight provinces and municipalities, including 28 in Hebei, 24 in Henan, and 25 in Shanxi, but only 4 in Jiangsu and 2 in Anhui. Given that the Huang-Huai-Hai Plain covers substantial parts of Jiangsu and Anhui provinces where summer maize production is highly intensive, the sparse station density in these southern sectors represents a major spatial limitation. The manuscript does not discuss how this highly skewed station density impacts the regional-scale generalization of maize phenology trends across the entire plain. The authors need to address this spatial bias explicitly in their data limitation section.
Fifth, Section 4 introduces a multi-faceted validation strategy to establish the scientific robustness of the dataset. However, the manuscript lacks a quantitative cross-validation comparison against external independent data sources, such as satellite-derived phenology metrics from Normalized Difference Vegetation Index (NDVI) or Enhanced Vegetation Index (EVI) products. Since agrometeorological station observations represent point-scale field assessments, comparing these direct visual records with regional remote sensing phenology metrics for key matching stages like tasseling or maturity would significantly strengthen the technical validation and demonstrate the interoperability of the dataset.
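A comparison of the kind requested here could report a few standard agreement statistics for paired station and satellite dates. The sketch below assumes the two series are already matched by station, year, and stage and expressed as day of year; the function name and inputs are hypothetical, and it says nothing about how the satellite metric itself would be derived.

```python
from math import sqrt


def agreement_metrics(station_doy, satellite_doy):
    """Bias, RMSE and Pearson r for paired dates; pairs with a missing value are skipped."""
    pairs = [
        (s, r)
        for s, r in zip(station_doy, satellite_doy)
        if s is not None and r is not None
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")

    diffs = [r - s for s, r in pairs]
    bias = sum(diffs) / n                       # satellite minus station, in days
    rmse = sqrt(sum(d * d for d in diffs) / n)

    mean_s = sum(s for s, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in pairs)
    var_s = sum((s - mean_s) ** 2 for s, _ in pairs)
    var_r = sum((r - mean_r) ** 2 for _, r in pairs)
    corr = cov / sqrt(var_s * var_r) if var_s > 0 and var_r > 0 else float("nan")

    return {"n": n, "bias": bias, "rmse": rmse, "r": corr}


if __name__ == "__main__":
    # Hypothetical tasseling dates: station observations vs. a satellite-derived metric.
    station = [200, 205, None, 210, 215]
    satellite = [203, 208, 211, 213, 218]
    print(agreement_metrics(station, satellite))
```

Reporting bias separately from RMSE matters in this setting, because a satellite-derived transition date often sits at a systematic offset from the visually observed stage even when the interannual variation agrees well.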
Citation: https://doi.org/10.5194/essd-2025-726-RC2
Data sets
A Forty-Four-Year Comprehensive Dataset of Maize Phenology in China's Huang-Huai-Hai Plain Quanjun Zhang and Dongli Wu https://doi.org/10.57760/sciencedb.32076
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 352 | 268 | 30 | 650 | 38 | 57 |
General Assessment
This manuscript presents a maize phenology dataset for the Huang-Huai-Hai Plain from 1981 to 2024, based on observations from 101 agrometeorological stations. The data have undergone multi-tier quality control, with reported completeness above 98.9% and an estimated error rate below 1%. The dataset is publicly available and follows FAIR principles. While the dataset has potential value for agricultural and climate research, the current version has several significant weaknesses, particularly in figure quality, external validation, and comparison with existing datasets.
Recommendation
Major revision. The authors should redesign the figures to include confidence intervals, error ranges, and clearer temporal dynamics; add external cross-validation; compare explicitly with existing datasets; and quantify uncertainty in the quality control procedures.
Major Comments:
Minor to Moderate Comments:
No example code is provided for data reuse, and there is no guidance on which stations or years have lower data quality and should be used with caution. Phrases such as "unprecedented data quality" are overly promotional and should be replaced with more neutral wording. References to existing phenology datasets are scarce, while self-citations are relatively frequent. FAIR is misspelled in the abstract. Figure 4 does not indicate trend significance. Table 1 appears to be missing.
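As one illustration of the trend-significance point, a univariate linear regression can report its slope together with a standard error and t statistic. The sketch below is a generic ordinary least squares calculation, not the authors' code; the function name and the synthetic series are assumptions for the example.

```python
from math import sqrt


def decadal_trend(years, doys):
    """OLS trend of day of year on calendar year.

    Returns (slope per decade, standard error per decade, t statistic).
    """
    n = len(years)
    if n < 3:
        raise ValueError("need at least three points")
    mx = sum(years) / n
    my = sum(doys) / n
    sxx = sum((x - mx) ** 2 for x in years)
    sxy = sum((x - mx) * (y - my) for x, y in zip(years, doys))
    slope = sxy / sxx
    intercept = my - slope * mx

    # Residual variance with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, doys))
    se = sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else float("inf")
    return slope * 10, se * 10, t_stat


if __name__ == "__main__":
    # Synthetic 44-year series: a 0.2 day/year advance plus alternating +/-1 day noise.
    years = list(range(1981, 2025))
    doys = [170 - 0.2 * (y - 1981) + (1 if i % 2 == 0 else -1)
            for i, y in enumerate(years)]
    slope, se, t_stat = decadal_trend(years, doys)
    print(f"trend = {slope:.2f} +/- {se:.2f} days/decade, t = {t_stat:.1f}")
```

For a 44-year series the test has 42 degrees of freedom, so an absolute t value above roughly 2.02 corresponds to significance at the 5% level (two-sided). This simple test assumes independent residuals; autocorrelated phenology series would call for a more conservative approach.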