A Forty-Four-Year Comprehensive Dataset of Maize Phenology in China's Huang-Huai-Hai Plain
Abstract. The dataset presents a FAIR (Findable, Accessible, Interoperable, and Reusable), comprehensive, long-term dataset documenting maize phenology dynamics across China's Huang-Huai-Hai Region (HHHP), a critical area for national grain production. Spanning the period 1981–2024, the dataset integrates observations from 101 agrometeorological stations across eight provinces and municipalities, capturing ten key phenological stages—from sowing to maturity—and deriving four critical growth lengths. A multi-tiered quality control protocol, including automated consistency checks, climate data cross-referencing, and expert arbitration, was applied to ensure data integrity. Analytical outputs include kernel density estimation for characterizing probability distributions and univariate linear regression for quantifying decadal trends. The dataset comprises 1,616 diagnostic plots in JPEG format and two core data tables in XLSX format, with a total uncompressed volume of 1.50 GB.
The dataset is publicly available via the Science Data Bank under the accession code https://doi.org/10.57760/sciencedb.32076 and supports diverse applications in climate impact assessment, crop model improvement, and adaptation strategy development.
General Assessment
This manuscript presents a maize phenology dataset for the Huang-Huai-Hai Plain from 1981 to 2024, based on observations from 101 agrometeorological stations. The data have undergone multi-tier quality control, with reported completeness above 98.9% and an estimated error rate below 1%. The dataset is publicly available and follows FAIR principles. While the dataset has potential value for agricultural and climate research, the current version has several significant weaknesses, particularly in figure quality, external validation, and comparison with existing datasets.
Recommendation
Major revision. The authors should redesign the figures to include confidence intervals, error ranges, and clearer temporal dynamics; add external cross-validation; compare explicitly with existing datasets; and quantify uncertainty in the quality control procedures.
Major Comments:
Minor to Moderate Comments:
No example code is provided for data reuse, and there is no guidance on which stations or years have lower data quality and should be used with caution. Phrases such as "unprecedented data quality" are overly promotional and should be replaced with more neutral wording. References to existing phenology datasets are scarce, while self-citations are relatively frequent. FAIR is misspelled in the abstract. Figure 4 does not indicate trend significance. Table 1 appears to be missing.
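To illustrate the kind of significance information that could accompany the trends in Figure 4, the following is a minimal sketch, not the authors' method. It fits an ordinary least-squares trend to one station-stage series and reports the slope, its standard error, an approximate 95% interval, and a Mann-Kendall p-value. The Mann-Kendall test is a suggestion added here, not something described in the manuscript. The series is synthetic (a hypothetical 0.2 days/year shift plus deterministic noise) and does not come from the dataset. The function names are illustrative, and the code uses only the Python standard library.

```python
import math
from statistics import NormalDist, mean


def ols_trend(years, values):
    """Ordinary least-squares slope (units per year) and its standard error."""
    n = len(years)
    x_bar, y_bar = mean(years), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, std_err


def mann_kendall_p(values):
    """Two-sided Mann-Kendall p-value (normal approximation, no tie correction)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Synthetic example only: day-of-year of a hypothetical stage, 1981-2024.
years = list(range(1981, 2025))
doy = [160 + 0.2 * (y - 1981) + 2 * math.sin(y) for y in years]

slope, se = ols_trend(years, doy)
# Normal-based interval; a t quantile with n - 2 degrees of freedom
# would be slightly wider for a 44-year series.
z975 = NormalDist().inv_cdf(0.975)
ci_low, ci_high = slope - z975 * se, slope + z975 * se
p_value = mann_kendall_p(doy)

print(f"trend: {10 * slope:.2f} days/decade "
      f"(95% CI {10 * ci_low:.2f} to {10 * ci_high:.2f}), "
      f"Mann-Kendall p = {p_value:.3g}")
```

Reporting the interval and p-value alongside each decadal trend, or marking non-significant trends in the figure, would address this comment.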