A 500-m crop water requirement and irrigation water demand dataset for 25 crop types in the Yellow River Basin (2000–2020): Revealing significant underestimation from incomplete crop coverage

Tang, Shizhen; Li, Ziying; Sun, Xueyan; Luo, Lun; Zhang, Liwen; Duan, Yanghai; Wang, Jiaqi; Li, Qian; Zhang, Hongbo

doi:10.5194/essd-2026-128

Preprints

https://doi.org/10.5194/essd-2026-128

09 Mar 2026

Status: this preprint is currently under review for the journal ESSD.

Shizhen Tang, Ziying Li, Xueyan Sun, Lun Luo, Liwen Zhang, Yanghai Duan, Jiaqi Wang, Qian Li, and Hongbo Zhang

Abstract. Accurate estimation of irrigation water demand (IWD) is fundamental for sustainable water resources management in water-scarce agricultural regions. However, existing IWD assessments typically consider only a limited number of major crops, potentially leading to systematic underestimation of basin-scale water demand. This study develops a comprehensive high-resolution (500 m) dataset of crop water requirement (CWR, equivalent to the net irrigation water requirement, i.e., crop evapotranspiration minus effective precipitation) and IWD (CWR divided by irrigation water use efficiency) for the Yellow River Basin (YRB) covering 25 crop types from 2000 to 2020. We first evaluated eight remote sensing-based cropland datasets against statistical records from 135 administrative units, identifying the Global Land Analysis and Discovery (GLAD) dataset as optimal. CWR and IWD were estimated using the FAO Penman-Monteith approach with spatially explicit crop coefficients and spatiotemporally dynamic irrigation water use efficiency coefficients. The dataset provides two complementary versions: a sown area-based version reflecting the full theoretical agricultural water gap (multi-year average CWR and IWD of 548.3 × 10⁸ m³ and 1086.8 × 10⁸ m³, respectively), and an irrigated area-based version constrained by actual irrigation extent (258.6 × 10⁸ m³ and 508.4 × 10⁸ m³), broadly consistent with independent estimates of actual irrigation water consumption. Our results reveal that considering only five major crops would underestimate CWR and IWD by approximately 33 % and 34 %, respectively, with the largest underestimation in the upper reach (approximately 45 %). More importantly, incomplete crop coverage not only causes quantitative bias but also misrepresents temporal dynamics, yielding opposite trend directions in some sub-basins. 
Sensitivity analysis indicates that 12–15 crop types are required to capture over 90–95 % of basin-scale water demand, with vegetables and tubers ranking among the top six contributors despite being frequently omitted in previous studies. The dataset (Tang and Zhang, 2026) is publicly available at https://doi.org/10.5281/zenodo.18628324.
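As a rough illustration of the two definitions given in the abstract, the sketch below computes CWR and IWD for a single value and then back-calculates the efficiency implied by the reported basin totals. The function names are hypothetical and are not taken from the manuscript or the dataset. The authors apply spatiotemporally varying efficiency coefficients, so the ratio of basin totals is only an effective average, not a value the manuscript reports.

```python
def crop_water_requirement(etc_mm, pe_mm):
    """CWR as defined in the abstract: crop evapotranspiration minus effective precipitation."""
    return etc_mm - pe_mm


def irrigation_water_demand(cwr, efficiency):
    """IWD as defined in the abstract: CWR divided by irrigation water use efficiency."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return cwr / efficiency


def implied_efficiency(cwr_total, iwd_total):
    """Effective (aggregate) efficiency implied by a pair of basin totals."""
    return cwr_total / iwd_total


# Basin totals quoted in the abstract, in units of 10^8 m^3.
sown_ratio = implied_efficiency(548.3, 1086.8)
irrigated_ratio = implied_efficiency(258.6, 508.4)
print(f"sown area-based version: {sown_ratio:.3f}")
print(f"irrigated area-based version: {irrigated_ratio:.3f}")
```

Both versions imply an effective efficiency of roughly 0.5, which is why IWD is about twice CWR in each case.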

Received: 14 Feb 2026 – Discussion started: 09 Mar 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1: 'Comment on essd-2026-128', Anonymous Referee #1, 06 May 2026
This study develops a 500 m crop water requirement (CWR) and irrigation water demand (IWD) dataset for 25 crop types in the Yellow River Basin from 2000 to 2020. It finds that incomplete crop coverage can substantially underestimate basin-scale water demand, which is important and policy-relevant.
However, the interpretation of the 500 m product needs to be clarified. A key limitation is that crop-specific sown areas are allocated uniformly to cropland pixels within each administrative unit rather than derived from actual crop-type maps. The dataset is therefore not a fully validated 500 m map of crop-specific irrigation demand.
Several assumptions in the CWR calculation also need clearer treatment, especially the effective-rainfall method and the use of apparently fixed crop calendars. These limitations are discussed, but they should be integrated more directly into the Abstract, data description, uncertainty assessment, and dataset-use guidance.
Detailed comments:
The manuscript should consistently describe the product as a 500 m administratively downscaled CWR/IWD dataset, rather than implying that it maps actual crop-specific irrigation demand at 500 m resolution. Since crop-specific sown areas are spread uniformly across cropland pixels within each administrative unit, the product preserves administrative crop totals but does not show actual crop locations. This point is mentioned in the Discussion, but it should also appear in the Abstract and data description.

The cropland-product validation and selection need to be strengthened. The comparison appears to assess agreement with county-level cropland-area statistics rather than pixel-level classification accuracy. Therefore, wording such as ‘optimal cropland dataset’ should be softened to ‘the cropland product with the best agreement with county-level statistics under the selected metrics’. Please consider adding further error metrics, such as mean absolute error (MAE) and normalised root mean square error (nRMSE), and tests that account for repeated county-year observations, such as a clustered bootstrap or mixed-effects models. A spatial comparison across the products, even at pixel level, would also be useful: the number of products that agree on a pixel would indicate how much confidence can be placed in it.
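The metrics and the clustered bootstrap suggested in this comment can be sketched as follows. This is a minimal illustration, not the authors' validation code: the record layout `(county_id, estimate, reference)`, the function names, and the choice to normalise RMSE by the mean of the reference values (other normalisations exist) are all assumptions.

```python
import math
import random


def mae(est, ref):
    """Mean absolute error between estimates and reference statistics."""
    return sum(abs(e - r) for e, r in zip(est, ref)) / len(ref)


def nrmse(est, ref):
    """RMSE normalised by the mean of the reference values (one common convention)."""
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))
    return rmse / (sum(ref) / len(ref))


def clustered_bootstrap_mae(records, n_boot=1000, seed=0):
    """95 % percentile interval for MAE, resampling whole counties.

    records: iterable of (county_id, estimate, reference), one per county-year.
    All years of a sampled county are kept together, so repeated observations
    of the same county are not treated as independent.
    """
    rng = random.Random(seed)
    by_county = {}
    for county_id, est, ref in records:
        by_county.setdefault(county_id, []).append((est, ref))
    ids = sorted(by_county)
    stats = []
    for _ in range(n_boot):
        sample = [pair for cid in rng.choices(ids, k=len(ids)) for pair in by_county[cid]]
        est, ref = zip(*sample)
        stats.append(mae(est, ref))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[min(n_boot - 1, int(0.975 * n_boot))]


# Invented example: three counties, two years each (areas in arbitrary units).
records = [("A", 10.0, 12.0), ("A", 11.0, 12.5), ("B", 30.0, 28.0),
           ("B", 31.0, 29.0), ("C", 5.0, 9.0), ("C", 6.0, 9.5)]
low, high = clustered_bootstrap_mae(records)
print(low, high)
```

Resampling counties rather than individual county-years generally widens the interval when errors within a county are correlated across years, which is the concern raised above.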

Uniform crop allocation is a major source of spatial uncertainty. The authors should also provide a more quantitative assessment of this uncertainty, where feasible. I do not think the full dataset needs to be rebuilt, but a sensitivity test using alternative spatial weights would be useful. For example, the authors could compare the current uniform-allocation approach with one or more alternatives based on irrigated-area masks, crop suitability, phenology information, or available crop-type maps for major crops such as wheat, maize, rice, soybean, and rapeseed. At minimum, the dataset should include quality flags showing whether crop statistics are available at county level or only at prefecture level, so users can identify areas with greater allocation uncertainty.
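The sensitivity test proposed in this comment amounts to swapping the spatial weights while preserving the administrative total. The sketch below is a hypothetical illustration with invented pixel values; `allocate` and both weight vectors are assumptions, not part of the dataset.

```python
def allocate(total_area, weights):
    """Distribute an administrative-unit total across pixels in proportion to weights."""
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("at least one pixel must have a positive weight")
    return [total_area * w / weight_sum for w in weights]


# Four pixels in one hypothetical county; the last one is not cropland.
cropland_fraction = [1.0, 0.5, 0.5, 0.0]   # uniform allocation over cropland
irrigated_fraction = [0.8, 0.1, 0.1, 0.0]  # alternative weights, e.g. an irrigated-area mask

county_sown_area = 100.0  # invented county total for one crop
uniform = allocate(county_sown_area, cropland_fraction)
weighted = allocate(county_sown_area, irrigated_fraction)

# Both allocations keep the county total; only the within-county pattern changes.
max_pixel_shift = max(abs(u - w) for u, w in zip(uniform, weighted))
print(uniform, weighted, max_pixel_shift)
```

Because the total is preserved by construction, any difference between the two results is purely spatial, which is the uncertainty the comment asks the authors to quantify.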

Kc values are spatialized, but planting dates and growth-stage lengths appear fixed across space and time. Please provide a crop-calendar table for all 25 crops and test sensitivity to reasonable shifts in planting dates and stage lengths.
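A planting-date sensitivity test of the kind requested here could look like the sketch below. It uses the standard four-stage FAO-56 Kc curve (constant initial value, linear rise, constant mid-season value, linear decline). The stage lengths, Kc values, planting dates, and the sinusoidal reference evapotranspiration series are all invented for illustration and do not come from the manuscript.

```python
import math


def kc_on_day(day, stages, kc_ini, kc_mid, kc_end):
    """Kc on a given day after planting for a four-stage FAO-56 style curve."""
    l_ini, l_dev, l_mid, l_late = stages
    if day < 0 or day >= sum(stages):
        return 0.0  # outside the growing season
    if day < l_ini:
        return kc_ini
    day -= l_ini
    if day < l_dev:
        return kc_ini + (kc_mid - kc_ini) * day / l_dev
    day -= l_dev
    if day < l_mid:
        return kc_mid
    day -= l_mid
    return kc_mid + (kc_end - kc_mid) * day / l_late


def seasonal_etc(et0_by_doy, planting_doy, stages, kc_ini, kc_mid, kc_end):
    """Seasonal crop evapotranspiration: sum of Kc * ET0 over the growing season."""
    total = 0.0
    for offset in range(sum(stages)):
        doy = (planting_doy + offset) % len(et0_by_doy)
        total += kc_on_day(offset, stages, kc_ini, kc_mid, kc_end) * et0_by_doy[doy]
    return total


STAGES = (20, 30, 60, 40)  # hypothetical stage lengths in days
KC = (0.3, 1.15, 0.4)      # hypothetical Kc_ini, Kc_mid, Kc_end

# Synthetic ET0 (mm/day) with a summer peak; not observed data.
et0 = [3.0 + 2.0 * math.sin(2 * math.pi * (doy - 80) / 365) for doy in range(365)]

for shift in (-10, 0, 10):
    print(shift, round(seasonal_etc(et0, 150 + shift, STAGES, *KC), 1))
```

With a seasonally varying ET0, shifting the calendar changes seasonal ETc even though the Kc curve itself is unchanged, which is why fixed calendars across a basin of this size deserve a sensitivity test.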

Please test sensitivity to the effective-rainfall method. The current Döll and Siebert equation is rainfall-only and does not account for soil-water carryover or drainage. Please clarify whether CWR is calculated as seasonal ETc − Pe, or as a daily sum of positive deficits, i.e. ∑ max(ETc − Pe, 0). A comparison with USDA-SCS/NRCS or FAO CROPWAT methods would help show the robustness of the results. If feasible, a simple daily soil-water bucket or FAO-56 root-zone balance would provide an even stronger test.
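The two formulations distinguished in this comment can give quite different totals. The sketch below contrasts them on an invented four-day series; the function names are hypothetical and the numbers are not from the manuscript.

```python
def cwr_seasonal(etc_daily, pe_daily):
    """Seasonal formulation: total ETc minus total effective precipitation."""
    return sum(etc_daily) - sum(pe_daily)


def cwr_daily_positive(etc_daily, pe_daily):
    """Daily formulation: sum of positive deficits, sum(max(ETc - Pe, 0))."""
    return sum(max(e - p, 0.0) for e, p in zip(etc_daily, pe_daily))


# Invented series (mm/day): one wet day whose surplus cannot offset dry days
# under the daily formulation.
etc_daily = [5.0, 5.0, 5.0, 5.0]
pe_daily = [0.0, 12.0, 0.0, 0.0]

print(cwr_seasonal(etc_daily, pe_daily))        # 8.0
print(cwr_daily_positive(etc_daily, pe_daily))  # 15.0
```

The daily sum can never be smaller than the seasonal difference, because surplus rainfall on one day is discarded rather than carried over. A soil-water bucket, as suggested above, would sit between the two by carrying part of that surplus forward.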

Please describe the raster harmonization more clearly. The manuscript combines products at different resolutions, but it is unclear how they were resampled and aligned. It would also be useful to explain how the final product should be used and aggregated. Please clarify whether raster value 0 means true zero CWR/IWD or non-cropland.
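The distinction between a true zero and non-cropland matters as soon as users aggregate the rasters. The sketch below assumes, purely for illustration, that pixel values are water depths in mm, that cells are 500 m by 500 m, and that non-cropland is marked with a sentinel of -9999. None of these conventions is confirmed by the manuscript, which is exactly the ambiguity the comment raises.

```python
NODATA = -9999.0  # hypothetical sentinel for non-cropland


def aggregate_volume(depth_mm, cell_area_m2=500.0 * 500.0, nodata=NODATA):
    """Return (volume in m^3, number of valid cells) for a list of pixel depths.

    Cells equal to `nodata` are skipped; cells equal to 0 count as valid
    cropland with zero demand.
    """
    valid = [v for v in depth_mm if v != nodata]
    volume_m3 = sum(v * cell_area_m2 / 1000.0 for v in valid)
    return volume_m3, len(valid)


pixels = [100.0, 0.0, NODATA]
volume, n_valid = aggregate_volume(pixels)
print(volume, n_valid)  # 25000.0 2
```

If non-cropland were stored as 0 instead, the volume would be unchanged but the valid-cell count, and therefore any per-cropland-area mean, would be biased low.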
Citation: https://doi.org/10.5194/essd-2026-128-RC1
RC2: 'Comment on essd-2026-128', Anonymous Referee #2, 02 Jun 2026
This manuscript presents a new crop water requirement (CWR) and irrigation water demand (IWD) dataset for the Yellow River Basin and argues that previous studies substantially underestimate water demand because they consider too few crop types. The topic is important, and the dataset could be useful for the community. However, several key methodological assumptions raise concerns about the reliability of the spatial patterns and quantitative estimates. The manuscript currently places strong emphasis on the underestimation caused by incomplete crop coverage, while the uncertainty introduced by its own assumptions is not adequately evaluated.
1. The dataset is described as 500 m resolution, but the actual information content is much coarser
The manuscript repeatedly highlights the development of a 500 m dataset. However, crop-specific statistics are only available at county or prefecture level and are distributed uniformly across all cropland pixels within each administrative unit.
As a result, the dataset does not actually know where individual crops are located within a county. Every cropland pixel receives the same crop composition. Therefore, although the output grid is 500 m, the crop information is effectively available only at the county level.
This distinction is important because readers may assume that the dataset represents realistic crop distributions at 500 m resolution, which it does not.
The authors should clearly explain this limitation and avoid overstating the effective spatial resolution of the dataset.
2. The uniform allocation of crops is a very strong assumption
The study assumes that all crops are evenly distributed across all cropland pixels within a county.
In reality, crop locations are strongly influenced by irrigation availability, elevation, climate, soil conditions, and local agricultural practices.
For example, rice, vegetables, cotton, potatoes, and wheat are typically concentrated in different parts of a county rather than being uniformly distributed.
This assumption likely has a major impact on the spatial patterns shown in the maps. Yet the manuscript provides no assessment of how much uncertainty this introduces.
The authors should discuss this limitation more openly and, if possible, evaluate its influence on the results.
3. Planting dates and growing seasons are assumed to be identical across the entire basin
The Yellow River Basin covers a very large geographic area with substantial differences in climate and elevation.
Nevertheless, the study applies the same planting dates and growth durations across the basin for each crop type.
This is difficult to justify. Crops in Qinghai, Inner Mongolia, Henan, and Shandong do not have the same growing calendar.
Because crop water demand is highly sensitive to crop timing, this assumption may introduce substantial errors. The manuscript should explain why spatially varying crop calendars were not considered and discuss the resulting uncertainty.
4. Validation is incomplete
The manuscript validates cropland area estimates and selects the GLAD dataset as the best-performing product.
However, the final product depends on several additional components: crop allocation, crop coefficients, crop calendars, and irrigation efficiency.
None of these components are validated.
Therefore, the validation only demonstrates that the selected cropland map reasonably represents total cropland area. It does not demonstrate that the final crop-specific CWR and IWD estimates are accurate.
The manuscript should be more cautious when discussing dataset reliability.
5. The uncertainty analysis is insufficient
The paper argues that previous studies underestimate water demand by about 30–45% because they include too few crop types.
This may be true.
However, the manuscript does not quantify uncertainty associated with crop allocation assumptions, crop coefficient interpolation, irrigation efficiency estimates, planting date assumptions, or effective precipitation calculations.
Without such analysis, it is difficult to determine whether the reported underestimation is larger than the uncertainty of the dataset itself.
A more comprehensive uncertainty assessment is needed.
6. Irrigation efficiency before 2009 is reconstructed rather than observed
The irrigation water use efficiency values before 2009 are estimated using extrapolation because observations are unavailable.
Since irrigation efficiency directly affects IWD calculations, this assumption may strongly influence long-term trends.
The manuscript should provide a sensitivity analysis showing how different assumptions about historical irrigation efficiency affect the results.
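The sensitivity analysis requested here follows directly from IWD = CWR / efficiency: with CWR held fixed, any trend in IWD before 2009 comes entirely from the assumed efficiency history. The sketch below compares two invented reconstructions; the values, units, and function names are hypothetical and are not the authors' estimates.

```python
def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


years = list(range(2000, 2009))
cwr = [100.0] * len(years)  # invented constant CWR, arbitrary volume units

# Scenario A: efficiency held constant at the first observed value.
eff_constant = [0.50] * len(years)
# Scenario B: efficiency extrapolated backwards along a linear improvement.
eff_linear = [0.42 + 0.01 * (y - 2000) for y in years]

iwd_constant = [c / e for c, e in zip(cwr, eff_constant)]
iwd_linear = [c / e for c, e in zip(cwr, eff_linear)]

slope_constant = least_squares_slope(years, iwd_constant)
slope_linear = least_squares_slope(years, iwd_linear)
print(slope_constant, slope_linear)
```

In this toy case the constant-efficiency scenario yields no IWD trend, while the linear reconstruction yields a declining one from the same CWR, so the choice of reconstruction alone can create or remove a trend.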
7. The key conclusion is partly expected from the data
The central conclusion is that considering only five crops underestimates water demand.
However, the manuscript shows that these five crops account for only about two-thirds of total sown area.
Therefore, some degree of underestimation is inevitable.
The more important question is not whether underestimation exists, but which omitted crops contribute most and why their water demand is disproportionately important.
The manuscript would be stronger if it focused more on the contributions of individual crop groups rather than simply comparing five crops versus twenty-five crops.
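The crop-by-crop view suggested in this comment reduces to ranking crops by demand and accumulating their shares. The sketch below uses generic crop labels and invented values; `crops_needed` is a hypothetical helper, not a function from the dataset.

```python
def crops_needed(demand_by_crop, target_share):
    """Number of top-ranked crops required to reach a target share of total demand."""
    total = sum(demand_by_crop.values())
    ranked = sorted(demand_by_crop.items(), key=lambda item: item[1], reverse=True)
    running = 0.0
    for count, (_, demand) in enumerate(ranked, start=1):
        running += demand
        if running / total >= target_share:
            return count
    return len(ranked)


# Invented demands in arbitrary units; they sum to 100 for readability.
demand = {"crop_a": 30, "crop_b": 25, "crop_c": 15,
          "crop_d": 12, "crop_e": 10, "crop_f": 8}

print(crops_needed(demand, 0.50))  # 2
print(crops_needed(demand, 0.90))  # 5
```

Reporting the ranked shares alongside the count would show which omitted crops drive the gap, rather than only that a gap exists.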
More specifically,
Many abbreviations are difficult to follow and are not clearly defined when first introduced. Examples include IWUEC, CACD, CLCD, MCID, GLAD, CLUD-A, and GLASS-GLC.
The manuscript should explain how datasets with resolutions ranging from 30 m to 5 km were compared fairly during cropland validation.
The use of inverse distance weighting for meteorological interpolation should be justified.
The manuscript combines county-level and prefecture-level agricultural statistics. The possible impacts of mixing these spatial scales should be discussed.
The discussion section focuses heavily on the strengths of the dataset but gives relatively little attention to its limitations.
Citation: https://doi.org/10.5194/essd-2026-128-RC2

Supplement

https://doi.org/10.5194/essd-2026-128-supplement

Data sets

A 500-m crop water requirement and irrigation water demand dataset for 25 crop types in the Yellow River Basin (2000–2020), Shizhen Tang and Hongbo Zhang, https://doi.org/10.5281/zenodo.18628324

Latest update: 08 Jun 2026

Short summary

The Yellow River Basin produces over 35 % of China’s grain yet faces severe water scarcity. We present a 500 m resolution dataset of crop water requirement and irrigation water demand for 25 crop types (2000–2020). By covering nearly all crops rather than only a few major ones, we reveal that previous assessments underestimated irrigation demand by ~34 % and even misidentified trend directions in some sub-basins. The dataset supports more accurate water allocation in this water-scarce region.
