StageIV-IRC: A High-resolution Dataset of Extreme Orographic Quantitative Precipitation Estimates (QPE) Constrained to Water Budget Closure for Historical Floods in the Appalachian Mountains
Abstract. Quantitative Flood Estimation (QFE) in complex terrain remains a grand challenge in operational hydrology due to the lack of accurate high-resolution Quantitative Precipitation Estimates (QPE) for operational forecasting and for calibrating hydrologic models. Here, we present a high-resolution (i.e., 250 m, 5-minute-hourly) QPE dataset for 215 extreme rainfall events occurred in 26 gauged mountainous basins in the Appalachian Mountains from 2008 to 2024. This dataset is developed by applying inverse rainfall corrections (IRC) derived from physically-based rainfall-runoff modeling (Liao and Barros, 2022 and 2023) to the Next Generation Weather Radar (NEXRAD) Stage IV analysis (4 km resolution, hourly). The corrected Stage IV analysis QPE is referred to as StageIV-IRC (StageIV with Inverse Rainfall Correction). The unique advantage of this StageIV-IRC QPE dataset is its agreement with ground-based rainfall measurements while achieving water budget closure at the storm-flood event scale within observational uncertainty of streamflow observations, which is the gold standard in hydrological modeling. This dataset is the first QPE dataset aiming to improve QFE in the complex terrain by reducing biases for extreme precipitation events, and it can be used to evaluate the skill of hydrologic models in the same basins and support model calibration. The StageIV-IRC QPE dataset is publicly available at https://doi.org/10.5281/zenodo.14028866, and improved initial soil moisture maps for the studied extreme precipitation events, derived from the same IRC framework, are available in the same repository (Liao and Barros, 2025c).
The authors developed a high-resolution dataset of extreme orographic QPE by closing the water budget using stream gauge measurements. This is a novel method and will be of great value if further validated. I recommend a major revision, as some clarification is needed and further evaluation of the dataset would be beneficial.
Major comments:
1. I would recommend that the authors mention ICC as well in the abstract, as it is also one step in the precipitation data generation.
2. I recommend that the authors provide a brief code snippet showing how to read the data. The current format and structure of the data are unclear, and an example would help readers try the dataset.
3. Are the ICC and IRC corrections implemented simultaneously in windows 2 and 5? Intuitively, overestimated rainfall values can compensate for an underestimated initial soil moisture condition. I am curious whether this compensation causes some difficulties in determining precipitation.
4. In the inverse correction process, there are likely more unknowns (precipitation at each pixel) than the knowns (observed discharge). Is it possible to obtain two different precipitation fields that can generate very similar discharge? How can you guarantee that you can get the "optimal" precipitation fields compared to other possible realizations? Is it reasonable to obtain an ensemble precipitation dataset to account for this variability?
5. Why did the authors select Stage IV as the primary precipitation source? In the first step, the authors downscale the precipitation field from 4 km to 1 km. Other available precipitation datasets, such as MRMS and AORC, provide precipitation estimates at 1 km resolution. If the authors used one of these 1 km datasets, the downscaling step could be removed.
6. L201-204, what does "self-similar statistics" mean? In L213, what does "the same rainfall statistics" mean here? I am curious which type of rainfall statistics is preserved in the downscaling process.
7. What is the size of the rainfall field in Ordinary Kriging? Is it a basin-based correction? Ordinary Kriging assumes stationarity, so it may not perform optimally when applied to a large, complex region.
8. L505-L508, the authors mentioned that "The climatologically corrected STIV_DBKC fields have a significantly accurate diurnal cycle compared to only event-scale bias-corrected STIV_DBK." However, in Figure 5, I do not see much difference between the blue and green lines. Also, should "STIV_DBK" here be "STIV_DB"?
9. L610, the authors mentioned that "IRC-ICC" is the recommended dataset. In Section 5, however, the authors provide the citation for "IRC". Why don't the authors publish IRC-ICC?
10. I recommend that the authors provide the results of STIV_IRC_ICC in Figures 5, 6, and 7. I understand that the lack of rainfall ground truth makes the evaluation of precipitation data difficult. However, better discharge estimates from your method cannot demonstrate the absolute accuracy of the precipitation data, because discharge is your objective function. I would recommend more evaluation of the precipitation data itself. Alternatively, you could use STIV_IRC_ICC to drive another hydrologic model and evaluate whether it also yields better discharge predictions than Stage IV. Model calibration could also be performed, as hydrologists usually do with a precipitation dataset.
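To make the non-uniqueness concern in major comment 4 concrete, here is a minimal sketch using a hypothetical linear toy model. It is not the authors' rainfall-runoff model: the `weights` vector, the pixel values, and the function `event_discharge` are all assumptions made for illustration. It shows that two different rainfall fields can yield the same event discharge, and how an ensemble of such fields could be generated.

```python
import numpy as np

# Hypothetical toy setup: 4 pixels draining to a single gauge.
# Event discharge is modeled as a weighted sum of pixel rainfall
# (an assumed stand-in for area fraction times runoff coefficient).
weights = np.array([0.4, 0.3, 0.2, 0.1])


def event_discharge(rain_mm):
    """Toy linear rainfall-runoff operator: one value per event."""
    return float(weights @ rain_mm)


rain_a = np.array([50.0, 40.0, 30.0, 20.0])

# A perturbation orthogonal to `weights` leaves the discharge unchanged:
# 0.4 * 3 - 0.3 * 4 = 0.
perturbation = np.array([3.0, -4.0, 0.0, 0.0])
rain_b = rain_a + perturbation

q_a = event_discharge(rain_a)
q_b = event_discharge(rain_b)

# Ensemble sketch: random perturbations with the component along
# `weights` removed, so every member reproduces the same discharge.
rng = np.random.default_rng(0)
noise = rng.normal(scale=5.0, size=(10, weights.size))
noise -= np.outer(noise @ weights, weights) / (weights @ weights)
ensemble = rain_a + noise
ensemble_q = ensemble @ weights
```

In this toy case the gauge constrains only one linear combination of the four pixel values, so the remaining three degrees of freedom are set entirely by the prior field and any regularization. A real nonlinear model with a full hydrograph provides more constraints, but the same question applies to the pixels that outnumber them.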
Minor comments:
1. I recommend that the authors clarify the terminology usage. In Figure 2, the event scale bias correction is noted as STIV_BD. But in some places of the figure and the article, STIV_DBK is used.
2. L690. The resolution of StageIV_D is "1km, hourly" in Figure 1, but you mention "the same resolution as StageIV_D datasets (250m, 5min)". Please reconcile the two.
3. Provide legends in Figures A3, 8, 9, 11, and 12.
4. Provide the units in Figure 10.
5. Provide the y-axis label in Figure 11.