https://doi.org/10.5194/essd-2022-52
02 Mar 2022
Status: a revised version of this preprint is currently under review for the journal ESSD.

WaterBench: A Large-scale Benchmark Dataset for Data-Driven Streamflow Forecasting

Ibrahim Demir1,2, Zhongrun Xiang1, Bekir Demiray3, and Muhammed Sit3
  • 1Department of Civil and Environmental Engineering, University of Iowa, Iowa City, 52246 USA
  • 2Department of Electrical and Computer Engineering, University of Iowa, Iowa City, 52246 USA
  • 3Interdisciplinary Graduate Program in Informatics, University of Iowa, Iowa City, 52246 USA

Abstract. This study proposes WaterBench, a comprehensive benchmark dataset for streamflow forecasting that follows the FAIR data principles and is prepared for convenient use in data-driven and machine learning studies. It also provides benchmark performance for state-of-the-art deep learning architectures on the dataset for comparative analysis. By aggregating streamflow, precipitation, watershed area, slope, soil type, and evapotranspiration data from federal agencies and state organizations (i.e., NASA, NOAA, USGS, and the Iowa Flood Center), we provide WaterBench for hourly streamflow forecasting studies. The dataset has high temporal and spatial resolution with rich metadata and relational information, and it can be used for a variety of deep learning and machine learning research. We define a sample streamflow forecasting task for the next 120 hours and provide performance benchmarks on this task with sample linear regression and deep learning models, including Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and sequence-to-sequence (S2S) models. To some extent, WaterBench makes up for the lack of unified benchmarks in earth science research. We highly encourage researchers to use WaterBench for deep learning research in hydrology.
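
As a rough illustration of the 120-hour forecasting task described in the abstract, the following sketch slices a single hourly streamflow series into (past, future) pairs and defines a naive persistence baseline. It is not taken from the WaterBench repository: the function names, the 72-hour look-back window, and the synthetic series are assumptions made only for this example, and the persistence baseline is not one of the benchmark models listed above.

```python
from typing import List, Sequence, Tuple

HORIZON_HOURS = 120  # forecast length stated in the abstract
HISTORY_HOURS = 72   # assumed look-back window; not taken from the source


def make_windows(
    flow: Sequence[float],
    history: int = HISTORY_HOURS,
    horizon: int = HORIZON_HOURS,
) -> List[Tuple[List[float], List[float]]]:
    """Slice one hourly series into (past, future) pairs (hypothetical helper)."""
    pairs = []
    last_start = len(flow) - history - horizon
    # If the series is shorter than history + horizon, the loop does not run.
    for start in range(last_start + 1):
        split = start + history
        past = list(flow[start:split])
        future = list(flow[split:split + horizon])
        pairs.append((past, future))
    return pairs


def persistence_forecast(
    past: Sequence[float], horizon: int = HORIZON_HOURS
) -> List[float]:
    """Naive baseline: repeat the last observed value for every future hour."""
    return [past[-1]] * horizon


if __name__ == "__main__":
    # Synthetic hourly series, used only to exercise the functions.
    series = [float(hour % 24) for hour in range(300)]
    samples = make_windows(series)
    past, future = samples[0]
    print(len(samples), len(past), len(future))
```

Each pair holds `history` hours of input and `horizon` hours of target values, which is the general shape a linear regression, LSTM, GRU, or sequence-to-sequence model would train on; the actual inputs in WaterBench also include precipitation, evapotranspiration, and watershed attributes, which this sketch omits.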


Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CC1: 'Include mention of Iowa in title and abstract', Mathis Messager, 03 Mar 2022
    • AC1: 'Reply on CC1', Zhongrun Xiang, 23 Aug 2022
  • RC1: 'Comment on essd-2022-52', Anonymous Referee #1, 06 Apr 2022
    • AC2: 'Reply on RC1', Zhongrun Xiang, 23 Aug 2022
  • RC2: 'Comment on essd-2022-52', Anonymous Referee #2, 31 May 2022
    • AC3: 'Reply on RC2', Zhongrun Xiang, 23 Aug 2022
  • RC3: 'Comment on essd-2022-52', Anonymous Referee #3, 06 Jul 2022
    • AC5: 'Reply on RC3', Zhongrun Xiang, 23 Aug 2022
  • EC1: 'Comment on essd-2022-52', Martin Schultz, 25 Jul 2022
    • AC4: 'Reply on EC1', Zhongrun Xiang, 23 Aug 2022


Data sets

WaterBench Ibrahim Demir, Zhongrun Xiang, Bekir Demiray, Muhammed Sit https://github.com/uihilab/WaterBench

Model code and software

WaterBench Ibrahim Demir, Zhongrun Xiang, Bekir Demiray, Muhammed Sit https://github.com/uihilab/WaterBench/tree/main/examples


Viewed

Total article views: 827 (including HTML, PDF, and XML)
  • HTML: 574
  • PDF: 221
  • XML: 32
  • Total: 827
  • BibTeX: 10
  • EndNote: 14
Views and downloads (calculated since 02 Mar 2022)

Viewed (geographical distribution)

Total article views: 730 (including HTML, PDF, and XML) Thereof 730 with geography defined and 0 with unknown origin.
Latest update: 20 Sep 2022
Short summary
We provide a large benchmark dataset, WaterBench, with valuable features for hydrological modeling. The dataset is designed to support cutting-edge deep learning studies aimed at more accurate streamflow forecast models. We also propose a modeling task for comparative model studies and provide sample models with code and results as a benchmark for reference. This makes up for the lack of benchmarks in earth science research.