Articles | Volume 17, issue 9
https://doi.org/10.5194/essd-17-4613-2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
GRDC-Caravan: extending Caravan with data from the Global Runoff Data Centre
Global Runoff Data Centre (GRDC), Federal Institute of Hydrology (BfG), 56068 Koblenz, Germany
Henning Plessow
Global Runoff Data Centre (GRDC), Federal Institute of Hydrology (BfG), 56068 Koblenz, Germany
Simon A. Mischel
Global Runoff Data Centre (GRDC), Federal Institute of Hydrology (BfG), 56068 Koblenz, Germany
Frederik Kratzert
Google Research, 1010 Vienna, Austria
Nans Addor
Fathom, Bristol, BS8 1EJ, UK
Geography, University of Exeter, Exeter, EX4 4RJ, UK
Guy Shalev
Google Research, Tel Aviv 6789141, Israel
Ulrich Looser
Global Runoff Data Centre (GRDC), Federal Institute of Hydrology (BfG), 56068 Koblenz, Germany
Related authors
Nele Reyniers, Qianyu Zha, Nans Addor, Timothy J. Osborn, Nicole Forstenhäusler, and Yi He
Earth Syst. Sci. Data, 17, 2113–2133, https://doi.org/10.5194/essd-17-2113-2025, 2025
Short summary
We present bias-corrected UK Climate Projections 2018 (UKCP18) regional datasets for temperature, precipitation, and potential evapotranspiration (1981–2080). All 12 members of the 12 km ensemble were corrected using quantile mapping and a change-preserving variant. Both methods effectively reduce biases in multiple statistics while maintaining projected climatic changes. We provide guidance on using the bias-corrected datasets for climate change impact assessment.
Olivier Delaigue, Guilherme Mendoza Guimarães, Pierre Brigode, Benoît Génot, Charles Perrin, Jean-Michel Soubeyroux, Bruno Janet, Nans Addor, and Vazken Andréassian
Earth Syst. Sci. Data, 17, 1461–1479, https://doi.org/10.5194/essd-17-1461-2025, 2025
Short summary
This dataset covers 654 rivers all flowing in France. The provided time series and catchment attributes will be of interest to those modelers wishing to analyze hydrological behavior and perform model assessments.
Martin Gauch, Frederik Kratzert, Daniel Klotz, Grey Nearing, Deborah Cohen, and Oren Gilon
EGUsphere, https://doi.org/10.5194/egusphere-2025-1224, 2025
Short summary
Missing input data are one of the most common challenges when building deep learning hydrological models. We present and analyze different methods that can produce predictions when certain inputs are missing during training or inference. Our proposed strategies provide high accuracy while allowing for more flexible data handling and being robust to outages in operational scenarios.
Eduardo Acuña Espinoza, Frederik Kratzert, Daniel Klotz, Martin Gauch, Manuel Álvarez Chaves, Ralf Loritz, and Uwe Ehret
Hydrol. Earth Syst. Sci., 29, 1749–1758, https://doi.org/10.5194/hess-29-1749-2025, 2025
Short summary
Long short-term memory (LSTM) networks have demonstrated state-of-the-art performance for rainfall-runoff hydrological modelling. However, most studies focus on predictions at a daily scale, limiting the benefits of sub-daily (e.g. hourly) predictions in applications like flood forecasting. In this study, we introduce a new architecture, multi-frequency LSTM (MF-LSTM), designed to use inputs of various temporal frequencies to produce sub-daily (e.g. hourly) predictions at a moderate computational cost.
Eduardo Acuña Espinoza, Ralf Loritz, Frederik Kratzert, Daniel Klotz, Martin Gauch, Manuel Álvarez Chaves, and Uwe Ehret
Hydrol. Earth Syst. Sci., 29, 1277–1294, https://doi.org/10.5194/hess-29-1277-2025, 2025
Short summary
Data-driven techniques have shown the potential to outperform process-based models in rainfall–runoff simulations. Hybrid models, combining both approaches, aim to enhance accuracy and maintain interpretability. Expanding the set of test cases to evaluate hybrid models under different conditions, we test their generalization capabilities for extreme hydrological events.
Frederik Kratzert, Martin Gauch, Daniel Klotz, and Grey Nearing
Hydrol. Earth Syst. Sci., 28, 4187–4201, https://doi.org/10.5194/hess-28-4187-2024, 2024
Short summary
Recently, a special type of neural-network architecture became increasingly popular in hydrology literature. However, in most applications, this model was applied as a one-to-one replacement for hydrology models without adapting or rethinking the experimental setup. In this opinion paper, we show how this is almost always a bad decision and how using these kinds of models requires the use of large-sample hydrology data sets.
Andreas Auer, Martin Gauch, Frederik Kratzert, Grey Nearing, Sepp Hochreiter, and Daniel Klotz
Hydrol. Earth Syst. Sci., 28, 4099–4126, https://doi.org/10.5194/hess-28-4099-2024, 2024
Short summary
This work examines the impact of temporal and spatial information on the uncertainty estimation of streamflow forecasts. The study emphasizes the importance of data updates and global information for precise uncertainty estimates. We use conformal prediction to show that recent data enhance the estimates, even if only available infrequently. Local data yield reasonable average estimations but fall short for peak-flow events. The use of global data significantly improves these predictions.
Daniel Klotz, Martin Gauch, Frederik Kratzert, Grey Nearing, and Jakob Zscheischler
Hydrol. Earth Syst. Sci., 28, 3665–3673, https://doi.org/10.5194/hess-28-3665-2024, 2024
Short summary
The evaluation of model performance is essential for hydrological modeling. Using performance criteria requires a deep understanding of their properties. We focus on a counterintuitive aspect of the Nash–Sutcliffe efficiency (NSE) and show that if we divide the data into multiple parts, the overall performance can be higher than all the evaluations of the subsets. Although this follows from the definition of the NSE, the resulting behavior can have unintended consequences in practice.
Marvin Höge, Martina Kauzlaric, Rosi Siber, Ursula Schönenberger, Pascal Horton, Jan Schwanbeck, Marius Günter Floriancic, Daniel Viviroli, Sibylle Wilhelm, Anna E. Sikorska-Senoner, Nans Addor, Manuela Brunner, Sandra Pool, Massimiliano Zappa, and Fabrizio Fenicia
Earth Syst. Sci. Data, 15, 5755–5784, https://doi.org/10.5194/essd-15-5755-2023, 2023
Short summary
CAMELS-CH is an open large-sample hydro-meteorological data set that covers 331 catchments in hydrologic Switzerland from 1 January 1981 to 31 December 2020. It comprises (a) daily data of river discharge and water level as well as meteorologic variables like precipitation and temperature; (b) yearly glacier and land cover data; (c) static attributes of, e.g., topography or human impact; and (d) catchment delineations. CAMELS-CH enables water and climate research and modeling at catchment level.
Louise J. Slater, Louise Arnal, Marie-Amélie Boucher, Annie Y.-Y. Chang, Simon Moulds, Conor Murphy, Grey Nearing, Guy Shalev, Chaopeng Shen, Linda Speight, Gabriele Villarini, Robert L. Wilby, Andrew Wood, and Massimiliano Zappa
Hydrol. Earth Syst. Sci., 27, 1865–1889, https://doi.org/10.5194/hess-27-1865-2023, 2023
Short summary
Hybrid forecasting systems combine data-driven methods with physics-based weather and climate models to improve the accuracy of predictions for meteorological and hydroclimatic events such as rainfall, temperature, streamflow, floods, droughts, tropical cyclones, or atmospheric rivers. We review recent developments in hybrid forecasting and outline key challenges and opportunities in the field.
Nele Reyniers, Timothy J. Osborn, Nans Addor, and Geoff Darch
Hydrol. Earth Syst. Sci., 27, 1151–1171, https://doi.org/10.5194/hess-27-1151-2023, 2023
Short summary
In an analysis of future drought projections for Great Britain based on the Standardised Precipitation Index and the Standardised Precipitation Evapotranspiration Index, we show that the choice of drought indicator has a decisive influence on the resulting projected changes in drought characteristics, although both result in increased drying. This highlights the need to understand the interplay between increasing atmospheric evaporative demand and drought impacts under a changing climate.
Grey S. Nearing, Daniel Klotz, Jonathan M. Frame, Martin Gauch, Oren Gilon, Frederik Kratzert, Alden Keefe Sampson, Guy Shalev, and Sella Nevo
Hydrol. Earth Syst. Sci., 26, 5493–5513, https://doi.org/10.5194/hess-26-5493-2022, 2022
Short summary
When designing flood forecasting models, it is necessary to use all available data to achieve the most accurate predictions possible. This manuscript explores two basic ways of ingesting near-real-time streamflow data into machine learning streamflow models. The point we want to make is that when working in the context of machine learning (instead of traditional hydrology models that are based on bio-geophysics), it is not necessary to use complex statistical methods for injecting sparse data.
Sella Nevo, Efrat Morin, Adi Gerzi Rosenthal, Asher Metzger, Chen Barshai, Dana Weitzner, Dafi Voloshin, Frederik Kratzert, Gal Elidan, Gideon Dror, Gregory Begelman, Grey Nearing, Guy Shalev, Hila Noga, Ira Shavitt, Liora Yuklea, Moriah Royz, Niv Giladi, Nofar Peled Levi, Ofir Reich, Oren Gilon, Ronnie Maor, Shahar Timnat, Tal Shechter, Vladimir Anisimov, Yotam Gigi, Yuval Levin, Zach Moshe, Zvika Ben-Haim, Avinatan Hassidim, and Yossi Matias
Hydrol. Earth Syst. Sci., 26, 4013–4032, https://doi.org/10.5194/hess-26-4013-2022, 2022
Short summary
Early flood warnings are one of the most effective tools to save lives and goods. Machine learning (ML) models can improve flood prediction accuracy but their use in operational frameworks is limited. The paper presents a flood warning system, operational in India and Bangladesh, that uses ML models for forecasting river stage and flood inundation maps and discusses the models' performances. In 2021, more than 100 million flood alerts were sent to people near rivers over an area of 470 000 km².
Juliane Mai, Hongren Shen, Bryan A. Tolson, Étienne Gaborit, Richard Arsenault, James R. Craig, Vincent Fortin, Lauren M. Fry, Martin Gauch, Daniel Klotz, Frederik Kratzert, Nicole O'Brien, Daniel G. Princz, Sinan Rasiya Koya, Tirthankar Roy, Frank Seglenieks, Narayan K. Shrestha, André G. T. Temgoua, Vincent Vionnet, and Jonathan W. Waddell
Hydrol. Earth Syst. Sci., 26, 3537–3572, https://doi.org/10.5194/hess-26-3537-2022, 2022
Short summary
Model intercomparison studies are carried out to test various models and compare the quality of their outputs over the same domain. In this study, 13 diverse model setups using the same input data are evaluated over the Great Lakes region. Various model outputs – such as streamflow, evaporation, soil moisture, and amount of snow on the ground – are compared using standardized methods and metrics. The basin-wise model outputs and observations are made available through an interactive website.
Jonathan M. Frame, Frederik Kratzert, Daniel Klotz, Martin Gauch, Guy Shalev, Oren Gilon, Logan M. Qualls, Hoshin V. Gupta, and Grey S. Nearing
Hydrol. Earth Syst. Sci., 26, 3377–3392, https://doi.org/10.5194/hess-26-3377-2022, 2022
Short summary
The most accurate rainfall–runoff predictions are currently based on deep learning. There is a concern among hydrologists that deep learning models may not be reliable in extrapolation or for predicting extreme events. This study tests that hypothesis. The deep learning models remained relatively accurate in predicting extreme events compared with traditional models, even when extreme events were not included in the training set.
Thomas Lees, Steven Reece, Frederik Kratzert, Daniel Klotz, Martin Gauch, Jens De Bruijn, Reetik Kumar Sahu, Peter Greve, Louise Slater, and Simon J. Dadson
Hydrol. Earth Syst. Sci., 26, 3079–3101, https://doi.org/10.5194/hess-26-3079-2022, 2022
Short summary
Despite the accuracy of deep learning rainfall-runoff models, we are currently uncertain of what these models have learned. In this study we explore the internals of one deep learning architecture and demonstrate that the model learns about intermediate hydrological stores of soil moisture and snow water, despite never having seen data about these processes during training. Therefore, we find evidence that the deep learning approach learns a physically realistic mapping from inputs to outputs.
Daniel Klotz, Frederik Kratzert, Martin Gauch, Alden Keefe Sampson, Johannes Brandstetter, Günter Klambauer, Sepp Hochreiter, and Grey Nearing
Hydrol. Earth Syst. Sci., 26, 1673–1693, https://doi.org/10.5194/hess-26-1673-2022, 2022
Short summary
This contribution evaluates distributional runoff predictions from deep-learning-based approaches. We propose a benchmarking setup and establish four strong baselines. The results show that accurate, precise, and reliable uncertainty estimation can be achieved with deep learning.
Andrew J. Newman, Amanda G. Stone, Manabendra Saharia, Kathleen D. Holman, Nans Addor, and Martyn P. Clark
Hydrol. Earth Syst. Sci., 25, 5603–5621, https://doi.org/10.5194/hess-25-5603-2021, 2021
Short summary
This study assesses methods that estimate flood return periods to identify when we would obtain a large flood return estimate change if the method or input data were changed (sensitivities). We include an examination of multiple flood-generating models, which is a novel addition to the flood estimation literature. We highlight the need to select appropriate flood models for the study watershed. These results will help operational water agencies develop more robust risk assessments.
Peter T. La Follette, Adriaan J. Teuling, Nans Addor, Martyn Clark, Koen Jansen, and Lieke A. Melsen
Hydrol. Earth Syst. Sci., 25, 5425–5446, https://doi.org/10.5194/hess-25-5425-2021, 2021
Short summary
Hydrological models are useful tools that allow us to predict distributions and movement of water. A variety of numerical methods are used by these models. We demonstrate which numerical methods yield large errors when subject to extreme precipitation. As the climate is changing such that extreme precipitation is more common, we find that some numerical methods are better suited for use in hydrological models. Also, we find that many current hydrological models use relatively inaccurate methods.
John P. Bloomfield, Mengyi Gong, Benjamin P. Marchant, Gemma Coxon, and Nans Addor
Hydrol. Earth Syst. Sci., 25, 5355–5379, https://doi.org/10.5194/hess-25-5355-2021, 2021
Short summary
Groundwater provides flow, known as baseflow, to surface streams and rivers. It is important as it sustains the flow of many rivers at times of water stress. However, it may be affected by water management practices. Statistical models have been used to show that abstraction of groundwater may influence baseflow. Consequently, it is recommended that information on groundwater abstraction is included in future assessments and predictions of baseflow.
Keirnan J. A. Fowler, Suwash Chandra Acharya, Nans Addor, Chihchung Chou, and Murray C. Peel
Earth Syst. Sci. Data, 13, 3847–3867, https://doi.org/10.5194/essd-13-3847-2021, 2021
Short summary
This paper presents the Australian edition of the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) series of datasets. CAMELS-AUS comprises data for 222 unregulated catchments with long-term monitoring, combining hydrometeorological time series (streamflow and 18 climatic variables) with 134 attributes related to geology, soil, topography, land cover, anthropogenic influence and hydroclimatology. It is freely downloadable from https://doi.pangaea.de/10.1594/PANGAEA.921850.
Peter Uhe, Daniel Mitchell, Paul D. Bates, Nans Addor, Jeff Neal, and Hylke E. Beck
Geosci. Model Dev., 14, 4865–4890, https://doi.org/10.5194/gmd-14-4865-2021, 2021
Short summary
We present a cascade of models to compute high-resolution river flooding. This takes meteorological inputs, e.g., rainfall and temperature from observations or climate models, and takes them through a series of modeling steps. This is relevant to evaluating current day and future flood risk and impacts. The model framework uses global data sets, allowing it to be applied anywhere in the world.
Frederik Kratzert, Daniel Klotz, Sepp Hochreiter, and Grey S. Nearing
Hydrol. Earth Syst. Sci., 25, 2685–2703, https://doi.org/10.5194/hess-25-2685-2021, 2021
Short summary
We investigate how deep learning models use different meteorological data sets in the task of (regional) rainfall–runoff modeling. We show that performance can be significantly improved when using different data products as input and further show how the model learns to combine those meteorological inputs differently across time and space. The results are carefully benchmarked against classical approaches, showing the supremacy of the presented approach.
Martin Gauch, Frederik Kratzert, Daniel Klotz, Grey Nearing, Jimmy Lin, and Sepp Hochreiter
Hydrol. Earth Syst. Sci., 25, 2045–2062, https://doi.org/10.5194/hess-25-2045-2021, 2021
Short summary
We present multi-timescale Long Short-Term Memory (MTS-LSTM), a machine learning approach that predicts discharge at multiple timescales within one model. MTS-LSTM is significantly more accurate than the US National Water Model and computationally more efficient than an individual LSTM model per timescale. Further, MTS-LSTM can process different input variables at different timescales, which is important as the lead time of meteorological forecasts often depends on their temporal resolution.
Gemma Coxon, Nans Addor, John P. Bloomfield, Jim Freer, Matt Fry, Jamie Hannaford, Nicholas J. K. Howden, Rosanna Lane, Melinda Lewis, Emma L. Robinson, Thorsten Wagener, and Ross Woods
Earth Syst. Sci. Data, 12, 2459–2483, https://doi.org/10.5194/essd-12-2459-2020, 2020
Short summary
We present the first large-sample catchment hydrology dataset for Great Britain. The dataset collates river flows, catchment attributes, and catchment boundaries for 671 catchments across Great Britain. We characterise the topography, climate, streamflow, land cover, soils, hydrogeology, human influence, and discharge uncertainty of each catchment. The dataset is publicly available for the community to use in a wide range of environmental and modelling analyses.
Short summary
Large-sample datasets are essential in hydrological science to support modelling studies and advance process understanding. Caravan is a community initiative to create a large-sample hydrology dataset of meteorological forcing data, catchment attributes and discharge data for catchments around the world. This dataset is a subset of hydrological discharge data and station-based watersheds from the Global Runoff Data Centre (GRDC), which are covered by an open data policy.