GRDC-Caravan: extending Caravan with data from the Global Runoff Data Centre
Abstract. Large-sample datasets are essential in hydrological science to support modelling studies and advance process understanding. However, these datasets often lack standardization, which impedes their combination. Caravan is a community initiative to create a large-sample hydrology dataset of meteorological forcing data, catchment attributes, and discharge data for catchments around the world. Compared to existing large-sample hydrology datasets, Caravan focuses on globally consistent forcing and attribute data to facilitate global studies. It is designed to be expanded by members of the hydrological community using a common cloud-based framework. The dataset presented here is currently the sixth extension to Caravan, based on a subset of hydrological discharge data and station-based watersheds from the Global Runoff Data Centre (GRDC), which are covered by an open data policy. The GRDC is an international data centre operating under the auspices of the World Meteorological Organization (WMO); it collects quality-controlled river discharge data and associated metadata from the National Meteorological and Hydrological Services (NMHS) of WMO Member States. The dataset covers stations from 5,356 catchments in 25 countries and spans the years 1950–2023. This takes the total number of Caravan catchments (core dataset plus extensions) to 22,372 (1,589 catchments accounting for duplicates). This extension is released under a CC-BY-4.0 license that allows redistribution and is publicly available on Zenodo: https://zenodo.org/records/14006282 (Färber et al., 2024). We encourage additional NMHS to make their data available under open licenses so that it can be included in future versions of the extension.