Articles | Volume 13, issue 10
Data description paper
20 Oct 2021

The Bellinge data set: open data and models for community-wide urban drainage systems research

Agnethe Nedergaard Pedersen, Jonas Wied Pedersen, Antonio Vigueras-Rodriguez, Annette Brink-Kjær, Morten Borup, and Peter Steen Mikkelsen

This paper describes a comprehensive and unique open-access data set for research within hydrological and hydraulic modelling of urban drainage systems. The data come from a mainly combined urban drainage system covering a 1.7 km2 area in the town of Bellinge, a suburb of the city of Odense, Denmark. The data set consists of up to 10 years of observations (2010–2020) from 13 level meters, 1 flow meter, 1 position sensor and 4 power sensors in the system, along with rainfall data from three rain gauges and two weather radars (X- and C-band), and meteorological data from a nearby weather station. The system characteristics of the urban drainage system (information about manholes, pipes, etc.) can be found in the data set along with characteristics of the surface area (contour lines, surface description, etc.). Two detailed hydrodynamic, distributed urban drainage models of the system are provided in the software systems MIKE URBAN and EPA Storm Water Management Model (SWMM). The two simulation models generally show similar responses, but systematic differences are present since the models have not been calibrated. With this data set we provide a useful case that will enable independent testing and replication of results from future scientific developments and innovation within urban hydrology and urban drainage systems research. The data set can be downloaded from (Pedersen et al., 2021a).

1 Introduction

Scientific progress related to urban hydrology and urban drainage systems research is slowed down by a lack of open data within the field, and the need for open data and transparency is thus increasingly being emphasised (Moy de Vitry et al., 2019; Vonach et al., 2019). Urban drainage systems are essential for protecting the environment as well as human health and property. They typically represent the largest capital investments in infrastructure in cities and the most cost-efficient in terms of socio-economic gain (Hutton et al., 2007). Utility companies typically own the urban drainage systems and administer the right to extract and share asset data and sensor observation data from the systems. Sharing data with a larger community is a complex task as data require metadata and local knowledge and because data are often hidden in various difficult-to-access data systems in the utility. Utility companies have traditionally had little interest in making their data publicly available and sometimes even reject doing so for security or publicity reasons. They are however usually interested in collaborating with local universities, and therefore most published studies are based on case data that are not shared with the broader international scientific community. This makes it virtually impossible to compare the performance of different methods, which makes it difficult to reach a scientific consensus that can allow us to start focusing on future innovation and to initiate the capacity building worldwide that is needed to ensure better urban drainage asset management. In the broader hydrological community there have been several major efforts to provide “community-wide data” with the explicit purpose of improving research and innovation in their field, e.g. CAMELS (Addor et al., 2017) and MOPEX (Schaake et al., 2006). 
A large part of these impressive data sets consists of satellite data or derivatives from these, which have made it possible to obtain data on a close to continental scale. This is not possible within the field of urban drainage, since the most important parts of the systems are hidden underground, and the data describing these are either non-existent or hidden in various utility companies' data systems. The basis for open data sets within urban drainage is thus bound to be the utility company.

Ideally, an open-access data set for an urban drainage system should be able to stand alone without the need for direct contact with the utility company so that any researcher can have the same level of access to information. This implies that the data set will inevitably contain more information than needed for any single study. The minimum requirement for such a data set is a detailed description of the drainage system given by asset documentation, such as dimensions and location of all pipes and structures, as well as time series of observed inflow to the system (rainfall observations and/or consumers' wastewater production). In the past few years, several research groups have presented surrogate modelling studies on each of their case areas using their own specific case data, which makes it impossible to compare the methods directly (Kroll et al., 2017; Ledergerber et al., 2019; Mahmoodian et al., 2018; Thrysøe et al., 2019; Wolfs and Willems, 2017). Within the research field of real-time control (RTC) of urban drainage systems, the lack of open data sets has led to the development of synthetic test models, such as the Astlingen network (Schütze et al., 2017; Sun et al., 2020) and the Pystorms networks (Rimer et al., 2019). While such synthetic networks are useful due to their stringent focus on the most relevant processes for the purpose at hand, the usage of actual networks to benchmark the performance of RTC methods would help the end-users in the utility companies to decide which methods to implement for their specific system. A comprehensive data set of one structure, heavily monitored during 5 d of experiments, has been released (Moy De Vitry et al., 2017), and similar data are needed for network systems. Time series of observations of levels and flows in the system will also be necessary for many investigations of e.g. model calibration techniques (Krebs et al., 2013; Vonach et al., 2019), development of improved skill scores (Bennett et al., 2013), uncertainty analysis (Deletic et al., 2012), techniques for data quality control (Kirstein et al., 2019; Therrien et al., 2020), development of data-driven models and machine learning (Carbajal et al., 2017; Eggimann et al., 2017; Palmitessa et al., 2021a), and software sensors (Fencl et al., 2019). Other areas that can be inspired by open data sharing could be the construction of digital ecosystems (Sarni et al., 2019) and digital twins (Pedersen et al., 2021b; Therrien et al., 2020). The more complete and more diverse the data set is (spatially and temporally), the more research potential there will be.

The current article describes an open data set suitable for urban drainage research and education. The data set is described in accordance with the FAIR principles of open and documented data, which require data to be findable, accessible, interoperable and reusable (Wilkinson, 2016) in order to fully support reproducible research in computational hydrology (Hutton et al., 2016; Stagge et al., 2019). The utility company VCS Denmark (referred to as “VCS” in the rest of the paper) provided most of the data with the support of hydrological and meteorological data from the Danish Meteorological Institute (DMI) (DMI, 2020), rain gauge data from The Water Pollution Committee (WPC) of The Danish Society of Engineers (DMI and IDA, 2020), and geospatial data from the Danish Agency for Data Supply and Efficiency (DADSE) (DADSE, 2020). VCS has for many years focused on documenting its systems and procedures through detailed registration of assets and systematised collection of sensor observation data. The presented data set is from a 1.7 km2 suburban area served by a combined urban drainage system with more than 10 years of observational data available. Orthophotos from 2010 to the present show no significant urban development in the areas that are connected to the combined sewer system. This gives a unique opportunity to use the same model for a 10-year period with little structural model uncertainty induced by changes in the urban layout. The specific properties of the sewer system (pipe diameters, basin volumes, actuator settings, etc.) are shared in the shape of two distributed urban drainage models since these are more accessible for most potential users than an asset database. The two simulation models are constructed in the software programs MIKE URBAN (MU) (DHI, 2020) and Storm Water Management Model (SWMM) (EPA, 2020). 
The MU model is used by VCS in the daily planning and operations work of the utility company, and the model of the same system using SWMM has been constructed for this publication due to its free and open-source nature. The selection of data and models provided here aims to be as “open-minded” as possible, and we believe it can be used to initiate research across a range of highly relevant topics and also inspire discussions among water utilities on the benefits of high-quality data acquisition and modelling.

The paper is organised as follows: the case area and its urban drainage system are described in Sect. 2, while Sect. 3 describes the sensor observations and Sect. 4 the two distributed urban drainage models. Section 5 provides a brief comparison of the two models with data from selected locations for three example rain events and discusses possible areas of improvement, followed by a discussion of future potential research use of the data set (Sect. 6), an overview of the data repository (Sect. 7) and our conclusions (Sect. 8).

Figure 1. Maps of the case area, indicating rivers as blue lines. (a) Locations of Brændekilde, Bellinge and Dyrup (dark green areas) in the upstream part of the sewer catchment of Ejby Mølle WRRF (black outline); locations of rain gauges (stars); and locations of X-band radar grid cells (grid). (b) The pipe network in the area, distinguishing combined sewers (green) from separate sewers for stormwater (blue) and wastewater (red). Combined sewer overflow locations (green triangles). (c) Orthophoto of the area along with contour lines, indicating combined sewer overflow locations (green triangles). (d) Locations of the in-sewer sensors in Brændekilde and Bellinge together with the combined and stormwater system (wastewater pipes are hidden for simplicity). Background maps provided by DADSE (2020).

2 System description

The presented case area consists of the three suburban towns, Brændekilde, Bellinge and Dyrup, which are located in the south-western outskirts of Odense on the island of Funen, Denmark (Fig. 1a), at 55°24′ N, 10°23′ E, in a temperate climate with the seasons of winter (December–February), spring (March–May), summer (June–August) and autumn (September–November). The municipality of Odense has approximately 200 000 inhabitants and an area of 30 400 ha of which about half is developed. VCS is the local utility company responsible for operating and managing most of the water supply and all the urban drainage and wastewater infrastructure in the municipalities of Odense and Nordfyn, including the central Ejby Mølle Water Resource Recovery Facility (WRRF, also referred to as a wastewater treatment plant), which collects combined and separate wastewater from a 112 km2 service area, including Brændekilde–Bellinge–Dyrup in the most upstream part of the Ejby Mølle sewer catchment.

2.1 Characteristics of the area

Topographically, the area is flat with terrain dropping from 50 m a.s.l. in the west to 20 m a.s.l. in the east. It is located in the downstream part of the Odense River catchment, between Odense River and its tributary Borreby Møllebæk, which receive combined sewer overflow from Bellinge and Brændekilde during heavy rain (Fig. 1b and c). All sensors are located in Brændekilde and Bellinge (Fig. 1d). The total surface area contributing rainfall runoff to the sewer system (upstream from sensor G71F06R in Bellinge) is 1.72 km2, of which paved surfaces cover 0.55 km2. The case area has 1800 households with approximately 4000 inhabitants, who mostly live in detached single-family houses built from 1960 to 1980 and discharge a total average wastewater load of 501 m3 d−1. Downstream from Bellinge, a main interceptor pipe runs along Odense River to the Ejby Mølle WRRF, as seen in Fig. 1b. The small town of Dyrup is included in the models to ensure realistic hydraulic conditions in the interceptor pipe. A total average of 570 m3 d−1 of wastewater and a 1 km2 surface area contribute to the boundary conditions downstream of G71F06R, comprising stormwater from an area of 0.61 km2 coming from Dyrup.

Figure 2. Meteorological data from a weather station in Årslev approx. 10 km east of the case area. Bins on the horizontal axis refer to the beginning of each year. Precipitation data are extracted at a 10 min resolution and summed to daily and yearly values. Air temperature, radiation, wind speed and relative humidity are given as daily and yearly means. Based on open hydrological data from DMI (2020) where more parameters can be extracted.


2.2 Climate and meteorology of the area

The area has a warm, temperate climate. Several meteorological variables are available for free through DMI's open data platform (DMI, 2020). The nearest DMI weather station is at Årslev approximately 10 km east of the case area (outside the area shown in Fig. 1). The historical data include time series (10 min resolution) of temperature, relative humidity, wind speed, amount of time with direct sunlight, and solar radiation. The annual rainfall depth has varied between 530 and 820 mm (Fig. 2) over the past 10 years, with the predominant westerly wind direction bringing many large frontal systems over the catchment, especially during autumn and winter. Winters tend to be mild with a relatively short snow season, while spring is slightly drier than the rest of the year. The daily average temperature has been as low as −10 °C during winter and up to +20 °C during summer. High-intensity, convective rainfall events mostly occur during the summer months. The maximum recorded intensity in the 10-year data set from Årslev is 11.8 mm (10 min)−1, corresponding to a mean of 1.18 mm min−1 or 19.7 µm s−1 during 10 min. Solar radiation, wind speed and humidity data are also provided from the Årslev station, thereby allowing evapotranspiration to be calculated. The wind speed and relative humidity are in general highest during the autumn and winter period.
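The unit conversion for the maximum recorded intensity can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name `intensity_from_depth` is hypothetical and is not part of the data set.

```python
def intensity_from_depth(depth_mm: float, duration_min: float) -> tuple[float, float]:
    """Return the mean rainfall intensity as (mm per minute, micrometres per second)."""
    mm_per_min = depth_mm / duration_min
    # 1 mm = 1000 µm and 1 min = 60 s
    um_per_s = mm_per_min * 1000.0 / 60.0
    return mm_per_min, um_per_s


# Maximum 10 min rainfall depth at Årslev as stated in the text: 11.8 mm
mm_per_min, um_per_s = intensity_from_depth(11.8, 10.0)
print(f"{mm_per_min:.2f} mm/min, {um_per_s:.1f} µm/s")
```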

Figure 3. Conceptual illustration of the most important hydraulic structures in the urban drainage system. “Area” indicates the impervious area/total area connected to a structure, while qw is the average daily wastewater load connected to each structure. The total average daily wastewater discharge upstream of G71F06R is 5.8 L s−1 (equal to 501 m3 d−1). B: basin; CSO: combined sewer overflow; SP: storage pipe.


Figure 4. Illustration of the most complex hydraulic structures with relevant names referring to nodes in the models (black text; see also Figs. 1 and 3) and sensors (red text; see further details in Sect. 3). The grey text gives names of additional model locations.


2.3 Urban drainage system

The urban drainage system extends from upstream Brændekilde through the town of Bellinge and further downstream, meeting the contribution from Dyrup along Odense River. The sewers were originally laid out as a combined system with domestic wastewater and stormwater flowing in the same pipes. There are several combined sewer overflows to Odense River. Today, most of the system is still combined, but there are also a few newer developments with separate stormwater systems with outlets running out of the catchment area, thereby not significantly affecting the combined system. Figure 3 illustrates conceptually how the main structures connect, and Fig. 4 illustrates important hydraulic details and the placement of sensors in some of these. Further details about the structures and technical drawings along with CCTV of the interesting pipes can be found in the data set (Pedersen et al., 2021a).

Figure 5. Location of the hydraulic structures at Brændekilde, where observations are obtained. Basin B1 and CSO1 are located here. Background maps provided by DADSE (2020).

2.3.1 Brændekilde

Brændekilde had its own local wastewater treatment plant until around 1990, which now serves as an open-air detention basin for combined wastewater; see Fig. 5 (G80F11B, dashed white line). Combined wastewater from Brændekilde sewer branches arrives at the structure G80F66Y, from where it flows to a pumping station (G80F13P), which pumps it further downstream to Bellinge. The pumping station also receives separate wastewater from houses in a larger surrounding rural area indicated by the red lines in Figs. 1b and 5 (here, only the green pipe coming from the north is visible). In high-flow situations water is diverted from G80F66Y to the open-air detention basin (G80F11B) through an internal overflow structure, where it is temporarily stored until it can flow back down to the pumping station. The basin has an overflow weir that discharges to a small stream called Borreby Møllebæk, a tributary to Odense River.

Figure 6. Location of the sensors in Bellinge. The storage pipe (SP) and basin B2 are highlighted (cyan), and the location of CSO2 is indicated. Background maps provided by DADSE (2020).

2.3.2 Bellinge

In Bellinge (Fig. 6), the system previously had problems with frequent flooding and combined sewer overflows. This led to the construction in 2010 of a large underground storage pipe and basin, which reduced the number of overflows to Odense River to approximately five per year. Models are available for both the old system (2009, may be useful when seeking to understand the historical evolution of the system) and the new system (from 2010, may be useful when comparing with observation data) and can be found in the data set (Pedersen et al., 2021a). The storage capacity in the storage pipe and basin is activated during medium to large rain events via three internal overflow structures, G71F05R, G71F04R and G71F06R, located in the old combined sewer main in Bellinge (Fig. 3). The storage pipe and basin are emptied by pumping to the sewer main in G71F06R, according to a coordinating rule-based control scheme that depends on the current water level in the structures G71F05R, G71F06R and G71F68Y as well as on the current inlet flow rate at the downstream WRRF at Ejby Mølle.

Table 1. Structures with sensors installed; see Figs. 1 and 3 for locations and Fig. 4 for details about the sensor installations. Corresponding model names are given. In all structures except G73F010 and G72F040 sensors are permanently installed.


3 Observation data

3.1 In-sewer sensors

Throughout the catchment, several sensors provide data about the state of the system; see Table 1. Most sensors are level meters, but other information is also collected. Figure 4 illustrates the location of the sensors placed in the most important or complicated structures (G80F66Y, G71F05R, G71F04R, G71F06R and G71F68Y).

The sensors in Brændekilde are in the basin (G80F11B), in the internal overflow structure (G80F66Y) and in the pumping station (G80F13P); see Figs. 4 and 5. The pumping station is fairly simple with a large manhole serving as a sump and has sensors measuring the water level and power consumption.

In Bellinge, the storage pipe with a diameter of 2.2 m (2200 mm) receives overflow from the old system via three internal overflow weirs: G71F05R, G71F04R and G71F06R. The upstream overflow structure, G71F05R, has a small flushing chamber with a storage volume and a gate (a position throttle), which closes at the beginning of a rain event in order to store water for flushing the storage pipe at the end of an event. This is implemented in the models; in reality, however, the gate has an opening and closing time of 12 s, which the models are not able to replicate. When water overflows from the old system to this flushing storage volume, the tank quickly fills and water overflows across a weir, filling the storage pipe and basin. When the storage pipe has been in use and emptied again after a rain event, the subsequent flushing of the storage pipe creates a small distinct peak in the water level data at the downstream basin (G71F68Y). A full chamber at G71F05R allows the flushing of the storage pipe up to three times. At G71F04R water can overflow directly from the old system into the storage pipe. Two level sensors are located at the up- and downstream ends of the weir.

The third internal weir is at G71F06R, where water overflows to the basin in G71F68Y downstream of the storage pipe. A level sensor directly next to the weir allows monitoring of this internal overflow. After a rain event water is pumped back to G71F06R from the G71F68Y basin. The pumping outlet is a few metres downstream of the level sensor to reduce pumping artefacts in level measurements and potential backwater effects that could lead to overflow at the G71F06R weir. In G71F68Y the water from the storage pipe is led to a deep pump sump. If the basin is full (2800 m3), combined wastewater can overflow to Odense River through a rotating screen and a bendable weir. There is one level sensor in the pump sump along with a flow meter in the pressure pipe from the pumps. The pumps in G71F68Y are dry connected pumps fitted with a frequency converter and controlled by the level in both the pump sump and G71F06R. The pumps run alternately with a maximum design capacity of 80 L s−1.

Figure 7. Time periods with in-sewer sensor data (red bars) and rainfall observation data (blue bars) provided. White bars indicate periods with erroneous sensor data (downtimes), and grey bars indicate periods without data. Power for the pumping stations and the weather radars has not been checked for erroneous data. G73F010 and G72F040 are level sensors.


A 1 min temporal resolution is applied to all data from permanent sensors routinely gathered in VCS's supervisory control and data acquisition (SCADA) system, but the resolution is 2 min for the mobile sensors installed in G73F010 and G72F040 due to concerns about battery life.

3.2 Rainfall data

Two rain gauges with more than 10 years of observations available are located in and just outside the catchment (“5425 Brændekilde” and “5427 Dalum”); see Fig. 1a. These gauges are a part of a national network of utility-owned rain gauges, which undergo quality control by the Danish Meteorological Institute (DMI) (Jørgensen et al., 1998). They are tipping-bucket gauges and measure the number of tips every minute, which is then converted to rainfall intensity. VCS also temporarily installed a tipping-bucket rain gauge at the centre of the catchment for a 1-year period; see “Aabakken” in Fig. 1a. VCS has also operated a local X-band weather radar since 2012, and time series from this radar (1 min resolution) are provided for the area of interest in cell sizes of approx. 925 m × 925 m; see Fig. 1a. The radar is dynamically adjusted using approx. 10 rain gauges located in the area covered by the radar according to the method described in Borup et al. (2016). As a supplement, VCS has since 2017 also adjusted the signal from DMI's C-band radar (5 min resolution) located in Virring approx. 80 km away with the same methods in order to cover parts of VCS's service area which are not entirely covered by the X-band radar. However, we note that the distance of 80 km to the C-band radar is close to the limit for which a radar of this kind can deliver reasonably accurate quantitative rainfall estimates (Thorndahl et al., 2014).
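As an illustration of how tip counts from a tipping-bucket gauge translate into rainfall intensity, the following minimal Python sketch converts tips per minute into µm s−1. The rainfall depth per tip is not stated in the text, so the default of 0.2 mm per tip is purely an assumption, and the function name `tips_to_intensity` is hypothetical.

```python
def tips_to_intensity(tips_per_minute, mm_per_tip=0.2):
    """Convert a list of tip counts per minute to rainfall intensity in µm/s.

    mm_per_tip is an ASSUMED bucket resolution; check the gauge metadata
    before using a value in practice.
    """
    # depth per minute (mm) -> µm per minute -> µm per second
    return [n * mm_per_tip * 1000.0 / 60.0 for n in tips_per_minute]


# Hypothetical minute-by-minute tip counts
print(tips_to_intensity([0, 1, 3, 0]))
```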

3.3 Availability of rainfall and in-sewer observation data

Figure 7 illustrates the temporal availability of the data provided, including both the in-sewer sensor data (cf. Table 1) and the rainfall data products. Downtimes are generally short for each sensor, with the exception of one of the level sensors at G71F04R. During 2019, VCS installed more permanent sensors throughout the urban drainage system, including those in Brændekilde (G80F11B and G80F66Y).

Figure 8. Anomaly in level data and possible causes. The minimum daily water level in G71F06R (upper left corner) along with a CCTV image (lower left corner) of G71F06R from June 2013 and a photo of the same location from 2019 (right) also showing the location of the level sensor (radar). Videos and additional photos can be found in the data set (Pedersen et al., 2021a). A large anomaly in the data occurs on 11 May 2015, which is possibly due to retrofitting of the banquettes in the structure on this date. Pictures by VCS Denmark (2020).

All observation data are provided in the data set (Pedersen et al., 2021a) and in most cases are available as uninterrupted data series since 2010 (updated in August 2021, i.e. more than 10 years). VCS mainly uses sensor data for day-to-day operation and planning of maintenance work but also for developing a digital-twin strategy (Pedersen et al., 2021b). In cases where a sensor has a limited effect on the system control, longer periods with missing data may occur as those data are not a high priority (e.g. one of the duplicate sensors at G71F04R). Exact documentation of sensor maintenance has not been a high priority over all the years, and it is therefore presently not possible to give an overview of when and where sensors have been repaired, been replaced or received some sort of maintenance. Data are generally good, but as for all large real-world data sets, there will be errors and anomalies. One example is a gradual increase in the daily minimum water level at G71F06R from 2010 to 2015, followed by a sudden drop in May 2015 and a sudden increase again in December; see Fig. 8. Comparing photos taken in 2019 and CCTV recordings from 2013 revealed that the banquettes of the structure had been retrofitted during this period. Further investigation of this by interviewing operational staff at VCS revealed that, indeed, one person remembered that the banquettes were retrofitted with new material in approximately 2015. This illustrates that data can be hard to understand but that there might be a logical explanation behind large anomalies, in this case potentially an effect of gradual deterioration until May 2015 followed by more stationary conditions at lower water levels. Direct use of the data during low-flow situations therefore has to be done with care up to this date. The behaviour in December 2015 until March 2016 could be a consequence of a very rainy season in the general Odense area during these months, which caused a lot of infiltration.

The sudden drop in daily minimum water levels in 2015 is a large anomaly that is easily spotted in the data. Although the data have been analysed by the authors of this study, there might still be small artefacts in the data due to minor undocumented physical changes to the system. Small changes to the control settings of e.g. pumps have also not been thoroughly documented and can thus also appear as artefacts. We acknowledge that this is not optimal. The utility company is in the process of changing the way metadata are logged. By exposing the procedures in a typical Danish utility company we can hopefully start a discussion of how to ensure best practice.

Figure 9. The link between an observed depth, converted to level by adding a 0 point (red text), and the modelled depth and level from a model (blue text).


3.4 Data cleaning

The observations from the in-sewer sensors are provided both as raw data and as a cleaned version where erroneous data points have been removed. The cleaned data were converted to UTC+0 time; see Python scripts in the data set (Pedersen et al., 2021a) for details. The depth recorded by the level sensors needs to be comparable with the model results and was therefore converted to level by adding the invert level or the 0 point for the sensor to the measured depth; see Fig. 9.
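The depth-to-level conversion described above amounts to adding a constant offset to each valid observation. The Python sketch below illustrates the idea only; the function name `depth_to_level` and the numerical values are hypothetical and are not taken from the scripts in the data set.

```python
import math


def depth_to_level(depths_m, zero_point_m):
    """Add the sensor 0 point (invert level, in m) to each measured depth (in m).

    Missing observations (NaN) are passed through unchanged.
    """
    return [math.nan if math.isnan(d) else d + zero_point_m for d in depths_m]


# Hypothetical example: a sensor with its 0 point at 20.0 m
levels = depth_to_level([0.0, 0.5, math.nan], zero_point_m=20.0)
print(levels)
```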

VCS changed the supplier of its SCADA system during 2020, going from System 2000 (Frontmatec, 2021) to iFIX (GE Digital, 2021), which gives a different output format. A third format comes from the interim sensors, which are supplied by a company called Danova (website last accessed: 13 October 2021). For this release, it was, for practical reasons, decided to use an initial set of common, simple data cleaning techniques and leave more comprehensive data validation as a research opportunity for ourselves or others in the future (e.g. Leigh et al., 2019). The cleaning comprised five techniques for replacing clearly erroneous observation data with not-a-number (NaN) values:

  • Manufacturer quality stamp. These data were stamped with “low quality” in the iFIX SCADA system.

  • Manual remove. These are data that for some reason were deemed untrustworthy, for instance observation values during maintenance or start-up periods.

  • Out of bounds. These are data outside a defined physically meaningful range of possible values (e.g. bottom and top levels of a pipe/basin).

  • Frozen sensor. These data do not change during a time period of e.g. 20 min.

  • Outlier. These are data with spikes with a manually chosen height and duration; in our case this category is only applicable to interim Danova sensor data, which occasionally showed spike patterns which are probably not correct.
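Two of the techniques listed above, "out of bounds" and "frozen sensor", can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the cleaning script from the data set: the function names are hypothetical, the bounds are made up, and the frozen-sensor window is expressed as a number of consecutive 1 min samples.

```python
import math

NAN = math.nan


def flag_out_of_bounds(values, lower, upper):
    """Replace values outside the physically meaningful range [lower, upper] with NaN."""
    return [v if lower <= v <= upper else NAN for v in values]


def flag_frozen(values, window=20):
    """Replace runs of at least `window` identical consecutive values with NaN."""
    cleaned = list(values)
    n = len(values)
    start = 0
    while start < n:
        end = start
        # Extend the run while the next value equals the first value of the run
        while end + 1 < n and values[end + 1] == values[start]:
            end += 1
        if end - start + 1 >= window:
            for i in range(start, end + 1):
                cleaned[i] = NAN
        start = end + 1
    return cleaned


# Hypothetical water levels (m) in a structure with bottom 20.0 m and top 22.2 m
raw = [20.1, 20.1, 20.1, 25.0, 20.4]
print(flag_out_of_bounds(raw, lower=20.0, upper=22.2))
print(flag_frozen(raw, window=3))
```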

The iFIX system is set up to reduce data storage requirements at the sensor by leaving out observations whose change in the coming time step is below a given threshold (for water level sensors most often set to 1 cm). Such gaps should therefore not be seen as periods of failure or signal loss, and the cleaning script applies forward filling to the values that have these properties.
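Forward filling simply carries the last recorded value forward until a new value is logged. A minimal Python sketch is given below; the function name `forward_fill` is hypothetical, and the sketch fills every NaN it encounters, whereas in practice only the gaps caused by the logging threshold should be treated this way.

```python
import math


def forward_fill(values):
    """Carry the last recorded value forward over NaN entries.

    Leading NaN values, which have no previous observation, stay NaN.
    """
    filled = []
    last = math.nan
    for v in values:
        if not math.isnan(v):
            last = v
        filled.append(last)
    return filled


# Hypothetical level series (m) where unchanged values were not logged
print(forward_fill([20.10, math.nan, math.nan, 20.12]))
```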

Table 2. Overview of the five error types in data from the different sensors. Values are 1 min values.


Table 2 shows the number of data points flagged by each error function, which is also illustrated in Fig. 7. Simple gap filling based on linear interpolation was performed for gaps shorter than 5 min, as an increased gap-filling period would increase the risk of interpolating a potential peak. Python scripts for the data cleaning and gap filling are given in the data set (Pedersen et al., 2021a).
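The gap filling can be illustrated with a short Python sketch that linearly interpolates only short interior gaps. It is an assumption-laden example rather than the script from the data set: the function name `fill_short_gaps` is hypothetical, and "gaps shorter than 5 min" is interpreted here as at most four consecutive missing 1 min values.

```python
import math


def fill_short_gaps(values, max_gap=4):
    """Linearly interpolate NaN runs of at most `max_gap` steps.

    Only gaps bounded by valid values on both sides are filled; longer gaps
    and gaps at the start or end of the series are left as NaN.
    """
    filled = list(values)
    n = len(values)
    i = 0
    while i < n:
        if math.isnan(values[i]):
            start = i
            while i < n and math.isnan(values[i]):
                i += 1
            gap = i - start
            if start > 0 and i < n and gap <= max_gap:
                left, right = values[start - 1], values[i]
                for k in range(gap):
                    filled[start + k] = left + (right - left) * (k + 1) / (gap + 1)
        else:
            i += 1
    return filled


# A two-step gap is filled; a five-step gap is left untouched
print(fill_short_gaps([1.0, math.nan, math.nan, 4.0]))
print(fill_short_gaps([1.0] + [math.nan] * 5 + [2.0]))
```

Keeping the maximum gap short limits the risk of drawing a straight line through a period that may have contained a peak, which is the concern raised in the text.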

4 Simulation models

Physically based, distributed urban drainage models, constructed in software packages such as MU (DHI, 2020) and SWMM (EPA, 2020), which are used here, are the most detailed type of urban drainage system model available and contain two main components: a surface module that calculates rainfall runoff from each sub-catchment to the pipe system and a hydrodynamic model that calculates the flow in the pipe system. The hydrodynamic model solves the Saint-Venant equations across the pipe network and represents head loss in manholes and flow in overflow weirs and other hydraulic structures using standard hydraulic equations. The surface modules are in principle lumped–conceptual, but the sub-catchments are distributed in space according to the overall layout of the pipe network. The detailed mathematical formulations and numerical schemes used are different in MU and SWMM, and model users may choose between several options, especially for describing the lumped–conceptual surface runoff model components. Rainfall data and wastewater loads are the main model forcings, but infiltration inflow and pumped flows, for example, can be used as additional forcings. The main model attributes are surface areas, imperviousness and the hydrological response time of sub-catchments, and the main asset data are features of the pipe network (diameter, length, roughness) and hydraulic structures (basin volumes, weir data, etc.). Several model forcings and attributes may be determined from independent external data sources or be considered parameters that can be calibrated based on observation data from the system. VCS uses MU in its daily modelling and model updating work; however, MU is not easily accessed by potential users of the data set because of its proprietary nature. We therefore also provide a model using SWMM (created to mimic the behaviour of the MU model), which is open source and thus readily available for use by the international research community.

Table 3. Determination of model forcings and attributes in the implemented MU model from independent data sources.

4.1 MU models – building parts

The distributed urban drainage model is made by VCS in the MU software system (DHI, 2020) and is part of an operation model that is run and compared with observations on a routine basis as part of a digital-twin environment currently under development (Pedersen et al., 2021b). The hydrodynamic model consists of around 1000 nodes and 51 km of pipes (40 km of combined sewer pipes, 7 km of separate stormwater pipes and 4 km of wastewater pipes; see Fig. 1b), and the surface module consists of 713 individual sub-catchments with sizes of up to 10 ha with a median size of 0.3 ha. The downstream model boundary (model outlet) is chosen so that there are no backwater effects at any of the sensor locations in Bellinge from the downstream parts of the city of Odense. Dyrup is only in the model to ensure that the effect from this part is considered in the main pipe downstream of G71F06R. The emptying of the basin in Bellinge, G71F68Y, is in reality controlled by local regulatory and coordinating rules as well as global system-wide rules according to all the basins in the entire network of Odense and Ejby Mølle WRRF. The local rules are specified in the models, but overriding signals from the global system control are not considered here. The model can therefore not empty the basin in Bellinge realistically for some periods depending on the filling degree of other basins close to the WRRF. As the MU model software cannot handle frequency-converting pumps, the exact modelling of the pumping curves can furthermore be challenging. The screen and bendable weir in G71F68Y are described in the model as a regulated pipe with a specific Q–H curve corresponding to the detailed characteristics of these elements.

Calibration of urban drainage models has been subject to a great deal of research internationally for more than a decade (e.g. Bach et al., 2014; Broekhuizen et al., 2020; Nagel et al., 2020; Tscheikner-Gratl et al., 2016; Vezzaro et al., 2013). However, VCS has a philosophy of transparency in models, where understanding the system behaviour is considered more important than ensuring perfect calibration with non-transparent parameter sets, meaning that VCS prefers not to tune conceptual parameters to unrealistic values in order to fit models to observations. Therefore, the various parameters of the model were not calibrated, and standardised parameter sets were used when possible. In the past, however, the model has frequently been validated against observations from the system, and the causes of a poor fit have been investigated and corrected. This could for instance be an error in the registration of the level of an overflow weir or an impervious area connected differently to the system than anticipated. When discovering such errors, the system data were corrected in the asset database, and the model was updated according to the asset database.

The forcings and physical attributes of the system implemented in the provided MU model are outlined in Table 3. VCS has experienced that the current model for imperviousness overestimates the rainfall runoff from some of the impervious areas in the outlying communities. These are probably not often connected to the urban drainage network, and instead stormwater is infiltrated in trenches. The imperviousness of these areas is, however, not changed until fieldwork shows which areas should not contribute. An internal report from VCS assessed, based on analysing data from pumping stations, that approximately 30 % of the hydraulic load to Ejby Mølle WRRF is infiltration inflow. Half of this is expected to be caused by infiltration due to cracks in pipes and manholes and the other half due to agricultural drainage pipes, which historically have been connected to the urban drainage system. Several attempts have been made in VCS to model the infiltration inflow, for example with machine-learning techniques applied to observations near the treatment plant. These can be used for estimation of the inflow to the treatment plant but seldom match reality when scaled to upstream catchments. Therefore, infiltration inflow was not included in the MU model provided here, but we encourage potential users of the data set to investigate this further.

4.2 SWMM – conversion from MU

A model using SWMM (EPA, 2020) was constructed based on the utility company's MU model and was validated against the MU model results. Minor modifications were applied to the model using SWMM to produce results that are as similar to MU as possible. Model structure differences between MU and SWMM are highlighted in the following.

Surface runoff. The MU model uses a time–area runoff model (DHI, 2020), called “runoff model A”, which calculates the rainfall runoff from each sub-catchment based on its area, imperviousness and time of concentration (Table 3) and initial loss (set to a default value). SWMM has another way of estimating surface runoff with some similar attributes (area, imperviousness, initial loss) but with attributes other than the time of concentration describing the runoff routing (width, slope and Manning coefficients) for both impervious and pervious areas (Rossman and Huber, 2015). In Denmark, there has not been a tradition of including runoff from pervious surfaces in urban drainage models, except very recently as the occurrence of large cloudburst events has increased. In order to make the two models as similar as possible, the parameters for pervious surfaces in SWMM's infiltration module were thus set to unrealistically high values so that rainfall on such surfaces readily infiltrates into the ground instead of producing runoff to the urban drainage system. On impervious surfaces, the parameters were set to produce runoff similar to the runoff simulated with the MU model.
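The time–area concept described above can be illustrated with a minimal sketch. It assumes a linear time–area curve (each time step within the time of concentration contributes an equal share of the area) and a simple initial loss that is subtracted from the start of the event; the function name, arguments and example values are hypothetical and are not taken from the MU documentation or the data set.

```python
import numpy as np

def time_area_runoff(rain_mm_per_step, area_ha, imperviousness,
                     tc_steps, initial_loss_mm=0.0):
    """Runoff volume (m3) per time step from one sub-catchment.

    Sketch of a time-area model with a linear time-area curve.
    All parameter names are illustrative assumptions.
    """
    rain = np.asarray(rain_mm_per_step, dtype=float)
    # Initial loss: the first millimetres of rain produce no runoff.
    cumulative = np.cumsum(rain)
    effective_cum = np.maximum(cumulative - initial_loss_mm, 0.0)
    effective = np.diff(effective_cum, prepend=0.0)
    # Linear time-area curve: equal area share per step up to tc_steps.
    kernel = np.full(tc_steps, 1.0 / tc_steps)
    routed_mm = np.convolve(effective, kernel)
    # 1 mm of rain on 1 ha corresponds to 10 m3 of water.
    return routed_mm * area_ha * imperviousness * 10.0

# Hypothetical example: 1 ha sub-catchment, 50 % impervious,
# time of concentration of two time steps, 0.6 mm initial loss.
runoff_m3 = time_area_runoff([1.0, 2.0, 0.0, 0.0], area_ha=1.0,
                             imperviousness=0.5, tc_steps=2,
                             initial_loss_mm=0.6)
```

Only the impervious fraction contributes in this sketch, which mirrors the choice described above of letting rainfall on pervious surfaces infiltrate rather than run off.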

Network. The hydraulic calculations of the pipe network are quite similar in both models, solving the full St Venant equations. However, the calculation of head loss in manholes is different, as MU takes their volume into account, whereas SWMM neglects it. In order to obtain more similar models, the largest manholes were represented as “storage units” with designated volumes in SWMM. In MU there is a default setting where the length of pipes shorter than 10 m is adjusted to 10 m for computational stability. This is not the case in SWMM. MU also generates tiny amounts of water in empty pipes to avoid zero values (again for numerical stability). This is an issue in the large storage pipe, which is empty most of the time, and it was circumvented for this study by inserting a fictitious outlet in the MU model in the normally dry storage pipe, allowing water to disappear when the water level in the storage basin is low. SWMM does not generate water in dry pipes, and therefore a fictitious outlet was not inserted. In Denmark, a special non-circular shape of pipe has historically been used; MU includes this as a standard cross-section type, while SWMM does not, and a custom cross section was therefore defined for the relevant pipes in the model using SWMM; see the data set (Pedersen et al., 2021a) for further information. A more thorough comparison of the two simulation software products can be found in Borah (2011) where different model formulations used in typical software products, including both SWMM and MU, are described.

5 Model and data comparisons

In the following section we illustrate the nature of some of the data and the behaviour of the system as well as the performance of the two models. The models' response and the observations can be compared at the sensor locations (see Table 1) and as stated in the data set (Pedersen et al., 2021a). The analyses in this section will focus on two types of comparison for three selected rain events: a comparison of the two simulation models discussing structural model uncertainty potentially leading to significant differences and a comparison of selected model results with in-sewer observation data discussing how well the models can represent the dynamics in the urban drainage system given the uncertainty in the inputs.

Figure 10. Intensity–duration–frequency (IDF) characteristics for the three selected rain events (2012, 2015 and 2018) measured at two different rain gauges (gauges 5425 and 5427), plotted on top of national regional IDF curves for Odense (Gregersen et al., 2014). T indicates return periods, corresponding to the reciprocal of the frequency.


Table 4. Chosen events and their characteristics.

5.1 Selection of events

Data for three rain events of short duration and high intensity were selected from the 10 years of observations to illustrate the dynamics of the system. These three events (from June 2012, August 2015 and August 2018) were selected because the two permanent rain gauges (5425 and 5427), located on opposite sides of the catchment, recorded similarly large rain depths and similar dynamics; X-band radar rainfall data were furthermore available for the third event (Table 4). Similar measurements in the two gauges indicate that there is little spatial variation in rainfall rates, which reduces the risk of poor spatial sampling of the rainfall and makes the comparison of MU and SWMM responses with in-sewer sensor data simpler. Intensity–duration–frequency (IDF) characteristics of the three events are illustrated in Fig. 10, with the national regional IDF curves from Odense as background for comparison (Gregersen et al., 2013, 2014, 2016; Madsen et al., 2017).
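The event-based IDF characteristics referred to above are typically obtained as the maximum mean intensity over a sliding window for each duration. The following is a minimal sketch of that calculation, assuming a rain gauge series of depths per fixed time step; the function name, the mm/h unit and the example values are illustrative assumptions and do not reproduce the procedure behind Fig. 10.

```python
import numpy as np

def max_mean_intensity(depth_mm, step_min, durations_min):
    """Maximum mean rain intensity (mm/h) per duration for one event.

    depth_mm: rain depth per time step (mm); step_min: step length (min).
    Durations shorter than one step or longer than the series are skipped.
    """
    depth = np.asarray(depth_mm, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(depth)))
    result = {}
    for duration in durations_min:
        n = int(round(duration / step_min))
        if n < 1 or n > depth.size:
            continue
        # Rain depth in every window of n consecutive steps.
        window_depth = csum[n:] - csum[:-n]
        result[duration] = window_depth.max() / (duration / 60.0)
    return result

# Hypothetical 1 min series (mm per minute).
idf_points = max_mean_intensity([0, 1, 3, 2, 0, 0], step_min=1,
                                durations_min=[1, 2, 3])
```

Plotting such points for each gauge against the regional IDF curves gives a quick indication of the return period of an event at each duration.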

Figure 11. Computed time series of level and flow using MU (cyan) and SWMM (black) compared with observations (dotted red) for the sensors in and around the storage pipe in Bellinge. Dashed grey horizontal lines indicate weir crest levels. The green time series for event 3 is MU with radar input. Note that the vertical axes are not the same for all the rows of the figure. Water level units are metres above sea level (m a.s.l.), while flow units are m³ s⁻¹.


5.2 MU vs. SWMM

The model results for the three events are shown in Fig. 11 together with water level observations for six of the water level sensors located in structures in Bellinge that are important to the dynamics of the storage pipe and basin (G71F05R, G71F04R, G71F06R and G71F68Y). Generally, the two models show the same tendencies in runoff response to rainfall, but there are small differences in the timing of flow in the most upstream nodes of the system (not shown) due to the different surface runoff models implemented in MU and SWMM. The integrating effect of the overall sewer catchment, however, means that the model results are very similar at the measurement locations further downstream. An exception is G71F06R (Fig. 11f), where the peaks stay higher for longer in the model using SWMM for all three events. A throttle pipe is located immediately upstream from this sensor (diameter 277 mm, Fig. 3), and through inspections of the model it was found that SWMM estimates up to 50 % more flow through the throttle than MU in the peak situations, thereby allowing the peak level in G71F06R to be higher for a longer time. This is due to the conceptual difference in manhole representation between the models, where MU has a volume and a head loss in the manhole, while SWMM does not. It is not known which of the solutions is most accurate.
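Differences such as "peaks stay higher for longer" can be quantified with two simple indicators: the peak level and the time spent above a reference level such as a weir crest. The sketch below assumes two simulated level series on the same regular time step; the function name and all numbers are hypothetical and are not results from the MU or SWMM models.

```python
import numpy as np

def peak_and_exceedance(level_m, step_min, threshold_m):
    """Return (peak level in m, minutes with level above threshold)."""
    level = np.asarray(level_m, dtype=float)
    minutes_above = np.count_nonzero(level > threshold_m) * step_min
    return float(level.max()), float(minutes_above)

# Hypothetical level series (m a.s.l.) on a 5 min time step.
mu_levels = [10.0, 10.5, 11.0, 10.4, 10.0]
swmm_levels = [10.0, 10.6, 11.0, 10.8, 10.1]
crest = 10.5  # assumed reference level, e.g. a weir crest

mu_peak, mu_minutes = peak_and_exceedance(mu_levels, 5, crest)
swmm_peak, swmm_minutes = peak_and_exceedance(swmm_levels, 5, crest)
```

In this made-up example both series reach the same peak, but the second one stays above the reference level three times as long, which is the kind of structural difference described above for G71F06R.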

5.3 Model results vs. observations

The observations in the G71F05R level inlet (Fig. 11a) have a higher base level than the model results. Water depths lower than 25 cm above the invert level were not recorded here during the 10-year measurement period. The pipe has a very low gradient, and therefore the elevated water level may be due to some downstream partial blockage or dislocation of a pipe. The level observations in the flushing chamber (G71F05R level basin, Fig. 11b) show a slow decrease after the chamber has been filled up, especially for events 2 and 3. Neither model shows this decrease, which seems to indicate a small leak during some events from the gate that holds water back in the chamber. The model results in the G71F04R level inlet (Fig. 11c) are mostly similar, except that MU maintains a higher water level for slightly longer than SWMM does. In the G71F68Y basin the filling and emptying dynamics are generally reproduced very well (Fig. 11d). When the basin has been emptied after a rain event, a small peak occurs as the gate in G71F05R is opened and the volume pipe is flushed. The water level for event 1 is slightly underestimated by both models, while it is overestimated for event 3 (Fig. 11d). The fact that both models are biased for these events suggests that the rainfall input might also be biased. Radar observations from the X-band radar were therefore used as input to the MU model for event 3. This led to a slight underestimation of the modelled peak water levels (radar) in most locations (Fig. 11a–d, event 3). Despite the availability of two rain gauges on opposite sides of the catchment and a nearby radar, there is still considerable uncertainty in the rainfall input. The pump rate G71F68Yp1 is well simulated for event 2 but overestimated for event 3 (Fig. 11e). Global control settings might also be responsible for the fact that the pumping starts earlier in both of the models than in the observations (Fig. 11e).

Both the model results and the observations show that the emptying of the G71F68Y basin downstream of G71F06R starts after the overflow in G71F06R has ended (Fig. 11f). Event 2 contains larger increases in the water level at G71F06R than event 1 (Fig. 11f). The pumping effects on the downstream water levels are thus not always similar despite nearly identical pumping flows. MU and SWMM also disagree on the size of this effect despite identical pipe geometries and pumping characteristics. The sudden 2015 drop in the dry-weather water level at G71F06R in event 2 that is presented in Fig. 5 is also seen between the first event and the two other events (Fig. 11f).

Several of the sensors are not placed directly above the point in a manhole/pipe/storage tank with the lowest invert elevation (Fig. 9). In these locations, it is not possible to measure water levels below the 0 level of the sensor, leading to an offset in the measured values that has to be accounted for. For most of the sensor locations, the offset is very small, on the order of a few centimetres. It is, however, visible for both the G71F05R level inlet (25 cm, Fig. 11a) and the G71F68Y level (23 cm, Fig. 11d), where the sensor data cannot reach the lowest values of the simulations. As also shown in Fig. 11f there is a gap in the base level in G71F06R for the first event, which is due to the gradual increase in the minimum daily water level as shown in Fig. 8.
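One possible way to account for the sensor offset when comparing simulations with observations is to raise simulated levels below the sensor's 0 level to that level before computing an error measure, so that the models are not penalised for water levels the sensor cannot see. The sketch below illustrates this under that assumption; it is not the procedure used in this paper, and the function name and values are hypothetical.

```python
import numpy as np

def rmse_above_sensor_zero(observed_m, simulated_m, sensor_zero_m):
    """RMSE between observed and simulated levels, after limiting the
    simulated levels to the lowest level the sensor can record.
    Missing observations (NaN) are ignored."""
    obs = np.asarray(observed_m, dtype=float)
    sim = np.asarray(simulated_m, dtype=float)
    # Simulated levels below the sensor's 0 level would be recorded
    # as the 0 level, so clip them before the comparison.
    sim_visible = np.maximum(sim, sensor_zero_m)
    valid = ~np.isnan(obs)
    return float(np.sqrt(np.mean((obs[valid] - sim_visible[valid]) ** 2)))

# Hypothetical levels (m a.s.l.); the sensor cannot record below 10.25.
observed = [10.25, 10.25, 10.60, np.nan]
simulated = [10.00, 10.10, 10.60, 10.30]
error = rmse_above_sensor_zero(observed, simulated, sensor_zero_m=10.25)
```

Without the clipping step, the dry-weather part of the series would dominate the error even when the event dynamics agree well.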

Figure 12. MU model results (blue) and observations (dotted red) at the internal overflow structure G71F66Y in Brændekilde. The dashed grey horizontal line indicates the weir crest level. The rain event on 16 November 2019 at 15:39 UTC (second peak) consisted of 6.2 mm of rain in 426 min, while the other peak periods had a lower volume of rain.


An acceptable agreement between measured and simulated values is generally shown for the sensor locations and the three selected events. More sensors were however installed at two locations in Brændekilde in autumn 2019; see Fig. 7. Observations and MU simulation results for a single event are shown for G71F66Y in Fig. 12. Here the simulated level is higher than the observations, which suggests that the models overestimate surface runoff in the outlying areas and highlights that the models, although mostly physically based, have not been calibrated against observations. This illustrates some of the uncertainty in model attributes, which may be addressed in further research using the provided data and models.

6 Potential use of the data set

We envision that the data set presented in this paper can be used for a large span of research areas and problems such as (i) automated error detection and gap filling in data series; (ii) multi-site comprehensive data validation using e.g. machine learning and artificial intelligence; (iii) development of surrogate models of hydrodynamic models as well as entirely data-driven models; (iv) development of better model components for physically based hydrodynamic models; (v) development of better lumped conceptual model parts describing rainfall runoff from pervious areas and infiltration inflow in pipe systems; (vi) use of satellite data to improve the surface catchment characterisation and potentially discover flooded areas and nodes; and (vii) data assimilation for real-time modelling, forecasting for warning and control, etc. Furthermore, supplementary data such as CCTV not discussed in detail here may be used to check assumptions about pipes, manholes and hydraulic structures in future work with the provided data and models.

The data set is currently being used by the authors to identify sources of model uncertainty (Pedersen et al., 2021c), for anomaly detection of observations using machine-learning techniques (Palmitessa et al., 2021b), and for teaching activities in urban drainage at DTU Environment. With increased focus on digitalisation, the data set can also be used to initiate discussions on data acquisition and transfer needs (Eggimann et al., 2017) in order to gain insight into urban drainage systems that are gradually becoming more complex (Blumensaat et al., 2019) and to initiate discussion about which metadata and logs should be stored to ensure available information for future use. The water sector is furthermore known for inadvertently “hiding” data in silos hosted both within utilities (e.g. in different departmental systems) and by different external contractors, which makes integrated analysis tedious and resource demanding (e.g. Lund et al., 2021). We thus also encourage discussion on how the various information sources provided here may work together as required in future digital-twin environments (Pedersen et al., 2021b).

We hope that the data set can also be an inspiration for how to manage data and models for utility companies across the globe. The utility company VCS in particular hopes to benefit from future research exploring the here-presented data set, motivated by the availability of state-of-the-art models and the uniqueness of the long observation period, and to inspire discussions with other utilities that share common goals of improving performance through high-quality data acquisition and modelling.

Figure 13. Overview of the items in the data repository folders and their subfolders, with an indication of the ownership of the specific data and how these data sources are normally accessed. The sensor and rain data in Fig. 7 can be found in the folders “#2 Sensor data”, “#3a Rain gauge data” and “#3b Radar data”.


7 Data availability

The data are available from DTU Data at (Pedersen et al., 2021a) and consist of the following items (cf. Fig. 13).

Asset database – urban drainage system. This is the asset database of the Bellinge system, with information on how it is registered in terms of manholes and links. The links from a database extraction from 2007 are also provided to illustrate the system before the storage pipe and basin were built.

Sensor data – urban drainage system. Sensor data measured in the system are located here both as the raw data extracted from the SCADA system and as cleaned data, where data are checked for simple errors.

Rain gauge data. Data from the permanently installed rain gauges 5425 and 5427 along with the temporary rain gauge at Aabakken are in this folder.

Radar data. This item contains the radar data from both X-band and C-band radar. Besides that, meteorological data from the weather station in Årslev are located here.

Drawings. As-built drawings of the structure are located here together with photos taken in 2019.

CCTV – urban drainage system. CCTV videos of selected stretches are saved in this item to indicate how the system looks within the pipes.

Orthophotos, digital terrain model, etc. Data from DADSE are in this folder, where digital terrain models, orthophotos and maps of the areas can be found.

Models. The models of the current system using MU and SWMM are in this item. Besides that, a MU model of Bellinge in 2009 has been made in case some readers seek information about the system prior to 2009 when the basin and the volume pipe were built.

Catchment description. This item gives the catchment description, indicating the different classes that the imperviousness percentage is based upon.

Scripts. To ensure a certain data quality, data can be prepared with the scripts from this folder.

The data set is split into nine items as there is no need to download all for a very specific use. Figure 13 gives an overview of the data repository with clear identification of both the ownership of the data and how the data would normally be accessed prior to publishing this data article and repository. The provided data come with a Creative Commons License CC BY 4.0, except for some of the rain data from DMI, which come with a CC BY-NC 4.0 License (commercial use not permitted).
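As an illustration of what checking sensor data for "simple errors" can involve, the sketch below flags missing values, values outside a plausible range and values that stay exactly constant for a long run (a frozen signal). It is an assumption-based example and does not reproduce the scripts in the repository; the function name, thresholds and sample values are hypothetical.

```python
import numpy as np

def flag_simple_errors(values, lower, upper, max_repeats):
    """Boolean mask marking suspicious samples in a sensor series.

    A sample is flagged if it is missing (NaN), outside [lower, upper],
    or part of a run of more than max_repeats identical values.
    """
    v = np.asarray(values, dtype=float)
    missing = np.isnan(v)
    out_of_range = (v < lower) | (v > upper)
    frozen = np.zeros(v.size, dtype=bool)
    run_start = 0
    for i in range(1, v.size + 1):
        # A run ends at the end of the series or when the value changes.
        if i == v.size or v[i] != v[run_start]:
            if i - run_start > max_repeats:
                frozen[run_start:i] = True
            run_start = i
    return missing | out_of_range | frozen

# Hypothetical water depths (m) with a spike, a frozen period and a gap.
flags = flag_simple_errors([1.0, 1.2, 9.9, 1.1, 1.1, 1.1, 1.1, np.nan],
                           lower=0.0, upper=5.0, max_repeats=3)
```

Keeping the flags as a separate mask, rather than overwriting the raw values, preserves the distinction between raw and cleaned data that the repository also maintains.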

8 Conclusions

Open-access data and models are currently largely lacking within the urban hydrology and urban drainage research community. This comprehensive release of data from a real urban drainage system serving a 1.7 km² area in the town of Bellinge near Odense, Denmark, includes up to 10 years (2010–2020) of rainfall data from rain gauges, meteorological data from a nearby weather station, and level and flow data from in-sewer sensors. In addition, 8 and 3 years of data from X-band and C-band weather radars are provided, as well as two near-identical hydrodynamic distributed urban drainage models constructed in the software tools MIKE URBAN and EPA SWMM. This case is well suited for research within a broad range of topics such as data quality control and optimal sensor maintenance, automatic error and anomaly detection, model calibration, uncertainty analysis, development of surrogate models, data-driven modelling, forecasting, real-time control, or digital twins. We hope the community will adopt the Bellinge case as a benchmark that will enable independent testing and replication of results from future scientific developments. The case should also be highly relevant for teaching purposes.

This comprehensive data set provides a unique opportunity to explore several aspects of urban drainage systems and to publish research that can be replicated by others. The two urban drainage models respond almost identically to rainfall forcing; however, they are not calibrated to the observations, which is most evident for sensors located upstream. The models have the local regulatory and coordinating control incorporated but lack the system-wide control rules that depend on the downstream WRRF, and therefore not all events can be equally well simulated. The provided rain data give an indication of the spatial variability in rainfall, especially for the more extreme events. The extensive observations obtained in the area show that it is not always easy to operate sensors but also that there is a great potential in using data to a much greater extent than previously. With the provided data set, all researchers have access to the same models and data, which can enable a boost in research and innovation in the future.

Author contributions

All authors jointly contributed to conceptualising and designing the study, discussing results, and drafting or revising the manuscript. ABK and ANP designed the observation programme. ANP collected and curated the observation data, and ANP and JWP prepared the software for data cleaning. ANP prepared the MU models, whereas AVR and JWP prepared the model using SWMM. ANP, JWP, AVR and MB prepared the initial draft manuscript, and ANP, JWP, MB and PSM prepared the visualisations. ANP and PSM revised the manuscript and prepared the final submitted paper, which was approved by all authors. ABK, MB and PSM supervised the project.

Competing interests

The authors declare that they have no conflict of interest.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Acknowledgements

The authors are grateful to several organisations for agreeing to freely share data for this case area, especially VCS Denmark, who provided all the asset data, the in-sewer observation data, and some of the rain data and the radar data; the Danish Meteorological Institute (DMI) and The Water Pollution Committee (WPC) of The Danish Society of Engineers, who provided rain gauge and weather-radar data; and the Danish Agency for Data Supply and Efficiency (DADSE), who provided spatial data such as orthophotos and digital terrain models. The rain gauge data can be used freely for teaching and research, with an appropriate indication of the original source, but are not for commercial use according to an agreement with DMI and WPC. The data from DADSE can be freely used with an appropriate indication of the original source. DADSE retain copyright to the data when they are passed forward. Details of how to cite the data are given in the data set (Pedersen et al., 2021a).

Financial support

The work of Agnethe Nedergaard Pedersen was partly funded by the Innovation Fund Denmark (file no. 8118-00018B); the work of Antonio Vigueras-Rodriguez was funded by a research stay programme of the Spanish Ministry of Science, Innovation and Universities (ref. PRX19/00230), and the work of Jonas Wied Pedersen and Morten Borup was partly supported by the European Regional Development Fund through the NOAH project (Interreg Baltic Sea Region Programme grant no. R093).

Review statement

This paper was edited by David Carlson and reviewed by two anonymous referees.


References

Addor, N., Newman, A. J., Mizukami, N., and Clark, M. P.: The CAMELS data set: catchment attributes and meteorology for large-sample studies, Hydrol. Earth Syst. Sci., 21, 5293–5313, 2017.

Bach, P. M., Rauch, W., Mikkelsen, P. S., Mccarthy, D. T., and Deletic, A.: A critical review of integrated urban water modelling – Urban drainage and beyond, Environ. Modell. Softw., 54, 88–107, 2014.

Bennett, N. D., Croke, B. F. W., Guariso, G., Guillaume, J. H. A., Hamilton, S. H., Jakeman, A. J., Marsili-Libelli, S., Newham, L. T. H., Norton, J. P., Perrin, C., Pierce, S. A., Robson, B., Seppelt, R., Voinov, A. A., Fath, B. D., and Andreassian, V.: Characterising performance of environmental models, Environ. Modell. Softw., 40, 1–20, 2013.

Blumensaat, F., Leitão, J. P., Ort, C., Rieckermann, J., Scheidegger, A., Vanrolleghem, P. A., and Villez, K.: How Urban Storm- and Wastewater Management Prepares for Emerging Opportunities and Threats: Digital Transformation, Ubiquitous Sensing, New Data Sources, and beyond – A Horizon Scan, Environ. Sci. Technol., 53, 8488–8498, 2019.

Borah, D. K.: Hydrologic procedures of storm event watershed models: A comprehensive review and comparison, Hydrol. Process., 25, 3472–3489, 2011.

Borup, M., Grum, M., Linde, J. J., and Mikkelsen, P. S.: Dynamic gauge adjustment of high-resolution X-band radar data for convective rain storms: Model-based evaluation against measured combined sewer overflow, J. Hydrol., 539, 687–699, 2016.

Broekhuizen, I., Leonhardt, G., Marsalek, J., and Viklander, M.: Event selection and two-stage approach for calibrating models of green urban drainage systems, Hydrol. Earth Syst. Sci., 24, 869–885, 2020.

Carbajal, J. P., Leitão, J. P., Albert, C., and Rieckermann, J.: Appraisal of data-driven and mechanistic emulators of nonlinear simulators: The case of hydrodynamic urban drainage models, Environ. Modell. Softw., 92, 17–27, 2017.

DADSE – Danish Agency for Data Supply and Efficiency: Danish Map Supply, last access: 23 October 2020.

Deletic, A., Dotto, C. B. S., McCarthy, D. T., Kleidorfer, M., Freni, G., Mannina, G., Uhl, M., Henrichs, M., Fletcher, T. D., Rauch, W., Bertrand-Krajewski, J. L., and Tait, S.: Assessing uncertainties in urban drainage models, Phys. Chem. Earth, 42–44, 3–10, 2012.

DHI: Mike Urban, last access: 17 August 2020.

DMI: DMI Open data, last access: 17 August 2020.

DMI (Danish Meteorological Institute) and IDA (The Danish Society of Engineers): The Water Pollution Committee – Rain gauge System (in Danish: Spildevandskomiteens regnmålerstyregruppe) (last access: 23 October 2020), 2020.

Eggimann, S., Mutzner, L., Wani, O., Schneider, M. Y., Spuhler, D., Moy De Vitry, M., Beutler, P., and Maurer, M.: The Potential of Knowing More: A Review of Data-Driven Urban Water Management, Environ. Sci. Technol., 51, 2538–2553, 2017.

EPA: EPA SWMM, last access: 17 August 2020.

Fencl, M., Grum, M., Borup, M., and Steen Mikkelsen, P.: Robust model for estimating pumping station characteristics and sewer flows from standard pumping station data, Water Sci. Technol., 79, 1739–1745, 2019.

Frontmatec: System2000, last access: 13 October 2021.

GE Digital: iFix, last access: 13 October 2021.

Gregersen, I. B., Madsen, H., Rosbjerg, D., and Arnbjerg-Nielsen, K.: A spatial and nonstationary model for the frequency of extreme rainfall events, Water Resour. Res., 49, 127–136, 2013.

Gregersen, I. B., Madsen, H., Linde, J. J., and Arnbjerg-Nielsen, K.: Opdaterede klimafaktorer og dimensionsgivende regnintensiteter – Spildevandskomiteen, Skrift nr. 30 (last access: 13 January 2020), 2014.

Gregersen, I. B., Rasmussen, S. H., Madsen, H., and Arnbjerg-Nielsen, K.: Regnrække v.4.1 (last access: 13 October 2021), 2016.

Hutton, C., Wagener, T., Freer, J., Han, D., Duffy, C., and Arheimer, B.: Most computational hydrology is not reproducible, so is it really science?, Water Resour. Res., 52, 7548–7555, 2016.

Hutton, G., Haller, L., and Bartram, J.: Global cost-benefit analysis of water supply and sanitation interventions, J. Water Health, 5, 481–501, 2007.

Jørgensen, H. K., Rosenørn, S., Madsen, H., and Mikkelsen, P. S.: Quality control of rain data used for urban runoff systems, Water Sci. Technol., 37, 113–120, 1998.

Kirstein, J. K., Høgh, K., Rygaard, M., and Borup, M.: A semi-automated approach to validation and error diagnostics of water network data, Urban Water J., 16, 1–10, 2019.

Krebs, G., Kokkonen, T., Valtanen, M., Koivusalo, H., and Setälä, H.: A high resolution application of a stormwater management model (SWMM) using genetic parameter optimization, Urban Water J., 10, 394–410, 2013.

Kroll, S., Wambecq, T., Weemaes, M., Van Impe, J., and Willems, P.: Semi-automated buildup and calibration of conceptual sewer models, Environ. Modell. Softw., 93, 344–355, 2017.

Ledergerber, J. M., Pieper, L., Binet, G., Comeau, A., Maruéjouls, T., Muschalla, D., and Vanrolleghem, P. A.: An Efficient and Structured Procedure to Develop Conceptual Catchment and Sewer Models from Their Detailed Counterparts, Water (Switzerland), 11, 1–19, 2019.

Leigh, C., Alsibai, O., Hyndman, R. J., Kandanaarachchi, S., King, O. C., McGree, J. M., Neelamraju, C., Strauss, J., Talagala, P. D., Turner, R. D. R., Mengersen, K., and Peterson, E. E.: A framework for automated anomaly detection in high frequency water-quality data from in situ sensors, Sci. Total Environ., 664, 885–898,, 2019. 

Lund, N. S. V., Kirstein, J. K., Madsen, H., Mark, O., Mikkelsen, P. S., and Borup, M.: Feasibility of using smart meter water consumption data and in-sewer flow observations for sewer system analysis: a case study, J. Hydroinform., 795–812,, 2021. 

Madsen, H., Gregersen, I. B., Rosbjerg, D., and Arnbjerg-Nielsen, K.: Regional frequency analysis of short duration rainfall extremes using gridded daily rainfall data as co-variate, Water Sci. Technol., 75, 1971–1981, 2017.

Mahmoodian, M., Carbajal, J. P., Bellos, V., Leopold, U., Schutz, G., and Clemens, F.: A Hybrid Surrogate Modelling Strategy for Simplification of Detailed Urban Drainage Simulators, Water Resour. Manag., 32, 5241–5256, 2018.

Moy de Vitry, M., Dicht, S., and Leitão, J. P.: floodX: urban flash flood experiments monitored with conventional and alternative sensors, Earth Syst. Sci. Data, 9, 657–666, 2017.

Moy de Vitry, M., Schneider, M. Y., Wani, O., Manny, L., Leitão, J. P., and Eggimann, S.: Smart urban water systems: what could possibly go wrong?, Environ. Res. Lett., 14, 081001, 2019.

Nagel, J. B., Rieckermann, J., and Sudret, B.: Principal component analysis and sparse polynomial chaos expansions for global sensitivity analysis and model calibration: Application to urban drainage simulation, Reliab. Eng. Syst. Safe., 195, 106737, 2020.

Palmitessa, R., Mikkelsen, P. S., Borup, M., and Law, A. W. K.: Soft sensing of water depth in combined sewers using LSTM neural networks with missing observations, J. Hydro-Environ. Res., 28, 106–116, 2021a.

Palmitessa, R., Pedersen, A. N., Borup, M., Sørensen, L., Law, A. W. K., Clemmensen, L. K. H., and Mikkelsen, P. S.: Anomaly detection in water depth observations from combined sewers using LSTM neural networks, in preparation, 2021b. 

Pedersen, A. N., Pedersen, J. W., Vigueras-Rodriguez, A., Brink-Kjær, A., Borup, M., and Mikkelsen, P. S.: Dataset for Bellinge: An urban drainage case study, Tech. Univ. Denmark [data set], 2021a.

Pedersen, A. N., Borup, M., Brink-Kjær, A., Christiansen, L. E., and Mikkelsen, P. S.: Living and Prototyping Digital Twins for Urban Water Systems: Towards Multi-Purpose Value Creation Using Models and Sensors, Water, 13, 592, 2021b.

Pedersen, A. N., Pedersen, J. W., Borup, M., Brink-Kjær, A., Christiansen, L. E., and Mikkelsen, P. S.: Using multi-event hydrologic and hydraulic signatures from water level sensors to diagnose locations of uncertainty in integrated urban drainage models, submitted, 2021c. 

Rimer, S. P., Troutman, S. C., Mullapudi, A., and Kerkez, B.: Demo abstract: A benchmarking framework for control and optimization of smart stormwater networks, in: ICCPS 2019 – Proc. 2019 ACM/IEEE Int. Conf. Cyber-Physical Syst., 16–18 April 2019, Montreal, QC, Canada, 350–351, 2019.

Rossman, L. and Huber, W.: Storm Water Management Model Reference Manual Volume I, Hydrology, EPA/600/R-., US EPA Office of Research and Development, Washington, DC, 2015. 

Sarni, W., White, C., Webb, R., Cross, K., and Glotzbach, R.: Digital Water – Industry Leaders Chart the Transformation Journey, IWA Publishing, London, UK, 2019. 

Schaake, J., Cong, S., and Duan, Q.: U.S. MOPEX data set, IAHS Publ. Ser., 307, 9–28, 2006.

Schütze, M., Lange, M., Pabst, M., and Haas, U.: Astlingen – A benchmark for real time control (RTC), Water Sci. Technol., 2017, 552–560, 2017.

Stagge, J. H., Rosenberg, D. E., Abdallah, A. M., Akbar, H., Attallah, N. A., and James, R.: Assessing data availability and research reproducibility in hydrology and water resources, Sci. Data, 6, 1–12, 2019.

Sun, C., Svensen, J. L., Borup, M., Puig, V., Cembrano, G., and Vezzaro, L.: An MPC-Enabled SWMM Implementation of the Astlingen RTC Benchmarking Network, Water, 12, 1034, 2020.

Therrien, J.-D., Nicolaï, N., and Vanrolleghem, P. A.: A critical review of the data pipeline: how wastewater system operation flows from data to intelligence, Water Sci. Technol., 82, 2613–2634, 2020.

Thorndahl, S., Nielsen, J. E., and Rasmussen, M. R.: Bias adjustment and advection interpolation of long-term high resolution radar rainfall series, J. Hydrol., 508, 214–226, 2014.

Thrysøe, C., Arnbjerg-Nielsen, K., and Borup, M.: Identifying fit-for-purpose lumped surrogate models for large urban drainage systems using GLUE, J. Hydrol., 568, 517–533, 2019.

Tscheikner-Gratl, F., Zeisl, P., Kinzel, C., Rauch, W., Kleidorfer, M., Leimgruber, J., and Ertl, T.: Lost in calibration: Why people still do not calibrate their models, and why they still should – A case study from urban drainage modelling, Water Sci. Technol., 74, 2337–2348, 2016.

VCS Denmark: VCS Denmark homepage, available at: (last access: 20 March 2020), 2020. 

Vezzaro, L., Mikkelsen, P. S., Deletic, A., and McCarthy, D.: Urban drainage models – Simplifying uncertainty analysis for practitioners, Water Sci. Technol., 68, 2136–2143, 2013.

Vonach, T., Kleidorfer, M., Rauch, W., and Tscheikner-Gratl, F.: An Insight to the Cornucopia of Possibilities in Calibration Data Collection, Water Resour. Manag., 33, 1629–1645, 2019.

Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., Bonino da Silva Santos, L., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., Gonzalez-Beltran, A., Gray, A. J. G., Groth, P., Goble, C., Grethe, J. S., Heringa, J., 't Hoen, P. A. C., Hooft, R., Kuhn, T., Kok, R., Kok, J., Lusher, S. J., Martone, M. E., Mons, A., Packer, A. L., Persson, B., Rocca-Serra, P., Roos, M., van Schaik, R., Sansone, S.-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M. A., Thompson, M., van der Lei, J., van Mulligen, E., Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K., Zhao, J., and Mons, B.: The FAIR Guiding Principles for scientific data management and stewardship, Sci. Data, 3, 160018, 2016.

Wolfs, V. and Willems, P.: Modular Conceptual Modelling Approach and Software for Sewer Hydraulic Computations, Water Resour. Manag., 31, 283–298, 2017.

Short summary
A comprehensive data set from a combined sewer system in a 1.7 km2 suburban area is presented. Up to 10 years of observations (2010–2020) from level sensors, a flow meter, position and power sensors, rain gauges, X- and C-band weather radars, and a weather station; distributed hydrodynamic models; and CCTV pipe network data are included. This will enable independent testing and replication of results from future scientific developments within urban hydrology and urban drainage system research.