A new dataset of river flood hazard maps for Europe and the Mediterranean Basin

In recent years, the importance of continental-scale hazard maps for riverine floods has grown. Nowadays, such maps are used for a variety of research and commercial activities, such as evaluating present and future risk scenarios and adaptation strategies, as well as supporting national and local flood risk management plans. In this paper we present a new set of high-resolution (100 metres) hazard maps for river flooding that covers most European countries, as well as all of the river basins entering the Mediterranean and Black Seas in the Caucasus, Middle East and Northern Africa countries. The new river flood hazard maps represent inundation along 329,000 km of the river network, for six different flood return periods, expanding on the datasets previously available for the region. The input river flow data for the new maps are produced by means of the hydrological model LISFLOOD using new calibration and meteorological data, while inundation simulations are performed with the hydrodynamic model LISFLOOD-FP. In addition, we present here a detailed validation exercise using official hazard maps for Hungary, Italy, Norway, Spain and the UK, which provides a more detailed evaluation of the new dataset compared with previous works in the region. We find that the modelled maps can identify on average two-thirds of reference flood extent, but they also overestimate flood-prone areas for flood probabilities below 1-in-100 years, while for return periods equal to or above 500 years the


Nowadays, flood hazard maps are a basic requirement of any flood risk management strategy (EC 2007). Such maps provide spatial information about a number of variables (e.g. flood extent, water depth, flow velocity) that are crucial to quantify flood impacts and therefore to evaluate flood risk. Moreover, they can be used as a powerful communication tool, enabling the quick visualization of the potential spatial impact of a river flood over an area.

2) Data and methods

In this Section we describe the procedure adopted to produce and validate the flood hazard maps. The hydrological input data consist of daily river flow for the years 1990-2016, produced with the hydrological model LISFLOOD (see Section 2.1), based on interpolated daily meteorological observations. River flow data are analysed to derive frequency distributions, peak discharges and flood hydrographs, as described in Section 2.

The hydrological input data required for the flood simulations are provided using synthetic flood hydrographs, following the approach proposed by Alfieri et al. (2014). We use the streamflow dataset derived from the long-term run of LISFLOOD described in Section 2.1, considering the rivers with upstream drainage areas larger than 500 km². This threshold was selected because the meteorological input data cannot accurately capture the short and intense rainfall storms that induce extreme floods in small river basins, and therefore the streamflow dataset does not represent accurately the flood statistics of smaller catchments (Alfieri et al., 2014).

For each pixel of the river network we selected annual maxima over the period 1990-2016 and we used the L-moments approach to fit a Gumbel distribution and calculate peak flow values for reference return periods of 10, 20, 50, 100, 200 and 500 years.
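As a minimal sketch of the frequency analysis described above, the following Python code fits a Gumbel distribution to a series of annual maxima using sample L-moments and evaluates the peak flow for the six reference return periods. The function names and the example discharge values are hypothetical and are not taken from the processing chain of the study; the estimators are the standard L-moment relations for the Gumbel distribution (scale = λ2 / ln 2, location = λ1 − γ · scale, with γ the Euler-Mascheroni constant).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
RETURN_PERIODS = (10, 20, 50, 100, 200, 500)  # years, as listed in the text


def fit_gumbel_lmoments(annual_maxima):
    """Return (location, scale) of a Gumbel distribution fitted by L-moments."""
    x = sorted(float(v) for v in annual_maxima)
    n = len(x)
    if n < 2:
        raise ValueError("at least two annual maxima are required")
    # Unbiased probability-weighted moments b0 and b1 of the sorted sample.
    b0 = sum(x) / n
    b1 = sum(i / (n - 1) * xi for i, xi in enumerate(x)) / n
    # First two sample L-moments.
    l1 = b0
    l2 = 2.0 * b1 - b0
    scale = l2 / math.log(2.0)
    location = l1 - EULER_GAMMA * scale
    return location, scale


def gumbel_peak_flow(location, scale, return_period):
    """Peak flow with annual non-exceedance probability 1 - 1/T."""
    p = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(p))


if __name__ == "__main__":
    # Made-up annual maxima (m3/s) for one river pixel, for illustration only.
    maxima = [410.0, 520.0, 365.0, 780.0, 455.0, 610.0, 390.0, 505.0]
    loc, scale = fit_gumbel_lmoments(maxima)
    for T in RETURN_PERIODS:
        print(T, round(gumbel_peak_flow(loc, scale, T), 1))
```

With only two parameters, the fitted quantiles depend on the sample mean and on the second L-moment alone, which is what keeps the extrapolation to long return periods stable for a short record.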
We also calculated the 30- and 1,000-year return periods in limited parts of the model domain to allow validation against official hazard maps (see Section 2.3). The resulting goodness-of-fit is presented and discussed in Appendix B. We used the Gumbel distribution to keep a parsimonious parameterization (two parameters instead of three for the generalized extreme value (GEV), log-normal and other distributions), thus avoiding over-parameterization when extracting high return period maps from a relatively short time series. The same distribution was also adopted for the extreme value analysis in previous studies regarding flood frequency and hazard (Alfieri et al., 2014; Dottori et al., 2016).

Subsequently, we calculate a Flow Duration Curve (FDC) from the streamflow dataset. The FDC is obtained by sorting in decreasing order all the daily discharges, thus providing annual maximum values QD for any duration i between 1 and 365 days. Annual maximum values are then averaged over the entire period of data, and used to calculate the ratios εi between each average maximum discharge for the i-th duration QD(i) and the average annual peak flow (i.e. QD for a duration of 1 day). Such a procedure was carried out for all the pixels of the river network.

The synthetic flood hydrographs are derived using daily time-steps, following the procedure proposed by Maione et al. (2003). The peak value of the hydrograph is given by the peak discharge QT for the selected T-year return period, while the other values Qi are derived by multiplying QT by the ratio εi. The hydrograph peak QT is placed in the centre of the hydrograph, while the other values Qi are sorted alternately as shown in Figure

to derive a high-resolution river network at the same resolution.
Along this river network we identify reference sections every 5 km along the stream-wise direction, and we link each section to the closest upstream section (pixel) of the EFAS 5 km river network, using a partially automated procedure to ensure a correct linkage near confluences. In this way, the hydrological variables necessary to build the flood hydrographs can be transferred from the 5 km to the 100 metres river network. Figure

Finally, the flood maps with the same return period are merged together to obtain the continental-scale flood hazard maps. The 100 metres river network is included as a separate map in the dataset, to delineate those water courses that were considered in creating the flood hazard maps.

It is important to note that the flood maps developed do not account for the influence of local flood defences, in particular dyke systems. Such limitation has been dictated mainly by the absence of consistent data at European scale. None of the available DEMs for Europe has the required accuracy and resolution to embed artificial embankments into elevation data. Furthermore, there are no publicly available continental or national datasets describing the location and characteristics (e.g. dyke height, distance from river channel) of flood protections.

these are likely to provide higher accuracy than the modelled maps presented here, and therefore they have been selected as reference maps for the validation. While official flood maps are generally available online for consultation on Web-GIS services, only a few countries and river basin authorities make the maps available for download in a format that allows comparison with geospatial data. Table 1 presents the list of flood hazard maps that could be retrieved and used for the validation exercise, while their geographical distribution is shown in Figure 1.
Note that the relevant links to access these maps are provided in the Data Availability section.

While more of such official maps are likely to become available in the near future, the maps considered here offer an acceptable overview of the different climatic zones and floodplain characteristics of the European continent. Conversely, we could not retrieve national or regional flood hazard maps outside Europe, meaning the skill of the modelled maps could not be tested in the arid regions in Northern Africa and Eastern Mediterranean. In Norway, Spain, the UK and the Po River Basin the official maps take flood defences into account, which are not represented in the modelling framework. Official maps for England also include areas prone to coastal flooding events (such as tidal and storm surges). None of the official maps include areas prone to pluvial flooding, which are therefore not considered in this analysis.

The national flood hazard maps listed in Table 1

between modelled and reference maps we applied a number of corrections. Firstly, we used the CORINE Land Cover map to exclude permanent water bodies (river beds of large rivers or estuaries, lakes, reservoirs, coastal lagoons) from the comparison. Secondly, we restricted the comparison area around modelled maps to exclude the elements of river network (e.g. minor tributaries) included in the reference maps but not in the modelled maps. We used a different buffer extent according to each study area, considering the floodplain morphology and the variable extent and density of mapped river network. For example, in Hungary we applied a 10 km buffer around modelled maps to include the large flooded areas reported in reference maps and to avoid overfitting. In England, we used a 5 km buffer due to the high density of the river network mapped in the official maps. The buffer is also applied to mask out coastal areas far from river estuaries, because official maps include flood-prone areas due to 1-in-200-year coastal flood events. We calculated that flood-prone areas inside the 5 km buffer correspond to 73% of the total extent for the 1-in-100-year flood. For the Po river basin, we excluded from the comparison the areas belonging to the Adige river basin and the lowland drainage network, which are not included in the official hazard maps. In Spain and Norway official flood hazard maps have only been produced where relevant assets are at risk, according to available documentation (MITECO 2011; NVE 2020). We therefore restricted the comparison only to areas where official flood hazard maps have been produced.

We evaluate the performance of simulated flood maps against reference maps using a number of indices. The hit ratio (HR) is defined as:

$$\mathrm{HR} = \frac{F_m \cap F_o}{F_o}$$

where $F_m \cap F_o$ is the area correctly predicted as flooded by the model (with $F_m$ the modelled flooded area), and $F_o$ indicates the total observed flooded area.
HR scores range from 0 to 1, with a score of 1 indicating that all wet cells in the benchmark data are wet in the model data. The formulation of the HR does not penalize over-prediction, which can be instead quantified using the false alarm ratio (FAR):

$$\mathrm{FAR} = \frac{F_m \setminus F_o}{F_m}$$

where $F_m \setminus F_o$ is the area wrongly predicted as flooded by the model and $F_m$ is the total modelled flooded area. FAR scores range from 0 (no false alarms) to 1 (all false alarms). Finally, a more comprehensive measure of the agreement between simulations and observations is given by the critical success index (CSI), defined as:

$$\mathrm{CSI} = \frac{F_m \cap F_o}{F_m \cup F_o}$$

where $F_m \cup F_o$ is the union of observed and simulated flooded areas. CSI scores range from 0 (no match between model and benchmark) to 1 (perfect match between benchmark and model).
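The three scores can be computed directly from two co-registered binary flood masks. The sketch below assumes equal-area grid cells, so that areas reduce to cell counts, and represents each mask as a flat sequence of 0/1 flags. The function name `flood_scores` and this data layout are illustrative assumptions, not the validation code used in the study.

```python
def flood_scores(modelled, observed):
    """Return (HR, FAR, CSI) for two equal-length sequences of wet/dry flags.

    A truthy value marks a flooded cell. Cells are assumed to have equal
    area, so each area in the score definitions is a simple cell count.
    """
    if len(modelled) != len(observed):
        raise ValueError("masks must cover the same cells")

    hits = 0          # wet in both model and reference
    false_alarms = 0  # wet in model only
    misses = 0        # wet in reference only
    for m, o in zip(modelled, observed):
        if m and o:
            hits += 1
        elif m:
            false_alarms += 1
        elif o:
            misses += 1

    observed_area = hits + misses
    modelled_area = hits + false_alarms
    union_area = hits + misses + false_alarms

    nan = float("nan")  # returned when a score is undefined (empty denominator)
    hr = hits / observed_area if observed_area else nan
    far = false_alarms / modelled_area if modelled_area else nan
    csi = hits / union_area if union_area else nan
    return hr, far, csi
```

Because HR ignores false alarms and FAR ignores misses, only CSI is reduced by both kinds of error, which is why it is used as the summary score.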

Additional tests
To choose the best possible methodologies and datasets to construct the flood hazard maps, we performed a number of tests using recent input datasets, as well as alternative strategies to account for vegetation effects on elevation data.

Elevation data
It is well recognized that the quality of flood hazard maps strongly depends on the accuracy of elevation data used for modelling (Yamazaki et al., 2017). This is especially crucial for continental-scale maps, since the quality of available elevation datasets is rarely commensurate

3) Results and discussion

We present the outcomes of the validation exercise by first describing the general results at country and regional scale (Section 3.1). Then, we discuss the outcomes for England, Hungary and Spain (Section 3.2), while the Norway and Po river basin case studies are presented in Appendix C. We also complement the analysis with additional validation over major river basins in England and Spain. In

The results in Table 3

Besides these results, the visual inspection of reference maps suggests that the underestimation is partly caused by the high density of mapped river network in the reference maps, compared with the modelled maps. Indeed, the modelling framework excludes river basins with an upstream basin area below 500 km², meaning that EFAS maps only cover main river stems but miss out several smaller tributaries. This is clearly visible over the Severn and in the upper Thames basins (Figure

Hungary
The results in Table 3

Table 6 and 7. For our framework, we calculated each index in Table 6 using the overall modelled and reference flood extent available for each return period (e.g. the value for the 100-year maps includes reference and modelled maps for England, Spain and Norway). As such, each area is weighted according to the extent of the corresponding flood map.

As can be seen in Table 6, the continental-scale model by Wing

(Table 7).

Table 7. Comparison of the performance metrics for the maps described in the present study and the global maps by Sampson et al. (2015). Metrics for the latter study are calculated removing all channels with upstream areas of less than 500 km².

The different masking applied to reference flood maps may explain some of the differences:

Sampson et al. removed all channels with upstream areas of less than 500 km², whereas here we use a simpler 5 km buffer around modelled maps. The exclusion of permanent river channels in our comparison may further penalize the overall score, especially for the Thames, which has a rather large channel estuary. Besides these differences in the validation, the better metrics of the maps

Table 3. As can be seen, differences are generally reduced across the different areas and return periods. Version 1 of the flood maps produced slightly better results in Hungary for the 100- and 1000-year return periods (increased CSI and HR, lower FAR), while Version 2 has somewhat improved performance in England, mainly driven by higher HR.

Table 8. Comparison of performances of the flood hazard maps described in the present study and developed by Dottori et al. (2016a).

These outcomes may be interpreted considering the changes in input data between the two versions, and the structure of the modelling approach and of input data, which in turn has not changed substantially. The main difference between the two map versions is given by the hydrological input, with Version 2 using the latest calibrated version of the LISFLOOD model. For the 100-year return period, peak flow values of Version 2 are on average 35% lower than Version 1 in Hungary, and 16% lower in England. However, similar decreases are also observed for the 1-in-2-year peak discharge that determines full-bank discharge. The resulting reduction in channel hydraulic conveyance with respect to Version 1 is likely to offset the decrease of peak flood volumes, which explains the small difference in overall flood extent given by the F2/F1 parameter in Table 8. Such results confirm the low sensitivity of the modelling framework to the hydrological input observed by Dottori et al.

Mediterranean.
In these areas, further research will be needed to better understand the performance of the flood mapping procedure proposed here. Modelled maps generally achieve low scores for high and medium probability of flooding. For the 1-in-100-year return period, the modelled maps can identify on average two-thirds of reference flood extent; however, they also largely overestimate flood-prone areas in many regions, thus hampering the overall performance. Performance improves markedly with the increase of return period, mostly due to the decrease of the false alarm rates. In particular, critical success index (CSI) values approach and in some cases exceed 0.5 for return periods equal to or above 500 years, meaning that the maps can correctly identify more than half of flooded areas in the main river stems and tributaries of different river basins.

It is important to note that the validation was affected by problems in identifying the correct areas for a fair comparison, because of the different density of the mapped river network in reference and modelled maps. In our study we used large buffers to constrain comparison areas, which possibly penalized the model performance by generating spurious false alarms in areas not considered by official maps. However, we observed that the proposed maps achieve comparable results to other large-scale flood models when using similar parameters for the validation.

to represent small-scale precipitation processes. Improving the simulation of reservoirs may also reduce the difference between the real and modelled hydrological regimes in regions such as the

Here we evaluate the performance of the Gumbel distribution in fitting the available reference discharge values (26 annual maxima calculated for all the grid points of the LISFLOOD long-term run).
Specifically, we compare the empirical and fitted distributions of streamflow annual maxima using the Cramér-von Mises test (Anderson, 1962), and we calculate the average differences between reference and fitted discharge values. Table B2 summarizes the resulting p-values over the study area. Figure B2

Table B2 suggest a low skill of the fitted Gumbel distributions; however, the resulting uncertainty in the estimates of discharge maxima is generally below 25%, as in the examples shown in Figure B2. This is considered acceptable because the reference discharge maxima are modelled and not observed values. Due to limited sample size, it is not possible to evaluate the extrapolation error for peak flows beyond the available sample; however, previous studies suggested the suitability of the Gumbel distribution. Cunnane (1989)

Table 3, the modelled flood maps provide a better reproduction of reference maps for the Po River, compared to other study areas. False alarms are low, while hit ratio (HR) values indicate that two out of every three pixels in the reference map are correctly identified as flooded. The analysis of reference and modelled maps (Figure C1) suggests that the underestimation is partly caused by flooded areas along some tributaries which are not included in modelled maps. Other areas with omission errors are located near confluences of the Po main stem and the major tributaries in Emilia-Romagna, which may depend on the underestimation of peak flow on tributaries. In fact, the results of the LISFLOOD calibration in Figure B1