Interactive comment on “EUFF (EUropean Flood Fatalities): A European flood fatalities database since 1980”

Your paper “EUFF (EUropean Flood Fatalities): A European flood fatalities database since 1980” covers an important topic. Easily accessible and consistent databases of fatalities are rare and definitely needed to advance the understanding of the process. That being said, I see two major issues with your submission: the overlap with the two previous papers (Petrucci et al. 2019a and 2019b, op. cit.) and the overall structure of the paper. Both points are somewhat related. I think the authors themselves summarize the issue best: “[t]he novel element of this work, compared with the previous one (Petrucci et al., 2019), is that this work is focusing on trend analysis.” An ESSD paper


Introduction
Between 1995 and 2015, floods represented 47% of climate-related disasters (Wahlstrom and Guha-Sapir, 2015). The expected increase in flood frequency and magnitude due to climate change (Trenberth, 2011), together with the increasing concentration of human activities and people in flood-prone areas (IPCC, 2014), makes floods one of the most important threats for those communities. This was most evident during some past catastrophic floods in Europe (Barredo, 2007). Moreover, Blöschl et al. (2020) showed that the period 1990-2016 represents one of the most flood-rich periods in Europe, being exceptional in terms of extent, flood seasonality and air temperatures when compared to similar flood-rich periods over the past 500 years.
Flood fatalities (FFs), i.e. people who lost their lives directly or indirectly during floods, represent the most tragic side of this natural disaster. Several publications analyse factors that influence the vulnerability of individuals to flooding, mainly related to gender, age, activity and risk-taking behaviour, in different geographical and socioeconomic frameworks (Alderman et al., 2012; Fiala, 2017; Lowe et al., 2013; Pereira et al., 2017; Rufat et al., 2015; Špitalar et al., 2014; Brázdil et al., 2019). These studies are mainly based on specific databases, which in several countries are not even available and thus must be created for that specific purpose. At a national level, there are few examples of official FF databases. In the USA, for example, the governmental database "Storm Data" is updated by the National Climatic Data Centre (Sharif et al., 2012). The Australian PerilAUS is a database of historical natural hazard impacts, containing FFs that occurred between 1900 and 2015 and critical information such as age, gender, and actions causing death. At a European level, there are no "official" databases collecting FF data. The first experiment, carried out in the Mediterranean environment by a multinational research group, is the MEFF (MEditerranean Flood Fatalities) database, which includes FFs that occurred over a 36-year period in five Mediterranean study areas (Petrucci et al., 2019b; Vinet et al., 2019).
Global databases such as NATHAN (Natural Hazards Assessment Network) of the reinsurance company Munich Re, or EM-DAT (Emergency Events Database) from the Centre for Research on the Epidemiology of Disasters of the Université Catholique de Louvain, are very useful to identify major disasters and to extract comparative statistics and other useful information.
Nevertheless, they record only major catastrophic events and have built-in bias due to the use of indirect sources, or they consider only the information provided by some insurance companies that do not cover the affected regions entirely (Llasat et al., 2013). Moreover, in these databases the location of FFs is characterized by low spatial resolution.
Flood fatalities databases also allow flood risk scenarios to be developed, taking into account ongoing changes in lifestyle. Since 1960, the increasing number of cars has contributed not only to people's mobility but also to their increased exposure to flood events, given that people are more vulnerable when they are in their cars (Petrucci et al., 2017; Petrucci and Pasqua, 2012; Sharif et al., 2012) or when they are personnel working for State Emergency Services/Agencies (Ahmed et al., 2020). An interesting tendency detected in Australia between 2000 and 2015 is the dramatic increase in deaths of 4WD vehicle drivers in their desperate attempt to reach their own or friends' homes (Franklin et al., 2017). Some authors suggest that even if most people are aware of the risk involved, the depth and/or the speed of the water might take them by surprise. It is also noted that even if most drivers may identify the potential risk, they fail to personalize it, believing that it does not apply to themselves (Pearson and Hamilton, 2014), which makes them impatient and leads them to think that they are invincible (Franklin et al., 2014) and untouchable. The circumstances can lead to the identification of different types of loss of life (victims in cars, victims in collapsed buildings, professional rescuers, voluntary rescuers, visitors, observers, victims when structural measures collapsed, victims of non-structural measures, and others), also highlighting the groups of people exhibiting dangerous behaviours and taking unnecessary risks (Špitalar et al., 2020).
The current paper presents the European Flood Fatalities Database (EUFF 2020), namely the catalogue of FFs that occurred in nine study areas located in eight countries during a 39-year period. The term "European" is ambitious given that EUFF 2020 deals with only eight countries; nevertheless, we consider it the beginning of a larger database that could be supplemented with data from more European countries. EUFF 2019 was presented in a previous paper describing the database and the main results in an aggregated form (Petrucci et al., 2019a). The present work is based on an updated version of that database and analyses the number of flood fatalities per event and the relationships between gender and age of victims and the other variables collected, in order to highlight specific vulnerability factors.
Section 2 describes the methodology used to collect data, introduces the study areas and presents the structure of the database and its completeness. Section 3 presents the results obtained from data elaboration, while Section 4 discusses the potential of EUFF 2020. Finally, Section 5 provides information regarding the availability of the database and Section 6 presents the main conclusions.

Flood fatalities data
The methodological approach is based on a systematic collection of data about floods that caused casualties (flood events, FEs). All cases of fatal floods triggered by rainfall have been included, without severity thresholds: EUFF 2020 (Petrucci et al., 2020) contains all FEs, independent of the number of FFs per FE.
Such data can be extracted from different types of documentary sources (Brázdil et al., 2012). In our study we took advantage of a common practice (Leal et al., 2018; Papagiannaki et al., 2013; Zêzere et al., 2014), namely the reading of national and local newspapers. Owing to their temporal continuity, newspapers allow systematic surveys, which in the current study were complemented by local sources that differ from one study area to another (e.g. reports by rescue services or civil protection agencies). The analysis of newspapers is a long process. It requires the selection of several articles, published either in the daily edition of a single newspaper or in several newspapers of the same day, in order to filter the details and to define the framework in which FFs occurred. With national newspapers, data gathering must be performed by researchers who understand the national language and who can easily search through several articles and identify the information to be included in the database. A typical time sequence of the description of an event may be as follows (from Calabria, Italy):

Study areas
The EUFF 2020 database contains information on FFs that occurred in 39 years
Preprint. Discussion started: 3 August 2020. © Author(s) 2020. CC BY 4.0 License.
these databases. In order to homogenise the data by filling as many database fields as possible, further research has been performed in each study area by using coeval local newspapers, such as Ultima Hora for BAL, La Vanguardia for CAT, Rizospastis for GRE and Il Corriere della Sera for ITA. Data assembled in EUFF 2020 are the result of collecting, merging and homogenising regional and national databases of all the study areas and include further data obtained from recent historical research. In the following, we name 'TOT-Area' the total sum of all study areas (Figure 1).

Database structure
The descriptions of flood events, as identified and selected from data sources (both local databases and additional documentary sources), were used to fill in the EUFF 2020 fields. The database structure was designed to allow the translation of the flood event descriptions into a well-defined and restricted number of options, listed in dropdown menus in order to facilitate database compilation. The structure of EUFF 2020 is detailed in two tables: Table 2 reports the variables used to define the location and time of the FE, including the victim profile, while Table 3 describes the variables used to define the flood-victim interaction.
The database contains the following fields:
PRIMARY KEY is an integer number that allows the unique identification of each record by means of the FATALITY_ID. Each record contains data about a single FF, clustered in different sections.
84 for the exact point where the accident occurred are available, these were also included in the fields LATITUDE and LONGITUDE, and the field LOC_ACCURACY was marked as HIGH. In the cases in which the exact point was not available, LATITUDE and LONGITUDE contain the coordinates of the centroid of the MUNICIPALITY, if available, or alternatively of the PREFECTURE or REGION where the accident occurred, and LOC_ACCURACY is marked as LOW.
TIME OF ACCIDENT contains the date on which the accident occurred, in the format dd (day), mm (month) and yy (year). For those cases where the exact hour of the accident was available, it was included in the field HOUR, and the field HOUR_ACCURACY was marked as HIGH. If the hour is reported as a textual description, HOUR_ACCURACY is marked as LOW and the textual description is converted to hours according to the values reported in Table 2.
in Table 3. For example, a fatality referred to as CHILD was classified in the range 0-14 years. The GENDER, if available, is reported as M (males) or F (females). The field RESIDENCY classifies the victim as RESIDENT or NOT RESIDENT in the place where the accident occurred, or as TOURIST visiting the area.
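The location rule described for LATITUDE, LONGITUDE and LOC_ACCURACY can be sketched in Python as follows. This is only an illustrative sketch: the function and class names are hypothetical, the coordinates in the usage lines are invented, and the order in which PREFECTURE and REGION centroids are tried is an assumption, since the text only says "PREFECTURE or REGION".

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


@dataclass
class Location:
    latitude: float
    longitude: float
    loc_accuracy: str  # "HIGH" or "LOW", mirroring the LOC_ACCURACY field


def resolve_location(
    exact_point: Optional[Coord],
    municipality_centroid: Optional[Coord] = None,
    prefecture_centroid: Optional[Coord] = None,
    region_centroid: Optional[Coord] = None,
) -> Optional[Location]:
    """Hypothetical helper applying the fallback rule described in the text."""
    if exact_point is not None:
        # Exact accident point known: accuracy is HIGH.
        return Location(exact_point[0], exact_point[1], "HIGH")
    # Otherwise use the first available centroid (order is an assumption).
    for centroid in (municipality_centroid, prefecture_centroid, region_centroid):
        if centroid is not None:
            return Location(centroid[0], centroid[1], "LOW")
    return None  # no location information at all


# Invented coordinates, for illustration only.
exact = resolve_location((39.30, 16.25))
fallback = resolve_location(None, None, (39.0, 16.5), (39.1, 16.4))
```

Keeping the accuracy flag next to the coordinates lets later spatial analyses filter out centroid-based positions when point-level precision matters.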

Database completeness
The first version of EUFF, presented in 2019 by Petrucci et al. (2019a), contained 2466 FFs that occurred in nine study areas between 1980 and 2018. The present work is based on an updated version of that database which, according to the year of updating, is named EUFF 2020. Improvements from EUFF 2019 to EUFF 2020 are summarised in Table 4. The difference between the number of FFs in the two versions of the database is relatively low (17 FFs). This is expected, since the number of fatalities is the information most frequently reported in newspapers. Also, considering that this research has been carried out by a systematic survey of newspapers, it is unlikely that a large number of fatalities remained unnoticed and could emerge from further research. The small increase in FFs from EUFF 2019 to EUFF 2020 essentially stems from supplementary research on short periods for which data were missing. For example, it is not rare that printed collections of old newspapers owned by libraries and newspaper libraries are affected by gaps, due e.g. to the deterioration of some editions caused by humidity in the library's premises. In these cases, if digital versions or private collections of the missing editions become available, the gaps can be filled and, if some FFs occurred in that period, the total number of FFs can be updated.
On the other hand, in EUFF 2020 there is a larger increase in the availability of variables such as ACCIDENT_PLACE, RESIDENCY, VICTIM_CONDITION and ACCIDENT_DYNAMIC. This depends mainly on the current availability of coeval data sources that, for already counted FEs, allow the identification of details not initially available in the documentary sources analysed for EUFF 2019. In these cases, the absolute number of FFs did not increase but the completeness of the database has been substantially improved, thus providing a more realistic framework and a more robust basis for statistical data elaboration and analysis.
Table 4. Data included in EUFF (2019) and EUFF (2020), and increases as numbers (#) and percentage (%) of the FFs in EUFF (2020). The diagram on the right represents the numerical increase of data available for each variable.
Currently, EUFF 2020 contains 2483 flood fatalities (Figure 2). Half of them occurred in Turkey (50.1%), followed by Italy (16.4%) and south France (11.0%). The remaining 22.6% concern the other six areas. Table 5 summarises the data collected in TOT-Area and in the individual study areas. For each variable, we report the available data expressed as the proportion of FFs in each study area.
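As an illustration of how completeness proportions of the kind reported in Table 5 could be derived, the following minimal Python sketch computes, for one variable, the percentage of records in which it is filled. The record layout (a list of dictionaries keyed by field name) and the function name are assumptions made for this example and are not taken from the EUFF 2020 files; the toy records are invented.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]  # hypothetical layout: field name -> value or None


def completeness(records: List[Record], field: str) -> float:
    """Percentage of records in which `field` is filled (not None or empty)."""
    if not records:
        raise ValueError("no records")
    known = sum(1 for r in records if r.get(field) not in (None, ""))
    return 100.0 * known / len(records)


# Toy records (invented): GENDER is known for 3 of 4 fatalities, AGE for 2 of 4.
toy = [
    {"GENDER": "M", "AGE": "ADULT"},
    {"GENDER": "F", "AGE": None},
    {"GENDER": None, "AGE": None},
    {"GENDER": "M", "AGE": "ELDERLY"},
]
gender_completeness = completeness(toy, "GENDER")
age_completeness = completeness(toy, "AGE")

# Difference in FFs between the two database versions reported in the text.
ffs_added = 2483 - 2466
```

The same function applied per study area, rather than to the whole record list, would give the area-level proportions.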

TIME OF ACCIDENT, DATE is available for 100% of FFs in all the study areas.
Concerning VICTIM PROFILE, AGE is known for 64.5% of FFs: the highest proportions pertain to BAL (100%) and ITA (99.3%). GENDER is available for 78.6% of FFs: the most complete data pertain to GRE and ITA. RESIDENCY is available for 67.1% of FFs, with the highest percentages in GRE and CAT. Concerning FLOOD-VICTIM INTERACTION, VICTIM_CONDITION is known for 39.3% of FFs, with the highest completeness in CAT. VICTIM_ACTIVITY is available for 37.7% of FFs, again with the highest completeness in CAT. ACCIDENT_PLACE is available for 61.7% of FFs and the most complete data on this variable pertain to ITA. ACCIDENT_DYNAMIC is available for 81.4% of FFs, and the highest percentage of data pertains to GRE. DEATH_CAUSE is known for 80.6% of FFs. In a small number of cases, PROTECTIVE_BEHAVIOUR (6.6%) and HAZARDOUS_BEHAVIOUR (13.1%) were also detected.
Table 5. Proportions of data (%) collected for each variable in EUFF 2020 with respect to flood fatalities that occurred in TOT-Area (in red) and in each study area (in black).

Results

Flood events and flood fatalities
The number of FFs caused by a single FE can be a proxy of the severity of the flood. Generally, the larger the number of FFs, the higher the severity of the FE. The basic assumption is that an FE is a flood that caused the death of at least one person on a given DATE and in a given REGION. Once the region or the date changes, the FFs are assigned to another FE. Using this criterion, in the 1980-2018 period we counted 847 FEs causing 2483 FFs in TOT-Area (Table 6), i.e. 2.9 fatalities per event on average.
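The event definition above amounts to grouping fatality records by the pair (DATE, REGION). A minimal Python sketch of that grouping is given below; the function name, the date strings and the region labels are invented for illustration, and only the totals 2483 and 847 come from the text.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

EventKey = Tuple[str, str]  # (DATE, REGION)


def fatalities_per_event(fatalities: Iterable[EventKey]) -> Dict[EventKey, int]:
    """Count FFs per FE; each input item is the (DATE, REGION) of one fatality."""
    return dict(Counter(fatalities))


# Toy input (invented): four fatalities falling into three events.
toy = [
    ("1990-01-01", "REGION_A"),
    ("1990-01-01", "REGION_A"),
    ("1990-01-02", "REGION_A"),  # date changes -> another event
    ("1990-01-01", "REGION_B"),  # region changes -> another event
]
events = fatalities_per_event(toy)
n_events = len(events)

# Mean fatalities per event from the totals reported in the text.
mean_ffs_per_fe = round(2483 / 847, 1)
```

Under this rule a flood that spans midnight or crosses a regional boundary is counted as more than one FE, which is worth keeping in mind when comparing event counts with other catalogues.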
FFs show the highest numbers in TUR (1243), ITA (407) and SFR (273).
FEs per year have the highest value in TUR (26 FEs in 1981), followed by ITA (13 FEs in 2011) and CZE (12 FEs in 1997). After TUR, with a mean of 8.9 FEs per year, high values pertain to both ITA (5.1) and SFR (3.3). The modal value of FEs per year is 10 in TUR, and it ranges between one and two in the other areas.
FFs per year reaches the maximum value in TUR (157 FFs in 1995), followed by SFR (56 FFs in 1992

Gender of flood fatalities
Gender is known for 1953 FFs (i.e., 78.6% of total FFs): 47.0% of FFs were males, 31.7% were females, and for the remaining 21.3% information on gender was missing (Table 8). Among the individual areas, completeness of this information ranges between 98.5% (GRE) and 28.6% (ISR) of FFs that occurred in the given area (Table 5). In all areas studied, male FFs outnumber female FFs.
Among the other study areas, the majority of FFs were either adults or young adults, i.e., people between 30 and 64 years. In each age class, females are generally fewer than males in absolute numbers. Nevertheless, in CAT, CZE, SFR, ITA, and POR, elderly females slightly outnumber elderly males. This may reflect the age structure of the population in these study areas, where the prevalence of females over males is stronger among the elderly than in the other age classes.
Concerning gender and residence of FFs, we have information for 1395 FFs (56.2%) (Table 8). The majority of FFs, both males and females, were residents of the area where the accident took place, so we can safely assume that they were aware of the local places and roads at high risk of flooding. The percentages of FFs that were either not residents of the place of the accident (e.g. being there for work) or tourists are small (3.1% and 5.4%, respectively).
we grouped bicycle, bus, caravan, tractor, truck and van in the new class "other vehicles". Car is the most frequent means of transport in which a deadly event happened, for both males and females and in each study area.

Gender and victim condition
Gender and victim activity are known for 818 FFs (32.9%), and more complete information concerns CAT, GRE and ITA (Table 8). This information confirms that the majority of FFs, particularly males, were traveling (by car or other vehicles), as often mentioned in the literature (e.g. Jonkman and Kelman, 2005). The second most frequent activity for male FFs was working, particularly in CAT, GRE, POR and CZE. For female FFs, the second most frequent activity after traveling was sleeping, so they were probably caught by the flood while unconscious. Male fatalities during hunting and fishing activities, due to their low number, have been clustered as 'recreational activities'.
Gender and accident place are known for 1371 FFs (55.2%). These data were grouped in four types: A) Riverbed/riverside, Ford, Bridge; B) Road, Tunnel/underpass; C) Campsite/tent, Countryside, Bungalow, Recreation area; D) Public/private building (Table 9). In CAT, SFR, GRE and ITA, females in Public/private building outnumbered males (Figure 6). This is in accordance with the societal role of females in south and central European societies, where they spend more time at home than males owing to their greater share of work and responsibilities in the care of house and children.
Gender and accident dynamic are known for 1676 FFs (67.5%), and they are largely available for almost all studied areas, except for ISR (Table 9). Accident dynamics were clustered in five groups:
Gender and death cause information is available for 1692 FFs (i.e., 68.1%). We grouped these data in four main groups: A) Collapse/heart attack; B) Drowning; C) Hypothermia, Electrocution; and D) Poly-trauma, Suffocation. Except for POR and TUR, data on both gender and cause of death are available for more than 80% of FFs (Table 9). Most commonly, people died due to drowning, and secondly due to collapse/heart attack. The proportions of males and females essentially reflect the male/female proportions among FFs in each study area, without showing any particular trend.
Gender and protective behaviours have been detected only for 155 FFs (6.2%). We grouped protective behaviours into three main types (Table 9)
Gender and hazardous behaviours are available for 318 FFs (14.7%). Data are not provided for BAL and ISR (Table 9). However, this does not mean that people behaved responsibly; merely, this information was either not available or not collected. We grouped
Staying on river banks was recorded mainly in POR. Trying to save vehicles, trying to save belongings and trying to rescue animals were recorded as deadly behaviours mainly in SFR and TUR.

Age of flood fatalities
Age is known for 1602 FFs (64.5%). This information is fragmentary for ISR and TUR and it is largely available for the other areas
(Table 11). The largest percentage pertains to young adults traveling. Sleeping is quite common among the elderly. A small percentage of FFs, in all the age classes, were doing recreational activities, and a small percentage of adults, young adults and boys/girls were rescuing someone.
Age and accident place information are available for 1004 FFs (40.4%) (Table 11). The largest percentage of FFs (17.7%) were in public and private buildings, and were mainly elderly people. Road and tunnel/underpass were the second most frequent case, mainly for adults and young adults. In ITA and GRE, the majority of FFs (essentially younger people) occurred outdoors (road and tunnel/underpass), while elderly people were affected indoors (public/private buildings). CAT, CZE and POR show a predominance of outdoor occurrences (riverbed/riverside, ford and bridges) in almost all age classes, except for the elderly, who were once again more frequently affected indoors.
Age and accident dynamic data are available for 1250 FFs (50.3%), and dragged by water/mud shows the highest percentage in all the age classes, mainly among adults and young adults (Table 11). Among the elderly, the second most frequent dynamic was being blocked in a flooded room.
Age and death cause are available for 1470 FFs (59.2%) (Table 11). Drowning killed most people in all the age classes. The second most frequent cause of death was collapse/heart attack, again in all age classes. Nevertheless, the relative incidence of this death cause is slightly higher among children and elderly fatalities.
Age and protective behaviour are available for only 116 FFs (4.7%) (Table 11). These behaviours seem relatively more frequent among young adults and adults, especially in type B (Driving to avoid danger, Getting on the car roof, Getting out of car). The boy/girl and elderly age classes show the lowest frequency of protective behaviour.

Trend in flood fatalities
In this section, the linear trend of some variables collected in EUFF 2020 for the 1980-2018 period, for either TOT-Area or individual areas, is analysed. The period studied has been divided into decades, while FEs have been divided, according to the number of FFs per FE, into two groups: a) low-severity FEs, which caused the death of fewer than 10 people, and b) high-severity FEs, which caused the death of more than 10 people (Table 12). The mean number of FFs per year shows the highest value between 1990 and 1999 (91), followed by the incomplete decade of 2010-2018 (61). In the two remaining decades, these numbers were lower (48 in 1980-1989 and 54 in 2000-2009). In 14 out of 39 years, FFs were caused exclusively by low-severity FEs, mainly in the 2000-2009 decade, during which on average 84% of FFs per year were caused by these events. In general, it seems that the total number of FFs per decade is lower when the percentage of low-severity events is higher (as in the 1980-1989 and 2000-2009 decades). Nevertheless, throughout the period studied, the proportions between low- and high-severity FEs did not show any evident trend.
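The severity split and the linear trend can be sketched in Python as below. Two points are assumptions of this sketch rather than statements from the text: events with exactly 10 fatalities, which the wording above leaves unassigned, are placed in the high-severity group, and the linear trend is taken to be the slope of an ordinary least-squares fit of yearly values against the year. All function names are hypothetical and the yearly values in the usage line are invented.

```python
from typing import Sequence


def classify_severity(fatalities: int, threshold: int = 10) -> str:
    """Return "low" or "high"; exactly `threshold` deaths counts as "high" (assumption)."""
    if fatalities < 1:
        raise ValueError("an FE has at least one fatality")
    return "low" if fatalities < threshold else "high"


def linear_trend_slope(years: Sequence[float], values: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line through (year, value) pairs."""
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired observations")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    if sxx == 0:
        raise ValueError("all years are identical")
    return sxy / sxx


# Invented yearly counts increasing by two per year.
slope = linear_trend_slope([1980, 1981, 1982], [1, 3, 5])
```

A positive slope indicates an increasing tendency over the period; whether it is statistically meaningful would require a separate significance test, which this sketch does not attempt.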
At the scale of TOT-Area, the temporal trend of male FFs is stable, while that of female FFs is clearly increasing, even though this varies among the studied areas (Figure 7). CAT and POR, in line with their decreasing trend in FFs (Figure 4), show decreasing trends for both male and female FFs. The small increase in the FFs trend in SFR corresponds to an increasing trend of female FFs, while the male trend seems to be stable. On the other hand, the increasing trends in FFs in GRE, CZE and ITA are essentially due to an increasing share of male FFs. In TUR, a slightly decreasing trend of both genders can be noted.
According to the aforementioned data, the age of FFs tends to move towards higher ages (Figure 8). The FFs trend decreases for child and boy/girl, while it tends to increase in the other age classes. However, these tendencies must be analysed at the local scale and compared to the age structure of local populations, which differs among areas (Figure 5). For example, GRE and TUR show opposite population age structures: the share of elderly people is very large in GRE and very small in TUR. Nevertheless, a local analysis is not presented due to the scarcity of data in some of the areas studied.

Discussion of the EUFF 2020 database potential
In order to enrich the EUFF 2020 database, data have been systematically collected from various documentary sources, widely used to record damage and casualties caused by natural hazards. Characteristics and limitations of the documentary sources have been extensively discussed in the related literature (Brázdil et al., 2005; Brázdil et al., 2006; Caloiero et al., 2014; Petrucci et al., 2019). Data completeness highly depends on the year of occurrence of FEs: less recent FEs are less documented due to the scarcity of information sources in past years. On the other hand, FEs that occurred more recently (in the last two decades) are well documented thanks to a plethora of digital data sources (e.g. news websites). Besides the aforementioned differences in data availability and sources, the inhomogeneity among the study areas must also be taken into consideration. Due to national privacy laws, the information collected for each FF may have a different level of detail depending on the study area and country.
The completeness of the database also depends on event severity: FEs that resulted in the death of one or a few persons could remain unnoticed or poorly documented in documentary sources, while FEs causing several deaths are usually much better documented and covered in more detail. This issue is more acute when data collection is performed at a global scale. For example, global disaster loss databases, such as NATHAN (Natural Hazards Assessment Network) of the reinsurance company Munich Re, or EM-DAT (Emergency Events Database) from the Centre for Research on the Epidemiology of Disasters of the Université Catholique de Louvain, have proved useful for the identification of major events. However, these databases exclude FEs that caused a relatively low number of deaths: for example, in EM-DAT this number is 10. These thresholds are generally established based on severe disasters that happened in less developed countries, and thus would exclude the numerous fatalities due to FEs causing a few FFs, as nowadays frequently happens, mainly in the Mediterranean countries (Llasat et al., 2013). By applying this criterion to the EUFF 2020 database, 64.8% of FFs that occurred in the study areas during 1980-2018 (see Table 7) would have been missed.
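The effect of such an inclusion threshold can be illustrated with a short Python sketch that computes the share of fatalities belonging to events below the threshold. The function name is hypothetical, the per-event counts in the usage line are invented, and treating events with exactly 10 deaths as included is an assumption of this sketch; it does not reproduce the 64.8% figure, which requires the full per-event data.

```python
from typing import Sequence


def share_of_ffs_missed(ffs_per_event: Sequence[int], threshold: int = 10) -> float:
    """Percentage of fatalities in events with fewer than `threshold` deaths."""
    total = sum(ffs_per_event)
    if total == 0:
        raise ValueError("no fatalities")
    missed = sum(n for n in ffs_per_event if n < threshold)
    return 100.0 * missed / total


# Invented example: three small events (1, 2 and 5 deaths) and one of 12 deaths.
missed_pct = share_of_ffs_missed([1, 2, 12, 5])
```

In this toy case 8 of 20 fatalities fall below the threshold, so a threshold-based catalogue would retain the single large event and lose 40% of the deaths.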
Documentary sources are non-technical data sources: data obtained from them have an unconventional format and are not measurable with standard criteria. However, this is the only kind of data useful to examine the impact of natural hazards such as floods (Barriendos et al., 2019) and other types of natural hazards such as earthquakes, tsunami (Alam, 2019), drought, heat waves (Camenisch et al., 2020), landslides (García-Garrido et al., 2020) or even to investigate past rainfall variability (Nash et al., 2016).
Despite all the potential sources of bias, in many countries documentary sources are still the only way to overcome the absence of systematic data collections, which in principle should be maintained by local or national agencies. At the European level, there are no "official" databases collecting FF data for Member States. A first experiment, carried out in the Mediterranean environment by a multinational research group, was the MEFF (MEditerranean Flood Fatalities) database, which included FFs that occurred over a 36-year period in five Mediterranean study areas (Aceto et al., 2017; Petrucci et al., 2019b; Vinet et al., 2019). Having MEFF as a starting point, we subsequently created EUFF 2019 by extending both the study area and the period, which added value at both spatial and temporal scales. The updated EUFF 2020 database improved the data completeness of the previous version, as described in Section 2.4, in line with the main characteristic of documentary sources, namely their inherent "incompleteness". As extensively noted in the literature, research in documentary sources can never be considered fully complete, since new sources may become available or may be discovered with time. New data sources concerning old events can reveal previously unnoticed FEs or provide further details on FFs already recorded. This is why the database has been significantly updated using recently discovered data, e.g. by increasing the number of FFs for CZE, including more attributes of FFs for SFR, or updating the list of FFs for GRE. Even if the difference between the absolute numbers of FFs is relatively small, it is important to incorporate these data on the currently available variables on FFs. The importance of these variables for an in-depth understanding of what should be changed in order to substantially decrease flood risk for people must also be taken into account. In fact, although for the most severe FEs the number of FFs becomes known from newspapers or scientific and technical reports, additional useful information, such as the characteristics of flood victims or details on the conditions under which the fatal accidents took place, is not collected in any other database at the scale of the study areas. In general, details about the conditions and the characteristics of FFs are ephemeral: knowledge of them mainly depends on the presence of potential witnesses or survivors and is often "buried-lost" due to privacy reasons.
Hence, each additional detail that may become available must be carefully collected and used to further enrich the data, which can be the basis for further elaboration on the topic.
The novel element of this work, compared with the previous one (Petrucci et al., 2019), is its focus on trend analysis of the data. In this publication we present for the first time an analysis of the number of casualties per flood event, which is a proxy of FE severity that can be compared to climatic trends. We also present the temporal trend of gender (at local scale) and age (at TOT-Area scale) of FFs. We present data in a disaggregated format, to allow other researchers to easily detect specific characteristics of flood fatalities (e.g. age or gender) cross-tabulated with all the other variables, which can be used in other kinds of research. Because the data are cross-tabulated by gender and age, they allow easy comparison with similar results already available in the literature for different geographical frameworks (e.g. Australia and the USA). Finally, the possibility of extracting from the database data concerning a specific study area makes it possible to go more in depth from different points of view. From a sociological point of view, the relationships between FFs and demographic data and national development indices can be examined. From a hydrological-climatological-geomorphological point of view, the availability of year, month and day for all the 2483 FFs is a very consistent basis to set up a "threshold" analysis of the rain that caused flood fatalities in the specific geomorphological framework of the study area. From a geographical point of view, owing to the availability of coordinates of FFs, data can be used to map the spatial distribution of flood mortality through a geographical information system, as performed e.g. in Vinet et al. (2019). These possibilities seem very promising for improving knowledge on FFs in European countries.
Our future plans include adding more study areas, which will enlarge and diversify the geomorphological, hydrological, climatic and demographic frameworks in which relationships between local features and FFs can be investigated.
The first outcome of the current research concerns the gender of FFs. Even though females are commonly considered more vulnerable, in the EUFF 2020 database female victims are fewer in absolute numbers than males, except among the elderly. The general notion of female weakness during flood events is shaped by the higher vulnerability of females in underprivileged frameworks (Zoleta-Nantes, 2000). In South Asia, for example, the over-representation of female victims depends on five features: (1) women are more likely to stay at home rather than evacuate to a shelter; (2) their dress restricts their movements; (3) cultural shame deters women from escaping to public areas if their clothing is ripped; (4) their inability to swim, which is also a consequence of cultural norms; and (5) their being less well nourished (Yeo and Blong, 2010). Thus, gender alone is not a de facto driver of social vulnerability. It may become one, however, once it is correlated with age, occupation, access to health care (Alderman et al., 2012), and income. Notably, low-income women experience the worst effects of flooding (Ajibade et al., 2013). Moreover, low-income populations, regardless of gender, may suffer death and injury disproportionately, as highlighted, e.g., for communities of color in Texas from 1997 to 2001 (Zahran et al., 2008). The greater male vulnerability detected in EUFF 2020 can be related to the stronger exposure of males to floods and to the higher proportion of males who drive vehicles (Jonkman and Kelman, 2005), due to either their wider mobility or their outdoor working activities. In particular, males are still more numerous than females in outdoor work and, until recently, they also outnumbered females in rescue services (e.g., fire fighters, police, and defense forces) (Salvati et al., 2012). Men are also more prone to disregard standard safety rules, to take more risks and to put themselves in danger, e.g. to rescue people, belongings or pets.
In EUFF 2020, the majority of FFs were people aged between 30 and 64 years, thus in their most productive working years. This also explains the places and conditions of several FFs, who were exposed to floods outdoors while travelling from home to work or vice versa. By contrast, elderly (retired) people were more frequently affected indoors, while at home (Diakakis et al., 2020). Thus, elderly people are more frequently trapped by floods in their homes, while adults and children are dragged away outdoors (Haynes et al., 2015). In contrast to related studies (Zoleta-Nantes, 2000), our work did not detect any particular vulnerability features among children. In our study, the number of young victims appears to decrease throughout the study period.
Travelling by car or other vehicle was the most frequent condition of victims in each study area, for both males and females, as extensively reported in the literature, although our data did not allow us to test the interesting suggestion that outdoor incidents are more abundant in non-urban environments (Diakakis et al., 2020).
Regarding protective and hazardous behaviors, our data are scarce, but they do confirm the view accepted in the literature that males are more prone to risk by taking unnecessary actions (e.g., Salvati et al., 2018), although we did not detect actions influenced by drugs or alcohol, as highlighted by some authors (Franklin et al., 2017). Compared to males, females behaved in a less hazardous way.
Although we know the exact time of the accident for 28.1% of FFs, we did not examine the potential effect of this variable. In principle, the hour of the accident could be used to identify the light conditions at the moment of the accident and to explore how they affected its development, as investigated in the related literature (e.g. Špitalar et al., 2014). Because the study areas lie at different latitudes, deriving light conditions from the hour requires the local times of sunrise and sunset, an element that was beyond the scope of the current work.
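As an illustration of the light-condition analysis outlined above, the following is a minimal sketch and not part of the EUFF workflow. It assumes that the local sunrise and sunset times for the date and place of the accident are already known (for example from an astronomical almanac); the function name `light_condition` and the 30-minute twilight margin are hypothetical choices made only for this example.

```python
from datetime import datetime, timedelta


def light_condition(accident, sunrise, sunset, twilight_minutes=30):
    """Classify an accident time as 'daylight', 'twilight' or 'dark'.

    All three datetimes must be expressed in the same local time.
    The twilight margin is an illustrative assumption, not a EUFF rule.
    """
    margin = timedelta(minutes=twilight_minutes)
    if sunrise <= accident <= sunset:
        return "daylight"
    before_sunrise = sunrise - margin <= accident < sunrise
    after_sunset = sunset < accident <= sunset + margin
    if before_sunrise or after_sunset:
        return "twilight"
    return "dark"


# Hypothetical example: sunrise at 07:00 and sunset at 19:00 local time.
sunrise = datetime(2000, 10, 1, 7, 0)
sunset = datetime(2000, 10, 1, 19, 0)
condition = light_condition(datetime(2000, 10, 1, 19, 20), sunrise, sunset)
```

Taking sunrise and sunset as inputs keeps the latitude-dependent part of the problem outside the function, so that any source of local sunrise and sunset times could be used.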
Finally, our data series did not confirm the decreasing trend in FEs with multiple fatalities and the increasing trend in FEs with only a few fatalities (e.g., Diakakis, 2016; Pereira et al., 2017). FEs of both severity levels continue to occur throughout the studied period without any particular trend, even if TUR seems to be the area most frequently affected by FEs resulting in the death of more than 10 people.
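The distinction between the two severity levels of FEs discussed above can be sketched as follows. This is a minimal illustration, not the procedure used to build EUFF: it assumes that each fatality record carries a year and an event identifier (both field choices are hypothetical), and it borrows the threshold of more than 10 deaths mentioned above as the cut-off for severe events.

```python
from collections import Counter, defaultdict


def severe_events_per_year(records, threshold=10):
    """Count, for each year, flood events above and below a fatality threshold.

    `records` is an iterable of (year, event_id) pairs, one pair per fatality.
    The pair layout and the default threshold are illustrative assumptions.
    """
    # Number of fatalities attributed to each (year, event_id) pair.
    fatalities_per_event = Counter(records)
    per_year = defaultdict(lambda: {"severe": 0, "other": 0})
    for (year, _event_id), n_fatalities in fatalities_per_event.items():
        key = "severe" if n_fatalities > threshold else "other"
        per_year[year][key] += 1
    return {year: dict(counts) for year, counts in sorted(per_year.items())}
```

The yearly counts returned by such a function could then be inspected for trends in each severity class separately.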
Lastly, we would like to underline that the term "European" is ambitious, given that EUFF 2020 deals with only eight countries. Adding further study areas or countries is not a trivial task, especially now that this kind of research has no dedicated funding. To add a new study area from scratch, a systematic survey of local data sources covering 39 years has to be performed. This work must necessarily be performed by native researchers, since they can rapidly, easily and effectively analyse the huge number of sources from which the required information on FFs can be extracted. Hence, researchers who already use documentary sources to study natural hazards may perform this work more easily. Researchers who have already collected documentary sources depicting the historical series of floods in their country could straightforwardly re-analyse the original sources to extract data on FFs, and then look for missing details in coeval data sources. For all the aforementioned reasons, new study areas cannot be selected on the basis of, for example, geographical criteria (e.g. by selecting areas at a certain latitude or longitude). Our aim is to take every opportunity to find and collaborate with more partners towards common goals.

Data availability
The EUropean Flood Fatalities (EUFF) database 1980-2018 (updated) is available in the 4TU Centre for Research Data (Petrucci et al., 2020, https://doi.org/10.4121/uuid:489d8a13-1075-4d2f-accb-db7790e4542f). The EUFF 2020 database collects 2483 FFs that occurred during a 39-year period in 9 study areas. It includes three files: a) a comma-separated values (csv) file, which contains the data; b) a keyhole mark-up language (kml) file, which provides the location of fatalities in Google Earth; and c) a readme (txt) file, which describes the database structure.
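As a hedged illustration of how the disaggregated csv file might be cross-checked by gender and age, the sketch below counts fatalities per gender and age class. The column names `gender` and `age` and the age-class boundaries are assumptions made for this example only; the actual field names and coding are those documented in the readme file of the database.

```python
import csv
import io
from collections import Counter


def age_class(age):
    """Map an age in years to an illustrative class (bin edges are assumptions)."""
    if age < 15:
        return "0-14"
    if age < 30:
        return "15-29"
    if age < 65:
        return "30-64"
    return "65+"


def crosstab_gender_age(csv_file):
    """Count fatalities per (gender, age class); rows without a numeric age are skipped."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        age = (row.get("age") or "").strip()
        if not age.isdigit():
            continue  # age unknown or not recorded
        gender = (row.get("gender") or "").strip()
        counts[(gender, age_class(int(age)))] += 1
    return counts


# Hypothetical sample standing in for the real file.
sample = io.StringIO("gender,age\nM,45\nF,70\nM,\nM,52\n")
table = crosstab_gender_age(sample)
```

For the real database, the open csv file would be passed in place of the in-memory sample, after adapting the column names to those given in the readme.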

Conclusions
In the current work, the EUFF 2020 database has been introduced, recording data on flood fatalities for 9 areas in 8 European countries for the 1980-2018 period. The potential of the database for analysing various aspects of the structure of the victims of flood events has also been shown. The main conclusions can be summarised as follows:
(i) The studied European countries follow various practices for collecting data on flood fatalities at national or regional levels. Similarly, various sources of documentary evidence exist. For example, media information (newspapers, TV, internet reports) has been among the main sources of such records in the past decades.
(ii) Despite the limitations of the documentary sources, mainly underreporting, bias towards the most severe events and inhomogeneity from one country to another, EUFF 2020 sheds light on the fatal flood events that occurred in the eight studied countries of the Euro-Mediterranean region.
(iii) The EUFF 2020 database supports both regional and super-regional analyses of flood fatalities with respect to gender, age, conditions, activity of the victims and dynamics of the accidents, thus contributing to a better understanding of human exposure to floods and of its most serious potential consequence, death.
(iv) The EUFF 2020 database, with its great potential to be extended spatially and temporally, represents a unique European database of high scientific and practical value. One aspect of the database is its vitality, as we continuously strive to improve it by extending the study period and enlarging the domain. The EUFF 2020 database will hopefully motivate and encourage more researchers to enrich it with data on FFs available in their own countries.
Further spatial and temporal extension of EUFF 2020 may allow the comparison of different local frameworks at a broader European scale and the identification of general and local features useful in risk management and educational campaigns. This will subsequently allow the assessment of the effects of climate change, demographic differences, and economic and technological developments on these fatalities and their temporal and spatial variability. We believe that the pan-European approach followed here not only frames the anticipation of flood fatality risk in a broader context but also promises benefits for diverse scientific disciplines.
Floods and their fatal consequences can be mitigated more efficiently by developing a holistic view and by building safety measures through the sharing of useful experiences, best practices and lessons learned among European countries. To this end, EUFF 2020 may contribute to shaping public policies and civil protection campaigns, reducing the impact of floods and hopefully minimizing the number of future fatalities.