Topo-bathymetric and oceanographic datasets for coastal flooding risk assessment: French Flooding Prevention Action Program of Saint-Malo

The French Flooding Prevention Action Program of Saint-Malo requires an assessment of coastal flooding risks. The first prerequisite is knowledge of the topography and bathymetry of the bay of Saint-Malo. In addition to existing topo-bathymetric data, new multibeam bathymetric data were acquired. The combination of these datasets allows the generation of two high-resolution topo-bathymetric digital terrain models. Then, to understand the hydrodynamic conditions which cause coastal flooding, a dense and extensive oceanographic field experiment was conducted. Oceanographic data were acquired using a network of 22 moorings with 37 sensors during winter 2018-2019. The network included 2 directional buoys, 2 pressure tide gauges, 18 wave pressure gauges, 4 single-point current meters, 7 current profilers and 4 acoustic wave-current profilers, from mid-depth (25 m) up to the upper beach and the dike system. The oceanographic dataset provides an overview of hydrodynamics in Saint-Malo bay and of the wave processes leading to coastal flooding. The combination of high-resolution topo-bathymetric and oceanographic datasets provides a unique capability for model validation and process studies. The topo-bathymetric and oceanographic datasets are freely available at https://doi.org/10.17183/MNT_COTIER_GNB_PAPI_SM_20m_WGS84, https://doi.org/10.17183/MNT_COTIER_PORT_SM_PAPI_SM_5m_WGS84 and https://doi.org/10.17183/CAMPAGNE_OCEANO_STMALO.

of sand and rocky areas in the shallow bay, and a mixture of gravel and pebbles offshore (Bonnot-Courtois et al. 2002) making the dissipation of the wave by bottom friction non-homogeneous in the bay.
The meteorological conditions are characterized by the passage of low pressure systems and cold fronts (Gaspar et al. 2007).
These weather conditions generate storm surges and significant wave fields. Combined with high spring tides, these events can

To improve the knowledge of coastal flooding risks in Saint-Malo bay, an extensive bathymetric and oceanographic campaign was performed in winter 2018-2019. These campaigns allowed the creation of topo-bathymetric and oceanographic datasets.

Topo-bathymetric dataset
The purpose of the following sections is to present the different data used for the generation of the TBDTMs (Figure 1.a), the most common problems experienced when combining data and the approach adopted by Shom for their resolution (Maspataud et al., 2015; Biscara et al., 2016).

The hydrographic vessels are equipped with a Kongsberg-Maritime EM710 multibeam echo sounder associated with the SIS acquisition system. Depending on the vessel, sound velocity profiles were measured using Sippican XBT probes or a Valeport SVP1000 Sound Velocity Profiler to correct bathymetric data for local variations in sound speed. Real-time GPS positioning was collected with the Applanix POS MV inertial unit. Positioning data were post-processed in the POSPac software using global navigation satellite system (GNSS) solutions. Horizontal positions were referenced to the WGS84 or ITRF2014 geodetic systems at the time of the survey.
All the bathymetric data acquired were processed with the CARIS HIPS version 9.1 software. Tidal corrections were made based on the data from the tidal reference station of Saint-Malo and additional tide gauges specifically deployed for the surveys. The soundings were vertically referenced to the chart datum of Saint-Malo. Subset editing was carried out to remove systematic errors and outliers. Processed and cleaned data were subjected to final validation by a qualified hydrographer.

Litto3D® program
The LIDAR surveys exploited in this study were carried out within the framework of the Litto3D® program. This national program is based on a partnership between Shom and the French National Geographic Institute (IGN) (Louvart and Grateau, 2005). It aims to provide very high resolution coastal altimetric models of metropolitan and overseas French coasts (Pastol, 2011).
Coastal mapping of the Normand-Breton Gulf was performed by Shom's Litto3D® team between 2016 and 2018, covering approximately 700 km² and reaching up to 18 m water depth (Figure 1.c). Topo-bathymetric data were acquired from a Cessna Grand Caravan 208B type aircraft equipped with an airborne topo-bathymetric lidar HawkEye III double hatch (Leica Geosystems). The data were acquired relative to the ellipsoid and referenced horizontally with respect to RGF93 in the standard UTM 30N projection. The trajectory of the aircraft was based on the GNSS system and processed by Inertial Explorer.
The trajectory data were post-processed using the stations of the RBF (Réseau de Base Français). The points were generated from the processed waveform with Lidar Survey Studio (LSS) and the point cloud was processed using PFMABE version 6.4.0.43 tools. The cleaned data were then validated by qualified hydrographers. Finally, the produced data were converted to the IGN69 altimetric reference frame and to the Lambert 93 projection with the Circe Batch conversion tool (V4-3, using the RAF09 model).

Shom's bathymetric database
Complementary bathymetric data used to generate the TBDTMs were extracted from the Shom's bathymetric database (BDBS).
They originate from 52 hydrographic surveys (490 million soundings) conducted between 1829 and 2019 with different sounding methods (lead-lines, single-beam and multibeam echosounders). The spatial coverage of each survey is represented as a vector polygon layer (hereafter called the bounding polygon) that may adjoin, overlap or supersede older hydrographic surveys.
Each survey extracted from the BDBS is associated with metadata, including the acquisition and processing methods, the survey order and the quality of the data.

Other data
In addition to these bathymetric data, a bathymetric survey of the inner harbor delivered by the harbor authority of Saint-Malo was used in the present study. The bathymetric survey was carried out in June 2016 by the company GEOXYZ with a multibeam echosounder. Soundings were vertically referenced to the chart datum of Saint-Malo. This bathymetric source was evaluated prior to its integration with the other datasets.
The RGE ALTI® V2.0, produced by IGN, was used exclusively for the terrestrial domain. The data are available on IGN's data portal (https://geoservices.ign.fr) in the RGF93 geodetic system, Lambert 93 projection. The vertical datum of the data corresponds to the NGF-IGN69 legal system (IGN, 2018). The RGE ALTI® V2.0 products used in the TBDTMs cover the departments of Côtes d'Armor, Ille et Vilaine and Manche at a resolution of 5 m. The data were clipped with a buffer extending 3 km inland. Water-surface values were also eliminated using the raster layer of sources provided with the DTMs.
2.2 Production process

2.2.1 Convert data to a common horizontal and vertical datum
The key requirement for creating a seamless merged product is the homogeneity of the input datasets in terms of horizontal and vertical datum (Gesch and Wilson, 2001). The vertical transformation to the ellipsoid was performed with Circe 5.1 France (IGN) and Bathyelli V2.0 (Shom) for topographic and bathymetric data, respectively.

Selecting the most reliable source from multi-temporal and multi-sensor data is a fundamental challenge addressed in numerous works (Macnab and Jakobsson, 2000; Wong et al., 2007; Maspataud et al., 2015). This issue is particularly acute for hydrographic offices, whose data legacy may be substantial: Shom's hydrographic knowledge presently comprises more than 10,000 bathymetric surveys conducted between 1816 and 2019, with many areas characterized by overlapping surveys. Survey depth measurements were collected with different sounding methods and positioning systems whose accuracy has improved over time. Moreover, the data may span decades, introducing temporal and geomorphologic change as a source of error (Eakins et al., 2011).
To date, the selection of the most reliable surveys (i.e. conflict resolution) for the production of nautical charts or TBDTMs was performed manually. This step is particularly time-consuming, especially in regularly surveyed areas (Maspataud et al., 2015). To limit this tedious work, Shom initiated in 2019 the construction of a bathymetric reference layer (Figure 2). The conflict resolution of overlapping surveys is executed using the attributes of survey acquisition date and survey status, the latter defining whether a survey supersedes or completes older ones (Figure 2.b). For a survey with a "supersede" status, the deconfliction process is executed by clipping all older overlapping surveys with the bounding polygon of the reference survey. No clipping is done for surveys with a "complete" status (Figure 2.b). The resulting layer, called "Téthys", is intended to represent the most relevant bathymetric knowledge available at Shom and will be regularly updated on the basis of newly integrated surveys (Figure 2.c). This reference layer will benefit numerous Shom activities.

The generation of TBDTMs in the Normand-Breton Gulf benefited from the reliability of all metadata and bounding polygons in the area of interest, which constitute the preliminary step prior to the construction of the Téthys layer. The deconfliction process was executed on all datasets used for the generation of the TBDTMs using a semi-automated procedure based on GMT routines.
For the datasets not integrated in the BDBS, for which no status exists, the status was defined directly by the operator on the basis of criteria inherent to the data. The result of the deconfliction process corresponds to the most reliable soundings that can be used as input for the surface modeling.
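The clipping rule can be illustrated at the level of individual soundings. The sketch below is a simplified stand-in and not Shom's GMT-based procedure: the `Survey` record, its field names and the ray-casting point-in-polygon test are all assumptions made for illustration. A sounding from an older survey is dropped when it falls inside the bounding polygon of a more recent survey with a "supersede" status; surveys with a "complete" status never remove anything.

```python
from dataclasses import dataclass, field


@dataclass
class Survey:
    """Hypothetical survey record; field names are illustrative only."""
    name: str
    date: int                      # acquisition year
    status: str                    # "supersede" or "complete"
    polygon: list                  # bounding polygon as [(x, y), ...]
    soundings: list = field(default_factory=list)  # [(x, y, depth), ...]


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def deconflict(surveys):
    """Keep the soundings not covered by a more recent 'supersede' survey."""
    kept = []
    for survey in surveys:
        newer = [s for s in surveys
                 if s.status == "supersede" and s.date > survey.date]
        for x, y, depth in survey.soundings:
            if not any(point_in_polygon(x, y, s.polygon) for s in newer):
                kept.append((survey.name, x, y, depth))
    return kept


# Made-up example: a recent multibeam survey supersedes part of an old one.
old = Survey("lead-line", 1950, "complete",
             [(0, 0), (10, 0), (10, 10), (0, 10)],
             [(2, 2, 5.0), (8, 8, 7.5)])
new = Survey("multibeam", 2018, "supersede",
             [(0, 0), (5, 0), (5, 5), (0, 5)],
             [(1, 1, 5.2)])
result = deconflict([old, new])
```

In this example the old sounding at (2, 2) is discarded because it lies inside the footprint of the superseding survey, while the one at (8, 8) is retained. Working on polygons rather than points, as described in the text, avoids testing every sounding individually.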

Interpolation
Because multiple sources of data contribute to the construction of the DTM, some datasets have a data point spacing larger than the required cell size. Spline functions are generally used for their efficiency in honouring variable-density data while providing a representative, smooth and continuous surface. They may be more appropriate for large interpolation distances, which are frequently required for bathymetric data (Amante, 2012, and references therein). Based on these observations, Shom used the SAGA software package (System for Automated Geoscientific Analyses; Conrad et al., 2015) for the generation of the TBDTMs. The multilevel B-spline interpolation tool was used to perform surface modelling of the compiled data.

Altimetric conversion grids
Following NOAA's previous works (Eakins and Taylor, 2010; Eakins et al., 2011), different altimetric datum grids were developed by Shom to convert the TBDTMs from the ellipsoid to other tidal datums (Lowest Astronomical Tide and Mean Sea Level).
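Conceptually, such a conversion subtracts, at each DTM node, the ellipsoidal height of the target datum surface from the ellipsoidal height of the terrain. The following minimal sketch assumes a regular separation grid and bilinear interpolation; the grid values, the grid geometry and the function names are hypothetical and do not reproduce Shom's conversion grids.

```python
def bilinear(grid, x0, y0, dx, dy, x, y):
    """Bilinearly interpolate a regular grid (rows = y, columns = x) at (x, y).

    (x0, y0) is the position of grid[0][0]; dx and dy are the node spacings.
    The point is assumed to lie inside the grid.
    """
    fx = (x - x0) / dx
    fy = (y - y0) / dy
    i = min(int(fx), len(grid[0]) - 2)   # column of the lower-left node
    j = min(int(fy), len(grid) - 2)      # row of the lower-left node
    tx, ty = fx - i, fy - j
    bottom = grid[j][i] * (1 - tx) + grid[j][i + 1] * tx
    top = grid[j + 1][i] * (1 - tx) + grid[j + 1][i + 1] * tx
    return bottom * (1 - ty) + top * ty


def ellipsoid_to_datum(h_ellipsoid, separation):
    """Height above the target datum = ellipsoidal height of the point
    minus the ellipsoidal height of the datum surface at that location."""
    return h_ellipsoid - separation


# Made-up separation grid: ellipsoidal height of LAT (m) on a 2 x 2 grid
# with 1000 m spacing.
lat_grid = [[42.0, 42.2],
            [42.4, 42.6]]
sep = bilinear(lat_grid, x0=0.0, y0=0.0, dx=1000.0, dy=1000.0, x=500.0, y=500.0)
z_lat = ellipsoid_to_datum(50.0, sep)   # terrain node at 50 m above the ellipsoid
```

The same function applies to a Mean Sea Level grid; only the separation surface changes.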

Evaluation
DTM coherency is evaluated based on visual inspection (slope, cross-section and 3D views), through additional layers (density, sources diagram) and, if possible, through cross-validation of the DTM using datasets that have not been incorporated into the generated product due to distribution constraints. Despite the processing efforts and the deconfliction process, erroneous representation of the sea-floor may remain. Preliminary versions of the TBDTMs highlighted two different types of artifacts:
-The overlapping of some bathymetric surveys with a "complete" status, which leads to a noisy representation of the seafloor. For this type of problem, the status of the identified surveys was modified by the operator to generate a smooth surface.
-Oscillation effects characterized by edge effects or topographic creep generated by spline interpolation into unsurveyed marine areas (Eakins and Grothe, 2014; Danielson et al., 2016). These unwanted surface artefacts can be reduced by locally using an appropriate tension factor.
As long as anomalies are detected, their cause must be determined and data reprocessed prior to a new interpolation. These different steps must be repeated iteratively until a satisfactory result is reached (Eakins and Taylor, 2010).

Oceanographic dataset
The oceanographic dataset includes water level, current and sea-state observations from extensive oceanographic surveys conducted by Shom during winter 2018-2019. These data were recorded by different sensor types. The following sections describe the oceanographic surveys and data processing.
-Eighteen wave pressure gauges (Ocean Sensor System Inc., ten OSSI-010-003C and eight OSSI-010-022, hereafter referred to as OSSI and OSSI-NEW, respectively)
The moorings were located to accurately describe hydrodynamic conditions from offshore to the coastline (Figure 3). The Datawell buoys were deployed offshore at 25 m water depth (Sauvages and Trouvée, Figure 3.a), providing information on the incident wave fields. Other moorings were essentially deployed along four cross-shore transects (T1, T2, T3 and T4, Figure 3.b). Around transects T1, T2 and T3, the beach profile is characterized by a gentle foreshore slope (1.1-1.2°, Figure 3.c). It increases slightly (1.6°) on the eastern part of the Minihic beach (T0 sensor). Transect T4 is located in the vicinity of the Grand Bé islet and is characterized by high slope variations due to the presence of the Rance estuary (Figure 3.b). Two moorings, with OSSI and AquaPro, were deployed in the Rance estuary (Bizeux and Aleth, Figure 3.b). Table 1 summarizes the location of each mooring and sensor. For consistency with the TBDTMs, the vertical datum used is the Lowest Astronomical Tide (LAT).

Sensor settings
Sensors deployed during the campaign were programmed to accurately record the oceanographic data, taking into account battery and data storage limitations. Sensor settings in terms of measured data, sampling rate, averaging interval and measurement interval are summarized for each sensor in Table 2.

Figure 5 shows interruptions in data acquisition for the OSSI-NEW gauges in transects T2 and T3. These interruptions were caused by the batteries of the OSSI-NEW gauges being less efficient than anticipated (as mentioned in Table 2). For moorings on the foreshore, a brief interruption in data acquisition occurred from 22/01/2019 to 24/01/2019. This interruption was planned for a battery change. The "La Plate" AWAC sensor did not acquire any data during the campaign (the sensor power cable was disconnected). For the T3-4 mooring, no data were recovered because the mooring could not be retrieved by the scuba divers; it had probably been stolen. Despite these incidents, the density and complementarity of the instruments used during the campaign provide a description of the hydrodynamics in Saint-Malo bay. The data collection covers a winter period of more than 4 months.

Data processing
The technical processes involved in generating the oceanographic dataset are as follows:
-Pre-process binary data using the manufacturers' software.
-Process the data using the manufacturers' software, Shom software (TDB) and a Python toolbox developed for this study.
-Write data and metadata to NetCDF format.

Water levels
The water levels are monitored by pressure sensors: tide pressure gauges (SBE) and wave pressure gauges (OSSI). These sensors record bottom pressure at different frequencies, bursts and acquisition intervals (Table 2).
The tide pressure gauges record the average bottom pressure every 20 minutes with a 2-min burst. First, the raw tide data file was extracted in ASCII format with the manufacturer's Seaterm software (Sea-Bird). Then, the raw data were processed by the TDB software. The raw data were converted into water level assuming hydrostatic equilibrium (Eq. 1). Finally, the processed data were subjected to final validation by a qualified hydrographer.
$$h = \frac{P_m - P_{atm}}{\rho g} \qquad (1)$$
with $h$ the water height above the sensor in m, $P_m$ the bottom pressure measurement in Pa, $P_{atm}$ the atmospheric pressure extracted from the ERA5 atmospheric reanalysis in Pa, $\rho = 1026$ kg m$^{-3}$ the averaged water density measured in Saint-Malo bay and $g = 9.81$ m s$^{-2}$ the gravitational acceleration.
The wave pressure gauges record bottom pressure at high frequency. The bottom pressure was recorded directly in ASCII format. First, the raw pressure data of each sensor were calibrated using the pressure slope and offset of the sensor. For foreshore sensors, the pressure sensor was calibrated by comparing the pressure measured when the sensor was out of the water with the atmospheric pressure. For offshore sensors, the pressure sensor was calibrated at the end of the recording, when the instrument was taken out of the water, or in the laboratory. Then, the corrected pressure was converted into water level assuming hydrostatic equilibrium (Eq. 1). Finally, to reconstruct the water level for long waves such as the tide, the water level was smoothed with a 10-min moving average to filter out deformations of the water surface related to short waves.
To allow comparison of water levels recorded by different instruments, the levels were vertically adjusted with respect to the tide gauge of Saint-Malo harbour. The correction factors (slope and offset) relative to the Saint-Malo tide gauge are provided in the dataset for each sensor. The accuracy of the water levels is of the order of ±5 cm. For consistency with the TBDTMs, the vertical datum used is the Lowest Astronomical Tide (LAT).
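The hydrostatic conversion (Eq. 1) followed by the 10-min smoothing can be sketched as below. This is an illustration rather than the processing chain used for the dataset: the sampling frequency `fs`, the synthetic pressure series and the handling of the series ends (shrinking window) are assumptions.

```python
RHO = 1026.0   # water density (kg m-3), value given in the text
G = 9.81       # gravitational acceleration (m s-2)


def pressure_to_level(p_m, p_atm, rho=RHO, g=G):
    """Hydrostatic water height (m) above the sensor, from pressures in Pa."""
    return (p_m - p_atm) / (rho * g)


def moving_average(values, window):
    """Centred moving average; the window shrinks near the ends of the series."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


# Synthetic example: 5 m of water with a +/- 0.2 m short-wave oscillation.
fs = 2.0            # hypothetical sampling frequency (Hz)
p_atm = 101325.0    # atmospheric pressure (Pa), constant here for simplicity
pressures = [p_atm + RHO * G * (5.0 + 0.2 * (-1) ** i) for i in range(4800)]
levels = [pressure_to_level(p, p_atm) for p in pressures]
tide = moving_average(levels, window=int(600 * fs))   # 10-min window
```

With real data, `p_atm` would be a time series interpolated to the sensor time stamps rather than a constant.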

Currents
The currents are monitored by acoustic Doppler instruments: single-point current meters (Aquadopp) and current profilers (AquaPro and AWAC). These sensors record velocities along the axes of their three beams. The raw binary data files were processed directly by the TDB software. First, the velocity data are reprojected into an Earth-referenced frame, on the north-east-vertical axes. Then, measurements close to the surface that are contaminated by the side lobes of the acoustic beams are removed. The processed data were subjected to final validation by a qualified hydrographer. Prior to deployment, the magnetic compasses of these current meters were calibrated on a dedicated Shom platform. The calibration procedure and the uncertainty of the current data are described in Menn and Morvan (2020). Finally, the current data and metadata of each sensor are written in NetCDF format.
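The rejection of near-surface cells can be illustrated with the usual geometric rule for side-lobe contamination: for beams inclined at an angle θ from the vertical, surface echoes contaminate ranges beyond D cos θ, where D is the distance between the transducer and the surface. This rule, the default 25° beam angle and the function names below are assumptions made for illustration; the criterion actually applied by the TDB software is not described in the text.

```python
import math


def max_valid_range(depth_above_sensor, beam_angle_deg=25.0):
    """Largest range (m) from the transducer free of surface side-lobe echoes."""
    return depth_above_sensor * math.cos(math.radians(beam_angle_deg))


def mask_contaminated(cell_ranges, velocities, depth_above_sensor,
                      beam_angle_deg=25.0):
    """Replace velocities of cells beyond the valid range with None."""
    limit = max_valid_range(depth_above_sensor, beam_angle_deg)
    return [v if r <= limit else None
            for r, v in zip(cell_ranges, velocities)]


# Made-up profile: six cells below 10 m of water above the instrument.
ranges = [1.0, 3.0, 5.0, 7.0, 9.0, 9.5]      # cell-centre ranges (m)
speeds = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]      # velocities (m/s)
masked = mask_contaminated(ranges, speeds, depth_above_sensor=10.0)
```

Because the water depth changes with the tide, the mask has to be recomputed for each profile from the instantaneous water level.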

Sea states
The sea states are monitored by three different sensor types: directional wave buoys, wave pressure gauges, and acoustic wave and current profilers. Each sensor type requires specific processing.
The directional wave buoys recorded wave motion in three directions. The raw displacement data were computed directly aboard the buoys. First, the buoy's displacement measurements were low-pass and high-pass filtered to give the three-dimensional buoy motion in the frequency range of 0.01-0.64 Hz. Then, the spectral data were computed from the heave-north-west displacements. Finally, the statistical wave parameters were computed using the spectral data. A quality control based on the skewness and kurtosis of the vertical displacement was carried out. Each 30-minute data burst with |Skewness| > 0.3 and Kurtosis > 5 was marked with a quality control flag (QC=0).
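The quality-control rule on the vertical displacement can be sketched as follows. Two points are assumptions: the kurtosis is taken as the non-excess (Pearson) kurtosis, equal to 3 for a Gaussian signal, and a flag value of 1 is used for bursts that pass the test, since the text only specifies QC=0 for flagged bursts.

```python
import math


def skewness_kurtosis(samples):
    """Return (skewness, Pearson kurtosis) from central moments."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def qc_flag(heave, max_abs_skew=0.3, max_kurt=5.0):
    """Return 0 when |skewness| > 0.3 and kurtosis > 5, otherwise 1."""
    skew, kurt = skewness_kurtosis(heave)
    return 0 if (abs(skew) > max_abs_skew and kurt > max_kurt) else 1


# Synthetic bursts: a regular swell-like signal and a record with one spike.
clean = [math.sin(2 * math.pi * i / 50) for i in range(1000)]
spiky = [0.0] * 99 + [10.0]
flags = [qc_flag(clean), qc_flag(spiky)]
```

A sinusoidal record has zero skewness and a kurtosis of 1.5, so it passes; the isolated spike produces both a large skewness and a large kurtosis and is flagged.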
The wave pressure gauges are used in Sect. 3.2.1 to monitor the water level through hydrostatic equilibrium. The hydrostatic assumption is not sufficient for monitoring short waves, because short waves distort the surface elevation through non-hydrostatic effects (wave motion and roller structure, as described in Martins et al. (2020)). The non-hydrostatic pressure signal related to the wave motion (dynamic pressure) is not entirely measured, because the wave-induced bottom pressure is attenuated exponentially with depth. Therefore, a specific methodology must be applied to reconstruct the short-wave-induced surface elevation. The surface elevation related to the roller cannot be reconstructed by these methods. For this dataset, we use three different reconstruction methods: the hydrostatic method, the linear method and the non-linear weakly dispersive method (see Appendix A1). The pressure records were separated into 20-min bursts and each burst was linearly detrended to remove tidal motion. The reconstruction methods were then applied to reconstruct the surface elevation. Statistical wave parameters were computed by spectral analysis and wave-by-wave analysis (see Appendix A2).
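The burst separation and linear detrending step can be sketched as follows. The sampling frequency and the synthetic record are hypothetical, and incomplete trailing bursts are simply discarded here, which is an assumption rather than a documented choice.

```python
def split_bursts(samples, fs, burst_minutes=20):
    """Cut a record into consecutive complete bursts of burst_minutes."""
    n = int(burst_minutes * 60 * fs)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def detrend_linear(burst):
    """Remove the least-squares straight line from a burst."""
    n = len(burst)
    t_mean = (n - 1) / 2
    y_mean = sum(burst) / n
    s_ty = sum((i - t_mean) * (y - y_mean) for i, y in enumerate(burst))
    s_tt = sum((i - t_mean) ** 2 for i in range(n))
    slope = s_ty / s_tt
    return [y - (y_mean + slope * (i - t_mean)) for i, y in enumerate(burst)]


# Synthetic record: a slowly rising water level sampled at a hypothetical 1 Hz.
fs = 1.0
record = [3.0 + 0.01 * i for i in range(3000)]
bursts = split_bursts(record, fs)
residuals = [detrend_linear(b) for b in bursts]
```

Over 20 minutes the tide is close to linear in time, which is why a straight-line fit is sufficient to separate it from the short-wave signal.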
The acoustic wave and current profilers record the surface elevation from Acoustic Surface Tracking (AST) and the current in wave cells at high frequency. The raw data are processed directly using the manufacturer's Quickwave software. The high-frequency surface elevation combined with current measurements allows the reconstruction of a directional wave spectrum using the SUV method (Pedersen et al., 2005). Directional and non-directional bulk wave parameters were computed from the directional wave spectrum for the 0.025-0.5 Hz band.

Overview of oceanographic processes
Figure 5 (A-C) shows an overview of the temporal evolution of the metocean conditions during the oceanographic field experiment.
During this experiment, several low-pressure systems affected Saint-Malo. These storm events produced skew surges from 22 cm to 61 cm and offshore significant wave heights from 2.29 m to 4.28 m. These storms occurred principally during neap tides.
No coastal flooding was observed during this experiment. The storm events and two major spring tides are reported in Table 3 with the associated metocean conditions.
The wave transformation induces further processes affecting water level elevation, such as wave set-up and infragravity waves (Dodet et al., 2018). Figure 5.D shows the wave set-up measurement (between sensors T1-1 and T1-2) and the infragravity significant wave height (between 0.004 and 0.04 Hz) at T1-1. The maximum wave set-up was measured during the 09/12/2018, 27/01/2019 and 10/06/2019 storms, with wave set-up measurements of 27 cm, 32 cm and 24 cm, respectively. The maximum infragravity significant wave heights also occurred during these storms, with heights greater than 50 cm. Table 3 indicates the wave set-up and the short-wave and infragravity significant wave heights at T1-1 for each storm and spring tide event.
Figure 6 shows the wave set-up measurement (between sensors T1-1 and T1-2) versus the offshore significant wave height. The wave set-up increases with the offshore significant wave height, as expected. The water level plays a major role in the evolution of this wave process. The highest wave set-ups are observed at rising and falling tide. At high tide (sea levels > 12 m), the observed wave set-up is 10 cm for offshore waves of 4 m.

Data availability
The TBDTMs (SHOM, 2020a, b) and oceanographic datasets (SHOM, 2021)
-the citation and an associated Digital Object Identifier (unique identifier used to cite scientific articles and datasets) to easily identify the future multiple uses of the DTM;
-the rights and contents report describing the main features of the product and its limitations of use.

The oceanographic dataset is available at three levels of processing:
-L0: direct output of sensors in binary or ASCII format.
-L1: data pre-processed using the manufacturers' software, in ASCII format.
-L2: processed data and metadata in NetCDF format.
The oceanographic dataset was plotted for a visual quality check and is available as "quick looks" on the repository.

Topo-bathymetric and oceanographic datasets can be used to build, validate and calibrate hydrodynamic models. In particular, offshore wave measurements can be useful to force local high-resolution models or to validate low-resolution spectral wave models. Thus, the long series of measurements in Saint-Malo bay will allow such models to be validated.
Despite the increasing number of satellites monitoring the coastal variables at stake in flood risks, this type of dataset, even if costly and limited in space and time, allows the characterization of processes on a shorter scale, and remains necessary and complementary to satellite datasets, which are discontinuous and not very accurate in coastal areas.
Appendix A

A1 Surface elevation reconstruction method from bottom pressure
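For this dataset, we use three different reconstruction methods:
(i) The hydrostatic reconstruction (Eq. A1):
$$\zeta_h = \frac{P_m - P_{atm}}{\rho g} + \delta_m - h_0 \qquad (A1)$$
with $\zeta_h$ the hydrostatic water level in m, $h_0$ the mean water depth in m and $\delta_m$ the sensor's distance from the bed in m.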
(ii) The linear reconstruction based on a transfer function (Eq. A2) is the most commonly used method (Bishop and Donelan, 1987). This method consists of reconstructing the surface wave elevation from linear wave theory:
$$F[\zeta_l](\omega) = K_p(\omega)\, F[\zeta_h](\omega) \qquad (A2)$$
with $\zeta_l$ the linearly reconstructed surface elevation, $F[\cdot]$ the Fourier transform and $K_p(\omega)$ the correction factor:
$$K_p(\omega) = \frac{\cosh(k h_0)}{\cosh(k \delta_m)} \qquad (A3)$$
Solving this equation requires the use of the dispersion relation of linear wave theory:
$$\omega^2 = g k \tanh(k h_0) \qquad (A4)$$
with $\omega$ the angular frequency and $k$ the wavenumber. The correction factor grows rapidly with frequency and therefore amplifies high-frequency measurement noise. To overcome this problem, the most commonly used method is to introduce a cut-off frequency. For frequencies exceeding the limit frequency, the correction factor can be replaced by different values: $K_p = 1$ (sharp cut-off, the linear spectrum is replaced by the hydrostatic spectrum), a linear correction factor, a steady correction factor or a JONSWAP spectrum (see Mouragues et al. (2019) for a description of these methods). The optimization of the cut-off frequency and of the correction factor is a source of improvement in the representation of the wave shape (Mouragues et al., 2019; Martins et al., 2020). In this study, in order to simplify the processing and make it homogeneous, a sharp cut-off frequency of 0.25 Hz was chosen for the whole dataset.
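(iii) The non-linear weakly dispersive reconstruction:
$$\zeta_{snl} = \zeta_{swl} - \frac{1}{g}\left[\frac{\partial}{\partial t}\left(\zeta_{swl}\frac{\partial \zeta_{swl}}{\partial t}\right) - \left(\frac{\delta_m}{h_0}\right)^2\left(\frac{\partial \zeta_{swl}}{\partial t}\right)^2\right] \qquad (A5)$$
with $\zeta_{swl}$ the free surface elevation given by the linear weakly dispersive reconstruction:
$$\zeta_{swl} = \zeta_h - \frac{h_0}{2g}\left(1 - \left(\frac{\delta_m}{h_0}\right)^2\right)\frac{\partial^2 \zeta_h}{\partial t^2} \qquad (A6)$$
To solve this equation, we use the Fourier transform. Contrary to the linear method, the resolution of Eq. A5 does not require the use of a dispersion relation. However, this transform requires the introduction of a cut-off frequency to filter the measurement noise (Mouragues et al., 2019). In this study, in order to simplify the processing and make it homogeneous, a cut-off frequency of 0.5 Hz was chosen for the whole dataset. This method allows a better reconstruction of the height of the highest waves near the breaking point (Mouragues et al., 2019); however, its application is limited to a weakly dispersive wave regime.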
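For the linear method (ii), the correction factor with a sharp cut-off can be sketched as below. The standard linear-theory form $K_p = \cosh(k h_0)/\cosh(k \delta_m)$ is assumed, the wavenumber is obtained by Newton iteration on the dispersion relation, and the function names and example depths are hypothetical. Only the 0.25 Hz cut-off and the choice $K_p = 1$ above it come from the text.

```python
import math

G = 9.81  # gravitational acceleration (m s-2)


def wavenumber(freq, depth, tol=1e-12):
    """Solve omega^2 = g k tanh(k h) for k (rad/m) by Newton iteration."""
    if freq <= 0:
        return 0.0
    omega = 2 * math.pi * freq
    k = omega ** 2 / G                      # deep-water first guess
    for _ in range(100):
        t = math.tanh(k * depth)
        f = G * k * t - omega ** 2
        df = G * t + G * k * depth * (1 - t ** 2)
        step = f / df
        k -= step
        if abs(step) < tol * k:
            break
    return k


def correction_factor(freq, h0, delta_m, f_cut=0.25):
    """Linear transfer function K_p, set to 1 above the sharp cut-off."""
    if freq <= 0 or freq > f_cut:
        return 1.0
    k = wavenumber(freq, h0)
    return math.cosh(k * h0) / math.cosh(k * delta_m)


# Example: sensor 0.5 m above the bed in 10 m of water (made-up values).
kp_swell = correction_factor(0.1, h0=10.0, delta_m=0.5)    # below the cut-off
kp_noise = correction_factor(0.3, h0=10.0, delta_m=0.5)    # above the cut-off
```

In practice each Fourier coefficient of the hydrostatic elevation is multiplied by `correction_factor` at its frequency before the inverse transform is taken.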

A2 Spectral and wave-by-wave analysis
The statistical wave parameters were computed by a spectral analysis and a wave-by-wave analysis:
(i) The surface elevation power spectral density (PSD) was computed for each burst using Welch's method (Hanning window and 50% overlapping segments of 2048 samples). The significant wave height ($H_{m0}$) and other bulk parameters ($T_{m01}$, $T_{m02}$) were computed using spectral moments:
$$m_n = \int_{f_{min}}^{f_{max}} f^n E(f)\, df, \qquad H_{m0} = 4\sqrt{m_0}, \qquad T_{m01} = \frac{m_0}{m_1}, \qquad T_{m02} = \sqrt{\frac{m_0}{m_2}}$$
with $E(f)$ the wave energy spectrum and $(f_{min}, f_{max})$ the bandwidth. In this study, $f_{min} = 0.025$ Hz and $f_{max} = 0.5$ Hz.
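(ii) The wave-by-wave analysis was performed for each burst using a local maxima (peak-to-peak) analysis. The positions of consecutive crests (separated by more than 2 s and with an amplitude greater than 0.1 m) were extracted to compute the wave parameters. The maximum wave height and other bulk parameters ($H_{mean}$, $H_{sig}$, $T_{mean}$) were computed using crests and troughs (minimum between two crests):
$$H_{max} = \max_i H_i, \qquad H_{mean} = \frac{1}{N}\sum_{i=1}^{N} H_i, \qquad H_{sig} = \frac{3}{N}\sum_{i=1}^{N/3} H_{(i)}, \qquad T_{mean} = \frac{1}{N}\sum_{i=1}^{N} T_i$$
with $H_i$ and $T_i$ the height (crest minus trough) and period of the $N$ individual waves, and $H_{(i)}$ the wave heights sorted in decreasing order.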
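The spectral part (i) can be illustrated with a short sketch that integrates a given spectrum over the 0.025-0.5 Hz band. The standard moment definitions ($H_{m0} = 4\sqrt{m_0}$, $T_{m01} = m_0/m_1$, $T_{m02} = \sqrt{m_0/m_2}$) are assumed, the trapezoidal rule is an implementation choice, and the flat test spectrum is made up; the Welch estimate itself is not reproduced here.

```python
import math


def spectral_moment(freqs, energy, order, f_min=0.025, f_max=0.5):
    """m_n = integral of f^n E(f) df over [f_min, f_max] (trapezoidal rule)."""
    pts = [(f, e) for f, e in zip(freqs, energy) if f_min <= f <= f_max]
    total = 0.0
    for (f1, e1), (f2, e2) in zip(pts, pts[1:]):
        total += 0.5 * (f1 ** order * e1 + f2 ** order * e2) * (f2 - f1)
    return total


def bulk_parameters(freqs, energy):
    """Significant wave height and mean periods from spectral moments."""
    m0 = spectral_moment(freqs, energy, 0)
    m1 = spectral_moment(freqs, energy, 1)
    m2 = spectral_moment(freqs, energy, 2)
    return {"Hm0": 4 * math.sqrt(m0),
            "Tm01": m0 / m1,
            "Tm02": math.sqrt(m0 / m2)}


# Made-up flat spectrum of 1 m2/Hz sampled every 0.005 Hz from 0 to 1 Hz.
freqs = [i / 200 for i in range(201)]
energy = [1.0] * len(freqs)
params = bulk_parameters(freqs, energy)
```

Frequencies outside the band are ignored, so energy in the infragravity range (below 0.025 Hz) does not contribute to the short-wave parameters.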