Reconstruction of hourly coastal water levels and counterfactuals without sea level rise for impact attribution

. Rising seas are a threat for human and natural systems along coastlines. The relation between global warming and sea -level rise is established, but the quantification of impacts of historical sea -level rise on a global scale is largely absent. To foster such quantification, we here present a reconstruction of historical hourly (1979-2015) and monthly (1900-2015) coastal water levels and a corresponding counterfactual without long-term trends in sea level. The dataset pair allows for impact attribution studies that quantify the contribution of sea level rise to observed changes in coastal systems following the definition of the Intergovernmental Panel on Climate Change (IPCC). Impacts are ultimately caused by water levels that are relative to the local land height, which makes the inclusion of vertical land motion a necessary step. Also, many impacts are driven by sub-daily extreme water levels. To capture these aspects, the factual data combines reconstructed geocentric sea level on a monthly time scale since 1900, vertical land motion since 1900 and hourly storm-tide variations since 1979. The inclusion of observation-based vertical land motion brings the trends of the combined dataset closer to tide gauge records in most cases, but outliers remain. Daily maximum water levels get in closer agreement with tide gauges through the inclusion of intra-annual ocean density variations. The counterfactual data is derived from the factual data through subtraction of the quadratic trend. The dataset is made available openly through the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP)

as the difference between the observed state of a natural, human or managed system and a counterfactual baseline that characterises the system's state in the absence of changes in the climate-related systems'' and further that the "difference between the observed and the counterfactual baseline state is considered the change in the natural, human or managed system that is attributed to the changes in the climate-related systems (impact attribution)" (O'Neill et al. 2022a).
As the counterfactual impact baseline cannot be observed, it needs to be modeled by an impact model. A precondition for impact attribution is that the impact model explains the observed phenomenon under consideration reasonably well given its drivers. This necessitates a model evaluation step, which is followed by the attribution step itself. The presented work aims to make forcing data available for both steps: i) factual forcing data to evaluate impact models and produce factual historical impact simulations and ii) counterfactual forcing data to produce counterfactual impact simulations.
While the factual data should stay as close as possible to reality and is thus in principle set, the counterfactual data depends on the specific attribution question. As coastal systems changed fast over the past century with climate and sea level presumably one but often not the dominant driver, we here ask the attribution question "how did historical sea level rise affect observed phenomena in dynamic coastal systems with a multitude of drivers, irrespective of the origin of the sea level rise?" We thus aim to delineate sea level rise from other drivers of change like population change, construction activity at the coast or ecosystem degradation through direct human intervention. We do not focus on the causes of sea level rise itself, but treat it as one driver of coastal change or disturbance in line with the IPCC definition (O'Neill et al. 2022a). Quantifying the fraction of impacts from anthropogenic influence on sea level rise would need investigation of the causal chain from emissions to sea level rise to impacts through a more complex attribution setup based on climate model ensembles (Hope et al. 2022).
Coastal systems ultimately experience the change of the water level at the coast relative to the height of the land, which we term relative water levels. As relative water levels are the most direct input for impact models we provide factual and counterfactual relative water levels as our main dataset. This necessitates the inclusion of vertical land motion, whichthough an important driver of coastal impacts -has been less rigorously observed and researched on a global level than sea level, and global datasets only are becoming available recently Hammond et al. 2021;Frederikse et al. 2020;Hawkins, Husson, et al. 2019;Pfeffer et al. 2017). Consistent with the impact attribution definition of the IPCC we do not investigate the drivers of vertical land motion itself, but treat it as a driver of impacts as a part of the relative sea level.
There is no predecessor for a global relative water level dataset.
We construct relative hourly and extended monthly coastal water levels globally for the historical period and a respective counterfactual -the Hourly Coastal water levels with Counterfactual (HCC) dataset. For the factual dataset we combine data resolving high temporal resolution to capture coastal storm-tide extremes (Muis et al. 2020), data covering the low frequency variability and long-term trends in mean sea level (Dangendorf et al. 2019) and data for vertical land motion (VLM) based on a probabilistic reconstruction from direct observations . For the counterfactual dataset we remove the long-term trends from the factual data. We describe the approach to create factual and counterfactual datasets in the section 'Materials and Methods', present the main features of the dataset in the subsequent 'Results' section and provide a discussion in the final section.

Materials and Methods
Many impacts manifest through extreme sea-level conditions that occur on the timescale of hours, which necessitates a product that resolves these timescales. We use the Coastal Dataset for the Evaluation of Climate Impact (CoDEC, Muis et al. 2020) to cover the high-frequency variation of storm tides along global coastlines. The dataset performs well in reproducing historical extremes. However, CoDEC does not incorporate a full historical reconstruction of observed long-term mean sealevel rise as it starts only in the year 1979 and does not reflect any changes in ocean density (i.e., sterodynamic) and mass (barystatic). To capture the longer-term evolution of coastal water levels, we use the hybrid reconstructions (HR) dataset from Dangendorf et al. (2019). It is the most recent spatio-temporal sea-level reconstruction and represents sea-level change since 1900 on a monthly timescale. We adjust it for residual VLM to represent the evolution of water levels relative to the geoid (geocentric). Impacts are ultimately related to the height of the sea relative to the affected land. We combine the geocentric water levels with the probabilistic VLM reconstruction from Oelsmann et al. (2023) to yield the evolution of the water levels relative to the coast.
In the following we give a short description of the three source datasets.

Coastal Dataset for the Evaluation of Climate Impact (CoDEC)
CoDEC is an update of the Global Tide and Surge Reanalysis (GTSR, Muis et al. 2016) dataset and uses a newer modeling framework, higher resolution and newer climate forcing data. It is based on the hydrodynamic Global Tide and Surge Model (GTSMv3.0), which uses the unstructured Delft3D Flexible Mesh software (Kernkamp et al. 2011) as shallow-water flow solver and resolves coastal areas at high detail while being efficiently coarse in the open ocean. GTSM uses the depthaveraged, barotropic mode of Delft3D, assuming a constant density of ocean waters. It explicitly models tides and storm surges at a high temporal resolution. The model has global coverage and thus no open boundaries. The coastal resolution is 1.25km at European coasts and 2.5km at other global coasts. To produce CoDEC, GTSM is forced with the 10m wind speed and atmospheric pressure from the ERA5 reanalysis (Hersbach et al. 2020 (Muis et al. 2020). For tropical cyclones with wind fields of relatively small spatial extent, extreme water levels are expected to be underestimated due to poor representation in the meteorological forcing (Dullaart et al. 2020).
Polar regions are not well resolved due to low-quality atmospheric forcing, poor bathymetry and the poor representation of ice dynamics.
The HR dataset (Dangendorf et al. 2019) combines two methodologies to reconstruct historical sea-level rise from tide gauge and satellite observations. Both methodologies are prominent sea-level reconstruction approaches on their own (Church et al. 2011;Hay et al. 2015) and have their distinct advantages and shortcomings. HR applies each methodology at time scales where they have well-proven performance. The HR dataset covers the period 1900-2015 and has monthly time resolution.
Thus it cannot provide sea-level variability on shorter than monthly timescales, which needs to be introduced by CoDEC.
Since HR is based on observations, the data includes all sea-level processes that are not explicitly removed. Most importantly, it includes the effects of Gravitation, Rotation, and Deformation of the Earth accompanying the sea-level change from mass addition through melting glaciers and ice sheets, changes due to density variations of the ocean water and dynamic ocean currents, and variations induced by the inverse barometer effect. By construction, HR includes the sea-level variability from wind and atmospheric pressure changes, which are also represented in the CoDEC dataset. Note that modulations due to the nodal cycle driven by the varying declination of the moon in time are not explicitly modeled within the HR framework.

Vertical Land Motion dataset
We use vertical land motion data from ) that provide a probabilistic annual vertical land motion reconstruction from 1995-2020 based on more than 10,000 time series from global navigation satellite system (GNSS) stations and differences of altimetry and tide gauge observations. Their approach accounts explicitly for a linear trend component and non-linear variations with time. It adapts methods so far used for the reconstruction of absolute sea level changes (e.g., Church and White, 2011) using empirical orthogonal functions. The spatiotemporal variations are interpolated along the world's coastlines using adaptive Bayesian transdimensional processes (Hawkins, Husson, et al. 2019;Hawkins, Bodin, et al. 2019). By accounting for the non-linear components of the temporal evolution, the estimated linear trends over the last century  are expected to be more robust. The nonlinear components capture for example the present-day effects (since 1995) of earthquakes, which can introduce extreme variations in observed VLM trends up to centimeters per year, or instantaneous displacements with a magnitude of several centimeters to meters.

Tide Gauge datasets
We use two different tide gauge datasets to evaluate the reconstruction. To evaluate long-term sea level change we use the where there are several files for different time periods in the GESLA-2 database, remove data points flagged as suspicious outliers or datum shifts, and interpolate all records to hourly resolution. We align all tide gauge records to a common vertical datum by subtracting the mean sea level over 1993 -2015. Then, to align them to the HCC data, we add the HCC mean for the same period to the tide gauge records. To ensure that this alignment is valid we select only GESLA-2 and PSMSL records with at least one year of observations in the interval 1993 -2015. Additionally, PSMSL records are restricted to those with at least 20 years of observations. Those restrictions lead to a total of 705 stations in the PSMSL and 714 stations in the GESLA-2 database respectively. The stations are illustrated in Fig. 1 with colors referring to ocean basins following the definition of Thompson and Merrifield (2014).

Factual water levels
We adjust the Hybrid Reconstruction (HR) dataset (Dangendorf et al., 2019) for vertical land motion contributions to obtain geocentric water levels. The contributions of vertical land motion in HR consist of two parts. The first part is due to longterm glacial isostatic adjustment since the glacial maximum 21,000 years ago which is explicitly modeled in HR and can thus be readily taken out. The second VLM contribution is due to short-term crustal responses to present-day ice melt since 1900 (Pfeffer et al., 2017;Spada, 2017;Riva et al., 2017) which is implicitly contained in HR through cryostatic fingerprints that are fitted to tide gauges.We use the annual reconstructions of the crustal responses to present-day ice melt from glaciers, the Antarctic ice sheet and the Greenland ice sheet from Frederikse et al. (2020) to remove this contribution.
For the monthly-resolved dataset of relative water levels we add vertical land motion from Oelsmann et al. (2023)  The outcome describes the monthly-resolved long-term evolution of relative coastal water levels. To include hourly variation in coastal water levels we add hourly CoDEC data. Barotropic water level changes due to wind and atmospheric pressure on time scales longer or equal to one month are covered in both our monthly dataset and the CoDEC dataset. We use frequency filtering to avoid their double counting. Barotropic sea-level variations due to wind and atmospheric pressure are explicitly modeled in the CoDEC dataset whereas HR is based on a statistical reconstruction method based on sparse observations. However, HR is not restricted to barotropic variations alone and covers the full spectrum of intra-annual and longer sea-level variations (including sterodynamic and barystatic processes). We thus expect that it depends on the location which product performs better. We have therefore tested for different cutoff frequencies and how it affects the performance of the combined product when compared to tide gauges. We varied the cutoff frequency for values of 1, 2, 3, 4, 5, 6, and 12 months and found an optimal cutoff frequency of 3 months (90 days).
Before filtering we first deseasonalize our combined monthly dataset and the CoDEC dataset, such that seasonality does not impact the filtering process. We deseasonalize CoDEC by removing its monthly average climatology over the years 1993-2015 which is interpolated to hourly resolution with cubic spline interpolation. To keep the nodal cycle in the final reconstruction, we first subtract the tidal contribution to the water levels and only high-pass filter the non-tidal residuals of the CoDEC data. We deseasonalize our combined monthly dataset by removing the seasonal cycle which we calculated from AVISO 2 satellite observations by taking the monthly average climatology over the years 1993-2015.
We then subtract the 90-day running mean value from the deseasonalized non-tidal residuals of the CoDEC data. This highpass filter removes contributions to sea-level variability longer than three months. Correspondingly, we low-pass filter our deseasonalized combined monthly dataset by taking its 90-day running mean value to only retain contributions with frequency longer than three months. We combine both filtered products by first interpolating the low-pass filtered combined dataset and the seasonal cycle from AVISO to the hourly target resolution of CoDEC using cubic spline interpolation. We then sum the high-pass filtered CoDEC, the low-pass filtered and interpolated combined monthly dataset, the tidal levels from CoDEC and the interpolated seasonal cycle from AVISO. We apply the same process excluding VLM reconstruction to yield a geocentric version of the combined dataset.
To yield a common vertical reference, we shift the geocentric version of our dataset vertically to have the same average sea level as the satellite altimetry for the time period 1993 -2015. It thus describes sea-surface height above the WGS 84 Geoid equal to AVISO 3 . We align our relative version of the dataset with the geocentric version such that the water levels in both datasets are equal on the last date of the record.

Counterfactual water levels
We generate counterfactual water levels that exclude the trends since the beginning of the 20th century but preserve the short-term sea-level variability of the factual dataset. For each location we estimate a quadratic trend using linear regression on the annual mean time series with the intercept fixed at the average sea level in 1900-1905. We remove this long-term trend from the factual time series to yield the counterfactual time series. Covering the period from 1900 to 2015, the record is sufficiently long so that the influence of sea-level variability on the trend estimation can be expected to be minor. We therefore do not include predictors for the main modes of climate variability as for example in (Menéndez and Woodworth, 2010;Marcos and Woodworth, 2017;Wang et al., 2021).
As the water height relative to the coast is needed as input for impact models the factual/counterfactual tuple can be used as forcing in such models directly. We additionally provide a counterfactual of the geocentric version of the factual dataset excluding the effects of VLM. The geocentric factual/counterfactual tuple can be used if it is known from other sources that VLM is negligible or if better VLM estimates are available regionally. To yield a counterfactual consistent with our approach, the VLM quadratic trend since 1900-1905 would need to be estimated from the regional VLM data and then subtracted from the geocentric counterfactual dataset.

Results
We provide the factual and counterfactual datasets with hourly resolution for the time period 1979-2015 and monthly resolution for the time period 1900-2015.

Long-term sea level trends
We evaluate the performance of our dataset by comparing it to tide gauge measurements from the PSMSL database with at least 20 years of observations. As tide gauges measure sea level relative to their position on land we can directly compare them with our dataset. For aligning observed and modeled RSL we subtract the respective mean value over years with valid observations from 1993-2015 from both datasets. To visualize long-term sea level change we plot the linear trend from the tide gauges and the modeled coastal water levels respectively for years with valid observations. Our dataset reflects well the different trends in different world regions (Fig. 2a), aligned by ocean basin and ordered by latitude (Fig. 2c).
8 We evaluate the performance of our dataset through its root mean squared error (RMSE) against observations and compare it to the performance of the HR dataset (Table 1). Our dataset shows a median RMSE of 5.58 cm (std. 5.53 cm) over all tide gauge stations which is an improvement compared to the HR with a median RMSE of 5.81 cm (std 6.06 cm). The improvement occurs in all 7 basins and is pronounced in the Subtropical North Atlantic with a median RMSE of 4.60 cm (std. 2.78 cm) as compared to 5.54 cm (std. 6.18 cm) in HR. Figure 2b provides more detail, showing RMSE per tide gauge.
In the higher latitudes of the East Pacific our dataset has a lower RMSE than HR for most stations (Fig. 2b). The performance also decreases at tide gauges in the lower northern latitudes of the East Pacific. As these regions are all located at plate boundaries and are thus highly prone to tectonically induced VLM, this may hint to problems in the extrapolation back in time to 1900 from recent VLM data for such regions. However, in summary for all locations we see the inclusion of observational VLM and its backward extrapolation superior to approaches that only include the glacial isostatic adjustment component of VLM and neglect other contributing processes as can be seen in the overall improvement of median RMSE.

Intra-annual variability
To evaluate our dataset on timescales shorter than a year we compare it with observations from the GESLA-2 database. For illustration, in Fig. 3 we first show daily mean values for CoDEC, HR and our dataset as anomalies to their respective yearly mean in the year 2001 for five selected tide gauge stations. For the example of Stockholm, Sweden, the density variations modulating sea level during the annual cycle as present in HR bring the atmospheric wind-and pressure-driven variability from CoDEC significantly closer to observations and thus improve the performance in our combined dataset. A similar effect is evident for Toyama, Japan and Miami, USA. For the stations of Zanzibar, Tansania, and Rio de Janeiro, Brazil, sea-level variation of HR is low as compared to the total sea-level amplitude, thus the factual dataset and CoDEC evolve similarly and an improvement is not evident. Both stations are located in areas that are barely covered by tide gauges used in HR.

Figure 3: Illustrative comparison of our HCC (blue line), the HR (pink line) and CoDEC (red line) datasets with observations (black line) at five example tide gauge stations. We show daily mean values for the year 2001 that are vertically aligned by removing the respective annual mean sea level.
We assess if improvements are visible over all tide gauges in Fig. 4. We compare coastal water levels from our dataset, CoDEC and HR through their respective RMSE against tide gauge observations. Following Muis et al (2020) improvement is due to better alignment of intra-annual variability with tide gauges as interannual mean sea-level change is explicitly excluded. For most locations the RMSE against tide gauge observations from GESLA-2 is lower for our detrended monthly dataset than for HR, visualized by blue bars in Figure 4a. The global median RMSE of our dataset is 3.77 cm (std. 3.39 cm) compared to 4.01 cm (std. 3.36 cm) for the HR dataset. The improvement is consistent over all basins except for the Northwest Pacific where the median RMSE for our dataset of 3.80 cm (std. 1.87 cm) is slightly higher than for HR with 3.73 cm (std. 1.92 cm) ( Table 1). The improved performance of our dataset is stronger in mid to higher latitudes of the North Atlantic and some stations in the East Pacific. In the East Pacific and Indian Ocean -South Pacific there is a mixed picture with some stations showing a lower performance than HR (green bars Fig. 4a). Wind and air-pressure driven barotropic sea level variability is more pronounced in mid to higher latitudes (Merrifield et al., 2013) which might explain the improved performance of our dataset in these regions since it uses sea-level variability from CoDEC on a time scale up to three months. This is plausible as wind and air-pressure driven sea level variability are explicitly modeled in CoDEC and only interpolated from sparse observations in HR.  (Fig. 4b). The more important role of ocean density variations as compared to wind-and airpressure-driven variability is a plausible explanation for the stronger increase in performance in the lower latitudes. Density variations are captured in our dataset through the inclusion of HR and the seasonal cycle from AVISO.

Extreme Water Levels
To illustrate the role of the long-term trend for sea level extremes, we investigate extreme water levels in our factual and counterfactual dataset and compare them to tide gauge observations from GESLA2. We only consider extreme events from 2009-2013 because this period is well covered in the observations and sea-level rise is close to its maximum. We restrict our analysis of extreme water levels to tide gauge stations with at least one year of data in the considered period which leaves a total of 520 stations. As astronomical tides introduce a strong offset in extreme water levels and thus make the comparison between different locations difficult, we here remove astronomical tides from the modeled and observed water levels and focus on the surge component. As coasts are historically adapted to their tides, extreme surges are an important cause for extreme impact events and their damages. Tides for the observations are estimated by fitting harmonic functions with 69 harmonic components as described by (European Commission, Joint Research Centre, Probst, P. , Annunziato, A., 2017). To preserve variability with frequencies larger or equal to one month we predict harmonic tides based on fitted parameters for 65 sub-monthly harmonic components only, leaving out 4 components with frequencies larger or equal to one month. We then remove the predicted harmonic tides from the observations. See the method section for a description of the removal of tides in our dataset. To level out differences in surge height between different stations caused by permanent differences to the geoid, we remove the mean value from 2009-2013 from tide gauges and our datasets. We then pick the 18 highest (the top one percent) daily maximum water levels from the observational data in the years 2009-2013. These correspond to the one percent highest daily maximum water levels in these years. We compare those to the maximum values at the same day in our factual and counterfactual dataset, respectively.
We show this top one percent highest water levels for our factual and counterfactual dataset, and the tide gauge observations in Fig. 5a. In general, there is a good agreement between modeled extreme water levels (blue bars) and observations (black markers). However, our dataset underestimates the extreme water levels at most locations with a median bias of -13.03 cm (std. 23.12 cm). This underestimation is pronounced for the East Pacific with -21.89 cm (std. 34.66 cm) and Subpolar North Atlantic West with -16.63 cm (std 18.33 cm) (Table 1). There is a known low bias in the model in particular for the highest water levels originating from the CoDEC dataset. It is largely attributed to the spatial resolution of the atmospheric inputs (Muis et al., 2020). We aligned our dataset and the observations by the mean over the available data from 1993 to 2015 for By design, the counterfactual dataset preserves the daily, monthly and interannual variability of the factual dataset. Extreme sea-level events have the same timing in the counterfactual and the factual. We can thus pick the timings of the one percent highest water levels from the factual dataset and assess the events in the counterfactual dataset in Figure 5a. The trend in relative coastal water levels increased extreme water levels for almost all world regions with the counterfactual lying below the factual for most tide gauge locations. Especially for regions with low surge magnitudes it often contributes a significant fraction to the extreme event. The situation is different for the high northern latitudes where counterfactual sea-level rise is above the factual. In these regions the extreme event magnitude is reduced primarily due to the influence of glacial isostatic adjustment (Emery and Aubrey, 1985). In some regions the counterfactual is below zero. These are regions where the highest surge levels are close to the mean sea level from 2009-2013. With the factual not much higher than zero the counterfactual without the sea level trend since 1900 easily falls below zero.

335
We illustrate the contribution from geocentric water levels and VLM to the relative coastal water levels in Fig. 5b. The contribution of geocentric waterlevels is relatively stable across locations. This is in contrast to the contribution from VLM, which is more variable across locations. In many places both processes have a similar order of magnitude and there are some regions where VLM exceeds changes in geocentric water levels. This has been recognized in earlier works , Nicholls et al., 2021Pfeffer et al., 2017;Hawkins et al., 2019b;Hammond et al., 2021;Wöppelmann and Marcos, 2016).
We produced the counterfactual by estimating the quadratic long-term trend in the factual RSL data and subtracting it. To illustrate our approach we show relative sea-level rise between the start and the end of the counterfactual dataset (difference 1900-1905 mean and 2010-2015 mean, orange dots in Fig. 5b). It is close to zero for most locations. This means that the counterfactual is largely free of long-term trends, underlining the validity of the approach.

Discussion
In this work, we combine datasets of long-term sea level change, short-term coastal water level variability and vertical land motion to yield a forcing dataset for historical simulations with coastal impact models. To facilitate the attribution of historical impacts to sea level rise, we complement the dataset with a counterfactual.
The task poses several challenges. A major one is the inclusion of VLM to yield relative coastal water levels. VLM can only be directly measured since GNSS data became available on a larger scale in the early 2000s (Hammond et al., 2021). While the glacial isostatic adjustment component of VLM can be well approximated by a linear trend, other VLM processes are often highly nonlinear or of local origin and can thus not be easily extended backward to 1900. We still decided to include VLM beyond glacial isostatic adjustment to stay as close as possible to observations though increasing data uncertainty.
We incorporated a VLM dataset directly derived from observations as the most independent source for such data.
Alternatives to this approach exist and were already used in earlier datasets. One possibility is to only account for VLM that is caused by glacial isostatic adjustment which can directly be modeled, or implicitly through cryostatic fingerprints in the case of responses to present day ice melt (Dangendorf et al., 2019). Another possibility is to approximate non-linear effects from the residual between tide gauge observations and reconstructions (Hay et al., 2015;Kopp et al., 2014;Dangendorf et al., 2021) for that by frequency filtering only deseasonalized data and adding the seasonal cycle from AVISO satellite altimetry. For processes with lower frequencies than the cutoff, sea level variability comes from HR which covers density-driven variability but is for many regions not as good in covering atmospherically driven variability as CoDEC.
We decide for a cutoff frequency of three months for all locations based on performance comparison with tide gauge observations. The optimal cutoff frequency could also be chosen to vary between regions depending on the dominant regional processes. We decide against that because such a method poses the risk of overfitting, in particular for regions sparsely covered by tide gauges. It would also be unclear how to choose the frequency for locations not covered by tide gauges.
HR does not cover regions with sea ice because there is no continuous coverage of altimetry for those regions. Our coastal water levels in those regions are based on extrapolation of HR and need to be used with extreme care. We provide a mask along with the dataset so users can exclude sea ice areas. For some of these regions, namely for Greenland, Siberia and Antarctica, VLM data is also absent .
To derive a counterfactual dataset in line with the concept of impact attribution of the IPCC, we use a simple quadratic model to first estimate and then remove the sea-level trend since 1900 from the factual dataset. The quadratic model assumes a constant acceleration of sea-level rise over time. Analysis of sea-level rise shows variation throughout the last century with an acceleration phase in the early century followed by a deceleration and then again acceleration until today (Dangendorf et al., 2019). By design, this variation is not included in our quadratic trend estimate. In general, we expect our trend estimation to largely exclude natural variability due to its low dimension and the long data period. This is a desired outcome and preserves the natural variability in the counterfactual.
In contrast to atmospheric climate change, there is no pre-industrial period in which sea level was stable over time. There is therefore not a clear indication for the time period that we can reference as the baseline for the counterfactual. We here took the practical choice of making the years 1900-1905 the reference time because this is when the HR dataset starts. In a strict sense, with the counterfactual forcing data we thus mimic a sea-level world of the beginning of the 20th century and not a world in which sea-level rise has not occurred. The approach produces counterfactuals that are largely stationary, but incorporate the same shorter-term variability as the factual dataset. The data thus also allows for impact attribution of individual coastal extreme events. For an extended discussion of the concept see (Mengel et al. 2021).
Without additional analysis, the presented work does not allow for the attribution of coastal impacts to anthropogenic greenhouse gas emissions, mediated through sea-level rise. Such additional steps would need the differentiation between climate variability and forced climate response. This is usually done via large model ensembles and dedicated experimental setups like DAMIP (Gillett et al., 2016). While attribution of global mean sea level change to anthropogenic emissions is possible (e.g. Slangen et al. 2016), the task of separating variability from the forced signal is more challenging on the regional level that is necessary for impact assessments and not yet possible over the 20th century (Fox-Kemper et al., 2021).
We here exclude deliberately the separation of anthropogenic and non-anthropogenic forcing of sea-level rise as it often becomes the focus of impact attribution studies and sidelines the more difficult, less researched, but in our view very relevant issue of separation of climate change from direct human interventions as drivers of observed changes in natural, human and managed systems. Additionally, large earth system model ensembles that would allow for the attribution to emissions but also reliably capture all major sea level components are not available so far.
We provide the factual and counterfactual dataset as part of the ISIMIP3a simulation round. This simulation round is dedicated to the evaluation of impact models and to impact attribution. ISIMIP already provides datasets on some additional drivers of coastal impacts, such as change in population, land use, economy or urban area 4 . With the presented data, we aim to facilitate the development of a new generation of coastal impact models that explicitly resolve the observed spatial and temporal coastal changes and disturbances.

Code and data availability
The source code (v1.0) underlying the analysis and produce the figures presented in the paper is archived at https://doi.org/10.5281/zenodo.7771501 (Treu, 2023). All code is open to use under the Creative Commons Attribution 4.0 International license. The presented counterfactual climate dataset is archived at https://doi.org/10.5281/zenodo.7771386  and based on v1.0 of the source code.

Author contributions
ST, MM developed the concept. ST implemented the methods, wrote the code and produced the data. SD, SM, TW and JO provided the input data. All authors contributed to the writing of the manuscript.

Competing interests
The contact author has declared that none of the authors has any competing interests.