SHELDA: Sub-hourly European Quality Controlled Sea Level Dataset
Abstract. Availability of high-quality sub-hourly sea level data is essential for understanding of a wide range of oceanic processes, including tidal oscillations, seiches, storm surges, tsunamis (including meteotsunamis), and their impact on sea level extremes and coastal flooding. Freely accessible sea level databases often contain time series measured with hourly or even longer sampling step, or they contain high-frequency data that have not undergone quality control procedures. To address this gap, the SHELDA (Sub-Hourly European Quality Controlled Sea Level DAtaset) has been created. This dataset comprises 257 individual tide gauge records in NetCDF format (https://doi.org/10.14284/764, Balić and Šepić, 2025), each representing quality-controlled sea level time series sampled at intervals between 1 and 15 minutes, along with residual time series derived by removing tidal components. This paper outlines the rigorous quality control procedures implemented and describes the spatial and temporal coverage of the dataset, along with technical specifications. SHELDA enables precise identification and analysis of sea level variability at timescales from minutes to multi-yearly along the European coasts, including Greenland, Canary Islands, Israel, Lebanon and Türkiye.
The paper describes quality control performed on 257 high-frequency sea level records (sampled at intervals from 1 to 15 minutes) along European coastlines. The manuscript is generally well organized and clearly written, with a few shortcomings noted below.
The dataset, as described, has the potential to make a significant contribution to the scientific community, particularly for researchers investigating sea level processes on shorter timescales. However, the dataset available via the provided link currently contains only 30 station records, rather than the expected 257.
In light of this discrepancy, I provide comments on both the manuscript and the available subset of 30 records, and I recommend rejection with the possibility of resubmission once the complete dataset is made accessible.
Comments on the paper
L15: Please specify the license under which the data are distributed.
L17: “rigorous quality control procedures”: Please briefly outline the main steps undertaken, as well as those not applied (e.g. corrections of datum shifts and drifts). Also include in the Abstract the minimum and maximum lengths of the available records.
L20: Please use a consistent name for Turkey throughout the manuscript (see also Fig. 10).
L49: Please rephrase “some kind of averaging procedure”; this wording is too vague for a scientific text. Also, I doubt that wind waves would be measured by a tide gauge, which is typically installed at the coastline (as visible in your maps).
L58: ‘reduced’ > ‘higher’?
L91–94: Please rephrase for clarity; it is not clear whether MISELA includes only the high-frequency signal or both high-frequency and total sea levels.
Figure 1: Can you offer a better solution for records shorter than 3 years, as the white circles on the grey background are not visible?
Sections 2.2.1 and 2.2.2: These sections would benefit from clearer organization. The current structure, with multiple points (i), (ii), (iii), etc., some connected and others not, makes them difficult to follow.
Figure 6: (%) > [%]
L211: I would refer to all data from one station (regardless of how many segments with different sampling frequencies it contains) as a “record,” as the current usage is confusing (e.g. the number changes from 257 to 275 and then back to 257).
L220: I understand that correcting datum shifts is not possible without additional data. However, users would benefit from knowing which stations are affected, so that they can exclude them if necessary. I recommend including this information in the manuscript and in Tables A1–A3 (and also in NetCDF files). Furthermore, please specify in the paper what you consider to be a shift, a datum shift, and a drift (including possible causes).
Figure 3: The bathymetry color bar is missing.
L331: Please clarify what specific issues were identified in the German data that led to their exclusion from SHELDA.
Table 1: See comments on NetCDF files below.
Figures 7, 8, 9: Please define what is meant by a step sensor.
L380: In some places it is unclear whether you are referring to the original or processed data. For example, is the 1-minute interval the most common in the original data or after processing? Please clarify.
L380–387: It would be useful to include a brief discussion explaining the observed distribution of sampling frequencies (e.g. related to dynamics?) and sensor types (e.g. infrastructure constraints?).
L394: Please include the data license.
L471: ‘article’ > ‘manuscript’
L19, L133, L397: In my opinion, the SHELDA dataset is most suitable for short-period processes, particularly since drifts and datum shifts were identified but not corrected. Therefore, the term “multi-yearly” appears too strong.
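To illustrate the terminology suggested in the comment on L211 (one record per station, possibly made of several segments), here is a minimal sketch of how segments could be counted within a single record. It assumes a gap-free time axis given in whole minutes, with missing values stored as NaN rather than as missing timestamps; the function name `sampling_segments` is hypothetical and not part of SHELDA.

```python
from itertools import groupby

import numpy as np


def sampling_segments(times_min):
    """Return (step_in_minutes, number_of_intervals) for each segment.

    A segment is taken here to be a maximal run of identical consecutive
    sampling steps. Assumption: timestamps are whole minutes on a
    gap-free axis, so a change in step marks a new segment, not a gap.
    """
    steps = np.diff(np.asarray(times_min, dtype=np.int64))
    return [(int(step), sum(1 for _ in run)) for step, run in groupby(steps)]


# One station record: 10-minute sampling followed by 1-minute sampling.
record_times = [0, 10, 20, 30, 31, 32, 33]
segments = sampling_segments(record_times)
print(len(segments), "segments in 1 record:", segments)
```

With this convention the number of records stays equal to the number of stations, and the number of segments is reported separately.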
Comments on NetCDF files
1. For each station, please include information on the data owner or the institution responsible for maintaining the station.
2. The current qc_flags variable (0 or 1) provides limited added value, as the presence or absence of data can already be inferred directly from the dataset. It would be beneficial to enhance this by distinguishing between data that are originally missing (i.e., not available from data centers) and data that have been removed during quality control procedures. As a further improvement, the authors might also consider introducing an additional flag to indicate data points that are present but potentially questionable or of lower quality (as noted in the manuscript, where some data were identified as doubtful).
3. Users would benefit from additional metadata indicating how many segments are contained within each file (i.e., whether a file consists of one segment or multiple segments).
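The flag scheme suggested in comment 2 could look roughly like the sketch below. The flag values, the function name `build_qc_flags`, and the input arrays are all hypothetical; they only illustrate how originally missing data, data removed during quality control, and doubtful data could be told apart, and they do not describe the authors' processing.

```python
import numpy as np

# Hypothetical flag values, chosen for illustration only.
FLAG_GOOD = 0
FLAG_DOUBTFUL = 1
FLAG_REMOVED_BY_QC = 2
FLAG_MISSING_AT_SOURCE = 3


def build_qc_flags(raw, cleaned, doubtful_mask):
    """Derive one flag per sample from the raw and quality-controlled series.

    raw:           values as received from the data provider (NaN = missing)
    cleaned:       values after quality control (NaN = missing or removed)
    doubtful_mask: True where a retained value is considered questionable
    """
    raw = np.asarray(raw, dtype=float)
    cleaned = np.asarray(cleaned, dtype=float)
    doubtful = np.asarray(doubtful_mask, dtype=bool)

    flags = np.full(raw.shape, FLAG_GOOD, dtype=np.int8)
    flags[doubtful] = FLAG_DOUBTFUL
    # Present in the raw data but absent after QC: removed by the authors.
    flags[np.isnan(cleaned) & ~np.isnan(raw)] = FLAG_REMOVED_BY_QC
    # Absent already in the raw data: never delivered by the data center.
    flags[np.isnan(raw)] = FLAG_MISSING_AT_SOURCE
    return flags


raw = [1.0, np.nan, 2.0, 9.9, 3.0]
cleaned = [1.0, np.nan, 2.0, np.nan, 3.0]
doubtful = [False, False, True, False, False]
print(build_qc_flags(raw, cleaned, doubtful))
```

In a NetCDF file, such values are commonly documented through the CF attributes `flag_values` and `flag_meanings` on the flag variable.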
Comments on data (see attached Figures)
1. Check whether interpolation was applied across missing data in the first segment (Fig. bodr.png).
2. It appears that some stations still exhibit constant sea levels over time (see figures in Constant_SL).
3. Please check station batr (see figures in Suspect_data; ignore the red stars). Some high-frequency episodes appear to be “cut off” from above. Is this expected behaviour?
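For comment 2, a simple check of the following kind could help locate remaining constant sea level episodes. This is only a sketch: the function name `constant_runs` and the choice of `min_length` are assumptions of mine, and a suitable threshold would depend on the sampling interval and the resolution of each instrument.

```python
import numpy as np


def constant_runs(sea_level, min_length):
    """Return (start, stop) index pairs, stop exclusive, of runs in which
    the value does not change for at least min_length samples.

    NaN never compares equal to NaN, so gaps are not reported as runs.
    """
    x = np.asarray(sea_level, dtype=float)
    if x.size == 0:
        return []
    # Index of the first sample of every new run (excluding index 0).
    boundaries = np.flatnonzero(x[1:] != x[:-1]) + 1
    starts = np.concatenate(([0], boundaries))
    stops = np.concatenate((boundaries, [x.size]))
    return [
        (int(a), int(b))
        for a, b in zip(starts, stops)
        if b - a >= min_length and not np.isnan(x[a])
    ]


series = [1.0, 1.2, 1.2, 1.2, 1.2, 1.3, np.nan, np.nan, np.nan, np.nan, 1.1]
print(constant_runs(series, min_length=4))
```

Applied to the quality-controlled series, any run reported by such a check would be a candidate for removal or for a "doubtful" flag.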