The Sea State CCI dataset v1: towards a sea state climate data record based on satellite observations

. Sea state data are of major importance for climate studies, marine engineering, safety at sea and coastal management. However, long-term sea state datasets are sparse and not always consistent, and sea state data users still mostly rely on numerical wave models for research and engineering applications. Facing the urgent need for a sea state climate data record, the Global Climate Observing System has listed “Sea State” as an Essential Climate Variable (ECV), fostering the launch in 2018 of the Sea State Climate Change Initiative (CCI). The CCI is a programme of the European Space Agency, whose objective is to realise the full potential of global Earth observation archives established by ESA and its member states in order to contribute to the ECV database. This paper presents the implementation of the ﬁrst release of the Sea State CCI dataset, the implementation and beneﬁts of a high-level denoising method, its validation against in situ measurements and numerical model outputs, and the future developments considered within the Sea State CCI project. The Sea State CCI dataset v1 is freely available on the ESA CCI website (http://cci.esa.int/data, last access: 25 August 2020) at ftp://anon-ftp.ceda.ac.uk/neodc/esacci/sea_state/data/v1.1_release/ (last access: 25 August 2020). Three products are available: a multi-mission along-track L2P product (http://dx.doi.org/ 10.5285/f91cd3ee7b6243d5b7d41b9beaf397e1, Piollé et al., 2020a), a daily merged multi mission along-track L3 product (http://dx.doi.org/10.5285/3ef6a5a66e9947d39b356251909dc12b, Piollé et al., 2020b) and a multimission monthly gridded L4 product (http://dx.doi.org/10.5285/47140d618dcc40309e1edbca7e773478, Piollé Sea State CCI dataset v1: towards a sea state climate data record based on satellite


Introduction
Sea state, i.e. the description of wind sea and swell conditions at sea in terms of spectral or bulk wave parameters, is a key component of the coupling between the ocean and the atmosphere, the coasts, and the sea ice. In the open ocean, wind-generated waves increase the sea surface roughness and enhance the air-sea momentum transfer through the modification of the wind stress (Edson et al., 2013). Wavebreaking contributes to the mixing of the ocean upper layer (Babanin and Haus, 2009) and releases of sea spray aerosol into the atmosphere (Monahan et al., 1986). At the coast, waves are refracted by the shallow bathymetry and the tidal currents, they shoal over the shoreface and transfer energy to higher and lower harmonics through non-linear interactions (Longuet-Higgins and Stewart, 1962). They eventually break in the surf zone, increasing the water level, generating strong currents and stirring large quantities of sediments at the break point (Thornton et al., 1996). All these wave-induced processes contribute to rapid coastal erosion (Masselink et al., 2016), dune breaching (Kraus and Wamsley, 2003) and/or low-lying land overwash during extreme storm events. In the high latitudes, waves interact with the sea ice by modifying its mechanical properties, through the fragmentation of the ice floes of the marginal ice zone into smaller pieces, or through the push of the ice in the direction of the wave propagation (Stopa et al., 2018). Given that increased greenhouse gas emission caused by anthropic activities has a strong impact on the Earth's climate, which translates into the modification of the atmospheric circulation, the acceleration of sea level rise and the rapid decay of Arctic sea ice, significant changes in future sea state conditions and the above-mentioned coupling mechanisms are expected (see, e.g. Thomson and Rogers, 2014;Idier et al., 2019;Reguero et al., 2019).
Nowadays, long-term records of wave parameters are provided by voluntary observing ships along the major maritime routes (Gulev and Grigorieva, 2004); by in situ wave buoy networks, mostly located along the US, European, Japanese, and Australian coastlines; and by satellite altimeter measurements . While altimeter-based datasets are providing the (almost) global coverage necessary to understand the large-scale variability of sea states and their interactions with the other components of the Earth's climate, they still suffer from several limitations: (1) the main sea state parameter computed from radar altimeter echoes is the significant wave height, yet other spectral parameters such as the wave period and directions are key for some applications, e.g. coastal impact (Dodet et al., 2019). (2) Altimeter measurements cover the last 34 years only (starting with GEOSAT in 1985 with a data gap between 1990 and 1991), which is still relatively short for extracting robust trend information out of the strong multi-annual fluctuations of the significant wave height. (3) The sparse altimeter sampling pattern and the changing number of in-orbit altimeter missions cause undersampling errors that bias the long-term statistics, particularly for extreme conditions (Jiang, 2020). (4) Altimeter missions need to be accurately cross-calibrated to deliver consistent long-term time series, this is particularly true when instruments operating in different modes are merged in a single product (Timmermans et al., 2020). (5) Altimeter measurements are contaminated by different sources of noise, which prevent a proper representation of significant wave height (SWH) variability at scales lower than 100 km (Ardhuin et al., 2017). In the last 20 years, several research groups have contributed to the development of long-term calibrated altimeter databases (Queffeulou, 2004;Zieger et al., 2009;Ribal and Young, 2019), and some of these datasets have been used to compute the significant wave height trends over the last decades. In a recent study, Young and Ribal (2019) estimated trends in SWH ranging from − 1 to +1 cm yr −1 , depicting a large regional variability with negative trends mostly located in the Pacific Ocean. These results, and the dataset they are based on , represent a milestone in the characterisation of sea state decadal variability, however, new developments are necessary to verify these findings and extend the potential of satellite sea state observations.
Aware of the increasing need for accurate, robust and consistent long-term sea state data required by the climate science community , the Global Climate Observing System (GCOS) has listed "Sea State" as an Essential Climate Variable (ECV). ECVs are geophysical records generated from systematic Earth observations in support of international frameworks and policies such as the United Nations Framework Convention on Climate Change (UNFCCC) and the Intergovernmental Panel on Climate Change (IPCC). The Climate Change Initiative (CCI) programme, launched by the European Space Agency in 2010, has already contributed to the production of new Climate Data Records (CDRs) associated with ECVs, such as aerosol (Popp et al., 2016) or sea ice concentration (Lavergne et al., 2019). In this context, the Sea State CCI project was kicked off in 2018 in order to produce a CDR for the new ECV Sea State. This paper presents the first dataset released in the context of the Sea State CCI project.
The next section of this paper describes the altimeter missions that have been considered for the Sea State CCI dataset v1 and the in situ and model data that have been used to compare against the altimeter data. Section 3 describes the main processing steps (namely, data editing, inter-calibration and denoising) implemented within the Sea State CCI production system. Section 4 presents the results of the comparison against in situ measurements and model outputs. Section 5 presents two applications of the Sea State CCI dataset v1 at global and regional scales. Finally, Sect. 6 discusses the current status of the Sea State CCI database v1, the main limitations of the data and the perspectives for the future releases of this dataset.

Altimeter data
The altimeter data used in the Sea State CCI dataset v1 come from multiple missions spanning from 1991 to 2018. Although many spaceborne radar altimeters are bi-frequency for atmospheric corrections (Ku-C or Ku-S), only measurements in Ku band were used for consistency reasons, being available for all missions except SARAL/AltiKa (Ka band). Table 1 provides the list of missions used in the Sea State CCI dataset v1, together with the input product and version used, and their orbital properties (note that some cycle changes occurred in the course of some missions for limited period of time: these are not listed here for clarity, but the corresponding measurements were included in the Sea State CCI dataset v1). Not surprisingly, the list of altimeter data sources is very similar to that used by the Sea Level CCI (Fig. 1a in Quartly et al., 2017), except that project could not utilise the instruments in very long repeat cycles.

In situ measurements
The in situ data used to validate the Sea State CCI dataset v1 were gathered by ECMWF (Fig. 1). Most of the data came from the operational archive from ECMWF, where all data distributed via the Global Telecommunication System (GTS) are kept. Data from moored buoys and fixed platforms were extracted. These data are usually reported hourly (or less frequently). The bulk of the data comes from moored buoys, with the exception of data from operating platforms in the North Sea, Norwegian Sea and the Gulf of Mexico. The main data providers are the US (via the National Data Buoy Center, NDBC, and Scripps), Canada, the UK, France, Ireland, Norway, Iceland, Germany, Spain, Brazil, South Korea and India. This dataset was supplemented by buoy data obtained from the websites from the UK Centre for Environment, Fisheries and Aquaculture Science (CEFAS) and the Faeroe Islands network. In addition, buoy data from Denmark, New Zealand and Japan obtained as part of ECMWF wave forecast validation project were also used. A basic quality control was applied to each hourly time series for each location to remove spurious outliers.
Wave in situ measurements were compared to altimeter data at every altimeter-in situ match-up. An altimeter-in situ match-up occurs each time the altimeter ground track is less than 50 km from a in situ location and the in situ measurement is available within 30 min (following Queffeulou, 2004). For each match-up, the altimeter SWH is averaged over the along-track records lying within a 50 km radius circle centred on the in situ location. The in situ time series are filtered with a 2 h moving window and are then interpolated on the satellite overpass time. The metrics used for validations are the bias, the root-mean-square error (RMSE), the normalised RMSE (NRMSE), the scatter index (SI) and the Figure 1. Global maps of the wave in situ data used to validate the Sea State CCI dataset v1. Red circles indicate stations less than 50 km from the coast, blue circles indicate stations between 50 and 100 km from the coast, green circles indicate stations between 100 and 200 km from the coast, and pink circles indicate stations more than 200 km from the coast. The number of stations is given within brackets. Black boxes indicate basin extensions used for the regional validation of the dataset.
where X alti is the significant wave height recorded by the altimeter and X ref is the significant wave height recorded by the wave buoy or the model (as mentioned in the next section). Comparisons between altimeter data and in situ measurements showed much better agreement when coastal buoys (< 200 km) were discarded from the analysis. This can be seen, for instance, on the scatter diagram and error metrics computed between SARAL and in situ SWH measurements during the year 2017 ( Fig. 2), when all wave buoys are considered (Fig. 2a), and when only offshore wave buoys 200 km away from the coast are considered (Fig. 2b). Poorer performances in the comparison with coastal buoys have at least three reasons: firstly, land shading and refraction can modify SWH at much shorter distances than in the open ocean, affecting the validity of the 50 km-radius assumption and jeopardising the number of sites that can be effectively used for the comparison; secondly, coastal backscatter inhomogeneities in the satellite footprint affect the retrievals partic-  ularly in the last 20 km from the coastline (see Sect. 6.2); finally, the stronger variability of the wave field in the coastal zone due to tidal currents, bathymetric refraction, and coastal wind inhomogeneity invalidates the assumption of wave field homogeneity within the altimeter footprint.

Numerical wave model
The wave hindcast used to compare model results with altimeter data were produced with the spectral wave model WAVEWATCH III ® (WW3, The WAVEWATCH III Development Group, 2016). The model is forced by wind fields from the ERA5 reanalysis (Hersbach et al., 2020); by geostrophic and Ekman current components from Globcurrent products (Rio et al., 2014), with an ice mask applied from SSMI radiometer (CERSAT); and iceberg distribution from Altiberg (Tournadre et al., 2016). The coverage is global and extends from 78 • S to 80 • N at 0.5 • resolution with a spectral discretisation of 24 directions and 36 frequencies with the lowest frequency at 0.0339 Hz. Output fields are generated at 3-hourly intervals. The WW3 version used is based on the GitHub NOAA-EMC stable release from 27 June 2019. The model parameterisation is based on Rascle and Ardhuin (2013) (T471) with the following tuning for the wave growth: BETAMAX = 1.65 and SWELLF7 = 4.14 × 10 5 , and for the strong wind intensification the following tuning is applied: WCOR1 = 23 and WCOR2 = 1.08. Modelled SWH values are linearly interpolated along the satellite ground track and statistical errors (bias, RMSE, NRMSE, SI, R 2 ) are then computed. Statistics are only computed on measurements considered good, based on the quality level flag defined in Sect. 3.1.

Processing of altimeter data
The Sea State CCI dataset v1 products are inherited from the GlobWave project (2009-2012) building on the experience and methodology developed within this project. It extends and improves the GlobWave products, which were a postprocessing of existing L2 altimeter agency products with additional filtering, corrections and variables. Three kinds of products are delivered in the Sea State CCI dataset v1.
-L2P. Along-track products separated per satellite and half-orbit (pass) or full orbit (depending on the input product used), including all measurements with flags, corrections and extra parameters from other sources. These are expert products with rich content and no data loss (Piollé et al., 2020a).
-L3. Edited merged daily products, derived from the L2P and retaining only valid and good-quality measurements from all altimeters over 1 d (one daily file), with simplified content (only a few key parameters). This is close to what is delivered in near real time by, for instance, the Copernicus Marine Environment Monitoring Service (Piollé et al., 2020b).
-L4. Statistical gridded products, also derived from the L2P and averaging valid and good measurements from all available altimeters over a fixed resolution grid (1 • × 1 • ) on a monthly basis. These products are meant for statistics and visualisation (such as the CCI toolbox, http://climatetoolbox.io/, last access: 25 August 2020) (Piollé et al., 2020c).
The following sections provide more details on the processing steps of L2P products, from which L3 and L4 are derived.

Data editing
This first step consists in the identification of bad or suspect measurements, in order to build a quality level flag (swh_quality) providing users with a way to only retain the valid measurements in their analysis. This is achieved through a series of tests applied to each measurement, the result of which are summarised into an additional rejection flag (swh_rejection_flags), where each bit documents a specific test's failure or success. Table 2 lists the four levels of the variable swh_quality.
When SWH measurements were rejected as bad, the reason (quality test) for which they were rejected is reported in the related swh_rejection_flags variable. The eight rejection flags are as follows.
-not_water. The surface type is not water. It may be land or continental ice. We try to keep lake and inner seas measurements (when the discrimination is possible from the GDR information). This test only uses the internal flags provided in the input product by the producer.
-sea_ice. The measurement has possible ice contamination. The sea ice fraction is taken from an external source (such as the Sea Ice CCI microwave based daily maps). Sea ice contamination is defined as areas where the sea ice fraction is greater than a minimal threshold (corresponding to 10 % of ice in the current configuration). SWH measurements where the sea ice fraction is greater than 0 % but lower than 10 % are classified as acceptable.
-swh_validity. The SWH measurements were considered invalid (for instance because of the possible range or some internal flag provided in the original product used as input).
-sigma0_validity. The sigma0 measurements were considered invalid for water surface type.
-waveform_validity. The measurements were considered invalid as there are indications of unsuitable waveforms (as indicated in some internal flag provided in the original product used as input) for a proper SWH calculation.
-ssh_validity. The SWH measurements were considered invalid as there were issues on SSH (as indicated in some internal flag provided in the original product used as input), which was considered an indication of problematic quality for SWH too.
-swh_rms_outlier. The root-mean-square deviation of the 20 Hz SWH measurements exceeds a certain threshold, which depends on SWH and is computed following Sepulveda et al. (2015) -swh_outlier. The measurements were considered invalid when performing the SWH outlier test: this test considers all the measurements within a 100 km window centred on the screened measurement; measurements that deviate from the 100 km mean (excluding the two most extreme values in the mean calculation) by more than 5 standard deviations or by more than 5 m are discarded. These empirical thresholds were defined through careful visual examination of the data. This step is iterated three times over the same window.
The editing criteria which leads to setting the SWH quality level and rejection flags are specific to each mission and are detailed in the Sea State CCI dataset product user guide (available on the project's website: http://cci.esa.int/seastate, last access: 25 August 2020).

Cross-calibration
The Sea State CCI project builds on the GlobWave project, for which SWH altimeter measurements over the period 1985-2016 were carefully calibrated against in situ data (GlobWaveTeam, 2013). In the Sea State CCI dataset All data * dh = −0.0685 + 6.0426 × 10 −4 cycle + 7.7894 × 10 −6 cycle 2 − 6.9624 × 10 −8 cycle 3 . v1, three additional altimeter missions, namely JASON-3, CRYOSAT-2 (Low-Resolution Mode) and SARAL, have been included, and we describe here the methodology used to cross-calibrate these SWH records against a common reference dataset. Moreover, a new version (version E) of the JASON-1 GDR has been released since the GlobWave project and the calibration formula derived for JASON-1 has also been updated. According to the GlobWave Annual Quality Control Report (GlobWaveTeam, 2012), there is no specific quality problem in JASON-2 and the variability in terms of data quality is lower than for JASON-1 and EN-VISAT. Therefore, the calibrations of JASON-1, JASON-3, CRYOSAT-2 and SARAL are performed against the JASON-2 data, as calibrated by Queffeulou and Croizé-Fillon (2017). Altimeter SWH cross-calibration is carried out by comparing SWH measurements at cross-over locations between the altimeter to be calibrated and the reference mission JASON-2. A cross-over data pair is defined each time the two satellite ground tracks intersect within a 60 min time window (Fig. 3). In order to attenuate the impact of along-track noise (instrumental and retracking-induced noise) in the comparison, SWH is averaged along n consecutive measurements 25 km away from the intersection points (7 ≤ n ≤ 9 depending on altimeter orbital velocity, shown as blue and red dots in Fig. 3). SWH at cross-over locations are then compared to estimate the calibration formula.
Visual assessment of JASON-1, JASON-3 and SARAL SWH measurements against JASON-2 calibrated SWH measurements indicate a linear relationship between these missions (Figs. 4, 5 and 6), and linear calibration formula are obtained by fitting a least-square regression line through the SWH data. Note that the fitting was only applied for SWH values larger than 1 m. Below this value, the linearity of the relationship is lost, mostly due to differences in the instrumental correction applied to account for the fact that the point target response in the model used is approximated by a Gaus- sian function (Thibaut et al., 2010). Moreover, it is known that SWH retrieval at low sea states and particularly below 0.75 m is less accurate and is noisier due the inadequate sampling of the signal (Smith and Scharroo, 2015).
For CRYOSAT-2 the relationship is no longer linear (Fig. 7), and we use a second-order polynomial function to correct this mission. In order to avoid discontinuous and unrealistic corrections at high sea state, we apply this secondorder polynomial corrections until an upper threshold, corre-    sponding to the SWH values at which the polynomial intersects the zero residual y axis (in this case 7.67 m). Table 3 lists the equations used to calibrate the altimeter SWH measurements in the Sea State CCI dataset v1.

Data denoising
Altimeter measurements are characterised by a low signal-tonoise ratio (SNR) at spatial scales below about 100 km, blurring geophysical signal variabilities in this scale range, such as those resulting from wave-current interactions. The use of altimeter data therefore often requires preliminary noise filtering, and low-pass or smoothing filters are frequently applied. Such operation quite systematically results in the loss of small-scale (< 100 km) geophysical information or in the creation of artefacts in the geophysical variability analysed (e.g. spectral ringing), and requires setting of a cut-off wavelength or a filter window length that is difficult to determine adequately for a global dataset. As for approaches that infer a correction to eliminate correlated errors from other aspects of the waveform data (Quartly, 2019;Tran et al., 2019), it also leaves a substantial amount of low-and medium-frequency noise in the data. To overcome these difficulties, an adaptive noise elimination method is used, based on the nonparametric Empirical Mode Decomposition (EMD) method developed to analyse non-stationary and non-linear signals (Huang et al., 1998). EMD is a scale decomposition into a limited number of amplitude and frequency modulated functions (AM-FM) -called Intrinsic Mode Functions (IMFs)among which the Gaussian noise distribution is predictable (Flandrin et al., 2004). It therefore provides the basis for a noise elimination approach with results often superior to those of wavelet-based techniques (Kopsinis and McLaughlin, 2009). Recently, EMD analysis has been successfully applied to altimeter data to analyse wave-current interactions known to predominate at scales below 100 km (Quilfen et al., 2018;Quilfen and Chapron, 2019). The main steps of this method are described hereinafter. For a full description of the method, please refer to Quilfen and Chapron (2020).

The EMD principles
EMD adaptively decomposes a signal x(t) into a small number L of IMFs h n (t), 1 ≤ n ≤ L, and thus (1) The IMF number, L, depends on the length of the record and typically varies from 1 to 10 for the lengths analysed in the altimeter dataset. By construction, IMFs have the following properties: they are zero mean, all their maxima and minima are respectively positive and negative, and they have the same number (or ±1) of zero-crossings and local extrema. The IMFs are calculated successively, the first one containing the shortest scales and the last one containing a trend, by construction of the algorithm. Each IMF is estimated using an iterative process called sifting that determines the AM-FM high-frequency part of any input signal. For a given data segment, the sifting operates in a few steps: (1) find the local maxima and minima, (2) interpolate along the maxima and minima to form an upper and a lower envelope, (3) calculate the average of the two envelopes and subtract it from the analysed segment, and (4) repeat the process from step 1 to 3 unless a stopping criterion has been met (see Huang et al., 1998;Quilfen and Chapron, 2020, for details). An example is shown in Fig. 8 for a JASON-2 measurement record of about 1060 km length, for which the EMD method determined six IMFs to represent the full signal. The figure also shows other aspects of the denoising process to be discussed in the next section. As shown, the highfrequency noise is projected in the first IMF, and the scale range of each IMF is increasing with the IMF increasing in rank. Notably, the very large geophysical gradients such as observed in this example are also captured by IMF1. IMF1 therefore requires a particular processing to separate noise from useful information.
Once the signal is broken down into a set of IMFs, a denoising strategy inspired by those used for wavelet techniques can be applied. The analysis to be carried out takes advantage of (1) the well-behaved and predictable distribution of Gaussian noise energy with the IMF basis, (2) the legacy of decades of wavelet-based denoising techniques, and (3) an ensemble average approach to estimate a robust noise-free signal. Flandrin et al. (2004) showed that in the case of pure fractional Gaussian noise, the first IMF possesses the characteristics of a high-pass filter, while the higher-order modes behave similarly to a dyadic filter bank for which, as they descend the frequency scale, the successive frequency bands have half the width of their predecessors. This is illustrated in Fig. 9.

EMD-based data denoising
It implies that the Gaussian noise variance projected onto the IMF basis can be modelled, for IMFs of rank n > 1, as follows: α depends on the autocorrelation function of the fractional Gaussian noise (i.e. α = 0.5 for an uncorrelated noise, e.g. white noise; α = 0.5 for an autocorrelated noise). For white noise, the expected noise energy level of each IMF of rank n > 1 is then given by where E 1 is computed using the Median Absolute Deviation (MAD) from zero: where n 1 (t) is the IMF1 noise estimated from a wavelet analysis (as an example, see the top panel of Fig. 8). Equation (3) and (4) then give the expected noise energy in each IMF to determine the different thresholds below which signal fluctuations are associated with noise, as illustrated in Fig. 8. For each IMF, the threshold is T n = A √ E n . A is a constant that can be adjusted as a global tuning parameter.
With the EMD basis, noise energy decreases rapidly with the increasing IMF rank: ∼ 59 %, 20.5 %, 10.3 %, and 5.2 % of total energy for the first four IMFs, respectively, which represents ∼ 95 % of the total noise energy. For a given noisy input signal, the SNR and robustness of the denoised signal (e.g. to mitigate for result uncertainties associated with signal fluctuations close to the applied thresholds) are increased by estimating the final result as an ensemble average of several denoised signals. For that, the noise n 1 (t) is first removed from the noisy signal x(t), then a set of k new noisy signals is generated by adding random realisations of n 1 (t), providing after denoising a set of k denoised signals whose average gives the resulting denoised SWH and whose standard deviation gives the uncertainty attached to the denoised SWH. The uncertainty parameter therefore accounts for the noise characteristics of the noisy signal (function of the altimeter sensor, SWH etc), as well as for the local SNR (which is scale-dependent) and for uncertainties attached to the denoising process. Figure 9 illustrates the different points discussed above. It shows how the EMD filter bank distributes a white noise signal and the JASON-2 altimeter SWH signal in the Agulhas Current region. The standard deviation of noise was adjusted to fit the SWH background noise at scales < 20 km. As shown, the EMD filter bank is composed of a high-pass filter, IMF1, and a dyadic filter bank for higher-ranking IMFs. A similar structure is observed when EMD is applied to the SWH along-track signal, confirming that IMF1 mainly contains the high-frequency noise and showing that pure noise and SWH higher-ranking IMFs share the same frequency ranges. Figure 9, which shows how similar the filter bank is for pure noise and for SWH signal, therefore highlights the practical rule used for denoising, which compares the signal modulation in each IMF with the noise energy expected for the IMF of same rank. The proposed method is free of systematic artefacts, preserves the amplitude of spatial gradients and extreme values, and eliminates the noise over the whole frequency range. Signals down to scales of nearly 30 km can be recovered, provided that the local signal-to-noise ratio is sufficient.

Comparisons against in situ data and model results
Statistical metrics (bias, RMSE, NRMSE, SI and R 2 ) between altimeter measurements and in situ data were computed for each mission and each year. The overall scores are provided in Table 4 for the calibrated and denoised altimeter SWH, considering only altimeter-in situ match-ups that occurred more than 200 km from the coast. With a number of match-up data between 1018 (ERS-1, 3 years of data) and 14 395 (JASON-2, 11 years of data), all the computed values are statistically significant. Except for ERS-1 for which the bias is negative (−7.2 cm), all the mission show a pos-itive bias lower than 10 cm. The RMSE is below 26 cm for all missions, corresponding to a mean value lower than 11 % once normalised by the mean of the observations. Moreover, the scatter index is lower than 9 % and the coefficient of determination is higher than 0.96 for all missions. In order to assess the impact of the calibration and denoising steps applied on the altimeter measurements (Sect. 3.1), the above-mentioned metrics were also computed for the raw and calibrated SWH data before denoising was applied. Figure 10 shows the averaged bias and normalised RMSE between in situ and altimeter measurements for the raw, calibrated and denoised SWH. Here, again only match-ups that occurred more than 200 km from the coast were considered. Comparing the statistics for the raw SWH and calibrated SWH, we see that the calibration step tends to decrease the absolute bias and the NRMSE, except for JASON-1, JASON-2, SARAL and JASON-3. In particular, the GlobWavecalibrated JASON-2 measurements present a positive bias of ∼ 8 cm. Since these data were used to inter-calibrate the JASON-1, SARAL and JASON-3 measurements included in the present dataset, it is straightforward to attribute the positive bias found in these three missions to the propagation of the error during the inter-calibration step. The increased bias also resulted in larger NRMSE after the calibration of these missions. Although a clear understanding of this increased bias for JASON-2 requires further investigations, the different in situ dataset (stations and time period) used for the GlobWave calibration and for the present validation may explain part of the discrepancies. Comparing the statistics for the calibrated and denoised SWH, we see that the denoised data compared slightly better with in situ measurements than data that is not denoised, with NRMSE decreasing by up to 7 % and by 3 % on average, after denoising is applied. The impact of the denoising step was actually much more significant on the comparisons against model outputs, which take into account the along-track variability (see below).
Comparison of the altimeter dataset against the WW3 wave model hindcast (described in Sect. 2.3) was performed as a complementary validation with an independent dataset. In order to assess the quality of the dataset over the 1994-2018 time period, mean global bias and NRMSE between the denoised altimeter SWH and the modelled SWH were computed on a yearly basis for each altimeter mission. Figure 11 shows the time series of these two parameters, with distinct colours for each mission. We can see that the bias is lower than 10 cm and the NRMSE is lower than 13 % over the whole period. The overall trend is a decrease of the error metrics from the oldest missions to the most recent ones that may be attributed to improvements in instrument performance and processing techniques. We also note some inter-and multiannual variabilities in the metrics that can be associated with changes in missions recording phases and associated orbits. The thin dashed lines in Fig. 11b show the NRMSE obtained before denoising is applied on the altimeter SWH. Differences in the metrics obtained with the calibrated (not de-  noised) and denoised SWH illustrate the significant improvements obtained after the small scale (< 100 km) fluctuations in the altimeter measurements are removed, with a NRMSE decrease by up to 20 % and by 10 % on average.
Finally, coastal and regional assessments of the dataset were performed by computing error metrics for different coastal strips (> 200, 100-200, 50-100, 0-50 km) and different basins (North Atlantic, South Atlantic, North Pacific, South Pacific, Indian Ocean and Southern Ocean). The number and locations of the coastal in situ stations and the (arbitrary) extensions of the ocean basins used for these comparisons are depicted in Fig. 1. Due to a low number of in situ stations in most ocean basins, the regional assessment was only performed against model outputs. Also, given the 0.5 • model resolution, coastal assessment was only performed for model outputs located more than 50 km from the coast. The results of these comparisons are summarised in Table 5.
The error metrics for the different coastal strips clearly indicate the better performance for the comparisons obtained further from the coast. For instance, the NRMSE increases from ∼ 10 % for buoys located > 200 km from the coast to ∼ 24 % for buoys located less than 50 km from the coast, and from ∼ 11 % for model outputs > 200 km from the coast to ∼ 17 % for model outputs within 50 and 100 km from the coast. These results also reveal a ∼ 10 cm increase of the bias between altimeter data and in situ measurements at less than 50 km from the coast. This increased bias may be explained by the fact that match-ups between altimeter and in situ stations close to the coast will contain a larger fraction of altimeter records located offshore with respect to the buoy position, with the altimeter records nearer to the coast being rejected during the data editing process (see Sect. 3.1). As a result, the altimeter will systematically see larger waves than the in situ sensor in the regions where sea states are impacted by coastal features, such as shallow depths, island blocking, or increased tidal currents. In regard to the basin comparisons, we note a similar performance for each basin, except for the Southern Ocean, where the bias is negative and ∼ 20 cm lower than for the other regions and the NRMSE is ∼ 14 %, whereas it is closer to 11 % for the other regions. This may be explained by the poorer quality of both altimeter records and model simulations in this region that is dominated by high sea states.

Cross-consistency analysis
One objective of the Sea State CCI project is to implement a processing system able to produce accurate and consistent long-term time series of EO-based SWH measurements. Indeed, the time consistency of the produced dataset is particularly relevant for investigating the multi-decadal variability of the Sea State ECV and its interactions with other components of the Earth climate system. In order to ensure that the produced altimeter SWH is consistent over the altimeter time period, we inspected the monthly global mean SWH for each mission, within 60 • S and 60 • N. Figure 12 shows the time series of the global monthly means for the raw, calibrated and denoised SWH, with the corresponding mean values computed over the available measurement period. We see that the global means of raw SWH present large differences from one mission to the other, with an overall standard deviation (σ ) of the mean values equal to 15 cm. The lowest mean value (2.00 m) is obtained for the earliest mission, ERS-1, while the highest mean value (2.54 m) is obtained for CRYOSAT-2. These differences are strongly reduced after calibration is applied to SWH (σ = 2.5 cm). We note that denoising the calibrated SWH has a minor impact on the mean values, with maximum changes of the global mean lower than 2 %. The remaining differences in the mean values between each mission can be partly attributed to the different calibration methodology applied to the GlobWave dataset and to the most recent missions included in the Sea State CCI dataset v1 but also to the natural variability of the sea states, which is partly controlled by inter-annual and decadal climate modes, such as the North Atlantic Oscillation, the El Niño-Southern Oscillation or the Southern Annular Mode Reguero et al., 2019).

Global wave height climatology
A first evaluation of the Sea State CCI dataset v1 is shown in Fig. 13a as the global distribution of the climatological annual mean significant wave height calculated over the period 1992-2017. This is based on the CCI Sea State Level 4 (L4) gridded product and is presented here at its native 1 • resolution. Further evaluations and analyses of the L4 product over climatological timescales, including intercomparisons with other high-quality sea state data sources, are provided in Timmermans et al. (2020). The climatology clearly shows the typical features of global wave fields, with high sea states at mid-to-high latitudes in both hemispheres corresponding to the imprint of extratropical storm tracks and the persistently high winds of the Southern Ocean. The 1 • resolution of the product also makes it possible to distinguish regions of lower mean wave heights in enclosed and sheltered seas and close to islands and land, for example in the Gulf of Mexico, the Mediterranean Sea, the Baltic Sea and the Indo-Pacific Warm Pool.
Focusing now on Fig. 13b, we see the normalised climatological difference (expressed as a percentage of SWH) between the CCI product and the climatological mean obtained from the calibrated multi-mission altimeter data published by Ribal and Young (2019). The overall agreement between the two altimeter-based datasets is generally good, with differences typically of less than ± 2.5 %, although some spatially coherent differences (both positive and negative) are clearly visible, most noticeably on either side of the Equator. While a detailed explanation for the differences is subject to further analysis, a number of factors are likely to be relevant. We note some differences in source missions (omission here of SARAL in the analysis of Ribal and Young, 2019, for example) and differences in the calibration methodology, such as the use of different sets of reference data buoys between the two products, and subsequent mission cross-calibration. See Timmermans et al. (2020), Sect. 4 for more details.
Finally, Fig. 13c presents a similar comparison, this time between the CCI and ERA5. ERA5 (Hersbach et al., 2020) is the most recent of the reanalysis products developed and distributed by ECMWF, which features a number of innovations, including higher spatial and temporal resolution and hourly assimilation of altimeter significant wave height data. In these results, the comparisons indicate that, even though ERA5 assimilates altimeter data, the ERA5 climatological mean SWH is substantially lower than CCI almost everywhere, except the eastern tropical Pacific and southern tropical Atlantic, where ERA5 clearly overestimates the wave climate. Once again, as for the comparison against Ribal and Young (2019), strong signatures are observed either side of the Equator. These are likely attributable to at least two factors. Firstly, ERA5 generally underestimates SWH in stormy areas, except in the deep tropics where the wave climate is dominated by long period swell. Recent changes have been made to the ERA5 wave physics package to try to solve some of these issues (https://www.ecmwf.int/en/about/mediacentre/news/2019/forecasting-system-upgrade-set-improveglobal-weather-forecasts, last access: 25 August 2020). Secondly, in the tropical Pacific Ocean, the impacts of the equatorial and counter-equatorial currents are clearly visible. This corresponds to the absence of ocean surface currents, both in the atmosphere boundary layer and in the wave model component of ERA5. It is also affected by the relatively coarse (32 km) wind fields, which lead to loss of information in the wave model. Similar examination of differences in climatological seasonal (JFM, JJA) mean between the CCI and the product of Ribal and Young is provided by Timmermans et al. (2020). Differences were generally found not to be statistically significant at a 10 % level (their Fig. S2 in supporting information), with some possible exceptions in regions of low average sea state, such as the Bay of Bengal and the Indonesian seas. However, a more rigorous assessment of robustness of differences was recommended, noting high sea state variability over the relatively short record and other possible sources of systematic error that remain poorly understood.

Long-term wave height trends
The accurate representation of long term temporal variation is crucial for many applications. Timmermans et al. (2020) examined long-term global seasonal (JFM, JJA) SWH trends from the CCI Level 4 dataset and intercompared those with other high-quality sea state records over the period of continuous satellite coverage (their Figs. 3 and S4). Figure 14 shows the trend in annual mean SWH for the CCI L4 product, with a similar intercomparison. Trends are calculated using a linear regression approach, discussed further by Timmermans et al. (2020).
Overall, there is remarkable variability across datasets, although in all cases intra-dataset variability shows a high degree of spatial coherence. While the maximum range of trends across all datasets is approximately the same, a striking result is that trends from the CCI L4 (Fig. 14b) appear to be substantially more positive than those from the Ribal and Young (Fig. 14a) and in better agreement with the CY46R1 ERA5 hindcast. Some regional intra-dataset trends appear to be robust at the 5 % significance level (w.r.t. the linear model) but these are rarely consistent across products, with disagreement in sign in a few locations. CCI L4 contrasts with Ribal and Young with positive trends in the central Atlantic and eastern Pacific, although there is qualitative agreement on (negative) sign in the northern and southern Pacific.
As already highlighted (see also Timmermans et al., 2020, Sect. 4), differences in source missions and calibration approaches are likely relevant. In particular, Timmermans et al. (2020) reveal (their Fig. 2) that the CCI dataset tends to pro-vide the largest values (or various datasets, including buoys) in SWH time series at two specific locations, a phenomenon likely linked to the use of JASON-2 as a reference calibration mission. Further factors include the impact of interannual variability and changes in observation space-time sampling density over the period, both of which may affect the evaluation of a linear trend, particularly if bias is present at the beginning or end of the record. Finally, the relatively high degree of spatial heterogeneity seen for all products suggests that relatively short-term variability has localised influence. In general, on much longer timescales more homogeneous trends might be expected. See also Timmermans et al. (2020) for analysis of seasonal trends (JFM, JJA).

Spectral variability at regional scales
The Sea State CCI dataset v1 provides a unique opportunity to analyse global and regional sea state variability in the mesoscale range below several hundreds kilometres up to ∼ 50 km. Indeed, the wave field in this scale range is strongly modulated by wave-current interactions (Ardhuin et al., 2017;Quilfen et al., 2018) that were hitherto neglected in the analysis of altimeter signals due to noise contamination. Moreover, in most ocean basins, altimeter data are the only available measurements of wave heights. Fig. 15a shows the yearly averaged mean surface current vorticity computed from altimeter-derived geostrophic surface currents (Rio et al., 2014). Six 1 • × 1 • regions that are well exposed to swell events and characterised by different surface dynamics are displayed as coloured rectangles. Regions (a) (Agulhas Current, in red) and (b) (Drake Passage, in green) are characterised by strong surface vorticity (> 1 × 10 −5 s −1 ), with many surface mesoscale and sub-mesoscale features produced by instability processes (Tedesco et al., 2019;Rocha et al., 2016). Regions (c) and (d) (Atlantic-Pacific equatorial band, in orange and purple, respectively) are characterised by intermediate surface vorticity (between 0.6 × 10 −5 and 0.4 × 10 −5 s −1 ). Regions (e) and (f) (northern branch of southern Atlantic-Pacific gyre, in blue and turquoise, respectively) are characterised by low surface vorticity (< 0.2 × 10 −5 s −1 ).
For each of these regions, we performed a spectral analysis on 8 years (2010-2018) of 1 Hz along-track SWH measurements acquired by JASON-2. The wavenumber spectra were computed along segments of 128 points (∼ 800 km) detrended and tapered with a Hanning window. For each region, approximately 3000 1D spectra were averaged, and the mean spectra are shown in Fig. 15b for each region with similar surface current vorticity (from high vorticity on the left-hand side to low vorticity in the right-hand side). The divergence between the dotted line (original signal) and the solid lines (denoised signal) highlights the scales at which the SWH variability is dominated by noise. In our case, we used the EMD-based filtering method described in Sect. 3.3 to reveal the SWH variability at smaller scales. Interestingly, we note that the scale at which the divergence takes place depends on the dynamics of the region considered (from ∼ 125 km for the high-vorticity regions to ∼ 200 km for the lower-vorticity regions). Spectral slopes were computed with linear regression over the range 250-50 km (unshaded area in Fig. 15b), for which the spectral slope is nearly constant. These slopes are around k −2.5 in regions where surface vorticity is intense (i.e. Agulhas Current and Drake Passage), as already evidenced by Ardhuin et al. (2017) from combined altimeter data and numerical model results in the Gulf Stream and the Drake Passage. In regions where the vorticity is lower, such as the equatorial band or in the northern branch of At-lantic and Pacific tropical gyres, the spectral slopes becomes less steep (around k −1.5 ). This regional distribution of SWH spectral shape presents some similarities with the one obtained for the sea surface height, with a steeper slope in high energy area and milder slopes in low energy regions (e.g. Vergara et al., 2019;Xu and Fu, 2011). The difference in the wave height power spectral density between Fig. 15b and the two others ( Fig. 15c and 15d) is due to the contrasting wave height climatology in the considered regions (see Sect. 5.1).
These preliminary results highlight the benefit of the EMD denoising method in investigating the small mesoscale SWH variability. Further investigation will be carried out to under-  Timmermans et al., 2020, for details). Dots indicate grid cells where the trend coefficient is significant at the 5 % level.  (Rio et al., 2014). Coloured rectangles are the studied areas through a spectral analysis. (b) The associated 8-year-averaged significant wave height power spectral density (PSD) as a function of wavelength and wavenumber. In solid lines the SWH spectra obtained from denoised SWH data are given, and in dotted lines the spectra for raw data (including estimation noise) are given. Note that, for reading convenience, the y axis is not the same for the three subplots in (b).
stand the impact of surface currents on sea state mesoscale variability over multi-decadal scales.

Current limitations and future developments
This section discusses the current status of the Sea State CCI dataset v1, the main limitations of the data and the perspectives for the future releases of the dataset.

Definition of a reference in situ dataset for sea states
Routine observations from moored buoys now exceed 40 years for several locations worldwide, which make them practical for analysing long-term trends. The most abun-dant open-source network is NOAA's National Data Buoy Center (NDBC), which has maintained an expansive network since the 1970s. Despite the multi-decadal time series from moored buoys, the data homogeneity is a critical issue (Gemmrich et al., 2011). Buoy hulls, payloads and data processing algorithms change over time, and often the changes through metadata are not well documented. The changes in buoy configurations introduce spurious deviations in the time series that are at least on the same order of magnitude (if not larger) as changes due to inter-annual variability (ENSO, NAO, SAM, etc.) or secular trends. Having detailed metadata is critical to correct the buoy time series. As a result, reported trends from buoy records produce inconsistent results, with changes in magnitude and even sign between buoys sep- arated by a few hundred kilometres (Allan and Komar, 2000;Gower, 2002;Ruggiero et al., 2010;Young et al., 2011). These inconsistencies mean that, at present, very few longterm buoy datasets exist that can be used reliably for trend estimation. This is a major shortcoming as no agreed "ground truth" exists to compare satellite or model estimates of trend.
There is a pressing need to produce long-term buoy datasets that include both the measured quantities of interest (significant wave height, wind speed) but also metadata documenting information such as buoy hull type, sampling details, instrument package, processing details, etc. With such metadata, it is potentially possible to correct for changes to such quantities over time and hence produce a harmonised dataset, in a similar manner to the careful reprocessing of sea surface temperature records (Merchant et al., 2019).

Revisiting altimeter inter-calibration
In its current version (v1), the Sea State dataset uses the GlobWave calibration formula for all missions prior to 2018 (except for JASON-1, as explained in Sect. 2.1), while the most recent missions were inter-calibrated against the JASON-2 data, as corrected in GlobWave. This strategy was adopted in order to ensure the long-term consistency of this merged dataset. However, the independent validation exercise performed within the CCI project against an exhaustive in situ dataset (see Sect. 4) has revealed some unexpected features of the corrected data. In particular, the calibrated JASON-2 SWH presents a positive bias of ∼ 8 cm, larger than the one computed from the raw data. This discrepancy could be due to the different in situ data (time coverage and selected networks) used for the GlobWave calibration and the CCI validation. Since JASON-2 is used as a reference for inter-calibrating other missions, this bias also impacts the most recent missions. Moreover, comparisons be-tween the CCI dataset and the one of Ribal and Young (2019) have revealed significant differences in the long-term statistics between these two datasets, which may be partly related to the calibration methodologies and reference in situ data (Timmermans et al., 2020). Future developments in the CCI dataset will therefore require an improved inter-calibration methodology that will be systematically applied to all altimeter missions included in the dataset, in order to reduce the uncertainties in the long-term sea state statistics.

Assessment and implementation of new retracking algorithms
In order to accurately estimate physical variables of relevance in satellite altimetry, average waveforms (usually at a rate of 20 Hz) are fitted to a mathematical model and an optimisation algorithm, in a process called "retracking". For conventional (low-resolution mode) altimetry, all the information on SWH is encrypted in the few bins on the leading edge of the waveform (Fig. 16a). The actual echo observed at any bin is the sum of the contributions from many incoherent reflecting points on the sea surface; the effect of this "fading noise" is that the power recorded in the mean waveform has an intrinsic variability that will have a strong effect on parameters calculated from only a few waveform bins. Also, in the coastal zone, unwanted reflections from nearby land or sheltered bays (Gomez-Enri et al., 2010) and changes in the wave shape due to wave-bottom and wave-current interactions (Ardhuin et al., 2012) can affect the quantity and quality of SWH estimations within 20 km of the coast (Passaro et al., 2015). The uncertainty in estimates due to fading noise typically increases with SWH, but the uncertainty is much more pronounced in the near-shore region (Fig. 16b). Similar challenges exist in the marginal ice zone. The advent of delay Doppler altimetry (DDA) offers the potential for improved SWH accuracy near land (Nencioli and Quartly, 2019) and reduced sensitivity to fading noise through being able to utilise a greater number of independent echoes (Raney, 1998).
There is now a strong demand to improve the quality of altimetric wave height data from both LRM and DDA instruments through improved retracking methods in order to (1) enhance the precision (i.e. short-scale repeatability of 20 Hz estimates), (2) increase robustness and accuracy in the coastal zone and ice-affected areas, (3) observe the true spectra of waves unencumbered by retracker resolving issues such as the "spectral hump" (Dibarboure et al., 2014), (4) accurately record the extreme waves despite uncertainty increasing (Fig. 16b), and (5) improve estimation at low SWH where the slope of the leading edge is inadequately resolved (Smith and Scharroo, 2015). DDA shows promise for these aspects, although it is worth noting that the much narrower footprint with DDA may be leading to an underestimation bias associated with wave direction (Moreau et al., 2018).
To address these limitations, new retracking techniques have been developed, which generally involve one or more of the following features: numerical solution of the radar equation (as opposed to using an analytical model), fitting of a selected portion of the waveform (Passaro et al., 2014;Thibaut et al., 2017;Peng and Deng, 2018), finding a smooth trajectory through a cloud of solutions (Roscher et al., 2017), and post-processing aimed at reducing correlated errors among consecutive estimations Quartly et al., 2019;Quartly, 2019). On top of this, several flavours exist of an analytical model to describe the viewing geometry of the DDA acquisitions (Moreau et al., 2018;Buchhaupt et al., 2018;Ray et al., 2015).
In the framework of the Sea State CCI, a set of rules and statistics for a so-called round robin exercise have been defined, which is common in such projects (e.g. Brewin et al., 2015) but to date has never been applied to altimetry. The aim is to ensure that these new algorithms can be evaluated in a rigorous and transparent way, taking into account all the different applications. The procedure involves comparison with external datasets (buoys and models), internal analysis of outlier rejection, quality flags, precision and spectral properties. The results of this study show that a number of specially designed algorithms can deliver improved SWH retrieval in both open ocean and close to the coast and for a range of sea state conditions (Schlembach et al., 2020). The gains are achieved both through the design of the retracking algorithms, e.g. to avoid spurious signals in the tail of the waveform, and also through enhanced data selection using a data quality flag tuned to that specific retracker. However, a procedure for identifying good pairings of buoys and altimeter tracks is essential in order to achieve robust results (Nencioli and Quartly, 2019;Quartly and Kurekin, 2020).

Conclusions
The Climate Change Initiative programme launched by ESA in 2010 has fostered the production of climate quality longterm global datasets of Essential Climate Variables, whose analysis is needed for understanding the mechanisms of climate change and the associated societal impact. In this context, the Sea State CCI project is in charge of reprocessing and developing dedicated algorithms for historical and current EO missions dedicated to the observations of sea state (radar altimeters and SAR missions) in order to produce a continuous, consistent and robust long-term dataset of sea state parameters. The first version of the Sea State CCI dataset, presented in this study, covers the period 1991-2018 and includes observations from 10 altimeter missions. The implementation of quality flags and auxiliary parameters in a systematic way, the update of calibration formula for the most recent missions, the development of an EMDbased denoising method, and the validation against an extensive network of in situ data buoy and state-of-the art model results, resulted in a unique dataset designed for the study of wave climate variability. This dataset has already proved really useful in investigating sea state variability at global and regional scales, in terms of wave climatology and spectral variability. Future releases of the Sea State CCI dataset will extend the capacity of this dataset even further, through (1) the implementation of dedicated retracking algorithms for estimating the SWH with improved accuracy, (2) the revision of the calibration formula based on a high-quality and consistent dataset of in situ buoy measurements, and (3) the inclusion of spectral wave parameters derived from SAR missions.
Author contributions. GD, YQ and JFP conceived and organized the manuscript. GD wrote the manuscript with inputs from all au-thors. JFP, JRB, MA and GD collected and processed the data (altimeter, in situ, model). JFP and GD implemented the editing and calibration of the altimeter data. YQ implemented the EMD denoising method. JFP supervised the overall production of the CCI dataset. GD and MA performed the quality assessment of the CCI product. BT and CG analyzed the L4 product and carried out the global climatology and the long-term trend analysis. GM carried out the spectral analysis with inputs from YQ, FA and GD. GQ, MP, IY, JS and GD wrote the section about the limitations and future developments. EA managed the project with the support of FA and GD. CD and PC supervised the project. All authors provided critical feedback and helped shape the research, analysis and manuscript.