Integrated water vapour content retrievals from ship-borne GNSS receivers during EUREC4A

Abstract. In the framework of the EUREC4A (Elucidating the role of clouds–circulation coupling in climate) campaign that took place in January and February 2020, integrated water vapour (IWV) contents were retrieved over the open tropical Atlantic Ocean using Global Navigation Satellite System (GNSS) data acquired from three research vessels (R/Vs): R/V Atalante, R/V Maria S. Merian and R/V Meteor. This paper describes the GNSS processing method and compares the GNSS IWV retrievals with IWV estimates from the European Centre for Medium-range Weather Forecasts (ECMWF) fifth reanalysis (ERA5), from the Moderate Resolution Imaging Spectroradiometer (MODIS) infrared products and from terrestrial GNSS stations located along the tracks of the ships. The ship-borne GNSS IWV retrievals from R/V Atalante and R/V Meteor compare well with ERA5, with small biases (−1.62 kg m−2 for R/V Atalante and +0.65 kg m−2 for R/V Meteor) and a root mean square (rms) difference of about 2.3 kg m−2. The results for the R/V Maria S. Merian are found to be of poorer quality, with an rms difference of 6 kg m−2, which is very likely due to the location of the GNSS antenna on this R/V prone to multipath effects. The comparisons with ground-based GNSS data confirm these results. The comparisons of all three R/V IWV retrievals with MODIS infrared products show large rms differences of 5–7 kg m−2, reflecting the enhanced uncertainties in these satellite products in the tropics. These ship-borne IWV retrievals are intended to be used for the description and understanding of meteorological phenomena that occurred during the campaign, east of Barbados, Guyana and northern Brazil. Both the raw GNSS measurements and the IWV estimates are available through the AERIS data centre (https://en.aeris-data.fr/, last access: 20 September 2020). The digital object identifiers (DOIs) for R/V Atalante IWV and raw datasets are https://doi.org/10.25326/71 (Bosser et al., 2020a) and https://doi.org/10.25326/74 (Bosser et al., 2020d), respectively. The DOIs for the R/V Maria S. Merian IWV and raw datasets are https://doi.org/10.25326/72 (Bosser et al., 2020b) and https://doi.org/10.25326/75 (Bosser et al., 2020e), respectively. The DOIs for the R/V Meteor IWV and raw datasets are https://doi.org/10.25326/73 (Bosser et al., 2020c) and https://doi.org/10.25326/76 (Bosser et al., 2020f), respectively.



GNSS data processing
The GNSS observations were initially processed with the GIPSY-OASIS II v6.4 (herefater GIPSY) software in kinematic PPP mode (Zumberge et al., 1997) using standard options similar to the static mode used in Bock et al. (2020a). The software uses the Jet Propulsion Laboratory (JPL) fiducial-free and high rate (30 s) final products 3.0 for satellite orbits and clocks. The data were analysed in a 30 h window centred on noon of each day from which the 00-24 h parameters were extracted to avoid 125 edge effects. Second order ionosphere correction was used. Phase ambiguities were fixed using the wide lane phase biases computed by JPL as part of their processing of the global GNSS network (Bertiger et al., 2010). The kinematic mode estimates receiver position, clock offsets, ZTDs and horizontal gradients simultaneously for each epoch at a rate of 30 s. No constraint was applied to positions between consecutive epochs. Tropospheric delays were modelled by time-varying zenith components and horizontal gradients. The zenith components 130 include the Zenith Hydrostatic Delay (ZHD) and the Zenith Wet Delays (ZWD) which represent the contributions of dry air and water molecules, respectively, in the atmospheric column (Bevis et al., 1992). The projection of the zenith delays into the direction of the GNSS satellites is done using the Vienna Mapping Function 1 (VMF1) (Boehm et al., 2006). The projection of the gradient parameters is done using the Bar-Sever et al. (1998) mapping function. ZHD was only corrected a priori while ZWDs and horizontal gradients were modelled as random walk processes corrections to the a priori values estimated during the 135 data processing with a 30 s time resolution. The random walk process parameters were fixed as in the static mode to 5 mm h −1/2 and 0.5 mm h −1/2 for ZWDs and gradients, respectively. The a priori values for ZHD and ZWD and the coefficients for the mapping functions were extracted from the Technische Universität Wien (TU-Wien) data base (https://vmf.geo.tuwien.ac.at/). These values are computed from the 6-hourly ECMWF operational analyses by TU-Wien and are distributed on a global 2°×2.5°latitude-longitude grid. In order to take into account the effect of ship along-track displacements on these parameters, 140 a pre-processing was carried out in order to obtain an approximate trajectory. A priori ZHDs and ZWDs were then calculated using a filtered version of this first trajectory every 30 s using a 1-hour median filtering. The mapping functions parameters were calculated only for the average daily positions but were temporally interpolated from the 6-hourly sampling to 30 s.
Two other processing parameters are of importance: the elevation cut-off angle and the observation weighting. In the standard static processing, we used a 7°cut-off angle and uniform phase observation weighting of 10 mm. The choice of these parameters 145 results from a compromise between including low elevation observations that help decorrelate position and ZTD estimates (this is especially important in kinematic mode where both parameters are estimated at every epoch) and rejecting low elevation observations which are prone to multipath errors. We tested several variants of these parameters and noticed that they had a small but significant impact on the position and ZTD estimates for R/Vs Atalante and Meteor and a very large impact on the results from R/V Maria S. Merian. The results for the latter were actually very bad as anticipated in the previous Section, and 150 after testing unsuccessfully several other processing options (especially the arc duration for satellite tracking and the ambiguity fixing strategy) we decided to test several other processing software packages. Preprint. Discussion started: 9 October 2020 c Author(s) 2020. CC BY 4.0 License. very similar to that used with GIPSY, namely: kinematic PPP mode with ambiguity resolution, VMF1 modeling for mapping functions, a priori ZHD and ZWD data from TU-Wien, and 30-h processing window. Differences between the softwares concern satellites orbit and clock products, as SPARK uses the International GNSS Service (IGS) final products, and the random walk parameters which are fixed to 3 mm h −1/2 and 0.1 mm h −1/2 for ZWDs and gradients, respectively. The elevation cut-off angle in SPARK is fixed to 7.5°and the observation weighting is not specified. However several tests conducted with GIPSY 160 showed that SPARK and GIPSY results for R/Vs Atalante and Meteor agreed best when GIPSY included a 1/ sin(elev) weighting. The main disadvantage of the SPARK online service is the impossibility of changing the processing parameters.
Nevertheless, the results of SPARK for R/V Maria S. Merian GNSS data remained largely superior to those obtained with GIPSY. It is worth noting that this is the first time in our 15 years of experience in GIPSY that it actually fails to converge towards an acceptable solution. The problem in the GIPSY processing with the data from R/V Maria S. Merian was identified 165 in the data editing module, which is an upstream processing step, where many observations were deleted because of too many cycle slips. We believe that the main difference is that SPARK uses more modern and efficient data editing and processing algorithms. It is worth noting that the GIPSY software has recently evolved into a new software called GipsyX (Bertiger et al., 2020) which uses more state-of-the-art data editing and processing compared to GIPSY. It is likely that GipsyX can resolve the problem encountered by GIPSY with the R/V Maria S. Merian data and produce solutions close to those of SPARK. This new 170 software will be tested in the near future.
Regarding the elevation cut-off angle value and observations weighting tests with GIPSY, we noticed that switching from 7.5°(taken identical to SPARK) to 3°changed the mean height estimates for R/Vs Atalante and Meteor by 2.3 mm and 3.6 mm, respectively, and mean ZTDs in a consistent way with a factor of -3.5 approximately. Similarly, the comparison of two solutions with and without observation weighting (uniform vs. 1/ sin(elev)) highlighted a difference in the mean height of 175 5.8 and 15.2 mm for the two R/Vs and consistent differences in mean ZTDs. Such changes are symptomatic of the presence of low-elevation errors due to multipath for instance. The slightly larger variations for R/V Meteor suggest that the data from this R/V are more impacted by multipath errors. The final GIPSY processing options that we retained were thus motivated by the reduction of multipath errors. The cut-off angle was therefore fixed to 7.5°and a down-weighting of low elevation angle observations was applied. Another advantage of this choice is that the GISPY processing options were consistent with those of 180 SPARK that were used for processing the R/V Maria S. Merian.
3 Comparison of processing software results

Formal errors
The first characterization of the processing results was carried out by analyzing the formal errors of the three-dimensional positions and ZTD estimates. Figure 4 shows the temporal evolution and the histograms of the formal errors for the two 185 processing software packages and the three R/Vs, and Table 3 reports the respective percentile values. Two features stand out from the plots: the shift towards higher values for the SPARK software results and the very large scatter of the R/V Maria S. Merian results for both software. The shift is mainly linked to the differences in parameterization of the two software (e.g. weighting of measurements, random walks) and input data (e.g. orbit and clocks products). The larger scatter for R/V Maria S.
Merian data is explained by the lower data quality leading to more outliers which are associated with larger formal errors.

Data screening
The analysis of the distribution of formal errors helped to set the range limits for the post-processing data screening in order to reject outliers in the ZTD and position estimates (Bock et al., 2020b). Due to the different statistical properties observed in the 200 results discussed above, different thresholds were adopted for the two softwares and the three R/Vs: -For R/Vs Atalante and Meteor, we observed for both processing a dip in the histogram of the formal errors of positions around 70 mm and 90 mm for GIPSY and SPARK, respectively. Histograms of the formal errors of ZTD did not emphasize any discontinuities. So, for GIPSY estimates, we set the range check upper limit for the formal errors of positions to 70 mm, which led to a rejection of 0.004 % (4 points out of 10 5 ) for R/V Atalante and 0.05 % (49 points out of 10 5 ) for 205 R/V Meteor. For SPARK estimates, we set the limit for formal errors of positions to 90 mm, which led to no rejection (0 point) for R/V Atalante and a rejection of 0.09 % (85 points) for R/V Meteor. With these range limits, the number of rejections for both software were fairly consistent.
-For R/V Maria S. Merian, the histograms of formal error of positions were more continuous, and only a small dip was observed around 70 cm for the GIPSY solution. The histograms of formal errors of ZTDs present a dip around 7 mm for 210 GIPSY and 13 mm for SPARK. For GIPSY, upper limits for the formal errors of position and ZTD were therefore set to 70 cm and 7 mm, respectively, which led to a rejection of 6.7 % (5261 points out of 7.510 4 ). For SPARK, the limit for formal errors of positions was set 90 cm (i.e. in the same proportion as for the other ships) and the upper limit for the formal errors of ZTD was set to 13 mm, which led to a rejection of 2.82 % (2177 points).

Comparison of position and ZTD estimates
215 Table 4 gives some statistics on the results from the two software packages after the screening. The average number of satellites used per epoch for R/Vs Atalante and Meteor is nominal (around 10) and consistent between softwares. This is not the case for R/V Maria S. Merian for which this number is much smaller for the GIPSY processing (5.6) although slighter better for SPARK (8.1). As previously mentioned, these numbers suggest that a lot of data were edited by both software, which in the https://doi.org/10.5194/essd-2020-282 Preprint. Discussion started: 9 October 2020 c Author(s) 2020. CC BY 4.0 License. case of GIPSY become very small and make the solution unstable. Figure 5 compares the height and ZTD estimates for R/V Maria S. Merian from both software from which the instability of the GIPSY solution is obvious.
The other statistics of Table 4 show that both softwares are able to estimate nearly the same number of height and ZTD parameters for R/Vs Atalante and Meteor, although the number of estimates is smaller for SPARK than for GIPSY. The height and ZTD estimates are fairly consistent between software for R/Vs Atalante and Meteor but much larger for R/V Maria S.
Merian despite the outlier screening.

225
Finally, we decided to keep the GIPSY solutions for R/Vs Atalante and Meteor for the main reason that we have access to more processing output parameters which may be useful for further investigations. For the R/V Maria S. Merian, the SPARK solution was kept because it is obviously of higher quality.  Figure 6 shows the differences between the modelled and GNSS-estimated geoid heights for the three R/Vs. Note that since the draught between the antenna reference point and the waterline of the ships is not precisely known (and is subject to variations of several tens of centimetres during the cruise), the differences were corrected for the median antenna heights.

Vertical positioning evaluation
As a result, the mean differences for all three ships were insignificant. The GNSS time series were also smoothed using a 5-min median filter in order to reduce fast heave movements of the ships. Standard deviations of the GNSS height errors for 240 the filtered data are on the order of 20 cm for R/V Meteor and R/V Atalante. These errors are consistent with the error budget described by (Bouin et al., 2009) for sea surface height determination by GNSS and the formal errors of the CNES_CLS mean surface that ranges from a few millimeters up to 10 cm in coastal zones over the area. For R/V Maria S. Merian, the standard deviation reaches 40.8 cm (Figure 6). The position estimates for this ship are much noisier, reflecting the poorer quality of the GNSS data acquired on this vessel as previously mentioned.

245
Inspection of the time series in Figure 6 shows a period of larger scatter around 20-22 January 2020. We checked that these errors are not related to the JPL and IGS satellite orbit and clock products by performing a kinematic mode processing of GNSS observations for nearby terrestrial stations. The latter did not show this feature. The impact of a higher speed of ships during this period is also not suspected, since the speed values are of the same order throughout the entire campaign. These variations are therefore more likely related to the sea state during this period. As mentioned previously, during the data processing, the ZTD was modelled with 2 components as follows: After the processing, we needed to extract the ZWD using a precise estimate of the ZHD. In this work, ZHD was computed from mean sea level pressure extracted from ERA5 reanalysis with a horizontal resolution of 0.25°and temporal sampling of 1 h (Hersbach et al., 2020) using the modified Saastamoinen formula (Saastamoinen, 1972) proposed by Bosser et al. (2007).

260
The ZHD estimates were corrected for the height difference between the GNSS antenna and the mean sea level using the following formula (Steigenberger et al., 2009;Boehm and Schuh, 2013) which was shown to be adequate for small height differences: where T (h ERA5 ) and P (h ERA5 ) are the mean temperature and pressure from ERA5 between the GNSS antenna and the ERA5 265 surface, g(h ERA5 ) = 9.8062 m s −2 ; g atm = 9.7840 m s −2 is the approximated gravity of the center of mass of the atmosphere (Boehm and Schuh, 2013); and k 1 is a refractivity constant for dry air.
The final GNSS ZWD estimate was obtained as: and the IWV was converted from ZW D(h GP S ) following: where κ(T m ) is a semi-empirical function of the weighted mean temperature T m (Bevis et al., 1992): where R v = 461.5 J K −1 kg −1 is the specific gas constant for water vapour, and k 2 and k 3 are refractivity constants for the water molecule. In this work we used the refractivity constants updated by Bock et al. (2020b) and the T m estimates provided 275 by TU-Wien on the same global grid as the a priori ZHD and ZWD products used for GNSS processing. We interpolated the GPS antenna by using the empirical formulation first proposed by (O. et al., 2005) and used by (Parracho et al., 2018):  et al., 2003). The spatial resolution of MODIS IR products is 5 km. We only used data for which the "Quality_Assurance_Infrared" flag was set to "Useful and Good". For each pass of the Aqua and Terra satellites over the EUREC 4 A domain, the closest pixel representing a valid IWV within a 20 km radius area around each vessel was considered. Furthermore, the time difference between the satellite measurement and the GNSS measurement was imposed to be less than 15 s.

295
Note that the MODIS Near-IR IWV product is known to be of higher quality than the IR product but it is only available during daytime, over infra-red reflective surfaces such as clear land, clouds, and oceanic areas in the condition of Sun glinter.
The latter condition is very restrictive as only very few observations were found be valid over our study area. For this reason we only used the IR product here.
Several studies have evaluated the MODIS IR IWV product by comparison with ground-based measurements from radioson-300 des and ground-based water vapor radiometer. They found a RMS difference between 5 and 6 kg m −2 in cloud free conditions (Liu et al., 2015;Ferrare et al., 2002). The accuracy of the MODIS IR IWV products is less than expected from the ERA5 reanalysis but it still provides an independent observational source of evaluation for the GNSS retrievals. Figure 7 shows the IWV time series and the differences for the three data sets and the three R/Vs. Preprint. Discussion started: 9 October 2020 c Author(s) 2020. CC BY 4.0 License.