The CM SAF ATOVS tropospheric water vapour and temperature data record: overview of methodology and evaluation


Introduction
Water vapour plays a central role in the Earth's energy and water cycles. In the lower troposphere, condensation of water vapour into precipitation provides latent heating which dominates the structure of tropospheric diabatic heating (Trenberth and Stepaniak, 2003a, b). Water vapour is also the most important gaseous source of infrared opacity in the atmosphere, accounting for about 60 % of the natural greenhouse effect for clear skies (Kiehl and Trenberth, 1997), and provides the largest positive feedback in model projections of climate change (Held and Soden, 2000).
The mission of the Global Climate Observing System (GCOS) is to ensure avail- [...] project on water vapour; its primary goals were the accurate global measurement, modelling, and long-term prediction of water vapour. Furthermore, the GEWEX Data and Assessments Panel (GDAP) has initiated the water vapor assessment (G-VAP, http://www.gewex-vap.org). G-VAP's major objective is the characterisation of long-term satellite-based tropospheric water vapour data records. A G-VAP activity is the analysis of the probability density function (PDF) of water vapour.
The Television and Infrared Observation Satellite (TIROS) Operational Vertical Sounder suite (TOVS) onboard the TIROS-N and NOAA-6 through NOAA-14 satellites consists of the High Resolution Infrared Radiation Sounder (HIRS), the Microwave Sounding Unit (MSU) and the Stratospheric Sounding Unit (SSU). The MSU and SSU have been replaced with improved instruments, the Advanced Microwave Sounding Units AMSU-A and AMSU-B, and more recently AMSU-B has been replaced by the Microwave Humidity Sounder (MHS). The instruments of the ATOVS suite (AMSU-A, AMSU-B, and HIRS) are flying onboard the NOAA-15 through NOAA-17 satellites. On NOAA-18, NOAA-19, and Metop-A, MHS replaced AMSU-B. The TOVS/ATOVS observations allow for the retrieval of water vapour and temperature profiles. The TOVS and ATOVS observations started in 1978 and 1998, respectively, and are among the longest time series available from satellite.
Retrieval methods can be separated into statistical/semi-physical and physical schemes. The semi-physical schemes retrieve the water vapour content by applying a statistical scheme (linear regression or neural networks) based on a training data set.
The physical schemes mostly use a first guess, often coming from a numerical weather prediction (NWP) model or reanalysis, as the basis for the forward computation and then vary the first guess profile until the computed set of radiances best matches the observed radiances. Processes in the atmosphere complicate the retrieval task, e.g., the co-existence of the three thermodynamic phases of water on Earth, interaction with aerosols, and uncertainties in surface emissivities and temperatures, particularly over land. The error characteristics of the retrieval or analysis will critically depend on the a priori data used, as shown by, e.g., Rodgers (2000). Several retrievals for TOVS and in particular ATOVS have been developed. An important aspect in this context is that synchronised infrared and microwave observations can be used. This way the information content increases, and clear sky as well as cloudy sky conditions are sampled. An example of TOVS retrieval is described in Scott et al. (1999) and forms the basis for a data record of atmospheric profiles. Retrieval algorithms for ATOVS are described in, e.g., Li et al. (2000) and Reale et al. (2008). Boukabara et al. (2011) developed the Microwave Integrated Retrieval System, which uses AMSU-A and MHS observations and is currently being updated to also include Special Sensor Microwave Imager/Sounder (SSMIS) observations. These retrieval schemes are presently applied operationally and have not been used so far to reprocess the ATOVS record.
With the availability of hyperspectral infrared sounders which are jointly installed with microwave radiometers onboard the National Aeronautics and Space Administration's (NASA) Aqua, EUMETSAT's Metop-A/Metop-B and the Joint Polar Satellite System (JPSS) Suomi National Polar-orbiting Partnership (NPP) platforms, the retrieval capacity has been enhanced. This development started with AIRS onboard Aqua, which has been in orbit since 2002. AIRS covers the infrared spectrum from 3. [...] Vonder Haar et al., 2012). An overview of available satellite and reanalysis records is provided in the G-VAP assessment plan available at http://www.gewex-vap.org. More information on available satellite data records can also be found at http://ecv-inventory.com.
This paper introduces the CM SAF ATOVS tropospheric humidity and temperature data record. The ATOVS observations are consistently reprocessed with a fixed processing chain. The main elements of the processing chain are the AVHRR and ATOVS Pre-processing Package (AAPP, Atkinson, 2011), the International ATOVS Processing Package (IAPP) retrieval algorithm (Li et al., 2000) and the Kriging algorithm (Schröder et al., 2013). The ATOVS data record is freely available from http://www.cmsaf.eu/wui and referenced under Digital Object Identifier (DOI): 10.5676/EUM_SAF_CM/WVT_ATOVS/V001. This paper is based on the Algorithm Theoretical Basis Document (ATBD) and the validation report available at http://www.cmsaf.eu/docs. After introducing the technical specifications of the ATOVS data record and the input data, the pre-processing, the retrieval and the post-processing are described. In Sect. 4, we show results from the comparison of the ATOVS data record to the GUAN radiosonde observations and the AIRS data record for the period 1999-2011 and for the period 2003-2011, respectively. In order to enhance readability we focus on TPW and LPW here. Finally, we provide conclusions.

Product description
The ATOVS data record contains tropospheric water vapour and temperature products and is defined at all longitudes and for latitudes between 80° N and 80 [...]
The following products are included in the ATOVS data record:
-Vertically integrated water vapour or total precipitable water vapour (TPW) in kg m⁻².
-Layered products for 5 layers: layer vertically integrated precipitable water vapour (LPW) in kg m⁻², layer mean temperature in K.
-Products at 6 pressure levels: specific humidity in g kg⁻¹, temperature in K.
Relative humidity for 5 layers is provided as additional, auxiliary data. The layer and level definitions are given in Table 1 and [...]
With a maximum timeliness of two months, CM SAF also provides an "operational" version of the ATOVS products. This data set has been operational since 2009 and covers the period 2004-present. The operational processing scheme has used ECMWF Integrated Forecast System (IFS) forecasts since March 2012, does not apply SNOs and is based on various retrieval versions. Currently, the implementation of the IAPP version 4 is being carried out to allow the processing of Metop-B data. The operational ATOVS products are routinely compared against GUAN observations; the results of this comparison are subject to an annual review and are published at www.cmsaf.eu/docs.
The operationally processed ATOVS data record is freely available from www.cmsaf.eu/wui.
[...] 150 GHz are located in the atmospheric window and are sensitive to water vapour in the lowest layers of the atmosphere. The MHS channels are similar to the AMSU-B channels. The third ATOVS instrument, HIRS/3 (replaced by HIRS/4 on NOAA-18, NOAA-19, and Metop-A), is a 20-channel infrared cross-track scanning sounder with an instantaneous field of view of 1.3°, providing a nominal spatial resolution of 18.9 km (improved to 10 km for HIRS/4). HIRS infrared observations are affected by surface properties, clouds, temperature and water vapour.
The retrieval of the geophysical parameters is done using the IAPP software version 3.0b (see Sect. 3.3). IAPP uses the following ATOVS channels: HIRS channels 1 to 17, AMSU-A channels 1 to 15, and AMSU-B channels 17 to 20. When an instrument channel became faulty on a specific satellite, this channel was removed from the retrieval for the entire reprocessing time period for that particular satellite. Such channels are: AMSU-A channels 11 and 14 on NOAA-15, AMSU-A channel 4 on NOAA-16 and AMSU-A channel 7 on Metop-A.
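The channel selection described above can be written down as a small lookup. The following is a minimal sketch, not the IAPP configuration itself: the names `IAPP_CHANNELS`, `FAULTY_CHANNELS` and `channels_for` are hypothetical, and the replacement of AMSU-B by MHS on the later satellites is not modelled.

```python
# Channels used in the retrieval, as listed in the text.
IAPP_CHANNELS = {
    "HIRS": set(range(1, 18)),     # HIRS channels 1 to 17
    "AMSU-A": set(range(1, 16)),   # AMSU-A channels 1 to 15
    "AMSU-B": set(range(17, 21)),  # AMSU-B channels 17 to 20
}

# Channels removed for the entire reprocessing period, per satellite (from the text).
FAULTY_CHANNELS = {
    "NOAA-15": {"AMSU-A": {11, 14}},
    "NOAA-16": {"AMSU-A": {4}},
    "Metop-A": {"AMSU-A": {7}},
}


def channels_for(satellite):
    """Return the sorted channel numbers per instrument for one satellite."""
    excluded = FAULTY_CHANNELS.get(satellite, {})
    return {
        instrument: sorted(channels - excluded.get(instrument, set()))
        for instrument, channels in IAPP_CHANNELS.items()
    }


noaa15 = channels_for("NOAA-15")
print(noaa15["AMSU-A"])  # channels 11 and 14 are absent for NOAA-15
```

Keeping the exclusions in a per-satellite table mirrors the rule that a faulty channel is dropped for the whole period rather than from the date of failure.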
The IAPP relies on the use of a priori data. The following European Centre for Medium-Range Weather Forecasts (ECMWF) ERA Interim reanalysis fields (Dee et al., 2011) are used as a priori information: temperature profile, relative humidity profile, 2 m dew point, 2 m temperature, skin temperature, surface pressure, geopotential height, sea ice cover, land sea mask, total column water vapour, and horizontal and vertical winds at 10 m.

Input data pre-processing
The input data pre-processing is carried out in two steps. First, AAPP is used to convert the geo-referenced and calibrated brightness temperatures (level 1c, taken from ECMWF's Meteorological Archival and Retrieval System, MARS) into mapped data (level 1d). During this process the scan lines are also sorted according to time. Furthermore, the AAPP decoding tool for the Binary Universal Form for the Representation of meteorological data (BUFR) is used to read the level 1c data. [...] Numerical Weather Prediction (NWP SAF). An overview of the AAPP is given in Atkinson (2011), a scientific description is available from Labrot et al. (2011) and the software description from Labrot et al. (2012).
Secondly, Simultaneous Nadir Overpasses (SNO) coefficients are applied to the AMSU-B data to intercalibrate observations from the different satellites. The SNO coefficients used to process the ATOVS data record are described in John et al. (2012) and were provided as monthly mean brightness temperature differences for the satellites NOAA-15 to NOAA-18 and Metop-A, covering the period January 2001-December 2010. Since NOAA-16 exhibits temporal overlap with all other satellites having ATOVS instruments onboard, it has been used as the reference satellite for the SNO intercalibration. John et al. (2012) emphasise that the quality of the intercalibration using classical SNO approaches is hampered by the overrepresentation of cold scenes. The biases between the satellites depend on the scene radiance, which itself depends on the latitude at which the observation is made. Improvements to classical SNO approaches were suggested by John et al. (2012) for AMSU-B and developed by Shi and Bates (2011) for HIRS. Unfortunately, at the time of the data record processing, no intercalibration coefficients based on the conclusions of John et al. (2012) were available. In general, intercalibration coefficients are also available for AMSU-A (see Zou and Wang, 2011 for details) and HIRS (see Shi and Bates, 2011 for details). However, they are applicable to limb-corrected observations and thus not usable for the processing of the ATOVS data record, as the IAPP requires non-limb-corrected radiances as input.
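To illustrate how monthly mean brightness temperature differences could be applied, a minimal sketch follows. It assumes that each coefficient is stored as "satellite minus reference" in K and is simply subtracted; this sign convention, the table layout, the function name `apply_sno` and the numerical value are assumptions for illustration and are not taken from John et al. (2012).

```python
REFERENCE_SATELLITE = "NOAA-16"  # reference satellite named in the text


def apply_sno(tb, satellite, channel, year, month, sno_table):
    """Shift a brightness temperature (K) onto the reference satellite's scale.

    sno_table maps (satellite, channel, year, month) to the monthly mean
    difference "satellite minus reference" in K (assumed sign convention).
    """
    if satellite == REFERENCE_SATELLITE:
        return tb  # the reference satellite is left unchanged
    key = (satellite, channel, year, month)
    if key not in sno_table:
        raise KeyError(f"no SNO coefficient for {key}")
    return tb - sno_table[key]


# Hypothetical coefficient, for illustration only.
sno_table = {("NOAA-15", 18, 2005, 7): 0.4}
corrected = apply_sno(250.0, "NOAA-15", 18, 2005, 7, sno_table)
print(round(corrected, 2))
```

A single offset per month and channel cannot represent the scene-radiance dependence of the biases discussed above, which is the limitation of the classical approach.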

Retrieval
The retrieval software used to generate the ATOVS data record is the IAPP version 3b developed by the University of Wisconsin in Madison, WI, USA (Li et al., 2000). The IAPP retrieves, among other atmospheric parameters, temperature and moisture profiles in both clear and cloudy atmospheres at 42 pressure levels. The IAPP algorithm can be decomposed into the following steps: the HIRS cloud detection and removal procedures, the bias adjustment relative to collocated radiosonde observations, and the actual retrieval. The retrieval involves two steps. First, the initial temperature, water vapour and ozone profiles and the surface skin temperature are obtained by statistical regression between the ATOVS measurements and the ERA Interim reanalysis. Second, an iterative physical solution of the radiative transfer equation is computed using the first guess results and the ERA Interim reanalysis as background information. The physical iterative retrieval algorithm, the cloud detection procedure and the bias adjustment method are described in detail in Li et al. (2000) and are recalled in Courcoux and Schröder (2013). Here, we recall that the HIRS cloud detection algorithm is applied to 3 × 3 adjacent HIRS pixels. When one or more pixels are cloud free, the retrieval process is applied. If not, the cloud removal process can be applied; however, it is not implemented in the IAPP version 3b utilised here. A land-only and an ocean-only scattering index threshold is applied to AMSU-A observations in order to flag pixels affected by strong scattering events, which typically occur in the presence of strong precipitation, snow cover or sea ice. The microwave surface emissivity is part of the solution, while the infrared surface emissivity is set to 0.99 during the retrieval process (Li et al., 2000).

Post processing
The retrieval outputs are first quality controlled according to the following criteria: TPW between 0 and 90 kg m⁻², temperature between 180 and 340 K, specific humidity between 0 and 55 g kg⁻¹, surface emissivity between 0 and 1, and surface pressure between 0 and 1050 hPa. If profile or surface values are outside these ranges or if the profile exhibits superadiabaticity, the full profile is set to undefined. After quality control, the 42-level profiles are integrated and averaged to obtain the final products described in Sect. 2. Finally, an objective interpolation technique commonly called Kriging is applied to the quality controlled and integrated products. The advantage of applying Kriging is that it fills data gaps and that uncertainty estimates at grid level are computed. The principle of Kriging is that an estimate or prediction for an unobserved location is computed by using the observations from locations in its vicinity. The optimal estimate at each grid point is found by a weighted average of the information from the surrounding points. The challenge is to determine these optimal weights. The weights depend on the distance-dependent spatial correlation function and the error of the observations used. The Kriging algorithm used for the ATOVS data record is described in detail in Schröder et al. (2013). The extra-daily standard deviation for the monthly means, the Kriging error for the daily means, and the number of valid observations per grid box are part of the ATOVS data record.
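The range checks and the vertical integration can be sketched as follows. This is an illustrative sketch only: the function names, the trapezoidal rule, the four pressure levels and the profile values are assumptions, the superadiabaticity test is not modelled, and the integral used is the standard column expression TPW = (1/g) ∫ q dp.

```python
G = 9.80665  # standard gravity in m s^-2

# Valid ranges quoted in the text.
QC_RANGES = {
    "tpw": (0.0, 90.0),                # kg m^-2
    "temperature": (180.0, 340.0),     # K
    "specific_humidity": (0.0, 55.0),  # g kg^-1
    "surface_emissivity": (0.0, 1.0),
    "surface_pressure": (0.0, 1050.0), # hPa
}


def within(name, values):
    """True if every value lies inside the valid range for this quantity."""
    low, high = QC_RANGES[name]
    return all(low <= v <= high for v in values)


def precipitable_water(pressure_hpa, q_g_per_kg):
    """Trapezoidal integral of q dp / g in kg m^-2.

    Levels are expected in order of increasing pressure.
    """
    total = 0.0
    for i in range(len(pressure_hpa) - 1):
        dp_pa = (pressure_hpa[i + 1] - pressure_hpa[i]) * 100.0       # hPa -> Pa
        q_mean = 0.5 * (q_g_per_kg[i] + q_g_per_kg[i + 1]) / 1000.0   # g/kg -> kg/kg
        total += q_mean * dp_pa / G
    return total


def passes_quality_control(profile):
    """False means the full profile would be set to undefined."""
    return (
        within("temperature", profile["temperature"])
        and within("specific_humidity", profile["specific_humidity"])
        and within("surface_emissivity", [profile["surface_emissivity"]])
        and within("surface_pressure", [profile["surface_pressure"]])
        and within("tpw", [profile["tpw"]])
    )


# Hypothetical profile on four levels (the real profiles have 42 levels).
profile = {
    "pressure": [500.0, 700.0, 850.0, 1000.0],
    "specific_humidity": [1.0, 4.0, 8.0, 12.0],
    "temperature": [255.0, 270.0, 280.0, 288.0],
    "surface_emissivity": 0.95,
    "surface_pressure": 1000.0,
}
profile["tpw"] = precipitable_water(profile["pressure"], profile["specific_humidity"])
print(round(profile["tpw"], 2), passes_quality_control(profile))
```

A layer product such as LPW would follow from the same function by passing only the levels that bound the layer.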

Evaluation
The ATOVS tropospheric humidity and temperature data record is compared to GUAN radiosonde observations in order to guarantee consistent and comparable evaluation results between the operational and the reprocessed ATOVS data records. [...] sense. An accuracy assessment is not carried out. Furthermore, the impact of background information and uncertainty on the observed quality is not analysed here, and we refer to, e.g., Eyre and Hilton (2013) for further reading. In Sect. 4.1 the GUAN and AIRS data records are described. The comparison considers TPW and LPW, and the results are presented in three subsections of Sect. 4.2: first, the TPW time series from the ATOVS, GUAN and AIRS data records are presented and discussed. Second and third, the comparison results between the ATOVS and GUAN data records and between the ATOVS and AIRS data records are discussed.

GUAN
The GUAN radiosonde network has been established by GCOS in order to make available current and historical upper air data for climate change detection and climate monitoring. GUAN provides global radiosonde observations from homogeneously distributed upper air stations that have a specific record length and meet the continuity and data quality requirements defined by GCOS (Daan, 2002). At present there are 171 GUAN stations worldwide. A station map and a station list can be found at http://www.wmo.int/pages/prog/gcos/index.php?name=ObservingSystemsandData. The GUAN data are distributed via the Global Telecommunication System (GTS) and archived at Deutscher Wetterdienst (DWD). The processing of GUAN data at DWD was done consistently, with one exception: in October 2008 the archiving system at DWD was changed and the GUAN processing software was adapted. However, the results of the comparison between the ATOVS and GUAN data records do not exhibit any distinct feature in October 2008.
The quality of radiosonde observations is affected by a series of issues such as temporally and spatially varying radiosonde types and national practice (e.g., Soden and Lanzante, 1996; Christy and Norris, 2009; Moradi et al., 2010) and issues and differences in calibration procedures (e.g., Miloshevich et al., 2006; Vömel et al., 2006). [...] series of correction algorithms have been developed by, e.g., Miloshevich et al. (2004), Leiterer et al. (2005) and Miloshevich et al. (2009), which mainly focus on RS80 and RS92 radiosonde observations. Such corrections have not been applied to the utilised GUAN observations. Examples of reprocessed radiosonde archives which include temperature and water vapour are the Integrated Global Radiosonde Archive (IGRA) (Durre et al., 2006) and its homogenised version (Dai et al., 2011). Dai et al. (2011) describe a few known discontinuities in humidity observations from radiosondes. These are: the dew point depression (DPD) was set to 30 °C under dry conditions at several stations, and temperature observations under cold conditions for "early radiosonde hygrometers" were unreliable and were reported as missing (Dai et al., 2011).

AIRS
AIRS is an infrared cross-track scanning instrument that has been flying together with an AMSU-A radiometer onboard the NASA Aqua satellite since 2002. The Level 2 AIRS data record which is used for comparison is the AIRX2RET product provided by the NASA Goddard Earth Science Data and Information Service Center (GES DISC, http://daac.gsfc.nasa.gov/). This product is based on AIRS and AMSU-A observations. The processing version is V5.0 for the data from 2002 to 30 September 2007 and V5.2 for data from 1 October 2007 onwards. AIRS L2 products come in swath-based files of 6 min length; 240 files cover one day. The products have a spatial resolution of 50 km × 50 km and the profiles are defined on 14 layers at 1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 70, and 50 hPa. The format is HDF4. Among others, the AIRS Standard Products consist of cloud properties and profiles of temperature and water vapour. The products are the results of employing the combined AIRS-IR/AMSU-A-microwave retrieval which is described in detail in Susskind et al. (2003, 2006, 2011).
The retrieval process also involves a cloud clearing process which assumes that the radiative properties in each field of view are a function of cloud fraction only. A retrieval solution is rejected when the cloud fraction is larger than 90 %. A scattering (rain) index is not explicitly applied. Infrared and microwave surface emissivities are part of the solution. For the comparison to the ATOVS data record, the data field Qual_H2O was evaluated and "best" and "good" quality data were used in the comparison.
An evaluation of the accuracy of the AIRS version 5 TPW products is given in Bedka et al. (2010), who compared the satellite products to ground-based observations at selected Atmospheric Radiation Measurement (ARM) sites. Using ground-based microwave radiometer observations at Barrow, Southern Great Plains-Lamont (SGP) and Nauru, they found an average relative error which is typically smaller than 5 % for all sites, except at SGP, where AIRS is too moist when TPW is less than 10 kg m⁻². At SGP, night-time observations exhibit a dry bias of approximately 10 % when TPW is greater than 10 kg m⁻² (Bedka et al., 2010).
Recently, the AIRS version 6 products have been released. Improvements over version 5 are described at http://airs.jpl.nasa.gov/data/why_version_6; the first comparison results of version 6 and version 5 products to ECMWF can also be found on the webpage. The AIRS version 6 products may still be improved, and product validation and quality assurance are ongoing. Thus the validation state of its products is "provisional".
[...] distribution of the radiosonde stations between the two hemispheres, coupled to a TPW annual cycle which differs between the hemispheres, leads to the TPW annual cycle that one can observe in the GUAN data record time series in Fig. 3. It can also be seen that the annual cycle in TPW from GUAN has slightly larger amplitudes in 1999 and 2000 than from 2001 onwards. The larger amplitudes in the first two years are caused by stronger minima during boreal winters. When looking at the time series of deseasonalised anomalies (not shown), a breakpoint is found between June and July 2001. The strength of the breakpoint is computed as the difference between the average TPW over the 24 months before and the 24 months after the break. Using this strength and the averaged standard deviation over the two periods as input to a two-sided t test shows that the strength of the change is associated with a coverage probability of 92 %. Thus, the breakpoint is not considered to be significant when applying a standard significance level of 0.05. In the following, the strength and the coverage probability are computed and interpreted in the same way.
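The breakpoint strength and its coverage probability can be illustrated with the sketch below. The exact form of the test statistic is not given in the text, so the formula used here (strength divided by the averaged standard deviation times sqrt(2/n)) is an assumption, and the coverage probability is approximated with a normal distribution instead of the t distribution to stay within the Python standard library. The function name `breakpoint_strength` and the synthetic series are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def breakpoint_strength(series, break_index, window=24):
    """Return (strength, statistic, coverage) for a break before series[break_index].

    strength: mean of the `window` months after the break minus the mean of
              the `window` months before it.
    statistic: strength / (averaged standard deviation * sqrt(2 / window)),
               an assumed form of the test statistic.
    coverage: two-sided coverage probability from a normal approximation.
    """
    if break_index < window or break_index + window > len(series):
        raise ValueError("not enough data on one side of the break")
    before = series[break_index - window:break_index]
    after = series[break_index:break_index + window]
    strength = mean(after) - mean(before)
    sigma = 0.5 * (stdev(before) + stdev(after))  # must be non-zero
    statistic = strength / (sigma * sqrt(2.0 / window))
    coverage = 1.0 - 2.0 * (1.0 - NormalDist().cdf(abs(statistic)))
    return strength, statistic, coverage


# Synthetic anomalies with a step of 0.5 after month 24 (illustration only).
series = [0.0, 1.0] * 12 + [0.5, 1.5] * 12
strength, statistic, coverage = breakpoint_strength(series, 24)
print(strength, coverage > 0.95)
```

Under this reading, a coverage probability of 92 % stays below the 95 % that corresponds to a significance level of 0.05, which is why the GUAN breakpoint is not considered significant.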

Time series of the different data records
The ATOVS TPW data record exhibits a breakpoint between January 2001 and [...]
Finally, a specific feature of Kriging is discussed. Kriging requires two independent measurements, such as from different satellites or from the morning and afternoon overpasses of a single satellite. For the period January 1999 to February 2001 only NOAA-15 was available. Then, it may happen that the same location is not observed twice a day, e.g., due to the occurrence of strong precipitation events. When this happens Kriging is not applied and the daily average is flagged as undefined. For the June 2000 case study, the number of valid observations is 12 % smaller in the Kriging product than in the arithmetically averaged product, and the number of undefined values is 9 % larger in the Kriging product than in the arithmetically averaged product. Indeed, the positions of the minima in the number of valid observations and of the maxima in the number of undefined values coincide with the position of the Inter Tropical Convergence Zone (ITCZ) (not shown).
It seems that gap filling through Kriging has a noticeable impact on monthly means and largely explains the breakpoint between January and February 2001.

Comparison to GUAN data
The methodology for the comparison of the ATOVS data record against the GUAN radiosonde data record is the following: first, the GUAN data record is integrated to match the vertical layer and level definitions of the ATOVS data record water vapour products. For each day, only stations having at least two radiosonde launches per day are used and averaged to daily values. The ATOVS data record is spatially collocated to the position of the GUAN stations using a nearest neighbour algorithm. The collocated daily averages form the basis for the comparison. We analyse the monthly bias and the bias-corrected root mean square error (RMSE) between the ATOVS and GUAN data records. The number of valid collocations per month is greater than or equal to 450.
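The two statistics named above can be sketched as follows. The sign convention (ATOVS minus GUAN), the function names and the sample values are assumptions for illustration; the collocation itself is taken as given.

```python
from math import sqrt


def daily_station_mean(launch_values, min_launches=2):
    """Average the launches of one station and day, or None if too few launches."""
    if len(launch_values) < min_launches:
        return None
    return sum(launch_values) / len(launch_values)


def bias_and_rmse(satellite, reference):
    """Return (bias, bias-corrected RMSE) of collocated daily averages.

    bias: mean of (satellite - reference).
    bias-corrected RMSE: root mean square of the differences after the bias
    has been subtracted.
    """
    diffs = [s - r for s, r in zip(satellite, reference)]
    n = len(diffs)
    bias = sum(diffs) / n
    rmse = sqrt(sum((d - bias) ** 2 for d in diffs) / n)
    return bias, rmse


# Hypothetical collocated TPW values in kg m^-2.
atovs = [21.0, 19.0, 32.0]
guan = [20.0, 20.0, 30.0]
bias, rmse = bias_and_rmse(atovs, guan)
print(round(bias, 3), round(rmse, 3))
```

Removing the bias before taking the root mean square separates the systematic offset from the scatter, so the two quantities can be tracked independently over time.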
First of all, the difference in quality between the reprocessed and operational ATOVS products is discussed. Figure 5 presents the comparison results between TPW from the reprocessed and operational ATOVS data records and the GUAN data record. Figure 5 clearly shows that the TPW product from the reprocessed ATOVS data record exhibits better quality and stability than the TPW from the operational ATOVS product. The bias of the operational ATOVS product compared to the GUAN data record shows a significant breakpoint between April and May 2009. At this time the following changes were implemented in the operational ATOVS processing chain: migration of the processing chain, update of AAPP and IAPP, removal of NOAA-15 and NOAA-18 observations from the retrieval, and implementation of Metop-A and NOAA-19 observations in the retrieval. The obvious improvement for the reprocessed data record is that the breakpoint in the bias time series is largely reduced, and this also leads to an improved averaged bias. We refer to Sect. 4.2.3 for further discussion. Moreover, the RMSE is slightly smaller for the reprocessed data record than for the operational product.
Now, we focus on the comparison between the reprocessed ATOVS data record and the GUAN data record that is shown in the left panel of Fig. 5. The TPW bias is typically smaller than 1 kg m⁻² and the average bias is −0.16 kg m⁻²; furthermore, the bias is stable. Interestingly, no breakpoint is observed in the bias time series between 2000 and 2001. The breakpoints in the averaged, non-collocated TPW time series of GUAN and ATOVS have the same direction, a different strength and occur with a temporal difference of 5 months. This might translate into a breakpoint in the bias time series which is smaller than the ATOVS breakpoint itself and which is overlain by an anomaly between February and June 2001. This is not evident in the bias time series due to the collocation process: gap filling is mainly applied in the presence of strong precipitation. Associated areas are typically found in ITCZ and storm track regions with poor coverage of GUAN stations. Due to the collocation procedure, data from these areas have a reduced impact on the breakpoint. In fact, the ATOVS time series of collocated TPW values exhibits a breakpoint which is reduced by 0.7 kg m⁻². The GUAN data record and the ATOVS data record are sampled in a similar way; consequently, the collocated GUAN data also exhibit a breakpoint between January and February 2001, and the breakpoint is not evident in the bias time series. The averaged RMSE is 3.25 kg m⁻² and the RMSE is stable from 2001 onwards with values around 3 kg m⁻². The RMSE exhibits maximum values in the first two years of the time series. This is expected since for the years 1999 and 2000 only one satellite is available for the processing, while for the rest of the processing at least two satellites are used. This behaviour was also observed in a comparison between SSM/I based and ERA-Interim TPW products: the RMSE decreased with the transition from a single satellite product to a multiple satellite product (Schröder et al., 2013). [...] the increase in natural variability enhances the representativity uncertainty between the point and the areal observations.
Figure 6 presents the comparison results between the ATOVS and the GUAN LPW products, again in terms of bias and RMSE. The LPW bias for layer 5 (surface-850 hPa) is around −0.7 kg m⁻², while the LPW biases for layers 3 and 4 (850-700 hPa and 700-500 hPa) are between 0 and 0.6 kg m⁻². As for the TPW, a slight increase in bias is present in early 2009, and an explanation for this increase is given in Sect. 4.2.3. The LPW biases for layers 1 and 2 are relatively small with values around 0.002 and 0.003 kg m⁻², respectively. The LPW bias for layer 2 exhibits an unexplained anomaly of approximately 0.2 kg m⁻² in 1999. Maximum RMSE values are slightly larger than 2.5 kg m⁻² and are found for layer 5. LPW RMSE values for layers 2 to 4 exhibit a decrease after the first two years, as was found for the TPW. Furthermore, the RMSE typically exhibits an annual cycle similar to the TPW RMSE.
When relative values are considered (not shown), the bias is the smallest (largest) for LPW5 (LPW1) with average values of −6 and 36 %, respectively. The moist bias in the upper troposphere can be expected due to the observed dry bias in uncorrected radiosonde observations (e.g., Soden and Lanzante, 1996). The relative RMSE systematically increases with height. The lowest layer has an average relative RMSE of 30 %, the highest layer of 107 %. Our results exhibit magnitudes similar to the results presented in Reale et al. (2012). They compared humidity profiles from various satellite products, among them the NOAA ATOVS product (Reale et al., 2008), with GTS radiosonde data and found typical values of 25-50 % below 400 hPa and up to 100 % and more in the upper troposphere.
Although outside the focus of this work, we briefly recall that various uncertainties contribute to the observed differences. Here, we compare point measurements with areal observations. Thus, the representativity uncertainty impacts the observed differences. To our knowledge the representativity uncertainty is not known at each GUAN station. However, for assimilation purposes high resolution models have been used to assess such uncertainties. An analysis example is given in Waller et al. (2014) for specific humidity: they found a strong dependence of the representativity uncertainty on height and weather state. Furthermore, the comparison of the ATOVS and the GUAN data is based on daily averages, and the differences in sampling between the radiosonde and the satellite observations contribute to the observed differences. To give an example of the diurnal sampling uncertainty we refer to the work of Dai et al. (2002). Using high-temporal resolution Global Positioning System (GPS) data from stations over North America, they found that the uncertainty in seasonally averaged TPW is within ±3 % or ±0.5 kg m⁻² when the sampling is changed from 30 min to twice daily at 00:00 and 12:00 UTC. Finally, we refer to the work of Pougatchev et al. (2009) and Sun et al. (2010). They compared temperature and relative humidity profiles from radiosonde data to IASI and to the Constellation Observing System for Meteorology, Ionosphere and Climate (COSMIC) observations in order to analyse the uncertainty arising from temporal and spatial mismatches.

Comparison to AIRS data
To be able to compare the ATOVS data record with the AIRS data record, the AIRS profiles are vertically integrated according to the ATOVS layer definitions (see Sect. 2). Then, the swath-based products are gridded onto the ATOVS spatial grid, and finally all data are averaged to obtain monthly means which form the basis for the comparison.
The number of valid collocations per month is typically larger than 60 000. The RMSE is relatively stable with values around 2.4 kg m⁻².
The LPW bias is shown in Fig. 8 and exhibits features similar to the bias for TPW, except for the LPW bias for layer 5, which exhibits an annual cycle. The breakpoint observed in the comparison of TPW in early 2009 is also evident for LPW for layers 3 to 5. Relative to the TPW bias and the LPW bias for layers 3 to 5, the RMSE is stable over time. The LPW bias for layer 1 and the LPW RMSE for layer 3 exhibit a distinct feature between late 2005 and early 2009. This coincides with the use of NOAA-18 observations in the retrieval, while MHS onboard this particular satellite exhibited a series of technical issues (see http://www.ospo.noaa.gov/Operations/POES/status.html).
The RMSE between the ATOVS and the AIRS data records does not exhibit a pronounced annual cycle and is typically smaller than the RMSE between the ATOVS and the GUAN data records likely because the number of valid collocations is larger and equally distributed over the northern and Southern Hemispheres and because the comparison of point measurements with areal observations likely exhibits larger representativity uncertainties than the comparison of two areal observations.However, the biases for TPW and LPW are larger between the ATOVS and the AIRS data records than between the ATOVS and the GUAN data records.The relatively large bias between the ATOVS and the AIRS data records is discussed and analysed in more detail.Schneider et al. (2012) compared the TPW of a SSM/I+Medium Resolution Resolution Imaging Spectrometer (MERIS) product to TPW products from GUAN, AIRS (V5) and ATOVS (operational CM SAF ATOVS data) for the period 2004-2008.For the comparison to AIRS, the AIRS cloud mask has been applied because TPW from MERIS is only available under clear sky conditions.They found biases (RMSEs) of 0.5 and −1.1 kg m −2 (2.3 and 3.3 kg m −2 ) relative to AIRS and ATOVS, respectively.Due to the clear sky bias (e.g., Sohn and Bennartz, 2008;Mieruch et al., 2010) these values cannot be directly compared to the results of this work.Nevertheless, it shows that the ATOVS product is more humid than the AIRS product, and the comparison of the ATOVS data to the AIRS data exhibits similar RMSE values to our analysis.and ARM data is on average smaller than 0.1 kg m −2 .However, their comparison results to all sites exhibit a day/night contrast which is most pronounced for SGP with day-night differences of 1.6 kg m −2 .They conclude that the relative difference at SGP for values larger than 10 kg m −2 is 10 % during boreal summers and decreases during boreal winters.They argue that surface emissivity, land use and cover and unique boundary layer conditions may contribute to this 
difference. Finally, Fig. 10 shows a map of the TPW bias between the ATOVS and the AIRS data records. The bias is dominated by regions of strong precipitation and frequent cloud occurrence such as the ITCZ and the storm track regions. Relatively large values are observed in mountainous areas such as the Alps, but the maximum differences are observed over tropical land surfaces, with relative differences of about 15 %. Differences in retrieval setup and the associated uncertainties contribute to this bias. Of particular relevance in view of the spatial distribution of the bias are differences in cloud detection, in cloud clearing (not applied for the ATOVS data record) and in the handling of scattering events (screened in the ATOVS data record). In the ATOVS data record, AMSU-B observations are used, which allow a retrieval also under cloudy conditions, while in the AIRS data record cloud clearing needs to be applied to the AIRS data in order to retrieve TPW. In general, clear sky observations exhibit a systematic underestimation of TPW relative to almost all sky observations (e.g., Sohn and Bennartz, 2008). Thus, the different instrumentation might contribute to the observed differences.
As outlined earlier, the gap filling process of the Kriging contributes to the observed difference between the ATOVS and the AIRS TPW products, with the TPW from ATOVS being larger in precipitating areas than the TPW from AIRS. Schröder et al. (2013) compared the CM SAF SSM/I TPW product to the SSM/I TPW product from the University of Hamburg and from the Max-Planck-Institute for Meteorology. The only difference in the generation of both products is again that the CM SAF product is based on post-processing using Kriging. The spatial distribution of their results is very similar to the spatial distribution in Fig. 9. Thus, Kriging is a significant contributor to the observed bias between the ATOVS and the AIRS data records. As precipitation over tropical land surfaces exhibits a pronounced diurnal cycle with maxima in late afternoon and evening (e.g., Yang and Slingo, 2001), the differences between TPW from Metop-A (equator crossing time of ascending node: ∼ 21:30 LT), from NOAA-16 (∼ 19:00 LT) and from NOAA-19 (∼ 13:30 LT) have been analysed for the year 2011 (not shown). While the differences between the TPW from NOAA-19 and from Metop-A are relatively small, the differences between the TPW from NOAA-16 and from the other two satellites exhibit pronounced minima over tropical land surfaces. We found minimum differences of approximately −3 kg m−2 or −2 %, with smaller TPW values from NOAA-16 than from NOAA-19 and from Metop-A. These minima over tropical land surfaces nearly vanish when NOAA-16 and Metop-A differences are computed for the year 2008. At that time, the NOAA-16 equator crossing time was approximately 16:00 LT. It seems that the diurnal sampling in combination with the diurnal cycle of deep convection over tropical land surfaces has an impact on the ATOVS product. However, channel 4 observations from NOAA-16 have not been used as input to the retrieval. Information from channel 4 is valuable to separate information on near surface properties from the temperature and water vapour
signal of the lower troposphere. When the differences between the TPW products from the CM SAF ATOVS data record and from the AIRS data record are compared for the months of July and August for the period between 2008 and 2011 (not shown), the general coincidence of large biases and precipitating areas is still present. However, land surfaces in the northern extra-tropics exhibit an increase in bias as well, which might also be associated with convective precipitation. Because convective precipitation has a short-term impact on surface emissivity, differences in the handling of surface emissivities contribute to the overall difference as well.
Because the TPW bias between the ATOVS and AIRS data records exhibits a breakpoint in early 2009, we had a closer look at the TPW time series from the ATOVS data record and at the TPW bias time series between the ATOVS and the GUAN data records. Between April and May 2009, the ATOVS TPW anomaly time series exhibits a breakpoint of strength 0.75 kg m−2 with a coverage probability of 98 %, and the bias between TPW from ATOVS and from GUAN exhibits a breakpoint of strength 0.37 kg m

which coincides with an increasing equator crossing time (see Fig. 4 in John et al., 2012). Most obvious is that the bias for the data derived from NOAA-15 is systematically smaller than the bias for the data derived from any of the other satellites. The average difference between the NOAA-15 bias and the NOAA-16 bias, both relative to AIRS, is almost 1 kg m−2. Consequently, with the removal of NOAA-15 from the retrieval, the bias relative to the AIRS and GUAN data records increases. This might be explained by the fact that the retrieval software, the IAPP, was developed and tuned at a time when NOAA-15 was the only available satellite (Li et al., 2000). We recomputed the bias between the TPW from the ATOVS and AIRS data records with NOAA-15 data being removed from the retrieval in June

90 km × 90 km and is provided as daily and monthly averages. The product suite contains TPW, LPW, layer mean temperature as well as specific humidity and temperature at six pressure levels and is based on an optimal estimation retrieval and a post-processing by a Kriging algorithm to allow for gap filling. The ATOVS data record covers the period January 1999 to December 2011 and has been generated by the consistent application of a fixed processing chain. The reprocessing resulted in an improvement of the quality and the stability of the ATOVS data record relative to the operational ATOVS product. To our knowledge this is the first time that an ATOVS reprocessing effort has been conducted. The ATOVS data record, in NetCDF format, and
the related documents (product user manual, validation report, algorithm theoretical basis document) are freely available from the CM SAF website at http://www.cmsaf.eu/wui and http://www.cmsaf.eu/docs. The data record is referenced under doi: 10.5676/EUM_SAF_CM/WVT_ATOVS/V001. The analysis of the global TPW average from the ATOVS data record revealed a significant breakpoint between January and February 2001 which coincides with the change from the use of observations from one satellite to two satellites for the retrieval. An example of the monthly mean PDF analysis shows that the Kriging systematically fills the PDF at large values. As gaps typically occur in association with precipitation, in a humid environment, this is reasonable. Thus, the gap filling process through Kriging largely explains the breakpoint. We do not recommend using the ATOVS data record for the period from January 1999 to January 2001 for variability analysis.
The TPW and LPW products from the ATOVS data record have been compared to corresponding radiosonde observations at GUAN stations and to the AIRS version 5 data record in order to identify potential issues in the ATOVS data record and to characterise the ATOVS data record in a relative sense. The breakpoint between January and February 2001 is not evident in the bias between ATOVS and GUAN due to the collocation procedure. Based on the comparison to the GUAN data record, we find an averaged bias and an averaged RMSE of −0.2 and 3.3 kg m−2 for TPW, respectively. For LPW, maximum relative (absolute) biases and RMSE are found for the highest (lowest) layer, similar to results presented in the literature. Relative to AIRS the RMSE is typically smaller while the bias is larger and exhibits a breakpoint between April and May 2009. At that time, NOAA-15 observations were removed from the retrieval. The retrieval software was originally developed for NOAA-15 and apparently optimally tuned to this satellite. We further discussed the spatial distribution of the bias between the ATOVS and AIRS data records. Maximum biases coincide with regions of strong and frequent precipitation in the ITCZ, here in particular tropical land areas, and in the storm track regions. Various (potential) contributors to this bias have been discussed: (1) gap filling through Kriging, (2) clear sky bias, (3) differences in cloud and precipitation handling, (4) differences in the handling of surface emissivities and (5) differences in diurnal sampling. Over ocean the dominating contributor is the Kriging approach.
Over land, a separation into the individual contributors to the overall bias is a major effort as it requires, among other things, restarting the retrievals with common input, cloud and rain screening and cloud removal. Within G-VAP the intercomparison of gridded long time series satellite data records also exhibits the largest discrepancies over tropical land surfaces. Within G-VAP further analysis will be conducted to find explanations for this. Here, we conclude that the ATOVS data record should be considered with care over tropical land surfaces and also at high elevation.
Finally, it became obvious that the ATOVS data record will benefit from carefully (inter)-calibrated sensor records.

The CM SAF ATOVS water vapour and temperature products

ability of global observations for monitoring the climate system, detecting and attributing climate change, assessing impacts of and supporting adaptation to climate variability and change, and supporting climate research. The GCOS Second Adequacy Report (GCOS-82, 2003) established a priority list of 44 Essential Climate Variables (ECV) and called for integrated global analysis products. Water vapour is one of the 44 ECVs because of its key role for the radiation budget, the structure of tropospheric diabatic heating, the water cycle and atmospheric chemistry. The World Climate Research Programme (WCRP) Global Energy and Water Cycle Experiment (GEWEX) program has the objective to fully understand the water cycle for predicting climate change. GEWEX has initiated a series of projects and assessments to produce long time series of parameters linked to the water cycle and to evaluate the current maturity of such products. The Global Water Vapor Project (GVaP) was one of the GEWEX

7 to 15.4 µm with a total of 2378 channels. Since 2007 EUMETSAT's Metop satellites carry the Infrared Atmospheric Sounding Interferometer (IASI) instrument that performs observations in the infrared spectrum (3.63-15.5 µm) with 8461 channels. Finally, the Cross-track Infrared Sounder (CrIS) onboard NPP covers the infrared spectrum (3.92-15.38 µm) with 1305 channels. Of these instruments, IASI is the only instrument that covers continuously the full spectral range. AIRS, IASI and CrIS retrievals are described in, e.g., Susskind et al. (2011), August et al. (2014), and Gambacorta et al.
(2012). Examples of evaluation results for water vapour products from ATOVS and hyperspectral instruments can be found in, e.g., Bedka et al. (2010), Reale et al. (2012) and Divakarla et al. (2014). A few long-term satellite based water vapour profile data records have been generated and publicly released. To give an example, the NASA Water Vapour Project (NVAP) TPW and LPW products are based on a combination of the Special Sensor Microwave Imager (SSM/I), TOVS and radiosonde data for the time period between 1988 and 1999 (Randel et al., 1996) and contributed to the GEWEX project GVaP. NVAP has recently been reanalysed and extended to cover the period 1988-2009 (NVAP-M as part of the NASA's Making Earth System Data Records for Use in Research Environments (MEaSUREs) programme, S.

The products are available as daily and monthly means on a cylindrical equal area projection of 90 km × 90 km. The temporal coverage of the data record ranges from 1 January 1999 to 31 December 2011. The Kriging error (for daily mean products), the extra-daily standard deviation (for monthly products) and the number of valid observations per grid box are also available for each product. The data files are created following the Network Common Data Format (NetCDF) Climate and Forecast (CF) Metadata Convention version 1.5 and the NetCDF Attribute Convention for Data set Discovery version 1.0. The products are available free of charge from the CM SAF website (www.cmsaf.eu/wui).
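The auxiliary fields that accompany the monthly products can be illustrated with a small sketch. It assumes that the extra-daily standard deviation is the sample standard deviation of the daily means within a month and that the count is the number of valid daily values per grid box; both are assumptions made here for illustration, and the CM SAF processing, which relies on Kriging, may define them differently. The function name `monthly_products` is hypothetical.

```python
import numpy as np

def monthly_products(daily):
    """Monthly mean, extra-daily standard deviation and valid-day count.

    daily: array of shape (n_days, n_lat, n_lon); NaN marks missing values.
    Assumption: plain arithmetic statistics, not the Kriging-based products.
    """
    daily = np.asarray(daily, dtype=float)
    valid = np.isfinite(daily)
    n = valid.sum(axis=0)
    total = np.where(valid, daily, 0.0).sum(axis=0)
    mean = np.where(n > 0, total / np.maximum(n, 1), np.nan)
    dev2 = np.where(valid, (daily - mean) ** 2, 0.0).sum(axis=0)
    # Sample standard deviation; undefined for fewer than two valid days
    std = np.where(n > 1, np.sqrt(dev2 / np.maximum(n - 1, 1)), np.nan)
    return mean, std, n

# Three hypothetical daily TPW fields (kg m-2) on a 1 x 2 grid
daily = np.array([[[20.0, np.nan]], [[22.0, np.nan]], [[24.0, 30.0]]])
mean, std, n = monthly_products(daily)
```

Grid boxes with a single valid day receive a mean but no standard deviation, which mirrors the need to interpret the monthly fields together with the number of observations.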
the TPW is integrated from surface to 100 hPa. Examples of the ATOVS data record products are shown in Figs. 1 and 2. In Fig. 1, the monthly mean TPW for September 2007 is shown together with the corresponding extra-daily standard deviation and the corresponding number of observations per grid box. Figure 2 shows TPW for 20 September 2007, with the Kriging error expressed in terms of standard deviation (see Schröder et al., 2013 for a definition) and the corresponding number of observations per grid box. The full ATOVS time series has been reprocessed with a fixed pre-processing, retrieval and post-processing scheme described below. The reprocessed ATOVS data record was released in 2013.

ATOVS is a sounding instrument system composed of three sounders: two microwave sounders, AMSU-A and AMSU-B (replaced by MHS onboard NOAA-18, NOAA-19 and Metop-A), and the infrared sounder HIRS. ATOVS has been flying since 13 May 1998 onboard NOAA and Metop polar orbiting satellites. So far seven platforms have carried the ATOVS instruments, namely NOAA-15 through NOAA-19 as well as Metop-A and Metop-B. AMSU-A and -B are cross track scanning total power radiometers with instantaneous fields of view of 3.3 and 1.1°, providing a footprint size at nadir of 48 and 16 km, respectively. The 15 AMSU-A channels primarily provide temperature sounding of the atmosphere through channels located at the 57 GHz oxygen absorption band. AMSU-A also has three channels (at 23.8, 31.4, and 89 GHz) that provide information on tropospheric water vapour, precipitation over ocean, sea ice coverage, and other surface parameters. AMSU-B has five channels that mainly measure water vapour and liquid precipitation. Three of its channels are located in the water vapour band at 183.31 ± 1, 183.31 ± 3, and 183.31 ± 7 GHz. The channels at 89 and
To further allow a global comparison we also use the AIRS data record. AIRS observations have a large temporal overlap with the ATOVS data record. Many other ground-based, in situ and satellite observations are available for comparison. An extensive list of such data records is given in the Appendix of the G-VAP assessment plan, available at www.gewex-vap.org. The goal of the comparison of the ATOVS data record with GUAN radiosonde and

Figure 3 presents the time series of monthly averaged TPW of the three data records used for the evaluation: the ATOVS and GUAN data records for the period 1999-2011 and the AIRS data record for the period 2003-2011. The data have not been collocated, and GUAN data have only been used when at least two observations per day are available. The ATOVS and AIRS data records exhibit similar annual cycles. However, a systematic difference between both data records is evident. This is discussed in Sect. 4.2.3. The annual cycle of the GUAN time series has larger amplitudes than the annual cycles of the satellite time series. The GUAN stations are located on islands and over land, with the majority of stations on the continental Northern Hemisphere. This asymmetric

February 2001 .
The difference in TPW between the years 1999-2000 and 2002-2003 is 2.8 kg m−2 with a coverage probability of 99 %. Moreover, the annual cycle of TPW exhibits stronger minima during boreal winters during the first two years. This breakpoint is analysed in more detail in the following. First, the breakpoint does not temporally coincide with the start of the use of SNOs in January 2001. Moreover, no breakpoint is visible between December 2010 and January 2011, when the use of SNOs ends. Second, we assess the impact of Kriging on the homogeneity of the time series. We compared the PDF based on the CM SAF ATOVS data record products with that of ATOVS products which have been arithmetically averaged on the basis of daily values. Figure 4 shows the PDFs of TPW values for June 2002, separately for the CM SAF ATOVS data record products and for the arithmetically averaged monthly means. The distribution of the CM SAF ATOVS data record products exhibits an increased number of TPW values at the high end of the distribution. This is reflected in the monthly mean TPW of 25.3 kg m−2 for the classically averaged data and of 26.2 kg m−2 for the CM SAF data record products, which gives an average difference of 0.9 kg m−2. Apart from sampling, gaps are caused by strong scattering events, e.g., in the presence of strong precipitation. During gap filling, Kriging uses valid observations in the vicinity of the gaps. The areas neighbouring the gaps are typically humid, and thus Kriging fills these gaps with generally large values (see also Schröder et al., 2013). This explains the increased frequency of occurrence at the high end of the TPW distribution. The PDF does not change significantly when more than two satellites are used (not shown). The PDFs of the classically averaged monthly means for NOAA-15 and NOAA-16 alone are also shown in Fig. 4. The arithmetically averaged monthly means for NOAA-15 and NOAA-16 are 24.7 and 25.2 kg m−2, respectively, thus leading to a difference of 0.5 kg m−2.
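The effect of gap filling on the mean and on the high end of the distribution can be reproduced with a toy example. The sketch below uses linear interpolation along a one-dimensional transect as a simple stand-in for Kriging (it is not the Kriging scheme applied to the data record), and the TPW values are invented so that the gaps fall into the most humid part of the transect.

```python
import numpy as np

# Hypothetical TPW transect (kg m-2); NaN marks gaps caused by
# screened scattering events in a humid, precipitating region.
x = np.arange(12.0)
tpw = np.array([10, 12, 15, 30, 45, np.nan, np.nan, 48, 35, 20, 14, 11],
               dtype=float)
valid = np.isfinite(tpw)

# Arithmetic average over valid observations only
mean_valid = tpw[valid].mean()

# Gap filling from neighbouring valid values (stand-in for Kriging)
filled = tpw.copy()
filled[~valid] = np.interp(x[~valid], x[valid], tpw[valid])
mean_filled = filled.mean()

# Histograms on common bins show where the filled values end up
edges = [0, 20, 40, 60]
hist_valid, _ = np.histogram(tpw[valid], bins=edges)
hist_filled, _ = np.histogram(filled, bins=edges)
```

Because the filled values inherit the humidity of their neighbours, they populate the highest bin only and raise the mean, which is the qualitative behaviour described for the PDFs in Fig. 4.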
Finally, in contrast to the bias, the RMSE exhibits an annual cycle with maxima during boreal summers. When analysing the standard deviation of TPW from the GUAN data record (not shown), a pronounced annual cycle is visible with a sharp increase in standard deviation between May and July (maximum value: 4.3 kg m−2). The maximum in July is followed by a slow decrease until February (minimum value: 2.3 kg m−2). Besides a potential dependency of the uncertainty of the radiosonde observations on TPW, we argue that the dominating factor for the annual cycle in RMSE is that, with increasing TPW values during boreal summers, the natural variability of water vapour also increases and that

Figure 7 presents the comparison results of the AIRS and ATOVS TPW products. It can be seen that the TPW bias changes from approximately 1 kg m−2 in 2003 to approximately 2 kg m−2 in 2011. A breakpoint is present between April and May 2009 which temporally coincides with the removal of NOAA-15 data from ATOVS processing. The strength of the breakpoint is 0.5 kg m−2 and exhibits a coverage probability of 97 %.
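The breakpoint test behind the quoted strength and coverage probability is not specified in this section. As a simple stand-in, the following sketch estimates the strength of a breakpoint at a known position as the difference of the means after and before it, with a percentile bootstrap interval. The function name `step_strength`, the synthetic bias series and the bootstrap settings are assumptions for illustration; the bootstrap also ignores the autocorrelation that monthly time series usually have.

```python
import numpy as np

def step_strength(series, k, n_boot=2000, seed=0):
    """Mean shift at index k (after minus before) with a 95 % bootstrap interval."""
    s = np.asarray(series, dtype=float)
    before, after = s[:k], s[k:]
    shift = after.mean() - before.mean()
    rng = np.random.default_rng(seed)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        b = rng.choice(before, size=before.size, replace=True)
        a = rng.choice(after, size=after.size, replace=True)
        boots[i] = a.mean() - b.mean()
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return shift, lo, hi

# Synthetic monthly bias series (kg m-2) with a step of 0.5 after 72 months
rng = np.random.default_rng(1)
bias = 1.5 + 0.2 * rng.standard_normal(96)
bias[72:] += 0.5
shift, lo, hi = step_strength(bias, 72)
```

An interval that excludes zero indicates a shift that is unlikely to arise from the month-to-month scatter alone, under the stated independence assumption.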

Figure 1. TPW (left panel), extra-daily standard deviation (middle panel) and number of valid observations per grid point (right panel) for September 2007.

Figure 4. Histogram of the TPW values for June 2002. The dashed line represents the histogram of the CM SAF ATOVS humidity and temperature TPW product using the data from the NOAA-15 and NOAA-16 satellites and the Kriging method for averaging. The red line represents the histogram of the data from the NOAA-15 and NOAA-16 satellites being averaged using the arithmetic averaging method. The solid black and green lines represent the data from the NOAA-15 and NOAA-16 satellites, respectively (averaged also using the arithmetic averaging method).

Figure 11. TPW bias between the ATOVS and the AIRS V5 data records, as in Fig. 7. Furthermore, the figure shows the bias between the ATOVS data record without the use of NOAA-15 data from June 2008 onwards (instead of May 2009).
The AAPP software is being developed and maintained by the EUMETSAT Satellite Application Facility for Numer- . A se-

coverage probability of 97 %. The removal of NOAA-15 observations from the retrieval in May 2009 evidently introduces a breakpoint to the ATOVS time series. After the removal of NOAA-15 observations from the retrieval, from May 2009 onwards, we still use two satellites for the ATOVS processing. When looking at Figs. 3, 5 and 7 we do not see further coincidences between an apparent breakpoint and a change from two to three satellites or vice versa. Thus, we do not expect that this breakpoint can be explained by sampling or Kriging effects. Consequently, we extended the analysis by comparing the AIRS data record to the ATOVS products for each NOAA and Metop satellite separately. During this exercise the Kriging routine is not applied to the ATOVS data. Figure 10 shows the bias between the ATOVS products derived from each NOAA and Metop satellite and the AIRS data record. During overlapping periods data from Metop-A, NOAA-16 and -19 exhibit similar biases relative to AIRS, with values around 2 kg m−2. Also noticeable is the increase in bias for NOAA-16 between 2003 and 2009

The development of such Fundamental Climate Data Records (FCDR) is currently carried out in a cooperation between CM SAF and EUMETSAT.

facility of the ECMWF. ECMWF and NASA GES DISC are acknowledged for providing the ERA Interim and AIRS version 5 data, respectively. Finally, we thank Nathalie Selbach and Stephan Finkensieper from DWD for valuable discussions and technical support.

Table 1. Layer and level definitions for the ATOVS data record.
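For orientation, the relation between a specific humidity profile and the column products can be sketched as follows. TPW is taken as the vertical integral of specific humidity over pressure divided by the gravitational acceleration, here from the surface to 100 hPa, and LPW as the same integral over a single layer. The layer bounds and the profile values used below are hypothetical placeholders, not the definitions of Table 1, and the function name `column_water` is an assumption; the actual products result from the retrieval and the Kriging post-processing rather than from this direct integration.

```python
import numpy as np

G = 9.80665  # standard gravitational acceleration in m s-2

def column_water(p_hpa, q, p_bottom, p_top):
    """Integrate specific humidity q (kg/kg) between two pressures (hPa).

    Returns the water vapour column in kg m-2 using the trapezoidal rule.
    """
    p = np.asarray(p_hpa, dtype=float)
    q = np.asarray(q, dtype=float)
    order = np.argsort(p)
    p, q = p[order], q[order]
    inner = p[(p > p_top) & (p < p_bottom)]
    grid = np.unique(np.concatenate([inner, [p_top, p_bottom]]))
    q_grid = np.interp(grid, p, q)
    # Pressure converted from hPa to Pa via the factor 100
    integral = np.sum(0.5 * (q_grid[1:] + q_grid[:-1]) * np.diff(grid)) * 100.0
    return integral / G

# Hypothetical profile and layer bounds (not the Table 1 definitions)
p = [1000, 850, 700, 500, 300, 200, 100]               # hPa
q = [0.012, 0.008, 0.005, 0.002, 0.0004, 0.0001, 0.0]  # kg/kg
tpw = column_water(p, q, 1000, 100)
layers = [(1000, 850), (850, 700), (700, 100)]
lpw = [column_water(p, q, bottom, top) for bottom, top in layers]
```

When the layer bounds coincide with profile levels and cover the full column, the layer values sum to the total, which offers a quick consistency check between TPW and LPW.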