The MUSICA IASI {H2O, δD} pair product

We present a global and multi-annual space-borne dataset of tropospheric {H2O, δD} pairs that is based on radiance measurements from the nadir thermal infrared sensor IASI (Infrared Atmospheric Sounding Interferometer) onboard the Metop satellites of EUMETSAT (European Organisation for the Exploitation of Meteorological Satellites). This dataset is an a posteriori processed extension of the MUSICA (MUlti-platform remote Sensing of Isotopologues for investigating the Cycle of Atmospheric water) IASI full product dataset as presented in Schneider et al. (2021b). From the independently retrieved H2O and δD proxy states, their a priori settings and constraints, and their error covariances provided by the IASI full product dataset, we generate an optimal estimation product for pairs of H2O and δD. Here, this standard MUSICA method for deriving {H2O, δD} pairs is extended by an a posteriori reduction of the constraints, which improves the retrieval sensitivity at dry conditions. Applying this improved water isotopologue post-processing to all cloud-free MUSICA IASI retrievals yields a {H2O, δD} pair dataset for the whole period from October 2014 to June 2019 with global coverage twice per day (local morning and evening overpass times). In total, the dataset covers more than 1200 million individually processed observations. The retrievals are most sensitive to variations of {H2O, δD} pairs within the free troposphere, with up to 30 % of all retrievals containing vertical profile information in the {H2O, δD} pair product. After applying appropriate quality filters, the largest number of reliable pair data arises for tropical and subtropical summer regions, but there is also a considerable amount of reliable data for higher latitudes. Exemplary time series over the Tropical Atlantic and West Africa are chosen to illustrate the potential of the MUSICA IASI {H2O, δD} pair data for atmospheric moisture pathway studies.
Finally, the dataset is referenced with the DOI 10.35097/415 (Diekmann et al., 2021).


Introduction
Concomitant observations of moisture content and stable water isotopologues allow fundamental insights into the transport and phase transitions of water in the atmosphere. Differences in the molecular masses lead to characteristic responses of each isotopologue to phase changes. Consequently, the ratio of light and heavy water isotopologues inside an air parcel reveals information about moisture processes that have occurred during its pathway through the atmosphere, and can hence support the investigation of the atmospheric branch of the hydrological cycle (an extensive overview is given in Galewsky et al., 2016).
For describing distributions of water isotopologues, the δ-notation given in ‰ is commonly used, for instance between H2O and its heavier isotopologue HDO, with both given as volume mixing ratios:

$$\delta D = 1000 \left( \frac{\mathrm{HDO}/\mathrm{H_2O}}{R_{vsmow}} - 1 \right)$$

$R_{vsmow}$ is the isotopic ratio of Vienna Standard Mean Ocean Water as defined by the International Atomic Energy Agency (Craig, 1961). Several studies have proposed the combined analysis of H2O and δD distributions (here denoted as {H2O, δD} pairs) and demonstrated its value for analysing moisture processes and transport. For instance, signatures in {H2O, δD} pair distributions from model simulations and measurements were attributed to relative contributions of kinetic and equilibrium fractionation, such as Rayleigh condensation, rain evaporation and airmass mixing (e.g., Worden et al., 2007; Noone, 2012; Dyroff et al., 2015; González et al., 2016; Schneider et al., 2017; Eckstein et al., 2018; Lacour et al., 2018).
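The δ-notation can be illustrated with a short numerical sketch. The function name `delta_d` and the constant `R_VSMOW_HDO` are assumptions made for illustration and are not taken from the MUSICA processing code; the numerical value assumes the commonly quoted VSMOW D/H ratio of 155.76e-6, doubled because a water molecule offers two hydrogen positions.

```python
# Assumed HDO/H2O ratio of VSMOW (2 x 155.76e-6); illustrative constant
R_VSMOW_HDO = 3.1152e-4


def delta_d(h2o_vmr, hdo_vmr, r_vsmow=R_VSMOW_HDO):
    """Return deltaD in permil for H2O and HDO volume mixing ratios."""
    return 1000.0 * ((hdo_vmr / h2o_vmr) / r_vsmow - 1.0)


# Air mass whose HDO/H2O ratio is 10 % below the VSMOW ratio
h2o = 5000e-6                      # 5000 ppmv
hdo = 0.9 * R_VSMOW_HDO * h2o
depleted = delta_d(h2o, hdo)       # close to -100 permil
```

A ratio equal to the VSMOW ratio gives 0 ‰ by construction, and ratios below it give negative values, which is the typical situation for tropospheric water vapour.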
During the last decades, the space-based remote sensing of tropospheric water isotopologues has progressed considerably in terms of retrieval development, quality and application. On the one hand, cloud-free land observations from short-wave infrared sensors were used to generate total columns of the ratio HDO/H2O (e.g. Frankenberg et al., 2009; Boesch et al., 2013; Frankenberg et al., 2013; Schneider et al., 2020), while on the other hand thermal infrared sensors allowed for retrieving HDO/H2O ratios with weak vertical profile information for land as well as ocean observations (e.g. Worden et al., 2006, 2019; Lacour et al., 2012; Schneider and Hase, 2011; Schneider et al., 2016). To ensure coherence in the vertical sensitivities of remotely sensed H2O and δD, which is indispensable for a combined interpretation, a further post-processing that creates optimal {H2O, δD} pair information was proposed by Schneider et al. (2012).

[…] width of the IASI sensors, this mission is able to provide a global scan of the atmosphere multiple times per day, with about 350,000 cloud-free observations per sensor and per day. The overpasses are designed such that the orbits cross the equator at approximately 9:30 and 21:30 local time (Clerbaux et al., 2009).
To process the enormous amount of IASI measurements, we have set up a quasi-operational processing chain that efficiently runs on high-performance computing clusters (Schneider et al., 2021b). It comprises an extended version of the MUSICA IASI retrieval (Schneider and Hase, 2011). In this context we present the most recent updates regarding the optimal estimation {H2O, δD} pair product from Schneider et al. (2012), including an a posteriori enhancement of the sensitivity for dry conditions.
We discuss and apply a method for achieving an a posteriori reduction of the retrieval constraints. According to the local […]

https://doi.org/10.5194/essd-2021-87 Open Access Earth System Science Data Discussions

Figure 1. Overview of the full MUSICA IASI processing chain and its output products. The red frame indicates the part that is documented in the underlying paper. Further information about this processing chain can be found in Schneider et al. (2021a […]

[…] further improved and runs now efficiently on high-performance computing clusters. Figure 1 provides an overview of the full processing chain. The main retrieval processing consists of the pre-processing stage, the PROFFIT-nadir retrieval and the output generation and is extensively described in Schneider et al. (2021b). The supply of the full product dataset (Schneider et al., 2021b) offers very good possibilities for data reuse. Examples are an a posteriori synergetic use with products from other sensors (Schneider et al., 2021a) and the a posteriori generation of an optimal estimation {H2O, δD} pair product as presented in this paper.
Here, we briefly recall the relevant information on the MUSICA IASI retrieval and subsequently present the improved post-processing for creating and evaluating the {H2O, δD} pairs.

Main characteristics of IASI
IASI is a Fourier transform spectrometer measuring the thermal infrared upwelling radiation that is affected by atmospheric processes like absorption and scattering. Its spectral resolution is 0.

MUSICA IASI retrieval
The MUSICA IASI retrieval is an optimal estimation algorithm for retrieving vertical profiles of mixing ratios of water vapour isotopologues and the trace gases CH4, N2O and HNO3 as well as atmospheric and surface temperatures. It uses the nadir version of the radiative transfer code PROFFIT (Hase et al., 2004) for the spectral window of 1190-1400 cm⁻¹ and an iterative Gauss-Newton method for the inversion calculations (Rodgers, 2000; Schneider and Hase, 2011). As proposed in Schneider et al. (2006) and Worden et al. (2006) […]

By considering the MUSICA IASI full retrieval product, we apply an a posteriori processing for optimizing the water isotopologue states and generate an optimal estimation {H2O, δD} pair product. The following section provides details about the corresponding processing, including information about the error treatment and data selection according to data quality.

In general, a retrieved height-dependent state vector $\hat{x}$ represents a smoothed image of the true atmospheric state $x_{atm}$ and is defined according to the averaging kernel matrix $A$ and the a priori state vector $x_a$:

$$\hat{x} = A \, (x_{atm} - x_a) + x_a$$

Following the definition by Rodgers (2000), an averaging kernel (a row of $A$) depicts the fraction of the retrieved result coming from the retrieval itself and not from the a priori assumption. In case of a perfect retrieval the kernel matrix would equal the identity matrix, expressing total independence of the retrieval results from the chosen a priori state. Thus, the degree of deviation from unity quantifies the vertical information content of a remotely sensed observation.
For instance, a common metric for describing the vertical information content is the degree of freedom for signal (DOFS). It is defined as the trace of the averaging kernel matrix. The value of DOFS indicates the number of vertical structures that can be independently determined from an observation (Rodgers, 2000).

Further, the sum of the values along an individual averaging kernel is called measurement response (Eriksson, 2000; Baron et al., 2002). A measurement response of 1 implies that the retrieved state is a smoothed but unbiased image of the true atmospheric profile, whereas values deviating from unity arise if the retrieval constraint deviates from pure smoothing (von Clarmann et al., 2020).
To examine the vertical resolution of a retrieved profile (i.e. the capability to detect vertical structures), metrics that characterize the vertical sensitivity, i.e. the shape of the averaging kernels, can be valuable. First, the position of the sensitivity-weighted altitude relative to its nominal altitude is termed information displacement. For this, we use the centroid offset as defined by Backus and Gilbert (1970) in a discretized form (Keppens et al., 2015). Second, to describe the vertical smoothing of the retrieved state, the MUSICA IASI retrieval provides two different diagnostics. The definition of Backus and Gilbert (1970) is used to create a kernel-weighted spread around the kernel centroid, while the data density reciprocal of Purser and Huang (1993) serves to indicate the layer width that covers a DOFS of 1. Discussions of these two metrics can be found in Keppens et al. (2015) and von Clarmann et al. (2020), and in the context of the MUSICA IASI full retrieval product in Schneider et al. (2021b).
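The sensitivity metrics described above can be sketched for a small kernel matrix. This is a minimal illustration, not the MUSICA implementation: the function name and the toy grid are hypothetical, and the centroid uses a squared-kernel weighting as one common discretization of the Backus and Gilbert definition, which may differ in detail from the form given in Keppens et al. (2015).

```python
import numpy as np


def sensitivity_metrics(A, z):
    """Return DOFS, measurement response and centroid offset.

    A : (n, n) averaging kernel matrix, one kernel per row
    z : (n,) nominal altitudes of the retrieval grid in km
    """
    dofs = np.trace(A)                  # trace of the kernel matrix
    response = A.sum(axis=1)            # sum along each row kernel
    weights = A**2                      # assumed squared-kernel weighting
    centroid = (weights @ z) / weights.sum(axis=1)
    offset = centroid - z               # information displacement
    return dofs, response, offset


z = np.array([1.0, 3.0, 5.0, 7.0])      # hypothetical grid in km

# Perfect retrieval: identity kernel matrix
dofs, response, offset = sensitivity_metrics(np.eye(4), z)

# Smoothed retrieval: each kernel spreads over neighbouring levels
A_smooth = 0.4 * np.eye(4) + 0.2 * np.eye(4, k=1) + 0.2 * np.eye(4, k=-1)
dofs_s, response_s, offset_s = sensitivity_metrics(A_smooth, z)
```

For the identity matrix the DOFS equals the number of levels, all responses are 1 and no displacement occurs. For the smoothed case the DOFS drops to 1.6, the interior responses to 0.8, and the lowest kernel is displaced upwards by 0.4 km because it can only spread in one direction.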

Generation of an optimal estimation {H2O, δD} pair product
Due to its high variability in the troposphere, H2O can be detected very well in contrast to δD, which varies only weakly. As […]

[…] IASI results against long-term datasets from ground-based remote sensing stations and in-situ aircraft measurements (Wiegele et al., 2014; Schneider et al., 2015).

$$x'_{wv,a} = P \, x_{wv,a} \qquad (4)$$

$$\hat{x}'_{wv} = P \, \hat{x}_{wv} \qquad (5)$$

In the following, primed variables consistently refer to the water vapour proxy state base. Detailed information about the transformation operator $P$ can be found in Schneider et al. (2012), Wiegele et al. (2014) and Barthlott et al. (2017). $\hat{x}'_{wv}$ is in fact the state that is optimally estimated by the MUSICA IASI retrieval and it represents proxies for H2O and δD.
The following step harmonizes the differing sensitivities of the water vapour proxy states by reducing the sensitivity of the H2O proxy to the sensitivity of the δD proxy. The a posteriori correction operator $C$ serves to create the harmonized product $\hat{x}^*_{wv}$, which is called Type 2 product in Schneider et al. (2012). The main property of $\hat{x}^*_{wv}$ is that it provides profiles for H2O and δD having practically the same averaging kernels. This allows meaningful analyses of paired {H2O, δD} distributions.

[…] from 1 by about 10-14 %. In the next section we introduce a method that increases the DOFS and the measurement response a posteriori (the corresponding kernels are shown in Fig. 2b, but will be discussed in Sect. 2.3.3).

Reduction of retrieval regularization
By inspecting the row kernels in $A^*_{wv}$ for a full orbit, we observe that there is a general and non-negligible deficit in the measurement response of the {H2O, δD} pairs. The blue dots in Fig. 3 show that, for instance, at 1.8, 4.2 and 6.4 km a.s.l. and for moisture contents below $10^4$ ppmv a large share of the data has measurement response values far below 1.
As emphasized in Sect. 2.3.1, the measurement response is a metric for the influence of the a priori assumptions on the retrieval result. Thus, a too low measurement response can be an indicator for a too strong retrieval constraint that excessively pulls the retrieved states towards the a priori profiles (Rodgers, 2000; von Clarmann et al., 2020). Therefore, to reduce the observed lack of sensitivity in the {H2O, δD} pairs, we apply a method for a posteriori modifying and reducing the underlying retrieval constraint. For this purpose, we adapt a linear optimal estimation method from Rodgers and Connor (2003) that creates a best estimate of a given retrieval result with regard to a new constraint:

[…]

The purpose of the operator $M$ is that it allows a modification of the retrieval solution $\hat{x}$ and its kernels $A$ according to a weaker constraint. $R_d$ is the regularization matrix chosen according to the desired constraint.

In general, a regularization restricts the variability of a retrieved state vector in order to keep the retrieval solution within the range of physically realistic profiles (Phillips, 1962; Tikhonov, 1963; Rodgers, 2000). By reducing its strength we increase the allowed variability for the retrieved states. As a consequence, the information content increases. On the downside, a weaker constraint causes larger noise and can produce information that is not provided by the measurement (Rodgers, 2000). Therefore, the remainder of this section discusses the optimal definition of the input matrices $R_d$, $A$ and $S_{x,noise}$ as well as the correct usage of $M$ in order to enhance the sensitivity of the {H2O, δD} pairs.
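The idea of an a posteriori constraint modification can be checked numerically on a toy problem. The sketch below is built on assumptions rather than on a transcription of Eq. (10): it assumes that $M$ takes the usual linear optimal estimation gain form, $M = (A^T S_{x,noise}^{-1} A + R_d)^{-1} A^T S_{x,noise}^{-1}$, in which the original retrieval plays the role of the measurement and $A$ that of the Jacobian, and that the noise covariance follows from $A$ and $R$ as $A (I - A) R^{-1}$. The toy Jacobian `K`, the measurement noise and both constraint matrices are hypothetical.

```python
import numpy as np


def noise_covariance(A, R):
    """Noise covariance implied by kernel A and constraint R (assumed form)."""
    n = A.shape[0]
    return A @ (np.eye(n) - A) @ np.linalg.inv(R)


def modification_operator(A, S_noise, R_d):
    """Assumed gain-type operator that maps a retrieval to constraint R_d."""
    S_inv = np.linalg.inv(S_noise)
    return np.linalg.solve(A.T @ S_inv @ A + R_d, A.T @ S_inv)


# Toy linear retrieval: 12 spectral channels, 4 state elements
rng = np.random.default_rng(0)
K = rng.normal(size=(12, 4))            # hypothetical Jacobian
F = K.T @ K / 0.1**2                    # K^T S_y^-1 K for white noise of 0.1
R = 2000.0 * np.eye(4)                  # original (strong) constraint
R_d = 200.0 * np.eye(4)                 # desired (weaker) constraint

A = np.linalg.solve(F + R, F)           # kernel of the original retrieval
S_noise = noise_covariance(A, R)
M = modification_operator(A, S_noise, R_d)

A_m = M @ A                             # modified kernel matrix
A_direct = np.linalg.solve(F + R_d, F)  # kernel of a retrieval run with R_d
```

In this toy setting the modified kernel `A_m` agrees with the kernel obtained by running the retrieval directly with the weaker constraint, and its trace (the DOFS) is larger than that of the original kernel. The approach needs an invertible noise covariance, which holds here because the toy Jacobian has full column rank.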
First of all, to achieve the full benefit from the matrix operations in Eq. (10), we consider the full MUSICA IASI state for the kernel matrix A as this is also used during the original retrieval processing. This means that we need to take into account the non-harmonized water vapour proxy state A wv from Eq. (6) and the interfering effects of the other retrieval state vectors.

Since the retrieval output from Schneider et al. (2021b) provides only the dominant averaging kernels and cross-correlations, we build $A$ as follows:

[…]

The kernels of N2O and CH4 are denoted as $A_{ghg,11}$ and $A_{ghg,22}$, respectively. The indices 21 and 12 indicate the respective cross-dependencies. The cross-dependency of the temperature retrieval on the water vapour proxy states is marked with $A_{t,wv,1}$ and $A_{t,wv,2}$. The entries for which the corresponding kernel matrices are not provided are filled with the null matrices 0.

Further, we calculate the retrieval noise error covariance $S_{x,noise}$, i.e. the variability in the measured radiances that was not explained during the retrieval processing. It can be calculated from $A$ and the regularization matrix $R$ that was originally applied during the retrieval (Rodgers, 2000; Schneider et al., 2021b):

[…]

Now the question arises about the choice of a new meaningful regularization matrix $R_d$. For this purpose, we first recapitulate the original MUSICA approach for setting up the retrieval constraint. For each target species an individual covariance matrix $S_a$ is given that describes the potential departure of the retrieval solution from the a priori state. This depends on the choice of the height-dependent correlation length, i.e. the a priori assumed size of vertical structures that may be resolvable (Schneider et al., 2021b). By inverting $S_a$ we obtain the corresponding regularization matrix $R$. During the MUSICA IASI retrieval the inversion of $S_a$ is realized by a decomposition into its diagonal and derivative values (Hase et al., 2004; Schneider et al., 2021b):

[…]

with $\alpha_i$ as the strength of the individual constraining terms and $L_i$ as the constraint operators. The diagonal matrices $\alpha_i$ are derived from $S_a$ and are provided for each target state as output variables from the MUSICA IASI retrieval (Schneider et al., 2021b). $L_0$ represents the diagonal constraint operator and equals the identity matrix $I$. Its effect is to shift the retrieved profile towards the a priori profile. $L_1$ and $L_2$ are the first- and second-order derivative operators and constrain the retrieved profile according to the shape of the a priori covariance, thereby representing smoothing constraints. For the retrieval of atmospheric trace gases with weak spectroscopic signatures, smoothing constraints can be advantageous over $L_0$, because a diagonal constraint tightens the retrieval by means of the absolute a priori values, potentially inducing a bias in the retrieval (e.g. Steck, 2002). Therefore, we infer that the consideration of the diagonal constraint in Eq. (13) causes the observed lack of sensitivity in the {H2O, δD} pair data for dry conditions. Following this hypothesis, we remove the diagonal constraint operators for the water vapour states and create the new weaker regularization matrix $R_{wv,d}$:

[…]

Keeping the regularization matrices of the other target states unchanged, we can then build the new regularization matrix $R_d$ for the full MUSICA state. With that, Eq. (10) is fully determined and we can now use $M$ to adjust the kernel matrix $A$ according to the new constraint $R_d$:

[…]

Based on the optimized kernel matrix $A_m$, we can now create the optimal {H2O, δD} pair information for the constraint-reduced state. By extracting $A_{wv,m}$ as the first 2×2 block from $A_m$, we calculate the new a posteriori operator $C_m$ analogous to Eq. (7) and generate the constraint-reduced pair product:

[…]

This product $A^*_{wv,m}$ with reduced constraint shows a clear increase of the measurement response (see lower panels in Fig. 3). While the improvements are rather small for 6.4 km a.s.l., the results at 1.8 and 4.2 km a.s.l. have a much better measurement response for moisture contents above 700 ppmv.
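The effect of dropping the diagonal term from the constraint can be illustrated with a small sketch. It assumes that the decomposition has the form $R = \sum_i (\alpha_i L_i)^T (\alpha_i L_i)$ with diagonal $\alpha_i$, which is one plausible reading of the description above and not a transcription of Eq. (13); the grid size and all strength values are hypothetical.

```python
import numpy as np


def constraint_matrix(alphas, operators):
    """Sum of (alpha_i L_i)^T (alpha_i L_i) over the given constraint terms."""
    n = operators[0].shape[1]
    R = np.zeros((n, n))
    for alpha, L in zip(alphas, operators):
        B = np.diag(alpha) @ L          # alpha_i as a diagonal matrix
        R += B.T @ B
    return R


n = 6                                   # hypothetical number of grid levels
L0 = np.eye(n)                          # diagonal constraint operator
L1 = np.diff(np.eye(n), n=1, axis=0)    # first-order derivative, (n-1, n)
L2 = np.diff(np.eye(n), n=2, axis=0)    # second-order derivative, (n-2, n)

# Hypothetical constraint strengths (diagonal entries of each alpha_i)
a0 = np.full(n, 2.0)
a1 = np.full(n - 1, 3.0)
a2 = np.full(n - 2, 1.0)

R = constraint_matrix([a0, a1, a2], [L0, L1, L2])   # full constraint
R_d = constraint_matrix([a1, a2], [L1, L2])         # diagonal term removed
```

With the diagonal term removed, a constant offset from the a priori profile is no longer penalized (`R_d` applied to a constant vector gives zero), so only the profile shape remains constrained. This matches the argument that the diagonal term pulls the retrieved absolute values towards the a priori.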
The time series of the measurement response along the orbit used in Fig. 3 is shown in Fig. 4b (upper panel). The constraint reduction leads to a general decrease of the deviation from 1. Over the Pacific and Atlantic Ocean (observation IDs of 2500-7500 and 15,000-20,000) there is a shift of the slightly over-estimated measurement response towards 1. In contrast, for higher latitudes its values are originally below 1, but increase significantly due to the constraint reduction. This is particularly pronounced for observations above Australia (observation IDs of 7500-10,000), where an averaged increase in the measurement response of up to 0.5 is apparent. Also for polar observations of the Northern Hemisphere (observation IDs of 0-2500 and 20,000-25,000) the measurement response strongly improves.
Analogous improvements become apparent for the individual row kernels in Fig. 2 (compare Fig. 2a and b). The measurement response increases for the dry polar data at 3.0 and 4.2 km a.s.l. by 56 and 18 %, respectively. Also for the tropical site the

Error treatment
Several studies have intensively discussed the error treatment for satellite observations in general (Rodgers, 2000; von Clarmann et al., 2020) and with a focus on MUSICA IASI retrieval data (Schneider and Hase, 2011; Borger et al., 2018). Schneider et al. (2021b) provided an overview of the errors that result for the most recent MUSICA IASI retrieval. Along with the kernel modifications for reducing the diagonal constraint for water vapour (see Sect. 2.3.3), a respective processing is required for the dominant MUSICA IASI error covariances.

Given the error covariance $S_x$ in the proxy state base, we use the optimized a posteriori operator $C_m$ to transform it according to the reduced constraint:

[…]

We perform this processing for the retrieval noise error covariance $S_{x,noise}$ from Eq. (12) and for the temperature cross-covariance $S_{x,temp}$:

[…]

This strongly depends on the choice of the assumed a priori uncertainty covariance $S_{a,temp}$ (Schneider et al., 2021b).
As these two are the dominant errors for the MUSICA IASI δD product (Schneider et al., 2021b), we use their sum as an estimate of the total error covariance for the optimized H 2 O and δD states:
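The error handling described above can be sketched as a linear propagation of covariances. The sketch assumes that transforming a covariance with an operator means the usual congruence $C S C^T$; whether the original equations contain further factors is not recoverable from the text here. The 2-level matrices and all numbers are hypothetical.

```python
import numpy as np


def transform_covariance(C, S):
    """Propagate covariance S through the linear operator C (C S C^T)."""
    return C @ S @ C.T


# Hypothetical 2-level example; values are illustrative only
C_m = np.array([[0.9, 0.0],
                [0.1, 1.0]])
S_noise = np.diag([4.0, 9.0])           # retrieval noise covariance
S_temp = np.diag([1.0, 16.0])           # temperature cross-covariance

# Total error covariance as the sum of the two transformed contributions
S_total = (transform_covariance(C_m, S_noise)
           + transform_covariance(C_m, S_temp))
error = np.sqrt(np.diag(S_total))       # 1-sigma error per level
```

Because the propagation is linear, transforming the two covariances separately and summing them gives the same result as transforming their sum once.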

The bottom panel in Fig. 4b illustrates how the total δD error changes due to the a posteriori constraint reduction. In general, relaxing the regularization strength increases the retrieval noise (e.g. Rodgers, 2000). Following this behaviour, the δD error exhibits a strong increase for areas where the impact of the regularization optimization is large and the measurement response increases. For instance, the strong improvements of the measurement response over the dry Australian desert come at the expense of increasing the averaged δD error by 20 ‰ with single peaks up to 50 ‰. An increase of the noise is also observed for high latitudes in the Northern Hemisphere, whereas for observations above the Pacific and Atlantic Ocean the noise is only slightly affected (compare with discussion in Sect. 2.3.3).

Data filtering
Supplementary to the raw IASI L1C measurements, EUMETSAT distributes auxiliary L2 diagnostics, such as cloud cover and surface type. Utilizing these diagnostics, Schneider et al. (2021b) provide the MUSICA IASI retrieval results for (almost) […]

[…] 1). Therefore, we define this flag based on the sensitivity metrics of the kernel matrix $A^*_{wv,m}$. For the measurement response we require values between 0.8 and 1.2. To limit the information displacement at an altitude level $z(i)$, we define the following criterion:

$$|c(i) - z(i)| < \frac{z_{cl}(i)}{2}$$

with $c(i)$ being the centroid of the corresponding averaging kernel (Keppens et al., 2015) and $z_{cl}(i)$ the a priori correlation length at the respective altitude level (Schneider et al., 2021b). This criterion ensures that the deviation of the centroid from the nominal height is less than half of the a priori correlation length. As filter condition for the vertical resolution we propose:

[…]

$r_{LW}(i)$ is the layer width per one DOFS from Purser and Huang (1993) (see Sect. 2.3.1) as a proxy for the vertical resolution of an averaging kernel. By considering the kernel properties relative to the correlation length, kernels with larger values in their metrics also pass the aforementioned filters if larger values in the corresponding correlation length are assumed.

Second, we provide the error flag musica_deltad_error_flag that identifies data points with too high uncertainties in the δD retrieval results, namely errors due to measurement noise and atmospheric temperature uncertainties. The corresponding height-dependent flag displays retrieval results with a total δD error below 40 ‰.
The aforementioned filter conditions are visualized in Fig. 5 […]

[…] i.e. they only consist of the values 1 (indicating high quality) and 0 (low quality). Even though the recommended filter conditions are chosen somewhat arbitrarily, they efficiently remove recognisable outliers in terms of kernel properties (see Fig. 5a-c) and data uncertainties (see Fig. 5d) of the retrieved {H2O, δD} pairs. Therefore, the simultaneous application of the corresponding quality flags musica_wvp_kernel_flag and musica_deltad_error_flag allows a convenient and meaningful selection of reliable {H2O, δD} pair data. However, to enable a flexible adjustment of the filter conditions for individual purposes, the output datasets contain the filtered and unfiltered {H2O, δD} pair data together with the flag and filter variables.
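The filter logic described in this section can be sketched as follows. The function and argument names are hypothetical and the arrays are made-up values; only the limits stated in the text (measurement response between 0.8 and 1.2, centroid offset below half the correlation length, total δD error below 40 ‰) are taken from the paper. The threshold of the vertical resolution criterion is not reproduced in the text above, so it enters as a free parameter `max_width_ratio`, and treating the response limits as inclusive is an assumption.

```python
import numpy as np


def kernel_flag(response, centroid, z, z_cl, r_lw, max_width_ratio):
    """Return 1 where all kernel criteria hold, else 0.

    max_width_ratio is a hypothetical limit for r_lw relative to z_cl.
    """
    ok_response = (response >= 0.8) & (response <= 1.2)
    ok_displacement = np.abs(centroid - z) < 0.5 * z_cl
    ok_resolution = r_lw < max_width_ratio * z_cl
    return (ok_response & ok_displacement & ok_resolution).astype(int)


def deltad_error_flag(deltad_error, limit=40.0):
    """Return 1 where the total deltaD error (permil) is below the limit."""
    return (deltad_error < limit).astype(int)


# Made-up values for three altitude levels (km and permil)
z = np.array([1.8, 4.2, 6.4])
z_cl = np.array([2.0, 3.0, 4.0])
response = np.array([0.6, 1.0, 1.1])
centroid = np.array([2.0, 4.5, 9.0])
r_lw = np.array([3.0, 4.0, 5.0])
deltad_error = np.array([55.0, 25.0, 30.0])

k_flag = kernel_flag(response, centroid, z, z_cl, r_lw, max_width_ratio=2.0)
e_flag = deltad_error_flag(deltad_error)
reliable = (k_flag == 1) & (e_flag == 1)    # both flags applied together
```

In this example only the middle level passes both flags: the lowest level fails on the measurement response and the δD error, the highest on the information displacement.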

Matrix compression
Analogous to Schneider et al. (2021b), the averaging kernel matrices for the {H2O, δD} pairs are stored in a decomposed and compressed format in order to reduce the required storage volume. For this purpose, we apply a singular value decomposition to the matrices $A^*_{wv,m}$ and $A_{t,wv,m}$, yielding the components $U$, $D$ and $V$ that decompose the kernel matrix through

$$A = U D V^T$$

The length of the singular value vector $D$ is called rank. The actual compression is achieved by cutting off the lowest singular values in $D$ and thereby reducing the rank. Consequently, the number of singular vectors in $U$ and $V$ is reduced as well. The optimal limit of the singular values for balancing the compression error against the effective storage reduction is discussed in Weber (2019). Based on that, we neglect singular values that are less than 0.1 % of the maximum singular value in $D$.

[…] from the corresponding a priori values, such that lower values in H2O and δD can be observed. This is analogous to the increase in the measurement response that is most pronounced for dry conditions (see Figs. 2, 3 and 4). As the measurement response is considered during the quality filtering for reliable {H2O, δD} pairs (see Table 1), its increase yields a higher number of observations passing the recommended data filter (see data amount in Fig. 6).
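The compression step of the Matrix compression section can be sketched with a truncated singular value decomposition. The function names and the synthetic test matrix are hypothetical; only the 0.1 % cut-off relative to the largest singular value is taken from the text, and the storage layout of the actual NetCDF files is not addressed here.

```python
import numpy as np


def compress_kernel(A, rel_threshold=1e-3):
    """Truncated SVD keeping singular values >= rel_threshold * largest."""
    U, d, Vt = np.linalg.svd(A, full_matrices=False)
    rank = int(np.sum(d >= rel_threshold * d[0]))   # d is sorted descending
    return U[:, :rank], d[:rank], Vt[:rank, :]


def decompress_kernel(U, d, Vt):
    """Rebuild the kernel matrix from its stored components."""
    return (U * d) @ Vt


# Synthetic 6x6 matrix with known singular values
rng = np.random.default_rng(1)
Q1, _ = np.linalg.qr(rng.normal(size=(6, 6)))
Q2, _ = np.linalg.qr(rng.normal(size=(6, 6)))
d_true = np.array([1.0, 0.5, 0.1, 1e-4, 1e-5, 1e-6])
A = (Q1 * d_true) @ Q2.T

U, d, Vt = compress_kernel(A)
A_approx = decompress_kernel(U, d, Vt)
max_error = np.max(np.abs(A - A_approx))
```

Here three of the six singular values exceed 0.1 % of the largest one, so the stored rank drops from 6 to 3, while the reconstruction error stays at the level of the largest discarded singular value.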
In summary, the MUSICA IASI water isotopologue post-processing provides an optimal estimation {H2O, δD} pair product in the troposphere with a substantial increase of sensitivity for dry conditions. Together with the recommended quality flags indicating observations with meaningful averaging kernels and low errors for δD, this is the main product provided freely to the scientific community. […]

The metadata of the output NetCDF4 files are in agreement with the CF metadata naming conventions (Version 1.7). Information about how to access the full dataset is given in Sect. 6.

The following section gives an impression of the spatial and temporal representativeness of the optimal estimation {H2O, δD} pair data.

[…] The black line indicates the global means that are further divided into the means for morning (violet) and evening (pink) overpasses.

Degree of freedom for signal
[…] indicated by the averaging kernels in Fig. 2). The DOFS minimum is located over the polar regions during winter times, as these regions are typically very dry and cold.
Over oceans, the DOFS distribution roughly reflects the dominant sea-surface temperature patterns. For instance, the warm surface currents in the West Atlantic and West Pacific correlate with an increased sensitivity of the {H 2 O, δD} pair retrievals.
While the large-scale DOFS patterns show a strong inter-annual variability for all regions except the tropics, their diurnal variations are rather small. Instead, the latter become more pronounced for small-scale regional structures. In particular for land observations, thermal effects lead to a sensitivity maximum for morning times (Clerbaux et al., 2009), e.g. for Australia during February and for Europe and North America during August. Conversely, for the Sahara we observe an inverted effect, i.e. an increase of DOFS from morning to evening. As a next step, we will consider data that have been additionally filtered for high sensitivity and low uncertainty in the {H2O, δD} pair product (see Table 1).

As discussed in Sect. 2.3, the MUSICA IASI water vapour retrieval is mainly sensitive to water vapour in the free troposphere. Figure 8 shows that this is clearly reflected in the vertical distribution of available {H2O, δD} pairs after applying the full recommended filters according to Table 1. Here, the amount of globally available observations per day and per morning and evening maps is averaged for February and August 2018 and is shown for each retrieval grid level between the surface and 9 km. The best data availability arises between 2-7 km a.s.l. On average, considerably more observations are available during boreal summer (at maximum over 400,000 data pairs per day) than during austral summer (maximum 300,000 data pairs). In contrast, the diurnal variations are again rather small on the global scale. Only for altitudes below 3.5 km a.s.l. do we observe a slight decrease of data availability during evening. This might be due to thermal heating that develops during the day and leads to an upward transport of low-level moisture, resulting in an upward shift of the retrieval sensitivity. Such effects are stronger […]

Horizontal distribution of data coverage
In this section we discuss the horizontal data coverage of {H2O, δD} pairs for different altitude regions after applying the respective quality filtering according to […]

[…] conditions are simultaneously fulfilled at 2.9 and 6.4 km a.s.l. We observe similar spatial patterns with lower values and less temporal coverage when compared to 4.2 km a.s.l. Even though the data coverage decreases significantly for areas with profile information at even lower altitudes (the quality filter conditions are simultaneously fulfilled at 1-1.5 and 4-5 km above ground level only for about 10-17 % of the cloud-free observations), interesting features emerge. The maximum availability of about 10 observations per grid box and per day shifts towards higher latitudes, such that over the tropics there are almost no data.

In this analysis we jointly investigated the morning and evening observations. As can be deduced from Fig. 7 […]

[…] to the full (i.e. unfiltered) cloud-free IASI observations. The results are shown for 4.2 km above sea level (a.s.l.), for observations where the filter conditions are fulfilled simultaneously at 2.9 and 6.4 km a.s.l. and at 1-1.5 and 4-5 km above ground level (a.g.l.), respectively. For the latter, if more than one grid level falls inside the given altitude range, then the lower one is chosen.

[…] differences between the morning and evening distributions will differ only little. For instance, Table 2 includes the fractions of available data after filtering according to Table 1 for the altitude regions from Fig. 9. The values do not differ significantly for the mid-troposphere during morning and evening times, but reduce for lower altitudes during the evening overpasses (analogous to Fig. 8).

[…] (a) and δD (b) product at 4.2 km a.s.l., evaluated on a 1° × 1° grid. The filtering is performed according to Table 1.

Data example: Tropical Atlantic and Sahel
To convey an impression of the amount and scientific potential of the optimal estimation MUSICA IASI […]

The data are illustrated with normalized two-dimensional histogram contours comprising the main 10 and 90 % of the scatter points (the calculation is described in the appendix of Eckstein et al. (2018) […]

[…] González et al., 2016; Schneider et al., 2017; Christner et al., 2018; Eckstein et al., 2018; Lacour et al., 2018), such an analysis is then capable of providing a deeper understanding of atmospheric moisture pathways and will therefore be part of future MUSICA IASI studies.

Dataset availability
The full MUSICA IASI water isotopologue dataset is referenced with the DOI 10.35097/415 (Diekmann et al., 2021). Further,

Summary
We present an extension of the MUSICA IASI retrieval that aims at creating an optimized water isotopologue pair product for the free troposphere. The retrieval processor from Schneider et al. (2021b) is an update of the version that was developed and validated against reference measurements during the MUSICA project. The presented a posteriori processing step exploits their retrieval results and generates an optimal estimation {H2O, δD} pair product by harmonizing the averaging kernels of H2O and δD, as proposed by Schneider et al. (2012). We introduce a further optimization step by a posteriori reducing the strength of the underlying regularization. This increases the sensitivity of the {H2O, δD} pair retrieval product, especially for dry conditions, and enhances the vertical profile information between the boundary layer and the free troposphere. As a trade-off, the retrieval noise increases, but it remains within a reasonable range (∼ 12 % for H2O and ∼ 30 ‰ for δD). For a user-friendly data handling, we derive supplementary filter flags that perform a height-dependent data selection based on the quality of the {H2O, δD} pair results.
We applied this a posteriori processing to the MUSICA IASI full retrieval product and created a novel space-borne dataset […]

[…] representativeness in terms of data quality and coverage for tropical and summertime sub-tropical regions. Despite a negative equator-to-pole gradient in the horizontal representativeness, there is still a satisfactory amount of reliable {H2O, δD} pair data at higher latitudes, extending up to the polar regions during summer.
Due to its unique combination of coverage and resolution in space and time, this dataset is highly promising for studying atmospheric moisture pathways. It enables analyses across different scales, from annual to daily and from global to local, and is therefore appealing to a wide range of scientific applications. To further encourage the use of these data, they are made freely available to the scientific community under the DOI 10.35097/415 (Diekmann et al., 2021).
Author contributions. FH developed the radiative transfer model PROFFIT-NADIR. BE and MS optimized the MUSICA IASI retrieval. BE, MS, ES and OG performed the retrieval calculations. MS and CD developed the water isotopologue post-processing. CD performed the processing and created the data statistics. CD wrote major parts of the manuscript. PB and PK supervised this study. All authors contributed to the discussion of the paper.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. This work has strongly benefited from the project MUSICA (