Simplified SAGE II ozone data usage rules

High-quality satellite-based measurements are crucial to the assessment of global stratospheric composition change. The Stratospheric Aerosol and Gas Experiment II (SAGE II) provides the longest, continuous data set of vertically resolved ozone and aerosol extinction coefficients to date and therefore remains a cornerstone of understanding and detecting long-term ozone variability and trends in the stratosphere. Despite its stability, SAGE II measurements must be screened for outliers that are a result of excessive aerosol emitted into the atmosphere and that degrade inferences of change. Current methods for SAGE II ozone measurement quality assurance consist of multiple ad hoc and sometimes conflicting rules, leading to too much valuable data being removed or outliers being missed. In this work, the SAGE II ozone data set version 7.00 is used to develop and present a new set of screening recommendations and to compare the output to the screening recommendations currently used. Applying current recommendations to SAGE II ozone leads to unexpected features, such as removing ozone values around zero if the relative error is used as a screening criterion, leading to biases in monthly mean zonal mean ozone concentrations. Most of these current recommendations were developed based on “visual inspection”, leading to inconsistent rules that might not be applicable at every altitude and latitude. Here, a set of new screening recommendations is presented that take into account the knowledge of how the measurements were made. The number of screening recommendations is reduced to three, which mainly remove ozone values that are affected by high aerosol loading and are therefore not reliable measurements. More data remain when applying these new recommendations compared to the rules that are currently being used, leading to more data being available for scientific studies. 
The SAGE II ozone data set used here is publicly available at https://doi.org/10.5281/zenodo.3710518 (Kremser et al., 2020). The complete SAGE II version 7.00 data set, which includes other variables in addition to ozone, is available at https://eosweb.larc.nasa.gov/project/sage2/sage2_v7_table (last access: December 2019), https://doi.org/10.5067/ERBS/SAGEII/SOLAR_BINARY_L2-V7.0 (SAGE II Science Team, 2012; Damadeo et al., 2013).


Introduction
Even though the stratosphere contains less than 10 % of the mass of the atmosphere, changes in its chemical composition affect surface climate. For instance, stratospheric ozone is the key factor in governing UV levels at Earth's surface, which directly impact human, animal, and plant health. In addition, many stratospheric components, including ozone, absorb and emit radiation, which in turn changes the temperature distribution within the stratosphere and therefore changes the dynamics through the atmosphere and down to the surface. Understanding how the fingerprint of the effects of the stratosphere on the climate system changes with time is therefore imperative for diagnosing past and ongoing changes in climate. Space-based measurements of stratospheric composition are useful for diagnosing global changes as they can provide consistent, long-term measurements of key parameters on a global scale. However, these measurements are complicated by the fact that they are indirect measurements (usually optical properties of their targets) whose quality is challenged by the physics of the measurement, including the accuracy of the measurement and the ability to separate the effects of the target species from the effects of other gases and particulate matter. As a result, our ability to accurately detect small but important changes in a key parameter like ozone is controlled by our ability to identify the difference between usable and deficient data. The process of making these distinctions is crucial to deriving robust long-term trends in ozone and other compounds and understanding the impact of large geophysical events such as the 1991 Mount Pinatubo eruption.
Measurements from the Stratospheric Aerosol and Gas Experiment II (SAGE II, McCormick, 1987) remain a cornerstone of understanding and detecting long-term ozone variability and trends in the stratosphere. This data set is recognized for its stability over its 21-year lifetime (1984-2005) and the high vertical resolution of its ozone and aerosol extinction coefficient measurements. While this instrument was remarkably successful and long-lasting, it is well known that SAGE II ozone data quality declines due to the high stratospheric aerosol levels following the Mount Pinatubo eruption and other deleterious features. As with any data set, these must be accounted for when using SAGE II data. Given that the mission began 35 years ago and the Mount Pinatubo eruption occurred almost 30 years ago, it is unsurprising that numerous filters for removing artefacts in SAGE II ozone data have been proposed (e.g. Cunnold et al., 1989), modified, and refined (Cunnold et al., 1996; Wang et al., 2002; Rind et al., 2005; Wang et al., 2006). As a result, there is a plethora of "generally accepted" screening methods for SAGE II data that are sometimes inconsistent with one another, most often subjective, and occasionally untraceable. Furthermore, screening recommendations derived using one version of the SAGE II data continue to be used with a later and presumably better version of the data set without being revised, even though the data, their reported uncertainty, and their sensitivity to interfering species are likely to have changed. For instance, the release notes for version 7.00 of the SAGE II data recommend the use of the Wang et al. (2002) ozone usage rules, which were developed for version 6.1 of the SAGE II data. Therefore, there remains a need for consistent and robust approaches to determining the suitability of stratospheric chemical composition measurements to diagnose long-term change.
SAGE II ozone data are often used as reference measurements to which other measurements are adjusted (Froidevaux et al., 2015; Davis et al., 2016; Sofieva et al., 2017; Hassler et al., 2018), as they provide a long, stable record of ozone measurements with a high vertical resolution. Therefore, having the best possible SAGE II ozone data set is crucial in the development of long-term homogeneous data sets that combine measurements from different sources. However, having numerous screening recommendations available that were published in previous studies hampers the use of the SAGE II data in scientific studies and the creation of merged ozone data sets such as Sofieva et al. (2017), Froidevaux et al. (2015), Davis et al. (2016), and Hassler et al. (2018). Furthermore, the impact of these recommendations on the SAGE II data set is seldom further investigated, and any introduced biases will remain undetected.
In this paper, we will use the SAGE II data set version 7.00 (Damadeo et al., 2013; SAGE II Science Team, 2012) to compare the impact of SAGE II ozone data usage rules that have been proposed in a number of publications (Wang et al., 2002; Rind et al., 2005; Wang et al., 2006) and are stated on the SAGE II release notes page. Here, we focus on the proposed rules that were applied to the SAGE II ozone data set before being used to generate a homogenized satellite record of vertically resolved ozone data sets (Davis et al., 2016; Hassler et al., 2008, 2018). A long-term, latitudinally and vertically resolved ozone database is required as input to climate models that do not have the ability to include a fully coupled stratospheric chemistry scheme (Hassler et al., 2018). Both Davis et al. (2016) and Hassler et al. (2018) applied around seven rules that are mainly based on previous studies by Wang et al. (2002) with the modifications outlined in the SAGE II version 7.00 release notes.
The aim of this study is not to review the generally accepted set of rules for SAGE II ozone data in detail but to develop a set of simple yet robust SAGE II ozone data usage rules. A characteristic of any limb-viewing UV-visible instrument such as SAGE II is that the quality of any ozone observation is sensitive to material that lies at and above the altitude of the observation. Therefore, we will modify aerosol-related rules to reflect the geometry of SAGE II ozone measurements. The rules will make use of parameters that are generally available or derivable from information contained in the SAGE II version 7.00 data files. The expectation is that applying these new rules to SAGE II ozone data results in a more robust SAGE II ozone data set that can be used in trend analysis studies and homogenization efforts with other space-based measurements.

SAGE II - the basics
The SAGE II instrument was on board the Earth Radiation Budget Satellite (ERBS), launched by the space shuttle Challenger in October 1984, and was operational until mid-2005. Like its companion instruments, the Stratospheric Aerosol Measurement (SAM II, 1978-1993), SAGE (1979-1981), SAGE III/Meteor 3M (2002-2005), and SAGE III/ISS (2017-present), SAGE II observed the Sun through the limb of the atmosphere for each spacecraft transit of the solar terminator, a technique called solar occultation (Fig. 1), to measure a line-of-sight (LOS) transmission profile from 0.5 to 100 km at multiple wavelengths. Each profile took between 1.5 and 4 min to collect, with up to 32 profiles per day at its peak, although the number of profiles decreased to 16 after mid-2000. Combined with the ERBS orbit this provided measurements at 2 latitudes per day that shift over time, providing coverage from 80° S to 80° N every 1-2 months. In SAGE terminology, each spacecraft sunrise and sunset encounter producing these LOS transmission profiles is referred to as an "event"; there are usually 15 sunset and 15 sunrise events each day. The altitude at which transmission was reported is for the lowermost point along the LOS path and is commonly referred to as the tangent altitude (Z_t in Fig. 1) since it corresponds to the point where the path is travelling parallel to the Earth's surface. The lowermost altitude where transmission was reported is generally higher than 0.5 km since the presence of dense cloud or aerosol and occasionally the solid Earth itself make the line of sight opaque. The geometry of the solar occultation measurement technique is shown in Fig. 1. Under most circumstances, this geometry is favourable for stratospheric applications including ozone and aerosol extinction coefficient since the long path lengths near the tangent altitude allow for a large signal-to-noise ratio for what are generally optically thin layers of ozone and aerosol. SAGE II has a large dynamic range and can theoretically measure LOS optical depths between about 0.001 and 8. The long paths have a broad horizontal resolution with an effective horizontal resolution between hundreds and tens of thousands of square kilometres depending on the details of an individual event (Thomason et al., 2003).

Figure 1. A schematic diagram showing the solar occultation viewing geometry that is exploited by SAGE II to make measurements of atmospheric constituents. Z_t denotes the tangent altitude. For more details, see the main text.
SAGE II was a seven-channel Sun photometer with central wavelengths at 386, 448, 452, 525, 600, 940, and 1020 nm. The latest algorithm used to derive version 7.00 of SAGE II data is described in detail in Damadeo et al. (2013). The most challenging part of the data processing of SAGE II transmission profiles is to determine the tangent altitude (altitude registration) and the point on the Sun that it intersects with (Damadeo et al., 2013). Once transmission profiles are obtained, constituent profiles including ozone are produced using a straightforward linear species separation algorithm that relates aerosol extinction at 452, 525, and 1020 nm to an unknown aerosol contribution at 600 nm where ozone is effectively inferred. SAGE II ozone values are reported whenever they can be produced by the processing software. The uncertainties reported for ozone observations are generally reliable assessments of ozone data quality. The altitude range reported for data products is limited to the range where relevant observations are made over the lifetime of the instrument. For instance, ozone can be reported from 0.5 to 70 km, while aerosol extinction coefficient is reported as no higher than 40 km. Since the ozone contribution at 600 nm is usually much larger than that by aerosol, the retrieval of ozone number density is robust and produces high-quality data over a broad range of altitude and conditions. However, there are well-known exceptions to this generality and, like any data set, SAGE II ozone measurements require cautious use. For instance, given that aerosol extinction coefficient at 600 nm is interpolated across a broad range in wavelengths (a factor of almost 2), it is hardly surprising that the interpolation is not perfect and, when aerosol extinction coefficient is high, the potential for introducing artefacts in the ozone data set is significant.
While enhanced aerosol loading in the atmosphere is not common during the SAGE II mission, the Mount Pinatubo eruption of June 1991 increased aerosol extinction coefficient substantially (primarily below 25 km) by as much as a factor of 1000. SAGE II ozone data quality in this extreme period is clearly negatively impacted (Yue et al., 1995). Furthermore, the presence of clouds, either tropospheric or polar stratospheric, and occasional smaller volcanic eruptions can have a similar deleterious impact on ozone data quality and therefore need to be accounted for before SAGE II data are used.
3 Current data usage rules for SAGE II ozone
SAGE II measurements are reported during periods of high aerosol loading, i.e. periods where aerosol extinction coefficients are large, and many current and past data usage rules have been designed to remove anomalous water vapour and ozone observations that are associated with these periods, the presence of clouds, or instrument-related artefacts (e.g. Wang et al., 2002). No recent review has been done on whether or not the current screening recommendations of the SAGE II ozone data are still applicable to the most recent version of the SAGE II data, resulting in potentially good data points being removed from the data set or bad data points remaining, which in turn can lead to biases in, e.g. trends that are derived from the SAGE II observations.
Here we will discuss data usage rules that have been applied in screening SAGE II ozone data before their use in the generation of homogeneous long-term ozone data sets as described in Froidevaux et al. (2015), Hassler et al. (2008, 2018), and Davis et al. (2016). The data rules outlined below were derived by others and are the most predominant rules found in the literature and the most commonly applied usage rules. We have not included every proposed filter, nor should those given below be considered, in part or in total, as a canonical set; they are simply a collection of commonly used data usage rules for SAGE II ozone. The rules that are further looked at in this study (hereafter referred to as "current rules") are as follows.
1. Exclude all values between 23 June 1993 and 11 April 1994 between 15 and 50 km if the error is bigger than 10 % (Hassler et al., 2008).
2. Wang et al. (2002) suggested removing all data where the error is greater than 300 %. This rule was later adapted to only exclude ozone values above 35 km if the error is bigger than 300 % (Froidevaux et al., 2015; Davis et al., 2016; Hassler et al., 2018; SAGE II v7.00 release notes).
6. Exclusion of all data points at altitude and below the occurrence of an aerosol extinction (525 nm) value of greater than $6 \times 10^{-3}$ km$^{-1}$ (Davis et al., 2016; Hassler et al., 2018; SAGE II v7.00 release notes).
7. Eliminate all data below 23 km between July 1991 and December 1993 for excessive aerosol (Hassler et al., 2008, 2018).
11. Outlier screening by removing all values that are more than 10σ away from the monthly mean value for a given latitude band (15° zones), longitude (90° quadrants), and altitude (0.5 km grid) (Rind et al., 2005; Hassler et al., 2008). This rule was later adapted to remove values that are farther than 3σ away from the mean in 10° latitude bins (Davis et al., 2016; Hassler et al., 2018).
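As an illustration only, two of the rules listed above, the aerosol extinction threshold (rule no. 6) and the σ-based outlier screening (rule no. 11), can be sketched in Python. The function names and the array layout (one profile ordered from lowest to highest altitude, one array of values per latitude-altitude bin) are assumptions made for this sketch and are not part of the SAGE II data product; "at altitude and below" is read here as "at and below the highest level where the threshold is exceeded".

```python
import numpy as np

def flag_at_and_below_aerosol(ext_525, threshold=6e-3):
    """Flag levels at and below the highest level whose 525 nm aerosol
    extinction (km^-1) exceeds `threshold` (hypothetical helper).

    `ext_525` is a 1-D profile ordered from lowest to highest altitude;
    NaN marks levels without a reported extinction value.
    """
    ext = np.asarray(ext_525, dtype=float)
    exceed = np.nan_to_num(ext, nan=0.0) > threshold
    flags = np.zeros(ext.shape, dtype=bool)
    if exceed.any():
        top = np.flatnonzero(exceed).max()   # highest exceeding level
        flags[: top + 1] = True              # that level and everything below
    return flags

def flag_sigma_outliers(values, n_sigma=3.0):
    """Flag values farther than `n_sigma` standard deviations from the
    mean of a single bin (hypothetical helper)."""
    v = np.asarray(values, dtype=float)
    mean, std = np.nanmean(v), np.nanstd(v)
    return np.abs(v - mean) > n_sigma * std

# Toy profile: only the second level exceeds the threshold.
profile_flags = flag_at_and_below_aerosol([1e-3, 8e-3, 2e-3, 1e-4])
# Toy bin: twenty ordinary values and one extreme value.
bin_flags = flag_sigma_outliers([1.0] * 20 + [10.0])
```

Because the mean and standard deviation in the second helper are computed from the same values that are being screened, a single extreme value inflates σ, which is one reason such a rule behaves poorly for skewed data.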
Up to 7 out of the 11 recommendations were used in the studies by Froidevaux et al. (2015), Hassler et al. (2008, 2018), and Davis et al. (2016); rules no. 1 and 8 to 10 were included here as they have been used in earlier versions of the vertically resolved ozone database described in Hassler et al. (2008). These rules seem to have been replaced by rule no. 11, the outlier-screening rule, in more recent publications. However, Froidevaux et al. (2015), for example, did not apply any outlier-screening rule before using SAGE II ozone data. There seems to be some confusion about whether or not to apply outlier-screening rules and which rule should be applied to SAGE II ozone. Here, we will provide a new outlier-screening recommendation that is applicable not only to normally distributed data but also to skewed data sets, and which does not rely on visual inspection of the data set, which is a rather subjective screening method. In addition to the 11 screening rules listed above, Froidevaux et al. (2015) and Davis et al. (2016) removed all SAGE II ozone profiles associated with "short events" during the 1993 and 1994 period as described in Taha et al. (2004). The short events rule is based on the beta angle, which is a spacecraft or event characteristic, i.e. the beta angle is the elevation angle of the Sun with respect to the orbital plane of the spacecraft. Most significantly, the beta angle governs the duration of an event: the larger its magnitude, the longer an event lasts. Increasing event duration corresponds to expanded spatial extent, i.e. for any event, the latitude-longitude of tangent altitudes moves at roughly 7 km s$^{-1}$, which, in turn, reduces the applicability of the assumption of spherical homogeneity of the atmosphere used by the retrieval algorithm.
The short events period relates to the time when SAGE II data were collected during an event that was shortened to reduce power requirements following the failure of a series of cells in the spacecraft battery. Taha et al. (2004) found that sunrise events with a beta angle between −47° and +47° and sunset events with a beta angle less than −45° or greater than +45° are the most affected events and need to be removed. Data for these events are notably noisier than unaffected events (Damadeo et al., 2013). Short events were a part of SAGE II version 6.2 but are excluded from the processing within the retrieval in version 7.00 (Damadeo et al., 2013), and therefore the rationale behind this rule is less compelling and this rule should be eliminated and not applied to SAGE II v7.00 ozone data. Therefore, this rule is excluded from the list of "current data usage rules" listed above.

Comments on the current data usage rules
The purpose of any data usage rule is to differentiate between usable data (including high-quality, atypical, or noisy but unbiased data) and data that are unusable or best avoided (including data where a bias is likely and data with noise levels so high that they contain no useful information). There is, in principle, no need for any data usage rules to be applied if the data are well behaved in the sense that they only consist of random noise with zero mean. In fact, for an ideal data set, it would be incorrect to follow common practice and eliminate data based on relative error because low values that represent the negative tail of a well-behaved distribution of noisy data are preferentially eliminated. As a result, the mean of the data set can be biased high even when the family of data points is itself unbiased. An example of this effect can be seen in Fig. 2, where the application of current rule no. 3 leads to unexpected results. In this example, measurement noise is relatively large and spreads the reported ozone values, at that altitude and latitude bin, across the zero line. Using the relative errors to flag data as outliers leads to removal of ozone values lying in a band centred at zero, while data larger in magnitude of both signs will remain in the data set. While this is an example of unintended consequences, it is a good example of conditions in which an argument could be made to eliminate no data and have averaging be allowed to do its work. Removing ozone values that have an associated relative error greater than a fixed value (e.g. rule no. 3) will bias the remaining data set (Fig. 2), leading to a larger positive value.
However, this issue could be mitigated by using a relative error rule in which the ozone uncertainty in number density is compared to the average value of ozone within the averaging window rather than to individual measurements, so that it becomes an assessment of the size of the reported uncertainty rather than the size of the reported ozone number density.
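The bias mechanism described above can be reproduced with synthetic numbers. The following Python sketch is not based on SAGE II data: the true value, the noise level, and the 300 % threshold are illustrative assumptions, chosen so that the noise spreads the values across the zero line. It compares screening on the per-measurement relative error with screening on the uncertainty relative to the bin mean.

```python
import numpy as np

rng = np.random.default_rng(0)

true_value = 0.5   # hypothetical true ozone value (arbitrary units)
sigma = 1.0        # hypothetical 1-sigma uncertainty of every measurement
obs = true_value + sigma * rng.standard_normal(100_000)

# Per-measurement relative error: uncertainty divided by |reported value|.
# Values close to zero get a huge relative error and are removed.
rel_err_individual = sigma / np.abs(obs)
keep_individual = rel_err_individual <= 3.0      # 300 % threshold

# Alternative: compare the uncertainty with the mean of the averaging bin,
# so the test no longer depends on the size of each individual value.
rel_err_bin = sigma / np.abs(obs.mean())
keep_bin = np.full(obs.shape, rel_err_bin <= 3.0)

mean_all = obs.mean()
mean_screened = obs[keep_individual].mean()
mean_bin_rule = obs[keep_bin].mean()

print(f"unscreened mean:            {mean_all:.3f}")
print(f"per-measurement rule mean:  {mean_screened:.3f}")
print(f"bin-mean rule mean:         {mean_bin_rule:.3f}")
```

In this toy setting the per-measurement rule removes a band of values centred at zero and the remaining mean is larger than the unscreened mean, whereas the bin-mean rule keeps or rejects the bin as a whole and leaves its mean unchanged.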
To investigate the impact of each rule on the SAGE II ozone data, the usage rules outlined above were applied individually to the whole SAGE II ozone data set at a given altitude and within a given 10° latitude band. Therefore, the order in which the rules are applied does not impact which data points are flagged for removal, and data points might be flagged more than once. SAGE II ozone observations at 20 km altitude between 5° S and 5° N and at 16.5 km altitude between 25 and 15° S throughout the duration of the SAGE II mission are shown in Fig. 3, together with the flagged ozone values that would be removed by applying the current rules as indicated in the legend. Note that some of the data usage rules only apply to higher altitude ranges, and therefore no data are removed at the given altitudes. The percentage of the ozone values that remain and would be removed by applying the rules is shown in the legend; note that the sum of all percentages may be greater than 100 % as the current rules are applied individually to the data set, i.e. a data point can be removed more than once.
At 20 km, overall 10 % of the ozone data are removed by applying the current rules outlined in Sect. 3, and rule no. 4 and rule no. 6 remove the majority of the ozone data due to aerosol extinction values exceeding a threshold. However, it is clear that applying the current rules to the SAGE II ozone data set is too restrictive as it removes ozone measurements that appear unaffected by aerosol, especially during the period of the Nyamuragira and Nevado del Ruiz eruptions (1985 and 1986, respectively); see Fig. 3a. At lower altitudes (16.5 km), 15 % of the ozone data are removed when applying the data-screening rules, with the majority being removed due to rule no. 4, which again is based on aerosol extinction values. While aerosol extinction values are enhanced due to volcanic eruptions, this should not equal bad ozone data, as can be seen in Fig. 3b. Here, ozone values just before the Nyamuragira and Nevado del Ruiz eruptions and at the end of the Mount Pinatubo eruption period (1994/1995) seem as valid as any other ozone data point and should not necessarily be removed. Furthermore, applying rule no. 3 leads to ozone values around zero being removed, which, as discussed above, biases any mean values calculated from this set of data. Overall, the data usage recommendations as described above appear unnecessarily restrictive, and relying on current aerosol extinction coefficient rules to capture the impact of aerosol on the quality of the ozone data is not advisable. This result confirms the finding of a recent study by Damadeo et al. (2018), who concluded that the Wang et al. (2002) filtering recommendations are overly conservative and need to be revisited.
While Hassler et al. (2018) did not directly employ a cloud presence filter as a usage rule, Rind et al. (2005), Hassler et al. (2008), and Davis et al. (2016) have used the SAGE II cloud presence flags to eliminate data when the presence of clouds has been inferred. Applying cloud flags to SAGE II data may be justified by the inhomogeneity of clouds as they are present in SAGE-like observations (Thomason and Vernier, 2013), as opposed to strictly confining the usage rules to the magnitude of aerosol extinction. However, it is not clear that the impact on ozone by enhanced extinction from cloud presence can be significantly different from that caused by enhanced aerosol, which has its own associated usage rules. Since the detection of cloud presence in SAGE II observations is ambiguous at best (Thomason and Vernier, 2013), any cloud presence rule has been excluded in this study. While further in-depth investigation of the cloud effect on ozone data quality may be worthwhile, it is beyond the scope of this study.

How measurements are made and how that should impact usage rules
In revising the SAGE II ozone data usage rules, the primary goal is to simplify them and to retain as much data as possible without compromising the data quality of the remaining data. A guiding principle for the development of new usage rules is that any information needed in the revised rules must be readily accessible to other users and hence reproducible by other users of the data set. This excludes, for instance, the use of SAGE II transmission data since it is not routinely made available to users. This is important to note since one goal of the way the new usage rules are devised is to make them reflect the way in which the measurements were made. Therefore, as outlined below, the line-of-sight (LOS) values for aerosol extinction coefficient will be used. Since this is not routinely available in the data set, we calculate the LOS aerosol optical depth using the reported aerosol extinction profiles integrated along the LOS path that is approximated using a simple geometric model, thereby neglecting the impact of refraction. In addition, aerosol extinction at 600 nm is estimated within the SAGE II retrieval algorithm, but it is not included in the data product. Therefore, this parameter is estimated by using a simple Ångström coefficient approximation based on extinction reported at 525 and 1020 nm, as described below. The approximations outlined below are adequate for the application of developing new data usage rules and are readily reproducible by any user of the SAGE II ozone data. 
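A minimal Python sketch of such an Ångström-type estimate is given here for orientation. It assumes a single power law in wavelength between the 525 and 1020 nm channels; the function name and the example extinction values are hypothetical, and the estimate made inside the SAGE II retrieval itself may differ.

```python
import numpy as np

def angstrom_extinction(k_525, k_1020, wavelength_nm=600.0):
    """Estimate aerosol extinction at `wavelength_nm` from extinction at
    525 and 1020 nm, assuming k(lambda) is proportional to lambda**(-alpha).

    Inputs must be positive; non-positive values yield NaN.
    """
    k_525 = np.asarray(k_525, dtype=float)
    k_1020 = np.asarray(k_1020, dtype=float)
    # Angstrom exponent from the two reported channels.
    alpha = -np.log(k_525 / k_1020) / np.log(525.0 / 1020.0)
    # Power-law scaling from 525 nm to the requested wavelength.
    return k_525 * (wavelength_nm / 525.0) ** (-alpha)

# Hypothetical extinction values in km^-1.
k_600 = angstrom_extinction(1.0e-3, 2.5e-4)
```

By construction the estimate reproduces the input values at 525 and 1020 nm and lies between them at 600 nm.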
Neglecting small contributions by nitrogen dioxide and water vapour, the SAGE II line-of-sight optical depth $\tau_{\mathrm{LOS}}$ at 600 nm is given by the following equation:

$$\tau_{\mathrm{LOS}}(z_0) = \int_{z_0}^{\infty} \left[ \sigma_m n_m(z) + \sigma_{\mathrm{O_3}} n_{\mathrm{O_3}}(z) + k_a(z) \right] \frac{\mathrm{d}l(z)}{\mathrm{d}z}\, \mathrm{d}z, \qquad (1)$$

where $z_0$ is the tangent altitude, $\sigma_m$ is the molecular cross section ($3.1626 \times 10^{-27}$ cm$^2$ molecule$^{-1}$, Bucholtz, 1995), $n_m$ is the neutral number density profile, $\sigma_{\mathrm{O_3}}$ is the ozone absorption coefficient ($5.2667 \times 10^{-21}$ cm$^2$ molecule$^{-1}$, Bogumil et al., 2003), $n_{\mathrm{O_3}}(z)$ is the ozone number density, and $k_a$ is the aerosol extinction coefficient profile at 600 nm (see below). The derivative $\mathrm{d}l(z)/\mathrm{d}z$ is the distance travelled along the line of sight per unit change in altitude. Values for $n_{\mathrm{O_3}}$ and $n_m$ are available directly from the SAGE II data product files, and $\mathrm{d}l(z)/\mathrm{d}z$, neglecting the effects of refraction, can be computed from simple spherical geometric considerations. In discrete form, the path length $l_{i,j}$, which reflects the 0.5 km vertical spacing of SAGE II reporting altitudes, follows from the same spherical geometry, where $i$ is the LOS optical depth altitude and $j$ are the data levels above the measurement (so $j > i$). The value $l_{i,i+1}$ is applied to the data at level $i$. From Eq. (1) it is clear that the measured LOS optical depth at any altitude is dependent not just on components at that altitude but also at all levels above it. A strength of the occultation measurement is that, given the limb-viewing geometry, the contribution is dominated by the lowest few altitude levels. Figure 4a shows the value of $l_{i,j}$ as a function of altitude for a number of tangent altitudes. While these weighting functions are heavily weighted toward the tangent altitude, the shape of the constituent profile (e.g. ozone) can significantly alter the altitudes that dominate a measurement. Figure 4b shows an ozone profile for June 1999 between 5° S and 5° N. In this example, the ozone profile has a broad peak around 25 km and decreases rapidly below the peak. The weighting of ozone measurements as a function of altitude is shown in Fig. 4c.
It is the product of Fig. 4a and Fig. 4b using only the ozone portion in the integral of Eq. (1). For measurements with tangent altitudes at and above the ozone peak (around 25 km), the measurement is heavily weighted to the tangent altitude. However, for tangent altitudes below the peak, measurements are more heavily affected by the overlying ozone, such that the total line-of-sight ozone amount primarily consists of ozone several kilometres above the tangent altitude.

Figure 3. SAGE II ozone number density at 20 km between 5° S and 5° N (a) and 16.5 km between 25 and 15° S (b) before and after data usage rules (Sect. 3) have been applied, as indicated in the legend. Each rule is associated with a percentage of data points that will be removed from the data sets if the rule is applied. For more details, see the main text.
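The geometric path lengths and the resulting LOS optical depth can be sketched in Python as follows. This is a sketch under stated assumptions rather than the retrieval code: it assumes straight-line (unrefracted) paths through spherical shells, a mean Earth radius of 6371 km, a path counted on both sides of the tangent point, and an indexing convention in which the shell between levels j and j+1 is applied to the data at level j. The function names are hypothetical.

```python
import numpy as np

R_EARTH_KM = 6371.0      # assumed mean Earth radius
SIGMA_O3 = 5.2667e-21    # ozone absorption coefficient, cm^2 molecule^-1
SIGMA_M = 3.1626e-27     # molecular cross section, cm^2 molecule^-1
CM_PER_KM = 1.0e5

def path_lengths(z_km, r_earth=R_EARTH_KM):
    """Return L where L[i, j] is the path length (km) of the line of sight
    with tangent altitude z[i] inside the shell between z[j] and z[j+1],
    counting both sides of the tangent point. Entries with j < i are zero,
    as is the last column (there is no shell above the top level)."""
    r = r_earth + np.asarray(z_km, dtype=float)
    # half[i, k]: distance from the tangent point of ray i to radius r[k].
    half = np.sqrt(np.maximum(r[None, :] ** 2 - r[:, None] ** 2, 0.0))
    L = np.zeros((r.size, r.size))
    L[:, :-1] = 2.0 * (half[:, 1:] - half[:, :-1])
    return L

def los_optical_depth(z_km, n_o3, n_m, k_a_600):
    """Discrete LOS optical depth at 600 nm for every tangent level.
    n_o3 and n_m are number densities in molecules cm^-3; k_a_600 is the
    aerosol extinction at 600 nm in km^-1."""
    ext = (SIGMA_M * np.asarray(n_m, dtype=float)
           + SIGMA_O3 * np.asarray(n_o3, dtype=float)) * CM_PER_KM
    ext = ext + np.asarray(k_a_600, dtype=float)   # total extinction, km^-1
    return path_lengths(z_km) @ ext

z = np.arange(10.0, 70.5, 0.5)          # 0.5 km reporting grid
L = path_lengths(z)
# Ozone-only weighting for each tangent altitude, using a flat toy profile.
ozone_weights = L * (SIGMA_O3 * np.full(z.size, 1.0e12) * CM_PER_KM)[None, :]
```

With this convention each row of `L` is largest in the tangent shell and falls off quickly above it, which corresponds to the weighting toward the tangent altitude described in the text; multiplying each row by an ozone extinction profile gives the ozone-only weighting.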
Overall, the fraction of the total LOS optical depth due to ozone at the measurement altitude can be estimated as

$$f(z_i) = \frac{\sigma_{\mathrm{O_3}}\, n_{\mathrm{O_3}}(z_i)\, l_{i,i+1}}{\tau_{\mathrm{LOS}}(z_i)},$$

where $\sigma_{\mathrm{O_3}}$ is the ozone absorption coefficient, $n_{\mathrm{O_3}}(z_i)$ is the ozone number density at altitude $z_i$, $l_{i,i+1}$ is the length of the LOS path through the altitude bin $i$ with the tangent point being at altitude $z_i$, and $\tau_{\mathrm{LOS}}(z_i)$ is the LOS optical depth at altitude $z_i$ calculated using Eq. (1). Measurement fractions, $f$, for a given latitude band (5° S to 5° N) from 10 to 70 km are shown in Fig. 5. Measurement fractions peak at 20 % to 30 % between about 25 and 50 km, which is quite consistent from year to year except during the 1991-1994 period, when the stratosphere was strongly impacted by the June 1991 eruption of Mount Pinatubo. In this latitude band (tropics) the ozone fraction of the optical depth in the upper troposphere-lower stratosphere (UTLS) is only a few percent of the total LOS optical depth. As a result, performing ozone measurements from SAGE II is extremely challenging at these altitudes and they are very sensitive to even modestly enhanced aerosol. Ozone measurement uncertainty is usually dominated by the estimates of transmission uncertainty that are carried through the retrieval process to ozone and other product uncertainties. While transmission uncertainty varies slowly with altitude, the rapid decrease in ozone fraction below the ozone peak causes the ozone uncertainties due to transmission uncertainty to increase rapidly. Overall, transmission uncertainty is an unbiased source of error and roughly Gaussian; thus it can be reduced by averaging multiple measurements, as has been done in Hassler et al. (2018) and other ozone climatologies. Of course, at very high altitudes (> 60 km) and low altitudes (at and below the tropopause), the uncertainty can become sufficiently large that meaningful measurements can no longer be made and no amount of averaging will produce a reliable result.
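The measurement fraction described above can be evaluated with a few lines of Python. The sketch assumes the same unrefracted spherical geometry as before (mean Earth radius of 6371 km, path counted on both sides of the tangent point) and takes the LOS optical depth as an input; the function name and the toy numbers are hypothetical.

```python
import numpy as np

SIGMA_O3 = 5.2667e-21   # ozone absorption coefficient, cm^2 molecule^-1
CM_PER_KM = 1.0e5

def ozone_measurement_fraction(z_km, n_o3, tau_los, r_earth=6371.0):
    """Fraction of the LOS optical depth caused by ozone in the tangent
    shell. n_o3 is in molecules cm^-3 and tau_los is the LOS optical depth
    at each tangent level. The top level has no shell above it and
    returns NaN."""
    r = r_earth + np.asarray(z_km, dtype=float)
    l_tan = np.full(r.shape, np.nan)
    # Path length (km) through the shell directly above each tangent level.
    l_tan[:-1] = 2.0 * np.sqrt(r[1:] ** 2 - r[:-1] ** 2)
    ozone_tangent = SIGMA_O3 * np.asarray(n_o3, dtype=float) * CM_PER_KM * l_tan
    return ozone_tangent / np.asarray(tau_los, dtype=float)

# Toy case: ozone only in the lowest shell, so for the lowest tangent level
# the tangent-shell ozone accounts for all of the LOS optical depth.
z = np.array([20.0, 20.5, 21.0])
n_o3 = np.array([4.0e12, 0.0, 0.0])
l0 = 2.0 * np.sqrt((6371.0 + 20.5) ** 2 - (6371.0 + 20.0) ** 2)
tau_ozone_only = np.array([SIGMA_O3 * 4.0e12 * CM_PER_KM * l0, 1.0, 1.0])
f = ozone_measurement_fraction(z, n_o3, tau_ozone_only)
# Doubling the total optical depth (e.g. by aerosol) halves the fraction.
f_with_aerosol = ozone_measurement_fraction(z, n_o3, 2.0 * tau_ozone_only)
```

The second call illustrates why enhanced aerosol along the line of sight lowers the fraction even when ozone at the tangent altitude is unchanged.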
Other than ozone itself, there are two additional components to the LOS optical depth at 600 nm: (i) molecular scattering, which is significant in the lower stratosphere and above 45 km, and (ii) aerosol extinction, which is highly variable and the most likely contributor to biases in the measurements. The molecular scattering correction is generally small and only slowly varies in an altitude-latitude bin and is therefore not of particular importance for the following discussion. The second term, however, is particularly noteworthy during the Mount Pinatubo period, but it can also be episodically significant for smaller eruptions such as the Nyamuragira and Nevado del Ruiz eruptions (1985 and 1986, respectively) and when clouds or polar stratospheric clouds are present. As previously noted, aerosol extinction at 600 nm must be inferred from aerosol extinction measurements at 452, 525, and 1020 nm. Aerosol correction usually contributes to the measurement uncertainty in the stratosphere, as ozone dominates aerosol extinction at 600 nm. However, in the upper troposphere, where clouds are present, or during significant volcanic eruptions, even small deficiencies in the estimation process of aerosol extinction at 600 nm can cause significant deviations in the retrieved ozone values. The version 7.00 algorithm that is used to derive SAGE II ozone data includes an estimate of the possible magnitude of bias introduced in the retrieved ozone values due to the estimation of aerosol extinction at 600 nm. However, this estimate itself is highly uncertain and therefore its use in studies and efforts in, e.g. generating a merged ozone database such as those of Hassler et al. (2018) and Davis et al. (2016), is limited.

Figure 5. Measurement fraction, f, for 5° N for all SAGE II data (no clearing) for the entire lifetime of the instrument. Note that the Mount Pinatubo period, particularly from month 81 (June 1991) through to roughly month 100 (the end of 1992), shows a clear reduction of the value of f above 18 km to as high as 35 km. Though several other volcanic events occur during the SAGE II lifetime (e.g. Nevado del Ruiz and Nyamuragira), none of these are readily apparent in f since they were an order of magnitude or more smaller than the Mount Pinatubo event.
In light of this, a number of ozone data usage rules based on the aerosol extinction coefficient, rather than on the provided uncertainties, have been proposed by Wang et al. (2002) and have been used in Hassler et al. (2018) and Davis et al. (2016). These include rules that exclude much of the ozone in the lower stratosphere following the Mount Pinatubo eruption (current rules no. 1 and 7). Similar to the importance of the ozone overburden to ozone uncertainty at a given altitude, a somewhat hidden aspect of ozone measurement uncertainty is its sensitivity to enhanced aerosol (or cloud) above the measurement altitude, even when aerosol at that altitude is well within nominal levels. This sensitivity is not reflected in any of the current SAGE II ozone data usage rules, and accounting for this possibility is a goal of this study.
Nominally, accounting for the burden of material above the tangent altitude should be straightforward, as every SAGE II event has an intermediate product of transmission profiles at the wavelengths available from the instrument. However, this product is not routinely made available from what is essentially a heritage mission. Developing generally usable rules requires us to use only publicly available products. Fortunately, since the rules will only be applied in a semi-quantitative manner, we can simulate the LOS values in a straightforward way using products available in the primary data product files. In addition, we simplify the overburden test to consider only aerosol, since the molecular and ozone contributions do not vary a great deal over the lifetime of the instrument. For our aerosol test, we simulate the 600 nm aerosol LOS optical depth using the aerosol portion of Eq. (1), the path length from Eq. (2), and a reconstructed aerosol extinction coefficient at 600 nm. While the aerosol extinction coefficient at 600 nm is computed during data processing, it is never explicitly examined or retained as a data or quality-assured product. For this study, we estimate extinction at 600 nm from the SAGE II aerosol extinction coefficients at 525 and 1020 nm using a simple Ångström coefficient ($\alpha$) approach, where

$$\alpha = -\frac{\ln\left(k^{a}_{525}/k^{a}_{1020}\right)}{\ln\left(525/1020\right)}.$$

$\alpha$, together with the aerosol extinction coefficient at 1020 nm, is then used to determine the aerosol extinction coefficient at 600 nm, following

$$k^{a}_{600} = k^{a}_{1020}\left(\frac{600}{1020}\right)^{-\alpha}.$$

The aerosol extinction coefficient is only reported up to 40 km, and very small negative values above 30 km are regularly reported in the data files. For the purposes of this study, whenever the aerosol extinction at either 525 or 1020 nm is reported as negative, the aerosol extinction coefficient at 600 nm is set to zero.
Generally, $k^{a}_{600}$ is between $k^{a}_{525}$ and $k^{a}_{1020}$ (closer to $k^{a}_{525}$) and $\alpha$ is usually between 0 (clouds and high amounts of volcanic aerosol) and 3 (low amounts of aerosol).
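The Ångström interpolation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the processing code used for the data set: the function names and the input values are hypothetical, and the handling of exactly zero extinction (returning zero, since the ratio is undefined) is an assumption that goes beyond the rule for negative values stated in the text.

```python
import math

def angstrom_exponent(k525, k1020):
    """Angstrom exponent from aerosol extinction at 525 and 1020 nm."""
    return -math.log(k525 / k1020) / math.log(525.0 / 1020.0)

def extinction_600(k525, k1020):
    """Estimate the aerosol extinction coefficient at 600 nm.

    Returns 0.0 when either input is negative, as described in the text.
    Assumption: exactly zero inputs are also mapped to 0.0, because the
    logarithm of the extinction ratio is undefined in that case.
    """
    if k525 <= 0.0 or k1020 <= 0.0:
        return 0.0
    alpha = angstrom_exponent(k525, k1020)
    return k1020 * (600.0 / 1020.0) ** (-alpha)

# Hypothetical extinction coefficients (km^-1), for illustration only
k600 = extinction_600(2.0e-4, 1.0e-4)
```

For these illustrative inputs the result lies between the two measured coefficients, consistent with the behaviour noted in the text.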
The relative contributions by ozone, aerosol, nitrogen dioxide, water, and molecular scattering vary by season, the phase of the Quasi-Biennial Oscillation (QBO), latitude, and altitude. Furthermore, these contributions are modulated by volcanic eruptions. This variability is ignored in the construction of ozone usage rules under the assumption that the effects are small or invariant enough for the purposes of identifying outliers. While this assumption is probably true, it may be worth investigating in a future study.

Development of new ozone data usage rules
In this section, the rationale behind the development of new SAGE II ozone data usage rules is described. The goal was to reduce the data-screening recommendations to as few rules as possible and to consider two new recommendations, where one is based on LOS aerosol optical depth and the other is a basic outlier identification rule. Additional recommendations were only considered when clear deficiencies were uncovered following implementation of the primary rules.

LOS aerosol optical depth and ozone data quality
It is well known that SAGE II ozone data quality is modulated by variations in aerosol extinction coefficient (e.g. Steele and Turco, 1997; Wang et al., 2002). This is primarily due to the relatively sparse spectral sampling of the instrument and the need to effectively infer aerosol extinction at 600 nm from measurements at 525 and 1020 nm (see Sect. 4). The strength of the ozone signal in the lower stratosphere is generally more than adequate for a robust retrieval of ozone concentration. However, enhanced aerosol from clouds, and particularly from the major Mount Pinatubo eruption in 1991, can shift the weight of observations at 600 nm from ozone toward aerosol. As shown in Fig. 5, even in the main ozone layer in the 20 to 30 km range, the fraction of the signal from ozone at the measurement altitude drops by factors of 3 or more during the Mount Pinatubo period. Therefore, it is not surprising that the quality of the ozone measurements declines under these circumstances. Figure 6 shows an extinction coefficient profile (at 1020 nm), as retrieved from the measured transmissions, which encounters a dense aerosol layer, in this case probably a cloud, at about 15 km. While the extinction profile down to this altitude remains robust, there are substantial oscillations in the profile below the top of the aerosol layer. At the same time, the LOS aerosol optical depth remains well behaved below this altitude (Fig. 6). The oscillations reflect numerical instability, resulting from very large variations (up to a factor of 1000) in extinction coefficients over a narrow altitude range and from the fact that such features invariably do not fall conveniently on the SAGE II altitude grid.
While the reported uncertainties on the SAGE II data show that these data are of poor quality, relying solely on these extinction values to reflect the impact of aerosol on the quality of the ozone data is not advisable, and other measures for flagging ozone data that are affected by enhanced aerosol are required. The dependence of ozone on aerosol LOS optical depth, derived from the aerosol extinction coefficient at 600 nm (see Sect. 4), for a pair of altitude-latitude bands is shown in Fig. 7. There is a cluster of observations at low aerosol LOS optical depth and then a long tail toward higher values, mostly occurring during the multi-year recovery from the Mount Pinatubo eruption through the mid-1990s. For much of the aerosol LOS optical depth domain, ozone shows no clear variation with increasing optical depth, but at values exceeding about 3 (vertical line in Fig. 7) ozone appears to decline. While an aerosol-related ozone change cannot be excluded, this decline is generally taken as evidence of an aerosol artefact in the ozone measurements. Examining all latitude-altitude bands, we observe that the impact of aerosol on ozone does not occur for aerosol LOS optical depths below 3 and in some situations does not appear until values exceed 4. Since the volume of data with LOS optical depths between 3 and 4 is small, we conservatively use 3 as the cutoff value for all latitude bands and altitudes, i.e. our recommendation is to remove all ozone measurements at any given altitude and within a given latitude band if the corresponding aerosol LOS optical depth exceeds a value of 3.
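As an illustration of the cutoff described above, the following Python sketch masks ozone values whose aerosol LOS optical depth exceeds 3. The function name, the use of `None` for removed values, and the sample numbers are assumptions made for this example only; they are not part of the SAGE II data product.

```python
LOS_AOD_CUTOFF = 3.0  # cutoff recommended in the text

def apply_los_rule(ozone, los_aod, cutoff=LOS_AOD_CUTOFF):
    """Replace ozone values by None where the aerosol LOS optical depth
    exceeds the cutoff; values at or below the cutoff are kept."""
    if len(ozone) != len(los_aod):
        raise ValueError("ozone and los_aod must have the same length")
    return [o if a <= cutoff else None for o, a in zip(ozone, los_aod)]

# Hypothetical ozone values and LOS optical depths at one altitude
screened = apply_los_rule([4.1, 3.9, 0.7], [0.5, 3.0, 3.1])
```

Only the last value is removed here, because the rule removes data that exceed the cutoff, not data that equal it.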
After the new aerosol rule (hereafter referred to as the "LOS optical depth rule") is applied to the SAGE II ozone data set, the biggest outliers caused by, e.g. aerosol enhancements in the atmosphere, are removed. SAGE II ozone at 16 km and 21 km altitude between 5 and 15° S for each month of the observing period is shown in Fig. 8. The blue dots in Fig. 8 indicate the ozone data that will be removed by the aerosol LOS optical depth rule. At 21 km, this rule mostly removes ozone data during the Mount Pinatubo eruption period, when the enhanced aerosol leads to high ozone concentrations and even negative ozone concentrations being retrieved in some months. Outside the Mount Pinatubo period, the ozone values remain unaffected by this screening recommendation (Fig. 8a). At 16 km, the LOS optical depth increases, as expected, and the LOS optical depth rule removes a substantial amount of data, especially the negative ozone concentrations that were retrieved and are provided with the SAGE II data.

Usage rule according to reported relative uncertainties
Applying the LOS optical depth rule to the SAGE II ozone data set did not remove all deficiencies from the data set.
Looking into the remaining features in more detail, we discovered that there are some ozone values for which the relative uncertainty was set to exactly 200 %, which seemed surprising to us. Further investigation led us to the following explanation for these uncertainties and how they should be treated. As the version 7.00 data product was developed, it was noted that, under some circumstances, the calculated uncertainty for ozone concentration was much too small when negative values were reported. During investigations for the development of new ozone data usage rules, it was observed that the defect occurs primarily in the lower stratosphere and troposphere where the aerosol LOS optical depth exceeds 2 but, for reasons that are not immediately clear, this does not affect all data where this aerosol level of 2 is exceeded. Most likely this reflects a failure in the empirical means of accounting for the deleterious impact of excess aerosol LOS optical depth in the ozone retrieval (Damadeo et al., 2013). The fixed 200 % uncertainty value is intended as a means to differentiate between negative retrieved values with high uncertainties and retrieved values with low but unrealistic uncertainties. Negative retrieved ozone values and their associated uncertainties remain intact at higher altitudes, as these are more often a result of noise in the data than of a deleterious effect on the retrieval algorithm from high aerosol extinctions or clouds. Since we are confident that the 200 % uncertainty data are the result of an aerosol issue but no single value of aerosol LOS optical depth can be identified, we have implemented a rule that removes all ozone observations where the uncertainty is reported as exactly 200 %. We recommend applying this rule first, before any other rule is considered.
The effect of applying this 200 % rule on the ozone data is illustrated in Fig. 9, which shows SAGE II ozone observations at 12 km between 65 and 75° S. In this case, substantial scatter of ozone observations is apparent. There are some high values that occur during the Mount Pinatubo period, which may reflect aerosol contamination of the ozone product, and data points stretching downward into negative values throughout the SAGE II lifetime. Blue dots in Fig. 9a represent ozone data points with uncertainties reported as exactly 200 %, which are eliminated from the data set; essentially all the negative values have been eliminated. Figure 9b shows the result of eliminating all data where the aerosol LOS optical depth exceeds 2 (marked as blue dots).
Here, essentially all of the points that are eliminated by the 200 % rule are also eliminated, as well as those points with very high ozone values during the Mount Pinatubo period. However, it is also clear from Fig. 9b that some points that fall well within acceptable bounds are also being removed, and overall 12.1 % of the data are removed in that latitude bin. This problem is ubiquitous in the lower stratosphere, and it is clear that simply eliminating all data with LOS aerosol optical depth exceeding a value of 2 is not acceptable; compare the 0.8 % of the data that are removed by applying the 200 % rule with the 12.1 % of the data that are removed by using the criterion of LOS aerosol optical depth exceeding a value of 2.
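The comparison between the 200 % rule and a simple LOS optical depth threshold of 2 can be sketched as follows. The variable names, the tolerance used for the floating-point comparison, and the sample values are assumptions for illustration; in particular, it is assumed that the relative uncertainty is available in percent.

```python
def flag_200_percent(rel_unc_percent, tol=1e-6):
    """Flag values whose relative uncertainty is reported as exactly 200 %.

    A small tolerance is used as a guard against floating-point round-off
    (an assumption; the data files may store the value exactly).
    """
    return [abs(u - 200.0) <= tol for u in rel_unc_percent]

def flag_los_above(los_aod, threshold):
    """Flag values whose aerosol LOS optical depth exceeds a threshold."""
    return [a > threshold for a in los_aod]

def percent_removed(flags):
    """Percentage of data points that are flagged for removal."""
    return 100.0 * sum(flags) / len(flags)

# Hypothetical relative uncertainties (%) and LOS optical depths
rel_unc = [12.5, 200.0, 48.0, 199.9]
los_aod = [0.4, 2.6, 2.3, 1.1]

removed_200 = percent_removed(flag_200_percent(rel_unc))
removed_los2 = percent_removed(flag_los_above(los_aod, 2.0))
```

In this toy case the threshold removes a point that the 200 % rule keeps, which mirrors the kind of over-removal discussed in the text.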

Statistical outlier identification
After applying the new LOS optical depth rule outlined above, the distribution of ozone values within latitude-altitude bins is dominated by geophysical variability and unbiased measurement noise. This generally results in distributions that are roughly Gaussian in shape, and applying noise filtering such as current rule no. 11 (cf. Sect. 3), which removes ozone values that are 3σ away from the mean of the distribution of ozone values, can capture most obvious outliers. However, we note that there are many situations in which the distributions of ozone within the latitude-altitude bins demonstrate a clear skewness in their shape (e.g. Fig. 10). The skewness is the result of strong variations in ozone concentration with spatial coordinates like equivalent latitude or potential vorticity (PV) that do not exactly coincide with latitude (Schoeberl et al., 1993). For instance, strong variations in ozone across a polar vortex boundary that is asymmetrically situated across the pole can lead to skewness in the distribution of ozone in a latitude bin. Similarly, skewness in the ozone distribution in tropical latitudes can be created by gradients of ozone across a meandering tropical pipe (see Fig. 10). Therefore, we find that applying a simple Gaussian filter tends to preferentially remove ozone values on the broad tail of the skewed distribution, with a concomitant impact on statistical values, particularly means, for these bins. We considered performing the outlier test in equivalent latitude space, as equivalent latitude is based on PV and is a widely used diagnostic for isentropic transport in the stratosphere and upper troposphere. While we found that the skewness of the ozone distributions observed in equivalent latitude space is reduced compared to using geographical latitude space (not shown), significant skewness remained and an alternative approach was developed.
To mitigate the skewness issue in outlier detection, we employed a skewed distribution outlier test as the third and last rule for SAGE II ozone filtering. Here, we apply a method developed by Hubert and Van der Veeken (2008) to detect outliers that does not need the assumption of symmetry or rely on visual inspection. It is based on the quartile test in which outliers for a univariate (continuous, unimodal) data set, $X_n = \{x_1, x_2, \ldots, x_n\}$, are inferred outside of the bounds of $[Q_1 - 1.5 \times \mathrm{IQR},\; Q_3 + 1.5 \times \mathrm{IQR}]$, where $Q_1$ is the first quartile for the distribution $X$, $Q_3$ is the third quartile, and IQR is the interquartile range, defined as $Q_3 - Q_1$. While, for data coming from a normal distribution, the probability of data being beyond the whiskers is approximately 0.7 %, this percentage can be much higher if the data are skewed. For skewed data, Hubert and Van der Veeken (2008) employ the statistical quantity medcouple to modify the outlier bounds to better encompass the distribution of $X$. The medcouple (MC) is a robust measure of the skewness of a distribution and is bounded between −1 and 1. For a symmetric distribution, MC is zero; for left- and right-skewed data, the medcouple is negative and positive, respectively. The medcouple is defined as the median of the kernel function, $\mathrm{MC}(X_n) = \mathrm{med}\; h(x_i, x_j)$, where $h(x_i, x_j)$ is defined as follows:

$$h(x_i, x_j) = \frac{\left(x_j - \mathrm{med}(X_n)\right) - \left(\mathrm{med}(X_n) - x_i\right)}{x_j - x_i},$$

and is evaluated over all couples $(x_i, x_j)$ where $x_i$ is smaller than the median of $X_n$ and $x_j$ is larger than the median of $X_n$; $X_n$ has been sorted such that $x_1 \le x_2 \le \ldots \le x_n$. Following Hubert and Van der Veeken (2008), the outlier bounds are then modified according to

for MC ≥ 0:
$$\left[Q_1 - 1.5\, e^{-4\,\mathrm{MC}}\, \mathrm{IQR},\; Q_3 + 1.5\, e^{3\,\mathrm{MC}}\, \mathrm{IQR}\right],$$

for MC < 0:
$$\left[Q_1 - 1.5\, e^{-3\,\mathrm{MC}}\, \mathrm{IQR},\; Q_3 + 1.5\, e^{4\,\mathrm{MC}}\, \mathrm{IQR}\right].$$

In this study, the medcouple is calculated separately for each month, using all ozone values in that given month (for all years), altitude, and latitude band, using data that have previously passed rules no. 1 and 2.
MC is then used to define the outlier bounds, and ozone values that lie outside these bounds are removed from the final product. Distributions of ozone for a selected number of months, altitudes, and latitude bands are shown in Figs. 10 to 12 after the 200 % rule and the aerosol LOS optical depth rule were applied. All data that lie beyond the boundaries (black lines in Figs. 10 to 12) will be removed from the data set. This new outlier rule (skewness rule) removes only large outliers and generally removes fewer data points from the distribution than using current rule no. 11 to set the outlier bounds (compare the dashed black lines to the dashed green lines in Figs. 10 to 12). It also generally passes the eye test of where empirically determined outlier bounds would be placed (unlike current rule no. 11), as shown in Fig. 10. Overall, less than 3 % of all data points are eliminated using this rule and the median of the distribution is nearly unchanged, as all median values (red lines in Figs. 10 to 12) are similar.
Figure 10. The red line indicates the median of the distribution shown in each panel, while the dashed black lines represent the upper and lower boundaries calculated as described in the text. As a reference, panel (b) includes 3 times the standard deviation of the distribution (3σ) to represent the selection criteria similar to current rule no. 11. All data points that lie outside the dashed black lines will be removed from the data set. For more details, see the main text.
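The skewed-distribution outlier test described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to produce the data set: the medcouple is computed with a naive pairwise loop, values equal to the median are skipped (a simplification of the full definition), the exponential factors are those of the commonly used medcouple-adjusted boxplot bounds, and the quartiles follow the default convention of Python's `statistics.quantiles`, which may differ from the one used in the study. All function names are hypothetical.

```python
import math
import statistics

def medcouple(values):
    """Naive O(n^2) medcouple: median of the kernel over all couples
    (x_i, x_j) with x_i below and x_j above the median."""
    med = statistics.median(values)
    lower = [x for x in values if x < med]
    upper = [x for x in values if x > med]
    if not lower or not upper:
        return 0.0
    kernel = [((xj - med) - (med - xi)) / (xj - xi)
              for xi in lower for xj in upper]
    return statistics.median(kernel)

def skewed_outlier_bounds(values):
    """Lower and upper outlier bounds adjusted for skewness."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    mc = medcouple(values)
    if mc >= 0:
        low = q1 - 1.5 * math.exp(-4.0 * mc) * iqr
        high = q3 + 1.5 * math.exp(3.0 * mc) * iqr
    else:
        low = q1 - 1.5 * math.exp(-3.0 * mc) * iqr
        high = q3 + 1.5 * math.exp(4.0 * mc) * iqr
    return low, high

def remove_outliers(values):
    """Keep only values that lie within the skewness-adjusted bounds."""
    low, high = skewed_outlier_bounds(values)
    return [x for x in values if low <= x <= high]

# Hypothetical sample with one large outlier
kept = remove_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In practice this would be applied separately to each month, altitude, and latitude band, after the two aerosol-related rules. For large samples, a faster medcouple algorithm would be preferable to the pairwise loop.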
The effect of the three screening rules for the SAGE II ozone data is shown in Fig. 13. Ozone at 17 km between 15 and 25° N is shown in Fig. 13a, while ozone at 20.5 km between 25 and 35° S is shown in Fig. 13b. At 17 km, the majority of the removed ozone data (mainly negative or low ozone values) is removed by the 200 % uncertainty rule (about 6 %), followed by the outlier-screening rule (about 3 %). The aerosol rule plays a significant role only during the Mount Pinatubo period. At higher altitudes, the 200 % screening rule and the outlier rule become less important, and the majority of the removed ozone data is removed by the aerosol rule (Fig. 13b), mainly during the Mount Pinatubo period. Above 30 km, very few ozone data are removed throughout the data record (not shown), and the screening rules become less important with increasing altitude. Overall, the current rules eliminate up to 13 % of the ozone data in an altitude and latitude band, primarily below 23 km, while the new screening recommendations developed in this study remove no more than 5 % of the data. While at higher altitudes the total number of data points that is eliminated by the current and new rules is similar (less than 2 % depending on altitude and latitude band), they do not remove the same data points, and differences are apparent that will be discussed in more detail in the next section.
Figure 11. As in Fig. 10 but for ozone data at 18 km between 15 and 25° N for January.

Comparing data rules
A comparison between the remaining SAGE II ozone data at a number of altitudes and latitude bands after the new and current rules have been applied is shown in Fig. 14. While at 16 km applying the new and current rules to the SAGE II ozone data between 5° S and 5° N removes about the same number of data points (46 % compared to 48 %), differences are apparent, as shown in Table 1. About 14 % of the data that remain in the screened ozone data set after the current rules were applied are now removed when applying the new rules to the same data set. In this case, these additional data points that are removed are mostly the negative and low ozone values that were flagged as bad using the 200 % relative uncertainty. On the other hand, about 16 % of the data that are removed when applying the current rules are now retained when using the new rules. The retained values include ozone data during the Nyamuragira and Nevado del Ruiz (1985 and 1986, respectively) eruptions and the Mount Pinatubo eruption, data for which there is no obvious reason why they should have been removed. This suggests that the aerosol LOS optical depth rule is more appropriate than using a threshold for the aerosol extinction coefficient, as is done in the current set of screening rules. The differences between the data sets that remain after the new (Fig. 14a) and current rules (Fig. 14b) are substantial and could potentially lead to different mean values for the given altitude and latitude band. At higher altitudes (20 km), the number of data points that remain after applying the new rules is higher than after applying the current rules. While about 89 % of the data between 45 and 55° N and at 20 km altitude remain after the current rules were applied, more than 96 % of the data remain after the new rules were applied to the same data set.
The current rules remove about 7 % more data than the new rules, especially during the Mount Pinatubo period, for which the new rules retain many ozone measurements. Only about 0.1 % of the data that remained after the current rules were applied are removed when using the new data-screening recommendations (Table 1). These results suggest that the new recommendations provide a better means to eliminate biased ozone measurements while retaining important ozone measurements during the Mount Pinatubo eruption period.
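The kind of bookkeeping behind such a comparison can be sketched with two boolean masks that mark which data points each rule set retains. The function name, dictionary keys, and the choice to express every percentage relative to the total number of data points are assumptions for illustration; the percentages quoted in the text may use a different denominator.

```python
def compare_screenings(kept_new, kept_current):
    """Summarize how two screening variants differ.

    kept_new, kept_current: equal-length lists of booleans, where True
    means the data point is retained by that set of rules. All results
    are percentages of the total number of data points.
    """
    n = len(kept_new)
    if n == 0 or n != len(kept_current):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(kept_new, kept_current))

    def pct(count):
        return 100.0 * count / n

    return {
        "removed_by_new": pct(sum(not a for a, _ in pairs)),
        "removed_by_current": pct(sum(not b for _, b in pairs)),
        "removed_only_by_new": pct(sum((not a) and b for a, b in pairs)),
        "removed_only_by_current": pct(sum(a and (not b) for a, b in pairs)),
    }

# Hypothetical masks for four data points
stats = compare_screenings([True, True, False, False],
                           [True, False, True, False])
```

Two rule sets can remove a similar total fraction of the data while still differing in which points they remove, which is why the "only" categories are reported separately.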
The retention of data during the Mount Pinatubo period is even more pronounced at 20.5 km between 65 and 75° N (Fig. 14). The new rules only remove about 1 % of the overall data (mainly outliers, i.e. through the skewness rule) in that latitude band, while the current rules remove more than 11 % of the ozone measurements. This again indicates that the current rules are too restrictive when it comes to deciding what a "good" ozone measurement is, leading to unexpected results and removing valuable data during volcanic events such as the Nyamuragira, Nevado del Ruiz, and Mount Pinatubo eruptions.
At higher altitudes, above 35 km, both sets of rules retain more than 98.5 % of the ozone data, and above 40 km more than 99 % of the data remain after any screening rule is applied. For example, at 35.5 km between 35 and 45° S, both sets of rules retain about 99 % of the data. The new outlier-screening rule removes a total of 0.81 % of the data, while the current set of rules removes 0.65 % of the data. In both cases, only the outlier-screening rule removes any ozone data, and at higher latitudes, the new outlier-screening rule removes a few more data points. As expected, these results indicate that the measurements at the lower altitudes and in the mid-stratosphere are the most affected by aerosol and clouds. Despite the low number of ozone data that are removed above 35 km, we recommend applying the new set of screening recommendations at all levels, as outliers remain part of the data set and need to be removed before any analysis of the data is performed.

Data availability
The SAGE II ozone data set used in this paper, including calculated line-of-sight optical depths, is publicly available in NetCDF format from Zenodo at https://doi.org/10.5281/zenodo.3710518 (Kremser et al., 2020) and is distributed under the Creative Commons Attribution 4.0 International Public License. The complete SAGE II data set version 7.00 (binary format) is available free of charge from the NASA Atmospheric Science Data Center (ASDC) at https://doi.org/10.5067/ERBS/SAGEII/SOLAR_BINARY_L2-V7.0 (SAGE II Science Team, 2012).

Conclusions
In this study, we developed a set of only three SAGE II ozone data-screening rules, compared to the up to 11 rules used in previous similar efforts, including the screening used to produce homogenized ozone data sets (Davis et al., 2016; Hassler et al., 2018). The new rules are simple, and everything required to apply them is provided with the SAGE II data available from the NASA data centre (ASDC). The new recommendations take into account how the measurements were made and are as follows.
- Remove all ozone data when the corresponding aerosol LOS optical depth exceeds a value of 3.
- Remove all ozone data with an uncertainty of exactly 200 %.
- Remove all data that fall outside boundaries calculated using the skewness test of a distribution.
In general, the new rules remove fewer data from the overall SAGE II ozone data set. They are also more robust and appropriate than the previous rules, particularly in the handling of non-aerosol-related outliers (current rule no. 11), as they are not restricted to a normal distribution of the ozone values in a given latitude band.
The new SAGE II screening rules are still empirical rules based on our best scientific judgement and on how the measurements were made. It is possible that applying these new rules still discards "good" data, particularly in analyses based on ozone anomalies, and retains ozone values that appear valid when they are "bad". It is incumbent on users of this approach to apply the rules in a way that is consistent with their own requirements, potentially easing or tightening the new data usage rules, perhaps in the more extreme periods as exemplified by the Mount Pinatubo eruption.
SAGE II ozone data are used as the reference ("gold standard") to which other satellite data (such as AURA MLS,