Articles | Volume 13, issue 12
Data description paper
10 Dec 2021

ML-TOMCAT: machine-learning-based satellite-corrected global stratospheric ozone profile data set from a chemical transport model

Sandip S. Dhomse, Carlo Arosio, Wuhu Feng, Alexei Rozanov, Mark Weber, and Martyn P. Chipperfield

High-quality stratospheric ozone profile data sets are a key requirement for accurate quantification and attribution of long-term ozone changes. Satellite instruments provide stratospheric ozone profile measurements over typical mission durations of 5–15 years. Various methodologies have then been applied to merge and homogenise the different satellite data in order to create long-term observation-based ozone profile data sets with minimal data gaps. However, individual satellite instruments use different measurement methods, sampling patterns and retrieval algorithms which complicate the merging of these different data sets. In contrast, atmospheric chemical models can produce chemically consistent long-term ozone simulations based on specified changes in external forcings, but they are subject to the deficiencies associated with incomplete understanding of complex atmospheric processes and uncertain photochemical parameters.

Here, we use chemically self-consistent output from the TOMCAT 3-D chemical transport model (CTM) and a random-forest (RF) ensemble learning method to create a merged 42-year (1979–2020) stratospheric ozone profile data set (ML-TOMCAT V1.0). The underlying CTM simulation was forced by meteorological reanalyses, specified trends in long-lived source gases, solar flux and aerosol variations. The RF is trained using the Stratospheric Water and OzOne Satellite Homogenized (SWOOSH) data set over the time periods of the Microwave Limb Sounder (MLS) from the Upper Atmosphere Research Satellite (UARS) (1991–1998) and Aura (2005–2016) missions. We find that ML-TOMCAT shows excellent agreement with available independent satellite-based data sets which use pressure as a vertical coordinate (e.g. GOZCARDS, SWOOSH for non-MLS periods) but weaker agreement with the data sets which are altitude-based (e.g. SAGE-CCI-OMPS, SCIAMACHY-OMPS). We find that at almost all stratospheric levels ML-TOMCAT ozone concentrations are well within uncertainties of the observational data sets. The ML-TOMCAT (V1.0) data set is ideally suited for the evaluation of chemical model ozone profiles from the tropopause to 0.1 hPa and is freely available via (Dhomse et al.2021).

1 Introduction

With the successful implementation of the Montreal Protocol, various observations confirm reductions in the concentrations of halogenated ozone-depleting substances (ODSs) in the atmosphere (WMO, 2014, 2018). Satellite data records also confirm a peak in upper-stratospheric HCl (the main chlorine reservoir) around 1997, followed by a steady decline (Anderson et al., 2000; Froidevaux et al., 2006a; Hossaini et al., 2019). Hence, attention has turned towards the detection and attribution of ozone recovery (e.g. Dhomse et al., 2006; Solomon et al., 2016; Chipperfield et al., 2017; Steinbrecht et al., 2017; Dhomse et al., 2018; Szeląg et al., 2020). However, the accurate quantification of ozone changes is challenging because of the limited quality of long-term ozone profile data sets, where measurement errors are of similar or larger magnitude than the long-term ozone trends. In addition, complex coupling between various physical and chemical processes controlling stratospheric ozone concentrations causes large short-term ozone changes. Complications also arise because there are some non-linear changes in stratospheric dynamics as well as in chemical constituents. For example, between 2018 and 2021, some of the largest and smallest ozone losses of recent decades were recorded in both the Arctic and Antarctic polar stratospheres (e.g. Wargan et al., 2020; Wohltmann et al., 2020; Bognar et al., 2021; Weber et al., 2021). Some observational data suggest that there has been a continuous decline in lower-stratospheric ozone (Ball et al., 2018, 2020), which could be attributed to changes in stratospheric dynamics (e.g. Chipperfield et al., 2018; Wargan et al., 2018; Orbe et al., 2020; Abalos and de la Cámara, 2020). Atmospheric concentrations of ODSs such as CFC-11 are decreasing at uneven rates (Montzka et al., 2018, 2021), which could induce variability in ozone trends. Additionally, significant positive trends have been detected in very short-lived substances (VSLSs) containing chlorine and bromine that are not controlled by the Montreal Protocol (e.g. Hossaini et al., 2015, 2019).

As there are no long-term ozone profile data from a single satellite instrument, various attempts have been made to merge such data from different instruments. However, individual satellite instruments have different temporal and spatial resolution depending on the measurement techniques and retrieval algorithms (e.g. Sofieva et al., 2014; Damadeo et al., 2018). For example, solar occultation instruments (e.g. Stratospheric Aerosol and Gas Experiment (SAGE, McCormick et al., 1989), Halogen Occultation Experiment (HALOE, Russell et al., 1993)) provide high-quality measurements but are constrained by limited spatial coverage. Limb-scanning instruments such as the Microwave Limb Sounder (MLS, Froidevaux et al., 2006b), Scanning Imaging Absorption Spectrometer for Atmospheric Cartography (SCIAMACHY, Bovensmann et al., 1999), and Optical Spectrograph and InfraRed Imager System (OSIRIS, Murtagh et al., 2002) provide better spatial coverage but have coarser vertical resolution. A key constraining factor is that only a few satellite data sets cover enough overlapping years to remove inter-instrument biases with minimal uncertainty.

Hence, Randel and Wu (2007) adopted a novel approach to create a gap-free stratospheric ozone profile data for the 1979–2005 time period. They used SAGE (I and II) satellite profile measurements and polar ozonesondes, together with a seasonally varying ozone climatology from Paul et al. (1998) to fill the gaps, to generate multi-variate regression-based gap-free ozone anomalies. Later, Cionni et al. (2011) used a similar methodology along with climate model simulations to extend the time series backwards to 1850. The Cionni et al. (2011) data were recommended for the historical CMIP5 simulations for the climate models that did not include stratospheric chemistry, in order to enforce time-dependent ozone variations. Hassler et al. (2008) used a different methodology to create a satellite-based long-term ozone profile data set. Along with SAGE I and II measurements, they used HALOE and POAM (Polar Ozone and Aerosol Measurement) II and III satellite measurements, as well as ozonesonde data from 130 stations, to create a collection of binary data files – also known as the Binary DataBase of Profiles (BDBP) version 1.0. Bodeker et al. (2013) updated the BDBP data set to construct “Bodeker Scientific” or “BS” data. They updated BDBP data by including measurements from the Limb Infrared Monitor of the Stratosphere (LIMS), the Improved Limb Atmospheric Spectrometer (ILAS) and ILAS II. They used a multivariate regression model to create different versions of the ozone profile data set ranging from the surface to 70 km for the 1979–2008 time period. Hassler et al. (2018a) revised and extended (1979–2016) the BS data set by using the TOMCAT chemical transport model (CTM) ozone profiles as a transfer function to capture ozone variability for the period without satellite observations.

Another widely used merged data set is the Global OZone Chemistry And Related trace gas Data records for the Stratosphere (GOZCARDS, Froidevaux et al.2015). These are monthly mean zonally averaged time series constructed using ozone profile measurements from several NASA satellite instruments and the Atmospheric Chemistry Experiment Fourier Transform Spectrometer (ACE-FTS, Bernath et al.2005). Merging is done primarily by removing average biases between SAGE II and individual data records for overlap periods (Froidevaux et al.2015). The GOZCARDS data files contain mixing ratios on a pressure–latitude grid (316 to 0.1 hPa), updated later to GOZCARDS v2.2 (Froidevaux et al.2019).

Davis et al. (2016) adopted a slightly different approach to construct the Stratospheric Water and OzOne Satellite Homogenized (SWOOSH) data set. SWOOSH merges stratospheric ozone profile data from solar occultation instruments (SAGE-II/III, HALOE, ACE-FTS) as well as limb-scanning instruments (UARS-MLS and Aura-MLS). The measurements are homogenised by applying corrections that are calculated from data taken during time periods of instrument overlap. The primary SWOOSH data product consists of monthly mean zonal-mean values on a pressure grid at 2.5°, 5° and 10° resolution. One of the major characteristics of SWOOSH data is that, when merging, greater weight is given to the instruments that sample more frequently (e.g. Aura-MLS). Filled and unfilled versions of the data set exist on both geographical and equivalent latitude coordinates.

Several additional attempts have been made to merge satellite time series from limb and occultation instruments. For example, the SAGE-CCI-OMPS data set, described in Sofieva et al. (2017), includes SAGE II time series and several limb data sets. The OMPS-LP data set used is produced at the University of Saskatchewan, Saskatoon (Zawada et al., 2018). First, they screened and homogenised CCI data sets in the HARMOZ format before merging them in terms of ozone anomalies. Recently, Arosio et al. (2019) created a merged SCIAMACHY-OMPS limb data set (SCIA-OMPS), which combines these two time series produced at the University of Bremen. They used MLS data series as a transfer function to merge SCIAMACHY with OMPS-LP as these instruments share only 2 months of overlap, but MLS was not included in the merged data set. This time series is monthly averaged, covers the period 2002–present and is longitudinally resolved, with a 5° latitude × 20° longitude grid. Due to the similarities in the measurement geometries and techniques, as well as in the retrieval approaches, they implemented a plain de-biasing approach for the merging, directly obtaining a long-term ozone time series in appropriate units.

Another widely accepted approach is using data assimilation techniques to create observation-based data (e.g. Feng et al.2008; Skachko et al.2014; Errera et al.2019; Wargan et al.2020). However, only a few instruments such as MLS provide relatively long-term ozone profile measurements. For the pre-MLS time period, very few observations are available that can provide good constraint on the assimilation system. Also, the forward model is generally forced with available (re)analysis dynamical fields so reanalysis data sets are also prone to the inhomogeneities in the forcing fields along with any discrepancies in chemical scheme.

In this paper we present a new data–model method for producing a long-term data set of stratospheric ozone. We use ozone profile output from a CTM to create a machine-learning-based satellite-corrected long-term chemically (and dynamically) consistent ozone profile data set (hereafter, ML-TOMCAT) for the 1979–2020 time period. The CTM setup is described in Sect. 2, followed by our methodology in Sect. 3. Comparisons of ML-TOMCAT with some of the other available merged ozone profile data sets are presented in Sect. 4, with a summary of our key results in Sect. 5.

2 Model setup

We use chemically consistent monthly mean zonal mean ozone profiles from the TOMCAT CTM as the basis data set. TOMCAT is an offline three-dimensional (3D) CTM that includes a comprehensive stratospheric chemistry scheme (Chipperfield, 2006). For the present study, the CTM setup is similar to the control simulations used in our recent studies such as Feng et al. (2021), Bognar et al. (2021) and Weber et al. (2021). Briefly, TOMCAT is forced with meteorological fields from ERA-5 reanalyses (Hersbach et al., 2020), starting from 1979. Simulations are performed at a 2.8° × 2.8° horizontal resolution with 32 hybrid sigma-pressure levels extending from the surface to about 60 km. For major ODSs and greenhouse gases (GHGs) the model uses time-dependent observed global mean surface mixing ratios (Carpenter et al., 2018) that are treated as well mixed throughout the troposphere. The model also includes the effects of solar flux variability and heterogeneous chemistry on volcanically enhanced stratospheric aerosol as described in Dhomse et al. (2015, 2016). Solar irradiance data are from the NRL2 (Coddington et al., 2016) empirical model and the sulfate aerosol surface area density (SAD) from Luo (2016). TOMCAT also includes chlorine and bromine contributions from VSLSs as described in Hossaini et al. (2019). A passive ozone tracer (no chemical ozone loss), generally used to diagnose chemical ozone loss, is initialised every 6 months from the chemical ozone tracer (1 June and 1 December). TOMCAT has been regularly used to study long-term changes in stratospheric trace gases, showing good agreement with various ground-based and satellite data sets (e.g. Mahieu et al., 2014; Chipperfield et al., 2015; Wales et al., 2018; Harrison et al., 2021; Prignon et al., 2021).

3 Methodology

We use random-forest (RF) regression analysis to generate a long-term chemically consistent data set. The RF is a supervised machine-learning (ML) algorithm that uses an ensemble of decision trees (e.g. Breiman, 2001; Svetnik et al., 2003). A decision tree can be thought of as a flow chart (a tree-shaped schematic) of the kind used in computer programming, generally used to show a statistical probability or path of action. A single decision tree in a RF can be considered a random tree in a forest of decision trees. Each decision tree consists of three components: decision nodes, leaf nodes and a root node. The root node and decision nodes of the decision tree represent the explanatory variables. The leaf nodes represent the final output. The explanatory variables used in our analysis are explained at the end of this section.

A decision tree algorithm divides the data set into branches (using true and false criteria), which further segregate into other branches until a leaf node (or result node) is reached. Multiple trees are constructed by randomly sampling data points multiple times (e.g. the bootstrap method). Hence, each individual tree is unique and produces its own output. RF uses a bagging technique, which means the RF model consists of many individual decision trees, and their aggregated predictions are used for the final prediction. A distinct advantage of RF regression is that it is relatively accurate and very easy to set up. RF can also behave like a non-linear regression method. As RF adds randomness to the decision procedure, instead of relying on the most important explanatory variables, it searches for the best variable among random subsets. This ensures that the final output does not rely heavily on a single explanatory variable, thereby avoiding overfitting (e.g. Kotsiantis, 2013). We use RF regression from the Python package scikit-learn (Pedregosa et al., 2011) with two options: random_state=0 and bootstrap=True.
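As a rough illustration of the bagging idea described above, the sketch below fits a scikit-learn `RandomForestRegressor` with the two options named in the text (`random_state=0`, `bootstrap=True`) and checks that the forest prediction is the mean of the individual tree predictions. The synthetic predictors, the target function and the choice `n_estimators=50` are assumptions made purely for illustration; they are not the ML-TOMCAT configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (hypothetical): 240 samples, 5 predictors.
rng = np.random.default_rng(0)
X = rng.random((240, 5))
# A non-linear target, to show that RF is not restricted to linear relations.
y = np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.05 * rng.standard_normal(240)

# The two options named in the text; n_estimators is an illustrative choice.
rf = RandomForestRegressor(n_estimators=50, random_state=0, bootstrap=True)
rf.fit(X, y)

# Bagging: each tree is trained on a bootstrap resample of the data, and the
# forest prediction is the average over all individual tree predictions.
tree_predictions = np.stack([tree.predict(X) for tree in rf.estimators_])
forest_prediction = rf.predict(X)
bagging_holds = np.allclose(forest_prediction, tree_predictions.mean(axis=0))
```

Fixing `random_state` makes the bootstrap resampling reproducible, so repeated runs give the same forest.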

Initially, TOMCAT zonal mean ozone profiles are linearly interpolated in log-pressure space on to 43 equidistant (12 per decade) pressure levels (1000–0.1 hPa, MLS pressure levels), followed by spatial interpolation onto 72 SWOOSH latitude bins at 2.5° resolution. SWOOSH data are obtained via (last access: 15 March 2021). Then, we calculate the ozone difference (dO3) between SWOOSH and model ozone profiles for the 1991–1998 and 2005–2016 time periods (total 20 years). For the calculation of dO3 values, we use the gap-filled SWOOSH data product. SWOOSH data range from 316 to 1 hPa (31 pressure levels); hence, for pressure levels below 316 hPa the ML-TOMCAT values are set to latitudinally and monthly varying climatological values from Logan (1999), which are also used in stratospheric TOMCAT simulations.
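The vertical regridding step can be sketched as follows. This is a minimal example, assuming NumPy and a hypothetical coarse model profile; the helper names `pressure_grid` and `interp_log_pressure` are illustrative and not part of any TOMCAT or SWOOSH tooling. It builds a grid that is equidistant in log-pressure with 12 levels per decade and interpolates linearly in log-pressure. For the 316–1 hPa range quoted for SWOOSH, this spacing gives the 31 levels mentioned in the text.

```python
import numpy as np

def pressure_grid(log10_p_bottom, log10_p_top, levels_per_decade=12):
    """Pressure levels (hPa), equally spaced in log10(p), from bottom to top."""
    n_levels = int(round((log10_p_bottom - log10_p_top) * levels_per_decade)) + 1
    return np.logspace(log10_p_bottom, log10_p_top, n_levels)

def interp_log_pressure(p_source, profile, p_target):
    """Linearly interpolate a profile in log-pressure space."""
    # np.interp requires increasing x values, so sort by pressure first.
    order = np.argsort(p_source)
    return np.interp(np.log(p_target), np.log(p_source[order]), profile[order])

# 316 hPa (10**2.5) to 1 hPa (10**0) at 12 levels per decade.
swoosh_levels = pressure_grid(2.5, 0.0)

# Hypothetical model profile (ppm) on coarse pressure levels (hPa).
p_model = np.array([500.0, 200.0, 100.0, 50.0, 20.0, 10.0, 5.0, 2.0, 1.0, 0.5])
o3_model = np.array([0.05, 0.3, 0.8, 2.5, 6.0, 8.5, 8.0, 5.5, 3.5, 2.0])
o3_on_swoosh = interp_log_pressure(p_model, o3_model, swoosh_levels)
```

Interpolating in log-pressure rather than pressure keeps the weighting roughly linear in altitude, which is why it is the usual choice for stratospheric profiles.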

For the regression analysis, a 20-year (largely MLS covering) time period is selected in order to avoid heteroscedasticity (i.e. the effect of different sampling frequencies/methodologies between different types of satellite data sets; e.g. Sofieva et al., 2014; Millán et al., 2016), as SWOOSH relies heavily on MLS (UARS and Aura) data records. It also covers periods when the stratospheric chlorine loading was increasing (1991–1998) and decreasing (2005–2016), so the RF has enough samples to include different characteristics of ozone variability. The regression model has five terms: passive ozone (PO3), HCl mixing ratio (HCl) and methane mixing ratio (CH4), as well as the observation–model total column difference (dTCO) and a Mg II solar flux term (MgII). The PO3, HCl and CH4 terms account for possible biases in CTM profiles due to transport in different stratospheric regions (e.g. Strahan et al., 2011; Feng et al., 2021). dTCO is an ideal learner for the lower-stratospheric ozone transport as total column ozone measurements have much smaller retrieval errors (e.g. Petropavlovskikh et al., 2019); hence, they provide a good constraint for the possible biases in ERA-5 stratospheric transport (e.g. Ploeger et al., 2021). TOMCAT has 203 spectral bins in the photolysis scheme (e.g. Dhomse et al., 2016). Therefore, the MgII solar flux term is included to account for possible biases in the representation of the 11-year solar flux variability (e.g. Haigh et al., 2010; Dhomse et al., 2013) or the use of coarse spectral bins (e.g. Sukhodolov et al., 2016).

Overall, there are five explanatory variables in the regression model for individual grid points, and these are taken from TOMCAT output fields. The regression model can be represented as

$$
\mathrm{dO_3} = \beta_1\,\mathrm{PO_3} + \beta_2\,\mathrm{HCl} + \beta_3\,\mathrm{CH_4} + \beta_4\,\mathrm{dTCO} + \beta_5\,\mathrm{MgII} + \mathrm{residuals}, \tag{1}
$$

where β1, β2, β3, β4 and β5 can be considered to be the contribution coefficient for a given explanatory variable, and PO3, HCl and CH4 are TOMCAT monthly mean zonal mean tracers. For the calculation of dTCO we use Copernicus Climate Change Service (C3S) total ozone data (1979–2018). The C3S total column product is a combination of total column data from 15 sensors using gap-filling assimilation methods and is obtained via!/data set/satellite-ozone?tab=overview (last access: 1 May 2021). For the years 2019 and 2020, we use level 3 total column data from the Ozone Monitoring Instrument (OMI) V3 that is obtained via (last access: 1 May 2021). The Mg II index is obtained from (last access: 1 December 2021). We assume long-term chemical ozone changes are realistically represented by TOMCAT chemistry (e.g. Feng et al.2007; Chipperfield et al.2017; Dhomse et al.2019); hence, all the predictor time series are detrended and normalised between 0 and 1.
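The predictor pre-processing mentioned above (detrending followed by normalisation between 0 and 1) can be sketched as below. The text does not say how the detrending is done, so a least-squares linear detrend is assumed here; the function name `detrend_and_normalise` and the synthetic series are hypothetical.

```python
import numpy as np

def detrend_and_normalise(series):
    """Remove a least-squares linear trend, then rescale to the range [0, 1]."""
    t = np.arange(series.size, dtype=float)
    slope, intercept = np.polyfit(t, series, 1)
    detrended = series - (slope * t + intercept)
    span = detrended.max() - detrended.min()
    return (detrended - detrended.min()) / span

# Hypothetical monthly predictor: linear trend plus annual cycle, 42 years.
months = np.arange(42 * 12)
raw_predictor = 0.002 * months + 0.3 * np.sin(2.0 * np.pi * months / 12.0)
predictor = detrend_and_normalise(raw_predictor)
```

With the trend removed from every predictor, the RF can only learn variability around the trend, which is consistent with the stated assumption that the long-term chemical ozone changes are left to the TOMCAT chemistry.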

4 Results

Atmospheric chemical models are ideal tools for understanding/simulating past (and future) ozone changes, as they combine up-to-date knowledge about various physical and chemical processes using a mathematically consistent framework. Different models use different combinations of chemical and dynamical schemes to represent important processes in the atmosphere. However, some of these processes are computationally expensive; hence, they are represented by somewhat simplified parameterisations. For example, many chemical models prescribe observation-based sulfate surface area density (SAD) to represent the effects of volcanically enhanced stratospheric aerosol for simulating heterogeneous chemistry which leads to ozone loss (Dhomse et al.2015). Many models also prescribe surface concentrations of GHGs and ODSs rather than emission fluxes. CTMs such as TOMCAT use dynamical forcing fields from (re)analyses data sets such as ERA-Interim or ERA-5. Hence, CTMs are subject to possible inhomogeneities due to changes in the number of assimilated observations, as well as other deficiencies (e.g. missing processes) in the forward model used in the assimilation system. On the other hand, observational data sets are also subject to errors associated with the measurement techniques, instrument degradation and retrieval algorithms. Hence, almost all chemical models may be expected to show a bias against observational data records, either because of model deficiencies or errors in the observations. However, chemical models do use a consistent chemical scheme, so we can assume that chemical model–observation ozone differences are largely due to uncertainties in the forcing fields such as meteorology (e.g. winds, temperature) and chemical parameterisations (e.g. reaction rates, solar fluxes, photolysis schemes). 
CTMs have the distinct advantage in terms of dynamics as they are forced with up-to-date reanalysis data, although with the above-noted caveat of possible inhomogeneities in observations used in the assimilation systems. Hence, the recently initiated SPARC Reanalysis Intercomparison Project (S-RIP) is aimed at providing guidance on future reanalysis activity. S-RIP also plans to perform comprehensive evaluation and intercomparison of different reanalysis data products; for details see (last access: 1 June 2021). Here, we train the ML algorithm on the model–observation differences for the period that has relatively good temporal sampling. Estimated parameters are then used to simulate differences for the entire (1979–2020) time period. In this section, we analyse model–observation biases associated with individual predictors and compare the ML-corrected data against a variety of observation-based data sets.

4.1 Model biases

Figure 1 shows climatological (2006–2020) monthly zonal mean differences between TOMCAT and SWOOSH ozone profiles (TOMCAT minus SWOOSH). TOMCAT profiles show almost symmetrically structured negative biases in the upper stratosphere and positive biases in the lower stratosphere. The largest negative biases (up to 0.8 ppm) occur in the tropical upper stratosphere (around 3 hPa), and they remain negative throughout the year. The ozone lifetime at these altitudes is less than a day; hence, the observed biases might be associated with deficiencies in the photochemical reactions in the model. At this altitude, ozone production is largely controlled by solar fluxes below 240 nm, while longer wavelengths control ozone destruction (e.g. Haigh et al., 2010). Therefore, negative ozone biases in the upper stratosphere are most probably due to uncertainties in the solar irradiances and/or photolysis cross sections that control ozone production (e.g. Brasseur and Solomon, 2006). Furthermore, in this region of the atmosphere, ozone chemistry is mostly temperature dependent (e.g. Stolarski et al., 2010; Dhomse et al., 2013, 2016); hence, the model ozone biases could be due to uncertainties in temperature-dependent reaction rates (e.g. Ghosh et al., 1997).

Figure 1. Latitude–pressure cross sections of the climatological (2006–2020) monthly mean difference (ppm) between TOMCAT and SWOOSH (Davis et al., 2016) ozone profiles.
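A climatological monthly mean difference of the kind shown in Fig. 1 can be computed by averaging each calendar month over all years. The sketch below assumes a complete monthly series starting in January; the helper `monthly_climatology` and the synthetic difference values are hypothetical and only illustrate the averaging.

```python
import numpy as np

def monthly_climatology(series, n_years):
    """Average a monthly series (time on axis 0) over years for each calendar month."""
    return series.reshape(n_years, 12, *series.shape[1:]).mean(axis=0)

n_years = 15  # e.g. 2006-2020
# Hypothetical TOMCAT-minus-SWOOSH differences (ppm) at one grid point:
# a fixed annual cycle plus year-to-year offsets that average to zero.
annual_cycle = np.linspace(-0.8, 0.3, 12)
yearly_offset = np.linspace(-0.07, 0.07, n_years)
tomcat_minus_swoosh = (annual_cycle[None, :] + yearly_offset[:, None]).ravel()

climatology = monthly_climatology(tomcat_minus_swoosh, n_years)
```

The same function works for arrays with additional latitude and pressure dimensions, as long as time is the first axis.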


In the lower stratosphere the ozone lifetime ranges from months to years; so, positive biases in the TOMCAT ozone could be due to a combination of both dynamics and chemistry. First, reduced overhead ozone could increase lower-stratospheric ozone via the self-healing effect; i.e. increased ultraviolet radiation increases ozone production at lower altitudes (e.g. Haigh1994). Second, ozone is primarily produced in the tropical stratosphere, and its downward transport is controlled by the quasi-biennial oscillation (QBO) (e.g. Tian et al.2006), whereas transport towards middle and high latitudes is determined by the strength of the Brewer–Dobson (BD) circulation (e.g. Holton et al.1995; Weber et al.2003; Dhomse et al.2006; Weber et al.2011), which increases its lifetime considerably. Hence, ozone biases in the lower stratosphere are likely due to the incomplete representation of various circulation pathways in TOMCAT either due to model resolution or missing representation of key physical process in the ERA-5 reanalysis scheme (e.g. Mitchell et al.2020), which impacts the meteorology used in the CTM.

4.2 Contribution from explanatory variables

As the exact causes of TOMCAT ozone biases are still not well understood, we use the RF model to remove them. The RF regression model coefficients are derived using 20 years (1991–1998, 2005–2016) for which SWOOSH data include a large number of observational profiles, especially from MLS on the UARS and Aura satellite platforms. The RF regression model uses 20 years of monthly data, with 80 % and 20 % of data points being used for training and testing, respectively. The estimated RF regression coefficients are then used to calculate model biases for the entire 42-year time period (1979–2020). RF-calculated ozone biases are then added to the TOMCAT time series to create the long-term gap-free data set, hereafter labelled ML-TOMCAT.
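The train, test and apply sequence described above can be sketched for a single grid point as follows. All data here are synthetic, the training months are taken as one contiguous block for simplicity (in the text they are two separate periods), and dO3 is assumed to be SWOOSH minus TOMCAT so that adding the predicted difference corrects the model. The use of `train_test_split` for the 80 %/20 % split is also an assumption; the text does not say how the split was made.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_months_total = 42 * 12   # 1979-2020
n_months_train = 20 * 12   # months for which a SWOOSH-based dO3 is used

# Hypothetical normalised predictors (PO3, HCl, CH4, dTCO, MgII) for all months.
X_all = rng.random((n_months_total, 5))
tomcat_o3 = 5.0 + 0.5 * X_all[:, 0]                      # ppm, synthetic
true_bias = 0.4 * X_all[:, 1] - 0.2 * X_all[:, 3] ** 2   # synthetic "truth"

# Synthetic dO3 target, available only for the training months.
train_idx = np.arange(n_months_train)
d_o3 = true_bias[train_idx] + 0.01 * rng.standard_normal(n_months_train)

# 80 % of the data points for training, 20 % for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X_all[train_idx], d_o3, test_size=0.2, random_state=0
)
rf = RandomForestRegressor(random_state=0, bootstrap=True)
rf.fit(X_train, y_train)
test_r2 = rf.score(X_test, y_test)  # R2 on the held-out 20 %

# Predict the bias for the full period and add it to the model ozone.
ml_tomcat_o3 = tomcat_o3 + rf.predict(X_all)
```

In a full application this would be repeated independently for every latitude and pressure grid point.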

Figure 2 shows how much variance (or R²) of the data the RF regression model is able to explain, along with regression coefficients for individual explanatory variables. For example, an R² value of 0.8 indicates that the RF regression model is able to explain 80 % of the biases in TOMCAT ozone relative to SWOOSH data for the 20 years of the training period. R² also represents the sum of the regression coefficients from individual explanatory variables. Overall, the RF regression model performance is consistently high (R² > 0.8) throughout the stratosphere, except for the mid-stratosphere, which is a transition region where the TOMCAT ozone biases change from positive to negative. At high northern latitudes, mid-stratospheric R² values decrease to 0.6. However, since TOMCAT–SWOOSH differences are much smaller here, a RF-based correction has a minimal impact on the quality of ML-TOMCAT ozone profiles.
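For reference, the coefficient of determination is conventionally defined as below, where $y_i$ are the target values (here the model–observation differences), $\hat{y}_i$ the RF predictions and $\bar{y}$ the mean of the targets. This is the standard definition, and the one used by the `score` method of scikit-learn regressors; the text does not state explicitly how R² was computed, so its use here is an assumption.

```latex
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
```

With this definition, a value of 0.8 means that the residual sum of squares is 20 % of the total sum of squares of the differences.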

Figure 2. Latitude–pressure cross sections of the variance (R²) and regression coefficients from passive ozone, HCl, CH4, solar and total column ozone anomaly (see main text Eq. 1).


Additionally, as expected, the RF regression coefficients are significant in different regions of the stratosphere for various explanatory variables. The passive ozone tracer seems to show the largest coefficients in the tropical mid-stratosphere, as well as varying contributions in different regions of the stratosphere. The passive ozone contribution in the tropical mid-stratosphere could be linked to the incomplete representation of NOx-related chemical changes in TOMCAT and/or seasonal changes in the stratospheric transport in the reanalysis (e.g. Galytska et al., 2019). The HCl tracer shows significant coefficients in the upper stratosphere, where the ClO ozone loss cycle is important. It also shows a significant contribution in the low- and mid-latitude lower stratosphere. HCl can be considered both a dynamical and a chemical proxy, as in the upper stratosphere HCl is primarily produced via degradation of ozone-depleting substances and is transported downwards at high latitudes via the BD circulation (e.g. Mahieu et al., 2014). Therefore, HCl variations in this region can be considered a proxy for the changes in the strength of the BD circulation as well as horizontal isentropic transport, especially between the tropics and mid-latitudes. The CH4 tracer term seems to show significant coefficients in the lowermost stratosphere (just above the tropopause) as well as a significant contribution around the mid-latitude sub-tropics. The CH4 tracer contribution resembles a QBO-induced secondary circulation pattern. Interestingly, the solar term shows the largest coefficients in the mid-latitude upper stratosphere rather than in the tropical upper stratosphere, suggesting solar flux variability has only a minor contribution to the TOMCAT–observation biases. As expected, the dTCO term shows the largest contribution in the lowermost stratosphere, especially in the tropical and polar regions. Interestingly, ozone anomalies in these regions show good agreement with various satellite-based data sets (e.g. Chipperfield et al., 2017, 2018; Li et al., 2020; Feng et al., 2021), and TOMCAT biases are much smaller. This means that although dTCO coefficients are largest in the lowermost stratosphere, the overall bias correction contribution remains relatively small.

4.3 Comparison against merged data sets

After analysing the regression coefficients, we now present a comparison between ML-TOMCAT and available satellite-based long-term data sets. Due to key differences between satellite measurement techniques, ozone profiles are retrieved either at altitude or pressure levels and either in units of mixing ratio or number density. For example, MLS retrieves profiles of ozone mixing ratio on pressure levels, whereas SAGE retrieves profiles of number density on altitude levels. Hence, merging these different data sets needs pressure, temperature or altitude information at a given co-location from an external source such as reanalysis data. The GOZCARDS and SWOOSH data sets use MERRA2 reanalysis data to convert SAGE II ozone number density profiles onto fixed pressure levels (Damadeo et al., 2013). ML-TOMCAT is based on modelled ozone profiles as a function of pressure, although conversion to altitude (geopotential height) coordinates is straightforward. In particular, ML-TOMCAT data were processed on corresponding grids/units using ERA-5 geopotential height, temperature and pressure fields that are used to drive TOMCAT.
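The unit conversion mentioned above follows from the ideal gas law: the air number density is p/(k_B T), and the ozone number density is the volume mixing ratio times that value. The sketch below shows only this unit conversion; the function names and the example values (8 ppm at 10 hPa and 230 K) are illustrative assumptions, and the regridding between pressure and altitude coordinates is not covered.

```python
BOLTZMANN = 1.380649e-23  # Boltzmann constant, J K^-1

def vmr_to_number_density(vmr, pressure_hpa, temperature_k):
    """Ozone number density (molecules cm^-3) from volume mixing ratio (mol/mol)."""
    air_density_m3 = pressure_hpa * 100.0 / (BOLTZMANN * temperature_k)
    return vmr * air_density_m3 * 1.0e-6  # m^-3 -> cm^-3

def number_density_to_vmr(n_o3_cm3, pressure_hpa, temperature_k):
    """Volume mixing ratio (mol/mol) from ozone number density (molecules cm^-3)."""
    air_density_m3 = pressure_hpa * 100.0 / (BOLTZMANN * temperature_k)
    return n_o3_cm3 * 1.0e6 / air_density_m3  # cm^-3 -> m^-3

# Illustrative mid-stratospheric values: 8 ppm at 10 hPa and 230 K.
n_o3 = vmr_to_number_density(8.0e-6, 10.0, 230.0)
vmr_back = number_density_to_vmr(n_o3, 10.0, 230.0)
```

Because both pressure and temperature enter the conversion, any bias in the reanalysis temperature used for it propagates directly into the converted ozone values.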

This subsection consists of two parts. First we compare ML-TOMCAT profiles with data sets using pressure co-ordinate systems (e.g. SWOOSH, GOZCARDS), followed by comparisons with altitude-based data sets (SAGE-CCI-OMPS, SCIA-OMPS, BSVert).

4.3.1 Comparison with pressure level data

As noted earlier, we used only 20 years of SWOOSH data to train the RF model. Hence, the next obvious step is to compare ML-TOMCAT ozone with SWOOSH over the full time period. Figure 3 compares relative differences (in percent) of ML-TOMCAT with GOZCARDS and SWOOSH, respectively, as a function of latitude and pressure. ML-TOMCAT shows slightly positive biases in the middle stratosphere and somewhat negative biases in the upper and lower stratosphere with respect to both SWOOSH and GOZCARDS data. The largest biases (up to 10 %) are observed in the tropical lowermost stratosphere as well as polar latitudes. However, these largest differences in the tropical lowermost stratosphere (and upper troposphere) cannot be correctly validated as most satellite retrievals show largest uncertainties in this region (Rahpoe et al.2015; Steinbrecht et al.2017; Sofieva et al.2021). Similarly, for the non-MLS period, the biases in the polar stratosphere could be due to the lack of observational ozone profiles during polar night.

Figure 3. Relative differences (in percent) as a function of pressure and latitude between ML-TOMCAT and (a) GOZCARDS V2 (Froidevaux et al., 2019) and (b) SWOOSH (Davis et al., 2016). Stippling indicates regions where differences are smaller than 1 standard deviation.
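A relative-difference map with a stippling mask of this kind could be computed along the following lines. The caption does not specify which standard deviation is used, so the temporal standard deviation of the relative difference at each grid point is assumed here; the array shapes and synthetic values are also hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_months, n_lat, n_lev = 120, 3, 2

# Hypothetical reference ozone (ppm) with dimensions (time, latitude, level).
reference = 5.0 + 0.1 * rng.standard_normal((n_months, n_lat, n_lev))
# Hypothetical test data: no bias at level 0, a 10 % bias at level 1,
# plus about 1 % random month-to-month scatter.
bias = np.array([[0.0, 0.10]] * n_lat)
scatter = 0.01 * rng.standard_normal((n_months, n_lat, n_lev))
test_data = reference * (1.0 + bias + scatter)

# Relative difference in percent, then time mean and time standard deviation.
rel_diff = 100.0 * (test_data - reference) / reference
mean_rel_diff = rel_diff.mean(axis=0)
std_rel_diff = rel_diff.std(axis=0, ddof=1)

# Stipple where the mean difference is smaller than 1 standard deviation.
stipple = np.abs(mean_rel_diff) < std_rel_diff
```

In this synthetic case the unbiased level is stippled and the level with a 10 % bias is not.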


Figure 4 shows TOMCAT, ML-TOMCAT, SWOOSH and GOZCARDS ozone time series over the Equator (0° latitude) at three pressure levels (1, 10 and 50 hPa). Supplement Figs. S1 to S10 show similar comparisons at 15° N, 15° S, 30° N, 30° S, 45° N, 45° S, 60° N, 60° S, 75° N and 75° S latitude bins. The grey shaded area indicates the standard deviation of the ozone values within each bin for the GOZCARDS time series. The green shaded area indicates the root mean square uncertainty of the combined data sets for each bin in SWOOSH data (σrmss in Davis et al., 2016). Overall, there is a good agreement between the ML-TOMCAT, GOZCARDS and SWOOSH time series. As seen in Fig. 1, ML-TOMCAT shows significant improvements in the tropical stratosphere compared to TOMCAT.

Figure 4. Comparison between TOMCAT (blue lines) and ML-TOMCAT (red lines) ozone mixing ratios over the Equator (0°) at (a, top) 1 hPa, (b, middle) 10 hPa and (c, bottom) 50 hPa. Satellite-based ozone mixing ratios from GOZCARDS (Froidevaux et al., 2019) and SWOOSH (Davis et al., 2016) data sets along with their uncertainty estimates (shaded) are shown with black- and green-coloured lines, respectively.


A peculiar detail of Fig. 4 is that the standard deviation in the SWOOSH time series is largest during the 1991–1999 time period, which could be due to a combination of various factors. First, UARS MLS ozone profiles are retrieved at only six levels per pressure decade (Livesey et al., 2003) instead of 12 levels per decade for Aura MLS (see, last access: 1 June 2021). Second, significant enhancement in the stratospheric aerosol loading following the Mt. Pinatubo eruption in June 1991 led to larger retrieval errors. Even with those uncertainties in SWOOSH (and GOZCARDS), ML-TOMCAT is generally close to the satellite-based data sets for the entire time period, and the agreement with satellite data is greatly improved in comparison to the original TOMCAT profile data. Supplement Figs. S1 to S10 also show excellent agreement between ML-TOMCAT and the GOZCARDS/SWOOSH data sets for other latitude bands.
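To give a feel for what six versus 12 levels per pressure decade means, the following sketch builds two log-spaced pressure grids and converts the level spacing to an approximate altitude spacing. The 100–1 hPa range, the function names and the constant 7 km scale height are illustrative assumptions, not properties of the actual MLS retrieval grids.

```python
import math


def pressure_grid(p_bottom_hpa, p_top_hpa, levels_per_decade):
    """Log-spaced pressure levels from p_bottom_hpa down to p_top_hpa."""
    n_steps = round(levels_per_decade * math.log10(p_bottom_hpa / p_top_hpa))
    return [p_bottom_hpa * 10.0 ** (-i / levels_per_decade)
            for i in range(n_steps + 1)]


def level_spacing_km(levels_per_decade, scale_height_km=7.0):
    """Approximate vertical spacing; the 7 km scale height is an assumed value."""
    return scale_height_km * math.log(10.0) / levels_per_decade


coarse = pressure_grid(100.0, 1.0, 6)    # six levels per decade
fine = pressure_grid(100.0, 1.0, 12)     # 12 levels per decade
print(len(coarse), len(fine))
print(round(level_spacing_km(6), 2), round(level_spacing_km(12), 2))
```

Halving the number of levels per decade doubles the vertical spacing, so each retrieved value represents a thicker layer.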

Next we scrutinise percentage differences between GOZCARDS and ML-TOMCAT on the same pressure levels. Figure 5 shows relative differences between TOMCAT, ML-TOMCAT and SWOOSH ozone time series with respect to GOZCARDS. As seen earlier, TOMCAT ozone shows up to 40 % positive biases in the lower stratosphere and 10 % negative biases in the upper stratosphere (also seen in Fig. 1). In contrast, ML-TOMCAT biases are well below 5 % at all levels. At 50 hPa, TOMCAT biases seem to follow QBO-type oscillations that are correctly removed in ML-TOMCAT. Similarly, at 1 hPa TOMCAT differences show some uneven variations that could be linked to the inhomogeneities in the ERA-5 dynamical fields that are used to force TOMCAT. Furthermore, ML-TOMCAT differences show much smaller and almost linear biases at 1 hPa and lie well within the spread of GOZCARDS data.

Figure 5. As Fig. 4 but for the residuals, i.e. relative differences between SWOOSH (green), TOMCAT (blue) and ML-TOMCAT (red) ozone with respect to GOZCARDS ozone.


Interestingly, although both GOZCARDS and SWOOSH are created by merging nearly identical data sets, there are differences between them which are largest for the 1984–2004 time period. This indicates that even slight differences in merging methodology lead to large differences in the merged data set. Although we use completely independent output from a CTM as a basis data set, GOZCARDS–ML-TOMCAT differences are within the expected discrepancy between GOZCARDS and SWOOSH data sets, especially at 10 and 50 hPa.

Another notable feature in Fig. 5 is that at 50 hPa ML-TOMCAT shows the largest differences during 2020, which could be associated with the biases in ERA-5 dynamics during that period. A TOMCAT sensitivity simulation forced with ECMWF operational analysis data shows better agreement with MLS ozone variation during this period (e.g. Chrysanthou et al., 2021). In addition, larger differences seen during 1984 (50 hPa), 1988 (10 hPa) and 1996–1999 (1 hPa) are most probably associated with SAGE II sampling issues and/or inhomogeneities in ERA-5 dynamical fields. However, a detailed analysis of these biases is beyond the scope of this study and needs further investigation.

4.3.2 Comparison with altitude level data

We now compare ML-TOMCAT ozone profiles against altitude-based merged satellite data sets. Figure 6 shows the relative differences between TOMCAT/ML-TOMCAT vs. SAGE-CCI-OMPS (Sofieva et al., 2017), BSVert (Hassler et al., 2018a) and SCIA-OMPS (Arosio et al., 2019) data sets as a function of altitude and latitude. The top panels (a and b) show the mean relative differences of TOMCAT and ML-TOMCAT, respectively, with respect to the SAGE-CCI-OMPS data set. Here TOMCAT shows large positive biases (up to 20 %) in the lowermost stratosphere and negative biases (up to 15 %) in the upper stratosphere. On the other hand, ML-TOMCAT shows only ±10 % biases throughout the stratosphere. Larger biases are seen in the Antarctic stratosphere that could be attributed to the limited observational ozone profiles used to construct the altitude-based merged satellite data products. Interestingly, ML-TOMCAT shows the largest biases (up to 30 %) with respect to the BSVert data set, though TOMCAT profiles (forced with ERA-Interim) are used as a transfer function while constructing BSVert (Hassler et al., 2018a). In addition, in the lowermost stratosphere, biases are negative in the tropics and SH mid-latitudes and positive in the NH middle and high latitudes. Hence, a contributing factor for these hemispherically asymmetric biases with respect to BSVert ozone profiles might be differences between ERA-Interim and ERA-5 reanalysis data (e.g. Ploeger et al., 2021) that are used to force these two data sets. The negative relative differences in the lower tropical stratosphere with respect to the SCIA-OMPS data set in panel (d) are systematic throughout the time series and are thought to be related to two factors. The first one is the rather coarse vertical grid (corresponding to the SCIAMACHY vertical resolution of 3.3 km), which makes it sensitive to the interpolation onto the TOMCAT grid. The second is the difference between the merging procedures implemented for SCIA-OMPS and SWOOSH, so that ML-TOMCAT, trained over the MLS period using SWOOSH, shows a negative bias with respect to SCIA-OMPS, which does not show such a bias with respect to MLS (Arosio et al., 2019).

Figure 6. Relative difference (%) as a function of latitude and altitude between (a) TOMCAT versus SAGE-CCI-OMPS (1985–2019) and ML-TOMCAT versus (b) SAGE-CCI-OMPS (1985–2019), (c) BSVert (1985–2017) and (d) SCIA-OMPS (2002–2019), averaged over the respective time series. Stippling indicates regions where differences are smaller than 1 standard deviation.


Figure 7 compares TOMCAT and ML-TOMCAT profiles with the three altitude-based ozone data sets with a focus on the Equator (0° latitude). Supplement Figs. S11 to S20 show similar comparisons for 15° N, 15° S, 30° N, 30° S, 45° N, 45° S, 60° N, 60° S, 75° N and 75° S latitude bins. Figure 8 displays the respective relative differences with respect to the SAGE-CCI-OMPS data set, which in this case is taken as a reference. In this way it is possible to evaluate the improvement introduced by applying the ML algorithm and also to estimate the discrepancies between different merged data sets, which are expected to be on the order of 5 %–10 %. Compared with the data sets on the pressure vertical coordinate, the scatter between the time series is larger here, due to the larger variety of satellites used to produce the merged products and the fact that they have not been used in the ML training.

Figure 7. Comparison between TOMCAT (blue lines) and ML-TOMCAT (red lines) ozone mixing ratios over the Equator (0°) at (a, top) 45 km, (b, middle) 35 km and (c, bottom) 25 km. Satellite-based ozone mixing ratios from SAGE-CCI-OMPS, BSVert (Hassler et al., 2018a) and SCIA-OMPS (Arosio et al., 2019) data sets are shown with black-, green- and cyan-coloured lines, respectively.


At about 45 km in the tropics the ML algorithm seems to overcorrect the negative bias shown by TOMCAT, leading to generally higher ozone values with respect to the other data sets, especially in the first half of the time series. In the middle stratosphere we find the best agreement between SAGE-CCI-OMPS and ML-TOMCAT; here the expected discrepancies among the merged data sets are comparable to the differences observed between ML-TOMCAT and SAGE-CCI-OMPS. At the peak of the ozone number density profile around 25 km, we notice generally lower values for ML-TOMCAT, on average by 5 %. Similar biases are observed at middle and high latitudes, as seen in Supplement Figs. S11 to S20. The strong seasonal cycle seen in the TOMCAT difference with respect to the merged data sets is largely reduced by ML-TOMCAT at this altitude.

4.3.3 Polar regions

The use of ML-TOMCAT helps to fill the observational gaps, especially in atmospheric regions with a lack of observations and before the beginning of the 21st century, when satellite measurements were sparser. For example, polar regions during local winter cannot be observed by limb observations based on scattered sunlight. Instruments such as Aura MLS and the Sounding of the Atmosphere using Broadband Emission Radiometry (SABER, Rong et al., 2008) have generally been used to fill this gap over the last two decades. For chemical models, complexities are also associated with the denitrification and dehydration (or chlorine activation) schemes that determine heterogeneous ozone losses (Grooß et al., 2018). Though most of our earlier studies showed that TOMCAT is able to simulate polar ozone losses quite realistically (e.g. Feng et al., 2007; Chipperfield et al., 2015, 2017; Dhomse et al., 2019), some systematic biases in the polar stratosphere were noted in Feng et al. (2021) and Weber et al. (2021). Figure 9 compares ozone at 18 km over the Arctic, demonstrating the good agreement between ML-TOMCAT and MLS in this region for both local summer and winter seasons. The bottom panel shows the ozone sub-column over the Antarctic (poleward of 70° S latitude) integrated between 12 and 20 km for TOMCAT, ML-TOMCAT and MLS averaged over the September–October months. Good agreement between MLS and ML-TOMCAT during the ozone hole period is observed for most of the years. ML-TOMCAT enables the reconstruction of the large ozone losses which occurred in the 1980s during a phase when ozone-depleting substances were on a rapid rise before the implementation of the Montreal Protocol and their phaseout.

Figure 8. Same as Fig. 7 but for the residuals, i.e. relative differences between TOMCAT (blue), ML-TOMCAT (red), BSVert (green) and SCIA-OMPS (cyan) ozone with respect to SAGE-CCI-OMPS.


Figure 9. (a) Ozone concentration time series (molecules cm−3) at 18 km over the Arctic region (latitudes poleward of 70° N). Aura-MLS and the Sounding of the Atmosphere using Broadband Emission Radiometry (SABER, Rong et al., 2008) data are superimposed on TOMCAT and ML-TOMCAT time series. (b) Mean ozone sub-column (DU) between 12–20 km for September and October each year over the Antarctic region (latitudes poleward of 70° S).
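The 12–20 km sub-column in Dobson units can be obtained from a number density profile by vertical integration. The sketch below is one simple way to do this, assuming a trapezoidal rule, altitudes in ascending order and linear interpolation at the layer edges; the function name and the test profile are illustrative and not taken from the ML-TOMCAT processing.

```python
DU = 2.6867e16  # molecules cm^-2 in one Dobson unit


def partial_column_du(alt_km, density_cm3, z_low_km=12.0, z_high_km=20.0):
    """Trapezoidal integral of number density (molecules cm^-3) between
    z_low_km and z_high_km, returned in Dobson units."""
    column = 0.0  # molecules cm^-2
    for i in range(len(alt_km) - 1):
        z0, z1 = alt_km[i], alt_km[i + 1]
        n0, n1 = density_cm3[i], density_cm3[i + 1]
        lo, hi = max(z0, z_low_km), min(z1, z_high_km)
        if hi <= lo:
            continue  # layer lies outside the requested altitude range
        slope = (n1 - n0) / (z1 - z0)
        n_lo = n0 + slope * (lo - z0)  # density interpolated to clipped base
        n_hi = n0 + slope * (hi - z0)  # density interpolated to clipped top
        column += 0.5 * (n_lo + n_hi) * (hi - lo) * 1.0e5  # km -> cm
    return column / DU


# Illustrative profile: constant density on a 3 km grid from 10 to 22 km.
altitudes = [10.0, 13.0, 16.0, 19.0, 22.0]
densities = [2.6867e12] * 5
print(partial_column_du(altitudes, densities))
```

Clipping each layer to the requested bounds keeps the result independent of whether 12 and 20 km coincide with grid levels.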


Figure 10. ML-TOMCAT (red line) and TOMCAT (blue line) total column ozone comparison with SBUV merged ozone data (MOD, black line) obtained from (last access: 1 June 2021). Monthly mean total column time series are shown for six latitude bins: Arctic (60–90° N), Antarctic (60–90° S), NH mid-latitudes (35–60° N), SH mid-latitudes (35–60° S), tropics (20° S–20° N) and near global (60° S–60° N).


4.3.4 Total column comparison

As noted above, total column measurements have relatively small retrieval errors and high temporal resolution and thus provide an important data set for assessing model performance. Hence, we compare total column ozone from ML-TOMCAT and TOMCAT with the SBUV merged ozone (MOD) data set (last access: 1 June 2021). Monthly mean total ozone columns, which are calculated by integrating number density profiles, are shown in Fig. 10 for six latitude bands. Supplement Fig. S22 compares tropospheric columns obtained by integrating profiles for ozone volume mixing ratios below 150 parts per billion (ppb). As expected, TOMCAT tropospheric columns show a constant, repeating seasonal cycle with the smallest mean value (and amplitude) in the tropics (up to 13 DU) and the largest mean values in the NH mid-latitudes (up to 23 DU). In contrast, ML-TOMCAT tropospheric columns show a much larger mean and amplitude for all the latitude bins (mean column of about 35 DU in the NH mid-latitudes). ML-TOMCAT tropospheric columns also show large short-term variations (e.g. year 1991 following the Pinatubo eruption), suggesting that the ML-TOMCAT pressure range (316–1 hPa) does affect calculation of the tropospheric column through inclusion of levels that extend above 316 hPa. Note that below the 316 hPa level, both TOMCAT and ML-TOMCAT profiles include monthly climatological values from Logan (1999). Hence, it is important to note that ML-TOMCAT lower-mid tropospheric values are not recommended for scientific studies. Supplement Fig. S23 shows a latitude–altitude cross section of climatological (1979–2020) ozone differences between ML-TOMCAT and TOMCAT in Dobson units. Figure S23 clearly shows that the largest differences are in the upper troposphere/lower stratosphere.
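The 150 ppb threshold for the tropospheric column can be illustrated as follows. Note that the columns in this paper are computed from number density profiles; the sketch instead uses the hydrostatic equivalent in pressure coordinates for brevity, which is an assumption, as are the function name, the layer-mean threshold test and the sample profile.

```python
N_A = 6.02214076e23   # Avogadro constant, mol^-1
G = 9.80665           # standard gravity, m s^-2
M_AIR = 0.0289644     # molar mass of dry air, kg mol^-1
DU_M2 = 2.6867e20     # molecules m^-2 in one Dobson unit


def column_du(p_hpa, vmr, vmr_max=None):
    """Hydrostatic column (DU) from volume mixing ratios on pressure levels.

    Layers whose mean mixing ratio is at or above vmr_max are skipped,
    which mimics a tropospheric column when vmr_max = 150e-9.
    """
    integral = 0.0  # sum of vmr * dp in Pa
    for i in range(len(p_hpa) - 1):
        layer_vmr = 0.5 * (vmr[i] + vmr[i + 1])
        if vmr_max is not None and layer_vmr >= vmr_max:
            continue
        dp_pa = abs(p_hpa[i] - p_hpa[i + 1]) * 100.0  # hPa -> Pa
        integral += layer_vmr * dp_pa
    return integral * N_A / (G * M_AIR) / DU_M2


# Illustrative (made-up) profile from the surface to 10 hPa.
pressure = [1000.0, 500.0, 100.0, 10.0]
ozone_vmr = [50e-9, 80e-9, 1e-6, 8e-6]
total = column_du(pressure, ozone_vmr)
tropospheric = column_du(pressure, ozone_vmr, vmr_max=150e-9)
print(round(total, 1), round(tropospheric, 1))
```

Because the threshold is applied to the mixing ratio rather than to a fixed pressure, the diagnosed tropospheric column can respond to changes in lower-stratospheric ozone near the threshold.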

As noted earlier, TOMCAT total column differences are relatively small for both Arctic (60–90° N) and Antarctic (60–90° S) regions, and Fig. 10 clearly shows that the same is true for ML-TOMCAT total columns as well. For mid-latitudes (35–60°), TOMCAT shows biases of up to +20 DU (especially in the NH mid-latitudes) compared to observations, whereas ML-TOMCAT shows differences of less than 10 DU. Interestingly, for the tropics (20° S–20° N), TOMCAT shows negative biases until 2000 and slightly positive biases afterwards that are almost negligible in the ML-TOMCAT time series. On the other hand, for the near-global average (60° S–60° N), ML-TOMCAT biases remain positive until 1990 and are close to TOMCAT biases. After 2000 TOMCAT seems to show slightly increasing positive biases with respect to SBUV MOD data, but ML-TOMCAT seems to show almost negligible biases without any apparent trend.
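Latitude-band averages such as the near-global (60° S–60° N) mean are commonly formed by weighting zonal means with the cosine of latitude. The weighting used for Fig. 10 is not stated here, so the following is a sketch under that assumption; the function name and the sample values are illustrative.

```python
import math


def band_mean(lats_deg, values, lat_min, lat_max):
    """Cosine-latitude-weighted mean of zonal-mean values within a band."""
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, value in zip(lats_deg, values):
        if lat_min <= lat <= lat_max:
            weight = math.cos(math.radians(lat))  # proportional to band area
            weighted_sum += weight * value
            weight_total += weight
    return weighted_sum / weight_total


# Illustrative (made-up) zonal-mean total columns in DU.
latitudes = [-60.0, -30.0, 0.0, 30.0, 60.0]
columns = [320.0, 290.0, 260.0, 300.0, 350.0]
print(round(band_mean(latitudes, columns, -60.0, 60.0), 1))
```

With this weighting, tropical bins dominate the near-global mean, so mid-latitude biases contribute proportionally less than in an unweighted average.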

Overall ML-TOMCAT ozone profiles outside the 316–1 hPa range should be considered (slightly modified) TOMCAT model profiles. However, total column (and tropospheric column shown in Supplement Fig. S22) comparisons suggest that vertically integrated (1000–0.1 hPa) ML-TOMCAT profiles can provide a useful estimate which is better than TOMCAT and also better than combining ML-TOMCAT stratospheric column with tropospheric column from other sources (noting that levels above 1 hPa have a negligible contribution to the column). Hence, for convenience we include both tropospheric and lower-mesospheric ozone values in ML-TOMCAT data files (1000–0.1 hPa) even though they are only based on values from the TOMCAT model outside of the pressure range 316–1 hPa. For future versions of ML-TOMCAT we aim to also correct tropospheric ozone profile biases using merged tropospheric ozone profile data sets described, for example, in the Tropospheric Ozone Assessment Report (TOAR).

5 Data availability

We thank Sean Davis for SWOOSH data, which are publicly available via (last access: 1 June 2021; Davis et al., 2016). We also thank Lucien Froidevaux for GOZCARDS v2 data. SAGE-CCI-OMPS was obtained via (last access: 1 June 2021; Solomon et al., 2016). SCIA-OMPS data are available upon email request to Alexei Rozanov or Mark Weber. BSVert data were obtained from (Hassler et al., 2018b). ML-TOMCAT v1.0 data are publicly available via (Dhomse et al., 2021).

6 Summary and conclusions

Stratospheric ozone concentrations are affected by many short- and long-term processes; hence, high-quality ozone profile data sets are needed for accurate attribution studies. Though satellite instruments provide global measurements, due to their short mission durations various merging methodologies have been adopted to create homogeneous and gap-free long-term ozone profile data sets. Individual merging methodologies have distinct advantages and disadvantages. Atmospheric chemical models are also able to simulate chemically consistent long-term data sets, but they are prone to the deficiencies associated with the simplified parameterisations and uncertain parameters.

Here we have used TOMCAT CTM ozone profiles and a random-forest (RF) regression model to create a gap-free ozone profile data set (ML-TOMCAT) for 1979–2020. The RF is applied to the ozone difference between the SWOOSH and TOMCAT ozone profiles by selecting 20 years of MLS measurements (UARS-MLS and Aura-MLS) as a training period. The RF shows consistent performance throughout the stratosphere, except at high latitudes and the mid-latitude mid-stratosphere. Overall, ML-TOMCAT shows excellent agreement with SWOOSH for the entire time period (1984–2020), though somewhat larger differences are apparent for the period where limited ozone measurements are available for SWOOSH construction. We also find that ML-TOMCAT shows better agreement with satellite-based merged data sets which use pressure as the vertical coordinate (e.g. SWOOSH, GOZCARDS) but weaker agreement with the data sets which use altitude (e.g. SAGE-CCI-OMPS, SCIA-OMPS). We find that at almost all stratospheric levels ML-TOMCAT ozone concentrations are well within uncertainties of the observational data sets. Presently, the ML-TOMCAT V1.0 data set is ideally suited for the evaluation of chemical model ozone profiles from the tropopause to 0.1 hPa. ML-TOMCAT V1.0 ozone profile data on pressure and altitude levels in mixing ratios and number density units are publicly available via (Dhomse et al., 2021).
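The idea of learning the SWOOSH–TOMCAT difference over a training period and adding the predicted difference back to the model can be sketched as below. This is not the random-forest model used for ML-TOMCAT: a mean offset per calendar month is swapped in as a deliberately simple stand-in for the regressor, and all names and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_monthly_offsets(months, model, observed):
    """Learn the mean (observed - model) difference per calendar month.

    Stand-in for the regression step; the actual data set uses a random forest.
    """
    differences = defaultdict(list)
    for month, m_val, o_val in zip(months, model, observed):
        differences[month].append(o_val - m_val)
    return {month: mean(vals) for month, vals in differences.items()}


def apply_correction(months, model, offsets):
    """Add the learned difference to model values, also outside the training period."""
    return [m_val + offsets.get(month, 0.0) for month, m_val in zip(months, model)]


# Training period (made-up values in ppmv): model vs. observation-based data.
train_months = [1, 2, 1, 2]
train_model = [5.0, 6.0, 5.2, 6.2]
train_obs = [5.5, 5.8, 5.7, 6.0]
offsets = fit_monthly_offsets(train_months, train_model, train_obs)

# Apply the correction to model output from years without observations.
corrected = apply_correction([1, 2], [4.0, 7.0], offsets)
print([round(value, 2) for value in corrected])
```

Training on the difference rather than on ozone itself means the corrected series inherits the model's variability wherever the predicted difference is small.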


The supplement related to this article is available online at:

Author contributions

SSD conceived the idea and initiated the study in discussion with MPC. The model runs were performed and analysed by SSD, MPC and WF. The figures were prepared by CA and SSD. The paper was written by SSD, MPC and MW, who included comments from all of the other coauthors.

Competing interests

The contact author has declared that neither they nor their co-authors have any competing interests.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Acknowledgements

We thank NASA, NOAA and ESA for GOZCARDS, SWOOSH and SAGE-CCI-OMPS data products. We thank the European Centre for Medium-Range Weather Forecasts for providing their analyses. TOMCAT simulations were performed on the UK national Archer and Leeds Arc4 HPC systems.

Financial support

This research has been supported by the NERC SISLAC project (NE/R001782/1). The financial support of part of this work from the state of Bremen, DAAD PRIME grant (Carlo Arosio), and ESA SOLVE Living Planet Fellowship (Carlo Arosio) is gratefully acknowledged. The University of Bremen contribution also benefited from the DFG VolImpact and BMBF SynopSys projects.

Review statement

This paper was edited by Martin Schultz and reviewed by two anonymous referees.


References

Abalos, M. and de la Cámara, A.: Twenty-First Century Trends in Mixing Barriers and Eddy Transport in the Lower Stratosphere, Geophys. Res. Lett., 47, e2020GL089548, 2020.

Anderson, J., Russell III, J., Solomon, S., and Deaver, L.: Halogen Occultation Experiment confirmation of stratospheric chlorine decreases in accordance with the Montreal Protocol, J. Geophys. Res.-Atmos., 105, 4483–4490, 2000.

Arosio, C., Rozanov, A., Malinina, E., Weber, M., and Burrows, J. P.: Merging of ozone profiles from SCIAMACHY, OMPS and SAGE II observations to study stratospheric ozone changes, Atmos. Meas. Tech., 12, 2423–2444, 2019.

Ball, W. T., Alsing, J., Mortlock, D. J., Staehelin, J., Haigh, J. D., Peter, T., Tummon, F., Stübi, R., Stenke, A., Anderson, J., Bourassa, A., Davis, S. M., Degenstein, D., Frith, S., Froidevaux, L., Roth, C., Sofieva, V., Wang, R., Wild, J., Yu, P., Ziemke, J. R., and Rozanov, E. V.: Evidence for a continuous decline in lower stratospheric ozone offsetting ozone layer recovery, Atmos. Chem. Phys., 18, 1379–1394, 2018.

Ball, W. T., Chiodo, G., Abalos, M., Alsing, J., and Stenke, A.: Inconsistencies between chemistry–climate models and observed lower stratospheric ozone trends since 1998, Atmos. Chem. Phys., 20, 9737–9752, 2020.

Bernath, P. F., McElroy, C. T., Abrams, M. C., Boone, C. D., Butler, M., Camy-Peyret, C., Carleer, M., Clerbaux, C., Coheur, P.-F., Colin, R., DeCola, P., DeMazière, M., Drummond, J. R., Dufour, D., Evans, W. F. J., Fast, H., Fussen, D., Gilbert, K., Jennings, D. E., Llewellyn, E. J., Lowe, R. P., Mahieu, E., McConnell, J. C., McHugh, M., McLeod, S. D., Michaud, R., Midwinter, C., Nassar, R., Nichitiu, F., Nowlan, C., Rinsland, C. P., Rochon, Y. J., Rowlands, N., Semeniuk, K., Simon, P., Skelton, R., Sloan, J. J., Soucy, M.-A., Strong, K., Tremblay, P., Turnbull, D., Walker, K. A., Walkty, I., Wardle, D. A., Wehrle, V., Zander, R., and Zou, J.: Atmospheric Chemistry Experiment (ACE): Mission overview, Geophys. Res. Lett., 32, L15S01, 2005.

Bodeker, G. E., Hassler, B., Young, P. J., and Portmann, R. W.: A vertically resolved, global, gap-free ozone database for assessing or constraining global climate model simulations, Earth Syst. Sci. Data, 5, 31–43, 2013.

Bognar, K., Alwarda, R., Strong, K., Chipperfield, M. P., Dhomse, S. S., Drummond, J. R., Feng, W., Fioletov, V., Goutail, F., Herrera, B., Manney, G. L., McCullough, E. M., Millán, L. F., Pazmino, A., Walker, K. A., Wizenberg, T., and Zhao, X.: Unprecedented Spring 2020 Ozone Depletion in the Context of 20 Years of Measurements at Eureka, Canada, J. Geophys. Res.-Atmos., 126, e2020JD034365, 2021.

Bovensmann, H., Burrows, J., Buchwitz, M., Frerick, J., Noël, S., Rozanov, V., Chance, K., and Goede, A.: SCIAMACHY: Mission objectives and measurement modes, J. Atmos. Sci., 56, 127–150, 1999.

Brasseur, G. P. and Solomon, S.: Aeronomy of the middle atmosphere: Chemistry and physics of the stratosphere and mesosphere, vol. 32, Springer Science & Business Media, ISBN 978-1-4020-3284-4, 2006.

Breiman, L.: Random forests, Mach. Learn., 45, 5–32, 2001.

Carpenter, L. J., Daniel, J. S., Fleming, E. L., Hanaoka, T., Ju, H., Ravishankara, A. R., Ross, M. N., Tilmes, S., Wallington, T. J., and Wuebbles, D. J.: Scenarios and information for policy makers, in: Scientific Assessment of Ozone Depletion: 2018, World Meteorological Organization, Global Ozone Research and Monitoring Project – Report No. 58, chap. 6, World Meteorological Organization/UNEP, Geneva, Switzerland, available at: (last access: 1 December 2021), 2018.

Chipperfield, M. P.: New version of the TOMCAT/SLIMCAT off-line chemical transport model: Intercomparison of stratospheric tracer experiments, Q. J. Roy. Meteor. Soc., 132, 1179–1203, 2006.

Chipperfield, M. P., Dhomse, S. S., Feng, W., McKenzie, R. L., Velders, G. J. M., and Pyle, J. A.: Quantifying the ozone and ultraviolet benefits already achieved by the Montreal Protocol, Nat. Commun., 6, 7233, 2015.

Chipperfield, M. P., Bekki, S., Dhomse, S., Harris, N. R. P., Hassler, B., Hossaini, R., Steinbrecht, W., Thiéblemont, R., and Weber, M.: Detecting recovery of the stratospheric ozone layer, Nature, 549, 211–218, 2017.

Chipperfield, M. P., Dhomse, S., Hossaini, R., Feng, W., Santee, M. L., Weber, M., Burrows, J. P., Wild, J. D., Loyola, D., and Coldewey-Egbers, M.: On the Cause of Recent Variations in Lower Stratospheric Ozone, Geophys. Res. Lett., 45, 5718–5726, 2018.

Chrysanthou, A., Dhomse, S., Feng, W., Yajun, L., Hossaini, R., Ball, W., and Chipperfield, M.: The conundrum of the recent variations in the lower stratospheric ozone: An update, Quadrennial Ozone Symposium, available at:, last access: 1 December 2021.

Cionni, I., Eyring, V., Lamarque, J. F., Randel, W. J., Stevenson, D. S., Wu, F., Bodeker, G. E., Shepherd, T. G., Shindell, D. T., and Waugh, D. W.: Ozone database in support of CMIP5 simulations: results and corresponding radiative forcing, Atmos. Chem. Phys., 11, 11267–11292, 2011.

Coddington, O., Lean, J. L., Pilewskie, P., Snow, M., and Lindholm, D.: A solar irradiance climate data record, B. Am. Meteorol. Soc., 97, 1265–1282, 2016.

Damadeo, R., Petropavlovskikh, I., Godin-Beekmann, S., Hubert, D., Sofieva, V., and Hassler, B.: The Long-term Ozone Trends and Uncertainties in the Stratosphere (LOTUS) SPARC activity: Lessons Learned, in: EGU General Assembly Conference Abstracts, Vienna, Austria, p. 9953, 2018.

Damadeo, R. P., Zawodny, J. M., Thomason, L. W., and Iyer, N.: SAGE version 7.0 algorithm: application to SAGE II, Atmos. Meas. Tech., 6, 3539–3561, 2013.

Davis, S. M., Rosenlof, K. H., Hassler, B., Hurst, D. F., Read, W. G., Vömel, H., Selkirk, H., Fujiwara, M., and Damadeo, R.: The Stratospheric Water and Ozone Satellite Homogenized (SWOOSH) database: a long-term database for climate studies, Earth Syst. Sci. Data, 8, 461–490, 2016.

Dhomse, S., Weber, M., Wohltmann, I., Rex, M., and Burrows, J. P.: On the possible causes of recent increases in northern hemispheric total ozone from a statistical analysis of satellite data from 1979 to 2003, Atmos. Chem. Phys., 6, 1165–1180, 2006.

Dhomse, S., Feng, W., Montzka, S. A., Hossaini, R., Keeble, J., Pyle, J., Daniel, J., and Chipperfield, M.: Delay in recovery of the Antarctic ozone hole from unexpected CFC-11 emissions, Nat. Commun., 10, 1–12, 2019.

Dhomse, S. S., Chipperfield, M. P., Feng, W., Ball, W. T., Unruh, Y. C., Haigh, J. D., Krivova, N. A., Solanki, S. K., and Smith, A. K.: Stratospheric O3 changes during 2001–2010: the small role of solar flux variations in a chemical transport model, Atmos. Chem. Phys., 13, 10113–10123, 2013.

Dhomse, S. S., Chipperfield, M. P., Feng, W., Hossaini, R., Mann, G. W., and Santee, M. L.: Revisiting the hemispheric asymmetry in midlatitude ozone changes following the Mount Pinatubo eruption: A 3-D model study, Geophys. Res. Lett., 42, 3038–3047, 2015.

Dhomse, S. S., Chipperfield, M. P., Damadeo, R. P., Zawodny, J. M., Ball, W. T., Feng, W., Hossaini, R., Mann, G. W., and Haigh, J. D.: On the ambiguous nature of the 11-year solar cycle signal in upper stratospheric ozone, Geophys. Res. Lett., 43, 7241–7249, 2016.

Dhomse, S. S., Kinnison, D., Chipperfield, M. P., Salawitch, R. J., Cionni, I., Hegglin, M. I., Abraham, N. L., Akiyoshi, H., Archibald, A. T., Bednarz, E. M., Bekki, S., Braesicke, P., Butchart, N., Dameris, M., Deushi, M., Frith, S., Hardiman, S. C., Hassler, B., Horowitz, L. W., Hu, R.-M., Jöckel, P., Josse, B., Kirner, O., Kremser, S., Langematz, U., Lewis, J., Marchand, M., Lin, M., Mancini, E., Marécal, V., Michou, M., Morgenstern, O., O'Connor, F. M., Oman, L., Pitari, G., Plummer, D. A., Pyle, J. A., Revell, L. E., Rozanov, E., Schofield, R., Stenke, A., Stone, K., Sudo, K., Tilmes, S., Visioni, D., Yamashita, Y., and Zeng, G.: Estimates of ozone return dates from Chemistry-Climate Model Initiative simulations, Atmos. Chem. Phys., 18, 8409–8438, 2018.

Dhomse, S. S., Chipperfield, M. P., Feng, W., Arosio, C., Weber, M., and Rozanov, A.: ML-TOMCAT V1.0: Machine-Learning-Based Satellite-Corrected Global Stratospheric Ozone Profile Dataset, Zenodo [data set], 2021.

Errera, Q., Chabrillat, S., Christophe, Y., Debosscher, J., Hubert, D., Lahoz, W., Santee, M. L., Shiotani, M., Skachko, S., von Clarmann, T., and Walker, K.: Technical note: Reanalysis of Aura MLS chemical observations, Atmos. Chem. Phys., 19, 13647–13679, 2019.

Feng, L., Brugge, R., Hólm, E., Harwood, R., O'Neill, A., Filipiak, M., Froidevaux, L., and Livesey, N.: Four-dimensional variational assimilation of ozone profiles from the Microwave Limb Sounder on the Aura satellite, J. Geophys. Res.-Atmos., 113, D15S07, 2008.

Feng, W., Chipperfield, M. P., Davies, S., von der Gathen, P., Kyrö, E., Volk, C. M., Ulanovsky, A., and Belyaev, G.: Large chemical ozone loss in 2004/2005 Arctic winter/spring, Geophys. Res. Lett., 34, L09803, 2007.

Feng, W., Dhomse, S. S., Arosio, C., Weber, M., Burrows, J. P., Santee, M. L., and Chipperfield, M. P.: Arctic ozone depletion in 2019/20: Roles of chemistry, dynamics and the Montreal Protocol, Geophys. Res. Lett., 48, e2020GL091911, 2021.

Froidevaux, L., Livesey, N. J., Read, W. G., Salawitch, R. J., Waters, J. W., Drouin, B., MacKenzie, I. A., Pumphrey, H. C., Bernath, P., Boone, C., Nassar, R., Montzka, S., Elkins, J., Cunnold, D., and Waugh, D.: Temporal decrease in upper atmospheric chlorine, Geophys. Res. Lett., 33, L23812, 2006a.

Froidevaux, L., Livesey, N. J., Read, W. G., Jiang, Y. B., Jimenez, C., Filipiak, M. J., Schwartz, M. J., Santee, M. L., Pumphrey, H. C., Jiang, J. H., and Wu, D. L.: Early validation analyses of atmospheric profiles from EOS MLS on the Aura satellite, IEEE T. Geosci. Remote, 44, 1106–1121, 2006b.

Froidevaux, L., Anderson, J., Wang, H.-J., Fuller, R. A., Schwartz, M. J., Santee, M. L., Livesey, N. J., Pumphrey, H. C., Bernath, P. F., Russell III, J. M., and McCormick, M. P.: Global OZone Chemistry And Related trace gas Data records for the Stratosphere (GOZCARDS): methodology and sample results with a focus on HCl, H2O, and O3, Atmos. Chem. Phys., 15, 10471–10507, 2015.

Froidevaux, L., Kinnison, D. E., Wang, R., Anderson, J., and Fuller, R. A.: Evaluation of CESM1 (WACCM) free-running and specified dynamics atmospheric composition simulations using global multispecies satellite data records, Atmos. Chem. Phys., 19, 4783–4821, 2019.

Galytska, E., Rozanov, A., Chipperfield, M. P., Dhomse, S. S., Weber, M., Arosio, C., Feng, W., and Burrows, J. P.: Dynamically controlled ozone decline in the tropical mid-stratosphere observed by SCIAMACHY, Atmos. Chem. Phys., 19, 767–783, 2019.

Ghosh, S., Pyle, J. A., and Good, P.: Temperature dependence of the ClO concentration near the stratopause, J. Geophys. Res.-Atmos., 102, 19207–19216, 1997.

Grooß, J.-U., Müller, R., Spang, R., Tritscher, I., Wegner, T., Chipperfield, M. P., Feng, W., Kinnison, D. E., and Madronich, S.: On the discrepancy of HCl processing in the core of the wintertime polar vortices, Atmos. Chem. Phys., 18, 8647–8666, 2018.

Haigh, J.: The role of stratospheric ozone in modulating the solar radiative forcing of climate, Nature, 370, 544–546, 1994.

Haigh, J., Winning, A., Toumi, R., and Harder, J.: An influence of solar spectral variations on radiative forcing of climate, Nature, 467, 696–699, 2010.

Harrison, J. J., Chipperfield, M. P., Boone, C. D., Dhomse, S. S., and Bernath, P. F.: Fifteen Years of HFC-134a Satellite Observations: Comparisons With SLIMCAT Calculations, J. Geophys. Res.-Atmos., 126, e2020JD033208, 2021.

Hassler, B., Bodeker, G. E., and Dameris, M.: Technical Note: A new global database of trace gases and aerosols from multiple sources of high vertical resolution measurements, Atmos. Chem. Phys., 8, 5403–5421, 2008.

Hassler, B., Kremser, S., Bodeker, G. E., Lewis, J., Nesbit, K., Davis, S. M., Chipperfield, M. P., Dhomse, S. S., and Dameris, M.: An updated version of a gap-free monthly mean zonal mean ozone database, Earth Syst. Sci. Data, 10, 1473–1490, 2018a.

Hassler, B., Kremser, S., Bodeker, G., Lewis, J., Nesbit, K., Davis, S., Chipperfield, M., Dhomse, S., and Dameris, M.: BSVerticalOzone database (v1.0), Zenodo [data set], 2018b.

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteor. Soc., 146, 1999–2049,, 2020. a

Holton, J. R., Haynes, P. H., McIntyre, M. E., Douglass, A. R., Rood, R. B., and Pfister, L.: Stratosphere-troposphere exchange, Rev. Geophys., 33, 403–439,, 1995. a

Hossaini, R., Chipperfield, M. P., Montzka, S. A., Rap, A., Dhomse, S., and Feng, W.: Efficiency of short-lived halogens at influencing climate through depletion of stratospheric ozone, Nat. Geosci., 8, 186–190,, 2015. a

Hossaini, R., Atlas, E., Dhomse, S. S., Chipperfield, M. P., Bernath, P. F., Fernando, A. M., Mühle, J., Leeson, A. A., Montzka, S. A., Feng, W., Harrison, J. J., Krummel, P., Vollmer, M. K., Reimann, S., O'Doherty, S., Young, D., Maione, M., Arduini, J., and Lunder, C. R.: Recent Trends in Stratospheric Chlorine From Very Short-Lived Substances, J. Geophys. Res.-Atmos., 124, 2318–2335,, 2019. a, b, c

Kotsiantis, S. B.: Decision trees: a recent overview, Artif. Intell. Rev., 39, 261–283, 2013.

Li, Y., Chipperfield, M. P., Feng, W., Dhomse, S. S., Pope, R. J., Li, F., and Guo, D.: Analysis and attribution of total column ozone changes over the Tibetan Plateau during 1979–2017, Atmos. Chem. Phys., 20, 8627–8639, 2020.

Livesey, N. J., Read, W. G., Froidevaux, L., Waters, J. W., Santee, M. L., Pumphrey, H. C., Wu, D. L., Shippony, Z., and Jarnot, R. F.: The UARS Microwave Limb Sounder version 5 data set: Theory, characterization, and validation, J. Geophys. Res.-Atmos., 108, 4378, 2003.

Logan, J. A.: An analysis of ozonesonde data for the troposphere: Recommendations for testing 3-D models and development of a gridded climatology for tropospheric ozone, J. Geophys. Res.-Atmos., 104, 16115–16149, 1999.

Luo, B.: Stratospheric aerosol data for use in CMIP6 models, available at: (last access: 1 June 2021), 2016.

Mahieu, E., Chipperfield, M. P., Notholt, J., Reddmann, T., Anderson, J., Bernath, P. F., Blumenstock, T., Coffey, M. T., Dhomse, S. S., Feng, W., Franco, B., Froidevaux, L., Griffith, D. W. T., Hannigan, J. W., Hase, F., Hossaini, R., Jones, N. B., Morino, I., Murata, I., Nakajima, H., Palm, M., Paton-Walsh, C., Russell III, J. M., Schneider, M., Servais, C., Smale, D., and Walker, K. A.: Recent Northern Hemisphere stratospheric HCl increase due to atmospheric circulation changes, Nature, 515, 104–107, 2014.

McCormick, M., Zawodny, J., Veiga, R., Larsen, J., and Wang, P.: An overview of SAGE I and II ozone measurements, Planet. Space Sci., 37, 1567–1586, 1989.

Millán, L. F., Livesey, N. J., Santee, M. L., Neu, J. L., Manney, G. L., and Fuller, R. A.: Case studies of the impact of orbital sampling on stratospheric trend detection and derivation of tropical vertical velocities: solar occultation vs. limb emission sounding, Atmos. Chem. Phys., 16, 11521–11534, 2016.

Mitchell, D. M., Lo, Y. E., Seviour, W. J., Haimberger, L., and Polvani, L. M.: The vertical profile of recent tropical temperature trends: Persistent model biases in the context of internal variability, Environ. Res. Lett., 15, 1040b4, 2020.

Montzka, S. A., Dutton, G. S., Yu, P., Ray, E., Portmann, R. W., Daniel, J. S., Kuijpers, L., Hall, B. D., Mondeel, D., Siso, C., and Nance, J. D.: An unexpected and persistent increase in global emissions of ozone-depleting CFC-11, Nature, 557, 413–417, 2018.

Montzka, S. A., Dutton, G. S., Portmann, R. W., Chipperfield, M. P., Davis, S., Feng, W., Manning, A. J., Ray, E., Rigby, M., Hall, B. D., and Siso, C.: A decline in global CFC-11 emissions during 2018–2019, Nature, 590, 428–432, 2021.

Murtagh, D., Frisk, U., Merino, F., Ridal, M., Jonsson, A., Stegman, J., Witt, G., Eriksson, P., Jiménez, C., Megie, G., de La Noë, J., Ricaud, P., Baron, P., Pardo, J. R., Hauchecorne, A., Llewellyn, E. J., Degenstein, D. A., Gattinger, R. L., Lloyd, N. D., Evans, W. F. J., McDade, I. C., Haley, C. S., Sioris, C., von Savigny, C., Solheim, B. H., McConnell, J. C., Strong, K., Richardson, E. H., Leppelmeier, G. W., Kyrölä, E., Auvinen, H., and Oikarinen, L.: Review: An overview of the Odin atmospheric mission, Can. J. Phys., 80, 309–319, 2002.

Orbe, C., Wargan, K., Pawson, S., and Oman, L. D.: Mechanisms linked to recent ozone decreases in the Northern Hemisphere lower stratosphere, J. Geophys. Res.-Atmos., 125, e2019JD031631, 2020.

Paul, J., Fortuin, F., and Kelder, H.: An ozone climatology based on ozonesonde and satellite measurements, J. Geophys. Res.-Atmos., 103, 31709–31734, 1998.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, P., and Duchesnay, E.: Scikit-learn: Machine learning in Python, J. Mach. Learn. Res., 12, 2825–2830, 2011.

Petropavlovskikh, I., Godin-Beekmann, S., Hubert, D., Damadeo, R., Hassler, B., and Sofieva, V.: SPARC/IO3C/GAW Report on Long-term Ozone Trends and Uncertainties in the Stratosphere, Tech. rep., SPARC, 9th assessment report of the SPARC project, International Project Office at DLR-IPA, GAW Report No. 241, WCRP Report 17/2018, available at: (last access: 1 June 2021), 2019.

Ploeger, F., Diallo, M., Charlesworth, E., Konopka, P., Legras, B., Laube, J. C., Grooß, J.-U., Günther, G., Engel, A., and Riese, M.: The stratospheric Brewer–Dobson circulation inferred from age of air in the ERA5 reanalysis, Atmos. Chem. Phys., 21, 8393–8412, 2021.

Prignon, M., Chabrillat, S., Friedrich, M., Smale, D., Strahan, S. E., Bernath, P. F., Chipperfield, M. P., Dhomse, S. S., Feng, W., Minganti, D., Servais, C., and Mahieu, E.: Stratospheric fluorine as a tracer of circulation changes: comparison between infrared remote-sensing observations and simulations with five modern reanalyses, J. Geophys. Res.-Atmos., 126, e2021JD034995, 2021.

Rahpoe, N., Weber, M., Rozanov, A. V., Weigel, K., Bovensmann, H., Burrows, J. P., Laeng, A., Stiller, G., von Clarmann, T., Kyrölä, E., Sofieva, V. F., Tamminen, J., Walker, K., Degenstein, D., Bourassa, A. E., Hargreaves, R., Bernath, P., Urban, J., and Murtagh, D. P.: Relative drifts and biases between six ozone limb satellite measurements from the last decade, Atmos. Meas. Tech., 8, 4369–4381, 2015.

Randel, W. J. and Wu, F.: A stratospheric ozone profile data set for 1979–2005: Variability, trends, and comparisons with column ozone data, J. Geophys. Res.-Atmos., 112, D06313, 2007.

Rong, P., Russell III, J., Mlynczak, M., Remsberg, E., Marshall, B., Gordley, L., and Lopez-Puertas, M.: Validation of TIMED/SABER v1.07 ozone at 9.6 µm in the altitude range 15–70 km, J. Geophys. Res.-Atmos., 114, D04306, 2008.

Russell III, J. M., Gordley, L. L., Park, J. H., Drayson, S. R., Hesketh, W. D., Cicerone, R. J., Tuck, A. F., Frederick, J. E., Harries, J. E., and Crutzen, P. J.: The halogen occultation experiment, J. Geophys. Res.-Atmos., 98, 10777–10797, 1993.

Skachko, S., Errera, Q., Ménard, R., Christophe, Y., and Chabrillat, S.: Comparison of the ensemble Kalman filter and 4D-Var assimilation methods using a stratospheric tracer transport model, Geosci. Model Dev., 7, 1451–1465, 2014.

Sofieva, V. F., Kalakoski, N., Päivärinta, S.-M., Tamminen, J., Laine, M., and Froidevaux, L.: On sampling uncertainty of satellite ozone profile measurements, Atmos. Meas. Tech., 7, 1891–1900, 2014.

Sofieva, V. F., Kyrölä, E., Laine, M., Tamminen, J., Degenstein, D., Bourassa, A., Roth, C., Zawada, D., Weber, M., Rozanov, A., Rahpoe, N., Stiller, G., Laeng, A., von Clarmann, T., Walker, K. A., Sheese, P., Hubert, D., van Roozendael, M., Zehner, C., Damadeo, R., Zawodny, J., Kramarova, N., and Bhartia, P. K.: Merged SAGE II, Ozone_cci and OMPS ozone profile dataset and evaluation of ozone trends in the stratosphere, Atmos. Chem. Phys., 17, 12533–12552, 2017.

Sofieva, V. F., Szeląg, M., Tamminen, J., Kyrölä, E., Degenstein, D., Roth, C., Zawada, D., Rozanov, A., Arosio, C., Burrows, J. P., Weber, M., Laeng, A., Stiller, G. P., von Clarmann, T., Froidevaux, L., Livesey, N., van Roozendael, M., and Retscher, C.: Measurement report: regional trends of stratospheric ozone evaluated using the MErged GRIdded Dataset of Ozone Profiles (MEGRIDOP), Atmos. Chem. Phys., 21, 6707–6720, 2021.

Solomon, S., Ivy, D. J., Kinnison, D., Mills, M. J., Neely, R. R., and Schmidt, A.: Emergence of healing in the Antarctic ozone layer, Science, 353, 269–274, 2016.

Steinbrecht, W., Froidevaux, L., Fuller, R., Wang, R., Anderson, J., Roth, C., Bourassa, A., Degenstein, D., Damadeo, R., Zawodny, J., Frith, S., McPeters, R., Bhartia, P., Wild, J., Long, C., Davis, S., Rosenlof, K., Sofieva, V., Walker, K., Rahpoe, N., Rozanov, A., Weber, M., Laeng, A., von Clarmann, T., Stiller, G., Kramarova, N., Godin-Beekmann, S., Leblanc, T., Querel, R., Swart, D., Boyd, I., Hocke, K., Kämpfer, N., Maillard Barras, E., Moreira, L., Nedoluha, G., Vigouroux, C., Blumenstock, T., Schneider, M., García, O., Jones, N., Mahieu, E., Smale, D., Kotkamp, M., Robinson, J., Petropavlovskikh, I., Harris, N., Hassler, B., Hubert, D., and Tummon, F.: An update on ozone profile trends for the period 2000 to 2016, Atmos. Chem. Phys., 17, 10675–10690, 2017.

Stolarski, R. S., Douglass, A. R., Newman, P. A., Pawson, S., and Schoeberl, M. R.: Relative Contribution of Greenhouse Gases and Ozone-Depleting Substances to Temperature Trends in the Stratosphere: A Chemistry–Climate Model Study, J. Climate, 23, 28–42, 2010.

Strahan, S. E., Douglass, A. R., Stolarski, R. S., Akiyoshi, H., Bekki, S., Braesicke, P., Butchart, N., Chipperfield, M. P., Cugnet, D., Dhomse, S., Frith, S. M., Gettelman, A., Hardiman, S. C., Kinnison, D. E., Lamarque, J. F., Mancini, E., Marchand, M., Michou, M., Morgenstern, O., Nakamura, T., Olivie, D., Pawson, S., Pitari, G., Plummer, D. A., Pyle, J. A., Scinocca, J. F., Shepherd, T. G., Shibata, K., Smale, D., Teyssedre, H., Tian, W., and Yamashita, Y.: Using transport diagnostics to understand chemistry climate model ozone simulations, J. Geophys. Res.-Atmos., 116, D17302, 2011.

Sukhodolov, T., Rozanov, E., Ball, W., Bais, A., Tourpali, K., Shapiro, A., Telford, P., Smyshlyaev, S., Fomin, B., Sander, R., Bossay, S., Bekki, S., Marchand, M., Chipperfield, M., Dhomse, S., Haigh, J., Peter, T., and Schmutz, W.: Evaluation of simulated photolysis rates and their response to solar irradiance variability, J. Geophys. Res.-Atmos., 121, 6066–6084, 2016.

Svetnik, V., Liaw, A., Tong, C., Culberson, J. C., Sheridan, R. P., and Feuston, B. P.: Random forest: a classification and regression tool for compound classification and QSAR modeling, J. Chem. Inf. Comput. Sci., 43, 1947–1958, 2003.

Szeląg, M. E., Sofieva, V. F., Degenstein, D., Roth, C., Davis, S., and Froidevaux, L.: Seasonal stratospheric ozone trends over 2000–2018 derived from several merged data sets, Atmos. Chem. Phys., 20, 7035–7047, 2020.

Tian, W., Chipperfield, M. P., Gray, L. J., and Zawodny, J. M.: Quasi-biennial oscillation and tracer distributions in a coupled chemistry-climate model, J. Geophys. Res.-Atmos., 111, D20301, 2006.

Wales, P. A., Salawitch, R. J., Nicely, J. M., Anderson, D. C., Canty, T. P., Baider, S., Dix, B., Koenig, T. K., Volkamer, R., Chen, D., Huey, L. G., Tanner, D. J., Cuevas, C. A., Fernandez, R. P., Kinnison, D. E., Lamarque, J.-F., Saiz-Lopez, A., Atlas, E. L., Hall, S. R., Navarro, M. A., Pan, L. L., Schauffler, S. M., Stell, M., Tilmes, S., Ullmann, K., Weinheimer, A. J., Akiyoshi, H., Chipperfield, M. P., Deushi, M., Dhomse, S. S., Feng, W., Graf, P., Hossaini, R., Jöckel, P., Mancini, E., Michou, M., Morgenstern, O., Oman, L. D., Pitari, G., Plummer, D. A., Revell, L. E., Rozanov, E., Saint-Martin, D., Schofield, R., Stenke, A., Stone, K., Visioni, D., Yamashita, Y., and Zeng, G.: Stratospheric Injection of Brominated Very Short-Lived Substances Inferred from Aircraft Observations of Organic Bromine and BrO in the Western Pacific, J. Geophys. Res.-Atmos., 123, 5690–5719, 2018.

Wargan, K., Orbe, C., Pawson, S., Ziemke, J. R., Oman, L. D., Olsen, M. A., Coy, L., and Knowland, K. E.: Recent decline in extratropical lower stratospheric ozone attributed to circulation changes, Geophys. Res. Lett., 45, 5166–5176, 2018.

Wargan, K., Weir, B., Manney, G. L., Cohn, S. E., and Livesey, N. J.: The anomalous 2019 Antarctic ozone hole in the GEOS Constituent Data Assimilation System with MLS observations, J. Geophys. Res.-Atmos., 125, e2020JD033335, 2020.

Weber, M., Dhomse, S., Wittrock, F., Richter, A., Sinnhuber, B.-M., and Burrows, J. P.: Dynamical control of NH and SH winter/spring total ozone from GOME observations in 1995–2002, Geophys. Res. Lett., 30, 1583, 2003.

Weber, M., Dikty, S., Burrows, J. P., Garny, H., Dameris, M., Kubin, A., Abalichin, J., and Langematz, U.: The Brewer-Dobson circulation and total ozone from seasonal to decadal time scales, Atmos. Chem. Phys., 11, 11221–11235, 2011.

Weber, M., Arosio, C., Feng, W., Dhomse, S. S., Chipperfield, M. P., Meier, A., Burrows, J. P., Eichmann, K.-U., Richter, A., and Rozanov, A.: The Unusual Stratospheric Arctic Winter 2019/20: Chemical Ozone Loss From Satellite Observations and TOMCAT Chemical Transport Model, J. Geophys. Res.-Atmos., 126, e2020JD034386, 2021.

WMO: Scientific Assessment of Ozone Depletion: 2014, Tech. rep., World Meteorological Organization, Global Ozone Research and Monitoring Project, Report No. 55, Geneva, Switzerland, available at: (last access: 1 December 2021), 2014.

WMO: Scientific Assessment of Ozone Depletion: 2018, Tech. rep., World Meteorological Organization, Global Ozone Research and Monitoring Project, Report No. 58, Geneva, Switzerland, available at: (last access: 1 December 2021), 2018.

Wohltmann, I., von der Gathen, P., Lehmann, R., Maturilli, M., Deckelmann, H., Manney, G. L., Davies, J., Tarasick, D., Jepsen, N., Kivi, R., Lyall, N., and Rex, M.: Near-complete local reduction of Arctic stratospheric ozone by severe chemical loss in spring 2020, Geophys. Res. Lett., 47, e2020GL089547, 2020.

Zawada, D. J., Rieger, L. A., Bourassa, A. E., and Degenstein, D. A.: Tomographic retrievals of ozone with the OMPS Limb Profiler: algorithm description and preliminary results, Atmos. Meas. Tech., 11, 2375–2393, 2018.

Short summary
High-quality long-term ozone profile data sets are key to estimating short- and long-term ozone variability. Almost all satellite and chemical model data sets show biases with respect to each other, because of differences in measurement methodologies and simplified representations of processes in the models. We combine satellite data sets and chemical model output using a random-forest machine-learning algorithm to generate a 42-year ozone profile data set, named ML-TOMCAT.