The polar mesospheric cloud dataset of the Balloon Lidar Experiment (BOLIDE)

The Balloon Lidar Experiment (BOLIDE) observed polar mesospheric clouds (PMCs) along the Arctic circle between Sweden and Canada during the balloon flight of PMC Turbo in July 2018. The purpose of the mission was to study small-scale dynamical processes induced by the breaking of atmospheric gravity waves by high-resolution imaging and profiling of the PMC layer. The primary parameter of the lidar soundings is the time- and range-resolved volume backscatter coefficient β. These data are available at high resolutions of 20 m and 10 s (Kaifler, 2021, https://doi.org/10.5281/zenodo.5722385). This document describes how we calculate β from the BOLIDE photon count data and balloon floating altitude. We compile information relevant for the scientific exploration of this dataset, including statistics, mean values, and temporal evolution of parameters like PMC brightness, altitude, and occurrence rate. Special emphasis is given to the stability of the gondola pointing and the effect of resolution on the signal-to-noise ratio and thus the detection threshold of PMCs. PMC layers were detected during 49.7 h in total, accounting for 36.8 % of the 5.7 d flight duration and a total of 17 892 PMC profiles at 10 s resolution. To date, published results from subsets of this dataset include the evolution of small-scale vortex rings, distinct Kelvin–Helmholtz instabilities, and mesospheric bores. The lidar soundings reveal a wide range of responses of the PMC layer to larger-scale gravity waves and breaking gravity waves, including the accompanying instabilities, that await scientific analysis.


Introduction
The dataset described here contains balloon lidar soundings of polar mesospheric clouds (PMCs), which are also known as noctilucent clouds (NLCs). Both terms name the same phenomenon, i.e. optically thin layers of ice particles at around 83 km altitude which occur at high latitudes during polar summer. The term noctilucent clouds is translated from the German term "Leuchtende Nachtwolken", coined by Otto Jesse, who was among the first observers in 1885 and who was the first to recognize the scientific value of their observation for the exploration of what is now known as the mesosphere-lower thermosphere region (Jesse, 1885; Leslie, 1885; Backhouse, 1885). One of the prime interests in PMC research today is the structure and dynamical evolution of atmospheric gravity waves imprinted on this cloud layer, as first noted by Witt (1962). Noctilucent clouds can be observed by the naked eye, i.e. in the visible part of the electromagnetic spectrum, or by camera from the ground within the twilight zone. Satellite instruments in orbit since the 1980s extend the observations to different wavelengths, all longitudes, and a wider range of latitudes, and hence the more general term polar mesospheric clouds was introduced (Thomas, 1984; DeLand et al., 2006). This term has also been used for observations with cameras from space (Chandran et al., 2009). With the advent of lidars, which were initially ground based and often operate at 532 nm wavelength (and thus within the wavelength range that human observers can see), the term noctilucent clouds has been retained, also for historic reasons. However, the majority of PMC data from ground-based lidars are acquired at polar latitudes in full daylight due to the higher occurrence frequency closer to the poles. As such, the term PMC also became common in connection with ground-based lidars (e.g. Chu et al., 2003).
The Balloon Lidar Experiment (BOLIDE) is the first lidar to make observations from a balloon platform in the upper stratosphere independent of the influence of the lower atmosphere, e.g. tropospheric clouds, and the mission design and operation had many aspects of a small satellite mission. For this reason, and for conformity to the mission name, we use the term PMC in connection with this dataset.
The BOLIDE instrument is a high-power Rayleigh lidar designed for operation on long-duration balloons floating at ∼ 40 km altitude in the Arctic and Antarctic (Kaifler et al., 2020). The laser beam produced by a pulsed Nd:YAG laser, with a mean optical power of 4.5 W and 100 Hz pulse rate, is tilted 28° off-zenith to avoid the balloon. Backscattered light is collected by a 0.5 m diameter mirror and detected via an avalanche photodiode. The detector is operated in photon-counting mode, i.e. every single detected photon is recorded with its time stamp relative to the emission of the last laser pulse. The native temporal resolution of the BOLIDE data acquisition is 800 ps, but the effective resolution is limited to the duration of the laser pulses, which is 5 ns or 1.5 m in range. For all practical applications, these raw photon count data are binned to a lower resolution in order to achieve a sufficiently high signal-to-noise ratio. For example, Kaifler et al. (2020, their Fig. 8) show photon count data at 10 m and 1 s resolution. For scientific interpretation, photon counts are converted to volume backscatter coefficients that relate the measured atmospheric return signal to a standard atmospheric density profile. For the detection of scattering originating from PMC particles, we define a threshold of 2.5σ relative to the background signal. The source of this background signal is scattered sunlight, which contributes to the lidar signal measured in every altitude range. In our opinion, when calculating volume backscatter coefficients from BOLIDE measurements, the photon count data binned to 20 m vertical and 10 s time resolution represent a good compromise between the opposing requirements of high resolution and high signal-to-noise ratio.
The resulting dataset is of comparable quality to the much larger PMC dataset produced by the Arctic Lidar Observatory for Middle Atmosphere Research (ALOMAR) Rayleigh/Mie/Raman lidar (RMR; Kaifler et al., 2018; Schäfer et al., 2020). Moreover, Taylor et al. (2009) and Collins et al. (2009) have also analysed PMC lidar data at resolutions below 10 min.
The PMC Turbo mission was designed to acquire quasi-3D soundings of the PMC layer by using a combination of camera and lidar observations at high resolution, as it is only at scales below 1 min and tens of metres that the fine structures imprinted by dynamical instabilities during the breaking of gravity waves and the transition to turbulence become evident (Fritts et al., 2019). Besides the BOLIDE lidar, the scientific payload consisted of four wide field-of-view and three narrow field-of-view cameras (Kjellstrand et al., 2020). In this work, we describe and characterize the BOLIDE PMC dataset available from Zenodo (Kaifler, 2021) and NASA's Space Physics Data Facility (see Sect. 5). The NetCDF file includes the volume backscatter coefficient β at 20 m and 10 s resolution and time series of the gondola floating altitude, the orientation of the gondola in azimuth, and the latitude and longitude of the lidar beam at 82 km altitude.
Balloon floating altitude and gondola stabilization
PMC Turbo was carried by a long-duration balloon launched at Esrange, Sweden, in July 2018. The stratospheric winds carried the instruments across the Norwegian Sea, the Greenland ice sheet, Baffin Bay, and Baffin Island to the northern Kivalliq Region of Nunavut in Canada, resulting in a flight track within a latitude band of 66.33–69.46° N. Maps with the flight track are shown by Fritts et al. (2019). The length of the flight track was 5810 km, and the average speed was 11.7 m s⁻¹, with a standard deviation of 4.3 m s⁻¹ due to changing upper stratospheric winds. Of relevance for the analysis of lidar PMC data is the balloon floating altitude z_g, as it constitutes an offset in the calculation of the absolute PMC layer altitude based on the range data measured by the lidar. z_g changes as a function of local solar time due to the thermal heating of the balloon, and occasionally, ballast was dropped to gain height (Fig. 1a). z_g was measured by several GPS receivers mounted on the balloon gondola carrying BOLIDE, and the data were transmitted to the ground via satellite links and stored on board at different temporal resolutions. These datasets have been quality controlled, merged, and interpolated to a 5 s grid. Ascent to the upper stratosphere was completed on 8 July at 10:10 UT at 20.71° E, and descent was initiated on 14 July 2018 at 04:46 UT at 109.63° W. The average floating altitude was 38.27 km. Minimum and maximum altitudes were 35.75 and 39.55 km, respectively. Wavelet analyses of the floating altitude reveal the dominant diurnal component and a buoyancy oscillation of ≈ 5 min period and ≈ 40 m amplitude (Fig. 1b). A pronounced buoyancy oscillation can be seen in the top panel of Fig. A1e. The PMC altitudes in the published dataset have been corrected for variations in the floating altitude.
The lidar beam was pointed 28° off-zenith, with an uncertainty of about 0.1° caused by the buoyancy oscillations mentioned above. The slanted beam resulted in the probing of a PMC layer centred at 82 km altitude at 23.25 km horizontal distance to the position of the gondola projected onto the PMC layer. Due to variations in the floating altitude, this distance varied between 22.57 and 24.59 km during the flight. The balloon's buoyancy oscillation of 40 m amplitude led to a periodic horizontal displacement of the laser beam, with an amplitude of 21 m at the altitude of the PMC layer, whereas the uncertainty in the beam pointing angle corresponds to an uncertainty in horizontal distance of approximately 80 m. Due to the off-zenith beam angle, a wide PMC layer extending 80-86 km in altitude was thus probed diagonally, with the lower boundary being horizontally offset by 3 km relative to the upper boundary. For an average PMC layer width of 1 km, this offset amounts to 500 m. The azimuth angle α changed continuously as a rotator stabilized the pointing of the gondola in the anti-sun direction, resulting in one full rotation during 1 local day (Fig. 1c). In the fixed reference frame of the gondola, this rotation caused the laser beam to perform a circular motion at the PMC layer, with approximately 24 km radius and a speed of 1.7 m s⁻¹. In the Earth-centred reference frame, this circular motion is overlaid with the drift of the gondola, which was on average 7 times faster. A rotation of 1° displaced the laser beam by 406 m at 82 km altitude. The lidar was turned off during tests when the gondola was rotated towards the Sun during non-PMC conditions in the evenings of 11 and 13 July. A few sudden losses of azimuth control occurred during the mission. The recovery from those control problems resulted in faster sweeps of the lidar beam over short distances.
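The geometric figures quoted in this paragraph follow from simple trigonometry. The short sketch below recomputes them under the assumption of a flat geometry (Earth curvature neglected) and a fixed 28° zenith angle; the function and variable names are illustrative and not part of the dataset.

```python
import math

ZENITH_DEG = 28.0   # off-zenith angle of the lidar beam (from the text)
PMC_ALT_KM = 82.0   # reference PMC altitude (from the text)
TAN_ZENITH = math.tan(math.radians(ZENITH_DEG))


def horizontal_offset_km(z_g_km, z_km=PMC_ALT_KM):
    """Horizontal distance between the gondola sub-point and the beam at altitude z."""
    return (z_km - z_g_km) * TAN_ZENITH


def layer_offset_km(thickness_km):
    """Horizontal offset between the lower and upper boundary of a layer."""
    return thickness_km * TAN_ZENITH


mean_offset_km = horizontal_offset_km(38.27)  # mean floating altitude
min_offset_km = horizontal_offset_km(39.55)   # highest floating altitude
max_offset_km = horizontal_offset_km(35.75)   # lowest floating altitude

# Arc length swept by the beam for a 1 degree change in azimuth
shift_per_degree_m = mean_offset_km * 1e3 * math.radians(1.0)

# Speed of the beam on its circle for one full rotation per day
circle_speed_m_s = 2.0 * math.pi * mean_offset_km * 1e3 / 86400.0

print(f"offset: {mean_offset_km:.2f} km ({min_offset_km:.2f} to {max_offset_km:.2f} km)")
print(f"shift per degree: {shift_per_degree_m:.0f} m, speed: {circle_speed_m_s:.2f} m/s")
print(f"offset across a 6 km deep layer: {layer_offset_km(6.0):.2f} km")
```

The same helper can be applied to the time series of floating altitude to obtain the horizontal beam offset for every profile.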
To automatically detect these events in post-processing, we look for times with significant spectral power at 1 min period, using wavelet analysis of the cosine of the azimuth angle (Fig. 1d). Only six of these events occurred during PMC conditions, namely on 8 July between 13:58 and 14:11 UT, at 16:53 UT, and between 23:03 and 23:20 UT, on 10 July between 05:51 and 06:23 UT, and on 11 July between 01:35-01:54 and 02:54-03:32 UT. These data subsets are shown in Appendix A. At the average speed of 12.1 m s⁻¹ of the laser beam at the height of the PMC layer, one 10 s lidar profile averages over a horizontal distance of 121 m. The spot size of the laser beam at the PMC layer follows from the mean divergence of the laser beam and the slant range. Using a mean divergence angle of 67 µrad (Kaifler et al., 2020), the calculated spot size is 6.3 m. The position of the lidar beam is fixed relative to the field-of-view (FOV) of PMC Turbo's camera systems, which have a pixel size at the position of the lidar beam of 3 and 8 m, respectively (Kjellstrand et al., 2020). The actual resolution of features in the PMC layer depends on the local wind speed with which they are advected through the lidar's FOV. Model estimates of the wind speed derived from radar data, for example, amount to 40 m s⁻¹ between 02:00 and 03:00 UT on 10 July 2018 (Geach et al., 2020). The local wind speed can also be derived from measurements by applying tracking algorithms to small-scale features in series of images acquired by the cameras on board (Geach et al., 2020). As these analyses are inherently complex, e.g. in the case of multiple layers in sheared environments, information on relative velocities of the position of the laser beam within the advecting PMC layers is not included as a standard product for the whole flight.

Calculation of volume backscatter coefficients
The first step in the calculation of the volume backscatter coefficient is the conversion from slant range to vertical range by multiplying the range grid of the 20 m × 10 s binned photon count profiles by the cosine of the zenith angle. The geometric altitude is obtained by adding the altitude of the gondola, z_g, to the vertical lidar range. The second step is the removal of the so-called background. The background originates from solar radiation scattered into the optical path of the receiving telescope by the residual atmosphere and the telescope's spider and results in a range-independent offset in the lidar data. However, signal-induced noise produced by the photon detector introduces a weak range dependence. For that reason, we estimate the background F above the PMC layer between 96 and 120 km altitude by fitting a linear model, Eq. (1), to each count profile. The coefficients in the linear term in Eq. (1) were determined empirically from calibration measurements. A_0(t) ranges between 20 and 40 photon counts and is strongly correlated with the solar zenith angle but is also influenced by PMCs, as they increase the sky brightness too (the flight time series is shown in Fig. 4a). After removal of the background, the count profiles are multiplied by (z − z_g)² to correct for the range-dependent solid angle of the backscattered laser light. The resulting backscatter profiles comprise Rayleigh scattering from air molecules and scattering produced by ice particles when PMCs are present. As with ground-based lidars, we detect PMCs by comparing the backscatter profile to an MSIS-E-90 total mass density profile ρ_ref(z) (Hedin, 1991) that is fitted to each lidar backscatter profile between 60 and 75 km altitude. Multiplication by the normalization factor determined in this way scales the backscatter profile to units of atmospheric density. The backscatter ratio R(z) in the altitude range between 76 and 90 km is obtained by dividing the normalized backscatter profile by ρ_ref(z). Finally, the volume backscatter coefficient β in units of m⁻¹ sr⁻¹ is determined by

$$\beta(z) = \left[ R(z) - 1 \right] \, \rho_\mathrm{ref}(z) \, \frac{N_A}{M} \, \sigma_R ,$$

with the molar mass of air M = 28.8 g mol⁻¹, the Avogadro number N_A = 6.022 × 10²³ mol⁻¹, and the Rayleigh scattering cross section σ_R = 6.32 × 10⁻³² m² sr⁻¹ for 532 nm wavelength (Thayer et al., 1995). In order to estimate the noise level, we calculate the standard deviation σ_bg of β(z) between 88 and 90 km, a height at which contamination from PMC ice particles is unlikely. Values β > 2.5 σ_bg are accepted as significant, and all others are masked. Figure 2 illustrates the process of calculating density profiles from count data and subsequently deriving β(z) for 11 July 2018 at 01:35 UT for different resolutions when a very weak PMC layer at 87 km resides above a bright and narrow double layer.
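The processing chain described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processing code used for the dataset: the background is approximated by a constant (the mean between 96 and 120 km) instead of the linear model of Eq. (1), the reference density is passed in as an arbitrary callable standing in for MSIS-E-90, and all function and variable names as well as the synthetic test profile are hypothetical.

```python
import math
import statistics

N_A = 6.022e23      # Avogadro number, mol^-1 (value from the text)
M_AIR = 28.8e-3     # molar mass of air, kg mol^-1 (value from the text)
SIGMA_R = 6.32e-32  # Rayleigh backscatter cross section at 532 nm, m^2 sr^-1


def backscatter_coefficient(slant_range_m, counts, z_g_m, rho_ref, zenith_deg=28.0):
    """Return (altitude_m, beta, significant) for one binned count profile.

    rho_ref is any callable giving a reference mass density in kg m^-3 for an
    altitude in metres (a stand-in for the MSIS-E-90 profile).
    """
    cos_zenith = math.cos(math.radians(zenith_deg))
    z = [z_g_m + r * cos_zenith for r in slant_range_m]

    # Simplification: constant background from the mean between 96 and 120 km.
    background = statistics.fmean(
        [c for zi, c in zip(z, counts) if 96e3 <= zi <= 120e3])

    # Remove the background and multiply by the squared range to undo the
    # 1/r^2 decrease of the received signal.
    signal = [(c - background) * (zi - z_g_m) ** 2 for zi, c in zip(z, counts)]
    rho = [rho_ref(zi) for zi in z]

    # Least-squares factor that maps the signal onto rho_ref between 60 and 75 km.
    fit = [(s, p) for zi, s, p in zip(z, signal, rho) if 60e3 <= zi <= 75e3]
    scale = sum(s * p for s, p in fit) / sum(s * s for s, _ in fit)

    # beta = (R - 1) * rho_ref * N_A / M * sigma_R, with R = scale * signal / rho_ref
    beta = [(scale * s / p - 1.0) * p * N_A / M_AIR * SIGMA_R
            for s, p in zip(signal, rho)]

    # Noise level from 88 to 90 km; accept values above 2.5 sigma within 76-90 km.
    sigma_bg = statistics.pstdev(
        [b for zi, b in zip(z, beta) if 88e3 <= zi <= 90e3])
    significant = [76e3 <= zi <= 90e3 and b > 2.5 * sigma_bg
                   for zi, b in zip(z, beta)]
    return z, beta, significant


# Synthetic, noise-free demonstration (all numbers below are made up).
def rho_demo(z_m):
    return 3e-4 * math.exp(-(z_m - 60e3) / 7e3)  # isothermal toy atmosphere


z_g = 38e3
ranges = [24e3 + 100.0 * i for i in range(700)]
cos28 = math.cos(math.radians(28.0))
counts = []
for r in ranges:
    zi = z_g + r * cos28
    ratio = 1.0 + 50.0 * math.exp(-((zi - 83e3) / 500.0) ** 2)  # toy PMC layer
    counts.append(30.0 + 1e15 * rho_demo(zi) * ratio / (zi - z_g) ** 2)

z, beta, significant = backscatter_coefficient(ranges, counts, z_g, rho_demo)
i_peak = max(range(len(beta)), key=lambda i: beta[i])
print(f"peak beta = {beta[i_peak]:.2e} m^-1 sr^-1 at {z[i_peak] / 1e3:.2f} km")
```

With real data, the noise term dominates the estimate of σ_bg; in this noise-free toy profile the threshold is set only by the small residual of the simplified background estimate.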
Although the background dominates the high-resolution profile down to about 70 km, the PMC signal, in a thin layer at 83 km, is also detected with high significance at 10 s resolution. The presence of a weak top layer is confirmed when using lower-resolution data. The detection threshold at the resolution of 20 m × 10 s is 2.5 σ_bg = 3.66 × 10⁻¹⁰ m⁻¹ sr⁻¹. Decreasing the resolution to 300 m × 5 min results in a lower detection threshold of 2.5 σ_bg = 0.12 × 10⁻¹⁰ m⁻¹ sr⁻¹. The signal-to-noise ratio increases as more signal is accumulated in larger bins. This comes at the cost of not resolving the fine details in the PMC layer structure. As shown in the example, the maximum value β_max = 20 × 10⁻¹⁰ m⁻¹ sr⁻¹ reduces to β_max = 12 × 10⁻¹⁰ m⁻¹ sr⁻¹ and β_max = 7 × 10⁻¹⁰ m⁻¹ sr⁻¹ with decreasing resolution. This is due to the presence of sub-scale structure that is averaged out at lower resolutions. The connection between the resolution and signal-to-noise ratio also becomes evident when considering a 5 h period of PMC detections on 10/11 July 2018 that contains the profile selected in Fig. 2. Figure 3a shows the faint, high-altitude layer at 87 km altitude that is detected at low resolution but not as clearly at higher resolutions (Fig. 3b and c). On the other hand, the increase in resolution reveals smaller-scale structure that is only hinted at in the lower-resolution data, e.g. the oscillation from 22:00 to 22:30 UT at 82 km altitude or the fine double layer at 83 km at 01:45 UT. Smaller-scale structure is only revealed when the signal-to-noise ratio is sufficiently high, i.e. when β is large enough. A lower background also helps, but the background varies only by a factor of 2 during the day, while β varies over a much wider range.
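The trade-off between resolution and noise can be illustrated with a simple block average. The sketch below assumes uncorrelated noise in neighbouring bins, in which case averaging N bins reduces the standard deviation by a factor of √N; this is an idealization, and the helper names are illustrative only. Going from 20 m × 10 s to 300 m × 5 min merges 15 × 30 = 450 bins, which under this assumption predicts a threshold of roughly 0.17 × 10⁻¹⁰ m⁻¹ sr⁻¹, the same order of magnitude as the value of 0.12 × 10⁻¹⁰ m⁻¹ sr⁻¹ quoted above.

```python
import math


def rebin(beta, n_time, n_range):
    """Block-average a 2-D list beta[time][range] over n_time x n_range bins.

    Trailing rows and columns that do not fill a complete block are dropped.
    """
    rows = len(beta) // n_time
    cols = len(beta[0]) // n_range
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            block = [beta[i * n_time + a][j * n_range + b]
                     for a in range(n_time) for b in range(n_range)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def noise_reduction(n_time, n_range):
    """Expected reduction of the noise standard deviation for uncorrelated noise."""
    return math.sqrt(n_time * n_range)


# 5 min / 10 s = 30 profiles and 300 m / 20 m = 15 range bins per coarse bin
factor = noise_reduction(30, 15)
predicted_threshold = 3.66e-10 / factor
print(f"noise reduction factor: {factor:.1f}")
print(f"predicted threshold: {predicted_threshold:.2e} m^-1 sr^-1")

# Averaging also smooths narrow features: the peak of a thin layer decreases.
thin_layer = [[0.0, 20.0, 0.0, 0.0], [0.0, 20.0, 0.0, 0.0]]
print(rebin(thin_layer, 2, 2))
```

The last line shows the second effect discussed in the text: a feature narrower than the coarse bin is diluted, so the maximum β decreases with decreasing resolution.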
To characterize the vertical displacements of the PMC layer, it is sometimes convenient to reduce the data to a one-dimensional time series, and a suitable representation of the PMC layer altitude is the β-weighted centroid altitude z_c. We only evaluate z_c for profiles where the sum of all ln β values exceeds a value of 0.04. This effectively removes sporadic detections that cannot be attributed to a continuous PMC layer. To demonstrate this for an example that was especially selected for low β, we add z_c as a black curve to Fig. 3c. The limit imposed on β_int does not significantly reduce the amount of usable data but ensures a high quality in the determination of z_c. The resulting time series z_c(t) can be subsequently interpolated and spectrally analysed (the full time series of z_c is shown in Fig. 4d). In order to avoid having to make further assumptions on the coherency of the PMC layer, we refrain from applying additional spatial filters.

The time- and range-resolved PMC detections are shown in Fig. 4e. From a Gaussian fit to the mean β profile (Fig. 4f), we derive a mean PMC altitude of 82.43 km with a standard deviation of 1.08 km. This is in agreement with the value of 82.6 km reported by Carbary et al. (2001) from satellite data for this latitude band. The measured mean altitude is 840 m lower than, but within the uncertainty interval of, the value of 83.27 ± 1.30 km reported by Fiedler et al. (2017) based on the multi-year ALOMAR RMR lidar dataset at 69° N, 16° E. Unlike the ALOMAR dataset, the BOLIDE dataset was obtained over a range of longitudes (21° E-110° W compared to 16° E) and is of short duration and thus likely susceptible to sampling biases induced by tidal, seasonal, and interannual variation.
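As an illustration of the centroid altitude, the following sketch computes a β-weighted mean altitude for a single profile. The function name, the optional significance mask, and the `min_integrated` parameter are hypothetical; in particular, the acceptance limit is applied here to the plain sum of the selected β values, which is an assumption and may differ from the quantity and units used for the published time series.

```python
def centroid_altitude(z_km, beta, significant=None, min_integrated=0.0):
    """Return the beta-weighted centroid altitude of one profile, or None.

    significant is an optional list of booleans marking accepted detections.
    min_integrated is a hypothetical acceptance limit on the summed beta.
    """
    if significant is None:
        significant = [True] * len(beta)
    pairs = [(z, b) for z, b, ok in zip(z_km, beta, significant) if ok and b > 0.0]
    total = sum(b for _, b in pairs)
    if not pairs or total <= min_integrated:
        return None  # no continuous layer to assign an altitude to
    return sum(z * b for z, b in pairs) / total


# Symmetric toy layer: the centroid coincides with the peak
print(centroid_altitude([82.0, 83.0, 84.0], [1.0, 2.0, 1.0]))
```

Applying the function to every 10 s profile and discarding the `None` results yields a time series that can be interpolated and analysed spectrally.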
The β values show an exponential distribution covering a wide range from 10 × 10⁻¹⁰ to 80 × 10⁻¹⁰ m⁻¹ sr⁻¹. The histogram shown in Fig. 5 includes all values of β between 79 and 86 km altitude. The slope α_BOLIDE = −0.1086 ± 0.0005 of a linear fit to these data is in agreement with results from the ALOMAR dataset of α_ALOMAR = −0.1079 ± 0.0045 between 3 × 10⁻¹⁰ and 50 × 10⁻¹⁰ m⁻¹ sr⁻¹, based on β_max at 15 min temporal resolution (Berger et al., 2019).
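For an exponential distribution, the logarithm of the histogram counts is a linear function of β, so the slope can be estimated with an ordinary least-squares fit. The sketch below assumes the natural logarithm of the counts and β expressed in units of 10⁻¹⁰ m⁻¹ sr⁻¹; both choices are assumptions made for illustration, as are the function name and the synthetic histogram.

```python
import math


def log_linear_slope(bin_centres, counts):
    """Least-squares fit of ln(count) versus bin centre; returns (slope, intercept).

    Empty bins are skipped because their logarithm is undefined.
    """
    pts = [(x, math.log(c)) for x, c in zip(bin_centres, counts) if c > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic histogram following exp(-0.1 * beta), beta in units of 1e-10 m^-1 sr^-1
centres = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
counts = [1000.0 * math.exp(-0.1 * x) for x in centres]
slope, intercept = log_linear_slope(centres, counts)
print(f"slope = {slope:.4f}")
```

An unweighted fit gives equal weight to sparsely populated bins at large β; a weighted fit may be preferable for real histograms.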
An overview of the variability in the PMC layers in the BOLIDE dataset is shown in Figs. 6 and 7. Periodic motions of the PMC layer altitude with periods of a few hours down to the buoyancy period of ≈ 5 min are induced by gravity waves. The BOLIDE dataset also shows other, non-linear large-scale phenomena, e.g. mesospheric bores, of which the two occurrences on 13 July at 12:30 and 13:30 UT (Fig. 7i) were analysed in detail by Fritts et al. (2020). In the spatial and temporal vicinity of such structures and when gravity waves break, a variety of instability dynamics is induced. Here, the true advantage of the BOLIDE dataset comes into play, as it is only at a high temporal and vertical resolution that these dynamical processes are resolved. At short timescales of 2 min and less, locally enhanced β results mainly from convergence of air masses that increases the local ice number density and not so much from ice particle growth, as the latter is a slower process. However, ice particles may quickly sublimate during rapid downdrafts. To interpret specific dynamics observed by the lidar, the common volume PMC Turbo images are very helpful, as they provide the spatial context (viewed in two dimensions from below) in the direct neighbourhood of a few hundred metres and information on the structure and evolution of the larger-scale cloud field up to ≈ 160 km distance (Kjellstrand et al., 2020, their Fig. 5). For example, the largest value of β exceeding 100 × 10⁻¹⁰ m⁻¹ sr⁻¹ occurred after a rapid decrease in the PMC lower boundary on 11 July at 03:30 UT (Fig. 7f) and is related to the passage of a large-scale vortex ring of several kilometres in diameter. This and other types of patterns at the scale of a few minutes and less that are included in the BOLIDE dataset and typically occur in lidar soundings of PMC layers are studied in more detail by Kaifler et al. (2022).
In-depth analyses, including modelling of events during PMC Turbo, concern the small-scale vortex rings on 10 July at 02:40 UT (Fig. 6d; Geach et al., 2020) and the Kelvin-Helmholtz instability dynamics on 12 July at 13:40 UT (Fig. 7h; Fritts et al., 2022; Kjellstrand et al., 2022). An analysis of the dynamics of breaking gravity waves during the bright PMC displays on 11 July between 04:00 and 06:00 UT (Fig. 7f) is in preparation.

Conclusions
The BOLIDE dataset available from Zenodo and NASA's Space Physics Data Facility contains PMC volume backscatter coefficients at 20 m × 10 s resolution. Additionally, time series of balloon altitude, azimuth angle, and beam position at 82 km altitude are provided. The dataset, as described in this document, is suitable for analysis of small-scale dynamical processes acting on the PMC layer, especially in combination with the images acquired by the PMC Turbo wide and narrow FOV cameras. The limited flight duration of 5.7 d likely results in sampling biases due to tidal and seasonal variation, affecting, for example, the flight mean PMC altitude. We thus hope to obtain a larger dataset from the upcoming Balloon Sodium Lidar to measure Tides in the Antarctic Region (B-SoLiTARe) mission, which is currently scheduled for launch at McMurdo, Antarctica, in 2024 and will provide a flight time of approximately 3 weeks. The analysis of the data described here can be applied to future datasets acquired with airborne (balloon or aircraft) or ground-based Rayleigh lidar instruments.

Appendix A: Deviations from nominal rotator motion
On six occasions, PMC lidar measurements are affected by deviations from the nominal rotator motion that induce horizontal displacements of the laser beam position on the PMC layer. Those PMC layer observations are not invalid, but it may be necessary to account for the fast-moving laser beam when interpreting layer displacements. Plots of the affected subsets are shown in Fig. A1, together with floating altitude and azimuth angle.
Figure A1. Floating altitude z_g, azimuth angle α, and respective PMC detections for the six periods with deviations from the nominal rotator motion during PMC conditions. The floating altitude has been taken into account in PMC altitude calculations.