EUREC4A observations from the SAFIRE ATR42 aircraft

As part of the EUREC4A (Elucidating the role of cloud-circulation coupling in climate) field campaign, which took place in January and February 2020 over the western tropical Atlantic near Barbados, the French SAFIRE ATR42 research aircraft (ATR) conducted 19 flights in the lower troposphere. Each flight followed a common flight pattern that sampled the atmosphere around the cloud-base level, at different heights of the subcloud layer, near the sea surface and in the lower free troposphere. The aircraft's payload included a backscatter lidar and a Doppler cloud radar that were both horizontally oriented, a Doppler cloud radar looking upward, microphysical probes, a cavity ring-down spectrometer for water isotopes, a multiwavelength radiometer, a visible camera and multiple meteorological sensors, including fast rate sensors for turbulence measurements. With this instrumentation, the ATR characterized the macrophysical and microphysical properties of trade-wind clouds together with their thermodynamical, turbulent and radiative environment. This paper presents the airborne operations, the flight segmentation, the instrumentation, the data processing and the EUREC4A datasets produced from the ATR measurements. It shows that the ATR measurements of humidity, wind and cloud-base cloud fraction measured with different techniques and samplings are internally consistent, that meteorological measurements are consistent with estimates from dropsondes launched from an overflying aircraft (the High Altitude and

The ATR coming back from its successful EMI flight in Barbados on Jan 23 2020.
with extremely large and steep vertical gradients, ranging from 80 % near the surface, to 100 % within clouds, to less than 5 % above the trade inversion (Stephan et al., 2021). These gradients favour phase changes and the deposition of cloud droplets on airborne sensors, which can affect the response time and accuracy of the measurements. These challenges were met by fitting the aircraft with a wealth of instrumentation which, in some cases, was used in an airborne configuration for the very first time. The instrumentation was also chosen to promote redundancy or complementarity of sensors and measurement techniques. This redundancy was not only important for the post-processing and calibration of the data, it was also essential to assess the robustness of the ATR measurements of cloud fraction, humidity and winds.
The goal of this paper is to provide an overview of the operations and measurements of the ATR during EUREC4A. Section 2 presents the aircraft, the operations, the flight patterns and their segmentation, and the weather conditions during the flights. Section 3 presents the ATR instrumentation, ranging from the core instrumentation of the aircraft to the instruments that were specifically devised for EUREC4A, and provides a brief description of the data post-processing and of the associated datasets. The focus is put on the datasets which have not been subject to specific data papers. Section 4 assesses the internal consistency of ATR measurements regarding the cloud-base cloud fraction, humidity and wind, and their consistency with observations from other platforms. Links to the data are provided in Section 5 and a brief summary and conclusions are presented in Section 6.

2 Flights and operations

A challenging start
The SAFIRE ATR42 (F-HMTO) is a turbo-propeller aircraft flying in the lower troposphere (its ceiling is at about 7.5 km) which has been modified in many ways to fit scientific research purposes. The preparation of the ATR for the EUREC4A campaign was associated with significant challenges.

Table 1. List of ATR flights with a brief description of the main flight patterns: the mean approximate height (and number) of rectangles flown around cloud-base (Rb) or cloud-top (Rt), the height of the L-patterns flown near the top and the middle of the subcloud-layer, the height of the near-surface leg (S-pattern) and of the Ferry legs flown above clouds.
1 On Jan 30 2020, from 11:42 to 12:32 UTC, HALO flew two race-track patterns above the ATR rectangle.
2 On Feb 9 2020, from 14:32 to 17:00 UTC, the ATR flew within the field of view of the RSS aircraft.
3 On Feb 11 2020, from 4:17 to 7:25 UTC, the P3 flew two circular patterns within the EUREC4A circle at an altitude of about 7.5 km and dropped 12 sondes along its first circle (from 4:17 to 5:55 UTC) just before the ATR take-off.
in Barbados between two flights was about one hour long, so that within 90 min, the ATR was back in the measurement zone for a second mission (Table 1). While the ATR was flying in the lower troposphere, HALO was observing the cloud field from aloft and was dropping sondes along three consecutive circles of about 200 km diameter.

The ATR's mission was primarily focused on characterizing the cloud-base cloudiness, subcloud-layer properties and their signals of spatial organisation at the turbulent scale and at the mesoscale. For this purpose, each flight was composed of a basic set of patterns near cloud-base and within the subcloud-layer that was repeated independent of meteorological conditions. This repetition was motivated by the wish to sample the diversity of boundary layer conditions without any bias, and to compare the flights with each other. Then, depending on flight and weather conditions, a few additional patterns were flown near cloud

Then, to characterize the turbulent and mesoscale organization of the subcloud-layer, the ATR flew two L-shaped patterns within the subcloud-layer, one near the top of the subcloud-layer (generally around 600 m) and the other near the middle of the subcloud-layer (around 300 m). As the organization of the boundary layer can be anisotropic and dependent on the wind direction, each L-pattern was composed of two straight legs perpendicular to each other (each leg being about 60 km long): one along-wind and one cross-wind. Finally, in daylight conditions a near-surface leg of about 40 km was performed at an altitude of about 60 m before returning to the Grantley Adams International Airport (BGI) in Barbados through another Ferry leg in the free troposphere.
A few flights were associated with particular features:
- During RF06 (Jan 30), from 11:42 to 12:32 UTC, HALO flew (twice as fast as the ATR) two race-track patterns above the ATR rectangle at an altitude of about 10 km; two dropsondes were dropped at the extremities of the HALO race-track. This coordinated flight will help compare the cloud detection and characterization performed with the HALO and ATR measurements.
- During RF16 (Feb 9th), the ATR flew within the field of view of the Regional Security System (RSS) aircraft, which was flying parallel to the ATR at about the same altitude. On this occasion, the ATR flew 4 rectangles around cloud base. The coordination between the two aircraft will help compare the cloud detection performed with the ATR instruments with the high-resolution pictures taken by the visible camera of the RSS aircraft.
- During RF17 (Feb 11th), the ATR flew during night time. This flight was coordinated with the P-3 aircraft, which dropped sondes (from an altitude of about 7.5 km) along the EUREC4A circle right before the ATR take-off.

Ground support
The main role of the ATR during EUREC4A was to measure the cloud fraction and the thermodynamical, dynamical and microphysical properties of the atmosphere at the interface between the subcloud layer and the cloud layer (Bony et al., 2017;.

reported are the ranges of subcloud-layer top heights (zsc) and inversion heights (zinv) derived from dropsondes (Table 3).

Flight segmentation
To aid in the analysis of the flight data, each flight is segmented into non-exclusive timestamps summarized in a set of Yet Another Markup Language (YAML) files (Table 2). Different kinds of segments are defined, which correspond to basic patterns ('R-pattern', 'L-pattern', 'S-pattern') or to particular phases of the flight (e.g., 'Ferry'). The vertical level at which these patterns are flown (at cloud-top, cloud-base, near the top of the subcloud-layer, near the middle of the subcloud-layer, near the sea surface, above or below the trade inversion level) is also indicated as a 'note' in the YAML files. The vertical excursions of the ATR are referred to as 'Profiles', and the direction (upward or downward) in which they were flown is also reported. An example of flight segmentation is shown for RF11 (Fig. 5). The vertical and horizontal trajectories of each flight are shown in Figs. C1 and C2.

The characterization of the turbulence ("T") requires straight and stabilized legs of at least 30 km (Lenschow et al., 1994). For this reason, the R- and L-patterns were also associated with a finer segmentation into straight horizontal legs of equal duration and length (Table 3).

Figure 6. Segmentation of the R- and L-patterns into straight and stabilized segments of equal duration and length for turbulence studies (T-shortlegs: 30 km/5 min in red, referred to as rnx or lnx where n is the pattern number; T-longlegs: 60 km/10 min in purple, referred to as RnX or LnX). Also reported are the longest stabilized legs in one direction (T-longestlegs, 120 km/20 min or 60 km/10 min, in green, referred to as RLi or LLi where i = 1,..P and P is the number of such segments for the flight). A similar nomenclature is used for the segmentation of the S-patterns. See Table 2 for the definition and the nomenclature of these segments. After Brilouet et al. (2021).

Table 2. Segmentation of the ATR flights into patterns ('kind'), flown at different levels ('note'). Each segment is associated with a 'name', where n = 1, 2,.. N (N being the number of patterns of the 'kind' category flown during the flight), X = A, B, C.... and x = a, b,.. h. See Fig. 6 for an illustration of the sub-segmentation of the patterns into T-shortlegs, T-longlegs and T-longestlegs segments. Also reported is the total number of segments in each category. This information is included in a set of YAML files (one file per flight).
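The sub-segmentation into legs of equal duration can be illustrated with a short sketch. It is not the code used to build the YAML files: the function `split_into_legs`, the start time and the 20-minute stabilized stretch are hypothetical, and only the leg durations (5 and 10 min) and the naming scheme (rnx, RnX) are taken from the Figure 6 and Table 2 captions.

```python
from datetime import datetime, timedelta

def split_into_legs(start, end, leg_minutes, prefix, pattern_number, labels):
    """Cut [start, end] into consecutive legs of equal duration.

    Any remainder shorter than one leg is dropped. Names follow the
    scheme of Table 2: prefix + pattern number + letter.
    """
    leg = timedelta(minutes=leg_minutes)
    legs = []
    t = start
    for label in labels:
        if t + leg > end:
            break
        legs.append({"name": f"{prefix}{pattern_number}{label}",
                     "start": t, "end": t + leg})
        t += leg
    return legs

# Hypothetical 20-minute straight and stabilized stretch of the first rectangle.
stretch_start = datetime(2020, 2, 5, 9, 0)
stretch_end = stretch_start + timedelta(minutes=20)

short_legs = split_into_legs(stretch_start, stretch_end, 5, "r", 1, "abcdefgh")
long_legs = split_into_legs(stretch_start, stretch_end, 10, "R", 1, "ABCD")

for leg in short_legs + long_legs:
    print(leg["name"], leg["start"].time(), leg["end"].time())
```

At an airspeed of about 100 m s−1, a 5-minute leg covers about 30 km and a 10-minute leg about 60 km, which matches the lengths quoted in the caption.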

Environmental conditions associated with each flight
To aid in the analysis of the ATR data, we summarize in Tables 3 and 4 the main environmental conditions associated with each flight, as well as qualitative descriptions of the prominent cloud types and mesoscale cloud patterns present during each flight, plus some information about aerosols and the presence of precipitation. The prominent cloud types are determined by watching animations of the GOES-16 satellite imagery centered on each ATR flight (see their description in Appendix B) plus BCO radar observations. The prominent mesoscale cloud patterns are determined visually from the analysis of the GOES-16 movies associated with each ATR flight and the results of the mesoscale cloud patterns overview of Schulz (2021).

Table 3. Meteorological conditions associated with each ATR flight, and their average over all flights. All quantities are computed from the JOANNE dropsondes dataset as averages over 3 consecutive circles flown during each ATR flight. zINV, zSC and zLCL are the trade-inversion height, the subcloud-layer top height and the lifting condensation level height, respectively. zINV is defined as the height where the moist static energy is minimum between 1300 and 4000 m. zSC is defined as the lowest altitude above 200 m where θv(z) exceeds by more than 0.2 K the mass-weighted average of θv from 200 m to z (Canut et al., 2012; Rochetin et al., 2021; Touzé-Peiffer et al., 2021). zLCL is diagnosed as zLCL = z20m − Cpd (TLCL − T20m)/g, with TLCL = 1/(1/(T20m − 55) − log(RH20m)/2840) + 55, where T is the temperature and RH the relative humidity. ω is the vertical velocity measured at the scale of the EUREC4A circle by dropsondes (Bony and Stevens, 2019; George et al., 2021); ωSC and ωINV are the mass-weighted averages of ω in a 200 m layer centered around zSC and zINV, respectively. ωFT and RHFT (FT referring to the lower free troposphere) are the mass-weighted averages between 4000 and 6000 m of ω and RH, respectively (note that ω was not measured during RF06). PW (precipitable water) is the mass-weighted integral of water vapor specific humidity from the surface to the altitude of the dropsonde launch (about 10 km). The lower-tropospheric stability (LTS) is defined as LTS = θ700hPa − θ1000hPa (Klein and Hartmann, 1993). Vs is the near-surface wind speed computed from the zonal and meridional wind components measured by dropsondes at 20 m.

Table 4. Cloud, aerosol and precipitation conditions associated with ATR flights. Through the combined analysis of Fig. C2, GOES-E animations (section B), BCO radar information and C3ONTEXT results (Schulz, 2021), the prominent low-level cloud types (at the scale of the R- and L-patterns) and cloud mesoscale patterns (at the scale of the EUREC4A circle) are reported for each ATR flight. The different low-level cloud types considered are very shallow cumuli (VS), vertically developed chimney clouds (CH), chimney clouds with stratiform outflow below the inversion (StCH) and chimney clouds with a horizontally extended stratiform layer (ExStCH). Clear-sky is referred to as CS. The mesoscale cloud patterns (referred to as SU, GR, FL or FI for Sugar, Gravel, Flowers and Fish) are defined in Stevens et al. (2020). They are written in bold when there is a consensus about their prominence during the flight. The aerosol extinction coefficient (AEC), volume depolarization ratio (VDR) and dust condition are from Chazette et al. (2020); dust+ corresponds to 1 % ≤ VDR < 2 % and dust++ to VDR ≥ 2 %. The fractional areas (in %) of the R-patterns flown at cloud-base covered by drizzle or rain are derived from the BASTA radar using reflectivity thresholds of -20 dBZ and 0 dBZ to distinguish clouds from drizzle and drizzle from rain, respectively (section 3.5.3). Asterisks * indicate the presence of deeper congestus clouds with cloud-top at 5 km (for RF17 and RF18) or alto-stratus layers between 5 and 8 km for RF20.
Consistent with these contrasting environmental conditions, the most prominent cloud types and mesoscale cloud patterns encountered during each flight also varied (Table 4). For instance, small thin clouds prevailed during RF05 and RF06 (Jan 28 and Jan 30), but deeper cloud systems associated with the presence of stratiform cloudiness around the trade inversion level and rain were present during RF03 (Jan 26), RF07 (Jan 31), RF17-18 (Feb 11) and RF19 (Feb 13). The mesoscale cloud patterns associated with each ATR flight were often a mix of several patterns. Yet, a few flights were associated with a greater prominence of specific mesoscale patterns. For instance, RF06 (Jan 30) was clearly associated with a Sugar pattern, while RF09 and RF10 (on Feb 2) were clearly associated with a Flowers pattern, RF09 sampling mostly the clear-sky part of the pattern and RF10 sampling more of the cloudy area. The Gravel pattern often occurs in association with other patterns, especially with the Sugar pattern, as found during RF05 (Jan 28), RF12 (Feb 5), RF15 and RF16 (Feb 9).
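The lifting condensation level diagnostic given in the Table 3 caption can be written out as a minimal sketch. Several points are assumptions rather than statements from the caption: temperatures are taken in kelvin, RH is expressed as a fraction between 0 and 1, log is the natural logarithm, and the values of Cpd and g are standard constants chosen here for illustration. The input values in the example are invented.

```python
import math

CPD = 1004.0  # specific heat of dry air at constant pressure, J kg-1 K-1 (assumed value)
G = 9.81      # gravitational acceleration, m s-2 (assumed value)

def lcl_temperature(t20m_k, rh20m_frac):
    """TLCL = 1/(1/(T20m - 55) - log(RH20m)/2840) + 55, with T in K and RH as a fraction."""
    return 1.0 / (1.0 / (t20m_k - 55.0) - math.log(rh20m_frac) / 2840.0) + 55.0

def lcl_height(z20m, t20m_k, rh20m_frac):
    """zLCL = z20m - Cpd (TLCL - T20m) / g, in metres."""
    t_lcl = lcl_temperature(t20m_k, rh20m_frac)
    return z20m - CPD * (t_lcl - t20m_k) / G

# Illustrative near-surface values, not taken from the dropsonde dataset.
z_lcl = lcl_height(z20m=20.0, t20m_k=299.0, rh20m_frac=0.75)
print(f"zLCL = {z_lcl:.0f} m")
```

Because TLCL is lower than T20m in unsaturated air, the second term is positive and the diagnosed level lies above 20 m; at saturation (RH = 1) the expression reduces to zLCL = z20m.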
3.1 Aircraft navigation, attitude and meteorological data (SAFIRE-CORE)

3.1.1 Inertial/Navigation system

The ATR Inertial Navigation System (INS), also named AIRINS, is an iXblue inertial navigation system using a Fiber-Optic Gyroscope. By construction, an inertial unit drifts and its position needs to be reset with a Global Positioning System (GPS) position to provide accurate parameters. This is done using a Trimble BX992 GPS. The AIRINS-GPS positioning system then provides groundspeed, acceleration, attitude angles and platform speed components in an Earth-based coordinate system.
During EUREC4A, three problems occurred that impacted the measurements and the data processing: (1) A failure in the internet output of the AIRINS-GPS system prevented us from recording the data at 100 Hz as usual; the data were recorded instead at 50 Hz on a serial output, and then they were synchronized and averaged at 25 Hz and at 1 Hz. (2) During RF03, the GPS was rejected by AIRINS, which resulted in an incorrect position (true heading and attitude) and thus unreliable horizontal wind measurements for this flight; a corrected position (derived from the GPS only) was used in the V2 version of the SAFIRE-CORE dataset, as well as in the RF03 files of other ATR datasets. (3) For RF20, the inertial/GPS data are available at 1 Hz only.
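The reduction from 50 Hz to 25 Hz and 1 Hz mentioned in point (1) can be pictured as a block average. The sketch below shows only the averaging step on a synthetic series; the synchronization with the other acquisition streams is not described here and is not reproduced, and the function name `block_average` is an illustrative choice.

```python
def block_average(samples, factor):
    """Average consecutive, non-overlapping blocks of `factor` samples.

    A trailing incomplete block is discarded.
    """
    n_blocks = len(samples) // factor
    return [sum(samples[i * factor:(i + 1) * factor]) / factor
            for i in range(n_blocks)]

# Two seconds of synthetic data sampled at 50 Hz.
raw_50hz = [float(i) for i in range(100)]

at_25hz = block_average(raw_50hz, 2)   # 50 Hz -> 25 Hz
at_1hz = block_average(raw_50hz, 50)   # 50 Hz -> 1 Hz

print(len(at_25hz), len(at_1hz))
```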

Pressure, anemoclinometric and wind measurements
The SAFIRE ATR is equipped with a five-hole radome that measures the distribution of pressure around the nose of the aircraft (Table 5): the difference of pressure measured between two holes in the vertical or horizontal planes informs about the attack angle and sideslip angle, respectively (Lenschow, 1986). The static and dynamic pressures are measured by Rosemount or Thales transducers connected to Pitot tubes on both sides of the radome. The static pressure, which corresponds to the pressure corrected for the airflow disturbance produced by the aircraft, is determined using a pre-established calibration based on specific flights and maneuvers. The dynamical pressure is obtained by subtracting the static pressure from the total pressure measured at the central radome hole. The true air speed (TAS), which is the speed of the aircraft relative to the airmass through which it is flying, is calculated from the dynamical and static pressures.
The wind is then inferred from the difference between the speed of the aircraft relative to the Earth and the true air speed (Lenschow, 1986). The high rate wind measurements of the ATR have been very robust since its first field campaign in 2006 (Saïd et al., 2010). Unfortunately, because of a hose leak between a hole of the radome and a pressure transducer inside the  radome, the measurement of the vertical wind is not reliable from RF02 to RF08. The horizontal wind measurements were not significantly affected by this problem.
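As an illustration of this chain, the sketch below derives a true air speed from the dynamic and static pressures and then a horizontal wind as the difference between the ground velocity and the air velocity. It is a textbook simplification, not the SAFIRE processing: it assumes dry air behaving as an ideal gas with γ = 1.4, level flight, and an airspeed vector aligned with the true heading (attack and sideslip angles are ignored). All numerical inputs are invented.

```python
import math

GAMMA = 1.4     # ratio of specific heats for dry air (assumed)
R_DRY = 287.05  # gas constant of dry air, J kg-1 K-1 (assumed)

def mach_number(dyn_p_hpa, static_p_hpa):
    """Mach number from the isentropic relation between total and static pressure."""
    ratio = 1.0 + dyn_p_hpa / static_p_hpa
    return math.sqrt(2.0 / (GAMMA - 1.0) * (ratio ** ((GAMMA - 1.0) / GAMMA) - 1.0))

def true_air_speed(dyn_p_hpa, static_p_hpa, static_t_k):
    """True air speed (m s-1) = Mach number times the local speed of sound."""
    return mach_number(dyn_p_hpa, static_p_hpa) * math.sqrt(GAMMA * R_DRY * static_t_k)

def horizontal_wind(ground_east, ground_north, tas, true_heading_deg):
    """Wind = ground velocity - air velocity (eastward, northward), simplified geometry."""
    heading = math.radians(true_heading_deg)
    air_east = tas * math.sin(heading)
    air_north = tas * math.cos(heading)
    return ground_east - air_east, ground_north - air_north

tas = true_air_speed(dyn_p_hpa=55.0, static_p_hpa=950.0, static_t_k=293.0)
# Aircraft heading due east, ground speed 8 m/s lower than TAS: an 8 m/s easterly wind.
u_wind, v_wind = horizontal_wind(tas - 8.0, 0.0, tas, true_heading_deg=90.0)
print(f"TAS = {tas:.1f} m/s, u = {u_wind:.1f} m/s, v = {v_wind:.1f} m/s")
```

In practice the air velocity must be rotated from the aircraft frame to the Earth frame using the attitude angles and the attack and sideslip angles, which is why errors in heading or in a radome pressure line translate directly into wind errors.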

Air temperature
During EUREC4A, the air temperature was measured by two Rosemount E102AL sensors (Table 5). The first one is located on the nose of the aircraft, inside a non-deiced housing, and the second one is located on the fuselage inside a deiced housing (Fig. 7). The static temperature, which is the temperature corrected for aircraft speed and recovery factor of the housing, is calculated from the measured total temperature Tt (°C), the dynamic pressure ∆P (hPa), the static pressure Ps (hPa) and the recovery factor rf (rf = 0.98).
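The expression for the static temperature is not legible in this text. A standard form of the recovery-factor correction that uses exactly the quantities listed above is Ts = (Tt + 273.15) / (1 + rf [(1 + ∆P/Ps)^(2/7) − 1]) − 273.15; the sketch below implements that form. It should be read as an assumption about the correction, not as the verified SAFIRE formula, and the input values are invented.

```python
def static_temperature_c(total_t_c, dyn_p_hpa, static_p_hpa, recovery_factor=0.98):
    """Static temperature (deg C) from total temperature, dynamic and static pressure.

    Assumed form: Ts = (Tt + 273.15) / (1 + rf * ((1 + dP/Ps)**(2/7) - 1)) - 273.15,
    where the exponent 2/7 equals (gamma - 1)/gamma for dry air.
    """
    heating = (1.0 + dyn_p_hpa / static_p_hpa) ** (2.0 / 7.0) - 1.0
    return (total_t_c + 273.15) / (1.0 + recovery_factor * heating) - 273.15

ts = static_temperature_c(total_t_c=25.0, dyn_p_hpa=55.0, static_p_hpa=950.0)
print(f"Ts = {ts:.2f} degC")
```

With these illustrative inputs the correction is between 4 and 5 K, which shows why the dynamic heating cannot be neglected at ATR airspeeds.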

From RF09 to RF20, fast (turbulent) temperature fluctuations were also measured at 200 Hz (and averaged at 25 Hz) with a fine wire temperature sensor. The fine wire is a 5 µm platinum wire soldered on a support and mounted inside a SFIM T4113 housing. Despite its fragility (a fine wire can easily break during takeoff or landing when the aircraft encounters particles or insects), it remained intact during the whole campaign. Despite its housing, the response time of the Rosemount sensor can sometimes be affected by the presence of cloud droplets (Lawson and Cooper, 1990). The fine wire can also be affected by this problem, but it recovers much more quickly, emphasizing the complementarity of the two sensors (Brilouet et al., 2021).
The total temperature from the fine wire is derived by fitting and calibrating its raw measurements against the total temperature measured by the non-deiced Rosemount sensor. Because the resistance of the fine wire is subject to oxidation, this calibration is performed for each individual flight. The static temperature is estimated using the same method as for the Rosemount sensor, using (for lack of a better estimate) the same recovery factor.

The Rosemount and fine wire temperature data are processed at 1 Hz and at 25 Hz. From RF09 to RF20, the turbulence dataset (SAFIRE-TURB) uses the fine wire data as the best estimate for fast fluctuations, and the Rosemount data as a spare (Brilouet et al., 2021).

Humidity
No fewer than five instruments measured humidity in-situ on board the ATR (Table 6), in addition to the cavity ring-down spectrometer (CRDS) presented in another section of this paper (section 3.6). Each instrument is based on a particular measurement principle or technology, and therefore exhibits specific strengths and limitations in terms of stability, response rate, sensitivity to the presence of condensation or measuring range. The comparison and fine analysis of the different measurements makes it possible to calibrate and correct or bypass the shortcomings of each measurement, so as to produce high quality humidity datasets. The main features associated with these instruments and the processing of their measurements are outlined below.

A chilled mirror dew point hygrometer (Buckresearch 1011C) measured the atmospheric dew and frost points. This measurement, made by cooling a reflective condensation surface until an optical system detects the presence of condensation, is traditionally considered as a reference measurement for humidity. However, this type of hygrometer can have limitations when the aircraft undergoes large changes in altitude, passes through a cloud or samples environments with high humidity contrasts.
This sensor also has a slow response time and shows limitations in very dry conditions such as those encountered above the trade inversion.
A Humicap 180C Enviscope-Vaisala capacitive sensor was placed inside a non-deiced Rosemount E102 housing. This sensor is made of a hygroscopic dielectric material whose capacitance is dependent on humidity. After correcting for the effects of aircraft speed, it measures relative humidity directly with a short response time. However, the sensor is sensitive to the presence of cloud droplets and it can report relative humidities above 100 %. Its measurements are thus considered only in unsaturated environments, and under these conditions they help assess the robustness or even calibrate the measurements of other sensors mentioned above.

Table 6. Humidity sensors. Note that the cavity ring-down spectrometer (whose inlet is shown here) is represented in Table 9. See Annex D for the correspondence between the position on the aircraft and the ATR configuration.

Finally, two additional instruments were used to measure rapid fluctuations of humidity: a Licor LI-7500A and a Campbell Scientific krypton hygrometer (KH20).
Therefore, like the Licor, it was cleaned before each subsequent flight. The KH20 measures rapid fluctuations of humidity but not absolute humidity. Absolute values (in g m−3) are obtained by calibration against the slow (1 Hz) humidity measurements

Based on the processing of these different measurements, two humidity datasets have been produced: one at 1 Hz, included in the SAFIRE-CORE dataset, and another at 25 Hz, which is included in the SAFIRE-TURB dataset. Note that in the SAFIRE-TURB dataset, the calibration of the humidity measurements is performed on a leg by leg basis, both for the Licor 7500A and the KH20 sensors.

Kipp and Zonen sensors mounted at the top and at the bottom of the ATR measured downwelling and upwelling broadband radiative fluxes, respectively (Table 7): CGR4 pyrgeometers measured hemispheric longwave fluxes in the 4.5-42 µm spectral range, CMP21 pyranometers measured hemispheric shortwave radiation in the 0.75-2.7 µm spectral range (red dome), and CMP22 pyranometers measured hemispheric shortwave radiation in the 0.2-3.6 µm spectral range (clear dome).
Measuring upwelling and downwelling radiative fluxes requires the aircraft to be in a level and stable attitude. For this reason, the SAFIRE-RADIATION dataset includes two sets of variables for each radiative flux: raw fluxes, and fluxes corrected for the attitude of the aircraft. In the time series of corrected fluxes, whenever the roll or pitch of the aircraft was greater than ±5°, the radiative measurements were considered as 'undefined', and otherwise the downwelling shortwave measurements were corrected for the attitude of the aircraft. This correction requires knowing the offset of the sensor installation, which corresponds to the bias associated with the potential tilt of the mechanical installation of the sensors relative to their support. This offset must be estimated every time the sensor has been re-mounted on the aircraft (as was done on the arrival of the ATR in Barbados, section 2.1). It was determined through specific maneuvers performed during the test flight RF02.
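The attitude screening described above can be sketched as a simple mask. Only the ±5° rejection criterion comes from the text; the function name, the use of NaN to represent 'undefined' and the sample values are assumptions made for illustration, and the tilt correction applied to the downwelling shortwave flux is not reproduced.

```python
import math

MAX_ANGLE_DEG = 5.0  # roll/pitch limit stated in the text

def mask_by_attitude(flux, roll_deg, pitch_deg, max_angle=MAX_ANGLE_DEG):
    """Return a copy of `flux` with NaN wherever |roll| or |pitch| exceeds the limit."""
    masked = []
    for value, roll, pitch in zip(flux, roll_deg, pitch_deg):
        if abs(roll) > max_angle or abs(pitch) > max_angle:
            masked.append(math.nan)  # flagged as 'undefined'
        else:
            masked.append(value)
    return masked

# Illustrative 1 Hz samples (W m-2 and degrees), not taken from the dataset.
flux = [410.0, 412.0, 395.0, 409.0]
roll = [0.5, -6.2, 1.0, 4.9]
pitch = [2.0, 1.5, 7.5, -3.0]

print(mask_by_attitude(flux, roll, pitch))
```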
All pyrgeometers and pyranometers worked properly during the campaign except one: the CMP21 pyranometer (red dome) at the top of the aircraft. Because of this malfunction, the downwelling 0.75-2.7 µm irradiance measurements were either absent or not validated during the campaign. However, all other upward and downward longwave and shortwave fluxes, including the downwelling shortwave measurements over the 0.2-3.6 µm spectral range, are available and distributed in the SAFIRE-RADIATION dataset at 1 Hz.

In addition to broadband radiometers, the ATR carried a nadir-viewing multispectral radiometer, the CLIMAT CE332 instrument, developed by the Laboratoire d'Optique Atmosphérique (LOA) in collaboration with CIMEL (Brogniez et al., 2003).
It is done by comparing the radiances measured on the observed target with those measured by looking at a reference cavity maintained at a given temperature. During the post-processing, the measurements performed at 6 Hz are synchronized and averaged at 1 Hz. They are included in the SAFIRE-CLIMAT dataset. It is planned to estimate the sea surface temperature from these measurements.

Visible images (SAFIRE-CAMERA)
To visualize the context of the data acquired by in-situ measurements or remote sensing, two high-resolution cameras were mounted on the aircraft. One camera, an AV GT 1920C model with a resolution of 1936 × 1456 pixels and a wide angle (focal length of 4.8 mm), took high frequency images (10 frames per second) through the ATR window on the side of the horizontally-staring lidar and radar instruments. The other camera, a Mako G-223 model with a resolution of 2048 × 1088 pixels and a focal length of 16 mm, looked down towards the sea surface at a moderate frequency (1 frame per second). The images taken through the aircraft windows often appear dark because the choice of exposure time was made to avoid saturation due to the brightness of the clouds as much as possible, especially when the sun is behind the aircraft (Fig. 8a). The downward-looking camera can detect the presence of clouds below the aircraft and can help characterise the state of the ocean surface (Fig. 8b).
Three types of products are derived from these cameras: movies (in avi format) are produced for each camera ("window" or "ground") and for each flight, and high-resolution images (in bmp format) are produced for the window camera for R and L patterns.

Image acquired a few minutes earlier by the camera looking down towards the ocean.

The 5-hole nose radome and specific temperature and humidity sensors mounted on the ATR (Rosemount and fine wire thermometers, Licor and KH20 hygrometers, see Tables 5 and 6 and section 3.1) measured rapid fluctuations of the three wind components, temperature and humidity. Based on these high frequency (25 Hz) measurements, the SAFIRE-TURB turbulence dataset was produced to characterize the turbulent characteristics of the atmosphere through a number of diagnostics. The data processing strategies, the calibration methodologies, the procedures of quality control applied to the 25 Hz temperature and moisture measurements, and the methods used to estimate the turbulent diagnostics are explained in detail in Brilouet et al. The 'turbulent moments' include means, variances and covariances of dynamical and thermodynamical variables, turbulent kinetic energy and dissipation rate, third order moments and skewnesses of wind components, potential temperature and water cal velocity density energy spectrum peak, error estimates on the turbulent moments, and quality flags on the temperature and humidity measurements. These diagnostics are produced for each type of segment (T-shortlegs, T-longlegs and T-longestlegs).
This dataset is produced for two levels of data processing. In the Level 2 dataset, the turbulent moments and fluctuations are calculated for each humidity sensor and each temperature sensor, and a quality flag is associated with each sensor. In the Level 3 dataset, a 'best estimate' of the turbulent moments and fluctuations is provided, together with a quality flag; for each segment, the best estimate corresponds to the moments and fluctuations computed from the sensor that has the best quality flag over this segment. The dataset is distributed in NetCDF files whose nomenclature is summarized in Table 3 of Brilouet et al. (2021).
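The Level 3 selection rule can be illustrated with a few lines of code. The flag convention used here (integer flags where a lower value means better quality), the sensor keys and the numbers are all hypothetical; the actual flag definitions are those of Brilouet et al. (2021).

```python
def best_estimate(per_sensor):
    """Pick the sensor with the best quality flag for one segment.

    `per_sensor` maps a sensor name to {"flag": int, "value": float}.
    Lower flag = better quality (assumed convention); ties keep the first sensor.
    """
    best = min(per_sensor, key=lambda name: per_sensor[name]["flag"])
    return best, per_sensor[best]["value"]

# Hypothetical Level 2 humidity variances for one T-shortleg segment.
segment = {
    "licor": {"flag": 1, "value": 0.42},
    "kh20": {"flag": 0, "value": 0.40},
}

sensor, value = best_estimate(segment)
print(sensor, value)
```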

Aerosols (UHSAS)
A UHSAS-A probe (airborne version, serial no: 1303-007) was mounted on the lower left-hand pod on the fuselage section (Fig. 7). This probe is an optical-scattering aerosol particle spectrometer developed and commercialized by Droplet Measurement Technologies (DMT) that counts and sizes particles in the 0.06 to 1 µm range. The sizes are then sorted into 99 linearly spaced size bins of fixed width (9.7 nm).
According to the manufacturer, UHSAS operation is limited to a non-condensing environment. Ladino et al. (2017) reported that UHSAS measurements are subject to water contamination when performed in a cloudy area, which is also visible in our data. Therefore, UHSAS measurements made in cloudy areas (determined by liquid water content > 1 mg m−3 using CDP and 2D-S data, as in the case of Ladino et al. (2017)) are rejected. Moreover, the UHSAS has a maximum count rate of 3000 per second and Cai et al. (2008) have shown that the detection efficiency decreases when the particle concentration exceeds 3000 cm−3 due to coincidence effects. Therefore, points where the total count exceeds 3000 per second are removed from the data. According to Cai et al. (2008), particle concentrations in the small size range come with a caveat that the detection efficiency of a UHSAS (lab version) tends to decrease for particles smaller than 100 nm. Finally, inspection of the housekeeping data revealed erratic variations in the sample flow rate between 32 sccm and 50 sccm, caused by a loose electrical connection at a mass flow controller. Periods of large sample flow variation are manually identified and discarded. The aerosol concentration is calculated from the probe counts per second and the sample flow rate converted from mass (sccm) to volumetric flow rate (cm−3) using temperature and pressure measurements from the aircraft core instruments (sections 3.1.2 and 3.1.3).
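The screening and conversion steps above can be summarized in a short sketch. The ideal-gas conversion from standard to ambient volumetric flow is a common approach, but the reference conditions attached to "sccm" (0 °C and 1013.25 hPa here) are an assumption, as are the function names and the sample values; the manual rejection of periods with erratic flow is not represented.

```python
STD_T_K = 273.15     # reference temperature for sccm (assumed; conventions differ)
STD_P_HPA = 1013.25  # reference pressure for sccm (assumed)

def volumetric_flow_cm3_s(flow_sccm, temp_k, pressure_hpa):
    """Convert a mass flow in sccm to an ambient volumetric flow in cm3 s-1 (ideal gas)."""
    flow_cm3_min = flow_sccm * (STD_P_HPA / pressure_hpa) * (temp_k / STD_T_K)
    return flow_cm3_min / 60.0

def aerosol_concentration(counts_per_s, flow_sccm, temp_k, pressure_hpa, lwc_mg_m3,
                          max_count_rate=3000.0, lwc_threshold=1.0):
    """Particle concentration in cm-3, or None when the sample is rejected.

    Rejection criteria from the text: in-cloud samples (LWC > 1 mg m-3)
    and total counts above 3000 per second.
    """
    if lwc_mg_m3 > lwc_threshold or counts_per_s > max_count_rate:
        return None
    return counts_per_s / volumetric_flow_cm3_s(flow_sccm, temp_k, pressure_hpa)

conc = aerosol_concentration(300.0, 50.0, 293.15, 950.0, lwc_mg_m3=0.0)
print(f"{conc:.0f} cm-3")
```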

Cloud microphysics
Cloud microphysical measurements were made with two instruments: the CDP-2 (Lance et al., 2010), which counts and sizes cloud droplets in the 2-50 µm size range, and the 2D-S (Lawson et al., 2006), which images cloud, drizzle and raindrops in the 10-1280 µm nominal size range (Table 8). Both instruments were mounted under the wings of the ATR, one on the right side and the other on the left side (Fig. 7). Throughout the campaign, the optics of the 2D-S and CDP-2 (and FCDP) probes were cleaned after each flight to remove traces of dust and salt. At low altitudes where the air is warm, the temperature of the CDP-2 and 2D-S lasers increased rapidly and therefore the instruments were often switched off by the operator to avoid damaging the probes. As a result, few CDP-2 and 2D-S measurements are reported along the subcloud layer legs.

CDP-2: cloud droplets
The CDP-2 (serial no. 1711-111, equipped with anti-shatter tips) is a cloud particle spectrometer that counts and sizes cloud droplets in the 2-50 µm range and sorts them into 30 size categories with a resolution of 1-2 µm. The 1 Hz raw data (histograms of counts per second) are processed using DMT's built-in counting and sizing algorithms based on the Mie scattering model, assuming that droplets are spherical with a refractive index of 1.33, and converted to concentrations using the probe sample volume. The sample volume is calculated using the true air speed of the aircraft from SAFIRE-CORE data and the calibrated sample area (0.292 mm 2 ), which was determined prior to the campaign by mapping the probe's response to calibrated water microdroplets injected across the laser beam with an apparatus similar to that of Lance et al. (2010). At 100 m s −1 , which was the typical ATR airspeed during the scientific flights, the sample volume was about 30 cm 3 s −1 . The calibration of the CDP-2 with respect to particle size was regularly monitored during the campaign by means of calibrated glass bead injection tests.
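The sample-volume arithmetic quoted above can be checked with a short sketch. It assumes the calibrated sample area is 0.292 mm 2 (the value consistent with the stated 30 cm 3 s −1 at 100 m s −1 ); the function names are illustrative and are not part of the DMT software.

```python
def cdp_sample_volume_cm3_per_s(true_air_speed_m_s, sample_area_mm2=0.292):
    """Volume of air swept through the laser sample area each second (cm3 s-1)."""
    area_cm2 = sample_area_mm2 * 1e-2       # 1 mm2 = 0.01 cm2
    speed_cm_s = true_air_speed_m_s * 1e2   # 1 m = 100 cm
    return area_cm2 * speed_cm_s


def droplet_concentration_cm3(counts_per_s, true_air_speed_m_s, sample_area_mm2=0.292):
    """Droplet number concentration (cm-3) from a 1 Hz count and the true air speed."""
    return counts_per_s / cdp_sample_volume_cm3_per_s(true_air_speed_m_s, sample_area_mm2)
```

With a true air speed of 100 m s −1 the sketch gives a sample volume of 29.2 cm 3 s −1 , in line with the value of about 30 cm 3 s −1 quoted in the text.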

Measurements in the subcloud layer reveal that the CDP-2 can detect non-cloud droplet particles such as large/ultra-large aerosols. Although these particles may not satisfy the underlying assumption of the CDP-2 sizing algorithms, it was decided not to filter out these measurements in the CDP-2 files so that further investigations of large aerosols may be conducted, at least qualitatively. However, the response of the CDP-2 to such aerosol particles being unknown, the data taken in non-cloudy areas are subject to unquantified errors.

The raw data (from either vertical or horizontal channel, whichever worked best during the flight) is processed using the LaMP in-house processing routines which stem from the early release of the SPEC 2DSView software and are continually updated to integrate state-of-the-art corrections.
The calculation of the sample volume takes into account the decrease in depth of field with particle size, following the manufacturer's formula given in Lawson et al. (2006), as well as the overload periods of the probe. Artifacts due to noisy or dead pixels are identified and removed using the pixel analysis described in Lawson (2011). This probe is equipped with anti-shattering arm tips (K-tip, Korolev et al. (2013)) designed to prevent ice/droplet fragments from falling into the probe sample volume and contaminating the measurement at the lower end of the size spectra (note that no ice was sampled along the ATR flights of EUREC 4 A). In addition to the K-tip, a splash/shatter detection and removal algorithm based on arrival time analysis is applied (e.g. Field et al. (2006), Korolev and Field (2015)). The size of particles seen out of focus is corrected using the Korolev (2007) diffraction correction. Despite these efforts to clean artifacts, the concentration in the first few bins remains questionable for reasons described in Thornberry et al. (2017) and Bansemer (2018) (the contribution of remnant noisy events is amplified in the concentration calculation due to the small sample volume). The size of truncated particles (partial images) is corrected according to Korolev and Sussman (2000) and the nominal size range (10-1280 µm) is extended to 2.56 mm in post-processing.

Once most of the artifacts have been corrected, a series of geometrical descriptors, e.g. size (defined here as the diameter of a circle having an area equal to the projected area of the particle, often referred to as surface equivalent diameter in the literature, D eq ), area or perimeter, are retrieved from each individual 2D image. Statistical properties are then calculated at 1 Hz, such as the particle size distribution (PSD) or the total concentration (NT, calculated as the sum of bin concentrations). The mass size distribution (MSD) is computed from the PSD assuming that the particles are spherical with a liquid water density of 1 g cm −3 .
We define a cloud mask and a drizzle mask based on the liquid water content (LWC) and the particle size (diameter D): cloud is identified when the LWC of droplets smaller than D0 exceeds LWC0, where LWC0 and D0 are specified thresholds of LWC and D, respectively. There is no simple definition of cloud situations, and therefore the values of these thresholds remain uncertain. Here, we use LWC0 = 0.010 g m −3 (which is consistent with other observational and modeling studies of trade-wind clouds such as Heymsfield and McFarquhar (2001) or vanZanten et al. (2011)) and D0 = 100 µm (which is consistent with the AMS glossary definition of cloud drops as water particles between 1 and 100 µm in diameter). We assume that drizzle occurs (drizzle mask is set to 1) when 100 ≤ D < 500 µm, and rain occurs when D ≥ 500 µm.
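The mask definitions above can be illustrated with a minimal sketch operating on a single 1 Hz size spectrum. It assumes spherical drops with a density of 1 g cm −3 , as stated for the MSD, and it further assumes that drizzle or rain is flagged as soon as any particle is counted in the corresponding size range; that last rule is an illustrative choice and may differ from the one used in the PMA processing. All names are hypothetical.

```python
import math

LWC0_G_M3 = 0.010    # cloud LWC threshold (from the text)
D0_UM = 100.0        # cloud/drizzle diameter threshold (from the text)
D_RAIN_UM = 500.0    # drizzle/rain diameter threshold (from the text)
RHO_W_G_CM3 = 1.0    # liquid water density


def bin_lwc_g_m3(diameter_um, conc_cm3):
    """LWC (g m-3) contributed by spherical drops of one size bin."""
    d_cm = diameter_um * 1e-4
    drop_mass_g = RHO_W_G_CM3 * math.pi / 6.0 * d_cm ** 3
    return drop_mass_g * conc_cm3 * 1e6   # cm-3 -> m-3


def hydrometeor_masks(diameters_um, conc_cm3):
    """Return (cloud, drizzle, rain) flags for one spectrum.

    diameters_um: bin-centre diameters; conc_cm3: bin concentrations.
    """
    bins = list(zip(diameters_um, conc_cm3))
    lwc_small = sum(bin_lwc_g_m3(d, n) for d, n in bins if d < D0_UM)
    cloud = lwc_small >= LWC0_G_M3
    # Assumption: one counted particle in the range is enough to set the flag.
    drizzle = any(n > 0 and D0_UM <= d < D_RAIN_UM for d, n in bins)
    rain = any(n > 0 and d >= D_RAIN_UM for d, n in bins)
    return cloud, drizzle, rain
```

For example, 100 cm −3 of 20 µm droplets correspond to an LWC of about 0.42 g m −3 and set the cloud flag, whereas 1 cm −3 of the same droplets (about 0.004 g m −3 ) does not.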
The cloud LWC was inferred from the size distribution of cloud particles measured by the CDP-2 and 2D-S probes. It was also measured independently by a hot wire probe (DMT LWC-300) that was part of the core instrumentation of the ATR (Table 8; note that the LWC-300 sensor broke during RF14 and was immediately replaced by a new one). The hot wire estimates the LWC by measuring the heat dissipated through the vaporization of water droplets on a heated cylinder exposed to the airstream. This calculation is made with the Particle Analysis and Display System (PADS) software, using the aircraft airspeed, pressure and deiced temperature measured by the ATR and the formulas given in the DMT PADS Manual Hot Wire Module 3.5.0 DOC-0290 Rev A. However, the collection efficiency of the sensor is limited for small droplets (< 10 µm) and the evaporation of large drops (> 50 µm) can be incomplete, which can lead to an underestimate of the LWC when such large drops are present (DMT LWC-300 LWC operator's manual DOC-0361 Rev C). The LWC estimate derived from the CDP-2 and 2D-S probes (distributed in the PMA composite dataset) is thus considered to be more precise than that derived from the LWC-300 (distributed in the SAFIRE-CORE dataset).

Datasets
An aerosol dataset was produced on the basis of UHSAS measurements. It is distributed as an ensemble of NetCDF files (one file per flight) that include products such as the PSD and NT, all processed at a frequency of 1 Hz.

The data are distributed for two levels of processing: the level 2 dataset is associated with single instruments (either 2D-S or CDP-2) while the level 3 dataset corresponds to a combined PMA dataset that merges CDP-2 and 2D-S data into a single composite spectrum that spans the range 2 µm to 2.55 mm. The composite dataset includes additional products such as a cloud mask, a drizzle mask and a rain mask (defined in section 4.3), as well as the 6th moment of the particle size distribution to ease the comparison with radar reflectivities. The periods of flight when the probes are switched off are filled with NaN values. All datasets also include the time and aircraft position from the SAFIRE-CORE dataset.
The LWC measurements from the LWC-300 are included in the SAFIRE-CORE dataset at 1 Hz.
Lidar and radar remote sensing

Horizontal lidar measurements (ALiAS)
To characterize the presence of clouds and aerosols in the lower troposphere, the ATR was equipped with a lightweight backscatter lidar named ALiAS (Airborne Lidar for Atmospheric Studies) emitting at the wavelength of 355 nm and detecting polarization (Table 9; Chazette et al., 2012, 2020). The main role of this lidar was to measure, together with the BASTA radar described next, the fractional area covered by the cloud field near the cloud-base level. For this purpose, the line of sight of the lidar was oriented horizontally, looking through one of the ATR windows (UV fused silica glass) on the right side of the aircraft (Fig. 7). The native resolution of the lidar backscatter profile along the line of sight is 0.75 m. However, to improve the signal-to-noise ratio, a low-pass filter was applied and the resolution was downgraded to 15 m. In addition, the backscatter profile was averaged over 50 consecutive shots during the acquisition, which corresponds to approximately one recording every 5 s (averaging time 2.5 s and recording time 2.5 s). The backscatter lidar observations are used to define a cloud mask in the direction perpendicular to the aircraft trajectory. In this direction, the signal was distinguishable from noise up to a distance of about 8 km in clear-sky conditions. However, this range was reduced in the presence of strong scattering, for instance from thick clouds. This means that during the R-patterns, as the aircraft was flying rectangles of about 120 km (along track) × 20 km (cross track), the lidar was able to sample most of the rectangle area unless thick clouds within the rectangle extinguished the lidar signal at some distance from the aircraft.
Both aerosol and cloud products have been derived from the ALiAS observations, and the data are distributed as a set of   (Fig. 9).

Horizontal radar measurements (BASTA)
To characterize the cloudiness in synergy with the lidar, a horizontally staring cloud radar named BASTA (Bistatic Radar System for Atmospheric Studies) was mounted on the right-hand side of the ATR (Table 9). BASTA is a 1 W bistatic FMCW (Frequency Modulated Continuous Wave) 95 GHz Doppler cloud radar developed from the ground-based BASTA system (Delanoë et al., 2016). It was used in an aircraft for the first time during EUREC 4 A, with two antennas of 20 cm (0.95° beamwidth) installed in back lateral windows of the ATR (Table 9). The radar was operated in two modes, one after the other, at 12.5 m and 25 m range resolutions with 0.5 and 1 s time resolutions, respectively, which led to a measurement in a given mode every 1.5 s. The maximum range was 12 km with an ambiguous velocity of 9.85 m s −1 for both modes. The minimum detection range is about 80 m from the aircraft due to coupling between the two antennas.
The Level 1 BASTA product contains, for both modes, the calibrated and range-corrected radar reflectivity, the Doppler velocity and a mask distinguishing the meteorological target from background noise and surface echoes. The calibration of the radar has been derived from other field campaigns and confirmed using in-situ data (by calculating a reflectivity from the CDP and 2D-S cloud particle data and comparing it with radar measurements in cloudy conditions). The sensitivity of the radar is estimated at around -35 dBZ at 1 km. Level 2 data are the most elaborated product: the reflectivity is corrected for gaseous attenuation using colocated information from dropsonde temperature, humidity and pressure. A parameterisation of liquid attenuation for both cloud and precipitation as a function of reflectivity was derived from in-situ data and applied to correct the reflectivity for liquid attenuation. The corrected reflectivity is then used to distinguish cloud areas from drizzle or rain (section 3.5.3). The radar Doppler velocity is corrected for aircraft motion and folding using gate-to-gate correction. All data are available as self-documented NetCDF files.

Combined lidar-radar measurements (BASTALIAS)
Based on ALiAS and BASTA data, a combined dataset was developed that takes advantage of the lidar-radar synergy and complementarity to improve the detection of clouds, drizzle and rain (Fig. 12).
For this purpose, the two modes of the BASTA radar products are merged on a single horizontal grid (resolution of 12.5 m within the first 200 m from the aircraft, and 25 m beyond this distance), and a single time resolution (1.5 s). Then the reflectivity is corrected for liquid and gas attenuation and the radar sensitivity is defined as a function of the distance from the aircraft. A first classification of hydrometeors is then made on the basis of radar observations. As the reflectivity associated with the presence of a remote hydrometeor depends on the drop diameter, reflectivity thresholds can be used to distinguish cloud droplets from drizzle or rain. The definition of these thresholds differs across ground-based radar studies; the threshold distinguishing clouds from drizzle (Z d ) is often set at -20 dBZ (e.g. Kato et al. (2001)), but it can also be set at -17 dBZ or -15 dBZ. The BASTALIAS dataset thus considers three options for the definition of cloud droplets, associated with each of these thresholds. The threshold distinguishing drizzle from rain is set at Z r = 0 dBZ.
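The threshold-based classification can be written compactly as below. This is a sketch for a single radar gate in which a hydrometeor echo has already been detected; the function name and the treatment of values exactly equal to a threshold are assumptions, not a description of the BASTALIAS code.

```python
def classify_reflectivity(z_dbz, z_d=-20.0, z_r=0.0):
    """Classify a detected radar echo from its attenuation-corrected reflectivity.

    z_d: cloud/drizzle threshold in dBZ (-20, -17 or -15 in the text).
    z_r: drizzle/rain threshold in dBZ (0 in the text).
    Assumption: a value equal to a threshold falls in the upper class.
    """
    if z_dbz < z_d:
        return "cloud"
    if z_dbz < z_r:
        return "drizzle"
    return "rain"
```

Calling the function with z_d set to -17 or -15 dBZ reproduces the two alternative definitions of cloud droplets mentioned in the text: an echo of -18 dBZ is labelled drizzle with the default threshold and cloud with z_d = -15 dBZ.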
To assess the ability of these reflectivity thresholds to distinguish between cloud, drizzle and rain situations, we calculate the reflectivity Z PMA that would correspond to the drop size distribution of the PMA dataset. This is done using the T-matrix approach and accounting for the beam orientation and for the non-sphericity of large particles. For the smallest particles, Z PMA follows Rayleigh theory and is equal to 10 log 10 (M 6 ), where M 6 is the 6th moment of the drop size distribution. The distribution of Z PMA values for situations classified by the PMA microphysical masks (based on LWC and D measurements, section 3.4.2) as cloud-only, drizzle-only or rain-only shows that clouds and rain are mainly associated with reflectivities lower than -20 dBZ and larger than 0 dBZ, respectively (Fig. 13), which supports the Z d and Z r thresholds used in the BASTALIAS dataset. Reflectivities between -20 and 0 dBZ are predominantly associated with drizzle. However, drizzle is associated with a broader range of reflectivities and therefore its identification from reflectivity thresholds remains imperfect.
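The Rayleigh limit mentioned above can be illustrated with a few lines of code. The sketch assumes the usual radar-meteorology convention in which M 6 is expressed in mm 6 m −3 , so that 10 log 10 (M 6 ) is in dBZ; it does not reproduce the T-matrix calculation used for the larger drops, and the function name is hypothetical.

```python
import math


def rayleigh_reflectivity_dbz(diameters_um, conc_cm3):
    """Reflectivity (dBZ) in the Rayleigh limit: Z = 10 log10(M6).

    M6 is the 6th moment of the size distribution, summed over bins,
    with diameters in mm and concentrations in m-3 (assumed convention).
    """
    m6 = 0.0
    for d_um, n_cm3 in zip(diameters_um, conc_cm3):
        d_mm = d_um * 1e-3
        n_m3 = n_cm3 * 1e6
        m6 += n_m3 * d_mm ** 6
    return 10.0 * math.log10(m6) if m6 > 0.0 else float("-inf")
```

With this convention, 100 cm −3 of 20 µm droplets give about -21.9 dBZ, just below the -20 dBZ cloud threshold, while a single 1 mm drop per cubic metre gives 0 dBZ, the drizzle-rain threshold.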
The PMA measurements were performed along the R-patterns flown around the cloud-base level.
Since the sensitivity of the radar decreases as the distance from the aircraft increases, BASTA can only detect clouds within a limited distance from the aircraft; beyond this point, the radar can only detect drizzle or rain. The range over which the radar can possibly detect clouds (D radar cloud ) is determined by the distance at which the expected radar sensitivity corrected for attenuation equals Z d . In the trade-wind boundary layer conditions of EUREC 4 A, the radar could detect clouds over a maximum horizontal distance ranging from 1.5 to 3.5 km (2.2 km on average) for Z d = -20 dBZ and from 2.9 to 6.2 km (3.8 km on average) for Z d = -15 dBZ. On the other hand, drizzle and rain could be detected at any distance up to 12 km if there is no rain in the vicinity of the aircraft.
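To make the definition of D radar cloud concrete, the sketch below solves for the range at which a modelled sensitivity reaches Z d . It rests on several assumptions that are not taken from the text: the minimum detectable reflectivity is taken to grow as 20 log 10 of the range (the standard radar-equation scaling for distributed targets), the -35 dBZ value at 1 km is treated as attenuation-free, and attenuation is represented by a constant two-way coefficient whose value is purely illustrative.

```python
import math


def expected_sensitivity_dbz(range_km, s_1km_dbz=-35.0, two_way_att_db_per_km=0.0):
    """Modelled minimum detectable reflectivity (dBZ) at a given range."""
    return s_1km_dbz + 20.0 * math.log10(range_km) + two_way_att_db_per_km * range_km


def max_cloud_range_km(z_d_dbz=-20.0, s_1km_dbz=-35.0, two_way_att_db_per_km=0.0,
                       min_range_km=0.08, max_range_km=12.0):
    """Range at which the modelled sensitivity equals z_d_dbz (bisection)."""
    def excess(r_km):
        return expected_sensitivity_dbz(r_km, s_1km_dbz, two_way_att_db_per_km) - z_d_dbz

    if excess(min_range_km) >= 0.0:   # clouds never detectable
        return min_range_km
    if excess(max_range_km) <= 0.0:   # clouds detectable over the full range
        return max_range_km
    lo, hi = min_range_km, max_range_km
    for _ in range(60):               # sensitivity increases monotonically with range
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In this simplified model, the cloud detection range shrinks as attenuation increases and grows when a higher Z d is chosen, which is qualitatively the behaviour reported above for the -20 and -15 dBZ thresholds.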
In parallel, the ALiAS lidar data at their original resolution (level 1.5 data from Chazette et al. (2020)) are analyzed to determine the horizontal lidar profile that corresponds to the molecular or aerosol backscatter, to estimate the noise level, and to detect the presence of clouds. The lidar cloud detection methodology used in the BASTALIAS dataset is inspired by that developed for the CALIPSO spaceborne lidar and the airborne LNG lidar (Ceccaldi et al., 2013). Although derived from a different methodology, the lidar-only cloud mask of the BASTALIAS dataset is very consistent with that proposed by Chazette et al. (2020), showing the robustness of the cloud detection from lidar measurements. This information is then used to define a lidar pseudo cloud mask at the same space and time resolution as the radar information (for this purpose, each radar time is associated with the closest lidar observation in time). The cloud detection by the lidar is considered impossible beyond the distance from the aircraft (D lidar cloud ) at which the lidar backscatter signal is completely extinguished or indistinguishable from noise. During EUREC 4 A, D lidar cloud ranged from 0 to 8 km, and was about 5 km on average. Finally, the lidar and radar cloud masks are analyzed jointly to make a final classification of hydrometeors and a lidar-radar cloud mask (Fig. 12). The synergy between lidar and radar is illustrated by two examples of individual radar and lidar profiles along their horizontal line of sight (Fig. 14).
In the first one, derived from a flight (RF05) associated with small very shallow clouds ('Sugar', Table 3), the lidar detects three clouds in a row which are not detected by the radar; beyond D lidar cloud (6.3 km), the backscatter signal is extinguished and the cloud detection becomes impossible. In the second example, from RF11, the radar detects a cloud within the first kilometer, and then drizzle and rain. Wherever the radar detects drizzle or rain, the hydrometeors detected by the lidar are not considered as a cloud in the cloud mask.

Figure 14. Illustration of cloud, drizzle and rain detection by horizontal remote sensing using lidar-radar synergy. The upper panels show the lidar backscatter signal and the lower panels the radar reflectivity [dBZ] together with the expected radar sensitivity, as a function of the distance from the ATR [m]. The maximum distances D lidar cloud and D radar cloud over which cloud detection is possible with the ALiAS lidar or BASTA radar are indicated by green dash-dot and red dotted vertical lines, respectively. The range over which hydrometeor detection is no longer possible with radar or lidar is indicated in orange; it corresponds to the maximum of (D lidar cloud , D radar cloud ). The classification of hydrometeors is reported on the lidar and radar signals: drizzle, rain, clouds detected only by lidar, and clouds detected by radar or both radar and lidar. On Jan 28th, 2020 (RF05) at 23:11:30 UTC, the lidar (ALiAS, upper left panel) detects three areas of strong backscatter along its line of sight, while the radar (BASTA, lower left panel) detects no hydrometeor in the range (0-1.8 km) in which it could possibly detect clouds; the areas of strong lidar backscatter therefore correspond to the presence of thin clouds; beyond 6.3 km, the lidar signal is fully extinguished and cloud detection is no longer possible. The cloud detection from the ALiAS Level 2 and Level 3 (L23) dataset (performed at a horizontal resolution of 15 m, as opposed to 25 m for BASTALIAS, and using the methodology described in Chazette et al. (2020)) is also reported (note that the ALiAS L23 times have to be shifted by −10 s to coincide with those of BASTALIAS). On Feb 5th, 2020 (RF11) at 09:54:15 UTC, the lidar (upper right panel) measures four areas of strong backscatter and is fully attenuated beyond 3 km. The radar reflectivity (bottom right panel) shows that the first area corresponds to the presence of clouds, but that the following areas correspond to the presence of drizzle or rain (the cloud-drizzle and drizzle-rain transitions are defined by reflectivity thresholds, set here at -20 dBZ and 0 dBZ, respectively). In this case, no cloud can be detected beyond 3 km. This case is from an R-pattern flown near the inversion level; no cloud mask is available from the ALiAS L23 dataset on R-patterns above cloud-base.
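The way the two instruments are combined in these examples can be summarised by a simplified decision rule for one range gate. The sketch below is an illustration built from the rules stated in the text and the labels of the Fig. 14 legend; it is not the BASTALIAS algorithm, and the function and argument names are hypothetical.

```python
def combine_gate(radar_class, lidar_cloud, distance_m, d_lidar_cloud_m, d_radar_cloud_m):
    """Lidar-radar classification of one range gate.

    radar_class: None (no echo), "cloud", "drizzle" or "rain".
    lidar_cloud: True if the lidar cloud mask is set for this gate.
    d_lidar_cloud_m / d_radar_cloud_m: maximum distances over which cloud
    detection is possible with the lidar / the radar.
    """
    # Radar drizzle or rain overrides any lidar detection.
    if radar_class in ("drizzle", "rain"):
        return radar_class
    if radar_class == "cloud":
        return "cloud (radar)"
    if lidar_cloud and distance_m <= d_lidar_cloud_m:
        return "cloud (lidar only)"
    # Beyond both limits, neither instrument can tell whether a cloud is present.
    if distance_m > max(d_lidar_cloud_m, d_radar_cloud_m):
        return "cloud detection not possible"
    return "clear"
```

In an RF05-like situation (D lidar cloud = 6.3 km, D radar cloud = 1.8 km), a lidar detection at 2 km with no radar echo is labelled as a lidar-only cloud, while any gate beyond 6.3 km without a radar echo is labelled as "cloud detection not possible".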

Vertical radar measurements (RASTA)
RASTA (RAdar SysTem Airborne) is an up-looking pulsed 95 GHz Doppler cloud radar with two 30 cm antennas (zenith and up-backward, the latter with an elevation of 66.7°; Table 9). The radar was dedicated to the characterization of cloud microphysics and dynamics. It was operated at 30 m resolution with a maximum range of 6 km at 1 s integration. Both Doppler moments (reflectivity and velocity) and spectrum are available. As for BASTA, the radar reflectivity is range-corrected and calibrated, and the background noise is removed using a thresholding technique based on the background noise characteristics. The derived mask is refined using image processing. The reflectivity is corrected for gaseous attenuation using colocated information from dropsonde temperature, humidity and pressure. Once the Doppler velocity is unfolded and corrected for aircraft motion, and when the backward and zenith antennas are simultaneously available, the vertical velocity and the along-track wind components of the cloud/precipitation wind are retrieved: two antennas allow us to retrieve the two components of the wind in the plane defined by the two antennas.
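The geometry behind the two-antenna retrieval can be sketched as follows. The sign conventions are assumptions made for this illustration (radial velocities positive away from the radar, along-track axis positive toward the aircraft nose, vertical axis positive upward), and both beams are assumed to sample the same hydrometeor motion; the operational RASTA retrieval may be formulated differently.

```python
import math

ELEVATION_DEG = 66.7  # elevation of the up-backward antenna (from the text)


def retrieve_wind(v_zenith, v_backward, elevation_deg=ELEVATION_DEG):
    """Return (along_track, vertical) velocities from two radial velocities.

    Inputs are Doppler velocities already unfolded and corrected for
    aircraft motion. The vertical component includes the hydrometeor
    fall speed.
    """
    el = math.radians(elevation_deg)
    w = v_zenith
    # Backward beam unit vector in (along-track, vertical): (-cos(el), sin(el)),
    # so that v_backward = -u * cos(el) + w * sin(el).
    u = (w * math.sin(el) - v_backward) / math.cos(el)
    return u, w
```

Because the backward beam is only about 23° away from the zenith, the along-track component is obtained by dividing by cos(66.7°), roughly 0.4, so errors in either radial velocity are amplified by a factor of about 2.5 in the retrieved along-track wind.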
Level 2 data are distributed as a set of NetCDF files for the flights during which the radar was operating and clouds were detectable with the radar (RF03, RF04, RF11, RF12, plus all flights from RF13 to RF19). For all these flights but two (RF11 and RF12), two antennas were working (zenith and up-backward), which allows us to derive wind information (its radial component) in addition to cloud information. For RF11 and RF12, only one antenna (zenith) was working and therefore the wind information is not available. Fig. 15 shows the reflectivity and Doppler velocity measured in the vertical and radial directions by the RASTA and BASTA radars during a leg of RF17. During this flight, the height of precipitating cloud tops could exceed 3 km. The vertical and radial reflectivity structures tend to reflect each other, suggesting a well-defined geometry (or aspect ratio) of the clouds. The vertical structure of the Doppler velocity from RASTA exhibits a maximum positive velocity near cloud top and negative velocities in the parts of clouds that are associated with falling hydrometeors (rain or drizzle).

Water stable isotopes (Picarro)
In addition to characterizing the meteorological, turbulent, microphysical, cloud and radiative properties of the atmosphere, the ATR measured the water isotopic composition of the atmosphere using a customised fast response cavity ring-down spectrometer from Picarro (version L2130-i). This effort took place as part of a wider EUREC 4 A-iso initiative involving multiple platforms and instruments (Bailey et al., submitted). The rationale for isotopic measurements is that by quantifying the relative content of isotopically heavy (¹H²H¹⁶O, ¹H₂¹⁸O) and light (¹H₂¹⁶O) water molecules in the atmosphere, it is possible to get information about the transport, mixing and phase changes of water. Isotopically heavy water molecules are associated with lower saturation vapor pressures and smaller diffusion velocities than their most abundant, lighter counterparts. Therefore, the three main components of the boundary layer moisture budget, namely ocean evaporation, convective drying and moistening by hydrometeor evaporation, carry a distinct stable water isotope signature (Risi et al., 2019). Specificities in the water vapor cycling associated with different mesoscale cloud organisation patterns therefore result in characteristic isotopic fingerprints (Aemisegger et al., 2021b). Whether these fingerprints are primarily due to the local processing of water vapor in the marine boundary layer, or result from the interaction with the large-scale flow, is one of the questions to be addressed with water isotope tracers.

The isotopic composition of atmospheric water vapor on board the ATR was measured with a sampling frequency of 1 Hz (Table 9). The CRDS system uses laser absorption spectroscopy as a working principle: the different isotopic molecules having different rotational-vibrational energy level structure, they exhibit different transition frequencies in the near-infrared region of the spectrum. Three nearby absorption peaks in the near-infrared region (7199-7200 cm −1 ) corresponding to the three molecules jected through a semi-transparent mirror into a 35 cm 3 cavity with three mirrors in ring configuration (Crosson, 2008). A photodetector is placed behind another mirror and measures the light intensity leaking out of the cavity. The isotope concentration is determined by measuring the exponential ring-down time of the laser intensity after the laser source has been switched off. The higher the heavy isotope concentration, the faster the decay of the laser intensity.
In the ATR, a rearward-facing 30 cm long stainless-steel inlet with 1/4-inch outer diameter was fitted to one of the front windows on the right-hand side of the aircraft (Fig. 7, Table 6). Recent studies have indicated that the precision of laser spectrometers in laboratory settings is comparable to that of conventional isotope ratio mass spectrometer systems. However, for atmospheric field applications, the overall measurement uncertainty can result from a range of factors such as calibration, sensitivity to variations in water concentration, and retention effects from the tubing (Aemisegger et al., 2012). A detailed post-processing procedure was therefore applied to account for these factors. In particular, a two-stage correction procedure following Weng et al. (2020) was applied at water vapor mixing ratios lower than 10 000 ppmv to correct for a known concentration bias in laser spectrometric isotope measurements. The water vapor mixing ratio measurement from the CRDS system was calibrated based on a linear correction determined in the laboratory using a dew point generator. More details on the post-processing are available in Bailey et al. (submitted). The dataset is distributed on AERIS as an ensemble of self-documented NetCDF files (Aemisegger et al., 2021a).

Consistency among observations
The ATR measured humidity, winds and clouds with multiple instruments based on different observation techniques. This redundancy and/or complementarity is an opportunity in several respects. It is an asset for the quality control of the data from each instrument and for the processing of combined datasets taking advantage of the complementarity of the instruments. It also allows the robustness and the statistical representativeness of the measurements to be assessed. This last point is particularly important for EUREC 4 A, as the experiment was designed on the premise that the relationships between clouds and their environment could be characterized by combining measurements from several instruments and/or observing platforms that
The objective of this section is to verify this premise by comparing some of the main ATR measurements made by different instruments using different techniques and/or samplings. We also assess the consistency between the ATR measurements and the simultaneous dropsonde measurements or BCO ground-based observations (Stevens et al., 2016).

On board the ATR, humidity was measured by several instruments, but the WVSS-II sensor was considered as a reference for the calibration of the SAFIRE-CORE and SAFIRE-TURB datasets (Brilouet et al., 2021) because of its reliability and because it was the least affected by the presence of condensation or very dry air (section 3.1.4). The Picarro CRDS measured water vapor with a similar sampling, and its data were calibrated on the basis of laboratory measurements (section 3.6). During most ATR flights, HALO (or, on Feb 11th, the P-3) was flying circles of 200 km diameter at high altitude, measuring water vapor every 5 min and with a vertical resolution of about 10 m with Vaisala RD-41 dropsondes.
The comparison between these different measurements is presented in Fig. 16 for each ATR flight. For the SAFIRE-CORE and Picarro measurements, the mean and standard deviation of the water mixing ratio are calculated over all the 'T-shortlegs' segments (Table 2) associated with a given kind of segment. Note that the legs flown around the middle and the top of the subcloud layer have been considered together because the subcloud layer is well mixed vertically (Albright et al., in preparation). For dropsondes, they are calculated over all available level-4 measurements in a layer between the minimum altitude minus 50 m and the maximum altitude plus 50 m sampled by the ATR for a given pattern. Most of the sounding data within the EUREC 4 A circle are derived from HALO dropsondes; these data include a correction for the dry bias of these dropsondes.
For each flight and each pattern, the ATR measurements (SAFIRE-CORE and Picarro) generally exhibit a good agreement, both in terms of mean humidity and standard deviation: over the cloud-base rectangles (R-patterns), the mean discrepancy between the two datasets is 0.084 g kg −1 (0.63 %) and 0.21 g kg −1 (18.7 %) for the mean and standard deviation, respectively. These differences are slightly larger on L-patterns (0.27 g kg −1 or 1.85 % and 0.21 g kg −1 or 29.4 %, respectively) and on S-patterns close to the surface (0.28 g kg −1 or 1.86 % and 0.13 g kg −1 or 33.5 %). The most notable exceptions occur on Feb 11 2020 (RF17 and RF18), when the aircraft flew within or below precipitating clouds (Table 4): along a few legs, the quality of CRDS measurements was affected by the presence of cloud droplets or precipitation in the air inflow system.
This comparison suggests that despite their different sampling and observing techniques, the ATR and HALO generally measured statistically consistent variabilities of humidity around cloud-base, within the subcloud layer, and near the surface. The main discrepancies occurred when the scale of the cloud field organization was much larger than the scale of the area probed by the ATR. In these cases, the differences are likely to be representative of real spatial differences associated with different samplings.

Cloud-base cloud fraction
One of the original motivations for the EUREC 4 A campaign was to test the mixing-desiccation mechanism by which increased convective mixing in the lower troposphere dries the atmosphere around cloud base and reduces cloudiness (Bony et al., 2017). This mechanism, which has been shown to contribute to the strong positive feedback of low clouds and the high climate sensitivity of a number of climate models, remains to be tested observationally. Such a test requires measuring the cloud fraction at cloud base CF b together with the lower-tropospheric mixing from convection and larger-scale vertical motions. Reflecting the view that clouds are at once bodies interacting with radiation, collections of particles, and a particular state of atmospheric water (Siebesma et al., 2020), we estimate CF b in different ways, using various observations ranging from lidar and radar remote sensing, to in-situ microphysical measurements (defining the cloud mask either from the cloud particle properties directly or from the equivalent radar reflectivity calculated from these properties), to high-frequency humidity measurements.

Using horizontal lidar-radar measurements from ALiAS and BASTA together with the BASTALIAS cloud detection algorithm described in section 3.5.3, we diagnose the cloud fraction within the rectangle area associated with the R-patterns flown at cloud base: for each R-pattern, we divide the total number of points classified as 'cloudy' along the instruments' line of sight by the total number of points where a cloud detection is possible. The resulting time series of CF b is shown in Fig. 18, which uses a reflectivity threshold for the cloud-drizzle transition of -20 dBZ. Using a different threshold (-17 dBZ or -15 dBZ) makes very little difference in the time series (not shown). CF b is small on average (3.5 %) but it ranges from 0 to 6 % across flights. Minima in CF b occurred during RF09 and RF14, when the ATR was flying within the clear-sky area of a field of 'cloud flowers' organised at the mesoscale (Table 4).
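The ratio described above can be sketched for a list of classified points. The label strings follow the Fig. 14 legend and are used here only for illustration; the function name and the handling of an empty sample are assumptions.

```python
CLOUD_LABELS = {"cloud (radar)", "cloud (lidar only)"}
NOT_POSSIBLE = "cloud detection not possible"


def cloud_base_cloud_fraction(labels):
    """Cloud fraction = cloudy points / points where cloud detection is possible.

    labels: classification of every (time, range) point of an R-pattern.
    Returns NaN when no point allows cloud detection.
    """
    valid = [label for label in labels if label != NOT_POSSIBLE]
    if not valid:
        return float("nan")
    cloudy = sum(1 for label in valid if label in CLOUD_LABELS)
    return cloudy / len(valid)
```

Excluding the points where detection is impossible from the denominator matters: counting them as clear would bias the cloud fraction low whenever thick clouds or rain extinguish the lidar signal close to the aircraft.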
Most of the cloud fraction (from 60 to 100 %) is composed of clouds which were detected by the lidar only (Fig. 18). As explained in section 3.5.3, this is because a large proportion of clouds in the trades are small (a few hundred meters) and optically thin (especially at cloud base, where the liquid water content is small), and because in the trade-wind boundary layer, horizontal radar measurements can only detect clouds over a range of 2-3 km from the aircraft, while horizontal lidar measurements can detect clouds over a distance at least twice as large.
Using reflectivities at the 5th gate (i.e. about 90 m above the aircraft) and a threshold of -20 dBZ to distinguish clouds from drizzle, a cloud fraction can also be derived from the vertically-pointing cloud radar measurements (RASTA, section 3.5.4). The CF_b estimates from RASTA are in good agreement with those from BASTALIAS, except on RF17 and RF19, during which RASTA measurements seem to be dominated by rain.
A cloud fraction estimate can also be diagnosed from in-situ measurements of cloud microphysics (section 3.4.2) using two methods. The first one uses the PMA hydrometeor masks, with the cloud mask defined as LWC ≥ 0.010 g m⁻³ and D < 100 µm, and the drizzle mask as 100 ≤ D < 500 µm. We then compute a cloud fraction along the aircraft trajectory as the ratio of the number of points classified as 'cloud-only' to the total number of valid measurements. Although the sampling along the aircraft trajectory is much less extensive than that of horizontal lidar-radar measurements, the time evolution of the CF_b derived from PMA data is highly correlated (R = 0.90) with that from BASTALIAS (Fig. 18).

Figure 18. (top) [...] using a hydrometeor classification based either on (D, LWC) or on reflectivities (Z) inferred from the particle size distribution, from in-situ humidity measurements at the turbulence scale (RH from SAFIRE-TURB), and from the vertically-pointing RASTA radar. (middle) Comparison of the fractional area covered by "clouds only" (in black) or by "clouds+drizzle" (in color) inferred either from the BASTALIAS lidar-radar dataset or from the PMA dataset using both types of hydrometeor classification. Note that the time axis is not linear. (bottom) Correlation (R) and linear regression coefficient (S) between the CF_b estimates derived from different ATR datasets.

[...] compute a cloud fraction considering both clouds and drizzle, diagnosed either from BASTALIAS data or from PMA cloud microphysical data. The cloud+drizzle fraction differs from the cloud-only fraction mostly on Feb 11, during which the ATR flew during night time (RF17) or in the morning (RF18) in the presence of cloud flowers and a strong ascending motion in the free troposphere (Table 2.5). However, even in the presence of drizzle and rain, estimates from the two measurements are very consistent with each other (R = 0.90, Fig. 18).
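The (D, LWC) classification and the resulting along-track fractions can be sketched as follows. This is an illustration only: it assumes one LWC value and one characteristic diameter D per sample, treats samples with missing values as invalid, and applies the LWC threshold to the cloud mask only. These choices, and all names in the code, are assumptions rather than the PMA processing itself.

```python
import numpy as np

LWC_MIN = 0.010        # g m^-3, cloud threshold quoted in the text
D_CLOUD_MAX = 100.0    # µm, upper diameter bound of the cloud mask
D_DRIZZLE_MAX = 500.0  # µm, upper diameter bound of the drizzle mask

def pma_cloud_fractions(lwc, diameter):
    """Return (cloud-only fraction, cloud+drizzle fraction) along the track."""
    lwc = np.asarray(lwc, dtype=float)
    diameter = np.asarray(diameter, dtype=float)
    valid = np.isfinite(lwc) & np.isfinite(diameter)  # assumed validity test
    n_valid = np.count_nonzero(valid)
    if n_valid == 0:
        return np.nan, np.nan
    cloud = valid & (lwc >= LWC_MIN) & (diameter < D_CLOUD_MAX)
    drizzle = valid & (diameter >= D_CLOUD_MAX) & (diameter < D_DRIZZLE_MAX)
    cf_cloud = np.count_nonzero(cloud) / n_valid
    cf_cloud_drizzle = np.count_nonzero(cloud | drizzle) / n_valid
    return cf_cloud, cf_cloud_drizzle

# Synthetic samples (not campaign data).
lwc = [0.0, 0.02, 0.05, 0.03, np.nan]
diameter = [10.0, 20.0, 150.0, 600.0, 30.0]
cf_cloud, cf_cloud_drizzle = pma_cloud_fractions(lwc, diameter)
```

In this synthetic case, four samples are valid, one of them is cloud and one is drizzle, so the two fractions are 0.25 and 0.5.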
Finally, recognizing that clouds occur in saturated (or, in the presence of sea salt, nearly saturated with respect to pure water saturation) conditions, it is possible to define a pseudo cloud fraction from the high-frequency (25 Hz), small-scale (4 m) measurements of relative humidity: using the SAFIRE-TURB dataset, we reconstruct high-resolution time series of humidity mixing ratio, temperature and then relative humidity by adding the turbulent fluctuations of each variable measured over stabilized segments (T-shortlegs) to the mean of each segment (section 3.3; Brilouet et al., 2021). Then, by counting the proportion of measurements having a relative humidity exceeding a threshold RH_c, we define a pseudo cloud fraction from the SAFIRE-TURB dataset. The time series of CF_b obtained with this method using a threshold RH_c = 0.98 is in good agreement and correlates well with the cloud fraction estimated from BASTALIAS (R = 0.76) or PMA data (R = 0.71). The best correlation with BASTALIAS and PMA data is obtained during the second half of the campaign (after RF09), when the high-frequency measurements of humidity were of best quality (Brilouet et al., 2021).
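The reconstruction and thresholding steps can be sketched as below. The sketch makes several assumptions that are not stated in the text: a single pressure value per segment, the Bolton (1980) approximation for saturation vapour pressure over liquid water, and relative humidity taken as the ratio of mixing ratio to saturation mixing ratio. Function and variable names are hypothetical.

```python
import numpy as np

EPSILON = 0.622  # ratio of the gas constants of dry air and water vapour

def saturation_mixing_ratio(temp_k, pressure_hpa):
    """Saturation mixing ratio (kg/kg), Bolton (1980) approximation (assumed)."""
    t_c = np.asarray(temp_k, dtype=float) - 273.15
    e_s = 6.112 * np.exp(17.67 * t_c / (t_c + 243.5))  # hPa
    return EPSILON * e_s / (pressure_hpa - e_s)

def pseudo_cloud_fraction(r_mean, r_fluct, t_mean, t_fluct,
                          pressure_hpa, rh_c=0.98):
    """Fraction of high-rate samples whose relative humidity exceeds rh_c."""
    # Rebuild the full series as segment mean plus turbulent fluctuation.
    r = r_mean + np.asarray(r_fluct, dtype=float)  # mixing ratio, kg/kg
    t = t_mean + np.asarray(t_fluct, dtype=float)  # temperature, K
    rh = r / saturation_mixing_ratio(t, pressure_hpa)
    valid = np.isfinite(rh)
    if not valid.any():
        return np.nan
    return np.count_nonzero(rh[valid] > rh_c) / np.count_nonzero(valid)

# Synthetic segment (not campaign data): relative humidity of 0.9, 0.99, 1.0 and 0.5.
r_sat = saturation_mixing_ratio(293.15, 950.0)
cf_pseudo = pseudo_cloud_fraction(0.9 * r_sat,
                                  r_sat * np.array([0.0, 0.09, 0.10, -0.4]),
                                  293.15, np.zeros(4), 950.0)
```

Two of the four synthetic samples exceed the threshold of 0.98, which gives a pseudo cloud fraction of 0.5.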
The values of CF_b derived from the different datasets depend to some extent on the thresholds used to define cloudy conditions. However, sensitivity tests suggest that the high correlation among the different estimates holds over a range of threshold values. Considering that a diversity of measurement techniques (in-situ microphysical and turbulent measurements, horizontal lidar-radar measurements, vertical radar measurements) and spatial samplings (rectangle perimeter or rectangle area) lead to consistent results, the CF_b estimates from the ATR can be considered robust.
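A comparison of two per-flight CF_b series of the kind summarized by R and S can be sketched as follows. The sketch assumes that R is the Pearson correlation coefficient and that S is the slope of an ordinary least-squares fit of one estimate against the other; the text does not specify the regression method, and the numbers below are synthetic.

```python
import numpy as np

def compare_estimates(x, y):
    """Return (Pearson R, least-squares slope of y against x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    valid = np.isfinite(x) & np.isfinite(y)  # keep flights present in both series
    r = np.corrcoef(x[valid], y[valid])[0, 1]
    slope, _intercept = np.polyfit(x[valid], y[valid], 1)
    return r, slope

# Synthetic per-flight cloud fractions in % (not campaign data).
cf_a = np.array([1.0, 2.0, 3.0, 4.0, np.nan])
cf_b_other = np.array([3.0, 5.0, 7.0, 9.0, 2.0])
r, s = compare_estimates(cf_a, cf_b_other)
```

Repeating the comparison for several threshold values is one way to check how sensitive the correlation is to the definition of cloudy conditions.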

Water isotopic composition
To assess the consistency of isotopic data between the ATR Picarro dataset and the ground-based measurements from the BCO (Fig. 19), we select the data collected at altitudes lower than 400 m (this height is well below cloud base for all flights and contains the ground-based measurements at the airport) and compute the mean value for each flight (ATR_alt≤400m). The BCO data are averaged over each flight's period (BCO_flights). The vapor sampled by the Picarro onboard the ATR was drier (-0.7 g kg⁻¹) and more depleted (-1 ‰ in δ¹⁸O and -4.3 ‰ in δ²H) than at the BCO (Table 10). The d-excess was nearly identical except for RF04, RF07 and RF08, for which the d-excess at the BCO was lower. Similar differences between the BCO and the R/V Atalante were recorded during a comparison stop offshore the BCO (Gilles Reverdin, personal communication). A possible explanation for the observed differences is sea spray evaporation due to the wave activity at the cliff in front of the BCO, which enriches and moistens the air close to the land-based site. The flight-to-flight isotope variability recorded by the ATR agrees well with that observed at the BCO: correlations between ATR_alt≤400m and BCO_flights range from 0.7 to 0.82 (Table 10). Due to their spatial separation, the instruments did not measure the exact same air or, if they did, measured it with a time lag due to advection. Therefore, the qualitative and quantitative match between the datasets suggests that the measurements are of good quality, and that distinct mesoscale environmental conditions were measurable during the different research flights.

Figure 19. Comparison of the ATR isotope measurements below 400 m (in blue) with the ground-based measurements from the Barbados Cloud Observatory (BCO) during each ATR flight (in black). The comparison is done (from top to bottom) for δ¹⁸O, δ²H and d-excess (expressed in ‰), for specific humidity (in g kg⁻¹), and for the length of measurements (expressed in min).
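The per-flight averaging below 400 m can be sketched as follows. The sketch assumes flat arrays of flight identifiers, altitudes and isotope values, and it uses the conventional definition of deuterium excess, d = δ²H − 8 δ¹⁸O, which is not restated in the text. All names and numbers are hypothetical.

```python
import numpy as np

def d_excess(delta_2h, delta_18o):
    """Deuterium excess in permil, conventional definition (assumed here)."""
    return np.asarray(delta_2h, dtype=float) - 8.0 * np.asarray(delta_18o, dtype=float)

def low_level_flight_means(flight_id, altitude_m, values, max_alt_m=400.0):
    """Mean of `values` per flight, restricted to altitudes up to max_alt_m."""
    flight_id = np.asarray(flight_id)
    altitude_m = np.asarray(altitude_m, dtype=float)
    values = np.asarray(values, dtype=float)
    means = {}
    for fid in np.unique(flight_id):
        sel = (flight_id == fid) & (altitude_m <= max_alt_m) & np.isfinite(values)
        if sel.any():
            means[str(fid)] = float(values[sel].mean())
    return means

# Synthetic samples (not campaign data).
flights = ["RF03", "RF03", "RF03", "RF04", "RF04"]
altitude = [100.0, 300.0, 800.0, 200.0, 1500.0]
delta_18o = [-10.0, -11.0, -15.0, -9.5, -20.0]
atr_means = low_level_flight_means(flights, altitude, delta_18o)
```

The resulting per-flight means could then be compared with ground-based averages taken over the same flight periods, for instance through their mean difference and correlation.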
their respective DOIs are summarized in Table 11.

Summary and conclusions
The EUREC⁴A field campaign, which aims at better understanding the link between clouds and circulation in the region of the trades, based its core experimental strategy on the coordinated operations of two research aircraft (Bony et al., 2017; [...]): the French ATR aircraft flying in the lower troposphere and the German HALO aircraft flying at an altitude of 9-10 km. This paper describes the EUREC⁴A ATR operations and the 19 ATR flights (totaling approximately 82 flight hours) that took place from Jan 25 to Feb 13, 2020 over the tropical Atlantic Ocean, east of Barbados.
The ATR mission focused on characterizing the thermodynamic, dynamical, microphysical, turbulent and cloud properties of the lower atmosphere. One of its specific roles was to measure the cloud fraction around cloud base to help test low-cloud feedback mechanisms. For this and other purposes, the ATR was equipped with a rich and extensive instrumentation composed of in-situ sensors, radiometers and active remote sensing. Eighteen coordinated research missions followed a repeated flight plan consisting of rectangles (R-patterns) flown at cloud base or cloud top, L-legs flown within the subcloud layer (L-patterns), straight legs flown 60 m above the sea surface (S-patterns), and ferry legs flown in the lower free troposphere above clouds.
The first part of this paper presents the ATR operations, the flight patterns and the flight segmentation (summarized in a collection of YAML files). It also shows that during its 19 missions, the ATR sampled highly contrasting environmental conditions.
The second part of this paper presents the ATR instrumentation used during EUREC⁴A: 3 temperature sensors, 5 humidity sensors, 2 broadband radiometers, an infrared spectrometer, 2 visible cameras, 6 microphysical probes, a horizontally-staring backscatter lidar, 2 Doppler cloud radars (1 pointing horizontally and 1 pointing vertically), and a laser spectrometer for water isotopologues. The paper presents the different instruments, highlighting the complementarity that results from their differ[...]

Finally, the paper assesses the consistency among the different ATR measurements, and between the ATR measurements and those performed by other instruments on different platforms such as HALO or the BCO.
The large variability of the aerosol load in the atmosphere (ranging from 50 to more than 500 cm⁻³) is measured consistently by the ALiAS lidar and by UHSAS microphysical probes (sections 3.4.1 and 3.5.1). The measurements of humidity, wind and cloud-base cloud fraction also exhibit a good consistency among the different ATR datasets. The mean specific humidity measured by in-situ sensors differs from that measured by the ATR Picarro laser spectrometer by less than 0.1 g kg⁻¹ at cloud base, and by less than 0.3 g kg⁻¹ within the subcloud layer; larger disagreements occur on RF17 and RF18, when the Picarro measurements were impacted by cloud droplets and precipitation in the [...] (i.e. along the rectangle perimeter, while horizontal lidar-radar remote sensing samples the interior of the rectangle) shows that the ATR measurements of humidity, wind and cloud-base cloud fraction are robust.

[...] measurements from the ATR and from HALO are highly correlated (R = 0.98) and differ by only 0.7 ± 0.5 m s⁻¹; the ATR humidity measurements are also in good agreement with the dropsonde data, both in terms of mean and variability. This shows that the measurements made by HALO and the ATR are consistently representative of the explored area, despite the complexity of the cloud organization and its inner heterogeneity.
These results thus verify two premises which were at the basis of the EUREC⁴A experimental strategy: 1) it is possible to measure the cloud-base cloud fraction in a robust way, and 2) the repeated flight patterns of HALO and the ATR allow us to sample the atmosphere statistically in a consistent way, except when the cloud field is organized on a scale much larger than the scale of the ATR flight pattern (which only occurred twice out of the 18 flights). It is therefore legitimate to use observations from the different EUREC⁴A platforms together to carry out process studies. The availability of data from the ATR and other platforms, together with the large diversity of environmental conditions and clouds encountered during the campaign, should thus make it possible to better understand the physical processes underlying the cloud-circulation interactions in the trades.

Appendix B: Satellite movies
Satellite animations were made to visualize the cloud scenes sampled by the ATR and other platforms during the campaign.
Author contributions. The people responsible for the processing of the different ATR datasets are listed in Table 11. SB led the ATR team and coordinated the scientific operations with ML, JD and BS. JCC and JPD led the SAFIRE operations with the support of AB. PC, CF, JT and AB were responsible for the lidar measurements; JD, CLG and CC for the radar measurements; ML, PEB and SAFIRE/TRAMM [...]

Figure D1. Instrumental configuration of the ATR showing the nomenclature used in Tables 5 to 9.