Combined wind lidar and cloud radar for high-resolution wind profiling

This paper introduces an experimental setup for retrieving horizontal wind speed and direction profiles with a high temporal and vertical resolution for process studies and validation of convection-permitting model simulations. The CMTRACE (tracing convective momentum transport in complex cloudy atmospheres) campaign used collocated wind lidar and cloud radar measurements to retrieve seamless wind profiles from near the surface up to cloud tops. It took place in Cabauw, the Netherlands, between 13 September and 3 October 2021. The intermediate processing steps for generating the level 1 and level 2 data, such as second trip echo filtering, offset correction, wind retrieval, re-gridding, and flagging, are described. In level 1 (https://doi.org/10.5281/zenodo.6926483, Dias Neto, 2022a), the data from the lidar and radars are kept in their original spatial and temporal resolution, while in level 2 (https://doi.org/10.5281/zenodo.6926605, Dias Neto, 2022b), they are regridded to a common spatial and temporal resolution. Statistical analyses of the lidar and radar wind speed and direction profiles indicate a correlation higher than 0.95 for both variables. The biases of wind direction and speed calculated between the radar and lidar observations are 0.24° and −0.16 m s−1, respectively. The foreseen initial applications of the datasets include the study of convective momentum transport and its validation in regional weather forecasts and large-eddy simulation hindcasts.


Introduction
Wind is an essential component in nearly every weather phenomenon on Earth through its transport of heat, moisture, and scalars. How winds blow sets patterns of precipitation on large scales through e.g. atmospheric rivers or monsoons (Gimeno et al., 2014; Zemp et al., 2014; Naakka et al., 2019; Gimeno et al., 2020), while on small scales it influences surface heat and moisture fluxes, convection, and cloud development. Large-scale wind is driven by thermal (pressure) gradients, but modified by a range of small-scale processes, including surface drag, momentum transport, and gravity waves. The parametrisation of those small-scale processes in weather and climate models remains uncertain, and persistent wind biases likely related to such processes continue to exist.
Wind observations for data assimilation and model validation are therefore invaluable, but generally limited to the surface layer where meteorological stations (over land) or permanent moorings or buoys (in the ocean) exist. Besides meteorological towers (limited to 200 m) or airborne wind measurements (limited in time), ground-based radars and lidars can be used more routinely to measure wind profiles beyond the surface layer.
The so-called velocity azimuth display (VAD) radar approach for retrieving the wind properties was proposed by Lhermitte (1962), Browning and Wexler (1968), and Lhermitte (1969); in this approach, the mean Doppler velocity (MDV) as a function of azimuth results in a sine curve. Later, Wilson (1970) and Kropfli (1986) used radars to study turbulence in the boundary layer based on the VAD. Weather Doppler radars have also been extensively used by weather services (Chandrasekar et al., 2018; Kumjian, 2018) to monitor winds, mitigate the impact of storms, study the evolution of meteorological systems (e.g. tornadoes, cyclones) (Kosiba et al., 2013), and detect wave structures in the horizontal wind (Miller et al., 2022).
In the last decade, due to the current global transition from fossil fuel power plants to systems based on renewable energy sources, the wind energy industry has grown rapidly. For accurate wind power prediction, understanding the influence of turbulence in the atmospheric boundary layer, as well as topography and surface drag, is crucial. Wind turbulence can strongly affect energy production (Elliott and Cadogan, 1990; Peinke et al., 2004; Clifton and Wagner, 2014) or even damage the wind turbines (Kelley et al., 2006). Studying such processes requires much higher resolution wind observations to understand the temporal and spatial scales on which wind varies, including over the ocean. This has spurred the development and deployment of commercial Doppler wind lidars. Based on VAD radar works, Eberhard et al. (1989) used lidar for studying turbulence for the first time, and since then, lidar has been widely used for this application (Newman et al., 2016; Newman and Clifton, 2017; Mann et al., 2010; Sathe et al., 2015; Smalikho and Banakh, 2017; Bonin et al., 2017). Sathe and Mann (2013) provide a good review of lidar-based experiments.
Recently, observations from wind lidar and weather Doppler radar at low elevations (2°) were combined to extend the range of retrieved horizontal wind (Ritvanen et al., 2022), but few studies combine radar and lidar to investigate the detailed evolution of wind below and within clouds. One exception is the work of Bühl et al. (2015), where the authors combine radar and lidar observations, but only for retrieving the vertical component of the wind.
In this study, we attempt to develop a dataset that uses clouds explicitly to make extended wind profiles throughout the lower atmosphere and not just in the surface layer. Clouds have long helped visualise and quantify the winds at higher levels (e.g. atmospheric motion vectors can be derived from clouds' motion; Velden et al., 2005; Velden and Bedka, 2009; Kishtawal et al., 2009; Cordoba et al., 2017). But how winds themselves are modified by convection remains poorly studied, let alone observed. By measuring winds below and through clouds, we may gain insight into one of the main uncertainties for wind prediction, as highlighted by the numerical weather prediction community: convective momentum transport (CMT), which we broadly define as convectively driven transport of momentum through updrafts, downdrafts, and the cloud-scale or even mesoscale circulations that accompany clouds.
Turbulent eddy-resolving models may lend themselves better than point measurements to the study of momentum transport by providing a three-dimensional view of the multi-scale flow. However, they are limited by the model's periodic boundary conditions, the use of domains smaller than the scales of mesoscale cloud organisation observed in nature, and possible misrepresentations of turbulence and convection. Recent large-eddy simulation (LES) studies with open boundaries and large domains show that horizontal flows can generate substantial momentum fluxes (Dixit et al., 2021), which are not present in more traditional (BOMEX, Siebesma and Cuijpers, 1995; RICO, vanZanten et al., 2011) cases of shallow cumulus convection. They also show that such flows produce so-called counter-gradient transport, whereby the environmental wind shear is enhanced instead of diminished, as it would be by local downgradient turbulent diffusion. The handful of LES studies focusing on CMT have been carried out for convection over oceans, while studies of CMT over land are limited.
Recently, Koning et al. (2021) combined 9 years of wind observations from a 200 m tall tower and LES from a commercialised graphics processing unit (GPU) version of the Dutch atmospheric large-eddy simulation (DALES, Heus et al., 2010) over Cabauw (the Netherlands) to investigate the relationship between different cloud regimes (clear sky, shallow clouds, non-convective clouds) and momentum flux and wind shear in the boundary layer. The authors found that clear-sky and shallow-cloud days have a similar diurnal cycle of near-surface winds, but the further deepening of the convective boundary layer in the presence of clouds can lead to a larger daytime increase in near-surface winds. Furthermore, for a similar atmospheric stability in the surface layer, days with shallow clouds sustained larger surface momentum fluxes for a given wind gradient. The data also suggest that more crosswind momentum fluxes are present within the mixed layer (hinting at more organised cloud or roll cloud structures) compared to days when non-convective clouds are predominant. As part of the Ruisdael Observatory (https://ruisdael-observatory.nl/, last access: 15 November 2022), the Dutch LES will be run daily over the entire Netherlands on a 100 m grid. To accompany and validate these simulations, high-resolution (wind) measurements are invaluable.
This paper introduces an experimental setup where scanning cloud radar and wind lidar observations are combined to retrieve horizontal and vertical wind profiles with the highest possible resolution. Even though the sonic detection and ranging (SODAR) and radar wind profiler (RWP) instruments can have vertical resolutions similar to those of cloud radar and lidar, the SODAR and RWP averaging time is around 10 min, which limits the study of turbulence and convection. By merging radar and lidar observations, this experiment provides, for the first time, continuous profiles of the horizontal wind from near the surface up to cloud tops. The experiment is conducted as part of the Dutch Research Council (NWO)-funded "tracing convective momentum transport in complex cloudy atmospheres experiment" project (CMTRACE) that targets convective momentum transport under different cloud conditions and across different temporal/spatial scales.

Experiment
CMTRACE occurred between 13 September and 3 October 2021, in Cabauw, the Netherlands. It was carried out by the Delft University of Technology and the Royal Netherlands Meteorological Institute, in coordination with the Ruisdael Land-Atmosphere Interactions Intensive Trace-gas and Aerosol measurement campaign (RITA, https://ruisdael-observatory.nl/the-rita-2021-campaign/, last access: 15 November 2022), which took place at the same site, allowing the synergy of combining observations from in situ measurements (ground-based and airborne) with the CMTRACE remote sensing observations. For this experiment, one wind lidar and two cloud radars were installed close to each other; the wind lidar provided information on the sub-cloud layer winds, while the cloud radars provided information on the cloud layer winds. The horizontal distance between them was less than 60 m (Fig. 1). This small separation between the instruments was intended to optimise overlap in sampled volume.

Instruments
The first cloud radar is a dual-frequency (35 and 94 GHz) scanning polarimetric frequency-modulated continuous-wave (FMCW) radar produced by Radiometer Physics GmbH (hereafter CLARA, CLoud Atmospheric RAdar). This radar allows setting different configuration parameters for different range intervals (e.g. range resolution, integration time). During the campaign, three range intervals were used, and the radar settings for each range interval are summarised in Tables 1 and 3. CLARA was operated performing periodic clockwise and anti-clockwise plan position indicator (PPI) scans (azimuths: 0-359.9°) with an elevation angle of 75°; each scan sequence lasted for about 72 s. This high elevation was chosen to minimise the MDV folding effect that could affect the observations if the scatterers' velocities were larger than the Nyquist velocity. The Nyquist velocity for each range interval is listed in Table 3. Although CLARA is a dual-frequency system, only the Doppler velocities from 35 GHz are used, because the Nyquist velocity at 35 GHz is approximately 3 times larger than that at 94 GHz.
The second cloud radar is a single-frequency (94 GHz) scanning polarimetric frequency-modulated continuous-wave (FMCW) radar produced by Radiometer Physics GmbH (hereafter MARA, Mobile Atmospheric RAdar). Although MARA is a scanning-capable radar, it pointed vertically throughout the experiment, providing vertical profiles with a temporal resolution of 1 s. Like CLARA, MARA allows setting different range resolutions for different range intervals, and it was also operated using three range intervals. The configuration parameters for each range interval are listed in Tables 1 and 2.
The wind lidar is a WindCube-200s scanning lidar produced by Vaisala (hereafter WindCube). It is a pulsed system operating at a wavelength of 1.54 µm, and it is capable of scanning at different azimuths and elevations. This lidar also allows defining different integration times (from 0.1 up to 10 s) and range resolutions (from 25 up to 100 m). During the campaign, the WindCube was operated following the six-beam scanning strategy proposed by Sathe et al. (2015), which, according to the authors, provides more information about turbulence than the VAD. The six-beam scanning strategy consists of measuring the radial velocity at five azimuths equally separated by 72° at a specific elevation angle and one additional measurement at 90° elevation (see Sathe et al., 2015 for a complete description). During CMTRACE, the elevation of the slanted measurements was set to 75°. During the campaign, the three instruments operated continuously following the abovementioned strategy. Apart from that, no other scanning strategy was used.

Weather characteristics
The 21 d of measurements were characterised by different weather conditions and a diverse cloud cover. For a rapid overview of the weather conditions and cloud coverage, Table 4 provides the daily estimated duration in hours of low-level clouds (LLCs), mid-level clouds (MLCs), high-level clouds (HLCs), deep convective clouds (DCCs), and stratiform clouds (SCs) (cloud levels are described in Lamb and Verlinde, 2011, chap. 1). The duration of the cloud cover was estimated by visual inspection of data recorded by MARA. Table 4 also provides the maximum and mean precipitation rate (RR) measured by an optical disdrometer, and the daily mean wind speed (WS200) and direction (WD200) derived from the WindCube observations at 200 m above the surface.
Table 4 indicates that most of the days were non-precipitating with little cloud cover (predominantly LLC). During those days, the mean wind speed at 200 m above the ground ranged between 3.6 and 9.6 m s−1, and the wind direction was predominantly from the southwest, but for some days, the wind direction changed to the east. During the precipitating days, DCC and SC clouds were present in addition to LLC, MLC, and HLC. The wind speed for these days ranged between 6.7 and 12.9 m s−1, and the wind direction was predominantly from the southwest, with the exception of days where it changed to the east.
https://doi.org/10.5194/essd-15-769-2023, Earth Syst. Sci. Data, 15, 769-789, 2023

Data processing
The CMTRACE dataset is structured in three levels according to the processing steps applied. These processing levels are designed to facilitate the usage of the CMTRACE dataset, so that users can choose the data level that best suits their needs. The processing steps for each level are summarised in Fig. 2, and they are described in the following sections.

Processing levels
The original data output from the WindCube, MARA, and CLARA is defined as the CMTRACE level 0 dataset. In level 0, the variables available in the dataset are related to the scatterer properties (e.g. backscattered signal, MDV, and spectrum width), and the data from the WindCube are still affected by noise. Note that neither the lidar nor the radar datasets provide the horizontal wind speed and direction observations at this level.
The level 1 processing starts with the removal of artefacts present in the level 0 dataset, as indicated in the flowchart (Fig. 2). A filter based on the WindCube's status variable, similar to that described in the manual, is applied to the WindCube observations to filter out the noise data and non-realistic MDV values. After this filtering step, the WindCube data are still affected by second trip echoes (STEs) produced by clouds at ranges beyond the maximum range sampled by the WindCube. A similar issue was also found in previous experiments (Bonin and Alan Brewer, 2017; Bonin et al., 2017). Panels a and b of Fig. 4 show an example of the MDV recorded by the WindCube at 0° azimuth and the equivalent radar reflectivity (Ze) recorded by MARA on 21 September 2021. The MDV values from below 2 km are continuous and distributed between −2 and 1 m s−1 (Fig. 4a). In contrast, the MDV rapidly changes from −1 to −7 m s−1 between 11:00 and 12:00 UTC at around 2 km. The region with fast velocities extends from 2 up to 6 km in range and appears at different times during the day. For the same period, however, MARA's Ze does not show any cloud below 6 km, and the only clouds with similar shapes appear at altitudes above 8 km. A filter is then applied to minimise the presence of STE in the dataset (see Sect. 3.2). After filtering, the WindCube vertical Doppler velocities are stored as the vertical wind component, and a wind retrieval based on a Fourier transform is applied to the slanted azimuthal MDV observations to retrieve the horizontal wind speed and direction (see Sect. 3.3). In addition to the information related to wind, the backscattered data are also included in WindCube level 1 data.
The level 1 processing applied to the CLARA dataset differs slightly from that applied to the WindCube observations; the radar software internally removes the noise, and for this reason, the artefact removal is skipped. The Fourier transform wind retrieval is also applied to CLARA's slanted azimuthal MDV observations to retrieve the speed and direction of the horizontal wind. Surprisingly, after the retrieval, CLARA's wind profiles also had information from the lowest 2 km, where clouds are not present most of the time (see Sect. 3.4). It was also noticed that an alternating offset affected the wind direction. The magnitude of this offset was estimated using the WindCube wind direction profiles as reference. It was found that each one of the range intervals listed in Table 3 was affected by a different offset (see Sect. 3.5). After the offset correction, CLARA's wind speed and direction are stored as the level 1 products. Also in the level 1 processing of CLARA's data, an index is generated to quantify the percentage of invalid data for each complete scan at each height. The invalid index value $I_{\mathrm{inv}}$ is calculated as follows:

$$I_{\mathrm{inv}} = \frac{N_{\mathrm{inv}}}{N_{\mathrm{azm}}} \times 100\,\% \qquad (1)$$

where $N_{\mathrm{inv}}$ is the number of invalid data, and $N_{\mathrm{azm}}$ is the number of azimuths. At the current stage, it is not possible to decouple the vertical wind speed component from the hydrometeors' fall velocities using MARA's observations. However, MARA's vertical MDV is also included in level 1. MARA's attenuated Ze is also included in level 1 and can be used for cloud identification.
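As an illustration of the invalid index, the sketch below computes the percentage of invalid azimuth samples per height for one complete scan. It assumes the scan is held in a NumPy array of shape (azimuth, height) with NaN marking invalid samples; the array layout and the function name `invalid_index` are hypothetical and not taken from the CMTRACE processing code.

```python
import numpy as np

def invalid_index(mdv_scan):
    """Percentage of invalid (NaN) azimuth samples per height for one scan.

    mdv_scan: 2-D array, shape (n_azimuths, n_heights); NaN = invalid (assumed).
    """
    mdv_scan = np.asarray(mdv_scan, dtype=float)
    n_azm = mdv_scan.shape[0]                # number of azimuths in the scan
    n_inv = np.isnan(mdv_scan).sum(axis=0)   # invalid samples at each height
    return 100.0 * n_inv / n_azm

# Hypothetical scan with four azimuths and two heights.
scan = np.array([[1.0, np.nan],
                 [2.0, np.nan],
                 [np.nan, 0.5],
                 [1.5, np.nan]])
i_inv = invalid_index(scan)
```

In this toy scan, one of four samples is invalid at the first height and three of four at the second.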
At the end of level 1 processing, the data still have their original temporal and spatial resolution, and the data derived from the different instruments are stored in different datasets. The level 2 processing merges the wind profiles retrieved from the WindCube and CLARA observations, and produces continuous profiles of wind speed and direction from the surface up to the boundary layer or cloud top. The vertical and temporal resolution of level 2 data is set to 50 m (WindCube range resolution) and to 72 s (duration of CLARA PPI scans), which are the coarsest resolutions in the level 1 data. Table 5 summarises the settings of the level 2 dataset. The radar variables from CLARA and MARA are then interpolated to the new spatial resolution. The temporal resolution of the WindCube and MARA data is adjusted to CLARA's temporal resolution by averaging all profiles within the time interval of each complete PPI scan from CLARA. This last strategy is used to create profiles that represent equivalent air mass volumes. After that, the processing continues by creating a time-height flag to identify the regions where the lidar and radars provide measurements. Finally, the wind speed and direction profiles from WindCube and CLARA are merged following a hierarchical criterion. The WindCube data have priority over the CLARA data, meaning that CLARA only provides information in regions where the WindCube data are absent. However, not all data from CLARA are incorporated into level 2. The distribution of velocity differences as a function of $I_{\mathrm{inv}}$ indicates that a systematic bias becomes apparent for $I_{\mathrm{inv}}$ larger than 50 % (Fig. 3). Therefore, all CLARA data flagged with $I_{\mathrm{inv}}$ equal to 50 % or larger are not used in the level 2 generation. In addition to the wind-related variables, level 2 also contains the lidar and radar backscattered signals and the vertical Doppler velocities. The complete list of variables available in CMTRACE level 1 and level 2 data is given in Appendix A.
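The hierarchical merge described above can be sketched as follows. This is a minimal illustration, assuming both instruments are already on the common 50 m and 72 s grid and that NaN marks missing data; the function name `merge_wind` and the flag values (0 none, 1 lidar, 2 radar) are hypothetical and do not reproduce the flag convention of the published dataset.

```python
import numpy as np

def merge_wind(lidar, radar, radar_i_inv, i_inv_max=50.0):
    """Merge lidar and radar winds on a common (time, height) grid.

    Lidar values have priority; radar fills gaps only where its invalid
    index is below i_inv_max (percent). NaN marks missing data (assumed).
    Returns the merged field and a source flag: 0 none, 1 lidar, 2 radar.
    """
    lidar = np.asarray(lidar, dtype=float)
    radar = np.asarray(radar, dtype=float)
    lidar_ok = ~np.isnan(lidar)
    radar_ok = ~np.isnan(radar) & (np.asarray(radar_i_inv) < i_inv_max)
    merged = np.where(lidar_ok, lidar, np.where(radar_ok, radar, np.nan))
    flag = np.where(lidar_ok, 1, np.where(radar_ok, 2, 0))
    return merged, flag

# Hypothetical wind speeds (m/s) for one time step and three heights.
lidar_ws = np.array([[5.0, np.nan, np.nan]])
radar_ws = np.array([[5.5, 7.0, 9.0]])
radar_iinv = np.array([[10.0, 20.0, 50.0]])
merged_ws, source = merge_wind(lidar_ws, radar_ws, radar_iinv)
```

Because the merge selects values rather than averaging them, the same function can be applied to wind direction without any special treatment of the 0°/360° wrap.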

Second trip echoes filter
As introduced in Sect. 3.1, the WindCube Doppler velocity observations were affected by the presence of STE produced by clouds above the maximum unambiguous range. As indicated in Table 1, the vertical range resolution of the WindCube was set to 50 m, and for the model we used, the pulse repetition frequency is automatically set to 20 kHz. Sometimes, the STE appears in regions close to the surface (hereafter low-STE), contrasting with the surrounding signal produced by aerosols (e.g. between 11:00 and 12:00 UTC below 2 km). However, at other times, the STE appears above the region loaded with aerosols (hereafter high-STE), where there is no clear contrast with the surroundings (e.g. signals above 2 km). In order to minimise the occurrence of low- and high-STE, two filtering approaches were developed.
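To make the link between the pulse repetition frequency and second trip echoes concrete, the sketch below applies the standard pulsed-system relation R_max = c / (2 PRF) to the 20 kHz quoted above and shows where an echo from beyond R_max would appear. The 9 km cloud range is a hypothetical value chosen for illustration, and the function names are not from the source.

```python
C = 299_792_458.0  # speed of light in m/s

def max_unambiguous_range(prf_hz):
    """Maximum unambiguous range (m) of a pulsed system."""
    return C / (2.0 * prf_hz)

def apparent_range(true_range_m, prf_hz):
    """Range at which an echo is recorded once it folds back."""
    return true_range_m % max_unambiguous_range(prf_hz)

r_max = max_unambiguous_range(20e3)    # about 7.5 km for 20 kHz
r_app = apparent_range(9000.0, 20e3)   # hypothetical cloud at 9 km range
```

Under these assumptions, a cloud at 9 km range would be recorded at roughly 1.5 km, inside the aerosol-loaded layer, which is the kind of situation the low-STE filter targets.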
The low-STE filter takes advantage of the contrasting characteristics. This filter is based on the temporal anomaly ($v'_{\mathrm{azm}}$) of the MDV from each azimuth angle, as indicated by Eq. (2); $v_{\mathrm{azm}}$ is the observed MDV, and $\overline{v_{\mathrm{azm}}}|_{t}$ is the mean value calculated within a given time window of length T.

$$v'_{\mathrm{azm}} = v_{\mathrm{azm}} - \overline{v_{\mathrm{azm}}}\big|_{t} \qquad (2)$$

The exact size of the time window is arbitrary. However, it should be such that a normal distribution can approximate the distribution of the anomalies (e.g. Fig. 5); if this condition is fulfilled, the standard deviation (STD) can be used to characterise the anomaly distribution. The resulting distribution is expected to have the following characteristics: the low-STE anomalies would populate the edges of the distribution, while the anomaly of the true signal due to turbulence would be close to the centre of the distribution. A time window that is too large or too small could produce an anomaly distribution that deviates from a normal distribution, or a distribution where the anomaly of the low-STE populates the centre while the true signal anomaly is near the edges.
Once the anomaly STD is calculated, a window of n times the STD can be used to discard the data outside the defined window, as illustrated in Fig. 5. The exact value of n is arbitrary, but it should be a value that removes most of the STE and preserves most of the valid data. In this work, the values used for the time window T and for n are 3.5 h and 3, respectively. Those values were found after several trials. Since turbulence during nighttime and daytime is different, only anomalies from between 09:00 and 16:00 UTC were used to calculate the STD. If the nighttime anomalies were included, the STD would be smaller than that from 09:00-16:00 UTC, and a window of 3 × STD would then also remove data not affected by STE. Note that this approach may fail in situations when the low-STE values are comparable to the real data.
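The low-STE filter can be sketched for a single azimuth and range gate as below. This is an illustrative implementation under stated assumptions, not the authors' code: the series is assumed to be regularly sampled, the running mean is assumed to be centred on each sample, and the function name `low_ste_filter` and the synthetic data are hypothetical.

```python
import numpy as np

def low_ste_filter(mdv, hours_utc, window_samples, n=3.0, day=(9.0, 16.0)):
    """Mask MDV samples whose temporal anomaly exceeds n standard deviations.

    mdv: 1-D series for one azimuth and range gate, regularly sampled (assumed).
    hours_utc: decimal hour (UTC) of each sample.
    window_samples: number of samples spanning the averaging window T.
    """
    mdv = np.asarray(mdv, dtype=float)
    hours_utc = np.asarray(hours_utc, dtype=float)
    valid = ~np.isnan(mdv)
    kernel = np.ones(window_samples)
    # Centred running mean that ignores NaNs: windowed sum / windowed count.
    total = np.convolve(np.where(valid, mdv, 0.0), kernel, mode="same")
    count = np.convolve(valid.astype(float), kernel, mode="same")
    anomaly = mdv - total / np.maximum(count, 1.0)
    # STD estimated from daytime anomalies only (09:00-16:00 UTC).
    daytime = (hours_utc >= day[0]) & (hours_utc <= day[1])
    std = np.nanstd(anomaly[daytime])
    keep = np.abs(anomaly) <= n * std
    return np.where(keep, mdv, np.nan)

# Synthetic example: 1 min sampling, turbulence-like noise, and a
# hypothetical STE of -7 m/s between 11:00 and 11:10 UTC.
rng = np.random.default_rng(0)
hours = np.arange(1440) / 60.0
series = rng.normal(0.0, 0.5, size=hours.size)
series[660:670] = -7.0
filtered = low_ste_filter(series, hours, window_samples=210, n=3.0)
```

With 1 min sampling, 210 samples correspond to the 3.5 h window quoted in the text; the injected values are masked while nearly all of the background signal is retained.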
The high-STE filtering uses the backscattered signal observations from a nearby ceilometer (≈ 20 m away from the WindCube) to estimate the height interface that separates the regions loaded with lidar scatterers from clean regions. The ceilometer observations are still affected by noise, and therefore the data from the scatterer-free region are dominated by noise (Fig. 4c).
Because the ceilometer data did not contain signal-to-noise ratio information, an alternative approach was applied to filter out the noise. It was noticed that the noise often reached negative values. This characteristic was then exploited to remove the noise from the data. The removal starts by setting all negative values to NaN (not a number). Using a moving window over the time and range coordinates, the NaNs are propagated through the noise region. For the time coordinate, the size of the window was set to cover 15 consecutive profiles, and for the range coordinate, the size of the window was set to cover 10 consecutive ranges. The NaN values were propagated by calculating the mean value of the data points covered by the moving window. The noise height interface (NHI) is then retrieved as the largest height of the noise-free data in the region between the surface and 4 km. In essence, the NHI separates the non-noise region (loaded with scatterers) from the noise-dominated region (scatterer free). Figure 4c shows that the NHI curve closely follows the region loaded with lidar scatterers. Note that this approach will not work in regions where the noise does not reach negative values.
The detected NHI is then used as a threshold to separate the WindCube data into two regions. One region is below the NHI, where the WindCube data are expected to originate predominantly from lidar scatterers, and the other is above the NHI, where the data are mainly produced by clouds and high-STE. An example of the STE-filtered data is shown in Fig. 4d. One can see that most of the STEs visible in Fig. 4a are no longer present in Fig. 4d. Note that, due to differences in sensitivity between the WindCube and the ceilometer, using the NHI as a height threshold may remove more data than intended.
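The NHI retrieval described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the array layout (profiles × ranges), the edge padding used at the borders of the field, and the function name `noise_height_interface` are assumptions, and the synthetic backscatter field is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def noise_height_interface(beta, heights, n_time=15, n_range=10, top=4000.0):
    """Estimate the noise height interface (NHI) for each ceilometer profile.

    beta: backscatter, shape (n_profiles, n_ranges); heights in metres.
    """
    beta = np.asarray(beta, dtype=float)
    heights = np.asarray(heights, dtype=float)
    data = np.where(beta < 0.0, np.nan, beta)   # negative values mark noise
    # Pad by repeating edge values so every sample has a full window (assumed).
    pad_t, pad_r = n_time // 2, n_range // 2
    padded = np.pad(data, ((pad_t, n_time - 1 - pad_t),
                           (pad_r, n_range - 1 - pad_r)), mode="edge")
    # Any NaN inside a window makes its mean NaN, spreading the mask
    # through the noise-dominated region.
    smoothed = sliding_window_view(padded, (n_time, n_range)).mean(axis=(2, 3))
    clean = ~np.isnan(smoothed) & (heights[None, :] <= top)
    candidate = np.where(clean, heights[None, :], np.nan)
    nhi = np.full(data.shape[0], np.nan)
    has_clean = clean.any(axis=1)
    nhi[has_clean] = np.nanmax(candidate[has_clean], axis=1)
    return nhi

# Synthetic field: positive signal below 1500 m, zero-mean noise above.
rng = np.random.default_rng(1)
heights = np.arange(0.0, 6000.0, 50.0)
beta = rng.normal(0.0, 1.0, size=(40, heights.size))
beta[:, heights < 1500.0] = 1.0
nhi = noise_height_interface(beta, heights)
```

In this synthetic case the retrieved NHI falls slightly below the true 1500 m boundary, because windows that straddle the boundary already contain NaNs; this is consistent with the remark in the text that thresholding on the NHI may remove more data than intended.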

Fourier transform wind retrieval
As described in Sect. 2.1, the WindCube scanning strategy used during the campaign produced observations at five azimuthal angles that differ from the four azimuthal angles (0, 90, 180, 270°) used for the Doppler beam swing strategy (Röttger and Larsen, 1990). Similarly, CLARA produced continuous MDV observations in PPI mode.
In order to retrieve the wind speed and direction profiles from both sets of observations, one can use the velocity azimuth display (VAD) method (Doviak and Zrnic, 2006; Browning and Wexler, 1968). This approach assumes that horizontal wind is uniform within the scanned volume and that the vertical velocity of the scatterers is the same for all azimuths. Under those assumptions, the radial velocity can be described as a Fourier series, but only the first coefficient is used for determining the wind speed and direction (Doviak and Zrnic, 2006; Browning and Wexler, 1968).
In this work, the wind speed and direction profiles are derived using the fast Fourier wind vector algorithm (FFWVA) proposed by Ishwardat (2017), and a brief description of this method is given below. The FFWVA is similar to the VAD method. However, instead of using the Fourier series, it takes advantage of the currently available fast Fourier transform algorithms for digital signal processing to decompose the radial Doppler observations in terms of amplitude and phase of their harmonic frequencies. Note that the unit of these frequencies is 1 per degree and not 1 per time. The amplitude and phase from the first harmonic are used for calculating the wind speed and direction, and the determination of both quantities is summarised as follows: where a and b are the real and the imaginary parts from the first harmonic, respectively, and i is the imaginary unit.
V(φ) is the azimuthal slanted MDV values from one complete scan, and φ is the azimuth. DFT stands for discrete Fourier transform. φ_d is the wind direction azimuth relative to North, and V_h is the horizontal wind speed. N is an amplitude correction parameter, and its value is equal to the number of data points used in the transformation. α is the scanning elevation angle. Using this method, the retrieved wind will lose information from spatial variability smaller than a complete scan. The energy of small-scale variability will be distributed among the higher harmonics. In principle, it would be possible to use the higher harmonics information to identify the regions and periods of enhanced horizontal wind variability. Since the sampled volume diameter increases with height, it is expected that the variability of the horizontal wind within the sampled volume will increase with height. Consequently, the distribution of energy towards higher harmonics will increase with height.
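A first-harmonic retrieval of this kind can be sketched with NumPy's FFT. The sketch is one self-consistent form under explicit assumptions and is not claimed to reproduce the exact FFWVA equations or sign conventions: radial velocities are taken as positive away from the instrument, azimuths are equally spaced, start at 0° and increase clockwise from North, and the function name `fft_wind` is hypothetical. With V(φ) = (u sin φ + v cos φ) cos α + w sin α, the first DFT coefficient is a + ib = (N/2) cos α (v − iu), so the speed is 2 sqrt(a² + b²) / (N cos α).

```python
import numpy as np

def fft_wind(vr, elevation_deg):
    """Horizontal wind speed and direction from one full azimuth scan.

    vr: radial velocities at N equally spaced azimuths starting at 0 deg
        (clockwise from North), positive away from the instrument (assumed).
    """
    vr = np.asarray(vr, dtype=float)
    n = vr.size
    first = np.fft.fft(vr)[1]            # first harmonic, a + ib
    a, b = first.real, first.imag
    cos_el = np.cos(np.deg2rad(elevation_deg))
    speed = 2.0 * np.hypot(a, b) / (n * cos_el)
    u = -2.0 * b / (n * cos_el)          # eastward component
    v = 2.0 * a / (n * cos_el)           # northward component
    # Meteorological convention: direction the wind blows from.
    direction = np.rad2deg(np.arctan2(-u, -v)) % 360.0
    return speed, direction

# Synthetic check with five azimuths separated by 72 deg at 75 deg elevation.
az = np.deg2rad(np.arange(5) * 72.0)
el = np.deg2rad(75.0)
u_true, v_true, w_true = 3.0, 4.0, 0.5
vr = (u_true * np.sin(az) + v_true * np.cos(az)) * np.cos(el) + w_true * np.sin(el)
speed, direction = fft_wind(vr, 75.0)
```

The constant vertical contribution w sin α only enters the zeroth harmonic, so it does not affect the retrieved horizontal wind; the magnitudes of the higher harmonics of the same FFT could serve as the variability indicator mentioned in the text.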

Radar observations below 2 km
In addition to the data obtained from the cloud layer, CLARA also received echoes from the sub-cloud layer during clear air conditions and precipitation.Due to CLARA's polarimetric capability, it was also possible to obtain the differential reflectivity ratio (ZDR) from the sub-cloud layer while executing the PPI scans at 75 • elevation.To investigate the origin of those sub-cloud layer echoes and whether the retrieved wind information was biased, the maximum ZDR from each PPI scan is combined with the difference between wind speed derived from the WindCube and CLARA's data from the entire campaign.
The distribution of maximum ZDR stratified by height shows two main regions (Fig. 6).The first region is above 2 km, where the histogram shows one main mode at 0 dB and a broadening of the distribution up to 7 dB.In this region, the temperature is colder than 0 • C, suggesting that the ZDR signal is likely produced by pristine ice crystals, snow, and super-cooled liquid water.In the second region, below 2 km, the distribution shows two main modes, the first at 0 dB and the second at 7 dB.In this region, the temperature is warmer than 0 • C, suggesting that the mode at 0 dB is likely to be produced by water droplets.However, what could be producing the second ZDR mode at an elevation of 75 • ?Previous studies have already indicated that radar returns from clear air conditions are likely to be from insects (Chandra et al., 2010;Figure 6. Two-dimensional histograms of the maximum ZDR for each PPI scan from the entire CMTRACE stratified with height.Note that only data with I inv smaller than 50 % are used.Geerts and Miao, 2005;Ritvanen et al., 2022), and also that insects could produce a strong polarimetric signature in the boundary layer region (Wainwright et al., 2017;Achtemeier, 1991;Wilson et al., 1994;Rennie et al., 2010;Martner and Moran, 2001).
Results from previous studies suggest that insects may actively fly and not only be carried by the wind, and due to it, the derived wind information could be biased (Lhermitte, 1969;Achtemeier, 1991;Wainwright et al., 2017;Chandra et al., 2010).On the other hand, the study from Wilson et al. (1994) suggests that the wind carried the insects, and no systematic bias was found.Klingebiel et al. (2019) combined 35.5 GHz cloud radar and lidar observations from Barbados, and identified that radar returns (from −65 to 50 dBz) from below non-precipitating clouds coincide with regions where the lidar depolarisation ratio indicates the presence of spherical-like scatterers.The authors suggest that the radar returns are likely from sea salt.Based on radar observations and without in situ measurements, we cannot affirm that the clear air echoes from the lowest 2 km are from insects or wet aerosols.Nevertheless, to evaluate if CLARA's retrieved wind speed is biased, the relationship between the retrieved wind speed difference (CLARA − WindCube) and the ZDR was investigated using data from the lowest 2 km only.Figure 7 shows that the difference in velocity for ZDR larger than 4 dB is similar to that from ZDR around 0 dB, and no systematic bias is found.These results suggest that over Cabauw, the clear air scatterers are likely to be carried by the horizontal wind, giving the possibility to use them to retrieve information from the horizontal wind.Nevertheless, if those scatterers are insects, they can fly up or downward actively (Rennie et al., 2010;Achtemeier, 1991;Chandra et al., 2010), but it was not investigated in this study. 
https://doi.org/10.5194/essd-15-769-2023 Earth Syst.Sci.Data, 15, 769-789, 2023 Under precipitation conditions and in the presence of vertical wind shear, previous studies suggested that due to the differential fall velocity, smaller droplets can be advected further away than larger droplets (Laurencin et al., 2020;Dawson et al., 2015;Kumjian and Ryzhkov, 2012;Biswas et al., 2022), suggesting that larger droplets are less affected by the horizontal wind.Consequently, the horizontal wind retrieved from observations collected during precipitation may differ from the real wind.Under drizzle, the vertical wind component observed by the wind lidar may also differ from the real magnitude.Those small droplets' fall velocity could contribute to the lidar observed vertical velocity.Additionally, Ghate et al. (2021) suggested that the evaporative cooling effect induced by drizzle strengthens the air's downward motion and weakens the upward motion.
Besides wind profiles, the dataset contains Ze and MDV profiles from MARA. We suggest that dataset users combine MARA's Ze and MDV variables to flag precipitating periods.
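One way such a precipitation flag could be built is sketched below. This is only an illustration: the function name and both thresholds are hypothetical and are not taken from the paper, so they would need to be tuned against the disdrometer rain rate or another reference. The MDV sign convention follows the caption of Fig. 14 (negative is towards the ground).

```python
import math

# Hypothetical thresholds, not taken from the paper; tune them for the site.
ZE_MIN_DBZ = 0.0   # reflectivity above which precipitation is assumed
MDV_MAX_MS = -2.0  # MDV below which hydrometeors are assumed to be falling rain


def flag_precipitation(ze_dbz, mdv_ms, ze_min=ZE_MIN_DBZ, mdv_max=MDV_MAX_MS):
    """Return one boolean per sample, True where both criteria indicate rain.

    MDV uses the sign convention of Fig. 14 (negative is towards the ground).
    Samples with a missing value (NaN) are left unflagged.
    """
    flags = []
    for ze, mdv in zip(ze_dbz, mdv_ms):
        missing = math.isnan(ze) or math.isnan(mdv)
        flags.append(not missing and ze > ze_min and mdv < mdv_max)
    return flags
```

Requiring both criteria at once is a deliberately conservative choice: strong echoes with weak fall speeds (for example, dense non-precipitating cloud) stay unflagged.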

Radar wind direction offset estimation
After the campaign, it was noticed that consecutive wind direction profiles from CLARA were periodically biased. To assess this bias, the direction profiles from CLARA were compared with those retrieved from the WindCube data. The histogram of the differences between both sets of profiles from the entire experimental campaign (Fig. 8a) shows that most points are distributed below 2 km and that the distribution is much broader in this region than above 2 km. The probability distribution of the differences (Fig. 8b) shows that the maxima are constant within the range interval of each chirp sequence, suggesting that the biases are constant in those regions. It also shows that the biases are positive for some profiles and negative for others.
The bias for each range region was calculated as the mean of all probability distribution maxima within that region. The retrieved values are ±1.4° for the first chirp, ±3.2° for the second, and ±5.4° for the third; the red dashed lines in Fig. 8b indicate those values. The negative biases are from profiles in which the scans increase in azimuth from 0 to 359°, and the positive biases are from scans in which the azimuth decreases from 359 to 0°. Figure 12b shows the result of the offset correction: the bias along the range coordinate is close to 0°. A possible reason for this range-dependent offset is that CLARA takes around 1 s to sample the three range intervals consecutively while rotating with an angular speed of approximately 5° s⁻¹. The data from each range interval are then stored with the azimuth at which the sampling started, even though each range interval was sampled at a different azimuth.
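The correction described above can be sketched as follows. The offset magnitudes and their signs come from the text; the function name, the zero-based chirp index, and the boolean scan-direction argument are assumptions made for this illustration and do not reflect the actual processing code.

```python
# Offset magnitudes in degrees for chirps 1 to 3, as reported in the text.
CHIRP_OFFSET_DEG = (1.4, 3.2, 5.4)


def correct_wind_direction(wind_dir_deg, chirp_index, azimuth_increasing):
    """Remove the scan-direction-dependent bias from a CLARA wind direction.

    wind_dir_deg       : retrieved wind direction in degrees
    chirp_index        : 0, 1 or 2 (hypothetical zero-based chirp number)
    azimuth_increasing : True if the scan ran from 0 to 359 degrees
    """
    offset = CHIRP_OFFSET_DEG[chirp_index]
    # The bias (CLARA - WindCube) is negative for increasing-azimuth scans
    # and positive for decreasing-azimuth scans.
    bias = -offset if azimuth_increasing else offset
    # Subtract the bias and wrap the result back into [0, 360).
    return (wind_dir_deg - bias) % 360.0
```

The modulo keeps corrected values in the 0 to 360° range, which matters for directions close to north.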

Radiosonde comparison to lidar and radar
In order to evaluate the wind speed and direction retrieved from the WindCube and CLARA observations, profiles of both observables from 34 radiosondes launched in De Bilt were used; the launching site in De Bilt is approximately 25 km away from the remote sensing site in Cabauw. For the evaluation, the level 1 data from the WindCube and CLARA were used, but the high-STE filter was not applied to the WindCube observations, so that the agreement could also be evaluated above 4 km. The WindCube and CLARA profiles were averaged within a 10 min window centred at the launching time. Figure 9 shows an example of those profiles from 27 September 2021. Even though the two sites are about 25 km apart, the WindCube and CLARA profiles almost overlap the radiosonde profiles.
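The 10 min averaging around a launch time can be illustrated for a single height level as below. The function name and arguments are hypothetical, and the use of a circular (unit-vector) mean for the direction is an assumption of this sketch; the text does not state how the direction was averaged.

```python
import math
from datetime import datetime, timedelta


def window_mean(times, speeds, directions_deg, launch_time,
                window=timedelta(minutes=10)):
    """Average speed and direction over a window centred at launch_time."""
    half = window / 2
    picked = [(s, d) for t, s, d in zip(times, speeds, directions_deg)
              if abs(t - launch_time) <= half]
    if not picked:
        return math.nan, math.nan
    mean_speed = sum(s for s, _ in picked) / len(picked)
    # Circular mean: average unit vectors so that 350 and 10 degrees
    # give a direction near north instead of 180 degrees.
    sin_sum = sum(math.sin(math.radians(d)) for _, d in picked)
    cos_sum = sum(math.cos(math.radians(d)) for _, d in picked)
    mean_dir = math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0
    return mean_speed, mean_dir
```

A plain arithmetic mean of the direction would fail whenever the samples straddle north, which is why the sketch averages unit vectors instead.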
A statistical analysis combining all radiosonde profiles suggests that the WindCube and CLARA observations of wind speed and direction are comparable to the radiosonde measurements (Fig. 10); this is also supported by the bias, RMSE, and correlation listed in Table 6. A more precise evaluation of the WindCube and CLARA data near the surface could be made using the measurements from the KNMI mast tower; however, the mast measurements are not available for this publication.
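As a minimal sketch of how bias and RMSE of this kind can be computed, the example below takes paired samples and, for wind direction, wraps each difference into [−180°, 180°) before averaging. The function names are hypothetical, and the wrapping step is an assumption of this illustration rather than a documented part of the paper's analysis. The correlation is left out because the text does not say how it was computed for angles.

```python
import math


def wrap_deg(diff):
    """Wrap an angular difference into the interval [-180, 180)."""
    return (diff + 180.0) % 360.0 - 180.0


def bias_and_rmse(test, reference, circular=False):
    """Return (bias, RMSE) of test minus reference, skipping NaN pairs.

    Set circular=True for wind direction so that 359 vs 1 degree counts
    as a 2 degree difference rather than 358 degrees.
    """
    diffs = []
    for a, b in zip(test, reference):
        if math.isnan(a) or math.isnan(b):
            continue
        d = a - b
        diffs.append(wrap_deg(d) if circular else d)
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse
```

Without the wrapping, a handful of sample pairs on either side of north would dominate both metrics.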

Radar and lidar intercomparison
In addition to the data evaluation using radiosonde observations, the WindCube and CLARA profiles from the entire dataset were compared against each other. As in Sect. 4.1, the high-STE filter was not applied to the WindCube data, in order to assess the quality of the wind profiles in the cloud layer. The statistical analyses of wind speed (Fig. 11a) and direction (Fig. 11b) reveal that the observations from both instruments are well correlated, with relatively small biases of 0.24° for wind direction and −0.16 m s⁻¹ for wind speed. The agreement between the data from both instruments is also supported by the statistical metrics listed in Table 7. In addition, Fig. 12a and b show the difference between CLARA and WindCube observations of wind speed and direction stratified with range. Those figures show that most of the differences are distributed around 0 for both variables, indicating that the observations provided by both instruments are comparable. They also show that, from about 3 km downward, both difference distributions broaden towards the surface, indicating that CLARA observations deviate more from the WindCube observations there. As suggested in Sect. 3.4, insects are likely the source of the radar signal in the lowest 2 km. It is possible that the insects' random motion increases the uncertainty of the retrieved wind profiles.
Given the good agreement between the observations from both instruments, CLARA observations of wind speed and direction from regions below the NHI were included in the merged level 2 dataset if the WindCube did not provide them. The impact of the broadening of the difference may not be significant for the level 2 data because most of the observations in the lowest 3 km are from the WindCube, as shown in Fig. 13.
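The merging rule described above, in which the WindCube is used wherever it provides a value and CLARA fills the gaps, can be sketched for a single profile as follows. The flag values follow the data_flag description of the dataset (0: no data, 1: WindCube lidar, 2: CLARA 35 GHz radar). The function name is hypothetical, and the sketch assumes that both profiles have already been regridded to the same heights and that missing values are stored as NaN.

```python
import math

# data_flag values as described for the level 2 dataset.
NO_DATA, WINDCUBE, CLARA = 0, 1, 2


def merge_profiles(lidar, radar):
    """Merge two profiles on a common grid, preferring the lidar.

    Returns the merged values and one instrument flag per height level.
    """
    merged, flags = [], []
    for lidar_value, radar_value in zip(lidar, radar):
        if not math.isnan(lidar_value):
            merged.append(lidar_value)
            flags.append(WINDCUBE)
        elif not math.isnan(radar_value):
            merged.append(radar_value)
            flags.append(CLARA)
        else:
            merged.append(math.nan)
            flags.append(NO_DATA)
    return merged, flags
```

Keeping the flag alongside each value lets users restrict an analysis to a single instrument when needed.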

Application highlights
The CMTRACE dataset may be used for different applications. Here we highlight two ways in which we are currently using the data: (1) the validation of wind and momentum transport in models; and (2) process studies of convective momentum transport (or transport of scalars through the boundary layer). A third application, which we do not further exemplify here, is the assimilation of wind profiles by weather models.

Model validation
The DALES model is running in near real-time hindcast mode on a (still) relatively small domain (15 × 15 km), centred at Cabauw with a grid spacing of 75 m. As part of the Ruisdael Observatory, these simulations will, in the future, run on domains spanning the Netherlands at a grid spacing of ≈ 100 m. Along with model output at Cabauw from KNMI's mesoscale weather model HARMONIE, Fig. 15 shows an initial comparison of observed and modelled winds on 29 September 2021. On this day, a frontal system passed over the observational site from the early to late morning. The Ze (Fig. 14a) and the MDV (Fig. 14b) recorded by MARA show that between 00:00 and 12:00 UTC, precipitating deep clouds with stratiform outflow passed over the site, while after the front passage, shallow cumulus and congestus clouds developed with tops between 4 and 7 km. The level 2 wind direction (Fig. 15a) and speed (Fig. 15c) show that before 06:00 UTC, the wind was mainly southerly throughout the lower and middle troposphere. During the frontal passage (06:00-12:00 UTC), the wind direction changed from southerly to westerly, while winds of 10-15 m s⁻¹ extended from near the surface up to cloud tops. After 12:00 UTC, the horizontal wind was mainly westerly, and the wind speed in the lower boundary layer increased to about 15-20 m s⁻¹. The DALES model simulates the front passage and post-frontal convection (Fig. 15b, d), but some differences are apparent. The frontal passage seems to be slower in the model, judging mainly from the slanted wind direction and wind speed signature. Furthermore, in the period of post-frontal shallow convection (12:00-18:00 UTC), wind speeds in the observations appear to reach values up to 20 m s⁻¹ (orange-red) that can extend all the way to the surface, while DALES maintains weaker surface layer winds. Comparing winds at 0.1 km above the surface (Fig. 16a, b) reveals that the wind turning associated with the front passage is well simulated in DALES, but the wind speeds at 100 m are evidently too weak. Because the DALES winds shown are averages over the 15 × 15 km domain, while the observations represent wind variability at a single location, DALES is not expected to show the convective variations. However, in the observations, such convective variations would average out over time, and the weak wind bias in DALES appears too persistent throughout the day to be caused by the difference between domain-averaged and point estimates.
The bias is intriguing because earlier comparisons of DALES to observations revealed a bias towards too strong winds instead (van Stratum et al., 2019). DALES uses a surface roughness length that underestimates the effective regional roughness length at Cabauw, which leads to too small surface stress. Ongoing investigations also include HARMONIE and employ measurement simulators to ensure a fair comparison of observations to models.

Momentum transport on different scales
Focusing on the period of post-frontal shallow convection on the same day (Fig. 17), we can illustrate the presence of multi-scale flows in the convective boundary layer. Reynolds averaging (Stull, 2003) is applied to the vertical and horizontal wind using a sliding window to separate flows with scales longer and shorter than 10 min (Eq. 6):
$$v_{\mathrm{obs}} = \overline{v}_{10\mathrm{m}} + v'_{10\mathrm{m}} \qquad (6)$$
where $v_{\mathrm{obs}}$ is the observed wind, $\overline{v}_{10\mathrm{m}}$ is the 10 min averaged wind, and $v'_{10\mathrm{m}}$ is the 10 min anomaly.
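A sliding-window Reynolds decomposition of this kind can be sketched for a single time series at one height as below. The function name, the centred window, and the treatment of edges and gaps (averaging whatever valid samples fall inside the window) are assumptions of this illustration, not a description of the authors' implementation.

```python
import math


def reynolds_decompose(v_obs, dt_s, window_s=600.0):
    """Split a time series into a sliding-window mean and an anomaly.

    v_obs    : observed wind samples, NaN where missing
    dt_s     : sampling interval in seconds
    window_s : averaging window in seconds (600 s = 10 min)
    """
    # Half-width in samples; the window spans 2 * half + 1 samples.
    half = max(0, int(round(window_s / dt_s)) // 2)
    means, anomalies = [], []
    for i, value in enumerate(v_obs):
        lo = max(0, i - half)
        hi = min(len(v_obs), i + half + 1)
        window = [x for x in v_obs[lo:hi] if not math.isnan(x)]
        mean = sum(window) / len(window) if window else math.nan
        means.append(mean)
        anomalies.append(value - mean)
    return means, anomalies
```

By construction the mean and the anomaly add up to the observed value at every valid sample. Near the start and end of the series the window is truncated, so the first and last few means are less reliable.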
Assuming wind speeds of 10-20 m s⁻¹, a 10 min averaging window corresponds to a spatial scale of about 6-12 km, which is in the meso-γ range. Figure 17a and b illustrate the fluctuations in the horizontal and vertical wind speed on these scales, while Fig. 17c and d show the presence of convective up- and downdrafts that carry different winds. Evidently, during the frontal passage, before 12:00 UTC, the vertical motion is on average downward (red) in the presence of precipitating convection, even though individual updrafts and downdrafts are visible. MARA's observations from the same period and height show an intensification of Ze and MDV (Fig. 14a, b), which suggests the occurrence of precipitation. It is possible that $\overline{w}_{10\mathrm{m}}$ contains information not only from downward winds but also from the terminal fall velocity of raindrops (Aoki et al., 2016).
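The 6-12 km scale quoted at the start of this paragraph follows from multiplying the wind speed by the window length. The symbols $L$, $U$, and $\Delta t$ are introduced here only for illustration, and the conversion assumes that the flow is simply advected past the instruments at the mean wind speed:

```latex
\[
L = U\,\Delta t, \qquad \Delta t = 10\ \mathrm{min} = 600\ \mathrm{s}
\]
\[
L_{\min} = 10\ \mathrm{m\,s^{-1}} \times 600\ \mathrm{s} = 6\ \mathrm{km},
\qquad
L_{\max} = 20\ \mathrm{m\,s^{-1}} \times 600\ \mathrm{s} = 12\ \mathrm{km}
\]
```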
After 12:00 UTC, mesoscale fluctuations in both the vertical and horizontal wind are evident: downward motion tends to be accompanied by stronger horizontal winds extending from the top of the boundary layer to the surface, and upward motion by weaker horizontal winds extending upward from the surface. A qualitative comparison of the horizontal and vertical speed observations with MARA's Ze suggests that faster horizontal winds and downward motion are associated with periods of congestus and precipitation. Drizzle, evaporative cooling, and associated downdrafts, as suggested by Ghate et al. (2021), may contribute significantly to the momentum transport. In turn, slower horizontal winds and upward motion are associated with periods of no precipitation and clear skies. These mesoscale-like fluctuations may be coupled to an organisation in the cloud field, a currently popular subject of research in the cloud-circulation-climate community, and contribute non-negligibly to total momentum transport. Past LES modelling studies, including those with DALES, have not been able to accurately represent these mesoscale flows due to their use of small domain sizes and periodic boundary conditions. As DALES will be running on much larger domains with open boundary conditions, the ability to validate such flows with observations is necessary. In our ongoing work, we use spectral analysis to derive the contribution of convective and mesoscale fluctuations to total momentum transport.

Conclusions
This paper introduces an experimental setup for retrieving horizontal wind speed and direction profiles from near the surface to cloud top, taking advantage of the synergy between wind lidars, which retrieve wind profiles in the boundary layer, and cloud radars, which retrieve wind in the cloud layer. The first CMTRACE campaign was conducted in Cabauw, the Netherlands, and lasted 21 days; its results are presented here. During this experiment, most days were non-precipitating with few clouds (mainly low-level clouds). The winds at 200 m were mainly southwesterly, with speeds between 3.6 and 12.9 m s⁻¹.
The CMTRACE dataset was processed in two levels. In level 1, the processing minimises the presence of second trip echoes that affect the WindCube measurements and reduces the bias in wind direction profiles derived from the radar observations. The level 1 observations from each instrument are kept in their original spatial and temporal resolution. In level 2, the lidar and radar differences in the sampled volume are reduced, and the horizontal wind profiles from both instruments are merged, creating a single profile with a flag to identify the measuring instrument.
Statistical analyses are used to assess the level of confidence of the CMTRACE dataset. When correlating the CMTRACE data with radiosonde measurements, the results show that the correlation is higher than 0.9 for wind speed and wind direction. The results also indicate that the abso-

Figure 1. Conceptual illustration (not to scale) of the horizontal separation between the lidar and radars operated during CMTRACE.

Figure 2. Flowchart illustrating the CMTRACE data processing: the upper part shows the processing steps applied for generating the level 1 data, and the lower part shows the steps applied for generating the level 2 data.

Figure 3. Two-dimensional histogram of the differences between the wind speed retrieved from CLARA's and WindCube's observations as a function of the invalid index; the horizontal red line indicates 0 m s⁻¹ and the vertical dashed orange line indicates the invalid index value of 50 %.

Figure 4. Time-height plots: (a) mean Doppler velocity measured by the WindCube, (b) radar equivalent reflectivity measured by MARA, (c) backscattered signal measured by a nearby ceilometer (note that the negative values are already set to NaN, not a number), and (d) the same as in (a), but after filtering for second trip echoes. The white shaded curve in (c) indicates the retrieved noise height interface.

Figure 5. Example of the probability distribution of the MDV anomalies calculated using observations at azimuth equal to 0° (the same data as in Fig. 4a). The vertical red lines indicate the size of the window used for filtering the data, and the light red areas indicate the values removed.

Figure 7. Two-dimensional histograms of the difference between WindCube's and CLARA's wind speed from the lowest 2 km as a function of the maximum ZDR. Note that only data with $I_{\mathrm{inv}}$ smaller than 50 % are used.

Figure 8. Panel (a) shows the two-dimensional histogram of the wind direction differences (CLARA − WindCube), and panel (b) shows the probability distribution of the differences. The data used in both panels are from the entire CMTRACE campaign. The red dashed line indicates the mean bias for each range interval.

Figure 9. Profiles of wind direction (panel a) and wind speed (panel b) from 27 September 2021 at 11:00 UTC. The blue, red, and orange lines indicate the radiosonde, CLARA, and WindCube observations, respectively.

Figure 10. Two-dimensional histograms of wind speed (panels a and c) and direction (panels b and d) data retrieved by the WindCube (panels a and b) and CLARA (panels c and d) versus all available radiosonde measurements.

Figure 11. Two-dimensional histograms from the data pair WindCube-CLARA for wind speed (panel a) and wind direction (panel b); the red line indicates the 1:1 line.

Figure 12. Two-dimensional probability distribution of the wind speed (panel a) and direction (panel b) differences stratified with height.

Figure 14. Time-height plot of MARA's observations from 29 September 2021: (a) attenuated equivalent radar reflectivity, and (b) mean Doppler velocity (negative means towards the ground).

Figure 15. Time-height plot of wind direction (panels a and b) and speed (panels c and d) from the CMTRACE level 2 (panels a and c) and from DALES (panels b and d) from 29 September 2021.

Figure 16. Time series of wind direction (panel a) and speed (panel b) from 29 September 2021 at 0.1 km above the surface. The blue curve is from the lowest available observation in the CMTRACE level 2; the red is from DALES.
equivalent reflectivity factor observed by a vertically pointing 94 GHz radar
radar_spectrum_width (m s⁻¹) x measure of the dispersion of radial velocity within the radar measurement volume
radar_spectrum_skewness x skewness of the velocity distribution within the radar measurement volume
rain_rate (mm h⁻¹) x x rainfall rate measured by a nearby PARSIVEL (optical disdrometer)
data_flag x instrument identification flag: 0: no data, 1: WindCube lidar, 2: CLARA 35 GHz radar

°, producing a conical sampling volume equivalent to that from CLARA. Table 1 lists additional configuration parameters used by the WindCube.

Table 1. Technical specifications and settings of the lidar and cloud radars operated during CMTRACE.
a Chirp repetition frequency, Doppler velocity resolution, Nyquist velocity, and range resolution depend on the chirp definition; those values are indicated in Table 2 for MARA and in Table 3 for CLARA.
b For the WindCube, this value refers to the pulse repetition frequency.

Table 2. Configuration parameters used by the single frequency radar MARA for each chirp sequence.

Table 3. Configuration parameters used by the dual-band radar CLARA for each chirp sequence.

Table 4. Daily characterisation of the cloud coverage and precipitation during CMTRACE. The first five columns (LLC, MLC, HLC, DCC, SC) indicate the approximate duration in hours of each class of clouds. RR indicates the maximum and mean precipitation rate in mm h⁻¹ measured by a nearby optical disdrometer. WS200 and WD200 are the mean wind speed and direction derived from the WindCube observations at 200 m a.g.l. (above ground level).

Table 5. General specifications of the level 2 dataset.

Table 6. Statistical metrics of the WindCube and CLARA data assuming the radiosonde measurements as a reference. RMSE stands for root mean square error. The bias and RMSE are in m s⁻¹ and degrees for the wind speed and wind direction, respectively.

Table 7. Statistical metrics of the intercomparison between CLARA and WindCube observations. RMSE stands for root mean square error. The bias and RMSE are in m s⁻¹ and degrees for the wind speed and wind direction, respectively.
Figure 13. Distribution of level 2 data source stratified with height: WindCube in blue and CLARA in green.

Table A1. List of variables available in Level 1 and Level 2 data.