The Tall Tower Dataset. A unique initiative to boost wind energy research

A dataset containing quality-controlled wind observations from 222 tall towers has been created. Wind speed and wind direction measurements have been collected from existing tall towers around the world in an effort to boost the utilisation of these non-standard atmospheric datasets, especially within the wind energy and research fields. The observations, taken at several heights greater than 10 metres above ground level, have been retrieved from various sparse datasets and compiled in a unique collection with a common format, access, documentation and quality control. For the latter, a total of 18 Quality Control checks have been considered to ensure the high quality of the wind records. Non-quality-controlled temperature, relative humidity and barometric pressure data from the towers have also been obtained and included in the dataset. The Tall Tower Dataset (Ramon and Lledó, 2019a) is published in the EUDAT repository and made available at https://doi.org/10.23728/b2share.0d3a99db75df4238820ee548f35ee36b.

With higher shares of electricity generation depending on wind speed conditions, it is crucial to advance understanding of wind speed conditions at heights between 50 and 150 metres above ground, where current wind turbines are installed, and at multiple time-scales ranging from turbulence to mesoscale circulations, seasonal to decadal oscillations and climate change impacts. To characterise these features, high-quality meteorological observations are needed.
Vast amounts of surface wind measurements taken at the standard height of 10 m above surface level already exist, and efforts have been made to compile the existing surface wind observations (Lott, 2004; Dunn et al., 2012; Klein Tank et al., 2002; Lucio-Eceiza et al., 2018a, b). However, meteorological data at turbine hub heights are much scarcer than surface observations. To take those measurements, a tall tower or met mast needs to be installed and instrumented. The basic structure of these masts consists of a high vertical tower reaching heights of 100 to 200 metres above ground with several platforms distributed along the vertical structure. It allows the placement of several wind sensors (i.e., anemometers and wind vanes) at different heights so that the vertical wind shear can be profiled. In addition, it is also typical to install several horizontal booms at each measuring height, oriented in different directions. Thus, more than one sensor per measurement level can be installed to correct or replace data from one of these redundant sensors in case it is affected by a technical failure or by the wind shadow produced by the mast itself. The physical structure of a tall tower, as well as a typical instrumentation layout, is illustrated in Figure 1.
Recently, the usage of remote sensing devices to measure atmospheric profiles has increased as an alternative to the tall tower in situ measurements. Atmospheric lidars, for example, are becoming more popular due to their easy installation and maintenance when compared to tall towers. However, the lack of historical lidar data limits their utilisation in long-term assessment studies. One more example of that new trend is the lidar-based satellite Aeolus, which was launched by the European Space Agency in 2018 and has just started acquiring profiles of Earth's wind on a global scale (see footnote 1).
Most of the existing met masts are owned by private companies, mainly from the wind energy industry. Wind energy companies need to take those measurements prior to the construction of a new wind farm to characterise the wind speeds in the area and eventually ensure the return on the initial investment. Besides, some local effects such as topographic channelling, sea breezes, turbulence or vertical wind shear must be inferred because they can have a substantial impact on the electricity production (Hansen et al., 2012). Since the maintenance costs of these large and complex structures are rather high, the energy industry typically takes measurements for a relatively short period (usually 1 or 2 years). Then the towers are decommissioned, so the lack of long records of tall tower data reduces the possibilities to study, for example, wind variability at seasonal to decadal time scales. In addition, private companies are usually reluctant to share the tall tower data with third parties, which obstructs their further use even more.
1 https://www.esa.int/Applications/Observing_the_Earth/Aeolus
Fortunately, many of the initiatives from a) to e) also take tall tower measurements for their research, and the data are then usually made freely accessible for non-commercial purposes. As a result of these diverse efforts devoted to boosting the utilisation of tall tower records, there exist various sparse datasets containing measurements from instrumented towers. Regrettably, they are often difficult to find or access, and the lack of coordination in terms of formats, metadata, data access and Quality Control (QC) hinders their usability outside the owner institution.
The INDECIS project is devoting efforts to collecting existing non-standard meteorological observations, among other aspects.
In this paper, a dataset is presented, and the QC of the wind data is further detailed. The reader is referred to Ramon and Lledó (2019b) for complete information on the identification and collection of towers, data formatting and documentation. Sect. 2 of this article describes the main features of the dataset, as well as the data characteristics. The QC software suite is defined in Sect. 3. Then, a summary of the results after running the QC checks is presented in Sect. 4. The benchmark experiment carried out to test the robustness of the QC software is shown in Sect. 5. Finally, conclusions are presented in Sect. 6.

Tall Tower Dataset description
The Tall Tower Dataset (Ramon and Lledó, 2019a) is a unique collection of data from 222 tall towers resulting from an exhaustive process of identification of existing masts and their later data retrieval. Figure 2 presents the global distribution of the sites, which is highly heterogeneous. Most of the masts are located in Asia (51%), mainly clustered in Iran as a result of a national campaign aimed at boosting renewable energies at a country level. Tall towers appear more spatially distributed over North America (23%) and Europe (16%), mirroring the important deployment of wind power that is taking place in those regions. Africa (8%), Oceania (1%) and Antarctica (1%) follow. Unfortunately, it has been hard to retrieve data from South America, so no records from this area can be found in the Tall Tower Dataset.
The height above the surface where the top sensor is located for each tower is also depicted in Figure 2. On the one hand, masts placed in historical observatories (i.e., often having more than 20 years of data) tend to be short, with heights ranging between 18 and 50 metres above the ground, and usually consist of one measuring level at the top of the pole. Two examples are the American masts in Barrow and Mauna Loa. On the other hand, modern towers often reach heights of 100 to 200 metres. Indeed, most of the masts in northern Europe have been installed during the last 15-20 years and are generally taller than 80 m, usually reaching 150 to 200 m. However, the tallest structures are located in the USA, reaching the exceptional height of 500 m and allowing the placement of sensors up there. The top anemometer at the Walnut Grove tall tower in California is at 488 m above ground level. The number of measuring levels in these masts is almost always higher than three, and up to eight in the case of the FINO met masts.
A list of the towers included in the Tall Tower Dataset, as well as their main characteristics such as the owner institution, country, geographic coordinates or specific recording periods, can be found in Sect. S1 of the Supplementary Material. The record lengths and other structural features such as height or instrumentation are quite diverse, as they depend on the purpose the towers were designed for. Most of the towers are installed to provide in situ observations for experimental field campaigns within the research or industry fields. In this case, the tall towers are commonly referred to as meteorological masts or met masts, and they represent 87% of all the tall towers in the dataset. However, other sensors are installed over marine platforms (11%) or at the top of lighthouses (1%) to monitor the coastal weather conditions. Finally, 1% of the towers are instrumented communication transmitters that take meteorological measurements at several platforms along the antenna.
Concerning the location, almost 80% of these tall towers are found inland while the other 20% are placed offshore.
Information indicating the representative features mentioned above is included in the dataset within the corresponding site metadata, which has been standardised for all the sites. This material was sometimes confusing, sparse or even missing in the datasets distributed by the owner data centres, especially when it comes to the conventions in which the initial data were prepared. For example, if the time zone in which the time stamps were delivered was not specified, it could be challenging to discern whether they are provided in local time or Coordinated Universal Time (UTC). Another example concerns the data units, which were not explicitly stated in a few cases either. In both of these situations, the data provider was contacted to confirm the original convention. Further information on the diverse standards in which the data were provided, as well as the final conventions employed in the Tall Tower Dataset, can be found in Ramon and Lledó (2019b).

The time span of the 222 time series is depicted in Figure 3(a). First, we split the series according to their time resolution, which varies from 10-minutely to 1-hourly. Most of the series, i.e. a total of 172, provide 10-minutely averaged data, meeting the WMO standard (WMO, 2007) for estimating mean wind speeds. The other 50 masts report 15-minute, 20-minute, 30-minute or hourly data. Information on how these averages have been taken is hardly ever available. The resulting aggregated values vary depending on whether averages are taken over the horizontal wind components or over the speed and direction values independently. WMO (2007) does not prefer one option over the other, as it may depend on the application or available instrumentation. Even though the effects of this choice are rather small, especially for higher wind speeds, it represents an additional source of uncertainty for the values themselves.
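The difference between the two averaging conventions can be illustrated with a minimal sketch. The function names are hypothetical, and two choices here are assumptions rather than something documented for any tower in the dataset: the "independent" direction average is taken as a unit-vector (circular) mean, and directions follow the meteorological convention (degrees from which the wind blows).

```python
import math

def scalar_average(speeds, directions_deg):
    """Average speed arithmetically; average direction as a unit-vector (circular) mean."""
    mean_speed = sum(speeds) / len(speeds)
    s = sum(math.sin(math.radians(d)) for d in directions_deg)
    c = sum(math.cos(math.radians(d)) for d in directions_deg)
    mean_dir = math.degrees(math.atan2(s, c)) % 360.0
    return mean_speed, mean_dir

def vector_average(speeds, directions_deg):
    """Average the horizontal components (u, v), then convert back to speed and direction."""
    n = len(speeds)
    # Meteorological convention (assumed): direction is where the wind blows FROM.
    u = sum(-ws * math.sin(math.radians(d)) for ws, d in zip(speeds, directions_deg)) / n
    v = sum(-ws * math.cos(math.radians(d)) for ws, d in zip(speeds, directions_deg)) / n
    mean_speed = math.hypot(u, v)
    mean_dir = math.degrees(math.atan2(-u, -v)) % 360.0
    return mean_speed, mean_dir
```

For two 5 m/s samples from 350° and 10°, the scalar mean speed stays at 5 m/s, whereas the vector mean speed is slightly lower (5 cos 10°), because part of the cross-wind component cancels. The gap grows with the directional variability inside the averaging window.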
The total coverage of the Tall Tower Dataset ranges from 1984 to 2017. While 90% of the series span less than 20 years, 3% cover 30 or more years. The precise start and end dates of the periods of record can be found in Sect. S1 of the Supplementary Material for each tall tower. Nevertheless, several of these masts have been recently installed, and their measurements are ongoing. Missing data periods (12.1% of the dataset) sometimes appear embedded within the series.
Concerning the data retrieval process, the initial efforts focused on collecting the largest amount of wind observations possible. Those records have been complemented with temperature, relative humidity and surface pressure data also measured at the different platforms along the tower. The time evolution of the amount of data for these five variables is plotted in Figure 3(b). Most of the data fall within the 21st century, with a significant increase at the beginning of the millennium. Up to 2.7M wind speed records have been retrieved for one single month, i.e. December 2015, which constitutes the month with the maximum amount of wind speed data. In the case of wind direction, the month with the highest number of records is October 2012 (2.1M measurements). A decrease in the number of observations has been noticed from 2017 onwards. Generally, some of the data providers prefer to withhold the most recent data and release them once the measurements have been preliminarily checked for gross errors. Temperature, relative humidity and pressure are not always available. We note that the fewest records correspond to the barometric pressure, which is usually measured only at surface level (i.e., 2 m above ground level).

The Quality Control Software Suite for Tall Towers (QCSS4TT)
To ensure the high quality of tall tower wind data and guarantee the accuracy of any result derived from these records, a QC procedure needs to be carried out. Previous work has devoted efforts to the QC of wind data taken at surface stations (e.g., Dunn et al. (2012); Lucio-Eceiza et al. (2018a, b)). However, no QC software has been specifically designed to tackle the same problem with tall tower observations, whose features vary considerably when compared to surface wind data (e.g., measurements are taken at higher altitudes, the spatial density of stations is considerably lower, etc.). Unique measuring techniques, such as the parallel measurements at different platforms along the mast or sensor redundancy at a given height, can also be taken into account to complement and enhance the typical QC. After a review of the existing QC routines, a set of 18 sequential QC tests (2 preliminary + 16 main tests) has been selected and designed to be performed over wind measurements. The Quality Control Software Suite for Tall Towers (QCSS4TT) designed here is applied to all the wind speed and wind direction data within the Tall Tower Dataset, regardless of whether or not they were previously quality controlled by the providing institution. A general description of the QCSS4TT is presented below in this section. The software is fully described in Sect. S2 of the Supplementary Material.

The QC tests within the QCSS4TT are all intra-station checks, as they do not compare series from nearby tall towers. QC routines ingest entire time series of winds at a specific height, whose time resolution varies between 10 minutes and one hour.
The recommended sequence for the application of the QC tests is presented in Figure 4. Checks are grouped in five categories depending on the purpose they were designed for. The two preliminary checks are designed to detect gross manipulation errors.
Then, the 16 main QC tests follow. They ensure the limits, spatiotemporal and internal consistency of the wind speed and wind direction data. A summary can be found in Table 1 and complete information can be found in Sect. S2 of the Supplementary Material. Most of them are standard checks typically performed over wind and other Essential Climate Variables such as temperature or precipitation. However, we propose here two new QC tests (the so-called Tower shadow and Vertical ratios checks) to guarantee the spatial consistency of the data by considering the special characteristics of the tall tower measurements, since classic inter-station comparisons are challenging due to the poor spatial density of sites.
[Table 1. Summary of the QC tests; table body not recoverable. Surviving fragments: the Occurrences of 0s and 360s test (the occurrence of 0s represents more than 30% of ws; see Table S4 of the Supplementary Material); a note that one QC test needs to be run after the other routines.]
After running the QCSS4TT, a natural number (hereafter referred to as QC flag or flag, see Table 2) is assigned to each wind record. [...] for surface winds (i.e., 10 m winds), whose features differ substantially from those of winds observed at higher altitudes, [...] (2004)). The WMO allows adjusting some of the fixed-value limits proposed in WMO (2007) to reflect singular climate conditions more accurately. As the QCSS4TT aims to clean data from towers located all over the world regardless of the prevailing climate conditions in the area, thresholds need to be adjusted manually so as not to deem wrong the general and particular climate features observed in the wide variety of world climates. It is also vital to take into account that this sensitivity experiment should reduce the number of Type I errors (valid data rejected by the tests) without increasing the number of invalid data accepted by the tests (also referred to as Type II errors).
Based on these thresholds and the nature of the individual wind records, six different categories have been defined (Table 2), and each datum is flagged appropriately. The quality of a record is inferred automatically by checking whether it passes all the tests successfully (flagged as '1'), passes the tests but might need further checks such as a visual inspection (hereafter referred to as suspect and marked as '2'), or fails at least one of the tests (flagged as '4'). When an observation is not considered suspect or wrong by any of the QC tests, additional levels may indicate that the observation was not evaluated by three or more tests (indicated as '0') or corresponds to a calm period ('5'). Finally, missing values are flagged uniformly (categorised as '9').
Wind records flagged as '4' are deemed to be erroneous and thus unreliable. They have been removed by changing the original record to NA. Suspect data remain unaltered, as do those observations that have not been evaluated by all the QC tests, because they might be correct and usable for some applications. In case the user prefers to impose their own level of restriction, we also include the raw data together with the flag values resulting from the quality control.
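As an illustration of how the flag values might be used, the following minimal sketch encodes the six categories and applies them to a raw series. The names `QCFlag`, `mask_erroneous` and `select` are hypothetical and not part of the QCSS4TT; `None` stands in for NA.

```python
from enum import IntEnum

class QCFlag(IntEnum):
    """Flag values as described in the text (names are illustrative)."""
    NOT_EVALUATED = 0  # not evaluated by three or more tests
    PASSED = 1         # passed all the tests
    SUSPECT = 2        # passed, but may need further checks
    ERRONEOUS = 4      # failed at least one test
    CALM = 5           # calm period
    MISSING = 9        # missing value

def mask_erroneous(values, flags):
    """Return a copy in which records flagged as erroneous are set to None (NA)."""
    return [None if f == QCFlag.ERRONEOUS else v for v, f in zip(values, flags)]

def select(values, flags, accepted=(QCFlag.PASSED, QCFlag.CALM)):
    """Keep only records whose flag is in `accepted`; all others become None."""
    ok = set(accepted)
    return [v if f in ok else None for v, f in zip(values, flags)]
```

A stricter user could call `select` on the raw data with only `QCFlag.PASSED` accepted, while a more permissive one could add `QCFlag.SUSPECT` and `QCFlag.NOT_EVALUATED` to the accepted set.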
Therefore, the data user is able to filter the raw data based on the flag values. Even in those cases, we strongly discourage the use of data marked as erroneous ('4').
The QCSS4TT has been applied sequentially over the Tall Tower Dataset according to the flow diagram in Figure 4. We present here the global results obtained from the quality control of the Tall Tower Dataset, as well as a summary of the performance of the main tests.
As stated in Sect. 3, the Surroundings check needs detailed original metadata of the tower location. Unfortunately, this valuable information is not always available, so the Surroundings check cannot be carried out for all the tower sites. The only case in which this QC test confirms that a series of wind speeds was disturbed by the surrounding forest occurs at the Wallaby Creek met mast. After running the main QC routines, long sequences of wind speeds measured at the lowermost level of this met mast, placed at 10 m above the surface, have been flagged as wrong. A close look at the site metadata reveals that the canopy well exceeds the 10-metre height during the whole recording period, considerably reducing the observed wind speeds.
[...] considered erroneous by at least one of the QC tests. They have been replaced by NA, increasing the total amount of missing data from 12.1% to 14.6%. A further 1.8% of the dataset is flagged as suspect. Some of the QC tests, particularly those that compute period-aggregated statistics such as moving averages or variances, require a minimum amount of data. Due to this constraint, 0.2% of the data have not been evaluated by 3 or more QC tests, to avoid the computation of such statistics with reduced sample sizes. Records identified as calms (i.e. wind speeds under 0.5 m s⁻¹) have also been skipped on purpose by a small group of tests, i.e. those that compute quotients between pairs of simultaneous observations. However, calms can be trusted, as they successfully passed all the other QC checks. The percentage of calms is highly dependent on the geographical location of the tall tower. Met masts located in Southeast Asia contain the largest percentage of calms, reaching up to 24% of the total data.
The amount of data flagged by each test differs considerably, as can be seen in Figure 5, which depicts the percentage of data flagged as erroneous and suspect by the main QC tests. We note that both the Flat line and Quartile occurrences checks have flagged the largest amount of data (1.74%). The former detected the largest amount of erroneous data (1.52%), followed by the Repeated sequences and Quartile occurrences tests (1.29% and 0.88%, respectively). The Vertical ratios check has detected very few erroneous or suspect records, and the Difference of extreme values test has flagged no datum.
The Occurrences of 0s and 360s test is not included in Figure 5 since this test does not flag individual records, but the [...]
[Caption residue:] A single asterisk (*) indicates that the QC test only flags data as suspect. A double asterisk (**) denotes that the QC test only flags data as erroneous.

How reliable is the QCSS4TT?
The performance of the QCSS4TT needs to be assessed. Here, a benchmark experiment has been specifically designed to test the ability of the QCSS4TT to detect wrong values. In the following, the preparation of the experiment and its results are described.
The setup of the experiment consists of generating a set of presumably error-free time series where a set of errors will be introduced. [...] constitutes the time range with the largest amount of records within the Tall Tower Dataset (Figure 3(b)). To better emulate the features of the tall tower data, we retrieve two parallel series at each of the 50 points. These wind speeds are those provided at 10 and 100 metres, respectively.
The set of 50 series is replicated fourfold. Three of these replicas are first modified by introducing missing data at random, either by erasing data individually or by removing sequences of records. The percentages of missing data in these series are approximately 5%, 10%, and 20%, respectively. The introduction of missing records emulates the frequently observed sporadic sensor failures and no-data periods within the wind speed series. Finally, one replica is left with no datum set to missing.
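The random removal of records can be sketched as follows. This is only an illustration under stated assumptions: the function name, the equal chance of erasing a single record or a block, and the maximum block length `max_gap` are all hypothetical choices, not the settings used in the experiment. The input series is assumed to be complete.

```python
import random

def introduce_missing(series, fraction, max_gap=24, seed=0):
    """Set about `fraction` of the records to None, mixing single records and blocks."""
    rng = random.Random(seed)
    out = list(series)
    target = int(round(fraction * len(out)))
    removed = 0
    while removed < target:
        start = rng.randrange(len(out))
        # Assumption: half of the time erase one record, otherwise a consecutive block.
        length = 1 if rng.random() < 0.5 else rng.randint(2, max_gap)
        for i in range(start, min(start + length, len(out))):
            if removed >= target:
                break
            if out[i] is not None:
                out[i] = None
                removed += 1
    return out
```

The four replicas of one series would then be obtained with `[introduce_missing(series, f) for f in (0.0, 0.05, 0.10, 0.20)]`.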
The error 'seeding' process is carried out following the methodology in Hubbard et al. (2004), where the performance of a set of basic QC tests for temperature and precipitation data is assessed. In that publication, a subset of 2% of the total data is selected to be modified by introducing an error of magnitude

$$E_i = \sigma_x \, r_i$$

where $\sigma_x$ is the standard deviation of the time series $x$ and $r_i$ is a random number drawn from a uniform distribution ranging from -3.5 to +3.5 specifically for the $i$th observation. Once the errors are inserted, the QCSS4TT is executed. The QCSS4TT has detected on average nearly 40% of all the seeded errors (see Table 3). This result is in line with the percentage of detection observed for precipitation data in Hubbard et al. (2004), which was 30%-40% for complex terrain sites and 40%-50% for the other locations.
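A minimal sketch of the seeding step is given below. It assumes that the error has the form E_i = σ_x r_i and is added to the original value, and it uses the population standard deviation; both are assumptions made for illustration, and the function name `seed_errors` is hypothetical.

```python
import random
import statistics

def seed_errors(series, fraction=0.02, r_max=3.5, seed=0):
    """Add an error sigma * r to a random subset of the non-missing records.

    Returns the seeded series and a dict mapping each seeded index to its r value.
    """
    rng = random.Random(seed)
    valid = [i for i, x in enumerate(series) if x is not None]
    # Assumption: population standard deviation of the non-missing records.
    sigma = statistics.pstdev([series[i] for i in valid])
    n_seed = int(round(fraction * len(valid)))
    seeded = list(series)
    r_values = {}
    for i in rng.sample(valid, n_seed):
        r = rng.uniform(-r_max, r_max)
        seeded[i] = series[i] + sigma * r  # assumption: the error is additive
        r_values[i] = r
    return seeded, r_values
```

Keeping the r value of each seeded record makes it possible to relate the detection rate to the size of the error afterwards. Note that an additive error can produce negative wind speeds, which a limits check would be expected to catch.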
At this stage of the experiment, it is important to study the role of the random number r, and particularly its magnitude, which subsequently influences the size of the error E. Values of r close to zero will introduce smaller errors, which will be less likely to be caught by any of the QC checks. Figure 6 presents the percentage of detection as a function of the r values, which have been grouped in intervals of 0.5 units. We note that the QC tests detect most of the biggest errors. However, the percentage of detection decreases as the magnitude of r does, as expected. Thus, the smallest errors are usually missed by all the QC tests. This result mirrors the conservative philosophy employed in the threshold selection of the checks.
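The grouping of detection rates by intervals of r can be sketched as follows. The function name and its inputs are hypothetical: `r_values` maps the index of each seeded record to its r value, and `detected` is the set of indices that the QC flagged.

```python
import math
from collections import defaultdict

def detection_by_r(r_values, detected, width=0.5):
    """Percentage of seeded errors detected per interval of r (lower bin edge -> %)."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for idx, r in r_values.items():
        lower = math.floor(r / width) * width  # lower edge of the interval containing r
        total[lower] += 1
        if idx in detected:
            hits[lower] += 1
    return {edge: 100.0 * hits[edge] / total[edge] for edge in sorted(total)}
```

Only intervals that contain at least one seeded error appear in the result, so no division by zero can occur.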
Finally, it has been observed that Type I errors have been made in 8% of the total data, corresponding mainly to suspect flagging.
Figure 6. Percentage of detection of seeded errors as a function of the magnitude of the random number r.

Conclusions
Hub-height wind data are vital to assess the local wind flow features at heights ranging from 20 to 120 metres, where wind turbines are located. The wind industry is not the only user of these observations: the research community is also interested in retrieving hub-height winds for studies such as PBL experiments or the verification of climate products.

Unfortunately, these non-standard climate data are scattered, and the lack of standardised formats, quality and metadata jeopardises their further usage. This is the first time that efforts have been devoted to gathering the largest possible amount of existing data measured at tall towers around the world and to performing an exhaustive QC assessment, in order to eventually make them publicly available for non-commercial purposes in a standard format and through a single access point. Wind speed, wind direction, temperature, pressure and relative humidity observations measured at different heights at 222 tall towers (owned mainly by public institutions such as universities, meteorological services or research centres) have been retrieved from sparse archives, compiled in a unique collection, quality controlled (in the case of wind speed and wind direction data) and released under the name of the Tall Tower Dataset. Data from 181 of these sites are stored in the EUDAT data repository and can be publicly accessed. Records from the other 41 towers are not available there since the authors of the Tall Tower Dataset do not own the observations and the data providers do not grant rights to share them with third parties. Although some initiatives such as the Climate Data Store are starting to appear to standardise and boost the free utilisation of climate observations, there is still some reluctance, mainly in Europe, to contribute to open initiatives that lead to the inclusion of data in public external archives, thus hindering their further usage.
To guarantee the reliability of the wind measurements, a QC software suite has been designed and applied over the Tall Tower Dataset, and the erroneous data have been removed. Some of the QC functions are coded to deal simultaneously with huge amounts of data, so the computational cost may be high, especially for high-resolution data. After the application of the QCSS4TT, the vast majority of the dataset (i.e., 95.2% of the wind data) passed all the tests successfully.
A benchmark experiment based on Hubbard et al. (2004) has been designed to assess the efficacy of the QCSS4TT in detecting wrong wind speed data. The exercise is based on the detection of a set of seeded errors introduced in 100 hourly wind time series at 50 randomly selected locations obtained from the ERA5 reanalysis. On average, 40% of these seeded errors have been identified, even though the magnitude of the error is sometimes close to zero and therefore difficult to detect. This result agrees with that obtained in the previously mentioned publication, thus supporting the reliability of the QCSS4TT results.
We do not perform any analogous experiment for wind direction data since the nature of these data requires a more complex exercise.

Even though some tall towers have been decommissioned recently for various reasons, most of the locations within the Tall Tower Dataset continue taking measurements that could be added to the collection in the near future. Besides, the authors of this work are open to receiving useful input on new tower locations not included in Sect. S1 of the Supplementary Material, and whose data could potentially be added to the Tall Tower Dataset in future updates. Enlarging the collection of these non-standard climate data and increasing the density of stations may allow, for instance, further quality checks by means of inter-station comparisons with nearby tall towers.
Code and data availability. Records from 181 out of the 222 tall towers within the Tall Tower Dataset (Ramon and Lledó, 2019a) [...]
Author contributions. JR retrieved and formatted the tall tower data, produced the QCSS4TT code and carried out the benchmark experiment. He also wrote the first draft of this manuscript. The work has been done under the supervision of LL, who conceived the research, gave advice on the data collection and assisted JR with several IT issues and code debugging. NPZ came up with the design of the benchmark experiment as well as some of the graphics to visualise the results. AS and FJDB also supervised this work and facilitated access to and retrieval of the data. All authors contributed to the analysis of the results and to the writing and editing of the paper.

Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. The Tall Tower Dataset is made available in good faith to be used for non-commercial purposes. In no event will the authors be liable to any user or third party for any damage or loss resulting from any use or misuse of these data.