The Tibetan Plateau space-based tropospheric aerosol climatology: 2007–2020

A comprehensive and robust dataset of tropospheric aerosol properties is important for understanding the effects of aerosol–radiation feedback on the climate system and reducing the uncertainties of climate models. The “Third Pole” of Earth (Tibetan Plateau, TP) is highly challenging for obtaining long-term


Introduction
The three poles (i.e. the Arctic, the Antarctic, and the Tibetan Plateau, TP) have the highest mountains in the world and store more snow, ice, and freshwater than any other place. Owing to their unique geographical locations, the Antarctic, the Arctic, and the TP undergo distinctive ecological, climatic, and natural environmental changes. The three regions also play a crucial role in global and regional climate change, and studies have found that they are themselves susceptible to climate change. Differences among these regions may also affect key feedback loops for global climate change and the sustainability of human life. Unfortunately, our understanding of the three poles, particularly the relations between the regions, remains limited due to insufficient observation data. Currently, collecting additional research data in these extreme environments is one of the major bottlenecks for comprehensive studies of these regions. The polar regions and the TP have received considerable attention in successive IPCC reports (IPCC, 2013, 2021). The TP resembles the other two polar regions in its low temperatures, remote location, and large water storage capacity. On the other hand, the TP has a more complex climate than the Arctic and the Antarctic (where ice is the primary medium), and its land surface (including forests, grasslands, bare soil, lakes, and glaciers) is more diverse. These differences make the transport and accumulation of pollutants in the TP region different from those in the other two polar regions.
The TP is known as the "Third Pole" because it has the third largest ice mass on Earth, after the Antarctic and Arctic regions (Qiu, 2008). The TP, also called "Asia's Water Tower", provides freshwater to 40 % of the world's population due to its vast water reserves such as glaciers, lakes, and rivers (Immerzeel et al., 2010). Furthermore, the TP is the "Roof of the World": it covers an area of ∼ 2.5 × 10⁶ km² at an average altitude of about 4000 m a.s.l. (above sea level) and includes all of Tibet; parts of Qinghai, Gansu, Yunnan, and Sichuan in south-western China; and parts of India, Nepal, Bhutan, and Pakistan (Nieberding et al., 2020). To the north of the TP region is the Taklamakan Desert (TD) (see Fig. 1). This high-altitude area with its specific topography effectively serves as a heat source during the spring (MAM) and summer (JJA) months. This thermal structure helps the TP to function virtually as an "air pump", drawing in warm and humid air from the lower-latitude oceans (Yanai et al., 1992; Wu and Zhang, 1998; Wu et al., 2007, 2012). Consequently, large-scale mountains play a crucial role in shaping regional and even global weather and climate through mechanical and thermodynamic effects and affect the global energy-water cycle (Xu et al., 2008; Molnar et al., 2010; Boos and Kuang, 2010; Wu et al., 2015). The TP is thus closely related to the survival of human beings worldwide.
Climate projections are simulated responses of the climate system to future emission or concentration scenarios of greenhouse gases (GHGs) and aerosols and are generally calculated using climate models. Gaps between models and observations may stem from inadequate solar, volcanic, and aerosol forcing used in the models and, in some models, from an overestimated response to increasing GHGs and other anthropogenic forcing (the latter includes the role of aerosols). The most significant uncertainties in predicting future climate change are related to uncertainties in the distribution and properties of aerosols and clouds, their interactions, and limitations in the representation of aerosols and clouds in global climate models (IPCC, 2021). The primary aerosol type over the TP is dust, which is primarily contributed by the Taklamakan Desert (Z. Liu et al., 2008; S. Chen et al., 2013, 2022; Xu et al., 2015). Some previous studies have examined aerosol-cloud interaction (ACI) and aerosol-radiation interaction (ARI) (Liu et al., 2022). For example, dust aerosols lifted over the TP reduce the radii of ice particles in convective clouds and prolong the cloud lifetime through the indirect radiation effect, which can lead to the development of higher convective clouds. The dust-affected convective clouds move further eastward with the westerly winds and merge with local convective cloud masses, triggering heavy precipitation in the Yangtze River Basin and northern China downstream of the TP (Z. Liu et al., 2019; Y. Liu et al., 2020). However, the effect of aerosol on the atmospheric energy and water cycle remains uncertain, mainly due to the lack of a long-term and accurate dataset of vertical aerosol optical properties over the TP region. Such a dataset would help us better understand the impact of aerosol on the atmospheric heating rate and stability and on the subsequent cloud-precipitation process. Therefore, constructing a longer-term and more reliable vertical dataset of aerosol optical parameters can supply the observational basis for aerosol-related studies and provide a scientific basis for improving global climate model simulations over the TP.
Generally, the primary method of acquiring aerosol optical parameters (such as the extinction coefficient, EC, and aerosol optical depth, AOD) is in situ observation, which has high precision. However, in situ observations are restricted by the distribution of measurement stations over the TP. Hence, the resulting data lack spatial continuity, making it difficult to meet the needs of a growing number of regional atmospheric environmental studies (Chen et al., 2022; Goldberg et al., 2019; Giles et al., 2019). Satellite remote sensing (active and passive) is an effective tool for collecting aerosol optical information (including the vertical structure and spatial distribution) over a wide range of spatial scales, significantly offsetting the deficiencies of in situ observations. To a certain extent, it can overcome the difficulties connected to insufficient data and uneven geographical distributions (Chen et al., 2022; Wei et al., 2021). However, in aerosol products from the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO), some low-reliability aerosol target (LRAT) caused by cloud contamination, solar noise contamination (especially in the daytime), and ground clutter is present among the mostly aerosol observations. At least some of these targets may be identified as aerosols and retained in the analysis, which skews the distribution of the aerosol EC towards larger values (Thomason and Vernier, 2013; Kovilakam et al., 2020; Pan et al., 2020; Kahn et al., 2010). The biased EC in turn enhances the aerosol index (AI) through the radiative transfer interaction between clouds and the absorbing layer, so the AI no longer truly reflects the differences in aerosol physical properties (Guan et al., 2008; Y. Liu et al., 2019; Kim et al., 2018). Hence, obtaining a high-confidence EC helps us analyse aerosol optical properties, supports numerous applications of EC data, and is essential for accurately characterizing the upper range of aerosol ECs that occur on the TP.
Earth Syst. Sci. Data, 16, 1185–1207, 2024, https://doi.org/10.5194/essd-16-1185-2024
The present study provides a dataset of the monthly averaged vertical structure of high-confidence tropospheric aerosol optical properties, including the EC, AOD, Ångström exponent (AE), and AI, for daytime and nighttime over the TP and surrounding areas. These optical properties were retrieved from the spaceborne CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization) on board the CALIPSO satellite for the period 2007-2020. The main objective of this study is to calculate a new, high-confidence vertical distribution of the AI through strict quality control and validation against a passive remote sensing satellite sensor (MODIS) and an active remote sensing ground-based lidar. The AI depends on the aerosol concentration, optical properties, and altitude of the aerosol layer. The AI is also particularly sensitive to high-altitude aerosols and gives a high weight to small particles (those that act as cloud condensation nuclei) (Guan et al., 2010; Buchard et al., 2015; Y. Liu et al., 2019; Nakajima et al., 2001). The comprehensive dataset of aerosol optical properties presented in this study is of substantial importance for understanding the impact of aerosol on the ecosystem and reducing the uncertainties of climate models.
The dataset presented in this study characterizes the vertical structure of aerosols more effectively and follows standardized quality control methods. This yields aerosol vertical structure covariates with higher confidence and allows our records to be compared with other public datasets and applied to climate model studies and other problems in atmospheric science. To ensure meaningful confidence estimates, the following correction procedures and analytical validation must be applied carefully to the constructed aerosol covariates over the TP. The main steps for constructing the dataset are grouped as follows. (1) Remove the low-confidence aerosol EC at 532 and 1064 nm caused by the misclassification of clouds and other interferences (e.g. surface clutter, hygroscopicity). To this end, an interquartile range (IQR) method (see Sect. 2.2) is used to discard the low-confidence targets and obtain the monthly averaged aerosol EC for daytime and nighttime with higher confidence. (2) Calculate the pseudo AE from the higher-confidence aerosol EC at 532 and 1064 nm. (3) Obtain the vertical AI as the product of the AOD (the vertical layer integral of the EC) and the AE. (4) Validate the constructed AI against MODIS and in situ lidar measurements using standardized frequency distributions.

Study area
Figure 1 depicts the geopotential height of the TP and its surrounding areas (27-42°N, 75-102°E; about 4000 m a.s.l.) and a schematic diagram of the CALIPSO satellite ground track over the TP in different months. Acting as a "heat-driving air pump", the TP provides abundant water vapour for the formation of clouds (Luo and Yanai, 1984; Liou, 1986). Furthermore, the TP environment is greatly affected by natural and anthropogenic aerosols from the surrounding regions (Chen et al., 2013; Bucci et al., 2014; Xu et al., 2015). The strong convection generated by the TP promotes the vertical transport of aerosols and increases the aerosol content in the troposphere and stratosphere (Vernier et al., 2015; Liu et al., 2022). Aerosols also serve as cloud condensation nuclei (CCN) or ice nuclei (IN), modifying cloud structure properties and precipitation (Twomey, 1977). Hence, the TP has been called the pump of water vapour, the cloud incubator, and the sand dust transfer station. By delivering water vapour, clouds, and dust, it regulates extreme weather and climate in the downstream and surrounding areas. The TP thus plays a crucial role in influencing and regulating global and regional climate and environments (Luo and Yanai, 1984; Rossow and Schiffer, 1999; Wan et al., 2017; Liu et al., 2022).

CALIOP data and the LRAT clearing method
CALIOP, on board the CALIPSO satellite launched by NASA on 28 April 2006, is a nadir-pointing dual-wavelength polarization lidar that provides global and continuous information on the vertical distribution of aerosols and clouds at 532 and 1064 nm for daytime and nighttime (Winker et al., 2007, 2009). The CALIPSO-CALIOP (version 4.20) level-2 aerosol profile product is selected in this study, with vertical and horizontal resolutions of 60 m and 5 km, respectively. The parameters used are Extinction_Coefficient_532 and Extinction_Coefficient_1064 from daytime and nighttime observations from 2007 to 2020. Note that we use as few instruments as necessary to complete the monthly aerosol climatology. We made this decision to limit the impact of differences between instruments due to measurement techniques and wavelength ranges as well as to assess the general quality of the instrument's dataset.
The presence of some LRAT caused by cloud contamination, solar noise contamination (especially in the daytime), and ground clutter among mostly aerosol observations skews the distribution of the aerosol EC towards larger values (Thomason and Vernier, 2013). To eliminate the LRAT, we therefore use a statistical approach based on the IQR to identify the LRAT and extreme outliers. The IQR is a more conservative measure of the spread of a distribution than the standard deviation (Iglewicz and Hoaglin, 1993). Note that this technique is based on median statistics rather than the mean because the distribution of the EC is skewed. In our implementation, we used daily data in each altitude (0.06 km) and latitude (0.05°) bin from 2007 to 2020 to determine the frequency distribution of the EC for different months. We then used the lower quartile (Q1) and upper quartile (Q3) of the underlying distribution to find the IQR, defined as Q3 − Q1, a good measure of the spread in the data relative to the median. Here, the extreme outlier threshold is defined as Q3 + (3.5 × IQR), and the upper outlier threshold, Q3 + (1.5 × IQR), is used for comparison (Iglewicz and Hoaglin, 1993). The extreme outlier threshold is used to clear LRAT-affected observations from the dataset because it is more effective at identifying outliers in the density distribution (Kovilakam et al., 2020).
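The threshold logic described above can be illustrated with a minimal Python sketch. It assumes that the EC samples for one month and one altitude-latitude bin have already been collected into a 1-D array; the function names `iqr_thresholds` and `screen_lrat` are hypothetical and are not taken from the processing code of this study, and the use of NumPy's default (linear-interpolation) percentiles is likewise an assumption.

```python
import numpy as np

def iqr_thresholds(ec, k_upper=1.5, k_extreme=3.5):
    """Return (upper, extreme) outlier thresholds for a 1-D sample of EC values.

    Thresholds follow Q3 + k * IQR with IQR = Q3 - Q1.
    """
    ec = np.asarray(ec, dtype=float)
    ec = ec[np.isfinite(ec)]                  # ignore NaN/inf fill values
    q1, q3 = np.percentile(ec, [25, 75])      # lower and upper quartiles
    iqr = q3 - q1
    return q3 + k_upper * iqr, q3 + k_extreme * iqr

def screen_lrat(ec, k_extreme=3.5):
    """Replace values above the extreme-outlier threshold with NaN."""
    ec = np.asarray(ec, dtype=float)
    _, extreme = iqr_thresholds(ec, k_extreme=k_extreme)
    return np.where(ec > extreme, np.nan, ec)

# Illustrative (made-up) sample: five plausible EC values and one spike.
sample = [0.01, 0.02, 0.03, 0.04, 0.05, 5.0]
screened = screen_lrat(sample)
```

In this sketch, only values above the extreme threshold are masked, so enhanced but plausible aerosol values lying between the two thresholds are retained.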

AI data processing
According to the method described in Sect. 2.2, the aerosol EC (observed at 532 and 1064 nm for daytime and nighttime) with higher reliability over the TP is obtained. The monthly mean Ångström exponent (hereafter "pseudo Ångström exponent") for daytime and nighttime is then derived based on Eq. (1) to establish the 14-year aerosol climatology (2007-2020). The AE model for the EC, which describes the wavelength dependence between 532 and 1064 nm, is given by Kovilakam et al. (2020).
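Written out under the assumption of the standard two-wavelength Ångström relation, which is consistent with the symbols defined in this section, Eq. (1) would take the following form (a reconstruction from those definitions rather than a verbatim copy of the published equation):

```latex
\begin{equation}
\mathrm{AE}[m,i,j] =
  -\,\frac{\ln\!\left(\mathrm{EC}_{532}[m,i,j]\,/\,\mathrm{EC}_{1064}[m,i,j]\right)}
          {\ln\!\left(\lambda_{532}/\lambda_{1064}\right)}
\tag{1}
\end{equation}
```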
EC_532[m, i, j] and EC_1064[m, i, j] are the extinction coefficients at 532 and 1064 nm, respectively. AE[m, i, j] is the pseudo Ångström exponent (Rieger et al., 2015, 2019). The indices [m, i, j] represent the month, latitude, and altitude, respectively. (λ_532/λ_1064) represents the ratio of the wavelengths 532 and 1064 nm. The AE is gridded to 0.05° latitude and 0.06 km altitude resolution. Further, the vertical distribution of the AI is calculated according to Eq. (2). The AI was developed by Nakajima et al. (2001) and Y. Liu et al. (2019) with the equation given below.
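Consistent with step (3) of the construction procedure, in which the AI is the product of the AOD and the AE, Eq. (2) can be written as follows (a sketch of its form, assuming no additional scaling factor):

```latex
\begin{equation}
\mathrm{AI}[m,i,j] = \mathrm{AOD}[m,i,j] \times \mathrm{AE}[m,i,j]
\tag{2}
\end{equation}
```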
AI[m, i, j] and AOD[m, i, j] are the aerosol index and aerosol optical depth, respectively. AE[m, i, j] is the pseudo Ångström exponent. The indices [m, i, j] represent the month, latitude, and altitude, respectively. Note that, to match the AE, the AOD is also expressed as a vertical distribution (not as a column parameter). As we focus on the characteristics of aerosols in the troposphere over the TP, we took samples from the surface up to an altitude of 12 km with a vertical resolution of 0.06 km. We integrated the EC over two layers to obtain an AOD, which corresponds to the average of the AE values of the same two layers. This achieves spatial matching between the AOD and AE in the vertical. At a later stage, when comparing with the AI obtained from MODIS, we characterize the AI by its PDF (probability density function) and average values to facilitate comparison despite the differences in horizontal and vertical sampling. The data in this paper are all based on the altitude-latitude vertical structure with vertical and horizontal resolutions of 60 m and 0.05°, respectively. The monthly mean climatology of the AI is retrieved from CALIOP and computed in altitude and latitude at 532 and 1064 nm for the daytime and nighttime datasets.
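The calculation chain from EC to pseudo AE, layer AOD, and vertical AI can be sketched in Python as below. This is an illustration under stated assumptions rather than the processing code of the study: the EC is assumed to be in km⁻¹ on a single profile with altitude along the first axis, "two layers" is read as two adjacent 0.06 km altitude bins, the integral is approximated with the trapezoidal rule, and all function names are hypothetical.

```python
import numpy as np

LAMBDA_532, LAMBDA_1064 = 532.0, 1064.0   # wavelengths in nm
DZ_KM = 0.06                              # vertical bin spacing in km

def pseudo_ae(ec532, ec1064):
    """Pseudo Angstrom exponent from EC at 532 and 1064 nm.

    Bins where either EC is missing or non-positive are returned as NaN.
    """
    ec532 = np.asarray(ec532, dtype=float)
    ec1064 = np.asarray(ec1064, dtype=float)
    valid = (ec532 > 0) & (ec1064 > 0)
    ae = np.full(ec532.shape, np.nan)
    ae[valid] = (-np.log(ec532[valid] / ec1064[valid])
                 / np.log(LAMBDA_532 / LAMBDA_1064))
    return ae

def layer_aod(ec, dz_km=DZ_KM):
    """AOD between adjacent altitude bins (trapezoidal integral of EC)."""
    ec = np.asarray(ec, dtype=float)
    return 0.5 * (ec[:-1] + ec[1:]) * dz_km

def vertical_ai(ec, ae, dz_km=DZ_KM):
    """Vertical AI: layer AOD times the mean AE of the same two bins."""
    ae = np.asarray(ae, dtype=float)
    ae_mid = 0.5 * (ae[:-1] + ae[1:])
    return layer_aod(ec, dz_km) * ae_mid

# Made-up three-bin profile, for illustration only.
ec532 = [0.2, 0.2, 0.1]
ec1064 = [0.1, 0.1, 0.1]
ae = pseudo_ae(ec532, ec1064)
ai_532 = vertical_ai(ec532, ae)
```

Pairing each layer AOD with the mean AE of the same two bins keeps both quantities on the same vertical grid, which is the spatial matching described above.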

Aqua-MODIS satellite data
Like CALIPSO, Aqua is part of the A-Train constellation of satellites. Therefore, MODIS on board Aqua can achieve near-simultaneous observations of clouds and aerosols with CALIPSO-CALIOP (less than 2 min) (Winker et al., 2007; Hu et al., 2010) ... 1).

Ground-based lidar data
In addition, we used detection data from a ground-based lidar (38.967°N, 83.65°E; 1099.3 m) in the hinterland of the TD to verify the validity and accuracy of the low-confidence aerosol removal method and of the AI calculated from CALIOP data. The multi-band Raman polarization lidar (hereafter lidar) is mainly used for the detection of dust, aerosols, and cloud particles in the atmosphere. It belongs to the "Belt and Road" lidar network of Lanzhou University, China (http://ciwes.lzu.edu.cn/, last access: 20 December 2022), and is well suited to calibrating or validating satellite observations (see Fig. 2). The primary technical specifications of the lidar are listed in Table 2. For the performance of the lidar and the inversion of aerosol-related optical parameters, readers are referred to Zhang et al. (2022, 2023).
In this study, the LRAT is screened and eliminated from the Level_2 aerosol profile data product (EC) detected by CALIOP for daytime and nighttime from 2007 to 2020. An aerosol characteristic dataset with higher reliability over the TP is then constructed, verified, and compared with MODIS and ground-based lidar to test its effectiveness and accuracy. In this way, the vertical structure of aerosol properties and its climatology over the TP are obtained with higher reliability, providing an observational basis for studies of the TP. All steps were implemented and processed as shown in Fig. 3.

Screening and elimination of the LRAT
In this section, we screen and eliminate the LRAT from the tropospheric aerosol EC from CALIOP over the TP using the statistical method (see Sect. 2.2). Figures 4 and 5 show the monthly frequency distribution of the EC detected by CALIPSO-CALIOP at 532 and 1064 nm in the daytime during 2007-2020. Figures 6 and 7 are the same but for nighttime. Figures 4-7 demonstrate the non-normal distribution of the aerosol EC for daytime and nighttime. The upper outlier threshold appeared to remove many enhanced aerosol measurements when sand and dust events occurred more often in the surrounding areas and dust rose to the TP in spring and summer. In contrast, the extreme outlier threshold effectively identified outliers in the frequency distribution. Therefore, the extreme outlier threshold is used to clear LRAT observations from the CALIOP dataset.
After the screening and elimination of the LRAT, the monthly aerosol climatology data and extreme outliers are compared for the years 2007-2020. During the daytime at 532 and 1064 nm, the aerosol EC over the TP is mainly concentrated between 0 and 0.2. The extreme outliers in July and August are more pronounced than those in other months, which is related to the rising motion over the TP, acting as a heat source in summer, that triggers convection. This results in more ice clouds in the upper air, thus increasing the probability of misclassifying cirrus anvils as aerosol (Carrió et al., 2007; Kojima et al., 2004; Seifert et al., 2007). Also, the number of aerosol data points (samples) is largest in May and smallest in November over the TP. The aerosol samples are more numerous in spring and summer than in autumn and winter. This is related to the frequent sand and dust activities around the TP (such as in the Taklamakan Desert) in spring and summer and to anthropogenic pollution (Y. Liu et al., 2019, as mentioned earlier).
Similarly, during the nighttime at 532 and 1064 nm, the aerosol EC over the TP is mainly concentrated between 0 and 0.1, and the extreme outliers in July and August are more pronounced than those in other months. However, the aerosol EC observed in the nighttime is smaller than that in the daytime. The main reason is that daytime solar noise is considerable and the signal-to-noise ratio of the lidar observation is low, which increases the probability that the aerosol EC will present a skewed distribution. Eliminating the LRAT is therefore especially beneficial for the accuracy of the daytime data. Meanwhile, the number of aerosol data points is largest in April and smallest in December over the TP, indicating that more aerosol was lifted and transported to the TP in April (spring). Numerous observations have shown elevated dust plumes lofted into the free troposphere during spring, and air parcels between 4 and 7 km mainly originate from the TD (Huang et al., 2008; Sasano, 1996; Liu et al., 2008; Zhou et al., 2002; Matsuki et al., 2003). As in the daytime, samples are more numerous in spring and summer than in autumn and winter, and the nighttime sample count is 1 order of magnitude larger than the daytime count. CALIOP is less sensitive during daytime than nighttime because solar background illumination reduces the signal-to-noise ratio, so weakly scattering layers detected during nighttime are missed during daytime (Huang et al., 2013; Liu et al., 2009).

Constructing the vertical AI for daytime and nighttime
Figures 8 and 9 show daytime altitude-latitude plots of the monthly climatology of the aerosol EC at 532 and 1064 nm, respectively, before and after screening. The monthly mean climatology of the pseudo AE and AI vertical structure is then computed as shown in Fig. 10. We choose the typical months of January, April, July, and October to represent winter, spring, summer, and autumn, respectively. Figures 8 and 9 show that extreme outliers in the troposphere over the TP have been eliminated, especially in the lower layer, where a more obvious LRAT has been identified and removed. In the upper layer (above 7 km), especially in April and July (i.e. spring and summer), weak cirrus signals that may have been present in the original aerosol signals are also eliminated.
Compared with the other seasons, the aerosol over the TP is widely and uniformly distributed in the troposphere in April, indicating that, in general, more aerosol is lifted over the TP in April. In Fig. 10, the pseudo AE lies between −1 and 0 for much of the troposphere and occasionally between 0 and 2 in the middle troposphere (below 8 km), a pattern similar to that of Kovilakam et al. (2020). Note that the derived pseudo AE has no physical meaning; it is simply a means of combining with the AOD to obtain the vertical structure of the AI. Using this climatology of pseudo AE values, we can effectively convert any month of AI data to 532 and 1064 nm because a fixed AE is not necessarily applicable to retrieving aerosol extinction in all months. Relevant research points out that accuracy improves when the corresponding AE of each month is used to correct the satellite data (Kovilakam et al., 2020). Figure 10 also shows the distribution of AI values at 532 and 1064 nm in different seasons over the TP in the daytime. In all seasons, the AI is mainly distributed between −0.04 and 0.04, with the largest proportion between −0.02 and 0.
Here, it is useful to recall the traditional AI. The traditional AI measures how backscattered ultraviolet (UV) radiation from an atmosphere containing aerosols differs from that of a pure molecular atmosphere (Guan et al., 2010). It is especially sensitive to the presence of UV-absorbing aerosols such as smoke, mineral dust, and volcanic ash. A positive AI suggests the presence of absorbing aerosols (dust, black carbon, etc.), whereas a small or negative AI suggests the presence of non-absorbing aerosols or clouds (Hu et al., 2020; Guan et al., 2010; Hammer et al., 2018). The AI varies with aerosol layer height, optical depth, and single scattering albedo (Torres et al., 1998, 2007; Hsu et al., 2004; Jeong and Hsu, 2008). However, the vertical-structure AI obtained in our work differs in meaning from the traditional AI. It cannot effectively characterize whether aerosols are absorbing or non-absorbing because it is derived outside the ultraviolet band. Instead, it carries aerosol concentration information that the column AOD lacks: because the AOD is integrated over the entire column, it loses some of the true variation in aerosols with height. The significance of our work is that the higher-reliability AI obtained here captures aerosol concentration information at each vertical height more effectively. In all four seasons, the distribution of aerosols in the north is broader than that in the south. In spring, aerosols rise higher and their vertical distribution range is wider. The elevation in summer is lower than that in the other three seasons, but the aerosol types are more diverse, as indicated by the wider range of AE values.
Similarly, Fig. 11 shows the nighttime difference plots of the aerosol EC at 532 nm before and after screening for different months from 2007 to 2020. The difference before and after screening is large, especially above 5 km in the southern region of the TP in July and October. Extreme outliers in the troposphere over the TP have been recognized and eliminated. The EC detected at 1064 nm shows a distribution similar to that at 532 nm, including the differences before and after the screening and elimination of the LRAT (see Fig. 12). In all seasons, the AI is mainly distributed between −0.02 and 0.02, and the proportion between −0.02 and 0 is largest in April and July between 4 and 8 km. Meanwhile, the AI above 8 km is mainly concentrated at 0-0.02, indicating modal characteristics in the vertical distribution of the aerosol concentration and a diversity of aerosol types. It is worth noting that there is a large amount of aerosol over the TP in January (winter), related to anthropogenic pollutant emissions and fossil fuel combustion in winter (such as black carbon and smoke). The pattern of the AI is broadly consistent with these known phenomena. Interestingly, compared with the daytime, the aerosol detected by CALIOP at night reaches greater heights and has a broader distribution range. This reflects the higher signal-to-noise ratio at night than in the daytime: CALIOP can detect smaller particles, which is also why the quality and effectiveness of CALIOP nighttime data are better than those of daytime data. After applying this series of correction algorithms and calculating the relevant parameters, we constructed the tropospheric AI climatology dataset over the TP for 2007-2020.

Comparisons with satellite Aqua-MODIS AI products
The multiyear monthly average spatial distributions of the AE and AOD from MODIS are shown in Fig. 14, and the AI was also calculated (Fig. 14). The AE over the TP decreases from south-east to north-west in all seasons, indicating that small particles dominate in the upper air of the south-eastern region, whereas large particles dominate in the upper air of the north-western region. This is especially the case in April (spring), when dust aerosol is lifted and transported from the Taklamakan Desert to the northern part of the TP. Also, the AE over the Taklamakan Desert north of the TP (a dust source region) is smaller in April and July (spring and summer) than in January and October (winter and autumn). AOD and AE show opposite seasonal variation patterns. According to the spatial distribution of the AI calculated from the MODIS AE and AOD, the AI over the TP is mainly between 0 and 0.4. Figure 14 also compares the normalized frequency distributions of the AI over the TP from MODIS and from CALIOP before and after screening, which differ significantly in all seasons. In general, compared with the unprocessed data, after the elimination of the low-reliability aerosol target the average AI of CALIOP is closer to that of MODIS, and the normalized frequency distributions are more similar. Interestingly, the mean AI and normalized frequency distribution of CALIOP in April (spring) after removing the LRAT agree more closely with the MODIS results. In addition, the mean AI and normalized frequency distribution of CALIOP in July (summer) and October (autumn) are more consistent with the MODIS results, and both show apparent improvement. The difference between the mean AI from CALIOP in January (winter) and that from MODIS is relatively larger, but the normalized frequency distribution is more consistent. This may be related to the type and chemical composition of the aerosol particles that rise over the TP in different seasons and to the atmospheric conditions unique to the topography of the TP. In brief, the accuracy of the AI calculated from the higher-reliability aerosol EC has improved; even though the result is not completely accurate, this strategy is expected to at least reduce the inaccuracy of the computed AI. These results also indicate that using the extreme outlier as a threshold is an effective and reliable way to obtain more reliable aerosol information. It is important to note that the 550 nm wavelength of MODIS lies in the visible range and that the data products provided at the satellite overpass time are daytime results. Therefore, we compare and verify the daytime CALIOP results (532 nm) against the MODIS results, which are consistent in time, close in wavelength, comparable, and representative.
... verification. From Fig. 15, it can also be seen that clouds or other LRATs are present at high altitude in the daytime ground-based lidar signal. This makes the case more useful for checking the validity and reliability of the LRAT elimination and of the calculated AI.
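The normalized frequency distributions used in these comparisons can be sketched in Python as follows. The bin edges, the sample values, and the helper names `normalized_frequency` and `mean_abs_difference` are illustrative assumptions; the study does not state which bins or similarity measure were used.

```python
import numpy as np

def normalized_frequency(values, bin_edges):
    """Fraction of finite samples falling in each bin (sums to 1 if any sample is in range)."""
    values = np.asarray(values, dtype=float)
    values = values[np.isfinite(values)]          # drop screened (NaN) samples
    counts, _ = np.histogram(values, bins=bin_edges)
    total = counts.sum()
    if total == 0:
        return np.zeros(len(counts))
    return counts / total

def mean_abs_difference(freq_a, freq_b):
    """Simple distance between two normalized frequency distributions."""
    return float(np.mean(np.abs(np.asarray(freq_a) - np.asarray(freq_b))))

# Made-up AI samples on a common set of bins, for illustration only.
edges = [0.0, 0.1, 0.2, 0.3, 0.4]
freq_a = normalized_frequency([0.05, 0.15, 0.15, 0.35], edges)
freq_b = normalized_frequency([0.05, 0.05, 0.15, 0.25], edges)
distance = mean_abs_difference(freq_a, freq_b)
```

Normalizing by the number of samples lets datasets with very different sampling densities, such as a swath imager and a lidar curtain, be compared on the same axis.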
Similarly, for the ground-based lidar, we first retrieve the EC by inversion and use the IQR method (see Sect. 2.2) to obtain the extreme outlier threshold and identify and eliminate the LRAT (Fig. 15). The LRATs (such as clouds and surface clutter) are effectively eliminated from the ECs retrieved at 532 and 1064 nm. This again indicates that using the extreme outlier as a threshold is an effective and reliable way to obtain more reliable aerosol information.
It should be pointed out that the ground-based lidar case of 11 July 2021 is quite typical, but it deviates significantly from the satellite transit, so the satellite cannot capture this process well. To better match this process, we take the ground-based lidar site in the hinterland of the Taklamakan Desert as the centre (38.967°N, 83.65°E; 1099.3 m), select the 38.5-39.5°N and 83-84°E ranges, and extract the ECs observed by CALIOP transits in this range. In addition, all the evidence shows that, after removing the LRAT, the optimized data yield aerosol characteristics with higher reliability.
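The spatial matching described above amounts to selecting the profiles inside a latitude-longitude box and averaging them level by level. The following is a minimal sketch under that assumption; the data layout (tuples of latitude, longitude, and a list of EC values per altitude level) and the function names are hypothetical and do not reflect the CALIOP file format.

```python
def in_box(lat, lon, lat_range=(38.5, 39.5), lon_range=(83.0, 84.0)):
    """True if the profile location falls inside the comparison box."""
    return lat_range[0] <= lat <= lat_range[1] and lon_range[0] <= lon <= lon_range[1]

def mean_profile_in_box(profiles):
    """Level-wise mean EC of all profiles inside the box, ignoring None values."""
    selected = [ec for lat, lon, ec in profiles if in_box(lat, lon)]
    if not selected:
        return []
    means = []
    for level_values in zip(*selected):
        vals = [v for v in level_values if v is not None]
        means.append(sum(vals) / len(vals) if vals else None)
    return means

# Hypothetical profiles: (latitude, longitude, EC per altitude level).
profiles = [
    (38.9, 83.6, [0.10, 0.20, None]),
    (39.2, 83.9, [0.30, None, None]),
    (40.1, 83.5, [9.00, 9.00, 9.00]),  # outside the box, ignored
]
box_mean = mean_profile_in_box(profiles)
```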
Based on the monthly climatology AI product, we explored the average vertical structure of the AI over the TP during 2007-2020 (Fig. 17). AI values in the daytime and at night over the TP mainly fluctuate around 0, and the standard deviation increases with altitude. The variation of the AI with altitude is relatively consistent, and the standard deviation below 6 km is small, indicating that the dispersion of aerosol particles is small. However, the fluctuation in the daytime is greater than that at night (the data quality at night is better than that in the daytime). In general, the detection results at 532 and 1064 nm complement each other.
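As an illustration of how a mean profile and its spread can be derived from monthly AI profiles, the sketch below computes the mean and standard deviation at each altitude level. It assumes that each monthly profile shares the same vertical grid and uses the population standard deviation; both are assumptions for this example, and the input values are hypothetical.

```python
from statistics import mean, pstdev

def profile_stats(monthly_profiles):
    """Return (mean, standard deviation) per altitude level, ignoring None values."""
    stats = []
    for level_values in zip(*monthly_profiles):
        vals = [v for v in level_values if v is not None]
        if vals:
            stats.append((mean(vals), pstdev(vals)))
        else:
            stats.append((None, None))
    return stats

# Hypothetical AI profiles for three months on a two-level grid.
monthly_ai = [
    [0.0, 0.1],
    [0.2, -0.1],
    [0.1, 0.0],
]
ai_stats = profile_stats(monthly_ai)
```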
In general, the quality and robustness of the aerosol parameter products (EC and AI) have improved, although some issues persist in the dataset, which we describe below.
As we do not have ground-based lidar detection data on the TP, we selected ground-based lidar data from the centre of the Taklamakan Desert for verification and evaluation. The objectives of the verification and evaluation are to check the removal of low-reliability aerosol targets and to validate the effectiveness and rationality of the constructed AI parameter. Because the ground-based lidar detection data are limited, we chose a typical aerosol process detected by the ground-based lidar (11 July 2021), but it did not match the transit time and scanning area of the CALIPSO satellite well, resulting in significant errors. Therefore, we chose to compare and verify the July averages over all years within the central area of the Taklamakan Desert crossed by CALIPSO (see the green box on the left in Fig. 2). This minimizes the spatial errors caused by significant differences in spatial position. This kind of error is inevitable in our data processing and will affect the consistency of the detection results to some extent.
In addition, although the monthly AI correction significantly improves the agreement between CALIPSO and MODIS, we note that a somewhat larger deviation may occur in winter, whereas the improvement after correction is largest in summer. This may be related to the increased probability of mistaking clouds for aerosol particles due to more convective activity in summer. This helps us to refine our research on summer aerosols over the TP.
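The CALIPSO-MODIS comparison rests on AI values computed from the AE and AOD and on frequency distributions normalized to their maximum. The sketch below assumes the commonly used definition AI = AOD × AE; the formulation actually applied in this work is given in the methods section and may differ. The function names, bin edges, and sample values are hypothetical.

```python
def aerosol_index(aod, ae):
    """Aerosol index, assuming the common definition AI = AOD * AE."""
    return aod * ae

def normalized_frequency(values, edges):
    """Histogram counts divided by the largest count (peak bin equals 1)."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # Bins are half-open [lo, hi); the last bin also includes its upper edge.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    peak = max(counts)
    return [c / peak for c in counts] if peak else [0.0] * len(counts)

# Hypothetical AOD and AE samples.
aod = [0.2, 0.4, 0.5, 0.3]
ae = [1.0, 1.5, 0.8, 1.2]
ai = [aerosol_index(a, e) for a, e in zip(aod, ae)]
freq = normalized_frequency(ai, [0.0, 0.25, 0.5, 0.75, 1.0])
```

Normalizing to the peak bin makes distributions from sensors with different sample counts directly comparable in shape, which is what the pattern comparison relies on.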

Summary and outlook
This study is the first to report a long-term, advanced-performance, high-resolution, continuous, high-quality monthly climatology of the AI vertical structure from CALIOP observations over the TP, which can be used to better understand aerosol radiative forcing against the background of accelerated climate change. Using the relationship developed when EC measurements are available, we screened the entire EC record and assembled a climatology of high-altitude aerosol characteristics for daytime and nighttime from 2007 to 2020. In addition to providing a monthly climatology AI dataset for MODIS and ground-based lidar validation, our dataset also reveals the patterns and numbers of high-altitude vertical structure characteristics of tropospheric aerosol over the TP.
To produce accurate AI values with higher reliability, we applied several correction procedures and rigorously checked them against data quality constraints over the long observation period spanning almost 14 years (2007-2020). Nevertheless, some uncertainties remain, mainly due to technical constraints as well as limited documentation of the measurements. Even though it is not completely accurate, this strategy is expected to at least reduce the inaccuracy of the computed characteristic values of the aerosol optical parameters. Following this initial work, we obtained vertical AI values with higher reliability. These provide information about the vertical structure of aerosol that could be used in climate models. The collection of more reliable and robust research datasets of aerosol characteristics in these extreme environments is the key basis for promoting comprehensive research on the ground-atmosphere radiation energy balance over the Tibetan Plateau and even globally. We expect that this dataset will help current and future research to simulate climate change on the basis of the monthly climatology. It will also help to update future datasets and to study the interaction of aerosol, cloud, and precipitation, thus providing a sufficient observational basis.
Finally, it should be pointed out that the AI obtained in the ultraviolet channel can currently characterize both absorbing and non-absorbing aerosols. The AI obtained in our work cannot effectively distinguish absorbing from non-absorbing aerosols because our results lie outside the ultraviolet band, which is an area that we need to explore further in the future. However, the vertical-structure AI we obtained represents aerosol concentration information that the column AOD does not provide. The significance of our work is that the AI with higher reliability obtained here yields aerosol concentration information more effectively and also presents the diversity of aerosol types with height over the TP. This is the main highlight of our research work. We compare the AI with the MODIS and ground-based lidar results in order to verify its effectiveness and reliability. The test results are very consistent and reasonable. Therefore, the AI of the
Earth Syst. Sci. Data, 16, 1185-1207, 2024, https://doi.org/10.5194/essd-16-1185-2024

Figure 1. The geopotential height of the TP and its surrounding areas (27-42°N, 75-102°E), and a schematic diagram of the CALIPSO satellite orbit transits over the TP in all months of 2007 (with 2007 as an example; satellite orbit transit images from https://www.earthdata.nasa.gov, last access: 18 June 2022). The seasons are classified as follows: March-May is spring, June-August is summer, September-November is autumn, and December-February is winter.

Figure 2. The CALIPSO satellite orbit passing through the central area of the Taklamakan Desert hinterland (left). The red triangle marks the observation coordinates of the ground-based lidar (right; 38.967°N, 83.65°E; 1099.3 m). TD: Taklamakan Desert; TP: Qinghai-Tibetan Plateau. Images from NASA's Earthdata (left) and photography (right).

Figure 3. Flowchart of the construction and calculation process of the aerosol optical characteristics dataset over the TP.

Figure 4. Monthly frequency distribution of the aerosol extinction coefficient at 532 nm over the Tibetan Plateau (TP) in the daytime during 2007-2020 from January to December.The panels in the first, second, third, and fourth rows correspond to winter (December-February), spring (March-May), summer (June-August), and autumn (September-November).The frequency distribution is the number of events normalized to the maximum value.The upper outlier, extreme outlier, and median are also shown in all the panels.

Figure 6. Monthly frequency distribution of the aerosol extinction coefficient at 532 nm over the TP at nighttime during 2007-2020 from January to December. The panels in the first, second, third, and fourth rows correspond to winter (December-February), spring (March-May), summer (June-August), and autumn (September-November). The frequency distribution is the number of events normalized to the maximum value. The upper outlier, extreme outlier, and median are also shown in all the panels.

Figure 8. The monthly average comparison and difference of the aerosol extinction coefficient at 532 nm before and after elimination of the LRAT over the TP for daytime during 2007-2020. The reddish-brown dotted line denotes the surface. BS: before-screened (first row); AS: after-screened (second row); BS-AS: before-screened minus after-screened, representing the spatial grid cells that were screened and eliminated (third row).

Figure 9. Same as Fig. 8 but for the aerosol EC at 1064 nm for the daytime.

Figure 11. Same as Fig. 8 but for nighttime at 532 nm.

Figure 14. Frequency test of the AI calculated from the MODIS-based aerosol AE and AOD over the Qinghai-Tibetan Plateau and the AI calculated from the CALIPSO-based aerosol AE and AOD with high reliability for daytime (before-screened, fourth row; after-screened, fifth row).

Figure 15. Elimination of low-reliability aerosol target signals detected by ground-based lidar in the hinterland of the Taklamakan Desert.

Figure 17. Monthly mean changes of vertical structure characteristics of the AI (mean and standard deviation) over the TP during 2007-2020.

Table 1. Comparison between MODIS and CALIOP existing data products ( represents the existing data products of the satellite, NAN indicates non-existent, and × represents non-existent data product parameters that need further calculation in this study).

Table 2. Basic technical specifications of the lidar in the hinterland of the Taklamakan Desert (TD).