Earth Syst. Sci. Data, 14, 1193–1213, 2022
https://doi.org/10.5194/essd-14-1193-2022
Data description paper
16 Mar 2022

A global land aerosol fine-mode fraction dataset (2001–2020) retrieved from MODIS using hybrid physical and deep learning approaches

Xing Yan1, Zhou Zang1, Zhanqing Li2, Nana Luo3, Chen Zuo1, Yize Jiang1, Dan Li1, Yushan Guo1, Wenji Zhao4, Wenzhong Shi5, and Maureen Cribb2
  • 1State Key Laboratory of Remote Sensing Science, College of Global Change and Earth System Science, Beijing Normal University, Beijing, 100875, China
  • 2Department of Atmospheric and Oceanic Science and ESSIC, University of Maryland, College Park, MD, 20740, USA
  • 3School of Geomatics and Urban Spatial Informatics, Beijing University of Civil Engineering and Architecture, Beijing 102612, China
  • 4College of Resource Environment and Tourism, Capital Normal University, Beijing, China
  • 5Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, China

Correspondence: Xing Yan (yanxing@bnu.edu.cn) and Zhanqing Li (zli@atmos.umd.edu)

Abstract

The aerosol fine-mode fraction (FMF) is valuable for discriminating natural aerosols from anthropogenic ones. However, most current satellite-based FMF products are highly unreliable over land. Here, we developed a new satellite-based global land daily FMF dataset (Phy-DL FMF) at a 1° spatial resolution, covering the period from 2001 to 2020, by synergizing the advantages of physical and deep learning methods. The Phy-DL FMF dataset is comparable to Aerosol Robotic Network (AERONET) measurements, based on the analysis of 361 089 data samples from 1170 AERONET sites around the world. Overall, Phy-DL FMF showed a root-mean-square error (RMSE) of 0.136 and a correlation coefficient of 0.68, and the proportion of results that fell within the ±20 % expected error (EE) envelope was 79.15 %. Moreover, the out-of-site validation against Surface Radiation Budget (SURFRAD) observations revealed that the RMSE of Phy-DL FMF is 0.144 (72.50 % of the results fell within the ±20 % EE). Phy-DL FMF showed superior performance over alternative deep learning or physical approaches (such as the spectral deconvolution algorithm presented in our previous studies), particularly for forests, grasslands, croplands, and urban and barren land types. As a long-term dataset, Phy-DL FMF shows an overall significant decreasing trend (at a 95 % significance level) over global land areas. Based on the trend analysis of Phy-DL FMF for different countries, the upward trend in the FMFs was particularly strong over India and the western USA. Overall, this study provides a new FMF dataset for global land areas that can help improve our understanding of spatiotemporal fine-mode and coarse-mode aerosol changes. The datasets can be downloaded from https://doi.org/10.5281/zenodo.5105617 (Yan, 2021).

Please read the corrigendum first before continuing.
1 Introduction

Evaluating the impact of anthropogenic aerosols on climate change and human health relies on the ability to separate the proportion of anthropogenic aerosols from the total aerosol loading (Anderson et al., 2005; Zheng et al., 2015; Li et al., 2016a). Although satellite remote sensing can provide global scale data on aerosol content that are represented by the aerosol optical depth (AOD), accurate monitoring of anthropogenic aerosols is still a major challenge. This is because a key parameter called the aerosol fine-mode fraction (FMF), which is used for discriminating anthropogenic aerosols from natural ones (Bellouin et al., 2005), has been regarded as highly unreliable according to satellite-based AOD retrievals, especially over land (Levy et al., 2013; Yan et al., 2017; Liang et al., 2021; Yang et al., 2020; Zang et al., 2021a).

Satellite-based FMF retrievals based on physical methods have been performed previously; currently, five global scale FMF products exist (Fig. 1), with temporal resolutions ranging from 1 to 16 d (Levy et al., 2007; Garay et al., 2020; C. Chen et al., 2020). Of these, Polarization and Directionality of the Earth's Reflectances (POLDER) performs multiangle and multispectral polarized measurements, which provide unique advantages in the retrieval of aerosol FMF (Dubovik et al., 2011, 2019). Therefore, several POLDER-based FMF retrieval methods have been proposed in recent years (Zhang et al., 2016, 2021), such as the generalized retrieval of aerosols and surface properties (Dubovik et al., 2014). However, POLDER ended its mission in 2013, whereas the Moderate Resolution Imaging Spectroradiometer (MODIS) has operated for about 20 years and continues to perform well (K. Yan et al., 2021; G. Yan et al., 2021). In addition, the Advanced Along-Track Scanning Radiometer (AATSR) ended its mission in 2012 (Kolmonen et al., 2016), while the Visible Infrared Imaging Radiometer Suite (VIIRS) began its mission in 2012 and has so far provided less than 10 years of global FMF products (Sawyer et al., 2020). Currently, only the MODIS Dark Target (DT) method has been used to generate global aerosol FMF products over both land and ocean. However, the MODIS DT-derived FMF over land is highly unreliable and is not recommended for use even though it has evolved to the Collection 6.1 (C6.1) level (Levy et al., 2013; C. Chen et al., 2020). To improve the accuracy of MODIS land-based FMF retrievals, improvements have been made to physical approaches, such as the lookup table-based spectral deconvolution algorithm (LUT-SDA, Yan et al., 2017, 2019). Using the LUT-SDA model in previous research, we developed a 10-year global land FMF dataset (Yan et al., 2021b) with moderately improved retrieval accuracy (root-mean-square error, RMSE = 0.22). Nevertheless, Lipponen et al. (2018) noted that, because MODIS provides no multiangle or multispectral polarized information, MODIS-based FMF retrievals using physical methods still suffer from major limitations.

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f01

Figure 1. Overview of the time periods covered by different satellites that provide global scale FMF products. AATSR: Advanced Along-Track Scanning Radiometer; MISR: Multiangle Imaging Spectroradiometer; MODIS: Moderate Resolution Imaging Spectroradiometer; POLDER: Polarization and Directionality of the Earth's Reflectances; VIIRS: Visible Infrared Imaging Radiometer Suite.


In recent years, deep learning approaches have been applied to satellite-based atmospheric research (Zang et al., 2021b; Yan et al., 2020a; Yuan et al., 2020; Shen et al., 2018; Ong et al., 2016), including FMF retrieval (X. Chen et al., 2020). Compared with classical machine learning methods, deep learning is more capable of approximating nonlinear relationships (Yan et al., 2021c). For example, X. Chen et al. (2020) used a convolutional neural network (CNN) to develop a deep learning model for MODIS FMF retrievals called the neural network-based AEROsol retrieval (NNAero) method. The NNAero-derived FMF is a significant improvement over the MODIS DT-derived FMF, with the RMSE decreasing from 0.34 (DT) to 0.1567 (NNAero). However, this method has only been applied and validated over northern and eastern China, and not globally. As an important limitation, Zhang et al. (2016) noted that satellite-measured multispectral reflectance of ground-based data alone was not sufficient to retrieve FMF with high accuracy. In addition, the SDA technical memo (O'Neill et al., 2008) showed that when the temperature is low, the error of the fine-mode AOD calculated by the physical method, i.e., the spectral deconvolution algorithm (SDA), is clearly large. Although this issue has long been known, the relationship between meteorological factors and FMF is complex and difficult to describe by equations in the SDA. Because of its powerful ability to describe nonlinear relationships, a deep-learning model may overcome the deficiencies of the SDA in calculating FMF.

To address the above issues, we synergized the advantages of the physical method and deep learning to retrieve aerosol FMF over land on a global scale using MODIS data. We tested and validated this hybrid model using two decades of data (2001–2020) and produced a new long-term FMF dataset called physical-deep learning FMF (Phy-DL FMF). Contrary to previous studies, the proposed hybrid model considers both physical characteristics and nonlinear relationships to constrain the FMF calculation. This long-term dataset shows good promise for shedding light on the impacts of human activities on atmospheric aerosols, providing a foundation for understanding the variations in fine-mode aerosols on a global scale.

2 Materials and methods

2.1 MODIS data

The MODIS sensor onboard Terra has provided long-term observations on a global scale every day since February 2000 (Levy et al., 2010), available at the Atmosphere Archive & Distribution System Distributed Active Archive Center. In this study, MODIS C6.1 L1B MOD02SSH data (i.e., top of the atmosphere (TOA) reflectances from Band 1 to Band 7), MODIS C6.1 L3 MOD09CMG data (surface reflectances from Band 1 to Band 7), and MODIS C6.1 L3 MOD08 daily data were obtained from 2001 to 2020 for retrieving FMF. Supplement Table S1 summarizes details about the MODIS data used in this study.

2.2 AERONET data

The AERONET is a worldwide, sun–sky photometer network providing ground-level aerosol properties, recently updated to Version 3 (Holben et al., 1998). To retrieve FMF from AERONET solar extinction data, O'Neill et al. (2001a, 2001b, 2003) developed the SDA method. The FMFs based on this inversion method (i.e., SDA FMF) have been included in the standard AERONET data with an estimated uncertainty of 0.1 (O'Neill et al., 2001b, 2003). Since there are not enough level 2.0 data for use as training data for modeling purposes, here, we used the level 1.5 SDA FMF dataset generated from data from 1170 global AERONET sites covering the period of 2001 to 2020 as the ground truth for further modeling and validation (Fig. S1a in the Supplement). These AERONET sites are spread around the world, enabling the construction of a universal model and allowing a more thorough validation of the new FMF product.

2.3 Meteorological data

Previous studies have reported that meteorological variables are significantly correlated to fine-mode and coarse-mode aerosols. Tai et al. (2010), Liang et al. (2016), and Shen et al. (2018) all revealed that meteorological variables like temperature, relative humidity (RH), and wind speed explain much of the variations in PM2.5 concentrations (> 50 %). Xiang et al. (2019) and Gui et al. (2019) found a negative association between planetary boundary layer height (PBLH) and PM2.5, and Kang et al. (2014) found that fine-mode aerosols and air pressure were significantly correlated. In this study, to investigate the correlation between meteorological variables and the FMF, we implemented the generalized additive model (GAM). Figure S2 reveals that the meteorological variables considered in this study, i.e., PBLH, temperature, surface pressure, RH, and wind speed, all had significant nonlinear relationships with the FMF (at the 99 % significance level). Both PBLH and surface pressure had similar influences on the FMF, i.e., a positive (negative) response when PBLH and surface pressure values were low (high). This is because high PBLH and surface pressure values can increase the diffusion of fine particles, decreasing the magnitude of the FMF (Tai et al., 2010). Meanwhile, the negative response of the FMF to wind speed also reflects the influence of fine particle diffusion as well as the contribution of dust particles strengthened by wind speed (Luo et al., 2016). Increasing temperatures corresponded to decreasing FMFs, partly due to unfavorable diffusion conditions (Tai et al., 2010). On the other hand, more fine particles are released by heating during colder seasons than during warmer seasons (Ramachandran, 2007). The RH had a strong positive influence on PM2.5 concentrations when RH was between 25 % and 75 %. This reflects the secondary particle formation boosted by the increasing RH that contributed to the fine particles (Tai et al., 2010). 
Therefore, in this study, we used surface temperature, air pressure, PBLH, RH, and wind speed as inputs to the deep-learning model.

Because meteorological factors affect the FMF, five meteorological variables (i.e., 2 m air temperature, PBLH, surface pressure, 10 m U/V wind components, and 2 m dew point temperature) were obtained from the fifth generation product produced by the European Centre for Medium-Range Weather Forecasts (ERA5), with hourly data available since 1950 at a 0.25° spatial resolution (Fig. S1b–f). The RH was then calculated from the 2 m dew point temperature and air temperature (Tetens, 1930). Given the overpass time and spatial resolution of the MODIS data, only meteorological data collected from 10:00 to 11:00 local time were used, and these were resampled to 1° × 1° to obtain daily averages.
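The RH derivation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the commonly quoted Tetens coefficients for saturation vapour pressure over water (6.1078 hPa, 17.27, 237.3 °C), assumes ERA5-style inputs in kelvin, and the function names are hypothetical. The wind-speed helper simply combines the 10 m U and V components.

```python
import math


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Tetens-type approximation over liquid water (assumed coefficients).

    temp_c is in degrees Celsius; the result is in hPa.
    """
    return 6.1078 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def relative_humidity_percent(t2m_k: float, d2m_k: float) -> float:
    """RH (%) from 2 m air temperature and 2 m dew point, both in kelvin."""
    t_c = t2m_k - 273.15
    td_c = d2m_k - 273.15
    rh = 100.0 * saturation_vapor_pressure_hpa(td_c) / saturation_vapor_pressure_hpa(t_c)
    # Clamp to the physical range; small overshoots can come from input noise.
    return max(0.0, min(100.0, rh))


def wind_speed(u10: float, v10: float) -> float:
    """Scalar wind speed (m/s) from the 10 m U and V components."""
    return math.hypot(u10, v10)


if __name__ == "__main__":
    # Hypothetical grid-cell values, for illustration only.
    print(relative_humidity_percent(t2m_k=303.15, d2m_k=283.15))
    print(wind_speed(3.0, 4.0))
```

Taking the ratio of two Tetens evaluations means the leading constant cancels, so only the exponent coefficients affect the resulting RH.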

2.4 Combining physical and deep-learning models (Phy-DL) for retrieving FMF

In this study, we used a concatenation mode to combine a physical model and a deep-learning model, i.e., the outputs of the physical model were used as the inputs for the deep-learning model (Fig. 3). The physical model used was the LUT-SDA (Yan et al., 2017). The LUT-SDA is designed for satellite FMF retrievals when only AODs at two wavelengths are available (such as DT AOD products). As shown in Eq. (1) of the SDA (O'Neill et al., 2001a), AODs at a minimum of three wavelengths are needed to first obtain the spectral derivative (α′) of the Ångström exponent (AE, α). The AE of the fine-mode AOD (α_f) and the FMF can then be calculated.

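$$
\begin{aligned}
\alpha_f &= \frac{1}{2(1-a)}\left\{\left(\alpha-\alpha_c-\frac{\alpha'-\alpha'_c}{\alpha-\alpha_c}+b^{*}\right)+\left[\left(\alpha-\alpha_c-\frac{\alpha'-\alpha'_c}{\alpha-\alpha_c}+b^{*}\right)^{2}+4c^{*}(1-a)\right]^{1/2}\right\}+\alpha_c,\\
\mathrm{FMF} &= \frac{\alpha-\alpha_c}{\alpha_f-\alpha_c},
\end{aligned}
\tag{1}
$$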

where a, b*, c*, α_c, and α′_c are fixed parameters described in Sect. 1 of the Supplementary Information document, based on O'Neill (2010). Since AODs at two wavelengths are not sufficient to calculate α′, for the global physically based FMF retrieval, we first divide the whole world into nine regions (as done by Sayer et al., 2014) and use historical AERONET observational data to determine α′ value ranges in these regions. The α′ range of values is based on the first and third quartiles of AERONET measurements in different seasons. For example, in Southeast Asia, α′ ranges from 0.12 to 0.60 in spring (Yan et al., 2021b). In these nine regions, a set of hypothetical values for α′ (as determined by Yan et al., 2021b), α_f, and AE (α) are imported into the SDA (Eq. 1) to build the relationship with FMF (Fig. 2).
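To make the SDA relation in Eq. (1) concrete, the sketch below evaluates α_f and FMF for one (α, α′) pair, where α′ denotes the spectral derivative of the AE. The parameter values in the usage example are placeholders chosen only for illustration; the actual fixed parameters are given in the Supplement and are not reproduced here. The function name is hypothetical.

```python
import math


def sda_fmf(alpha, alpha_prime, a, b_star, c_star, alpha_c, alpha_prime_c):
    """Evaluate the SDA relation (Eq. 1) for a single observation.

    alpha        : Angstrom exponent (AE) of the total AOD
    alpha_prime  : spectral derivative of the AE
    a, b_star, c_star, alpha_c, alpha_prime_c : fixed SDA parameters
    Returns (alpha_f, fmf). Undefined when alpha == alpha_c (division by zero).
    """
    t = alpha - alpha_c - (alpha_prime - alpha_prime_c) / (alpha - alpha_c)
    discriminant = (t + b_star) ** 2 + 4.0 * c_star * (1.0 - a)
    alpha_f = ((t + b_star) + math.sqrt(discriminant)) / (2.0 * (1.0 - a)) + alpha_c
    fmf = (alpha - alpha_c) / (alpha_f - alpha_c)
    return alpha_f, fmf


if __name__ == "__main__":
    # Placeholder parameters (assumptions, NOT the values used in the paper).
    params = dict(a=-0.22, b_star=0.5, c_star=1.0, alpha_c=-0.15, alpha_prime_c=0.0)
    alpha_f, fmf = sda_fmf(alpha=1.2, alpha_prime=0.4, **params)
    print(alpha_f, fmf)
```

Written this way, α_f − α_c is the positive root of the quadratic (1 − a)x² − (t + b*)x − c* = 0, which gives a convenient self-check when building a lookup table from many hypothetical (α, α′) pairs.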

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f02

Figure 2. Visual representation of SDA-based FMF retrieval LUT.


Different LUTs based on the SDA for these regions are thus created. Based on the constructed LUT, initial results are obtained using a cost function:

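$$
\left(\mathrm{FMF}_{1},\ \alpha_{1},\ \alpha_{f1}\right) = \min\left[\left(\text{LUT-SDA}_{\mathrm{AE}} - \mathrm{MODIS}_{\mathrm{AE}}\right)\right]^{2}, \tag{2}
$$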

where FMF_1, α_1, and α_f1 are the uncorrected initial results of FMF, α, and α_f from the LUT-SDA, LUT-SDA_AE is the α in the LUT, and MODIS_AE is the MODIS MOD08 DT-based AE. After performing the α′ bias error correction (described in Supplement, Sect. 2, O'Neill et al., 2003) and the mean of extreme (MOE) modification (described in Supplement, Sect. 3, O'Neill et al., 2008), the final FMF output is:

$$
\mathrm{FMF}_{\mathrm{output}} = \frac{\alpha-\alpha_c}{\alpha_{f1}^{\mathrm{corrected}}-\alpha_c}. \tag{3}
$$
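The lookup step in Eq. (2) amounts to a nearest-match search over the regional LUT: the entry whose AE is closest to the MODIS AE (in the squared-difference sense) supplies the initial FMF, α, and α_f. The sketch below assumes a LUT stored as a list of rows; the row layout, the toy values, and the function name are all hypothetical, and the subsequent bias and MOE corrections are not included.

```python
def lookup_initial_fmf(lut, modis_ae):
    """Return (fmf_1, alpha_1, alpha_f1) from the LUT row minimizing
    (LUT AE - MODIS AE)^2, i.e. the cost function of Eq. (2)."""
    best = min(lut, key=lambda row: (row["ae"] - modis_ae) ** 2)
    return best["fmf"], best["ae"], best["alpha_f"]


# Toy regional LUT with made-up numbers, for illustration only.
TOY_LUT = [
    {"ae": 0.5, "alpha_f": 1.4, "fmf": 0.35},
    {"ae": 1.0, "alpha_f": 1.6, "fmf": 0.60},
    {"ae": 1.5, "alpha_f": 1.8, "fmf": 0.85},
]

if __name__ == "__main__":
    print(lookup_initial_fmf(TOY_LUT, modis_ae=1.1))
```

A real LUT would hold many rows per region and season; a sorted array with binary search would scale better than this linear scan.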

The deep-learning model used in this study is called EntityDenseNet (Yan et al., 2020). The EntityDenseNet incorporates the Entity Embeddings method (Guo and Berkhahn, 2016) that can directly process spatial or time-based features, such as location, season, and month. It includes one input layer, two hidden layers, and one output layer. Each hidden layer has one fully connected layer, one rectified linear unit (ReLU) layer, one batch normalization (BN) layer, and one dropout layer. The feed-forward operation of each hidden layer can be written as

$$
a_{n+1} = \mathrm{BN}\left\{ f\left[ W_{n+1}\, D(a_{n}) + b_{n+1} \right] \right\}, \tag{4}
$$

where n is the layer number, a_n is the output vector from layer n, D() is the dropout layer for thinning the vector a_n, W_{n+1} and b_{n+1} are the weights and biases, respectively, at layer n+1, f[] is the ReLU activation function, and BN is the batch normalization function.
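The feed-forward operation in Eq. (4) can be written out for a single input vector as below. This is a plain-Python sketch of the layer ordering only (dropout, then the fully connected layer, then ReLU, then batch normalization); it is not the EntityDenseNet implementation. Two details are assumptions: dropout uses the common "inverted" scaling during training, and batch normalization uses fixed running statistics, as is usual at inference time.

```python
import math
import random


def dropout(vec, rate, training, rng):
    """D(a_n): randomly zero elements during training (inverted scaling assumed)."""
    if not training or rate == 0.0:
        return list(vec)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def relu(vec):
    """f[.]: rectified linear unit."""
    return [max(0.0, x) for x in vec]


def batch_norm(vec, mean, var, gamma, beta, eps=1e-5):
    """BN{.}: normalize with running mean/variance, then scale and shift."""
    return [g * (x - m) / math.sqrt(v + eps) + b
            for x, m, v, g, b in zip(vec, mean, var, gamma, beta)]


def hidden_layer(a_n, weights, biases, bn, rate=0.0, training=False, rng=None):
    """a_{n+1} = BN{ f[ W_{n+1} D(a_n) + b_{n+1} ] } for one sample."""
    thinned = dropout(a_n, rate, training, rng or random.Random(0))
    z = [sum(w * x for w, x in zip(row, thinned)) + b
         for row, b in zip(weights, biases)]
    return batch_norm(relu(z), **bn)
```

With `training=False` the dropout step is the identity, so the layer is deterministic, which is the mode that would be used when applying a trained model to produce retrievals.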

In this study, we combine Phy-based FMF into EntityDenseNet along with satellite measurements and meteorological data to reduce FMF retrieval biases (Fig. S3). As shown by Yan et al. (2021b), the global land Phy-based FMF is still not reliable enough and there is room for improvement. Due to its unknown and known error sources (e.g., MODIS-derived AE) and nonlinearity in the data itself, a linear model may not be able to correct these errors. In addition, current physical retrieval methods do not use all the information provided by satellite observations pertaining to aerosol size information retrieval (Zang et al., 2021b). Lipponen et al. (2018) showed that applying a machine-learning model to satellite TOA reflectance and geometry data can significantly improve the retrieval accuracy of aerosol size. Other studies have also suggested that surface reflectance and meteorological factors can also impact the FMF retrieval accuracy (Yan et al., 2021a; X. Chen et al., 2020). Thus, besides Phy-based FMF, we input MODIS TOA reflectance data, geometry data, surface reflectance, and meteorological data into EntityDenseNet for the final Phy-DL FMF calculation (Table S1). In the deep learning model training process, 70 %, 20 %, and 10 % of all input data are randomly separated into groups of data for training, validation, and testing, respectively. The validation data are used for the hyperparameter optimization (node numbers and dropout rate in each hidden layer) of the deep-learning model. The testing data are used to evaluate the performance of the trained deep-learning model. When the trained model is finally optimized by the validation and testing data, we apply this trained deep-learning model to reconstruct global land FMF for the period of 2001 to 2020.
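The random 70 % / 20 % / 10 % partition described above can be sketched as an index split. The seed, the rounding rule, and the function name are assumptions made for illustration; the text does not specify how the authors implemented the split.

```python
import random


def split_indices(n_samples, fractions=(0.7, 0.2, 0.1), seed=0):
    """Randomly partition sample indices into training, validation, and test sets.

    The test set receives whatever remains after the first two groups,
    so the three groups always cover every sample exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))
```

A purely random split mixes samples from the same site across the three groups, which is why the independent SURFRAD comparison is a useful complement to the held-out test data.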

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f03

Figure 3. Technical flowchart for the production of the global land FMF product. The satellite true-color image is a composite of three MODIS bands (red, green, and blue) from the MODIS C6.1 L1B MOD02SSH data.

2.5 In situ observations for independent validation

The National Oceanic and Atmospheric Administration Surface Radiation Budget (SURFRAD) network provides long-term, multiband AOD observations at a temporal resolution of 3 min (Augustine et al., 2000). The Multifilter Rotating Shadowband Radiometer (MFRSR) at SURFRAD sites provides spectral solar measurements in channels that are approximately 10 nm wide, with peaks nominally at 415, 500, 614, 670, 870, and 940 nm (Harrison et al., 1994). In this study, we selected four SURFRAD sites (Table S2) that are distant from AERONET sites and applied the SDA method to calculate the FMF (SURFRAD FMF) for validation purposes. Because SURFRAD FMF was not included in model training, these data are suitable for the independent validation of FMF products.

2.6 Other global FMF products for comparison

The Phy-DL-derived FMFs were compared with the following FMF products from three other satellite missions (Table S3):

  • a. POLDER/GRASP FMF: Launched in December 2004, POLDER-3 onboard the Polarization and Anisotropy of Reflectances for Atmospheric Sciences coupled with Observations from a Lidar (PARASOL) satellite was operational from March 2005 to October 2013, making multi-angular polarization measurements. By capitalizing on the small and fairly neutral polarized reflectances (Deuze et al., 2001), POLDER/GRASP is able to provide the fine-mode AOD (fAOD, radius < 0.35 µm) in two categories: high-precision and models. Because high-precision fAODs perform better than models fAODs (Wei et al., 2020), we used monthly high-precision POLDER/GRASP fAODs and AODs (both at 490 nm) at a spatial resolution of 1° for calculating FMF (at 490 nm) (FMF = fAOD/AOD).

  • b. Multi-angle Imaging SpectroRadiometer (MISR) FMF: The MISR instrument onboard the National Aeronautics and Space Administration Earth Observing System Terra satellite has been working continuously since 2000 (Diner et al., 1998; Kahn and Gaitley, 2015). The MISR has nine push-broom cameras with different nominal viewing angles, allowing it to distinguish aerosol types, including aerosol size (Garay et al., 2020). The MISR algorithm retrieves small-mode AODs (at 550 nm) due to aerosol particles with radii less than 0.35 µm at a spatial resolution of 0.5°. We used these to calculate FMF as the ratio of fAOD to AOD for further comparisons in this study.

  • c. MODIS FMF: The latest C6.1 MODIS aerosol product (Levy et al., 2013) no longer includes global scale FMF, so we used FMF at 550 nm from the previous collection (C5) for comparison purposes (Levy et al., 2007). Although this MODIS FMF product is not reliable over land (Levy et al., 2010), it was used in numerous previous studies (Ramachandran, 2007; Vinoj et al., 2014), including for PM2.5 estimations (Li et al., 2016b; Zhang and Li, 2015).
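For the POLDER/GRASP and MISR products, the comparison FMF is the ratio of fine-mode (or small-mode) AOD to total AOD. A minimal sketch of that ratio is given below; the handling of missing or non-positive AOD and the clipping to the range 0 to 1 are assumptions added here for robustness, not steps described in the text, and the function name is hypothetical.

```python
def fmf_from_aod(fine_aod, total_aod):
    """FMF = fAOD / AOD for one grid cell.

    Returns None when either input is missing or the total AOD is not
    positive (assumed screening rule); clips the ratio to [0, 1].
    """
    if fine_aod is None or total_aod is None or total_aod <= 0.0:
        return None
    return min(1.0, max(0.0, fine_aod / total_aod))


if __name__ == "__main__":
    # Hypothetical values for a single pixel.
    print(fmf_from_aod(fine_aod=0.12, total_aod=0.30))
```

Both AODs must refer to the same wavelength (490 nm for POLDER/GRASP, 550 nm for MISR in the list above) for the ratio to be meaningful.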

3 Results

3.1 Phy-DL FMF validation

Figure 4 shows the validation of the Phy-DL FMF against AERONET FMF. By matching 20 years of estimated Phy-DL FMF with AERONET FMF (number of match-ups, N = 361 089), we first evaluated the overall performance of Phy-DL FMF (Fig. 4a). The correlation coefficient (R) was 0.68, and the RMSE was 0.136. Approximately 90 % and 79 % of retrievals fell within the expected error (EE) envelopes of ±40 % and ±20 %, respectively (these envelopes have been adopted from X. Chen et al., 2020). Figure 4b shows the biases of the Phy-DL FMF (estimated FMF minus AERONET FMF) as a function of the AERONET FMF. The Phy-DL FMF slightly underestimated the FMF, with a negative median bias in each FMF bin. As the FMF increased, a higher percentage of retrievals fell within the ±20 % EE envelope, ranging from 42.85 % (when FMF < 0.3) to 91.17 % (when FMF > 0.8). This indicates that the Phy-DL FMF retrieval performed better when fine-mode aerosols dominated. Figure 4c shows the validations of Phy-DL FMF over different AERONET sites around the world. Most sites in the eastern USA and Europe have over 70 % of Phy-DL FMF falling within the EE envelope of ±20 %. Over 90 % of Phy-DL FMF fell within the ±20 % EE envelope at some sites in the Amazon Basin, southern Africa, and Southeast Asia. However, at coastal AERONET sites in the Caribbean and Mediterranean regions, Australia, and South America, less than 60 % of Phy-DL FMF fell within the ±20 % EE envelope. A similar result was found for some AERONET sites near deserts in southern South America, Central Asia, Northwest China, and Central Australia.
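The three validation statistics used above (RMSE, correlation coefficient R, and the fraction of retrievals within an EE envelope) can be computed as sketched below. The envelope definition is an assumption: a retrieval is counted as "within ±20 %" when its absolute error is at most 20 % of the reference value. The text adopts its envelopes from X. Chen et al. (2020) without restating the formula, so this may differ from the definition actually used.

```python
import math


def validation_metrics(estimated, reference, rel_ee=0.20):
    """Return RMSE, Pearson R, and the fraction within the relative EE envelope.

    estimated, reference : equal-length sequences of matched FMF values
    rel_ee               : half-width of the envelope relative to the reference
                           (assumed definition, see lead-in)
    """
    n = len(estimated)
    pairs = list(zip(estimated, reference))
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in pairs) / n)

    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in pairs)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    pearson_r = cov / math.sqrt(var_e * var_r)

    within = sum(1 for e, r in pairs if abs(e - r) <= rel_ee * r) / n
    return {"rmse": rmse, "r": pearson_r, "within_ee": within}


if __name__ == "__main__":
    # Made-up match-ups, for illustration only.
    print(validation_metrics([0.5, 0.6, 0.9], [0.5, 0.8, 0.8]))
```

Under a relative definition like this one, the envelope narrows as the reference FMF decreases, which should be kept in mind when comparing low-FMF and high-FMF bins.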

To further investigate the bias in Phy-DL FMF, Fig. S4a shows that more than 75 % of the sites located on barren land have low percentages of Phy-DL FMF (< 60 %) falling within the EE envelope of ±20 %. About 4 % of the sites have high percentages of Phy-DL FMF (> 90 %) falling within the ±20 % EE envelope. This suggests that the accuracy of Phy-DL FMF over barren land is much lower than over other land types. The AODs over the bright surface used for the Phy-DL FMF retrieval were significantly overestimated, with the worst performance compared to other vegetated land cover types (Levy et al., 2010; Petrenko and Ichoku, 2013). This suggests that the performance of the Phy-DL FMF algorithm is poor when applied to regions with barren land. Figure S4b shows the bias of the Phy-DL FMF and the percentage of retrievals falling within the EE envelope of ±20 % as a function of the normalized difference vegetation index (NDVI). As NDVI increased from < 0.1 to > 0.8, the percentage of FMF retrievals falling within the ±20 % EE envelope also rose from < 70 % to > 85 %, and the range of bias decreased significantly. The core of the SDA method relies on AE as input (Yan et al., 2017). The AE from the MODIS DT aerosol product is still highly uncertain. The low accuracy of AE can significantly influence the performance of the Phy-DL FMF algorithm. As shown in Fig. S5, AEs from the MODIS MOD08 product used as input to the Phy-DL FMF algorithm performed the worst over barren land, with the highest RMSE (> 1) and the lowest percentage of retrievals falling within the EE envelope of ±0.45 (< 45 %). This would result in a lower performance of the Phy-DL FMF algorithm when applied to regions with barren land.

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f04

Figure 4. (a) Phy-DL FMF at 500 nm as a function of AERONET FMF. The solid black and red lines are the 1:1 line and the best-fit line obtained from linear regression, respectively. The dashed and dotted black lines represent the expected error (EE) envelopes of ±20 % and ±40 %, respectively. (b) Box plots of the FMF bias (estimated FMF minus AERONET FMF) as a function of AERONET FMF. The dashed horizontal black line indicates the zero bias. The red dot in each box represents the mean value of the FMF bias. The upper, middle, and lower horizontal lines in each box show the 75th, median, and 25th percentiles, respectively. The blue dots connected by the dashed curve are percentages of FMF retrievals falling within the EE envelope of ±20 %. (c) Global distribution of percentages of Phy-DL FMFs falling within the EE envelope of ±20 % at the AERONET sites.

Four sites from the SURFRAD network were selected for the independent validation of the Phy-DL FMF algorithm. As shown in Fig. 5a, the four sites (black triangles) are located across the USA, covering different land types from forested land to barren land. Figure 5b shows how SURFRAD and Phy-DL FMF compare. The R was 0.51, and the RMSE was 0.144, somewhat different than AERONET validation results (i.e., R= 0.68 and RMSE = 0.136). Furthermore, the Phy-DL FMF performance was validated at each SURFRAD site. Fig. 5c shows the bias of Phy-DL FMF (Phy-DL FMF minus SURFRAD FMF), percentage of retrievals falling within the ±20 % EE envelope, and RMSEs at each site. In general, most of the sites have a mean bias and an RMSE lower than 0.1 and 0.15, respectively, with over 70 % of the retrievals falling within the ±20 % EE envelope. The out-of-site validation reveals that the Phy-DL FMF algorithm is reliable even in regions without AERONET sites for model training.

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f05

Figure 5. (a) The locations of AERONET sites (green points) and four independent SURFRAD sites (black triangles) for the independent validation of the Phy-DL FMF algorithm. The base map shows the land types from MODIS MCD12C1 data (the International Geosphere-Biosphere Programme scheme, Table S4). (b) Phy-DL FMF at 500 nm as a function of SURFRAD FMF. The solid black and red lines are the 1:1 line and the best-fit line obtained from linear regression, respectively. The dashed and dotted black lines represent the expected error (EE) envelopes of ±20 % and ±40 %, respectively. (c) Box plots of bias (Phy-DL FMF minus SURFRAD FMF), percentage of FMF estimates falling within the EE envelope of ±20 % (dashed-dotted lines), and RMSEs at the four independent SURFRAD sites. The upper, middle, and lower lines in each box present the 75th, median, and 25th percentiles, respectively. The red point in each box represents the mean value of the FMF bias. DRA = Desert Rock, FPK = Fort Peck, GWN = Goodwin Creek, PSU = Penn State.

Figure S6 shows the frequencies of three FMF levels (low: FMF <  0.5, medium: 0.5 < FMF < 0.8, high: FMF > 0.8, Supplementary section S4) based on Phy-DL and AERONET FMF data from 2001 to 2020. Over 60 % of AERONET-derived FMFs were low over Central Asia, Central Australia, and sub-Saharan Africa; the AOD of these locations was dominated by coarse-mode aerosols (dust). The Phy-DL-estimated FMFs were also low over Central Asia and the sub-Sahara, but slightly underestimated over Central Australia (frequency < 40 %). Over 90 % of both AERONET and Phy-DL FMFs were at the medium level in South America, western Africa, Australia, western Asia, and the western USA. Approximately 45 %–55 % of Phy-DL and AERONET FMFs were at the medium level in the eastern USA, Europe, and Central Africa. Over 60 % of Phy-DL and AERONET FMFs were at a high level in northern India, Southeast Asia, and Southeast China.

3.2 Global distribution of FMF over land and trends from 2001 to 2020

Figure 6a shows the global distribution of mean Phy-DL FMF over land from 2001 to 2020. A high proportion of fine-mode aerosols with FMF greater than 0.77 can be seen in populated regions, including southern China, Southeast Asia, eastern Europe, and the eastern USA. Low FMF values (< 0.55) were observed in Australia, Northwest China, Central Asia, the Saharan region, southern South America, and the southeastern USA, where coarse-mode aerosols from large deserts dominate. Figure 6b shows the spatial distributions of the Phy-DL and AERONET FMF linear trends from 2001 to 2020. In general, both datasets show decreasing trends (i.e., < −3 × 10⁻³ yr⁻¹) in Northeast China, Central Asia, Europe, the Saharan region, southern Africa, South America, Mexico, and the eastern USA. In contrast, Southeast Asia, India, Central Australia, Central Africa, and the western USA show significant increasing trends of over 3 × 10⁻³ yr⁻¹. The increasing FMF trend over Central Australia is sporadic and could be related to an increase in fire activity (Andela et al., 2017). In South America and Africa, the long-term decrease in burning during the past two decades (Andela et al., 2017; Deeter et al., 2018) has contributed to a significant decrease in FMF. However, the reduced biomass burning in Central Africa is also partially offset by the dramatic growth in anthropogenic emissions (Zheng et al., 2019), leading to a slightly increasing trend in FMF (3 × 10⁻³ to 7 × 10⁻³ yr⁻¹). The decreasing FMF trends in Europe and the eastern USA are driven by reduced anthropogenic emissions from transportation sources (Crippa et al., 2016; Jiang et al., 2018). The decreasing FMF trend in Northeast China is likely to be associated with a decrease in industrial and residential emissions due to the implementation of clean air policies (van der Werf et al., 2017; Yang et al., 2018; Zheng et al., 2019). 
In the western USA, the dramatically increasing FMF trend is likely partly attributable to the increase in smoke from wildfires (Parks and Abatzoglou, 2020; O'Dell et al., 2019; Zhang et al., 2020). In India, the significant increase in FMF likely reflects an increase in vehicular anthropogenic emissions and crop residue burning (Jethva et al., 2019; Manoj et al., 2019). Figure 6c shows the time series of the global monthly mean Phy-DL and AERONET FMFs from 2001 to 2020. Both time series show similar annual cycles and decreasing trends (i.e., negative slopes). However, only the Phy-DL-estimated FMF decreasing trend was significant (−1.9 × 10⁻³ yr⁻¹ at the 95 % significance level). This is because the Phy-DL dataset has greater spatial coverage than that of the point-scale AERONET dataset.
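A linear trend and its significance, as discussed above, can be estimated with ordinary least squares and a Student's t-test on the slope. The text does not state which trend test the authors used, so this choice is an assumption, as are the function names and the synthetic series in the usage example. The critical value 2.101 is the standard two-sided 95 % value for 18 degrees of freedom (20 annual values); it must be changed for other sample sizes.

```python
import math


def linear_trend(years, values):
    """Ordinary least-squares slope and its t-statistic.

    Returns (slope_per_year, t_statistic). Needs at least three points.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    if std_err == 0.0:
        # Perfect fit: the t-statistic is unbounded.
        return slope, math.copysign(math.inf, slope)
    return slope, slope / std_err


def is_significant(t_statistic, t_critical=2.101):
    """Two-sided test; default critical value assumes 18 degrees of freedom."""
    return abs(t_statistic) > t_critical


if __name__ == "__main__":
    # Synthetic annual means (made up): a small decline plus alternating noise.
    years = list(range(2001, 2021))
    values = [0.70 - 0.002 * i + 0.001 * (-1) ** i for i in range(20)]
    slope, t_stat = linear_trend(years, values)
    print(slope, t_stat, is_significant(t_stat))
```

Monthly series with a strong annual cycle are usually deseasonalized, or tested with a method that accounts for autocorrelation, before the slope is interpreted; that step is outside this sketch.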

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f06

Figure 6. (a) Global distribution of Phy-DL FMF mean values over the 2001–2020 period. Only those pixels with over 120 retrievals per year were considered. (b) Global distribution of Phy-DL FMF linear trends from 2001 to 2020. Only those pixels with trends at the 95 % significance level were considered. The red and blue dots represent AERONET stations with increasing and decreasing linear trends, respectively, at the 95 % significance level. (c) Global monthly mean Phy-DL FMF (red line) and AERONET FMF (blue line). The shaded areas around each line represent the monthly mean FMF value ±0.1 × the monthly standard deviation. The double asterisks “**” indicate that the linear trend was at the 95 % significance level.

Figure 7 shows the global distributions of seasonal Phy-DL-estimated FMFs from 2001 to 2020. In Central Africa, spring had the lowest FMF, especially in northern Central Africa, due to dust transport (Huebert et al., 2003). Meanwhile, FMFs in summer and autumn were higher than those during winter. This is partly attributed to the high temperature and humidity conditions, which are conducive to the formation of fine-mode aerosols (Tan et al., 2015).

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f07

Figure 7 Phy-DL-estimated FMF seasonal mean values from 2001 to 2020. The seasons are defined as spring (March, April, and May), summer (June, July, and August), autumn (September, October, and November), and winter (December, January, and February) for both the Northern Hemisphere and Southern Hemisphere. Only those pixels with over 120 retrievals per year were considered when calculating the mean values.
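The season definition in the caption above maps directly onto calendar months, with the same mapping applied in both hemispheres. A minimal sketch of the grouping is given below; the record layout (pairs of month and FMF value) and the function name `seasonal_means` are assumptions made for illustration, not part of the dataset.

```python
from collections import defaultdict

# Month-to-season mapping as defined in the Figure 7 caption
# (same calendar months in both hemispheres).
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def seasonal_means(records):
    """Average FMF per season.

    records: iterable of (month, fmf) pairs, month in 1..12 (hypothetical layout).
    Returns a dict {season: mean FMF} for the seasons that have data.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, fmf in records:
        season = SEASON_BY_MONTH[month]
        sums[season] += fmf
        counts[season] += 1
    return {season: sums[season] / counts[season] for season in sums}
```

Note that winter spans the year boundary (December to February), so a per-year seasonal mean would need December assigned to the following year's winter; the sketch above pools all years and therefore sidesteps that choice.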

In India, FMFs were noticeably higher in autumn and winter, especially in northern India (i.e., the Indo-Gangetic Plain), where the FMF was greater than 0.87. During spring and summer, FMFs were usually less than 0.63 over India. Mhawish et al. (2021) also reported the same seasonal pattern. This is likely related to spring in India being the pre-monsoon season, when dust particles from nearby deserts are frequently transported to the country (Gautam et al., 2009). During that season, the dominant coarse-mode aerosols decrease from west to east over the Indo-Gangetic Plain (Kalapureddy and Devara, 2008), thereby leading to lower FMF, particularly in the western Indo-Gangetic Plain. In the post-monsoon seasons (autumn and winter), the higher FMF is attributed to the low boundary layer and non-convective atmosphere, which induce haze and stagnant conditions. Frequent biomass burning events also occur during these seasons (Ramachandran, 2007), which contribute to higher FMFs.

In Central Africa and the Amazon Basin, FMFs in summer and autumn were higher than those in spring and winter (seasons here correspond to the seasons in the Northern Hemisphere). This coincides with local biomass burning, which mainly occurs from early summer to the middle of autumn (Generoso et al., 2003; Perez-Ramirez et al., 2017). Accordingly, fine-mode aerosols including black carbon and organic carbon can contribute to higher FMFs in summer and autumn. Although Australia had low FMFs (< 0.6) in all seasons, some sporadic pixels in autumn had FMFs near 0.7, which may also be related to the frequent wildfires in autumn (Shi et al., 2021; Liu et al., 2021).

In the eastern USA, FMFs were the highest in summer; however, in the western USA, FMFs were the highest in winter. Across the entire USA, FMFs were lowest in spring. In the eastern USA, it is thought that accelerated photochemical reactions and stagnant conditions in summer produce the highest amount of ammonium sulfate of all seasons (Tai et al., 2010). Moreover, ammonium nitrate, whose content peaks in winter, is the main component of fine-mode particles in the western USA (Hand et al., 2012). This explains why FMF maxima occur in different seasons on the two sides of the country.

In eastern China, summer and autumn had higher FMFs (> 0.8) than spring and winter (< 0.78). This is probably because warm seasons with relatively high humidity and temperature can enhance the generation of secondary fine particles by gas-to-particle conversion (Tan et al., 2015). In addition, springtime dust transport in northeastern China increases the amount of coarse dust particles, thereby lowering the FMF (Huebert et al., 2003). In contrast, southeastern Asia had markedly higher FMFs in winter and spring (> 0.86) than in summer and autumn (< 0.8), owing to the intense biomass burning from January to April (Yin et al., 2019).

3.3 Comparison between Phy-DL, DL-based, and Phy-based FMFs

To analyze the differences in FMFs obtained by different methods, FMFs generated by the Phy-DL method, the deep-learning (DL) method (i.e., without the Phy-based FMF as input), and the Phy-based method (i.e., the LUT-SDA) from 2008 to 2017 were compared using AERONET FMF as the ground truth. Figure 8a shows the three types of FMF estimates averaged into 20 bins for AERONET AOD > 0.2, based on the method in Levy et al. (2007). Compared with the Phy-based FMF, the DL-based FMF gives a better estimate for low FMF (< 0.6), with an overall improvement in R from 0.51 to 0.60. However, the DL-based FMF is still significantly underestimated when AERONET FMF is greater than 0.6. The Phy-DL FMF improved the retrievals by reducing both the underestimation for high FMF values and the overestimation for low FMF values, with the highest R (0.81) among the three FMFs. The regression equation of Phy-DL FMF also improved substantially, with a smaller intercept and a slope closer to 1.
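The binned comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of Levy et al. (2007): it uses 20 equal-width bins of the reference FMF on [0, 1] and the population standard deviation within each bin, whereas the cited method may bin differently (for example, with equal sample counts per bin). The sample layout and the name `bin_by_reference` are hypothetical.

```python
from statistics import mean, pstdev


def bin_by_reference(samples, n_bins=20, min_aod=0.2):
    """Mean and standard deviation of estimated FMF per bin of reference FMF.

    samples: iterable of (aod, reference_fmf, estimated_fmf) triples
             (hypothetical layout).
    Only samples with aod > min_aod are kept, mirroring the AOD > 0.2 filter.
    Assumption: equal-width bins on [0, 1].
    Returns a list of (bin_centre, mean_estimate, std_estimate) for non-empty bins.
    """
    bins = [[] for _ in range(n_bins)]
    for aod, ref, est in samples:
        if aod <= min_aod:
            continue
        # A reference value of exactly 1.0 is assigned to the last bin.
        idx = min(int(ref * n_bins), n_bins - 1)
        bins[idx].append(est)
    summary = []
    for i, members in enumerate(bins):
        if members:
            centre = (i + 0.5) / n_bins
            summary.append((centre, mean(members), pstdev(members)))
    return summary
```

The bin means can then be regressed against the bin centres to obtain a slope, intercept, and R comparable in spirit to those shown in a panel such as Fig. 8a.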

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f08

Figure 8 Phy-DL (red), Phy-based (blue), and DL-based FMF (green) estimation compared with AERONET FMFs for AOD > 0.2 (at 500 nm, using data from 2008 to 2017). (a) The dots and the error bars indicate the means and standard deviations of the FMF estimates in 20 bins of AERONET FMF. The solid blue and red lines are the best-fit lines from linear regression. The dashed black line represents the 1:1 line. Linear regression relations and correlation coefficients (R) are given. (b) Boxplots of bias (estimated FMF minus AERONET FMF) and percentage of FMF estimates falling within the EE envelope of ±20 % (dotted, dashed lines) as a function of land type. The upper, middle, and lower lines in each box represent the 75th percentile, median, and 25th percentile, respectively. The diamond in each box represents the mean value of the FMF bias. (c) The RMSE for each land type against that of the AERONET FMF.


Figure 8b and c compare the accuracy of Phy-DL, Phy-based, and DL-based FMFs over five land types (forests, grasslands, croplands, urban, and barren). The five land types were selected based on MODIS MCD12C1 data from the International Geosphere-Biosphere Programme scheme. Figure 8b shows that Phy-DL FMF had the lowest bias, with mean values close to 0, the smallest range of bias, and the highest percentage of FMF retrievals within the ±20 % EE over all land types. Although DL-based FMF had a slightly smaller range of bias and a higher percentage of FMF retrievals within the ±20 % EE than Phy-based FMF over forests, croplands, and urban land types, DL-based FMF still had the largest mean bias and showed the worst performance over barren land types. In addition, the DL-based FMF had the highest RMSE among all the FMFs for all land types. Figure 8b and c show that Phy-DL, Phy-based, and DL-based FMFs all performed best over forests, with RMSE values of 0.120, 0.211, and 0.223, respectively (Fig. 8c). Likewise, all performed the worst over barren land, showing a significant negative bias, with less than 50 % of the FMF retrievals falling within the ±20 % EE envelope. Overall, Phy-DL-estimated FMFs showed a significant improvement over the Phy-based and DL-based FMFs, especially over forests, croplands, and urban land types, where the RMSEs and biases were noticeably reduced.
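The statistics used throughout this comparison (bias, RMSE, R, and the percentage of retrievals within the EE envelope) can be sketched as below. One point is an assumption: the ±20 % EE envelope is interpreted here as a relative tolerance, |estimate − reference| ≤ 0.20 × reference; this section does not spell out the definition, and an absolute tolerance of ±0.20 would change only the line marked in the code. The function names and the sample layout are hypothetical.

```python
import math
from collections import defaultdict


def validation_metrics(reference, estimate, ee=0.20):
    """Bias, RMSE, Pearson R, and percentage of estimates within the EE envelope."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference)
    # Bias follows the convention in the text: estimated minus reference FMF.
    diffs = [e - r for r, e in zip(reference, estimate)]
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    r_mean = sum(reference) / n
    e_mean = sum(estimate) / n
    cov = sum((r - r_mean) * (e - e_mean) for r, e in zip(reference, estimate))
    var_r = sum((r - r_mean) ** 2 for r in reference)
    var_e = sum((e - e_mean) ** 2 for e in estimate)
    corr = cov / math.sqrt(var_r * var_e) if var_r > 0 and var_e > 0 else math.nan
    # ASSUMPTION: relative envelope; use `abs(d) <= ee` for an absolute one.
    within = sum(abs(d) <= ee * r for d, r in zip(diffs, reference))
    return {"bias": bias, "rmse": rmse, "r": corr, "within_ee_pct": 100.0 * within / n}


def metrics_by_land_type(samples, ee=0.20):
    """samples: iterable of (land_type, reference_fmf, estimated_fmf) triples."""
    groups = defaultdict(lambda: ([], []))
    for land_type, ref, est in samples:
        groups[land_type][0].append(ref)
        groups[land_type][1].append(est)
    return {name: validation_metrics(refs, ests, ee) for name, (refs, ests) in groups.items()}
```

Grouping by land type before computing the metrics, as in `metrics_by_land_type`, corresponds to the per-land-type breakdown discussed above.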

For further evaluation, Phy-based, DL-based, and Phy-DL FMFs were validated against AERONET FMF at individual AERONET sites to show their spatial performance (Fig. S7). The DL-based FMFs generally have the highest RMSE, with 93.2 % of sites having RMSE greater than 0.11, compared with 81.0 % of sites for Phy-based FMF and only 34.8 % of sites for Phy-DL FMF. In particular, in Australia, India, southern South America, the Mediterranean region, and North America, DL-based FMFs have RMSEs predominantly exceeding 0.23, whereas the RMSEs of Phy-based FMF range from 0.11 to 0.23 and those of Phy-DL FMF are lower than 0.17. The Phy-DL FMF performed well in eastern Asia, southern Africa, Europe, and the eastern USA, with RMSE typically lower than 0.11. In contrast, Phy-based FMF in these regions has RMSE greater than 0.11, and DL-based FMF has a large number of sites with RMSE over 0.23. With respect to R, 69 % of sites have R over 0.6 for Phy-DL FMF, but only 21 % of sites for Phy-based FMF and 11 % of sites for DL-based FMF reach R over 0.6. According to Fig. S7b, d, and f, although DL-based FMF has fewer sites with R less than 0.1 in Europe and North America than Phy-based FMF, both have few sites with high R (> 0.6) in eastern China, India, southeastern Asia, the Saharan region, and the eastern USA. In contrast, most sites for Phy-DL FMF achieve this high R.

Figure 9 compares the annual mean FMF from 2008 to 2017 based on Phy-based, DL-based, and Phy-DL FMF estimations. In general, high FMFs (> 0.7) were well captured by the estimation methods over eastern China, Southeast Asia, Europe, southern Africa, the eastern USA, and Mexico. However, compared to DL-based and Phy-DL FMF, Phy-based FMF tends to underestimate the hotspots of FMF, such as eastern China and Central Africa. In some regions with comparatively low FMF (< 0.55), the estimates also differ considerably. For example, in Northeast Australia and southern South America, Phy-DL and AERONET FMFs agreed well, with values less than 0.55, but Phy-based FMFs were clearly overestimated by ∼0.1. In addition, in regions dominated by coarse-mode aerosols such as the Saharan region and Central Asia, only Phy-DL FMF captured the low FMF (< 0.45), while Phy-based FMF showed an overestimation of ∼0.1. DL-based FMF also captured the low FMF in Central Asia, yet overestimated the FMF in the Saharan region. In Central Africa, the FMF is relatively high (> 0.7) according to AERONET. The Phy-DL and DL-based FMFs captured this high value, yet Phy-based FMF is greatly underestimated, with FMF less than 0.7. In Australia, only Phy-DL FMF agreed well with AERONET FMF values of less than 0.6, while both DL-based and Phy-based FMFs showed severe overestimations, with FMF values reaching over 0.65.

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f09

Figure 9 Annual mean (a) Phy-based FMF estimates, (b) DL-based FMF estimates, and (c) Phy-DL FMF estimates. The colored dots in (a), (b), and (c) show annual mean AERONET FMF. Areas outlined in black show regions with noticeably large differences in the FMF estimates. Only those pixels with over 120 retrievals per year were considered. Data from 2008 to 2017 were averaged.
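The retrieval-count filter mentioned in the caption can be illustrated for a single pixel. The sketch below makes two assumptions that the caption does not settle: a pixel is kept only if every year in the period has more than 120 retrievals, and the multi-year value is the mean of the annual means. The data layout and the name `multi_year_mean` are hypothetical.

```python
from statistics import mean


def multi_year_mean(retrievals_by_year, min_count=120):
    """Multi-year mean FMF for one pixel, or None if the pixel is masked.

    retrievals_by_year: dict {year: [daily FMF values]} (hypothetical layout).
    ASSUMPTION: every year must have more than `min_count` retrievals, and
    the result is the mean of the annual means.
    """
    if not retrievals_by_year:
        return None
    if any(len(values) <= min_count for values in retrievals_by_year.values()):
        return None  # too few retrievals in at least one year
    annual_means = [mean(values) for values in retrievals_by_year.values()]
    return mean(annual_means)
```

Averaging annual means rather than pooling all daily values gives each year equal weight, which limits the influence of years with unusually dense sampling.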

3.4 Comparison with other satellite-based FMF products

Figure 10a–d shows the performance of Phy-DL, POLDER, MISR, and MODIS FMFs against AERONET FMFs. Because the three satellite FMF products cover different time ranges, we only compared retrievals made during the overlapping period from 2008 to 2013, when all products were available. The Phy-DL FMF performed the best, with R and RMSE values of 0.78 and 0.100, respectively. In addition, 96.31 % (84.74 %) of Phy-DL FMF fell within the EE envelope of ±40 % (±20 %), an improvement over other FMF products. The next best performing FMFs were from POLDER and MISR: POLDER shows R and RMSE values of 0.48 and 0.233, respectively, with 76.05 % (46.99 %) of the retrievals falling within the EE envelope of ±40 % (±20 %), while MISR FMF has R and RMSE values of 0.42 and 0.204, respectively, with 85.01 % (45.85 %) of the retrievals falling within the EE envelope of ±40 % (±20 %). Both POLDER and MISR FMFs were underestimated compared to AERONET FMF, especially when the AERONET FMF was greater than 0.6. In contrast, MODIS FMF was overestimated compared to AERONET FMF, especially for AERONET FMF greater than 0.6, where MODIS FMF reached values near 1. The overall performance of MODIS FMF was also the worst, with R and RMSE values of 0.37 and 0.282, respectively, and 68.88 % (44.48 %) of the retrievals falling within the EE envelope of ±40 % (±20 %). Figure 10e shows the probability density functions (PDFs) of the FMF biases (estimated FMF minus AERONET FMF). The Phy-DL PDF reveals that most of the biases were close to zero, suggesting the robustness of the Phy-DL method. The MISR and POLDER PDFs showed underestimations, with most of the biases near −0.2 and −0.1, respectively. The MODIS PDF showed overestimations, with biases concentrated near 0.05. Overall, of the four FMF products, the Phy-DL-estimated FMF agreed best with AERONET FMF. Figure S8 shows the global distributions of RMSE from validations of Phy-DL, POLDER, MISR, and MODIS FMFs against AERONET FMFs at the AERONET sites.
Concerning MISR FMF, 47.9 % of the sites had RMSEs higher than 0.23, and 5.3 % of the sites had RMSEs lower than 0.11, showing the worst performance. Concerning POLDER FMF, 29.7 % of the sites had RMSEs higher than 0.23, mainly in the USA, the Amazon Basin, southern Africa, western Europe, and Southeast Asia. The MODIS FMF performed well in eastern China, India, Europe, and the eastern USA, with 40.0 % of the sites having RMSEs lower than 0.11. In comparison, the Phy-DL FMF had RMSEs lower than 0.11 for 65.2 % of the sites. In addition, the number of match-ups of Phy-DL-estimated and AERONET FMF was the highest (N = 566), indicating a higher data coverage compared with the other FMF products. In terms of R (Fig. S9), at 82.2 % of the AERONET sites, R for MISR FMF was less than 0.2 (Fig. S9c). At 33.8 % of the AERONET sites, mainly in eastern China, India, and Australia, R for MODIS FMF was greater than 0.5, but at most sites in the USA and Europe, R was less than 0.2 (Fig. S9d). At 39.7 % of the AERONET sites, R for POLDER FMF was greater than 0.5 in Europe, the Amazon Basin, and eastern China, but at most sites in the USA, India, and Australia, R was less than 0.2 (Fig. S9b). The R for Phy-DL FMF was greater than 0.5 at 79.0 % of the AERONET sites, agreeing better with AERONET FMF than POLDER and MODIS FMFs in the USA, Africa, Southeast Asia, and Europe (Fig. S9a).

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f10

Figure 10 Evaluation of (a) Phy-DL (550 nm), (b) POLDER (490 nm), (c) MISR (550 nm), and (d) MODIS FMFs (500 nm) against AERONET FMF (500 nm) from 2008 to 2013. Solid black and red lines are 1:1 reference lines and best-fit lines from linear regression, respectively. Dashed and dotted black lines represent the EE envelopes of ±20 % and ±40 %, respectively. The number of samples (N), RMSE, correlation coefficient (R), and linear regression relation are given in each panel. (e) Probability density functions of the FMF bias (estimated FMF minus AERONET FMF) for Phy-DL (green), POLDER (orange), MISR (blue) and MODIS (red) FMFs.


The intercomparison results in Fig. S10 show that when validated against independent FMF observations not used for training the deep-learning model (SURFRAD FMF), Phy-DL FMF still outperformed the other satellite products, with the highest R (0.51), the lowest RMSE (0.143), and the greatest percentage of retrievals falling within the EE envelopes of ±20 % (69.08 %) and ±40 % (89.05 %). The POLDER results have an RMSE of 0.232 and an R of 0.32, with 76.10 % (48.23 %) of retrievals falling within the EE envelope of ±40 % (±20 %). The MISR results have an R and RMSE of 0.22 and 0.212, respectively, with 82.61 % (45.38 %) of retrievals falling within the EE envelope of ±40 % (±20 %). The MODIS results were the poorest, with an especially high RMSE (0.465) and low percentages of retrievals falling within the EE envelopes of ±40 % (37.23 %) and ±20 % (18.09 %). Overall, at the independent SURFRAD sites, Phy-DL FMF was still more accurate and reliable than the other FMF products.

Figure 11 compares the spatial distributions of annual mean MISR, MODIS, POLDER, and Phy-DL estimated FMFs from 2008 to 2013. In general, Phy-DL FMF was higher than the satellite-based FMFs over areas of known biomass burning and urban areas, including the eastern USA, the Amazon Basin, southern Africa, eastern China, and Australia. The Phy-DL and AERONET FMFs in eastern China reached over 0.7, while POLDER, MISR, and MODIS FMFs were significantly underestimated (∼0.6–0.7, ∼0.5–0.6, and generally < 0.4, respectively). In the western USA, Phy-DL and AERONET FMFs were higher than 0.6, but MODIS FMFs were < 0.4, and MISR and POLDER FMFs were < 0.6. In Central Africa, POLDER, Phy-DL, and AERONET FMFs were similar (> 0.7), but MISR FMF ranged from 0.6 to 0.7, and MODIS FMF exceeded 0.8. In Australia and the Amazon Basin, Phy-DL and MISR FMFs agreed well with AERONET FMFs (0.5–0.6 for Australia and ∼0.6–0.7 for the Amazon Basin), but POLDER and MODIS FMFs (< 0.4) were significantly underestimated compared with AERONET FMFs. Figure S11 shows the bias, the percentage of FMF retrievals falling within the EE envelope of ±20 %, and the RMSEs of MISR, MODIS, POLDER, and Phy-DL FMFs over five land types (forests, grasslands, croplands, urban, and barren), using data from 2008 to 2013. Over all land types considered and compared with the satellite-based retrievals, Phy-DL FMFs had the smallest biases, a higher percentage of FMFs falling within the EE envelope (> 67 %), and the lowest RMSE (< 0.127). Both POLDER and MISR FMFs had significant negative biases of 0.2 and 0.1, respectively, over all land types. The MODIS FMF had significant positive biases over forests and grasslands and negative biases over croplands, urban areas, and barren areas. Over forests, grasslands, croplands, and urban areas, MODIS FMF had the largest RMSE (> 0.280), and MISR FMF had the lowest percentage of FMFs falling within the EE envelope (< 40 %).
Over barren land, POLDER FMF performed the poorest of all FMF products (23.68 % of the FMFs falling within the EE envelope, and RMSE = 0.326).

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f11

Figure 11 Annual mean FMFs based on (a) Phy-DL, (b) POLDER, (c) MISR, and (d) MODIS. The colored dots show annual mean AERONET FMFs. Areas outlined in black circles show regions with noticeably large differences in the FMF estimates. Only those pixels with over 120 retrievals per year were considered in the Phy-DL estimation. Data from 2008 to 2013 were averaged.

Next, we conducted a comprehensive comparison of these satellite-based FMF products over Central Africa. Regarding FMF annual mean values (Fig. 12a–d), the POLDER and Phy-DL FMFs agreed the best with AERONET FMFs, capturing the high values in the middle part of Central Africa (> 0.76) and the low values along the coasts (< 0.7). Although MISR FMF also captured the low FMFs over coastal regions, FMFs were underestimated in the interior (< 0.7). However, MODIS FMF was significantly overestimated along the western coast (> 0.85) and underestimated in the southeastern part of Central Africa (< 0.4). Linear trends were also calculated for all the FMF products (Fig. 12e–h). Note that only the linear trends significant at the 95 % level were examined. AERONET FMF showed a significant increasing trend in the northern part of Central Africa (+0.01 yr⁻¹) and a decreasing trend in the southern region (−0.01 yr⁻¹). Of all the FMF products, Phy-DL FMF trends agreed best with AERONET FMF trends. The POLDER and MODIS FMF trends were greatly enhanced in the southern region (+0.05 yr⁻¹), while MISR FMF trends did not reflect the AERONET FMF trends well. Overall, Fig. 12 illustrates that in Central Africa, compared with the three satellite-based FMF products, Phy-DL FMF is more accurate and reliable.

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f12

Figure 12 (a–d) Spatial distributions of annual mean FMF (averaged from 2008 to 2013) over Central Africa based on Phy-DL, POLDER, MISR, and MODIS. The colored dots show annual mean AERONET FMF. (e–h) Spatial distributions of the FMF linear trend from 2008 to 2013 over Central Africa based on Phy-DL, POLDER, MISR, and MODIS. The colored dots show linear trends at the AERONET sites. Only pixels and dots with linear trends at the 95 % significance level are shown.

To examine seasonal differences between these methods, Fig. 13 compares the seasonal mean Phy-DL, POLDER, MISR, and MODIS estimated FMFs from 2008 to 2013, and Fig. S12 shows their differences (i.e., satellite estimates minus Phy-DL estimates). In all seasons, Phy-DL FMFs were generally higher than MISR FMFs over urban areas and regions where biomass burning was prevalent, such as the USA, eastern China, and India. During seasons in which fine-mode particles predominate (FMF > 0.8), such as summertime for the eastern USA and wintertime for eastern China and India, differences between Phy-DL and MISR FMFs reached < −0.18. The MODIS FMFs (< 0.2) were much lower than Phy-DL FMFs (> 0.6) in sub-Saharan Africa, India, China, Australia, and the western USA in all four seasons, with differences < −0.5. Conversely, during winters in the Amazon Basin and Central Africa, MODIS FMFs (> 0.74) were slightly higher than Phy-DL FMFs (∼0.66). POLDER FMFs were globally lower than Phy-DL FMFs in all four seasons, with differences of ∼0.2 in magnitude. In the eastern USA during autumn and winter, POLDER FMFs were < 0.2 and Phy-DL FMFs were > 0.6, resulting in large differences (< −0.4). Figure S13 shows Phy-DL, POLDER, MISR, and MODIS estimated FMF frequencies at three levels (low: FMF < 0.5, medium: 0.5 < FMF < 0.8, high: FMF > 0.8) from 2008 to 2013. In the low-level category, MODIS and POLDER FMFs were more frequent than AERONET FMFs (50 % and 20 %, respectively), especially over the Amazon Basin and western USA. The frequencies of MISR, Phy-DL, and AERONET FMFs in this category were in good agreement. In the medium-level category, high frequencies of Phy-DL and AERONET FMFs occurred over Australia and the Amazon Basin (> 80 %), and low frequencies of Phy-DL and AERONET FMFs occurred in sub-Saharan Africa, Central Africa, and eastern China (< 30 %). MISR slightly overestimated the frequency of medium-level FMFs in Central Africa and underestimated it in northern Australia and the Amazon Basin.
The frequencies of medium-level POLDER FMFs were underestimated over the Amazon Basin and western USA and overestimated over Southeast China. MODIS was unable to capture medium-level FMFs globally, with frequencies of < 20 %. High-level FMFs mainly appeared over areas experiencing biomass burning and urban regions, with frequencies commonly < 50 %. The frequencies of MODIS, Phy-DL, and AERONET FMFs in the high-level category over Central Africa, southern China, and the eastern USA agreed well. However, the frequencies of high-level MODIS FMFs were overestimated over the Amazon Basin and underestimated over northern India. The frequencies of high-level POLDER FMFs were captured well over Central Africa, but significantly underestimated over northern India, southern China, and the eastern USA. Moreover, MODIS was unable to capture high-level FMFs globally with frequencies of < 20 %.
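The three-level frequency analysis above amounts to classifying each retrieval and counting the share of each class. A minimal sketch follows, using the thresholds given in the text (low: FMF < 0.5, high: FMF > 0.8). The text leaves the boundary values 0.5 and 0.8 unassigned; placing them in the medium class is an assumption, as are the function names.

```python
def fmf_level(fmf):
    """Classify one FMF value as 'low', 'medium', or 'high'.

    Thresholds follow the text; assigning exactly 0.5 and 0.8 to 'medium'
    is an assumption.
    """
    if fmf < 0.5:
        return "low"
    if fmf > 0.8:
        return "high"
    return "medium"


def level_frequencies(values):
    """Percentage of values in each level for one pixel or site."""
    if not values:
        raise ValueError("no FMF values given")
    counts = {"low": 0, "medium": 0, "high": 0}
    for value in values:
        counts[fmf_level(value)] += 1
    total = len(values)
    return {name: 100.0 * count / total for name, count in counts.items()}
```

Applying `level_frequencies` to the retrievals of each pixel, and separately to each AERONET site, yields frequency maps of the kind compared in the paragraph above.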

https://essd.copernicus.org/articles/14/1193/2022/essd-14-1193-2022-f13

Figure 13 Seasonal mean FMF, averaged from 2008 to 2013, based on (from top to bottom) Phy-DL, POLDER, MISR, and MODIS. Columns from left to right are for spring, summer, autumn, and winter.

4 Data availability

The global land FMF dataset (2001–2020) developed in this study, Phy-DL FMF, is available at https://doi.org/10.5281/zenodo.5105617 (Yan, 2021). The FMF data are provided in GeoTIFF format at daily temporal resolution.

5 Conclusion

Given the general lack or poor quality of aerosol fine-mode fraction (FMF) data over land, an improved long-term global aerosol FMF (at 500 nm) dataset (2001–2020), called Phy-DL FMF, was developed over land with a hybrid retrieval algorithm combining physical and deep-learning approaches. It was extensively evaluated against AERONET FMF retrievals, revealing its higher accuracy (RMSE = 0.136, based on 361 089 validation samples; 79.15 % of the data fell within the ±20 % EE envelope) and generally good agreement with AERONET FMF with respect to its values, trends, and frequencies. In addition, independent validation based on SURFRAD FMF showed that the RMSE of Phy-DL FMF is 0.144, with 72.50 % of the data falling within the ±20 % EE envelope.

Compared with physical-based (calculated using LUT-SDA, i.e., Phy-based FMF) and deep-learning-based (DL-based FMF) FMF results, the accuracy of Phy-DL FMFs was substantially improved over five land types (forests, grasslands, croplands, urban areas, and barren land), lessening the common problems of underestimation for high FMF values and overestimation for low FMF values. Geographically, Phy-DL FMF captured the low FMFs well over the Saharan region, Central Asia, Australia, and southern South America, while Phy-based FMF showed significant overestimations. The Phy-DL FMFs were also compared with three satellite-based official global FMF products (MISR, POLDER, and MODIS DT-based FMFs) using both AERONET FMF and SURFRAD FMF as references. The Phy-DL FMF showed a significant improvement in terms of the accuracy and spatial distribution of trends. In Central Africa, eastern China, Australia, the Amazon Basin, and the western USA, Phy-DL FMFs agreed well with AERONET FMFs, while the other three satellite-based FMFs showed significant underestimations. In particular, in southern Africa, the accuracy of the annual average was substantially improved, and the linear trends of Phy-DL FMF corresponded better with those of AERONET FMF. The Phy-DL FMF dataset also captured the seasonality and frequencies of FMFs well, thereby showing better agreement with AERONET FMFs.

By examining Phy-DL FMFs from 2001 to 2020, we found a general decreasing trend of −1.9 × 10⁻³ yr⁻¹ around the globe at the significance level of 95 %, which was not revealed by AERONET point-scale measurements. However, both Phy-DL and AERONET FMFs showed significant increasing trends in FMF over the western USA and India (> +3 × 10⁻³ yr⁻¹). The new dataset captured high-level FMFs (> 0.80) over southern China, South Asia, eastern Europe, and the eastern USA. The FMFs were consistently < 0.3 in Northwest China, the Saharan region, and southern South America, indicating coarse-particle desert emissions. The findings of various evaluations, especially the attempted explanations of the spatiotemporal variations and long-term trend changes, suggest that this newly developed dataset is sound, more accurate and thus useful for investigating the impact of fine-mode and coarse-mode aerosols on the atmospheric environment and climate, especially in gaining a deeper insight into fine-mode aerosols.

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/essd-14-1193-2022-supplement.

Author contributions

XY conceptualized the project. XY and ZL conducted the funding acquisition and supervision. XY, NL and YJ contributed to the methodology. XY, ZZ and NL wrote the original draft. ZZ conducted the visualization and validation. ZZ and NL conducted the investigation. ZZ, CZ, DL and YG curated the data. WZ, WS and MC contributed to the writing, review and editing.

Competing interests

The contact author has declared that neither they nor their co-authors have any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

The authors gratefully acknowledge the European Centre for Medium Range Weather Forecasts, MODIS, MISR, POLDER, SURFRAD and AERONET teams for their effort in making the data available.

Financial support

This research has been supported by the Natural Science Foundation of Beijing (grant nos. 8222058 and 8224088), the National Natural Science Foundation of China (grant nos. 42030606 and 91837204), the National Key Research and Development Plan of China (grant no. 2017YFC1501702), and the Fundamental Research Funds for the Central Universities.

Review statement

This paper was edited by Nellie Elguindi and reviewed by three anonymous referees.

References

Andela, N., Morton, D. C., Giglio, L., Chen, Y., van der Werf, G. R., Kasibhatla, P. S., DeFries, R. S., Collatz, G. J., Hantson, S., Kloster, S., Bachelet, D., Forrest, M., Lasslop, G., Li, F., Mangeon, S., Melton, J. R., Yue, C., and Randerson, J. T.: A human-driven decline in global burned area, Science, 356, 1356–1361, https://doi.org/10.1126/science.aal4108, 2017. 

Anderson, T. L., Wu, Y. H., Chu, D. A., Schmid, B., Redemann, J., and Dubovik, O.: Testing the MODIS satellite retrieval of aerosol fine-mode fraction, J. Geophys. Res.-Atmos., 110, D18204, https://doi.org/10.1029/2005jd005978, 2005. 

Augustine, J. A., DeLuisi, J. J., and Long, C. N.: SURFRAD – A national surface radiation budget network for atmospheric research, B. Am. Meteorol. Soc., 81, 2341–2357, 2000.

Bellouin, N., Boucher, O., Haywood, J., and Reddy, M. S.: Global estimate of aerosol direct radiative forcing from satellite measurements, Nature, 438, 1138–1141, https://doi.org/10.1038/nature04348, 2005. 

Chen, C., Dubovik, O., Fuertes, D., Litvinov, P., Lapyonok, T., Lopatin, A., Ducos, F., Derimian, Y., Herman, M., Tanré, D., Remer, L. A., Lyapustin, A., Sayer, A. M., Levy, R. C., Hsu, N. C., Descloitres, J., Li, L., Torres, B., Karol, Y., Herrera, M., Herreras, M., Aspetsberger, M., Wanzenboeck, M., Bindreiter, L., Marth, D., Hangler, A., and Federspiel, C.: Validation of GRASP algorithm product from POLDER/PARASOL data and assessment of multi-angular polarimetry potential for aerosol monitoring, Earth Syst. Sci. Data, 12, 3573–3620, https://doi.org/10.5194/essd-12-3573-2020, 2020. 

Chen, X., de Leeuw, G., Arola, A., Liu, S., Liu, Y., Li, Z., and Zhang, K.: Joint retrieval of the aerosol fine mode fraction and optical depth using MODIS spectral reflectance over northern and eastern China: Artificial neural network method, Remote Sens. Environ., 249, 112006, https://doi.org/10.1016/j.rse.2020.112006, 2020. 

Crippa, M., Janssens-Maenhout, G., Dentener, F., Guizzardi, D., Sindelarova, K., Muntean, M., Van Dingenen, R., and Granier, C.: Forty years of improvements in European air quality: regional policy-industry interactions with global impacts, Atmos. Chem. Phys., 16, 3825–3841, https://doi.org/10.5194/acp-16-3825-2016, 2016. 

Deeter, M. N., Martinez-Alonso, S., Andreae, M. O., and Schlager, H.: Satellite-Based Analysis of CO Seasonal and Interannual Variability Over the Amazon Basin, J. Geophys. Res.-Atmos., 123, 5641–5656, https://doi.org/10.1029/2018jd028425, 2018. 

Deuze, J. L., Breon, F. M., Devaux, C., Goloub, P., Herman, M., Lafrance, B., Maignan, F., Marchand, A., Nadal, F., Perry, G., and Tanre, D.: Remote sensing of aerosols over land surfaces from POLDER-ADEOS-1 polarized measurements, J. Geophys. Res.-Atmos., 106, 4913–4926, https://doi.org/10.1029/2000jd900364, 2001. 

Diner, D. J., Beckert, J. C., Reilly, T. H., Bruegge, C. J., Conel, J. E., Kahn, R. A., Martonchik, J. V., Ackerman, T. P., Davies, R., Gerstl, S. A. W., Gordon, H. R., Muller, J. P., Myneni, R. B., Sellers, P. J., Pinty, B., and Verstraete, M. M.: Multi-angle Imaging SpectroRadiometer (MISR) – Instrument description and experiment overview, IEEE T. Geosci. Remote, 36, 1072–1087, https://doi.org/10.1109/36.700992, 1998. 

Dubovik, O., Herman, M., Holdak, A., Lapyonok, T., Tanré, D., Deuzé, J. L., Ducos, F., Sinyuk, A., and Lopatin, A.: Statistically optimized inversion algorithm for enhanced retrieval of aerosol properties from spectral multi-angle polarimetric satellite observations, Atmos. Meas. Tech., 4, 975–1018, https://doi.org/10.5194/amt-4-975-2011, 2011. 

Dubovik, O., Lapyonok, T., Litvinov, P., Herman, M., Fuertes, D., Ducos, F., Lopatin, A., Chaikovsky, A., Torres, B., Derimian, Y., Huang, X., Aspetsberger, M., and Federspiel, C.: GRASP: a versatile algorithm for characterizing the atmosphere, SPIE Newsroom, 25, 2.1201408, https://doi.org/10.1117/2.1201408.005558, 2014. 

Dubovik, O., Li, Z., Mishchenko, M. I., Tanré, D., Karol, Y., Bojkov, B., Cairns, B., Diner, D. J., Espinosa, W. R., Goloub, P., Gu, X., Hasekamp, O., Hong, J., Hou, W., Knobelspiesse, K. D., Landgraf, J., Li, L., Litvinov, P., Liu, Y., Lopatin, A., Marbach, T., Maring, H., Martins, V., Meijer, Y., Milinevsky, G., Mukai, S., Parol, F., Qiao, Y., Remer, L., Rietjens, J., Sano, I., Stammes, P., Stamnes, S., Sun, X., Tabary, P., Travis, L. D., Waquet, F., Xu, F., Yan, C., and Yin, D.: Polarimetric remote sensing of atmospheric aerosols: Instruments, methodologies, results, and perspectives, J. Quant. Spectrosc. Ra., 224, 474–511, https://doi.org/10.1016/j.jqsrt.2018.11.024, 2019. 

Garay, M. J., Witek, M. L., Kahn, R. A., Seidel, F. C., Limbacher, J. A., Bull, M. A., Diner, D. J., Hansen, E. G., Kalashnikova, O. V., Lee, H., Nastan, A. M., and Yu, Y.: Introducing the 4.4 km spatial resolution Multi-Angle Imaging SpectroRadiometer (MISR) aerosol product, Atmos. Meas. Tech., 13, 593–628, https://doi.org/10.5194/amt-13-593-2020, 2020. 

Gautam, R., Liu, Z., Singh, R. P., and Hsu, N. C.: Two contrasting dust-dominant periods over India observed from MODIS and CALIPSO data, Geophys. Res. Lett., 36, L06813, https://doi.org/10.1029/2008gl036967, 2009. 

Generoso, S., Bréon, F.-M., Balkanski, Y., Boucher, O., and Schulz, M.: Improving the seasonal cycle and interannual variations of biomass burning aerosol sources, Atmos. Chem. Phys., 3, 1211–1222, https://doi.org/10.5194/acp-3-1211-2003, 2003. 

Gui, K., Che, H., Wang, Y., Wang, H., Zhang, L., Zhao, H., Zheng, Y., Sun, T., and Zhang, X.: Satellite-derived PM2.5 concentration trends over Eastern China from 1998 to 2016: relationships to emissions and meteorological parameters, Environ. Pollut., 247, 1125–1133, https://doi.org/10.1016/j.envpol.2019.01.056, 2019. 

Guo, C. and Berkhahn, F.: Entity Embeddings of Categorical Variables, arXiv [preprint], arXiv:1604.06737, 2016. 

Hand, J. L., Schichtel, B. A., Pitchford, M., Malm, W. C., and Frank, N. H.: Seasonal composition of remote and urban fine particulate matter in the United States, J. Geophys. Res.-Atmos., 117, D05209, https://doi.org/10.1029/2011jd017122, 2012. 

Harrison, L., Michalsky, J., and Berndt, J.: Automated multifilter rotating shadow-band radiometer: An instrument for optical depth and radiation measurements, Appl. Optics, 33, 5118–5125, https://doi.org/10.1364/AO.33.005118, 1994. 

Holben, B. N., Eck, T. F., Slutsker, I., Tanre, D., Buis, J. P., Setzer, A., Vermote, E., Reagan, J. A., Kaufman, Y. J., Nakajima, T., Lavenu, F., Jankowiak, I., and Smirnov, A.: AERONET – A federated instrument network and data archive for aerosol characterization, Remote Sens. Environ., 66, 1–16, https://doi.org/10.1016/s0034-4257(98)00031-5, 1998. 

Huebert, B. J., Bates, T., Russell, P. B., Shi, G. Y., Kim, Y. J., Kawamura, K., Carmichael, G., and Nakajima, T.: An overview of ACE-Asia: Strategies for quantifying the relationships between Asian aerosols and their climatic impacts, J. Geophys. Res.-Atmos., 108, 8633, https://doi.org/10.1029/2003jd003550, 2003. 

Jethva, H., Torres, O., Field, R. D., Lyapustin, A., Gautam, R., and Kayetha, V.: Connecting Crop Productivity, Residue Fires, and Air Quality over Northern India, Sci. Rep.-UK, 9, 16594, https://doi.org/10.1038/s41598-019-52799-x, 2019. 

Jiang, Z., McDonald, B. C., Worden, H., Worden, J. R., Miyazaki, K., Qu, Z., Henze, D. K., Jones, D. B. A., Arellano, A. F., Fischer, E. V., Zhu, L., and Boersma, K. F.: Unexpected slowdown of US pollutant emission reduction in the past decade, P. Natl. Acad. Sci. USA, 115, 5099–5104, https://doi.org/10.1073/pnas.1801191115, 2018. 

Kahn, R. A. and Gaitley, B. J.: An analysis of global aerosol type as retrieved by MISR, J. Geophys. Res.-Atmos., 120, 4248–4281, https://doi.org/10.1002/2015jd023322, 2015. 

Kalapureddy, M. C. R. and Devara, P. C. S.: Characterization of aerosols over oceanic regions around India during pre-monsoon 2006, Atmos. Environ., 42, 6816–6827, https://doi.org/10.1016/j.atmosenv.2008.05.022, 2008. 

Kang, P., Feng, N., Wang, Z., Guo, Y., Wang, Z., Chen, Y., Zhan, J., Zhan, F. B., and Hong, S.: Statistical properties of aerosols and meteorological factors in Southwest China, J. Geophys. Res.-Atmos., 119, 9914–9930, https://doi.org/10.1002/2014JD022083, 2014. 

Kolmonen, P., Sogacheva, L., Virtanen, T. H., de Leeuw, G., and Kulmala, M.: The ADV/ASV AATSR aerosol retrieval algorithm: current status and presentation of a full-mission AOD dataset, Int. J. Digit. Earth, 9, 545–561, https://doi.org/10.1080/17538947.2015.1111450, 2016. 

Levy, R. C., Remer, L. A., Mattoo, S., Vermote, E. F., and Kaufman, Y. J.: Second-generation operational algorithm: Retrieval of aerosol properties over land from inversion of Moderate Resolution Imaging Spectroradiometer spectral reflectance, J. Geophys. Res.-Atmos., 112, D13211, https://doi.org/10.1029/2006jd007811, 2007. 

Levy, R. C., Remer, L. A., Kleidman, R. G., Mattoo, S., Ichoku, C., Kahn, R., and Eck, T. F.: Global evaluation of the Collection 5 MODIS dark-target aerosol products over land, Atmos. Chem. Phys., 10, 10399–10420, https://doi.org/10.5194/acp-10-10399-2010, 2010. 

Levy, R. C., Mattoo, S., Munchak, L. A., Remer, L. A., Sayer, A. M., Patadia, F., and Hsu, N. C.: The Collection 6 MODIS aerosol products over land and ocean, Atmos. Meas. Tech., 6, 2989–3034, https://doi.org/10.5194/amt-6-2989-2013, 2013. 

Li, Z., Lau, W. K.-M., Ramanathan, V., Wu, G., Ding, Y., Manoj, M. G., Liu, J., Qian, Y., Li, J., Zhou, T., Fan, J., Rosenfeld, D., Ming, Y., Wang, Y., Huang, J., Wang, B., Xu, X., Lee, S.-S., Cribb, M., Zhang, F., Yang, X., Zhao, C., Takemura, T., Wang, K., Xia, X., Yin, Y., Zhang, H., Guo, J., Zhai, P. M., Sugimoto, N., Babu, S. S., and Brasseur, G. P.: Aerosol and monsoon climate interactions over Asia, Rev. Geophys., 54, 866–929, https://doi.org/10.1002/2015RG000500, 2016a. 

Li, Z., Zhang, Y., Shao, J., Li, B., Hong, J., Liu, D., Li, D., Wei, P., Li, W., Li, L., Zhang, F., Guo, J., Deng, Q., Wang, B., Cui, C., Zhang, W., Wang, Z., Lv, Y., Xu, H., Chen, X., Li, L., and Qie, L.: Remote sensing of atmospheric particulate mass of dry PM2.5 near the ground: Method validation using ground-based measurements, Remote Sens. Environ., 173, 59–68, https://doi.org/10.1016/j.rse.2015.11.019, 2016b. 

Liang, C., Zang, Z., Li, Z., and Yan, X.: An Improved Global Land Anthropogenic Aerosol Product Based on Satellite Retrievals From 2008 to 2016, IEEE Geosci. Remote Sens. Lett., 18, 944–948, https://doi.org/10.1109/lgrs.2020.2991730, 2021. 

Liang, X., Li, S., Zhang, S., Huang, H., and Chen, S. X.: PM2.5 data reliability, consistency, and air quality assessment in five Chinese cities, J. Geophys. Res.-Atmos., 121, 10220–10236, https://doi.org/10.1002/2016JD024877, 2016. 

Lipponen, A., Mielonen, T., Pitkänen, M. R. A., Levy, R. C., Sawyer, V. R., Romakkaniemi, S., Kolehmainen, V., and Arola, A.: Bayesian aerosol retrieval algorithm for MODIS AOD retrieval over land, Atmos. Meas. Tech., 11, 1529–1547, https://doi.org/10.5194/amt-11-1529-2018, 2018. 

Liu, T., Mickley, L. J., and McCarty, J. L.: Global search for temporal shifts in fire activity: potential human influence on southwest Russia and north Australia fire seasons, Environ. Res. Lett., 16, https://doi.org/10.1088/1748-9326/abe328, 2021. 

Luo, N., An, L., Nara, A., Yan, X., and Zhao, W.: GIS-based multielement source analysis of dustfall in Beijing: a study of 40 major and trace elements, Chemosphere, 152, 123–131, 2016. 

Manoj, M. R., Satheesh, S. K., Moorthy, K. K., Gogoi, M. M., and Babu, S. S.: Decreasing Trend in Black Carbon Aerosols Over the Indian Region, Geophys. Res. Lett., 46, 2903–2910, https://doi.org/10.1029/2018gl081666, 2019. 

Mhawish, A., Sorek-Hamer, M., Chatfield, R., Banerjee, T., Bilal, M., Kumar, M., Sarangi, C., Franklin, M., Chau, K., Garay, M., and Kalashnikova, O.: Aerosol characteristics from earth observation systems: A comprehensive investigation over South Asia (2000–2019), Remote Sens. Environ., 259, 112410, https://doi.org/10.1016/j.rse.2021.112410, 2021. 

O'Dell, K., Ford, B., Fischer, E. V., and Pierce, J. R.: Contribution of Wildland-Fire Smoke to US PM2.5 and Its Influence on Recent Trends, Environ. Sci. Technol., 53, 1797–1804, https://doi.org/10.1021/acs.est.8b05430, 2019. 

O'Neill, N., Eck, T., Smirnov, A., and Holben, B.: Spectral deconvolution algorithm (SDA) technical memo, NASA Tech. Memo., 39 pp., 2008. 

O'Neill, N. T., Dubovik, O., and Eck, T. F.: Modified Ångström exponent for the characterization of submicrometer aerosols, Appl. Optics, 40, 2368–2375, https://doi.org/10.1364/ao.40.002368, 2001a. 

O'Neill, N. T., Eck, T. F., Holben, B. N., Smirnov, A., Dubovik, O., and Royer, A.: Bimodal size distribution influences on the variation of Angstrom derivatives in spectral and optical depth space, J. Geophys. Res.-Atmos., 106, 9787–9806, https://doi.org/10.1029/2000jd900245, 2001b. 

O'Neill, N. T., Eck, T. F., Smirnov, A., Holben, B. N., and Thulasiraman, S.: Spectral discrimination of coarse and fine mode optical depth, J. Geophys. Res.-Atmos., 108, 4559, https://doi.org/10.1029/2002jd002975, 2003. 

O'Neill, N. T.: Comment on “Classification of aerosol properties derived from AERONET direct sun data” by Gobbi et al. (2007), Atmos. Chem. Phys., 10, 10017–10019, https://doi.org/10.5194/acp-10-10017-2010, 2010. 

Ong, B. T., Sugiura, K., and Zettsu, K.: Dynamically pre-trained deep recurrent neural networks using environmental monitoring data for predicting PM2.5, Neural Comput. Appl., 27, 1553–1566, https://doi.org/10.1007/s00521-015-1955-3, 2016. 

Parks, S. A. and Abatzoglou, J. T.: Warmer and Drier Fire Seasons Contribute to Increases in Area Burned at High Severity in Western US Forests From 1985 to 2017, Geophys. Res. Lett., 47, e2020GL089858, https://doi.org/10.1029/2020gl089858, 2020. 

Perez-Ramirez, D., Andrade-Flores, M., Eck, T. F., Stein, A. F., O'Neill, N. T., Lyamani, H., Gasso, S., Whiteman, D. N., Veselovskii, I., Velarde, F., and Alados-Arboledas, L.: Multi year aerosol characterization in the tropical Andes and in adjacent Amazonia using AERONET measurements, Atmos. Environ., 166, 412–432, https://doi.org/10.1016/j.atmosenv.2017.07.037, 2017. 

Petrenko, M. and Ichoku, C.: Coherent uncertainty analysis of aerosol measurements from multiple satellite sensors, Atmos. Chem. Phys., 13, 6777–6805, https://doi.org/10.5194/acp-13-6777-2013, 2013. 

Ramachandran, S.: Aerosol optical depth and fine mode fraction variations deduced from Moderate Resolution Imaging Spectroradiometer (MODIS) over four urban areas in India, J. Geophys. Res.-Atmos., 112, D16207, https://doi.org/10.1029/2007jd008500, 2007. 

Sawyer, V., Levy, R. C., Mattoo, S., Cureton, G., Shi, Y., and Remer, L. A.: Continuing the MODIS Dark Target Aerosol Time Series with VIIRS, Remote Sensing, 12, 308, https://doi.org/10.3390/rs12020308, 2020. 

Sayer, A. M., Munchak, L. A., Hsu, N. C., Levy, R. C., Bettenhausen, C., and Jeong, M. J.: MODIS Collection 6 aerosol products: Comparison between Aqua's e-Deep Blue, Dark Target, and “merged” data sets, and usage recommendations, J. Geophys. Res.-Atmos., 119, 13965–13989, https://doi.org/10.1002/2014JD022453, 2014. 

Shen, H., Li, T., Yuan, Q., and Zhang, L.: Estimating regional ground-level PM2.5 directly from satellite top-of-atmosphere reflectance using deep belief networks, J. Geophys. Res.-Atmos., 123, 13875–13886, https://doi.org/10.1029/2018JD028759, 2018. 

Shi, G., Yan, H., Zhang, W., Dodson, J., Heijnis, H., and Burrows, M.: Rapid warming has resulted in more wildfires in northeastern Australia, Sci. Total Environ., 771, 144888, https://doi.org/10.1016/j.scitotenv.2020.144888, 2021. 

Tai, A. P. K., Mickley, L. J., and Jacob, D. J.: Correlations between fine particulate matter (PM2.5) and meteorological variables in the United States: Implications for the sensitivity of PM2.5 to climate change, Atmos. Environ., 44, 3976–3984, https://doi.org/10.1016/j.atmosenv.2010.06.060, 2010. 

Tan, C., Zhao, T., Xu, X., Liu, J., Zhang, L., and Tang, L.: Climatic analysis of satellite aerosol data on variations of submicron aerosols over East China, Atmos. Environ., 123, 392–398, https://doi.org/10.1016/j.atmosenv.2015.03.054, 2015. 

Tetens, V. O.: Über einige meteorologische Begriffe, Zeitschrift für Geophysik, 6, 297–309, 1930. 

van der Werf, G. R., Randerson, J. T., Giglio, L., van Leeuwen, T. T., Chen, Y., Rogers, B. M., Mu, M., van Marle, M. J. E., Morton, D. C., Collatz, G. J., Yokelson, R. J., and Kasibhatla, P. S.: Global fire emissions estimates during 1997–2016, Earth Syst. Sci. Data, 9, 697–720, https://doi.org/10.5194/essd-9-697-2017, 2017. 

Vinoj, V., Rasch, P. J., Wang, H., Yoon, J.-H., Ma, P.-L., Landu, K., and Singh, B.: Short-term modulation of Indian summer monsoon rainfall by West Asian dust, Nat. Geosci., 7, 308–313, https://doi.org/10.1038/ngeo2107, 2014. 

Wei, Y., Li, Z., Zhang, Y., Chen, C., Dubovik, O., Zhang, Y., Xu, H., Li, K., Chen, J., Wang, H., Ge, B., and Fan, C.: Validation of POLDER GRASP aerosol optical retrieval over China using SONET observations, J. Quant. Spectrosc. Ra., 246, 106931, https://doi.org/10.1016/j.jqsrt.2020.106931, 2020. 

Xiang, Y., Zhang, T., Liu, J., Lv, L., Dong, Y., and Chen, Z.: Atmosphere boundary layer height and its effect on air pollutants in Beijing during winter heavy pollution, Atmos. Res., 215, 305–316, https://doi.org/10.1016/j.atmosres.2018.09.014, 2019. 

Yan, G., Jiang, H., Luo, J., Mu, X., Li, F., Qi, J., Hu, R., Xie, D., and Zhou, G.: Quantitative Evaluation of Leaf Inclination Angle Distribution on Leaf Area Index Retrieval of Coniferous Canopies, J. Remote Sens., 2021, 2708904, https://doi.org/10.34133/2021/2708904, 2021. 

Yan, K., Zou, D., Yan, G., Fang, H., Weiss, M., Rautiainen, M., Knyazikhin, Y., and Myneni, R. B.: A Bibliometric Visualization Review of the MODIS LAI/FPAR Products from 1995 to 2020, J. Remote Sens., 2021, 7410921, https://doi.org/10.34133/2021/7410921, 2021. 

Yan, X.: Physical and deep learning retrieved fine mode fraction (Phy-DL FMF), Zenodo [data set], https://doi.org/10.5281/zenodo.5105617, 2021. 

Yan, X., Li, Z., Shi, W., Luo, N., Wu, T., and Zhao, W.: An improved algorithm for retrieving the fine-mode fraction of aerosol optical thickness, part 1: Algorithm development, Remote Sens. Environ., 192, 87–97, https://doi.org/10.1016/j.rse.2017.02.005, 2017. 

Yan, X., Li, Z., Luo, N., Shi, W., Zhao, W., Yang, X., Liang, C., Zhang, F., and Cribb, M.: An improved algorithm for retrieving the fine-mode fraction of aerosol optical thickness. Part 2: Application and validation in Asia, Remote Sens. Environ., 222, 90–103, https://doi.org/10.1016/j.rse.2018.12.012, 2019. 

Yan, X., Zang, Z., Luo, N., Jiang, Y., and Li, Z.: New interpretable deep learning model to monitor real-time PM2.5 concentrations from satellite data, Environ. Int., 144, 106060, https://doi.org/10.1016/j.envint.2020.106060, 2020. 

Yan, X., Zang, Z., Zhao, C., and Husi, L.: Understanding global changes in fine-mode aerosols during 2008–2017 using statistical methods and deep learning approach, Environ. Int., 149, 106392, https://doi.org/10.1016/j.envint.2021.106392, 2021a. 

Yan, X., Zang, Z., Liang, C., Luo, N., Ren, R., Cribb, M., and Li, Z.: New global aerosol fine-mode fraction data over land derived from MODIS satellite retrievals, Environ. Pollut., 276, 116707, https://doi.org/10.1016/j.envpol.2021.116707, 2021b. 

Yan, X., Zang, Z., Jiang, Y., Shi, W., Guo, Y., Li, D., Zhao, C., and Husi, L.: A Spatial-Temporal Interpretable Deep Learning Model for improving interpretability and predictive accuracy of satellite-based PM2.5, Environ. Pollut., 273, 116459, https://doi.org/10.1016/j.envpol.2021.116459, 2021c. 

Yang, X., Jiang, L., Zhao, W., Xiong, Q., Zhao, W., and Yan, X.: Comparison of ground-based PM2.5 and PM10 concentrations in China, India, and the US, Int. J. Environ. Res. Pu., 15, 1382, https://doi.org/10.3390/ijerph15071382, 2018. 

Yang, X., Zhao, C., Luo, N., Zhao, W., Shi, W., and Yan, X.: Evaluation and Comparison of Himawari-8 L2 V1.0, V2.1 and MODIS C6.1 aerosol products over Asia and the Oceania regions, Atmos. Environ., 220, 117068, https://doi.org/10.1016/j.atmosenv.2019.117068, 2020. 

Yin, S., Wang, X., Zhang, X., Guo, M., Miura, M., and Xiao, Y.: Influence of biomass burning on local air pollution in mainland Southeast Asia from 2001 to 2016, Environ. Pollut., 254, 112949, https://doi.org/10.1016/j.envpol.2019.07.117, 2019. 

Yuan, Q., Shen, H., Li, T., Li, Z., Li, S., Jiang, Y., Xu, H., Tan, W., Yang, Q., Wang, J., Gao, J., and Zhang, L.: Deep learning in environmental remote sensing: Achievements and challenges, Remote Sens. Environ., 241, 111716, https://doi.org/10.1016/j.rse.2020.111716, 2020. 

Zang, Z., Guo, Y., Jiang, Y., Chen, Z., Li, D., Shi, W., and Yan, X.: Tree-Based Ensemble Deep Learning Model for Spatiotemporal Surface Ozone (O3) Prediction and Interpretation, Int. J. Appl. Earth Obs., 103, 102516, https://doi.org/10.1016/j.jag.2021.102516, 2021a. 

Zang, Z., Li, D., Guo, Y., Shi, W., and Yan, X.: Superior PM2.5 Estimation by Integrating Aerosol Fine Mode Data from the Himawari-8 Satellite in Deep and Classical Machine Learning Models, Remote Sensing, 13, 2779, https://doi.org/10.3390/rs13142779, 2021b. 

Zhang, L., Lau, W., Tao, W., and Li, Z.: Large wildfires in the western United States exacerbated by tropospheric drying linked to a multi-decadal trend in the expansion of the Hadley circulation, Geophys. Res. Lett., 47, e2020GL087911, https://doi.org/10.1029/2020GL087911, 2020. 

Zhang, Y. and Li, Z.: Remote sensing of atmospheric fine particulate matter (PM2.5) mass concentration near the ground from satellite observation, Remote Sens. Environ., 160, 252–262, https://doi.org/10.1016/j.rse.2015.02.005, 2015. 

Zhang, Y., Li, Z., Qie, L., Zhang, Y., Liu, Z., Chen, X., Hou, W., Li, K., Li, D., and Xu, H.: Retrieval of Aerosol Fine-Mode Fraction from Intensity and Polarization Measurements by PARASOL over East Asia, Remote Sensing, 8, 417, https://doi.org/10.3390/rs8050417, 2016. 

Zhang, Y., Li, Z., Liu, Z., Wang, Y., Qie, L., Xie, Y., Hou, W., and Leng, L.: Retrieval of aerosol fine-mode fraction over China from satellite multiangle polarized observations: validation and comparison, Atmos. Meas. Tech., 14, 1655–1672, https://doi.org/10.5194/amt-14-1655-2021, 2021. 

Zheng, B., Chevallier, F., Yin, Y., Ciais, P., Fortems-Cheiney, A., Deeter, M. N., Parker, R. J., Wang, Y., Worden, H. M., and Zhao, Y.: Global atmospheric carbon monoxide budget 2000–2017 inferred from multi-species atmospheric inversions, Earth Syst. Sci. Data, 11, 1411–1436, https://doi.org/10.5194/essd-11-1411-2019, 2019. 

Zheng, X., Zhao, W., Yan, X., Shu, T., Xiong, Q., and Chen, F.: Pollution characteristics and health risk assessment of airborne heavy metals collected from Beijing bus stations, Int. J. Environ. Res. Pu., 12, 9658–9671, https://doi.org/10.3390/ijerph120809658, 2015. 

The requested paper has a corresponding corrigendum published. Please read the corrigendum first before downloading the article.

Short summary
This study developed a new satellite-based global land daily FMF dataset (Phy-DL FMF) that combines the advantages of physical and deep learning methods, at a 1° spatial resolution and covering the period from 2001 to 2020. Phy-DL FMF was extensively evaluated against ground-based AERONET data and tested on a global scale against conventional satellite-based FMF products to demonstrate its superior accuracy.