https://doi.org/10.5194/essd-2023-519
05 Jan 2024
Status: this preprint is currently under review for the journal ESSD.

LGHAP v2: A global gap-free aerosol optical depth and PM2.5 concentration dataset since 2000 derived via big earth data analytics

Kaixu Bai, Ke Li, Liuqing Shao, Xinran Li, Chaoshun Liu, Zhengqiang Li, Mingliang Ma, Di Han, Yibing Sun, Zhe Zheng, Ruijie Li, Ni-Bin Chang, and Jianping Guo

Abstract. The Long-term Gap-free High-resolution Air Pollutants concentration dataset (LGHAP) provides spatially contiguous daily aerosol optical depth (AOD) and particulate matter (PM) concentrations at 1-km grid resolution in China since 2000. This advancement has enabled unprecedented assessments of aerosol variations and their impacts on the environment, health, and climate over the past few years. However, there is a need to improve such a MODIS-like gap-free high-resolution AOD and PM2.5 concentration dataset with new robust features. In this study, we present version 2 of the LGHAP dataset (LGHAP v2), a global-scale product generated using an improved big earth data analytics approach that seamlessly integrates distinct data science, pattern recognition, and deep learning methods. To better reconstruct the global AOD distribution from daily MODIS AOD imagery, multimodal AODs and air quality measurements acquired from relevant satellites, ground monitoring stations, and numerical models across the globe over the past two decades were first harmonized using random-forest-based data-driven models. Then, an improved tensor-flow-based AOD reconstruction algorithm was developed to weave the harmonized multi-source AOD products together for gap-filling. Ablation experiments demonstrated that the improved tensor-flow-based gap-filling method performs better in terms of both convergence speed and data accuracy. Ground-based validation indicated good accuracy of the global gap-filled AOD dataset, with R of 0.85 and RMSE of 0.14 against worldwide AOD observations from AERONET, which is better than the purely reconstructed AODs (R=0.83, RMSE=0.15) and slightly worse than raw MAIAC AOD retrievals from Terra (R=0.88, RMSE=0.11).
A novel deep learning model, named the scene-aware ensemble learning graph attention network (SCAGAT), was developed to better predict PM2.5 concentrations across the globe. By improving the spatial representativeness of data-driven models across regions, the SCAGAT algorithm performed better during spatial extrapolation, largely reducing modeling biases even over regions where in situ PM2.5 concentration measurements are limited or absent. Site-specific validation results indicated that the gap-free PM2.5 concentration estimates exhibit high prediction accuracy, with R of 0.95 and RMSE of 5.7 μg m−3 compared against PM2.5 concentration measurements from previously held-out sites worldwide. Overall, leveraging state-of-the-art methods in data science and artificial intelligence, a quality-enhanced LGHAP v2 dataset was generated through big earth data analytics by cohesively weaving together multimodal AODs and air quality measurements from different sources. Its gap-free, high-resolution, global coverage makes the LGHAP v2 dataset an invaluable database for advancing aerosol- and haze-related studies and for enabling multidisciplinary applications in environmental management, health risk assessment, and climate change analysis. All gap-free AOD and PM2.5 grids in the LGHAP v2 dataset are publicly shared online (Bai et al., 2023a), with a data user guide and relevant visualization code available at https://doi.org/10.5281/zenodo.10216396.
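The validation statistics quoted in the abstract (R and RMSE against AERONET or held-out PM2.5 sites) can be illustrated with a minimal sketch. It assumes that R denotes the Pearson correlation coefficient and that gridded estimates have already been collocated with ground observations; the function names and the sample values are hypothetical and are not taken from the LGHAP v2 dataset or its code.

```python
import math

def pearson_r(estimates, observations):
    """Pearson correlation between collocated estimates and observations."""
    n = len(estimates)
    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observations))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_o = sum((o - mean_o) ** 2 for o in observations)
    return cov / math.sqrt(var_e * var_o)

def rmse(estimates, observations):
    """Root-mean-square error between collocated estimates and observations."""
    squared_errors = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up collocated pairs for illustration only (not real LGHAP or AERONET data).
gridded_aod = [0.10, 0.25, 0.40, 0.55, 0.80]
ground_aod = [0.12, 0.20, 0.45, 0.50, 0.90]

print(f"R = {pearson_r(gridded_aod, ground_aod):.2f}")
print(f"RMSE = {rmse(gridded_aod, ground_aod):.2f}")
```

The same two functions would apply to PM2.5 estimates at held-out sites, with RMSE then carrying units of μg m−3.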


Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on essd-2023-519', Anonymous Referee #1, 28 Jan 2024
  • RC2: 'Comment on essd-2023-519', Anonymous Referee #2, 26 Feb 2024

Data sets

LGHAP: Long-term Gap-free High-resolution Air Pollutants concentration dataset, Kaixu Bai and Ke Li, https://zenodo.org/communities/ecnu_lghap


Viewed

Total article views: 343 (including HTML, PDF, and XML)
  • HTML: 266
  • PDF: 60
  • XML: 17
  • Total: 343
  • Supplement: 25
  • BibTeX: 15
  • EndNote: 12
Views and downloads (calculated since 05 Jan 2024)

Viewed (geographical distribution)

Total article views: 335 (including HTML, PDF, and XML), of which 335 have geography defined and 0 are of unknown origin.
Latest update: 04 Mar 2024
Short summary
A global long-term gap-free high-resolution air pollutants dataset (LGHAP v2) was generated to provide spatially contiguous AOD and PM2.5 concentration maps at daily, 1-km resolution from 2000 to 2021. The LGHAP v2 dataset shows good accuracy when compared against ground AOD and PM2.5 observations, making it an invaluable database for advancing aerosol-related studies and for enabling multidisciplinary applications in environmental management, health risk assessment, and climate change analysis.