https://doi.org/10.5194/essd-2024-34
06 Feb 2024
Status: this preprint is currently under review for the journal ESSD.

Reconstructing long-term (1980–2022) daily ground particulate matter datasets in India (LongPMInd)

Shuai Wang, Mengyuan Zhang, Hui Zhao, Peng Wang, Sri Harsha Kota, Qingyan Fu, Cong Liu, and Hongliang Zhang

Abstract. Severe airborne particulate matter (PM, including PM2.5 and PM10) pollution in India has caused widespread concern. Accurate PM datasets are fundamental for scientific policymaking and health impact assessment, but surface observations in India are limited because monitoring sites are scarce and unevenly distributed. In this work, a simply structured, efficient, and robust model based on the Light Gradient Boosting Machine (LightGBM) was developed to fuse multi-source data and estimate long-term (1980–2022) historical daily ground PM datasets in India (LongPMInd). The LightGBM model shows good accuracy, with out-of-sample, out-of-site, and out-of-year cross-validation (CV) test R² of 0.77, 0.70, and 0.66, respectively. Small performance gaps between PM2.5 training and testing (delta RMSE of 1.06, 3.83, and 7.74 μg m⁻³) indicate a low risk of overfitting. Given this generalization ability, openly accessible, long-term, high-quality daily PM2.5 and PM10 products were then reconstructed (10 km, 1980–2022). The products show that India has experienced severe PM pollution in the Indo-Gangetic Plain (IGP), especially in winter. PM concentrations have increased significantly (p<0.05) in most regions since 2000 (0.34 μg m⁻³ year⁻¹). A turning point occurred in 2018, when the Indian government launched the National Clean Air Program; PM2.5 concentrations declined in most regions (−0.78 μg m⁻³ year⁻¹) during 2018–2022. Severe PM2.5 pollution caused a continuous increase in attributable premature mortality, from 0.73 (95 % CI: 0.65–0.80) million in 2000 to 1.22 (95 % CI: 1.03–1.41) million in 2019, particularly in the IGP, where attributable mortality increased from 0.36 to 0.60 million. The LongPMInd datasets have the potential to support multiple applications in air quality management, public health, and climate change. The daily and monthly PM2.5 and PM10 datasets are publicly accessible at https://doi.org/10.5281/zenodo.10073944 (Wang et al., 2023a).
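The three validation schemes named in the abstract differ only in what is held out together: individual records, whole monitoring sites, or whole years. The sketch below illustrates that idea in plain Python and is not the authors' code. The record layout, the helper names (kfold_groups, make_splits, rmse), the fold count, and the reading of "delta RMSE" as test RMSE minus training RMSE are all assumptions made for illustration.

```python
import math
import random


def kfold_groups(keys, k, seed=0):
    """Yield (train_idx, test_idx) pairs; records sharing a key stay on one side."""
    unique = sorted(set(keys))
    random.Random(seed).shuffle(unique)
    folds = [set(unique[i::k]) for i in range(k)]  # k disjoint groups of keys
    for held_out in folds:
        test = [i for i, key in enumerate(keys) if key in held_out]
        train = [i for i, key in enumerate(keys) if key not in held_out]
        yield train, test


def make_splits(records, mode, k=3, seed=0):
    """Build CV splits for records shaped (site_id, year, day); layout is hypothetical."""
    if mode == "sample":
        keys = list(range(len(records)))   # each record is its own group
    elif mode == "site":
        keys = [r[0] for r in records]     # hold out entire sites
    elif mode == "year":
        keys = [r[1] for r in records]     # hold out entire years
    else:
        raise ValueError(f"unknown mode: {mode}")
    return list(kfold_groups(keys, k, seed))


def rmse(observed, predicted):
    """Root mean square error of paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical toy records: 6 sites x 3 years x 4 days.
records = [(site, year, day)
           for site in ("S1", "S2", "S3", "S4", "S5", "S6")
           for year in (2018, 2019, 2020)
           for day in range(4)]

for mode in ("sample", "site", "year"):
    for train, test in make_splits(records, mode):
        # A model would be fitted on `train` and scored on `test` here;
        # an assumed delta RMSE would be rmse(test) - rmse(train).
        pass
```

Out-of-site and out-of-year splits are the stricter tests, since the model never sees any record from the held-out site or year during training, which is consistent with the lower R² reported for those two schemes.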


Status: open (extended)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on essd-2024-34', Anonymous Referee #1, 06 Mar 2024

Data sets

LongPMInd: long-term (1980–2022) daily ground particulate matter datasets in India. Shuai Wang, Sri Harsha Kota, and Hongliang Zhang. https://zenodo.org/records/10073944


Viewed

Total article views: 384 (including HTML, PDF, and XML)
  • HTML: 319
  • PDF: 48
  • XML: 17
  • Total: 384
  • Supplement: 23
  • BibTeX: 19
  • EndNote: 14
Views and downloads are calculated since 06 Feb 2024.

Viewed (geographical distribution)

Total article views: 377 (including HTML, PDF, and XML), of which 377 have a defined geography and 0 are of unknown origin.
Latest update: 27 Apr 2024
Short summary
Long-term, open-source, gap-free daily ground-level PM2.5 and PM10 datasets for India (LongPMInd) were reconstructed using a robust machine learning model to support health assessment and air quality management.