https://doi.org/10.5194/essd-2025-104
07 Apr 2025
Status: this preprint is currently under review for the journal ESSD.

Tracking County-level Cooking Emissions and Their Drivers in China from 1990 to 2021 by Ensemble Machine Learning

Zeqi Li, Bin Zhao, Shengyue Li, Zhezhe Shi, Dejia Yin, Qingru Wu, Fenfen Zhang, Xiao Yun, Guanghan Huang, Yun Zhu, and Shuxiao Wang

Abstract. Cooking emissions are a significant source of PM2.5 and pose considerable public health risks because of their high toxicity and their proximity to densely populated areas. Despite this importance, China currently lacks an accurate, long-term, high-resolution national cooking emission inventory, primarily because high-quality activity level data are difficult to obtain over extended periods and at fine spatial scales. Here, we address these limitations by using machine learning techniques to predict activity levels and then estimate emissions.

Specifically, we develop an ensemble model of machine learning algorithms, namely Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Multilayer Perceptron Neural Network (MLP), and Deep Neural Networks (DNN), to predict cooking activity levels across Chinese counties based on statistical indicators related to population, economy, and the catering industry. The ensemble machine learning model demonstrates exceptional generalization and transferability (R² = 0.892–0.989), outperforming traditional statistical models and individual machine learning models. Unlike previous inventories that rely on simplistic proxy data such as population for calculation and downscaling, our inventory directly calculates county-level cooking emissions, providing more accurate emission estimates and spatial distributions. Furthermore, we incorporate critical but previously missing toxic pollutants, such as ultrafine particles (UFPs) and polycyclic aromatic hydrocarbons (PAHs), into the national cooking emission inventory. In this way, we develop China's first county-level cooking emission inventory, spanning from 1990 to 2021, with high spatial resolution and wide pollutant coverage.
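As a rough illustration of how an ensemble can be combined and scored, the sketch below averages the predictions of several base models county by county and computes R² against held-out values. It is not the authors' implementation: the abstract does not say how the four models are combined, so the weighted average, the function names, and all numbers are assumptions made for illustration, and the three prediction lists are stand-ins for trained models.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def ensemble_predict(member_predictions, weights=None):
    """Combine per-model predictions by a weighted average, county by county."""
    n_models = len(member_predictions)
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0] * n_models
    total_weight = sum(weights)
    n_counties = len(member_predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, member_predictions))
        / total_weight
        for i in range(n_counties)
    ]


# Hypothetical held-out activity levels for five counties (arbitrary units).
observed = [10.0, 20.0, 30.0, 40.0, 50.0]

# Hypothetical predictions from three stand-in base models.
member_predictions = [
    [12.0, 18.0, 33.0, 38.0, 53.0],
    [8.0, 22.0, 27.0, 42.0, 47.0],
    [11.0, 21.0, 29.0, 41.0, 49.0],
]

ensemble = ensemble_predict(member_predictions)
member_scores = [r_squared(observed, p) for p in member_predictions]
ensemble_score = r_squared(observed, ensemble)

print("member R2:", [round(s, 4) for s in member_scores])
print("ensemble R2:", round(ensemble_score, 4))
```

In this toy data the members' errors partly cancel, so the averaged prediction scores higher than any single member. That outcome depends on the constructed numbers and is not guaranteed in general.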

According to our inventory, in 2021, China’s total cooking emissions of organics in the full volatility range, PM2.5, UFPs, and PAHs are 997 kt, 408 kt, 6.50 × 10²⁵ particles, and 15.8 kt, respectively. From 1990 to 2021, emissions of these pollutants increased by over 65 %, and their spatiotemporal trends were affected to varying degrees by external factors, such as population migration, economic development, pollution control policies, and the pandemic at different periods. Using the SHapley Additive exPlanations (SHAP) algorithm, we further analyze the contribution patterns of key driving factors, such as urbanization rate, population, and local emission factors, to emission changes. Notably, driver analysis reveals that existing control measures are insufficient to curb the rapid growth of emissions, necessitating enhanced controls. Regarding control strategies, our county-level inventory shows that 62.3 % of China’s organic emissions are concentrated in 30 % of the counties, which are densely populated and occupy only 14.4 % of the national land area. Therefore, prioritizing control of these areas will be an efficient and targeted strategy. Our research provides crucial data and insights for understanding the impact of cooking emissions on air pollution and health, aiding in policy development. Our long-term, high-resolution emission datasets are publicly available at https://doi.org/10.6084/m9.figshare.26085487 (Li et al., 2025).
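The concentration statistic above (the share of emissions and of land area accounted for by the highest-emitting counties) can be computed from any county-level table. The sketch below shows one way to do it, assuming counties are ranked by total emissions; the abstract does not state the ranking criterion. The function name `top_share` and the ten-county data are hypothetical and are not taken from the published dataset.

```python
def top_share(emissions, areas, top_fraction):
    """Return (emission share, area share) of the highest-emitting counties.

    Counties are ranked by emissions in descending order, and the top
    `top_fraction` of them (rounded to a whole county count) is selected.
    """
    n_counties = len(emissions)
    n_top = max(1, round(top_fraction * n_counties))
    ranked = sorted(range(n_counties), key=lambda i: emissions[i], reverse=True)
    top = ranked[:n_top]
    emission_share = sum(emissions[i] for i in top) / sum(emissions)
    area_share = sum(areas[i] for i in top) / sum(areas)
    return emission_share, area_share


# Hypothetical county data: emissions in kt, land area in arbitrary units.
emissions = [50.0, 40.0, 30.0, 20.0, 15.0, 15.0, 10.0, 10.0, 5.0, 5.0]
areas = [1.0, 2.0, 2.0, 5.0, 5.0, 5.0, 10.0, 10.0, 30.0, 30.0]

emission_share, area_share = top_share(emissions, areas, 0.30)
print(f"top 30% of counties: {emission_share:.1%} of emissions, "
      f"{area_share:.1%} of area")
```

With these made-up numbers the top three counties hold 60 % of emissions on 5 % of the area. Running the same function on the real county table with `top_fraction=0.30` would be the analogous calculation to the 62.3 % and 14.4 % figures, provided the ranking assumption matches the authors' method.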

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 14 May 2025)


Data sets

High-resolution emission inventory of full-volatility organic from cooking souce in China during 2015-2021. Zeqi Li, Bin Zhao, Shengyue Li, Zhezhe Shi, Dejia Yin, Qingru Wu, Fenfen Zhang, Xiao Yun, Guanghan Huang, Yun Zhu, and Shuxiao Wang. https://doi.org/10.6084/m9.figshare.26085487


Latest update: 15 Apr 2025
Short summary
This study uses an ensemble machine learning model to predict long-term, high-resolution cooking activity data, establishing China’s first county-level cooking emission inventory spanning 1990–2021. It covers key pollutants such as polycyclic aromatic hydrocarbons. It reveals emissions’ long-term spatiotemporal trends and driving factors, such as population migration and economic growth, offering efficient control strategies. This dataset is crucial for air pollution and health impact studies.