This work is distributed under the Creative Commons Attribution 4.0 License.
High-resolution global ultrafine particle concentrations through a machine learning model and Earth observations
Abstract. Atmospheric pollution is a major concern due to its well-documented, detrimental impacts on human health, with millions of excess deaths attributed to it annually. Particulate matter (PM), comprising solid and liquid particles suspended in the air, is of particular concern. Historically, research has focused on PM with an aerodynamic diameter of less than 10 μm (PM10) and 2.5 μm (PM2.5), referred to as coarse and fine particulate matter, respectively. Long-term exposure to both classes of PM has been shown to impact human health and has been linked to a range of respiratory and cardiovascular complications. Recently, attention has turned to the lower end of the size distribution, specifically ultrafine particles (UFPs) with an aerodynamic diameter of less than 100 nm (PM0.1). UFPs can penetrate deep into the respiratory system, reach the bloodstream, and have been increasingly associated with chronic health conditions, including cardiovascular disease. Accurate mapping of UFP concentrations at high spatial resolution is crucial, given the strong concentration gradients near sources. However, owing to the relatively recent focus on this class of PM, long-term measurements are scarce, particularly at the global scale. In this study, we employed a machine learning methodology to produce the first global maps of UFP concentrations at high spatial resolution (1 km) by leveraging the limited ground station measurements available worldwide. We trained an XGBoost model to predict annual UFP concentrations for a decade (2010–2019) and used the conformal prediction framework to provide reliable prediction intervals. This approach makes local-to-global UFP data available to support assessments of the health implications associated with long-term exposure.
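For readers unfamiliar with the conformal prediction framework mentioned in the abstract, the following is a minimal sketch of split-conformal intervals around an XGBoost regressor; the calibration split, hyperparameters, and coverage level are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of split-conformal prediction intervals around an XGBoost
# regressor. Hyperparameters and the 25% calibration split are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor


def conformal_intervals(X, y, alpha=0.1, random_state=0):
    # Split into a proper training set and a calibration set.
    X_tr, X_cal, y_tr, y_cal = train_test_split(
        X, y, test_size=0.25, random_state=random_state)

    model = XGBRegressor(n_estimators=500, learning_rate=0.05,
                         max_depth=6, random_state=random_state)
    model.fit(X_tr, y_tr)

    # Nonconformity scores: absolute residuals on the calibration set.
    scores = np.abs(y_cal - model.predict(X_cal))
    n = len(scores)
    # Finite-sample-corrected (1 - alpha) quantile of the scores.
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))

    def predict_with_interval(X_new):
        pred = model.predict(X_new)
        return pred, pred - q, pred + q   # point estimate, lower, upper bound

    return predict_with_interval
```

The returned interval half-width is constant across samples; this is the simplest split-conformal variant and is only meant to illustrate how the prediction intervals are calibrated.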
Status: closed
RC1: 'Comment on essd-2024-314', Anonymous Referee #1, 04 Sep 2024
The study developed a model to estimate global distributions of ultrafine particle (UFP) number concentrations using the XGBoost algorithm. UFP data were sourced from the EBAS and NOAA databases, as well as from relevant literature. The predictors used in the model include land cover, global NO₂, PM₂.₅, black carbon (BC), carbon dioxide (CO₂), carbon monoxide (CO), nitrogen oxides (NOₓ), temperature, and population. Model performance was assessed using ten-fold cross-validation. The influence of each predictor on the estimates was analyzed using Shapley Additive Explanations (SHAP). The optimal combination of hyperparameters was identified through a grid search. The ten-fold cross-validation yielded an R² of 0.896 and a root mean square error (RMSE) of 2,424 cm⁻³. The study is interesting. I have several concerns regarding this study:
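As a point of reference for the evaluation summarized above, below is a minimal sketch of a grid search combined with ten-fold cross-validation for an XGBoost regressor, reporting out-of-fold R² and RMSE; the grid values and the synthetic placeholder data are assumptions, not the settings or data used in the manuscript.

```python
# Sketch of a hyperparameter grid search with ten-fold cross-validation for an
# XGBoost regressor. Grid values and placeholder data are illustrative only.
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(565, 12))                               # placeholder predictors
y = 8000 + 3000 * X[:, 0] + rng.normal(scale=500, size=565)  # placeholder UFP labels (cm^-3)

param_grid = {"n_estimators": [200, 500],
              "max_depth": [4, 6],
              "learning_rate": [0.05, 0.1]}

cv = KFold(n_splits=10, shuffle=True, random_state=42)
search = GridSearchCV(XGBRegressor(random_state=42), param_grid,
                      cv=cv, scoring="neg_root_mean_squared_error", n_jobs=-1)
search.fit(X, y)

# Out-of-fold predictions with the selected hyperparameters.
best = XGBRegressor(random_state=42, **search.best_params_)
y_oof = cross_val_predict(best, X, y, cv=cv)
print("CV R^2 :", r2_score(y, y_oof))
print("CV RMSE:", np.sqrt(mean_squared_error(y, y_oof)))
```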
Major Concerns
- UFP Data Harmonization: The UFP data were sourced from EBAS, NOAA, and the literature, each involving various sources. The authors should specify the measurement instruments (models), methods, sampling periods, measurement ranges, precision, and the number of observations for each source. Merging UFP data from different sources without accounting for differences between instruments could introduce bias into the model.
- Model Validation: The authors relied solely on ten-fold cross-validation to evaluate model performance, which is insufficient. I recommend incorporating spatial and temporal validations to provide a more comprehensive evaluation of the accuracy (a leave-site-out sketch follows this list).
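A minimal sketch of the spatial validation suggested in the second point, using leave-site-out folds via scikit-learn's GroupKFold; the column names (site_id, ufp) are assumed placeholders for the training table, not the authors' actual schema.

```python
# Sketch of spatial (leave-location-out) cross-validation: observations are
# grouped by measurement site so no site appears in both training and test
# folds. Column names and hyperparameters are assumptions.
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GroupKFold
from xgboost import XGBRegressor


def spatial_cv(df: pd.DataFrame, feature_cols, target_col="ufp",
               site_col="site_id", n_splits=10):
    # n_splits must not exceed the number of distinct sites.
    X, y = df[feature_cols].to_numpy(), df[target_col].to_numpy()
    groups = df[site_col].to_numpy()
    oof = np.full(len(df), np.nan)
    for train_idx, test_idx in GroupKFold(n_splits=n_splits).split(X, y, groups):
        model = XGBRegressor(n_estimators=500, max_depth=6, learning_rate=0.05)
        model.fit(X[train_idx], y[train_idx])
        oof[test_idx] = model.predict(X[test_idx])
    return r2_score(y, oof), np.sqrt(mean_squared_error(y, oof))
```

A temporal analogue would group the folds by year instead of by site.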
Minor Concerns
- Abstract Details: The authors should include the training and cross-validation R², mean absolute error (MAE), and RMSE in the abstract.
- Traffic-related Variables: Variables related to traffic, such as road area, proximity to major roads, or traffic volume, have been shown to be significant factors in estimating UFP. The authors should explain why these variables were not included in the model.
- On line 92, the full names of EBAS and NOAA should be provided when these databases are first mentioned.
- In Section 2.1.6, boundary layer height and precipitation are known to influence UFP estimates. The authors should explain why these variables were not incorporated into the model. Furthermore, why did the authors not utilize ERA5-Land data, which offers finer spatial resolution (~9 km) compared to ERA5?
- On line 293, the authors report a total of 565 observations. They should also provide the number of observations contributed by each source to offer clarity on the dataset composition.
- The authors should provide an analysis of annual UFP trends in the results section.
Citation: https://doi.org/10.5194/essd-2024-314-RC1
RC2: 'Comment on essd-2024-314', Anonymous Referee #2, 05 Sep 2024
This manuscript describes an empirical model that attempts to quantify concentrations of ultrafine particles (UFPs, particles smaller than 100 nm) worldwide. This is an ambitious effort, especially given the lack of data available for model training. I have several major comments about the datasets used for model training. Those and additional comments are detailed below.
Major comments
My largest concern is about the training data used to build the model. The authors note that they assembled data from EBAS, NOAA, and the literature. Figure 1 shows the locations of the training datasets; however, there is no list or table (e.g., in the appendix or SI) of the specific data sources. I think that the authors need to be more transparent about the data sources, the time frame for data collection in each location, etc. For example, it's not exactly clear what the authors mean by "565 examples" in the dataset.
I think that the manuscript would benefit from some brief analysis of the UFP data used in model building. Figure 1 shows the UFP concentration at each site as a circle, but this is qualitative. Some of the sites have average PNC > 45000/cm3, which seems extremely high. For example, this recent paper from several European cities shows PNC of ~10-20k/cm3 at roadside sites (https://acp.copernicus.org/articles/24/9515/2024/). I would expect PNC, even in urban locations, to be significantly lower than 45000/cm3. Adding a plot of the ground truth data in either the main text or the SI would be extremely helpful for readers.
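The suggested ground-truth plot could be as simple as the sketch below: a histogram of station-level PNC on a logarithmic axis, given the wide dynamic range of the measurements; the column name is an assumption about the training table.

```python
# Sketch of a distribution plot for the measured particle number concentrations.
# The column name "pnc_cm3" is a placeholder for the training table.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_pnc_distribution(df: pd.DataFrame, pnc_col="pnc_cm3"):
    fig, ax = plt.subplots(figsize=(6, 4))
    # Logarithmic bins, assuming strictly positive concentrations.
    bins = np.logspace(np.log10(df[pnc_col].min()),
                       np.log10(df[pnc_col].max()), 40)
    ax.hist(df[pnc_col], bins=bins)
    ax.set_xscale("log")
    ax.set_xlabel("Measured PNC (cm$^{-3}$)")
    ax.set_ylabel("Number of observations")
    fig.tight_layout()
    return fig
```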
A second major concern is the title of the manuscript. The model claims to be "global." However, as far as I can tell, there are no training sites in the southern hemisphere. This seems to be a model for North America, Europe, and Asia (mostly India and China). While that is still impressive, it is not truly "global."
Other specific comments
Line 324: "Desert areas like the Sahara, the Arabian Peninsula, and parts of Australia exhibit notably high percentage errors": errors relative to what? You have no training data there according to Fig. 1.
Line 338 - are these values of global UFP from a previous paper or from your model? 50000/cm3 seems extremely high even for urban areas.
One thing I take from fig 3 is the high spatial variation. Most of the globe looks like background, and in general cities are only visible when zoomed in.
In the SHAP analysis, does "negatively impact" mean "lead to lower concentration predictions" or "make the model worse", e.g., lower R² or higher RMSE? (Example: Line 435)
Line 92 - define EBAS and NOAA. Also, what exactly are these stations? I think the NOAA sites measure particle size distributions, but that is not clearly stated here.
Line 121-122 "No temporal or spatial interpolations were conducted, and the closest year available for each of the dataset was utilized, as these variables do not change much over time. " Is this true even in rapidly growing cities?
What is the range of magnitudes for Shapley values, typically?
The authors should present some evaluation of the land uses at the training sites. Does the land use for the model training data cover the range of land uses in the areas where the model is applied? Or are the training data overrepresented in some land use categories (e.g., near-road) and underrepresented in others?
Line 332-334: "The model ability to generalize to unseen data is underscored by the fact that, even with increased error margins in the 10-fold cross-validation experiment, the overall performance remains within a reasonable range for practical applications in estimating UFP concentrations." What would that range be, and for what application? E.g., what accuracy is needed for a UFP epidemiology study?
The health literature on UFP exposures is, in my opinion, "muddy." There are lots of studies that show weak or null effects, and the authors should acknowledge this. For example, this recent review paper gives a good overview: https://pubmed.ncbi.nlm.nih.gov/30790006/.
It's not clear if the model is time-resolved, or if the final model (e.g., the maps in Fig 3) represent the average concentration from 2010-2019. If it's the latter, what did the authors do to account for potential temporal changes in long-term UFP concentrations over that time span?
I am unfamiliar with SHAP, and other readers may be as well. Some additional context might be helpful (e.g., in explaining Fig 4). It seems like there are two features that matter - the absolute maximum SHAP and the SHAP range. BC is the most important predictor; it also has the largest single SHAP value and appears to have the largest range.
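For context on these SHAP-related questions, the sketch below shows how a summary plot of the kind shown in Fig. 4 is typically produced for a tree model with the shap library, and what the sign of a SHAP value means; the data are synthetic placeholders, not the authors' training set or code.

```python
# Minimal SHAP sketch for a tree model, with synthetic placeholder data.
import numpy as np
import shap
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(565, 8))                                   # placeholder predictors
y = 8000 + 2000 * X[:, 0] - 1500 * X[:, 1] + rng.normal(scale=500, size=565)

model = XGBRegressor(n_estimators=300, max_depth=4).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # one value per sample and feature, in target units

# A positive SHAP value pushes that sample's prediction above the model's
# average output; a negative value pushes it below. SHAP values describe the
# prediction itself, not the model's accuracy (R^2 or RMSE).
shap.summary_plot(shap_values, X)        # beeswarm ranked by mean(|SHAP|)
```

Because SHAP values are expressed in the units of the target, their magnitudes here would be concentrations in cm⁻³, and the per-feature spread in the beeswarm plot indicates how strongly that predictor moves individual predictions.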
It's a little bit hard to wrap my head around the negative SHAP value for primary emissions (BC, CO, NOx) at low UFP concentrations.
The paragraph about "Natural Land Covers' Diverse Impacts" starting at line 419 suggests a role for "natural ecosystems in potentially mitigating UFP levels." I think it's more likely that these land uses indicate a lower density of sources, and therefore lower concentrations.
Citation: https://doi.org/10.5194/essd-2024-314-RC2
RC3: 'Comment on essd-2024-314', Anonymous Referee #3, 06 Sep 2024
This paper uses a machine learning approach to generate global maps of ultrafine particle (UFP) concentrations at a resolution of 1 km, but there are several issues with the methodology that make the results unreliable.
First, it is crucial to train an accurate and high-precision model when using machine learning for large-scale retrievals, and this study fails to achieve that. On one hand, high-quality label data is essential, yet most of the UFP data used here comes from PNC as a proxy, with only a small portion derived from particle size distribution. The authors did not address the inconsistency between these two types of data, mixing them without further processing. While this issue is acknowledged in the discussion, no solution is provided, which I believe significantly affects the results, as inaccurate label data directly leads to flawed models. On the other hand, the study aims to estimate UFP concentrations globally at a 1 km resolution (with billions of pixels), but the sample size is only 565, which is far from sufficient to represent global conditions at such a high resolution. Even with this small sample size, the model's accuracy is still low, with relative errors reaching up to 50%. The SHAP summary plot shows that most samples cluster around a SHAP value of 0, with only a few exhibiting significant positive or negative values, further indicating poor model performance. In addition, we also observed that many features had almost no impact on the model (the sum of absolute SHAP values for all samples was close to zero), indicating that these variables do not need to be considered in the model. Including too much irrelevant information can sometimes negatively affect the model's performance, reducing both accuracy and efficiency. In summary, I believe the authors need to improve the experiments and optimize the modeling process.
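The feature-screening step suggested above can be made concrete with a short sketch that ranks predictors by their mean absolute SHAP value and flags those contributing almost nothing to the model output; the threshold and names are illustrative assumptions.

```python
# Sketch of SHAP-based feature screening: flag predictors whose mean absolute
# SHAP value is negligible relative to the most important feature. The 1%
# threshold is an assumption, not a recommendation from the manuscript.
import numpy as np
import shap
from xgboost import XGBRegressor


def negligible_features(model: XGBRegressor, X, feature_names, rel_threshold=0.01):
    shap_values = shap.TreeExplainer(model).shap_values(X)
    importance = np.abs(shap_values).mean(axis=0)   # mean |SHAP| per feature
    cutoff = rel_threshold * importance.max()       # e.g. <1% of the top feature
    return [name for name, imp in zip(feature_names, importance) if imp < cutoff]
```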
Moreover, the authors did not conduct an in-depth spatiotemporal analysis of the inversion results. Section 3.2 mainly discusses the potential applications of the product but does not present any meaningful scientific findings or conclusions. Although the feature analysis in Section 3.3 offers some scientific insight, most of the conclusions are similar to those from previous studies. The analysis of four specific samples in Figure 5 has limited scientific relevance. To better explore the differences in feature impacts across regions or environmental conditions, it would be more rigorous to categorize all samples and analyze the SHAP values for each feature within each category. Analyzing all samples from a region would be more scientifically sound and lead to more robust conclusions than selecting a single sample from that region for analysis.
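A minimal sketch of the category-wise SHAP aggregation proposed above, grouping per-sample SHAP values by an assumed region or land-cover label rather than inspecting individual sites; the column names are placeholders.

```python
# Sketch of category-wise SHAP aggregation: mean absolute SHAP value per
# feature within each category (e.g. region or land-cover class). Column
# names are assumptions about the training table.
import pandas as pd
import shap


def shap_by_category(model, df: pd.DataFrame, feature_cols, category_col="region"):
    shap_values = shap.TreeExplainer(model).shap_values(df[feature_cols])
    shap_df = pd.DataFrame(shap_values, columns=feature_cols, index=df.index)
    # Rows: categories; columns: features; values: mean |SHAP| in target units.
    return shap_df.abs().groupby(df[category_col]).mean()
```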
Citation: https://doi.org/10.5194/essd-2024-314-RC3
Data sets
High-resolution global ultrafine particle concentrations through a machine learning model and Earth observations Pantelis Georgiades and Andrea Pozzer https://doi.org/10.17617/3.YK9I4B
Model code and software
Mapping Atmospheric Ultrafine Particles from the Global to the Local Scale Pantelis Georgiades https://github.com/pantelisgeor/Ultrafine-Particles