the Creative Commons Attribution 4.0 License.
Improved global daily nitrogen dioxide concentrations from 2005 to 2023 derived using a deep learning approach
Abstract. Nitrogen dioxide (NO2) is a critical air pollutant with significant environmental and human health impacts, yet global and long-term NO2 datasets with daily continuity and fine spatial resolution remain limited. In this study, we construct a continuous global daily NO2 concentration dataset spanning 2005 to 2023 at a 0.1-degree resolution using the advanced Air Transformer deep learning framework, which integrates satellite observations, ground-based measurements, meteorological reanalysis, land-use information, and auxiliary geophysical variables. The resulting dataset shows robust performance across diverse regions and pollution regimes, with improved spatial consistency and reduced biases relative to existing global products. Based on this dataset, we characterize the spatiotemporal evolution of global NO2 concentrations over the past two decades. Global annual mean NO2 increased from 2005 to 2015, followed by a moderate decline during 2016–2019, a pronounced decrease in 2020 associated with COVID-19–related reductions in economic activity and transportation, and a partial rebound thereafter, reaching 3.38 ppbv in 2023. The Northern Hemisphere and tropical regions largely followed the global trend, whereas the Southern Hemisphere exhibited distinct behaviour, with relatively stable or declining NO2 levels prior to 2015, a sharp decrease in 2020, and a stronger post-pandemic rebound during 2021–2023. As one of the global, multi-decadal NO2 datasets with daily resolution, this dataset provides a valuable resource for air quality assessment, exposure analysis, and atmospheric model evaluation.
Competing interests: At least one of the (co-)authors is a member of the editorial board of Earth System Science Data.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-821', Anonymous Referee #1, 06 Feb 2026
- RC2: 'Comment on essd-2025-821', Anonymous Referee #2, 17 Feb 2026
Review of “Improved global daily nitrogen dioxide concentrations from 2005 to 2023 derived using a deep learning approach” by Jiangshan Mu et al.
Major Comments
This dataset developed by Mu et al. presents an exciting potential improvement to global estimates of daily surface-level NO2. Global surface-level NO2 products have historically been unable to capture localized impacts due to their reliance on coarser satellite-derived data. The Air Transformer approach is especially promising, and the authors have done a nice job of presenting their findings in a clear, coherent way. I especially thought that the figures were well conceived and interesting to interpret.
Although I generally think this is a strong paper that presents some exciting, potentially high-impact advances, I have identified a number of issues that make me unable to recommend this dataset for publication at this time. If these issues are resolved appropriately, I believe that this dataset could be a nice addition to ESSD.
First, the major issue I had with this work is that throughout the paper the authors make claims that their dataset is improved at predicting in “diverse regions” or in quantifying “localized impacts”; however, I do not believe they have done sufficient evaluation to evidence these claims. There are no summary statistics regarding the regional number and values of the training data. Additionally, there is no evaluation in regions that are historically undermonitored such as many countries in the Global South and rural areas. It would also be interesting to compare performance in these regions to past studies.
Second, while it is nice to see that the authors compared to a past estimate from Anenberg et al. 2022, I suggest they compare to the newer product from Larkin et al. 2023 (cited below) as well as the surface-level estimate from Cooper et al. 2022. The latter is slightly different from this study in that it estimates NO2 at 1pm (TROPOMI overpass time) but it would still be interesting to compare the statistical performance of this dataset to that one.
Third, I wonder if the authors could consider a few additional predictor variables specifically tied to (1) anthropogenic features (e.g., road systems, built environment, etc.), and (2) wildfire activity. I guess TROPOMI partially captures the latter, but I am curious if including some marker of wildfires would have a notable effect on the model predictions and/or performance.
Lastly, it is good that both a spatial and a random cross-validation were performed, but I am curious whether you could also perform a temporal cross-validation, especially given that “annualizing” the data appears to have only a minor effect on performance for the spatial cross-validation. I also wonder whether, instead of doing a random spatial cross-validation, you could assess performance in specific regions (maybe the GBD super regions) to address the first point I brought up.
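A minimal sketch of what the temporal and region-specific hold-outs suggested above could look like, under stated assumptions: the `leave_one_group_out` helper, the record layout, and the station, region, and year values are all hypothetical and are not taken from the manuscript. Each fold withholds every record sharing one label (a year, or a region), so the model is always evaluated on a period or a region it has not seen.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_label, train_idx, test_idx) for each distinct label.

    `groups` holds one label per record (for example a year or a region name).
    All records sharing the held-out label form the test fold.
    """
    by_group = defaultdict(list)
    for i, label in enumerate(groups):
        by_group[label].append(i)
    for label in sorted(by_group):
        test_idx = by_group[label]
        train_idx = [i for i in range(len(groups)) if groups[i] != label]
        yield label, train_idx, test_idx


# Hypothetical monitor records: (station_id, region, year). Illustrative only.
records = [
    ("s1", "Europe", 2019),
    ("s1", "Europe", 2020),
    ("s2", "Africa", 2019),
    ("s2", "Africa", 2020),
    ("s3", "Asia", 2019),
    ("s3", "Asia", 2020),
]

years = [r[2] for r in records]
regions = [r[1] for r in records]

# Temporal cross-validation: hold out one year at a time.
temporal_folds = list(leave_one_group_out(years))
# Region-specific cross-validation: hold out one region at a time.
regional_folds = list(leave_one_group_out(regions))
```

Reporting the evaluation statistics per held-out label, rather than pooled over all folds, is what would expose weaker performance in sparsely monitored regions or in specific years.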
Minor Comments
L29-32: Agreed that the anthropogenic sources dominate emissions; however, given that you are developing a dataset that includes rural / non-urban estimates, I think it is important to mention the natural sources (e.g., soil NOx, lightning NOx, wildfire NOx) as well.
L39-40: And also, primarily in the Global North, as NO2 monitoring infrastructure in, e.g., Africa and much of Latin America is lacking. This suggests that uncertainty in NO2 estimates is likely higher in developing regions.
L54: And also, a well-documented inability to capture NO2 in rural areas and areas with low concentrations, which has implications for estimating background concentrations (see DOI: 10.1029/2021GL092783).
L57-58: “the model overcomes retrieval uncertainties and better captures local variations in NO2 concentrations” Curious to hear how you will quantify this, I will read on.
Introduction: Somewhere in the introduction I think it is worth writing a sentence or two more specifically on the associated health effects of NO2, especially given this dataset’s relevance for long-term epidemiological cohort studies.
L65-68: Can you include a map and a summary statistics table of regional-level information from these monitors, either in the main text or the supplement? I am curious to see the number of monitors and observations as well as statistics (e.g., mean, standard deviation, minimum, maximum, and percentage of observations that were removed by the QA/QC). Especially given your claim in the introduction that the model better captures local variation, I think it is necessary to understand where (and for what time periods) the data are available in historically undermonitored areas in the Global South.
L84: I am curious whether you might also need some markers for anthropogenic activity (e.g., roadways) and wildfire activity. Given that you are creating a daily estimate product, the latter is potentially significant.
L90: Nice idea to correct the OMI data here. I don’t think you have defined “n” in this equation. Does this mean the total number of years in the overlap period (4)? If so, maybe just put 4 there. While I think this is appropriate to correct for the coarser resolving pattern of OMI, it is a potential source of uncertainty given that the 2019-2022 period is not necessarily representative of the period prior to 2019 (especially given the influence of COVID-19 lockdowns).
L159-160: Nice to see both a random-based and spatial-based cross validation. I am curious if you additionally considered a temporal-based cross validation?
L203-209: A few comments on the spatial cross-validation. First, it is interesting that you see only a marginal improvement for the monthly and annual predictions compared to the daily. I generally expect averaging the data to significantly improve the correlation, but in this case it is a minor improvement. This points to the potential need to conduct a temporal cross-validation. Second, is it possible to perform a region-specific cross-validation? I am curious about the model performance in Africa (where observations are generally limited) compared to Europe and/or China.
L211-215: Could you also compare to other popular global NO2 datasets such as the Larkin et al. 2023 dataset used in the GBD study (https://doi.org/10.3389/fenvs.2023.1125979) and Cooper et al. 2022 (https://doi.org/10.1038/s41586-021-04229-0). Ah I see you compared to Larkin later on, but this is, as I understand, not the most recent version. It might also be worth comparing in non-urban areas as the Larkin product has a known high bias in rural areas.
Figure 1: Can you also include some metric of bias in the figures? It looks like there is a potential low bias in the estimates especially for the spatial cross-validation, but it is difficult to tell without a statistic.
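As an illustration of the kind of bias statistic that could accompany the scatter plots, here is a minimal sketch computing mean bias (MB), normalized mean bias (NMB), and RMSE from paired observed and predicted values. The `bias_metrics` function name and the sample numbers are assumptions made for this example and do not come from the manuscript.

```python
import math


def bias_metrics(obs, pred):
    """Return mean bias, normalized mean bias, and RMSE for paired values.

    A negative MB or NMB indicates that predictions are low relative to
    observations on average.
    """
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and of equal length")
    n = len(obs)
    diffs = [p - o for o, p in zip(obs, pred)]
    mean_bias = sum(diffs) / n
    normalized_mean_bias = sum(diffs) / sum(obs)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return {"MB": mean_bias, "NMB": normalized_mean_bias, "RMSE": rmse}


# Illustrative values only (ppbv): predictions are uniformly 2 ppbv too low.
observed = [10.0, 20.0, 30.0]
predicted = [8.0, 18.0, 28.0]
metrics = bias_metrics(observed, predicted)
```

Printing MB or NMB in each panel alongside the correlation would make a systematic low bias visible at a glance, since a high correlation alone does not rule one out.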
L235: Can you be more specific here? I think you mean they use OMI only (not TROPOMI) in terms of the satellite data, but I am unsure what you mean for the land-use variables.
L239: I don’t think you can conclusively say that the spatial validation is “effective in generalizing to new locations” without testing accuracy in diverse regions. I am curious how well the model does in regions with limited monitors (e.g., Africa, Latin America, Oceania) and also in non-urban scenes. Otherwise, I don’t think you can make this claim but rather need to caveat it with “generalizing to new urban locations in regions with strong monitoring infrastructure”.
L244: Again, this needs an important caveat that the LUR is likely conservative in urban areas but an underestimate in rural areas.
Figure S3: These are annual values, right? Did you also compare at the daily or monthly timescale?
Figure S4: These are somewhat strange “regions” to separate the data into. Could you instead (or in addition) group these by continent? Or by GBD super region as you do in Figure 2?
Figure 2: These are very cool figures, thank you for sharing. I think panels b and d are currently difficult to read; I wonder if you can redesign the layout of the subplots to make these appear larger?
L359: I take issue with the claim that the dataset can “assess localized impacts” without qualifying that these are evaluated primarily in urban areas in the Global North. For example, if a small city in Uganda uses this dataset to characterize local NO2, do we anticipate that this estimate can accurately characterize its “localized impacts”? I think either more investigation is needed (as indicated above) or some of the claims need to be better qualified throughout the whole paper.
Citation: https://doi.org/10.5194/essd-2025-821-RC2
Data sets
GlobalNO2_AIT: 0.1° Annual Resolution Global Ground-level NO2 Dataset Jiangshan Mu and Chenliang Tao https://doi.org/10.5281/zenodo.13842191
Viewed
- HTML: 233
- PDF: 112
- XML: 15
- Total: 360
- Supplement: 74
- BibTeX: 12
- EndNote: 29
Please find my comments in the attachment.