https://doi.org/10.5194/essd-2026-443
24 Jun 2026
Status: this preprint is currently under review for the journal ESSD.

A gapless 0.05° hourly tropospheric NO2 dataset (2019–2024) over key Asian hotspots reconstructed using physics‑aware deep learning

Hongrui Gao, Qin He, Kai Qin, Jhoon Kim, Diego Loyola, Pravash Tiwari, Lingxiao Lu, and Jason Blake Cohen

Abstract. High-frequency, continuous monitoring of tropospheric nitrogen dioxide (NO2) is crucial for assessing regional air quality and for investigating how photochemical processes evolve over time. Although new-generation geostationary orbit (GEO) satellites (e.g., GEMS) provide hourly observations, their spatial coverage is often severely reduced by cloud obstruction and inherent retrieval limitations. Furthermore, their short observation history hinders long-term, cross-regional environmental assessments. To address these limitations, this study presents a gapless, 0.05° hourly tropospheric NO2 vertical column density (VCD) dataset for representative hotspots in Asia spanning 2019 to 2024, reconstructed using a physics-aware deep learning framework. The framework uses partial convolutions to handle irregularly missing satellite observations and introduces a Physics-aware Normalization (PhysNorm) module. PhysNorm dynamically modulates 0.05° high-resolution feature maps using 0.25° low-resolution physical features, such as ERA5 meteorological fields and EAC4 chemical priors, so that the filled gaps remain physically continuous with their surroundings. The model performs well on the test set (R² = 0.889) and generalizes stably in an independent validation on 2024 data, which were not used for training. Cross-validation against an independent polar-orbiting satellite instrument (GOME-2C) confirms that the dataset reliably reconstructs pollution hotspots and is applicable to periods without GEMS observations (2019–2022). Using the reconstructed dataset, this study resolves the periodic diurnal characteristics of NO2 in typical Asian regions and captures the inter-annual concentration gradients driven by public health events and emission reduction policies over the past six years.
The dataset provides a spatiotemporally continuous record that overcomes the limitations of cloud cover and short observational histories, thereby facilitating long-term, high-resolution air quality assessments and epidemiological studies. It is available at https://doi.org/10.5281/zenodo.20427767 (Gao et al., 2026).
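
The two building blocks named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the equations, so everything below is an assumption: `partial_conv2d` follows the commonly used partial-convolution formulation (mask the input, renormalise by the number of valid pixels in each window, update the mask), and `physics_modulation` is a hypothetical SPADE/FiLM-style stand-in for PhysNorm in which a coarse field is upsampled by a factor of 5 (0.25° to 0.05°) and used as a spatially varying scale and shift. The function names, the weights `w_gamma` and `w_beta`, and the nearest-neighbour upsampling are illustrative choices, not the authors' implementation.

```python
import numpy as np


def partial_conv2d(x, mask, weight, bias=0.0):
    """Single-channel partial convolution (illustrative sketch).

    x      : (H, W) field; values where mask == 0 are ignored (may be NaN)
    mask   : (H, W) array, 1 = observed pixel, 0 = missing pixel
    weight : (k, k) kernel with odd k
    Returns the filtered field and the updated mask.
    """
    k = weight.shape[0]
    pad = k // 2
    # Zero out the gaps, then zero-pad the borders.
    xp = np.pad(np.where(mask > 0, x, 0.0), pad)
    mp = np.pad(mask.astype(float), pad)
    out = np.zeros(x.shape, dtype=float)
    new_mask = np.zeros(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            valid = mp[i:i + k, j:j + k].sum()
            if valid > 0:
                # Rescale by (window size / valid count) so that sparse
                # windows are not biased towards zero.
                conv = (weight * xp[i:i + k, j:j + k]).sum()
                out[i, j] = conv * (k * k / valid) + bias
                new_mask[i, j] = 1.0
    return out, new_mask


def physics_modulation(feat, phys_lowres, w_gamma, w_beta, factor=5, eps=1e-5):
    """Hypothetical scale-and-shift modulation; NOT the paper's PhysNorm."""
    # Nearest-neighbour upsampling from the coarse grid to the fine grid.
    phys = np.repeat(np.repeat(phys_lowres, factor, axis=0), factor, axis=1)
    normed = (feat - feat.mean()) / (feat.std() + eps)
    gamma = w_gamma * phys  # spatially varying scale
    beta = w_beta * phys    # spatially varying shift
    return normed * (1.0 + gamma) + beta


# Toy example: a constant field with a 2x2 "cloud" gap.
field = np.full((10, 10), 2.0)
mask = np.ones((10, 10))
mask[4:6, 4:6] = 0
field[4:6, 4:6] = np.nan
filled, new_mask = partial_conv2d(field, mask, np.full((3, 3), 1 / 9))

# Toy example: modulate a 10x10 feature map with a 2x2 coarse field.
features = np.arange(100.0).reshape(10, 10)
coarse = np.array([[0.0, 1.0], [2.0, 3.0]])
modulated = physics_modulation(features, coarse, w_gamma=0.1, w_beta=0.5)
```

In the toy example every 3×3 window overlaps at least one observed pixel, so a single layer fills the whole gap; for gaps wider than the kernel, pixels with no valid neighbours keep a zero mask and would be filled by subsequent layers.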

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 31 Jul 2026)


Data sets

A gapless 0.05° hourly tropospheric NO2 dataset (2019–2024) over key Asian hotspots reconstructed using physics‑aware deep learning. Hongrui Gao, Qin He, Kai Qin, Jhoon Kim, Diego Loyola, Pravash Tiwari, Lingxiao Lu, and Jason B. Cohen. https://doi.org/10.5281/zenodo.20427766

Latest update: 24 Jun 2026
Short summary
Satellite records of air pollution always contain gaps caused by factors such as clouds and operating conditions. We used a physics-aware deep learning method to produce a gapless 0.05° hourly dataset of tropospheric nitrogen dioxide over Asian pollution hotspots from 2019 to 2024. The dataset captures sub-daily changes and reveals how traffic, industry, and clean-air policies shaped pollution, supporting health and environmental studies.