A gapless 0.05° hourly tropospheric NO2 dataset (2019–2024) over key Asian hotspots reconstructed using physics‑aware deep learning
Abstract. High-frequency, continuous monitoring of tropospheric nitrogen dioxide (NO2) is crucial for assessing regional air quality and investigating photochemical dynamics. Although new-generation geostationary orbit (GEO) satellites (e.g., GEMS) provide hourly observations, their spatial coverage is often severely reduced by cloud obstruction and inherent retrieval limitations, and their short observational record hinders long-term, cross-regional environmental assessments. To address these limitations, this study presents a gapless, 0.05° hourly tropospheric NO2 vertical column density (VCD) dataset for representative hotspots in Asia spanning 2019 to 2024, reconstructed using a physics-aware deep learning framework. The framework uses partial convolutions to handle irregularly missing satellite observations and introduces a Physics-aware Normalization (PhysNorm) module. PhysNorm dynamically modulates the 0.05° high-resolution feature maps using 0.25° low-resolution physical features, such as ERA5 meteorological fields and EAC4 chemical priors, thereby preserving physical continuity while filling data gaps. Validation shows that the model achieves R² = 0.889 on the test set and generalizes stably in an independent validation on 2024 data, which were not used for training. Comparison with independent polar-orbiting satellite observations (GOME-2C) confirms the reliability of the dataset in reconstructing pollution hotspots and its applicability to periods without GEMS observations (2019–2022). Using the reconstructed dataset, this study resolves the periodic diurnal characteristics of NO2 in typical Asian regions and captures the inter-annual concentration variations driven by public health events and emission reduction policies over the past six years.
This newly generated dataset provides an unprecedented spatiotemporally continuous record, overcoming the limitations of cloud cover and short observational histories, thereby facilitating long-term, high-resolution air quality assessments and epidemiological studies. This dataset is available at https://doi.org/10.5281/zenodo.20427767 (Gao et al., 2026).
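To make the two mechanisms named in the abstract more concrete, the following is a minimal NumPy sketch and not the authors' implementation. The partial convolution follows the commonly used mask-renormalized formulation: missing pixels are excluded from the weighted sum, and the result is rescaled by the fraction of valid pixels in the window. The abstract does not describe the internals of PhysNorm, so the second function assumes a conditional-normalization style of modulation in which a scale and shift are derived from the upsampled physical field. The names `partial_conv2d`, `physnorm_sketch`, `w_gamma`, and `w_beta` are hypothetical; the upsampling factor of 5 follows from the 0.25° and 0.05° grid spacings.

```python
import numpy as np


def partial_conv2d(x, mask, weight, bias=0.0):
    """Single-channel partial convolution with zero 'same' padding.

    x, mask : (H, W) arrays; mask is 1 where observed, 0 where missing.
    weight  : (k, k) kernel with odd k.
    Returns the filtered field and the updated validity mask.
    """
    k = weight.shape[0]
    pad = k // 2
    # Zero out missing pixels (also neutralises NaN fill values).
    xp = np.pad(np.where(mask > 0, x, 0.0), pad)
    mp = np.pad(mask.astype(float), pad)
    out = np.zeros(x.shape, dtype=float)
    new_mask = np.zeros(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            valid = mp[i:i + k, j:j + k].sum()
            if valid > 0:
                window = xp[i:i + k, j:j + k]
                # Rescale by (window size / number of valid pixels).
                out[i, j] = (weight * window).sum() * (k * k / valid) + bias
                new_mask[i, j] = 1.0
    return out, new_mask


def physnorm_sketch(feat, phys_lowres, w_gamma, w_beta, scale=5, eps=1e-5):
    """Assumed PhysNorm-like modulation (illustrative only).

    feat        : (H, W) high-resolution feature map (0.05 degree grid).
    phys_lowres : (H // scale, W // scale) physical field (0.25 degree grid).
    w_gamma, w_beta : hypothetical scalar weights standing in for the
                      learned layers that would produce scale and shift.
    """
    # Nearest-neighbour upsampling of the coarse field to the fine grid.
    phys = np.kron(phys_lowres, np.ones((scale, scale)))
    normed = (feat - feat.mean()) / np.sqrt(feat.var() + eps)
    gamma = w_gamma * phys
    beta = w_beta * phys
    return normed * (1.0 + gamma) + beta
```

In this sketch, each partial-convolution pass fills only those missing pixels that have at least one observed neighbour in the window, so larger gaps shrink layer by layer, while the modulation step lets the coarse physical field steer the fine-scale features at every location.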