Articles | Volume 13, issue 6
https://doi.org/10.5194/essd-13-3013-2021
© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.
AQ-Bench: a benchmark dataset for machine learning on global air quality metrics
Clara Betancourt
Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
Timo Stomberg
Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
Institute of Geodesy and Geoinformation, University of Bonn, Nußallee 17, 53115 Bonn, Germany
Ribana Roscher
Institute of Geodesy and Geoinformation, University of Bonn, Nußallee 17, 53115 Bonn, Germany
Martin G. Schultz
CORRESPONDING AUTHOR
Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
Scarlet Stadtler
Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
Viewed
Total article views: 8,112 (including HTML, PDF, and XML)
Cumulative views and downloads
(calculated since 14 Jan 2021)
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 6,003 | 1,943 | 166 | 8,112 | 190 | 198 |
Total article views: 6,101 (including HTML, PDF, and XML)
Cumulative views and downloads
(calculated since 24 Jun 2021)
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 4,925 | 1,031 | 145 | 6,101 | 173 | 179 |
Total article views: 2,011 (including HTML, PDF, and XML)
Cumulative views and downloads
(calculated since 14 Jan 2021)
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 1,078 | 912 | 21 | 2,011 | 17 | 19 |
Viewed (geographical distribution)
Total article views: 8,112 (including HTML, PDF, and XML)
Thereof 7,637 with geography defined
and 475 with unknown origin.
Total article views: 6,101 (including HTML, PDF, and XML)
Thereof 5,861 with geography defined
and 240 with unknown origin.
Total article views: 2,011 (including HTML, PDF, and XML)
Thereof 1,776 with geography defined
and 235 with unknown origin.
Cited
20 citations as recorded by Crossref.
- Challenges and Benchmark Datasets for Machine Learning in the Atmospheric Sciences: Definition, Status, and Outlook P. Dueben et al. https://doi.org/10.1175/AIES-D-21-0002.1
- Proper Weather Forecasting Internet of Things Sensor Framework with Machine Learning A. Turukmane & S. Pande https://doi.org/10.4108/eetiot.5382
- Applications of Machine Learning and Artificial Intelligence in Tropospheric Ozone Research S. Hickman et al. https://doi.org/10.5194/gmd-18-8777-2025
- Representing chemical history in ozone time-series predictions – a model experiment study building on the MLAir (v1.5) deep learning framework F. Kleinert et al. https://doi.org/10.5194/gmd-15-8913-2022
- Advances and challenges of machine learning in satellite-based atmospheric NO2 monitoring R. Zhang et al. https://doi.org/10.1016/j.apr.2026.103066
- Enhancing the Prediction of Multiple Ozone Metrics Using Genetic Algorithm-Based Feature Selection for the Multi-Target Regression of the Environmental AQ-Bench Dataset N. Jailani & G. Mara https://doi.org/10.48084/etasr.14985
- Exploring the potential of machine learning for simulations of urban ozone variability N. Ojha et al. https://doi.org/10.1038/s41598-021-01824-z
- Augmenting the real-time rainfall forecast skills over Odisha using deep learning technique O. Sharma et al. https://doi.org/10.1007/s00477-024-02825-w
- Explainable Machine Learning Reveals Capabilities, Redundancy, and Limitations of a Geospatial Air Quality Benchmark Dataset S. Stadtler et al. https://doi.org/10.3390/make4010008
- Improving rainfall forecast at the district scale over the eastern Indian region using deep neural network D. Trivedi et al. https://doi.org/10.1007/s00704-023-04734-4
- Integrating geospatial indicators and machine learning for ecosystem health assessment: a case study of Sylhet Sadar, Bangladesh S. Lubna & M. Kabir https://doi.org/10.1080/23754931.2025.2577739
- Importance of ozone precursors information in modelling urban surface ozone variability using machine learning algorithm V. Balamurugan et al. https://doi.org/10.1038/s41598-022-09619-6
- A multi-task learning model for global soil moisture prediction based on adaptive weight allocation Y. Li et al. https://doi.org/10.1038/s41598-025-01894-3
- Global, high-resolution mapping of tropospheric ozone – explainable machine learning and impact of uncertainties C. Betancourt et al. https://doi.org/10.5194/gmd-15-4331-2022
- Feature selection for global tropospheric ozone prediction based on the BO-XGBoost-RFE algorithm B. Zhang et al. https://doi.org/10.1038/s41598-022-13498-2
- Graph Machine Learning for Improved Imputation of Missing Tropospheric Ozone Data C. Betancourt et al. https://doi.org/10.1021/acs.est.3c05104
- Interactions between atmospheric composition and climate change – progress in understanding and future opportunities from AerChemMIP, PDRMIP, and RFMIP S. Fiedler et al. https://doi.org/10.5194/gmd-17-2387-2024
- Addressing the Coupled Optimization of Feature Selection and Hyperparameter Tuning Using a TPE-Driven XGBoost-RFE Framework N. Jailani & G. Mara https://doi.org/10.48084/etasr.15024
- Advancing air pollution forecasting: a review of physical, statistical, and machine learning methods A. Rawat et al. https://doi.org/10.1007/s11356-026-37789-7
- LandBench 1.0: A benchmark dataset and evaluation metrics for data-driven land surface variables prediction Q. Li et al. https://doi.org/10.1016/j.eswa.2023.122917
Latest update: 07 Jun 2026
Short summary
With the AQ-Bench dataset, we contribute to shared data usage and machine learning methods in the field of environmental science. The AQ-Bench dataset contains air quality data and metadata from more than 5500 air quality observation stations all over the world. The dataset offers a low-threshold entrance to machine learning on a real-world environmental dataset. AQ-Bench thus provides a blueprint for environmental benchmark datasets.