Articles | Volume 18, issue 1
https://doi.org/10.5194/essd-18-77-2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
GEMS-GER: a machine learning benchmark dataset of long-term groundwater levels in Germany with meteorological forcings and site-specific environmental features
Marc Ohmer
Institute of Applied Geosciences (AGW), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Tanja Liesch
Institute of Applied Geosciences (AGW), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Bastian Habbel
Institute of Applied Geosciences (AGW), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Benedikt Heudorfer
Institute of Meteorology and Climate Research – Atmospheric Trace Gases and Remote Sensing (IMK-ASF), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Mariana Gomez
Federal Institute for Geosciences and Natural Resources (BGR), Berlin, Germany
Patrick Clos
Federal Institute for Geosciences and Natural Resources (BGR), Berlin, Germany
Maximilian Nölscher
Federal Institute for Geosciences and Natural Resources (BGR), Berlin, Germany
Stefan Broda
Federal Institute for Geosciences and Natural Resources (BGR), Berlin, Germany
Related authors
Tanja Liesch and Marc Ohmer
EGUsphere, https://doi.org/10.5194/egusphere-2025-4048, 2025
This preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).
Short summary
We studied how to add site information to deep learning models that predict groundwater levels at many wells at once. Using data from Germany, we compared four simple ways to combine time-varying weather with time-invariant site characteristics. All methods gave similar average accuracy. Repeating site data at each time step was slightly best but used more computer power. The quality of site information mattered more than the method, guiding future model design.
Marc Ohmer and Tanja Liesch
EGUsphere, https://doi.org/10.5194/egusphere-2025-4055, 2025
This preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).
Short summary
We compared global vs. local deep learning models for groundwater level prediction using ~3,000 wells. Unlike surface water, groundwater is complex and data-scarce. Results: global models show no systematic accuracy advantage over local ones. Data similarity matters more than quantity for better predictions. Successful groundwater modeling requires strategies tailored to these unique complexities, not just larger datasets.
Marc Ohmer, Tanja Liesch, and Andreas Wunsch
Hydrol. Earth Syst. Sci., 26, 4033–4053, https://doi.org/10.5194/hess-26-4033-2022, 2022
Short summary
We present a data-driven approach to select optimal locations for groundwater monitoring wells. The applied approach can optimize the number of wells and their location for a network reduction (by ranking wells in order of their information content and removing redundant ones), an extension (finding sites with great information gain), or both. It allows us to include a cost function to account for more or less suitable areas for new wells and can help to obtain maximum information content for a given budget.
Fabienne Doll, Tanja Liesch, Maria Wetzel, Stefan Kunz, and Stefan Broda
EGUsphere, https://doi.org/10.5194/egusphere-2025-3539, 2025
Short summary
With the growing use of machine learning for groundwater level (GWL) prediction, proper performance estimation is crucial. This study compares three validation strategies—blocked cross-validation (bl-CV), repeated out-of-sample (repOOS), and out-of-sample (OOS)—for 1D-CNN models using meteorological inputs. Results show that bl-CV offers the most reliable performance estimates, while OOS is the most uncertain, highlighting the need for careful method selection.
Stefan Kunz, Alexander Schulz, Maria Wetzel, Maximilian Nölscher, Teodor Chiaburu, Felix Biessmann, and Stefan Broda
Hydrol. Earth Syst. Sci., 29, 3405–3433, https://doi.org/10.5194/hess-29-3405-2025, 2025
Short summary
Accurate groundwater level predictions are crucial for sustainable management. This study applies two machine learning models – Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS) and the Temporal Fusion Transformer (TFT) – to forecast seasonal groundwater levels for 5288 wells across Germany. N-HiTS outperformed TFT, with both models performing well in diverse hydrogeological settings, particularly in lowlands with distinct seasonal dynamics.
Raoul A. Collenteur, Ezra Haaf, Mark Bakker, Tanja Liesch, Andreas Wunsch, Jenny Soonthornrangsan, Jeremy White, Nick Martin, Rui Hugman, Ed de Sousa, Didier Vanden Berghe, Xinyang Fan, Tim J. Peterson, Jānis Bikše, Antoine Di Ciacca, Xinyue Wang, Yang Zheng, Maximilian Nölscher, Julian Koch, Raphael Schneider, Nikolas Benavides Höglund, Sivarama Krishna Reddy Chidepudi, Abel Henriot, Nicolas Massei, Abderrahim Jardani, Max Gustav Rudolph, Amir Rouhani, J. Jaime Gómez-Hernández, Seifeddine Jomaa, Anna Pölz, Tim Franken, Morteza Behbooei, Jimmy Lin, and Rojin Meysami
Hydrol. Earth Syst. Sci., 28, 5193–5208, https://doi.org/10.5194/hess-28-5193-2024, 2024
Short summary
We show the results of the 2022 Groundwater Time Series Modelling Challenge; 15 teams applied data-driven models to simulate hydraulic heads, and three model groups were identified: lumped, machine learning, and deep learning. For all wells, reasonable performance was obtained by at least one team from each group. There was not one team that performed best for all wells. In conclusion, the challenge was a successful initiative to compare different models and learn from each other.
Mariana Gomez, Maximilian Nölscher, Andreas Hartmann, and Stefan Broda
Hydrol. Earth Syst. Sci., 28, 4407–4425, https://doi.org/10.5194/hess-28-4407-2024, 2024
Short summary
To understand the impact of external factors on groundwater level modelling using a 1-D convolutional neural network (CNN) model, we train, validate, and tune individual CNN models for 505 wells distributed across Lower Saxony, Germany. We then evaluate the performance of these models against available geospatial and time series features. This study provides new insights into the relationship between these factors and the accuracy of groundwater modelling.
Andreas Wunsch, Tanja Liesch, and Nico Goldscheider
Hydrol. Earth Syst. Sci., 28, 2167–2178, https://doi.org/10.5194/hess-28-2167-2024, 2024
Short summary
Seasons have a strong influence on groundwater levels, but relationships are complex and partly unknown. Using data from wells in Germany and an explainable machine learning approach, we showed that summer precipitation is the key factor that controls the severity of a low-water period in fall; high summer temperatures do not per se cause stronger decreases. Preceding winters have only a minor influence on such low-water periods in general.
Annika Nolte, Ezra Haaf, Benedikt Heudorfer, Steffen Bender, and Jens Hartmann
Hydrol. Earth Syst. Sci., 28, 1215–1249, https://doi.org/10.5194/hess-28-1215-2024, 2024
Short summary
This study examines about 8000 groundwater level (GWL) time series from five continents to explore similarities in groundwater systems at different scales. Statistical metrics and machine learning techniques are applied to identify common GWL dynamics patterns and analyze their controlling factors. The study also highlights the potential and limitations of this data-driven approach to improve our understanding of groundwater recharge and discharge processes.
Benedikt Heudorfer, Tanja Liesch, and Stefan Broda
Hydrol. Earth Syst. Sci., 28, 525–543, https://doi.org/10.5194/hess-28-525-2024, 2024
Short summary
We build a neural network to predict groundwater levels from monitoring wells. We predict all wells at the same time, by learning the differences between wells with static features, making it an entity-aware global model. This works, but we also test different static features and find that the model does not use them to learn exactly how the wells are different, but only to uniquely identify them. As this model class is not actually entity aware, we suggest further steps to make it so.
Guillaume Cinkus, Naomi Mazzilli, Hervé Jourde, Andreas Wunsch, Tanja Liesch, Nataša Ravbar, Zhao Chen, and Nico Goldscheider
Hydrol. Earth Syst. Sci., 27, 2397–2411, https://doi.org/10.5194/hess-27-2397-2023, 2023
Short summary
The Kling–Gupta Efficiency (KGE) is a performance criterion extensively used to evaluate hydrological models. We conduct a critical study on the KGE and its variant to examine counterbalancing errors. Results show that, when assessing a simulation, concurrent over- and underestimation of discharge can lead to an overall higher criterion score without an associated increase in model relevance. We suggest that one carefully choose performance criteria and use scaling factors.
Guillaume Cinkus, Andreas Wunsch, Naomi Mazzilli, Tanja Liesch, Zhao Chen, Nataša Ravbar, Joanna Doummar, Jaime Fernández-Ortega, Juan Antonio Barberá, Bartolomé Andreo, Nico Goldscheider, and Hervé Jourde
Hydrol. Earth Syst. Sci., 27, 1961–1985, https://doi.org/10.5194/hess-27-1961-2023, 2023
Short summary
Numerous modelling approaches can be used for studying karst water resources, which can make it difficult for a stakeholder or researcher to choose the appropriate method. We conduct a comparison of two widely used karst modelling approaches: artificial neural networks (ANNs) and reservoir models. Results show that ANN models are very flexible and seem great for reproducing high flows. Reservoir models can work with relatively short time series and seem to accurately reproduce low flows.
Andreas Wunsch, Tanja Liesch, Guillaume Cinkus, Nataša Ravbar, Zhao Chen, Naomi Mazzilli, Hervé Jourde, and Nico Goldscheider
Hydrol. Earth Syst. Sci., 26, 2405–2430, https://doi.org/10.5194/hess-26-2405-2022, 2022
Short summary
Modeling complex karst water resources is difficult enough, but often there are no or too few climate stations available within or close to the catchment to deliver input data for modeling purposes. We apply image recognition algorithms to time-distributed, spatially gridded meteorological data to simulate karst spring discharge. Our models can also learn the approximate catchment location of a spring independently.
Doris E. Wendt, John P. Bloomfield, Anne F. Van Loon, Margaret Garcia, Benedikt Heudorfer, Joshua Larsen, and David M. Hannah
Nat. Hazards Earth Syst. Sci., 21, 3113–3139, https://doi.org/10.5194/nhess-21-3113-2021, 2021
Short summary
Managing water demand and supply during droughts is complex, as highly pressured human–water systems can overuse water sources to maintain water supply. We evaluated the impact of drought policies on water resources using a socio-hydrological model. For a range of hydrogeological conditions, we found that integrated drought policies reduce baseflow and groundwater droughts most if extra surface water is imported, reducing the pressure on water resources during droughts.
Andreas Wunsch, Tanja Liesch, and Stefan Broda
Hydrol. Earth Syst. Sci., 25, 1671–1687, https://doi.org/10.5194/hess-25-1671-2021, 2021
Short summary
We present a public dataset of weekly groundwater levels from more than 3000 wells across Germany, spanning 32 years. It combines weather data and site-specific environmental information to support forecasting groundwater changes. Three benchmark models of varying complexity show how data and modeling approaches influence predictions. This resource promotes open, reproducible research and helps guide future water management decisions.