28 Sep 2021

Review status: this preprint is currently under review for the journal ESSD.

Into the Noddyverse: A massive data store of 3D geological models for Machine Learning & inversion applications

Mark Jessell1,5, Jiateng Guo2, Yunqiang Li2, Mark Lindsay1,5, Richard Scalzo3,5, Jérémie Giraud1, Guillaume Pirot1,5, Ed Cripps4,5, and Vitaliy Ogarko1,5
  • 1Mineral Exploration Cooperative Research Centre, Centre for Exploration Targeting, The University of Western Australia, Perth, Australia
  • 2College of Resources and Civil Engineering, Northeastern University, Shenyang, China
  • 3School of Mathematics and Statistics, University of Sydney, Sydney, Australia
  • 4Department of Mathematics and Statistics, The University of Western Australia, Perth, Australia
  • 5ARC Centre for Data Analytics for Resources and Environments (DARE)

Abstract. Unlike some other well-known challenges such as facial recognition, where Machine Learning and inversion algorithms are widely developed, the geosciences suffer from a lack of large, labelled datasets that can be used to validate or train robust Machine Learning and inversion schemes. Publicly available 3D geological models are far too restricted in both number and range of geological scenarios to serve these purposes. For the inversion of geophysical data, this problem is further exacerbated because in most cases real geophysical observations result from unknown 3D geology, and synthetic test datasets are often neither particularly geological nor geologically diverse. To overcome these limitations, we have used the Noddy modelling platform to generate one million models, which represent the first publicly accessible massive training set for 3D geology and the resulting gravity and magnetic datasets. This model suite can be used to train Machine Learning systems and to provide comprehensive test suites for geophysical inversion. We describe the methodology for producing the model suite, and discuss the opportunities such a model suite affords, as well as its limitations, and how we can grow and access this resource.

Status: open (until 24 Nov 2021)

Data sets

Loop3D/noddyverse: Noddyverse 1.0 Mark Jessell

Total article views: 426 (including HTML, PDF, and XML)
  • HTML: 339
  • PDF: 80
  • XML: 7
  • Total: 426
  • BibTeX: 5
  • EndNote: 5
Latest update: 24 Oct 2021
Short summary
To train and test automated methods in the geosciences, we need access to large numbers of examples where we know “the answer”. We present a suite of synthetic 3D geological models that allows researchers to test their methods on a wide range of geologically plausible models, thus overcoming one of the fundamental limitations of automation studies.