the Creative Commons Attribution 4.0 License.
OpenSWI: A Massive-Scale Benchmark Dataset for Surface Wave Dispersion Curve Inversion
Abstract. Surface wave dispersion curve inversion plays a critical role in both shallow geophysical exploration and deep geological studies, yet it remains hindered by sensitivity to initial models, susceptibility to local minima, and low computational efficiency. Recently, data-driven deep learning methods, inspired by their success in computer vision and natural language processing, have shown promising potential to overcome these challenges. However, the lack of large-scale and diverse benchmark datasets remains a major obstacle to the development and evaluation of such methods. To address this gap, we introduce OpenSWI, a comprehensive benchmark dataset generated through the Surface Wave Inversion Dataset Preparation (SWIDP) pipeline. OpenSWI comprises two synthetic datasets tailored to different research scales and application scenarios, namely OpenSWI-shallow and OpenSWI-deep, as well as an AI-ready real-world dataset for generalization evaluation, OpenSWI-real. OpenSWI-shallow is derived from the 2-D geological model dataset OpenFWI, containing over 22 million 1-D velocity profiles paired with their fundamental-mode phase and group velocity dispersion curves, spanning a broad spectrum of shallow geological structures (e.g., flat layers, faults, folds, and realistic stratigraphy). OpenSWI-deep is built from 14 global and regional 3-D geological models, comprising approximately 1.26 million high-fidelity 1-D velocity-dispersion data pairs for deep earth studies. OpenSWI-real, compiled from open-source projects, contains two sets of observed dispersion curves and their corresponding 1-D reference models, serving as a benchmark for evaluating the generalization of deep learning models. To demonstrate the utility of OpenSWI, we trained deep learning models on OpenSWI-shallow and OpenSWI-deep, and evaluated them on OpenSWI-real. 
The results show strong agreement between the predicted and reference velocity models, confirming the diversity and representativeness of the OpenSWI dataset. To facilitate the advancement of intelligent surface wave dispersion curve inversion techniques, we release the OpenSWI dataset (https://doi.org/10.5281/zenodo.16874111) and the SWIDP toolbox along with associated resources (https://doi.org/10.5281/zenodo.16884901), providing open resources to support the research community.
Status: final response (author comments only)
RC1: 'Comment on essd-2025-502', Filippo Gatti, 12 Jan 2026
- AC1: 'Reply on RC1', Feng Liu, 13 Mar 2026
RC2: 'Comment on essd-2025-502', Anonymous Referee #2, 02 Mar 2026
General Comments:
Liu et al. construct OpenSWI, a comprehensive benchmark dataset designed for surface wave dispersion curve inversion, comprising three subsets: OpenSWI-shallow, OpenSWI-deep, and OpenSWI-real. These datasets effectively address the growing need for large-scale and diverse training resources to facilitate AI-based inversion techniques in both shallow and deep geophysical applications. The manuscript presents a systematic and geologically informed workflow for dataset construction, generating a large number of velocity-dispersion curve pairs from multiple publicly available synthetic and real models. In addition, the authors develop a unified quality control and standardization process, together with several effective data augmentation strategies, to build a massive and structurally diverse dataset. Finally, the authors validate the feasibility and effectiveness of the datasets by testing on multiple real-world observation datasets.
Overall, this work is timely and potentially impactful. The scale of the dataset and the effort toward open-source release are commendable, and the proposed workflow provides a reproducible foundation for future dataset expansion. Nevertheless, several aspects of the data processing, forward modeling details, model training design, and overall presentation would benefit from further clarification and refinement. Addressing these issues would improve the clarity, methodological rigor, and reliability of the benchmark dataset for future applications.
Specific comments:
- Page 5, Lines 111-112, and Figures 2, 4: The authors mention that artifacts (e.g., zero or abnormal values) are corrected through interpolation or single-point removal during the quality control process. In Figure 2, the anomalous low-velocity point appears to be a numerical artifact introduced during interpolation after fault insertion, which may indeed be non-physical in the context of a normal fault setting. However, in the Flat–Fault and Fold–Fault models shown in Figure 4, some geological scenarios may involve reverse faulting or locally overturned strata. In such cases, localized low-velocity anomalies or sharp velocity inversions could be geologically reasonable rather than numerical artifacts. How does the quality control process distinguish between numerical artifacts and geologically meaningful velocity inversions? Please clarify.
- Page 7, Lines 141-144: The explanation of the procedures applied for depths <120 km and ≥120 km is unclear and potentially misleading. Although the manuscript states that Brocher’s empirical formulas are less applicable at depths ≥120 km, Brocher’s empirical relationship still appears to be used to compute ρ after deriving Vp from Vs based on a constant Vp/Vs assumption. Please clarify this workflow. Besides, the manuscript adopts a fixed Vp/Vs value of 1.79 for all depths below 120 km. Could this assumption reduce the variability, diversity, or realism of the dataset? Furthermore, might the use of different parameter conversion procedures above and below 120 km introduce an artificial discontinuity at this boundary?
- In the model training, the authors adopt MSE as the loss function for training the inversion model. Have alternative loss functions been evaluated, such as MAE or smoothed MAE (Huber loss)? Since MSE tends to promote smoother predictions, could this potentially affect the preservation of boundaries with sharp velocity discontinuities?
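For reference, the two-branch conversion questioned in the Brocher comment above can be sketched as follows. The polynomial coefficients are the published Brocher (2005) regression fits; the 120 km split and the fixed Vp/Vs = 1.79 follow the manuscript only as paraphrased in the comment, and all function names are illustrative:

```python
# Hedged sketch of the two-branch parameter conversion discussed above.
# Coefficients are from Brocher (2005); the 120 km boundary and the
# fixed Vp/Vs = 1.79 follow the manuscript as described in the comment.

def vp_from_vs_brocher(vs):
    """Brocher (2005) regression fit; Vs in km/s, valid for 0 < Vs < 4.5 km/s."""
    return 0.9409 + 2.0947 * vs - 0.8206 * vs**2 + 0.2683 * vs**3 - 0.0251 * vs**4

def rho_from_vp_nafe_drake(vp):
    """Nafe-Drake curve as fit by Brocher (2005); Vp in km/s, rho in g/cm^3."""
    return (1.6612 * vp - 0.4721 * vp**2 + 0.0671 * vp**3
            - 0.0043 * vp**4 + 0.000106 * vp**5)

def convert(vs, depth_km, vp_vs_deep=1.79):
    """Vs -> (Vp, rho), branching at the 120 km boundary as the comment describes.

    Note that rho is taken from the same Nafe-Drake fit on both branches,
    which is exactly the point the comment asks the authors to clarify.
    """
    vp = vp_from_vs_brocher(vs) if depth_km < 120.0 else vp_vs_deep * vs
    return vp, rho_from_vp_nafe_drake(vp)
```

One can check that the deep branch produces a jump relative to the shallow branch whenever the Brocher-implied Vp/Vs at 120 km differs from 1.79, which is the artificial discontinuity the comment warns about.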
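The loss alternatives raised in the comment above can be illustrated with a minimal NumPy sketch; the toy residuals and the Huber delta are illustrative values, not taken from the manuscript:

```python
import numpy as np

# Minimal illustration of the loss functions mentioned in the comment.
# The delta value and the toy arrays below are illustrative assumptions.

def mse(pred, target):
    return np.mean((pred - target) ** 2)

def mae(pred, target):
    return np.mean(np.abs(pred - target))

def huber(pred, target, delta=1.0):
    """Smoothed MAE: quadratic for small residuals, linear beyond delta."""
    r = np.abs(pred - target)
    quadratic = 0.5 * r**2
    linear = delta * (r - 0.5 * delta)
    return np.mean(np.where(r <= delta, quadratic, linear))

# A single large residual (e.g., a missed sharp velocity jump) dominates
# MSE far more than MAE or Huber, which is one reason MSE-trained models
# tend to smooth sharp discontinuities.
pred = np.array([3.0, 3.0, 3.0, 6.0])
target = np.array([3.0, 3.0, 3.0, 3.0])
```

On this toy example the single outlier residual contributes quadratically to MSE but only linearly to MAE and Huber, making the comment's concern about over-smoothed boundaries concrete.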
Technical comments:
- Page-3, Line 56: The citation format “...of the researchers Merrifield et al. (2022)” is inappropriate.
- Page-5, Lines 128-129: Please provide a detailed description of the de-duplication procedure. It would be helpful if the authors could clarify whether the de-duplication was implemented during the profile extraction stage (e.g., by applying a spatial sampling interval), or performed after extraction using a quantitative similarity criterion.
- Page-8, Lines 175-176: The expression “they provide deep learning models with ...” is somewhat informal, e.g., “they provide .... samples for model training, ...”
- Page-18, Lines 314-316: Please avoid using single-sentence paragraphs.
- Page-22, Lines 368-369: The described learning rate decay intervals (20 and 200 epochs) appear inconsistent with the corresponding figure 10, which seems to show decay at approximately 40 and 500 epochs. Please clarify.
- Page-23, Lines 399-401: Please avoid using single-sentence paragraphs.
- Figure 1 caption: The description “white box” appears inconsistent with the figure, as the box appears closer to gray instead of white.
- Figure 2: Please add the axis scale (with units) for the density curves, as currently only the velocity scale is shown.
- Figure 7: The central global map shows several gray dots. Could the authors clarify whether these are meaningful markers or possible visualization artifacts (e.g., due to low image resolution)?
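Regarding the de-duplication question raised above (Page-5, Lines 128-129), a post-extraction quantitative similarity criterion, one of the two options the comment asks about, might look like the following sketch. The RMS metric, the threshold, and all names are hypothetical, not the authors' procedure:

```python
import numpy as np

# Hypothetical sketch of post-extraction de-duplication of 1-D velocity
# profiles using a quantitative similarity criterion. The RMS distance
# metric and the tolerance `tol` (km/s) are illustrative assumptions.

def deduplicate(profiles, tol=0.05):
    """Greedily keep a profile only if its RMS distance to every
    already-kept profile exceeds `tol`."""
    kept = []
    for p in profiles:
        if all(np.sqrt(np.mean((p - q) ** 2)) > tol for q in kept):
            kept.append(p)
    return kept
```

A sketch like this would make explicit whether near-duplicate profiles (e.g., neighbors extracted from the same 2-D model) survive into the final dataset.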
Citation: https://doi.org/10.5194/essd-2025-502-RC2
- AC2: 'Reply on RC2', Feng Liu, 13 Mar 2026
Data sets
OpenSWI-dataset Feng Liu https://doi.org/10.5281/zenodo.16874111
Model code and software
OpenSWI-toolbox Feng Liu https://doi.org/10.5281/zenodo.16884901
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 532 | 506 | 40 | 1,078 | 36 | 44 |
The size and the extent of the proposed database are remarkable and certainly of interest for the community. However, there are a few issues that must be addressed before publication:
- extracting 1-D profiles from the same 3-D geology, while adding some random fluctuation, seems to create a bias in the dataset (the profiles are close to each other and all describe the same large geological structures).
- too little information is provided, even in the appendix, about the DDPM. In particular, how viable is it to expand the dataset with a diffusion model: does the DDPM reproduce the same statistics? How many iterations are needed to infer new samples? How diverse are those samples? Unless the DDPM has some novel feature, I think its role in this paper is rather marginal and can be overlooked; otherwise, it should be expanded to highlight its importance.
- what is the highest frequency that the geological models can propagate?
- are the random perturbations introduced by the authors consistent with the natural uncertainty? What about small-scale heterogeneity, which is well known to have a specific 3-D correlation structure? Why did the authors not include this in their dataset?
- The authors overlooked one major dataset, published in this journal in 2024, which provides 30,000 ground motion simulations including complex randomized geology:
Lehmann, F.; Gatti, F.; Bertin, M.; Clouteau, D. Synthetic Ground Motions in Heterogeneous Geologies from Various Sources: The HEMEW^S-3D Database. Earth Syst. Sci. Data 2024, 16 (9), 3949–3972. https://doi.org/10.5194/essd-16-3949-2024.
This database spans ~10×10 km² per sample and is constructed with minimal bias. Given that the dataset provides (geology, time-histories) pairs, it would be interesting to benchmark the proposed model out-of-distribution, which is the most difficult aspect of benchmarking a new ML model.
- The transformer architecture presented in the paper seems a little too advanced for such a simple task (dispersion curves vs. 1-D geological profile). It is necessary to benchmark it against existing alternative deep learning models in order to consider it a reliable alternative.
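The point above about correlated small-scale heterogeneity can be made concrete with a minimal sketch that draws depth-correlated (rather than pointwise-independent) perturbations for a 1-D profile. The exponential covariance model and all parameter values are illustrative assumptions, not taken from the manuscript:

```python
import numpy as np

# Hedged sketch of depth-correlated velocity perturbations, as opposed to
# independent pointwise noise. The exponential covariance C(h) =
# sigma^2 * exp(-|h| / corr_len) and every parameter value are
# illustrative choices for this example only.

def correlated_perturbation(n, dz, corr_len, sigma, rng):
    """Draw one correlated Gaussian perturbation for an n-point 1-D profile.

    n        : number of depth samples
    dz       : depth spacing (km)
    corr_len : correlation length (km)
    sigma    : standard deviation of the perturbation (km/s)
    """
    z = np.arange(n) * dz
    cov = sigma**2 * np.exp(-np.abs(z[:, None] - z[None, :]) / corr_len)
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(n))  # jitter for stability
    return L @ rng.standard_normal(n)

rng = np.random.default_rng(0)
dv = correlated_perturbation(n=50, dz=0.1, corr_len=1.0, sigma=0.05, rng=rng)
```

Perturbations drawn this way vary smoothly over the correlation length instead of flipping sign at every depth sample, which is closer to the heterogeneity structure the comment refers to.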