OpenSWI: a massive-scale benchmark dataset for surface wave dispersion curve inversion

Liu, Feng; Zhao, Sijie; Gu, Xinyu; Ling, Fenghua; Zhuang, Peiqin; Li, Yaxing; Su, Rui; Fang, Lihua; Zhou, Lianqing; Huang, Jianping; Bai, Lei

doi:10.5194/essd-18-2769-2026

Articles | Volume 18, issue 4

https://doi.org/10.5194/essd-18-2769-2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.


Data description article | 21 Apr 2026



Final revised paper (published on 21 Apr 2026)
Preprint (discussion started on 05 Nov 2025)

Interactive discussion

Status: closed

RC1:
'Comment on essd-2025-502', Filippo Gatti, 12 Jan 2026

The size and the extent of the proposed database are remarkable and certainly of interest for the community. However, there are a few issues that must be addressed before publication:
- extracting 1D profiles from the same 3D geology, while adding some random fluctuation, seems to create a bias in the dataset (profiles are close to each other and they all describe the same large geological structures).
- too little information is provided, even in the appendix, about the DDPM, in particular on how viable it is to expand the dataset with a diffusion model: does the DDPM reproduce the same statistics? How many iterations are needed to infer new samples? How diverse are those samples? Unless the DDPM model has some novel feature, I think its role in this paper is rather marginal and can be overlooked. Otherwise, it should be expanded to highlight its importance.
- what is the highest frequency that the geological models can propagate?
- are the random perturbations introduced by the authors consistent with the natural uncertainty? What about small-scale heterogeneity, which is well known to have a specific 3D correlation structure? Why did the authors not include this in their dataset?
- The authors overlooked one major dataset, published in this journal in 2024, which provides 30000 ground motion simulations including complex randomized geology:
Lehmann, F.; Gatti, F.; Bertin, M.; Clouteau, D. Synthetic Ground Motions in Heterogeneous Geologies from Various Sources: The HEMEW^S-3D Database. Earth Syst. Sci. Data 2024, 16 (9), 3949–3972. https://doi.org/10.5194/essd-16-3949-2024.
This database spans an area of ~10×10 km² for each sample and is constructed with minimum bias. Considering that the dataset provides (geology, time-histories) pairs, it would be interesting to benchmark the proposed model out-of-distribution, which is the most difficult aspect of benchmarking a new ML model.
- The transformer architecture presented in the paper seems a little too advanced for such a simple dataset (dispersion curves vs 1D geological profile). It is necessary to benchmark it against existing alternative deep learning models in order to consider it a reliable alternative.

Citation: https://doi.org/10.5194/essd-2025-502-RC1
- AC1: 'Reply on RC1', Feng Liu, 13 Mar 2026
  
  Dear RC1,
  Please find attached our response letter.
  
  The document includes our point-by-point responses to your comments, together with a summary of the revisions made to the manuscript.
  Best regards,
  Feng Liu
  
  on behalf of all co-authors
  
  Citation: https://doi.org/10.5194/essd-2025-502-AC1
RC2:
'Comment on essd-2025-502', Anonymous Referee #2, 02 Mar 2026
General Comments:
Liu et al. construct OpenSWI, a comprehensive benchmark dataset designed for surface wave dispersion curve inversion, comprising three subsets: OpenSWI-shallow, OpenSWI-deep, and OpenSWI-real. These datasets effectively address the growing need for large-scale and diverse training resources to facilitate AI-based inversion techniques in both shallow and deep geophysical applications. The manuscript presents a systematic and geologically informed workflow for dataset construction, generating a large number of velocity dispersion curves from multiple publicly available synthetic and real models. In addition, the authors develop a unified quality control and standardization process and several effective data augmentation strategies for building a massive and structurally diverse dataset. Finally, the authors validate the feasibility and effectiveness of their datasets by testing on multiple real-world observational datasets.
Overall, this work is timely and potentially impactful. The scale of the dataset and the effort toward open-source release are commendable, and the proposed workflow provides a reproducible foundation for future dataset expansion. Nevertheless, several aspects of the data processes, forward modeling details, model training design, and overall presentation would benefit from further clarification and refinement. Addressing these issues would improve the clarity, methodological rigor, and reliability of the benchmark dataset for future applications.
Specific comments:
Page 5, Lines 111-112, and Figures 2, 4: The authors mention that artifacts (e.g., zero or abnormal values) are corrected through interpolation or single-point removal during the quality control process. In Figure 2, the anomalous low-velocity point appears to be a numerical artifact introduced during interpolation after fault insertion, which may indeed be non-physical in the context of a normal fault setting. However, in the Flat–Fault and Fold–Fault models shown in Figure 4, some geological scenarios may involve reverse faulting or locally overturned strata. In such cases, localized low-velocity anomalies or sharp velocity inversions could be geologically reasonable rather than numerical artifacts. How does the quality control process distinguish between numerical artifacts and geologically meaningful velocity inversions? Please clarify.

Page 7, Lines 141-144: The explanation of the procedures applied for depths <120 km and ≥120 km is unclear and potentially misleading. Although the manuscript states that Brocher’s empirical formulas are less applicable at depths ≥120 km, Brocher’s empirical relationship still appears to be used to compute ρ after deriving Vp from Vs based on a constant assumption. Please clarify this workflow. Besides, the manuscript adopts a fixed value of 1.79 for all depths below 120 km. Could this assumption reduce the variability, diversity, or realism of the dataset? Furthermore, might the use of different parameter conversion procedures above and below 120 km introduce an artificial discontinuity at this boundary?
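The continuity question raised above can be checked numerically. The following is a minimal sketch and not the authors' workflow: it assumes that the fixed value 1.79 is a Vp/Vs ratio, it uses the polynomial coefficients commonly quoted from Brocher (2005), which should be verified against the original paper before reuse, and it evaluates both conversion rules for a hypothetical Vs of 4.5 km/s at the 120 km boundary.

```python
# Sketch: compare two Vs -> (Vp, rho) conversions at a depth boundary.
# Assumptions (not from the manuscript): 1.79 is a Vp/Vs ratio; the
# polynomial coefficients are those commonly quoted from Brocher (2005).

VP_VS_RATIO = 1.79  # assumed meaning of the fixed value discussed above


def brocher_vp(vs):
    """Vp (km/s) from Vs (km/s), Brocher (2005) regression fit."""
    return (0.9409 + 2.0947 * vs - 0.8206 * vs**2
            + 0.2683 * vs**3 - 0.0251 * vs**4)


def brocher_rho(vp):
    """Density (g/cm^3) from Vp (km/s), Nafe-Drake polynomial."""
    return (1.6612 * vp - 0.4721 * vp**2 + 0.0671 * vp**3
            - 0.0043 * vp**4 + 0.000106 * vp**5)


def convert(vs, depth_km, switch_depth_km=120.0):
    """Return (vp, rho) using the depth-dependent rule under test."""
    if depth_km < switch_depth_km:
        vp = brocher_vp(vs)       # empirical Vp above the switch depth
    else:
        vp = VP_VS_RATIO * vs     # constant-ratio Vp at and below it
    return vp, brocher_rho(vp)    # density from Vp in both branches


def boundary_jump(vs, switch_depth_km=120.0, eps_km=1e-6):
    """Jump in (vp, rho) across the switch depth for the same Vs."""
    vp_above, rho_above = convert(vs, switch_depth_km - eps_km)
    vp_below, rho_below = convert(vs, switch_depth_km)
    return vp_below - vp_above, rho_below - rho_above


if __name__ == "__main__":
    d_vp, d_rho = boundary_jump(4.5)  # hypothetical Vs at 120 km
    print(f"Vp jump: {d_vp:.3f} km/s, density jump: {d_rho:.3f} g/cm^3")
```

Under these assumptions, a non-zero jump for a continuous Vs profile would indicate a step introduced purely by the change of conversion rule. The size of the step depends on the local Vs, so the check is best repeated over the range of Vs values found near 120 km in the dataset.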

In the model training, the authors adopt MSE as the loss function for training the inversion model. Have alternative loss functions been evaluated, such as MAE or smoothed MAE (Huber loss)? Since MSE tends to promote smoother predictions, could this potentially affect the preservation of boundaries with sharp velocity discontinuities?
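To make the loss-function question concrete, the following minimal sketch compares how MSE, MAE, and the Huber loss weight a single large residual, such as a misfit at a sharp velocity discontinuity, against small residuals elsewhere in the profile. The residual values and the Huber threshold `delta` are hypothetical, and the functions are plain-Python illustrations rather than the training code used in the manuscript.

```python
# Sketch: how MSE, MAE and Huber losses weight small vs large residuals.
# The residual values and delta used here are hypothetical.


def mse(residuals):
    """Mean squared error."""
    return sum(r * r for r in residuals) / len(residuals)


def mae(residuals):
    """Mean absolute error."""
    return sum(abs(r) for r in residuals) / len(residuals)


def huber(residuals, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    total = 0.0
    for r in residuals:
        a = abs(r)
        if a <= delta:
            total += 0.5 * r * r
        else:
            total += delta * (a - 0.5 * delta)
    return total / len(residuals)


def largest_share(residuals, per_sample):
    """Fraction of the summed loss contributed by the largest term."""
    terms = [per_sample(r) for r in residuals]
    return max(terms) / sum(terms)


if __name__ == "__main__":
    # Two small misfits and one large misfit (e.g. at a discontinuity).
    res = [0.1, -0.2, 2.0]
    print("MSE:", mse(res), "MAE:", mae(res), "Huber:", huber(res))
    print("share of largest term, squared:", largest_share(res, lambda r: r * r))
    print("share of largest term, absolute:", largest_share(res, abs))
```

With squared residuals the largest misfit accounts for a greater share of the total loss than with absolute residuals, which is the mechanism behind the concern that MSE may trade sharp boundaries for smoother predictions.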

Technical comments:
Page 3, Line 56: The citation format “...of the researchers Merrifield et al. (2022)” is inappropriate.

Page 5, Lines 128-129: Please provide a detailed description of the de-duplication procedure. It would be helpful if the authors could clarify whether the de-duplication was implemented during the profile extraction stage (e.g., by applying a spatial sampling interval), or performed after extraction using a quantitative similarity criterion.
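As an illustration of the second option, a post-extraction similarity criterion could look like the minimal sketch below. It is hypothetical and not the authors' procedure: the function name `deduplicate`, the maximum-absolute-difference metric, and the tolerance `tol` are all assumptions, and the profiles are assumed to be sampled on a common depth grid.

```python
# Sketch: greedy de-duplication of 1D velocity profiles by similarity.
# Hypothetical criterion: two profiles are duplicates when their maximum
# absolute difference over all depths is at most `tol` (same units).


def deduplicate(profiles, tol=0.01):
    """Return indices of the profiles kept after removing near-duplicates.

    A profile is kept only if it differs from every previously kept
    profile by more than `tol` at one depth sample or more.
    """
    kept = []
    for idx, profile in enumerate(profiles):
        is_duplicate = any(
            max(abs(a - b) for a, b in zip(profile, profiles[k])) <= tol
            for k in kept
        )
        if not is_duplicate:
            kept.append(idx)
    return kept


if __name__ == "__main__":
    profiles = [
        [1.0, 2.0, 3.0],
        [1.001, 2.0, 3.0],  # within tol of the first profile
        [1.0, 2.0, 4.0],
    ]
    print(deduplicate(profiles, tol=0.01))
```

The pairwise comparison scales quadratically with the number of profiles, so a dataset of this size would likely need hashing of rounded profiles or a spatial sampling interval instead. Reporting which approach was used, and with what threshold, would answer the question above.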

Page 8, Lines 175-176: The expression “they provide deep learning models with ...” is somewhat informal; consider, e.g., “they provide .... samples for model training, ...”.

Page 18, Lines 314-316: Please avoid using single-sentence paragraphs.

Page 22, Lines 368-369: The described learning rate decay intervals (20 and 200 epochs) appear inconsistent with the corresponding Figure 10, which seems to show decay at approximately 40 and 500 epochs. Please clarify.

Page 23, Lines 399-401: Please avoid using single-sentence paragraphs.

Figure 1 caption: The description “white box” appears inconsistent with the figure, as the box appears closer to gray instead of white.

Figure 2: Please add the axis scale (with units) for the density curves, as currently only the velocity scale is shown.

Figure 7: The central global map shows several gray dots. Could the authors clarify whether these are meaningful markers or possible visualization artifacts (e.g., due to low image resolution)?
Citation: https://doi.org/10.5194/essd-2025-502-RC2
- AC2: 'Reply on RC2', Feng Liu, 13 Mar 2026
  
  Dear RC2,
  Please find attached our response letter. The document includes our point-by-point responses to your comments, together with a summary of the revisions made to the manuscript.
  Best regards,
  Feng Liu
  
  on behalf of all co-authors
  
  Citation: https://doi.org/10.5194/essd-2025-502-AC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Feng Liu on behalf of the Authors (16 Mar 2026) Author's response

EF by Polina Shvedko (17 Mar 2026) Manuscript Author's tracked changes

ED: Referee Nomination & Report Request started (17 Mar 2026) by Andrea Rovida

RR by Anonymous Referee #2 (28 Mar 2026)

RR by Filippo Gatti (10 Apr 2026)

ED: Publish as is (10 Apr 2026) by Andrea Rovida

AR by Feng Liu on behalf of the Authors (13 Apr 2026) Manuscript

Short summary

We introduce a large and diverse dataset that supports the development of machine learning methods for studying Earth structures through surface wave dispersion curves. Existing research has been limited by the absence of such benchmark data. Our dataset includes both computer-generated and real-world examples, allowing models to be tested and compared in a consistent way. By making these resources openly available, we aim to advance research on the shallow and deep Earth.