https://doi.org/10.5194/essd-2024-131
25 Jul 2024
Status: this preprint is currently under review for the journal ESSD.

cigChannel: A massive-scale 3D seismic dataset with labeled paleochannels for advancing deep learning in seismic interpretation

Guangyu Wang, Xinming Wu, and Wen Zhang

Abstract. Identifying buried channels in 3D seismic volumes is essential for characterizing hydrocarbon reservoirs and for gaining insight into paleoclimate conditions, yet it remains a labor-intensive and time-consuming task. Data-driven deep learning methods show great promise for automating seismic channel interpretation efficiently and accurately, as they have already achieved significant success in similar image segmentation tasks in computer vision (CV). Unlike the CV domain, however, seismic exploration lacks a comprehensive benchmark dataset for channels, which severely limits the development, application, and evaluation of deep learning approaches to seismic channel interpretation. Manually labeling 3D channels in field seismic volumes is tedious and subjective and, most importantly, many field seismic volumes are proprietary and inaccessible to most researchers. To overcome these limitations, we propose a comprehensive workflow of geological channel simulation and geophysical forward modeling to create a massive-scale synthetic seismic dataset containing 1,200 seismic volumes of 256×256×256 samples each, with labels for more than 10,000 diverse channels and their associated sedimentary facies. It is by far the most comprehensive dataset for channel identification, providing realistic and geologically reasonable seismic volumes with meandering, distributary, and submarine channels. Trained on this synthetic dataset, a convolutional neural network (simplified from the U-Net) performs well in identifying various types of channels in field seismic volumes, which indicates the diversity and representativeness of the dataset. We have made the dataset, the code that generates the data, and the trained model publicly available to facilitate further research on and validation of deep learning approaches for seismic channel interpretation.
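As a rough illustration of the "geophysical forward modeling" step mentioned in the abstract, the sketch below uses the common 1D convolutional model: reflection coefficients are computed from an impedance volume and convolved with a Ricker wavelet along the vertical axis. This page does not state which forward-modeling scheme the authors use, so the convolutional model, the function names (`ricker`, `reflectivity`, `synthesize`), and all parameter values here are assumptions chosen for illustration only.

```python
import numpy as np

def ricker(peak_freq_hz, dt_s, length_s=0.064):
    """Ricker wavelet sampled symmetrically about t = 0 (peak amplitude 1)."""
    half = int(round(length_s / (2 * dt_s)))
    t = np.arange(-half, half + 1) * dt_s
    a = (np.pi * peak_freq_hz * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def reflectivity(impedance):
    """Normal-incidence reflection coefficients along the last (vertical) axis."""
    upper = impedance[..., :-1]
    lower = impedance[..., 1:]
    return (lower - upper) / (lower + upper)

def synthesize(impedance, wavelet):
    """Convolve each vertical reflectivity trace with the wavelet.

    Assumes each trace is longer than the wavelet so that mode="same"
    returns an output with the trace's length.
    """
    r = reflectivity(impedance)
    return np.apply_along_axis(
        lambda trace: np.convolve(trace, wavelet, mode="same"), -1, r
    )

# Hypothetical two-layer model: impedance doubles at sample 100.
impedance = np.ones((4, 4, 200))
impedance[..., 100:] = 2.0
wavelet = ricker(peak_freq_hz=30.0, dt_s=0.002)
seismic = synthesize(impedance, wavelet)
```

In this toy model the only reflector sits at the layer boundary, so each synthetic trace is a single wavelet centered there; in a channel model the impedance contrast between channel fill and surrounding sediments would play the same role.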

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 31 Aug 2024)


Data sets

cigChannel: A massive-scale dataset of 3D synthetic seismic volumes and labelled palaeochannels for deep learning Guangyu Wang, Xinming Wu, and Wen Zhang https://doi.org/10.5281/zenodo.10791151
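For readers who want to experiment with volumes of the size described in the abstract (256×256×256), the following is a minimal loading sketch. The on-disk layout is an assumption: this page does not specify the file format, so the helper treats each volume as a headerless binary file of 32-bit floats and takes the shape and dtype as parameters. The function names are hypothetical, and the actual format should be checked against the Zenodo record and the repository.

```python
import numpy as np

def load_volume(path, shape=(256, 256, 256), dtype=np.float32):
    """Read a headerless binary volume and reshape it to `shape`.

    The default shape follows the abstract; the dtype and the raw-binary
    layout are assumptions, not confirmed by the source.
    """
    data = np.fromfile(path, dtype=dtype)
    expected = int(np.prod(shape))
    if data.size != expected:
        raise ValueError(f"expected {expected} samples, found {data.size}")
    return data.reshape(shape)

def standardize(volume):
    """Scale a volume to zero mean and unit variance before feeding a network."""
    std = volume.std()
    return (volume - volume.mean()) / (std if std > 0 else 1.0)
```

The size check fails early if the assumed shape or dtype does not match the file, which is the most likely symptom of a wrong format guess.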

Model code and software

cigChannel Guangyu Wang, Xinming Wu, and Wen Zhang https://github.com/wanggy-1/cigChannel


Latest update: 26 Jul 2024
Short summary
Seismic paleochannel interpretation is essential for hydrocarbon exploration and paleoclimate studies but remains labor-intensive. Deep learning (DL) shows promise for automating it but is hindered by a lack of labeled data. We propose a workflow to simulate various channels and realistic seismic volumes, yielding the largest 3D seismic dataset with diverse channel labels. Its effectiveness is demonstrated by field applications. The dataset, code, and DL models are released to advance further research.