cigFacies: a massive-scale benchmark dataset of seismic facies and its application
Abstract. Seismic facies classification is crucial for seismic stratigraphic interpretation and hydrocarbon reservoir characterization, but it remains a tedious and time-consuming task that requires significant manual effort. Data-driven deep learning approaches are highly promising for automating seismic facies classification with high efficiency and accuracy, as they have already achieved significant success in similar image classification tasks in computer vision (CV). However, unlike the CV domain, the field of seismic exploration lacks a comprehensive benchmark dataset for seismic facies, which severely limits the development, application, and evaluation of deep learning approaches to seismic facies classification. To address this gap, we propose a comprehensive workflow to construct a massive-scale benchmark dataset of seismic facies and evaluate its effectiveness in training a deep learning model. Specifically, we first develop a knowledge graph of seismic facies based on geological concepts and seismic reflection configurations. Guided by this graph, we then implement three strategies, namely field seismic data curation, knowledge-guided synthesis, and generative adversarial network (GAN)-based generation, to construct a benchmark dataset of 8000 diverse samples for five common seismic facies. Finally, we use the benchmark dataset to train a network and then apply it to two 3-D seismic datasets for automatic seismic facies classification. The predictions are highly consistent with expert interpretation results, demonstrating that the diversity and representativeness of our benchmark dataset are sufficient to train a network that generalizes well in seismic facies classification across field data. We have made this dataset, the trained model, and the associated code publicly available for further research and validation of intelligent seismic facies classification.