This work is distributed under the Creative Commons Attribution 4.0 License.
cigChannel: A massive-scale 3D seismic dataset with labeled paleochannels for advancing deep learning in seismic interpretation
Abstract. Identifying buried channels in 3D seismic volumes is essential for characterizing hydrocarbon reservoirs and offering insights into paleoclimate conditions, yet it remains a labor-intensive and time-consuming task. The data-driven deep learning methods are highly promising to automate the seismic channel interpretation with high efficiency and accuracy, as they have already achieved significant success in similar image segmentation tasks within the field of computer vision (CV). However, unlike the CV domain, the field of seismic exploration lacks a comprehensive benchmark dataset for channels, severely limiting the development, application, and evaluation of deep learning approaches in seismic channel interpretation. Manually labeling 3D channels in field seismic volumes can be a tedious and subjective work and, most importantly, many field seismic volumes are proprietary and not accessible to most of the researchers. To overcome these limitations, we propose a comprehensive workflow of geological channel simulation and geophysical forward modeling to create a massive-scale synthetic seismic dataset containing 1,200 seismic volumes (256×256×256 samples each) with labels of more than 10,000 diverse channels and their associated sedimentary facies. It is by far the most comprehensive dataset for channel identification, providing realistic and geologically reasonable seismic volumes with meandering, distributary, and submarine channels. Trained with this synthetic dataset, a convolutional neural network model (simplified from the U-Net) performs well in identifying various types of channels in field seismic volumes, which indicates the diversity and representativeness of the dataset. We have made the dataset, the code for generating the data, and the trained model publicly available to facilitate further research and validation of deep learning approaches for seismic channel interpretation.
Status: final response (author comments only)
RC1: 'Comment on essd-2024-131', Anonymous Referee #1, 18 Sep 2024
This manuscript proposes a workflow to generate 3D synthetic seismic cubes with paleo-channel interpretations that can be used to train deep-learning models for seismic interpretation, and a dataset of 1200 3D synthetic seismic cubes. This dataset is used in an application where a deep learning model is trained to interpret paleo-channels and tested on three real seismic cubes.
Overall the manuscript is easy to follow, and there is a clear need for such datasets considering that deep learning is getting a lot of traction for subsurface applications but we cannot rely on subsurface data alone. However, the dataset isn't nearly as realistic and comprehensive as claimed by the authors. This will drastically reduce its usefulness as a benchmark, but I believe this is a step in the right direction.
Major comments:
Overall the workflow doesn't rely on the state of the art for the generation of synthetic subsurface realizations. Using soillib instead of more established models like Landlab, Fastscape, or Badlands is puzzling, especially considering that I'm not sure if the principles behind soillib have been thoroughly validated. Integrating channels and topography in an object-based manner (which is similar to what Alluvsim does, see https://doi.org/10.1016/j.cageo.2008.09.012) is quick and easy, but far from the realism that stratigraphic models such as Flumy or Sedsim would lead to. The rock physics model is very simplistic, and ignores all the variability within the facies. And the forward seismic model is also the simplest there is. Of course full-waveform models are very expensive, but 3D convolution using a point-scatterer function can also capture the acquisition setting without increasing the computation cost. So in the end a large part of the variability in seismic data isn't captured in this dataset, and the impact of not fully capturing that variability needs to be discussed at the very least. The lack of faults is also a major drawback, as shown by the application, and will limit the usefulness of this dataset.
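As a rough illustration of the point-scatterer idea mentioned above, the reflectivity volume can be convolved with a 3D point-spread function (PSF) at modest cost. The following is a minimal sketch, assuming a separable PSF built from a vertical Ricker wavelet and lateral Gaussians; all array sizes, parameter values, and function names are illustrative and not the authors' implementation.

```python
# Minimal sketch (assumptions throughout, not the authors' implementation) of
# point-scatterer-style modelling: convolve the reflectivity volume with a
# separable 3D point-spread function instead of a purely vertical wavelet.
import numpy as np
from scipy.signal import fftconvolve

def ricker(n, dz, k_peak):
    """Depth-domain Ricker wavelet with peak wavenumber k_peak (cycles/m)."""
    z = (np.arange(n) - n // 2) * dz
    a = (np.pi * k_peak * z) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def psf_3d(nz=64, nx=32, ny=32, dz=4.0, dx=12.5, k_peak=0.02, sigma=30.0):
    """Separable PSF: vertical Ricker times lateral Gaussians whose width
    sigma (in metres) crudely mimics limited lateral seismic resolution."""
    w = ricker(nz, dz, k_peak)
    gx = np.exp(-0.5 * ((np.arange(nx) - nx // 2) * dx / sigma) ** 2)
    gy = np.exp(-0.5 * ((np.arange(ny) - ny // 2) * dx / sigma) ** 2)
    return w[:, None, None] * gx[None, :, None] * gy[None, None, :]

reflectivity = 0.01 * np.random.randn(128, 64, 64)  # placeholder volume
seismic = fftconvolve(reflectivity, psf_3d(), mode="same")
```

The lateral Gaussian width controls how strongly limited acquisition resolution blurs steep or narrow features, something a purely vertical 1D convolution cannot capture.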
There are very few references to support the claim of geological realism, and I would even say that the manuscript lacks geological perspective. Section 2.2 is particularly problematic, because soillib seems to be simulating continental fluvial systems with tributaries, instead of deltas and distributaries as claimed in the text. So how comprehensive are the workflow and dataset when they don't explicitly capture key subsurface deposits for geo-energy applications? On top of that, deltas aren't just channels (they are more recognizable by their lobate shapes on seismic data), but this aspect isn't discussed at all. In general the limitations of the workflow and dataset should be more clearly acknowledged and discussed, including in the abstract. What could be the impact of the limitations for anyone who would like to use this dataset? What is it missing so that methods validated on it could fail when applied on real data?
Beyond all this lies the question of whether this dataset is plausible enough to be used confidently for the validation of new methods. To me the answer is no, but there could be a simple approach to discuss this in a quantitative way: divide the dataset into a train and a test set, train an autoencoder to reconstruct the train set, then compare the reconstruction error between the test set and the real seismic cubes. Essentially, if the autoencoder can reconstruct the real data just as well as the synthetic ones, it means that the dataset likely captures the patterns that matter. If not, it quantifies the room for improvement and gives a clear criterion to follow for future work.
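A minimal sketch of what this check could look like, assuming the volumes are pre-cut into 64×64×64 patches and PyTorch is used; the architecture and all names are hypothetical, not part of the manuscript:

```python
# Hypothetical sketch (not from the manuscript) of the proposed realism check:
# train a small 3D convolutional autoencoder on synthetic training patches,
# then compare reconstruction errors on held-out synthetic vs. field patches.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def per_volume_error(model, patches):
    """Mean squared reconstruction error for each (1, D, H, W) patch."""
    model.eval()
    with torch.no_grad():
        x = torch.stack(patches)              # (N, 1, D, H, W)
        return ((model(x) - x) ** 2).mean(dim=(1, 2, 3, 4))

# After fitting the autoencoder on synthetic training patches (loop omitted):
# err_synthetic = per_volume_error(model, held_out_synthetic_patches)
# err_field = per_volume_error(model, field_patches)
# Similar error distributions would suggest the dataset captures the patterns
# that matter; a large gap quantifies the room for improvement.
```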
I suggest also reflecting on the value of such an open workflow and dataset beyond algorithmic developments related to deep learning. The field of geophysics in general tends to test new approaches on (very) simplified case studies, and geostatistics tends to do the same, but I also see a lot of value for education (e.g., to learn deep learning with subsurface applications or learn about the impact of heterogeneities on seismic data). On that note, turning the functions available on GitHub into a small, properly documented Python package available from PyPI could foster its use.
Specific comments:
Abstract
Line 8: "to most researchers" instead of "to most of the researchers".
Line 9: If it's geologically reasonable then it's not realistic. Something realistic represents a system accurately, while reasonable means that it's a good enough approximation. In this case we're in the latter: the geology here isn't realistic at all, but considering the lack of resolution of seismic data it's a good enough approximation to get some valuable insights from deep learning.
Line 13: What does it mean to "perform well"? Better be specific and quantitative there.
Line 14: How many seismic volumes?
Line 14: "which indicates the diversity and representativeness of the dataset" Nothing in the abstract suggests that you can conclude that.
Line 15: That's great!
1. Introduction
Line 20: Considering that we're already experiencing the consequences of climate change, the sole focus on hydrocarbons is unfortunate. Paleochannels are good reservoirs, so they are also valuable in hydrogeology, hydrothermal production, and mining.
Line 27: "have been developed" instead of "are developed".
Line 35: More and more seismic data are being released by government agencies (in Australia, the Netherlands, New Zealand, ...), so lack of access is not as true as it used to be. A key issue remains processing: those data can be raw or not completely processed (e.g., not depth converted), so it's difficult for non-specialists to reuse them. And then, as the authors rightfully mentioned, there's the difficulty in interpreting the data.
Line 39: I wouldn't say that it's not an option, it's just an expensive one, prone to uncertainties (in the processing for instance) and to biases (see for instance https://doi.org/10.1130/GSAT01711A.1).
Line 51: "massive-scale" is exaggerated. It's a relatively large dataset for the subsurface, but it's nothing compared to datasets from the deep learning community. And even in the subsurface much larger datasets have been released before (see https://doi.org/10.5194/essd-14-381-2022 for instance).
Line 65: Maybe mention the link to the GitHub repository also here?
2. Dataset generation workflow
Line 67: This sentence is a bit convoluted, with several repetitions that can be avoided (i.e., "generation" then "generating", "elaborate" then "explain details").
Line 72: Are meandering channels the most common river channels? I'm not convinced of that, it would be better to support this claim with a reference.
Line 86: Any reference to support those two shapes? It would support the claim of realism much better to show that this is indeed what we observe in nature.
Line 105: I couldn't find a paper describing how this model works in detail. The key problem here is that the channels shown in figure 3 look nothing like deltaic channels, but more like a continental river system (and those look more like tributaries, not distributaries). Many models have been developed to simulate such systems (see Landlab, Fastscape, Badlands) based on laws that only approximate the physics of overland flow, erosion, and deposition but have been validated to some degree and can be fast depending on the processes included and the implementation. So why not use those models? Regarding deltas, DeltaRCM developed by Liang et al. (2015) is a valid candidate, even if it's still a bit slow. I'm not sure if Sedflux could be an option (https://doi.org/10.1016/j.cageo.2008.02.013), but it shows that deltas are more than just channels, and that will impact the seismic data.
Line 127: Not just any sediment, you need enough fine sediments to build levees, so meandering turbiditic channels correspond to a quite specific depositional environment. This could restrict the comprehensiveness of the dataset, which won't capture sandier environments.
Line 130: The main difference is the scale: submarine channels are much wider, deeper, and longer than their terrestrial counterparts, which isn't really clear from figure 5. The lack of spatial scale also doesn't help in assessing the plausibility of the 3D structures, especially relative to one another.
Line 152: I would say abandoned meander instead of oxbow lake, which fits more a continental setting.
Line 164: Channels in general have different facies (point bars, levees, crevasse splays, abandoned meanders, abandoned channels). You take that into account when selecting the impedance for submarine channels but not for the others; why is that? And what's the impact of that choice? On top of that, variations in grain size distribution within a facies lead to variations of impedance, so why use a uniform impedance, which isn't realistic? And this is excluding the effect of burial and diagenesis.
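One simple way to introduce the within-facies variability this comment asks about would be to draw each voxel's impedance from a facies-dependent distribution rather than assigning a single constant per facies. A minimal sketch, with all facies codes and statistics hypothetical:

```python
# Illustrative sketch (not the authors' code) of adding within-facies
# variability: draw each voxel's acoustic impedance from a facies-dependent
# distribution instead of assigning a single constant value per facies.
import numpy as np

rng = np.random.default_rng(42)
facies = rng.integers(0, 3, size=(64, 64, 64))  # placeholder facies volume

# Hypothetical per-facies impedance statistics (mean, std); units arbitrary.
stats = {0: (6000.0, 300.0),   # background shale
         1: (8000.0, 500.0),   # channel sand
         2: (7000.0, 400.0)}   # levee / overbank

impedance = np.zeros(facies.shape)
for code, (mu, sigma) in stats.items():
    mask = facies == code
    impedance[mask] = rng.normal(mu, sigma, size=int(mask.sum()))
```

Smoothing the result with a spatial filter (e.g., scipy.ndimage.gaussian_filter) would additionally mimic gradual grain-size trends instead of purely node-wise variation, though burial and diagenesis effects would still be missing.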
Line 177: So this is a 1D convolution? I get that this is a common and simple approach, but it's not really realistic either (see for instance https://doi.org/10.1111/1365-2478.12936 or https://library.seg.org/doi/full/10.1190/1.2919584, and https://doi.org/10.1190/geo2021-0824.1 for an application to seismic data interpretation using deep learning).
Figure 8: What's the spatial scale of the 2D sections? This comment stands for almost all the figures, but here in particular because we can't compare to the wavelet without a clear scale. And how do the peak wavenumbers relate to the usual values in Hz?
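For reference, if the vertical axis is depth, a peak wavenumber can be translated into an equivalent temporal peak frequency by assuming a representative interval velocity v: two-way traveltime t = 2z/v maps wavenumber k to frequency f = k·v/2. A back-of-envelope sketch (the velocity value is an assumption, not from the manuscript):

```python
# Back-of-envelope conversion (assumed, not from the manuscript): for a
# depth-domain image, two-way traveltime t = 2z/v maps a peak wavenumber
# k_peak (cycles/m) to an equivalent temporal peak frequency f = k_peak*v/2.
def peak_wavenumber_to_hz(k_peak, v=3000.0):
    """E.g. k_peak = 0.02 cycles/m at v = 3000 m/s gives 30 Hz."""
    return k_peak * v / 2.0

print(peak_wavenumber_to_hz(0.02))  # 30.0
```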
3. Results
Line 219: No faults? How does that impact the usefulness of the dataset?
Line 228: Actually size and aggradation are the only differences between fluvial and turbiditic channels in your simulations, since the model for meandering is the same. So a deep learning model trained on your dataset might struggle with small turbiditic systems and large fluvial systems.
Lines 228-229: That doesn't really explain why they are so much larger. Overall I see very little geological literature cited to support the plausibility of the models, which is unfortunate.
Line 235: Any reference to show how to do that? Any reference for the weighted loss function?
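For context, one common way to weight a voxel-wise loss against class imbalance is to scale the sparse positive (channel) class by the background-to-channel voxel ratio. A minimal PyTorch sketch of that idea, which is hypothetical and not necessarily the authors' formulation:

```python
# Illustrative sketch (one common formulation, not necessarily the authors'):
# counter class imbalance by weighting the sparse positive (channel) class
# with the background-to-channel voxel ratio in a voxel-wise BCE loss.
import torch
import torch.nn as nn

labels = torch.randint(0, 2, (1, 1, 64, 64, 64)).float()  # placeholder labels
logits = torch.randn_like(labels)                          # placeholder output

pos = labels.sum()                       # number of channel voxels
neg = labels.numel() - pos               # number of background voxels
loss_fn = nn.BCEWithLogitsLoss(pos_weight=neg / pos.clamp(min=1))
loss = loss_fn(logits, labels)
```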
4. Applications
Line 249: It would be nice to add a link to that dataset.
Line 251: This isn't quantitative, which is unfortunate. It would have been much better to compare to a human-made interpretation, especially since here you have nice channels that look easily interpretable, and measure different metrics such as precision and recall. There seem to be a lot of false positives in the deep-learning interpretation. I realize that the manuscript doesn't aim at developing a deep-learning model for channel interpretation, but are the false positives due to a not-so-optimal model or a not-so-optimal dataset? Not having any validation metric for the training of the deep-learning model doesn't help to assess this.
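A minimal sketch of the suggested quantitative comparison, computing voxel-wise precision and recall against a reference interpretation; both volumes here are random placeholders:

```python
# Minimal sketch of the suggested quantitative evaluation: voxel-wise
# precision and recall of a binary channel prediction against a reference
# (e.g. human-made) interpretation. Both volumes here are random placeholders.
import numpy as np

pred = np.random.rand(128, 128, 128) > 0.5   # deep-learning interpretation
ref = np.random.rand(128, 128, 128) > 0.5    # expert interpretation

tp = np.logical_and(pred, ref).sum()
precision = tp / max(pred.sum(), 1)          # low precision = false positives
recall = tp / max(ref.sum(), 1)              # low recall = missed channels
print(f"precision={precision:.3f}, recall={recall:.3f}")
```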
Line 256 and 261: Are those seismic volumes open? If not, that part of the manuscript is irreproducible.
Line 270: This is really a strong limitation considering that faults are ubiquitous in the subsurface, can have a big impact on applications, and that there has been a lot of studies similar to yours for faults already, so methods to introduce faults already exist (e.g., https://doi.org/10.1190/geo2021-0824.1).
Line 279: That's not quite true: you're proposing a benchmark dataset (lines 5, 59, 290), so a standard to compare (future) methods. How can your dataset become a standard if it excludes a basic configuration of the subsurface (faulted domains)?
5. Conclusions
Line 284: What are the predecessors actually? You've never mentioned any.
Line 285-286: I'm dubious of the claims of realism and diversity, which aren't well supported by the manuscript. That doesn't mean that this dataset cannot be useful, but I expect more openness on the limitations, which is essential if this is to be used as a benchmark.
Table A1: It would be much better to have some justification for those values, either in an extra reference column or in the text. Channels can be smaller than 200 m and larger than 500 m (see https://doi.org/10.2110/jsr.2006.060).
Citation: https://doi.org/10.5194/essd-2024-131-RC1
RC2: 'Comment on essd-2024-131', Samuel Bignardi, 25 Jan 2025
This manuscript contains two main components. First, it provides a dataset comprising 1200 simulated seismic volumes, each containing reflections from structures mimicking paleochannels inside an otherwise layered rock environment. Second, it provides the computer codes necessary to generate such seismic volumes (given a distribution of acoustic impedance), and some code engineered to train a simplified U-Net neural network to segment such features.
The manuscript reads well, is well organized, and the authors' logical path is very straightforward to follow. Indeed, it is well known that the bottleneck of AI in geophysical applications lies in the lack of labeled field data. As the only feasible workaround is likely to feed simulated data to the training process, this dataset goes in the right direction, and I do appreciate the open-source perspective offered by the authors.
Major comments:
In my view, the two main questions to be answered are: (1) Is this seismic data representative of the subsurface features that these authors want to retrieve? (2) How good a benchmark is this dataset for testing any algorithm engineered to segment such features?
Regarding the first question, after reading the paper I also looked into the Python code. The subsurface is represented as a grid where a value for the reflection coefficient is associated with each node. The seismic volume is computed by convolving such a grid with a Ricker wavelet.
This forward model is very simplistic, in my view, as it disregards many aspects of realistic wave propagation (contribution of shear waves, separate contributions from Vp and density, multipath reflections, etc.). Far more realistic forward models are available in the literature. If the point is to provide an unbiased benchmark, I believe that some computational time is a cost worth paying.
Besides this aspect, the choices made for the subsurface representation also pose some limitations. A model described as a parallel-layered system with paleochannels that always cross the boundary of the computational domain introduces an implicit geometrical bias which will eventually find its way into the CNN training. For this reason, it is no surprise that the CNN mistakes the fault (a feature that crosses the computational boundary and deviates from the parallel layering) for a paleochannel (the only object known to the network capable of introducing a lateral variation).
In summary, this work has some merit. However, some reliability tests are in order, and weak aspects should be better disclosed.
In particular, to consider this dataset fully reliable one should (1) validate the present simulations against a well-established full-waveform solution.
(2) use a well-established method to assess the network performance. It is good practice to assess the performance of a CNN by dividing the data into at least two sets, a "training" set and a "test" set. This check was not performed here, so the reader has no means to judge the capabilities of the workflow.
For example:
* Were I to produce a new subsurface model using this workflow and feed it to the CNN, how many voxels would the network label correctly?
* What if I provided a simulated seismic volume produced with a third-party algorithm instead?
* How would this automatic segmentation compare to an expert manual segmentation?
Detailed comments:
Line 2: "The data-driven deep learning methods are …", remove "The"Line 6/7: "Manually labeling 3D channels in field seismic volumes can be a tedious and subjective work …"
In fairness, there is a lot of subjectivity with CNNs as well, although it comes in a different form. For example, what CNN architecture should we use? How many layers should we employ? What activation function, and what learning rate to adopt? These are ALL subjective choices.
Line 27: "To address those issues, automatic paleochannel identification methods based on 3D convolutional neural networks (CNNs) (Pham et al., 2019; Gao et al., 2021) are developed.""To address those issues, automatic paleochannel identification methods based on 3D convolutional neural networks (CNNs) (Pham et al., 2019; Gao et al., 2021) HAVE BEEN developed."
Line 28: "They treat paleochannels as bodies rather than slices as human interpreters typically see,"
I disagree. Human interpreters may visualize slice-wise information, but they bear the 3D model in mind. Plus, hopefully, they regard the data with expertise matured over years of training in geology.
I would change the sentence to something like: "They have the advantage of handling paleochannels according to their 3D nature, as opposed to the slice-by-slice visual investigation of a human operator."
Line 33: "..., currently there is no publicly available dataset of field seismic…"
Remove "currently".Line 41/42: "… allowing us to tailor the objectives that we want the network to learn"
Better: "… allowing us to tailor the features that our network will learn to segment"Line 53/54: "the modeling methods developed by Howard and Knutson (1984), McDonald (2020) and Sylvester et al. (2011), respectively"
A summary of this modeling technique would be welcome to people unfamiliar with those specific articles.
Line 75: "We use the open-source Python package meanderpy (Sylvester, 2021) ..."
Is there any limitation in the Sylvester (2021) modeling that could impact the reliability of this dataset? If so, it should be mentioned.
Line 180 (equation 8). Earlier in the manuscript, when I read "convolution" I implicitly assumed it was performed in the time domain to obtain a seismic volume with the two-way delay time along the vertical axis. Seeing equation (8) expressed with wavenumber makes me realize that the vertical axis is depth. The fact that the volume figures have no axes also contributed to this misunderstanding. It would be advisable to explain this aspect early in the main text, perhaps at the point where the first model is described.
Line 224: "seismic volumes with multi-class channel labels."
What do you mean by "multi-class labels"?
I was under the impression that this paper used just two classes (i.e. "paleochannel" and "background"). Did I miss something?
Line 231: "Regarding the potential problems of the class imbalance problem and the size discrepancy between terrestrial and submarine ..."
Better something like: "Regarding possible class imbalance problems connected to the size discrepancy between terrestrial and submarine … "
Line 242/243: "Gaussian random noise is added to the seismic volume to make the training process more robust and reduce the tendency towards overfitting."
A better description of this aspect is required.
How was the noise designed?
How were the mean and variance chosen, and why?
Is the noise somewhat spatially coherent, or was it applied only node-wise?
What is the resulting signal-to-noise ratio?
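For illustration, a description answering these questions could be as short as the following sketch, which applies node-wise (spatially uncorrelated) Gaussian noise scaled to a target signal-to-noise ratio; all values are hypothetical:

```python
# Hypothetical sketch of the kind of description requested above: node-wise
# (spatially uncorrelated) Gaussian noise scaled to a target signal-to-noise
# ratio, with the resulting SNR reported explicitly. All values are made up.
import numpy as np

rng = np.random.default_rng(0)
seismic = rng.standard_normal((256, 256, 256))  # placeholder clean volume

target_snr_db = 10.0
signal_power = np.mean(seismic ** 2)
noise_power = signal_power / (10.0 ** (target_snr_db / 10.0))
noise = rng.normal(0.0, np.sqrt(noise_power), seismic.shape)

noisy = seismic + noise
snr_db = 10.0 * np.log10(signal_power / np.mean(noise ** 2))
print(f"SNR = {snr_db:.1f} dB")
```

Spatially coherent noise could instead be obtained by low-pass filtering the noise volume before adding it.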
Better something like: "… heterogeneous seismic amplitude, a feature that could not be learned as it was not present in our training dataset."
Line 277: "Therefore, the identification performance of channels with heterogeneous seismic amplitude would be improved if meandering and distributary channels with heterogeneous seismic amplitude can be included in this dataset."
Better something like: "The identification of such channels would likely be improved were we including such acoustic impedance heterogeneity in the modeling of channels."
Line 278/279: "As we mentioned, these are preliminary tests mainly to find out whether this dataset can help the network discriminate channels and non-channel areas."
The expression "to find out" is suitable for everyday conversation, but not much in a formal text. It would be better something like: "As we mentioned, the rationale to these preliminary tests is mainly to judge whether this dataset can help the network discriminate channels and non-channel areas."
I sincerely hope this review will be useful and will help improve your paper.
Best Regards,
Samuel Bignardi
Citation: https://doi.org/10.5194/essd-2024-131-RC2
EC1: 'Comment on essd-2024-131', Andrea Rovida, 04 Mar 2025
Dear Authors
The reviewers' comments on your MS are very detailed and somewhat positive and encouraging.
However, one reviewer asked for major revisions. For this reason, besides thoroughly answering each specific comment, I suggest you carefully consider the possibility of revising your dataset as suggested by both reviewers, especially as far as the dataset validation and reliability are concerned.
Looking forward to receiving your replies.
Best regards
Andrea Rovida
Citation: https://doi.org/10.5194/essd-2024-131-EC1
Data sets
cigChannel: A massive-scale dataset of 3D synthetic seismic volumes and labelled palaeochannels for deep learning Guangyu Wang, Xinming Wu, and Wen Zhang https://doi.org/10.5281/zenodo.10791151
Model code and software
cigChannel Guangyu Wang, Xinming Wu, and Wen Zhang https://github.com/wanggy-1/cigChannel
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 588 | 271 | 284 | 1,143 | 23 | 28 |