the Creative Commons Attribution 4.0 License.
Dataset of Oil Slicks, Look-Alikes and Remarkable SAR Signatures Obtained from Sentinel-1 Data in the Eastern Mediterranean Sea
Abstract. Publicly available datasets for oil spill detection are scarce, making it difficult to compare the performance of different detection algorithms. To address this, this paper introduces a comprehensive labeled dataset of oil slicks, look-alikes, and other remarkable oceanic phenomena, derived from Sentinel-1 Synthetic Aperture Radar (SAR) products in the Eastern Mediterranean Sea in 2019. The dataset contains 3225 oil objects across 1365 image patches, along with an additional 2290 image patches featuring look-alikes or other phenomena. Data are available at https://doi.pangaea.de/10.1594/PANGAEA.980773 (Yang and Singha, 2025).
This dataset enables researchers to evaluate their oil spill detection models and compare performance with other studies. To facilitate this, the performance of an oil spill detector from a previous study on the dataset is provided as a baseline. In addition, to help the researchers better understand what phenomena their object detector might be confusing with oil slicks, the image patches without oil objects were sorted into several subgroups. On the other hand, for researchers looking to apply object detection models to oil slick detection but lacking a starting dataset, this dataset can serve as a valuable training resource. Beyond dataset presentation, this paper also explains the formation of different oceanic phenomena and their SAR signatures, supported by examples and supplementary materials. These insights help researchers from various backgrounds, such as remote sensing, oceanography, and machine learning, better understand the sources of SAR signatures.
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-208', Merv Fingas, 29 May 2025
- RC2: 'Comment on essd-2025-208', Anonymous Referee #2, 20 Oct 2025
Comments on “Dataset of oil slicks look-alikes and remarkable SAR …” by Yang et al.,
The detection of oil slicks in the real ocean is crucial for effective natural disaster management using satellite technology, such as Synthetic Aperture Radar (SAR). However, the complexity of oceanic phenomena makes real-world oil slick identification challenging. This difficulty is partly due to the lack of a robust benchmark dataset suitable for training effective detection models. The manuscript by Yang et al. attempts to address this critical need by providing a categorized dataset. I recommend the publication of this manuscript following the incorporation of the suggestions below.
1. Data Hosting and Model Interoperability: Is it possible to establish a website to host not only the data but also the model codes, similar to initiatives like the PIV challenge (see https://www.pivchallenge.org/)? This would facilitate fair comparison between different models and ensure ease of use for the wider research community.
2. Terminology (Line 47): Please replace "vertical advection in the ocean" with the more precise oceanographic term: "upwelling in the ocean."
3. Equation (1): Please provide a brief justification for choosing the parameter threshold as three times the standard deviation.
4. Annotation Criteria (Lines 111-112): Regarding the data labeled "jointly by two human interpreters," please comment on the specific criteria used for achieving consensus between the interpreters.
5. Figure 3 Caption: The caption should be more informative. For example, the meaning of the vertical line must be explicitly mentioned, even though it is referenced in the main text.
6. Clustering Justification (Lines 189-190): Regarding the "K-means clustering methods with 12 and 5 classes for nw and nc subsets," please provide further detail or comment on the rationale for selecting 12 and 5 classes, respectively.
7. Data Pre-processing (Line 325): When geolocation information is available, the removal of land and islands could be easily and efficiently handled using the global-land-mask Python package (https://github.com/toddkarin/global-land-mask). This should be considered for data cleaning.
8. Figure 13: It would significantly improve clarity to visually indicate the upwelling area in all sub-figures.
9. Typographical Error (Line 33): Please remove the redundant "in" in the sentence.
10. Notation Consistency (Equation (3)): Ensure the notation used for the IoU (Intersection over Union) in Equation (3) is consistent with its use elsewhere in the main text.
11. Model Identification (Line 559): Please explicitly provide the names of the "two models" being referenced.
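For readers less familiar with the metric named in comment 10, the Intersection over Union of two axis-aligned bounding boxes can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function name `iou` and the `(xmin, ymin, xmax, ymax)` box layout are assumptions and may differ from the dataset's annotation format.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (xmin, ymin, xmax, ymax) tuples. This layout is an assumption
    made for illustration, not necessarily the dataset's annotation format.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    # Union area = sum of both areas minus the overlap counted twice.
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0
```

Two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, so their IoU is 1/7; identical boxes give 1 and disjoint boxes give 0.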
Citation: https://doi.org/10.5194/essd-2025-208-RC2
Data sets
Oil Slicks, Look-Alikes and Other Remarkable SAR Signatures in Sentinel-1 Imagery in the Eastern Mediterranean Sea in 2019 Yi-Jie Yang and Suman Singha https://doi.pangaea.de/10.1594/PANGAEA.980773
Viewed
- HTML: 1,523
- PDF: 305
- XML: 21
- Total: 1,849
- BibTeX: 31
- EndNote: 53
The paper is excellent and provides a much-needed source of verification information for SAR detection of oil spills. In particular, it is excellent that it covers and provides information on commonly encountered interferences, including internal waves, winds (low and high), areas of mixing and vertical advection in the ocean, meso- and sub-mesoscale eddies, biogenic surface films, rain cells, radio frequency interference, etc.
This is the first, and an important, data set to be used for automatic algorithm evaluation.
No errors were found.