the Creative Commons Attribution 4.0 License.
A 30-year ocean front datasets based on deep learning from 1993 to 2023 for Northwest Pacific ocean
Abstract. Ocean fronts are critical interfaces between different water masses, profoundly influencing atmosphere–ocean interactions, weather systems, marine ecosystems, and climate regulation. Accurate and long-term observations of ocean fronts are essential for advancing studies in meteorology, oceanography, and climate science. However, no publicly available, long-term ocean front dataset currently exists, and existing detection methods often rely on time-consuming manual labeling or traditional algorithms with limited accuracy in complex frontal regions. In this study, we release the first publicly available 30-year ocean front dataset (1993–2023) for the Northwest Pacific, generated by applying a deep learning framework (Mask R-CNN) to daily sea surface temperature (SST) fields, with manually annotated samples for model training. The dataset provides pixel-level frontal boundaries along with associated attributes, including position, intensity, and width, stored in NetCDF-4 format at 1/12° spatial and daily temporal resolution. Accuracy evaluation shows a mean average precision (mAP) exceeding 0.90, with smaller errors in front width and intensity compared with traditional gradient-based methods, while capturing more small-scale features. The dataset offers three main contributions: (1) Filling the critical gap of a standardized, long-term ocean front product; (2) Serving as a ready-to-use training resource for deep learning models, greatly reducing the need for manual labeling; and (3) Providing benchmark samples for validation and intercomparison of other ocean front detection products. This dataset supports robust investigations of seasonal-to-interannual frontal variability and provides a valuable foundation for applications in meteorology, ecosystem management and climate change research.
Status: open (until 17 Oct 2025)
- RC1: 'Comment on essd-2025-514', Igor Belkin, 20 Sep 2025
- RC2: 'Comment on essd-2025-514', Peter Cornillon, 07 Oct 2025
I have decided to discontinue my review after reading through line 215 of the manuscript. The presentation up to this point is confusing, incomplete, and in places incorrect, making it difficult to follow the authors’ reasoning. I provide examples of each of these below, but I emphasize that these are only illustrative—addressing them alone will not, in my opinion, make the manuscript publishable. I agreed to review this paper because I am genuinely interested in the application of machine learning to the detection of ocean fronts. However, based on what I have read so far, I believe I would struggle to understand the algorithm as currently presented.
I would be willing to review a substantially revised version of the manuscript if the authors make a serious effort to present their work in a clearer and more coherent manner.
I also note that the authors do not mention any use of AI tools, specifically large language models, in preparing the manuscript. Given the lack of clarity, which stems both from the English and from the structural organization of the manuscript, they might consider using such tools to assist in revising the text. Copernicus Publications does not prohibit the use of large language models for language assistance, but it does require that their use, if any, be disclosed in the manuscript.
Terminology (Lines 22-24) “Ocean front refers to a narrow transitional zone between two or more types of water bodies with significantly different properties, which is a jumping zone of marine environmental parameters and can be described by the horizontal gradient of seawater temperature.”
I assume that “jumping zone” means a step in the observed property, but this is certainly not standard usage.
(Lines 80-82) “With the application of depth learning in the field of image recognition (Nogueira et al. 2016), in view of the shortcomings of traditional ocean front detection methods, the ocean front detection algorithm based on depth learning has become a research hotspot.”
It’s deep learning, not depth learning. I realize that English is not the first language of the authors, hence my suggestion that they use a generative AI chatbot to help with the English.
Overgeneralized and misleading (Lines 25-27) “These fronts are the places where different air masses (usually cold and warm humid air) interact, not only having profound impacts on meteorology and climate, but also playing key roles in ecology, resource management, and climate regulation.”
This may be the case but is not necessarily so. In fact, for sub-mesoscale fronts it is likely rarely the case, and for mesoscale fronts it will depend on the properties defining the front; e.g., it’s unlikely to be the case for a strong salinity front with a thermal expression due, say, to river runoff, or for a chlorophyll front resulting from an open ocean bloom. What makes this sentence particularly confusing is that it conflates atmospheric fronts with oceanic ones.
Incorrect/misleading (Lines 40-43) “The main methods for calculating temperature gradients are Gradient method and Sobel gradient algorithm.”
The main methods for front detection are population-based and gradient-based. Sobel gradients are one form of gradient-based algorithms. Canny’s work is also based on gradients. Cayula and Cornillon’s work is population-based.
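To make the gradient-based family concrete, the following is a minimal sketch of a Sobel-type detector applied to a synthetic SST step. It is not the authors' algorithm and not any published detector; the function name, the synthetic field, and the threshold value are all assumptions introduced for illustration. Population-based methods such as Cayula and Cornillon's take a different route (histogram analysis within windows) and are not shown here.

```python
import numpy as np

def sobel_gradient_magnitude(sst, dx=1.0, dy=1.0):
    """Return the Sobel-estimated gradient magnitude of a 2-D SST field.

    sst: 2-D array (rows = y, columns = x); dx, dy: grid spacing (assumed uniform).
    """
    # Replicate edge values so the output has the same shape as the input.
    p = np.pad(np.asarray(sst, dtype=float), 1, mode="edge")
    # x-derivative: right column minus left column, smoothed 1-2-1 along y.
    gx = ((p[:-2, 2:] + 2.0 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2.0 * p[1:-1, :-2] + p[2:, :-2])) / (8.0 * dx)
    # y-derivative: lower row minus upper row, smoothed 1-2-1 along x.
    gy = ((p[2:, :-2] + 2.0 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2.0 * p[:-2, 1:-1] + p[:-2, 2:])) / (8.0 * dy)
    return np.hypot(gx, gy)

# Synthetic step front: 10 degC west of column 5, 20 degC east of it.
sst = np.full((8, 10), 10.0)
sst[:, 5:] = 20.0

grad = sobel_gradient_magnitude(sst)  # degC per grid cell
THRESHOLD = 1.0  # hypothetical value; choosing it is the subjective step
front_mask = grad > THRESHOLD
```

In this sketch the front is whatever exceeds the threshold, which is exactly where the subjectivity of gradient-based detection enters.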
Incomplete and a bit misleading criticism of previous front detection algorithms (Lines 43-43) “However, these algorithms may not effectively distinguish between genuine ocean fronts and other image features or artifacts.”
This is only one way in which front detection algorithms may fail. The discussion of issues with current algorithms is incomplete. Furthermore, I would be surprised if the algorithm presented in this paper did not also fail in this regard. As noted above I have not reviewed the algorithm itself but…
(Lines 58-60) “In summary, traditional methods for extracting ocean fronts suffer from limitations such as subjective threshold selection, inadequate handling of complex fronts, dependency on edge detection algorithms, and limited adaptability to changing conditions.”
But all of the methods you discuss from line 40 on are gradient-based. The population-based method of Cayula and Cornillon was developed precisely to address some of the issues you raise. Admittedly, their method has other issues but, because its primary mechanism is not based on gradients, it doesn’t suffer from some of the problems you mention.
Inappropriate reference (Lines 35-36) “…the main methods for extracting ocean fronts based on remote sensing data include statistical histogram method (Belkin and Cornillon 2003)”
The authors do discuss the application of a histogram-based algorithm to extract the fronts of interest, but a more appropriate reference would have been to the original manuscripts describing the method.
Sloppy (Lines 469-473)
I. M. Belkin and P. J. P. O. Cornillon, "SST fronts of the Pacific coastal and marginal seas," 2003.
L. C. Breaker, T. P. Mavor, and W. W. J. C. S. G. C. P. Broenkow, "Mapping and Monitoring Large-Scale Ocean Fronts Off
the California Coast Using Imagery from the GOES-10 Geostationary Satellite," 2005.
A. G. Kostianoy, A. I. Ginzburg, M. Frankignoulle, and B. J. J. o. M. S. Delille, "Fronts in the Southern Indian Ocean as
inferred from satellite sea surface temperature data," vol. 45, no. 1-2, pp. 55-73, 2004.
I’m pretty sure that these initials are not correct and most of the references seem to have similarly bizarre initials.
(Lines 128-132, Equations 1-3) There are mistakes in two of the three equations.
(Lines 134-135) “Labelme software was used to generate the ocean front labels First and foremost, data from remote sensing satellite images that show ocean fronts must be gathered, which are from various regions, times of year, and types of houses.”
Hmmm… not sure where the authors are going with fronts related to types of houses.
Undefined concepts/terms (Line 50) “…proposed a dual I value ocean front recognition method based on the gradient I value method”
I’m not familiar with the I value method. I did ask ChatGPT and was provided a description of it, but I don’t believe that the term is in common usage, so it should be defined.
(Section 3.3, Lines 139-212) The above comments cover a range of specific issues related to the presentation. Of more concern is that the manuscript does not explicitly define (or, at least, I couldn’t find such a definition) how an “ocean front” is represented in pixel space, e.g., whether it is treated as a line, a finite-width band, or a bounding box enclosing a high-gradient region. Because Mask R-CNN performs region-based segmentation and IoU is computed over areas, clarification is needed on how these concepts were adapted for the detection of essentially linear frontal features. The authors should also explain how fronts crossing tile boundaries were handled to ensure that detections are not truncated or duplicated across adjacent patches. I emphasize that I did not read the manuscript beyond this point, so a definition may appear later; even so, it should be introduced in Section 3.3.
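The sensitivity of area-based IoU to how a front is represented can be seen in a small sketch. The helper names, grid size, and band widths below are hypothetical and are not taken from the manuscript; the example only compares a one-pixel line and a three-pixel band, each displaced by a single pixel.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection over union of two boolean masks of equal shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # both masks empty: define IoU as 0
    return float(np.logical_and(a, b).sum() / union)

def band_mask(shape, row, width):
    """Horizontal band of the given width (in pixels) starting at `row`."""
    m = np.zeros(shape, dtype=bool)
    m[row:row + width, :] = True
    return m

shape = (20, 20)

# Front as a one-pixel-wide line, predicted one pixel off.
iou_line = mask_iou(band_mask(shape, 10, 1), band_mask(shape, 11, 1))

# Front as a three-pixel-wide band, predicted one pixel off.
iou_band = mask_iou(band_mask(shape, 10, 3), band_mask(shape, 11, 3))

print(iou_line, iou_band)  # 0.0 0.5
```

Under these assumptions the same one-pixel displacement gives an IoU of 0 for the line and 0.5 for the band, so any reported IoU or mAP depends directly on the chosen pixel-space definition of a front.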
Data sets
OCEAN FRONT Yuan Niu https://doi.org/10.5281/zenodo.16921277
Model code and software
ocean front code Yuan Niu https://doi.org/10.5281/zenodo.16921685
Interactive computing environment
Files Yuan Niu https://doi.org/10.5281/zenodo.16921678
My review is attached. I recommend major revisions (actually, radical revisions).