A 30-year ocean front datasets based on deep learning from 1993 to 2023 for Northwest Pacific ocean

Niu, Yuan; Zhang, Xuefeng; Zhang, Dianjun

doi:10.5194/essd-2025-514

Preprints

https://doi.org/10.5194/essd-2025-514

10 Sep 2025

Status: a revised version of this preprint is currently under review for the journal ESSD.

Abstract. Ocean fronts are critical interfaces between different water masses, profoundly influencing atmosphere–ocean interactions, weather systems, marine ecosystems, and climate regulation. Accurate and long-term observations of ocean fronts are essential for advancing studies in meteorology, oceanography, and climate science. However, no publicly available, long-term ocean front dataset currently exists, and existing detection methods often rely on time-consuming manual labeling or traditional algorithms with limited accuracy in complex frontal regions. In this study, we release the first publicly available 30-year ocean front dataset (1993–2023) for the Northwest Pacific, generated by applying a deep learning framework (Mask R-CNN) to daily sea surface temperature (SST) fields, with manually annotated samples for model training. The dataset provides pixel-level frontal boundaries along with associated attributes, including position, intensity, and width, stored in NetCDF-4 format at 1/12° spatial and daily temporal resolution. Accuracy evaluation shows a mean average precision (mAP) exceeding 0.90, with smaller errors in front width and intensity compared with traditional gradient-based methods, while capturing more small-scale features. The dataset offers three main contributions: (1) Filling the critical gap of a standardized, long-term ocean front product; (2) Serving as a ready-to-use training resource for deep learning models, greatly reducing the need for manual labeling; and (3) Providing benchmark samples for validation and intercomparison of other ocean front detection products. This dataset supports robust investigations of seasonal-to-interannual frontal variability and provides a valuable foundation for applications in meteorology, ecosystem management and climate change research.

Received: 22 Aug 2025 – Discussion started: 10 Sep 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1:
'Comment on essd-2025-514', Igor Belkin, 20 Sep 2025

My review is attached. I recommend major revisions (actually, radical revisions).

Citation: https://doi.org/10.5194/essd-2025-514-RC1
- AC3: 'Reply on RC1', zhang dianjun, 05 Jan 2026
  
  Dear Reviewer,
  Thank you for your thorough review of our manuscript.
  We sincerely appreciate your judicious review, which significantly improved the manuscript.
  We have carefully considered your comments and have made the necessary revisions to address your concerns.
  Please find attached our detailed response document outlining the changes we have made.
  Looking forward to hearing from you.
  Yours sincerely,
  Dianjun Zhang
  
  Citation: https://doi.org/10.5194/essd-2025-514-AC3
RC2:
'Comment on essd-2025-514', Peter Cornillon, 07 Oct 2025

I have decided to discontinue my review after reading through line 215 of the manuscript. The presentation up to this point is confusing, incomplete, and in places incorrect, making it difficult to follow the authors’ reasoning. I provide examples of each of these below, but I emphasize that these are only illustrative—addressing them alone will not, in my opinion, make the manuscript publishable. I agreed to review this paper because I am genuinely interested in the application of machine learning to the detection of ocean fronts. However, based on what I have read so far, I believe I would struggle to understand the algorithm as currently presented.
I would be willing to review a substantially revised version of the manuscript if the authors make a serious effort to present their work in a clearer and more coherent manner.
I also note that the authors do not mention any use of AI tools—specifically large language models—in preparing the manuscript. Given the lack of clarity both because of the English and because of the structural organization of the manuscript, they might consider using such tools to assist in revising the text. Copernicus Publications does not prohibit the use of large language models for language assistance, but it does require that their use, if any, be disclosed in the manuscript.
Terminology (Lines 22-24) “Ocean front refers to a narrow transitional zone between two or more types of water bodies with significantly different properties, which is a jumping zone of marine environmental parameters and can be described by the horizontal gradient of seawater temperature.”
I assume that jumping zone means a step in the observed property but this is certainly not standard usage.
(Lines 80-82) “With the application of depth learning in the field of image recognition (Nogueira et al. 2016), in view of the shortcomings of traditional ocean front detection methods, the ocean front detection algorithm based on depth learning has become a research hotspot.”
It’s deep learning, not depth learning. I realize that English is not the first language of the authors, hence the suggestion that they use a generative AI chatbot to help with the English.
Overgeneralized and misleading (Lines 25-27) “These fronts are the places where different air masses (usually cold and warm humid air) interact, not only having profound impacts on meteorology and climate, but also playing key roles in ecology, resource management, and climate regulation.”
This may be the case but is not necessarily so. In fact, for sub-mesoscale fronts it is likely rarely the case, and for mesoscale fronts it will depend on the properties defining the front; e.g., it is unlikely to be the case for a strong salinity front with a thermal expression due, say, to river runoff, or for a chlorophyll front resulting from an open ocean bloom. What makes this sentence particularly confusing is that it conflates atmospheric fronts with oceanic ones.
Incorrect/misleading (Lines 40-43) “The main methods for calculating temperature gradients are Gradient method and Sobel gradient algorithm.”
The main methods for front detection are population-based and gradient-based. Sobel gradients are one form of gradient-based algorithms. Canny’s work is also based on gradients. Cayula and Cornillon’s work is population-based.
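As an illustration of the gradient-based family named in this comment, the following is a minimal sketch of a Sobel gradient-magnitude detector on a small gridded SST field. It is not the manuscript's algorithm and not the population-based method of Cayula and Cornillon; the function names, the uniform `spacing` parameter, the toy field, and the threshold value are all assumptions made for the example.

```python
import math

# 3x3 Sobel kernels for the x (column) and y (row) derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_gradient_magnitude(sst, spacing=1.0):
    """Return the Sobel gradient magnitude for interior cells of a 2-D grid.

    sst     : list of equal-length rows of temperatures (hypothetical input)
    spacing : grid spacing in the chosen distance unit (assumed uniform)
    Border cells stay None because the 3x3 stencil does not fit there.
    """
    rows, cols = len(sst), len(sst[0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            gx = gy = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    value = sst[i + di][j + dj]
                    gx += SOBEL_X[di + 1][dj + 1] * value
                    gy += SOBEL_Y[di + 1][dj + 1] * value
            # One side of the kernel sums to 4 and the stencil spans 2 cells,
            # so dividing by 8 * spacing yields temperature per distance unit.
            out[i][j] = math.hypot(gx, gy) / (8.0 * spacing)
    return out


def threshold_fronts(gradient, threshold):
    """Mark cells whose gradient magnitude meets an (arbitrary) threshold."""
    return [[g is not None and g >= threshold for g in row] for row in gradient]


# Toy field: a 4-degree step between columns 1 and 2 (illustrative values).
toy_sst = [[10.0 if j < 2 else 14.0 for j in range(5)] for _ in range(5)]
toy_gradient = sobel_gradient_magnitude(toy_sst)
toy_fronts = threshold_fronts(toy_gradient, threshold=1.0)
```

The threshold step is where the subjectivity criticized in the manuscript enters: the mask changes with the chosen value, which is one of the issues a population-based approach was designed to sidestep.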
Incomplete and a bit misleading criticism of previous front detection algorithms (Lines 43-43) “However, these algorithms may not effectively distinguish between genuine ocean fronts and other image features or artifacts.”
This is only one form in which front detection algorithms may fail. The discussion of issues with current algorithms is incomplete. Furthermore, I would be surprised if the fronts detected by the algorithm presented in this paper did not also fail in this regard. As noted above I have not reviewed the algorithm itself but…
(Lines 58-60) “In summary, traditional methods for extracting ocean fronts suffer from limitations such as subjective threshold selection, inadequate handling of complex fronts, dependency on edge detection algorithms, and limited adaptability to changing conditions.”
But all of the methods you discuss from line 40 on are gradient-based. The reason that the population based method of Cayula and Cornillon was developed was to address some of the issues you raise. Admittedly, their method has other issues but, because the primary mechanism is not based on gradients, it doesn’t suffer from some of the problems you mention.
Inappropriate reference (Lines 35-36) “…the main methods for extracting ocean fronts based on remote sensing data include statistical histogram method (Belkin and Cornillon 2003)”
The authors do discuss the application of a histogram based algorithm to extract the fronts of interest but a more appropriate reference would have been to the original manuscripts describing the method.
Sloppy (Lines 469-473)
I. M. Belkin and P. J. P. O. Cornillon, "SST fronts of the Pacific coastal and marginal seas," 2003.
L. C. Breaker, T. P. Mavor, and W. W. J. C. S. G. C. P. Broenkow, "Mapping and Monitoring Large-Scale Ocean Fronts Off
the California Coast Using Imagery from the GOES-10 Geostationary Satellite," 2005.
A. G. Kostianoy, A. I. Ginzburg, M. Frankignoulle, and B. J. J. o. M. S. Delille, "Fronts in the Southern Indian Ocean as
inferred from satellite sea surface temperature data," vol. 45, no. 1-2, pp. 55-73, 2004.
I’m pretty sure that these initials are not correct and most of the references seem to have similarly bizarre initials.
(Lines 128-132, Equations 1-3) There are mistakes in two of the three equations.
(Lines 134-135) “Labelme software was used to generate the ocean front labels First and foremost, data from remote sensing satellite images that show ocean fronts must be gathered, which are from various regions, times of year, and types of houses.”
Hmmm… not sure where the authors are going with fronts related to types of houses.
Undefined concepts/terms (Line 50) “…proposed a dual I value ocean front recognition method based on the gradient I value method”
I’m not familiar with the I value method. I did ask ChatGPT and was provided a description of it, but I don’t believe that it is in common usage, so it should be defined.
(Section 3.3, Lines 139-212) The above comments cover a range of specific issues related to the presentation. Of more concern is that the manuscript does not explicitly define (or, at least I couldn’t find such a definition) how an “ocean front” is represented in pixel space — e.g., whether it is treated as a line, a finite-width band, or a bounding box enclosing a high-gradient region. Because Mask R-CNN performs region-based segmentation and IoU is computed over areas, clarification is needed on how these concepts were adapted for the detection of essentially linear frontal features. The authors should also explain how fronts crossing tile boundaries were handled to ensure that detections are not truncated or duplicated across adjacent patches. I emphasize that I did not read the manuscript beyond this point so they may have presented a definition later in the manuscript but, if the authors define this later, it should nonetheless be introduced in Section 3.3.
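The concern about area-based IoU applied to essentially linear features can be illustrated with a toy calculation. The sketch below is an assumption-laden example, not the evaluation used in the manuscript: it represents a front as a vertical band of pixels (the `band_mask` helper and the grid size are hypothetical) and compares the IoU after a one-pixel displacement for a one-pixel-wide line and for a five-pixel-wide band.

```python
def band_mask(rows, cols, start_col, width):
    """Return the set of (row, col) pixels in a vertical band (a toy 'front')."""
    last_col = min(cols, start_col + width)
    return {(r, c) for r in range(rows) for c in range(start_col, last_col)}


def iou(mask_a, mask_b):
    """Intersection over union of two pixel sets; 0.0 when both are empty."""
    union = mask_a | mask_b
    if not union:
        return 0.0
    return len(mask_a & mask_b) / len(union)


ROWS, COLS = 20, 20

# One-pixel-wide line, prediction displaced by a single pixel.
thin_truth = band_mask(ROWS, COLS, start_col=5, width=1)
thin_pred = band_mask(ROWS, COLS, start_col=6, width=1)
print(iou(thin_truth, thin_pred))            # 0.0

# Five-pixel-wide band with the same one-pixel displacement.
wide_truth = band_mask(ROWS, COLS, start_col=5, width=5)
wide_pred = band_mask(ROWS, COLS, start_col=6, width=5)
print(round(iou(wide_truth, wide_pred), 3))  # 0.667
```

The same one-pixel error gives an IoU of zero for a line but about two thirds for a band, so any reported IoU or mAP depends strongly on how a front is represented in pixel space.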

Citation: https://doi.org/10.5194/essd-2025-514-RC2
- AC1: 'Reply on RC2', zhang dianjun, 05 Jan 2026
  
  Dear Reviewer,
  Thank you for your thorough review of our manuscript.
  We sincerely appreciate your judicious review, which significantly improved the manuscript.
  We have carefully considered your comments and have made the necessary revisions to address your concerns.
  Please find attached our detailed response document outlining the changes we have made.
  Looking forward to hearing from you.
  Yours sincerely,
  Dianjun Zhang
  
  Citation: https://doi.org/10.5194/essd-2025-514-AC1
- AC2: 'Reply on RC2', zhang dianjun, 05 Jan 2026
  
  Dear Reviewer,
  Thank you for your thorough review of our manuscript.
  We sincerely appreciate your judicious review, which significantly improved the manuscript.
  We have carefully considered your comments and have made the necessary revisions to address your concerns.
  Please find attached our detailed response document outlining the changes we have made.
  Looking forward to hearing from you.
  Yours sincerely,
  Dianjun Zhang
  
  Citation: https://doi.org/10.5194/essd-2025-514-AC2

Data sets

OCEAN FRONT Yuan Niu https://doi.org/10.5281/zenodo.16921277

Model code and software

ocean front code Yuan Niu https://doi.org/10.5281/zenodo.16921685

Interactive computing environment

Files Yuan Niu https://doi.org/10.5281/zenodo.16921678

Viewed

Total article views: 1,716 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,388	281	47	1,716	48	56

Views and downloads (calculated since 10 Sep 2025)

Month	HTML	PDF	XML	Total
Sep 2025	816	17	11	844
Oct 2025	144	27	7	178
Nov 2025	114	41	5	160
Dec 2025	73	44	4	121
Jan 2026	135	75	13	223
Feb 2026	34	40	3	77
Mar 2026	67	36	4	107
Apr 2026	5	1	0	6

Viewed (geographical distribution)

Total article views: 1,711 (including HTML, PDF, and XML) Thereof 1,711 with geography defined and 0 with unknown origin.

Latest update: 04 Apr 2026

Short summary

We develop and release the first publicly available 30-year front dataset (1993–2023) for the Northwest Pacific, generated using a deep learning framework (Mask R-CNN). The dataset provides pixel-level frontal boundaries with associated attributes, including position, intensity and width, stored in NetCDF-4 format at 1/12° spatial and daily temporal resolution.

