the Creative Commons Attribution 4.0 License.
A Submesoscale Eddy Identification Dataset Derived from GOCI I Chlorophyll–a Data based on Deep Learning
Abstract. This paper presents an observational dataset on submesoscale eddies, which obtains from high–resolution chlorophyll–a distribution images from GOCI I. We employed a combination of digital image processing, filtering, YOLOv7–X, and small object detection techniques, along with specific chlorophyll image enhancement processing, to extract information on submesoscale eddies, including their time, polarity, geographical coordinates of the eddy center, eddy radius, coordinates of the upper left and lower right corners of the prediction box, area of the eddy's inner ellipse, and confidence score, which covers eight daily periods between 00:00 and 08:00 (UTC) from April 1, 2011, to March 31, 2021. We identified a total of 19,136 anticyclonic eddies and 93,897 cyclonic eddies at a confidence threshold of 0.2. The mean radius of anticyclonic eddies is 24.44 km (range 2.5 km to 44.25 km), while that of cyclonic eddies is 12.34 km (range 1.75 km to 44 km). The unprecedented hourly resolution dataset on submesoscale eddies provides information on their distribution, morphology, and energy dissipation, making it a significant contribution to understanding marine environments and ecosystems, as well as improving climate model predictions. The dataset is available at https://doi.org/10.5281/zenodo.7694115 (Wang and Yang, 2023).
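As a minimal sketch of how records with the fields listed in the abstract might be filtered and summarized, the Python below applies a confidence threshold and averages the eddy radius by polarity. The field names (`polarity`, `radius_km`, `confidence`), the function name, and the sample values are assumptions for illustration only; they are not taken from the published files.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EddyRecord:
    # Hypothetical field names; the published files may use different ones.
    polarity: str        # "anticyclonic" or "cyclonic"
    radius_km: float
    confidence: float


def mean_radius_by_polarity(records, threshold=0.2):
    """Return {polarity: mean radius in km} for records at or above threshold."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        if rec.confidence >= threshold:
            sums[rec.polarity] += rec.radius_km
            counts[rec.polarity] += 1
    return {pol: sums[pol] / counts[pol] for pol in counts}


# Made-up sample values, not taken from the dataset.
sample = [
    EddyRecord("anticyclonic", 20.0, 0.50),
    EddyRecord("anticyclonic", 30.0, 0.25),
    EddyRecord("cyclonic", 10.0, 0.90),
    EddyRecord("cyclonic", 14.0, 0.10),  # below the 0.2 threshold, dropped
]
summary = mean_radius_by_polarity(sample)
print(summary)
```

Keeping the threshold as a parameter makes it easy to rerun the same summary with a stricter cut-off than 0.2.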
Status: closed
RC1: 'Comment on essd-2023-138', Anonymous Referee #1, 28 May 2023
This manuscript describes a submesoscale eddy dataset derived from satellite ocean color products, which can be very useful for the studies of eddy dynamics and the ecosystem of oceanic environments. Overall, data and method to generate the dataset are well described, results are validated, and information for data access is complete. However, there are various grammar issues and fuzzy descriptions, which should be revised/corrected before publication. Specifically,
- Line 9, “… an observational dataset on submesoscale eddies, which obtains from high–resolution chlorophyll–a …”. The grammar is clearly incorrect; it should be something like “which was obtained from …”.
- L14, “which covers eight daily periods between 00:00 and 08:00 (UTC) from April 1, 2011, to March 31, 2021”. This clause belongs in a new sentence.
- L15, “at a confidence threshold of 0.2”. Need to state whether 0.2 represents high or low confidence.
- L40, “ and there is some controversy” Need citations to support this statement.
- L42, “, etc(Zhang …” It should be “etc.”, and there should be a space before “(”. Please check the entire manuscript for similar issues.
- L50, “chlorophyll” here should be phytoplankton, as concentration of chlorophyll is a proxy for phytoplankton.
- L54, “from high–resolution chlorophyll”, note that “high-resolution” is subjective, and a resolution at 500 m is not “high-resolution” by many standards or measures.
- L75, “Chlorophyll Image Enhancement” → “Enhancement of Chlorophyll Image”. Make similar changes for 2.2.2.
- L82, “directly manually ..”, this is confusing.
- L92, “The Comparison of different …” should be “A comparison of different ..”. Please also check similar issues at other places.
- L128, “where cyclones rotate counterclockwise and anticyclones rotate clockwise.” This is common knowledge, no need to state here.
- L167, “We used the YOLOv7–X as the model”, need citation for this model.
- L168, “YOLOv7–X was obtained by performing stack scaling on the neck and using …” This sentence is confusing, please rephrase.
- L193, “the watercolor remote sensing images”. There is no such thing as “watercolor remote sensing”. It is either ocean color remote sensing or “water color” remote sensing, but the latter is very rare.
- L194, “images cannot represent the actual distribution of SMEs in the region.” What does this mean?
- L195, “The coverage of clouds above the region is the primary obstacle that affects the identification of eddies using this method.” This is nearly identical to the previous sentence.
- L205, Result → Results.
- L216, “the energy of the SMEs dissipates within just two hours, making it impossible to trace them in the chlorophyll field” Why ‘impossible’?
- L220, “(e) and (f) demonstrate”. This is not a professional description; it should be “Figs. (e) and (f) …”. Please also correct other similar places.
- L237, please insert space between figure title and main text. Do so at other places.
- L258, “Performance of the Model for eddy identification”. Why is “M” capitalized? The style of headings and subheadings should be consistent.
- L281, “Validation and comparison of the identification results using Sentinel–3 chlorophyll image”. Why compare with results using Sentinel-3? Note that the spatial resolutions of GOCI-I and Sentinel-3 are similar, so similar results in eddy observation are expected.
- L282, “Due to the differences in the GOCI and OCLI sensors, the blue-green spectral bands used for chlorophyll inversion are different, the calculation coefficients are different, and even the image resolutions are different.” The reasoning is strange.
- Paragraph below section 3.6. Many grammar issues, descriptions are confusing.
- L316, “from the chlorophyll spirals structures at the sea surface” Please check grammar.
- L316, “… and with high spatiotemporal resolution chlorophyll data from ocean color sensors, we suppressed large-scale ocean signals and increased chlorophyll concentration gradients to highlight eddy-induced chlorophyll spirals with more significant contrast in different oceanic environments” Confusing sentence.
- L320, “in ten eight-year periods” So, a total of 80 years? That is impossible.
- L330, “his method can detect SMEs, and the eddy-induced chlorophyll spirals represent a direct mapping of eddy physical properties in the chlorophyll field, with high credibility.” Confusing sentence.
Citation: https://doi.org/10.5194/essd-2023-138-RC1
- AC1: 'Reply on RC1', Yan Wang, 28 May 2023
RC2: 'Comment on essd-2023-138', Anonymous Referee #2, 15 Aug 2023
This paper introduces an observational dataset of submesoscale eddies in the Northwest Pacific using deep learning techniques. While the approach and resulting product are novel, certain crucial results and discussions are missing. Specifically, this article exhibits significant language issues, including numerous grammar errors and unclear expression. I might consider accepting this article after these issues are truly resolved.
(1) Even though a precise definition of 'submesoscale eddy' is not yet established, the authors should provide a descriptive introduction to the fundamental characteristics (shape, size, structure, etc.). This is crucial for readers to comprehend the dataset. Clearly, the authors' efforts in reviewing previous research are incomplete, as there is no mention of Munk's groundbreaking work in 2002.
(2) Compared to logarithmic transformation, the CLAHE image enhancement technique can provide clearer information on spiral structures. However, whether the enhanced signals are genuine, and whether they might exaggerate the size and intensity of submesoscale eddies, needs to be demonstrated with results.
(3) L125. Prior to conducting large-scale identification, manual annotation is required, which undoubtedly introduces significant uncertainty. The authors need to demonstrate that the results of manual annotation are statistically reasonable. Figure 5 presents an eddy with a clear structure. The question arises of how eddies with less distinct structures are handled. This also touches on the issue of the definition of submesoscale eddies.
(4) There have been some studies utilizing machine learning methods to detect mesoscale eddies in the ocean. The authors should introduce the related works and highlight the distinction between the submesoscale eddies identified here and mesoscale eddies. Is the difference merely in terms of size?
(5) L210. 'at a confidence threshold of 0.2'. This is an exceptionally vital parameter, capable of greatly influencing the eventual product. The authors need to provide a clearer reason for the adoption of this value by means of sensitivity testing. This step is indispensable to eliminate artificial selection and ensure robustness.
(6) L230. '…, with the Kuroshio current passing through this area'. Do you mean that the Kuroshio passes through the Sea of Japan?
(7) L245. Beyond location and size, is it possible to analyze the lifecycle of submesoscale eddies?
(8) Sections 3.5 and 3.6 do not show the validation on the detected eddies. These submesoscale eddies are derived from processed chlorophyll images. Can the authors utilize additional observational data to confirm the authenticity of these eddies, for example, high-resolution SST data or other flow observations?
(9) The color scheme of Figure 12 needs to be changed, as it doesn't clearly present the details.
(10) This dataset is regional in nature, focusing on submesoscale eddies in the Northwest Pacific Ocean. This point needs to be clarified in the title of the article, otherwise, readers might assume it's a global eddy dataset.
(11) For a dataset, especially results derived from observations, there are bound to be certain limitations. The authors need to engage in a discussion in this regard, providing readers with guidance and reminders when utilizing the dataset.
(12) For the released product, an explanatory document needs to be added to clarify the meanings of various variables and provide instructions for processing the data.
(13) I strongly recommend that the authors polish the language throughout the entire text, as I have identified a significant number of grammar errors and awkward expressions. Reviewer 1 has provided many language suggestions, but it is not enough to just make changes based on those. Instead, it is advisable to seek assistance from a professional editing service for the revisions.
L9. 'which obtains from'. Grammatical error.
L48. 'Compared to the method of SAR images, it can …'. What does 'it' refer to?
L83. Change to 'This is conducted to avoid'.
L228. 'We counted the number of times each grid cell…'. Unclear description.
L320. 'ten eight-year periods'. What does this mean?
The use of present and past tenses is confusing and inconsistent.
I can't point them all out individually. The language does not yet meet the requirements of this journal.
Citation: https://doi.org/10.5194/essd-2023-138-RC2
- AC2: 'Reply on RC2', Yan Wang, 19 Aug 2023
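The threshold sensitivity test requested in comment (5) of RC2 could take a form like the following: count how many detections survive as the confidence threshold varies. This is only an illustrative sketch, not the authors' procedure; the function name `count_by_threshold` and the confidence scores are hypothetical.

```python
def count_by_threshold(confidences, thresholds):
    """Return {threshold: number of detections with confidence >= threshold}."""
    return {t: sum(1 for c in confidences if c >= t) for t in thresholds}


# Made-up confidence scores, for illustration only.
scores = [0.05, 0.15, 0.22, 0.35, 0.48, 0.61, 0.77, 0.93]
counts = count_by_threshold(scores, [0.1, 0.2, 0.3, 0.5])
for threshold, n in counts.items():
    print(f"threshold {threshold:.1f}: {n} detections retained")
```

A sharp drop in the retained count between neighbouring thresholds would suggest that the product is sensitive to the chosen value.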
Data sets
Identification of Submesoscale Eddy Datasets Using AI Methods from GOCI I Chlorophyll Yan Wang https://doi.org/10.5281/zenodo.7694115
Video supplement
Submesoscale eddy variations on an hourly time scale Yan Wang https://youtube.com/shorts/ZtZWRXOYDiQ?feature=share
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
751 | 262 | 48 | 1,061 | 44 | 45 |