Articles | Volume 18, issue 1
https://doi.org/10.5194/essd-18-147-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
A six-year circum-Antarctic icebergs dataset (2018–2023)
Download
- Final revised paper (published on 06 Jan 2026)
- Supplement to the final revised paper
- Preprint (discussion started on 15 May 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on essd-2025-51', Anonymous Referee #1, 23 Jun 2025
- AC1: 'Reply on RC1', Teng Li, 21 Aug 2025
- RC2: 'Comment on essd-2025-51', Anne Braakmann-Folgmann, 04 Jul 2025
- AC2: 'Reply on RC2', Teng Li, 21 Aug 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Teng Li on behalf of the Authors (21 Aug 2025)
Author's response
Author's tracked changes
Manuscript
ED: Reconsider after major revisions (12 Sep 2025) by Désirée Treichler
AR by Teng Li on behalf of the Authors (26 Sep 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (15 Oct 2025) by Désirée Treichler
RR by Anonymous Referee #1 (28 Oct 2025)
RR by Anonymous Referee #2 (11 Nov 2025)
ED: Publish subject to minor revisions (review by editor) (19 Nov 2025) by Désirée Treichler
AR by Teng Li on behalf of the Authors (24 Nov 2025)
Author's response
Author's tracked changes
Manuscript
ED: Publish as is (10 Dec 2025) by Désirée Treichler
AR by Teng Li on behalf of the Authors (13 Dec 2025)
This paper presents a circum-Antarctic iceberg database derived from Sentinel-1 SAR images on the Google Earth Engine platform. The image segmentation and random forest classifier appear to work well in capturing the spatiotemporal distributions of icebergs, including their number and sizes, across the Southern Ocean. However, the authors need to provide more details about their iceberg detection model. While the authors mentioned that they used an ensemble of four random forest (RF) classifiers, each based on different input features, they did not provide any details about the ensemble itself (e.g., the weights assigned to each classifier, or the relative importance of the statistical, histogram, and texture features). I encourage the authors to provide the details of their ensemble process to support the robustness of their method. Please also see my detailed comments below.
L146-147: How are these three subsets divided? Randomly or by any other criteria?
L210: Maybe it would be better to use 40 m, instead of 0.04 km, as already used throughout the manuscript (L69 and L216).
L241: “Based on this analysis, we selected an average thickness of 232 m for the icebergs” -> It is not clear how this value of 232 m is derived.
L256-259: Then, does it mean that the 2018 data were included in training for all iterations but never tested, and that the 2023 data were never used for training? If so, I don't think this is a fair training strategy, because the model could be biased toward the 2018 data. Would it be better to conduct 6-fold cross-validation (i.e., leave-one-year-out cross-validation), for example, with the 2018 data as test data and the remaining years as training data in iteration 1, the 2019 data as test data and the remaining years as training data in iteration 2, and so forth? The authors mentioned that they used the current strategy to "adapt to the time-series nature of the data while minimizing the risks of overfitting" (L256), but I'm not sure how it achieves this.
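For illustration, a minimal sketch of the suggested leave-one-year-out scheme using scikit-learn's LeaveOneGroupOut, with the acquisition year as the grouping variable (the feature matrix X, labels y, and sample counts below are placeholders, not the authors' data):

```python
# Sketch of the suggested leave-one-year-out cross-validation (2018-2023),
# with one fold per acquisition year. X, y, and years are placeholders
# standing in for the authors' labelled samples.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 20))                 # e.g. statistical/histogram/texture features
y = rng.integers(0, 2, size=600)               # iceberg vs. non-iceberg labels
years = np.repeat(np.arange(2018, 2024), 100)  # acquisition year of each sample

logo = LeaveOneGroupOut()                      # 6 folds: each year held out once as the test set
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, groups=years, cv=logo, scoring="f1")

for test_year, score in zip(np.unique(years), scores):
    print(f"test year {test_year}: F1 = {score:.3f}")
```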
Tables 3 and 4: The authors conducted two performance evaluations: (i) an evaluation for each year (Table 3) and (ii) an evaluation with rolling-window validation (Table 4). I'm not sure that both evaluations are really necessary; to evaluate the model performance, I believe the cross-validation in Table 4 is sufficient.
L263-264: So, which model is finally used to build the iceberg database? Is the database built for each year separately using the per-year random forest models in Table 3, or does the entire database use a single model trained in the final iteration in Table 4?
Section 4.1: The authors should provide a detailed performance evaluation of their "ensemble" RF model. In L150-154, the authors mentioned that they used four RF classifiers and assigned weights to these classifiers, but the manuscript lacks details about this process. It is necessary to specify the performance of each of the four classifiers and how the weights between them were selected.
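For reference, one common way to build such a weighted ensemble is to train one RF per feature group and weight each classifier by its validation performance. The sketch below shows the kind of detail that would be useful to report; the feature-group names, the fourth "all-features" group, the F1-proportional weighting, and the soft voting are my assumptions, not necessarily the authors' procedure:

```python
# Hypothetical weighted soft-voting ensemble of four RF classifiers,
# one per feature group, with weights proportional to validation F1.
# Data, feature groups, and weighting scheme are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 800
groups = {"statistical": slice(0, 6), "histogram": slice(6, 14),
          "texture": slice(14, 22), "all": slice(0, 22)}   # assumed feature groups
X = rng.normal(size=(n, 22))
y = rng.integers(0, 2, size=n)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=1)

models, weights = {}, {}
for name, cols in groups.items():
    rf = RandomForestClassifier(n_estimators=200, random_state=1)
    rf.fit(X_tr[:, cols], y_tr)
    models[name] = rf
    weights[name] = f1_score(y_val, rf.predict(X_val[:, cols]))  # weight = validation F1

total = sum(weights.values())
weights = {k: w / total for k, w in weights.items()}             # normalise weights to sum to 1

def ensemble_proba(X_new):
    """Weighted average of the four classifiers' class probabilities."""
    return sum(w * models[k].predict_proba(X_new[:, groups[k]]) for k, w in weights.items())

y_pred = ensemble_proba(X_val).argmax(axis=1)
print("classifier weights:", weights)
print("ensemble validation F1:", f1_score(y_val, y_pred))
```

Reporting the per-classifier scores and the resulting weights (as printed above) would go a long way toward demonstrating the robustness of the ensemble.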
L300: “several tens of kilometers”: This is too ambiguous. Please provide specific numbers.
L301-303: I would like to ask the authors to provide more details about why the BYU/NIC database cannot capture so many > 5 km icebergs. Does it intentionally skip relatively small icebergs (near 5 km size), or does its iceberg detection algorithm, by itself, have limitations in capturing near-5-km icebergs? What about much larger icebergs, for example, > 10 km?
L339-349: I wonder if the total number of icebergs here and in Table 5 is the “true” number of icebergs. That is, if an iceberg is detected in two different Sentinel-1 scenes, how is it counted? Such an iceberg could be counted in duplicate, as the method proposed in this study can only “detect” icebergs but cannot “track” identical icebergs. This may not be very significant because the authors used mosaicked data, but there is still a possibility that the same iceberg is detected in duplicate (or that some icebergs are missed) due to drift, even over a short period. It would be worthwhile to mention this issue and include any relevant discussion of it.
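If drift between overlapping acquisitions is small, one simple mitigation would be to merge detections whose centroids fall within a distance threshold. A minimal sketch, in which the centroid coordinates, the projected-kilometre units, and the 5 km threshold are illustrative assumptions rather than values from the manuscript:

```python
# Hypothetical deduplication of iceberg detections: merge detections whose
# centroids (here in projected km, e.g. polar stereographic) lie within a
# chosen distance threshold, treating them as the same iceberg.
import numpy as np
from scipy.spatial import cKDTree

def dedupe_detections(centroids_km, threshold_km=5.0):
    """Return one representative index per cluster of nearby detections."""
    tree = cKDTree(centroids_km)
    pairs = tree.query_pairs(r=threshold_km)   # all index pairs closer than the threshold

    # Union-find to group detections linked by any close pair.
    parent = list(range(len(centroids_km)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i, j in pairs:
        parent[find(i)] = find(j)

    clusters = {}
    for idx in range(len(centroids_km)):
        clusters.setdefault(find(idx), []).append(idx)
    return [members[0] for members in clusters.values()]

# Toy example: detections 0 and 1 are 2 km apart and merge into one iceberg.
centroids = np.array([[0.0, 0.0], [2.0, 0.0], [50.0, 50.0]])
print(dedupe_detections(centroids))   # -> [0, 2], i.e. two unique icebergs
```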
L347: We -> we
L355-356: “in the West Antarctic region and in the East Antarctic region” -> It would be better to specify only the Thwaites and Dotson ice shelves and the Holmes and Mertz ice shelves, without mentioning the overly ambiguous “West and East Antarctic regions”.
L379-382: “In the Ross Sea sector, the iceberg proportion remained stable at around 16 % in 2018 and 2019, … remained relatively stable at approximately 20 % over the six-year period.” In these sentences, “iceberg proportion” presumably means the number of icebergs in each sector divided by the total number of icebergs in the Southern Ocean. However, this term can be confused with the fraction of area covered by icebergs (i.e., iceberg area / total ocean area of each sector). Please consider rephrasing these sentences to clarify the meaning of the iceberg proportion. It might be better to discuss just the numbers (Figure 11a) rather than the proportions (Figure 11b).
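To make the distinction explicit, the two readings for a sector s could be written as follows (notation mine, not taken from the manuscript):

```latex
% Count-based proportion (apparently what the manuscript means):
P_{s}^{\mathrm{count}} = \frac{N_{s}}{\sum_{s'} N_{s'}}
\qquad \text{vs.} \qquad
% Area-coverage reading that some readers might assume:
P_{s}^{\mathrm{area}} = \frac{A_{s}^{\mathrm{iceberg}}}{A_{s}^{\mathrm{ocean}}}
% N_s: number of icebergs detected in sector s;
% A_s^iceberg: total iceberg area in sector s;
% A_s^ocean: ocean area of sector s.
```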
L387: This is similar to the previous comment; please clarify the meaning of “total area.” I believe this means the total area of icebergs.
Figure 11: The caption should be corrected; I don’t think this figure includes any information about “five categories.”
L394-401: I’m not sure that this part really “validates” the small-iceberg formation mechanism. The authors only present the distances from large icebergs, which does not provide any direct evidence for the formation mechanism of small icebergs. I don’t think this part is necessary.
L418: Although -> although