A six-year circum-Antarctic iceberg dataset (2018–2023)
Abstract. The distribution of Antarctic icebergs is crucial for understanding their impact on the Southern Ocean's atmosphere and physical environment, as well as their role in global climate change. Recent advances in iceberg databases built from remote sensing imagery and altimetry data have produced the BYU/NIC iceberg database, the Altiberg database, and high-resolution SAR-based iceberg distribution data. However, no unified database integrates the full range of iceberg scales and covers the entire Southern Ocean. We present a comprehensive circum-Antarctic iceberg dataset, developed from Sentinel-1 SAR imagery on the Google Earth Engine (GEE) platform and covering the Southern Ocean south of 55°S. A semi-automated classification method that combines incremental random forest classification with manual correction was applied to extract icebergs larger than 0.04 km², yielding one dataset for each October from 2018 to 2023. The dataset records the geographic coordinates and geometric attributes (area, perimeter, long axis, and short axis) of each iceberg and provides uncertainty estimates for area and mass. It reveals significant interannual variability in iceberg number and total area: the number of icebergs increased from 33,823 in 2018 to 51,332 in 2021, corresponding to major ice shelf calving events (e.g., the A68a iceberg in the Weddell Sea), and then declined in 2022. The annual average total iceberg area is 44,518 ± 4,800 km², and the average mass is 8,779 ± 3,029 Gt. Validation using test set samples and a rolling cross-validation of interannual variability shows that the incremental random forest classification achieves accuracy, recall, and F1 scores exceeding 0.90; manual correction is expected to improve these metrics further.
Comparisons with existing iceberg products (including the BYU/NIC iceberg database and the Altiberg database) indicate a high consistency in spatial distribution, while our dataset demonstrates clear advantages in terms of spatial coverage, iceberg detection scale, and identification capabilities in regions with dense sea ice. This dataset serves as a novel data resource for investigating the impact of Antarctic icebergs on the Southern Ocean, the mass balance of ice sheets, the mechanisms underlying ice shelf collapse, and the response mechanisms of iceberg disintegration to climate change.
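For readers unfamiliar with the validation metrics quoted above, the following minimal sketch shows how accuracy, recall, and F1 are conventionally computed from confusion-matrix counts for a binary iceberg / non-iceberg classification. The class name `ConfusionCounts`, the function `classification_metrics`, and the example counts are illustrative assumptions and are not taken from the dataset or its processing code.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container for binary confusion-matrix counts."""
    tp: int  # iceberg samples correctly labelled as iceberg
    fp: int  # non-iceberg samples wrongly labelled as iceberg
    fn: int  # iceberg samples missed by the classifier
    tn: int  # non-iceberg samples correctly rejected


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall, and F1 using the standard definitions."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only; these are NOT results from the dataset.
example = classification_metrics(ConfusionCounts(tp=90, fp=10, fn=10, tn=90))
print(example)
```

Reporting F1 alongside accuracy is useful here because iceberg pixels or objects are typically far rarer than open water and sea ice, so accuracy alone can look high even when many icebergs are missed.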