Articles | Volume 18, issue 1
https://doi.org/10.5194/essd-18-411-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
PROMICE-2022 ice mask: a high-resolution outline of the Greenland Ice Sheet from August 2022
Download
- Final revised paper (published on 15 Jan 2026)
- Preprint (discussion started on 05 Aug 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on essd-2025-415', Anonymous Referee #1, 29 Aug 2025
- RC2: 'Comment on essd-2025-415', Frank Paul, 06 Nov 2025
- AC1: 'Comment on essd-2025-415', Gregor Luetzenburg, 04 Dec 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Gregor Luetzenburg on behalf of the Authors (04 Dec 2025)
ED: Publish as is (23 Dec 2025) by Birgit Heim
AR by Gregor Luetzenburg on behalf of the Authors (29 Dec 2025)
Luetzenburg et al. describe the production of a new ice mask for the Greenland Ice Sheet. This mask has been manually produced, primarily from a 10 m Sentinel-2 RGB mosaic from ~August 2022, supplemented with additional high-resolution data such as SPOT 6/7. The independent validation and quality control are rigorous, and the dataset clearly represents the culmination of a lot of work. The data is presented in FAIR-aligned, open and modern formats (geopackages, etc.), and the existence of a clear GitHub repo for reporting and fixing issues is indicative of the effort the authors have gone to in making sure that this dataset will be valuable in the long term. I have very few comments, largely potential opportunities to further reinforce the long-term value of the dataset.
DATASET COMMENTS
My initial reaction on seeing the total size of the dataset was that it was surprisingly large (the best part of a GB) relative to my expectation. This is because the authors provide a number of options and variations on the dataset (polygons, lines, rasters, with/without nunataks, etc.). On balance, I think that the benefit of these options outweighs the downside of having an inflated total dataset size - I have definitely been frustrated in the past by e.g. needing a vector dataset when only a raster is available (and vice versa) and having to convert between these types. However, I would recommend that the ‘detailed description’ on the Dataverse also include the brief list of file descriptions that is available in `00-README-PROMICE-2022-IceMask.md`. This would allow users to quickly identify and download only the most relevant file for their use case, without needing to download the separate markdown file.
One thing I have noticed is that the dataset itself does not currently include the semantic version number, although I am aware that the Dataverse record does (currently v2.1). The very useful setup of the GitHub would imply that, hopefully, a number of future version updates are to come. However, after downloading the dataset, there is nothing that I can see, either in the README.md or in any file metadata, that would indicate the version number. I can easily imagine a situation where the dataset is downloaded by a user for a project, a year passes, and then as a paper is being written up the user has no idea what version of the dataset was used, even as a number of updates have been made to the main dataset. The simple/idiot-proof thing to do would be to include the version number in all the individual filenames (e.g. `01-PROMICE-2022-v2.1-IceMask-line.gpkg`), although maybe the authors have a better suggestion.
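A minimal sketch of the filename suggestion, assuming the current files follow an `NN-PROMICE-2022-...` pattern (inferred from the example above, not checked against the dataset). The helper names `add_version` and `read_version` are hypothetical.

```python
import re

# Assumed pattern: two-digit file index, then the product name "PROMICE-2022".
# The negative lookahead skips names that already carry a "-v<digits>" tag.
_PREFIX = re.compile(r"^(\d{2}-PROMICE-2022)-(?!v\d)")
_VERSION = re.compile(r"-v(\d+(?:\.\d+)*)-")


def add_version(filename: str, version: str) -> str:
    """Insert '-v<version>' after the 'NN-PROMICE-2022' prefix.

    Names that do not match the assumed pattern, or that already contain
    a version tag, are returned unchanged.
    """
    return _PREFIX.sub(rf"\1-v{version}-", filename, count=1)


def read_version(filename: str):
    """Return the version string embedded in a filename, or None if absent."""
    match = _VERSION.search(filename)
    return match.group(1) if match else None


print(add_version("01-PROMICE-2022-IceMask-line.gpkg", "2.1"))
# -> 01-PROMICE-2022-v2.1-IceMask-line.gpkg
print(read_version("01-PROMICE-2022-v2.1-IceMask-line.gpkg"))
# -> 2.1
```

With a tag like this, a user returning to a year-old download could recover the version from the filename alone. Writing the same string into the file metadata would be a complementary option.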
Why was the choice made to not provide a higher-resolution raster? I understand that a 150 m option aligned with BedMachine will be useful for modelling purposes, but a high-res version (10 m) would be useful for masking satellite data at a higher resolution, and would compete with the GrIMP mask (15 m resolution). Although I understand that the file size may be the main limitation, I wonder whether a properly zstd-compressed boolean dataset would be smaller than suggested by a proportional scaling of a 4.4 MB 150 m resolution dataset (which would imply a 10 m dataset approaching a gigabyte). Perhaps it could be made available in a separate repository for those interested.
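The proportional scaling mentioned above can be worked through as follows. This is only the naive estimate, using the figures quoted in this comment (150 m, 10 m, 4.4 MB) and assuming the file size grows linearly with pixel count.

```python
# Figures quoted in the comment above.
coarse_res_m = 150.0   # resolution of the existing raster
fine_res_m = 10.0      # resolution of the suggested raster
coarse_size_mb = 4.4   # size of the existing 150 m file

# Pixel count scales with the square of the resolution ratio.
pixel_factor = (coarse_res_m / fine_res_m) ** 2

# Naive estimate: assume file size is proportional to pixel count.
naive_size_mb = coarse_size_mb * pixel_factor

print(f"pixel count factor: {pixel_factor:.0f}x")             # 225x
print(f"proportional size estimate: {naive_size_mb:.0f} MB")  # 990 MB
```

This should probably be read as an upper bound. A boolean mask consists mostly of long uniform runs, so a compressed 10 m file may well come out smaller, but that would need to be tested on the real data.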
The basin polygons (files `08` and `09`) do not currently include the `NAME` column from the original Mouginot dataset. Whilst the names themselves aren’t perfect (lots of `NW_NONAME1` style fudges), including the column would provide some context to the polygons, as well as an appropriate key with which to `join` this dataset to others also based on the Mouginot basins.
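To illustrate why a shared `NAME` key is useful, here is a minimal attribute-join sketch in plain Python. Apart from `NW_NONAME1`, every name and value below is a made-up placeholder rather than real basin data, and `left_join` is a hypothetical helper.

```python
# Placeholder attribute table for the basin polygons, keyed on NAME.
basins = [
    {"NAME": "NW_NONAME1", "basin_id": 1},
    {"NAME": "EXAMPLE_BASIN", "basin_id": 2},       # hypothetical name
    {"NAME": "UNMATCHED_BASIN", "basin_id": 3},     # hypothetical name
]

# Placeholder attributes from another dataset built on the same basins.
other = {
    "NW_NONAME1": {"value": 10.0},
    "EXAMPLE_BASIN": {"value": 20.0},
}


def left_join(rows, lookup, key="NAME"):
    """Attach lookup attributes to rows sharing the key; keep unmatched rows as-is."""
    return [{**row, **lookup.get(row[key], {})} for row in rows]


joined = left_join(basins, other)
for row in joined:
    print(row)
```

Without the `NAME` column, the same match would need a spatial join on the geometries, which is slower and sensitive to small boundary differences between datasets.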
One interesting consequence of the choices made in producing the CL1 dataset (file `07`) and the basins polygons (files `08` and `09`) is that the CL1 polygon divides the peripheral ice caps but not the ice sheet, whilst the basins polygon divides the ice sheet but not the peripheral ice caps. It’s clear why this was done - one is consistent with Rastner et al (2012), whilst the other is consistent with the Mouginot et al (2019) dataset (L158-161). It does create an interesting inconsistency: it might be frustrating if one were interested in basins/ice divides for both the ice sheet and the peripheral ice caps. I’m not sure if this is common enough to deserve a separate ‘combined’ dataset, but perhaps it is something worth thinking about.
It is a missed opportunity, given the effort in producing and describing the August 2022 Sentinel-2 mosaic, that it is not made available for users. The ITS_LIVE project makes available a similar mosaic (for 2019) as Cloud-optimised GeoTIFFs (CoGs) and a GUI-GIS-friendly .vrt file on their AWS bucket (https://its-live-data.s3.amazonaws.com/index.html#rgb_mosaics/GRE2/), although it is (to my knowledge) neither well documented nor advertised. An equivalent dataset that is appropriately described would give the community a FAIR and citable Greenlandic mosaic for visualization exercises. This could be extremely low-hanging fruit in the context of this paper.
On the topic of CoGs, I wasn’t able to validate whether the data is available in cloud-optimized geospatial formats that would allow users to access the required data directly within their code without having to download the full dataset, as you can currently do with e.g. ArcticDEM/REMA and ITS_LIVE data. I couldn’t manage to test this with the Dataverse files, as I’m not sure it’s possible to get direct links to the datasets. I’m not sure that the geopackage format is cloud-optimised anyway (if I remember correctly, vector datasets probably require a geoparquet or geojson to be cloud-friendly). If it is possible, perhaps the authors could include a Jupyter Notebook on the GitHub to show users how.
MINOR COMMENTS
L50-54 - long-running sentence, could do with being split!
L136 - Debris is referred to as challenging (L320, etc), but I think the methods under-discuss what guidance, if any, was provided to the delineators for difficult choices. How was the SPOT 6/7 data helpful in determining debris? Was it a case of visualizing surface roughness, shadows, etc?
L143-150 - Perhaps a useful additional practical measure in this paragraph would be the approximate distance between polygon vertices in densely and coarsely mapped sectors of the ice sheet?
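One way such a vertex-spacing figure could be derived is sketched below. It assumes the outline vertices are available as (x, y) pairs in a projected coordinate system with metre units. The ring is a hypothetical toy example, not taken from the dataset, and `vertex_spacings` is a hypothetical helper.

```python
import math
from statistics import median


def vertex_spacings(ring):
    """Distances (m) between consecutive vertices of a ring of (x, y) tuples."""
    return [math.dist(a, b) for a, b in zip(ring, ring[1:])]


# Hypothetical closed ring in projected metres; the first vertex is repeated at the end.
ring = [(0.0, 0.0), (30.0, 0.0), (30.0, 40.0), (0.0, 0.0)]

spacings = vertex_spacings(ring)
print(spacings)          # [30.0, 40.0, 50.0]
print(median(spacings))  # 40.0
```

Reporting the median (or a percentile range) separately for a densely mapped and a coarsely mapped sector would give readers a concrete sense of the digitization detail.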
L256 - In various places around the paper (I first started picking up on this in this paragraph), the ice mask vector is stated to represent ‘glacierized areas’ of the Greenland Ice Sheet. I am unsure whether there is a consensus that a ‘glacierized area’ would include floating ice, as your dataset does. Perhaps it is worth stating explicitly in the abstract and top-level README.md files that the dataset includes floating ice?
L340/Section 5.3 - It is a shame that this section is not accompanied by a figure that could visualize the various alternative masks in a few locations.
L393-395 - Perhaps also recommend a citation of this publication as well as the dataset citation, as is consistent with your GitHub terms of use as well as generally aligned with other dataset citation requests?
Figure 6b - Is this Venn diagram meant to have areas proportional to the data counts? If so, like with pie charts, this is probably hard for humans to meaningfully ingest and could probably be better represented with e.g. a stacked bar chart with a single (horizontal) bar.