https://doi.org/10.5194/essd-2026-406
18 Jun 2026
Status: this preprint is currently under review for the journal ESSD.

A benchmark deep learning dataset for the classification of supraglacial lake drainage mechanism across the central-west Greenland Ice Sheet

Joshua H. Rines, Ching-Yao Lai, Ellianna Abrahams, Michael G. Shahin, Niall B. Coffey, Eojin Lee, and Laura A. Stevens

Abstract. Supraglacial lakes on the Greenland Ice Sheet drain through physically distinct pathways: hydrofracture, moulins, lateral stream routing, and crevasse fields. Each drainage mechanism carries distinct implications for ice sheet dynamics. Existing automated classifications reduce each lake’s drainage behavior to a time series of scalar values representing the observed water surface area, and classify each lake by drainage rate (e.g., rapid vs. slow). This scalar reduction conflates physically different drainage mechanisms, which can only be distinguished by tracking each lake in both space and time. Here we introduce a human-benchmarked, machine-learning-ready dataset that pairs full Sentinel-2 multispectral satellite imagery time series with labels assigned by human experts for N = 1679 supraglacial lakes in the central-west basin of the Greenland Ice Sheet during the 2018 (n = 679) and 2019 (n = 1000) melt seasons. The dataset is formatted as per-lake CF-1.8 NetCDF files, each containing: six Sentinel-2 reflectance bands at 10 m spatial resolution and daily cadence over the 153-day melt season (1 May to 30 September); a per-pixel binary cloud mask; co-registered lake water masks (both static and dynamic); and the human-assigned drainage classification labels. We accompany the dataset with a baseline deep learning classifier, demonstrating the utility of the dataset both in deep learning workflows and in extending lake drainage classification from rate-based to mechanism-based schemes. The dataset is released through the Stanford Digital Repository under a CC BY 4.0 license, and the accompanying open-source sat-tile-stack preprocessing software is released under an MIT license.
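
As a rough illustration of the scalar reduction described in the abstract, the sketch below collapses a (time, y, x) water-mask stack into a daily water-area series and shows two toy lakes that drain with different spatial patterns yet produce identical series. It is a minimal sketch on synthetic arrays, not the authors' code: the function names, array layout, and toy masks are assumptions made here for illustration, and nothing below reflects the actual variable names in the released NetCDF files. Only the 10 m pixel size and the 1 May to 30 September season come from the abstract.

```python
from datetime import date, timedelta

import numpy as np

# 10 m Sentinel-2 pixels (from the abstract) -> 100 m^2 per pixel.
PIXEL_AREA_M2 = 10.0 * 10.0


def melt_season_dates(year):
    """Daily dates from 1 May to 30 September, inclusive (hypothetical helper)."""
    start, end = date(year, 5, 1), date(year, 9, 30)
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def water_area_series(water_mask, cloud_mask):
    """Reduce (time, y, x) boolean masks to water area per time step in m^2.

    Cloudy pixels are excluded from the count; a time step whose tile is
    entirely cloud-covered is reported as NaN rather than zero.
    """
    water = np.asarray(water_mask, dtype=bool)
    cloud = np.asarray(cloud_mask, dtype=bool)
    area = (water & ~cloud).sum(axis=(1, 2)).astype(float) * PIXEL_AREA_M2
    area[cloud.all(axis=(1, 2))] = np.nan
    return area


# Two toy lakes on a 4x4 tile over three time steps (synthetic data).
lake_a = np.zeros((3, 4, 4), dtype=bool)
lake_b = np.zeros((3, 4, 4), dtype=bool)
lake_a[0, 1:3, 0:4] = True  # both lakes start with the same 8-pixel footprint
lake_b[0, 1:3, 0:4] = True
lake_a[1, 1:3, 1:3] = True  # lake A shrinks toward its center
lake_b[1, 1:3, 0:2] = True  # lake B retreats toward one edge
no_cloud = np.zeros((3, 4, 4), dtype=bool)

area_a = water_area_series(lake_a, no_cloud)
area_b = water_area_series(lake_b, no_cloud)

season = melt_season_dates(2018)
print(len(season), season[0], season[-1])
print(area_a, area_b)
# The area series match even though the masks differ in space.
print(np.array_equal(area_a, area_b), np.array_equal(lake_a, lake_b))
```

The two area series are indistinguishable, while the mask stacks are not, which is the kind of information a classifier working on the full image time series can in principle use.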

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 25 Jul 2026)


Data sets

Central West Greenland Supraglacial Lake Drainage Classification Dataset (2018-2019) Joshua H. Rines, Ching-Yao Lai, Ellianna Abrahams, Michael G. Shahin, Niall B. Coffey, Eojin Lee, and Laura Stevens https://doi.org/10.25740/sf350xp4038

Model code and software

sat-tile-stack Joshua Rines and Ellianna Abrahams https://github.com/jharlanr/sat-tile-stack/

Latest update: 20 Jun 2026
Short summary
Lakes that form atop the Greenland Ice Sheet each summer can drain in different ways, each of which uniquely influences how the ice flows. Most automated methods classify drainage type based on rate, which conflates these differences. We built a large dataset of satellite imagery showing how around 1,700 lakes evolve over time, each labeled by drainage type. Released freely, this dataset lets researchers train models to classify lakes by how they drain, not just how fast.