the Creative Commons Attribution 4.0 License.
A globally sampled high-resolution hand-labeled validation dataset for evaluating surface water extent maps
Rohit Mukherjee
Frederick Policelli
Ruixue Wang
Beth Tellman
Prashanti Sharma
Zhijie Zhang
Jonathan Giezendanner
Abstract. Effective monitoring of global water resources is increasingly critical due to climate change and population growth. Advancements in remote sensing technology, especially in spatial, spectral, and temporal resolutions, have revolutionized water resource monitoring, leading to more frequent and higher-quality surface water extent maps produced with techniques ranging from traditional image processing to machine learning algorithms. However, satellite imagery datasets contain trade-offs that result in inconsistencies in performance. Examples include the disparity in measurement principles between optical (Sentinel-2) and radar (Sentinel-1) sensors, and the differences in spatial and spectral resolutions among optical sensors. Therefore, developing accurate and robust surface water mapping solutions requires independent validation against multiple datasets in order to identify potential biases within imagery and algorithms. However, high-quality validation datasets are expensive to build, and few contain information on water resources. For this purpose, we introduce a globally sampled, high spatial resolution dataset labeled using 3 m PlanetScope imagery. Our surface water extent dataset comprises 90 images, each with a size of 1024 × 1024 pixels, which were sampled using a stratified random sampling strategy. We covered all 14 biomes and also highlighted urban and rural regions, lakes, and rivers, including braided rivers and shorelines. To demonstrate the usability of our dataset, we evaluated our novel Sentinel-1 algorithm called the Equal Percent Solution (EPS) for surface water extent delineation. Our method produced an overall accuracy of 88 %, with low commission error. However, EPS also had a high omission error. While investigating the source of this issue using our hand labels, we found evidence that water signals in Sentinel-1 are affected by turbulence and muddiness. Further, mountainous regions distorted the signals from the water in river valleys, leading to inaccuracies. Similar to our evaluation, we expect our dataset to be used for analyzing satellite products and methods to gain insights into their advantages and drawbacks. We expect our high-quality dataset to improve our understanding of the accuracy, spatial generalizability, and robustness of existing surface water products and methods to promote efficient monitoring of our natural resources.
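The three figures quoted in the abstract (overall accuracy, commission error, omission error) can be illustrated with a minimal Python sketch. It assumes the conventional per-class definitions for the water class: commission error is the share of predicted water pixels that are not water in the reference, and omission error is the share of reference water pixels that the prediction misses. The function name `water_map_errors` and the toy labels are hypothetical and are not taken from the preprint.

```python
def water_map_errors(predicted, reference):
    """Compare two equal-length sequences of 0/1 labels (1 = water).

    Returns overall accuracy plus commission and omission error for the
    water class, using the conventional confusion-matrix definitions.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("label sequences must not be empty")

    tp = fp = fn = tn = 0
    for p, r in zip(predicted, reference):
        if p == 1 and r == 1:
            tp += 1  # water correctly mapped
        elif p == 1 and r == 0:
            fp += 1  # land mapped as water (commission)
        elif p == 0 and r == 1:
            fn += 1  # water missed (omission)
        else:
            tn += 1  # land correctly mapped

    total = tp + fp + fn + tn
    return {
        "overall_accuracy": (tp + tn) / total,
        # Guard against division by zero when no water is predicted/present.
        "commission_error": fp / (tp + fp) if (tp + fp) else 0.0,
        "omission_error": fn / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy example: flattened pixel labels from a hypothetical map and hand labels.
predicted = [1, 1, 0, 0, 0, 0, 1, 0]
reference = [1, 1, 1, 1, 0, 0, 0, 0]
print(water_map_errors(predicted, reference))
```

In this toy case half of the reference water pixels are missed, so the omission error is 0.5 even though the overall accuracy is 0.625. This is the kind of gap between the headline accuracy and the per-class errors that hand labels make visible.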
Rohit Mukherjee et al.
Status: open (until 06 Oct 2023)
RC1: 'The paper should have a clear focus on the validation dataset (reflecting the title of the paper)', Wolfgang Wagner, 30 Sep 2023
This is a well-written paper describing a 3 m water extent dataset that can be used for validating 10-20 m resolution water extent datasets. Unfortunately, rather than just focusing on this validation dataset, the authors also introduce a new Sentinel-1 algorithm called the Equal Percent Solution (EPS) for surface water extent delineation. My main concerns are:
- The water extent validation dataset derived from 3 m PlanetScope images covers 90 sites worldwide. It is certainly a valuable dataset, but still limited when applied to validating global water body datasets. For example, there are very few sites in arid and semi-arid environments, where water body mapping with both optical and radar data can be very problematic.
- Validation datasets do not just require sites with water bodies. Areas with no water cover should also be included to arrive at representative accuracy statistics.
- The EPS method is essentially a threshold-based classifier for images characterized by a bi-modal distribution. This method and its limitations are well known. A major problem is that it does not work if there is no bi-modal distribution, i.e. it fails in many instances, e.g., when the water extent is small or there is no water. It also does not work if there are many water look-alike areas.
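The concern in the last point can be illustrated with a small Python sketch. It applies Otsu's method, a generic histogram threshold that stands in here for a bi-modal classifier. It is not the EPS algorithm, whose details are not given in this comment. The backscatter values are synthetic, and the function names `otsu_threshold` and `water_fraction` are hypothetical.

```python
import random


def otsu_threshold(values, bins=64):
    """Return the histogram threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # constant image: no meaningful threshold exists
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]

    total = len(values)
    total_sum = sum(h * c for h, c in zip(hist, centers))
    best_var, best_t = -1.0, lo
    count0, sum0 = 0, 0.0
    for i in range(bins - 1):
        count0 += hist[i]
        sum0 += hist[i] * centers[i]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue
        mean0 = sum0 / count0
        mean1 = (total_sum - sum0) / count1
        between = count0 * count1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, lo + (i + 1) * width
    return best_t


def water_fraction(values, threshold):
    """Share of pixels labeled water (backscatter below the threshold)."""
    return sum(v < threshold for v in values) / len(values)


rng = random.Random(0)
# Assumed, synthetic backscatter in dB: dark water and brighter land.
water = [rng.gauss(-20.0, 1.0) for _ in range(300)]
land = [rng.gauss(-8.0, 1.5) for _ in range(700)]

bimodal = water + land   # scene with a clear water mode
unimodal = land          # scene with no water at all

for name, scene in (("bimodal", bimodal), ("unimodal", unimodal)):
    t = otsu_threshold(scene)
    print(name, round(t, 2), round(water_fraction(scene, t), 2))
```

On the bi-modal scene the threshold falls in the gap between the two modes. On the water-free scene the same procedure still returns a threshold and labels a large share of land pixels as water, which is why a bi-modality check or no-water sites in the validation set would matter.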
My recommendations to improve this work are: (i) extend the validation dataset to cover more diverse environmental conditions, i.e. also areas with no water bodies, and (ii) rather than validating their own Sentinel-1 dataset, the authors can demonstrate the value of their validation dataset by using it to validate several published water body datasets.
The EPS method is in my view not publishable within the context of this paper. The authors should consider a dedicated paper where they work out the innovation of their algorithm compared to published algorithms in much more detail. Furthermore, the method needs much more testing and critical examination. From my knowledge of the Sentinel-1 backscatter data, I expect it to fail in many situations, e.g., when trying to apply it to areas with no water or trying to map water in arid environments.
MINOR COMMENTS
Lines 46ff: The Copernicus Emergency Management Service also offers global near-real-time flood maps based on Sentinel-1.
Line 53: How does the sentence starting with Wieland et al. (2023) logically connect with the previous sentence?
Line 59: “can” instead of “could”
Line 76: Delete “of the true distribution”
Line 81: Delete “urban regions have a higher density of built-up environments” (that is obvious, isn’t it?)
Line 193: Do the authors believe that muddiness is the reason for the different Sentinel-1 signals? If yes, then much more evidence is necessary to support this claim.
Citation: https://doi.org/10.5194/essd-2023-168-RC1
Data sets
Global Surface Water Validation Dataset Rohit Mukherjee, Frederick Policelli, Ruixue Wang, Beth Tellman, Prashanti Sharma, Zhijie Zhang, and Jonathan Giezendanner https://data.cyverse.org/dav-anon/iplant/home/jgiezendanner/Mukherjee_HighResolutionSurfaceWaterLabels_Mai2023.zip