GloLakes: a database of global lake water storage dynamics from 1984 to present derived using laser and radar altimetry and optical remote sensing
- Fenner School of Environment & Society, Australian National University, Canberra, ACT, Australia
Abstract. Measurements of the spatiotemporal dynamics of lake and reservoir water storage are fundamental in the assessment of the influence of climate variability and anthropogenic activities on water quantity and quality, as well as wetland ecology and the estimating greenhouse gas emissions from lakes. Previous studies estimated relative water volume changes for lakes where both satellite-derived extent and radar altimetry data are available. This approach is limited to only a few hundred lakes worldwide. In this study, the number of measured lakes was increased by a factor of 400 using high-resolution Landsat and Sentinel-2 optical remote sensing and ICESat-2 laser altimetry in addition to radar altimetry from the Topex/Poseidon, Jason-1, -2 and -3, and Sentinel-3 instruments. Time series of relative (i.e., storage change) or absolute (i.e., total stored volume) storage for more than 170,000 lakes globally with a surface area of at least 1 km² (representing 99 % of the total volume of all water stored in lakes and reservoirs globally) were retrieved. Within these, we were able to develop an automated workflow for near real-time global lake monitoring of more than 27,000 lakes. The historical and near real-time lake storage dynamics data for 1984 to present are publicly available through https://doi.org/10.25914/K8ZF-6G46 (Hou et al., 2022).
Jiawei Hou et al.
Status: final response (author comments only)
- RC1: 'Comment on essd-2022-266', Anonymous Referee #1, 10 Oct 2022
Hou et al. present a dataset of global water storage variations. This paper is unique in that it attempts to construct absolute storage variability time series, which are challenging to produce. It also aims to fuse together multiple freely available datasets in a novel approach. However, I find it to be, overall, a flawed manuscript and dataset that needs many improvements (see major comments below). In brief, I am concerned that the authors have not accurately described the dataset, with regard to both the long-term and NRT storage dynamics, and have not performed sufficient validation analyses. The paper is also poorly written in places, with numerous typos and grammatical errors as well as poorly structured paragraphs, and the figures are weak and do not sufficiently illustrate the dataset. Without substantial changes to the manuscript, presentation and perhaps the dataset itself, I’m not sure this paper and dataset would be of value to the broader community.
Major Comments
- Estimation of long-term storage dynamics. Throughout the text, the authors state that they ‘measured’ long-term absolute storage dynamics and/or ‘produced’ absolute storage time series. To be clear, what the authors did was apply a geostatistical model to estimate water depth and then use this to estimate an absolute storage time series when combined with a Landsat-derived dataset of lake extent. There was thus no ‘measurement of storage’ here; what the authors did was ‘estimate’ storage based on statistical relationships. While there is nothing wrong with estimating using these geostatistical relationships, it is imperative that this is explained correctly and consistently throughout the paper so as not to cause confusion with other methods which actually calculate volume change based on water level observations.
- ICESat-2 data is not NRT. The authors state the importance of Near Real Time (NRT) lake monitoring with a latency of ~1-10 days. They then include ICESat-2 as one of the potential datasets to use for NRT monitoring. However, this indicates a fundamental misunderstanding of ICESat-2 and how it is processed. First of all, ICESat-2 has a repeat time of 91 days, so on average you get an observation about once every three months (though this does vary based on the size of the lake). Second of all, unlike with, say, MODIS, Landsat, or Sentinel-2, ICESat-2 data is not immediately released. Currently (as of Oct 9, 2022), the most recently available ICESat-2 data is through June 8th, 2022, and this has been fairly consistent over the past few years (ICESat-2 releases data about every 6 months). While the NSIDC ICESat-2 website is perhaps a little misleading in that it says data is available up to the present, so I understand some of the confusion, simply playing around with the data will quickly reveal the extremely long latency of ICESat-2 products. It is thus very much inaccurate to use ICESat-2 as a potential NRT water volume estimator in the method described here.
- No validation of NRT data. While the authors do perform validation for 238 lakes for what appears to be the geostatistically estimated historical time series, it is not explicitly stated whether they perform any validation of the NRT time series (both the absolute and relative). More detailed information on the accuracy of the NRT time series, and how it varies between using V-H vs. V-A relationships (or how it varies by lake size, if possible), is required to be able to evaluate this dataset.
- Need for large error bars and more details about the approach. The paper glosses over the specifics of the statistical method (pointing towards another paper) used to construct the historical time series, but more details on this method should be provided to enable the reader to better understand the method and its potential accuracy. The dataset should also include significant error bars on the resulting time series, or at least clear information that these are all estimates with on average ~50% error. This is even more important for the NRT estimations, as given the use of look-up tables to approximate V-H/V-A relationships, these should have even greater error than the original volume time series (see comment about need for validation above)!
- Poor quality figures. The figures included are weak and do not provide sufficient info about the dataset. Figure 3 is very difficult to interpret (what is the difference between panel (b) and panel (c)?). Figures 1 and 2 are useful but are very large and all 4 examples for each may not be needed. Figure 4 is useful and depicts the accuracy analysis well. I’d suggest adding additional figures illustrating the accuracy of the geostatistical estimation of water depth (and perhaps the resulting volume time series). It also would be useful to provide more information about where it was possible to build NRT and relative time series, and where it was not possible (perhaps in a figure? Or table?).
- Why not include Landsat and Sentinel-2 (not BLUEDOT) as NRT? Given that the authors calculate NRT storage variations simply by building V-H or V-A relationships with the results of their Landsat and geostatistical depth model volume time series, it should be possible to construct NRT time series for thousands more lakes globally by just classifying water in Landsat-8/9 and Sentinel-2 (and this would actually be NRT, unlike ICESat-2). While I understand that part of the point of this paper is to use existing datasets, classifying water in Landsat-8/9 and Sentinel-2 is pretty darn standard and straightforward at this point, particularly with the existence of cloud platforms like GEE. I’m not suggesting the authors do this globally, more just pointing out that this approach (while likely quite inaccurate) could in theory be used for thousands more lakes. Also, why (for the Sentinel-2 data in particular) do you need to use a lookup table to estimate volume via a V-A relationship? Surely you could use the same geostatistical model to calculate volume from the Sentinel-2 (or Landsat 8/9) area observation?
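For readers unfamiliar with the lookup-table idea mentioned in the comments above, the following is a minimal sketch of estimating volume from an observed area through a V-A table with linear interpolation. It is not the authors' implementation: the function name `volume_from_area`, the table values and the clamping behaviour outside the table range are all assumptions made for illustration.

```python
from bisect import bisect_left


def volume_from_area(area, table):
    """Estimate volume from area by linear interpolation in a V-A lookup table.

    `table` is a list of (area, volume) pairs sorted by strictly increasing area.
    Areas outside the table range are clamped to the nearest end (an assumption).
    """
    areas = [a for a, _ in table]
    volumes = [v for _, v in table]
    if area <= areas[0]:
        return volumes[0]
    if area >= areas[-1]:
        return volumes[-1]
    i = bisect_left(areas, area)  # first index with areas[i] >= area
    a0, a1 = areas[i - 1], areas[i]
    v0, v1 = volumes[i - 1], volumes[i]
    return v0 + (v1 - v0) * (area - a0) / (a1 - a0)


# Hypothetical V-A table (area, volume) in arbitrary but consistent units.
va_table = [(1.0, 0.5), (2.0, 1.5), (4.0, 4.5)]

# A new area observation falling between two table entries.
estimated_volume = volume_from_area(3.0, va_table)
```

Any error in the volumes used to build such a table carries over to every volume interpolated from it, which is why the comments above ask for separate validation of the NRT estimates.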
Specific Comments
Lines 6-7: The first sentence of the abstract is unnecessarily long and wordy and contains a typo. Please rephrase.
Lines 56-74: Nice review of all of the different global water datasets, but this paragraph could use some restructuring as in its current form, this transition from the statement about ICESat-2 laser altimetry towards stating that “the spatial resolution of global satellite-dervied surface water dynamics projects…” doesn’t make much sense, as the paragraph then moves to talking about measurements of surface water extent, not water level. I would suggest reorganizing this paragraph and being clearer about developments in observations of water level vs. water extent.
Line 85: Add “cannot measure lake depth and therefore are unable to measure absolute water volume without the use of bathymetric data”
Lines 93-105: This paragraph is confusingly worded. The authors state that “Relative storage changes were estimated … while absolute storage changes were…” which is immediately followed by a statement that this was possible for more than 27,000 lakes worldwide. According to my understanding of the paper, the absolute storage changes estimated using a geostatistical model were possible for all ~170,000 lakes, whereas the relative changes were possible for ~23,000 and NRT volumes (using the statistical model plus V-H or V-A relationships) were estimated for ~27,000. As currently written, this paragraph thus inaccurately describes the results.
Line 97: Remove “Furthermore”
Line 153: What is meant by “future GSWD water bodies”?
Line 195: While NSIDC is where the ICESat-2 data is hosted, it is incorrect to call it a monitoring platform. Just call it ICESat-2 data.
Table 1: It is incorrect to call NSIDC the platform for ICESat-2. I’m not sure exactly what term would be useful here, but I’d suggest simply calling the datasets something like ICESat-2, USDA G-REALM and BLUEDOT (Sentinel-2).
Figure 3 caption (and throughout the manuscript). It is incorrect to state that “storage dynamics for the period of 1984-2020 were measured in this study”. Given that storage was estimated based purely on statistical relationships that are likely to be highly inaccurate in many places (and with an overall error of ~50%), ‘estimated’ is the only appropriate word to describe this approach.
Line 277: What is meant by “Overall, the relative volume dynamics are generally more reliable, as indicated by the correlation values”? Does this refer to the NRT time series? Or specifically the NRT time series calculated from A-H relationships? And correlation with what? The results on this accuracy are not reported in the paper (see major comment above).
Line 291: Again, please be clear here that these results are ‘estimated’.
Line 299: It is not necessary to state that Busker’s dataset was not publicly available (you can always just email an author and ask for it, as technically every dataset should be publicly available on some level). I suggest removing this part of the sentence.
Line 343: SWOT will measure water height every 11 days (its orbit has a return period of 21 days, but since it is an off-nadir satellite, it will measure every water body every 11 days; see https://swot.jpl.nasa.gov/).
Line 356: I explored the online Global Water Monitor linked in the paper. This is a cool way of displaying and communicating the data, and I commend the authors for putting it together, though it needs a bit of cleaning up (for example, what is the unit “GL” on the y axis for the annual time series of water volume?). If possible, it also would benefit from including error bars.
- RC2: 'Edit to RC1 (from Reviewer #1)', Anonymous Referee #1, 10 Oct 2022
I wanted to lightly amend my review, specifically about the latency of ICESat-2, as I just learned about the existence of ICESat-2 Quick Look, low-latency products (https://nsidc.org/data/user-resources/help-center/faqs-icesat-2-quick-looks). My comments about the limited value of ICESat-2 for NRT data due to its 91-day repeat cycle still apply, but the existence of these quick look products does mean that it is technically possible to use the data within a few days after collection. I apologize for not acknowledging this before.
If the authors are to continue to use ICESat-2 data for their NRT dataset, I would advise explicitly discussing the quick look data as well as its advantages/disadvantages relative to the final product. Similarly, while the other two datasets mentioned in the NRT section (G-REALM and BLUEDOT) are relatively easy to download and process into lake height/area (since they already come processed to individual lake height/area), doing so with ICESat-2 is significantly more complicated as it requires choices around how to aggregate different tracks, filter out poor quality or outlier water level observations, etc. This additional difficulty should be described in the manuscript with additional details on how the authors process the ICESat-2 data automatically.
- AC1: 'Reply on RC1', Jiawei Hou, 02 Jan 2023
- RC3: 'Comment on essd-2022-266', Anonymous Referee #2, 02 Dec 2022
Hou et al. presented a nice study on creating a new time-varying dataset on global lake water storage. Understanding lake storage variability is critical for securing freshwater supplies. The authors leveraged multiple datasets to produce a global dataset with hundreds of thousands of lakes included, which seems to be an impressive piece of work. However, I have a few major comments on the method and data quality.
Pekel et al. used a global model to classify water and land. The authors depended on the Pekel et al. data to track water area changes in each lake locally. It is not clear to me whether the global model is suitable for studying each individual water body. For example, how can lake area change be accurately captured by the global model in each lake?
The monthly lake areas were generated by recovering water areas from contaminated images, as in a previous publication by the authors (Hou et al., 2022). This is not new, as quite a few recent studies have done a similar thing. As monthly lake areas are critical for generating monthly storage, given that monthly level data is pretty rare, the uncertainty of the recovered water areas seems to have a non-negligible impact on the derived storage change. Have the authors assessed the uncertainty of areas from contaminated images? How did the generated time series compare with other existing approaches?
I do not believe the geostatistical model used by Messager et al., 2016 to predict the total volume of a lake can be used to derive actual lake bathymetry here. The geostatistical model was based on global DEM products, which have an uncertainty of several meters on average. Additionally, the water level conditions at the DEM acquisition time vary. As the DEM data used in Messager et al., 2016 cannot retrieve the true land surface elevation underneath water, I think this would introduce an even larger uncertainty (e.g., dozens of meters) when the authors extrapolated water levels beneath the level at the DEM acquisition date.
The validation appears to be insufficient. How did the authors select the 238 lakes for the validation, and why is this a comprehensive evaluation? Did the authors consider the performance of the method on cold-region lakes (e.g., in Canada)? What is the accuracy for smaller lakes, given that this method is only significant for small lakes, as existing studies already give fairly good estimates for large lakes? Why did the authors use the relative metric R, given that R only gives a correlation estimate? For example, if the storage was scaled from area, no matter how large the level error is, the R value remains the same. Without a comprehensive validation, it is hard to foresee that the produced datasets would be useful for scientific inquiries.
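The point about R can be illustrated with a short sketch. It shows only the general property that the Pearson correlation is unchanged by a positive linear rescaling of the estimates, while an absolute metric such as RMSE is not. The numbers below are hypothetical and are not taken from the manuscript or its validation data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical observed and estimated storage values (arbitrary units).
observed = [10.0, 12.0, 9.0, 15.0, 11.0]
estimated = [11.0, 12.5, 9.5, 14.0, 10.0]

# The same estimates with a large scale error and offset applied.
rescaled = [3.0 * v + 20.0 for v in estimated]

r_estimated = pearson_r(observed, estimated)
r_rescaled = pearson_r(observed, rescaled)        # identical to r_estimated
rmse_estimated = rmse(observed, estimated)
rmse_rescaled = rmse(observed, rescaled)          # much larger than rmse_estimated
```

Reporting an absolute error metric alongside R would expose scale and offset errors of this kind that R alone cannot detect.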
Specific comments:
Line 64: I would suggest replacing “a larger number” with the actual number.
Line 86: It seems that Avisse et al. is not the only study on estimating lake storage change based on DEM data. Maybe also highlight other relevant studies here.
Line 280: The authors only show a few case studies for the validation. It would be better to have a figure or table to show all data used for the validation.
- AC2: 'Reply on RC2', Jiawei Hou, 02 Jan 2023
Data sets
GloLakes: Global historical and near real-time lake storage dynamics from 1984-present Jiawei Hou, Albert I. J. M. Van Dijk, Luigi J. Renzullo and Pablo R. Larraondo https://doi.org/10.25914/K8ZF-6G46