Articles | Volume 11, issue 4
https://doi.org/10.5194/essd-11-1603-2019
Data description paper | 28 Oct 2019

High-temporal-resolution water level and storage change data sets for lakes on the Tibetan Plateau during 2000–2017 using multiple altimetric missions and Landsat-derived lake shoreline positions

Xingdong Li, Di Long, Qi Huang, Pengfei Han, Fanyu Zhao, and Yoshihide Wada
Abstract

The Tibetan Plateau (TP), known as Asia's water tower, is quite sensitive to climate change, which is reflected by changes in hydrologic state variables such as lake water storage. Given the extremely limited ground observations on the TP due to the harsh environment and complex terrain, we exploited multiple altimetric missions and Landsat satellite data to create high-temporal-resolution lake water level and storage change time series at weekly to monthly timescales for 52 large lakes (50 lakes larger than 150 km² and 2 lakes larger than 100 km²) on the TP during 2000–2017. The data sets are available online at https://doi.org/10.1594/PANGAEA.898411 (Li et al., 2019). With Landsat archives and altimetry data, we developed water levels from lake shoreline positions (i.e., Landsat-derived water levels) that cover the study period and serve as an ideal reference for merging multisource lake water levels after removing systematic biases. To validate the Landsat-derived water levels, field experiments were carried out in two typical lakes, and a theoretical uncertainty analysis was performed based on high-resolution optical images (0.8 m). The RMSE of the Landsat-derived water levels is 0.11 m compared with the in situ measurements, consistent with the magnitude from the theoretical analysis (0.1–0.2 m). The Landsat-derived water levels, which can also be obtained for relatively small lakes, have an accuracy comparable with that of most altimetry data. The resulting merged Landsat-derived and altimetric lake water levels can provide accurate information for multiyear and short-term monitoring of lake water levels and storage changes on the TP, as well as critical information for monitoring and predicting lake overflow floods, as the expansion of some TP lakes becomes a serious threat to surrounding residents and infrastructure.

1 Introduction

The Tibetan Plateau (TP), providing vital water resources for more than a billion people in Asia, is a sensitive region undergoing rapid climate change (Field et al., 2014). There are more than 1200 alpine lakes larger than 1 km² on the TP, where glaciers and permafrost are also widely distributed. With little disturbance by human activity in this area, lake storage changes may serve as an important indicator that reflects changes in regional hydrologic processes and responses to climate change. Wang et al. (2018) showed that global endorheic basins are experiencing a decline in water storage, whereas the endorheic basin on the TP is an exception. Given that TP lakes have been expanding for more than 20 years (Pekel et al., 2016), quality data sets on lake water level and/or storage could be the basis for investigating the causes of this expansion (e.g., climate change/variability) and its interactions with the water/energy cycles and human society (e.g., increasing risks of inundation and overflow floods).

As an important component of the hydrosphere, terrestrial water cycle, and global water balance, millions of inland water bodies such as lakes, wetlands, and reservoirs have been investigated globally, and their total storage was estimated to be 181.9 × 10³ km³ based on statistical models (Lehner and Döll, 2004; Messager et al., 2016; Pekel et al., 2016). Lake storage changes, which play an important role in the regional water balance, can be derived from changes in lake water level and area (Frappart et al., 2005). Lake water levels and areas are mostly derived from satellite remote sensing due to the scarcity of in situ data across the TP, where the harsh environment and complex terrain make in situ measurements difficult and costly to perform (Crétaux et al., 2016; Song et al., 2013; Yao et al., 2018b; Zhang et al., 2017a). Lake water levels can be monitored using satellite altimeters initially designed for sea surface topography or ice sheet/sea ice freeboard height measurements. Satellite altimeters determine the range between the nadir point and satellite by analyzing the waveforms of reflected electromagnetic pulses.

There are two major categories of satellite altimeters, i.e., laser and radar. Laser altimeters, e.g., the Ice, Cloud, and land Elevation Satellite (ICESat), operating in the near-infrared band have smaller footprints and generally higher accuracy than radar altimeters, facilitating applications in glacier/ice mass balance studies (Neckel et al., 2014; Sørensen et al., 2011). Radar altimeters, operating in the microwave band, have larger footprints and are more likely to be contaminated by signals from complex terrain when applied to inland water bodies. Nevertheless, it is possible to remove these impacts with waveform retracking algorithms (Guo et al., 2009; Huang et al., 2018; Jiang et al., 2017). Zhang et al. (2011) mapped water level changes in 111 TP lakes for the 2003–2009 period using ICESat data that have a temporal resolution of 91 days. ICESat data have relatively denser ground tracks but a lower temporal resolution than most other altimetric missions. This means that ICESat covers more lakes but provides fewer water level observations for each lake. After ICESat was decommissioned in 2010, CryoSat-2 data starting from 2010 were adopted in related studies (Jiang et al., 2017), due to their similarly dense ground tracks and competitive precision compared to ICESat. Other altimetric missions, such as TOPEX/Poseidon (T/P), Jason-1/2/3, the European Remote Sensing (ERS-1/2) satellite, and Envisat, have also been applied to monitoring changes in lake water level on the TP, but only to a limited extent because of their sparse ground tracks. In this study, multisource altimetry data (i.e., Jason-1/2/3, Envisat, ICESat, and CryoSat-2) were combined where available for each lake, with the Landsat-derived water levels developed in this study serving as a critical reference both for increasing the number of water level observations and for merging data from multiple altimetric missions.

Changes in lake area can be captured by optical or synthetic aperture radar (SAR) images from medium- or high-spatial-resolution remote sensing data, such as the Landsat and Sentinel series. Lake water bodies can be extracted manually (Wan et al., 2016) or automatically (Zhang et al., 2017b). Automatic water extraction methods based on the water index and auto-thresholding are more efficient in dealing with large numbers of remote sensing images. Even so, acquisition and preprocessing of such a large amount of historical data (∼10 TB) covering TP lakes are still intractable for researchers with limited computational resources. With the help of cloud-based platforms, such as the Google Earth Engine (GEE), which significantly reduces data downloading and preprocessing time, tens of thousands of images may be processed online in days instead of months (Gorelick et al., 2017). In this study, more than 20 000 Landsat images were processed online using GEE to extract lake water bodies based on the water index (McFeeters, 1996) and the Otsu algorithm (Otsu, 1979).

There have been studies focusing on changes in lake water storage on the TP over recent decades; e.g., Zhang et al. (2017a) examined changes in water storage for ∼70 lakes from the 1970s to 2015 with ICESat altimetry data and Landsat archives. An individual lake area data set from the 1970s and annual area data after 1989 were used. Due to the short time span of ICESat, they used the hypsometric method to convert lake areas into water levels. Yao et al. (2018b) used digital elevation models (DEMs) and optical images to develop hypsometric curves for lakes on the central TP and estimated annual changes in water storage for 871 lakes from 2002 to 2015. These studies have a wide spatial coverage of lakes but a relatively low temporal resolution and little spaceborne altimetric information, which may limit the accuracy of trends in lake water level/storage in some cases, as well as the short-term monitoring of lake overflow floods. The Laboratoire d'Etudes en Géophysique et Océanographie Spatiales (LEGOS) Hydroweb provides a lake data set, including multisource altimetry-based changes in lake water level and storage as well as hypsometric curves for 22 TP lakes (Crétaux et al., 2016, 2011b). The data set incorporates more spaceborne altimetric information and has a higher temporal resolution. However, a residual bias may remain when altimetric data from different sources are merged, because a common reference is lacking. This study shows that such a reference can be derived from optical remote sensing. We term the reference data the “Landsat-derived water level”, which is introduced in Sect. 3.2. Here, we list recent studies and data sets (Table 1) to provide a concise summary of remote sensing monitoring of water levels and storage changes over lakes on the TP.

Table 1. Recent studies and data sets on TP lakes. H, A, and V in the table denote lake water level, area, and volume, respectively.


The overall objective of this study was to examine multiyear and short-term changes in water level and storage across 52 large lakes on the TP (50 lakes larger than 150 km² and 2 lakes larger than 100 km²) by merging multisource altimetry and optical remote sensing images. The resulting products are more coherent, high-temporal-resolution lake water level and storage change data sets at weekly to monthly timescales during 2000–2017, together with the hypsometric curve (i.e., the lake-area–water-level relationship) for each lake. To investigate changes in lake storage, lake water levels and areas need to be derived from multisource remote sensing.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f01

Figure 1. Spatial (the number of lakes covered) and temporal coverage and their overlap periods of multiple satellite altimetric missions used in this study, including Jason-1/2/3, Envisat, ICESat, and CryoSat-2.


First, water levels from various satellite altimeters (Fig. 1) for each lake as well as lake shoreline positions and lake areas from optical remote sensing images (i.e., Landsat) were derived. Second, systematic biases between different altimetry data were removed by either comparing the mean water levels during the overlap period (Fig. 1) or comparing the two water level time series with lake shoreline positions, depending on the length of the overlap period (details can be found in Sect. 3.1). Lake-shoreline-position-derived water levels, termed the Landsat-derived water levels in this study, can serve as a unique source of information reflecting water levels as well as a data merging reference. We will show that after deriving two or three regression parameters, lake shoreline positions can reflect lake water levels with an accuracy comparable to that of altimetry-derived water levels. Third, with information on lake water levels and areas derived from altimetry data and optical remote sensing images, the hypsometric curve that describes the relationship between lake area and water level was derived. Fourth, the hypsometric curve was integrated to convert lake water levels into storage changes.

Results of this study provide a comprehensive and detailed assessment of changes in lake level and storage on the TP for the recent 2 decades and short-term monitoring of lake overflow floods for some lakes. This study could largely benefit more detailed investigations into lakes, lake basins, and regional climate change, because the generated data sets have the highest temporal resolution during the study period with systematic biases well removed. To ensure the data quality, field experiments were carried out and in situ data were collected to examine the uncertainty in the Landsat-derived water levels. Users are free to access the data set described in this paper at https://doi.org/10.1594/PANGAEA.898411 (Li et al., 2019).

2 Study area and data

2.1 Study area

The TP can be generally divided into 12 major basins (Wan et al., 2016; Zhang et al., 2013), among which the inner/central TP (CP) is the only endorheic basin and home to most TP lakes, including ∼300 TP lakes larger than 10 km². Therefore, it was chosen as the main study area. The endorheic basin covers an area of ∼710 000 km² (∼28 % of the total TP) with a mean elevation of ∼4900 m and has a semiarid plateau climate with annual precipitation ranging from 96 to 295 mm (Li et al., 2017c). Most lakes in the endorheic basin have been expanding under the influence of climate change/variability, as opposed to lakes in other areas of the TP. For example, Selin Co exceeded Nam Co in area and consequently became the largest lake in the endorheic basin between 2011 and 2012, and it expanded by 26 % over the past 40 years (Zhou et al., 2015), whereas Yamzhog Yumco (also known as Yamdrok Lake; outside the endorheic basin, 350 km to the southeast of Selin Co) shrank by ∼11 % during 2002–2014 according to Wan et al. (2016). Located in the southeast of the endorheic basin, the Nam Co basin, which covers about 10 800 km² (19 % of which is lake water area) and has a mean lake elevation of ∼4730 m, was chosen as a field experiment site. The mean annual temperature and precipitation of Nam Co are 1.3 °C and 486 mm, respectively (Li et al., 2017a). The other experiment site was Yamzhog Yumco, which has a mean lake elevation of ∼4440 m. Subject to steep mountainous terrain, the lake has a narrow-strip shape with complex shorelines. The basin of Yamzhog Yumco covers ∼6100 km², with mean annual temperature and precipitation of 2.8 °C and ∼360 mm, respectively (Yu et al., 2011). An overall map of the experiment lakes is given in Fig. 2.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f02

Figure 2. Experiment locations: Nam Co and Yamzhog Yumco. Nam Co is located in the endorheic basin of the TP, while Yamzhog Yumco is located in the Yarlung Zangbo river basin (the upper Brahmaputra River). Both lakes are close to Lhasa city.

Table 2. Multisource altimetry data used in this study.

S-GDR stands for Sensor Geophysical Data Record; GDR stands for Geophysical Data Record; GLAH 14 stands for GLAS/ICESat L2 Global Land Surface Altimetry Data (HDF5), version 34; CNES stands for Centre National d'Etudes Spatiales; Aviso stands for Archiving, Validation and Interpretation of Satellite Oceanographic data; Aviso+ data set is available via FTP at http://ftp-access.aviso.altimetry.fr with a registered username and password (last access: 18 August 2019); ESA Envisat products are available via FTP at http://ra2-ftp-ds.eo.esa.int with a registered username and password (last access: 18 August 2019); ESA CryoSat-2 products are available via FTP at http://calval-pds.cryosat.esa.int with a registered username and password (last access: 18 August 2019); NASA ICESat products are available at https://nsidc.org/data/icesat/data.html (last access: 18 August 2019).


2.2 Data

Multisource altimetry data were used in this study as shown in Table 2. The earliest record dates back to 2002 (i.e., Envisat and Jason-1) and the latest record ends in 2017 (i.e., Jason-3 and CryoSat-2, Fig. 1). Most of the 52 lakes examined in this study were covered by ICESat, Envisat, and CryoSat-2 data. ICESat data provided by the National Aeronautics and Space Administration (NASA) were available on 42 lakes in this study. Envisat and CryoSat-2 data provided by the European Space Agency (ESA) were available on 35 and 51 lakes in this study. Jason-1/2/3 data provided by the Centre National d'Etudes Spatiales (CNES) were available only on 12 lakes in this study due to the relatively sparse ground tracks or data quality issues. Note that Jason-2 inherited the orbit of Jason-1 after its launch in 2008, whereas Jason-1 was shifted into an interleaved orbit and continued functioning until 2013, thereby increasing the spatial coverage of Jason altimetry series to some degree, e.g., Jason-1 data in Lake Qinghai, the largest lake on the TP, were only available after 2008 due to the orbit shift. ICESat and CryoSat-2 data have the largest spatial coverage but relatively long repeat cycles of 91 and 369 days, respectively (Bouzinac, 2012; Zhang et al., 2011). The Envisat mission has a lower orbit than Jason-1/2/3 but higher than ICESat, resulting in a moderate spatial coverage and a temporal resolution of 35 days (Benveniste et al., 2002). To determine if the altimetry data fall into the lakes, a lake shape data set generated by Wan et al. (2016) was used. An example of using the lake shape data set to determine altimetry data falling into the lake boundaries is given in Fig. 3a, showing that data from all altimeters are available in Zhari Namco.

It should be noted that altimeters differ in the wavelengths of the electromagnetic radiation they use and in their operating mechanisms. For instance, Jason-1/2/3 using the Ku and C bands and Envisat/RA-2 using the Ku and S bands work in the low-resolution mode (LRM). These dual-frequency radar altimeters can provide more accurate range corrections for the ionospheric effect (Tournadre, 2004). The LRM is typical for early satellite altimeters such as TOPEX/Poseidon. There are more advanced modes, such as SAR and interferometric synthetic aperture radar (InSAR), for recent radar altimeters, which generally have smaller footprints than the LRM. CryoSat-2/SIRAL, working at a single Ku band, has three modes, i.e., LRM, SAR, and InSAR, which were designed to have an increasing resolution in turn and to work in different zones. The InSAR mode uses interference phenomena so that the shift of the nadir point across the track can be detected, improving the altimeter's performance on ice sheets with slopes (Bouzinac, 2012). The Geoscience Laser Altimeter System (GLAS) is the laser altimeter carried by ICESat, working in the near-infrared band.

We used Landsat 5 TM (2000–2011), Landsat 7 ETM+ (2000–2017), and Landsat 8 OLI (2013–2017) surface reflectance data sets provided by GEE to generate information on lake shoreline positions and lake areas. Landsat 7 ETM+ was subject to sensor failure, and all the Landsat 7 ETM+ images contain gaps after 2003 (Markham et al., 2004). There were more than 20 000 images processed, and half of them were excluded from the final results due to cloud contamination or gaps. We collected daily in situ water level measurements in Yamzhog Yumco for validation purposes with a pressure-type water level sensor. The in situ water level measurements spanned half a year from May to October 2018. We also performed unmanned aerial vehicle (UAV or drone) imaging over a 1 km lake bank in Yamzhog Yumco and Nam Co for obtaining better knowledge on the experimental environment.

In addition, GaoFen-2 (GF-2, the China High-Resolution Earth Observation System mission) images were used to perform a rigorous statistical analysis of uncertainty in the Landsat-derived water levels by taking the GF-2-derived lake shoreline positions as the ground truth to analyze the subpixel water area ratios of Landsat image pixels (see Sect. 4). GF-2 images have a spatial resolution of 0.8 m for the panchromatic band, and preprocessing including orthorectification and radiometric calibration was performed. Before analysis, we performed an image-to-image registration with manually selected tie points between GF-2 and corresponding Landsat 8 OLI images until the coregistration error reduced to  2 m.

3 Method

3.1 Satellite altimetry water level

The first step of deriving satellite altimetry water levels is to select the correct ground tracks and the valid footprints falling on the lakes. Because there is a random ground track shift of ∼1 km between cycles for most altimetry missions, it is uncertain whether valid lake footprints can be obtained for each cycle, even though the nominal ground track seems to cross the lake. This problem can be addressed by comparing the geographic coordinates of the footprints with a lake shape data set (Wan et al., 2016). After picking out the valid footprints, the lake surface height can be calculated for each footprint. All radar altimetry data share the following relationship:

$$\mathrm{LSH} = \mathrm{Alt} - (\mathrm{Range} + \mathrm{cor}), \tag{1}$$

where LSH represents the lake surface height with respect to the geoid; Alt represents the altimeter height with respect to the reference ellipsoid; Range represents the distance between the altimeter and lake surface; and cor represents the sum of several range corrections for atmospheric effects, sensor design defects, and geophysical effects. Radar altimeters and laser altimeters need different corrections, given that they work at different wavelengths and have different designs. For instance, corrections for radar altimeters include waveform retracking correction, wet/dry troposphere correction, ionosphere correction, pole/solid tide correction, and geoid correction. Laser altimeters also need atmospheric delay correction, solid/pole tide correction, and geoid correction. Unlike for radar altimeters, however, saturation correction rather than waveform retracking correction is the more important correction for laser altimeters.
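As a minimal numerical sketch of Eq. (1), the snippet below sums a set of range corrections and subtracts the corrected range from the altimeter height. The function name, the correction labels, and all numbers are hypothetical; they only illustrate the sign convention and are not taken from any mission product.

```python
def lake_surface_height(alt_m, range_m, corrections_m):
    """Eq. (1): LSH = Alt - (Range + cor), all quantities in metres.

    corrections_m maps a (hypothetical) correction label to its value;
    'cor' in Eq. (1) is taken here as the plain sum of these terms.
    """
    cor = sum(corrections_m.values())
    return alt_m - (range_m + cor)


# Made-up numbers for a single footprint, for illustration only.
example_corrections = {
    "dry_troposphere": -1.30,
    "wet_troposphere": -0.05,
    "ionosphere": -0.02,
    "solid_and_pole_tide": 0.10,
    "geoid": -30.00,
    "retracking": 0.40,
}
lsh = lake_surface_height(1_347_000.00, 1_342_400.00, example_corrections)
print(round(lsh, 2))
```

Which terms enter the sum, and with which sign, depends on the product documentation of each mission, so the dictionary above should be read as a placeholder.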

The retracking correction plays an important role in removing the contamination of land signals when radar altimetry data are applied to inland water bodies. In this study, Jason-1/2/3 data were retracked using a classical waveform retracking algorithm, i.e., the improved threshold method (ITR), whereas CryoSat-2 data were retracked using the narrow primary peak threshold (NPPT) method (Birkett and Beckley, 2010; Cheng et al., 2010; Jain et al., 2015). Retracking corrections were not performed for Envisat and ICESat data, because the Envisat/RA-2 product already provided a retracked range using the ICE-1 method, and the ICESat GLAH14 product already included several corrections (such as saturation correction) that are sufficient for most applications including studies on inland water bodies (Zhang et al., 2011). The underlying idea of the NPPT, ICE-1, and ITR methods is quite similar. All of them adopt a threshold, defined as a percentage of the waveform peak, to determine the retracking gate. The difference between the retracking gate and the nominal gate is then converted into a range correction by multiplying it by the gate range (cΔt∕2, where c is the speed of light and Δt is the time duration of a gate). The differences lie in the choice of thresholds as well as the calculation of waveform peaks. For instance, ICE-1 uses a 30 % threshold, whereas ITR uses a 50 % threshold.
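The shared idea of the threshold retrackers can be sketched as follows. This is a generic threshold retracker that takes the waveform peak as the maximum power and interpolates linearly on the leading edge; the specific peak calculations and sub-waveform selection of ITR, NPPT, and ICE-1 are not reproduced. The function names and the example waveform are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def retracking_gate(waveform, threshold):
    """Fractional gate where the power first reaches threshold * peak.

    The peak is taken as the maximum power (a simplification), and the
    crossing is located by linear interpolation between adjacent gates.
    """
    level = threshold * max(waveform)
    for i, power in enumerate(waveform):
        if power >= level:
            if i == 0:
                return 0.0
            prev = waveform[i - 1]  # prev < level <= power holds here
            return (i - 1) + (level - prev) / (power - prev)
    raise ValueError("waveform never reaches the threshold level")


def range_correction(waveform, threshold, nominal_gate, gate_duration_s):
    """(retracking gate - nominal gate) times the gate range c*dt/2, in metres."""
    gate = retracking_gate(waveform, threshold)
    return (gate - nominal_gate) * C * gate_duration_s / 2.0


# Toy waveform: noise floor, sharp leading edge, short plateau.
toy_waveform = [0.0, 0.0, 0.0, 10.0, 10.0, 0.0]
print(retracking_gate(toy_waveform, 0.5), retracking_gate(toy_waveform, 0.3))
```

A lower threshold places the retracking gate earlier on the leading edge, which is why retrackers with different thresholds can yield systematically different ranges for the same waveform.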

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f03

Figure 3. (a) Ground tracks of multiple altimetric missions over Zhari Namco and (b) the merged altimetry water levels for Zhari Namco. LSH stands for lake surface height.

For each cycle of an altimeter, more than one footprint commonly falls on a lake, thereby providing several lake surface height (LSH) observations on the same day. After removing outliers with the three-sigma rule, frequency distributions of the LSHs from the same cycle were calculated. The mean value of the histogram bin with the highest frequency was selected to represent the LSH for the cycle. Meanwhile, the frequency of the chosen histogram bin was retained to evaluate the data quality for the cycle; e.g., a cycle was marked as high quality if the frequency was higher than 0.8, moderate quality if it was between 0.5 and 0.8, and poor quality if it was lower than 0.5. The LSHs from all cycles constituted the water level time series for a lake. LSHs that were marked as poor quality and obviously deviated from the moving average were removed from the altimetry-based lake water level time series.
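The per-cycle selection described above can be sketched as follows. The bin width is not stated in the text, so the default used here is an assumption, as are the function name and the handling of a frequency of exactly 0.5 (treated as poor quality).

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def cycle_lsh(heights_m, bin_width_m=0.5):
    """Return (representative LSH, modal-bin frequency, quality flag)."""
    mu, sigma = mean(heights_m), pstdev(heights_m)
    # Three-sigma rule: drop footprints far from the cycle mean.
    kept = [h for h in heights_m if abs(h - mu) <= 3.0 * sigma]
    # Fixed-width histogram; the bin width is an assumed parameter.
    bins = defaultdict(list)
    for h in kept:
        bins[math.floor(h / bin_width_m)].append(h)
    modal = max(bins.values(), key=len)  # bin with the highest frequency
    freq = len(modal) / len(kept)
    if freq > 0.8:
        quality = "high"
    elif freq > 0.5:
        quality = "moderate"
    else:
        quality = "poor"
    return mean(modal), freq, quality


# Nineteen consistent footprints plus one land-contaminated outlier.
footprints = [4500.1] * 10 + [4500.2] * 9 + [4510.0]
print(cycle_lsh(footprints))
```

Note that the three-sigma rule can only reject a single outlier when a cycle has more than about ten footprints, so the modal-bin step does most of the work for sparsely sampled lakes.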

It is not uncommon that systematic biases exist in different altimetry data sets due to variations in orbit, the discrepancy between correction models, errors associated with sensors, and even the choice of the reference datum. After deriving lake water level time series for each altimeter, we first merged the Envisat and ICESat water levels if both were available for a lake, because they have the longest overlap period (Fig. 1). We chose Envisat-derived water levels as the baseline and removed the difference of the mean values of the two products during the overlap period from the ICESat data, because Envisat data are generally denser and longer than ICESat data. A similar process was applied to Jason-1/2/3, as there are two overlap periods connecting the three altimeters together. Figure 3b shows a result of merged altimetry data when all sensors are available. There are tradeoffs between CryoSat-2 and Jason-2/3 data in terms of spatial coverage and time span. CryoSat-2 data are available for all lakes in this study but they only have an overlap period with Jason-2/3 data, whereas Jason-2/3 data are only available for 12 lakes. For most lakes without Jason-2/3 data, we merged CryoSat-2 data with either ICESat or Envisat using Landsat-derived water levels spanning from 2000 to 2017, because there is no overlap period between these altimetry water levels (Fig. 1). Details on how Landsat-derived water levels aid in merging the altimetry water levels are shown in Sect. 3.2.
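A minimal sketch of the overlap-period bias removal is given below, assuming each time series is a list of (time, level) pairs with time in decimal years. The function name and the toy values are hypothetical.

```python
from statistics import mean


def merge_by_overlap(baseline, other):
    """Shift `other` onto `baseline` using mean levels in the overlap period.

    Both inputs are lists of (time, level_m) tuples. Returns the merged,
    time-sorted series and the bias removed from `other`.
    """
    start = max(min(t for t, _ in baseline), min(t for t, _ in other))
    end = min(max(t for t, _ in baseline), max(t for t, _ in other))
    base_in = [h for t, h in baseline if start <= t <= end]
    other_in = [h for t, h in other if start <= t <= end]
    if not base_in or not other_in:
        raise ValueError("no usable overlap; a bridging reference is needed")
    bias = mean(other_in) - mean(base_in)
    adjusted = [(t, h - bias) for t, h in other]
    return sorted(baseline + adjusted), bias


# Toy series: the second record sits 1.0 m above the baseline.
envisat_like = [(2003.0, 10.0), (2004.0, 10.2), (2005.0, 10.4)]
icesat_like = [(2004.0, 11.2), (2005.0, 11.4), (2006.0, 11.6)]
merged, removed_bias = merge_by_overlap(envisat_like, icesat_like)
print(removed_bias)
```

The error raised for a missing overlap corresponds to the case in which the Landsat-derived water levels are needed as a bridge.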

3.2 Landsat-derived water level

For most lake basins, it is possible to find a relatively flat portion of the lake bank with an average slope of 1∕30 or even smaller, where obvious interannual or intra-annual changes in lake shoreline positions can be detected using Landsat images (30 m resolution). These locations can be found by comparing lake images from the first year and the last year of the study period if the lake shows a clear expanding/shrinking trend. Otherwise, we can compare images acquired in early summer when the LSH is at its lowest level with those acquired in late autumn when the lake expands to its limit. In this study, we assumed that the selected lake bank was flat enough such that the relationship between the lake water level and shoreline position can be described in a linear or quasilinear (parabolic) way. Thus, we can transform the lake shoreline positions into Landsat-derived water levels by fitting them to altimetry water levels. The validity of this assumption can be evaluated with the coefficient of determination (R²) for each lake as shown in Table 3. For most lakes, R² is higher than 0.7, suggesting a generally good fit between the lake water levels and shoreline positions.

Table 3. Summary of regression analyses and hypsometric function by lake.


Though there were ∼500 Landsat images obtained for the selected lake banks during the study period, many of them were largely affected by cloud or cloud shadow. All the images were processed online using the GEE application programming interface (API). Preprocessing such as radiometric calibration, atmospheric correction, and geometric correction was already performed in the production of the data sets. In addition, the failure of the scan line corrector (SLC) of Landsat 7 left all the Landsat 7 ETM+ images with gaps after 2003 (Markham et al., 2004), making the available images even fewer. We managed to make use of some images with gaps in generating lake shoreline positions. By choosing a region of interest (ROI) that is parallel to the image gaps, we made most of the Landsat 7 ETM+ images usable. However, the width of the ROI must be reduced to avoid shifting gaps as shown in Fig. 4b. The gap positions vary with time, but they tend to oscillate around a midpoint. The ROI was not extended to fill the whole interval between gaps, because the wider the ROI, the more likely it is that shifting gaps cross it.

Lake shoreline positions were characterized by water area ratios detected in the ROI. To automatically extract water areas from a mass of Landsat archives on GEE, the water index and Otsu threshold method were jointly used. We calculated the normalized difference water index (NDWI) and the modified normalized difference water index (MNDWI) of the images and compared their performance in different seasons. It was found that the MNDWI tends to be more sensitive to shallow water in summer but is less effective than the NDWI when the lake bank is covered by snow in the cold season as shown in Fig. 5c. Therefore, the two water indices were jointly used by applying the MNDWI to images acquired during May to October and applying the NDWI to the remaining months. The NDWI and MNDWI can be calculated as follows (McFeeters, 1996; Xu, 2005):

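$$\mathrm{NDWI} = \frac{B_{\mathrm{green}} - B_{\mathrm{NIR}}}{B_{\mathrm{green}} + B_{\mathrm{NIR}}}, \tag{2}$$

$$\mathrm{MNDWI} = \frac{B_{\mathrm{green}} - B_{\mathrm{SIR}}}{B_{\mathrm{green}} + B_{\mathrm{SIR}}}, \tag{3}$$

where $B_{\mathrm{green}}$, $B_{\mathrm{NIR}}$, and $B_{\mathrm{SIR}}$ refer to the surface reflectance of bands 2, 4, and 5 for Landsat 5/7 TM/ETM+ images and bands 3, 5, and 6 for Landsat 8 OLI images.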
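For a single pixel, Eqs. (2) and (3) and the seasonal choice between them can be sketched as follows. The function and dictionary names are hypothetical, the reflectance values are made up, and the actual processing in this study was done on GEE rather than with this code.

```python
# Band numbers for the green, NIR, and SIR reflectances, as listed in the text.
BANDS = {
    "landsat_5_7": {"green": 2, "nir": 4, "sir": 5},
    "landsat_8": {"green": 3, "nir": 5, "sir": 6},
}


def ndwi(green, nir):
    """Eq. (2): normalized difference water index."""
    return (green - nir) / (green + nir)


def mndwi(green, sir):
    """Eq. (3): modified normalized difference water index."""
    return (green - sir) / (green + sir)


def water_index(green, nir, sir, month):
    """MNDWI for images acquired in May to October, NDWI otherwise."""
    if 5 <= month <= 10:
        return mndwi(green, sir)
    return ndwi(green, nir)


# Made-up surface reflectances for one water-like pixel.
print(water_index(0.10, 0.05, 0.02, month=7))
print(water_index(0.10, 0.05, 0.02, month=1))
```

Pixels whose two reflectances sum to zero would raise a division error here; a production workflow would mask such pixels first.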

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f04

Figure 4. (a) Yamzhog Yumco and its surroundings. The DEM was extracted from the SRTM Global 90 m DEM product (Jarvis et al., 2008). (b) Region of interest (ROI, yellow area) selected from a Landsat 7 ETM+ image for detecting changes in lake shoreline (black areas represent gaps in the image). (c) Linear regression of lake shoreline positions that are represented by water area ratios in the ROI and altimetry water levels for Yamzhog Yumco. (d) Landsat-derived water levels and altimetry water levels for Yamzhog Yumco.


https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f05

Figure 5. (a) A Landsat 7 ETM+ image of Lake Aqqikkol acquired in summer in 2001; (b) water area extractions using the modified normalized difference water index (MNDWI) and the normalized difference water index (NDWI), showing that the MNDWI performs better in detecting shallow water; (c) a Landsat 8 OLI image of Nam Co acquired in winter in 2015; (d) water area extraction using the NDWI, showing good performance in distinguishing water from snow; and (e) water area extraction using the MNDWI, showing some confusion of water and snow.

After calculating the water index, the grayscale image was binarized using the Otsu method. If the selected ROI comprises ∼50 % water and ∼50 % land, the method performs well, as the distribution of digital numbers of the grayscale image is close to the bimodal histogram assumed implicitly by the Otsu algorithm (Kittler and Illingworth, 1985; Otsu, 1979). The binarized images were further processed to provide the water area ratio in the ROI, which represents the lake shoreline position. The lake shoreline position time series were then converted into Landsat-derived water levels using linear regression or a second-order polynomial fit with altimetry-derived water levels (Fig. 4c–d). For most cases, we only used linear regression, and we performed the second-order polynomial fit only for 2 lakes with Jason-1/2/3 data, because a higher-order regression requires more input information to ensure reliability. However, cloud, cloud shadow, and shifting gaps may contaminate the ROI and cause errors in the Landsat-derived water levels. Therefore, the QA band of the Landsat surface reflectance product was used to filter the images. Data were excluded if the fraction of the cloud- or cloud-shadow-covered area in the ROI was higher than 5 %. For every Landsat 7 ETM+ image acquired after 2003, the number of pixels in the ROI was counted and compared with that of images acquired before 2003. If the loss of pixels exceeded 2 %, the ROI was considered to be affected by a gap and the data were consequently excluded from the subsequent analysis.
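The thresholding and filtering steps above can be sketched as follows. The threshold function is an exact, unbinned variant of Otsu's criterion (it maximizes the between-class variance over all split points of the sorted values) rather than the histogram-based form typically applied to images, and it is not the GEE implementation used in this study. All function names and sample values are hypothetical.

```python
def otsu_threshold(values):
    """Split point that maximizes the between-class variance."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    best_t, best_var, cum = None, -1.0, 0.0
    for k in range(1, n):
        cum += xs[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # no valid split between identical values
        w0, w1 = k / n, (n - k) / n
        m0, m1 = cum / k, (total - cum) / (n - k)
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, 0.5 * (xs[k - 1] + xs[k])
    if best_t is None:
        raise ValueError("need at least two distinct values")
    return best_t


def water_area_ratio(index_values):
    """Fraction of ROI pixels classified as water (index above threshold)."""
    t = otsu_threshold(index_values)
    return sum(1 for v in index_values if v > t) / len(index_values)


def roi_is_usable(cloud_fraction, pixel_count, reference_pixel_count):
    """Apply the 5 % cloud/shadow limit and the 2 % pixel-loss limit."""
    pixel_loss = 1.0 - pixel_count / reference_pixel_count
    return cloud_fraction <= 0.05 and pixel_loss <= 0.02


# Toy ROI: four land-like and six water-like water-index values.
roi = [-0.40, -0.35, -0.30, -0.45, 0.50, 0.55, 0.60, 0.45, 0.50, 0.52]
print(otsu_threshold(roi), water_area_ratio(roi))
```

The toy ROI is deliberately close to an even water/land split, which is the situation in which the bimodal assumption holds best.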

A critical function of Landsat-derived water levels is to aid in merging altimetry water levels when there is no overlap period between altimeters or the overlap period is too short. For lakes without Jason-1/2/3 data, lake shoreline positions were first translated into Landsat-derived water levels by fitting them to CryoSat-2 data, which effectively extends the CryoSat-2 record by 1–2 years through extrapolation. Then, we applied the same method used for merging Jason-1/2/3 to merge the extrapolated CryoSat-2 data with either Envisat or ICESat data. In doing so, we were able to remove all systematic biases between multisource altimetry water levels. After merging the altimetry water levels, we performed a second regression analysis between the Landsat-derived water levels and the merged altimetry water levels to check whether the linear relationship is stable during the entire study period and at different elevations. If the linear relationship was stable, the Landsat-derived water levels were merged with the altimetry water levels using the linear fitting parameters from the second regression analysis. Otherwise, there may have been a change in the lake bank slope, and, therefore, the extrapolation of CryoSat-2 data with Landsat-derived water levels was not suitable. In this case, we reselected the ROI to extract lake shoreline positions and repeated the altimetry data merging until the Landsat-derived water levels and merged altimetry water levels agreed well with one another in the second linear regression. A detailed analysis of the potential extrapolation issue can be found in the Supplement.
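The bridging idea can be sketched as follows, assuming that each altimetry water level has already been paired with a water area ratio from a Landsat image acquired close in time (the pairing rule is not specified here and is left out). The function names and toy values are hypothetical, and only the linear case is shown.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares y = slope * x + intercept, with R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def bridge_bias(ratios_ref, levels_ref, ratios_other, levels_other):
    """Bias of `other` relative to Landsat-derived levels fitted on `ref`."""
    slope, intercept, r2 = fit_line(ratios_ref, levels_ref)
    # Landsat-derived levels at the dates of the other altimeter.
    landsat_levels = [slope * r + intercept for r in ratios_other]
    bias = mean(levels_other) - mean(landsat_levels)
    return bias, r2


# Toy case: true level = 4500 + 10 * ratio; the older altimeter reads 0.8 m high.
cryosat_like = ([0.5, 0.6, 0.7], [4505.0, 4506.0, 4507.0])
icesat_like = ([0.2, 0.3, 0.4], [4502.8, 4503.8, 4504.8])
print(bridge_bias(*cryosat_like, *icesat_like))
```

In this toy case the fit is extrapolated well outside the ratios it was trained on, which is exactly the situation the second regression check is meant to guard against.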

In summary, the basic idea of removing systematic biases between different altimetry water levels is to calculate the means of two altimetry water level time series during their overlap period. The difference between the means is removed from one of the time series so that both are consistent and together form a longer time series. This process was applied successively to all water level time series with overlap periods to merge them into a single time series for each lake. However, the overlap period may be too short, as for Envisat and CryoSat-2 (e.g., only 1–2 data points fall within the overlap period), or may not exist at all, as for ICESat and CryoSat-2. In these cases, Landsat-derived water levels are used to extend or create an overlap period that links the two altimetry water level time series.
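The mean-difference bias removal summarized above can be sketched in a few lines. The function name and the synthetic series are hypothetical, and the sketch assumes that both series sample the overlap period densely enough for their means to be comparable.

```python
import numpy as np

def merge_by_overlap_mean(t_ref, h_ref, t_new, h_new):
    """Shift h_new so that its overlap-period mean matches h_ref, then merge."""
    start = max(t_ref.min(), t_new.min())
    end = min(t_ref.max(), t_new.max())
    in_ref = (t_ref >= start) & (t_ref <= end)
    in_new = (t_new >= start) & (t_new <= end)
    if not in_ref.any() or not in_new.any():
        # No common period: a linking series (e.g., Landsat-derived) is needed.
        raise ValueError("no overlap period between the two series")
    bias = h_new[in_new].mean() - h_ref[in_ref].mean()
    t = np.concatenate([t_ref, t_new])
    h = np.concatenate([h_ref, h_new - bias])
    order = np.argsort(t)
    return t[order], h[order], bias

# Synthetic example (assumption): the second mission reads 0.5 m too high.
t_a = np.arange(0.0, 10.0)           # epochs 0..9
t_b = np.arange(6.0, 16.0)           # epochs 6..15, overlap on 6..9
h_a = 4500.0 + 0.1 * t_a
h_b = 4500.0 + 0.1 * t_b + 0.5
t_merged, h_merged, bias = merge_by_overlap_mean(t_a, h_a, t_b, h_b)
```

When the overlap contains only one or two points, the estimated bias inherits the full noise of those points, which is why the text substitutes Landsat-derived water levels as the linking series in such cases.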

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f06

Figure 6. Programming interface of Google Earth Engine©. The red polygon is the region of interest for lake area change extraction of Selin Co.

3.3 Hypsometry

We derived the hypsometric curve for each lake by polynomial fitting of the lake area and level time series. The lake area comprises two parts: the inner invariable part and the outer variable part. As the variable water area was of more concern in this study, ROIs for extracting changes in lake area only cover the lake shoreline and its neighboring areas as shown in Fig. 6. The inner part of the water body was calculated only once and considered invariant, making the calculation more efficient on GEE. Meanwhile, more images are available as the area of the ROI becomes smaller, because the possibility of clouds covering the ROI is reduced compared with an ROI covering the entire lake. Landsat 7 ETM+ images after 2003 were not included in this part of the calculation as gaps negatively affected the ROI for lake area extraction. Similar to Sect. 3.2, we selected images with less than 5 % cloud cover in an ROI to generate time series of changes in lake area, obtaining 20–30 data points on average for regression. R² values for each lake are listed in Table 3, indicating that most lake basins agree well with the parabolic hypsometric curve.
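A hypsometric fit of this kind, and its use for converting water levels into storage changes, might look like the sketch below. It assumes that lake area is modeled as a second-order polynomial of water level relative to a reference level and that storage change is the integral of area over level; the exact functional form and parameterization used in the data set should be taken from the hypsometric curve table, not from this example.

```python
import numpy as np

def fit_hypsometry(level_m, area_km2, ref_level_m, degree=2):
    """Fit area (km^2) as a polynomial of level above the reference (m)."""
    dh = level_m - ref_level_m           # keeps the fit well conditioned
    coeffs = np.polyfit(dh, area_km2, degree)
    fitted = np.polyval(coeffs, dh)
    ss_res = np.sum((area_km2 - fitted) ** 2)
    ss_tot = np.sum((area_km2 - np.mean(area_km2)) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot

def storage_change_km3(coeffs, level_m, ref_level_m):
    """Integrate the fitted area from the reference level to level_m."""
    antiderivative = np.polyint(coeffs)  # integration constant is zero
    km2_times_m = np.polyval(antiderivative, level_m - ref_level_m)
    return km2_times_m / 1000.0          # km^2 * m -> km^3

# Synthetic lake (assumption): area = 2000 + 20*dh + dh^2 km^2.
ref_level = 4540.0
levels = ref_level + np.linspace(0.0, 5.0, 25)
areas = 2000.0 + 20.0 * (levels - ref_level) + (levels - ref_level) ** 2
coeffs, r2 = fit_hypsometry(levels, areas, ref_level)
dv = storage_change_km3(coeffs, ref_level + 3.0, ref_level)
```

For the synthetic lake, a 3 m rise corresponds to 6000 + 90 + 9 = 6099 km² m, i.e., about 6.1 km³, which the fitted curve reproduces.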

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f07

Figure 7. Field experiments in two typical lakes: (a) an overview map of the experiment spot; (b) pressure-type water level sensor; (c) unmanned aerial vehicle (UAV); (d) installation of the water level sensor; and (e) UAV image of a portion of the bank of Nam Co.

4 Validation of data quality

4.1 Field experiment

Most Tibetan lakes are located in remote and inaccessible regions, resulting in a scarcity of ground-based in situ measurements that are vital for data quality assessment. We made in situ measurements in two lakes to validate the data quality of Landsat-derived water levels. The data quality of satellite altimetry on lakes or rivers has been widely investigated and is thus beyond the scope of our study. Many studies used in situ water levels to calculate statistical metrics, e.g., root mean square error (RMSE). However, results provided by different studies vary, which could be associated with the cross-section width of the study water body in the ground track panel (Nielsen et al., 2017). This means that these results may not be comparable due to their unique applications. In addition, it is not rigorous to use in situ data from only one lake to represent the overall performance in the uncertainty assessment for altimetry water levels. Instead, we used the standard deviation of valid footprints acquired in the same cycle as an estimate of uncertainty in satellite altimetry water levels. In contrast, the conditions under which Landsat-derived water levels can be applied are less variable than those of satellite altimetry data. Derivation of Landsat-derived water levels requires a relatively flat bank as well as some altimetric information, both of which were available for all lakes. Since the selected bank slopes were similarly small (1∕30), it was possible to use a few typical lakes to represent all lakes. Therefore, we carried out a field experiment (Fig. 7) in Yamzhog Yumco and Nam Co to validate the Landsat-derived water levels.

There were two main goals in our experiments: (1) collecting daily in situ water level data in a TP lake to validate the corresponding Landsat-derived water levels statistically and (2) testing the performance of extracting lake shoreline positions from high-resolution optical images (GF-2) to provide a theoretical uncertainty analysis of Landsat-derived water levels. On Yamzhog Yumco, we installed a pressure-type water level sensor (type H5110-DY, manufactured by Shenzhen Hongdian Technologies Co., Ltd.), which measured water pressure and temperature at the installation depth; these were converted into water depths with a relative accuracy of approximately 0.1 %. The device was carried onto the lake and placed approximately 20 m below the water surface and 0.5 m above the lake bottom, suggesting an absolute error of approximately 2 cm. As for GF-2 images, the spatial resolution of the panchromatic band is 0.8 m, which provides a very accurate reference of lake boundaries for assessing water classification results for Landsat images. We used three GF-2 images acquired in different seasons (two in July and September 2015 and one in February 2016) and at different places on the TP to better represent the local conditions when extracting Landsat-derived water levels or areas. Image coregistration was performed to make sure that there was no obvious spatial shift between the GF-2 images and the corresponding Landsat images. The accuracy of the image coregistration was approximately 2 m.

4.2 Uncertainty analysis of Landsat-derived water levels

Based on the in situ water level measurements made by the pressure-type water level sensor, we evaluated the accuracy of Landsat-derived water levels statistically. We first calculated anomalies of in situ water levels and Landsat-derived water levels, and then water levels from the two sources acquired on the same day were used for analysis. There were 16 Landsat-derived water level records available for the comparison against the in situ measurements, indicating an RMSE of the water level anomaly of 0.11 m. The linear fit shows a slope close to 1 and an R² of 0.89, suggesting consistency between the in situ water level measurements and the Landsat-derived water levels (Fig. 8b). It should be noted that the Landsat-derived water levels used for validation here were translated from lake shoreline positions using parameters derived from fitting with CryoSat-2 data; i.e., no in situ information was involved in generating the Landsat-derived water levels shown in Fig. 8.
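The anomaly-based comparison described above can be expressed compactly as follows. This is a sketch on synthetic numbers, not the field data: the function name is hypothetical, and anomalies are assumed to be taken relative to each series' mean over the matched dates.

```python
import numpy as np

def validate_against_in_situ(landsat_level, in_situ_level):
    """RMSE of anomalies plus slope and R^2 of the linear fit."""
    landsat_anom = landsat_level - landsat_level.mean()
    in_situ_anom = in_situ_level - in_situ_level.mean()
    rmse = float(np.sqrt(np.mean((landsat_anom - in_situ_anom) ** 2)))
    slope, _ = np.polyfit(in_situ_anom, landsat_anom, 1)
    r2 = float(np.corrcoef(in_situ_anom, landsat_anom)[0, 1] ** 2)
    return rmse, float(slope), r2

# Synthetic same-day pairs (assumption): 16 records, different vertical
# datums, and an alternating +/-0.05 m error on the Landsat side.
in_situ = np.linspace(0.0, 1.5, 16)
error = 0.05 * np.where(np.arange(16) % 2 == 0, 1.0, -1.0)
landsat = in_situ + 100.0 + error
rmse, slope, r2 = validate_against_in_situ(landsat, in_situ)
```

Working with anomalies removes the constant datum offset between the pressure sensor and the altimetry-referenced levels, so the RMSE reflects only the variable part of the disagreement.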

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f08

Figure 8. (a) In situ water level anomaly versus Landsat-derived water level anomaly in Yamzhog Yumco; (b) linear regression between the Landsat-derived water levels and in situ water levels during the same period.

Furthermore, we performed a theoretical uncertainty analysis of the Landsat-derived water levels by examining the original optical data and the generation process with the help of high-resolution GF-2 images. First, we took GF-2 images (coregistered with the Landsat image for the same period, with coregistration errors of approximately 2 m) as the ground truth to determine the accurate position and shape of the lake shoreline. Second, we performed water classification on the Landsat 8 OLI image for the same period using the water index method together with the Otsu algorithm to derive the binarized image. Landsat image pixels crossed by the lake shorelines from the GF-2 images were delineated and marked as shoreline pixels, as shown in Fig. 9a. Then the water area in each shoreline pixel was calculated.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f09

Figure 9. (a–c) GF-2 images (upper layer) and corresponding Landsat 8 OLI images (bottom layer) acquired on 7 September 2015, 29 January 2016, and 5 July 2015; (d) Landsat 8 OLI shoreline pixels (the background is the GF-2 image): blue pixels were classified as water, and yellow pixels were classified as land; (e) the relationship between the water area ratio in a pixel and the frequency of the pixel being classified as water. Blue bars are sampled at a 0.04 bin spacing from the 4128 pixels. The red line shows the fitting curve based on the maximum likelihood method.

Given that these shoreline pixels were classified as either water or land, a relationship between the water area ratio of the shoreline pixel and the probability of the pixel being classified as water can be derived. This relationship generally describes the function of the water classification method by telling how likely a pixel is to be determined as water, given the water area ratio of the pixel. Based on the observations of a total of 4128 Landsat shoreline pixels, a power function was chosen to represent the water classifier as Eq. (4) shows:

(4) $f(x) = x^{n}$,

where x represents the water area ratio in the shoreline pixel, f(x) represents the probability of the shoreline pixel being classified as the water pixel, and n is the parameter that determines the shape of the curve. Parameter n was determined using the maximum likelihood method (Fig. 9e).
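One way to estimate n by maximum likelihood is to treat each shoreline pixel as a Bernoulli trial with success probability $x^{n}$ and to maximize the resulting log-likelihood. The sketch below does this with a simple grid search on synthetic pixels; the grid, the clipping constant, and the synthetic sample are assumptions, and the paper does not describe which optimizer was used.

```python
import numpy as np

def fit_power_classifier(water_ratio, classified_as_water, n_grid=None):
    """Grid-search MLE of n in P(classified as water | x) = x**n."""
    if n_grid is None:
        n_grid = np.arange(0.2, 5.0, 0.01)
    x = np.clip(water_ratio, 1e-6, 1.0 - 1e-6)   # avoid log(0)
    c = classified_as_water.astype(float)
    log_x = np.log(x)
    best_n, best_ll = None, -np.inf
    for n in n_grid:
        p = np.exp(n * log_x)                    # f(x) = x**n
        # Bernoulli log-likelihood: c*log(p) + (1 - c)*log(1 - p)
        ll = np.sum(c * n * log_x + (1.0 - c) * np.log1p(-p))
        if ll > best_ll:
            best_n, best_ll = n, ll
    return float(best_n)

# Synthetic shoreline pixels (assumption): same sample size as in the text,
# labels drawn from a classifier with n = 1.43.
rng = np.random.default_rng(42)
true_ratio = rng.uniform(0.0, 1.0, 4128)
labels = rng.uniform(0.0, 1.0, 4128) < true_ratio ** 1.43
n_hat = fit_power_classifier(true_ratio, labels)
```

With a few thousand pixels the estimate typically lands within a few hundredths of the value used to generate the labels, which gives a feel for how well n is constrained by a sample of this size.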

As expected, the probability of the pixel being classified as water increases with the water area ratio in the pixel (Fig. 9e). The area under the fitted curve ($y = x^{1.43}$) is smaller than that under $y = x$ on [0, 1], suggesting that shoreline pixels are classified as water somewhat less often than their water area ratio would imply, which corresponds to a systematic bias in lake shoreline detection. Note that this systematic bias can be removed when linearly fitting the lake shoreline positions and altimetry water levels, as long as the bias is stable. Therefore, uncertainty in the Landsat-derived water levels developed in this study arises mainly from the variation in this systematic bias.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f10

Figure 10. F(X): probability density function of the bias (X) between the classified water ratio ($X_1$) and real water ratio ($X_0$) in a shoreline pixel.

To describe the variation in the systematic bias, a new random variable X was introduced to represent the bias between the classified water area and the real water area in a shoreline pixel. Given the shape and position of the lake shoreline, the real water area in each shoreline pixel is a complex function of the relative position between the pixel and the shoreline. To simplify the derivation, we assumed that the water area ratio in a shoreline pixel is uniformly distributed on [0,1], meaning that the probability of any value between 0 and 1 is equal. If we use X0 to represent the true water area ratio in the shoreline pixel and X1 to represent the classified results based on the water area ratio, the random variable X can be expressed as

(5) $X = X_1 - X_0$,

where $X_1$ can take on 1 or 0 (i.e., the classified results only tell us whether a pixel is a water pixel or not), so X can only take on either $1 - X_0$ or $-X_0$. Because the range of $X_0$ is [0,1], the range of X is [−1,1]. A derivation of F(X), i.e., the probability density function (PDF) of X, can be found in the Supplement (Part 2).

Overall, F(X) describes how the bias between the classified water ratio and real water ratio in shoreline pixels is distributed, as shown in Fig. 10. If there are N shoreline pixels in an ROI, we can take them as N independent observations of X and calculate their mean value $\bar{X}$. This value $\bar{X}$ represents the average shift of the detected lake shoreline from the real lake shoreline in units of 1-pixel width (30 m). As mentioned above, the systematic bias can be removed in the regression between the lake shoreline positions and altimetry water levels. As such, it is the variation of the bias that determines the accuracy of the Landsat-derived water levels. We can calculate the standard deviation of $\bar{X}$, denoted $\sigma_{\bar{X}}$, to represent the uncertainty in lake shoreline position. Note that there is a simple relationship between $\sigma_{\bar{X}}$ and the standard deviation of a single observation, $\sigma_{X}$:

(6) $\sigma_{\bar{X}} = \dfrac{\sigma_{X}}{\sqrt{N}}$.

One therefore only needs to calculate $\sigma_{X}$:

(7) $\bar{X} = \int_{-1}^{1} F(X)\, X \,\mathrm{d}X \approx -0.09$,

(8) $\sigma_{X} = \sqrt{\int_{-1}^{1} F(X)\,(X - \bar{X})^{2}\,\mathrm{d}X} \approx 0.39$.

Combined with Eqs. (4) and (7), Eq. (8) was solved numerically, resulting in approximately 0.39 pixel widths. Substituting Eq. (8) into Eq. (6) gives

(9) $\sigma_{\bar{X}} = \dfrac{0.39}{\sqrt{N}}$.

If the slope of the shoreline is known, e.g., tan θ, the uncertainty of the Landsat-derived water level can be expressed as

(10) $\sigma_{ho} = \sigma_{\bar{X}}\, d \tan\theta = \dfrac{0.39 \times 30 \times \tan\theta}{\sqrt{N}}$,

where $\sigma_{ho}$ is the uncertainty of Landsat-derived water levels and d is the spatial resolution of the satellite image (30 m). In this study, a typical width of the ROI for deriving Landsat-derived water levels is approximately 10 pixels, meaning that N is approximately 10. In addition, the lake shores used for generating Landsat-derived water levels here generally have a relatively mild slope of 1∕30 or even smaller, which can be roughly estimated from the maximum shoreline change and altimetry water level change within a year. If we use 1∕30 as the slope tan θ, the uncertainty of the Landsat-derived water levels is estimated to be approximately 0.12 m, which is very close to the RMSE of 0.11 m based on the comparison between the optical water levels and in situ water level measurements mentioned earlier.
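The numbers quoted above can be checked without the full density F(X). Under the stated assumptions (true water ratio uniform on [0, 1], classification probability $x^{n}$), the first two moments of the bias have closed forms: the mean is $1/(n+1) - 1/2$ and the mean square is $1/(n+1) - 2/(n+2) + 1/3$. These expressions are derived here for illustration and are not taken from the Supplement; the sketch simply evaluates them for n = 1.43 and applies the scaling by pixel size, slope, and number of shoreline pixels.

```python
import math

def bias_moments(n):
    """Mean and standard deviation of X = X1 - X0 (in pixel widths)."""
    mean = 1.0 / (n + 1.0) - 0.5
    mean_square = 1.0 / (n + 1.0) - 2.0 / (n + 2.0) + 1.0 / 3.0
    return mean, math.sqrt(mean_square - mean ** 2)

def level_uncertainty_m(sigma_x, n_pixels, tan_theta, pixel_size_m=30.0):
    """Water level uncertainty for a known bank slope."""
    return sigma_x * pixel_size_m * tan_theta / math.sqrt(n_pixels)

mean_bias, sigma_x = bias_moments(1.43)
sigma_level = level_uncertainty_m(sigma_x, n_pixels=10, tan_theta=1.0 / 30.0)
```

Evaluating the sketch gives a mean bias of about −0.09 pixel, a standard deviation of about 0.39 pixel, and a water level uncertainty of about 0.12 m for N = 10 and a slope of 1∕30, in line with the values stated in the text.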

However, in most cases we do not know the exact lake bank slope tan θ, which is why we performed the regression analysis between the lake shoreline positions and altimetry-derived water levels. Information on the real lake bank slope is implicitly expressed in the linear fitting slope β (for a fitting line y=βx+α). Uncertainty in the altimetric information can propagate into the fitting parameters and affect the accuracy of the generated Landsat-derived water levels. Let the observed lake shoreline position be $X_1$; e.g., $X_1 = 5.6$ means that the observed lake shoreline is 5.6 Landsat pixels away from the initial position corresponding to the lowest water level. Unlike in Eq. (5), where we focused on a single shoreline pixel, $X_1$ here can be a non-integer because it is determined by averaging all shoreline pixels in the ROI. Combining this with Eq. (5), the Landsat-derived water level (y) can be expressed as

(11) $y = \beta\,(X_1 - \bar{X}_1) + \bar{Y} = \beta\,(X_0 - \bar{X}_0) + \beta\,(X - \bar{X}) + \bar{Y}$,

where $X_1 - \bar{X}_1$ denotes the observed lake shoreline change (in units of a Landsat pixel), $\bar{X}_1$ denotes the mean of the observed lake shoreline positions used for linear regression, $\bar{Y}$ denotes the mean of altimetry water levels used for linear regression, $X_0 - \bar{X}_0$ denotes the real lake shoreline change, $X - \bar{X}$ denotes the variation of the Landsat-derived shoreline position caused by the water extraction method, and β is the linear fitting slope. The expected value of $X - \bar{X}$ is 0. As discussed earlier, a systematic bias does not affect the accuracy of the Landsat-derived water level, but the variation of the systematic bias does.

Based on Eq. (11), the overall uncertainty of the Landsat-derived water level σy can be given as

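(12) $\sigma_{y} = \sqrt{\sigma_{\beta}^{2}\left(\dfrac{\partial y}{\partial \beta}\right)^{2} + \sigma_{\bar{X}}^{2}\left(\dfrac{\partial y}{\partial (X - \bar{X})}\right)^{2} + \sigma_{\bar{Y}}^{2}\left(\dfrac{\partial y}{\partial \bar{Y}}\right)^{2}} = \sqrt{\sigma_{\beta}^{2}\,(X_1 - \bar{X}_1)^{2} + \sigma_{\bar{X}}^{2}\,\beta^{2} + \sigma_{\bar{Y}}^{2}}$,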

where β and $\sigma_{\beta}$ can be derived from the linear regression analysis, $\sigma_{\bar{X}}$ is given in Eq. (9), and $\sigma_{\bar{Y}}$ is the uncertainty of the mean altimetry water level, which can be estimated from the altimetry data. For a typical lake like Yamzhog Yumco, β=0.35 m, $\sigma_{\beta}$=0.02 m, max($|X_1 - \bar{X}_1|$)=11, $\sigma_{\bar{X}}$=0.13, and $\sigma_{\bar{Y}}$=0.015 m, which gives a maximum $\sigma_{y}$ of 0.22 m. Note that $X_0 - \bar{X}_0$ is assumed to be the ground truth, so there is no error associated with this term. This relationship shows that the uncertainty in the Landsat-derived water level increases with the distance from the center point ($\bar{X}_1$, $\bar{Y}$), represented by $(X_1 - \bar{X}_1)^{2}$. This means that extrapolation of Landsat-derived water levels (far from the center point) may introduce errors and should be used with caution. A more detailed discussion of the extrapolation can be found in the Supplement.
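The error propagation in Eq. (12) is straightforward to evaluate numerically. The sketch below plugs in the values quoted for Yamzhog Yumco; the function and argument names are hypothetical, and the three error sources are assumed to be independent, as the quadrature sum implies.

```python
import math

def landsat_level_uncertainty(beta, sigma_beta, dx_pixels, sigma_xbar, sigma_ybar):
    """Quadrature sum of the three terms in Eq. (12), in metres."""
    slope_term = (sigma_beta * dx_pixels) ** 2     # grows away from the center
    shoreline_term = (sigma_xbar * beta) ** 2      # classification variability
    altimetry_term = sigma_ybar ** 2               # mean altimetry level
    return math.sqrt(slope_term + shoreline_term + altimetry_term)

# Values quoted in the text for Yamzhog Yumco.
sigma_max = landsat_level_uncertainty(0.35, 0.02, 11.0, 0.13, 0.015)
sigma_center = landsat_level_uncertainty(0.35, 0.02, 0.0, 0.13, 0.015)
```

With these inputs the expression evaluates to about 0.225 m at the largest distance from the center point, in line with the maximum of roughly 0.22 m quoted above, and to about 0.05 m at the center point itself. The slope term dominates far from the center, which is the quantitative reason for treating extrapolated values with caution.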

Overall, the uncertainty quantification of the Landsat-derived water levels developed in this study indicates clearly that the accuracy of Landsat-derived water levels depends on the width of an ROI, e.g., the number of pixels/observations, slope of the lake shore, the effectiveness of the water classification method, and the uncertainty in the altimetry water level used for regression. One of the advantages of the Landsat-derived water level is that an ROI does not necessarily cover a large area of lake shores, which maximizes the potential of optical remote sensing images to increase the spatial coverage and temporal resolution of lake water level estimates that may not be realized by using satellite altimetry alone. Optical remote sensing images provide important complementary information on altimetry water levels and can subsequently facilitate lake water storage estimation.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f11

Figure 11. Cross validation of the TP lake level and storage changes derived from our study with those provided by the LEGOS Hydroweb database (Crétaux et al., 2011a): (a) trends in lake water levels from 2003 to 2015 and (b) trends in lake water storage from 2003 to 2015.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f12

Figure 12. Spatial distribution of trends in lake storage on the TP during 2000–2017. The black line shows the boundary of the endorheic basin of the TP including 39 lakes in this study. The other 13 lakes are located outside the endorheic basin.

4.3 Cross validation with similar products

We compared our product with a widely used lake water level/storage data set provided by LEGOS Hydroweb. The two products are, on the whole, consistent with each other (Fig. 11), but our product may perform better in terms of temporal continuity and temporal resolution (see Sect. 5.2). Both advantages are important in improving our understanding of the responses of lakes to climate change. There are 21 lakes common to our study and LEGOS Hydroweb. Annual trends in water level and lake storage during 2003–2015 are compared in Fig. 11, indicating the overall consistency of the two products in terms of the R² of the linear fit.

4.4 Data set description

The data sets cover 52 large lakes (50 lakes with a surface area larger than 150 km² and 2 lakes that are 100–150 km²) on the TP. The data sets consist of two parts: (1) a table containing hypsometric curves and corresponding regression statistics (R² and the number of data pairs) for each lake, with parameters of the hypsometric curves listed in separate columns for the convenience of batch processing; and (2) time series for each lake archived as 52 entities with geographic information (i.e., latitude, longitude, and size of the lake) that can be checked in an online map provided by PANGAEA, avoiding the confusion of lake names. The time series of each lake include lake water levels and lake storage changes.

For data points in the water level time series, satellite or sensor type is shown (i.e., from Jason-1/2/3, Envisat, ICESat, CryoSat, or optical images). Uncertainty was calculated using the standard deviation of valid footprints in the cycle (only for altimetry data). The lake water storage time series were transformed from water level time series using the hypsometric relationship so that they have equal data size. The lake water storage time series represent changes in lake storage with respect to a reference water level, which is listed in the corresponding hypsometric curve table as a parameter. The overall uncertainty of Landsat-derived water levels within the regression range (the range of altimetry water levels) is 0.1–0.2 m based on the experiment and analysis in this paper. The extrapolation of Landsat-derived water levels may occur during the time gap between altimetric missions and before 2002. The average uncertainty of altimetry water levels is 0.11 m.

5 Applications

5.1 Spatiotemporal analysis of changes in lake water storage in the Tibetan Plateau

Based on the lake water storage changes we derived, spatial patterns of lake storage trends during 2000–2017 are shown in Fig. 12. In the endorheic basin of the TP, similar to some reported results (Yao et al., 2018b; Zhang et al., 2017a), most lakes have been expanding rapidly; e.g., Selin Co (31.80° N, 89.00° E) gained 19.7±2.0 km³ of water during the study period, and Lake Kusai (35.70° N, 92.90° E) experienced an abrupt expansion due to a flood and gained 2.2±0.2 km³ of water in 2011, as reported in related work (Yao et al., 2012). In contrast, some lakes in the southern part of the TP experienced shrinkage; e.g., Yamzhog Yumco (28.93° N, 90.70° E) gained a total of 0.8±0.4 km³ of water during 2000–2004 but shrank during the remaining 13 years (2005–2017) at a rate of −0.19±0.03 km³ yr⁻¹. In contrast to Yamzhog Yumco, Lake Qinghai (36.90° N, 100.00° E) lost 2.2±0.7 km³ of water during 2000–2004 but gained 7.7±0.6 km³ of water during 2005–2017. Similar patterns can be detected in lakes adjacent to Lake Qinghai, e.g., Lake Donggei Cuona (35.28° N, 98.55° E) and Lake Ngoring (34.90° N, 97.70° E).

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f13

Figure 13. Discrepancy of lake storage trends in Goren Co between Yao et al. (2018a) and our study.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f14

Figure 14. (a) Total storage changes in the 52 lakes on the TP, which can be generally divided into two stages: (1) a rapidly increasing stage (2000–2011) with a higher increasing rate of 6.71 km³ yr⁻¹ and (2) a mildly increasing stage (2012–2017) with an increasing rate of 1.98 km³ yr⁻¹. (b) Histogram of intra-annual changes in lake water levels of the 52 lakes on the TP.

However, spatial proximity cannot fully explain the intricate trend distribution in the Selin Co basin, where large lakes such as Selin Co were expanding, whereas smaller adjacent lakes showed an opposite decreasing trend, e.g., Urru Co (31.70° N, 88.00° E), Lake Co Ngoin (31.60° N, 88.77° E), and Goren Co (31.10° N, 88.37° E). In fact, we found that the decreasing trends in some small lakes like Goren Co were not detected in Yao et al. (2018b), which is likely due to the lower temporal resolution as shown in Fig. 13. The three shrinking lakes are located in the upstream region and feed Selin Co through two small rivers. One of the rivers links lakes Goren Co, Urru Co, and Selin Co, whereas the other links lakes Co Ngoin and Selin Co.

A possible explanation of the disparity of changes in lake water storage in the Selin Co basin could be the principle of minimum potential energy. If we simplify the basin with the tank model and take the upstream small lake as a tank with a leaking hole, the storage of the small lake is mainly controlled by the height of the leaking hole. Given that surface water of the small lake increased, most of the increased water would flow into the large lake (a lower tank), and the outflow discharge of the small lake at higher elevations would increase accordingly. The height of the leaking hole would decline (erosion) so as to increase the overflow capacity, which eventually results in the decrease in small lake storage. Another possible situation is that the height of the leaking hole remains the same and the water surface height of the small lake increases, but this situation is not consistent with the minimum potential energy principle, as more water potential energy is stored in the small lake. This phenomenon shows that river-lake interactions may cause complex patterns of the regional surface water distribution. Therefore, decreases in small lake water storage and increases in water storage of Selin Co in the basin detected by our study seem reasonable. Increases in small lake water storage in this basin reported in some published studies may be associated with the sparse sampling of lake water levels.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f15

Figure 15. Similarities and differences between water level time series from the LEGOS Hydroweb database (Crétaux et al., 2011a) and our study. (a) Taro Co (31.14° N, 84.12° E). (b) Zhari Namco (30.93° N, 85.61° E). (c) Ngoring Lake (34.90° N, 97.70° E). Shaded areas highlight the differences between the two data sets. LSH represents lake surface height.

We averaged the total lake water storage change in each season to generate the time series shown in Fig. 14a. The overall storage change in the 52 lakes is 100.1±5.7 km³. The total lake water storage increased rapidly during the first 12 years but has been relatively stable since 2012. Intra-annual variation in the TP lakes can also be investigated using the densified lake water level time series generated by our study. We removed the linear trend (sometimes there were multiple linear trends for a lake in different periods, which were removed in a stepwise fashion) and calculated the mean monthly water level anomaly for each lake over the study period. The intra-annual water level change was then represented by the difference between the maximum and minimum values of the monthly water level anomaly. The histogram of the intra-annual water level change in Fig. 14b shows that most of the TP lakes have water level variations ranging from 0.3 to 0.75 m in a year on average. Similar work was performed by Lei et al. (2017), but only a small number of lakes were investigated in their study.
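The intra-annual calculation described above can be sketched as follows. For brevity the example removes a single linear trend over the whole record instead of the stepwise, piecewise removal mentioned in the text; the function name and the synthetic series are assumptions.

```python
import numpy as np

def intra_annual_range(decimal_year, level_m):
    """Max minus min of the mean monthly anomaly after linear detrending."""
    t = decimal_year - decimal_year.mean()       # centered for conditioning
    slope, intercept = np.polyfit(t, level_m, 1)
    detrended = level_m - (slope * t + intercept)
    month = np.floor((decimal_year % 1.0) * 12).astype(int)   # 0..11
    monthly_mean = np.array(
        [detrended[month == m].mean() for m in range(12) if np.any(month == m)]
    )
    return float(monthly_mean.max() - monthly_mean.min())

# Synthetic lake (assumption): mid-month samples for 2000-2017 with a
# 0.2 m/yr trend and a seasonal cycle of 0.25 m amplitude.
years = 2000.0 + (np.arange(216) + 0.5) / 12.0
levels = 4500.0 + 0.2 * (years - 2000.0) + 0.25 * np.sin(2.0 * np.pi * years)
seasonal_range = intra_annual_range(years, levels)
```

For the synthetic series the recovered range is close to 0.48 m, slightly below twice the amplitude because mid-month sampling does not hit the exact peak and trough of the cycle.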

5.2 Quality assessment of similar data products

Some obvious discrepancies between the two data sources can be noticed, e.g., in the water levels of Taro Co. Both the Hydroweb data and our estimates used ICESat and CryoSat-2 data. The difference lies in the fact that our CryoSat-2 product was more up to date, with a longer time span, whereas Hydroweb used an additional altimetry satellite, SARAL. Because the systematic biases of both products were removed, it is possible that different baselines were chosen, resulting in the overall shift shown in Fig. 15a. For instance, the two products may use different sets of ellipsoid and geoid models. In addition to the overall shift, some time-dependent discrepancies can be found in Fig. 15, e.g., in the periods highlighted by shaded areas.

The black curve shows the Landsat-derived water level we derived, which is a critical reference for connecting two different altimetry data time series without an overlap period. The Landsat-derived water level shows that the last two samples of ICESat data should not be lower than the first few samples of the CryoSat-2/SARAL data (see the dashed boxes). However, it is apparent that Hydroweb data display a reverse relationship, showing that the last two ICESat measurements are smaller than the first few CryoSat/SARAL measurements. It is likely due to an unremoved systematic bias between ICESat and CryoSat/SARAL time series from Hydroweb data in Taro Co.

Even though the Landsat-derived water levels were generated by linearly fitting the lake shoreline positions with altimetry data, the relative magnitude of water levels during different periods should not be largely affected by the fitting parameters. For example, if Landsat-derived water levels show that $H_a \ge H_b$, where $H_a$ ($H_b$) denotes water levels acquired in period A (B), this relationship would not change with the fitting parameters used to generate the Landsat-derived water levels. This is the main reason for us to use Landsat-derived water levels as a reference. Therefore, Hydroweb data may overestimate the increasing trends in the water levels of Taro Co, as their ICESat data are approximately 0.3 m lower than the SARAL/CryoSat data. A similar issue can be observed in Zhari Namco and Ngoring lakes shown in Fig. 15b–c, and the explanation is similar to that for Taro Co. This problem may also exist in similar studies in which multisource altimetry data without overlap periods were used.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f16

Figure 16. Lake water level (left y axis) estimates from our approach for six TP lakes. Black lines represent optical data and red dots represent altimetry data. LSH represents lake surface height.

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f17

Figure 17. (a) Lake storage changes in Lake Zhuonai, Lake Kusai, and Lake Salt corresponding to the outburst event in September 2011 and (b) storage changes in relevant lakes during the outburst event (a magnified plot of the shaded area in a).

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f18

Figure 18. (a) Height variations in the outlet of Lake Kusai estimated from Landsat 8 OLI images indicate that the overflow would occur when the water level increased from 4483.9 to 4484.1 m. (b) Google Earth© image before the outburst of Lake Zhuonai (December 2010). (c) Google Earth© image after the outburst event (December 2013).

https://www.earth-syst-sci-data.net/11/1603/2019/essd-11-1603-2019-f19

Figure 19. (a) Changes in the water level of Lake Kusai after receiving the outburst flood from Lake Zhuonai. Stage 1 was used to determine the range of parameter C in Eq. (13). Stage 2 was used to compare the simulated lake outflow from Lake Kusai based on Eq. (13) with the water gain estimate from remote sensing of Lake Salt downstream during the same period. (b) Changes in water storage of Lake Salt derived from remote sensing using our developed method. There was 0.19 km³ of water gained in stage 2, which was comparable to the outflow estimate of Lake Kusai (0.22 km³) based on Eq. (13).

As shown in Fig. 16, optical data can be less noisy than altimetry data in certain lakes and significantly improve the continuity of lake level and storage change monitoring. In addition, a more apparent seasonality in lake level change can be seen from the generated lake level time series. These advantages would largely benefit a better understanding of responses of TP lakes to climate change and facilitate hydrologic modeling of lake basins, regional water balance analysis, and even hydrodynamic analysis of lake water bodies.

5.3 Lake overflow flood monitoring

As mentioned earlier in Sect. 5.1, Lake Kusai experienced an abrupt expansion in 2011, resulting from the dike break of an upstream lake, Lake Zhuonai (35.54° N, 91.93° E) (Hwang et al., 2019; Liu et al., 2016; Xiaojun et al., 2012). The outburst of Lake Zhuonai occurred on 14 September (Liu et al., 2016), with 2.47±0.06 km³ of water leaking into the Kusai River (as shown in Fig. 17b), the main inflow of Lake Kusai. The water level of Lake Kusai increased by up to 7.9±0.5 m within 20 days (from 11 September to 1 October 2011) based on Jason-2 data and then started to drop as water overflowed from the southeast corner into Lake Haidingnuoer (35.55° N, 93.16° E) and Lake Salt (35.52° N, 93.40° E). Lake Salt, the lowest part of the basin and close to the basin boundary, has gained 3.0±0.1 km³ of water since 2011 and has become a critical threat to the surrounding residents and to the railway approximately 10 km southeast of the boundary. Note that few satellite altimetry data are available for Lake Salt except for several CryoSat-2 observations; here, Landsat-derived water levels can provide near-real-time monitoring of changes in lake water level and storage that are crucial to flood early warning and risk management.

Aided by the high-temporal-resolution lake water level series, it was possible to estimate the height of the outlet of Lake Kusai, an important parameter for overflow estimation. The overflow of Lake Kusai can help predict the water level rise in Lake Salt and even serve as an indicator for flood forecasting, as Jason-3 data with a 10-day revisit cycle are now available for Lake Kusai. Several pairs of Landsat 8 OLI images and lake water levels for the same period were compared to provide a range of possible outlet heights, which are likely to be 4483.9 to 4484.1 m, as shown in Fig. 18a. We then measured the mean width of the outlet from high-resolution optical images provided by Planet Explorer (Planet, 2017); the width has been relatively stable at 31.5±2.3 m since December 2011. Given lake water levels and the outlet height and width, the overflow can be estimated using the broad-crested weir formula:

(13) $Q = C\,b\,H^{1.5}\sqrt{2g}$,

where C is a coefficient mainly reflecting the geometric characteristics of the weir and typically ranging from 0.3 to 0.4, b is the width of the weir, H is the water head with respect to the top of the weir, and g is the gravitational acceleration.
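As an illustration of Eq. (13), the following minimal Python sketch evaluates the overflow for a single water level. The function name `weir_discharge`, the value g = 9.81 m s-2, and the example lake level of 4485.0 m are assumptions made here for illustration; the crest height (4484.0 m, the middle of the estimated range) and the width (31.5 m) are taken from the text.

```python
import math

G = 9.81  # gravitational acceleration in m s^-2 (assumed value)


def weir_discharge(level_m, crest_m, width_m, c=0.3):
    """Overflow Q in m^3 s^-1 from the broad-crested weir formula, Eq. (13).

    The head H is the lake level above the weir crest; when the lake is
    at or below the crest, no overflow occurs and Q is zero.
    """
    head = max(level_m - crest_m, 0.0)
    return c * width_m * head ** 1.5 * math.sqrt(2.0 * G)


# Hypothetical lake level of 4485.0 m, i.e. a head of 1 m above the crest.
q = weir_discharge(level_m=4485.0, crest_m=4484.0, width_m=31.5, c=0.3)
print(f"Q = {q:.1f} m^3/s")
```

Because Q scales with H to the power 1.5, an uncertainty of ±0.1 m in the outlet height matters most when the head is small, which is why the outlet height is treated as a range rather than a single value.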

We determined C (approximately 0.3) by using stage 1 shown in Fig. 19 as a calibration period. Details can be found in the supplementary file. We then applied this result to stage 2 shown in Fig. 19 to estimate the total overflow from Lake Kusai and compared it with the total water gain in Lake Salt during stage 2. Since Lake Salt relied mainly on replenishment from Lake Kusai during that period, with little precipitation input and negligible glacier meltwater in winter, the outflow of Lake Kusai should be comparable with the water gain in Lake Salt derived from remote sensing, apart from a small evaporation loss. This relationship provides a straightforward validation of our method. The same comparison is not possible for stage 1, because the outflow of Lake Kusai first replenished Lake Haidingnuoer until the latter began overflowing. Results based on Eq. (13) indicate that the total outflow from Lake Kusai in stage 2 ranged from 0.21 to 0.22 km3, whereas the water gain in Lake Salt from remote sensing was 0.19±0.01 km3. This agreement indicates that our high-temporal-resolution lake water level time series are valuable in monitoring and predicting lake outflow flooding, which is crucial for the safety of downstream residents and infrastructure.
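The total overflow over a period can be approximated by integrating Eq. (13) along the water level time series. The sketch below uses trapezoidal integration between successive observations, which is an assumption made here and not necessarily the integration scheme used in this study. The function names and the four water levels at 10-day spacing are hypothetical; the outlet height range, outlet width, and C are taken from the text.

```python
import math
from datetime import date

G = 9.81  # gravitational acceleration in m s^-2 (assumed value)


def weir_discharge(level_m, crest_m, width_m, c):
    """Overflow Q in m^3 s^-1 from Eq. (13); zero below the crest."""
    head = max(level_m - crest_m, 0.0)
    return c * width_m * head ** 1.5 * math.sqrt(2.0 * G)


def overflow_volume_km3(dates, levels_m, crest_m, width_m, c):
    """Trapezoidal integral of Q over the observation dates, in km^3."""
    q = [weir_discharge(h, crest_m, width_m, c) for h in levels_m]
    total_m3 = 0.0
    for i in range(1, len(dates)):
        dt_s = (dates[i] - dates[i - 1]).total_seconds()
        total_m3 += 0.5 * (q[i - 1] + q[i]) * dt_s
    return total_m3 / 1e9  # 1 km^3 = 1e9 m^3


# Hypothetical lake levels (m) sampled every 10 days.
obs_dates = [date(2016, 1, 1), date(2016, 1, 11),
             date(2016, 1, 21), date(2016, 1, 31)]
obs_levels = [4484.60, 4484.50, 4484.45, 4484.40]

# Bracket the result with the lower and upper outlet heights from the text.
for crest in (4483.9, 4484.1):
    v = overflow_volume_km3(obs_dates, obs_levels, crest, width_m=31.5, c=0.3)
    print(f"outlet at {crest} m: {v:.4f} km^3")
```

Running the integration for both ends of the outlet height range yields an interval for the total overflow, which can then be compared with the storage gain observed in the downstream lake.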

6 Data availability

The derived TP lake water levels, hypsometric curves, and water storage changes are archived and available at https://doi.org/10.1594/PANGAEA.898411 (Li et al., 2019).

7 Conclusion

In this study, we develop high-temporal-resolution (i.e., weekly to monthly timescales) water levels and storage change data sets for 52 large lakes on the TP during 2000–2017 by combining multiple altimetric missions and optical remote sensing images. Generated from lake shoreline positions and regression analysis with altimetry data, the Landsat-derived water level serves as a unique reference covering the entire study period, enabling a more consistent merging of multisource altimetry time series. Multisource altimetry water levels are first extracted separately from spaceborne altimetry products and then combined into a longer and denser altimetry water level time series with systematic biases well removed using Landsat-derived water levels as reference. The combined altimetry and Landsat-derived water levels increase the overall sampling frequency to submonthly regardless of the lake size.
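The idea of using the Landsat-derived water levels as a common reference can be sketched as follows. This is a simplified illustration only: it assumes that the systematic bias of each mission is a constant offset, estimated as the mean difference between the mission's water levels and the reference series linearly interpolated to the same times. The function names and sample values are hypothetical, and the procedure actually used is described in the methods sections.

```python
from bisect import bisect_right
from statistics import mean


def interp(t, ts, hs):
    """Linearly interpolate the series (ts, hs) at time t.

    ts must be sorted in strictly ascending order. Returns None when t
    lies outside the time span covered by the series.
    """
    if t < ts[0] or t > ts[-1]:
        return None
    if t == ts[-1]:
        return hs[-1]
    i = bisect_right(ts, t)  # ts[i - 1] <= t < ts[i]
    w = (t - ts[i - 1]) / (ts[i] - ts[i - 1])
    return hs[i - 1] + w * (hs[i] - hs[i - 1])


def remove_bias(t_ref, h_ref, t_alt, h_alt):
    """Shift one mission's levels by their mean offset from the reference."""
    diffs = []
    for t, h in zip(t_alt, h_alt):
        r = interp(t, t_ref, h_ref)
        if r is not None:
            diffs.append(h - r)
    if not diffs:
        raise ValueError("no temporal overlap with the reference series")
    bias = mean(diffs)
    return [h - bias for h in h_alt], bias


# Hypothetical reference (Landsat-derived) and altimetry series,
# with time in decimal years and water level in metres.
t_ref, h_ref = [2003.0, 2004.0, 2005.0], [10.0, 11.0, 12.0]
t_alt, h_alt = [2003.5, 2004.5], [10.9, 11.9]

adjusted, bias = remove_bias(t_ref, h_ref, t_alt, h_alt)
print(f"bias = {bias:.2f} m, adjusted levels = {[round(h, 2) for h in adjusted]}")
```

Applying the same step to every mission places all series on the datum of the reference, so that they can be concatenated into a single, denser time series.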

By comparison with the widely used LEGOS Hydroweb data set, we show that without Landsat-derived water levels as a reference, a residual bias may remain in the combined altimetry water levels for certain lakes. Our study has considerably improved the temporal resolution of the monitoring of lake water level and storage changes on the TP. For most lakes examined in published studies, to the best of our knowledge, our estimates provide the observations of the highest temporal resolution, which can better reveal the interannual and intra-annual variability and trends in lake water level and storage. This holds even for some relatively small lakes, whose annual trends may otherwise be incorrectly estimated from sparse sampling of lake water levels. The developed data sets can also facilitate the monitoring of some rapidly expanding lakes with overflow risks and provide important information for flood prediction and early warning.

We evaluate the uncertainty in the Landsat-derived water levels by field experiments and rigorous uncertainty analysis. Both methods agree that the magnitude of the uncertainty is approximately 0.1 m, which suggests that Landsat-derived water levels are often more efficient and less noisy than altimetry data when the altimeter footprints on the lake surface are insufficient, especially for small lakes. Based on our estimates, the 52 large TP lakes, accounting for approximately 60 % of the total TP lake area, have gained 100.1±5.7 km3 of water during the past 18 years. Lakes in the endorheic basin on the TP have mostly expanded. The complex spatial pattern of lake storage changes in the Selin Co basin was quantified and a possible explanation was proposed in this study. Note that the quality of the Landsat-derived water levels before 2002 may not be as good as that after 2002, because no altimetry data before 2002 are used in this study. Extrapolation of the relationship between lake shoreline positions and water levels may not be stable if the water level during 2000–2001 was much lower or higher than the levels from 2002 to 2017. Discussions on how the extrapolation may affect the data quality can be found in the Supplement.

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/essd-11-1603-2019-supplement.

Author contributions

LD and LX designed the research. LX, LD, HQ, and ZF developed the approaches and data sets. LX, HQ, HP, and LD carried out the field experiment. LX, LD, and YW contributed to the analysis of results and writing of the paper.

Competing interests

The authors declare that they have no conflict of interest.

Acknowledgements

Mingda Du's assistance in the field experiments, discussion on the use of satellite altimetry data for lake monitoring with Gang Qiao from Tongji University, and efforts in improving and archiving the data sets made by Daniela Ransby from PANGAEA data publisher are acknowledged here.

Financial support

This research has been supported by the National Natural Science Foundation of China (grant nos. 91547210 and 51722903) and the National Key Research and Development Program of China (grant no. 2017YFC0405801).

Review statement

This paper was edited by Birgit Heim and reviewed by two anonymous referees.

References

Benveniste, J., Baker, S., Bombaci, O., Zeli, C., Venditti, P., Zanife, O., Soussi, B., Dumont, J., Stum, J., and Milagro-Perezin, M. P.: Envisat RA-2/MWR Product Handbook, Eur. Space Agency, Frascati, Italy, 2002. 

Birkett, C. M. and Beckley, B.: Investigating the performance of the Jason-2/OSTM radar altimeter over lakes and reservoirs, Mar. Geod., 33, 204–238, 2010. 

Bouzinac, C.: CryoSat product handbook, ESA, UCL, available at: https://earth.esa.int/documents/10174/125272/CryoSat_Product_Handbook (last access: 17 October 2019), 2012. 

Cheng, K.-C., Kuo, C.-Y., Tseng, H.-Z., Yi, Y., and Shum, C.: Lake surface height calibration of Jason-1 and Jason-2 over the Great Lakes, Mar. Geod., 33, 186–203, 2010. 

Crétaux, J.-F., Jelinski, W., Calmant, S., Kouraev, A., Vuglinski, V., Bergé-Nguyen, M., Gennero, M.-C., Nino, F., Del Rio, R. A., and Cazenave, A.: SOLS: A lake database to monitor in the Near Real Time water level and storage variations from remote sensing data, Adv. Space Res., 47, 1497–1507, 2011a. 

Crétaux, J., Jelinski, W., Calmant, S., Kouraev, A., Vuglinski, V., Bergé-Nguyen, M., Nino, G., Del Rio, R., Cazenave, A., and Maisongrande, P.: Hydrolare/Hydroweb: A lake database to monitor in the Near Real Time water level and storage variations from remote sensing data, Adv. Space Res., 47, 1497–1507, 2011b. 

Crétaux, J.-F., Abarca-del-Río, R., Berge-Nguyen, M., Arsen, A., Drolon, V., Clos, G., and Maisongrande, P.: Lake volume monitoring from space, Surv. Geophys., 37, 269–305, 2016. 

Field, C. B., Barros, V. R., Dokken, D., Mach, K., Mastrandrea, M., Bilir, T., Chatterjee, M., Ebi, K., Estrada, Y., and Genova, R.: IPCC, 2014: Climate change 2014: Impacts, adaptation, and vulnerability – Part A: Global and sectoral aspects. Contribution of working group II to the fifth assessment report of the intergovernmental panel on climate change, Cambridge University Press, Cambridge, UK, New York, NY, USA, 2014. 

Frappart, F., Seyler, F., Martinez, J.-M., León, J. G., and Cazenave, A.: Floodplain water storage in the Negro River basin estimated from microwave remote sensing of inundation area and water levels, Remote Sens. Environ., 99, 387–399, 2005. 

Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., and Moore, R.: Google Earth Engine: Planetary-scale geospatial analysis for everyone, Remote Sens. Environ., 202, 18–27, 2017. 

Guo, J., Chang, X., Gao, Y., Sun, J., and Hwang, C.: Lake level variations monitored with satellite altimetry waveform retracking, IEEE J. Sel. Top. Appl., 2, 80–86, 2009. 

Huang, Q., Long, D., Du, M., Zeng, C., Qiao, G., Li, X., Hou, A., and Hong, Y.: Discharge estimation in high-mountain regions with improved methods using multisource remote sensing: A case study of the Upper Brahmaputra River, Remote Sens. Environ., 219, 115–134, 2018. 

Hwang, C., Cheng, Y.-S., Yang, W.-H., Zhang, G., Huang, Y.-R., Shen, W.-B., and Pan, Y.: Lake level changes in the Tibetan Plateau from Cryosat-2, SARAL, ICESat, and Jason-2 altimeters, Terr. Atmos. Ocean Sci., 30, 1–18, 2019. 

Jain, M., Andersen, O. B., Dall, J., and Stenseng, L.: Sea surface height determination in the Arctic using Cryosat-2 SAR data from primary peak empirical retrackers, Adv. Space Res., 55, 40–50, 2015. 

Jarvis, A., Reuter, H. I., Nelson, A., and Guevara, E.: Hole-filled SRTM for the globe Version 4, available from the CGIAR-CSI SRTM 90 m Database, available at: http://srtm.csi.cgiar.org (last access: 17 October 2019), 15, 25–54, 2008. 

Jiang, L., Nielsen, K., Andersen, O. B., and Bauer-Gottwein, P.: Monitoring recent lake level variations on the Tibetan Plateau using CryoSat-2 SARIn mode data, J. Hydrol., 544, 109–124, 2017. 

Kittler, J. and Illingworth, J.: On threshold selection using clustering criteria, IEEE T. Syst. Man. Cyb., September–October 1985, SMC-15, 652–655, 1985. 

Lehner, B. and Döll, P.: Development and validation of a global database of lakes, reservoirs and wetlands, J. Hydrol., 296, 1–22, 2004. 

Lei, Y., Yao, T., Yang, K., Sheng, Y., Kleinherenbrink, M., Yi, S., Bird, B. W., Zhang, X., Zhu, L., and Zhang, G.: Lake seasonality across the Tibetan Plateau and their varying relationship with regional mass changes and local hydrology, Geophys. Res. Lett., 44, 892–900, 2017. 

Li, B., Zhang, J., Yu, Z., Liang, Z., Chen, L., and Acharya, K.: Climate change driven water budget dynamics of a Tibetan inland lake, Global Planet. Change, 150, 70–80, 2017a. 

Li, H. W., Qiao, G., Wu, Y. J., Cao, Y. J., and Mi, H.: Water Level Monitoring On Tibetan Lakes Based On Icesat And Envisat Data Series, Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLII-2/W7, 1529–1533, https://doi.org/10.5194/isprs-archives-XLII-2-W7-1529-2017, 2017b. 

Li, L., Liu, Q., Zhang, Y., Liu, L., Ding, M., and Changjun, G. U.: Spatial distribution and variation of precipitation in the Qiangtang Plateau, Geogr. Res., 36, 2047–2060, 2017c. 

Li, X., Long, D., Huang, Q., Han, P., Zhao, F., and Wada, Y.: A high temporal resolution lake data set from multisource altimetric missions and Landsat archives of water level and storage changes on the Tibetan Plateau during 2000–2017, PANGAEA, https://doi.org/10.1594/PANGAEA.898411, 2019. 

Liu, B., Lin, L. I., Yue, D. U., Liang, T., Duan, S., Hou, F., and Ren, J.: Causes of the outburst of Zonag Lake in Hoh Xil, Tibetan Plateau, and its impact on surrounding environment, J. Glaciol. Geocryol., 38, 305–311, 2016. 

Markham, B. L., Storey, J. C., Williams, D. L., and Irons, J. R.: Landsat sensor performance: history and current status, IEEE T. Geosci. Remote, 42, 2691–2694, 2004. 

McFeeters, S. K.: The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features, Int. J. Remote Sens., 17, 1425–1432, 1996. 

Messager, M. L., Lehner, B., Grill, G., Nedeva, I., and Schmitt, O.: Estimating the volume and age of water stored in global lakes using a geo-statistical approach, Nat. Commun., 7, 13603, https://doi.org/10.1038/ncomms13603, 2016. 

Neckel, N., Kropáček, J., Bolch, T., and Hochschild, V.: Glacier mass changes on the Tibetan Plateau 2003–2009 derived from ICESat laser altimetry measurements, Environ. Res. Lett., 9, 014009, https://doi.org/10.1088/1748-9326/9/1/014009, 2014. 

Nielsen, K., Stenseng, L., Andersen, O. B., and Knudsen, P.: The performance and potentials of the CryoSat-2 SAR and SARIn modes for lake level estimation, Water, 9, w9060374, https://doi.org/10.3390/w9060374, 2017. 

Otsu, N.: A threshold selection method from gray-level histograms, IEEE T. Syst. Man. Cyb., 9, 62–66, 1979. 

Pekel, J.-F., Cottam, A., Gorelick, N., and Belward, A. S.: High-resolution mapping of global surface water and its long-term changes, Nature, 540, 418–422, 2016. 

Planet: Planet Application Program Interface: In Space for Life on Earth, San Francisco, CA, available at: https://www.planet.com (last access: 17 October 2019), 2017. 

Sørensen, L. S., Simonsen, S. B., Nielsen, K., Lucas-Picher, P., Spada, G., Adalgeirsdottir, G., Forsberg, R., and Hvidberg, C. S.: Mass balance of the Greenland ice sheet (2003–2008) from ICESat data – the impact of interpolation, sampling and firn density, The Cryosphere, 5, 173–186, https://doi.org/10.5194/tc-5-173-2011, 2011. 

Song, C., Huang, B., and Ke, L.: Modeling and analysis of lake water storage changes on the Tibetan Plateau using multi-mission satellite data, Remote Sens. Environ., 135, 25–35, 2013. 

Tournadre, J.: Validation of Jason and Envisat altimeter dual frequency rain flags, Mar. Geod., 27, 153–169, 2004. 

Wan, W., Long, D., Hong, Y., Ma, Y., Yuan, Y., Xiao, P., Duan, H., Han, Z., and Gu, X.: A lake data set for the Tibetan Plateau from the 1960s, 2005, and 2014, Scientific data, 3, 160039, https://doi.org/10.1038/sdata.2016.39, 2016. 

Wang, J., Song, C., Reager, J. T., Yao, F., Famiglietti, J. S., Sheng, Y., MacDonald, G. M., Brun, F., Schmied, H. M., and Marston, R. A.: Recent global decline in endorheic basin water storages, Nat. Geosci., 11, 926–932, 2018. 

Xiaojun, Y., Shiyin, L., and Meiping, S.: Changes of Kusai Lake in Hoh Xil region and causes of its water overflowing, Ac. Geogr. Sin., 67, 689–698, 2012. 

Xu, H.: A study on information extraction of water body with the modified normalized difference water index (MNDWI), J. Remote Sens.-Beijing, 9, 589–595, 2005. 

Yao, F., Wang, J., Yang, K., Wang, C., Walter, B. A., and Crétaux, J.-F.: High resolution data set of annual lake areas and water storage across the Inner Tibet, 2002–2015, PANGAEA, https://doi.org/10.1594/PANGAEA.888706, 2018a.  

Yao, F., Wang, J., Yang, K., Wang, C., Walter, B. A., and Crétaux, J.-F.: Lake storage variation on the endorheic Tibetan Plateau and its attribution to climate change since the new millennium, Environ. Res. Lett., 13, 064011, https://doi.org/10.1088/1748-9326/aab5d3, 2018b. 

Yao, X., Liu, S., and Sun, M.: Changes of Kusai Lake in Hoh Xil region and causes of its water overflowing, Ac. Geogr. Sin., 67, 689–698, 2012. 

Yu, S., Liu, J., Xu, J., and Wang, H.: Evaporation and energy balance estimates over a large inland lake in the Tibet-Himalaya, Environ. Earth Sci., 64, 1169–1176, 2011. 

Zhang, G., Xie, H., Kang, S., Yi, D., and Ackley, S. F.: Monitoring lake level changes on the Tibetan Plateau using ICESat altimetry data (2003–2009), Remote Sens. Environ., 115, 1733–1742, 2011. 

Zhang, G., Yao, T., Xie, H., Kang, S., and Lei, Y.: Increased mass over the Tibetan Plateau: from lakes or glaciers?, Geophys. Res. Lett., 40, 2125–2130, 2013. 

Zhang, G., Yao, T., Shum, C., Yi, S., Yang, K., Xie, H., Feng, W., Bolch, T., Wang, L., and Behrangi, A.: Lake volume and groundwater storage variations in Tibetan Plateau's endorheic basin, Geophys. Res. Lett., 44, 5550–5560, 2017a. 

Zhang, G., Zheng, G., Gao, Y., Xiang, Y., Lei, Y., and Li, J.: Automated water classification in the Tibetan plateau using Chinese GF-1 WFV data, Photogramm. Eng. Remote Sens., 83, 509–519, 2017b. 

Zhou, J., Wang, L., Zhang, Y., Guo, Y., Li, X., and Liu, W.: Exploring the water storage changes in the largest lake (Selin Co) over the Tibetan Plateau during 2003–2012 from a basin-wide hydrological modeling, Water Resour. Res., 51, 8060–8086, 2015. 

Short summary
Lakes on the Tibetan Plateau experienced rapid changes (mainly expanding) in the past 2 decades. Here we provide a data set of high temporal resolution and accuracy reflecting changes in water level and storage of Tibetan lakes. A novel source of water levels generated from Landsat archives was validated with in situ data and adopted to resolve the inconsistency in existing studies, benefiting monitoring of lake overflow floods, seasonal and interannual variability, and long-term trends.