Densified multi-mission observations by developed optical water levels show marked increases in lake water storage and overflow floods on the Tibetan Plateau

The Tibetan Plateau (TP) known as Asia’s water towers is quite sensitive to climate change, reflected by changes 10 in hydrological state variables such as lake water storage. Given the extremely limited ground observations on the TP due to the harsh environment and complex terrain, we exploited multisource remote sensing, i.e., multiple altimetric missions and Landsat archives to create dense time series (monthly and even higher such as 10 days on average) of lake water level and storage changes across 52 large lakes (>100 km) on the TP during 2000‒2017 (the data set is available online with a DOI: https://doi.org/10.1594/PANGAEA.898411). Field experiments were carried out in two typical lakes to validate the remotely 15 sensed results. With Landsat archives and partial altimetry data, we developed optical water levels that cover most of TP lakes and serve as an ideal reference for merging multisource lake water levels. The optical water levels show an uncertainty of ~0.1 m that is comparable with most altimetry data and largely reduce the lack of dense altimetric observations with systematic errors well removed for most of lakes. The densified lake water levels provided critical and accurate information on the longterm and short-term monitoring of lake water level and storage changes on the TP. We found that the total storage of the 52 20 lakes increased by 97.3 km at two stages, i.e., 6.68 km/yr during 2000‒2012 and 2.85 km/yr during 2012‒2017. The total overflow from Lake Kusai to Lake Haidingnuoer and Lake Salt during Nov 9‒Dec 31 in 2011 was estimated to be 0.22 km, providing critical information on lake overflow flood monitoring and prediction as the expansion of some TP lakes becomes a serious threat to surrounding residents and infrastructure.


Introduction
The Tibetan Plateau (TP), providing vital water resources for more than a billion population in Asia, is a sensitive region undergoing rapid climate change (Field et al., 2014).There are more than 1,200 alpine lakes larger than 1 km 2 (Zhang et al., 2017a) on the TP where glaciers and permafrost are also widely distributed.With little disturbance by human activity in this area, lake storage changes may serve as an important indicator that reflects changes in regional hydrological processes and responses to climate change.Wang et al. (2018) showed that global endorheic basins are experiencing a decline in water storage whereas the endorheic basin on the TP is an exception.Given the fact that TP lakes have been expanding for more than 20 years (Pekel et al., 2016), pasture, farmland, and infrastructure near the lake shore face the risk of inundation.Therefore, it is imperative to largely improve the monitoring of TP lakes.
As an important component of the hydrosphere, terrestrial water cycle, and global water balance, millions of inland water bodies such as lakes, wetlands, and human reservoirs have been investigated globally and their total storage was estimated to be 181.9× 10 3 km 3 based on statistical models (Lehner and Döll, 2004;Messager et al., 2016;Pekel et al., 2016).Lake storage changes, which is more concerned in the regional water balance, can be derived from observed changes in lake water level and area (Cré taux et al., 2016;Cré taux et al., 2011b;Song et al., 2013;Yao et al., 2018b;Zhang et al., 2017a).Most of these two kinds of observations in related studies were obtained from satellite remote sensing due to the scarcity of in situ data across the TP where the harsh environment and complex terrain make in situ measurement costly and risky.Changes in lake water level can be monitored using satellite altimetry with a radar or laser initially designed for sea surface height or ice berg measurements.Altimeters determine the range between the nadir point and satellite by analysing the waveforms of reflected electromagnetic pulse.The waveforms of radar or laser pulse may, however, be contaminated by signal from complex terrain when applied to inland water bodies, but it is possible to remove these impacts with waveform retracking algorithms (Guo et al., 2009;Huang et al., 2018;Jiang et al., 2017).Zhang et al. (2011) mapped water level changes in 111 TP lakes for the period 2003-2009 using ICESat-1 data that have a temporal resolution of 91 days.ICESat-1 data have relatively denser ground tracks but a lower temporal resolution than most of other altimetric missions.This means that ICESat-1 covers more lakes but provide few water level observations for each lake.After ICESat-1 was decommissioned in 2010, CryoSat-2 data spanning the period 2010-2015 were adopted in related studies (Jiang et al., 2017), due to its similar dense ground tracks and competitive precision as ICESat-1.Other altimetric missions, such as TOPEX, Jason-1/2/3, ERS-1/2, and ENVISAT also have some but relatively limited applications in monitoring changes in lake water level on the TP due to sparse ground tracks.In this study, multisource altimetry data (i.e., Jason-1/2, ENVISAT, ICESat-1, and CryoSat-2) were combined if available for the study lakes, with the optical water levels developed in this study as a critical reference to densify the altimetric observations and merging data from multiple altimetric missions.
Changes in lake area can be captured by optical/SAR images from medium or high spatial resolution remote sensing data, such as Landsat TM/ETM + /OLI.Extraction of lake water bodies can be manually (Wan et al., 2016) or automatically (Zhang et al., 2017b) achieved.Automatic water extraction methods based on the water index and auto-thresholding are more efficient in dealing with a large amount of remote sensing images.Even so, acquisition and preprocessing of such a large amount of historical images (~10TB) covering TP lakes are still intractable for researchers with limited computation resources.With the help of cloud-based platforms, such as the Google Earth Engine (GEE) that significantly reduces data downloading and preprocessing time, tens of thousands of images may be processed online in days instead of months (Gorelick et al., 2017).In this study, more than twenty thousand Landsat TM/ETM/OLI images were processed online using GEE to extract lake water bodies based on the water index and Otsu algorithm.
There are some studies focusing on changes in lake storage on the TP over recent decades, e.g., Zhang et al. (2017a) examined changes in lake storage for ~70 lakes from the 1970s to 2015 with ICESat-1 altimetry data and Landsat archives.An individual lake area data set from the 1970s and annual area data after 1989 were used.Due to the short time span of ICESat-1, they used the hypsometric method to convert lake areas into water levels.Yao et al. (2018b) used DEMs and optical images to develop hypsometric curves for lakes on the central TP, and estimated annual changes in lake storage from 2002 to 2015 for 871 lakes.
These studies have a wide spatial coverage of lakes but relatively lower temporal sampling and little altimetric information, which may limit the accuracy of trends in lake water level/storage in some cases and short-term monitoring of lake overflow flood.The LEGOS Hydroweb provides a lake data set, including multisource altimetry-based changes in lake water level and storage as well as hypsometric curves for 22 TP lakes (Cré taux et al., 2016;Cré taux et al., 2011b).The data set incorporated more altimetric information and provided data of higher temporal resolution.However, there may be a remaining bias when different sources of altimetric data were merged, due to the lack of some important reference that can be derived from optical remote sensing images to be shown in this study.We term the reference data as the 'optical water level' to be introduced in section 3.2.Here, we list recent studies and data sets (Table 1) to provide a concise summary on the progress of remote sensingbased water level and storage monitoring on TP lakes.The overall objective of this study was to examine long-term and short-term changes in water level and storage across 52 lakes with surface areas larger than 150 km 2 on the TP by merging multisource altimetry and optical remote sensing images to generate a much denser (monthly or higher such as 10 days on average), more coherent lake water level/storage change data set covering the period 2000-2017 and the hypsometric curve for each study lake.Results of this study provide a comprehensive and detailed assessment of changes in lake level and storage on the TP for the recent two decades, and shortterm monitoring of lake overflow flood for some lakes.This study could largely benefit more detailed investigations into lakes, lake basins, and regional climate change, because the generated data set has the highest temporal resolution during the study period.To ensure data quality, field experiments were carried out and in situ data were collected to examine the uncertainty in the derived optical water levels.
2 Study area and data

Study area
The TP can be generally divided into 12 major basins, among which the Inner/Central TP (CP) is the only endorheic basin and home to most TP lakes including ~300 TP lakes larger than 10 km 2 .Therefore, it was chosen as the main study area.The endorheic basin covers an area of ~710,000 km 2 (~28% of total TP) with a mean elevation of ~4,900 m and has a semi-arid plateau climate with precipitation ranging from 100-300 mm (Yao et al., 2018b).Most lakes in the endorheic basin were expanding under the influence of climate change/variability as opposed to many other places in the TP, e.g., Lake Selin Co exceeded Lake Nam Co in area and consequently became the largest lake in the endorheic basin between 2011-2012 and expanded by 26% over the past 40 years (Zhou et al., 2015), whereas Lake Yamzhog Yumco (outside the endorheic basin, 350 km to the southeast of Selin Co) shrunk by ~11% during 2002-2014 according to Wan et al. (2016).Located in the southeast endorheic basin, the Lake Nam Co basin covering about 10,800 km 2 , with 19% of the basin lake water area and a mean lake elevation of ~4,730 m was chosen as a field experiment spot.The mean annual temperature and precipitation of Lake Nam Co are 1.3 °C, and 486 mm, respectively (Li et al., 2017a).The other experiment spot was Lake Yamzhog Yumco, which has a mean lake elevation of ~4,440 m.Subjected to steep mountainous terrain, the lake has a narrow-strip shape with complex shorelines.The basin of Lake Yamzhog Yumco covers ~6,100 km 2 , with mean annual temperature and precipitation of 2.8 °C and ~360 mm, respectively (Yu et al., 2011).

Data
Multisource altimetry data were used in this study as shown in Table 2.The earliest record dates back to 2002 (i.e., ENVISAT and Jason-1) and the latest record ends in 2017 (i.e., Jason-3 and CryoSat-2).Most of 52 lakes examined in this study were covered by ICESat-1, ENVISAT, and CryoSat-2 data.Jason-1/2/3 data were available only on 12 lakes due to the relatively sparse ground tracks or data quality issues.Note that Jason-2 inherited the orbit of Jason-1 after its launch in 2008, whereas Jason-1 was shifted into an interleaved orbit and continued functioning until 2013, thereby increasing the spatial coverage of Jason altimetry series to some degree, e.g., Jason-1 data in Lake Qinghai, the largest lake on the TP, were only available after 2008 due to the orbit shift.ICESat-1 and CryoSat-2 data have the best spatial coverage but relatively long repeat cycles of 91 days and 369 days (Bouzinac, 2012;Zhang et al., 2011).The ENVISAT mission has a moderate set of orbital parameters to balance its spatial coverage and temporal resolution of 35 days (Benveniste et al., 2002).In order to determine if the altimetry data fall into the study lakes, a lake shape data set generated by Wan et al. (2016) was used.
We used Landsat TM/ETM + /OLI Surface Reflectance data sets provided by GEE to generate information on lake shore and lake area changes.TM images covered 2000-2011, whereas ETM + images covered 2000-2017 and OLI images covered 2013-2017.Though Landsat ETM + suffered sensor failure and all the ETM + images contain gaps after 2003 (Markham et al., 2004), we managed to make use of some images with gaps in generating lake shore changes.All the images were processed online using GEE API.Preprocessing such as radiometric calibration, atmospheric correction, as well as geometric correction was already performed in the production of the data sets.There were more than 20,000 images processed and a half of them were excluded from the final results due to cloud contamination or gaps.
We collected daily in situ water level measurements in Lake Yamzhog Yumco for validation purposes with a pressure-type water level sensor.The in situ water level measurements spanned a half year from May to October 2018.We also performed unmanned aerial vehicle (UAV or drone) imaging over a 1 km-lake bank in Lake Yamzhog Yumco and Lake Nam Co.The UAV images were used to evaluate the accuracy of lake shore extraction from Landsat data, which is similar to  (2018).In addition, we performed a rigorous statistical analysis of the uncertainty in the derived optical water levels by taking the UAV-derived water area ratios as the ground truth of sub-pixel water area ratios of Landsat image pixels (see Section 4).

Methodology
To investigate changes in lake storage, changes in lake water level and lake area need to be derived from multisource remote sensing.First, water levels from various satellite altimeters (see Figure 1) for each lake as well as changes in lake shoreline and area from optical remote sensing images (i.e., Landsat TM/ETM + /OLI) were derived.Second, the systematic biases between different altimetry data were removed by either comparing the mean water level of the overlap period or comparing the two water level time series with changes in lake shoreline, depending on the length of the overlap period.The lake shoreline changes, termed as the optical water levels against altimetry water levels in this study, can serve as a unique source of information reflecting water level changes as well as a data merging reference.We will show that after deriving two or three regression parameters, lake shoreline changes can well represent lake water levels with comparable accuracy as altimetryderived water levels.Third, with information on changes in water level and water area derived from altimetry data and optical remote sensing images, the hypsometric curve that describes the relationship between the lake water level and lake water storage changes can be derived.Fourth, the integral of the hypsometric curve was performed to convert water level time series into lake storage change time series.Details will be illustrated in the following sections.

Altimetry water level
The first step of deriving altimetry water levels is to select correct ground tracks and valid footprints falling on the lakes.
Because there is a random ground track shift at ~1 km in different cycles for most altimetry missions, it is uncertain that valid lake footprints can be obtained for each cycle, even though the nominal ground track seems crossing the lake.This problem can be addressed by comparing the geographic coordinates of the footprints with a lake shape data set (Wan et al., 2016).After picking out the valid footprints, the lake surface height can be calculated for each footprint.All radar altimetry data share a relationship:

𝐿𝑆𝐻 = 𝐴𝑙𝑡 − (𝑅𝑎𝑛𝑔𝑒 + 𝑐𝑜𝑟)
Eq 1 where LSH represents the lake surface height with respect to the geoid.Alt represents the altimeter height with respect to the reference ellipsoid.Range represents the distance between the altimeter and lake surface.cor represents several range corrections including retracking correction, wet/dry troposphere correction, ionosphere correction, pole/solid tide correction, and geoid correction.The retracking range correction plays an important role in removing the contamination of land signal when the altimetry data are applied to inland water bodies.In this study, Jason-1/2/3 data were retracked using a classical waveform retracking algorithm named the improved threshold method (ITR), whereas CryoSat-2 data were retracked using the narrow primary peak threshold (NTTP) method (Birkett and Beckley, 2010;Cheng et al., 2010;Jain et al., 2015).
Retracking corrections were not performed for ENVISAT and ICESat data, because the ENVISAT product already provided a retracked range using the ICE-1 method and the ICESat GLAH14 product already included several corrections that are sufficient for most applications including studies on inland water bodies (Zhang et al., 2011).
For each cycle of an altimeter, it is common that more than one footprint fall on a study lake, thereby providing several LSH observations on the same day.After removing outliers with the three-sigma rule, frequency distributions of the LSHs from the same cycle were calculated.The mean value of the highest bucket of the histogram was selected to represent the LSH for the cycle.Meanwhile, the frequency of the chosen bucket was reserved to evaluate the data quality for the cycle, e.g., a cycle was marked as high quality if the frequency is higher than 0.8, moderate quality if it is only higher than 0.5, and poor quality if the Earth Syst.Sci.Data Discuss., https://doi.org/10.5194/essd-2019-34 Open Access frequency is lower than 0.5.The LSH from each cycle constituted the lake water level time series for the study lake.LSHs that were marked as poor quality and obviously deviated from the moving average were removed from the altimetry-based lake water level time series.
It is not uncommon that systematic biases exist in different altimetry data sets due to the variation in orbit, discrepancy between correction models, errors associated with sensors, and even the choice of the reference datum.After deriving lake water level time series for each altimeter, we first merged the Envisat and ICESat-1 water levels if both were available for a lake, because they have the longest overlap period (Figure 1).We chose Envisat-derived water levels as the baseline and removed the difference of the mean value of the overlap period between the two products, because Envisat data are generally denser and longer than ICESat-1 data.A similar process was applied to Jason-1/2/3, as there are two overlap periods connecting the three altimeters together.
There are tradeoffs between CryoSat-2 and Jason-2/3 data in terms of spatial coverage and time span.CryoSat-2 data are available for all study lakes but they only have the overlap period with Jason-2/3 data, whereas Jason-2/3 data are only available for 12 lakes.For most lakes without Jason-2/3 data, we merged CryoSat-2 data with either ICESat-1 or Envisat using optical water levels spanning from 2000 to 2017, because there is no overlap period between these altimetry water levels.Lake shoreline changes were firstly translated into optical water levels by fitting with CryoSat-2 data, functioning as extrapolation of CryoSat-2 to 1-2 years.Then, we applied the same method of merging Jason-1/2/3, to merge the extrapolated CryoSat-2 data with either Envisat or ICESat-1 data.In doing so, we were able to remove all systematic biases between multisource altimetry-derived water levels.All the water levels were with respect to EGM 96. Figure 2 provides an example of ground tracks of altimetric missions and water level time series on Lake Zhari Namco.

Optical water level
For most lake basins, it is possible to find a relatively flat part of lake banks with an average slope of 1/30 or smaller, where 5 obvious interannual or intra-annual changes in lake shorelines can be detected using high-spatial-resolution Landsat images (30 m).These locations can be found by comparing lake images from the first year and the last year of the study period if the lake shows a clear expanding/shrinking trend.Otherwise, we can compare images acquired in early summer when the LSH is at lowest with those acquired in late autumn when the lake expands to its limit.In this study, we assumed that the selected lake bank was flat enough such that the relationship between changes in lake water level and shoreline can be depicted in a linear 10 or quasi-linear (parabolic) way.The validity of this assumption can be evaluated with the coefficient of determination (R 2 ) for each lake as shown in Table 3.For most lakes, the goodness of fit is higher than 0.7, suggesting the generally good fitting relationship between changes in lake level and shoreline.Though there were ~500 Landsat images obtained for the selected lake banks during the study period, many of them were largely affected by cloud or cloud shadow.In addition, the failure of Landsat 7 sensor SLC left all the Landsat 7 images with gaps after 2003 (Markham et al., 2004), making the available images even fewer.By choosing the region of interest (ROI) that is parallel to the Landsat 7 gaps, we can use most of the Landsat 7 images.However, the width of ROI must be reduced to avoid shifting gaps as shown in Figure 3 (b).The ROI did not fill the interval of gaps, because the wider the ROI, the higher possibility of shifting gaps cross it.
Changes in lake shorelines were characterized by changes in the water surface ratio detected in the ROI.To automatically extract changes in water surface from tremendous amount of Landsat archives on GEE, the water index and Otsu threshold method were jointly used.We calculated the Normalized Difference Water Index (NDWI) and the Modified Normalized Difference Water Index (MNDWI) of input images and compared their performance in different seasons.It was found that the MNDWI tends to be more sensitive to shallow water in summer, but is less effective than NDWI when the lake bank is covered by snow in the cold season as shown in Figure 4. Therefore, the two water indices were jointly used in this study by applying the MNDWI to images acquired during May to October and applying the NDWI to the rest months.The NDWI and MNDWI can be calculated as follows (McFeeters, 1996;Xu, 2005): where   ,   , and   refer to reflectance of bands 2, 4, 5 for Landsat TM/ETM + images and bands 3, 5, 6 for Landsat OLI images.
After calculating the water index, the grayscale image was binarized using the Otsu method.If the selected ROI comprises ~50% water and ~50% land, the performance of the method is good, as the distribution of digital numbers of the grayscale image is close to the assumption of the bimodal histogram implicit in the Otsu algorithm (Kittler and Illingworth, 1985).The binarized images were further processed to provide the water surface ratio in the ROI, which represents changes in lake shoreline.The time series of changes in lake shoreline were then converted into optical lake water levels using linear regression or second-order polynomial fit with altimetry-derived water levels (Figure 3 (c)-(d)).
However, cloud, cloud shadow, and shifting gaps may contaminate the ROI and cause errors in the optical water levels.
Therefore, the QA band of the Landsat Surface Reflectance product was used to filter the images.The data would be excluded if the fraction of the cloud or cloud shadow-covered area in the ROI was higher than 5%.

Hypsometry
We derived the hypsometric curve for each study lake by polynomial fitting of the lake area and level time series.The lake area comprises two parts: the inner invariable part and the outer variable part.As the variable water area is of more concern in this study, ROIs for extracting changes in lake area only cover the lake shoreline and its neighbouring area as shown in Figure 5 .The inner part of the water body was calculated only once and considered as invariant, making the calculation more efficient on GEE.Meanwhile, more images are available as the area of ROI becomes smaller, because the possibility of clouds covering the ROI is reduced compared with an ROI covering the entire lake.The Landsat ETM + images after 2003 were not included in this part of calculation as gaps negatively affected the ROI for lake area extraction.Similar to Section 3.2, we selected images with less than 5% cloud cover on an ROI to generate time series of changes in lake area, obtaining 20~30 data points on average for regression.R 2 values for each lake are listed in Table 3, indicating that most lake basins agree well with the parabolic hypsometric curve.It can also be seen that most fitting functions have a positive parameter for the second-order term, which can be explained according to the knowledge on the hypsometric curve of catchments.This differs somewhat from the previous studies by Song et al. ( 2013) that shows a negative value for the second-order derivatives.After investigating a number of watersheds, researchers suggested that the general hypsometric curve for the entire catchment can be expressed as: (Strahler, 1952;Willgoose and Hancock, 1998): where y corresponds to the normalized elevation and x corresponds to the normalized area above the elevation.a, d, and z are fitting parameters.The hypermetric curve generated by this model always has a ′toe′ as shown in Figure 6, where the secondorder differential of the curve is negative.This means that at the low elevation of the catchment, with decreasing (increasing) elevations, the area above (below) the elevation increases more slowly (faster).Lakes are always formed at the lowest portion of a catchment, so the hyperosmotic curve for a lake basin can be the toe part for the entire catchment.This suggests that if we use the parabolic curve to fit the lake area and water level time series, there should be a positive second-order parameter so that with increasing water levels, the lake area increases faster.The last step was to integrate the hypsometric curve to generate the volume-elevation relationship and convert the lake water levels into storage changes.Most Tibetan lakes are located in remote and inaccessible regions, resulting in the scarcity of ground-based in situ measurements that are, however, vital for data quality assessment.We made some in situ measurements in two study lakes to validate the data quality of optical water levels developed in this study, instead of the satellite altimetry data whose quality on lakes or rivers has been widely known.Many studies used in situ water levels to calculate certain statistical metrics, e.g., root mean squired error (RMSE).However, results provided by different studies vary and could be associated with the cross-section width of the study water body in the ground track panel (Nielsen et al., 2017).This means that these results may not be comparable due to their unique applications.In addition, it is not rigorous to use in situ data of only one lake to represent the overall situation of study lakes in the uncertainty assessment for altimetry water levels.Instead, we used the standard deviation of valid footprints acquired in the same cycle as an estimate of uncertainty in altimetry-derived water levels.In contrast, the applicable condition of optical water levels is not so variable as that of altimetry data.Derivation of optical water levels requires relative flat bank as well as some altimetric information, which were available in all study lakes.Since these selected bank slopes were similarly small (~1/30), it is possible to use a few typical lakes to represent all study lakes.Therefore, we carried out a field experiment in Lake Yamzhog Yumco and Lake Nam Co to validate the derived optical water levels.
There were two main goals in our experiment: (1) collecting daily in situ water level data in a certain TP lake to validate the corresponding optical water levels statistically; (2) imaging a certain length of the lake shore with UAV to test the performance of lake shore extraction so as to provide a theoretical uncertainty analysis of optical water levels.On Lake Yamzhog Yumco, we installed a pressure type water level sensor (type H5110-DY, manufactured by Shenzhen Hongdian technologies Co., Ltd.), which measures water pressure and temperature of the installation depth and converts them into water depths with a relative accuracy of ~0.1%.The device was carried onto the lake and put ~20 m below the water surface and 0.5 m above the lake bottom, which suggests an absolute error of ~2 cm.We chose a typical landscape (i.e., mild slope, little vegetation, and gravel lake beach) to perform UAV imaging in both Lake Yamzhog Yumco and Lake Nam Co in mid-May, 2018.The UAV was operating at a height of 200 m and imaging the ground at a constant rate in visible bands with a wide-angle lens.With a GPS tracker onboard, these UAV images were well georeferenced.Then image mosaic was performed using the Piz4D mapper and converted into a digital orthophoto map (DOM), which is a similar process described by Huang et al. (2018).The spatial resolution of UAV data reached ~5 cm as we compared a real ground object with its size in images.

Uncertainty analysis of optical water levels
Based on the in situ water level measurements made by the pressure-type water level sensor, we evaluated the accuracy of 5 optical water levels statistically.There were 16 optical water level records available for the comparison against the in situ measurements.The RMSE of the water level anomaly was 0.11 m.The linear fit had a slope close to 1 and an R 2 of 0.89, suggesting the good consistency of the in situ water level measurements and the derived optical water levels (Figure 8 (b)).It should be noted that the optical water levels used for validation here were translated from lake shore changes using parameters derived from fitting with CryoSat-2 data, i.e., there is no in situ information involved in generating the optical water levels 10 shown in Figure 8.To be more convincing, we performed a more theoretical uncertainty analysis of the optical water levels by looking at the original optical data and generation processes with the help of UAV images.First, we took UAV images as the ground truth to determine the accurate position and shape of the lake shore line.Second, we performed water classification based on the concurrent Landsat OLI image that contained the UAV scanned area, with the combined water index method and Otsu algorithm to derive the binarized image.Landsat image pixels where the lake shoreline determined using the UAV images crosses were delineated manually and marked as shoreline pixels as shown in Figure 9 (a).Then the water area in each shoreline pixel was calculated.
Given that these shoreline pixels were classified as either water or land, a relationship between the water area ratio of the shoreline pixel and the probability of the pixel being classified as water can be derived.This relationship generally describes the function of the water classification method by telling how likely a pixel will be determined as water, given the water area ratio of the pixel.Based on the observations of a total of 64 UAV-scanned shoreline pixels, a logistic type curve with two parameters that determine the position and shape of the curve was chosen to represent the water classifier as Eq 5 shows: Eq 5 where x represents the water area ratio in the shoreline pixel, f(x) represents the probability of the shoreline pixel being classified as water pixel, a and b are parameters that determine the shape and position of the curve, respectively.The parameters were determined using the maximum likelihood method.Results are shown in Figure 9   It should be noted that even the water ratio in a shoreline pixel is zero, there is a small probability that this pixel may be mistaken as a water pixel, because the surface reflectance information may be contaminated by adjacent water pixels.Therefore, the classifier has a small value (~0.02) when the water ratio is zero.As can be seen from Figure 9 (b), when the water ratio is larger than 0.3, the probability of the pixel classified as water is close to 1.This suggests that there may be a higher probability of the occurrence of water pixels that is associated with a systematic bias of the lake shoreline detection.Note that the systematic bias can be removed when linearly fitting the lake shore changes and altimetry-derived water levels as long as the bias is stable.Therefore, uncertainty in optical water levels developed in this study arises from the variation in this systematic bias.
To describe the variation in the systematic bias, a new random variable X was introduced to represent the bias between the classified water area and the real water area in a shoreline pixel.Given the shape and position of the shoreline, the real water area in each shoreline pixel is a complex function of the relative position between the pixel and the shoreline.To simplify the derivation, we assumed that the water area ratio in a shoreline pixel is uniformly distributed on [0,1], meaning that the probability of any value between 0 and 1 is equal.If we use X0 to represent the true water area ratio in the shoreline pixel and X1 to represent the classified results based on the water ratio, the random variable X can be expressed as: Eq 6 where X1 can take on 0 or 1 (i.e., the classified results only tell us whether a pixel is water pixel or not), so X can only take on either -X0 or 1-X0.Because the range of X0 is [0,1], it is obvious that the range of X is [-1,1].A derivation of F(X), i.e., the probability density function (PDF) of X is given as follows: Earth Syst.Sci.Data Discuss., https://doi.org/10.5194/essd-2019-34 Open Access where f(x) is referred to as the classifier function defined in Eq 5.
Because X0 is uniformly distributed between [0,1], P(X0) = 1.Note that P(X0 = 1, X1 = 1) = 1, this means that if X0 is 1, the probability of a pixel classified as water is 1.However, P(X0 = 0, X1 = 0) is not necessarily 1 (i.e., 1-f(0)), because the pixel with X0 = 0 still has a low probability, i.e., f(0) of being classified as water based on the illustration above.Combining Eqs.( 7)-( 9) with the explanations on the specific case X = 0 results in the following: Eq 10 It is obvious that F(X) is not a continuous function, but it can be integrated and the integral of F(X) on [-1,1] equals 1, meaning that it satisfies the basic property of PDF.
Overall, F(X) describes how the systematic bias between the classified water ratio and real water ratio in shoreline pixels is distributed as shown in Figure 10.If there are N shoreline pixels in an ROI, we can take them as N independent observations of X and calculate the mean value  ̅ .This value  ̅ can represent an average shift of the detected lake shoreline from the real lake shoreline in the unit of one-pixel width (30 m).As we mentioned above, the systematic bias can be removed in the regression between lake shore changes from optical remote sensing and the corresponding water levels from satellite altimetry.
As such, it is the variation of the bias that determines the accuracy of the optical water levels.Therefore, we can calculate the standard variation of  ̅ to represent the uncertainty in lake shoreline changes.Note that there is a simple relationship between σ ̅ and σ  : Eq 13 in combinaions with Eq 5, Eq 10, and Eq 12 was resolved numarically, resuling in ~0.3 pixel width.Substituting σx in Eq 11 with Eq 13 gives: σ ̅ = 0.3 √ Eq 14 Figure 10.Probability density function of the systematic bias X between the classified water ratio (X1) and real water ratio (X0) in a shoreline pixel.

5
If the slope of the shoreline is known, e.g., tanθ, the uncertainty of the optical water level can be expressed as: where σho is the uncertainty of optical water levels and d is the spatial resolution of the satellite image (30 m).In this study, a typical width of ROI for deriving optical water levels is ~10-pixel width, meaning that N is ~10.In addition, lake shores used for generating optical water levels here generally have a relatively mild slope of ~1/30 or even smaller, which can be rounghly estimated from the maximun shoreline change and altimetry water level change within a year.Here if we use 1/30 as the slope 10 tanθ, the uncertainty of the optical water levels can untimately be estimated to be ~0.1 m, which is very close to the RMSE (~0.09 m) based on the comparision of the opitcal water levels and in situ water level measurements mentioned earlier.
Overall, the uncertainty quantification of the optical water levels developed in this study indicates clearly that the accuracy of optical water levels depends on the width of an ROI, e.g., the number of pixels/observations, slope of the lake shore, and the effectiveness of the water classification method.One of the advantages of the optical water level is that an ROI does not necessarily cover a large area of lake shores, which maximizes the potential of optical remote sensing images to increase the spatial coverage and temporal resolution of lake water level estimates that may not, however, be achievable by using satellite altimetry alone.Optical remote sensing images provide important complementary information on altimetry-derived water levels and would subsequently facilitate lake water storage estimation.

Spatiotemporal analysis of changes in Tibetan Lakes
Based on the lake water storage changes we derived, spatial patterns of lake storage trends from 2000 to 2017 were shown in Figure 11.In the endorheic basin of the TP, similar to some reported results (Yao et al., 2018b;Zhang et al., 2017a), most lakes have been expanding rapidly, e.g., Lake Selin Co (89.00 E, 31.80N) gained ~19 km 3 of water during the study period, Lake Kusai (92.90 E, 35.70 N) experienced an abrupt expansion due to flood and gained ~2.2 km 3 of water in 2011, as reported in related work (Xiaojun et al., 2012).In contrast, some lakes in the southern part of the TP experienced shrinkage, e.g., Lake  However, spatial proximity cannot fully explain the intricate trend distribution in the Selin Co basin, where large lakes such as Lake Selin Co were expanding whereas smaller adjacent lakes showed an opposite decreasing trend, e.g., Lake Urru (88.00 E, 31.70 N), Lake Co Ngoin (88.77E, 31.60 N), and Lake Goren Co (88.37 E, 31.10N).In fact, we found that the decreasing trends in some small lakes like Lake Goren Co were not detected in Yao et al. (2018b), which is likely due to the lower sampling frequency as shown in Figure 12.The three shrinking lakes are located in the upstream region and feed Lake Selin Co through two small rivers.One of the rivers links Lakes Goren Co, Urru, and Selin Co, whereas the other links Lakes Co Ngoin and Selin Co.A possible explanation of the disparity of changes in lake water storage in the Selin Co basin could be the principle of minimum potential energy.If we simplify the basin with the tank model and take the upstream small lake as a tank with a leaking hole, the storage of the small lake is mainly controlled by the height of the leaking hole.Given that surface water of the small lake increased, most of the increased water would flow into the large lake (a lower tank), and the outflow discharge of the small lake at higher elevations would increase accordingly.The height of the leaking hole would decline (erosion) so as to increase the overflow capacity, which eventually results in the decrease in small lake storage.Another possible situation is that the height of the leaking hole remains the same and the water surface height of the small lake increases, but this situation is not consistent with the minimum potential energy principle, as more water potential energy is stored in the small lake.This phenomenon shows that river-lake interactions may cause complex patterns of the regional surface water distribution.Therefore, decreases in small lake water storage and increases in water storage of Lake Selin Co in the basin detected by our study seem reasonable.Increases in small lake water storage in this basin reported in some published studies may be associated with the sparse sampling of lake water levels.
We averaged the total lake water storage change in each season to generate the time series shown in Figure 13 (a).The overall storage change in the 52 study lakes is 98.3±2.1 km 3 .The total lake water storage was increasing rapidly during the first 12 years but became relatively stable since 2012.Intra-annual variation in the TP lakes can also be investigated using the densified time series generated by this study.We removed the linear trend (sometimes there were multiple linear trends for a lake in different periods, which were removed in a stepwise fashion) and calculated the mean monthly water level anomaly for each lake across the study period.Then the intraannual water level change was represented by the difference between the maximum and minimum values of the monthly water level anomaly.The histogram of the intra-annual water level change in Figure 13 (b) shows that most of the TP lakes have water level variations ranging from 0.3-0.75m in a year on average.Similar work was performed by Lei et al. (2017) but only a small number of lakes were investigated in their study.

Quality assessment of similar data products
We made a comparison with a widely used lake water level/storage data set provided by the LEOGS Hydroweb, indicating that our product may perform better in terms of the consistency as well as sampling frequency.Both advantages are important in improving our understanding of responses of lakes to climate change.There are 21 same lakes in both our study and LEOGS Hydroweb.Annual trends in water level and lake storage during 2003-2015 were compared (Figure 14).Overall, the two products are consistent in terms of R 2 of the linear fit.However, some obvious discrepancies between the two data sources still exist, e.g., water levels of Lake Taro Co.Both Hydroweb data and our estimation used ICESat-1 and CryoSat-2 data.The difference lies in the fact that our CryoSat-2 product was more updated with a longer time span but Hydroweb used an additional altimetry satellite SARAL.Since both products were performed some kind of removal of the systematic bias, it is possible that we chose different baselines that resulted in the overall shift as shown in Figure 15 (a).
The black curve shows the optical water level we derived, which is a critical reference when connecting two different altimetry data time series without an overlap period.The optical water level shows that the last two samples of ICESat-1 data should not be lower than the first few samples of the CryoSat-2/SARAL data (see the dashed boxes).However, it is apparent that Hydroweb data display a reverse relationship.Though the optical water levels were derived by linearly fitting lake shore changes with altimetry data, the relative water levels during different periods should not be affected by the fitting parameters.This is the main reason for us to use optical water levels as reference data.Therefore, Hydroweb data may underestimate the decreasing trends in the water levels of Lake Taro Co since 2009.This problem may also exist in some similar studies when multisource altimetry data without overlap periods were used.As shown in Figure 16, optical data can be less noisy than altimetry data in certain lakes and significantly improved the continuity of lake monitoring.In addition, a more apparent seasonality in lake change can also be seen from the densified lake level time series.These advantages would largely benefit a better understanding of responses of TP lakes to climate change and facilitate hydrological modelling of lake basins, regional water balance analysis, and even hydrodynamic analysis of lake water bodies.5

Lake overflow flood monitoring
As mentioned earlier in Section 5.1, Lake Kusai experienced an abrupt expansion in 2011, resulting from dike-break of an upstream lake (Hwang et al., 2019;Liu et al., 2016;Xiaojun et al., 2012)  Given lake water levels, the outlet height and width, estimation of overflow can be made using the broad crest weir formula: Eq 16 where C is a parameter mainly reflecting geometric characteristics of the weir that mainly varies from 0.3-0.4,b is the width 5 of the weir, H is the water head with respect to the top of the weir, and g is the acceleration of gravity.
It is difficult to obtain the exact value of C without performing field investigations.Nevertheless, the range of C can be narrowed down by investigating the lake storage change process of Lake Kusai.As shown in Figure 19 (a), from Oct 1 to Nov 9 in 2011, the water level of Lake Kusai decreased rapidly by ~1.2 m.Given that the rainy season has ended (Liu et al., 2016), the water level of Lake Zhuonai became stabilized, providing minimum inflow to Lake Kusai.Meanwhile, the magnitude of 10 total evaporation during the period would not exceed 0.1~0.2m as the mean annual potential evaporation of the region is around 1000 mm (Zhang et al., 2007) and stage 1 (shown in Figure 19) only lasted for 40 days.In addition, the evaporation where V is referred to as the lake storage change, which is a function of water head H.
Equation 17 can be theoretically solved to describe the relationship between H and t if V(H) is simple, e.g., a cubic curve.
Otherwise, it can be solved with a numeric algorithm, such as the finite difference method.In a word, by solving the equation with different values of parameter C and optimizing the mean absolute error between the simulated result and the remotely sensed observations (shown in Figure 19 (a)), we suggested that C equals to 0.30 in this case, which is a reasonable value in hydrodynamic calculations.In stage 2 (Nov 9-Dec 31 in 2011) shown in Figure 19 (a), a temporary increase in water level occurred in Lake Kusai, implying that the overflow is not the only driving factor in stage 2. Nevertheless, stage 2 can provide water level input for Eq 17; thus the total outflow can be simulated using parameter C determined in stage 1.Since Lake Salt downstream mainly relied on the replenishment of Lake Kusai during that period, with little precipitation input and negligible glacier melt water in winter, the outflow of Lake Kusai can be comparable with the water gain in Lake Salt derived from remote sensing, though there was a small amount of evaporation loss.This relationship can provide a straightforward validation of our developed method.
However, it was not available in stage 1, because the outflow of Lake Kusai first replenished Lake Haidingnuoer until the later began overflowing.The total outflow from Lake Kusai in stage 2 was calculated to be 0.21-0.22km 3 , whereas the water gain in Lake Salt was 0.19 km 3 .This result showed that our densified lake water level time series from multiple optical and altimetric missions and the developed method are valuable in monitoring and predicting the outflow flood risk that is crucial for the safety of downstream residents and infrastructure.

Data availability
The derived TP lake water levels, hypsometric curves, and water storage changes are archived and available at https://doi.org/10.1594/PANGAEA.898411.(Li et al., 2019) 7 Conclusion In this study, we generated a dense (monthly and ever higher such as 10 days on average) continuous 18-year data set of changes in lake water level and storage for 52 large lakes on the TP by combining multisource optical and altimetric information.The optical data served as a unique reference covering the entire study period, enabling a more consistent conjunction of multisource altimetry time series.By comparison with a widely used LEGOS Hydroweb data set, we showed that without such a reference data set, there may be a remaining bias in the combined altimetry water levels.Our study has considerably improved the temporal resolution of the monitoring of TP lake water level and storage changes.For most lakes examined in the published studies, to our best knowledge, the estimates from our study provided the densest observations that can better reveal the interannual and intra-annual variability and trends in lake water level and storage, even in some relatively small TP lakes whose annual trends may, however, be incorrectly estimated by sparse sampling of lake water levels.The densified data set can also facilitate the monitoring of some rapidly expanding lakes with overflow risks and provide important information on flood prediction and early warning.
We evaluated the uncertainty in the optical water levels by field experiments and rigorous uncertainty analysis.Both methods are consistent that the magnitude of the uncertainty is around 0.1 m, which suggests that optical water levels are often more efficient and less noisy than altimetry data when the altimeter footprints on the lake surface are insufficient, especially for small lakes.Based on our estimates, 52 large TP lakes accounting for ~60% of the total TP lake area have gained 98.3±2.1 km 3 of water during the past 18 years.Lakes in the endorheic basin on the TP were mostly expanding.Water loss was more likely to be found among the southern TP lakes.In the Selin Co basin, a more complicated spatial pattern of lake storage changes was detected, as small lakes were slowly losing water whereas the large lake was gaining water, which we speculated to be caused by lake-river interactions that need further investigation.Note that the quality of the optical water levels before 2002 may not be as good as those obtained after 2002, because no altimetry data before 2002 were used in this study.
Extrapolation of the lake shore change-water level relationship may not be stable if the water level during 2000-2001 was much lower or higher than those from 2002-2017.

Figure 1 .
Figure 1.Spatial (the number of lakes covered) and temporal coverage and their overlap periods of multiple satellite altimetry missions used in this study, including Jason-1/2/3, ENVISAT, ICESat-1, and CryoSat-2.
review for journal Earth Syst.Sci.Data Discussion started: 15 March 2019 c Author(s) 2019.CC BY 4.0 License.
review for journal Earth Syst.Sci.Data Discussion started: 15 March 2019 c Author(s) 2019.CC BY 4.0 License.

Figure 2 .
Figure 2. (a) Ground tracks of multiple altimetry missions over Lake Zhari Namco and (b) the merged altimetry water levels for Lake Zhari Namco.
For every Landsat ETM + image acquired after 2003, the pixel number of the ROI was counted and compared with those acquired before 2003.If the loss of Earth Syst.Sci.Data Discuss., https://doi.org/10.5194/essd-2019-34for journal Earth Syst.Sci.Data Discussion started: 15 March 2019 c Author(s) 2019.CC BY 4.0 License.pixelsexceeded 2%, the ROI was considered to be affected by gap and the data were consequently excluded from the subsequent analysis.

Figure 3 .
Figure 3. (a) Lake Yamzhog Yumco and its surroundings.The DEM was extracted from the STRM Global 90-m DEM product; (b) ROI (yellow region) selected from a Landsat ETM+ image for detecting changes in lake shoreline and the gaps (black region); (c) linear regression of the lake shoreline change and altimetry water level for Lake Yamzhog Yumco; and (d) optical water levels and altimetry water levels for Lake Yamzhog Yumco.

Figure 4 .
Figure 4. (a) Landsat ETM+ image of the Aqqikkol Lake acquired in summer in 2001; (b) water extraction result using the MNDWI and NDWI, showing that the MNDWI performs better in detecting shallow water; (c) Landsat OLI image of Lake Nam Co acquired in winter in 2015; (d) water extraction result of the NDWI, showing good performance in distinguishing water from snow; and (e) result of the MNDWI, showing some confusion of water and snow.

Figure 5 .
Figure 5 .Programming interface of GEE.The red polygon was the ROI for lake area change extraction of Lake Selin Co.
review for journal Earth Syst.Sci.Data Discussion started: 15 March 2019 c Author(s) 2019.CC BY 4.0 License.
review for journal Earth Syst.Sci.Data Discussion started: 15 March 2019 c Author(s) 2019.CC BY 4.0 License.

Figure 7 .
Figure 7. Field experiments in two study lakes: (a) Landsat 8 image of the experiment spot; (b) pressure-type water level sensor; (c) unmanned aerial vehicle; (d) installation of the water level sensor; and (e) UAV image of a portion of the bank of Lake Nam Co.

Figure 8 .
Figure 8.(a) In situ water level anomaly versus optical water level anomaly in Lake Yamzhog Yumco; and (b) linear regression of the optical water levels and concurrent in situ water levels.

Figure 9 .
Figure 9. (a) Landsat OLI shoreline pixels on Lake Nam Co (the background is the UAV image), pixels marked with 1 were classified as water, and pixels marked with 0 were classified as land.The black dot in the insert figure represents the field experiment spot.(b) The chosen classifier, with two parameters a=21.13 and b=0.15 determined from the maximum likelihood method.The y axis represents the probability that a certain pixel is classified as water, and the x axis is the water area ratio in that pixel.Red dots represent pixels classified as land that have relatively lower water area ratios.Blue dots represent pixels classified as water that have relatively higher water area ratios.

Figure 11 .
Figure 11.Spatial distribution of trends in lake storage on the TP during 2000-2017.The back polygon shows the boundary of the endorheic basin of the TP including 39 study lakes.The other 13 study lakes are located outside the endorheic basin.

Figure 12 .
Figure 12.Discrepancy of lake storage trends in Lake Goren Co between Yao et al. (2018a) and our study.

Figure 13 .
Figure 13.(a) Total storage changes in the study lakes (52) on the TP, which can be generally divided into two stages: (1) a rapidly increasing stage (2000-2011) with a higher increasing rate of 6.68 km 3 /yr and (2) a mildly increasing stage (2012-2017) with an increasing rate of 2.85 km 3 /yr.(b) Histogram of changes in lake water levels of the study lakes on the TP.

Figure 14 .
Figure 14.Cross validation of the TP lake level and storage changes derived from our study with those provided by the LEGOS Hydroweb database (Cré taux et al., 2011a): (a) trends in lake water levels from 2003 to 2015 and (b) trends in lake water storage from 2003 to 2015.

Figure 18 .
Figure 18.(a) Height variations in the outlet of Lake Kusai, the overflow would occur when the water level increases from 4483.9 m to 4484.1 m; (b) Landsat images before the outburst of Lake Zhuonai; and (c) Landsat images after the outburst event.
Earth Syst.Sci.Data Discuss., https://doi.org/10.5194/essd-2019forjournal Earth Syst.Sci.Data Discussion started: 15 March 2019 c Author(s) 2019.CC BY 4.0 License.loss could be partially compensated by inflow.Overflow should be the driving factor of the lake storage balance of Lake Kusai during the period.Therefore, we used the following function to reproduce changes in water level and storage in Lake Kusai during Oct 1 to Nov 9 in 2011 (stage 1 shown in Figure19):

Figure 19 .
Figure 19.(a)Changes in the water level of Lake Kusai after receiving the outburst flood from Lake Zhuonai.Water level in stage 1 was simulated using Eq 17, which provided a referencing range of parameter C. Water level in stage 2 provided water level input for Eq 17 to calculate total outflow, which was compared with the concurrent water gain of Lake Salt downstream; and (b) changes in water storage of Lake Salt derived from remote sensing using our developed method.There was 0.19 km 3 of water gained in stage 2, which was comparable to the outflow estimate of Lake Kusai (0.22 km 3 ).