the Creative Commons Attribution 4.0 License.
A 1985–2023 time series dataset of absolute reservoir storage in Mainland Southeast Asia (MSEA-Res)
Abstract. The recent surge in reservoir construction has increased global surface water storage, with Mainland Southeast Asia (MSEA) being a significant hotspot. Such infrastructural evolution demands updates in water management strategies and hydrological models. However, information on actual reservoir storage is hard to acquire, especially for transboundary river basins. To date, no high spatio-temporal dataset on absolute storage time series is available for reservoirs in MSEA. To address this gap, we present (1) a comprehensive, open-access database of absolute storage time series (sub-monthly) for 185 reservoirs (larger than 0.1 km³) in MSEA spanning the period 1985–2023, and (2) an analysis of the reservoir storage dynamics. The MSEA-Res database includes static (Area-Elevation-Storage curves, water frequency, reservoir extent) and dynamic (area, water level, and absolute storage time series) components for each reservoir. The 185 reservoirs collectively store around 175 km³ (140 km³ – 210 km³) of water, covering an aggregated area of 8,700 km² (6,500 km² – 10,000 km²). We show that the combined average reservoir storage has increased from 70 km³ to 160 km³ (+130 %) from 2008 to 2017, primarily contributed by dams in the Irrawaddy, Red, Upper Mekong, and Lower Mekong basins. Our in-situ validation provides a good match between estimated storage and in-situ observations, with 60 % of the validation sites (12 out of 20) showing an R² > 0.65 and an average nRMSE < 15 %. The indirect validation (based on altimetry-converted storage) shows even better results, with an R² > 0.7 and an average nRMSE < 12 % for 70 % (14 out of 20) of the reservoirs. Furthermore, the analysis of the 2019–2020 drought event reveals that nearly 30–40 % of the MSEA region experienced more than five months of drought, with the most significant impact on reservoirs in Cambodia and Thailand.
As a result, storage departures ranged up to -40 % in some reservoirs, highlighting significant impacts on water availability. Overall, this analysis demonstrates the potential of the inferred storage time series for assessing real-life water-related problems in Mainland Southeast Asia, with the possibility of applications in other parts of the world. The MSEA-Res database and associated Python code are publicly available on Zenodo at https://doi.org/10.5281/zenodo.12787699 (Mahto et al., 2024).
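As a rough illustration of the figures quoted in the abstract, the following Python sketch computes a percentage change, an R², and an nRMSE. It assumes that R² is the squared Pearson correlation and that nRMSE is the RMSE divided by the mean observed storage; the abstract does not state either definition, and the function names and sample values are hypothetical.

```python
from math import sqrt


def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return 100.0 * (end - start) / start


def r_squared(obs, est):
    """Squared Pearson correlation (assumed definition of R²)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_e = sum(est) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov ** 2 / (var_o * var_e)


def nrmse_percent(obs, est):
    """RMSE normalised by the mean observation (assumed definition), in percent."""
    n = len(obs)
    rmse = sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / n)
    return 100.0 * rmse / (sum(obs) / n)


# Storage growth quoted in the abstract: 70 km³ to 160 km³.
print(round(percent_change(70.0, 160.0), 1))  # 128.6, reported as +130 %

# Hypothetical observed and estimated storage values (km³).
observed = [2.0, 4.0, 6.0, 8.0]
estimated = [2.2, 3.8, 6.2, 7.8]
print(r_squared(observed, estimated), nrmse_percent(observed, estimated))
```

Other normalisations (for example by the observed range or by reservoir capacity) would give different nRMSE values, so the 15 % and 12 % thresholds in the abstract should be read against the definition used in the manuscript.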
Status: final response (author comments only)
RC1: 'Comment on essd-2024-441', Anonymous Referee #1, 16 Oct 2024
The manuscript provides a long-term dataset of reservoir storage in Mainland Southeast Asia. This is meaningful for studies of reservoir operations and for further studies of hydrological processes. However, some comments still need to be addressed, as listed below.
- Line 43: Steyaert et al., 2022 and Steyaert and Condon, 2024 are not found in the reference list. Please check.
- In Table 3, why are the two attributes ‘Water surface area (empty reservoir)’ and ‘Absolute storage (empty reservoir)’ marked as empty reservoir?
- I’ve downloaded the dataset and am a bit confused about the water area extraction. There are three attributes, ‘Before_area’, ‘After_area’ and ‘Final_area’. How is each derived? How was ‘Final_area’ determined? Some values equal ‘Before_area’ and some equal ‘After_area’. Please add more explanation to the manuscript.
- Is the estimation improved in level 1 compared to level 0? If so, by how much?
- Section 4.4: I understand that direct validation is limited by observations, but I assumed indirect validation could be applied to most of the reservoirs. Why are only 20 presented? Can the authors present more results?
- Figure 9: There are some discrepancies between the spatial distributions of precipitation deficit and reservoir storage deficit, especially in 2020, when the lower part did not suffer from a precipitation deficit but had less water stored in reservoirs. Please explain why.
- Section 4.5: Drought is a prolonged disaster that can affect long-term water availability. It would be interesting to look into the time series of reservoir storage to explore how reservoirs are affected and how they can alleviate the influence of droughts.
- Line 424-426: Is there any evidence, from data or references, that China held back water in its dams? If not, please remove this sentence.
Citation: https://doi.org/10.5194/essd-2024-441-RC1
RC2: 'Reply on RC1', Edward Park, 22 Nov 2024
I suggest a moderate revision for this manuscript. The article is well-formulated and addresses an academically significant topic with a technically robust methodology. The authors are recognized experts in this domain, and the methods employed are sound. The manuscript tackles the critical issue of reservoir construction, compiling a valuable database of storage changes in one of the global hotspots, with significant implications for water resource management and the population of Southeast Asia. The study is timely and has the potential to serve as an important resource for both the scientific community and policymakers working on water-related issues in the region.
Given the nature of the journal, ESSD, where the emphasis is on "resource/data publication," innovation may not necessarily be the highest priority. Instead, the value lies in building a platform that supports future research utilizing this data. Since the methods presented are sound, my comments are primarily focused on strengthening the narrative and justifying the study's broader scientific and practical implications.
A stronger justification of the research gap would enhance the manuscript. The current gaps presented seem incremental rather than innovative, improving on existing models, datasets, or studies rather than breaking new ground. While incremental research is meaningful, the study could benefit from highlighting novel techniques or scientific insights.
On line 72, the paragraph starts with a question. Instead, I suggest stating the research question in a more formal manner to improve clarity.
On line 103, "dams" should likely be replaced with "reservoirs" for consistency and accuracy in terminology.
Section 4 stands out as particularly interesting and potentially valuable. The authors provide insights into how storage patterns have evolved over the years and across different basins. This part of the study could serve as critical baseline information for future research in this domain.
The final section is also commendable, as it validates the utility of the database and demonstrates its application with a specific recent example. The analysis of the impact of the 2019–2020 drought on surface water storage effectively highlights the significant effects of extreme dry weather events on water resources in Mainland Southeast Asia. The demonstration of MSEA-Res's utility for hydrological modeling and other applications adds significant value to the manuscript.
Regarding figures, Fig. 1a would be more informative if the river network were included on the map. This would provide additional spatial context for readers.
For Fig. 3, it would be helpful to include an example map or image alongside the text description for each static component. This would improve clarity and accessibility for readers unfamiliar with the methodology.
On line 367, the authors removed three low-performing reservoirs to improve the correlation. It would be beneficial to provide a clear justification for why these reservoirs were excluded from the statistics. Additionally, addressing the reasons behind the underperformance of certain reservoirs compared to others, as well as discussing the overall accuracy of the dataset, would strengthen the manuscript. For example, Fig. 8A suggests a potential systematic spatial distribution of R² values. If this is indeed the case, it may imply a methodological bias, which should be addressed in the discussion.
Edward
Citation: https://doi.org/10.5194/essd-2024-441-RC2
- AC2: 'Reply on RC2', Shanti Shwarup Mahto, 04 Jan 2025
- AC1: 'Reply on RC1', Shanti Shwarup Mahto, 04 Jan 2025
RC3: 'Comment on essd-2024-441', Anonymous Referee #3, 24 Nov 2024
This manuscript employs Landsat and Sentinel-2 imagery to calculate the Normalized Difference Water Index (NDWI) and estimate changes in reservoir water surface area across Mainland Southeast Asia. The authors then use hypsometric curves to estimate absolute water storage dynamics for these reservoirs. These storage estimates are validated against several in-situ datasets. Using this dataset, the authors demonstrate the impact of the recent 2019-2020 drought on reservoir storage in the region. Overall, the manuscript is well-written. However, I think the use of NDWI to map surface water and the application of established hypsometric curves for storage estimation do not contribute any significant methodological innovation. I also believe it is unacceptable for a manuscript to lack a Discussion section. I also have a few major concerns outlined below:
Further validation of the surface water estimates and hypsometric methods is necessary. I recommend comparing your surface water estimates with the Global Surface Water Dataset (GSWD) and/or other published reservoir datasets. Since you calculated water frequency rasters and maximum water extent, these can also be compared against GSWD water occurrence data to strengthen your results. Additionally, many studies have focused on developing hypsometric curves for reservoirs; it is essential to clarify why your approach is advantageous compared to others. Given the significant uncertainties in using DEMs to derive hypsometric curves, I suggest addressing these limitations in your study.
The use of NDWI alone may be too simplistic and may lack the accuracy needed to effectively map surface water dynamics. Without additional processing, NDWI can be prone to misclassification, especially in areas with mixed land-water pixels or seasonal vegetation cover. Additionally, factors such as high turbidity, shadows, or the presence of aquatic vegetation can further impact the accuracy of surface water mapping. Addressing these limitations is essential, and the authors might consider discussing alternative or supplementary approaches to enhance the reliability of water detection across diverse environmental conditions.
Combining Landsat and Sentinel-2 data should enhance the observation frequency for monitoring surface water dynamics. However, the manuscript does not highlight this potential benefit. You mention that sub-monthly surface water observations are achievable with these datasets, but it seems likely that an even higher frequency could be attained by fully leveraging both satellite sources. I recommend clarifying the observation frequency achieved in this study and discussing how the combined use of Landsat and Sentinel-2 could improve temporal resolution, potentially down to a weekly or even more frequent basis, which would provide greater detail on surface water changes.
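The workflow discussed in the paragraphs above (NDWI water mapping followed by a hypsometric conversion from area to storage) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the McFeeters form NDWI = (Green - NIR) / (Green + NIR), a fixed water threshold of 0, and a hypothetical area-storage table evaluated by linear interpolation. All band values and curve points are made up.

```python
def ndwi(green, nir):
    """McFeeters NDWI for one pixel; returns 0.0 when both bands are zero."""
    total = green + nir
    return (green - nir) / total if total else 0.0


def water_area_km2(green_band, nir_band, pixel_area_km2, threshold=0.0):
    """Count pixels with NDWI above the threshold and convert to area."""
    n_water = sum(
        1
        for g_row, n_row in zip(green_band, nir_band)
        for g, n in zip(g_row, n_row)
        if ndwi(g, n) > threshold
    )
    return n_water * pixel_area_km2


def storage_from_area(area, curve):
    """Linear interpolation on (area, storage) pairs sorted by area.

    Areas outside the table are clamped to the first or last storage value.
    """
    if area <= curve[0][0]:
        return curve[0][1]
    for (a0, s0), (a1, s1) in zip(curve, curve[1:]):
        if area <= a1:
            return s0 + (s1 - s0) * (area - a0) / (a1 - a0)
    return curve[-1][1]


# Hypothetical 2 x 2 reflectance tiles; 30 m pixels cover 0.0009 km² each.
green = [[0.10, 0.08], [0.03, 0.09]]
nir = [[0.02, 0.03], [0.20, 0.04]]
area = water_area_km2(green, nir, pixel_area_km2=0.0009)

# Hypothetical area (km²) to storage (km³) table standing in for an A-E-S curve.
aes_curve = [(0.0, 0.0), (2.0, 0.01), (4.0, 0.05)]
print(area, storage_from_area(3.0, aes_curve))
```

A fixed threshold is the simplest choice and is exactly where the misclassification issues raised above (mixed pixels, turbidity, shadows, aquatic vegetation) would enter; an adaptive threshold or a clustering step would replace the `threshold` argument.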
Specific Comments:
Abstract: The abstract needs to be revised. It does not clearly convey that this study utilizes remote sensing data to estimate reservoir water area dynamics. Instead, it reads more like a compilation of reservoir data in Mainland Southeast Asia.
L87: The GloLakes database provides absolute water storage data from 1984 to the present, rather than just up to 2020.
L101: By combining Landsat and Sentinel-2 data, it is possible to derive sub-weekly reservoir dynamics time series, offering higher temporal resolution than the sub-monthly intervals mentioned in your study.
L102: Why did you choose the hypsometric curves developed by Hao et al. (2024)? What advantages does this database offer over those from other studies?
Table 1: The GDAT link is not working; please check it. Additionally, the link for “Dams in the Mekong” appears to point to the GRanD database instead.
Figure 1: Please ensure the volume units are consistent throughout the manuscript. “Km³” was used previously, whereas “million m³” is used here. Consider standardizing to one unit for clarity.
L169-170: Instead of saying you “acquire” water index, water frequency, or maximum water extent, it’s more accurate to state that you “derive” or “calculate” these data.
L182: The term “optical images” should not refer exclusively to “Green (G) and Near-Infrared (NIR)” bands.
L185: Since Sentinel-2 provides a cloud cover product, have you considered using it to filter out cloudy images?
L185: Images with even 5-20% cloud coverage can still significantly impact the accuracy of surface water extent measurements. This level of cloudiness may obscure key areas or introduce errors, making it essential to account for even minimal cloud presence in your analysis.
L186: Given that Landsat has a 16-day revisit time, how are you compositing 16-day images into a 10-day interval?
L190-195: You need to classify these NDWI pixels before calculating water frequency and maximum water extent.
L203: “for”: after?
L240-244: I do not understand why you need to generate level-0 data.
L246: Please specify what “a trend-preserving interpolation technique” is.
Table 3: Remove the trailing underscore after the attribute names (e.g., “Level_m_”).
Figure 5: Do the solid lines represent Level-2 data, or are they a combination of Sentinel and Landsat data?
Citation: https://doi.org/10.5194/essd-2024-441-RC3
- AC3: 'Reply on RC3', Shanti Shwarup Mahto, 04 Jan 2025
RC4: 'Comment on essd-2024-441', Anonymous Referee #4, 24 Nov 2024
This study presents an extensive database of reservoir storage timeseries for the Southeast Asian reservoirs from 1985-2023. Indeed, this paper is methodologically rigorous and can be integrated with hydrological models for the validation of their reservoir operations. However, the methodological framework section of this article needs to be significantly modified before publication to improve readability. Therefore, overall, the present version is not acceptable for publication, and I recommend a major revision for this manuscript.
Below are the comments/questions for the authors.
Abstract
- Line 15 – 16: “The 185 reservoirs collectively store around 175 km³ (140 km³ – 210 km³) of water, covering an aggregated area of 8,700 km² (6,500 km² – 10,000 km²)”. My understanding from the manuscript is that the present total water storage (year 2023) of the 185 reservoirs is 175 km³. Please state the year in the sentence; otherwise it is unclear which year is being referred to. Further, what is the range of 140 km³ to 210 km³? I could not find it in the manuscript. The same applies to the aggregated reservoir area: the area values are not described in the manuscript, especially the area range (6,500 km² – 10,000 km²). Please explain all of these clearly to avoid confusion.
- Line 17: “average reservoir storage has increased from 70 km³ to 160 km³ (+130%) from 2008 to 2017”. Why is the storage change from 2008 to 2017 considered instead of the change over the full timeframe of the database? Is there a specific reason? If not, it is better to show the change from 1985 to 2023.
- Line 18 – 21: “Our in-situ validation provides a good match between estimated storage and in-situ observations, with 60% of the validation sites (12 out of 20) showing an R² > 0.65 and an average nRMSE < 15%. The indirect validation (based on altimetry-converted storage) shows even better results, with an R² > 0.7 and an average nRMSE < 12% for 70% (14 out of 20) of the reservoirs”. For the in-situ validation, the reference R² value was 0.65, whereas for the indirect validation it was 0.7. Please unify the reference R² value at 0.7, as in the main text, so that 10 out of 20 stations (50% of the validation sites) show good agreement with in-situ observations. Rewrite the entire sentence accordingly.
- Line 21: “2019-2020 drought event”. Where did this drought happen? Please specify in the sentence.
- Line 25 – 26: “possibility of applications in other parts of the world”. Briefly explain in the Results section how this dataset will be applicable in other parts of the world.
- Line 26: “MSEA-Res database”. This abbreviation is used for the first time in the abstract. It should be clearly explained in the earlier part of the manuscript.
- Overall: The manuscript title and abstract do not make clear the methodology used to estimate the storage time series in MSEA. Readers need to proceed to later sections of the manuscript to understand that remote sensing data were used to estimate the reservoir storage. This should be emphasized in the title and abstract.
Introduction
- Line 29 – 30: “influencing the redistribution of water”. It should be rewritten as “influencing the distribution of water”.
- Line 44: The term “Mainland Southeast Asia” has been abbreviated as MSEA in the abstract. Hence, when it first appears in the Introduction, it should again be spelled out in full and abbreviated, which has not been done. The MSEA abbreviation first appears in the Introduction at line 93 without being spelled out. Also, the term is written out as “Mainland Southeast Asia” many times even though it has been abbreviated. Please maintain consistency throughout the manuscript by defining abbreviations where they first appear.
- Line 51: “where a few large rivers flowing in the region originate”. What do the authors mean by “in the region originate”? Please rewrite to make it clear.
- Line 55: Add Hanasaki et al. (2008), which is one of the pioneering works to include reservoir operation in global hydrological models. https://doi.org/10.1016/j.jhydrol.2005.11.011
- Line 97 – 105: Compared with Hou et al. (2024), the additional data that the authors offer are some reservoirs in the MSEA region and three extra years of data from 2020. How significantly different is this work from Hou et al. (2024)? I recommend that the authors validate the storage database against the one available in Hou et al. (2024) and discuss whether they are comparable. If they differ, please explain why. If they are similar, an explanation is needed of why this new dataset is relevant.
- Line 107: The 2019-2020 drought is mentioned here as well. Clearly specify where this drought occurred.
Water reservoirs in Mainland Southeast Asia
- Table 1: The GDAT and “Dams in the Mekong” links are not working. Please check them.
- Figure 1: From Fig. 1(a), I understood that both the Bhumibol and Sirikit reservoirs in the Chao Phraya basin have a reservoir storage between 9000 and 13000 km3. However, this is not true for Bhumibol, whose storage is greater than 13000 km3. Please check the storage capacities of all the reservoirs. Besides, the Pasak dam, which is included in the GRanD database, is missing from the same basin. Is there a reason for leaving out this reservoir?
- Figure 1 caption (Line 146): “Basin-wise distribution of dam location (red dots), stream network, and order”. It should be rewritten as “Basin-wise distribution of dam location (red dots), stream network in the respective catchments, and stream order”.
Methodological Framework
- Line 156: What do the authors mean by sub-monthly here? How many times per month are data available? Please describe the minimum and maximum across all the catchments.
- Line 156: It is mentioned that the authors used both Landsat and Sentinel-2 satellite data. However, the spatial resolution of Sentinel-2 is 10 m, so the uncertainty and accuracy will be very different for the two types of imagery. How did the authors resolve this issue? What treatment was applied to the Sentinel-2 data to match its spatial resolution to that of Landsat?
- Figure 2: The reservoir maximum extent is not input data but a derived product obtained after processing. What about the NDWI images? Are they acquired or derived? In some parts of the manuscript the NDWI is said to be acquired, while in others it is said to be derived. Please clearly state how the NDWI images were obtained. If the study uses both acquired and derived NDWI images, clearly state when and where the derived products were used. Change Fig. 2 accordingly. Also show the steps of the methodological framework in Fig. 2 for better understanding.
- Line 164 – 165: What is meant by time series satellite images? Does it mean a series of images?
- Line 169 – 170: What are the water frequency raster and the maximum water extent raster? What do they mean? Later in the manuscript, I understood that these are derived products. Why, then, do the authors describe them as acquired input datasets?
- Line 175 – 176: Further information is needed on which satellite products were used to create the NDWI, water frequency raster, and maximum water extent raster. Add this information to Table 2.
- Line 180: The bands (green, red, and NIR) change with the satellite product, and hence so does the NDWI formula. How are these bands generalised across all sensors? Please rewrite correctly.
- Line 185: Choosing images with cloud coverage less than 80% is not a good idea. What if the cloud coverage is 89% and lies exactly over the reservoir extent? What information can the authors obtain from such an image, and how is the image treated further? I suggest that the authors first mask the satellite image to the reservoir extent and then apply the cloud coverage threshold, preferably below 20%.
- Line 186: The revisit time of Landsat is generally 16 days. So how did the authors make composites at 10-day intervals? Was this achieved by combining Landsat 9 and Landsat 8, whose combined temporal resolution is 8 days at mid-latitudes? Such explanations are missing from the manuscript. I presume there could be some months without any data. How did the authors generate data for those months?
- Line 193: How was the composite NDWI created? For example, suppose we have three NDWI images in which a grid cell has values of 0, 1, and 0. What will the composite value be? Is the FREQ for that particular grid cell 33.3, which means one-third of the grid is covered with water? Please clearly explain this in the main text or in the supplement. Further, I do not understand the EXT layer calculation. How is it calculated? How is the largest extent of ones derived from binary NDWI images? My understanding is that single FREQ and EXT maps are created for the entire period (2013-2023). Is that correct? All such technical details should be clarified in the text.
- Line 194: To generate the NDWI, the authors used a cloud coverage threshold below 80%. But at later stages, a threshold of 20% was used to derive the FREQ and EXT maps from the NDWI images. Why are they different? Further, how can the NDWI images have cloud coverage? I would expect them to be processed images from which cloud cover has already been removed.
- Line 197: What is a scene-based NDWI image?
- Line 190: Clearly rewrite the entire paragraph.
- Line 200: Did the authors compare the derived area-elevation-storage curves against observed curves? I believe it is crucial to validate these curves because they form the heart of this study.
- Line 213: Why have the authors taken the A-E-S curves from Hao et al. (2024)? Does this dataset have any specific advantages over other existing datasets?
- Line 217: Is this the water surface area when the reservoir is full, or an area time series?
- Line 222: Why does CLAHE operate on a small region? How is this operational window chosen?
- Line 223: How is the surface area calculated? How is k-means clustering useful here? It is not clear. Further explanations are needed.
- Line 245: How is the Level-2 data generated? It is mentioned that a trend-preserving interpolation technique is used. What is it?
- Overall: Clarity and logical flow are lacking in this section. Further rewriting with clear explanations of the technical details is needed.
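The question raised above about FREQ and EXT (three binary water masks with values 0, 1, and 0 in one grid cell) can be made concrete with a small sketch. The definitions here are assumptions for illustration only, since the manuscript's exact procedure is what the comment asks the authors to clarify: FREQ is taken as the percentage of valid (cloud-free) scenes in which a pixel is water, and EXT as the set of pixels whose frequency exceeds a chosen minimum. The function names and the `min_freq` parameter are hypothetical.

```python
def water_frequency(masks):
    """Per-pixel percentage of valid scenes classified as water.

    masks: list of 2-D lists holding 1 (water), 0 (non-water) or
    None (cloud or no data). Pixels with no valid scene stay None.
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    freq = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            valid = [m[r][c] for m in masks if m[r][c] is not None]
            if valid:
                freq[r][c] = 100.0 * sum(valid) / len(valid)
    return freq


def max_extent(freq, min_freq=0.0):
    """Binary extent: 1 where water frequency exceeds min_freq, else 0."""
    return [
        [1 if f is not None and f > min_freq else 0 for f in row]
        for row in freq
    ]


# Three scenes over a 1 x 2 tile; the second pixel is cloudy in scene 3.
scenes = [
    [[0, 1]],
    [[1, 1]],
    [[0, None]],
]
freq = water_frequency(scenes)
print(freq)                    # first pixel is water in 1 of 3 valid scenes
print(max_extent(freq))        # union of all water detections
print(max_extent(freq, 50.0))  # stricter extent: water more than half the time
```

Under these assumed definitions, the reviewer's 0, 1, 0 example yields a frequency of one third (about 33.3 %), read as the share of observations with water rather than the share of the cell that is wet.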
Results
- Line 290: Table 3 uses “area-level-storage”, while the text uses “area-elevation-storage”. Please unify.
- Line 291: Is the mean sea level the authors refer to a common datum, or does it differ between regions?
- Figure 4: What is the maximum storage capacity of Sirikit? To my knowledge it is below 10000 km3. Why, then, does the y-axis of Fig. 4(e) show a maximum above 15000 km3? The figure caption says the relationship is based on the maximum storage capacity, which is not true at least for the Sirikit reservoir.
- Figure 5: Why is the storage pattern different for the Longjing, Son La, and Nuozhadu reservoirs? They show storage fluctuations before commissioning, unlike the Xe Kaman 1 reservoir.
- Figure 5: Is this a daily or a sub-monthly time series? Which product (level-1 or level-2) was used to plot this figure? Please mention it in the figure caption.
- Figure 5 caption: What is scene-based reservoir storage (Line 322)? “dynamic components” should be rewritten as “dynamic component” (Line 321).
- Line 304: How can one tell from Fig. 5(a) that the Longjing (2010) reservoir was filled in roughly one year? What is the full capacity of that reservoir?
- Line 327: Why use level-1 data when level-2 data are available? The authors use level-1 data in the subsequent sections as well. Is there a reason for this?
- Line 345: It is not the volume reduction in Chao Phraya basin. Instead, the storage has substantially reduced due to persisting drought conditions and both Bhumibol and Sirikit reservoirs showed a continuous decline in storage.
- Please clearly mention the temporal scale of all timeseries figures (Fig. 5, Fig. 6, Fig. 7, and Fig. 8).
- Line 371: The reference should be to Fig. S2, not Fig. S3.
- Table S1: How do the authors explain the underperformance of the Bhumibol, Rajaprbha, and Bang Lang reservoirs? Also, what level of nRMSE is considered acceptable?
- Line 412: Why is the reference period set between 2017 and 2023? Drought is a slow process that sometimes persists for decades. Hence, the reference period should be changed.
- Line 418 – 419: Storage conditions worsened in 2020 because of the combined effect of reduced precipitation and reduced storage levels in the dams in 2019.
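Several comments above concern the reference period used for the drought analysis. A minimal sketch of a storage departure calculation shows how that choice enters the result. It assumes the departure is the percentage difference between one month's storage and the mean storage of the same calendar month over the reference years; this definition, the function name, and all values are hypothetical and are not taken from the manuscript.

```python
def storage_departure_percent(series, year, month, ref_years):
    """Percent departure of one month's storage from its reference mean.

    series: dict mapping (year, month) to storage in any consistent unit.
    ref_years: iterable of years that define the reference period.
    """
    ref = [series[(y, month)] for y in ref_years if (y, month) in series]
    if not ref:
        raise ValueError("no reference values for this month")
    ref_mean = sum(ref) / len(ref)
    return 100.0 * (series[(year, month)] - ref_mean) / ref_mean


# Hypothetical March storage (km³) for one reservoir.
storage = {
    (2017, 3): 5.0,
    (2018, 3): 5.0,
    (2019, 3): 5.0,
    (2020, 3): 3.0,
}
print(storage_departure_percent(storage, 2020, 3, range(2017, 2020)))  # -40.0
```

Because the reference mean sits in the denominator, a short reference period that itself includes dry years would shrink the apparent departure, which is the concern behind the request to lengthen it.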
Conclusions
- Line 438: Aggregated storage capacity of nearly 175 km³ was observed by the year 2023. Please mention it.
- Be careful about the usage of abbreviations such as SRTM, DEM, GDAT, SAR, NASA SWOT, etc.
- Line 459: “, flood control” should be rewritten as “and flood control”.
Citation: https://doi.org/10.5194/essd-2024-441-RC4
- AC4: 'Reply on RC4', Shanti Shwarup Mahto, 04 Jan 2025
Data sets
A comprehensive time series dataset of absolute reservoir storage in Mainland Southeast Asia (MSEA-Res) from 1985 to 2023 Shanti Shwarup Mahto, Simone Fatichi, and Stefano Galelli https://doi.org/10.5281/zenodo.12787699