the Creative Commons Attribution 4.0 License.
A Harmonized Dataset for Dams and Reservoirs in West Africa
Abstract. Most existing datasets that could support dam and reservoir management and assessments of their impacts in West Africa are limited by inaccurate georeferencing, inconsistent accessibility, heterogeneous data records, and a lack of validation against field observations. In this study, we review and assess existing datasets containing information on dams and reservoirs in West Africa and subsequently integrate them into a harmonized and consolidated regional dataset. We benchmarked the quality of the newly compiled dataset at watershed scale through an extended field study and statistical analyses. The resulting dataset (https://doi.org/10.60507/FK2/YLDK1Y) includes 1,429 georeferenced dams and 1,258 reservoirs (with a minimum surface area of 0.57 × 10⁻³ km²), exceeding the count of dams and reservoirs in West Africa reported by any available dataset. It contains 38 attributes, an estimated total reservoir surface area of 14,038 km², and a cumulative storage capacity of 283,032 million cubic meters (MCM), thereby enhancing data accessibility in West Africa. The compiled regional dataset contains fewer missing entries and exhibits lower bias than the source datasets, advancing existing efforts by explicitly integrating both large- and small-scale reservoirs. The ground-based watershed-scale assessment revealed strong spatial and temporal coherence for large-scale reservoirs, but a systematic underrepresentation of small-scale infrastructure in the sources and thus also in the compiled dataset, highlighting the importance of field validation. The field benchmarking makes the case for collaborative research and data-sharing initiatives among scientists and institutions across West Africa to improve the accuracy and completeness of dam and reservoir data, especially for small-scale infrastructure.
Status: final response (author comments only)
RC1: 'Comment on essd-2026-159', Anonymous Referee #1, 03 May 2026
AC1: 'Reply on RC1', Valery Bessely Stanislas Kouassi, 18 May 2026
Good points and thank you for these constructive comments. We will address them.
Citation: https://doi.org/10.5194/essd-2026-159-AC1
RC2: 'Comment on essd-2026-159', Anonymous Referee #2, 18 Jun 2026
The manuscript is generally well written and reads smoothly. It provides a clear overview of existing dam and reservoir datasets, identifies relevant scientific gaps, and presents a coherent methodology, results, and discussion. By integrating 12 existing datasets, the authors developed a harmonized database of dams and reservoirs in West Africa and evaluated it using field-survey data from the Upper Bandama watershed. The study has substantial regional relevance and potential practical value. In addition, the dataset has been made publicly available through a repository with a DOI, which is appropriate for an ESSD data description article. However, additional technical details are needed for several key procedures, including data harmonization, duplicate removal, dam–reservoir matching, and manual digitization. The authors should also consider that the source datasets differ in feature type, size threshold, and temporal coverage. Moreover, several datasets inherit records from the same sources and therefore cannot be regarded as completely independent products. These dependencies should be explicitly considered during data integration and quality assessment. Overall, I consider this a meaningful study with considerable potential value. However, the manuscript would benefit from careful revision before publication. Improving the transparency and reproducibility of the methodology would substantially enhance the scientific value and practical usability of both the manuscript and the dataset. My specific comments are provided below.
1. Abstract: Please define “small-scale infrastructure” and provide the corresponding area, storage-capacity, or dam-height threshold.
2. Introduction: The terms “large” and “small” dams and reservoirs should be precisely defined throughout the manuscript. If the classification is based on reservoir area, please provide the corresponding area ranges.
3. Line 76: The research gap is described as the “underestimation” of West African dams and reservoirs in existing datasets. Does this refer to the number of dams, reservoir area, storage capacity, positional accuracy, attribute completeness, or a combination of these factors? Please define the specific dimensions of underestimation more clearly.
4. Please explain how duplicate records were identified and removed, including the matching criteria, thresholds, and whether the process was automated or manual.
5. Please clarify whether dam points and reservoir polygons have a one-to-one relationship in the compiled dataset. How were dam points linked to reservoir polygons?
6. Please clarify whether each record retains its original data source and indicates any modifications made during harmonization.
7. Please report the sample size and selection criteria for the surveyed dams and villages in the Methods and clarify whether the survey covered all known dams in the watershed.
8. The field survey was conducted in 2024, whereas some source datasets are much older. Please consider temporal mismatches when interpreting missing dams and assessing dataset accuracy.
9. Figure 4: Please define the meaning of “lake control.”
10. Many of the selected datasets overlap or inherit information from other datasets. For example, some newer products incorporate records from GRanD, GLWD, and HydroLAKES. Did the harmonization procedure account for this lack of independence?
11. Table 4: The current comparison of attribute availability is rather coarse. Please consider listing the individual attributes provided by each dataset; this should not require excessive additional space and would substantially improve the usefulness of the comparison.
12. Lines 307–311: 148,671 MCM represents approximately 52.5%, rather than 55.61%, of 283,032 MCM.
13. Lines 399–400: If the compiled dataset contains 32% of the dams observed during the field campaign, the gap should be approximately 68%, not 88%.
14. The field-based assessment was conducted in only one watershed. Please discuss more carefully whether its results can be generalized to the whole of West Africa.
15. I recommend adding a limitations subsection to the Discussion. This section should address the use of only one watershed for field evaluation, the non-independence of source datasets, and potential uncertainty in the field-reference data itself.
Citation: https://doi.org/10.5194/essd-2026-159-RC2
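The two numerical corrections in comments 12 and 13 can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in those comments; the helper name `percent_of` is illustrative and does not come from the manuscript or the dataset.

```python
def percent_of(part, whole):
    """Return `part` expressed as a percentage of `whole`."""
    return 100.0 * part / whole


# Comment 12: storage figures as quoted by the referee, in MCM.
share = percent_of(148_671, 283_032)
print(f"Share of total storage: {share:.1f}%")  # prints "Share of total storage: 52.5%"

# Comment 13: if 32% of the field-observed dams are in the compiled dataset,
# the remaining share is the gap.
gap = 100 - 32
print(f"Gap: {gap}%")  # prints "Gap: 68%"
```

Both results agree with the values the referee proposes (about 52.5% and 68%).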
Data sets
West Africa Dams and Reservoirs Dataset Valery Bessely Stanislas Kouassi et al. https://doi.org/10.60507/FK2/YLDK1Y
Viewed
- HTML: 332
- PDF: 75
- XML: 15
- Total: 422
- BibTeX: 17
- EndNote: 17
Overall, the manuscript is well written and pleasant to read. The work is extremely interesting for a region like West Africa, where there is relatively little documentation, as it involves compiling an inventory of existing databases on the locations and characteristics of dams and reservoirs and attempting to validate this information in the field to verify that the databases are consistent. The results could be very useful for the hydrology community, particularly for feeding into regional hydrological models that take water storage and irrigation processes into account. The database is already available online with a corresponding DOI.
One thing I'm not quite clear on is what criteria determine whether a database is included in the work. Do you simply use all existing data, or is there a preliminary filtering process? Similarly, I think the criteria for what is considered a large or small dam should be better defined. For example, what is the minimum reservoir size included in this new database? This is important because there is indeed a myriad of very small reservoirs, and I think it's a real challenge to try to inventory them all. On the other hand, it's important for database users to know what types and sizes of dams it contains.
One point I think is worth elaborating on a bit more is the validation. It's clear that the database was validated using data from a single watershed, but I think the reader needs a bit more information to assess the reliability of this database, particularly in different hydroclimatic contexts. I fully understand that this type of field validation is a major undertaking that requires significant effort at the local level and, of course, cannot be carried out on a regional scale. However, the manuscript should provide context to help verify whether this validation is representative at the regional level.
Specific comments:
Line 105: It would be helpful to explain why this basin was chosen for validation. Is it the only one with data, or is it representative of the area? In particular, we need to discuss how validation at a single site could be representative of the entire region, given that there are four very distinct climate zones.
Line 141. I don't quite understand this part. For example, the coordinates of the reservoirs were estimated using a statistical approach. That seems rather inappropriate to me. A reservoir can be observed, for example in satellite images, and its location is precise—not the result of an average estimated from a statistical distribution.
Line 286. I don't think this collection exceeds the number in previous studies; as it's written, the count actually ranges from 1,415 to 1,429, so the difference is not large. It would be interesting to state here which dataset reports 1,415 dams, and the percentage in common with the current database. This is also not the same number as reported at line 488 (1,141 records). Please revise.
Figure 6: I see that the size of the catchment area is missing in many cases. However, it seems to me that this information is very easy to estimate: once you have the precise locations of the dams, you can delineate the catchment area using a digital elevation model (DEM).
Line 310. It should also be noted that these are climatologically very different regions, with a gradient of aridity from south to north.
Figure 12 can be misleading. At first glance, I thought it was a comparison of the same dams—specifically, what was in the databases versus what was observed—but upon reading the text, one realizes that these are two different databases: on the one hand, what is observed in the field, and on the other hand, what is in the databases. Because otherwise, it would be very surprising that information such as the year of construction or the height of the dam—which are actually fairly easy to obtain—would differ so much. So I think the figure’s caption should be rewritten slightly to explain that these are in fact two different sets of data.
Line 415: I think this paragraph should be expanded slightly to explain the link between the spatial distribution of dams and the climate. One might imagine that dams are unnecessary in humid regions but necessary in semi-arid regions, yet this is not explicitly discussed in the manuscript. Given the number of dams analyzed, I believe this type of analysis would be very interesting.
Line 496: I think we should clarify here what is meant by a “small reservoir.” Are we trying to detect pools a few meters in size? What is the average size? In fact, I feel that this definition is somewhat lacking in the manuscript; perhaps it should be clarified: what is the minimum dam size considered in the database?
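The Figure 6 comment suggests delineating the catchment from a DEM once the dam location is known. The sketch below shows one way to do this, under stated assumptions, and is not the authors' method. It uses D8 steepest-descent flow routing, which is one common scheme the comment does not itself name. The 3 × 3 elevation grid, the 90 m cell size, and the names `d8_downstream` and `catchment` are hypothetical. A real DEM would also need sink filling and a projected grid.

```python
import math
from collections import deque


def d8_downstream(dem):
    """Map each cell (row, col) to its steepest-descent neighbour, or None for pits.

    Ties are broken by scan order (first neighbour found wins).
    """
    rows, cols = len(dem), len(dem[0])
    downstream = {}
    for r in range(rows):
        for c in range(cols):
            best, best_slope = None, 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    # Elevation drop divided by distance (1 or sqrt(2) cell widths).
                    slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best, best_slope = (nr, nc), slope
            downstream[(r, c)] = best
    return downstream


def catchment(dem, outlet):
    """Return the set of cells that drain to `outlet`, including the outlet itself."""
    upstream = {}
    for cell, target in d8_downstream(dem).items():
        if target is not None:
            upstream.setdefault(target, []).append(cell)
    seen, queue = {outlet}, deque([outlet])
    while queue:  # breadth-first walk against the flow direction
        for up in upstream.get(queue.popleft(), []):
            if up not in seen:
                seen.add(up)
                queue.append(up)
    return seen


# Hypothetical toy DEM (elevations in m) sloping towards the lower-right corner.
dem = [
    [9, 8, 7],
    [8, 6, 5],
    [7, 5, 1],
]
cell_area_km2 = 0.09 * 0.09  # assumed 90 m x 90 m cells
cells = catchment(dem, outlet=(2, 2))  # outlet = assumed dam location
print(f"{len(cells)} cells, {len(cells) * cell_area_km2:.4f} km2")
```

In practice the dam point usually has to be snapped to the nearest high-accumulation cell first, because a small positional error can place the outlet on a hillslope rather than on the stream.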