the Creative Commons Attribution 4.0 License.
Annual river dataset in China: a new product with a 10 m spatial resolution from 2016 to 2023
Abstract. Rivers play important roles in ecological biodiversity, shipping trade and the carbon cycle. Owing to human disturbances and extreme climates in recent decades, river extents have altered frequently and dramatically. The development of sequential and fine-scale river extent datasets, which could offer strong data support for river protection, management and sustainable use, is urgently needed. A literature review revealed that annual river extent datasets with fine spatial resolutions are generally unavailable for China. To address this issue, the first Sentinel-derived annual China river extent dataset (CRED) from 2016 to 2023 was produced in our study. We first produced annual water maps by combining the Dynamic World (DW), ESRI global land cover (EGLC) data and the multiple index water detection rule (MIWDR). For the DW and MIWDR water time series, the mode algorithm, which calculates the most common values, was used to generate yearly water maps. Then, an object-based hierarchical decision tree based on geometric features and auxiliary datasets was developed to extract rivers from the water data. The results indicated that the overall accuracies (OAs) of the CRED were greater than 96.0 % from 2016 to 2023. The user accuracies (UAs), producer accuracies (PAs) and F1 scores of the rivers exceeded 95.3 %, 91.3 % and 93.7 %, respectively. A further data intercomparison indicated that our CRED shared similar patterns with the wetland map of East Asia (EA_Wetlands), China land use/cover change (CNLUCC) and China water covers (CWaC) datasets, with correlation coefficients (R) greater than 0.75. Moreover, our CRED outperformed the three datasets in terms of small river mapping and misclassification reduction. The area statistics indicated that the river area in China was 44,948.78 km² in 2023, which was mostly distributed in coastal provinces of China.
From 2016 to 2023, the river areas were characterized by an initial increase, followed by a decrease and then a slight increase. Spatially, the decreased rivers were located mainly in Southeast China, whereas the increased rivers were distributed mainly in Central China and Northeast China. In general, the CRED explicitly delineated river extents and dynamics in China, which could provide a good foundation for improving river ecology and management. The CRED dataset is publicly available at https://doi.org/10.5281/zenodo.13841910 (Peng et al., 2024a).
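The mode step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy grids and the tie rule (`tie_value=0`, i.e. non-water) are assumptions made here for illustration, since the abstract only says that the mode "calculates the most common values".

```python
from collections import Counter


def yearly_mode(series, tie_value=0):
    """Return the most common label in one pixel's time series.

    tie_value is a hypothetical choice: the source does not say how
    ties between water (1) and non-water (0) are resolved.
    """
    counts = Counter(series)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    return winners[0] if len(winners) == 1 else tie_value


def yearly_water_map(stack):
    """Collapse a list of 2-D grids (one per date) into one annual grid."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [
        [yearly_mode([grid[r][c] for grid in stack]) for c in range(cols)]
        for r in range(rows)
    ]


# Toy example with made-up values: three dates over a 2x2 area.
stack = [
    [[1, 0], [1, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 1]],
]
annual = yearly_water_map(stack)
```

With an even number of observations a pixel can be water exactly half of the time, so whichever tie rule is used decides the annual label for that pixel.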
Status: open (until 25 Jan 2025)
-
RC1: 'Comment on essd-2024-468', Yinhe Liu, 17 Dec 2024
See attached file
-
AC1: 'Reply on RC1', Kaifeng Peng, 16 Jan 2025
Thank you for taking the time and making such diligent efforts to review our manuscript. The valuable comments and constructive suggestions are very helpful for improving our paper, and we sincerely appreciate them. We have carefully studied the comments and revised the manuscript point by point.
To present the response to the comments with both text and figures, we organized the detailed response and modifications into a PDF document. This document has been uploaded as a supplement.
-
RC2: 'Comment on essd-2024-468', Anonymous Referee #2, 18 Dec 2024
This paper aims to produce a dataset of Chinese rivers spanning the period from 2016 to 2023 at an annual scale with a resolution of 10 m. However, the dataset lacks originality, falls short in scientific quality, and has limited potential for broader application, as I detail below:
Originality:
1) In remote sensing, water bodies are easier to classify than other land cover types. Water exhibits a significant spectral difference from other land cover types and has a relatively simple texture. Furthermore, it would be easy to screen rivers by simply using the length-to-width ratio of water bodies. However, the authors utilized publicly available 10 m land cover data and did not use an innovative scheme to extract rivers. They also fail to consider the network of rivers and the topographical features that influence the formation of rivers. The originality of the technical solution is limited.
2) As shown below, the authors did not acknowledge many relevant river datasets in the text. This makes me seriously concerned about how this paper is positioned relative to existing work.
Lin, P., Pan, M., Wood, E. F., Yamazaki, D., & Allen, G. H. (2021). A new vector-based global river network dataset accounting for variable drainage density. Scientific Data, 8(1), 28.
Nyberg, B., Sayre, R., & Luijendijk, E. (2024). Increasing seasonal variation in the extent of rivers and lakes from 1984 to 2022. Hydrology and Earth System Sciences, 28(7), 1653-1663.
Yan, D., Wang, K., Qin, T., Weng, B., Wang, H., Bi, W., et al. (2019). A data set of global river networks and corresponding water resources zones divisions. Scientific Data, 6(1), 219.
In addition, the work of Allen & Pavelsky (2018) has been cited, but how its data differ from the dataset in this article has not been explained.
Allen, G. H., & Pavelsky, T. M. (2018). Global extent of rivers and streams. Science, 361(6402), 585-588.
Scientific quality:
1) The data validation scheme carries considerable uncertainty. Rivers are typically characterized by a network morphology. However, the generated river data display a substantial number of river discontinuities. Rivers in adjacent years conspicuously "disappear" or "break off". Although the accuracy is about 95 % by visual interpretation, this figure is computed pixel by pixel and does not take the connectivity of rivers into account. Additionally, visual interpretation is a highly subjective process. If only samples at the center of the river were selected, the accuracy of the rivers would be overestimated.
2) The key data utilized in this paper (i.e., European Space Agency and Dynamic World) are 10 m land-cover classification products. They were not primarily designed for water classification. These products tend to underestimate the area of water bodies and, consequently, the extent of rivers.
3) Rivers have highly pronounced seasonal characteristics. During the summer flood season, rivers become wider, while in winter they may even disappear. The specific meaning and significance of annual-scale river data remain unclear.
4) "For areas with missing DW data, the EGLC and Sentinel-2 images were chosen as supplementary datasets, which were utilized to create annual water maps." This strategy is subjective. As depicted in Figure 2, the EGLC data only cover the period from 2017 to 2023. For the remaining years, 2015-2016, classification is carried out using self-produced land cover data. Why use the DW dataset as the primary data for river extraction? How is consistency across datasets ensured? The experimental scheme also has a certain degree of subjectivity.
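For readers following point 1) on validation, the sketch below shows how the four reported metrics (OA, UA, PA, F1) are typically derived from a two-class confusion matrix. The function name and the sample counts are hypothetical and are not taken from the manuscript; the manuscript's exact sampling design is not described on this page.

```python
def river_accuracy(tp, fp, fn, tn):
    """Pixel-based accuracy metrics for the river class.

    tp: river samples mapped as river
    fp: non-river samples mapped as river
    fn: river samples mapped as non-river
    tn: non-river samples mapped as non-river
    """
    total = tp + fp + fn + tn
    oa = (tp + tn) / total        # overall accuracy
    ua = tp / (tp + fp)           # user's accuracy (precision)
    pa = tp / (tp + fn)           # producer's accuracy (recall)
    f1 = 2 * ua * pa / (ua + pa)  # harmonic mean of UA and PA
    return oa, ua, pa, f1


# Hypothetical counts, for illustration only.
oa, ua, pa, f1 = river_accuracy(tp=90, fp=5, fn=10, tn=895)
```

Each sample contributes independently to these counts, so a map with broken river segments can still score highly. That is consistent with the concern that pixel-based accuracy does not capture connectivity.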
Application:
1) As illustrated in Fig. 8, the river data produced in this paper are significantly different from those of other products. In practical applications, it is relatively difficult for users to decide which one to use.
2) This is not a global product. It has relatively limited application potential compared with other global river products.
Citation: https://doi.org/10.5194/essd-2024-468-RC2
AC2: 'Reply on RC2', Kaifeng Peng, 16 Jan 2025
Thank you very much for taking the time to review our manuscript. We sincerely appreciate your diligent efforts to provide professional comments. Referring to your comments, we have revised our manuscript and provided detailed explanations for each comment. We acknowledge that our paper has limitations, but we believe that it is still a meaningful and interesting study. Our river algorithm is accurate, robust and effective, and it achieves lower misclassification and omission errors than using only the length-to-width ratio. Meanwhile, considerable manual edits were made to our river maps, which further improve their data quality. Their national scale, annual continuity and 10 m spatial resolution make our river maps valuable for practical applications.
To present the response to the comments with both text and figures, we organized the detailed response and modifications into a PDF document. This document has been uploaded as a supplement.
-
RC3: 'Comment on essd-2024-468', Anonymous Referee #3, 21 Jan 2025
General comments:
In this paper, the authors integrate existing water-body datasets with a Sentinel-2 water index method to obtain an annual river network dataset of Chinese rivers from 2016 to 2023. This product makes a certain contribution to understanding the annual spatiotemporal variation of rivers in China during 2016-2023. However, the originality of the research methods, the spatiotemporal resolution of the data products and the global application value are still relatively limited. I suggest that the authors consider doing more work to supplement and refine the experiments, making efforts to generate a global data product, and achieving a breakthrough in terms of higher spatiotemporal resolution and global scalability, after which the work could be expected to be published in ESSD.
Specific comments:
- The title does not reflect the specific characteristics and advantages of river products. Authors are advised to rewrite the title and abstract to highlight, for example, the spatiotemporal resolution of the product, the maximum possible range of time and space, or spatiotemporal variation. In addition, it is necessary to point out whether the river products in this paper are geometric features such as river length, width, area, and density, or hydrophysical features such as river elevation and discharge.
- Why do you use Sentinel-2 multispectral data? Why not use Landsat-8/9 or Sentinel-1 data? Can this method be generalized to other optical or SAR data sources? Does the spatial resolution of the original multispectral remote sensing data affect the results of river network extraction? In addition, the geometric morphology and temporal and spatial changes of rivers are highly correlated with topography and geomorphology, so why did the author not use DEM data?
- Sentinel-2 SWIR has a resolution of 20 meters. It is suggested to give the spatiotemporal resolution of the river network extraction results in this paper. In addition, how wide a river can the method identify?
- Are the river networks produced by the method continuous? The authors need to clarify whether the results are related to the time and season of the images. Because the existing public datasets compared by the authors were not all obtained at the same time, they represent surface conditions at different times. Which one is more representative?
- How should readers understand the advantages and limitations of different water indices in different river geomorphic regions?
- If the original public data set is not accurate enough, then the error will be propagated to the final river network results. Authors are advised to discuss this uncertainty.
- Annual average change over only seven years can hardly reflect longer-term basin-scale change. Previous studies have shown that the monthly variation of river networks is strongly affected by precipitation. I suggest the authors conduct a quantitative analysis of the driving factors of seasonal spatiotemporal variation of river network products in the discussion section.
- Is the river network extraction method proposed by the author highly correlated with sediment turbidity? The authors should provide a summary of different hydrogeomorphic conditions in the global areas to ensure that readers can understand the extensibility and feasibility of the method on a global scale.
- Is the time selection of Sentinel-2 images random or carefully selected? The time of the image of the study area will affect the shape of the river, for example, the river runoff is large in summer, and small rivers are easier to identify. The authors should analyze and explain the influence of observation time on the seasonal river extraction results in the discussion section.
- Although the author utilized EGLC and Sentinel-2 data in the experiment, which helped supplement the missing DW data, the analysis still predominantly relied on the DW dataset. However, it remains unclear how water body information, not included in the DW dataset or misclassified (such as being misclassified into other types), was addressed. Additionally, it is important to evaluate the errors introduced by these discrepancies in the original data. It would be beneficial if the author could add a discussion on the potential errors arising from the original data inaccuracies in the discussion section.
- For Figure 3, I want to know what the horizontal coordinates represent.
- In Section 3.1.1, I suggest that the mode algorithm be briefly explained, specifically clarifying when a value is assigned as 1 and when it is assigned as 0. Based on the current description in the paper, for the provided 4x4 schematic diagram, I would expect the value at the position in the fourth row and third column to be 1, but the image assigns it a value of 0.
- In line 181, the author mentions, “we constructed five weak rules”; however, the paper only presents four rules: “compactness > 2.3”, “rectangular fit < 0.5”, “length/width > 1.8”, and “compactness > 5.0”. Please verify this discrepancy. Additionally, I could not find a description of “Roundness” in the text, nor its role in the classification process. Could the author clarify its significance?
- The scatter plot fitting in Figure 7 appears to be incorrect. It should ideally be a straight line (y = x) for analyzing the correlation between the two datasets. The current fitting does not provide any meaningful explanation, and the R value obtained is based on these fitting results, which fail to demonstrate the correlation between the two datasets. I suggest you use other correlation coefficients to reflect the correct relationship.
- As shown in Figure 8, there are significantly more rivers in CWaC than in CRED in 2020. However, in the analysis on line 273, the author mentions that the basin area of CWaC is smaller than that of CRED. Could the author clarify how this difference can be explained?
- When evaluating accuracy, it would be helpful to include a focused image of a specific area within the text, rather than just broad images and some numerical results.
- The title for (h) in Figure S5 should be CRED-2023.
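The comment on Figure 7 above turns on the difference between correlation and agreement with the 1:1 line. The sketch below illustrates that difference using Lin's concordance correlation coefficient, which is only one possible choice and is an assumption here: the comment does not name a specific coefficient. The sample values are made up.

```python
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation: strength of any linear relationship."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def concordance_ccc(x, y):
    """Lin's concordance correlation: agreement with the line y = x."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Made-up river areas: dataset y reports exactly twice the area of x.
x = [10.0, 20.0, 30.0, 40.0]
y = [2 * v for v in x]
r = pearson_r(x, y)
ccc = concordance_ccc(x, y)
```

In this toy case the Pearson coefficient is 1 because the points lie on a straight line, while the concordance coefficient is 0.4 because that line is far from y = x.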
Citation: https://doi.org/10.5194/essd-2024-468-RC3
Data sets
The China river extent maps (CRED) from 2016 to 2023 Kaifeng Peng, Beibei Si, Weiguo Jiang, Meihong Ma, and Xuejun Wang https://doi.org/10.5281/zenodo.13841910
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
433 | 114 | 13 | 560 | 25 | 7 | 6 |