the Creative Commons Attribution 4.0 License.
A globally distributed dataset of coseismic landslide mapping via multi-source high-resolution remote sensing images
Abstract. Rapid and accurate landslide mapping following extreme triggering events is critical for emergency response, hazard prevention, and disaster management. Artificial intelligence-based approaches enable rapid landslide mapping, yet the lack of a high-resolution, globally distributed, event-based dataset poses a severe challenge in developing generalized machine learning models for landslide detection. This paper addresses this issue by designing a diverse coseismic landslide dataset, the Globally Distributed Coseismic Landslide Dataset (GDCLD), which includes multi-source remote sensing images (i.e., PlanetScope, Gaofen-6, Map World, and Unmanned Aerial Vehicle) encompassing various geographical and geological backgrounds worldwide. The GDCLD can be accessed through this link: https://doi.org/10.5281/zenodo.11369484 (Fang et al., 2024). Furthermore, we evaluate the potential of GDCLD by analyzing the mapping performance of the seven most popular semantic segmentation algorithms. We further validate the generalization capabilities of the dataset by deploying the models on three types of remote sensing images from four independent regions. We also assess the model on a rainfall-induced landslide dataset and achieve good results, demonstrating its applicability in landslide segmentation under other triggering factors. The results indicate the superiority of the proposed dataset in landslide detection, offering a robust mapping solution for rapid assessment in future extreme events that trigger landslides across the globe.
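The abstract refers to comparing the mapping performance of semantic segmentation algorithms but does not name the metrics used. As a rough illustration only, the sketch below computes pixel-level precision, recall, F1, and intersection over union (IoU), which are common choices for binary landslide masks. The function name `segmentation_scores` and the toy masks are hypothetical and are not taken from the manuscript.

```python
def segmentation_scores(pred, truth):
    """Pixel-level scores for two binary masks of equal shape (1 = landslide).

    Assumption: these four metrics are a common convention for segmentation;
    the manuscript may report a different set.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # landslide pixel correctly detected
            elif p and not t:
                fp += 1          # background pixel flagged as landslide
            elif t and not p:
                fn += 1          # landslide pixel missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy 2x3 masks (hypothetical values, for illustration only).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
scores = segmentation_scores(pred, truth)
print(scores)
```

IoU penalizes both missed and spurious pixels in a single number, which is why it is often preferred over overall accuracy when landslide pixels are a small fraction of each tile.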
Status: open (until 02 Sep 2024)
-
CC1: 'Comment on essd-2024-239', Kamal Rana, 28 Jul 2024
The manuscript presents a really nice set of datasets that I believe can be of tremendous value to the automated landslide mapping community, principally for rapid mapping and landslide damage assessment in the respective terrains and landscapes where landslides occur.
The GDCLD houses varied high-resolution imagery covering diverse geomorphological settings. This is valuable because different regions produce different spectral responses, which in turn affects the mapping capacity of deep learning models. Keeping the dataset diverse means that models can be trained on a variety of spectral responses of landslide scars, bringing them closer to generalizability. This facet is also seen in the experiments with rainfall-induced landslides where, despite the non-seismic trigger, the general signature of landslides was captured fairly well. Additionally, the single-source versus multi-source experiments are a testament to the diversity of the dataset presented in the manuscript. The latter delineated landslide scars more accurately than the former, which can be credited to the nature of the data within GDCLD.
Citation: https://doi.org/10.5194/essd-2024-239-CC1
-
AC1: 'Reply on CC1', Xuanmei Fan, 29 Jul 2024
Dear Kamal Rana,
Thank you very much for your praise and affirmation of our work.
With kind regards,
Xuanmei Fan on behalf of all the co-authors
Citation: https://doi.org/10.5194/essd-2024-239-AC1
-
RC1: 'Comment on essd-2024-239', Anonymous Referee #1, 30 Jul 2024
- Remove the link and citation for the GDCLD dataset from the abstract.
- The "four independent regions" are not clearly defined. The authors must explain them clearly.
- In the proposed work, have you collected data for rainfall-induced landslides, or are other parameters, such as topographical, anthropogenic, and geological parameters, also considered? If yes, mention it; if not, what results would be observed after evaluating these parameters?
- The abstract is too general; it must be rewritten to highlight the major objectives, the method adopted, and the results achieved.
- The introduction contains too many old citations; it must be updated with recent ones such as:
https://onlinelibrary.wiley.com/doi/abs/10.1002/ett.3998
https://link.springer.com/article/10.1007/s12145-022-00889-2
https://www.mdpi.com/2072-4292/16/6/992
https://www.nature.com/articles/s41597-023-02847-z
- Describe the organization of the paper at the end of the introduction. Also state the major objective of the paper, along with the steps taken to accomplish it.
- Lines 87 to 88: “Therefore, there is a pressing need for the development of a carefully curated and diverse dataset”. This sentence must be rewritten properly.
- Line 92: what kind of shortcomings were addressed? Have you evaluated the existing datasets with the proposed method? If yes, kindly share the results. If not, evaluate them and add a table highlighting the results.
- Section 2 must be titled “Related Work”.
- In Section 3, all subsections except “Data Collection” must be presented in tabular form rather than in running text.
- Section 3.2 describes the preprocessing of the dataset. A detailed figure must be added highlighting the steps involved or the operations performed on the training dataset.
- Mention the technical novelty of the paper other than creating the generalized dataset.
- How can the GDCLD and the trained models be integrated into current emergency response and disaster management systems? Are there any case studies or real-world applications that demonstrate their effectiveness?
- What are the challenges and considerations for scaling this approach to cover larger areas or more diverse regions? Are there any technological or infrastructural requirements?
- What are the potential future enhancements or expansions planned for the GDCLD? Are there any ongoing efforts to continuously update and improve the dataset?
- What specific characteristics of the GDCLD-trained model enable it to effectively map rainfall-induced landslides? Are there any limitations or areas for improvement in this application?
- How does the performance of the GDCLD-trained model compare to existing models and datasets in quantitative terms? Can you include specific performance metrics or visual comparisons?
- Which seven semantic segmentation algorithms were evaluated, and what were the criteria for their selection? How do these algorithms differ in their approach to landslide detection?
- Why were PlanetScope, Gaofen-6, Map World, and Unmanned Aerial Vehicles chosen as the primary sources of remote sensing images? Are there other potential sources that could be included in future iterations of the dataset?
- What specific criteria and methods were used to annotate the 1.39 billion landslide pixels? Were there any challenges or limitations encountered during the annotation process?
Citation: https://doi.org/10.5194/essd-2024-239-RC1
-
RC2: 'Comment on essd-2024-239', Anonymous Referee #2, 30 Aug 2024
This work rigorously follows the standard process for generating samples, resulting in tens of thousands of tiles for training and enhancing deep learning (DL) models. The authors have diligently compiled a benchmark dataset, showcasing their dedication. GDCLD stands out with its exceptionally high spatial resolutions ranging from 0.2m to 3m and its diverse spectral properties. This dataset not only excels in landslide mapping across various geographical contexts but also serves as a foundational dataset for transfer learning in landslide detection.
In Table 1, please provide the number of sites where landslides have occurred, along with the number of landslide polygons for each dataset.
In Table 2, please specify the total number of polygons obtained and confirm that the necessary rights for the use of the mentioned images have been obtained.
In Fig. 4, it's crucial to clarify the distinction between 'Label' and 'Ground Truth,' as they may initially appear similar.
A clear workflow outlining the entire dataset production process, along with details on personnel involvement, costs, and time invested, would offer valuable insights into the significant effort required to create such a comprehensive resource.
Lastly, the section titled '6.3 Model based on GDCLD performance on existing datasets' necessitates clarification to ensure its content is fully understood.
Citation: https://doi.org/10.5194/essd-2024-239-RC2
Data sets
GDCLD Chengyong Fang et al. https://doi.org/10.5281/zenodo.11369483
Viewed
- HTML: 345
- PDF: 125
- XML: 144
- Total: 614
- Supplement: 25
- BibTeX: 13
- EndNote: 12