the Creative Commons Attribution 4.0 License.
The 2024 Noto Peninsula earthquake building damage dataset: Multi-source visual assessment
Abstract. We present a building damage dataset following the 2024 Noto Peninsula Earthquake. The database was compiled from freely available, multi-source, remote sensing data, verified through opt-in crowd-sourced information. The dataset consists of geo-referenced vector polygons representing the pre-event building footprints of 140,208 structures. Each building was classified through visual inspection using pre-disaster and post-disaster vertical, oblique, survey, and verifiable news reporting imagery. Entries were validated using voluntary-submission data sourced through a web-API hosting a live version of the database. We calculate classification metrics for a subset of the database where ground survey photographs were provided by independent surveyors. An average F1-score of 0.94 suggests that the proposed assessment is consistent and of high quality. We aim to inform future disaster research such as disaster dynamics models, statistical and machine learning damage models, and logistics and evacuation studies. The present work describes the data collection process, damage assessment methodology, and rationale, including limitations encountered, the crowd-sourcing validation process, and the dataset structure.
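As an illustration of the kind of classification metric the abstract refers to, the minimal sketch below computes per-class F1-scores and their unweighted (macro) average from paired labels. The damage grade names, the sample labels, and the choice of macro averaging are assumptions made for illustration only; the preprint's actual grades, validation data, and averaging scheme are not stated on this page.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label that occurs in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct assignment for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels: ground-survey grade vs. visually assessed grade.
survey = ["none", "none", "partial", "partial", "collapsed", "collapsed"]
visual = ["none", "none", "partial", "collapsed", "collapsed", "collapsed"]

if __name__ == "__main__":
    print(per_class_f1(survey, visual))
    print(macro_f1(survey, visual))
```

Macro averaging weights every damage grade equally, which matters when severe grades are rare; a support-weighted average would instead be dominated by the most common grade.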
Status: final response (author comments only)
AC1: 'Correction to zenodo DOI', Ruben Vescovo, 08 Mar 2025
Dear Copernicus community members and referees,
I note that the link to the database in the manuscript (lines 227-228 and 248-249) points to an older version of the database. Please instead use the generic link https://doi.org/10.5281/zenodo.11055711, which always points to the newest version. We will adjust the link in the manuscript with the next version.
As a consequence, the reference entry on lines 346-348 will be updated to:
Vescovo, R., Adriano, B., Mas, E., Wiguna, S., Mizutani, A., Ho, C. Y., Morales, J., Dong, X., Ishii, S., Ezaki, Y., Wako, K., Tanaka, S., and Koshimura, S.: 2024 Noto Peninsula Earthquake Building Damage Visual Assessment, Tech. rep., Tohoku University, https://doi.org/10.5281/zenodo.11055711, 2025.
Citation: https://doi.org/10.5194/essd-2024-363-AC1
RC1: 'Comment on essd-2024-363', Anonymous Referee #1, 04 Apr 2025
General comments
The paper by Vescovo et al. is a very interesting read. The Noto Peninsula earthquake triggered multiple other hazards: a tsunami, landslides, and fires. The data is easy to download and easy to understand and could be beneficial for many. Overall, the article provides all the information needed, but some effort should still be put into the structure. The description of the hazards is too long, while the body of research on building damage datasets is a bit too thin. The methods section could benefit from rearranging the text into a more step-by-step structure (see specific comments).
I missed a short explanation of why the most probable hazard type(s) are not included, because this case study involves so many different hazard types, which makes it very interesting. It is likely that this was just too difficult, but please mention that, all the more so because in Figure 5 you make a distinction between earthquake/tsunami/landslide/fire. It could even be an idea to put the multi-hazard nature in the title, because it may be interesting for others too.
Specific comments
Introduction: The introduction starts with an extended description of the Noto Peninsula earthquake and the cascading hazards following the earthquake. A reference to the assessment of (specifically) building damage is only found in line 41. As this article is mainly a data paper, my suggestion is to keep the description of the hazard very brief. It would be more interesting to discuss:
- Broader context of building damage datasets (similar to the text following L41).
- What other building damage datasets already exist. Nepal 2015 is an interesting case because crowd-sourcing was also used, but in a different way, e.g. https://ieeexplore.ieee.org/abstract/document/7427206, and L’Aquila 2009 has also been well-documented. You also mention Haiti and Christchurch in the methods section; that material could be moved here.
- What is the benefit of specifically your dataset
Methods: The methods are well-described in the text. However, it could benefit from a better structure. For example, it is not clear what is the point of the `basis for the assessment` paragraph. It is also confusing that the section `Earthquake damage assessment` only contains a description of the case study area, while the `Tsunami damage assessment` also contains other information (like how mismatches are handled). Some suggestions:
- Add a short summary of the steps taken in the general methods paragraph (L68-69)
- Have a structure similar to (1) data sourcing; (2) building footprint selection (sub-paragraph for tsunami/earthquake) (3) Classification system (4) Verification / Feedback (crowd-sourced + experts)
- In the methods, it was also not clear who exactly did the initial damage classification. I assume it was your team?
Data description: Clear section, no comments.
Technical validation: The first paragraphs of the validation section (L156-170) would fit better in the discussion.
Tables and figures: The tables and figures are not ordered by first reference, but are scattered through the paper. Please order by first reference.
Discussion / Data availability: Clear sections, no comments.
Technical corrections
L3 `Vector polygons` is redundant. Please just use `polygons`.
L8 `Disaster dynamics models` unclear what this means
L11;L99;L187;L219: Formulas and mathematical symbols are not correctly exported / not readable.
L36 `in this capacity ...` Sentence doesn’t work
L42 `multi-source` is a key concept, but never formally explained
L56 What do you mean by a `baseline dataset`?
L59 As the KKC dataset is proprietary, do you know the license of the data and if it can be used for this purpose? If so, please state here, in the methods or in the data availability
L55-L66 there is no mention of the media photos that are mentioned in the abstract
Figure 2: what does multi-modal mean here?
L88 Tsunami-affected areas are introduced in the next section, why are they mentioned here?
L102 `Although this … vertical images`: this sentence is unclear, what do you mean?
Figure 8: Hard to read. It’s difficult to see the blue/red differences (and what do they mean? Are mismatches in red?)
L240 the statement here counters the statement made in L189. If your method is robust also in areas without multi-source input, why would it be better than just a single-source approach?
Citation: https://doi.org/10.5194/essd-2024-363-RC1
AC3: 'Reply on RC1', Ruben Vescovo, 11 Apr 2025
RC2: 'Comment on essd-2024-363', Anonymous Referee #2, 07 Apr 2025
Reviewer Report – “The 2024 Noto Peninsula earthquake building damage dataset: Multi-source visual assessment” by Vescovo et al.
The manuscript presents a high-quality and timely dataset developed in the aftermath of the 2024 Noto Peninsula earthquake. The database includes multi-source visual damage assessments for over 140,000 buildings and has been verified through crowd-sourced data submissions. Overall, the methodology is sound and well described, and the resource has clear potential for supporting a wide range of applications in disaster research, including damage modeling, multi-hazard analysis, and emergency response planning. I found the manuscript to be generally clear and well-organized, and I commend the authors for the transparency of the process and the open availability of the data.
However, I believe that some improvements could further strengthen the manuscript, particularly in the discussion and contextualization of the dataset within the broader landscape of similar efforts.
Major Suggestions
1. Contextualization within existing damage datasets
While the authors provide a useful comparison of classification schemes in Table 5, I encourage them to elaborate further on how their dataset compares with other similar open-access post-disaster damage datasets (e.g., in the Japanese context, the dataset compiled by the Ministry of Land, Infrastructure, Transport and Tourism (MLIT) for the 2011 Great East Japan Earthquake). Key aspects to highlight might include the attributes provided, accessibility for users, the level of detail in hazard-specific classification, and the potential for reuse in modeling or planning. A broader contextualization would help readers better appreciate the potential of this new dataset and identify synergies with existing efforts.
2. Multi-hazard characterization and enrichment of attributes
Given the multi-hazard nature of the Noto Peninsula event (earthquake, tsunami, landslides, fires), it would be highly valuable to incorporate hazard-specific indicators directly within the dataset. Although links to external hazard data are provided (p. 19), embedding this information at the building level (e.g., presence of tsunami inundation, fire impact, or landslide proximity) would significantly enhance usability, especially for non-Japanese users. For instance, the mentioned 2011 MLIT dataset includes information on water depth indicators and an indication of the concurrent presence of other hazards. At minimum, a discussion on the feasibility of such integration and its relevance for downstream applications would enrich the paper.
3. Discussion improvements
The discussion section could benefit from a more detailed reflection on the limitations and strengths of the dataset with respect to different user communities (e.g., researchers, emergency managers, machine learning practitioners). It may also be worth clarifying how the dataset could evolve in the future, for instance by incorporating additional damage explicative features, higher-resolution classifications, etc.
Minor revisions
• Some minor reordering of figures and tables based on their first appearance in the text would improve readability.
• In the current PDF version of the manuscript, mathematical symbols and equations do not render correctly and are therefore not visible to the reader.
• L27-29: This sentence seems grammatically incomplete, as it lacks a main verb.
• L41: A full stop should be inserted after 'building damage'.
• To improve the logical flow of the paragraph, it may be helpful to relocate L96-98 ('We provide [...] middle classes') immediately after L91 ('[...] Table 1').
• L94-95: this sentence is not clear, please consider rewriting.
• Figure 5: Increasing the line thickness of the building polygons would likely improve visual clarity.
• L153: A full stop should be inserted after 'original dataset'.
• L168: missing 'of' between 'environment' and 'both'
Overall, I strongly support the publication of this work, which is a significant contribution to the field. With some enhancements in contextualization and a clearer discussion of dataset enrichment opportunities, the manuscript will be even more valuable to a broad community of users.
Citation: https://doi.org/10.5194/essd-2024-363-RC2
AC2: 'Reply on RC2', Ruben Vescovo, 11 Apr 2025