the Creative Commons Attribution 4.0 License.
FASDD: An Open-access 100,000-level Flame and Smoke Detection Dataset for Deep Learning in Fire Detection
Abstract. Deep learning methods driven by in situ video and remote sensing images have been used in fire detection. The performance and generalization of fire detection models, however, are restricted by the limited number and modality of fire detection training datasets. A large-scale fire detection benchmark dataset covering complex and varied fire scenarios is urgently needed. This work constructs a 100,000-level Flame and Smoke Detection Dataset (FASDD) based on multi-source heterogeneous flame and smoke images. To the best of our knowledge, FASDD is currently the most versatile and comprehensive dataset for fire detection. It provides a challenging benchmark to drive the continuous evolution of fire detection models. Additionally, we formulate a unified workflow for preprocessing, annotation and quality control of fire samples. Out-of-the-box annotations are published in four different formats for training deep learning models. Deep learning models trained on FASDD demonstrate the potential value and challenges of our dataset in fire detection and localization. Extensive performance evaluations based on classical methods show that most of the models trained on FASDD achieve satisfactory fire detection results; in particular, YOLOv5x achieves nearly 80 % mAP@0.5 on heterogeneous images spanning the two domains of computer vision and remote sensing. An application to wildfire location demonstrates that deep learning models trained on our dataset can be used to recognize and monitor forest fires. Such models can be deployed simultaneously on watchtowers, drones and optical satellites to build a satellite-ground cooperative observation network, which can provide an important reference for large-scale fire suppression, victim escape, firefighter rescue and government decision-making. The dataset is available from the Science Data Bank website at https://doi.org/10.57760/sciencedb.j00104.00103 (Wang et al., 2022).
This preprint has been withdrawn.
Interactive discussion
Status: closed
RC1: 'Comment on essd-2022-394', Anonymous Referee #1, 19 Dec 2022
Summary
Fire has wide impacts on Earth systems and human society, and efficient fire detection could promote better understanding, modeling, and prevention of fires. Wang et al. synthesized a comprehensive dataset (FASDD) which covers multiple scenarios (e.g., computer vision and remote sensing) across different spatial scales. The paper is generally well written; however, I have the following major concerns, especially regarding the data generation, the data validation, and the usefulness of the data. Some expressions listed below are also insufficiently rigorous. The current version is not acceptable.
Major comments
(1) Data generation: it is well known that near infrared (NIR) and short-wave infrared (SWIR) are two commonly used bands for fire detection, yet the authors used only the visible bands via visual interpretation. A dataset based only on true colors can hardly convince me of its generality for fire detection at large spatial scales (e.g., some land surface features could show colors similar to fires in remote sensing images and thus mislead machine learning models). Also, the results in Table 2 show really low performance in detecting remote sensing fires, even with advanced machine learning models. With such low accuracy (e.g., relative to the MTBS or MODIS fire products listed below), how can the data help improve fire detection?
Sparks, A. M., Boschetti, L., Smith, A. M., Tinkham, W. T., Lannom, K. O., & Newingham, B. A. (2014). An accuracy assessment of the MTBS burned area product for shrub–steppe fires in the northern Great Basin, United States. International Journal of Wildland Fire, 24(1), 70-78.
Giglio, L., Boschetti, L., Roy, D. P., Humber, M. L., & Justice, C. O. (2018). The Collection 6 MODIS burned area mapping algorithm and product. Remote sensing of environment, 217, 72-85.
(2) Data generation: for active fire detection, middle infrared and thermal bands are also important but were ignored.
(3) Data generation: for fire detection, the authors used top-of-atmosphere reflectance instead of atmospherically corrected land surface reflectance, which is likely a problem. For example, if the smoke is white or grey, how can smoke be distinguished from clouds only through visual interpretation of true colors? Meanwhile, the CV fire images were presumably obtained at the land surface with a much higher spatial resolution (possibly sub-meter resolution in Fig. 4). Can machine learning models trained on such CV images be directly applied to large-scale remote sensing data without atmospheric correction and with different spatial resolutions?
(4) Data annotation: the “minimum bounding rectangle” was used to label the images. Commonly, fire detection classifies whether each pixel is burned or not, rather than drawing a rectangle (e.g., the MODIS fire products in Giglio et al. (2018) and Giglio et al. (2016)). Moreover, fire patch perimeters are generally not rectangular (Laurent et al., 2018), so a bounding rectangle could contain both burned and unburned pixels, right?
Giglio, L., Boschetti, L., Roy, D. P., Humber, M. L., & Justice, C. O. (2018). The Collection 6 MODIS burned area mapping algorithm and product. Remote sensing of environment, 217, 72-85.
Giglio, L., Schroeder, W., & Justice, C. O. (2016). The collection 6 MODIS active fire detection algorithm and fire products. Remote sensing of environment, 178, 31-41.
Laurent, P., Mouillot, F., Yue, C., Ciais, P., Moreno, M. V., & Nogueira, J. M. (2018). FRY, a global database of fire patch functional traits derived from space-borne burned area products. Scientific data, 5(1), 1-12.
(5) Data annotation: Lines 205-207: these descriptions are not rigorous. Are there any numbers or thresholds to quantify them?
(6) Data validation: the true-color-based visual interpretation could also introduce biases; it is therefore important to validate the generated data against other reliable fire datasets, such as the MTBS data.
(7) Lines 245-246: the dataset consists of 95,314 computer vision fire samples but only 5,773 remote sensing samples. Because of this imbalance, the model performance on FASDD (Table 2) depends mainly on the model performance on FASDD_CV. As for the limited number of remote sensing fire samples, most samples in each region are distributed within ten days of a specific year (Table 1). Can such a limited number of wildfires reflect all the seasonal and interannual changes in environmental conditions and fire dynamics, so that machine learning models can learn from enough data? To my knowledge, fire occurrence conditions change across seasons, and fire detectability could therefore also be affected.
(8) Evaluation in Section 4: the evaluation mainly shows the extent to which machine learning models can detect fires in the generated FASDD data; whether FASDD itself is reliable therefore remains unknown. The FASDD data need to be validated against other reliable fire products.
(9) There are many existing remote sensing fire products (e.g., MTBS, MODIS fire products) and CV fire datasets. I understand that combining the two kinds of data is the main difference between FASDD and other datasets; however, the authors did not show why combining them is important. Can combining these two kinds of datasets improve fire detection relative to existing fire detection algorithms (e.g., the methods behind the MODIS or MTBS fire products)? If not, why would people use such a complex dataset (with different spatial resolutions and without atmospheric correction)?
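As an editorial illustration of major comment (4), the following minimal sketch shows how a minimum bounding rectangle can enclose many non-fire pixels. The mask, the function names, and the diagonal toy patch are all hypothetical and are not taken from FASDD or from the authors' annotation pipeline.

```python
def min_bounding_rect(mask):
    """Return (row_min, col_min, row_max, col_max), inclusive, of the
    nonzero cells in a 2D list, or None if the mask has no fire pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


def fire_fraction_in_rect(mask):
    """Fraction of pixels inside the minimum bounding rectangle that are
    actually labeled as fire (nonzero) in the pixel mask."""
    rect = min_bounding_rect(mask)
    if rect is None:
        return 0.0
    r0, c0, r1, c1 = rect
    area = (r1 - r0 + 1) * (c1 - c0 + 1)
    fire_pixels = sum(1 for row in mask for value in row if value)
    return fire_pixels / area


# Hypothetical toy mask: a thin diagonal fire front (1 = fire, 0 = no fire).
toy_mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

print(min_bounding_rect(toy_mask))
print(fire_fraction_in_rect(toy_mask))
```

For an elongated or irregular patch like this one, most of the rectangle is background, which is the situation the referee describes.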
Minor comments
(1) Lines 210-211: how was the detection model trained on FASDD_CV data with different spatial resolutions and then applied to images with different spatial resolutions?
(2) Lines 278-279: the original images have different sizes, right? How were they unified to the same image size?
(3) Line 280: in addition to the epochs and batch size, what about other critical parameters such as the learning rate? Was any strategy used against overfitting?
(4) Evaluation metrics: beyond predicting boxes, the model performance on classifying pixels (burned or not) is also very important.
(5) Line 288: “IoU” is mentioned here for the first time but without its full name.
(6) Line 307: “which demonstrates the difficulty of fire detection on Remote Sensing images”: such results may imply that critical information for fire detection is missing (see my major comments 1 and 2), which results in the low detectability.
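For readers unfamiliar with the term raised in minor comment (5), IoU stands for intersection over union, and the 0.5 in mAP@0.5 is the IoU threshold at which a predicted box is matched to a ground-truth box. The sketch below is an editorial illustration only; the `(x_min, y_min, x_max, y_max)` box format, the function names, and the example boxes are assumptions, not the authors' evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as
    (x_min, y_min, x_max, y_max). Returns 0.0 when the union is empty."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True if the prediction overlaps the ground truth at or above the
    IoU threshold (0.5 corresponds to the mAP@0.5 setting)."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: the prediction is shifted by half the box width.
ground_truth = (0, 0, 10, 10)
prediction = (5, 0, 15, 10)
print(iou(prediction, ground_truth))
print(is_match(prediction, ground_truth))
```

Here the overlap is 50 square units and the union is 150, so the IoU is one third and the prediction would not count as a match at the 0.5 threshold.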
Citation: https://doi.org/10.5194/essd-2022-394-RC1
RC2: 'Comment on essd-2022-394', Anonymous Referee #2, 03 Jan 2023
The authors propose a new dataset for fire detection made by combining different types of images. They call it the Flame and Smoke Detection Dataset (FASDD) and claim that it is the largest set of data on fires. It combines images of both flames and smoke, which are elements associated with the presence of an existing fire. After reviewing previous datasets, the authors describe the procedure followed to collect the images and perform the labelling, which is partly done in an automated way by relying on open-source tools that use pre-trained DL models. I recognise the merit of putting together different sources and datasets into a unique and curated catalogue which is perhaps among the largest in terms of size. However, before recommending publication, I would like the following points to be addressed:
- What are the classes for the classification task? (maybe fire/non-fire/smoke/non-smoke ?)
- Are there cases where both flames and smoke are present?
- What are the criteria followed to sample negative instances?
- Why the format 1000x1000 for the satellite images? I would say it is not so practical for training due to memory issues.
- What is the usefulness of having both photos (CV) and satellite images (RS) in the same dataset? In principle it could seem a rather arbitrary choice with little or no advantage for model training.
Finally, I think that more attention should be given to discussing possible applications of this dataset. It is mentioned that in general the FASDD should help advance research in fire detection, but I would like the authors to make an extra effort in envisioning concrete use cases or ways in which the FASDD could improve existing applications and areas of research.
Citation: https://doi.org/10.5194/essd-2022-394-RC2
RC3: 'Comment on essd-2022-394', Anonymous Referee #3, 15 Jan 2023
The paper presents a dataset for the detection of fire and smoke in RGB images from different sources, including satellite data.
I'll start with the things I like:
I appreciate the fact that the dataset is open and accessible in an ML-ready format.
I appreciate the effort to generate annotations and control their quality and I like the related diagrams.
I like the fact that you merge different types of related datasets.
The paper is generally well-written and easy to follow.
Major Comments
My main concern is that the dataset is not well motivated, and neither the related work nor the experiments are yet sufficient to back up the motivation.
The paper is imbalanced with more emphasis and content given on choices that are trivial (though important), such as the annotation and quality control and less emphasis on the essential, i.e. the motivation and the evaluation.
If this aims to stand as a dataset with impact, the evaluation section needs to be reworked. I expect the authors to show the performance on each type of dataset and show the benefit of adding the different types of datasets. The fact that FASDD has a better performance than FASDD_CV and FASDD_RS is not enough by itself, without enough context about the evaluation setup that makes it a fair comparison.
Main Questions
- Why is this specific dataset needed? Why would anyone use this dataset instead of a domain-specific one? The answer shouldn't be that fires are increasing in frequency or intensity; it should probably come from the evaluation. Here is a sample question that you may answer: Does training on fires from graphics help in detecting real fires from drones? There are many more like that.
- Is the use of these specific satellites motivated for fire detection? These satellites have low temporal resolution and may not really be useful for fire monitoring.
- Are the optical bands of the satellite suitable for fire detection? Is there any reason why anyone would use them for that?
- I don't see why having a standard image size is a bad thing and I am not sure about the large image sizes in a machine learning dataset. A 1000x1000 image size is impractical.
- The addition of images with objects confused with smoke/fire is interesting. How did you choose the categories of such objects?
- Is the training split that you propose part of the dataset? How did you choose the split? Is it a random one? If so, I really think you should define it more carefully.
- I would like to see an analysis of the performance per domain (drones, RS, graphics, etc) and generally a more thorough evaluation.
- The experimental setup is not very well described. What preprocessing do you do before feeding the samples to the models? How do you handle the different image sizes?
- The training of the models is not detailed enough and, as the authors seem to conclude, it does not make for a fair comparison: "the parameter configuration and training strategy of the two models are only as consistent as possible but not entirely consistent, which may lead to a loss of comparability to some extent". It's ok to not train the perfect models, they also seem to be working quite well. The problem is that the comparison does not make much sense. And, importantly, the comparison criteria do not motivate the use of this specific dataset.
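The questions above about 1000x1000 images and mixed image sizes can be made concrete with a small editorial sketch. It computes the parameters of aspect-preserving ("letterbox") resizing, a common way to feed differently sized images to a detector with a fixed square input. This is an assumption for illustration: the function name and the 640-pixel target are hypothetical, and the preprint does not state that the authors used this scheme.

```python
def letterbox_params(width, height, target=640):
    """Compute aspect-preserving resize parameters for a square input.

    Returns (scale, (new_width, new_height), (pad_x, pad_y)), where the
    padding is the total number of pixels to add along each axis so the
    resized image fills a target x target canvas.
    """
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x, pad_y = target - new_w, target - new_h
    return scale, (new_w, new_h), (pad_x, pad_y)


# A 1000x1000 satellite tile shrinks uniformly and needs no padding.
print(letterbox_params(1000, 1000))
# A hypothetical 1920x1080 video frame keeps its aspect ratio and is padded vertically.
print(letterbox_params(1920, 1080))
```

Under this scheme a 1000x1000 tile is downsampled by a factor of 0.64, which also changes the ground sampling distance seen by the model; that interaction between resizing and spatial resolution is part of what the referees ask the authors to document.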
Suggestions for improvement
It would be helpful to have the dataset characteristics of section 5 in one table, rather than in text.
It would help me as a reader to see an obvious connection of Figure 1 and 3 in the text. Maybe use bold or italics in the text when a step of the process appears.
The references for fire/smoke detection with remote sensing products are insufficient.
I am not sure why the authors insist on using the term modalities as a way to differentiate between RGB images captured from different sources.
Different modalities usually mean different types of sensing. If you always have optical images, I think it is confusing, if not invalid, to call them different modalities. In the end you only have image data, no sound, speech, text, etc.
Minor comments
What are the 100,000 levels?
Line 32: "lead to human respiratory and cardiovascular diseases, and even endanger human life". This phrase suggests that human respiratory and cardiovascular diseases do not endanger human life. Reframe it please.
Line 33: "According to the 2022 report from International Association of Fire and Rescue Services, the frequency of global fire events has shown an increasing trend in the last decade". I read the report and did not see a clear mention of that trend. See Jones, Matthew W., et al. “Global and Regional Trends and Drivers of Fire Under Climate Change.” Reviews of Geophysics, vol. 60, no. 3, 2022, p. e2020RG000726. Wiley Online Library, https://doi.org/10.1029/2020RG000726.
Line 143: What does it mean for a dataset to be "generalized by deep learning researchers"?
I am not sure why you need sections 2.2 and 2.3. As a reader, I am more interested in your choices and motivation. Maybe you can think of a more logical structure for the text.
Citation: https://doi.org/10.5194/essd-2022-394-RC3
Data sets
FASDD: An Open-access 100,000-level Flame and Smoke Detection Dataset for Deep Learning in Fire Detection
Ming Wang, Liangcun Jiang, Peng Yue, Dayu Yu, Tianyu Tuo
https://www.scidb.cn/en/s/nqYfi2
Viewed
- HTML: 1,655
- PDF: 809
- XML: 82
- Total: 2,546
- Supplement: 104
- BibTeX: 70
- EndNote: 67