A global database of extreme fire events from satellite data from 2003 to 2022
Abstract. Extreme fires represent a significant threat due to their impacts on climate, ecosystems, and society. Despite their increasing prevalence, their definition remains controversial, as their characteristics vary depending on the region considered. In this article, we present the first version of the Extreme Fire Events (EFEs) database, a global dataset of extreme fires in NetCDF format containing monthly rasters on a regular grid with a spatial resolution of 0.25 degrees. The database includes the period 2003–2022, when a consistent satellite record was available. The basic unit of analysis is a cell-month event (CME), which represents aggregated fire activity within a grid cell during a given month. The identification of extreme events was based on two main satellite-derived variables: Burned Area (BA) from the European Space Agency’s FireCCI51 dataset and Fire Radiative Power (FRP) obtained from the NASA MCD14ML active fire product. Both variables were derived from the MODIS sensor. They were aggregated to the spatial and temporal scale defined for the CMEs and were used to compute standardised anomalies within each of the 55 defined regions, in order to account for spatial and seasonal differences in fire activity in the main global biomes. A CME was classified as an EFE when it presented anomalous values in both variables according to the established regional thresholds. Further, for each EFE, the database also indicates if any fire perimeter from the FRY v2.0 dataset identified as extreme by a certain attribute (fire size, duration, mean FRP, rate of spread and severity) overlapped with the CME. The database includes 19,951 EFEs between 2003 and 2022, with the highest frequency in 2010 and 2007, and the lowest in 2013. The dataset is intended for climate and Earth System modellers aiming to understand the causes and impacts of EFEs, as well as to forecast their occurrence under future scenarios or include them in broader Earth System models.
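As a reading aid, the classification rule summarised in the abstract can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes the standardised anomaly is a plain z-score computed per region, ignores the seasonal stratification, and uses a hypothetical threshold of 2.0, since the abstract does not state the regional thresholds. The function names are illustrative only.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscores(values):
    """Standardise values against their own mean and population std dev."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # No variability in this region: nothing can be anomalous.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def flag_extreme_cmes(ba, frp, region, z_thresh=2.0):
    """Flag cell-month events anomalous in BOTH burned area and FRP.

    ba, frp, region are equal-length sequences, one entry per CME.
    z_thresh is a hypothetical threshold (assumption, not from the paper).
    """
    by_region = defaultdict(list)
    for i, r in enumerate(region):
        by_region[r].append(i)

    flags = [False] * len(ba)
    for idx in by_region.values():
        z_ba = zscores([ba[i] for i in idx])
        z_frp = zscores([frp[i] for i in idx])
        for i, zb, zf in zip(idx, z_ba, z_frp):
            flags[i] = zb > z_thresh and zf > z_thresh
    return flags


# Synthetic example: region "A" has one CME extreme in both variables;
# region "B" has a burned-area spike but flat FRP, so nothing is flagged.
ba = [1.0] * 9 + [100.0] + [100.0] + [1.0] * 9
frp = [1.0] * 9 + [100.0] + [5.0] * 10
region = ["A"] * 10 + ["B"] * 10
flags = flag_extreme_cmes(ba, frp, region)
```

Because both conditions must hold, a CME with a large burned area but unremarkable FRP is not flagged, which is the behaviour the abstract describes.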
The manuscript presents a global extreme fire dataset from 2003 to 2022, derived from FireCCI51 and MCD14ML. The effort to aggregate global extreme fire data is valuable, but the current manuscript and dataset have problems in method and structure. My primary concerns are the validity of the dataset and the lack of validation. I therefore recommend major revision before further consideration.
Major concern:
Line-by-line comments:
L13: The phrase “as their characteristics vary depending on the region considered” may not be necessary, as it is somewhat ambiguous and does not introduce new information.
L14: The description “a global dataset of extreme fires in NetCDF format containing monthly rasters on a regular grid with a spatial resolution of 0.25 degrees” could be written more concisely as “a global, monthly, 0.25-degree extreme fire dataset in NetCDF format.”
L22: Please clarify why “main global biomes” is used here, given that the database claims global coverage.
L24: Please avoid using ambiguous words such as “certain”. It is better to list them explicitly.
L27: I have reservations about the dataset's value for forecasts and projections, given that it is not updated in (near) real-time.
L45: The term “unique” should be clarified: does it mean unified, comprehensive, or broadly accepted?
L47: It would be valuable to mention the fire’s impacts here.
L52-54: A region-specific threshold may also not fully resolve this issue, as it primarily highlights anomalies relative to the region's historical average rather than absolute physical extremes.
L54-56: Please clarify this sentence. Small fires in fuel-limited regions are typically not considered extreme, so the current phrasing is slightly confusing.
L62: This appears to be an incomplete sentence as it only includes landscapes.
L63: The text says “several” but provides only one reference. Please include additional relevant datasets, such as the Global Fire Atlas (ESSD) and FIRED (https://www.mdpi.com/2072-4292/12/21/3498).
L66-68: The necessity of mentioning the specific ESA project here is unclear.
L70: This expression is physically imprecise. Multiple distinct fire events can occur within a single month and a 0.25° cell, so aggregating them as a single 'event' introduces artifacts.
L81: By using a 0.25° and monthly resolution, a single, contiguous large-scale fire event is inevitably fractured into multiple CMEs. The authors should explicitly discuss this limitation and how it impacts the definition of an "event".
L82: Why use the word “roughly”? Geographic divisions should be precise and accurate.
L87-89: Providing only binary values rather than standardized anomalies restricts the dataset's usefulness for diverse modeling purposes.
L89: This implies the added value is merely calculating anomalies for the existing FRY v2.0 dataset.
L95: It is uncommon to have a one-sentence paragraph. Please merge or expand.
L107-110: While the authors justify stopping at 2022 due to the discontinuity between FireCCI51 and Sentinel-3 products, relying on a discontinued product restricts the database's reuse value. Given that alternative MODIS BA products are updated in near real-time, the 2003-2022 cutoff misses critical recent extremes.
L116-120: The issue of multiple overpasses (Terra and Aqua for day and night) seems completely unaddressed. Without proper deduplication, fire metrics are systematically biased and double-counted. See: https://www.nature.com/articles/s41559-024-02452-2
L122-138: The added value of including FRY v2.0 is questionable, as end-users could calculate anomalies more straightforwardly from the original FRY v2.0 data.
L176-182: The merge and divide procedure introduces arbitrary thresholds (e.g., why exactly 125,000 km²?), and its added value over a standard continental-biome approach is unclear.
Results (General): For a data descriptor paper, data validation is important. While comprehensive validation is challenging due to the scarcity of similar global products, cross-validating against other sources such as EM-DAT, media reports, and regional disaster databases is necessary to prove the dataset's reliability.
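To make the fragmentation concern raised for L70 and L81 concrete, the following minimal sketch counts how many cell-month events a single fire would be split into on a 0.25° monthly grid. The grid origin at (-90, -180), the key layout, and the three detections are hypothetical and chosen only for illustration; they are not taken from the manuscript or from any real fire record.

```python
import math
from datetime import date

CELL_DEG = 0.25  # grid resolution stated in the manuscript


def cme_key(lat, lon, day):
    """Map one detection to a (row, col, year, month) cell-month key.

    Assumes a regular grid with its origin at lat -90, lon -180.
    """
    row = math.floor((lat + 90.0) / CELL_DEG)
    col = math.floor((lon + 180.0) / CELL_DEG)
    return (row, col, day.year, day.month)


def count_cmes(detections):
    """Count the distinct cell-months touched by one fire's detections."""
    return len({cme_key(lat, lon, d) for lat, lon, d in detections})


# Hypothetical single fire spreading across a cell edge and a month boundary.
one_fire = [
    (10.10, 20.10, date(2020, 7, 30)),
    (10.10, 20.30, date(2020, 7, 31)),
    (10.30, 20.30, date(2020, 8, 1)),
]
n_cmes = count_cmes(one_fire)
```

In this synthetic case one contiguous fire lasting three days is represented by three separate CMEs, each of which would be tested for extremeness independently. The reverse also holds: unrelated fires falling in the same cell and month share one key and are merged.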