Global high-resolution forest disturbance type dataset

Wang, Li; Liu, Shidong; Song, Wanjuan; Zhang, Jie; Ding, Shengping

doi:10.5194/essd-2025-346

Preprints

https://doi.org/10.5194/essd-2025-346

26 Jun 2025

Status: this preprint is currently under review for the journal ESSD.

Abstract. Forests play a pivotal role in global carbon cycling and biodiversity conservation, yet they face increasing disturbances from both anthropogenic and natural drivers. This study presents the first high-resolution (30-m) global forest disturbance dataset (GFD) for 2000–2020, classifying 11 disturbance types by integrating Landsat-based Continuous Change Detection and Classification (CCDC) time-series analysis with spatial metrics and machine learning. A total of 57,000 expert-validated samples were used to train and validate a decision tree model, achieving an overall accuracy of 94.88 %. The results reveal that forestry disturbance (43.79±0.31 %), shifting cultivation (24.32±0.28 %), and forest fires (11.45±0.05 %) dominate global forest loss. There are regional differences in global forest disturbance, such as farmland expansion in South America and Africa, forest fires in northern regions, and shifting cultivation in tropical regions. Disturbed forests span 1,247.06±11.18 Mha, accounting for 30.87 % of the global forest area. Notably, 2.76 % of global forests were newly established, primarily in China, India, and Brazil. Spatial consistency analysis with existing datasets (R²=0.93) confirms the reliability of the GFD product. The GFD dataset advances our understanding of forest dynamics and underscores the need for targeted conservation strategies in an era of escalating environmental change. The 30 m resolution GFD generated by this study is openly available at https://doi.org/10.6084/m9.figshare.28465178 (Liu et al., 2025a).

Received: 12 Jun 2025 – Discussion started: 26 Jun 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

CC1:
'Comment on essd-2025-346', zhou yuming, 27 Jun 2025

The data reveal the types of global forest disturbances, which is helpful for global intervention and protection according to local conditions, and has guiding significance for forest prediction research at the national scale.

Citation: https://doi.org/10.5194/essd-2025-346-CC1
- AC1: 'Reply on CC1', Shidong Liu, 10 Sep 2025
  
  Thanks for recognizing this research. We are delighted to hear that you find the data on global forest disturbance types helpful for targeted intervention and protection efforts. We also greatly appreciate your note on its potential guiding significance for national-scale forest prediction research; this is indeed one of the key motivations behind creating a high-resolution, type-specific dataset like this.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC1
CC2:
'Comment on essd-2025-346', Shu Fu, 27 Jun 2025

Forests are a massive carbon reservoir, and assessing their carbon disturbances requires comprehensive and detailed identification of forest disturbance types. This research provides reliable data and technical support for evaluating local and even global forest carbon disturbances.

Citation: https://doi.org/10.5194/essd-2025-346-CC2
- AC2: 'Reply on CC2', Shidong Liu, 10 Sep 2025
  
  Thanks for your insightful comment. We are glad that our work on forest disturbance types resonates with your perspective on forest carbon assessment. Accurate data is fundamental to building effective climate mitigation strategies, and we are grateful for experts like yourself who recognize its value.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC2
CC3:
'Comment on essd-2025-346', Zhang Jimin, 27 Jun 2025

This pioneering study delivers the first 30-m resolution global forest disturbance dataset, classifying 11 types via Landsat time-series, spatial metrics, and machine learning. Achieving 94.88% accuracy with 57,000 samples, it quantifies dominant drivers like forestry and wildfires. This resource revolutionizes carbon accounting and conservation planning, offering unmatched precision for global environmental governance.

Citation: https://doi.org/10.5194/essd-2025-346-CC3
- AC3: 'Reply on CC3', Shidong Liu, 10 Sep 2025
  
  Thanks for recognizing this research. We greatly appreciate your recognition of the dataset's potential impact on carbon accounting and conservation planning.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC3
CC4:
'Comment on essd-2025-346', Chuhan Ji, 27 Jun 2025

The 30m GFD dataset is a landmark, with 94.88% accuracy in classifying 11 forest disturbances (2000–2020). It enhances understanding of dynamics, aids carbon/biodiversity studies, and supports targeted conservation via reliable, open-access data.

Citation: https://doi.org/10.5194/essd-2025-346-CC4
- AC4: 'Reply on CC4', Shidong Liu, 10 Sep 2025
  
  Thanks for recognizing this research. We truly appreciate your recognition of the dataset's value for carbon, biodiversity, and conservation applications.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC4
CC5:
'Comment on essd-2025-346', Tian Zhao, 27 Jun 2025

This study makes a significant contribution to global forest monitoring by providing the first high-resolution (30 m) dataset of forest disturbance types over two decades, with robust validation and open access. By integrating Landsat-based Continuous Change Detection and Classification (CCDC) with spatial metrics and decision tree algorithms, the authors developed a robust classification framework that achieved an overall accuracy of 94.88%. The resulting dataset not only improves our ability to distinguish among 11 major forest disturbance types at a fine scale, but also provides critical support for carbon accounting, biodiversity conservation, and sustainable land management under global environmental change.

Citation: https://doi.org/10.5194/essd-2025-346-CC5
- AC5: 'Reply on CC5', Shidong Liu, 10 Sep 2025
  
  Thanks for recognizing this research. We are truly encouraged by your recognition of the dataset’s value in supporting global forest monitoring, carbon accounting, biodiversity conservation, and sustainable land management. It’s particularly rewarding to know that the data can contribute to meaningful applications under global environmental change.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC5
RC1:
'Comment on essd-2025-346', Ian Evans, 14 Jul 2025

from Ian S. Evans, Durham University, U.K.
GENERAL
It is useful to have maps of world distribution of different forest disturbance types and the authors provide a higher-resolution data set. The results appear reliable and mark a significant contribution to the state of the world’s forests.
13 situations are recognised (Table 1); of these, two ‘weak disturbances’ (drought, pests & diseases) are not considered, so 11 are mapped in Fig.5, including ‘undisturbed’ and ‘newly added forest’. Excluding undisturbed and new leaves 9 types of disturbance, of which 7 are covered in Fig.3 and Table 4 (accuracy of flood and oil palm not being evaluated).
My criticisms are essentially confined to details of presentation and wording. It might be good to have more information on how the types are defined and how time series permit recognition of e.g. recovered areas. On line 133 the treatment of ‘vacant areas’ is worrying: more information on this is needed. How big an area is affected?
PRESENTATION DETAILS
101 ‘… America, South …’ comma missing
132 Insert space before ‘in’
140 ‘Considering …’ -this sentence is incomplete, it is just a clause introducing something that is missing.
156 ‘Meanwhile …’ is an incomplete sentence – just a clause. I suggest replacing with ‘Weak disturbances in forest cover are highly time-bound.’
160 Delete ‘are not considered’ - duplication.
166-169 This sentence misuses punctuation (: and ; are repeated). Please re-write.
Fig.3 There is space to replace codes with brief versions of types – e.g. ‘plantation’.
Table 4 118 should be 18
254-260 There should be a space before ±
260 Not a sentence: ‘both …’ implies ‘ …and’
268 ‘Western Siberian Plain in North America’ ??
Fig.4 As each small symbol represents an area (grid square?), the colours must represent density. So ha per … ? Up to 1500 ha, so per at least 39 x 39 km. Please state resolution of this & Fig.5.
Fig.5 ‘Forestry replanting’ is inconsistent with text (lines 284, 288 etc.), other Figures (8 & 9) and Table 1 (‘Forestry disturbance’) and does not seem to be used elsewhere.
Actually ‘forestry disturbance’ is an unfortunate term for just one type of forest disturbance – disturbance as a disturbance type. Could it be replaced throughout by ‘forestry replanting’, ‘recovered disturbance’ or just ‘replanted’?
284-293 Presumably Mha should be M ha
Fig. 6 caption Insert ‘Note varying scales.’
Figs. 6 & 7 maps show density, so it is necessary to state the unit area and (as these are rectangular) its dimensions.
Fig.7 What is the rationale of having red = most in a & b, but red= least in c and d? (For me, a, c and d might be considered ‘good’; b is ‘bad’.). Fig. 6 was consistent with red = most, so readers are going to be confused here.
328-330 This is misleading, based on the inclusion of ‘all’ in Fig.8b. That should be replotted excluding ‘All’. Consistency over the 5 types is thus much less, and the big deviation for Forest fire requires comment.
Figs. 8a and 9a-d: Note that all show highly skewed distributions of both x and y variables. Calculating regressions on logarithmic scales would reduce the influence of the few high values. It would, however, increase the leverage of the numerous small values: a choice has to be made based on the absolute error margins of small versus large values. Perhaps both types of regression should be presented.
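The trade-off raised in the last point can be illustrated with a minimal sketch. The values below are invented purely for illustration (they are not taken from Figs. 8 or 9), and `ols` is a hypothetical helper written for this example. It fits the same skewed data on linear and on log-log scales so that the two sets of statistics can be compared side by side.

```python
import math

def ols(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    r2 = (sxy * sxy) / (sxx * syy)
    return a, b, r2

# Hypothetical, highly skewed areas: many small values and two large ones.
x = [1, 2, 3, 5, 8, 12, 20, 40, 900, 1500]
y = [2, 1, 5, 4, 12, 9, 30, 25, 950, 1400]

# Fit on the original scale: the two large values dominate the sums of squares.
lin = ols(x, y)
# Fit on log10 scales: every order of magnitude carries comparable weight.
log = ols([math.log10(v) for v in x], [math.log10(v) for v in y])

print("linear  scale: slope=%.3f  R2=%.3f" % (lin[1], lin[2]))
print("log-log scale: slope=%.3f  R2=%.3f" % (log[1], log[2]))
```

With data of this shape, the linear-scale R² is driven almost entirely by the few large values, while the log-log fit exposes the scatter among the small ones. Reporting both, as suggested above, lets readers judge agreement at each end of the range.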

Citation: https://doi.org/10.5194/essd-2025-346-RC1
- AC6: 'Reply on RC1', Shidong Liu, 10 Sep 2025
  
  Thanks for your recognition and constructive suggestions, which make our manuscript stronger. In this version, we have further revised the manuscript and addressed all your concerns. Please see the detailed point-by-point responses below.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC6
RC2:
'Comment on essd-2025-346', Anonymous Referee #2, 28 Jul 2025

General Comment
The manuscript describes a 30-m global forest disturbance dataset (11 disturbance types) for the period 2000 to 2020. Disturbance is derived from Landsat data applying the CCDC analysis. My comments focus primarily on the accuracy assessment and area estimation components of the work. A primary area of improvement of the manuscript would be to provide a clear articulation of the sampling design used to collect the data for the accuracy assessment and area estimates. Without a clear description of the sampling design and additional details, it is impossible to ascertain how the accuracy and area estimates were obtained.
Specific Comments
1. Additional details related to the sampling design(s) must be provided. It is unclear how specifically the sample of 57,000 30-m sample units were selected for the model training and validation (Lines 72-73). In Section 2.3 (Lines 153-154), the text states that “8 individuals were uniquely responsible for selecting 8 types, while an additional 4 individuals conducted secondary confirmation of the selected samples.” This text seems to be referring to the process of labeling the sample units, not explaining how the sample units were selected. Did these individuals actually choose which sample units (30-m pixels) were in the sample? There is no mention of randomization in the protocol for selecting the sample, and no details presented of whether strata are present, even though later in the manuscript stratified estimation formulas for accuracy metrics are provided (equations 5 through 10). To compound the confusion, the Figure 3 confusion error matrix has a sample size of nearly 17,000, but there is no mention in the text of how these sample units were selected. Is it a random subset of the 57,000 mentioned earlier? Or are these 17,000 sample units entirely independent of the training sample of 57,000? It is essential to describe the sampling design(s) used to select these units.
2. I have several concerns with the Figure 3 confusion matrix, which I will list as separate items as follows:
a) It seems very unlikely that there would be no errors associated with the undisturbed class (which is class 0). Out of 3476 cases, there was never a commission error or omission error of “undisturbed” – this class is perfectly mapped. It seems implausible that disturbed and undisturbed forest can be classified with 100% accuracy.
b) The confusion matrix is presented in terms of sample counts, which is reasonable if the sampling design is simple random. Yet the authors present formulas for stratified sampling (equations 5-10). In particular, equation (5) indicates how the cell proportions should be estimated for a stratified sample, but that formula was not apparently used in the analysis. The confusion matrix should be presented in terms of the estimated pij (cell proportions) when stratified sampling is used. This concern links to comment 1 because the manuscript does not include description of the sampling design.
c) Row and column totals need to be added to Figure 3.
d) It is unclear what the vertical color bar on the right of the figure represents (range from 0 to 40,000). Please remove it or explain what it is.
e) I will identify this comment as purely an opinion, but I am skeptical that a disturbance product can achieve the high accuracies reported. Accurately mapping forest change is exceedingly difficult, so to achieve user’s and producer’s accuracies of over 95% for many of these disturbance types doesn’t seem possible. Comment 2a is related to this same concern.
3. The accuracy estimates reported on page 12 and in Table 4 are also a cause for concern.
a) It is evident that the stratified formulas were not used to estimate producer’s accuracy and overall accuracy. If the sampling design is stratified and the stratified formulas were not used, these estimates would be incorrect.
b) It seems very likely that the standard error values are incorrect for several cases. For example, if we had a simple random sample with a sample size of n=17,000 (approximate sample size of matrix in Figure 3), the standard error of overall accuracy would be SQRT[(0.95)*(0.05)/17000]=0.0033 or 0.33%. The reported standard error for overall accuracy is 2.86% from line 253, nearly 10 times larger. The standard errors for producer’s accuracy of Types 18 and 19 (approximately 20% and 15%) are suspiciously large given the large sample sizes for these two disturbance types. Lastly, the standard errors reported for user’s accuracy also don’t match what I calculate if I apply equation (7) to the data in Figure 3. Please re-check the standard error estimates to confirm.
c) Note that Type 18 in Table 4 is accidentally mis-labeled as “118”
4. Table 5 provides estimates of area of the GFD types. Presumably these are from the inadequately described “validation” sample. The Abstract should be revised to clarify what is presented in the manuscript. The manuscript’s title suggests that the primary purpose of the manuscript is to present a new global forest disturbance dataset (i.e., a map). But key parts of the manuscript are sample-based estimates of area, which would use the disturbance map for stratification, but the key data are then the sample and disturbance type labels provided by the expert interpreters. For area estimation the role of the new disturbance map is secondary. If the main objective of the manuscript is to provide this global dataset, then sample-based area estimates would seem unnecessary and only the accuracy results would be necessary to present. This same ambiguity is present in the Conclusion section. Lines 354-358 highlight the map of disturbance. But without any transition flagging the use of sample-based area estimation, Lines 358-360 then report sample-based estimates of area (Table 5) that use only the map through stratification of the sample. Please revise the Abstract and Conclusion to more clearly identify the purpose of the map and the role of sample-based area estimation to the objectives of the manuscript.
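The two calculations referred to in comments 2b and 3b can be sketched as follows. This is a minimal illustration, not the authors' or the referee's code. The function names are hypothetical, the 2-class counts and weights are invented, and the stratified estimator p_ij = W_i · n_ij / n_i is the standard form from the accuracy-assessment literature (e.g., Olofsson et al., 2014), assumed here to be what equation (5) of the manuscript expresses. Evaluating SQRT[(0.95)*(0.05)/17000] as written gives about 0.0017; the 0.0033 quoted in comment 3b matches 1.96 times that value, i.e. the half-width of a 95 % confidence interval. Either way, the result is far smaller than the reported 2.86 %.

```python
import math

def se_simple_random(p, n):
    """Standard error of a proportion p estimated from a simple random sample of n units."""
    return math.sqrt(p * (1.0 - p) / n)

def stratified_cell_proportions(counts, weights):
    """Estimate cell proportions p_ij = W_i * n_ij / n_i.

    counts[i][j]: number of sample units in map class (stratum) i
                  whose reference label is class j.
    weights[i]:   W_i, the proportion of the mapped area in class i.
    """
    p = []
    for row, w in zip(counts, weights):
        n_i = sum(row)  # sample size in stratum i
        p.append([w * n_ij / n_i for n_ij in row])
    return p

# Check from comment 3b: overall accuracy of about 0.95 with n of about 17,000.
se = se_simple_random(0.95, 17000)
print(f"standard error:       {se:.4f}")
print(f"95% margin of error:  {1.96 * se:.4f}")

# Hypothetical 2-class example (all numbers invented for illustration).
counts = [[90, 10],   # stratum 0: 90 correct, 10 confused with class 1
          [5, 95]]    # stratum 1: 5 confused with class 0, 95 correct
weights = [0.3, 0.7]  # assumed area proportions of the two map classes
p = stratified_cell_proportions(counts, weights)

# Overall accuracy is the sum of the diagonal of the estimated proportions.
overall = sum(p[k][k] for k in range(len(p)))
print(f"stratified overall accuracy: {overall:.3f}")
```

In the invented example the unweighted count-based accuracy would be 185/200 = 0.925, whereas the area-weighted estimate differs because the two strata cover unequal shares of the map. This is why the choice between count-based and stratified formulas matters once the sampling design is known.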
Technical Corrections:
1. Line 15: It is not clear whether the number to the right of the +/- is a standard error or a margin of error of a confidence interval. Please identify more clearly.
2. Lines 19-20: The comparison to other datasets provides an evaluation of “agreement” or “consistency” with these other datasets. These other datasets are not “truth”. Therefore, agreement with these other datasets does not “confirm reliability” or convey “accuracy” but instead quantifies consistency with other datasets.
3. Line 43: What specifically is “subjective” about field surveys? The implication is that remote sensing is not subjective, but that would seem dubious because surely there are subjective components of remote sensing as well.
4. Lines 19, 225, 226, 321: This is a minor point, but stating that a comparison is made with “existing” datasets is not meaningful because we obviously cannot make a comparison to a dataset that does not exist. It would be better to use “other datasets” instead of “existing datasets”.
5. Page 10, equation (10): This formula for the standard error of the estimated proportion of area does not match equation (10) presented in Olofsson et al. (2014).
6. Equation (11): The use of “UA” for the standard error will be confusing because it could easily be misread as an abbreviation for “User’s Accuracy” and “UA” provides no obvious connection to standard error.
7. Equation (12): Please check this formula. It seems unlikely that there would be a “bar” above qi (indicating a mean) in the denominator but no “bar” above pi in that same denominator.
8. Line 226: Because these other datasets are not “truth”, comparisons to these datasets would represent “agreement” and “disagreement”. Use of the term “errors” does not seem appropriate here.
9. Line 234: a space should be inserted between “s” and “p” in “asp”.
10. Table 4: state what the +/- columns represent.
11. Line 288: The meaning of “robust” precision is unclear. In what sense can precision be “robust”?
12. Line 326: “MEA” should be “MAE” and the word “only” should be removed from before “13%” as that is a value judgment of magnitude of the disagreement.
13. Panel b) of Figure 8 should be deleted or perhaps converted to a small table. The R^2, MAE, and RMSE values do not make much sense for only 6 data points and the “All Types” case must have a massive influence on the summary statistics.
14. Lines 338-340: “MEA” should be “MAE” in multiple places.
15. Throughout the manuscript the word “samples” is used incorrectly. The definition of “sample” in statistics is that it is a subset of n units selected from the population. The individual elements of that sample are “sample units”, in this case a sample unit is a 30-m pixel. Thus, there are not 57,000 “samples” (e.g., Line 13), but one “sample” consisting of 57,000 sample units or sample pixels. This incorrect use of “samples” should be corrected throughout the manuscript.
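The concern in technical correction 13, that an aggregated “All Types” point can dominate summary statistics computed from only six points, can be illustrated with a short sketch. The per-type areas below are invented (they are not the values plotted in Figure 8b), and `agreement_stats` is a hypothetical helper; the aggregate point is assumed to be the sum of the individual types.

```python
import math

def agreement_stats(x, y):
    """Return (R2, MAE, RMSE) between two paired lists of values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r2 = (sxy * sxy) / (sxx * syy)
    mae = sum(abs(a - b) for a, b in zip(x, y)) / n
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / n)
    return r2, mae, rmse

# Hypothetical areas (Mha) for five disturbance types in two datasets.
types_x = [10, 25, 40, 60, 80]
types_y = [18, 15, 55, 45, 95]

# Same data with an aggregated point appended (assumed to be the sum of the types).
all_x = types_x + [sum(types_x)]
all_y = types_y + [sum(types_y)]

r2_types, mae_types, rmse_types = agreement_stats(types_x, types_y)
r2_all, mae_all, rmse_all = agreement_stats(all_x, all_y)

print("five types only : R2=%.3f MAE=%.2f RMSE=%.2f" % (r2_types, mae_types, rmse_types))
print("with aggregate  : R2=%.3f MAE=%.2f RMSE=%.2f" % (r2_all, mae_all, rmse_all))
```

Because the aggregate lies far from the other points along both axes, it pulls R² upward without adding independent information. This is consistent with the suggestion to drop the panel or report the values in a small table instead.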

Citation: https://doi.org/10.5194/essd-2025-346-RC2
- AC7: 'Reply on RC2', Shidong Liu, 10 Sep 2025
  
  Thanks for your constructive suggestions, which make our manuscript stronger. In this version, we have further revised the manuscript and addressed all your concerns. Please see the detailed point-by-point responses below.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC7
RC3:
'Comment on essd-2025-346', Anonymous Referee #3, 28 Jul 2025
This manuscript integrated CCDC time series change detection method and CART model to map and identify forest disturbance type at a global scale. I have a few concerns on the validity and robustness of the proposed method.
The detection of disturbed forest pixels depends solely on the CCDC model. What is the accuracy of the change detection? I wonder whether the change detection error and/or modelling uncertainty of CCDC will affect the subsequent disturbance type mapping. CCDC assumes that the NDVI of all forest pixels can be described by a linear trend term and a harmonic seasonality term (Eq. 1). In fact, not all pixels will fit this assumed model perfectly, which would affect the fitting performance of CCDC and therefore the subsequent disturbance mapping. Besides, in addition to CCDC, there are many change detection models available, such as BEAST, BFAST, and LandTrendr. Why did the authors choose CCDC? Would applying a different model lead to the same change detection outcomes?
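The model-fit concern can be made concrete with a minimal sketch. It assumes the simplest form consistent with the description above, namely an intercept, a linear trend, and a single annual harmonic; the manuscript's Eq. 1 may include further terms. The NDVI series is synthetic, and the helper names `design_row`, `solve`, and `fit_harmonic` are hypothetical. This is not the CCDC implementation itself, only an ordinary least-squares fit of one segment.

```python
import math

def design_row(t):
    """Regressors at time t (in years): intercept, trend, annual cosine and sine."""
    w = 2.0 * math.pi * t
    return [1.0, t, math.cos(w), math.sin(w)]

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_harmonic(times, values):
    """Least-squares coefficients and RMSE of the trend-plus-harmonic model."""
    rows = [design_row(t) for t in times]
    p = len(rows[0])
    # Normal equations (A^T A) beta = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(p)]
    coef = solve(ata, aty)
    resid = [v - sum(c * x for c, x in zip(coef, r)) for r, v in zip(rows, values)]
    rmse = math.sqrt(sum(e * e for e in resid) / len(resid))
    return coef, rmse

# Synthetic NDVI sampled every 16 days for about three years.
times = [16.0 * i / 365.25 for i in range(69)]
clean = [0.6 + 0.2 * math.sin(2.0 * math.pi * t) for t in times]
# Same series with an abrupt drop of 0.4 after year 2 (an invented disturbance).
broken = [v - 0.4 if t > 2.0 else v for t, v in zip(times, clean)]

coef_clean, rmse_clean = fit_harmonic(times, clean)
coef_broken, rmse_broken = fit_harmonic(times, broken)
print("RMSE, stable pixel:    %.4f" % rmse_clean)
print("RMSE, disturbed pixel: %.4f" % rmse_broken)
```

A single segment reproduces the stable series almost exactly but leaves large residuals once an abrupt drop is present. The same would hold for any pixel whose phenology departs from the assumed form, which is why the fitting error and the resulting omission and commission rates are worth reporting separately from the type classification accuracy.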

It seems that the authors only considered and mapped abrupt forest loss, while gradual forest changes (e.g., forest degradation) and forest gain (e.g., natural regrowth and afforestation) were not mapped.

Line 80: “CRAT” should be “CART”

5. Does the undisturbed area indicate no change has occurred in the pixel? What’s the omission rate (or under-detection rate) of CCDC?

How does the proposed algorithm perform in Landsat images with dense and consistent cloud coverage (e.g., in tropical area)?
Citation: https://doi.org/10.5194/essd-2025-346-RC3
- AC8: 'Reply on RC3', Shidong Liu, 10 Sep 2025
  
  Thanks for your recognition and constructive suggestions, which make our manuscript stronger. In this version, we have further revised the manuscript and addressed all your concerns. Please see the detailed point-by-point responses below.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC8
EC1:
'Comment on essd-2025-346', Kaiguang Zhao, 08 Aug 2025

test

Citation: https://doi.org/10.5194/essd-2025-346-EC1
- AC10: 'Reply on EC1', Shidong Liu, 10 Sep 2025
  
  Thank you for your efforts in handling our manuscript and for the opportunity to revise it. We carefully addressed all the comments raised by RC #1-#4 and CC #1-#5 in our revision.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC10
RC4:
'Comment on essd-2025-346', Anonymous Referee #4, 13 Aug 2025

General comment: The paper is generally well written and provides a useful dataset for the community with reasonable methods. I have only some minor comments below:

Line 127: “We collected Google Global Landsat based CCDC segments (1999-2019). ” I don’t understand this. I think CCDC segments were created by the authors. What do you mean by ‘collected’?

Line 131: “deduplicated” is a very complex word. Try to rephrase.

Line 133: what do you mean by “vacant areas“?

Line 160: suggest removing drought and pest from Table 1 to avoid potential confusion. This is linked to line 181, which says there are 11 disturbance types. If you remove drought and pest from Table 1, then there remain 10 types. More confusing is that Fig 3 contains 10 types including the Code 0. Could you clarify this?

Line 177: 200 should be 2001?

-   Figure 1: Change the disturbance type code to its name?

-   Figure 4: what is the spatial resolution of this map? Better to show forest loss and forest expansion independently. If both forest loss and gain occur in the same grid cell of the map, how did you handle this? The legend shows only the area being ‘disturbed’ but it does not show the direction of forest cover change.

-   Section titles of 2.4.1 and 2.4.2 can be improved because readers don’t know what the ‘other types’ of forest disturbance are in contrast to those described in 2.4.1. In this sense, the section title of 2.4.1 can also be improved to enhance readability.

-   Section 2.3 describes how training samples are derived, no? This should be made clear in its title.

-   Could you show a map describing the spatial distribution of the training samples?

-   How are the samples of ‘shifting cultivation’ determined? This is critical because we know that this type is quite challenging.

-   Fig. 6 & Fig. 7 should also show its spatial resolution.

-   Fig 7: How do you determine the disturbed but not recovered forests? i.e., panel b, by using land cover map time series described in the Methods section?

Citation: https://doi.org/10.5194/essd-2025-346-RC4
- AC9: 'Reply on RC4', Shidong Liu, 10 Sep 2025
  
  Thanks for your recognition and constructive suggestions, which make our manuscript stronger. In this version, we have further revised the manuscript and addressed all your concerns. Please see the detailed point-by-point responses below.
  
  Citation: https://doi.org/10.5194/essd-2025-346-AC9

Data sets

Global forest main disturbance types between 2000 and 2020 Shidong Liu, Li Wang, Wanjuan Song https://doi.org/10.6084/m9.figshare.28465178

Viewed

Total article views: 2,937 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
2,135	716	86	2,937	53	75

Views and downloads (calculated since 26 Jun 2025)

Month	HTML	PDF	XML	Total
Jun 2025	187	32	10	229
Jul 2025	288	143	28	459
Aug 2025	239	82	3	324
Sep 2025	879	53	21	953
Oct 2025	102	61	5	168
Nov 2025	173	116	7	296
Dec 2025	93	115	7	215
Jan 2026	154	95	5	254
Feb 2026	20	19	0	39

Latest update: 07 Feb 2026

Short summary

The study introduces a high-resolution global forest disturbance dataset for 2000–2020. Key drivers of forest cover changes are forestry activities (43.79 %), shifting cultivation (24.32 %), and forest fires (11.45 %). Both human activities and natural events widely impact forest ecosystems, with regional differences across tropical, temperate, and boreal zones. Forest fires are concentrated in Siberia and North America, and shifting cultivation is dominant in tropical areas.

