the Creative Commons Attribution 4.0 License.
1 km annual forest cover and plant functional types dataset for China from 1980 to 2023
Abstract. High–spatial–resolution and long-term data on forest cover and plant functional types (PFTs) are crucial for elucidating the impacts of forest cover change on the national terrestrial carbon balance. Since the 1980s, China has undergone a substantial expansion in its forest area, primarily driven by large-scale national afforestation programmes. However, existing land cover products have often failed to capture this long-term increasing trend, leading to an underestimation of forest cover change–related ecological processes. Here, we developed a high-resolution (1 km), annual forest cover dataset for China during 1980–2023. This dataset integrates spatial constraints from multi-source remote sensing data with provincial-level statistics from China’s national forest inventories (NFIs), providing a consistent and spatially explicit record of forest dynamics over four decades. Building on this primary dataset, we further produced an annual PFT dataset that disaggregates total forest cover into eight distinct functional types, tailored for use in dynamic global vegetation models (DGVMs). Validation against independent data confirms the dataset’s ability to accurately represent historical forest recovery, achieving an overall accuracy (OA) of 95.3 ± 0.5 %, with classification accuracies for needleleaf and broadleaf forests ranging from 84.4 % to 92.0 %. To evaluate its applicability, we implemented the dataset within the Lund–Potsdam–Jena General Ecosystem Simulator (LPJ–GUESS). Compared to the widely used PFT dataset from the European Space Agency’s Land Cover Climate Change Initiative (ESA CCI), our product yields a markedly improved simulation of key biophysical and biogeochemical processes in China, enhancing the accuracy of evapotranspiration, leaf area index (LAI), and vegetation carbon flux by 49.4 %–77 %. 
With its high spatial resolution, long–term temporal coverage, and detailed forest-type classification, our dataset offers a robust foundation for assessing the ecological impacts of forest restoration and for constraining estimates of China’s forest carbon sink since 1980. The dataset is freely available at 10.5281/zenodo.16208012 (Liu et al., 2025).
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-475', Xue Liu, 15 Sep 2025
- RC2: 'Comment on essd-2025-475', Anonymous Referee #2, 22 Sep 2025
This study developed a 1 km resolution annual forest cover dataset (1980–2023) and an 8-class plant functional type (PFT) dataset (1981–2013) by integrating multi-source remote sensing data with National Forest Inventory (NFI) statistics. The authors aim to address two critical limitations of existing land cover products: their inability to capture China’s forest expansion since the 1980s and the inadequacy of plant functional type data for dynamic global vegetation models (DGVMs). The reliability and applicability of the datasets were validated through field surveys and simulations using the LPJ-GUESS model. The data could provide valuable support for national carbon balance assessments. However, I still have some concerns about the dataset: insufficient novelty in the core method (forest cover reconstruction) relative to existing literature, as well as potential issues with data quality due to incomplete validation and model simulations.
Major Comments:
The novelty of the forest cover reconstruction method is limited. Its core logic, integrating NFI statistics with multi-source land cover consistency, closely resembles the approach of Xia et al. (2023). The authors should explicitly acknowledge this overlap and clearly articulate what distinguishes their work, such as extending the temporal coverage to 2023, refining NDVI-based pixel selection, or improving provincial-scale constraint algorithms.
The field-scale validation is inadequate. The manuscript relies heavily on public datasets concentrated in 2011–2015, which does not adequately test the accuracy of the 1980–2010 and 2015–2023 portions of the dataset. Moreover, only broadleaf and needleleaf forest types were validated, whereas at least four types should be assessed given that the PFT dataset includes eight types. To strengthen credibility, independent and historical validation data are needed, for example historical records obtained through visual interpretation of archived Google Earth imagery.
The descriptions of the simulation experiments using LPJ-GUESS are not sufficiently clear. (1) The manuscript does not specify the temporal scale of the simulations (daily or monthly), which is important for interpreting variables such as GPP. (2) Parameter calibration for Chinese forest ecosystems is not described; a lack of calibration likely contributes to the questionable outputs. (3) Comparisons are limited to ESA CCI PFTs rather than including higher-resolution datasets (e.g., GLC_FCS), and the reported anomaly in southern China is presented without explanation.
Specific Comments:
Line 17: The phrase “to accurately represent historical forest recovery, achieving an overall accuracy (OA)…” is misleading. The reported OA reflects the classification accuracy of forest cover rather than the ability to “represent historical recovery.”
Line 22: The statement that the study “enhances the accuracy of evapotranspiration, leaf area index (LAI), and vegetation carbon flux by 49.4%-77%” is inaccurate. These values actually represent the proportion of China’s land area where simulation errors were reduced, rather than a direct measure of accuracy improvement. It is recommended to revise the sentence to: “reducing simulation errors for evapotranspiration, leaf area index (LAI), and vegetation carbon flux across 49.4%-77% of China’s land area…”
Line 43: Standardize the abbreviation for “land use and land cover change” to LUCC or LULC throughout the manuscript; remove inconsistent use of “LULC” when referring to change rather than land cover itself.
Line 46: Specify the time period for terrestrial carbon storage estimates (e.g., 1980–2020 or 2000–2020) to avoid ambiguity.
Line 99: When first mentioned, provide full names for abbreviations: GPP (Gross Primary Productivity), NEE (Net Ecosystem Exchange), and LAI (Leaf Area Index).
Line 130: Justify the use of the “nearest neighbor method” for resampling 30 m LULC data to 1 km. Explain why it was chosen over alternatives like bilinear interpolation, and acknowledge potential uncertainties (e.g., preserving extreme values).
Line 132: Clarify why Jeong’s NDVI dataset was selected instead of 1 km MODIS NDVI (2000–2023). The use of NDVI at 0.05° resolution has resulted in some coarse grid artifacts, particularly in the Qinghai-Tibet Plateau and northern Heilongjiang, in the data layer “China-ForestChange_Gain_Duration_1km_v1.0.tif.” The authors should explain why Jeong’s dataset was chosen despite these resolution limitations and discuss potential impacts on the results.
Line 155: Justify the use of ERA5-Land (0.1° resolution) instead of higher-resolution Chinese climate datasets (e.g., Peng et al., 2019). Explain how potential biases were addressed.
Line 182: Correct the inconsistency in Figure S1 (Chongqing subfigure), where the sum of broadleaf and needleleaf forest area does not equal total forest area pre-2002.
Line 228: Explicitly list the seven LULC products used for PFTs classification in Table S1 (dataset names, years, sources).
Line 301: Justify why a 10 km × 10 km window size was used for neighborhood analysis.
Line 327: Correct the claim that ERA5-Land daily temperature data are unavailable. Daily data are accessible via Google Earth Engine and Copernicus Climate Data Store.
Line 361: Using three years as a threshold to identify the stable forest is not reasonable for forest land due to its longer growth period (e.g., 5-10 years) compared with cropland.
Lines 320–325: Four plant functional types were further classified using GDD and temperature, following the method of Bonan et al. (2002). Clarify whether annual GDD or multi-year mean GDD was used in the final PFT classification. Additionally, the GDD data could be improved by using a temperature dataset with higher spatial and temporal resolution.
Lines 394–395: Cite the corresponding table or figure for the reported percentages (6.9% needleleaf, 2.7% broadleaf forest pixels) to support the claim.
Lines 425–430: Relocate the description of PFTs dataset availability from the “Reconstructed Forest Cover Dataset Description” section to a dedicated “Data Availability” section.
Figure 4: Remove “Bamboo and Economic forest” from the PFTs legend, as it is not included in the 8-class PFTs classification.
Figure 5: Correct numerical and unit errors: Verify the unrealistic forest area values for GLASS_GLC, CGLS, ESA_WorldCover and GLCNMO (~6,000,000 km²). Ensure the sum of needleleaf and broadleaf forest area matches total forest area for GLC_FCS, MODIS, and GLCNMO. Standardize units across figure and caption (e.g., 10⁶ ha or km²).
Figure 6: Explain the high forest loss on the Qinghai-Tibet Plateau in recent years. Discuss potential causes such as natural disturbances (permafrost thaw), human activities (infrastructure), or classification errors.
Data comparisons between this study and Xia et al. (2023) should also be included in the analysis or discussion section.
All figure captions use “Figure *,” but in the main text the figures are cited as “Fig. *.”
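The distinction raised in the comment on Line 22 can be illustrated with a minimal sketch. It assumes equal-area grid cells and uses invented toy error values; the function names are hypothetical and are not taken from the manuscript. It shows that the share of cells where the error decreases and the relative reduction of the mean absolute error are different quantities, and can even point in opposite directions.

```python
def share_of_cells_improved(err_ref, err_new):
    """Fraction of (equal-area) grid cells where the new absolute error is smaller."""
    improved = sum(1 for r, n in zip(err_ref, err_new) if abs(n) < abs(r))
    return improved / len(err_ref)


def relative_error_reduction(err_ref, err_new):
    """Relative reduction of the mean absolute error (negative means worse)."""
    mae_ref = sum(abs(r) for r in err_ref) / len(err_ref)
    mae_new = sum(abs(n) for n in err_new) / len(err_new)
    return (mae_ref - mae_new) / mae_ref


# Toy errors (hypothetical): three cells improve slightly, one cell gets much worse.
err_ref = [1.0, 1.0, 1.0, 1.0]
err_new = [0.9, 0.9, 0.9, 2.0]

share = share_of_cells_improved(err_ref, err_new)       # 3 of 4 cells improve
reduction = relative_error_reduction(err_ref, err_new)  # mean error increases

print(f"cells improved: {share:.0%}, MAE reduction: {reduction:.1%}")
```

In this toy case three quarters of the cells improve while the mean absolute error increases, which is why an area share should not be reported as an accuracy gain.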
Citation: https://doi.org/10.5194/essd-2025-475-RC2
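The question in the comment on lines 320–325 (annual versus multi-year mean GDD) can be made concrete with a short sketch. The 5 °C base temperature, the threshold of 20, and the three-value temperature series are assumptions chosen only for illustration; they are not values confirmed by the manuscript.

```python
def annual_gdd(daily_mean_temps_c, base_c=5.0):
    """Growing degree days: sum of daily mean temperature excess above the base.

    base_c = 5.0 is an assumed base temperature for illustration only.
    """
    return sum(max(t - base_c, 0.0) for t in daily_mean_temps_c)


def multi_year_mean_gdd(temps_by_year, base_c=5.0):
    """Mean of the annual GDD values over all years supplied."""
    yearly = [annual_gdd(temps, base_c) for temps in temps_by_year.values()]
    return sum(yearly) / len(yearly)


# Toy daily mean temperatures (hypothetical, three days per year for brevity).
temps_by_year = {
    1980: [4.0, 10.0, 15.0],  # annual GDD = 0 + 5 + 10
    1981: [6.0, 12.0, 20.0],  # annual GDD = 1 + 7 + 15
}

gdd_1980 = annual_gdd(temps_by_year[1980])
gdd_1981 = annual_gdd(temps_by_year[1981])
gdd_mean = multi_year_mean_gdd(temps_by_year)

# With a hypothetical class threshold of 20, the annual value for 1981 exceeds
# the threshold while the multi-year mean does not, so the two choices can
# assign the same pixel to different PFTs.
threshold = 20.0
print(gdd_1980 > threshold, gdd_1981 > threshold, gdd_mean > threshold)
```

Stating which of the two quantities was used removes this ambiguity for pixels near a class boundary.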
Data sets
1 km annual forest cover and plant functional types dataset for China from 1980 to 2023 Bo Liu, Boyan Li, Fulai Feng, Yangcan Bao, Jing Li, Qi Feng https://doi.org/10.5281/zenodo.16208012
This study used multi-source LULC products and provincial-level statistics to generate a long-term forest cover dataset, significantly improving the performance of LPJ-GUESS. Your long-term PFT product and its potential applications are very interesting. I like your detailed description of the method and your reporting of the accuracy and comparisons for this product.
Major:
You mentioned the definition of forest cover varies across different LULC products in Table 2. What’s your definition of “forest cover” in your study? The “forest consistency” is only an indicator for the forest cover detection, not for the definition in the product.
The maximum NDVI values were used to detect potential forest cover and PFTs. However, maximum NDVI is unstable because of cloud interference, especially in cloudy areas.
During the PFT classification, you aggregated a large number of distinct data layers from 1980 to 2013 into two or four consistency maps and used them to classify PFTs in each year. Is it reasonable to use information from 1980 to 2013 when detecting PFTs in 1980? For example, if a broadleaf forest turned into needleleaf forest in 1985, will it be recognized as needleleaf in 1980? I understand you have assumed that the relative spatial distribution of PFTs remains static, but I am not sure whether this is reasonable.
Check your total forest area for the GLC_FCS products and other datasets. I calculated the forest cover area from values 51 to 92 of the GLC_FCS dataset for China in 2010 using Google Earth Engine; the area was about 240,000,000 ha, not 350,000,000 ha. Moreover, Xia et al. also reported that the total forest area for GLC_FCS products in 2010 is about 220,000,000 ha (Reconstructing Long-Term Forest Cover in China by Fusing National Forest Inventory and 20 Land Use and Land Cover Data Sets, Figure 7). The accurate total forest cover area in your product is a significant advantage compared to other products, but your imprecise statistics make me doubtful.
You used GLC_FCS30 in 2010 and 2015 as the proxy for LULC maps in 2011 and 2013. However, annual LULC maps from 2000 onward are now available in the GLC_FCS dataset. You could update your LULC dataset to compare with other forest datasets more accurately. It is necessary to show the accuracy differences between your product and other datasets precisely.
In Fig. S1, the sum of the needleleaf forest area and the broadleaf forest area is quite different from the total forest area. Why does this happen? If there are many mixed forest areas, how do you distinguish needleleaf or broadleaf forest within the mixed forest? In addition, the legend entries “BF” and “NF” are not explained in the figure or caption.
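The area check described above for GLC_FCS can be sketched as follows. This is a simplified illustration, not the exact Google Earth Engine workflow: it assumes every pixel covers exactly 30 m × 30 m (true pixel area varies in a geographic projection), and the pixel counts are invented. Only the class range 51 to 92 comes from the comment above.

```python
PIXEL_SIZE_M = 30.0   # assumed nominal pixel size; real pixel area varies with latitude
M2_PER_HA = 10_000.0  # square metres per hectare


def forest_area_ha(pixel_counts, first_class=51, last_class=92):
    """Sum pixel counts for class values in [first_class, last_class] and convert to ha."""
    forest_pixels = sum(
        count for value, count in pixel_counts.items()
        if first_class <= value <= last_class
    )
    return forest_pixels * PIXEL_SIZE_M ** 2 / M2_PER_HA


# Hypothetical class-value -> pixel-count histogram (not real GLC_FCS statistics).
counts = {10: 5_000_000, 51: 1_000_000, 52: 2_000_000, 92: 500_000, 120: 3_000_000}

area = forest_area_ha(counts)  # 3,500,000 forest pixels * 0.09 ha per pixel
print(f"forest area: {area:,.0f} ha")
```

Reporting the class range, the pixel-area assumption, and the unit alongside each total would make discrepancies such as 240,000,000 ha versus 350,000,000 ha easier to trace.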
Minor:
You used different hyphens in the manuscript. In lines 8 and 23, both “long-term” and “long–term” appear; make the format consistent.
In line 85, “every several year” is confusing.
In the resampling process, you used the nearest neighbor method. Why not use the mode of the LULC types?
In Fig. 3, you marked the p-value as “P” instead of “p”. This is a mistake. Moreover, the high consistency between the NFI forest areas and the reconstructed forest areas is an obvious consequence of your method. Although this figure is carefully made, it adds little to the paper.
You displayed the CLCNMO dataset in Fig. 5 and line 548, but it cannot be found in Fig. S2, Table S1, or Table S2. Is it CLCNMO or GLCNMO? Check it carefully.
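The resampling question above (nearest neighbor versus the mode of the LULC types) can be illustrated with a minimal sketch. It uses a toy 3 × 3 block of fine pixels that falls inside one coarse cell and assumes that nearest neighbor picks the pixel at the block centre; the class codes are hypothetical, and the real 30 m to 1 km ratio is much larger.

```python
from collections import Counter


def nearest_neighbour(block):
    """Return the class of the fine pixel at the centre of the coarse cell."""
    n = len(block)
    return block[n // 2][n // 2]


def majority(block):
    """Return the most frequent class (mode) among all fine pixels in the cell."""
    counts = Counter(value for row in block for value in row)
    return counts.most_common(1)[0][0]


# Toy block: class 1 dominates, but the centre pixel happens to be class 2.
block = [
    [1, 1, 1],
    [1, 2, 1],
    [1, 1, 1],
]

nn_class = nearest_neighbour(block)  # centre pixel only
mode_class = majority(block)         # most common class in the block

print(nn_class, mode_class)
```

Here the nearest-neighbor result is the single class-2 pixel, while the mode returns the dominant class 1, which is the kind of difference the authors could quantify when justifying their choice.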