the Creative Commons Attribution 4.0 License.
FORMS: Forest Multiple Source height, wood volume, and biomass maps in France at 10 to 30 m resolution based on Sentinel-1, Sentinel-2, and GEDI data with a deep learning approach
Martin Schwartz
Philippe Ciais
Aurélien De Truchis
Jérôme Chave
Catherine Ottlé
Cedric Vega
Jean-Pierre Wigneron
Manuel Nicolas
Sami Jouaber
Siyu Liu
Martin Brandt
Ibrahim Fayad
Abstract. The contribution of forests to carbon storage and biodiversity conservation highlights the need for accurate forest height and biomass mapping and monitoring. In France, forests are managed mainly by private owners and divided into small stands, requiring 10 to 50 m spatial resolution data to be correctly separated. Further, 35 % of the French forest territory is covered by mountains and Mediterranean forests which are managed very extensively. In this work, we used a deep-learning model based on multi-stream remote sensing measurements (NASA’s GEDI LiDAR mission and ESA’s Copernicus Sentinel-1 and Sentinel-2 satellites) to create a 10 m resolution canopy height map of France for 2020 (FORMS-H). In a second step, with allometric equations fitted to the French National Forest Inventory (NFI) plot data, we created a 30 m resolution above-ground biomass density (AGBD) map (Mg ha⁻¹) of France (FORMS-B). Extensive validation was conducted. First, independent datasets from Airborne Laser Scanning (ALS) and NFI data from thousands of plots reveal a mean absolute error (MAE) of 2.94 m for FORMS-H, which outperforms existing canopy height models. Second, FORMS-B was validated using two independent forest inventory datasets from the Renecofor permanent forest plot network and from the GLORIE forest inventory with MAE of 59.6 Mg ha⁻¹ and 19.6 Mg ha⁻¹, respectively, providing greater performance than other AGBD products sampled over France. These results highlight the importance of coupling remote sensing technologies with recent advances in computer science to bring material insights to climate-efficient forest management policies. Additionally, our approach is based on open-access data having global coverage and a high spatial and temporal resolution, making the maps reproducible and easily scalable. FORMS products can be accessed from https://doi.org/10.5281/zenodo.7840108 (Schwartz et al., 2023).
Status: closed
RC1: 'Comment on essd-2023-196', Anonymous Referee #1, 10 Jul 2023
Summary
Accurate forest height and biomass mapping and monitoring are important for forest management and biodiversity conservation. Here, Schwartz et al. generated a 10 m resolution canopy height map for 2020 by integrating multi-source remote sensing datasets with a deep-learning model; subsequently, with allometric equations fitted to national forest inventory (NFI) data, they generated a 30 m resolution above-ground biomass density (AGBD) map. The fine resolution of 10 m to 30 m is essential for analyzing forests in France, which are typically divided into small stands. Through extensive validation against multi-source independent observational datasets, they showed greater performance for their generated dataset compared to existing canopy height and AGBD products. The manuscript is generally well organized and well written, and the research is important. Here, I list a few concerns regarding the manuscript.
Specific comments
1) Lines 104-105: does the randomness of the split affect the model performance? Generally, in computer science and Earth science, such a random split is repeated a few times, and the mean and standard deviation of the performance metrics across these experiments are used to show the model performance and its uncertainty.
2) Lines 108-109: “We used the 10 by 10 m pixel corresponding to the center of the GEDI footprint as a target”. It seems that the spatial resolution of the input data is 10 m, but the output GEDI data have a resolution of 25 m, so the sub-pixel (i.e., across 10 m grid cells) heterogeneity within each GEDI footprint should not be contained in the output data. Also, the NFI data have a resolution of 30 m; how, then, can one validate that the generated 10 m canopy height data capture the heterogeneity at that scale? Why not unify the input data to a common resolution (e.g., 30 m), matching GEDI, NFI, or the generated AGBD?
3) Lines 111-112: the loss function should be the loss on the validation dataset, right? Please clarify. To make the results reproducible, it would be better to list the learning rate used. In addition, were any strategies used to avoid overfitting of the trained models?
4) Lines 133-134: “we compared them to the mean of the FORMS-H height in each NFI plot's 30 m circular area”. For the final dataset, how did you upscale from 10 m to 30 m resolution? Did you first calculate the mean FORMS-H height within each 30 m grid cell and then calculate the corresponding AGBD or wood volume? Please clarify this in the main text. Then again, why not generate the canopy height data at 30 m resolution in the first step?
5) Lines 150-151: so you fitted FORMS-H height against NFI WVD for the final WVD data generation, right? Please clarify. Since NFI WVD and NFI AGBD have a linear relationship (i.e., they are linked through the volume-to-biomass ratio), the fitted non-linear relationships for AGBD-height and WVD-height should be the same except for a scaling factor, correct? It would be better to put the fitted WVD-height results in the supplementary material to help readers better understand the methods and interpret the results.
6) Fig. 4b: the generated canopy height in the third column does not seem to match the Google Maps image well. Is there a reason for that?
7) Fig. 6: why were those four regions selected for comparison? What is the model performance across the entire ALS dataset? Does the generated dataset still outperform other products?
8) Fig. 7: similar problem to my comment 6.
9) Fig. 8e-f: do the data points represent the AGBD data across all sites in GLORIE and Renecofor, or only sites falling into the selected regions of Fig. 8a-d? Please clarify this in the figure caption.
10) Fig. 9: what about the R² metric for the comparison?
11) Are there any potential limitations of the generated dataset, so that readers can further improve it?
12) The title and main text contain FORMS-H, FORMS-V and FORMS-B, but the abstract only shows the results for FORMS-H and FORMS-B. Briefly introducing the performance of FORMS-V is therefore needed to show the quality of the generated dataset.
Citation: https://doi.org/10.5194/essd-2023-196-RC1
AC1: 'Reply on RC1', Martin Schwartz, 06 Sep 2023
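The repeated random split raised in comment 1 of RC1 can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the least-squares line is a stand-in for the deep-learning model, and the names `mae` and `repeated_split_mae`, the 80/20 split, and the five repeats are hypothetical choices, not the authors' setup.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted values."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def repeated_split_mae(x, y, n_repeats=5, test_fraction=0.2, seed=0):
    """Repeat a random train/test split and collect the test MAE of each run.

    Returns the mean, the sample standard deviation, and the per-split scores.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    scores = []
    for _ in range(n_repeats):
        order = rng.permutation(n)  # a fresh random split on every repeat
        test_idx, train_idx = order[:n_test], order[n_test:]
        # Stand-in model (assumption): a straight line fitted by least squares.
        slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)
        pred = slope * x[test_idx] + intercept
        scores.append(mae(y[test_idx], pred))
    scores = np.array(scores)
    return float(scores.mean()), float(scores.std(ddof=1)), scores


# Synthetic example: a predictor in [0, 1] and a noisy "height" in metres.
data_rng = np.random.default_rng(42)
x = data_rng.uniform(0.0, 1.0, size=500)
y = 5.0 + 25.0 * x + data_rng.normal(0.0, 3.0, size=500)

mean_mae, std_mae, all_scores = repeated_split_mae(x, y)
print(f"MAE = {mean_mae:.2f} +/- {std_mae:.2f} m over {len(all_scores)} splits")
```

Reporting the spread alongside the mean shows how much of a difference between two products could be explained by the choice of split alone.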
RC2: 'Comment on essd-2023-196', Anonymous Referee #2, 12 Aug 2023
I am happy to read this manuscript from Schwartz et al. This manuscript developed canopy height, wood volume density, and above-ground biomass density data products for France using GEDI, Sentinel-1 and Sentinel-2 datasets with a deep learning approach. The data products were assessed with multiple independent datasets and showed improvements over previously developed data products. Overall, this study is well organized, and the data products are timely for supporting forest structure and carbon assessment in the face of climate change. I only have minor comments.
1. The abstract could also include FORMS-V, which is one of the three data products developed in this study.
2. Table 1. "In this study" should be "This study".
3. Figure 1. The rasterization of 25 m GEDI footprints to a 10 m grid may introduce some uncertainties into the model and data products. Is there a way to reduce these uncertainties, for example, by using more data quality control or only GEDI footprints in pure landscape types?
4. Figure 5. What is the reason that the R² values are so different in Figure 5a and Figure 5b?
Citation: https://doi.org/10.5194/essd-2023-196-RC2
AC2: 'Reply on RC2', Martin Schwartz, 06 Sep 2023
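The geometry behind comment 3 of RC2 can be made concrete with a small sketch. It assumes an idealized circular footprint of 25 m diameter on a regular 10 m grid and ignores geolocation error; the function names and the grid origin are hypothetical and do not come from the manuscript.

```python
import math


def cells_touched_by_footprint(cx, cy, diameter=25.0, cell=10.0):
    """Return the (i, j) indices of grid cells that overlap a circular footprint.

    Cell (i, j) spans [i*cell, (i+1)*cell] x [j*cell, (j+1)*cell].
    A cell overlaps the circle if its closest point to the footprint centre
    lies strictly inside the radius.
    """
    radius = diameter / 2.0
    i_min = math.floor((cx - radius) / cell)
    i_max = math.floor((cx + radius) / cell)
    j_min = math.floor((cy - radius) / cell)
    j_max = math.floor((cy + radius) / cell)
    touched = []
    for i in range(i_min, i_max + 1):
        for j in range(j_min, j_max + 1):
            # Clamp the centre onto the cell to get the cell's closest point.
            nx = min(max(cx, i * cell), (i + 1) * cell)
            ny = min(max(cy, j * cell), (j + 1) * cell)
            if math.hypot(nx - cx, ny - cy) < radius:
                touched.append((i, j))
    return touched


def centre_pixel_area_fraction(diameter=25.0, cell=10.0):
    """Share of the footprint area covered by one pixel centred on the footprint."""
    return (cell * cell) / (math.pi * (diameter / 2.0) ** 2)


# Footprint centred on a pixel centre versus on a grid corner.
centred = cells_touched_by_footprint(5.0, 5.0)
on_corner = cells_touched_by_footprint(0.0, 0.0)
print(len(centred), len(on_corner), round(centre_pixel_area_fraction(), 3))
```

Under these assumptions a single 10 m pixel covers roughly a fifth of the footprint area, and the footprint overlaps several neighbouring pixels, which is the source of the uncertainty the referee describes.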
Data sets
FORMS: FORest Multiple Sources height, biomass and wood volume maps, Martin Schwartz, https://doi.org/10.5281/zenodo.7840108