Multi-temporal high-resolution data products of ecosystem structure derived from country-wide airborne laser scanning surveys of the Netherlands
Abstract. Recent years have seen a rapid surge in the use of Light Detection and Ranging (LiDAR) technology for characterizing the structure of ecosystems. Even though repeated airborne laser scanning (ALS) surveys are increasingly available across several European countries, only a few studies have so far derived data products of ecosystem structure at a national scale, possibly due to a lack of free and open-source tools and the computational challenges involved in handling the large volumes of data. Nevertheless, high-resolution data products of ecosystem structure generated from multi-temporal country-wide ALS datasets are urgently needed if we are to integrate such information into biodiversity and ecosystem science. By employing a recently developed, open-source, high-throughput workflow (named “Laserfarm”), we processed around 70 TB of raw point clouds collected from four national ALS surveys of the Netherlands (AHN1–AHN4, 1996–2022). This resulted in ~59 GB of raster layers in GeoTIFF format as ready-to-use multi-temporal data products of ecosystem structure at a national extent. For each AHN dataset, we generated 25 LiDAR-derived vegetation metrics at 10 m spatial resolution, representing vegetation height, vegetation cover, and vegetation structural variability. The data enable an in-depth understanding of ecosystem structure at fine resolution across the Netherlands and provide opportunities for exploring ecosystem structural dynamics over time. To illustrate the utility of these data products, we present ecological use cases that monitor forest structural change and analyse differences in vegetation structure across various Natura 2000 habitat types, including dunes, marshes, grasslands, shrublands, and woodlands. The provided data products and the employed workflow can facilitate wide use and uptake of ecosystem structure information in biodiversity and carbon modelling, conservation science, and ecosystem management. The full data products and source code are publicly available on Zenodo (https://doi.org/10.5281/zenodo.13940846) (Shi and Kissling, 2024).
Status: final response (author comments only)
RC1: 'Comment on essd-2024-488', Fabian Fischer, 23 Jan 2025
The article presents a large dataset of 25 ecosystem structure metrics for the entire area of the Netherlands at 10 m resolution. The metrics are calculated from openly available airborne laser scanning (ALS) data and across multiple ALS campaigns, making it a highly valuable dataset to assess ecosystem dynamics. The article is well written, with great attention to detail – I particularly commend the tables and figures, which present a lot of information without feeling overly complex. The paper also follows a nice logical flow, from presenting the ALS pipeline, to detailed justifications for the derived metrics, to two sample case studies that have high relevance for applied research (changes in ecosystem structure + a comparison of structural indices across different ecosystem types). The resulting product provides insight into high-resolution ecosystem change over at least 15 years, from 2007 to 2022 (note that I am not counting the first ALS campaign, which likely does not reach minimum quality standards for ecological analysis). It thus has great potential to become a standard tool both for practitioners and researchers interested in ecosystems in the Netherlands. It could also become a nice reference dataset for similar efforts at larger scale.
I will now provide my main comments, with line-by-line comments following below.
Main comment 1: Robustness of pipeline to pulse density / leaf phenology
The main issue where the authors have not (yet) convinced me is whether comparisons in time between different AHN surveys are robust to acquisition properties. Large disturbances (e.g. clear cutting, logging) will obviously be visible and can be separated from noise, but how about growth or smaller disturbances? The authors mention at the end of the introduction that intercomparisons between different instruments and scanning conditions could lead to considerable errors, but then do not really provide means to assess the sensitivity of the products or to correct for some of these problems. This is important, because the surveys of AHN1 and even AHN2 differ strongly from AHN3-5 in terms of pulse density, and even the more recent high quality surveys may differ in leaf phenology. E.g., a scan in April will likely already measure some vegetation in early leaf-on conditions, and this might create bias compared to a scan in December.
Overall, I see three points that I would like the authors to address:
- Sensitivity analysis: The authors should provide a sensitivity analysis of the pipeline with regard to pulse density, e.g., how much do the inferred 25 metrics change when the pulse density of a high-quality scan (AHN4-5) is degraded to the levels of AHN1 or AHN2? This can then be used to provide clear bounds on what is ecologically interpretable. The simplest way to do this would be to a) use the site in case study 1, degrade the point cloud from AHN5 and check the robustness of the metrics, and b) repeat case study 2, with the exact same sites and metrics, but with AHN1 or AHN2. Is the analysis reproducible for ecosystems with low vegetation and small differences between them? I do not expect all metrics to be perfectly stable for this study to be published, but having an estimate of uncertainty for all of them would be key. Note that I would degrade pulse density (the number of shots) and not point density (the number of shots + returns) to accurately simulate a lower-quality scan; see the thinning sketch after this list.
- Leaf phenology: If possible, the authors should provide a timestamp/acquisition time for every 10 m x 10 m pixel. There are vector files with information on flight lines available for AHN2-5 online, so maybe these could be rasterized (a possible rasterization approach is sketched after this list)? Some of the layers are likely incomplete, but having information on flight time for the pixels down to the month would make the dataset very valuable.
- Quantitative guidelines: Finally, the paper should provide clear quantitative guidelines for researchers or practitioners on what kind of differences are interpretable ecologically. E.g., if I notice a change in height of 1m between 2017 and 2022, is this a real height change, or does this fall within the uncertainty due to laser instrumentation/DTM derivation? Vegetation growth can be slow (0.1-0.5 m/ year), so it is important to be able to separate noise/artefacts from actual change. Cf. also the second main comment.
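A minimal sketch of what such pulse-level thinning could look like in Python (this is not the authors' pipeline; it assumes that returns sharing a GPS timestamp belong to one emitted pulse, and the file names and keep fraction are placeholders):

```python
# Pulse-level thinning: drop whole pulses (all returns sharing a GPS
# timestamp), not individual returns, to simulate a lower pulse density.
import laspy
import numpy as np

def thin_pulses(in_path, out_path, keep_fraction, seed=42):
    las = laspy.read(in_path)
    # Assumption: returns from the same emitted pulse share a GPS timestamp.
    _, pulse_ids = np.unique(las.gps_time, return_inverse=True)
    n_pulses = pulse_ids.max() + 1
    rng = np.random.default_rng(seed)
    keep = rng.random(n_pulses) < keep_fraction  # Bernoulli thinning per pulse
    las.points = las.points[keep[pulse_ids]]     # keep *all* returns of kept pulses
    las.write(out_path)

# e.g. degrade a hypothetical AHN4 tile to roughly AHN2-like pulse density:
# thin_pulses("ahn4_tile.laz", "ahn4_thinned.laz", keep_fraction=0.1)
```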
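And a hedged sketch of the rasterization idea for acquisition times (the file, layer, and attribute names are assumptions; the actual AHN flight-line files may be structured differently):

```python
# Rasterize flight-line polygons carrying an acquisition-date attribute
# onto the same 10 m grid as the metric layers, burning YYYYMM per pixel.
import geopandas as gpd
import rasterio
from rasterio import features

lines = gpd.read_file("ahn4_flightlines.gpkg")       # hypothetical file
# Assumes "acq_date" is a datetime column; encode as e.g. 202102.
lines["yyyymm"] = lines["acq_date"].dt.year * 100 + lines["acq_date"].dt.month

with rasterio.open("hp95_ahn4.tif") as template:     # any 10 m metric layer
    meta = template.meta.copy()
meta.update(dtype="int32", nodata=0, count=1)

shapes = zip(lines.geometry, lines["yyyymm"].astype("int32"))
arr = features.rasterize(shapes, out_shape=(meta["height"], meta["width"]),
                         transform=meta["transform"], fill=0, dtype="int32")
with rasterio.open("ahn4_acq_month.tif", "w", **meta) as dst:
    dst.write(arr, 1)
```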
Main comment 2: Independent comparison and NA values
I carried out a quick comparison with our own laser scanning pipeline, which we have previously tested for robustness (Fischer et al., 2024, Methods in Ecology and Evolution; https://doi.org/10.1111/2041-210X.14416). I will call this the LAStools pipeline, and the authors' pipeline the Laserfarm pipeline. I have uploaded the products of this comparison to Zenodo so that the authors can compare them to their results: https://zenodo.org/records/14722001.
I only carried out a simple comparison: I compared the 95th percentile of canopy height at 10 m resolution from two CHMs produced via the LAStools pipeline (“chm_lspikefree.tif” and “chm_tin.tif”) with the Laserfarm hp95 product at the site in case study 1. The main findings are:
- NA values: The Laserfarm hp95 products sometimes seem to have a considerable amount of NA values. Unfortunately, these NA values are not consistent across AHN surveys, so comparisons of canopy height change may vary depending on how these NA values are dealt with across surveys: when I ignored these differences and simply calculated the difference in canopy height means at the study site, I got a height loss of -3.84 m from AHN2 to AHN3, a height increase of 0.45 m from AHN3 to AHN4, and then again a loss of -0.57 m from AHN4 to AHN5. When only considering areas that were not NA in any of the surveys, the height changes were as follows: -3.30 m, -0.03 m, -0.12 m, i.e., differences of up to 0.5 m. The authors should either try to remove the NA values consistently from all products (this should be possible, as shown by the products I derived with the LAStools pipeline), or provide a mask and a clear guideline on how to deal with them (a masking sketch follows this list). Cf. the attached pdf for a visualization of the NA values.
- Differences between pipelines: The products from the LAStools pipeline are CHMs, whereas hp95 is derived from the point cloud, so we expect some differences, but they should not be massive, as both assess top canopy height. However, I still found clear quantitative and qualitative differences. With chm_lspikefree.tif (the closest equivalent to top height) I found a height loss of -3.94 m from AHN2 to AHN3, a further loss of -0.36 m from AHN3 to AHN4, and then a minimal change of -0.05 m to AHN5. The pattern was similar with chm_tin.tif, but with smaller shifts: -2.51 m, -0.47 m, then a minimal loss of -0.02 m. The absolute differences between chm_tin.tif and chm_lspikefree.tif are expected (cf. Fischer et al. 2024), but both suggest a clear height loss from AHN2 to AHN3, a smaller but clear loss from AHN3 to AHN4, and then stabilization between AHN4 and AHN5. This is in contrast to the results described above for hp95, which show a different pattern.
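For illustration, a small sketch of the masking strategy suggested above: build a common valid-data mask across all surveys and compute mean height change only over shared pixels (file names are placeholders for the hp95 layers clipped to the study site, and all layers are assumed to share the same 10 m grid):

```python
import numpy as np
import rasterio

paths = ["hp95_ahn2.tif", "hp95_ahn3.tif", "hp95_ahn4.tif", "hp95_ahn5.tif"]
layers = []
for p in paths:
    with rasterio.open(p) as src:
        a = src.read(1).astype(float)
        if src.nodata is not None:
            a[a == src.nodata] = np.nan   # map the layer's nodata to NaN
        layers.append(a)

# Pixels that are valid in *every* survey.
valid = np.all([~np.isnan(a) for a in layers], axis=0)
for i in range(len(layers) - 1):
    delta = np.mean(layers[i + 1][valid] - layers[i][valid])
    print(f"{paths[i]} -> {paths[i + 1]}: {delta:+.2f} m")
```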
My main takeaway from this comparison is that change analysis in forest ecosystems is tricky and that average differences < 1.0 m may be hard to interpret/verify, unless robustness is explicitly quantified or pulse density included in the analysis. It would be good if the authors could reflect on this more clearly in the paper and supplement the current paper with a robustness test as described in point 1.
Line-by-line:
14: I appreciate that the authors calculated structural metrics also for AHN1, but it seems to me that the data from AHN1 cannot really be interpreted for ecological purposes due to their low and highly varying quality. The authors suggest as much in the text. I think a more accurate description here would be 2007-2022, and then mentioning in the text that, theoretically, AHN1 is also available, but should only be used with great caution.
27-29: Impressive data volumes
39: Great!
55: Minor comment, but I would disagree that laser scanning is more “direct” than field measurements. It involves scanners, processing point clouds or waveforms, making assumptions about their properties, then aggregating with indices, etc. Maybe another word would be appropriate: “more precise measurements”? “more robust”?
61: the type of the object (“Classification”) is not recorded by the laser sensor. This is post-processing and involves many assumptions. I would remove this here.
99: I agree that terrain modelling is the primary aim, but I would not call a DSM a terrain model, maybe remove and only mention DTMs
109-126: I fully agree. Particularly, for multi-temporal lidar, harmonization is key.
154-155: The Dutch campaigns are impressive, but two clear caveats need to be mentioned in the study: 1) Winter acquisitions are not ideal for vegetation structure assessments, because they will make it harder to detect small shrubs and lead to different estimates of canopy height between deciduous/broadleaf trees and evergreen/conifer trees. 2) December to April is actually quite a long period. I imagine that April is already springtime in the Netherlands, with many deciduous trees growing their first leaves and thus capturing more returns higher in the canopy. A repeat study with the first scan in December and the second scan in April might thus (wrongly) conclude that a forest has increased in canopy height/closure.
158-159: While I agree that it's great to provide AHN1 as well, maybe this should state that it is not suitable for vegetation assessments? Under the worst circumstances (1 point per 16 m²), analyses would be highly biased due to both the DTMs and the CHMs.
161: Is this point or pulse density? I think pulse density is generally preferable as it is less instrument dependent
184: Nice table, very clear, thanks!
189-207: This sounds like a great pipeline, but I am missing key information on robustness here. Just from reading this, it is not at all clear to me that the pipeline produces “consistent […] geospatial products from different ALS data.” A couple of points come to mind:
- Have you tested how the pipeline performs when laser scan pulse density (not point density) is degraded systematically, e.g., from 10 to 5 to 2 to 1?
- Maybe the description of the normalization is incomplete, but, as it is currently described, it seems non-standard and prone to large biases. The standard approach is to (robustly) ground-classify points, then interpolate a DTM, and use the inferred ground heights to normalize the point clouds (a sketch of this standard approach follows this list). Using the lowest points, by contrast, seems prone to introducing a lot of noise, plus: what happens to cells that do not have a lowest point? Are cells with a single point by default classified as 0 height?
- I do not immediately understand what an “infinite square cell” is, and I think this should be explained without needing to refer to another publication.
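A minimal sketch of the standard DTM-based normalization described above (not Laserfarm's implementation; the tile name is a placeholder, and a production pipeline would typically interpolate the DTM on a grid rather than at every point):

```python
# Standard height normalization: TIN-interpolate ground-classified returns
# and subtract the interpolated ground elevation from every point.
import laspy
import numpy as np
from scipy.interpolate import LinearNDInterpolator

las = laspy.read("tile.laz")                       # placeholder tile
x, y, z = np.asarray(las.x), np.asarray(las.y), np.asarray(las.z)
ground = np.asarray(las.classification) == 2       # ASPRS class 2 = ground
dtm = LinearNDInterpolator(np.column_stack([x[ground], y[ground]]), z[ground])
ground_elev = dtm(x, y)                            # NaN outside the ground TIN
height = z - ground_elev                           # height above ground per point
```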
214: So you use ground classification for normalization? I am confused now. Cf. my points above regarding normalization
210-212: In terms of robustness, relying only on provided ground classifications puts a lot of trust into pre-existing classifications and their comparability across campaigns. They may be good, but can you guarantee that the same algorithms were used for ground classification in AHN2 and AHN4? Have you assessed this?
231-250: This is a very technical description, but I think it is great, because it can serve as a guideline for other efforts to produce streamlined products like this
262: Why does the data volume increase under normalization? Usually it decreases, no? Do you store the data at different precision?
280-284: These are descriptions of normalization, etc., that have already been provided above. I would remove them, and move the outlier filter further upwards to the “Processing workflow” section.
275: The metrics are well-chosen in terms of ecological relevance and very nicely explained in the Table. This is very nice and well thought-through. However, I don’t see any reference to robustness. As the introduction of the article rightly points out, harmonizing data across different laser sensors and campaigns is a major challenge. To ensure robustness, structural metrics should also be selected by how robust they are with regard to pulse density. Metrics that I would suspect of being particularly sensitive to laser instrumentation are PPR, the Shannon index, and any of the densities of vegetation points below 3 m (below 1, 1-2, 2-3). Minor inaccuracies in ground classification could introduce huge biases/errors. Further candidates for sensitive metrics would be Kurtosis, Roughness and the 25th percentile.
301: I like how thoroughly the paper catalogues all metrics and procedures. The authors deserve a commendation for their attention to detail!
323-331: This is also great work. Only two comments:
- Could you use this to provide an assessment of how metrics change with pulse/point density?
- I don’t know how much work this would be, but I would generally prefer a pulse density raster (i.e., first- or last-return density), or ideally both pulse and point density rasters (see the sketch below). Pulse density gives a more direct impression of sampling effort and does not confound it with the increasing power of modern laser scanners and sensors (more returns per pulse).
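A sketch of what such density rasters could look like, using first returns as a proxy for pulses (the file name is a placeholder, and grid bounds are derived from the tile itself):

```python
# Bin first returns (≈ pulses) and all returns onto a 10 m grid.
import laspy
import numpy as np

las = laspy.read("tile.laz")
x, y = np.asarray(las.x), np.asarray(las.y)
res = 10.0
xedges = np.arange(x.min(), x.max() + res, res)
yedges = np.arange(y.min(), y.max() + res, res)

first = np.asarray(las.return_number) == 1
pulse_density, _, _ = np.histogram2d(x[first], y[first], bins=[xedges, yedges])
point_density, _, _ = np.histogram2d(x, y, bins=[xedges, yedges])
pulse_density /= res ** 2   # pulses per m2
point_density /= res ** 2   # returns per m2
# Note: histogram2d output is (x, y)-ordered; transpose before writing
# it out as a north-up raster.
```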
357: A key bit that is missing in validation is how the metrics respond to variation in pulse density. For pulse density, we would usually expect the opposite biases (i.e., much lower errors at the top, but much more near the ground, cf. Fischer et al. 2024, MEE).
406-408: Cf. my above concerns about PPR.
467: This is a nice figure and very intuitive, but I am missing a bit the quantitative assessment and how changes in growth compare to errors.
520: Also a nice analysis and a good use case.
513-518: Or this could be a methodological artefact. DTMs are not always super robust, and small shifts of 10-50 cm might already introduce a lot of noise into these assessments.
553-555: I am a big fan of the CV, but in my experience, it will also be negatively correlated with mean height (which is intuitive, since it is computed with a division by mean height). You could also consider standardizing it between 0 and 1, as described in Lobry et al. 2023, MEE (https://doi.org/10.1111/2041-210X.14197).
573-575: I think this is a bit of a shame, because most research in ecology would likely require the DTMs with the canopy metrics, and at the same resolution. Maybe you could consider providing a few DTM metrics in the future to complement the canopy metrics?
598-603: I disagree with this. Our own research found clear problems in canopy robustness down to 2 pulses per m2.
617: This is broadly correct, but there is also the lasR package from the lidR developers, which is intended for large-scale processing.
626-630: The Dutch terrain is certainly a very specific configuration, and this point should be highlighted more, as more complex terrain poses many challenges. Ground classification in mountainous settings, for example, is a huge challenge. Mountainous regions are not only where the terrain is difficult, but also where a lot of forest area remains.
RC2: 'Comment on essd-2024-488', Anonymous Referee #2, 25 Jan 2025
This manuscript, describing a series of rasterized ALS structure products for the Netherlands, is well designed and written. It still suffers from a few areas of unclarity and/or inaccuracy, which I outline below. I would also offer a few suggestions, such as:
(1) I suggest "lidar" as the consensus spelling and most accepted modern usage (see: https://lidarmag.com/wp-content/uploads/PDF/LiDARNewsMagazine_DeeringStoker-CasingOfLiDAR_Vol4No6.pdf);
(2) Please provide information on point cloud geolocation precision AND vertical precision.
(3) Please provide information on point cloud classification methods.
(4) Some of the references (e.g., Asner) are a bit out of date and do not engage with theoretical developments in the literature like:
a. Coops et al. 2021. Modelling lidar-derived estimates of forest attributes over space and time: A review of approaches and future trends. Remote Sensing of Environment 260, 112477. https://doi.org/10.1016/j.rse.2021.112477
b. Coverdale et al. 2023. Unravelling the relationship between plant diversity and vegetation structural complexity: A review and theoretical framework. Journal of Ecology 111(7): 1378-1395; and, especially for this purpose, those that engage with structural typologies based on ALS:
c. Atkins et al. 2023. Integrating forest structural diversity measurement into ecological research. Ecosphere. 14(9), e4633.
and
d. Hakkenberg and Goetz 2021. Climate mediates the relationship between plant biodiversity and forest structure across the contiguous United States. Global Ecology and Biogeography. 30:2245–2258. https://doi.org/10.1111/geb.13380
Beyond these, find some row-by-row comments and questions below:
278 - Is having the median and 50th percentile not redundant? Why have both?
278 - Why not Hp98 or Hp100? Are you not biasing results by having a low max Ht (5% below top)?
279 - Why 10000 m? Seems too large a value. Would 1000 m or even 100 m not also be appropriate for the Netherlands?
284 - Was there no independent DEM for verification of ground elevation? I really like the use of cadastral data for masking and verification.
287 - How accurate are these 1m vertical bins when the older imagery likely lacks that precision with so sparse a point density?
Table 3 - Why 0.5 m for Shannon's H and not 1 m? This metric would suffer even more from the vertical precision issues noted above. Also, please indicate the constraints of i, in other words, i = 1:? The standard answer is top height, which in this case is biased low at Hp95.
Fig. 3 - What is BR? Should be defined in caption.
338 - Prior to this section, could you provide some information how (what method used) for classification?
461 - This statement on Shannon's index is incorrect. H is NOT evenness. For that you could use a metric like Pielou's J. Shannon's H is a mix of evenness and richness, where richness is the number of height bins; H is highly correlated with Hp95 because the number of height bins is a primary parameter of the equation (the range of i, currently missing from the equation in Table 3).
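A worked example of the distinction (here p_i is the proportion of vegetation returns in height bin i, and S is the number of occupied bins; Pielou's J = H / ln(S) isolates evenness from richness):

```python
import numpy as np

def shannon_h(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))      # H = -sum(p_i * ln(p_i))

def pielou_j(p):
    s = np.count_nonzero(p)            # richness: number of occupied bins (S > 1)
    return shannon_h(p) / np.log(s)    # J = H / ln(S)

short = np.array([0.5, 0.5])                  # 2 occupied bins, perfectly even
tall = np.array([0.25, 0.25, 0.25, 0.25])     # 4 occupied bins, perfectly even
print(shannon_h(short), shannon_h(tall))      # 0.69 vs 1.39: H rises with richness
print(pielou_j(short), pielou_j(tall))        # both 1.0: evenness is unchanged
```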
Citation: https://doi.org/10.5194/essd-2024-488-RC2
Data sets
Multi-temporal high-resolution data products of ecosystem structure derived from country-wide airborne laser scanning surveys of the Netherlands Yifang Shi and W. Daniel Kissling https://doi.org/10.5281/zenodo.13940846
Model code and software
Laserfarm W. Daniel Kissling, Yifang Shi, Zsófia Koma, Christiaan Meijer, Ou Ku, Francesco Nattino, Arie C. Seijmonsbergen, and Meiert W. Grootes https://github.com/eEcoLiDAR/Laserfarm
Interactive computing environment
Jupyter Notebooks for processing AHN dataset Yifang Shi https://github.com/ShiYifang/AHN