PEATGRIDS: Mapping thickness and carbon stock of global peatlands via digital soil mapping

Widyastuti, Marliana Tri; Minasny, Budiman; Padarian, José; Maggi, Federico; Aitkenhead, Matt; Beucher, Amélie; Connolly, John; Fiantis, Dian; Kidd, Darren; Ma, Yuxin; Macfarlane, Fraser; Robb, Ciaran; Rudiyanto; Setiawan, Budi Indra; Taufik, Muh

doi:10.5194/essd-2024-333

27 Aug 2024

Status: this preprint has been withdrawn by the authors.

Abstract. Peatlands, which cover only 3 to 5 percent of the global land area, can store up to twice as much carbon as the world’s forests. Although peatlands are recognised for their significant role in the global carbon cycle, determining their global extent and carbon stock remains challenging. Referring to the UNEP Global Peatland Map, here we present PEATGRIDS, a data product containing global maps of peat thickness and carbon stock created using a digital soil mapping approach. We compiled over 25,000 observations of peatland thickness, bulk density (BD) and carbon content (CC) globally. Using the Random Forest (RF) algorithm, we estimated peat thickness and peat BD and CC at ~1 km resolution at multiple depths (0–2 m) globally. The estimates were generated using 19 land surface covariates from digital maps and remote sensing images of land use, soil characteristics, topographical features, and climate parameters. The RF models for peat thickness were trained on 25,200 points grouped into six geographic regions. Validation of the peat thickness estimates showed good performance, with the coefficient of determination (R²) ranging from 0.15 to 0.72. The models for peat BD and CC followed the same architecture and were trained on 17,000 and 7,000 points, respectively. Overall, the BD and CC models performed well and consistently across soil layers, with average R² values of 0.61 for BD and 0.48 for CC. Based on the estimated peat thickness, BD and CC, the carbon stock of global peatlands was estimated to be 1,029 Pg C for a peat-dominated area of 6.57 million km². PEATGRIDS is made available at https://doi.org/10.5281/zenodo.12559239 (Widyastuti et al., 2024) to support further analyses and modelling of peatlands across the globe.

Received: 01 Aug 2024 – Discussion started: 27 Aug 2024

Interactive discussion

Status: closed

RC1:
'Comment on essd-2024-333', Anonymous Referee #1, 02 Oct 2024
This manuscript aims to address the uncertainty in global peatland extent and peat carbon stocks by developing a global model of peat thickness and carbon density. The method that is applied and the resulting maps will be useful to peatland ecologists and soil scientists around the globe, and highlight the important role that peatlands play as carbon reservoirs.
Interestingly, the authors’ overall conclusion that peat soils globally store more than 1,000 Pg C is much larger than previous estimates. However, given the discoveries of new peatlands in recent years, particularly in the tropics, which indicates a past general trend to underestimate peatlands, this much larger estimate could well be more accurate. This manuscript will therefore make a valuable contribution to a growing body of literature that tries to pinpoint and understand the stocks and flows of peat carbon in the face of global environmental change.
General comments:
Overall, the application of digital soil mapping to estimate global carbon stocks is a useful approach that hasn’t been tried before on this scale, as far as I know. The authors have collected an impressive dataset of peat thickness, BD and CC data and used this to train random forest models for six regions. This methodology appears sound and well applied, given the obvious limitations to modelling on such a large scale.
In terms of outcomes, it seems that the authors’ mean values correspond well with previous work, particularly in the 5 case studies, but that their model often struggles to capture the regional variability that most of these smaller-scale maps show. Essentially, the authors’ model moves closer towards the average at the expense of regional variation. This is understandable and expected, given the use of an RF model with sometimes limited training data from certain regions. However, this also means that it would be good to stress the global nature of this map. The results should be treated with more caution at regional scales by end-users, especially if more local maps are also available.
Additionally, I had a couple of general questions:
It was unclear to me whether some of the training data taken from external sources are themselves modelled. Could you clarify whether all of the input data are direct field measurements, or whether they include modelled data from previously created maps? In the latter case, it would be good to make this more explicit, for example in Table S1: how much is true field data, how much is modelled, and in which areas? Also, what does this mean in terms of error propagation from previous sources into this new model? How does this affect your model’s uncertainty?

It is not entirely clear to me what criteria were used to choose the predictor variables. For example, although the list of predictor variables includes a wetness indicator such as the Topographic Wetness Index, I wondered whether it would be useful to explicitly add a variable related to the seasonality of wetness/inundation. Some peatland areas, particularly along river valley systems in the tropics, might experience seasonal droughts that could drive decomposition and therefore influence thickness, BD and/or CC. (Conversely, they might experience extreme flooding during the rainy season.) Such seasonal variability in wetness/inundation is currently not captured by your list of predictor variables, as both your WorldClim and PALSAR variables use yearly averages only. Perhaps it could be useful to include precipitation seasonality, precipitation of the driest month, or a similar variable from WorldClim?

The model uses six geographic regions, at least for training the BD and CC models. This seems a logical methodological choice. However, I was wondering if it would be useful to apply a spatial cross-validation approach in addition to the five-fold CV currently used during training. Currently, all training and testing data are randomly taken from the same region, which means that they could well be close to each other and show spatial autocorrelation. To account for this and test the model’s accuracy in areas from which it lacks any training data, it would be good to predict a test area that has not been used for training. For example, five regions could be used for training and the sixth for testing, or test blocks could be set apart within each of the six continental regions. This way, the authors would get a better idea of their model’s accuracy, given that some regions have very limited data.
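The leave-one-region-out test suggested above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' workflow: the region labels are hypothetical placeholders, and only the index split is shown. A model would be fitted on `train_idx` and scored on `test_idx` for each held-out region.

```python
from collections import defaultdict


def leave_one_region_out(regions):
    """Yield (held_out_region, train_idx, test_idx) once per distinct region label."""
    by_region = defaultdict(list)
    for i, label in enumerate(regions):
        by_region[label].append(i)
    for held_out in sorted(by_region):
        test_idx = by_region[held_out]
        # Every observation outside the held-out region goes to training.
        train_idx = sorted(
            i for label, idx in by_region.items() if label != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


# Hypothetical region labels for eight observations (not the manuscript's regions).
regions = ["A", "A", "B", "C", "B", "C", "A", "B"]
splits = list(leave_one_region_out(regions))
```

Because no observation from the held-out region is seen during training, the resulting score reflects how well the model transfers to an unsampled area rather than how well it interpolates between nearby points.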

Detailed comments:
Line 23/27: Harmonize the number of data points reported as used. Currently, the different numbers (25,000/25,200) are confusing. If the numbers differ for thickness, BD and CC, give the numbers for each of these datasets explicitly in the Methods.
Line 47-50: Harmonize the use of million hectares and million km². Choose one or the other, but do not use both interchangeably.
Line 61: This line should state that *peatland carbon stocks* were mapped by this paper using a random forest model, not peatland extent (as is currently implied).
Line 81: This says that the GPM reports 8.7 million km² of peatland (of which 6.7 million is peat-dominated). However, on line 50 you state that the GPM estimates peatlands to be 4.9 million km². This appears to be a contradiction. What is the correct number?
Line 82: What are the other non-peat dominated lands that you have excluded from the GPM? Could you elaborate why these are not useful in this study? In general, it would be good to remind the reader here that the GPM has no specific peat definition globally. As your final carbon stock number depends a lot on the GPM’s area estimate, it would be good to say something about how this could impact your results.
Line 97-98: This additional map that was used to extract more points from Indonesia does not appear to be in Table S1? Please clarify.
Line 112: What did you do if only carbon density was provided in certain datasets, and not the underlying BD and CC measurements? How were these datapoints included in your models for BD, CC and the final carbon density output? For example, the Congo Basin is one of your case studies, but you did not include any BD (and only 1 CC) values from Congo in the training set. However, carbon density values are publicly available from this area.
Line 309: Please revisit the sentence beginning ‘This table as..’; it does not read well.
Line 370: ‘Creeze et al. (2022)’ is misspelled.
Line 371-374: Please specify the full mean and SD values of the previous study as well as your study for comparison. You say they are similar but do not give the regional values to back this up. This also applies to the other case studies: sometimes this data is provided, but not in all cases. It would be useful to add a Table that compares the mean thickness and carbon density values, and total carbon stock, for the 5 case studies from Congo, Amazon, Indonesia, Scotland, and Northern peatlands, for both your map and the original studies.
Line 443: ‘specifically measured in the main peatlands of the Congo Basin’ Are these selected from this paper's peat thickness map (which is modelled), or are these original field-measurements taken from the same study? (see general comment above)
Line 446: ‘We attempted to fill the data gaps with available peatland maps, but information on peatlands in many regions, including Africa, remains scarce.’ So did you do this or not? Table S1 lists only point data in the tropics, which implies that you did not use additional map data? Please be more explicit here.
Line 456: Why is uncertainty not addressed? This should be relatively straightforward to assess with an RF model in GEE.
Section 4.4: I would appreciate a line in here that stresses that your work and the resulting maps are useful in estimating global carbon stocks, but that at regional scale your model appears to be not fine-grained enough to capture most of the known regional variations in the 5 case studies. Hence, it should be used with caution for assessing regional peatlands.
Supplement Table 1: Please list the number of samples that you used from each study, and (if necessary) specify which of these sources provided direct field observations, and which of these sources provided modelled thickness/BD/CC values. This seems an important distinction to make.
Citation: https://doi.org/10.5194/essd-2024-333-RC1
- AC1:
  'Comment on essd-2024-333', Marliana Tri Widyastuti, 19 Nov 2024
  We sincerely thank you for your thorough and insightful feedback on our manuscript. Your comments have been invaluable in improving the quality and clarity of our work. In response, we have made the following revisions:
  Updated Predictions and Uncertainty Quantification
  
  We incorporated new data points from Canada, enhancing peat thickness, bulk density (BD), and carbon content (CC) predictions. This addition affected the overall model performance and associated statistics, which we have updated in the Results section. Additionally, we quantified the uncertainty for each prediction map, expressed as the standard deviation of the distribution of predictions from decision trees in the random forest algorithm. The propagated uncertainties for thickness, BD, and CC were further used to estimate the uncertainty of total carbon stock.
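The two steps described in this reply (a per-pixel spread across trees, then propagation into carbon stock) can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the per-tree predictions are hypothetical numbers, carbon stock per unit area is simplified to a single-layer product of thickness, BD and CC, and the propagation uses a first-order rule that assumes the three errors are independent. The reply does not say which propagation method was actually used.

```python
import math
import statistics


def tree_spread(per_tree_predictions):
    """Mean and sample standard deviation of one pixel's predictions across trees."""
    mean = statistics.fmean(per_tree_predictions)
    sd = statistics.stdev(per_tree_predictions)
    return mean, sd


def product_with_uncertainty(*terms):
    """First-order propagation for a product of independent (mean, sd) terms.

    Relative variances add: (sd_C / C)^2 = sum_i (sd_i / mean_i)^2.
    """
    value = math.prod(mean for mean, _ in terms)
    relative_variance = sum((sd / mean) ** 2 for mean, sd in terms)
    return value, abs(value) * math.sqrt(relative_variance)


# Hypothetical per-tree thickness predictions (m) for one pixel.
thickness = tree_spread([2.0, 2.4, 1.6, 2.0])
# Hypothetical (mean, sd) pairs: BD in kg m^-3, CC as a mass fraction.
bulk_density = (100.0, 10.0)
carbon_content = (0.5, 0.05)

# Carbon stock per unit area (kg C m^-2) for a single layer, with its propagated sd.
stock, stock_sd = product_with_uncertainty(thickness, bulk_density, carbon_content)
```

If the thickness, BD and CC errors are correlated, which is plausible when the three models share covariates, the independence assumption would understate or overstate the combined uncertainty and covariance terms would be needed.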
  
  Clarification of Data Points
  
  We clarified the number of data points gathered from various databases and literature sources and the final number of points used in modelling. These details are briefly mentioned in the main text and provided comprehensively in tables within the Supplementary Materials.
  
  Addressed Clarity and Consistency Issues
  
  We revised unclear statements and corrected inconsistent units and terms throughout the manuscript. Furthermore, additional references were included to support the revisions.
  
  We have also provided detailed responses to each specific comment below. Once again, we appreciate your valuable feedback, which has significantly strengthened our work.
  On behalf of the authors,
  Marliana Widyastuti
  
  Citation: https://doi.org/10.5194/essd-2024-333-AC1
RC2:
'Comment on essd-2024-333', Anonymous Referee #2, 04 Oct 2024
The study attempts to map peat thickness and carbon stock in peatlands across the world using ‘observations’ (some are truly observations, others are from existing regional maps/models) to train and test random forest models, extrapolated using remote sensing and geodata products. Their models are spatially constrained by the existing Global Peatland Map (GPM, restricted to the ‘peat dominated’ lands = 6.7 million km²), and further adjusted in Indonesia (based on Haryono et al., 2011) (= 6.57 million km²).
Despite the conservative decision to restrict their predictions to ‘peat dominated lands’, this study predicts a total global peat carbon stock of 1,029 Pg C, which is between 1.7 and 2.3 times previous estimates.
If valid, this is a bold and significant conclusion, and the dataset could be useful for wider research and policy communities. However, there are several substantial issues with the methodology which I believe need to be resolved before the paper could potentially be accepted.
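As a quick arithmetic check of the ratio quoted above, assuming the previous estimates are the 445 to 612 Pg C range that the manuscript cites (quoted later in this report):

```latex
\frac{1029}{612} \approx 1.68, \qquad \frac{1029}{445} \approx 2.31
```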

Major comments
Lack of uncertainty assessment and carbon density prediction:

I appreciate that you are upfront about the lack of uncertainty assessment, but I don’t think this is acceptable in the context of a global assessment of the peat carbon stock, especially when your total carbon stock value is so large, while also considering other methodological limitations (to follow).
Linked to this is the performance of your carbon density prediction (Figure 5). Not only does the model perform poorly against observed carbon density measurements, but it also appears to be systematically biased, overpredicting carbon density. You do initially acknowledge this in line 250. However, you make no reference to it in the discussion, where you conclude (line 345): ‘In our estimation, the global peat carbon stock is 1,029 Pg which is much higher than previous studies, which reported values ranging from 445 to 612 Pg C (Table 4). This is primarily due to the larger peat extent based on the UNEP Global Peatland Map.’
Figure 5 contradicts this conclusion. While the larger peat extent certainly accounts for some of the higher carbon stock, your overprediction of carbon density is a significant factor which needs to not only be discussed but quantified (Figure 5 suggests that it may account for as much as 50% of the increased carbon stock estimate).
For examples of estimating peat-carbon stock uncertainty (and propagating uncertainty in underlying variables such as peat thickness), see Hugelius et al. (2020), Crezee et al. (2022), Hastie et al. (2022), Draper et al. (2014) etc.
You also have a large RMSE for your peat thickness prediction in some regions (e.g. North America and SE Asia; Table 2).

Definition of peat:

In line 45 you write- ‘The recent global peatlands assessment (GPA) by United Nations Environment Program (UNEP) reported an updated global peat coverage, reaching up to 500 million ha by defining peatlands as areas with more than 30 cm of peat layer (UNEP, 2022)’
You are also using the GPM to constrain peatland area. As such I assume that you are using this 30 cm cut off for your peat definition.
However, later in line 274 you write- ‘Peat thickness ranged from 0.04 to 10.68 m, with a high variation occurring particularly in the peatlands of Sumatra Island, Indonesia (Fig. 7)’
0.04 m or 4 cm does not qualify as peat under the GPA definition. Do you therefore exclude areas which are predicted to be < 30 cm (based on your model) from your results, and carbon estimation? The above sentence suggests not.
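The effect of this question on the total can be illustrated with a small sketch. Everything here is hypothetical except the 0.30 m cut-off and the 0.04 m and 10.68 m thickness extremes quoted above: the pixel values, the field names and the constant volumetric carbon density are placeholders, and the stock formula (thickness × volumetric carbon density × pixel area) is a single-layer simplification.

```python
def total_stock_kg(pixels, min_thickness_m=0.0):
    """Sum carbon stock over pixels whose predicted thickness meets the cut-off."""
    return sum(
        p["thickness_m"] * p["carbon_density_kg_m3"] * p["area_m2"]
        for p in pixels
        if p["thickness_m"] >= min_thickness_m
    )


# Hypothetical 1 km x 1 km pixels with a constant volumetric carbon density.
pixels = [
    {"thickness_m": t, "carbon_density_kg_m3": 50.0, "area_m2": 1.0e6}
    for t in (0.04, 0.25, 1.5, 10.68)
]

stock_all = total_stock_kg(pixels)
stock_peat_only = total_stock_kg(pixels, min_thickness_m=0.30)
excluded = stock_all - stock_peat_only
```

Reporting both totals would show how much of the headline figure comes from pixels that fall below the peat definition.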

Covariates and parameters:

In line 138 you write- ‘We used 19 covariates (Table 1) representing peat formation factors to predict peat thickness, BD, and CC separately.’
How did you test for redundancy of driver variables (e.g. cross correlation) and model overfitting? Please better explain and justify your model set-up (in particular the selection and retention of driver variables).
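One common first screen for redundancy is a pairwise correlation check among the covariates, sketched below in plain Python. The covariate names and values are hypothetical (Table 1 is not reproduced here), and the 0.9 threshold is an arbitrary illustrative choice rather than a recommended value.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def redundant_pairs(covariates, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| at or above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(covariates.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical covariate values sampled at five training points.
covariates = {
    "elevation": [10.0, 20.0, 30.0, 40.0, 50.0],
    "slope": [1.0, 3.0, 2.0, 5.0, 4.0],
    "mean_temperature": [25.0, 24.1, 22.9, 22.0, 21.0],
}
flagged = redundant_pairs(covariates)
```

A linear correlation screen only catches pairwise linear redundancy; it says nothing about overfitting, which would still need to be assessed through held-out validation.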

Spatial autocorrelation:

Your modelling scheme does not seem to account for spatial autocorrelation. As an additional step, you could, for example, employ a spatial cross-validation approach to get a better understanding of model performance. At the very least, you should discuss the issue of spatial autocorrelation and its potential implications. See, for example, Garcia (2021), Meyer and Pebesma (2022), and Goldblatt et al. (2016).
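A spatial block split of the kind mentioned here can be sketched in a few lines. This is a minimal illustration, not a method taken from the manuscript or the cited papers: the coordinates are hypothetical, blocks are plain longitude/latitude squares, and the 1 degree block size is a placeholder that in practice would be chosen relative to the autocorrelation range of the data.

```python
import math
from collections import defaultdict


def spatial_block_folds(coords, block_deg=1.0, n_folds=3):
    """Group points into square lon/lat blocks, then deal whole blocks to folds."""
    blocks = defaultdict(list)
    for i, (lon, lat) in enumerate(coords):
        key = (math.floor(lon / block_deg), math.floor(lat / block_deg))
        blocks[key].append(i)
    folds = [[] for _ in range(n_folds)]
    # Sorted keys keep the assignment deterministic; blocks are dealt round-robin.
    for n, key in enumerate(sorted(blocks)):
        folds[n % n_folds].extend(blocks[key])
    return folds


# Hypothetical (longitude, latitude) pairs for eight observations.
coords = [
    (101.2, 0.3), (101.4, 0.6), (101.9, 0.1),
    (113.5, -2.2), (113.7, -2.8),
    (18.1, 1.4), (18.6, 1.9),
    (-74.3, -4.5),
]
folds = spatial_block_folds(coords, block_deg=1.0, n_folds=3)
```

Each fold is used once as the test set while the remaining folds train the model, so points in the same block never appear on both sides of a split.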

Specific comments:

In line 49 you write- ‘According to the GPM, the global peatland area reached 4.9 million km2..’.
In line 81 you write- ‘The GPM, available at 1-km resolution, reports up to 8.7 million km2, double the the peat area…’.
These two sentences (above) seem contradictory, please change or explain the discrepancy (e.g. different version of map or peat definition?).

Line 140-‘We selected the hyperparameter values with the highest cross-validation score as the final model.’
I see from Table S2 that you tested hyperparameters within a defined range. It would be good to explain why you chose these ranges in terms of e.g. avoiding overfitting, as from the main text it seems that you chose the hyperparameters only based on model performance.

In line 345 you write- ‘In our estimation, the global peat carbon stock is 1,029 Pg which is much higher than previous studies, which reported values ranging from 445 to 612 Pg C (Table 4). This is primarily due to the larger peat extent based on the UNEP Global Peatland Map.’
See previous comment, what about bias in carbon density prediction?

In line 350 you write- ‘Our raster data of peat extent, provided by the Global Peatland Initiative, covers an area of 6.57 million km² designated as 'peat dominated' lands, which we assumed to be peatlands. This number exceeds the estimates reported in the GPA 2020, which accounts for up to 4.8 million km² of peatlands, excluding any peat dominated area with less than 30-40 cm peat layer. This means our estimation includes 1.77 million km2 of peatlands that were previously classified as non-peats with less than 30-40 cm peat layer. However, it is important to include as much of the probable known peatlands as possible to comprehensively estimate their carbon stock.’
Related to above comment, are you also excluding ‘peat’ pixels where your model predicts <30cm of organic soil (peat) thickness?

In line 383 you write- ‘Their peat thickness distribution map was derived using RF algorithm trained on 1,359 data points according to remote sensing layers combined with distance to peatland edge and height above nearest data.’
Do you mean ‘…height above nearest drainage.’?

Line 435- ‘4.4 Model limitations and possible improvements While PEATGRIDS have mapped peatland thickness and carbon stock across the world, we recognise some limitations that need further improvement. The peat extent used in this study is based on the Global Peatland Map, which may include areas not recognised as peats in different classification systems, or areas that have undergone significant land use changes. Since no universal definition of peat exists, PEATGRIDS provides the first estimate for global peat-dominated areas. The 1 km spatial resolution may overestimate some peatlands, especially those that cover areas less than 100 ha, or overlook smaller peatlands. Future refinement of the global peatland extent may improve the accuracy of the peat extent map.’
I would suggest also mentioning that restricting the study to the GPM definition of ‘peat dominated’ areas could also result in missing some peatlands, such as over Brazil (see for example Hastie et al., 2024 and Gumbricht et al., 2017).

Line 456- ‘One important information not addressed in this work is the uncertainty of the predicted maps. Uncertainty analysis is necessary to evaluate how reliable the predicted maps are for decision-making processes, as it acknowledges model limitations and interpretability (Wadoux et al., 2020). Model validation metrics can be used in the interim as an indication of reliability.’
Considering the poor performance of your carbon density model and bold conclusions (i.e. 1,029 Pg C), an assessment of uncertainty is essential.

Additional references mentioned-
Draper et al. 2014. Environ. Res. Lett. 9 124017. https://iopscience.iop.org/article/10.1088/1748-9326/9/12/124017
Garcia M. 2021. Investigating the Use of Spatially-Explicit Modelling and Cross-Validation Strategies in Spatial Interpolation Machine Learning Problems. Available at: https://run.unl.pt/handle/10362/113881
Goldblatt R, You W, Hanson G and Khandelwal A K. 2016. Detecting the boundaries of urban areas in India: a dataset for pixel-based image classification in Google Earth Engine. Remote Sens. 8 634
Gumbricht T, Roman-Cuesta RM, Verchot L, et al. 2017. An expert system model for mapping tropical wetlands and peatlands reveals South America as the largest contributor. Glob Change Biol. 23: 3581–3599. https://doi.org/10.1111/gcb.13689
Hastie et al. 2024. A new data-driven map predicts substantial undocumented peatland areas in Amazonia. Environ. Res. Lett. 19 094019. https://iopscience.iop.org/article/10.1088/1748-9326/ad677b
Meyer H and Pebesma E. 2022. Machine learning-based global maps of ecological variables and the challenge of assessing them. Nat. Commun. 13 2208
Citation: https://doi.org/10.5194/essd-2024-333-RC2
- AC1:
  'Comment on essd-2024-333', Marliana Tri Widyastuti, 19 Nov 2024
  We sincerely thank you for your thorough and insightful feedback on our manuscript. Your comments have been invaluable in improving the quality and clarity of our work. In response, we have made the following revisions:
  Updated Predictions and Uncertainty Quantification
  
  We incorporated new data points from Canada, enhancing peat thickness, bulk density (BD), and carbon content (CC) predictions. This addition affected the overall model performance and associated statistics, which we have updated in the Results section. Additionally, we quantified the uncertainty for each prediction map, expressed as the standard deviation of the distribution of predictions from decision trees in the random forest algorithm. The propagated uncertainties for thickness, BD, and CC were further used to estimate the uncertainty of total carbon stock.
  
  Clarification of Data Points
  
  We clarified the number of data points gathered from various databases and literature sources and the final number of points used in modelling. These details are briefly mentioned in the main text and provided comprehensively in tables within the Supplementary Materials.
  
  Addressed Clarity and Consistency Issues
  
  We revised unclear statements and corrected inconsistent units and terms throughout the manuscript. Furthermore, additional references were included to support the revisions.
  
  We have also provided detailed responses to each specific comment below. Once again, we appreciate your valuable feedback, which has significantly strengthened our work.
  On behalf of the authors,
  Marliana Widyastuti
  
  Citation: https://doi.org/10.5194/essd-2024-333-AC1
AC1:
'Comment on essd-2024-333', Marliana Tri Widyastuti, 19 Nov 2024
We sincerely thank you for your thorough and insightful feedback on our manuscript. Your comments have been invaluable in improving the quality and clarity of our work. In response, we have made the following revisions:
Updated Predictions and Uncertainty Quantification

We incorporated new data points from Canada, enhancing peat thickness, bulk density (BD), and carbon content (CC) predictions. This addition affected the overall model performance and associated statistics, which we have updated in the Results section. Additionally, we quantified the uncertainty for each prediction map, expressed as the standard deviation of the distribution of predictions from decision trees in the random forest algorithm. The propagated uncertainties for thickness, BD, and CC were further used to estimate the uncertainty of total carbon stock.

Clarification of Data Points

We clarified the number of data points gathered from various databases and literature sources and the final number of points used in modelling. These details are briefly mentioned in the main text and provided comprehensively in tables within the Supplementary Materials.

Addressed Clarity and Consistency Issues

We revised unclear statements and corrected inconsistent units and terms throughout the manuscript. Furthermore, additional references were included to support the revisions.

We have also provided detailed responses to each specific comment below. Once again, we appreciate your valuable feedback, which has significantly strengthened our work.
On behalf of the authors,
Marliana Widyastuti
Citation: https://doi.org/10.5194/essd-2024-333-AC1

Interactive discussion

Status: closed

RC1:
'Comment on essd-2024-333', Anonymous Referee #1, 02 Oct 2024
This manuscript aims to address the uncertainty in global peatland extent and peat carbon stocks by developing a global model of peat thickness and carbon density. The method that is applied and the resulting maps will be useful to peatland ecologists and soil scientists around the globe, and highlight the important role that peatlands play as carbon reservoirs.
Interestingly, the authors’ overall conclusion that peat soils globally store more than 1,000 Pg C is much larger than previous estimates. However, given the discoveries of new peatlands in recent years, particularly in the tropics, which indicates a past general trend to underestimate peatlands, this much larger estimate could well be more accurate. This manuscript will therefore make a valuable contribution to a growing body of literature that tries to pinpoint and understand the stocks and flows of peat carbon in the face of global environmental change.
General comments:
Overall, the application of digital soil mapping to estimate global carbon stocks is a useful approach that hasn’t been tried before on this scale, as far as I know. The authors have collected an impressive dataset of peat thickness, BD and CC data and used this to train random forest models for six regions. This methodology appears sound and well applied, given the obvious limitations to modelling on such a large scale.
In terms of outcomes, it seems that the authors’ mean values correspond well with previous work, particularly in the 5 case studies, but that their model is often struggling to capture the regional variability that most of these smaller-scale maps do show. Essentially, the authors’ model is moving closer towards the average at the expense of regional variation. This is understandable and as expected, given the use of a RF model with sometimes limited training data from certain regions. However, this also means that it would be good to stress the global nature of this map. The results should be treated with more caution at regional scales by end-users, especially if more local maps are also available.
Additionally, I had a couple of general questions:
It was unclear to me whether some of the training data taken from external sources includes modelled data itself. Could you clarify if all of the input data are direct field measurements, or whether this includes modelled data from previously created maps? In the latter case, it would be good to make this more explicit (how much true field data, how much modelled? In which areas?) For example in Table S1. Also, what does this mean in terms of error propagation from previous sources into this new model? How does this affect your model’s uncertainty?

It is not entirely clear to me based on what criteria the predictor variables were chosen. For example, although the list of predictor variables includes a wetness indicator such as the Topographic Wetness Index, I was left wondering whether it could have been useful to explicitly add a variable related to the seasonality of wetness/inundation? Some peatland areas, particularly along river valley systems in the tropics, might experience seasonal droughts that could be a driver of decomposition, and therefore influence thickness, BD and/or CC. (Conversely, they might experience extreme flooding during the rain season as well). Such seasonal variability in wetness/inundation is currently not captured by your list of predictor variables, as both your WordClim and PALSAR variables use yearly averages only. Perhaps it could be useful to include precipitation seasonality, precipitation of the driest month, or a similar variable from WorldClim?

The model uses six geographic regions, at least for training BD and CC models. This seems a logical methodological choice. However, I was wondering if it would be useful to apply a spatial cross-validation approach as well, in addition to the five-fold CV currently used during training. Currently, all training and testing data are randomly taken from the same region, which means that they could well be close to each other and show spatial autocorrelation. To account for this and test the model’s accuracy in areas from which it lacks any training data, it would be good to predict a test area that has not been used for training. For example, by using five regions as training data and testing on the sixth one. Or by setting testing blocks apart within each of the six continental regions. This way, the authors would get a better idea of their model’s accuracy, given some regions have very limited data.

Detailed comments:
Line 23/27: Harmonize and be consistent in number of datapoints that has been used. Currently, the different numbers (25,000/25,200) are confusing. If the numbers differ for thickness, BD and CC, give the numbers for each of these datasets explicitly in the Methods.
Line 47-50: Harmonize the use of million hectares and million km2. Choose one or the other, but not both interchangeably.
Line 61: This line should state that *peatland carbon stocks* were mapped by this paper using a random forest model, not peatland extent (as is currently implied).
Line 81: This says that the GPM reports 8.7 million km2 of peatland (of which 6.7 million is peat-dominated). However, on line 50 you state that the GPM estimates peatlands to be 4.9 million km2. This appears to be a contradiction. What is the correct number?
Line 82: What are the other non-peat dominated lands that you have excluded from the GPM? Could you elaborate why these are not useful in this study? In general, it would be good to remind the reader here that the GPM has no specific peat definition globally. As your final carbon stock number depends a lot on the GPM’s area estimate, it would be good to say something about how this could impact your results.
Line 97-98: This additional map that was used to extract more points from Indonesia does not appear to be in Table S1? Please clarify.
Line 112: What did you do if only carbon density was provided in certain datasets, and not the underlying BD and CC measurements? How were these datapoints included in your models for BD, CC and the final carbon density output? For example, the Congo Basin is one of your case studies, but you did not include any BD (and only 1 CC) values from Congo in the training set. However, carbon density values are publicly available from this area.
Line 309: Please look at the sentence ‘This table as..’; it does not read very well.
Line 370: Creeze et al. (2022) is misspelled
Line 371-374: Please specify the full mean and SD values of the previous study as well as your study for comparison. You say they are similar but do not give the regional values to back this up. This also applies to the other case studies: sometimes this data is provided, but not in all cases. It would be useful to add a Table that compares the mean thickness and carbon density values, and total carbon stock, for the 5 case studies from Congo, Amazon, Indonesia, Scotland, and Northern peatlands, for both your map and the original studies.
Line 443: ‘specifically measured in the main peatlands of the Congo Basin’ Are these selected from this paper's peat thickness map (which is modelled), or are these original field-measurements taken from the same study? (see general comment above)
Line 446: ‘We attempted to fill the data gaps with available peatland maps, but information on peatlands in many regions, including Africa, remains scarce.’ So did you do this or not? Table S1 lists only point data in the tropics, which implies that you did not use additional map data? Please be more explicit here.
Line 456: Why is uncertainty not addressed? This should be relatively straightforward to assess with an RF model in GEE.
Section 4.4: I would appreciate a line in here that stresses that your work and the resulting maps are useful in estimating global carbon stocks, but that at regional scale your model appears not to be fine-grained enough to capture most of the known regional variations in the 5 case studies. Hence, it should be used with caution for assessing regional peatlands.
Supplement Table 1: Please list the number of samples that you used from each study, and (if necessary) specify which of these sources provided direct field observations, and which of these sources provided modelled thickness/BD/CC values. This seems an important distinction to make.
Citation: https://doi.org/10.5194/essd-2024-333-RC1
- AC1:
  'Comment on essd-2024-333', Marliana Tri Widyastuti, 19 Nov 2024
  We sincerely thank you for your thorough and insightful feedback on our manuscript. Your comments have been invaluable in improving the quality and clarity of our work. In response, we have made the following revisions:
  Updated Predictions and Uncertainty Quantification
  
  We incorporated new data points from Canada, enhancing peat thickness, bulk density (BD), and carbon content (CC) predictions. This addition affected the overall model performance and associated statistics, which we have updated in the Results section. Additionally, we quantified the uncertainty for each prediction map, expressed as the standard deviation of the distribution of predictions from decision trees in the random forest algorithm. The propagated uncertainties for thickness, BD, and CC were further used to estimate the uncertainty of total carbon stock.
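  As an illustration of the kind of calculation described here, the sketch below takes the spread of per-tree predictions as the uncertainty of one variable and then propagates the uncertainties of thickness, BD and CC into a per-pixel carbon stock. It is not the authors' implementation. It assumes the three errors are independent, uses first-order (relative-variance) propagation for a product, treats the pixel as a single layer, and all names and numbers are hypothetical.

```python
import math
from statistics import mean, pstdev


def tree_mean_sd(tree_predictions):
    """Mean and standard deviation across individual tree predictions."""
    return mean(tree_predictions), pstdev(tree_predictions)


def carbon_stock_with_sd(thickness, bulk_density, carbon_content, area_m2):
    """Per-pixel carbon stock (kg) and its propagated standard deviation.

    Each of the first three arguments is a (mean, sd) tuple:
    thickness in m, bulk density in kg m^-3, carbon content as a mass
    fraction. Assumes independent errors and first-order propagation.
    """
    stock = thickness[0] * bulk_density[0] * carbon_content[0] * area_m2
    relative_sd = math.sqrt(
        sum((sd / m) ** 2 for m, sd in (thickness, bulk_density, carbon_content))
    )
    return stock, stock * relative_sd


# Hypothetical thickness predictions (m) from four trees for one pixel.
thickness = tree_mean_sd([1.8, 2.0, 2.2, 2.0])
stock_kg, stock_sd_kg = carbon_stock_with_sd(
    thickness, (120.0, 12.0), (0.5, 0.05), area_m2=1_000_000
)
```

  If the errors of thickness, BD and CC are correlated, the independence assumption no longer holds and the propagated value can be too small or too large.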
  
  Clarification of Data Points
  
  We clarified the number of data points gathered from various databases and literature sources and the final number of points used in modelling. These details are briefly mentioned in the main text and provided comprehensively in tables within the Supplementary Materials.
  
  Addressed Clarity and Consistency Issues
  
  We revised unclear statements and corrected inconsistent units and terms throughout the manuscript. Furthermore, additional references were included to support the revisions.
  
  We have also provided detailed responses to each specific comment below. Once again, we appreciate your valuable feedback, which has significantly strengthened our work.
  On behalf of the authors,
  Marliana Widyastuti
  
  Citation: https://doi.org/10.5194/essd-2024-333-AC1
RC2:
'Comment on essd-2024-333', Anonymous Referee #2, 04 Oct 2024
The study attempts to map peat thickness and carbon stock in peatlands across the world using ‘observations’ (some are truly observations, others are from existing regional maps/models) to train and test random forest models, extrapolated using remote sensing and geodata products. Their models are spatially constrained by the existing Global Peatland Map (GPM, restricted to the ‘peat dominated’ lands = 6.7 million km²), and further adjusted in Indonesia (based on Haryono et al., 2011) (= 6.57 million km²).
Despite the conservative decision to restrict their predictions to ‘peat dominated lands’, this study predicts a total global peat carbon stock of 1,029 Pg C, which is between 1.7 and 2.3 times previous estimates.
If valid, this is a bold and significant conclusion, and the dataset could be useful for wider research and policy communities. However, there are several substantial issues with the methodology which I believe need to be resolved before the paper could potentially be accepted.
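The ratio range quoted above can be checked against the previous estimates of 445 to 612 Pg C cited from the manuscript later in this report. The variable names are illustrative only.

```python
# Values as quoted in this review: PEATGRIDS total and the range of
# previous global estimates (all in Pg C).
peatgrids_pg = 1029
previous_low_pg, previous_high_pg = 445, 612

ratio_vs_high = peatgrids_pg / previous_high_pg  # smallest ratio
ratio_vs_low = peatgrids_pg / previous_low_pg    # largest ratio
```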

Major comments
Lack of uncertainty assessment and carbon density prediction:

I appreciate that you are upfront about the lack of uncertainty assessment, but I don’t think this is acceptable in the context of a global assessment of the peat carbon stock, especially when your total carbon stock value is so large, while also considering other methodological limitations (to follow).
Linked to this is the performance of your carbon density prediction (Figure 5). Not only does the model perform poorly against observed carbon density measurements, but it also appears to be systematically biased, overpredicting carbon density. You do initially acknowledge this in line 250. However, you make no reference to it in the discussion where you conclude (line 345) – ‘In our estimation, the global peat carbon stock is 1,029 Pg which is much higher than previous studies, which reported values ranging from 445 to 612 Pg C (Table 4). This is primarily due to the larger peat extent based on the UNEP Global Peatland Map.’
Figure 5 contradicts this conclusion. While the larger peat extent certainly accounts for some of the higher carbon stock, your overprediction of carbon density is a significant factor which needs to not only be discussed but quantified (Figure 5 suggests that it may account for as much as 50% of the increased carbon stock estimate).
For examples of estimating peat-carbon stock uncertainty (and propagating uncertainty in underlying variables such as peat thickness), see Hugelius et al. (2020), Crezee et al. (2022), Hastie et al. (2022), Draper et al. (2014) etc.
You also have a large RMSE for your peat thickness prediction in some regions, e.g. North America and SE Asia (Table 2).

Definition of peat:

In line 45 you write- ‘The recent global peatlands assessment (GPA) by United Nations Environment Program (UNEP) reported an updated global peat coverage, reaching up to 500 million ha by defining peatlands as areas with more than 30 cm of peat layer (UNEP, 2022)’
You are also using the GPM to constrain peatland area. As such I assume that you are using this 30 cm cut off for your peat definition.
However, later in line 274 you write- ‘Peat thickness ranged from 0.04 to 10.68 m, with a high variation occurring particularly in the peatlands of Sumatra Island, Indonesia (Fig. 7)’
0.04 m or 4 cm does not qualify as peat under the GPA definition. Do you therefore exclude areas which are predicted to be < 30 cm (based on your model) from your results, and carbon estimation? The above sentence suggests not.
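One way to answer this question quantitatively is to compare the total stock with and without pixels whose predicted thickness falls below the 30 cm cut-off. The sketch below does this for a handful of hypothetical pixels. The function name, the data layout and the stock units are assumptions, and pixels exactly at the threshold are kept here, which is a choice the authors would need to make explicit.

```python
def stock_with_threshold(pixels, min_thickness_m=0.30):
    """Return (stock_all, stock_kept, n_excluded).

    `pixels` is an iterable of (predicted_thickness_m, carbon_stock)
    tuples; pixels thinner than `min_thickness_m` are excluded from
    `stock_kept`.
    """
    stock_all = 0.0
    stock_kept = 0.0
    n_excluded = 0
    for thickness_m, stock in pixels:
        stock_all += stock
        if thickness_m >= min_thickness_m:
            stock_kept += stock
        else:
            n_excluded += 1
    return stock_all, stock_kept, n_excluded


# Hypothetical pixels: (predicted thickness in m, carbon stock in arbitrary units).
pixels = [(0.04, 1.0), (0.25, 6.0), (0.30, 8.0), (2.50, 60.0)]
stock_all, stock_kept, n_excluded = stock_with_threshold(pixels)
```

Reporting both totals would show how sensitive the global estimate is to the peat definition.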

Covariates and parameters:

In line 138 you write- ‘We used 19 covariates (Table 1) representing peat formation factors to predict peat thickness, BD, and CC separately.’
How did you test for redundancy of driver variables (e.g. cross correlation) and model overfitting? Please better explain and justify your model set-up (in particular the selection and retention of driver variables).
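A simple redundancy screen of the kind asked about here computes pairwise Pearson correlations between covariates and flags strongly correlated pairs. The sketch below is a minimal illustration with hypothetical covariate names and values. The 0.8 threshold is an arbitrary assumption, and the code assumes no covariate is constant.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def redundant_pairs(covariates, threshold=0.8):
    """List (name_a, name_b, r) for covariate pairs with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(covariates.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical covariate values sampled at five locations.
covariates = {
    "elevation": [10, 20, 30, 40, 50],
    "temperature": [25.0, 24.0, 23.1, 22.0, 21.0],
    "twi": [8.0, 12.0, 7.0, 11.0, 9.0],
}
flagged = redundant_pairs(covariates)
```

Pairwise correlation only detects linear redundancy between two variables, so it complements rather than replaces a check for overfitting.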

Spatial autocorrelation:

Your modelling scheme does not seem to account for spatial autocorrelation. As an additional step, you could for example employ a spatial cross-validation approach to get a better understanding of model performance. At the very least you should discuss the issue of spatial autocorrelation and its potential implications. See for example Garcia (2021), Meyer and Pebesma (2022), and Goldblatt et al. (2016).
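One common form of spatial cross-validation groups observations into geographic blocks and assigns whole blocks to folds, so that nearby points never end up on both sides of a split. The sketch below is a minimal illustration under stated assumptions: the block size, the round-robin fold assignment, the function names and the coordinates are all hypothetical, and blocks defined in degrees are not equal-area.

```python
import math


def block_id(lat, lon, block_deg=5.0):
    """Identify the latitude/longitude grid block containing a point."""
    return (math.floor(lat / block_deg), math.floor(lon / block_deg))


def spatial_block_folds(points, n_folds=5, block_deg=5.0):
    """Return one fold index per point; points in a block share a fold."""
    blocks = sorted({block_id(lat, lon, block_deg) for lat, lon in points})
    # Assumption: simple round-robin assignment of blocks to folds.
    fold_of_block = {block: i % n_folds for i, block in enumerate(blocks)}
    return [fold_of_block[block_id(lat, lon, block_deg)] for lat, lon in points]


# Hypothetical observation coordinates (latitude, longitude).
points = [
    (1.2, 101.5), (1.4, 101.9),   # two nearby points
    (-2.0, 113.0),
    (56.1, -3.9), (56.3, -4.2),   # two nearby points
    (0.5, 18.2),
]
folds = spatial_block_folds(points, n_folds=2)
```

Each fold can then serve as the test set in turn, with performance metrics compared against those from the random five-fold split.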

Specific comments:

In line 49 you write- ‘According to the GPM, the global peatland area reached 4.9 million km2..’.
In line 81 you write- ‘The GPM, available at 1-km resolution, reports up to 8.7 million km2, double the peat area…’.
These two sentences (above) seem contradictory, please change or explain the discrepancy (e.g. different version of map or peat definition?).

Line 140-‘We selected the hyperparameter values with the highest cross-validation score as the final model.’
I see from Table S2 that you tested hyperparameters within a defined range. It would be good to explain why you chose these ranges in terms of e.g. avoiding overfitting, as from the main text it seems that you chose the hyperparameters only based on model performance.

In line 345 you write- ‘In our estimation, the global peat carbon stock is 1,029 Pg which is much higher than previous studies, which reported values ranging from 445 to 612 Pg C (Table 4). This is primarily due to the larger peat extent based on the UNEP Global Peatland Map.’
See previous comment, what about bias in carbon density prediction?

In line 350 you write- ‘Our raster data of peat extent, provided by the Global Peatland Initiative, covers an area of 6.57 million km² designated as 'peat dominated' lands, which we assumed to be peatlands. This number exceeds the estimates reported in the GPA 2020, which accounts for up to 4.8 million km² of peatlands, excluding any peat dominated area with less than 30-40 cm peat layer. This means our estimation includes 1.77 million km2 of peatlands that were previously classified as non-peats with less than 30-40 cm peat layer. However, it is important to include as much of the probable known peatlands as possible to comprehensively estimate their carbon stock.’
Related to above comment, are you also excluding ‘peat’ pixels where your model predicts <30cm of organic soil (peat) thickness?

In line 383 you write- ‘Their peat thickness distribution map was derived using RF algorithm trained on 1,359 data points according to remote sensing layers combined with distance to peatland edge and height above nearest data.’
Do you mean ‘…height above nearest drainage.’?

Line 435- ‘4.4 Model limitations and possible improvements While PEATGRIDS have mapped peatland thickness and carbon stock across the world, we recognise some limitations that need further improvement. The peat extent used in this study is based on the Global Peatland Map, which may include areas not recognised as peats in different classification systems, or areas that have undergone significant land use changes. Since no universal definition of peat exists, PEATGRIDS provides the first estimate for global peat-dominated areas. The 1 km spatial resolution may overestimate some peatlands, especially those that cover areas less than 100 ha, or overlook smaller peatlands. Future refinement of the global peatland extent may improve the accuracy of the peat extent map.’
I would suggest also mentioning that restricting the study to the GPM definition of ‘peat dominated’ areas could also result in missing some peatlands, such as over Brazil (see for example Hastie et al., 2024 and Gumbricht et al., 2017).

Line 456- ‘One important information not addressed in this work is the uncertainty of the predicted maps. Uncertainty analysis is necessary to evaluate how reliable the predicted maps are for decision-making processes, as it acknowledges model limitations and interpretability (Wadoux et al., 2020). Model validation metrics can be used in the interim as an indication of reliability.’
Considering the poor performance of your carbon density model and bold conclusions (i.e. 1,029 Pg C), an assessment of uncertainty is essential.

Additional references mentioned-
Draper et al 2014 Environ. Res. Lett. 9 124017- https://iopscience.iop.org/article/10.1088/1748-9326/9/12/124017
Garcia M 2021 Investigating the Use of Spatially-Explicit Modelling and Cross-Validation Strategies in Spatial Interpolation Machine Learning Problems available at: https://run.unl.pt/handle/10362/113881
Goldblatt R, You W, Hanson G and Khandelwal A K 2016 Detecting the boundaries of urban areas in India: a dataset for pixel-based image classification in Google Earth Engine Remote Sens. 8 634
Gumbricht T, Roman-Cuesta RM, Verchot L, et al. An expert system model for mapping tropical wetlands and peatlands reveals South America as the largest contributor. Glob Change Biol. 2017; 23: 3581–3599. https://doi.org/10.1111/gcb.13689
Hastie et al 2024. A new data-driven map predicts substantial undocumented peatland areas in Amazonia. Environ. Res. Lett. 19 094019. https://iopscience.iop.org/article/10.1088/1748-9326/ad677b
Meyer H and Pebesma E 2022 Machine learning-based global maps of ecological variables and the challenge of assessing them Nat. Commun. 13 2208
Citation: https://doi.org/10.5194/essd-2024-333-RC2

Supplement

https://doi.org/10.5194/essd-2024-333-supplement

Data sets

PEATGRIDS: Mapping global peat thickness and carbon stock via digital soil mapping approach, dataset M. T. Widyastuti, B. Minasny, J. Padarian, and F. Maggi https://doi.org/10.5281/zenodo.12559239

Viewed

Total article views: 3,168 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
1,948	823	397	3,168	270	98	137


Views and downloads (calculated since 27 Aug 2024)

Month	HTML	PDF	XML	Total
Aug 2024	104	38	3	145
Sep 2024	104	18	5	127
Oct 2024	195	29	30	254
Nov 2024	84	21	88	193
Dec 2024	71	22	102	195
Jan 2025	56	26	77	159
Feb 2025	62	11	0	73
Mar 2025	46	26	6	78
Apr 2025	46	43	0	89
May 2025	49	34	37	120
Jun 2025	25	20	8	53
Jul 2025	46	19	4	69
Aug 2025	93	12	0	105
Sep 2025	315	21	1	337
Oct 2025	62	35	2	99
Nov 2025	63	57	2	122
Dec 2025	71	59	1	131
Jan 2026	50	35	3	88
Feb 2026	75	59	4	138
Mar 2026	71	57	6	134
Apr 2026	132	108	7	247
May 2026	84	45	4	133
Jun 2026	14	7	2	23
Jul 2026	30	21	5	56


Viewed (geographical distribution)

Total article views: 3,113 (including HTML, PDF, and XML), thereof 3,113 with geography defined and 0 with unknown origin.


Cited

Latest update: 31 Jul 2026


Short summary

PEATGRIDS is the first dataset containing maps of global peat thickness and carbon stock at 1 km resolution. The dataset is publicly available on Zenodo to support further analyses and modelling of peatlands across the globe. This work employed a random forest machine learning model to provide spatially explicit peat carbon stock estimates on a per-pixel basis.


Cited

6 citations as recorded by crossref.