the Creative Commons Attribution 4.0 License.
Soil organic carbon maps and associated uncertainty at 90 m for peninsular Spain
Abstract. Human activities have significantly disrupted the global carbon cycle, leading to increased atmospheric CO2 levels and altering ecosystems' carbon absorption capacities, with soils serving as the largest carbon reservoirs in terrestrial ecosystems. The complexity and variability of soil properties, shaped by long-term transformations, make it crucial to study these properties at various spatial and temporal scales to develop effective climate change mitigation strategies. However, integrating disparate soil databases presents challenges due to the lack of standardized protocols, necessitating collaborative efforts to standardize data collection and processing to improve the reliability of Soil Organic Carbon (SOC) estimates. This issue is particularly relevant in peninsular Spain, where variations in sampling protocols and calculation methods have resulted in significant discrepancies in SOC concentration and stock estimates. This study aimed to improve the understanding of SOC storage and distribution in peninsular Spain by focusing on two specific goals: integrating and standardizing existing soil profile databases, and modeling SOC concentrations (SOCc) and stocks (SOCs) at different depths using an ensemble machine-learning approach. The research produced four high-resolution SOC maps for peninsular Spain, detailing SOCc and SOCs at depths of 0–30 cm, 30–100 cm and the effective soil depth, along with associated uncertainties. These maps provide valuable data for national soil carbon management and contribute to compiling Spain's National Greenhouse Gas Emissions Inventory Report. Additionally, the findings support global initiatives like the Global Soil Organic Carbon Map, aligning with international efforts to improve soil carbon assessments. 
The soil organic carbon concentration (g/kg) maps for the 0–30 cm and 30–100 cm standard depths, along with the soil organic carbon stock (tC/ha) maps for the 0–30 cm standard depth and the effective soil depth, including their associated uncertainties, all at a 90 m pixel resolution (SOCM90), are freely available at https://doi.org/10.6073/pasta/48edac6904eb1aff4c1223d970c050b4 (Durante et al., 2024).
Status: open (until 04 Mar 2025)
RC1: 'Comment on essd-2024-431', Anonymous Referee #1, 06 Dec 2024
The manuscript by Durante et al., submitted to ESSD, is an interesting contribution, particularly due to its impressive dataset on SOC concentration and stocks. However, the modelling approach is not robust and requires significant rethinking. There is abundant literature on the mapping of soil properties and spatial model ensembles, yet it is unclear why the authors have disregarded this body of work. I did not review the results section because the mapping and modelling steps lack rigor and do not make sense. Authors should get help from a digital soil mapping and spatial modelling expert.
Specific comments:
- Inconsistent data points (L. 185): How are the authors classifying a data point within a pedogenetic horizon as “inconsistent”? Please clarify the criteria used.
- SOC data transformation (L. 189-191): On the one hand, the authors apply a log-transformation to the SOC data; on the other, they remove SOC data from organic soils. This is contradictory and lacks a clear rationale. Why were data from organic soils excluded? This is not a common practice, and the reasoning should be explicitly stated. Additionally, if data from organic soils were excluded, does this mean no predictions were made for organic soils? Please confirm, because the figures of the results show predictions for all soils.
- Conversion factor (L. 198-200): The manuscript should specify how many data points were converted using the factor mentioned. Note that this conversion factor has been widely criticized within the scientific community for being overly general.
- Representativeness (Section 2.1.3): The term “representativeness” is poorly defined in the context of this study, and the entire section lacks coherence. Why are the authors using techniques designed for point patterns when the soil data are not a point pattern? The use of Maxent to evaluate the “representativity” of the data is unclear, especially since other models are used later in the study. What exactly are the authors trying to achieve with this analysis? There have been studies looking at the area of applicability of spatial models.
- Data input for ML (L. 293-294): The described step seems outdated, as most modern machine learning (ML) techniques can handle both categorical and continuous datasets as input without requiring separate preprocessing.
- Bayesian analysis: Bayesian analysis and Bayesian calibrations are techniques for updating parameter distributions and fitting models, not models themselves. Which specific model was used in the Bayesian analysis? This should be explicitly stated. The three techniques for variable selection could be removed and merged with the modelling step, because the optimal variable set depends on the model.
- Model selection and ensemble approach: The modeling approach is unclear. The authors used three models (QRF, EML, and AutoML) combined into an ensemble. However, one of these (QRF) is itself an ensemble of random forests. How was the ensemble constructed? Additionally, the validation step using cross-validation should be applied consistently across all three models. For the final prediction, was the ensemble constructed from all models, or was it fitted to all available data points? Please clarify.
- Uncertainty estimates (L. 406): Some models, such as QRF, return prediction intervals, while others, such as EML, likely return confidence intervals. What uncertainty measures are reported for each model? Additionally, how was the standard deviation derived from the WRF distribution? More details are needed here.
- Ensemble uncertainty (L. 408-411): The method proposed for handling uncertainty is statistically flawed. Selecting the pixel with the lowest standard deviation from different models is incorrect. Model ensembles should be constructed using specific techniques that integrate predictions from multiple models. Accurately representing uncertainty across models is more complex than the proposed approach.
- Cross-validation vs. data splitting (Figure 3): Cross-validation should be used instead of data splitting for model evaluation.
- R² calculation: How was the R² calculated? Please provide details about the method used.
Citation: https://doi.org/10.5194/essd-2024-431-RC1
AC1: 'Reply on RC1', Juan M. Requena-Mullor, 29 Jan 2025
RC1: 'Comment on essd-2024-431', Anonymous Referee #1, 06 Dec 2024
The manuscript by Durante et al., submitted to ESSD, is an interesting contribution, particularly due to its impressive dataset on SOC concentration and stocks. However, the modelling approach is not robust and requires significant rethinking.
Response: We thank the reviewer for their time and feedback. First, we appreciate the recognition that our manuscript is interesting and based on an impressive dataset, which we believe is relevant for ESSD.
We agree that the manuscript's clarity could be improved, and we are ready to provide a revised version based on the feedback from the reviewers. Specifically, we are prepared to improve the explanation of how we performed the modeling workflow and how we demonstrated its robustness (comments from Reviewer 1) and improve the introduction and discussion (comments from Reviewer 2).
There is abundant literature on the mapping of soil properties and spatial model ensembles, yet it is unclear why the authors have disregarded this body of work.
Response: We agree that a more comprehensive introduction to ensemble learning for digital soil mapping is necessary. In the revised manuscript, we will include updated information and provide a detailed overview of the state-of-the-art in ensemble machine learning for digital soil mapping.
I did not review the results section because the mapping and modelling steps lack rigor and do not make sense. Authors should get help from a digital soil mapping and spatial modelling expert.
Response: We respectfully disagree with the reviewer's assessment, but we acknowledge the need to improve the clarity of the methods section. The modeling approach is robust, as it is tested against fully independent data. It uses three state-of-the-art machine learning approaches (quantile regression forests, model stacking, and automatic machine learning) parameterized with cross-validation. The approach with the least prediction variance against the same set of fully independent data is used for the final prediction on a pixel-wise basis. In the revised version of the manuscript, we will improve the narrative and explanation of the workflow.
Specific comments:
1. Inconsistent data points (L. 185)
How are the authors classifying a data point within a pedogenetic horizon as “inconsistent”? Please clarify the criteria used.
Response: Data with logical inconsistencies were identified and treated as errors, likely resulting from inaccuracies in the soil sampling process or property measurements. For instance, inconsistencies were observed when the upper or lower boundary of a morphological horizon—or both—failed to align with adjacent horizons, creating overlapping or undefined sections within the soil profile. These discrepancies, which hindered the accurate assignment of carbon data, were attributed to errors in the morphological description and excluded to avoid introducing inaccuracies into the models. We will revise the text to clarify this topic in the revised version.
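As an illustration only (hypothetical data structure; the authors' actual criteria may include additional checks), the boundary-consistency rule described in this response can be sketched as:

```python
def find_inconsistent_profiles(horizons):
    """Flag soil profiles whose horizon boundaries overlap or leave gaps.

    `horizons` maps a profile ID to a list of (upper, lower) depth pairs
    in cm, ordered from the surface downward (hypothetical structure).
    """
    flagged = []
    for profile_id, layers in horizons.items():
        for (up1, low1), (up2, low2) in zip(layers, layers[1:]):
            # The lower boundary of one horizon should meet the upper
            # boundary of the next; a mismatch creates an overlapping
            # or undefined section within the profile.
            if low1 != up2:
                flagged.append(profile_id)
                break
    return flagged

profiles = {
    "P1": [(0, 20), (20, 55), (55, 110)],  # consistent
    "P2": [(0, 25), (20, 60)],             # overlapping horizons
    "P3": [(0, 15), (30, 70)],             # undefined 15-30 cm section
}
print(find_inconsistent_profiles(profiles))  # → ['P2', 'P3']
```

Profiles flagged this way would be excluded rather than corrected, since the true boundaries cannot be recovered from the morphological description.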
2. SOC data transformation (L. 189-191)
On the one hand, the authors apply a log-transformation to the SOC data, on the other, they remove SOC data from organic soils. This is contradictory and lacks a clear rationale. Why were data from organic soils excluded? This is not a common practice, and the reasoning should be explicitly stated. Additionally, if data from organic soils were excluded, does this mean no predictions were made for organic soils? Please confirm because the figures of the results show prediction for all soils.
Response: We appreciate the opportunity to clarify this point because we believe there is a misunderstanding. We excluded the organic horizons (e.g., H, O, and L horizons) reported in some soils, but we did not exclude full soil profiles from organic soils (e.g., Histosols). Arguably, the exclusion of organic horizons is a common practice in soil carbon modeling. Most studies estimate soil carbon starting from the mineral horizon at the soil surface (0 cm), omitting horizons located above it that are primarily organic. We further clarify that the presence of soils with predominantly organic horizons in the database used in our study is limited and insufficient for reliably modeling carbon dynamics in these layers. We will add text in the revised version to clarify this topic.
3. Conversion factor (L. 198-200)
The manuscript should specify how many data points were converted using the factor mentioned. Note that this conversion factor has been widely criticized within the scientific community for being overly general.
Response: The conversion factor used in the manuscript does introduce some uncertainty into the carbon data, as it applies a single value uniformly, despite variations influenced by soil type and, more importantly, the methodology used for its determination. This uncertainty arises when converting analytically determined organic carbon into organic matter using a factor of 1.72. However, in our study, we reversed this process for organic matter data from the compiled databases, converting it back to organic carbon using a factor of 0.67, based on the assumption that organic matter contains 58% carbon (Rosell et al., 2001), as stated in the manuscript. This approach ensured consistency in carbon content representation across the dataset and effectively mitigated the uncertainty associated with conversion factors, addressing the reviewer’s concern.
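A minimal sketch of the conversion logic described above, assuming the 58% carbon content of organic matter cited from Rosell et al. (2001); the conventional 1.72 factor is simply the inverse of that fraction:

```python
# Organic matter (OM) <-> organic carbon (OC) conversion, assuming
# organic matter contains 58 % carbon (Rosell et al., 2001).
CARBON_FRACTION = 0.58            # g C per g OM (assumption from the text)
OM_TO_OC = CARBON_FRACTION        # multiply OM by this to recover OC
OC_TO_OM = 1 / CARBON_FRACTION    # ~1.72, the classic conversion factor

def om_to_oc(om_g_per_kg):
    """Convert an organic matter value (g/kg) back to organic carbon (g/kg)."""
    return om_g_per_kg * OM_TO_OC

print(round(OC_TO_OM, 2))   # → 1.72
print(om_to_oc(30.0))       # → 17.4
```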
Reference used in response:
Rosell, R. A., Gasparoni, J. C., and Galantini, J. A.: Soil organic matter evaluation, in: Assessment methods for soil carbon, Lewis Publishers Boca Raton, 311–322, 2001.
4. Representativeness (Section 2.1.3)
The term “representativeness” is poorly defined in the context of this study, and the entire section lacks coherence. Why are the authors using techniques designed for point patterns when the soil data are not a point pattern? The use of Maxent to evaluate the “representativity” of the data is unclear, especially since other models are used later in the study. What exactly are the authors trying to achieve with this analysis? There have been studies looking at the area of applicability of spatial models.
Response: We appreciate the reviewer’s insightful comments regarding the concept of "representativeness" in our study. We acknowledge that there is no consensus on how to evaluate the representativeness of soil samples and recognize that various approaches exist for this purpose. In our study, we utilized the Maxent model due to its established application in spatial representativeness studies. Although Maxent is commonly used for point patterns, it has also been successfully adapted for broader spatial applications, such as assessing the representativeness of environmental networks (Villarreal et al., 2018) and mitigating spatial bias in volunteered geographic information for predictive mapping, particularly in cases of low representativeness (Zhang and Zhu, 2019). Importantly, this approach has also been applied to evaluating the representativeness of a soil database to predict soil carbon across Ecuador (Armas et al., 2022).
We understand that the use of our approach needs further clarification in a revised version. Briefly, Maxent was used to assess the representativeness of soil types in the database by modeling the probability of a soil type being sampled based on soil-forming factors. This approach allowed us to assess how well the sampled soil types represent their overall variability across peninsular Spain, providing insights into poorly sampled areas. To clarify this approach, we will revise the methodology section to better define the concept of representativeness and explain the rationale behind using Maxent.
References used in response:
Villarreal, S., Guevara, M., Alcaraz-Segura, D., Brunsell, N. A., Hayes, D., Loescher, H. W., and Vargas, R.: Ecosystem functional diversity and the representativeness of environmental networks across the conterminous United States, Agric. For. Meteorol., 262, 423–433, 2018.
Zhang, G. and Zhu, A. X.: A representativeness-directed approach to mitigate spatial bias in VGI for the predictive mapping of geographic phenomena, Int. J. Geogr. Inf. Sci., 33, 1873–1893, 2019.
Armas, D., Guevara, M., Bezares, F., Vargas, R., Durante, P., Osorio, V., Jiménez, W., and Oyonarte, C.: Harmonized Soil Database of Ecuador (HESD): data from 2009 to 2015, Earth Syst. Sci. Data Discuss., 1–24, 2022.
5. Data input for ML (L. 293-294)
The described step seems outdated, as most modern machine learning (ML) techniques can handle both categorical and continuous datasets as input without requiring separate preprocessing.
Response: Although modern machine learning methods can process both categorical and continuous data together, our approach was driven by the need for compatibility with the specific algorithms we chose and our dataset's structure. In our case, categorical variables were rasterized and transformed into binary variables to ensure consistency across the dataset and to create a clear distinction between the presence or absence of specific categories. This was especially important because some categories in the dataset had fewer than 100 soil samples. More importantly, despite the evolving nature of machine learning methods, we believe that our approach is still valid and well suited to the objectives of this research. We respectfully believe that as new approaches are identified and eventually widely adopted, reassessments of datasets and spatial predictions will be needed to test how past results compare with new approaches, as a natural step of scientific endeavors. We will include a discussion of this topic in the revised version.
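The presence/absence encoding described in this response can be sketched with toy data (hypothetical category names; the actual covariates were rasterized map categories):

```python
from collections import Counter

def one_hot(values, min_count=1):
    """Expand a categorical covariate into presence/absence (0/1) layers,
    optionally dropping rare categories (hypothetical threshold)."""
    counts = Counter(values)
    categories = sorted(c for c, n in counts.items() if n >= min_count)
    return {c: [1 if v == c else 0 for v in values] for c in categories}

# A toy "lithology" covariate sampled at five locations:
litho = ["limestone", "granite", "limestone", "schist", "granite"]
layers = one_hot(litho)
print(sorted(layers))        # → ['granite', 'limestone', 'schist']
print(layers["granite"])     # → [0, 1, 0, 0, 1]
```

Each resulting 0/1 layer can then be treated as a separate predictor raster, which is what makes the presence or absence of a category explicit to the model.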
6. Bayesian analysis
Bayesian analysis and Bayesian calibrations are techniques for updating parameter distributions and fitting models, not models themselves. Which specific model was used in the Bayesian analysis? This should be explicitly stated.
Response: We agree with the reviewer’s comment. In the revised version of the manuscript, we will explicitly define the structure of the Bayesian model used.
The three techniques for variable selection could be removed and merged with the modelling step, because the optimal variable set depends on the model.
Response: We intend to identify the most informative SOC covariates across multiple modeling approaches. We propose that variables that consistently emerge as top predictors in multiple modeling approaches are more important than those that are top predictors in only one. With this approach, we are not only interested in improving the predictions but also in providing interpretability of the models used in this work. To further clarify this point, we will include information about interpretability in both the introduction and discussion sections.
7. Model selection and ensemble approach
The modeling approach is unclear. The authors used three models (QRF, EML, and AutoML) combined into an ensemble. However, one of these (QRF) is itself an ensemble of random forests. How was the ensemble constructed? Additionally, the validation step using cross-validation should be applied consistently across all three models. For the final prediction, was the ensemble constructed from all models, or was it fitted to all available data points? Please clarify.
Response: We recognize that the clarity of the modeling narrative in the current version of the manuscript can be improved. We will improve the narrative and explain in the new version of the paper how we create the final map using the model with the least prediction variance on a pixel-wise basis.
We agree with the reviewer that QRF (a variant of Random Forests) is an ensemble of multiple regression trees (based on bagging resampling) that solves regression problems using a simple average, or a majority vote when used as a classifier. Similar ensemble approaches are also available for generalized linear models (e.g., RandomGLM). However, ensembles of multiple modeling approaches also exist in the literature (Merrifield et al., 2020). Ensemble machine learning is a branch of machine learning where more than one model contributes to a final prediction on a statistical basis (i.e., using an error term as an information criterion to determine which model's prediction is more accurate). While these ensemble approaches have proven to be more efficient than single models in some cases, for digital soil mapping it is also relevant to identify the most suitable model for a specific geographical area under specific soil weathering conditions. Our effort combines predictions from multiple modeling approaches in geographical space under the same criterion of minimum prediction variance. Prediction variance is computed in the same way across methods; therefore, we use it to filter the predictions from multiple models for each pixel in geographical space. Please note that we provide cross-validated digital soil maps from the three modeling approaches (QRF, model stacking, and automatic ML) and, in addition, a map combining predictions from the different models. We also indicate, for each pixel, which model generates the least prediction variance, which we believe represents valuable information for further local studies interested in modeling SOC across particular geographical regions within the Iberian Peninsula.
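The pixel-wise combination rule described in this response can be sketched as follows (toy values and flattened lists standing in for the actual raster stacks):

```python
def combine_by_min_variance(predictions, std_devs):
    """Per-pixel ensemble: for each pixel, keep the prediction from the
    model with the smallest prediction standard deviation.

    `predictions` and `std_devs` map model names to equally long lists of
    per-pixel values (a flattened raster in this sketch).
    """
    models = list(predictions)
    n_pixels = len(next(iter(predictions.values())))
    combined, winner = [], []
    for i in range(n_pixels):
        best = min(models, key=lambda m: std_devs[m][i])
        combined.append(predictions[best][i])
        winner.append(best)  # which model "won" this pixel
    return combined, winner

preds = {"QRF": [30.0, 22.0], "stack": [28.0, 25.0], "autoML": [31.0, 24.0]}
sds   = {"QRF": [4.0, 1.5],   "stack": [2.0, 3.0],   "autoML": [3.5, 2.5]}
values, source = combine_by_min_variance(preds, sds)
print(values)   # → [28.0, 22.0]
print(source)   # → ['stack', 'QRF']
```

The `source` layer corresponds to the map indicating which model produced the least prediction variance at each pixel.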
Reference used in response:
Merrifield, A. L., Brunner, L., Lorenz, R., Medhaug, I., and Knutti, R.: An investigation of weighting schemes suitable for incorporating large ensembles into multi-model ensembles, Earth Syst. Dynam., 11, 807–834, https://doi.org/10.5194/esd-11-807-2020, 2020.
8. Uncertainty estimates (L. 406)
Some models, such as QRF, return prediction intervals, while others, such as EML, likely return confidence intervals. What uncertainty measures are reported for each model? Additionally, how was the standard deviation derived from the WRF distribution? More details are needed here.
Response: We appreciate this comment and recognize that this topic needs clarification. First, we compare 1) QRF (an ensemble of regression trees), 2) a stack of models as implemented in the MLR package of R (Machine Learning in R) using a linear approach as a meta-algorithm for 'stacking' multiple prediction algorithms independently parameterized with cross-validation, and 3) a novel approach for automatically comparing and testing prediction algorithms known as autoML.
While the model-based uncertainty of some algorithms is an emergent property of their design (e.g., for QRF, the model-based uncertainty is extracted from the variance of all the regression or classification trees used to grow the forest), for other modeling approaches (such as kernel methods or neural networks) computing uncertainty estimates is not straightforward. One important component of the model-based uncertainty of any prediction algorithm is the sensitivity of model predictions to variations in the data (commonly reported in digital soil mapping), i.e., as induced by the cross-validation process or the repeated data partition for training and testing models.
In our study, we computed the sensitivity of model prediction to data variations using the same data partition strategy across QRF, the stack of models, and the automatic machine learning approach. We further highlight that cross-validation is always computed to parameterize prediction algorithms. We plan to clarify these points in the revised version of the manuscript.
QRF is a variant of RF where the algorithm computes not only the most probable prediction value, but the full conditional distribution of the response variable, as a function of its prediction factors.
Response: We appreciate the reviewer’s insightful comments and agree that additional clarification is needed. In the revised version of the manuscript, we will provide a more detailed explanation of the uncertainty measures reported for each model and the derivation of the standard deviation from the WRF distribution.
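As an illustrative sketch (toy per-tree values; the manuscript's QRF implementation is in R), a point prediction, a standard deviation, and quantile-based intervals can all be read off the per-tree predictions that approximate the conditional distribution:

```python
import statistics

# For a single new location, each regression tree in the forest returns
# its own prediction; together they approximate the conditional
# distribution of SOC given the covariates (toy values below).
tree_predictions = [18.2, 21.5, 19.8, 24.0, 20.1, 22.7, 19.4, 21.0]

mean_pred = statistics.fmean(tree_predictions)   # point prediction
sd_pred = statistics.stdev(tree_predictions)     # model-based uncertainty

# Quantiles of the same distribution give prediction intervals,
# which is what distinguishes QRF from a plain Random Forest.
qs = statistics.quantiles(tree_predictions, n=20)
q05, q95 = qs[0], qs[-1]                         # ~90 % prediction interval

print(round(mean_pred, 2))   # → 20.84
```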
9. Ensemble uncertainty (L. 408-411)
The method proposed for handling uncertainty is statistically flawed. Selecting the pixel with the lowest standard deviation from different models is incorrect. Model ensembles should be constructed using specific techniques that integrate predictions from multiple models. Accurately representing uncertainty across models is more complex than the proposed approach.
Response: We respectfully disagree with the assertion that our method for handling uncertainty is statistically flawed. Ensemble forecasting using the standard deviation of predictions from multiple models is a valid and widely recognized method, particularly in machine learning and digital soil mapping (see for example Varón-Ramírez et al., 2022; Arroyo-Cruz et al., 2017). Arguably, this approach effectively captures the variability in predictions across models, providing a measure of uncertainty that reflects differences in model behavior under the same input conditions.
Furthermore, using the standard deviation of predictions is a standard practice in ensemble modeling (Leutbecher and Palmer, 2008; Gavilán-Acuña et al., 2021). Our approach offers an interpretable metric for understanding the spread of predictions and identifying areas where model agreement is lower. The standard deviation reflects the dispersion of predictions from multiple models, capturing an essential dimension of uncertainty—variability between different modeling approaches. This is particularly relevant in ensemble setups where model diversity is leveraged to improve the overall robustness of predictions. Moreover, by applying this method at the pixel level, we provided spatially explicit uncertainty estimates that allow users to identify regions with higher prediction variability. This information is critical for understanding the reliability of SOC estimates and guiding targeted soil management decisions.
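The per-pixel spread across models described above can be sketched as follows (toy values; the actual inputs are prediction rasters):

```python
import statistics

# Predictions of the same pixels from three models (toy values).
model_maps = [
    [30.0, 22.0, 15.5],   # e.g. QRF
    [28.0, 25.0, 14.0],   # e.g. model stacking
    [31.0, 24.0, 16.5],   # e.g. autoML
]

# Per-pixel standard deviation across models: pixels where the models
# disagree more receive a higher uncertainty value.
uncertainty = [round(statistics.stdev(pixel), 2)
               for pixel in zip(*model_maps)]
print(uncertainty)   # → [1.53, 1.53, 1.26]
```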
In the revised version of our manuscript, we will clarify the reasoning behind our approach and its limitations to ensure transparency and interpretability.
References used in response:
Arroyo-Cruz, C. E., Larson, J., and Guevara, M.: A machine learning approach for mapping soil properties in Mexico using legacy data, climate and terrain covariates at a coarse scale, in: GlobalSoilMap - Digital Soil Mapping from Country to Globe: Proceedings of the Global Soil Map 2017 Conference, July 4-6, 2017, Moscow, Russia, edited by: Arrouays, D., Savin, I., Leenaars, J., and McBratney, A. B., CRC Press, London, 23–27, https://doi.org/10.1201/9781351239707, 2017.
Gavilán-Acuña, G., Olmedo, G. F., Mena-Quijada, P., Guevara, M., Barría-Knopf, B., and Watt, M. S.: Reducing the Uncertainty of Radiata Pine Site Index Maps Using an Spatial Ensemble of Machine Learning Models, Forests, 12, 77, https://doi.org/10.3390/f12010077, 2021.
Leutbecher, M. and Palmer, T. N.: Ensemble forecasting, J. Comput. Phys., 227, 3515–3539, https://doi.org/10.1016/j.jcp.2007.02.014, 2008.
Varón-Ramírez, V.M., Araujo-Carrillo, G.A., Guevara Santamaría, M.A.: Colombian soil texture: building a spatial ensemble model. Earth System Science Data 14, 4719–4741, https://doi.org/10.5194/essd-14-4719-2022, 2022.
10. Cross-validation vs. data splitting (Figure 3)
Cross-validation should be used instead of data splitting for model evaluation.
Response: We thank the reviewer for the opportunity to clarify this point. The accuracy and reliability of the models were assessed at two distinct levels: calibration and validation. Small differences between these criteria indicate minimal overfitting and enhanced reliability (Mulder et al., 2016).
- Model Calibration: Goodness-of-fit indices for each modeling technique were obtained using validation strategies tailored to each method. For example, the quantile regression forest approach employed out-of-bag error, while ten-fold cross-validation was utilized for the other techniques.
- External Validation: Predicted values, generated using 75% of the dataset for training, were compared with observed values from the remaining 25% of the dataset. This process was repeated three times for each modeling technique (i.e., three-fold cross-validation), and the average results were calculated (Nussbaum et al., 2018). Additionally, the standard deviation of the models’ predictions (Figure 3 of the manuscript) was used to assess variability and reliability.
Overall, the evaluation process incorporated various approaches, including cross-validation. The results are summarized in Table 4 of the manuscript, presenting the concordance correlation coefficient for both calibration (CCcal) and validation (CCval). To improve clarity, we will include additional details where necessary in the revised manuscript.
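For reference, the concordance correlation coefficient reported as CCcal and CCval (Lin's CCC) can be computed as in this sketch (toy values; the authors' exact implementation may differ):

```python
import statistics

def concordance_cc(obs, pred):
    """Lin's concordance correlation coefficient between observed and
    predicted values (population moments, as commonly implemented)."""
    n = len(obs)
    mx, my = statistics.fmean(obs), statistics.fmean(pred)
    sx2 = sum((x - mx) ** 2 for x in obs) / n
    sy2 = sum((y - my) ** 2 for y in pred) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(obs, pred)) / n
    # Unlike Pearson's r, the CCC also penalizes mean and scale shifts.
    return 2 * sxy / (sx2 + sy2 + (mx - my) ** 2)

observed  = [10.0, 14.0, 19.0, 25.0, 31.0]
predicted = [11.0, 13.5, 20.0, 24.0, 32.0]
print(round(concordance_cc(observed, predicted), 3))   # → 0.992
```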
References used in response:
Mulder, V.L., Lacoste, M., Richer-de-Forges, A.C., Martin, M.P., Arrouays, D., 2016. National versus global modelling the 3D distribution of soil organic carbon in mainland France. Geoderma 263, 16–34. https://doi.org/10.1016/j.geoderma.2015.08.035
Nussbaum, M., Spiess, K., Baltensweiler, A., Grob, U., Keller, A., Greiner, L., Schaepman, M. E., and Papritz, A.: Evaluation of digital soil mapping approaches with large sets of environmental covariates, SOIL, 4, 1–22, https://doi.org/10.5194/soil-4-1-2018, 2018.
11. R² calculation
How was the R² calculated? Please provide details about the method used.
Response: R-squared was calculated in the analysis of residuals for the top 30 cm of the SOCc and SOCs maps. It was computed as the square of the correlation between the observed values (from the soil database) and the predicted values (derived from the final maps). Additionally, the overall agreement rate and Kappa statistic were determined using the postResample function of the caret package in R.
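That calculation can be sketched as follows (toy values; the authors used caret's postResample in R):

```python
def r_squared(obs, pred):
    """R^2 as the squared Pearson correlation between observed and
    predicted values (the definition used in the response above)."""
    n = len(obs)
    mx = sum(obs) / n
    my = sum(pred) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(obs, pred))
    sxx = sum((x - mx) ** 2 for x in obs)
    syy = sum((y - my) ** 2 for y in pred)
    # r^2 = (covariance^2) / (variance_obs * variance_pred)
    return sxy ** 2 / (sxx * syy)

observed  = [12.0, 18.0, 25.0, 30.0, 41.0]
predicted = [14.0, 17.0, 23.0, 33.0, 38.0]
print(round(r_squared(observed, predicted), 3))   # → 0.949
```

Note that this squared-correlation definition measures linear association only; unlike the concordance correlation coefficient, it does not penalize systematic bias between observations and predictions.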
Citation: https://doi.org/10.5194/essd-2024-431-AC1
RC2: 'Comment on essd-2024-431', Anonymous Referee #2, 05 Jan 2025
Authors used 8,361 soil profile samples and data of multiple environmental factors to create a digital map of SOC concentration (0-30 cm, and 30-100 cm) and stocks (0-30 cm, and effective soil depth) for peninsular Spain at 90-m resolution. Authors state that they used an ensemble machine learning approach to generate SOC estimates and its associated uncertainty.
Numerous SOC maps at various resolutions have been published both globally and nationally. However, the authors fail to mention recent advancements in ensemble machine learning-based SOC mapping efforts in the Introduction section. I recommend that the authors thoroughly review the existing SOC DSM literature and clearly identify the knowledge gap that this manuscript aims to address. Additionally, in the appropriate section, the authors should compare their maps with existing SOC estimates, including those from Spain (I remember reviewing an earlier SOC mapping study from Spain), and report the findings appropriately. Throughout the manuscript, several abbreviations are repeated multiple times; the authors should carefully review and minimize redundancy. Overall, I find this work incomplete and out of place, as it does not appropriately engage with the existing literature on this important topic.
Abstract: I didn’t find the Abstract focused, informative, or structured. No information about sample size, methodological details, or prediction accuracy exists in this abstract. Too many irrelevant details, which should be in the materials and methods section, are provided in the abstract. The text from L17-26 is unnecessary in the Abstract and should be deleted. Also, abstracts should end with a sentence stating who can use the information generated from this study, not a self-citation. I encourage authors to read some good-quality SOC DSM papers and rewrite the Abstract accordingly.
Introduction: The Introduction section should summarize the existing literature on the topic of investigation and clearly state the existing knowledge gaps in current efforts. Authors should properly cite and discuss the findings of the existing SOC DSM literature, specifically those studies which have used an ensemble machine learning approach in other parts of the world. This study is not the first to use this approach, and proper appreciation of the existing literature is needed. The current Introduction suggests the authors are unaware of recent developments in DSM of SOC using ML techniques.
Materials and methods:
L179-193: This section is confusing and needs to be properly rewritten. I think Histosols are also soil types. So if authors want to report SOC stocks of peninsular Spain, Histosols must be part of it. If authors want to report SOC only in mineral soils of Spain, then that can’t be the total SOC stocks of Spain as it is presented currently, and authors should clearly mention this in relevant sections of the manuscript.
L214-221: How many samples were not included in the modeling? Was any gap filling approach employed in this study?
L303-350: This section is not relevant to ML approach. Current ML algorithms can take into account of categorical, continuous, and correlated variables.
L373: MLR is incorrectly abbreviated here. MLR is not ML approach and should not be included in the ensemble ML approach.
L351: Section 2.2.2 is confusing. Please rewrite and mention which specific models were included in the model ensemble approach applied in this study. I am surprised to see non ML methods such as MLR included/mentioned here, as I thought this manuscript was using ensemble ML approach. The current write-up suggests author used only two ML approaches (QRF and AutoML), in doing so authors can not produce a robust ML ensemble, and thus the interquartile range.
L416-426: Authors attempt to highlight a lot in the manuscript about uncertainty estimates of SOC stocks. But I am surprised to see no robust uncertainty analysis conducted in the text. Authors merely report validation statistics and interquartile range of different approach that they used. I suggest authors to define in the methods section what they mean by the term “uncertainty”. In my knowledge, without proper distributional analyses of each independent variables and SOC stocks using MonteCarlo simulations, no proper uncertainty analysis can be done.
L549: This manuscript does not have results and discussion section. Is this common for this journal? I will not accept this work unless authors provide a robust discussion in an appropriate section, mentioning how their results compare and contrast with the existing SOC literature, which has produced SOC stock estimates using ensemble ML approach.
Citation: https://doi.org/10.5194/essd-2024-431-RC2 -
AC2: 'Reply on RC2', Juan M. Requena-Mullor, 29 Jan 2025
reply
RC2: 'Comment on essd-2024-431', Anonymous Referee #2, 05 Jan 2025
1. Authors used 8, 361 soil profile samples and data of multiple environmental factors to create a digital map of SOC concentration (0-30 cm, and 30-100 cm) and stocks (0-30 cm, and effective soil depth) for peninsular Spain at 90-m resolution. Authors state that they used ensemble machine learning approach to generate SOC estimates and it’s associated uncertainty.
Response: We appreciate the reviewer’s feedback and their acknowledgment of the significance of the dataset we present for ESSD.
2. Numerous SOC maps at various resolutions have been published both globally and nationally. However, the authors fail to mention recent advancements in ensemble machine learning-based SOC mapping efforts in the Introduction section. I recommend that the authors thoroughly review the existing SOC DSM literature and clearly identify the knowledge gap that this manuscript aims to address. Additionally, in the appropriate section, the authors should compare their maps with existing SOC estimates, including those from Spain (I remember reviewing earlier SOC mapping study from Spain), and report the findings appropriately. Throughout the manuscript, several abbreviations are repeated multiple times; the authors should carefully review and minimize redundancy. Overall, I find this work incomplete and out of place, as it does not appropriately engage with the existing literature on this important topic.
Response: We agree that the manuscript's clarity could be improved, and we are ready to provide a revised version based on the feedback from the reviewers. Specifically, we are prepared to explain how we perform the modeling workflow and how we demonstrate its robustness (comments from Reviewer 1), as well as improve the introduction and discussion sections (comments from Reviewer 2).
The comparison of our results with other carbon stock estimates in Spain was not the focus of this study, as it was comprehensively addressed in earlier research conducted in a pilot area of the Iberian Peninsula. In that study, we compared multiple existing SOC maps and explored the appropriateness of different scales and methodological approaches. The findings from that investigation have played a key role in shaping many of the modeling decisions adopted in the current study. The findings are detailed in Durante et al. (2024), Predicting soil organic carbon with different approaches and spatial resolutions for the southern Iberian Peninsula, Spain, Geoderma Regional, Vol. 37 (https://doi.org/10.1016/j.geodrs.2024.e00780). However, we will edit the discussion in the revised version to highlight how other studies have approached research of soil organic carbon modelling and patterns across the Iberian Peninsula.
3. Abstract: I didn’t find the Abstract focused, informative or structured. No information about sample size, methodological details, and prediction accuracy exist in this abstract. Too many irrelevant details which should be in materials & methods section are provided in the abstract.
Response: We agree. We will revise the abstract to provide more quantitative information.
4. The text from L17-26 are unnecessary in the Abstract and should be deleted. Also Abstracts should end with a sentence stating who can use the generated information from this study, and not a self-citation. I encourage authors to read some good quality SOC DSM papers and rewrite the Abstract accordingly.
Response: We appreciate the reviewer's suggestion. Based on our experience, abstracts in ESSD typically conclude with a citation referencing the dataset. We will revise the abstract accordingly, removing unnecessary text and rephrasing the conclusion to better align with the standards for ESSD papers while ensuring it highlights who can use the generated information.
5. Introduction: Introduction section should summarize the existing literature on the topic of investigation and state clearly the existing knowledge gaps in current efforts. Authors should properly cite and discuss the findings of existing SOC DSM literature, specifically those studies which has used ensemble machine learning approach in other parts of the world. This study is not the first to use this approach and proper appreciation of existing literature is needed. Current Introduction suggests authors are unaware of recent developments in DSM SOC which uses ML techniques.
Response: We agree that the introduction on ensemble learning for digital soil mapping could be more comprehensive. In the revised version of the manuscript, we will provide updated information that outlines the current state-of-the-art in ensemble machine-learning techniques for digital soil mapping.
6. Materials and methods:
L179-193: This section is confusing and needs to be properly rewritten. I think Histosols are also soil types. So if authors want to report SOC stocks of peninsular Spain, Histosols must be part of it. If authors want to report SOC only in mineral soils of Spain, then that can’t be the total SOC stocks of Spain as it is presented currently, and authors should clearly mention this in relevant sections of the manuscript.
Response: As mentioned earlier in our response to Reviewer #1 (please see comment 2), we did not exclude Histosol modeling from the analysis. Rather, we have excluded the estimation of organic carbon in the organic horizons (e.g., H, L, O). Soil typology has been incorporated as a covariate in the modeling process through the use of information from the National Soil Map. It is also important to note that this soil type does not appear in the legend of the aforementioned national map. Regarding the suggestion to explicitly mention this exclusion, we agree it could be clarified in a future revision. However, it is already implicitly addressed by the fact that the upper limit of the models is defined at the mineral horizon (0 cm).
7. L214-221: How many samples were not included in the modeling? Was any gap filling approach employed in this study?
Response: The initial dataset was reduced by 29 profiles, and no gap-filling or pedotransfer equations were applied. Only data directly obtained in the field or laboratory were used. As a result, the number of samples varies depending on the modeled carbon parameter, such as density, stock, or the depth considered in the model. Among these parameters, apparent density was the most limiting factor, significantly reducing the number of samples available for analysis. Consequently, carbon stock estimation was based on only 25% of the samples used for carbon density.
8. L303-350: This section is not relevant to ML approach. Current ML algorithms can take into account of categorical, continuous, and correlated variables.
Response: We acknowledge that modern ML algorithms can handle a wide range of data types, including categorical, continuous, and correlated variables, often without requiring extensive preprocessing. However, we argue that relying solely on ML as a "black box" approach—where all available covariates are included without further consideration—carries certain risks such as poor interpretability or overfitting. In contrast, by selecting covariates through a transparent and controlled process, we ensured that each variable included in the model had a clear, demonstrated relationship with SOC content. This makes the modeling process more interpretable and allows us to draw meaningful conclusions about the factors influencing SOC stocks. Moreover, our variable selection process, which combined multiple linear regression, Bayesian analysis, and projection pursuit regression, reduced the risk of overfitting. By integrating covariates with the highest covariate importance scores, we provided a cross-validation mechanism within the variable selection process itself, adding robustness to our methodology. In the revised manuscript, we will clarify the rationale for our variable selection approach and its advantages over an exclusively "black box" ML methodology. This explanation will emphasize the importance of balancing predictive power with interpretability and scientific transparency.
9. L373: MLR is incorrectly abbreviated here. MLR is not ML approach and should not be included in the ensemble ML approach.
Response: We agree and appreciate the reviewer for identifying this inaccuracy. MLR is widely recognized as Multiple Linear Regression, which is not a machine learning approach. We will revise the manuscript to replace it with "ensemble ML approach" where appropriate.
10. L351: Section 2.2.2 is confusing. Please rewrite and mention which specific models were included in the model ensemble approach applied in this study. I am surprised to see non ML methods such as MLR included/mentioned here, as I thought this manuscript was using ensemble ML approach. The current write-up suggests author used only two ML approaches (QRF and AutoML), in doing so authors can not produce a robust ML ensemble, and thus the interquartile range.
Response: We used three distinct machine learning approaches: Quantile Regression Forest, Ensemble Machine Learning, and Auto-Machine Learning. The confusion likely arises from the unclear acronym used for Ensemble Machine Learning, as clarified in our earlier response. We will revise Section 2.2.2 to enhance clarity and accuracy.
11. L416-426: Authors attempt to highlight a lot in the manuscript about uncertainty estimates of SOC stocks. But I am surprised to see no robust uncertainty analysis conducted in the text. Authors merely report validation statistics and interquartile range of different approach that they used. I suggest authors to define in the methods section what they mean by the term “uncertainty”. In my knowledge, without proper distributional analyses of each independent variables and SOC stocks using MonteCarlo simulations, no proper uncertainty analysis can be done.
Response: Thank you for highlighting the importance of robust uncertainty analysis in SOC stock estimation and allowing us to clarify our approach. While Monte Carlo simulations are a well-established method for propagating uncertainties in independent variables, they are not the sole approach for assessing uncertainty. Depending on the objectives of the study and the characteristics of the data, alternative methods can offer valuable insights into soil mapping (see for example Grimm and Behrens, 2010; Stumpf et al., 2017; Zhang et al., 2022; Schmidinger and Heuvelink, 2023). For a comparison between Monte Carlo simulations and other approaches, refer to Heuvelink (2018). In our study, we evaluated uncertainty using methods aligned with digital soil mapping and predictive modeling objectives, specifically:
- Validation statistics. Metrics such as the coefficient of determination, concordance correlation coefficient, root mean square error (RMSE), and mean absolute error (MAE) provided an objective assessment of models’ predictive performance. Metrics such as RMSE and MAE have been employed by Zhang et al. (2022) to map soil total and organic carbon, as well as to perform uncertainty analysis using machine learning techniques.
- Interquartile range (IQR). This metric captured variability across different modeling approaches, particularly relevant in ensemble or comparative modeling, where variation between methods reflects a key aspect of uncertainty. Greiner et al. (2018) utilized IQR as a measure of uncertainty for soil function maps.
- Ensemble forecasting. Using the standard deviation of predictions, we accounted for uncertainty generated by combining predictions from multiple models, a widely used approach in machine learning (see for example Varón-Ramírez et al., 2022; Arroyo-Cruz et al., 2017).
Additionally, our methodology inherently considers spatial variability in predictions, offering spatially explicit uncertainty estimates. This approach identified areas with high prediction variability, critical for decision-making in soil carbon management. While our methods focused on uncertainties emerging from model predictions and variability across modeling techniques—an approach widely validated in digital soil mapping studies—we recognize the importance of explicitly defining "uncertainty" to avoid ambiguity. In the revised manuscript, we will clearly define "uncertainty" in the context of our study and elaborate on the methods employed to evaluate it. We believe these clarifications will strengthen the manuscript's transparency and rigor.
References used in response:
Arroyo-Cruz, C.E., Larson, J., Guevara, M.: A machine learning approach for mapping soil properties in Mexico using legacy data, climate and terrain covariates at a coarse scale 23–27. In: Arrouays, D., Savin, I., Leenaars, J., McBratney, A.B. (Eds.) GlobalSoilMap - Digital Soil Mapping from Country to Globe: Proceedings of the Global Soil Map 2017 Conference, July 4-6, 2017, Moscow, Russia. CRC Press, London. https://doi.org/10.1201/9781351239707, 2017.
Greiner, L., Nussbaum, M., Papritz, A., Zimmermann, S., Gubler, A., Grêt-Regamey, A., Keller, A., 2018. Uncertainty indication in soil function maps – transparent and easy-to-use information to support sustainable use of soil resources. SOIL 4, 123–139. https://doi.org/10.5194/soil-4-123-2018
Grimm, R., Behrens, T., 2010. Uncertainty analysis of sample locations within digital soil mapping approaches. Geoderma 155, 154–163. https://doi.org/10.1016/j.geoderma.2009.05.006
Heuvelink, G.B.M., 2018. Uncertainty and Uncertainty Propagation in Soil Mapping and Modelling, in: McBratney, Alex.B., Minasny, B., Stockmann, U. (Eds.), Pedometrics. Springer International Publishing, Cham, pp. 439–461. https://doi.org/10.1007/978-3-319-63439-5_14
Stumpf, F., Schmidt, K., Goebes, P., Behrens, T., Schönbrodt-Stitt, S., Wadoux, A., Xiang, W., Scholten, T., 2017. Uncertainty-guided sampling to improve digital soil maps. CATENA 153, 30–38. https://doi.org/10.1016/j.catena.2017.01.033
Varón-Ramírez, V.M., Araujo-Carrillo, G.A., Guevara Santamaría, M.A.: Colombian soil texture: building a spatial ensemble model. Earth System Science Data 14, 4719–4741, https://doi.org/10.5194/essd-14-4719-2022, 2022.
Zhang, W., Wan, H., Zhou, M., Wu, W., Liu, H., 2022. Soil total and organic carbon mapping and uncertainty analysis using machine learning techniques. Ecological Indicators 143, 109420. https://doi.org/10.1016/j.ecolind.2022.109420
12. L549: This manuscript does not have results and discussion section. Is this common for this journal? I will not accept this work unless authors provide a robust discussion in an appropriate section, mentioning how their results compare and contrast with the existing SOC literature, which has produced SOC stock estimates using ensemble ML approach.
Response: To the best of our knowledge, including a discussion section is not a mandatory requirement under the ESSD guidelines (available at https://www.earth-system-science-data.net/submission.html#manuscriptcomposition). However, if the editor and reviewers believe that adding a discussion would enhance clarity, we are prepared to include one. In this section, we will highlight the key characteristics of our dataset and methods, emphasizing their relevance and contributions.
Citation: https://doi.org/10.5194/essd-2024-431-AC2
-
AC2: 'Reply on RC2', Juan M. Requena-Mullor, 29 Jan 2025
reply
-
RC3: 'Comment on essd-2024-431', Anonymous Referee #3, 22 Jan 2025
reply
General comments
The manuscript clearly presents the purpose and usefulness of the work.
It is well written and organized in structured sections. In particular, the methodology is comprehensive and transparent.
Four Soil Organic Carbon maps of Spain are provided. Their quality is good (high-resolution maps), and they are coupled with the associated uncertainties.
The effort and importance in creating a nationally unified data set emerges very well from the manuscript.
Specific comments
- I think a fundamental strength of your work is the creation of a unified database of 8,361 georeferenced soil profiles. I apologize in advance if I am wrong, but I cannot find a link to which to access this database. Could you please add it? It would be of considerable interest for the international scientific community. As an additional suggestion, also for the future, it would be very useful to make interactive this database, perhaps by designing a specific web page (or webGIS for instance) that also contains an interactive map of all the sampling points.
- In the Introduction, you point out that the soil profiles were obtained through different techniques of sampling, laboratory etc.. In the perspective of the database, for greater completeness and clarity of the data provided, this information should be added.
- In my view, at line 195, it is not so straightforward how the soil depth standardization occurs. Could you add a few words about this?
- Since land use is a factor influencing soil, especially in its top 30 cm, and since soil profiles span a very large interval (1954-2018), a supplementary figure on the evolution of land use in the study area would add great value to your work.
Technical corrections
Please, correct the following typing errors:
- Line 239: “variabcorporated”?
- Line 365: “assumptions about the;” missing something between “the” and “;”
- Line 443: maybe 200 cm, not meters
- Line 549: replace “considetations” with “considerations”
Citation: https://doi.org/10.5194/essd-2024-431-RC3
Data sets
Soil organic carbon and associated uncertainty at 90 m resolution for peninsular Spain P. Durante et al. https://doi.org/10.6073/pasta/48edac6904eb1aff4c1223d970c050b4
Viewed
HTML | XML | Total | Supplement | BibTeX | EndNote | |
---|---|---|---|---|---|---|
292 | 74 | 14 | 380 | 24 | 16 | 15 |
- HTML: 292
- PDF: 74
- XML: 14
- Total: 380
- Supplement: 24
- BibTeX: 16
- EndNote: 15
Viewed (geographical distribution)
Country | # | Views | % |
---|
Total: | 0 |
HTML: | 0 |
PDF: | 0 |
XML: | 0 |
- 1