the Creative Commons Attribution 4.0 License.
European soil bulk density and organic carbon stock database using machine learning based pedotransfer function
Abstract. Soil bulk density (BD) serves as a fundamental indicator of soil health and quality, exerting a significant influence on critical factors such as plant growth, nutrient availability, and water retention. Because BD is rarely available in soil databases, pedotransfer functions (PTFs) have emerged as a potent tool for predicting BD from other, easily measurable soil properties. However, the impact of PTF accuracy on soil organic carbon (SOC) stock calculation has rarely been explored. In this study, we propose a local modelling approach for predicting BD across Europe using the recently released BD data from LUCAS Soil 2018 (0–20 cm). Our approach combines neighbour sample search, Forward Recursive Feature Selection (FRFS) and a Random Forest (RF) model (local-RFFRFS). The results showed that local-RFFRFS performed well in predicting BD (R² of 0.58, RMSE of 0.19 g cm⁻³), surpassing the traditional PTFs (R² of 0.40–0.45, RMSE of 0.22 g cm⁻³) and the global PTFs using RF with and without FRFS (R² of 0.56–0.57, RMSE of 0.19 g cm⁻³). Interestingly, we found that the best traditional PTF (R²=0.84, RMSE=1.39 kg m⁻²) performed close to local-RFFRFS (R²=0.85, RMSE=1.32 kg m⁻²) when BD predictions were used to calculate SOC stock. However, local-RFFRFS still performed better (ΔR²>0.2 and ΔRMSE>0.1 g cm⁻³) for soil samples with low SOC stock (<3 kg m⁻²). Therefore, we suggest that local-RFFRFS is a promising method for BD prediction, while traditional PTFs would be more efficient when BD is subsequently used to calculate SOC stock. Finally, we produced two BD and SOC stock datasets (18,945 and 15,389 soil samples) for LUCAS Soil 2018 using the best traditional PTF and local-RFFRFS, respectively. These datasets are archived on the Zenodo platform at https://zenodo.org/records/10211884 (Chen et al., 2023).
The outcomes of this study present a meaningful advance in the predictive accuracy of BD, and the resulting BD and SOC stock datasets across Europe enable more precise soil hydrological and biological modelling.
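The neighbour sample search named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the distance measure (Euclidean distance on standardized predictors), the value of k, the function names and the toy numbers are all assumptions, and an inverse-distance-weighted mean stands in for the Random Forest and FRFS steps, which are omitted here.

```python
import math

def standardize(rows):
    """Scale each predictor column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]
    return scaled, means, sds

def nearest_neighbours(train_x, target_x, k):
    """Return (distance, index) pairs for the k closest training samples."""
    dists = [(math.dist(x, target_x), i) for i, x in enumerate(train_x)]
    return sorted(dists)[:k]

def predict_local(train_x, train_bd, target_x, k=3):
    """Predict BD for one target sample from its k nearest neighbours.

    Placeholder local model (assumption): inverse-distance-weighted mean.
    The abstract describes a Random Forest fitted on the neighbours instead.
    """
    scaled, means, sds = standardize(train_x)
    target = [(v - m) / s for v, m, s in zip(target_x, means, sds)]
    neighbours = nearest_neighbours(scaled, target, k)
    weights = [1.0 / (d + 1e-9) for d, _ in neighbours]
    weighted = sum(w * train_bd[i] for w, (_, i) in zip(weights, neighbours))
    return weighted / sum(weights)

# Made-up illustrative samples: [SOC content (g kg-1), clay content (%)].
toy_x = [[10.0, 20.0], [50.0, 30.0], [200.0, 15.0], [15.0, 25.0]]
toy_bd = [1.40, 1.10, 0.50, 1.35]  # g cm-3, also made up
print(predict_local(toy_x, toy_bd, [12.0, 22.0], k=3))
```

The point of the sketch is only the structure of a local model: each target sample gets its own small training set, so the fitted relationship can differ between regions of predictor space.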
Status: closed
RC1: 'Comment on essd-2023-493', Asim Biswas, 13 Feb 2024
The manuscript by Chen et al. presents a European soil bulk density and organic carbon stock database (>15000 soil samples) built from the recently released BDfine and CFvolumefraction data (around 6000 soil samples) from LUCAS 2018. The authors evaluated model performance for BD using traditional pedotransfer functions (PTFs) and four proposed machine learning (ML) based PTFs, and found that the ML based PTFs (R2 of 0.56-0.57) greatly improved the accuracy of BD prediction, which is also much higher than that of previous PTFs for Europe using a Hollis-type PTF (R2 of 0.41). For the first time, the authors produced European soil organic carbon stock data for topsoil (0-20 cm) for the year 2018 and evaluated the impact of BD accuracy on the accuracy of the soil organic carbon stock data. The produced data and the accompanying evaluation are of significant importance for informing more precise soil hydrological and biological modelling, so as to support the Soil health by 2050 goal proposed by the European Commission. This manuscript is generally well written, with clear objectives and solid methodology, and I therefore suggest that it can be accepted for publication after minor revision.
Here are the specific comments:
Lines 100-101: Several symbols for the units should be superscripts, such as g cm-3, g kg-1. Please correct them throughout the manuscript.
Table 2: I think two digits would be enough for the R2 reported here, which is in line with your previous summary in Table 1. It is also not clear whether the data used to evaluate these traditional PTFs are the same as those used for the machine learning PTFs. If not, the results would not be comparable. Please make this clear.
Line 140: Please specify the k here. 5-fold cross-validation? 10-fold cross-validation?
Figure 6: What do the colours mean here? More details should be provided in the figure captions.
Citation: https://doi.org/10.5194/essd-2023-493-RC1
AC1: 'Reply on RC1', Songchao Chen, 10 Mar 2024
We greatly appreciate your positive feedback and your constructive suggestions and comments on our manuscript. All your concerns have been carefully addressed in the revised manuscript, and we hope you are satisfied with our revision. Please find our point-by-point responses to your concerns in the attached document.
RC2: 'Comment on essd-2023-493', Jingyi Huang, 17 Feb 2024
The authors presented a new model to estimate soil bulk density and evaluate the performance using the LUCAS database. Overall, the manuscript was well written. The improved model performance will significantly help the research community to derive more reliable soil carbon stock products. I have some minor comments below:
1. In the two existing bulk density models, the input data is SOM. Please explain how you estimate this from the LUCAS data where the listed variables only have SOC. If you used a conversion factor to estimate SOM from SOC, can you please run some simple uncertainty analysis for the four existing models? For example, if you replace the SOM input in models #3 and #4 with SOC, you will get two new models that only use SOC as inputs. Then, a simple inter-model comparison can be made by plotting the estimated bulk density from four models vs. measured bulk density using a range of SOC inputs.
However, if you used independent sources of SOM for models #3 and #4, some of the uncertainty may be attributed to analytical errors related to SOM and SOC measurements.
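The uncertainty analysis requested in comment 1 could be sketched as follows. This is an illustration under stated assumptions, not what the authors did: the default factor 1.724 is the conventional SOC-to-SOM conversion factor, the alternative factors are an arbitrary illustrative range, SOM is assumed to be expressed in % by mass, and `ptf` is a hypothetical user-supplied function standing in for any SOM-based PTF.

```python
# Conventional SOC-to-SOM conversion factor. Using it here is an assumption;
# this page does not state which factor, if any, the manuscript applied.
CONVENTIONAL_FACTOR = 1.724

def soc_to_som_percent(soc_g_per_kg, factor=CONVENTIONAL_FACTOR):
    """Convert SOC content (g kg-1) to SOM content (% by mass)."""
    return soc_g_per_kg / 10.0 * factor

def ptf_sensitivity(ptf, soc_g_per_kg, factors=(1.4, 1.724, 2.0, 2.5)):
    """Predicted BD for each conversion factor.

    `ptf` is any callable taking SOM (%) and returning BD (g cm-3);
    it is a placeholder for a PTF from the literature, not a real one.
    """
    return {f: ptf(soc_to_som_percent(soc_g_per_kg, f)) for f in factors}
```

Passing each SOM-based PTF to `ptf_sensitivity` over a range of SOC inputs gives the spread in predicted BD that is attributable to the conversion factor alone, which can then be plotted against measured BD as the comment suggests.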
2. Please explain a bit more about the depth of the soil samples. It seems that you only build the models and compare your models with other models for the depth of 0-20 cm. My question here is that your machine learning models share the same climate/terrain predictors but different soil property predictors. What will happen if the authors want to estimate soil bulk density at a soil profile at different depths or even for 0-5 cm or 0-15 cm within the dataset you have? Will the coefficients stay the same for different depths? It may be helpful to publish another reduced model without climate/terrain predictors for an improved applicability/transferability of your more accurate models so that researchers can use them for bulk density estimation at different depths, just like the current models.
Of course, if people have access to depth-specific soil bulk density data, they can develop depth-explicit machine learning models to account for the effects of climate and terrain on bulk density at depths. However, I think the research community has not well studied this issue and it will be a very important topic for future collaboration.
Citation: https://doi.org/10.5194/essd-2023-493-RC2
AC2: 'Reply on RC2', Songchao Chen, 10 Mar 2024
We greatly appreciate your positive feedback and your constructive suggestions and comments on our manuscript. All your concerns have been carefully addressed in the revised manuscript, and we hope you are satisfied with our revision. Please find our point-by-point responses to your concerns in the attached document.
RC3: 'Comment on essd-2023-493', Ary Bruand, 18 Feb 2024
This paper addresses the question of predicting the bulk density of soils, its absence in soil databases limiting our ability to move from soil mass characteristics (quantities per unit mass of soil) to characteristics expressed in relation to a volume of soil or to a surface area of soil for a given soil thickness. This is an extremely important subject. The soil databases have bulk density values for a minority of soils stored there, but enough to make the study possible of how it is possible to predict, using pedotransfer functions (PTFs), the bulk density using other characteristics of these soils for which the bulk density values are available. The objective is to have tools for predicting the bulk density using soil characteristics that are much more easily accessible than the bulk density. Here, the measured and predicted values of bulk density are then used to compute the stock of soil organic carbon. The latter are discussed according to the characteristics of the PTFs used and the characteristics of the soils, including their environmental characteristics. This is an article which deserves to be published in “Earth System Science Data” but which must first be corrected both in substance and form according to the comments which follow.
My main concerns are:
- I did not find a presentation of the way used to discuss the “accuracy” of the prediction of the bulk density and then of the soil organic carbon (SOC) content. This requires to be improved (see also comments along the text). “Accuracy” is discussed using the R2 and RMSE values alone. I recommend going deeper in this area. This should be a major point of the discussion.
- There are a certain number of assertions in the discussion section: “better choice for improving BD prediction” (better than what?) (Line 240); “can be an efficient tool” (in what respect?) (Line 247); “greatly improved” (improved but not greatly) (Line 250); “performed better” (this should be more appropriately discussed) (Line 278); “would be accurate enough” (enough with respect to what consideration?) (Line 282). Such assertions that are not clearly supported by facts cannot be accepted.
- The authors do not always use the same abbreviation for the bulk density and the different pedotransfer functions (see comments along the text). There are also other abbreviations which vary in the text (see also comments along the text). This makes the text harder to read and understand. Please homogenize all the abbreviations throughout the whole text.
- There are several (too many) writing errors which reflect a lack of proofreading of the manuscript before submitting it. There is even an equation that is wrong in the text even though the calculations appear to have been carried out correctly. (Eq. 3, Line 174). There are enough co-authors to take care of this proofreading work. Please see comments along the text. It is not pleasant for reviewer’s work.
- Legends of Figures and Tables require to be much more informative.
Comments along the text.
Title: The discussion is restricted to the discussion of the topsoil bulk density (i.e. 0-20 cm). The question of the prediction of the bulk density concerns both the topsoil and subsoil horizons. I have no problem with focusing the prediction on the topsoil when the objective is predicting the soil organic carbon content because the stock is mainly located in the topsoil horizons. However, this should be indicated more explicitly in the title by using “topsoil bulk density” instead of “soil bulk density”. Then, I am wondering about the singular form for “pedotransfer function”. It would be more appropriate to use the plural form “pedotransfer functions”.
Line 35: “Additionally, BD plays a crucial role in calculating SOC storage” I recommend starting with a more general sentence like “Additionally, BD plays a crucial role in computing stock of water, chemical elements or compounds by soil surface unit or soil volume unit” and then focusing on SOC stocks.
Lines 38 & 39: “to acknowledge … cover patterns” This sentence is correct if you are speaking about the topsoil bulk density. For the subsoil bulk density, the latter closely varies according to soil texture. The authors should restrict this statement to topsoils.
Line 46: SOC is soil organic carbon content. The authors should add “content” to “SOC” and also to “clay, silt, sand” everywhere in the whole text.
Lines 79 & 80: “data under comparable environmental conditions” Very vague. Please, it is required to be more specific.
Line 83: “accuracy” What is “accuracy” in this paper. How is it expressed, discussed? See other comments about that point.
Line 96: All throughout the text the word “soil” is used when it is the “topsoil” (0-20 cm) which is discussed. It is necessary to avoid such an ambiguity.
Line 99: What do the authors mean by “a single laboratory”. If the analyses were performed in a single laboratory, please give information about this laboratory.
Lines 100 & 101: “-3” and “–1” require to be written in superscript.
Line 120: “Traditional” Is it appropriate? I do not think so. I would suggest using “Earlier published PTFs” or “PTFs from the literature”. I do not understand why these PTFs would be “traditional”. And what tradition are we talking about? Unclear and not adapted.
Line 125: In Table 2, four models are presented and numbered 1, 2, 3 and 4, whereas they are referred to as PFT-1, PFT-2, PFT-3 and PFT-4 in Figure 5 (Line 207) and Figure 6 (Line 214). I mention here that the correct abbreviation for “pedotransfer function” is “PTF” and not “PFT” as written in Figures 5 and 6. What is BD in Table 2? BDfine? Is SOM content defined as % by reference to soil mass or soil volume? Same question for the SOC content.
Line 134: “Here, 16 predictor variables” when there are 15 predictors mentioned in Table 3. Please check.
Line 135: Table 3. I do not understand using the three clay, silt and sand contents together (RFFull) when they are not independent predictor variables, their sum being equal to 100. For RFFRFS, sand content is not used. Please explain. “Elevation” in the table when it is “DEM” in the text (Line 116). Please homogenize. “EC” in the table when it is “CEC” in the text (Line 101). Please homogenize. “ELE” probably stands for “elevation”, but it is not defined in the text. This is confusing.
Lines 136 & 137: “Furthermore, we adopted … performance”. Please give at least one reference.
Lines 160 to 167: I recommend discussing errors using relative errors. Is the error 5%, 10%, 15% or more of the predicted value? Is there any relationship between the relative error and the type of land use? The discussion of the prediction quality would be thus much more relevant.
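The relative-error analysis recommended here could look like the following sketch. The function and variable names are hypothetical, the error is taken relative to the measured value (dividing by the predicted value instead, as the comment phrases it, is a one-line change), and grouping by a land-use label is only one possible stratification.

```python
from collections import defaultdict
from statistics import mean

def relative_errors(observed, predicted):
    """Absolute error of each prediction as a percentage of the measured value."""
    return [abs(p - o) / o * 100.0 for o, p in zip(observed, predicted)]

def mean_relative_error_by_group(observed, predicted, groups):
    """Mean relative error (%) per group label, e.g. per land-use type."""
    by_group = defaultdict(list)
    for err, label in zip(relative_errors(observed, predicted), groups):
        by_group[label].append(err)
    return {label: mean(errs) for label, errs in by_group.items()}
```

Reporting these per-group percentages alongside R2 and RMSE would show directly whether a given PTF is, for instance, within 10 % of the measured BD for one land use but not for another.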
Line 168: Is it BD or BDfine? Same question for Line 177 and Figure 2. This is really confusing.
Line 174: Equation (3) appears to be wrong. How is CFvolumefraction expressed? Does it range from 0 to 1? From 0 to 100? It should be “x (1 - CFvolumefraction)” without dividing by 100 if CFvolumefraction ranges from 0 to 1 or “x (100 - CFvolumefraction)/100” if CFvolumefraction ranges from 0 to 100. Required to be clarified and corrected.
Line 178: “with BD ranging from 0.20 to 1.89”. This needs to be discussed in the discussion section. For which type of topsoil do we encounter 0.20? Peat topsoils? And for 1.89? Stony topsoils? But are we talking about BDfine or BD including gravels and stones? This remains confusing.
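The two consistent forms of the coarse-fragment correction described in this comment can be written out as a short sketch. The manuscript's Equation (3) is not reproduced on this page, so the overall form is an assumption: SOC content in g kg-1, BD in g cm-3, layer thickness in cm and a result in kg m-2 (hence the unit factor 0.01, which is separate from the division by 100 discussed for CFvolumefraction). The function name and the `cf_is_percent` flag are hypothetical.

```python
def soc_stock_kg_m2(soc_g_per_kg, bd_g_per_cm3, depth_cm, cf, cf_is_percent=False):
    """SOC stock (kg m-2) for one layer, corrected for coarse fragments.

    cf is the coarse-fragment volume fraction, either in 0..1 (default)
    or in 0..100 when cf_is_percent is True.
    """
    cf_fraction = cf / 100.0 if cf_is_percent else cf
    if not 0.0 <= cf_fraction <= 1.0:
        raise ValueError("coarse-fragment fraction out of range for the chosen scale")
    # Unit conversion: (g kg-1 -> kg kg-1) * (g cm-3 -> kg m-3) * (cm -> m)
    # = 1e-3 * 1e3 * 1e-2 = 0.01
    return soc_g_per_kg * bd_g_per_cm3 * depth_cm * (1.0 - cf_fraction) * 0.01

# Both scales give the same stock for the same sample (made-up values).
print(soc_stock_kg_m2(20.0, 1.2, 20.0, 0.10))
print(soc_stock_kg_m2(20.0, 1.2, 20.0, 10.0, cf_is_percent=True))
```

Making the scale of the coarse-fragment term explicit in the function signature is one way to avoid the ambiguity the comment points out; whether BD here should be BDfine or whole-soil bulk density is the separate question raised for Lines 168 and 178.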
Line 180: “with the exception of clay soils”. First of all, you are talking about “topsoils” and not “soils” and then there are clayey topsoils in your dataset (see the triangle, Figure 2) and not so few. This requires to be rewritten.
Line 193: “Elevation” here when it is for RFFRFS in Table 3. Please homogenize.
Lines 196 to 203 (and elsewhere in the text, Figures 5 and 6 included): The abbreviations ML-PTFs and T-PTFs are used in the text, which is appropriate. I strongly suggest using local-RFFRFS-PTFs, local-RF-FULL-PTFs and so on for the other PTFs to homogenize and make the text easier to read and understand.
Line 205: The legend is not informative enough. Please avoid mentioning “eight PTFs”. This does not bring any information.
Line 207: Figure 5. As mentioned above, this is not “PFT” but “PTF”. The legend of the figure is not informative enough. Please avoid mentioning “eight PTFs”. This does not bring any information.
Line 214: Figure 6. Similar comments as in Figure 5. SOC stocks are expressed in kg cm-2, which is wrong. This should probably be kg m-2. When the authors write “observed SOC stocks”, I assume that they are speaking about values of SOC stocks which were computed using the measured values of SOC content and measured values of bulk density. And then, when they write “Predicted SOC stocks”, the values were computed using measured values of SOC content and the predicted values of bulk density. Whether I understood correctly or not, it is necessary to explain it clearly in the text.
Lines 249 to 252: This is not really true. The difference of R2 is not “around 2.0”. The highest difference of R2 recorded with T-PTFs and with the PTFs developed in the paper is 0.19 when we compare the smallest R2 recorded with T-PTFs and the highest R2 recorded with the PTFs developed in the paper (see values in Figure 5). On the other hand, the difference of R2 recorded with T-PTFs and with the PTFs developed in the paper is 0.14 when we compare the highest R2 recorded with T-PTFs and the highest R2 recorded with the PTFs developed in the paper (see values in Figure 5). I recommend writing something like “ranged from 0.14 to 0.19”, which is more appropriate.
Line 247: “can be an efficient tool” Meaning? Something with “can improve” would be much more appropriate.
Lines 270 & 271: “with a higher SOC commonly”. “with a higher SOC content commonly” is more correct. And “higher” than what? “larger” than what? “greater” than what? The use of the comparative form requires to say to what you compare.
Line 280: “>3 kg cm-2” Quite high. I assume this is “3 kg m-2”.
Line 280: “would be accurate enough” Why? Based on what? This requires to be clarified.
Line 280: “to topsoil” This is the only place where we are talking about topsoils and not soils.
I did not check that all the references cited in the text were in the reference list and vice versa.
Citation: https://doi.org/10.5194/essd-2023-493-RC3
AC3: 'Reply on RC3', Songchao Chen, 10 Mar 2024
We greatly appreciate your positive feedback and your constructive suggestions and comments on our manuscript. All your concerns have been carefully addressed in the revised manuscript, and we hope you are satisfied with our revision. Please find our point-by-point responses to your concerns in the attached document.
RC4: 'Comment on essd-2023-493', Anonymous Referee #4, 19 Feb 2024
The authors proposed an interesting topic that addresses the need for the availability of reliable data on soil properties that are crucial for many assessments of soil quality indicators. The authors, in addition to evaluating the performance in terms of accuracy of traditional PTFs and of four proposed machine learning (ML) based PTFs, assessed the impact of their accuracy on that of the estimated SOC stock. This is a very qualifying point of the manuscript in which a problem rarely considered is addressed. Indeed, neglecting the accuracy of input data in estimating soil carbon stock is a major problem that can lead to under- or over-estimation.
The manuscript is well organised and clear with a sound application of the methods used and it is not easy to find flaws beyond the few minor ones that have been pointed out by other reviewers.
Citation: https://doi.org/10.5194/essd-2023-493-RC4
AC4: 'Reply on RC4', Songchao Chen, 10 Mar 2024
We greatly appreciate your positive feedback and your constructive suggestions and comments on our manuscript. All your concerns have been carefully addressed in the revised manuscript, and we hope you are satisfied with our revision. Please find our point-by-point responses to your concerns in the attached document.
-
AC4: 'Reply on RC4', Songchao Chen, 10 Mar 2024
Status: closed
-
RC1: 'Comment on essd-2023-493', Asim Biswas, 13 Feb 2024
The manuscript from Chen et al. produced the European soil bulk density and organic carbon stock database (>15000 soil samples) using the recently released BDfine and CFvolumefraction data (around 6000 soil samples) from LUCAS 2018. Authors evaluated the model performance for BD using traditional pedotransfer functions (PTFs) and four proposed machine learning (ML) based PTFs, and found that ML based PTFs (R2 of 0.56-0.57) greatly improved the accuracy for BD prediction, and this is also much higher than previous PTFs for Europe using Hollis-type PTF (R2 of 0.41). For the first time, authors produced the European soil organic carbon stock data of topsoil (0-20 cm) for the year of 2018 and evaluated the impact of BD accuracy on the accuracy of soil organic carbon stock data. The produced data and relevant evaluation are of significant importance for informing more precise soil hydrological and biological modelling, so as to support Soil health by 2050 proposed by the European Commission. This manuscript is generally well-written with clear objectives and solid methodology, and therefore I suggest that it can be accepted for publication after minor revision.
Here are the specific comments:
Lines 100-101: Several symbols for the units should be superscripts, such as g cm-3, g kg-1. Please correct them throughout the manuscript.
Table 2: I think two digits would be enough for the R2 reported here, which is in-line with your previous summary in Table 1. It is also not clear whether the data used to evaluate these traditional PTFs are the same to machine learning PTFs? If not, the results would be not comparable. Please make it clear.
Line 140: Please specify the k here. 5-fold cross-validation? 10-fold cross-validation?
Figure 6: What do the colours mean here? More details should be provided in the figure captions.
Citation: https://doi.org/10.5194/essd-2023-493-RC1 -
AC1: 'Reply on RC1', Songchao Chen, 10 Mar 2024
We highly appreciate all your positive feedbacks as well as suggestive suggestions/comments on our manuscript. All your concerns have been carefully addressed in the revised manuscript, and hope you are satisfied with our revision. Please find our responses to your concerns one by one in the attached document.
-
AC1: 'Reply on RC1', Songchao Chen, 10 Mar 2024
-
RC2: 'Comment on essd-2023-493', Jingyi Huang, 17 Feb 2024
The authors presented a new model to estimate soil bulk density and evaluate the performance using the LUCAS database. Overall, the manuscript was well written. The improved model performance will significantly help the research community to derive more reliable soil carbon stock products. I have some minor comments below:
1. In the two existing bulk density models, the input data is SOM. Please explain how you estimate this from the LUCAS data where the listed variables only have SOC. If you used a conversion factor to estimate SOM from SOC, can you please run some simple uncertainty analysis for the four existing models? For example, if you replace the SOM input in models #3 and #4 with SOC, you will get two new models that only use SOC as inputs. Then, a simple inter-model comparison can be made by plotting the estimated bulk density from four models vs. measured bulk density using a range of SOC inputs.
However, if you used independent sources of SOM for models #3 and #4, some of the uncertainty may be attributed to analytical errors related to SOM and SOC measurements.
2. Please explain a bit more about the depth of the soil samples. It seems that you only build the models and compare your models with other models for the depth of 0-20 cm. My question here is that your machine learning models share the same climate/terrain predictors but different soil property predictors. What will happen if the authors want to estimate soil bulk density at a soil profile at different depths or even for 0-5 cm or 0-15 cm within the dataset you have? Will the coefficients stay the same for different depths? It may be helpful to publish another reduced model without climate/terrain predictors for an improved applicability/transferability of your more accurate models so that researchers can use them for bulk density estimation at different depths, just like the current models.
Of course, if people have access to depth-specific soil bulk density data, they can develop depth-explicit machine learning models to account for the effects of climate and terrain on bulk density at depths. However, I think the research community has not well studied this issue and it will be a very important topic for future collaboration.
Citation: https://doi.org/10.5194/essd-2023-493-RC2 -
AC2: 'Reply on RC2', Songchao Chen, 10 Mar 2024
We highly appreciate all your positive feedbacks as well as suggestive suggestions/comments on our manuscript. All your concerns have been carefully addressed in the revised manuscript, and hope you are satisfied with our revision. Please find our responses to your concerns one by one in the attached document.
-
AC2: 'Reply on RC2', Songchao Chen, 10 Mar 2024
-
RC3: 'Comment on essd-2023-493', Ary Bruand, 18 Feb 2024
This paper addresses the question of predicting the bulk density of soils, its absence in soil databases limiting our ability to move from soil mass characteristics (quantities per unit mass of soil) to characteristics expressed in relation to a volume of soil or to a surface area of soil for a given soil thickness. This is an extremely important subject. The soil databases have bulk density values for a minority of soils stored there, but enough to make the study possible of how it is possible to predict, using pedotransfer functions (PTFs), the bulk density using other characteristics of these soils for which the bulk density values are available. The objective is to have tools for predicting the bulk density using soil characteristics that are much more easily accessible than the bulk density. Here, the measured and predicted values of bulk density are then used to compute the stock of soil organic carbon. The latter are discussed according to the characteristics of the PTFs used and the characteristics of the soils, including their environmental characteristics. This is an article which deserves to be published in “Earth System Science Data” but which must first be corrected both in substance and form according to the comments which follow.
My main concerns are:
- I did not find a presentation of the way used to discuss the “accuracy” of the prediction of the bulk density and then of the soil organic carbon (SOC) content. This requires to be improved (see also comments along the text). “Accuracy” is discussed using the R2 and RMQS values alone. I recommend going deeper in this area. This should be a major point of the discussion.
- There are a certain number of assertions in the discussion section: “better choice for improving BD prediction” (better than what?) (Line 240); “can be an efficient tool” (To what respect?) (Line 347), “greatly improved” (improved but not greatly) (Line 250); “performed better” (this should be more appropriately discussed) (Line 278); “would be accurate enough” (enough with respect to what consideration?) (Line 282). Such assertions that are not clearly supported by facts cannot be accepted.
- The authors do not use always the same abbreviation for the bulk density and the different pedotransfer functions (see comments along the text). There are also other abbreviations which vary in the text (see also comments along the text). This does not make easy reading and understanding the text. Please homogenize all the abbreviations throughout the whole text.
- There are several (too many) writing errors which reflect a lack of proofreading of the manuscript before submitting it. There is even an equation that is wrong in the text even though the calculations appear to have been carried out correctly. (Eq. 3, Line 174). There are enough co-authors to take care of this proofreading work. Please see comments along the text. It is not pleasant for reviewer’s work.
- Legends of Figures and Tables require to be much more informative.
Comments along the text.
Title: The discussion is restricted to the discussion of the topsoil bulk density (i.e. 0-20 cm). The question of the prediction of the bulk density concerns both the topsoil and subsoil horizons. I have no problem with focusing the prediction on the topsoil when the objective is predicting the soil organic carbon content because the stock is mainly located in the topsoil horizons. However, this should be indicated more explicitly in the title by using “topsoil bulk density” instead of “soil bulk density”. Then, I am wondering about the singular form for “pedotransfer function”. It would be more appropriate to use the plural form “pedotransfer functions”.
Line 35: “Additionally, BD plays a crucial role in calculating SOC storage” I recommend starting with a sentence more general like “Additionally, BD plays a crucial role in computing stock of water, chemical elements or compounds by soil surface unit or soil volume unit and then focusing on SOC stocks.
Lines 38 & 39: “to acknowledge … cover patterns” This sentence is correct if you are speaking about the topsoil bulk density. For the subsoil bulk density, the latter closely varies according to soil texture. Please the authors should restrict to topsoils.
Line 46: SOC is soil organic carbon content. Please the authors should add “content” to “SOC” and also to “clay, silt, sand” everywhere in the whole text.
Lines 79 & 80: “data under comparable environmental conditions” Very vague. Please, it is required to be more specific.
Line 83: “accuracy” What is “accuracy” in this paper. How is it expressed, discussed? See other comments about that point.
Line 96: All throughout the text the word “soil” is used when it is the “topsoil” (0-20 cm) which is discussed. It is necessary to avoid such an ambiguity.
Line 99: What do the authors mean by “a single laboratory”. If the analyses were performed in a single laboratory, please give information about this laboratory.
Lines 100 & 100: “-3” and “–1” require to be written in superscript.
Line 120: “Traditional” Is it appropriate? I do not think so. I would suggest using “Earlier published PTFs” or “PTFs from the literature”. I do not understand in why these PTFs would be “traditional”. And what tradition are we talking about? Unclear and not adapted.
Line 125: In table 2, four models are presented and numbered 1, 2, 3 and 4 when there are mentioned as PFT-1, PFT-2, PFT-3 and PFT-4 in Figure 5 (Line 207) and Figure 6 (Line 2014). I mention here that the correct abbreviation for “pedotransfer function” is “PTF” and not “PFT” as mentioned in Figures 5 and 6. What is BD in Table 2? BDfine? SOM content is defined as % by reference to soil mass or soil volume? Same question for the SOC content.
Line 134: “Here, 16 predictor variables” when there are 15 predictors mentioned in Table 3. Please check.
Line 135: Table 3. I do not understand using the three clay, silt and sand contents together (RFFull) when they are not independent predictor variables, their sum being equal to 100. For RFRFS, sand content is not used. Please explain. “Elevation” in the table when it is “DEM” in the text (Line 116). Please homogenize. “EC” in the table when it is “CEC” in the text (Line 101)/ Please homogenize. “ELE” for probably “elevation” when it is not defined in the text. This is confusing.
Lines 136 & 137: “Furthermore, we adopted … performance”. Please give at least one reference.
Lines 160 to 167: I recommend discussing errors using relative errors. Is the error 5%, 10%, 15% or more of the predicted value? Is there any relationship between the relative error and the type of land use? The discussion of the prediction quality would be thus much more relevant.
Line 168: Is it BD or BDfine? Same question for Line 177 and Figure 2. This really confusing.
Line 174: Equation (3) appears to be wrong. How is expressed CFvolumefarction? Does it range from 0 to 1? From 0 to100? It should be “x (1 - CFvolumefarction)” without dividing by 100 if CFvolumefarction ranges from 0 to 1 or “x (100 - CFvolumefarction)/100” if CFvolumefarction ranges from 0 to 100. Required to be clarified and corrected.
Line 178: “with BD ranging from 0.20 to 1.89”. This needs to be discussed in the discussion section. For which type of topsoil do we encounter 0.20? Peat topsoils? And for 1.89? Stony topsoils? But are we talking about BDfine or BD including gravels and stones? This remains confusing.
Line 180: “with the exception of clay soils”. First of all, you are talking about “topsoils” and not “soils”, and then there are clayey topsoils in your dataset (see the triangle, Figure 2), and not so few. This needs to be rewritten.
Line 193: “Elevation” here when it is for RFFRFS in Table 3. Please homogenize.
Lines 196 to 203 (and elsewhere in the text, Figures 5 and 6 included): The abbreviations ML-PTFs and T-PTFs are used in the text, which is appropriate. I strongly suggest using local-RFFRFS-PTFs, local-RF-FULL-PTFs and so on for the other PTFs to homogenize the terminology and make the text easier to read and understand.
Line 205: The legend is not informative enough. Please avoid mentioning “eight PTFs”. This does not bring any information.
Line 207: Figure 5. As mentioned above, this is not “PFT” but “PTF”. The legend of the figure is not informative enough. Please avoid mentioning “eight PTFs”. This does not bring any information.
Line 214: Figure 6. Similar comments as for Figure 5. SOC stocks are expressed in kg cm-2, which is wrong. This should probably be kg m-2. When the authors write “observed SOC stocks”, I assume that they mean SOC stock values computed using the measured values of SOC content and the measured values of bulk density. And when they write “Predicted SOC stocks”, I assume the values were computed using the measured values of SOC content and the predicted values of bulk density. Whether I understood correctly or not, it is necessary to explain this clearly in the text.
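The distinction drawn in this comment, and the unit issue, can be made concrete with a short sketch. It assumes the commonly used form SOC stock (kg m-2) = SOC (g kg-1) x BD (g cm-3) x depth (cm) / 100 and ignores coarse fragments; this is an assumption for illustration, not necessarily the authors' equation. The input values and the function name are hypothetical; the 20 cm depth follows the 0–20 cm LUCAS topsoil layer.

```python
def soc_stock_kg_m2(soc_g_per_kg: float, bd_g_per_cm3: float,
                    depth_cm: float = 20.0) -> float:
    """SOC stock in kg m-2 (assumed form, no coarse-fragment correction).

    Unit check: (g kg-1 -> x 1e-3) * (g cm-3 -> x 1e3 kg m-3) * (cm -> x 1e-2 m)
    gives an overall factor of 1/100.
    """
    return soc_g_per_kg * bd_g_per_cm3 * depth_cm / 100.0


# Hypothetical sample: the SOC content is measured in both cases;
# only the bulk density differs between the two stock values.
soc_measured = 20.0   # g kg-1
bd_measured = 1.20    # g cm-3
bd_predicted = 1.30   # g cm-3, e.g. from a PTF

observed_stock = soc_stock_kg_m2(soc_measured, bd_measured)
predicted_stock = soc_stock_kg_m2(soc_measured, bd_predicted)
print(observed_stock, predicted_stock)
```

With inputs of this magnitude the result is a few kg m-2, which is consistent with kg m-2 rather than kg cm-2 being the intended unit.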
Lines 249 to 252: This is not really true. The difference in R2 is not “around 2.0”. The highest difference in R2 between the T-PTFs and the PTFs developed in the paper is 0.19, when we compare the smallest R2 recorded with T-PTFs and the highest R2 recorded with the PTFs developed in the paper (see values in Figure 5). On the other hand, the difference is 0.14 when we compare the highest R2 recorded with T-PTFs and the highest R2 recorded with the PTFs developed in the paper (see values in Figure 5). I recommend writing something like “ranged from 0.14 to 0.19”, which is more appropriate.
Line 247: “can be an efficient tool”. Meaning? Something with “can improve” would be much more appropriate.
Lines 270 & 271: “with a higher SOC commonly”. “with a higher SOC content commonly” is more correct. And “higher” than what? “larger” than what? “greater” than what? The use of the comparative form requires stating what you are comparing with.
Line 280 “>3 kg cm-2” Quite high. I assume this is “3 kg m-2“
Line 280: “would be accurate enough” Why? Based on what? This requires to be clarified.
Line 280: “to topsoil” This is the only place where we are talking about topsoils and not soils.
I did not check that all the references cited in the text were in the reference list and vice versa.
Citation: https://doi.org/10.5194/essd-2023-493-RC3
AC3: 'Reply on RC3', Songchao Chen, 10 Mar 2024
We greatly appreciate your positive feedback as well as your constructive suggestions and comments on our manuscript. All your concerns have been carefully addressed in the revised manuscript, and we hope you are satisfied with our revision. Please find our point-by-point responses to your concerns in the attached document.
-
RC4: 'Comment on essd-2023-493', Anonymous Referee #4, 19 Feb 2024
The authors address an interesting topic: the need for reliable data on soil properties that are crucial for many assessments of soil quality indicators. In addition to evaluating the accuracy of traditional PTFs and of four proposed machine learning (ML) based PTFs, the authors assessed the impact of PTF accuracy on the accuracy of the estimated SOC stock. This is a major strength of the manuscript, as it addresses a rarely considered problem. Indeed, neglecting the accuracy of input data when estimating soil carbon stock is a major problem that can lead to under- or over-estimation.
The manuscript is well organised and clear with a sound application of the methods used and it is not easy to find flaws beyond the few minor ones that have been pointed out by other reviewers.
Citation: https://doi.org/10.5194/essd-2023-493-RC4
AC4: 'Reply on RC4', Songchao Chen, 10 Mar 2024
We greatly appreciate your positive feedback as well as your constructive suggestions and comments on our manuscript. All your concerns have been carefully addressed in the revised manuscript, and we hope you are satisfied with our revision. Please find our point-by-point responses to your concerns in the attached document.
Data sets
European soil bulk density and organic carbon stock database using LUCAS Soil 2018 S. Chen et al. https://doi.org/10.5281/zenodo.10211883
Viewed
- HTML: 586
- PDF: 145
- XML: 36
- Total: 767
- BibTeX: 20
- EndNote: 17