the Creative Commons Attribution 4.0 License.
Colombian soil texture: building a spatial ensemble model
Viviana Marcela Varón-Ramírez
Gustavo Alfonso Araujo-Carrillo
Mario Antonio Guevara Santamaría
Download
- Final revised paper (published on 28 Oct 2022)
- Preprint (discussion started on 25 Feb 2022)
Interactive discussion
Status: closed
CC1: 'Comment on essd-2021-437', Fuat Kaya, 01 Mar 2022
The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2021-437/essd-2021-437-CC1-supplement.pdf
RC1: 'Comment on essd-2021-437', Anonymous Referee #1, 12 Apr 2022
General comments:
The manuscript “Colombian soil texture: Building a spatial ensemble model” by Varón-Ramírez et al. presents soil texture maps (clay, sand, and silt) for Colombia for different depth intervals by using and comparing different machine learning techniques. The authors compare their predicted maps with the global SoilGrid product for Colombia and provide maps that are based on the best model for each pixel. The soil data used to derive the maps and the final maps are provided as independent datasets/raster files and are easily accessible and usable. Unfortunately, the authors do not provide the code to reproduce their maps.
The manuscript and the related maps are unique and useful for the scientific community since, as stated by the authors, they provide the first Colombian soil texture maps obtained by spatial ensemble of national and global soil products. The methods are explained in detail (sometimes maybe too detailed for a general reader). However, I have quite a few comments regarding the manuscript and data presentation which I will outline below. Additionally, I think the language needs to be improved – sometimes the grammar and structure of the sentences are incorrect, which makes it a little bit difficult to always know what the authors are trying to say. The use of many abbreviations (which are mostly explained at the beginning of the manuscript) also sometimes makes it hard to follow the argumentation. Personally, I think it would help if you could just write out the words or use more intuitive acronyms.
Specific comments:
Abstract: Overall, I think the abstract is a little bit too long and detailed and should focus more on the novelty of the data products provided by this work and how they can be used. It is good that you describe what you did, however, it is probably not necessary to provide all the details about the model performance and comparison in the abstract.
Line 5: How are the depth intervals exactly defined? I assume 0–5, 5–15, 15–30, 30–60, and 60–100 cm.
Line 6: What do you mean by ‘stack’ in this context? Try to avoid using overly technical language in the abstract.
Line 6: Maybe better: “the most important” instead of “top”
Line 10: Maybe better: “smallest” instead of “fewest”
Line 15: What do you mean by “compared to other algorithms”? Aren’t all the methods you used spatial ensemble? Your abstract should really be understandable to a reader that is not that familiar with all the methods you used.
Line 18: Should be “SPF” instead of “PSF”.
Line 19: Without geographic context this information is difficult to follow. I think you can be less specific in the abstract and just say that the ensemble machine learning algorithms usually performed better, but in some regions the SoilGrid product also resulted in reliable predictions.
Introduction: The introduction gives a nice and comprehensive overview about the idea and state of the art of digital soil mapping in general and provides details about the methods used in this work. However, it can probably also be shortened a little bit. For example, the first two sentences (lines 29–31) are probably not needed.
Line 57: Maybe better: “Digital soil maps are derived from soil datasets that represent the continuous nature of soil variability.”
Line 88: Missing brackets around references.
Line 90: Not sure what you mean with “what are the best big-data management strategies for generating high-spatial resolution maps across large areas?”. Your manuscript is not really about data management, but predicting soil maps.
Line 101: Is your objective really to develop a digital soil texture dataset? As far as I understand, the soil data already exists and you are applying machine learning techniques to create digital soil maps. So, your objective should be more about the maps than the soil data. Maybe I am misunderstanding something here and you just have to be more clear what you did for this work and what is based on previous work.
Line 102: Are these soil data already part of any international soil databases (e.g. ISRIC, ISCN) so that they can also be used easily by other researchers? I really encourage you to put your dataset in one of these international soil databases if you haven’t done it yet.
Line 117: Awkward phrasing: What do you mean with “positive implications”
Methodology: This section provides a good overview about the applied methods. In general, I think that this section can be improved by focusing more on the actual methods rather than the R packages and functions that were used. If you provide the code to reproduce your maps this information can all be presented in the R script. Please also make sure that you always correctly cite the R packages that you are using – the reference is quite often missing.
Line 120: Awkward phrasing: “A total of five major steps”. Maybe better: “Our workflow contains five major steps, including … which will be discussed in detail below”.
Line 124: Not needed: “Soil particle-size fractions (PSF) such as clay, sand, and silt were collected, including geographical coordinates (EPGS: 4326).” You can just write in the sentence before: “A total of 4,203 georeferenced (EPGS: 4326) soil profiles were collected from … that all contained information about particle-size fractions (clay, sand and silt).”
Line 125: Did you create these geographic regions or are they defined somewhere? If so, please provide the reference.
Line 130: Awkward phrasing of first sentence. Maybe better: “Dataset quality was ensured by i) sum of particle-size data equals 100 % and ii) no overlapping sampling depth”. By definition, two soil horizons cannot be overlapping. Also, what about the number of samples for each soil profile? Your Figure 1 shows that some profiles only have data down to 5 cm. Did you exclude any profiles that contained, e.g., fewer than 3 measurements or that did not reach a certain depth? If not, how did you treat these samples?
Line 131: “to” instead of “at”
Line 133: Citation for R package aqp is missing
Line 136: Great that you are transforming the compositional data. However, I think you need to provide some more background information why this is necessary and why you decided to use the additive log-ratio transformation (especially for non-experts in this field). It would also be helpful if you could provide some details on how to interpret the transformed values.
Line 142: Earlier you wrote, you applied the additive log-ratio transformation, now you are mentioning inverse additive log-ratio transformation. Please provide some more information what this is and why you are doing this. Also, in the published dataset it is not clear to me which of these transformations refers to transformation_1 and transformation_2. You need to provide this information in the metadata file and in the manuscript and maybe also think about a more self-explanatory naming of these two variables.
Line 146: Could you elaborate a little bit on why clay is used in the denominator? Just because it is used by other studies is not necessarily sufficient as an explanation.
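To make the denominator question concrete, here is a minimal sketch of the ALR transform and its inverse (my own illustration, not the authors’ code; the mapping of trans_1/trans_2 to the sand and silt ratios is an assumption the authors should confirm). Any of the three parts can serve as the denominator; the transform is invertible either way, so the choice matters for model fitting rather than for invertibility.

```python
import numpy as np

def alr(sand, silt, clay):
    # Additive log-ratio with clay as the (assumed) denominator part:
    # trans_1 = ln(sand/clay), trans_2 = ln(silt/clay)
    return np.log(sand / clay), np.log(silt / clay)

def alr_inv(t1, t2):
    # Inverse ALR: back-transform to fractions that sum to 1
    e1, e2 = np.exp(t1), np.exp(t2)
    denom = 1.0 + e1 + e2
    return e1 / denom, e2 / denom, 1.0 / denom

# Round trip: the original composition is recovered exactly
t1, t2 = alr(0.40, 0.35, 0.25)
sand, silt, clay = alr_inv(t1, t2)
```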
Line 147: Missing citation for the R package Compositional.
Line 153: Table 1: Number of covariates does not match with the description of the covariates, e.g. soil has 28 covariates but only 5 are mentioned. A detailed list with precise data sources should be provided in the supplement. Also, the acronym GSI is not explained. What are the land categories you used? Are the extracted years matching with the year of sampling?
Line 154: Table ?? – missing cross-reference
Line 154: How did you do the adjustment to 1 square kilometer? Could you provide some more details here, including uncertainties?
Line 155: Again, explain what you mean by “stack”
Line 156: Missing citation for the R package caret
Line 156 ff: The idea/method behind the recursive feature elimination is not clear to me. If I understand it correctly, you first built a model with all covariates and then selected the most important predictors (based on what?). How is it possible that you only then extract the values for the covariates at the profile level? Maybe I am missing something here. So, if this is a common approach it is probably fine, but I cannot really evaluate this.
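For context, recursive feature elimination is usually an iterative fit-rank-drop loop. A minimal numpy sketch (my own illustration; the manuscript apparently used caret, whose rfe ranks by model-specific variable importance rather than the least-squares coefficients used here):

```python
import numpy as np

def recursive_feature_elimination(X, y, n_keep):
    """Minimal RFE sketch: repeatedly fit least squares on standardized
    covariates and drop the one with the smallest absolute coefficient."""
    cols = list(range(X.shape[1]))
    while len(cols) > n_keep:
        Xs = X[:, cols]
        Xs = (Xs - Xs.mean(0)) / Xs.std(0)
        coef, *_ = np.linalg.lstsq(Xs, y - y.mean(), rcond=None)
        cols.pop(int(np.argmin(np.abs(coef))))   # eliminate weakest covariate
    return cols

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                   # 10 candidate covariates
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=200)
kept = recursive_feature_elimination(X, y, n_keep=5)
```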
Line 166: Did you also consider different training and test datasets? In your introduction you mention spatial cross-validation which is probably a good approach for your data given its clustered nature. Could you please provide some more details about the bootstrapping technique since the splitting of the data is crucial for the validation of your predictions later.
Line 165: Space missing
Line 167: Space missing
Line 168–191: After reading this section I am not quite sure if I fully understand your methods. The description of the two R packages is quite technical and detailed; however, I am missing a somewhat more general description of the methods that is not restricted to the two R packages. Also, at the end of the section you mention spatial cross-validation, but earlier you talked about bootstrapping the data. I have some difficulty following what you did at each step and why. Maybe you can emphasize this a little better. A flow chart might also help to guide the reader. Yet, I also have to admit that I am not really an expert in this field of digital soil mapping and other reviewers may be able to evaluate it better.
Line 194: As mentioned by Fuat Kaya, why did you resample the SoilGrid data and not just extract it for the sampling locations of your soil profiles?
Line 195: Maybe better: “Next” instead of “After”
Line 195: Did you do the validation for the SoilGrid product only or, as I assume, also for your own predictions? This is not really clear based on the sentence.
Line 196–203: You do not provide an explanation/definition for the concordance correlation coefficient, but you do for all other measures mentioned here. Additionally, the whole section (lines 194–203) is not written that clearly: first you talk about comparing your results to the SoilGrid products and then you talk about the validation measures you used for your own predictions and the SoilGrid product. Again, try to streamline the description of your different steps so that it becomes clearer what you did when and why. I really think a flow-chart-type figure would help here.
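For reference, Lin’s concordance correlation coefficient, which the manuscript should define alongside the other measures, follows a standard formula. A short sketch (my illustration, not the authors’ code):

```python
import numpy as np

def concordance_cc(obs, pred):
    """Lin's concordance correlation coefficient:
    CCC = 2*cov(obs, pred) / (var(obs) + var(pred) + (mean difference)^2)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    sxy = np.mean((obs - obs.mean()) * (pred - pred.mean()))
    return 2 * sxy / (obs.var() + pred.var() + (obs.mean() - pred.mean()) ** 2)
```

Perfect agreement gives 1, while a constant bias pulls the CCC below the Pearson correlation; that penalty for bias is what distinguishes the two measures.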
Line 206: You have not introduced the term “independent residual” yet. Why did you not just use the prediction error terms? The interpolation was probably done for the final map – if so, say so here. Can you provide some more details about the kriging? There are a lot of different ways of doing it.
Line 210–215: Could you please provide some references for this section. Maybe you can also state more clearly that the spatial ensemble was based on your two models and the SoilGrid product.
Results: Overall this section is quite descriptive (which is ok for a result section), however, I think the section could be improved by providing some more context and by focusing on the main results.
Line 225: Table ?? – missing cross-reference; not clear which transformation refers to which
Line 225: Awkward phrasing: “the minimum contents were 1% or less for 5 standard depths” Maybe better: “For all depths, the particle-size fractions ranged from circa 0 to more than 90 %, except for silt, which only ranged from circa 0 to 80%”. However, I think you can also just say that the particle-size fractions are covering more or less the entire range, which is to be expected for continental-scale analysis.
Line 234–236: This is repetition from the method section and probably not needed in the result section; Table ?? – cross-reference missing
Line 238: You have not defined the acronyms TEM, RH, and PPT, and units are not provided for the covariates. Did you scale the covariates before using them? As I wrote earlier, you probably need to provide a table with all the covariates and a description of them, including units and sources.
Line 242: Table ??: missing cross-reference
Line 254: Table ??: missing cross-reference
Line 262: Fig. 4–8 instead of listing all figures
Line 265–274: It is not always easy to follow which model was best where, which is partly also due to incorrect grammar. I encourage you to have someone check the language and grammar throughout your manuscript. For example, what do you mean with “MACHISPLIN had representation in all natural regions, and in the deepest layers”?
Line 268: It should read MACHISPLIN instead of MACHISPLIS
Line 275: Maybe better: “In terms of SG” instead of “Concerning SG”
Line 278: Table ??: missing cross-reference; missing space
Line 279: It should read “On the other hand”
Discussion: This section provides some context for the results and also discusses limitations of the data and methods. However, it sometimes also repeats things from the introduction and result sections, which is probably unnecessary.
Line 286: “Soil texture is a key property required for many applications in environmental sciences” – This sentence is not adding anything new and the statement was already made in the introduction.
Line 288: Delete the second “previous”; what do you mean with the word “detail” in this context?
Table 4: It should read “Root mean square error”
Table 5: Not clear what is meant by “adjusted parameters”; the table is showing the validation terms for clay, sand and silt for the five depth intervals. The acronyms are also not defined.
Line 290: Missing reference. Without any references it is difficult to follow your argumentation here. Also, this is probably material for the introduction and not really for the discussion.
Line 290–296: This is a description of your methods and could probably be moved to the beginning of your result section to summarize what you did before presenting the results.
Line 297: Does the soil diversity really change with depth?
Line 297–312: This seems to be more part of the result section than the discussion section.
Line 327: “with” instead of “whit”
Line 330: You are not providing any details why Araujo-Carrillo et al. 2021 provides the best example. I think in this section (line 323–343) you don’t need to describe the methods of the other studies rather focus on comparing your results with their results and discussing which improvements you achieved and why a direct comparison is not that easy. You can then just briefly talk about differences and similarities at broad spatial scale.
Line 343–355: I think this section could also be part of the introduction (and is partly already in it) as a justification for the methods you used in this work.
Line 350: “than” instead of “that”
Line 356–369: Again, you provide a lot of details about the methods from other studies and yet, I don’t think that this is necessary. In general, I like the idea of comparing error terms between the different studies, but I am not convinced that it needs to be done in such detail. Also, you have not mentioned this method (the error term comparison) anywhere else in the manuscript and it comes a little bit unexpected. So, I think you either need to set it up better earlier in the manuscript or just provide a general discussion which study/method had better/worse error terms and what the reason for this is.
Line 372: I think a word like “only” is missing between “but” and “in”
Line 374–376: Could you be more precise here? What do you mean with “in general terms” and “with good quantitative statistics”
Line 383: Maybe better: “for the entire country of Colombia”
Line 384: Sentence structure: “However, the differential factor included maps that represent the best model (EML or SG) in each area of the country at different depths, called in this work spatial ensembled.”
Line 386: Unit % is missing; could you maybe discuss here what approach could overcome some of these limitations?
Line 389: What do you mean by “new and great challenges”
Line 391: Do you really think adding covariates will improve the predictions? You already tested 83 covariates. If you think you are missing important covariates, you should elaborate on why they are important and why they are missing.
Line 392: “with” instead of “whit”
Line 396: What do you mean by “homosoil”?
Conclusions: The conclusion section mainly repeats statements from the result and discussion sections. Maybe think of some new aspects that could conclude the manuscript and only give a short summary of the main results and how they relate to previous work.
Line 403: Is your map "just" better or does it reveal new patterns of soil texture in Colombia? Maybe you can add 1–2 sentences here (or earlier in the result/discussion section).
Data availability: The provided links to the dataset and maps are working and everything can be downloaded easily. Please consider also to publish your code in order to fulfill the FAIR principles.
Line 421: You have not clarified what trans_1 and trans_2 are and this information is also missing in the metadata of your published data.
References: For some of the references the doi link is not correct, e.g. line 442
Figures:
Figure 1: Maybe think of using a different color scheme since red and green are not distinguishable for many people. www.colorbrewer2.org provides a nice tool for picking colorblind-friendly color schemes for maps.
Figure 2: Could you provide this figure in a better resolution?
Figure 3: Again, think of using a colorblind-friendly color scheme. Why are you not providing the maps for each model and each depth? It is not clear from the figure legend (which should stand on its own) why landmap is used for 5 and 30 cm and MACHISPLIN for the other depths in the same figure.
Figure 4–8: Colorblind-friendly scheme; the scale for clay, sand, and silt should always have the same range, otherwise it is impossible to compare the maps with each other. Also, the contrast for the color range is not ideal; differences are difficult to see in the maps.
Citation: https://doi.org/10.5194/essd-2021-437-RC1
RC2: 'Comment on essd-2021-437', Anonymous Referee #2, 24 May 2022
This is a very interesting study and I enjoyed reading it. The authors have applied some very novel ensemble machine learning algorithms to predict soil particle size fractions across the entire country of Colombia. The idea of also incorporating global predictions like SoilGrids with national model predictions is another novel application that can help to improve national digital soil maps in areas where training data is limited. The authors also treat the soil particle size fraction data as compositional data, using ALR transformed variables in their ensemble models. I think that this study has a lot of potential but there is one major issue that needs to be addressed.
Based on my understanding of the spatial ensemble approach presented here, this approach has some fundamental issues that need to be corrected. As correctly stated by the authors, soil particle size fraction data is compositional data. The authors appropriately used the ALR transformation for predicting the different PSFs. However, once the data is back transformed to sand, silt, and clay fractions, the authors then treat each fraction independently, thus ignoring their compositional nature. In their spatial ensemble, at a given pixel they may take the predicted clay from SoilGrids, sand from MACHISPLIN, and silt from Landmap, thus ignoring the interdependence of the predictions within a given model. From visually evaluating figures 4-8, it appears that this can result in pixels where modeled sand, silt, and clay percentages far exceed 100 percent. This observation was later confirmed in the manuscript on lns 385-387 and lns 410-411. Spatial ensembling of compositional data requires the preservation of the compositional structure of the data and the authors’ current approach violates this. I see this as a fundamental flaw in the current approach that can be corrected.
I can see two possible solutions to this problem:
- Each individual pixel would only be populated with a single model for all PSFs, e.g., all SoilGrids or all Landmap. Model selection would be based on the lowest model error averaged across the three PSFs.
- Perform the spatial ensemble on the ALR transformed values (Trans_1 and Trans_2) for MACHISPLIN, Landmap, and SoilGrids. I’m not familiar with all of the processing steps in the MACHISPLIN algorithm, but a similar approach could be applied for interpolating error and selecting the best model.
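The first solution can be sketched in a few lines (my own illustration with toy arrays, not the authors’ pipeline): one model index per pixel, chosen by the error averaged over the three fractions, so clay, sand, and silt are always taken together from the same model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, n_fracs, h, w = 3, 3, 4, 5   # e.g. SoilGrids/MACHISPLIN/Landmap; clay/sand/silt

preds = rng.uniform(0, 100, size=(n_models, n_fracs, h, w))   # toy predicted maps
errors = rng.uniform(size=(n_models, n_fracs, h, w))          # toy kriged error maps

# Pick ONE model per pixel: lowest error averaged across the three fractions,
# then populate all three fractions from that model, preserving any
# within-model compositional consistency.
best = errors.mean(axis=1).argmin(axis=0)                     # (h, w) model index
ensemble = np.take_along_axis(preds, best[None, None], axis=0)[0]   # (n_fracs, h, w)
```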
Another issue with the current spatial ensemble approach is that it produces lots of spatial artifacts (e.g., circle and blob patterns) which is likely due to the patchwork of models used to populate each pixel. Applying some type of model weighting, like the MACHISPLIN algorithm does, might improve these results. Another potential reason for the spatial artifacts seen in Figs 4-8 (e.g., vertical striping running north-south) is likely due to the ML algorithms. Documentation for the MACHISPLIN algorithm states that Boosted Regressive Trees and Random Forests models can produce blocky outputs, thus MACHISPLIN provides the option to exclude those two models from the ensemble using the model parameter ‘smooth.outputs.only’.
There are many more ‘minor’ issues that the authors should address which I list below:
Specific comments:
Introduction: The introduction is long and overly general. The authors should focus on topics directly relevant to this study, including ensemble modeling, spatial cross validation, and modeling compositional data. The section starting on line 59 and extending to the sentence ending on line 96 does not add to the manuscript and can be replaced with more relevant background text. For example, this section provides a very general discussion of geostatistics which isn’t directly relevant to this study and discusses current research questions in DSM that are not addressed in this study.
Lns 60-61. Are there examples of unsupervised statistical learning for soil PSFs or texture classes? If not remove this statement.
Lns 100-101. Which accuracy indicators? The ones previously stated? If so, then say it rather than generally listing indicators and then not stating which ones you used.
Ln 123. From Fig. 1 it appears that not all sample locations were sampled at all depths. I assume this was due to the presence of shallow soils? If so please state this. Also, please provide a breakdown of the samples (training and validation) represented for each modeled depth. This could be easily included in Fig. 2.
Ln 124. ‘Soil particle-size fractions (PFS)’ -- acronym not consistent with abstract, i.e., SPF, soil particle fraction
Ln 145. Equations 1 and 2 need to be better defined, i.e., all equation parameters need to be explicitly defined. It is not clear in this particular application how Trans_1 and Trans_2 were calculated. You state that clay was used as the denominator variable, so does that make the denominator in equation 1 (zeta-D) equal to the clay fraction. Yet on line 141 you state that D =3, and i=3 (D) represents the silt fraction. Please explicitly define how Trans_1 and Trans_2 are calculated.
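For what it is worth, my reading of the ALR setup is the following (an assumption on my part – the authors should confirm the mapping of Trans_1 and Trans_2, and whether clay or silt is really the denominator part):

```latex
\mathrm{alr}(\zeta)_i = \ln\frac{\zeta_i}{\zeta_D}, \qquad i = 1, \dots, D-1,
```

with $D = 3$, $\zeta = (\zeta_1, \zeta_2, \zeta_3) = (\text{sand}, \text{silt}, \text{clay})$ and $\zeta_D = \text{clay}$, which would give $\text{Trans\_1} = \ln(\text{sand}/\text{clay})$ and $\text{Trans\_2} = \ln(\text{silt}/\text{clay})$.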
Table 1. Information on the spatial resolution of covariate data is missing for several sources, e.g., soil index, sand and clay mineralogy, Landsat. It would also be helpful to provide an approximate grid cell resolution equivalent to the 1:100,000 map scale. Also, many references in the table are missing.
Ln 154 and throughout manuscript. Table citations are missing table numbers.
Ln 154. ‘adjusted to 1 square kilometer’ Which upscaling method was used, e.g., nearest neighbor? bilinear?
Lns 160-162. Please provide additional details about this bootstrapping technique. Does it account for ranges in covariate space when splitting the samples? Was spatial autocorrelation accounted for when creating the training/testing split? Based on your statement on lns 189-190, without accounting for spatial autocorrelation, your training and testing datasets are not independent. On the other hand, there are arguments against the use of spatial cross validation, see https://doi.org/10.1016/j.ecolmodel.2021.109692
Also, for reference, please provide the number of samples in training and validation sets, i.e., (75%, n=???)
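For orientation, a plain 75/25 split of the 4,203 profiles mentioned in the manuscript would give the following counts (a sketch assuming simple random splitting, which the authors should confirm; a spatially blocked split would assign whole clusters to the hold-out instead):

```python
import numpy as np

rng = np.random.default_rng(0)
n_profiles = 4203                       # profiles reported in the manuscript

# Simple (non-spatial) 75/25 split as one plausible reading of the
# bootstrapping step described in the methods.
idx = rng.permutation(n_profiles)
n_train = int(0.75 * n_profiles)        # 3152 training profiles
train, test = idx[:n_train], idx[n_train:]
```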
Lns 169-171. It would be good to provide additional details comparing the two ensemble modeling techniques. For example, the Landmap algorithm applies a stacking ensemble approach using 5 base learners and a 'meta model' or super learner to produce an ensembled prediction. What type of super learner was used? Also, how does the stacking ensemble compare to the weighting approach applied in the MACHISPLIN algorithm? How were the model weights calculated? These types of details are more relevant to this study than the very general discussion of machine learning vs geostatistics presented in the introduction.
Lns 171-174. This statement is not clear. Is the residual error interpolated for each model? Are these error surfaces used to determine the model weighting in the final ensemble? The details of how this is done need to be made clearer.
Ln 187. How is the cross validation used to determine the meta-learner?
Lns 189-191. The authors have done a nice job of citing recent work relevant to this study. In regards to the use of spatial cross validation, there has been recent debate as to its appropriateness for map validation. It might be good to reference this here.
https://doi.org/10.1038/s41467-020-18321-y
https://doi.org/10.1016/j.ecolmodel.2021.109692
Lns 194-195. So resampled from 250m to 1km? It is helpful to state this, as well as the resampling method.
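If block averaging was the upscaling method, the 250 m → 1 km step amounts to aggregating 4×4 blocks of cells (an assumption on my part; nearest-neighbor or bilinear resampling would give different results and should be stated):

```python
import numpy as np

def aggregate_mean(grid, factor):
    """Aggregate a fine grid to a coarser one by block averaging,
    one plausible way to move from 250 m to 1 km cells (factor=4)."""
    h, w = grid.shape
    trimmed = grid[:h - h % factor, :w - w % factor]   # drop ragged edges
    return trimmed.reshape(h // factor, factor,
                           w // factor, factor).mean(axis=(1, 3))

fine = np.arange(64, dtype=float).reshape(8, 8)   # toy "250 m" grid
coarse = aggregate_mean(fine, 4)                  # toy "1 km" grid
```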
Lns 206-207. Kriging assumes some spatial autocorrelation among the errors. Was this the case? It might be helpful to provide the semivariograms. Did you consider using a thin-plate spline approach similar to the MACHISPLIN method?
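For reference, an empirical semivariogram of the residuals would directly show whether the autocorrelation assumption holds. A generic sketch (my illustration, not tied to any particular R package):

```python
import numpy as np

def empirical_semivariogram(coords, values, bins):
    """Empirical semivariogram: for each distance bin, gamma(h) is the mean
    of 0.5 * (z_i - z_j)^2 over point pairs whose separation falls in that bin."""
    coords = np.asarray(coords, float)
    values = np.asarray(values, float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    half_sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)        # count each pair once
    dist, half_sq = dist[iu], half_sq[iu]
    which = np.digitize(dist, bins)
    return np.array([half_sq[which == b].mean() if np.any(which == b) else np.nan
                     for b in range(1, len(bins))])
```

A flat semivariogram (pure nugget) would indicate that ordinary kriging of the residuals adds little over a constant correction.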
Lns 212-214. The accuracy of this approach depends on the accuracy of your kriged error maps. It seems like applying a model weighting approach similar to MACHISPLIN might provide a better result rather than select the model with the lowest error at each pixel.
Ln 242. 'Boundary adjustment parameters'? I'm not sure what you mean by this. It is not referenced anywhere else. Do you mean Accuracy metrics or indices?
Tables 4 and 5. 'Adjustment parameters'? Why are these model accuracy metrics referenced as adjustment parameters? Also, it should be stated here that these accuracy statistics are based on the validation dataset.
Table 5. Why was CC for clay at 5cm lower than either MACHISPLIN or Landmap for that depth and fraction? I would have thought the spatial ensemble would select the most accurate model for each site and therefore produce more accurate results relative to the other models. There are other instances of this among the depths and fractions. Could this be a result of combining PSFs from different models?
Ln 292. Is this a reference to the predicted map uncertainty for SoilGrids? Was this uncertainty evaluated? It would be interesting to see uncertainty maps of SoilGrids PSFs for this area. This is also an important aspect of digital soil mapping not addressed in this paper. Since SoilGrids quantifies model uncertainty this would be an interesting point of comparison to national model results.
Lns 299-301. Fig. 2 is presented in black and white and as a low-resolution figure, making it difficult to interpret.
Lns 385-387. Was the data normalized to 100 after the spatial ensemble? This might explain why some of the PSF at certain depths had lower performance relative to the EML models. I see this as a major problem with your spatial ensemble approach.
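If the fractions were renormalized after the ensemble, it would look like the sketch below (my illustration; note that closing the composition this way is only a post-hoc fix and cannot restore the within-model consistency lost by mixing fractions from different models):

```python
import numpy as np

def close_to_100(clay, sand, silt):
    """Rescale the three fractions at each pixel so they sum to 100 %."""
    total = clay + sand + silt
    return 100 * clay / total, 100 * sand / total, 100 * silt / total

# Toy pixel where mixed-model fractions sum to 120 %
clay, sand, silt = close_to_100(np.array([30.0]), np.array([50.0]), np.array([40.0]))
```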
Lns 402-403. This was seen in Table 4 with the validation statistics. What is missing is a visual comparison of the two EML algorithms. In Fig. 3 you show either Landmap or MACHISPLIN but not both. I would like to see a figure similar to Fig. 3 but showing Landmap, MACHISPLIN, SoilGrids, and the spatial ensemble at one or two depths.
Lns 410-411. Evaluating model accuracy in these areas is tricky because there is limited data to accurately model the spatial distribution of model error. Using ordinary kriging won't do a great job in these data sparse regions.
Lns 414-416. It is good to see that the authors recognize this issue with the spatial ensemble approach. However, I see this as a major flaw that diminishes or even removes the prior efforts to account for the compositional nature of the data. This could have been avoided using one of the alternative approaches I outlined above.
Citation: https://doi.org/10.5194/essd-2021-437-RC2
RC3: 'Comment on essd-2021-437', Anonymous Referee #3, 25 May 2022
Soil texture is one of the fundamental soil properties. Obtaining better estimates of soil texture for sites without samples is the key to generating soil texture maps. The study presented texture maps for Colombia at five standard depths (5, 15, 30, 60, and 100 cm) obtained via spatial ensemble of national and global digital soil mapping products. A comparison of newly developed soil texture maps with the previous maps (SoilGrids, SG) showed the improvements of the new maps. The datasets shared in the study are valuable.
- What are the differences between “SPF” and “PSF”? Line 18, what is “PSF”?
- A thorough reading is suggested. It seems that some information is missing, such as “Table ??” on Line 154.
- Table 3, it is shown that the top five Covs were selected. Why five Covs? How much variance is explained by the five Covs?
Citation: https://doi.org/10.5194/essd-2021-437-RC3
RC4: 'Comment on essd-2021-437', Anonymous Referee #4, 26 May 2022
This study used digital soil mapping framework to generate the first texture maps at five depths of Colombia. They used additive log-ratio (ALR) transformation on sand, silt, clay content to develop models. Two ensemble machine learning methods (MACHISPLIN and landmap) and predicted maps from SoilGrids were compared and a spatial ensemble function was created to select the best model for each pixel in the final maps.
I have some specific questions:
Introduction: The introduction is too long, and it should be significantly shortened. Some general sentences can be removed, for example, lines 97–101, Models or algorithms for digital soil mapping are evaluated …
The language should be improved. Some sentences are incorrect. For example, line 109, “Understanding which are the prediction algorithms and approaches yield lower error levels at the pixel level…” delete “are the”.
Methodology: This section is clear and easy to follow, but more details should be provided.
Line 123, how did you sample the profiles? Based on identified horizons in the field?
Line 124, what method was used to measure the clay, sand, and silt in the laboratory?
Line 131, five standard depths (5, 15, 30, 60, and 100 cm). I think it is better to change it to 0–5, 5–15, 15–30, 30–60, 60–100 cm throughout the paper.
Line 146, why was clay used as the denominator? Is there any difference if you use sand or silt?
Line 150, 83 environmental covariates: please provide information on how you obtained these covariates and a description of them. For example, there are 6 lithology and 10 soil order covariates – what are these? What are the 5 oblique geographic coordinates? How did you adjust the covariates to 1 square kilometer? Which resampling method did you use?
Line 154, Table ??. None of the Tables are clearly mentioned in the text.
Line 194, add the original resolution of SG and state which method you used for the resampling. Do you know roughly how many profiles SG used in the Colombian area?
Line 195, “external validation”. I think it’s more common to say “independent validation”.
Line 200, is AVE the same as R2 (coefficient of determination)?
Line 206, so the error maps in Figs. 4, 5, 6, 7, 8 are from interpolation of residuals using ordinary kriging. In a widely used method – regression kriging – the final map is obtained by adding up the regression map and the residual kriging map. Have the authors considered doing the same thing? In addition, the error map is not the same as an uncertainty map. Have you considered calculating model uncertainty using other methods, e.g., bootstrapping?
Results:
Line 250, it should be “decreased with increasing depth”. Similarly, line 257, “the RMSE increased with increasing depth”.
Table 4 and 5, why do you use “adjustment parameters”?
Discussion:
Line 288, “While many studies focus on mapping soil properties such as pH and organic matter, less studies focus on comparing and testing approaches for maximizing accuracy.” I think there are many papers focusing on method development and improving the accuracy.
Line 298, “As depth increases, the soil thins, and the proportion of clay and silt rises.”. Unclear sentence. What does “soil thins” mean?
Line 380, Fig. 1 is first cited in the discussion. It should be mentioned before Fig. 2.
Citation: https://doi.org/10.5194/essd-2021-437-RC4
-
AC1: 'Comment on essd-2021-437 Final response', Viviana Marcela Varón Ramírez, 15 Aug 2022