the Creative Commons Attribution 4.0 License.
Soil information and soil property maps for the Kurdistan region, Dohuk governorate (Iraq)
Abstract. We present the first detailed soil property maps at multiple depths for the northwestern autonomous Kurdistan region of Iraq (Dohuk). A total of 532 soil samples from 122 sites were collected at five depth increments (0–10, 10–30, 30–50, 50–70, and 70–100 cm), and their mid-infrared (MIR) spectra were measured. A subset of 108 samples, selected via Kennard–Stone sampling, was analysed in the laboratory for ten soil properties. A Cubist model was trained on these measured values and used to predict the soil properties of all samples from their MIR spectra. Digital soil mapping was conducted using various machine learning regression techniques (ensemble learning, linear classifier, nearest neighbour classifier, decision trees), trained on the predicted soil properties and using a total of 85 covariates at 25 m pixel resolution, resulting in 50 prediction maps in total. Results were compared with the SoilGrids 2.0 product and a regional texture model. Soil depth was also mapped using a quantile random forest with 26 covariates. Our regional model outperformed global SoilGrids 2.0 predictions in resolution and accuracy, with texture RMSEs (sand: ∑RMSE = 9.35; silt: ∑RMSE = 6.8; clay: ∑RMSE = 10.28) comparable to local models. Quantile random forest achieved the best performance in 51 % of the models, and key predictors included Sentinel-2 SWIR, EVI, NDVI, and SAVI. Spatial patterns reflected the contrast between the flat areas of the Simele and Zakho plains and the shallower and steeper Little Khabur Valley and anticline formations. Furthermore, the soil depth prediction model (R² = 0.57; RMSE = 2.59 cm^-0.5) showed a strong correlation with slope and a similar spatial pattern, with deeper soils in the flat areas of the Simele and Zakho plains and shallow soils in the anticline and strongly erodible areas.
Our comprehensive dataset (Bellat et al., 2024a, b, c, d, 2025) offers substantial insights for soil knowledge in the region, as well as for arid and semi-arid areas.
Status: open (until 24 Feb 2026)
- RC1: 'Comment on essd-2025-418', David G. Rossiter, 29 Oct 2025
- AC2: 'Reply on RC1', Mathias Bellat, 20 Dec 2025
We would like to thank Reviewer #1 for his most welcome comments. The improvements arising from his suggestions will substantially enhance the manuscript's quality and scientific correctness. All the questions have been answered in detail in the attached file.
Best regards,
Mathias Bellat, on behalf of all the co-authors.
- RC2: 'Comment on essd-2025-418', Anonymous Referee #2, 14 Nov 2025
This manuscript focuses on regional digital soil mapping in Iraq, using 532 soil samples and 85 covariates to produce soil maps via machine learning. While the modeling approaches are generally appropriate, the work falls short in two critical aspects: (1) Limited Geographical Scope: The investigated region is quite small. Consequently, the resulting dataset has limited implications and applicability for the broader scientific community, despite its location in Iraq. (2) Limited Novelty: The modeling framework adopted is standard practice in digital soil mapping and lacks significant methodological novelty. Given these limitations, specifically the dataset's limited scope and the conventional nature of the modeling, this work does not meet the high standards for originality and impact required for publication in Earth System Science Data.
Citation: https://doi.org/10.5194/essd-2025-418-RC2
- AC1: 'Reply on RC2', Mathias Bellat, 21 Nov 2025
We sincerely appreciate the time Referee 2 took to read the preprint and to highlight the appropriate modelling approach in our paper. The reviewer identified two critical aspects of our preprint.
1) Indeed, the studied area (2,280 km²) is “relatively small” compared with other datasets available in ESSD. However, regional to local datasets have also been published there (e.g. Lorenz et al., 2021; Ardizzone et al., 2023; Błaszczyk et al., 2024). We think that high-quality regional datasets are necessary to feed and improve larger datasets. Furthermore, as Referee 2 expressed, data on the Iraq region are critically lacking. No regional dataset, from any kind of observation, is available on Iraq in the whole of ESSD (accessed on 14/11/2025). We think that under-represented regions of the globe need and deserve high-quality, standardised data, such as the dataset proposed in this paper and, more generally, those in ESSD. The qualitative data presented in the preprint (soil classes map) can also hardly be extended to a larger scale, as regional patterns cannot always be transposed. Finally, the comparison with the SoilGrids 2.0 product used in the study also highlights the poor quality of such global products when dealing with local problems. We therefore think that a high-quality local dataset is needed and that it would also demonstrate the scientific interest of major journals, such as ESSD, in a scientifically under-studied country.
2) Regarding the lack of novelty in the approach, we understand the criticism of Referee 2, as no “new” method is developed. However, we think the novelty lies in the combination of known techniques and in our unique pipeline/workflow. This study is fully reproducible from the sampling strategy to the final maps produced. By combining the sampling strategy, campaign results, FTIR and laboratory measurements, FTIR model predictions, and DSM models, we propose a unique new approach inspired by Malone et al. (2022) but never applied in real conditions at a regional scale.
We hope that these answers will encourage Referee 2 to reconsider the reasons for our submission to ESSD.
References used:
- Ardizzone, F., Bucci, F., Cardinali, M., Fiorucci, F., Pisano, L., Santangelo, M., & Zumpano, V. (2023). Geomorphological landslide inventory map of the Daunia Apennines, southern Italy. Earth System Science Data, 15(2), 753–767. https://doi.org/10.5194/essd-15-753-2023
- Błaszczyk, M., Luks, B., Pętlicki, M., Puczko, D., Ignatiuk, D., Laska, M., Jania, J., & Głowacki, P. (2024). High temporal resolution records of the velocity of Hansbreen, a tidewater glacier in Svalbard. Earth System Science Data, 16(4), 1847–1860. https://doi.org/10.5194/essd-16-1847-2024
- Lorenz, C., Portele, T. C., Laux, P., & Kunstmann, H. (2021). Bias-corrected and spatially disaggregated seasonal forecasts: A long-term reference forecast product for the water sector in semi-arid regions. Earth System Science Data, 13(6), 2701–2722. https://doi.org/10.5194/essd-13-2701-2021
- Malone, B., Stockmann, U., Glover, M., McLachlan, G., Engelhardt, S., & Tuomi, S. (2022). Digital soil survey and mapping underpinning inherent and dynamic soil attribute condition assessments. Soil Security, 6, 100048. https://doi.org/10.1016/j.soisec.2022.100048
- RC3: 'Comment on essd-2025-418', Bas Kempen, 23 Jan 2026
REVIEW RESULTS
A comprehensive paper on a geographical area that is critically under-represented in soil profile data and digital soil mapping. Care has been taken with the landscape characterisation, including tectonic development, parent material, climate, vegetation, geomorphology and soils, as well as with the maps and photographs that allow the reader to really understand the study area. While the methods are not necessarily ‘new’ themselves, this is an important application of state-of-the-art methods; the novel part of this study is the study area. The output maps are compared to SoilGrids, a global model with inputs from WoSIS. The study mentions that WoSIS has a low sampling density in this area, which, together with the scarcity of other options in the area, highlights the need for such studies.
I agree with reviewer 1 that this is a well-written and thorough study. It is a good example of adhering to FAIR and open metadata standards, not only for the data but also for the methodology, and that is commendable. The methods are comprehensively described and fit their purpose well. All methodological aspects, including the code, are not only made available following FAIR principles but also thoroughly documented and explained at https://mathias-bellat.github.io/DSM-Kurdistan/digital-soil-mapping.html#visualisation-and-comparison-with-soilgrid-product, creating a fine example of a fully reproducible study. The input and output data themselves are a much-needed addition to the more or less non-existent openly available soil data in the area. With this, the manuscript fits well within the scope of ESSD.
One of the reviewers mentioned the limited geographic scope of the paper. I do agree that this scope is limited, but having a DSM study published for a region like Kurdistan (Iraq) is worthwhile and, to me, a welcome addition to the body of literature on this topic, especially given that the authors make their results as well as their data open, from which other DSM efforts (e.g. SoilGrids) can profit. This is much appreciated.
Having said this, I do have a few comments and questions regarding the manuscript, particularly concerning the methodologies, which in my view require some further explanation and clarification. I encourage the authors to address these points, after which I would recommend publication of this article in ESSD.
SPECIFIC COMMENTS
Main comments
Lines 230-231: Could the authors explain the decision to model each depth layer separately instead of developing one model per property with using the depth as an explanatory variable? Modelling each depth separately is a valid approach, but I would like to understand the reason why the authors took this approach.
Lines 246-247: I do not understand the rationale for combining data splitting (80/20) with (repeated) cross validation. Cross-validation already produces an independent prediction for each data point, from which accuracy metrics can be calculated. What was the motivation for embedding CV within a data-splitting framework? And how does this then work? If CV was performed on the 80% training subset, this would give CV-predictions only for these points. I wonder how predictions for the 20% test set were obtained. Which trained model was used to predict at the points in the test set? In case of normal data splitting this would be the model trained on the training dataset. However, by running CV on the training dataset there is no single trained model but multiple (here 10) fold-specific models.
In addition, the manuscript states that CV was repeated three times? While repetition may improve robustness when data are limited or when using a small number of folds, with 10-fold CV I would expect only minimal variation between the repeats?
Overall, the validation approach seems a bit overcomplicated. The authors may well have had sound reasons for adopting this approach, but in that case the rationale and precise implementation need to be explained more clearly.
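The point that cross-validation alone already gives one independent prediction per observation can be illustrated with a minimal Python sketch. This is not the authors' R workflow: `fit_mean` is a hypothetical placeholder standing in for any regression model, and the fold assignment, seeds and synthetic values are assumptions made purely for illustration.

```python
import random
from statistics import mean


def kfold_indices(n, k, seed):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(y_train):
    """Hypothetical placeholder model: always predicts the training mean."""
    m = mean(y_train)
    return lambda: m


def repeated_cv_predictions(y, k=10, repeats=3):
    """Return one list of out-of-fold predictions per repeat.

    Within a repeat, every observation is predicted exactly once, by a
    model fitted without that observation.
    """
    all_preds = []
    for r in range(repeats):
        preds = [None] * len(y)
        for fold in kfold_indices(len(y), k, seed=r):
            held_out = set(fold)
            train = [y[i] for i in range(len(y)) if i not in held_out]
            model = fit_mean(train)
            for i in fold:
                preds[i] = model()
        all_preds.append(preds)
    return all_preds


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return (sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)) ** 0.5


# Synthetic values for 122 hypothetical sites, for illustration only.
y = [float(v % 17) for v in range(122)]
cv_preds = repeated_cv_predictions(y)
rmse_per_repeat = [rmse(y, p) for p in cv_preds]
```

Each entry of `rmse_per_repeat` is computed from predictions at all points, so no separate hold-out subset is required to obtain independent predictions; the spread across repeats shows how sensitive the metric is to the fold assignment.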
Lines 268-273: The description of the ensemble modelling approach is unclear to me and would benefit from additional detail. Specifically, it is not clear which ‘conditions’ (l. 269) are being referred to, what criteria were used to select ‘the best one’, and how the individual model predictions were combined in the ensemble? While relevant literature is cited, I believe that a few additional lines better outlining the implementation of the ensemble approach would improve clarity to the reader.
Other comments
Line 10: The summation signs should be removed I believe.
Lines 10-11: Reference is made to ‘local models’ (compared to the regional models the authors developed and the SoilGrids global model). It is unclear to me what the ‘local models’ refer to and what the basis is for the authors' claim.
Line 14: I believe the minus sign in the superscript should be removed, assuming the RMSE values are reported for the transformed depth data. After applying the square-root transformation, the depth unit becomes cm^0.5. The unit of the MSE metric would then be cm, and taking the square root to obtain the RMSE would again give value in cm^0.5. Table 5 also reports the RMSE unit in cm^0.5. I assume the unit of the MAE (Table 5, l. 337) is also cm^0.5? The ‘-0.5’ superscript should be removed I believe (same for line 337).
Line 148: Reference is made to WRB 2006. Can the authors confirm that this version was used, given that there are more recent versions of the WRB? The latest is from 2022, I believe.
Line 178: “potential soil layer” - inconsistent naming – previous paragraph is “potential soil properties”
Line 185: What are the ‘layers’ here? Are these the soil horizons or are these the layers for which samples were collected (l. 186)? Please clarify.
Line 187: How is ‘topsoil’ defined here? Is this the 0-10 layer? Explain a bit more how the ring samples were taken, e.g. where in the topsoil layer the ring sample was taken: from the top, in the middle of the layer …?
Lines 236-237: References seem to be incomplete. Only years are mentioned (2021,2022)
Line 243: Could expand on why RFE was not restrictive enough/ restrictive enough for what?
Line 372: Why were SoilGrids extremes removed?
Line 401: Limitations are addressed in the paper – I am not sure if the point density, even though notably higher than WoSIS, still warrants mapping at 25m, the reference to Hazelton and Murphy (referencing cartographic scales) seems a bit of a jump. Instead of referencing Hazelton and Murphy I would rather compare to other regional DSM studies.
The Conclusion section reads like an abstract. Reviewer 1 already commented on this; the authors revised the text accordingly, and I believe that with that revision this comment is sufficiently addressed.
Figure 2: What is a negative site? – soil samples
Technical corrections
- Line 21: influence local ecosystems -> influence on local ecosystems
- Line 28: includes -> include
- Line 33: gives information on its ability to fit or not for agricultural purposes, but also to better understand -> gives information on its ability to fit agricultural purposes and helps to better understand
- Line 35: Governate -> governate (not capitalised anywhere else in the paper)
- Line 70: cluster -> conditional
- Line 75 & 265: Hengl and Robert -> Hengl and MacMillan
- Line 74: a raw -> one raw
- Line 129: climax -> climate
- Line 255: McBradney -> McBratney
- Line 229: remove ‘part’
- Line 234: a additive -> an additive
- Line 247: state of the art the art -> state of the art models
- Line 270: approached -> approach
- Line 346: bakns -> banks
- Line 388: remove ‘consistent’
- Line 401: Hazelton and Murphy pg5 -> pg4
- Line 410: LU/C -> land use/cover (it is not used previously)
- Line 425: profiles depth measurement -> profile depth measurements
- Line 441: world -> global
- Line 443: shallower resolution -> higher resolution
- Table 3: Modis brightness index in the wrong column
- Appendix B: units column could be named differently. It contains a mixture of units, formulas, ranges.
Citation: https://doi.org/10.5194/essd-2025-418-RC3
- EC1: 'Note to Authors - Reply on RC3', Giulio G.R. Iovine, 26 Jan 2026
Note to Authors
According to Referee Bas Kempen, the following line on the RMSE metric unit should be disregarded in his comments:
<I believe the superscript “-0.5” should be removed (the same applies to line 337).>
Citation: https://doi.org/10.5194/essd-2025-418-EC1
- AC3: 'Reply on RC3', Mathias Bellat, 27 Jan 2026
We would like to thank Referee #3 for his comments, which took the previous modifications into account. We are confident that these comments will enhance the overall quality of the manuscript. All the questions have been answered in detail in the attached file.
Best regards,
Mathias Bellat, on behalf of all the co-authors.
- RC4: 'Comment on essd-2025-418', Anonymous Referee #4, 27 Jan 2026
The overall study is interesting and represents a solid piece of work in this research area. I agree with other referees that, although the study is limited to a specific locality, it presents novel data from a region that is generally under-represented in the literature. In this sense, the manuscript has clear value as a data paper and, in my opinion, deserves publication.
I will not comment on the many minor issues already addressed by other referees. However, I do have two major comments that I believe should be carefully addressed prior to publication, as well as several moderate and minor comments that would help improve clarity and rigor.
Major comments
1. Use of multiple models (Section 2.4.3; L. 246–253)
The rationale for using such a large number of models is unclear. A modelling approach should be chosen based on the nature of the data and the study objectives. Here, several individual models are used, alongside an ensemble model, without a clear purpose or benefit being articulated.
In addition:
- Quantile Regression Forest (QRF) is not a fundamentally different model from Random Forest; it is simply RF.
- The models are not combined in a way that preserves the ability to produce coherent uncertainty maps.
- As a result, the current strategy weakens, rather than strengthens, the uncertainty assessment.
I strongly recommend using a single modelling framework. Either:
- rely on QRF only, which naturally provides prediction intervals, or
- use an ensemble model only if it can return proper and interpretable prediction uncertainty (which is not evident here).
2. Validation metrics and uncertainty evaluation (Section 2.4.4 and Section 3.1)
Section 2.4.4 is currently confused and mixes metrics that evaluate prediction accuracy with those intended to assess prediction uncertainty. These two aspects should be clearly separated and discussed independently.
There is also no need to report such a large number of metrics. A limited set of interpretable and complementary statistics is preferable, for example:
- bias,
- an error metric (e.g. RMSE or MAE),
- variance explained and/or correlation.
Some specific concerns:
- Metrics such as RPIQ are not clearly explained. While they are common in spectroscopy, it is unclear whether they are scaled relative to the error or to the original data, which makes interpretation difficult in this context.
- The CCC combines correlation and bias, yet bias is also reported separately. It would be clearer to report correlation and bias independently rather than using CCC.
- The interpretation and limitations of PICP need to be discussed more carefully; the authors should consult recent work highlighting its shortcomings (e.g. https://www.sciencedirect.com/science/article/pii/S0016706123002628).
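To make the relationship between these statistics concrete, the sketch below computes bias, RMSE, Pearson correlation, Lin's concordance correlation coefficient (CCC) and RPIQ for one pair of observed and predicted vectors. It is an illustration in Python under stated assumptions, not the authors' code: the function name `accuracy_metrics` and the example values are invented, RPIQ is taken in its usual form as the interquartile range of the observations divided by the RMSE, and the quartile convention is that of Python's `statistics.quantiles`.

```python
from statistics import fmean, quantiles


def accuracy_metrics(obs, pred):
    """Return bias, RMSE, Pearson r, Lin's CCC and RPIQ as a dict."""
    n = len(obs)
    mo, mp = fmean(obs), fmean(pred)
    # Population (1/n) moments, as used in Lin's concordance coefficient.
    var_o = sum((o - mo) ** 2 for o in obs) / n
    var_p = sum((p - mp) ** 2 for p in pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred)) / n

    bias = mp - mo  # mean error, predicted minus observed
    rmse = (sum((p - o) ** 2 for o, p in zip(obs, pred)) / n) ** 0.5
    r = cov / (var_o * var_p) ** 0.5
    # CCC penalises both lack of correlation and systematic shift.
    ccc = 2 * cov / (var_o + var_p + (mp - mo) ** 2)
    # RPIQ: interquartile range of the observations relative to the RMSE.
    q1, _, q3 = quantiles(obs, n=4)
    rpiq = (q3 - q1) / rmse
    return {"bias": bias, "rmse": rmse, "r": r, "ccc": ccc, "rpiq": rpiq}


# Invented example: predictions perfectly correlated but shifted by +5.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [o + 5.0 for o in observed]
metrics = accuracy_metrics(observed, predicted)
```

In this example the correlation is 1 while the CCC falls below 1 solely because of the constant shift, which is the overlap between CCC and bias noted in the comment above.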
Because of these issues, Section 3.1 needs to be rewritten. Statements such as having a CCC of 0.3 or an RPIQ < 1 are difficult to interpret scientifically. The authors should focus on metrics that have a clear statistical meaning and explicitly explain what the reported values imply for model performance and uncertainty.
Major comment 3: Where are the maps of uncertainty (unless I have missed something)? The authors should report prediction intervals.
Moderate and minor comments
- L. 50 and following: The issue is not spatial resolution per se. Any model can theoretically predict at very fine resolution if computationally feasible. The real limitation is data availability in the region of interest or in regions with similar soil-forming factors. Resolution is largely a computational issue and should not be framed as an objective.
- Section 2 description: The description of the sampling and laboratory analyses is difficult to follow. It is unclear which samples were analysed in the lab, which were subsampled, and which were analysed using spectroscopy. This section needs to be clarified.
- L. 71: The term “cluster Latin hypercube sampling” is confusing. Clustering and LHS are conceptually incompatible. Do the authors mean conditioned Latin hypercube sampling (cLHS)?
- Figures placement: Why are all figures placed at the end of the manuscript? This is inconvenient for reviewers and increasingly uncommon, especially given that reviews are conducted digitally.
- L. 171: Some of the cited papers argue the opposite of what is stated, namely that cLHS is not an optimal sampling design for spatial mapping.
- L. 173: Intervals in cLHS are not equal; this should be corrected.
- L. 225: Spelling error in the cited author’s name.
- L. 237: Please clarify whether the data were standardized (zero mean, unit variance) or normalized (scaled between 0 and 1).
- L. 246: Data splitting should be avoided when cross-validation is already used. Using CV alone would be preferable.
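The distinction between standardizing and normalizing raised in the comments above can be shown with a short Python sketch. The function names and the example covariate values are assumptions for illustration only; the manuscript does not say which of the two was applied.

```python
from statistics import fmean, pstdev


def standardise(values):
    """Z-score scaling: zero mean and unit (population) standard deviation."""
    m, s = fmean(values), pstdev(values)
    return [(v - m) / s for v in values]


def normalise(values):
    """Min-max scaling: smallest value maps to 0, largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Invented covariate values, for illustration only.
covariate = [2.0, 4.0, 6.0, 8.0]
z_scores = standardise(covariate)
unit_range = normalise(covariate)
```

The two results differ in range and interpretation: `z_scores` is centred on zero with no fixed bounds, whereas `unit_range` is bounded by 0 and 1 and depends on the extremes of the data.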
Citation: https://doi.org/10.5194/essd-2025-418-RC4
Data sets
Digital soil mapping predicted on mid-infrared (MIR) spectroscopy measurements in North-Western Kurdistan region, Iraq (netCDF and GeoTIFF files) [dataset] Mathias Bellat et al. https://doi.org/10.1594/PANGAEA.973764
Soil bulk density and soil depth from on-site observations in the North-Western Kurdistan region, Iraq [dataset] Mathias Bellat et al. https://doi.org/10.1594/PANGAEA.973714
Soil properties in the North-Western Kurdistan region, Iraq, derived from laboratory measurements [dataset] Mathias Bellat et al. https://doi.org/10.1594/PANGAEA.973701
Soil properties predicted on mid-infrared (MIR) spectroscopy measurements in North-Western Kurdistan region, Iraq [dataset]. Mathias Bellat et al. https://doi.org/10.1594/PANGAEA.973700
Soil information in Kurdistan region, Dohuk governorate (Iraq) Mathias Bellat et al. https://doi.org/10.57754/FDAT.e2k10-sf012
Model code and software
DSM-Kurdistan code release 1.1.0 Mathias Bellat; Nafiseh Kakhani https://github.com/mathias-bellat/DSM-Kurdistan.git
Interactive computing environment
Soil information in Kurdistan region, Dohuk governorate (Iraq), supplementary material Mathias Bellat; Nafiseh Kakhani https://mathias-bellat.github.io/DSM-Kurdistan/
Digital soil maping Mathias Bellat https://mathias-bellat.shinyapps.io/Northern-Kurdistan-map/
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 1,037 | 485 | 36 | 1,558 | 64 | 27 | 44 |
essd-2025-418 "Soil information and soil property maps for the Kurdistan region, Dohuk governorate (Iraq)"
Bellat et al.
Review by D G Rossiter 29-Oct-2025
Summary: This exceptionally-thorough and well-explained data paper presents details of the soils in the named region based on survey and models. It used modern methods (inference from MIR spectroscopy) as part of the soil properties determination. From this dataset a standard modern digital soil mapping (DSM) exercise was carried out to produce property maps over the study area. The maps were compared to the global SoilGrids v2.0 maps and, not at all surprisingly, had significantly better point evaluation metrics. All results and workflows are available under the FAIR concept. This paper can be a reference for how such a study can be carried out.
Major Comments:
1. I appreciate the thorough review of previous mapping efforts in the region; it is good to have these listed for reference. The brief review of major pedogenetic processes is also appreciated. Similarly, the description of the tectonic development places the study within context. The study's motivation is clear. Adherence to FAIR standards is appreciated. The entire workflow, all sources and products, are available, with DOI, and explained.
2. The Conclusions mainly repeat the Abstract and sections of the Discussion. I would appreciate a broader conclusion about the success of this study, the applicability of this kind of study to similar regions, the issues of global vs. local models, the main limitations to this kind of study, the importance of reproducibility and FAIR, etc. That is, after doing all this work, what do you conclude about the project?
3. Did you consider DSM for soil classes? Perhaps using a DSMART-like approach with your additional observations? This could be compared with Fig. 6. Obviously that is not to do in the paper, but was it considered and if so, why not attempted? Related to this, it is not clear how the soil class (not classification) map (Figure 6) was created. It's implied that this was expert judgement supplemented by observations, but it's not explicit. Also see comments below re: L443.
4. Can you comment on the realism of patterns as seen in Figures 9--14? We have the point evaluation statistics, but the map shows a landscape. Do the elements we see there correspond to reality, of course by expert judgement? Are the fine details revealed by the 25 m resolution realistic or artefacts?
Detailed Comments:
The WRB 2006 has been replaced by WRB 2022: IUSS Working Group WRB: World Reference Base for Soil Resources. International soil classification system for naming soils and creating legends for soil maps, 4th ed., IUSS, Vienna, Austria, 234 pp., 2022. However I think the definitions used in this paper have not changed.
L179-80 RUSLE: how were the parameters calibrated? Were they from one of the earlier (cited) studies? Especially the K value.
L227 "the different index" -> "the different indices"
L233 "We performed a standardisation of the predicted values of the texture on 100 % with TT.normalise.sum function (Moeys et al., 2024) and a additive-log ratio transformation (Aitchison, 1986) with the alr function (Tsagris et al., 2025)." This is not clear. Was the normalization following the MIR inference/wet lab measurements? And then were the alr variables used in the mapping, followed by back-transformation (as is done in SoilGrids v2.0)?
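The sequence the comment asks about (closing the texture fractions to 100 %, applying an additive log-ratio transformation, and back-transforming after mapping) can be sketched in Python as follows. This is not the `TT.normalise.sum` or `alr` R function cited in the manuscript; the function names, the choice of clay as the denominator and the example values are all assumptions made for illustration.

```python
import math


def close_to_100(sand, silt, clay):
    """Rescale the three fractions so that they sum to exactly 100 %."""
    total = sand + silt + clay
    return tuple(100.0 * part / total for part in (sand, silt, clay))


def alr(sand, silt, clay):
    """Additive log-ratio with clay as the (assumed) denominator."""
    return math.log(sand / clay), math.log(silt / clay)


def alr_inverse(a_sand, a_silt):
    """Back-transform two alr values to percentages that sum to 100."""
    e_sand, e_silt = math.exp(a_sand), math.exp(a_silt)
    denom = 1.0 + e_sand + e_silt
    return 100.0 * e_sand / denom, 100.0 * e_silt / denom, 100.0 / denom


# Invented predictions that do not sum to 100 % (22 + 51 + 30 = 103).
closed = close_to_100(22.0, 51.0, 30.0)
transformed = alr(*closed)
recovered = alr_inverse(*transformed)
```

Modelling the two alr variables and then applying `alr_inverse` guarantees that the mapped sand, silt and clay values are positive and sum to 100 % at every pixel, which is why the order of these steps matters for the question raised above.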
L235 "close to a normal distribution Liu et al. (2022)" -> "close to a normal distribution (Liu et al., 2022)"
L236 "2021" refers to what?
L258 "relative "simple"" -> "relatively simple"
§2.4.3 and throughout the paper: what is meant by "soil depth"? Is it the depth of the solum (zone of pedogenesis) or the depth to bedrock/completely unweathered parent material? This might be better termed "thickness" but "depth" is indeed commonly used. L365 "shallow and deep profiles" implies only the solum, is this correct?
L295 The correct reference for PICP is Eq. (2) of Malone (2011), not 2017. The formula is not found in Malone 2017. Malone, B. P., McBratney, A. B., and Minasny, B.: Empirical estimates of uncertainty for mapping continuous depth functions of soil attributes, Geoderma, 160, 614–626, https://doi.org/10.1016/j.geoderma.2010.11.013, 2011. This equation and the others need definitions of the symbols, although some are standard. For PICP what is "v"? I learn from Malone 2011 that it is "the number of observations in the validation [better, evaluation] dataset". What is "PL"? L, U as lower, upper limits can be inferred. Finally, the description "we used the prediction interval coverage probability to evaluate the corresponding prediction within an interval" is not clear. The Malone 2011 description is, to me, clearer: "the PICP is the probability that all observed values fit within their prediction limits".
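Following the description quoted above, PICP can be computed as the share of evaluation observations that fall within their lower and upper prediction limits. The Python sketch below is an illustration only: the function name `picp`, the inclusive treatment of the limits and the example values are assumptions, not the notation of the manuscript or of Malone et al. (2011).

```python
def picp(observed, lower, upper):
    """Percentage of observations lying inside their prediction limits.

    Limits are treated as inclusive (an assumption of this sketch).
    """
    inside = sum(
        1 for obs, lo, hi in zip(observed, lower, upper) if lo <= obs <= hi
    )
    return 100.0 * inside / len(observed)


# Invented evaluation data: five observations with their prediction limits.
observed = [12.0, 25.0, 31.0, 48.0, 60.0]
lower = [10.0, 20.0, 33.0, 40.0, 50.0]
upper = [15.0, 30.0, 40.0, 55.0, 58.0]
coverage = picp(observed, lower, upper)  # three of five fall inside
```

For a nominal 90 % prediction interval, a coverage close to 90 would suggest that the interval width is about right, while much lower or higher values point to intervals that are too narrow or too wide.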
L309 It's interesting that silt is so poorly predicted, yet most of these soils are on the silty edge of the texture triangle. And, clay and sand are in Category A and B. Can you explain why the poor result for silt, even though there is a lot of it and with a good range in these samples? This is mentioned on L387.
L346 "river bakns" -> "river banks". Spell-check.
§4.3 Another interesting comparison with SG2 would be the prediction ranges. SG2 likely smooths more than this study, see Table 7 where the Q1-Q3 range is always much narrower. This can be brought out in the text -- the interesting discussion is about global vs. local models. The SG2 maps are much more uniform than the maps from this study.
L387 "should be interpreted with caution—consistent"... with what?
L392 "Abdulrahman et al. 2020 " -> "Abdulrahman et al. (2020)". L390 maybe make it explicit here that this is not a DSM product, rather an expert updating from field work and manual interpretation of remote sensing products (correct?).
L401 the Hazelton & Murphy guidelines are for conventional mapping, not DSM. They are expressed in terms of map scale and cm^2 of printed map. Here the product is digital at 25 m resolution. How is the density here converted to match these guidelines? The argument about cLHS is much more relevant for DSM using machine learning from covariates.
L434 formatting problem with the URL https://mathias-bellat.shinyapps.io/Northern-Kurdistan-map, which goes over a line break so gives a 404 error if not manually adjusted
L443 "shallower resolution " -> "higher resolution"? And what is that resolution? It's nowhere stated. L435 says 1:200 000 scale, which implies polygons with a minimum legible delineation (MLD) of 160 ha (0.4 cm^2 on the map). But L390 "The updated soil classification map (Figure 6) must be interpreted with care, specially at micro-scale (<1:50,000)..." implies a smaller MLD. Figure 6 suggests that this is a polygon map.
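The arithmetic behind the 160 ha figure can be reproduced with a short Python sketch. The function name `mld_hectares` is hypothetical; the 0.4 cm^2 minimum map area is the value given in the comment above.

```python
def mld_hectares(scale_denominator, min_map_area_cm2=0.4):
    """Ground area in hectares of the minimum legible delineation.

    An area on the map scales with the square of the scale denominator;
    1 ha = 10^4 m^2 = 10^8 cm^2.
    """
    ground_area_cm2 = min_map_area_cm2 * scale_denominator ** 2
    return ground_area_cm2 / 1e8


mld_200k = mld_hectares(200_000)  # about 160 ha at 1:200 000
mld_50k = mld_hectares(50_000)    # about 10 ha at 1:50 000
```

The sixteen-fold difference between the two scales is what makes the statements at L435 and L390 hard to reconcile without an explicit statement of the map's resolution.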
Figure 3 the inset showing the region is not needed, that has already been shown in Figure 1 and can be found from the coordinates on the main map.
Figure 8 bicolor key y-axis partially obscured