Articles | Volume 17, issue 2
https://doi.org/10.5194/essd-17-517-2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
A China dataset of soil properties for land surface modelling (version 2, CSDLv2)
Download
- Final revised paper (published on 07 Feb 2025)
- Supplement to the final revised paper
- Preprint (discussion started on 29 Aug 2024)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on essd-2024-299', Anonymous Referee #1, 02 Oct 2024
  - AC1: 'Reply on RC1', Gaosong Shi, 23 Oct 2024
  - AC2: 'Reply on RC1', Gaosong Shi, 08 Nov 2024
- RC2: 'Comment on essd-2024-299', Anonymous Referee #2, 01 Nov 2024
  - AC3: 'Reply on RC2', Gaosong Shi, 09 Nov 2024
  - AC4: 'Reply on RC2', Gaosong Shi, 15 Nov 2024
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Gaosong Shi on behalf of the Authors (18 Nov 2024)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (25 Nov 2024) by Hao Shi
RR by Anonymous Referee #2 (29 Nov 2024)
ED: Publish as is (12 Dec 2024) by Hao Shi
AR by Gaosong Shi on behalf of the Authors (12 Dec 2024)
Manuscript
GENERAL COMMENTS:
The manuscript presents the preparation of soil maps for the whole of China at 90 m resolution, covering six soil depths down to 2 m. The maps include information on 23 soil physical and chemical properties. The structure of the manuscript is easy to follow, and the statistical analysis is thorough. However, the following points require revision by the authors:
Please find further suggestions under SPECIFIC COMMENTS.
SPECIFIC COMMENTS:
TITLE: you could add the acronym of the database in brackets: (CSDLv2)
L21: please add information on accuracy based on all depth intervals, not only 0-5 cm.
L36: based on the reference list, this seems to be Lu et al. (2016). Please check and correct.
L39: please cite other papers as well.
L47: please rephrase “exemplified by Brazil’s”; the sentence is not finished.
L64: … (McBratney et al., 2003) … typing errors in references could be prevented by using a reference manager. Please recheck throughout the text that the citations match the reference list.
L68: please briefly describe in the text the limitations of the existing dataset.
L80: please rephrase “soil specie survey”; it is mistyped.
L83: please add the list of mapped soil properties.
L84-85: please write the words “available” and “alkali” in lower case. Is there a more general name for AN? E.g.: potential long-term supply of nitrogen in the soil (alkali-hydrolysable nitrogen, AN), or something similar?
L94-107: please reduce repetition by mentioning each advancement once, in a logical order.
L102: please highlight that covariates were used in the mapping as independent variables (predictors) in the ML. Please consider that the improvement in resolution is the result of points 1-3; it could therefore be mentioned after points 1-4.
L104: please rephrase the following, it is difficult to understand: “without explicitly uncertainty estimates in CSDLv1”
L109: … in Fig. 2. … or change the order of Fig. 1 and 2.
L114-115: based on the entire manuscript 1) validation was performed based on data-splitting and independent soil profile dataset with measured soil data, and 2) comparison was done with existing national and global soil maps. Please consider it and revise the text and workflow figure (Fig. 2. left bottom corner) accordingly.
L132: … in Fig. 2 … change order of figures as suggested above.
L150: it might be better to write “location” instead of “space”.
L153: it is OK to use soil type information from HWSD, but please briefly explain why you used this 1 km resolution map instead of SoilGrids at 250 m resolution.
L159: aspect is not included in Table S2; please add it there or delete it from the text if it was not used.
L160: do you mean “organism related covariate”? Please rephrase.
L174: as above, why were soil factors derived from HWSD at 1 km rather than from SoilGrids at 250 m?
L181-182: please note that the Pearson correlation coefficient can detect only linear relationships. Why didn’t you let RFE reduce the number of covariates? Why did you first use the Pearson correlation coefficient to reduce the number of predictors?
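To illustrate the concern about Pearson-based pre-filtering, here is a minimal sketch on synthetic data. The values and the `pearson` helper are illustrative assumptions, not taken from the manuscript. A covariate with a perfect but non-linear relation to the target can have a Pearson coefficient near zero and would be discarded before RFE could evaluate it.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Synthetic covariate, symmetric around zero (illustrative only).
x = [i / 100 for i in range(-100, 101)]

# A target that depends linearly on x, and one that depends on x squared.
y_linear = [2 * v + 1 for v in x]
y_quadratic = [v ** 2 for v in x]

r_linear = pearson(x, y_linear)        # close to 1
r_quadratic = pearson(x, y_quadratic)  # close to 0 despite full dependence

print(f"linear: {r_linear:.3f}, quadratic: {r_quadratic:.3f}")
```

Here `r_quadratic` is close to zero although `y_quadratic` is fully determined by `x`, so a correlation threshold applied before RFE could remove an informative predictor.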
L201: could you please add to the supplementary material information on the 15 most important variables for all depths and soil properties, similar to Fig. S26, which shows them for the 0-5 cm depth?
L201: please mention somewhere under “2 Materials and Methods” how the resolution of the derived maps was defined.
L252: do you mean 1°× 1° tiles?
L262: Please specify “four different values”. Do you mean four prediction related values?
L271-276: as mentioned above, please clarify whether 10-fold cross-validation was performed. Does it mean that all 1540 Chinese soil profiles of the WoSIS dataset were used only for validation?
L314-315: please consider the following and rephrase if you agree: the goal might be to have training data that are representative of China’s soil types. Do you think that the datasets available to train the model represent well the soil types under different land cover? I ask because in many countries soils from arable land are well represented, but soils from forested areas, organic soils, or less widespread soil types are underrepresented. How is it in your case?
L319-322: please note that vertical change in soil properties depends on soil types. Several soil properties are addressed in this manuscript, therefore be specific and do not state that “the average concentrations of most soil property variables tend to decrease with increasing depth (e.g., OC, TN), showing positive skewness distributions.” The last statement is confusing: “indicating no statistically significant differences between samples from different depths.” Is it the case for OC?
L336: please add what can be the reason for the vertical decline in the predictability of soil texture.
L339: please add information about the deeper layers, as well.
L350: what do you mean by “regional covariates”? Please rephrase.
L356: Fig. 4 is discussed later than Fig. 5, please change order of the figures.
L357-358: please explain in more detail the statement made in the sentence starting with “The gross …”. Please note that values of soil properties do not always increase with depth.
L364: please rephrase the first sentence, it is not complete.
L366: please explain what you mean by “looser soil particles”. What is the reason for the lower bulk density on the Qinghai-Tibet Plateau?
L367: Is land use the only factor that influences BD in the south-eastern coastal areas? Please explain the differences in BD in deeper horizons, which are less affected by land use.
L368: please rephrase the first sentence, it is not complete.
L368-373: please be more specific when referring to particular regions. The present sentences are contradictory because the locations are specified by points of the compass.
L373: please discuss map of TN, and why it shows similar pattern with OC.
L400: … lists the PICP values …
L410: please discuss how uncertainty changes with soil depth. What can be an explanation for that change?
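For reference, PICP (prediction interval coverage probability) is the share of observations that fall inside their prediction interval. A minimal sketch with hypothetical values (not data from the manuscript; the `picp` helper and the depth labels are assumptions) shows how it could be tabulated per depth interval to support such a discussion:

```python
def picp(observed, lower, upper):
    """Percentage of observations lying inside [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return 100.0 * inside / len(observed)


# Hypothetical validation data for two depth intervals (illustrative only).
validation = {
    "0-5 cm": {
        "observed": [1.2, 0.8, 2.5, 1.9],
        "lower": [0.5, 0.6, 1.0, 1.0],
        "upper": [2.0, 1.5, 3.0, 1.8],
    },
    "100-200 cm": {
        "observed": [0.3, 0.4, 0.9, 0.2],
        "lower": [0.1, 0.5, 0.2, 0.1],
        "upper": [0.6, 0.9, 0.8, 0.5],
    },
}

# One PICP value per depth interval.
picp_by_depth = {depth: picp(**values) for depth, values in validation.items()}

for depth, value in picp_by_depth.items():
    print(f"{depth}: PICP = {value:.1f} %")
```

For a nominal 90 % prediction interval, a PICP near 90 indicates well-calibrated uncertainty; values clearly below or above suggest intervals that are too narrow or too wide.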
L413-414: do you mean that organism type variables have the highest variable importance? Please rephrase.
L422: please note that soils developed on shallow bedrock do not always have low OC. The vegetation type on those soils influences the rate of OC accumulation.
L437: what is the source of organic matter content (TERECO) input layer in the case of clay content maps of CSDLv2? Isn’t it terrestrial ecosystems? Please revise the sentence.
L441-443: please rephrase the last two sentences of the paragraph; they are difficult to understand.
L446-447: please provide more information about the results of Shrini et al. (2017). It is not clear how they relate to your results on CEC.
L451: … SoilGrids 2.0 … please correct it here and the entire manuscript.
L452: please add the selection criteria both in the text and caption of Table 3. E.g., soil properties with highest prediction accuracy, or something similar.
L458-462: In the case of MEC, calculate a percentage improvement relative to the possible range or describe the absolute improvement, e.g. MEC improved from 0.48 to 0.69.
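The two ways of reporting the MEC change can be sketched as follows. The figures 0.48 and 0.69 are the example values from the comment above; reading “possible range” as the remaining gap to the maximum MEC of 1 is an assumption of this sketch, and `mec_improvement` is a hypothetical helper.

```python
def mec_improvement(old, new, best=1.0):
    """Return the absolute MEC gain and the share of the gap to `best` that it closes."""
    absolute = new - old
    gap_closed = absolute / (best - old)
    return absolute, gap_closed


# Example values quoted in the comment (not results from the manuscript tables).
absolute, gap_closed = mec_improvement(0.48, 0.69)

print(f"absolute improvement: {absolute:.2f}")
print(f"share of remaining gap closed: {100 * gap_closed:.1f} %")
```

With these values the absolute gain is about 0.21, which closes roughly 40 % of the gap to a perfect MEC of 1.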
L477-479: do you think that CSDLv2 can better capture sites with extremely low or extremely high values? If yes, please add it and discuss why it can describe better the extreme values than the other maps.
L493-494: please add an example for “smoothing the properties of certain regions”. Or rephrase the sentence. Do you mean that extreme values are smoothed due to the type of algorithm used (QRF – provides a mean of several trees, which includes a mean at each node)?
L496: … To show the impact of the … or something similar.
L499: … aspects. … end the sentence, delete “:”.
L499: is the resolution of the derived maps 90 m because the input layers that are most important for the predictions also have this resolution? If yes, please add this briefly.
L505-506: please add the benefit of producing maps of soil colours in RGB. What is their practical use?
L523-524: the meaning of the sentence starting with “These soil nutrients …” is not clear; please elaborate. Do you mean the warm-up period?
L534: do you think that 90 m resolution can meet the needs of precision agriculture? 90 m resolution might support the spatial delineation of management zones. Please consider revising it in the text.
L557-558: “soil management” or “land use”? Please revise if land use is the correct word.
L557-571: this description is very informative. Suggestion for future development: if elevation and slope are highly correlated with temperature and precipitation, it might be possible to derive 90 m resolution climate variables from the original 1 km resolution (downscaling) based on topographical variables.
L572-578: OK, but it is not clear how you handled soil data originating from different time periods in your study. Please explain it briefly in the text.
L583: on the download page why:
- temporal resolution is yearly and
- spatial resolution is 10 m – 100 m?
L584: please briefly add how the 1 km and 10 km resolutions were derived.
L589: … soil physical and chemical soil properties, with … Please delete the word “fertility” here and throughout the manuscript. Fertility is a complex soil property defined by many indicators. This manuscript addresses soil physical and chemical properties. Of course these influence soil fertility, but that is not the focus of the manuscript.
L594: … gridded soil datasets, …
L594: please replace “more reasonable” with something more specific.
L596: please briefly indicate that CSDLv2 describes the state of 1980.
L599-601: please complement the last sentence by how the limitations of CSDLv2 could be addressed in future studies – i.e., summarize paragraph 4.3.
DATA AND CODE AVAILABILITY:
The codes are accessible at GitHub.
Data accessibility is not smooth. I see that the data could be downloaded through FTP, but it didn’t work for me. Either the download option needs improvement or instructions for using the download site are needed.
TABLES:
Table 2: is it possible to give a general variable name for “Sentinel2B2/B3/B4/8/9” under the description column?
Table 3: do you mean that it is the result of cross-validation in the case of CSDLv2 and the performance of the other maps (CSDLv1, SoilGrids 2.0, HWSD 2.0) on the dataset used to train and test the CSDLv2 predictions? Please revise the title to increase clarity. Please add the number of samples considered for the validation in a separate column.
FIGURES:
Fig. 2: revise left bottom corner based on advice for L114-115, and reedit the figure of “Other soil datasets”, its pattern might not be the same as that of the “Variable maps”. Direction of arrow on the left might go from “points of the soil profiles” to “Compare and evaluate”.
Fig. 4: the caption does not include information on DEM and land use map. Please add them.
Fig. 5: the labels are not visible. Please consider showing the maps in two or three figures to increase visibility and readability. Please find a logic for dividing the maps into two or three groups; then you do not have to fit all 23 maps on one page (one figure). Please add the units of the soil properties and add “content” where needed, e.g.: sand, silt, and clay content, etc.
Fig. 6: please increase size of the letters on the plot, it is difficult to read.
Fig. 7: please:
- increase size of the letters on the plot, it is difficult to read,
- add R2 – for both maps – and 1:1 line to b), d) and f) plots to better see the comparison,
- use the same min and max values on x and y axis by soil properties, e.g.: 0 and 30 % for OC, 0 and 100 % for sand, 0 and 80 % for clay.
SUPPLEMENTARY MATERIAL:
Fig. S2-S24: please increase size of the letters in the legend. Present version is difficult to read.
Fig. S2, S7, S8-13: using the word “content” is not appropriate, please revise these captions. Fig. S8-13 need further clarification of the meaning of R, G, B; the captions should be easy to understand without reading the manuscript.
Fig. S14-15: I expected more variety in soil colour. Do you have only 6 different colours? Or did you reduce/aggregate the possible colours?
Fig. S19-24: write out nitrogen, potassium, phosphorus before the brackets, instead of writing only N, K, and P.
Fig. S2-25: please add the units in the captions of the figures.
Fig. S25: … of the soil organic carbon (OC) and soil pH …
Fig. S26: please increase size of the letters on the plot, it is difficult to read.
Table S4: do you mean that it is the result of cross-validation in the case of CSDLv2 and the performance of the other maps (CSDLv1, SoilGrids 2.0, HWSD 2.0) on the dataset used to train and test the CSDLv2 predictions? Please revise the title to increase clarity. Please add the number of samples considered for the validation in a separate column.
Fig. S26: please add that the top 15 most important variables are shown.