the Creative Commons Attribution 4.0 License.
A 1 km soil organic carbon density dataset with depth of 20cm and 100cm from 1985 to 2020 in China
Abstract. Soil organic carbon (SOC) is an important component of the global carbon cycle and a vital indicator of soil quality and ecosystem health, with significant implications for agricultural production and for climate change adaptation and mitigation strategies. Although several studies have mapped the spatial distribution of soil organic carbon density (SOCD), long time-series SOCD products for China are still lacking. This study therefore proposes a new algorithm with climatic zoning, aiming to improve the accuracy of SOCD predictions at depths of 0–20 cm and 0–100 cm from 1985 to 2020. The data sources include Landsat archives, topographic data, meteorological data, and measured SOCD data. The innovation lies in building a separate model for each climate region, using a random forest ensemble learning approach for SOCD estimation in China. The results show that the zoning model outperformed the global model without climate zoning, with R²=0.55 and RMSE=2.19 for 0–20 cm and R²=0.52 and RMSE=6.50 for 0–100 cm. By comparison, the global model achieved R²=0.46 and RMSE=2.36 for 0–20 cm and R²=0.44 and RMSE=8.09 for 0–100 cm. Moreover, the 0–20 cm SOCD predictions align well with independent samples (R²=0.69, RMSE=2.01) and are further validated against Xu's dataset (R²=0.63, RMSE=1.82). Comparisons with published SOC content products, including HWSD, SoilGrids250m, and GSOCmap, also show good consistency; the predicted SOCD fits the SoilGrids250m product best, with R²=0.72 and RMSE=1.35. Comparisons of model predictions with independent datasets from the 1980s, 2000s, and 2010s in China reveal substantial agreement and a trend of increasing prediction accuracy over time. The predicted SOCD is available via Figshare (https://doi.org/10.6084/m9.figshare.27290310.v1) (Dong et al., 2024).
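The R² and RMSE comparison between a zoned and a global model can be illustrated with a minimal sketch. It assumes R² denotes the coefficient of determination (the abstract does not state this), uses invented sample values, and substitutes a simple mean predictor for the random forest models, so it is not the authors' implementation.

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical samples: (climate zone label, observed SOCD). Values are invented.
samples = [("arid", 1.0), ("arid", 2.0), ("arid", 1.5),
           ("humid", 5.0), ("humid", 6.0), ("humid", 5.5)]
observed = [value for _, value in samples]

# Global stand-in model: a single mean fitted to all samples.
global_mean = sum(observed) / len(observed)
global_pred = [global_mean for _ in samples]

# Zoned stand-in model: one mean fitted per climate zone.
zone_values = {}
for zone, value in samples:
    zone_values.setdefault(zone, []).append(value)
zone_mean = {zone: sum(vals) / len(vals) for zone, vals in zone_values.items()}
zoned_pred = [zone_mean[zone] for zone, _ in samples]

print("global: R2 =", r2(observed, global_pred), "RMSE =", rmse(observed, global_pred))
print("zoned:  R2 =", r2(observed, zoned_pred), "RMSE =", rmse(observed, zoned_pred))
```

In this toy setup the zoned predictor scores better simply because the zones differ in their mean values. Whether the same holds for random forests that already receive climate variables as inputs is the question raised in the discussion.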
Status: open (until 06 Apr 2025)
RC1: 'Comment on essd-2024-588', Anonymous Referee #1, 16 Mar 2025
This manuscript by Dong et al. presents an intriguing study quantifying soil organic carbon (SOC) at depths of 0–20 cm and 0–100 cm in China from 1985 to 2020. SOC change is highly important for the terrestrial ecosystem carbon cycle. Using climate zones to improve model performance is innovative, and the accuracy is comparable to that of other published SOC datasets. In addition, this dataset expands the temporal availability of SOCD in China at high spatial resolution and can be widely applied in related studies. Overall, this manuscript has clear scientific innovations in understanding SOC changes. Please find my detailed comments below.
- Section 3.2. Could you explain the principles for selecting variables in more detail? For example, in Fig. 2 the R between AH and SOCD is almost 0, so why was this variable selected? Also, Fig. 2 shows only 18 variables and omits CLCU; how was CLCU selected as an input predictor?
- Section 4.2. Fig. 8 and Line 250: The discussion of different features for SOCD estimation is comprehensive and helps us understand the important factors behind SOCD variations. However, it is very interesting that the features have different importance values in the two depth models. Please discuss these differences further.
- Section 4.5. Fig. 13 and Line 315: “This may be the result of the topsoil being more susceptible to the direct effects of soil management practices and environmental changes.” Which types of management practices contribute to the changes in topsoil SOCD? Please add more details (policies or references). As shown in Fig. 13(b), the 0–100 cm SOCD estimated in this study is higher than the others. Please add some validation for the 0–100 cm SOCD, as mentioned previously. In addition, SOCD in deep soil should increase if SOCD in topsoil increases, so please give possible reasons why the 0–100 cm SOCD remains stable from the 1990s to the 2020s. Fig. 14(d) and Fig. 15(d): In Xinjiang province, the SOCD in 2000–2005 seems to change a lot compared with other periods. Is this due to the model itself, or did some event during this period cause a significant change in SOCD? Please give reasonable explanations in this part.
- Section 2.1 “brown soil, brown soil”. Duplicate
- Section 2.2. Line 95: Are the SOCD data from Song or Xu? Please check this carefully.
- Line 125: Generally, spatial interpolation results are reliable if stations are evenly distributed. How are the meteorological stations used for interpolation distributed spatially?
- Line 130: Please add the production date or period of validity of the published soil datasets.
- Section 4.2. Line 225: There is no need to write out the full names of the statistical metrics, which have been given previously. Fig. 6: Could you add the sample number in Fig. 6? Please add units for RMSE in both the figures and the manuscript.
- Section 4.4. Fig. 11: Please add a unit to the colorbar for (b), (d), and (f), and state the time (which year). Is it the annual average or a specific year? Please add the validation results for 0–100 cm SOCD in the manuscript or the Supplementary.
Citation: https://doi.org/10.5194/essd-2024-588-RC1
CC1: 'Comment on essd-2024-588', Tingxuan Zhang, 20 Mar 2025
The manuscript has many serious methodological problems and flaws, undermining the provided dataset’s accuracy. These issues include inadequate input variables for the random forest model, lack of method novelty, and unclear figure illustration. I wonder why this manuscript was sent out for review. Let me offer some significant issues:
(1) Random forest is a nonlinear supervised discrete classification model, while Pearson correlation coefficient is a correlation coefficient that measures linear correlation between two sets of data. The authors know nothing about it. They used the Pearson correlation coefficient to determine the variables for the random forest model inputs.
(2) The authors claimed that they used climate zoning to improve the prediction, but this is absolutely unnecessary because temperature and precipitation are the two most important variables (shown in Figure 8) and are highly correlated to climates. In this case, why take the trouble of building the random forest model for each climate zone?
(3) Issue 2 leads me to my next big concern: the verification part. As the authors insist that climate zoning is the novelty of their method, why did they verify their results against others across different climate zones? From a scientific view, climate zoning in itself does nothing to improve the model's accuracy. For instance, if the authors had used some other geographical partitioning instead of climate zoning, the smaller the partition area, the more accurate the model would be, considering that Tobler's first law of geography states that everything is related to everything else, but near things are more related to each other.
(4) Even so, I do not find a significant improvement in the SOCD prediction compared to other published datasets.
(5) The highly skewed SOCD sample input leads to the model's low accuracy (Figure 4). This is probably one of many reasons why the accuracy of 0-20cm SOCD showed higher R2 than that of 0-100cm SOCD.
(6) Another reason is the inadequate model input data. The lack of lidar data for soil depth measurement makes your results underestimated compared to other datasets (Figures 11 & 12).
(7) Figures 5(a) and 5(c) are unnecessary as the authors did not conduct any analysis using the biomes.
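The distinction raised in point (1), between a linear correlation measure and a nonlinear model, can be illustrated with a small sketch. The data below are synthetic and the function name `pearson_r` is chosen for illustration; nothing here comes from the manuscript's variables or results.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: covariance over product of std deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Synthetic predictor, symmetric around zero.
xs = [i / 10 for i in range(-20, 21)]

# A linear response and a purely nonlinear (quadratic) response to the same predictor.
ys_linear = [2 * x + 1 for x in xs]
ys_quadratic = [x ** 2 for x in xs]

print("linear response:   ", pearson_r(xs, ys_linear))
print("quadratic response:", pearson_r(xs, ys_quadratic))
```

The quadratic response depends entirely on the predictor, yet its Pearson coefficient is essentially zero. A near-zero linear correlation therefore does not by itself show that a variable is uninformative for a nonlinear model, and a high one does not guarantee usefulness.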
Citation: https://doi.org/10.5194/essd-2024-588-CC1
CC2: 'Comment on essd-2024-588', jianzhao wu, 22 Mar 2025
Publisher’s note: the content of this comment was removed on 24 March 2025 since the comment function was misused for promotional purposes.
Citation: https://doi.org/10.5194/essd-2024-588-CC2
Data sets
A 1 km soil organic carbon density dataset with depth of 20cm and 100cm from 1985 to 2020 in China. Yi Dong, Xinting Wang, and Wei Su. https://doi.org/10.6084/m9.figshare.27290310.v1
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
294 | 41 | 8 | 343 | 7 | 7 |