the Creative Commons Attribution 4.0 License.
GSOCS-LULCC: the Global Soil Organic Carbon Stock dataset after Land Use and Land Cover Change
Abstract. The direction and magnitude of soil organic carbon stock (SOCS) change following land use and land cover change (LULCC) are highly uncertain, largely due to the lack of relevant global soil data. Great efforts have been made to build SOCS databases at regional, national and even sub-continental scales following LULCC; however, a comprehensive and open-access global database has not yet been developed, hindering a deep understanding of LULCC impact on SOCS dynamics. In this study, we introduce a new global SOCS database for LULCC, compiled from 639 articles documented in the Web of Science through the end of 2023. Targeting five major land uses (cropland, grassland, forest, plantation, and savanna), this database – named the Global Soil Organic Carbon Stock dataset after Land Use and Land Cover Change (GSOCS-LULCC) – includes 1,206 sites with 5,982 records at various sampling depths. The database will enable users to assess the global impact of LULCC on SOCS dynamics and identify the factors that control SOCS changes for specific types of LULCC. The GSOCS-LULCC database is freely available from the Zenodo platform at https://doi.org/10.5281/zenodo.11183819 (Chen et al., 2024).
Status: open (until 18 Dec 2024)
RC1: 'Comment on essd-2024-373', Anonymous Referee #1, 28 Oct 2024
This data description paper presents a dataset of soil organic carbon stock before and after land use and land cover change, covering almost 6000 records from 639 articles. The dataset targets an important area of research and seems like a useful resource for analyzing the effects of land use change on soil carbon stocks. It also appears to cover a broader set of studies than other recent meta-analyses of land use change and soil carbon stocks.
While the dataset itself is useful, there are some deficiencies in the metadata and the information provided in the dataset which, if improved, could make the dataset more useful and easier to understand. First, the dataset lacks detailed metadata explaining the meaning of each column, which should be included in the data archive. Several columns are not clear from the titles alone. The dataset should include a metadata file with a short description of the source and meaning of each column.
Second, several columns in the dataset have limited use because they do not use consistent classifications. The “Region” column is a mix of different information, with some cities, some broad regions (like river basin, name of forest reserve, etc. which are likely not meaningful outside the context of the specific study), and some that appear to be general classifications (e.g., “Dry forest”). Because they are not systematic, these values are not very useful in the context of a meta-analysis. Similarly, the soil type column uses a mixture of different classification approaches that are not compatible and therefore are not useful for systematic analysis. Some values are soil classes (e.g., Haplaquepts), some are descriptive or maybe from a different system (“Red earth”, “Brown soils”), some are soil orders (“Mollisols”, “Alfisol”). To be useful, this column needs to have a common classification. I understand that this information is likely not available for all soils in the dataset. I would suggest having multiple columns, one for soil order (in one classification scheme), which should be available for all sites or from larger-scale databases, and another column for more detailed classification or study-specific information about soil type.
The “Mass correction” field is not explained at all.
The “Year range” and “Year mean” are not clear either. A more useful approach here would be to have a column for “start year” and “end year”.
The “sampling depth” field seems to be redundant with the upper and lower depth columns. And speaking from experience, intermixing different formats (some numbers separated with a dash, some with a ~) is likely to make dealing with this data more difficult because any users will need to add post-processing code to account for multiple formats. “Year range” has similar issues with some entries having one number and some having multiple numbers separated by a ~.
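As an illustration of the post-processing burden described above, here is a minimal Python sketch of the kind of parser a user would need. The accepted separators (`-`, `~`, and an en dash) are assumptions based on the formats mentioned in this comment, and the function name `parse_range` is hypothetical; it is not part of the dataset or its tooling.

```python
import re

# One or two non-negative numbers separated by "-", "~", or an en dash.
# The separator set is an assumption based on the formats noted in the review.
_RANGE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*(?:[-~\u2013]\s*(\d+(?:\.\d+)?))?\s*$"
)


def parse_range(text):
    """Return (lower, upper) as floats; a single number yields (n, n)."""
    match = _RANGE.match(text)
    if match is None:
        raise ValueError(f"unrecognised range format: {text!r}")
    lower = float(match.group(1))
    # A missing second number means the entry was a single value.
    upper = float(match.group(2)) if match.group(2) is not None else lower
    return lower, upper
```

Storing the two numbers in separate columns in the archive itself would make this step unnecessary for every downstream user.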
Also, the land use change field would be more useful from a machine reading/analysis perspective if it were split into two columns, before and after.
Generally, I suggest adding columns with flags for when data were gap-filled or modeled as opposed to being present in the original papers. This is particularly important for bulk density, which is linearly related to SOCS and is modeled using a very simple function.
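To illustrate why a provenance flag matters for bulk density, here is a minimal Python sketch. It assumes the common fixed-depth formula, SOCS (Mg C ha⁻¹) = SOC (g kg⁻¹) × bulk density (g cm⁻³) × layer thickness (cm) × 0.1, with no coarse-fragment correction; this may differ from the calculation used in the manuscript. The `bd_source` field and its values are hypothetical names, not columns in the dataset.

```python
from dataclasses import dataclass


@dataclass
class LayerRecord:
    soc_g_per_kg: float        # SOC concentration
    bulk_density_g_cm3: float  # measured or modeled bulk density
    thickness_cm: float        # layer thickness
    bd_source: str             # hypothetical flag: "measured" or "modeled"


def socs_mg_per_ha(record):
    """Fixed-depth SOC stock in Mg C per hectare (assumed formula)."""
    # The factor 0.1 converts g/kg * g/cm3 * cm into Mg/ha.
    return (record.soc_g_per_kg * record.bulk_density_g_cm3
            * record.thickness_cm * 0.1)


def relative_socs_error(record, bd_relative_error):
    """Relative SOCS error caused by a relative bulk density error."""
    biased = LayerRecord(
        record.soc_g_per_kg,
        record.bulk_density_g_cm3 * (1.0 + bd_relative_error),
        record.thickness_cm,
        record.bd_source,
    )
    return socs_mg_per_ha(biased) / socs_mg_per_ha(record) - 1.0
```

Because the stock is a product of the three terms, a relative error in bulk density carries over one-to-one to the stock, which is why users need to know which records rely on a modeled value.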
Other comments on the data description:
Line 40: Remove “promotes”
Line 51: Change “revelled” to “revealed”
Line 61: I would change to “English-language academic journals” to specify that it is referring to the language and not the country
Figure 1: I found this diagram very helpful for explaining the approach
Line 87: Savanna should be included in this list of land use types (there are 5, not 4). And some detail on how a plantation was defined and differentiated from a forest would be helpful.
Line 102: This bulk density calculation seems overly simple given the pivotal importance of bulk density for calculating SOCS. This could introduce a lot of error to the database. I would recommend, at minimum, including a column in the dataset that indicates whether this model was used or if bulk density measurements were present in the original data. I would also suggest incorporating soil texture data into the bulk density function or using SoilGrids bulk density if it is available since it would incorporate more covariates.
Line 107: Was the 10% standard deviation consistent with the reported standard deviations of points in the dataset where mean and standard deviation were actually provided? It would be helpful to include a flag in the database for this similar to bulk density so it’s clear when this was provided versus modeled.
Line 120: Include a citation for the source of the biome information
Line 130: I don’t think it’s useful to show this before/after information in panel a of Figure 4, because the types of land use change are so different from each other. I would suggest having one bar plot with the distribution of means, or color coding them by the initial land use type (before conversion) instead of just before/after. Or show one distribution of the mean and a separate panel with the distribution of change (after minus before) across all points. Or show a two dimensional scatter plot of SOCS before versus after conversion.
Line 142: What kind of mass correction is this referring to? I don’t think it was ever explained, and it is also present in the actual dataset without explanation
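The consistency check suggested in the Line 107 comment above could be sketched as follows, assuming the imputation sets the standard deviation to 10% of the mean (my reading of the manuscript, not confirmed). The record layout with `mean` and `sd` keys and the function names are hypothetical, and the approach is only one simple way to make the comparison.

```python
from statistics import median


def reported_cv(records):
    """Coefficient of variation (sd / mean) where an SD was reported."""
    return [
        r["sd"] / r["mean"]
        for r in records
        if r.get("sd") is not None and r["mean"] > 0
    ]


def summarise_cv(records, assumed_cv=0.10):
    """Compare reported CVs with the CV assumed for imputed records."""
    cvs = reported_cv(records)
    if not cvs:
        raise ValueError("no records with a reported standard deviation")
    # Share of reported CVs strictly larger than the assumed value.
    share_above = sum(cv > assumed_cv for cv in cvs) / len(cvs)
    return {
        "n": len(cvs),
        "median_cv": median(cvs),
        "share_above_assumed": share_above,
    }
```

If the median reported coefficient of variation is well above the assumed value, the imputed standard deviations would understate uncertainty and give those records too much weight in a meta-analysis.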
Citation: https://doi.org/10.5194/essd-2024-373-RC1
RC2: 'Comment on essd-2024-373', Anonymous Referee #2, 09 Nov 2024
The authors collected a global dataset of SOC changes due to LULCC and presented its characteristics from different aspects. Several areas could be improved to enhance the dataset’s value:
Specific comments:
The dataset compilation from 639 articles is impressive, as illustrated in Figure 1. The results section effectively presents the dataset’s characteristics. However, a high-impact journal such as ESSD would require a more thorough review and more comprehensive dataset development. The current manuscript is overall simplistic, with a brief introduction and no discussion. The introduction could be strengthened by reviewing existing datasets on SOC changes, placing this study within the broader research context, and identifying any knowledge gaps it addresses. Comparing the findings with other datasets would help to validate the dataset. For instance, Huang et al. (2024) conducted a similar study that included 790 articles and emphasized identifying key SOC change drivers. Noting potential overlaps with Huang et al. (2024) and clarifying what makes this dataset unique would underscore its contribution.
Exploring how environmental factors influence SOC changes due to LULCC, and how those changes vary across regions and climate zones, would provide deeper insights into the underlying patterns and mechanisms. For example, generating a gridded SOC change map via modeling or spatial interpolation could increase this dataset’s utility and value.
Reference: Huang, X., Ibrahim, M. M., Luo, Y., Jiang, L., Chen, J., & Hou, E. (2024). Land use change alters soil organic carbon: Constrained global patterns and predictors. Earth's Future, 12, e2023EF004254. https://doi.org/10.1029/2023EF004254
Citation: https://doi.org/10.5194/essd-2024-373-RC2
Data sets
GSOCS-LULCC: the Global Soil Organic Carbon Stock dataset after Land Use and Land Cover Change Songchao Chen et al. https://doi.org/10.5281/zenodo.11183818
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
559 | 128 | 10 | 697 | 10 | 9 |