Supplement of Apparent ecosystem carbon turnover time: uncertainties and robust features

Abstract. The turnover time of terrestrial ecosystem carbon is an emergent ecosystem property that
quantifies the strength of land surface on the global carbon cycle–climate feedback. However,
observation- and modeling-based estimates of carbon turnover and its response to climate are
still characterized by large uncertainties. In this study, by assessing the apparent whole
ecosystem carbon turnover times (τ) as the ratio between carbon stocks and fluxes, we
provide an update of this ecosystem level diagnostic and its associated uncertainties in high
spatial resolution (0.083∘) using multiple, state-of-the-art, observation-based datasets
of soil organic carbon stock (Csoil), vegetation biomass (Cveg)
and gross primary productivity (GPP). Using this new ensemble of data, we estimated the global
median τ to be 43-7+7 yr
(median-difference to percentile 25+difference to percentile
75) when the full soil is considered, in contrast to limiting it to 1 m
depth. Only considering the top 1 m of soil carbon in circumpolar regions (assuming
maximum active layer depth is up to 1 m) yields a global median τ of
37-6+3 yr, which is longer than the previous estimates of 23-4+7 yr
(Carvalhais et al., 2014). We show that the difference is mostly attributed to changes in global
Csoil estimates. Csoil accounts for approximately 84 % of
the total uncertainty in global τ estimates; GPP also contributes significantly
(15 %), whereas Cveg contributes only marginally (less than 1 %) to the
total uncertainty. The high uncertainty in Csoil is reflected in the large range
across state-of-the-art data products, in which full-depth Csoil spans between
3362 and 4792 PgC. The uncertainty is especially high in circumpolar regions with an
uncertainty of 50 % and a low spatial correlation between the different datasets
(0.2<r<0.5) when compared to other regions (0.6<r<0.8). These uncertainties cast a shadow
on current global estimates of τ in circumpolar regions, for which further geographical
representativeness and clarification on variations in Csoil with soil depth are
needed. Different GPP estimates contribute significantly to the uncertainties of τ mainly in
semiarid and arid regions, whereas Cveg causes the uncertainties of τ in
the subtropics and tropics. In spite of the large uncertainties, our findings reveal that the
latitudinal gradients of τ are consistent across different datasets and soil depths. The
current results show a strong ensemble agreement on the negative correlation between τ and
temperature along latitude that is stronger in temperate zones (30–60∘ N) than
in the subtropical and tropical zones (30∘ S–30∘ N). Additionally, while the
strength of the τ–precipitation correlation was dependent on the Csoil data
source, the latitudinal gradients also agree among different ensemble members. Overall, and
despite the large variation in τ, we identified robust features in the spatial patterns of
τ that emerge beyond the differences stemming from the data-driven estimates of
Csoil, Cveg and GPP. These robust patterns, and associated
uncertainties, can be used to infer τ–climate relationships and for constraining
contemporaneous behavior of Earth
system models (ESMs), which could contribute to uncertainty reductions in future
projections of the carbon cycle–climate feedback. The dataset of τ is openly available at
https://doi.org/10.17871/bgitau.201911 (Fan et al., 2019).



S1 Mass conservative aggregation
In order to correctly unify the spatial resolutions and geographic coordinate systems of the datasets from different sources we used a mass conservative aggregation method. In this approach, the total amount/mass of a stock (i.e. soil and vegetation) does not change during spatial aggregation and transformation of the data. Note that this approach is only followed for data that are expressed in mass per unit area of a grid cell. First, a variable (X), in units of mass per unit area, to be aggregated from the original fine resolution (Xfine) is multiplied by a corresponding grid cell area at the same resolution (Afine), as shown in equation (1). Then the product at the fine resolution (XAfine) is aggregated by summing all the finer grid cells within a larger target coarse grid (equation (2)). This aggregated product will be in N✕N matrix size which depends on the target resolution. Similarly, the area of the target grid cell is calculated by summing the areas of all the fine grid cells within it (equation (3)). Finally, the area-weighted mean for the variable at the target resolution is calculated by dividing aggregated coarse product (XAcoarse) with a corresponding land area (Acoarse), as shown in equation (4). (1) The method was applied to all soil and vegetation carbon and GPP datasets (see Section 2 of the manuscript) that required aggregation.

S2 Bulk density correction
The bulk density (BD) estimates in SoilGrids and LandGIS are too high due to two reasons. First, there are relatively fewer measurements of BD, missing in several horizons (Hengl et al., 2017). In addition, there are known problems with measurements of BD in permafrost regions of Canadian and Eurasian high latitudes, especially in forest regions with high organic carbon (Tomislav Hengl, pers. comm.). In this study, we applied a pedotransfer function from Köchy et al. (2015) to make correction based on organic carbon concentration: = $1.38 − 0.31 × log $ 10 11 × 1000 (5) where, OC is the soil organic carbon content in percentage. Note that the correction was only applied in the grid cells with OC > 8%.

S3 Model selection for extrapolation of soil
In this section, we introduce the framework used to select the models for extrapolating soil carbon from 0 -2 m depth to the full soil depth.

S3.1 Different characteristics of permafrost and non-permafrost soil
In general, the amount and vertical distribution of soil organic carbon are largely influenced by the vegetation which fixes atmospheric CO2 and transfers carbon into the land ecosystems. However, the storage of soil organic carbon (SOC) has a more complex relationship with primary productivity (Jackson et al., 2017). The higher biomass, which implies more carbon sequestration by aboveground biomass, however, does not necessarily lead to increases in SOC storage, as would be under assumption of linear increase. Although the processes of soil formation, accumulation, and stabilization have been extensively studied and debated, modeling and parameterizing the simulation of mechanisms that determine the geographical and vertical distribution of SOC, especially in deeper soil, is still unclear at global scales. Therefore, instead of using process-based models, we employed statistical approaches and empirical mathematical models to estimate SOC at full depth from the globally available 2 m estimates. Such approach is needed since the available datasets of soil carbon only cover until a maximum of 2 m depth and the amount of carbon stored in deeper soil, which may have feedback to climate in the decadal to longer timescales, depending on the local SOC stabilization mechanisms, is still unknown. The other reason is that different datasets report SOC stock until different depths, and a direct comparison requires the harmonization of representative soil depths.
To characterize the vertical distribution of SOC, we used 425 permafrost peatland profiles from ISCN soil database (Nave et al., 2017) and 1000 profiles from WOSIS soil database (Batjes et al., 2019). Figure S1 shows the accumulated SOC stock profiles with depth in permafrost and non-permafrost regions. The vertical distribution of carbon with depth in the permafrost region shows a distinct feature of a linear relationship between SOC and soil depth that is not apparent in nonpermafrost profiles. This behavior translates strong stabilization mechanisms, likely the long-term result of temperature and hydrological controls on decomposition processes and implies that the SOC may continue to be large beyond the depth of 3 m in the permafrost regions ( Figure S1b). However, due to limited observations in permafrost peatlands, it is unclear to what depth the SOC concentrations are significant and to which extent a limited depth consideration affects the estimates of total SOC in these regions. In contrast, most of the soil profiles in the non-permafrost region stop increasing before the depth of 2 m ( Figure S1c).

S3.2. Estimating global SOC at full depth
We included 12 different empirical mathematical models (Table S1) to predict total SOC storage to the full soil depth. All the models were used to fit all the soil profiles in WOSIS and ISCN databases (see Figure S1 for observed data, and Figure   S2 for an example of model fitting for a soil profile). Despite the significant fit in each model there is a potential large range of model predictions at full depth ( Figure S2). We, therefore, tested two different model averaging methods to obtain the best representative SOC ensemble prediction until the full soil depth: 1. The equal weights averaging (EWA) method, in which all models are given equal weights and the ensemble prediction is simply calculated as the average of all model predictions, tests the fitness of all possible model combinations to achieve the best model ensemble for predicting SOC at full depth. The best model ensembles are obtained using the criteria to maximize MEF, minimize KL, and minimize AIC. The results show that EWA and BMA methods have similar performances (Table S2). Given the similarities in the statistical performance between methods, but the larger fraction of observations within prediction uncertainty (larger Coverage in Table S2) here we select the EWA method.
We conducted two different batches of experiments to consider the differences between the circumpolar and noncircumpolar datasets (see above section S3.1): (i) fitting the models using the observational data points down to 50 cm depth to estimate the SOC at full soil depth (beyond 100 cm) for non-circumpolar regions ( Figure S3); and (ii) fitting the models using the observational data points down to 200 cm depth to predict the SOC at full soil depth (beyond 200 cm) for circumpolar regions ( Figure S4). This is an approach that can be reproduced at every grid-cell and is here applied to the insitu data to assess the overall robustness of SOC estimates at full depth.
Two model ensembles were selected after using the EWA method to best represent the SOC vertical distribution, one for the circumpolar and non-circumpolar regions; results synthesized in Figures S3 and S4, respectively. Figure S3 shows the robustness of the ensemble including DIJKL to predict the SOC at full depth in the non-circumpolar region using the data points up to 50 cm depth. The modelling efficiency was 0.90 and with a small normalized root mean square error (NRMSE).
The histogram ( Figure S3b) of SOC distribution at full depth shows a similar distribution between the predictions and observations, with a small positive bias ( Figure S3c), which is revealed in the statistical difference between predictions and observations ( Figure S3b). Overall, more than 60% of observations were found to be within the uncertainty range of the prediction ensemble. The results show that the best ensemble represents the vertical distribution of Csoil in most sites. For the circumpolar region the best ensemble included ACDEF models. The results show that the overall model performance of the best ensemble in the circumpolar region is weaker in comparison to the non-circumpolar ensemble. In these regions, the modelling efficiency was 0.64 with a small NRMSE ( Figure S4a). The histogram ( Figure S4b) of SOC distribution at full depth shows qualitatively similar but statistically different distributions of predictions and observations, with a positive bias ( Figures S4b and c). These results show a reduction in performance when compared to the model ensemble for the noncircumpolar regions, while still explaining 77% of the spatial variability at site scales across soil profiles.
Based on these results, the models selected for each of the ensembles (DIJKL for the non-circumpolar and ACDEF for circumpolar regions) were then fitted to the vertical distribution of SOC per grid cell in each grid cell to predict the full depth SOC for each of the three global SOC datasets: Sanderman, SoilGrids and LandGIS.       Figure S11: The zonal correlation between τ and mean annual precipitation (P) using different soil depths and sources of soil carbon data. For details on the data, see Section 2 of the main text. Table S1: Empirical mathematical models used for extrapolation of soil carbon to full soil depth.