Improved estimate of global gross primary production for reproducing its long-term variation, 1982–2017

Satellite-based models have been widely used to simulate vegetation gross primary production (GPP) at the site, regional, or global scales in recent years. However, accurately reproducing the interannual variations in GPP remains a major challenge, and the long-term changes in GPP remain highly uncertain. In this study, we generated a long-term global GPP dataset at 0.05 latitude by 0.05 longitude and 8 d interval by revising a light use efficiency model (i.e., EC-LUE model). In the revised EC-LUE model, we integrated the regulations of several major environmental variables: atmospheric CO2 concentration, radiation components, and atmospheric vapor pressure deficit (VPD). These environmental variables showed substantial long-term changes, which could greatly impact the global vegetation productivity. Eddy covariance (EC) measurements at 95 towers from the FLUXNET2015 dataset, covering nine major ecosystem types around the globe, were used to calibrate and validate the model. In general, the revised EC-LUE model could effectively reproduce the spatial, seasonal, and annual variations in the tower-estimated GPP at most sites. The revised EC-LUE model could explain 71 % of the spatial variations in annual GPP over 95 sites. At more than 95 % of the sites, the correlation coefficients (R2) of seasonal changes between tower-estimated and model-simulated GPP are larger than 0.5. Particularly, the revised EC-LUE model improved the model performance in reproducing the interannual variations in GPP, and the averaged R2 between annual mean tower-estimated and model-simulated GPP is 0.44 over all 55 sites with observations longer than 5 years, which is significantly higher than those of the original EC-LUE model (R2 = 0.36) and other LUE models (R2 ranged from 0.06 to 0.30 with an average value of 0.16). At the global scale, GPP derived from light use efficiency models, machine learning models, and processbased biophysical models shows substantial differences in magnitude and interannual variations. The revised EC-LUE model quantified the mean global GPP from 1982 to 2017 as 106.2± 2.9 Pg C yr−1 with the trend 0.15 Pg C yr−1. Sensitivity analysis indicated that GPP simulated by the revised EC-LUE model was sensitive to Published by Copernicus Publications. 2726 Y. Zheng et al.: Improved estimate of global gross primary production atmospheric CO2 concentration, VPD, and radiation. Over the period of 1982–2017, the CO2 fertilization effect on the global GPP (0.22± 0.07 Pg C yr−1) could be partly offset by increased VPD (−0.17± 0.06 Pg C yr−1). The long-term changes in the environmental variables could be well reflected in global GPP. Overall, the revised EC-LUE model is able to provide a reliable long-term estimate of global GPP. The GPP dataset is available at https://doi.org/10.6084/m9.figshare.8942336.v3 (Zheng et al., 2019).


Introduction
Vegetation gross primary production (GPP) is the largest carbon flux component within terrestrial ecosystems and plays an essential role in regulating the global carbon cycle (Canadell et al., 2007;Zhao et al., 2010). As a primary variable of the terrestrial ecosystem cycle, GPP estimates will substantially determine other variables of the carbon cycle (Yuan et al., 2011). Satellite-based GPP models have been developed based on the light use efficiency (LUE) principle (Monteith, 1972;Potter et al., 1993;Running et al., 2004;Xiao et al., 2005;Yuan et al., 2007). Thus far, LUE models have been a major tool for investigating the spatiotemporal changes in GPP and the environmental regulations, either independently or by combining with other ecosystem models (Keenan et al., 2016;Smith et al., 2016).
However, current LUE models exhibit poor performance in reproducing the interannual variations in GPP. A previous study indicated that seven LUE models could only explain 6 %-36 % of the interannual variations in GPP at 51 eddy covariance (EC) towers . Similarly, a model comparison showed that none of the examined 16 process-based biophysical models or the three remote sensing products (BESS, MODIS C5, and MODIS C5.1) could consistently reproduce the observed interannual variations in GPP at 11 forest sites in North America (Keenan et al., 2012). Seven LUE models simulated the long-term trends in global GPP that varied from −0.15 to 1.09 Pg C yr −1 over the period 2000-2010 . An important reason for the poor performance in modeling the interannual variability is that the effect of environmental regulations on vegetation production is not completely integrated into the LUE models (Stocker et al., 2019). In particular, the long-term changes in several environmental variables are very important for accurately simulating the GPP series at the decadal scale.
Several environmental variables should be included in GPP models. Firstly, as we all know, the rising atmospheric CO 2 concentration in the past few decades substantially stimulated global vegetation growth (Zhu et al., 2016;Liu et al., 2017). Field experiments using greenhouses or open-top chambers showed that an increase of approximately 300 ppm in CO 2 concentration can increase the photosynthesis of C 3 plants on the order of 60 % (Norby et al., 1999). Free-air CO 2 enrichment (FACE) experiments generally confirmed the enhancement in net primary production (NPP) with the rising CO 2 concentration (Ainsworth and Long, 2005). For exam-ple, four FACE experiments indicated that the forest NPP consistently increased at the median of 23 ± 2 % when the ambient CO 2 concentration was elevated to approximately 550 ppm (Norby et al., 2005). According to observations, the atmospheric CO 2 concentration has risen by approximately 20 % from 340 ppm (1982) to 410 ppm (2018) (https: //www.esrl.noaa.gov/, last access: 25 June 2019). However, the effects of CO 2 fertilization on GPP have not been integrated in most current satellite-based LUE models.
Secondly, solar radiation, or more specifically the photosynthetic active radiation (PAR) substantially influences the vegetation production of the terrestrial ecosystem (Alton et al., 2007;Kanniah et al., 2012;Krupkova et al., 2017). A study indicated that the solar radiation incident at the earth surface underwent significant decadal variations (Wild et al., 2005). A comprehensive analysis based on the datasets of worldwide distributed sites indicated significant decreases in solar radiation (2 % per decade) from the late 1950s to 1990 in the regions of Asia, Europe, North America, and Africa (Gilgen et al., 1998). A later assessment by Wild et al. (2005) showed that the radiation has increased at widespread locations since the mid-1980s.
However, not only the total amount of solar radiation or PAR incident at the earth surface but more importantly also their partitioning into direct and diffuse radiations impact the vegetation productivity (Urban et al., 2007;Kanniah et al., 2012). An increased proportion of diffuse radiation enhances vegetation photosynthesis, because a higher blue / red light ratio within the diffuse radiation may lead to higher light use efficiency (Gu et al., 2002;Alton et al., 2007). For example, the sharply increased diffuse radiation induced by the 1991 Mount Pinatubo eruption enhanced the noontime vegetation productivity of a deciduous forest for the following 2 years (Gu et al., 2003). Besides volcanic aerosols, clouds could also reduce the total and direct radiation, while increasing the proportion of diffuse radiation. Yuan et al. (2010) found that the higher LUE at European forests than North America was because of the higher ratio of cloudy days in Europe. Yuan et al. (2014) further proved that the significantly underestimated GPP during cloudy days by six LUE models was because the effects of diffuse radiation on LUE were neglected in these models.
Thirdly, atmospheric vapor pressure deficit (VPD) is another factor that should be included in GPP models. As an important driver of atmospheric water demand for plants, VPD influences terrestrial ecosystem function and photosyn-thesis (Rawson et al., 1977;Yuan, et al., 2019). Rising air temperature increases the saturated vapor pressure at a rate of ∼ 7 % K −1 according to the Clausius-Clapeyron relationship, and therefore, VPD will increase if the atmospheric water vapor content does not increase by exactly the same amount as the saturated vapor pressure. Numerous studies indicated significant changes in the relative humidity (ratio of actual water vapor pressure to saturated water vapor pressure) in both humid areas and continental areas located far from oceanic humidity (Van Wijngaarden and Vincent, 2004;Pierce et al., 2013). In particular, the global averaged land surface relative humidity decreased sharply after the late 1990s (Simmons et al., 2010;Willett et al., 2014), and the global averaged land surface VPD increased sharply after the late 1990s . The leaf and canopy photosynthetic rate declines when the atmospheric VPD increases due to stomatal closure (Fletcher et al., 2007). A recent study highlighted that increases in VPD rather than changes in precipitation would be a dominant influence on vegetation productivity (Konings et al., 2017). However, currently the influence of long-term VPD variations is not well expressed in many LUE models.
We have developed a LUE model, namely the EC-LUE model, by integrating remote sensing data and eddy covariance data to simulate daily GPP (Yuan et al., 2007(Yuan et al., , 2010. The model has been evaluated using the observations at EC towers located in Europe, North America, China, and East Asia, covering various ecosystem types (Yuan et al., 2007(Yuan et al., , 2010Li et al., 2013). In this study, we revised the EC-LUE model by integrating the impacts of several environmental variables (i.e., atmospheric CO 2 concentration, radiation components, and atmospheric VPD) across a long-term temporal scale. Firstly, we evaluated the effectiveness of the revised EC-LUE model in determining the spatial, seasonal, and interannual variations in GPP from multiple eddy covariance sites. Secondly, a global GPP dataset at 0.05 • spatial resolution was generated based on the optimized model. Finally, we analyzed the contributions of the aforementioned environmental variables to the global GPP and discussed the spatial and interannual variations in GPP from different datasets.

Data from the eddy covariance towers
The FLUXNET2015 dataset (http://www.fluxdata.org, last access: 2 March 2018) includes over 200 variables of carbon fluxes, energy fluxes, and meteorological variables collected and processed at sites by the FLUXNET community. In our study, 95 EC sites in the FLUXNET2015 dataset were utilized to optimize the parameters and evaluate the performance of the revised EC-LUE model, including nine major terrestrial ecosystem vegetation types (Table 1): evergreen broadleaf forest (EBF), evergreen needleleaf for- est (ENF), deciduous broadleaf forest (DBF), mixed forest (MF), grassland (GRA), savanna (SAV), shrubland (SHR), wetland (WET), and cropland (CRO). More information about the characteristics of these sites can be found at the FLUXNET website. For each site, the daily GPP, PAR, air temperature (T a ), and VPD were used in our study. The GPP variable (GPP_NT_ VUT_REF) used in this study was estimated from the nighttime partitioning method. The corresponding net ecosystem exchange (NEE) was generated using the variable friction velocity (USTAR) threshold for each year (VUT), in which 40 versions of NEE were created by using different percentiles of USTAR thresholds. The model efficiency between each version and the other 39 versions was calculated to test their similarities, and the reference (REF) NEE was selected as the one with the higher model efficiency sum (the most similar to the other 39). The 120 daily meteorological variables were gap-filled or downscaled from the ERA-Interim reanalysis dataset in both space and time (Vuichard and Papale, 2015). The gap-filling technique of the carbon flux measurements and meteorological variables is the marginal distribution sampling (MDS) method described in Reichstein et al. (2005). In the FLUXNET 2015 dataset, the quality flags ranged from 0 to 1 to indicate percentage of measured and good-quality gap-filled data. For each variable, we used the daily/monthly values with more than 80 % of good-quality data (quality flag > 0.8). We aggregated the daily values to an 8 d time step. And only the 8 d measurements with more than 5 d valid values were used.

Data at the global scale
The global-scale datasets used in this study are shown in Table 2. The meteorological reanalysis dataset was derived from the second Modern-Era Retrospective analysis for Research and Applications (MERRA-2) dataset. It was produced by NASA's Global Modeling and Assimilation Office that used an upgraded version of GEOS-5 (Rienecker et al., 2011). It has been validated carefully using surface meteorological datasets and an enhanced assimilation system to reduce the uncertainty in various meteorological variables globally. In our study, we obtained the daily mean air temperature (T a , • C), mean dew point temperature (T d , • C), total direct PAR (PAR dr , MJ m −2 d −1 ), and total diffuse PAR (PAR df , MJ m −2 d −1 ) at 0.625 • in longitude by 0.5 • in latitude from 1982 to 2017. VPD was calculated from air temperature and dew point temperature: Ta+243.04 , where SVP is the saturated vapor pressure (kPa), and RH is the relative humidity. We aggregated the daily variables (air temperature, dew point temperature, VPD, direct PAR, and diffuse PAR) to 8 d interval temporal resolution. And these variables were resampled to the spatial resolution of 0.05 • latitude by 0.05 • longitude using the bilinear interpolation method. The 8 d Global LAnd Surface Satellite Leaf Area Index (GLASS LAI) dataset at 0.05 • latitude by 0.05 • longitude was adopted to indicate vegetation growth from 1982 to 2017. It was produced using the general regression neural networks (GRNNs) trained with the fused MOD15 LAI and CYCLOPES LAI and the preprocessed MODIS and AVHRR reflectance data over the BELMANIP sites . Product validation and comparison showed that the GLASS LAI product was spatially complete and temporally continuous with lower uncertainty (Xu et al., 2018). Additionally, the MCD12Q1 product with the IGBP classification scheme was used as the input land cover map. The ISLSCP II C 4 vegetation percentage map was used to separate the C 3 and C 4 crop. The NOAA Earth System Research Laboratory (ESRL) CO 2 concentration dataset was used to express the CO 2 fertilization effect.

The revised EC-LUE model
The terrestrial vegetation GPP can be expressed as follows in the revised EC-LUE model: where ε msu is the maximum LUE of sunlit leaves; APAR su is the PAR absorbed by sunlit leaves; ε msh is the maximum LUE of shaded leaves; APAR sh is the PAR absorbed by shaded leaves; and C s , T s , and W s represent the downward regulation scalars of atmospheric CO 2 concentration ([CO 2 ]), air temperature, and VPD on LUE ranging from 0 to 1; min represents the minimum value.
The effect of atmospheric CO 2 concentration on GPP is determined by the following equations (Farquhar et al., 1980;Collatz et al., 1991): where ϕ is the CO 2 compensation point in the absence of dark respiration (ppm), C i is the leaf internal CO 2 concentration, C a is the atmospheric CO 2 concentration, and χ is the ratio of leaf internal to atmospheric CO 2 concentration which can be estimated as follows (Prentice et al., 2014;Keenan et al., 2016): where ε is a parameter related to the "carbon cost of water", which means the sensitivity of VPD to χ; K is the Michaelis-Menten coefficient of Rubisco; and η * is the viscosity of water relative to its value at 25 • C (Korson et al., 1969).
where P o is the partial pressure of O 2 , and K c and K o are the Michaelis-Menten constants for CO 2 and O 2 (Keenan et al., 2016): where T a is air temperature (unit: K) and R is the molar gas constant (8.314 J mol −1 K −1 ). T s and W s can be expressed as follows: where T min , T opt , and T max are the minimum, optimum, and maximum temperatures for vegetation photosynthesis, respectively (Yuan et al., 2007); VPD 0 is the half-saturation coefficient of the VPD constraint equation (k Pa). APAR su and APAR sh can be expressed as follows (Chen et al., 1999): where PAR dir is the direct PAR; PAR dif is the diffuse PAR; PAR dif,u is the diffuse PAR under the canopy; C represents the multiple scattering effects of direct radiation; is the clumping index, which is set according to vegetation types (Tang et al., 2007); θ is the solar zenith angle; β is the mean leaf-sun angle, which is set to 60 • ; and θ is the representative zenith angle for diffuse radiation transmission and can be expressed by LAI (Chen et al., 1999): The LAIs of shaded leaves (LAI sh ) and sunlit leaves (LAI su ) in Eqs. (14) and (15) are computed following Chen et al. (1999):  (Table 3) were used to produce a GPP dataset at 0.05 • × 0.05 • spatial resolution and 8 d temporal resolution over 1982-2017. In order to investigate the uncertainties of the global GPP dataset, 10 000 sets of optimized parameters were randomly selected to simulate global GPP by assuming a normal distribution of these parameters ( Table 3). The uncertainty of global GPP simulations was determined by the mean absolute deviation (MAD) of all the 10 000 simulations (Khair et al., 2017).
Three metrics, the coefficient of determination (R 2 ), RMSE, and bias (the difference between observations and simulations) were adopted to evaluate the performance of the revised EC-LUE model. Additionally, Kendall's coefficient of rank correlation τ (Kanji, 1999) was used to quantify the agreement of seasonal changes between the simulated and tower-estimated GPP. The Kendall coefficient measured the tendency coherence between predicted and observed GPP by comparing the ranks assigned to successive pairs. If GPP sim,j − GPP sim,i and GPP obs,j − GPP obs,i have the same sign (positive or negative), the pair would be concordant, or discordant. With time series data with n observations, the Kendall coefficient of rank correlation τ can be expressed as where n(n − 1)/2 is the total combination of pairs, C is the number of concordant pairs, and D is the number of discordant pairs. Kendall's coefficient ranged from −1 (C = 0) to 1 (D = 0). The Kendall coefficient is much closer to 1, which means a stronger positive relationship between the seasonal patterns of the simulated and tower-estimated GPP.
In addition, we compared the model performance of the revised EC-LUE model with seven light use efficiency models, three machine learning methods, and 10 process-based biophysical models at a monthly step. The participating light use efficiency models include CASA (Potter et al., 1993), CFlux (Turner et al., 2006;King et al., 2011), CFix (Veroustraete et al., 2002), MODIS (Running et al., 2004), VPM (Xiao et al., 2005), VPRM (Mahadevan et al., 2008), and EC-LUE (Yuan et al., 2007). We calibrated the model parameters of all seven light use efficiency models based on the eddy covariance measurements using the same parameterization method as the revised EC-LUE model (see the above method), and then we compared the GPP simulations of seven LUE models driven by EC tower-based meteorology data against the estimated GPP based on EC measurements. For the comparison with machine learning models and process-based biophysical models, we collected their global monthly GPP products released by FLUXCOM (Jung et al., 2017) (Smith et al., 2014), the Land surface Processes and eXchanges (LPX-Bern) (Stocker et al., 2014), the ORganizing Carbon and Hydrology In Dynamic EcosystEms (OR-CHIDEE) (Krinner et al., 2005), and the Vegetation Integrated Simulator for Trace Gases (VISIT) . The monthly GPP simulations at all investigated EC sites were derived from their global products, and equally we obtained the monthly GPP simulations of the revised EC-LUE model from its global dataset driven by the MERRA-2 reanalysis dataset.

Environmental contributions to long-term changes in GPP
To evaluate the contribution of the major environmental variables to GPP, including the atmospheric CO 2 concentration ([CO 2 ]), climate, and satellite-based LAI, two types of experimental simulations were performed. The first simulation experiment (S ALL ) was a normal model run, with all the environmental drivers changing over time. In the second type of simulation experiments (S CLI0 , S LAI0 , and S CO 2 0 ), two driving factors could be varied with time while maintaining Considering the differences between the simulation results of the first type (S ALL ) and the second type (S CO 2 0 and S LAI0 ) of experiments, the GPP sensitivities to atmospheric [CO 2 ] (β CO 2 ) and LAI (β LAI ) were estimated as follows: where GPP i , CO 2i , and LAI i denote the differences in the GPP simulations, atmospheric [CO 2 ], and LAI between the two model experiments from 1982 to 2017, and ε is the stochastic error term.
The GPP sensitivities to the three climate variables air temperature (β T a ), VPD (β VPD ), and PAR (β PAR ) were calculated using a multiple regression approach: where T a i , VPD i , and PAR i denote the differences in T a , VPD, and PAR time series between the two model experiments (S ALL and S CLI0 ), respectively. The regression coefficient β was estimated using the maximum likelihood analysis.

Model performance
In general, the revised EC-LUE model could effectively reproduce the spatial, seasonal, and annual variations in the tower-estimated GPP at most sites (Figs. 1-3). The revised EC-LUE model explained 71 % and 64 % of the spatial variations in GPP across all the validation sites by using the towerderived meteorology data and the meteorological reanalysis dataset, respectively (Fig. 1).
The revised EC-LUE model also showed a good performance in reproducing the seasonal variations in GPP at most EC sites (Fig. 2). In this study, we compared the modeled and tower GPP at an 8 d step for each site to examine the model capacity in reproducing the temporal variations in GPP. In terms of GPP simulations driven by tower-derived meteorology data, the coefficients of determination (R 2 ) varied from 0.26 at the MY-PSO site to 0.96 at the DK-Sor site, with most of them being statistically significant (p value < 0.05) (Fig. 2a, e), and the mean R 2 was 0.81 over all investigated sites. The low R 2 values (< 0.4) were found at three tropical forest sites (i.e., MY-PSO, BR-Sa1, and BR-Sa3). The averaged Kendall correlation coefficient (τ ) was 0.63 over all sites, indicating a strong seasonal coherence between simulated and tower-estimated GPP (Fig. 2d, h). Similarly, τ at tropical forest sites was generally lower than at other sites. According to the RMSE and absolute value of bias, the revised EC-LUE model performed very well at most sites. The averaged RMSE and absolute value of bias over all the sites were 2.13 and 0.81 g C m −2 d −1 , respectively ( Fig. 2b-c, f-g). In addition, there was no obvious difference between the seasonal GPP performances using the tower-derived meteorology data and the meteorological reanalysis dataset (Fig. 2). On average, the revised EC-LUE model showed higher R 2 and τ and lower RMSE and absolute value of bias than the original EC-LUE model (Fig. 2). Furthermore, we selected three sites with high R 2 (US-UMB; DBF; R 2 = 0.93), median R 2 (CN-Din; EBF; R 2 = 0.71), and low R 2 (Br-Sa3; EBF; R 2 = 0.39) to illustrate the time series changes of observed and simulated GPP, LAI, and environmental factors (i.e., air temperature, VPD, and PAR) (Figs. S1-S3). At the US-UMB site, the model captured the GPP variations well year round with no obvious bias (Fig. S1). At the CN-Din site, the model generally performed well except for the underestimation at the end of the year (November-December) with decreased LAI (Fig. S2). However, at the Br-Sa3 site, the model could not capture the variations in GPP for the vegetation greenness and environmental factors varying slightly during the year (Fig. S3).
The ability of the LUE models to reproduce the interannual variations in GPP was investigated at 55 EC towers with observations greater than 5 years (Table 1; Fig. 3). We examined the relations between the mean annual GPP simulations and observations at each site and used the coefficient correla-tion (R 2 ) and slope of the regression relationship to investigate the model capability of simulating the interannual variations in GPP. The result showed that the revised EC-LUE model could effectively determine the interannual variations in GPP (Fig. 3). Approximately 42 % and 40 % of the sites showed higher R 2 values (> 0.5) by using the tower-derived meteorology data and the meteorological reanalysis dataset (Fig. 3a). The averaged R 2 for the revised EC-LUE model was 0.44 by using the tower-derived meteorology data, which was significantly higher than the original EC-LUE model (R 2 = 0.36) and other LUE models (R 2 ranged from 0.06 to 0.30 with an average value of 0.16) (Fig. 3c). The averaged R 2 for the revised EC-LUE model was 0.42 by using the meteorological reanalysis dataset. The averaged slopes of the revised EC-LUE model were 0.60 and 0.57 by using the tower-derived meteorology data and the meteorological reanalysis dataset (Fig. 3c).
Additionally, we examined the model performance of the revised EC-LUE model, other LUE models, machine learning models, and process-based biophysical models in TRENDY at a monthly step by comparing against EC towerestimated GPP (Fig. 4). In comparison with seven LUE models driven by the EC tower-based meteorology dataset, the overall R 2 of the revised EC-LUE model was 0.71, higher than the original EC-LUE model and other LUE models (R 2 ranged from 0.55 to 0.61) (Fig. 4a). For each site, we compared the R 2 , RMSE, and absolute value of bias of the individual model with the averaged value of all eight LUE models (each site has an averaged R 2 , RMSE, and absolute value of bias) (Fig. S4a1-c1). The revised EC-LUE model had higher R 2 than the mean R 2 of the eight LUE models at 62 % of sites, which was comparable with the original EC-LUE model (63 % of sites) and VPM model (60 % of sites) (Fig. S4a1). Moreover, the revised EC-LUE model showed the lower RMSE and absolute value of bias compared to mean values of all eight LUE models at 68 % and 67 % of sites, respectively, which indicated the better performance compared to the other LUE models at most sites ( Fig. S4b1-c1). By using the global reanalysis meteorology data, we compared the performance of the revised EC-LUE model with three existing machine learning model products and 10 process-based biophysical model products in TRENDY (Fig. 4b). The overall R 2 of the revised EC-LUE model (R 2 = 0.57) was higher than that of other models (R 2 ranged from 0.02 to 0.54) (Fig. 4b). The revised EC-LUE model, FLUXCOM ANN, and FLUXCOM MARS had more sites (over 90 %) with higher R 2 than the mean R 2 (Fig. S4a2). And the revised EC-LUE model, FLUXCOM MARS, and FLUXCOM RF showed the lower RMSE at more than 90 % of sites (Fig. S4b2). Compared to the other models, the revised EC-LUE model had the highest site percentage (81 %) with a lower absolute value of bias (Fig. S4c2). Furthermore, the revised EC-LUE model had a higher R 2 , higher τ , lower RMSE, and lower absolute value of bias at most of the sites (Fig. S5).  Yuan et al. (2014). * * and * indicate a significant difference in statistic variables (R 2 and slope) between the rEC-LUE (T ) and other LUE models (i.e., rEC-LUE (T ) and other seven LUE models) at p value < 0.01 and p value < 0.05, respectively.

Spatial-temporal patterns of global GPP
A global GPP dataset at 0.05 • latitude by 0.05 • longitude and an 8 d interval was generated from 1982 to 2017 based on the revised EC-LUE model. The global GPP was 106.2 ± 2.9 Pg C yr −1 across the vegetated area averaged from 1982 to 2017. The GPP was high over the tropical forest areas, such as the Amazon and Southeast Asia, where the moisture and temperature conditions are sufficient for photosynthesis (Fig. 5a). The GPP decreased with the decreasing gradients of temperature and precipitation (Fig. 5b). The moderate GPP was found in temperate and subhumid regions, and the lowest GPP was located in arid or cold regions, where either precipitation or temperature is limited (Fig. 5b).
The long-term trend of GPP over the period of 1982-2017 was determined using a linear regression analysis (Fig. 6). In general, the revised EC-LUE model showed an increased trend in the annual mean GPP from 1982 to 2017. Approximately 69.5 % of the vegetated areas, mainly located in temperate and humid regions, showed increased trends. The spatial pattern of the GPP trend along with the temperature and precipitation gradients was substantially heterogeneous (Fig. 6b). The decreased GPP was found in the tropic regions, especially in the Amazon forest (Fig. 6a). The extremely cold or arid areas exhibited fewer variations in GPP (Fig. 6b).
In addition, this study used the MAD of 10 000 simulations to quantify the uncertainty of estimated GPP globally (see methods). Over the globe, the mean uncertainty of estimated GPP by the revised EC-LUE model is 19.33 Pg C yr −1 . The GPP uncertainties were found to be low in high and middle latitudes but relatively high in tropical forests (about 600 g C m −2 yr −1 ) (Fig. 7).

Contributions of environmental variables to GPP
To quantify the contributions of the environmental variables to long-term changes in GPP, we explored the sensitivity of global summed GPP to climate variables (i.e., VPD, Ta, and PAR), LAI, and atmospheric CO 2 (Fig. 8). The global summed GPP generated from different experimental simulations (Sect. 2.5) appeared differently in terms of the annual mean value, trend, and standard deviation (Fig. 8a). The normal simulated GPP (S ALL GPP, all the environmental drivers changing over time) significantly increased at the rate of 0.15 Pg C yr −1 , while the increasing rate of S CLI0 GPP (climate variables were kept constant at 1982 values) was even greater (0.41 Pg C yr −1 ). On the contrary, the S LAI0 GPP (LAI was kept constant at 1982 values) and the S CO 2 0 GPP (atmospheric [CO 2 ] was kept constant at 1982 values) showed an insignificantly decreasing trend at the rate of −0.04 and −0.07 Pg C yr −1 (Fig. 8a). The GPP sensitivity analysis showed that the global GPP decreased by 6.67 ± 5.04 Pg C with a 0.1 kPa increase in VPD, which was comparable to the increase in GPP with 0.1 unit greening of LAI (i.e., β LAI = 4.78 ± 0.72 Pg C 0.1 unit −1 ) or a 100 MJ increase in PAR (i.e., β PAR = 5.73 ± 3.22 Pg C 100 MJ −1 ) (Fig. 8b). The global GPP increased by 12.31 ± 0.61 Pg C with a 100 ppm −1 rise of atmospheric [CO 2 ] (i.e., β CO 2 = 12.31 ± 0.61 Pg C 100 ppm −1 ). Over the period of 1982-2017, the increased VPD resulted in global GPP decreases of −0.17±0.06 Pg C yr −1 , which could partly counteract the Comparisons between estimated GPP based on EC measurements and GPP simulations in growing season (defined as temperature larger than 0 • ) by the various models (including LUE models, machine learning models, and process-based biophysical models in TRENDY) at a monthly scale. The comparison of GPP simulations was simulated using (a) tower-derived meteorology data and (b) global reanalysis meteorology data, respectively (see Sect. 2.4). rEC-LUE (T ) and rEC-LUE (R) indicate the simulations of the revised EC-LUE model derived from tower-derived meteorology data and global reanalysis meteorology data, respectively.   fertilization effect of CO 2 (0.22±0.07 Pg C yr −1 ). The global GPP showed a decreasing trend after 2001 due to the joint effect of increased VPD and decreased PAR (Fig. 8c). While the increasing trend of GPP before 2000 was affected by the rising atmospheric [CO 2 ], greening of LAI, and increased PAR (Fig. 8c).

Model accuracy analysis
Numerous studies have shown that most GPP models can reproduce the spatial changes in GPP but failed to reproduce the temporal variations (Keenan et al., 2012;Yuan et al., 2014). Therefore, the capacity to reproduce realistic interannual variations for a GPP model is significantly important. In our study, the revised EC-LUE model performed a higher accuracy in reproducing the interannual variations in GPP than did the original EC-LUE model and other LUE models. Yuan et al. (2014) reported that the averaged slope of the regression relation between the mean annual GPP simulated by seven LUE models and the mean annual GPP estimated from the EC towers ranged from 0.19 to 0.56 (Fig. 3c). In contrast, the revised EC-LUE model showed a higher slope of regression relation (0.60), which is much closer to 1 than that obtained from other LUE models (Fig. 3c). The VPM GPP showed fewer interannual variations across most biomes (R 2 < 0.5), probably because of the insensitivity of the environmental stress factors at the interannual scale . In contrast, 42 % of the sites showed higher R 2 values (> 0.5) for the revised EC-LUE model. The improvements of the revised EC-LUE model in reproducing interannual variations are owing to the integration of several important environmental drivers for vegetation production (i.e., atmospheric CO 2 concentration, radiation components, and VPD), which exhibited large variations and contributed significantly to vegetation production at an interannual scale.
By integrating the atmospheric CO 2 concentration, the revised EC-LUE model suggested a CO 2 sensitivity (β CO 2 ) of 12.31 ± 0.61 Pg C per 100 ppm (Fig. 8b), which indicates an increase of 11.6 % in GPP with a rise of 100 ppm in atmospheric [CO 2 ]. Our estimate is comparable to the observed response of NPP to the increased CO 2 in the FACE experiments (13 % per 100 ppm) and estimates of other ecosystem models (5 %-20 % per 100 ppm) (Piao et al., 2013). The elevated atmospheric CO 2 concentration substantially contributes to vegetation productivity.
The evaporation fraction (EF), namely the ratio of evapotranspiration (ET) to net radiation (Rn), was used to indicate the water stress on vegetation growth in the original EC-LUE model (Yuan et al., 2007(Yuan et al., , 2010. The atmospheric VPD was used to indicate water stress to avoid the aggregated errors from ET simulations in the revised EC-LUE model. Physiologically, vegetation production is sensitive to both atmospheric VPD and soil moisture availability to roots. Several studies have reported highly consistent interannual variability of VPD and soil moisture (Zhou et al., 2019a, b). In addition, recent studies highlighted that the increase in VPD had a larger limitation to the surface conductance and evapotranspiration than soil moisture over short timescales in many biomes Sulman et al., 2016). Other studies have also suggested substantial impacts of VPD on vege-  GPP changes over 1982GPP changes over -2017GPP changes over , 1982GPP changes over -2000GPP changes over , and 2001GPP changes over -2017. The asterisk indicates the significant level at p value < 0.05. tation growth (de Cárcer et al., 2018;Ding et al., 2018), forest mortality (Williams et al., 2013), and crop yields (Lobell et al., 2014). It is increasingly important to integrate the atmospheric water constraint into carbon and water flux modeling.

Comparison of global GPP products
Global and regional GPP estimates remain highly uncertain despite the substantial advances in remote sensing technology, ground observations, and theory of carbon flux modeling (Zheng et al., 2018;Ryu et al., 2019). At a regional scale, we compared the annual mean GPP between the revised EC-LUE model and other models across the bioclimatic zones in the Köppen-Geiger climate classification map (Beck et al., 2018) (Fig. 9). The GPP of the revised EC-LUE model was comparable to the mean value of other models for each bioclimatic zone (Fig. 9a). The GPP of different models exhibited large discrepancies in tropical regions (Af, Am, Aw) (Fig. 9a). The correlations (R 2 ) of GPP across all the bioclimatic zones between the revised EC-LUE model and other models ranged from 0.73 (LPX-Bern) to 0.95 (FLUXCOM MARS, FLUXCOM RF) (Fig. 9b).
At a global scale, our study showed large differences in the magnitude of global GPP estimated by various models varied from 92.7 to 168.7 Pg C yr −1 (Figs. 10-11). The LUE models simulated the global GPP to range from 92.7 to 133.7 Pg C yr −1 (Fig. 11a1). Several machine learning approaches estimated the global GPP to range from 111.0 to 144.2 Pg C yr −1 (Fig. 11a2). A comparison of 10 biophysical models in TRENDY showed that the global GPP ranged from 107.8 to 154.9 Pg C yr −1 (Fig. 11a3). The revised EC-LUE model quantified the mean global GPP from 1982 to 2017 as 106.2 ± 2.9 Pg C yr −1 . Other studies also support the conclusion that there are large uncertainties in the GPP estimates. By comparing diverse GPP models and products, Anav et al. (2015) reported that the global GPP ranged from 112 to 169 Pg C yr −1 . Seven satellite-based LUE models estimated the global GPP ranged from 95.1 to 139.7 Pg C yr −1 over the period of 2000-2010 .
The interannual variability and trend in GPP also vary substantially with different models. This study showed that the interannual variability (standard deviation) ranged from 0.32 to 5.89 Pg C yr −1 , with the trends varying from −0.05 to 0.84 Pg C yr −1 (Fig. 11). The biophysical models showed large interannual variability, with the standard deviation ranging from 1.38 to 5.89 Pg C yr −1 . The LUE models estimated the interannual variability varied from 1.30 to 3.13 Pg C yr −1 . In contrast, the machine learning models exhibited less interannual variability with a standard deviation under 1.0 Pg C yr −1 . The interannual variability of the revised EC-LUE model was 2.9 Pg C yr −1 (Fig. 11b1-b3). In general, the GPP interannual variability before the year Figure 9. Comparisons of long-term (1982 to 2010s) averaged GPP between the revised EC-LUE model and other models across bioclimatic zones in the Köppen-Geiger climate classification map (Beck et al., 2018). (a) The regional averaged value (b) correlation coefficients (R 2 ) of GPP across all the bioclimatic zones between the revised EC-LUE model and other models. These models include three machine learning models (FLUXCOM ANN, FLUXCOM MARS, FLUXCOM RF; Jung et al., 2017), the biophysical model BEPS (Ju et al., 2006;, and 10 biophysical models in TRENDY (CABLE, CLASS-CTEM, CLM, ISAM, JSBACH, JULES, LPJ-GUESS, LPX-Bern, ORCHIDEE, and VISIT). The abbreviations for the bioclimatic zones are as follows: Af, tropical, rainforest; Am, tropical, monsoon; Aw, tropical, savannah; BWh, arid, desert, hot; BWk, arid, desert, cold; BSh, arid, steppe, hot; BSk, arid, steppe, cold; Csa, temperate, dry summer, hot summer; Csb, temperate, dry summer, warm summer; Csc, temperate, dry summer, cold summer; Cwa, temperate, dry winter, hot summer; Cwb, temperate, dry winter, warm summer; Cwc, temperate, dry winter, cold summer; Cfa, temperate, no dry season, hot summer; Cfb temperate, no dry season, warm summer; Cfc, temperate, no dry season, cold summer; Dsa, cold, dry summer, hot summer; Dsb, cold, dry summer, warm summer; Dsc, cold, dry summer, cold summer; Dsd, cold, dry summer, very cold winter; Dwa, cold, dry winter, hot summer; Dwb, cold, dry winter, warm summer; Dwc, cold, dry winter, cold summer; Dwd, cold, dry winter, very cold winter; Dfa, cold, no dry season, hot summer; Dfb, cold, no dry season, warm summer; Dfc, cold, no dry season, cold summer; Dfd, cold, no dry season, very cold winter; ET, polar, tundra; EF, polar, frost.
2000 was greater than that after the year 2001 for most of the biophysical models and LUE models (Fig. 11b1-b3). Most GPP models showed an increasing trend or insignificant trend during all valid years and before 2000. Similar to the standard deviation, the trends of machine learning models were less than other models. Compared with the other models, CLASS-CTEM and the revised EC-LUE model showed a significant decreasing trend after 2001 (Fig. 11c1-c3), probably because of the joint effect of increased VPD and decreased PAR (Fig. 8c).

Model uncertainty
The uncertainties of our GPP dataset were low in high-and middle-latitude areas but high in tropical areas (Fig. 7). This is consistent with the validations at site level showing the revised EC-LUE model showed the lowest accuracy over the tropical evergreen broadleaf forest sites (Fig. 2). Similarly, other satellite-based models exhibited a large uncertainty in the GPP simulations over tropical forest areas (Ryu et al., 2011;Yuan et al., 2014). For example, the MODIS GPP product (MOD17) underestimated GPP at high-productivity sites over the tropical evergreen forests (de Almeida et al., 2018). Regarding the quality of satellite data, a high cloud cover exists over tropical regions, introducing large uncer- Figure 10. Comparisons of annual global summed GPP estimates from various models. The datasets or model algorithms were obtained from EC-LUE , MODIS (Smith et al., 2016), MOD17 C6 (Running et al., 2004), PR (Keenan et al., 2016), VPM , FLUXCOM (Jung et al., 2017), SVR (Kondo et al., 2015), BESS (Jiang and Ryu, 2016), BEPS (Ju et al., 2006;Liu et al., 2018), and models in TRENDY (CABLE, CLASS-CTEM, CLM, ISAM, JSBACH, JULES, LPJ-GUESS, LPX-Bern, ORCHIDEE, and VISIT). Figure 11. Comparison of (a1)-(a3) averaged annual GPP, (b1)-(b3) interannual variability in annual GPP represented by standard deviation (SD), and (c1)-(c3) annual GPP trend among different GPP datasets or models. The references of these models are the same as in Fig. 9. The asterisk indicates that the valid period of the dataset begins from 2000 or 2001.
tainties to fraction of absorbed photosynthetically active radiation (FAPAR), LAI and other vegetation indices (e.g., normalized difference vegetation index, NDVI, and enhanced vegetation index, EVI). As suggested by de Almeida et al. (2018), the lack of reliable MOD15 FAPAR data from January to April as a result of the cloudiness contamination could have substantially affected the seasonality of GPP estimates. Besides, the quality of satellite data can even affect the evaluation of the interannual variations in GPP. Using MODIS EVI data, Saleska et al. (2007) reported a large-scale green-up in the Amazon evergreen forests during the drought of 2005. However, an opposite conclusion was drawn when the cloud-contaminated data were excluded from the analysis (Samanta et al., 2010). In our study, a significant decrease in GPP was found in the Amazon evergreen forests, which may result from the sharp increase in VPD after the late 1990s . Studies using optical satellite data can be influenced by the cloudiness contamination. Recently studies using cloud-free satellite-based microwave data also reported a carbon loss in tropic forest (Liu et al., 2015;Fan et al., 2019).
The latest study highlighted that the aggregate canopy phenology rather than the climate changes is the main cause of the seasonal changes in photosynthesis in evergreen broadleaf forests (Wu et al., 2016). In particular, the new leaf growing synchronously with dry season litterfall may shift the old canopy to be younger, which can explain the significant seasonal increase (∼ 27 %) in the ecosystem photosynthesis. Therefore, the vertical changes in leaf age and photosynthesis ability with canopy depth are important to simulate the seasonal variations in carbon flux in tropical forests . These leaf-trait-related parameters can be simulated from the narrow-band spectra of leaves (Serbin et al., 2012;Dechant et al., 2017). Nevertheless, because of the limitation in obtaining the large-scale hyperspectral remote sensing data, regional or global estimation of these parameters is currently unavailable.
The revised EC-LUE model does not integrate the regulation of soil nitrogen content on vegetation production. Atmospheric nitrogen deposition has exhibited a large increasing trend in the past few decades because of the excessive fossil fuel combustion in the industrial and transportation sectors and the abuse of nitrogenous fertilizer in agricultural practices (Galloway et al., 2004). And the global land atmospheric nitrogen deposition is expected to further increase dramatically from 25 to 40 Tg N yr −1 in the 2000s to 60-100 Tg N yr −1 in 2100 (Lamarque et al., 2005). A metaanalysis of worldwide nitrogen addition experiments found that nitrogen addition could have a significantly positive effect on vegetation productivity (Liu and Greaver, 2009). As most terrestrial ecosystems are nitrogen limited, quantifying the spatiotemporal distributions of vegetation nitrogen content at large scales is essential to improve the accuracy of carbon flux estimation. Several studies quantified the leaf nitrogen content by detecting the nitrogen absorption spectra from the narrow band of hyperspectral data (Cho, 2007). However, leaf water, starch, lignin, and cellulose overlap with the absorption characters of nitrogen in the shortwave infrared bands, making it difficult to retrieve the nitrogen content (Kokaly and Clark, 1999). Additionally, canopy structures, background, and illumination/viewing geometry can further decrease the capacity to detect leaf nitrogen (Yoder and Pettigrew-Crosby, 1995;Knyazikhin et al., 2013). Advances in inversion and statistical models of leaf or canopy nitrogen have emerged (Asner et al., 2011;Dechant et al., 2017;Wang et al., 2018), but these methods require further evaluation over large regions, and the global map of leaf or canopy nitrogen is not available yet.
Additionally, the uncertainty of the revised EC-LUE model may arise by scale mismatches between eddy covariance flux footprint and input datasets. The eddy covariance flux footprint is generally less than 3 km 2 and varies depending on the wind speed, wind direction, and atmospheric stability (Tan et al., 2006). In our studies, the revised EC-LUE model was run at 0.05 • (∼ 5 km 2 ) spatial resolution. The uncertainty of simulated GPP introduced by the scale effect is inevitable but smaller than that introduced by the model structures, parameters, or input datasets (Sjostrom et al., 2013;Zheng et al., 2018).

Data availability
The 0.05 • × 0.05 • global GPP dataset for 1982-2017 is available at https://doi.org/10.6084/m9.figshare.8942336.v3 . The dataset is provided in HDF format at an 8 d interval. The valid value ranges from 0 to 3000, and the background filled value is 65 535. The scale factor of the data is 0.01. Each HDF file represents an 8 d GPP at a daily value (unit: g C m −2 d −1 ). To obtain the summation of each 8 d (or 5 d or 6 d) period, please multiply the GPP value by the corresponding days (8 for the first 45 values and 5 or 6 for the last value in a year).

Conclusion
In this study, we produced a long-term global GPP dataset by integrating several major long-term environmental variables into a light use efficiency model, including atmospheric CO 2 concentration, radiation components, and atmospheric water vapor pressure. These environmental variables showed substantial long-term changes and contributed significantly to vegetation production at an interannual scale. The revised EC-LUE model performed well in simulating the spatial, seasonal, and interannual variations in GPP across the globe. In particular, it has a unique superiority in reproducing the interannual variations in GPP (R 2 = 0.44) compared with the original EC-LUE model (R 2 = 0.36) and other LUE models (R 2 ranged from 0.06 to 0.30 with an average value of 0.16). The GPP dataset derived from the revised EC-LUE model provides an alternative and reliable estimate of global GPP at the long-term scale by integrating the important environmental variables.
Author contributions. WY and YZ designed the research, performed the analysis, and wrote the paper; RS, YW, and XL per-formed the analysis; SLiu, SLiang, JC, WJ, and LZ edited and revised the manuscript.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. The covariance data used in the study were acquired and shared by the FLUXNET community, including the following networks: AmeriFlux, AfriFlux, Asi-aFlux, CarboAfrica, CarboEuropeIP, CarboItaly, CarboMont, Chi-naFlux, Fluxnet-Canada, GreenGrass, ICOS, KoFlux, LBA, NECC, OzFlux-TERN, TCOS-Siberia, and USCCC. The ERA-Interim reanalysis data are provided by ECMWF and processed by LSCE. The FLUXNET eddy covariance data processing and harmonization was carried out by the European Fluxes Database Cluster, Ameri-Flux Management Project, and Fluxdata project of FLUXNET, with the support of CDIAC and ICOS Ecosystem Thematic Center, and the OzFlux, ChinaFlux, and AsiaFlux offices.
Financial support. This research has been supported by the National Science Fund for Distinguished Young Scholars (41925001) and National Key Basic Research Program of China (grant no. 2016YFA0602701).
Review statement. This paper was edited by Yuyu Zhou and reviewed by three anonymous referees.