the Creative Commons Attribution 4.0 License.
A flux tower site attribute dataset intended for land surface modeling
Jiahao Shi
Hua Yuan
Wanyi Lin
Wenzong Dong
Hongbin Liang
Zhuo Liu
Jianxin Zeng
Haolin Zhang
Nan Wei
Zhongwang Wei
Shupeng Zhang
Shaofeng Liu
Xingjie Lu
Yongjiu Dai
Land surface models (LSMs) require reliable forcing, validation, and surface attribute data as the foundation for effective model development and improvement. Eddy covariance flux tower data are widely regarded as the benchmark for LSMs. However, currently available flux tower datasets often require several processing steps to ensure data quality before application to LSMs. More importantly, these datasets frequently lack site-observed attribute data, such as fractional vegetation cover and leaf area index, which limits their utility as benchmarking data. Here, we conducted a comprehensive quality screening of the existing reprocessed flux tower dataset, considering the proportion of gap-filled data, energy balance closure (EBC), and external disturbances such as irrigation and deforestation, yielding 90 high-quality sites. For these sites, we collected vegetation, soil, and topography data as well as wind speed, temperature, and humidity measurement heights from literature; regional networks; and Biological, Ancillary, Disturbance, and Metadata (BADM) files. We then compiled the final flux tower attribute dataset by filling in missing attributes with global data and classifying plant functional types (PFTs). This dataset is provided in NetCDF (Network Common Data Form) format with necessary descriptions and reference sources. Model simulations revealed substantial disparities between outputs driven by site-observed attribute data and outputs driven by the attribute data commonly used by LSMs, underscoring the critical role of site-observed attribute data and calling for greater emphasis on flux tower attribute data in the LSM community. The dataset partly addresses the lack of site attribute data, reduces uncertainty in LSM data sources, and aids in diagnosing parameter and process deficiencies. The dataset is available at https://doi.org/10.5281/zenodo.12596218 (Shi et al., 2024).
Land surface models (LSMs) simulate the exchange of carbon, water and energy fluxes between soil, vegetation, and atmosphere and are essential tools for comprehending and predicting mass and energy interactions between the Earth's biosphere and atmosphere (Pitman, 2003; Williams et al., 2009). The key role of LSMs is to provide the land surface boundary conditions for climate and weather forecast models (Mariotti et al., 2018; Pitman, 2003) as well as uncoupled stand-alone runs to investigate terrestrial water resources, ecology, and carbon storage (Crow et al., 2012; Humphrey et al., 2021; Ukkola et al., 2016a). Therefore, LSMs offer valuable insights for addressing environmental issues and mitigating climate change. Offline (i.e., uncoupled) LSMs are forced by meteorological data, including wind speed, air temperature, specific humidity, air pressure, precipitation, and downward longwave and shortwave radiation. Flux towers measure the cycling of carbon, water, and energy between the biosphere and atmosphere, providing observations with meteorological data that can be used to force offline LSMs. These observations are characterized by high temporal resolution (typically 30 min), continuous observations, and direct flux measurements and often span years. For these reasons, they are regarded as benchmarking data for LSM calibration, evaluation, and enhancement, enabling model development from sub-daily to seasonal and interannual scales. Numerous studies have leveraged flux tower data for developing LSMs (Best et al., 2015; Blyth et al., 2010; Harper et al., 2021; Melton et al., 2020; Stevens et al., 2020; Stöckli et al., 2008; Ukkola et al., 2016b; Zhang et al., 2017). However, despite their significance, flux tower data were not originally designed for testing and validating LSMs. When applied to LSMs, these datasets suffer from poor data quality and a deficiency of site attribute data.
FLUXNET2015 is currently the most widely used flux tower dataset (Pastorello et al., 2020). However, substantial preprocessing is frequently required to ensure the reliability of meteorological forcing and flux assessment data for LSMs. To reduce repetitious data processing efforts and improve consistency, Ukkola et al. (2022) integrated three flux tower datasets (FLUXNET2015, La Thuile, and OzFlux) and then performed screening, gap-filling, and other procedures to resolve issues such as missing data and energy balance closure (EBC). This effort resulted in a dataset called PLUMBER2, comprising 170 high-quality sites tailored for LSMs. This work considered as many available flux tower datasets as possible and used an automated, reproducible data screening process. However, the PLUMBER2 dataset performs quality checks only on meteorological forcing data, and not on flux assessment data, in order to retain more years of data and enable models to be assessed against specific weather and climate events. Consequently, a large proportion of gap-filled flux data is present at some sites. Land surface modelers typically employ stringent quality control (QC) procedures to avoid misleading model evaluation results (Blyth et al., 2010; Li et al., 2019; Purdy et al., 2016). Therefore, these existing gap-filled data still require further processing.
Most importantly, these flux tower datasets lack site-observed vegetation, soil, and topography data such as fractional vegetation cover (FVC), leaf area index (LAI), soil texture, slope, and aspect. For regional and single-point modeling, the current practice usually involves obtaining these attribute data for LSMs through the inversion of global satellite observations. This approach introduces additional uncertainty into LSMs and diminishes the utility of flux tower data as benchmarking data for model evaluation.
Uncertainty in vegetation and soil data constitutes a significant source of uncertainty in LSMs (Dai et al., 2019b; Li et al., 2018). Vegetation composition and density play a prominent role in modulating the surface energy budget (Bagley et al., 2017; Williams and Torn, 2015) by altering canopy conductance, aerodynamic properties, and albedo, ultimately affecting water and energy fluxes between the surface and atmosphere (Anderson et al., 2011; Bonan, 2008). Similarly, soil texture directly influences various soil hydrological and thermodynamic parameters, including saturated soil water content and soil thermal conductivity (Arya and Paris, 1981; Minasny and McBratney, 2007). These parameters have a substantial impact on soil temperature and moisture as well as the terrestrial carbon and water cycle (Dirmeyer, 2011; Entekhabi et al., 1996). Although recent LSM development has attempted to use site-observed attribute data to reduce uncertainty in model results (Harper et al., 2021; Melton et al., 2020), the data used in these studies are typically limited and not publicly available, making it challenging for other researchers to apply these valuable data. Generally speaking, no flux tower dataset can be directly used in developing LSMs, and they frequently lack the necessary site-observed information about soil, vegetation, and other attributes.
To provide more accurate and reliable flux tower data for LSM modeling and validation, we conducted thorough quality control for the site data based on the PLUMBER2 dataset produced by Ukkola et al. (2022), resulting in a total of 90 sites. Subsequently, we carried out an extensive collection of available flux tower attribute data, drawing from sources such as site-related literature and websites. We further complemented the attributes with global data. As a result, we generated a flux tower dataset that can be directly applied to LSMs and contains essential attribute data. Furthermore, through modeling comparison for the four key attribute variables – percentage of plant functional type (PFT) cover (PCT_PFT), LAI, canopy height, and soil texture – we demonstrate how the outputs differ between site-observed attribute data and the default attribute data employed by an LSM. These results emphasize the non-negligible impact of flux tower attribute data on model simulation and development.
2.1 Datasets
The data used in this study can be categorized into four groups, as illustrated in Table 1. Firstly, PLUMBER2 serves as the dataset for data quality screening. The second group comprises the attribute sources, including 113 site-related publications; seven regional flux networks; and Biological, Ancillary, Disturbance, and Metadata (BADM) files provided by FLUXNET and AmeriFlux.
The third category includes data sources employed for PFT classification, incorporating seven site-related articles for classification, flux tower site measurements of precipitation and air temperature, global maps of the Köppen–Geiger climate classification, and the reprocessed MODIS Version 6.1 leaf area index dataset. The Köppen–Geiger climate classification maps, presented at 1 km resolution, are derived from an ensemble of four high-resolution, topographically corrected climatic maps. They demonstrate higher classification accuracy and substantially more detail than previous versions. The reprocessed MODIS LAI used the modified temporal spatial filter (mTSF) method for simple data assimilation and then applied the postprocessing – TIMESAT (a software package to analyze time series of satellite sensor data) Savitzky–Golay (SG) filter to obtain the result. Site LAI validation shows that the reprocessed MODIS LAI is much smoother and more consistent with adjacent values than the original MODIS LAI and closer to site observations (Lin et al., 2023; Yuan et al., 2011).
Finally, three global datasets were used to fill in attribute data of sites lacking site-observed FVC, LAI, and soil texture. LAI filling still uses the reprocessed MODIS LAI, whereas the FVC filling employs a global 300 m PFT map, PFTlocal (Harper et al., 2023). PFTlocal incorporates a variety of currently available high-resolution satellite data to quantify the percentage of PFT in each 300 m pixel worldwide. The 300 m resolution is well matched with the regional extent of the flux tower footprint (Chu et al., 2021), providing representative FVC data. Filling of soil texture uses the Global Soil Dataset for Earth System Modeling (GSDE) (Shangguan et al., 2014). The GSDE harmonizes data collected from various sources and uses a standardized data structure and data processing procedures to derive the final dataset. It has been extensively applied in Earth system models (Dai et al., 2019a).
2.2 Processing methods
We undertook three primary steps to establish the final dataset: site and time period selection, attribute collection, and data processing. First, the data selection process involved picking years with a low gap-filled percentage for fluxes (latent and sensible heat) and vapor pressure deficit (VPD), excluding sites subject to external disturbances or unable to undergo EBC checks. Following that, we collected site-observed vegetation, soil, and topography data. Vegetation attributes include FVC, maximum LAI, and mean canopy height. Soil attributes include soil texture, bulk density, organic carbon concentration, and depth. Topography attributes include slope and aspect. Additionally, we obtained the reference measurement heights (for emulating the lowest layer of the atmospheric model to which the LSM would be coupled) of wind speed, air temperature, and humidity. Then, we filled in FVC, maximum LAI, and soil texture using global datasets. Finally, the FVC was further broken down into different PFTs. Figure 1 presents a flowchart of the processing pipeline, with each step described in detail below.
a https://ameriflux.lbl.gov/. b http://www.biomet.co.at/. c http://www.chinaflux.org/. d http://www.europe-fluxdata.eu/. e https://www.gml.noaa.gov/. f https://ozflux.org.au/. g https://www.swissfluxnet.ethz.ch/ (last access for all URLs: 11 July 2023).
2.2.1 Site and time period selection
The PLUMBER2 dataset acquired 170 sites by screening meteorological data (including five key variables that have the largest influence on LSM simulations: incoming shortwave radiation, precipitation, air temperature, air humidity, and wind speed). For FLUXNET2015 and La Thuile datasets, specific humidity is not provided in the original data, so it was calculated from VPD (Ukkola et al., 2017). However, the screening process did not consider the gap-filled situation of VPD. As mentioned earlier, it also did not screen the flux variables. To address these limitations, we further implemented quality control on the PLUMBER2 dataset by performing the following three steps:
-
Sites with only 1 year of observations were excluded to ensure data stability and reliability.
-
The years where the proportion of data with fluxes (latent and sensible heat) quality control (QC) ≤ 1 exceeds 90 % were selected (QC = 0 denotes observed data, QC = 1 represents high-quality gap-filled data in FLUXNET2015 and La Thuile, and there is no QC = 1 in OzFlux).
-
The years where the proportion of VPD QC = 0 exceeds 90 % in FLUXNET2015 and La Thuile datasets were selected.
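The three screening steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the data layout (a dictionary of per-year QC flag lists), the function names, and the choice to test Qle and Qh separately are all assumptions. The `check_vpd` switch is a hypothetical way to skip the VPD criterion for OzFlux sites, to which that step does not apply.

```python
def fraction(flags, accepted):
    """Share of time steps whose QC flag is in the accepted set."""
    return sum(1 for f in flags if f in accepted) / len(flags)


def select_years(qc_by_year, threshold=0.90, check_vpd=True):
    """Return the years of one site that pass the screening criteria.

    qc_by_year maps year -> {"Qle": [...], "Qh": [...], "VPD": [...]},
    each list holding one integer QC flag per time step (assumed layout).
    """
    # Step 1: drop sites with only one year of observations.
    if len(qc_by_year) < 2:
        return []

    selected = []
    for year in sorted(qc_by_year):
        qc = qc_by_year[year]
        # Step 2: more than 90 % of flux records must have QC <= 1
        # (assumption: Qle and Qh are each tested on their own).
        flux_ok = all(fraction(qc[v], {0, 1}) > threshold for v in ("Qle", "Qh"))
        # Step 3: more than 90 % of VPD records must be observed (QC = 0).
        vpd_ok = (not check_vpd) or fraction(qc["VPD"], {0}) > threshold
        if flux_ok and vpd_ok:
            selected.append(year)
    return selected


example = {
    2003: {"Qle": [0] * 95 + [2] * 5, "Qh": [0] * 92 + [1] * 4 + [2] * 4, "VPD": [0] * 96 + [1] * 4},
    2004: {"Qle": [0] * 80 + [2] * 20, "Qh": [0] * 100, "VPD": [0] * 100},
    2005: {"Qle": [0] * 100, "Qh": [0] * 100, "VPD": [0] * 85 + [1] * 15},
}
print(select_years(example))
```

Because the years are tested one by one, the result may contain non-consecutive years.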
Furthermore, we excluded 23 sites that lacked ground heat flux observations because the EBC correction factor (fEBC) could not be calculated without them (its calculation requires net radiation (Rn), ground heat flux (G), latent heat flux (Qle), and sensible heat flux (Qh)). Additionally, two sites (FR-Lq1 and FR-Lq2) were removed as they have a very low energy balance ratio (EBR, calculated as (Qle + Qh) / (Rn − G) according to Wilson et al., 2002) after performing energy closure (details in Table S3). Lastly, we excluded 10 sites that experienced external disturbances during the observation period, such as irrigation and deforestation, and one site impacted by a large waterbody nearby (details in Table S3). In the end, we preserved non-consecutive years that met our criteria. This allows us to maximize the utility of valuable observational data. Details of the selected and excluded sites and years are displayed in Tables S2 and S3.
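As an illustration of the energy balance ratio used in this check, the sketch below computes EBR as the ratio of summed turbulent fluxes to summed available energy, following the (Qle + Qh) / (Rn − G) definition given in the text. The function name and the handling of missing values (time steps with any missing component are skipped) are assumptions. The text gives no numerical threshold for "very low", so none is applied here.

```python
def energy_balance_ratio(qle, qh, rn, g):
    """EBR = sum(Qle + Qh) / sum(Rn - G) over time steps with complete data.

    All inputs are equal-length sequences in W m-2; None marks a missing value.
    """
    turbulent = 0.0   # running sum of Qle + Qh
    available = 0.0   # running sum of Rn - G
    for le, h, r, gh in zip(qle, qh, rn, g):
        if None in (le, h, r, gh):
            continue  # assumption: skip steps with any missing component
        turbulent += le + h
        available += r - gh
    if available == 0:
        raise ValueError("available energy sums to zero; EBR is undefined")
    return turbulent / available


ebr = energy_balance_ratio(
    qle=[100.0, 150.0, None],
    qh=[50.0, 70.0, 10.0],
    rn=[200.0, 300.0, 50.0],
    g=[20.0, 30.0, 5.0],
)
print(round(ebr, 3))
```

A site without G observations has no complete time step at all, which is why such sites cannot be checked for closure.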
2.2.2 Data collection for vegetation attributes
Percent cover of plant functional types
FVC data were sourced from site descriptions in literature, regional networks, and FLUXNET BADM files. We sought appropriate representations of site FVC and obtained site-observed FVC data for 53 sites. To maximize the amount of FVC collected, some assumptions were made at certain sites during the data collection process, addressing scenarios as follows:
-
For sites lacking explicit FVC data but providing the percentage of vegetation flux footprint contribution or dense forest canopy basal area, we treated these values as FVC. Since FVC directly determines these metrics, they are numerically similar.
-
In grassland and cropland sites, the vegetation cover type typically exhibits a high degree of homogeneity. Therefore, we referred to site pictures (photographs taken at the site) to make a judgment. If a homogeneous cover could be determined from the pictures, it was assigned a 100 % coverage percentage.
-
Some grassland sites with annual vegetation may experience seasonal bare soil exposure. For these sites, we used the FVC during the peak vegetation growth period.
-
In forest sites, we simply treated forest litter as grass cover in the absence of additional information.
After that, trees and shrubs were classified as evergreen or deciduous and coniferous or broadleaf based on their vegetation type. As an example, eucalyptus trees are classified as evergreen broadleaf trees. For data completeness, we used the PFTlocal maps to fill in data for sites lacking site-observed FVC values.
We further broke down the FVC into PFTs to meet the requirements of LSM simulations using PFTs. The breakdown method is as follows. First, the climate type of each PFT was determined according to the Köppen climate classification (Poulter et al., 2011). Then, C3 and C4 grasses were partitioned using site descriptions. If site descriptions were unavailable, flux tower air temperature, precipitation, and reprocessed MODIS LAI were used to calculate the proportion of LAI occurring under C4-favorable climatic conditions, thereby estimating the grass proportions (Still et al., 2003).
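A climate-based C3/C4 split of this kind can be sketched as below. The text does not give the exact rule, so this is only an assumed form: a month is treated as C4-favorable when its mean air temperature is at least 22 °C and its precipitation at least 25 mm (thresholds commonly associated with this type of approach, not taken from this paper), and the C4 share is the fraction of annual LAI that falls in such months. The function name and parameters are hypothetical.

```python
def c4_grass_fraction(monthly_tair_c, monthly_precip_mm, monthly_lai,
                      t_min_c=22.0, p_min_mm=25.0):
    """Estimate the C4 share of grass cover from monthly climatologies.

    Assumed rule: LAI in months that are both warm (>= t_min_c) and wet
    (>= p_min_mm) counts as C4; the rest counts as C3.
    """
    if not (len(monthly_tair_c) == len(monthly_precip_mm) == len(monthly_lai) == 12):
        raise ValueError("expected 12 monthly values for each input")
    total_lai = sum(monthly_lai)
    if total_lai <= 0:
        return 0.0  # no grass LAI signal, nothing to partition
    c4_lai = sum(
        lai
        for t, p, lai in zip(monthly_tair_c, monthly_precip_mm, monthly_lai)
        if t >= t_min_c and p >= p_min_mm
    )
    return c4_lai / total_lai


tair = [5, 8, 12, 16, 20, 23, 25, 24, 21, 15, 9, 5]
precip = [30] * 12
lai = [0.5, 0.5, 1.0, 1.5, 2.0, 3.0, 3.0, 3.0, 2.0, 1.0, 0.5, 0.5]
c4 = c4_grass_fraction(tair, precip, lai)
print(f"C4 fraction: {c4:.2f}, C3 fraction: {1 - c4:.2f}")
```

Multiplying the two fractions by the site's total grass cover would then give the C3 and C4 grass entries of PCT_PFT.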
A total of 16 PFTs include the original set of 15 PFTs initially developed by Bonan et al. (2002), supplemented with a new bare soil surface type. The full set of PFTs includes bare soil; Needleleaf evergreen tree, temperate (ENT_Te); Needleleaf evergreen tree, boreal (ENT_Bo); Needleleaf deciduous tree (DNT); Broadleaf evergreen tree, tropical (EBT_Tr); Broadleaf evergreen tree, temperate (EBT_Te); Broadleaf deciduous tree, tropical (DBT_Tr); Broadleaf deciduous tree, temperate (DBT_Te); Broadleaf deciduous tree, boreal (DBT_Bo); Broadleaf evergreen shrub, temperate (EBS_Te); Broadleaf deciduous shrub, temperate (DBS_Te); Broadleaf deciduous shrub, boreal (DBS_Bo); C3 grass, arctic; C3 grass; C4 grass; and Crop. This PFT classification scheme is widely utilized in LSMs.
Maximum leaf area index
Maximum LAI data were primarily sourced from site descriptions in literature and AmeriFlux BADM files, either as explicitly stated maximum LAI values or as values derived from interannual scatterplots. To maximize data availability, we made the following assumptions at certain sites. A summertime LAI observation was taken as the maximum LAI, and a single LAI value provided without observation time or supporting information was also accepted as the maximum LAI. To ensure data transparency, quality control flags were implemented in the final dataset, allowing users to select data based on their acceptance criteria. A total of 67 site observations of maximum LAI were collected, with 33 sites providing the year of observation. For data completeness, we used the reprocessed MODIS Version 6.1 LAI dataset to fill in missing site-observed maximum LAI data.
Canopy height
We calculated the mean canopy height over the observation period for 69 sites included in the FLUXNET2015 dataset using the canopy heights reported in the FLUXNET BADM file across different periods. The mean canopy height provides a more truthful representation of the vegetation condition during the period of observation. For the remaining 21 sites, the canopy height provided by PLUMBER2 was used.
2.2.3 Data collection for soil attributes
Soil texture
Soil texture data were sourced from site descriptions in literature, regional networks, and AmeriFlux BADM files. These descriptions provided information in two forms: (1) percentages of sand, silt, and clay and (2) soil texture types, such as sandy loam. For the latter, which do not provide the percentages of sand, silt, and clay, we referred to the soil composition table presented by Dy and Fung (2016) to derive the specific proportions. This table classifies soil into 16 categories based on the proportions of sand, silt, and clay. Overall, 72 site observations of soil texture were collected, with 34 sites providing information on the depth of observations. For data completeness, we used the GSDE dataset to fill in the data for sites lacking site-observed soil texture.
Soil bulk density, organic carbon concentration, and depth
Soil bulk density, organic carbon concentration, and depth data were sourced from site descriptions in literature, regional networks, and AmeriFlux BADM files. Specifically, soil bulk density was collected at 37 sites, soil organic carbon concentration at 23 sites, and soil depth at 31 sites. The observation depth was recorded for soil bulk density at 32 sites and for organic carbon concentration at 22 sites. Despite the limited availability of site-observed data for the three soil attributes, we included them in the final dataset. For researchers conducting site-specific studies, these data can serve as valuable references.
2.2.4 Data collection for topography attributes
The topography data encompasses site slope and aspect. These data were gathered from site descriptions in literature, regional networks, FLUXNET, and AmeriFlux BADM files. Specifically, we acquired slope for 57 sites, and aspect for 49 sites from these sources.
2.2.5 Reference measurement height
Site descriptions in literature, regional networks, and FLUXNET and AmeriFlux BADM files were all sources for the reference measurement heights. From these sources, we searched for the heights of wind speed, air temperature, and air humidity measurements or the height of the instrument used for these measurements (e.g., wind cups and temperature and humidity sensors). In cases where the flux tower meteorological observation equipment lacked a dedicated wind speed measurement device, we assumed that a three-dimensional sonic anemometer was used for wind speed measurements. Consequently, wind observation heights were available for a total of 76 sites, while 65 sites had temperature and humidity observation heights. For the remaining sites where observation heights were not reported, we used the flux observation height as a substitute.
2.3 Modeling assessment of attribute data
The impact of collected attributes on carbon, water, and energy fluxes is assessed through single-point simulations using the latest version of the Common Land Model (Dai et al., 2003) (CoLM202X, https://github.com/CoLM-SYSU/CoLM202X/tree/master, last access: 21 November 2023). CoLM202X incorporates processes related to biogeophysics, biogeochemistry, ecological dynamics, and human activities and offers optional processes and schemes which can be customized by the user. In our experiments, vegetation is modeled using a set of time-invariant parameters (optical properties, i.e., leaf optical properties; morphological properties, i.e., canopy height, vegetation root depth and profile, leaf size and angle distributions; and physiological properties). The dynamic vegetation module is turned off and the time-variant LAI and stem area index (SAI) values are prescribed from the reprocessed MODIS LAI data (Lin et al., 2023; Yuan et al., 2011). The two-big-leaf model (Dai et al., 2004) is employed to calculate processes such as radiative transfer (Yuan et al., 2017), photosynthesis (Collatz et al., 1992; Farquhar et al., 1980), and stomatal conductance (Ball et al., 1987). Surface turbulent exchange is simulated using similarity theory (Brutsaert, 1982; Zeng and Dickinson, 1998). Total evapotranspiration includes evaporation from stems, leaves, and the ground, as well as vegetation transpiration. Surface and subsurface runoff consider factors such as terrain, groundwater level, precipitation, and infiltration rate. Additionally, the model accounts for processes including precipitation phase and intensity, canopy interception, vertical movement of water in snow and soil, and snow compaction (Dai et al., 2003).
The simulations aim to evaluate the differences in model results between runs using site-observed attributes and those commonly utilized by LSMs. For simplicity, we refer to site-observed data as site data and data commonly utilized by LSMs as default data in subsequent descriptions. We focus on four crucial attributes, PCT_PFT, LAI, canopy height, and soil texture, to demonstrate their corresponding impacts. In site data simulations, we scaled the default LAI time series to match the maximum LAI observed, corrected the default canopy height using site canopy height, and replaced the default topsoil texture (0–28.9 cm) with the site-observed soil texture. For sites with multiple PFTs, we calculated the LAI for each PFT using growing degree days and PCT_PFT values (Lawrence and Chase, 2007). Canopy height was divided into three categories based on PFTs (trees, shrubs, or grassland) using site data to adjust the default values for the corresponding group, while the other two groups retained their default values.
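The LAI adjustment described above can be illustrated with a simple multiplicative rescaling. This is a sketch under the assumption that the default LAI series is multiplied by a single factor so that its peak equals the site-observed maximum; the text does not specify the exact scaling, and the function name is hypothetical.

```python
def scale_lai_to_site_max(default_lai, site_max_lai):
    """Rescale a default LAI time series so its maximum equals the site value.

    Assumes one constant factor for the whole series, which preserves the
    seasonal shape of the default product.
    """
    default_max = max(default_lai)
    if default_max <= 0:
        raise ValueError("default LAI has no positive values to scale")
    factor = site_max_lai / default_max
    return [value * factor for value in default_lai]


default_series = [0.5, 1.0, 2.0, 4.0, 2.0]   # illustrative default LAI, m2 m-2
scaled = scale_lai_to_site_max(default_series, site_max_lai=5.0)
print(scaled)
```

A constant factor keeps the phenology of the default product and changes only its amplitude, so differences in model output can be attributed to LAI magnitude rather than timing.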
The default data generally rely on global LAI and soil texture mapping products, lookup table canopy height, and site IGBP (International Geosphere–Biosphere Programme) classifications to characterize surface vegetation and soil conditions. In this study, the default LAI and soil texture refer to the reprocessed MODIS Version 6.1 LAI and the GSDE soil texture shown in Table 1. Lookup table canopy heights are sourced from CoLM, while site IGBP classifications are obtained from FLUXNET and OzFlux. We selected 10 sites for each attribute – LAI, canopy height, and soil texture – where site data differ most from default data. (In the lookup table canopy height simulations, sites with zero plane displacement exceeding reference measurement height are excluded.) For the PCT_PFT analyses, sites with IGBP types that include combinations of trees and grasses (OSH, WSA, and SAV) were chosen, resulting in six available sites. Table 2 provides an overview of the selected sites along with their corresponding attribute information. Each site was simulated under three conditions: (1) using site data for all attributes at each site, (2) using default data for all attributes at each site, and (3) using default data for the corresponding attribute at sites selected for each attribute separately while maintaining site data for the remaining attributes. The comparison between simulations (1) and (3) aims to demonstrate the individual impact of each attribute, while the comparison between simulations (1) and (2) shows the combined impact of all four attributes.
At each site, we ran CoLM at either the half-hourly or hourly time resolution, depending on the forcing data provided, for all years in the original dataset. Subsequent analyses were conducted only for the years we selected. To reach an equilibrium in soil moisture and temperature, CoLM loops the atmospheric forcing data for each site's observation period until it reaches 50 years in duration. The discrepancy between site data and default data is compared by variables related to land surface energy, water, and photosynthesis processes, including latent heat (Qle), sensible heat (Qh), net radiation (Rn), upward shortwave radiation (SWup), gross primary production (GPP), friction velocity (Ustar), surface soil water content (0–4.5 cm) (SWC), and total runoff (TR).
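The spin-up length implied by this looping can be worked out with a short calculation. The sketch assumes that the forcing record is repeated in whole cycles until at least 50 years have been simulated; whether the model instead stops at exactly 50 years is not stated in the text, and the function names are hypothetical.

```python
import math


def spinup_cycles(n_forcing_years, target_years=50):
    """Number of whole passes through the forcing record needed for spin-up."""
    if n_forcing_years < 1:
        raise ValueError("need at least one year of forcing")
    return math.ceil(target_years / n_forcing_years)


def spinup_sequence(forcing_years, target_years=50):
    """Order in which forcing years are replayed during spin-up."""
    return list(forcing_years) * spinup_cycles(len(forcing_years), target_years)


years = [2003, 2004, 2005, 2006]
sequence = spinup_sequence(years)
print(spinup_cycles(len(years)), len(sequence))
```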
To quantify the differences between the output from site data and default data while accounting for seasonal fluctuations in the impacts of soil and vegetation on climate-related variables (Dirmeyer, 2011; Forzieri et al., 2020), we designed a statistical indicator called the percentage of mean difference (MD %) (Eq. 1). This indicator is calculated by expressing the mean difference for each month as a percentage of the observed or default modeled annual mean. We used multi-year average time series to capture more stable differences in output. In addition, we used delta root mean squared error (ΔRMSE) (Eq. 3) and Δ|Bias| (Eq. 5) to measure the differences in RMSE and Bias of the output between site and default data, allowing us to assess the model's performance after incorporating site data.
where Modsite,i and Moddefault,i are the predicted values using site data and default data, respectively. Obsj is the observed value. n is the number of paired values. RMSEsite and RMSEdefault are the RMSEs of the simulation results using site data and default data, respectively. Biassite and Biasdefault also correspond to the Bias in these results.
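Based on the verbal definitions above, the three indicators can be sketched as follows. The exact forms are assumptions: MD % is taken as the monthly mean difference (site minus default) divided by a reference annual mean, and the Δ metrics are taken as site minus default, so that negative values indicate an improvement when site data are used. Function and variable names are hypothetical.

```python
import math


def md_percent(site_monthly, default_monthly, reference_annual_mean):
    """Monthly mean difference as a percentage of a reference annual mean."""
    return [
        100.0 * (s - d) / reference_annual_mean
        for s, d in zip(site_monthly, default_monthly)
    ]


def rmse(mod, obs):
    """Root mean squared error over n paired values."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(mod, obs)) / len(obs))


def bias(mod, obs):
    """Mean of modeled minus observed values."""
    return sum(m - o for m, o in zip(mod, obs)) / len(obs)


def delta_rmse(mod_site, mod_default, obs):
    # Assumed sign convention: negative means site data reduce the error.
    return rmse(mod_site, obs) - rmse(mod_default, obs)


def delta_abs_bias(mod_site, mod_default, obs):
    # Assumed sign convention: negative means site data reduce |Bias|.
    return abs(bias(mod_site, obs)) - abs(bias(mod_default, obs))


obs = [1.0, 2.0, 3.0]
mod_site = [1.0, 2.0, 4.0]
mod_default = [2.0, 3.0, 5.0]
print(delta_rmse(mod_site, mod_default, obs) < 0)
print(md_percent([110.0, 120.0], [100.0, 100.0], 50.0))
```

Normalizing by an annual mean rather than the monthly value keeps MD % finite in months when the variable itself is close to zero.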
a The maximum LAI at the pixel containing the site provided by Reprocessed MODIS Version 6.1 LAI. b Site-observed data collected in this study. c Specific year of maximum LAI. d The top-layer soil texture (sand/silt/clay) at the site location extracted from the GSDE dataset. e Canopy height of the dominant vegetation type at the site from the CoLM lookup table.
3.1 Global distribution and attribute information of selected sites
The final dataset contains 90 globally distributed sites (Fig. 2a). The majority are in North America and Europe, followed by Australia, with smaller representations in Asia (3 sites) and Africa (1 site). Temporal coverage spans from 1997 to 2017, totaling 475 site years. Individual site observations range from 1 to 17 years, with a median of 4 years (Fig. 2b). Despite a reduction in available sites and years due to rigorous quality control, the dataset does offer reliable meteorological forcing and flux assessment data for LSMs. Furthermore, the 90 sites encompass the full range of IGBP classifications originally presented, covering a wide spread of biomes, from grasslands and savannas to forest ecosystems (Fig. 2c). This enables users to evaluate models across diverse biomes using quality-benchmarked flux tower observations.
Out of the 90 sites, data were collected on PCT_PFT for 53 sites, maximum LAI for 67 sites, average canopy height for 69 sites, and soil texture for 72 sites. Additionally, soil bulk density was available for 37 sites, soil organic carbon concentration for 23 sites, and soil depth for 31 sites. Data on slope were collected for 57 sites, aspect for 49 sites, wind observation height for 76 sites, and air temperature and humidity observation heights for 65 sites (Fig. 2d). Where site-observed PCT_PFT, soil texture, or LAI was unavailable, we filled the gaps with appropriate global data for completeness. To improve data utilization, we provide the observation year of maximum LAI and the depth of soil texture, which are available at 33 and 34 sites, respectively.
Figure 3 depicts the discrepancies between site data and default data for PCT_PFT, maximum LAI, canopy height, and soil texture. The PCT_PFT shows multiple PFTs at 34 sites, offering a more accurate representation of vegetation conditions compared to IGBP classifications. For LAI, canopy height, and soil texture, variations between site data and default data are substantial at certain sites. Specifically, at 31 sites, discrepancies in LAI values exceed 1 m2 m−2; canopy height differs by over 10 m at 15 sites, and sand percentage varies by more than 20 % at 18 sites.
3.2 The flux tower site attribute dataset
The final dataset is formatted in NetCDF (Network Common Data Form). Table 3 outlines the attribute variables and corresponding descriptions for each site in the file. These attributes can be categorized into vegetation, soil, and topography attributes, as well as reference heights and filtered high-quality years.
For maximum LAI, the file provides both the year range covered by maximum LAI and the maximum value for a specific year. Regarding the three soil attributes, soil texture, bulk density, and organic carbon concentration, the file provides values for multiple soil layers along with the specific depth of each layer. Concerning reference height, we give its corresponding observed variable, i.e., wind speed, air temperature, and humidity, or fluxes (latent and sensible heat). Additionally, the NetCDF file incorporates reference sources for each attribute. These sources are included to facilitate access to the original data and enhance flexibility in application. A summary of these reference sources is presented in Table S1.
a The sources of collected attribute data. b The year range covered by maximum LAI. c Maximum LAI for a specific year. d The value of n ranges from 1 to 4, denoting the four soil layers in ascending order of depth. The parameter layer_n_depth indicates the depth of respective soil layer corresponding to the depth at which soil data is observed.
Figure 4 quantifies the differences between site data and filled data for sites where both data sources are available, illustrating the inhomogeneities in the final dataset resulting from data filling. Differences in vegetation cover (including bare soil, woody, and herbaceous vegetation) generally fall within 20 %, with a minority of sites exceeding 40 %. The mean and median LAI differences are approximately 1 m2 m−2. Canopy height deviations are primarily within 2 m, although a few sites exceed 4 m. Differences in sand content typically remain within 30 %, with both mean and median differences below 15 %. This quantification suggests that the filled data are generally reliable across most sites.
3.3 Impact of site attributes on modeling
The impacts of altering the land surface representation from default data to site data, quantified by MD %, on Qle, Qh, Rn, SWup, GPP, Ustar, SWC, and TR are shown in Fig. 5. The figure demonstrates that vegetation and soil components affect carbon, water, and energy fluxes to varying degrees, depending on the season. The impacts of vegetation cover, soil texture, and LAI on Qle and Qh are primarily observed in spring and summer, while canopy height exerts its most substantial effects in autumn and winter. The impact of vegetation cover on Rn and SWup remains consistent throughout the year, whereas LAI has a more pronounced effect in spring and summer. For GPP, attributes play a more significant role during summer. However, the effects of vegetation and soil attributes on Ustar appear to be independent of season. SWC and TR are both predominantly influenced by soil texture; the difference is that soil texture significantly affects SWC across all seasons, whereas its impact on TR occurs primarily during summer and autumn. Additionally, vegetation cover has a significant effect on TR at the SD-Dem site, which is situated in the African savannah with an average annual precipitation of 320 mm (Ardö et al., 2008).
To elucidate the magnitude of each attribute's impact on different variables, Fig. 6 further displays the monthly average maximum MD %. On average, changes in latent and sensible heat are not dominated by any single attribute. All four attributes (PCT_PFT, LAI, canopy height, and soil texture) have a relatively strong impact on both. Their monthly average maximum MD % values for Qh all fall in the range of 14 %–36 %. The effect of soil texture on Qle is comparatively greater, at 18.3 %. Regarding Rn, vegetation cover emerges as the chief influence, with a monthly average maximum MD % of 8.8 %. In contrast, SWup is heavily dictated by LAI, at 56.7 %, due to the exceptionally high value at the US-GLE site. Vegetation cover and LAI, both with a monthly average maximum MD % of more than 50 %, dominate the changes in GPP. Soil texture also has a visible impact on GPP due to its influence on soil permeability, aeration, and the capacity to retain water and nutrients. Ustar, on the other hand, is almost exclusively shaped by vegetation cover and canopy height. This is expected because the intensity of land–atmosphere exchange in vegetated areas is directly tied to canopy height, and changes in vegetation cover typically correspond to changes in canopy height. For SWC and TR, vegetation cover and soil texture are the two crucial attributes. Soil texture exhibits a monthly average maximum MD % of 46.3 % for SWC and 129.8 % for TR, while vegetation cover shows 22.7 % and 293.8 %, respectively.
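To make the idea of a monthly maximum relative difference concrete, the sketch below computes a per-month percentage difference between two runs and picks the month in which it peaks. The definition used here, 100 × |site − default| / |default| on monthly means, is an assumption for illustration only; the definition of MD % given in the methods takes precedence. The function names and Qle values are hypothetical.

```python
def monthly_md_percent(site_run, default_run):
    """Per-month relative difference (%) between two model runs.

    Each argument maps a calendar month (1-12) to that month's mean value.
    ASSUMED definition: 100 * |site - default| / |default|.
    """
    md = {}
    for month, default_value in default_run.items():
        if default_value == 0:
            continue  # relative difference is undefined for a zero baseline
        md[month] = 100.0 * abs(site_run[month] - default_value) / abs(default_value)
    return md


def max_monthly_md_percent(site_run, default_run):
    """Return (month, value) of the largest monthly relative difference."""
    md = monthly_md_percent(site_run, default_run)
    peak_month = max(md, key=md.get)
    return peak_month, md[peak_month]


# Hypothetical monthly mean Qle (W m-2) for four months; not model output.
default_qle = {1: 10.0, 4: 40.0, 7: 100.0, 10: 30.0}
site_qle = {1: 10.0, 4: 46.0, 7: 125.0, 10: 33.0}

peak_month, peak_value = max_monthly_md_percent(site_qle, default_qle)
print(peak_month, peak_value)
```

Averaging such per-site maxima over all sites would then give one number per attribute and variable, which is one plausible reading of the values quoted above.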
Figure 7 uses ΔRMSE and Δ|Bias| to show the shifts in model performance when site data are used. The incorporation of site-observed attribute data significantly improves the simulation of Rn, SWup, and Ustar. Concerning individual attributes, PCT_PFT proves particularly beneficial for modeling both Rn and SWup. Including site LAI also enhances the simulation of SWup. Improvements in these fundamental energy terms contribute to more accurate modeling of latent and sensible heat. Furthermore, site LAI and canopy height demonstrate steady improvements in GPP and Ustar, respectively.
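The two performance-shift metrics can be illustrated with a short sketch. It assumes that each Δ is computed as the metric of the site-data run minus the metric of the default-data run, so that negative values indicate an improvement; this sign convention, the function names, and the sample values are assumptions rather than details taken from the analysis.

```python
from math import sqrt


def rmse(sim, obs):
    """Root mean square error between simulated and observed series."""
    return sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def abs_bias(sim, obs):
    """Absolute value of the mean simulated-minus-observed difference."""
    return abs(sum(s - o for s, o in zip(sim, obs)) / len(obs))


def performance_shift(sim_site, sim_default, obs):
    """Change in RMSE and |Bias| when switching from default to site attributes.

    ASSUMED sign convention: site-run metric minus default-run metric,
    so negative values mean the site-data run is closer to observations.
    """
    return {
        "delta_rmse": rmse(sim_site, obs) - rmse(sim_default, obs),
        "delta_abs_bias": abs_bias(sim_site, obs) - abs_bias(sim_default, obs),
    }


# Hypothetical net radiation values (W m-2); not taken from the simulations.
obs = [100.0, 120.0, 140.0, 160.0]
sim_default = [110.0, 130.0, 150.0, 170.0]  # constant +10 offset
sim_site = [103.0, 116.0, 143.0, 156.0]     # errors of +3, -4, +3, -4

shift = performance_shift(sim_site, sim_default, obs)
print(shift)
```

Reporting both metrics is useful because a run can reduce its mean bias through compensating errors while its RMSE changes little, or vice versa.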
In summary, these results underscore the significant impact and importance of incorporating site-observed attribute data in the simulation of carbon, water, and energy fluxes in LSMs.
In the land surface modeling community, flux tower attribute data currently do not receive sufficient attention, although site attribute data are nearly as critical as the flux tower observations themselves. We hope that future flux tower datasets will provide standardized site attributes. In this study, we obtained 90 high-quality sites through a comprehensive selection process; these sites provide extensive site-observed data on vegetation, soil, and topography attributes. Through single-point simulations, we demonstrated the indispensable role of these data in LSM development. Accurate attribute data will provide multiple benefits by lowering uncertainty in model calibration and evaluation.
After selection, fewer sites and years are available. However, the retained data offer trustworthy observations that can be directly applied. Data quality is generally the focus of model calibration and evaluation, and developing LSMs can benefit immensely from using a modest number of sites (Brooke et al., 2019; Harper et al., 2021; Swenson et al., 2019). These updates should therefore support model development. To collect more site-observed attribute data while accounting for the diverse ways in which the same attribute is described, particularly the percentage of vegetation cover, we made a few approximations and assumptions during the data collection procedure, such as using approximate substitutions and site photographs to assist in judgment. Although these methods may introduce slight deviations, they reproduce the surface conditions of these sites well. Furthermore, we provide descriptions of the attribute data that are as detailed as possible. For instance, whenever feasible, the year of observation is given with the maximum LAI and the depth of observation is given with the soil texture. These descriptions are valuable references for data applications. One might argue that the auxiliary descriptions are just as important as the attribute data themselves.
Using CoLM at 36 sites, we evaluated the impacts of PCT_PFT, LAI, canopy height, and soil texture on model results. This is not an idealized experiment but rather a practical demonstration of the discrepancies in model results between site data and default data. The results are in line with previous research (Dai et al., 2019b), showing that vegetation cover appreciably affects each of the eight variables examined and is often the dominant attribute (Fig. 6). This is because plant cover is the most prominent surface feature and directly alters surface energy absorption. The net radiation simulation was improved by using the site PCT_PFT, but the performance for latent and sensible heat was not as good. This may be related to uncertainties in the model itself as well as in other input data, such as vegetation biophysical parameters and soil thermal and hydraulic conductivities.
Additionally, we find that the impact of attributes is substantially associated with precipitation, as illustrated in the average seasonal cycle shown in Fig. 8. At the AU-How site in Australia, ample rainfall during the wet season combined with the increase in surface available energy due to vegetation cover brings about a significant increase in Qle. In contrast, since limited water is available for evapotranspiration at the SD-Dem site, Qh is the primary feedback from changes in surface energy. The results from the US-KS2 and US-GLE sites indicate that the growing season, synchronized with water availability, is when LAI exerts a major influence on GPP. Furthermore, a notable variation in SWup was seen at the US-GLE site, which was attributed to the presence of snow cover (Berryman et al., 2018). Corrections to LAI can improve the simulation by reducing albedo inaccuracies. This corroborates the point in Essery (2013) that inadequate land cover data is largely to blame for the uncertainty in the climate–snow albedo feedback in LSMs. Results from the IT-Cpz and BE-Vie sites suggest that differences in the intensity of land–air exchange, caused by variations in canopy height, are clearly reflected in Qle during the rainy season. Regarding soil texture, a comparison between FI-Sod and AU-Cpr sites revealed stronger control of Qle by soil texture during the period of high precipitation intensity. This is partly attributed to increased water availability and largely to the pronounced differences in soil infiltration capacity under high-intensity precipitation events.
A previous study by Ménard et al. (2015) stated that attribute data have little effect on modeling results. This study, however, may lack representativeness since it was limited to one site. Furthermore, it averaged differences resulting from attribute data across the whole time series using the raw RMSE and correlation coefficient statistical metrics. This approach makes it difficult to detect the crucial role of attribute data. As described in Sect. 3.3, the impacts of attribute data on climate-related variables generally occur over specific periods (mostly during the growing season) rather than throughout the year.
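The dilution effect described above can be shown with a toy calculation. In the sketch below, two hypothetical runs differ only in June to August; the whole-year RMSE changes modestly, while the RMSE restricted to those months changes substantially. All values, the function names, and the choice of June to August as the active season are assumptions made for illustration.

```python
from math import sqrt


def rmse(sim, obs):
    """Root mean square error between simulated and observed series."""
    return sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def rmse_for_months(sim, obs, months):
    """RMSE restricted to the given 1-based calendar months."""
    idx = [m - 1 for m in months]
    return rmse([sim[i] for i in idx], [obs[i] for i in idx])


# Hypothetical monthly mean latent heat flux (W m-2), January to December.
obs = [20.0, 25.0, 40.0, 60.0, 80.0, 100.0, 110.0, 100.0, 70.0, 45.0, 30.0, 20.0]
summer = (6, 7, 8)  # assumed active season for this toy example

# Default run: +5 W m-2 error in every month.
sim_default = [o + 5.0 for o in obs]
# Site-attribute run: error drops to +2 W m-2 in summer only.
sim_site = [o + (2.0 if m in summer else 5.0) for m, o in enumerate(obs, start=1)]

whole_year_gain = rmse(sim_default, obs) - rmse(sim_site, obs)
summer_gain = rmse_for_months(sim_default, obs, summer) - rmse_for_months(sim_site, obs, summer)
print(round(whole_year_gain, 2), round(summer_gain, 2))
```

Evaluating by month or season, rather than over the full record, is what allows the seasonal signal of the attribute data to be detected.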
By combining multiple data sources, we were able to maximize the available site-observed attribute data. Nevertheless, the data sources were primarily from published works, which led to some missing data at certain sites. The attribute data focused only on soil and vegetation information. Future endeavors should incorporate additional surface parameters, such as irrigation, wildfire, and the depth of soil moisture and vegetation roots, which are required for LSMs. Such observations and collections of site time-invariant attributes are generally low-cost but would strongly benefit model enhancement. In addition, the impact of attribute data on model results was assessed using one model, potentially limiting the representativeness of our findings.
As LSMs continually advance their schemes and processes, an increasing array of surface parameters will be incorporated, elevating the models to a heightened level of complexity. It is imperative that these parameters be clearly defined and prescribed. Working with site-observed attribute data enabled us to narrow down reasons for model biases, thereby enhancing our understanding of the true effects of diverse schemes and processes.
The flux tower site attribute dataset provides, for each site, the filtered high-quality years together with site-observed vegetation, soil, and topography attributes and reference measurement heights. Each site's data are stored in a NetCDF file named according to the site name, database, and attributes (vegetation, soil, topography, and reference height), such as AT-Neu_FLUXNET2015_Veg_Soil_Topography_ReferenceHeight.nc. The dataset comprises a total of 90 NetCDF files and can be accessed on Zenodo at https://doi.org/10.5281/zenodo.12596218 (Shi et al., 2024).
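Given this naming scheme, the site name and source database can be recovered from a file name with a small helper. The sketch below assumes that every file follows the pattern of the AT-Neu example, written without spaces as site_database_Veg_Soil_Topography_ReferenceHeight.nc; the function name `parse_site_filename` is hypothetical and is not part of the released code.

```python
from pathlib import Path

# ASSUMPTION: every file ends with this attribute suffix, as in the AT-Neu example.
ATTRIBUTE_SUFFIX = "Veg_Soil_Topography_ReferenceHeight"


def parse_site_filename(path):
    """Split '<site>_<database>_<attributes>.nc' into its site and database parts."""
    stem = Path(path).stem
    parts = stem.split("_", 2)  # site, database, remaining attribute suffix
    if len(parts) != 3 or parts[2] != ATTRIBUTE_SUFFIX:
        raise ValueError(f"unexpected file name: {path}")
    site, database, _ = parts
    return {"site": site, "database": database}


info = parse_site_filename("AT-Neu_FLUXNET2015_Veg_Soil_Topography_ReferenceHeight.nc")
print(info)
```

Applied to a directory listing, the helper makes it straightforward to group the 90 files by source database before opening them with a NetCDF reader.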
The processing codes are available at https://github.com/Mbnl1197/Flux-tower-attribute-for-LSM (last access: 4 September 2024) (DOI: https://doi.org/10.5281/zenodo.13684992, Shi and Yuan, 2024).
This study is centered on two issues with utilizing flux tower data in LSMs: inadequate data quality and insufficient site attributes. We performed a comprehensive quality control on flux tower data. By examining observation percentage, energy balance closure, and external disturbances, 90 high-quality flux tower sites with 475 site years were produced. By combining various data sources, we created a flux tower attribute dataset through data collection, processing, and filling procedures. This dataset includes the site-observed PCT_PFT, maximum LAI, mean canopy height, soil properties (texture, bulk density, organic carbon concentrations, and depth), and site topography (slope and aspect), as well as the reference measurement heights.
Furthermore, the attribute data collected in this study and those frequently used by LSMs were each applied in single-point simulations to quantify the differences in model output. Our results demonstrate the significance of certain attributes in the variation in specific variables. All four attributes significantly influence both latent and sensible heat. Their monthly average maximum MD % typically ranges from 10 % to 30 %. Vegetation cover and LAI serve as the primary controls for net radiation and upward shortwave radiation, respectively, with monthly average maximum MD % values of 8.8 % and 56.7 %. Both GPP and Ustar were strongly influenced by vegetation cover, with LAI and canopy height also exerting significant effects on GPP and Ustar, respectively. The monthly average maximum MD % for each of these impacts exceeds 50 %. For hydrologic variables, i.e., SWC and TR, soil texture typically holds greater significance, followed by vegetation cover. We show that the magnitude of these differences usually varies with season. For fluxes and GPP in particular, greater discrepancies are generally observed during spring and summer. These results stress the necessity of site-observed attribute data in the development of LSMs.
Our endeavors mitigate the inadequacies of flux tower attribute data, enhancing the ability of flux tower data to serve as benchmarking data for LSMs. The dataset provides relatively complete site attribute data and high-quality flux validation data, making it suitable for direct use as input and for simulation validation in LSMs. This facilitates the comparison of LSM simulations under the same standard framework, promoting their development. Moreover, this effort will draw more attention to flux tower attribute data from the land surface modeling group and foster communication between ecology and modeling communities. We strongly advocate for the routine release of attribute data as part of flux tower data. Making such ancillary data more easily and routinely accessible would greatly increase the value and usability of the data.
The supplement related to this article is available online at: https://doi.org/10.5194/essd-17-117-2025-supplement.
Conceptualization: HY. Data curation: HY and JS. Formal analysis: JS, HY, WD, and WL. Funding acquisition: HY and YD. Investigation: HY and JS. Methodology: HY, JS, NW, ZW, and SL. Resources: YD and HY. Software: JS, HY, HL, WL, NW, JZ, and HZ. Validation: HY, ZL, WD, WL, SZ, and XL. Visualization: JS. Writing (original draft preparation): JS. Writing (review and editing): JS, HY, ZW, WL, WD, ZL, and YD. All authors have read and agreed to the published version of the paper.
The contact author has declared that none of the authors has any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
We thank Anna M. Ukkola for her solid work in flux tower data processing, which provided a robust foundation for our research. We are also grateful for her responsiveness to our questions in using PLUMBER2. We extend our thanks to Danielle Svehla Christianson of Integrated Data Systems, Scientific Data Division, and Lawrence Berkeley National Laboratory for her prompt assistance in utilizing AmeriFlux BADM data. This work accessed several flux tower networks in the FLUXNET community, including AmeriFlux, ChinaFlux, European Fluxes, OzFlux, and Swiss FluxNet. Special thanks to FLUXNET and AmeriFlux for providing the BADM data. We also accessed data from ESRL's Global Monitoring Laboratory (GML) of the National Oceanic and Atmospheric Administration (NOAA) and the AT-Neu site research group. This work is based on numerous publications, and we thank these scientists for sharing their data. The PFTlocal maps were supported and provided by the ESA Climate Change Initiative (CCI) Land Cover project. The Köppen–Geiger climate classification maps are hosted by GloH2O.
This work was supported by the National Natural Science Foundation of China (under grant nos. 42075160 and 42088101), the Guangdong Major Project of Basic and Applied Basic Research (grant no. 2021B0301030007), the Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) (grant no. SML2023SP216), and the specific research fund of The Innovation Platform for Academicians of Hainan Province (grant no. YSPTZX202143).
This paper was edited by Dalei Hao and reviewed by Anna Ukkola, Lingcheng Li, and two anonymous referees.
Anderson, R. G., Canadell, J. G., Randerson, J. T., Jackson, R. B., Hungate, B. A., Baldocchi, D. D., Ban-Weiss, G. A., Bonan, G. B., Caldeira, K., Cao, L., Diffenbaugh, N. S., Gurney, K. R., Kueppers, L. M., Law, B. E., Luyssaert, S., and O'Halloran, T. L.: Biophysical considerations in forestry for climate protection, Front. Ecol. Environ., 9, 174–182, https://doi.org/10.1890/090179, 2011.
Ardö, J., Mölder, M., El-Tahir, B. A., and Elkhidir, H. A. M.: Seasonal variation of carbon fluxes in a sparse savanna in semi arid Sudan, Carbon Balance Manage, 3, 7, https://doi.org/10.1186/1750-0680-3-7, 2008.
Arya, L. M. and Paris, J. F.: A Physicoempirical Model to Predict the Soil Moisture Characteristic from Particle-Size Distribution and Bulk Density Data, Soil Sci. Soc. Am. J., 45, 1023–1030, https://doi.org/10.2136/sssaj1981.03615995004500060004x, 1981.
Bagley, J. E., Kueppers, L. M., Billesbach, D. P., Williams, I. N., Biraud, S. C., and Torn, M. S.: The influence of land cover on surface energy partitioning and evaporative fraction regimes in the U.S. Southern Great Plains, J. Geophys. Res.-Atmos., 122, 5793–5807, https://doi.org/10.1002/2017JD026740, 2017.
Ball, J. T., Woodrow, I. E., and Berry, J. A.: A Model Predicting Stomatal Conductance and its Contribution to the Control of Photosynthesis under Different Environmental Conditions, in: Progress in Photosynthesis Research, edited by: Biggins, J., Springer Netherlands, Dordrecht, 221–224, https://doi.org/10.1007/978-94-017-0519-6_48, 1987.
Beck, H. E., Zimmermann, N. E., McVicar, T. R., Vergopolan, N., Berg, A., and Wood, E. F.: Present and future Köppen-Geiger climate classification maps at 1-km resolution, Sci Data, 5, 180214, https://doi.org/10.1038/sdata.2018.214, 2018.
Berryman, E. M., Vanderhoof, M. K., Bradford, J. B., Hawbaker, T. J., Henne, P. D., Burns, S. P., Frank, J. M., Birdsey, R. A., and Ryan, M. G.: Estimating Soil Respiration in a Subalpine Landscape Using Point, Terrain, Climate, and Greenness Data, J. Geophys. Res.-Biogeo., 123, 3231–3249, https://doi.org/10.1029/2018JG004613, 2018.
Best, M. J., Abramowitz, G., Johnson, H. R., Pitman, A. J., Balsamo, G., Boone, A., Cuntz, M., Decharme, B., Dirmeyer, P. A., Dong, J., Ek, M., Guo, Z., Haverd, V., van den Hurk, B. J. J., Nearing, G. S., Pak, B., Peters-Lidard, C., Santanello, J. A., Stevens, L., and Vuichard, N.: The Plumbing of Land Surface Models: Benchmarking Model Performance, J. Hydrometeorol., 16, 1425–1442, https://doi.org/10.1175/JHM-D-14-0158.1, 2015.
Blyth, E., Gash, J., Lloyd, A., Pryor, M., Weedon, G. P., and Shuttleworth, J.: Evaluating the JULES Land Surface Model Energy Fluxes Using FLUXNET Data, J. Hydrometeorol., 11, 509–519, https://doi.org/10.1175/2009JHM1183.1, 2010.
Bonan, G. B.: Forests and Climate Change: Forcings, Feedbacks, and the Climate Benefits of Forests, Science, 320, 1444–1449, https://doi.org/10.1126/science.1155121, 2008.
Bonan, G. B., Levis, S., Kergoat, L., and Oleson, K. W.: Landscapes as patches of plant functional types: An integrating concept for climate and ecosystem models, Global Biogeochem. Cy., 16, 5-1–5-23, https://doi.org/10.1029/2000GB001360, 2002.
Brooke, J. K., Harlow, R. C., Scott, R. L., Best, M. J., Edwards, J. M., Thelen, J.-C., and Weeks, M.: Evaluating the Met Office Unified Model land surface temperature in Global Atmosphere/Land 3.1 (GA/L3.1), Global Atmosphere/Land 6.1 (GA/L6.1) and limited area 2.2 km configurations, Geosci. Model Dev., 12, 1703–1724, https://doi.org/10.5194/gmd-12-1703-2019, 2019.
Brutsaert, W.: Energy Budget and Related Methods, in: Evaporation into the Atmosphere, Springer Netherlands, Dordrecht, 209–230, https://doi.org/10.1007/978-94-017-1497-6_10, 1982.
Chu, H., Luo, X., Ouyang, Z., Chan, W. S., Dengel, S., Biraud, S. C., Torn, M. S., Metzger, S., Kumar, J., Arain, M. A., Arkebauer, T. J., Baldocchi, D., Bernacchi, C., Billesbach, D., Black, T. A., Blanken, P. D., Bohrer, G., Bracho, R., Brown, S., Brunsell, N. A., Chen, J., Chen, X., Clark, K., Desai, A. R., Duman, T., Durden, D., Fares, S., Forbrich, I., Gamon, J. A., Gough, C. M., Griffis, T., Helbig, M., Hollinger, D., Humphreys, E., Ikawa, H., Iwata, H., Ju, Y., Knowles, J. F., Knox, S. H., Kobayashi, H., Kolb, T., Law, B., Lee, X., Litvak, M., Liu, H., Munger, J. W., Noormets, A., Novick, K., Oberbauer, S. F., Oechel, W., Oikawa, P., Papuga, S. A., Pendall, E., Prajapati, P., Prueger, J., Quinton, W. L., Richardson, A. D., Russell, E. S., Scott, R. L., Starr, G., Staebler, R., Stoy, P. C., Stuart-Haëntjens, E., Sonnentag, O., Sullivan, R. C., Suyker, A., Ueyama, M., Vargas, R., Wood, J. D., and Zona, D.: Representativeness of Eddy-Covariance flux footprints for areas surrounding AmeriFlux sites, Agr. Forest Meteorol., 301–302, 108350, https://doi.org/10.1016/j.agrformet.2021.108350, 2021.
Collatz, G., Ribas-Carbo, M., and Berry, J.: Coupled Photosynthesis-Stomatal Conductance Model for Leaves of C4 Plants, Funct. Plant Biol., 19, 519, https://doi.org/10.1071/PP9920519, 1992.
Crow, W. T., Kumar, S. V., and Bolten, J. D.: On the utility of land surface models for agricultural drought monitoring, Hydrol. Earth Syst. Sci., 16, 3451–3460, https://doi.org/10.5194/hess-16-3451-2012, 2012.
Dai, Y., Zeng, X., Dickinson, R. E., Baker, I., Bonan, G. B., Bosilovich, M. G., Denning, A. S., Dirmeyer, P. A., Houser, P. R., Niu, G., Oleson, K. W., Schlosser, C. A., and Yang, Z.-L.: The Common Land Model, B. Am. Meteorol. Soc., 84, 1013–1024, https://doi.org/10.1175/BAMS-84-8-1013, 2003.
Dai, Y., Dickinson, R. E., and Wang, Y.-P.: A Two-Big-Leaf Model for Canopy Temperature, Photosynthesis, and Stomatal Conductance, J. Climate, 17, 2281–2299, https://doi.org/10.1175/1520-0442(2004)017<2281:ATMFCT>2.0.CO;2, 2004.
Dai, Y., Shangguan, W., Wei, N., Xin, Q., Yuan, H., Zhang, S., Liu, S., Lu, X., Wang, D., and Yan, F.: A review of the global soil property maps for Earth system models, SOIL, 5, 137–158, https://doi.org/10.5194/soil-5-137-2019, 2019a.
Dai, Y., Yuan, H., Xin, Q., Wang, D., Shangguan, W., Zhang, S., Liu, S., and Wei, N.: Different representations of canopy structure – A large source of uncertainty in global land surface modeling, Agr. Forest Meteorol., 269–270, 119–135, https://doi.org/10.1016/j.agrformet.2019.02.006, 2019b.
Dirmeyer, P. A.: The terrestrial segment of soil moisture-climate coupling, Geophys. Res. Lett., 38, L16702, https://doi.org/10.1029/2011GL048268, 2011.
Dy, C. Y. and Fung, J. C. H.: Updated global soil map for the Weather Research and Forecasting model and soil moisture initialization for the Noah land surface model, J. Geophys. Res.-Atmos., 121, 8777–8800, https://doi.org/10.1002/2015JD024558, 2016.
Entekhabi, D., Rodriguez-Iturbe, I., and Castelli, F.: Mutual interaction of soil moisture state and atmospheric processes, J. Hydrol., 184, 3–17, https://doi.org/10.1016/0022-1694(95)02965-6, 1996.
Essery, R.: Large-scale simulations of snow albedo masking by forests, Geophys. Res. Lett., 40, 5521–5525, https://doi.org/10.1002/grl.51008, 2013.
Farquhar, G. D., Von Caemmerer, S., and Berry, J. A.: A biochemical model of photosynthetic CO2 assimilation in leaves of C3 species, Planta, 149, 78–90, https://doi.org/10.1007/BF00386231, 1980.
Forzieri, G., Miralles, D. G., Ciais, P., Alkama, R., Ryu, Y., Duveiller, G., Zhang, K., Robertson, E., Kautz, M., Martens, B., Jiang, C., Arneth, A., Georgievski, G., Li, W., Ceccherini, G., Anthoni, P., Lawrence, P., Wiltshire, A., Pongratz, J., Piao, S., Sitch, S., Goll, D. S., Arora, V. K., Lienert, S., Lombardozzi, D., Kato, E., Nabel, J. E. M. S., Tian, H., Friedlingstein, P., and Cescatti, A.: Increased control of vegetation on global terrestrial energy fluxes, Nat. Clim. Change, 10, 356–362, https://doi.org/10.1038/s41558-020-0717-0, 2020.
Harper, A. B., Williams, K. E., McGuire, P. C., Duran Rojas, M. C., Hemming, D., Verhoef, A., Huntingford, C., Rowland, L., Marthews, T., Breder Eller, C., Mathison, C., Nobrega, R. L. B., Gedney, N., Vidale, P. L., Otu-Larbi, F., Pandey, D., Garrigues, S., Wright, A., Slevin, D., De Kauwe, M. G., Blyth, E., Ardö, J., Black, A., Bonal, D., Buchmann, N., Burban, B., Fuchs, K., de Grandcourt, A., Mammarella, I., Merbold, L., Montagnani, L., Nouvellon, Y., Restrepo-Coupe, N., and Wohlfahrt, G.: Improvement of modeling plant responses to low soil moisture in JULESvn4.9 and evaluation against flux tower measurements, Geosci. Model Dev., 14, 3269–3294, https://doi.org/10.5194/gmd-14-3269-2021, 2021.
Harper, K. L., Lamarche, C., Hartley, A., Peylin, P., Ottlé, C., Bastrikov, V., San Martín, R., Bohnenstengel, S. I., Kirches, G., Boettcher, M., Shevchuk, R., Brockmann, C., and Defourny, P.: A 29-year time series of annual 300 m resolution plant-functional-type maps for climate models, Earth Syst. Sci. Data, 15, 1465–1499, https://doi.org/10.5194/essd-15-1465-2023, 2023.
Humphrey, V., Berg, A., Ciais, P., Gentine, P., Jung, M., Reichstein, M., Seneviratne, S. I., and Frankenberg, C.: Soil moisture–atmosphere feedback dominates land carbon uptake variability, Nature, 592, 65–69, https://doi.org/10.1038/s41586-021-03325-5, 2021.
Lawrence, P. J. and Chase, T. N.: Representing a new MODIS consistent land surface in the Community Land Model (CLM 3.0), J. Geophys. Res., 112, G01023, https://doi.org/10.1029/2006JG000168, 2007.
Li, J., Chen, F., Zhang, G., Barlage, M., Gan, Y., Xin, Y., and Wang, C.: Impacts of Land Cover and Soil Texture Uncertainty on Land Model Simulations Over the Central Tibetan Plateau, J. Adv. Model. Earth Sy., 10, 2121–2146, https://doi.org/10.1029/2018MS001377, 2018.
Li, J., Zhang, G., Chen, F., Peng, X., and Gan, Y.: Evaluation of Land Surface Subprocesses and Their Impacts on Model Performance With Global Flux Data, J. Adv. Model. Earth Sy., 11, 1329–1348, https://doi.org/10.1029/2018MS001606, 2019.
Lin, W., Yuan, H., Dong, W., Zhang, S., Liu, S., Wei, N., Lu, X., Wei, Z., Hu, Y., and Dai, Y.: Reprocessed MODIS Version 6.1 Leaf Area Index Dataset and Its Evaluation for Land Surface and Climate Modeling, Remote Sens., 15, 1780, https://doi.org/10.3390/rs15071780, 2023.
Mariotti, A., Ruti, P. M., and Rixen, M.: Progress in subseasonal to seasonal prediction through a joint weather and climate community effort, npj Clim. Atmos. Sci., 1, 4, https://doi.org/10.1038/s41612-018-0014-z, 2018.
Melton, J. R., Arora, V. K., Wisernig-Cojoc, E., Seiler, C., Fortier, M., Chan, E., and Teckentrup, L.: CLASSIC v1.0: the open-source community successor to the Canadian Land Surface Scheme (CLASS) and the Canadian Terrestrial Ecosystem Model (CTEM) – Part 1: Model framework and site-level performance, Geosci. Model Dev., 13, 2825–2850, https://doi.org/10.5194/gmd-13-2825-2020, 2020.
Ménard, C. B., Ikonen, J., Rautiainen, K., Aurela, M., Arslan, A. N., and Pulliainen, J.: Effects of Meteorological and Ancillary Data, Temporal Averaging, and Evaluation Methods on Model Performance and Uncertainty in a Land Surface Model, J. Hydrometeorol., 16, 2559–2576, https://doi.org/10.1175/JHM-D-15-0013.1, 2015.
Minasny, B. and McBratney, A. B.: Estimating the Water Retention Shape Parameter from Sand and Clay Content, Soil Sci. Soc. Am. J., 71, 1105–1110, https://doi.org/10.2136/sssaj2006.0298N, 2007.
Pastorello, G., Trotta, C., and Canfora, E.: The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data, Sci. Data, 7, 225, https://doi.org/10.1038/s41597-021-00851-9, 2020.
Pitman, A. J.: The evolution of, and revolution in, land surface schemes designed for climate models, Int. J. Climatol., 23, 479–510, https://doi.org/10.1002/joc.893, 2003.
Poulter, B., Ciais, P., Hodson, E., Lischke, H., Maignan, F., Plummer, S., and Zimmermann, N. E.: Plant functional type mapping for earth system models, Geosci. Model Dev., 4, 993–1010, https://doi.org/10.5194/gmd-4-993-2011, 2011.
Purdy, A. J., Fisher, J. B., Goulden, M. L., and Famiglietti, J. S.: Ground heat flux: An analytical review of 6 models evaluated at 88 sites and globally, J. Geophys. Res.-Biogeo., 121, 3045–3059, https://doi.org/10.1002/2016JG003591, 2016.
Shangguan, W., Dai, Y., Duan, Q., Liu, B., and Yuan, H.: A global soil data set for earth system modeling, J. Adv. Model. Earth Sy., 6, 249–263, https://doi.org/10.1002/2013MS000293, 2014.
Shi, J. and Yuan, H.: Mbnl1197/Flux-tower-attribute-for-LSM: Flux-tower-attribute-for-LSM (V3.0), Zenodo [code], https://doi.org/10.5281/zenodo.13684992, 2024.
Shi, J., Yuan, H., Lin, W., Dong, W., and Dai, Y.: A flux tower site attribute dataset intended for land surface modeling, Zenodo [data set], https://doi.org/10.5281/zenodo.12596218, 2024.
Stevens, D., Miranda, P. M. A., Orth, R., Boussetta, S., Balsamo, G., and Dutra, E.: Sensitivity of Surface Fluxes in the ECMWF Land Surface Model to the Remotely Sensed Leaf Area Index and Root Distribution: Evaluation with Tower Flux Data, Atmosphere, 11, 1362, https://doi.org/10.3390/atmos11121362, 2020.
Still, C. J., Berry, J. A., Collatz, G. J., and DeFries, R. S.: Global distribution of C3 and C4 vegetation: Carbon cycle implications, Global Biogeochem. Cy., 17, 6-1–6-14, https://doi.org/10.1029/2001GB001807, 2003.
Stöckli, R., Lawrence, D. M., Bonan, G. B., Denning, A. S., and Running, S. W.: Use of FLUXNET in the Community Land Model development, J. Geophys. Res., 113, G01025, https://doi.org/10.1029/2007JG000562, 2008.
Swenson, S. C., Burns, S. P., and Lawrence, D. M.: The Impact of Biomass Heat Storage on the Canopy Energy Balance and Atmospheric Stability in the Community Land Model, J. Adv. Model Earth Sy., 11, 83–98, https://doi.org/10.1029/2018MS001476, 2019.
Ukkola, A. M., De Kauwe, M. G., Pitman, A. J., Best, M. J., Abramowitz, G., Haverd, V., Decker, M., and Haughton, N.: Land surface models systematically overestimate the intensity, duration and magnitude of seasonal-scale evaporative droughts, Environ. Res. Lett., 11, 104012, https://doi.org/10.1088/1748-9326/11/10/104012, 2016a.
Ukkola, A. M., Pitman, A. J., Decker, M., De Kauwe, M. G., Abramowitz, G., Kala, J., and Wang, Y.-P.: Modelling evapotranspiration during precipitation deficits: identifying critical processes in a land surface model, Hydrol. Earth Syst. Sci., 20, 2403–2419, https://doi.org/10.5194/hess-20-2403-2016, 2016b.
Ukkola, A. M., Haughton, N., De Kauwe, M. G., Abramowitz, G., and Pitman, A. J.: FluxnetLSM R package (v1.0): a community tool for processing FLUXNET data for use in land surface modelling, Geosci. Model Dev., 10, 3379–3390, https://doi.org/10.5194/gmd-10-3379-2017, 2017.
Ukkola, A. M., Abramowitz, G., and De Kauwe, M. G.: A flux tower dataset tailored for land model evaluation, Earth Syst. Sci. Data, 14, 449–461, https://doi.org/10.5194/essd-14-449-2022, 2022.
Williams, I. N. and Torn, M. S.: Vegetation controls on surface heat flux partitioning, and land-atmosphere coupling, Geophys. Res. Lett., 42, 9416–9424, https://doi.org/10.1002/2015GL066305, 2015.
Williams, M., Richardson, A. D., Reichstein, M., Stoy, P. C., Peylin, P., Verbeeck, H., Carvalhais, N., Jung, M., Hollinger, D. Y., Kattge, J., Leuning, R., Luo, Y., Tomelleri, E., Trudinger, C. M., and Wang, Y.-P.: Improving land surface models with FLUXNET data, Biogeosciences, 6, 1341–1359, https://doi.org/10.5194/bg-6-1341-2009, 2009.
Wilson, K., Goldstein, A., Falge, E., Aubinet, M., Baldocchi, D., Berbigier, P., Bernhofer, C., Ceulemans, R., Dolman, H., Field, C., Grelle, A., Ibrom, A., Law, B. E., Kowalski, A., Meyers, T., Moncrieff, J., Monson, R., Oechel, W., Tenhunen, J., Valentini, R., and Verma, S.: Energy balance closure at FLUXNET sites, Agr. Forest Meteorol., 113, 223–243, https://doi.org/10.1016/S0168-1923(02)00109-0, 2002.
Yuan, H., Dai, Y., Xiao, Z., Ji, D., and Shangguan, W.: Reprocessing the MODIS Leaf Area Index products for land surface and climate modelling, Remote Sens. Environ., 115, 1171–1187, https://doi.org/10.1016/j.rse.2011.01.001, 2011.
Yuan, H., Dai, Y., Dickinson, R. E., Pinty, B., Shangguan, W., Zhang, S., Wang, L., and Zhu, S.: Reexamination and further development of two-stream canopy radiative transfer models for global land modeling, J. Adv. Model Earth Sy., 9, 113–129, https://doi.org/10.1002/2016MS000773, 2017.
Zeng, X. and Dickinson, R. E.: Effect of Surface Sublayer on Surface Skin Temperature and Fluxes, J. Climate, 11, 537–550, https://doi.org/10.1175/1520-0442(1998)011<0537:EOSSOS>2.0.CO;2, 1998.
Zhang, X., Dai, Y., Cui, H., Dickinson, R. E., Zhu, S., Wei, N., Yan, B., Yuan, H., Shangguan, W., Wang, L., and Fu, W.: Evaluating common land model energy fluxes using FLUXNET data, Adv. Atmos. Sci., 34, 1035–1046, https://doi.org/10.1007/s00376-017-6251-y, 2017.