Global transpiration data from sap flow measurements: the SAPFLUXNET database

Plant transpiration links physiological responses of vegetation to water supply and demand with hydrological, energy, and carbon budgets at the land–atmosphere interface. However, despite being the main land evaporative flux at the global scale, transpiration and its response to environmental drivers are currently not well constrained by observations. Here we introduce the first global compilation of whole-plant transpiration data from sap flow measurements (SAPFLUXNET, https://sapfluxnet.creaf.cat/, last access: 8 June 2021). We harmonized and quality-controlled individual datasets supplied by contributors worldwide in a semi-automatic data workflow implemented in the R programming language. Datasets include sub-daily time series of sap flow and hydrometeorological drivers for one or more growing seasons, as well as metadata on the stand characteristics, plant attributes, and technical details of the measurements. SAPFLUXNET contains 202 globally distributed datasets with sap flow time series for 2714 plants, mostly trees, of 174 species. SAPFLUXNET has a broad bioclimatic coverage, with woodland/shrubland and temperate forest biomes especially well represented (80 % of the datasets). The measurements cover a wide variety of stand structural characteristics and plant sizes. The datasets encompass the period between 1995 and 2018, with 50 % of the datasets being at least 3 years long. Accompanying radiation and vapour pressure deficit data are available for most of the datasets, while on-site soil water content is available for 56 % of the datasets. Many datasets contain data for species that make up 90 % or more of the total stand basal area, allowing the estimation of stand transpiration in diverse ecological settings. SAPFLUXNET adds to existing plant trait datasets, ecosystem flux networks, and remote sensing products to help increase our understanding of plant water use, plant responses to drought, and ecohydrological processes.
SAPFLUXNET version 0.1.5 is freely available from the Zenodo repository (https://doi.org/10.5281/zenodo.3971689; Poyatos et al., 2020a). The “sapfluxnetr” R package – designed to access, visualize, and process SAPFLUXNET data – is available from CRAN.


Introduction
Terrestrial vegetation transpires ca. 45 000 km³ of water per year (Schlesinger and Jasechko, 2014; Wang-Erlandsson et al., 2014; Wei et al., 2017), a flux that represents 40 % of global land precipitation and 70 % of total land evapotranspiration (Oki and Kanae, 2006), and is comparable in magnitude to global annual river discharge (Rodell et al., 2015). For most terrestrial plants, transpiration is an inevitable water loss to the atmosphere because they need to open stomata to allow CO₂ diffusion into the leaves for photosynthesis. Latent heat from transpiration represents 30 %-40 % of surface net radiation globally (Schlesinger and Jasechko, 2014; Wild et al., 2015). Transpiration is therefore a key process coupling land-atmosphere exchange of water, carbon, and energy, determining several vegetation-atmosphere feedbacks, such as land evaporative cooling or moisture recycling. Regulation of transpiration in response to fluctuating water availability and/or evaporative demand is a key component of plant functioning and one of the main determinants of a plant's response to drought (Martin-StPaul et al., 2017; Whitehead, 1998). Despite its relevance for Earth functioning, transpiration and its spatiotemporal dynamics are poorly constrained by available observations (Schlesinger and Jasechko, 2014) and not well represented in models (Mencuccini et al., 2019). An improved understanding of transpiration and its regulation along environmental gradients and across species is thus needed to predict future trajectories of land evaporative fluxes and vegetation functioning under increased drought conditions driven by global change.
Conceptually, transpiration can be quantified at different organizational scales: leaves, branches, and whole plants; ecosystems; and watersheds. In practice, transpiration is relatively easy to isolate from the bulk evaporative flux, evapotranspiration, when measuring in a dry canopy, at the leaf or the plant level. However, in terrestrial ecosystems, evapotranspiration includes evaporation from the soil and from water-covered surfaces, including plants. Transpiration measurements on individual leaves or branches with gas exchange systems are difficult to upscale to the plant level (Jarvis, 1995). Likewise, transpiration measurements using whole-plant chambers (e.g. Pérez-Priego et al., 2010) or gravimetric methods (e.g. weighing lysimeters) in the field are still challenging. At the ecosystem scale and beyond, evapotranspiration is generally determined using micrometeorological methods, catchment water budgets, or remote sensing approaches (Shuttleworth, 2007; Wang and Dickinson, 2012). In some cases, isotopic methods and different algorithms applied to measured ecosystem fluxes can provide an estimation of transpiration at the ecosystem scale (Kool et al., 2014; Stoy et al., 2019).
Transpiration drives water transport from roots to leaves in the form of sap flow through the plant's xylem pathway (Tyree and Zimmermann, 2002), and this sap flow affects heat transport in the xylem. Taking advantage of this, thermometric sap flow methods were first developed in the 1930s (Huber, 1932) and further refined over the following decades (Čermák et al., 1973; Marshall, 1958) to provide operational measurements of plant water use. These methods have become widely used in plant ecophysiology, agronomy, and hydrology, especially after the development of simple, easily replicable methods (e.g. Granier, 1985, 1987). Whole-plant measurements of water use obtained with thermometric sap flow methods provide estimates of water flow through plants from sub-daily to interannual timescales and have been mostly applied in woody plants, although several studies have measured sap flow in herbaceous species (Baker and Van Bavel, 1987; Skelton et al., 2013) and non-woody stems (e.g. Lu et al., 2002). Xylem sap flow can be upscaled to the whole plant, obtaining a near-continuous quantification of plant water use, keeping in mind that stem sap flow typically lags behind canopy transpiration (Schulze et al., 1985). Multiple sap flow sensors can be deployed, in almost any terrestrial ecosystem, to determine the magnitude and temporal dynamics of transpiration across species, environmental conditions, or experimental treatments. All sap flow methods are subject to methodological and scaling issues, which may affect the quantification of absolute water use in some circumstances (Čermák et al., 2004; Köstner et al., 1998; Smith and Allen, 1996; Vandegehuchte and Steppe, 2013). Nevertheless, all methods are suitable for the assessment of the temporal dynamics of transpiration and of its responses to environmental changes or to experimental treatments.
The generalized application of sap flow methods in ecological and hydrological research in the last 30 years has thus generated a large volume of data, with an enormous potential to advance our understanding of the spatiotemporal patterns and the ecological drivers of plant transpiration and its regulation. However, these data need to be compiled and harmonized to enable global syntheses and comparative studies across species and regions. Across-species data syntheses using sap flow data have mostly focused on maximum values extracted from publications (Kallarackal et al., 2013; Manzoni et al., 2013; Wullschleger et al., 1998). Multi-site syntheses have focused on the environmental sensitivity of sap flow, using site means of plant-level sap flow or sap-flow-derived stand transpiration (Tor-ngern et al., 2017). Because data sharing is only incipient in plant ecophysiology, sap flow datasets have not been traditionally available in open data repositories. Open data practices are now being implemented in databases, which fosters collaboration across monitoring networks in research areas relevant to plant functional ecology (Falster et al., 2015; Gallagher et al., 2020; Kattge et al., 2020) and ecosystem ecology (Bond-Lamberty and Thomson, 2010). The success of the data sharing and data re-use policies within the FLUXNET global network of ecosystem-level fluxes has shown how these practices can contribute to scientific progress (Bond-Lamberty, 2018).
Here we introduce SAPFLUXNET, the first global database of sap flow measurements built from individual community-contributed datasets. We implemented this compilation in a data structure designed to accommodate time series of sap flow and the main hydrometeorological drivers of transpiration, together with metadata documenting different aspects of each dataset. We harmonized all datasets and performed basic semi-automated quality assurance and quality control (QC) procedures. We also created a software package that provides access to the database, allows easy visualization of the datasets, and performs basic temporal aggregations. We present the ecological and geographic coverage of SAPFLUXNET version 0.1.5 (Poyatos et al., 2020a), followed by a discussion of potential applications of the database, its limitations, and a perspective of future developments.

An overview of sap flow measurements
The main characteristics of sap flow methods have been reviewed elsewhere (Čermák et al., 2004; Smith and Allen, 1996; Swanson, 1994; Vandegehuchte and Steppe, 2013). Given the already broad scope of the paper, here we only provide a brief methodological overview, without delving into the details of the individual methods. Sap flow sensors track the fate of heat applied to the plant's conducting tissue, or sapwood, using temperature sensors (thermocouples or thermistors), usually deployed in the plant's main stem. Both heating and temperature sensing can be done either internally, by inserting needle-like probes containing electrical resistors (or electrodes for some methods) and temperature sensors into the sapwood (Vandegehuchte and Steppe, 2013), or externally; these latter systems are especially designed for small stems and non-lignified tissues (Clearwater et al., 2009; Helfter et al., 2007; Sakuratani, 1981). Depending on how the heat is applied and the principles underlying sap flow calculations, sap flow sensors can be classified into three major groups: heat dissipation methods, heat pulse methods, and heat balance methods. Heat dissipation and heat pulse methods estimate sap flow per unit sapwood area, and they have been called "sap flux density methods" (Vandegehuchte and Steppe, 2013); heat balance methods directly yield sap flow for the entire stem or for a sapwood section. Heat dissipation methods include the constant heat dissipation (HD; Granier, 1985, 1987), the transient (or cyclic) heat dissipation (CHD; Do and Rocheteau, 2002), and the heat field deformation (HFD; Nadezhdina, 2018) methods. Heat pulse methods include the compensation heat pulse (CHP; Swanson and Whitfield, 1981), heat ratio (HR; Burgess et al., 2001), heat pulse T-max (HPTM; Cohen et al., 1981), and sapflow+ (Vandegehuchte and Steppe, 2012) methods. Heat balance methods include the trunk sector heat balance (TSHB; Čermák et al., 1973) and the stem heat balance (SHB; Sakuratani, 1981) methods. The suitability of a certain method in a given application largely depends on plant size and the flow range of interest, but heat dissipation and compensation heat pulse are the most widely used (Poyatos et al., 2016). Apart from these different methodologies, within each sap flow method sensor design (Davis et al., 2012) and data processing (Peters et al., 2018) can vary, resulting in relatively high levels of methodological variability comparable to those in other areas of plant ecophysiology.
The output from sap flow sensors is automatically recorded by data loggers at hourly or even higher temporal resolution. This output relates to heat transport in the stem and needs to be converted to meaningful quantities of water transport, such as sap flow per plant or per unit sapwood area. How this conversion is achieved varies greatly across methods, with some relying on empirical calibrations and others being more physically based and requiring the estimation of wood thermal properties and other parameters (Čermák et al., 2004; Smith and Allen, 1996; Vandegehuchte and Steppe, 2013). Depending on the method and the specific sensor design, sap flow measurements can be representative of single points, linear segments along the sapwood, sapwood area sections, or entire stems. Except for stem heat balance methods, which typically measure entire stems or large sapwood sections, most sap flow measurements need to be spatially integrated to account for radial (Berdanier et al., 2016; Cohen et al., 2008; Nadezhdina et al., 2002; Phillips et al., 1996) and azimuthal (Cohen et al., 2008; Lu et al., 2000; Oren et al., 1999a) variation of sap flow within the stem to obtain an estimate of whole-plant water use (Čermák et al., 2004). At a minimum, an estimate of sapwood area is needed to upscale the measurements to whole-plant sap flow rates. Sap flow rates can thus be expressed per individual (i.e. plant or tree), per unit sapwood area (normalizing by water-conducting area), and per unit leaf area (normalizing by transpiring area).
Here we will use the term "sap flow" when referring, in general, to the rate at which water moves through the sapwood of a plant and, more specifically, when we refer to sap flow per plant (i.e. water volume per unit time, Edwards et al., 1996). We acknowledge that the term "sap flux" has also been proposed for this quantity (Lemeur et al., 2009), but more generally "sap flux density" (e.g. Vandegehuchte and Steppe, 2013) or just "sap flux" are used to refer to "sap flow per unit sapwood area". Since here we include methods natively measuring sap flow per plant or per sapwood area, throughout this paper we will use the more general term "sap flow", and, when necessary, we will indicate explicitly the reference area used: "sap flow per (unit) sapwood area", "sap flow per (unit) leaf area", or "sap flow per (unit) ground area".

Data compilation
SAPFLUXNET was conceived as a compilation of published and unpublished sap flow datasets (Appendix, Table A1), and thus the ultimate success of the initiative critically depended on the contribution of datasets by the sap flow community. An expression of interest showed that a critical mass of datasets with a wide geographic distribution could potentially be contributed, and the results of this survey were used to raise the interest of the sap flow community. The data contribution stage was open between July 2016 and December 2017, although a few additional datasets were updated during the data quality control process and contain more recent data.
All contributed datasets had to meet some minimum criteria before they were accepted, in terms of both content and format. We required that all datasets contained sub-daily, processed sap flow data, representative of whole-plant water use under different hydrometeorological conditions. This meant that both the processing from raw temperature data to sap flow quantities and the scaling from single-point measurements to whole-plant data had been performed by the data contributor responsible for each dataset. Time series of sap flow data and hydrometeorological drivers were required to be representative of one growing season, setting, as a broad reference, a minimum duration of 3 months. Sap flow could be expressed as total flow rate either per plant or per unit sapwood area. Contributors also needed to provide metadata on relevant ecological information of the site, stand, species, and measured plants as well as on basic technical details of the sap flow and hydrometeorological time series. Datasets had to be formatted using a documented spreadsheet template (cf. "sapfluxnet_metadata_template.xlsx" in the Supplement) and uploaded to a dedicated server at CREAF, Spain, using an online form.

Data harmonization and quality control: QC1
Once datasets were received, they were stored and entered a process of data harmonization and quality control (Fig. 1, Supplement Fig. S1). This process combined automatic data checks with human supervision, and the entire workflow was governed by functions and scripts in the R language (R Core Team, 2019), including other related tools, such as R markdown documents and Shiny applications. All R code involved in this QC process was implemented in the sapfluxnetQC1 package; see the package vignettes for a detailed description (https://github.com/sapfluxnet/sapfluxnetQC1/tree/master/vignettes, last access: 8 June 2021). To aid in the detection of potential data issues throughout the entire process (Figs. 1, S1), we implemented several elements of control: (1) automatic log files tracking the output of each QC function applied, (2) automatic creation and update of status files tracking the QC level reached by each dataset, (3) automatic QC summary reports in the form of R markdown documents, (4) interactive Shiny applications for data visualization, (5) documentation of manual changes applied to the datasets using manually edited text files, (6) storage of manual data cleaning operations in text files, and (7) automatic data quality flagging associated with each dataset. All these items ensure a robust, transparent, reproducible, and scalable data workflow. Example files for (2), (3), and (6) can be found in the Supplement.
The first stage of the data QC (QC1) performed several data checks (Supplement Table S1) on received spreadsheet files and produced an interactive report in an R markdown document, which signalled possible inconsistencies in the data and warned of potential errors. These data issues were addressed, with the help of data contributors if needed. Once no errors remained, the dataset was converted into an object of the custom-designed "sfn_data" class (Fig. S2; see also Sect. 2.5), which contained all data and metadata for a given dataset (Tables A2-A6 list all variable names and units). Data and metadata belonging to all Level 1 datasets were further visually inspected using an interactive R Shiny application, and, if no major issues were detected, they were subjected to the second QC process, QC2.
Figure 1. Overview of the SAPFLUXNET data workflow. Data files are received from data contributors and undergo several quality-control processes (QC1 and QC2). Both QC1 and QC2 produce an .RData object of the custom-designed sfn_data S4 class storing all data, metadata, and data flags for each dataset. The progress and results of the QC processes are monitored through individual reports and log files. The final outcome is stored in a folder structure with either a single .RData file for each dataset or a set of seven csv files for each dataset.

Data harmonization and quality control: QC2
Datasets entering QC2 underwent several data cleaning and data harmonization processes (Table S2). We first ran outlier detection and out-of-range checks; these checks did not delete or modify the data but only warned about any suspicious observation ("outlier" and "range" warnings). The outlier detection algorithm was based on a Hampel filter, which also estimates a replacement value for a candidate outlier (Hampel, 1974). For the range checks, we defined minimum and maximum allowed values for all the time series variables, based on published values of extreme weather records and maximum transpiration rates (Cerveny et al., 2007; Manzoni et al., 2013). The outcomes of the outlier and range checks were visually inspected on the actual time series being evaluated using an interactive R Shiny application (Fig. S3). Following expert knowledge, visually confirmed outliers were replaced by the values estimated by the Hampel filter. Similarly, we replaced out-of-range values with "NA" if the variable was out of its physically allowed range (Fig. S3). Outlier and out-of-range "warnings" for each observation (i.e. for each variable and time step) were documented in two data flags tables, with the same dimensions as the corresponding data tables (Fig. S2). Likewise, those observations with confirmed problematic values, which were removed or replaced, were also flagged; further information can be found in the "data flags" vignettes in the "sapfluxnetr" package.
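The warning stage described above can be illustrated with a minimal Python sketch. It is not the sapfluxnetQC1 implementation (which is written in R); the function names, the window half-width, the three-scaled-MAD threshold, and the example bounds are all assumptions chosen for illustration. The sketch flags candidate outliers with a Hampel-type rule, proposes the local median as a replacement, and separately flags values outside an allowed range, without modifying the input series.

```python
from statistics import median


def hampel_flags(values, half_window=3, n_sigmas=3.0):
    """Flag candidate outliers in a gap-free series with a Hampel-type rule.

    Returns (flags, replacements): flags[i] is True when values[i] deviates
    from the median of its window by more than n_sigmas scaled MADs, and
    replacements[i] is that window median (the proposed replacement value).
    half_window and n_sigmas are illustrative defaults, not SAPFLUXNET's.
    """
    scale = 1.4826  # makes the MAD comparable to a Gaussian standard deviation
    flags, replacements = [], []
    for i, x in enumerate(values):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        flags.append(abs(x - med) > n_sigmas * scale * mad)
        replacements.append(med)
    return flags, replacements


def range_warnings(values, minimum, maximum):
    """Flag values outside [minimum, maximum]; the input is left untouched."""
    return [not (minimum <= v <= maximum) for v in values]


# A short series with one obvious spike at position 4.
series = [1.0, 1.1, 0.9, 1.0, 10.0, 1.0, 1.1, 0.9, 1.0]
outlier_flags, proposed = hampel_flags(series)

# Hypothetical bounds for air temperature in degrees Celsius.
temperature_flags = range_warnings([-5.0, 20.0, 70.0], -40.0, 60.0)
```

Keeping the flags separate from the data mirrors the two-step logic in the text: warnings are raised automatically, and replacement only happens after visual confirmation.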
Final data harmonization processes in QC2 involved unit transformations and the calculation of derived variables (Table S2). When plant sapwood area was provided by data contributors, we interconverted between sap flow rate per plant and per unit sapwood area. If leaf area was supplied, we also calculated sap flow per unit leaf area, but note that this transformation does not take into account the seasonal variation in leaf area; we document in the metadata for which datasets this information could be available from data contributors. In QC2 we estimated missing environmental variables which could be derived from related variables in the dataset (Appendix, Table A6). We also estimated the apparent solar time and extraterrestrial global radiation from the provided timestamp and geographic coordinates using the R package "solaR" (Perpiñán, 2012). All estimated or interconverted observations were flagged as "CALCULATED" in the "env_flags" or "sap_flags" table (Fig. S2).
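The interconversions between sap flow expressions reduce to dividing or multiplying by the relevant area. The following Python sketch shows the arithmetic under stated assumptions: the function names are hypothetical, and the units (cm³ h⁻¹ per plant, cm² of sapwood, m² of leaf area) are chosen for illustration and should be checked against the unit tables of the database. As noted above, a single leaf area value ignores seasonal variation.

```python
def plant_to_sapwood(flow_per_plant, sapwood_area_cm2):
    """Convert sap flow per plant (cm3 h-1) to sap flow per unit sapwood
    area (cm3 cm-2 h-1). Units are illustrative assumptions."""
    if sapwood_area_cm2 <= 0:
        raise ValueError("sapwood area must be positive")
    return flow_per_plant / sapwood_area_cm2


def sapwood_to_plant(flow_per_sapwood, sapwood_area_cm2):
    """Convert sap flow per unit sapwood area (cm3 cm-2 h-1) to sap flow
    per plant (cm3 h-1)."""
    if sapwood_area_cm2 <= 0:
        raise ValueError("sapwood area must be positive")
    return flow_per_sapwood * sapwood_area_cm2


def plant_to_leaf(flow_per_plant, leaf_area_m2):
    """Convert sap flow per plant (cm3 h-1) to sap flow per unit leaf area
    (cm3 m-2 h-1), using one fixed leaf area value."""
    if leaf_area_m2 <= 0:
        raise ValueError("leaf area must be positive")
    return flow_per_plant / leaf_area_m2


# Example: a tree with 262 cm2 of sapwood and 33 m2 of leaf area.
per_sapwood = plant_to_sapwood(5240.0, 262.0)
per_plant = sapwood_to_plant(per_sapwood, 262.0)
per_leaf = plant_to_leaf(660.0, 33.0)

# Derived values would carry a flag so that users can tell them apart
# from values supplied by contributors.
derived_flag = "CALCULATED"
```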

Data structure
One of the major benefits of the SAPFLUXNET data workflow is the encapsulation of datasets in self-contained R objects of the S4 class with a predefined structure. These objects belong to the custom-designed sfn_data class, which displays different slots to store time series of sap flow and environmental data, their associated data flags, and all the metadata (Fig. S2). For further information please see the "sfn_data classes" vignette in the sapfluxnetr package. The code identifying each dataset was created by the combination of a "country" code; a "site" code; and, if applicable, a "stand" code and a "treatment" code. This means that several stands and/or treatments can be present within one site (Table S3).
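The composition of dataset codes can be sketched in a few lines of Python. This is only an illustration of the naming rule stated above: the function name is hypothetical, the underscore separator is inferred from the codes that appear in this paper (e.g. FRA_PUE, FRA_HES_HE1_NON), and reading "HE1" as a stand code and "NON" as a treatment code is an assumption.

```python
def build_dataset_code(country, site, stand=None, treatment=None):
    """Join the available code components with underscores.

    Optional components (stand, treatment) are skipped when absent, so a
    site with a single stand and no treatment gets a two-part code.
    """
    parts = [country, site, stand, treatment]
    return "_".join(p for p in parts if p)


two_part = build_dataset_code("FRA", "PUE")
four_part = build_dataset_code("FRA", "HES", "HE1", "NON")
```

Because the optional components are simply omitted, a code cannot always be split back unambiguously into its parts; the metadata tables remain the authoritative source for site, stand, and treatment information.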
At the end of the QC process, we generated a folder structure with a first level storing datasets as either sfn_data objects or as a set of comma-separated (csv) text files. Within each of these formats, a second-level folder groups datasets according to how sap flow is normalized (per plant, sapwood, or leaf area); note that the same dataset, expressing different sap flow quantities, can be present in more than one folder (e.g. "plant" and "sapwood"). Finally, the third level contains the data files for each dataset: either a single sfn_data object storing all data and metadata or all the individual csv files. More details on the data structure and units can be found in the "sapfluxnetr-quick-guide" and "metadata-and-data-units" vignettes, respectively, in the sapfluxnetr package.

Data coverage
The SAPFLUXNET version 0.1.5 database harbours 202 globally distributed datasets (Figs. 2a and S4, Table S3), from 121 geographical locations, with Europe, the eastern USA, and Australia especially well represented. These datasets were represented in the bioclimatic space using the terrestrial biomes delimited by Whittaker (Fig. 2b), but note that, like any bioclimatic classification, it has its limitations. Datasets have been compiled from all terrestrial biomes, except for temperate rain forests, although some tropical montane sites have been included. Woodland/shrubland and temperate forest biomes are the most represented in the database, adding up to 80 % of the datasets (Fig. 2b). However, large forested areas in the tropics and in boreal regions are still not well represented (Fig. 2a and b). Looking at the distribution by vegetation type (Fig. 2c), evergreen needleleaf forest is the most represented vegetation type (65 datasets), followed by deciduous broadleaf forest (47 datasets) and evergreen broadleaf forest (43 datasets).

Methodological aspects
For more than 90 % of the plants, sap flow at the whole-plant level is available (either directly provided by contributors or calculated in the QC process); this is important for upscaling SAPFLUXNET data to the stand level (cf. Sect. 4.2). Because the leaf area of the measured plants is often not available as metadata, sap flow per unit leaf area was estimated for only 18.6 % of the individuals (Fig. 4). The heat dissipation method is the most frequent method in the database (HD, 66.4 % of the plants), followed by the trunk sector heat balance (TSHB, 16.4 %) and the compensation heat pulse method (CHP, 8.4 %) (Fig. 4). This distribution is broadly similar to the use of each method documented in the literature, although the TSHB method is overrepresented here, compared to the current use of this method by the sap flow community (Poyatos et al., 2016). Some methods, especially those belonging to the heat pulse family and the cyclic (or transient) heat dissipation (CHD) method, are mostly used in angiosperms, while the TSHB and the heat field deformation (HFD) methods are more frequently used in gymnosperms (Fig. 4). Calibration of sap flow sensors and scaling from point measurements to the whole plant can be critical steps towards accurate estimates of absolute sap flow rates. In SAPFLUXNET, most of the sap flow time series have not undergone a species-specific calibration, with the CHD method showing the highest percentage of calibrated time series (Table 1). This lack of calibrations may be relevant for the more empirical heat dissipation methods (HD and CHD), which have been shown to consistently underestimate sap flow rates by 40 % on average (Peters et al., 2018; Steppe et al., 2010).
Figure caption fragment: bar height for a given colour is proportional to the number of plants in the corresponding dataset, which is also shown in parentheses next to the dataset code.
Radial integration of single-point sap flow measurements is more frequent than azimuthal integration (Table 2), except for the CHD method. For a large number of plants measured with the HD method and all plants measured with the HPTM method, no radial integration procedure was reported. In contrast, the CHP, HR, SHB, and TSHB methods most frequently addressed radial variation in one way or another (Table 2). Azimuthal integration procedures are also more frequent when the TSHB method is used (Table 2).
Table 1. Number of sap flow time series in SAPFLUXNET depending on whether they were calibrated (species-specific) or non-calibrated, or whether this information was not provided, for the different sap flow methods: cyclic (or transient) heat dissipation (CHD), compensation heat pulse (CHP), heat dissipation (HD), heat field deformation (HFD), heat pulse T-max (HPTM), heat ratio (HR), stem heat balance (SHB), and trunk sector heat balance (TSHB). The percentage of calibrated time series is expressed with respect to the total number of sap flow time series for each method.
Plant size metadata in SAPFLUXNET are complemented with plant-level data of sapwood and leaf area, which provide information on the functional areas for water transport and loss (Fig. 5a). Sapwood and leaf area show highly skewed distributions, with long tails towards the largest values and slightly higher median values for gymnosperms (262 cm² and 33.0 m² for sapwood and leaf areas, respectively) compared to angiosperms (168 cm² and 29.9 m²). Accordingly, median sapwood depth is also higher for gymnosperms (5.1 cm) compared to angiosperms (3.7 cm). The largest trees (Mortoniodendron, Pouteria, Agathis) with deep sapwood (17-24 cm) are also those with the largest sapwood areas.
Many large angiosperm trees from tropical (CRI_TAM_TOW, IDN_PON_STE, GUF_GUY_ST2; see Table S3 for dataset codes) and temperate forests (Fagus grandifolia, USA_SMIC_SCB) also show large sapwood areas (> 5000 cm²), but the plant with the deepest sapwood is a gymnosperm, an Abies pinsapo in Spain with 30.7 cm of sapwood depth.

Stand characteristics
Stand-level metadata include several variables associated with management, vegetation structure, and soil properties. Half of the datasets originate from naturally regenerated, unmanaged stands, and 13.9 % come from naturally regenerated but managed stands. Plantations add up to 32.2 %, and orchards only represent 4 % of the datasets. Reporting of structural variables is mixed, with stand height, age, density, and basal area showing relatively low missingness (6.4 %, 11.4 %, 12.9 %, and 13.4 %, respectively); in contrast, soil depth and leaf area index (LAI) are missing from 26.7 % and 33.7 % of the datasets.
SAPFLUXNET datasets originate from stands with diverse structural characteristics. Median stand age is 54 years, and there are several datasets coming from > 100-year-old forests (Fig. 5b). Stand height shows a similar range and distribution of values compared to individual plant height (Fig. 5a and b). The denser stands correspond to coppiced evergreen oak stands from Mediterranean forests (FRA_PUE, ESP_TIL_OAK), species-rich tropical forests (MDG_SEM_TAL), or relatively young temperate forests (e.g. FRA_HES_HE1_NON, USA_CHE_MAP). The sparsest stands (< 200 stems ha⁻¹) correspond to tree-grass savanna systems (Spain, Portugal, Australia, Senegal), dry woodlands (China), or oil palm plantations in Indonesia (IDN_JAM_OIL). Stands with the largest basal areas (> 70 m² ha⁻¹) are mostly dominated by broadleaf species, except for a Picea abies plantation in Sweden (SWE_SKO_MIN). The distribution of LAI shows a median of 3.5 m² m⁻², with the largest values observed in temperate (CZE_BIK, USA_DUK_HAR, HUN_SIK) and tropical (GUF_GUY_GUY, COL_MAC_SAF_RAD) forests. The stands with the lowest LAI correspond to the sparse woodlands from Mediterranean and semi-arid locations and also those from forests near altitudinal or latitudinal treelines (FIN_PET, AUT_TSC). SAPFLUXNET datasets show a median soil depth of 100 cm, with only a dozen datasets originating from sites with soils deeper than 10 m (Fig. 5b).
The number of plants per dataset is highly variable, with most of the datasets (86 %) containing data for at least 4 trees and 46 % of the datasets having data for at least 10 trees (Fig. 6a; see also Fig. 9).

Temporal characteristics
The oldest datasets in SAPFLUXNET go back to 1995 (GBR_DEV_CON, GBR_DEV_DRO), while the most recent data reach up to 2018 (datasets from the ESP_MAJ cluster of sites). Several multi-year datasets are present in SAPFLUXNET (Fig. 6), with 50 % of the datasets spanning a period of at least 3 years and some datasets being extraordinarily long (16 years in FRA_PUE). Frequently, the datasets only cover the "growing season" periods, or even shorter periods for some sites which were eventually included because they improved the ecological and geographic coverage of the database (e.g. ARG_MAZ, ARG_TRE as representative of deciduous Nothofagus forest in southern Patagonia). In contrast, a few datasets show continuous records over multiple years (Fig. 6b). Most of the longest datasets come from European or North American sites (Fig. 6), with the exception of some datasets from Israel (ISR_YAT_YAT, 7 years), Russia (RUS_FYO, 7 years), South Korea (KOR_TAE cluster of sites, 6 years), and New Zealand (NZL_HUA_HUA, 5 years).
SAPFLUXNET provides an unprecedented database to study the detailed temporal dynamics of plant transpiration across species and sites globally. Sub-daily records of sap flow (e.g. at least at hourly time steps) are available for extended periods (Fig. 6b), allowing both seasonal and diel patterns in water-use regulation by trees to be addressed, as well as how these temporal patterns change across species or years across terrestrial biomes, reflecting different phenologies and water-use strategies. For instance, in Mediterranean forests, evergreen species such as Quercus ilex, Arbutus unedo, and Pinus halepensis show moderate sap flow the whole year round, while the deciduous Quercus pubescens shows higher sap flow density during a shorter period and its water use is heavily reduced during a dry year (2012) (Fig. 7a). Temperate forests without water availability limitations show relatively high flows during the growing season and similar diel sap flow patterns amongst species (Fig. 7b). In contrast, tropical forests show moderate to high sap flow rates during the entire year, with different dynamics in the intra-daily water-use regulation across species. For example, Inga sp. in a highly diverse wet tropical forest in Costa Rica reduced sap flow during mid-day hours compared to co-existing species (Fig. 7c).

Availability of environmental data
All SAPFLUXNET datasets contain ancillary time series of the main hydrometeorological drivers of transpiration, accompanied by information on where these variables had been measured (Fig. 8a). Air temperature is available for all datasets. Although vapour pressure deficit (VPD) was originally absent in 38 % of the datasets (Fig. 8a and b), we could estimate it for those sites providing air temperature and relative humidity data (QC Level 2; see Sect. 2.3), so that finally only 2 out of the 202 datasets have missing VPD information. For radiation variables, shortwave radiation was provided most often, while photosynthetically active and net radiation were provided less frequently; only 8 out of 202 datasets do not have any accompanying radiation data. Most of these environmental variables were measured on site, with precipitation being the variable most frequently retrieved from nearby meteorological stations (48 % of the datasets) (Fig. 8a). Soil water content measured at shallow depth, typically between 0 and 30 cm below the soil surface, is provided for 56 % of the datasets, while soil moisture from deep soil layers is available for only 27 % of the datasets.
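As an illustration of how VPD can be derived from air temperature and relative humidity, the following minimal Python sketch uses the Tetens approximation for saturation vapour pressure. The choice of this particular formula and its constants, as well as the function names, are assumptions made for illustration only; the text above does not state which equation was applied in the SAPFLUXNET workflow.

```python
import math


def saturation_vapour_pressure_kpa(ta_celsius: float) -> float:
    """Saturation vapour pressure (kPa) over water, Tetens approximation.

    Assumption: the constants 0.6108, 17.27 and 237.3 are the commonly
    used Tetens coefficients, not values taken from the SAPFLUXNET workflow.
    """
    return 0.6108 * math.exp(17.27 * ta_celsius / (ta_celsius + 237.3))


def vpd_kpa(ta_celsius: float, rh_percent: float) -> float:
    """Vapour pressure deficit (kPa) from air temperature and relative humidity."""
    if not 0.0 <= rh_percent <= 100.0:
        raise ValueError("relative humidity must be between 0 and 100 %")
    es = saturation_vapour_pressure_kpa(ta_celsius)
    # Actual vapour pressure is es * RH/100, so the deficit is es * (1 - RH/100).
    return es * (1.0 - rh_percent / 100.0)


if __name__ == "__main__":
    # Hypothetical mid-day conditions: 25 degrees C and 50 % relative humidity.
    print(round(vpd_kpa(25.0, 50.0), 3))
```

A function of this kind would only apply to datasets that provide both air temperature and relative humidity, which is the condition stated above for estimating VPD.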

Uncertainty estimation and bias correction in sap flow measurements
Uncertainty for the main sap flow density methods could be obtained by using a recent compilation of sap flow calibration data (Flo et al., 2019; Table B1), which covers the range of sap flow density observed in SAPFLUXNET, except for the CHP method (Fig. B1). At low flows, uncertainties were larger for HPTM and, to a lesser extent, for CHP, while they were lowest for HR and HFD. Uncertainties increased steeply with flow, particularly for the HPTM, CHP, and HR methods. These patterns were evident when examining sub-daily sap flow measured with the most represented sap flow density methods in SAPFLUXNET (Fig. B2). The analysis of calibration data also showed that HD, the most represented method by far, underestimates water flow, on average, by 40 % when using the original calibration (Granier, 1985, 1987). Because plant-level metadata contain information that documents the conversion from raw to processed data, a first-order correction for data from uncalibrated HD probes can be applied (Fig. B3a). Additional uncertainties and corrections arising from sapwood area estimation and from the integration of sap flow radial variability must also be considered when upscaling to plant-level sap flow. Uncertainty from sapwood area estimation is expected to be lower than methodological uncertainty given the generally tight relationship between basal area and sapwood area (Fig. B3b and c). Data without an explicit radial integration of sap flow measurements can be adjusted using generic radial sap flow profiles based on wood type (Berdanier et al., 2016). In this case, assuming uniform sap flow along the sapwood usually leads to sap flow overestimation for both ring-porous and diffuse-porous species (Fig. B4).

Applications in plant ecophysiology and functional ecology
There are multiple potential applications of the SAPFLUXNET database to assess whole-plant water-use rates and their environmental sensitivity, both across species (e.g. Oren et al., 1999b) and at the intraspecific level. SAPFLUXNET will allow disentangling the roles of evaporative demand and soil water content in controlling transpiration at the plant level, complementing recent studies looking at how water supply and demand affect evapotranspiration at the ecosystem level (Novick et al., 2016). The availability of global sap flow data at sub-daily time resolution and spanning entire growing seasons will allow focusing on how maximum water use and its environmental sensitivity vary with plant-level attributes such as stem diameter (Dierick and Hölscher, 2009; Meinzer et al., 2005), tree height (Novick et al., 2009; Schäfer et al., 2000), hydraulic traits (Manzoni et al., 2013; Poyatos et al., 2007), and other plant traits (Grossiord et al., 2020; Kallarackal et al., 2013). SAPFLUXNET thus provides an unprecedented tool to understand how structural and physiological traits coordinate with each other, how these traits translate to whole-plant regulation of water fluxes (McCulloh et al., 2019), and how this integration determines drought responses (Choat et al., 2018) and post-drought recovery patterns (Yin and Bauerle, 2017). Analyses of the temporal dynamics of plant water use in response to specific drought events, as recently assessed for gross primary productivity (e.g. Schwalm et al., 2017), can also help to quantify drought legacy effects. If combined with water potential measurements, sap flow data can be used to estimate whole-plant hydraulic conductance and study its response to drought (e.g. Cochard et al., 1996), as well as the recovery of the plant hydraulic system after drought.
SAPFLUXNET will allow new insights into within-day patterns and controls in whole-plant water use, which can reveal the fine details of its physiological regulation. Circadian rhythms can modulate stomatal responses to the environment, potentially affecting sap flow dynamics (e.g. de Dios et al., 2015). Hysteresis in diel sap flow relationships with evaporative demand and time lags between transpiration and sap flow are two linked phenomena likely arising from plant capacitance and other mechanisms (O'Brien et al., 2004; Schulze et al., 1985) that also influence diel evapotranspiration dynamics (Matheny et al., 2014; Zhang et al., 2014). A major driver of time lags is the use of stored water to meet the transpiration demand, which can now be analysed across species, plant sizes, or drought conditions using time series analyses, simplified electric analogies (Phillips et al., 1997, 2004; Ward et al., 2013), or detailed water transport models (Bohrer et al., 2005; Mirfenderesgi et al., 2016). Night-time water use can be substantial for some species (Forster, 2014; Resco de Dios et al., 2019). However, available syntheses rely on study-specific quantification of what constitutes nocturnal sap flow and do not address possible methodological influences (Zeppel et al., 2014). SAPFLUXNET includes metadata to identify methods (e.g. HRM; Burgess et al., 2001) and data processing approaches (zero-flow determination method in "pl_sens_cor_zero", Table A5) that can help identify suitable datasets to quantify night-time fluxes.
Sap flow data have been widely employed to assess changes in tree water use after biotic (e.g. Hultine et al., 2010) or abiotic (Oren et al., 1999a) disturbances. Likewise, sap flow data have been used to report changes in species and stand water use following experimental treatments involving resource availability modifications (e.g. Ewers et al., 1999) or density changes (i.e. thinning; Simonin et al., 2007). The SAPFLUXNET database includes datasets with experimental manipulations, applied either at the stand or at the individual level, qualitatively documented in the metadata (Table 3). The main treatments present are related to thinning, water availability changes (irrigation, throughfall exclusion), and wildfire impact (Table 3), potentially facilitating new data syntheses and meta-analyses using these datasets (e.g. Grossiord et al., 2018).
The combination of SAPFLUXNET with other ecophysiological databases can be informative as to the relative sensitivity of different physiological processes in response to drought, for example those related to growth and carbon assimilation. Within-day fluctuations of stem diameter can be jointly analysed with co-located sap flow measurements to study the dynamics of stored water use under drought and its contribution to transpiration (e.g. Brinkmann et al., 2016) and to infer parameters on tree hydraulic functioning using mechanistic models of tree hydrodynamics (Salomón et al., 2017; Steppe et al., 2006; Zweifel et al., 2007). These analyses could be carried out for a large number of species by combining SAPFLUXNET with data from the DENDROGLOBAL database (http://78.90.202.92/streess/databases/dendroglobal, last access: 8 June 2021); there are at least 18 SAPFLUXNET datasets with dendrometer data in DENDROGLOBAL. This database and the International Tree-Ring Data Bank (S.  could also be used with SAPFLUXNET to investigate, at the species level, the link between radial growth and water use, including their environmental sensitivity (Morán-López et al., 2014), and how these two processes comparatively respond to drought (Sánchez-Costa et al., 2015). Moreover, given the tight link between water use and carbon assimilation, combining SAPFLUXNET with water-use efficiency from plant δ¹³C data could potentially be used to estimate whole-plant carbon assimilation (Hu et al., 2010; Klein et al., 2016; Rascher et al., 2010; Vernay et al., 2020), a quantity that is difficult to measure directly, especially in field-grown, mature trees.

Applications in ecosystem ecology and ecohydrology
SAPFLUXNET will provide a global look at plant water flows to bridge the scales between plant traits and ecosystem fluxes and properties. Vegetation structure, species composition, and differential water-use strategies amongst and within species scale up to different seasonal patterns of ecosystem transpiration, with a strong influence on ecosystem evapotranspiration and its partitioning. Global controls on evaporative fluxes from vegetation have been mostly addressed using ecosystem (Williams et al., 2012) or catchment evapotranspiration data (Peel et al., 2010). These studies have described global patterns in evapotranspiration driven by different plant functional types or climates, but they cannot be used to quantify and to explain the enormous variation in the regulation of transpiration across and within taxa. The SAPFLUXNET database will provide a long-demanded data source to be used in ecohydrological research (Asbjornsen et al., 2011). Upscaling individual measurements to the stand level (Čermák et al., 2004; Granier et al., 1996; Köstner et al., 1998) is necessary to quantitatively compare sap-flow-based transpiration with evapotranspiration and transpiration estimates at the ecosystem scale and beyond. Even though SAPFLUXNET was designed to accommodate sap flow data at the plant level, scaling to the ecosystem level is possible for many datasets. For a basic upscaling exercise using SAPFLUXNET data (Poyatos et al., 2020b), whole-plant sap flow can be normalized by individual basal area (as DBH is usually available in the metadata; cf. Sect. 3.3), averaged for a given species, and then scaled to stand-level transpiration using total stand basal area and the fraction of basal area occupied by each measured species (see stand metadata, Table A3). For many datasets, sap flow data are available for the species comprising most of the stand basal area (often even 100 %, Fig. 9), but species-based upscaling may be unfeasible in many tropical sites (Fig. 9b), where size-based scaling could be applied instead (e.g. da Costa et al., 2018). Further refinements of the upscaling procedure could be achieved by using trunk diameter distributions of the sap flow plots (Berry et al., 2018). This information, however, is not readily available in SAPFLUXNET, and other data sources (e.g. forest inventories, lidar data) or additional simplifying assumptions (i.e. applying the size distribution of measured individuals in the dataset) would be needed.
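The basic species-based upscaling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in Poyatos et al. (2020b): the function names, the input layout (species, DBH in cm, whole-plant sap flow in cm³ h⁻¹), and the example values are all hypothetical, and the basal area of species without sap flow measurements is simply left out of the total.

```python
import math
from collections import defaultdict


def basal_area_cm2(dbh_cm: float) -> float:
    """Basal area (cm2) of a stem, treated as a circle of diameter DBH."""
    return math.pi * (dbh_cm / 2.0) ** 2


def stand_transpiration_mm_h(plants, stand_basal_area_m2_ha, species_ba_fraction):
    """Scale plant-level sap flow to stand transpiration (mm h-1).

    plants: iterable of (species, dbh_cm, sap_flow_cm3_h) tuples (hypothetical layout).
    stand_basal_area_m2_ha: total stand basal area (m2 ha-1).
    species_ba_fraction: dict mapping species to its fraction of stand basal area.
    """
    flux_per_ba = defaultdict(list)
    for species, dbh_cm, sap_flow_cm3_h in plants:
        # Step 1: normalize whole-plant sap flow by individual basal area.
        flux_per_ba[species].append(sap_flow_cm3_h / basal_area_cm2(dbh_cm))

    total_mm_h = 0.0
    for species, values in flux_per_ba.items():
        # Step 2: average the normalized flow for each species (cm3 cm-2 h-1).
        mean_flux = sum(values) / len(values)
        # Step 3: species basal area per unit ground area;
        # 1 m2 ha-1 equals 1 cm2 of basal area per m2 of ground.
        species_ba_cm2_m2 = stand_basal_area_m2_ha * species_ba_fraction[species]
        # 1 cm3 of water per m2 of ground equals 0.001 mm.
        total_mm_h += mean_flux * species_ba_cm2_m2 * 1e-3
    return total_mm_h


if __name__ == "__main__":
    # Two hypothetical trees of one species in a 30 m2 ha-1 monospecific stand.
    trees = [("Pinus sylvestris", 20.0, 400.0), ("Pinus sylvestris", 30.0, 900.0)]
    print(stand_transpiration_mm_h(trees, 30.0, {"Pinus sylvestris": 1.0}))
```

Averaging the basal-area-normalized flow per species, rather than summing raw flows, keeps the result independent of how many individuals happened to be instrumented.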
Stand-level transpiration estimates from a large number of SAPFLUXNET sites can contribute to improving our understanding of the role of forest transpiration in the context of stand water balance and its components at the ecosystem (e.g. Tor-ngern et al., 2018) and catchment levels (Oishi et al., 2010; Wilson et al., 2001). Importantly, SAPFLUXNET can contribute to a better understanding of the global controls on vegetation water use (Good et al., 2017), including the biological and climatic controls on evapotranspiration partitioning into transpiration and evaporation components (Schlesinger and Jasechko, 2014; Stoy et al., 2019). There is some overlap between the FLUXNET network and SAPFLUXNET (47 datasets from FLUXNET sites). Hence, transpiration from SAPFLUXNET can also be used as a "ground-truth" reference for transpiration estimates from remote sensing approaches (Talsma et al., 2018) and from eddy covariance data (Nelson et al., 2020). Extrapolating sap-flow-derived stand transpiration to large spatial scales can be challenging due to landscape-scale variation in forest structure (Ford et al., 2007) or topography (Hassler et al., 2018) and due to the low spatial representativeness of sap flow measurements (Mackay et al., 2010). A promising research avenue to help elucidate the role of vegetation in driving hydrological changes across environmental gradients (Vose et al., 2016) would be to combine species-specific stand transpiration data from SAPFLUXNET with stand structural and compositional data from forest inventories (e.g. sapwood area index; Benyon et al., 2015).
Understanding the patterns and mechanisms underlying species interactions with respect to water use within a community is necessary to predict tree species vulnerability to drought (Grossiord, 2020). Multispecies datasets from SAPFLUXNET (Table S3) can be used to assess competition for water resources amongst species, for example by identifying changes in seasonal water use across co-existing species and hence characterizing the spatiotemporal segregation of their hydrological niches (Silvertown et al., 2015). By providing a detailed seasonal quantification of tree water use, SAPFLUXNET could also complement isotope-based studies and contribute to interpreting the large diversity in root water uptake patterns observed worldwide (Barbeta and Peñuelas, 2017; Evaristo and McDonnell, 2017) and to explaining the different seasonal origin of root-absorbed water across species and environmental gradients (Allen et al., 2019).
Plant water fluxes and hydrodynamics are amongst the most uncertain components of ecosystem and terrestrial biosphere models (Fisher et al., 2018). These models are now incorporating hydraulic traits and processes in their transpiration regulation algorithms (Mencuccini et al., 2019), but multi-site assessments of these algorithms are usually performed against evapotranspiration from eddy flux data (Knauer et al., 2015; Matheny et al., 2014). Model validation against sap flow data has typically been carried out at only one (Kennedy et al., 2019; Williams et al., 2001) or a few (Buckley et al., 2012) sites. SAPFLUXNET can thus contribute to assessing the performance of models simulating transpiration of stands or species within stands (e.g. De Cáceres et al., 2021), for a large number of species and under diverse climatic conditions.
Figure 9. Potential for upscaling species-specific plant sap flow to stand-level sap flow using SAPFLUXNET datasets. Datasets are shown using an aggregated biome classification; "dry and tropical" includes "subtropical desert", "temperate grassland desert", "tropical forest savanna", and "tropical rain forest". Each panel shows the percentage of total stand basal area that is covered by sap flow measurements for each species in the dataset. Datasets are also coloured by the number of species present. Numbers on top of each bar depict the total number of plants for a given dataset. Empty bars show datasets for which sap flow data expressed at the plant level were not available.
Earth Syst. Sci. Data, 13, 2607-2649, 2021 https://doi.org/10.5194/essd-13-2607-2021

Limitations
Sap flow data processing differs within and amongst methods, because different algorithms, calibrations, or parameters involved in sap flow calculations may be applied. All of these choices contribute to methodological uncertainty (Looker et al., 2016; Peters et al., 2018), and this challenging methodological variability precludes the implementation of a complete, standardized data workflow from raw to processed data within SAPFLUXNET, as is done for eddy flux data (Vitale et al., 2020; Wutzler et al., 2018). Commercial software for sap flow data processing from multiple methods is available (e.g. http://www.sapflowtool.com/SapFlowToolSensors.html, last access: 8 June 2021), but it has not yet been widely adopted. Freely available data-processing software exists only for the HD method (Speckman et al., 2020; Ward et al., 2017). Open-source software also allows a seamless integration of different data processing approaches and the implementation of species-specific calibrations, which can contribute to obtaining more robust estimations of sap flow and facilitate replicability (Peters et al., 2021).
Sap flow measured with thermometric methods provides a precise estimate of the temporal dynamics of water flow through plants. However, the performance of these methods in measuring absolute flows is mixed. While some well-represented methods in SAPFLUXNET, such as CHP, yield accurate estimates (at least for moderate-to-high flows), the HD method, the most represented method by far, can significantly underestimate sap flow. Our suggested bias correction for uncalibrated HD data (cf. Sect. 3.7) can be applied, but, given the unexplained high variability (i.e. by species and wood traits) in the performance of sap flow calibrations, these corrections should be applied with caution.
SAPFLUXNET has been designed to store whole-plant sap flow data, and therefore sap flow measured at multiple points within an individual is not available in the database. Even though this spatial variation could be useful to describe detailed aspects of plant water transport (Nadezhdina et al., 2009), focusing on plant-level data greatly simplifies the data structure. Hence, SAPFLUXNET only includes data already upscaled to the plant level by the data contributors. The main details of how this upscaling process was done for each dataset are provided together with other plant metadata (Table A5), but these metadata show that within-plant variation in sap flow is often not considered (Table 2). For those datasets without radial integration of point measurements, we show how to implement a radial integration based on generic wood porosity types (cf. Sect. 3.7, Appendix B). The impact of not accounting for radial and circumferential variability when scaling single-point measurements of sap flow to the whole-plant level can be important (Merlin et al., 2020), but the estimation of sapwood area can also cause large errors if it is not accurately determined (Looker et al., 2016). SAPFLUXNET does not provide information on the method employed to quantify sapwood area (e.g. visual estimation with or without the application of dyes, indirect estimation through allometries at species or site levels) or on the accuracy of sapwood area data. This precludes uncertainty estimation at the individual level (Fig. B3). Future developments in the SAPFLUXNET data structure could include this information as metadata to better document the sensor-to-plant scaling process. Overall, this first global compilation of sap flow data will allow addressing uncertainties in sap flow upscaling in space and time in the same way that the development of FLUXNET stimulated the quantification and aggregation of uncertainties for eddy flux data (Richardson et al., 2012).
While SAPFLUXNET makes global sap flow data available for the first time, we note that spatial coverage is still sparse and some forested regions are underrepresented in the database (Fig. 2a). We note especially the relatively small number of datasets for boreal and tropical forests, two important biomes in terms of global water and carbon fluxes (Beer et al., 2010; Schlesinger and Jasechko, 2014). While many geographic gaps are caused by the absence of sap flow studies from such areas, some regions where sap flow studies have been conducted are still not represented in SAPFLUXNET. For example, the recent proliferation of Asian sap flow studies (Peters et al., 2018) has not yet translated into a high representativity of Asian datasets in SAPFLUXNET. Similarly, while the coverage of taxonomic and biometric diversity is unprecedented, SAPFLUXNET lacks data for extremely tall trees (Ambrose et al., 2010) and for other growth forms such as shrubs (Liu et al., 2011), lianas (Chen et al., 2015), and other nonwoody species (Lu et al., 2002).

Outlook
The public release of SAPFLUXNET has set the stage for the first generation of sap-flow-based data syntheses. The work on these syntheses will fuel new ideas and tools for future improvements of the database, for example new computing approaches for the processing and analysis of sap flow datasets. One example would be the development of robust imputation algorithms to gap-fill time series of sap flow and environmental data, which can take advantage of tools and datasets already developed by the ecosystem flux community (Moffat et al., 2007; Vuichard and Papale, 2015). The dissemination of SAPFLUXNET will encourage the use of machine learning algorithms, only occasionally used to analyse sap flow datasets so far (e.g. Whitley et al., 2013). These approaches can also be used to identify the relative importance of different hydrometeorological drivers of transpiration (W. L. , or to produce global transpiration maps, by combining SAPFLUXNET with other data. This upscaling of stand transpiration to large areas will also allow addressing broader questions at the regional and continental scale, such as the role of transpiration in moisture recycling (Staal et al., 2018).
The eventual success of this initiative, in terms of enabling data re-use and contributing to the understanding and modelling of tree water use at local to global scales, will likely encourage the sap flow community to contribute new datasets to future updates of the database. We expect that the development of open-source software for the processing of raw sap flow data (Peters et al., 2021; Speckman et al., 2020), its eventual widespread use by the sap flow community, and the adoption of standardized calibration practices will increase the quality and intercomparability of future sap flow datasets. These new datasets will hopefully expand the temporal, geographical, and ecological representativity of SAPFLUXNET when new data contribution periods can be opened in the future.

Data availability, access, and feedback
In this paper we present SAPFLUXNET version 0.1.5 (Poyatos et al., 2020a), which contains some small metadata improvements on version 0.1.4, the first one to be made publicly available, in March 2020. Both versions supersede version 0.1.3, which was initially released to data contributors in March 2019. The entire database can be downloaded from its hosting web page in the Zenodo repository (https://doi.org/10.5281/zenodo.3971689, Poyatos et al., 2020a). In this repository, we provide the database as separate .csv files and as .RData objects; see Sect. 2.4 for details on data structure. Together with the initial publication of SAPFLUXNET in March 2019, we also released the sapfluxnetr R package, available on CRAN, to enable easy access, selection, temporal aggregation, and visualization of SAPFLUXNET data. Feedback on data quality issues can be forwarded to the SAPFLUXNET initiative email address: sapfluxnet@creaf.uab.cat. All the information about SAPFLUXNET, including the publication of new calls for data contribution, can be found on the project website: http://sapfluxnet.creaf.cat/ (last access: 8 June 2021).

Code availability
The code to reproduce the figures in this paper is available in the following Zenodo repository: https://doi.org/10.5281/zenodo.4727825.

Conclusions
The SAPFLUXNET database provides the first global perspective of water use by individual plants at multiple timescales, with important applications in multiple fields, ranging from plant ecophysiology to Earth system science. This database has been built from community-contributed datasets and is complemented with a software package to facilitate data access. Both the database and the software have been implemented following open science practices, ensuring public access and reproducibility. Data sharing has been a key component of the success of the FLUXNET network of ecosystem fluxes (Bond-Lamberty, 2018), and many databases in plant and ecosystem ecology now offer open data (Bond-Lamberty and Thomson, 2010; Falster et al., 2015; Gallagher et al., 2020; Kattge et al., 2020). SAPFLUXNET fully aligns with this philosophy. We expect that this initial data infrastructure will promote data sharing amongst the sap flow community in the future (Dai et al., 2018) and will allow the continued growth of the SAPFLUXNET database.
R. Poyatos et al.: Global transpiration data from sap flow measurements: the SAPFLUXNET database
Table A1. SAPFLUXNET dataset codes and DOIs (digital object identifiers) of the publications associated with each dataset. Where no DOI was available, the bibliographic reference is shown. Some datasets may have no associated publication ("unpublished").

Appendix B: Uncertainty estimation in sap flow measurements in the SAPFLUXNET database
Here we will show examples of uncertainty estimation for sap flow data in the SAPFLUXNET database. We will address three main sources of uncertainty which affect plant-level estimates of sap flow: (i) methodological uncertainty, (ii) sapwood area uncertainty, and (iii) radial integration uncertainty.
Methodological uncertainty was estimated using the data in the global meta-analysis of sap flow calibrations by Flo et al. (2019) as published in Flo et al. (2021). This estimation can be applied for the main sap flow density methods. We predicted the standard error (SE) of sub-daily sap flow density by fitting, for each method, linear mixed models of reference flow (i.e. using a gravimetric method or others employed as reference standards in calibration studies) as a function of measured flow, including the individual calibration as a random intercept factor (Table B1, Fig. B1). This model shows that HPTM presents the highest uncertainty and that this method and CHP are the ones showing larger uncertainties at low flows, while HD and CHD show lower relative uncertainty at high sap flow density (Figs. B1, B2). We also show in Fig. B3a the effect of applying the bias correction factor for uncalibrated heat dissipation probes obtained from the meta-analysis by Flo et al. (2019).
Uncertainty in the determination of sapwood area can arise when allometric relationships are used to estimate sapwood area, because this area is then applied to upscale sap flow density values to the whole plant. This uncertainty can be accounted for if the original data employed to obtain the allometry are available. Using these data for one of the datasets in SAPFLUXNET (ESP_VAL_SOR), we first predicted sapwood area, together with the upper and lower bounds of its 68 % predictive interval (equivalent to 1 SE). Then, we estimated the corresponding mean sap flow and its 68 % uncertainty interval (Fig. B3a). In this case, methodological uncertainty was larger than that caused by sapwood area estimation (Fig. B3b). Total combined uncertainty (i.e. methodological and sapwood) was obtained by adding their squared values and then taking the square root, following error propagation theory (Fig. B3c).
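The combination of the two uncertainty sources described above (sum of squares followed by a square root) can be written as a short Python sketch. The function name and the example values are hypothetical, and the calculation assumes that the methodological and sapwood area errors are independent, which is the usual condition for adding uncertainties in quadrature.

```python
import math


def combined_uncertainty(methodological_se: float, sapwood_se: float) -> float:
    """Combine two independent standard errors in quadrature.

    Both inputs must be expressed in the same units (e.g. cm3 h-1).
    """
    if methodological_se < 0 or sapwood_se < 0:
        raise ValueError("standard errors must be non-negative")
    # Square each component, add them, and take the square root.
    return math.sqrt(methodological_se ** 2 + sapwood_se ** 2)


if __name__ == "__main__":
    # Hypothetical values, not taken from Fig. B3.
    print(combined_uncertainty(400.0, 150.0))
```

Because the components are squared before being added, the larger source dominates the total: in the hypothetical example, the combined value is only slightly larger than the methodological component alone.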
In this example tree, total uncertainty for instantaneous values is around 400-500 cm³ h⁻¹, resulting in a high relative uncertainty for low flows but low relative uncertainty for higher flows, reaching 13 % at peak flows on 6 June (Fig. B3c). When expressed as daily means, this uncertainty will be reduced, as temporal averaging decreases the uncertainty by a factor equal to the inverse of the square root of the number of observations within a day (Richardson et al., 2012). In the same example (Fig. B3c), a day with high daily mean flow will also show lower relative uncertainty (6 June, 1589 ± 45 cm³ h⁻¹, 3 %) compared to one with lower daily mean flow (30 May, 237 ± 45 cm³ h⁻¹, 19 %).
Finally, when no information on the variation of sap flow along the sapwood is available, radial integration of point measurements of sap flow density and associated uncertainty can be obtained by applying generic radial profiles according to wood porosity (Berdanier et al., 2016) as implemented in the R package "sapflux" (https://github.com/berdaniera/sapflux, last access: 8 June 2021). An example application of this procedure shows how different uncertainty bounds can be obtained depending on wood anatomy (Fig. B4). In addition, this application shows how assuming a uniform radial profile for ring-porous or diffuse-porous species can lead to substantial overestimation of whole-plant sap flow, compared to a lower impact for tracheid-bearing species.
Table B1. Fixed-effect coefficients from the linear mixed models fitting reference sap flow density as a function of measured sap flow density using the data from the global sap flow calibration meta-analysis by Flo et al. (2019). Models for each method included the individual calibration as a random intercept. Sap flow methods are ranked according to their presence in the SAPFLUXNET database, from most to least present: HD (heat dissipation), CHP (compensation heat pulse), HR (heat ratio), HPTM (heat pulse T-max), CHD (cyclic heat dissipation), and HFD (heat field deformation).
Figure B1. Methodological uncertainty estimation in sap flow density measurements based on the data from the global meta-analysis of sap flow calibrations in Flo et al. (2019). The main panel shows predicted standard error based on method-specific linear mixed models of reference flow as a function of measured flow including the individual calibration as a random intercept factor. The span of the horizontal lines below the main panel corresponds to the maximum sap flow density in SAPFLUXNET (estimated as the 99 % quantile of sub-daily measurements) for that specific method.
Figure B3. An example of sap flow uncertainty estimation and bias correction for a Pinus sylvestris tree (ESP_VAL_SOR_Js_Ps_12) measured using heat dissipation sensors. Panel (a) shows sap flow density HD measurements with and without the application of the bias correction reported in Flo et al. (2019), together with the corresponding uncertainty estimated from the model in Fig. B1. Panel (b) shows corrected sap flow data comparing methodological uncertainty with that derived from the 68 % predictive interval of sapwood area estimation. Panel (c) shows corrected sap flow data together with the combined methodological and sapwood uncertainty.
Figure B4. Effects of a generic radial integration on sap flow data originally supplied without any radial integration procedure. Radial integration and uncertainty estimation (blue bands show the 68 % prediction interval based on 100 bootstrap samples) were applied using the wood-type-specific radial profiles provided in Berdanier et al. (2016) for a ring-porous species (a), a tracheid-bearing species (b), and a diffuse-porous species (c), all belonging to the USA_UMB_CON dataset.
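The effect of temporal averaging on uncertainty described in this appendix (reduction by the inverse of the square root of the number of observations) can be illustrated with a minimal Python sketch. The function names are hypothetical, the number of observations per day used in the example is an assumption (the measurement time step of the example tree is not given here), and the scaling presumes independent errors between observations. The daily means and the ± 45 cm³ h⁻¹ uncertainty are the values reported above for 6 June and 30 May.

```python
import math


def daily_mean_uncertainty(instantaneous_se: float, n_obs: int) -> float:
    """Uncertainty of a daily mean from the uncertainty of single observations.

    Assumes independent errors, so the standard error shrinks by 1/sqrt(n_obs).
    """
    if n_obs < 1:
        raise ValueError("n_obs must be at least 1")
    return instantaneous_se / math.sqrt(n_obs)


def relative_uncertainty_percent(mean_flow: float, se: float) -> float:
    """Uncertainty expressed as a percentage of the mean flow."""
    return 100.0 * se / mean_flow


if __name__ == "__main__":
    # Hypothetical case: 450 cm3 h-1 per observation, 100 observations per day.
    print(daily_mean_uncertainty(450.0, 100))
    # Daily means reported in the text for the example tree.
    print(round(relative_uncertainty_percent(1589.0, 45.0)))  # 6 June
    print(round(relative_uncertainty_percent(237.0, 45.0)))   # 30 May
```

The same absolute daily uncertainty therefore translates into very different relative uncertainties depending on the daily mean flow.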