the Creative Commons Attribution 4.0 License.
A multiyear eddy covariance and meteorological dataset from five pairs of agroforestry systems with open cropland or grassland in Northern Germany
José Ángel Callejas-Rodelas
Justus van Ramshorst
Alexander Knohl
Lukas Siebicke
Dietmar Fellert
Marek Peksa
Dirk Böttger
Christian Markwitz
Agroforestry systems are considered suitable nature-based solutions to mitigate climate change. Long-term measurements of CO2 flux densities, evapotranspiration and sensible heat flux densities are, however, still largely missing. Here we present a unique eddy covariance and meteorological dataset from a total of ten stations paired over agroforestry and open cropland or grassland agricultural sites located in Northern Germany. The data were harmonized to create a consistent dataset which includes gap-filled time series of meteorological and lower-cost eddy covariance measurements with identical instrumentation, accounting for a total of seventy-eight site-years of data. The objective of this dataset is to provide observational data on the differences in meteorological conditions and in carbon, water and energy balances between adjacent agroforestry and open cropland or grassland sites in five distinct regions of Germany. This extensive, continuous dataset can be used to study ecosystem properties and the potential benefits of agroforestry. It can also be used to parametrize models on crop and biomass productivity, or to evaluate the response of such agroecosystems to climate change scenarios, among other applications. Anticipated key users of this dataset are researchers in the fields of micrometeorology, eddy covariance, agronomy, and ecosystem modeling. This dataset can be accessed through https://doi.org/10.25625/A2Z8T8 (Callejas Rodelas et al., 2025b).
The conversion of conventional agriculture, represented by open cropland (OC) or open grassland (OG) systems, to agroforestry (AF) systems is an example of a nature-based solution with the potential to positively impact climate change mitigation (Cardinael et al., 2021; Chapman et al., 2020; Kay et al., 2019). AF encompasses any type of agricultural system in which trees and crops are cultivated concurrently, with the objective of benefiting from the presence of the trees in the agricultural land while maintaining agricultural production of food, timber or other products (Nair, 1985). AF systems have traditionally been an important component of many agricultural landscapes across the globe. These systems encompass a wide array of practices adapted to the requirements of the territories, climate, culture, society or economy (Pancholi et al., 2023). Intensive research has been conducted on AF systems to study their environmental benefits (for reviews on this topic, see Satish et al., 2024; Pancholi et al., 2023; Singh et al., 2021). Owing to their demonstrated superior performance in ecosystem functions and services (Kay et al., 2019), interest has grown in recent years in stimulating the transition from conventional agriculture to AF in numerous regions globally (Gupta et al., 2020).
AF systems have the potential to increase carbon sequestration, in comparison to conventional agriculture, both within the soil (Cardinael et al., 2015, 2017) and via biomass growth (Peichl et al., 2006). According to De Stefano and Jacobson (2017), soil organic carbon (SOC) stocks increased at several sites that had transitioned from agricultural to agroforestry land use. Moreover, AF systems can also increase water use efficiency (WUE) (Ong et al., 2002), as ecosystem-scale evapotranspiration (ET) is expected to stay equal (Markwitz et al., 2020a), even under increased CO2 uptake. WUE is defined as the amount of carbon assimilated as biomass or grain produced per unit of water used by the crop (Hatfield and Dold, 2019), which is equivalent to Gross Primary Production (GPP) divided by evapotranspiration (WUE = GPP/ET).
Short Rotation Alley Cropping (SRAC) is a type of AF in which trees and crops are cultivated in parallel (Markwitz and Siebicke, 2019). There is an increasing interest in such agricultural systems in regions like central Europe (Quinkenstein et al., 2009). Crops rotate in seasonal cycles, while trees rotate in periods of typically 3 to 6 years and are used for biomass production, e.g. for bioenergy (Böhm et al., 2014). Therefore, these systems can provide a renewable source of energy while still maintaining a relatively large crop yield (Veldkamp et al., 2023). To thoroughly assess the impact of such land use conversion, further data are required that quantify ecosystem-atmosphere interactions at the system scale, in order to understand how AF impacts the surface energy balance, carbon sequestration, evapotranspiration, and WUE.
The eddy covariance (EC) technique has become the main tool for the study of land-atmosphere interactions at the ecosystem scale (Baldocchi, 2014). EC allows the quantification of ecosystem-scale exchanges of energy, trace gases and momentum between ecosystems and the atmosphere. However, there are inherent challenges in its application, including the costs of installation and maintenance (Hill et al., 2017), data storage and management (Aubinet et al., 2012), and data processing with complex data pipelines (Sabbatini et al., 2018; Pastorello et al., 2020). This has led to a lack of direct measurements with the EC technique over certain types of ecosystems in recent decades (Schimel et al., 2015), including AF systems and, specifically, SRAC systems (Markwitz and Siebicke, 2019).
Most of the research on carbon, water and energy balances of AF systems has been performed in tropical areas (Cardinael et al., 2015), where AF systems have a higher socio-economic and environmental significance (Chapman et al., 2020), and not in temperate regions. For illustrative purposes, the reader is referred to e.g. Chinchilla-Soto et al. (2021) or Gómez-Delgado et al. (2011). Furthermore, the studies conducted in the temperate zone have mainly focused on evaluating the potential carbon sequestration of an AF system by studying soil composition and SOC (e.g. Kanzler et al., 2021; Cardinael et al., 2015; Howlett et al., 2011); by combining in situ measurements of SOC, C leaching and respiration with models to assess ecosystem-level C pools (e.g. Peichl et al., 2006) or by assessing aboveground biomass in AF systems using remote sensing (Sprenkle-Hyppolite et al., 2024; Zomer et al., 2016). However, none of these studies directly measured ecosystem-scale CO2 or H2O exchanges via the EC technique. To the best of our knowledge, the study of Ward et al. (2012) was the only one that quantified carbon and water balances of an alley cropping AF using EC, in a Mediterranean climate in Australia.
As part of the SIGNAL project (https://agroforst-info.de/signal/, last access: 29 January 2026), the study by Markwitz et al. (2020a) quantified ecosystem-scale energy and water flux densities and the influence of SRAC on evapotranspiration using the EC technique. Two studies have evaluated the lower-cost EC setups used for this dataset (Callejas-Rodelas et al., 2024; van Ramshorst et al., 2024). Another study was published, focusing on carbon exchanges over some of the studied sites for the period 2019–2021 (van Ramshorst et al., 2025). Furthermore, a recent study investigated the spatial variability of carbon and water flux densities across one of the SRAC systems involved in the project (Callejas-Rodelas et al., 2025a).
This study presents multiple years of data concerning land-atmosphere exchange processes, as measured during the SIGNAL project. The dataset covers meteorological data, EC flux density time series of CO2, H2O, sensible heat (H), latent heat (LE) and momentum, some of the main measured turbulent flow variables (standard deviation of vertical wind velocity, SIGMA_W, friction velocity, USTAR, Obukhov length, MO_LENGTH), and footprint climatology. We provide a unique, harmonized dataset, incorporating the most significant variables recorded at the paired OC or OG and SRAC sites. The datasets can be used for different purposes, with the main targets being the micrometeorological and the modeling scientific communities. Good quality-controlled long-term data above SRAC and OC/OG sites can be a highly valuable input for models aiming to predict crop and land-use change responses under different climate change scenarios. Furthermore, these data, in conjunction with other datasets that compare the ecosystem functioning between AF and OC/OG, can be used by stakeholders and decision-makers in the development of agricultural policy.
2.1 Site description
The study sites were located in Northern Germany in five different regions (Fig. 1): Dornburg (51.015° N, 11.64° E, Thuringia), Forst (51.79° N, 14.63° E, Brandenburg), Mariensee (52.565° N, 9.464° E, Lower Saxony), Vechta (52.759° N, 8.549° E, Lower Saxony) and Wendhausen (52.33° N, 10.632° E, Lower Saxony). The mean annual temperature and precipitation for the reference period (1981–2010) for all regions are shown in Table 1, with values extracted from the German Weather Service data portal (DWD, 2025). The DWD stations closest to the sites, with their IDs in parentheses, are Jena (2444), Cottbus (880), Hanover (2014), Grossenkneten (44) and Brunswick Airport (662), respectively. For simplicity, the SRAC systems will henceforth be referred to as AF.
Table 1. Site location, climatological averages and standard deviations of annual air temperature (TA) and annual sum of precipitation (P) for the reference period 1981–2010 (DWD, 2025), climate zone according to the Köppen-Geiger classification (extracted from Beck et al., 2018), elevation above sea level and soil characteristics, for all the project sites. Soil type and organic carbon content are based on Shao et al. (2025), Veldkamp et al. (2023) and Schmidt et al. (2021), while soil bulk density is based on Shao et al. (2025) and Markwitz et al. (2020a).
Figure 1. Map of site location within Germany. The blue dot represents the grassland agroforestry site (Mariensee), while the red dots represent the cropland agroforestry sites (Dornburg, Forst, Vechta and Wendhausen).
Each study region was divided into an AF and an OC or OG site. AF consisted of tree strips of fast-growing species, which were interspersed with crops, for Dornburg, Forst, Vechta and Wendhausen; or with perennial grassland, for Mariensee. Trees were poplar (Populus nigra × Populus maximowiczii) in Dornburg, Vechta and Wendhausen, while they were willow (Salix schwerinii × S. viminalis) in Mariensee, and both poplar and black locust (Robinia pseudoacacia) in Forst. The management of crops or grass at both OC or OG and AF was similar. Crops were subject to a yearly rotation scheme, while trees were typically harvested in cycles of three to four years. Grass was cut twice per year in Mariensee. During the project, tree harvest took place in February 2021 for Forst, Mariensee and Wendhausen. No tree harvest took place in Vechta or in Dornburg. However, in Dornburg a partial cut of 15 m on each side of the EC station took place in winter 2018/19 to mitigate the potential impact of tall trees on the EC measurements. Table A1 in Appendix A shows the crops, sowing and harvest dates, and the harvest dates of trees, for all the sites from 2016 to 2024. Further information on the management can be found in the Supplement of Veldkamp et al. (2023).
Two EC stations were installed in each region, one at the AF site and one at the OC/OG site. At Wendhausen, two additional stations were installed at the AF from August 2022 to September 2024. A detailed analysis of the distributed network of three stations above the AF is presented in Callejas-Rodelas et al. (2025a). The present article focuses exclusively on the long-term dataset corresponding to the original paired AF and OC stations in Wendhausen, maintaining a similar structure to the other sites. The AF sites in Dornburg, Forst and Wendhausen, as well as the OC sites, were large in comparison to the stations' footprint area (Fig. 2a, b and e). In Mariensee, the AF site was small in comparison to the footprint area (Fig. 2c). In Vechta, the footprint area covered by the AF station was small because of the tall surrounding trees located in the western and southern sectors (Fig. 2d). The AF stations were 10 m tall and were located within the tree strips, except the station at Vechta AF, which was 5 m tall and was located in the crop field (Fig. 2). The OC/OG stations were 3 m (Forst and Mariensee), 3.5 m (Dornburg and Wendhausen) or 5 m tall (Vechta) (Table 6) and were located approximately at the center of the OC/OG sites. The station at Dornburg AF was relocated 10 m in NW direction at the beginning of the second phase of the project in August 2019.
Figure 2. Satellite view of the sites, including the distribution of the agroforestry (AF, pink dotted line) and the open cropland or grassland (OC/OG, black dotted line) areas, the location of the eddy covariance stations (dark blue diamond for AF and orange circle for OC/OG), and the footprint climatology derived for the whole measurement period, for AF (dark blue solid line) and OC/OG (orange solid line), for all sites (a – Dornburg, b – Forst, c – Mariensee, d – Vechta, e – Wendhausen). The footprint climatology displayed in the maps corresponds to the 80 % contribution to the footprint. The footprint was calculated using the model of Kljun et al. (2015) in its Python version. Figure created with QGIS v. 3.22.11. Imagery Google Satellite Maps, © Google Earth 2025.
The measurement period varied by region and variable (Table A2). At Dornburg, Forst, and Wendhausen, EC and meteorological measurements were performed from spring 2016 to autumn 2024, with an interruption from January 2018 to mid-2019. At Mariensee, measurements covered the period from spring 2016 to the end of 2021, with a similar long gap from January 2018 to mid-2019. At Vechta, measurements covered summer 2019 to autumn 2024. Site-years were counted over all calendar years covered by the measurements, from approximately mid-2016 (all regions except Vechta) or mid-2019 (Vechta) until the end of 2021 (Mariensee) or the end of 2024 (all the other regions).
2.2 Meteorological measurements
The meteorological and turbulence variables (Table 2) measured at the sites are named in accordance with FLUXNET standards (https://fluxnet.org/data/aboutdata/data-variables/, last access: 23 January 2025). All the variables were collected at both the AF and OC/OG stations using a similar setup. Atmospheric pressure (PA) and photosynthetically active radiation (PPFD_IN) were measured only at the AF sites. Air temperature (TA) and relative humidity (RH) were measured at a height of 2 m at all stations. Shortwave incoming radiation (SW_IN), shortwave reflected radiation (SW_OUT), long-wave outgoing radiation (LW_OUT) and net radiation (NETRAD) were measured at a height of 0.5 m below the top of the station (see height in the site description section). SW_OUT and LW_OUT measurements were only available from 2019 onwards. At Vechta OC, no LW_OUT data were available. At Wendhausen OC, the SW_IN sensor failed from December 2023 until the end of the project. Atmospheric pressure (PA) and precipitation (P) were measured at a height of 1 m. P at the OC/OG was always taken as the reference because it suffered less from interception, which at the AF was caused by the trees. Soil heat flux (G) was measured at all the stations using one soil heat flux plate before 2019 and two plates from 2019 (Table 2) inserted at a depth of 5 cm and randomly distributed. PPFD_IN measurements were available from early 2022 in Forst, Dornburg and Vechta, from December 2022 in Wendhausen and not available in Mariensee. PPFD_IN was measured at 9.5 m height at the AF sites, next to the other radiation sensors. Data were recorded at a 10 s time step and stored in 30 min files on CR1000X dataloggers (Campbell Scientific, Inc. Logan, UT, USA).
Table 2. Meteorological and turbulence variables measured at the stations and instrumental information. Variable names follow the FLUXNET standards (https://fluxnet.org/data/aboutdata/data-variables/, last access: 23 January 2025). The information in this table corresponds to the latest measuring setups. Not all variables were available during the whole project duration. Specifically, the lower-cost eddy covariance setups to measure CO2 molar density and RH were installed in 2019. From 2016 to 2017, a different setup was used, as described in more detail in Sect. 2.3 (originally in Markwitz and Siebicke, 2019).
2.2.1 Processing of meteorological data
Meteorological measurements were filtered to remove outliers, based on a plausibility range for all variables, and resampled to a time resolution of 30 min to match the time resolution of the EC time series. All the measurements were averaged from 10 s to 30 min resolution, except precipitation, which was summed up over the 30 min period. For all variables, if less than 25 % of the 30 min period values were available, the average or sum corresponding to that variable and that period was marked as a not-a-number (NaN) value. Afterwards, saturated water vapor pressure (esat) and actual water vapor pressure (e) were obtained following the Magnus-Tetens formulation (Eqs. 2 and 3 in Vuichard and Papale, 2015; based on Murray, 1967). Vapor pressure deficit (VPD) was calculated as the difference between esat and e.
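The aggregation rule and the vapor pressure calculation described above can be illustrated with a minimal sketch. It is not the authors' processing code: the function names are hypothetical, missing samples are represented as `None`, and the Magnus-Tetens coefficients (0.6108 kPa, 17.27, 237.3 °C) are the commonly quoted values, assumed here rather than taken from the dataset documentation.

```python
import math

def aggregate_30min(values, how="mean", min_fraction=0.25):
    """Aggregate the 10 s samples of one 30 min window (None = missing sample)."""
    valid = [v for v in values if v is not None]
    # Mark the window as NaN when fewer than 25 % of the samples are available.
    if not values or len(valid) / len(values) < min_fraction:
        return float("nan")
    total = sum(valid)
    # Precipitation is summed; all other variables are averaged.
    return total if how == "sum" else total / len(valid)

def saturation_vapor_pressure_kpa(ta_celsius):
    """Magnus-Tetens form with commonly quoted coefficients (assumed)."""
    return 0.6108 * math.exp(17.27 * ta_celsius / (ta_celsius + 237.3))

def vpd_kpa(ta_celsius, rh_percent):
    """VPD as the difference between saturated (esat) and actual (e) vapor pressure."""
    esat = saturation_vapor_pressure_kpa(ta_celsius)
    e = esat * rh_percent / 100.0
    return esat - e
```

For example, a window of 180 samples with only 40 valid values falls below the 25 % threshold and is returned as NaN, while `vpd_kpa(20.0, 50.0)` gives half of the saturation vapor pressure at 20 °C.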
2.2.2 Gap-filling of meteorological data
Gaps in the time series of meteorological variables were filled following a four-step routine. First, very short gaps of up to one hour were filled using linear interpolation. This was applied to all variables except P. Second, if there were gaps at one station within a region, but the data were available at the paired station, then gaps were filled with linear regression between the two stations. This was applied to all variables except P, for which each missing value was directly replaced by the value measured at the paired station. In the case of PA and PPFD_IN, the whole time series were inserted into the OC/OG stations datasets, because these variables were only measured at the AF. Furthermore, at Vechta the missing LW_OUT values at the OC station were filled with the measurements at the AF, because the land cover around both AF and OC stations was similar. At Wendhausen, the missing SW_IN data from December 2023 until September 2024 were replaced with the measurements at the AF station.
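The first two steps can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, missing half-hours are represented as `None`, and a gap of "up to one hour" is assumed to correspond to at most two consecutive 30 min values.

```python
def fill_short_gaps(series, max_gap=2):
    """Step one: linearly interpolate runs of up to max_gap missing half-hours."""
    filled = list(series)
    i, n = 0, len(series)
    while i < n:
        if series[i] is not None:
            i += 1
            continue
        start = i
        while i < n and series[i] is None:
            i += 1
        gap = i - start
        # Only interior gaps that are short enough are interpolated.
        if gap <= max_gap and start > 0 and i < n:
            left, right = series[start - 1], series[i]
            for k in range(gap):
                filled[start + k] = left + (right - left) * (k + 1) / (gap + 1)
    return filled

def fit_line(x, y):
    """Ordinary least-squares slope and intercept from jointly valid samples."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    slope = sxy / sxx
    return slope, my - slope * mx

def fill_from_pair(target, partner):
    """Step two: fill gaps in target with a regression on the paired station."""
    slope, intercept = fit_line(partner, target)
    return [intercept + slope * p if t is None and p is not None else t
            for t, p in zip(target, partner)]
```

Gaps longer than `max_gap` are left untouched by `fill_short_gaps`, so that they remain available for the later steps of the routine.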
Third, longer gaps were filled with a similar approach to that employed in FLUXNET (Pastorello et al., 2020; Vuichard and Papale, 2015), with slight differences. ERA5-Land re-analysis data (Muñoz-Sabater et al., 2021) were used as predictor for the missing data, instead of ERA5 re-analysis data (Hersbach et al., 2020) as employed in the FLUXNET central processing pipeline (Pastorello et al., 2020), due to their enhanced spatial resolution of 9 km. The data were requested using the API of the Climate Data Store from Copernicus (https://cds.climate.copernicus.eu/datasets, last access: 15 November 2024) with the Python library cdsapi (https://pypi.org/project/cdsapi/, last access: 15 November 2024). The ERA5-Land time series of 2 m TA, 2 m dew-point temperature (TAd), SW_DOWN, horizontal wind speed South–North direction (WS_N), horizontal wind speed West–East direction (WS_E) and PA were first resampled from the original 1 h time step to 30 min time resolution, using linear interpolation. P was not interpolated. WS and WD were calculated by applying basic trigonometry to both horizontal wind speed components. Finally, RH and VPD were calculated from 2 m TA and TAd, using the same formulation as explained above in Sect. 2.2.1, and the relations between RH and VPD.
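The derivation of WS, WD, RH and VPD from the re-analysis variables can be sketched as below. The function names are hypothetical, WD is assumed to follow the meteorological convention (direction the wind blows from, in degrees clockwise from north), and the Magnus-Tetens coefficients are the commonly quoted values rather than values confirmed by the source.

```python
import math

def wind_speed_direction(ws_e, ws_n):
    """WS and WD from the West-East (ws_e) and South-North (ws_n) components."""
    ws = math.hypot(ws_e, ws_n)
    # Assumed convention: direction the wind blows FROM, clockwise from north.
    wd = (math.degrees(math.atan2(ws_e, ws_n)) + 180.0) % 360.0
    return ws, wd

def esat_kpa(t_celsius):
    """Magnus-Tetens saturation vapor pressure (assumed coefficients)."""
    return 0.6108 * math.exp(17.27 * t_celsius / (t_celsius + 237.3))

def rh_and_vpd(ta_celsius, tad_celsius):
    """RH (%) and VPD (kPa) from air temperature and dew-point temperature."""
    esat = esat_kpa(ta_celsius)
    e = esat_kpa(tad_celsius)  # actual vapor pressure = saturation value at dew point
    return 100.0 * e / esat, esat - e
```

With this convention, a purely eastward flow (`ws_e > 0`, `ws_n = 0`) yields a wind direction of 270°, i.e. a westerly wind.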
Gaps in SW_IN, PA, TA, RH, VPD, wind speed (WS) and wind direction (WD) were filled using linear regressions between the ERA5-Land data and the station data. Despite their high degree of variability at the local scale, WS and WD were also filled by this procedure because it is difficult to find a set of predictor variables to perform a more sophisticated type of regression (see next paragraph). In the case of WS, WD and SW_IN, the intercept of the linear models was forced to be zero, to exclude the possibility of having negative data as a model result (e.g. a negative WD or a negative SW_IN), which would not be physically consistent. In the case of the other variables, both slope and intercept were used to generate the new data. Due to its different nature, P was gap-filled at the hourly scale by multiplying ERA5-Land data by a factor calculated as the ratio of the total sum of P measured at the station and the total sum of P from ERA5-Land, for the target period (Vuichard and Papale, 2015). Finally, PPFD_IN was filled by multiplying SW_IN by a factor obtained as the ratio between measured PPFD_IN and SW_IN.
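The zero-intercept regression and the precipitation scaling described above can be sketched in a few lines. The function names are hypothetical and missing station values are represented as `None`; the sketch assumes that the scaling factor is computed only from records where the station measured P.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = a * x with the intercept forced to zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def precipitation_factor(p_station, p_era):
    """Ratio of total station P to total re-analysis P over the measured records."""
    pairs = [(s, e) for s, e in zip(p_station, p_era) if s is not None]
    return sum(s for s, _ in pairs) / sum(e for _, e in pairs)

def fill_precipitation(p_station, p_era):
    """Replace missing station P by re-analysis P scaled with the ratio of sums."""
    factor = precipitation_factor(p_station, p_era)
    return [factor * e if s is None else s for s, e in zip(p_station, p_era)]
```

The same ratio-of-sums idea applies to PPFD_IN, with SW_IN taking the role of the re-analysis series.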
The fourth step consisted of a different approach to fill the variables that vary more at the local scale and may not correlate well in general with ERA5 or ERA5-Land re-analysis data. These variables were: NETRAD, LW_OUT, SW_OUT and G. This approach used Extreme Gradient Boosting (XGBoost), a machine learning tool belonging to the family of decision-tree algorithms (Chen and Guestrin, 2016). This algorithm was also applied for gap-filling flux densities (see Sect. 2.3.2). The code was adapted from the original version published by Vekuri et al. (2023) to gap-fill meteorological data. A more detailed description of the algorithm can be found below, in Sect. 2.3.2. The predictors used to gap-fill NETRAD, LW_OUT, SW_OUT and G were SW_IN, TA, VPD and PA, all of them already gap-filled with the linear regressions in the previous step; and NETRAD from ERA5-Land data. XGBoost inherently evaluates the relevance of the different predictors for modeling the target variable. Consequently, the incorporation of variables such as VPD or PA to the set of predictors for, e.g. radiation variables, does not introduce bias in the results (see following Sect. 2.3.2 for more information).
The gap-filling of completely missing variables (see Table 3), such as PPFD_IN before 2022 or LW_OUT before 2019, was done by directly applying the steps three and four outlined previously. Therefore the provided time series of these variables are completely modeled before those dates. The accuracy of the gap-filling was considered sufficient due to the good representation of ERA5-Land data of the meteorological conditions at the sites (Fig. B1 in Appendix B).
Table 3. Percentages of measured (_m), post-filtered (_af) and filled (_gf) data for some of the main variables. No column exists for TAU_af or TAU_gf because no filters or gap-filling were applied to TAU. Measured data for flux densities include all data that could be calculated from raw data, prior to filtering. Filtered data correspond to data that passed the quality checks. Filled data for flux densities refer to the amount of data that were filled. These values were calculated as the averages across all time series originating from the different USTAR filters that were applied (see Sect. 2.3.1). Measured data for meteorological variables refer to all available data after de-spiking the initial raw data (see Sect. 2.2.1). Filled meteorological data correspond to the remaining fraction up to 100 %.
A flag was generated for the meteorological data, named with the suffix “_QC_GF”. The flag was 0 for measured data, 1 for interpolated data for short gaps, 2 for data filled using the nearby station, and 3 if ERA5-Land data were used as predictors.
2.2.3 Implementation of XGBoost
The XGBoost algorithm and other associated functions for the gap-filling were implemented using the Python libraries xgboost (Chen and Guestrin, 2016) and scikit-learn (Pedregosa et al., 2012). The parameters used in the implementation of XGBoost were kept similar to Vekuri et al. (2023). The time information, directly linked to seasonality in phenology and both seasonality and diurnal variability in meteorological variables, was included as in the original code of Vekuri et al. (2023), by adding two cyclical functions for month and time of the day and an additional linear description of time to XGBoost. This was valid for the gap-filling of both meteorological and flux density data (see Sect. 2.3). The model evaluation was done by calculating the root mean squared error (RMSE) between modeled and measured data, after fitting the model to the training dataset, prior to predicting all missing data. The original code can be found in the repository linked to the publication of Vekuri et al. (2023) (https://github.com/hvekuri/co2_gapfilling, last access: 12 February 2025). The modified version of the code, adapted to the current datasets, can be found in the code repository linked to this publication.
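One possible way to build the time predictors described above is sketched below. This is an assumption-laden illustration, not the code of Vekuri et al. (2023): the function name and the feature names are hypothetical, each cyclical variable is encoded here as a sine and cosine pair, and the linear description of time is expressed as days elapsed since the start of the series.

```python
import math
from datetime import datetime

def time_features(timestamp, start):
    """Cyclical encodings of month and time of day, plus a linear time axis."""
    month_angle = 2.0 * math.pi * (timestamp.month - 1) / 12.0
    hour = timestamp.hour + timestamp.minute / 60.0
    hour_angle = 2.0 * math.pi * hour / 24.0
    return {
        "month_sin": math.sin(month_angle),
        "month_cos": math.cos(month_angle),
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        # Linear description of time: days elapsed since the series start.
        "time_linear": (timestamp - start).total_seconds() / 86400.0,
    }

features = time_features(datetime(2020, 1, 1, 6, 0), datetime(2020, 1, 1))
```

The cyclical encoding keeps December adjacent to January and 23:30 adjacent to 00:00, which a plain integer month or hour would not. Such features would then be appended to the meteorological predictors before fitting the regressor.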
2.2.4 Evaluation of gap-filling of meteorological data
The evaluation of the gap-filling was performed by first splitting the datasets into training (80 %) and test (20 %) datasets. The root mean squared error (RMSE) was calculated on the test dataset after training the model and prior to predicting all missing data (Table B1 in Appendix B), for the variables modeled using linear regressions as well as those modeled using XGBoost. As additional justification of the gap-filling method, linear models between measured and ERA5-Land data were calculated before training the models for gap-filling (Table B2). In general, ERA5-Land and measured data correlated well, which allowed us to consider that the gap-filling using the linear models was appropriate. No evaluation was performed for P and PPFD_IN. A comment on the uncertainties related to gap-filling can be found in Sect. 4.2.2.
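The evaluation scheme can be sketched as follows. The sketch assumes a random 80/20 split with a fixed seed (the source does not state how the split was drawn), uses hypothetical function names, and replaces the actual regression or XGBoost model with a trivial baseline (the training mean) purely to show where the RMSE is computed.

```python
import math
import random

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def split_train_test(records, test_fraction=0.2, seed=0):
    """Random split into training and test subsets (random split is an assumption)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Synthetic (predictor, target) records standing in for station data.
records = [(float(x), 2.0 * x + 1.0) for x in range(100)]
train, test = split_train_test(records)

# Placeholder model: predict the training mean for every test record.
train_mean = sum(y for _, y in train) / len(train)
baseline_rmse = rmse([y for _, y in test], [train_mean] * len(test))
```

In practice, the placeholder would be replaced by the fitted model, and the RMSE on the held-out 20 % would be reported before the model is used to predict the missing data.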
2.3 Eddy covariance measurements
During the first phase of the project, covering the years 2016 to 2017, only three-dimensional wind speed and H2O concentrations (Table 2) were measured at all the stations, in order to calculate flux densities of latent heat (LE, W m−2), sensible heat (H, W m−2) and momentum (TAU, kg m−2 s−1). The LC-EC setup to measure H2O was based on a BME280 sensor (Bosch, Germany) as described in Markwitz and Siebicke (2019). For details on data processing, filtering, gap-filling and analysis of LE, the reader is referred to Markwitz et al. (2020a). The dataset corresponding to that publication can be found at https://doi.org/10.5281/zenodo.4038399 (Markwitz et al., 2020b). The following data processing and gap-filling sections refer to the eddy covariance measurements from 2019 until 2024.
Following the interruption of 2018, in 2019 the measurement setup changed. The measurements of three-dimensional wind speed continued, but the setup for H2O measurements was replaced by a setup measuring both CO2 and H2O concentrations (Table 2), in order to calculate additionally CO2 flux densities (FC, µmol m−2 s−1).
Three-dimensional wind speed measurements were taken with Metek uSONIC-3 Omni ultrasonic anemometers (METEK GmbH, Elmshorn, Germany). CO2 and H2O concentrations were measured with a lower-cost EC (LC-EC) setup, previously prepared at the University of Exeter (Hill et al., 2017). The LC-EC setups consisted of a CO2 infrared gas analyzer (IRGA) combined with an RH capacitance cell and sensors for temperature and pressure to derive H2O mole fraction from RH. These sensors were isolated within a custom-made enclosure. The air was sampled below the sonic anemometer and then directed to the enclosure using Synflex 1300 tubes (1300-M0603, Eaton Corporation, Dublin, Ireland) and across two stainless steel filters with a pore size of 2 µm (SS-4FW-2, Swagelok, Solon, Ohio, USA). The length of the tubes depended on the station height (9 m at the AF stations, 4.5 m in Vechta AF and OC, and around 2.5 m at the other OC/OG stations). Nominal flow rate was 2.2 L min−1 until summer 2022, when pumps were replaced by stronger ones to get a flow rate of 5.0 L min−1. Data from the EC setups were logged at a 2 Hz frequency by CR6 dataloggers (Campbell Scientific, Inc. Logan, UT, USA) in files of 30 min duration. Data from the ultrasonic anemometers were additionally logged at 20 Hz frequency in separate files. The flux density calculation was done using the 2 Hz dataset; the 20 Hz raw data can be made available by the authors upon request. The LC-EC setups were validated in Callejas-Rodelas et al. (2024) and van Ramshorst et al. (2024). The data from the validation are available in two different repositories which can be found in the corresponding publications. Additional details about the setups, including information about all the sensors of the enclosure and a scheme of the setup, can be found in Callejas-Rodelas et al. (2024).
Both meteorological and eddy covariance measurements were backed up and uploaded to a server using Ethernet connections and Raspberry Pi 2, model B processors (Raspberry Pi Foundation, Cambridge, UK). The stations were solar powered, with the installation comprising monocrystalline solar panels (model 3-01-001245, from Offgridtec GmbH, Eggenfelden, Germany), lithium batteries (type S12/130, code NGS0120130HS0CA, from Exide Technologies GmbH, Büdingen, Germany) and solar charge controllers (model PR 10–30, from KATEK Memmingen GmbH, Memmingen, Germany).
2.3.1 Processing of eddy covariance data
Raw turbulence data were processed using the EddyUH software (Mammarella et al., 2016) in its Matlab version (MATLAB, R2023a). Input meteorological data were TA, RH and PA, which were gap-filled using the nearby station (AF-OC/OG), following steps one and two explained in Sect. 2.2.2. FC, LE, H and TAU, among other relevant parameters, were calculated from the raw data of 2 Hz frequency, by applying the following processing routine. De-spiking of the data was performed by applying absolute difference limits between consecutive data points (Mammarella et al., 2016; Aubinet et al., 2012). Block-averaging for de-trending the data followed the approach explained in Rannik and Vesala (1999). Coordinate rotation for the three-dimensional wind field was carried out through the planar fit method described in Wilczak et al. (2001). Low-frequency losses due to block-averaging and finite period integration were corrected for according to Rannik and Vesala (1999). High-frequency losses due to the low-pass filtering effect of the instruments, especially pronounced due to the use of slow response sensors in the LC-EC setup (Callejas-Rodelas et al., 2024; van Ramshorst et al., 2024), were corrected with the experimental approach of Mammarella et al. (2009). A quality flag depending on the state of turbulence was assigned to all flux density time series, with integer values ranging from 1 to 9 following the approach of Foken et al. (2005). Random uncertainty was estimated according to Finkelstein and Sims (2001). Data were processed on a yearly basis, as recommended by the processing standards of ICOS and FLUXNET (Pastorello et al., 2020; Sabbatini et al., 2018), and in order to ensure better accuracy in the calculation of the transfer function for H2O. Exceptions were some periods shorter than one year, which were added to the processing of the preceding or following year.
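Two of the steps in this routine, de-spiking with an absolute difference limit and the block-averaged covariance that underlies every flux density, can be illustrated with a minimal sketch. It is not the EddyUH implementation: the function names are hypothetical, the spike replacement rule (repeat the last accepted value) is an assumption, and coordinate rotation, spectral corrections and unit conversions are omitted.

```python
def despike(series, max_step):
    """Replace points that differ by more than max_step from the last accepted value.

    The replacement rule is an assumption made for this illustration.
    """
    cleaned = [series[0]]
    for value in series[1:]:
        cleaned.append(value if abs(value - cleaned[-1]) <= max_step else cleaned[-1])
    return cleaned

def block_average_covariance(w, c):
    """Covariance of vertical wind w and scalar c after removing the block means."""
    n = len(w)
    w_mean = sum(w) / n
    c_mean = sum(c) / n
    return sum((wi - w_mean) * (ci - c_mean) for wi, ci in zip(w, c)) / n
```

Applied to one 30 min block of 2 Hz data (3600 samples), `block_average_covariance` returns the kinematic flux, which would then be corrected for low- and high-frequency losses as described above.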
More details on how the transfer functions were calculated can be found in the original paper (Mammarella et al., 2009), and further information on the processing routine for the LC-EC setups is given in Callejas-Rodelas et al. (2024). When the dependency of H2O time response with RH could not be estimated accurately, the coefficients of the fit corresponding to the previous year were employed.
To obtain net ecosystem exchange (NEE), the storage term of CO2 (SC) was calculated and added to FC: NEE = FC + SC. SC was calculated using the single-point measurements of CO2 concentration from the IRGA, since a vertical profile of the CO2 mole fraction was not available at any of the stations. SC was calculated as the difference between consecutive 30 min averages of the CO2 concentration (ct), multiplied by the air molar density (ρm) and the measurement height (zm) and divided by the integration time (Δt = 30 min), according to the following equation:
$$ SC = \rho_m \, z_m \, \frac{c_t - c_{t-1}}{\Delta t} $$
This is the procedure followed by FLUXNET (Pastorello et al., 2020) when a profile is non existent.
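The single-point storage term described above can be illustrated with a minimal sketch. It assumes a backward difference between consecutive 30 min means and an ideal-gas estimate of the molar density; the function names are hypothetical and are not taken from the processing code of the dataset.

```python
# Universal gas constant (J mol-1 K-1); a physical constant, not from the article.
R_GAS = 8.314462618


def molar_density(pa_kpa, ta_celsius):
    """Air molar density (mol m-3) from pressure (kPa) and temperature (deg C),
    assuming the ideal gas law."""
    return pa_kpa * 1000.0 / (R_GAS * (ta_celsius + 273.15))


def storage_term(c_prev, c_curr, rho_m, z_m, dt_s=1800.0):
    """Single-point CO2 storage flux density (umol m-2 s-1).

    c_prev, c_curr: consecutive 30 min mean CO2 mole fractions (umol mol-1).
    rho_m: air molar density (mol m-3); z_m: measurement height (m).
    The backward difference is an assumption of this sketch.
    """
    return rho_m * z_m * (c_curr - c_prev) / dt_s


def net_ecosystem_exchange(fc, sc):
    """NEE = FC + SC, both in umol m-2 s-1."""
    return fc + sc


# Example: a 9 umol mol-1 rise over 30 min at 10 m height.
sc = storage_term(410.0, 419.0, rho_m=40.0, z_m=10.0)
print(sc, net_ecosystem_exchange(-5.0, sc))
```

With a single measurement level, the concentration change at the sensor height is taken as representative of the whole air column below it, which is why the measurement height enters as a simple multiplier.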
After flux density calculations, NEE, H and LE were filtered to remove outliers and to ensure a good quality of the measurements. FC was also filtered to provide a clean version of it in the final datasets. Outliers were removed following the approach of Mauder et al. (2013), based on a median absolute deviation (MAD) filter, since it is a robust outlier detector (Leys et al., 2013). The MAD filter was used for positive and negative values of NEE and FC, with the threshold parameter q (Eq. 1 in Mauder et al., 2013) being 7.5; for LE and H, the same approach was applied for positive values, while for negative values, a hard limit of −100 W m−2 was applied to H and of −20 W m−2 to LE, rejecting data below that threshold. A moving window of 14 d was used for calculating median and MAD values and for removing outliers, and three iterations were performed on each time series to increase the robustness of the outlier removal. After the MAD filter, additional hard upper and lower limits were applied. These limits were 700 W m−2 for H and LE, and −60 and 50 µmol m−2 s−1 for NEE and FC. Only good quality data (e.g. sufficiently well-developed turbulence) according to data quality flags between 1 and 6 were accepted (Foken et al., 2005).
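As an illustration of the outlier removal, the sketch below applies a MAD test of the form median ± q·MAD/0.6745 to a single window of data and repeats it three times. It is a simplified stand-in for the procedure described above: the 14 d moving window and the separate treatment of positive and negative values are omitted, and the function names are hypothetical.

```python
from statistics import median


def mad_mask(values, q=7.5):
    """Return True for entries kept by the MAD test; None entries are gaps."""
    valid = [v for v in values if v is not None]
    if not valid:
        return [False] * len(values)
    med = median(valid)
    mad = median(abs(v - med) for v in valid)
    half_width = q * mad / 0.6745  # 0.6745 scales MAD to a Gaussian sigma
    return [v is not None and abs(v - med) <= half_width for v in values]


def mad_filter(values, q=7.5, iterations=3):
    """Replace outliers by None, repeating the test on the cleaned series."""
    cleaned = list(values)
    for _ in range(iterations):
        keep = mad_mask(cleaned, q)
        cleaned = [v if k else None for v, k in zip(cleaned, keep)]
    return cleaned


window = [1.0, 1.2, 0.9, 1.1, 1.0, 50.0, 0.8]
print(mad_filter(window))
```

Because both the median and the MAD are insensitive to a few extreme values, the single spike in the example does not widen the acceptance band, which is the property that makes the test robust.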
Additionally, H, LE and NEE were filtered according to friction velocity (USTAR) values, to ensure well-developed turbulence. The USTAR threshold was estimated following the procedure of Papale et al. (2006), based on the method by Reichstein et al. (2005). Following the FLUXNET processing pipeline (Pastorello et al., 2020), 40 different USTAR thresholds were calculated and applied. The 40 thresholds were drawn as the percentiles 2.5 % to 97.5 %, in steps of 2.5 %, of the distribution of estimated USTAR values. USTAR thresholds were calculated on a yearly basis. Table 4 shows the averages of USTAR thresholds corresponding to the 2.5 % and 97.5 % percentiles, across years. This procedure led to 40 different time series for each variable, i.e. H, LE and NEE. The gap-filling (Sect. 2.3.2) was performed on these time series.
2.3.2 Gap-filling of eddy covariance data
Gaps in the time series of NEE, LE and H were filled using a combination of two methods, as also applied in Winck et al. (2023). First, the full time series were gap-filled using the Marginal Distribution Sampling (MDS) algorithm (Reichstein et al., 2005), implemented in REddyProc. The output of REddyProc provides a quality flag depending on the reliability of the filled data, ranging from 0 to 3. Original data have a quality flag of 0. Highly reliable gap-filled data have a quality flag of 1, as these data correspond to short gaps of up to a few hours duration with low uncertainty from gap-filling (Wutzler et al., 2018). Only data with flags of 0 and 1 were retained after using the MDS algorithm. Subsequently, the remaining gaps were filled using the XGBoost method. Prior to applying this algorithm, additional hard limits were applied to autumn and winter months to prevent a bias during those periods, due to the large sensitivity of XGBoost to the input data used to train the model. For H and LE, these hard limits were 100 W m−2 in October and March and 50 W m−2 from November to February. For NEE, the hard limits were ± 15 µmol m−2 s−1 from November to February and ± 20 µmol m−2 s−1 in October and March.
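The month-dependent hard limits listed above can be written as a small lookup. This is only a sketch: the names are hypothetical, and it assumes that the limits for H and LE act as upper limits, whereas the NEE limits are symmetric, as the ± sign indicates.

```python
# Month numbers: 1 = January ... 12 = December.
SHOULDER_MONTHS = {3, 10}        # March and October
WINTER_MONTHS = {11, 12, 1, 2}   # November to February


def seasonal_limit(variable, month):
    """Hard limit applied before XGBoost training, or None outside Oct-Mar.

    Units: W m-2 for H and LE, umol m-2 s-1 for NEE.
    """
    if variable not in ("H", "LE", "NEE"):
        raise ValueError(f"unknown variable: {variable}")
    energy_flux = variable in ("H", "LE")
    if month in SHOULDER_MONTHS:
        return 100.0 if energy_flux else 20.0
    if month in WINTER_MONTHS:
        return 50.0 if energy_flux else 15.0
    return None


def passes_seasonal_limit(variable, month, value):
    """True if the value is kept for training in the given month."""
    limit = seasonal_limit(variable, month)
    if limit is None:
        return True
    if variable == "NEE":
        return abs(value) <= limit   # symmetric limit
    return value <= limit            # assumed to be an upper limit only


print(passes_seasonal_limit("LE", 12, 80.0))   # December LE above 50 W m-2
```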
The original code implementing XGBoost (see Sect. 2.2.3) was used in Vekuri et al. (2023) to fill NEE and later on adapted in Callejas-Rodelas et al. (2025a) to gap-fill H and LE as well, with a different combination of predictors. This later version, from Callejas-Rodelas et al. (2025a), was employed to generate the current datasets. The meteorological drivers used for the gap-filling were SW_IN, TA, VPD, WS and WD. WS was included to consider the effect of a changing fetch area for the flux density measurements, and WD was included to account for the heterogeneity across the AF sites and some of the OC/OG sites. The gap-filling was applied separately to all 40 different time series for each variable H, LE and NEE.
2.3.3 Evaluation of gap-filling of eddy covariance data
Both REddyProc and XGBoost reproduced the diel cycle of measured data, for NEE (Fig. C1a in Appendix C) and LE (Fig. C1b). The gap-filling led to a reduction of the magnitude of the measured values, due to the fact that most gaps were present during winter and during nighttime, with smaller flux density magnitudes. However, the similarity in diel cycles between measured and filled data can be considered an indicator of reliability of the gap-filling procedure, besides other sensitivity or uncertainty analysis that may be performed.
Furthermore, RMSE was evaluated for the application of XGBoost across different scenarios of training and testing data. The RMSE values displayed in Table 5 were generally small and indicate a good performance of the technique.
2.3.4 Uncertainty estimates of eddy covariance data
A method to evaluate the uncertainty in aggregates of eddy covariance data (such as annual carbon or water balances) was proposed in Callejas-Rodelas et al. (2025a). This method consisted in assigning individual errors to the 30 min data in two distinct manners. In the first case, errors were calculated as the deviation of fluxes between lower-cost and conventional EC setups (Callejas-Rodelas et al., 2024; van Ramshorst et al., 2024). This deviation was considered as the worst case RMSE of the linear regression models of NEE and LE, respectively. In the second case, errors were calculated as the sum of that deviation and the random error (Callejas-Rodelas et al., 2025a). Errors in gap-filled data with REddyProc were defined as the standard deviation of the data points used for gap-filling (Wutzler et al., 2018). Finally, the error in gap-filled data with XGBoost was taken as the RMSE between modeled and measured data, after fitting the XGBoost regressor and prior to predicting all missing data, similarly to the meteorological data filled with XGBoost. RMSE values of gap-filled NEE, LE and H are displayed in Table 5. The individual errors can be propagated if a cumulative sum is calculated, as explained in Sect. 2.5 of the paper by Callejas-Rodelas et al. (2025a). The proposed method leads to small uncertainty estimates for very long time series, as the propagated error reduces its magnitude with the square root of the number of data. This is due to the nature of the error, which is considered fully random: when an uncertainty range is assigned to a single 30 min data point, the underlying assumption is that the data is equal to its value ± the uncertainty. This method, therefore, is not suitable for considering systematic deviations between setups with a known performance, such as our LC-EC and CON-EC setups.
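The square-root behavior mentioned above follows directly from adding independent errors in quadrature. The sketch below is a generic illustration with hypothetical function names, not the code of Callejas-Rodelas et al. (2025a).

```python
import math


def propagated_error(errors):
    """Error of a sum of independent terms: square root of the summed squares."""
    return math.sqrt(sum(e * e for e in errors))


def relative_error_of_sum(value, error, n):
    """Relative error of the sum of n identical values with identical errors."""
    return propagated_error([error] * n) / abs(value * n)


# A hundredfold longer series has a tenfold smaller relative error.
print(relative_error_of_sum(1.0, 2.0, 100))     # 0.2
print(relative_error_of_sum(1.0, 2.0, 10000))   # 0.02
```

The absolute error grows with the square root of the number of data points, while the sum itself grows linearly, so the relative error shrinks for long series. A systematic offset between setups does not average out in this way, which is why it is not captured by this kind of propagation.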
Table 5. Root mean squared error (RMSE) between modeled and measured data, for NEE, LE and H. RMSE was calculated after fitting the XGBoost regressor to measured data and prior to the prediction of all missing data. RMSE was calculated as an average across all 40 USTAR scenarios for each variable and site.
For this reason, we applied a different method to evaluate the uncertainty in annual or multi-annual sums of NEE or ET. This method was based on Richardson and Hollinger (2007) with slight modifications. For this error evaluation, we used the filtered and cleaned data corresponding to the USTAR 50 % scenario for all sites. The method, as described in Richardson and Hollinger (2007), consisted in two distinct steps.
First, we generated 50 different synthetic datasets for NEE and LE, using neural networks built with the library keras (Chollet et al., 2015) on the tensorflow platform (Abadi et al., 2015). We added random noise to the time series of NEE and LE, so that each of the 50 datasets had a different level of noise. The code to generate the synthetic datasets was based on the code by Vekuri et al. (2023) for their evaluation of XGBoost against REddyProc. We then added the same gaps existing in the original time series to the synthetic datasets, and applied the same processing routine described previously: first, short gaps were filled with REddyProc, then longer gaps were filled with XGBoost. We calculated the standard deviation across the 50 different multi-annual sums for NEE (σR(NEE)) and LE (σR(LE)) to get the uncertainty due to the random noise in the original data.
Second, we generated a single gap-free synthetic dataset and created different short- and long-gap scenarios, selecting only one year of data. For all sites we used 2020, because it was the most complete year, except for Wendhausen OC, for which we used 2022 as 2020 was not available. Short gaps were added covering randomly 30 % of the dataset. Long gaps were added in addition to the short gaps. Gap lengths ranged from 1 to 28 d, increasing in steps of 3 d, and the start of the gap was moved from day 1 to day 361 of the year in steps of 10 d. This means that, for each station, we created 370 short-gap scenarios and 370 long-gap scenarios. All these scenarios were filled using the same routine as explained previously. Then, we calculated the difference in annual sums of NEE (ΔNEEj,k) and LE (ΔLEj,k) between the short- and long-gap scenario. This difference was the error due to the presence of a gap of length j starting on day k. It was calculated for all j, k scenarios. We then divided the year into 12 months (periods m) and calculated, for each combination of j and m, the standard deviation of ΔNEEj,k or ΔLEj,k across all long-gap scenarios k within month m, to get σm (ΔNEEj) and σm (ΔLEj), respectively. After this, we calculated the slope of the relation between gap length (j) and σm (ΔNEEj) or σm (ΔLEj). This slope was used to get the uncertainty due to long gaps (σLG (NEE) or σLG (LE)) by multiplying it by the gap length in days. Finally, we calculated the error in the annual sums by adding in quadrature the error due to noise and the error due to long gaps, both for NEE and LE:
$$\sigma_{\mathrm{total}}(\mathrm{NEE}) = \sqrt{\sigma_R(\mathrm{NEE})^2 + \sigma_{LG}(\mathrm{NEE})^2}, \qquad \sigma_{\mathrm{total}}(\mathrm{LE}) = \sqrt{\sigma_R(\mathrm{LE})^2 + \sigma_{LG}(\mathrm{LE})^2}$$
The quadrature sum of the error corresponding to long gaps and the error corresponding to random noise was considered the total error in the final sum. The advantage of this method is that it can be applied to any annual or multi-annual sum, by just selecting the corresponding period in the short-gap evaluation, and using the relations between gap length and standard deviation for different months of the year, corresponding to the long-gap evaluation. On the other hand, there are two main limitations. First, the method could lead to a bias in the error evaluation due to the nature of these ecosystems: since crops rotate from year to year, assuming a similar uncertainty across seasons might be wrong. Second, this method does not evaluate the performance of the gap-filling itself, so it assumes no uncertainty related to the method selection, only the uncertainty due to the noise and gaps in the measured time series. However, in general it provides a more robust estimate of the uncertainty compared to the simple error propagation described at the beginning of this section.
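The last steps of this procedure, i.e. the slope of the relation between gap length and standard deviation, the long-gap error and the quadrature sum, can be sketched as follows. The ordinary least-squares fit with intercept is an assumption (the fitting method is not specified here), and all names and numbers are hypothetical.

```python
import math


def ols_slope(x, y):
    """Slope of an ordinary least-squares line through the points (x, y)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def long_gap_error(slope, gap_length_days):
    """Uncertainty due to a long gap: slope times gap length in days."""
    return slope * gap_length_days


def total_error(sigma_random, sigma_long_gap):
    """Quadrature sum of the random-noise and long-gap errors."""
    return math.sqrt(sigma_random ** 2 + sigma_long_gap ** 2)


# Made-up standard deviations for gap lengths of 1, 4, 7 and 10 d in one month.
gap_lengths = [1, 4, 7, 10]
sigma_m = [0.5, 2.0, 3.5, 5.0]
slope = ols_slope(gap_lengths, sigma_m)
print(total_error(4.0, long_gap_error(slope, 6)))
```

In practice one slope per month would be needed, so that a gap in the growing season can be weighted differently from a gap in winter.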
2.4 Partitioning of NEE into GPP and RECO
The net ecosystem exchange of CO2 (NEE) was partitioned into its photosynthesis (GPP) and respiration (RECO) components. Both nighttime and daytime models were applied, after Reichstein et al. (2005) and Lasslop et al. (2010), respectively. The partitioning was performed using the REddyProc package in R, version 4.2.1 (Wutzler et al., 2018). The partitioning was applied separately to each of the filled time series of NEE. This implies the 40 filled NEE time series led to 80 different final time series of GPP and RECO, because of the two methods used for partitioning. There was one exception, for Wendhausen OC, where the daytime partitioning method failed for the upper percentiles (87.75 to 97.5) and only nighttime GPP and RECO were provided for those USTAR scenarios.
Following Tikkasalo et al. (2025), nighttime GPP was forced to zero by subtracting the 1.5 d running median of the nighttime GPP from the GPP time series and forcing any residual nighttime GPP to zero. The 1.5 d running median of GPP was then added to the RECO time series. This way, the relation NEE = RECO − GPP is valid at all time steps and GPP is zero when global radiation is zero.
2.5 Flux footprint climatology
The footprint climatology was estimated using the model of Kljun et al. (2015) in its Python version (Python v. 3.6). Separate runs were carried out for the different years at each of the stations, in order to better address the effect of a changing canopy height, and the effect of crop rotation at the different sites. Additionally, a single run with all available data was done, to provide an illustration on the average area measured by the EC stations (Fig. 2). Yearly footprint runs are not shown in the current article, but the results are available at the repository linked to this publication.
The input data to the footprint model were non gap-filled wind data (WS and WD), roughness length (z0), USTAR, Obukhov length (MO_LENGTH), the standard deviation of lateral wind speed (V_SIGMA), boundary layer height (PBLH, retrieved from ERA5, Hersbach et al., 2020), measurement height (zm), displacement height (d) and aerodynamic canopy height (ha). The time resolution of all parameters calculated from the eddy covariance measurements was 30 min; therefore, PBLH was resampled to match this resolution by linearly interpolating the original hourly time series. Only daytime values were used, based on values of SW_IN higher than 10 W m−2. Some hard limits were applied to MO_LENGTH (values between −30 and +30 m were removed) and zm−d (values below 2.1 m were removed) to avoid errors while running the model. ha was calculated during neutral conditions (stability parameter ) based on the procedure by Chu et al. (2018). Complete time series of ha were estimated as described in more detail in van Ramshorst et al. (2025). This allowed for a better representation of the roughness effects of a spatially and temporally varying canopy, and can therefore be considered a more accurate procedure than the use of a single value representing the average canopy height for the whole site for each time step. z0 and d were estimated as 0.1 and 0.6 times the aerodynamic canopy height, respectively, following Chu et al. (2018).
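The preparation of the footprint inputs described above can be sketched in a few lines. The function names are hypothetical, and whether the limits are inclusive or exclusive at their boundaries is an assumption of this sketch.

```python
def hourly_to_half_hourly(hourly):
    """Linearly interpolate an hourly series (e.g. PBLH) to 30 min steps."""
    if not hourly:
        return []
    half_hourly = []
    for current, following in zip(hourly, hourly[1:]):
        half_hourly.extend([current, (current + following) / 2.0])
    half_hourly.append(hourly[-1])
    return half_hourly


def keep_for_footprint(sw_in, mo_length, z_m, d):
    """True if a 30 min record passes the filters applied before the model run.

    sw_in: shortwave incoming radiation (W m-2); mo_length: Obukhov length (m);
    z_m: measurement height (m); d: displacement height (m).
    """
    is_daytime = sw_in > 10.0
    obukhov_ok = abs(mo_length) >= 30.0   # values between -30 and +30 m removed
    height_ok = (z_m - d) >= 2.1          # values below 2.1 m removed
    return is_daytime and obukhov_ok and height_ok


print(hourly_to_half_hourly([100.0, 200.0, 400.0]))
print(keep_for_footprint(sw_in=200.0, mo_length=-150.0, z_m=10.0, d=3.0))
```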
Canopy height at the different sites was estimated using some available tree height measurements and pictures taken across the measurement period at the sites. For the OC and the OG sites, a fixed maximum canopy height was assigned to the crops and grasses. For the AF sites, the tree height was considered as the canopy height of the whole ecosystem. In all cases, canopy height was linearly interpolated from the start until the end of the growing season. For crops, the growing season extended from crop sowing until crop harvest. For grass, the growing season extended between grass cuts. For the AF sites, the growing season extended from April to October each year. Canopy height is provided as an ancillary parameter in the dataset.
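The linear interpolation of canopy height over a growing season can be illustrated as below. The function name, the use of day of year as the time axis and the choice to return None outside the season are assumptions of this sketch; the heights in the example are made up.

```python
def canopy_height(day, season_start, season_end, h_start, h_end):
    """Canopy height (m) on a given day, interpolated linearly between the
    height at the start and at the end of the growing season.

    Returns None outside the growing season, where this sketch makes no
    statement about the canopy height.
    """
    if not season_start <= day <= season_end:
        return None
    if season_end == season_start:
        return h_end
    fraction = (day - season_start) / (season_end - season_start)
    return h_start + fraction * (h_end - h_start)


# A crop sown on day 100 and harvested on day 200 with a maximum height of 1 m.
print(canopy_height(150, 100, 200, 0.0, 1.0))   # 0.5
```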
Table 6. Average of parameters used for the footprint model implementation: measurement height (zm), canopy height (z), aerodynamic canopy height (ha) (calculated as described in van Ramshorst et al., 2025) and friction velocity (USTAR). Canopy height was estimated from measurements and pictures of the sites. Mean wind speed (WS) and wind direction (WD), also needed as input to the footprint model, are displayed in Table 7.
2.6 Shortcomings of the dataset
In some cases there were certain conditions which reduced the accuracy of the eddy covariance measurements. Firstly, at Dornburg AF, the measurements were conducted in the roughness sublayer and below the canopy height of the tree stripes in the footprint during most of the study period. Trees were very close to the height of the EC station in 2018, and higher than it from mid 2020, a problem which was partially mitigated by a clearcut of an area of 30 m around the station in August 2019. Although data followed the same processing routine as the other datasets and were thoroughly quality checked and filtered, the structure of turbulence might have affected their reliability.
Furthermore, the period from January 2020 until March 2021 was missing in Wendhausen OC, and the data from 2021 in Dornburg OC were of very poor quality. These two site-years were discarded in the final dataset, as the gap-filling would become very uncertain.
At Vechta AF, there was only one tree stripe, and the station was located within the crops, and not within the tree stripe (Fig. 2). Therefore, the representativity of this AF site should be considered with care. Furthermore, the EC data from 2022 to 2024 were not included in the final dataset due to the very noisy flux densities.
3.1 Meteorological data
SW_IN was similar between AF and OC/OG, with small variations between both systems attributed to changes in cloud cover along the integration periods of 30 min (Table 7). PA was measured only at the AF site. Variations in PA between the AF and OC/OG sites were expected to be insignificant. TA was slightly higher at the AF in Dornburg, Mariensee and Vechta, and slightly lower in Wendhausen, while in Forst it was similar at both sites. RH was higher at the OC than at the AF in Dornburg and Forst, and lower at the OC/OG than at the AF in Mariensee, Vechta and Wendhausen. VPD was larger at the AF in Dornburg and Forst, and lower in Mariensee, Vechta and Wendhausen. In general, differences in TA, RH and VPD are negligible for the comparison of AF and OC/OG, likely due to the homogeneity in atmospheric conditions between stations located close to each other. WS was larger at the OC in Dornburg, Wendhausen and Vechta, and larger at the AF in Forst and Mariensee. At Mariensee, this is attributed to the generally lower tree height at the AF site, so that the WS measured at 10 m height (AF) is larger than at 3 or 3.5 m height (OG). In the case of Forst AF, this was caused by the station being located in a gap within the tree stripe. At the other sites, however, the trees were taller than the AF station, especially in Dornburg and Vechta, leading to a stronger wind-barrier effect and wind speed reduction. Additionally, in Dornburg the OC site was located on an open elevated slope, leading to stronger winds. WD indicated predominant winds from the Northwest for Dornburg AF and OC, Mariensee AF and Wendhausen OC; and from the Southwest for Forst AF and OC, Mariensee OG, Vechta AF and OC and Wendhausen AF. P was calculated as an average over all annual sums at both AF and OC/OG; however, the values at the OC/OG are taken as the reference at all the sites because at some of them (Dornburg, Mariensee and Wendhausen) there was an interception effect from the AF tree canopy.
Table 7. Averages of shortwave incoming radiation (SW_IN, W m−2), atmospheric pressure (PA, kPa), air temperature (TA, °C), relative humidity (RH, %), vapor pressure deficit (VPD, kPa), precipitation (P, mm), wind speed (WS, m s−1) and wind direction (WD, °) for all sites. Averages were calculated across the period 1 July 2016 to 1 July 2024 for Dornburg, Forst and Wendhausen; 1 July 2016 to 1 July 2021 for Mariensee; and 1 July 2019 to 1 July 2024 for Vechta. For P, the value was calculated as the average of the annual cumulative sums across the corresponding period. For the other variables, the average was calculated as the average of all 30 min values across the corresponding period. WD is expressed in degrees (°) from the North direction.
With respect to the time series of meteorological data, the exemplary time series of SW_IN and TA (Fig. 3a and b) in Forst AF followed a marked seasonality, with the largest values in June for SW_IN and July for TA, and the lowest values in December for SW_IN and in January for TA. There was a longer gap in 2019, and multiple additional shorter gaps of a few weeks or months in the other years. Gaps were longer in winter because the stations were solar powered, but this did not seem to affect the performance of the gap-filling procedure. In the case of WS, gaps were longer, especially in 2018 and part of 2019, but the gap-filling seemed to appropriately reproduce the expected variable behavior at the site in terms of seasonal and diurnal variations. WS decreased slightly in magnitude over time, mostly after 2021, an effect that could be attributed to the increasing tree height after the tree harvest in that year at the site.
Figure 3. Exemplary time series of shortwave incoming radiation (SW_IN, W m−2) (a), air temperature (TA, °C) (b) and wind speed (WS, m s−1) (c) at Forst AF, at 30 min time resolution. Dark blue markers indicate measured data, olive green markers indicate data filled using either interpolation or the nearby station as a reference, and pink markers indicate data filled using ERA5-Land data as a reference (see Sect. 2.2.2). Note that in the last three years, SW_IN had more gaps and more data were filled using ERA5-Land data, but still most of the data were measured or filled with the nearby station as a reference. Note that not all the filled points are visible in the figure; often, measured values impede the visualization of filled values.
The gap-filled data reproduced the seasonality and trends in all variables. In general, the agreement between measured meteorological variables and ERA5-Land data is good (Fig. B1 and Tables B1 and B2), which justifies the use of the gap-filling procedure. For WS and WD, the correlation between measured and ERA5-Land data is worse (Fig. B1e and f) and varies more across sites, which can be explained by locally induced effects on the wind field and by a potential time shift between point data and gridded data (Lipson et al., 2022). For SW_IN, slopes are very close to 1.0, but because there is a large scatter in the data (Fig. B1), RMSE is also large. This is caused by locally varying cloud conditions. Dornburg, Mariensee and Vechta experienced a higher annual TA at the AF than at the OC/OG (Fig. 4), while in Wendhausen, TA was lower at the AF, and in Forst it was similar for both sites. However, the differences were very small between AF and OC/OG. 2021 was the coldest year at all sites, while the other years were relatively similar. 2016 and 2024 for all sites, and 2019 for Vechta, should not be considered because the average does not comprise some of the coldest months of the year. P was largest in Vechta, followed by Wendhausen, and lowest in Forst. 2018 was the driest year at all sites except Vechta, where the driest year was 2022. In Forst the annual sum of P in 2018 was similar to 2019 and 2020, likely indicating a longer term drought event.
Figure 4. Annual averages of air temperature (TA, °C) and annual sums of precipitation (P, mm) for all the sites. Black circles with solid line represent the TA values at the AF sites. Orange squares with solid line represent the TA values at the OC/OG sites. Light orange bars represent annual sums of P at the OC/OG sites, taken as a reference for the sites. The measurement period started in spring of 2016 at all sites, except in Vechta, where it started in summer of 2019. Due to this, and due to the incomplete year of 2024 because of the end of the project, the data corresponding to 2016 and 2024 (and 2019 for Vechta), should be considered with care. Data corresponding to incomplete years are marked with asterisks.
3.2 Eddy covariance flux densities
All flux density time series (for NEE, LE and H) at the exemplary Dornburg AF site exhibited a marked seasonality (Fig. 5), with the largest values at the peak of the growing season (around June and July each year) and the lowest values in winter. NEE indicated an uptake of carbon throughout the growing season, coinciding in time with a larger LE, representing water vapor release from the AF system. The lowest values of NEE were attained in 2019 and in 2023. H was the largest in 2023, corresponding to a lower LE. Throughout the whole period, LE was in general larger than H, indicating that water was not a limiting factor, since more energy was employed to evaporate water from the ecosystem, than to heat up the air next to the surface. The gap-filling reproduced satisfactorily the seasonality and in general it did not appear to bias the time series of flux densities. However, XGBoost smoothed the extreme values of the distributions for all variables, as demonstrated by, for example, the gap-filled LE values, which were never below 0 (Fig. 5). This is a characteristic of gradient boosting techniques (Chen and Guestrin, 2016; Friedman, 2001).
Figure 5. Exemplary time series of flux densities of CO2 (NEE, µmol m−2 s−1) (a), latent heat (LE, W m−2) (b) and sensible heat (H, W m−2) (c) corresponding to the 50 % percentile USTAR scenario at Dornburg AF. Dark blue markers indicate measured data, olive green markers indicate data filled with REddyProc, and pink markers indicate data filled with XGBoost (see Sect. 2.3.2). The horizontal dashed line in (a) represents the 0 uptake line. Negative values of NEE indicate a carbon uptake, while positive values indicate a carbon release. Note that not all the filled points are visible in the figure; often, measured values impede the visualization of filled values.
Annual sums of H were larger at the OC/OG than at the AF in Dornburg, Forst and Mariensee, while they were lower at the OC than at the AF in Vechta and Wendhausen (Fig. 6). Some years, like 2019 in Wendhausen, showed very similar sums of H. With respect to annual sums of ET, we found larger annual sums at the AF than at the OC/OG in most of the years at most of the sites, with some exceptions, like the year 2020 in Dornburg, the year 2021 in Mariensee, or the year 2024 in Wendhausen. In general, increases in the magnitude of both H and LE occurred in some years, like in 2020 in Forst, but there was no clear correlation between both, because the energy partitioning depended on many factors, such as the available net radiation, the crop present in the system and its phenology, and yearly soil and plant water balances. Annual sums of NEE (Fig. 7) indicated a larger carbon sequestration at the AF than at the OC/OG in all years except 2019 and 2020 in Dornburg, in all years except 2019 in Forst, in 2019 and 2020 in Mariensee and in only one year in Wendhausen (2019). A more negative annual sum of NEE indicates a higher uptake. In the remaining years, either the OC/OG carbon uptake was larger, or both systems showed a positive carbon balance, indicating a carbon release to the atmosphere. GPP was generally larger at the AF than at the OC/OG at all sites, except in Vechta, where it was similar for both AF and OC. In most years, RECO values were larger at the AF than at the OC in Dornburg and Forst, while in Wendhausen and Vechta they were larger at the AF in some years and larger at the OC in others. At Mariensee, RECO was lower at the AF than at the OG. Generally, the agreement between nighttime and daytime modeled GPP and RECO was good in the case of Forst and Dornburg. For the other sites, GPPNT and RECONT were larger than GPPDT and RECODT, respectively, in some years, while in other years the opposite behavior was observed.
Figure 6. Annual sums of sensible heat flux density (H, MJ m−2) and evapotranspiration (ET, mm) for all the sites. Orange circles with solid line represent the H values at the AF sites. Burgundy squares with solid line represent the H values at the OC/OG sites. Dark blue bars represent the ET values at the AF sites. Turquoise bars represent the ET values at the OC/OG sites. For each annual sum, the error bar represents the standard deviation of the sums across all USTAR scenarios. The sums corresponding to the years 2019 and 2024 should be considered with care, given that these were incomplete years (see Table A2). The years 2020 in Wendhausen OC and 2021 in Dornburg OC were missing. Additionally, in 2021 the considered period is shorter in Wendhausen OC, because there were three missing months from January to March. Data corresponding to the incomplete years are marked with asterisks.
The balance between gross primary productivity (GPP) and ecosystem respiration (RECO) explains the differences in the net carbon balance of AF and OC/OG. In most site-years, the AF sites showed a larger GPP than the OC/OG sites, but also a larger RECO. This is due to an enhanced physiological activity at the AF compared to the OC/OG sites, with larger leaf area index and an extended growing season of trees in comparison to crops. However, there were some exceptions, like Vechta, where the vegetation surrounding both AF and OC stations was very similar, therefore leading to very similar carbon balances between both sites; Mariensee, where RECO was larger at the OG because it consisted of low-intensity managed grass; and some years at Wendhausen, where both GPP and RECO were larger at the OC. If GPP was large due to the presence of the trees at the AF, but RECO was also large because there was more biomass respiring and more organic matter being decomposed in the soil, then NEE could sum to a larger net uptake at the OC/OG. In order to evaluate longer term differences, all the individual and year-to-year differences in crops, phenology and climate must be considered. Additionally, to assess the overall net carbon balances, carbon export such as the yield and biomass removal from the field after harvest must be taken into account (van Ramshorst et al., 2025). Most values in 2019 were very low, because the growing season was either very advanced or over when the measurements started (Table A2).
Figure 7. Annual sums of CO2 flux density (NEE, g C m−2) (dark blue), gross primary production from the daytime model (GPPDT, g C m−2, solid turquoise markers), gross primary production from the nighttime model (GPPNT, g C m−2, empty light orange markers), ecosystem respiration from the daytime model (RECODT, g C m−2, solid terracotta markers) and ecosystem respiration from the nighttime model (RECONT, g C m−2, empty red russet markers) for all the sites. Circle markers represent the values at the AF sites. Square markers represent the values at the OC/OG sites. For each annual sum, the error bar represents the standard deviation of the sums across all USTAR scenarios. The sums corresponding to the years 2019 and 2024 should be considered with care, given that these were incomplete years. The years 2020 in Wendhausen OC and 2021 in Dornburg OC were missing. Additionally, in 2021 the considered period is shorter in Wendhausen OC, because there were three missing months from January to March. The data corresponding to the incomplete years are marked with asterisks. Note that these sums do not include the carbon exports through harvest of crops and trees. Comprehensive analyses on carbon balance over the studied sites should include those data, accessible through the DOIs in Table A3.
The annual sums of NEE were, for some years, such as 2020 for Dornburg AF or 2021 for Mariensee AF, very uncertain (Table 8): the value had a smaller magnitude than the annual error. This means that, for those years, it cannot be determined with certainty whether the ecosystems were a sink or a source of CO2. In the case of ET, the sums were always much larger than the uncertainty.
Table 8. Annual sums of ET and NEE, corresponding to the USTAR 50 % scenario, together with the uncertainty estimates based on Richardson and Hollinger (2007) (see Sect. 2.3.4). The uncertainty in ET was originally calculated for LE, then the error in ET was taken based on the same ratio between the error in LE and the annual sum of LE.
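The two checks implied by Table 8 and the preceding paragraph can be written compactly: scaling the LE uncertainty to ET with the same relative error, and testing whether an annual NEE sum is distinguishable from zero. The function names and the numbers in the example are hypothetical and are not values from the dataset.

```python
def et_uncertainty(et_sum, le_sum, le_error):
    """Uncertainty of the annual ET sum (mm), assuming the same relative
    error as for the annual LE sum."""
    return abs(et_sum) * le_error / abs(le_sum)


def sign_is_resolved(annual_sum, error):
    """True if the magnitude of an annual sum exceeds its uncertainty, i.e.
    the site can be classified as a sink or a source."""
    return abs(annual_sum) > error


# Made-up numbers: a 5 % relative error in LE carries over to ET.
print(et_uncertainty(et_sum=500.0, le_sum=1200.0, le_error=60.0))   # 25.0
print(sign_is_resolved(annual_sum=-20.0, error=45.0))               # False
```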
4.1 Novelty and implications of the dataset
The objective of the current publication was to present a harmonized and complete dataset of multiple site-years of meteorological and eddy covariance data over five AF and five OC/OG sites. A total of seventy eight site-years were compiled during this project, which sum up to a consistent dataset that may be re-used by the scientific community. This dataset is unique in its nature, because currently such a comprehensive data set from temperate alley cropping agroforestry systems is not available. In general, harmonized datasets offer the opportunity to study particular phenomena of interest, without the negative sides of data harvesting from different platforms, repositories, personal communications, etc. Datasets focusing on specific ecosystems, which were compiled in order to answer a specific research question, may help to better target and organize further research, by ensuring consistency and alignments in data structure and processing routines.
In particular, the effect of this type of AF on carbon exchange has not yet been addressed through EC, nor has the water use efficiency of the two systems, AF and OC/OG. Furthermore, the dataset can help to understand differences in biophysical effects of the land use change from OC/OG to AF, with implications for the local and potentially global radiative forcing. Based on this data compilation, the modeling of long-term behavior of AF and OC/OG sites, combined with yield modeling through several crop rotations (using complementary data to this publication), could address the question of whether this type of AF system is potentially more resilient to climate change impacts.
Furthermore, the range of studied sites, characterized by distinct crop rotations, soils and climatic conditions, offers the opportunity to examine in detail the inter-annual and inter-site variability of the development of a certain crop and its implications for water and carbon balances. The behavior of the AF and OC/OG systems is expected to differ depending not only on the meteorological conditions of a specific season, but also on the physiology of the crop grown in between the trees, the length of its growing season until harvest, or the age of the trees at the AF after tree harvest. The present dataset, complemented by the two-year campaign of EC measurements with a distributed network of three EC stations (published in Callejas-Rodelas et al., 2025a), could help to elucidate the discrepancies between AF and OC/OG in carbon and energy balances, as well as in ecosystem properties, depending on the crop type.
Due to the site heterogeneity, particularly at the AF, the footprint model output should be considered with care (Göckede et al., 2004). Nevertheless, the current dataset may help to study the effect of heterogeneity in the AF sites on flux densities, and to test different approaches to better characterize footprints and the source/sink behavior of the areas around the stations. This dataset also provides an opportunity to analyze the effect of canopy heterogeneity, canopy height change and harvest disturbances on the footprint climatology. More refined information on the footprints could be used to assign further quality flags to the flux densities, according to Rebmann et al. (2005) for a more accurate evaluation of carbon and energy exchanges around the stations.
4.2 Uncertainties in the dataset
4.2.1 Uncertainties and errors in the eddy covariance measurements
The uncertainties in the EC measurements from the current datasets are primarily attributable to the inherent characteristics of the EC technique itself, e.g. errors during nighttime measurements and the consequent USTAR filtering (Massman and Lee, 2002). Secondly, errors may have arisen from the use of lower-cost eddy covariance setups. Moreover, uncertainty in the measurements may have arisen from the station location (Chen et al., 2011), which may be particularly large above heterogeneous sites such as the AF systems.
The error due to larger spectral corrections in the LC-EC setups (Callejas-Rodelas et al., 2024; van Ramshorst et al., 2024) could be accounted for in different ways. If the goal is to get the strength of carbon and evapotranspiration signals in cumulative sums, the error in individual measurements relative to conventional EC setups, as quantified in Callejas-Rodelas et al. (2025a) for daily sums, can be propagated to an error in the sum using classical error propagation. Another option would be to consider the individual error as the random uncertainty estimated with different methods (e.g. Finkelstein and Sims, 2001; Kessomkiat et al., 2013; Richardson et al., 2008). The measurement error, in comparison to conventional EC setups, could then be propagated together with errors in gap-filled data as explained in Sect. 2.3.4. Another, more complete method, based on Richardson and Hollinger (2007), was applied to the current dataset, only to the USTAR 50 % scenario, for NEE and LE (Table 8). This method provided an estimate of errors in annual sums, but it could in principle be applied to multi-annual sums, and it should also be combined with further analyses related to these data, such as quantifying carbon balances over the studied sites. However, this method only estimates the error related to noise and to the gaps in the dataset, assuming the error from the gap-filling method itself is negligible. The uncertainty introduced by the modeling of missing data is discussed in Sect. 4.2.3.
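As an illustration of classical error propagation to a cumulative sum, the following Python sketch converts half-hourly NEE values and their errors from µmol m−2 s−1 to g C m−2 and adds the errors in quadrature. It is a simplified sketch and not the procedure used for Table 8: the function name is hypothetical, and the errors are assumed to be independent between half-hours, which real flux errors may not be.

```python
import math

SECONDS_PER_HALF_HOUR = 1800.0
GRAMS_C_PER_MICROMOL = 12.011e-6  # molar mass of carbon, in g per µmol


def sum_with_propagated_error(fluxes, errors):
    """Cumulative NEE (g C m-2) and its error from half-hourly values.

    fluxes, errors: sequences in µmol m-2 s-1. Errors are treated as
    independent, so they add in quadrature (an assumption of this sketch).
    """
    factor = SECONDS_PER_HALF_HOUR * GRAMS_C_PER_MICROMOL
    total = sum(f * factor for f in fluxes)
    total_error = math.sqrt(sum((e * factor) ** 2 for e in errors))
    return total, total_error


# One hypothetical day: 48 half-hours of -2.0 ± 1.0 µmol m-2 s-1.
total, error = sum_with_propagated_error([-2.0] * 48, [1.0] * 48)
print(f"daily sum = {total:.3f} ± {error:.3f} g C m-2")
```

Because independent errors grow with the square root of the number of records while the sum grows linearly, the relative error of a long cumulative sum is smaller than that of a single half-hour; correlated errors would reduce this benefit.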
The study of Callejas-Rodelas et al. (2025a) concluded that the strength of the lower-cost EC setups resides in the fact that they can facilitate replicated measurements, due to their reduced cost. Replicated measurements, especially above heterogeneous ecosystems, allow for a reduction in the error in ecosystem-level balances of carbon and energy (Hill et al., 2017). Furthermore, the higher affordability of LC-EC allows for the deployment of multiple setups at different sites, such as in the current project, therefore providing more datasets and a larger statistical robustness for the research questions.
Finally, with regard to the station location and the footprint size, in Wendhausen AF, Mariensee AF and Vechta AF the footprint area encompassed most of the target field (Fig. 2). Under certain conditions, however, it also extended to some areas beyond the field, most markedly in Mariensee. This is not shown in Fig. 2, where only the 80 % footprint contour line is displayed. The influence of the areas outside the AF field may be small, but should not be completely neglected when evaluating sources and sinks of carbon and water. For the other sites, the smaller station at the OC/OG sites and the bigger size of the fields in Dornburg AF and Forst AF prevented this effect. In any case, individual year footprints should be considered for a better estimate of the areas measured by the stations, and only as a qualitative estimate of the distribution of sources and sinks of carbon and water vapor.
4.2.2 Uncertainties in gap-filling meteorological data
The four-step gap-filling procedure applied to meteorological data (Sect. 2.2.2) made it possible to generate consistent datasets with a quality flag indicating whether gaps were filled with linear interpolation, using the nearby station (AF or OC/OG) as a reference, or using ERA5-Land as a reference. In general, if data from a nearby meteorological station are available, the accuracy of using those as a reference is higher than if re-analysis data are used. However, there are inconsistencies in the raw data from the German Weather Service stations close to the field sites, with some missing variables or poor quality data. Several publications apply a number of approaches for gap-filling the main meteorological variables (e.g. Dyukarev, 2023), and some software packages are available that facilitate their application (El Hachimi et al., 2023).
For consistency with FLUXNET datasets, re-analysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF) were used as a reference to fill missing meteorological data. The approach applied in FLUXNET (Pastorello et al., 2020) is based on the original implementation of Vuichard and Papale (2015). Vuichard and Papale (2015) used originally ERA-Interim re-analysis data (Dee et al., 2011) as a reference, but this was updated later on in FLUXNET to use ERA5 data as a reference (Hersbach et al., 2020). Instead, the current datasets were filled using ERA5-Land data as a reference, with improved resolution over the land domain and a better representation of some features, such as orography (Muñoz-Sabater et al., 2021). ERA5-Land data were also used as a reference in Dyukarev (2023) and El Hachimi et al. (2023). It would be advisable to switch the FLUXNET processing routine to a gap-filling procedure which employs ERA5-Land data for terrestrial ecosystem stations. The gap-filling of the meteorological parameters, particularly TA, SW_IN and VPD, is of fundamental relevance for the posterior gap-filling of EC flux densities. Any bias reduction can help reduce the error induced in the gap-filling of EC data.
The gap-filling of most of the variables using linear regression models was considered reliable (Table B1 and Fig. B1) and it was consistent with the FLUXNET implementation; however, there was a large scatter in SW_IN, WS and WD. These variables could potentially influence the posterior gap-filling of NEE, LE and H. In general, there are unavoidable biases in the match between local point observations, from stations, and grid-scale data, from ERA5 or ERA5-Land (Haiden et al., 2018). This bias could be reduced by employing a more sophisticated approach, such as in Lipson et al. (2022); however, this was not implemented in the current dataset. That method improved the correction of ERA5 data to match the local requirements, incorporating a spin-up period for ERA5 data and smoothing hourly biases, and using the logarithmic wind profile to optimize the relation between the ERA5 wind variables and the directly measured variables. This approach would eventually be applicable to any other site.
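The evaluation of such a linear regression model can be sketched in Python as follows. The example is a simplified illustration on synthetic data, not the processing code of the dataset: the function names are hypothetical, the station variable is assumed to be regressed on the re-analysis variable, and the 0.8 : 0.2 split between training and test data follows the ratio stated for Table B1.

```python
import math
import random


def fit_line(x, y, force_zero_intercept=False):
    """Ordinary least squares for y = a*x + b (b = 0 if forced)."""
    n = len(x)
    if force_zero_intercept:
        a = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
        return a, 0.0
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def evaluate_split(reanalysis, measured, train_fraction=0.8, seed=0):
    """Fit on a random share of the pairs; return (a, b, RMSE on the rest)."""
    pairs = list(zip(reanalysis, measured))
    random.Random(seed).shuffle(pairs)
    cut = int(train_fraction * len(pairs))
    train, test = pairs[:cut], pairs[cut:]
    a, b = fit_line([p[0] for p in train], [p[1] for p in train])
    error = rmse([p[1] for p in test], [a * p[0] + b for p in test])
    return a, b, error


# Synthetic air temperature (°C): the "station" is a scaled, shifted, noisy copy.
rng = random.Random(42)
era = [rng.uniform(-5.0, 30.0) for _ in range(500)]
station = [0.95 * t + 1.0 + rng.gauss(0.0, 0.5) for t in era]
a, b, err = evaluate_split(era, station)
print(f"slope={a:.2f}, intercept={b:.2f}, RMSE={err:.2f}")
```

The fitted slope and intercept would then be applied to the re-analysis values at the time steps where the station record is missing.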
The gap-filling of the remaining variables (NETRAD, LW_OUT, SW_OUT and G), performed with XGBoost, showed a lower RMSE for all variables (Table B1). All of these are radiation or heat transport variables, and are therefore easier to predict from other radiation variables or temperature than ecophysiological variables such as NEE. Nonetheless, this method was not benchmarked against others. In Dyukarev (2023), for example, five different methods were compared to gap-fill the main meteorological parameters, based on all the available parameters of ERA5-Land (a total of 47). This approach, adapted to provide ensemble results (see next section and Lucas-Moffat et al., 2022), could be a way to obtain more accurate gap-filled meteorological time series, especially for variables with a large site dependency. By providing less uncertain NETRAD or LW_OUT, this could improve the posterior flux density gap-filling and give a more accurate picture of biophysical ecosystem properties.
4.2.3 Uncertainties in gap-filling eddy covariance data
The hybrid gap-filling procedure for the eddy covariance data, which combines the MDS algorithm for short gaps and the XGBoost algorithm for the remaining gaps, was selected due to the presence of long gaps in the measured data, and to the demonstrated superior performance of machine learning algorithms compared to the use of only the MDS method (Winck et al., 2023). Specifically, XGBoost performed better than MDS for filling NEE over the FLUXNET2015 dataset and for high-latitude sites in Vekuri et al. (2023), and for a long dataset above a deciduous forest in Liu et al. (2025). Because it is easy to apply and allows the predictors and other model parameters to be customized, it was chosen as the algorithm to fill longer gaps (of several days up to a few weeks or months) in the presented dataset. The filled data were carefully examined to check the physical coherence of the values and the seasonality, but some level of uncertainty remained in the datasets, mainly arising from the large amount of missing data (Table 3). This uncertainty may be calculated in different manners (see Sect. 4.2.1).
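The separation between short and long gaps that underlies such a hybrid scheme can be sketched in Python as follows. This is only an illustration of the idea: the function name and the threshold `short_gap_max` are hypothetical, since the gap length at which the dataset switches from MDS to XGBoost is not restated in this section.

```python
import math


def classify_gaps(values, short_gap_max=48):
    """Label each missing record as 'short' or 'long' by the length of its gap.

    values: half-hourly flux series with float('nan') for missing records.
    short_gap_max: hypothetical limit (in records) up to which a gap would be
    passed to the MDS step; longer gaps would go to the machine learning step.
    """
    labels = [None] * len(values)
    i = 0
    while i < len(values):
        if not math.isnan(values[i]):
            i += 1
            continue
        start = i
        while i < len(values) and math.isnan(values[i]):
            i += 1
        kind = "short" if (i - start) <= short_gap_max else "long"
        labels[start:i] = [kind] * (i - start)
    return labels


nan = float("nan")
series = [1.0, nan, nan, 2.0, nan, nan, nan, nan, 3.0]
print(classify_gaps(series, short_gap_max=2))
```

In this toy series, the two-record gap is labelled as short and the four-record gap as long, while measured records keep the label `None`.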
In general, it is desirable that the uncertainty associated with any gap-filling procedure be of similar magnitude to the noise of the measurements (Moffat et al., 2007). In the current datasets, longer gaps were typically present in winter. However, other studies have shown that in temperate or boreal ecosystems, with low physiological activity in winter, long gaps did not add large uncertainty to the data (Richardson and Hollinger, 2007). Moreover, because multiple years of data were available, it was assumed that the gap-filling model could appropriately learn the relations between drivers and flux densities for different times of the year.
The uncertainty in annual balances of NEE was estimated by Vekuri et al. (2023) and Liu et al. (2025) by creating artificial time series using neural networks and then adding multiple scenarios with different percentages of data coverage. In both studies the authors used XGBoost and compared it to MDS. Based on the error in annual balances from both studies, we could assign an uncertainty in annual sums of NEE in the current dataset of ± 30 g C m−2 from Vekuri et al. (2023), considering the worst case scenario with very long gaps and a small data coverage, and of ± 70 % of the annual sum from Liu et al. (2025), considering the spread across different gap scenarios. The uncertainty in ET balances could be considered in a similar manner. We suggest considering these values when making use of the current dataset. More specific analyses, such as the method proposed by Richardson and Hollinger (2007), are planned in future work. Such a method makes it possible to disentangle random error and gap uncertainty depending on the time of the year in which gaps occurred, and focuses on the gap uncertainty rather than the gap-filling method uncertainty.
The EC community is increasingly adopting machine learning algorithms to fill gaps in time series (Lucarini et al., 2024) and there are a number of studies demonstrating their reliability (e.g. Mahabbati et al., 2021; Tramontana et al., 2016; Zhu et al., 2022). Providing ensemble results from different methods, as recommended by Lucas-Moffat et al. (2022) and implemented by e.g. Tikkasalo et al. (2025), may partially solve the question of gap-filling method uncertainty. However, machine learning methods are not yet standardized, nor are they used in the processing pipelines of international networks such as FLUXNET (Pastorello et al., 2020) or ICOS (Sabbatini et al., 2018). More efforts in this direction within the EC community are recommended and expected.
4.3 Comparison of processing routines to FLUXNET
The development of international networks of EC stations, such as ICOS or FLUXNET, aimed to standardize measurements and data processing pipelines with centralized processing protocols and units to increase inter-comparability and reliability of results (Rebmann et al., 2018; Sabbatini et al., 2018). However, the datasets uploaded through a network get embedded in a larger scale processing scheme that may mask site individual features that are relevant for the data to be interpreted or processed. Many EC stations, however, do not belong to these networks, due to the lack of certain measurements, the use of different instruments such as lower-cost setups, or the use of different processing schemes (Pastorello et al., 2020). It is thus important for the EC community that these original datasets are published along with a detailed site characterization and methodological explanation. This was the main motivation for the current publication.
The EC setups deployed within the SIGNAL project do not meet all the standards of FLUXNET in terms of the minimum requirements of instruments and measured variables. The major difference is the use of slow-response gas analyzers for measuring CO2 molar density and for RH measurements, from which later on H2O mole fraction is derived. The gas analyzers used by the community are capable of sampling at higher frequencies, and specifically the analyzer recommended in ICOS is the LI-7200 (LI-COR Inc., Lincoln, NE, USA) (Rebmann et al., 2018; Sabbatini et al., 2018). Regarding the post-processing, applied after the calculation of half-hour flux densities, there were several differences to FLUXNET, of which only the most relevant are pointed out.
Firstly, from the required data variables for post-processing (“Critical data variables for the post-processing, averaged or integrated over 30 or 60 minutes”, in Pastorello et al., 2020), in the current datasets the storage of CO2 (SC), the soil temperature (TS) and the soil water content (SWC) are missing. Compared to the variables in Table 2 of Pastorello et al. (2020), TS, SWC, outgoing photosynthetic photon flux density (PPFD_OUT), diffuse incoming photosynthetic photon flux density (PPFD_IN) and diffuse incoming global radiation (SW_DIF) are missing in this work. Secondly, with respect to gap-filling meteorological data, ERA5-Land data were used as a reference for the current dataset, instead of ERA5 as in FLUXNET (Pastorello et al., 2020). We included a more advanced approach for some energy and radiation variables which can be important to interpret ecosystem functioning and its biophysical properties. Thirdly, in FLUXNET the gap-filling of NEE, LE and H is performed fully using the MDS algorithm, based on Reichstein et al. (2005), whereas the hybrid approach combining MDS and XGBoost was employed for this dataset. Furthermore, in FLUXNET the energy balance closure is estimated using three different methods and the corrected H and LE are provided as an ancillary data product (Pastorello et al., 2020), while in the current datasets no energy balance correction was applied. Regarding USTAR filtering, the bootstrapping of the USTAR distribution is slightly different in FLUXNET from that applied in REddyProc (Wutzler et al., 2018). In FLUXNET, the method of Reichstein et al. (2005), implemented in Papale et al. (2006), was complemented by the change-point detection method after Barr et al. (2013). During the preparation of the current dataset, only the moving point detection method of Papale et al. (2006) was implemented.
Thereafter, the use of 40 different USTAR thresholds to generate 40 different time series of the flux density variables is similar to FLUXNET in the case of NEE, but it is not implemented there for H and LE. A third method implemented in FLUXNET, the sundown partitioning method after van Gorsel et al. (2009), was not applied to the current dataset because the storage data required by the method were not available. Finally, the partitioning of NEE into GPP and RECO was done with both daytime and nighttime based methods, as in FLUXNET, but the time series were forced to fulfill the relation NEE = RECO − GPP as explained in Sect. 2.4, following Tikkasalo et al. (2025). This correction is not implemented in FLUXNET.
Data presented in this article can be accessed through https://doi.org/10.25625/A2Z8T8 (Callejas Rodelas et al., 2025b), under a Creative Commons Attribution licence (CC BY-4.0). Codes used to filter, gap-fill, create the figures and extract relevant information for the paper, can be accessed through https://github.com/jangelcrgot/Processing_dataset_multiyear_eddycov_agroforestry (last access: 29 January 2026), under a GNU General Public License version 3.0 (GPL-3.0). Raw meteorological and eddy covariance data can be provided by the authors upon request.
Upon appropriate citation and referencing, users are encouraged to download and work with the current dataset and to reuse the codes, to perform further data analysis, modeling or test of gap-filling algorithms, footprint models, etc. There is a metadata excel sheet in the repository with name, units and description of all the published variables. There are three files corresponding to each station, containing meteorological data at 30 min resolution, except precipitation; eddy covariance data at 30 min resolution; and precipitation data at 1 h resolution. Missing data are marked with −9999 for original variables. Time coordinate is UTC. Time stamp is 30 min, with the ISO format yyyymmddHHMMSS, and corresponds to the end of the averaging period. Time stamp and variable names follow FLUXNET standards. Year (“YEAR”), day of year (“DOY”), and hour (“HOUR”), are also provided. Additionally, the yearly footprints and the single footprint including all years are published in files, containing the contour lines of the 70 %, 80 % and 90 % contribution to the footprints. The contour lines are defined by “X_” and “Y_” coordinates in EPSG:4326 (WGS84), together with the latitude and longitude coordinates in the coordinate reference system EPSG:25832. The files are named as “footprint_year_station_contribution%” for the different years and % of footprint contribution, and “station_contribution” for the single footprint run.
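As a starting point for working with the files, the conventions described above (−9999 for missing data, UTC time, a yyyymmddHHMMSS time stamp marking the end of the averaging period) can be handled as in the following Python sketch. The column name `TIMESTAMP_END`, the comma separator and the function name are assumptions made for illustration, and the sample rows are invented; the metadata sheet in the repository should be consulted for the actual names.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

MISSING = -9999.0


def read_records(handle, time_column="TIMESTAMP_END", period_minutes=30):
    """Yield (period_start, period_end, values) for each row of a station file.

    The time stamp is interpreted as the END of the averaging period in UTC.
    Missing values (-9999) are returned as None. All other columns are
    assumed to be numeric (an assumption of this sketch).
    """
    for row in csv.DictReader(handle):
        end = datetime.strptime(row.pop(time_column), "%Y%m%d%H%M%S")
        end = end.replace(tzinfo=timezone.utc)
        values = {}
        for name, text in row.items():
            number = float(text)
            values[name] = None if number == MISSING else number
        yield end - timedelta(minutes=period_minutes), end, values


# Invented sample rows standing in for a meteorological file.
sample = io.StringIO(
    "TIMESTAMP_END,TA,SW_IN\n"
    "20220101003000,3.2,0.0\n"
    "20220101010000,-9999,0.0\n"
)
for start, end, values in read_records(sample):
    print(start.isoformat(), end.isoformat(), values)
```

For the precipitation files at 1 h resolution, `period_minutes=60` would be the corresponding setting.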
Variable names and units are stored in the files called “variable_name_unit_meteo.csv” and “variable_name_unit_eddycov” for meteorological and eddy covariance data, respectively. The filled time series of H, LE and NEE are named as “variable_name_Uperc_f”. The quality flag columns of the filled data are named as “variable_name_Uperc_QC_GF”. Time series of GPP and RECO are named as “GPP_DT_Uperc”, “GPP_NT_Uperc”, “RECO_DT_Uperc” and “RECO_NT_Uperc”, for daytime and nighttime modelled GPP and RECO, respectively. The uncertainty from REddyProc gap-filling is included in the columns “NEE_Uperc_fsd”. “perc” refers to the USTAR percentile, from 2.5 % to 97.5 %, in steps of 2.5 %. The original flux density variables, prior to filtering, are named as NEE_ORIG, FC_ORIG, LE_ORIG and H_ORIG. In the dataset, the cleaned FC and SC are also available. The random errors corresponding to FC, LE and H are named as RE.FC, RE.LE and RE.H, respectively. Quality flags are named as FC_QC, LE_QC and H_QC, respectively.
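Given this naming scheme, the spread of an annual sum across USTAR scenarios can be obtained by selecting the scenario columns and taking the standard deviation of their sums. The following Python sketch illustrates this under stated assumptions: the way “perc” is written in the column names (here digits with an optional decimal point) is a guess, the function names are hypothetical, the sample standard deviation is used, and the annual sums are invented numbers.

```python
import re
import statistics


def scenario_columns(column_names, variable="NEE"):
    """Return the names matching '<variable>_U<perc>_f' (filled scenarios).

    The pattern for <perc> (digits and an optional decimal point) is an
    assumption; quality flag and '_fsd' columns are excluded.
    """
    pattern = re.compile(rf"^{re.escape(variable)}_U[0-9.]+_f$")
    return [name for name in column_names if pattern.match(name)]


def annual_sum_spread(annual_sums_by_scenario):
    """Mean and sample standard deviation of annual sums across scenarios."""
    sums = list(annual_sums_by_scenario.values())
    return statistics.mean(sums), statistics.stdev(sums)


columns = ["NEE_U2.5_f", "NEE_U50_f", "NEE_U50_QC_GF", "NEE_U50_fsd", "LE_U50_f"]
print(scenario_columns(columns))

# Invented annual NEE sums (g C m-2) for three scenarios.
print(annual_sum_spread({"NEE_U2.5_f": -100.0, "NEE_U50_f": -120.0, "NEE_U97.5_f": -110.0}))
```

The same selection could be applied to the GPP and RECO columns by adapting the pattern to their naming.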
Furthermore, land cover maps containing the distinction between tree and crop areas at the sites were included in the repository. The maps are in the format .qgz, which can be opened with any tool for geospatial analysis. At Wendhausen and Forst, the different crops in the footprint area of the stations were also outlined, and the corresponding information on the crop rotation can be found in Table A1.
All the information that was considered relevant for the interpretation and further analysis of the current dataset was included in this publication or can be found in the files in the linked repository. However, the authors might have missed some details or might not be aware of some features of the sites, which could eventually be important to perform some specific analysis. The users of the dataset are therefore encouraged to contact the authors if missing information is required.
Table A1. Sowing and harvest dates of crops, harvest dates of trees, and cutting dates for the grassland in Mariensee, throughout the project duration. No precise dates were available for the grass cut in Mariensee, but it always took place in late spring (Langhof and Swieter, 2024). In 2023, clover did not grow well in Forst due to the very wet winter conditions, therefore no harvest took place. In 2024, part of the winter wheat in Dornburg was infested by snails and it was replaced in March by summer barley. Harvest of both crops took place on the same day. In Wendhausen, there were three crops, typically summer barley, corn and winter rapeseed, rotating each year across the site. In the table, only the crop located in the main footprint area of the EC station is listed. For more information about these crops in the years 2022, 2023 and 2024, please refer to Callejas-Rodelas et al. (2025a).
Table A2. Measurement period for all stations. During the first phase of the SIGNAL project a different EC setup was running at all sites in addition to the meteorological sensors (more details in Markwitz et al., 2020a; Markwitz and Siebicke, 2019). In the table, only the starting dates of the EC setups measuring during the second and third phases of the project are displayed. The EC setup in Vechta AF was removed some months before the end of the project, but the meteorological sensors were still measuring until October 2024.
Table B1. Root mean squared error (RMSE) for evaluation of the gap-filling of meteorological data. For the variables SW_IN, PA, TA, RH, VPD, WS and WD, RMSE was calculated between modeled and test data, using a ratio of 0.8 : 0.2 for training/test data for the linear regression models. In the case of SW_OUT, NETRAD, LW_OUT and G, RMSE was calculated between predicted and measured data, after model fitting and prior to predicting all missing data, for different ratios of training to test data. RMSE was calculated as the average RMSE of the modeled data for the different split data sets in training and test data, for each variable. More details about the gap-filling can be found in Sect. 2.2.2 and 2.3.2.
Table B2. Slopes (a) and coefficients of determination (r2) of linear regression models (of the form y = ax + b) between measured and ERA5-Land data, for SW_IN, PA, TA, RH, VPD, WS and WD. The intercept b was forced to be 0 in the case of SW_IN and WS. More details about the gap-filling can be found in Sect. 2.2.2 and 2.3.2.
Figure B1. Scatter plots and linear regressions of measured meteorological vs. ERA5-Land re-analysis data, for global radiation (SW_IN, a), atmospheric pressure (PA, b), air temperature (TA, c), vapor pressure deficit (VPD, d), wind speed (WS, e) and wind direction (WD, f), at Forst AF. Data are at 30 min time resolution. The turquoise lines represent the linear regression models between ERA5-Land and measured data, and the pink/orange lines represent the reference 1-slope line with an intercept at 0.0. Slopes, r2 coefficients and intercepts of the linear models between ERA5-Land and measured data are displayed in the legends.
Figure C1. Exemplary mean diel cycle of CO2 flux density (NEE, µmol m−2 s−1) and latent heat flux density (LE, W m−2) at Dornburg. Values were calculated as the mean across all available data corresponding to each 30 min period, across all USTAR scenarios. Error bars represent the standard deviations over the 40 USTAR scenarios. Solid lines represent the AF values, dashed lines represent the OC values. Dark blue colour represents measured data, pink colour represents measured data plus data filled with REddyProc, and light orange colour represents measured data and gap-filled data with both REddyProc and XGBoost. It is important to note that because most gaps occurred in winter and nighttime, the filled data have a different distribution and thus lower magnitude.
JACR, JvR and CM performed field measurements and data analysis during phases III, II and I of the SIGNAL project, respectively. JACR compiled and processed the longer term dataset and wrote the manuscript. JvR contributed to data processing and manuscript writing. AK, LS and CM wrote the project proposal and contributed to data processing and manuscript writing. DF, MP and DB contributed to data collection through technical support during fieldwork.
The contact author has declared that none of the authors has any competing interests.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
We wish to acknowledge the funding agencies for providing the necessary funds to conduct this research, as well as the technical support in the field work received by Frank Tiedemann, Edgar Tunsch and student assistants (Bioclimatology group), and Julian Meyer (Soil Science group of Tropical and Subtropical Ecosystems) from the University of Göttingen. Further, we wish to acknowledge the contributions by Mathias Herbst to the BonaRes SIGNAL proposal and project design, the support from the team of the Micrometeorology Group at the University of Helsinki and from the Natural Resources Institute Finland (LUKE) in Helsinki, and the preparation of the lower-cost eddy covariance systems by Robert Clement and Timothy Hill from the University of Exeter. We used the library cmcrameri in Python to generate figures with color palettes proven to be suitable for people with color vision deficiencies, for which we acknowledge Crameri et al. (2020).
This research was supported by the German Federal Ministry of Education and Research (BMBF, project BonaRes, Module A, SIGNAL 031A562A, 031B0510A and 031B1063A) and the Deutsche Forschungsgemeinschaft (grant No. INST 186/1118-1 FUGG). This project also received funding from the European Union's Horizon 2020 research and innovation program under Grant Agreement No. 862695 EJP SOIL, the German Academic Exchange Service (DAAD), and the Reinhard-Süring-Foundation (RSS), affiliated to the German Meteorological Society (DMG).
This open-access publication was funded by the University of Göttingen.
This paper was edited by Tobias Gerken and reviewed by two anonymous referees.
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X.: TensorFlow: Large-scale Machine Learning on Heterogeneous Systems, https://tensorflow.org (last access: 20 September 2025), 2015. a
Aubinet, M., Feigenwinter, C., Heinesch, B., Laffineur, Q., Papale, D., Reichstein, M., Rinne, J., and Van Gorsel, E.: Nighttime Flux Correction, in: Eddy Covariance: A Practical Guide to Measurement and Data Analysis, edited by: Aubinet, M., Vesala, T., and Papale, D., Springer Netherlands, Dordrecht, ISBN 978-94-007-2350-4, https://doi.org/10.1007/978-94-007-2351-1, 2012. a, b
Baldocchi, D.: Measuring Fluxes of Trace Gases and Energy between Ecosystems and the Atmosphere – the State and Future of the Eddy Covariance Method, Global Change Biology, 20, 3600–3609, https://doi.org/10.1111/gcb.12649, 2014. a
Barr, A., Richardson, A., Hollinger, D., Papale, D., Arain, M., Black, T., Bohrer, G., Dragoni, D., Fischer, M., Gu, L., Law, B., Margolis, H., McCaughey, J., Munger, J., Oechel, W., and Schaeffer, K.: Use of Change-Point Detection for Friction–Velocity Threshold Evaluation in Eddy-Covariance Studies, Agricultural and Forest Meteorology, 171–172, 31–45, https://doi.org/10.1016/j.agrformet.2012.11.023, 2013. a
Beck, H. E., Zimmermann, N. E., McVicar, T. R., Vergopolan, N., Berg, A., and Wood, E. F.: Present and Future Köppen-Geiger Climate Classification Maps at 1-Km Resolution, Scientific Data, 5, 180214, https://doi.org/10.1038/sdata.2018.214, 2018. a
Böhm, C., Kanzler, M., and Freese, D.: Wind Speed Reductions as Influenced by Woody Hedgerows Grown for Biomass in Short Rotation Alley Cropping Systems in Germany, Agroforestry Systems, 88, 579–591, https://doi.org/10.1007/s10457-014-9700-y, 2014. a
Callejas-Rodelas, J. Á., Knohl, A., van Ramshorst, J., Mammarella, I., and Markwitz, C.: Comparison between Lower-Cost and Conventional Eddy Covariance Setups for CO2 and Evapotranspiration Measurements above Monocropping and Agroforestry Systems, Agricultural and Forest Meteorology, 354, 110086, https://doi.org/10.1016/j.agrformet.2024.110086, 2024. a, b, c, d, e, f, g
Callejas-Rodelas, J. Á., Knohl, A., Mammarella, I., Vesala, T., Peltola, O., and Markwitz, C.: Does increased spatial replication above heterogeneous agroforestry improve the representativeness of eddy covariance measurements?, Biogeosciences, 22, 4507–4529, https://doi.org/10.5194/bg-22-4507-2025, 2025a. a, b, c, d, e, f, g, h, i, j, k
Callejas Rodelas, J. Á., van Ramshorst, J., Knohl, A., Siebicke, L., and Markwitz, C.: A Multiyear Eddy Covariance and Meteorological Dataset from Five Pairs of Agroforestry and Monocropping Agroecosystems in Northern Germany, GRO.data [data set], https://doi.org/10.25625/A2Z8T8, 2025b. a, b
Cardinael, R., Chevallier, T., Barthès, B. G., Saby, N. P., Parent, T., Dupraz, C., Bernoux, M., and Chenu, C.: Impact of Alley Cropping Agroforestry on Stocks, Forms and Spatial Distribution of Soil Organic Carbon – A Case Study in a Mediterranean Context, Geoderma, 259–260, 288–299, https://doi.org/10.1016/j.geoderma.2015.06.015, 2015. a, b, c
Cardinael, R., Chevallier, T., Cambou, A., Béral, C., Barthès, B. G., Dupraz, C., Durand, C., Kouakoua, E., and Chenu, C.: Increased Soil Organic Carbon Stocks under Agroforestry: A Survey of Six Different Sites in France, Agriculture, Ecosystems & Environment, 236, 243–255, https://doi.org/10.1016/j.agee.2016.12.011, 2017. a
Cardinael, R., Cadisch, G., Gosme, M., Oelbermann, M., and Van Noordwijk, M.: Climate Change Mitigation and Adaptation in Agriculture: Why Agroforestry Should Be Part of the Solution, Agriculture, Ecosystems & Environment, 319, 107555, https://doi.org/10.1016/j.agee.2021.107555, 2021. a
Chapman, M., Walker, W. S., Cook-Patton, S. C., Ellis, P. W., Farina, M., Griscom, B. W., and Baccini, A.: Large Climate Mitigation Potential from Adding Trees to Agricultural Lands, Global Change Biology, 26, 4357–4365, https://doi.org/10.1111/gcb.15121, 2020. a, b
Chen, B., Coops, N. C., Fu, D., Margolis, H. A., Amiro, B. D., Barr, A. G., Black, T. A., Arain, M. A., Bourque, C. P.-A., Flanagan, L. B., Lafleur, P. M., McCaughey, J. H., and Wofsy, S. C.: Assessing Eddy-Covariance Flux Tower Location Bias across the Fluxnet-Canada Research Network Based on Remote Sensing and Footprint Modelling, Agricultural and Forest Meteorology, 151, 87–100, https://doi.org/10.1016/j.agrformet.2010.09.005, 2011. a
Chen, T. and Guestrin, C.: XGBoost: A Scalable Tree Boosting System, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794, ACM, San Francisco, California, USA, ISBN 978-1-4503-4232-2, https://doi.org/10.1145/2939672.2939785, 2016. a, b, c
Chinchilla-Soto, C., Durán-Quesada, A. M., Monge-Muñoz, M., and Gutiérrez-Soto, M. V.: Quantifying the Annual Cycle of Water Use Efficiency, Energy and CO2 Fluxes Using Micrometeorological and Physiological Techniques for a Coffee Field in Costa Rica, Forests, 12, 889, https://doi.org/10.3390/f12070889, 2021. a
Choe, S., Niu, D., Bischel, X., Schmidt, M., Manu, R., Corre, M. D., and Veldkamp, E.: BonaRes-SIGNAL Crop, Wood, and Leaf Biomass Production, Carbon and Nitrogen Contents in Wood and Leaf Litter (2018–2023), Leibniz Centre for Agricultural Landscape Research (ZALF), https://doi.org/10.20387/BONARES-BKJE-VB72, 2025a. a, b
Choe, S., Niu, D., Hahn, P., de Waard, R., Langhof, M., Majaura, M., Manu, R., Veldkamp, E., and Corre, M. D.: Optimising Crop Types in Temperate Alley-Cropping Agroforestry Depends on Tree Age [preprint], 2025b. a
Chollet, F., et al.: Keras, GitHub, https://github.com/fchollet/keras (last access: 20 September 2025), 2015. a
Chu, H., Baldocchi, D. D., Poindexter, C., Abraha, M., Desai, A. R., Bohrer, G., Arain, M. A., Griffis, T., Blanken, P. D., O'Halloran, T. L., Thomas, R. Q., Zhang, Q., Burns, S. P., Frank, J. M., Christian, D., Brown, S., Black, T. A., Gough, C. M., Law, B. E., Lee, X., Chen, J., Reed, D. E., Massman, W. J., Clark, K., Hatfield, J., Prueger, J., Bracho, R., Baker, J. M., and Martin, T. A.: Temporal Dynamics of Aerodynamic Canopy Height Derived From Eddy Covariance Momentum Flux Data Across North American Flux Networks, Geophysical Research Letters, 45, 9275–9287, https://doi.org/10.1029/2018GL079306, 2018. a, b
Crameri, F., Shephard, G. E., and Heron, P. J.: The Misuse of Colour in Science Communication, Nature Communications, 11, 5444, https://doi.org/10.1038/s41467-020-19160-7, 2020. a
Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., Van De Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V., Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., McNally, A. P., Monge-Sanz, B. M., Morcrette, J.-J., Park, B.-K., Peubey, C., De Rosnay, P., Tavolato, C., Thépaut, J.-N., and Vitart, F.: The ERA-Interim Reanalysis: Configuration and Performance of the Data Assimilation System, Quarterly Journal of the Royal Meteorological Society, 137, 553–597, https://doi.org/10.1002/qj.828, 2011. a
De Stefano, A. and Jacobson, M. G.: Soil Carbon Sequestration in Agroforestry Systems: A Meta-Analysis, Agroforestry Systems, https://doi.org/10.1007/s10457-017-0147-9, 2017. a
DWD: Deutscher Wetterdienst Climatological Means dataset for Germany, https://opendata.dwd.de/climate_environment/CDC/ (last access: 15 March 2025), 2025. a, b
Dyukarev, E.: Comparison of Artificial Neural Network and Regression Models for Filling Temporal Gaps of Meteorological Variables Time Series, Applied Sciences, 13, 2646, https://doi.org/10.3390/app13042646, 2023. a, b, c
El Hachimi, C., Belaqziz, S., Khabba, S., Ousanouan, Y., Sebbar, B.-E., Kharrou, M. H., and Chehbouni, A.: ClimateFiller: A Python Framework for Climate Time Series Gap-Filling and Diagnosis Based on Artificial Intelligence and Multi-Source Reanalysis Data, Software Impacts, 18, 100575, https://doi.org/10.1016/j.simpa.2023.100575, 2023. a, b
Finkelstein, P. L. and Sims, P. F.: Sampling Error in Eddy Correlation Flux Measurements, Journal of Geophysical Research: Atmospheres, 106, 3503–3509, https://doi.org/10.1029/2000JD900731, 2001. a, b
Foken, T., Göckede, M., Mauder, M., Mahrt, L., Amiro, B., and Munger, W.: Post-Field Data Quality Control, in: Handbook of Micrometeorology, edited by: Lee, X., Massman, W., and Law, B., Vol. 29, 181–208, Kluwer Academic Publishers, Dordrecht, ISBN 978-1-4020-2264-7, https://doi.org/10.1007/1-4020-2265-4_9, 2005. a, b
Friedman, J. H.: Greedy Function Approximation: A Gradient Boosting Machine, The Annals of Statistics, 29, https://doi.org/10.1214/aos/1013203451, 2001. a
Göckede, M., Rebmann, C., and Foken, T.: A Combination of Quality Assessment Tools for Eddy Covariance Measurements with Footprint Modelling for the Characterisation of Complex Sites, Agricultural and Forest Meteorology, 127, 175–188, https://doi.org/10.1016/j.agrformet.2004.07.012, 2004. a
Gómez-Delgado, F., Roupsard, O., le Maire, G., Taugourdeau, S., Pérez, A., van Oijen, M., Vaast, P., Rapidel, B., Harmand, J. M., Voltz, M., Bonnefond, J. M., Imbach, P., and Moussa, R.: Modelling the hydrological behaviour of a coffee agroforestry basin in Costa Rica, Hydrology and Earth System Sciences, 15, 369–392, https://doi.org/10.5194/hess-15-369-2011, 2011. a
Gupta, S. R., Dagar, J. C., and Teketay, D.: Agroforestry for Rehabilitation of Degraded Landscapes: Achieving Livelihood and Environmental Security, in: Agroforestry for Degraded Landscapes, edited by: Dagar, J. C., Gupta, S. R., and Teketay, D., 23–68, Springer Singapore, Singapore, ISBN 978-981-15-4135-3, https://doi.org/10.1007/978-981-15-4136-0_2, 2020. a
Haiden, T., Sandu, I., Balsamo, G., Arduini, G., and Beljaars, A.: Addressing Biases in Near-Surface Forecasts, ECMWF Newsletter, https://doi.org/10.21957/ENG71D53TH, 2018. a
Hatfield, J. L. and Dold, C.: Water-Use Efficiency: Advances and Challenges in a Changing Climate, Frontiers in Plant Science, 10, 103, https://doi.org/10.3389/fpls.2019.00103, 2019. a
Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., De Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 Global Reanalysis, Quarterly Journal of the Royal Meteorological Society, 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020. a, b, c
Hill, T., Chocholek, M., and Clement, R.: The Case for Increasing the Statistical Power of Eddy Covariance Ecosystem Studies: Why, Where and How?, Global Change Biology, 23, 2154–2165, https://doi.org/10.1111/gcb.13547, 2017. a, b, c
Howlett, D. S., Moreno, G., Mosquera Losada, M. R., Nair, P. K. R., and Nair, V. D.: Soil Carbon Storage as Influenced by Tree Cover in the Dehesa Cork Oak Silvopasture of Central-Western Spain, Journal of Environmental Monitoring, 13, 1897, https://doi.org/10.1039/c1em10059a, 2011. a
Kanzler, M., Böhm, C., and Freese, D.: The Development of Soil Organic Carbon under Young Black Locust (Robinia pseudoacacia L.) Trees at a Post-Mining Landscape in Eastern Germany, New Forests, 52, 47–68, https://doi.org/10.1007/s11056-020-09779-1, 2021. a
Kay, S., Rega, C., Moreno, G., Den Herder, M., Palma, J. H., Borek, R., Crous-Duran, J., Freese, D., Giannitsopoulos, M., Graves, A., Jäger, M., Lamersdorf, N., Memedemin, D., Mosquera-Losada, R., Pantera, A., Paracchini, M. L., Paris, P., Roces-Díaz, J. V., Rolo, V., Rosati, A., Sandor, M., Smith, J., Szerencsits, E., Varga, A., Viaud, V., Wawer, R., Burgess, P. J., and Herzog, F.: Agroforestry Creates Carbon Sinks Whilst Enhancing the Environment in Agricultural Landscapes in Europe, Land Use Policy, 83, 581–593, https://doi.org/10.1016/j.landusepol.2019.02.025, 2019. a, b
Kessomkiat, W., Franssen, H.-J. H., Graf, A., and Vereecken, H.: Estimating Random Errors of Eddy Covariance Data: An Extended Two-Tower Approach, Agricultural and Forest Meteorology, 171–172, 203–219, https://doi.org/10.1016/j.agrformet.2012.11.019, 2013. a
Kljun, N., Calanca, P., Rotach, M. W., and Schmid, H. P.: A simple two-dimensional parameterisation for Flux Footprint Prediction (FFP), Geoscientific Model Development, 8, 3695–3713, https://doi.org/10.5194/gmd-8-3695-2015, 2015. a, b
Langhof, M. and Swieter, A.: Five Years of Grassland Yield and Quality Assessment in a Temperate Short-Rotation Alley Cropping Agroforestry System, Agroforestry Systems, 98, 933–937, https://doi.org/10.1007/s10457-024-00963-2, 2024. a, b
Lasslop, G., Reichstein, M., Papale, D., Richardson, A. D., Arneth, A., Barr, A., Stoy, P., and Wohlfahrt, G.: Separation of Net Ecosystem Exchange into Assimilation and Respiration Using a Light Response Curve Approach: Critical Issues and Global Evaluation, Global Change Biology, 16, 187–208, https://doi.org/10.1111/j.1365-2486.2009.02041.x, 2010. a
Leys, C., Ley, C., Klein, O., Bernard, P., and Licata, L.: Detecting Outliers: Do Not Use Standard Deviation around the Mean, Use Absolute Deviation around the Median, Journal of Experimental Social Psychology, 49, 764–766, https://doi.org/10.1016/j.jesp.2013.03.013, 2013. a
Lipson, M., Grimmond, S., Best, M., Chow, W. T. L., Christen, A., Chrysoulakis, N., Coutts, A., Crawford, B., Earl, S., Evans, J., Fortuniak, K., Heusinkveld, B. G., Hong, J.-W., Hong, J., Järvi, L., Jo, S., Kim, Y.-H., Kotthaus, S., Lee, K., Masson, V., McFadden, J. P., Michels, O., Pawlak, W., Roth, M., Sugawara, H., Tapper, N., Velasco, E., and Ward, H. C.: Harmonized gap-filled datasets from 20 urban flux tower sites, Earth System Science Data, 14, 5157–5178, https://doi.org/10.5194/essd-14-5157-2022, 2022. a, b
Liu, Y., Lucas, B., Bergl, D. D., and Richardson, A. D.: Robust Filling of Extra-Long Gaps in Eddy Covariance CO2 Flux Measurements from a Temperate Deciduous Forest Using eXtreme Gradient Boosting, Agricultural and Forest Meteorology, 364, 110438, https://doi.org/10.1016/j.agrformet.2025.110438, 2025. a, b, c
Lucarini, A., Cascio, M. L., Marras, S., Sirca, C., and Spano, D.: Artificial Intelligence and Eddy Covariance: A Review, Science of The Total Environment, 950, 175406, https://doi.org/10.1016/j.scitotenv.2024.175406, 2024. a
Lucas-Moffat, A. M., Schrader, F., Herbst, M., and Brümmer, C.: Multiple Gap-Filling for Eddy Covariance Datasets, Agricultural and Forest Meteorology, 325, 109114, https://doi.org/10.1016/j.agrformet.2022.109114, 2022. a, b
Mahabbati, A., Beringer, J., Leopold, M., McHugh, I., Cleverly, J., Isaac, P., and Izady, A.: A comparison of gap-filling algorithms for eddy covariance fluxes and their drivers, Geoscientific Instrumentation, Methods and Data Systems, 10, 123–140, https://doi.org/10.5194/gi-10-123-2021, 2021. a
Mammarella, I., Launiainen, S., Gronholm, T., Keronen, P., Pumpanen, J., Rannik, Ü., and Vesala, T.: Relative Humidity Effect on the High-Frequency Attenuation of Water Vapor Flux Measured by a Closed-Path Eddy Covariance System, Journal of Atmospheric and Oceanic Technology, 26, 1856–1866, https://doi.org/10.1175/2009JTECHA1179.1, 2009. a, b
Mammarella, I., Peltola, O., Nordbo, A., Järvi, L., and Rannik, Ü.: Quantifying the uncertainty of eddy covariance fluxes due to the use of different software packages and combinations of processing steps in two contrasting ecosystems, Atmospheric Measurement Techniques, 9, 4915–4933, https://doi.org/10.5194/amt-9-4915-2016, 2016. a, b
Markwitz, C. and Siebicke, L.: Low-cost eddy covariance: a case study of evapotranspiration over agroforestry in Germany, Atmospheric Measurement Techniques, 12, 4677–4696, https://doi.org/10.5194/amt-12-4677-2019, 2019. a, b, c, d, e
Markwitz, C., Knohl, A., and Siebicke, L.: Evapotranspiration over agroforestry sites in Germany, Biogeosciences, 17, 5183–5208, https://doi.org/10.5194/bg-17-5183-2020, 2020a. a, b, c, d, e
Markwitz, C., Knohl, A., and Siebicke, L.: Data Set Supporting Journal Article: Markwitz, C., Knohl, A. and Siebicke, L.: “Evapotranspiration over Agroforestry Sites in Germany”, Biogeosciences, Zenodo [data set], https://doi.org/10.5281/zenodo.4038398, 2020b. a
Massman, W. and Lee, X.: Eddy Covariance Flux Corrections and Uncertainties in Long-Term Studies of Carbon and Energy Exchanges, Agricultural and Forest Meteorology, 113, 121–144, https://doi.org/10.1016/S0168-1923(02)00105-3, 2002. a
Mauder, M., Cuntz, M., Drüe, C., Graf, A., Rebmann, C., Schmid, H. P., Schmidt, M., and Steinbrecher, R.: A Strategy for Quality and Uncertainty Assessment of Long-Term Eddy-Covariance Measurements, Agricultural and Forest Meteorology, 169, 122–135, https://doi.org/10.1016/j.agrformet.2012.09.006, 2013. a, b
Moffat, A. M., Papale, D., Reichstein, M., Hollinger, D. Y., Richardson, A. D., Barr, A. G., Beckstein, C., Braswell, B. H., Churkina, G., Desai, A. R., Falge, E., Gove, J. H., Heimann, M., Hui, D., Jarvis, A. J., Kattge, J., Noormets, A., and Stauch, V. J.: Comprehensive Comparison of Gap-Filling Techniques for Eddy Covariance Net Carbon Fluxes, Agricultural and Forest Meteorology, 147, 209–232, https://doi.org/10.1016/j.agrformet.2007.08.011, 2007. a
Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., Boussetta, S., Choulga, M., Harrigan, S., Hersbach, H., Martens, B., Miralles, D. G., Piles, M., Rodríguez-Fernández, N. J., Zsoter, E., Buontempo, C., and Thépaut, J.-N.: ERA5-Land: a state-of-the-art global reanalysis dataset for land applications, Earth System Science Data, 13, 4349–4383, https://doi.org/10.5194/essd-13-4349-2021, 2021. a, b
Murray, F. W.: On the Computation of Saturation Vapor Pressure, Journal of Applied Meteorology, 6, 203–204, https://doi.org/10.1175/1520-0450(1967)006<0203:OTCOSV>2.0.CO;2, 1967. a
Nair, P. K. R.: Classification of Agroforestry Systems, Agroforestry Systems, 3, 97–128, https://doi.org/10.1007/bf00122638, 1985. a
Ong, C., Wilson, J., Deans, J., Mulayta, J., Raussen, T., and Wajja-Musukwe, N.: Tree–Crop Interactions: Manipulation of Water Use and Root Function, Agricultural Water Management, 53, 171–186, https://doi.org/10.1016/S0378-3774(01)00163-9, 2002. a
Pancholi, R., Yadav, R., Gupta, H., Vasure, N., Choudhary, S., Singh, M. N., and Rastogi, M.: The Role of Agroforestry Systems in Enhancing Climate Resilience and Sustainability – A Review, International Journal of Environment and Climate Change, 13, 4342–4353, https://doi.org/10.9734/IJECC/2023/v13i113615, 2023. a, b
Papale, D., Reichstein, M., Aubinet, M., Canfora, E., Bernhofer, C., Kutsch, W., Longdoz, B., Rambal, S., Valentini, R., Vesala, T., and Yakir, D.: Towards a standardized processing of Net Ecosystem Exchange measured with eddy covariance technique: algorithms and uncertainty estimation, Biogeosciences, 3, 571–583, https://doi.org/10.5194/bg-3-571-2006, 2006. a, b, c
Pastorello, G., Trotta, C., Canfora, E., Chu, H., Christianson, D., Cheah, Y.-W., Poindexter, C., Chen, J., Elbashandy, A., Humphrey, M., Isaac, P., Polidori, D., Reichstein, M., Ribeca, A., Van Ingen, C., Vuichard, N., Zhang, L., Amiro, B., Ammann, C., Arain, M. A., Ardö, J., Arkebauer, T., Arndt, S. K., Arriga, N., Aubinet, M., Aurela, M., Baldocchi, D., Barr, A., Beamesderfer, E., Marchesini, L. B., Bergeron, O., Beringer, J., Bernhofer, C., Berveiller, D., Billesbach, D., Black, T. A., Blanken, P. D., Bohrer, G., Boike, J., Bolstad, P. V., Bonal, D., Bonnefond, J.-M., Bowling, D. R., Bracho, R., Brodeur, J., Brümmer, C., Buchmann, N., Burban, B., Burns, S. P., Buysse, P., Cale, P., Cavagna, M., Cellier, P., Chen, S., Chini, I., Christensen, T. R., Cleverly, J., Collalti, A., Consalvo, C., Cook, B. D., Cook, D., Coursolle, C., Cremonese, E., Curtis, P. S., D'Andrea, E., Da Rocha, H., Dai, X., Davis, K. J., Cinti, B. D., Grandcourt, A. D., Ligne, A. D., De Oliveira, R. C., Delpierre, N., Desai, A. R., Di Bella, C. M., Tommasi, P. D., Dolman, H., Domingo, F., Dong, G., Dore, S., Duce, P., Dufrêne, E., Dunn, A., Dušek, J., Eamus, D., Eichelmann, U., ElKhidir, H. A. M., Eugster, W., Ewenz, C. M., Ewers, B., Famulari, D., Fares, S., Feigenwinter, I., Feitz, A., Fensholt, R., Filippa, G., Fischer, M., Frank, J., Galvagno, M., Gharun, M., Gianelle, D., Gielen, B., Gioli, B., Gitelson, A., Goded, I., Goeckede, M., Goldstein, A. H., Gough, C. M., Goulden, M. L., Graf, A., Griebel, A., Gruening, C., Grünwald, T., Hammerle, A., Han, S., Han, X., Hansen, B. U., Hanson, C., Hatakka, J., He, Y., Hehn, M., Heinesch, B., Hinko-Najera, N., Hörtnagl, L., Hutley, L., Ibrom, A., Ikawa, H., Jackowicz-Korczynski, M., Janouš, D., Jans, W., Jassal, R., Jiang, S., Kato, T., Khomik, M., Klatt, J., Knohl, A., Knox, S., Kobayashi, H., Koerber, G., Kolle, O., Kosugi, Y., Kotani, A., Kowalski, A., Kruijt, B., Kurbatova, J., Kutsch, W. L., Kwon, H., Launiainen, S., Laurila, T., Law, B., Leuning, R., Li, Y., Liddell, M., Limousin, J.-M., Lion, M., Liska, A. J., Lohila, A., López-Ballesteros, A., López-Blanco, E., Loubet, B., Loustau, D., Lucas-Moffat, A., Lüers, J., Ma, S., Macfarlane, C., Magliulo, V., Maier, R., Mammarella, I., Manca, G., Marcolla, B., Margolis, H. A., Marras, S., Massman, W., Mastepanov, M., Matamala, R., Matthes, J. H., Mazzenga, F., McCaughey, H., McHugh, I., McMillan, A. M. S., Merbold, L., Meyer, W., Meyers, T., Miller, S. D., Minerbi, S., Moderow, U., Monson, R. K., Montagnani, L., Moore, C. E., Moors, E., Moreaux, V., Moureaux, C., Munger, J. W., Nakai, T., Neirynck, J., Nesic, Z., Nicolini, G., Noormets, A., Northwood, M., Nosetto, M., Nouvellon, Y., Novick, K., Oechel, W., Olesen, J. E., Ourcival, J.-M., Papuga, S. A., Parmentier, F.-J., Paul-Limoges, E., Pavelka, M., Peichl, M., Pendall, E., Phillips, R. P., Pilegaard, K., Pirk, N., Posse, G., Powell, T., Prasse, H., Prober, S. M., Rambal, S., Rannik, Ü., Raz-Yaseef, N., Rebmann, C., Reed, D., Dios, V. R. D., Restrepo-Coupe, N., Reverter, B. R., Roland, M., Sabbatini, S., Sachs, T., Saleska, S. R., Sánchez-Cañete, E. P., Sanchez-Mejia, Z. M., Schmid, H. P., Schmidt, M., Schneider, K., Schrader, F., Schroder, I., Scott, R. L., Sedlák, P., Serrano-Ortíz, P., Shao, C., Shi, P., Shironya, I., Siebicke, L., Šigut, L., Silberstein, R., Sirca, C., Spano, D., Steinbrecher, R., Stevens, R. M., Sturtevant, C., Suyker, A., Tagesson, T., Takanashi, S., Tang, Y., Tapper, N., Thom, J., Tomassucci, M., Tuovinen, J.-P., Urbanski, S., Valentini, R., Van Der Molen, M., Van Gorsel, E., Van Huissteden, K., Varlagin, A., Verfaillie, J., Vesala, T., Vincke, C., Vitale, D., Vygodskaya, N., Walker, J. P., Walter-Shea, E., Wang, H., Weber, R., Westermann, S., Wille, C., Wofsy, S., Wohlfahrt, G., Wolf, S., Woodgate, W., Li, Y., Zampedri, R., Zhang, J., Zhou, G., Zona, D., Agarwal, D., Biraud, S., Torn, M., and Papale, D.: The FLUXNET2015 Dataset and the ONEFlux Processing Pipeline for Eddy Covariance Data, Scientific Data, 7, 225, https://doi.org/10.1038/s41597-020-0534-3, 2020. a, b, c, d, e, f, g, h, i, j, k, l, m
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Müller, A., Nothman, J., Louppe, G., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, É.: Scikit-Learn: Machine Learning in Python, arXiv, https://doi.org/10.48550/arxiv.1201.0490, 2012. a
Peichl, M., Thevathasan, N. V., Gordon, A. M., Huss, J., and Abohassan, R. A.: Carbon Sequestration Potentials in Temperate Tree-Based Intercropping Systems, Southern Ontario, Canada, Agroforestry Systems, 66, 243–257, https://doi.org/10.1007/s10457-005-0361-8, 2006. a, b
Quinkenstein, A., Wöllecke, J., Böhm, C., Grünewald, H., Freese, D., Schneider, B. U., and Hüttl, R. F.: Ecological Benefits of the Alley Cropping Agroforestry System in Sensitive Regions of Europe, Environmental Science & Policy, 12, 1112–1121, https://doi.org/10.1016/j.envsci.2009.08.008, 2009. a
Rannik, Ü. and Vesala, T.: Autoregressive Filtering versus Linear Detrending in Estimation of Fluxes by the Eddy Covariance Method, Boundary-Layer Meteorology, 91, 259–280, https://doi.org/10.1023/A:1001840416858, 1999. a, b
Rebmann, C., Göckede, M., Foken, T., Aubinet, M., Aurela, M., Berbigier, P., Bernhofer, C., Buchmann, N., Carrara, A., Cescatti, A., Ceulemans, R., Clement, R., Elbers, J. A., Granier, A., Grünwald, T., Guyon, D., Havránková, K., Heinesch, B., Knohl, A., Laurila, T., Longdoz, B., Marcolla, B., Markkanen, T., Miglietta, F., Moncrieff, J., Montagnani, L., Moors, E., Nardino, M., Ourcival, J.-M., Rambal, S., Rannik, Ü., Rotenberg, E., Sedlak, P., Unterhuber, G., Vesala, T., and Yakir, D.: Quality Analysis Applied on Eddy Covariance Measurements at Complex Forest Sites Using Footprint Modelling, Theoretical and Applied Climatology, 80, 121–141, https://doi.org/10.1007/s00704-004-0095-y, 2005. a
Rebmann, C., Aubinet, M., Schmid, H., Arriga, N., Aurela, M., Burba, G., Clement, R., De Ligne, A., Fratini, G., Gielen, B., Grace, J., Graf, A., Gross, P., Haapanala, S., Herbst, M., Hörtnagl, L., Ibrom, A., Joly, L., Kljun, N., Kolle, O., Kowalski, A., Lindroth, A., Loustau, D., Mammarella, I., Mauder, M., Merbold, L., Metzger, S., Mölder, M., Montagnani, L., Papale, D., Pavelka, M., Peichl, M., Roland, M., Serrano-Ortiz, P., Siebicke, L., Steinbrecher, R., Tuovinen, J.-P., Vesala, T., Wohlfahrt, G., and Franz, D.: ICOS Eddy Covariance Flux-Station Site Setup: A Review, International Agrophysics, 32, 471–494, https://doi.org/10.1515/intag-2017-0044, 2018. a, b
Reichstein, M., Falge, E., Baldocchi, D., Papale, D., Aubinet, M., Berbigier, P., Bernhofer, C., Buchmann, N., Gilmanov, T., Granier, A., Grünwald, T., Havránková, K., Ilvesniemi, H., Janous, D., Knohl, A., Laurila, T., Lohila, A., Loustau, D., Matteucci, G., Meyers, T., Miglietta, F., Ourcival, J.-M., Pumpanen, J., Rambal, S., Rotenberg, E., Sanz, M., Tenhunen, J., Seufert, G., Vaccari, F., Vesala, T., Yakir, D., and Valentini, R.: On the Separation of Net Ecosystem Exchange into Assimilation and Ecosystem Respiration: Review and Improved Algorithm, Global Change Biology, 11, 1424–1439, https://doi.org/10.1111/j.1365-2486.2005.001002.x, 2005. a, b, c, d, e
Richardson, A. D. and Hollinger, D. Y.: A Method to Estimate the Additional Uncertainty in Gap-Filled NEE Resulting from Long Gaps in the CO2 Flux Record, Agricultural and Forest Meteorology, 147, 199–208, https://doi.org/10.1016/j.agrformet.2007.06.004, 2007. a, b, c, d, e, f
Richardson, A. D., Mahecha, M. D., Falge, E., Kattge, J., Moffat, A. M., Papale, D., Reichstein, M., Stauch, V. J., Braswell, B. H., Churkina, G., Kruijt, B., and Hollinger, D. Y.: Statistical Properties of Random CO2 Flux Measurement Uncertainty Inferred from Model Residuals, Agricultural and Forest Meteorology, 148, 38–50, https://doi.org/10.1016/j.agrformet.2007.09.001, 2008. a
Sabbatini, S., Mammarella, I., Arriga, N., Fratini, G., Graf, A., Hörtnagl, L., Ibrom, A., Longdoz, B., Mauder, M., Merbold, L., Metzger, S., Montagnani, L., Pitacco, A., Rebmann, C., Sedlák, P., Šigut, L., Vitale, D., and Papale, D.: Eddy Covariance Raw Data Processing for CO2 and Energy Fluxes Calculation at ICOS Ecosystem Stations, International Agrophysics, 32, 495–515, https://doi.org/10.1515/intag-2017-0043, 2018. a, b, c, d, e
Satish, P., Madiwalar, A. F., Lallawmkimi, M. C., Reddy, K. J., Parveen, S., P, A., Laxman, T., Kiruba, M., and Anand, G.: Agroforestry: Multifunctional Benefits and Implementation Strategies, Journal of Geography, Environment and Earth Science International, 28, 1–12, https://doi.org/10.9734/jgeesi/2024/v28i10821, 2024. a
Schimel, D., Pavlick, R., Fisher, J. B., Asner, G. P., Saatchi, S., Townsend, P., Miller, C., Frankenberg, C., Hibbard, K., and Cox, P.: Observing Terrestrial Ecosystems and the Carbon Cycle from Space, Global Change Biology, 21, 1762–1776, https://doi.org/10.1111/gcb.12822, 2015. a
Schmidt, M., Corre, M. D., Kim, B., Morley, J., Göbel, L., Sharma, A. S. I., Setriuc, S., and Veldkamp, E.: Nutrient Saturation of Crop Monocultures and Agroforestry Indicated by Nutrient Response Efficiency, Nutrient Cycling in Agroecosystems, 119, 69–82, https://doi.org/10.1007/s10705-020-10113-6, 2021. a
Shao, G., Martinson, G. O., Corre, M. D., Luo, J., Niu, D., and Veldkamp, E.: Conversion of Cropland Monoculture to Agroforestry Increases Methane Uptake, Agronomy for Sustainable Development, 45, 1, https://doi.org/10.1007/s13593-024-00997-x, 2025. a, b
Singh, V., Johar, V., Kumar, R., and Chaudhary, M.: Socio-Economic and Environmental Assets Sustainability by Agroforestry Systems: A Review, International Journal of Agriculture, Environment and Biotechnology, 14, 521–533, https://doi.org/10.30954/0974-1712.04.2021.6, 2021. a
Sprenkle-Hyppolite, S., Griscom, B., Griffey, V., Munshi, E., and Chapman, M.: Maximizing Tree Carbon in Croplands and Grazing Lands While Sustaining Yields, Carbon Balance and Management, 19, 23, https://doi.org/10.1186/s13021-024-00268-y, 2024. a
Swieter, A. and Langhof, M.: Cropland Agroforestry 2017 and 2018, BonaRes Data Centre (Leibniz Centre for Agricultural Landscape Research (ZALF)), https://doi.org/10.20387/BONARES-AGDC-M0DM, 2020. a
Tikkasalo, O.-P., Peltola, O., Alekseychik, P., Heikkinen, J., Launiainen, S., Lehtonen, A., Li, Q., Martínez-García, E., Peltoniemi, M., Salovaara, P., Tuominen, V., and Mäkipää, R.: Eddy-covariance fluxes of CO2, CH4 and N2O in a drained peatland forest after clear-cutting, Biogeosciences, 22, 1277–1300, https://doi.org/10.5194/bg-22-1277-2025, 2025. a, b, c
Tramontana, G., Jung, M., Schwalm, C. R., Ichii, K., Camps-Valls, G., Ráduly, B., Reichstein, M., Arain, M. A., Cescatti, A., Kiely, G., Merbold, L., Serrano-Ortiz, P., Sickert, S., Wolf, S., and Papale, D.: Predicting carbon dioxide and energy fluxes across global FLUXNET sites with regression algorithms, Biogeosciences, 13, 4291–4313, https://doi.org/10.5194/bg-13-4291-2016, 2016. a
van Gorsel, E., Delpierre, N., Leuning, R., Black, A., Munger, J. W., Wofsy, S., Aubinet, M., Feigenwinter, C., Beringer, J., Bonal, D., Chen, B., Chen, J., Clement, R., Davis, K. J., Desai, A. R., Dragoni, D., Etzold, S., Grünwald, T., Gu, L., Heinesch, B., Hutyra, L. R., Jans, W. W., Kutsch, W., Law, B., Leclerc, M. Y., Mammarella, I., Montagnani, L., Noormets, A., Rebmann, C., and Wharton, S.: Estimating Nocturnal Ecosystem Respiration from the Vertical Turbulent Flux and Change in Storage of CO2, Agricultural and Forest Meteorology, 149, 1919–1930, https://doi.org/10.1016/j.agrformet.2009.06.020, 2009. a
van Ramshorst, J. G. V., Knohl, A., Callejas-Rodelas, J. Á., Clement, R., Hill, T. C., Siebicke, L., and Markwitz, C.: Lower-cost eddy covariance for CO2 and H2O fluxes over grassland and agroforestry, Atmospheric Measurement Techniques, 17, 6047–6071, https://doi.org/10.5194/amt-17-6047-2024, 2024. a, b, c, d, e
van Ramshorst, J. G. V., Callejas-Rodelas, J. Á., Knohl, A., and Markwitz, C.: Comparison of Ecosystem-Scale Carbon Fluxes at Agroforestry and Adjacent Monocropping Sites in Germany, Agroforestry Systems, 99, 147, https://doi.org/10.1007/s10457-025-01244-2, 2025. a, b, c, d
Vekuri, H., Tuovinen, J.-P., Kulmala, L., Papale, D., Kolari, P., Aurela, M., Laurila, T., Liski, J., and Lohila, A.: A Widely-Used Eddy Covariance Gap-Filling Method Creates Systematic Bias in Carbon Balance Estimates, Scientific Reports, 13, 1720, https://doi.org/10.1038/s41598-023-28827-2, 2023. a, b, c, d, e, f, g, h, i
Veldkamp, E., Schmidt, M., Markwitz, C., Beule, L., Beuschel, R., Biertümpfel, A., Bischel, X., Duan, X., Gerjets, R., Göbel, L., Graß, R., Guerra, V., Heinlein, F., Komainda, M., Langhof, M., Luo, J., Potthoff, M., Van Ramshorst, J. G. V., Rudolf, C., Seserman, D.-M., Shao, G., Siebicke, L., Svoboda, N., Swieter, A., Carminati, A., Freese, D., Graf, T., Greef, J. M., Isselstein, J., Jansen, M., Karlovsky, P., Knohl, A., Lamersdorf, N., Priesack, E., Wachendorf, C., Wachendorf, M., and Corre, M. D.: Multifunctionality of Temperate Alley-Cropping Agroforestry Outperforms Open Cropland and Grassland, Communications Earth & Environment, 4, 20, https://doi.org/10.1038/s43247-023-00680-1, 2023. a, b, c
Vuichard, N. and Papale, D.: Filling the gaps in meteorological continuous data measured at FLUXNET sites with ERA-Interim reanalysis, Earth System Science Data, 7, 157–171, https://doi.org/10.5194/essd-7-157-2015, 2015. a, b, c, d, e
Ward, P., Micin, S., and Fillery, I.: Application of Eddy Covariance to Determine Ecosystem-Scale Carbon Balance and Evapotranspiration in an Agroforestry System, Agricultural and Forest Meteorology, 152, 178–188, https://doi.org/10.1016/j.agrformet.2011.09.016, 2012. a
Wilczak, J. M., Oncley, S. P., and Stage, S. A.: Sonic Anemometer Tilt Correction Algorithms, Boundary-Layer Meteorology, 99, 127–150, https://doi.org/10.1023/A:1018966204465, 2001. a
Winck, B. R., Bloor, J. M. G., and Klumpp, K.: Eighteen Years of Upland Grassland Carbon Flux Data: Reference Datasets, Processing, and Gap-Filling Procedure, Scientific Data, 10, 311, https://doi.org/10.1038/s41597-023-02221-z, 2023. a, b
Wutzler, T., Lucas-Moffat, A., Migliavacca, M., Knauer, J., Sickel, K., Šigut, L., Menzer, O., and Reichstein, M.: Basic and extensible post-processing of eddy covariance flux data with REddyProc, Biogeosciences, 15, 5015–5030, https://doi.org/10.5194/bg-15-5015-2018, 2018. a, b, c, d
Zhu, S., Clement, R., McCalmont, J., Davies, C. A., and Hill, T.: Stable Gap-Filling for Longer Eddy Covariance Data Gaps: A Globally Validated Machine-Learning Approach for Carbon Dioxide, Water, and Energy Fluxes, Agricultural and Forest Meteorology, 314, 108777, https://doi.org/10.1016/j.agrformet.2021.108777, 2022. a
Zomer, R. J., Neufeldt, H., Xu, J., Ahrends, A., Bossio, D., Trabucco, A., Van Noordwijk, M., and Wang, M.: Global Tree Cover and Biomass Carbon on Agricultural Land: The Contribution of Agroforestry to Global and National Carbon Budgets, Scientific Reports, 6, 29987, https://doi.org/10.1038/srep29987, 2016. a