Interactions between the biosphere and the atmosphere can be well
characterized by fluxes between the two. In particular, carbon and energy
fluxes play a major role in understanding biogeochemical processes on an
ecosystem level or global scale. However, the fluxes can only be measured at
individual sites, e.g., by eddy covariance towers, and an upscaling of these
local observations is required to analyze global patterns. Previous work
focused on upscaling monthly, 8-day, or daily average values, and global maps
for each flux have been provided accordingly. In this paper, we raise the
upscaling of carbon and energy fluxes between land and atmosphere to the next
level by increasing the temporal resolution to subdaily timescales. We
provide continuous half-hourly fluxes for the period from 2001 to 2014 at

Understanding the coupling of the atmosphere and the biosphere is key to
understanding Earth system dynamics and ultimately to predicting future
trajectories based on dynamic and fully coupled Earth system models

Note that an alternative would be to use the term “prediction” here. However, in the climate community “prediction” is typically used for future scenarios, while in machine learning the application domain could also be at ungauged spatial locations

) step, the model is applied to large spatial domains where only gridded estimates of the drivers are available. Machine learning techniques are very effective here since they are fully data-adaptive, do not require initial assumptions on functional relationships, and can cope with nonlinear dependencies. One of the first upscaling papers by

There exist further upscaling approaches in the literature based on support
vector regression models

Upscaling flux tower measurements represents a “bottom-up” approach whereas
the “top-down” atmospheric CO

Today, global flux products feature, at best, a daily temporal resolution as
presented by

Furthermore, there is a need for a global data product of half-hourly fluxes.
Such a data product would allow for characterizing subdaily variations in the
diurnal cycles at places where no towers are currently installed. Please note
that we use the term

Characterizing typical subdaily flux patterns is critically needed for
certain satellite remote sensing applications. For example, the
interpretation of satellite retrievals of sun-induced fluorescence as proxy
for photosynthesis

In view of the need for global high-frequency flux data, we aim at increasing
the temporal resolution of data-driven carbon and energy flux products to
subdaily timescales by estimating half-hourly values at global
scale. We tackle the problem of
predicting diurnal cycles with half-hourly values globally for both carbon
and energy fluxes between biosphere and atmosphere by treating the upscaling
task as a large-scale regression problem. From the machine learning
perspective, the random forest regression framework serves as a basis for our
computations due to its good performance and suitable scaling properties with
respect to large data sets. We test two approaches for estimating half-hourly
GPP with random forest models and evaluate both of them using a
leave-one-site-out cross-validation strategy for a large set of FLUXNET
sites. We produce derived global products with

The remainder of the paper is organized as follows. First, we introduce the
data used in our study by
describing both site-level and global forcing data
(Sect.

In this section, we briefly describe the two data sources used in our
study. For learning the relationships between predictor variables and the
target fluxes as well as for the cross-validation experiments, we make use of
site-level data extracted at FLUXNET sites that are equipped with eddy
covariance towers (Sect.

Fluxes at half-hourly resolution are currently only measured by eddy
covariance instruments, which provide local measurements; spatial coverage
can so far only be extended by
deploying those instruments on globally distributed towers. Based on
these in situ observations, we aim at predicting half-hourly fluxes globally
and therefore also rely on the data obtained by the eddy covariance method at
different sites. The eddy covariance method

As predictor variables, we use the ones selected by

In order to compute the global flux products at half-hourly resolution via
upscaling, we require the predictor variables mentioned in the previous
section at global scale, i.e.,
the variables of the RS

Data from
CRUNCEPv6 have been obtained via personal correspondence with Nicolas Viovy (email:

Ensemble methods are powerful machine learning tools that combine the outputs
of many individual prediction models to obtain more accurate estimations for
a target variable. The random forest approach

Besides the work of

Given a training set

General structure of a decision tree for regression: binary splits
with thresholds for individual predictor variables will be used to navigate a
sample

Starting at node 1 in Fig.

Usually, multiple stopping criteria are tested, and if one
of them is fulfilled, the current node is not split but becomes a leaf node
that stores a final output prediction. Learning a decision tree therefore
consists of computing split parameters until only leaf nodes remain that are
not split any further (Fig.
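The search for split parameters at a single node can be illustrated with a minimal sketch. It assumes that a split is scored by the summed squared error of the two child nodes, which is a common choice for regression trees but is not confirmed by the text above; the function name `best_split` and the toy data are hypothetical.

```python
def best_split(samples, targets):
    """Return (feature_index, threshold, error) of the best binary split.

    Assumption: a split is scored by the summed squared error of the
    targets around the mean of each child node (lower is better).
    """
    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for j in range(len(samples[0])):
        # Candidate thresholds: midpoints between consecutive distinct values.
        values = sorted(set(x[j] for x in samples))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [t for x, t in zip(samples, targets) if x[j] <= threshold]
            right = [t for x, t in zip(samples, targets) if x[j] > threshold]
            error = sse(left) + sse(right)
            if best is None or error < best[2]:
                best = (j, threshold, error)
    return best


# Toy data: two predictor variables, four samples.
X = [[1.0, 10.0], [2.0, 10.0], [3.0, 20.0], [4.0, 20.0]]
y = [0.0, 0.0, 5.0, 5.0]
split = best_split(X, y)
```

A full tree learner would apply this search recursively to the two resulting subsets until a stopping criterion turns a node into a leaf.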

To reduce overfitting to the training set, the learning process is carried
out in a stochastic manner by introducing several types of randomization.
Whenever split parameters need to be identified, only a random subset of the

In his work about random forests,

Predicting the output

However, the number of trees

The problem of upscaling diurnal cycles of carbon and energy fluxes can be
formulated as a large-scale regression task, i.e., estimating half-hourly
fluxes for every grid cell of the globe based on a set of predictor
variables. These predictor variables typically encode climate conditions or
Earth observations obtained from remote sensing at the corresponding spatial
positions. However, the temporal resolutions of variables can be different,
not only between the target flux (half-hourly) and a predictor variable
(e.g., daily) but also among different predictor variables (e.g., daily and
half-hourly). Therefore, two
prediction approaches for upscaling diurnal cycles are presented in
Sect.

Visualization of the first prediction approach: an individual RDF regression model is learned for each half hour of the day using only training data from that half hour. Here, the predictions of half-hourly fluxes for a single day are visualized. The predictor variables are passed to the individual RDF regression models, as indicated by the arrows above the RDF models. Each RDF model computes an output for the corresponding half hour, which is shown by the arrows below the RDF models, such that the diurnal cycle is estimated by combining 48 predictions from 48 different regression models. Note that this approach allows diurnal cycles to be predicted from predictors with daily resolution alone (by ignoring the vertical colored bars in the upper part). However, predictors with half-hourly resolution can also be incorporated (by adding the corresponding half-hourly values indicated by the vertical colored bars).
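The structure of this first approach can be sketched in a few lines. The sketch is only illustrative: `MeanRegressor` is a hypothetical placeholder that stands in for an RDF regression model, and the toy data and function names are assumptions rather than part of the actual processing chain.

```python
from collections import defaultdict


class MeanRegressor:
    """Hypothetical stand-in for an RDF model: predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_half_hourly_models(samples, make_model=MeanRegressor):
    """Learn one model per half hour from (half_hour, predictors, flux) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for half_hour, predictors, flux in samples:
        grouped[half_hour][0].append(predictors)
        grouped[half_hour][1].append(flux)
    return {h: make_model().fit(X, y) for h, (X, y) in grouped.items()}


def predict_diurnal_cycle(models, daily_predictors):
    """Pass the same daily predictors to every half-hourly model."""
    return [models[h].predict([daily_predictors])[0] for h in sorted(models)]


# Toy training set: two days with a flux that peaks at midday (half hour 24).
training = []
for day in range(2):
    for h in range(48):
        flux = max(0.0, 10.0 - abs(h - 24)) + day
        training.append((h, [15.0 + day], flux))

models = fit_half_hourly_models(training)
cycle = predict_diurnal_cycle(models, [15.5])
```

Even though the predictor passed to all 48 models is identical, the estimated cycle varies over the day because each model has been learned from the targets of its own half hour.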

Recall from the beginning of Sect.

Even if one uses only predictor variables of daily temporal resolution, which
can be treated as constant for the whole day, different values of the target
flux for different half hours of the day can be estimated. The reason is that
the 48 different RDF models are learned with different values for the target
output variable

Visualization of the second prediction approach: a single RDF
regression model is able to predict the flux at every half hour of the day if
at least one predictor variable has a half-hourly temporal resolution (such
as the potential radiation

In contrast to the first prediction approach, the second approach only uses a
single regression model that is able to estimate different values for
different half hours of the same day. This requires that the
distinction between these half hours be encoded in the predictor
variables, which is not the case if only predictors of daily resolution are
incorporated. Therefore, this approach requires at least one predictor
variable at half-hourly temporal resolution. Fortunately, the potential
radiation (

In addition, besides the potential radiation, its first-order temporal
derivative can also be incorporated as an additional half-hourly predictor.
This allows for a distinction between
morning and afternoon via the sign of the derivative as well as for the
distinction between day and night. The latter is achieved because
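The role of the derivative can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the radiation profile is a synthetic half sine rather than true potential radiation, the derivative is a central finite difference, and half hours are labeled as night when both the radiation and its derivative are (numerically) zero.

```python
import math


def synthetic_potential_radiation(n_steps=48, sunrise=12, sunset=36, peak=1000.0):
    """Hypothetical half-sine daytime profile in W m^-2, zero at night."""
    series = []
    for i in range(n_steps):
        if sunrise <= i <= sunset:
            phase = math.pi * (i - sunrise) / (sunset - sunrise)
            series.append(peak * math.sin(phase))
        else:
            series.append(0.0)
    return series


def temporal_derivative(series):
    """Central differences per half hour, treating the day as periodic."""
    n = len(series)
    return [(series[(i + 1) % n] - series[i - 1]) / 2.0 for i in range(n)]


def label_half_hours(radiation, derivative, eps=1e-9):
    """Assumed rule: night if both inputs vanish, otherwise use the sign."""
    labels = []
    for r, d in zip(radiation, derivative):
        if r <= eps and abs(d) <= eps:
            labels.append("night")
        elif d > 0:
            labels.append("morning")
        else:
            labels.append("afternoon")
    return labels


radiation = synthetic_potential_radiation()
derivative = temporal_derivative(radiation)
labels = label_half_hours(radiation, derivative)
```

A regression model that receives both values as predictors can, in principle, separate the rising and falling parts of the diurnal cycle even when the radiation values themselves are identical.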

Although meteorological variables such as air temperature or vapor pressure deficit (VPD) as well as incoming radiation are also potential candidates for predictors that encode subdaily variations in the fluxes, they are currently only available with a half-hourly resolution at individual sites, e.g., as measured at eddy covariance towers. Because half-hourly meteorological data products are missing at a global scale, it is not possible to use this information for the global upscaling. However, since we are interested in whether such data products could be beneficial for upscaling diurnal cycles, we use the corresponding site-level data in our cross-validation analysis to gain further insights. Hence, meteorological variables measured at the eddy covariance towers of FLUXNET can still be used for validating the upscaling approaches, and evaluations of cross-validation experiments are presented in the next section.

The global products presented in this paper cover diurnal cycles of four
fluxes: GPP, NEE, LE, and H. For each of these fluxes, we have consistently
performed cross-validation experiments but the results presented in the
following only consider GPP as a running example. We have decided to apply
RDF models for regression due to their efficient training and testing
algorithms, even in the case of large-scale data, as well as their good performance
for upscaling daily mean values of GPP

The Nash-Sutcliffe modeling efficiency, from now on simply called modeling
efficiency, has been introduced by Nash and Sutcliffe in the context of river
flow forecasting but it is often also used as an evaluation criterion in
other applications that involve the prediction of variables, especially in
related upscaling tasks
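For reference, the standard definition of the Nash-Sutcliffe modeling efficiency is one minus the ratio of the squared prediction error to the squared deviation of the observations from their mean. A minimal sketch of this standard definition follows; the function name and the toy values are hypothetical.

```python
def modeling_efficiency(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - sum((o - p)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


obs = [1.0, 2.0, 3.0, 4.0]
perfect = modeling_efficiency(obs, obs)          # perfect predictions
mean_only = modeling_efficiency(obs, [2.5] * 4)  # always predicting the mean
```

A value of 1 indicates perfect predictions, a value of 0 means the model is no better than predicting the mean of the observations, and negative values indicate predictions worse than that mean.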

The motivation for the leave-one-site-out evaluation as a special case of cross-validation is twofold. First, we want to evaluate regression models that have been learned from as many observations as possible and based on training sets that are most similar to the training set that will be used to compute the global products, which will incorporate all the available data from all FLUXNET sites. Second, we intend to mimic a realistic scenario most similar to the upscaling task by predicting fluxes at locations from which no training data have been taken. As a consequence, we predict fluxes at one FLUXNET site using a regression model learned with all observations from all the remaining FLUXNET sites. After doing this for each individual site, we concatenate all site-specific predictions to form a long vector of predictions that can be compared to the corresponding observations measured at the corresponding sites. This allows for a general evaluation of the prediction approaches in a site-independent manner.
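The leave-one-site-out procedure described above can be sketched as follows. The learner is passed in as a pair of functions; the mean-based learner used in the example is only a hypothetical placeholder for the RDF models, and the site identifiers and values are made up.

```python
def leave_one_site_out(samples, fit, predict):
    """Cross-validate over sites.

    samples: list of (site_id, predictors, flux) tuples.
    Returns the concatenated observations and predictions of all held-out sites.
    """
    sites = sorted({site for site, _, _ in samples})
    observed, predicted = [], []
    for held_out in sites:
        train = [(x, y) for site, x, y in samples if site != held_out]
        test = [(x, y) for site, x, y in samples if site == held_out]
        model = fit([x for x, _ in train], [y for _, y in train])
        predicted.extend(predict(model, [x for x, _ in test]))
        observed.extend(y for _, y in test)
    return observed, predicted


def fit_mean(X, y):
    """Placeholder learner: the 'model' is just the mean training target."""
    return sum(y) / len(y)


def predict_mean(model, X):
    return [model for _ in X]


data = [("A", [1.0], 2.0), ("A", [2.0], 4.0), ("B", [1.0], 6.0), ("C", [3.0], 8.0)]
obs, pred = leave_one_site_out(data, fit_mean, predict_mean)
```

The two returned vectors can then be compared with any evaluation measure, since every prediction stems from a model that has never seen data from the corresponding site.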

We start with a short overview of the experiments that have been conducted in
order to clarify our ideas and motivations behind them. In
Sect.

Prediction performances (

In the following, we compare the results of our presented prediction
approaches for half-hourly GPP depending on different sets of predictor
variables, which have been obtained by using the leave-one-site-out strategy
explained in the beginning of Sect.

In Fig.

On the one hand, half-hourly

Fingerprint plots of half-hourly GPP fluxes estimated for US-SO2 in
2004 with leave-one-site-out cross-validation which show that short-term
fluctuations on subdaily timescales are captured better when half-hourly
meteorological predictors have also been included

Some example sites with average diurnal cycles for different months comparing two prediction approaches with the observations.

To further highlight the difference in the predictions when half-hourly
meteorology is encoded in the driver variables, we visualize all half-hourly
estimations over one year at a specific site using fingerprint plots. A
fingerprint in this context is a plot with 365 rows corresponding to 365 days
of a year and 48 columns corresponding to 48 half hours of each day such that
one fingerprint contains all half-hourly values of a whole year and shows
characteristic patterns for the selected site, e.g., length of the growing
season. In Fig.
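Arranging a year of half-hourly values as a fingerprint amounts to a simple reshaping step, sketched here with a hypothetical function name and a synthetic series.

```python
def to_fingerprint(half_hourly_values, steps_per_day=48):
    """Reshape a flat series into rows of days and columns of half hours."""
    if len(half_hourly_values) % steps_per_day != 0:
        raise ValueError("series length must be a multiple of steps_per_day")
    return [
        half_hourly_values[i:i + steps_per_day]
        for i in range(0, len(half_hourly_values), steps_per_day)
    ]


# Synthetic year: the value equals the index of the half hour within the day.
series = [float(i % 48) for i in range(365 * 48)]
fingerprint = to_fingerprint(series)
```

The resulting list of rows can be passed to any image plotting routine to obtain the 365 by 48 fingerprint.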

Modeling efficiencies (and RMSE in

Prediction performances for monthly average diurnal cycles of GPP
are shown in the same way as the accuracies for all half-hourly values in
Fig.

For visual inspection purposes, it is useful to look at average diurnal
cycles for individual months at specific sites. Example plots are shown in
Fig.

Average diurnal cycles of two sites showing the problems with seasonal droughts. The error in the prediction of the half-hourly fluxes increases during hot and dry summers for both sites, FR-Pue and IT-Cpz.

In fact, modeling efficiencies for monthly average diurnal cycles increase on
average across all sites to a range between

However, the average diurnal cycles can also be used to identify potential
problems of the predictions. In Fig.

Comparison between leave-one-site-out

To find out whether site-specific information is
insufficiently represented in the predictors, we have conducted two
auxiliary experiments. In the first experiment, we additionally estimate
GPP fluxes at each site in a leave-one-month-out setup and compare the
resulting predictions with those of the leave-one-site-out setup. For the
leave-one-month-out estimations, we learn and test regression models for each
month at each site separately. Furthermore, each regression model for each
month is only learned with data from the same
site but measured in different months (and years). Hence, the regression
models are highly site-specific, since only correspondences between predictor
variables and GPP fluxes at a single site are used and predictions are made
at the same site but in a different time period. As a result, we have
observed improved flux estimations, as shown for an example in
Fig.

Improvements in the initial estimations

This table also contains the prediction performances obtained from a second
experiment, in which we have used the daily GPP as an additional daily
predictor for our regression models in the leave-one-site-out setup. Of
course, this is only possible in the cross-validation analysis where we
actually have the daily averages of GPP, but the following evaluation reveals
interesting insights. Using the daily average GPP basically incorporates
information about the amplitudes of the diurnal cycles, hence drought effects
of reduced productivity can directly be observed in this additional predictor
variable. First of all, it can be seen in
Fig.

Comparing modeling efficiencies (and RMSE in

From this experiment, we can conclude that the problems in predicting diurnal cycles of GPP are mainly caused by inaccurate estimates of the daily mean GPP. If the daily mean is given, predictions of half-hourly values are much more accurate. Hence, the main problems for the upscaling of half-hourly fluxes are not related to producing the right shapes of the diurnal courses, but turn out to be problems of estimating the correct amplitudes. These are then the same problems as for upscaling daily average values (or fluxes at coarser timescales) and are not introduced by the step of going to a finer temporal resolution of half hours.

In this section, we briefly summarize the main findings from our
cross-validation experiments. First, the choice between the two proposed
prediction approaches makes little difference, since
prediction performances hardly differed between the single model approach and the
individual model approach. We
prefer to use the single model approach, because it seems to be more
plausible from a physical perspective to make distinctions between half hours
of a day by the information encoded in the predictor variables and
half-hourly

While the previous sections validate the presented prediction approaches and
point to potential problems in the estimation of half-hourly fluxes, we also
decided to produce the first global products of half-hourly GPP and
NEE,
as well as LE and H that will be described in the next section. So far, the
analyses have shown that best predictions are obtained by incorporating
meteorological variables at half-hourly resolution, but such data products
are not available at a global scale. Therefore, we have computed the global
products only based on the daily predictors of the RS

Furthermore, we have decided to use the second prediction approach
(Sect.

Prediction performances in terms of modeling efficiency (and RMSE in

In Table

For each of the four fluxes (GPP, NEE, LE, H), we have learned a single
regression model for all half hours based on all available half-hourly values
of the corresponding flux at the 222 FLUXNET sites listed in
Appendix

In addition to the provided half-hourly data, we also offer derived products
containing the monthly average diurnal cycles of the four fluxes for the 14
years that are covered by the half-hourly product. For the potential user of
the data, it will be much more convenient to directly obtain the monthly
average diurnal cycles compared to downloading the much larger half-hourly
data product and computing the monthly averages afterwards. Furthermore, the
monthly average diurnal cycles are more robust, which has also been shown by
larger modeling efficiencies in the experimental evaluations, e.g., as listed
in Table
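Computing a monthly average diurnal cycle from half-hourly values is a per-half-hour average over all days of the month. A minimal sketch follows; the function name and the synthetic month are hypothetical, and the sketch assumes gap-free input.

```python
def monthly_mean_diurnal_cycle(days):
    """Average each half hour over all days of a month.

    days: list of equally long lists, one list of half-hourly values per day.
    """
    return [sum(values) / len(values) for values in zip(*days)]


# Synthetic month of 30 days in which every value grows by 1 from day to day.
month = [[float(h + d) for h in range(48)] for d in range(30)]
cycle = monthly_mean_diurnal_cycle(month)
```

The result contains 48 values per month and grid cell instead of 48 values per day, which explains the much smaller size of the derived product.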

The global maps show estimated values for half-hourly
GPP

Maximum diurnal amplitudes of GPP within a month are shown for June
2014

Cutouts of the global products are visualized in
Fig.

Besides the fingerprint plots summarizing a whole year of half-hourly values
for a specific location, it is also possible to compute diurnal amplitudes
for each grid cell from the global products. We again use GPP as an
example for all the fluxes and determine maximum diurnal amplitudes within
each month. In Fig.
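A possible computation of this statistic for one grid cell is sketched below. It assumes that the diurnal amplitude of a day is the difference between its largest and smallest half-hourly value, which may differ from the definition used for the actual products; the function name and values are hypothetical.

```python
def max_diurnal_amplitude(days):
    """Largest daily (max - min) range among all days of a month.

    Assumption: the diurnal amplitude of a day is its maximum minus its minimum.
    """
    return max(max(day) - min(day) for day in days)


# Three synthetic days with ranges of 8.0, 12.5, and 10.0.
month = [
    [0.0, 3.0, 8.0, 2.0],
    [-0.5, 6.0, 12.0, 1.0],
    [0.0, 5.0, 10.0, 0.0],
]
amplitude = max_diurnal_amplitude(month)
```

Such a quantity cannot be derived from daily or monthly mean products, because it depends on the spread of values within each day.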

Furthermore, we are interested in the maximum flux at each spatial
position. These statistics have been calculated across all years from 2001 to
2014 to produce a single map of maximum half-hourly values for each flux. The
results are shown in Fig.

Maximum half-hourly values of GPP

Finally, we want to compare our global product of NEE with an ensemble of
atmospheric CO

Comparison of our upscaled global NEE fluxes (red lines) with an
ensemble of atmospheric CO

The upper panel in Fig.

For some regions, such as Southern Africa and South American Temperate, mismatches between the inversions and the flux tower upscaling might also be due to contributions of fire emissions which are “seen” by the atmosphere but not in our approach. Overall, the patterns of the MSC in most of the regions are very similar to the results of the atmospheric inversions. This is remarkable given that the two approaches and data sources are entirely independent. Thus, our upscaling product has the potential to provide further constraints for the atmospheric inversion methods with the benefit of high resolution in both space and time.

Distributions of correlation coefficients for the mean seasonal cycles in the 11 TransCom regions comparing either individual ensemble members from the set of atmospheric inversions or each ensemble member with our upscaled NEE product.

We use the atmospheric inversions as an independent benchmark here, even though
a number of uncertainties also apply to those. To put the agreement of the
upscaling with the inversions into context of agreement among inversions, we
display the values of pairwise correlation coefficients for the MSC of the
different inversion methods together with the correlation coefficients
between the MSC of each inversion approach and our upscaling product for each
TransCom land region in Fig.

The calculated global half-hourly flux products are publicly available for
free at

Please check

Please note that all data files of GPP can contain slightly negative values, which seems to be implausible at first glance. However, these negative values mainly occur during nighttime and are the result of an artifact in the flux partitioning method at site level carried out for the FLUXNET eddy covariance tower network, where observed NEE is separated into GPP and ecosystem respiration. The negative values from the sites are part of the training set for the proposed upscaling approach, and therefore the machine learning model can produce negative GPP values for similar environmental conditions as well. Since our data products are obtained by an entirely data-driven machine learning approach, the observational error at site level (that also causes negative nighttime GPP at the sites) propagates to global scale. Hence, dealing with negative GPP observations is not only a problem in our global data products but also occurs when working with site-level data. Neglecting negative values at the sites during model learning or manually setting them to zero would lead to biased regression models and setting negative estimations to zero would cause biased predictions. We therefore decided to keep negative values both in the training set and in our provided global data products. If these negative values are causing trouble within any application that builds on our data products, they can easily be set to zero by the user as an appropriate post-processing step. However, the user should keep in mind that this leads to an overall bias within the data product.
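The bias introduced by setting negative values to zero can be seen in a small numerical illustration. The nighttime values below are made up for this purpose and scatter around zero.

```python
# Made-up nighttime GPP estimates scattering around zero.
night_gpp = [-0.4, 0.3, -0.1, 0.2, -0.3, 0.3]

mean_raw = sum(night_gpp) / len(night_gpp)

# Post-processing step: set all negative values to zero.
clipped = [max(0.0, value) for value in night_gpp]
mean_clipped = sum(clipped) / len(clipped)
```

The raw values average to approximately zero, whereas the clipped values average to a clearly positive number, so aggregated nighttime GPP would be overestimated after clipping.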

The data products are stored as individual files for each variable and each
year that has been considered. We have chosen the platform-independent
NetCDF

In this paper, we have shown how to perform an upscaling of half-hourly carbon and energy fluxes from local in situ measurements to global scale. We have introduced two general prediction approaches to estimate half-hourly values mainly from predictor variables at coarser temporal resolution. Since the problem has been formulated as a large-scale regression task, we have been working with random forest regression, although other regression algorithms could be applied as well. Our prediction approaches have been validated by a set of cross-validation experiments employing a leave-one-site-out strategy for the FLUXNET towers that provide the observations. As a result of our analyses, we have presented global flux products at half-hourly temporal resolution for the years 2001 to 2014 covering four important variables: gross primary production, net ecosystem exchange, latent heat, and sensible heat. Detailed descriptions of the experimental setup for the cross-validation as well as for the computations that have led to the global products were given as well. Concerning the global data products, we have also shown derived statistics like maximum diurnal amplitudes of a month as well as maximum half-hourly fluxes at each spatial position. These properties can only be computed from data products with subdaily temporal resolution showing the benefits of our contributions.

In future work, we aim at improving the prediction performance of half-hourly
fluxes in various ways. First, we plan to add additional sources of
information to the drivers by extending the set of predictor variables to
cover further relevant aspects for the individual fluxes like water
availability or soil properties. This would allow for tackling difficult
scenarios like seasonal droughts, where the current approaches have shown
larger errors in the prediction. Second, we also want to incorporate the
history of the predictor variables in order to account for lagged effects. So
far, samples are treated independently in the prediction but their temporal
context due to the time series characteristics may provide additional
knowledge that can be exploited for the estimation of fluxes. Third, subdaily
meteorology could be included in the calculations of the global products by
incorporating the new generation of meteorological reanalysis data of ERA5 at an hourly
timescale that will be released in the near future or by exploiting
observations from geostationary satellites. Of course, the global products
will be updated if these additional ideas lead to better prediction
performances. Another important aspect of future work is providing uncertainties
for the flux estimations, which could be done by quantile regression
approaches

Since our machine learning models for the upscaling tasks depend on in situ
measurements from FLUXNET towers that are required to create the training
set, it is necessary to look at the spatial distribution of these towers to
judge the adequacy of the global data products. In
Fig.

The superimposed FLUXNET tower locations of the sites used in this paper clearly show the biased distribution of underlying in situ measurements due to better spatial coverage of regions in Europe and North America compared to the rest of the world.

In Sect.

The distribution of gaps in the flux data of 129 456 site days that we have used from 222 FLUXNET towers clearly shows a nighttime dominance of the gaps. In total, there are roughly 35 % of gaps in this half-hourly flux data.

One can clearly observe a nighttime dominance of the gaps. For GPP, this is not a big problem, because nighttime GPP is assumed to be zero anyway. Considering NEE, the absolute fluxes are also smaller during night compared to daytime observations. The nighttime dominance of gaps arises from less turbulence during these hours and this is an inherent problem of the measurement devices that we cannot resolve. However, it should be noted that such a biased distribution of gaps does not directly lead to a model bias, as would be the case, for example, for linear methods. Since we have picked random forests as a nonlinear machine learning technique, our derived models are less biased for imbalanced data because the final estimations in the leaf nodes of the decision trees are made locally in predictor space by considering mean values from samples that fall into the respective leaf node. Hence, they are independent of samples that are far away in predictor space but could potentially have higher or lower density.

In addition, we have also carried out preliminary experiments in which we only used site days with no gaps, i.e., where all 48 half-hourly values were available. This massively reduced the overall number of training samples and clearly reduced prediction performance, most likely due to worse generalization abilities because the reduced training data did not capture all environmental conditions sufficiently well.

In Table

This table has been reproduced from Table 2
of

In this study, we made use of data from 222 FLUXNET sites that are equipped
with eddy covariance towers. We would like to thank all the data providers of
these sites for their hard work in collecting, filtering, and processing the
raw data as well as for sharing the data with the community. In
Table

This is a list of 222 FLUXNET sites from which we have used data in our study.


PB and MJ designed the experiments and PB carried them out, which also involved the incorporation of ideas and suggestions from MM and MR. FG contributed to the technical implementation. Evaluating the experiments as well as preparing the presentation of results was done by PB, with additional input from MJ, MM, and MR. PB wrote the manuscript with contributions from all co-authors.

The authors declare that they have no conflict of interest.

The work presented in this paper is part of the project “Detecting changes in essential ecosystem and biodiversity properties – towards a Biosphere Atmosphere Change Index: BACI”. This project has received funding from the European Union's Horizon 2020 Research and Innovation programme under grant agreement no. 640176. Furthermore, this work used eddy covariance data acquired by the FLUXNET community and in particular by the following networks: AmeriFlux (US Department of Energy, Biological and Environmental Research, Terrestrial Carbon Program, DE-FG02-04ER63917 and DE-FG02-04ER63911), AfriFlux, AsiaFlux, CarboAfrica, CarboEuropeIP, CarboItaly, CarboMont, ChinaFlux, Fluxnet-Canada (supported by CFCAS, NSERC, BIOCAP, Environment Canada, and NRCan), GreenGrass, KoFlux, LBA, NECC, OzFlux, TCOS-Siberia, and USCCC. We acknowledge the financial support to the eddy covariance data harmonization provided by CarboEuropeIP, FAO-GTOS-TCO, iLEAPS, Max Planck Institute for Biogeochemistry, National Science Foundation, University of Tuscia, Université Laval and Environment Canada, and the US Department of Energy, as well as the database development and technical support from the Berkeley Water Center, Lawrence Berkeley National Laboratory, Microsoft Research eScience, Oak Ridge National Laboratory, University of California – Berkeley, and the University of Virginia.

Edited by: Vinayak Sinha

Reviewed by: Ronald Prinn and one anonymous referee