the Creative Commons Attribution 4.0 License.
Harmonized gap-filled datasets from 20 urban flux tower sites
Sue Grimmond
Martin Best
Winston T. L. Chow
Andreas Christen
Nektarios Chrysoulakis
Andrew Coutts
Ben Crawford
Stevan Earl
Jonathan Evans
Krzysztof Fortuniak
Bert G. Heusinkveld
Je-Woo Hong
Jinkyu Hong
Leena Järvi
Sungsoo Jo
Yeon-Hee Kim
Simone Kotthaus
Keunmin Lee
Valéry Masson
Joseph P. McFadden
Oliver Michels
Wlodzimierz Pawlak
Matthias Roth
Hirofumi Sugawara
Nigel Tapper
Erik Velasco
Helen Claire Ward
- Final revised paper (published on 22 Nov 2022)
- Preprint (discussion started on 03 Jun 2022)
Interactive discussion
Status: closed
RC1: 'Comment on essd-2022-65', Anonymous Referee #1, 30 Jul 2022
Lipson et al. present a unique harmonized dataset from 20 urban flux towers. The manuscript is clearly written overall, and the data files are adequately documented. I agree with the authors that the dataset is timely and critical, and that it can be of interest to many research areas and applications. However, I have a few minor comments that I suggest the authors consider.
1. Most parts of the manuscript focus on quality control, bias correction, and gap-filling of the meteorological variables. It was unclear what was done specifically for the flux variables (e.g., latent, sensible, momentum) until I opened the files and examined the data. I think the flux variables are quality-controlled and filtered but not filled. First, I suggest the authors make this clear in the main text. Second, I wonder why the flux variables are not gap-filled. With net radiation, wind velocity, and roughness data available, it should be feasible to fill the flux variables. Doing so could facilitate use of the dataset, especially for those who need gap-free data and those who need to aggregate to a coarser scale (e.g., daily, monthly).
2. The proposed approach for bias correction seems robust and more flexible in capturing the diurnal variation than the linear regression approach. But, I think it requires enough data (e.g., multiple years) to generate a robust ‘representative’ curve, i.e., to smooth out the fluctuations and fill periods when no data are available for a specific hour of a day. Therefore, I suggest the authors add a brief description and justification of the method.
3. Lines 393-395: Please be specific. Do all or any of the sites adjust for daylight saving time? Please consider providing the information if it is site-specific.
4. Table 5: Consider providing some information on the spatial extents of each parameter. Some are based on tower location; some are averaged over the target or larger area.
5. Table 6: Please provide the approximate spatial extent (e.g., in meters) for those using ‘fpm’.
6. Data files: Please add a brief description of the data file structures (e.g., number of header lines, missing values), especially for those time series text files.
Citation: https://doi.org/10.5194/essd-2022-65-RC1
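As an illustration of comment 1 above, the following minimal Python sketch shows why unfilled flux gaps complicate aggregation to daily means. The flux values, the `daily_mean` helper, and the `min_coverage` threshold are all hypothetical; none of them is taken from the dataset or the manuscript.

```python
from statistics import mean

def daily_mean(values, min_coverage=1.0):
    """Mean of the available samples, or None when too few are present.

    values: one day of flux samples, with None marking a gap.
    min_coverage: required fraction of non-missing samples (assumed threshold).
    """
    if not values:
        return None
    available = [v for v in values if v is not None]
    if len(available) / len(values) < min_coverage:
        return None
    return mean(available)

# Hypothetical hourly sensible heat flux (W m-2): night, day, night.
complete = [-10.0] * 6 + [100.0] * 12 + [-10.0] * 6
# The same day with 8 of the 12 night-time samples removed by quality control.
gappy = [None] * 6 + [100.0] * 12 + [-10.0] * 4 + [None] * 2

true_mean = daily_mean(complete)                     # all 24 samples present
strict_mean = daily_mean(gappy)                      # rejected: coverage below 100 %
lenient_mean = daily_mean(gappy, min_coverage=0.5)   # accepted, but skewed towards daytime
```

Because the gaps in this made-up day fall at night, the lenient mean is weighted towards daytime values and overestimates the complete-day mean, while the strict setting returns no value at all. Either outcome is what a gap-filled flux product would help users avoid.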
RC2: 'Comment on essd-2022-65', Anonymous Referee #2, 06 Sep 2022
The paper describes a dataset of eddy covariance data collected at 20 urban sites spread across Australia, Eastern Asia, Europe, and North America. The quality control and gap-filling methods applied to the data are described, as are extended versions of the climatic drivers based on reanalysis data harmonized with site data. This dataset is a unique and important contribution to eddy covariance, micrometeorology, and urban climate research.
The argument of using the dataset for climate models is somewhat limited by the fact that climate models mostly cannot take full advantage of datasets like this yet. However, the data presented here being available helps reduce the barriers for models to better take advantage of these data. The bias correction specific to urban environments is an innovative contribution, helping understand the mismatch between in-situ and larger scale representations of these sites; in particular, the hourly corrections are important for analyses of short duration and very localized phenomena, which are not always well represented in these types of datasets.
One of the main limitations related to this dataset is that it was not made available within any of the many research networks that could have offered it a home, but rather via a general-purpose platform, Zenodo. Most notably, FLUXNET offers a clear path for community-created datasets, with one such dataset having been described in this very journal: see Delwiche et al. 2021, ESSD. Publishing the dataset in this manner not only makes it harder to find, but also makes it inherently less compatible with, and less comparable to, similar datasets, increasing the difficulty of combined analyses. Being useful for models is a desirable characteristic of a dataset, as is being easy to use in synthesis-type analyses.
Although the methods presented in the paper for gap-filling the data using different bias corrections are very interesting, the comparison to alternative methods is not exactly a fair one. The ERA5 reanalysis dataset and, to some extent, WFDE5 are not intended to be used as single-point data streams, so a direct comparison of the proposed methods against them should not be interpreted as showing that the proposed method is an improvement over these products. The comparison to the FLUXNET2015 method, on the other hand, is very appropriate and led to interesting results. There are clear improvements over the linear method used for FLUXNET2015, which are clearly shown in the description. However, there are also variables/metrics that seem not to do quite as well, e.g., short-wave radiation and, to some degree, the wind variables. Exploring why these variables have not seen the same level of improvement would be very interesting indeed, perhaps leading to some form of hybrid approach.
Finally, some form of summary of the data might have been of interest in a paper like this. For example, a plot showing the carbon or energy fluxes through time for all 20 sites, maybe in a summary form? Or maybe fluxes grouped by climate or latitude or land cover-type? Maybe simple scatter plots of all data (or groups) of fluxes against driving variables (temperature or precipitation). Any of these would help highlight the coverage and usefulness of the dataset, which is quite clearly significant for someone who is familiar with the data, but maybe not as much to someone with more limited exposure to the domain.
A few more specific details come next.
In the QC method presented, step 2 (setting nocturnal shortwave radiation to zero) might seem like a reasonable approach, since there is clearly no such thing as negative radiation. However, this might also inadvertently introduce a bias to the corrections: the sensor has a finite accuracy, which might lead to negative radiation values at night, and artificially forcing them to zero will reduce the impact of the sensor’s accuracy limitations on the bias corrections that come later. Perhaps it is indeed better to have this step, but this limitation should at least be clearly stated.
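The concern can be made concrete with a minimal Python sketch. The readings and the `zero_nocturnal` helper are hypothetical, and the sketch is not the authors' QC code. It only shows that zeroing night-time values removes any trace of a sensor offset from the data passed to later steps.

```python
from statistics import mean

def zero_nocturnal(sw_down, is_night):
    """Set shortwave values to zero wherever the night flag is True."""
    return [0.0 if night else value for value, night in zip(sw_down, is_night)]

# Hypothetical night-time readings (W m-2) from a sensor with a small negative offset.
raw_night = [-2.0, -1.0, -1.5, -2.5, -0.5]
zeroed = zero_nocturnal(raw_night, [True] * len(raw_night))

raw_offset = mean(raw_night)    # still reflects the sensor offset
zeroed_offset = mean(zeroed)    # the offset is no longer visible
```

Any comparison against a reference product made after this step sees a night-time mean of exactly zero, so the instrument offset cannot inform the subsequent bias correction.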
Still in the QC method, now for step 4, would it be necessary to adjust the number of standard deviations and/or window size for each variable? Wind varies in a very different way than atmospheric pressure, for example.
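A minimal Python sketch of what variable-specific settings could look like is given below. It assumes a generic centred-window sigma filter, which may differ from the filter used in the manuscript. The `QC_SETTINGS` values, the `flag_outliers` helper, and the sample series are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical per-variable settings; the numbers are illustrative only.
QC_SETTINGS = {
    "wind_speed": {"n_sigma": 5.0, "window": 9},
    "pressure": {"n_sigma": 3.0, "window": 9},
}

def flag_outliers(values, n_sigma, window):
    """Flag values far from the mean of their neighbours in a centred window.

    A value is flagged when it lies more than n_sigma standard deviations
    from the mean of the other samples in the window around it.
    """
    half = window // 2
    flags = []
    for i, value in enumerate(values):
        neighbours = values[max(0, i - half):i] + values[i + 1:i + half + 1]
        if len(neighbours) < 2:
            flags.append(False)
            continue
        spread = stdev(neighbours)
        flags.append(spread > 0 and abs(value - mean(neighbours)) > n_sigma * spread)
    return flags

# Hypothetical samples: a pressure spike (hPa) and a wind gust (m s-1).
pressure = [1013.0, 1013.2, 1012.9, 1013.1, 1020.0, 1013.0, 1012.8, 1013.1, 1013.2]
wind = [2.0, 3.5, 1.5, 4.0, 6.5, 2.5, 3.0, 1.8, 3.6]

pressure_flags = flag_outliers(pressure, **QC_SETTINGS["pressure"])
wind_flags_strict = flag_outliers(wind, n_sigma=3.0, window=9)
wind_flags_loose = flag_outliers(wind, **QC_SETTINGS["wind_speed"])
```

In this made-up example the pressure spike is flagged at 3 standard deviations, whereas the wind gust is flagged at 3 but retained at 5. This is the kind of variable-dependent behaviour the question is about.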
A little more detail could have been given as to how the values for Table 4 were obtained, and also how they are used. The information currently listed in the paper is somewhat limited.
In the hourly and daily corrections, section 2.6.1, step 4 involves extending one year's worth of data by duplicating that year before and after; wouldn’t this approach artificially create “jumps” towards the beginning and end of the dataset? Wouldn’t it be better to use the actual years before and after the current year whenever available?
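The possible "jump" can be sketched in Python as follows. The series is synthetic (a seasonal cycle plus an assumed slow drift), and the helpers `tile_year` and `largest_step` are hypothetical. The sketch does not reproduce the manuscript's procedure, only the effect of joining copies of a single year end to end.

```python
import math

def tile_year(series):
    """Repeat one year of values before and after itself."""
    return series + series + series

def largest_step(series):
    """Largest absolute difference between consecutive values."""
    return max(abs(b - a) for a, b in zip(series, series[1:]))

# Hypothetical daily bias: a seasonal cycle plus a slow drift over the year.
year = [2.0 * math.cos(2 * math.pi * d / 365) + 0.01 * d for d in range(365)]

within_year_step = largest_step(year)        # typical day-to-day change
seam_step = abs(year[0] - year[-1])          # change where one copy meets the next
tiled_step = largest_step(tile_year(year))   # largest change in the tiled series
```

With a purely periodic cycle the seam would be nearly smooth. In this synthetic case it is the drift between the start and end of the year that produces the discontinuity, which is why using the actual neighbouring years, where available, could behave differently.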
The evaluation of the gapfilling presented in section 3 focuses on the differences to ERA5. However, as mentioned above, the differences to the other methods, in particular the one for FLUXNET2015, would be of wider interest.
The statement regarding daylight saving time in section 4.2 (“does not account for day light savings”) is somewhat confusing: does this mean that the time is always UTC, or that the conversion from local time to UTC did not correct for it, so that times might be shifted when DST is in effect? I believe it is supposed to be the former; however, rephrasing that statement might help clarify this point.
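The two readings can be distinguished with a minimal Python sketch. It assumes a hypothetical site with a standard offset of UTC+10 and a one-hour DST shift. The offset and both helper functions are illustrative and are not taken from the dataset.

```python
from datetime import datetime, timedelta, timezone

STANDARD_OFFSET = timedelta(hours=10)  # hypothetical site offset from UTC
LOCAL_STANDARD = timezone(STANDARD_OFFSET)

def local_standard_to_utc(naive_local):
    """Convert a local standard time stamp to UTC using a fixed offset."""
    return naive_local.replace(tzinfo=LOCAL_STANDARD).astimezone(timezone.utc)

def wall_clock_to_local_standard(naive_wall, dst_active, dst_shift=timedelta(hours=1)):
    """Remove the DST shift from a wall-clock time stamp when DST is active."""
    return naive_wall - dst_shift if dst_active else naive_wall

# Reading 1: time stamps are in local standard time, so the offset never changes.
january_utc = local_standard_to_utc(datetime(2020, 1, 15, 12, 0))
july_utc = local_standard_to_utc(datetime(2020, 7, 15, 12, 0))

# Reading 2: time stamps followed the wall clock; an uncorrected DST period is one hour off.
wall_noon = datetime(2020, 1, 15, 12, 0)
corrected_utc = local_standard_to_utc(wall_clock_to_local_standard(wall_noon, dst_active=True))
shift = january_utc - corrected_utc
```

Under the first reading, local noon maps to the same UTC hour in January and July. Under the second, treating a DST wall-clock stamp as standard time shifts the data by the DST offset, which is the ambiguity the statement should resolve.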
Finally, it would have been useful to have a netCDF file with data for all sites for easy access. This would be just a shortcut, but a very handy one.
Citation: https://doi.org/10.5194/essd-2022-65-RC2
AC1: 'Response to RC1 & RC2', Mathew Lipson, 23 Sep 2022