This work is distributed under the Creative Commons Attribution 4.0 License.
A vegetation phenology dataset by integrating multiple sources using the Reliability Ensemble Averaging method
Abstract. Global change has substantially shifted vegetation phenology, with important implications for the carbon and water cycles of terrestrial ecosystems. Various vegetation phenology datasets have been developed using remote sensing data; however, the significant uncertainties in these datasets limit our understanding of ecosystem dynamics in terms of phenology. It is therefore crucial to generate a reliable large-scale vegetation phenology dataset, by fusing various existing vegetation phenology datasets, to provide a comprehensive and accurate estimation of vegetation phenology with fine spatiotemporal resolution. In this study, we merged four widely used vegetation phenology datasets to generate a new dataset using the Reliability Ensemble Averaging fusion method. The spatial resolution of the new dataset is 0.05° and its temporal scale spans 1982–2022. The evaluation using the ground-based PhenoCam dataset from 280 sites indicated that the accuracy of the newly merged dataset was improved substantially. The start of growing season and the end of growing season in the newly merged dataset had the highest correlations (0.84 and 0.71, respectively) and the best accuracy in terms of root mean square error (12 and 17 d, respectively). Using the new dataset, we found that the start of growing season exhibits a significant (p < 0.01) advancing trend at a rate of approximately 0.24 d yr−1, and that the end of growing season exhibits a significant (p < 0.01) delaying trend at a rate of 0.16 d yr−1 over the period 1982–2022. This dataset offers a unique and novel source of vegetation phenology data for global ecology studies.
Status: final response (author comments only)
RC1: 'Comment on essd-2024-225', Anonymous Referee #1, 13 Sep 2024
General comments
This data paper describes a new vegetation phenology dataset that fuses four existing remotely sensed datasets. It then compares the timing of two phenophases (start and end of the growing season) from this new dataset to the same metrics from three phenology camera networks (phenocams). Considering the discrepancies in accuracy and temporal/spatial coverage between existing phenology datasets, this new fusion dataset, created based on weighted averaging, is extremely useful and appears to have higher accuracy than existing datasets. However, the methods could be explained in greater detail, along with a few other concerns, which I highlight below.
First, I’m concerned about how the SOS and EOS dates were extracted and compared across the different datasets. Of the 4 datasets fused together, at least several appear to use different SOS and EOS thresholds and methods to extract the dates, which raises the question: are they directly comparable? For example, if one dataset identifies SOS as 15% of maximum green-up, but another uses 50%, the extracted SOS date will naturally differ between those datasets, but one is not necessarily more or less accurate than the other. The same applies when comparing the fused dataset with the phenocam dataset – how are SOS and EOS identified across the 3 phenocam networks included in the phenocam dataset? Also, in addition to different thresholds, it appears that the datasets use different methods to extract SOS and EOS dates. This could be one reason that there is observed variability across the SOS and EOS dates. If possible, it would be best to standardize the methods and threshold across all the datasets. It might also be helpful to compare entire seasonal trends in vegetation greenness through time to visualize differences between datasets instead of just the seasonal transition dates.
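The reviewer's threshold concern is easy to demonstrate with a small sketch (not the authors' code; the synthetic greenness curve, function name, and thresholds below are hypothetical): applying 15% and 50% amplitude thresholds to the identical seasonal curve yields different SOS dates, so the datasets' dates are not directly comparable even when both are "correct".

```python
import numpy as np

def sos_from_threshold(doy, greenness, threshold):
    """Return the first day-of-year at which the greenness curve crosses
    `threshold` (a fraction of its seasonal amplitude) on the way up.
    A generic sketch of the amplitude-threshold method; each of the four
    fused datasets defines this step slightly differently."""
    gmin, gmax = greenness.min(), greenness.max()
    target = gmin + threshold * (gmax - gmin)
    rising = np.where(greenness >= target)[0]
    return int(doy[rising[0]])

# Synthetic single green-up/green-down cycle peaking near DOY 200.
doy = np.arange(1, 366)
greenness = np.exp(-((doy - 200) / 60.0) ** 2)

sos_15 = sos_from_threshold(doy, greenness, 0.15)
sos_50 = sos_from_threshold(doy, greenness, 0.50)
# The 15% threshold flags SOS weeks earlier than the 50% threshold on
# the same curve -- neither is "wrong"; they measure different points
# of green-up.
```

This is why standardizing the extraction threshold (or at least quantifying the offset it induces) matters before fusing or validating the datasets.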
Also, this is mostly semantics, but an important distinction to avoid misleading readers. “PhenoCam” with a capital P and C is usually used to indicate data from the PhenoCam Network (phenocam.nau.edu) and phenocam (lowercase p and c) indicates a generic camera from any network. Please see Richardson 2023, Box 3 for further explanation (https://www.sciencedirect.com/science/article/pii/S0168192323004410). Since you used digital camera data from 3 different sources, please use “phenocam” in this paper to avoid confusion.
Finally, I know discussion sections are normally shorter in data papers, but it seems like the authors could connect their results more to other studies. For example, how does the rate of growing season advance/delay compare to the rate that other studies have found? The fact that this analysis is done makes this more than just a data paper, and requires more contextualization with the literature. I would suggest either removing that analyses or adding more sources to support your results.
Line edits
Line 15: Here and elsewhere, I would suggest saying “a ground-based phenocam dataset” instead of “the ground-based…” – using “the” makes it sound like you’re referring to a single data source (such as the PhenoCam Network), but your phenocam dataset actually includes multiple data sources. Also, see my comment for line 109 about using “PhenoCam” versus “phenocam”.
Line 17: largest correlation and accuracy in comparison with what?
Line 17: root mean square error between what? (Observed and predicted dates?)
Line 18: When giving the start and end of the growing season trends, what region are you referring to – the entire globe?
Line 30: They are still used – I suggest changing to “are commonly”
Line 40-43: What region was assessed to see trends in SOS and EOS in these studies? The entire globe?
Line 44: “merits and demerits” is awkward. Perhaps replace with “advantages and disadvantages”
Line 45: Perhaps set up/explain the merged dataset a little more here. From this sentence, my initial thought was “Why would a merged dataset be better if the raw datasets that go into it are inaccurate?” I understand it better after the next paragraph when you explain REA, but it’s unclear here.
Line 48: change “was” to “is”
Line 52: What is the “vegetation index method”? Please explain in the text.
Lines 69-72: Please write out all satellite/dataset abbreviations (MODIS, VIP, GIM) the first time (e.g., MODIS = Moderate Resolution Imaging Spectroradiometer)
Line 75: I don’t think you need the “Note” under the table. The dataset names and abbreviations are clear in the table.
Line 95: Write out and define NDVI the first time it’s used.
Line 82: What is the “threshold method”? You use this term multiple times, so it would be good to explain it in more detail the first time.
Line 84: What does “segment EVI2 amplitude” mean? When the time series reaches 15% of the maximum seasonal amplitude? Please add more detail about these methods.
Line 90: What threshold was used to determine SOS and EOS? If the datasets use different thresholds, this could be one reason they differ in their SOS and EOS dates.
Line 95-97: This dataset uses a different threshold (20%) than the MCD12Q2 dataset (15%). As I mentioned above, this could contribute to their differences in accuracy. Also, please explain what “segment NDVI amplitude” means.
Line 104: Why is SOS 20% and EOS 50%? Usually, the same threshold is used for both the start and end of season.
Line 109: PhenoCam with capital C is used to indicate data from the PhenoCam Network (phenocam.nau.edu) and phenocam (lowercase p and c) indicates a generic camera from any network, which I would suggest using here to avoid confusion. Please see Richardson 2023, Box 3 for further explanation (https://www.sciencedirect.com/science/article/abs/pii/S0168192323004410).
Line 111: Please add source of data (PhenoCam Network) “includes data acquired from the *PhenoCam Network*….”
Line 111: The PhenoCam Network does not use fisheye cameras – they use standard camera lenses. They also aren’t completely downward-facing; they are tilted slightly downwards, but always include the horizon. See Fig 5: https://phenocam.nau.edu/pdf/PhenoCam_Install_Instructions.pdf
Line 114: Include full link to dataset (https://daac.ornl.gov/VEGETATION/guides/PhenoCam_V2.html). Please also cite PhenoCam data paper associated with this dataset:
Seyednasrollah, B., Young, A. M., Hufkens, K., Milliman, T., Friedl, M. A., Frolking, S., and Richardson, A. D.: Tracking vegetation phenology across diverse biomes using PhenoCam imagery: The PhenoCam Dataset v2.0, Scientific Data, 2019. https://www.nature.com/articles/s41597-019-0229-9
Lines 115-119: What geographic area is included in these datasets?
Line 118: Is this dataset also collected by digital cameras?
Line 119: How were these 280 sites selected (for example, the PhenoCam Network has 393 sites alone)? How many sites are from each of the 3 networks?
Line 120: How are SOS and EOS extracted from all the phenocam datasets? In general, the methods of data collection and extraction from all 3 phenocam datasets could be explained in more detail.
Line 129: How was interannual variability used to assign weights to each dataset? Are datasets considered more or less accurate with higher interannual variability? Why?
Line 130: How exactly were the datasets compared to get a value of “consistency” and “offset”? Please explain these methods in more detail.
Line 136: How are the methods described in section 2.2 and 2.2.1 different? Isn’t 2.1 describing the REA method? (Also missing a 2.2.2 section in the paper).
Line 137: What is the “voting principle”?
Line 169: What does BIAS stand for? What is it measuring? Similarly, what is the difference between the RMSE and unbiased RMSE (For both, I don’t mean mathematically, but rather contextually in terms of understanding the data).
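The contextual distinction the reviewer asks about can be sketched with the standard definitions (assumed here; the paper should confirm which formulas it uses): BIAS is the mean signed error (systematic early/late offset), RMSE is the total average error magnitude, and unbiased RMSE is the random scatter that remains after the systematic offset is removed.

```python
import numpy as np

def bias(pred, obs):
    """Mean signed error: positive means predicted dates are late on average."""
    return np.mean(pred - obs)

def rmse(pred, obs):
    """Root mean square error: total average error magnitude."""
    return np.sqrt(np.mean((pred - obs) ** 2))

def ubrmse(pred, obs):
    """Unbiased RMSE: scatter left after removing the mean bias.
    Satisfies rmse**2 == bias**2 + ubrmse**2."""
    err = pred - obs
    return np.sqrt(np.mean((err - np.mean(err)) ** 2))

# Hypothetical SOS dates (DOY) at four site-years.
obs = np.array([120.0, 130.0, 125.0, 140.0])   # phenocam-observed
pred = np.array([125.0, 133.0, 131.0, 143.0])  # dataset-estimated
# Here the dataset is systematically late (positive BIAS); the ubRMSE
# shows how much error remains once that constant offset is corrected.
```

A one-sentence version of this decomposition in the methods would answer the reviewer's question.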
Line 194: Were all analyses only for above 30 degrees N? If so, that should be stated in the methods.
Line 254: Remove the “etc” (not clear what it refers to – just list more places if desired).
Line 262: Include references for this claim.
Line 302-303: Which site did you choose? From which phenocam dataset? Change “American” to “US”.
Line 312: Remind readers that an “advance” in SOS means earlier dates. The terms “advance” and “delay” can sometimes be confusing to follow.
Line 315: Some regions also show a significant delay in SOS (Fig 7b) – it would be good to point that out, too.
Line 329: change “were” to “are”- they’re still widely used.
Line 332: What does the “complexity of surface backgrounds” mean?
Line 334: Please explain what the “mixed-pixel effect” is.
Lines 337-350: I feel like this paragraph could be either removed or condensed down into 2-3 sentences. You already set up why this data fusion method works in the intro and methods. I would just broadly state the benefits again without re-stating all the details.
Line 341: What does the “process of coefficient determination” mean?
Lines 341-342: Please further explain what the non-linear change in endmembers is and why that would result in poor performance. This is the first time it is mentioned in the paper.
Line 342: Please redefine REA acronym the first time it’s used in the discussion section.
Line 349: What is the “voting principle”?
Line 356: It’s not clear what “high processing efficiency” means – how does that relate to the low RMSE of the REA dataset?
Line 359: How does this rate compare to other studies? Include citations.
Line 363: I suggest replacing “invaluable” with “accurate” or “reliable” (or something similar)
Line 264: Consider starting a new sentence here.
Figures:
Fig 2: Why are plots b & d sharing an axis with plots a & c? They don’t appear to share axis values. I was confused at first thinking that the weights (a & c) were shown by latitude, but that doesn’t seem to be the case, so it is misleading to share an axis.
Also, the vertical figure legend is hard to read; is it possible to place it somewhere where the dataset names can be written horizontally?
Fig 3: Please add a label/title to the legend scale bar (something like “Proportion of contribution”)
Fig 4. The legends and text are small and hard to read. In panel a, consider using a color scale with more variation – it’s hard to see differences between the shades of green. In the figure legend text, “EOS dates” panel should be “d” instead of “b”.
Fig 5: For all panels, include units in axes labels (DOY). In figure legend, remind readers that each data point represents a site year.
I suggest moving the radar charts (panels f & k) to a separate figure (they’re too small and hard to read) and include an explanation about how to interpret them in the results section.
Fig 6: Just an observation - for EOS, REA estimates are consistently late compared to phenocam. It could be related to how the EOS date is determined (which method/threshold used) for phenocam vs REA.
Fig 7: Need to add x-axis label. Are the black lines the average SOS/EOS date for each year? Please note in figure legend what the black and red dotted lines represent.
In the percentage insets in panels b and d, I assume the x-axis letters represent the significant/non-significant advance and delays (abbreviations aren’t defined)? Perhaps include the abbreviations in the legend with the colors: e.g., significant delay (DS), non-significant delay (DN), etc.
Citation: https://doi.org/10.5194/essd-2024-225-RC1
RC2: 'Comment on essd-2024-225', Anonymous Referee #2, 17 Oct 2024
The paper integrates four existing phenology datasets by applying different weights to the start (SOS) and end (EOS) of the growing season, as determined by each dataset. Overall, the manuscript is well written, and the dataset presented would be useful for the community. However, several issues need to be addressed before it can be accepted for publication.
First, the remote sensing datasets are not processed exactly the same, in terms of the curves used to fit the time series and the thresholds used to extract transition dates. The authors should clarify how these methodological differences might influence the uncertainty or accuracy of the resulting merged dataset.
Second, I question whether the REA method truly outperforms a simple average. I recommend that the authors include additional analysis comparing the results obtained using the REA method with those from a simple average.
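The REA-versus-simple-average comparison the reviewer requests can be illustrated with a toy sketch. The weighting below is a simplified, convergence-only stand-in for the full Reliability Ensemble Averaging criteria of Giorgi and Mearns (the actual method also weighs model bias); all numbers are hypothetical.

```python
import numpy as np

# Toy SOS estimates (DOY) for one pixel from four datasets over five
# years; dataset D is systematically late.
sos = np.array([
    [110, 112, 108, 111, 109],  # dataset A
    [112, 114, 110, 113, 111],  # dataset B
    [111, 113, 109, 112, 110],  # dataset C
    [130, 133, 128, 131, 129],  # dataset D (biased late)
], dtype=float)

simple_mean = sos.mean(axis=0)

# REA-style convergence weight: the further a dataset's long-term mean
# sits from the ensemble consensus, the lower its weight.
eps = np.abs(sos.mean(axis=1) - sos.mean())  # distance from consensus
weights = 1.0 / (eps + 1e-6)
weights /= weights.sum()

rea_mean = (weights[:, None] * sos).sum(axis=0)
# The outlier dataset D pulls the simple mean several days late, while
# the REA-style weighted mean stays near the A/B/C consensus.
```

Reporting exactly this comparison (REA fusion vs. equal-weight averaging, validated against the phenocam dates) would make the case that the extra machinery is worthwhile.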
Third, no open-source code is provided, which limits the reproducibility of the study. Providing the code, especially for the REA method, would be interesting and useful for the community.
PhenoCam data are not properly cited; please check out the fair use data policy here https://phenocam.nau.edu/webcam/fairuse_statement/.
Line 30: PhenoCam, as a ground-based measurement, has been operational for more than 20 years. It should be introduced earlier in the text here.
Line 41-42: Over what specific time period are these trends evaluated?
Line 44: Please provide examples of regions where significant differences in the phenological metrics are observed.
Line 55: As you mentioned earlier, the performance of datasets varies across regions. How does the REA method address or resolve these regional performance variations?
Line 90: How are SOS and EOS determined in the VIP phenology dataset? Please compare these criteria with the methods used to determine greenup in the MCD12Q2 dataset.
Line 95: What specific curves are applied for the MCD12Q2 and VIP datasets?
Line 95: What threshold is used to extract phenological metrics from the GIM_3g dataset?
Line 110: Please provide a link to the PhenoCam dataset for reference.
Line 129: Since the method relies on interannual variability in the time series, what is the minimum required length for the time series? Is it possible to use REA to merge only two datasets? A discussion or quick test with shorter time series would be valuable, especially considering the availability of recent Planet data.
Line 130: For datasets with higher interannual variability, did you assign them lower weights in the REA method? Please clarify.
Line 137: Open-source code for the REA method should be made available. This would assist the community in merging datasets from various sources and years.
Line 170: Please provide a brief description of the metrics used and their characteristics.
Line 182: Provide a specific example of how the M-K test will be applied in the study.
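A minimal sketch of the Mann-Kendall test applied to an SOS series may help make the requested example concrete (the synthetic data and simplified statistic below are assumptions; the normal approximation here omits the tie correction):

```python
import math
import numpy as np

def mann_kendall(x):
    """Mann-Kendall trend test: returns the S statistic and a two-sided
    p-value via the normal approximation (no tie correction -- a
    minimal sketch, not necessarily the exact routine used in the paper)."""
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return s, p

# Synthetic SOS series advancing ~0.24 d/yr over 1982-2022 plus noise,
# mimicking the trend magnitude reported in the abstract.
rng = np.random.default_rng(0)
years = np.arange(1982, 2023)
sos = 120 - 0.24 * (years - 1982) + rng.normal(0, 1.0, len(years))
s, p = mann_kendall(sos)
# A negative S with a small p-value indicates a significant advance
# (earlier SOS dates over time).
```

Walking through one such pixel-level example in the methods would show readers exactly how significance in Figure 7 is determined.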
Line 183: The citation of Mao and Sun may not be necessary here—consider removing it.
Line 325: I am a bit concerned about the huge deviations in spring 2022 shown in Figure 7, which seems very inconsistent with Figure 2.78 in DOI: https://doi.org/10.1175/BAMS-D-23-0090.1
A description of how uncertainty is determined needs to be added to the REA phenology dataset.
Citation: https://doi.org/10.5194/essd-2024-225-RC2
Data sets
A vegetation phenology dataset by integrating multiple sources using the Reliability Ensemble Averaging method, Yishuo Cui and Yongshuo Fu, https://doi.org/10.5281/zenodo.11127281