the Creative Commons Attribution 4.0 License.
Global Retrieval of 24-hourly Solar-Induced Chlorophyll Fluorescence and Evapotranspiration from OCO-2, OCO-3 and ECOSTRESS over 1982–2022
Abstract. Solar-induced chlorophyll fluorescence (SIF) and evapotranspiration (ET) have been widely recognized as proxies for carbon gain and water loss at the ecosystem level. However, most SIF and ET products on the global scale are generally at coarse temporal resolutions of at least one day, which limits their ability to characterize diurnal carbon and water cycles. In this study, we extended the spatiotemporal scale of satellite SIF (from OCO-2 and OCO-3) and ET (from ECOSTRESS) data using machine learning methods, resulting in a global hourly SIF and ET dataset (HOUR_SIFOCO and HOUR_ETECO) spanning from 1982 to 2022, with a spatial resolution of 0.1°. Our product also provides photosystem-level SIF derived from direct estimation and simulation of Soil Canopy Observation, Photochemistry and Energy fluxes (SCOPE) model, aiming to offer a more accurate description of photosynthesis. Our satellite-derived products show good correlations with in-situ flux tower measurements from the FLUXNET2015 community (hourly-scale median R2 for SIF: 0.72, and ET: 0.53; daily-scale median R2 for SIF: 0.73, and ET: 0.63). Globally, our product shows good consistency with popular SIF and ET gridded products: the mean proportions of pixels with monthly R2 exceeding 0.7 are 69.5 % and 68.1 % when compared with four popular products, respectively. The causal-based attribution analysis revealed significant spatial heterogeneity in the lagged effects of different environmental factors on SIF, ET, and water use efficiency based on SIF and ET on the global scale. Overall, our dataset will provide new insights for monitoring the diurnal variations of carbon and water cycles and deepen our understanding of their changes over the past 40 years. The global hourly SIF and ET dataset (1982–2022) at 0.1° spatial resolution produced in this study is available at https://doi.org/10.57760/sciencedb.ecodb.00177 (Deng et al., 2025b).
Status: open (until 05 Jun 2025)
RC1: 'Comment on essd-2025-99', Paul Blackwell, 09 Apr 2025
The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2025-99/essd-2025-99-RC1-supplement.pdf
RC2: 'Comment on essd-2025-99', Anonymous Referee #2, 27 Apr 2025
This study employs a machine learning technique (the LightGBM model) to integrate solar-induced chlorophyll fluorescence (SIF) data from the OCO-2 and OCO-3 satellites with evapotranspiration (ET) observations from ECOSTRESS, constructing a global hourly dataset of SIF and ET at 0.1° spatial resolution. Attribution analysis using the SURD method reveals the driving effects of four key environmental factors (PAR, VPD, soil moisture, and air temperature) on SIF, ET, and water use efficiency (WUE). The authors expect the new datasets to provide novel data support for understanding the diurnal variation patterns of vegetation photosynthesis and transpiration over the past four decades and their climatic response mechanisms. Producing hourly SIF/ET data is very meaningful for carbon and water cycle research. We need to move from coarse to fine temporal resolution, especially at the global scale. The authors have done a lot of work and produced two sets of hourly data covering 40 years. The analysis also involves the matching of different sensors, verification against ground observations, model simulation, etc. This work is worthy of recognition. However, I have serious doubts about many parts of the paper, especially the structure (focus) of the manuscript and the methods. Please see the comments below.
Major Comments
- First of all, I appreciate that the authors have done a lot of work (models, validation, etc.), but piling it all into one study produces a hodgepodge: it is difficult to distinguish the main points, and difficult to integrate any one of them carefully.
- First, the authors have produced two types of data, SIF and ET, and have highlighted both the hourly scale and the long-term, 40-year record. Doing just one of these, as long as it is done well and carefully, could support the publication of a paper.
- Secondly, the authors further used two methods to convert directional SIF into canopy total SIF (each model has its own complexity). Considering that existing products almost never include a conversion to canopy total SIF, and that the improvement after conversion, for example in the relationship with GPP, is small, the case for adding this part to this study is a bit far-fetched.
- The authors did not clearly explain why SIF is based on both the OCO-2 and OCO-3 satellites (there is also a serious problem here, which I point out later). Therefore, because so much material, calibration, verification and analysis is involved, the focus, innovation and advantages of the results are not clear. For example, the analysis starting from Section 3.2 ‘Spatial and temporal patterns of SIF, ET, and WUE’ and the large amount of analysis in Section 3.3 ‘Attribution of drivers for SIF, ET, and WUE’ have nothing to do with hourly-scale data. All the analyses in 3.2 and 3.3 could easily be done with daily or monthly data. Although the authors discussed that the daily data are a composite of the hourly data, they did not provide evidence that these analyses (spatial patterns, long-term trends) require the advantages of hourly data.
- A big concern for me is the use of OCO-2 and OCO-3 as training data sources at the same time. First, OCO-3 and ECOSTRESS are both on board the ISS and can observe the surface at different times of the day, whereas OCO-2 is a traditional polar-orbiting satellite that passes over the surface at about 1:30 p.m. local time. It is therefore unclear how OCO-2 and OCO-3 are fused together; the authors did not describe how they deal with the different observation times. Secondly, if the OCO-2 overpass time dominates after the fusion, how can the product capture diurnal changes? Thirdly, although the OCO-2 and OCO-3 sensor parameters are similar, judging from the results in Figure 2, the SIF values of the two are very different even after correction, and fusing them introduces too much uncertainty. Most importantly, it seems that OCO-2 could be omitted to avoid introducing large external errors.
Detailed Comments:
Introduction
Line 43: Has measuring ET from space become possible only in recent years? Please clarify.
Line 46-47: What are these limitations?
Paragraph from line 53-69: You did not explain why we need to monitor these processes at the diurnal scale. No background is given.
Line 66: Surely! Polar-orbiting satellites are not designed to capture diurnal variations. Please reorganize this passage.
Line 84-87: The BRDF effect of SIF should not be an important part of your current work.
Methods
Line 114-115: I cannot infer this from Figure 2.
Paragraph from line 123-134: Since the diurnal cycle of OCO-3 SIF can easily be affected by solar angle, you may need to follow the method of Zhang and Zhang (RSE, 2023):
Zhang Z, Zhang Y. Solar angle matters: Diurnal pattern of solar-induced chlorophyll fluorescence from OCO-3 and TROPOMI. Remote Sensing of Environment, 2023, 285: 113380.
Line 144: How to measure this 99%? Based on area or coverage?
Line 151: What do you mean by ‘weighted appropriately’? Please give more details.
Line 165: You should first explain why you used FLUXNET2015 for SIF validation. It is more of an indirect evaluation through SIF-GPP relationships.
Line 168: Have you then examined site homogeneity?
Paragraph from line 170-175: It is worth noting that none of the products currently selected are hourly, so the hourly performance itself does not seem to be verified.
2.3 Data fusion of OCO-2 and OCO-3 SIF sounding data: Please see my Major comment 2.
Line 198-199: These statistics (MAE, R2) cannot support a close match between OCO-2 SIF and OCO-3 SIF.
2.4 Estimation of photosystem-level SIF
I strongly suggest you remove this part. First, it deviates from your main point. Second, both methods contain a lot of uncertainty and do not show any special advantages.
2.5 Description of continuous spatiotemporal scale-up model
This section does not show how you handle SIF predictions at different times.
Line 270: You used ML rather than the Penman-Monteith equation, so why show this equation?
Results
3.1 Diurnal dynamic validation of SIF and ET
The figures should appear after your text description.
The middle two rows do not provide much information in this study. By removing them you can show more diurnal details.
Figure 6: The SIFtotal-D looks weird.
Section 3.2 and 3.3
There is nothing wrong with the large amount of analysis in 3.2 and 3.3. For example, you developed a long-term SIF and ET, and calculated WUE to analyze the spatiotemporal changes. However, your title and innovation particularly emphasize hourly, so it is a bit off.
Discussion
The first point of the discussion is almost entirely centered on the hourly performance of the product, but the corresponding results seem to be missing, apart from a 6-hour global pattern of SIF/ET in Figure 4. Therefore, it is strongly recommended that the authors adjust the focus of some experiments and of the narrative.
The second discussion point is about the difference between the methods for calculating the total canopy SIF. As I said before, the authors put too much content into one study, so the more important analysis of the long-term SIF/ET and WUE, as well as the attribution analysis, is completely omitted from the discussion. I personally think that the method for calculating the total canopy SIF belongs in a paper on estimating SIF, but it is not the focus of this paper.
Since the discussion involves trends, GIMMS data uncertainties should be included.
Citation: https://doi.org/10.5194/essd-2025-99-RC2
RC3: 'Comment on essd-2025-99', Anonymous Referee #3, 02 May 2025
General Comments
The manuscript essd-2025-99 proposes long-term (40 yr), hourly SIF and ET datasets developed using LightGBM models trained with OCO-2/OCO-3 SIF products and ECOSTRESS ET data. The objective is to improve the temporal coverage and sampling continuity of official satellite products using ERA5-Land meteorological variables and GIMMS FPAR. The workload is substantial, and while the evaluation spans multiple perspectives, it remains superficial and lacks in-depth analysis. Moreover, the spatiotemporal variability analysis and attribution sections remain very limited and raise concerns. Detailed comments are provided below.
Major Comments
- Modeling
1.1 Figure 1 presents the overall methodological framework; however, the figure shows a direct arrow from LightGBM_ET to LightGBM_SIF. This implies a dependency that contradicts the separate modeling processes described in the methodology. Clarify the relationship between the two models.
1.2 The manuscript references the Penman-Monteith (PM) equation (Eq. 10) for ET modeling. However, the equation presented is for reference ET (ET₀), which assumes no water stress and is not suitable for modeling actual ET under vegetation stress. Given that ECOSTRESS ET is derived using the PT-JPL model, and vegetation cover is a key driver (Luo et al., 2022), the feature selection should be based on PT-JPL inputs. Additionally, total shortwave radiation—not just PAR and fPAR—should be considered in modeling ET.
1.3 How were the official products sampled for model training? What are the spatial patterns of the global SIF and ET samples? Given the volume and resolution of the original datasets, it is practically infeasible to download all high-resolution observations, aggregate them to 0.1°, and perform global sampling without substantial computational resources. Please clarify the specific data sampling and preprocessing strategy.
- Data
2.1 Provide a summary table listing all required datasets along with metadata (e.g., resolution, temporal range, source) to improve clarity for readers.
2.2 Clarify the fusion method for OCO-2 and OCO-3 SIF. How are systematic biases handled? Do differences between the two products introduce any artifacts in the final dataset?
2.3 What percentage of the ECOSTRESS ET and OCO training samples falls at each time of day?
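The breakdown requested in comment 2.3 could be reported with something like the following minimal sketch. It is an illustration only, not the authors' procedure: the function names, the `(utc_time, lon_deg)` sample layout, and the use of mean solar time (15° of longitude per hour, ignoring the equation of time) are all assumptions made here.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def local_solar_hour(utc_time: datetime, lon_deg: float) -> int:
    """Approximate local solar hour (0-23) from a UTC timestamp and longitude.

    Assumption: mean solar time, i.e. 15 degrees of longitude per hour,
    with no equation-of-time correction.
    """
    offset = timedelta(hours=lon_deg / 15.0)
    return (utc_time + offset).hour


def hourly_share(samples):
    """Return {hour: percent of samples} for hours 0..23.

    `samples` is an iterable of (utc_time, lon_deg) pairs; this layout is
    hypothetical and stands in for the real sounding or pixel records.
    """
    counts = Counter(local_solar_hour(t, lon) for t, lon in samples)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples given")
    return {h: 100.0 * counts.get(h, 0) / total for h in range(24)}


# Four illustrative samples at different longitudes.
noon_utc = datetime(2020, 7, 1, 12, 0, tzinfo=timezone.utc)
late_utc = datetime(2020, 7, 1, 23, 30, tzinfo=timezone.utc)
demo_samples = [
    (noon_utc, 0.0),    # 12:00 local solar time
    (noon_utc, 22.5),   # 13:30 local solar time
    (late_utc, 15.0),   # 00:30 local solar time on the next day
    (noon_utc, -90.0),  # 06:00 local solar time
]
demo_share = hourly_share(demo_samples)
```

Running the same function separately on the OCO-2, OCO-3 and ECOSTRESS samples would show directly how strongly each sensor is concentrated around particular hours.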
- Evaluation
3.1 The use of 136 sites for SIF and 146 for ET validation at a 0.1° resolution raises concerns regarding scale mismatch. No site selection or scale-matching analysis is presented. Consider validating such choices by comparing the performance of aggregated official products over these sites.
3.2 Given that ERA5-Land is a major input, include a comparison between the final ET product and ERA5-Land ET to assess added value.
3.3 Figure 3: Clarify whether the x-axis reflects ground observations or satellite product labels.
3.4 Figure 4: The missing data in South America in the first column appears to have artificial boundaries. Please verify the cause and correct if needed.
3.5 Figure 6: SIFtotal-D shows limited variability during the daytime. Explain the reason in more detail. Also, the focused validation sites should be introduced in the data section. Expand the uncertainty analysis across these sites; currently, it is only briefly addressed.
3.6 The diurnal accuracy performance of the dataset at different times of the day should be analyzed and discussed.
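One way to present the analysis asked for in comment 3.6 is to group matched product and tower values by hour of day and compute the metrics per group. The sketch below is illustrative only: the `(hour, observed, predicted)` record layout and the function names are hypothetical, and R2 is taken here to be the squared Pearson correlation, which may differ from the definition used in the manuscript.

```python
from collections import defaultdict


def r_squared(obs, pred):
    """Squared Pearson correlation; None when it is undefined."""
    n = len(obs)
    if n < 2:
        return None
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sxx = sum((o - mean_o) ** 2 for o in obs)
    syy = sum((p - mean_p) ** 2 for p in pred)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy * sxy / (sxx * syy)


def metrics_by_hour(records):
    """Group (hour, observed, predicted) records and score each hour.

    Returns {hour: {"n": count, "r2": ..., "rmse": ...}} in hour order.
    """
    groups = defaultdict(lambda: ([], []))
    for hour, observed, predicted in records:
        groups[hour][0].append(observed)
        groups[hour][1].append(predicted)
    result = {}
    for hour, (obs, pred) in sorted(groups.items()):
        n = len(obs)
        rmse = (sum((o - p) ** 2 for o, p in zip(obs, pred)) / n) ** 0.5
        result[hour] = {"n": n, "r2": r_squared(obs, pred), "rmse": rmse}
    return result


# Made-up numbers purely to exercise the functions.
demo_records = [
    (10, 1.0, 2.0), (10, 2.0, 4.0), (10, 3.0, 6.0),
    (14, 1.0, 1.0), (14, 2.0, 1.0), (14, 3.0, 1.0),
]
demo_metrics = metrics_by_hour(demo_records)
```

Reporting the sample count alongside each hourly score matters here, because hours with few satellite overpasses would otherwise look as reliable as well-sampled ones.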
- Spatiotemporal Analysis and Attribution
The spatiotemporal analysis and attribution sections are weak and not suitable for publication. Multiple patterns and anomalies are presented without sufficient explanation or literature comparison. Consider either removing these sections or thoroughly revising them with stronger justifications and literature support.
- Figure 8: The stark trend contrast between two hemispheres needs explanation. Why are SIF and ET increasing significantly over India and polar regions? Any reports from previous studies?
- Figure 9: The trend of the proposed dataset diverges significantly from reference products at different subplots (e.g., Figure 9 a to d). Some trends between the proposed data and the reference are contradictory. Also, axis labels are missing.
- Figure 10: Several inconsistencies arise:
- Why are Sahara and Greenland included in the attribution analysis?
- Why is tropical forest soil moisture-limited? These are typically energy-limited systems.
- Why is VPD dominant only in sub-Saharan semi-arid areas?
- Why does PAR dominate arid regions (western US, Mediterranean, western Australia), where water should be the limiting factor?
- Why are temperature effects dominant in India and China? Surface temperature itself is regulated by other environmental factors.
Overall, the attribution framework and interpretation are not convincing.
- Introduction and Discussion
5.1 The introduction highlights limitations such as hotspot effects and low spatial resolution but does not connect these issues to the contributions of this work. The research gaps are not clearly defined. The introduction should be rewritten to clearly articulate the motivation and novelty, especially concerning dataset development, as required by ESSD.
5.2 The introduction section lacks a thorough review of ET datasets. Since the study presents both SIF and ET products, equal emphasis is needed on ET.
5.3 The importance and novelty of converting satellite-observed SIF to total SIF are not adequately discussed in the introduction.
- Other Comments
6.1 Title: The term “retrieval” typically refers to direct estimation from raw satellite observations, which is misleading in this context. This study estimates SIF and ET using machine learning based on high-level satellite products rather than raw mission data. A more accurate and appropriate term would be “estimation” or “modeling.” The title should also clarify that the datasets are derived from satellite products, not directly from satellite raw observations.
6.2 Line 35: Clarify what "cycle" is being referred to.
6.3 Line 41: “Severe challenges” are mentioned without further elaboration. Specify what challenges exist in quantifying photosynthesis and ET globally.
6.4 Line 45: Specify the spectral range of the "faint signal" of SIF.
6.5 Figure 1: Typo — "long-sterm" should be "long-term".
6.6 Line 19: Add “available” → “are generally available at”
6.7 Line 60: Replace “commonly need” with “common need”
6.8 Justify the choice of MCD43A1 over MCD43A4 for NIRv estimation.
Reference
Luo, Zelin, et al. "Different vegetation information inputs significantly affect the evapotranspiration simulations of the PT-JPL model." Remote Sensing 14.11 (2022): 2573.
Citation: https://doi.org/10.5194/essd-2025-99-RC3
Data sets
A Global 24-hourly Retrieval of Solar-Induced Chlorophyll Fluorescence and Evapotranspiration from OCO-2, OCO-3 and ECOSTRESS over 1982–2022 Zhuoying Deng, Tingyu Li, Jinghua Chen, Shaoqiang Wang, Kun Huang, Peng Gu, Haoyu Peng, and Zhihui Chen https://doi.org/10.57760/sciencedb.ecodb.00177
Model code and software
A Global 24-hourly Retrieval of Solar-Induced Chlorophyll Fluorescence and Evapotranspiration from OCO-2, OCO-3 and ECOSTRESS over 1982–2022 Zhuoying Deng, Tingyu Li, Jinghua Chen, Shaoqiang Wang, Kun Huang, Peng Gu, Haoyu Peng, and Zhihui Chen https://doi.org/10.57760/sciencedb.ecodb.00177