Extended global terrestrial evapotranspiration and gross primary production dataset from 1982 to near present

Xu, Zhenwu; Zhang, Yongqiang; Kong, Dongdong; Ma, Ning; Zhang, Xuanze

doi:10.5194/essd-2026-94

Preprints

https://doi.org/10.5194/essd-2026-94

02 Mar 2026

Status: a revised version of this preprint was accepted for the journal ESSD and is expected to appear here in due course.

Abstract. The Penman–Monteith–Leuning (PML) model is a widely recognized diagnostic framework for estimating coupled terrestrial evapotranspiration (ET) and gross primary production (GPP). To address the critical need for high-fidelity, long-term, and near-present eco-hydrological records, we developed the PML-V2.2 dataset, spanning from 1982 to 2024. Driven by observation-constrained Multi-Source Weighted-Ensemble Precipitation (MSWEP) and Multi-Source Weather (MSWX) meteorological variables, the dataset comprises three complementary products: (1) PML-V2.2a, an 8-day 500 m MODIS-based product (2000–2024) optimized for near-present monitoring; (2) PML-V2.2b, a half-month 0.1° AVHRR-based product (1982–2020) anchoring long-term climate attribution; and (3) PML-V2.2c, a consolidated half-month 0.1° record integrating the former two for seamless 43-year continuity (1982–2024). Our methodological framework features an expanded bottom-up calibration using 208 flux sites (~1400 site-years) across various plant functional types (PFT) and a refined parameterization that explicitly distinguishes between irrigated and rainfed croplands. This distinction effectively mitigated systematic biases in agricultural regions, reducing ET and GPP estimation errors by 8.7 % and 16.2 %, respectively. Performance evaluation reveals high accuracy across PFTs (cross-validation Nash-Sutcliffe Efficiency, NSE > 0.60, absolute bias < 5 %), while top-down water-balance validation across 56 large river basins during 1982–2016 and 152 basins during 2003–2020 confirms exceptional reliability (NSE: 0.89–0.91). The MODIS-based (V2.2a) and AVHRR-based (V2.2b) products exhibit high statistical and spatial agreement during their overlapping period (NSE = 0.90 and 0.79 for annual ET and GPP anomalies), ensuring a seamless transition across satellite epochs. 
Based on the consolidated PML-V2.2c dataset, global terrestrial annual ET and GPP during 1982–2024 are estimated at 65.8 × 10³ km³yr⁻¹ (with 58.0 % from transpiration) and 143.0 PgC yr⁻¹, respectively. Long-term analysis reveals significant (p < 0.01) increasing trends in GPP (0.315 PgC yr⁻²) and ET (0.019 × 10³ km³yr⁻²) during 1982–2024, where rapid growth in GPP and water use efficiency is partially offset by CO₂-induced physiological water savings. By bridging the gap between satellite epochs, PML-V2.2 provides an internally consistent long-term global dataset for hydrology, ecology, and other Earth science studies. The dataset is freely accessible, with the 500 m resolution PML-V2.2a product hosted on Google Earth Engine, and all 0.1° PML-V2.2a/b/c versions archived at the National Tibetan Plateau Data Center under https://doi.org/10.11888/Terre.tpdc.303314 (Xu et al., 2026).

Received: 03 Feb 2026 – Discussion started: 02 Mar 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1: 'Comment on essd-2026-94', Anonymous Referee #1, 23 Apr 2026
The authors present a valuable update to the Penman–Monteith–Leuning (PML) model, introducing the PML-V2.2 dataset to provide a long-term (1982–2024), near-present global record of coupled terrestrial evapotranspiration (ET) and gross primary production (GPP). The manuscript is well-structured, clearly written, and the methodology is robust. The rigorous validation against both eddy covariance data and basin-scale water balance estimates provides high confidence in the dataset's fidelity. Overall, the development of this consolidated 43-year dataset bridging the AVHRR and MODIS epochs represents a significant contribution to the community. I have only a few minor suggestions to further enhance the transparency, utility, and readability of the manuscript before publication:
General Comments
The term "near present" is used prominently throughout the manuscript to describe the dataset's temporal coverage, but its exact definition can be slightly ambiguous due to varying data latencies. Specifying the exact cutoff month (e.g., "updated through December 2024") in the Abstract and the Data Availability section would provide clearer expectations for future users.

It would be highly beneficial to provide a comprehensive supplementary table detailing the basic attributes of all 208 flux tower sites used for calibration and validation. Furthermore, making the model performance metrics for individual sites, as well as the underlying data for key main figures, especially those illustrating cross-product uncertainties, publicly available in a repository would significantly enhance the reproducibility and usability of this study.

While the current manuscript effectively summarizes existing models in Table 1, the Discussion section could be strengthened by briefly comparing the PML-V2.2 global mean estimates (65.8 × 10³ km³ yr⁻¹ for ET and 143.0 PgC yr⁻¹ for GPP) and their long-term trends against a few other mainstream global products. Adding a few sentences to contextualize these results within the broader scientific consensus would help readers better understand where this new product stands.

In Figure 12, the mean values between the "b" version and the "a/c" versions seem slightly different, while their interannual variabilities and trends are very similar. What might be the cause of this discrepancy? Is it due to differences in the mean LAI between GIMMS_LAI4g and the wWHD-smoothed MODIS LAI, or is it an artifact of the simulation at a coarser resolution? Please provide a brief discussion addressing this, ideally supported by the relevant data.

Although this is a global dataset, there is a previously officially released dataset called PML-V2 China (He et al., 2022), which has a daily resolution for 2000–2020. It would be valuable to briefly compare the new dataset with this regional one, focusing on regional differences and water balance closure. This would help answer whether the new global dataset is suitable for regional research in China, especially for recent years. (Reference: He, S., et al., 2022. Earth Syst. Sci. Data, 14, 5463–5488).

In Figure A5, the comparison of water-balance-based ET seems slightly unbalanced for products that do not provide data over deserts. Could you establish a mask or threshold to account for this? Additionally, there are cases where evaporation from water bodies is excluded. Please provide more methodological details on how these specific datasets were processed to ensure a fair comparison.

Specific comments:
Abstract: The text mentions "a 8-day 500 m MODIS-based product." Based on standard pronunciation rules, the article should be changed to "an," making it "an 8-day."

Abstract: Near the end of the abstract, it states "where rapid growth in GPP and water use efficiency is partially offset". Since the subject includes both GPP and water use efficiency, the verb should ideally be plural. Consider changing it to "are partially offset."

The specific NOAA CO₂ dataset utilized in this study should be explicitly stated in the text to ensure strict methodological reproducibility. Since NOAA maintains multiple carbon cycle products, please clarify whether the globally averaged marine surface monthly mean data from the Global Monitoring Laboratory or data from a specific baseline observatory was used, and provide the exact citation or accession link.

Section 2.1: In the variable explanation for Equation 1, the British spelling "water vapour pressure" is used, whereas the American spelling "vapor pressure" is used elsewhere in the manuscript (e.g., when introducing VPD). I suggest unifying the spelling convention throughout the text.

Section 2.1: In the paragraph below Equation 5, the phrase "under a same constraint... Therefore, a same function" contains a minor grammatical flaw. The definite article should be used here. Please change this to "under the same constraint... Therefore, the same function."

Sections 2.2 & 3.2: The manuscript uses the term "splitted" multiple times (e.g., "splitted parameterization scheme," and in the caption for Figure 6). The past tense and past participle of split is simply "split." Please replace all instances of "splitted" with "split" throughout the text.

Section 3.4: When listing statistical evaluation metrics, this section mentions "NSE = 0.904". However, the rest of the manuscript (such as the abstract and other metrics in the same paragraph) mostly retains two decimal places (e.g., NSE = 0.90, NSE = 0.79). I recommend standardizing the number of decimal places for data throughout the manuscript to maintain consistent formatting.
Citation: https://doi.org/10.5194/essd-2026-94-RC1
- AC1: 'Reply on RC1', Zhenwu Xu, 22 May 2026
  
  # Response to Reviewer
  We sincerely thank the reviewer for the thorough evaluation and the highly positive feedback on our updated PML dataset and manuscript. Your constructive suggestions are extremely valuable for enhancing the transparency, utility, and overall readability of our work. We will fully incorporate these suggestions into the revised manuscript. Below is our detailed, point-by-point response to your comments.
  # Response to General Comments
  1. The meaning of “near present”.
  
  We completely agree that clearly defining the temporal coverage is crucial for data users. To avoid any ambiguity caused by varying data latencies, we will explicitly specify the exact cutoff month. Unlike the “to present” MODIS products, our dataset is updated annually (when forcing is available) with a latency of approximately six months. Therefore, the current version is updated through December 2024. We will revise the Abstract and the Data Availability sections to clearly state this timeline. This clarification will provide users with precise expectations regarding the near-present coverage of the product.
  2. Flux tower details and data repository.
  
  This is a highly valuable suggestion that aligns perfectly with the principles of open science. To make the data more accessible and easier to reuse compared to a static supplementary table, we will directly upload the comprehensive attributes of all 208 flux tower sites (including their locations, vegetation types, and available data periods) into a permanent repository (e.g., Zenodo). Furthermore, we will also make the site-level model performance metrics and the underlying source data for key figures, especially those illustrating cross-product uncertainties, publicly available in the same repository.
  3. Comparison with other global products.
  
  We agree that contextualizing our results within the broader literature will greatly strengthen the Discussion section. We will add a dedicated paragraph to compare the PML-V2.2 global mean estimates and their long-term trends against several other mainstream global products (e.g., GLEAM, SiTHv2, etc.). This can be shown as supplementary figures. Adding this comparison will help readers better understand how our new product aligns with or differs from the current scientific consensus.
  4. Differences in global mean values in Figure 12.
  
  Thank you for your careful observation regarding Figure 12. We acknowledge that while the interannual variabilities and trends match closely, there is a slight discrepancy in the mean values between the b version and the a/c versions. Based on our analysis, this difference is primarily attributable to slight differences in the spatial averages of the forcing data. Specifically, it is caused by systematic differences in the mean LAI magnitude between GIMMS LAI4g and the smoothed MODIS LAI, combined with the scaling effects of simulating at a coarser spatial resolution. We will add a brief, data-supported discussion of this discrepancy in the revised text, together with a new supplementary figure that illustrates it.
  5. Comparison with PML-V2 China.
  
  This is an excellent point. Acknowledging the previously released regional dataset, PML-V2 China (He et al., 2022), will definitely benefit users focusing on regional studies. We will add a brief comparison between our new global dataset and the regional PML-V2 China dataset in the Discussion section. This comparison will focus on highlighting regional differences and evaluating water balance closure. This addition will effectively clarify the suitability of our global product for regional-scale research in China, especially for applications requiring data from the most recent years.
  6. Fair comparison in Figure A5.
  
  You raise a very valid concern regarding the comparison of water-balance-based ET in Figure A5. Comparing products that have varying spatial coverage over deserts or water bodies can indeed introduce biases. To address this and ensure a strict and fair comparison, we will establish a unified spatial mask. This mask will explicitly account for barren desert regions and exclude evaporation from inland water bodies to standardize the evaluation area across all participating datasets. We will also expand the methodology section to provide detailed descriptions of how these specific datasets were masked and processed.
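The unified-mask idea described above can be sketched as follows. This is a minimal illustration only, assuming each ET product is a 2-D NumPy array on a shared grid with NaN where the product provides no data (for example, over deserts); the function names, the toy values, and the NaN convention are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def common_valid_mask(products, exclude=None):
    """Return a boolean mask that is True only where every product has data.

    `exclude` is an optional boolean array of cells to drop everywhere
    (here used for inland water bodies).
    """
    mask = np.ones(products[0].shape, dtype=bool)
    for arr in products:
        mask &= np.isfinite(arr)
    if exclude is not None:
        mask &= ~exclude
    return mask

def masked_mean(arr, mask, weights=None):
    """Weighted mean of `arr` over the cells selected by `mask`.

    `weights` would normally hold grid-cell areas; equal weights are
    assumed when it is omitted.
    """
    if weights is None:
        weights = np.ones_like(arr, dtype=float)
    return float(np.sum(arr[mask] * weights[mask]) / np.sum(weights[mask]))

# Toy 2x3 grids (mm/yr); NaN marks cells that product B does not cover.
et_a = np.array([[500.0, 300.0, 20.0], [700.0, 400.0, 10.0]])
et_b = np.array([[520.0, 280.0, np.nan], [690.0, 410.0, np.nan]])
water = np.array([[False, False, False], [True, False, False]])

mask = common_valid_mask([et_a, et_b], exclude=water)
mean_a = masked_mean(et_a, mask)
mean_b = masked_mean(et_b, mask)
print(mask.sum(), mean_a, mean_b)
```

Both means are computed over the same three cells, so any remaining difference reflects the products themselves rather than their differing spatial coverage.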
  # Response to Specific Comments
  
  We sincerely appreciate your meticulous review of the text. All the specific comments and grammatical corrections will be fully adopted and revised accordingly in the manuscript.
  
  Citation: https://doi.org/10.5194/essd-2026-94-AC1
RC2: 'Comment on essd-2026-94', Oscar Manuel Baez Villanueva, 26 May 2026

The article titled “Extended global terrestrial evapotranspiration and gross primary production dataset from 1982 to near present” presents the new version of the PML dataset (PML-V2.2). The dataset comprises three complementary products: (i) PML-V2.2a, an 8-day 500 m MODIS-based dataset (2000–2024); (ii) PML-V2.2b, a half-monthly 0.01deg AVHRR-based product (1982–2020) optimised for near-present monitoring; and (iii) PML-V2.2c, a half-monthly 0.1deg record consolidating the a and b products (1982–2024). Parameter optimisation was performed through a leave-one-out cross-validation procedure for each plant functional type (PFT). The manuscript is very well written, clear, and aligned with the scope of the journal. Additionally, I believe that this dataset is of particular interest to the research community! I hope the following comments and suggestions help further strengthen the manuscript.
General comments:
It would be interesting to mention in the abstract whether the product is expected to be updated regularly (e.g., annually). This information could further help the uptake of the dataset by new users.
The authors present PML-V2.2b as a product optimised for near-present monitoring. It would therefore be valuable to extend the dataset through 2025 (the last complete year) to further demonstrate this capability. In that case, the consolidated PML-V2.2c product could span 1982–2025. This would further strengthen the statement made in L66–67 regarding the production lag affecting many biophysically consistent datasets.
It would be worth considering the use of the Kling–Gupta Efficiency (KGE) instead of the Nash–Sutcliffe Efficiency (NSE). The KGE decomposes performance into correlation, bias, and variability components, providing additional diagnostic insight and facilitating comparison across regions and climatic conditions.
Out of curiosity, is there a particular reason for using MSWEP V2.8 instead of the more recent MSWEP V3.16?
Could the authors elaborate on how rainy days were treated in the case of the data from the flux sites?
Is there a sensitivity analysis associated with the optimised parameters presented in Table 2? It would be particularly interesting to assess whether parameter sensitivity varies across PFTs.
In L218, the authors mention that monthly CO2 concentrations from NOAA were used. How were these data temporally disaggregated to the 8-day and half-monthly scales?
Were the three models optimised independently? If so, it would be interesting to summarise the best-performing parameters for each version. If not, a brief discussion on the expected impact of using different forcing datasets across the three products (particularly between versions a and b) would be useful. It appears that the optimisation was performed at the 8-day temporal scale and therefore primarily for version a; it would be good to state this explicitly in the manuscript.
I appreciate the effort made by the authors to carefully merge AVHRR and MODIS observations in order to minimise discontinuities and artificial biases in long-term trends. It would be interesting to add a few sentences about the pixel-scale bidirectional consolidation process and the reverse-scaling procedure. In addition, why was the 2001–2003 period selected for the consolidation and not a longer period?
It would be very interesting to the readers to include performance metrics for all three dataset versions. Additionally, the consistency of the consolidation could be assessed at the half-monthly scale over the overlapping period across all products.
Besides the visualisation of global patterns, it would be very interesting to add a figure showcasing time series of the three products in comparison with in situ data.

Specific comments:
L22: The authors mention that PML-V2.2c exhibits “exceptional reliability”. It would be useful to specify relative to which products or benchmarks this statement is made. A comparison against other products would be very interesting!
L50–57: Please add references for the datasets mentioned.
L50: “Temporal span” or “record length” may fit better here than “temporal depth”.
L89–90: Please include the versions and references of the datasets.
Table 1: For GLEAM4, in the “key feature” column, “evaporative stress” would be more appropriate than “water stress”. In addition, the temporal coverage should be updated to 1980–2025.
L107: If I am not mistaken, bare soil evaporation should be denoted as Es instead of Eis to remain consistent with Eq. 2 and the following explanation.
L148: Perhaps “estimates” would be more appropriate than “observations” in this context.
L382–385: It would be very interesting to compare all dataset versions over the overlapping period, including metrics such as mean annual global ET, trends, and the partitioning of evaporation into its three main components, as discussed in these lines.

Citation: https://doi.org/10.5194/essd-2026-94-RC2
- AC2: 'Reply on RC2', Zhenwu Xu, 27 May 2026
  
  # Response to Oscar M. Baez-Villanueva (referee #2)
  
  We sincerely thank the reviewer for the thorough evaluation and the highly positive feedback on our updated PML dataset and manuscript. Your constructive suggestions are extremely valuable for enhancing the transparency, utility, and overall readability of our work. We will fully incorporate these suggestions into the revised manuscript. Below is our detailed, point-by-point response to your comments.
  
  # Response to General Comments
  
  1. Data update frequency and temporal extension.
  
  We agree that clarifying the update schedule is important for users. We will mention the annual update cycle in the Abstract. Regarding the extension to 2025, we are currently facing a widespread challenge: the severe satellite drift of MODIS since 2022 has caused significant data degradation for 2025, particularly in tropical rainforest regions. We identified this issue during our pre-update of PML data in early Feb. 2025; therefore, data for 2025 have not yet been released. As MODIS data alone are currently insufficient to support reliable further updates, we are transitioning to VIIRS forcing data (same spatio-temporal resolution and algorithm as MODIS). The VIIRS LAI processing is complete and free of drift issues, while the albedo data (VNP43IA3) are currently being uploaded in coordination with the GEE team.
  
  We plan to update the dataset through 2025 for the final published version (expected by July/August). Moving forward, with GEE-based MODIS/VIIRS drivers, our annual update latency in subsequent years will ideally be reduced to two to six months. We will explicitly state this timeline and our transition strategy for subsequent annual updates in the revised manuscript.
  
  2. Use of Kling-Gupta Efficiency (KGE).
  
  This is an excellent suggestion. Currently, we provide NSE, R, RMSE, and Bias to capture different dimensions of model performance. However, we agree that KGE is a robust composite metric that facilitates easier cross-product comparisons. We will calculate the KGE metrics for all flux sites and include them in the revised manuscript to complement our existing evaluations.
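To make the difference between the two metrics concrete, here is a minimal Python sketch, assuming the standard NSE definition and the widely used 2009 formulation of KGE (correlation, variability ratio, and bias ratio). Which KGE variant the revised manuscript will adopt is not stated, and the toy series below is illustrative rather than flux data.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))

def kge(obs, sim):
    """Kling-Gupta Efficiency and its three components.

    r     : linear correlation between simulation and observation
    alpha : ratio of standard deviations (variability error)
    beta  : ratio of means (bias error)
    """
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    score = 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return float(score), float(r), float(alpha), float(beta)

# Toy example: a simulation that overestimates every observation by 10 %.
obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim = 1.1 * obs

nse_value = nse(obs, sim)
kge_value, r, alpha, beta = kge(obs, sim)
print(round(nse_value, 3), round(kge_value, 3), round(r, 3), round(alpha, 3), round(beta, 3))
```

In this toy case the correlation is perfect, so the KGE components attribute the entire shortfall to bias and variability, a distinction the single NSE value does not expose.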
  
  3. Choice of MSWEP version.
  
  We will briefly clarify our choice of MSWEP V2.8 in the text. When we initiated the data preparation and long-term simulations, MSWEP V3 had not been officially released (it is currently still under review). Our preliminary investigations indicated slight differences between V2.8 and V3, and the long-term stability of V3 remained uncertain at the time. Therefore, V2.8 was selected to maintain strict data consistency across the 43-year simulation. We will consider comprehensively updating to V3 in future versions once its stability is fully evaluated by the community.
  
  4. Treatment of rainy days at flux sites.
  
  We will elaborate on this in the methodology section. In this study, we did not explicitly filter out rainy days from the flux site data. The primary reason is that our model runs and evaluations are conducted at an 8-day temporal scale. At this temporally aggregated scale, the instantaneous uncertainties and noise introduced by precipitation events on eddy covariance measurements are largely smoothed out. This minimizes the necessity for strict daily or hourly rainy-day filtering, while preserving the continuity of the time series.
  
  5. Parameter sensitivity analysis.
  
  We will add a brief discussion regarding parameter sensitivity across different PFTs. Since a formal parameter sensitivity analysis was not included in the previous 2019 RSE paper, we will incorporate a global sensitivity analysis (e.g., the variance-based Sobol method or the Morris screening method) in this updated version. This will make the parameterization of the PML model more transparent and scientifically robust.
  
  6. Temporal disaggregation of CO2 data.
  
  We will add a clarifying sentence in Section 2 to explain this processing. Specifically, the monthly NOAA CO2 concentrations were applied as static values for the 8-day or half-monthly steps within each respective month. Since terrestrial carbon-water coupling responds primarily to the long-term CO2 trend and seasonal cycles, the absence of high-frequency (sub-monthly) atmospheric CO2 fluctuations has a negligible impact on the simulations.
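The step-wise assignment described above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' code: it assumes 8-day periods that start on day of year 1, 9, 17, and so on, and it assigns each period the value of the month containing its start date, since the reply does not say how periods that straddle two months are handled. The monthly values are placeholders, not NOAA data.

```python
from datetime import date, timedelta

def eight_day_starts(year):
    """Start dates of 8-day periods within a year (day of year 1, 9, 17, ...)."""
    first = date(year, 1, 1)
    n_days = (date(year + 1, 1, 1) - first).days
    return [first + timedelta(days=d) for d in range(0, n_days, 8)]

def co2_for_periods(starts, monthly_co2):
    """Assign each period the monthly value of the month holding its start date."""
    return [monthly_co2[(d.year, d.month)] for d in starts]

# Placeholder monthly concentrations (ppm), keyed by (year, month).
monthly_co2 = {(2020, m): 410.0 + 0.1 * m for m in range(1, 13)}

starts = eight_day_starts(2020)
co2 = co2_for_periods(starts, monthly_co2)

for start, value in list(zip(starts, co2))[:5]:
    print(start.isoformat(), round(value, 1))
```

The same lookup works for half-monthly steps by replacing `eight_day_starts` with a list of the 1st and 16th of each month, which never straddle a month boundary.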
  
  7. Model optimization and forcing impacts.
  
  We will explicitly state in the manuscript that the parameter optimization was performed exclusively at the 8-day temporal scale using high-quality site meteorological observations and MODIS data, primarily targeting version a.
  
  We will also clarify that versions a and b share the exact same parameterization scheme; the discrepancies arise solely from the different forcing datasets. We intentionally selected “bias-corrected” forcing datasets (MSWEP + MSWX) rather than raw reanalysis data (e.g., ERA5, MERRA2) to minimize such forcing-induced biases. Combined with the GIMMS LAI4g dataset, which has undergone extensive calibration and validation using in situ observations and Landsat imagery, this strategy effectively balances bottom-up simulations with top-down constraints, which underpins the high fidelity of our water balance validation.
  
  8. Consolidation process and overlapping period.
  
  We will add more details describing the pixel-scale bidirectional consolidation process. A bidirectional correction was adopted because unidirectional scaling might fail to preserve both identical means and consistent long-term trends. Furthermore, the 2001–2003 period was selected because it represents a highly stable overlapping window with high-quality data from both AVHRR and MODIS sensors. Compared to using a longer baseline (e.g., 2001–2020), this shorter, focused window avoids introducing abrupt shifts or artifacts at the transition boundary (1999–2001).
  
  Importantly, we will expand the discussion to acknowledge that any cross-sensor consolidation inherently introduces uncertainties—a challenge we will face again with our upcoming transition to VIIRS. However, without such harmonization, continuous long-term studies across different satellite eras would be impossible. While some existing products tend to remain vague regarding their cross-sensor transition strategies, bridging this gap transparently is precisely the core value of our consolidated version “c”, which is particularly valuable for placing recent anomalies within a long-term historical context. We are committed to ensuring the methodological soundness of this processing, providing robust long-term estimates, and openly documenting the associated uncertainties in the revised manuscript.
  
  9. Performance metrics and time series comparisons.
  
  These are highly constructive suggestions. We will include a new figure showcasing the time series of the PML-V2.2a production data directly compared with in situ flux data, and we plan to upload more comprehensive site-level plots to our Zenodo repository for transparency.
  
  However, for versions b and c, we would like to clarify that performing a direct time-series comparison against flux towers is scientifically problematic. The spatial mismatch between a tower's footprint (typically ~1 km) and the coarse grid resolutions (0.01° and 0.1°) over heterogeneous land surfaces introduces massive uncertainties. This spatial footprint mismatch is also the primary reason we avoid direct horizontal site-level comparisons with other coarse-resolution global products. We will explicitly discuss this limitation in the revised manuscript.
  
  # Response to Specific Comments
  
  We sincerely appreciate your meticulous review of the text. All the specific comments, including dataset references, terminology adjustments, removing overstated or non-standard descriptions, and updating Table 1, will be fully adopted. We will also include brief cross-product comparisons in the Discussion section as suggested.
  
  Citation: https://doi.org/10.5194/essd-2026-94-AC2
RC3: 'Comment on essd-2026-94', Anonymous Referee #1, 27 May 2026

The authors have revised this manuscript according to my comments, and it can now be accepted for publication.

Citation: https://doi.org/10.5194/essd-2026-94-RC3
- AC3: 'Reply on RC3', Zhenwu Xu, 27 May 2026
  
  We sincerely thank the reviewer for the positive evaluation and for recommending our manuscript for publication. We will carefully revise the manuscript according to the changes detailed in our previous response.
  
  Citation: https://doi.org/10.5194/essd-2026-94-AC3
RC4: 'Comment on essd-2026-94', Anonymous Referee #1, 28 May 2026

It's a good revised manuscript and can be accepted for publication.

Citation: https://doi.org/10.5194/essd-2026-94-RC4

  # Response to General Comments
  1. The meaning of “near present”.
  
  We completely agree that clearly defining the temporal coverage is crucial for data users. To avoid any ambiguity caused by varying data latencies, we will explicitly specify the exact cutoff month. Unlike the “to present” MODIS products, our dataset is updated annually (when forcing is available) with a latency of approximately six months. Therefore, the current version is updated through December 2024. We will revise the Abstract and the Data Availability sections to clearly state this timeline. This clarification will provide users with precise expectations regarding the near-present coverage of the product.
  2. Flux tower details and data repository.
  
  This is a highly valuable suggestion that aligns perfectly with the principles of open science. To make the data more accessible and easier to reuse compared to a static supplementary table, we will directly upload the comprehensive attributes of all 208 flux tower sites (including their locations, vegetation types, and available data periods) into a permanent repository (e.g., Zenodo). Furthermore, we will also make the site-level model performance metrics and the underlying source data for key figures, especially those illustrating cross-product uncertainties, publicly available in the same repository.
  3. Comparison with other global products.
  
  We agree that contextualizing our results within the broader literature will greatly strengthen the Discussion section. We will add a dedicated paragraph comparing the PML-V2.2 global mean estimates and their long-term trends against several other mainstream global products (e.g., GLEAM and SiTHv2), supported by supplementary figures. This comparison will help readers better understand how our new product aligns with or differs from the current scientific consensus.
  4. Differences in global mean values in Figure 12.
  
  Thank you for your careful observation regarding Figure 12. We acknowledge that while the interannual variabilities and trends match closely, there is a slight discrepancy in the mean values between the b version and the a/c versions. Based on our analysis, this difference is primarily attributed to the slight differences in spatial averages in forcing data. Specifically, this is caused by systematic differences in the mean LAI magnitude between the GIMMS LAI4g and the smoothed MODIS LAI, combined with the scaling effects of simulating at a coarser spatial resolution. We will add a brief, data-supported discussion addressing this specific discrepancy in the revised text, with a new supplementary figure to help explain that.
  5. Comparison with PML-V2 China.
  
  This is an excellent point. Acknowledging the previously released regional dataset, PML-V2 China (He et al., 2022), will definitely benefit users focusing on regional studies. We will add a brief comparison between our new global dataset and the regional PML-V2 China dataset in the Discussion section. This comparison will focus on highlighting regional differences and evaluating water balance closure. This addition will effectively clarify the suitability of our global product for regional-scale research in China, especially for applications requiring data from the most recent years.
  6. Fair comparison in Figure A5.
  
  You raise a very valid concern regarding the comparison of water-balance-based ET in Figure A5. Comparing products that have varying spatial coverage over deserts or water bodies can indeed introduce biases. To address this and ensure a strict and fair comparison, we will establish a unified spatial mask. This mask will explicitly account for barren desert regions and exclude evaporation from inland water bodies to standardize the evaluation area across all participating datasets. We will also expand the methodology section to provide detailed descriptions of how these specific datasets were masked and processed.
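
A unified mask of this kind can be illustrated with a minimal sketch. The function names, the toy 2 × 2 grids, and the uniform weights below are assumptions for illustration only and are not the processing chain used for Figure A5. The sketch keeps only the cells where every product reports a value and that are not flagged as inland water, then computes weighted means over that common area.

```python
import numpy as np

def common_mask(products, water):
    """Cells where every product has data and that are not inland water."""
    stack = np.stack(products)                      # shape: (n_products, rows, cols)
    has_data_everywhere = np.all(np.isfinite(stack), axis=0)
    return has_data_everywhere & ~water

def masked_mean(field, mask, weights):
    """Weighted mean over the shared evaluation area (weights = cell areas)."""
    return np.sum(field[mask] * weights[mask]) / np.sum(weights[mask])

# Toy example (hypothetical values): product b has no data over one "desert" cell.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[1.5, np.nan], [2.5, 4.5]])
water = np.array([[False, False], [False, True]])  # hypothetical water flag
weights = np.ones((2, 2))                           # uniform cell areas, for simplicity

mask = common_mask([a, b], water)
mean_a = masked_mean(a, mask, weights)
mean_b = masked_mean(b, mask, weights)
```

On a real latitude-longitude grid the weights would be cell areas (for example, proportional to the cosine of latitude) rather than ones.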
  # Response to Specific Comments
  
  We sincerely appreciate your meticulous review of the text. All the specific comments and grammatical corrections will be fully adopted and revised accordingly in the manuscript.
  
  Citation: https://doi.org/10.5194/essd-2026-94-AC1
RC2: 'Comment on essd-2026-94', Oscar Manuel Baez Villanueva, 26 May 2026

The article titled “Extended global terrestrial evapotranspiration and gross primary production dataset from 1982 to near present” presents the new version of the PML dataset (PML-V2.2). The dataset comprises three complementary products: (i) PML-V2.2a, an 8-day 500 m MODIS-based dataset (2000–2024); (ii) PML-V2.2b, a half-monthly 0.01deg AVHRR-based product (1982–2020) optimised for near-present monitoring; and (iii) PML-V2.2c, a half-monthly 0.1deg record consolidating the a and b products (1982–2024). Parameter optimisation was performed through a leave-one-out cross-validation procedure for each plant functional type (PFT). The manuscript is very well written, clear, and aligned with the scope of the journal. Additionally, I believe that this dataset is of particular interest to the research community! I hope the following comments and suggestions help further strengthen the manuscript.
General comments:
It would be interesting to mention in the abstract whether the product is expected to be updated regularly (e.g., annually). This information could further help the uptake of the dataset from new users.
The authors present PML-V2.2b as a product optimised for near-present monitoring. It would therefore be valuable to extend the dataset through 2025 (the last complete year) to further demonstrate this capability. In that case, the consolidated PML-V2.2c product could span 1982–2025. This would further strengthen the statement made in L66–67 regarding the production lag affecting many biophysically consistent datasets.
It would be worth considering the use of the Kling–Gupta Efficiency (KGE) instead of the Nash–Sutcliffe Efficiency (NSE). The KGE decomposes performance into correlation, bias, and variability components, providing additional diagnostic insight and facilitating comparison across regions and climatic conditions.
Out of curiosity, is there a particular reason for using MSWEP V2.8 instead of the more recent MSWEP V3.16?
Could the authors elaborate on how rainy days were treated in the case of the data from the flux sites?
Is there a sensitivity analysis associated with the optimised parameters presented in Table 2? It would be particularly interesting to assess whether parameter sensitivity varies across PFTs.
In L218, the authors mention that monthly CO2 concentrations from NOAA were used. How were these data temporally disaggregated to the 8-day and half-monthly scales?
Were the three models optimised independently? If so, it would be interesting to summarise the best-performing parameters for each version. If not, a brief discussion on the expected impact of using different forcing datasets across the three products (particularly between versions a and b) would be useful. It appears that the optimisation was performed at the 8-day temporal scale and therefore primarily for version a; it would be good to state this explicitly in the manuscript.
I appreciate the effort made by the authors to carefully merge AVHRR and MODIS observations in order to minimise discontinuities and artificial biases in long-term trends. It would be interesting to add a few sentences about the pixel-scale bidirectional consolidation process and the reverse-scaling procedure. In addition, why was the 2001–2003 period selected for the consolidation and not a longer period?
It would be very interesting to the readers to include performance metrics for all three dataset versions. Additionally, the consistency of the consolidation could be assessed at the half-monthly scale over the overlapping period across all products.
Besides the visualisation of global patterns, it would be very interesting to add a figure showcasing time series of the three products in comparison with in situ data.

Specific comments:
L22: The authors mention that PML-V2.2c exhibits “exceptional reliability”. It would be useful to specify relative to which products or benchmarks this statement is made. A comparison against other products would be very interesting!
L50–57: Please add references for the datasets mentioned.
L50: “Temporal span” or “record length” may fit better here than “temporal depth”.
L89–90: Please include the versions and references of the datasets.
Table 1: For GLEAM4, in the “key feature” column, “evaporative stress” would be more appropriate than “water stress”. In addition, the temporal coverage should be updated to 1980–2025.
L107: If I am not mistaken, bare soil evaporation should be denoted as Es instead of Eis to remain consistent with Eq. 2 and the following explanation.
L148: Perhaps “estimates” would be more appropriate than “observations” in this context.
L382–385: It would be very interesting to compare all dataset versions over the overlapping period, including metrics such as mean annual global ET, trends, and the partitioning of evaporation into its three main components, as discussed in these lines.

Citation: https://doi.org/10.5194/essd-2026-94-RC2
- AC2: 'Reply on RC2', Zhenwu Xu, 27 May 2026
  
  # Response to Oscar M. Baez-Villanueva (referee #2)
  
  We sincerely thank the reviewer for the thorough evaluation and the highly positive feedback on our updated PML dataset and manuscript. Your constructive suggestions are extremely valuable for enhancing the transparency, utility, and overall readability of our work. We will fully incorporate these suggestions into the revised manuscript. Below is our detailed, point-by-point response to your comments.
  
  # Response to General Comments
  
  1. Data update frequency and temporal extension.
  
  We agree that clarifying the update schedule is important for users. We will mention the annual update cycle in the Abstract. Regarding the extension to 2025, we are currently facing a widespread challenge: the severe satellite drift of MODIS since 2022 has caused significant data degradation for 2025, particularly in tropical rainforest regions. We identified this issue during our pre-update of the PML data in early Feb. 2025; therefore, data for 2025 have not been released at the moment. As MODIS data alone are currently insufficient to support reliable further updates, we are transitioning to VIIRS forcing data (same spatiotemporal resolution and algorithm as MODIS). The VIIRS LAI processing is complete and free of drift issues, while the Albedo data (VNP43IA3) are currently being uploaded in coordination with the GEE team.
  
  We plan to update the dataset through 2025 for the final published version (expected by July/August). Moving forward, with GEE-based MODIS/VIIRS drivers, our annual update latency in subsequent years will ideally be reduced to two to six months. We will explicitly state this timeline and our transition strategy for subsequent annual updates in the revised manuscript.
  
  2. Use of Kling-Gupta Efficiency (KGE).
  
  This is an excellent suggestion. Currently, we provide NSE, R, RMSE, and Bias to capture different dimensions of model performance. However, we agree that KGE is a robust composite metric that facilitates easier cross-product comparisons. We will calculate the KGE metrics for all flux sites and include them in the revised manuscript to complement our existing evaluations.
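
For reference, the two metrics can be sketched as follows. This is a minimal illustration assuming the original 2009 formulation of KGE (correlation, variability ratio, and bias ratio); the function names and the toy series are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(sim, obs):
    """Kling-Gupta Efficiency and its three components (r, alpha, beta)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    score = 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return score, r, alpha, beta

# Toy series (hypothetical): a simulation that overestimates by exactly 10 %.
obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim = 1.1 * obs

nse_value = nse(sim, obs)
kge_value, r, alpha, beta = kge(sim, obs)
```

In this toy case the correlation is perfect, so the entire KGE penalty comes from the variability and bias terms, which is the kind of diagnostic information a single NSE value does not separate.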
  
  3. Choice of MSWEP version.
  
  We will briefly clarify our choice of MSWEP V2.8 in the text. When we initiated the data preparation and long-term simulations, MSWEP V3 had not been officially released (it is currently still under review). Our preliminary investigations indicated slight differences between V2.8 and V3, and the long-term stability of V3 remained uncertain at the time. Therefore, V2.8 was selected to maintain strict data consistency across the 43-year simulation. We will consider comprehensively updating to V3 in future versions once its stability is fully evaluated by the community.
  
  4. Treatment of rainy days at flux sites.
  
  We will elaborate on this in the methodology section. In this study, we did not explicitly filter out rainy days from the flux site data. The primary reason is that our model runs and evaluations are conducted at an 8-day temporal scale. At this temporally aggregated scale, the instantaneous uncertainties and noise introduced by precipitation events on eddy covariance measurements are largely smoothed out. This minimizes the necessity for strict daily or hourly rainy-day filtering, while preserving the continuity of the time series.
  
  5. Parameter sensitivity analysis.
  
  We will add a brief discussion regarding parameter sensitivity across different PFTs. Since a formal parameter sensitivity analysis was not included in the previous 2019 RSE paper, we will incorporate a variance-based sensitivity analysis (e.g., the Sobol or Morris method) in this updated version. This will make the parameterization of the PML model more transparent and scientifically robust.
  
  6. Temporal disaggregation of CO2 data.
  
  We will add a clarifying sentence in Section 2 to explain this processing. Specifically, the monthly NOAA CO2 concentrations were applied as static values for the 8-day or half-monthly steps within each respective month. Since terrestrial carbon-water coupling responds primarily to the long-term CO2 trend and seasonal cycles, the absence of high-frequency (sub-monthly) atmospheric CO2 fluctuations has a negligible impact on the simulations.
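
The step-wise assignment described above can be sketched as below. The sketch assumes that each 8-day period takes the value of the month containing its start date; how periods that straddle two months are handled is not stated here, so that rule is an assumption. The function names are hypothetical and the CO2 values are placeholders, not NOAA data.

```python
from datetime import date, timedelta

def eight_day_starts(year):
    """Start dates of the 46 eight-day periods in a year (day of year 1, 9, ..., 361)."""
    first = date(year, 1, 1)
    return [first + timedelta(days=offset) for offset in range(0, 365, 8)]

def co2_for_steps(monthly_co2, starts):
    """Hold the monthly value constant for every step starting within that month."""
    return [monthly_co2[(d.year, d.month)] for d in starts]

# Placeholder monthly series (hypothetical values in ppm).
monthly = {(2020, month): 410.0 + month for month in range(1, 13)}

steps = eight_day_starts(2020)
values = co2_for_steps(monthly, steps)
```

The same lookup applies to half-monthly steps, where each step lies entirely within one month and no straddling rule is needed.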
  
  7. Model optimization and forcing impacts.
  
  We will explicitly state in the manuscript that the parameter optimization was performed exclusively at the 8-day temporal scale using high-quality site meteorological observations and MODIS data, primarily targeting version a.
  
  We will also clarify that versions a and b share the exact same parameterization scheme; the discrepancies arise solely from the different forcing datasets. We intentionally selected “bias-corrected” forcing datasets (MSWEP + MSWX) rather than raw reanalysis data (e.g., ERA5, MERRA2) to minimize such forcing-induced biases. Combined with the GIMMS LAI4g dataset, which has undergone extensive calibration and validation using in situ observations and Landsat imagery, this strategy effectively balances bottom-up simulations with top-down constraints, which underpins the high fidelity of our water balance validation.
  
  8. Consolidation process and overlapping period.
  
  We will add more details describing the pixel-scale bidirectional consolidation process. A bidirectional correction was adopted because unidirectional scaling might fail to preserve both identical means and consistent long-term trends. Furthermore, the 2001–2003 period was selected because it represents a highly stable overlapping window with high-quality data from both AVHRR and MODIS sensors. Compared to using a longer baseline (e.g., 2001–2020), this shorter, focused window avoids introducing abrupt shifts or artifacts at the transition boundary (1999–2001).
  
  Importantly, we will expand the discussion to acknowledge that any cross-sensor consolidation inherently introduces uncertainties—a challenge we will face again with our upcoming transition to VIIRS. However, without such harmonization, continuous long-term studies across different satellite eras would be impossible. While some existing products tend to remain vague regarding their cross-sensor transition strategies, bridging this gap transparently is precisely the core value of our consolidated version “c”, which is particularly valuable for placing recent anomalies within a long-term historical context. We are committed to ensuring the methodological soundness of this processing, providing robust long-term estimates, and openly documenting the associated uncertainties in the revised manuscript.
  
  9. Performance metrics and time series comparisons.
  
  These are highly constructive suggestions. We will include a new figure showcasing the time series of the PML-V2.2a production data directly compared with in situ flux data, and we plan to upload more comprehensive site-level plots to our Zenodo repository for transparency.
  
  However, for versions b and c, we would like to clarify that performing a direct time-series comparison against flux towers is scientifically problematic. The spatial mismatch between a tower's footprint (typically ~1 km) and the coarse grid resolutions (0.01° and 0.1°) over heterogeneous land surfaces introduces massive uncertainties. This spatial footprint mismatch is also the primary reason we avoid direct horizontal site-level comparisons with other coarse-resolution global products. We will explicitly discuss this limitation in the revised manuscript.
  
  # Response to Specific Comments
  
  We sincerely appreciate your meticulous review of the text. All the specific comments, including dataset references, terminology adjustments, removing overstated or non-standard descriptions, and updating Table 1, will be fully adopted. We will also include brief cross-product comparisons in the Discussion section as suggested.
  
  Citation: https://doi.org/10.5194/essd-2026-94-AC2
RC3: 'Comment on essd-2026-94', Anonymous Referee #1, 27 May 2026

The authors have revised this manuscript according to my comments, and it can be accepted for publication at present.

Citation: https://doi.org/10.5194/essd-2026-94-RC3
- AC3: 'Reply on RC3', Zhenwu Xu, 27 May 2026
  
  We sincerely thank the reviewer for the positive evaluation and for recommending our manuscript for publication. We will carefully revise the manuscript according to the changes detailed in our previous response.
  
  Citation: https://doi.org/10.5194/essd-2026-94-AC3
RC4: 'Comment on essd-2026-94', Anonymous Referee #1, 28 May 2026

It's a good revised manuscript and can be accepted for publication.

Citation: https://doi.org/10.5194/essd-2026-94-RC4

Zhenwu Xu, Yongqiang Zhang, Dongdong Kong, Ning Ma, and Xuanze Zhang

Data sets

PML-V2.2: Global terrestrial evapotranspiration and gross primary production dataset from 1982 to near present
Zhenwu Xu, Yongqiang Zhang, and Dongdong Kong
https://doi.org/10.11888/Terre.tpdc.303314


Viewed

Total article views: 1,675 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,154	475	46	1,675	39	61


Views and downloads (calculated since 02 Mar 2026)

Month	HTML	PDF	XML	Total
Mar 2026	384	186	17	587
Apr 2026	359	149	10	518
May 2026	247	73	11	331
Jun 2026	53	24	5	82
Jul 2026	111	43	3	157



Latest update: 24 Jul 2026

Short summary

The PML-V2.2 dataset integrates MODIS and AVHRR archives to provide coupled estimates of terrestrial evapotranspiration and gross primary production from 1982 to near present. Rigorously calibrated at 208 flux stations and validated against water balances in 152 large river basins, this extended and seamless record reveals significant increases in global evapotranspiration, vegetation productivity, and water use efficiency, supporting diverse studies in hydrology, ecology, and Earth sciences.

