We introduce the Frequent Rainfall Observations on GridS (FROGS) database
(Roca et al., 2019). It is composed of gridded daily-precipitation products
on a common
Precipitation is a key element of the water and energy cycle. Observational efforts to document precipitation have a long history (Park et al., 2017) and have matured rapidly in recent decades. Historically available in situ archives and associated gridded products (e.g., Becker et al., 2013) are being complemented by the burgeoning capability of satellite observations (Levizzani et al., 2018). Reanalysis precipitation is derived not only from the model physics but also from a short-term forecast, and it reflects observed atmospheric variability (Bosilovich et al., 2008). While monthly and/or large spatial scale observations have been available for some time, recent progress permits documentation of precipitation at finer spatial and temporal scales, which allows us to address new challenges such as extreme precipitation (Westra et al., 2013).
Indeed, recent processing of global-scale ground-based archives has led to
new datasets at a
Recent GEWEX-led (Global Energy and Water Exchanges) assessments have paved the way for efficient and useful coordinated intercomparison and validation exercises: the Cloud Assessment (Stubenrauch et al., 2013) and the Water Vapor Assessment (Schröder et al., 2016, 2019). One of the reasons for this success lies in the making of a common gridded database, facilitating the handling of many datasets that were originally available on various grids, resolutions, formats, data types, etc. The Cloud Assessment database encompasses 23 products, mainly remotely sensed, and is now being extended in time and in the number of products available (Claudia Stubenrauch, personal communication, 2018). Within the Water Vapor Assessment, precipitable water spans 22 products, originating from in situ observations, satellite measurements and reanalysis outputs (Schröder et al., 2018). In both assessments though, a common publicly available database has been provided for further analysis after the main assessment effort. Here we follow the legacy of the GEWEX Data Analysis Panel (GDAP) assessments by building a database of daily precipitation with data originating from rain gauges, satellite and reanalysis products. The database, Frequent Rainfall Observations on GridS (FROGS), is released at the beginning of the assessment. This will help to integrate new investigations and its overall assessment recently initiated under the auspices of GEWEX GDAP and IPWG (Haddad and Roca, 2017). This also includes a dedicated effort to analyze extreme events and assess their characteristics under the joint WCRP Grand Challenge on Weather and Climate Extremes–GEWEX GDAP project on “Precipitation Extremes” (Alexander et al., 2018) with a special issue being developed on the topic.
The aim of this paper is to introduce the FROGS database that includes
ground-based, satellite and reanalysis products, gridded to a common
In situ products are seen by many as offering a “ground truth”, but this is
not necessarily the case since they are gridded estimates based on various
interpolation algorithms which use incomplete station networks of varying
quality and density. All products have their pros and cons especially when
it comes to extreme precipitation estimates, and all are limited in data
coverage over certain land regions, e.g., Africa. Some include error estimates
or quality masks while others do not. Here we have chosen to include in
FROGS products that offer daily global or quasi-global land estimates.
All products are either already available on a
The ground-based datasets.
The NOAA CPC unified gauge-based analysis of global daily precipitation
(Chen et al., 2008; Xie et al., 2010) interpolates data
from over 30 000 stations over land onto a regular grid with dedicated
quality control (QC) and accounting for topographic biases
(Xie et al., 2007). The QC is performed through
historical record comparisons and neighbor checks, concurrent
radar and/or satellite observations and utilizes numerical model forecasts. The
gauge reports come from multiple sources including GTS, COOP and other
national and international agencies. The daily analysis is constructed on a
The Global Precipitation Climatology Centre (GPCC) builds a suite of gridded
precipitation products based on rain gauge measurements and comprehensive
quality control (Becker et al., 2013). In particular a full
global
REGEN is the name given to a set of daily-land-based precipitation datasets
created through a collaboration between the University of New South Wales
(UNSW), GPCC and NOAA's National Centers for Environmental Information
(NCEI). There are two related datasets that are currently available on a
Most of the “satellite” precipitation estimation products make use of ancillary, non-satellite data in their estimations and these enriched products are regarded as the best estimate. Nevertheless, for most of these products, an unadjusted version is also available and is included here. Hardly any dataset is truly global, and we present below the quasi-global land and ocean datasets as well as the ocean-only and the land-only quasi-global data products currently available in FROGS. The database is completed with some regional products, including ones covering tropical land and oceans and ones covering only continental Africa and South America. Table 2 summarizes the list of products along with information regarding their respective spatial and temporal coverage.
The satellite-based datasets.
Most of the products are available at the daily scale, and so this is the
version we use. The day is defined over the 00:00–00:00Z period for each of the datasets unless otherwise specified. The daily data are then regridded onto
a
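As an illustration of the regridding step, a minimal sketch of block averaging from a finer regular latitude-longitude grid to a coarser one is given here. The function name `coarsen_block_mean`, the integer coarsening factor and the cosine-latitude area weights are assumptions made for illustration; they are not a description of the procedure actually used to build FROGS.

```python
import math


def coarsen_block_mean(field, lats, factor):
    """Average factor x factor blocks of a regular lat-lon field.

    field  : list of rows (one row per latitude) in mm/day, None = missing
    lats   : latitude in degrees of each row centre
    factor : hypothetical integer coarsening factor (fine cells per coarse cell side)
    """
    n_lat, n_lon = len(field), len(field[0])
    if n_lat % factor or n_lon % factor:
        raise ValueError("grid size must be a multiple of the coarsening factor")
    coarse = []
    for i0 in range(0, n_lat, factor):
        row = []
        for j0 in range(0, n_lon, factor):
            wsum = vsum = 0.0
            for i in range(i0, i0 + factor):
                # Cell area on a regular grid is proportional to cos(latitude).
                w = math.cos(math.radians(lats[i]))
                for j in range(j0, j0 + factor):
                    v = field[i][j]
                    if v is not None:  # missing cells do not contribute
                        wsum += w
                        vsum += w * v
            row.append(vsum / wsum if wsum > 0 else None)
        coarse.append(row)
    return coarse
```

Averaging of this kind conserves the area mean of the valid cells but lowers local maxima, which is worth keeping in mind when extremes are compared across products with different native resolutions.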
The 3B42 v7.0 product (Huffman et al., 2007) is a reference product in various previous studies of tropical rainfall distribution (Maggioni et al., 2016). It also exemplifies what a dataset that is highly geared towards microwave data (imagers and 183 GHz sounders) can provide in terms of daily accumulation. It also puts into perspective the use of scattering-based retrieval over land for instantaneous retrieval (Gopalan et al., 2010). The combined TRMM radar–imager product (Haddad et al., 1997) serves as a reference for other microwave instruments prior to merging. Geostationary-based IR imagery is also used in the algorithm to ensure observations when no low-Earth-orbiting satellite observations are available. The technique also uses a sophisticated bias correction approach based on the GPCP monthly analysis over land. While this popular product has been evaluated over a very large number of small regions and catchments as well as over the whole tropics for certain metrics (Sun et al., 2018), it still lacks systematic intercomparison with the whole suite of products presented here. As a consequence, it is included here even though NASA has announced the discontinuation of its production post-2018. As a complement to the gauge-adjusted product, the microwave-calibrated IR estimates and the microwave-only estimates are also provided. Along with this reference product, NASA has also been releasing a low-latency version called 3B42RT, where RT stands for real time and no gauge data are used. Two versions are provided here: the actual product and the uncalibrated one (Huffman et al., 2007). The suite of 3B42 products is at the core of the constellation-based family of satellite rainfall products.
The Global Satellite Mapping of Precipitation (GSMaP) product provides high-resolution precipitation estimations using satellite observations from multiple platforms (Kubota et al., 2007; Mega et al., 2018). This product is mainly based on the microwave estimation of rainfall for a suite of microwave imagers and sounders. The suite of GSMaP products hence belongs to the constellation-based family of satellite rainfall products. The microwave instantaneous rain rate estimates (Aonashi et al., 2009) are propagated based on cloud motion wind vectors originally derived from IR-geostationary imagery to yield a gridded high-resolution precipitation product (Ushio et al., 2009). GSMaP belongs to the morphing-based microwave algorithms, like CMORPH (Joyce et al., 2004) or searchlight (Bellerby, 2013). To complement the satellite-only estimation, the product is further scaled to rain gauge estimates to correct for some bias over land. Two sets of products are included in the database: unadjusted and adjusted. Two versions of the GSMaP products are provided here: the so-called reanalysis and the near-real-time versions, which differ in the amount of data used in the processing. Owing to the production schedule, the homogeneous processing of the reanalysis data has been performed from mid-2000 up to April 2014. As a consequence, we have restricted our use to the processed full years from 2001 to 2013 inclusive. While the original product is offered as two daily averages, 00:00–00:00Z and 12:00–12:00Z, here for the sake of homogeneity only the 00:00–00:00Z daily average is provided in the ensemble database. The near-real-time dataset extends up to 2017.
PERSIANN-CDR v1 is a quasi-global IR-based product trained over radar data in the US and normalized to GPCP monthly totals (Ashouri et al., 2015). It can be thought of as an alternative daily downscaling of the GPCP monthly data to that of GPCP 1DD CDR. Despite sharing monthly totals, the two products differ substantially in their estimation of the daily-precipitation distribution (Sun et al., 2018) and so are included in the database. The product does not rely on passive microwave data and as a consequence extends back further in time than GPCP 1DD. PERSIANN-CDR has been extensively evaluated over various regions and was shown to provide mixed levels of agreement with observations from local rain gauge networks (Miao et al., 2015; Tan and Santo, 2018). PERSIANN-CDR is the climate monitoring oriented product of the PERSIANN family of datasets that can otherwise be accessed at the CHRS data portal (Nguyen et al., 2019).
The CMORPH product (Joyce et al., 2004; Xie et al., 2017) belongs to the microwave-based morphing algorithms like GSMaP (Kubota et al., 2007). The microwave-derived instantaneous rain rate estimates from multiple platforms are propagated using cloud motion wind vectors originally derived from IR-geostationary imagery and a Kalman filter (Joyce and Xie, 2011). Such an approach results in a high-resolution precipitation product. The CMORPH products belong to the constellation-based family of satellite rainfall products, and both microwave imagers and sounders are used. The instantaneous imager-based rain rates are obtained using GPROF 2004 (Kummerow et al., 2001) while the sounder estimation relies on the algorithm of Ferraro et al. (2005). Two versions are provided: with (CRT) and without (RAW) gauge adjustment. The adjustment is performed using PDF matching. Over land, the daily-CPC-gauge analysis is used for this correction (Xie et al., 2003), while over ocean the adjustment is done using the GPCP pentad-merged product. The product is thought to perform well overall, with a small bias relative to the gauges, yet it experiences difficulties with snow and cold season rainfall (Xie et al., 2017).
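The idea behind PDF matching can be sketched as an empirical quantile mapping: a satellite value is replaced by the gauge value that has the same rank in its own distribution. The sketch below is a generic illustration only; the function name `pdf_match` and the way the two samples are collected are assumptions, and the operational CMORPH adjustment (sampling windows, spatial domains, handling of zeros) is not described here.

```python
from bisect import bisect_right


def pdf_match(value, satellite_sample, gauge_sample):
    """Map a satellite value to the gauge value of equal empirical quantile.

    satellite_sample and gauge_sample are hypothetical collocated samples
    (mm/day) gathered over some space-time window.
    """
    sat = sorted(satellite_sample)
    gau = sorted(gauge_sample)
    if not sat or not gau:
        raise ValueError("both samples must be non-empty")
    # Empirical non-exceedance probability of `value` in the satellite sample.
    q = bisect_right(sat, value) / len(sat)
    # Position of the same quantile in the gauge sample, clamped to valid indices.
    k = min(len(gau) - 1, max(0, int(round(q * len(gau))) - 1))
    return gau[k]
```

With this mapping the adjusted satellite values inherit the gauge distribution while keeping the ordering of the satellite estimates.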
The Global Precipitation Climatology Project (GPCP) Climate Data Record (CDR) Version 1.3 daily product
(Huffman et al., 2001) is another reference
product used in various previous studies (Adler et al.,
2017). The GPCP CDR dataset is the only global product in the database. It
is adapted from the Geostationary Operational Environmental Satellite (GOES)
precipitation index technique with monthly, local adjustments. The approach
merges IR imagery from geostationary and polar platforms. It relies on the
use of one single microwave platform and the Level 2 retrievals of Kummerow
et al. (1996). A bias adjustment scheme is finally used over land that
relies on rain gauge data from the GPCC database at the monthly scale. Over
the high latitudes, GPCP incorporates IR-based precipitation estimations and
microwave-derived rain rates for the lower latitudes. The original data file from NOAA contains a valid range attribute between 0 and 100 mm d
The Climate Hazards Infrared Precipitation (CHIRP) and the Climate Hazards
Infrared Precipitation with Stations (CHIRPS) are satellite-based
precipitation products (Funk et al., 2015).
While CHIRP is satellite only, CHIRPS also benefits from station data from five
public data sources (GHCN monthly and daily, Global Surface Summary Of the Day, GTS
daily, Southern African Science Service Centre for Climate Change and
Adaptive Land Management) as well as private datasets from various
countries in the world (see Funk et al., 2015, for the list). As a result,
the density of gauges in the final product varies significantly in time as
well as space. CHIRP uses infrared observations from geostationary
satellites in a modified GOES precipitation index (GPI) approach and various
ground-based and alternative sources (in-house climatology, 3B42, Climate Forecast
System outputs) for its calibration on a monthly scale. It is considered
as a rain-gauge-free or satellite-only product. Then the CHIRPS estimates
are obtained by merging the stations with the CHIRP estimates using a
weighted average of the closest stations and CHIRP results for each
0.05
While all of the other products are based on more or less indirect measurements, this product relies on very indirect evidence of precipitation: it relates satellite-based estimations of soil moisture to the precipitation that affected the surface. This product is based on the SM2RAIN algorithm (Brocca et al., 2013). The algorithm is applied to the active and passive ESA Climate Change Initiative soil moisture datasets (Ciabatta et al., 2018). It is an alternative way to use indirect satellite-based measurements to estimate rainfall. Note that due to soil moisture data quality issues, a mask is applied to the rainfall products, and no estimates are provided over tropical rainforest areas, frozen and snow-covered soil, and areas with topographic complexity.
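To make the principle concrete, the sketch below inverts a simplified soil water balance of the form p = Z* ds/dt + a s^b, where s is the relative soil saturation. This form, the function name `sm2rain_sketch` and the parameters `z_star`, `a` and `b` are assumptions made for illustration; the formulation and calibration actually used for the product should be taken from Brocca et al. (2013) and Ciabatta et al. (2018).

```python
def sm2rain_sketch(saturation, dt_days, z_star, a, b):
    """Estimate rainfall (mm/day) from a relative soil saturation series (0 to 1).

    Assumed simplified balance: p = z_star * ds/dt + a * s**b.
    z_star (mm), a (mm/day) and b (dimensionless) are hypothetical
    calibration parameters; negative estimates are set to zero.
    """
    rain = []
    for s_prev, s_next in zip(saturation, saturation[1:]):
        ds_dt = (s_next - s_prev) / dt_days      # rate of change of saturation
        s_mid = 0.5 * (s_prev + s_next)          # saturation at the interval midpoint
        p = z_star * ds_dt + a * s_mid ** b      # storage change plus drainage term
        rain.append(max(p, 0.0))
    return rain
```

A rising saturation yields a positive rainfall estimate, whereas a drying soil yields zero, which illustrates why the approach cannot work where soil moisture retrievals are unreliable.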
The Hamburg Ocean-Atmosphere Parameters and Fluxes from Satellite Data (HOAPS) product is described at length in Andersson et al. (2010). This product relies on recalibrated and inter-calibrated measurements
from SSM/I and SSMIS passive microwave radiometers (Fennig
et al., 2017) to estimate a suite of fresh-water budget elements globally
(80
The recently released TAPEER product is based on the universally adjusted
Geostationary Operational Environmental Satellite precipitation index
technique (Xu et al., 1999) that merges geostationary infrared
imagery with microwave instantaneous rain rate estimates at daily local
scales to yield the daily-precipitation accumulation (Kidd
et al., 2003). The current implementation relies on the BRAIN L2 dataset
(Viltard et al., 2006) for a suite of conical microwave
imagers and includes the SAPHIR data from the Megha-Tropiques mission
(Roca et al., 2015) for rainfall detection and is available
at
The Tropical Applications of Meteorology using SATellite data and
ground-based observations (TAMSAT; versions 2.0 and 3.0; Maidment
et al., 2017) product provides rainfall estimates across Africa
based both on geostationary thermal infrared (TIR) images obtained every
15 min (30 min prior to June 2006) and on ground-based observations from
the Global Telecommunications System (GTS). The TAMSAT algorithm is based on
two primary data inputs: (i) Meteosat TIR imagery provided by the European
Organisation for the Exploitation of Meteorological Satellites (EUMETSAT)
and (ii) rain gauge observations (daily accumulated, 06:00–06:00 UTC) for
calibration. The general procedure follows three steps: (a) algorithm
calibration at the dekadal (10-day; version 2.0) and pentadal (5-day; version 3.0)
time steps, (b) estimation of the pentadal and dekadal rainfall and (c) estimation of daily rainfall. The TAMSAT daily rainfall estimates
have a native resolution of 0.0375
The African Rainfall Climatology version 2.0 (ARC2) is a revision of the
first version of the ARC and is consistent with the operational Rainfall
Estimation Version 2 (RFE 2.0) (Novella and Thiaw, 2013). The
product is a composite of (i) 3-hourly geostationary infrared (IR) data
centered over Africa from the European Organisation for the Exploitation of
Meteorological Satellites (EUMETSAT) and (ii) quality-controlled 24 h
(06:00–06:00 UTC) rainfall accumulation records from the Global
Telecommunication System (GTS) gauge database. The calibrated IR and the
quality-controlled GTS gauges are then combined following multiple criteria
(i.e., the two-step merging process) to produce the final rainfall
estimates. The ARC2 daily dataset is updated regularly. The native
resolution is
The Combined Scheme approach (CoSch) (Vila
et al., 2009) is a gauge-satellite-based precipitation product that provides
daily gridded estimates over Latin America. The general procedure for
satellite-gauge merging and data production involves the following tasks: (i) obtain and run quality control of global and regional rain gauge data from GTS and multiple institutions, respectively, (ii) reprocess the daily
accumulated satellite-based rainfall fields, following the same time
accumulation as the rain gauges (12:00–12:00 UTC) and (iii) apply the additive
and multiplicative bias correction schemes for each station on a daily
basis. The current version of the CoSch product uses the real-time TRMM
Multi-satellite Precipitation Analysis (TMPA-RT, Version 7; Huffman et al., 2007)
as its high-quality satellite rainfall input. The CoSch daily rainfall
estimates database is available from March 2000 to the present and its
native spatial resolution is 0.25
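The two bias correction schemes named in step (iii) can be sketched for a single station as follows. The function names and the handling of edge cases (negative results, zero satellite rainfall at the station) are assumptions made for illustration; how CoSch combines the two schemes and spreads the corrections away from the stations is described in Vila et al. (2009) and is not reproduced here.

```python
def additive_correction(satellite, gauge_at_station, satellite_at_station):
    """Shift the satellite estimate by the gauge-minus-satellite difference.

    All values are daily accumulations in mm; the result is floored at zero
    because rainfall cannot be negative (an assumption of this sketch).
    """
    return max(satellite + (gauge_at_station - satellite_at_station), 0.0)


def multiplicative_correction(satellite, gauge_at_station, satellite_at_station):
    """Scale the satellite estimate by the gauge-to-satellite ratio."""
    if satellite_at_station <= 0.0:
        # Ratio undefined: leave the estimate unchanged (assumption of this sketch).
        return satellite
    return satellite * gauge_at_station / satellite_at_station
```

The additive scheme can create or remove rain where the satellite sees none, while the multiplicative scheme preserves the satellite rain/no-rain pattern, which is one reason the two are used together.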
Atmospheric reanalyses blend observed meteorological state fields (temperature, humidity, wind and pressure) with a global weather model through assimilation to provide a continuous representation of not only the state fields, but also the model-generated fields. Precipitation is one such model-derived but observationally guided field. Typically, reanalysis precipitation is considered to have more uncertainty than the analyzed state fields (Kalnay et al., 1996). However, precipitation is a key quantity in the reanalysis representations of both the global water and energy cycles (through the latent heat of condensation) and so should be understood (Bosilovich et al., 2008). Few studies have intercompared the daily precipitation of many reanalyses, although distinctly different distributions were found among a collection of 10 analyses and reanalyses (focusing on gauge data over the United States) (Bosilovich et al., 2009). Even for a given weather event, the distribution of the precipitation can have large variance. The results of Shiu et al. (2012) suggest that reanalyses can reproduce the temperature–precipitation relationship as temperature increases, but the more recent reanalyses had higher variance than the older generation. The long-term collection of daily reanalysis precipitation here will help characterize and understand the ability of the reanalyses to reproduce the high-frequency occurrences of extreme precipitation.
The list of products is summarized in Table 3, and below we detail the common grid and present each individual product.
The reanalysis datasets.
The Modern-Era Retrospective Analysis for Research and Applications (MERRA) version 1 (Rienecker et al., 2011) and version 2 (Gelaro et al., 2017) benefited throughout their development from the focus on the water cycle, which was identified as a key component to understanding weather and climate. Significant improvements were included in the model (Molod et al., 2015) and the water vapor analysis (Takacs et al., 2016). While the influence of observing system changes is still apparent in MERRA-2 (Bosilovich et al., 2017), and there are some significant regional biases (e.g., tropical land topography overestimates), there is indication that the extreme end of the distribution is significantly improved in MERRA-2 over MERRA-1 for the continental United States (Bosilovich et al., 2015). The observations evaluated here will allow the testing of these improvements in other regions in reanalyses.
The details of the Japanese 55-year Reanalysis (JRA-55) are provided in Kobayashi et al. (2015). This version introduced 4-D variational analysis extending in time beyond the introduction of satellite data for weather analysis (back to 1958). Wind profile retrievals for tropical cyclones were assimilated and provide a significant contribution to the analysis of tropical cyclones. While some improvements have been noted in the stability of the precipitation time series and certain water vapor biases, the JRA-55 mean precipitation tends to be high, which is attributed to a dry model bias and a spin-down effect of the forecast following reinitialization.
The ECMWF Interim Reanalysis (ERA-Interim) (Dee et al., 2011) was developed to test the recent advancement of the forecast model and assimilation development beyond ERA-40 (Uppala et al., 2005), especially in the representation of the hydrologic cycle. This included advances in the humidity analysis, radiance bias correction and cloud parameterization, which are crucial for the representation of the water vapor state and generation of precipitation. While the large-scale representation of the precipitation has improved over ERA-40, some differences from observed data can be found (Simmons et al., 2010).
The National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR) was developed to provide initial conditions for continuing seasonal predictions, as well as for climate studies (Saha et al., 2010). At a horizontal resolution of 38 km, the modeled precipitation is the highest-resolution source of reanalysis data included here. While high resolution should provide improved locations of precipitation events and structural patterns, the CFSR also uses observation-corrected precipitation for forcing its land surface model. This was done to provide the best surface forcing and soil moisture for the subsequent forecasts. As with the other reanalyses here, the influence of changing observations, especially the addition of ATOVS radiances, significantly affects the mean precipitation of CFSR (Zhang et al., 2012).
All of the data for the reanalyses (MERRA-1, MERRA-2, CFSR, JRA-55 and
ERA-Interim) were obtained from the CREATE service (Potter et al., 2018). These data are identically
formatted with one variable per file for both 6 h and monthly timescales. The 6 h outputs were then used to create the daily form, and the time stamp of the data was adjusted to
the middle of the day (12 h). The files were also adjusted to have the same
longitudinal wrap as GPCP. The files were regridded to
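The two preparation steps described for the reanalyses, building daily values from the 6 h outputs and changing the longitudinal wrap, can be sketched as follows. The sketch assumes that the 6 h values are mean rates in mm/day (so that the daily value is their average) and that the target longitude convention runs from 0 to 360; both are assumptions made for illustration, as are the function names.

```python
def six_hourly_to_daily(rates_mm_per_day):
    """Average consecutive groups of four 6 h mean rates into daily means.

    Assumes the series starts at the beginning of a day and holds rates,
    not accumulations (accumulations would be summed instead).
    """
    if len(rates_mm_per_day) % 4:
        raise ValueError("expected four 6 h values per day")
    return [sum(rates_mm_per_day[i:i + 4]) / 4.0
            for i in range(0, len(rates_mm_per_day), 4)]


def rewrap_longitudes(lons, row):
    """Reorder one latitude row from a -180..180 to a 0..360 longitude convention."""
    pairs = sorted((lon % 360.0, value) for lon, value in zip(lons, row))
    return [p[0] for p in pairs], [p[1] for p in pairs]
```

For example, `rewrap_longitudes([-90.0, 0.0, 90.0], [1, 2, 3])` moves the western-hemisphere cell to the end of the row, so that all products share the same starting meridian before regridding.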
Figure 1 shows the annual mean precipitation time series of all the products and indicates the various time spans and spatial coverage of the products. This large ensemble of products is characterized by various trends in their depiction of the average precipitation evolution. Note that the regional products cannot be compared directly with the quasi-global ones. Despite this, there are some clear outliers and inhomogeneities in the products available. It is recommended that further work should aim at understanding these differences between the products through a concerted community intercomparison effort.
Time series of annual total daily precipitation in millimeters (mm) averaged over each dataset domain (regional or global, land or ocean, or both) as shown on the embedded maps in the panels. The name of the dataset and number of years available are indicated in each panel. Datasets are organized by order of appearance in Tables 1, 2 and 3.
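A domain-averaged annual total of the kind plotted in such a time series could be computed as in the sketch below. The cosine-latitude area weighting and the rule that a cell is used only if it is valid on every day of the year are assumptions made for illustration; the exact computation behind the figure is not specified here, and the function name is hypothetical.

```python
import math


def domain_mean_annual_total(daily_fields, lats):
    """Area-weighted mean of the annual precipitation total over valid cells.

    daily_fields : list of days, each a list of rows (one per latitude) of
                   mm/day values, with None marking cells outside the domain
    lats         : latitude in degrees of each row centre
    """
    wsum = tsum = 0.0
    for i, lat in enumerate(lats):
        w = math.cos(math.radians(lat))  # cell area ~ cos(latitude)
        for j in range(len(daily_fields[0][i])):
            values = [day[i][j] for day in daily_fields]
            if any(v is None for v in values):
                continue  # cell not covered by the product
            wsum += w
            tsum += w * sum(values)
    return tsum / wsum if wsum > 0 else None
```

Because each product is averaged over its own domain, values obtained this way for regional and quasi-global products are not directly comparable.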
Files are produced in netCDF-4 format with metadata following the Climate and Forecast (CF) Convention version 1.6 and the Attribute Convention for Dataset Discovery (ACDD) version 1.3. An example of the header of a product is provided in the Appendix.
One file per product, per year, at the resolution
For some products, extra information can be found in the files. For instance, the TAPEER product is completed with an estimate of the uncertainty of the daily precipitation.
The database (Roca et al., 2019) is referenced with the following DOI:
For the first time we offer an easily accessible database of daily
precipitation products on a common grid which we hope will prove invaluable
for intercomparison, model evaluation (Tapiador et
al., 2017, 2019) and other research purposes. In particular, FROGS offers an
invaluable resource to study precipitation extremes and to help us
understand some of the uncertainties that are inherent across all
precipitation products. This understanding should extend to considering
resolution and scaling effects on extremes imposed through the gridding of
point-based information (e.g., Dunn et al., 2014) and the regridding to lower
resolution of some of the products (e.g., Herold et al., 2017) which could
“smooth” extremes. A few studies based on this database are already under
consideration in various journals with a focus on extreme precipitation
(special issue in
An example of a header of the netCDF-4 file for the 3B42 v7.0 product.
RR, LVA and MGB initiated the work. RR, GP, RJ and StC prepared some datasets. MB drafted the figure. SoC managed the DOI. All the authors contributed to the writing of the paper.
The authors declare that they have no conflict of interest.
We kindly acknowledge the participants of the WCRP meeting on Extreme Precipitation at DWD in Offenbach, 2018, who provided a strong incentive for this database to be finalized. Adrien Guérou's help at an early stage of this effort is appreciated. All the data providers are also acknowledged for making their data freely available. We further thank the GSMaP, GPCP, GPCC, CPC and CHIRPS teams for the enriching exchanges about their products. We thank Marc Schroder and Alexandre Ramos for helpful discussions on the data. The graphic was made using the NCAR Command Language (NCL 2013).
This research has been supported by the Australian Research Council (ARC, Discovery Project DP160103439) and by the ARC Centre of Excellence for Climate Extremes (grant no. CE170100022).
This paper was edited by Scott Stevens and reviewed by two anonymous referees.