A pan-African high-resolution drought index dataset

Droughts in Africa cause severe problems, such as crop failure, food shortages, famine, epidemics and even mass migration. To minimize the effects of drought on water and food security in Africa, a high-resolution drought dataset is essential to establish robust drought hazard probabilities and to assess drought vulnerability considering a multi- and cross-sectional perspective that includes crops, hydrological systems, rangeland and environmental systems. Such assessments are essential for policymakers, their advisors and other stakeholders to respond to the pressing humanitarian issues caused by these environmental hazards. In this study, a high spatial resolution Standardized Precipitation-Evapotranspiration Index (SPEI) drought dataset is presented to support these assessments. We compute historical SPEI data based on Climate Hazards group InfraRed Precipitation with Station data (CHIRPS) precipitation estimates and Global Land Evaporation Amsterdam Model (GLEAM) potential evaporation estimates. The high-resolution SPEI dataset (SPEI-HR) presented here spans from 1981 to 2016 (36 years) with 5 km spatial resolution over the whole of Africa. To facilitate the diagnosis of droughts of different durations, accumulation periods from 1 to 48 months are provided. The quality of the resulting dataset was compared with coarse-resolution SPEI based on Climatic Research Unit (CRU) Time Series (TS) datasets, Normalized Difference Vegetation Index (NDVI) calculated from the Global Inventory Monitoring and Modeling System (GIMMS) project and root zone soil moisture modelled by GLEAM. Agreement found between coarse-resolution SPEI from CRU TS (SPEI-CRU) and the developed SPEI-HR provides confidence in the estimation of temporal and spatial variability of droughts in Africa with SPEI-HR. In addition, agreement of SPEI-HR versus NDVI and root zone soil moisture, with an average correlation coefficient (R) of 0.54 and 0.77, respectively, further implies that SPEI-HR can provide valuable information for the study of drought-related processes and societal impacts at sub-basin and district scales in Africa. The dataset is archived at the Centre for Environmental Data Analysis (CEDA) via the following link: https://doi.org/10.5285/bbdfd09a04304158b366777eba0d2aeb (Peng et al., 2019a).
Published by Copernicus Publications.


Introduction
Drought is a complex phenomenon that affects natural environments and socioeconomic systems in the world (von Hardenberg et al., 2001; Vicente-Serrano, 2007; Van Loon, 2015; Wilhite and Pulwarty, 2017). Impacts include crop failure, food shortage, famine, epidemics and even mass migration (Wilhite et al., 2007; Ding et al., 2011; Zhou et al., 2018). In recent years, severe events have occurred across the world, such as the 2003 central Europe drought (García-Herrera et al., 2010), the 2010 Russian drought (Spinoni et al., 2015), the 2011 Horn of Africa drought (Nicholson, 2014), the 2000 drought in southeastern Australia (van Dijk et al., 2013; Peng et al., 2019c), the 2013-2014 California drought (Swain et al., 2014), the 2014 North China drought (Wang and He, 2015) and the 2015-2017 southern Africa drought (Baudoin et al., 2017; Muller, 2018). Widespread negative effects of these droughts on natural and socioeconomic systems have been reported afterwards (Wegren, 2011; Arpe et al., 2012; Griffin and Anchukaitis, 2014; Mann and Gleick, 2015; Dadson et al., 2019; Marvel et al., 2019). Thus, there is a clear need to improve our knowledge about the spatial and temporal variability of drought, which provides a basis for quantifying drought impacts and the exposure of society, the economy, and the environment over different areas and timescales (Pozzi et al., 2013; AghaKouchak et al., 2015).
Generally, drought is defined as a temporal anomaly characterized by a deficit of water compared with long-term conditions (Mishra and Singh, 2010; Van Loon, 2015). Droughts can typically be grouped into five types: meteorological (precipitation deficiency), agricultural (soil moisture deficiency), hydrological (runoff and/or groundwater deficiency), socioeconomic (social response to water supply and demand) and environmental or ecologic (Keyantash and Dracup, 2002; AghaKouchak et al., 2015; Crausbay et al., 2017). These different drought categories involve different event characteristics in terms of timing, intensity, duration and spatial extent, making it very difficult to characterize droughts quantitatively (Panu and Sharma, 2002; Lloyd-Hughes, 2014; Vicente-Serrano, 2016). For this reason, numerous drought indices have been proposed for specific applications, and reviews of the available indices have been provided by previous studies, such as Heim Jr. (2002), Keyantash and Dracup (2002), and Mukherjee et al. (2018). Van Loon (2015) noted that there is no best drought index for all types of droughts because every index is designed for a specific drought type; thus, multiple indices are required to capture the multifaceted nature of drought. Nevertheless, the Standardized Precipitation Index (SPI), which is calculated based solely on long-term precipitation data over different time spans (McKee et al., 1993), is recommended by the World Meteorological Organization (WMO) for drought monitoring. The advantages of SPI are its relative simplicity and its ability to characterize different types of droughts given the different times of response of different usable water sources to precipitation deficits (Kumar et al., 2016; Zhao et al., 2017). However, precipitation information alone is inadequate to characterize drought; in most definitions, drought conditions also depend on the atmospheric demand for water vapour. More recently, Vicente-Serrano et al. (2010) proposed an alternative drought index to SPI, called the Standardized Precipitation Evapotranspiration Index (SPEI). Compared to SPI, it considers not only the precipitation supply but also the atmospheric evaporative demand (Vicente-Serrano et al., 2012b). This makes the index more informative of the actual drought effects over various natural systems and socioeconomic sectors (Vicente-Serrano et al., 2012b; Bachmair et al., 2016, 2018; Kumar et al., 2016; S. Sun et al., 2016, 2018; Peña-Gallardo et al., 2018a, b).
For the calculation of SPEI, high-quality and long-term observations of precipitation and atmospheric evaporative demand are necessary. These observations may come either from ground-based station data or from gridded data, such as satellite and reanalysis datasets. For example, the SPEIbase and the Global Precipitation Climatology Centre Drought Index (GPCC-DI) (Ziese et al., 2014) both provide SPEI datasets at a global scale. SPEIbase provides gridded SPEI with a 50 km spatial resolution and is calculated from Climatic Research Unit (CRU) Time Series (TS) datasets, which are produced based on measurements from more than 4000 ground-based weather stations across the world (Harris et al., 2014). The SPEI dataset provided by GPCC-DI has a spatial resolution of 1° and was generated from GPCC precipitation (Becker et al., 2013; Schneider et al., 2016) and the National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center (CPC) temperature dataset (Fan and Van den Dool, 2008). Both of these datasets have been applied in various drought-related studies at global and regional scales (e.g. Chen et al., 2013; Vicente-Serrano et al., 2013, 2016; Isbell et al., 2015; Q. Sun et al., 2016; Deo et al., 2017). However, the spatial resolution of these global SPEI datasets is too coarse for application at district or sub-basin scales (Vicente-Serrano et al., 2017). A sub-basin-scale quantification of drought conditions is particularly crucial in regions such as Africa, in which geospatial data and drought indices can be essential to manage existing drought-related risks (Vicente-Serrano et al., 2012a) and where in situ measurements are scarce (Trambauer et al., 2013; Masih et al., 2014; Anghileri et al., 2019).
Over the last century, Africa has been severely influenced by intense drought events, which have led to food shortages and famine in many countries (Anderson et al., 2012; Yuan et al., 2013; Sheffield et al., 2014; Awange et al., 2016; Funk et al., 2018; Nicholson, 2018; Gebremeskel et al., 2019). Therefore, the availability of a high-resolution drought index dataset may contribute to an improved characterization of drought risk and vulnerability and help minimize the impact of drought on water and food security by supporting policymakers, water managers and stakeholders. Conveniently, with the advancement of satellite technology, the estimation of precipitation and evaporation from remote sensing datasets is becoming more accurate. In particular, the long-term Climate Hazards group InfraRed Precipitation with Station data (CHIRPS) (Funk et al., 2015a) precipitation dataset and Global Land Evaporation Amsterdam Model (GLEAM) (Miralles et al., 2011) evaporation dataset provide high-quality data for near-real-time drought monitoring. Here, we use CHIRPS and GLEAM datasets to develop a pan-African high spatial resolution (5 km) SPEI dataset, which may be useful to inform drought relief management strategies for the continent. The dataset covers the period from 1981 to 2016, and it is comprehensively inter-compared with soil moisture, vegetation index and coarse-resolution SPEI datasets.

CHIRPS
CHIRPS is a recently developed high-resolution daily, pentadal, dekadal and monthly precipitation dataset (Funk et al., 2015a). It was produced by blending a set of satellite-only precipitation values (CHIRP) with additional monthly and pentadal station observations. CHIRP is based on infrared cold cloud duration (CCD) estimates calibrated with the Tropical Rainfall Measuring Mission Multi-satellite Precipitation Analysis version 7 (TMPA 3B42 v7) and the Climate Hazards group Precipitation climatology (CHPclim). CHPclim (Funk et al., 2015a, b) is based on station data from the Food and Agriculture Organization (FAO) and the Global Historical Climate Network (GHCN). Compared with other global precipitation datasets, such as Multi-Source Weighted-Ensemble Precipitation (MSWEP) and Global Precipitation Climatology Project (GPCP) (Adler et al., 2003), CHIRPS has several advantages: a long period of record, high spatial resolution (5 km), low spatial biases and low temporal latency. It has been widely validated and used in various applications (e.g. Shukla et al., 2014; Maidment et al., 2015; Duan et al., 2016; Zambrano-Bigiarini et al., 2017; Rivera et al., 2018). In particular, it was recently validated over East Africa and Mozambique and demonstrated good performance compared to other precipitation datasets (Toté et al., 2015; Dinku et al., 2018). Furthermore, CHIRPS was specifically designed for drought monitoring over regions with deep convective precipitation, scarce observation networks and complex topography. Several studies (e.g. Toté et al., 2015; Guo et al., 2017) have used CHIRPS for drought monitoring. Its high spatial resolution makes it particularly suitable for local-scale studies, such as sub-basin drought monitoring, especially in areas with complex topography. A detailed description of the dataset is provided by Funk et al. (2015a). In this study, daily CHIRPS precipitation from 1981 to 2016 was used.

GLEAM
GLEAM is designed to estimate land surface evaporation and root zone soil moisture from remote sensing observations and reanalysis data (Miralles et al., 2011). Specifically, the Priestley-Taylor equation is used to calculate potential evaporation within GLEAM based on near-surface temperature and net radiation, while the root zone soil moisture is obtained from a multilayer water balance driven by precipitation observations and updated with microwave soil moisture estimates. The actual evaporation is estimated by constraining potential evaporation with a multiplicative evaporative stress factor based on root zone soil moisture and vegetation optical depth (VOD) estimates. GLEAM version 3a (v3a) provides global daily potential and actual evaporation, evaporative stress conditions, and root zone soil moisture from 1980 to 2018 at a spatial resolution of 0.25° (see http://www.gleam.eu, last access: 29 March 2020). GLEAM datasets have already been comprehensively evaluated against FLUXNET observations and used for multiple hydro-meteorological applications (Greve et al., 2014; Miralles et al., 2014; Trambauer et al., 2014; Forzieri et al., 2017; Lian et al., 2018; Richard et al., 2018; Vicente-Serrano et al., 2018; Zhan et al., 2019). In particular, two recent studies detected global drought conditions based on GLEAM potential and actual evaporation data (Vicente-Serrano et al., 2018; Peng et al., 2019b). For this study, the GLEAM potential evaporation and root zone soil moisture were used.

CRU-TS
The global gridded CRU TS datasets provide the most widely used climate variables, including precipitation, potential evaporation, diurnal temperature range, maximum and minimum temperature, mean temperature, frost day frequency, cloud cover, and vapour pressure (Harris et al., 2014). The CRU TS datasets were produced using angular distance weighting (ADW) interpolation based on monthly meteorological observations collected at ground-based stations across the world. The recently released CRU TS version 4.0.1 covers the period 1901-2016 and provides monthly data at 50 km spatial resolution. The CRU TS datasets have been widely used for various applications since their release (e.g. van der Schrier et al., 2013; Chadwick et al., 2015; Delworth et al., 2015; Jägermeyr et al., 2016). The SPEIbase dataset was generated from CRU TS datasets. In this study, the CRU TS precipitation and potential evaporation from 1981 to 2016 were used.
Figure 1. Spatial patterns of 3-month and 12-month SPEI at high spatial resolution (5 km) and coarse spatial resolution (50 km) in June 1995. The high spatial resolution SPEI (SPEI-HR) is based on CHIRPS precipitation and GLEAM potential evaporation, while the coarse spatial resolution SPEI (SPEI-CRU) is calculated from CRU TS datasets.

GIMMS NDVI
The Normalized Difference Vegetation Index (NDVI) can serve as a proxy of vegetation status and has been widely applied to investigate the effects of drought on vegetation (e.g. Rojas et al., 2011; Vicente-Serrano et al., 2013; Törnros and Menzel, 2014). The Global Inventory Monitoring and Modeling System (GIMMS) NDVI was generated based on Advanced Very-High-Resolution Radiometer (AVHRR) observations and has accounted for various deleterious effects, such as orbital drift, calibration loss and volcanic eruptions (Beck et al., 2011; Pinzon and Tucker, 2014). For the current study, the latest version of GIMMS NDVI (3g.v1) was used, which covers the time period from 1981 to 2015 at biweekly temporal resolution and 8 km spatial resolution (Pinzon and Tucker, 2014).

SPEI calculation
The SPEI proposed by Vicente-Serrano et al. (2010) has been used for a wide variety of agricultural, ecological and hydrometeorological applications (e.g. Schwalm et al., 2017; Naumann et al., 2018; Jiang et al., 2019). It accounts for the impacts of evaporation demand on droughts and inherits the simplicity and multi-temporal characteristics of SPI. The procedure for SPEI calculation includes the estimation of a climatic water balance (namely the difference between precipitation and potential evaporation), the aggregation of the climatic water balance over various timescales (e.g. 1, 3, 6, 12, 24 months or more) and the fitting of a parametric probability distribution. As suggested by Beguería et al. (2014) and Vicente-Serrano and Beguería (2016), the log-logistic probability distribution is best for SPEI calculation, from which the probability distribution of the difference between precipitation and potential evaporation can be calculated as suggested by Vicente-Serrano et al. (2010) and Beguería et al. (2014). Negative and positive SPEI values indicate dry and wet conditions, respectively. Table 1 summarizes the categories of dry and wet conditions based on SPEI values. In this study, the CHIRPS and GLEAM datasets were used for SPEI calculation at high spatial resolution (5 km). For comparison, the SPEI at 50 km was also calculated based on CRU TS datasets for the same 1981-2016 period. It should be noted that SPEI values over sparsely vegetated and barren areas were masked out based on the Moderate Resolution Imaging Spectroradiometer (MODIS) land cover product (MCD12Q1) (Friedl et al., 2010) because SPEI is not reliable over these areas (Beguería et al., 2014; Zhao et al., 2017).
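The three steps described above (climatic water balance, accumulation over a timescale, standardization) can be illustrated with a minimal Python sketch. It is not the procedure used to generate SPEI-HR: in place of the log-logistic fit, it standardizes each calendar month with an empirical (Gringorten) plotting position, which is a simpler nonparametric substitute chosen here only to keep the example dependency-free. The function names (`water_balance`, `accumulate`, `standardize`, `spei_like`) are hypothetical, and the inputs are assumed to be plain monthly series for a single grid cell.

```python
from statistics import NormalDist


def water_balance(precip, pet):
    """Monthly climatic water balance: precipitation minus potential evaporation."""
    return [p - e for p, e in zip(precip, pet)]


def accumulate(values, scale):
    """Rolling sum over `scale` months; the first scale-1 entries are None."""
    out = [None] * len(values)
    for i in range(scale - 1, len(values)):
        out[i] = sum(values[i - scale + 1 : i + 1])
    return out


def standardize(values):
    """Map values to standard-normal quantiles via empirical plotting positions.

    Assumption: this replaces the log-logistic fit described in the text.
    Tied values receive distinct, adjacent ranks.
    """
    valid = sorted((v, i) for i, v in enumerate(values) if v is not None)
    n = len(valid)
    out = [None] * len(values)
    normal = NormalDist()
    for rank, (_, i) in enumerate(valid, start=1):
        prob = (rank - 0.44) / (n + 0.12)  # Gringorten plotting position
        out[i] = normal.inv_cdf(prob)
    return out


def spei_like(precip, pet, scale, months_per_year=12):
    """SPEI-like index: accumulate the water balance, then standardize
    each calendar month separately so that seasonality is removed."""
    acc = accumulate(water_balance(precip, pet), scale)
    out = [None] * len(acc)
    for m in range(months_per_year):
        idx = list(range(m, len(acc), months_per_year))
        z = standardize([acc[i] for i in idx])
        for i, zi in zip(idx, z):
            out[i] = zi
    return out
```

With this sketch, negative values mark months that were drier than usual for that calendar month and positive values mark wetter ones, matching the sign convention stated above. A parametric fit would mainly change the behaviour in the tails, where an empirical ranking cannot extrapolate beyond the observed record.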

Evaluation criteria
The SPEIbase dataset was calculated with the CRU TS dataset, which has been evaluated and applied by many studies (e.g. Chen et al., 2013; Vicente-Serrano et al., 2013; Isbell et al., 2015; Q. Sun et al., 2016; Greenwood et al., 2017; Um et al., 2017). The newly generated SPEI at high spatial resolution based on CHIRPS and GLEAM (SPEI-HR) is compared temporally and spatially to the SPEI calculated from CRU TS datasets. In addition, the NDVI can serve as an indicator of drought and vegetation health and can be used to assess the performance of drought indices (Vicente-Serrano et al., 2013; Aadhar and Mishra, 2017). Furthermore, root zone soil moisture is an ideal hydrological variable for agricultural (soil moisture) drought monitoring. The recently released root zone soil moisture (RSM) from GLEAM v3 provides a great opportunity to evaluate whether soil moisture drought is well represented by SPEI. To facilitate direct comparison between SPEI, NDVI and RSM, both NDVI and RSM are standardized by subtracting their corresponding (1981-2016) mean and expressing the resulting anomalies as numbers of standard deviations. This standardization has been applied by many studies to evaluate drought indices (Anderson et al., 2011; Mu et al., 2013; Zhao et al., 2017). The correlation between SPEI and the standardized NDVI and RSM is quantified using Pearson's correlation coefficient (R). In addition, the high-resolution SPEI from GLEAM and CHIRPS is also resampled to the grid size of the SPEI from CRU TS in order to quantify their correlation and disentangle whether the added value of the former arises from its increased accuracy or higher resolution. In the following section, the high-resolution (5 km) SPEI is referred to as SPEI-HR, while the coarse-resolution (50 km) SPEI is referred to as SPEI-CRU.
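The standardization and correlation steps described above can be sketched in a few lines of Python. This is an illustrative example only: the function names `standardized_anomaly` and `pearson_r` are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say whether the population or sample form was used.

```python
from math import sqrt
from statistics import mean, pstdev


def standardized_anomaly(series):
    """Subtract the long-term mean and express the anomaly in standard deviations.

    Assumption: the population standard deviation is used.
    """
    mu = mean(series)
    sigma = pstdev(series)
    return [(x - mu) / sigma for x in series]


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Because Pearson's R is unaffected by shifting and rescaling either series, the standardization does not change the correlation itself; its purpose is to put NDVI and RSM on the same unitless scale as SPEI so that the time series can be plotted and compared directly.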

Results and discussion
3.1 Inter-comparison between high- and coarse-resolution SPEI
Figure 1 shows the spatial distribution of SPEI-HR and SPEI-CRU at different resolutions for an example month (June 1995). Figure 1a, b show the 3-month SPEI and 12-month SPEI, respectively. It can be seen that the high-resolution and coarse-resolution SPEI display quite similar dry and wet patterns over the whole of Africa for both temporal scales. However, as expected, the SPEI-HR shows much more spatial detail and thus reflects mesoscale geographic and climatic features, which highlights the advantage of this new dataset. The differences in patterns between 3-month and 12-month SPEI indicate the different water deficits caused by different aggregation timescales, which can further separate agricultural, hydrological, environmental and other droughts. For example, in June 1995, southern Africa showed persistent dry conditions over a prolonged period, while western Africa only showed a short-term drought. In order to quantify how different SPEI-HR is from SPEI-CRU, the correlation between them is calculated for each grid cell over the whole study period. Figure 2 shows the correlations for timescales of 1, 3, 6, 9, 12, 24, 36 and 48 months. In general, the SPEI-HR and SPEI-CRU agree well in terms of temporal variability, with high positive correlations over most of Africa for every timescale. However, relatively low correlations appear in central Africa, and they become lower as the SPEI timescale increases. This region has very few station observations. It should be noted that the correlations shown here are statistically significant, with p values of less than 0.05. In addition, the average correlation between 6-month SPEI-CRU and SPEI-HR for each month of the year is summarized in Fig. 3 using a box plot. In general, positive correlations with a median larger than 0.6 (p < 0.05) are found for every month. There are no substantial differences in correlations between different months. Figure A1 in the Appendix shows additional box plots for SPEI at other timescales.
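Comparing SPEI-HR with SPEI-CRU cell by cell requires both datasets on a common grid. The resampling method is not detailed here, so the following Python sketch shows one simple possibility, block averaging, under the assumptions that the fine grid aligns exactly with the coarse one and that masked cells are stored as `None`. The function name `block_mean` and the `factor` parameter are hypothetical; going from 5 km to 50 km would correspond to `factor=10`.

```python
def block_mean(grid, factor):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks.

    Masked cells (None) are ignored; a block with no valid cells stays None.
    Rows or columns that do not fill a complete block are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows - rows % factor, factor):
        out_row = []
        for c0 in range(0, cols - cols % factor, factor):
            cells = [
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
                if grid[r][c] is not None
            ]
            out_row.append(sum(cells) / len(cells) if cells else None)
        out.append(out_row)
    return out
```

Applying such an aggregation to each monthly SPEI-HR field would yield one coarse time series per 50 km cell, which could then be correlated against the SPEI-CRU series for the same cell.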

Comparison against root zone soil moisture and NDVI
To gain more insight into their significance and applicability, the SPEI datasets are compared with NDVI and RSM. Figure 4 shows the results of the spatial and temporal comparison between 6-month SPEI and RSM as indicated by Törnros and Menzel (2014). Figure 4a, b display the correlation (p < 0.05) of SPEI-HR and SPEI-CRU against RSM during the whole time period, respectively. In general, both SPEI-HR and SPEI-CRU show strong correlations with RSM over the whole African continent. Compared to SPEI-CRU, the SPEI-HR shows higher correlations, particularly over central Africa. Since Sect. 3.1 shows that a relatively large discrepancy between SPEI-CRU and SPEI-HR exists over central Africa, the results presented here suggest a potentially better performance of SPEI-HR compared with SPEI-CRU in this region. The time series of SPEI and RSM, averaged over the entire study area, are shown in Fig. 4c, together with the corresponding correlations. It can be seen that SPEI-HR and SPEI-CRU agree well with each other and with the RSM dynamics. Consistent with the results from the spatial correlation analysis, the SPEI-HR and SPEI-CRU show similar results when compared with RSM (R = 0.77 for SPEI-HR; R = 0.72 for SPEI-CRU). Furthermore, the scatter plots between 6-month SPEI and RSM for the entire data record are shown in Appendix Fig. A2, where positive and significant correlations with RSM are found for both SPEI-HR (R = 0.51) and SPEI-CRU (R = 0.42). To explore the correlation between RSM and different timescales of SPEI, Table 2 summarizes the correlation values calculated in the same way as for Fig. 4c. It can be seen that the highest correlations against RSM are found at 3- and 6-month timescales. It should be noted that satellite-data-driven estimates of root zone soil moisture are more suitable for evaluating SPEI than satellite-based top-layer soil moisture or reanalysis soil moisture data (Mo et al., 2011; Xu et al., 2018).
Similar to the above analysis between SPEI and RSM, the comparison between SPEI and NDVI is shown in Fig. 5. First, Fig. 5a, b present the spatial distribution of the correlations (p < 0.05) between SPEI-HR and NDVI and between SPEI-CRU and NDVI, respectively. While correlations are overall lower than for RSM, it can be seen that both SPEI datasets are positively correlated with NDVI over most of the continent. It is also clear that SPEI-HR shows higher correlations. The time series comparison between the area mean SPEI and NDVI is shown in Fig. 5c. Both SPEI-HR and SPEI-CRU show agreement with NDVI, with R = 0.54 and R = 0.47, respectively. In addition, the correlation between 6-month SPEI and NDVI for the entire data record was also calculated, with R = 0.24 for SPEI-HR and R = 0.21 for SPEI-CRU, both significant at the 95 % confidence level (Fig. A3). While these correlations are admittedly low, the overall results suggest that the SPEI has a positive relation with NDVI, which is also reported by previous studies (e.g. Törnros and Menzel, 2014; Vicente-Serrano et al., 2018). The lower correlations against NDVI than against RSM are likely due to complex physiological processes associated with vegetation and the fact that ecosystem state is driven by multiple variables other than water availability (Nemani et al., 2003). Furthermore, there are also clearly documented lags between precipitation and NDVI, with NDVI time series typically peaking 1 or even 2 months after the period of maximum rainfall (Funk and Brown, 2006). Finally, Table 3 summarizes the correlation between SPEI and NDVI at different timescales. Compared with the results presented in Table 2 for RSM, the correlation with NDVI shown in Table 3 is also generally lower, and the highest correlations appear between 9- and 24-month SPEI (R > 0.5).
Altogether, the comparisons between SPEI and RSM and between SPEI and NDVI indirectly indicate the validity of the generated SPEI datasets. Therefore, the generated high-resolution SPEI-HR from satellite products has the potential to improve upon the state of the art of drought assessment over Africa.

Patterns of SPEI, RSM and NDVI during specific drought events
Most of Africa has suffered severe droughts in past decades (Naumann et al., 2014;Blamey et al., 2018). Among them, the 2011 East Africa drought (Anderson et al., 2012;AghaKouchak, 2015) and 2002 southern Africa drought (Masih et al., 2014) were extremely severe and had devastating effects on the natural and socioeconomic environment. Taking these two events as case studies, the spatial patterns of the newly developed high-resolution 6-month SPEI-HR are analysed, together with the variability in NDVI and RSM. Figure 6a,

Data availability
The high-resolution SPEI dataset is publicly available from the Centre for Environmental Data Analysis (CEDA) via the following link: https://doi.org/10.5285/bbdfd09a04304158b366777eba0d2aeb (Peng et al., 2019a). It covers the whole of Africa at a monthly temporal resolution and 5 km spatial resolution from 1981 to 2016 and is provided in NetCDF format on a geographic latitude-longitude grid.

Conclusions
The study presents a newly generated high-resolution SPEI dataset (SPEI-HR) over Africa. The dataset is produced from satellite-based CHIRPS precipitation and GLEAM potential evaporation and covers the entire African continent over the time period from 1981 to 2016 with a spatial resolution of 5 km. The accumulated SPEI, ranging from 1 to 48 months, is provided to facilitate applications from meteorological to hydrological droughts. The SPEI-HR was compared with widely used coarse-resolution SPEI data (SPEI-CRU), GIMMS NDVI and GLEAM root zone soil moisture to investigate its capability for drought detection. In general, the SPEI-HR has good correlation with SPEI-CRU temporally and spatially. They both agree well with NDVI and root zone soil moisture, although SPEI-HR displays higher correlations overall. These results indicate the validity and advantage of the newly developed high-resolution SPEI-HR dataset, and its unprecedentedly high spatial resolution offers important advantages for drought monitoring and assessment at district and river basin level in Africa.
Figure A2. Scatter plots between 6-month SPEI and RSM for the entire data record. R is the correlation coefficient with p < 0.05, and the colours denote the occurrence frequency of values.
Figure A3. Scatter plots between 6-month SPEI and NDVI for the entire data record. R is the correlation coefficient with p < 0.05, and the colours denote the occurrence frequency of values.
Author contributions. JP developed the processing algorithm, generated the dataset and drafted the manuscript. DGM and CF produced the GLEAM and CHIRPS data as input. SD, FH, ED and TL supported the generation of the dataset and the analysis of the results. All authors contributed to the discussion, review and revision of this paper.
Competing interests. The authors declare that they have no conflict of interest.