WHU-SGCC: a novel approach for blending daily satellite (CHIRP) and precipitation observations over the Jinsha River basin

Accurate and consistent satellite-based precipitation estimates blended with rain gauge data are important for regional precipitation monitoring and hydrological applications, especially in regions with limited rain gauges. However, the existing fusion precipitation estimates often have large uncertainties over mountainous areas with complex topography and sparse rain gauges, and most of the existing data blending algorithms are not good at removing the day-by-day errors. Therefore, the development of effective methods for high-accuracy precipitation estimates over complex terrain and at a daily scale is of vital importance for mountainous hydrological applications. This study aims to offer a novel approach for blending daily precipitation gauge data and the Climate Hazards Group Infrared Precipitation (CHIRP; daily, 0.05°) satellite-derived precipitation developed by UC Santa Barbara over the Jinsha River basin from 1994 to 2014. This method is called the Wuhan University Satellite and Gauge precipitation Collaborated Correction (WHU-SGCC). The results show that the WHU-SGCC method is effective for liquid precipitation bias adjustments from points to surfaces as evaluated by multiple error statistics and from different perspectives. Compared with CHIRP and CHIRP with station data (CHIRPS), the precipitation adjusted by the WHU-SGCC method has greater accuracy, with overall average improvements of the Pearson correlation coefficient (PCC) by 0.0082–0.2232 and 0.0612–0.3243, respectively, and decreases in the root mean square error (RMSE) by 0.0922–0.65 and 0.2249–2.9525 mm, respectively. In addition, the Nash–Sutcliffe efficiency coefficient (NSE) of the WHU-SGCC provides more substantial improvements than CHIRP and CHIRPS, which reached 0.2836, 0.2944, and 0.1853 in the spring, autumn, and winter.
Daily accuracy evaluations indicate that the WHU-SGCC method has the best ability to reduce precipitation bias, with average reductions of 21.68 % and 31.44 % compared to CHIRP and CHIRPS, respectively. Moreover, the accuracy of the spatial distribution of the precipitation estimates derived from the WHU-SGCC method is related to the complexity of the topography. The validation also verifies that the proposed approach is effective at detecting major precipitation events within the Jinsha River basin. In spite of the correction, the uncertainties in the seasonal precipitation forecasts in the summer and winter are still large, which might be due to the homogenization attenuating the extreme rain event estimates. However, the WHU-SGCC approach may serve as a promising tool to monitor daily precipitation over the Jinsha River basin, which contains complicated mountainous terrain with sparse rain gauge data, based on the spatial correlation and the historical precipitation characteristics. The daily precipitation estimations at the 0.05° resolution over the Jinsha River basin during all four seasons from 1990 to 2014, derived from WHU-SGCC, are available at the PANGAEA Data Publisher for Earth & Environmental Science portal (https://doi.org/10.1594/PANGAEA.905376, Shen et al., 2019). Published by Copernicus Publications.

To overcome these limitations, many studies have focused on proposing effective methodologies for blending rain gauge observations, satellite-based precipitation estimates, and sometimes radar data to take advantage of each dataset. Many

CHIRPS is the blended product of a two-part process. First, IR precipitation (IRP) pentad rainfall estimates are fused with corresponding CHPClim pentad data to produce an unbiased gridded estimate, called CHIRP, which is available online at ftp://ftp.chg.ucsb.edu/pub/org/chg/products/CHIRP/daily/ (last access: 10 December 2018). In the second part of the process, the CHIRP data are blended with ground-based precipitation observations obtained from a variety of sources, including national and regional meteorological services, by means of a modified inverse-distance weighting algorithm to create the final blended product, CHIRPS (Funk et al., 2014). The daily CHIRP satellite-based data over the Jinsha River Basin from 1990.02 to 2015.02 were selected as the input for WHU-SGCC blending with rain gauge observations, and the corresponding daily CHIRPS blended data were used for comparisons of the precipitation accuracy.

The blended in situ daily precipitation observations of the CHIRPS data come from a variety of sources, such as the daily NOAA CPC, and more than a dozen national and regional meteorological services. However, the stations for daily CHIRPS data have a different spatial distribution than those downloaded from the CMA, and the precipitation values used for CHIRPS

The basic description of the WHU-SGCC method is given below, and the details are illustrated separately in later sections:

(2) Analyze the relationships between the precipitation observations and the C1, C2, and C3 pixel types, and interpolate for the C4 pixels. These relationships are expressed as four rules, described below as Rules 1 through 4.

(1) Gauge observations are the most accurate, or "true", values for reference purposes. However, the sparseness of the gauges, their uneven spatial distribution, and the high proportion of missing data may limit high-accuracy estimation in rainfall monitoring.

(2) No major terrain changes occurred during the twenty years (Appendix B).

(3) There are no abnormal values at one pixel in the CHIRP dataset during the long time series, so Pearson's Correlation Coefficient (PCC) can represent the statistical similarity of the rainfall characteristics among the pixels in a certain spatial area at a seasonal scale.

It is reasonable to assume that some pixels are statistically similar to the historical precipitation characteristics of the C1 pixels within a certain area. Therefore, it is feasible to adjust the satellite estimation bias of the C2 pixels by referring to the appropriate regression relationships at the corresponding C1 pixels based on Rule 1.

The results of the FCM are the degree of membership of each pixel to the cluster centre as represented by numerical values.

The pixels in each cluster have similar terrain features and precipitation characteristics.
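
To make the notion of a degree of membership concrete, the following minimal sketch computes the standard fuzzy c-means membership of one pixel in each cluster for fixed cluster centres. The two-component feature vector, the centre coordinates, and the fuzzifier m = 2 are illustrative assumptions and are not taken from this study.

```python
import math

def fcm_memberships(pixel, centres, m=2.0):
    """Standard FCM membership of one feature vector in each cluster.

    u_k = 1 / sum_j (d_k / d_j) ** (2 / (m - 1)), where d_k is the
    Euclidean distance from the pixel to centre k and m > 1 is the
    fuzzifier (m = 2 is an assumed, commonly used default).
    """
    dists = [math.dist(pixel, c) for c in centres]
    # A pixel that coincides with a centre belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** exponent for dj in dists) for dk in dists]

# Hypothetical, already-normalised features: (elevation, mean daily rainfall).
centres = [(0.2, 0.8), (0.9, 0.1)]
u = fcm_memberships((0.3, 0.7), centres)
# Index of the cluster with the largest membership.
hard_label = max(range(len(u)), key=u.__getitem__)
```

The memberships of a pixel sum to 1, so assigning each pixel to the cluster with the largest membership is one simple way to obtain the spatial areas used by the later rules.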

Second, as mentioned above, the aim of Rule 2 is to derive an adjustment method for the C2 pixels based on learning from where n is the number of samples, i

The PCC ranges between −1 and +1. A perfect PCC of +1 or −1 occurs when the two variables are exactly linearly related, whereas a value close to zero indicates little or no linear correlation. In addition, the correlation is determined not only by the value of the correlation coefficient but also by the p-value of the correlation test. The critical values for the PCC and p-value are 0.5 and 0.05, respectively; thus, a PCC value higher than 0.5 and a p-value lower than 0.05 indicate that the data are significantly correlated (Zhang and Chen, 2016). Therefore, the final determination of the C2 pixels must meet the following criteria:

$$\mathrm{PCC} > 0.5 \quad \text{and} \quad p < 0.05 \qquad (9)$$
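
As a hedged illustration of the quantities behind this criterion, the sketch below computes the PCC of two daily series and the associated t statistic for the null hypothesis of no linear correlation. The function names are hypothetical. The p-value itself comes from a Student's t distribution with n − 2 degrees of freedom and is not computed here; a statistics library (for example, `scipy.stats.pearsonr`, which returns both the coefficient and the p-value) could supply it.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two series of equal length with n >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("PCC is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0: no linear correlation (n - 2 degrees of freedom)."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Illustrative daily series (mm) at a candidate pixel and a gauge pixel.
candidate = [0.0, 1.2, 3.4, 0.0, 8.1, 2.2, 0.5]
gauge_px = [0.1, 1.0, 4.0, 0.0, 7.5, 2.9, 0.2]
r = pearson(candidate, gauge_px)
t = t_statistic(r, len(candidate))
```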

Each R pixel has m PCC and p-values (where m is the number of C1 pixels in the cluster), and the subset of C2 pixels is identified by excluding the data that failed the correlation test and retaining both the data with a maximum PCC of at least 0.5 and a p-value lower than 0.05, and the corresponding index of C1 pixels. The selected C2 pixels can then be considered statistically similar to the precipitation characteristics of the corresponding C1 pixels in their defined spatial area.
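
The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the input layout (a dictionary of precomputed PCC and p-value pairs per candidate pixel) and all identifiers are assumptions made for the example.

```python
def select_c2(stats, r_crit=0.5, p_crit=0.05):
    """Map each candidate pixel to its best-matching C1 pixel.

    stats: {pixel_id: [(c1_id, pcc, p_value), ...]} within one cluster
           (hypothetical layout, one tuple per C1 pixel in the cluster).
    Returns {pixel_id: c1_id} for the pixels that qualify as C2.
    """
    matches = {}
    for pixel_id, results in stats.items():
        # Exclude pairs that fail the correlation test.
        passed = [(c1_id, pcc) for c1_id, pcc, p in results
                  if pcc >= r_crit and p < p_crit]
        if passed:
            # Retain the index of the C1 pixel with the maximum PCC.
            matches[pixel_id] = max(passed, key=lambda item: item[1])[0]
    return matches

# Illustrative values only.
stats = {
    "px_a": [("g1", 0.62, 0.01), ("g2", 0.71, 0.20)],
    "px_b": [("g1", 0.45, 0.001)],
    "px_c": [("g1", 0.55, 0.03), ("g2", 0.80, 0.001)],
}
c2 = select_c2(stats)
```

Here `px_a` is matched to `g1` because its higher PCC with `g2` fails the p-value test, `px_b` is excluded, and `px_c` is matched to `g2`.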

After identifying the C2 pixels and their corresponding C1 pixels, the adjustment method for the C2 pixels is derived from the regression model for the C1 pixels:

Recognizing that precipitation has a spatial distribution, the assumption that the C3 pixels are statistically similar to the precipitation characteristics of the C2 pixels is adopted to establish the adjustment method for the C3 pixels.

First, the determination of the C3 pixels in each spatial cluster is based on the selection of C2 pixels. The satellite-based estimation values at the pixels other than the C1 and C2 pixels are used to calculate the PCC and p-value with the satellite-based estimation values at the C2 pixels in the same cluster. The k PCC and p-values of each pixel (where k is the number of C2 pixels in the cluster) are evaluated with the correlation test (Eq. 9): a pixel is retained if its maximum PCC is at least 0.5 with a p-value of no more than 0.05, and the index of the corresponding C2 pixel is recorded. The selected pixels are called C3 pixels, which are statistically similar to the precipitation characteristics of the corresponding C2 pixels in the defined spatial area.

After identifying the C3 pixels, a method for merging the CHIRP grid cell values at the C3 pixels ( ) and the target reference where is a positive constant set to 10 mm (Sokol, 2003),

The pixels other than the C1, C2 and C3 pixels are called C4 pixels, and they are adjusted by inverse distance weighting (IDW).

IDW is based on Tobler's first law of geography (1970): everything is related to everything else, but near things are more related than distant things. Therefore, the attribute value of an unsampled point is the weighted average of the known values in its neighbourhood:

$$R = \sum_{i=1}^{n} w_i R_i, \qquad w_i = \frac{d_i^{-\alpha}}{\sum_{j=1}^{n} d_j^{-\alpha}}$$

where $R$ is the unknown spatial precipitation data, $R_i$ is the adjusted precipitation value at the $i$-th C2 or C3 pixel, $n$ is the number of C2 and C3 pixels, $d_i$ is the distance from each C2 or C3 pixel to the unknown grid cell, and $\alpha$ is the power, which is generally specified as a geometric form for the weight. Several studies (Simanton and Osborn, 1980; Tung, 1983) have experimented with variations in the power; a small $\alpha$ tends to estimate values close to the average of the sampled grids in the neighbourhood, while a large $\alpha$ tends to give larger weights to the nearest points and increasingly down-weights points farther away (Chen and Liu, 2012; Lu and Wong, 2008). The value of $\alpha$ has an influence on the spatial distribution of the information from precipitation observations. For this reason, $\alpha$ is varied in the range of 0.1 to 3.0 (0.1, 0.3, 0.5, 1.0, 1.5, 2.0, 2.5 and 3.0) in this study.
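
The effect of the power on the interpolated value can be seen in a small sketch. This is a generic IDW implementation under stated assumptions: distances are Euclidean in an arbitrary planar coordinate system, the two known pixels and their values are invented for illustration, and the function name `idw` is hypothetical.

```python
import math

def idw(target, known, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    known: list of ((x, y), value) pairs, e.g. adjusted C2/C3 pixels.
    power: distance exponent; larger values favour the nearest pixels.
    """
    num = den = 0.0
    for xy, value in known:
        d = math.dist(target, xy)
        if d == 0.0:
            return value  # target coincides with a known pixel
        w = d ** -power
        num += w * value
        den += w
    if den == 0.0:
        raise ValueError("no known pixels supplied")
    return num / den

# Two known pixels (illustrative values in mm); the target lies closer
# to the first one, so a larger power pulls the estimate towards 2.0.
known = [((0.0, 0.0), 2.0), ((3.0, 0.0), 8.0)]
powers = (0.1, 0.3, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)
estimates = {a: idw((1.0, 0.0), known, a) for a in powers}
```

With a power of 0.1 the estimate stays close to the plain average of the two values, whereas with a power of 3.0 it moves much closer to the value of the nearer pixel.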

Note that the unknown spatial precipitation data include the C1 and C4 pixels because the C1 pixel values were not adjusted in Rule 1.

After applying these four rules, we obtained complete daily adjusted regional precipitation maps for the four seasons over

precipitation and has an optimal value of 0. The POD, also known as the hit rate, represents the probability of rainfall detection, and the FAR is defined as the ratio of the false alarm of rainfall to the total number of rainfall events. All of the accuracy assessment metrics are shown in Table 2.

Note: is the observation data; is the adjusted value using the WHU-SGCC method for the test sample pixel; is the arithmetic mean of and is given by

Table D1 and the spatial distribution of C1-C3 pixels in Figure D1, which is most uniform in the fall and sparsest in the winter. Each validation gauge station could be identified as either a C2, C3 or C4 pixel to evaluate the performance of all the rules in the WHU-SGCC method.

Table 3. The percentage of each class of pixels adjusted by each rule using the WHU-SGCC method within the Jinsha River Basin.

The spatial distributions of the statistical comparisons between the observations and the WHU-SGCC precipitation estimations are shown in Fig. 5 and Fig. 6. Overall, the variation in the PCC shows low correlations in areas with lower elevation, particularly in the southeast Jinsha River Basin, where there is higher precipitation and a greater density of rain gauges. The PCC is highest in the fall, followed by the spring and winter, and finally by summer. The higher correlations are located in the north-central area along the Tongtian River, Jinsha River and upstream part of the Yalong River, which has complex terrain and few rain gauges. The RMSE is lower in the winter than in the spring, fall and summer, which can be attributed to precipitation being lowest in the winter and greatest in the summer. The spatial distribution of the RMSE shows that the smaller errors are scattered in the northwest area of the river basin, with values lower than 5 mm, while the highest errors are located along the border between the lower reaches of the Jinsha Jiang River and the river basin. This is related to the climate regimes of the Jinsha River Basin, which include more rainfall in the south and southeast areas than in the north,

The results show that the WHU-SGCC method improves the correlation relative to CHIRP and CHIRPS, especially in the central and southeast river basin during the spring, fall and winter, with most of the PCC values falling between 0.4 and 0.8 (Fig. 5). As shown by the RMSE (Fig. 6), the WHU-SGCC can also correct the precipitation bias in the central and southeast

Moreover, in terms of the POD, FAR and CSI, except for the results in winter, the WHU-SGCC method appears to be better at detecting precipitation than CHIRP and CHIRPS; the results of POD and CSI are closest to 1, although FAR is worse than

worst in the winter, and the CSI is slightly higher, which may be attributed to the overestimation of no-rain events and the inherent uncertainty in the CHIRP.

Overall, the WHU-SGCC approach can be regarded as an effective tool for daily precipitation adjustments.

To measure the WHU-SGCC performance on predicting rain events, daily precipitation thresholds of 0.1, 10, 25, and 50 mm were considered, and the results are shown in Table 5 and Table 6. The average percentage of each class of rain events at the

daily precipitation in the range of 25 mm to 50 mm. In the spring, fall and summer, significantly more no-rain days occurred than rainy days, and approximately 5% of the days had daily precipitation of 10-50 mm. The seasonal distribution of rainfall was concentrated in the summer, and 54.76%, 14.01% and 3.62% of the days had daily precipitation of 0.1-10, 10-25, and 25-50 mm, respectively. The results indicated that the average daily precipitation was less than 10 mm throughout the years of the study.

The WHU-SGCC approach had lower errors than CHIRP and CHIRPS, as indicated by the RMSE, MAE and BIAS, but the performance of WHU-SGCC is not promising for events with total rainfall greater than 25 mm in the summer (Fig. 8). This negative performance for total rainfall higher than 25 mm in the summer might be attributed to the overestimation of rainfall by CHIRP and CHIRPS. For the seasonal distribution of precipitation (Table 5)

In addition, the POD and CSI results of CHIRPS are the worst, while those of the WHU-SGCC are the highest, which indicates its superiority for the detection of precipitation events. For the WHU-SGCC, the POD and CSI are best in the summer, followed by the fall, spring, and winter, which is related to the seasonal rainfall pattern of more rain in the summer and less in the winter.
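
For readers who want to reproduce detection scores of this kind, the sketch below computes POD, FAR and CSI from paired daily series using the standard contingency-table definitions (hits, misses and false alarms). These definitions, the 0.1 mm rain/no-rain threshold with a greater-than-or-equal comparison, and the sample values are assumptions for illustration; the exact formulas used in this study are those listed in its Table 2.

```python
def detection_scores(observed, estimated, threshold=0.1):
    """POD, FAR and CSI for rain/no-rain detection at a daily threshold (mm)."""
    hits = misses = false_alarms = 0
    for obs, est in zip(observed, estimated):
        obs_rain, est_rain = obs >= threshold, est >= threshold
        if obs_rain and est_rain:
            hits += 1
        elif obs_rain:
            misses += 1
        elif est_rain:
            false_alarms += 1

    def ratio(num, den):
        # Undefined scores (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
    }

# Illustrative gauge observations and adjusted estimates (mm/day).
obs = [0.0, 5.2, 12.0, 0.0, 30.5, 0.3]
est = [0.4, 4.1, 9.5, 0.0, 18.0, 0.0]
scores = detection_scores(obs, est)
```

Raising `threshold` to 10, 25 or 50 mm gives the same scores for heavier rain classes, where the number of events is smaller and the scores are correspondingly less stable.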

Therefore, the WHU-SGCC approach is applicable for the detection of rainfall events in the Jinsha River Basin, although in the summer it performs better for rainfall less than or approximately equal to the average daily precipitation. Due to the homogenization of the WHU-SGCC method, its performance for short intense and extreme rain events was poorer than that of CHIRP and CHIRPS, which should be improved in a future study.

This study provides a novel approach, the WHU-SGCC method, for merging daily satellite-based precipitation estimates with observations. A case study of the Jinsha River Basin was conducted to verify the effectiveness of the WHU-SGCC approach during all four seasons from 1990 to 2014, and the adjusted precipitation estimates were compared to CHIRP and CHIRPS.

The WHU-SGCC method aims to reduce the bias and uncertainties in CHIRP data over regions with complicated mountainous terrain and sparse rain gauges. To the best of the authors' knowledge, this study is the first to use daily CHIRP and CHIRPS data in this area.

According to our findings, the following conclusions can be drawn: (1) The WHU-SGCC method is effective for the greater accuracy compared with CHIRP and CHIRPS, with average improvements of Pearson's correlation coefficient (PCC) improvements over CHIRP and CHIRPS, which reached 0.2836, 0.2944 and 0.1853 in the spring, fall and winter, respectively.
In the summer, the NSE of the WHU-SGCC is still negative, but it is improved to nearly zero, which indicates that the adjusted results are similar to the average level of the rain gauge observations. All of the measured errors were reduced except for the BIAS, which showed no significant improvement in the summer but was approximately 0. Overall, the WHU-SGCC

For the WHU-SGCC, the POD and CSI are best in the summer, followed by the fall, spring, and winter, which is related to the seasonal rainfall pattern of more rain in the summer and less in the winter. In spite of the corrections, the performance of the WHU-SGCC for short intense and extreme rain events was poorer than that of CHIRP and CHIRPS, and the bias in the precipitation forecasts in the summer is still large, which may be due to the homogenization attenuating the extreme rain event estimates.

In conclusion, the WHU-SGCC approach can help adjust the biases of daily satellite-based precipitation estimates over the Jinsha River Basin, which contains complicated mountainous terrain with sparse rain gauges. This approach is a promising tool to monitor daily precipitation over the Jinsha River Basin, considering the spatial correlation and historical precipitation characteristics between raster pixels in regions with similar topographic features. Future development of the WHU-SGCC approach will focus on the following three aspects: (1) the improvement of the adjusted precipitation quality to better monitor extreme rainfall events by blending multiple data sources for different rain events; (2) the introduction of more climatic factors and multi-model ensembles to achieve more accurate spatial distributions of precipitation; and (3)

LPDAAC_ECS&q=MCD12&ok=MCD12 (last access: 23 July 2019). Fig. B1 shows that the land use had no obvious changes over the study period. In addition, the upstream area of the Jinsha River is an untraversed region that has not been affected

This appendix shows how to set the number of clusters in the FCM method.

To adjust the pixels other than those of the gauge stations, the pixels that are statistically similar to the C1 pixels were selected. According to Rule 2, the C2 pixels were identified in a spatial area defined by the FCM method. In the following