Development of East Asia Regional Reanalysis based on advanced hybrid gain data assimilation method and evaluation with E3DVAR, ERA-5, and ERA-Interim reanalysis
- Atmospheric Predictability and Data Assimilation Laboratory, Department of Atmospheric Sciences, Yonsei University, Seoul, Republic of Korea
Correspondence: Hyun Mee Kim (firstname.lastname@example.org)
The East Asia Regional Reanalysis (EARR) system is developed based on the advanced hybrid gain data assimilation method (AdvHG) using the Weather Research and Forecasting (WRF) model and conventional observations. Based on EARR, the high-resolution regional reanalysis and reforecast fields are produced with 12 km horizontal resolution over East Asia for 2010–2019. The newly proposed AdvHG is based on the hybrid gain approach, weighting two different analyses for an optimal analysis. The AdvHG differs from the hybrid gain in that (1) E3DVAR is used instead of EnKF, (2) 6 h forecast of ERA5 is used to be more consistent with WRF, and (3) the preexisting, state-of-the-art reanalysis is used. Thus, the AdvHG can be regarded as an efficient approach for generating regional reanalysis datasets thanks to cost savings as well as the use of the state-of-the-art reanalysis. The upper-air variables of EARR are verified with those of ERA5 for January and July 2017 and the 10-year period 2010–2019. For upper-air variables, ERA5 outperforms EARR over 2 years, whereas EARR outperforms (shows comparable performance to) ERA-I and E3DVAR for January 2017 (July 2017). EARR represents precipitation better than ERA5 for January and July 2017. Therefore, although the uncertainties of upper-air variables of EARR need to be considered when analyzing them, the precipitation of EARR is more accurate than that of ERA5 for both seasons. The EARR data presented here can be downloaded from https://doi.org/10.7910/DVN/7P8MZT (Yang and Kim, 2021b) for data on pressure levels and https://doi.org/10.7910/DVN/Q07VRC (Yang and Kim, 2021c) for precipitation.
Reanalysis datasets have been widely used in the socio-economic field as well as in meteorological and climate research around the world. Most reanalysis datasets consist of global reanalysis whose spatial and temporal resolutions are relatively coarse (e.g., Schubert et al., 1993; Kalnay et al., 1996; Gibson et al., 1997; Kistler et al., 2001; Kanamitsu et al., 2002; Uppala et al., 2005; Onogi et al., 2007; Bosilovich, 2008; Saha et al., 2010; Dee et al., 2011; Rienecker et al., 2011; Bosilovich et al., 2015; Kobayashi et al., 2015; Hersbach et al., 2020). With the emerging importance of regional reanalysis datasets, many operational centers and research institutes around the world have been producing these datasets in their own areas (Mesinger et al., 2006; Borsche et al., 2015; Bromwich et al., 2016; Jermey and Renshaw, 2016; Zhang et al., 2017; Bromwich et al., 2018; Fukui et al., 2018; He et al., 2019; Ashrit et al., 2020).
Long-term high-resolution datasets are essential to investigate past extreme weather events which might be associated with mesoscale features such as heavy rainfall events with high spatial and temporal variability, which coarser-resolution models cannot represent. Dynamical downscaling approaches can be a solution for generating high-resolution datasets, but there are some issues with insufficient spin-up (Kayaba et al., 2016). Moreover, Fukui et al. (2018) demonstrated that regional reanalysis over Japan assimilating only the conventional observations had the potential to reproduce precipitation fields better than the dynamical downscaling approaches. Ashrit et al. (2020) also found that the high-resolution regional reanalysis over India showed substantial improvements of regional hydroclimatic features during summer monsoon for the period 1979–1993 compared to the global reanalysis ERA-Interim (ERA-I; Dee et al., 2011) from ECMWF. Furthermore, He et al. (2019) revealed that the pilot regional reanalysis over the Tibetan Plateau was able to represent more accurate precipitation features and atmospheric humidity than the global reanalyses of ECMWF (i.e., ECMWF's fifth-generation reanalysis (ERA5; Hersbach et al., 2020) and ERA-I).
As part of this effort, regional reanalysis over East Asia was produced based on the Unified Model (UM) for the 2-year period 2013–2014, and it was confirmed that regional reanalysis over East Asia is beneficial (Yang and Kim, 2017, 2019). However, because the UM was no longer available for generating regional reanalysis over East Asia, another numerical weather prediction (NWP) model and a corresponding data assimilation (DA) method were required.
To find the most appropriate and cost-efficient DA method for a regional reanalysis over East Asia, several DA methods were compared. Yang and Kim (2021a) demonstrated that the hybrid ensemble-variational data assimilation method (E3DVAR) performed better than three-dimensional variational data assimilation (3DVAR) and ensemble Kalman filter (EnKF) over East Asia for January and July 2016. However, it is essential to confirm whether this hybrid method is accurate enough to be used for a regional reanalysis over East Asia. Thus, E3DVAR was compared with the latest and the previous reanalysis data from ECMWF (ERA5 and ERA-I) for (re)analysis and (re)forecast variables and it was found that the performance for regional reanalysis needs to be further improved.
For this reason, a new advanced hybrid gain (AdvHG) DA method, which combines E3DVAR and ERA5 based on the Weather Research and Forecasting (WRF) model, is proposed and investigated in this study. A hybrid gain DA method has been developed as a new kind of hybrid method (Penny, 2014). Based on this method, an advanced DA method is newly developed in this study. Finally, using this newly proposed DA method (AdvHG), the East Asia Regional Reanalysis (EARR) system is developed based on the WRF model. EARR datasets were produced for the 10-year period 2010–2019 and are publicly available (https://dataverse.harvard.edu/dataverse/EARR, last access: 17 March 2022).
To investigate the accuracy and uncertainty of the state-of-the-art AdvHG DA algorithm developed in this study, analysis and forecast atmospheric variables of E3DVAR, AdvHG, WRF-based ERA-I, and WRF-based ERA5 are evaluated for January and July 2017. In addition, reforecast precipitation fields of ERA-I and ERA5 from ECMWF are also verified and compared. In this study, the datasets are evaluated for a 2-month period (January and July 2017) or a 10-year period (2010–2019) depending on the availability of datasets. The reanalysis and (re)forecast fields of the EARR based on AdvHG and ERA5 are verified for a 10-year period (2010–2019). In Sect. 2, the EARR system, including the model, DA method, and observations, is explained. In Sect. 3, the evaluation methods are presented. The verification results of the (re)analysis and (re)forecast variables are presented in Sect. 4. Section 4.1 introduces the evaluation results for wind, temperature, and humidity variables, and Sect. 4.2 presents those for precipitation (re)forecast. Data availability is covered in Sect. 5. Lastly, the summary and conclusions are presented in Sect. 6.
In this study, the Advanced Research WRF model (v3.7.1) is used with 12 km horizontal resolution (540 × 432 grid points) and 50 vertical levels (up to 5 hPa) for the East Asia domain shown in Fig. 1. The model settings and physics scheme are summarized in Table 1. Analysis fields are obtained every 6 h (00:00, 06:00, 12:00, and 18:00 UTC) via assimilation of conventional observations with a 6 h assimilation window, and forecast fields are integrated up to 36 h. The ERA5 reanalysis (Hersbach et al., 2020) is used as the first initial condition before the cycling and as boundary conditions every 6 h.
2.2 Data assimilation methods
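The E3DVAR method is one of the hybrid DA methods that use a static climatological background error covariance (BEC) and an ensemble-based flow-dependent BEC, and it couples the EnKF and 3DVAR (Zhang et al., 2013). E3DVAR is based on the cost function of 3DVAR. In E3DVAR, EnKF provides the flow-dependent BEC as well as updates on perturbations for ensemble members. Following Zhang et al. (2013),

$$J(\mathbf{x}) = J_b + J_e + J_o = \frac{1}{2}\,\delta\mathbf{x}^{\mathrm{T}}\left[(1-\beta)\,\mathbf{B} + \beta\,\mathbf{P}^f \circ \mathbf{C}\right]^{-1}\delta\mathbf{x} + \frac{1}{2}\left(\mathbf{y}^o - \mathbf{H}\mathbf{x}\right)^{\mathrm{T}}\mathbf{R}^{-1}\left(\mathbf{y}^o - \mathbf{H}\mathbf{x}\right), \tag{1}$$

where $J_b$ is a traditional cost function based on a static climatological BEC $\mathbf{B}$ and $J_e$ is an additional cost function based on ensemble-based BEC $\mathbf{P}^f$. $J_o$ is the observation term, $\delta\mathbf{x}$ is the analysis increment, $\mathbf{y}^o$ is the observation vector, and $\circ$ denotes the Schur (element-wise) product. $\mathbf{C}$ is a correlation matrix for localization of the ensemble covariance $\mathbf{P}^f$. The weighting coefficient $\beta$ between static and ensemble-based BEC is set to 0.8 in this study. To account for model error for E3DVAR, a multi-physics scheme is applied to 40-member ensembles. Yang and Kim (2021a) found that E3DVAR is the most appropriate DA method among 3DVAR, EnKF, and E3DVAR methods over East Asia. More detailed information on E3DVAR implemented in this study can be found in Yang and Kim (2021a).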
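The weighted combination of static and localized ensemble covariances described above can be illustrated with a small NumPy sketch. It is not the WRFDA implementation: the function name `hybrid_covariance`, the Gaussian localization function, the one-dimensional grid, and the identity static BEC are all assumptions made for illustration, and β is assumed to weight the ensemble-based part.

```python
import numpy as np

def hybrid_covariance(B, ensemble, coords, loc_length, beta=0.8):
    """Blend a static BEC with a localized ensemble BEC (illustrative only).

    B          : (n, n) static background error covariance
    ensemble   : (n_members, n) ensemble of model states
    coords     : (n,) 1-D positions of the state elements (assumed, in km)
    loc_length : localization length scale (assumed Gaussian shape, in km)
    beta       : weight given to the ensemble part (assumption)
    """
    perturbations = ensemble - ensemble.mean(axis=0)
    Pf = perturbations.T @ perturbations / (ensemble.shape[0] - 1)
    distance = np.abs(coords[:, None] - coords[None, :])
    C = np.exp(-0.5 * (distance / loc_length) ** 2)  # localization matrix
    return (1.0 - beta) * B + beta * (Pf * C)        # Pf * C is the Schur product

rng = np.random.default_rng(1)
n_state, n_members = 5, 40                 # 40 members, as in the text
coords = np.arange(n_state) * 12.0         # hypothetical 12 km spacing
B_static = np.eye(n_state)                 # hypothetical static BEC
members = rng.standard_normal((n_members, n_state))
P_hybrid = hybrid_covariance(B_static, members, coords, loc_length=36.0)
```

Because the Schur product of two positive semi-definite matrices is positive semi-definite, the blended matrix stays a valid covariance for any β between 0 and 1.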
2.2.2 Hybrid gain data assimilation method
In the last decade, the traditional hybrid methods have been widely used at many operational centers and research institutes. Recently, Penny (2014) proposed a new class of hybrid gain methods combining desirable aspects of both variational and EnKF families of algorithms by weighting analyses from 3DVAR and LETKF for an optimal analysis in the Lorenz 40-component model. Since then, this algorithm has been implemented at ECMWF (Bonavita et al., 2015) and in a hybrid global ocean DA system at the National Centers for Environmental Prediction (NCEP) (Penny et al., 2015).
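The hybrid gain algorithm can be described with the following equation:

$$\mathbf{x}^a_{\mathrm{hybrid}} = \alpha\,\mathbf{x}^a_{\mathrm{det}} + (1-\alpha)\,\bar{\mathbf{x}}^a_{\mathrm{ens}}, \tag{2}$$

where $\mathbf{x}^a_{\mathrm{hybrid}}$, $\mathbf{x}^a_{\mathrm{det}}$, and $\bar{\mathbf{x}}^a_{\mathrm{ens}}$ denote the hybrid analysis, deterministic analysis, and the ensemble mean analysis from the ensemble-based assimilation method, and $\alpha$ is a tunable parameter (Penny, 2014; Houtekamer and Zhang, 2016).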
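The hybrid gain method is different from traditional hybrid methods, in that a hybrid gain approach linearly combines analysis fields from EnKF and variational DA methods to produce a hybrid gain analysis rather than linearly combining respective BECs (Penny, 2014). Basically, the hybrid gain method is used to hybridize two different Kalman gain matrices of ensemble-based (Eq. 4) and variational DA systems (Eq. 5) as in Eq. (3):

$$\hat{\mathbf{K}} = \beta_1\,\mathbf{K}^{e} + \beta_2\,\mathbf{K}^{v} + \beta_3\,\mathbf{K}^{v}\mathbf{H}\mathbf{K}^{e}, \tag{3}$$

$$\mathbf{K}^{e} = \mathbf{P}^f\mathbf{H}^{\mathrm{T}}\left(\mathbf{H}\mathbf{P}^f\mathbf{H}^{\mathrm{T}} + \mathbf{R}\right)^{-1}, \tag{4}$$

$$\mathbf{K}^{v} = \mathbf{B}\mathbf{H}^{\mathrm{T}}\left(\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}} + \mathbf{R}\right)^{-1}, \tag{5}$$

where $\hat{\mathbf{K}}$ is the hybrid gain matrix and $\mathbf{K}^{e}$ and $\mathbf{K}^{v}$ are the gain matrices of the ensemble-based and variational DA systems, respectively. $\mathbf{H}$ is an observation operator mapping the model state vector to observation space and $\mathbf{R}$ is the observation error covariance matrix. The matrices $\mathbf{P}^f$ and $\mathbf{B}$ indicate the ensemble-based and the static climatological BEC, respectively. By choosing the specific coefficients ($\beta_1 = 1$, $\beta_2 = \alpha$, $\beta_3 = -\alpha$), Eq. (3) can be written as in Eq. (6), which gives a result algebraically equivalent to Eq. (2) (Penny, 2014):

$$\hat{\mathbf{K}} = \mathbf{K}^{e} + \alpha\,\mathbf{K}^{v}\left(\mathbf{I} - \mathbf{H}\mathbf{K}^{e}\right). \tag{6}$$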
One of the advantages of the hybrid gain algorithm with respect to its development is that preexisting operational systems can be used without significant modification for a hybrid analysis (Penny, 2014) and independent parallel development of respective methods is allowed (Houtekamer and Zhang, 2016). Furthermore, the hybrid gain approach can be considered a practical and straightforward method in the foreseeable future to combine advantageous features of both ensemble- and variational-based DA algorithms (Houtekamer and Zhang, 2016). More detailed information on this algorithm can be found in Penny (2014).
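The algebraic equivalence between the weighted analysis of Eq. (2) and the hybridized gain of Eq. (6) can be checked numerically on a toy linear problem. The sketch below uses random matrices with hypothetical dimensions, and it assumes, following the formulation of Penny (2014), that the deterministic analysis is a variational update that takes the ensemble mean analysis as its background.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 4      # state and observation dimensions (illustrative)
alpha = 0.5      # tunable weight

def random_spd(k):
    """Random symmetric positive-definite matrix used as a toy covariance."""
    a = rng.standard_normal((k, k))
    return a @ a.T + k * np.eye(k)

B = random_spd(n)                  # static climatological BEC
Pf = random_spd(n)                 # ensemble-based BEC
R = random_spd(m)                  # observation error covariance
H = rng.standard_normal((m, n))    # linear observation operator

def kalman_gain(P):
    return P @ H.T @ np.linalg.inv(H @ P @ H.T + R)

K_ens = kalman_gain(Pf)
K_var = kalman_gain(B)
K_hybrid = K_ens + alpha * K_var @ (np.eye(m) - H @ K_ens)

xb = rng.standard_normal(n)        # background state
y = rng.standard_normal(m)         # observations

# Two-step route: ensemble mean analysis, then a variational update on top of it.
x_ens = xb + K_ens @ (y - H @ xb)
x_det = x_ens + K_var @ (y - H @ x_ens)
x_weighted = alpha * x_det + (1.0 - alpha) * x_ens

# One-step route: apply the hybridized gain to the original innovation.
x_gain = xb + K_hybrid @ (y - H @ xb)

print(np.allclose(x_weighted, x_gain))
```

The two routes agree because the innovation seen by the variational step is the original innovation multiplied by (I − H K_ens), which is exactly the factor appearing in the hybridized gain.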
2.2.3 Advanced hybrid gain data assimilation method
In this study, based on the hybrid gain approach, an advanced hybrid gain DA method (AdvHG) is newly proposed as follows:

$$\mathbf{x}^a_{\mathrm{AdvHG}} = \alpha\,\mathbf{x}^f_{\mathrm{ERA5}} + (1-\alpha)\,\mathbf{x}^a_{\mathrm{E3DVAR}}, \tag{7}$$

where $\mathbf{x}^f_{\mathrm{ERA5}}$ denotes the 6 h forecast of ERA5 reanalysis based on the WRF model and $\mathbf{x}^a_{\mathrm{E3DVAR}}$ denotes the analysis of E3DVAR (Fig. 2). In Eq. (7), $\alpha$ is a tunable parameter and is assigned to be 0.5 in this study. This advanced hybrid gain approach is different from the hybrid gain approach in that (1) E3DVAR analysis is used instead of EnKF, (2) the 6 h forecast of ERA5 is used instead of deterministic analysis from the variational DA method, and (3) the preexisting and state-of-the-art reanalysis data (i.e., ERA5) are simply used instead of producing deterministic analysis by assimilation. The reasons for these different approaches proposed in this study are as follows:
(1) E3DVAR is used instead of EnKF because Yang and Kim (2021a) confirmed that E3DVAR outperforms EnKF for winter and summer seasons over East Asia.
(2) Instead of deterministic analysis, the 6 h forecast of ERA5 based on the WRF model is used to make the hybrid analysis more balanced and consistent with the WRF model, because ERA5 reanalysis fields are based on its own modeling system with coarser resolution, which is different from that used in this study.
(3) European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA5) is used instead of producing our own analysis fields from a variational DA method. This is a very efficient approach because of the cost savings as well as the use of the high-quality latest reanalysis from ECMWF assimilating all currently available observations with the state-of-the-art and advanced technology.
Therefore, the approach proposed in this study is called “advanced hybrid gain method” (denoted as “AdvHG”).
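The weighting step of AdvHG can be sketched in a few lines of NumPy. This is only an illustration of the linear combination with α = 0.5: the function name `advhg_analysis` and the 2 × 2 temperature fields are hypothetical, and both inputs are assumed to be already on the same WRF grid.

```python
import numpy as np

def advhg_analysis(x_era5_fcst, x_e3dvar_anl, alpha=0.5):
    """Weighted combination of the WRF-based 6 h forecast from ERA5 and the
    E3DVAR analysis; both fields are assumed to share the same grid."""
    x_era5_fcst = np.asarray(x_era5_fcst, dtype=float)
    x_e3dvar_anl = np.asarray(x_e3dvar_anl, dtype=float)
    if x_era5_fcst.shape != x_e3dvar_anl.shape:
        raise ValueError("fields must be defined on the same grid")
    return alpha * x_era5_fcst + (1.0 - alpha) * x_e3dvar_anl

# Hypothetical temperature fields (K) on a tiny 2 x 2 grid.
t_era5_fcst = np.array([[271.0, 272.0], [273.0, 274.0]])
t_e3dvar_anl = np.array([[270.0, 273.0], [273.0, 276.0]])
t_advhg = advhg_analysis(t_era5_fcst, t_e3dvar_anl)
print(t_advhg)
```

With α = 0.5 the result is the arithmetic mean of the two fields, so neither source dominates the hybrid analysis.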
The NCEP PrepBUFR (Prepared or QC'd data in BUFR (Binary Universal Form for the Representation of meteorological data) format) conventional observations (global upper-air and surface weather observations, NCEP/NWS/NOAA/U.S.DOC, 2008) are used every 6 h (00:00, 06:00, 12:00, and 18:00 UTC) for assimilation by the E3DVAR and AdvHG methods (Fig. 1). The PrepBUFR is the output of the final process for preparing the observations to be assimilated in the different NCEP analyses. For these observations, rudimentary multi-platform quality control (QC) and more complex platform-specific QCs were conducted (e.g., surface pressure, rawinsonde heights and temperature, wind profiler, aircraft wind and temperature) at NCEP (Keyser, 2013). Furthermore, if the innovation (i.e., observation minus background) of an observation is greater than 5 times the observation error, that observation is rejected during the assimilation procedure in this study.
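The innovation check described above can be sketched as a simple mask. This is an illustration rather than the WRFDA code: the function name `innovation_check` and the sample values are hypothetical, and the check is assumed to apply to the absolute value of the innovation.

```python
import numpy as np

def innovation_check(obs, background_at_obs, obs_error, max_ratio=5.0):
    """Return True for observations that pass the check, i.e. whose absolute
    innovation does not exceed max_ratio times the observation error."""
    innovation = np.asarray(obs, dtype=float) - np.asarray(background_at_obs, dtype=float)
    return np.abs(innovation) <= max_ratio * np.asarray(obs_error, dtype=float)

# Hypothetical temperature observations (K) with a 1 K observation error.
obs = np.array([280.0, 290.0, 300.0])
background = np.array([281.0, 284.0, 300.5])
obs_error = np.array([1.0, 1.0, 1.0])
keep = innovation_check(obs, background, obs_error)
print(keep)
```

In this example the second observation has an innovation of 6 K, which exceeds 5 times the 1 K error, so it would be rejected.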
The assimilated observations are as follows: the surface observations (SYNOP, METAR, Ship, and Buoy), radiosonde observation (SOUND), upper-wind report (PILOT), wind profiler, aircraft, atmospheric motion vector (AMV) wind from satellites, scatterometer oceanic surface winds (Scatwind), and precipitable water vapor from the Global Positioning System (GPSPW). The observation errors depending on each observation platform, variable, and vertical levels are assigned based on the default observation error statistics provided in the WRFDA system (Table 2). All observations are spatially thinned by 20 km except for AMV thinned by 200 km, as done by Warrick (2015), Cotton et al. (2016), and Shin et al. (2016).
To evaluate 6 h accumulated precipitation simulated by E3DVAR, AdvHG, ERA-I, and ERA5 over East Asia, global surface weather observations (NCEP PrepBUFR, NCEP/NWS/NOAA/U.S.DOC, 2008) are used every 6 h (00:00, 06:00, 12:00, and 18:00 UTC). For an evaluation of the monthly precipitation fields, the World Monthly Surface Station Climatology (NCDC/NESDIS/NOAA/U.S.DOC et al., 1981) over 4700 different stations (2600 in more recent years) is used.
2.4 Global reanalysis datasets
To compare the generated EARR with other reanalysis datasets, ERA5 (Hersbach et al., 2020) and ERA-I (Dee et al., 2011) reanalyses are chosen. The horizontal resolutions of ERA-I and ERA5 are approximately 79 (TL255) and 31 km (TL639), respectively. Because ERA5 is based on the operational system in 2016, improvements in model physics, numerics, data assimilation, and additional observations over the last decade are the advantages of ERA5 (Hersbach et al., 2018).
In this study, (re)forecast as well as reanalysis fields need to be verified. Regarding reanalysis and (re)forecast fields of ECMWF, reanalysis fields (i.e., ERA5 and ERA-I) downloaded from ECMWF are evaluated (Figs. 3 and 6). Two different (re)forecast fields (e.g., ERA5_fromECMWF, WRF-based ERA5) are used in this study. WRF-based ERA5 and ERA-I are forecast fields based on the WRF model with 12 km horizontal resolution where ERA5 and ERA-I are used as initial conditions. By contrast, ERA5_fromECMWF and ERA-I_fromECMWF are reforecast fields based on the ECMWF model not the WRF model, and thus the reforecast fields of ERA5 and ERA-I are provided and downloaded from ECMWF. These reforecast fields are only used for evaluation of precipitation (Figs. 8 and 9). The (re)analysis and (re)forecast fields and corresponding experiments are explained in Table 3.
3.1 Equitable threat score and frequency bias index
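Based on the contingency table (Table 4), ETS is defined as

$$\mathrm{ETS} = \frac{\mathrm{hits} - \mathrm{hits}_{\mathrm{random}}}{\mathrm{hits} + \mathrm{misses} + \mathrm{false\ alarms} - \mathrm{hits}_{\mathrm{random}}}, \qquad \mathrm{hits}_{\mathrm{random}} = \frac{(\mathrm{hits} + \mathrm{misses})(\mathrm{hits} + \mathrm{false\ alarms})}{N}, \tag{8}$$

where $N$ is the total number of forecast–observation pairs. The ETS ranges from $-1/3$ to 1, and a value of 1 is a perfect score. ETS is a more balanced score than probability of detection (POD) and false alarm ratio (FAR) because it is sensitive to both false alarms and misses (Wilson, 2017).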
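FBI is defined as

$$\mathrm{FBI} = \frac{\mathrm{hits} + \mathrm{false\ alarms}}{\mathrm{hits} + \mathrm{misses}}. \tag{9}$$

The FBI indicates whether the model tends to over-forecast (too frequently, FBI > 1) or under-forecast (not frequently enough, FBI < 1) events with respect to frequency of occurrence.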
3.2 Probability of detection and false alarm ratio
Based on the contingency table (Table 4), POD is defined as

$$\mathrm{POD} = \frac{\mathrm{hits}}{\mathrm{hits} + \mathrm{misses}}. \tag{10}$$

The POD ranges from 0 to 1. POD needs to be used together with FAR because POD can be artificially improved by systematically over-forecasting the events (Wilson, 2017).
FAR is defined as

$$\mathrm{FAR} = \frac{\mathrm{false\ alarms}}{\mathrm{hits} + \mathrm{false\ alarms}}. \tag{11}$$

The FAR ranges from 0 to 1, and a lower score implies a higher accuracy.
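The four categorical scores of Sect. 3.1 and 3.2 can be computed from a single contingency table, as in the following sketch. The function names, the sample precipitation values, and the use of "greater than or equal to" for the threshold test are assumptions for illustration; the sketch also does not guard against empty categories (division by zero).

```python
import numpy as np

def contingency_counts(forecast, observed, threshold):
    """Count hits, false alarms, misses, and correct negatives for one threshold."""
    f = np.asarray(forecast) >= threshold
    o = np.asarray(observed) >= threshold
    hits = int(np.sum(f & o))
    false_alarms = int(np.sum(f & ~o))
    misses = int(np.sum(~f & o))
    correct_negatives = int(np.sum(~f & ~o))
    return hits, false_alarms, misses, correct_negatives

def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """ETS, FBI, POD, and FAR from the contingency table counts."""
    n = hits + false_alarms + misses + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / n
    return {
        "ETS": (hits - hits_random) / (hits + misses + false_alarms - hits_random),
        "FBI": (hits + false_alarms) / (hits + misses),
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
    }

# Hypothetical 6 h accumulated precipitation (mm) at eight stations.
fcst = np.array([0.0, 2.0, 5.0, 0.2, 1.5, 0.0, 3.0, 0.0])
obs = np.array([0.0, 1.0, 0.0, 1.2, 2.0, 0.0, 0.0, 0.0])
counts = contingency_counts(fcst, obs, threshold=1.0)
scores = categorical_scores(*counts)
print(counts, scores)
```

In this toy case the forecast produces more events than observed (FBI above 1), which raises POD but also FAR; ETS penalizes both the false alarms and the miss.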
3.3 Brier skill score
Verification of the performance of high-resolution forecast with the traditional verification metrics (e.g., ETS, FBI) can be misleading due to a double penalty, particularly for highly variable fields (e.g., precipitation). Therefore, as one of the spatial verification approaches that do not require forecast to match point observation spatially, the neighborhood (fuzzy) verification method, which assumes that a slightly displaced forecast can be acceptable and a local neighborhood can define the degree of allowable displacement (Ebert, 2008; Kim et al., 2015; On et al., 2018), is used in this section. According to Ebert (2008), depending on the matching strategy, neighborhood verifications can be categorized into two frameworks: “single observation–neighborhood forecast (SO-NF)” where neighborhood forecasts surrounding observations are considered, and “neighborhood observation–neighborhood forecast (NO-NF)” strategies where not only neighborhood forecasts but also neighborhood observations surrounding observations are considered. Due to the absence of high-resolution gridded precipitation observation data in East Asia, various verification scores widely used as an NO-NF strategy are not available in this study. Thus, in this section, the Brier skill score (BSS), as one of the SO-NF strategies, is introduced.
The Brier score (BS) is similar to the mean squared error (MSE) and is defined as (Wilks, 2006)

$$\mathrm{BS} = \frac{1}{N}\sum_{i=1}^{N}\left(p_i - o_i\right)^2, \tag{12}$$

where $p_i$ denotes the probability forecast, $o_i$ denotes the binary observation, which is either 0 or 1, and $N$ is the total number of observations during the given period. Generally, the BSS (or BS) is used to verify ensemble forecasts, from which probabilistic forecasts can be calculated (Kay et al., 2013; Kim and Kim, 2017). However, the BSS can also be used for deterministic forecasts using a pragmatic post-processing procedure (Theis et al., 2005; Mittermaier, 2014), which derives probabilistic forecasts from deterministic forecasts at every model grid point by considering the neighborhood forecasts as a pseudo ensemble. The BSS is defined as

$$\mathrm{BSS} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\mathrm{ref}}}, \tag{13}$$

where $\mathrm{BS}_{\mathrm{ref}}$ is the BS of the reference forecast. The BSS is the skill score with respect to the BS, as in Eq. (13). As the reference, a climatology or another forecast can be used. In this study, the WRF-based ERA-I is considered as the reference.
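The neighborhood procedure described above can be sketched as follows. The sketch assumes a square neighborhood window centered on the grid point nearest to each station, takes the fraction of points in the window that exceed the threshold as the pseudo-ensemble probability, and truncates windows at the domain edge; the function names and the 5 × 5 toy fields are hypothetical.

```python
import numpy as np

def neighborhood_probability(field, iy, ix, size, threshold):
    """Fraction of grid points exceeding the threshold in a size x size window
    centered on (iy, ix); the window is truncated at the domain boundary."""
    half = size // 2
    y0, y1 = max(iy - half, 0), min(iy + half + 1, field.shape[0])
    x0, x1 = max(ix - half, 0), min(ix + half + 1, field.shape[1])
    return float(np.mean(field[y0:y1, x0:x1] >= threshold))

def brier_score(prob, obs_binary):
    prob = np.asarray(prob, dtype=float)
    obs_binary = np.asarray(obs_binary, dtype=float)
    return float(np.mean((prob - obs_binary) ** 2))

def brier_skill_score(bs, bs_ref):
    return 1.0 - bs / bs_ref

# Hypothetical 6 h precipitation forecasts (mm) on a 5 x 5 grid.
fcst = np.zeros((5, 5))
fcst[1:4, 1:4] = 2.0                  # forecast being verified
ref = np.zeros((5, 5))
ref[0, 0] = 2.0                       # reference forecast

stations = [(2, 2), (0, 4)]           # nearest grid points to two stations
obs_mm = np.array([1.5, 0.0])         # observed precipitation at the stations
threshold, size = 1.0, 3              # 1 mm per 6 h, 3 x 3 neighborhood

obs_binary = (obs_mm >= threshold).astype(float)
p_fcst = [neighborhood_probability(fcst, iy, ix, size, threshold) for iy, ix in stations]
p_ref = [neighborhood_probability(ref, iy, ix, size, threshold) for iy, ix in stations]

bs = brier_score(p_fcst, obs_binary)
bs_ref = brier_score(p_ref, obs_binary)
bss = brier_skill_score(bs, bs_ref)
print(bs, bs_ref, bss)
```

A positive BSS means the verified forecast has a lower Brier score than the reference, so the choice of reference determines what "skill" is measured against.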
3.4 Pattern correlation coefficient
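The pattern correlation coefficient (PCC) is defined as Eq. (14) (Shiferaw et al., 2018; Yoo and Cho, 2018; Park and Kim, 2020),

$$\mathrm{PCC} = \frac{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)\left(o_i - \bar{o}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(o_i - \bar{o}\right)^2}}, \tag{14}$$

where $x_i$ and $o_i$ are the (re)forecast and observed precipitation at the $i$th observation location and the over-bar indicates the average over the $N$ observed stations in the verification area.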
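As a minimal sketch, the PCC is the Pearson correlation between forecast and observed values paired at the station locations. The function name and sample values below are hypothetical, and the forecast is assumed to be already interpolated to the stations.

```python
import numpy as np

def pattern_correlation(forecast_at_obs, observed):
    """Pearson correlation over N paired station values."""
    x = np.asarray(forecast_at_obs, dtype=float)
    o = np.asarray(observed, dtype=float)
    xa = x - x.mean()
    oa = o - o.mean()
    return float(np.sum(xa * oa) / np.sqrt(np.sum(xa ** 2) * np.sum(oa ** 2)))

# Hypothetical 6 h precipitation (mm) at five stations.
fcst_mm = np.array([0.0, 1.2, 3.5, 0.4, 8.0])
obs_mm = np.array([0.1, 0.8, 4.0, 0.0, 6.5])
pcc = pattern_correlation(fcst_mm, obs_mm)
print(pcc)
```

Because the means are removed and the result is normalized, the PCC measures agreement in spatial pattern only and is insensitive to a constant bias or a uniform scaling of the forecast amounts.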
4.1 Evaluation of wind, temperature, and humidity variables
4.1.1 RMSE for January and July 2017
The analysis and forecast RMSEs of E3DVAR, AdvHG, the WRF-based ERA-I, and the WRF-based ERA5 are calculated for zonal wind, meridional wind, temperature, and Qvapor (water vapor mixing ratio in WRF) variables against sonde observations at 00:00 and 12:00 UTC in the verification domain (dashed box in Fig. 1) for January and July 2017 and averaged over each month (Figs. 3, 4, and 5).
The analysis RMSE of E3DVAR (Fig. 3) is smaller than that of AdvHG for all pressure levels and variables, except for temperature in July at 1000 hPa and Qvapor in January and July at 1000 hPa. In general, the analysis RMSE of AdvHG for all variables is comparable to or greater than that of ERA5. The analysis RMSE of ERA5 is smaller than that of ERA-I for all levels and variables; in particular, the analysis RMSE difference between ERA5 and ERA-I is distinctive for wind.
Regarding the wind variables of the analysis (Fig. 3a, b, c, and d), E3DVAR is the most closely fitted to observations except for the wind in the upper troposphere in January, followed by ERA5, AdvHG, and ERA-I. For temperature (Fig. 3e and f), the RMSE of E3DVAR is smaller than that of AdvHG. For Qvapor, the RMSE in July is much larger than that in January due to a monsoonal flow carrying moist air to East Asia. In general, the Qvapor RMSE of E3DVAR is the smallest, followed by ERA5, AdvHG, and ERA-I. Therefore, for all variables, E3DVAR analysis fields are generally the most closely fitted to observations. Since the analysis RMSE indicates how closely the analysis fields are fitted to observations rather than the accuracy of the analysis itself, not only the analysis RMSE but also the forecast RMSE should be considered.
For 24 h forecast fields in January (Fig. 4a, c, e, and g), overall, the RMSEs of AdvHG and E3DVAR are greater than those of ERA5 and smaller than those of ERA-I, and the AdvHG RMSE is smaller than the E3DVAR RMSE for all levels and variables. Meanwhile, for July (Fig. 4b, d, f, and h), AdvHG and E3DVAR show comparable RMSE to ERA-I.
Furthermore, the general features of the 36 h forecast RMSE (Fig. 5) are similar to those of the 24 h forecast RMSE (Fig. 4). However, particularly in January, the 36 h forecast RMSE differences between ERA5 and ERA-I are more distinctive than those of the 24 h forecast. In January, the vertically averaged 36 h forecast RMSE differences between ERA5 and ERA-I are 0.52 m s−1 for wind, 0.16 K for temperature, and 0.08 g kg−1 for Qvapor, whereas those of the 24 h forecast are 0.4 m s−1 for wind, 0.11 K for temperature, and 0.06 g kg−1 for Qvapor. In addition, the 36 h forecast RMSE differences between ERA5 and AdvHG for January are on average 0.1 m s−1 for wind, 0.05 K for temperature, and 0.02 g kg−1 for Qvapor, which are even smaller than those of the 24 h forecast, implying that AdvHG is much more accurate than ERA-I for January 2017. For July, the 36 h forecast RMSE of ERA5 is the smallest and the RMSEs of AdvHG and E3DVAR are similar to those of ERA-I.
4.1.2 RMSE and spread for the period 2010–2019
In this section, the EARR produced in this study is verified for a longer period with WRF-based ERA5. The RMSE and spread of reanalyses and reforecasts based on the AdvHG method are calculated and averaged over the period 2010–2019. The reanalyses and (re)forecast fields are evaluated by calculating RMSE valid at 00:00 and 12:00 UTC and spread at 00:00, 06:00, 12:00, and 18:00 UTC.
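The two diagnostics used in this section can be sketched as follows. The text does not give the exact formulas, so the definitions below are assumptions: RMSE is taken over paired model and sonde values, and spread is taken as the square root of the mean ensemble variance about the ensemble mean. Function names and sample values are hypothetical.

```python
import numpy as np

def rmse(model_at_obs, obs):
    """Root-mean-square error of model values against paired observations."""
    diff = np.asarray(model_at_obs, dtype=float) - np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def ensemble_spread(members):
    """Square root of the mean ensemble variance.

    members : (n_members, n_points) array; the variance uses the unbiased
    estimator (ddof=1), which is an assumption of this sketch.
    """
    members = np.asarray(members, dtype=float)
    return float(np.sqrt(np.mean(np.var(members, axis=0, ddof=1))))

# Hypothetical temperatures (K) at three sonde locations and a 2-member ensemble.
example_rmse = rmse([271.0, 272.0, 273.0], [271.0, 272.0, 275.0])
example_spread = ensemble_spread([[1.0, 2.0], [3.0, 4.0]])
print(example_rmse, example_spread)
```

Comparing the two quantities at each level is a common consistency check: a spread well below the RMSE of the ensemble mean suggests an underdispersive ensemble.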
The averaged RMSEs of reanalysis for ERA5 and EARR (denoted as AdvHG in Fig. 6) and spread of analysis and 6 h forecast fields of EARR (AdvHG) are shown in Fig. 6. With respect to spread, the ensemble spreads of analysis fields are smaller than those of 6 h forecast fields, on average, by 0.15 m s−1 for wind, 0.04 K for temperature, and 0.02 g kg−1 for Qvapor, which is the well-known characteristic of ensemble-based DA methods. Specifically, the wind spread (Fig. 6a and b) is similar to or greater than the wind RMSE except for the upper troposphere above 200 hPa, implying the ensemble spread for wind is well represented below 200 hPa. On the contrary, the ensembles for temperature and Qvapor (Fig. 6c and d) are underdispersive compared to their RMSEs.
Regarding the reanalysis RMSE, overall AdvHG RMSE is greater than ERA5 RMSE for all variables (Fig. 6). The vertically averaged RMSEs of AdvHG are greater by 0.16 m s−1 for wind, 0.09 K for temperature, and 0.01 g kg−1 for Qvapor than those of ERA5. Nonetheless, the wind RMSEs of AdvHG are similar to those of ERA5 for the middle of the troposphere (400–850 hPa), and the Qvapor RMSEs of AdvHG are similar to those of ERA5 except for 1000 hPa.
In addition, regarding the 24 h forecast RMSE, AdvHG shows a larger RMSE than ERA5 for all variables (Fig. 7). The vertically averaged RMSE differences of wind, temperature, and Qvapor variables between AdvHG and ERA5 are approximately 0.2 m s−1, 0.07 K, and 0.03 g kg−1, respectively. These differences are smaller, compared to the 24 h forecast RMSE difference between ERA-I and ERA5 shown in Fig. 4 (i.e., wind, temperature, and Qvapor RMSE difference: 0.4 m s−1, 0.11 K, and 0.06 g kg−1 for January 2017, 0.25 m s−1, 0.05 K, and 0.04 g kg−1 for July 2017).
4.2 Evaluation of precipitation for January and July in 2017
4.2.1 Evaluation metrics
Equitable threat score and frequency bias index
In this section, for the point-based equitable threat score (ETS) and frequency bias index (FBI) based on Table 4, the 6 h accumulated precipitation fields based on the 6 h forecast of E3DVAR, AdvHG, WRF-based ERA-I, WRF-based ERA5, ERA-I_fromECMWF, and ERA5_fromECMWF are evaluated every 6 h (00:00, 06:00, 12:00, and 18:00 UTC) for January and July 2017 (Fig. 8). Here, all the WRF-based precipitation fields are based on 12 km horizontal resolution, and ERA-I_fromECMWF and ERA5_fromECMWF have 79 and 31 km horizontal resolutions, respectively. Generally, ETS decreases as the threshold increases for both months (Fig. 8a and c). For January 2017 (Fig. 8a), the AdvHG ETS is the greatest of all. Compared to precipitation reforecasts from ECMWF (i.e., ERA-I_fromECMWF, ERA5_fromECMWF), AdvHG shows a higher ETS, indicating that AdvHG is able to simulate more accurate precipitation fields than ERA-I and ERA5 from ECMWF in January 2017. Surprisingly, the ETS of ERA5_fromECMWF for January 2017 is the lowest among all the results and is even lower than that of ERA-I_fromECMWF.
Since the precipitation reforecasts from ECMWF have not only coarser resolutions but also a different forecast model (i.e., the forecasting system of ECMWF), the precipitation forecasts of ERA5 and ERA-I are additionally produced by using the same forecast model with the same resolution as AdvHG and E3DVAR in this study, as explained in Sect. 2.4. For January 2017 (Fig. 8a), ETS of ERA5 (i.e., WRF-based ERA5) is higher than that of ERA5_fromECMWF for all thresholds, whereas ETS of ERA-I (i.e., WRF-based ERA-I) is lower than that of ERA-I_fromECMWF except for high thresholds (8 and 16 mm per 6 h). The ERA5 ETS is greater than the ERA-I ETS, but is smaller than the AdvHG ETS. The AdvHG shows the greatest ETS among others with the same resolution and forecast model, and E3DVAR, ERA5, and ERA-I follow.
Regarding FBI in winter (Fig. 8b), for 4, 8, and 16 mm per 6 h thresholds, all the results show that FBI is smaller than 1, implying an underestimation of the frequency of precipitation for high-threshold events. In general, AdvHG shows the FBI closest to 1 among all the results, which is consistent with the greatest ETS of AdvHG. The E3DVAR FBI is similar to the AdvHG FBI, and ERA5 and ERA-I FBIs are similar to each other.
Overall, the ETS values for January, whose maximum is around 0.4 (Fig. 8a), are much greater than those for July 2017, whose maximum is around 0.2 (Fig. 8c), implying that the precipitation forecast in summer is more difficult than that in winter. The ETS difference between the results in July is smaller than that in January. Particularly, for the thresholds 4 and 8 mm per 6 h, the ETSs in July are similar to each other (Fig. 8c). Except for those two thresholds, the ETS of ERA-I_fromECMWF is the smallest. At the threshold of 16 mm per 6 h, ERA5 ETS is the highest, followed by AdvHG, E3DVAR, ERA-I, ERA5_fromECMWF, and ERA-I_fromECMWF. At the threshold of 0.5 and 1 mm per 6 h, the E3DVAR ETS is the greatest, followed by ERA5, AdvHG, ERA5_fromECMWF, ERA-I, and ERA-I_fromECMWF.
With respect to FBI in July 2017, the WRF-based results yield FBIs greater than 1, whereas reforecast from ECMWF yields FBIs greater than 1 for 0.5, 1, and 4 mm per 6 h thresholds and smaller than 1 for higher thresholds (8 and 16 mm per 6 h) (Fig. 8d). For July 2017, in general, ERA5_fromECMWF FBI is the closest to 1, followed by E3DVAR, AdvHG, ERA5, ERA-I, and ERA-I_fromECMWF FBI.
Probability of detection and false alarm ratio
The probability of detection (POD or hit rate) and false alarm ratio (FAR) are calculated for precipitation simulated from E3DVAR, AdvHG, WRF-based ERA-I, WRF-based ERA5, ERA-I_fromECMWF, and ERA5_fromECMWF for January and July 2017 (Fig. 9). For January 2017, AdvHG POD is the greatest among the WRF-based results, followed by E3DVAR, ERA5, and ERA-I (Fig. 9a). In addition to the lowest ETS of ERA5_fromECMWF for January 2017 as discussed in the Sect. “Equitable threat score and frequency bias index”, the FAR of ERA5_fromECMWF is extremely high with a low POD in winter. Therefore, especially for January 2017, the precipitation fields simulated from EARR (AdvHG) over East Asia are much more accurate than those from ERA5_fromECMWF.
For July 2017, generally, AdvHG shows the largest POD, except for ERA5 (Fig. 9c). The FAR values in July are much greater than those in January, which is consistent with the ETS difference between these two seasons.
Brier skill score
The neighborhood sizes are chosen to be 3Δx, 5Δx, 9Δx, and 11Δx, which are 36, 60, 108, and 132 km, respectively, and the thresholds 0.5, 1, 4, 8, and 16 mm per 6 h are considered. The probabilistic precipitation forecasts are calculated at every model grid point depending on neighborhood sizes and thresholds. Regarding each observation, the nearest model grid point to observations is considered as the center of the neighborhood. For verification, 6 h accumulated precipitation fields are extracted from the first 0–6 h forecast fields of WRF-based ERA-I, WRF-based ERA5, E3DVAR, and AdvHG every 6 h (00:00, 06:00, 12:00, and 18:00 UTC). The BSSs of ERA5_fromECMWF and ERA-I_fromECMWF are not calculated, because they have a different resolution from WRF-based results.
Based on the neighborhood approach, the BSS is calculated depending on different neighborhood sizes for January and July 2017, respectively (Fig. 10). Because the reference of BS is chosen as the ERA-I, the positive BSS suggests a better accuracy than ERA-I. In general, for both months, the AdvHG BSS is greater than the ERA5 BSS. Although the E3DVAR BSS is the greatest in July 2017, the AdvHG BSS is the greatest in January 2017.
For January 2017, the AdvHG and E3DVAR BSSs tend to increase as the neighborhood size increases, whereas the ERA5 BSS does not. Overall, the AdvHG BSS is the greatest for all thresholds and all neighborhood sizes. The ERA5 BSS is greater than the E3DVAR BSS except for 16 mm per 6 h. The highest BSS of AdvHG and the lowest BSS of ERA-I are consistent with the ETS result. However, whereas the E3DVAR ETS is greater than the ERA5 ETS, the ERA5 BSS is greater than the E3DVAR BSS in January 2017.
For July 2017, while the ETS difference between the WRF-based results is not distinct (Fig. 8c), the BSS difference is rather noticeable. Generally, the E3DVAR BSS is the greatest for all thresholds except for 16 mm per 6 h at neighborhood sizes 9Δx and 11Δx. Although the E3DVAR BSS is the largest, AdvHG outperforms ERA5 and ERA-I. The worst performance of ERA-I precipitation is consistent with the ETS result. At 0.5, 1, and 4 mm per 6 h thresholds, the E3DVAR BSS is the greatest, which is similar to the ETS result. At 8 and 16 mm per 6 h thresholds, the ERA5 ETS is the highest, followed by AdvHG and E3DVAR, whereas overall, the E3DVAR BSS is the highest, followed by AdvHG and ERA5.
4.2.2 Spatial distribution
Six h accumulated precipitation with the pattern correlation coefficient
In this section, the spatial distributions of 6 h accumulated precipitation from the WRF-based forecast and reforecast from ECMWF are compared. In addition, pattern correlation coefficients (PCC) are calculated and shown at the bottom right of Figs. 11 and 12.
The PCC is the usual Pearson correlation computed over the N pairs of 6 h accumulated precipitation values from the (re)forecast and from observations at a specific time. For the calculation of the PCC, the 6 h accumulated precipitation fields from the (re)forecasts are interpolated bilinearly to the N observation points.
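The two steps above (bilinear interpolation to the observation points, then a Pearson correlation over the N pairs) can be sketched as follows. This is an illustrative sketch only: it assumes that station positions are already expressed as fractional grid indices (y, x), whereas a real WRF grid would first require a map-projection transform from latitude and longitude, and the function names are hypothetical.

```python
import numpy as np

def bilinear(field, y, x):
    """Bilinear interpolation of a 2-D field at fractional grid index (y, x)."""
    y0 = min(max(int(np.floor(y)), 0), field.shape[0] - 2)
    x0 = min(max(int(np.floor(x)), 0), field.shape[1] - 2)
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * field[y0, x0]
            + (1 - dy) * dx * field[y0, x0 + 1]
            + dy * (1 - dx) * field[y0 + 1, x0]
            + dy * dx * field[y0 + 1, x0 + 1])

def pattern_correlation(field, stations_yx, obs):
    """Pearson correlation between interpolated model values and observations."""
    model_at_obs = np.array([bilinear(field, y, x) for y, x in stations_yx])
    return float(np.corrcoef(model_at_obs, np.asarray(obs, dtype=float))[0, 1])

# Toy field equal to y + x, so the interpolated values are exactly y + x.
field = np.array([[0.0, 1.0, 2.0],
                  [1.0, 2.0, 3.0],
                  [2.0, 3.0, 4.0]])
stations = [(0.5, 0.5), (1.0, 1.5), (1.5, 0.25), (2.0, 2.0)]
obs = [2.0, 5.0, 3.5, 8.0]  # twice the model values: a perfectly correlated pattern
pcc = pattern_correlation(field, stations, obs)
```

Because the PCC is insensitive to a constant scaling of the field, the toy observations (twice the model values) still give a PCC of 1, which illustrates why the PCC alone does not represent the exact accuracy of precipitation.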
First, on 29 and 30 January 2017 (Fig. 11), it is noticeable that the precipitation fields of AdvHG match observations well over East Asia, whereas, in particular, those of ERA5_fromECMWF do not. For example, ERA5_fromECMWF overestimates precipitation over the inland area of China (Fig. 11zz), while AdvHG simulates precipitation similar to observations regarding its position and intensity (Fig. 11x). ERA5_fromECMWF also shows a noticeably smaller PCC (Fig. 11g, n, and zz). Although PCC does not represent the exact accuracy or predictability of precipitation, the overall feature of PCC is consistent with the results found so far. For January 2017, the averaged PCC of AdvHG is the greatest (i.e., 0.61) and that of ERA5_fromECMWF is the smallest (i.e., 0.46; not shown).
For 1 and 2 July 2017 (Fig. 12), in general, AdvHG, E3DVAR, and ERA5 simulate well not only the overall features of precipitation fields but also their intensity. During July 2017, ERA5 and ERA-I simulate heavier precipitation than AdvHG (not shown), which is consistent with the larger FBI of ERA5 and ERA-I at higher thresholds. For the 1-month period of July 2017, the averaged PCC of ERA5 is the greatest (i.e., 0.37) and that of AdvHG is 0.34, but the PCC difference between ERA5 and AdvHG is not distinctive. Moreover, the overall range of averaged PCC of different datasets in summer (i.e., 0.29–0.37) is smaller than that in winter (i.e., 0.46–0.61), which is consistent with the seasonal difference of ETS in this study.
Monthly accumulated precipitation
In this section, the monthly accumulated precipitation fields of rain gauge-based observations, E3DVAR, AdvHG, ERA-I, ERA5, ERA-I_fromECMWF, and ERA5_fromECMWF are compared with each other for two 1-month periods in January and July 2017, respectively.
The monthly accumulated precipitation fields simulated by E3DVAR and AdvHG (Fig. 13b and c) are similar to each other, and E3DVAR and AdvHG produce the best fit to observed fields. Especially for the northwestern part of Japan (e.g., Chugoku and Kinki), E3DVAR and AdvHG are able to represent precipitation correctly, whereas ERA-I_fromECMWF and ERA5_fromECMWF fail to do so (Fig. 13). Moreover, although all the results similarly represent overall features of precipitation in January (Fig. 13), ERA5_fromECMWF (Fig. 13g) overestimates precipitation over South China, which is consistent with the results in the previous section as well as its larger FBI at lower thresholds (0.5 and 1 mm per 6 h) shown in Fig. 8b. It is noticeable that all results fail to represent the observed precipitation area over the Tibetan Plateau (25–40° N, 95–105° E).
For the monthly accumulated precipitation in July 2017, overall, the ERA5_fromECMWF (Fig. 14g) and the WRF-based results (Fig. 14b, c, and e) except for ERA-I (Fig. 14d) simulate precipitation well, similar to observations. The WRF-based results including AdvHG overestimate precipitation over the western and southern parts of Japan, while ERA-I_fromECMWF and ERA5_fromECMWF simulate similar precipitation fields to observed fields. The WRF-based results tend to overestimate precipitation in South China, Korea, and Japan, compared with ERA-I_fromECMWF and ERA5_fromECMWF. This is consistent with the result in Fig. 8d, in which FBIs from WRF-based results are generally greater than 1 for higher thresholds (8 and 16 mm per 6 h), whereas those from ECMWF are smaller than 1.
Even though the detailed precipitation features of WRF-based results are different, the overall features of precipitation from WRF-based results are similar to each other, which implies that predictability of precipitation strongly depends on the physics schemes as well as on the NWP model, especially for the summer season. According to Que et al. (2016), depending on the combinations of physics options in the WRF model, the spatial distribution of precipitation can be significantly different over the Asian summer monsoon area, and the YSU PBL scheme which is used in this study tends to overestimate precipitation over the same area. Thus, different physics options could simulate the different spatial distribution of precipitation.
In addition, compared to ERA5 based on the WRF model (Fig. 14e), the ECMWF model for ERA5_fromECMWF (Fig. 14g) seems to suppress precipitation. Thus, the WRF model with the physics schemes used in this study might simulate more precipitation than the ECMWF model, although the initial condition is the same. Therefore, it is important to consider the consistency of the systems for DA and the forecast model for a good performance of forecast weather variables such as precipitation.
The EARR data presented in this study are available every 6 h (i.e., 00:00, 06:00, 12:00, and 18:00 UTC) for the period 2010–2019 from the Harvard Dataverse Repository (https://dataverse.harvard.edu/dataverse/EARR, last access: 17 March 2022). The EARR 6 hourly data on pressure levels (https://doi.org/10.7910/DVN/7P8MZT, Yang and Kim, 2021b) and 6 hourly precipitation data (https://doi.org/10.7910/DVN/Q07VRC, Yang and Kim, 2021c) are provided in NetCDF file format.
The EARR 6 hourly data on pressure levels (Yang and Kim, 2021b) include u-component of wind, v-component of wind, temperature, geopotential height, and specific humidity variables of reanalysis on pressure levels (i.e., 925, 850, 700, 500, 300, 200, 100, and 50 hPa). The EARR 6 hourly precipitation data (Yang and Kim, 2021c) contain the 6 h accumulated total precipitation variable of the 6 h reforecast on a single level. The 6 h accumulated total precipitation is obtained from the 6 h reforecast field which is integrated for 6 h from the reanalysis field every 6 h (i.e., 00:00, 06:00, 12:00, and 18:00 UTC).
In this study, to develop the regional reanalysis system over East Asia, the advanced hybrid gain algorithm (AdvHG) is newly proposed and evaluated with a traditional hybrid DA method (E3DVAR) as well as existing reanalyses from ECMWF (ERA5 and ERA-I) for January and July 2017. The East Asia Regional Reanalysis (EARR) system is developed based on the AdvHG as the data assimilation method using the WRF model and conventional observations. The high-resolution regional reanalysis and reforecast fields over East Asia with 12 km horizontal resolution are produced and evaluated against observations with ERA5 for the 10-year period 2010–2019.
The AdvHG newly proposed in this study is based on the hybrid gain approach, weighting analyses from variational-based and ensemble-based DA algorithms to generate optimal hybrid analysis, which can play an important role as a simple and practical method in the foreseeable future to take advantage of the strength of each DA method. The advanced hybrid gain method is different from the hybrid gain approach in that (1) E3DVAR is used instead of EnKF, (2) 6 h forecast of ERA5 is used instead of deterministic analysis for a more balanced and consistent analysis with the WRF model, and (3) the pre-existing and state-of-the-art reanalysis data (i.e., ERA5) are simply used instead of producing our own analysis fields from a variational DA method. Thus, it can be regarded as an efficient approach for generating a regional reanalysis dataset because of cost savings and the use of the state-of-the-art reanalysis from ECMWF that assimilates all available observations.
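The weighting described above can be written schematically as a linear combination of the two analyses. The notation below is a sketch for illustration: the symbol α for the weight and the superscripts are assumptions introduced here, and the value of α used in EARR is not restated in this section.

```latex
% Schematic hybrid gain weighting (notation assumed for illustration):
% x^a_E3DVAR : E3DVAR analysis on the WRF grid
% x^f_ERA5   : 6 h forecast from ERA5, interpolated to the same grid
% alpha      : weight between the two fields, 0 <= alpha <= 1
\mathbf{x}^{a}_{\mathrm{AdvHG}}
  = \alpha\,\mathbf{x}^{a}_{\mathrm{E3DVAR}}
  + (1-\alpha)\,\mathbf{x}^{f}_{\mathrm{ERA5}},
  \qquad 0 \le \alpha \le 1 .
```

In this form, α = 1 recovers the E3DVAR analysis and α = 0 recovers the ERA5 6 h forecast, so the weight controls how much regional ensemble-based information is blended with the large-scale information from the global reanalysis.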
For verification, the latest ECMWF reanalysis and reforecast datasets (i.e., ERA5 and ERA-I) are used. With respect to forecast variables, two different forecast fields of ECMWF are used: (1) reforecast fields from ECMWF (i.e., ERA5_fromECMWF and ERA-I_fromECMWF) and (2) forecast fields (i.e., WRF-based ERA5 and WRF-based ERA-I) integrated in the WRF model with 12 km resolution using ERA5 and ERA-I as initial conditions.
Analysis and forecast wind, temperature, and humidity variables of AdvHG are evaluated with ERA5 for the 10-year period and assessed with five different experiments (i.e., E3DVAR, ERA5, ERA-I, ERA5_fromECMWF, ERA-I_fromECMWF) for January and July 2017. Overall, the analysis RMSE of E3DVAR is the smallest among others but comparable to that of ERA5, especially for January 2017. Regarding forecast variables, AdvHG outperforms E3DVAR for January and July 2017. Although ERA5 outperforms AdvHG for upper-air variables for two seasons in 2017, AdvHG outperforms ERA-I in January and shows comparable performance to ERA-I in July. Additionally, the verification results of AdvHG and ERA5 for the period 2010–2019 are consistent with those for two 1-month periods in 2017.
The precipitation forecast variables are also verified regarding a neighborhood-based verification score (i.e., BSS) as well as the point-based verification scores (i.e., ETS, FBI, POD, and FAR). According to the point-based verification scores, the precipitation forecast of AdvHG in January is the most accurate, followed by E3DVAR, ERA5, and ERA-I. For July, the overall ETS values of all results are relatively lower than those in January, implying a lower predictability in summer. In addition, the ETS differences between the results are not distinctive in July. For higher thresholds (8 and 16 mm per 6 h) in July, AdvHG ETS is greater than E3DVAR ETS and smaller than ERA5 ETS, whereas E3DVAR ETS is the greatest followed by ERA5 and AdvHG for lower thresholds (0.5 and 1 mm per 6 h).
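For reference, the point-based scores named above can be computed from a 2 × 2 contingency table as in the following sketch. It assumes the standard textbook definitions, with FAR taken as the false alarm ratio; the function name and the counts are hypothetical and do not come from the EARR verification.

```python
def contingency_scores(hits, misses, false_alarms, correct_negatives):
    """POD, FAR (false alarm ratio), FBI, and ETS from contingency-table counts."""
    total = hits + misses + false_alarms + correct_negatives
    # Number of hits expected by chance, used by the ETS.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    return {
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "FBI": (hits + false_alarms) / (hits + misses),
        "ETS": (hits - hits_random) / (hits + misses + false_alarms - hits_random),
    }

# Toy counts for one threshold, e.g., 1 mm per 6 h.
scores = contingency_scores(hits=30, misses=10, false_alarms=20, correct_negatives=140)
```

An FBI above 1, as in this toy case (1.25), indicates that the event is forecast more often than it is observed, which is the sense in which the FBI is used to diagnose overestimated precipitation.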
To prevent double penalty when verifying highly variable data with high resolution (e.g., precipitation), the BSS based on the neighborhood approach is calculated for 6 h accumulated precipitation forecasts depending on different neighborhood sizes for January and July 2017. In general, the BSS of AdvHG is greater than that of ERA5 and ERA-I for both months. Although the E3DVAR BSS is the greatest in July 2017, the AdvHG BSS is the greatest in January 2017.
Lastly, the spatial distributions of 6 h and monthly accumulated precipitation forecast for AdvHG, E3DVAR, ERA-I, ERA5, ERA-I_fromECMWF, and ERA5_fromECMWF are compared with rain gauge-based observations. For January 2017, it is noticeable that AdvHG precipitation is the closest to observations with the highest PCC (i.e., 0.61), and ERA5_fromECMWF overestimates precipitation over South China with the lowest PCC (i.e., 0.46). For July 2017, the WRF-based results tend to overestimate precipitation compared to ERA-I_fromECMWF and ERA5_fromECMWF. In addition, even though the averaged PCC of ERA5 (i.e., 0.37) is slightly greater than that of AdvHG (i.e., 0.34), the PCC difference between ERA5 and AdvHG is not distinctive and overall the range of averaged PCC of all datasets in summer (i.e., 0.29–0.37) is smaller than that in winter (i.e., 0.46–0.61).
In conclusion, for upper-air variables, overall, ERA5 outperforms EARR based on AdvHG, but the RMSE difference between ERA5 and EARR (AdvHG) is smaller than that between ERA5 and ERA-I. In addition, EARR outperforms ERA-I for January 2017 and shows comparable performance to ERA-I for July 2017. In contrast, according to the evaluation results of precipitation, in general, EARR represents precipitation better than ERA5 as well as ERA5_fromECMWF for January and July 2017. Although E3DVAR represents precipitation better than EARR for July, the difference is not considerable, and EARR simulates precipitation for January better than E3DVAR does. Therefore, although the uncertainties of upper-air variables of EARR should be considered when analyzing them, the precipitation reforecast of EARR is more accurate than that of ERA5 for both seasons.
Combining the global reanalysis data (i.e., ERA5) characterized by the high quality of large-scale features with detailed smaller-scale features in the higher resolution represented by the ensemble-based assimilation method (i.e., E3DVAR) as well as a community numerical weather prediction model (i.e., WRF model) is a key factor for EARR to be able to produce high-resolution initial conditions represented with regional features, which could contribute to a reduction in forecast errors, especially for precipitation. Therefore, EARR has its own advantage of representing regional features of precipitation better than relatively coarse-resolution global reanalysis.
HMK proposed the main scientific ideas and EGY contributed the supplementary ideas during the process. EGY developed the reanalysis system and produced the 10-year regional reanalysis data. EGY and HMK analyzed the simulation results and completed the paper. DHK contributed to analyzing the reanalysis data and to the preparation of software and computing resources for the reanalysis system.
The contact author has declared that neither they nor their co-authors have any competing interests.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The authors appreciate the reviewers for their valuable comments. This study was carried out by utilizing the supercomputer system supported by the National Center for Meteorological Supercomputer of Korea Meteorological Administration and Korea Research Environment Open NETwork (KREONET) provided by the Korea Institute of Science and Technology Information. The authors gratefully acknowledge the late Fuqing Zhang for collaborations at the earlier stages of this study.
This study was supported by the National Research Foundation of Korea (NRF) grant funded by the South Korean government (Ministry of Science and ICT) (grant no. 2021R1A2C1012572) and the Yonsei Signature Research Cluster Program of 2021 (grant no. 2021-22-0003).
This paper was edited by Qingxiang Li and reviewed by Minyan Wang and one anonymous referee.
Ashrit, R., Indira Rani, S., Kumar, S., Karunasagar, S., Arulalan, T., Francis, T., Routray, A., Laskar, S. I., Mahmood, S., Jermey, P., Maycock, A., Renshaw, R., George, J. P., and Rajagopal, E. N.: IMDAA Regional Reanalysis: Performance Evaluation During Indian Summer Monsoon Season, J. Geophys. Res.-Atmos., 125, e2019JD030973, https://doi.org/10.1029/2019JD030973, 2020.
Bonavita, M., Hamrud, M., and Isaksen, L.: EnKF and hybrid gain ensemble data assimilation. Part II: EnKF and hybrid gain results, Mon. Weather Rev., 143, 4865–4882, https://doi.org/10.1175/MWR-D-15-0071.1, 2015.
Borsche, M., Kaiser-Weiss, A. K., Undén, P., and Kaspar, F.: Methodologies to characterize uncertainties in regional reanalyses, Adv. Sci. Res., 12, 207–218, https://doi.org/10.5194/asr-12-207-2015, 2015.
Bosilovich, M.: NASA's modern era retrospective-analysis for research and applications: Integrating Earth observations, Earthzine, http://www.earthzine.org/2008/09/26/nasas-modern-era-retrospective-analysis (last access: 17 March 2022), 2008.
Bosilovich, M., Lucchesi, R., and Suarez, M.: MERRA-2: File specification, NASA GMAO Office Note No. 9 (Version 1.1), NASA GMAO, GMAO NASA Goddard Space Flight Center, US, 73 pp., http://gmao.gsfc.nasa.gov/pubs/docs/Bosilovich785.pdf (last access: 17 March 2022), 2015.
Bromwich, D. H., Wilson, A. B., Bai, L. S., Moore, G. W., and Bauer, P.: A comparison of the regional Arctic System Reanalysis and the global ERA-Interim Reanalysis for the Arctic, Q. J. Roy. Meteor. Soc., 142, 644–658, https://doi.org/10.1002/qj.2527, 2016.
Bromwich, D. H., Wilson, A. B., Bai, L., Liu, Z., Barlage, M., Shih, C. F., Maldonado, S., Hines, K. M., Wang, S.-H., Woollen, J., Kuo, B., Lin, H.-C., Wee, T.-K., Serreze, M. C., and Walsh, J. E.: The Arctic system reanalysis, version 2, B. Am. Meteorol. Soc., 99, 805–828, https://doi.org/10.1175/BAMS-D-16-0215.1, 2018.
Cotton, J., Forsythe, M., Warrick, F., Salonen, K., Bormann, N., and Lean, K.: AMVs in the Tropics: use in NWP, data quality and impact, Joint ECMWF/ESA Workshop on “Tropical modeling, observations and data assimilation”, ECMWF, Reading, UK, 7–10 November 2016, https://www.ecmwf.int/node/16865 (last access: 17 March 2022), 2016.
Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V., Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., McNally, A. P., Monge-Sanz, B. M., Morcrette, J.-J., Park, B.-K., Peubey, C., de Rosnay, P., Tavolato, C., Thépaut, J.-N., and Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteor. Soc., 137, 553–597, https://doi.org/10.1002/qj.828, 2011.
Ebert, E. E.: Fuzzy verification of high-resolution gridded forecasts: a review and proposed framework, Meteorol. Appl., 15, 51–64, https://doi.org/10.1002/met.25, 2008.
Fukui, S., Iwasaki, T., Saito, K., Seko, H., and Kunii, M.: A feasibility study on the high-resolution regional reanalysis over Japan assimilating only conventional observations as an alternative to the dynamical downscaling, J. Meteorol. Soc. Jpn., 96, 565–585, https://doi.org/10.2151/jmsj.2018-056, 2018.
Gibson, J. K., Kållberg, P., Uppala, S., Nomura, A., Hernandez, A., and Serrano, E.: ERA Description, ECMWF Re-Analysis Project, Technical Report Series, 1, ECMWF, Reading, UK, 72 pp., https://www.ecmwf.int/en/elibrary/9584-era-description (last access: 17 March 2022), 1997.
Grell, G. A. and Freitas, S. R.: A scale and aerosol aware stochastic convective parameterization for weather and air quality modeling, Atmos. Chem. Phys., 14, 5233–5250, https://doi.org/10.5194/acp-14-5233-2014, 2014.
He, J., Zhang, F., Chen, X., Bao, X., Chen, D., Kim, H. M., Lai, H.-W., Leung, L. R., Ma, X., Meng, Z., Ou, T., Xiao, Z., Yang, E.-G., and Yang, K.: Development and evaluation of an ensemble-based data assimilation system for regional reanalysis over the Tibetan Plateau and surrounding regions, J. Adv. Model. Earth Syst., 11, 2503–2522, https://doi.org/10.1029/2019MS001665, 2019.
Hersbach, H., de Rosnay, P., Bell, B., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Alonso-Balmaseda, M., Balsamo, G., Bechtold, P., Berrisford, P., Bidlot, J., de Boisséson, E., Bonavita, M., Browne, P., Buizza, R., Dahlgren, P., Dee, D., Dragani, R., Diamantakis, M., Flemming, J., Forbes, R., Geer, A., Haiden, T., Hólm, E., Haimberger, L., Hogan, R., Horányi, A., Janisková, M., Laloyaux, P., Lopez, P., Muñoz-Sabater, J., Peubey, C., Radu, R., Richardson, D., Thépaut, J.-N., Vitart, F., Yang, X., Zsótér, E., and Zuo, H.: Operational global reanalysis: progress, future directions and synergies with NWP, ECMWF ERA report series, 27, https://www.ecmwf.int/en/elibrary/18765-operational-global-reanalysis-progress-future-directions-and-synergies-nwp (last access: 17 March 2022), 2018.
Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., Chiara, G. D., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020.
Hong, S.-Y., Noh, Y., and Dudhia, J.: A new vertical diffusion package with an explicit treatment of entrainment processes, Mon. Weather Rev., 134, 2318–2341, https://doi.org/10.1175/MWR3199.1, 2006.
Houtekamer, P. L. and Zhang, F.: Review of the ensemble Kalman filter for atmospheric data assimilation, Mon. Weather Rev., 144, 4489–4532, https://doi.org/10.1175/MWR-D-15-0440.1, 2016.
Iacono, M. J., Delamere, J. S., Mlawer, E. J., Shephard, M. W., Clough, S. A., and Collins, W. D.: Radiative forcing by long-lived greenhouse gases: Calculations with the AER radiative transfer models, J. Geophys. Res.-Atmos., 113, D13103, https://doi.org/10.1029/2008JD009944, 2008.
Jermey, P. M. and Renshaw, R. J.: Precipitation representation over a two-year period in regional reanalysis, Q. J. Roy. Meteor. Soc., 142, 1300–1310, https://doi.org/10.1002/qj.2733, 2016.
Jiménez, P. A., Dudhia, J., González-Rouco, J. F., Navarro, J., Montávez, J. P., and García-Bustamante, E.: A revised scheme for the WRF surface layer formulation, Mon. Weather Rev., 140, 898–918, https://doi.org/10.1175/MWR-D-11-00056.1, 2012.
Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D., Gandin, L., Iredell, M., Saha, S., White, G., Woolen, J., Zhu, Y., Chelliah, M., Ebisuzaki, W., Higgins, W., Janowiak, J., Mo, K. C., Ropelewski, C., Wang, J., Leetmaa, A., Reynolds, R., Jenne, R., and Joseph, D.: The NCEP/NCAR 40-year reanalysis project, B. Am. Meteorol. Soc., 77, 437–471, https://doi.org/10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2, 1996.
Kanamitsu, M., Ebisuzaki, W., Woollen, J., Yang, S.-K., Hnilo, J. J., Fiorino, M., and Potter, G. L.: NCEP–DOE AMIP-II Reanalysis (R-2), B. Am. Meteorol. Soc., 83, 1631–1643, https://doi.org/10.1175/BAMS-83-11-1631, 2002.
Kay, J. K., Kim, H. M., Park, Y.-Y., and Son, J.: Effect of doubling ensemble size on the performance of ensemble prediction in warm season using MOGREPS implemented in KMA, Adv. Atmos. Sci., 30, 1287–1302, https://doi.org/10.1007/s00376-012-2083-y, 2013.
Kayaba, N., Yamada, T., Hayashi, S., Onogi, K., Kobayashi, S., Yoshimoto, K., Kamiguchi, K., and Yamashita, K.: Dynamical regional downscaling using the JRA-55 reanalysis (DSJRA-55), Sola, 12, 1–5, https://doi.org/10.2151/sola.2016-001, 2016.
Keyser, D.: An Overview of Observational Data Processing at NCEP (with information on BUFR Format including “PrepBUFR” files), 6 August 2013, GSI tutorial, https://dtcenter.ucar.edu/com-GSI/users/docs/presentations/2013_tutorial/Tue_L1_Keyser_ObsProcessing.pdf (last access: 17 March 2022), 2013.
Kim, S. and Kim, H. M.: Effect of considering sub-grid scale uncertainties on the forecasts of a high-resolution limited area ensemble prediction system, Pure Appl. Geophys., 174, 2021–2037, https://doi.org/10.1007/s00024-017-1513-2, 2017.
Kim, S., Kim, H. M., Kay, J. K., and Lee, S.-W.: Development and Evaluation of High Resolution Limited Area Ensemble Prediction System in Korea Meteorological Administration, Atmosphere, 25, 67–83, https://doi.org/10.14191/Atmos.2015.25.1.067, 2015 (in Korean with English abstract).
Kistler, R., Collins, W., Saha, S., White, G., Woollen, J., Kalnay, E., Chelliah, M., Ebisuzaki, W., Kanamitsu, M., Kousky, V., Dool, H. V. D., Jenne, R., and Fiorino, M.: The NCEP–NCAR 50-Year Reanalysis: Monthly Means CD-ROM and Documentation, B. Am. Meteorol. Soc., 82, 247–267, https://doi.org/10.1175/1520-0477(2001)082<0247:TNNYRM>2.3.CO;2, 2001.
Kobayashi, S., Ota, Y., Harada, Y., Ebita, A., Moriya, M., Onoda, H., Onogi, K., Kamahori, H., Kobayashi, C., Endo, H., Miyaoka, K., and Takahashi, K.: The JRA-55 reanalysis: General specifications and basic characteristics, J. Meteorol. Soc. Jpn., 93, 5–48, https://doi.org/10.2151/jmsj.2015-001, 2015.
Mesinger, F., DiMego, G., Kalnay, E., Mitchell, K., Shafran, P. C., Ebisuzaki, W., Jović, D., Woollen, J., Rogers, E., Berbery, E. H., Ek, M. B., Fan, Y., Grumbine, R., Higgins, W., Li, H., Lin, Y., Manikin, G., Parrish, D., and Shi, W.: North American Regional Reanalysis, B. Am. Meteorol. Soc., 87, 343–360, https://doi.org/10.1175/BAMS-87-3-343, 2006.
Mittermaier, M. P.: A strategy for verifying near-convection-resolving model forecasts at observing sites, Weather Forecast., 29, 185–204, https://doi.org/10.1175/WAF-D-12-00075.1, 2014.
National Centers for Environmental Prediction/National Weather Service/NOAA/U.S. Department of Commerce: NCEP ADP Global Upper Air and Surface Weather Observations (PREPBUFR format), Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory, Boulder, CO [data set], https://doi.org/10.5065/Z83F-N512, 2008.
National Climatic Data Center/NESDIS/NOAA/U.S. Department of Commerce, Meteorology Department/Florida State University, Climate Analysis Section/Climate and Global Dynamics Division/National Center for Atmospheric Research/University Corporation for Atmospheric Research, and Harvard College Observatory/Harvard University: World Monthly Surface Station Climatology. Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory, Boulder, CO [data set], http://rda.ucar.edu/datasets/ds570.0/ (last access: 7 November 2019), 1981.
On, N., Kim, H. M., and Kim, S.: Effects of resolution, cumulus parameterization scheme, and probability forecasting on precipitation forecasts in a high-resolution limited-area ensemble prediction system, Asia-Pac. J. Atmos. Sci., 54, 623–637, https://doi.org/10.1007/s13143-018-0081-4, 2018.
Onogi, K., Tsutsui, J., Koide, H., Sakamoto, M., Kobayashi, S., Hatsushika, H., Matsumoto, T., Yamazaki, N., Kamahori, H., Takahashi, K., Kadokura, S., Wada, K., Kato, K., Oyama, R., Ose, T., Mannoji, N., and Taira, R.: The JRA-25 reanalysis, J. Meteorol. Soc. Jpn., 85, 369–432, https://doi.org/10.2151/jmsj.85.369, 2007.
Park, J. and Kim, H. M.: Design and evaluation of CO2 observation network to optimize surface CO2 fluxes in Asia using observation system simulation experiments, Atmos. Chem. Phys., 20, 5175–5195, https://doi.org/10.5194/acp-20-5175-2020, 2020.
Penny, S. G.: The hybrid local ensemble transform Kalman filter, Mon. Weather Rev., 142, 2139–2149, https://doi.org/10.1175/MWR-D-13-00131.1, 2014.
Penny, S. G., Behringer, D. W., Carton, J. A., and Kalnay, E.: A hybrid global ocean data assimilation system at NCEP, Mon. Weather Rev., 143, 4660–4677, https://doi.org/10.1175/MWR-D-14-00376.1, 2015.
Que, L. J., Que, W. L., and Feng, J. M.: Intercomparison of different physics schemes in the WRF model over the Asian summer monsoon region, Atmos. Ocean. Sci. Lett., 9, 169–177, https://doi.org/10.1080/16742834.2016.1158618, 2016.
Rienecker, M. M., Suarez, M. J., Gelaro, R., Todling, R., Bacmeister, J., Liu, E., Bosilovich, M. G., Schubert, S. D., Takacs, L., Kim, G.-K., Bloom, S., Chen, J., Collins, D., Conaty, A., da Silva, A., Gu, W., Joiner, J., Koster, R. D., Lucchesi, R., Molod, A., Owens, T., Pawson, S., Pegion, P., Redder, C. R., Reichle, R., Robertson, F. R., Ruddick, A. G., Sienkiewicz, M., and Woollen, J.: MERRA: NASA's Modern-Era Retrospective Analysis for Research and Applications, J. Climate, 24, 3624–3648, https://doi.org/10.1175/JCLI-D-11-00015.1, 2011.
Saha, S., Moorthi, S., Pan, H.-L., Wu, X., Wang, J., Nadiga, S., Tripp, P., Kistler, R., Woollen, J., Behringer, D., Liu, H., Stokes, D., Grumbine, R., Gayno, G., Wang, J., Hou, Y.-T., Chuang, H.-Y., Juang, H.-M. H., Sela, J., Iredell, M., Treadon, R., Kleist, D., Delst, P. V., Keyser, D., Derber, J., Ek, M., Meng, J., Wei, H., Yang, R., Lord, S., Dool, H. V. D., Kumar, A., Wang, W., Long, C., Chelliah, M., Xue, Y., Huang, B., Schemm, J.-K., Ebisuzaki, W., Lin, R., Xie, P., Chen, M., Zhou, S., Higgins, W., Zou, C.-Z., Liu, Q., Chen, Y., Han, Y., Cucurull, L., Reynolds, R. W., Rutledge, G., and Goldberg, M.: The NCEP Climate Forecast System Reanalysis, B. Am. Meteorol. Soc., 91, 1015–1057, https://doi.org/10.1175/2010BAMS3001.1, 2010.
Schubert, S., Pfaendtner, J., and Rood, R.: An assimilated dataset for earth science applications, B. Am. Meteorol. Soc., 74, 2331–2342, https://doi.org/10.1175/1520-0477(1993)074<2331:AADFES>2.0.CO;2, 1993.
Shiferaw, A., Tadesse, T., Rowe, C., and Oglesby, R.: Precipitation extremes in dynamically downscaled climate scenarios over the greater horn of Africa, Atmosphere, 9, 112, https://doi.org/10.3390/atmos9030112, 2018.
Shin, I.-C., Kim, J.-G., Chung, C.-Y., Baek, S.-K., and Lee, J.-R.: The impact of the COMS data on the KMA NWP System, in: 14th JCSDA Technical Review Meeting & Science Workshop on Satellite Data Assimilation, Moss Landing, CA, U.S., 31 May–2 June 2016, https://dokumen.tips/documents/the-impact-of-the-coms-data-on-the-kma-nwp-the-impact-of-the-coms-data-on-the.html (last access: 17 March 2022), 2016.
Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Barker, D. M., Duda, M. G., Huang, X.-Y., Wang, W., and Powers, J. G.: A description of the advanced research WRF version 3, NCAR Tech. Note, NCAR/TN-475+STR, https://opensky.ucar.edu/islandora/object/technotes:500/datastream/PDF/view (last access: 17 March 2022), 2008.
Tewari, M., Chen, F., Wang, W., Dudhia, J., LeMone, M. A., Mitchell, K., Ek, M., Gayno, G., Wegiel, J., and Cuenca, R. H.: Implementation and verification of the unified NOAH land surface model in the WRF model, in: 20th conference on weather analysis and forecasting/16th conference on numerical weather prediction, American Meteorological Society, Seattle, WA, U.S., 14 January 2004, 14.2A, https://ams.confex.com/ams/84Annual/techprogram/paper_69061.htm (last access: 17 March 2022), 2004.
Theis, S. E., Hense, A., and Damrath, U.: Probabilistic precipitation forecasts from a deterministic model: A pragmatic approach, Meteorol. Appl., 12, 257–268, https://doi.org/10.1017/S1350482705001763, 2005.
Thompson, G., Field, P. R., Rasmussen, R. M., and Hall, W. D.: Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. Part II: Implementation of a new snow parameterization, Mon. Weather Rev., 136, 5095–5115, https://doi.org/10.1175/2008MWR2387.1, 2008.
Uppala, S. M., Kållberg, P. W., Simmons, A. J., Andrae, U., Bechtold, V. D. C., Fiorino, M., Gibson, J. K., Haseler, J., Hernandez, A., Kelly, G. A., Li, X., Onogi, K., Saarinen, S., Sokka, N., Allan, R. P., Andersson, E., Arpe, K., Balmaseda, M. A., Beljaars, A. C. M., Berg, L. V. D., Bidlot, J., Bormann, N., Caires, S., Chevallier, F., Dethof, A., Dragosavac, M., Fisher, M., Fuentes, M., Hagemann, S., Hólm, E., Hoskins, B. J., Isaksen, L., Janssen, P. A. E. M., Jenne, R., Mcnally, A. P., Mahfouf, J.-F., Morcrette, J.-J., Rayner, N. A., Saunders, R. W., Simon, P., Sterl, A., Trenberth, K. E., Untch, A., Vasiljevic, D., Viterbo, P., and Woollen, J.: The ERA-40 re-analysis, Q. J. Roy. Meteor. Soc., 131, 2961–3012, https://doi.org/10.1256/qj.04.176, 2005.
Warrick, F.: Options for filling the LEO-GEO AMV Coverage Gap, NWP SAF Tech. Doc., NWP SAF-MO-TR-030, 21 pp., https://nwpsaf.eu/monitoring/amv/investigations/gapfill_amvs/nwpsaf_mo_tr_030.pdf (last access: 17 March 2022), 2015.
Wilks, D. S.: Statistical methods in the atmospheric sciences, 2nd edn., Academic Press, 627 pp., https://sunandclimate.files.wordpress.com/2009/05/statistical-methods-in-the-atmospheric-sciences-0127519661.pdf (last access: 17 March 2022), 2006.
Wilson, L.: Verification of Categorical Forecasts – The Contingency Table, 7th International Verification Methods Workshop, Berlin, Germany, 3–11 May 2017, https://www.7thverificationworkshop.de/Presentation/tutorial_verification_of_categorial_forecasts.pdf (last access: 24 April 2022), 2017.
Yang, E.-G. and Kim, H. M.: Evaluation of a regional reanalysis and ERA-Interim over East Asia using in situ observations during 2013–14, J. Appl. Meteorol. Climatol., 56, 2821–2844, https://doi.org/10.1175/JAMC-D-16-0227.1, 2017.
Yang, E.-G. and Kim, H. M.: Evaluation of Short-Range Precipitation Reforecasts from East Asia Regional Reanalysis, J. Hydrometeorol., 20, 319–337, https://doi.org/10.1175/JHM-D-18-0068.1, 2019.
Yang, E.-G. and Kim, H. M.: A comparison of variational, ensemble-based, and hybrid data assimilation methods over East Asia for two one-month periods, Atmos. Res., 249, 105257, https://doi.org/10.1016/j.atmosres.2020.105257, 2021a.
Yang, E.-G. and Kim, H. M.: East Asia Regional Reanalysis 6 hourly data on pressure levels from 2010 to 2019, V1, Harvard Dataverse [data set], https://doi.org/10.7910/DVN/7P8MZT, 2021b.
Yang, E.-G. and Kim, H. M.: East Asia Regional Reanalysis 6 hourly precipitation data from 2010 to 2019, V1, Harvard Dataverse [data set], https://doi.org/10.7910/DVN/Q07VRC, 2021c.
Yoo, C. and Cho, E.: Comparison of GCM precipitation predictions with their RMSEs and pattern correlation coefficients, Water, 10, 28, https://doi.org/10.3390/w10010028, 2018.
Zhang, F., Zhang, M., and Poterjoy, J.: E3DVar: Coupling an ensemble Kalman filter with three-dimensional variational data assimilation in a limited-area weather prediction model and comparison to E4DVar, Mon. Weather Rev., 141, 900–917, https://doi.org/10.1175/MWR-D-12-00075.1, 2013.
Zhang, Q., Pan, Y., Wang, S., Xu, J., and Tang, J.: High-resolution regional reanalysis in China: Evaluation of 1 year period experiments, J. Geophys. Res.-Atmos., 122, 10801–10819, https://doi.org/10.1002/2017JD027476, 2017.