BAYWRF: a convection-resolving, present-day climatological atmospheric dataset for Bavaria

Climate impact assessments require information about climate change at regional and ideally local scales. In dendroecological studies, this information has traditionally been obtained using statistical methods, which preclude the 10 linkage of local climate changes to large-scale drivers in a process-based way. As part of recent efforts to investigate the impact of climate change on forest ecosystems in Bavaria, Germany, within the BayTreeNet project, we developed a highresolution atmospheric modelling dataset, BAYWRF, for the region of Bavaria over the thirty-year period of September 1987 to August 2018. The atmospheric model employed in this study, WRF, was configured with two nested domains of 7.5and 1.5-km grid spacing, centred over Bavaria and forced at the outer lateral boundaries by ERA5 reanalysis data. Based on 15 a shorter evaluation period of September 2017 to August 2018, we evaluate two aspects of the simulations: (i) we assess model biases compared with an extensive network of observational data at both two-hourly and daily mean temporal resolutions, and (ii) we investigate the influence of using grid analysis nudging. The model represents variability in nearsurface meteorological conditions well, with a clear improvement when nudging is used, although there are cold and warm biases in winter and summer, respectively. We also present a brief overview of the full dataset, which will provide a unique 20 and valuable tool for investigating climate change in Bavaria with high interdisciplinary relevance. Data from the finest resolution WRF domain are available for download at daily temporal resolution from a public repository at the Open Science Framework (Collier, 2020; https://www.doi.org/10.17605/OSF.IO/AQ58B).

location (e.g., Mölg et al., 2014). In order to contextualize local changes, there is a need to link local climate to the large-30 scale climate, ideally in a process-based way.
In dendroclimatological studies, the traditional approach is to compute a calibration function between local or regional treering parameters and climatic variables. Typically, such a statistical relationship would try to utilize local station data (which are generally sparse), gridded observations (which tend to be coarse resolution), or indices of large-scale climate dynamics 35 (which describe coupled atmosphere-ocean modes) as the climatic influence (e.g., Hochreuther et al., 2016). Besides known problems like stationarity (e.g., Frías et al., 2006), statistical approaches also limit the possibilities to explain the influences at the various scales on a process-resolving level. Dynamical downscaling with a full numerical atmospheric model provides a physical answer (Giorgi and Mearns, 1991), yet the disadvantage is the high computational cost. Hence, dynamical downscaling at near-kilometer resolution has traditionally been performed on a case-study basis for weather events (e.g., 40 Gohm et al., 2008). Multi-decadal simulations, on the other hand, were typically limited to resolutions of tens of kilometers (e.g., Di Luca et al., 2016). With the progress of computational resources, dynamical downscaling is becoming a candidate for climate impact studies that require local-scale information, and the first decadal simulations at ~1-km resolution are now available (e.g., Collier et al., 2018). From the resultant model output, impact studies could utilize information about local meteorological conditions at high-spatial and high-temporal resolution, and over long, climatologically relevant temporal 45 periods. Moreover, the physically consistent output would enable to generate the said process understanding of influences across the various climatic scales.
The management of forests is a classical impact study where adaptation and mitigation measures meet the heterogeneous effects of climate change at local scales (e.g., Lindner et al., 2014). With this background, the project BayTreeNet was 50 started recently under the umbrella of the interdisciplinary climatological research network Bayklif (https://www.bayklif.de; last accessed 1 March 2020), and aims to investigate the response of forest ecosystems to current and future climate dynamics across different growth areas in Bavaria, Germany. The project comprises a network of 10 measurement sites where meteorological and dendroecological data will be monitored and used both for research and for public and educational outreach. The locations were selected carefully to account for ecological and elevational variety in the study region, and the 55 sites are currently in the process of being installed. High-temporal (approximately daily) and high-spatial resolution data is a key component of dendroecological impact studies, since the physiological behavior of trees, their structural properties and functional wood anatomy, as well as other important parameters such as wood density and mortality risk are not only influenced by seasonal averages, but also by short-term extreme events and weather anomalies (e.g., Bräuning et al., 2016). 60 Previous regional climate simulations including Bavaria over continuous multi-decadal periods were performed with model resolutions as high as 5-7 km and up to the year 2009 (e.g., Berg et al., 2013;Warscher et al., 2019). However, to the best of our knowledge, such datasets at the kilometer scale and up to the near present do not yet exist, despite previous research https: //doi.org/10.5194/essd-2020-52  highlighting the importance of convection-permitting resolution in this region (Fosser et al., 2014). We address this data gap by performing simulations with an atmospheric model, configured with convection-permitting spatial resolution in a nested 65 domain over Bavaria, for the recent climatological period of 1987 to 2018. These data were generated as part of the aforementioned BayTreeNet project in order to assess the dendroecological consequences of local climate change on forests in the study region. These data will also find multidisciplinary interest among researchers assessing ecological and human dependencies on the climate for scientific and practical questions.

Atmospheric model
The atmospheric simulations were performed using the advanced research version of the Weather Research & Forecasting (WRF) model v. 4.1 (Skamarock and Klemp, 2008) configured with two one-way-nested domains of 7.5-and 1.5-km grid spacing situated over Bavaria (Fig. 1), hereafter referred to as D1 and D2. Terrain data were taken from NASA Shuttle Radar Topographic Mission data re-sampled to 1-km and 500-m grids (Jarvis et al., 2008;https://cgiarcsi.community/data/srtm-75 90m-digital-elevation-database-v4-1; last accessed 24 May 2020) for D1 and D2, respectively, while land-use data was updated based on the European Space Agency Climate Change Initiative Land Cover data at 300-m spatial resolution  Table 1. We note that no additional urban physics were enabled beyond the default parameterization used 80 by the Noah family of land surface models (Liu et al., 2006) and land-use sub-tiling was not enabled.
Forcing data at the lateral boundary of D1 and bottom boundaries of both domains was taken from the ERA5 reanalysis (Copernicus Climate Change Service (C3S), 2017) at three-hourly temporal resolution. The 30-year simulation was divided into 30 annual simulations that were run continuously from 15 August of year n-1 to 31 August of year n. The first 16 days 85 of each simulation were discarded as spin-up time, retaining data from 1 September of year n-1 onwards. Atmospheric carbon dioxide (CO2) was updated in WRF for each simulation year using annually and globally averaged concentrations at the surface from the National Oceanic and Atmospheric Administration Earth System Research Laboratory (Tans and Keeling, 2019). Each simulation employed the CO2 concentration of year n, which ranged from 351 to 407 ppm between 1988 and 2018. All other parameters and bottom boundary conditions (e.g., vegetation and land use) were held constant for 90 all simulations. Therefore, they do not capture the impact of known land-use changes over the study period (e.g., Fuchs et al., 2013).
Each run required 12 days of wall-time with 320 processors on the Meggie compute cluster at the Erlangen Regional Computing Center, for a total of 2.86 million core hours. The model was compiled using intel 17.0 compilers and run using 95 https://doi.org/10.5194/essd-2020-52 distributed-memory parallelization. Model output was written at two-hourly intervals, amounting to more than 55 TB of data, in addition to ~30 TB of pre-processing and input files.

Model Evaluation
For detailed model evaluation, we selected the period of 00 UTC 1 September 2017 to 00 UTC 1 September 2018, as data 100 availability is highest closest to present day and the summer of 2018 contained a record heatwave with drought conditions (Beyer, 2018). Neither extensive physics optimization nor a longer evaluation period was possible due to the computational expense of the simulations. For the evaluation period, we compared two test simulations (see Sect. 2.3) with data from the German Weather Service (DWD) Climate Data Center for all stations in Bavaria with hourly temporal resolution available, which provide good spatial coverage of our study area (Table 2; Fig. 2). We compared the following near-surface 105 atmospheric variables: air temperature and relative humidity at 2 m (T and RH), zonal and meridional wind components at 10 m (U and V), surface pressure (PS), and precipitation (PR). In our comparison, we excluded measurement sites where the observed terrain height differed from the modelled value by more than 100 m (similar to e.g., Vionnet et al., 2019), corresponding to four sites in total for all variables except for PS (three) and PR (nine). After this exclusion, the average difference between modelled and observed terrain height at all stations is within ± 8 m for each dataset. We also excluded 110 any days with missing observational data when computing daily statistics. We did not evaluate radiation variables, as only sunshine hours are available from the DWD in sufficiently large sample sizes. However, for understanding temperature biases in WRF during summer 2018, we used incoming shortwave radiation from the DWD Climate Data Center dataset entitled "Hourly station observations of solar incoming (total/diffuse) and longwave downward radiation for Germany" (Table 2). In total, there were four sites with both incoming shortwave (SW) and T data available in Bavaria between 1 June 115 and 31 August 2018 whose elevation was represented within ±100 m in D2: Nürnberg (id 3668), Weihenstephan-Dürnast (5404), Würzburg (5705), and Fürstenzell (5856).
For statistical analysis, we computed the mean deviation (MD), mean absolute deviation (MAD), and the coefficient of determination (R 2 ) between station data and data from the closest grid point in D2 without spatial interpolation at two-hourly 120 and daily temporal frequency. The MD, also referred to here as the model bias, and the MAD were computed from observation minus model. For precipitation, only daily totals were evaluated, and the MD and MAD were computed considering only days with non-zero observed precipitation.
Finally, we also compared night-time land surface temperature (LST) from the MODIS MYD11A1 dataset (Table 2)  comparison with MODIS. In our comparison, we excluded nights when MODIS had more than 50% missing data over D2, leaving a sample size of 52. 130

Forcing Strategy
For the evaluation period, we compared two simulations with different forcing approaches, one excluding and one including grid analysis nudging to constrain drift in the large-scale circulation (e.g., Bowden et al., 2013). We refer to these simulations as WRF_NO_NUDGE and WRF_NUDGE in Section 3.1, respectively. For the WRF_NUDGE simulation, analysis nudging was applied in D1 outside of the planetary boundary layer and above the lowest 10 model levels using the 135 default strength (3.0 x 10 -4 ) for temperature and winds and reduced strength (5.0 x 10 -5 ) for the water vapor mixing ratio (e.g., Otte et al., 2012), consistent with a previous decadal application of WRF (Collier et al., 2018). Given the computational expense of each annual simulation, we did not attempt to optimize the nudging coefficients for our study area and instead evaluate simply whether nudging in this form improves the simulated atmospheric variables or not.

Climatological Analysis 140
To briefly evaluate the full climatological simulation, we compared simulated and observed monthly mean T from the DWD dataset 'MO_TT_MN004' (Table 2), with a sample size of 26 stations that remained after filtering for height differences exceeding 100 m, the presence of missing data, and stations located in grid cells classified as urban (see Sect. 3.1). For the distributed trend analysis, we did not apply a field significance test (e.g., Wilks, 2016) due to the small sample size. Future users should rigorously evaluate biases for the variables, time periods, and resolutions relevant to their particular 145 applications.
We note that unphysically large sub-surface temperatures were simulated at a number of glacierized grid points, primarily during the months of July to September. Considering all of D2, the daily average number of affected grid cells was 24,

Model Evaluation
Averaged over the evaluation year, both WRF simulations capture the magnitude and variability of sub-diurnal near-surface 160 meteorological conditions at most sites well ( Fig. 3; Table 3). The interquartile range (IQR; range between upper and lower quartile) of MDs is one order of magnitude smaller than the observed standard deviation for all variables. As expected, variability is best captured for T and PS, with R 2 values that uniformly exceed 0.87 and 0.96, respectively. Those of RH have a larger range but a lower quartile above ~0.55. Compared with these variables, the model shows less skill in simulating subdiurnal variability in winds, with lower quartiles of R 2 for U and V of approximately 0.39 and 0.27, respectively. Shifting to 165 daily timescales, both simulations represent variability in daily total PR surprisingly well, with IQRs of MDs below ~1.25 mm and lower quartiles of R 2 exceeding 0.18 and 0.33, depending on the simulation. Previous studies have reported Root Mean Square Deviations (RMSD). For direct comparison, the mean RMSD in WRF_NUDGE for two-hourly T and RH is 2.67 K and 13.7%, respectively, and for daily total precipitation is 5.27 mm. These values are similar to but lower than previously reported biases in WRF at 5-km grid spacing over Bavaria for the period of 2001-2009(Warscher et al., 2019. 170 Examination of model biases on a monthly basis reveals further insights into the model performance (Fig. 4). The amplitude of the annual cycle is overpredicted in WRF, indicating that the good average agreement in T results from compensating biases: there is a cold bias in WRF in winter, a well-known issue with the model over snow-covered surfaces (e.g., Tomasi et al., 2017), and a warm bias in summer (Fig. 4a). The latter bias results in an underprediction of RH during this season (Fig.  175 4b), suggesting that WRF represents absolute humidity more accurately. The summer temperature bias is also more sustained than the winter one, resulting in the long tails (heads) in the distribution of MDs of T (RH) in Fig. 3. There is also a general underprediction of near-surface winds from fall to early winter, as exemplified by the results for U in Fig. 4c   also result from an overestimation of heat storage when a mosaic approach is not used, and therefore the entire grid cell is treated as urban (Daniel Fenner, personal communication). The potential role of the land-surface specification or properties is reinforced by the comparison with MODIS data (Fig. 7), which shows the largest warm biases over grid cells classified as urban or croplands while biases are smaller in forested areas. There is also a cold bias along the foothills and at higher elevations in the Alps. The biases are slightly smaller in WRF_NUDGE than in WRF_NO_NUDGE, consistent with the 200 station-based assessment.
In addition to factors internal to WRF, we note that the driving reanalysis data may also contribute to the warm bias, at least at some locations. From the available observations, 60 stations have both valid T data between June and August 2018 and a modelled elevation in ERA5 that is within ±100 m of reality. Averaged over the summer months and all stations, ERA5 has 205 a mean warm bias of 0.37°C. At 25 of the sites, a warm bias exceeding 0.5°C is present, with an average value over these sites of 0.92°C.

Impact of Grid Analysis Nudging
The inclusion of grid analysis nudging leads to a small but nearly uniform improvement in agreement between observed and simulated variables. The distribution of MDs is closer to zero for all variables except U and PS, while those of MADs are 210 closer for all variables (cf. Fig. 3 and Table 3). R 2 values are also uniformly higher when nudging is used, and the lowest lower-quartile value is 0.3 in WRF_NUDGE compared with only 0.18 in WRF_NO_NUDGE. Nudging produces a particularly noticeable improvement in simulated precipitation, halving the MD and nearly doubling the R 2 values (cf. Fig. 3,   Fig. 4 and Table 3). Its usage also reduces the magnitude of the seasonal temperature biases and the number of extreme occurrences of the warm bias in summer (cf. Fig. 4 and Fig. 6). Considering daily timescales (the temporal resolution of data 215 available in BAYWRF), the agreement of WRF_NUDGE with the observations is similar or even improved (Table 4): the mean MD is largely unaffected, but the average MAD decreases and average R 2 increases. Based on these improvements, grid analysis nudging was adopted for the climatological simulations. Here we note that in addition to the potential factors contributing to temperature biases discussed in Section 3.1, evaluation 235 of the climatological simulation is also affected by discontinuities in station location. One example is Nürnberg (id 3668), which moved on 4 December 1995 from (49.4947ºN, 11.0806ºE) to (49.5030ºN, 11.0549ºE). The older station position is shifted one grid cell to the south and one grid cell to the west compared with its current location, corresponding to a shift in land use from urban (old position) to grasslands (new). Any discontinuities in location and underlying surface type are not captured since the most recent station positions are used for extracting T data from D2. 240

Climatological Simulations
Spatially distributed trends in T are strongest and significant over the largest in area during JJA ( Fig. 9a; other seasons not shown), ranging from ~0.3 to 0.7 K/decade over Bavaria. Trends in precipitable water are likewise uniformly positive over the study region, ranging from ~0.2 to 0.3 mm/decade. The trends of both fields also have a positive gradient between southwestern and northeastern Bavaria. These results agree qualitatively and quantitively with previous studies (e.g., 245 Alshawaf et al., 2017). fields, while perturbation potential temperature (T) was converted to atmospheric temperature. Subsets of D1 or sub-diurnal data can be made available upon request. 255

Conclusions
We presented a climatological, convection-permitting simulation with the atmospheric model WRF over Bavaria for the period of September 1987 to August 2018. For the evaluation period of September 2017 to August 2018, we compared the simulations with meteorological measurements across Bavaria and evaluated the impact of using grid-analysis nudging. We found that the model reproduced variability in near-surface meteorological conditions well, although seasonal temperature 260 biases were present. Grid analysis nudging decreased the mean deviations and increased the correlations between simulations and observations at the majority of sites for nearly all evaluated atmospheric variables, with a particularly noticeable improvement for simulated daily precipitation. BAYWRF provides a useful database for linking large-scale climate, as represented by the ERA5 reanalysis, to mesoscale climate over Germany, to local conditions in Bavaria, in a physically based way. The data are intended for dendroecological research applications but would also provide a valuable tool for 265 investigations of the climate dependence of economic, societal, ecological, and agricultural processes in Bavaria.

Author contributions
EC performed the simulations, analyzed the data and wrote the manuscript. TM developed the study concept and wrote the manuscript.

Competing interests 270
The authors declare that they have no conflict of interest.

This project is sponsored by the Bavarian State Ministry of Science and the Arts in the context of the Bavarian Climate
Research Network (bayklif). We gratefully acknowledge the compute resources and support provided by the Erlangen Regional Computing Center (RRZE) and we thank Thomas Zeiser for his assistance with the timely completion of the 275 simulations.

References
Alshawaf, F., Balidakis, K., Dick, G., Heise, S. and Wickert, J.: Estimating trends in atmospheric water vapor and https://doi.org/10.5194/essd-2020-52    instantaneous values, while those for precipitation were computed using daily totals. The shape of the plots shows the distribution of data over their range of values, white lines delineate 25 th , 50 th and 75 th percentiles, and a black dot indicates the mean. The observed standard deviation (σobs) for each variable is provided in the left column.