A Six-year long (2013–2018) High-resolution Air Quality Reanalysis Dataset over China base on the assimilation of surface observations from CNEMC

Air pollution in China has changed substantially since 2013, and the effects such changes bring to the human health and 25 environment has been an increasingly hot topic in many scientific fields. Such studies, however, are often hindered by a lack of long-term air quality dataset in China of high accuracy and spatiotemporal resolutions. In this study, a six-year long highresolution Chinese air quality reanalysis datasets (CAQRA) has been developed by assimilating over 1000 surface air quality monitoring sites from China National Environmental Monitoring Centre (CNEMC) using the ensemble Kalman filter (EnKF) and the Nested Air Quality Prediction Modeling System (NAQPMS). Surface fields of six conventional air pollutants in China, 30 namely PM2.5, PM10, SO2, NO2, CO and O3 for period 2013–2018, are provided at high spatial (15km×15km) and temporal (1 hour) resolutions. This paper aims to document this dataset by providing the detailed descriptions of the assimilation system and presenting the first validation results for the reanalysis dataset. A five-fold cross validation (CV) method was used to assess the quality of CAQRA. The CV results show that the CAQRA has excellent performances in reproducing the magnitude and variability of the surface air pollutants in China (CV R2 = 0.52–0.81, CV RMSE = 0.54 mg/m! for CO and 16.4–39.3 35 https://doi.org/10.5194/essd-2020-100 O pe n A cc es s Earth System Science Data D icu ssio n s Preprint. Discussion started: 2 June 2020 c © Author(s) 2020. CC BY 4.0 License.

36% for PM10 and 35% for PM2.5. Concurrently, the air quality in China has changed dramatically over the past six years (Silver et al., 2018;Zheng et al., 2017). Such large changes in the Chinese air quality and the effects they brought to the human health and environment has become an increasingly hot topic in many scientific fields(e.g. Xue et al., 2019;Zheng et al., 2017).
Although the ground-based observations could provide valuable information on the spatial and temporal distributions of the air pollutants in China, the air quality monitoring sites are sparse with low spatial resolution and are unevenly distributed. As 5 a result, such studies are often hindered by a lack of long-term air quality datasets in China of high accuracy and spatiotemporal resolutions for these years, which is urgently needed to provide a scientific basis for these studies.
Satellite observations have advantages of high spatial coverage and are widely used in the monitor of air pollution over wide domains. A series of satellite retrievals related to the air quality have been developed over the past two decades, such as the NO2, SO2 and O3 observations from OMI (Ozone Monitoring Instrument; Levelt et al., 2006), CO observations from 10 MOPITT (Measurement of Pollution in the Troposphere; Deeter et al., 2003) and AOD observations from MODIS (Moderate Resolution Imaging Spectroradiometer; Barnes et al., 1998). However, satellite measurements can only provide the column information on the air pollutants. Techniques that relate the surface concentrations to the satellite column measurements are thus required to estimate the surface concentrations of air pollutants. For example, various methods have been developed to estimate the surface PM2.5 concentrations based on the satellite-derived AOD, including chemical transport models( van 15 Donkelaar et al., 2016;van Donkelaar et al., 2010), advanced statistical methods (Ma et al., 2014;Ma et al., 2016;Xue et al., 2019;Zou et al., 2017) and semi-empirical models (Lin et al., 2015;Lin et al., 2018), which have been proven to be an effective way to monitor the wide-coverage PM2.5 distributions in a good accuracy (Chu et al., 2016;Shin et al., 2019). However, challenges still remain in the satellite-estimated concentrations due to the missing values related to cloud contamination, uncertainties in satellite measurement and retrieval algorithms, and the difficulties in modeling the complex relationship 20 between the surface concentrations and satellite measurements (Shin et al., 2019;van Donkelaar et al., 2016;Xue et al., 2019).
In addition, most of previous satellite estimates of surface concentrations were limited to specific regions and time period and has a daily or even longer temporal resolutions due to the issues with data availability (e.g. incomplete coverage of AODbased estimates), which limits their usage in the studies at finer scale, such as the assessment of acute health effects of air quality. To our best knowledge, a nationwide long-term estimate of surface concentrations of all conventional air pollutants in 25 China at the hourly scale has still not been reported by previous satellite estimates.
A long-term air quality reanalysis of criteria air pollutants would provide constrained estimates of concentrations at all locations and times, which optimally combines the accuracy of observations and the physical information and spatial continuity of the chemistry transport models (CTMs) through advanced data assimilation techniques. Reanalysis is a uniform, continuous and state-of-science best-estimate data product, which has been used by a vast number of research communities. For example, 30 several long-term meteorology reanalysis has been developed by the weather centres from different regions/countries, such as ERA-Interim reanalysis by European Centre for Medium Range Weather Forecasts (ECMWF; Dee et al., 2011), NCEP/NCAR Reanalysis by National Centers for Environmental Protection (NCEP; Saha et al., 2010), MERRA-2 by NASA Global Modeling and Assimilation Office (NASA-GMAO; Rienecker et al., 2011), JRA-55 by Japan Meteorological Agency https://doi.org /10.5194/essd-2020-100 Open Access Earth System Science Data Discussions Preprint. Discussion started: 2 June 2020 c Author(s) 2020. CC BY 4.0 License. (Kobayashi et al., 2015) and CRA-40 by China Meteorological Administration. The use of data assimilation in reanalysis of atmospheric chemistry is more recent, and some reanalysis datasets for atmospheric compositions have been produced over the past decades, for example the MACC, CIRA, CAMS reanalysis by ECWMF (Flemming et al., 2017;Inness et al., 2019;Inness et al., 2013), MERRA-2 aerosol reanalysis by NASA-GMAO (Randles et al., 2017), tropospheric chemistry reanalysis (TCR) for years 2005-2012 by Miyazaki et al., 2015 and its latest version TCR-2 (Miyazaki et al., 2020), the global reanalysis 5 of carbon monoxide by Gaubert, B., 2016, multi-sensor reanalysis of total ozone for years 1970-2012 by van der A et al., 2015 and Japanese Reanalysis for Aerosol (JRAero) for years 2011-2015 by Yumimoto et al., 2017. These reanalysis datasets have promoted our understanding of the atmospheric compositions and are also helpful for the air quality researches. However, these are all global datasets having coarse horizonal resolutions (> 50km), which may be insufficient for capturing the high spatial variability of the air pollutants at the regional scale. In addition, some of these reanalysis datasets only provided the air 10 quality data priori to year 2012 and only focused on specific species, for example the aerosol and ozone reanalysis. To our best knowledge, there is still no high-resolution air quality reanalysis dataset in China for recent years when China's air quality changed dramatically.
In view of these discrepancies, in this study we will develop a high-resolution regional air quality reanalysis dataset over China for years 2013-2018 (will be extended in the future by adding one year each year) by assimilating over 1000 surface 15 observation sites from China National Environmental Monitoring Centre (CNEMC) in a post processing mode using our own developed chemical data assimilation system (ChemDAS). The developed reanalysis dataset will help fill the gaps in the high resolution air quality dataset in China by providing the surface concentration fields of all six conventional air pollutants in China at high spatial (15km×15km) and temporal (hourly) resolutions. This dataset would be the first high-resolution air quality reanalysis dataset in China that can simultaneously provide the surface concentrations of six conventional air pollutants 20 in China, which should be of great value in (1) retrospective analysis of air quality in China, (2) assessment of health and environmental impacts of air pollution in fine scales; (3) model evaluation and satellite calibration, and (4) provide basic training dataset for statistical or AI (Artificial intelligence) based forecast.

Description of chemical data assimilation system (ChemDAS)
The Chinese air quality reanalysis dataset was produced by the chemical data assimilation system developed by the 25 Institute of Atmospheric Physics, Chinese Academy of Sciences (Tang et al., 2011). This system consists of (i) a threedimensional chemical transport model (CTM) called Nested Air Quality Prediction Modeling System (NAQPMS) developed by Wang et al., 2000, (ii) an ensemble Kalman filter (EnKF) assimilation algorithm, and (iii) surface observations from CNEMC with an automatic outlier detection method . We used an offline analysis scheme in this study since there is no previous experiences for the online run of chemical data assimilation system under such high horizontal resolution. 30 The lessons learned from the offline analysis would feed into the future implementation of online analysis. In the offline analysis scheme, a free run of ensemble simulation was first conducted, and then the observations were assimilated using the https://doi.org/10.5194/essd-2020-100 EnKF and the results of ensemble simulation. The similar offline analysis scheme has also been used in previous reanalysis studies, such as Candiani et al., 2013 andKumar et al., 2012. Detailed descriptions of the ensemble simulation, observations and the data assimilation algorithm used in this study are presented below.

Air Pollution Prediction model
Reanalysis focuses on the retrospective analysis over long time periods. It uses the same air pollution model used to 5 produce forecasts, but the meteorology fields used to drive the CTM are more accurate as they use simulated fields as opposed to forecasted fields used to forecast air quality. In addition, reanalysis using a more comprehensive set of observations to provide constraints. The same NAQPMS model used as an operational forecast model, which has been used in previous assimilation studies (Tang et al., 2011;Tang et al., 2013), has been used for the reanalysis in this study. The model is driven by the hourly meteorological fields produced by the Weather Research and Forecasting Model (WRF; Skamarock, 2008). The 10 gas phase chemistry is simulated with the Carbon-Bond Mechanism Z (CBM-Z) developed by Zaveri and Peters, 1999. Aqueous-phase chemistry and wet deposition are simulated based on the Regional Acid Deposition Model (RADM) mechanism in the Community Multi-scale Air Quality (CMAQ) version 4.6. For aerosol processes, the thermodynamic model ISORROPIA 1.7 (Nenes et al., 1998) is used for the simulations of inorganic atmospheric aerosols. Six secondary organic aerosols (SOA) were explicitly treated in NAQPMS based on Li et al., 2011. To simulate the interactions between the particles 15 and gases, 28 heterogeneous reactions on sulfate, soot, dust and sea salt particles were included based on the previous studies . Size-resolved mineral dust emission is calculated online as a function of relative humidity, friction velocities, mineral particle size distribution and surface roughness . Sea-salt emission is calculated using a scheme of Athanasopoulou et al., 2008. Dry deposition of gases and aerosols is modelled based on the scheme of Wesely, 1989, and the advection is simulated with an accurate mass-conservative algorithm from Walcek and Aleksic, 1998. 20 Figure 1 shows the modeling domain of this study, which covers the most parts of East Asia with a fine horizontal resolution of 15km. The vertical coordinate system is set as 20 terrain-following levels with the model top up to 20000 m and the first layer about 50m. Nine vertical layers were set within the 2km closet to the surface to better characterise the vertical mixings within the boundary layers. Emissions of air pollutants used in this study included monthly anthropogenic emissions from HTAP_v2.2 emission inventory at base year of 2010 (Janssens-Maenhout et al., 2015), biomass burning emissions form 25 Global Fire Emissions Data base (GFED) version 4 (Randerson et al., 2017;van der Werf et al., 2010), biogenic volatile organic compounds (BVOC) emissions from MEGAN-MACC (Sindelarova et al., 2014), marine VOCs emissions from POET database (Granier et al., 2005), soil NOx emissions from Regional Emission inventory in Asia (Yan et al., 2003) and the lightning NOx emissions from Price et al., 1997. The clean initial conditions were used in the air quality simulation with a two-week free run of NAQPMS as a spin-up time. Top and boundary conditions were provided by the global chemical transport 30 model MOZART (The Model for Ozone and Related Chemical Tracers; Brasseur et al., 1998;Hauglustaine et al., 1998) and the meteorology fields were provided by the WRF model. In each day's meteorology simulation, a 36-h free run of WRF was conducted with the first 12-h simulation as a spin-up run and the remaining 24-h to provide the meteorology inputs for https://doi.org/10.5194/essd-2020-100 NAQPMS. Initial and boundary conditions for the meteorology simulation were provided by the National Center for Atmospheric Research/National Center for Environment Prediction (NCAR/NCEP) 1°× 1° reanalysis data.
Previous studies have shown that the emissions are a major contributor to the prediction uncertainty of air quality (Carmichael et al., 2008;Hanna et al., 1998;. Uncertainties of emission inventories were thus considered in the ensemble simulation which was driven by an ensemble of perturbed emissions ( " ; = 1, 2, ⋯ , #$% ). The emissions were 5 perturbed based on their error probability distribution functions (pdf), which were assumed to be Gaussian distribution in this study. Table 1 lists the perturbed species considered in this study as well as their corresponding emission uncertainties obtained from previous studies. An isotropic correlation model was assumed in the covariance of the emission errors, which was written as: (1) 10 where ( , ) represents the correlation between grid i and j, ℎ( , ) represents the distance between these two points and represents the decorrelation length, which was specified as 150km in this study. The ensemble of emissions was simply obtained by multiplying the base emissions with a perturbation factor as shown in Eq. (2): where represents the vector of base emissions and ∘ denotes the schur product. According to the pdf of emission errors, 15 follows the same Gaussian distribution with the emission errors except that its mean equals to 1. The performance of EnKF is strongly related to the ensemble size. In this study, the ensemble size was chosen as 50 to maintain a balance between the computational cost and the filter performance. Based on the method of Evensen, 1994, fifty smooth pseudo random perturbation fields for were generated for each perturbed species. The emission perturbations were kept independent from each other to prevent the pseudo correlation among different species. 20

Observations
Surface observations of hourly ambient PM2.5, PM10, SO2, NO2, CO and O3 concentrations from the CNEMC were used in this study. The number of observation sites was about 510 in 2013 and increased to 1436 in 2015. Real-time observations of these six air pollutants at each monitoring sites are routinely uploaded to the CNEMC and released to the public (available at http://www.cnemc.cn/, last access: 17 Apr, 2020) with an hour interval. 25 The quality control of observations is critical for the data assimilation since outliers of observations can exert catastrophic impacts on the performance of assimilation. To deal with it, a fully automatic outlier detection method was developed to filter out the outliers of observations . An automatic way of outlier detection is very important in the chemical data assimilation since there is large amount of observation data from multiple species. Four types of outliers characterised by the temporal and spatial inconsistency, instrument-induced low variances, periodic calibration exceptions and less PM10 than PM2. 5 30 https://doi.org/10.5194/essd-2020-100  in concentrations were detected and removed before the assimilation. More details about the outlier detection method are available in Wu et al., 2018. A proper estimate of observation error is important for the performance of filter since the observation and background error determine the relative weights of the observation and background value on the analysis. The observation error includes the measurement error and representativeness error. For each species, the measurement error was given by their instruments, 5 that is 5% for PM2.5 and PM10, 2% for SO2, NO2 and CO, and 4% for O3 according to the officially released documents of the Chinese Ministry of Ecology and Environmental Protection (HJ 193-2013 andHJ 654-2013, available at http://www.cnemc.cn/jcgf/dqhj/, last access: 17 Apr, 2020). The representativeness error arises from the different spatial scales that the discrete observation data and model simulation represent, which was estimated based on the previous study by Li et al., 2019 who estimated the representativeness errors under the 30km horizontal resolution. We extended their estimates to the 10 15km model resolution in this study according to the method of Elbern et al., 2007 which is formulated by: where /#0/ represents the representativeness error, ∆ represents the model resolution, /

Data assimilation algorithm 20
We used a variation of EnKF approach, i.e., a local ensemble transform Kalman filter (LETKF; Hunt et al., 2007), to assimilate observations into model state. Like any other EnKF implementation the ETKF uses a stochastic ensemble of model realizations to represent the background error covariance matrix, while it solves the analysis equations in the space spanned by the ensemble perturbations (Bishop and Toth, 1999). It is a kind of deterministic filter that does not need to perturb the observations which avoids the introduction of additional sampling errors. The LETKF is a local implementation of ETKF 25 which calculates the analysis grid by grid by only using the observations from the region surrounding this analysis grid. By performing the analysis locally, there is no need to represent the full error space of the global state. Instead, the ensemble perturbations of limited ensemble size only need to represent a low-dimensional error space within the local region, which well addresses the rank problem of EnKF resulting from the limited ensemble size. In addition, the long-distance spurious correlation can also be supressed by the local implementation. The formulation of LETKF in a local region can be written as 30 where HHH represents the analysis state and HHH represents the background state. represents the background perturbations calculated by using the results of ensemble simulation and L is the analysis in the ensemble space spanned by . N 5 represents the analysis error covariance in the ensemble space with a dimension of #$% × #$% .
denotes the vector of observations used in the analysis of this grid and R is the observation error covariance matrix. represents the linear observational operator that maps the model space to the observation space. The scalar in Eq. (6) denotes the inflation factor for the background covariance matrix which was estimated by an algorithm proposed by Wang and Bishop, 2003: where represents the observation-minus-forecast residuals, p represents the number of observations, and represents the ensemble estimated background error covariance matrix. The inflation is necessary for the ensemble-based assimilation algorithm since the ensemble estimated background error covariance is very likely to underestimate the true background error 15 covariance due to the limited ensemble size and the existence of model error (Liang et al., 2012). Without any treatment to prevent the underestimation of background error covariance, the model forecast would be overconfident and eventually result in filter divergence.
To prevent the spurious correlation between non or weakly related variables, each air pollutant is assimilated independently by only using the observations of this pollutant. Figure 2 illustrates the local scheme we used in the assimilation, 20 where the plus signs and dots respectively denote the centre of model grids and the location of observation sites. For each model grid, only the observation sites located within a (2 + 1) by (2 + 1) rectangular area centred at this model grid are used in the analysis of this model grid. The cut-off radius was chosen as 12 model grids, approximate to 180 km under the 15km horizontal resolutions. The use of cut-off radius, however, could lead to discontinuities in the analysis when an observation "enters" or "leaves" the local domain as one moves from one model grid to another (Sakov and Bertino, 2011). In 25 order to increase the smoothness of the analysis state, following Hunt et al., 2007, we artificially reduced the impact of the observations close to the boundary of local domain by multiplying the entries in ; by a factor that decays from one to zero as the distance of the observation from the central model grid increases. The decay factors used in this study were calculated where ( ) denotes the decay factor for the observation , ℎ( ) is the distance between the observation and the central model grid point, and is the decorrelation length that was chosen as 80 km, smaller than the cut-off radius, to increase the smoothness of the analysis state. Typically, only the state of the central model grid needs to be updated and used to construct the global analysis field. However, the experiences show that there is still observable discontinuity seen in the analysis over where ℎ( , ) denotes the distance of the model grid from the central model grid of the patch that generated th analysis value of this grid, represents the number of patch that containing this model grid and is the decorrelation length which was chosen as 80km in this study.

Results
This section presents the fields from the Chinese air quality reanalysis dataset (CAQRA) and compares them with 15 observations. It aims to provides a brief introduction to the CAQRA and gives a first assessment of the quality of this dataset.
The cross validation method was used in the assessment of CAQRA, in which a proportion of observation data is withheld from the data assimilation and used as a validation dataset. We conducted five cross validation experiments by randomly dividing the observation sites from CNEMN into five groups (20% observation sites for each group). In each experiment, the analysis was performed with one group's observation data left out in the assimilation. The analysis results at validation sites, 20 i.e., the observation sites that are not used in the assimilation, were then collected and used to validate the assimilation. For convenience, the analysis results at validation sites from the five cross validation experiments were combined together and composed of a validation dataset containing all observations sites (cross validation run). This dataset was then evaluated against the observations to assess the qualities of CAQRA. Besides, the independent observations of PM2.5 from the U.S. Department of State Air Quality Monitoring Program over China were also used in the assessment of PM2.5 reanalysis field. The qualities 25 of CAQRA was assessed at different spatial and temporal scales to better understand the qualities of CAQRA. Also shown are the validation results of the ensemble mean of the simulations without assimilation (base simulation) to highlight the impacts of assimilation.

Spatial distributions of PM reanalysis over China
We first present the reanalysis fields of the PM concentrations (PM2.5 and PM10) in China. Figure 3 shows the six-year mean (2013-2018) spatial distributions of the PM2.5 concentrations in China from CAQRA, base simulation and observations.
The CAQRA presents a continuous map of the PM2.5 concentrations in China, and well reproduced the observed magnitude of 5 PM2.5 concentrations in China. The largest PM2.5 concentrations were located in the NCP region due to the intensive industrial activities and the associated high emissions of PM2.5 and its precursors (Qi et al., 2017). Higher PM2.5 concentrations were also found in the SE region where the PM2.5 concentrations are both influenced by the local emissions and the long range transport of the air pollutants from north China (Lu et al., 2017). In the NW region, besides the hotspots of PM2.5 concentrations in big cities, high PM2.5 concentrations were also obvious in the Taklimakan Desert due to the influences of dust emissions. The 10 observed magnitude and spatial variability of PM10 concentrations were also well represented by the PM10 reanalysis. In general, the spatial distributions of PM10 reanalysis were similar to that of the PM2.5 reanalysis except in the Gansu and Ningxia provinces where there were high PM10 concentrations but relatively low PM2.5 concentrations. This would be related to the large contributions of dust emissions in these areas. The base simulation significantly overestimated the PM2.5 and PM10 concentrations in China, especially in south China. This would be due to the systematic biases in the emission inventory (Kong 15 et al., 2019) and also due to that the negative trends of the PM and its precursors' emissions were not considered in our simulation. In addition, the hotspots of PM2.5 concentrations in the NW region and Tibet Plateau also failed to be captured by the base simulation possibly due to the absence of the activity data in these remote regions.
The seasonal maps of PM2.5 and PM10 concentrations are shown in Figs. S1-2 in the Supplement, which shows significant seasonal variations. Both the PM2.5 and PM10 concentrations show maximum values in winter in most regions of China due to 20 the increased anthropogenic emissions related to enhanced power generation, industrial activities and the fossil fuel burnings for heating . The unfavourable meteorological conditions with stable boundary condition would also contribute to the high PM concentrations in winter. In contrast, due to the lower emission rate and more intense mixing processes, the PM concentrations were lowest during summer. The PM concentrations in Taklimakan Desert exhibit a different seasonality with PM concentrations highest in spring and lowest in winter. This would be due to that the major sources of PM in 25 Taklimakan Desert are not anthropogenic emissions but the dust emissions which are usually largest in spring due to the frequent strong dust storms. Figure 4 further shows an example of the hourly results of PM reanalysis, which presents a yearround time series of site mean hourly PM concentrations over Beijing. It shows that the PM reanalysis well captures the hourly evolution of the PM concentrations. Both the heavy haze episodes during wintertime and the strong dust storms during the springtime can be well represented in the PM reanalysis.

Assessment of PM reanalysis over China
The cross validation (CV) method was used to assess the quality of PM reanalysis in China. Table 2 summarized the sitebased CV results for the reanalysis data from 2013 to 2018 at different temporal scales. It is pertinent to mention here that these sites are all validation sites that were not used in the data assimilation. The validation results indicate that by assimilating the surface PM concentrations, the reanalysis shows much better performances in reproducing the magnitude and variability The qualities of PM2.5 and PM10 reanalysis data in different regions of China were further summarized in Table S1-2. At the hourly scale, there were small negative biases of PM2.5 reanalysis in the NCP (-4.8%), NE (-5.8%), SE (-3.8%) and SW (-3.4%) regions. The biases were relatively larger in the NW and Central regions, with CV NMB about -7.3% and -8.2%, 15 respectively. Two reasons might help explain the lager biases in these two regions. Firstly, the observation sites are sparser in the NW and Central regions. As a result, the PM2.5 concentrations cannot be well constrained at some sites in cross validation.
Secondly, the emissions of PM2.5 and its precursors might be too low in these two regions, leading to the underestimations of background errors since we only considered the emission uncertainty in the ensemble simulation. Although this problem has been alleviated by using the inflation technique that compensates the missing errors, the overconfident model results could 20 still to some extent degrade the performance of assimilation, making the analysis less influenced by the observations. The errors of PM2.5 reanalysis exhibited apparent spatial differences (Table S1). The CV RMSE were smallest in the SE (14.9 µg/m ! ) and SW (16.5 µg/m ! ) regions, and increased to around ~25 µg/m ! in the NCP, NE and Central regions. Consistent with the bias distributions, the largest CV RMSE value was found in the NW region which could up to 52.1 µg/m ! , but is still significant smaller than the RMSE value of base simulation (73.0 µg/m ! ). The errors of PM2.5 reanalysis were smaller at the 25 daily, monthly and yearly scales, with CV RMSE about 10.6-39.4 µg/m ! at the daily scale, 7.4-26.9 µg/m ! at the monthly scale and 6.1-23.5 µg/m ! at the yearly scale. In term of the hourly PM10 reanalysis, the CV results (Table S2) showed that there were small negative biases in the NCP, NE, SE and SW regions, ranging from -9.6% of NE to -5.9 of SE. The biases were larger over the NW and Central regions with CV NBM increasing to about 18.0% and 14.1% respectively. The errors of PM10 reanalysis also exhibited spatial heterogeneity. The CV RMSE was smallest in the SE (26.0 µg/m ! ) and SW (30.

Independent validations of PM2.5 reanalysis
Besides the cross validation, the PM2.5 reanalysis was further validated against the independent dataset from the U.S. Department of State Air Quality Monitoring Program over China (http://www.stateair.net/, last access: 17 Apr, 2020) , which contains hourly PM2.5 concentrations in Beijing, Chengdu, Guangzhou, Shanghai and Shenyang cities. Table 4 presents the comparisons of the observed PM2.5 concentrations with these obtained from the CAQRA and base simulation. The results 5 indicate that the PM2.5 reanalysis agrees better with the observed magnitude and variability of PM2.5 concentrations in all cities.
Both the MBE and RMSE were greatly reduced in CAQRA, which were only ranging from -7.1 to -0.3 µg • m ;! and from 16.8 to 33.6 µg • m ;! , respectively, in these cities. The correlation coefficient was also greatly improved in CAQRA (R 2 = 0.74-0.86) when compared to the base simulation (R 2 = 0.09-0.38). These results confirm that the CAQRA has a high quality in representing the PM2.5 pollution in China during these years. 10

Comparisons with the satellite-estimated PM2.5 concentrations
Previous studies have shown that estimating the ground-based PM2.5 concentrations from satellite-derived AOD is an effective way to map the PM2.5 concentrations in a good accuracy. To further illustrate the accuracy of our PM2.5 reanalysis data, we also compared the accuracy of our PM2.5 reanalysis data to that of the satellite-estimated PM2.5 concentrations. Table   5 summarized several representative studies that focus on the estimations of ground-based PM2.5 concentrations in China at 15 the national level using different kinds of methods. Most of these studies estimated the ground-based PM2.5 concentrations at the daily scale since they used the polar-orbiting satellite data (e.g. MODIS) that only provide the daily observations of AOD.
Estimation conducted by Liu et al., 2019 was an exception that has an hourly resolution since they used the AOD measurement from the geostationary satellite (Himawari-8). The horizontal resolutions of these studies were mainly around 10km except that of Lin et al., 2018 having the finest horizontal resolution (1km) and that of Zhan et al., 2017 having the coarsest horizontal 20 resolution (0.5° ). There are few studies provide the long-term PM2.5 data covering the recent years. In comparison, our PM2.5 reanalysis data can provide the long-term data over China with fine temporal (1h) and high accuracy. A fine temporal resolution is important for the epidemiological studies, especially for the assessment of acute health effects of air pollution. Furthermore, the accuracy of our reanalysis data (CV R 2 = 0.86, CV RMSE = 15.1 µg • m ;! ) was also higher than most of these satellite estimates (CV R 2 = 0.56-0.86, CV RMSE = 15.0-20.2 µg • m ;! ). 25

Spatial distributions of the reanalysis of gaseous air pollutants over China
Next, we present the reanalysis fields for the gaseous air pollutants in China, namely SO2, CO, NO2 and O3. Figure 7 presents the spatial distributions of the six-year averaged SO2 and CO concentrations in China obtained from CAQRA, base simulation and observations. The SO2 reanalysis well captured the magnitude and spatial distributions of SO2 concentrations in China, while the base simulation significantly overestimated the SO2 concentrations due to the positive biases of SO2 emissions in the simulation. Consistent with the observations, the SO2 reanalysis exhibited high spatial heterogeneity with highest values located in the NCP region, especially in Shandong, Shanxi and Hebei provinces. Several hotspots of SO2 concentrations were also found in the NE region. SO2 is mainly emitted from the fossil fuel consumption, especially the burning of coal . Shandong, Shanxi, Inner Mongolia and Hebei provinces were the largest four consumers of coal in 5 China according to the China Energy Statistical Yearbook (NBSC 2017a, b), which well explained the high SO2 concentrations in these provinces. We also found observable SO2 concentrations in the Xizang province which were missing in the base simulation. The spatial distribution of the CO reanalysis is similar to that of the SO2 and agreed well with the observed spatial distributions. In contrast, the base simulation significantly underestimated the CO concentrations, especially in the NCP region.
In addition, both the observation and reanalysis show hotspots of CO concentrations in the NW region and Xizang provinces, 10 while these hotspots were largely underestimated or even missing in the base simulation. According to previous studies, such underestimation might be related to the underestimated CO emissions in China (Kong et al., 2020;Tang et al., 2013). For NO2 (Fig. 8), both the reanalysis and base simulation captured the observed magnitude and spatial distributions of NO2 concentrations in China. The high NO2 concentrations were generally located in the NCP region and the major city clusters in China. However, the base simulation generally showed underestimations of NO2 concentrations in China. The spatial 15 distributions of O3 concentrations (Fig. 8) show smaller spatial heterogeneity compared to the other gases. The O3 reanalysis captured the observed magnitude and spatial distributions of O3 concentrations well in China while the base simulation generally showed underestimations of O3 concentrations in China. Figure S3-6 further presents the seasonal maps of the reanalysis fields of these gases. All gases exhibited a profound seasonal cycle with maximum values seen in the winter and lowest values in summer except the O3 which showed an opposite seasonal cycle. The largest values of SO2, CO and NO2 20 concentrations in winter would be due to the increased anthropogenic emissions and also the more stable atmosphere conditions during that time. For O3, the highest value in summer was closely related to the enhanced photochemistry reactions in summer associated with higher temperature and stronger solar radiance.

Assessment of gases' reanalysis over China
The evaluation results of these gases' reanalysis data were shown in Table 2. It indicates that the reanalysis data has an 25 excellent performance in representing the magnitude and variability of these gaseous air pollutants in China with CV R 2 ranging from 0.51 of SO2 to 0.76 of O3, and the CV MBE (CV NMB) about -2.0 µg • m ;! (-8.5%), -2.3 µg • m ;! (-6.9%), -0.06 mg • m ;! (-6.1%) and -2.3 µg • m ;! (-4.0%) for the hourly SO2, NO2, CO and O3 reanalysis data, respectively. Compared to the base simulation, the errors were reduced by about one half in the reanalysis with CV RMSE about 24.9 µg • m ;! , 16.4 µg • m ;! , 0.54 mg • m ;! and 21.9 µg • m ;! for the hourly SO2, NO2, CO and O3 reanalysis data, respectively. The reanalysis 30 shows better performance at the daily, monthly and yearly scales. The CV RMSE values of the daily SO2 and NO2 reanalysis data were also smaller than the previously datasets of SO2 and NO2 concentrations in China developed by Zhan et al., 2018;https://doi.org/10.5194/essd-2020-100 Open In terms of different regions (Table S3-   observations showed negative trends in China as well (Fig. 10), however the negative trend was not significant except that in the NE region (Table 5). This is consistent with the smaller reductions of NOx emissions (21%) in China due to the smaller changes in the emissions from transportation sector which accounts for almost one third of the NOx emissions in China. The 30 pollution controls on the transportation section were exactly offset by the growing emissions related to the vehicle growth (Zheng et al., 2018b). The base simulation generally underestimated the NO2 concentrations during the wintertime, and the https://doi.org/10.5194/essd-2020-100 observed negative trends of NO2 concentrations were also underestimated in all regions. By assimilating the observed NO2 concentrations, the reanalysis data agreed better with the observations both for the magnitude and the negative trends. The CO observations showed significant negative trends in all regions except in the NW region (Fig. 11) with calculated negative trends ranging from -0.18 to -0.06 µg • m ;! • yr ;& . Such negative trends were also observed by the satellite measurements such as the MOPITT observations (Zheng et al., 2018a), which were mainly due to the reduced anthropogenic emissions in

Comparison with the CAMS reanalysis data 20
In order to further evaluate the accuracy of our reanalysis dataset for the gaseous air pollutants, the CAMS reanalysis (CAMSRA) produced by the ECMWF (Inness et al., 2019) are employed as a reference to compare with our reanalysis dataset.
The CAMSRA is the latest global reanalysis dataset of atmospheric compositions, which assimilates the satellite retrievals of O3, CO, NO2 and AOD. Three-hourly reanalysis data of SO2, NO2, CO and O3 concentrations in surface model level from 2013 to 2018 were used in this study, which were downloaded from https://atmosphere.copernicus.eu/copernicus-releases-25 new-global-reanalysis-data-set-atmospheric-composition (last access: 17 Apr, 2020) at an 1 degree by 1 degree resolution.
Here we only focus on the comparisons of gaseous pollutants since the CAMSRA does not provide the PM2.5 and PM10 concentrations. Figure 13 presents the spatial distributions of the six-year averaged concentrations of these gaseous air pollutants in China obtained from CAMSRA. Compared with these obtained from CAQRA and observations ( Fig. 7-8), the CAMSRA largely 30 overestimates the surface concentrations of SO2 and O3 in China. In addition, due to the higher spatial resolution (15km) used in CAQRA than that used in CAMSRA (about 50km), our products can provide more detailed spatial patterns of the surface air pollutants in China, which should be more suitable for the air quality studies in the regional scale. Table 7  compares the accuracy of the CAQRA with that of the CAMSRA in estimating the surface concentrations of gaseous air pollutants in China. Compared with CAMSRA (R 2 = 0.00-0.23), the CAQRA shows much better performance in capturing the spatiotemporal variability in the surface concentrations of gaseous air pollutants in China with R 2 ranging from 0.53 to 0.77. The MBE and RMSE values are also smaller in the CAQRA than these in CAMSRA, especially for the SO2 and O3 concentrations. This could be attributed to the assimilation of surface observations in CAQRA, while CAMSRA only 5 assimilate the satellite retrievals. These suggest that the CAQRA can provide surface air quality datasets in China with higher quality than the CAMSRA, which is especially valuable for future relevant studies with high demands in spatiotemporal resolution and accuracy.

Conclusions
A high-resolution Chinese Air Quality Reanalysis (CAQRA) Dataset has been produced in this study by assimilating the 10 surface observations of PM2.5, PM10, SO2, NO2, CO and O3 concentrations from CNEMC. It provides the time-consistent concentration fields of PM2.5, PM10, SO2, NO2, CO and O3 in China from 2013 to 2018 (will be extended in the future by adding 1 year each year) at a high spatial (15km) and temporal (1 hour) resolution. The CAQRA was produced by the ChemDAS which used the NAQPMS model as the forecast model, and the LETKF to assimilate the observations in a post processing mode. The background error covariance was calculated from the ensemble simulation which considered the 15 emission uncertainties of major air pollutants. An inflation technique was also used to dynamically inflate the background error to prevent the underestimation of true background error covariance. ranging from -7.1 to -0.3 µg • m ;! and from 16.8 to 33.6 µg • m ;! , respectively. The reanalysis of gaseous air pollutants was also compared with the latest global reanalysis data CAMSRA from the ECMWF. The CAMSRA is of great values in providing the three-dimensional distributions of multiple chemical species in global. As a regional dataset, our products have higher spatial resolution than the CAMSRA, which could be more suitable for the air quality studies in the regional scale.
Although our products only provide the surface concentrations of six conventional air pollutants in China, the accuracy of 5 CAQRA was estimated to be higher than that of the CMSRA due to the assimilations of surface observations. Thus, our products have its own values in the regional air quality studies with high demands in spatiotemporal resolution and accuracy We also compared our PM2.5 reanalysis data to the previous satellite estimates of surface PM2.5 concentrations, which shows that the PM2.5 reanalysis data is more accurate than most of satellite estimates and has finer temporal resolution.
As the first version of Chinese air quality reanalysis data, there were still limitations in the CAQRA that the potential 10 users should be aware of. The current CAQRA only contains the surface concentrations of the air pollutants in China which cannot provide the information on the vertical structure of the air pollutants. Besides, the observation sites used in the assimilation are mainly urban or suburban sites that cannot provide enough information on the air pollution over the rural areas, which may influence the quality of CAQRA over the rural areas. To further improve the accuracy of our air quality reanalysis dataset, in future, an online run of EnKF could be conducted to simultaneously correct the emissions and concentrations. More 15 kinds of observations, such as the observation data of PM2.5 compositions, could also been assimilated to provide the fields of PM2.5 composition in China, which would support both the epidemiological studies and climate research.