High-resolution mapping of monthly industrial water withdrawal in China from 1965 to 2020

Abstract. High-quality gridded data on industrial water use are vital for research and water resource management. However, such data in China usually have low accuracy. In this study, we developed a gridded dataset of monthly industrial water withdrawal (IWW) for China, called the China Industrial Water Withdrawal dataset (CIWW), which spans a 56-year period from 1965 to 2020 at spatial resolutions of 0.1° and 0.25°. We utilized >400,000 records of industrial enterprises, monthly industrial product output data, and continuous statistical IWW records from 1965 to 2020 to facilitate spatial scaling, seasonal allocation, and long-term temporal coverage in developing the dataset. Compared with previous data, the CIWW dataset significantly improved the characterization of the spatial and seasonal patterns of IWW dynamics in China and showed consistency with statistical records at the local scale. The CIWW dataset, together with its methodology and auxiliary data, will be useful for water resource management and hydrological models. This new dataset is now available at https://doi.org/10.6084/m9.figshare.21901074 (Hou and Li, 2023).


Introduction
Industrial water withdrawal (IWW) accounted for approximately 19% of human water withdrawal globally, making it the second-largest sector of human water use after irrigation (WWAP, 2019). In developed countries, IWW accounted for more than half of total water use (Shen et al., 2010; Wada et al., 2011a; Flörke et al., 2013). Driven by economic and population growth, global IWW has steadily increased over the past 60 years (Oki and Kanae, 2006; Wada et al., 2011b), from 400 km³ per year in 1960 to 955 km³ per year in 2010 (Flörke et al., 2013), and it was projected to continue to increase in the future (Oki et al., 2003; Shen et al., 2010; Fujimori et al., 2017). Given the high spatial heterogeneity and rapid changes of IWW, quantitative information on IWW with high spatiotemporal resolution is essential for water resource management and research.
Existing IWW data primarily consisted of statistical data at administrative or watershed levels and model estimates at the grid level, in which sectoral information was represented in varying detail (Arnell, 1999, 2004; Alcamo et al., 2000, 2007; Vörösmarty et al., 2000; Oki et al., 2003; Hanasaki et al., 2008a; Otaki et al., 2008; Wada et al., 2011b; Hejazi et al., 2014; Wada et al., 2016; Yan et al., 2022). However, these datasets have limitations. Although gridded data, typically developed from administrative-level statistics, emerged to provide more detailed spatial information (Hanasaki et al., 2008a; Wada et al., 2011a), their accuracy depended on the downscaling methods. For total IWW, statistics were usually allocated to the grid level using spatial proxies such as population density and urban or industrial area (Hanasaki et al., 2008a, b, 2010; Beek et al., 2011; Wada et al., 2011a, b, 2014). For sectoral IWW, different mapping methods were applied. Water withdrawal for the energy sector was estimated from the total energy generated and the water use efficiency of different technologies (Koch and Vögele, 2009; Flörke et al., 2013). With detailed information on the location, power output, and water use efficiency of power plants, water withdrawal for the energy sector could be mapped (Vassolo and Döll, 2005; Flörke et al., 2013; Müller Schmied et al., 2014; Wang et al., 2016; Qin et al., 2019). Water withdrawal for manufacturing was estimated either as the residual of total IWW after subtracting energy water use (Hejazi et al., 2014) or as the product of population and per capita water consumption (Vörösmarty et al., 2000). Although several global gridded IWW datasets have been developed using these methods, how well spatial proxies such as population can represent the spatial distribution of IWW is unclear (Otaki et al., 2008). Moreover, the coarse resolution (e.g., 0.5°) and low accuracy of global datasets, especially at fine scales, limit their application to regional water issues.
IWW had seasonal fluctuations because of changes in weather conditions (temperature, precipitation, thunderstorms), water supply availability (especially under monsoon climates such as in China), production demand, and emission restrictions (Liu et al., 2006). However, most existing datasets either neglected seasonal variations or simply treated withdrawal as invariant across months (i.e., each month received 1/12 of the annual total withdrawal) (Brunner et al., 2019; Wada et al., 2011a). Such misrepresented intra-annual variations may result in significant discrepancies between the data and reality. In a few studies, seasonal variations in water withdrawal were considered for specific sectors. For example, seasonality in water withdrawal for electricity generation was considered by incorporating the influence of temperature variability on the cooling water demand of thermoelectric power plants (Byers et al., 2014; Liu et al., 2015). Results demonstrated a clear seasonal pattern, with large withdrawals in winter at high latitudes and in summer in tropical regions (Huang et al., 2018). Therefore, it is essential to fully account for intra-annual variations in IWW, which directly affect water resource management and allocation (Derepasko et al., 2021; Sunkara and Singh, 2022).
After decades of fast growth, China has become the second-largest economy in the world, with rapid industrial development leading to increasing water use (Zhou et al., 2020). Industrial water withdrawal in China accounted for 20.2% of total water withdrawal in 2019 (source: China Water Resources Bulletin) and increased 4.5-fold from 31.93 km³ in 1965 to 142.86 km³ in 2013 (Zhou et al., 2020). However, water resources in China are unevenly distributed in space, causing severe water stress due to the mismatch between water supply and the demand from population and industrial development (Liu et al., 2013; Zhao et al., 2015). For instance, Northern China, one of China's largest industrial centres and a densely populated region, is one of the most water-scarce regions in the world (Yin et al., 2020). The growth in water demand has further intensified water conflicts, making it urgent to optimize the current water use and management structure and to prepare for future climate change. However, IWW data for China that are produced from reliable data sources and offer both a long period and high spatial resolution are still lacking. The publicly available IWW data in China are either statistical data at the provincial, prefecture, or basin level (Xia et al., 2017; Qin et al., 2020; Chen et al., 2021), or gridded data extracted from global datasets, which have poor regional accuracy (Liu et al., 2019a, b; Han et al., 2019; Niva et al., 2020; Yin et al., 2020; Li et al., 2022).
https://doi.org/10.5194/essd-2023-66 Preprint. Discussion started: 19 April 2023. © Author(s) 2023. CC BY 4.0 License.
To fill this data gap, we used reliable local data sources to develop gridded datasets of monthly IWW in China with high spatial resolution and seasonal variations. Drawing on multiple statistical data sources, the high-resolution mapping of IWW was achieved with a unique industrial enterprise dataset including >400,000 enterprises; the seasonal variations were derived from industrial product output data; and the long-term temporal coverage was obtained from continuous statistical records from 1965 to 2020. The resulting dataset, named the China Industrial Water Withdrawal dataset (CIWW), provides monthly IWW from 1965 to 2020 at spatial resolutions of 0.1° and 0.25°. The dataset should help to better understand the spatial and seasonal variations of IWW in China and support hydrological studies and regional water resource management.

Data and Method
In this study, IWW was defined as the amount of water abstracted from freshwater sources for industrial purposes, rather than water consumption.

Statistical data for industrial output value and water withdrawal
The provincial-level industrial output value (IOV, unit: 10³ Yuan per year) and IWW were from the Chinese Economic Census Yearbook in 2008 (http://www.stats.gov.cn/tjsj/pcsj/jjpc/2jp/left.htm, last access: 2 April 2021). The data included surveyed IOV and IWW for enterprises above a designated production level, consisting of three main industrial sectors (mining, manufacturing, and production and supply of electricity, gas and water) and 38 subsectors (Table A1). Because two subsectors, "Other Mining" and "Waste Resources and Material Recycling and Processing", had no data, we used the averaged IOV and IWW values of the mining and manufacturing sectors in each province to fill them.

Industrial enterprise data in China
The industrial enterprise dataset used in this study was from the Database of Chinese Industrial Enterprises in Mainland China from 1998 to 2013 (https://www.lib.pku.edu.cn/portal/cn/news/0000001637, last access: 18 May 2022). The dataset contained industrial information such as address, products, annual IOV, and industry category for more than 400,000 enterprises whose IOV was more than 5 million Yuan (or 20 million Yuan from 2011 to 2013 due to standard changes). The dataset covered three main industrial sectors and 37 subsectors, similar to the provincial statistical data. The enterprise records for the subsector "Water Production and Supply" were not used because that water supply was mainly for domestic rather than industrial purposes. To match the surveyed IWW data, which were only available for 2008, the industrial enterprise data for 2008 were selected for spatially downscaling the provincial IWW (Fig. B2).

Statistical data for monthly industrial product output
The monthly industrial product output data were from the China Industry Product Output Database (http://olap.epsnet.com.cn/auth/platform.html?sid=9C98BFB19A412FF66F744C2DA364ED5E_ipv473399501&cubeId=52, last access: 26 September 2021). The data contained monthly outputs of 283 specific products of 36 industrial subsectors at the provincial level. We used the 5-year average from 2006 to 2010 to reduce the influence of interannual variability in outputs. The monthly output of each product was converted to monthly fractions (monthly output divided by the annual total output) to represent its intra-annual variation. Missing values in the monthly product output fractions were filled with the average of the monthly fractions of product output from 2006 to 2010. The monthly output fractions of the 283 products were aggregated to 36 subsectors by taking the arithmetic mean of the products within each subsector.
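The conversion from monthly outputs to subsector-level fractions can be illustrated with a minimal Python sketch. It assumes each product already has a complete 12-month series (the multi-year averaging and gap filling described above are omitted), and the product names and numbers are hypothetical, not values from the database.

```python
from statistics import mean


def monthly_fractions(monthly_output):
    """Convert 12 monthly outputs into fractions of the annual total."""
    total = sum(monthly_output)
    if total <= 0:
        raise ValueError("annual total output must be positive")
    return [m / total for m in monthly_output]


def subsector_fractions(product_fractions):
    """Unweighted arithmetic mean across products, month by month."""
    return [mean(month_values) for month_values in zip(*product_fractions)]


# Hypothetical monthly outputs for two products of one subsector.
cement = [6, 4, 8, 9, 10, 11, 10, 10, 9, 9, 8, 6]
glass = [8, 7, 8, 8, 9, 9, 8, 8, 9, 9, 9, 8]

fractions = subsector_fractions(
    [monthly_fractions(cement), monthly_fractions(glass)]
)
```

Because every product series is normalized before averaging, the subsector fractions still sum to one, so they can be applied directly to an annual total.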

Statistical data for water use to extend long-term water withdrawal data
To produce IWW data spanning 1965 to 2020, long-term statistical IWW data were required. Provincial statistical data on industrial water use in China from 2003 to 2020 were taken from the National Water Resources Bulletin (http://www.mwr.gov.cn/sj/tjgb/szygb/, last access: 3 May 2022). To extend the time series to the earlier period, the industrial water use data of Zhou et al. (2020) (referred to as "Zhou2020 data" hereafter) from 1965 to 2002 were used by summing the prefecture data to the provincial level. Note that IWW and industrial water use (i.e., the annual quantity of water withdrawn for industrial purposes) were treated as the same in our study because of their similar definitions, allowing us to obtain complete statistical records of IWW from 1965 to 2020 in China.
Table 1 provides a summary of the source data for developing the CIWW dataset.

Mapping industrial water withdrawal
The spatial mapping of IWW in China was achieved using the IOV of >400,000 enterprises in 2008 and the sub-sectoral water use efficiency at the provincial level from the Chinese Economic Census Yearbook in 2008. ) gave the spatial pattern of the total IWW in 2008.
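A minimal Python sketch of this mapping step is given below. It assumes that the provincial subsector water use efficiency is expressed as water withdrawal per unit of IOV (provincial subsector IWW divided by provincial subsector IOV), so that an enterprise's IWW is estimated as its IOV multiplied by that value and then summed per grid cell. The field names, the grid indexing, and all numbers are hypothetical.

```python
import math
from collections import defaultdict


def grid_cell(lon, lat, res=0.25):
    """Integer index of the grid cell that contains a point."""
    return (math.floor(lon / res), math.floor(lat / res))


def map_iww(enterprises, wue, res=0.25):
    """Sum enterprise-level IWW estimates (IOV x provincial subsector WUE) per cell."""
    grid = defaultdict(float)
    for e in enterprises:
        # Assumed definition: water withdrawn per unit of IOV.
        intensity = wue[(e["province"], e["subsector"])]
        grid[grid_cell(e["lon"], e["lat"], res)] += e["iov"] * intensity
    return dict(grid)


# Hypothetical provincial subsector WUE and enterprise records.
wue = {("Beijing", "steel"): 2.0, ("Shanghai", "textile"): 0.5}
enterprises = [
    {"province": "Beijing", "subsector": "steel", "lon": 116.4, "lat": 39.9, "iov": 100.0},
    {"province": "Beijing", "subsector": "steel", "lon": 116.45, "lat": 39.95, "iov": 50.0},
    {"province": "Shanghai", "subsector": "textile", "lon": 121.5, "lat": 31.2, "iov": 40.0},
]

iww_grid = map_iww(enterprises, wue)
```

The two Beijing enterprises fall in the same 0.25° cell, so their estimates are added together, which is how point-level records become a gridded field.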

Allocating industrial water withdrawal to seasonal variations
We assumed that, at the monthly scale, IWW was proportional to industrial product output, with a constant water use efficiency throughout the year. Therefore, seasonal variations in IWW could be approximated by seasonal variations in industrial product output, which were calculated as the monthly fractions of product output relative to the annual total output.
Since the monthly industrial product output data included 283 different products of different subsectors and the number of products varied across subsectors, we first calculated the monthly fraction of each product output in each province, averaged from 2006 to 2010, to reduce the influence of interannual variability. Because the water use for producing different products was unknown, we simply used the arithmetic mean across products to represent the aggregated monthly fractions for each subsector.
In this way, we obtained the fractions of product output for subsector subs in province p for month mon ($\mathrm{Fraction}_{subs,p,mon}$).
We found Fraction  ) gave the spatial and seasonal pattern of the total IWW of China in 2008.
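The seasonal allocation described in this section can be sketched in Python as follows. This is an illustration under the stated proportionality assumption, not the authors' code: each subsector's annual IWW in a grid cell is multiplied by that subsector's monthly fractions, and the results are summed over subsectors. The subsector names and values are hypothetical.

```python
def monthly_iww(annual_by_subsector, fractions_by_subsector):
    """Split one grid cell's annual subsector IWW into 12 monthly totals."""
    months = [0.0] * 12
    for subsector, annual in annual_by_subsector.items():
        for m, fraction in enumerate(fractions_by_subsector[subsector]):
            months[m] += annual * fraction
    return months


# Hypothetical monthly fractions (each list sums to 1).
fractions = {
    "power": [0.10, 0.07, 0.08, 0.08, 0.08, 0.09, 0.09, 0.09, 0.08, 0.08, 0.07, 0.09],
    "textile": [1 / 12] * 12,
}
# Hypothetical annual IWW of one grid cell by subsector.
annual = {"power": 120.0, "textile": 60.0}

cell_months = monthly_iww(annual, fractions)
```

Since each set of fractions sums to one, the twelve monthly values add up to the cell's annual total, so the seasonal split does not change annual IWW.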

Developing China's industrial water withdrawal data from 1965 to 2020
We developed long-term IWW data in China from 1965 to 2020 by mapping provincial IWW statistics in other years based on the spatial-seasonal pattern derived from IWW in 2008. Due to statistical differences in data sources, the raw IWW from the 2008 Chinese Economic Census Yearbook was not directly used in developing the long-term data. Instead, its spatial-seasonal distribution was used to map the provincial industrial water withdrawal from the China National Water Resources Bulletin between 2003 and 2020 and the Zhou2020 data between 1965 and 2002. Since the Zhou2020 data showed good consistency with the China Water Resources Bulletin data, these two IWW records were combined to develop the long-term data.
The provincial industrial water withdrawal ( ) of each year was allocated to the grid level following Eq. (5)  summed the monthly gridded , to annual total IWW of all grids in province p. Table 2 provides an overview of the CIWW dataset, including the gridded monthly IWW data in China from January 1965 to December 2020 with a spatial resolution of 0.1° and 0.25° and auxiliary data supporting the development.
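The allocation of a provincial annual total to grid cells and months can be illustrated with a short Python sketch. It assumes that each cell-month receives a share equal to its 2008 monthly IWW divided by the 2008 annual total of all grid cells in the province, which is consistent with the description above; the cell identifiers and numbers are hypothetical.

```python
def allocate_province_total(pattern_2008, provincial_total):
    """Distribute a provincial annual IWW over cells and months using 2008 weights.

    pattern_2008 maps a cell id to its 12 monthly IWW values in 2008.
    """
    base = sum(sum(months) for months in pattern_2008.values())
    if base <= 0:
        raise ValueError("2008 pattern must have a positive provincial total")
    scale = provincial_total / base
    return {cell: [v * scale for v in months] for cell, months in pattern_2008.items()}


# Hypothetical 2008 pattern for a province with two grid cells.
pattern = {"A": [3.0] * 12, "B": [1.0] * 12}
# Hypothetical provincial statistic for another year.
allocated = allocate_province_total(pattern, provincial_total=96.0)
```

Under this scheme the gridded values for any year sum back to the provincial statistic, while the relative pattern across cells and months stays fixed at its 2008 shape.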

To validate the performance of the CIWW dataset, we compared its spatial and seasonal patterns with statistical records and other datasets. For spatial validation, the 40-year mean IWW (1971-2010) from CIWW and from another global gridded dataset (Huang et al., 2018; referred to as Huang2018 data) were compared with statistical data (Zhou et al., 2020) for 289 cities in China. We used statistical data at the provincial level to produce the CIWW dataset and statistical data at the prefecture level to verify the product. The validation at the prefecture level demonstrated how well the spatial patterns were reproduced after downscaling. Results in
For seasonal validation, owing to data limitations, we only had monthly statistical IWW data for Beijing from 2006 to 2010. Results showed that both CIWW and Huang2018 data could capture the 5-year mean seasonality of IWW in Beijing. However, the magnitude of IWW was significantly overestimated by Huang2018 data (88 mm per year) relative to statistical data (33 mm per year). In comparison, the magnitude of IWW in CIWW data (37 mm per year) was more in line with statistical data (Fig. 2b).
These validations demonstrated the better performance of the CIWW data, with much higher accuracy and improved representation of spatial and seasonal variations, making it a better data source for IWW-related applications in China.

Figure 3. Spatial patterns of IWW: total (a), EGPS (b), manufacturing (c), and mining (d). The box plot in the bottom left corner shows the interquartile range (25% and 75%) of non-zero water withdrawal, with red and yellow lines denoting the median and mean values. Numbers displayed in percentage denote the percentage of the sectoral IWW to total IWW.
There was substantial spatial variation in total IWW (Fig. 3a). The eastern coastal area of China generally had higher IWW, followed by southeastern and central China, while western China had the least IWW. The largest water withdrawal was found in the urban agglomerations of the Yangtze River Delta and the Pearl River Delta. The spatial distribution of IWW over the country implied that industrial enterprises were primarily concentrated in urban areas with more intensive economic activities. The water withdrawal of the main industrial sectors showed distinctive spatial patterns. Water withdrawal from EGPS showed a dispersed pattern that was mainly concentrated in southeastern coastal areas, especially in the Yangtze River Delta region (Fig. 3b). Water withdrawal from manufacturing broadly resembled the total IWW and population distribution of China, mainly reflecting the fact that manufacturing and population were closely linked with each other (Fig. 3c). Water withdrawal from mining was confined to regions with rich mineral resources, such as central, northern, and southwestern China (Fig. 3d).
Overall, the sector with the largest IWW was EGPS (57.85%), followed by manufacturing (37.11%) and mining (5.03%). The dominance of the EGPS sector in total IWW reflected the large water requirement for thermoelectric power generation (Gu et al., 2016; Niva et al., 2020).

Seasonal variations of industrial water withdrawal in China
to November, 25%), spring (March to May, 24%) and winter (December to February, 23%) (Fig. 4). February was the month with the lowest IWW, possibly due to its fewer days and the coincidence with the Chinese Spring Festival holiday (Liu et al., 2006). The highest IWW occurred in June, probably due to the largest industrial output and high demand for cooling. Such an IWW peak did not extend to other summer months because extreme weather events such as heat waves and heavy rain occurred more frequently in July and August, resulting in production shutdowns and reduced water consumption (Liu et al., 2006).
Seasonal patterns of IWW for the manufacturing and mining sectors were generally similar, but the subsectors of manufacturing showed more diverse patterns. IWW for EGPS had quite different seasonality, as there were two peaks in June to August and December, which probably reflected the seasonal changes in cooling water withdrawal for thermal electricity generation due to seasonal temperature variation. The summer peak of EGPS was related to the high energy demand for air conditioning (Huang et al., 2018), and the winter peak was related to the high energy demand for heating (Byers et al., 2014; Liu et al., 2015; Huang et al., 2018).
For interannual monthly variations, IWW in China increased significantly from 2.1 billion m³ per month to 14 billion m³ per month during 1965-2010, and it then decreased to 10 billion m³ per month (Fig. 5). These long-term changes indicated that IWW in China has now entered a slowly declining phase.

Discussion
Our study developed new gridded data for IWW in China from 1965 to 2020. The CIWW dataset improves upon previous data, particularly in its spatial resolution and its representation of spatial and seasonal patterns. Instead of using indirect proxies such as population density to map IWW, we used data on industrial enterprises, which are the direct water withdrawers. Compared with existing IWW data that either lack or only have limited representation of seasonal changes (Wada et al., 2011b; Huang et al., 2018; Brunner et al., 2019), our data presented seasonal variations based on information from the direct water consumers, namely sectoral industrial production processes. Further, we used localized data sources in China to produce the long-term IWW data, significantly improving regional accuracy and consistency with statistical records. The use of public data sources and a transparent methodology makes it possible to further update and recalibrate the data for specific user needs.
The high-resolution IWW data product supports various research applications. On the one hand, the high spatial resolution revealed IWW at fine scales. Figure 6 shows IWW hotspots in some of China's most densely urbanized regions in 2008 at 0.01° (this resolution is not included in the CIWW dataset but can be produced with the data and code we provided), including Beijing-Tianjin-Hebei, the Yangtze River Delta, and the Pearl River Delta. These maps displayed the high heterogeneity of IWW at local scales.
On the other hand, our data can facilitate downscaling of statistical data between different administrative (e.g., provincial or prefecture level), natural (e.g., watershed), and grid levels and help reconcile the scale mismatch between data with different spatial units (e.g., administrative and watershed/catchment). For example, with the gridded CIWW data, the statistical provincial IWW data could be downscaled to the prefecture level or even the county level (Fig. 7a). Moreover, the provincial IWW could be scaled to the watershed level using weights from the gridded IWW. Figure 7b shows IWW rescaled from provincial levels to watersheds in the Yellow River basin.
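The rescaling from provinces to watersheds can be sketched in Python as below. The sketch assumes that every grid cell is assigned to exactly one province and one watershed and that a province's statistical IWW is shared among its cells in proportion to their gridded IWW; the labels and values are hypothetical.

```python
from collections import defaultdict


def rescale_to_watersheds(cells, provincial_iww):
    """Redistribute provincial IWW to watersheds using gridded IWW as weights.

    cells is a list of (province, watershed, gridded_iww) tuples, one per grid cell.
    """
    province_weight = defaultdict(float)
    for province, _, weight in cells:
        province_weight[province] += weight

    watershed_iww = defaultdict(float)
    for province, watershed, weight in cells:
        if province_weight[province] > 0:
            share = weight / province_weight[province]
            watershed_iww[watershed] += provincial_iww[province] * share
    return dict(watershed_iww)


# Hypothetical grid cells: province P1 straddles watersheds W1 and W2.
cells = [("P1", "W1", 3.0), ("P1", "W2", 1.0), ("P2", "W2", 2.0)]
provincial = {"P1": 8.0, "P2": 5.0}

by_watershed = rescale_to_watersheds(cells, provincial)
```

The same weighting works for any target unit (prefecture, county, or catchment), because only the cell-to-unit assignment changes.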

Uncertainties in spatial downscaling methods
The spatial pattern of IWW in the CIWW dataset was primarily based on >400,000 industrial enterprises in 2008. The spatial sampling of industrial enterprises could affect the reliability of the spatial mapping. Although this was a large number of records, the enterprise dataset could not cover all industries in China, as it only sampled enterprises above a designated production level. This means that enterprises below this level, including their IWW, were omitted from the dataset, leading to spatial under-sampling of all industries and their IWW in China. According to the 2008 Chinese Economic Census Yearbook, the enterprises above the designated level accounted for 93% of IOV and 85% of water withdrawal of all industries. This suggested that spatial sampling would have a limited influence on the overall spatial pattern. Also, this issue could be mitigated when point-level enterprise estimates were aggregated to the grid level.
Another source of uncertainty comes from water use efficiency (WUE). Ideally, the enterprise-level IWW would be estimated using each enterprise's IOV and WUE. However, enterprise-specific WUE was unavailable, so we used the provincial subsectoral WUE instead to estimate IWW. We assumed that enterprises of the same subsector in a province had similar WUE. In reality, the WUE of different enterprises may vary substantially depending on subsector and technological level. The data can be improved with better data sources in the future.

Uncertainties in seasonal allocation methods
When allocating the annual IWW to monthly scales, we used monthly variations of industrial product output data to represent the seasonal variation of IWW. It should be emphasized that monthly variations differed across products and provinces. When aggregating the monthly variations of 283 products to subsectors, each product was assigned an equal weight due to the lack of product-specific WUE. This neglected the structural differences within a subsector, because products consuming more water should play a more important role in determining the seasonal variation of the subsector. When aggregating IWW from subsector to sector, the structural differences within a sector were considered through the weights of subsector WUE.
We observed considerable differences in monthly variations of production output across provinces. It is difficult to determine whether these different seasonal fluctuations arose from statistical or random errors, the unweighted aggregation of product outputs to subsectors, interannual variability (Fig. 2b), or actual regional differences, so we chose to use the national mean monthly variations to represent each subsector to improve robustness. These monthly subsector variations were then combined with the subsectoral water withdrawal of each grid to derive its seasonal variations in IWW (Eq. (4)). The regional differences in seasonal variations of IWW should be explored further in future studies.

Uncertainties in producing long-term gridded data
A key step in developing the long-term gridded IWW data was to apply the spatial-seasonal pattern of IWW derived in 2008 to other years (due to data constraints). The year 2008 was chosen to match the 2008 Chinese Economic Census Yearbook data, which included detailed IWW information. This means that even though the total IWW increased over time with economic development, its spatial pattern and seasonality remained the same in CIWW. We acknowledge that the time-invariant spatial-seasonal pattern of IWW was a strong assumption and probably not true in reality. Such a time-invariant spatial pattern had been adopted in previous studies based on either a static population density map (Wada et al., 2016) or maps with decadal updates (Huang et al., 2018). Alternative time-varying data sources, such as nightlights, land cover, and population density maps with frequent temporal updates, could provide additional information to better characterize the temporal changes in the spatial pattern of IWW. To investigate how spatial patterns had changed over time, we re-estimated IWW using enterprise data in 1998. We found that the spatial pattern from the 1998 data was similar to that of 2008 at 0.25° (Spearman rank correlation, ρ=0.84). The similarity improved further at coarser grids (ρ=0.91 at 0.5°) (Fig. C3). Although specific industrial enterprises, their WUE, and water withdrawal may change substantially over time, the broad spatial pattern after aggregating to the grid scale may still hold because the spatial pattern of IWW is largely determined by the distribution of population and economy of the country. Nevertheless, temporal changes in the driving factors of IWW, such as industrial structure, water use efficiency, and climate (Alcamo et al., 2003; Otaki et al., 2008; Flörke et al., 2013; Zhou et al., 2020), should be considered in the future to achieve higher accuracy. Due to this limitation, CIWW data in earlier periods may contain larger uncertainty, and users should interpret them cautiously.
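The comparison of spatial patterns between two years at different grid resolutions can be sketched in plain Python. The example assumes two small gridded fields on the same grid, aggregates them to a coarser grid by summing 2 x 2 blocks, and computes the Spearman rank correlation as the Pearson correlation of average ranks. The grids are hypothetical and the function names are illustrative only.

```python
def coarsen(grid, factor=2):
    """Sum non-overlapping factor x factor blocks (dimensions must be divisible)."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            sum(grid[r + i][c + j] for i in range(factor) for j in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(a, b):
    """Spearman rank correlation computed as Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


def flatten(grid):
    return [v for row in grid for v in row]


# Hypothetical IWW fields for two years on the same 4 x 4 grid.
iww_1998 = [[1, 0, 2, 5], [0, 3, 4, 6], [2, 1, 7, 9], [0, 2, 8, 12]]
iww_2008 = [[2, 1, 1, 7], [0, 2, 6, 5], [1, 3, 9, 8], [1, 0, 10, 15]]

rho_fine = spearman(flatten(iww_1998), flatten(iww_2008))
rho_coarse = spearman(flatten(coarsen(iww_1998)), flatten(coarsen(iww_2008)))
```

In practice a library routine such as `scipy.stats.spearmanr` would normally replace the hand-written rank correlation; the plain version is shown only to keep the sketch dependency-free.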

Conclusions
To fill the data gap in water withdrawal in China, one of the top water consumers in the world, we developed a new gridded dataset, namely the China Industrial Water Withdrawal dataset (CIWW). The dataset provides monthly IWW from 1965 to 2020 at spatial resolutions of 0.1° and 0.25°. Built on the best available data sources, the dataset presents significant improvements over previous global datasets in characterizing the spatial pattern, seasonal variation, and long-term changes of IWW in China, with much higher accuracy. The transparent methodology and public availability of the source data allow further adjustment and calibration to support various user applications. They can also serve as a reference for other countries developing localized datasets of their own. The dataset could help understand human water use dynamics and support studies in hydrology, geography, environment, and sustainability sciences, as well as regional water resource management and allocation in China.