A 30 m resolution dataset of China’s urban impervious surface area and green space, 2000–2018

Accurate and timely national-scale maps of urban underlying land properties are important for improving habitat environments and achieving sustainable development goals. Urban impervious surface (UIS) and urban green space (UGS) are two core components for characterizing urban underlying environments. However, UIS and UGS are often mosaicked in the urban landscape with complex structures and compositions. A “hard classification” into a single binary type cannot effectively delineate spatially explicit urban land surface properties. Although six mainstream global or national urban land use and land cover products with a 30 m spatial resolution have been developed, they only provide the binary pattern or dynamics of a single urban land type and cannot effectively delineate the quantitative components or structure of intra-urban land cover. Here we propose a new mapping strategy to acquire multitemporal and fractional information on the essential urban land cover types at a national scale by combining the advantages of big-data processing and human interpretation with the aid of geoknowledge. Firstly, the vector polygons of urban boundaries in 2000, 2005, 2010, 2015 and 2018 were extracted from China’s Land Use/cover Dataset (CLUD) derived from Landsat images. Secondly, the national settlement and vegetation percentages were retrieved using a sub-pixel decomposition method through a random forest algorithm on the Google Earth Engine (GEE) platform. Finally, the products of China’s UIS and UGS fractions (CLUD-Urban) at a 30 m resolution were developed for 2000, 2005, 2010, 2015 and 2018. We also compared our products with six existing mainstream datasets in terms of quality and accuracy. The assessment results showed that the CLUD-Urban product has higher accuracies in urban-boundary and urban-expansion detection than other products and, in addition, provides accurate UIS and UGS fractions for each period.
The overall accuracy of urban boundaries in 2000–2018 is over 92.65 %, and the correlation coefficient (R) and root mean square error (RMSE) of the UIS and UGS fractions are 0.91 and 0.10 (UIS) and 0.89 and 0.11 (UGS), respectively. Our results indicate that 71 % of urban land pixels within cities were mosaics of UIS and UGS in 2018; a single UIS classification may therefore substantially increase the mapping uncertainty. Urban underlying covers exhibited high spatial heterogeneity, with average fractions of 68.21 % for UIS and 22.30 % for UGS in 2018 at a national scale. Driven by fast urbanization, UIS and UGS increased at unprecedented annual rates of 1605.56 and 627.78 km² yr⁻¹ in 2000–2018. The CLUD-Urban mapping can fill the knowledge gap in understanding the impacts of UIS and UGS patterns on ecosystem services and habitat environments, and it is valuable for detecting waterlogging hotspots and improving urban greening in planning and management practices. The datasets can be downloaded from https://doi.org/10.5281/zenodo.4034161 (Kuang et al., 2020a). Published by Copernicus Publications.


Introduction
The effects of rapid urbanization on environments have been witnessed around the world (Seto et al., 2012; Bai et al., 2018; Kuang et al., 2020b; Zhang et al., 2021) and profoundly contribute to changes in the biosphere, hydrosphere and atmosphere. In China, a rapid urbanization process emerged in the 21st century (Xu and Min, 2013; Ma et al., 2014; Bai et al., 2014; Kuang, 2012; Kuang et al., 2016), resulting in a rapid increase in urban impervious surface area (UIS) (Kuang et al., 2013; Kuang and Dou, 2020; Lu et al., 2008). This process further triggered various urban environmental problems such as urban heat islands and urban flooding (Haase et al., 2014; Hamdi and Schayes, 2007; Kuang, 2011; Kuang et al., 2015, 2017; Xu, 2006; Zhang et al., 2017). Although many green areas have been constructed in Chinese cities recently, China has a lower percentage of urban green space (UGS) than developed countries such as the United States (Nowak and Greenfield, 2012; Kuang et al., 2014). These urban environmental problems make it urgent to develop accurate urban land cover datasets with high spatial resolutions for delineating the underlying urban environments. Along with the development of Earth observation technologies, remote sensing has become the mainstream method for mapping UIS and UGS and monitoring their changes (Weng, 2012; Wang et al., 2013; Lu et al., 2018; Zhang et al., 2009).
Various land use products such as the Global Land Cover product (GlobaLand30) (Chen et al., 2015), the University of Maryland (UMD) Land Cover Classification (Hansen et al., 2000), Moderate Resolution Imaging Spectroradiometer (MODIS)-based land use and land cover products, GlobCover (Bontemps et al., 2011), and Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) (Gong et al., 2013) are freely available worldwide (Grekousis et al., 2015; Dong et al., 2017). These products have different definitions of urban areas or settlements due to their different classification systems, such as the International Geosphere-Biosphere Programme (IGBP) (Belward, 1996). Some urban land datasets, such as the Normalized Urban Areas Composite Index (NUACI), which were constructed by supervised learning approaches, have been released at a national or global scale with spatial resolutions from 30 m to 1 km (He et al., 2019; Gong et al., 2019). Others such as the built-up grid of the Global Human Settlement Layer (GHSL Built) (Pesaresi et al., 2013) and Global Urban Footprint (GUF) (Esch et al., 2017, 2018) have been published too. Most urban land products have focused on built-up land or urban area classification but cannot delineate urban land as a heterogeneous unit consisting of UIS, UGS and other fractions (Chen et al., 2015). Therefore, few urban land products have provided intra-urban UIS and UGS fractions at the sub-pixel level.
A detailed UIS dataset inside a city is required as a primary urban environmental index. Numerous studies on impervious surface mapping at the national scale mainly rely on medium-low-spatial-resolution remotely sensed data such as MODIS and the Defense Meteorological Satellite Program's Operational Linescan System (DMSP OLS) (Gong et al., 2013; Zhou et al., 2014; Grekousis et al., 2015; Zhou et al., 2015; Kuang et al., 2016; Zhou et al., 2018). Recently, more research has shifted to employing medium-high-spatial-resolution data (e.g., Landsat) to improve data products (Liu et al., 2018; Gong et al., 2019, 2020a; Li et al., 2020; Lin et al., 2020). The US Geological Survey has developed the National Land Cover Database (NLCD), which provides impervious surface fraction, percent tree canopy, land cover classes and their changes with a spatial resolution of 30 m (Falcone and Homer, 2012; Yang et al., 2018). However, a detailed intra-urban UIS and UGS dataset with a 30 m spatial resolution for China at the national scale is not available yet, making it difficult to conduct detailed analysis of aspects such as urban living environments.
A systematic assessment of urban land mapping algorithms indicates that previous research mainly classified urban land into a single type, either "urban area" or impervious surface area (ISA), which limits recognition of the urban environment (Reba and Seto, 2020). There are two critical challenges in mapping urban land cover composites. Firstly, the conceptual definition of urban land or ISA in previous research is unclear; thus, the spatial extent is inconsistent, resulting in a large divergence in the statistical area of urban land. Meanwhile, computer-based automatic classification of moderate-resolution satellite images has not segmented urban-rural boundaries accurately, owing to differences in geographic conditions, socioeconomic conditions and land policies. Therefore, accurate mapping of urban-rural boundaries is pivotal in detecting urban land cover change. Secondly, the spatial heterogeneity of urban surface properties makes it difficult to decompose urban land cover types with complex surface materials at the pixel scale, a task further limited by the huge data processing and storage capacities required at a 30 m resolution.
In reality, urban land cover is composed of UIS, UGS and other fractions. UIS refers to the impervious surface features created by artificial land use activities, such as building roofs, asphalt or cement roads, and parking lots. UGS, which includes parks, trees and grass, is an important component of the green infrastructure of cities and provides a range of ecosystem services. Previous studies have shown that spectral mixture analysis (SMA) provides an effective tool to retrieve the UIS and UGS fractions from Landsat multispectral imagery (Weng, 2004, 2006; Peng et al., 2016). However, this method needs local knowledge for problem-specific analysis such as intra-urban land cover analysis of a single city or a single urban agglomeration (Zhang and Weng, 2016; Xu et al., 2018). Although the globally standardized SMA can effectively extract substrate, dark areas and vegetation (Small and Milesi, 2013), UIS cannot be accurately and directly extracted from multispectral images without post-processing, given its wide spectral variation and the different meanings of UIS and substrate. Because of the high correlation between UIS and vegetation indices in the urban landscape, a fractional UIS dataset can be estimated from vegetation indices using a regression-based approach (Sexton et al., 2013; Wang et al., 2017).
To address the above issues, we propose an integrated strategy that combines accurate urban boundary information from China's Land Use/cover Dataset (CLUD), extracted by human-computer digitalization, with the retrieval of UIS and UGS fractions through big-data processing on the GEE platform. Based on this strategy, we developed a national UIS and UGS fraction dataset at a 30 m spatial resolution for 2000, 2005, 2010, 2015 and 2018 across China. This dataset provides a foundation for understanding urban dwellers' environments and enhances our understanding of the impacts of urbanization on ecological services and functions, and it will also be helpful in future research and practices in urban planning and urban environmental sustainability.

The strategy of developing the CLUD-Urban product
To acquire an accurate CLUD-Urban product, three steps were implemented according to our mapping strategy. Firstly, national urban boundaries in 2000-2018 were extracted from CLUD, which was generated using a uniform technological flow and classification system in a human-computer digitalization environment. The time series of urban boundaries and their expansions perform well in accuracy and data quality. The national urban vector boundaries in 2000, 2005, 2010, 2015 and 2018 were converted to raster data with a 30 m resolution for further processing (Fig. 1). Secondly, the settlement and vegetation fractions with a 30 m resolution were retrieved using a random forest algorithm on the GEE platform. Thirdly, the UIS and UGS fractions with a 30 m resolution were mapped by overlaying the urban boundaries of CLUD with the settlement and vegetation fractions, respectively (Fig. 1). The accuracy of both the urban boundaries and the UIS and UGS fractions was assessed using samples from Google Earth images. Quality control was conducted throughout the data processing in mapping the CLUD-Urban product. A detailed description is given in the following sections.

Data sources and pre-processing
Landsat is the longest-running satellite series for Earth observation. Landsat Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+) and Operational Land Imager (OLI) data with path ranges of 118-149 and row ranges of 23-43 in China were selected (Table 1). In mapping CLUD, Landsat TM, ETM+ and OLI images in each period, together with China-Brazil Earth Resources Satellite program (CBERS) and Huan Jing (HJ-1A and HJ-1B) satellite images in 2010, were used to generate false-color composite images with the near-infrared, red and green spectral bands displayed as red, green and blue. Image enhancement was performed to improve the visual interpretation quality. Image-to-image registration was conducted to keep image rectification errors below 2 pixels (60 m). CBERS-1 and Huan Jing (HJ-1A and HJ-1B) satellite images were only used in extracting the vector polygons of CLUD in 2010, and they were processed uniformly with the Landsat images.
In the retrieval of settlement and vegetation fractions, Landsat TM, ETM+, and OLI data in each period from January to December were collected using the GEE platform. Shuttle Radar Topography Mission (SRTM) digital elevation model data and the normalized difference vegetation index (NDVI) with a 30 m resolution were acquired as input parameters to retrieve settlement and vegetation fractions. Google Earth images in selected cities with a 0.6 m resolution were used to assess the accuracy of the CLUD-Urban product.

The classification system and interpretation symbols
CLUD with a 30 m resolution was developed by the Chinese Academy of Sciences and has been updated from 2000 to 2018 every 5 or 3 years. This dataset can delineate land use or land cover change associated with human activities, including urbanization, at a scale of 1 : 100 000 (Liu et al., 2005a, b, 2010). This product adopted a hierarchical classification system covering 6 first-level classes and 25 second-level classes. The 6 first-level classes comprise cropland, woodland, grassland, water body, construction land and unused land. A detailed description of each class can be found in previous publications (Liu et al., 2005b; Zhang et al., 2014). Construction land was divided into three second-level classes: urban land, rural settlements, and industrial and mining lands beyond cities. Urban land was defined as a built-up area of concentrated construction, i.e., buildings, roads, squares, green infrastructure and other lands that provide living space, industrial production and ecosystem services for the dwellers of cities or towns. According to this definition, urban land can be megacities (more than 10 million population), megalopolises (5-10 million population), large cities (1-5 million population), medium cities (0.5-1 million population), small cities (0.2-0.5 million population) and towns (less than 0.2 million population) (Kuang, 2020a). The industrial and traffic lands outside cities are excluded from urban land. Based on the designed classification system, interpretation symbols for the second-level classes were built for the false-color composite images as a reference to aid the human-computer interpretation (Fig. 2).
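The population thresholds listed above can be written as a small lookup. The following is a minimal sketch, not part of the CLUD workflow; the function name `classify_city` is hypothetical, and the handling of values that fall exactly on a shared boundary (e.g. 5 million) is an assumption, because the text gives overlapping range ends.

```python
def classify_city(population_millions: float) -> str:
    """Return the city-size class for a population given in millions.

    Thresholds follow the classes listed in the text. Assigning a value
    that sits exactly on a boundary to the larger class is an assumption.
    """
    if population_millions < 0:
        raise ValueError("population must be non-negative")
    if population_millions > 10:
        return "megacity"
    if population_millions >= 5:
        return "megalopolis"
    if population_millions >= 1:
        return "large city"
    if population_millions >= 0.5:
        return "medium city"
    if population_millions >= 0.2:
        return "small city"
    return "town"


if __name__ == "__main__":
    for pop in (12.0, 7.0, 3.0, 0.7, 0.3, 0.1):
        print(pop, classify_city(pop))
```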

Land use and dynamic polygon interpretation
According to the CLUD classification system, the vector polygons of land use classes in 2000 were digitalized on top of the false-color composite images with the aid of interpretation symbols and geoknowledge from each zone (Fig. 3). The polygons of urban lands were identified using detailed image interpretation symbols for each second-level land use class based on Landsat or similar-resolution images. Usually, the polygons of urban lands are larger than those of rural settlements and others (e.g., industrial and traffic lands) and appear in an ash-gray color flecked with white. Digitalization personnel differentiated urban land from rural settlements and others based on interpretation symbols and geoknowledge from field investigation (Fig. 2). In the digitalization environment, each vector polygon was assigned a second-level class code. The vector polygons of land use classes in 2000 were double-checked to ensure that the interpreted type was correct. The dynamic polygons were extracted by comparing two images of different dates and were assigned codes that include the types before and after the change (Fig. 3). The land use changes within 5 or 3 years were detected using the uniform method. Finally, the land use maps in 2000, 2005, 2010, 2015 and 2018 and their changes at 5- or 3-year intervals were generated for CLUD. The detailed technological flow can be found in previous publications (Liu et al., 2005b; Zhang et al., 2014). An example of a land use map in 2010 in the Conghua district of Guangzhou and dynamic land use changes in 2010-2015 is illustrated in Fig. 3.

Retrieval of multitemporal urban boundaries
The vector boundaries of urban extents were extracted from the CLUD land use maps in each period (Kuang et al., 2016). Here we show the urban boundaries and the expansion process with a 30 m resolution in the cities of Xi'an, Wuhan, Guangzhou and Ürümqi (Fig. 4).

Collection of training samples
The training samples of UIS and UGS fractions are a pivotal input to the random forest model for mapping national settlement and vegetation fractions. In light of the large discrepancies among UIS and UGS composites in different climate zones with various geographical and socioeconomic conditions, we collected a total of 2570 samples from randomly selected cities in different climate zones (Fig. 5). We also referred to the existing UIS dataset to acquire samples at 10 % intervals of the ISA fraction; those samples are primarily distributed in homogeneous UIS or UGS areas, which might provide more effective samples and decrease the impact of imagery mismatch. The UIS and UGS samples covered diverse types, including buildings, roads and squares as well as grass and trees from parks, roads and residential green spaces. The UIS and UGS percentages were interpreted within each sample using Google Earth images (Fig. 5b1-b4). Finally, the training samples in 2000, 2005, 2010, 2015 and 2018 were used for training the random forest model.
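Grouping samples by 10 % intervals of the ISA fraction, as described above, can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, fractions are taken to lie in [0, 1], and a fraction of exactly 1.0 is placed in the last interval.

```python
from collections import Counter


def interval_index(fraction: float) -> int:
    """Index (0 to 9) of the 10 % interval that a fraction in [0, 1] falls into."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    # Work in percent and round away floating-point noise (e.g. 0.3 * 100).
    percent = round(fraction * 100.0, 6)
    # A fraction of exactly 1.0 is assigned to the last interval (assumption).
    return min(int(percent // 10), 9)


def count_per_interval(fractions):
    """Number of samples in each of the ten 10 % intervals."""
    counts = Counter(interval_index(f) for f in fractions)
    return [counts.get(i, 0) for i in range(10)]


if __name__ == "__main__":
    sample_fractions = [0.05, 0.15, 0.12, 0.95, 1.0]  # made-up values
    print(count_per_interval(sample_fractions))
```

A count per interval of this kind makes it easy to check whether every ISA range is represented before the samples are used for training or validation.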

Retrieval of settlement and vegetation fractions using random forest model
Many previous studies have indicated that random forest is more effective and accurate in classifying urban land types than other machine learning approaches such as support vector machines (SVMs) and artificial neural networks (ANNs). Random forest exhibits a strong capacity for processing high-dimensional datasets and has been successfully applied to mapping global ISA at a 30 m resolution. In this research, we proposed a strategy to acquire the settlement and vegetation percentages at the pixel scale by combining random forest with big-data processing on the GEE platform. According to the 16 global urban ecoregions defined by temperature, precipitation, topographic conditions and socioeconomic factors, China has three urban ecoregions. In each urban ecoregion, the annual maximum NDVI; the spectral bands of Landsat TM, ETM+ and OLI; and the slope index derived from the SRTM DEM with a 30 m resolution were selected as the input parameters to run the random forest model. The Landsat images were from 1 January to 31 December of each baseline year. The annual maximum NDVI ($\mathrm{NDVI}_{\max}$) was retrieved using Eq. (1):

$$\mathrm{NDVI}_{\max} = \max\left(\mathrm{NDVI}_1, \mathrm{NDVI}_2, \ldots, \mathrm{NDVI}_n\right), \qquad (1)$$

where $\mathrm{NDVI}_i$ is the NDVI value of the $i$th image and $n$ is the number of images in the baseline year. The individual NDVI was calculated from Landsat images in the period between 1 January and 31 December, and all images were collected using GEE (Gorelick et al., 2017).
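The annual maximum NDVI can be illustrated for a single pixel with plain Python. This is a minimal sketch rather than the GEE implementation used for the product: the function names and the reflectance values are hypothetical, and NDVI is computed with the standard definition (NIR − red) / (NIR + red).

```python
def ndvi(nir: float, red: float):
    """Standard NDVI from near-infrared and red reflectance; None if undefined."""
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


def annual_max_ndvi(observations):
    """Maximum NDVI over all (nir, red) observations of one pixel in a year.

    Returns None when no valid observation exists (e.g. all scenes masked).
    """
    values = [v for v in (ndvi(nir, red) for nir, red in observations) if v is not None]
    return max(values) if values else None


if __name__ == "__main__":
    # Made-up reflectance pairs for one pixel across three image dates.
    pixel_series = [(0.30, 0.10), (0.50, 0.10), (0.20, 0.20)]
    print(annual_max_ndvi(pixel_series))
```

Taking the yearly maximum reduces the influence of clouds and leaf-off dates, which is why a per-pixel maximum is a common compositing choice for vegetation indices.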
On the GEE platform, the settlement and vegetation fractions were calculated for each urban ecoregion using the trained parameterizations. Lawn, forest or their mosaicked areas were selected as input samples in mapping UGS. Post-processing was implemented to remove the pixels with NDVI values greater than 0.5 or DEM slope values greater than 15°. In arid and semi-arid areas, the enhanced bare-soil index (EBSI) was utilized to separate UIS from bare soils (As-syakur et al., 2012; Li et al., 2019). As a result, the settlement and vegetation fractions at 30 m × 30 m in 2000, 2005, 2010, 2015 and 2018 were generated for developing the CLUD-Urban product (Fig. 6).
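The threshold-based post-processing can be sketched per pixel as follows. The two thresholds come from the text; everything else is an assumption made for illustration: the function name is hypothetical, the rule is applied to the settlement fraction, and "removing" a pixel is represented by returning a no-data value.

```python
NDVI_LIMIT = 0.5         # pixels with NDVI above this value are removed (from the text)
SLOPE_LIMIT_DEG = 15.0   # pixels with slope above this value are removed (from the text)


def postprocess_settlement(fraction, ndvi_max, slope_deg, nodata=None):
    """Return the settlement fraction, or nodata if the pixel fails a threshold.

    Applying the rule to the settlement layer and encoding removal as a
    no-data value are assumptions of this sketch.
    """
    if ndvi_max > NDVI_LIMIT or slope_deg > SLOPE_LIMIT_DEG:
        return nodata
    return fraction


if __name__ == "__main__":
    print(postprocess_settlement(0.8, ndvi_max=0.3, slope_deg=3.0))   # kept
    print(postprocess_settlement(0.8, ndvi_max=0.6, slope_deg=3.0))   # removed: dense vegetation
    print(postprocess_settlement(0.8, ndvi_max=0.3, slope_deg=20.0))  # removed: steep terrain
```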

Mapping of UIS and UGS fractions
The settlement and vegetation fractions with a 1° × 1° grid of each period were downloaded from the GEE platform.
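The overlay of rasterized urban boundaries with the settlement and vegetation fractions, described in the mapping strategy, amounts to a per-pixel mask. The sketch below assumes small in-memory grids stored as nested lists; the function name `mask_to_urban` and the sample values are hypothetical and do not reproduce the authors' processing chain.

```python
def mask_to_urban(fraction_grid, urban_mask, nodata=None):
    """Keep fraction values inside the urban boundary and set the rest to nodata.

    fraction_grid: rows of settlement or vegetation fractions (0 to 1).
    urban_mask:    rows of booleans, True where the 30 m pixel is urban land.
    """
    if len(fraction_grid) != len(urban_mask):
        raise ValueError("grids must have the same number of rows")
    result = []
    for frac_row, mask_row in zip(fraction_grid, urban_mask):
        if len(frac_row) != len(mask_row):
            raise ValueError("grids must have the same number of columns")
        result.append([f if inside else nodata for f, inside in zip(frac_row, mask_row)])
    return result


if __name__ == "__main__":
    settlement = [[0.9, 0.4], [0.7, 0.2]]   # made-up settlement fractions
    vegetation = [[0.1, 0.5], [0.2, 0.7]]   # made-up vegetation fractions
    urban = [[True, True], [True, False]]   # rasterized urban boundary
    uis = mask_to_urban(settlement, urban)  # UIS: settlement fraction within urban land
    ugs = mask_to_urban(vegetation, urban)  # UGS: vegetation fraction within urban land
    print(uis)
    print(ugs)
```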

Accuracy assessment of the CLUD-Urban product
The national urban boundaries and the UIS and UGS fractions were assessed through qualitative and quantitative indices, respectively. Firstly, for the accuracy of CLUD in 2000, 2005 and 2010 we referred to our previous publications (Liu et al., 2010; Zhang et al., 2014). The accuracy of the six first-level classes (cropland, forest, grassland, built-up area, water body and unused land) and of the second-level land use and land cover types, including urban land, rural settlements, and industrial and traffic lands, was assessed using field investigation data and Google Earth images (Liu et al., 2010; Zhang et al., 2014). We also assessed the accuracy of the CLUD urban boundaries from 2000 to 2018 using overall accuracy, producer's accuracy and user's accuracy (Fig. 10) (Kuang et al., 2016; Kuang, 2020a).
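Overall, producer's and user's accuracy are all derived from a confusion matrix. The following minimal sketch shows the standard definitions; the function name and the sample counts are hypothetical and are not the paper's validation data. Rows are taken to be the mapped class and columns the reference class.

```python
def accuracy_metrics(matrix):
    """Overall, producer's and user's accuracy from a square confusion matrix.

    matrix[i][j] is the number of samples mapped as class i whose
    reference (ground-truth) class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(numerator, denominator):
        # None marks a class with no samples in the relevant row or column.
        return numerator / denominator if denominator else None

    overall = sum(matrix[i][i] for i in range(n)) / total
    # User's accuracy: correct samples divided by all samples mapped as the class.
    users = [ratio(matrix[i][i], sum(matrix[i])) for i in range(n)]
    # Producer's accuracy: correct samples divided by all reference samples of the class.
    producers = [ratio(matrix[j][j], sum(matrix[i][j] for i in range(n))) for j in range(n)]
    return overall, producers, users


if __name__ == "__main__":
    # Made-up counts for two classes: urban and non-urban.
    confusion = [[45, 5],
                 [10, 40]]
    print(accuracy_metrics(confusion))
```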
The validation samples for assessing the accuracy of UIS and UGS fractions were collected within urban boundaries using a stratified random sampling method with the ISA fraction at 10 % intervals. Those samples, each with a window size of 3 pixels × 3 pixels (90 m × 90 m), were used to digitalize the UIS and UGS polygons through human-computer interaction based on Google Earth images (Kuang, 2020b). A total of 1869 validation samples were randomly acquired in different regions of China in 2000-2018, including 1070 samples located in the changed UIS and UGS areas (Fig. 10). Mean UIS and UGS fractions in each grid were calculated. The estimated values were compared with the validation values using the correlation coefficient (R) and root mean square error (RMSE) (Kuang, 2020b). We also evaluated the changed UIS and UGS areas using R and RMSE based on the 1070 validation samples.
The built-up area has the highest accuracy among the six land use types owing to its clear urban boundaries, and the accuracy reached 98.92 % in 2000 and 97.01 % in 2005 according to previous assessment. The user's accuracy of the urban land type is relatively high, at 93.67 % in 2010, 92.65 % in 2015 and 91.32 % in 2018. Overall, the urban land accuracy shows a decreasing trend, which resulted from fuzzy and unidentifiable urban-rural boundaries owing to the continuous pattern of urban-rural land driven by China's fast urban development in the 21st century. In CLUD, the change polygons were identified based on human interpretation. The validation of UIS and UGS fractions in each period showed that the RMSEs were 0.11-0.12 and 0.11-0.12, respectively, and the R values were 0.89-0.91 and 0.87-0.90, respectively (Table 3). The R and RMSE values for the changed UIS areas in 2000-2018 are 0.88 and 0.12, and those for the changed UGS areas in the same period are 0.85 and 0.12, respectively.
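The two agreement measures used for the fractions, R and RMSE, can be computed as in the sketch below. The function names and sample values are hypothetical; R is taken to be the Pearson correlation coefficient, which is an assumption about the exact statistic used.

```python
import math


def rmse(estimated, reference):
    """Root mean square error between estimated and reference fractions."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(estimated, reference):
    """Pearson correlation coefficient between estimated and reference fractions."""
    if len(estimated) != len(reference) or len(estimated) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    covariance = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimated, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_e == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_e * var_r)


if __name__ == "__main__":
    estimated_uis = [0.1, 0.5, 0.9]   # made-up mapped fractions
    reference_uis = [0.2, 0.5, 0.8]   # made-up interpreted fractions
    print(pearson_r(estimated_uis, reference_uis), rmse(estimated_uis, reference_uis))
```

The example also shows why both measures are reported together: the made-up values are perfectly correlated yet still have a non-zero RMSE, because correlation ignores systematic over- or underestimation.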

Patterns and dynamics of UIS and UGS since the beginning of the 21st century
Our results indicated that China's UIS increased from 2.46 × 10⁴ km² in 2000 to 5.35 × 10⁴ km² in 2018 (Fig. 7). From the perspective of the quality of dwellers' habitat environments, the percentage of UIS in China's urban areas in 2018 is 68.21 %, showing a higher UIS density than developed countries like the USA. However, the UIS percentage in urban areas decreased from 74.42 % in 2000 to 68.21 % in 2018 owing to the improvement of urban greening conditions. As shown in Fig. 7, the UIS across China is mainly distributed in the coastal and central regions and is relatively scattered in the western regions. The national UIS pattern of "high in the east and low in the west" remained unchanged between 2000 and 2018 (Fig. 7). China's UGS shows an increasing trend. The total UGS area increased from 0.70 × 10⁴ km² in 2000 to 1.83 × 10⁴ km² in 2018 (Fig. 8). Looking at both UIS and UGS in urban areas, our results indicate a slight increase in UGS density and a decrease in UIS density, which has resulted from strengthened urban greening since the start of the 21st century. The UGS percentage rose from 18.91 % in 2000 to 22.30 % in 2018.
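The annual UIS growth rate reported in the abstract follows directly from the two area totals. The sketch below reproduces that arithmetic; the function name is hypothetical and the calculation assumes a simple linear average over the 18-year period.

```python
def annual_rate(area_start_km2, area_end_km2, year_start, year_end):
    """Average annual change in area (km2 per year) between two years."""
    if year_end <= year_start:
        raise ValueError("year_end must be later than year_start")
    return (area_end_km2 - area_start_km2) / (year_end - year_start)


if __name__ == "__main__":
    # UIS totals from the text: 2.46e4 km2 in 2000 and 5.35e4 km2 in 2018.
    uis_rate = annual_rate(2.46e4, 5.35e4, 2000, 2018)
    print(round(uis_rate, 2))  # 1605.56 km2 per year, as stated in the abstract
```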
As shown in Fig. 9, UIS and UGS of cities in coastal, northeastern and southwestern China have high spatial heterogeneity and showed different urban expansion rates in the past 18 years. Large discrepancies in the UIS and UGS percentages in urban areas were exhibited among the eastern, central, western and coastal zones. The UGS percentage in the coastal zone showed a remarkable increasing trend from 16.50 % in 2000 to 21.66 % in 2018 (Figs. 9 and 11). We also found that urban greening conditions improved in Beijing in the same period, which resulted in an increase in UGS percentage and a decrease in UIS percentage in urban areas (Fig. 9). This means that the urban habitat environment in coastal zones has become more liveable and comfortable, which is associated with the greening of parks and forests. We also found that the western cities have a relatively low UGS percentage in urban areas, which is 0.86 % lower than the average of China owing to the low-greening conditions (Figs. 9 and 11).

Comparisons of the CLUD-Urban product with other datasets
We compared the vector boundaries of urban areas with the existing land use products and found obvious discrepancies because of the differences in data production, data source, resolution and definition of urban land use types.
The spatial resolutions of land cover products range from 30 to 1000 m. Figure 12 provides a comparison of urban land datasets (see Table 4 for these datasets), showing that our product performs better in delineating the detailed spatial patterns of intra-urban land cover, i.e. the composite of UIS and UGS (note that both the GHSL Built and GlobaLand30 products cover only 2 years). The accuracy of urban boundaries from CLUD-Urban is over 92 % and is basically consistent with that of the impervious surface map. Our dataset has a higher classification accuracy in urban boundaries than GHSL (90.3 %), FROM-GLC (89.6 %), Human Built-up and Settlement Extent (HBASE) (88.0 %), GlobaLand30 (88.4 %) and NUACI (85.6 %). Furthermore, our CLUD-Urban product can accurately delineate the spatial heterogeneity of UIS and UGS composites, with R values of 0.90 and 0.89 and RMSE values of 0.11 and 0.11, respectively. In the existing datasets, the UIS and UGS composites cannot be effectively decomposed at the pixel scale (Fig. 12).

The mapping advantages integrated with human-computer interpretation and GEE platform
In mapping urban land use and land cover change at the national scale, two pivotal steps are required: segmenting urban land from rural settlements and from industrial and traffic lands on the periphery of cities to acquire accurate urban boundaries, and retrieving the UIS and UGS fractions at the pixel scale. Urban boundaries are generally mapped using classification methods such as unsupervised classifiers, supervised classifiers, human-computer interpretation and other advanced algorithms (e.g., ANN, SVM and random forest) (Wu and Murray, 2003; Zhang et al., 2020). Among these methods, human-computer interpretation is widely regarded as the most accurate for classifying urban land use and land cover changes, especially for detecting change information and extracting vector polygons as whole geofeatures. However, this method takes more time and manual labor to digitalize a large number of polygons. CLUD has the advantage of providing accurate urban boundaries and is updated at an interval of every 5 or 3 years from 2000 to 2018.
Cities or towns were classified as a homogeneous unit in CLUD. We developed the UIS and UGS fractions to fill the data gap for the requirement of urban environmental management. Here we adopted the advantage of high accuracy and long-time series in mapping urban land from CLUD. Meanwhile we also utilized the highly efficient computation and large storage capacities on the GEE platform. In mapping the CLUD-Urban product, we proposed quantitatively retrieving the UIS and UGS fractions using random forest. Because we used advantages of manual interpretation and intelligent computation, CLUD-Urban exhibits high accuracy and reliability in delineating urban land surface properties.

The potential implications of promoting habitat environment and urban sustainability
The CLUD-Urban product may effectively delineate the "built-up environment" of Chinese cities, especially through maps of surface imperviousness and greening conditions (Kuang, 2020b). CLUD-Urban can be applied to such fields as enhancing the quality of the urban habitat environment, reducing urban heat islands, and improving the prevention of rainstorm and flood disasters. Our previous study indicated that the thermal dissipation strength of forest canopy or lawns in cities may be assessed at the pixel scale and that greening projects are more effective in alleviating urban-heat-island intensity (Kuang et al., 2015). The CLUD-Urban product also helps identify urban flood regulation priority areas based on ecosystem service approaches (Li et al., 2020). The analysis of CLUD-Urban indicates an unprecedented rate and magnitude of urban expansion since the start of the 21st century. The low UGS of cities in western zones indicates the need to promote their greening level (Kuang and Dou, 2020). The CLUD-Urban product can also be used to assess sustainable-development-goal (SDG) targets such as the ratio of land consumption to population growth or the average share of the built-up area that is open space for public use. Therefore, CLUD-Urban has many potential applications in the development of sustainable, liveable and resilient cities.

Limitations of the method and dataset
Although state-of-the-art technologies and methodologies were applied to the development of CLUD-Urban (Dong et al., 2017; Kuang et al., 2020b), the mapping quality of CLUD-Urban can still be improved. For example, the retrieval of UIS and UGS was conducted as a prerequisite of CLUD, which focused on the pixel decomposition of UIS and UGS in urban areas. If the UIS and UGS fractions are parameterized as inputs to a hydrological process model or an urban climate model, the settlements or impervious surfaces located on the outskirts of a city or in rural areas are removed from CLUD. To address this issue, the first-level or second-level classification of CLUD should be merged with UIS and UGS using the method in our previous publication. Mapping CLUD requires a large amount of labor and time as many interpreters are involved in this work. The extraction of urban boundaries might be subjective, and there is a time lag in mapping UIS and UGS. Advanced tools that extract urban boundaries using automatic algorithms still need to be developed.
Fine urban land use and land cover change mapping at a national scale with high-resolution multi-source data may be developed with the aid of big data and cloud platforms (Gong et al., 2020a). The development of a series of new algorithms and models is pivotal for improving the accuracy of datasets in retrieving urban boundaries and land cover composites. However, geoknowledge is still essential for retrieving a high-quality dataset. CLUD-Urban can contribute to the development of sustainable cities, such as with Global Ecosystems and Environment Observation (GEO) and UN-Habitat, in future.

Data availability
All data presented in this paper are available at https://doi.org/10.5281/zenodo.4034161 (Kuang et al., 2020a). These new-version datasets include the UIS and UGS fractions with a 30 m spatial resolution in 2000, 2005, 2010, 2015 and 2018. A detailed metadata description is provided, including contact information.

Conclusion
CLUD-Urban, China's UIS and UGS fraction dataset with a 30 m spatial resolution, was generated using multiple data sources. CLUD-Urban provides detailed delineation of UIS and UGS components in China for the years 2000, 2005, 2010, 2015 and 2018. Compared with other products, the novelty of this dataset is that it treats cities as heterogeneous units at the pixel level, consisting of UIS, UGS and other components. The accuracy of the CLUD-Urban dataset is over 92.65 % using the integrated approach of visual interpretation and prior knowledge. The RMSEs of UIS and UGS fractions are 0.10 and 0.14, respectively. Results from the analysis of urban areas, including UIS and UGS, show large regional differences in China. CLUD-Urban provides fundamental data sources for examining urban environment issues and for delineating intra-urban structure or urban landscape at the national scale.
Author contributions. WK, SZ and XL designed the research; SZ and XL implemented the research; WK, SZ and DL wrote the paper.