A global dataset of annual urban extents (1992–2020) from harmonized nighttime lights

Understanding the spatiotemporal dynamics of global urbanization over a long time series is increasingly important for sustainable development goals. The harmonized nighttime light (NTL) time-series composites created by fusing multi-source NTL observations provide a long and consistent record of the nightscape for characterizing and understanding global urban dynamics. In this study, we generated a global dataset of annual urban extents (1992–2020) using consistent NTL observations and analyzed the spatiotemporal patterns of global urban dynamics over nearly 30 years. The urbanized areas associated with locally high intensity human activities were mapped from the global NTL time-series imagery using a new stepwise-partitioning framework. This framework includes three components: (1) clustering of NTL signals to generate potential urban clusters, (2) identification of optimal thresholds to delineate annual urban extents, and (3) check of temporal consistency to correct pixel-level urban dynamics. We found that the global urban land area percentage of the Earth’s land surface rose from 0.22 % to 0.69 % between 1992 and 2020. Urban dynamics over the past 3 decades at the continent, country, and city levels exhibit various spatiotemporal patterns. Our resulting global urban extents (1992–2020) were evaluated using other urban remote sensing products and socioeconomic data. The evaluations indicate that this dataset is reliable for characterizing spatial extents associated with intensive human settlement and high-intensity socioeconomic activities. The dataset of global urban extents from this study can provide unique information to capture the historical and future trajectories of urbanization and to understand and tackle urbanization impacts on food security, biodiversity, climate change, and public well-being and health. This dataset can be downloaded from https://doi.org/10.6084/m9.figshare.16602224.v1 (Zhao et al., 2021).

pose challenges to a variety of fields such as agricultural production (Hou et al., 2021;Jiang et al., 2013), environmental quality Qiang et al., 2012), energy consumption (Guneralp et al., 2017;Chen et al., 2013), biodiversity loss (Seto et al., 2012;Lawler et al., 2014), and human health and well-being Lu et al., 2019;Cao et al., 2018). As the major gathering areas of human activities, urban areas play a critical role in global change and regional development. For instance, despite covering an extremely small percentage of the global land surface, urban areas are estimated to be responsible for more than 90 % of the economic output, 65 % of the energy consumption, and 70 % of the greenhouse gas emission globally (Solecki et al., 2013). Therefore, global records of annual urbanized areas over the past few decades are the basis for exploring historical laws and predicting future pathways of urban growth and for further understanding and tackling ongoing global change and corresponding consequences for the urban system Liu et al., 2020;Seto et al., 2012).
Satellite remote sensing big data have shown great potential for mapping dynamics of urban areas with continuous observations spanning over years to decades at the global scale . Currently, global maps of urban areas have been derived from medium-spatial-resolution data such as the Moderate Resolution Imaging Spectrometer (MODIS) (Schneider et al., 2010(Schneider et al., , 2009 and Landsat imagery Corbane et al., 2018;Gong et al., 2013;Kuang et al., 2021b), as well as high-resolution observations such as synthetic aperture radar (SAR) and Sentinel data Esch et al., 2013;Taubenbock et al., 2012). Most of these maps aim to reveal the spatial distribution of artificial impervious areas, and the extraction results with finer spatial resolutions are more likely to be the real impervious surface. However, the issues of consistency and comparability in the derived urban results of different global maps inevitably hinder the applications of global change studies . To address these challenges, several-decades-long global maps of annual artificial impervious areas were systematically developed, using the improved automatic mapping algorithms and massive Landsat time-series imagery on Google Earth Engine (GEE) platform Liu et al., 2020;Huang et al., 2021). These temporally consistent records of artificial impervious areas provide an essential basis for understanding the urbanization process from the perspective of physical characteristics of the land-use type. Unlike the temporally consistent but broken patches of artificial impervious areas used for revealing the long-term urbanization process, the global urban boundaries obtained from fine-resolution artificial impervious areas by Li et al. (2020b) provide spatially contiguous boundaries of urban extents without hollow regions. Additionally, a global dataset of intra-urban land cover types with a 5-year interval was developed aiming to provide more details of the key urban composites (e.g., green space and impervious areas) (Kuang et al., 2021b). Therefore, different global urban products based on artificial impervious areas have greatly contributed to revealing human modifications to the landscape against the background of rapid global urbanization.
Unlike other traditional remotely sensed products, artificial nighttime light (NTL) observations from satellites have a unique significance in characterizing urbanized areas and urbanization activities Li and Zhou, 2017b). The former mainly focus on reflecting the form and texture information of the landscape, while the latter specializes in providing the coupled information of NTL intensity and location (Ma et al., 2015). The nocturnal lighting signals recorded by different types of NTL imagery are suggested to be salient indicators for revealing the dynamic patterns of human settlements and economic activities from different perspectives . Specifically, the stable NTL products obtained from the Defense Meteorological Satellite Program's Operational Linescan System (DMSP-OLS) are widely applied to delineate the boundary of the urban domain (Li and Zhou, 2017b), while the improved NTL composites obtained from the Visible Infrared Imaging Radiometer Suite (VIIRS) instrument on board the Suomi National Polar-orbiting Partnership (NPP) satellite are suitable for depicting further the interior patterns within the urban domain (Chen et al., 2017;Ma et al., 2018;Zhao et al., 2020). Both of them have advantages in capturing urbanization and socioeconomic activities with high local intensities . Additionally, the fine NTL images with higher capabilities of NTL detection derived from commercial satellites are supposed to identify different types of urban structures and functions Wang and Shen, 2021). In a sense, urban areas derived from NTL images may be more closely correlated to the areas gathered by higher-intensity socioeconomic activities locally.
The characteristics of NTL spatial structures in potential urban domains have been demonstrated to help identify the ambiguous boundary between urban and surrounding nonurban areas, but previous studies were limited regarding spatial coverage or temporal periods. The key of this type of method is to capture differences in NTL signals from rural areas to urban cores at local scales (Zhao et al., 2020b). Su et al. (2015) extracted the urban built-up areas in the urban agglomeration of the Pearl River Delta in four different periods by applying topographic analysis to NTL images to quantify the relative spatial variations in NTL pixels from builtup areas compared to surrounding non-built-up areas. Additionally, some spatially explicit approaches were applied to China and Southeast Asia for extracting their dynamics of urban extents over the past 2 decades. The key idea of this kind of approach is to identify the optimal urban threshold through the robust quadratic relationship observed at the local scales between pixel-level NTL and its spatial gradient Ma et al., 2015;Kamarajugedda et al., 2017). However, the extracted urban extents are likely to be overestimated when applying the above methods to large cities or metropolitan areas (Zhao et al., 2020b). Differently, a quantile-based approach was proposed by Zhou et al. (2018) to remove such bloomed pixels in suburban areas and delineate the urban extents  in different urban clusters at the global scale. Still, the accuracy of the extraction results cannot be guaranteed when applying this algorithm to other city types (Zhao et al., 2020b). Recently, a new framework, through capturing and quantifying the spatial characteristics of NTL gradients from urban cores to rural areas, was developed to identify the urban extents from the potential urban domains (Zhao et al., 2020b). A salient advantage of this framework is to map the urban extents over different spaces and time by effectively characterizing the diverse patterns of NTL spatial gradients at the local scales from urban to surrounding non-urban areas (Zhao et al., 2020b). Thus far, this approach has been initially applied to Southeast Asia for monitoring its annual urban extents , showing great potential in further applications for studies over large regions and long periods.
Given the advances in the fusion of multi-source luminous remote sensing data for acquiring temporally consistent and comparable NTL observations, a global record of annual urban extents aimed at portraying the spatiotemporal coverage of high-intensity or intensive urbanization and socioeconomic activities has its unique advantage in revealing the global urbanization process . Despite a proliferation of remotely sensed urban maps for different spatiotemporal scales being developed, the definitions of urban extents vary with different studies and datasets (Liu et al., 2014;Kuang et al., 2021a). In addition to the artificial impervious areas, the population has also been used as an indicator when delineating the boundary between urban and non-urban areas. For instance, urban center areas in the Global Human Settlement (GHS) dataset were defined jointly with population data and previously derived urban extent data (Florczyk et al., 2019). However, this type of urban extent map, delineated by supporting auxiliary data related to population and socioeconomic factors, is always limited at both spatial and temporal scales. Differently, urban extents delineated from NTL imagery, although they may omit small human settlements with low luminance (e.g., villages and small towns), have been demonstrated to effectively identify areas gathered by relatively active urbanization activities at the local scales (Ma et al., 2015. Additionally, harmonizing the temporally extended NTL dataset by integrating NTL observations from different sensors can provide valuable support for mapping global urban dynamics over the past decades . In short, global time-series maps of NTL-derived annual urban areas that can characterize the relatively developed areas are pressingly needed. To further extend the applications of NTL observations for delineating, understanding, and predicting pathways of global urban growth associated with socioeconomic activities, as well as to better support future sustainable development, we generated a global dataset of annual urban extents from 1992 to 2020 using long-term and consistent nighttime lights. The remainder of this article describes the datasets and pre-processing (Sect. 2), details of the stepwise methods for NTL-based urban mapping (Sect. 3), a discussion of the results and findings (Sect. 4), and conclusions (Sect. 5).

Datasets and pre-processing
We used the harmonized global NTL dataset  as the primary dataset for mapping the global time-series urban extents. This dataset generated using the global integration framework of DMSP and VIIRS data could provide temporally consistent and extended DMSP-like stable NTLs compared to the previous version of DMSP stable NTL imagery . The DMSP stable NTL composites (1992 inter-calibrated by the NTL stepwise-calibration model (Li and Zhou, 2017a) and the extended DMSP-like data (2014-2020) simulated from the VIIRS NTLs using the NTL integration approach  are two major components of this timeseries dataset. These global annual image products consist of 30 × 30 arcsec gridded cell-based stable NTLs with a 0-63 digital number (DN) range. In this study, we downloaded the nearly 30-year-long records of global stable NTLs tagged in GeoTIFF file format at the figshare repository (https://doi.org/10.6084/m9.figshare.9828827.v2, Li et al., 2021). To slightly distinguish the spatial variations of saturated DMSP NTL observations when developing the urban delineating method, a weight coefficient constructed based on the normalized difference vegetation index (NDVI), defined as (1 − NDVI/100), was used to update the DN values of NTL time-series imagery at the global scale.
Other ancillary data such as masks of water and gas flare were used to filter out the urbanization-unrelated illuminations recorded in the NTL time series. Similarly to in previous attempts (Zhao et al., 2020b), water masks were regarded as the aggregated 1 km water percentage maps with values larger than 50 % derived from 250 m MODIS Land Water Mask data (MOD44W), and gas flare masks were obtained from Elvidge et al. (2009). In addition to the global 1 km binary urban maps used previously , the global artificial impervious area (GAIA) data with a spatial resolution of 30 m  in 2018 were also processed to a 1 km binary layer of dense artificial impervious area (AIA, pixels with values of GAIA percentage > 50 %) to provide ancillary support for implementing the stepwise urban mapping approach mentioned below.

Framework of stepwise urban mapping method
We mapped the global annual urban extents (1992-2020) from time series of consistent NTL observations based on a new stepwise-partitioning framework (Fig. 1). This framework further improves and extends the urban mapping ap-proach developed in Zhao et al. (2020b). It includes three components: (1) clustering of NTL signals to generate potential urban clusters, (2) identification of optimal thresholds to delineate annual urban extents, and (3) checking of temporal consistency to correct pixel-level urban dynamics. The "urban extents" derived from NTL imagery in this study were defined as core urban domains, where most built-up areas and partially green spaces and other land-use types with urban functions are inside. The global urban mapping framework was designed based on top-down segmentation and local delineating and bottom-up merging. Details of the stepwise urban mapping method are given in the following sections.

Potential urban cluster map generation
A global map of potential urban clusters (i.e., urban domains including urban cores and suburban and surrounding rural areas) was generated using the NTL clustering and segmentation approach. This approach includes two major sections. First, we applied the marker-controlled watershed segmentation algorithm (Parvati et al., 2008) to generate global initial urban clusters of spatially contiguous pixels with similar DN values. The increasing of DNs in each urban cluster from the periphery to the core spatially corresponds to the intensification of urbanization and socioeconomic activities from rural areas to urban cores. Therefore, each initial urban cluster is an enclosed zone constituting urban and surrounding non-urban areas. Considering that this grayscale morphology algorithm with dilation and erosion processing is sensitive to the spatial variations of NTL DN values, the filtered NTL imagery in 2013 rather than the latest imagery was used here to avoid the over-segmentation of urban clusters caused by the slight heterogeneity of simulated DMSP-like NTLs in the urban domain . Second, we used necessary screening rules to identify and remove non-urban clusters from the initial urban clusters which were unrelated to urbanization. Both the global binary urban reference data mentioned in the datasets and temporal trends of the annual average NTL DNs of each initial urban cluster were designed to screen out the non-urban clusters from the initial ones. Here, an updated urban binary layer in 2018 overlaid by its binary layer of dense AIA (GAIA percentage > 50 %) and corresponding NTL luminance layer (DN > 40) was used to mark the areas associated with dense human settlements and high-intensity human activities, respectively. The initial urban clusters, which exclude such areas or exhibit the abnormal NTL temporal trends unrelated to the urbanization process, were identified as non-urban clusters to be removed for generating the potential urban clusters. As demonstrated in Zhou et al. (2015Zhou et al. ( , 2018 at the global scale, these associated parameters could jointly determine the screening rules for identifying non-urban clusters. More details about cluster screening are presented in Zhao et al. (2020b).

Initial urban extent delineation
The annual urban extents from 1992 to 2020 within each potential urban cluster were automatically delineated by extending a heuristic NTL-based urban mapping approach further developed in Zhao et al. (2020b). The key idea of this approach is to determine a threshold from NTL images of each potential urban cluster in different years by detecting the corresponding feature points of NTL curves. Considering that local patterns of DMSP NTL spatial variations in the potential urban domains over different spaces and time are quite different, the characteristics of NTL quantile curves were first examined to identify their corresponding patterns before each delineation. Building on the work by Zhao et al. (2020b), we revised the parameter details for realizing more robust delineations of the initial urban extents at the global scale.
First, we grouped the patterns of NTL distributions in local areas into two types by characterizing their NTL quantile curves. The quantile curves of NTL NDs at different quantile levels from 0 to 1 were mapped to depict the two patterns of NTL spatial gradients from non-urban areas to urban areas (Fig. 2a). As mentioned in Zhou et al. (2018) and Zhao et al. (2020b), both gradual and non-gradual variation in NTL gradients can be observed during a shift in NTL quantile patterns over long time series. For example, the NTL quantile curves, which nearly coincide with the reference line (2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012), show a gradual variation from non-urban to urban areas, while cases in others years exhibit non-gradual NTL gradient variation with obvious turning features (Fig. 2b).
Second, we identified the NTL thresholds to gradually remove non-urban pixels of the potential urban domains using optimal strategies corresponding to pattern types. Two strategies were jointly applied here to identify the urban extents in areas with various urbanization stages. For example, when the NTL variation pattern from non-urban to urban areas is non-gradual, we remove non-urban pixels using the quantilebased strategy; otherwise, we delineate the urban extents using the parabola-based strategy as an alternative because of the difficulty in distinguishing boundaries among urban, suburban, and rural areas. Both strategies were designed based on the curve feature points of NTL variations from non-urban to urban areas.

Pattern of NTL variation from non-urban to urban areas
We quantified two parameters to classify the NTL variation patterns from non-urban to urban areas into two different types: (1) non-gradual NTL variation pattern and (2) gradual NTL variation pattern. One parameter is the area ratio (Ar) of two parts, the gray part (S R ) below the reference line and the yellow part (S Q ) enclosed by the quantile curve and reference line (Fig. 3a, arrow 1). The other is the included angle (Aα) between two line segments, connected by the turning point  (red dot) and two adjacent intersections (blue dot) on both sides (Fig. 3a, arrow 2). Based on tests at the global scale and the previous parameterization in Zhao et al. (2020b), we used the joint condition of the two parameters calculated from NTL quantile analysis to identify the patterns of NTL variation (Fig. 3b). When Aα is lower than 150 • or Ar is larger than 0.13, the NTL quantile curve is always far above or below the reference line; we identify the variation pattern of the NTL spatial gradient in the potential urban domain as non-gradual. Otherwise, the pattern is considered a gradual one.

Quantile-based strategy for non-gradual pattern
We used the quantile-based strategy to gradually separate urban from surrounding non-urban areas in potential urban clusters that exhibit a non-gradual pattern of the NTL spatial gradient. As examined in Zhou et al. (2018) and Zhao et al. (2020b), this approach can effectively capture the notable NTL gradient change among urban, suburban, and rural areas. We used the turning point of the NTL quantile curve for each urban domain to identify the temporary threshold for gradually removing corresponding non-urban areas. For a cluster with G < 0, its urbanized category in this year was identified as I . For this case, relatively low-DN pixels are dominant; the DN value of the turning point essentially reveals the potential boundary between rural and suburban. Hence, the turning point of the NTL quantile curve constituted by remaining NTL pixels after the first removal corresponds to the boundary between suburban and urban if the second pattern of NTL variation is also non-gradual. Therefore, the DN value of the turning point in the second removal (D2) was identified as the optimal threshold for delineating urban extent from the potential urban cluster in this year. The urbanized category was identified as II for a cluster with G > 0 and Q > 0.5. For this case with relatively balanced high-and low-DN pixels, the estimated threshold is likely to separate urban and suburban. Therefore, the DN value of the turning point in the first removal (D1) was applied to identify the corresponding urban extent in this case. For a cluster with G > 0 and Q ≤ 0.5, the urbanized category was identified as III. For this case with both dim pixels and the strong blooming effect, the estimated threshold in the first removal is relatively low and another removal is necessary when its gradient pattern is again identified as non-gradual. Therefore, the urban boundary was derived after the second removal/iteration using the threshold D2.

Parabola-based strategy for the gradual pattern
We used the parabola-based strategy to map urban extents from potential urban clusters that exhibit a gradual pattern of the NTL spatial gradient. Since pixels with relatively low, medium, and high DN values constituted in these clusters are generally balanced, the relationships of quadratic curves between pixel-level NTL DNs and corresponding brightness gradient (BG) values should always be robust and insensitive (Ma et al., 2015). This strategy includes three steps. First, the pixel-level BG was calculated to depict the neighboring fluctuations in different urban clusters during the past decades. Then, the relationship between NTL and the BG for each case was fitted using a quadratic parabola. Finally, the split points of the fitting curve were identified to obtain the optimal threshold for delineating urban extents. The key for identifying urban extents in this approach is the law behind the variation in NTL and its neighboring gradient from rural to urban areas. Specifically, due to the intensified luminance of human activities from rural to urban areas and the saturation effect of DMSP NTL in urban cores, urban cores with high NTL and rural areas with low NTL are likely to exhibit low BG values (Fig. 5a). At the same time, urban-rural transition zones with medium NTL always have relatively high BG values because of their spatially increasing NTL DNs (Fig. 5a). Therefore, the quantitative relationship between the NTL DN and BG can be characterized by a quadratic parabola function. The extracted urban extents depend on the range of NTL and the fitted results (Fig. 5b). Similarly to in Ma et al. (2015) and Zhao et al. (2020b), areas with NTL DN larger than that of the urban split point always correspond to the dense artificial surface and thus are identified as urban extents in this study (Fig. 5c). Therefore, despite there being no notable changes along the NTL gradient to delineate urban and non-urban, the urban boundary in this type of potential urban cluster can also be captured by the DN value of the urban split point using the parabola-based strategy as an alternative.

Urban sequence updating
Finally, we post-processed the initial urban extents delineated from NTL imagery to build temporally more consistent time-series urban extents. The post-processing scheme includes two major procedures . First, an iteratively temporal filtering procedure was applied to the initial binary urban sequence, for correcting the temporal inconsistency in the initial urban extents using a multi-temporal moving window. Cases of the pixels with isolated states in their temporal-neighborhood urban sequence can be reduced by changing the labels with low values of temporal-consistency probability. Then, for the filtered urban sequence, a logical reasoning check was performed to ensure that the updated sequence follows a reasonable urbanization process (i.e., nonurban to urban). The alternatively occurring urban and nonurban labels in the filtered urban sequence can be replaced based on the major label category. Details of the two procedures are described in .

Comparison with global urban time-series products
Urban extents delineated from NTL data spatially agree with the relatively dense artificial impervious areas at the local scales (Fig. 6). We first aggregated the 30 m GAIA data  to the 1 km resolution to calculate the proportion of impervious surface area (ISA) in each pixel. We then selected urban clusters with different urban sizes and sprawl trends in different continents for illustration. Through overlaying the derived urban boundaries on the ISA percentage maps, we found that our mapped urban results can reveal the spatial distribution of the contiguous areas with relatively high proportions of ISA. These dense artificial impervious areas around urban cores are likely to be the real urban extents containing intensive high-intensity human activities. However, unlike using a single threshold of ISA percentage values to define urban areas (Homer et al., 2015;Zhou et al., 2014), the ISA percentage maps within the NTL-derived urban extents in this study vary over regions and across years. Overall, the total NTL-derived urban areas are largely consistent with the intensive artificial impervious areas at the global and continent scales, corresponding to a reasonable range of ISA percentage levels (20 %-45 %) over the past decades (Fig. 7). We binarized the global ISA percentage maps  at an interval of 5 % to obtain the reference urban boundary corresponding to different ISA levels.
At the global scale, the total area of NTL-derived annual urban extents generally matches well with that of the areas with ISA percentage values larger than around 25 %. This comparison is a bit different on different continents. Most ISA percentage thresholds corresponding to the reference boundaries are generally stable, at around 30 % in Asia, 45 % in Oceania, 20 % in Europe, and 25 % in North America. However, the NTL-derived urban results in Africa agree with the ISA-derived reference results when the dynamic thresholds are set from 55 % in 1992 to 20 % in 2018. This difference is mainly because some artificial impervious areas in developing regions (e.g., small towns) in the early period cannot be well reflected on NTL imagery due to the dim luminance at night . That is, during the earlier period, the NTL-derived urban areas generally detect the urban core with relatively dense artificial impervious areas, while other land types such as the green space in continuously expanded urban areas are likely to be identified as urban in the later period . A similar illustration can also be found in South America. These comparisons suggest that our urban extents derived from NTLs can reveal dense human settlements with ISA percentage levels varying at a reasonable range from 55 % to 5 %.
Urban extents derived from NTL imagery for each potential urban cluster are also largely consistent with the global urban boundaries (GUBs) derived from fine-resolution ISA data (Fig. 8). The multi-temporal GUB dataset delineated using 30 m resolution GAIA data provides a physical urban boundary globally over the past decades . The local urban sizes of NTL-derived results and GUB data are generally consistent worldwide, with R 2 values of no less than 0.86, for six different periods (Fig. 8a). Most of the cluster-based evaluation points are distributed near the 1 : 1 line, showing that the extracted urban results in each potential urban cluster are generally consistent in these two datasets. Specifically, urban areas from this study and the reference dataset in the United States are largely scattered along the 1 : 1 line, presenting high values of the corresponding R 2 larger than 0.9 (Fig. 8c). The comparison results in Asia are also reasonable, despite differences in the total urban areas in some urban clusters being slightly obvious (Fig. 8b). To further analyze the possible reasons for the difference mentioned above, we selected several points aligning far from the 1 : 1 line (1 and 2 in Fig. 8a and 1, 2, 3, and 4 in Fig. 8b) for illustration. By overlaying our urban boundaries on the NTL images and overlaying the urban reference boundaries (GUBs) on the fine-resolution ISA maps (GAIA data), we found that the spatial inconsistency of these two datasets is acceptable (Fig. 8d). First, the pervi-ous surface areas with both large size and dim artificial luminance (e.g., grass, forest, or farmland of the suburban areas between two towns/cities) are excluded in the NTL-derived urban extents, while they are included in the GUB data (see a1 and a2 in Fig. 8d). Second, the scattered artificial impervious areas with medium artificial luminance distributed around the urban fringe areas (e.g., large-scale shanty towns or villages) are excluded in the NTL-derived urban extents, while they are included in the GUB data (see b1 and b2 in Fig. 8d). Third, continuous and brightly lit areas connecting scattered artificial impervious areas (e.g., small cities with scatter patterns isolated away from surrounding megacities) are included in the NTL-derived urban extents, while they are excluded in the GUB data (see b3 and b4 in Fig. 8d). Besides, several hollow areas can also be observed within the NTL-derived urban extents (see a1, a2, b3, and b4 in Fig. 8d). These hollow areas with lower DNs at the local scales correspond to regions without ISA and therefore should be considered non-urban areas. Therefore, compared with the urban  reference boundaries with multi-temporal records, the urban extents detected from NTL imagery using our approach are reasonable when considering the differences in the definitions, data sources, and approaches used in delineating urban boundaries.

Comparison with historical Google imagery
Urban dynamics derived from NTL imagery also agree well with the actual urban extents in historical Google Earth images, for cities with different expansion patterns (Fig. 9). Here, for illustration, the NTL-derived urban boundaries were overlaid on both corresponding NTL images and corresponding high-resolution historical Google Earth maps (in 1992, 1999, 2006, 2013, and 2020) of two selected urban clusters. We find that our mapped results are capable of detecting the spatial changes in urban dynamics over a long period. That is, the delineated boundaries enable us to automatically distinguish urban areas and surrounding nonurban areas for different urbanization processes. For example, Chengdu (China) experienced an evident urban growth over the nearly 30 years; the derived urban boundary from NTL imagery can well depict the complex expansion patterns such as enclave expansion and multi-center continuous development. Furthermore, urban dynamics can be easily delineated from cities experiencing sprawl expansion; one case can be found in Omaha (United States). The results commonly suggest that the delineated urban extents using our approach include both locally intensive impervious areas and continuously lit areas with bright artificial luminance.

Comparison with socioeconomic statistics
Our time-series urban extents delineated from NTL imagery are partly consistent with gridded global population maps (Fig. 10). Considering that urbanization can be characterized in more ways than just human settlement, we applied a population indicator from the Gridded Population of the World (GPW) (Center for International Earth Science Information Network, 2018) to evaluate the reliability of urban extents generated in this study. The gridded populations of each potential urban cluster in different years were calculated to characterize their total populations. In general, we observed the correlation relationships between the urban sizes from NTL data and the total population from GPW data (Fig. 10ae). Although the relationships between urban size and total population are complex and difficult to quantify, a general linear pattern is obvious between the two processed datasets, with correlation values at about 0.5 in different periods. Such weak correlations are reasonable because of the complex processes of urban expansion and population growth, which not only interact with each other but are also influenced by other factors such as economic growth, transportation infrastructure, governance, and planning controls, as well as the characteristics of the environment. Moreover, it was found that large urban areas in different periods always correspond to high populations, which is consistent with general cognition. Besides, the consistency of the spatial extents between urban areas and the high-population grid is also acceptable, such as a current case in Des Moines in the United States (Fig. 10f).

Global and continent levels
Widespread expansions with different degrees of global urban extents have been observed over the past 3 decades. Globally, the area percentage of global urban land as a proportion of the Earth's land surface was 0.22 % in 1992 and   increased to 0.69 % in 2020, with a total increase of about 983 834 km 2 . The trends and magnitudes of urban growth vary among different continents (Fig. 11a). For example, Asia, Europe, and North America show both larger urban distribution and faster urban expansion from 1992 to 2020 compared with other continents. With a faster urban growth rate in Asia than on other continents, the total area of urban land in Asia has surpassed that in Europe and North America, ranking first in the world. Based on the current situation, the total urban area of Europe may exceed that of North America in the near future. The urban land percentage of each continent as a proportion of the global urban land also varies with time. Specifically, the dynamics of this percentage in Asia show a significant increasing trend, while those in Europe and North America show an obvious decreasing trend and a steady trend, respectively. Additionally, slightly increasing trends of urban land as a proportion of the global urban land are also observed in South America and Africa.
The spatiotemporal patterns of urban size also vary considerably worldwide (Fig. 11b). To further analyze the general characteristics of local urban dynamics, the total areas of the derived urban extents for each potential urban cluster in 1992 and 2020 were calculated to characterize their urban size. In general, a notable enlargement in urban size has been observed in Asia from 1992 to 2020 due to its rapid urbanization processes. For example, the total number of the largest urban areas (i.e., area > 2000 km 2 ) of Asia in 1992 is 3, increasing to 16 in 2020. Additionally, many small-size urban areas have been developed into medium-or large-size urban areas worldwide, especially in Europe and Asia. The distribution pattern of large and super-large urban areas (i.e., area > 1000 km 2 ) in North America has remained unchanged since the 1990s.

Country level
The dynamics of urban extents from 1992 to 2020 also vary among different countries. The United States and China are the top two countries in terms of the total urban area in 2020, and the urban expansion in China is faster than that in the United States over the nearly 30 years (Fig. 12a). Although the total urban area in the top two countries occupies more than one-third of the global total in 2020, a slightly decreasing trend of this proportion, from 41 % in 1992 to 35 % in 2020, is observed. This can probably be attributed to a steady urbanization process in the United States and relatively rapid urbanization processes in other developing countries except for China. The urban areas of African countries are generally smaller than those of other countries, and the new expanded urban areas over the past 3 decades are also relatively small.
For the countries with dense urban populations, the growth patterns of urban areas and urbanized proportions also vary with spaces and time (Fig. 12b). We selected the top 20 countries ranked by urban population in 2018 (https://data. worldbank.org, last access: 5 September 2021) for illustration. By 2020, the total urban area in the United States and China occupies about 0.22 % and 0.13 % of the global urban area. Moreover, more than 170 000 km 2 of land has been urbanized in China since the 1990s, showing a stronger momentum of increase in urban extents in the future than the remaining countries. At the country level, the urbanized proportion of the total land area in developed countries such as Japan, the United Kingdom, South Korea, and Italy is higher, followed by Germany, France, and the United States. Besides, among these countries with a high proportion of urban extents, South Korea and Italy have experienced a relatively rapid urbanization process from 1992 to 2020.

City level
The spatial patterns of urban expansion at local scales from 1992 to 2020 in China and the United States are quite different (Fig. 13). We mapped the spatiotemporal distribution of the urban extents at both regional and local scales to further compare the historical pathways of the newly expanded urban areas since 1992 in developed as well as developing countries. In general, the spatial extents of the majority of urban areas over the nearly 30 years in the United States are relatively stable, while those in China tend to be enlarged at different degrees. For instance, expansions in metropolises of the United States like Chicago and Houston always occur in the early period. In contrast, slight expansions in small cities (e.g., Clarksville and Omaha) during the last decade can also be detected. However, urban areas in China exhibit diverse patterns of spatial expansion over the past decades in terms of expansion size, pace, direction, and shape. For instance, Chengdu, Xi'an, and Hefei in China have developed into large cities due to the rapid agglomerations of population, industry, and economic activities since the turn of the new century. Moreover, expansions in metropolises of China like Beijing and Shanghai are also evident. As a valley city, the urban sprawl in Lanzhou is mainly restricted by the landform, exhibiting a narrow shape of the strip.

Data availability
The global dataset of annual urban extents (1992-2020) from nighttime light imagery is available at https://doi.org/10.6084/m9.figshare.16602224.v1 . This dataset was tagged in GeoTIFF file format,  with a spatial resolution of 30 arcsec (∼ 1 km) under the WGS84 geographic coordinate system. A detailed description of this dataset is also provided. The uploaded imagery can be processed using free GIS software such as QGIS. The harmonized global NTL dataset used as the primary dataset for mapping the global time-series urban extents is available at https://doi.org/10.6084/m9.figshare.9828827.v5 . Additionally, other ancillary data used in this study include water masks derived from MODIS MOD44W data, gas flare data (Elvidge et al., 2009), binary urban map data , and GAIA data .

Conclusions
In this study, we generated a global dataset of annual urban extents (1992-2020) using consistent NTL observations and analyzed the spatiotemporal patterns of global urban dynamics at different levels over the past 3 decades. The advantage of this dataset is to provide spatially explicit global urban maps with long temporal coverage for depicting the dynamics of urban areas associated with locally high intensity human activities. Based on previous studies, a new stepwise-partitioning framework, including potential urban cluster identification, initial urban extent delineation, and urban sequence updating, was developed to generate the annual urban extents from harmonized NTL time-series imagery at the global scale. Then, we evaluated the global urban extents (1992-2020) using a variety of reference data including fine-resolution ISA data, other urban boundary data, Google Maps, and gridded population data. Next, the spatiotemporal patterns of the global urban dynamics since the 1990s were further discussed at the global, continent, country, and city levels.
The evaluations indicate that this dataset of urban extents is reliable for characterizing spatial extents associated with intensive human settlements and high-intensity socioeconomic activities, over space and time. In general, the global urban land area as a percentage of the total Earth's land surface is 0.22 % in 1992 and increased to 0.69 % in 2020. Specifically, Asia, Europe, and North America show a larger urban distribution and faster urban expansion than other continents. The annual percentages of urban land as a proportion of the global urban land in Asia show a significant increasing trend. In contrast, those in Europe and North America show an obvious decreasing trend and a steady trend, respectively. The United States and China are the top two countries in terms of the total urban area in 2020, and the urban expansion in China is faster than that in the United States over the nearly 30 years. Overall, the spatial extents of the majority of urban areas since the 1990s in the United States are generally stable, while those in China tend to be enlarged at different degrees.
This study further extends the applications of NTL remote sensing to urban-related studies. The stepwise urban mapping framework improved in this study specialized in identifying the urbanized areas by classifying NTL signals of the potential urban domains, including urban cores and suburban and rural areas. Differently from other global urban mapping time-series products (e.g., artificial impervious areas), our dataset is unique and advanced in its capacity to identify areas associated with locally high intensity urbanization and socioeconomic activities. It should be noted that some developed areas, such as areas with extensive luminosity loss due to the policy or status of light turndown or turnoff at night , may be omitted in this dataset. This limitation may be reduced by fusing NTL signals with other effective information mined from fine-resolution ISA data  and social media big data (Ma, 2018).
In general, our dataset shows great potential in various urban studies, such as studies of urbanization and the corresponding impacts on land use, habitat quality, urban heat islands, and urban climate.
Author contributions. YZ and MZ designed the study; XL collected the data; YZ, CC, and MZ discussed the method and analysis sections; MZ implemented the study and drafted the manuscript; YZ, CC, XL, SS, and CS reviewed and revised the manuscript.
Competing interests. At least one of the (co-)authors is a member of the editorial board of Earth System Science Data. The peerreview process was guided by an independent editor, and the authors also have no other competing interests to declare.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements.
We would like to thank the highperformance computing support from the Center for Geodata and Analysis, Faculty of Geographical Science, Beijing Normal University (https://gda.bnu.edu.cn/, last access: 5 September 2021). We appreciate the reviewers and handling editor for their constructive comments and suggestions.
Financial support. This research has been supported by the National Natural Science Foundation of China (grant nos. 42041007 and 42101462) and the China Postdoctoral Science Foundation (grant no. 2021M690019).
Review statement. This paper was edited by Alexander Gruber and reviewed by two anonymous referees.