Articles | Volume 14, issue 2
Earth Syst. Sci. Data, 14, 651–664, 2022
https://doi.org/10.5194/essd-14-651-2022
Earth Syst. Sci. Data, 14, 651–664, 2022
https://doi.org/10.5194/essd-14-651-2022

Data description paper 15 Feb 2022

Data description paper | 15 Feb 2022

A global seamless 1 km resolution daily land surface temperature dataset (2003–2020)

A global seamless 1 km resolution daily land surface temperature dataset (2003–2020)
Tao Zhang1, Yuyu Zhou1, Zhengyuan Zhu2, Xiaoma Li3, and Ghassem R. Asrar4 Tao Zhang et al.
  • 1Department of Geological and Atmospheric Sciences, Iowa State University, Ames, IA, 50011, USA
  • 2Department of Statistics, Iowa State University, Ames, IA, 50011, USA
  • 3College of Landscape Architecture and Art Design, Hunan Agricultural University, Changsha, Hunan, 410128, China
  • 4Universities Space Research Association, Columbia, MD, 21046, USA

Correspondence: Yuyu Zhou (yuyuzhou@iastate.edu)

Abstract

Land surface temperature (LST) is one of the most important and widely used parameters for studying land surface processes. Moderate Resolution Imaging Spectroradiometer (MODIS) LST products (e.g., MOD11A1 and MYD11A1) can provide this information with moderate spatiotemporal resolution with global coverage. However, the applications of these data are hampered because of missing values caused by factors such as cloud contamination, indicating the necessity to produce a seamless global MODIS-like LST dataset, which is still not available. In this study, we used a spatiotemporal gap-filling framework to generate a seamless global 1 km daily (mid-daytime and mid-nighttime) MODIS-like LST dataset from 2003 to 2020 based on standard MODIS LST products. The method includes two steps: (1) data pre-processing and (2) spatiotemporal fitting. In the data pre-processing, we filtered pixels with low data quality and filled gaps using the observed LST at another three time points of the same day. In the spatiotemporal fitting, first we fitted the temporal trend (overall mean) of observations based on the day of year (independent variable) in each pixel using the smoothing spline function. Then we spatiotemporally interpolated residuals between observations and overall mean values for each day. Finally, we estimated missing values of LST by adding the overall mean and interpolated residuals. The results show that the missing values in the original MODIS LST were effectively and efficiently filled with reduced computational cost, and there is no obvious block effect caused by large areas of missing values, especially near the boundary of tiles, which might exist in other seamless LST datasets. The cross-validation with different missing rates at the global scale indicates that the gap-filled LST data have high accuracies with the average root mean squared error (RMSE) of 1.88 and 1.33, respectively, for mid-daytime (13:30) and mid-nighttime (01:30). The seamless global daily (mid-daytime and mid-nighttime) LST dataset at a 1 km spatial resolution is of great use in global studies of urban systems, climate research and modeling, and terrestrial ecosystem studies. The data are available at Iowa State University's DataShare at https://doi.org/10.25380/iastate.c.5078492 (T. Zhang et al., 2021).

1 Introduction

Land surface temperature (LST) is an important variable for studies of energy balance, evapotranspiration, and ecosystem processes in the monitoring of Earth's resources (Anderson et al., 2010; Long et al., 2020). It has been widely used in various studies such as the urban heat island phenomenon (H. Li et al., 2021; X. Liu et al., 2020; Tang et al., 2017; Yue et al., 2019), hydrology (Bai et al., 2019; Zhang et al., 2017), meteorology (Anderson et al., 2010; Li et al., 2018b), ecology (Sims et al., 2008), and energy systems (Peng et al., 2012; Zhou et al., 2014b). LST varies significantly in both space and time due to the spatiotemporal variation in factors such as solar radiation, atmospheric conditions, and land surface characteristics (Li et al., 2018a; Peng et al., 2014; Zhang et al., 2015). LST can be measured in situ, obtained from land surface modeling, and retrieved by remote sensing (Ford and Quiring, 2019; Sheffield et al., 2018). Remotely sensed LST is by far the most widely obtained and used due to its global spatial coverage, high spatiotemporal resolutions, and available long-term data records.

LST products with a variety of spatial and temporal resolutions have been developed from different sensors/satellites such as (1) high spatial resolution of 60–120 m and low temporal resolution of about every 2–16 d from Landsat (Parastatidis et al., 2017; Roy et al., 2014) and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) (Hulley et al., 2015); (2) coarse spatial resolution of 3–5 km but high sub-daily to sub-hourly temporal resolution from geostationary satellites (Choi and Suh, 2013; Duguay-Tetzlaff et al., 2015; Jiang and Liu, 2014; Trigo et al., 2008; Yu et al., 2009); and (3) moderate spatial resolution of about 1 km and moderate daily temporal resolution from the Moderate Resolution Imaging Spectroradiometer (MODIS) (Wan, 2013, 2014), Visible Infrared Imaging Radiometer Suite (VIIRS) (Guillevic et al., 2014), and Sea and Land Surface Temperature Radiometer (SLSTR) LST (Ghent et al., 2017). Among them, MODIS LST is the most widely used especially for regional and global studies due to its global coverage and long-term and well-calibrated and well-documented data records (since 2000) (Aguilar-Lome et al., 2019; Li et al., 2017; Peng et al., 2012; Sandeep et al., 2021; J. Zhao et al., 2020; Zhou et al., 2019). However, MODIS LST has a large number of missing values due to a variety of factors such as cloud contamination, non-overlapping satellite orbits, and instrumental malfunction (Crosson et al., 2012; Li et al., 2018a; H. Liu et al., 2020; Shen et al., 2015; Wan, 2013).

Filling the missing values of MODIS LST is an effective way to overcome this limitation in the MODIS LST product. Several seamless datasets have been developed in previous studies (Cheng et al., 2021; Li et al., 2018a; Metz et al., 2017; X. Zhang et al., 2021b; B. Zhao et al., 2020); however, they only cover specific regions or have coarse spatiotemporal resolutions (Table S2). Recently, Zhan et al. (2021) produced a global 1 km LST dataset (2003–2019), but only a daily average of LST was included. Shiff et al. (2021) developed a Google Earth Engine (GEE) code and a web app for generating 1 km gap-filled LST by using Climate Forecast System Version 2 (CFSv2) modeled air temperature and MODIS LST data, but they did not consider the naturally spatial variation in LST. A global daily minimum and maximum LST dataset with reasonable spatial pattern that can be used for a variety of studies and applications by scientists and practitioners such as city planners and water resources managers is still not available. Meanwhile, a variety of gap-filling methods have been proposed to fill gaps in MODIS LST. These methods can be divided into four groups (Li et al., 2018a; Weiss et al., 2014; Zhang et al., 2020). The first group is based on data fusion methods, which combine LST data from different satellites or different overpass times (e.g., morning and afternoon) of the same satellite on the same day (Crosson et al., 2012; Duan et al., 2017; Long et al., 2020; Xu and Cheng, 2021; X. Zhang et al., 2020, 2021a). The second group is based on empirical relationships among different methods that were used to estimate the missing values by fitting empirical relationship between LST and auxiliary data (e.g., latitude, longitude, altitude, surface moisture, normalized difference vegetation index, and ground observed LST) (Fan et al., 2014; Ke et al., 2013; B. Li et al., 2021; B. Zhao et al., 2020). The third group is based on the internal spatiotemporal relationship that predicted the missing values with the available LST using algorithms such as temporal interpolation (Kilibarda et al., 2014; Xu and Shen, 2013), spatial interpolation (Ke et al., 2013; Yang, 2004), spatiotemporal interpolation (Sun et al., 2017; Weiss et al., 2014), and multi-dimensional smoothing (Garcia, 2010, 2011; H. Liu et al., 2020; Pham et al., 2019). The fourth group is a hybrid method that combined several methods from the groups mentioned above (Hong et al., 2021; Li et al., 2018a; Metz et al., 2017; Weiss et al., 2014; Xu and Cheng, 2021).

However, most of the current methods have some shortcomings in accuracy and efficiency for producing globally consistent and seamless MODIS-like LST. For example, the data fusion method has the problem of mismatch between LST from different sources and usually cannot fully fill gaps (Crosson et al., 2012). The computational cost of the methods based on the empirical relationship could increase significantly with the increase in spatial resolutions and might not be able to fully capture spatial and temporal variations in LST as the auxiliary data have low temporal resolutions (Fan et al., 2014; Ke et al., 2013; B. Zhao et al., 2020). The temporal interpolation and multi-dimensional smoothing methods are computationally efficient but may miss short-term temporal variations in LST (Kilibarda et al., 2014; Xu and Shen, 2013). The spatial interpolation methods may lead to physically unrealistic features in the interpolated LST when there are a lot of missing observations (Ke et al., 2013; Yang, 2004). The spatiotemporal interpolation methods can capture the short-term changes in LST but are time-consuming due to the use of local moving windows for each pixel (Li et al., 2018a; Weiss et al., 2014). The hybrid methods take the advantages of the methods mentioned above and carry with them some of shortcomings of these methods, and they may actually amplify them in the process of merging data inputted using different methods.

We proposed a spatiotemporal gap-filling framework to gap-fill missing values in the MODIS LST product with good accuracies and high efficiencies. This framework includes two key steps of preprocessing and spatiotemporal fitting. Based on this framework, we developed a global 1 km daily (mid-daytime and mid-nighttime) LST dataset from 2003 to 2020 using the 1 km daily MODIS LST product. The remainder of this paper describes the study area and data (Sect. 2), the proposed spatiotemporal gap-filling approach (Sect. 3), the results and discussion (Sect. 4), data availability (Sect. 5), and conclusions (Sect. 6).

2 Study area and data

The study area is nearly the entire global land surface, including 178 MODIS tiles (Fig. 1). The 1 km daily MODIS LST product version 6 from 2003 to 2020 is the primary data used in this study. It was produced based on the National Aeronautics and Space Administration (NASA) Earth Observing System (EOS) satellites Terra and Aqua (MOD11A1 and MYD11A1) (Wan, 2013, 2014). There are four observations each day from the two satellites (i.e., 10:30 and 22:30 for Terra: T1 and T3; 13:30 and 01:30 for Aqua: T2 and T4 at local time). Another two auxiliary datasets used are the annual MODIS land cover product (MCD12Q1) (Sulla-Menashe and Friedl, 2018) and urban extents derived from nighttime light observations and their surrounding rural areas (Zhou et al., 2014a, 2018). Water pixels from the MCD12Q1 product were excluded in our analysis.

https://essd.copernicus.org/articles/14/651/2022/essd-14-651-2022-f01

Figure 1MODIS data tiles used in gap-filling and cross-validation analysis.

https://essd.copernicus.org/articles/14/651/2022/essd-14-651-2022-f02

Figure 2An overview of the spatiotemporal gap-filling framework (taking T2 as an example).

Download

3 Method

We developed a spatiotemporal gap-filling framework to gap-fill missing values in the MODIS daily LST to produce a seamless 1 km spatial resolution global dataset from 2003 to 2020 (Fig. 2). The framework includes two key steps: (1) data pre-processing (Sect. 3.1) and (2) spatiotemporal fitting (Sect. 3.2). This gap-filling method was applied to MODIS LST at T2 ( 13:30, Aqua Day in Fig. 2) and T4 ( 01:30, Aqua Night in Fig. 2), respectively, to build the 1 km daily LST (maximum and minimum) data. In the subsections below, we describe each of these steps in detail.

3.1 Data pre-processing

Data pre-processing includes two parts: (1) data filtering and (2) daily merge. We first checked the quality of the original MODIS data based on their quality assurance (QA) information and removed data points with error > 3 K. We applied this threshold value because a stricter (or lower) value can exclude most of LST in urban areas (Crosson et al., 2012; Metz et al., 2017). Second, we conducted a daily merge using four observations from the two satellites (Terra and Aqua) on a given day using a modified algorithm from Li et al. (2018a). Taking a pixel with missing value of T2 as an example (Fig. 2), we calculated percent of valid data (PVD) in a year for all four observations. When PVD of T2 is smaller than 5 % and one PVD of T1, T4, or T3 is greater than 5 %, we gap-filled missing values of T2 using data from one of the other three observations based on the order of T1, T4, and T3. If PVD of T1 is greater than 5 %, we estimated T2 by T1 using the linear regression method with T2 as the dependent variable and T1 as the independent variable based on available time series of LSTs in a year. If PVD of T4 is greater than 5 %, we estimated T2 by T4 using the shift method (i.e., adding T4 and adjusting daily difference between T2 and T4 to get T2). If PVD of T3 is greater than 5 %, we estimated T2 by T3 using the shift method (i.e., adding T3 and adjusting daily difference between T2 and T3 to get T2). After the daily merge, we gap-filled the leftover missing values using the spatiotemporal fitting. We selected the threshold of PVD at 5 % because the valid data smaller than 5 % are not enough to capture the spatial pattern of LST in a tile according to our experiments. Details of the linear regression and shift methods can be found in Li et al. (2018a). Specifically, we used the shift method because there is a nonlinear relationship between daytime and nighttime LSTs (i.e., T2 and T3 or T4) (Crosson et al., 2012). We estimated the daily shift using temporally interpolating monthly averaged shift, i.e., monthly mean LST difference between T2 and T3 (or T4), and then we added the daily shift to T3 (or T4) to estimate T2.

https://essd.copernicus.org/articles/14/651/2022/essd-14-651-2022-f03

Figure 3An example of the spatiotemporal fitting algorithm for gap-filling LST.

Download

3.2 Spatiotemporal fitting

The spatiotemporal fitting algorithm includes three steps (Fig. 3). First, we fitted the overall mean of observations in each pixel (i.e., the fitted daily values (temporal trend) in a year using the smoothing spline function for which the independent variable is the day of year) using a smoothing spline function (Green and Silverman, 1994) to capture the overall trend. Specifically, the overall means of T2 and T4 were independently estimated. The time series of daily LST in a year (e.g., LST of T2) can be divided into two components: the overall mean (trend) and daily residual with gaps (daily fluctuations). We used the smoothing spline function for fitting overall trend since this algorithm does not have a hypothesis on the shape of the seasonal trend and is capable of capturing different seasonal patterns of LST across the globe. Second, we spatiotemporally interpolated residuals for each day using a correlation-based method (details in Sect. 3.2.1), in which the missing residual of a target pixel was estimated based on the temporally and linearly regressive correlation between the target pixel and its eight neighboring valid pixels (i.e., with good quality). We used the daily residuals of a year from the target pixel and its neighboring pixels to estimate the missing values. When the value of the target pixels is missing for a specific day, we can still build linear regression functions based on the time series data. We selected 1 % of the uniformly distributed pixels (10 km intervals) as representative neighboring pixels to perform the interpolation of residuals with high efficiency without reducing the accuracy based on our experiments. Moreover, we divided the global land surface area into nine overlapped zones to avoid possible boundary effects (Details in Sect. 3.2.2). Finally, the seamless overall mean and daily residuals were added to obtain the gap-filled LST data.

3.2.1 Interpolation based on correlation weighting (ICW)

An interpolation based on correlation weighting (ICW) technique was used to interpolate the residual of land surface temperature (LST) on each day of a year. This method was inspired by the inverse distance weighting (IDW) interpolation method. The IDW method uses the weighted average values of neighboring sites to estimate the missing value, in which the weight was calculated based on the inverse distances between the target site and its neighboring sites. In the ICW method, the weight between the target site and one of its neighboring sites was calculated based on the correlation between daily values in a year at two locations.

The missing value of the target site VS0 at the time t was estimated based on values of the neighboring sites with Eq. (1).

(1) V S 0 ( t ) = i = 1 i = n w ( S 0 , S i ) V S 0 ( S i , t ) ,

where VS0(t) is the estimated value of target site at the time t; w(S0,Si) is the weight of the ith neighboring site Si, which can be calculated with Eq. (2); VS0(Si,t) is the estimated value of the target site at the time t based on the ith neighboring site, which can be estimated with Eq. (3); and n is the number of neighboring sites.

(2) w ( S 0 , S i ) = cor ( S 0 , S i ) i = 1 i = n cor ( S 0 , S i ) ,

where w(S0,Si) and n are the same as those in Eq. (1), cor(S0,Si) is the Pearson's correlation coefficient between S0 and Si.

(3) V S 0 ( S i , t ) = α i + β i V S i ( t ) ,

where αi and βi are the intercept and slope of the linear function between the target site and the ith neighboring site, which was fitted using ordinary least squares (OLS) method based on the matched time series (days of a year) of LST in the two locations, and VSi(t) is the value of the ith neighboring sites at time t.

3.2.2 Implementation of the ICW method

The ICW method was implemented as follows. First, in order to improve the efficiency, each MODIS tile was divided into blocks with a size of 10 pixels by 10 pixels, and the block center pixels were used as neighboring pixels for interpolating missing residuals. That is, missing residuals in a block can be interpolated based on the values from the eight neighboring block center pixels. Second, in order to ensure that all the block center pixels have valid (good-quality) data for the estimation of other pixels, the missing values in the block center pixels were interpolated using the IDW method. The steps used in this process are as follows: (a) computing the average value of each block; (b) resampling the original MODIS tile of 1200 by 1200 to 120 by 120 and the value of each pixel in the new image being the average value of a block in the original MODIS tile; (c) interpolating missing values in the resampled image based on the IDW method; and (d) assigning interpolated values to the block center pixels without valid values in the original MODIS tile. Third, in order to reduce the possible boundary effects of the interpolated residuals between neighboring blocks, for each pixel of a block, one of the neighboring block center pixels which has the largest correlation coefficient with the target pixel was used for estimation. This process can avoid systematic deviation in the boundary pixels from different blocks that were estimated based on a different combination of block center pixels because all the pixels in a block were interpolated based on eight center pixels of neighboring blocks. Equations (1) and (3) can be simplified to Eq. (4).

(4) V S 0 ( t ) = α m + β m V m ( t ) ,

where αm and βm are the intercept and slope of the linear function between the target pixel and its neighboring pixel with the corresponding maximum correlation coefficient, which was fitted using the ordinary least squares (OLS) method based on the matched time series (days of a year) values of the two locations, and Vm(t) is the value of the neighboring pixel with the maximum correlation coefficient at the time t.

https://essd.copernicus.org/articles/14/651/2022/essd-14-651-2022-f04

Figure 4Division of global regions. Dashed and shaded rectangles indicate the extent of input data and output data, respectively.

Finally, in order to mitigate boundary effects between neighboring tiles, multiple neighboring tiles were mosaicked as a region, and residuals of the block center pixels in the region were interpolated at the same time. The overlapped areas between the two neighboring regions were also considered to avoid possible boundary effects. The interpolation was conducted following the order of the region IDs (Fig. 4). For example, as the ID of North America is 1, it was the first region to be processed.

3.3 Accuracy assessment

We evaluated the accuracy of the gap-filled data using cross-validation by randomly selecting 15 MODIS tiles in 2005, 2010, and 2015 (Fig. 1). In each year, we selected 19 d with the maximum observations of high-quality data (i.e., daily data with valid observations larger than 95 % percentile in a year) in the cross-validation. For each of the selected days, we manually introduced gaps under three scenarios (i.e., excluding 25 %, 50 %, and 75 % of valid pixels) based on the spatial pattern of missing pixels from another day of the year. Then we gap-filled these missing values and compared them with the original values. We calculated root mean square error (RMSE) as the indicator of accuracy (Eq. 5).

(5) RMSE = 1 n i = 1 n LST ^ i - LST i 2 ,

where LSTi and LST^i are original MODIS LST and gap-filled LST values of the ith pixel, and n is the number of the gap-filled pixels.

Table 1Average RMSEs of 15 tiles used in cross-validation analysis of the efficacy of the gap-filling method (unit: C).

Note: “urban” means the urban and surrounding rural areas. The tile level means the accuracy was calculated based on all the excluded pixels of each tile; urban area means the accuracy was calculated based on urban pixels in the excluded areas of each tile. Each RMSE value is the mean of RMSEs from all selected days in 15 selected MODIS tiles.

Download Print Version | Download XLSX

https://essd.copernicus.org/articles/14/651/2022/essd-14-651-2022-f05

Figure 5Scatter plots between gap-filled LST and original MODIS LSTs for daytime and nighttime in the excluded areas used for cross-validation. We used 855 images (15 tiles × 3 years × 19 d) and selected 11 pixels from the excluded area in each image in the scatter plots. Meanwhile, we excluded values of water pixels in accuracy assessment. The color of points represents the density of points, in which the red points represent the highest density, and the blue points represent the lowest density. The solid red line represents the regression line, and the black line is the 1:1 line.

Download

4 Results and discussion

4.1 Accuracy of gap-filled LST

The results of cross-validation indicate the gap-filled LST has high accuracies (Fig. 5 and Table 1). The observed and gap-filled LSTs of representative pixels for different ratios of exclusion scattered along the 1:1 line with RMSE ranging from 2.05 to 2.31 and from 1.35 to 1.62, respectively, for daytime and nighttime (Fig. 5). As shown in Table 1, the average RMSE at tile level (i.e., calculated based on all the excluded pixels of each tile) ranges from 1.20 to 2.13 with an average of 1.88 and 1.33, respectively, for daytime and nighttime. The lowest RMSE occurs in 2010 with 25 % excluding rate for nighttime, while the highest RMSE occurs in 2015 with 75 % excluding rate for daytime. Compared with the accuracies at tile level, the accuracies for urban areas (i.e., calculated based on urban pixels in the excluded areas of each tile) are always higher with RMSE ranging from 1.14 to 2.06 (Table 1).

When the number of missing values in the original LST increases, the gap-filled LST data tend to reduce in accuracy (Fig. 5, Table 1). As shown in Fig. 5, when the excluding rate increases from 25 % to 75 %, the RMSE of LST for daytime and nighttime increases from 2.05 to 2.31 and 1.35 to 1.62 for daytime and nighttime, respectively. This is also true across all years at the tile level and in urban area in Table 1. However, the RMSE values are still within reasonable ranges. When the excluding rate is 75 %, the RMSEs are 2.31 and 1.62, respectively, for daytime and nighttime (Fig. 5). Meanwhile, 88.9 % of the RMSE in Table 1 is lower than 2. Besides, the accuracies of gap-filled LST vary with climate zones and may be also correlated with landforms (Table S1).

https://essd.copernicus.org/articles/14/651/2022/essd-14-651-2022-f06

Figure 6Spatial pattern of the original and gap-filled LSTs at global scale on day 200 of year 2020.

4.2 Spatial and temporal patterns of LST

The examples of global LST data illustrate that the missing values in the original MODIS LST have been effectively gap-filled using the proposed gap-filling algorithm (Fig. 6). In the original MODIS LST, the continuously missing values mainly occur in East Asia, South Asia, and Central Africa for both daytime and nighttime on the example date (Fig. 6). In the gap-filled data, the missing values in these regions were fully gap-filled.

https://essd.copernicus.org/articles/14/651/2022/essd-14-651-2022-f07

Figure 7Spatial pattern of the original and gap-filled LSTs in five representative cities. NA (gray color) in the gap-filled LST is water pixels. Solid black lines are the boundary of urban regions extracted by using global artificial impervious area data with 30 m spatial resolution (Li et al., 2020).

The comparisons of spatial patterns between gap-filled and original MODIS LSTs in representative cities around the world (Fig. 7) illustrate that the missing values in the original MODIS LST have been effectively gap-filled at the city scale. As shown in Fig. 7, there is no missing value in the entire land surface area of the gap-filled data (water pixels were masked as NA). The gap-filled data capture well the urban heat island (UHI) phenomenon (i.e., higher temperature in urban than in the surrounding rural areas). The spatial pattern of the gap-filled LST is reasonable with transition from urban to rural areas, and there are no obvious boundary effects (more details in Sects. S1 and S2). For example, there is no obvious boundary effect between two MODIS tiles in the gap-filled LST data in Houston, USA, which suggests the interpolation of residuals (Sect. 3.2.2) in the proposed method is reliable. The gap-filled LST in the Pearl River Delta region shows a number of small speckles because this region is an agglomeration of sub-areas undergoing rapid urbanization.

https://essd.copernicus.org/articles/14/651/2022/essd-14-651-2022-f08

Figure 8Temporal pattern of average daytime LST from original and gap-filled data in Beijing in the year 2010. The black circles are example days showing maps of the original and gap-filled LST data.

The comparison of the temporal pattern between gap-filled and original MODIS LSTs in a mega-city (Fig. 8) illustrates that the missing data in the original MODIS LST can be effectively and completely gap-filled for the entire period. As shown in Fig. 8, there are several days with limited valid (high-quality) observations in the original MODIS LST in Beijing, China, during daytime in 2010, and these missing values were fully gap-filled in our data for the entire period. When there are only a few missing values in the original LST data (days 28 and 130 in Fig. 8), the gap-filled and original LSTs show similar spatial pattern with significant UHI phenomenon. When there are a large number of missing values in the original LST data (days 219 and 293 in Fig. 8), the gap-filled LSTs can still illustrate the UHI phenomenon, while the original LST data are limited. Therefore, we may get more accurate estimation of annual average LST based on the gap-filled LST than the original LST data.

4.3 Comparison with existing seamless LST data

The accuracy of the resulting gap-filled LST from this study is comparable or better when compared with other reported seamless LST datasets. Our gap-filled LST data show higher accuracies compared with the gap-filled LSTs based on the hybrid spatiotemporal gap-filling method proposed by Li et al. (2018a). These two datasets are most comparable because of the use of a similar accuracy evaluation method (cross-validation at the global scale) in both studies. In the hybrid method proposed by Li et al. (2018a), about 11 % to 60 % of the valid values (Xiaoma Li, personal communication, 2019) were excluded for cross-validation purposes in the urban areas at the global scale, and the average RMSE is 3.29 and 2.68 for daytime and nighttime, respectively. In this study, the average RMSE is 1.83 and 1.28 in the urban and surrounding areas for daytime and nighttime, respectively (Table 1). The gap-filled LSTs based on the data fusion method implemented on GEE (Shiff et al., 2021) were also evaluated at the global scale, but the mean RMSE is 2.7, higher than that of this study. The accuracies of other seamless LST datasets were generally evaluated based on a limited number of in situ LST observations (Zhang et al., 2019; Zhou et al., 2017), which are not exactly the same as satellite LSTs (Hong et al., 2021), and the evaluation in these studies are not directly comparable with our study. For example, the LST data by B. Zhao et al. (2020) reached the average RMSE of 1.59 at the daily level; the LST data by X. Zhang et al. (2021b) showed the RMSE ranging from 2.03 to 3.98 K on the Tibetan Plateau (X. Zhang et al., 2021a); and the LST data using a hybrid method (Hong et al., 2021) has a mean absolute error (MAE) of 1.0 K at the daily level (Table S2).

The gap-filled LST in this study does not have the issue of boundary effect that might exist in the previous methods. Li et al. (2018a) combined several techniques including data fusion (Crosson et al., 2012), spatiotemporal interpolation (Gerber et al., 2018; Weiss et al., 2014), and temporal interpolation methods (Xu and Shen, 2013) to reconstruct daily (mid-daytime and mid-nighttime) LST. The systematic differences between neighboring regions with the use of different gap-filling techniques in the hybrid method may lead to boundary effects (Li et al., 2018a). The data fusion method implemented on GEE (Shiff et al., 2021) directly filled the missing values in MODIS LST using the estimated LST values without consideration of the spatial continuity, which might lead to boundary effects. The seamless LST data produced by Zhao et al. (2020) might also contain boundary effects since different regression methods were used to reconstruct the missing values according to the number of valid pixels. There are no obvious boundary effects in the LST data by X. Zhang et al. (2021b) using the data fusion model proposed by X. Zhang et al. (2021a). However, abrupt changes might occur between the original valid MODIS LST and the gap-filled LST using the data fusion model (Figs. 7 and 8 in the study by X. Zhang et al., 2021a). The gap-filled LST data in this study using the novel framework consisting of two key steps (Sect. 3.2.2) can mitigate boundary effects between neighboring regions (Fig. S1), neighboring tiles (Fig. S2), and within a given tile (Fig. 7 and Sect. S1).

The gap-filled global 1 km daytime and nighttime LST data have advantages regarding spatiotemporal resolutions (i.e., daily minimum and maximum) or coverage (i.e., global) and have significant potential for use in many disciplines of Earth system science and applications (Table S2). In the existing seamless LST datasets, Zhan et al. (2021) produced a global daily average 1 km resolution LST dataset from 2003 to 2019, without resolving by daytime and nighttime. B. Zhao et al. (2020) developed monthly average LST with 5.6 km spatial resolution for China from 2003 to 2017. Cheng et al. (2021) published a daily (mid-daytime and mid-nighttime) 1 km seamless LST of China from 2002 to 2020. X. Zhang et al. (2021b) generated a daily (daytime and nighttime) 1 km all-weather LST dataset for China and its surrounding areas for 2000 to 2020. Li et al. (2018a) produced a 1 km daily (mid-daytime and mid-nighttime) LST dataset only in urban and surrounding rural areas of the United States. Shiff et al. (2021) only provided GEE code for producing global 1 km daily (mean, mid-daytime, and mid-nighttime) LST data. The LST data in this study have a spatial resolution of 1 km and include daily LST at mid-daytime and mid-nighttime with a global coverage from 2003 to 2020, which have higher spatiotemporal resolutions or coverage than other existing published seamless LST datasets.

The gap-filling framework proposed in this study can be efficiently implemented and has advantages regarding computing time compared to other algorithms/methods. For example, the gap-filling method proposed by B. Zhao et al. (2020) was used for monthly 5.6 km resolution LST data, and it may require significant computation time for higher spatiotemporal resolution (daily, 1 km) LST data because it needs to calculate the distance between similar valid pixels and each target pixel (with missing or low-quality value) based on a geographically weighted regression method. The gap-filling method proposed by X. Zhang et al. (2021a) is also complex and time-consuming due to the involvement of multi-source data and a complex parameterization process on a pixel-by-pixel basis. The daily average LST data produced by Zhan et al. (2021) were calculated based on the nonlinear annual temperature cycle (ATC) and diurnal temperature cycle (DTC) modeling on a pixel-by-pixel basis, which is time-consuming for global-scale applications (Hong et al., 2021). The hybrid gap-filling method proposed by Li et al. (2018a) is time-consuming due to the use of a spatiotemporal interpolation (Gerber et al., 2018; Weiss et al., 2014) algorithm, in which the missing value of a pixel at a specific time and location was interpolated by using a quantile regression in the corresponding local spatial and temporal window. In the proposed method in this study, the interpolation of the residual for a pixel at a specific time was implemented by calculating correlation coefficients and fitting linear regression functions using the time series data of the target pixel and its neighboring pixels in the corresponding local window (Sect. 3.2.1). Moreover, 1 % of pixels at central pixels of blocks (10 pixel× 10 pixel) were used as neighboring pixels for interpolation of residual (Sect. 3.2.2) to reduce the amount of calculation, and the relevant parameters (i.e., correlation coefficients and coefficients of linear regression function) between the target pixel and its eight nearest neighboring pixels were calculated only one time for the entire period (365 d for a year) based on the time series of residuals. The reason is that the time series of residuals from two neighboring pixels within a short distance are highly correlated with each other. Our scheme can significantly improve the efficiency for global applications without reducing the accuracy according to our experiments.

The accuracy of the gap-filled LST should not be significantly affected by the land cover type and elevation differences in local spatial windows. LST values from different land cover types and elevations within a small spatial region may be significantly different (X. Zhang et al., 2021a). These differences of LST values can be captured through the temporal pattern of LST (overall mean) by separately fitting the smoothing spline curves (Fig. 3), and the spatiotemporal similarity of residuals between neighboring pixels was gap-filled. The gap-filled LST values are the sum of overall mean and residuals. Therefore, our method can capture the missing values of LST in different land cover types and elevations in local spatial windows.

A limitation of this study is that the gap-filled LST dataset mainly reflects the clear-sky conditions, and future work can focus on recovering cloudy-sky LST to produce an all-weather LST dataset when high-quality ancillary data become available. As only the spatiotemporal information of clear-sky MODIS LST data was used to fill the missing values, the gap-filled pixels mainly reflect the clear-sky LST and might overestimate the actual LST values. Previous studies have attempted to develop methods for obtaining all-weather LST data by incorporating cloudy-sky LST retrieved from passive microwave observations or reanalyzed products (Duan et al., 2017; Long et al., 2020; X. Zhang et al., 2019, 2020, 2021a) or adding clear-sky LST and the LST differences resulting from cloud impacts according to the surface energy balance (SEB) methods (e.g., Jia et al., 2021). However, it is challenging to obtain cloudy-sky LST, and cloud caused LST differences at the global scale in the last 2 decades because the ancillary datasets have lower spatial resolutions and accuracies compared to MODIS LST, leading to complicated algorithms with complicated hypotheses (Long et al., 2020; X. Zhang et al., 2021a). Future studies are needed to develop robust and efficient algorithms for producing global all-weather LST data.

5 Data availability

Data described in this paper can be accessed at Iowa State University's DataShare at https://doi.org/10.25380/iastate.c.5078492 (T. Zhang et al., 2021). The dataset contains 36 sub-datasets (one for each year during daytime and nighttime from 2003 to 2020). Each sub dataset contains LST data of a specific time (daytime or nighttime) and specific year (2003–2020) and is organized by day of year. The data are in GeoTIFF with the georeferenced information embedded. Each file keeps the MODIS ellipse sinusoidal projection with a spatial resolution of 1 km. The unit of LST in GeoTIFF is in 0.1 degrees Celsius (C), and the naming rule can be found in the file “README.pdf”.

6 Conclusions

We propose a framework for filling the gap in long-term Earth observations and geophysical data records that are used by many Earth system science disciplines and applications. We used the proposed method to generate a globally consistent and 1 km daily (mid-daytime and mid-nighttime) MODIS-like LST dataset from 2003 to 2020 using MODIS LST datasets (MOD11A1 and MYD11A1), which has advantages in spatial coverage and spatiotemporal resolutions compared to existing studies. The resulting dataset filled all existing gaps resulting from elimination of poor-quality data seamlessly with high accuracies based on a cross-validation under different rates of missing values for both daytime and nighttime. The average RMSE of gap-filled LST for daytime and nighttime ranges from 1.80 to 2.03 and 1.23 to 1.45 C, respectively, when different percentages of the data were excluded. The results show that the missing values in the original MODIS LST were effectively and efficiently filled, and there is no obvious block effect caused by large areas of missing values, especially near the boundary of tiles, which might exist in other seamless LST datasets. The gap-filled global 1 km daily LST dataset can provide a better data source for multidisciplinary applications such as the urban heat island phenomenon, air temperature estimation, soil moisture estimation, evapotranspiration, and drought monitoring (Phan and Kappas, 2018). However, it is worth noting that the accuracy of the gap-filled LST can be influenced by the rate of missing values, indicating that uncertainties might increase with the increase in missing values in the original dataset. Moreover, future work can focus on diurnal changes in LST by increasing observations within a day.

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/essd-14-651-2022-supplement.

Author contributions

YZ designed the research, TZ implemented the research and wrote the original manuscript, and YZ and ZZ supervised the research. All co-authors revised the manuscript and contributed to the writing.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Earth System Science Data. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

This research was supported by the College of Liberal Arts and Science's (LAS) Dean's Emerging Faculty Leaders award at the Iowa State University and the National Science Foundation (2041859).

Financial support

This research has been supported by the National Science Foundation (grant no. 2041859).

Review statement

This paper was edited by Kirsten Elger and reviewed by two anonymous referees.

References

Aguilar-Lome, J., Espinoza-Villar, R., Espinoza, J. C., Rojas-Acuña, J., Willems, B. L., and Leyva-Molina, W. M.: Elevation-dependent warming of land surface temperatures in the Andes assessed using MODIS LST time series (2000–2017), Int. J. Appl. Earth Obs. Geoinf., 77, 119–128, https://doi.org/10.1016/j.jag.2018.12.013, 2019. 

Anderson, M. C., Kustas, W. P., Norman, J. M., Hain, C. R., Mecikalski, J. R., Schultz, L., González-Dugo, M. P., Cammalleri, C., d'Urso, G., Pimstein, A., and Gao, F.: Mapping daily evapotranspiration at field to continental scales using geostationary and polar orbiting satellite imagery, Hydrol. Earth Syst. Sci., 15, 223–239, https://doi.org/10.5194/hess-15-223-2011, 2011. 

Bai, L., Long, D., and Yan, L.: Estimation of Surface Soil Moisture With Downscaled Land Surface Temperatures Using a Data Fusion Approach for Heterogeneous Agricultural Land, Water Resour. Res., 55, 1105–1128, https://doi.org/10.1029/2018WR024162, 2019. 

Cheng, J., Dong, S., and Shi, J.: 1 km seamless land surface temperature dataset of China (2002-2020), edited by: Natl. Tibet. Plateau Data Center, https://doi.org/10.11888/Meteoro.tpdc.271657, National Tibetan Plateau Data Center, 2021. 

Choi, Y. Y. and Suh, M. S.: Development of Himawari-8/Advanced Himawari Imager (AHI) land surface temperature retrieval algorithm, Remote Sens.-Basel, 10, 1–20, https://doi.org/10.3390/rs10122013, 2013. 

Crosson, W. L., Al-Hamdan, M. Z., Hemmings, S. N. J., and Wade, G. M.: A daily merged MODIS Aqua-Terra land surface temperature data set for the conterminous United States, Remote Sens. Environ., 119, 315–324, https://doi.org/10.1016/j.rse.2011.12.019, 2012. 

Duan, S. B., Li, Z. L., and Leng, P.: A framework for the retrieval of all-weather land surface temperature at a high spatial resolution from polar-orbiting thermal infrared and passive microwave data, Remote Sens. Environ., 195, 107–117, https://doi.org/10.1016/j.rse.2017.04.008, 2017. 

Duguay-Tetzlaff, A., Bento, V. A., Göttsche, F. M., Stöckli, R., Martins, J. P. A., Trigo, I., Olesen, F., Bojanowski, J. S., da Camara, C., and Kunz, H.: Meteosat land surface temperature climate data record: Achievable accuracy and potential uncertainties, Remote Sens.-Basel, 7, 13139–13156, https://doi.org/10.3390/rs71013139, 2015. 

Fan, X. M., Liu, H. G., Liu, G. H., and Li, S. B.: Reconstruction of MODIS land-surface temperature in a flat terrain and fragmented landscape, Int. J. Remote Sens., 35, 7857–7877, https://doi.org/10.1080/01431161.2014.978036, 2014. 

Ford, T. W. and Quiring, S. M.: Comparison of Contemporary In Situ, Model, and Satellite Remote Sensing Soil Moisture With a Focus on Drought Monitoring, Water Resour. Res., 55, 1565–1582, https://doi.org/10.1029/2018WR024039, 2019. 

Garcia, D.: Robust smoothing of gridded data in one and higher dimensions with missing values, Comput. Stat. Data An., 54, 1167–1178, https://doi.org/10.1016/j.csda.2009.09.020, 2010. 

Garcia, D.: A fast all-in-one method for automated post-processing of PIV data, Exp. Fluids, 50, 1247–1259, https://doi.org/10.1007/s00348-010-0985-y, 2011. 

Gerber, F., De Jong, R., Schaepman, M. E., Schaepman-Strub, G., and Furrer, R.: Predicting Missing Values in Spatio-Temporal Remote Sensing Data, IEEE T. Geosci. Remote, 56, 2841–2853, https://doi.org/10.1109/TGRS.2017.2785240, 2018. 

Ghent, D. J., Corlett, G. K., Göttsche, F. M., and Remedios, J. J.: Global Land Surface Temperature From the Along-Track Scanning Radiometers, J. Geophys. Res.-Atmos., 122, 12167–12193, https://doi.org/10.1002/2017JD027161, 2017. 

Green, P. J. and Silverman, B. W.: Nonparametric regression and generalized linear models: a roughness penalty approach, CRC Press, ISBN 0-412-30040-0, 1994. 

Guillevic, P. C., Biard, J. C., Hulley, G. C., Privette, J. L., Hook, S. J., Olioso, A., Göttsche, F. M., Radocinski, R., Román, M. O., Yu, Y., and Csiszar, I.: Validation of Land Surface Temperature products derived from the Visible Infrared Imaging Radiometer Suite (VIIRS) using ground-based and heritage satellite measurements, Remote Sens. Environ., 154, 19–37, https://doi.org/10.1016/j.rse.2014.08.013, 2014. 

Hong, F., Zhan, W., Göttsche, F.-M., Lai, J., Liu, Z., Hu, L., Fu, P., Huang, F., Li, J., Li, H., and Wu, H.: A simple yet robust framework to estimate accurate daily mean land surface temperature from thermal observations of tandem polar orbiters, Remote Sens. Environ., 264, 112612, https://doi.org/10.1016/j.rse.2021.112612, 2021. 

Hulley, G. C., Hook, S. J., Abbott, E., Malakar, N., Islam, T., and Abrams, M.: The ASTER Global Emissivity Dataset (ASTER GED): Mapping Earth's emissivity at 100 meter spatial scale, Geophys. Res. Lett., 42, 7966–7976, https://doi.org/10.1002/2015GL065564, 2015. 

Jia, A., Ma, H., Liang, S., and Wang, D.: Cloudy-sky land surface temperature from VIIRS and MODIS satellite data using a surface energy balance-based method, Remote Sens. Environ., 263, 112566, https://doi.org/10.1016/j.rse.2021.112566, 2021. 

Jiang, G. M. and Liu, R.: Retrieval of sea and land surface temperature from SVISSR/FY-2C/D/E measurements, IEEE T. Geosci. Remote, 52, 6132–6140, https://doi.org/10.1109/TGRS.2013.2295260, 2014. 

Ke, L., Ding, X., and Song, C.: Reconstruction of time-series modis lst in central qinghai-tibet plateau using geostatistical approach, IEEE Geosci. Remote S., 10, 1602–1606, https://doi.org/10.1109/LGRS.2013.2263553, 2013. 

Kilibarda, M., Hengl, T., Heuvelink, G. B. M., Gräler, B., Pebesmatadić, E. P., and Bajat, B.: Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res., 119, 2294–2313, https://doi.org/10.1002/2013JD020803, 2014. 

Li, B., Liang, S., Liu, X., Ma, H., Chen, Y., Liang, T., and He, T.: Estimation of all-sky 1 km land surface temperature over the conterminous United States, Remote Sens. Environ., 266, 112707, https://doi.org/10.1016/j.rse.2021.112707, 2021. 

Li, H., Zhou, Y., Jia, G., Zhao, K., and Dong, J.: Quantifying the response of surface urban heat island to urbanization using the annual temperature cycle model, Geosci. Front., 13, 101141, https://doi.org/10.1016/j.gsf.2021.101141, 2021. 

Li, X., Zhou, Y., Asrar, G. R., Mao, J., Li, X., and Li, W.: Response of vegetation phenology to urbanization in the conterminous United States, Glob. Change Biol., 23, 2818–2830, https://doi.org/10.1111/gcb.13562, 2017. 

Li, X., Zhou, Y., Asrar, G. R., and Zhu, Z.: Creating a seamless 1 km resolution daily land surface temperature dataset for urban and surrounding areas in the conterminous United States, Remote Sens. Environ., 206, 84–97, https://doi.org/10.1016/j.rse.2017.12.010, 2018a. 

Li, X., Zhou, Y., Asrar, G. R., and Zhu, Z.: Developing a 1 km resolution daily air temperature dataset for urban and surrounding areas in the conterminous United States, Remote Sens. Environ., 215, 74–84, https://doi.org/10.1016/j.rse.2018.05.034, 2018b. 

Li, X., Gong, P., Zhou, Y., Wang, J., Bai, Y., Chen, B., Hu, T., Xiao, Y., Xu, B., Yang, J., Liu, X., Cai, W., Huang, H., Wu, T., Wang, X., Lin, P., Li, X., Chen, J., He, C., Li, X., Yu, L., Clinton, N., and Zhu, Z.: Mapping global urban boundaries from the global artificial impervious area (GAIA) data, Environ. Res. Lett., 15, 094044, https://doi.org/10.1088/1748-9326/ab9be3, 2020. 

Liu, H., Lu, N., Jiang, H., Qin, J., and Yao, L.: Filling Gaps of Monthly Terra / MODIS Daytime Land Surface Temperature Using Discrete Cosine Transform Method, Remote Sens.-Basel, 12, 1–16, https://doi.org/10.3390/rs12030361, 2020. 

Liu, X., Zhou, Y., Yue, W., Li, X., Liu, Y., and Lu, D.: Spatiotemporal patterns of summer urban heat island in Beijing, China using an improved land surface temperature, J. Clean. Prod., 257, 120529, https://doi.org/10.1016/j.jclepro.2020.120529, 2020. 

Long, D., Yan, L., Bai, L., Zhang, C., Li, X., Lei, H., Yang, H., Tian, F., Zeng, C., Meng, X., and Shi, C.: Generation of MODIS-like land surface temperatures under all-weather conditions based on a data fusion approach, Remote Sens. Environ., 246, 111863, https://doi.org/10.1016/j.rse.2020.111863, 2020. 

Metz, M., Andreo, V., and Neteler, M.: A New Fully Gap-Free Time Series of Land Surface Temperature from MODIS LST Data, Remote Sens.-Basel, 9, 1333, https://doi.org/10.3390/rs9121333, 2017. 

Parastatidis, D., Mitraka, Z., Chrysoulakis, N., and Abrams, M.: Online global land surface temperature estimation from landsat, Remote Sens.-Basel, 9, 1–16, https://doi.org/10.3390/rs9121208, 2017. 

Peng, S., Piao, S., Ciais, P., Friedlingstein, P., Ottle, C., Bréon, F. M., Nan, H., Zhou, L., and Myneni, R. B.: Surface urban heat island across 419 global big cities, Environ. Sci. Technol., 46, 696–703, https://doi.org/10.1021/es2030438, 2012. 

Peng, S. S., Piao, S., Zeng, Z., Ciais, P., Zhou, L., Li, L. Z. X., Myneni, R. B., Yin, Y., and Zeng, H.: Afforestation in China cools local land surface temperature, Proc. Natl. Acad. Sci. USA, 111, 2915–2919, https://doi.org/10.1073/pnas.1315126111, 2014. 

Pham, H. T., Kim, S., Marshall, L., and Johnson, F.: Using 3D robust smoothing to fill land surface temperature gaps at the continental scale, Int. J. Appl. Earth Obs., 82, 101879, https://doi.org/10.1016/j.jag.2019.05.012, 2019. 

Phan, T. N. and Kappas, M.: Application of MODIS land surface temperature data: a systematic literature review and analysis, J. Appl. Remote Sens., 12, 1, https://doi.org/10.1117/1.jrs.12.041501, 2018. 

Roy, D. P., Wulder, M. A., Loveland, T. R., C. E., W., Allen, R. G., Anderson, M. C., Helder, D., Irons, J. R., Johnson, D. M., Kennedy, R., Scambos, T. A., Schaaf, C. B., Schott, J. R., Sheng, Y., Vermote, E. F., Belward, A. S., Bindschadler, R., Cohen, W. B., Gao, F., Hipple, J. D., Hostert, P., Huntington, J., Justice, C. O., Kilic, A., Kovalskyy, V., Lee, Z. P., Lymburner, L., Masek, J. G., McCorkel, J., Shuai, Y., Trezza, R., Vogelmann, J., Wynne, R. H., and Zhu, Z.: Landsat-8: Science and product vision for terrestrial global change research, Remote Sens. Environ., 145, 154–172, https://doi.org/10.1016/j.rse.2014.02.001, 2014. 

Sandeep, P., Obi Reddy, G. P., Jegankumar, R., and Arun Kumar, K. C.: Monitoring of agricultural drought in semi-arid ecosystem of Peninsular India through indices derived from time-series CHIRPS and MODIS datasets, Ecol. Indic., 121, 107033, https://doi.org/10.1016/j.ecolind.2020.107033, 2021. 

Sheffield, J., Wood, E. F., Pan, M., Beck, H., Coccia, G., Serrat-Capdevila, A., and Verbist, K.: Satellite Remote Sensing for Water Resources Management: Potential for Supporting Sustainable Development in Data-Poor Regions, Water Resour. Res., 54, 9724–9758, https://doi.org/10.1029/2017WR022437, 2018. 

Shen, H., Li, X., Cheng, Q., Zeng, C., Yang, G., Li, H., and Zhang, L.: Missing Information Reconstruction of Remote Sensing Data: A Technical Review, IEEE Geosci. Remote Sens. Mag., 3, 61–85, https://doi.org/10.1109/MGRS.2015.2441912, 2015. 

Shiff, S., Helman, D., and Lensky, I. M.: Worldwide continuous gap-filled MODIS land surface temperature dataset, Sci. Data, 8, 1–10, https://doi.org/10.1038/s41597-021-00861-7, 2021. 

Sims, D. A., Rahman, A. F., Cordova, V. D., El-Masri, B. Z., Baldocchi, D. D., Bolstad, P. V., Flanagan, L. B., Goldstein, A. H., Hollinger, D. Y., Misson, L., Monson, R. K., Oechel, W. C., Schmid, H. P., Wofsy, S. C., and Xu, L.: A new model of gross primary productivity for North American ecosystems based solely on the enhanced vegetation index and land surface temperature from MODIS, Remote Sens. Environ., 112, 1633–1646, https://doi.org/10.1016/j.rse.2007.08.004, 2008. 

Sulla-Menashe, D. and Friedl, M. A.: User Guide to Collection 6 MODIS Land Cover Dynamics (MCD12Q2) Product, User Guid., NASA EOSDIS Land Processes DAAC: Missoula, MT, USA, https://modisrest-dev.ornl.gov/documentation/guides/mcd12q2_v6_user_guide.pdf (updated 15 January 2019), 6, 1–18, 2018. 

Sun, L., Chen, Z., Gao, F., Anderson, M., Song, L., Wang, L., Hu, B., and Yang, Y.: Reconstructing daily clear-sky land surface temperature for cloudy regions from MODIS data, Comput. Geosci., 105, 10–20, https://doi.org/10.1016/j.cageo.2017.04.007, 2017. 

Tang, J., Di, L., Xiao, J., Lu, D., and Zhou, Y.: Impacts of land use and socioeconomic patterns on urban heat island, Int. J. Remote Sens., 38, 3445–3465, https://doi.org/10.1080/01431161.2017.1295485, 2017. 

Trigo, I. F., Peres, L. F., DaCamara, C. C., and Freitas, S. C.: Thermal land surface emissivity retrieved from SEVIRI/Meteosat, IEEE T. Geosci. Remote, 46, 307–315, https://doi.org/10.1109/TGRS.2007.905197, 2008. 

Wan, Z.: Collection-6 MODIS Land Surface Temperature Products Users' Guide, Earth Research Institute, University of California, Santa Barbara, 2013. 

Wan, Z.: New refinements and validation of the collection-6 MODIS land-surface temperature/emissivity product, Remote Sens. Environ., 140, 36–45, https://doi.org/10.1016/j.rse.2013.08.027, 2014. 

Weiss, D. J., Atkinson, P. M., Bhatt, S., Mappin, B., Hay, S. I., and Gething, P. W.: An effective approach for gap-filling continental scale remotely sensed time-series, ISPRS J. Photogramm., 98, 106–118, https://doi.org/10.1016/j.isprsjprs.2014.10.001, 2014. 

Xu, S. and Cheng, J.: A new land surface temperature fusion strategy based on cumulative distribution function matching and multiresolution Kalman filtering, Remote Sens. Environ., 254, 112256, https://doi.org/10.1016/j.rse.2020.112256, 2021. 

Xu, Y. and Shen, Y.: Reconstruction of the land surface temperature time series using harmonic analysis, Comput. Geosci., 61, 126–132, https://doi.org/10.1016/j.cageo.2013.08.009, 2013. 

Yang, J. S.: Estimation of Land Surface Temperature Using Spatial Interpolation and Satellite-Derived Surface Emissivity, J. Environ. Inform., 4, 40–47, https://doi.org/10.3808/jei.200400035, 2004. 

Yu, Y., Tarpley, D., Privette, J. L., Goldberg, M. D., Rama Varma Raja, M. K., Vinnikov, K. Y., and Xu, H.: Developing algorithm for operational GOES-R land surface temperature product, IEEE T. Geosci. Remote, 47, 936–951, https://doi.org/10.1109/TGRS.2008.2006180, 2009. 

Yue, W., Liu, X., Zhou, Y., and Liu, Y.: Impacts of urban configuration on urban heat island: An empirical study in China mega-cities, Sci. Total Environ., 671, 1036–1046, https://doi.org/10.1016/j.scitotenv.2019.03.421, 2019. 

Zhan, W., Hong, F., and Chen, Y.: Global daily average 1 km resolution land surface temperature dataset from 2003 to 2019[DB/OL], https://doi.org/10.12199/nesdc.ecodb.2016YFA0600200.01.005, Chinese Ecosyst. Sci. Data Cent., 2021. 

Zhang, L., Jiao, W., Zhang, H., Huang, C., and Tong, Q.: Studying drought phenomena in the Continental United States in 2011 and 2012 using various drought indices, Remote Sens. Environ., 190, 96–106, https://doi.org/10.1016/j.rse.2016.12.010, 2017. 

Zhang, T., Zhou, Y., Zhu, Z., Li, X., and Asrar, G. R.: A global seamless 1 km resolution daily land surface temperature dataset (2003–2020), Iowa State University [data set], https://doi.org/10.25380/iastate.c.5078492, 2021. 

Zhang, X., Pang, J., and Li, L.: Estimation of Land Surface temperature under cloudy skies using combined diurnal solar radiation and surface temperature evolution, Remote Sens.-Basel, 7, 905–921, https://doi.org/10.3390/rs70100905, 2015. 

Zhang, X., Zhou, J., Göttsche, F. M., Zhan, W., Liu, S., and Cao, R.: A method based on temporal component decomposition for estimating 1-km all-weather land surface temperature by merging satellite thermal infrared and passive microwave observations, IEEE T. Geosci. Remote, 57, 4670–4691, https://doi.org/10.1109/TGRS.2019.2892417, 2019. 

Zhang, X., Zhou, J., Liang, S., Chai, L., Wang, D., and Liu, J.: Estimation of 1-km all-weather remotely sensed land surface temperature based on reconstructed spatial-seamless satellite passive microwave brightness temperature and thermal infrared data, ISPRS J. Photogramm., 167, 321–344, https://doi.org/10.1016/j.isprsjprs.2020.07.014, 2020.  

Zhang, X., Zhou, J., Liang, S., and Wang, D.: A practical reanalysis data and thermal infrared remote sensing data merging (RTM) method for reconstruction of a 1-km all-weather land surface temperature, Remote Sens. Environ., 260, 112437, https://doi.org/10.1016/j.rse.2021.112437, 2021a. 

Zhang, X., Zhou, J., Tang, W., Ding, L., Ma, J., and Zhang, X.: Daily 1-km all-weather land surface temperature dataset for the Chinese landmass and its surrounding areas (TRIMS LST; 2000–2020), National Tibetan Plateau Data Center, https://doi.org/10.11888/Meteoro.tpdc.271252, 2021b. 

Zhao, B., Mao, K., Cai, Y., Shi, J., Li, Z., Qin, Z., Meng, X., Shen, X., and Guo, Z.: A combined Terra and Aqua MODIS land surface temperature and meteorological station data product for China from 2003 to 2017, Earth Syst. Sci. Data, 12, 2555–2577, https://doi.org/10.5194/essd-12-2555-2020, 2020. 

Zhao, J., Yu, L., Xu, Y., Li, X., Zhou, Y., Peng, D., Liu, H., Huang, X., Zhou, Z., Wang, D., Ren, C., and Gong, P.: Exploring difference in land surface temperature between the city centres and urban expansion areas of China's major cities, Int. J. Remote Sens., 41, 8963–8983, https://doi.org/10.1080/01431161.2020.1797216, 2020. 

Zhou, D., Xiao, J., Bonafoni, S., Berger, C., Deilami, K., Zhou, Y., Frolking, S., Yao, R., Qiao, Z., and Sobrino, J. A.: Satellite remote sensing of surface urban heat islands: Progress, challenges, and perspectives, Remote Sens.-Basel, 11, 1–36, https://doi.org/10.3390/rs11010048, 2019. 

Zhou, J., Zhang, X., Zhan, W., Göttsche, F. M., Liu, S., Olesen, F. S., Hu, W., and Dai, F.: A Thermal Sampling Depth Correction Method for Land Surface Temperature Estimation From Satellite Passive Microwave Observation Over Barren Land, IEEE T. Geosci. Remote, 55, 4743–4756, https://doi.org/10.1109/TGRS.2017.2698828, 2017. 

Zhou, Y., Smith, S. J., Elvidge, C. D., Zhao, K., Thomson, A., and Imhoff, M.: A cluster-based method to map urban area from DMSP/OLS nightlights, Remote Sens. Environ., 147, 173–185, https://doi.org/10.1016/j.rse.2014.03.004, 2014a. 

Zhou, Y., Clarke, L., Eom, J., Kyle, P., Patel, P., Kim, S. H., Dirks, J., Jensen, E., Liu, Y., Rice, J., Schmidt, L., and Seiple, T.: Modeling the effect of climate change on U.S. state-level buildings energy demands in an integrated assessment framework, Appl. Energ., 113, 1077–1088, https://doi.org/10.1016/j.apenergy.2013.08.034, 2014b. 

Zhou, Y., Li, X., Asrar, G. R., Smith, S. J., and Imhoff, M.: A global record of annual urban dynamics (1992–2013) from nighttime lights, Remote Sens. Environ., 219, 206–220, https://doi.org/10.1016/j.rse.2018.10.015, 2018. 

Download
Short summary
We generated a global seamless 1 km daily (mid-daytime and mid-nighttime) land surface temperature (LST) dataset (2003–2020) using MODIS LST products by proposing a spatiotemporal gap-filling framework. The average root mean squared errors of the gap-filled LST are 1.88°C and 1.33°C, respectively, in mid-daytime and mid-nighttime. The global seamless LST dataset is unique and of great use in studies on urban systems, climate research and modeling, and terrestrial ecosystem studies.