Articles | Volume 16, issue 11
https://doi.org/10.5194/essd-16-5449-2024
https://doi.org/10.5194/essd-16-5449-2024
Data description paper
 | 
28 Nov 2024
Data description paper |  | 28 Nov 2024

Global 30 m seamless data cube (2000–2022) of land surface reflectance generated from Landsat 5, 7, 8, and 9 and MODIS Terra constellations

Shuang Chen, Jie Wang, Qiang Liu, Xiangan Liang, Rui Liu, Peng Qin, Jincheng Yuan, Junbo Wei, Shuai Yuan, Huabing Huang, and Peng Gong
Abstract

The Landsat series constitutes an unparalleled repository of multi-decadal Earth observations, serving as a cornerstone in global environmental monitoring. However, the inconsistent coverage of Landsat data due to its long revisit intervals and frequent cloud cover poses significant challenges to land monitoring over large geographical extents. In this study, we developed a full-chain processing framework for the multi-sensor data fusion of Landsat 5, 7, 8, and 9 and MODIS Terra surface reflectance products. Based on this framework a global 30 m resolution daily seamless data cube (SDC) of land surface reflectance was generated, spanning from 2000 to 2022. A thorough evaluation of the SDC was undertaken using a leave-one-out approach and a cross-comparison with NASA's Harmonized Landsat and Sentinel-2 (HLS) products. The leave-one-out validation at 425 global test sites assessed the agreement between the SDC with actual Landsat surface reflectance values (not used as input), revealing an overall mean absolute error (MAE) of 0.014 (the valid range of surface reflectance values is 0–1). The cross-comparison with HLS products at 22 Military Grid Reference System (MGRS) tiles revealed an overall mean absolute deviation (MAD) of 0.017 with L30 (Landsat 8-based 30 m HLS product) and a MAD of 0.021 with S30 (Sentinel-2-based 30 m HLS product). Moreover, experimental results underscore the advantages of employing the SDC for global land cover classification, achieving a sizable improvement in overall accuracy (2.4 %–11.3 %) over that obtained using Landsat composite and interpolated datasets. A web-based interface has been developed for researchers to freely access the SDC dataset, which is available at https://doi.org/10.12436/SDC30.26.20240506 (Chen et al., 2024).

1 Introduction

Earth observation (EO) data acquired by satellite sensors are fundamental to global land monitoring (Markham and Helder, 2012; Song et al., 2018; Wulder et al., 2022), providing critical information sources with unparalleled spatial and temporal coverage at a low cost. Over the past decades, satellite remote sensing has emerged as a prominent technology in Earth system science (Gong et al., 2023; Yang et al., 2013), contributing to the monitoring of land surface dynamics (Gong et al., 2013, 2019; Huang et al., 2017; H. Liu et al., 2020, 2021; Song et al., 2018), land surface phenology (Bolton et al., 2020; Piao et al., 2019), forest (Hansen et al., 2008, 2013), water (Ji et al., 2018; Pekel et al., 2016; Pickens et al., 2022; Sagan et al., 2020), and urbanization (Gong et al., 2012, 2020; X. Liu et al., 2020).

The Landsat series stands as the most enduring source of Earth observations, with a historical archive extending back to 1972 (Wulder et al., 2022). This longevity, combined with its relatively high spatial resolution, rigorous radiometric calibration, and free-access policy, has made Landsat a cornerstone for monitoring global terrestrial environments (Markham and Helder, 2012; Wulder et al., 2022). Nevertheless, the utility of Landsat data inevitably encounters certain limitations. A notable constraint is its relatively low temporal frequency, revisiting each area on Earth every 16 d (8 d when there are two Landsat satellites in orbit with an 8 d offset) (Zhu et al., 2015b). This issue is further compounded by the presence of cloud and cloud shadow, which can introduce significant temporal gaps in the acquisition of clear-sky observations, especially in cloudy regions (Zhu et al., 2016). Moreover, Landsat time series observations typically exhibit irregularities in both observation frequencies and acquisition dates due to the presence of cloud contamination and the geographically heterogeneous Landsat overpass coverage (Li and Roy, 2017). These irregularities present significant challenges when utilizing Landsat for large-scale monitoring of land cover and land use change (Potapov et al., 2020; Zhang et al., 2024). Therefore, the availability of Landsat datasets characterized by consistency in both temporal and spatial dimensions is crucial for facilitating various global environmental studies (Khan et al., 2024; Li et al., 2023; Pickens et al., 2022; Potapov et al., 2021a, b, 2022a, b; Song et al., 2021; Turubanova et al., 2023).

One conventional approach employed to mitigate data gaps in optical remote sensing is image compositing, which selects the highest-quality observations within a pre-defined time interval based on specific criteria to create seamless clear images at large scales (Jin et al., 2023; Qiu et al., 2023; White et al., 2014). Historically, image compositing has been mostly applied to coarse-resolution data with high temporal frequency (Qiu et al., 2023), such as that obtained by the Advanced Very High Resolution Radiometer (AVHRR) and Moderate Resolution Imaging Spectroradiometer (MODIS) sensors (Chuvieco et al., 2005; Cihlar et al., 1994; Holben, 1986; Huete et al., 2002; Wolfe et al., 1998). The use of image compositing for medium-resolution data (e.g., Landsat) was comparatively uncommon before the advent of Landsat free-access policy in 2008 (Qiu et al., 2023). In recent years, many image compositing algorithms have been developed for Landsat data (Frantz et al., 2017; Griffiths et al., 2019; Jin et al., 2023; Nelson and Steinwand, 2015; Qiu et al., 2023; Roy et al., 2010; White et al., 2014). Nevertheless, Landsat image compositing is not without its limitations. Firstly, due to the lack of frequent Landsat observations (especially in cloudy areas), it may take several months or even years to provide a composite Landsat image, which can cause problems if there are land cover or phenological changes (Zhu et al., 2015b). Furthermore, the compositing process may introduce distortions to the temporal dynamics of Landsat time series (Qiu et al., 2023), thereby hampering subsequent applications that depend on precise temporal information.

Landsat interpolation methods also provide the capability to generate seamless synthetic Landsat images (Brooks et al., 2012; Malambo and Heatwole, 2016; Yan and Roy, 2018, 2020; Zhu et al., 2015b). Linear interpolation is commonly employed to address missing values in Landsat time series (Defourny et al., 2019; Inglada et al., 2017; Tran et al., 2022), though it may not be highly effective for applications like land cover classification (Che et al., 2024). To improve performance, more sophisticated interpolation methods have been developed (Brooks et al., 2012; Malambo and Heatwole, 2016; Yan and Roy, 2018; Zhou et al., 2022; Zhu et al., 2015b). Nevertheless, a significant limitation of these methods is their dependence on numerous clear-sky Landsat observations for accurate time series estimation (Chen et al., 2021; Zhu et al., 2015b). This requirement poses a considerable obstacle to their large-scale applications, particularly in cloudy areas. Moreover, the performance of interpolation-based methods relies on the careful tuning and selection of model parameters, thereby encountering the challenge of balancing between over-fitting and under-fitting (Wu et al., 2022; Zhou et al., 2022). Large-scale remote sensing applications prefer processing algorithms capable of automatic adaptation to diverse input data conditions, eliminating the need for manual parameter tuning (Chen et al., 2023).

The spatiotemporal fusion technique provides a promising solution which aims at incorporating more frequent coarse-resolution observations to enhance the temporal frequency of Landsat and generate synthetic Landsat-like dense time series images (Chen et al., 2023, 2021; Gao et al., 2006; Liu et al., 2022; Zhu et al., 2010, 2016). For example, the Terra/Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) provides frequent coarse-resolution observations at a 250, 500, and 1000 m spatial resolution with a near-daily revisit frequency (Schaaf et al., 2002). The MODIS land bands have comparable center wavelengths to the Landsat Enhanced Thematic Mapper Plus (ETM+) sensor, making the MODIS data the ideal input for the spatiotemporal fusion with Landsat (Gao et al., 2006). Many different types of Landsat–MODIS spatiotemporal fusion methods have been developed (Chen et al., 2023; Gao et al., 2006, 2022; Goyena et al., 2023; Guo et al., 2020; Hilker et al., 2009a; Liu et al., 2019, 2022; Mizuochi et al., 2017; Shi et al., 2022; Wang et al., 2017, 2020; Zhang et al., 2013; Zhu et al., 2010, 2016; Zurita-Milla et al., 2008) and applied to land cover and land surface phenology monitoring (Abowarda et al., 2021; Battude et al., 2016; Chen et al., 2018; Gervais et al., 2017; Hilker et al., 2009b; Y. Li et al., 2017; Senf et al., 2015; Singh, 2011; Tian et al., 2013; Walker et al., 2012; Watts et al., 2011). The utility of multi-sensor data fusion in facilitating land cover and land use analyses has also been validated empirically in previous studies (Carrasco et al., 2019; Chen et al., 2017; Yin et al., 2019).

To the best of our knowledge, there is currently no global 30 m seamless dataset of land surface reflectance generated by fusing Landsat and MODIS products available to the community. Although there have been numerous studies dedicated to developing algorithms for missing data reconstruction and multi-sensor data fusion (Shen et al., 2015; Zhu et al., 2018), unified and generalized frameworks for effective and efficient Landsat–MODIS data fusion on a global scale have not yet been explored extensively. To address this need, in this study, we (i) developed a full-chain processing framework for the multi-sensor data fusion of Landsat 5, 7, 8, 9 and MODIS Terra surface reflectance products; (ii) generated a global 30 m daily seamless data cube (SDC) of land surface reflectance, covering the period from 2000 to 2022; (iii) evaluated the reconstruction accuracy of the proposed framework quantitatively using a leave-one-out strategy at 425 global test sites; (iv) evaluated the quality of the SDC quantitatively by cross-comparing it with the Harmonized Landsat and Sentinel-2 (HLS) products at 22 Military Grid Reference System (MGRS) tiles; (v) evaluated the performance of using the SDC for global-scale land cover classification against Landsat composite and interpolated datasets; and (vi) provided a web-based interface for researchers to freely access the SDC dataset.

2 Materials

2.1 Landsat Collection 2 level-2 surface reflectance products

We collected a comprehensive dataset comprising 6 564 546 Landsat Collection 2 level-2 surface reflectance (L2SR) images from the US Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center, including data acquired by the Landsat 8 and 9 Operational Land Imager (OLI), Landsat 7 Enhanced Thematic Mapper Plus (ETM+), and Landsat 5 Thematic Mapper (TM). This dataset covers most of global land surface except Antarctica, spanning from 2000 to 2022.

The L2SR products are generated through a sequence of processing steps applied to Landsat raw data. These steps include reprojection, radiometric calibration, geometric correction, atmospheric correction, and cloud masking. Compared to previous Landsat Collection 1 products, Collection 2 products have markedly improved the Landsat absolute geolocation accuracy using Landsat 8 geolocational imaging performance harmonized with the European Space (ESA) Agency Global Reference Image (GRI) data (Crawford et al., 2023). The Landsat 5 and 7 TM and ETM+ data were atmospherically corrected using the Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) (Masek et al., 2006), and the Landsat 8 and 9 OLI data were corrected using the Land Surface Reflectance Code (LaSRC) (Vermote et al., 2016). The Fmask algorithm (Zhu and Woodcock, 2012) was applied to detect cloud and cloud shadow in Landsat images. The L2SR products are spatially referenced using Worldwide Reference System-2 (WRS-2) path rows and provided in Universal Transverse Mercator (UTM) projection. Figure 1 illustrates the spatial and temporal distribution of all L2SR images used in this study. As listed in Table 1, we used blue, green, red, near-infrared (NIR), and two shortwave infrared (SWIR1 and SWIR2) bands for the generation of the SDC. Although Landsat sensors have a relatively narrow field of view (15°), the bidirectional reflectance distribution function (BRDF) normalization was found to be effective in making multi-temporal Landsat observations more consistent (Claverie et al., 2015; Roy et al., 2016b). We applied the C-factor technique and global constant BRDF coefficients provided by Roy et al. (2016a) to obtain the Landsat nadir BRDF-adjusted reflectance (NBAR) dataset for SDC generation.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f01

Figure 1Spatial and temporal distribution of L2SR images used in this study.

Table 1Attributes of Landsat 5 TM, 7 ETM+, and 8–9 OLI and MODIS Terra products (Markham and Helder, 2012; Masek et al., 2020; Morisette et al., 2002).

Download Print Version | Download XLSX

2.2 MODIS nadir BRDF-adjusted reflectance (NBAR) products

We reprocessed the MODIS MOD09GA version 6.1 surface reflectance product to obtain a daily seamless (without missing values) 500 m MODIS NBAR dataset as detailed in Liang et al. (2024). The official MODIS MCD43A4 NBAR product was not employed due to persisting concerns associated with it, including the prevalence of missing data and residual influence of cloud and aerosols (Liang et al., 2024).

All MODIS Terra Surface Reflectance MOD09GA version 6.1 images for the period 2000–2022 were acquired from NASA Earthdata. The MOD09GA product is tiled in the MODIS sinusoidal system with a spatial resolution of about 500 m. A set of processing algorithms was applied to these MOD09GA images to derive a daily 500 m resolution seamless MODIS NBAR data cube (Liang et al., 2024), including three main stages: (i) land-cover-based BRDF correction with the kernel-driven RossThick LiSparse reciprocal (RTLSR) model using parameters derived from the MCD43A1 BRDF model parameter dataset and land cover maps from the MCD12Q1 land cover product, (ii) outlier removal and gap filling using the ecosystem curve-fitting method, and (iii) sliding-window temporal smoothing using the Savitzky–Golay filter. The generated MODIS NBAR seamless dataset provides seven spectral bands that are commonly used for terrestrial applications, six of which that have compatible bandwidths with Landsat sensors are employed for SDC generation as listed in Table 1.

2.3 The Harmonized Landsat and Sentinel-2 V2.0 surface reflectance product

NASA's Harmonized Landsat and Sentinel-2 (HLS) V2.0 products were used in cross-comparison with the generated SDC product for quantitative assessment. HLS products combine observations from the Landsat Operational Land Imager (OLI; since 2013) and the Sentinel-2 Multi-Spectral Instrument (MSI; since 2015), providing global surface reflectance data at a 30 m spatial resolution with a theoretical revisit interval of 2–3 d at the Equator and even more frequent revisits in areas of higher latitudes (Claverie et al., 2018). The creation of HLS products involves four major processes (Claverie et al., 2018): (i) atmospheric correction and cloud masking, (ii) geometric resampling and geographic registration, (iii) BRDF normalization, and (iv) bandpass adjustment. HLS products are gridded into the UTM Military Grid Reference System (MGRS) used by the Sentinel-2 products. The HLS S30 product (Sentinel-2-based 30 m product) is derived from 10 and 20 m Sentinel-2 bands using overlapping-area-weighted averaging, and the L30 product (Landsat-based 30 m product) is reprojected to the same Sentinel-2 grid using cubic convolution interpolation (Claverie et al., 2018). Both HLS L30 and S30 products are atmospherically corrected using the Land Surface Reflectance Code (LaSRC) (Vermote et al., 2016). The cloud mask used in HLS products is a combination of the mask derived from the Fmask algorithm and the mask derived from the LaSRC algorithm. The HLS L30 and S30 products are delivered as NBAR, using the C-factor technique and global constant BRDF coefficients provided by Roy et al. (2016a). A bandpass adjustment is applied to the S30 product using a global constant set of coefficients (Claverie et al., 2018).

3 Methods

Figure 2 illustrates the overview of the SDC processing chain, comprising five key processing steps.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f02

Figure 2Overview of the SDC processing chain for each task unit.

Download

3.1 Gridding and reprojection

The UTM-based Military Grid Reference System (MGRS) was chosen as the projection system for the SDC product and was also adopted by ESA's Sentinel-2 products and NASA's HLS products (Claverie et al., 2018). It is noteworthy that our adopted grid slightly deviates from the Sentinel-2 grid. The original Landsat coordinate system aligns the UTM coordinate origin with a pixel center, while the Sentinel-2 grid aligns it with a pixel corner (Claverie et al., 2018). We expanded the Sentinel-2 tiles by 15 m in each direction to align them with the original Landsat coordinate system, minimizing the need for resampling Landsat data. The SDC product is gridded onto this modified MGRS system, with a tile size of 109.83 km×109.83 km (3661×3661 Landsat pixels). Although it has been found that overlaps among neighboring MGRS tiles may result in resource wastage to some extent (Bauer-Marschallinger and Falkner, 2023), implementing this UTM-based projection system serves to minimize the need for resampling Landsat data, thereby reducing the introduction of additional errors (Dwyer et al., 2018).

The metadata for all Landsat and MODIS images has been pre-indexed into a database. For each SDC generation task unit, the metadata regarding all source data falling within specified spatial and temporal ranges can be efficiently retrieved from the database. Subsequently, all involved Landsat and MODIS source data are reprojected and gridded onto our UTM-based grid using nearest-neighbor resampling for Landsat and bilinear resampling for MODIS. MODIS data are resampled to 30 m spatial resolution to streamline subsequent processing steps, and the computational costs for this upscaling operation are negligible. Table 2 lists the input data products across distinct time periods utilized in SDC generation.

Table 2Input product specifications across distinct periods.

Download Print Version | Download XLSX

3.2 Landsat cloud masking

Cloud and cloud shadow masks are essential for removing contaminated Landsat pixels in SDC generation. We used Fmask (Zhu and Woodcock, 2012) detection results as primary cloud and cloud shadow indicators. There are still a few clouds and heavy aerosols in Landsat images that remain undetected by the current Fmask method, which may have noteworthy implications for subsequent data processing procedures (Chen et al., 2021). To mitigate this issue, an enhanced cloud filtering approach is employed to reduce residual clouds and cloud shadows in Landsat imagery, which comprises three major steps:

  • 1.

    Firstly, the cloud and cloud shadow masks generated by the Fmask algorithm are expanded by a margin of 150 m (Claverie et al., 2018). This dilation process is designed to exclude potentially contaminated pixels adjacent to the initially detected cloud and shadow areas.

  • 2.

    Secondly, a brightness-threshold filter combined with a spatial filter is applied to remove remaining highly reflective pixels, which is achieved by cross-comparing with MODIS NBAR data. This filter operates on a patch-wise basis, with each patch measuring 20 × 20 Landsat pixels. For a given image patch, we commence by computing the ratio of Landsat reflectance (summed over the six spectral bands) to MODIS reflectance for each pixel. Following this, if the median of all these Landsat–MODIS reflectance ratios within this image patch surpasses a pre-defined threshold, the entire image patch is then flagged as cloudy. The threshold was set to 2 in this study as this value effectively eliminates most residual clouds without being overly aggressive.

    (1) median i ρ i Landsat ( x , y ) i ρ i MODIS ( x , y ) , , for all pixels within the image patch > 2 ,

    where (x,y) indicates the pixel location and ρiLandsat(x,y) and ρiMODIS(x,y) are the surface reflectance of band i for the corresponding Landsat pixel and MODIS pixel, respectively.

  • 3.

    Further, a time series outlier detection technique based on the Hampel filter combined with a spatial filter is employed to detect temporal outliers using the vegetation index (VI) of Landsat time series (Claverie et al., 2018).

    (2) VI ( t ) = ρ NIR Landsat ( t ) ρ Red Landsat ( t ) ,

    where t indicates the time phase and ρNIRLandsat(t) and ρRedLandsat(t) are the NIR and red band surface reflectance of Landsat at time phase t.

For each sample VI(t) of the Landsat time series, the Hampel filter computes the median of the VIs in a temporal window (with the center sample excluded).

(3) VI median = median { VI ( t - Δ t ) , , VI ( t - 1 ) , VI ( t + 1 ) , , VI ( t + Δ t ) } ,

where Δt represents the temporal window size.

Then, it estimates the scale of natural variation (SNV) of each sample by deriving the median of the absolute deviations of the VIs in the temporal window from the median.

(4) SNV = median { | VI ( t - Δ t ) - VI median | , , | VI ( t - 1 ) - VI median | , | VI ( t + 1 ) - VI median | , , | VI ( t + Δ t ) - VI median | }

If the center sample, VI(t), differs from the median, VImedian, by more than 5 SNV, it is flagged as an outlier. No filter is applied if there are fewer than 3 samples within a 60 d temporal window. To eliminate isolated outlier pixels that generate a speckle effect, the sample pixel is flagged as an outlier only if the majority of its surrounding pixels are also flagged as outliers, cloud, or cloud shadow.

3.3 Landsat cross-sensor calibration

To ensure the temporal continuity of the generated SDC dataset, a Landsat cross-sensor calibration approach was employed to reduce the data inconsistencies between the input Landsat OLI and TM and ETM+ products. The linear regression models have been widely used to reduce cross-sensor reflectance difference (Chastain et al., 2019; Claverie, 2023; Claverie et al., 2018; Roy et al., 2016a; Shang and Zhu, 2019). It has been found that a single set of linear transformation coefficients is not proper for global-scale applications (Olthof and Fraser, 2024; Shang and Zhu, 2019). Therefore, our approach aims at building multiple transformation models for each MGRS tile and each spectral band separately.

The Landsat 7 ETM+ and Landsat 8 OLI share a sun-synchronous orbit at an altitude of approximately 710 km. Both satellites revisit a specific area on Earth every 16 d, with an 8 d offset from each other. Notably, there exists a discernible across-track overlap between each ETM+ observation and an adjacent OLI observation acquired 1 d apart (Roy et al., 2016a). Our cross-calibration approach uses matched (acquired 1 d apart) ETM+ and OLI observations in these overlapped regions to build linear regression models for each MGRS tile and each spectral band separately, adjusting ETM+ (and TM) observations to match the OLI spectral responses (Roy et al., 2016a).

For each MGRS tile, ETM+ and OLI observations in the overlap regions acquired only 1 d apart during the years 2014–2021 are reprojected onto the modified MGRS grid using nearest-neighbor resampling. Pixels flagged as cloud, cloud shadow, and snow/ice were discarded. Then, the remaining candidate pixels are used to build linear regression models to adjust reflectance differences for each spectral band. If the number of available pixels is insufficient to build the regression model, candidate pixels from adjacent MGRS tiles would be included. This step helps improve stability and mitigate the issue of spatial inconsistency between neighboring MGRS tiles. Table 3 shows an example of calibration coefficients obtained at three MGRS tiles in this study:

(5) ρ i OLI = a × ρ i ETM+ + b for each band i .

Table 3Calibration coefficients obtained at three MGRS tiles in this study.

Download Print Version | Download XLSX

3.4 MODIS harmonization to Landsat bandpass

Harmonizing MODIS to the Landsat bandpass reduces inconsistencies between Landsat and MODIS observations, which has been proven effective in improving the reconstruction accuracy of subsequent gap-filling and spatiotemporal fusion processes (Chen et al., 2023; Gevaert and García-Haro, 2015; Shi et al., 2022). Here, a cross-resolution data harmonization approach is employed for harmonizing MODIS to the Landsat OLI bandpass. This method involves utilizing matched Landsat and MODIS observations to establish multiple linear transformation models for each spectral band and each local image patch.

In contrast to previous methods that construct distinct transformation models for each land cover type (Cao et al., 2020; Shen et al., 2013; Yang et al., 2020), our approach adopts a patch-wise harmonization strategy with an overlapping mechanism to tackle spatial heterogeneity. This strategy avoids the necessity for high-accuracy land cover maps while concurrently ensuring computational efficiency. As illustrated in Fig. 3, Landsat images are paired with MODIS images acquired on corresponding dates, with the exclusion of contaminated Landsat pixels, such as clouds and cloud shadows. Subsequently, for each image patch and each spectral band, a linear transformation model is constructed utilizing candidate pixels within that image patch. The MODIS reflectance is then adjusted using the obtained transformation coefficients to generate more harmonized MODIS data. In areas of patch overlap, the transformation coefficients are averaged for the final adjustments. This approach employs multiple regional transformation models to better account for material-dependent spectral characteristics that vary across regions and uses an overlapping mechanism to enhance spatial consistency between neighboring image patches.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f03

Figure 3An illustration of patch-wise harmonization with overlapping.

Download

3.5 Unified gap filling and spatiotemporal fusion

Existing spatiotemporal fusion algorithms generally require cloud-free seamless Landsat images as input (Gao et al., 2006; Shi et al., 2022; Zhu et al., 2010, 2016), which may harm their data efficiency and performances, especially in cloudy areas where there are few cloud-free Landsat images available. Therefore, previous studies (Chen et al., 2018; Liu et al., 2021) applied gap-filling algorithms to partly contaminated Landsat images first and then used these gap-filled images for the subsequent fusion process. Different from these approaches, we propose the Unified, ROBust, OpTimization-based spatiotemporal reconstruction model (uROBOT), which can tackle both gap-filling and spatiotemporal fusion problems in a unified manner.

As shown in Fig. 4, the input data for uROBOT consist of a matched time series of Landsat–MODIS image patches, DF and DC, acquired at {T1,,Tn}; a MODIS image patch, Cp, acquired at the prediction phase, Tp; and a partially contaminated Landsat image, Fp, acquired at Tp (case 1 in Fig. 4). The output is the predicted reflectance value for the unobserved segments of the Landsat image, Fp-, at Tp. Should Fp be entirely contaminated/unobserved, the scenario simplifies to a conventional spatiotemporal fusion problem (case 2 in Fig. 4).

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f04

Figure 4Input–output settings of uROBOT. The superscripts + and indicate the corresponding observed and unobserved (or contaminated) segments of Landsat image at the prediction phase. The uROBOT can tackle both gap-filling (case 1) and spatiotemporal fusion (case 2) problems in a unified manner.

Download

3.5.1 Preliminaries of the uROBOT model

Similar to the previous spatiotemporal fusion model (Chen et al., 2023), the basic assumption of uROBOT is that the MODIS image, Cp, can be accurately approximated by a linear combination of other similar MODIS images in the input time series data given by

(6) C p D c α ,

where Cp is the MODIS image patch acquired at the prediction phase, Tp; DC is a matrix that stacks the input MODIS time series image patches acquired at {T1,,Tn}; and α is a sparse vector that selects out MODIS image patches similar to Cp and combines them to approximate Cp.

Then, it is assumed that the representation coefficient α can be transferred to corresponding Landsat images, and we obtain the following estimation for Fp:

(7) F ^ p = D F α ,

where F^p is the estimated Landsat image patch at Tp and DF is a matrix that stacks the input Landsat time series image patches acquired at {T1,,Tn}.

The representation coefficients α can be obtained by solving an optimization problem with two extra regularization terms. Detailed explanations regarding these two regularization terms will be provided subsequently.

(8) min α | C p - D C α | 2 2 + λ | α | 1 + β | F interp - D F α | 2 2 + μ | F p + - D F + α | 2 2 ,

where λ, β, and μ are scalars; α is a vector that flattens the representation coefficients; Cp is a vector that flattens the MODIS image patch at Tp; DC is a matrix that stacks the MODIS time series image patches acquired at {T1,,Tn}; DF is a matrix that stacks the Landsat time series image patches acquired at {T1,,Tn}; Finterp is a vector that flattens an interpolated Landsat image patch at Tp (further elucidation on this matter is subsequently presented); Fp+ is a vector that flattens the observed part of Fp; and DF+ is a matrix that stacks the corresponding observed parts of DF.

Beyond the previous spatiotemporal fusion model (Chen et al., 2023), the uROBOT model accepts the observed part Fp+ (if there is one as in case 1 in Fig. 4) as extra input. Hence, the regularization term μ(|Fp+-DF+α|) allows uROBOT to exploits the observed segments Fp+ to better reconstruct the target image, Fp. This feature enables the uROBOT model to handle both gap-filling and spatiotemporal fusion problems in a unified manner. Additionally, uROBOT utilizes the interpolated Landsat image, Finterp, in the temporal continuity penalty term, which further improves the performance of uROBOT. Finterp is an interpolated patch of Landsat imagery at time Tp, generated by combining Landsat observations acquired at nearby dates, with weights determined by the corresponding MODIS data. Therefore, the regularization term β(|Finterp-DFα|22) serves to enhance the temporal continuity of the final predicted Landsat image, F^p=DFα.

All the constraint terms in Eq. (8) contribute to addressing gradual and step changes. To handle extreme conditions such as ephemeral land cover changes, the uROBOT model also distributes the approximation residuals into the prediction (Chen et al., 2023), and the final prediction is formulated as follows:

(9) F ^ p = D F α + ( C p - D C α ) .

In regions with frequent cloud cover, the scarcity of cloud-free observations can pose a challenge. To address this, the temporal continuity constraint β|Finterp-DFα|22 and the residual distribution in Eq. (9) ensure that our estimations are consistent with Cp and are at least as accurate as the interpolated results, Finterp.

3.5.2 Implementation of uROBOT for SDC reconstruction

As shown in Fig. 4, the uROBOT model reconstructs seamless Landsat images for each prediction phase separately. At each prediction phase, Tp, the uROBOT model takes three main steps to reconstruct the corresponding Landsat image, Fp:

  • 1.

    Firstly, the uROBOT model constructs an interpolated image, Finterp, using the weighted combination (Zhu et al., 2010) of clear-sky Landsat pixels acquired nearest to the prediction phase, Tp.

    (10) F interp ( x , y ) = w 1 × F 1 ( x , y ) + w 2 × F 2 ( x , y ) for each pixel location ( x , y ) ,

    where (x,y) indicates a given pixel location, F1(x,y) is the cloud-free Landsat pixel acquired nearest to and before Tp, F2(x,y) is the cloud-free Landsat pixel acquired nearest to and after Tp, and the weights w1 and w2 are obtained using corresponding MODIS pixels.

    w1=(C2(x,y)-Cp(x,y))2(C1(x,y)-Cp(x,y))2+(C2(x,y)-Cp(x,y))2(11)andw2=(C1(x,y)-Cp(x,y))2(C1(x,y)-Cp(x,y))2+(C2(x,y)-Cp(x,y))2,

    where Cp(x,y) is the corresponding MODIS pixel acquired at Tp, C1(x,y) is the corresponding cloud-free MODIS pixel acquired on the same date as F1(x,y), and C2(x,y) is the corresponding cloud-free MODIS pixel acquired on the same date as F2(x,y).

  • 2.

    Secondly, the uROBOT model utilizes all input time series data and the spatial information of Fp+ to do similar image matching and approximation (Chen et al., 2023), by solving the optimization problem in Eq. (8) to obtain the representation coefficient α.

  • 3.

    Then, the last step is to reconstruct the target Fp- using the obtained coefficients α and all input data, as

    (12) F ^ p - = D F - α + ( C p - - D C - α ) ,

    where F^p- is an estimation of the unobserved/contaminated segments of Fp and DF-, Cp-, and DC- are the corresponding masked parts of DF, Cp, and DC.

3.6 Quantitative assessment using the leave-one-out approach

The leave-one-out assessment approach has been widely used to evaluate the accuracy of reconstructed surface reflectance values in previous gap-filling and spatiotemporal fusion studies (Chen et al., 2011; Gao et al., 2006; Zhu et al., 2016, 2010). This approach initially excludes a certain Landsat image from the input data, subsequently using the remaining input data to reconstruct the excluded image and thereafter evaluating discrepancies between the originals and reconstructions using standard metrics, such as the correlation coefficient (CC), root mean square error (RMSE), mean absolute error (MAE), rMAE (MAE normalized by true surface reflectance values), and Roberts edge (Edge) spatial features (Zhu et al., 2022). We calculated the normalized difference in the Edge spatial features between a reconstructed image and actual Landsat image. The normalized metric values range from −1 to 1, indicating the under- or over-estimate of spatial details. The average normalized metric value of pixels with an Edge value higher than the 90th percentile in the actual Landsat image was used to represent the spatial accuracy of the reconstructed image (Zhu et al., 2022).

A total of 425 test sites that are distributed across the globe are selected randomly, each of which covers a 6 km×6 km area. These test sites were grouped by their dominant land cover types using the FROM_GLC land cover map (Gong et al., 2013; C. Li et al., 2017). Figure 5 shows the spatial distribution and corresponding dominant land cover types of the 425 global test sites. For each Landsat image at each test site, we exclude it from the input data and apply SDC reconstruction using the remaining data, separately. Then, the reconstruction accuracy is evaluated by comparing the reconstructed images with the original Landsat images. The accuracy assessment is conducted in four years (2001, 2004, 2012, and 2021) representative of different input data conditions.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f05

Figure 5Spatial distribution and corresponding dominant land cover types of the 425 global test sites.

3.7 Quantitative assessment by cross-comparing with HLS products

Another assessment approach is to cross-compare the gap-filled SDC dataset with actual observations from other sensors. NASA's HLS products provide dense 30 m observation data by harmonizing Landsat OLI (since 2013) and Sentinel-2 MSI (since 2015) products, making themselves good reference data to evaluate the SDC product for the period 2016–2022. We selected 22 MGRS tiles representative of different land cover types as test sites (Chen et al., 2023) and evaluated the agreement between SDC and HLS products in the year 2021. Figure 6 shows the spatial distribution of the 22 MGRS tiles in the cross-comparison. Each MGRS tile covers a 109.8 km×109.8 km area. Since the L30 product is derived from Landsat OLI data, we employed the leave-one-out validation strategy (the same as in the last section) for the cross-comparison with L30 data. The least-squares regression method was used to reduce spectral bandpass differences between SDC and HLS data for each spectral band and each test site.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f06

Figure 6Spatial distribution of the 22 MGRS tiles involved in the cross-comparison with HLS.

4 Results

4.1 Global 30 m daily seamless data cube (SDC) of land surface reflectance

Based on the developed framework, a global 30 m 23-year (2000–2022) daily surface reflectance SDC dataset was generated by combining multi-sensor observations from Landsat TM, ETM+, and OLI and MODIS Terra products. The generated SDC dataset is tiled into the UTM-based grid as described in Sect. 3.1. This gridding system includes 18 466 tiles (each of which includes 3661×3661 Landsat pixels), covering most of global land surface except Antarctica as shown in Fig. 7.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f07

Figure 7Distribution of the 18 466 tiles used for SDC generation.

Figures 8–11 depict four examples of SDC time series in comparison with the HLS data (not used in the generation of SDC), with a specific focus on land cover changes. Figure 8 illustrates a crop harvest event that took place in April 2021 in Egypt. Remarkably, the SDC time series demonstrates a more adept representation of the various crop harvest stages compared to the L30 time series. This enhanced performance is attributed to the incorporation of more frequent temporal information from MODIS. Figure 9 presents the second case in the Poyang Lake wetland, a region prone to frequent cloud cover. Although there are only limited cloud-free Landsat observations, the temporal phases of land–water transition are effectively captured in the SDC time series. The third case typified a tundra region in Canada. As shown in Fig. 10, the SDC time series accurately reflects the snow season and vegetation growth in the tundra ecosystem. The fourth case in Fig. 11 illustrates a flood event that occurred in India. This region experienced flooding in October 2013, a period when Sentinel-2 data were unavailable. During December, the water gradually receded as depicted in the figure. The results demonstrate the proficiency of SDC data in monitoring rapid land cover changes. Further, the SDC dataset exhibits robust consistency in both spatial and temporal dimensions, with an extended temporal coverage dating back to 2000.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f08

Figure 8False-color composites of SDC and HLS data and time series of red SR (red) and NIR SR (blue) of the central pixel located at 30.4851° N, 31.9383° E (Egypt; tile 36RUU) from 15 February to 15 July 2021.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f09

Figure 9False-color composites of SDC and HLS data and time series of red SR (red) and NIR SR (blue) of the central pixel located at 29.0965° N, 116.1107° W (China; tile 50RMT) from 15 February to 15 July 2021.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f10

Figure 10False-color composites of SDC and HLS data and time series of red SR (red) and NIR SR (blue) of the central pixel located at 56.2778° N, 110.9034° W (Canada; tile 12VWH) from 15 February to 15 June 2021.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f11

Figure 11False-color composites of SDC and HLS data and time series of red SR (red) and NIR SR (blue) of the central pixel located at 23.2923° N, 68.7947° E (India; tile 42QVL) from 15 November 2013 to 15 March 2014.

4.2 Accuracy assessment results using the leave-one-out approach

Table 4 displays the results of quantitative assessments for the 425 global test sites across four typical years, utilizing different input data settings. Notably, the SDC attains its peak reconstruction accuracy in 2021, when incorporating Landsat ETM+, OLI, and MODIS data. The reconstruction accuracy of SDC is comparatively diminished in 2012, when Landsat TM and OLI observations are not available. Table 5 illustrates the SDC reconstruction accuracy across test sites characterized by different land cover types. The reconstruction accuracy of SDC in waterbodies and tundra areas is observed to manifest relatively higher error levels. Figure 12 displays scatterplots depicting the predicted SDC and actual Landsat surface reflectance values in the leave-one-out assessment. The majority of data points closely align with the 1:1 line, indicating a robust consistency between the predicted and actual reflectance values. Overall, these results substantiate that reconstructed SDC surface reflectance values achieve a high level of accuracy on a global scale.

Table 4Accuracy of SDC reconstruction at the 425 global test sites (: scan-line corrector failure).

Download Print Version | Download XLSX

Table 5Accuracy of SDC reconstruction at the 425 global test sites for different land cover types.

Download Print Version | Download XLSX

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f12

Figure 12Scatterplots of the actual and predicted values for the test sites of different land cover types.

Download

4.3 Accuracy assessment results by cross-comparing with HLS products

Table 6 presents the quantitative cross-comparison results between SDC and HLS L30 and S30 products at the 22 MGRS tiles. The results indicate a higher level of agreement between SDC and HLS L30 products compared to that between SDC and HLS S30 products. This discrepancy may be attributed to the spectral differences between Landsat OLI and Sentinel-2 MSI instruments. Notably, since the metric values listed in Table 6 are calculated for test sites at a different spatial scale (109.8 km×109.8 km here and 6 km×6 km in Sect. 4.2), the listed root mean square deviation (RMSD) and CC values are not directly comparable to those in Tables 4 and 5. As described in Sect. 3.1, the gridding system of SDC has a 15 m offset compared to the Sentinel-2 gridding system used by HLS products. This 15 m offset may also cause systematic deviations in the cross-comparison.

Table 6Cross-comparison results of between SDC and HLS L30 and S30 products at the 22 MGRS tiles.

Download Print Version | Download XLSX

Figure 13 presents scatterplots depicting the reconstructed SDC data in comparison to HLS data. The blue band in the plots indicates higher deviations, known to be sensitive to atmospheric conditions. Most data points lie near the 1:1 line, indicating a high degree of agreement between SDC and HLS products.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f13

Figure 13Scatterplots of SDC and HLS reflectance values in the cross-comparison.

Download

4.4 The effectiveness of Landsat cross-sensor calibration

The evaluation of our Landsat cross-sensor calibration method involved a comparison of surface reflectance differences before and after the bandpass adjustment. The surface reflectance differences are quantified by the RMSDs between ETM+ and OLI observations (temporally interpolated to align with the acquisition dates of ETM+). This comparison is conducted for six spectral bands across six MGRS tiles (10SEH, 17RKQ, 32UNA, 36MWS, 49SGV, and 55HCC). Figure 14 presents the surface reflectance differences before and after the cross-calibration process for six spectral bands at the six test sites. Additionally, Fig. 15 illustrates two examples of calibrated blue and NIR surface reflectance time series. The results demonstrate the effectiveness of our method in reducing the surface reflectance differences between ETM+ and OLI observations, aligning them more closely using obtained local-scale linear transformation models.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f14

Figure 14The mean RMSDs between ETM+ and OLI observations before and after the cross-calibration for six spectral bands at six MGRS tiles.

Download

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f15

Figure 15Time series of original and calibrated surface reflectance: (a) located at 23.2923° N, 68.7947° E; (b) located at 23.2923° N, 68.7947° E. The blue lines are the estimated curves of the OLI observations based on the Fourier approach (Dash et al., 2010; Shang and Zhu, 2019).

Download

4.5 The effects of remaining Landsat sensor differences

The effects of remaining Landsat sensor differences on the SDC reconstruction accuracy are investigated in this section. We evaluated the reconstruction accuracy with different input data settings in three typical years (2001, 2004, and 2021) using the 425 global test sites from Sect. 3.6. The accuracy assessment results presented in Table 7 suggest that incorporating more data from different sensors helps to improve the overall reconstruction accuracy. However, a more in-depth analysis reveals that while incorporating ETM+ ( denoting a scan-line corrector failure) images in the input data improved the accuracy of TM image reconstructions in 2004, it simultaneously diminished the accuracy of OLI image reconstructions in 2021. In a similar vein, the inclusion of OLI somewhat reduced the accuracy of ETM+ image reconstructions in 2021 as evidenced in Table 8 and Fig. 16. Despite this, these fluctuations in accuracy were relatively minor.

Table 7Accuracy of SDC reconstruction (of all images) with different input data settings. Metrics were averaged over the six bands, and the best results are marked in bold. (: scan-line corrector failure).

Download Print Version | Download XLSX

Table 8Accuracy of SDC reconstruction (of images from specific sensors) with different input data settings. Metrics were averaged over the six bands, and the best results are marked in bold.

Download Print Version | Download XLSX

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f16

Figure 16Comparison of the SDC reconstruction accuracy with different input data settings. The symbols + (increase) and (decrease) indicate the change in accuracy with the additional input data.

Download

4.6 The effectiveness of Landsat cloud masking

To mitigate cloud mask omission errors in Fmask results, SDC generation incorporated an enhanced cloud masking approach involving a brightness-threshold filter and a time series outlier filter. Figure 17 presents two cases of cloud masking results by Fmask and the enhanced cloud masking method. These cases reveal the presence of cloud or heavy aerosol pixels that remain undetected by the Fmask algorithm. Meanwhile, the enhanced cloud masking method adeptly identified and filtered out the majority of these previously undetected cloud pixels, demonstrating its effectiveness in ensuring the quality of the input Landsat data for SDC generation.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f17

Figure 17Comparison of cloud masks derived by Fmask and the enhanced cloud masking approach. The yellow shading in the cloud masks signifies cloud-free pixels. It can be seen that thin clouds undetected by Fmask were mostly screened out by the enhanced cloud masking.

4.7 Temporal continuity of reconstructed SDC time series

The SDC product provides an extensive historical dataset of multi-spectral and medium-resolution data covering the period from 2000 to 2022. As indicated in Table 2, this prolonged temporal coverage encapsulates multiple distinct phases, each characterized by varied input data configurations. The temporal continuity and consistency of SDC data emerge as pivotal factors for its efficacy in monitoring long-term land dynamics. Figure 18 presents two distinct scenarios of SDC time series extending from 2000 to 2022. The first scenario pertains to a forest situated in a mountainous region, characterized by strong phenological changes and rugged terrain features. The second scenario depicts a desert location exhibiting minimal temporal variations and near-Lambertian surfaces. It can be seen that there are no discernible discontinuities between the different phases in the SDC time series.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f18

Figure 18Two cases of SDC and Landsat time series: (a) located at 35.4549° N, 113.4381° E and (b) located at 39.0850° N, 81.2460° E.

5 Discussion

5.1 Global-scale land cover classification utilizing the SDC: a comparative analysis with Landsat composite and interpolated datasets

Image compositing is a conventional approach employed to mitigate data gaps in optical remote sensing, which selects the highest-quality observations within a defined time interval based on specific criteria to create seamless clear images (Qiu et al., 2023). Contrary to best-pixel composite methods, there are also some interpolation methods to generate synthetic images based on harmonic time series fits (Zhou et al., 2022; Zhu et al., 2015b). Figure 19 displays the comparison of SDC images with Landsat composite and interpolated images. The Landsat composite images are generated by the NLCD (Jin et al., 2023) and MAX-RNB (Qiu et al., 2023) algorithms, and the Landsat interpolated images are generated using the HAPO algorithm (Zhou et al., 2022). Notwithstanding the good visual quality apparent in these Landsat composite and interpolated images, it is evident that certain temporal change information was lost during the process of image compositing and interpolation.

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f19

Figure 19False-color composites of Landsat, Landsat composite, and SDC images at 35.3104° N, 97.7974° W.

Moreover, we conducted a comparative analysis to evaluate the effectiveness of using Landsat composite and interpolated images versus using SDC data for land cover classification. To perform this assessment, we utilized the validation sample set provided in a previous study (C. Li et al., 2017), which consists a total of 32 946 data points distributed globally. To facilitate a clear and equitable comparison, we employed solely the six-band surface reflectance time series from Landsat composite interpolated images and SDC data as input features for the classification. Subsequently, we conducted a fivefold cross-validation procedure independently for each input data configuration using the same LightGBM classifier with default parameter settings.

Table 9 presents the results of overall classification accuracy for various input data configurations. It is discernible that there exist minor discrepancies in performance between the NLCD and MAX-RNB results. And the utilization of seasonal composite images led to a significantly higher level of classification accuracy compared to annual composite images. Remarkably, the highest classification accuracy was attained using SDC time series as input features, outperforming other input settings by a wide margin (2.4 %–11.3 %). Figure 20 shows two examples of land cover classification results when using SDC can correctly identify land cover types, while other data cannot. The SDC time series preserves the temporal information of the original Landsat data and remains stable when Landsat observations are sparse.

Table 9Overall accuracy (OA) of land cover classification results using NLCD composite images, MAX-RNB composite images, HAPO interpolated images, and SDC time series as input features.

Download Print Version | Download XLSX

https://essd.copernicus.org/articles/16/5449/2024/essd-16-5449-2024-f20

Figure 20Examples of land cover classification results using different input data where (a) the ground truth is cropland and (b) the ground truth is grass.

Download

5.2 The influence of Landsat 7 ETM+ scan-line corrector failure

The scan-line corrector (SLC) of the ETM+ sensor on board Landsat 7 failed permanently in May 2003, resulting in about 22 % of the pixels per scene not being scanned (causing wedge-shaped gaps) since that time (Chen et al., 2011). These wedge-shaped gaps are evident in the Landsat ETM+ images as displayed in Figs. 17–19. However, this issue poses no obstacle to SDC generation as the uROBOT model can exploit time series input data to fill these gaps and fuse them with MODIS in a unified manner. In the year 2012, when there are only Landsat ETM+ SLC-off images available, the uROBOT model can still utilize these ETM+ image patches as input and achieve a relatively high reconstruction accuracy as indicated in Table 4. Even in later years when Landsat OLI images become available, the incorporation of Landsat ETM+ images as input can still help improve the overall accuracy of the SDC generation as indicated in Table 7.

5.3 Limitations of SDC products

The SDC is not equivalent to actual daily 30 m Earth observation data. It is an estimation based on Landsat and MODIS time series observations. Reconstructing missing Landsat data is an under-determined problem, meaning there can be infinitely many possible solutions (Shen et al., 2015). Using 500 m MODIS images as “guidance” and applying the constraints presented in Eq. (8), we can narrow down the solution space and make more accurate estimations. However, achieving 100 % accuracy is not feasible since the information provided in the input data is usually incomplete. Additionally, the effective spatial resolution of MODIS observations changes significantly due to the variations in view angles (Pahlevan et al., 2017). Even after BRDF normalization and temporal smoothing, these effects cannot be perfectly mitigated. The effective temporal resolution of SDC depends on the quality of the input Landsat and MODIS data, which can vary in space and time.

The LEDAPS atmospheric correction algorithm for Landsat TM and ETM+ does not perform as well as the LaSRC for Landsat OLI (Vermote et al., 2018). Additionally, the Fmask algorithm for Landsat 5 and 7 is less effective compared to its performance for Landsat 8 and 9 due to the absence of the cirrus band (Zhu et al., 2015a). Our enhanced cloud masking approach identifies most of the previously undetected clouds. Nonetheless, there are still certain thin aerosols and cloud shadows in Landsat imagery that can introduce temporal noise and spatial artifacts into the SDC dataset. Implementing a more aggressive cloud masking strategy could minimize these impacts, yet it could also significantly reduce the number of available observations. Therefore, the development of a more accurate and robust algorithm for cloud and cloud shadow detection is essential for future improvements.

As shown in Table 8 and Fig. 16, remaining Landsat cross-sensor inconsistencies may slightly reduce the reconstruction accuracy. And the sensor differences between Landsat and MODIS sensors could also introduce errors. Future improvements may require a more effective method for cross-sensor calibration and data harmonization.

The influence of geographic registration errors was not considered in this study since Landsat Collection 2 products have significantly improved the absolute geolocation accuracy of Landsat data (Crawford et al., 2023). However, the co-registration accuracy between Landsat imagery and MODIS data could also influence the reconstruction accuracy of SDC dataset. Future improvements could introduce co-registration processes to address this concern.

The SDC dataset could also benefit from integrating more data sources. For example, the European Space Agency (ESA) launched two satellites of the Sentinel-2 mission (S-2A and S-2B) in 2015 and 2017, respectively. The Multi-Spectral Instrument (MSI) on board both satellites acquires multi-spectral data at a spatial resolution of 10 to 60 m depending on the wavelength with a 5 d revisit period at the Equator. The incorporation of Sentinel-2 could facilitate the generation of higher-quality SDC datasets with finer spatial resolutions.

6 Code and data availability

The SDC dataset is available at https://doi.org/10.12436/SDC30.26.20240506 (Chen et al., 2024) or on the project website (http://sdc.iearth.cloud/, last access: 16 November 2024), where a web-based interface is provided for all researchers to be able to freely access the SDC dataset. Notably, the SDC dataset is dynamically generated upon request to optimize data storage efficiency. Furthermore, leveraging the generated SDC dataset, we have produced global 23-year (2000–2022) annual land cover maps using the FROM_GLC classification system, which are readily accessible on the website. All code utilized in our analyses and the accompanying experimental data are available upon reasonable request.

7 Conclusion

In this study, a global-coverage 30 m resolution 23-year (2000–2022) daily seamless data cube (SDC) of land surface reflectance was developed based on the fusion of multi-sensor observations from the Landsat 5, 7, 8, and 9 and MODIS Terra constellations. SDC generation relies on a novel processing framework, which comprises a set of processing stages: gridding and reprojection, Landsat cloud masking, Landsat cross-sensor calibration, MODIS harmonization to Landsat bandpass, and a unified gap-filling and spatiotemporal fusion stage. The quality of the generated SDC dataset was evaluated using a leave-one-out approach and a cross-comparison with NASA's Harmonized Landsat and Sentinel-2 products. The leave-one-out validation at 425 global test sites assessed the agreement between the SDC with actual Landsat surface reflectance values (not used as input), revealing an overall mean absolute error (MAE) of 0.014 (the valid range of surface reflectance values is 0–1). The cross-comparison of SDC with HLS products at 22 Military Grid Reference System (MGRS) tiles revealed an overall mean absolute deviation (MAD) of 0.017 with L30 (Landsat 8-based 30 m HLS product) and a MAD of 0.021 with S30 (Sentinel-2-based 30 m HLS product).

The SDC has several advantages compared with other existing Landsat-based surface reflectance datasets: (i) it exhibits a higher observation frequency and enhanced capabilities for monitoring land cover changes, (ii) it is consistent in both spatial and temporal dimensions without missing values, (iii) it has a global coverage and a prolonged 23-year duration from 2000 to 2022, and (iv) validation results revealed a high level of accuracy in SDC reconstruction. Moreover, the experiment employing the SDC for global-scale land cover classification underscores its advantages compared with Landsat composite/interpolated datasets, achieving a sizable improvement in overall accuracy (2.4 %–11.3 %). The SDC will be a competitive analysis-ready surface reflectance dataset in many other studies related to global environmental monitoring.

Author contributions

PG conceptualized the research idea and supervised the project. SC and JiW developed the methodologies, performed data analysis, and drafted the manuscript. QL and XL processed the MODIS data. SC, RL, and PQ curated the data. JiW and JY configured and maintained the computing resources. JuW developed the web-based user interface. SC, JiW, SY, HH, and PG reviewed and edited the manuscript.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

We gratefully acknowledge NASA and USGS for providing the Landsat and MODIS products used in this study. Our sincere thanks also go to Xiaolin Zhu and the two anonymous reviewers for their valuable feedback and constructive suggestions, which greatly enhanced the quality of this manuscript. Finally, we wish to thank the editor and the editorial teams for their efficient handling and unwavering support throughout the review and publication process.

Financial support

This research was jointly supported by the Major Program of the National Natural Science Foundation of China (grant nos. 42090015 and 42071400), the Croucher Foundation (grant no. CAS22902/CAS22HU01), and the Major Key Project of PCL.

Review statement

This paper was edited by Jia Yang and reviewed by X. Zhu and two anonymous referees.

References

Abowarda, A. S., Bai, L., Zhang, C., Long, D., Li, X., Huang, Q., and Sun, Z.: Generating surface soil moisture at 30 m spatial resolution using both data fusion and machine learning toward better water resources management at the field scale, Remote Sens. Environ., 255, 112301, https://doi.org/10.1016/j.rse.2021.112301, 2021. 

Battude, M., Al Bitar, A., Morin, D., Cros, J., Huc, M., Marais Sicre, C., Le Dantec, V., and Demarez, V.: Estimating maize biomass and yield over large areas using high spatial and temporal resolution Sentinel-2 like remote sensing data, Remote Sens. Environ., 184, 668–681, https://doi.org/10.1016/j.rse.2016.07.030, 2016. 

Bauer-Marschallinger, B. and Falkner, K.: Wasting petabytes: A survey of the Sentinel-2 UTM tiling grid and its spatial overhead, ISPRS J. Photogramm., 202, 682–690, https://doi.org/10.1016/j.isprsjprs.2023.07.015, 2023. 

Bolton, D. K., Gray, J. M., Melaas, E. K., Moon, M., Eklundh, L., and Friedl, M. A.: Continental-scale land surface phenology from harmonized Landsat 8 and Sentinel-2 imagery, Remote Sens. Environ., 240, 111685, https://doi.org/10.1016/j.rse.2020.111685, 2020. 

Brooks, E. B., Thomas, V. A., Wynne, R. H., and Coulston, J. W.: Fitting the Multitemporal Curve: A Fourier Series Approach to the Missing Data Problem in Remote Sensing Analysis, IEEE T. Geosci. Remote, 50, 3340–3353, https://doi.org/10.1109/TGRS.2012.2183137, 2012. 

Cao, Z., Chen, S., Gao, F., and Li, X.: Improving phenological monitoring of winter wheat by considering sensor spectral response in spatiotemporal image fusion, Phys. Chem. Earth Pt. A/B/C, 116, 102859, https://doi.org/10.1016/j.pce.2020.102859, 2020. 

Carrasco, L., O'Neil, A., Morton, R., and Rowland, C.: Evaluating Combinations of Temporally Aggregated Sentinel-1, Sentinel-2 and Landsat 8 for Land Cover Mapping with Google Earth Engine, Remote Sens.-Basel, 11, 288, https://doi.org/10.3390/rs11030288, 2019. 

Chastain, R., Housman, I., Goldstein, J., Finco, M., and Tenneson, K.: Empirical cross sensor comparison of Sentinel-2A and 2B MSI, Landsat-8 OLI, and Landsat-7 ETM+ top of atmosphere spectral characteristics over the conterminous United States, Remote Sens. Environ., 221, 274–285, https://doi.org/10.1016/j.rse.2018.11.012, 2019. 

Che, X., Zhang, H. K., Li, Z. B., Wang, Y., Sun, Q., Luo, D., and Wang, H.: Linearly interpolating missing values in time series helps little for land cover classification using recurrent or attention networks, ISPRS J. Photogramm., 212, 73–95, https://doi.org/10.1016/j.isprsjprs.2024.04.021, 2024. 

Chen, B., Huang, B., and Xu, B.: Multi-source remotely sensed data fusion for improving land cover classification, ISPRS J. Photogramm., 124, 27–39, https://doi.org/10.1016/j.isprsjprs.2016.12.008, 2017. 

Chen, B., Chen, L., Huang, B., Michishita, R., and Xu, B.: Dynamic monitoring of the Poyang Lake wetland by integrating Landsat and MODIS observations, ISPRS J. Photogramm., 139, 75–87, https://doi.org/10.1016/j.isprsjprs.2018.02.021, 2018. 

Chen, J., Zhu, X., Vogelmann, J. E., Gao, F., and Jin, S.: A simple and effective method for filling gaps in Landsat ETM+ SLC-off images, Remote Sens. Environ., 115, 1053–1064, https://doi.org/10.1016/j.rse.2010.12.010, 2011. 

Chen, S., Wang, J., and Gong, P.: ROBOT: A spatiotemporal fusion model toward seamless data cube for global remote sensing applications, Remote Sens. Environ., 294, 113616, https://doi.org/10.1016/j.rse.2023.113616, 2023. 

Chen, S., Wang, J., Liu, Q., Liang, X., Liu, R., Qin, P., Yuan, J., Wei, J., Yuan, S., Huang, H., and Gong, P.: Global 30 m seamless data cube (2000–2022) of land surface reflectance generated from Landsat-5,7,8,9 and MODIS Terra constellations, Pengcheng Laboratory [data set], https://doi.org/10.12436/SDC30.26.20240506, 2024. 

Chen, Y., Cao, R., Chen, J., Liu, L., and Matsushita, B.: A practical approach to reconstruct high-quality Landsat NDVI time-series data by gap filling and the Savitzky–Golay filter, ISPRS J. Photogramm., 180, 174–190, https://doi.org/10.1016/j.isprsjprs.2021.08.015, 2021. 

Chuvieco, E., Ventura, G., Martín, M. P., and Gómez, I.: Assessment of multitemporal compositing techniques of MODIS and AVHRR images for burned land mapping, Remote Sens. Environ., 94, 450–462, https://doi.org/10.1016/j.rse.2004.11.006, 2005. 

Cihlar, J., Manak, D., and D'Iorio, M.: Evaluation of compositing algorithms for AVHRR data over land, IEEE T. Geosci. Remote, 32, 427–437, https://doi.org/10.1109/36.295057, 1994. 

Claverie, M.: Evaluation of surface reflectance bandpass adjustment techniques, ISPRS J. Photogramm., 198, 210–222, https://doi.org/10.1016/j.isprsjprs.2023.03.011, 2023. 

Claverie, M., Vermote, E., Franch, B., He, T., Hagolle, O., Kadiri, M., and Masek, J.: Evaluation of Medium Spatial Resolution BRDF-Adjustment Techniques Using Multi-Angular SPOT4 (Take5) Acquisitions, Remote Sens.-Basel, 7, 12057–12075, https://doi.org/10.3390/rs70912057, 2015. 

Claverie, M., Ju, J., Masek, J. G., Dungan, J. L., Vermote, E. F., Roger, J.-C., Skakun, S. V., and Justice, C.: The Harmonized Landsat and Sentinel-2 surface reflectance data set, Remote Sens. Environ., 219, 145–161, https://doi.org/10.1016/j.rse.2018.09.002, 2018. 

Crawford, C. J., Roy, D. P., Arab, S., Barnes, C., Vermote, E., Hulley, G., Gerace, A., Choate, M., Engebretson, C., Micijevic, E., Schmidt, G., Anderson, C., Anderson, M., Bouchard, M., Cook, B., Dittmeier, R., Howard, D., Jenkerson, C., Kim, M., Kleyians, T., Maiersperger, T., Mueller, C., Neigh, C., Owen, L., Page, B., Pahlevan, N., Rengarajan, R., Roger, J.-C., Sayler, K., Scaramuzza, P., Skakun, S., Yan, L., Zhang, H. K., Zhu, Z., and Zahn, S.: The 50-year Landsat collection 2 archive, Science of Remote Sensing, 8, 100103, https://doi.org/10.1016/j.srs.2023.100103, 2023. 

Dash, J., Jeganathan, C., and Atkinson, P. M.: The use of MERIS Terrestrial Chlorophyll Index to study spatio-temporal variation in vegetation phenology over India, Remote Sens. Environ., 114, 1388–1402, https://doi.org/10.1016/j.rse.2010.01.021, 2010. 

Defourny, P., Bontemps, S., Bellemans, N., Cara, C., Dedieu, G., Guzzonato, E., Hagolle, O., Inglada, J., Nicola, L., Rabaute, T., Savinaud, M., Udroiu, C., Valero, S., Bégué, A., Dejoux, J.-F., El Harti, A., Ezzahar, J., Kussul, N., Labbassi, K., Lebourgeois, V., Miao, Z., Newby, T., Nyamugama, A., Salh, N., Shelestov, A., Simonneaux, V., Traore, P. S., Traore, S. S., and Koetz, B.: Near real-time agriculture monitoring at national scale at parcel resolution: Performance assessment of the Sen2-Agri automated system in various cropping systems around the world, Remote Sens. Environ., 221, 551–568, https://doi.org/10.1016/j.rse.2018.11.007, 2019. 

Dwyer, J. L., Roy, D. P., Sauer, B., Jenkerson, C. B., Zhang, H. K., and Lymburner, L.: Analysis Ready Data: Enabling Analysis of the Landsat Archive, Remote Sens.-Basel, 10, 1363, https://doi.org/10.3390/rs10091363, 2018. 

Frantz, D., Röder, A., Stellmes, M., and Hill, J.: Phenology-adaptive pixel-based compositing using optical earth observation imagery, Remote Sens. Environ., 190, 331–347, https://doi.org/10.1016/j.rse.2017.01.002, 2017. 

Gao, F., Masek, J., Schwaller, M., and Hall, F.: On the blending of the Landsat and MODIS surface reflectance: predicting daily Landsat surface reflectance, IEEE T. Geosci. Remote, 44, 2207–2218, https://doi.org/10.1109/TGRS.2006.872081, 2006. 

Gao, H., Zhu, X., Guan, Q., Yang, X., Yao, Y., Zeng, W., and Peng, X.: cuFSDAF: An Enhanced Flexible Spatiotemporal Data Fusion Algorithm Parallelized Using Graphics Processing Units, IEEE T. Geosci. Remote, 60, 1–16, https://doi.org/10.1109/TGRS.2021.3080384, 2022. 

Gervais, N., Buyantuev, A., and Gao, F.: Modeling the Effects of the Urban Built-Up Environment on Plant Phenology Using Fused Satellite Data, Remote Sens.-Basel, 9, 99, https://doi.org/10.3390/rs9010099, 2017. 

Gevaert, C. M. and García-Haro, F. J.: A comparison of STARFM and an unmixing-based algorithm for Landsat and MODIS data fusion, Remote Sens. Environ., 156, 34–44, https://doi.org/10.1016/j.rse.2014.09.012, 2015. 

Gong, P., Liang, S., Carlton, E. J., Jiang, Q., Wu, J., Wang, L., and Remais, J. V.: Urbanisation and health in China, Lancet, 379, 843–852, https://doi.org/10.1016/S0140-6736(11)61878-3, 2012. 

Gong, P., Wang, J., Yu, L., Zhao, Y., Zhao, Y., Liang, L., Niu, Z., Huang, X., Fu, H., Liu, S., Li, C., Li, X., Fu, W., Liu, C., Xu, Y., Wang, X., Cheng, Q., Hu, L., Yao, W., Zhang, H., Zhu, P., Zhao, Z., Zhang, H., Zheng, Y., Ji, L., Zhang, Y., Chen, H., Yan, A., Guo, J., Yu, L., Wang, L., Liu, X., Shi, T., Zhu, M., Chen, Y., Yang, G., Tang, P., Xu, B., Giri, C., Clinton, N., Zhu, Z., Chen, J., and Chen, J.: Finer resolution observation and monitoring of global land cover: first mapping results with Landsat TM and ETM+ data, Int. J. Remote Sens., 34, 2607–2654, https://doi.org/10.1080/01431161.2012.748992, 2013. 

Gong, P., Liu, H., Zhang, M., Li, C., Wang, J., Huang, H., Clinton, N., Ji, L., Li, W., Bai, Y., Chen, B., Xu, B., Zhu, Z., Yuan, C., Ping Suen, H., Guo, J., Xu, N., Li, W., Zhao, Y., Yang, J., Yu, C., Wang, X., Fu, H., Yu, L., Dronova, I., Hui, F., Cheng, X., Shi, X., Xiao, F., Liu, Q., and Song, L.: Stable classification with limited sample: transferring a 30 m resolution sample set collected in 2015 to mapping 10 m resolution global land cover in 2017, Sci. Bull., 64, 370–373, https://doi.org/10.1016/j.scib.2019.03.002, 2019. 

Gong, P., Li, X., Wang, J., Bai, Y., Chen, B., Hu, T., Liu, X., Xu, B., Yang, J., Zhang, W., and Zhou, Y.: Annual maps of global artificial impervious area (GAIA) between 1985 and 2018, Remote Sens. Environ., 236, 111510, https://doi.org/10.1016/j.rse.2019.111510, 2020. 

Gong, P., Guo, H., Chen, B., Chen, F., He, G., Liang, D., Liu, Z., Sun, Z., Wu, J., Xu, Z., Yan, D., and Zhang, H.: iEarth: an interdisciplinary framework in the era of big data and AI for sustainable development, Natl. Sci. Rev., 10, nwad178, https://doi.org/10.1093/nsr/nwad178, 2023. 

Goyena, H., Pérez-Goya, U., Montesino-SanMartin, M., Militino, A. F., Wang, Q., Atkinson, P. M., and Ugarte, M. D.: Unpaired spatio-temporal fusion of image patches (USTFIP) from cloud covered images, Remote Sens. Environ., 295, 113709, https://doi.org/10.1016/j.rse.2023.113709, 2023. 

Griffiths, P., Nendel, C., and Hostert, P.: Intra-annual reflectance composites from Sentinel-2 and Landsat for national-scale crop and land cover mapping, Remote Sens. Environ., 220, 135–151, https://doi.org/10.1016/j.rse.2018.10.031, 2019. 

Guo, D., Shi, W., Hao, M., and Zhu, X.: FSDAF 2.0: Improving the performance of retrieving land cover changes and preserving spatial details, Remote Sens. Environ., 248, 111973, https://doi.org/10.1016/j.rse.2020.111973, 2020. 

Hansen, M. C., Roy, D. P., Lindquist, E., Adusei, B., Justice, C. O., and Altstatt, A.: A method for integrating MODIS and Landsat data for systematic monitoring of forest cover and change in the Congo Basin, Remote Sens. Environ., 112, 2495–2513, https://doi.org/10.1016/j.rse.2007.11.012, 2008. 

Hansen, M. C., Potapov, P. V., Moore, R., Hancher, M., Turubanova, S. A., Tyukavina, A., Thau, D., Stehman, S. V., Goetz, S. J., Loveland, T. R., Kommareddy, A., Egorov, A., Chini, L., Justice, C. O., and Townshend, J. R. G.: High-Resolution Global Maps of 21st-Century Forest Cover Change, Science, 342, 850–853, https://doi.org/10.1126/science.1244693, 2013. 

Hilker, T., Wulder, M. A., Coops, N. C., Linke, J., McDermid, G., Masek, J. G., Gao, F., and White, J. C.: A new data fusion model for high spatial- and temporal-resolution mapping of forest disturbance based on Landsat and MODIS, Remote Sens. Environ., 113, 1613–1627, https://doi.org/10.1016/j.rse.2009.03.007, 2009a. 

Hilker, T., Wulder, M. A., Coops, N. C., Seitz, N., White, J. C., Gao, F., Masek, J. G., and Stenhouse, G.: Generation of dense time series synthetic Landsat data through data blending with MODIS using a spatial and temporal adaptive reflectance fusion model, Remote Sens. Environ., 113, 1988–1999, https://doi.org/10.1016/j.rse.2009.05.011, 2009b. 

Holben, B. N.: Characteristics of maximum-value composite images from temporal AVHRR data, Int. J. Remote Sens., 7, 1417–1434, https://doi.org/10.1080/01431168608948945, 1986. 

Huang, H., Chen, Y., Clinton, N., Wang, J., Wang, X., Liu, C., Gong, P., Yang, J., Bai, Y., Zheng, Y., and Zhu, Z.: Mapping major land cover dynamics in Beijing using all Landsat images in Google Earth Engine, Remote Sens. Environ., 202, 166–176, https://doi.org/10.1016/j.rse.2017.02.021, 2017. 

Huete, A., Didan, K., Miura, T., Rodriguez, E. P., Gao, X., and Ferreira, L. G.: Overview of the radiometric and biophysical performance of the MODIS vegetation indices, Remote Sens. Environ., 83, 195–213, https://doi.org/10.1016/S0034-4257(02)00096-2, 2002. 

Inglada, J., Vincent, A., Arias, M., Tardy, B., Morin, D., and Rodes, I.: Operational High Resolution Land Cover Map Production at the Country Scale Using Satellite Image Time Series, Remote Sens.-Basel, 9, 95, https://doi.org/10.3390/rs9010095, 2017. 

Ji, L., Gong, P., Wang, J., Shi, J., and Zhu, Z.: Construction of the 500 m Resolution Daily Global Surface Water Change Database (2001–2016), Water Resour. Res., 54, 10270–10292, https://doi.org/10.1029/2018WR023060, 2018. 

Jin, S., Dewitz, J., Danielson, P., Granneman, B., Costello, C., Smith, K., and Zhu, Z.: National Land Cover Database 2019: A new strategy for creating clean leaf-on and leaf-off Landsat composite images, J. Remote Sens., 3, 0022, https://doi.org/10.34133/remotesensing.0022, 2023. 

Khan, A., Potapov, P., Hansen, M. C., Pickens, A. H., Tyukavina, A., Serna, A. H., Uddin, K., and Ahmad, J.: Perennial snow and ice cover change from 2001 to 2021 in the Hindu-Kush Himalayan region derived from the Landsat analysis-ready data, Remote Sensing Applications: Society and Environment, 34, 101192, https://doi.org/10.1016/j.rsase.2024.101192, 2024. 

Li, C., Gong, P., Wang, J., Zhu, Z., Biging, G. S., Yuan, C., Hu, T., Zhang, H., Wang, Q., Li, X., Liu, X., Xu, Y., Guo, J., Liu, C., Hackman, K. O., Zhang, M., Cheng, Y., Yu, L., Yang, J., Huang, H., and Clinton, N.: The first all-season sample set for mapping global land cover with Landsat-8 data, Sci. Bull., 62, 508–515, https://doi.org/10.1016/j.scib.2017.03.011, 2017. 

Li, H., Song, X.-P., Hansen, M. C., Becker-Reshef, I., Adusei, B., Pickering, J., Wang, L., Wang, L., Lin, Z., Zalles, V., Potapov, P., Stehman, S. V., and Justice, C.: Development of a 10 m resolution maize and soybean map over China: Matching satellite-based crop classification with sample-based area estimation, Remote Sens. Environ., 294, 113623, https://doi.org/10.1016/j.rse.2023.113623, 2023. 

Li, J. and Roy, D.: A Global Analysis of Sentinel-2A, Sentinel-2B and Landsat-8 Data Revisit Intervals and Implications for Terrestrial Monitoring, Remote Sens.-Basel, 9, 902, https://doi.org/10.3390/rs9090902, 2017. 

Li, Y., Huang, C., Hou, J., Gu, J., Zhu, G., and Li, X.: Mapping daily evapotranspiration based on spatiotemporal fusion of ASTER and MODIS images over irrigated agricultural areas in the Heihe River Basin, Northwest China, Agr. Forest Meteorol., 244–245, 82–97, https://doi.org/10.1016/j.agrformet.2017.05.023, 2017. 

Liang, X., Liu, Q., Wang, J., Chen, S., and Gong, P.: Global 500 m seamless dataset (2000–2022) of land surface reflectance generated from MODIS products, Earth Syst. Sci. Data, 16, 177–200, https://doi.org/10.5194/essd-16-177-2024, 2024. 

Liu, H., Gong, P., Wang, J., Clinton, N., Bai, Y., and Liang, S.: Annual dynamics of global land cover and its long-term changes from 1982 to 2015, Earth Syst. Sci. Data, 12, 1217–1243, https://doi.org/10.5194/essd-12-1217-2020, 2020. 

Liu, H., Gong, P., Wang, J., Wang, X., Ning, G., and Xu, B.: Production of global daily seamless data cubes and quantification of global land cover change from 1985 to 2020 – iMap World 1.0, Remote Sens. Environ., 258, 112364, https://doi.org/10.1016/j.rse.2021.112364, 2021. 

Liu, M., Yang, W., Zhu, X., Chen, J., Chen, X., Yang, L., and Helmer, E. H.: An Improved Flexible Spatiotemporal DAta Fusion (IFSDAF) method for producing high spatiotemporal resolution normalized difference vegetation index time series, Remote Sens. Environ., 227, 74–89, https://doi.org/10.1016/j.rse.2019.03.012, 2019. 

Liu, S., Zhou, J., Qiu, Y., Chen, J., Zhu, X., and Chen, H.: The FIRST model: Spatiotemporal fusion incorrporting spectral autocorrelation, Remote Sens. Environ., 279, 113111, https://doi.org/10.1016/j.rse.2022.113111, 2022. 

Liu, X., Huang, Y., Xu, X., Li, X., Li, X., Ciais, P., Lin, P., Gong, K., Ziegler, A. D., Chen, A., Gong, P., Chen, J., Hu, G., Chen, Y., Wang, S., Wu, Q., Huang, K., Estes, L., and Zeng, Z.: High-spatiotemporal-resolution mapping of global urban change from 1985 to 2015, Nat. Sustain., 3, 564–570, https://doi.org/10.1038/s41893-020-0521-x, 2020. 

Malambo, L. and Heatwole, C. D.: A Multitemporal Profile-Based Interpolation Method for Gap Filling Nonstationary Data, IEEE T. Geosci. Remote, 54, 252–261, https://doi.org/10.1109/TGRS.2015.2453955, 2016. 

Markham, B. L. and Helder, D. L.: Forty-year calibrated record of earth-reflected radiance from Landsat: A review, Remote Sens. Environ., 122, 30–40, https://doi.org/10.1016/j.rse.2011.06.026, 2012. 

Masek, J. G., Vermote, E. F., Saleous, N. E., Wolfe, R., Hall, F. G., Huemmrich, K. F., Gao, F., Kutler, J., and Lim, T.-K.: A Landsat Surface Reflectance Dataset for North America, 1990–2000, IEEE Geosci. Remote S., 3, 68–72, https://doi.org/10.1109/LGRS.2005.857030, 2006. 

Masek, J. G., Wulder, M. A., Markham, B., McCorkel, J., Crawford, C. J., Storey, J., and Jenstrom, D. T.: Landsat 9: Empowering open science and applications through continuity, Remote Sens. Environ., 248, 111968, https://doi.org/10.1016/j.rse.2020.111968, 2020. 

Mizuochi, H., Hiyama, T., Ohta, T., Fujioka, Y., Kambatuku, J. R., Iijima, M., and Nasahara, K. N.: Development and evaluation of a lookup-table-based approach to data fusion for seasonal wetlands monitoring: An integrated use of AMSR series, MODIS, and Landsat, Remote Sens. Environ., 199, 370–388, https://doi.org/10.1016/j.rse.2017.07.026, 2017. 

Morisette, J. T., Privette, J. L., and Justice, C. O.: A framework for the validation of MODIS Land products, Remote Sens. Environ., 83, 77–96, https://doi.org/10.1016/S0034-4257(02)00088-3, 2002. 

Nelson, K. J. and Steinwand, D.: A Landsat Data Tiling and Compositing Approach Optimized for Change Detection in the Conterminous United States, Photogramm. Eng. Rem. S., 81, 573–586, https://doi.org/10.14358/PERS.81.7.573, 2015. 

Olthof, I. and Fraser, R. H.: Mapping surface water dynamics (1985–2021) in the Hudson Bay Lowlands, Canada using sub-pixel Landsat analysis, Remote Sens. Environ., 300, 113895, https://doi.org/10.1016/j.rse.2023.113895, 2024. 

Pahlevan, N., Sarkar, S., Devadiga, S., Wolfe, R. E., Roman, M., Vermote, E., Lin, G., and Xiong, X.: Impact of Spatial Sampling on Continuity of MODIS–VIIRS Land Surface Reflectance Products: A Simulation Approach, IEEE T. Geosci. Remote, 55, 183–196, https://doi.org/10.1109/TGRS.2016.2604214, 2017. 

Pekel, J.-F., Cottam, A., Gorelick, N., and Belward, A. S.: High-resolution mapping of global surface water and its long-term changes, Nature, 540, 418–422, https://doi.org/10.1038/nature20584, 2016. 

Piao, S., Liu, Q., Chen, A., Janssens, I. A., Fu, Y., Dai, J., Liu, L., Lian, X., Shen, M., and Zhu, X.: Plant phenology and global climate change: Current progresses and challenges, Glob. Change Biol., 25, 1922–1940, https://doi.org/10.1111/gcb.14619, 2019. 

Pickens, A. H., Hansen, M. C., Stehman, S. V., Tyukavina, A., Potapov, P., Zalles, V., and Higgins, J.: Global seasonal dynamics of inland open water and ice, Remote Sens. Environ., 272, 112963, https://doi.org/10.1016/j.rse.2022.112963, 2022. 

Potapov, P., Hansen, M. C., Kommareddy, I., Kommareddy, A., Turubanova, S., Pickens, A., Adusei, B., Tyukavina, A., and Ying, Q.: Landsat Analysis Ready Data for Global Land Cover and Land Cover Change Mapping, Remote Sens.-Basel, 12, 426, https://doi.org/10.3390/rs12030426, 2020. 

Potapov, P., Li, X., Hernandez-Serna, A., Tyukavina, A., Hansen, M. C., Kommareddy, A., Pickens, A., Turubanova, S., Tang, H., Silva, C. E., Armston, J., Dubayah, R., Blair, J. B., and Hofton, M.: Mapping global forest canopy height through integration of GEDI and Landsat data, Remote Sens. Environ., 253, 112165, https://doi.org/10.1016/j.rse.2020.112165, 2021a. 

Potapov, P., Turubanova, S., Hansen, M. C., Tyukavina, A., Zalles, V., Khan, A., Song, X.-P., Pickens, A., Shen, Q., and Cortez, J.: Global maps of cropland extent and change show accelerated cropland expansion in the twenty-first century, Nat. Food, 3, 19–28, https://doi.org/10.1038/s43016-021-00429-z, 2021b. 

Potapov, P., Hansen, M. C., Pickens, A., Hernandez-Serna, A., Tyukavina, A., Turubanova, S., Zalles, V., Li, X., Khan, A., Stolle, F., Harris, N., Song, X.-P., Baggett, A., Kommareddy, I., and Kommareddy, A.: The Global 2000–2020 Land Cover and Land Use Change Dataset Derived From the Landsat Archive: First Results, Front. Remote Sens., 3, 856903, https://doi.org/10.3389/frsen.2022.856903, 2022a. 

Potapov, P., Turubanova, S., Hansen, M. C., Tyukavina, A., Zalles, V., Khan, A., Song, X.-P., Pickens, A., Shen, Q., and Cortez, J.: Global maps of cropland extent and change show accelerated cropland expansion in the twenty-first century, Nat. Food, 3, 19–28, https://doi.org/10.1038/s43016-021-00429-z, 2022b. 

Qiu, S., Zhu, Z., Olofsson, P., Woodcock, C. E., and Jin, S.: Evaluation of Landsat image compositing algorithms, Remote Sens. Environ., 285, 113375, https://doi.org/10.1016/j.rse.2022.113375, 2023. 

Roy, D. P., Ju, J., Kline, K., Scaramuzza, P. L., Kovalskyy, V., Hansen, M., Loveland, T. R., Vermote, E., and Zhang, C.: Web-enabled Landsat Data (WELD): Landsat ETM+ composited mosaics of the conterminous United States, Remote Sens. Environ., 114, 35–49, https://doi.org/10.1016/j.rse.2009.08.011, 2010. 

Roy, D. P., Kovalskyy, V., Zhang, H. K., Vermote, E. F., Yan, L., Kumar, S. S., and Egorov, A.: Characterization of Landsat-7 to Landsat-8 reflective wavelength and normalized difference vegetation index continuity, Remote Sens. Environ., 185, 57–70, https://doi.org/10.1016/j.rse.2015.12.024, 2016a. 

Roy, D. P., Zhang, H. K., Ju, J., Gomez-Dans, J. L., Lewis, P. E., Schaaf, C. B., Sun, Q., Li, J., Huang, H., and Kovalskyy, V.: A general method to normalize Landsat reflectance data to nadir BRDF adjusted reflectance, Remote Sens. Environ., 176, 255–271, https://doi.org/10.1016/j.rse.2016.01.023, 2016b. 

Sagan, V., Peterson, K. T., Maimaitijiang, M., Sidike, P., Sloan, J., Greeling, B. A., Maalouf, S., and Adams, C.: Monitoring inland water quality using remote sensing: potential and limitations of spectral indices, bio-optical simulations, machine learning, and cloud computing, Earth-Sci. Rev., 205, 103187, https://doi.org/10.1016/j.earscirev.2020.103187, 2020. 

Schaaf, C. B., Gao, F., Strahler, A. H., Lucht, W., Li, X., Tsang, T., Strugnell, N. C., Zhang, X., Jin, Y., Muller, J.-P., Lewis, P., Barnsley, M., Hobson, P., Disney, M., Roberts, G., Dunderdale, M., Doll, C., d'Entremont, R. P., Hu, B., Liang, S., Privette, J. L., and Roy, D.: First operational BRDF, albedo nadir reflectance products from MODIS, Remote Sens. Environ., 83, 135–148, https://doi.org/10.1016/S0034-4257(02)00091-3, 2002. 

Senf, C., Leitão, P. J., Pflugmacher, D., van der Linden, S., and Hostert, P.: Mapping land cover in complex Mediterranean landscapes using Landsat: Improved classification accuracies from integrating multi-seasonal and synthetic imagery, Remote Sens. Environ., 156, 527–536, https://doi.org/10.1016/j.rse.2014.10.018, 2015. 

Shang, R. and Zhu, Z.: Harmonizing Landsat 8 and Sentinel-2: A time-series-based reflectance adjustment approach, Remote Sens. Environ., 235, 111439, https://doi.org/10.1016/j.rse.2019.111439, 2019. 

Shen, H., Wu, P., Liu, Y., Ai, T., Wang, Y., and Liu, X.: A spatial and temporal reflectance fusion model considering sensor observation differences, Int. J. Remote Sens., 34, 4367–4383, https://doi.org/10.1080/01431161.2013.777488, 2013. 

Shen, H., Li, X., Cheng, Q., Zeng, C., Yang, G., Li, H., and Zhang, L.: Missing Information Reconstruction of Remote Sensing Data: A Technical Review, IEEE Geoscience and Remote Sensing Magazine, 3, 61–85, https://doi.org/10.1109/MGRS.2015.2441912, 2015. 

Shi, W., Guo, D., and Zhang, H.: A reliable and adaptive spatiotemporal data fusion method for blending multi-spatiotemporal-resolution satellite images, Remote Sens. Environ., 268, 112770, https://doi.org/10.1016/j.rse.2021.112770, 2022. 

Singh, D.: Generation and evaluation of gross primary productivity using Landsat data through blending with MODIS data, Int. J. Appl. Earth Obs., 13, 59–69, https://doi.org/10.1016/j.jag.2010.06.007, 2011. 

Song, X.-P., Hansen, M. C., Stehman, S. V., Potapov, P. V., Tyukavina, A., Vermote, E. F., and Townshend, J. R.: Global land change from 1982 to 2016, Nature, 560, 639–643, https://doi.org/10.1038/s41586-018-0411-9, 2018. 

Song, X.-P., Hansen, M. C., Potapov, P., Adusei, B., Pickering, J., Adami, M., Lima, A., Zalles, V., Stehman, S. V., Di Bella, C. M., Conde, M. C., Copati, E. J., Fernandes, L. B., Hernandez-Serna, A., Jantz, S. M., Pickens, A. H., Turubanova, S., and Tyukavina, A.: Massive soybean expansion in South America since 2000 and implications for conservation, Nat. Sustain., 4, 784–792, https://doi.org/10.1038/s41893-021-00729-z, 2021. 

Tian, F., Wang, Y., Fensholt, R., Wang, K., Zhang, L., and Huang, Y.: Mapping and Evaluation of NDVI Trends from Synthetic Time Series Obtained by Blending Landsat and MODIS Data around a Coalfield on the Loess Plateau, Remote Sens.-Basel, 5, 4255–4279, https://doi.org/10.3390/rs5094255, 2013. 

Tran, K. H., Zhang, H. K., McMaine, J. T., Zhang, X., and Luo, D.: 10 m crop type mapping using Sentinel-2 reflectance and 30 m cropland data layer product, Int. J. Appl. Earth Obs., 107, 102692, https://doi.org/10.1016/j.jag.2022.102692, 2022. 

Turubanova, S., Potapov, P., Hansen, M. C., Li, X., Tyukavina, A., Pickens, A. H., Hernandez-Serna, A., Arranz, A. P., Guerra-Hernandez, J., Senf, C., Häme, T., Valbuena, R., Eklundh, L., Brovkina, O., Navrátilová, B., Novotný, J., Harris, N., and Stolle, F.: Tree canopy extent and height change in Europe, 2001–2021, quantified using Landsat data archive, Remote Sens. Environ., 298, 113797, https://doi.org/10.1016/j.rse.2023.113797, 2023. 

Vermote, E., Justice, C., Claverie, M., and Franch, B.: Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product, Remote Sens. Environ., 185, 46–56, https://doi.org/10.1016/j.rse.2016.04.008, 2016. 

Vermote, E., Roger, J. C., Franch, B., and Skakun, S.: LaSRC (Land Surface Reflectance Code): Overview, application and validation using MODIS, VIIRS, LANDSAT and Sentinel 2 data's, in: IGARSS 2018 – 2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018, IEEE, 8173–8176, https://doi.org/10.1109/IGARSS.2018.8517622, 2018. 

Walker, J. J., de Beurs, K. M., Wynne, R. H., and Gao, F.: Evaluation of Landsat and MODIS data fusion products for analysis of dryland forest phenology, Remote Sens. Environ., 117, 381–393, https://doi.org/10.1016/j.rse.2011.10.014, 2012. 

Wang, Q., Zhang, Y., Onojeghuo, A. O., Zhu, X., and Atkinson, P. M.: Enhancing Spatio-Temporal Fusion of MODIS and Landsat Data by Incorporating 250 m MODIS Data, IEEE J. Sel. Top. Appl., 10, 4116–4123, https://doi.org/10.1109/JSTARS.2017.2701643, 2017. 

Wang, Q., Tang, Y., Tong, X., and Atkinson, P. M.: Virtual image pair-based spatio-temporal fusion, Remote Sens. Environ., 249, 112009, https://doi.org/10.1016/j.rse.2020.112009, 2020. 

Watts, J. D., Powell, S. L., Lawrence, R. L., and Hilker, T.: Improved classification of conservation tillage adoption using high temporal and synthetic satellite imagery, Remote Sens. Environ., 115, 66–75, https://doi.org/10.1016/j.rse.2010.08.005, 2011. 

White, J. C., Wulder, M. A., Hobart, G. W., Luther, J. E., Hermosilla, T., Griffiths, P., Coops, N. C., Hall, R. J., Hostert, P., Dyk, A., and Guindon, L.: Pixel-Based Image Compositing for Large-Area Dense Time Series Applications and Science, Can. J. Remote Sens., 40, 192–212, https://doi.org/10.1080/07038992.2014.945827, 2014. 

Wolfe, R. E., Roy, D. P., and Vermote, E.: MODIS land data storage, gridding, and compositing methodology: Level 2 grid, IEEE T. Geosci. Remote, 36, 1324–1338, https://doi.org/10.1109/36.701082, 1998. 

Wu, L., Liu, X., Liu, M., Yang, J., Zhu, L., and Zhou, B.: Online Forest Disturbance Detection at the Sub-Annual Scale Using Spatial Context From Sparse Landsat Time Series, IEEE T. Geosci. Remote, 60, 1–14, https://doi.org/10.1109/TGRS.2022.3145675, 2022. 

Wulder, M. A., Roy, D. P., Radeloff, V. C., Loveland, T. R., Anderson, M. C., Johnson, D. M., Healey, S., Zhu, Z., Scambos, T. A., Pahlevan, N., Hansen, M., Gorelick, N., Crawford, C. J., Masek, J. G., Hermosilla, T., White, J. C., Belward, A. S., Schaaf, C., Woodcock, C. E., Huntington, J. L., Lymburner, L., Hostert, P., Gao, F., Lyapustin, A., Pekel, J.-F., Strobl, P., and Cook, B. D.: Fifty years of Landsat science and impacts, Remote Sens. Environ., 280, 113195, https://doi.org/10.1016/j.rse.2022.113195, 2022. 

Yan, L. and Roy, D.: Large-Area Gap Filling of Landsat Reflectance Time Series by Spectral-Angle-Mapper Based Spatio-Temporal Similarity (SAMSTS), Remote Sens.-Basel, 10, 609, https://doi.org/10.3390/rs10040609, 2018. 

Yan, L. and Roy, D. P.: Spatially and temporally complete Landsat reflectance time series modelling: The fill-and-fit approach, Remote Sens. Environ., 241, 111718, https://doi.org/10.1016/j.rse.2020.111718, 2020. 

Yang, J., Gong, P., Fu, R., Zhang, M., Chen, J., Liang, S., Xu, B., Shi, J., and Dickinson, R.: The role of satellite remote sensing in climate change studies, Nat. Clim. Change, 3, 875–883, https://doi.org/10.1038/nclimate1908, 2013. 

Yang, J., Yao, Y., Wei, Y., Zhang, Y., Jia, K., Zhang, X., Shang, K., Bei, X., and Guo, X.: A Robust Method for Generating High-Spatiotemporal-Resolution Surface Reflectance by Fusing MODIS and Landsat Data, Remote Sens.-Basel, 12, 2312, https://doi.org/10.3390/rs12142312, 2020. 

Yin, Q., Liu, M., Cheng, J., Ke, Y., and Chen, X.: Mapping Paddy Rice Planting Area in Northeastern China Using Spatiotemporal Data Fusion and Phenology-Based Method, Remote Sens.-Basel, 11, 1699, https://doi.org/10.3390/rs11141699, 2019. 

Zhang, H. K., Luo, D., and Li, Z.: Classifying raw irregular time series (CRIT) for large area land cover mapping by adapting transformer model, Science of Remote Sensing, 9, 100123, https://doi.org/10.1016/j.srs.2024.100123, 2024. 

Zhang, W., Li, A., Jin, H., Bian, J., Zhang, Z., Lei, G., Qin, Z., and Huang, C.: An Enhanced Spatial and Temporal Data Fusion Model for Fusing Landsat and MODIS Surface Reflectance to Generate High Temporal Landsat-Like Data, Remote Sens.-Basel, 5, 5346–5368, https://doi.org/10.3390/rs5105346, 2013. 

Zhou, Q., Zhu, Z., Xian, G., and Li, C.: A novel regression method for harmonic analysis of time series, ISPRS J. Photogramm., 185, 48–61, https://doi.org/10.1016/j.isprsjprs.2022.01.006, 2022. 

Zhu, X., Chen, J., Gao, F., Chen, X., and Masek, J. G.: An enhanced spatial and temporal adaptive reflectance fusion model for complex heterogeneous regions, Remote Sens. Environ., 114, 2610–2623, https://doi.org/10.1016/j.rse.2010.05.032, 2010. 

Zhu, X., Helmer, E. H., Gao, F., Liu, D., Chen, J., and Lefsky, M. A.: A flexible spatiotemporal method for fusing satellite images with different resolutions, Remote Sens. Environ., 172, 165–177, https://doi.org/10.1016/j.rse.2015.11.016, 2016.  

Zhu, X., Cai, F., Tian, J., and Williams, T.: Spatiotemporal Fusion of Multisource Remote Sensing Data: Literature Survey, Taxonomy, Principles, Applications, and Future Directions, Remote Sens.-Basel, 10, 527, https://doi.org/10.3390/rs10040527, 2018. 

Zhu, X., Zhan, W., Zhou, J., Chen, X., Liang, Z., Xu, S., and Chen, J.: A novel framework to assess all-round performances of spatiotemporal fusion models, Remote Sens. Environ., 274, 113002, https://doi.org/10.1016/j.rse.2022.113002, 2022. 

Zhu, Z. and Woodcock, C. E.: Object-based cloud and cloud shadow detection in Landsat imagery, Remote Sens. Environ., 118, 83–94, https://doi.org/10.1016/j.rse.2011.10.028, 2012. 

Zhu, Z., Wang, S., and Woodcock, C. E.: Improvement and expansion of the Fmask algorithm: cloud, cloud shadow, and snow detection for Landsats 4–7, 8, and Sentinel 2 images, Remote Sens. Environ., 159, 269–277, https://doi.org/10.1016/j.rse.2014.12.014, 2015a. 

Zhu, Z., Woodcock, C. E., Holden, C., and Yang, Z.: Generating synthetic Landsat images based on all available Landsat data: Predicting Landsat surface reflectance at any given time, Remote Sens. Environ., 162, 67–83, https://doi.org/10.1016/j.rse.2015.02.009, 2015b. 

Zurita-Milla, R., Clevers, J. G. P. W., and Schaepman, M. E.: Unmixing-Based Landsat TM and MERIS FR Data Fusion, IEEE Geosci. Remote S., 5, 453–457, https://doi.org/10.1109/LGRS.2008.919685, 2008. 

Download
Short summary
The inconsistent coverage of Landsat data due to its long revisit intervals and frequent cloud cover poses challenges to large-scale land monitoring. We developed a global 30 m 23-year (2000–2022) daily seamless data cube (SDC) of surface reflectance based on Landsat 5, 7, 8, and 9 and MODIS products. The SDC exhibits enhanced capabilities for monitoring land cover changes and robust consistency in both spatial and temporal dimensions, which are important for global environmental monitoring.
Altmetrics
Final-revised paper
Preprint