the Creative Commons Attribution 4.0 License.
Global 30-m seamless data cube (2000–2022) of land surface reflectance generated from Landsat-5,7,8,9 and MODIS Terra constellations
Abstract. The Landsat series constitutes an unparalleled repository of multi-decadal Earth observations, serving as a cornerstone in global environmental monitoring. However, the inconsistent coverage of Landsat data due to its long revisit intervals and frequent cloud cover poses significant challenges to land monitoring over large geographical extents. In this study, we developed a full-chain processing framework for the multi-sensor data fusion of Landsat-5, 7, 8, 9 and MODIS Terra surface reflectance products. Based on this framework, a global, 30-m resolution, and daily Seamless Data Cube (SDC) of land surface reflectance was generated, spanning from 2000 to 2022. A thorough evaluation of the SDC was undertaken using a leave-one-out approach and a cross-comparison with NASA’s Harmonized Landsat and Sentinel-2 (HLS) products. The leave-one-out validation at 425 global test sites assessed the agreement between the SDC and actual Landsat surface reflectance values (not used as input), revealing an overall Mean Absolute Error (MAE) of 0.014 (the valid range of surface reflectance values is 0–1). The cross-comparison with the HLS products at 22 Military Grid Reference System (MGRS) tiles revealed an overall Mean Absolute Deviation (MAD) of 0.017 with L30 (Landsat-8-based 30-m HLS product) and a MAD of 0.021 with S30 (Sentinel-2-based 30-m HLS product). Moreover, experimental results underscore the advantages of employing the SDC for global land cover classification, achieving a sizable improvement in overall accuracy (2.4 %–11.3 %) over that obtained using Landsat composite and interpolated datasets. A web-based interface has been developed for researchers to freely access the SDC dataset, which is available at https://doi.org/10.12436/SDC30.26.20240506 (Chen et al., 2024).
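For readers who want to reproduce the headline metric, the following is a minimal sketch of how a mean absolute error between reconstructed and withheld reflectance values could be computed. It assumes reflectance already scaled to 0–1 and uses `None` to mark pixels without a valid observation. The function name `mean_absolute_error` and the sample values are illustrative assumptions, not part of the SDC code base or its validation data.

```python
from typing import Optional, Sequence


def mean_absolute_error(
    predicted: Sequence[Optional[float]],
    observed: Sequence[Optional[float]],
) -> float:
    """Mean absolute error over pixel pairs where both values are valid."""
    if len(predicted) != len(observed):
        raise ValueError("inputs must have the same length")
    # Skip pairs where either side is missing (e.g. cloud-masked pixels).
    diffs = [
        abs(p - o)
        for p, o in zip(predicted, observed)
        if p is not None and o is not None
    ]
    if not diffs:
        raise ValueError("no valid pixel pairs to compare")
    return sum(diffs) / len(diffs)


# Hypothetical values: a reconstructed image vs. a withheld Landsat observation.
reconstructed = [0.10, 0.22, None, 0.35]
withheld = [0.12, 0.20, 0.30, None]
mae = mean_absolute_error(reconstructed, withheld)
print(f"MAE over valid pairs: {mae:.3f}")
```

Only the first two pixel pairs are valid in this toy input, so the error is averaged over those two pairs rather than over all four positions.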
Status: closed
-
CC1: 'Comment on essd-2024-178', Diana Jones, 29 Jun 2024
The DOI link to the shared dataset is not accessible.
Citation: https://doi.org/10.5194/essd-2024-178-CC1 -
CC2: 'Reply on CC1', Jie Wang, 30 Jun 2024
Thank you for your interest in our data.
I tried to access the data via https://doi.org/10.12436/SDC30.26.20240506 and it was redirected to https://data-starcloud.pcl.ac.cn/resource/26 correctly.
The web page contains a Resource Description of the data. Please follow the instructions given there to access the data. Thank you!
Citation: https://doi.org/10.5194/essd-2024-178-CC2 -
CC3: 'Reply on CC2', Diana Jones, 02 Jul 2024
Thank you for updating the data access URL. However, I still have two questions.
First, the public SDC covers only a very small area of the world. It appears that there is no data available for the entire United States.
Second, the time frequency of the public SDC does not seem to be daily; instead, it appears to be every four days or even eight days.
Citation: https://doi.org/10.5194/essd-2024-178-CC3 -
AC1: 'Reply on CC3', Shuang Chen, 02 Jul 2024
Thank you for your comment. We have provided a web-based interface that allows users to access and customize the SDC for any location (except Antarctica) and with any temporal frequency (up to daily). Detailed information and instructions can be found at the following webpage: https://doi.org/10.12436/SDC30.26.20240506 or https://data-starcloud.pcl.ac.cn/resource/26. Please search the key words "Test Account" and "Instructions". If you have any other questions, please feel free to contact us.
Citation: https://doi.org/10.5194/essd-2024-178-AC1 -
CC4: 'Reply on AC1', Diana Jones, 03 Jul 2024
Thanks for your explanation.
1. As you suggested, I used the test account you provided and attempted to submit an order for data download. However, I was unable to do so successfully. The website displayed the message "Failed to write order to OBS." I tried various locations worldwide, but all attempts were unsuccessful. Based on this situation, I am uncertain whether the data actually exists. This seems to contradict the aims and scope of the ESSD journal. At the very least, the manuscript should not be published or reviewed further until all the claimed data are fully disclosed. Since the data results and authenticity cannot be verified, only theoretical and methodological discussions can be carried out at this stage.
2. The manuscript states that the accuracy of the SDC dataset has an MAE of 0.014. From the perspective of review and publication, the complete test dataset should be disclosed for public verification by third parties. This is a common and recognized practice in the field of image classification and segmentation; otherwise, the authenticity of the accuracy result cannot be confirmed. Additionally, considering the issues mentioned in point 1, if most of the data do not exist, how was the accuracy verified? This raises a significant concern. Furthermore, I noticed that the website also provides land cover data by year. However, the manuscript appears to only provide an OA indicator. Which year's accuracy is this? Is using OA alone sufficient to verify the accuracy of such a large dataset?
3. If the data are to be made publicly available for free, it would be best to provide direct download links (similar to practices by USGS, ESA, and other remote sensing data providers) to facilitate user access. Currently, it seems that most of the data are inaccessible.
Citation: https://doi.org/10.5194/essd-2024-178-CC4 -
AC2: 'Reply on CC4', Shuang Chen, 04 Jul 2024
Thank you for your valuable feedback.
1. We have found that the "Standard" mode cannot process multiple order submissions simultaneously, which triggered the error "Failed to XXX". We apologize for the inconvenience, and we are working on solving this issue and improving the serving capability of our system. Please select the "Quick" order mode, which is more stable at this stage.
The data volume of a global daily SDC from 2000 to 2022 would exceed 22 PB. There is not yet a cost-effective way to store such a vast amount of data. Therefore, we use a distributed storage system as a cached data pool to store the most recently accessed SDC data. Users can access the SDC for their AOI through the provided interface. If the requested SDC data are not available in the cached data pool, the system will generate the SDC from raw data in real time and then return the generated SDC data to users.
Considering the large data volume and complex data storage mechanisms, we are not able to provide direct download links. The order-download mode is also a common practice as adopted by ESA’s EO Browser and USGS’s EarthExplorer platforms. We apologize for the inconvenience and will continuously work on improving the usability of our system.
2. As described in our manuscript, we conducted a leave-one-out validation on the reconstructed SDC dataset, which reveals an overall MAE of 0.014. We have uploaded the related test dataset in the Download section of our website. The data used in the cross-comparison with NASA’s HLS dataset are also available.
This manuscript focuses on the development and validation of the SDC dataset, and the land cover data provided on our website were generated to test the capability of SDC for global applications. The Overall Accuracy (OA) indicator mentioned in Table 9 is the result of the comparative experiment, and it is not related to the provided land cover data files. We briefly mentioned the land cover data in the Data Availability section. Other researchers can download these data for reference if they are interested. We will improve our mapping framework and formally publish complete land cover products in our future work.
3. We have provided a web-based interface that allows users to access and customize the SDC for any location (except Antarctica) and with any temporal frequency (up to daily). Detailed information and instructions can be found at the following webpage: https://doi.org/10.12436/SDC30.26.20240506 or https://data-starcloud.pcl.ac.cn/resource/26. Large-scale SDC data processing requires computing capacity that exceeds the capability of this web-based interface. If users need large-scale SDC data for their research, please feel free to contact us about cooperation. We have a computing cluster that can efficiently reconstruct global-scale SDC data for subsequent applications.
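The cache-or-generate behaviour described in point 1 of this reply can be pictured with a short sketch. It is an illustration under stated assumptions, not the authors' system: the class name `SdcCache`, the `(tile, date)` key, and the `generate` callback are all hypothetical, and a small in-memory least-recently-used store stands in for the distributed storage pool.

```python
from collections import OrderedDict
from typing import Callable, Hashable, List


class SdcCache:
    """Tiny least-recently-used cache that generates missing entries on demand."""

    def __init__(self, capacity: int, generate: Callable[[Hashable], bytes]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._generate = generate  # hypothetical on-demand reconstruction step
        self._store: "OrderedDict[Hashable, bytes]" = OrderedDict()

    def get(self, key: Hashable) -> bytes:
        if key in self._store:
            # Cache hit: mark the entry as most recently used.
            self._store.move_to_end(key)
            return self._store[key]
        # Cache miss: build the data, store it, and evict the oldest entry if full.
        data = self._generate(key)
        self._store[key] = data
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)
        return data


# Hypothetical usage: record how often the expensive generation step runs.
calls: List[Hashable] = []


def fake_generate(key: Hashable) -> bytes:
    calls.append(key)
    return f"cube:{key}".encode()


cache = SdcCache(capacity=2, generate=fake_generate)
cache.get(("tile_a", "2020-06-01"))  # generated on first request
cache.get(("tile_a", "2020-06-01"))  # served from the cache
```

In this toy setup the second request does not trigger generation, which mirrors the stated idea that only recently accessed data are kept and everything else is rebuilt when asked for.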
Citation: https://doi.org/10.5194/essd-2024-178-AC2 -
CC5: 'Reply on AC2', Delmar Wilson, 05 Jul 2024
1. Before I raised concerns, the DOI provided by the authors for the SDC dataset was invalid, and no validation data was supplied. After my inquiry, the authors updated the data access URL and uploaded the validation data. However, when I attempted to download the validation data, the system indicated "No access permission." This makes me question whether the authors genuinely intend to make the dataset public. Sharing data publicly is a fundamental requirement for articles published in the ESSD journal. Providing validation data is also essential for assessing accuracy; without it, the dataset's authenticity cannot be verified.
2. Issues related to data storage and website functionality are the authors' responsibility. These are not my primary concerns and are unrelated to the data's availability. Avoiding the provision of a complete dataset due to its large size is unacceptable. Unless there is a reasonable third-party verification, we are justified in questioning the data's authenticity. If the current storage mechanism and website cannot meet the requirements, resulting in the dataset not being publicly available, then this paper should be retracted. The paper can be resubmitted once these issues are resolved and the complete dataset is provided.
3. Although this dataset closely resembles existing data products from USGS and ESA, such as the HLS dataset, I am still interested in some of the unique characteristics claimed in this paper. I hope the authors can adopt a sincere and transparent attitude to promptly release the complete dataset in an accessible and verifiable manner, in accordance with ESSD journal publication requirements and data openness regulations. This will allow third-party validation and oversight to alleviate concerns regarding data authenticity and accuracy. I believe this approach will better advance the field. I am also willing to offer any assistance and support within my capabilities to facilitate the successful publication and application of this paper.
Citation: https://doi.org/10.5194/essd-2024-178-CC5 -
AC3: 'Reply on CC5', Shuang Chen, 09 Jul 2024
Dear Delmar Wilson,
Thank you for your comment.
I am sure that I have not received any comment/message from you before. Or, do you mean that the previous comments posted by Diana Jones are also from you? Anyway, I will respond to all your concerns here.
- The complete message you received should be “No access permission. Please login in first.” So, you may need to log in using the provided test account. You will then be able to download the validation data as many times as you need. If it still does not work, try logging out and then logging in again.
- The DOI was registered on May 6, 2024. We have not made, and are not able to make, any modifications to the DOI since that time. We tested and verified the accessibility of this DOI beforehand. We do not know why you could not access the DOI at that time; perhaps you could check your network conditions.
- As described in the manuscript, the SDC dataset for any location (except Antarctica) and any time (2000-2022) is available via the provided interface on our website, which meets the data access requirements of ESSD. If you have any questions about accessing the SDC dataset, please feel free to contact us at any time. We are willing to provide any assistance and information needed.
- Before the submission of this manuscript, I had made the validation data available on another website (sdc.iearth.com, as described in the manuscript). After receiving your request, I made a copy of these validation data and uploaded it to the project website, so that it can be found by users more easily.
- Thanks for your interest in this research work. We are working on improving our system so that it can better serve all users within the community. We are grateful for your willingness to offer assistance in improving the data access system. Could you provide us with your contact information? We can then discuss it in detail.
Citation: https://doi.org/10.5194/essd-2024-178-AC3
-
CC6: 'Comment on essd-2024-178', Diana Jones, 10 Jul 2024
The previous conversation was too long, so I am starting a new comment for easier reading.
1. I have tried logging in with the provided test account multiple times, but I still cannot download the validation data. Either the system does not respond at all, or it displays a "Download failed" message. Have the authors successfully downloaded the validation data using the method you described?
2. The authors have provided many different URLs for data access, which is very confusing. Additionally, the data access system provided by the authors is user-unfriendly and highly unstable. Throughout the download process, you may encounter various issues at each step. These problems significantly increase the difficulty of obtaining the data and may even render the data inaccessible. I strongly recommend that other readers participate in testing the downloadability and accessibility of the data.
3. I do not believe that the accessibility and transparency of the data in this manuscript comply with the ESSD journal's data policy.
For example, the ESSD journal requires that,
"upon submission, all data must be directly accessible through link(s) in the manuscript;"
This means that upon manuscript submission to the journal, all data supporting the research results must be directly accessible via clickable links, without additional steps or permissions.
However, I do not think your data meets this criterion. As mentioned earlier, even after you submitted the manuscript, and even up to now, I have encountered many issues accessing the data website and downloading the data. For instance, the system frequently displays messages such as "Failed to write order to OBS," "Download failed," and "Failed to get the file under the directory."
Additionally, the ESSD journal also requires that,
"upon submission, authors must certify some form of fully anonymous review access, directly at the chosen repository (i.e., no registration, name, email, or other information is required of reviewers as they access the data);"
This means that data should be accessible without requiring registration or providing any information. However, your current data access method does not comply with this policy. In your system, it is necessary to register and log in to access the data. Otherwise, it prompts "No access permission. Please log in first." Sometimes, even after logging in, the data cannot be accessed or downloaded.
4. I am willing to offer assistance but would like to know the specific progress and timeline for the system improvements. I hope the authors can take a candid approach to make substantial improvements to the system. If there are any bottlenecks or difficulties, we can work together to resolve them. However, the authors' responses so far have consistently denied the existence of these objective issues and have not resulted in any substantive improvements.
Citation: https://doi.org/10.5194/essd-2024-178-CC6 -
CC7: 'Reply on CC6', Yamal Vivian, 11 Jul 2024
I tried to access the data following provided instructions, and successfully downloaded the data.
Citation: https://doi.org/10.5194/essd-2024-178-CC7 -
CC12: 'Reply on CC7', Diana Jones, 14 Jul 2024
I'm referring to the validation dataset. I still haven't been able to successfully download it. Have you managed to download the validation dataset?
Citation: https://doi.org/10.5194/essd-2024-178-CC12
-
CC8: 'Reply on CC6', Ethan Carter, 11 Jul 2024
I encountered a similar issue. I also failed to download the data, and the validation data download was unresponsive.
Citation: https://doi.org/10.5194/essd-2024-178-CC8 -
CC9: 'Reply on CC6', Congcong Li, 12 Jul 2024
I successfully downloaded the data using the test account and the instructions provided.
-
CC11: 'Reply on CC9', Diana Jones, 14 Jul 2024
I'm referring to the validation dataset. I still haven't been able to successfully download it. Have you managed to download the validation dataset?
Citation: https://doi.org/10.5194/essd-2024-178-CC11 -
AC6: 'Reply on CC11', Shuang Chen, 04 Aug 2024
Thank you for your feedback and for bringing this issue to our attention. The system may take longer to respond because of the large file size of the experimental data. We apologize for any inconvenience this has caused.
We have also uploaded the experimental data to OneDrive; you can access it through this URL: https://1drv.ms/u/s!AvA96w8h5Q9sgtEBEk9J5D0nKa5tsA?e=TdNRk7.
If you continue to experience issues, please let us know, and we will assist you further.
Citation: https://doi.org/10.5194/essd-2024-178-AC6
-
RC1: 'Comment on essd-2024-178', X. Zhu, 12 Jul 2024
General comments
This paper presents the development of a global 30-m seamless data cube by fusing Landsat and MODIS data, building on the authors' previous methods. This work is highly valuable for future applications requiring fine-resolution time series data. Despite the numerous algorithms developed for fusing Landsat and MODIS data or filling gaps in Landsat data, no global datasets generated by these technologies are currently available. However, the paper has several issues that need to be addressed.
Specific comments
- Page 3, Line 70: For Landsat interpolation methods, there are techniques that do not require numerous clear-sky observations. For instance, the nearest similar pixel interpolator method only needs one clear-sky observation.
- Introduction: Before the last paragraph, the authors should discuss the current research gap. Specifically, they should explain why a full-chain processing framework and such fused data are necessary. Additionally, they should outline the challenges users face when using current methods to produce data independently.
- Page 11, Figure 3: When performing harmonization, was Landsat upscaled to the resolution of MODIS? Figure 3 suggests that both datasets have the same resolution. A linear transformation model was used; how do the authors address the issue when the linear model is not statistically significant?
- Major Differences Between uROBOT and ROBOT Models: What are the primary differences between the uROBOT model and the previously developed ROBOT model by the authors?
- Accuracy of the Time Series Model in Eq. 6: The accuracy of the time series model in Equation 6 could be affected by the time interval of the data. In cloudy regions, if the data is too sparse, is the result reliable? How do the authors address diverse changes when the data is sparse?
- Eq. 7: The coefficients from MODIS are used for Landsat. This approach may be acceptable if the land is homogeneous, but it lacks a clear mechanism for complex landscapes. Landsat pixels have different temporal dependencies compared to coarse pixels. If this issue affects the reliability of the final product, the end users should be notified.
- Eq. 9: The equation redistributes the residual to handle land cover changes, but the residual is at a coarse scale. How do the authors address changes at the fine pixel scale?
- Page 15, Line 355: The three accuracy metrics mentioned cannot adequately assess the spatial context preserved by the fused images. Some spatial metrics, such as edge features (see examples in https://doi.org/10.1016/j.rse.2022.113002), should be presented.
- Tables 4-6: Is the accuracy assessment conducted for each site? The tables only show the mean value. What is the range of these indices?
- Figure 14: The RMSD values in Figure 14 are on a different scale compared to Tables 4-6.
- Table 9: It would be beneficial to include a figure showing examples where only the Seamless Data Cube (SDC) can accurately classify the pixels, whereas other data cannot.
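The question raised above about a linear harmonization model that is not statistically significant can be made concrete with a small sketch. It is one possible check, not the authors' procedure: the function names `fit_linear` and `harmonize`, the `min_r_squared` threshold of 0.5, the identity fallback, and the sample values are all assumptions made for illustration, and an r-squared cutoff is only a crude stand-in for a formal significance test.

```python
from typing import Sequence, Tuple


def fit_linear(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares fit y = slope * x + intercept; also returns r squared."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        # Degenerate sample: report an identity mapping with no explanatory power.
        return 1.0, 0.0, 0.0
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def harmonize(
    value: float,
    x: Sequence[float],
    y: Sequence[float],
    min_r_squared: float = 0.5,  # assumed threshold, chosen only for illustration
) -> float:
    """Apply the fitted transform, or leave the value unchanged if the fit is weak."""
    slope, intercept, r_squared = fit_linear(x, y)
    if r_squared < min_r_squared:
        return value  # assumed fallback: identity
    return slope * value + intercept


# Hypothetical paired reflectance samples from two sensors at a common scale.
sensor_a = [0.10, 0.20, 0.30, 0.40]
sensor_b = [0.12, 0.22, 0.32, 0.42]
adjusted = harmonize(0.25, sensor_a, sensor_b)
```

With these toy samples the fit is essentially exact, so the transform is applied; with uncorrelated samples the function returns the input value unchanged.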
Citation: https://doi.org/10.5194/essd-2024-178-RC1 -
AC4: 'Reply on RC1', Shuang Chen, 01 Aug 2024
Dear X. Zhu,
Thank you very much for the time you spent on our manuscript and for the useful comments. In the attached file, we provide responses to your comments and suggestions and point to the changes made in the revised manuscript.
Regards,
The authors
-
CC10: 'Comment on essd-2024-178', James Anderson, 12 Jul 2024
Many thanks to the authors for providing such a dataset. However, I have some concerns regarding its quality.
1. The image quality in many areas seems to be poor, as shown in the example below. Severe cloud contamination and strong spatial stitching effects were observed.