the Creative Commons Attribution 4.0 License.
Democratizing planetary-scale analysis: An ultra-lightweight Earth embedding database for accurate and flexible global land monitoring
Abstract. The rapid evolution of satellite-borne Earth Observation (EO) systems has fundamentally revolutionized terrestrial monitoring, yielding comprehensive petabyte-scale archives. However, the immense computational resources and storage volumes required for global-scale analysis often preclude widespread use by many research teams, hindering broader scientific adoption and the execution of planetary-scale studies. To address these barriers, we present the Embedded Seamless Data (ESD), an ultra-lightweight, 30-m global Earth embedding database spanning the 25-year period from 2000 to 2024.
By transforming high-dimensional, multi-sensor observations from the Landsat series (5, 7, 8, and 9) and MODIS Terra into information-dense, quantized latent vectors, ESD distils essential geophysical and semantic features into a unified latent space. Utilizing the ESDNet architecture and Finite Scalar Quantization (FSQ), the dataset achieves a transformative ~340-fold reduction in data volume compared to raw daily archives. This compression allows the entire global land surface for a single year to be encapsulated within approximately 2.4 TB, enabling decadal-scale global analysis on standard local workstations.
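To make the quantization step concrete, the following is a minimal inference-time sketch of Finite Scalar Quantization in general, not the ESDNet implementation. The per-channel level counts in `LEVELS` are hypothetical, chosen only so that the implied codebook has 16^4 = 65,536 entries; this page does not state the configuration ESD actually uses. The function names are illustrative, and the straight-through gradient estimator needed for training is omitted.

```python
import numpy as np

# Hypothetical level counts per latent channel (an assumption, not from the paper).
LEVELS = np.array([16, 16, 16, 16])


def fsq_digits(z, levels=LEVELS, eps=1e-3):
    """Map real-valued latents of shape (..., C) to integer digits in [0, levels[c])."""
    half_l = (levels - 1) * (1 - eps) / 2
    # Even level counts need a half-step offset so rounding yields exactly L values.
    offset = np.where(levels % 2 == 1, 0.0, 0.5)
    shift = np.arctanh(offset / half_l)
    bounded = np.tanh(z + shift) * half_l - offset
    quantized = np.round(bounded)
    # Shift the signed grid so digits start at zero.
    return (quantized + levels // 2).astype(np.int64)


def fsq_index(digits, levels=LEVELS):
    """Pack per-channel digits into one integer code per pixel (mixed-radix)."""
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    return (digits * basis).sum(axis=-1)


# Example: 12 temporal steps over a 4 x 4 tile with 4 latent channels.
rng = np.random.default_rng(0)
latents = rng.normal(size=(12, 4, 4, 4))
codes = fsq_index(fsq_digits(latents))  # shape (12, 4, 4), one integer per pixel and step
```

Because every channel is rounded to a small fixed grid, each pixel and time step reduces to a single integer code, which is where the storage saving of a scheme like this comes from.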
Rigorous validation demonstrates that ESD maintains high reconstructive fidelity to the original reflectance values across the spectral dimension, achieving a Mean Absolute Error (MAE) of 0.0130 (averaged over six spectral bands: Blue, Green, Red, NIR, SWIR1, and SWIR2), a Root Mean Square Error (RMSE) of 0.0179, and a Correlation Coefficient (CC) of 0.8543. By condensing the annual phenological cycle into 12 temporal latent steps, the embeddings provide inherent denoising effects and a semantically organized latent space that outperforms raw reflectance data in downstream land-cover classification tasks, achieving an overall accuracy of 79.74 % compared with the 76.92 % obtained using raw sensor fusion data on globally distributed land cover sample sets. With robust few-shot learning capabilities and longitudinal consistency across 25 years, the ESD product provides a versatile foundation for democratizing planetary-scale Earth system research and advancing next-generation geospatial artificial intelligence.
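For readers who want to run the same kind of fidelity check on their own reconstructions, the sketch below computes MAE, RMSE, and CC per band and then averages over bands. It rests on two assumptions not confirmed by the abstract: that CC is the Pearson correlation coefficient, and that all three metrics are averaged per band rather than pooled over all pixels. The function name `band_metrics` is illustrative.

```python
import numpy as np


def band_metrics(reference, reconstructed):
    """Return MAE, RMSE and Pearson CC averaged over bands.

    Both inputs have shape (N, B): N samples (pixels) by B spectral bands.
    """
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    err = rec - ref
    mae = np.abs(err).mean(axis=0)            # one value per band
    rmse = np.sqrt((err ** 2).mean(axis=0))   # one value per band
    cc = np.array([
        np.corrcoef(ref[:, b], rec[:, b])[0, 1] for b in range(ref.shape[1])
    ])
    return {"MAE": mae.mean(), "RMSE": rmse.mean(), "CC": cc.mean()}


# Synthetic example: six bands, reconstruction biased by a constant 0.01.
rng = np.random.default_rng(0)
truth = rng.uniform(0.0, 1.0, size=(100, 6))
scores = band_metrics(truth, truth + 0.01)
```

A constant bias leaves the correlation at 1 while MAE and RMSE both equal the bias, which is one reason the three metrics are usually reported together.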
Competing interests: At least one of the (co-)authors is a member of the editorial board of Earth System Science Data.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: open (until 28 May 2026)
- RC1: 'Solid work that needs minor corrections', Zhengpeng Feng, 10 Mar 2026
- RC2: 'Comment on essd-2026-57', Anonymous Referee #2, 30 Apr 2026
This paper proposes an ultra-lightweight 30-m Earth embedding covering 2000 to 2024 for global land monitoring. Multi-sensor observations from the Landsat series (5, 7, 8, and 9), MODIS Terra, and various land cover products are used as input data, and ESDNet is designed for embedding generation. Experimental results demonstrate the performance of the proposed embedding and ESDNet. Overall, the topic is interesting, and the presentation could be further improved.
- What are the differences between the proposed Earth embedding and the Alpha Earth embedding?
- For Fig. 2, please provide a detailed caption to help readers follow the figure.
- For ESDNet, how is high-frequency noise avoided during multimodal feature fusion?
- Network details, such as the number of encoder-decoder layers and the kernel size, should be provided for reproducibility.
- For Section 3.3, details of the total loss function are missing. How were the weight parameters of the weighted loss chosen? Are there experiments showing the results for different weights?
- Please double-check all equations and figures to make sure that each term is explained clearly.
- “(b) Supervised Regression Loss” should be “(c) Supervised Regression Loss”.
- How are the three loss functions jointly optimized? The detailed training strategy should be provided.
- For Table 4, a discussion of the performance across different bands would be a welcome addition.
- What is the potential for this embedding to be deployed in foundation models?
Citation: https://doi.org/10.5194/essd-2026-57-RC2
- RC3: 'Comment on essd-2026-57', Anonymous Referee #3, 18 May 2026
This manuscript proposes a lightweight Embedded Seamless Database (ESD) based on global Earth observation images, which features high compression and high reconstructive fidelity. The research behind this dataset is of great significance, as it provides a different solution from AEFs for global-scale Earth analysis. Judging from the download volume on the data portal (https://data-starcloud.pcl.ac.cn/iearthdata), the dataset appears to be very popular. Overall, it is an interesting dataset and a well-written manuscript. I would like to offer the following comments for reference.
- Figures 1 and 2 are the core illustrations that convey the essential methodological logic of the paper, directly shaping readers’ understanding of the ESD pipeline and the ESDNet architecture. However, the current version suffers from serious ambiguity and insufficient information delivery. In particular, subfigures (b) and (c) in Figure 2 give the impression of being redundant and lacking substantive meaning due to design deficiencies. In addition, there is some logical overlap between Figures 1 and 2, and the authors may consider improving the way the relationship between these two figures is presented. The relationships among the three subfigures (a), (b), and (c) in Figure 2 also require more detailed explanation and clearer correspondence. Overall, as these are central components of the paper, it would be helpful to include more key information and technical background regarding ESDNet and the FSQ module.
- The three core dimensional symbols in Figure 1, (365, H, W, C1), (12, H, W, C2), and (12, H, W), are not explained anywhere in the manuscript. Their physical meanings should be clarified either within the figure itself or in the figure caption, so that the figure can be understood independently of the main text.
- In Table 3, the global annual SDC30 dataset is reported as 0.8 PB, while ESD is reported as 2.4 TB. This implies a theoretical compression ratio of approximately 333, which is somewhat inconsistent with the ~340 description used in the abstract and the main text.
- Figure 3 appears very similar to the ESA global classification product, making it difficult to identify its relationship to the samples or training dataset used in this study.
- The paper only briefly mentions three key hyperparameters in the ablation experiments, such as 12 temporal steps, 10 residual blocks, and a 65,536-dimensional embedding space. However, the manuscript does not provide a complete set of training and inference hyperparameters for ESDNet, making it difficult for third-party researchers to fully reproduce the reported results.
Citation: https://doi.org/10.5194/essd-2026-57-RC3
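The arithmetic behind the compression-ratio comment above can be checked in a few lines. The sketch below uses only the two figures quoted from Table 3 (0.8 PB and 2.4 TB). The unit conventions are an assumption: this page does not say whether the manuscript uses decimal (1 PB = 1000 TB) or binary (1 PB = 1024 TB) units, so a difference in convention is only one possible, unconfirmed explanation for the gap between 333 and ~340.

```python
def compression_ratio(raw_pb, compressed_tb, tb_per_pb):
    """Ratio of raw volume to compressed volume for a given PB-to-TB convention."""
    return raw_pb * tb_per_pb / compressed_tb


# Figures quoted from Table 3 in the referee comment.
RAW_PB = 0.8
ESD_TB = 2.4

decimal_ratio = compression_ratio(RAW_PB, ESD_TB, tb_per_pb=1000)  # SI prefixes
binary_ratio = compression_ratio(RAW_PB, ESD_TB, tb_per_pb=1024)   # binary prefixes

print(f"decimal units: {decimal_ratio:.1f}x")
print(f"binary units:  {binary_ratio:.1f}x")
```

With decimal units the ratio rounds to 333, and with binary units to 341. The tabulated values are themselves rounded, so neither figure should be read as exact.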
Data sets
Embedded Seamless Data: An ultra-lightweight Earth embedding database for accurate and flexible global land monitoring Shuang Chen https://doi.org/10.12436/iEarth.0000.20251229.000064.v1
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 434 | 176 | 25 | 635 | 45 | 65 |
First off, I really enjoyed reading this manuscript. The work on Embedded Seamless Data (ESD) is quite impressive, and I think it addresses a real bottleneck in planetary-scale analysis. The ultra-lightweight design and the way it handles decadal-scale global land monitoring are definitely high-quality contributions. The overall framework is solid, but there are a few things that need to be cleaned up before it’s ready for publication.
The most critical thing I noticed is in Table 11, which provides the technical comparison between ESD and other existing Earth embedding databases. To be honest, there are some pretty clear errors in the technical specs listed for the competing products. I’d strongly recommend the authors take another look at that table and cross-check it against the comparison table in the "Earth Embeddings as Products: Taxonomy, Ecosystem, and Standardized Access" paper. Getting these details right is important for the credibility of the comparison, so please revise those rows to ensure the metrics and features for the other databases/models are accurately represented.
On a related note, when you talk about the multitask training strategy in Section 3.3, it would be great if you could briefly clarify how the loss weights (alpha, beta, and gamma) were tuned. You don't need a full sensitivity analysis, but just a sentence or two on whether they were empirically balanced or if there's a specific rationale behind their values would help reproducibility.
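To illustrate what "empirically balanced" could look like in practice, here is a minimal sketch of a three-term weighted loss and a coarse weight sweep. Everything in it is hypothetical: the three loss values are placeholders, the candidate weights are arbitrary, and the manuscript's actual loss terms and tuning procedure are not described on this page.

```python
from itertools import product


def total_loss(terms, alpha, beta, gamma):
    """Weighted sum of three scalar loss terms."""
    first, second, third = terms
    return alpha * first + beta * second + gamma * third


def weight_grid(candidates=(0.1, 1.0, 10.0)):
    """Enumerate (alpha, beta, gamma) triples for a coarse empirical sweep."""
    return list(product(candidates, repeat=3))


# Placeholder per-term loss values, for illustration only.
example_terms = (0.02, 0.70, 0.15)
sweep = {w: total_loss(example_terms, *w) for w in weight_grid()}
```

In a real sweep each triple would be used to train a model and then scored on held-out data; the weighted sum itself is not a selection criterion, since it trivially shrinks as the weights shrink.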