the Creative Commons Attribution 4.0 License.
GlobalGeoTree: A Multi-Granular Vision-Language Dataset for Global Tree Species Classification
Abstract. Global tree species mapping using remote sensing data is vital for biodiversity monitoring, forest management, and ecological research. However, progress in this field has been constrained by the scarcity of large-scale, labeled datasets. To address this, we introduce GlobalGeoTree – a comprehensive global dataset for tree species classification. GlobalGeoTree comprises 6.3 million geolocated tree occurrences, spanning 275 families, 2,734 genera, and 21,001 species across hierarchical taxonomic levels. Each sample is paired with Sentinel-2 image time series and 27 auxiliary environmental variables, encompassing bioclimatic, geographic, and soil data. The dataset is partitioned into GlobalGeoTree-6M, a large subset for model pretraining, and curated evaluation subsets, primarily GlobalGeoTree-10kEval, a benchmark for zero-shot and few-shot classification. To demonstrate the utility of the dataset, we introduce a baseline model, GeoTreeCLIP, which leverages paired remote sensing data and taxonomic text labels within a vision-language framework pretrained on GlobalGeoTree-6M. Experimental results show that GeoTreeCLIP achieves substantial improvements in zero-shot and few-shot classification on GlobalGeoTree-10kEval over existing advanced models. By making the dataset, models, and code publicly available, we aim to establish a benchmark to advance tree species classification and foster innovation in biodiversity research and ecological applications. The code is publicly available at https://github.com/MUYang99/GlobalGeoTree, and the GlobalGeoTree dataset is available at https://huggingface.co/datasets/yann111/GlobalGeoTree.
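The abstract describes GeoTreeCLIP as pairing remote sensing embeddings with taxonomic text labels in a CLIP-style vision-language framework. A minimal sketch of the core contrastive matching step is shown below; the embedding dimensions and the use of nearly-matched toy embeddings are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def contrastive_logits(img_emb, txt_emb, temperature=0.07):
    """CLIP-style similarity: L2-normalize both embedding sets, then
    compute a temperature-scaled cosine-similarity matrix. Diagonal
    entries correspond to matched (image, taxonomic-label) pairs."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    return img @ txt.T / temperature

# Toy batch: 4 samples, 8-dim embeddings (shapes are illustrative).
rng = np.random.default_rng(0)
img_emb = rng.standard_normal((4, 8))
txt_emb = img_emb + 0.01 * rng.standard_normal((4, 8))  # near-matched pairs

logits = contrastive_logits(img_emb, txt_emb)
# Matched pairs score highest along the diagonal.
print(np.argmax(logits, axis=1))
```

During pretraining, a symmetric cross-entropy over the rows and columns of this logit matrix pulls each sample toward its own taxonomic label and away from the other labels in the batch.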
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-613', Anonymous Referee #1, 14 Nov 2025
- RC2: 'Comment on essd-2025-613', Anonymous Referee #2, 29 Nov 2025
This manuscript presents a large-scale, multimodal dataset for global tree species mapping and provides baseline models built on vision–language architectures. The study is novel and well-executed, with clear potential impact. I particularly appreciate the innovative data integration and modeling approach, and I think this paper makes a strong contribution to the AI4forest and AI4ecology communities. I have a few points for discussion regarding methodological choices, data integration strategies, validation procedures, and practical applicability, which could benefit from further clarification.
- The authors convert all non–Sentinel-2 environmental layers (e.g., bioclimatic, soil, and SRTM data) into single scalar values and feed them into an additional MLP branch. Given that these auxiliary layers are nearly constant across each Sentinel-2 patch, it remains unclear whether a dedicated MLP module is efficient. A more straightforward and parameter-efficient alternative would be to treat these raster layers as additional bands and integrate them directly into the Sentinel-2 data cube. I suggest the authors simply discuss: Why not merge these coarse-resolution raster layers into the Sentinel-2 cube and encode them jointly with the visual encoder? What are the advantages and disadvantages of the current “scalar + MLP” design compared with the “extended raster cube” approach? How do these two strategies differ in terms of model capacity, parameter efficiency, and representational effectiveness?
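The reviewer's "scalar + MLP" versus "extended raster cube" question can be made concrete with a parameter-count comparison. The dimensions below (a 10-band 5×5 Sentinel-2 patch, a 128-unit hidden layer, a 3×3 first-layer kernel, a 64-dim output) are illustrative assumptions, not values from the manuscript.

```python
# Illustrative dimensions (assumptions, not from the manuscript):
# 27 auxiliary scalars; embedding dimension 64.
N_AUX, D = 27, 64

def mlp_branch_params(n_in=N_AUX, hidden=128, d_out=D):
    """'Scalar + MLP' design: the auxiliary scalars get a dedicated
    two-layer MLP (weights + biases); the parameter count is
    independent of the Sentinel-2 patch size."""
    return (n_in * hidden + hidden) + (hidden * d_out + d_out)

def extended_cube_extra_params(kernel=3, d_out=D):
    """'Extended raster cube' design: the 27 layers become extra input
    channels of the visual encoder's first convolution, adding only
    extra kernel weights (the biases already exist)."""
    return N_AUX * kernel * kernel * d_out

print(mlp_branch_params())           # parameters added by the MLP branch
print(extended_cube_extra_params())  # extra first-conv weights from 27 channels
```

Under these assumptions neither design dominates on raw parameter count; the more substantive difference is representational: the MLP branch keeps the (spatially constant) auxiliary variables out of the convolutional pathway, while the extended-cube design lets the visual encoder learn joint spatial–environmental features at the cost of replicating constant values across every pixel.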
- The validation set is derived from the same volunteer-contributed datasets (e.g., GBIF, iNaturalist) that were used for training. These datasets share similar sources of error and observation bias. I recommend incorporating independent validation sources, such as national forest inventories, ecological monitoring networks, or field-based regional datasets. If additional data cannot be included, a brief discussion is needed on how shared annotation noise and observation bias may affect validation reliability and the overall model evaluation.
- The manuscript states that 21,001 species are included. I suggest adding a structured summary table that includes representative families, genera, and species; their geographic distribution or dominant regions; brief ecological descriptions; and, optionally, sample images from public datasets. This would greatly enhance the readability and interpretability of the dataset.
- The current framework relies on multispectral imagery and environmental variables but lacks structural canopy information, which is highly relevant for separating woody species. I suggest simply discussing the potential benefits of incorporating global 1-m canopy height maps (e.g., Meta’s global CHM), global-scale LiDAR-derived products, or other structural/vertical metrics.
- Products such as GLC_FCS10 provide 10-m vegetation functional-type information and could be valuable for excluding non-forested regions or constraining the candidate species space. A short discussion on the potential integration of global land-cover products would be beneficial.
- The proposed dataset is highly valuable for global-scale tree species mapping. However, the manuscript does not sufficiently address how the dataset can be used in real-world inference and operational mapping. I recommend simply adding a dedicated section that discusses practical workflows for deploying the dataset in large-scale tree species mapping, potential inference pipelines, and challenges in global deployment (e.g., domain shifts, spatial biases, and seasonal variability). Such a discussion would help bridge the gap between dataset creation and applied remote-sensing or ecological use cases.
Citation: https://doi.org/10.5194/essd-2025-613-RC2
Data sets
GlobalGeoTree: A Multi-Granular Vision-Language Dataset for Global Tree Species Classification Y. Mu et al. https://doi.org/10.15468/dd.9qxqyy
Model code and software
GlobalGeoTree: A Multi-Granular Vision-Language Dataset for Global Tree Species Classification Y. Mu et al. https://github.com/MUYang99/GlobalGeoTree
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 338 | 137 | 29 | 504 | 29 | 25 |
General Comments:
The authors present a global multi-modal dataset for tree species classification, integrating diverse data sources and offering both a large-scale pretraining dataset and a separate evaluation set. They also propose GeoTreeCLIP, a model that leverages hierarchical label structures and demonstrates improvements over baseline methods. The experimental setup is comprehensive, including comparisons with CLIP-style models and supervised learning approaches. All code and data are publicly available.
Specific Comments:
1. Dataset Construction:
1.1 The authors use the JRC Forest Cover Map v1 for filtering. Given that version 2 has been publicly released with documented improvements, is there a reason for not using the updated version?
1.2 The GlobalGeoTree-10kEval set includes 90 species out of over 21,000. Could the authors clarify the selection criteria? Were any sampling or filtering strategies applied to ensure the reliability of the evaluation set, particularly given the inclusion of citizen science sources like iNaturalist?
1.3 While the evaluation set is constructed as a separate test set, there appears to be no explicit validation process to assess its quality. Given the integration of heterogeneous data sources, some form of validation (manual or automated) would greatly enhance the trustworthiness and utility of the dataset.
2. Model (GeoTreeCLIP):
2.1 The authors attribute the performance improvements of GeoTreeCLIP to domain-specific pretraining. However, it’s difficult to isolate the effects of pretraining alone, as other baseline models lack temporal fusion and may differ in how auxiliary data are handled. A more controlled ablation or discussion would strengthen this claim.
3. Evaluation Metrics and Reporting:
3.1 The paper mentions addressing class imbalance by grouping species into frequent, common, and rare categories. However, results are not reported per group. Including group-specific performance would align with common practices in imbalanced classification tasks.
3.2 Given the global scope of the dataset and the known regional biases, regional performance breakdowns would be informative and important for understanding model generalizability.
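The group-wise reporting requested in 3.1 is straightforward to compute once each species is assigned a frequency group. The sketch below is a minimal illustration with hypothetical species labels, not the authors' evaluation code.

```python
from collections import defaultdict

def grouped_accuracy(y_true, y_pred, species_group):
    """Accuracy per frequency group (e.g. frequent/common/rare).
    `species_group` maps each species label to its group name."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        g = species_group[t]
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}

# Toy example with hypothetical species-to-group assignments.
groups = {"Quercus robur": "frequent", "Picea abies": "frequent",
          "Sorbus torminalis": "rare"}
y_true = ["Quercus robur", "Picea abies", "Sorbus torminalis", "Quercus robur"]
y_pred = ["Quercus robur", "Picea abies", "Quercus robur", "Quercus robur"]
print(grouped_accuracy(y_true, y_pred, groups))
# → {'frequent': 1.0, 'rare': 0.0}
```

The same pattern extends to the regional breakdowns requested in 3.2 by mapping each sample to a region instead of a frequency group.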
Additional comments:
The authors’ effort in assembling such a large-scale, publicly available dataset and developing a strong benchmark model is highly appreciated. However, since this is a data description paper, the dataset itself should be the focal point. At present, the lack of validation of the dataset is a significant limitation. While the work offers valuable contributions for machine learning research, particularly within benchmark or workshop tracks at venues like CVPR or NeurIPS, it may not yet meet the expectations for a journal like ESSD, which prioritizes data quality.