the Creative Commons Attribution 4.0 License.
Toward Better Conservation: A Spatial Analysis of Species Occurrence Data from the Global Biodiversity Information Facility
Abstract. The world is facing an unprecedented loss of biodiversity, with nearly one million species on the brink of extinction, and the extinction rate accelerating. Conservation efforts are often hindered by insufficient information on crucial ecosystems. To address this issue, our paper leverages advances in machine-based pattern recognition to estimate species occurrence maps using georeferenced data from the Global Biodiversity Information Facility (GBIF). Our algorithms have generated maps for more than 600,000 species, including vertebrates, arthropods, mollusks, other animals, vascular plants, fungi, and other organisms. Validation involved comparing these maps with expert maps for mammals, ants, and vascular plants. We found a close similarity in global distribution patterns, with regional differences attributed to technical variations or necessary revisions in existing expert maps based on GBIF data. As a practical application, we identified the global distributions of approximately 68,000 species with small ranges (25 km x 25 km or less) confined to a single country. Our maps reveal a skewed international distribution of these species, identifying 30 countries where 78.2 percent are concentrated. These results highlight the need to integrate the newly mapped GBIF data into global conservation planning. Our algorithms support rapid updates and the creation of new maps as GBIF occurrence reports increase. The data are available on the World Bank Development Data Hub at https://doi.org/10.57966/h21e-vq42 (Dasgupta et al. 2024).
Status: final response (author comments only)
-
CC1: 'Comment on essd-2024-241', Mustafa Md. Golam, 18 Oct 2024
-
AC1: 'Reply on CC1', Brian Blankespoor, 23 Oct 2024
We appreciate the insightful comment from Mr. Mustafa Md. Golam and fully agree that comprehensive, real-time, and geographically targeted data on biodiversity at risk is crucial for developing effective, evidence-based conservation strategies and for informing global biodiversity management and protection efforts. In response to this, our paper presents occurrence region maps for around 600,000 species, encompassing arthropods, mollusks, plants, fungi, and various invertebrates, alongside amphibians, birds, fish, reptiles, and mammals. This dataset represents a significant advancement in supporting conservation planning and biodiversity protection across terrestrial, freshwater, and marine environments.
Citation: https://doi.org/10.5194/essd-2024-241-AC1
-
CC2: 'Comment on essd-2024-241', Mainul Huq, 23 Oct 2024
A long-awaited system such as this one would help improve our understanding of the relative importance of an area in terms of biodiversity. The area may turn out to be a rare home of a set of threatened species. Armed with such information, policy makers would be better equipped to undertake conservation projects, especially in those areas, and to avoid projects that, if undertaken, would adversely impact those ecologically sensitive areas. I am sure the relevant experts and policy makers would be glad to have access to such information.
Citation: https://doi.org/10.5194/essd-2024-241-CC2
-
AC2: 'Reply on CC2', Brian Blankespoor, 23 Oct 2024
We appreciate Mr. Mainul Huq's valuable comment and fully concur with his observation. Our estimates, as presented in the paper, indicate that the traditional focus on vertebrates in conservation planning has overlooked many other critical species. The expanded coverage in our new dataset, which includes species beyond vertebrates, reveals that numerous taxa, such as arthropods, have not received sufficient attention in biodiversity conservation efforts. As a result, our expanded biodiversity database broadens the scope of conservation, identifying many more potentially threatened species globally and leading to significant revisions in "conservation hotspot" maps.
Citation: https://doi.org/10.5194/essd-2024-241-AC2
-
CC3: 'Comment on essd-2024-241', Mainul Huq, 23 Oct 2024
A long-awaited system such as this one would help improve our understanding of the relative importance of an area in terms of biodiversity. The area may turn out to be a rare home of a set of threatened species. Armed with such information, policy makers would be better equipped to undertake conservation projects, especially in those areas, and to avoid projects that, if undertaken, would adversely impact those ecologically sensitive areas. I am sure the relevant experts and policy makers would be glad to have access to such information.
Citation: https://doi.org/10.5194/essd-2024-241-CC3
-
AC3: 'Reply on CC3', Brian Blankespoor, 23 Oct 2024
(Please note our response to CC2 above).
Citation: https://doi.org/10.5194/essd-2024-241-AC3
-
CC4: 'Comment on essd-2024-241', Kenneth Chomitz, 21 Nov 2024
This paper advances the state of the art of biodiversity mapping. A long-standing problem in biodiversity mapping and conservation is constructing range maps for species based on relatively small numbers of geolocated occurrences. The authors use the alphahull algorithm to conservatively bound a range based on occurrence data. The paper uses it to generate global range maps for 600,000 species across a very wide range of taxa.
The work is timely and important because of the urgency of biodiversity conservation and the increased demand for range maps from nascent initiatives to create biodiversity credits. At the same time, there is an increased supply of occurrence data from expanded and innovative monitoring efforts. The paper illustrates how these data can be rapidly assimilated into a growing live database of biodiversity maps.
Citation: https://doi.org/10.5194/essd-2024-241-CC4
-
AC4: 'Reply on CC4', Brian Blankespoor, 29 Oct 2025
Thank you for this thoughtful and constructive comment. We are glad to read that you recognize the significance and timeliness of the work, especially given the urgent need for scalable and transparent methods in biodiversity mapping. Our goal was to offer a globally consistent, data-driven approach that leverages the expanding availability of occurrence data, and to demonstrate how tools like the alpha-hull algorithm can be applied at scale across taxa. We appreciate your point about the growing demand for reliable range maps in emerging conservation finance mechanisms, such as biodiversity credits, and hope this work contributes meaningfully to those efforts.
Citation: https://doi.org/10.5194/essd-2024-241-AC4
-
RC1: 'Comment on essd-2024-241', Kenneth Chomitz, 04 Jul 2025
Dasgupta, Blankespoor, and Wheeler (DBW) create a fine-scale global map of the distribution of more than 600,000 species. This appears to be the most species-rich such map available. The paper states that its main contribution is ‘the expanded coverage of invertebrates’ – a key component of biodiversity that is often overshadowed by attention to charismatic mammals. From a methodological view, the paper’s main contribution is that ‘our algorithms support rapid updates and the creation of new maps as GBIF occurrence reports increase.’
Their database-cum-methodology lacks a name – let’s call it the DBW algorithm and DBW-GBIF database. It arrives at a crucial time. As the paper notes, the Global Biodiversity Framework is heightening the attention to conservation. Not noted but important is increased demand for, and supply of, mappable biodiversity data in connection with burgeoning biodiversity credit markets and corporate compliance with biodiversity reporting regimes such as the Task Force on Nature-related Disclosures. As the paper emphasizes, the volume of species occurrence data is growing very rapidly. Hence DBW’s ability to assimilate the data and update species range maps is important.
The paper would benefit from better documentation of methods and a clearer exposition of how it advances the state of the art. Main points as follows.
Algorithm and methods – Supplementary material should document methods, including data cleaning (or is this already done by GBIF?), alpha hull model parameterization, and workflow. If a main contribution of the DBW algorithm is in the ability to rapidly or continually update, it is worth discussing the ease and computational cost of doing that. It would be desirable to open-source the algorithm if possible, as that would accelerate its use and continual improvement.
Comparison with other species distribution databases. The paper could compare species coverage, spatial resolution, methodologies, and timeliness with other resources such as the Map of Life.
Alpha hull pilot example. I am not convinced that this is an essential section, since the alpha hull technique has already been demonstrated. If it is to be included, then a more compelling example would be one where convexity clearly goes wrong and the alpha hull maps a more constrained range.
Case comparisons. This section could be more clearly motivated and more thoroughly explored. I take the motivation to be showing a) that the rapid DBW algorithm produces results that are comparable to much more intensive taxonomy-specific efforts; and then b) arguing that where there are disparities, DBW may be superior. With regard to a), figures 2, 3, and 4 are inadequate to make the case – it’s not possible to eyeball the differences. Instead, the paper should present and discuss a map of the differences between the GBIF and comparator rankings (and also define the rankings – percentiles? Is 10 high or low? Etc.). On the comparison with Kass et al. (2022), it’s worth noting that Kass et al. don’t just use alpha hull; they also use environmental variables as a predictor, so that may explain the divergence and is worth some discussion. Also, given the interest in small-range species, Kass et al.’s use of buffers around species with <3 occurrences (which can't be handled by alpha hull) seems preferable to DBW’s decision to omit them entirely. I would therefore challenge the conclusion (line 313) that ‘comparing full database results for GBIF and Kass et al. (2022) would be, in effect, comparing apples and oranges.’
Priority-setting applications. This section would benefit from comparison with prior, similar exercises such as Jenkins, Pimm and Joppa (2013), Global patterns of terrestrial vertebrate diversity. How does DBW's wider set of taxonomies change understanding of hotspots or priorities? The tables could be better documented; it's not always immediately clear what the 'percentages' refer to (what's the denominator?). Also, the paper ends by talking about 40 countries, but the prior discussion has been of 30 countries -- that requires correction or clarification.
Citation: https://doi.org/10.5194/essd-2024-241-RC1
-
AC6: 'Reply on RC1', Brian Blankespoor, 29 Oct 2025
RC1:
Dasgupta, Blankespoor, and Wheeler (DBW) create a fine-scale global map of the distribution of more than 600,000 species. This appears to be the most species-rich such map available. The paper states that its main contribution is ‘the expanded coverage of invertebrates’ – a key component of biodiversity that is often overshadowed by attention to charismatic mammals. From a methodological view, the paper’s main contribution is that ‘our algorithms support rapid updates and the creation of new maps as GBIF occurrence reports increase.’
Their database-cum-methodology lacks a name – let’s call it the DBW algorithm and DBW-GBIF database. It arrives at a crucial time. As the paper notes, the Global Biodiversity Framework is heightening the attention to conservation. Not noted but important is increased demand for, and supply of, mappable biodiversity data in connection with burgeoning biodiversity credit markets and corporate compliance with biodiversity reporting regimes such as the Task Force on Nature-related Disclosures. As the paper emphasizes, the volume of species occurrence data is growing very rapidly. Hence DBW’s ability to assimilate the data and update species range maps is important.
Authors’ response:
Thank you for this generous and insightful comment. We're especially pleased that you highlight both the taxonomic breadth of the dataset—particularly the expanded inclusion of invertebrates—and the methodological emphasis on scalability and updatability. Your framing of the work as the “DBW algorithm” and “DBW-GBIF database” is helpful, and we agree that naming the approach may aid in future discussion and adoption. As you point out, the growing demands of biodiversity finance, corporate reporting, and global policy frameworks make timely, transparent, and repeatable mapping methods increasingly critical. We hope this work serves as a foundation for continued refinement and collaboration across these emerging applications.
RC1:
The paper would benefit from better documentation of methods and a clearer exposition of how it advances the state of the art. Main points as follows.
Algorithm and methods – Supplementary material should document methods, including data cleaning (or is this already done by GBIF?), alpha hull model parameterization, and workflow. If a main contribution of the DBW algorithm is in the ability to rapidly or continually update, it is worth discussing the ease and computational cost of doing that. It would be desirable to open-source the algorithm if possible, as that would accelerate its use and continual improvement.
Authors’ response:
The method incorporates quality information through record filtering and through post-processing steps that exclude locations with little information. As mentioned in the paper, “We accept the GBIF’s protocols for occurrence report admissibility. Detailed descriptions of the GBIF’s protocols and database elements can be found at https://www.gbif.org/data-quality-requirements-occurrences.” (Section 2.1 Para 2). We added an entirely new Annex providing examples of the algorithm output and a flowchart of the main processing steps for illustrative purposes. We implement our approach in Google BigQuery and R, and we take advantage of the alpha hull algorithm, which is already available as a function in the open-source R package alphahull (Pateiro-López and Rodríguez-Casal, 2010). Furthermore, data processing can be run on cloud computing infrastructure to reduce computation time.
RC1:
Comparison with other species distribution databases. The paper could compare species coverage, spatial resolution, methodologies, and timeliness with other resources such as the Map of Life.
Authors’ response:
We expect our effort to complement other species distribution databases such as Map of Life (MoL). MoL is a global biodiversity platform that integrates species distribution data from expert range maps, occurrence records (e.g., GBIF, eBird), and ecological models to provide standardized, spatially explicit information across taxa. MoL currently includes data for over 450,000 species across more than 260 countries and territories (http://www.mol.org, accessed 2025-10-27), encompassing a wide range of taxa, including vertebrates, plants, and insects. Spatial resolution varies by dataset, from coarse global grids (at approximately 110 km²) for richness layers to fine-scale models of approximately 1 km² for certain taxa and regions. Methodologically, MoL combines expert-drawn ranges with environmental modeling and remote-sensing inputs to generate derived indicators such as species richness, habitat suitability, and protection indices. Its workflows support dynamic updating, with occurrence data refreshed biannually and regional species lists updated monthly, although the timeliness of expert-based data can vary across taxa.
Added (Section 1 Para 4): “The study’s approach should be viewed as complementary to previous biodiversity assessments (e.g. Map of Life) and can benefit the policy process in several major ways.”
RC1:
Alpha hull pilot example. I am not convinced that this is an essential section, since the alpha hull technique has already been demonstrated. If it is to be included, then a more compelling example would be one where convexity clearly goes wrong and the alpha hull maps a more constrained range.
Authors’ response:
To evaluate differences between spatial modeling approaches, we compared species occurrence regions generated using the convex hull and alpha hull methods across representative taxa, including arthropods, birds, fungi, fish, plants, and reptiles. The convex hull method defines a minimal polygon encompassing all occurrence points, while the alpha hull identifies finer boundaries that can separate spatially distinct clusters (See new Annex for examples).
- Added (Section 2.2 Para 3) “We assess the utility of these mapping algorithms for delineating species occurrence regions, with illustrative examples provided in the Annex.”
- Added (Section 3.2 Para 2) “Examples of the convex and alpha hull comparisons are included in the Annex.”
- Added (Section 5. Para 1) "Building on these results and given the numerous competing demands on land and water, the spatial precision of conservation activities and policies can be enhanced through alphahull techniques, thereby increasing their potential for success by effectively focusing the area of consideration"
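The reviewer's point about convexity can be illustrated with a small sketch. It is not the alphahull algorithm and not the authors' workflow: it uses a plain convex hull (Andrew's monotone chain) and, as a stand-in for what an alpha hull would separate, fits one hull per cluster of occurrences. The coordinates and cluster assignment are hypothetical, chosen only to show how a single convex hull can cover the empty space between two distinct clusters.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull via Andrew's monotone chain, counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(poly):
    """Polygon area from the shoelace formula (planar units)."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical occurrences: two unit-square clusters far apart.
cluster_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
cluster_b = [(10, 0), (11, 0), (11, 1), (10, 1)]

# One convex hull over all points spans the gap between the clusters.
single_hull_area = polygon_area(convex_hull(cluster_a + cluster_b))

# One hull per cluster: a rough proxy for a range split into components.
split_area = (polygon_area(convex_hull(cluster_a))
              + polygon_area(convex_hull(cluster_b)))

print(single_hull_area, split_area)
```

In this toy case the single hull covers more than five times the area of the two separate hulls, which is the kind of over-coverage a more constrained range estimate is meant to avoid.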
RC1:
Case comparisons. This section could be more clearly motivated and more thoroughly explored. I take the motivation to be showing a) that the rapid DBW algorithm produces results that are comparable to much more intensive taxonomy-specific efforts; and then b) arguing that where there are disparities, DBW may be superior. With regard to a), figures 2, 3, and 4 are inadequate to make the case – it’s not possible to eyeball the differences. Instead, the paper should present and discuss a map of the differences between the GBIF and comparator rankings (and also define the rankings – percentiles? Is 10 high or low? Etc.). On the comparison with Kass et al. (2022), it’s worth noting that Kass et al. don’t just use alpha hull; they also use environmental variables as a predictor, so that may explain the divergence and is worth some discussion. Also, given the interest in small-range species, Kass et al.’s use of buffers around species with <3 occurrences (which can't be handled by alpha hull) seems preferable to DBW’s decision to omit them entirely. I would therefore challenge the conclusion (line 313) that ‘comparing full database results for GBIF and Kass et al. (2022) would be, in effect, comparing apples and oranges.’
Authors’ response:
We clarified the text to define the ranking.
- Added (Section 3.3. Para 1) “Using these matched species, each comparison assesses the similarity in global biodiversity patterns produced by our GBIF maps and the expert research products by rank group based on the pixel level distribution of the species maps from 1 (lowest count) to 10 (highest count).”
To improve the case comparison of rankings, we compare Species Occurrence Regions derived from GBIF with the sets of matched species for mammals (Marsh et al. 2022), ants (Kass et al. 2022) and plants (Borgelt et al. 2022), using a confusion matrix with 5 to 10 rank-group classes based on the distribution of the species. We added more details in the Annex.
- Added (Section 3.4 Para 1) “In all three case comparisons, we find quite similar global patterns of species density with strong agreement between the sources for each case. From a confusion matrix of 5 to 10 classes based on distribution of the species rank group, we find that across these taxa overall accuracy and agreement are significantly better than random (accuracy p-value = 0) with plants consistently outperforming ant and mammals (see Annex).”
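As a rough illustration of the rank-group comparison described above, the sketch below bins per-pixel species counts into equal-sized rank groups (1 = lowest count) and tabulates a confusion matrix between two sources. The equal-count binning rule, the tie handling, the function names, and the pixel counts are all assumptions made for illustration; the reply does not specify how the authors formed their groups.

```python
def rank_groups(counts, n_groups=10):
    """Assign each pixel a group from 1 (lowest count) to n_groups.

    Assumption: groups hold equal numbers of pixels, and ties are
    broken by pixel order (Python's sort is stable).
    """
    order = sorted(range(len(counts)), key=lambda i: counts[i])
    groups = [0] * len(counts)
    for position, pixel in enumerate(order):
        groups[pixel] = position * n_groups // len(counts) + 1
    return groups


def confusion_matrix(groups_a, groups_b, n_groups):
    """Rows index source A's group, columns index source B's group."""
    matrix = [[0] * n_groups for _ in range(n_groups)]
    for a, b in zip(groups_a, groups_b):
        matrix[a - 1][b - 1] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of pixels on the diagonal (same group in both sources)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Hypothetical species counts for the same ten pixels in two sources.
gbif_counts = [0, 2, 5, 9, 14, 20, 27, 35, 44, 60]
expert_counts = [1, 3, 4, 10, 12, 25, 22, 40, 50, 55]

gbif_groups = rank_groups(gbif_counts, n_groups=5)
expert_groups = rank_groups(expert_counts, n_groups=5)
matrix = confusion_matrix(gbif_groups, expert_groups, n_groups=5)
accuracy = overall_accuracy(matrix)
print(gbif_groups, expert_groups, accuracy)
```

Here two of the ten pixels swap between adjacent groups, so they fall off the diagonal. A significance test against random agreement, as mentioned in the added text, would need an additional step that this sketch does not attempt.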
In addition, we computed the Spearman correlation coefficient at the pixel level within a roving window of pixels (e.g., 3x3 and 5x5) and found high agreement between case comparisons in areas of low population and low night-time lights.
- Added (Section 3.4 Para 1) “Additionally, we calculated the local Spearman correlation coefficient for each case and found high agreement in areas with low population and levels of lights at night as a proxy for economic activity.”
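A local Spearman correlation over a roving window can be sketched as follows. This is a minimal pure-Python version written under stated assumptions: tied values receive average ranks, border pixels without a full window are left as None, and a window with no variation in either grid also yields None. The grids and function names are hypothetical and do not reproduce the authors' implementation.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; None when either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def local_spearman(grid_a, grid_b, size=3):
    """Spearman correlation in a size x size window centred on each pixel."""
    half = size // 2
    rows, cols = len(grid_a), len(grid_a[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            cells = [(i, j)
                     for i in range(r - half, r + half + 1)
                     for j in range(c - half, c + half + 1)]
            out[r][c] = spearman([grid_a[i][j] for i, j in cells],
                                 [grid_b[i][j] for i, j in cells])
    return out


# Hypothetical 3x3 grids of per-pixel species counts.
grid_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
grid_b = [[2, 4, 6], [8, 10, 12], [14, 16, 18]]   # same ordering as grid_a
grid_c = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]        # reversed ordering

same = local_spearman(grid_a, grid_b)
opposite = local_spearman(grid_a, grid_c)
print(same[1][1], opposite[1][1])
```

Only the centre pixel of a 3x3 grid has a full 3x3 window, so it is the only cell with a value in this example. On real rasters the same loop would produce a map of local agreement that could then be compared with population or night-time lights layers.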
The alpha hull approach offers a parsimonious, occurrence-driven estimate of species range, requiring minimal assumptions and no environmental predictors. In contrast, species distribution models (SDMs) rely on right-hand side (RHS) predictors to estimate habitat suitability and can extrapolate beyond observed occurrences. Divergence arises because alpha hulls capture realized distributions conservatively, while SDMs are sensitive to predictor choice and scale, potentially over- or underestimating suitable areas. This trade-off between simplicity and predictive generalization underscores the need to consider method objectives and data limitations when mapping species distributions. Our practical experience in client countries suggests that it is also easier to hold a dialogue with policy makers about species occurrence regions derived from occurrence reports than about statistical modeling methods that infer occurrence from environmental predictors, which can be difficult to explain. Given the scale of biodiversity loss, it is urgent to start the biodiversity dialogue now, beginning with an easy-to-understand approach to mapping.
- Added (Section 5.4 Para 2): “In addition, the alpha hull approach offers a parsimonious, occurrence-driven estimate of species range, requiring minimal assumptions and no environmental predictors. Our practical experience in client countries further indicates that communicating with policymakers is often more straightforward when using species occurrence regions derived from reported occurrences, compared to statistical modeling methods that infer occurrence from environmental predictors, which can be challenging to explain. Given the rate of biodiversity loss, it is urgent to initiate the biodiversity dialogue immediately, beginning with approaches to mapping that are readily understandable”
The alpha hull approach emphasizes robust range estimation, and excluding extremely rare species helps prevent artificially inflating distributions with highly uncertain extrapolations. While species with very limited occurrences are important to highlight, the approach by Kass et al. that applies buffers to species with fewer than three records can produce overly generous or biologically unrealistic ranges—especially when occurrence data are sparse or not spatially clustered. In contrast, our approach adopts a conservative, data-driven delineation that reflects actual occurrence patterns, while recognizing that truly rare species remain inherently challenging to model reliably.
- We note this important methodological distinction (Section 3.3.2) “Since Kass et al. (2022) also rely heavily on the alphahull methodology, we attribute these differences to two technical factors. First, their database comes from intensive processing and error checking of records drawn from the Global Ant Biodiversity Informatics (GABI) database in July 2020. In our study, by contrast, the records are drawn from GBIF occurrence data, as of July 2023. Second, our approach is significantly more conservative. For example, we exclude unique species occurrences that number fewer than three, while Kass et al. (2022) include them; since alphahulls cannot be estimated for these 5,168 ant species, Kass et al. (2022) estimate their ranges by drawing 30 km buffer zones around the occurrence locations.”
- As we include a case comparison for ants, we deleted the following sentence (Section 3.3.2 para 3): Given this difference, comparing full database results for GBIF and Kass et al. (2022) would be, in effect, comparing apples and oranges.
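The methodological distinction discussed above can be made concrete with a short sketch. It separates species with at least three unique occurrence locations, for which an alpha hull could be estimated, from those with fewer, and it computes an upper bound on the area that 30 km circular buffers would assign to the latter. The threshold of three and the 30 km radius come from the reply; the record structure, species labels, and function names are hypothetical, and the area calculation treats the buffers as non-overlapping planar circles.

```python
import math

MIN_UNIQUE = 3     # minimum unique occurrences stated in the reply
BUFFER_KM = 30.0   # buffer radius attributed to Kass et al. (2022)


def split_by_occurrences(records, min_unique=MIN_UNIQUE):
    """Split species by number of unique occurrence locations.

    records: dict mapping species name -> list of (lon, lat) tuples.
    Returns (hull_ready, too_few), each mapping species -> unique points.
    """
    hull_ready, too_few = {}, {}
    for species, points in records.items():
        unique = sorted(set(points))
        target = hull_ready if len(unique) >= min_unique else too_few
        target[species] = unique
    return hull_ready, too_few


def max_buffer_area_km2(n_points, radius_km=BUFFER_KM):
    """Upper bound on buffered area, assuming non-overlapping circles."""
    return n_points * math.pi * radius_km ** 2


# Hypothetical occurrence records, including repeated locations.
records = {
    "sp_a": [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (0.0, 0.0)],
    "sp_b": [(5.0, 5.0), (5.0, 5.0)],
    "sp_c": [(7.0, 7.0), (8.0, 8.0)],
}

hull_ready, too_few = split_by_occurrences(records)
single_buffer = max_buffer_area_km2(1)
print(sorted(hull_ready), sorted(too_few), round(single_buffer, 1))
```

For scale, one 30 km buffer covers roughly 2,827 km², which is larger than a 25 km x 25 km cell (625 km²). The arithmetic alone does not say which treatment is preferable for rare species, but it shows why the two choices can diverge for small ranges.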
RC1:
Priority-setting applications. This section would benefit from comparison with prior, similar exercises such as Jenkins, Pimm and Joppa (2013), Global patterns of terrestrial vertebrate diversity. How does DBW's wider set of taxonomies change understanding of hotspots or priorities? The tables could be better documented; it's not always immediately clear what the 'percentages' refer to (what's the denominator?). Also, the paper ends by talking about 40 countries, but the prior discussion has been of 30 countries -- that requires correction or clarification.
Authors’ response:
We appreciate the reviewer’s observation and agree that there are multiple approaches to setting priorities in biodiversity conservation. However, in practice, resource allocation is most often determined at the national and subnational levels by administrative authorities, particularly in the Global South, where direct communication with policymakers is essential for implementation. In this context, a natural entry point is the conservation of endemic species—those occurring exclusively within a country—or species with very restricted ranges within national boundaries. These taxa are especially vulnerable due to their limited populations and habitats, making them highly susceptible to extinction risks. Prioritizing their protection provides countries with a clear and feasible starting point for action. Moreover, conservation measures for such species are relatively straightforward to design and implement, as their distributions are typically confined to smaller geographic areas. Protecting these species not only contributes to global biodiversity goals but also safeguards the unique natural heritage of individual nations.
In response to the reviewer’s comment, we note that our dataset builds on and complements the work of Jenkins, Pimm, and Joppa in their study Global Patterns of Terrestrial Vertebrate Diversity and Conservation, which relied on IUCN data to identify “small-ranged” amphibians, birds, mammals, and reptiles. Our investigation shows that many species with small occurrence areas are categorized as “data deficient” in the IUCN threat classification. By incorporating GBIF occurrence records, we were able to extend the coverage of small-range species within terrestrial vertebrates and broaden the scope to include plants, invertebrates, and species in freshwater and marine ecosystems. Our initiative should therefore be viewed as a complement to the earlier study, providing an expanded and richer basis for biodiversity conservation analysis.
- Added (Section 4.1): Table 2. Top 30 countries for species endemism, by group: Endemic species by group (%)
- Added (Section 4.2): Table 4. Top 30 countries for species with small occurrence regions, by group
- Added (Section 4.2): Table 5. Top 30 countries for endemic species with small occurrence regions, by group
- Added (Section ): Figure 6. Regional distribution (regional percentage of global) of endemic species with small occurrence regions
- Added (Section): Figure 7. Regional percent of global distribution of endemic species with small occurrence regions, by income class
In a forthcoming World Bank Policy Research Working Paper, we quantify the protection gap for endemic and small-range species by comparing occurrence maps from our database with protected area maps from the World Database on Protected Areas across more than 150 countries and territories. This analysis is intended to complement the work of Jenkins, Pimm, and Joppa (2013), providing an expanded and richer basis for protection gap assessment including plants and invertebrates.
- Add (Section 5.3 Para 1): Implications for conservation priority The identification of endemic and small-range species from GBIF-derived maps provides new insights for conservation targeting. A broader view of biodiversity provides an opportunity for more countries to contribute to a shared vision of conservation stewardship, as most nations host significant distributions of species within at least one major taxonomic group. Consistent with prior studies (Kier et al., 2009), this analysis confirms that small-range species are disproportionately at risk of extinction due to limited dispersal capacity and habitat specialization. The correlation between endemism and small-range occurrence suggests that national-level conservation actions will be critical for achieving global biodiversity outcomes. Australia, the United States, Brazil, Mexico, and South Africa emerge as global centers of endemism and small-occurrence-region species, underscoring their unique biodiversity assets. The identification of numerous localized hotspots—such as Madagascar, Costa Rica, and Southeast Asia—further highlights regions where protection gap analyses are most urgent.
- Added (Section 5.4 Para 1): “These data provide an expanded and richer basis for protection gap assessment that can complement prior assessments, such as Jenkins, Pimm, and Joppa (2013), which focused on terrestrial vertebrate diversity. These data can also be used to examine the overlap between existing protected areas and endemic species distributions, highlighting significant variation in initial conservation conditions, including current protection levels and the spatial distribution of unprotected species. (Dasgupta et al., 2025)”
- We have now corrected the text for 30 countries (instead of 40 countries) (Section 5.4).
References:
- Blankespoor, B., Dasgupta, S., and Wheeler, D.: Bridging Conflicts and Biodiversity Protection: The Critical Role of Reliable and Comparable Data, World Bank Policy Research Working Paper 11076, February 2025. https://doi.org/10.1596/1813-9450-11076
- Blankespoor, B., Dasgupta, S., Wheeler, D., Jeuken, A., van Ginkel, K., Hill, K., and Hirschfeld, D.: Linking sea-level research with local planning and adaptation needs, Nature Climate Change, 13(8), 760–763, 2023. https://doi.org/10.1038/s41558-023-01789-8
- Beck, J., Böller, M., Erhardt, A., & Schwanghart, W.: Spatial bias in the GBIF database and its effect on modeling species’ geographic distributions. Ecological Informatics, 19, 10–15. 2014. https://doi.org/10.1016/j.ecoinf.2013.11.002
- Boakes, E. H., McGowan, P. J. K., Fuller, R. A., Chang-qing, D., Clark, N. E., O’Connor, K., & Mace, G. M.: Distorted views of biodiversity: spatial and temporal bias in species occurrence data. PLoS Biology, 8(6), e1000385. 2010. https://doi.org/10.1371/journal.pbio.1000385
- Borgelt, J., Sicacha-Parada, J., Skarpaas, O., et al.: Native range estimates for red-listed vascular plants, Nature Scientific Data, 9, 117, 2022. https://doi.org/10.1038/s41597-022-01233-5
- Cardoso, P., Erwin, T. L., Borges, P. A. V., & New, T. R.: The seven impediments in invertebrate conservation and how to overcome them. Biological Conservation, 144(11), 2647–2655. 2011. https://doi.org/10.1016/j.biocon.2011.07.024
- Dasgupta, S., Blankespoor, B., and Wheeler, D.: Predicting Species Extinction Risks using Occurrence Data from the Global Biodiversity Information Facility, Journal of Management & Sustainability, 15(2), 93, 2025. https://doi.org/10.5539/jms.v15n2p93
- Elith, J., & Leathwick, J. R.: Species distribution models: ecological explanation and prediction across space and time. Annual Review of Ecology, Evolution, and Systematics, 40, 677–697. 2009. https://doi.org/10.1146/annurev.ecolsys.110308.120159
- Hughes, A. C., Dorey, J. B., Bossert, S., Qiao, H., & Orr, M. C.: Big data, big problems? How to circumvent problems in biodiversity mapping and ensure meaningful results. Ecography, 2024(8), e07115. 2024. https://doi.org/10.1111/ecog.07115
- Janicki, J., Narula, N., Ziegler, M., Guénard, B., & Economo, E. P.: Visualizing and interacting with large-volume biodiversity data using client–server web-mapping applications: The design and implementation of antmaps.org. Ecological Informatics, 32, 185–193. 2016. https://doi.org/10.1016/j.ecoinf.2016.02.006
- Kass, J., Guénard, B., Dudley, K., et al.: The global distribution of known and undiscovered ant biodiversity, Science Advances, 8, 2022. https://www.science.org/doi/10.1126/sciadv.abp9908
- Kier, G., Kreft, H., Lee, T. M., Jetz, W., Ibisch, P. L., Nowicki, C., Mutke, J., & Barthlott, W.: A global assessment of endemism and species richness across island and mainland regions. Proceedings of the National Academy of Sciences, 106(23), 9322–9327. 2009. https://doi.org/10.1073/pnas.08103061
- Marsh, C., et al.: Expert range maps of global mammal distributions harmonized to three taxonomic authorities, Journal of Biogeography, 49, 979-992, 2022. https://doi.org/10.1111/jbi.14330
- Meyer, C., Weigelt, P., & Kreft, H.: Multidimensional biases, gaps and uncertainties in global plant occurrence information. Ecology Letters, 19(8), 992–1006. 2016. https://doi.org/10.1111/ele.12624
- Pateiro-López, B. and Rodríguez-Casal, A.: Generalizing the convex hull of a sample: the R package alphahull, Journal of Statistical Software, 34, 2010. https://doi.org/10.18637/jss.v034.i05
- Tanalgo, K. C.: Open and FAIR data sharing are building blocks to bolster biodiversity conservation in Southeast Asia. Biological Conservation, 307, 111192. 2025. https://doi.org/10.1016/j.biocon.2025.111192
- Yesson, C., Brewer, P. W., Sutton, T., Caithness, N., Pahwa, J. S., Burgess, M., Gray, W. A., White, R. J., Jones, A. C., & Bisby, F. A.: How global is the Global Biodiversity Information Facility? PLoS ONE, 2(11), e1124. 2007. https://doi.org/10.1371/journal.pone.0001124
Citation: https://doi.org/10.5194/essd-2024-241-AC6
-
RC2: 'Comment on essd-2024-241', Anonymous Referee #2, 18 Sep 2025
This manuscript presents an interesting and valuable contribution to advancing species mapping across multiple scales. The authors propose a new approach tested on both large public datasets and simpler data inputs, effectively demonstrating its applicability. I believe this work has strong potential to address ongoing challenges in conservation prioritisation at different scales. The analyses appear robust and rigorous, yet some areas could be strengthened, particularly in the Discussion section.
First, I recommend expanding the Discussion to more explicitly address the practical challenges of species range mapping and to situate the proposed approach in relation to existing solutions. For instance, the paper by Hughes et al. (2024) provides an excellent overview of common pitfalls in biodiversity mapping and how they can be circumvented: Hughes, A. C., Dorey, J. B., Bossert, S., Qiao, H., & Orr, M. C. (2024). Big data, big problems? How to circumvent problems in biodiversity mapping and ensure meaningful results. Ecography, 2024(8), e07115. https://doi.org/10.1111/ecog.07115
Second, I encourage the authors to consider discussing the specific barriers to data mobilisation in developing regions such as Southeast Asia, where biodiversity data remain sparse and fragmented. For instance: Tanalgo, K. C. (2025). Open and FAIR data sharing are building blocks to bolster biodiversity conservation in Southeast Asia. Biological Conservation, 307, 111192. https://doi.org/10.1016/j.biocon.2025.111192
While other reviewers have already provided detailed feedback on the technical aspects of the analyses—which I share—I would emphasise that the manuscript would benefit most from a more developed Discussion. By addressing these broader challenges and situating the study within the context of ongoing debates on data quality, accessibility, and mobilisation, the authors can considerably strengthen the manuscript’s impact and relevance for both scientific and applied conservation audiences.
Citation: https://doi.org/10.5194/essd-2024-241-RC2
AC5: 'Reply on RC2', Brian Blankespoor, 29 Oct 2025
RC2:
This manuscript presents an interesting and valuable contribution to advancing species mapping across multiple scales. The authors propose a new approach tested on both large public datasets and simpler data inputs, effectively demonstrating its applicability. I believe this work has strong potential to address ongoing challenges in conservation prioritisation at different scales. The analyses appear robust and rigorous, yet some areas could be strengthened, particularly in the Discussion section.
First, I recommend expanding the Discussion to more explicitly address the practical challenges of species range mapping and to situate the proposed approach in relation to existing solutions. For instance, the paper by Hughes et al. (2024) provides an excellent overview of common pitfalls in biodiversity mapping and how they can be circumvented: Hughes, A. C., Dorey, J. B., Bossert, S., Qiao, H., & Orr, M. C. (2024). Big data, big problems? How to circumvent problems in biodiversity mapping and ensure meaningful results. Ecography, 2024(8), e07115. https://doi.org/10.1111/ecog.07115
Authors’ response:
We thank the reviewer for bringing Hughes et al. (2024) to our attention. As our paper was submitted to Earth System Science Data in June 2024, we inadvertently missed this then-recent publication. The reference has been added in the revised paper. We fully agree with the reviewer and Hughes et al. that the scarcity of reliable, location-specific, and open-access biodiversity data—along with local threats and protections—remains a critical challenge for effective conservation. Existing public datasets are often incomplete, especially in developing regions, and remain heavily skewed toward terrestrial vertebrates, leaving major taxonomic groups and ecosystems underrepresented.
Our study takes a step towards addressing this gap. By leveraging millions of georeferenced occurrence records from GBIF, we applied machine-based pattern recognition to generate occurrence maps for over 600,000 species spanning terrestrial, freshwater, and marine systems, including plants, fungi, arthropods, and mollusks, in addition to vertebrates. Our data are open access. More importantly, our open-access estimation algorithms and code are designed for rapid updates, allowing us to incorporate the approximately 1.3 million new reports added to GBIF each day. This capacity supports timely improvements in species mapping and provides a scalable framework for replication in conservation analyses worldwide. In this sense, our initiative complements the existing literature by extending taxonomic breadth, enhancing spatial coverage, and strengthening the open-data foundation for biodiversity monitoring.
We also wish to emphasize that uneven data availability across taxonomic groups can introduce selectivity bias in biodiversity conservation and resource allocation. Traditional metrics have focused largely on terrestrial vertebrates, while plants, invertebrates, and marine species have remained underrepresented due to limited data. Building on the broader dataset assembled in this paper, we conducted an additional analysis to estimate extinction risks across all major taxonomic groups, including vertebrates, plants, and invertebrates (Dasgupta et al. 2025). Our findings indicate that many of the previously underrepresented groups face relatively higher extinction risks than vertebrates (see the bar chart and the table), underscoring that conservation strategies focused on a single group risk systematic bias (see "A new World Bank database to support a new era in biodiversity conservation" and the "Broadening Our View of Global Biodiversity" PowerPoint, slide 37). Expanding coverage to a wider range of taxa therefore provides a more balanced and accurate foundation for conservation priority setting: it uncovers many more potentially threatened species worldwide and significantly alters "conservation hotspot" maps.
Source: "A new World Bank database to support a new era in biodiversity conservation" World Bank Data Blog (August 29, 2024) at https://blogs.worldbank.org/en/opendata/a-new-world-bank-database-to-support-a-new-era-in-biodiversity-c (accessed 2025-10-29) and Broadening Our View of Global Biodiversity PowerPoint, slide 37 (November 2024) at: https://thedocs.worldbank.org/en/doc/6be2561d9ba07ce60a76bde4143be7c0-0050022025/broadening-our-view-of-global-biodiversity-powerpoint (accessed 2025-10-29).
RC2:
Second, I encourage the authors to consider discussing the specific barriers to data mobilisation in developing regions such as Southeast Asia, where biodiversity data remain sparse and fragmented. For instance: Tanalgo, K. C. (2025). Open and FAIR data sharing are building blocks to bolster biodiversity conservation in Southeast Asia. Biological Conservation, 307, 111192. https://doi.org/10.1016/j.biocon.2025.111192
Authors’ response:
Once again, we thank the reviewer for the suggestion. We have incorporated the critical need for open and equitable sharing of biodiversity, threat, and protection data across countries and administrative jurisdictions into the introduction and discussion of the revised paper, and have added the following two references there as well as in the list of references.
RC2:
While other reviewers have already provided detailed feedback on the technical aspects of the analyses—which I share—I would emphasise that the manuscript would benefit most from a more developed Discussion. By addressing these broader challenges and situating the study within the context of ongoing debates on data quality, accessibility, and mobilisation, the authors can considerably strengthen the manuscript’s impact and relevance for both scientific and applied conservation audiences.
Authors’ response:
Thank you for this thoughtful suggestion. As recommended by the reviewer, we added a new Discussion section (Section 5). We agree that addressing regional disparities in biodiversity data availability is crucial, especially in highly biodiverse yet data-poor areas such as Southeast Asia. We revised the manuscript to include a discussion of key barriers to data mobilization in these regions, such as limited funding, infrastructure, and institutional support, as well as cultural and political challenges related to data sharing. The reference to Tanalgo (2025) is particularly valuable, and we cite it to highlight how open and FAIR data practices can support more equitable and comprehensive biodiversity mapping.
References:
Dasgupta, S., Blankespoor, B., & Wheeler, D. Predicting Species Extinction Risks using Occurrence Data from the Global Biodiversity Information Facility. Journal of Management & Sustainability, 15(2), 93, 2025. https://doi.org/10.5539/jms.v15n2p93
Hughes, A. C., Dorey, J. B., Bossert, S., Qiao, H., & Orr, M. C. Big data, big problems? How to circumvent problems in biodiversity mapping and ensure meaningful results. Ecography, 2024(8), e07115, 2024. https://doi.org/10.1111/ecog.07115
Tanalgo, K. C. Open and FAIR data sharing are building blocks to bolster biodiversity conservation in Southeast Asia. Biological Conservation, 307, 111192, 2025. https://doi.org/10.1016/j.biocon.2025.111192
Data sets
Global Biodiversity Species Occurrence Endemism and Small Occurrence Data S. Dasgupta, B. Blankespoor, and D. Wheeler https://datacatalog.worldbank.org/search/dataset/0066034/global_biodiversity_data
Global Biodiversity Species Occurrence Gridded Data and Global Biodiversity Species Global Grid S. Dasgupta, B. Blankespoor, and D. Wheeler https://datacatalog.worldbank.org/search/dataset/0066034/global_biodiversity_data
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 1,310 | 404 | 823 | 2,537 | 47 | 82 |
This paper underscores the importance of global biodiversity data in shaping effective conservation strategies. By analyzing species occurrence maps, it reveals regional patterns that reflect ecosystem health. It highlights how species abundance helps identify conservation statuses, pinpointing critical areas and vulnerable species groups. Additionally, mapping species distribution supports the development of more precise, tailored conservation plans.