the Creative Commons Attribution 4.0 License.
Mapping global onshore wind turbines using multi-source remote sensing images and hybrid learning approaches
Abstract. Wind power serves as a vital zero-carbon alternative to fossil fuels for climate change mitigation. Nevertheless, the vast expansion of wind turbine installation requires extensive terrestrial resources, raising wide concerns regarding land use competition and ecological impacts. Quantifying these effects necessitates near real-time geospatial data on wind turbine placement and density. However, current methods remain inadequate monitoring for the fast-growing wind turbine deployment. Here, we developed an integrated framework that combines OpenStreetMap (OSM) data with multi-source remote sensing images (Google Earth and Sentinel-1/2) and deep learning and traditional machine learning models (ResNet-18 and Random Forest) to map global onshore wind turbines. Our models achieve validation accuracy >97 % while enabling cost-effective, timely updates of global onshore wind turbines. Eventually, we established a geographical dataset covering a total of 379,595 wind turbines globally by 2024. This dataset represents a tenfold expansion over currently available global wind turbine inventories as of 2020. In addition, we found that 80 % of wind turbines are situated on cropland and grassland, followed by forest and bare ground. This dataset facilitates essential studies on renewable energy land management, ecological impact analysis, and data-driven energy transition policies. The codes and dataset of the global onshore wind turbines is available at Zenodo link: https://doi.org/10.5281/zenodo.16759861 (Shujun et al., 2025).
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-512', Maximilian Kleebauer, 01 Dec 2025
- RC2: 'Comment on essd-2025-512', Anonymous Referee #2, 20 Feb 2026
Overall Assessment
This manuscript presents a 2024 global inventory of 379,595 onshore wind turbines derived through a hybrid workflow integrating OpenStreetMap (OSM), high-resolution Google Earth imagery, Sentinel-1/2 data, deep learning (ResNet-18), and Random Forest classification. The topic is timely and relevant, and the resulting dataset could potentially support renewable energy planning, biodiversity assessments, and land-use analysis at multiple scales.
Despite its potential value, the manuscript raises significant concerns regarding benchmarking against existing global datasets, spatial completeness, false positives, and validation rigor. In its current form, the study does not convincingly demonstrate that the proposed dataset represents a substantial advancement over the most up-to-date global wind turbine inventories. These issues should be carefully addressed before the manuscript can be considered for publication.
Major Comments
The authors state that their dataset represents a tenfold expansion over the 2020 global wind turbine inventory containing 33,514 turbines. While this numerical comparison is technically correct, it does not constitute an appropriate benchmark. The manuscript does not adequately compare the proposed dataset with the Global Renewables Watch dataset (https://github.com/microsoft/global-renewables-watch?tab=readme-ov-file#dataset-download), which already contains 375,197 wind turbines and is publicly available. Given that this resource is both recent and widely recognized, it should serve as the primary point of reference rather than a 2020 dataset.
Furthermore, the Global Renewables Watch dataset includes additional attributes such as construction year, which are absent from the authors’ dataset. This omission weakens the claim of producing a more comprehensive inventory. A direct and systematic comparison with Global Renewables Watch is therefore essential. The authors should quantify the spatial overlap between the two datasets, explain methodological differences, and clarify in what specific ways their dataset improves upon existing products in terms of coverage, accuracy, update frequency, or attribute richness. Without this comparison, the assertion of substantial advancement remains insufficiently supported.
I downloaded both datasets. A visual comparison between the two datasets reveals substantial spatial inconsistencies. A considerable number of turbines present in Global Renewables Watch are missing from the authors’ dataset. Conversely, the authors’ dataset includes many turbines that do not appear in Global Renewables Watch. Among these newly detected turbines, a noticeable proportion appear to be false positives. These discrepancies raise concerns about both omission and commission errors.
The manuscript reports very high classification metrics, but these results appear to be derived primarily from internal validation procedures. Independent cross-dataset validation is necessary for a global infrastructure product of this scale. The authors should quantify omission and commission rates relative to Global Renewables Watch and perform targeted manual verification of turbines that are unique to their dataset. Regional accuracy assessments would also be more informative than reporting only global averages, particularly given the strong geographic heterogeneity in wind turbine siting environments.
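As an illustration of the requested cross-dataset check, the following minimal sketch matches candidate turbine points to a reference inventory and derives apparent omission and commission rates. The function names, the 100 m tolerance, and the greedy nearest-first matching are assumptions made for this example, not the authors' or Global Renewables Watch's method. The brute-force pairing is only suitable for small regional subsets; a global comparison would need a spatial index.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cross_dataset_errors(candidate, reference, tol_m=100.0):
    """Match (lat, lon) points one-to-one within tol_m, nearest pairs first.

    The reference is treated as truth, so the rates are only "apparent":
    unmatched candidates may be real turbines missing from the reference.
    tol_m is a hypothetical tolerance chosen for illustration.
    """
    if not candidate or not reference:
        raise ValueError("both point lists must be non-empty")
    # Collect every candidate/reference pair closer than the tolerance.
    pairs = []
    for i, (clat, clon) in enumerate(candidate):
        for j, (rlat, rlon) in enumerate(reference):
            d = haversine_m(clat, clon, rlat, rlon)
            if d <= tol_m:
                pairs.append((d, i, j))
    pairs.sort()  # nearest pairs are assigned first
    used_c, used_r = set(), set()
    for _, i, j in pairs:
        if i not in used_c and j not in used_r:
            used_c.add(i)
            used_r.add(j)
    matched = len(used_c)
    return {
        "matched": matched,
        "omission_rate": 1 - matched / len(reference),
        "commission_rate": 1 - matched / len(candidate),
    }
```

Running the same function per region rather than once globally would give the regional breakdown requested above. Unmatched candidate points are the natural input for targeted manual verification.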
An additional concern relates to the acquisition and use of Google Earth imagery for training deep learning models. Google’s terms of service and licensing agreements generally restrict bulk downloading, scraping, and redistribution of imagery, particularly for automated analysis or machine learning training, unless explicit authorization has been obtained. Even if enforcement actions are unlikely, the legal and ethical responsibility remains with the data producers. The fact that similar practices may have appeared in other published studies does not necessarily imply compliance with platform policies.
More broadly, a truly state-of-the-art global wind turbine dataset should ideally integrate existing authoritative sources rather than operate independently of them. A stronger contribution would involve building upon Global Renewables Watch as a baseline and incorporating newly detected turbines identified by the proposed workflow. Preserving useful attributes such as construction year and land cover would substantially increase scientific value. Including a data source field (e.g., OSM, Global Renewables Watch, or the authors) to indicate provenance and a confidence score or quality flag would further improve transparency and usability. Such integration would significantly enhance the comprehensiveness and reliability of the dataset.
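To make the suggested provenance and quality fields concrete, a record schema along the following lines could be used. This is a sketch only: the class name, field names, and source labels are hypothetical and do not come from the manuscript or from Global Renewables Watch. The output follows the GeoJSON convention of longitude-first coordinates.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical provenance labels; the actual vocabulary is the authors' choice.
ALLOWED_SOURCES = {"OSM", "GRW", "this_study"}


@dataclass(frozen=True)
class TurbineRecord:
    lon: float
    lat: float
    source: str                  # which inventory contributed the point
    confidence: float            # quality score in [0, 1]
    construction_year: Optional[int] = None  # kept when the source provides it
    land_cover: Optional[str] = None

    def __post_init__(self):
        # Reject records with unknown provenance or out-of-range scores.
        if self.source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")

    def to_feature(self) -> dict:
        """Return the record as a GeoJSON Feature (coordinates are lon, lat)."""
        return {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [self.lon, self.lat]},
            "properties": {
                "source": self.source,
                "confidence": self.confidence,
                "construction_year": self.construction_year,
                "land_cover": self.land_cover,
            },
        }
```

For example, `TurbineRecord(8.0, 54.0, "GRW", 0.9, construction_year=2019).to_feature()` yields a point feature whose properties carry both the origin of the record and the attribute inherited from the baseline inventory.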
Finally, the manuscript should focus more explicitly on validating newly detected turbines that are absent from existing inventories. These turbines represent the main added value of the proposed method, yet they are also the most likely source of false positives. The authors should isolate these unique detections, conduct stratified manual validation across multiple regions, and report false positive rates specifically for this subset. Clarifying whether classification thresholds were optimized using independent data would also strengthen methodological credibility. Reducing false positives to a minimum is particularly important for infrastructure datasets that may inform ecological impact assessments and policy decisions.
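The stratified validation requested here could be organised roughly as follows. The sketch draws a fixed number of unique detections per region for manual checking and reports a false positive rate with a Wilson score interval. The function names, the per-region sample size, and the choice of the Wilson interval are assumptions for illustration, not a prescribed protocol.

```python
import math
import random
from collections import defaultdict


def stratified_sample(records, per_region, seed=0):
    """Draw up to per_region record ids from each region.

    records: iterable of (record_id, region) pairs for detections that are
    unique to the new dataset. A fixed seed keeps the draw reproducible.
    """
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for record_id, region in records:
        by_region[region].append(record_id)
    return {
        region: rng.sample(ids, min(per_region, len(ids)))
        for region, ids in sorted(by_region.items())
    }


def wilson_interval(false_positives, checked, z=1.96):
    """Wilson score interval (default about 95 %) for a false positive rate."""
    if checked <= 0:
        raise ValueError("at least one checked sample is required")
    p = false_positives / checked
    denom = 1 + z ** 2 / checked
    centre = (p + z ** 2 / (2 * checked)) / denom
    half = z * math.sqrt(p * (1 - p) / checked
                         + z ** 2 / (4 * checked ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)
```

Reporting the interval per region, alongside the point estimate, shows how much a small manual sample can actually say about the false positive rate of the newly detected turbines.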
Minor Comments
The manuscript contains several grammatical and stylistic issues that require correction. For example, the phrase “current methods remain inadequate monitoring” is awkwardly constructed, “which balancing representativeness” contains a grammatical error, and “Compared current datasets” is an incomplete sentence.
Citation: https://doi.org/10.5194/essd-2025-512-RC2
Data sets
Mapping global onshore wind turbines using multi-source remote sensing images and hybrid learning approaches Shujun Li et al. https://doi.org/10.5281/zenodo.17217523
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 537 | 156 | 33 | 726 | 32 | 50 |
Spelling and Grammar Issues
Lines 45–46
Original: “The codes and dataset … is available.”
Correct: “The code and dataset … are available.”
Line 83
Original: “there is a geospatial wind turbine dataset for 2020 is introduced.”
Correct: “A geospatial wind turbine dataset for 2020 was introduced.”
Unclear Description: OSM Query
The query string in line 129 (["generator: source"="wind"]) appears syntactically incorrect.
OpenStreetMap commonly uses:
The manuscript should clearly state the exact Overpass query used.
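For reference, a well-formed Overpass QL filter for wind generators might look like the sketch below. It assumes the standard OSM tagging `power=generator` with `generator:source=wind` (no space after the colon) and a bounding box given as south, west, north, east. It is not the authors' query, which the manuscript does not state. The helper names are hypothetical, and the endpoint shown is the public Overpass instance.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass endpoint; heavy global extractions should use a local
# OSM extract instead of this shared service.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_overpass_query(bbox, timeout_s=60):
    """Build an Overpass QL query for wind generators inside bbox.

    bbox is (south, west, north, east) in decimal degrees. The tag filter is
    an assumption based on common OSM tagging, not the manuscript's query.
    """
    south, west, north, east = bbox
    area = f"({south},{west},{north},{east})"
    tag_filter = '["power"="generator"]["generator:source"="wind"]'
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        "(\n"
        f"  node{tag_filter}{area};\n"
        f"  way{tag_filter}{area};\n"
        ");\n"
        "out center;"
    )


def fetch_turbines(bbox):
    """Send the query to the Overpass API and return the parsed elements."""
    payload = urllib.parse.urlencode({"data": build_overpass_query(bbox)})
    with urllib.request.urlopen(OVERPASS_URL, data=payload.encode(),
                                timeout=120) as resp:
        return json.load(resp)["elements"]
```

Publishing the query string returned by a function like `build_overpass_query`, together with the extraction date, would make the OSM step reproducible.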
Major Comment: Missing Methodological Detail on OSM Extraction
Major Comment: Unexplained OSM Error Rate
The manuscript states a “10% error rate in OSM’s global wind turbine dataset” but provides no methodological explanation.
Missing information:
Major Comment: Insufficient Documentation of Random Forest Sampling
Missing details include:
The Random Forest sampling workflow requires a clear methodological description.
Minor Comment: Sentinel-1/2 Features and Missing GEE Scripts
The manuscript lists processing steps, but details are missing, including:
Referencing Zenodo alone is insufficient; the core processing steps must appear in the manuscript.