Mapping photovoltaic power plants in China using Landsat, Random Forest, and Google Earth Engine
- 1College of Geography and Environmental Science, Henan University, Kaifeng 475004, China
- 2Key Laboratory of Geospatial Technology for the Middle and Lower Yellow River Regions (Henan University), Ministry of Education, Kaifeng 475004, China
- 3Henan Key Laboratory of Earth System Observation and Modeling, Henan University, Kaifeng 475004, China
Abstract. Photovoltaic (PV) technology, an efficient solution for mitigating the impacts of climate change, is increasingly used across the world to replace fossil-fuel power and reduce greenhouse gas emissions. With the world's highest cumulative and fastest-growing PV capacity, China needs to assess the environmental and social impacts of its established PV power plants. However, a comprehensive map of the locations and extent of PV power plants is still lacking at the country scale. This study developed a workflow combining machine learning and visual interpretation with big satellite data to map the PV power plants in China. We applied a pixel-based Random Forest (RF) model to classify PV power plants from 2020 composite images with 30-meter spatial resolution on Google Earth Engine (GEE). The resulting classification map was further refined by visual interpretation. Eventually, we established a map of PV power plants in China as of 2020, covering a total area of 2917 km2. Based on the derived national PV map, we found that most PV power plants were sited on cropland, followed by barren land and grassland. In addition, the installation of PV power plants has generally decreased the vegetation cover. This new dataset is expected to support policy management, environmental assessment, and further classification of PV power plants.
Xunhe Zhang et al.
Status: final response (author comments only)
-
RC1: 'Comment on essd-2022-16', Anonymous Referee #1, 21 Feb 2022
In this manuscript, authors developed a workflow combining machine learning and visual interpretation methods to map the Photovoltaic power plants in China. This topic is very important to assess the environmental and social impacts of these established photovoltaic power plants. In fact, there are a number of papers on remote sensing target information extraction, the paper is not particularly novel. And, there are some problems in this paper:
1, The introduction is inconformity with the objectives of study. For example, the one and two paragraphs are talking about the Photovoltaic power plants, machine learning, which can be wrote in a more refined way and introduce the main topic quickly. The references in the Introduction section are too limited, the authors should refer more works of relation analysis on deep learning methods, PV power plants and Remote sensing images, about the mechanism of deep learning and remote sensing image extraction of PV power plants, this paper hasn't given more details description.
2, Figure 2 (page 7): the specific meanings of 1, 2, 3, 4, 5, and 6 in this figure should be explained.
3, There are too many texts in the discussion section and need to be further streamlined.
4, In addition, did the authors consider how to validate the results? how can we believe the results? e.g. what validation, more specific about ground truthing etc. Without this information, I can not trust the results of this paper.
-
AC1: 'Reply on RC1', Xunhe Zhang, 20 Apr 2022
In this manuscript, authors developed a workflow combining machine learning and visual interpretation methods to map the Photovoltaic power plants in China. This topic is very important to assess the environmental and social impacts of these established photovoltaic power plants. In fact, there are a number of papers on remote sensing target information extraction, the paper is not particularly novel. And, there are some problems in this paper:
Response: We are grateful for referee #1’s recognition of this study’s importance. Although a number of previous studies have mapped land properties, PV power plant mapping has not been widely conducted, and an open dataset of PV power plants in China is still lacking. Our dataset offers the latest public record of the spatial extent of PV power plants in China. In this study, we integrate the advantages of cloud computing, machine learning, visual interpretation, and freely available remote sensing imagery to map the PV power plants in China.
1, The introduction is inconformity with the objectives of study. For example, the one and two paragraphs are talking about the Photovoltaic power plants, machine learning, which can be wrote in a more refined way and introduce the main topic quickly. The references in the Introduction section are too limited, the authors should refer more works of relation analysis on deep learning methods, PV power plants and Remote sensing images, about the mechanism of deep learning and remote sensing image extraction of PV power plants, this paper hasn't given more details description.
Response to comment 1: Following referee #1’s suggestion, we have rewritten the Introduction to fit the objectives of this study. We streamlined the Introduction, added sentences describing the mechanisms of machine learning and deep learning, and added more references on these learning methods. We also further explained the advantages of our methods.
2, Figure 2 (page 7): the specific meanings of 1, 2, 3, 4, 5, and 6 in this figure should be explained.
Response to comment 2: We apologize for the unclear description of Figure 2. We have revised Figure 2 and given it an accurate description. The labels 1, 2, 3, 4, 5, and 6 mark the six example sites for which true-color images from Landsat-8, Sentinel-2, and Google Earth are shown for visual interpretation.
3, There are too many texts in the discussion section and need to be further streamlined.
Response to comment 3: We have streamlined and shortened the discussion in the revised manuscript.
4, In addition, did the authors consider how to validate the results? how can we believe the results? e.g. what validation, more specific about ground truthing etc. Without this information, I can not trust the results of this paper.
Response to comment 4: We mapped the PV power plants in two stages. In the first stage, we used a pixel-based random forest model with selected features to map the PV power plants in China. We validated the performance of the random forest model using the kappa coefficient, overall accuracy, producer’s accuracy, and user’s accuracy (Table 3). In the second stage, we used visual interpretation to filter out polygons misclassified as PV power plants, that is, the commission errors of the machine learning step. We carefully inspected each polygon against the latest Landsat-8, Sentinel-2, and Google Earth true-color images. The entire visual interpretation step took us about 2 weeks. Although visual interpretation is time-consuming, it generally offers highly accurate validation.
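For readers unfamiliar with the four accuracy measures named above, the following is a minimal sketch of how they are usually derived from a confusion matrix. The function name `accuracy_metrics` and the sample counts are hypothetical illustrations; they are not the values reported in Table 3, and this is not the authors' code.

```python
def accuracy_metrics(matrix):
    """Compute common accuracy measures from a square confusion matrix.

    matrix[i][j] is the number of validation samples whose reference
    class is i and whose mapped (predicted) class is j.
    Assumes every class has at least one reference and one mapped sample.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    ref_totals = [sum(matrix[i]) for i in range(n)]                        # row sums
    map_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # column sums

    overall = correct / total
    # Agreement expected by chance, from the marginal totals
    expected = sum(r * m for r, m in zip(ref_totals, map_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    return {
        "overall_accuracy": overall,
        "kappa": kappa,
        # Producer's accuracy: share of reference samples mapped correctly
        "producers_accuracy": [matrix[i][i] / ref_totals[i] for i in range(n)],
        # User's accuracy: share of mapped samples that are correct
        "users_accuracy": [matrix[i][i] / map_totals[i] for i in range(n)],
    }


# Hypothetical counts: class 0 = PV, class 1 = non-PV
example = [[90, 10],
           [5, 95]]
metrics = accuracy_metrics(example)
print(metrics)  # overall accuracy 0.925, kappa 0.85 for these made-up counts
```

A low producer's accuracy for the PV class indicates omission errors, while a low user's accuracy indicates commission errors, which are the errors the visual interpretation stage is meant to remove.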
In the revised manuscript, we added further validation by comparing our dataset with Dunnett's dataset and Kruitwagen's dataset in China. Dunnett et al. (2020) provided a harmonized solar plant dataset obtained from OpenStreetMap, an open-access map in which PV power plants are annotated by volunteers. The total area of PV power plants in China in Dunnett's dataset is 897.4 km2, of which 842 km2 spatially intersects with our dataset. The remaining 55.4 km2 does not intersect with our dataset; some of these plants are too small for our method to recognize.
Kruitwagen's dataset (Kruitwagen et al., 2021) was produced with deep learning methods. The total area of PV power plants in China in Kruitwagen's dataset is 2169.8 km2 as of 2018, of which 1873.5 km2 spatially intersects with our dataset. The remaining 296.3 km2 does not intersect with our dataset; some of these plants are too small to be identified by our method, and some are misidentified in Kruitwagen's dataset.
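The area figures quoted for the two reference datasets can be cross-checked with simple arithmetic. The sketch below uses only the totals and intersecting areas stated in this reply; the helper name `overlap_summary` is hypothetical, and the overlap percentage is a derived ratio that the reply itself does not report.

```python
def overlap_summary(total_km2, intersecting_km2):
    """Return the non-intersecting area and the intersecting share of a reference dataset."""
    missed = total_km2 - intersecting_km2
    return {
        "missed_km2": round(missed, 1),
        "overlap_pct": round(100 * intersecting_km2 / total_km2, 1),
    }


# Areas as quoted in the reply (km2)
dunnett = overlap_summary(897.4, 842.0)
kruitwagen = overlap_summary(2169.8, 1873.5)

print(dunnett)      # {'missed_km2': 55.4, 'overlap_pct': 93.8}
print(kruitwagen)   # {'missed_km2': 296.3, 'overlap_pct': 86.3}
```

The non-intersecting areas match the 55.4 km2 and 296.3 km2 stated above. The percentages describe the area of each reference dataset that intersects the new map and should not be read as a formal accuracy estimate.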
We found that our method sometimes fails to recognize PV power plants in mountainous areas, where solar panels typically have unusual installation spacing and installation angles. The small size of some PV power plants (less than 0.04 km2) was potentially another cause of misclassification in this study.
-
RC2: 'Comment on essd-2022-16', Anonymous Referee #2, 20 Mar 2022
In this manuscript, machine learning and visual interpretation methods were combined to map the PV power plants in China. The topic is very important, and the study results would be useful for developing PV industries in the future. However, there are some problems in the manuscript which should be solved before reconsideration for publication in Earth System Science Data:
1. The Introduction section should be written. The novelty of this study compared to the previous studies should be highlighted. And the useless contents should be removed.
2. The "Dunnett's dataset" was used as the basic training and validation samples, which means your model's ability was "limited" to this dataset. In other words, the PV power plants that cannot be identified in "Dunnett's dataset" may also be ignored in your model. Although you mentioned that you modified the dataset, how it was implemented and what is the difference between the modified dataset and the original dataset were not clearly stated.
3. Writing should be taken more seriously as there are many writing and grammatical errors in the manuscript. For example, in line 201," we discovered that the EVI values of PV power plants in 2020 was strongly and positively linked with the that in 2013".
4. The quality of the figures should be further improved. For example, the "North arrow" in Fig. 1 and Fig. 5 were missing. And in Fig. 2, what do 1-6 mean?
-
AC2: 'Reply on RC2', Xunhe Zhang, 20 Apr 2022
In this manuscript, machine learning and visual interpretation methods were combined to map the PV power plants in China. The topic is very important, and the study results would be useful for developing PV industries in the future. However, there are some problems in the manuscript which should be solved before reconsideration for publication in Earth System Science Data:
Response: We thank referee #2 very much for the positive feedback.
1. The Introduction section should be written. The novelty of this study compared to the previous studies should be highlighted. And the useless contents should be removed.
Response to comment 1: Following referee #2’s comment, we have rewritten the Introduction. In the new Introduction, we describe the advantages of combining cloud computing, machine learning, visual interpretation, and freely available remote sensing imagery to map the PV power plants in China. We also highlight the advantages of our study over previous studies. The current Introduction has been substantially refined.
2. The "Dunnett's dataset" was used as the basic training and validation samples, which means your model's ability was "limited" to this dataset. In other words, the PV power plants that cannot be identified in "Dunnett's dataset" may also be ignored in your model. Although you mentioned that you modified the dataset, how it was implemented and what is the difference between the modified dataset and the original dataset were not clearly stated.
Response to comment 2: We thank referee #2 for pointing this out. Suitable training samples are indeed crucial for an RF model's classification accuracy and stable performance. We used a modified version of Dunnett's dataset as our training samples. We found that the labelled PV power plants in Dunnett's dataset are sparse in eastern China, which would limit our model's ability to identify PV power plants in similar areas. We therefore manually selected and delineated the extent of additional PV power plants that were not annotated in Dunnett's dataset, so that the labelled data covered most of the parameter space of PV power plants in China. The total area of PV power plants in China is about 897 km2 in the original Dunnett's dataset, and the area of the modified training regions was 1121 km2. We then randomly sampled points within the training regions, with balanced quantities from the humid and arid regions of China. We have added these statements to Section 2.1.3 of the revised manuscript for clarity.
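The balanced sampling step described above can be illustrated as follows. This is a minimal sketch of stratified random sampling with an equal quota per stratum; the function name `balanced_sample`, the point identifiers, and the quota are hypothetical, and the reply does not state the actual number of points or the tool the authors used.

```python
import random


def balanced_sample(points_by_region, per_region, seed=0):
    """Draw the same number of random points from each region.

    points_by_region maps a region name (e.g. "humid", "arid") to a list
    of candidate points that fall inside the training regions.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = {}
    for region, points in points_by_region.items():
        if len(points) < per_region:
            raise ValueError(f"not enough candidate points in region {region!r}")
        # Sample without replacement within the stratum
        sample[region] = rng.sample(points, per_region)
    return sample


# Hypothetical candidate points, identified here by simple labels
candidates = {
    "humid": [f"humid_{i}" for i in range(300)],
    "arid": [f"arid_{i}" for i in range(1200)],
}
training_points = balanced_sample(candidates, per_region=200)
print({region: len(points) for region, points in training_points.items()})
```

Drawing an equal quota per stratum keeps the region with more candidate points from dominating the training set, which is the concern raised about the sparse coverage of eastern China.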
3. Writing should be taken more seriously as there are many writing and grammatical errors in the manuscript. For example, in line 201," we discovered that the EVI values of PV power plants in 2020 was strongly and positively linked with the that in 2013".
Response to comment 3: We have thoroughly gone through the manuscript to edit the writing and correct the grammatical errors.
4. The quality of the figures should be further improved. For example, the "North arrow" in Fig. 1 and Fig. 5 were missing. And in Fig. 2, what do 1-6 mean?
Response to comment 4: We have further improved all figures in the revised manuscript. We added the "North arrow" to Figure 1 and Figure 5. We apologize for the unclear description of Figure 2; we have revised the figure, given it an accurate description, and added a scale bar and coordinates. The labels 1, 2, 3, 4, 5, and 6 mark the six example sites for which true-color images from Landsat-8, Sentinel-2, and Google Earth are shown for visual interpretation.
-
Data sets
The dataset of photovoltaic power plant distribution in China by 2020 Xunhe Zhang https://doi.org/10.5281/zenodo.4552918
Viewed
- HTML: 447
- PDF: 153
- XML: 23
- Total: 623
- BibTeX: 5
- EndNote: 10