the Creative Commons Attribution 4.0 License.
Decadal surge of water-surface solar in China's Yangtze Delta: A high-fidelity SAR-optical fusion inventory (2015–2024)
Abstract. China hosts approximately 97 % of the world's water-surface photovoltaics (WPV), with nearly two-thirds of its national capacity concentrated in the Yangtze River Delta (YRD), a densely populated economic powerhouse facing intense land-energy trade-offs. Despite this dominance, no high-resolution, decade-long inventory has existed to track this rapid expansion. WPV detection using optical remote sensing (RS) imagery is severely limited by persistent cloud cover, water surface reflections, and spectral confusion, which compromise long-term consistency over aquatic environments. Here, we developed a multi-sensor fusion framework integrating all-weather Sentinel-1 Synthetic Aperture Radar (SAR) and annual composite Sentinel-2 optical imagery. Key features include six Sentinel-2 bands, spectral indices (NDVI, MNDWI, NDBI, NDPI, and SAVI), texture metrics, and dual-polarization SAR backscatter. We trained a Random Forest classifier on 55,849 verified samples to generate annual WPV maps for 2015–2024. We then applied post-processing procedures, including noise removal, patch merging, and area thresholding, and further validated installation years and eliminated errors through manual inspection of Google Earth time-series imagery. The resulting dataset, the first 10 m resolution WPV atlas for the YRD, maps 401 validated projects with a cumulative area of 145.4 km² by 2024. It outperforms existing global PV inventories, with an overall accuracy of 97.3 % and a Kappa coefficient of 0.94. The results reveal rapid expansion from 17.4 km² in 2015 to 145.4 km² in 2024, with 87 % deployed on natural lakes, a marked shift in leadership from Jiangsu to Anhui, and clear spatial clustering near grid infrastructure and stable water bodies. This high-fidelity inventory provides a robust foundation for monitoring WPV evolution, assessing environmental impacts, and informing sustainable energy planning in the world's leading floating solar region.
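For readers unfamiliar with the two accuracy metrics quoted in the abstract, the following is a minimal sketch of how overall accuracy and Cohen's Kappa are derived from a confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not the validation samples of the manuscript and do not reproduce its reported values.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i
    that were predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement from the reference (row) and predicted (column) marginals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: rows = reference (WPV, non-WPV), columns = predicted.
confusion = [[450, 50], [30, 470]]
oa, kappa = accuracy_and_kappa(confusion)
print(f"Overall accuracy = {oa:.3f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than overall accuracy for the same matrix.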
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-695', Giuseppe Marco Tina, 07 Feb 2026
- RC2: 'Review report on essd-2025-695', Anonymous Referee #2, 24 Feb 2026
This paper is well written and presents solid data and methodological rigor. I have several suggestions that may further strengthen the manuscript:
- Clarification of Water Mask (Line 176)
In Line 176, please clearly specify which waterbody dataset was used as the water mask. If multiple datasets were integrated, describe their respective roles and how they were harmonized.
- Multicollinearity Discussion
Since the model integrates original optical bands together with derived spectral indices, it would be helpful to briefly discuss the potential multicollinearity between these variables. Although Random Forest is generally robust to correlated predictors, explicitly stating that multicollinearity does not adversely affect classification performance under tree-based models would improve methodological transparency.
- Random Forest Model Specification
Please provide more details about the Random Forest implementation, including key hyperparameters (e.g., number of trees, maximum tree depth, minimum samples per leaf) and whether cross-validation was used.
- Comparison with Existing Global Inventories
The manuscript would benefit from a clearer introduction and discussion of existing datasets such as A Global Inventory of PV and Global Renewables Watch. Please briefly describe their data sources, spatial resolution, update frequency, and methodological framework. Additionally, discuss how your approach improves upon these traditional inventories (e.g., spatial resolution, temporal consistency, detection accuracy, validation strategy).
- Provincial-Level Proportional Analysis (Section 3.2)
In Section 3.2, in addition to reporting cumulative WPV area, I recommend presenting the ratio of cumulative WPV area to total water area for each province. This proportional metric would provide more meaningful insight into deployment intensity and facilitate comparison across provinces.
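The proportional metric requested in the last point can be sketched as follows. This is only an illustration of the calculation: the function name, the province labels, and all area values are placeholders, not figures from the manuscript.

```python
def wpv_intensity(wpv_km2, water_km2):
    """Return cumulative WPV area as a percentage of total water area per province."""
    return {
        name: 100.0 * wpv_km2[name] / water_km2[name]
        for name in wpv_km2
        if water_km2.get(name, 0) > 0  # skip provinces with no mapped water area
    }


# Placeholder areas in km2, for illustration only.
wpv_area = {"Province A": 60.0, "Province B": 40.0}
water_area = {"Province A": 12000.0, "Province B": 5000.0}

intensity = wpv_intensity(wpv_area, water_area)
for name, pct in sorted(intensity.items(), key=lambda item: -item[1]):
    print(f"{name}: {pct:.2f} % of water area")
```

In this toy case the province with the smaller absolute WPV area has the higher deployment intensity, which is the kind of contrast the ratio is meant to expose.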
Overall, this study makes an important contribution. Subject to addressing the points raised above, I support publication of this manuscript.
Citation: https://doi.org/10.5194/essd-2025-695-RC2
- EC1: 'Comment on essd-2025-695', Chunlüe Zhou, 22 Mar 2026
General comments:
This manuscript collected RS-based reflectance; vegetation, water, and built-up indices; texture; and SAR data, and trained a random forest (RF) classifier to extract water-surface photovoltaics (WPV) over the downstream area of the Yangtze River. The authors validated the final products and analyzed the recent trend of WPV projects. The manuscript is well written, although some technical details are missing. My major concern is that I have reservations about the application of this WPV product. The authors mention some cases of how others can use it. But as an Earth system science researcher, I strongly recommend that the authors provide an application scenario under a broader Earth system framework, which is also the main scope of ESSD, e.g., how this data product can be used to improve the representation of the energy sector and its interaction with other sectors from an integrated assessment perspective.
Specific comments:
1. The input data of the RF classifier include both the direct reflectance of each Sentinel-2 MSI band and combinations of these bands, i.e., normalized indices, so the inputs depend strongly on one another. Because the classification task is relatively simple (it only tells whether a grid cell is WPV or not), the impact of this issue may not be visible in your study. I would still recommend that the authors discuss the potential impact of this dependency among inputs, i.e., multicollinearity, when training the RF classifier.
2. For RF classifier training, how did you obtain a robust model? Did you consider 10-fold cross-validation? How were the parameters determined when training the model? It would also help if the authors considered using SHAP values to show the importance of each feature for FPV detection.
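The dependency raised in comment 1 can be quantified before training. The sketch below, which assumes hypothetical reflectance values and uses NDVI as the derived index, computes the Pearson correlation between a raw band and an index built from it, together with the corresponding two-predictor variance inflation factor, VIF = 1 / (1 - r^2). It is not the authors' procedure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


# Hypothetical surface reflectances for six sample pixels.
red = [0.03, 0.05, 0.04, 0.08, 0.10, 0.06]
nir = [0.02, 0.30, 0.25, 0.12, 0.15, 0.40]
index = [ndvi(n, r) for n, r in zip(nir, red)]

r = pearson(nir, index)
# With a single other predictor, the variance inflation factor reduces to 1 / (1 - r^2).
vif = 1.0 / (1.0 - r**2)
print(f"r(NIR, NDVI) = {r:.2f}, VIF = {vif:.1f}")
```

Strong correlation of this kind rarely harms Random Forest accuracy, but it can spread impurity-based importance across the correlated features, which is one reason to report it alongside any feature ranking.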
Technical corrections:
Line 32: What does "eliminated errors" mean?
Line 90: "unprecedented accuracy" can be changed to "high".
Line 131: "To reduce cloud interference, a cloud-masking algorithm was applied, and annual median composites were generated from all available images." Repeated sentence. Please delete.
Line 134: "These composites ensure radiometric consistency and provide a stable spatial baseline for dynamic WPV detection and temporal analysis." Repeated sentence. Please delete.
Line 138: It can be helpful to have a flow chart describing how you merged different products into a unified water mask.
Figure 2: "a" should be "d", "b" should be "e", and "c-e" should be "a-c".
Line 193: Did you calculate texture features for all 6 bands?
Line 233: "Each potential region was then subjected to rigorous manual interpretation and correction using high-resolution satellite imagery from Google Earth (Fig. 3c)." Please clarify what "manual interpretation" method you used. Based on expert judgement?
Line 250: It is not clear how you build a model to account for multi-year data. Did you combine multi-year reflectance, indices and texture and SAR data together and use them as input to train your model? Please explain the methodology.
Figures 7, 8: Please change FPV to WPV.
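The compositing step quoted in the comments on Lines 131 and 134 (cloud masking followed by an annual median) can be sketched per pixel as below. The data layout, function name, and reflectance values are assumptions for illustration; the manuscript's actual cloud-masking algorithm is not specified in these comments.

```python
from statistics import median


def annual_median(observations):
    """Median of the cloud-free values for one pixel; None if every scene is cloudy.

    observations: list of (reflectance, is_cloudy) pairs from one calendar year.
    """
    clear = [value for value, is_cloudy in observations if not is_cloudy]
    return median(clear) if clear else None


# Hypothetical one-band time series for a single pixel; bright values are flagged as cloud.
pixel_series = [(0.05, False), (0.62, True), (0.06, False), (0.07, False), (0.55, True)]
composite = annual_median(pixel_series)
print(composite)
```

Because cloudy scenes are dropped before the median is taken, a few contaminated acquisitions do not shift the composite value.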
Citation: https://doi.org/10.5194/essd-2025-695-EC1
- EC2: 'A supplementary note', Chunlüe Zhou, 22 Mar 2026
This is a delayed review of this manuscript. The response and revised manuscript will be dispatched to the reviewers for the next stage of the peer-review process.
Citation: https://doi.org/10.5194/essd-2025-695-EC2
Data sets
The Yangtze River Delta Water-Surface Photovoltaics Dataset (2015–2024) Yue Yan https://doi.org/10.5281/zenodo.17484488
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 324 | 132 | 17 | 473 | 27 | 45 |
Reviewer Report
General Evaluation
The manuscript presents an important and timely contribution by producing the first decade‑long, high‑resolution (10 m) water‑surface PV (WPV) inventory for the Yangtze River Delta using a SAR–optical fusion approach. The topic is highly relevant and the dataset could be valuable for future studies.
However, several methodological aspects require clarification, and certain results need deeper interpretation before the manuscript can be considered for publication.
Major Comments
The methodology states that several Sentinel‑2–derived spectral indices (e.g., NDVI, MNDWI, NDBI, NDPI, SAVI) were used in the Random Forest classifier.
However, the manuscript does not provide any quantitative values, ranges, or thresholds that explain how these indices contribute to distinguishing:
- water surfaces
- non-vegetated land
- rocky or bare surfaces
Given the importance of spectral indices in the fusion approach, the authors should provide, at minimum:
- typical value ranges for water vs. land features,
- variable importance scores from the Random Forest model, and
- examples of how specific indices helped resolve misclassification challenges.
This transparency is essential for reproducibility.
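To make the request for typical value ranges concrete, the sketch below computes four of the indices named in the manuscript using their standard definitions (NDVI, MNDWI, NDBI, and SAVI with a soil factor of 0.5). NDPI is omitted because its definition is not given in this discussion. The reflectance values are hypothetical, and the helper names are illustrative only.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    return (a - b) / (a + b)


def spectral_indices(green, red, nir, swir1, soil_factor=0.5):
    """Standard index definitions from surface reflectance in four bands."""
    return {
        "NDVI": normalized_difference(nir, red),
        "MNDWI": normalized_difference(green, swir1),
        "NDBI": normalized_difference(swir1, nir),
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
    }


# Hypothetical reflectances for an open-water pixel and a bare-surface pixel.
water = spectral_indices(green=0.06, red=0.04, nir=0.02, swir1=0.01)
bare = spectral_indices(green=0.12, red=0.16, nir=0.22, swir1=0.30)

for name in water:
    print(f"{name}: water = {water[name]:+.2f}, bare = {bare[name]:+.2f}")
```

With these example inputs MNDWI is positive over water and negative over the bare surface, which is the kind of contrast a per-class table of index ranges would document.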
The surface area of lakes and reservoirs in the YRD can fluctuate significantly due to seasonal or multi‑year droughts.
The manuscript does not explain how these hydrological variations were handled.
Please clarify:
- Were annual water masks independently derived for each year?
- Did the classifiers incorporate hydrological seasonality?
- How were changes in water extent prevented from being misinterpreted as WPV presence or absence?
This point is critical, especially when estimating decadal trends.
The authors mention the use of texture metrics and SAR backscatter features.
However, floating PV (FPV) systems—unlike fixed structures—can move due to wind, currents, or water‑level fluctuations.
Please discuss:
- whether FPV motion affects texture features,
- whether SAR temporal variability could introduce classification noise,
- and whether the method is equally robust for fixed installations and mobile floating platforms.
This clarification is important since China hosts many FPV plants.
The developed dataset could potentially support research on the environmental effects of FPV installations.
Please comment on the feasibility of using this method to investigate:
- water-surface temperature variations due to partial shading;
- changes in water colour or turbidity, especially related to algae bloom development or suppression;
- whether SAR-optical fusion offers the sensitivity needed for such environmental applications.
These points would strengthen the broader applicability of the work.
Figure 11 shows that several basins have extremely high WPV coverage (85–95%).
The manuscript should clarify:
- Which area was used as the denominator when computing the WPV percentage (e.g., maximum historical water extent, annual water extent, permanent water core).
- Whether such high coverage is physically accurate, or if classification steps may have overestimated WPV area in small or seasonally shrinking basins.
- The implications of these very high coverage levels for hydrological, ecological, or energy-planning impacts.
A deeper interpretation is needed.
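The sensitivity of the coverage percentage to the choice of denominator can be illustrated with a small sketch. The basin, its WPV area, and the three water-extent values are hypothetical; only the three denominator options come from the comment above.

```python
def coverage_percent(wpv_km2, water_km2):
    """WPV area as a percentage of the chosen water-extent denominator."""
    return 100.0 * wpv_km2 / water_km2


# Hypothetical small basin: one WPV area, three candidate denominators (km2).
wpv_area = 0.9
denominators = {
    "maximum historical water extent": 2.0,
    "annual water extent": 1.2,
    "permanent water core": 1.0,
}

coverage = {
    name: coverage_percent(wpv_area, extent) for name, extent in denominators.items()
}
for name, pct in coverage.items():
    print(f"{name}: {pct:.0f} % covered")
```

In this toy case the same installation covers between 45 % and 90 % of the basin depending on the denominator, which is why stating the choice explicitly matters for interpreting Figure 11.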
Please provide a clear definition of lake versus reservoir, since the distinction is relevant for WPV siting policies, water‑level stability, and ownership/management regimes.
A short paragraph is needed in the Methods or Study Area section.
Minor Comments
Figure 2 labeling error
There is an inconsistency between the letters shown in the images and those referenced in the caption.
Please correct the figure annotations to ensure correspondence