the Creative Commons Attribution 4.0 License.
SinoLC-1: the first 1-meter resolution national-scale land-cover map of China created with the deep learning framework and open-access data
Zhuohong Li
Mofan Cheng
Jingxin Hu
Guangyi Yang
Hongyan Zhang
Abstract. In China, the demand for a more precise perception of the national land surface has become urgent given the pace of development and urbanization. Constructing a very-high-resolution (VHR) land-cover dataset for China with national coverage, however, is a non-trivial task and thus an active area of research, impeded by the challenges of image acquisition, manual annotation, and computational complexity. To fill this gap, the first 1-meter resolution national-scale land-cover map of China, SinoLC-1, was established using a deep learning-based framework and open-access data including global land-cover (GLC) products, OpenStreetMap (OSM), and Google Earth imagery. Reliable training labels were generated by combining three 10-meter GLC products and OSM data. These training labels and 1-meter resolution images derived from Google Earth were used to train the proposed framework. This framework resolved the label noise stemming from a resolution mismatch between images and labels by combining a resolution-preserving backbone, a weakly supervised module, and a self-supervised loss function to refine the VHR land-cover results automatically without any manual annotation. Processing the 73.25 TB dataset on large storage and computing servers to obtain the final SinoLC-1 land-cover product, which covers the entire land surface of China (~9,600,000 km²), took about 10 months. The SinoLC-1 product was validated using a visually interpreted validation set of 106,852 random samples and a statistical validation set collected from the official land survey report provided by the Chinese government. The validation results showed that SinoLC-1 achieved an overall accuracy of 73.61 % and a kappa coefficient of 0.6595. Validations for every provincial region further indicated the credible accuracy of this dataset across the whole of China. Furthermore, the statistical validation results indicated that SinoLC-1 conformed closely to the official survey reports.
In addition, SinoLC-1 was qualitatively compared with five other widely used GLC products. These results indicated that SinoLC-1 had the highest spatial resolution, the most accurate land-cover edges, and the finest landscape details. In conclusion, as the first 1-meter resolution national-scale land-cover map of China, SinoLC-1 delivers credible accuracy and provides foundational support for related research and applications throughout China. The SinoLC-1 land-cover product is freely accessible at https://doi.org/10.5281/zenodo.7707461 (Li et al., 2023).
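The abstract reports an overall accuracy and a kappa coefficient. As a minimal sketch of how these two statistics are conventionally derived from a confusion matrix, the example below uses a made-up 3-class matrix; the function name `overall_accuracy_and_kappa` and the toy counts are illustrative assumptions and are not taken from the SinoLC-1 validation set.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa.

    confusion[i][j] is the number of samples whose reference class is i
    and whose mapped class is j (square matrix, hypothetical layout).
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n

    # Chance agreement: product of row and column marginals, summed per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Toy counts for three classes (made up for illustration only).
toy_confusion = [
    [50, 5, 5],
    [10, 30, 0],
    [0, 10, 40],
]
oa, kappa = overall_accuracy_and_kappa(toy_confusion)
print(f"overall accuracy = {oa:.4f}, kappa = {kappa:.4f}")
```

Kappa discounts the agreement expected by chance from the class marginals, which is why it is lower than the overall accuracy whenever the classes are unevenly represented.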
Zhuohong Li et al.
Status: final response (author comments only)
RC1: 'Comment on essd-2023-87', Anonymous Referee #1, 13 Apr 2023
General comments:
This work introduces SinoLC-1, the first 1-m resolution land-cover data product for China. It may be useful for understanding fine-scale biogeophysical issues on land. The product also offers advances in big data processing, sample migration, and open-access data application that might be useful for efficient national land resource surveys and for mapping large-scale very-high-resolution land cover. The manuscript should be improved before it can be accepted.
Suggestions and comments:
- The Introduction, which focuses on data at the global scale, highlights three types of land use/land cover data that fully or partially cover China. Because this manuscript focuses on producing land-cover maps at the national scale, a review of national- and local-scale land use/land cover data in China is advised. The CLCD and CLUD in China and the NLCD in the United States are examples of widely used national-scale land use/land cover products.
- Why not use band compositions to assist in mapping?
- Why are other OSM types not involved in mapping?
- The building category of the classification system is open to question. Table 2 maps building to mining land in the NLRS, which is inappropriate because mining land refers to a mine site (see the NLRS land category determination rules published in 2019). Moreover, forest swamps should be difficult to classify from optical images, and from RGB images in particular. The authors are encouraged to present mapping results for land-cover types that are hard to distinguish in medium-resolution imagery in order to show the scientific significance and applicability of SinoLC-1.
- Add legends to all maps to address the current difficulty of comparing different product qualities, such as Figure 13.
- The authors used current global-scale land-cover products as mapping samples, but their quality in the Chinese region is uncertain; according to the text and figures in section 4.2 of the manuscript, it is not always robust. How do the authors account for these factors, which might affect the quality of SinoLC-1?
- It is challenging to automatically map forests, shrubs, grasslands, wetlands, and tundra using medium-resolution images. To help the reader understand the characteristics of the various land-cover types in Google images, it is advised that the authors revise Figure 6 by adding VHR samples.
- Figure 17 compares the area discrepancies between the provincial land-cover categories of SinoLC-1 and the NLRS. The value interval on the vertical axis is too large. For instance, for Henan province, each tick interval on the vertical axis corresponds to 5,000 km2. The area difference between the two results therefore cannot be seen clearly for land-cover categories with small areas. The authors should seek alternative comparison methods that make the area difference clear for all land-cover types.
- It is advised that the authors use more inclusive language, primarily in section 4.2, where words like "worst" need to be changed.
- Checking the terms and some phrases is advised, e.g., "OBAI" in line 104 and "cropped" in line 199.
Citation: https://doi.org/10.5194/essd-2023-87-RC1
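Referee #1's remark on Figure 17 asks for a comparison in which small land-cover classes are not hidden by a coarse absolute axis. One possible alternative, sketched here under stated assumptions, is to report the difference relative to the reference area of each class. The function name `relative_area_difference` and all area values below are hypothetical and are not taken from SinoLC-1 or the NLRS.

```python
def relative_area_difference(mapped_km2, reference_km2):
    """Return the percent difference of mapped area relative to reference area.

    Both arguments are dicts keyed by land-cover class name. Classes with a
    zero reference area get None because the ratio is undefined.
    """
    result = {}
    for name, ref in reference_km2.items():
        if ref == 0:
            result[name] = None
        else:
            result[name] = 100.0 * (mapped_km2[name] - ref) / ref
    return result


# Made-up areas for one province, in km2 (illustration only).
reference = {"cropland": 80000.0, "wetland": 400.0}
mapped = {"cropland": 82000.0, "wetland": 600.0}

diff = relative_area_difference(mapped, reference)
for name, pct in diff.items():
    print(f"{name}: {pct:+.1f} %")
```

In this toy case the 2,000 km2 cropland gap is a 2.5 % difference, while the 200 km2 wetland gap, invisible on an axis with 5,000 km2 intervals, is a 50 % difference.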
RC2: 'Comment on essd-2023-87', Anonymous Referee #2, 24 Apr 2023
The authors made a tremendous effort to classify the land cover of China at very high (1 m) resolution. However, the uncertainty of the training datasets, the reproducibility of the methods, and the independence of the validation are not clear.
This manuscript used three global-scale land-cover products as training samples, but their mapping accuracy in China is uncertain, especially considering that only a small number of observations in China were included to generate these maps. The uncertainty of SinoLC-1 in the Southwest, Northwest, and North regions due to unmatched training data and outdated VHR images also needs to be considered.
Validation uncertainty. The authors manually annotated 106,852 points by visual interpretation of VHR or HR imagery as validation datasets (Line 296-298). However, visual interpretation might contain considerable uncertainty. For example, ponds/lakes, paddy fields, and wetlands might be misinterpreted. Some open-access validation datasets exist (some obtained from field surveys); it would be great if the authors could add more rigorous and transparent validation.
Line 25: “SinoLC-1 conformed closely to the official survey reports”. This expression is vague and needs statistical values to show how close.
Line 275-276: “the predicted batches were seamlessly merged into the land-cover tiles by taking the average predicted values of the overlapped areas”. Since land cover is categorical data, it would be more reasonable to take the majority instead of the average.
Figure 7: the bars show the sample number instead of the proportion. It would be better to show the proportion of validation samples of each type among all sample points (106,852) alongside the area proportion of each land-cover type of China in the SinoLC-1 dataset.
Figure 8: the legend is missing.
Line 409-412: The expression is not clear, please clarify which types showed higher accuracies (O.A. and kappa), and which types showed low accuracies.
Figure 15: Adding the numerical values of confusion proportions to this figure would provide more quantitative information.
Section 3.2 belongs to the Results, but almost no numerical statistics are shown to support the descriptions.
Figure 18: the left panel (a) shows the misestimated area; it would be more comparable if it showed the misestimated rate for each land-cover type.
Line 480-485: Figure 20 shows significant land-cover changes between 2011 and 2021. It would be better to add a statistical table of the proportion of change areas in each region, which would be helpful to assess the uncertainty in the Southwest, Northwest and North region.
Citation: https://doi.org/10.5194/essd-2023-87-RC2
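Referee #2's comment on Line 275-276 contrasts averaging with majority voting when overlapping predictions are merged. The sketch below illustrates the difference for a single pixel covered by three overlapping batches. Whether the manuscript averages per-class scores or class indices is not stated in the quoted sentence, so both readings are shown as assumptions; all function names and numbers are hypothetical.

```python
from collections import Counter


def merge_by_mean_score(score_vectors):
    """Average per-class scores across overlapping predictions, then argmax."""
    n_classes = len(score_vectors[0])
    mean = [
        sum(vec[c] for vec in score_vectors) / len(score_vectors)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: mean[c])


def merge_by_majority(labels):
    """Pick the most frequent class label; ties go to the smallest index."""
    counts = Counter(labels)
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)


def naive_label_mean(labels):
    """Average the class indices directly (shown only to expose the problem)."""
    return sum(labels) / len(labels)


# Hypothetical per-class scores for one pixel from three overlapping batches.
scores = [
    [0.90, 0.10, 0.00],
    [0.40, 0.60, 0.00],
    [0.45, 0.55, 0.00],
]
hard_labels = [max(range(3), key=lambda c: vec[c]) for vec in scores]

mean_score_class = merge_by_mean_score(scores)
majority_class = merge_by_majority(hard_labels)
print("hard labels:", hard_labels)
print("mean-score merge:", mean_score_class)
print("majority merge:", majority_class)

# Averaging class indices 0 and 2 yields 1.0, a class neither batch predicted.
print("naive mean of labels [0, 2]:", naive_label_mean([0, 2]))
```

Averaging per-class scores before taking the argmax keeps the result categorical and can disagree with the majority vote when one batch is much more confident than the others. Averaging class indices directly is the case the referee's objection applies to, since the mean of two labels can land on an unrelated class.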
Data sets
SinoLC-1: the first 1-meter resolution national-scale land-cover map of China created with the deep learning framework and open-access data Zhuohong Li, Wei He, Mofan Cheng, Jingxin Hu, Xiao An, Yan Huang, Guangyi Yang, and Hongyan Zhang https://doi.org/10.5281/zenodo.7707461
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
4,635 | 1,287 | 382 | 6,304 | 150 | 119 |