https://doi.org/10.5194/essd-2025-662
04 Dec 2025
Status: this preprint is currently under review for the journal ESSD.

ChinaAI-FSC: A Comprehensive AI-Ready MODIS Fractional Snow Cover Dataset for China (2000–2022)

Jinliang Hou, Mingkai Zhang, Xiaohua Hao, Jifu Guo, Peng Dou, Ying Zhang, and Chunlin Huang

Abstract. We present ChinaAI-FSC, the first large-scale, standardized, AI-ready fractional snow cover (FSC) sample collection for mainland China, spanning 22 snow seasons from 2000 to 2022 and addressing a critical gap in long-term snow monitoring. The dataset consists of 47,728 samples, each a tile of 128 × 128 MODIS pixels, for which high-resolution Landsat-5/7/8/9 and Sentinel-2 imagery provide consistent FSC reference labels. A total of 20 feature variables, including MODIS surface reflectance (bands 1–7), topographic attributes, forest and land cover information, and geolocation factors, were extracted to enable both point-scale and tile-scale spatially contextualized AI modelling. A structured and transparent workflow, encompassing systematic sample preparation, rigorous quality control, spatiotemporal sample partitioning, and standardized metadata, ensures reproducibility, physical consistency, and interoperability across machine learning and deep learning applications. Dataset reliability and AI-readiness were systematically evaluated using a novel “Four Layers-Four Domains-Fifteen Attributes (4L-4D-15A)” assessment protocol, covering data, information, system, and application dimensions. The quality, reliability, and usability of ChinaAI-FSC were demonstrated through three representative use cases: (1) benchmarking of six ML/DL models (ANN, SVR, RF, CNN, UNet, and ResNet), (2) validation of the standard MODIS FSC product, and (3) nationwide seamless FSC mapping. By providing harmonized, validated, and well-documented samples, ChinaAI-FSC establishes a unified foundation for AI-driven snow cover mapping, long-term monitoring, and cryosphere–hydrological modelling, promoting reproducible, interoperable, and next-generation research in cryospheric science.
The dataset is publicly available from the National Tibetan Plateau Data Center (TPDC) at https://doi.org/10.11888/Cryos.tpdc.303034 (also accessible via https://cstr.cn/18406.11.Cryos.tpdc.303034) and from Zenodo at https://doi.org/10.5281/zenodo.17707386.
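
To make the distinction between tile-scale and point-scale use concrete, the following is a minimal sketch, not the authors' code. Only the counts (20 feature variables, 128 × 128 pixels per tile) come from the abstract. The in-memory layout (a feature array of shape (20, 128, 128) with a (128, 128) FSC label), the use of NaN for missing values, and the helper name tile_to_points are assumptions made for illustration; the actual file format and variable order should be taken from the dataset documentation. Random arrays stand in for a real sample.

```python
import numpy as np

N_FEATURES = 20  # number of feature variables stated in the abstract
TILE = 128       # tile edge length in MODIS pixels, stated in the abstract


def tile_to_points(features, fsc):
    """Flatten one tile into per-pixel rows (hypothetical helper).

    features: assumed shape (20, 128, 128), one layer per feature variable.
    fsc:      assumed shape (128, 128), reference FSC label per pixel.
    Pixels with a non-finite label or feature value are dropped.
    """
    if features.shape != (N_FEATURES, TILE, TILE):
        raise ValueError(f"unexpected feature shape {features.shape}")
    if fsc.shape != (TILE, TILE):
        raise ValueError(f"unexpected label shape {fsc.shape}")
    X = features.reshape(N_FEATURES, -1).T  # (16384, 20), one row per pixel
    y = fsc.reshape(-1)                     # (16384,), same pixel order as X
    valid = np.isfinite(y) & np.isfinite(X).all(axis=1)
    return X[valid], y[valid]


# Synthetic stand-in for one sample; real samples come from the dataset files.
rng = np.random.default_rng(0)
features = rng.random((N_FEATURES, TILE, TILE), dtype=np.float32)
fsc = rng.random((TILE, TILE), dtype=np.float32)
fsc[0, :5] = np.nan  # pretend five label pixels are missing

X, y = tile_to_points(features, fsc)
print(X.shape, y.shape)
```

Under these assumptions, the unflattened (20, 128, 128) array would feed tile-scale models such as CNN, UNet, or ResNet, while the flattened rows would suit point-scale models such as ANN, SVR, or RF.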

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 10 Jan 2026)


Data sets

ChinaAI-FSC: A Comprehensive AI-Ready MODIS Fractional Snow Cover Dataset for China (2000-2022) Jinliang Hou et al. https://doi.org/10.5281/zenodo.17707386

Model code and software

AI-Ready-China-FSC Jinliang Hou et al. https://github.com/houjin0503/AI-Ready-China-FSC

Latest update: 04 Dec 2025
Short summary
ChinaAI-FSC provides the first large-scale, AI-ready snow dataset for mainland China, spanning 2000–2022. By integrating MODIS, Landsat, and Sentinel-2 observations with advanced quality control, it supports AI model training, benchmarking, and large-scale snow mapping. The dataset enhances snow monitoring accuracy and fosters reproducible research on climate and hydrological processes.