https://doi.org/10.5194/essd-2025-263
04 Jun 2025
Status: this preprint is currently under review for the journal ESSD.

CLRD-GLPS: A Long-term Seasonal Dataset of Ruminant Livestock Distribution in China's Grazing Production Systems (2000–2021) Using Stacking-based Interpretable Machine Learning

Ning Zhan, Tao Ye, Mario Herrero, Jian Peng, Weihang Liu, and Heng Ma

Abstract. Understanding the spatiotemporal distribution of grazing livestock is crucial for assessing the sustainability of livestock systems, managing animal diseases, mitigating climate change risks, and controlling greenhouse gas emissions. In China, grazing ruminants are predominantly distributed across vast grasslands in semi-arid and alpine regions. However, existing gridded livestock distribution datasets do not distinguish grazing from other livestock production systems and do not simultaneously account for long-term and seasonal dynamics. This study introduces CLRD-GLPS, a dataset mapping China's ruminant livestock distribution in grazing livestock production systems from 2000 to 2021. Our approach addresses these limitations by using interpretable machine learning to separate grazing livestock from total livestock populations and by generating seasonal grazing pastures with dynamic grazing suitability masks. We developed a stacking-based ensemble methodology that enhances predictive performance while providing insights into distribution mechanisms. In 5-fold cross-validation, the stacking ensemble models achieve R² values of 0.909 to 0.967 for cattle and 0.874 to 0.914 for sheep and goats. Validation shows that CLRD-GLPS is accurate across multiple spatial scales. At the county level, it agrees strongly with census data. City-level validation yields R² = 0.691–0.881, and grid-level validation against independent observations yields R² = 0.79, supporting the accuracy of the fine-resolution predictions. The CLRD-GLPS dataset provides essential information for understanding grazing ruminant dynamics and developing targeted livestock management policies. Furthermore, our methodological framework offers a template for creating similar livestock distribution datasets for other regions and livestock production systems.
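
To illustrate the general idea of a stacking ensemble scored by 5-fold cross-validation, the following is a minimal sketch only. It is not the authors' pipeline: the abstract does not name the base learners, predictors, or meta-learner, so the ridge and k-nearest-neighbour learners, the linear meta-learner, the synthetic data, and all function names here are assumptions chosen to keep the example self-contained (NumPy only).

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Ridge regression with an unpenalised intercept; returns a predict function."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0  # do not shrink the intercept
    w = np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ w

def fit_knn(X, y, k=5):
    """k-nearest-neighbour regression (mean of the k closest training targets)."""
    def predict(Xn):
        d = ((Xn[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        idx = np.argsort(d, axis=1)[:, :k]
        return y[idx].mean(axis=1)
    return predict

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = ((y_true - y_pred) ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def kfold_indices(n, n_splits, seed):
    """Shuffle row indices and split them into n_splits test folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), n_splits)

def fit_stack(X, y, fitters, n_splits=5):
    """Train base learners, then a linear meta-learner on out-of-fold predictions."""
    oof = np.zeros((len(X), len(fitters)))
    for test_idx in kfold_indices(len(X), n_splits, seed=0):
        train_idx = np.setdiff1d(np.arange(len(X)), test_idx)
        for j, fit in enumerate(fitters):
            oof[test_idx, j] = fit(X[train_idx], y[train_idx])(X[test_idx])
    meta = fit_ridge(oof, y, alpha=1e-6)      # meta-learner sees only OOF predictions
    base = [fit(X, y) for fit in fitters]     # refit base learners on all training rows
    return lambda Xn: meta(np.column_stack([m(Xn) for m in base]))

def cross_val_r2(X, y, fitters, n_splits=5):
    """Outer k-fold CV: one held-out R² per fold for the whole stack."""
    scores = []
    for test_idx in kfold_indices(len(X), n_splits, seed=1):
        train_idx = np.setdiff1d(np.arange(len(X)), test_idx)
        model = fit_stack(X[train_idx], y[train_idx], fitters)
        scores.append(r2_score(y[test_idx], model(X[test_idx])))
    return scores

# Synthetic stand-in for (predictors, livestock density); purely illustrative.
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(300, 3))
y = 3 * X[:, 0] + np.sin(4 * X[:, 1]) + rng.normal(0, 0.1, 300)

scores = cross_val_r2(X, y, [fit_ridge, fit_knn])
print("per-fold R2:", np.round(scores, 3))
```

Training the meta-learner on out-of-fold predictions, rather than on in-sample predictions, keeps it from rewarding base learners that merely memorise the training data. The outer loop then reports one held-out R² per fold, which is the form in which the ranges above are quoted.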

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 11 Jul 2025)


Data sets

CLRD-GLPS: A Long-term Seasonal Dataset of Ruminant Livestock Distribution in China's Grazing Production Systems (2000–2021) Using Stacking-based Interpretable Machine Learning
Ning Zhan, Tao Ye, Mario Herrero, Jian Peng, Weihang Liu, and Heng Ma
https://doi.org/10.5281/zenodo.15347430


Latest update: 06 Jun 2025
Short summary
We separated grazing ruminant livestock (cattle, sheep, and goats) from those raised in non-grazing systems in China and mapped where they graze from 2000 to 2021. Using machine learning methods that combine multiple models, we produced accurate maps showing livestock distribution across seasons. This helps track seasonal changes in grazing and supports better land use, animal health, and climate-related planning.