the Creative Commons Attribution 4.0 License.
A high-resolution gridded dataset of livestock distribution on the Mongolian Plateau (2000–2020)
Abstract. Accurate quantification of the geospatial distribution of livestock in pastoral regions is important for assessing and maintaining grassland ecological security and sustainable development. Statistical livestock data based on static, macro-level administrative units cannot characterize the fine-scale distribution of livestock across mobile geographic spaces. This study proposes a livestock spatial mapping framework that combines livestock inventory statistics for soum/banner counties with multi-source data (e.g., land cover, population, topography, and climate) using the Random Forest (RF) model. A series of high-resolution gridded datasets of total livestock, sheep & goats, and large livestock (cattle, horses, and camels) densities at five-year intervals was obtained for the Mongolian Plateau from 2000 to 2020. The fit of this dataset to statistical data (R²>0.85) is significantly better than that of the existing Gridded Livestock of the World (GLW) series, and the spatial distribution is more accurate and detailed. It also compensates for the lack of spatial information on large livestock such as camels in the GLW. The approach transforms coarse-grained administrative-unit data into high-resolution gridded data by addressing the key problems of low spatial resolution, missing local details, and the spatial fusion of different data sources. The resulting livestock density data can be fused and analyzed with other geographic environment data, which is of great value for the ecological protection of grasslands in nomadic areas. Gridded livestock density datasets are freely available at https://doi.org/10.6084/m9.figshare.28695728 (Liu and Wang, 2025).
Status: final response (author comments only)
-
RC1: 'Comment on essd-2025-256', Anonymous Referee #1, 03 Jun 2025
This paper presents a spatial mapping framework for livestock distribution by integrating soum/banner-level livestock inventory statistics with multi-source data using the Random Forest model. The study is well-designed, the manuscript is clearly written and logically structured. However, several issues should be addressed to further enhance the clarity, comprehensiveness, and practical value of the research.
- The spatial resolution of the results should be explicitly stated in the abstract. Additionally, the study does not produce annual livestock distribution maps. Is this limitation due to the availability of annual statistical data?
- Regarding Figure 1, a land cover map might be more informative than a terrain map in helping readers understand the context of the study. For Figure 3, the axes should be clearly labeled with appropriate units and descriptions.
- In Table 1, the authors mention using both MCD12Q1 and GLC_FCS30D as land cover datasets. However, the manuscript lacks a clear explanation of how MCD12Q1 was utilized. Could the authors clarify its specific role? Moreover, if both datasets were used, how were potential inconsistencies or conflicts between them resolved?
- What role does land cover data play in this study? Is it used as an input feature for model training, or as a mask to constrain the spatial extent of livestock distribution? Given that most livestock in the study area are likely found in grassland areas, was this considered in the mapping process?
- In Figure 9, there appears to be a decline in livestock distribution in the northwestern part of the study area around 2005. What could be the reason for this change? Was it due to policy, climate, land use changes, or other factors?
- In the discussion section, the authors primarily focus on analyzing the spatiotemporal patterns of the resulting dataset. However, for a data-oriented journal, I believe greater emphasis should be placed on discussing the data construction methodology and potential sources of uncertainty associated with the dataset.
Citation: https://doi.org/10.5194/essd-2025-256-RC1 -
AC2: 'Reply on RC1', Yaping Liu, 22 Jul 2025
This paper presents a spatial mapping framework for livestock distribution by integrating soum/banner-level livestock inventory statistics with multi-source data using the Random Forest model. The study is well-designed, the manuscript is clearly written and logically structured. However, several issues should be addressed to further enhance the clarity, comprehensiveness, and practical value of the research.
Response: We appreciate your overall evaluation of this paper. Following your suggestions, we have revised the whole manuscript; the changes are described point by point below.
1. The spatial resolution of the results should be explicitly stated in the abstract. Additionally, the study does not produce annual livestock distribution maps. Is this limitation due to the availability of annual statistical data?
Response: Thank you for your comments. First, we have added the spatial resolution of 1 km to the abstract (see line 19). Five-year intervals were used in this study for two reasons. (1) Data availability: obtaining annual county-level livestock statistics for the whole Mongolian Plateau is a real challenge. They can be accessed on Mongolia's official statistics website, but they are not available annually for Inner Mongolia, possibly because this region has more soums/banners. To ensure comparability between China and Mongolia, a five-year interval was ultimately adopted. (2) Alignment with other data: most of the other long-term datasets used also have a five-year interval, so this choice aligns almost all the data to the same time periods. An annual dataset is a very good suggestion for our future work. As more data become available, we will obtain annual livestock data to further improve the temporal resolution. We have added this future plan to Section 5.
2. Regarding Figure 1, a land cover map might be more informative than a terrain map in helping readers understand the context of the study. For Figure 3, the axes should be clearly labeled with appropriate units and descriptions.
Response: Thank you for your comment. For Figure 1, we have replaced the topographic map with a land cover map. For Figure 3, units have been added, the horizontal axis is now labeled “Statistics,” and the vertical axis is labeled “Simulated.” In addition, we checked all the figures and updated some of them, e.g., Figure 6.
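The comparison behind Figure 3 (county statistics on the horizontal axis, simulated values on the vertical axis) could be sketched as follows. This is only an illustration, assuming the simulated grid has already been summed per soum/banner. The function name `r_squared` and the sample numbers are hypothetical, and the discussion does not state which definition of R² was used; the coefficient of determination is assumed here.

```python
def r_squared(statistics, simulated):
    """Coefficient of determination of simulated totals against statistics."""
    if len(statistics) != len(simulated) or len(statistics) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_stat = sum(statistics) / len(statistics)
    # Residual sum of squares: simulated value vs. reported statistic.
    ss_res = sum((s - p) ** 2 for s, p in zip(statistics, simulated))
    # Total sum of squares of the reported statistics around their mean.
    ss_tot = sum((s - mean_stat) ** 2 for s in statistics)
    if ss_tot == 0:
        raise ValueError("statistics have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical per-unit livestock totals, for illustration only.
statistics = [1200.0, 3400.0, 560.0, 7800.0]
simulated = [1100.0, 3600.0, 600.0, 7500.0]
r2 = r_squared(statistics, simulated)
```

A squared Pearson correlation would be an equally plausible reading of "R²"; the two agree only when the simulated values are an unbiased linear fit to the statistics.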
3. In Table 1, the authors mention using both MCD12Q1 and GLC_FCS30D as land cover datasets. However, the manuscript lacks a clear explanation of how MCD12Q1 was utilized. Could the authors clarify its specific role? Moreover, if both datasets were used, how were potential inconsistencies or conflicts between them resolved?
Response: Thank you for your comment. MCD12Q1 and GLC_FCS30D are both land cover datasets listed in Table 1. GLC_FCS30D (30 m resolution) was used as the basic land cover data for extracting grassland and cropland coverage in this study, while MCD12Q1 (500 m resolution) was used only as a reference for the reclassification (see supplementary Table S1). After comparing the two reclassification results, GLC_FCS30D clearly performed better at the grid scale, so GLC_FCS30D was used as the final input land cover data for model training. To make this clear, we added the supplementary table (S1).
S1 Reclassification information of land cover data products.
Code | Type | MCD12Q1 (500 m) | GLC_FCS30D (30 m) |
---|---|---|---|
1 | Cropland | 12 Croplands, 14 Cropland/Natural Vegetation Mosaics | 10 Rainfed cropland, 11 Herbaceous cover, 12 Tree or shrub cover, 20 Irrigated cropland |
2 | Forest | 1 Evergreen Needleleaf Forests, 2 Evergreen Broadleaf Forests, 3 Deciduous Needleleaf Forests, 4 Deciduous Broadleaf Forests, 5 Mixed Forests | 51 Open evergreen broadleaved forest, 52 Closed evergreen broadleaved forest, 61 Open deciduous broadleaved forest, 62 Closed deciduous broadleaved forest, 71 Open evergreen needle-leaved forest, 72 Closed evergreen needle-leaved forest, 81 Open deciduous needle-leaved forest, 82 Closed deciduous needle-leaved forest, 91 Open mixed leaf forest, 92 Closed mixed leaf forest |
3 | Grassland | 8 Woody Savannas, 9 Savannas, 10 Grasslands | 130 Grassland |
4 | Shrubland | 6 Closed Shrublands, 7 Open Shrublands | 122 Deciduous shrubland |
5 | Wetland | 11 Permanent Wetlands | 180 Wetlands |
6 | Water | 17 Water Bodies | 210 Water body |
7 | Tundra | / | 140 Lichens and mosses, 150 Sparse vegetation |
8 | Impervious surface | 13 Urban and Built-up Lands | 190 Impervious surfaces |
9 | Bare land | 16 Barren | 200 Bare areas, 201 Consolidated bare areas, 202 Unconsolidated bare areas, 152 Sparse shrubland, 153 Sparse herbaceous |
10 | Permanent ice/snow | 15 Permanent Snow and Ice | 220 Permanent ice and snow |
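The GLC_FCS30D column of Table S1 amounts to a lookup from source class codes to the ten merged codes. A minimal sketch of such a reclassification is given below, assuming the land cover tile is held as a NumPy array of 8-bit class codes. The function name `reclassify`, the fill value 0 for unlisted codes, and the sample tile are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Merged code -> GLC_FCS30D source codes, transcribed from Table S1.
S1_GROUPS = {
    1: [10, 11, 12, 20],                              # Cropland
    2: [51, 52, 61, 62, 71, 72, 81, 82, 91, 92],      # Forest
    3: [130],                                         # Grassland
    4: [122],                                         # Shrubland
    5: [180],                                         # Wetland
    6: [210],                                         # Water
    7: [140, 150],                                    # Tundra
    8: [190],                                         # Impervious surface
    9: [200, 201, 202, 152, 153],                     # Bare land
    10: [220],                                        # Permanent ice/snow
}


def reclassify(codes, groups, fill=0):
    """Map source class codes to merged codes; unlisted codes become `fill`."""
    lookup = np.full(256, fill, dtype=np.uint8)  # one slot per 8-bit code
    for merged_code, source_codes in groups.items():
        lookup[source_codes] = merged_code
    return lookup[codes]


# Hypothetical 2 x 3 tile; 255 is not listed in Table S1 and maps to fill.
tile = np.array([[10, 130, 210], [122, 190, 255]], dtype=np.uint8)
merged = reclassify(tile, S1_GROUPS)
```

A lookup array is used instead of per-class boolean masks so that a large tile is reclassified in a single indexing pass.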
4. What role does land cover data play in this study? Is it used as an input feature for model training, or as a mask to constrain the spatial extent of livestock distribution? Given that most livestock in the study area are likely found in grassland areas, was this considered in the mapping process?
Response: Thank you for your comment. Land cover data indeed played a dual role in this study: (1) model parameterization: grassland coverage and cropland coverage were extracted from land cover data as key environmental predictors and directly incorporated into the training of the random forest model; (2) spatial domain constraints: a mask was generated using land cover data and the “Global suitability map for pastoral areas” data to exclude areas unsuitable for livestock distribution (such as water and built-up areas), strictly constraining the simulation results within ecologically reasonable spatial boundaries. For example, we extracted water areas directly from the land cover data and excluded them from the livestock distribution.
As for grassland areas, field surveys and statistical data show that non-grassland areas such as the southwestern Gobi Desert also have scattered livestock (e.g., camels). Therefore, rather than restricting the livestock distribution with a simple grassland mask, we used the “Global suitability map for pastoral areas” data as the reference for the potential distribution of livestock. Cells with values ≠ -997 are selected as the potential spatial extent of livestock distribution. To make this clearer, we updated the related description in Section 2.2.1.
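The masking rule described in this response could look roughly like the sketch below. It assumes the suitability raster, the merged land cover raster, and the predicted density raster are co-registered NumPy arrays on the same grid. The value -997 comes from the response and the class codes 6 (water) and 8 (impervious surface) from Table S1; the function name, the use of NaN for excluded cells, and the sample arrays are illustrative assumptions.

```python
import numpy as np

UNSUITABLE = -997          # suitability value named in the response
WATER, IMPERVIOUS = 6, 8   # merged class codes from Table S1


def livestock_mask(suitability, land_cover):
    """True where livestock may be mapped: value != -997 and not water/built-up."""
    suitable = suitability != UNSUITABLE
    excluded = np.isin(land_cover, [WATER, IMPERVIOUS])
    return suitable & ~excluded


# Hypothetical 2 x 2 rasters, for illustration only.
suitability = np.array([[1, -997], [3, 2]])
land_cover = np.array([[3, 3], [6, 9]])
density = np.array([[12.0, 5.0], [7.0, 0.4]])

mask = livestock_mask(suitability, land_cover)
masked_density = np.where(mask, density, np.nan)  # NaN marks excluded cells
```

Note that the bare-land cell (code 9) is kept, which matches the point that Gobi areas outside grassland can still hold livestock such as camels.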
5. In Figure 9, there appears to be a decline in livestock distribution in the northwestern part of the study area around 2005. What could be the reason for this change? Was it due to policy, climate, land use changes, or other factors?
Response: Thank you for your comment. In Figure 9, the observed decline in livestock density is concentrated in some soums of southern Khövsgöl Province, northern Zavkhan Province, and northwestern Arkhangai Province. These areas are traditionally important pastoral regions and also ecologically fragile areas that are relatively sensitive to climate change. The reduction is predominantly attributed to consecutive extreme dzud events between 2000 and 2002, in which extremely cold weather and insufficient forage reserves caused significant livestock losses and a sharp population decline [1-2]. In addition, intensifying poverty, declining herder mobility, and the weakening of formal and customary ranch management systems also contributed to the decrease in livestock numbers [3]. This has been explained in Section 4.3, and the related references have been added.
[1] Nandintsetseg, B., Greene, J. S., and Goulden, C. E.: Trends in extreme daily precipitation and temperature near Lake Hövsgöl, Mongolia, International Journal of Climatology, 27, 341-347, 2007.
[2] Rao, M. P., Davi, N. K., D’Arrigo, R. D., Skees, J., Nachin, B., Leland, C., Lyon, B., Wang, S.-Y., and Byambasuren, O.: Dzuds, droughts, and livestock mortality in Mongolia, Environmental Research Letters, 10, 074012, 10.1088/1748-9326/10/7/074012, 2015.
[3] Fernandez-Gimenez, M. E.: Land use and land tenure in Mongolia: A brief history and current issues, in: Bedunah, D. J., McArthur, E. D., and Fernandez-Gimenez, M., comps., Rangelands of Central Asia: Proceedings of the Conference on Transformations, Issues, and Future Challenges, 27 January 2004, Salt Lake City, UT, Proceedings RMRS-P-39, US Department of Agriculture, Forest Service, Rocky Mountain Research Station, Fort Collins, CO, 30-36, 2006.
6. In the discussion section, the authors primarily focus on analyzing the spatiotemporal patterns of the resulting dataset. However, for a data-oriented journal, I believe greater emphasis should be placed on discussing the data construction methodology and potential sources of uncertainty associated with the dataset.
Response: Thank you for your constructive suggestion. We have expanded the discussion of the applicability of the data construction method in the discussion section. Comparison with existing studies [4-5] indicates that the method simulates the spatial distribution of livestock with high accuracy. We have also strengthened the discussion of possible uncertainties in the dataset, which stem mainly from the lack of fine-scale data and the diversity of data sources. Finally, we point out a limitation of this study: large livestock are mapped only as a group, and the distribution of each subtype needs to be refined in future work on the basis of the current overall distribution.
[4] Robinson, T. P., Wint, G. W., Conchedda, G., Van Boeckel, T. P., Ercoli, V., Palamara, E., Cinardi, G., D'Aietti, L., Hay, S. I., and Gilbert, M.: Mapping the global distribution of livestock, PLoS ONE, 9, e96084, https://doi.org/10.1371/journal.pone.0096084, 2014.
[5] Gilbert, M., Nicolas, G., Cinardi, G., Van Boeckel, T. P., Vanwambeke, S. O., Wint, G., and Robinson, T. P.: Global distribution data for cattle, buffaloes, horses, sheep, goats, pigs, chickens and ducks in 2010, Scientific Data, 5, 1-11, https://doi.org/10.1038/sdata.2018.227, 2018.
-
RC2: 'Comment on essd-2025-256', Anonymous Referee #2, 07 Jun 2025
This study presents a valuable contribution to the field of grassland ecology and sustainable livestock management by generating a 1-km resolution gridded dataset of livestock distribution across the Mongolian Plateau from 2000 to 2020. The work addresses a critical gap in existing datasets, such as the Gridded Livestock of the World (GLW), by integrating multi-source remote sensing data (e.g., land cover, climate, and socioeconomic variables) with statistical livestock inventories using a Random Forest (RF) model. The dataset notably improves spatial resolution and includes large livestock species (e.g., camels) overlooked in prior global datasets. The research is scientifically significant for informing grassland conservation, overgrazing mitigation, and policy-making in mobile pastoral systems. However, the methodology adopted by the authors may lead to considerable uncertainty in the results. In addition, the manuscript needs to add some necessary details to the description of the method.
1. The RF model’s hyperparameter optimization (e.g., n_estimators, max_depth) is briefly mentioned but lacks details on cross-validation procedures or sensitivity analyses. Documenting the iterative process (e.g., grid/random search) and reporting optimal parameters is critical for reproducibility. It is recommended to add relevant descriptions.
2. The study relies on 436 administrative units annually, potentially oversimplifying spatial heterogeneity in complex landscapes (e.g., the Gobi Desert vs. the steppe). Stratified sampling or spatially explicit validation (e.g., hotspot analysis) is needed to ensure model robustness across diverse ecosystems.
3. This effort develops livestock data from 2000 to 2020, but unfortunately with 5-year intervals rather than continuous annual time series, which would lose some time variation information. Since yearly time series of livestock numbers are available for the different boroughs (https://www.1212.mn/), it is recommended to develop a continuous time series for the study period.
4. The underlying data of livestock numbers used in this work are from statistical yearbooks or the Bureau of Statistics, and considering that there is some uncertainty in these statistics, the uncertainty of the results of this paper should be discussed.
5. The work focuses on the spatial distribution and change characteristics of the number of livestock. It is suggested to introduce the temporal changes in the number of different livestock during the study period so that readers can more easily understand the spatial and temporal dynamics of the livestock industry in the region.
Citation: https://doi.org/10.5194/essd-2025-256-RC2 -
AC1: 'Reply on RC2', Yaping Liu, 22 Jul 2025
This study presents a valuable contribution to the field of grassland ecology and sustainable livestock management by generating a 1-km resolution gridded dataset of livestock distribution across the Mongolian Plateau from 2000 to 2020. The work addresses a critical gap in existing datasets, such as the Gridded Livestock of the World (GLW), by integrating multi-source remote sensing data (e.g., land cover, climate, and socioeconomic variables) with statistical livestock inventories using a Random Forest (RF) model. The dataset notably improves spatial resolution and includes large livestock species (e.g., camels) overlooked in prior global datasets. The research is scientifically significant for informing grassland conservation, overgrazing mitigation, and policy-making in mobile pastoral systems. However, the methodology adopted by the authors may lead to considerable uncertainty in the results. In addition, the manuscript needs to add some necessary details to the description of the method.
Response: We appreciate your overall evaluation of this paper. Following your suggestions, we have revised the whole manuscript; the changes are described point by point below.
1. The RF model’s hyperparameter optimization (e.g., n_estimators, max_depth) is briefly mentioned but lacks details on cross-validation procedures or sensitivity analyses. Documenting the iterative process (e.g., grid/random search) and reporting optimal parameters is critical for reproducibility. It is recommended to add relevant descriptions.
Response: Thank you for your comment. We have added details of the RF parameter optimization process: the hyperparameters were optimized systematically using the grid search method, and the optimal parameters (n_estimators, max_depth, min_samples_split, min_samples_leaf) were determined through three cross-validations. We used sensitivity analysis to identify the factors with higher feature importance. The details of hyperparameter optimization, cross-validation, and sensitivity analysis have been added to Section 3.1 of the manuscript. Following this update, we have added the code for the sensitivity analysis and re-uploaded it to figshare (https://doi.org/10.6084/m9.figshare.28695728), where it is available to readers.
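For readers unfamiliar with the procedure, a grid search over the four named hyperparameters can be sketched with scikit-learn as follows. This is not the authors' code: the data are synthetic, the candidate values in `param_grid` are placeholders, "three cross-validations" is read here as three-fold cross-validation (`cv=3`), and impurity-based feature importance stands in for the sensitivity analysis, whose exact form is not described in this discussion.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic predictors and target, for illustration only.
rng = np.random.default_rng(0)
X = rng.random((120, 4))
y = 3.0 * X[:, 0] + X[:, 1] + rng.normal(0.0, 0.1, 120)

# Placeholder candidate values; the manuscript's actual grid is not given here.
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [None, 5],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,            # assumed meaning of "three cross-validations"
    scoring="r2",
)
search.fit(X, y)

best_params = search.best_params_                             # best combination found
importances = search.best_estimator_.feature_importances_    # sums to 1
```

`GridSearchCV` evaluates every combination in the grid, whereas `RandomizedSearchCV` samples a fixed number of combinations, so the two should not be described interchangeably.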
2. The study relies on 436 administrative units annually, potentially oversimplifying spatial heterogeneity in complex landscapes (e.g., the Gobi Desert vs. the steppe). Stratified sampling or spatially explicit validation (e.g., hotspot analysis) is needed to ensure model robustness across diverse ecosystems.
Response: Thank you for your comments on the sampling strategy. The sample in this study is the currently available panel data for the entire Mongolian Plateau, and training and simulation were conducted on this panel data. Similar studies of livestock distribution use the same approach [1-2]. We are also grateful for your suggestions on spatial heterogeneity. During livestock spatialization, we selected spatially heterogeneous environmental factors (such as climate factors and NDVI) as model input variables to further support the robustness of the model.
[1] Gilbert, M., Nicolas, G., Cinardi, G., Van Boeckel, T. P., Vanwambeke, S. O., Wint, G., and Robinson, T. P.: Global distribution data for cattle, buffaloes, horses, sheep, goats, pigs, chickens and ducks in 2010, Scientific Data, 5, 1-11, https://doi.org/10.1038/sdata.2018.227, 2018.
[2] Kolluru, V., John, R., Saraf, S., et al.: Gridded livestock density database and spatial trends for Kazakhstan, Scientific Data, 10, 839, 2023.
3. This effort develops livestock data from 2000 to 2020, but unfortunately with 5-year intervals rather than continuous annual time series, which would lose some time variation information. Since yearly time series of livestock numbers are available for the different boroughs (https://www.1212.mn/), it is recommended to develop a continuous time series for the study period.
Response: Thank you for your constructive suggestion. Annual time series products are indeed needed in this region. However, five-year intervals were used in this study for two reasons. (1) Data availability: obtaining annual county-level livestock statistics for the whole Mongolian Plateau is a real challenge. They can be accessed on Mongolia's official statistics website, but they are not available annually for Inner Mongolia, possibly because this region has more counties. To ensure comparability between China and Mongolia, a five-year interval was ultimately adopted. (2) Alignment with other data: most of the other long-term datasets used also have a five-year interval, so this choice aligns almost all the data to the same time periods. Following this suggestion, as more data become available in future work, annual livestock data will be obtained to further improve the temporal resolution. The plan for constructing an annual livestock density dataset has been added to Section 5.
4. The underlying data of livestock numbers used in this work are from statistical yearbooks or the Bureau of Statistics, and considering that there is some uncertainty in these statistics, the uncertainty of the results of this paper should be discussed.
Response: Thank you for your comment. Based on current data availability, this study uses livestock statistics from the Inner Mongolia Autonomous Region Statistical Yearbook, county and banner data, and the official website of the Mongolian National Bureau of Statistics (https://www.1212.mn/). These data are collected under standardized census systems (such as household-by-household counts in China and sample surveys in Mongolia) and undergo a tiered quality review process to ensure their reliability. In the future, more small-region verification, including consistency checks of yearbook data, can be carried out to enhance the reliability of the data.
5. The work focuses on the spatial distribution and change characteristics of the number of livestock. It is suggested to introduce the temporal changes in the number of different livestock during the study period so that readers can more easily understand the spatial and temporal dynamics of the livestock industry in the region.
Response: Thank you for your constructive suggestion. To present the spatio-temporal dynamics of animal husbandry in the study area more fully, we have added a description of the temporal changes in the different types of livestock in Section 2.2.3. The numbers of total livestock and of sheep & goats are rising, while the number of large livestock decreased initially before increasing in the recent decade. Besides the trends in livestock numbers, we also added a description of the spatio-temporal changes in livestock in Section 4.3.
Citation: https://doi.org/10.5194/essd-2025-256-AC1
-
CC1: 'Comment on essd-2025-256', Yuanzhi Yao, 11 Jun 2025
There are many errors in the code shared at the links, and it does not align with the description in the manuscript.
For example, the authors state in line 243 that they used the RandomizedSearchCV method, but the code uses GridSearchCV.
I think the authors need to check their code carefully so that it matches the description in the manuscript.
Citation: https://doi.org/10.5194/essd-2025-256-CC1 -
AC3: 'Reply on CC1', Yaping Liu, 22 Jul 2025
Response: Thank you for your comment. We sincerely apologize for the inconsistencies between the shared code and the manuscript descriptions. We have checked and updated the code so that it is consistent with the manuscript. In addition, we added new code for the additional sensitivity analysis experiments in this revision. The updated code and content are available in the figshare repository (https://doi.org/10.6084/m9.figshare.28695728).
Citation: https://doi.org/10.5194/essd-2025-256-AC3
Data sets
Gridded_livestock_mongolian_plateau_2000_2020 Yaping Liu and Juanle Wang https://doi.org/10.6084/m9.figshare.28695728
Model code and software
Gridded_livestock_mongolian_plateau_2000_2020 Yaping Liu and Juanle Wang https://doi.org/10.6084/m9.figshare.28695728
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
328 | 70 | 21 | 419 | 9 | 13 |