the Creative Commons Attribution 4.0 License.
HIStory of LAND transformation by humans in South America (HISLAND-SA): annual and 1-km crop-specific gridded data (1950–2020)
Abstract. South America is a global hotspot for land use and land cover (LULC) change, marked by dramatic agricultural land expansion and deforestation. Developing high-resolution, long-term crop-specific data is essential for gaining a deeper understanding of natural-human interactions and addressing the impacts of human activities on regional biogeochemical, hydrological cycles, and climate. In this study, we integrated multi-source data, including high-resolution remote sensing data, model-based data, and historical agricultural census data, to reconstruct the historical dynamics of four major commodity crops (i.e., soybean, maize, wheat, and rice) in South America at annual time scale and 1 km×1 km spatial resolution from 1950 to 2020. The results showed that cropland in South America has expanded rapidly through encroachment into other vegetation over the past 70 years. Specifically, soybean is one of the most dramatically expanded crops, increasing from essentially zero in 1950 to 48.8 Mha in 2020, resulting in a total loss of 23.92 Mha of other vegetation (i.e., forest, pasture/rangeland, and unmanaged grass/shrubland). In addition, the area of maize increased by a factor of 2.1 from 12.7 Mha in 1950 to 26.9 Mha in 2020, while rice and wheat areas remained relatively stable. The newly developed crop type data provide important insights for assessing the impacts of agricultural land expansion on crop production, greenhouse gas emissions, and carbon and nitrogen cycles in South America. Moreover, these data are instrumental for developing national policies, sustainable trade, investment, and development strategies aimed at securing food supply and other human and environmental objectives in South America and other regions. The datasets are available at https://doi.org/10.5281/zenodo.14002960 (Xu et al., 2024).
Competing interests: At least one of the (co-)authors is a member of the editorial board of Earth System Science Data.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.
Status: open (until 08 Jul 2025)
RC1: 'Comment on essd-2024-527', Anonymous Referee #1, 27 Mar 2025
The authors made an effort to map the long-term crop distribution in South America by synthesizing multi-source datasets. Their efforts should be acknowledged. Overall, the paper presents a clear storyline, which is divided into three sections. Unfortunately, I did not see the scientific question that the paper aims to address. Additionally, the intended application of the research is not clear, given the existence of several relevant datasets. It appears that the work is somewhat hobby-oriented, with the research area, spatial resolution, time scale, and targeted crop types being arbitrarily determined by the authors’ interests. Furthermore, I have a few comments that are worth considering.
- A lot of work relates to raster data resampling. How can we assess the uncertainty and sensitivity of cross-scale data processing?
- Figure 3 presents the spatial distribution of crop-specific density. I find it somewhat difficult to understand. Does it represent the proportion of a given crop in a 1x1 km grid, or does it indicate the fraction of a given crop in the total cropland area within a 1x1 km grid? This is a bit confusing.
- From Figure 3, it is also difficult to interpret the areas of multiple cropping, assuming multiple cropping significantly exists in this region.
- The purpose of presenting Figure 4 is unclear. This figure could simply be produced when statistics on harvested areas are available.
- If Figure 3 represents the proportion of crop-specific density, then Figure 5 is hard to understand. By what method can this proportion be allocated to a specific land change process?
- The validation scheme is unclear and lacks a systematic approach. Given that existing datasets have been used for modeling, it is difficult to understand why they are also used for evaluation. For example, Section 3.3.1, “Evaluation Against Existing Datasets at the Provincial Level,” is puzzling, as in many cases R² = 1.
- Figure 7 presents the comparison of the crop-specific areas between this study and census data at the municipal level. However, it is not clear why Argentina (1960, 2008, and 2018), Bolivia (1950), Brazil (1995, 2006, and 2017), Chile (2017), Colombia (1960), and Paraguay (2008) are presented rather than other regions in other years. The same question applies to Figures 8, 9, and 10.
Citation: https://doi.org/10.5194/essd-2024-527-RC1
RC2: 'Comment on essd-2024-527', Anonymous Referee #2, 27 Mar 2025
The paper reconstructs the historical expansion of four major crops—soybean, maize, wheat, and rice—across South America at an annual time scale and high spatial resolution (1 km × 1 km). By integrating multiple data sources such as remote sensing, model-based reconstructions, and historical agricultural census data, the researchers aim to provide a comprehensive dataset that captures long-term trends in land use change. The study covers 13 South American countries and employs validation methods using existing datasets (FAO, GEOGLAM, SPAM, GLAD) and accuracy assessments at various administrative levels. The findings reveal a dramatic expansion of agricultural land, particularly for soybean and maize, mainly at the expense of natural vegetation. Soybean cultivation grew from almost zero in 1950 to 48.8 million hectares (Mha) in 2020, leading to the loss of 23.92 Mha of forests, pastures, and shrublands. Maize also saw significant growth, doubling from 12.7 Mha in 1950 to 26.9 Mha in 2020, with rapid acceleration after 2000. In contrast, wheat and rice areas remained relatively stable over the study period. The analysis of land use transitions shows that 24.49 Mha of forests and 13.82 Mha of pastures were converted into croplands, largely for soybean and maize production.
The dataset developed in this study is valuable for assessing the environmental impacts of agricultural expansion, such as deforestation, carbon emissions, and biodiversity loss. It also has critical implications for policymakers looking to balance food security and environmental conservation in South America. By providing a long-term, high-resolution record of crop-specific land transformation, this dataset enhances our understanding of human-environment interactions and supports global efforts in sustainable agriculture and climate change mitigation.
While this paper presents a significant contribution to historical land use mapping in South America, it has several notable weaknesses:
- The authors spend little effort on collecting and processing raw data sources. Instead, they rely heavily on statistical interpolation and integration of existing datasets. For a data product, the most important and most time-consuming task is to collect the original, raw data. For HISLAND-SA, this should be sub-national crop area (e.g., up to the 2nd administrative level) and production data for 1950–2020. Rather than making a substantial effort to assemble such a long time series (the data are currently mostly at the 1st administrative level, e.g., province), the study uses linear interpolation to fill gaps in crop-specific data, assuming constant trends between known data points. This approach can oversimplify non-linear trends in agricultural expansion, particularly in regions where crop cultivation was influenced by policy shifts, market dynamics, or environmental changes. In contrast, studies using machine learning or geostatistical modeling (e.g., the SPAM series, of which the authors used only SPAM2010) often produce more accurate reconstructions by focusing on the fundamental effort of collecting sub-national crop data and capturing complex relationships between variables.
- One of the great strengths of such long-term, high-resolution maps is the ability to compare crop area/production changes from year to year and to show crop switches and crop pattern changes at the spatially granular level of grid cells. Figure 2 (the flow chart of this study) shows the methodology, but I could hardly see how crop type transitions from year to year are handled, or how cropland intensity is made comparable from year to year. For example, if I compare the maize area in one grid cell between Year 1 and Year 2, is the difference the REAL change in maize area or simply an error from the modelling/allocation?
- Uncertainty and Validation Issues
While the study integrates multiple datasets and performs validation at different administrative levels, it lacks a comprehensive uncertainty analysis. Unlike datasets such as HYDE or MapBiomas, which provide detailed error estimates and confidence intervals for their reconstructions, this study does not explicitly quantify the uncertainties in its spatial allocation methods or crop-specific data modeling. Additionally, validation is largely dependent on comparisons with existing datasets, some of which have their own biases. A more robust ground-truth validation (e.g., field data or higher-resolution satellite imagery) would strengthen the dataset's reliability.
- Lack of Socioeconomic and Policy Considerations
Although the study acknowledges the role of economic and policy drivers (e.g., subsidies, trade policies, and neoliberal reforms), it does not quantitatively integrate these factors into the model. Other land-use datasets, such as those from GFSAD (Global Food Security-support Analysis Data) and EarthStat, incorporate economic and climate factors to model cropland changes more dynamically. Without this integration, the dataset may overestimate or underestimate cropland expansion in response to policy shifts and market fluctuations.
- Crop yield is not mapped. A critical component of such mapping is crop yield, which has great spatial heterogeneity and is much more critical for food security. Admittedly, mapping crop yields is more challenging, as cropping systems (e.g., rainfed vs. irrigated, smallholder vs. large estate farming) and management are far more difficult to map. Yet missing this critical component severely limits the value and usefulness of this product.
Citation: https://doi.org/10.5194/essd-2024-527-RC2
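The concern about linear gap-filling raised in the first point of this list can be illustrated with a small sketch. It is not the authors' procedure: the logistic "true" trajectory, the census years, and all numbers are invented for illustration, and the only library call used is NumPy's `np.interp`.

```python
import numpy as np

# Hypothetical "true" crop area (Mha) following a logistic expansion.
# The curve parameters are assumptions chosen only to mimic a rapid take-off.
years = np.arange(1950, 2021)
true_area = 50.0 / (1.0 + np.exp(-0.15 * (years - 1995)))

# Assume census observations exist only for a few sparse years.
census_years = np.array([1950, 1970, 1990, 2010, 2020])
census_area = true_area[np.isin(years, census_years)]

# Linear gap-filling between census years.
filled = np.interp(years, census_years, census_area)

# Difference between the gap-filled series and the assumed truth.
error = filled - true_area
worst = int(np.argmax(np.abs(error)))
print(f"largest gap-filling error: {error[worst]:+.2f} Mha in {years[worst]}")
```

In this toy case the interpolated series is exact at the census years but overshoots before the take-off and undershoots during it, which is the kind of bias the reviewer describes for non-linear expansion.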
RC3: 'Comment on essd-2024-527', Anonymous Referee #3, 21 Jun 2025
This study presents a long-term, high-resolution spatial dataset of four major crops across South America. The topic is timely, and the dataset has clear potential for impactful use in agricultural, environmental, and economic research. The manuscript is generally well-written and logically structured. However, significant methodological simplifications and a lack of uncertainty quantification weaken confidence in the reliability and robustness of the dataset. My concerns are detailed below.
1. Methodological Uncertainty in Reconstructing Historical Maps (Section 2.4.3)
This section is the methodological core of the dataset, reconstructing 70 years of crop-specific spatial maps. However, the approach introduces several sources of uncertainty that compromise the robustness of the dataset:
- Temporal Anchoring to 2020:
The spatial allocation relies heavily on crop distribution circa 2020. Although cropland density based on inventory is used to constrain the extent, this approach assumes that spatial distribution patterns have remained relatively stable over seven decades, which is unlikely. For example, Figure 12 shows clear cropland expansion in GLAD data from 2001 to 2020, whereas the developed maps reflect more intensification than expansion, an inconsistency that may misrepresent true land use change.
- Shared Temporal Trends Across Crops:
The temporal variation of crop-specific area is derived from ratios of cropland density between years. As a result, all four crops follow the same temporal trend within each pixel, which oversimplifies the complexity of crop dynamics and ignores crop substitution or rotation over time.
- Order of Allocation:
The order of crop allocation (soybean → maize → wheat → rice) could significantly affect the final spatial distribution. The rationale behind this sequence should be clearly justified, or alternative orders tested to assess sensitivity.
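To make the order-of-allocation concern concrete, the sketch below runs a toy sensitivity test. It assumes a simple greedy rule in which each crop takes what it demands from the cropland remaining in one pixel; this rule, the function name `allocate`, and the demand and capacity values are hypothetical and are not taken from the manuscript.

```python
from itertools import permutations


def allocate(order, demand, capacity):
    """Greedy sequential allocation: each crop, in turn, takes
    min(its demand, cropland still unallocated in the pixel)."""
    remaining = capacity
    result = {}
    for crop in order:
        take = min(demand[crop], remaining)
        result[crop] = take
        remaining -= take
    return result


# Hypothetical demands (ha) that together exceed the pixel's cropland.
demand = {"soybean": 40.0, "maize": 35.0, "wheat": 20.0, "rice": 15.0}
capacity = 100.0  # a 1 km x 1 km pixel is 100 ha

# Try every possible crop order and record the allocated area per crop.
results = {order: allocate(order, demand, capacity)
           for order in permutations(demand)}

for crop in demand:
    values = sorted({res[crop] for res in results.values()})
    print(f"{crop}: allocated area across orders = {values}")
```

Under these assumptions the crop placed last absorbs the whole shortfall, so the spread of allocated values across orders is a direct measure of how much the sequence matters in pixels where demand exceeds available cropland.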
Suggestions to Reduce Uncertainty:
- Incorporate higher-resolution statistical data (e.g., Admin 2 or subnational data) where available to improve spatial representativeness.
- Integrate additional spatial products across the time series (e.g., SPAM maps for 2000, 2005, 2010, and 2020) as anchor points or for calibration.
- Employ machine learning or statistical downscaling models (e.g., GAEZ crop suitability layers) to guide spatial allocation based on biophysical, socioeconomic, and historical drivers.
2. Crop-to-Land Use Transition Methodology
The paper does not clearly explain how changes in crop-specific areas are reconciled with changes in land use categories. Given the reliance on different products (e.g., HILDA+, inventory data), it is unclear:
- How were increases or decreases in crop area assigned to different land use classes?
- In cases where crop-specific changes exceed the corresponding land use change within a pixel, how was the conflict resolved?
- How was consistency maintained when both datasets carry uncertainties—particularly in earlier decades?
This aspect is critical to validate transitions over time and should be supported with additional evidence, such as inventories, case studies, or literature-based benchmarks.
3. Uncertainty Analysis is Essential
Given the simplified methodology and the integration of disparate datasets, a formal uncertainty analysis is essential to strengthen the reliability of the product. Discrepancies visible in Figure 6 and Table 4, as well as known limitations in source datasets (e.g., inventories), point to substantial uncertainty that needs to be acknowledged and quantified.
Consider approaches such as:
- Sensitivity analysis to test different assumptions (e.g., crop order, data source weights).
- Comparison against independent datasets or national statistics (where available).
- Monte Carlo simulations or bootstrapping to evaluate variability in key assumptions.
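A minimal Monte Carlo sketch of the kind suggested above follows. It assumes that a provincial inventory total carries a 10 % relative error and that pixel areas are obtained by multiplying the total by fixed pixel shares; both assumptions, and all numbers, are illustrative and not drawn from the manuscript.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical pixel shares of one province's crop area (sum to 1).
weights = np.array([0.50, 0.30, 0.15, 0.05])
reported_total = 1000.0   # ha, hypothetical inventory value
rel_sd = 0.10             # assumed relative standard deviation of the inventory

# Draw perturbed provincial totals and propagate them to the pixels.
n_draws = 5000
totals = reported_total * (1.0 + rel_sd * rng.standard_normal(n_draws))
totals = np.clip(totals, 0.0, None)          # areas cannot be negative
pixel_area = totals[:, None] * weights[None, :]

# Per-pixel 95 % interval from the simulated distribution.
lower, upper = np.percentile(pixel_area, [2.5, 97.5], axis=0)
for w, lo, hi in zip(weights, lower, upper):
    print(f"share {w:.2f}: {lo:.1f} to {hi:.1f} ha")
```

The same pattern extends to perturbing the pixel shares or the crop order, which would give per-pixel intervals that reflect allocation uncertainty as well as inventory uncertainty.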
4. Clarification on Presentation of Results
- Figure 6: Since spatial data were adjusted at the provincial level using inventory data (Eq. 2), comparisons shown are essentially against data already used for calibration. This limits the independence of the validation and should be acknowledged.
- Figure 11: Please clarify whether these 2020 maps are derived from existing products or developed as part of this study. If they are pre-existing, the comparisons do not reflect the added value of the developed dataset.
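The independence issue noted for Figure 6 can be demonstrated with a toy example. The sketch assumes the provincial adjustment is a proportional rescaling of pixel values so that each province sums to its inventory total; the manuscript's Eq. 2 may differ, and the random data below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_provinces, n_pixels = 20, 50

# Hypothetical provincial inventory totals (ha).
inventory = rng.uniform(100.0, 1000.0, n_provinces)

# An arbitrary, even random, spatial pattern within each province.
raw = rng.uniform(0.0, 1.0, (n_provinces, n_pixels))

# Assumed adjustment: rescale pixels so each province matches its inventory.
scaled = raw * (inventory / raw.sum(axis=1))[:, None]

# "Validation" of mapped provincial totals against the same inventory.
mapped_total = scaled.sum(axis=1)
r2 = np.corrcoef(inventory, mapped_total)[0, 1] ** 2
print(f"R^2 against calibration data: {r2:.6f}")
```

Because the rescaling forces agreement, the near-perfect fit holds regardless of how good the within-province pattern is, so such a comparison says nothing about spatial accuracy.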
Specific Comments
- Title: Consider specifying the focus on four major commodity crops for clarity.
- Line 33: Replace “cropland” with the names of the four crops to avoid confusion.
- Line 36: If “other vegetation” in Line 36 matches the scope in Line 34, merge or clarify the definitions.
- Lines 50–53: Specify whether this refers to global patterns or South America only.
- Figure 1: Recommend adding GADM Admin 1 boundaries for better spatial context.
- Lines 248–253 (Step 1): The interpolation process between missing years is unclear. While Equation 1 is mentioned, how is this different from linear interpolation? Clarify the assumptions behind using national trends versus pixel-level trends.
- Lines 260–269 (Step 2): Clarify how mismatches were handled when one product had spatial coverage but the other did not. How did interpolation behave near transition years (e.g., 1984, 2014)? Were there artificial spatial jumps in coverage? Given HILDA+ provides annual maps, why wasn’t it used for interpolation?
- Figure 2: Suggest moving this figure earlier (e.g., at the beginning of Section 2) to help readers follow the workflow.
- Line 309: How were the upward/downward trends and anomaly values identified? Over what period was the trend computed? Again, clarify the role of Equation 1 versus linear interpolation.
- Equation 3: The model does not appear to account for long-term productivity changes due to technological or genetic improvements. Consider integrating literature-based estimates or assumptions for these factors.
- Line 335: Define “top N grids”—how were they selected, and why?
- Section 2.4.3: Clarify how crop-specific harvested areas were adjusted when provincial totals and pixel-level cropland constraints conflicted. What happens if the sum of crop areas exceeds the available cropland in a pixel?
- Figure 3: Use a background color to distinguish zero-value grids more clearly.
- Figure 6: The slope values >1 suggest lower crop estimates in the developed dataset. Cross-validate these values as the discrepancies are significant.
- Figure 8: Explain how spatial proportions from census data were allocated to grid cells. If all grids within a municipal boundary received the same value, state this in the caption.
Citation: https://doi.org/10.5194/essd-2024-527-RC3
Data sets
HISLAND-SA: Annual and 1-km crop-specific gridded data in South America from 1950 to 2020
Binyuan Xu, Hanqin Tian, Shufen Pan, Xiaoyong Li, Ran Meng, Óscar Melo, Anne McDonald, María de los Ángeles Picone, Xiao-Peng Song, Edson Severnini, Katharine G. Young, and Feng Zhao
https://doi.org/10.5281/zenodo.14002960
Viewed
- HTML: 366
- PDF: 92
- XML: 19
- Total: 477
- Supplement: 26
- BibTeX: 14
- EndNote: 21