the Creative Commons Attribution 4.0 License.
Agricultural Land Management Practices in the Conterminous United States from 1980–2023
Abstract. Agricultural land management practices affect environmental variables including greenhouse gas emissions, carbon and nitrogen cycles, soil profiles, water quality, and air pollution. To better understand agricultural land management practices, we compiled a land management history for the conterminous United States (CONUS) from 1980–2023. We used the National Resources Inventory as the basis for our sample-based approach to impute planting and harvest dates, synthetic N fertilizer and manure N and C application rates and timing, tillage systems and intensity, and cover crop adoption. We aggregated the imputations to a 0.25-degree grid and compiled a comprehensive dataset detailing the management practices used on agricultural lands across CONUS. From 1980–2023, we found trends towards later planting dates for cotton and spring grains and trends towards earlier planting dates for soybeans. Synthetic N fertilizer rates increased steadily from 1980–2000 and then stabilized from 2000–2023, while manure N amendments were low between 1980 and 2000 and then increased rapidly from 2000–2023. Generally, there were increases in no-till and reduced-till systems across CONUS, with more notable increases in central and eastern regions. Cover crop adoption increased across CONUS from 1980–2023, with the highest level of adoption occurring in the Northeast. The results from this product align with previously published histories, although our work provides a more comprehensive representation of cropland management practices than previous works. Our dataset is available on Dryad under public domain license (Hoskovec et al., 2025) and can be used to inform studies of agricultural lands that evaluate processes such as greenhouse gas emissions, environmental impacts, and food production.
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-842', Yushu Xia, 16 Feb 2026
- RC2: 'Comment on essd-2025-842', Anonymous Referee #2, 08 May 2026
This paper describes a method that creates spatiotemporal datasets across CONUS (1980-2023) of agricultural land management practices including planting and harvest dates for major crops, nutrient management (fertilizer/manure application rates, timing), tillage, and cover crops. It uses a machine learning approach to impute estimates based primarily on point-based NRI surveys that are then adjusted based on state-level inventory datasets and then estimated spatially (wall-to-wall) across CONUS using weighted averaging techniques.
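The final gridding step summarized above can be sketched as a weighted average of point-level values within each 0.25-degree cell. This is only a minimal illustration, not the authors' implementation: the function names, the tuple layout, and the use of a per-point weight (for example, a survey expansion factor) are assumptions made here for the example.

```python
import math
from collections import defaultdict

CELL = 0.25  # grid resolution in degrees, as stated in the abstract


def cell_index(lat, lon, cell=CELL):
    """Return the (row, col) index of the grid cell containing a point."""
    return (math.floor(lat / cell), math.floor(lon / cell))


def grid_weighted_mean(points, cell=CELL):
    """Weighted mean of point values per grid cell.

    points: iterable of (lat, lon, value, weight) tuples.
    The weight is a hypothetical per-point weight (assumption).
    """
    num = defaultdict(float)  # sum of weight * value per cell
    den = defaultdict(float)  # sum of weights per cell
    for lat, lon, value, weight in points:
        key = cell_index(lat, lon, cell)
        num[key] += weight * value
        den[key] += weight
    # Cells with zero total weight are dropped rather than divided by zero.
    return {key: num[key] / den[key] for key in num if den[key] > 0}


# Hypothetical points: two fall in the same cell, one in the cell to the north.
example_points = [
    (41.10, -93.60, 100.0, 1.0),
    (41.20, -93.55, 200.0, 3.0),
    (41.30, -93.60, 50.0, 2.0),
]
gridded = grid_weighted_mean(example_points)
```

A sketch like this also makes the referee's concern concrete: any within-state variability in the gridded product can only come from the point values and weights that enter the average.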
The method is novel and the dataset created will be a useful resource that leverages the rich survey datasets, and the use of Gradient Boosted Modeling appears appropriate and well justified. The writing is high quality and the manuscript is well-organized. My main concerns are related to 1) the presentation of the methods and tracing the various datasets and their ultimate raw sources and derivatives and 2) the conceptual underpinning of the method, which does not seem to adequately leverage datasets that capture more detailed spatial variability (e.g., county-scale Ag Census data) and, thus, inherently smooths away a lot of spatial variability in an increasingly concentrated (livestock) and intensifying agricultural landscape. I recommend reconsideration after major revisions.
Regarding the first major concern, it would be really helpful to include a table of all of the major datasets used for each variable, what spatial and temporal scales they cover, etc. You rely on several foundational datasets that often are not adequately explained and you force the reader to search for the most salient points from the cited sources. Obviously, you don’t need to have an extensive description of each dataset (although the supplemental could be expanded) but describing the basic underpinning of each of the major ones used is essential. For instance, the EPA (2024) dataset is used throughout the method and the reader is not told that it relies on other datasets like the USDA-NASS annual livestock surveys to estimate things like manure N. As a data journal article, more time needs to be spent describing the underlying input datasets to the method.
For the second point, I am concerned about the utility of the spatial estimates and their inability to capture real and important spatial variability that is clearly shown by other datasets. For example, suppose a reader hypothesizes that concentrated animal feeding operations (CAFOs) (or at least counties within a state that have relatively high densities of livestock) play an outsized role in certain environmental outcomes. Would your dataset be adequate to assess this question when it relies so heavily on state-level livestock estimates? I strongly recommend that the authors consider incorporating county-scale (and finer, if available) datasets to better constrain the spatial estimates based on the GBMs. If they are unable to incorporate such datasets because that is too far out of scope, then at the very least they need to communicate the limitations of the produced dataset in terms of estimating within-state spatial variability.
I expand on the major comments below with reference to specific locations in the manuscript.
Specific comments:
Line 2: What is meant by “soil profiles”? This term is only used once here in the abstract. Please be more specific.
Lines 8-11: Several variables are described as increasing through time. Could you please put in percentages to provide more detail on how large the increases are?
Line 17: This list is much shorter than the one in the first sentence of the abstract. I would be consistent with the framing. It doesn’t have to be an exhaustive list but more than GHG/carbon here makes sense.
Line 24: Again, same comment as above. You had a much broader list of potential applications for this dataset but in this paragraph you are choosing to only focus on GHG/carbon. Why? This also happens again in Line 47.
Figures 1 and 2: These two figures should be re-created using publicly available shapefiles from the USDA (e.g., https://nrcs.maps.arcgis.com/home/item.html?id=58c18a7690fa4b2c86c5a9a069e0457b). I don’t know what the journal policy is on reproducing images from other sources but I think it should be avoided whenever possible.
Lines 66-77: How many points are included in the NRI sample? How much does it change through time? Is there a map that shows their distribution that could be included here or in the supplemental?
Line 102: One overarching question that comes up for me for each of the output variables from your method is why do you often aggregate spatially explicit data to the broader state-level and then use a different technique to distribute the data in a spatially explicit manner? Why not take advantage of the spatial variability present in some of the underlying datasets (e.g. OpTIS is available at the sub-state level, Ag Census has many variables available at the county level). I understand that your methodology, which relies heavily on the NRI survey sample points, would have to change substantially. But from a purely conceptual standpoint, why start with higher resolution data, upscale it, and then downscale it? Are there ways to incorporate county-scale datasets into your method?
Line 118: Based on this description of the use of the EPA (2024) dataset (i.e., manure N available to soils), does this mean that your dataset is estimating N “downstream” of the losses associated with collection and storage of manure? If so, what would you tell a researcher who is interested in determining the total impact of agricultural production on atmospheric nitrogen losses (ammonia and N2O) and wants to include the losses at CAFOs where collection and storage losses can be quite high? If your dataset has this limitation, then it needs to be directly communicated somewhere in the methods and discussion, especially because you state in the abstract that air pollution is a potential application.
Lines 124-125: Was there a reason that alfalfa was excluded from your analysis? Area harvested nationally for alfalfa is much larger compared to barley, oats, and sorghum combined.
Line 182: Could you please explain in more detail how manure N availability was estimated in EPA (2024)? I am roughly familiar with that dataset but looking at the link in the references does not provide any immediate indication of how manure N availability was determined. I would strongly recommend that at least a sentence or two be devoted to explaining how EPA (2024) came up with these estimates and what underlying datasets they relied upon (USDA-NASS survey livestock data).
Line 185: Just to note, this is where I really started to wonder why county-scale Ag Census data wasn’t considered at all (see major comment above).
Lines 186-195: Related to the major comment above about capturing spatial variability, at some point (either in methods or discussion) it would be very useful to have more discussion on the conceptual idea behind using quite spatially limited NRI point-scale survey data (and associated soil and geographic characteristics) along with GBMs to characterize the spatial variability of nutrient application rates and management. It’s just a fundamentally different way of understanding the drivers of spatial variability than I am used to. While far from perfect, county-scale datasets (Falcone, 2021; https://doi.org/10.3133/ofr20201153) at least give an indication of things like where livestock are concentrated within a state. This county-scale spatial variability could be a function of a lot of different things including historical roots of livestock production and recent development of livestock production infrastructure. Your conceptual model seems to suggest that spatial variability is solely a function of soil and geographic conditions that is mediated by a very limited spatial dataset of NRI point surveys. I don’t mean to dismiss this method out-of-hand but it just seems to have a lot of limitations that are not adequately explained.
Lines 210-213: Again, this is where I became increasingly confused as to why you would spend considerable effort to break apart estimates into each animal type at the state scale, while not incorporating animal type specific inventories at the county scale (Falcone, 2021).
Lines 226-227: What basis do you have for assuming that manure is only applied in the fall or the spring? Very often in livestock intensive places, manure is applied both in the fall and the spring (see this example from a survey of farmers in Wisconsin: https://dnr.wisconsin.gov/sites/default/files/topic/TMDLs/NEL/appendix_e_ag_survey.pdf).
Line 273: I think I know what you mean by “donor pools” but it is not immediately apparent. Please explain further.
Lines 289-292: Again, is there any way to incorporate county-scale estimates of cover crop adoption from Ag Census data? This seems to be a real missed opportunity.
Lines 330-331: Isn’t this just circular logic? You assumed that manure was only applied in the spring or the fall in the methods and now it is being described as a result?
Figures 6 and 7: I would strongly consider changing these line charts to stacked area charts. It would perhaps be harder to see trends for individual crop types but then the reader could easily visualize how the total is changing through time.
Line 391: I tried to track down the USDA-ERS (2020) reference from the link in the references but it only went to a report from 2000. Please update.
Lines 394-395: Is this finding surprising, though? If I understood the methods correctly, didn’t you adjust your manure rates based on EPA (2024) which also uses the USDA-NASS livestock numbers? This is where it occurred to me that there definitely needs to be more description of the different datasets used and what underlying datasets are used to inform them. I realize that it’s a lot to keep track of. But these interdependencies in the agricultural data ecosystem are incredibly important to understand and communicate.
Citation: https://doi.org/10.5194/essd-2025-842-RC2
Data sets
Agricultural land management activities for conterminous US from 1980-2023 Lauren Hoskovec et al. https://datadryad.org/share/LINK_NOT_FOR_PUBLICATION/WmO29tLOABEqXMRp0H-NZpr1ddUotryamH2b78bK42g
General Comments:
The manuscript “Agricultural Land Management Practices in the Conterminous United States from 1980–2023” presents a grid-based crop management dataset derived from a data fusion and modeling approach. The authors provide visualizations that illustrate spatial and temporal patterns across multiple management variables produced in this study. The manuscript also describes the novelty of the dataset and includes comparisons with existing products. Overall, this work represents a timely contribution with clear potential value for modelers and a broad range of stakeholders. I offer several minor suggestions below that may further improve the clarity and accessibility of the manuscript for a wider audience.
Specific Comments:
L2: The manuscript did mention applications for GHG and C accounting quite a few times, but there was not much mention of water quality or air pollution, despite their inclusion in the Abstract. I suggest that the authors mention these applications in both the Introduction and the Discussion sections. It may be particularly helpful to add a short paragraph that lists example use cases of this dataset, which would clarify its broader applicability and help a wider audience appreciate its full value.
L49-50. The historical agricultural management data is very important for model spinup and making historical model simulations for counterfactual analysis. Consider mentioning these applications in the Introduction and/or Discussion sections.
L57-60. The language in this section gives the impression that the work primarily focuses on gap filling. However, the study also includes evaluation results, as described in the Appendix, and quality control steps are probably included as well. I suggest revising the language to better reflect the full range of activities conducted in this study. It may also be beneficial to highlight elements of novelty here rather than waiting until the Discussion section.
L65-77. Based on my understanding, the original NRI data are not openly accessible to the broader research community. If this is correct, an important contribution of this work lies in making information derived from this valuable dataset available through a gridded product. I suggest emphasizing this aspect.
Figure 1. Is it necessary to include both Figures 1 and 2 in the main manuscript? They are not mentioned frequently in the manuscript, and the results are not often aggregated to these units. Consider moving to SI.
L85. For an audience less familiar with NRI and the various agricultural management data products, it might be helpful to mention whether the following data products are completely independent of NRI or partially overlap with it. There are many products discussed in this section, and a flowchart or a table (listing existing products and the novelty this study brings) might help illustrate the methodology.
L169-170. Later on, the manuscript also mentions manure animal type, which is important information and should probably be mentioned here as well. What about fertilizer types? Is it possible to include such information in the dataset as well, and if not, what is the main reason for the exclusion or the main challenge?
L175-177. With this calculation method, is it possible to encounter area double counting when both manure and synthetic fertilizers are applied? The ARMS data do include the percent area receiving both (even though it’s a small value).
L183. I assume the N availability data are assigned by animal type and manure form for the subsequent calculation?
L200. Would it be accurate to say that the dataset of this study is more focused on crops, and therefore is not suited for the modeling of pastureland or vegetables?
L220-221. The availability of crop-specific data from ARMS is highly dependent upon the time period. Is it possible to show some statistics about how much gap filling needs to be done for each time period? The gap-filling aspect can also influence conclusions drawn on temporal trends.
Table 1. Is it possible to connect these levels with typical tillage practice names or expand the “Example” column for the broader audience who is interested in using this dataset?
L283-292. Would something like CDL be helpful since crop types are identified in time? Or is CDL already embedded in one of the source datasets here? If not, I wonder if that data could be used for some sort of validation.
L294. Explain why 0.25 degree is selected as the final grid size.
L296-298. It is not fully clear what the six imputations correspond to – is it for calculating the quantiles mentioned in the next paragraph?
L305. Explain more about the “data suppression” procedure and whether there are follow-up interpolations.
L306. Point out the corresponding “Appendix” for the Method section, and consider adding some language related to validation and evaluation, as some of the results in the Appendix seem quite important.
Figure 3. The right edge of the (f) figure was cut off.
Figure 8. Could this temporal trend be explained?
Figure 10. Could there be a bit more explanation in the caption to make it stand alone? For example, consider mentioning that the tillage intensity increases from A to K.
Figure 11. Consider modifying the color scheme or value range associated with the legend – the distinction of color in these four figures is pretty minor.
L374-377. It would be great if the authors could also mention main differences in methodology in addition to the similarity. Otherwise, it would be expected that the results are similar given the underlying datasets and methodology.
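The double-counting question raised for L175-177 can be made concrete with inclusion-exclusion arithmetic. The sketch below is illustrative only: the function name is hypothetical and the percentages are made-up numbers, not ARMS values. It splits the treated area into mutually exclusive shares so that area receiving both manure and synthetic N is counted once.

```python
def area_shares(pct_synthetic, pct_manure, pct_both):
    """Split planted area into mutually exclusive shares (percent of area).

    pct_synthetic: percent of area receiving synthetic N (with or without manure)
    pct_manure:    percent of area receiving manure N (with or without synthetic N)
    pct_both:      percent of area receiving both
    """
    if pct_both > min(pct_synthetic, pct_manure):
        raise ValueError("pct_both cannot exceed either individual share")
    synthetic_only = pct_synthetic - pct_both
    manure_only = pct_manure - pct_both
    any_n = synthetic_only + manure_only + pct_both  # inclusion-exclusion
    if any_n > 100:
        raise ValueError("shares imply more than 100% of the area")
    return {
        "synthetic_only": synthetic_only,
        "manure_only": manure_only,
        "both": pct_both,
        "any_n": any_n,
        "none": 100 - any_n,
    }


# Hypothetical inputs: summing 90 + 15 naively gives 105% of the area.
shares = area_shares(pct_synthetic=90.0, pct_manure=15.0, pct_both=10.0)
```

With these made-up inputs, the naive sum exceeds the planted area, while the mutually exclusive shares add up to exactly 100 percent.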