Early-season crop identification is of great importance
for monitoring crop growth and predicting yield for decision makers and
private sectors. As one of the largest producers of winter wheat worldwide,
China accounts for more than 18 % of the global production of winter wheat.
However, there are no distribution maps of winter wheat over a large spatial
extent with high spatial resolution. In this study, we applied a
phenology-based approach to distinguish winter wheat from other crops by
comparing the similarity of the seasonal changes of satellite-based
vegetation index over all croplands with a standard seasonal change derived
from known winter wheat fields. In particular, this study examined the
potential of early-season large-area mapping of winter wheat and developed
accurate winter wheat maps with 30 m spatial resolution for 3 years
(2016–2018) over 11 provinces, which produce more than 98 % of the
winter wheat in China. A comprehensive assessment based on survey samples
revealed producer's and user's accuracies higher than 89.30 % and 90.59 %,
respectively. The estimated winter wheat area exhibited good correlations
with the agricultural statistical area data at the municipal and county
levels. In addition, winter wheat could first be identified reliably by the
end of March, giving a lead time of approximately 3 months before harvest,
and the optimal identification time was the end of April, with an overall
accuracy of 89.88 %. These results are expected to aid in the timely
monitoring of crop growth. The 30 m winter wheat maps in China are available
via an open-data repository (DOI:
Wheat is one of the most important cereal crops in the world
(FAOSTAT, 2018; Guo et al.,
2019). According to the statistics provided by the Food and Agriculture
Organization (FAO), the harvested area of wheat reached 215
Satellite-based methods are an effective and quick tool for crop mapping owing to their great spatial coverage and temporal continuity (Belgiu and Csillik, 2017; Griffiths et al., 2019; Jin et al., 2019). Most studies have used supervised classification methods, such as decision tree classification (Brown and Pervez, 2014; Wardlow and Egbert, 2008), and supervised machine learning methods (Yang et al., 2019), such as random forests (Wang et al., 2019; Yin et al., 2020), support vector machines (Zheng et al., 2015), and neural networks (Cai et al., 2018; Zhong et al., 2019), to distinguish crop types. However, these methods strongly depend on the selection of the training samples, which is time-consuming and labor-intensive (Skakun et al., 2017b). For instance, the 30 m resolution Cropland Data Layer (CDL) product generated by the USDA (United States Department of Agriculture) National Agricultural Statistics Service (NASS) classified more than 100 types of crops grown in the United States using the decision tree classification method (Boryan et al., 2011). The CDL product uses a large volume of USDA Common Land Unit (CLU) data as training samples, which are renewed every year. In Nebraska alone, more than 250 000 CLU polygon records were used to train and validate the CDL product (Boryan et al., 2011). Such large volumes of CLU data can only be acquired with government support and are usually confidential (Boryan et al., 2011). Therefore, the accuracy of national and subnational crop classification products based on supervised classification algorithms is limited because of the lack of training datasets (Petitjean et al., 2012).
As an alternative approach, several studies have used phenological characteristics as a metric for identifying geographic locations of winter wheat (Qiu et al., 2017; Skakun et al., 2017b; Wardlow et al., 2007). The common method differentiates winter wheat from other crops based on the differences in key phenological phases (e.g., tillering, heading, and harvesting) in combination with spectral signatures (Pan et al., 2012; Skakun et al., 2017a). Some studies integrate accumulated growing degree days (GDD) to reduce the phenological variability caused by different climatic conditions (Franch et al., 2015; Skakun et al., 2017b; Zhong et al., 2014). Other methods, such as dynamic time warping (DTW), have proven effective for mapping crop distribution, e.g., for identifying rice paddy fields (Guan et al., 2016) and classifying vegetable types (Li and Bijker, 2019). DTW was initially designed for speech recognition (Sakoe and Chiba, 1978). Maus et al. (2016) proposed a time-weighted version of the DTW method, namely time-weighted dynamic time warping (TWDTW), which accounts for seasonality in crop types, thus further improving the classification accuracy. Unlike supervised classification methods, these methods require very low volumes of training data, thus substantially reducing the need for field surveys (Belgiu and Csillik, 2017).
China produces approximately one-sixth of the world's wheat in one-tenth of the world's wheatland (FAOSTAT, 2018), with winter wheat constituting 95 % of the total wheat production in China (National Bureau of Statistics of China, 2018). Numerous studies have mapped winter wheat cultivation at county (Pan et al., 2012), province (He et al., 2019), and regional scales (Wu et al., 2007). Significant efforts have also been made to generate planting area maps of winter wheat over large regions of China. Based on MODIS (Moderate Resolution Imaging Spectroradiometer) surface reflectance products, Qiu et al. (2017) used the differences in the enhanced vegetation index before and after heading dates to develop two indicators to map winter wheat in the major winter wheat producing regions of China. A recent study generated a 30 m resolution distribution map of winter crops, rather than of winter wheat specifically, over the main producing areas in China using the decision tree classification method (Tian et al., 2019). However, several limitations in existing winter wheat maps remain. First, previous studies showed that the MODIS dataset failed to identify the planting areas of winter wheat because of the relatively low spatial resolution (Tian et al., 2019). In China, because of the large population and the implementation of the household responsibility system, farmers have the freedom to select the type of crop they wish to plant. The planting area per household is only 1.37 ha on average (Guo, 2008), which accounts for approximately 5 % of a 500 m MODIS pixel. Therefore, identification methods based on low-spatial-resolution data (e.g., MODIS dataset) will result in large misclassifications (Qiu et al., 2017). Second, identifications based on high-spatial-resolution satellite datasets still show large uncertainty in several regions. For example, based on the Landsat 7, Landsat 8, and Sentinel-2 images with a spatial resolution of 30 m, Tian et al. (2019) found a relative error greater than 50 % in identifying the planting areas compared to statistical data for Hubei and Shanxi provinces.
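The mixed-pixel argument above rests on simple area arithmetic, which can be checked with a short sketch. The function name is hypothetical; the only inputs are the 1.37 ha average household plot and the nominal pixel sizes quoted in the text.

```python
def plot_fraction_of_pixel(plot_area_ha, pixel_size_m):
    """Return the plot area expressed as a multiple of one square pixel.

    One hectare is 10 000 m2, so a 500 m pixel covers 25 ha and a
    30 m pixel covers 0.09 ha.
    """
    pixel_area_ha = pixel_size_m ** 2 / 10_000.0
    return plot_area_ha / pixel_area_ha


# Average household plot relative to a 500 m MODIS pixel (about 5 %)
modis_fraction = plot_fraction_of_pixel(1.37, 500)

# The same plot expressed in 30 m pixels (about 15 pixels)
pixels_at_30m = plot_fraction_of_pixel(1.37, 30)
```

At 500 m, an average plot fills only a small part of one pixel, whereas at 30 m it spans roughly 15 pixels, which is why the finer resolution reduces mixed-pixel effects.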
In particular, identifying the geographic location and areas of winter wheat as early as possible is important for monitoring crop growth, simulating crop water use, and meeting the timeliness requirement of yield predictions (Chipanshi et al., 2015; Q. Song et al., 2017). Under climate change, the frequencies of extreme weather events and natural disasters are expected to increase (Trenberth et al., 2014; Zambrano et al., 2018). Therefore, early mapping of crop distribution is urgently needed for policymakers to reduce economic loss and assess food security (Inglada et al., 2016). Identifying the crop distribution early in the season is more challenging than doing so at the end of the growing season because of the limited input information.
In this research, we used a phenology-based method to identify the geographic locations of winter wheat in China and produced a 30 m resolution winter wheat map for the period of 2016–2018. Moreover, we explored the potential for early-season mapping of the planting areas of winter wheat and determined the earliest identifiable time and optimal identifiable time. The identification accuracy was assessed based on field surveys, visual interpretation results of very high-spatial-resolution images, and agricultural statistical data. The proposed method can generate winter wheat maps that can be updated annually, providing a useful tool for crop management and policymaking.
This study identified planting areas of winter wheat for the period of
2016–2018 in 11 provinces covering an area of 390
The study area spans 11 provinces in China (the region
covered by oblique lines). The solid black lines represent the boundaries of
the provinces. The black dots indicate survey sites obtained from Google
Earth; the red triangles indicate field survey sites; and each site covers 1 km
The methodological workflow consists of the following steps: (1) image preprocessing to construct monthly maximum composite NDVI (normalized difference vegetation index) images and extraction of cropland based on the FROM-GLC (Finer Resolution Observation and Monitoring of Global Land Cover) product (see Sect. 2.3 for more details); (2) data processing, which produces the standard seasonal change of NDVI for winter wheat for each province based on the winter wheat samples; (3) winter wheat identification, where TWDTW is used to measure the similarity between the seasonal changes of NDVI for known winter wheat fields and those for investigated fields, and area statistical data at the province level are used to determine the thresholds of the similarity measurements; and (4) evaluation for assessing the classification accuracies (Fig. 2).
Flowchart of the proposed methodology for winter wheat classification.
In this study, we used the time-weighted dynamic time warping (TWDTW) method to identify the planting locations and areas of winter wheat. The TWDTW is an improved version of the DTW algorithm (Petitjean et al., 2012; Sakoe and Chiba, 1978). In the DTW algorithm, the distance (i.e., cost) (Fig. 3a) between two time series, namely series X of known winter wheat fields and series Y of unknown land cover, is calculated by warping the series Y via stretching or shortening the time dimension (Fig. 3b and c), in order to find the optimal warping path, i.e., the alignment with the minimum cumulative distance between the two series. Compared to other similarity-based methods, such as Euclidean distance, the DTW is more advantageous in that it can flexibly deal with the temporal distortions associated with seasonal change, such as amplitude, time scaling, or shifting (Lhermitte et al., 2011). Taking the seasonal change in land cover types into consideration, Maus et al. (2016) added a time constraint to the DTW (i.e., TWDTW) to balance shape matching and phenological change, thus further increasing identification reliability compared with the DTW method.
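The warping idea described above can be illustrated with a minimal sketch of a time-weighted DTW distance. This is not the implementation used in the study: the function names, the logistic form of the time penalty, and the `steepness` and `midpoint` values are illustrative assumptions in the spirit of Maus et al. (2016). Time stamps are given as days since the start of the season so that the October-to-June window does not wrap around the calendar year.

```python
import math


def logistic_time_weight(day_gap, steepness=0.1, midpoint=50.0):
    """Penalty that grows with the time gap between two matched dates.

    The steepness and midpoint (in days) are assumed, illustrative values.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (day_gap - midpoint)))


def twdtw_distance(pattern, pattern_days, series, series_days,
                   steepness=0.1, midpoint=50.0):
    """Cumulative cost of the cheapest warping path between two NDVI series.

    pattern / series: NDVI values; *_days: days since the season start.
    A lower value means the series is more similar to the pattern.
    """
    n, m = len(pattern), len(series)
    inf = float("inf")
    # acc[i][j] = minimum cumulative cost of aligning the first i pattern
    # points with the first j series points
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            gap = abs(pattern_days[i - 1] - series_days[j - 1])
            local_cost = (abs(pattern[i - 1] - series[j - 1])
                          + logistic_time_weight(gap, steepness, midpoint))
            acc[i][j] = local_cost + min(acc[i - 1][j],      # stretch series
                                         acc[i][j - 1],      # shorten series
                                         acc[i - 1][j - 1])  # one-to-one match
    return acc[n][m]


days = [0, 30, 60, 90, 120, 150]
wheat_like = [0.3, 0.5, 0.4, 0.6, 0.8, 0.4]
flat = [0.3] * 6

d_same = twdtw_distance(wheat_like, days, wheat_like, days)
d_flat = twdtw_distance(wheat_like, days, flat, days)
```

Because the time penalty increases with the date gap, matches between observations far apart in the season are discouraged, which is the balance between shape matching and phenological timing described above.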
The TWDTW method first requires a standard seasonal change curve
of NDVI for winter wheat, retrieved from known winter wheat fields
(Fig. 4). Taking each province as a unit, the dissimilarity values
can then be calculated by comparing the seasonal change in NDVI of each
investigated pixel with the standard seasonal curve of winter wheat in a
given province. The pixels with low dissimilarity values have a higher
probability of being winter wheat. In this research, we employed the area
statistical data of winter wheat at the province level to determine the
thresholds of dissimilarity. The pixels (
Seasonal changes of NDVI for winter wheat over 11 provinces in the study area.
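The text states that provincial area statistics determine the dissimilarity thresholds but does not spell out the rule in the passage available here. The sketch below shows one plausible reading, stated as an assumption: rank cropland pixels by dissimilarity and take as the threshold the value at which the cumulative pixel area matches the statistical area. The function name and the area-matching rule are hypothetical; the 0.09 ha pixel area follows from the 30 m resolution.

```python
def area_matched_threshold(dissimilarities, statistical_area_ha,
                           pixel_area_ha=0.09):
    """Pick a dissimilarity threshold so the mapped area matches statistics.

    ASSUMPTION: the threshold is the dissimilarity of the k-th most similar
    pixel, where k pixels together cover the reported provincial area.
    Returns None when the reported area is smaller than half a pixel.
    """
    target_pixels = int(round(statistical_area_ha / pixel_area_ha))
    if target_pixels <= 0:
        return None
    ranked = sorted(dissimilarities)
    target_pixels = min(target_pixels, len(ranked))
    return ranked[target_pixels - 1]


# Four cropland pixels; the statistics report 0.18 ha, i.e. two 30 m pixels
scores = [0.5, 0.1, 0.9, 0.3]
threshold = area_matched_threshold(scores, statistical_area_ha=0.18)
is_winter_wheat = [score <= threshold for score in scores]
```

Under this reading the threshold is specific to each province, because both the dissimilarity distribution and the reported area differ by province.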
This study used satellite-based NDVI extracted from Sentinel-2 and Landsat composite imagery to indicate the seasonal change in the vegetation. The standard seasonal curve of winter wheat was generated by averaging the NDVI of 20 % of the winter wheat pixels randomly selected from field surveys in each province (see Sect. 2.3). The winter wheat over all the 11 provinces has similar seasonal changes (Fig. 4). Generally, winter wheat reaches its period of maximum growth between March and June and is harvested between May and June. We assumed that the seasonal change of winter wheat for each province does not vary from year to year. We used the standard seasonal curves derived from NDVI measurements taken in 2018 to identify the planting area of winter wheat for the period of 2016–2018, which also allowed us to examine the applicability of the method to other years.
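The construction of the standard seasonal curve can be sketched as a random 20 % draw followed by a per-month average. The function names and the fixed random seed are assumptions for illustration, and the sketch assumes every pixel curve has a value for every month (gaps from cloud cover are not handled).

```python
import random


def split_reference_pixels(pixel_curves, reference_share=0.2, seed=0):
    """Randomly split sample pixels into reference and validation sets.

    pixel_curves: list of monthly NDVI lists, one per winter wheat pixel.
    The seed is an assumed value used only to make the draw repeatable.
    """
    rng = random.Random(seed)
    indices = list(range(len(pixel_curves)))
    rng.shuffle(indices)
    n_reference = max(1, round(reference_share * len(indices)))
    reference = [pixel_curves[i] for i in indices[:n_reference]]
    validation = [pixel_curves[i] for i in indices[n_reference:]]
    return reference, validation


def mean_curve(curves):
    """Average the curves month by month to obtain one standard curve."""
    n_months = len(curves[0])
    return [sum(curve[k] for curve in curves) / len(curves)
            for k in range(n_months)]


# Ten hypothetical pixels with three monthly NDVI values each
samples = [[0.20 + 0.01 * i, 0.50, 0.80] for i in range(10)]
reference, validation = split_reference_pixels(samples)
standard_curve = mean_curve(reference)
```

The remaining 80 % of the pixels stay available for validation, consistent with the split used in the accuracy assessment.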
To determine the earliest identifiable time, we employed incremental time windows, setting 1 October of the previous year as the start and extending the window in increments of 1 month until the following June, in order to compare seasonal changes of different lengths. In other words, we started to identify the planting areas from the previous October, and in each subsequent month a new monthly image was added to form a longer time series and generate a new identification. The influence of seasonal change length on identification accuracy was assessed by comparing the classification accuracies of these identifications.
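The incremental windows described above can be enumerated with a short sketch. The month labels and function names are illustrative; each window starts in October and ends one month later than the previous one, up to June.

```python
SEASON_MONTHS = ("Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun")


def incremental_windows(months=SEASON_MONTHS):
    """Return windows Oct, Oct-Nov, ..., Oct-Jun as lists of month labels."""
    return [list(months[:k]) for k in range(1, len(months) + 1)]


def truncate_curve(curve, window):
    """Keep only the part of a monthly NDVI curve covered by the window."""
    return curve[:len(window)]


windows = incremental_windows()
october_to_march = windows[5]
```

Running the identification once per window, with both the standard curve and the pixel series truncated to that window, yields one accuracy estimate per month.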
Three winter crops are grown in the study area: winter
wheat, winter rapeseed, and winter garlic. The first two crops constitute 91 %
and 8 % of the planting area of winter crops, respectively
(National Bureau of Statistics of China, 2018). Winter rapeseed may therefore
affect the identification of winter wheat. Relying solely on optical imagery
to discriminate them would be a challenge because of their similar spectral
characteristics and phenological stages (Veloso
et al., 2017). Widely planted in HuB, winter rapeseed covers an
area of 0.97
Fortunately, the difference in the plant structure between winter wheat and
winter rapeseed makes it possible to differentiate them based on radar data
(Veloso et al., 2017). Therefore, we used radar
data to exclude the interference from winter rapeseed in this study. By
investigating the survey samples in HuB, we found that the VH (vertical transmit and horizontal receive)
backscatter values in April are a good indicator to differentiate winter
wheat from winter rapeseed. The VH backscatter values in April for winter
wheat were lower than
The seasonal change in monthly maximum composite NDVI
The identification accuracy of winter wheat was evaluated based on two methods: (1) validation using the ground truth samples at the field level, including ground surveys and visual interpretation of very high-resolution images from Google Earth, and (2) comparisons with agricultural statistical data at administrative units. Eighty percent of the winter wheat samples and all non-winter wheat samples were selected to obtain the confusion matrix of the winter wheat map for each province (see Sect. 3 for more details). The overall accuracy (OA) was measured to investigate the overall effectiveness of the method. The producer's accuracy (PA) shows the proportion of ground truth samples properly judged as the target class, and the user's accuracy (UA) shows the proportion of samples judged as the target class on the classification map that are actually present on the ground. In addition, the planting area of winter wheat identified in this study was compared with that obtained from agricultural statistical data at the county and municipal levels through Pearson's correlation coefficient. Other statistical indicators, including the mean absolute error (MAE) and the root mean square error (RMSE), were also used to evaluate the performance.
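The accuracy measures named above follow standard definitions and can be sketched for the two-class case (winter wheat versus non-winter wheat). The function names and the example counts are hypothetical and are not results from this study.

```python
import math


def binary_accuracy(tp, fp, fn, tn):
    """OA, PA, and UA of the winter wheat class from confusion-matrix counts.

    tp: wheat samples mapped as wheat; fn: wheat samples mapped as non-wheat;
    fp: non-wheat samples mapped as wheat; tn: non-wheat mapped as non-wheat.
    """
    total = tp + fp + fn + tn
    return {
        "OA": (tp + tn) / total,  # share of all samples mapped correctly
        "PA": tp / (tp + fn),     # share of reference wheat that was detected
        "UA": tp / (tp + fp),     # share of mapped wheat that is truly wheat
    }


def area_agreement(estimated, reported):
    """Pearson correlation, MAE, and RMSE between two lists of areas."""
    n = len(estimated)
    mean_est = sum(estimated) / n
    mean_rep = sum(reported) / n
    cov = sum((e - mean_est) * (s - mean_rep)
              for e, s in zip(estimated, reported))
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    var_rep = sum((s - mean_rep) ** 2 for s in reported)
    return {
        "pearson_r": cov / math.sqrt(var_est * var_rep),
        "MAE": sum(abs(e - s) for e, s in zip(estimated, reported)) / n,
        "RMSE": math.sqrt(sum((e - s) ** 2
                              for e, s in zip(estimated, reported)) / n),
    }


metrics = binary_accuracy(tp=90, fp=10, fn=10, tn=90)
agreement = area_agreement([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A high correlation can coexist with a large MAE or RMSE, as in the toy example where the estimates are exactly half the reported areas, which is why the error measures are reported alongside the correlation coefficient.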
The methodology in this study mainly relied on the similarity measurement
between the NDVI seasonal change in an investigated pixel and a known
seasonal change of winter wheat. Two data sources were used to
calculate the NDVI: the Landsat 7 and Landsat 8 satellites and the Sentinel-2
constellation. The NDVI was derived from the Landsat Surface Reflectance (SR) products
produced by the United States Geological Survey (USGS), which have been
processed for atmospheric corrections. The quality bands provided by the SR
products were used to remove pixels contaminated by clouds. The study also
used the NDVI obtained from the MultiSpectral Instrument (MSI) sensor
on board Sentinel-2. The SR products generated from Level-2A products by
running Sen2Cor provided by ESA (
Times of good observations in the study area obtained from
monthly maximum NDVI composite images between 1 October 2017 and 31 July 2018. The right column shows the frequency of the times of good observations
during the period of 2016–2018 from October to the following July. Provincial administrative boundary data and global country
administrative boundary data are sourced from
To differentiate winter wheat from other winter crops (i.e., winter
rapeseed), this study used the synthetic aperture radar (SAR) (i.e., Ground
Range Detected, Level-1; GRD) product from Sentinel-1. The product is
dual-polarized, with VV (vertical transmit and vertical
receive) and VH (vertical transmit and horizontal receive) bands. We processed
each image and acquired the backscatter coefficient (
In this study, the VH and NDVI data were each composited into monthly maximum images for the period between 1 October 2015 and 31 July 2018 on the Google Earth Engine (GEE) platform. The operations were run on GEE pixel by pixel: within each month, we collected the NDVI values of all available clean observations of a pixel and took the maximum as the monthly composite value. The pixels of the monthly composite imagery therefore had the highest available quality and represented the whole month, although a small number of pixels had no values. This is because images from Landsat 7, Landsat 8, and Sentinel-2 contained pixels of bad quality owing to clouds, cloud shadows, and/or no data acquisition (e.g., failure of Landsat 7) (Fig. 6).
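The per-pixel compositing logic described above can be sketched in plain Python, independent of the GEE platform on which it was actually run. The function names and the `(date, value, is_clear)` tuple layout are assumptions for illustration; the NDVI formula is the standard normalized difference of near-infrared and red reflectance.

```python
from collections import defaultdict
from datetime import date


def ndvi(nir, red):
    """Normalized difference vegetation index from surface reflectance."""
    return (nir - red) / (nir + red)


def monthly_max_composite(observations):
    """Monthly maximum NDVI for one pixel, using clear observations only.

    observations: iterable of (date, ndvi_value, is_clear) tuples.
    Months without any clear observation are absent from the result,
    which corresponds to composite pixels that have no value.
    """
    clear_by_month = defaultdict(list)
    for day, value, is_clear in observations:
        if is_clear:
            clear_by_month[(day.year, day.month)].append(value)
    return {month: max(values) for month, values in clear_by_month.items()}


# Hypothetical observations of a single pixel
pixel_observations = [
    (date(2018, 4, 3), 0.61, True),
    (date(2018, 4, 19), 0.74, True),
    (date(2018, 4, 27), 0.90, False),  # flagged as cloudy, so ignored
    (date(2018, 5, 2), 0.55, False),   # only observation in May, cloudy
]
composite = monthly_max_composite(pixel_observations)
```

Taking the maximum of the clear observations favors the least contaminated value, since residual cloud and shadow generally lower NDVI.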
To obtain the standard seasonal change curve of winter wheat and validate
how the proposed method performs, we collected survey samples from the
following three sources. First, 38 sites (red triangles in Fig. 1) were investigated through field surveys during 2018 in the six provinces
(i.e., SD, HN, HB, JS, SAX, and HuB)
(Tian et al., 2019) (Fig. 1). Each field site
covered 1 km
The total number of samples of different types for each province during 2018.
In this study, the Finer Resolution Observation and Monitoring of Global Land
Cover (FROM-GLC) product with 30 m resolution was used to extract cropland
locations. The product can be downloaded via
To examine the potential for early-season identification of winter wheat and explore how early we could produce the distribution maps before the harvest, we investigated the method with shorter time windows and assessed its performance based on all the survey samples collected, which correspond to 33 776 pixels in total. We compared the producer's accuracy (PA), user's accuracy (UA), and overall accuracy (OA) for different seasonal change lengths starting from October, with monthly increments thereafter (Fig. 7). The identification accuracy increases with seasonal change length until March, when the overall accuracy reaches 87.3 %. From April onward, the identification accuracy saturates, with an overall accuracy close to the maximum of 89.88 %. This indicates that the method can identify the planting area of winter wheat 3 months before harvest (i.e., in March), with performance stabilizing in April.
Evolution of producer's accuracy (PA), user's accuracy (UA), and overall accuracy (OA) with monthly increments. PA of non-winter wheat and PA of winter wheat represent the probabilities that the ground truth reference data of non-winter wheat and wheat class are correctly classified, respectively. UA of non-winter wheat and UA of winter wheat indicate the ratio of the total quantity of pixels correctly classified into the objective class (i.e., non-winter wheat and winter wheat) to the total quantity of pixels classified into the objective class using the proposed method.
We used the time window from October to April to compare the similarity between the seasonal change of investigated fields and that of known winter wheat fields, and thus produced winter wheat distribution maps (Fig. 8). Our method shows good performance in identifying the planting areas of winter wheat over all the 11 provinces. Based on winter wheat and non-winter wheat survey samples, the overall identification accuracy varies among the 11 provinces, ranging from 84.97 % to 95.85 % (Table 2). The user's accuracy (UA) and producer's accuracy (PA) are high in most provinces. For SC and GS, the same approach produced the lowest PA of winter wheat, 72.78 % and 73.08 %, respectively (Table 2).
Final winter wheat identification map of China in 2018.
Panels 1–6 on the right and bottom are the zoomed-in maps, indicating
the local details in the different provinces and regions, including SD, HN,
AH and JS, HuB, central and western regions of China, and XJ, respectively.
Provincial administrative boundary data and global country administrative
boundary data come from the Resource and Environment Science and Data Center cloud platform
(
Confusion matrix for the identification map of planting areas of winter wheat in 11 provinces during 2018.
In addition, this method accurately estimates the areas of winter wheat
compared to the available agricultural statistical data at the municipal and
county levels (Fig. 9). The correlation coefficient (
Comparison between the estimated planting area of winter
wheat and agricultural statistical area at the municipal
Finally, we examined whether the standard seasonal change of NDVI acquired
in a single year can be applied to other years (i.e., 2016 and 2017). We
used the same seasonal change of NDVI of winter wheat for each province,
derived from field samples obtained in 2018, and computed its dissimilarity
with the seasonal change of unknown fields for 2016–2017. We
then compared the estimated winter wheat areas with agricultural statistical
area for the 2 years (Fig. 10).
Comparison between the estimated and statistical winter
wheat area at the municipal
Winter wheat is one of the most important crops in the world, and information on its spatial extent is critical for making economic and grain subsidy policies (FAOSTAT, 2018). To our knowledge, there are currently no distribution maps for winter wheat over China on a large scale with a spatial resolution of 30 m. Previous studies have made efforts to generate the distribution map of winter wheat over the major producing areas in China based on moderate-spatial-resolution satellite data (i.e., MODIS) (Qiu et al., 2017). However, owing to small plot sizes for crops, the distribution map with moderate resolution may lead to large uncertainties because of mixed pixels, further restricting the classification accuracy (Tian et al., 2019). Machine learning methods, such as random forests and support vector machines, have been proven to be effective in identifying the spatial distribution of various crops (Cai et al., 2018; Liu et al., 2018); these methods, however, strongly depend on the number of training samples, thus restricting the large-area crop mapping because of the lack of data (Belgiu and Csillik, 2017; Millard and Richardson, 2015; Valero et al., 2016).
In this study, we generated winter wheat distribution maps with a spatial resolution of 30 m for the period of 2016–2018 based on the TWDTW method using Landsat- and Sentinel-derived monthly maximum composite NDVI. The results obtained based on field surveys and statistical data indicate that the proposed method can accurately identify the winter wheat planting areas over all the 11 provinces. Compared to machine learning methods, our method performs well even with only a few training samples, which is a significant advantage for large-scale crop identification given the lack of survey samples available (King et al., 2017). In addition, the performance remains good even when the same standard seasonal change of winter wheat for each province is applied to years when ground surveys are lacking (Fig. 10). Therefore, the proposed method can identify winter wheat quickly with a few training samples and can be extended to years when training samples are scarce (Maus et al., 2016). Recent research suggested that the TWDTW method is more robust than other identification techniques, such as random forests, when there are only a small number of training samples (Belgiu and Csillik, 2017).
More importantly, this method can identify planting areas of winter wheat 3 months before harvesting (i.e., in March) and can achieve stable performance in April, which is significant for early and continuous winter wheat production predictions (Franch et al., 2015; McNairn et al., 2014). Understanding where crops are distributed, especially early in the season, is a top priority in predicting total production and monitoring trends in production (Shao et al., 2015; Skakun et al., 2017b). Existing agricultural estimates of crop area or maps of crop distribution are usually available at the end of the season or after crop harvest (Boryan et al., 2011; Zhong et al., 2019), and the limited input information makes early identification of winter wheat distribution a challenge (Kontgis et al., 2015; X.-P. Song et al., 2017). For example, machine learning methods strongly depend on field survey data and time series features as input; this increases the difficulty of early identification because collecting field data during the season is time-consuming and laborious, especially over large areas (Skakun et al., 2017b; Q. Song et al., 2017). Moreover, the time series input features are generally obtained for the entire growing season, making early mapping more challenging (Johnson, 2016). In this study, our results indicate that early-season identification of winter wheat planting area is feasible up to 3 months before harvest with limited imagery and temporal information.
Some potential uncertainties could affect the identification accuracy. First, the quantity of cloud-free satellite data largely determines how well the seasonal change of crop growth can be retrieved, which in turn influences the identification quality (Dong et al., 2020b). In this study, we used all the available Landsat and Sentinel satellite data and composited multitemporal monthly maximum NDVI images in order to avoid cloud contamination as much as possible. However, the number of available images differs greatly among provinces; it remains a challenge to acquire cloud-free images in cloudy and rainy southern areas, such as SC, HuB, and JS (X.-P. Song et al., 2017). The low identification accuracy in these provinces is likely due to the relatively poor quality of the satellite data (Dong et al., 2015). Second, although the seasonal change of winter wheat is relatively consistent in most provinces (i.e., a low peak in NDVI in winter and a high peak in NDVI in spring), there is intraclass variability in winter wheat within each province, arising from factors such as wheat variety, sowing time, and irrigation conditions. Some winter wheat fields may have an earlier sowing time and thus deviate from the standard average pattern of the province, which may lead to some omission errors. In addition, the NDVI seasonal change curves of SC and HB are atypical: NDVI shows an increasing trend from October to April. This differs from the typical seasonal change curves with two NDVI peaks during the growing season and may make it difficult to differentiate winter wheat from other crops, which may explain the relatively lower identification accuracy. Therefore, the identification of winter crops in warmer regions deserves more attention.
The derived winter wheat maps in China for 3 years (2016–2018) are
available at
To help readers reproduce this work, Table 3 summarizes the data sources and platforms of the datasets and processing steps used in this study. The input datasets came from three sources: the GEE platform, our group, and freely accessible websites. Specifically, the four satellite datasets in Sect. 2.3.1 were available on the GEE platform. The survey samples were collected by our group from the three sources introduced in detail in Sect. 2.3.2. The land cover product (i.e., FROM-GLC product) in Sect. 2.3.3 was downloaded from the freely accessible website of Tsinghua University, and the agricultural statistical area data in Sect. 2.3.3 were downloaded from the National Bureau of Statistics of China.
In addition, the monthly maximum NDVI composition was implemented on the GEE platform. The TWDTW algorithm, the exclusion of disturbances from winter rapeseed, and the classification accuracy assessment were run on a local machine.
The detailed information of the datasets and processes in this study.
Information on the geographical location and distribution of crops at global, national, and regional scales is valuable for many applications. To our knowledge, there are no published distribution maps for winter wheat over China on a large scale with a spatial resolution of 30 m. Based on the available Landsat and Sentinel imagery and a time-weighted dynamic time warping (TWDTW) method, this study produced an unprecedented 30 m spatial resolution winter wheat distribution map of China for the period of 2016–2018. The method performed well over the 11 provinces that produce more than 98 % of the winter wheat in China. When validated with 33 776 survey samples, the overall accuracy was 89.88 %, and the producer's and user's accuracies reached 89.30 % and 90.59 %, respectively. The resultant planting areas of winter wheat were spatially consistent with the agricultural statistical area, and the method explained 78 % of the spatial variability in the planting areas at the county level averaged over six provinces. More importantly, this method is effective in identifying the planting areas of winter wheat 3 months prior to harvest, which is beneficial for early yield estimation. Overall, this study produced a 30 m spatial resolution winter wheat map of China, which is expected to contribute to the timely monitoring of winter wheat growth. Future work will aim to improve the method, apply it to other staple crops (e.g., corn and rice), and eventually complete maps of staple crops at national scales.
WY and JD designed the research, performed the analysis, and wrote the paper; YF, JW, SF, YZ, and WH performed the analysis; ZN, JH, and HT edited and revised the paper.
The authors declare that they have no conflict of interest.
We would like to thank the editor and the two reviewers for their valuable comments. We would also like to thank all the scientists and students who participated in the field observations.
This research has been supported by the China National Funds for Distinguished Young Scientists (grant no. 41925001), the National Youth Top-Notch Talent Support Program (grant no. 2015-48), the Changjiang Young Scholars Program of China (grant no. Q2016161), and the Fundamental Research Funds for the Central Universities (grant no. 19lgjc02).
This paper was edited by David Carlson and reviewed by Sergii Skakun and one anonymous referee.