High-resolution map of sugarcane cultivation in Brazil using a phenology-based method

. Sugarcane is the most important source of sugar, and its cultivation area has undergone rapid expansion, replacing other crops, pastures, and forests. Brazil is the world’s largest sugarcane producer and contributed to approximately 38.6% of the world’s total production in 2019. Sugarcane in Brazil can be harvested from April to December in south-central area and from September to April in northeast area. The flexible phenology and harvest conditions of sugarcane in Brazil make it 15 difficult to identify the harvest area at state to country scales. In this study, we developed a phenology-based method to identify the harvest area of sugarcane in Brazil by incorporating the multiple phenology conditions into a time-weighted dynamic time warping method (TWDTW). Then, we produced annual 30-m spatial resolution sugarcane harvest maps (2016-2019) for 14 states in Brazil (over 98% of the harvest area) based on the proposed method by using Landsat-7/8 and Sentinel-2 optical data. The proposed method performed well in identifying sugarcane harvest area with limited training sample data. Validations for 20 the 2018 harvest year displayed high accuracy, with user’s, producer’s, and overall accuracies of 94.35%, 87.04%, and 91.47% in Brazil, respectively. In addition, the identified harvest


Introduction
Sugarcane (Saccharum officinarum) is a semi-perennial crop that can be cut multiple times throughout several years in tropical and subtropical areas (Abdel-Rahman and Ahmed, 2008;Sindhu et al., 2016;Mulyono et al., 2017). Sugarcane is an important sugar and energy crop. Over 70% of world sugar comes from sugarcane (Brar et al., 2015;Iqbal et al., 2015). Sugarcane ethanol is an alternative energy source that can reduce CO2 emission resulting from fossil fuel use (Bordonal et al., 2015;30 Jaiswal et al., 2017). In addition, sugarcane leaves can be used as fodder and be manufactured into paper and bioenergy (Leal et al., 2013). Sugarcane accounts for approximately 20% of global crop production over the period from 2000-2018, and this value is almost twice the share of maize, the second most produced crop worldwide (FAO, 2020). Sugarcane has now become a crop with high socio-economic importance for Brazil and other producers, such as India, China, Thailand, Pakistan, the United States, and Australia (Monteiro et al., 2018). Recently, sugarcane has undergone rapid expansion by occupying the 35 land of other crops, pastures, and forests (Adami et al., 2012;Ferreira et al., 2015;Wachholz de Souza et al., 2017;Defante et al., 2018). The expansion of sugarcane production areas can influence regional land cover use, water use, greenhouse gas emission, soil carbon balance, and climate change (Loarie et al., 2011;Mello et al., 2014;Zhang et al., 2015;Jaiswal et al., 2017). On the one hand, the replacement of other crops with sugarcane may directly affect on agriculture and food security (Mello et al., 2014;Jaiswal et al., 2017). On the other hand, sugarcane expansion may influence the local climate by altering 40 surface albedo and evapotranspiration (Loarie et al., 2011). Additionally, sugarcane has a large amount of water requirement and is often planted in the areas where water is limiting, therefore, sugarcane expansion may cause concerns about water security . Timely and accurate estimates of the distribution, harvest area and growing conditions of sugarcane are crucial for sustainable sugarcane production and national food security.
Recently, single-date or time series moderate to high resolution remote sensing optical data (e.g., Landsat,SPOT,45 and CBERS) and synthetic aperture radar (SAR) data (e.g., PALSAR and Sentinel-1) have been used for crop mapping based on unsupervised and supervised classification methods (Lin et al., 2009;Johnson et al., 2014;Li et al., 2015;Belgiu et al., 2018). The most current and popular classification method is machine learning, such as random forest (Zhou et al., 2015;Luciano et al., 2018Luciano et al., , 2019Wang et al., 2019), support vector machine (Johnson et al., 2014;Zheng et al., 2015), and neural networks (Cai et al., 2018). For sugarcane identification, Junior et al. (2017)  sugarcane in Zhanjiang city in China using machine learning methods and Sentinel-1A/2 time series satellite data. However, 55 these methods strongly depend on extensive training samples, which are time-consuming and labor-intensive to obtain at the state and country scales (Dong et al., 2020a;Wang et al., 2020). For example, the U.S. Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) produced the 30-m resolution Cropland Data Layer (CDL) product using a decision tree classification method, and this approach mainly relied on the large volume of USDA Common Land Unit (CLU) data to establish training samples that are updated annually (Boryan et al., 2011). Only in Nebraska did the CLD product need 60 more than 250,000 CLU records for training the classification method.
Phenology-based algorithms are also commonly used in regional classification (Wardlow et al., 2007;Zhong et al., 2014;Massey et al., 2017;Dong et al., 2020b). These methods have been proposed and developed based on the crop calendar, e.g., the germination, tillering, grand growth, and ripening phases. Li et al. (2015) identified sugarcane on the Leizhou Peninsula in China by comparing the polarization features (such as scattering angle and polarization entropy) of sugarcane with those of 65 other land use types in the early, middle, and late tillering periods using TerraSAR-X images. The result indicated that the tillering period is a suitable growing phase that can be used for sugarcane cultivation area mapping. Mulyono et al. (2017) identified sugarcane plantations in the Magetan district of East Java Province in Indonesia using support vector machine and the crop phenology profile of enhanced vegetation index (EVI) time series derived from Landsat 8 images. Phenology-based methods would be a potential alternative approach to identify crop cultivation areas at the country scale with a low volume of 70 training samples (Dong et al., 2020a). However, these studies were developed based on several phenological thresholds (e.g., green-up date, senescence date, and length of growing season) of crops in different growing stages, which need to be calibrated when extended to other regions or years. Therefore, traditional phenology-based methods are still insufficient to perform ideal mapping and may be limited by multiple thresholds when applied to large scales, such as the country to continental scales.
Dynamic time warping (DTW) is an effective phenology-based method in crop classification at large scales that compares the 75 differences in seasonal variations in the vegetation index of a target crop with those of other crop types and natural vegetation.
The original DTW method was developed for speech recognition and then employed for phenology-based classification using time series satellite images (Sakoe and Chiba, 1978;Petitjean et al., 2012;Petitjean and Weber, 2014). However, the DTW neglects the temporal ranges when searching the best alignment between two time series. Maus et al. (2016) proposed a timeweighted version of the DTW method (TWDTW) by adding a temporal weight that accounts for the seasonality of crops into 80 the DTW method to balance shape matching and phenological changes. The TWDTW method performed well in identifying winter wheat, crop, and forest with limited training data (Maus et al., 2016;Belgiu et al., 2018;Manabe et al., 2018;Dong et al., 2020a).
Brazil is the world's largest sugarcane production country and contributed to 37.6% of the world's total harvest area in 2019, followed by India (18.9%), Thailand (6.9%), China (5.3%), Pakistan (3.9%), and Mexico (3.9%) (FAO, 2020). Sugarcane in 85 Brazil is mainly located in south-central and northeast areas. Sugarcane is a semi-perennial crop. Its life cycle begins with the planting of a stem cutting and grows for approximately 12 months or 12-18 months depending on the season, variety, and region of planting (Rudorff et al., 2010). Generally, in the south-central area, sugarcane can be harvested from April to December with a harvesting season spanning 9 months. In the northeast areas, sugarcane can be harvested from September to April in the next year with a harvesting season spanning 8 months. The flexible phenology and harvest conditions of sugarcane 90 in Brazil make it difficult to identify the harvest area at large scales, particularly at the state to country scales. In this study, based on Landsat and Sentinel-2 satellite data, we proposed a phenology-based method by incorporating multiple phenological conditions of sugarcane into the TWDTW method to automatically identify the harvest area of sugarcane in Brazil at a spatial resolution of 30-m from 2016 to 2019.

Satellite data 105
In this study, all the available Landsat-7/8 and Sentinel-2 reflectance data from July 2015 to February 2020 were used to generate 16-day composite normalized difference vegetation index (NDVI) series at the Google Earth Engine (GEE) platform.
Landsat-7 and Landsat-8 data had a 30-m spatial resolution and a 16-day temporal resolution. The quality band BQA was used to remove pixels contaminated by clouds. The Sentinel-2 data had a 10-m spatial resolution and a 5-day temporal resolution.
The quality band QA60 was used to remove pixels contaminated by clouds. NDVI was computed using the reflectance data 110 from the near infrared band (ρNIR) and red band (ρRED): The cloud-free frequencies of the 16-day composite NDVI in a year for each pixel are shown in Fig. 2. For 2018 and 2019, most of the pixels had more than 20 times of good observations. For 2016 and 2017, the pixels with more than 20 times of good observations were less, while most of the pixels had more than 18 times of good observations (Fig. 2). To reconstruct the 115 near real-time temporal profile of the 16-day NDVI, we filled the gaps in NDVI time series using a linear interpolation method and then filtered the NDVI curves.

Sample data
The sample data used for calibration and validation in this study were mainly obtained based on the high-resolution images 120 from Google Earth. We selected the samples for sugarcane and non-sugarcane according to the following rules. First, we selected the samples for sugarcane or non-sugarcane using visual interpretations according to color and textures of the images from Google Earth. As shown in Fig. 3a, sugarcane exhibits unique color and textures on the high-resolution images from Google Earth. Sugarcane has coarser surface than most of the crops and smoother surface than forest or trees in growing season, which can be used to separate sugarcane from these types. Second, we cross checked and confirmed each of these samples 125 using its NDVI timeseries. Sugarcane has a long life cycle ranging from 12 to 18 months (Rudorff et al., 2010), which is longer than most of the crops (Fig. 3b). And the sharply decreased NDVI in sugarcane harvest period can separate sugarcane from forest and pasture (Fig. 3b). We only use the samples which can satisfy the above two criteria. Finally, we collected totally 2909 samples with 1393 for sugarcane and 1516 for non-sugarcane in the year 2018 ( Fig. 1). <<Figure 3>> 130

Agricultural statistical data
The agricultural statistical data for the sugarcane harvest area were derived from the Municipal Agricultural Production (PAM) provided by the Brazilian Institute of Geography and Statistics (Instituto Brasileiro de Geografia e Estatí stica -IBGE; www.ibge.gov.br). The PAM was carried out yearly across the entire country with the statistical data at the Brazil, major region, federation unit, mesoregion, microregion and municipality levels. It provides information related to the plant area and 135 harvest area, production and average yield, and average price in the reference year for 64 agricultural products. In our study, we used the harvest area from 14 major sugarcane planted states at the federation unit, municipality, microregion, and mesoregion levels from 2016 to 2018 (the boundary lines of these regions are shown in Fig. 4).

Methods 140
In this study, we employed a phenology-based method, namely the TWDTW method, to identify the harvest area of sugarcane in Brazil. The workflow was as follows: (1) pre-processing of the satellite dataset (i.e., Landsat 7/8 and Sentinel-2) to obtain the 16-day composite NDVI series, including low quality data removing (e.g., the elimination of clouds, cloud shadows, and Landsat 7 ETM+ Scan Line Corrector (SLC-off) gaps), NDVI compositing, NDVI gap filling and data filtering; (2) extraction of the standard NDVI curves of sugarcane in Brazil; (3) model development and sugarcane harvest area identification in Brazil 145 based on the TWDTW method; and (4) assessment of the mapping accuracy of the Brazil sugarcane harvest area.

Time-weighted dynamic time warping (TWDTW) method
TWDTW is a time-weighted version of the DTW method for land use and land cover classification (Belgiu et al., 2018;Dong et al., 2020a). The DTW works by comparing the similarity between two sequences, such as the unknown sequence (Y) and target sequence (X), through warping the unknown sequence (Y) to search for their optimal path and obtain their minimum 150 distance, namely the similarity (dissimilarity) (Petitjean et al., 2012;Petitjean and Weber, 2014). The TWDTW method adds a temporal cost accounting for the phase difference between the two time series to the minimum distance (Maus et al., 2016).
The TWDTW algorithm includes three steps in land use classification and identification: (1) generate the standard curve of a selected index (e.g., NDVI) for several crops or a single crop (e.g., sugarcane) based on time series images and field sample training data; (2) find the best alignment, and generate the dissimilarities (TWDTW distances) between unknown curves of 155 NDVI time series (i.e., the NDVI curves of the unknown pixels) with the standard NDVI curve of sugarcane; (3) identify the unknown pixels based on the TWDTW distance; in this process pixels with low distance indicate high similarity and a high probability of being associated with the specified class (i.e., sugarcane).

<<Figure 5>> 160
The workflow by employing the TWDTW method for sugarcane harvest area mapping are shown in Fig.5. The growing period of sugarcane can be separated into four phases: the germination, tillering, grand growth, and ripening, and the NDVI values vary in different growing phases (Fig. 6). (1) Germination phase. Sugarcane begins to germinate approximately 15-30 days after planting. The NDVI starts to increase in this period. (2) Tillering phase. Tillering phase starts after approximately 2 months of germination, and tillers emerge from the base of the mother shoot to form 5-10 stalks. The NDVI increases quickly 165 in this period. (3) Grand growth phase. This phase spans a period of approximately 4-10 months after planting. The NDVI peaks during this period. (4) Ripening and harvesting phase. In this phase, the NDVI starts to decrease, and the moisture content in sugarcane drastically drops.
From planting until the first cut, the crop is called planted sugarcane, and the growth cycle lasts between 12 and 18 months depending on the season, variety, and region of planting. After the first harvest, ratoon sugarcane is harvested yearly with a 170 normal cycle of 12 months for a period of approximately 5 to 7 years or more (Rudorff et al., 2010). Although planted sugarcane and ratoon sugarcane have different length in growing cycles, they shared similar NDVI curves from the grand growth to harvest phases (red symbols in Fig. 6) which can be used as the standard seasonal curve for harvest area mapping. In Fig. 6, the standard NDVI curves for sugarcane were generated by randomly selecting 50 sugarcane samples across Brazil from the field data in 2018 (Section 2.2.2) and calculating their averaged NDVI values in the same growing period. The standard NDVI 175 curves were produced as following: (1)  calculating their averaged NDVI values to obtain the standard NDVI curves (Fig. 6).

<<Figure 6>> 180
Sugarcane in Brazil covers an extensive harvesting period (Rudorff et al. 2010). In the south-central area (including Sã o Paulo, Goiá s, Minas Gerais, Mato Grosso do Sul, Paraná , Mato Grosso, Bahia, Rio de Janeiro, and Espí rito Santo), sugarcane is often harvested from April to December, with a harvesting season spanning 9 months. In the northeast area (including Alagoas, Pernambuco, Paraí ba, Rio Grande do Norte, and Sergipe), sugarcane is harvested from September to April in the next year, with a harvesting season spanning 8 months. According to the phenology of sugarcane in the south-central and northeast areas, Fig.7 shows the possible standard NDVI curves (these NDVI curves were repeated from the NDVI standard curve for sugarcane, as denoted by the red symbols and line in Fig. 6) for sugarcane in south-central and northeast Brazil. In this study, we incorporate the flexible phenological and harvest conditions of sugarcane in Brazil into the TWDTW method as follows.

<<Figure 7>>
(1) Calculate the dissimilarities (TWDTW distances). For each year from 2016-2019, we calculated all the "distance" values 190 for each unknown pixel by comparing its NDVI timeseries with the standard NDVI curves of sugarcane (Fig. 7) based on the TWDTW method. In this process, we can obtain 13 "distance" values corresponding to the 13 standard NDVI curves in Fig.   7 for each pixel, and we selected the minimum value of the "distance" as the final distance value for each pixel.
(2) Remove the influence of other vegetation with similar NDVI changes to sugarcane harvesting period, such as the Brazilian Cerrado biomes and pasture. In the Brazilian Cerrado, some vegetations (such as grassland, shrubland, woodland, and deciduous forest) 195 exhibited low NDVI values from the end of the drought season to the beginning of the rainy season (August to October), which is similar to the sugarcane harvesting practice. However, the "difference" in NDVI between the growing season and nongrowing season for these vegetations were lower than those for sugarcane (Ferreira et al. 2004;Mueller et al. 2015). Therefore, we used the NDVI "difference" as another supplement criterion to further separate sugarcane from the aforementioned vegetation types across the states related to the Brazilian Cerrado (i.e., Sã o Paulo, Goiá s, Minas Gerais, Mato Grosso do Sul, 200 Mato Grosso, and Bahia). The "difference" between the maximum NDVI value in sugarcane growing season (NDVImax: mean value of the two maximum NDVI in the growing season) and the minimum NDVI value in sugarcane non-growing season (NDVImin: mean value of the two minimum NDVI in the non-growing season) was calculated for each pixel in these states ( Fig. 7). From the statistics of 50 randomly selected sugarcane samples, we found the "difference" for sugarcane is mostly greater than 0.31. Therefore, we set the pixels with "difference" less than 0.31 as non-sugarcane. Additionally, pasture, which 205 has similar temporal-spectral behaviour to sugarcane (Xavier et al., 2006), was further removed using the pasture maps (overall accuracy of 87%) produced by Parente et al. (2017). Across all the 14 studied states, pixels consecutively labelled as pasture on the pasture maps from 2016 to 2019 was set as non-sugarcane. (3) Identify to produce the sugarcane maps. We used the agricultural statistical harvest area for sugarcane at the state level to determine the "distance" threshold. A pixel with "distance" value lower than the "distance" threshold was considered a "sugarcane" pixel, and the total area of all sugarcane pixels should 210 be equal to the statistical harvest area of sugarcane in the investigated state. In our study, municipalities with small areas of planted sugarcane (less than 1000 ha or less than 1% of the total sugarcane area in the investigated state) were identified separately to improve the identification accuracy of the entire investigated state.

Accuracy assessment
In this study, we first assessed the identification accuracy using the selected sugarcane and non-sugarcane samples based on 215 the high-resolution images from Google Earth (Section 2.2.2). The producer's accuracy (PA), user's accuracy (UA), and overall accuracy (OA) were used for validation. The producer's accuracy (PA) is the percentage of surveyed reference samples correctly identified as the target class; the user's accuracy (UA) is the percentage of surveyed reference samples identified as the target class on the classification map actually confirmed by field surveys; and overall accuracy (OA) is the ratio of correctly classified samples to all the samples. Additionally, we calculated the sugarcane harvest area on the map in different 220 administrative regions and compared them with agricultural statistical data at the municipality, microregion, and mesoregion levels. The coefficient of determination (R 2 ) and RMAE (relative mean absolute error) between the statistical harvest area and the estimated harvest area were adopted to assess the map accuracy. The MAE (mean absolute error) can be expressed as: where and ̂ are the statistical area and identified area for the th administrative region, respectively. is the number of 225 the administrative regions with valid statistical data. RMAE is the value of MAE relative to the mean value of the statistical area for all the administrative regions:

Results
Annual cultivation maps of sugarcane in the 14 states of Brazil from 2016 to 2019 were produced using the TWDTW method 230 (taking 2018 as an example in Fig. 8). The validation demonstrated good performance of the proposed method in identifying sugarcane harvest area in 2018. To show the detailed information of the sugarcane maps produced in our study. We selected four typical areas in different states to zoom-in and compared the sugarcane maps with high-resolution images from Google Earth (Fig. 9). In general, the sugarcane maps well captured the delineation of fields even though with some noises. What's more, the sugarcane maps can exhibit detailed information, such as small parcels, roads, and ridges between the fields. Based 235 on the 2859 samples derived from Google Earth in 2018, the user's, producer's, and overall accuracies were 94.35%, 87.07%, and 91.47% in Brazil, respectively. The performance of the method varied by state and region. For all the 14 studied states, the overall accuracy (OA) varied from 83.51% to 94.46%, with the user's accuracy (UA) ranging from 86.96% to 98.25% and producer's accuracy (PA) ranging from 78.18% to 91.52% for sugarcane; and the user's accuracy (UA) ranging from 76.00% to 94.42% and producer's accuracy (PA) ranging from 86.05% to 98.15% for non-sugarcane (Table 1). Sã o Paulo, the state 240 with the largest planted area of sugarcane (accounting for over 50% of the sugarcane in Brazil), displayed high user's, producer's, and overall accuracies of 94.53%, 91.52%, and 94.46%, respectively. Goiá s, the state with the second-largest planted area of sugarcane (accounting for approximately 10% of the sugarcane in Brazil), had high user's, producer's, and overall accuracies of 93.96%, 90.48%, and 93.38%, respectively. All the investigated states exhibited overall accuracy (OA) over 83% (Table 1). 245 <<Figure 8>> <<Figure 9>> <<Table 1>> Additionally, the proposed method can accurately estimate the sugarcane harvest area compared with the agricultural statistical data at different administrative levels. The estimated harvest area of sugarcane in 2018 exhibited good correlations with the 250 agricultural statistical area data derived from PAM at the municipality, microregion, and mesoregion levels. The coefficients of determination (R 2 ) were 0.88 (N=3637), 0.96 (N=369), and 0.99 (N=87) at the municipality, microregion, and mesoregion levels, respectively, and the respective RMAEs were 34.0% (MAE=0.09×10 4 ha), 20.5% (MAE=0.55×10 4 ha), and 13.7% (MAE=1.55×10 4 ha) (Fig. 10). The performance was better when the validated regions were aggregated to larger areas, namely the accuracy from low to high were at the municipality, microregion, and mesoregion levels. 255 <<Figure 10>> The correlations between the agricultural statistical and the estimated harvest areas varied by state and region. At the municipality level, the coefficient of determination (R 2 ) between the agricultural statistical and the estimated harvest areas ranged from 0.62 to 0.98 in south-central Brazil and from 0.85 to 0.89 in northeast ; the RMAE ranged from 18.8% to 64.5% in south-central Brazil and from 38.8% to 53.2% in northeast Brazil ( Fig. 11; Fig. 13). At the microregion 260 level, the R 2 between the agricultural statistical and the estimated harvest areas ranged from 0.67 to 1 in south-central Brazil and from 0.70 to 1 in northeast Brazil (Fig. 12); the RMAE ranged from 14.2% to 60.5% in south-central Brazil and from 7.7% to 46.6% in northeast Brazil (Fig. 13). At the mesoregion level, the R 2 between the agricultural statistical and the estimated harvest areas ranged from 0.43 to 1 in south-central Brazil and from 0.99 to 1 in northeast Brazil (Fig. 12); the RMAE ranged from 1.7% to 58.1% in south-central Brazil and from 3.1% to 22.6% in northeast Brazil (Fig. 13). Validation at all three levels 265 showed high performance, with high R 2 and slope close to 1. In general, Mato Grosso and Bahia in the northern part of southcentral Brazil and Sergipe in the southern part of northeastern Brazil displayed lower R 2 and higher RMAE values (Figs. 11-13). The performance for Sã o Paulo and Goiá s were higher than those for most other states, except Rio de Janeiro (Figs. 11-13). Finally, we assessed the capability of the method and standard seasonal changes in NDVI (i.e., standard NDVI curves) <<Figures 11-13>>

Discussion
As the largest global producer of sugarcane, Brazil contributed to approximately 38.6% of the world's sugarcane production in 2019 and played an important role in retaining the global demand for sugarcane (FAO, 2020). While the cultivation area of 275 sugarcane in Brazil has increased by approximately 11% over the past 10 years according to census data (FAO, 2020), which indicated substantial land cover changes and important feedback to regional climate systems (Loarie et al., 2011;Mello et al., 2014;Jaiswal et al., 2017). The first harvest area map of sugarcane in Brazil at a 30-m spatial resolution was generated only for Sã o Paulo state for the 2003/2004 crop year based on an automatic image classification method (Rudorff et al., 2005).
Subsequently, the cultivation map of sugarcane was updated using visual/manual image interpretation through 2013, and the coverage of the map extended from Sã o Paulo state to a total of 8 states in the south-central region of Brazil, accounting for 88% of the sugarcane cultivation area in Brazil, as reported by the Canasat Project (Rudorff et al., 2010). Souza et al. (2020) reconstructed annual land use and land cover information at a 30-m spatial resolution from 1985-2017 for Brazil based on random forest method and Landsat data trained by plenty of samples selected from existing land cover maps, which provides the cultivation map of sugarcane belonging to a subclass of agriculture (the producer's and user's accuracies for agriculture 285 were 83.3% and of 81.3%, respectively). However, these prevailing methods strongly require large volume of training samples, which makes it difficult to update annually at large scales.
In this study, we proposed a phenology-based sugarcane classification method by incorporating multiple phenological conditions of sugarcane into the TWDTW method. Then, we identified the harvest area of sugarcane with a spatial resolution of 30-m in Brazil from 2016 to 2019 using 16-day composite NDVI series derived from Landsat-7/8 and Sentinel-2 data. Our 290 proposed method can automatically identify the sugarcane areas with limited training data and needs only one standard NDVI curve (Fig. 6) for all the 14 sugarcane planted states in Brazil. Validations against field sample data and agricultural statistical area data showed that the generated sugarcane harvest maps were in high accuracy. For example, validations for 2018 displayed the user's, producer's, and overall accuracies of 94.35%, 87.07%, and 91.47% in Brazil, respectively.
Although our method can effectively and accurately identify the sugarcane harvest area from the regional to national or 295 continental scales, there are still several potential uncertainties in the identification process. First, because of the quality of the processed NDVI series, the speckle "salt-and-pepper" effects/noises exist in some areas of the harvest map. According to the map statistics in 2018, sugarcane patches with only one pixel account for 0.7% (Sã o Paulo) to 9.1% (Espí rito Santo) of the sugarcane area (Fig. 14). In the future, the object-based identification by segmenting images into homogeneous objects, instead of the pixel-based method used in our study, may alleviate the "salt-and-pepper" effects and improve the identification 300 performance (Belgiu et al., 2018).

<<Figure 14>>
Second, the identification accuracy for Bahia state was lower than that for other states. Because of the different harvest seasons in south-central and northeast Brazil, we identified sugarcane in the south-central and northeast areas using different phenological characteristics and standard curve combinations (Fig. 7). Bahia is a transition state between south-central and 305 northeast Brazil. Namely, sugarcane in southern Bahia has a harvest season similar to that in south-central Brazil, and sugarcane in northern Bahia has a harvest season similar to that in northeast Brazil. In our study, we treated Bahia as a state in south-central Brazil with a harvest season from April to December (Fig. 7), which may introduce errors to the identification in the northern part of Bahia.
Third, Mato Grosso state exhibited a lower R 2 and higher RMAE than other states when comparing the identified sugarcane 310 harvest area with the agricultural statistical sugarcane harvest area. Two major biomes are located in Mato Grosso, including the humid tropical forests of the Amazon in the north and the heterogeneous Cerrado area (a tropical savanna) in the southcentral part of the state (Kastens et al. 2017). In Mato Grosso, sugarcane may be misclassified with some kinds of grassland, grazing areas, or seasonal forest, which exhibit phenological changes similar to those of sugarcane (Mueller et al. 2015;Bendini et al. 2019). The NDVI values of these vegetation types decrease between August and October (from the end of the drought 315 season to the beginning of the rainy season) and increased thereafter (Ferreira et al. 2004), which is quite similar to that in the harvesting stage of sugarcane, resulting in misclassification with the sugarcane harvested from August to October. In our study, we used the "difference" between the maximum NDVI value in the growing season and the minimum NDVI value in the nongrowing season (see method in Section 2.3.2) to alleviate the misclassification at some extent because the "difference" for the abovementioned vegetation types is generally lower than that for sugarcane (Ferreira et al. 2004). In the future, incorporating 320 more complex spectral-temporal variability metrics, such as the combination of more spectral information instead of NDVI and a longer time window for several harvest seasons instead of one harvest season may help improve model performance (Mueller et al. 2015).

Conclusion
Brazil is the world's largest sugarcane producer and contributes to approximately 38.6% of total global sugarcane production.
Based on the available Landsat-7/8 and Sentinel-2 optical images, we produced sugarcane harvest maps with a 30-m spatial resolution (2016-2019) by incorporating multiple phenological conditions of sugarcane in Brazil into the TWDTW method. 330 The proposed method can automatically identify sugarcane harvest area with limited training sample data. Based on 2859 samples derived from Google Earth, the validation experiment reflected high accuracy across the 14 sugarcane planted states in Brazil in 2018, with the user's, producer's, and overall accuracies of 94.35%, 87.07%, and 91.47%, respectively.
Additionally, the identified harvest area of sugarcane exhibited a good correlation with the agricultural statistical area data derived from PAM at the municipality, microregion, and mesoregion levels. The maps can be used to monitor the harvest area 335 and yield of sugarcane, and evaluate the feedback to regional climate.
Author contributions. Wenping Yuan and Yi Zheng designed the research, performed the analysis, and wrote the paper; Ana Clá udia dos Santos Luciano and Jie Dong edited and revised the manuscript.