ChinaCropPhen1km: A high-resolution crop phenological dataset for three staple crops in China during 2000-2015 based on LAI products

Crop phenology provides essential information for land surface phenology dynamics monitoring and modelling, and 10 crop management and production. Most previous studies mainly investigated crop phenology at site scale, however, land surface phenology dynamics monitoring and modelling at a large-scale need a high-resolution spatially explicit information on crop phenology dynamics. In this study, we proposed a method to retrieve 1km-grid crop phenological dataset for three main crops from 2000 to 2015 based on GLASS LAI products. First, we compared three common smoothing methods and chose the most suitable methods for different crops and regions. Then, we developed an optimal filter-based phenology 15 detection (OFP) approach which combined both inflexionand threshold-based method and detected the key phenological stages of three staple crops at 1km spatial resolution across China. Finally, we established a high resolution gridded-phenology product for three staple crops in China during 2000-2015, named as ChinaCropPhen1km. Compared with the intensive phenological observations from the Agricultural Meteorological Stations of China Meteorological Administration, the dataset had a high accuracy with errors of retrieved phenological date less than 10 days, and represented the spatiotemporal patterns 20 of the observed phenological dynamics at site scale fairly well. The well-validated dataset can be applied for many purposes including improving agricultural system or earth system modelling over a large area. DOI of the referenced dataset: https://doi.org/10.6084/m9.figshare.8313530 (Luo et al., 2019).


Introduction
Phenology is a key indicator of vegetation growth and development and plays an important role in vegetation monitoring (Qiu 25 et al., 2015;Tao et al., 2017;Zhong et al., 2016). Accurate information on the timing of key crop phenological stages is critical for determining the optimal timing of agronomic management options, reliable simulations of crop growth and yield, and analyzing the plant response to climate change (Bolton and Friedl, 2013;Brown et al., 2012;Chen et al., 2018a;Sakamoto et al., 2013;Sakamoto et al., 2010;Wang et al., 2015;Zhang and Tao, 2013).
Field phenological observations are time-and money-consuming. And the observational stations are limited and distributed sparsely. Therefore, the field phenological observations can't meet the requirements of many purposes such as vegetation monitoring for remote areas with sparse observations and the grid-based earth system simulations. The satellite-based observations with a wide spatial coverage and short revisit times have become a powerful method to monitor vegetation growth and obtain vegetation information at regional and global scales. Previous studies have mainly used a vegetation index (VI) to extract crop phenology. For examples, Pan et al. (2015) presented a method to construct Normalized Difference Vegetation 35 Index (NDVI) time-series dataset derived from HJ-1 A/B CCD and extract phenology parameters. Zeng et al. (2016) detected corn and soybeans phenology with Moderate Resolution Imaging Spectroradiometer (MODIS) 250-m Wide Dynamic Range Vegetation Index (WDRVI) time-series data. Cao et al. (2015) developed an adaptive local iterative logistic fitting method to fit time-series of Enhanced Vegetation Index (EVI) derived from MODIS and estimated green-up date of spring vegetation. Sakamoto (2018) refined the Shape Model Fitting method to estimate the timing of 36 crop-development stages of major U.S. 40 crops from MODIS WDRVI time-series data. Crop phenology detected by these studies is relatively accurate. Nevertheless, it cannot be ignored that the VI are overly dependent on the band characteristics of sensors (Atzberger et al., 2014). By contrast, the Leaf Area Index (LAI) is more robust across diverse sensors and more sensitive than VI to large amounts of vegetation (Verger et al., 2016). In addition, previous studies focused on only very limited areas or very few crops due to the high diversity and complexity of agricultural planting structures (Liao et al., 2019;Liu et al., 2017;Xu et al., 2017;Wang et al., 2012). 45 It is urgently required to acquire the gridded phenological datasets over a long-term period at a national scale as it is the basis for a large-scale agricultural system or earth system simulation. For example, crop model can simulate crop growth, development and predict crop yields. However, its applications to a large area are limited by the lack of accurate and spatially heterogeneous crop growth information (Curnel et al., 2011;Dorigo et al., 2007;Tao et al., 2009;Jin et al., 2018). According to some previous studies, it could improve the accuracy of model estimation at large scale by assimilating reliable remote sensing 50 data into crop growth models (Bolten et al., 2010;Nearing et al., 2012;Ines et al., 2013;Chen et al., 2018a;Huang et al., 2015;Zhou et al., 2019;de Wit and van Diepen, 2007). Among the state variables used in the assimilation, phenology is one of the essential variables because of its critical roles in affecting dry matter accumulation and distribution during the growing stages and reflecting crop periodic biological changes influenced by various environmental conditions (e.g., climate) (Jin et al., 2018;Zheng et al., 2016). 55 In this study, using a remotely sensed Global Land Surface Satellite (GLASS) LAI product (2000-2015) (Xiao et al., 2014), we aim to 1) choose the most suitable smoothing method to reduce the noise of the LAI time-series for different crops and regions; 2) detect the phenological information of three staple crops (i.e., maize, rice and wheat) at 1-km spatial resolution across China, and evaluate its accuracy by comparing with the observed data at Agricultural Meteorological Stations (AMS) of China Meteorological Administration (CMA); 3) explore the spatial patterns of different phenological stages. The resultant 60 remote sensing LAI-based crop phenology dataset with 1-km spatial resolution across China (ChinaCropPhen1km) will benefit to understand crop phenological dynamics, climate change impacts and adaptations, and agricultural system modelling over a large area, temporally and spatially (Luo et al., 2019).

Study area 65
The study areas are across the mainland of China, possessing of complex environments and crop planting structures, diverse cropping intensity and cultivation habits ( Fig. 1) (Piao et al., 2010;Zhang et al., 2014a). Rice, wheat and maize are the three staple crops in China, together accounting for 59% of the total planting area and 92% of the grain yield in 2017. Roughly half of the cropland in China is multi-cropped, such as the double cropping system of wheat-maize in the North China Plain, and the rotation system between early-rice and late rice in Southern China (Frolking et al., 2002). 70

The GLASS LAI data
An improved MODIS-based LAI dataset (GLASS LAI) from 2000 to 2015 was from Liang et al. (2013) (http://glassproduct.bnu.edu.cn/?pid=3&c=1). The GLASS LAI product was generated with general regression neural networks (GRNNs) trained by the fused LAI from MODIS and CYCLOPES LAI products and the reprocessed MODIS reflectance of the 75 BELMANIP sites during the period 2001-2003(Liang et al., 2013. By computing the RMSE and R 2 between several global LAI products and the high-resolution LAI reference map, it could be shown that the accuracy of the GLASS LAI (RMSE=0.78, R 2 =0.81) was fairly better than that of the MOD15 and GEOV1 (Xiao et al., 2016). Moreover, the intercomparison have indicated that GLASS LAI (8-day composites of 1-km spatial resolution) was more temporally continuous and spatially complete than the other LAI products (Xiao et al., 2014;Xiao et al., 2016). It has been applied to vegetation monitoring and 80 crop model assimilation (Xiao et al., 2014;Chen et al., 2018a).

Phenology observation
The crop phenology observation records from 2000 to 2013 of maize, rice, and wheat crops were obtained from AMS, which were governed by CMA (https://data.cma.cn/). Such phenology information was observed and recorded by well-trained agricultural technicians in the experimental field, and then checked and managed by the Chinese Agricultural Meteorological 85 Monitoring System (CAMMS). In this study, we selected the agrometeorological stations with more than 10 years of records of key phenological dates, including green-up date, emergence date, transplanting date, V3 stage (i.e., early vegetative stage of maize when the third leaf is fully expanded), heading date, and maturity date, for the three crops. Totally, there were 436 stations across main crop-cultivated areas in China (Fig. 1).

Methods
The method to retrieve the phenological information of three staple crops at national scale is presented schematically in Fig.  95 2. The data processes are as follows: 1) data preprocessing, 2) selecting the cropland sample grid to determine the suitable smoothing method, 3) determining the optimal filter-based phenology detection (OFP) approach, 4) retrieving the phenological information of three crops at 1-km pixel across China.

Data preprocessing
Due to the differences among these datasets on projected coordinate system, firstly, we projected or re-projected all raster data 100 to "Asia North Albers Equal Area Conic" by using the Projection Raster tool in ArcGIS. Then, we combined 46 annual GLASS LAI images together and used a China provincial administrative vector map to mask images by province. Finally, a LAI timeseries was created for each pixel for further applications.

Methods chosen to smooth LAI products
Previous studies have proposed different smoothing methods to reduce the noise of GLASS LAI time series, and found the 105 OFP method varied by studied times, areas, and objectives Wang et al., 2018). Three popular methods were chosen in the study to smooth the LAI time-series curves, including the Double Logistic (DL) method, Savitzky-Golay (S-G) filter method, and Wavelet-based filter (WF) method.

Double logistic (DL) method
Double logistic is a method of merging local fitting parts to obtain the overall fitting result (Jonsson and Eklundh, 2004). In 110 the local fitting process, the Double logistic function can be expressed as:

Savitzky-Golay (S-G) filter method 115
Based on locally adaptive moving window, Savitzky-Golay (S-G) filtering method can be used to smooth data and suppress disturbances with a local polynomial regression model (Savitzky and Golay, 1964). Algorithm can be summarized as follows: * ∑ 2 where, represents the original LAI value, LAI * is the smoothed LAI value, j is running index of the LAI time series, is the coefficient of the i-th LAI value, n is the half-width of the smoothing window, and N is the width of the moving window 120 to perform filtering (2n+1). The width of the moving window-N, not only determines the degree of smoothing, but also affects the ability to follow a rapid change. We selected three windows width (3, 4, 5) to identify a better width for different crops and regions.

Wavelet-based filter (WF) method
Wavelet-based filter method can reduce noise with reflecting the periodicity of seasonal vegetation change (Sakamoto et al., 125 2005). The input signals is transformed in the wavelet transform as follows: where a is a scaling parameter, b is a shifting parameter, and implies a mother wavelet.
The advantage of the WF method is that it can maintain the time components of time-series data and hardly distort signals.
The input signals is decomposed to linear combinations of wavelet functions in the multi-resolution approximation: 130 4 where implies the approximate expression in level i, and implies the high-frequency components in level i. We used three types of mother wavelets: Daubechies (1988) (order=3-24), Coiflet (order=1-5), and Symlet (order=4-15) in the study.

Methods to detect the phenological information 135
The methods to detect remote-sensing-based phenology can generally be classified into three groups: inflexion-based method (Chen et al., 2016), threshold-based method (Manfron et al., 2017), and methods based on the mathematical or geometrical model fitting approach (Sakamoto et al., 2010). In this study, we used both inflexion-and threshold-based methods together to detect phenology. Firstly, we defined the inflection and maximum points of LAI time-series as the specific timing of key phenological stages for different crops (Fig. 3). 140

Green-up date, emergence date, transplanting date and V3 stage
We defined the date of inflection point (the first derivative increases continuously after this point or the second derivative equals 0) of the LAI time-series curves as the green-up date of winter wheat, emergence date of spring wheat, transplanting date of rice and V3 stage (early vegetative stage of maize when the third leaf is fully expanded) of maize (Sakamoto, 2018;Sakamoto et al., 2005;Sakamoto et al., 2010). Before the inflection point, the LAI values are kept low for a long time, 145 and then they start to increase continuously after this point.

Heading date
Heading date in the study was defined as the day when LAI reaching the maximum, as similar as some previous studies (Sakamoto et al., 2005;Chen et al., 2018b). That is to say, the maximum LAI points in the time-series curve are regarded as the heading dates. 150

Maturity date
When crops reach maturity, the physiological activity will change largely, leading to an abrupt decrease in LAI (Sakamoto et al., 2005;Chen et al., 2018b). Therefore, we regarded an inflection point in the LAI time-series curve, where the first derivative is negative with the largest absolute value, as the maturity date.

Determining the optimal filter-based phenology detection approach (OFP) 155
Based on the observations around the nearest AMS, we needed firstly to determine the restricted time windows responding to each key phenological stage for different crops. Then we sampled randomly 1000 grids every year in each province from the grids where the land use data was identified as cropland and retrieved the key phenological stages in the sampling grids according to the three smoothing methods and the above definitions of key stages. To determine the OFP approach in each province, we identified the inflection points and maximum value point of each LAI time-series curve at each grid within the 160 restricted time windows. After detecting phenological information of the cropland sample grids, we calculated the RMSE values between the estimated phenological dates and observed dates, and averaged these RMSE values for each crop at a provincial scale. Finally, we chose the most suitable smoothing method for different crops in each province with minimum RMSE.

Retrieving the phenological information at 1-km pixel across China 165
After removing the grids from where the land use data was identified as non-cropland, we then obtained cropland grids where the phenological information will be detected. Then, the most suitable smoothing method for different crops in each province were applied to reconstruct the LAI time series at 1-km grid scale. Finally, we detected the key phenological dates based on OFP approach and regarded the grids that the three key phenological stages (mentioned in 2.3.3) could be simultaneously at a national scale, we calculated the mean of phenological dates detected from each crop pixel around corresponding AMS and compared them with the corresponding observations by using RMSE.

Comparisons of different smoothing methods
The smoothed time profiles of LAI generated by different smoothing methods are shown in Fig. 4. Both S-G filter and WF 175 method can smooth LAI time-series well. That is to say, the generated time profiles of LAI match well with the seasonal tendency of the observed LAI time-series in the field. In addition, both methods can clearly characterize the local changes in the time component and maintain the time components of LAI time-series data. Although DL method performs poorly for smoothing LAI time series of the double-season crops, it's still reliable for single-season crops. These findings were consistent with those in some previous studies (Zhu et al., 2012;Sakamoto et al., 2005;Qiu et al., 2016). 180 We further compared mean RMSE of different smoothing methods and selected the most suitable smoothing method with minimal mean RMSE for different provinces and crops (Table 1). If the RMSE values were same, we also compared the number of crop grids according to different smoothing methods, and selected the suitable method which had identified a larger number of crop girds. It is noted that the number of identified grids differ considerably even with same RMSE values. Totally, S-G filter was an overwhelming smoothing way for 95% crops and provinces, followed by WF and DL method. 185 We ascribed the great performance of S-G to two reasons as follows. 1) One is its scientific smoothing principle: S-G filter applies an iterative weighted moving average filter to the time series, which can replace the noise data as well as keep the fidelity (Geng et al., 2014). By contrast, WF decomposes the time series into scaled and shifted wavelets to acquire timelocalization of a given signal (Qiu et al., 2014). DL uses a series of parameters to fit the time series (Beck et al., 2006). 2) The other is that S-G is more suitable for GLASS LAI. S-G can catch the local variations-e.g. the bimodal curve characteristics 190 from double cropping rice and the rotation of winter wheat and summer maize ( Fig. 3-b, f) -in time series and perform best for data without extreme noise such as GLASS LAI (Eklundh and Jönsson, 2015). DL is more useful for data with much noise, however, fails to catch local changes due to being unfit for data with double peaks. WF is also a powerful tool for processing non-stationary and noisy signals such as VI time-series rather than GLASS LAI (Rouyer et al., 2008;Sakamoto et al., 2006). Therefore, S-G is the most suitable for the complex cultivating systems across whole mainland of China. We also attributed 195 the excellent performance of S-G to the phenological extraction rules established in this paper, and the goal of accurately extracting the crop cultivation grids, as well as key phenology stages. For example, WF smoothing method might eliminate pseudo inflexion points that may not be pseudo due to the uncertainty of GLASS LAI data sometimes, and misidentify noncrop grids by inflexion-and threshold-based methods consequently resulting in very few crop grids identified (Qiu et al., 2016).

Validation of the phenological data 200
The comparison between retrieved phenological dates and phenological observations of each crop from 2000-2015 at national scale showed that all retrieved and observed dates were closely and averagely distributed 1:1 line for three crops (Fig. 5).
Additionally, the RMSE values of retrieved phenological dates were consistently less than 10 days ( Table 2). The RMSE averages of three key dates for rice were around 5.3 days, followed by wheat (5.5 days) and maize (6.7 days), corresponding to the related R 2 of 0.98, 0.97 and0.97, respectively. 205 As for the differences among crops, the retrieved accuracy of maize phenological stages was consistently the worst, with the biggest RMSE and errors (≥±10 days), and the lowest errors (≤±10 days) and R 2 . We ascribed the lower accuracy of maize phenology to the wider spatial heterogeneity environment and the complex rotation planting system relative to the other two crops (Qiu et al., 2018). The highest accuracy of rice phenology also supported the accuracy impact of complex planting system because paddy field is unfit for dryland crops such as maize, wheat, soybean, and other coarse cereals (Dong et al., 2015). 210 More interestingly, the retrieved accuracy of three crops decreased as crop growing and developing up to maturity periods (Table 2), with the average RMSEs ranging from 3.7 to 7.2 days. The highest accuracy (RMSE=2.8, error=0.5%) was found for the green up/emergence stages of wheat, while accuracy of maturity stage for each crop were the lowest (average RMSE=7.2, error=19%). The reasonable explanation might be relative weaker interfere from other vegetation because the green-up/emergence stage occurs most early during plant growing period (some 80 DOY Table 3). With the land surface 215 greening up, more and more information on plant growing statuses will be shot by satellites, which consequently mix with the crops' information and interfere to retrieve accurately the phenological stages of crops. Of course, the interfering from anthropological activities should not be ignored with climate warming.
Nevertheless, overall the retrieved phenological dates for the three crops are in strong correspondence with the observational dates (R 2 > 0.95) and their relationships are statistically significant (p < 0.01). Meanwhile, the growing status of other plants 220 (or rotation crops e.g. wheat-maize, maize-soybean) and the influence of other noises will lead to deviations of the remotesensing LAI curve and the actual observed curve in the field. The noises also include other factors, e.g. weather conditions, farmers' behaviors, etc... However, the uncertainty does not exclude the applicability of our method to retrieve key phenological stages of crops, especially retrieving relative higher resolution phenological information based on mature remote-sensing products at a large spatial scale. 225

Spatiotemporal patterns of key phenological stages from 2000 to 2015
We showed the annual averages of each key crop growth stage to indicate their spatiotemporal patterns due to the similarity in inter-annual patterns for a certain crop over the 16 years (Figs.6-7, Table 3, and Fig. S1-S4). Besides summarizing the key stages by crops and sub-regions, we also calculated three crop growth periods, including VGP (vegetative growth period), RGP (reproductive growth period) and GP W (whole growth period) to interpret their patterns (Fig.8, Table 3). Among the five 230 sub-regions with rice cultivation, the sub-region III was the most complex because three types of rice were cultivated there ( Fig.6-a; Fig.7-a). The single-rice in the sub-region III was generally cultivated in the northern parts of four provinces (i.e., Anhui, Jiangsu, Zhejiang and Hubei), which was characterized by three key stages occurring latest (DOY 159~265) than other three single-rice sub-regions (Ⅰ, Ⅱ, Ⅳ). Moving from the south (Ⅳ) to north (Ⅰ) (excluding the sub-region III because of cultivation of double-rice), single-rice wasn't transplanted in sequence as expected. In the sub-region Ⅱ, it was transplanted 235 latest (DOY 154) but had relatively early maturity dates (DOY 255), resulting in the shortest growing period (101 days) (Fig.8a). On the contrary, in the sub-region Ⅳ, single-rice was transplanted earliest but had the maturity occurring latest, resulting in the longest growing period of 130 days (Figs.6-a, 7-a, 8-a). In the sub-region Ⅴ where only double rice was cultivated, earlyrice was transplanted earlier (DOY 99), maturity dates of late rice occurred later (DOY 310), and consequently resulting in longer growing periods (97 and 101 days for corresponding early and late-rice) than those in the sub-region III with double-240 rice cultivation (80 and 84 days) ( Table 3).
As for wheat, green-up & emergence dates ranged most widely (DOY 30~128) than other crops. Winter wheat in the subregions Ⅱ and Ⅳ had earlier green-up dates, while spring wheat in the sub-regions of Ⅰ and III had later emergence dates (Table   3, Figs.6-b, 7-b). Moreover, along with the latitudes from the north to south (excluding the sub-region III because of the sparsest wheat cultivation there), the first key dates became earlier but with shorter growth periods (106, 93, 92 days for Ⅰ, Ⅱ 245 and Ⅳ) due to the sufficient temperature and light in the sub-region Ⅳ (Yu et al., 2012) (Fig.8-b). Interestingly, the heading and maturity dates in the three sub-regions showed consistently the same spatial patterns as that of the first stage with latitudes decreasing (Figs.6-b, 7-b).
Both spring and summer maize types were concurrently cultivated in the sub-regions III and Ⅳ, while only one of them was cultivated in the sub-regions Ⅰ and Ⅱ, the main planting areas of northern China (Figs. 6-c, Fig.7-c, Table 3). V3 of summer 250 maize was approximately 43 days later than that of spring maize (DOY 161 vs. 117), but their maturity dates were very close (DOY 259 vs. 245), which thus caused a shorter growth period for summer maize, especially for the sub-regions Ⅱ and III (some 84 days) (Fig.8-c). Additionally, in three sub-regions (Ⅰ, III, Ⅳ) for spring maize, like wheat, the spatial patterns of the three key stages for maize were similar in spatial patterns with latitude increasing. Finally, the key dates and periods were the most variable in the sub-region Ⅳ . 255 In sum, the spatial patterns of key phenological stages varied by crops and cultivated ways. In addition, early rice and single cropping rice in the sub-region Ⅲ, wheat in the sub-region Ⅲ, and maize and rice in the sub-region Ⅳ showed a larger variability than others due to the mixed planting of heterogeneous varieties of the same crop. Many factors could have impacts on crop phenology, such as climate, environment, farmer's behaviors, technological development, and human activities (Liu et al., 2016;Liu et al., 2018). Different from natural ecosystems such as wild forest or grassland, three main crops cultivated 260 across the mainland of China didn't reach greening-up or flowering dates in sequence with latitude, especially for rice Tao et al., 2014;Zhang et al., 2014b). Moreover, climate conditions did have impacts on crop phenology. For example, increased temperature had advanced heading and maturity date of crops in China Tao et al., 2014).
At the same time, crop management activities, such as cultivar shift and the adjustment of planting and harvesting date, had affected crop phenology largely (Tao et al., 2006;Tao et al., 2013). 265

The changes in three key phenological dates and growth periods from 2000 to 2015
To interpret the changes of the three key phenological dates and growth periods from 2000 to 2015, we analyzed their trends at pixel scale and summarized the grids with a significant trend (p<0.1) according to crops and sub-regions (Figs 9-10, Table   4). We found more positive trends, with 0.78 days/year for 70% summarized medians, but fewer negative ones, with -0.69 days/year and 30% medians. This suggests that phenological dates have been delayed. Specifically, the proportion of pixels 270 that had a positive trend is 92% for wheat ( Fig.9-b), 75% for rice ( Fig.9-a), and 50% for maize (Fig.9-c).
For rice, transplanting dates were consistently advanced by -0.64 days/year for early rice and single-rice, and delayed by 0.84 days/year for late rice in most areas. Maturity dates became later by 1.23 days/year, but heading dates had less changes. In addition, double rice in the sub-region III showed less variable than that in the sub-region Ⅴ ( Fig.9-a). By contrast, the first stages (i.e., green-up & emergence) delayed by 0.88 days/year consistently for almost all the wheat cultivation areas ( Fig.9-b). 275 Maize in the sub-region Ⅱ, and wheat in the sub-region Ⅰ and Ⅱ ( Fig.9-b), had an opposite trend to that of rice (Fig.9, Table 4).
Moreover, the changes in the three stages showed less variable in the sub-region Ⅱ, the main planting areas for both dryland crops. Among all the crops and growth stages, maize in the regions III and Ⅳ had consistently negative trends with exception of maturity dates in the sub-region Ⅳ.
Compared with the significant changes in phenological dates, the duration of phenological periods changed in less pixels (< 280 30%) (Table 4). More pixels with positive trends, with 1.25 days/year for 66.7% medians, were identified than those with negative trends, with -0.97 days/year for 32.3% medians, implying a commonly prolonged growth periods during the study period. 95.8% of the medians were positive for rice, while 75% of the medians were negative for wheat. The changes of maize growth periods were similar to those of its phenological dates.
The duration of growth periods was prolonged, especially for the whole growth period (GPw), which was consistently observed 285 for rice cropping systems, except for early rice in the sub-region Ⅴ. In addition, the duration of VGP for single rice in the subregion Ⅰ had weaker trends ( Fig. 10-a). On the contrary, almost all the wheat growth periods were shortened except for winter wheat in the sub-region Ⅴ, especially for spring wheat in the sub-region Ⅰ ( Fig.10-b). Additionally, in term of growth period duration, maize had the similar changes as wheat in the sub-regions Ⅰ and Ⅱ. Changes in growth period duration were different for spring (shortened) and winter (prolonged) wheat, and for both maize types between the sub-region III and Ⅳ (Fig.10-c). 290 The results are well supported by some previous studies based on the intensive observations at site scale Tao et al., 2014;Tao et al., 2012;Zhang et al., 2014b).

Uncertainties of ChinaCropPhen1km
Inevitably, there are still some uncertainties in the generated dataset (i.e., ChinaCropPhen1km). First of all, the uncertainties of the GLASS LAI products have a relatively greater impact on ChinaCropPhen1km. One is that its quality is closely related 295 to that of the input MODIS Surface Reflectance product (MOD09A1), which is affected by many factors such as cloud, snow, aerosols, and water vapor (Xiao et al., 2014). That is to say, the noise of original LAI time-series due to these factors could introduce uncertainties in the detection of phenological stages. Therefore, a smoothing method is necessary to minimize the influence of noise. Nevertheless, the selection of the smoothing method could affect the accuracy of the phenology detected from the reconstructed time series (Verger et al., 2016). Thus, we compared several common used smoothing methods and 300 chose the most suitable method with minimum RMSE for different crops in each province. The other is that the GLASS LAI retrieval algorithm can eliminate abrupt spikes and dips, which may result in the loss of neighboring smaller peaks in LAI profiles (Xiao et al., 2016). Therefore, the number of detected pixels with double-season rice cultivated may be less than that of the actual situation due to the short interval between the two local maximum points (i.e., heading stages of early rice and late rice). Secondly, the use of NLCD dataset can be another source of uncertainty. In this study, the spatial distribution of 305 each crop type was determined based on the dryland crop (i.e., maize and wheat) and paddy rice mask, which was derived from the dryland and paddy field layer of NLCD, respectively. Since the inclusion of several crop types for the dryland layer and the omission or commission error of the land cover types can lead to uncertainties, the unavailable crop-specific map is promising to improve the accuracy of retrieved phenological dates, Finally, the coarse spatial resolution of 1 km could bring uncertainty in the results as several land cover or crop types could be contained within a 1-km pixel and then the identified 310 signal of the specific phenological stages might be weakened, especially in the mountainous regions (e.g., southern China) with complex terrain and consequently diverse vegetation types. In future studies, the application of remote sensing products with finer spatial resolution could be expected to solve the mixed pixel issues.

Conclusion
In the present study, we proposed a method to retrieve 1km-grid crop phenological dataset for three main crops from 2000 to 2015 based on GLASS LAI products. First, we compared three common smoothing methods and chose the most suitable methods for different crops and regions. The results showed that S-G was the most frequently chosen method as it not only 320 could well smooth the time series but also keep the fidelity. Next, we developed an OFP approach which combined both inflexion-and threshold-based method to detect the key phenological stages of three staple crops at spatial resolution of 1km across China. Finally, we established a high resolution gridded-phenology product for three staple crops in China during 2000-2015, i.e., ChinaCropPhen1km.
The ChinaCropPhen1km dataset has been well validated using the intensive phenological observations of AMS. It can reflect 325 the spatial differences in the local climatic and management factors. Thus, this first high-resolution crop phenological dataset can be applied for many purposes, including understanding land surface phenological dynamics, investigating climate change impacts and adaptations, and improving agricultural system or earth system modelling over a large area, temporally and spatially.

Author contribution 330
Zhao Zhang and Yi Chen designed the research. Yi Chen and Ziyue Li collected datasets. Yuchuan Luo implemented the research and wrote the paper; Zhao Zhang and Fulu Tao revised the manuscript.

Competing interests
The authors declare that they have no conflict of interest.

Winter wheat
Spring wheat Summer maize Spring maize Single-rice Double rice  dates; RGP means reproductive growth period, the difference between maturity and heading dates; GP w means whole growth period, the difference between maturity and transplanting/Green up & Emergence/V3 dates. The numbers in the parentheses mean the annual mean growth periods. 525 Note: The same meanings for VGP, RGP and GP w as Table 3; Psg (%) means the percent of grids showing significant trend at p<0.1 level; The numbers in the parentheses mean the statistic values of grids during three growth periods.