Annual oil palm plantation maps in Malaysia and Indonesia from 2001 to 2016

Increasing global demand of vegetable oils and biofuels results in significant oil palm expansion in Southeast Asia, predominately in Malaysia and Indonesia. The land conversion to oil palm plantations poses risks to deforestation (50% of the oil palm was taken from forest during 1990-2005, (Koh and Wilcove, 2008)), loss of biodiversity, and greenhouse gas emission over the past decades. Quantifying the consequences of oil palm expansion requires fine scale and frequently updated datasets of land cover dynamics. Previous studies focused on total changes for a multi-year interval without identifying the exact time 15 of conversion, causing uncertainty in the timing of carbon emission estimates from land cover change. Using Advanced Land Observing Satellite (ALOS) Phased Array Type L-band Synthetic Aperture Radar (PALSAR), ALOS-2 PALSAR-2 and Moderate Resolution Imaging Spectroradiometer (MODIS) datasets, we produced an Annual Oil Palm Area Dataset (AOPD) at 100-meter resolution in Malaysia and Indonesia from 2001 to 2016. We first mapped the oil palm extent using PALSAR/PALSAR-2 data for 2007-2010 and 2015-2016 and then applied a disturbance and recovery algorithm (BFAST) to 20 detect land cover change time-points using MODIS data during the years without PALSAR data (2011-2014 and 2001-2006). The new oil palm land cover maps are assessed to have an accuracy of 86.61% in the mapping step (2007-2010 and 20152016). During the intervening years when MODIS data are used, 75.74% of the change detected time matched the timing of actual conversion using Google Earth and Landsat images. The AOPD dataset revealed spatiotemporal oil palm dynamics every year and shows that plantations expanded from 2.59 to 6.39 M ha and from 3.00 to 12.66 M ha in Malaysia and Indonesia, 25 respectively (i.e., a net increase of 146.60% and 322.46%) between 2001 and 2016. The higher trends from our dataset are consistent with those from the national inventories with limited annual average difference in Malaysia (0.2 M ha) and Indonesia (-0.17 M ha). We highlight the capability of combining multiple resolution radar and optical satellite datasets in annual plantation mapping at large extent using image classification and statistical boundary-based change detection to achieve long time-series. The consistent characterization of oil palm dynamics can be further used in downstream applications. The annual 30 oil palm plantation maps from 2001 to 2016 at 100 m resolution is published in the Tagged Image File Format with georeferencing information (GeoTIFF) at https://doi.org/10.5281/zenodo.3467071.


Introduction
The global demand for vegetable oil and its derivative products calls for an increase in palm oil production leading to oil palm expansion and intensification in Southeast Asia . According to the Food and Agriculture Organization 35 (FAO), Malaysia and Indonesia account for 81.90% of the global oil palm fruit production in 2017, an increase by 179.72% from 2000 to 2017 (see http://faostat.fao.org) that is projected to continue in the future (Murphy, 2014). The boom of oil palm industries caused and also raised the deforestation risks (Austin et al., 2018;Vijay et al., 2018). In Malaysia and Indonesia, more than 50% of the oil palm plantation was converted from forest during 1990-2005(Koh and Wilcove, 2008 and industrial plantation dominated by oil palm (72.5% of all plantations) caused a ~60% decrease of peatland forest from 2007 to 2015 40 (Miettinen et al., 2016). A series of consequences include but not limited in biodiversity decline (Fitzherbert et al., 2008), peatland loss (Koh et al., 2011) and carbon emission (Guillaume et al., 2018).
Quantifying the spatiotemporal details of oil palm expansion is important to understand the deforestation process and its impacts on ecosystems services and promote progress in environmental governance and policy decisions (Gibbs et al., 2010;Koh and Wilcove, 2008). However, annual information on the expansion of oil palm plantations is poorly documented in 45 Malaysia and Indonesia. The statistical records (e.g., FAO, United States Department of Agriculture (USDA)) give neither the detailed spatial distribution nor the young oil palm trees and small-holder plantations. Many efforts have been made to characterize the oil palm extent (Cheng et al., 2018;Gaveau et al., 2016;Miettinen et al., 2017). For example, the Roundtable on Sustainable Palm Oil (RSPO), whose members manage 1/3 of the world's oil palm, provided spatial information on oil palm distribution in Malaysia and Indonesia (Gunarso, 2013). The continuous mapping of oil palm on peatland in 1990, 2000, 50 2007 and 2010 described the dynamic change of oil palm on peat during the past 30 years (Miettinen et al., 2012). But these maps are given for a certain year or several time phases without capturing the exact time of oil palm changes. Dynamic global vegetation models use gross land-use change and thus require high-resolution grid-cell-based annual oil palm conversion maps rather than country-level inventories and bi-decadal land cover maps (Yue et al., 2018a;Yue et al., 2018b). Lack of continuous change information may cause wrong interpretation of land cover change time and significant bias in global carbon dynamic 55 studies (Zhao and Liu, 2014;Zhao et al., 2009). As a result, oil palm plantation maps at high temporal and spatial resolutions in Malaysia and Indonesia are urgently needed.
Remote sensing has been used in oil palm monitoring since 1990s. Progress has been made in oil palm mapping and change detection, including 1) data sources from optical satellite earth observations (Lee et al., 2016;Srestasathiern and Rakwatin, 2014) to microwave datasets such as Phased Array Type L-band Synthetic Aperture Radar (PALSAR) (Cheng et al., 2018;60 Dong et al., 2015), 2) spatiotemporal resolutions from regional to national scale (Miettinen et al., 2017) and from single to multi-decadal mapping (Gaveau et al., 2016;Miettinen et al., 2016), 3) interpretation methods from manual to semi-and fully automatic identification (Baklanov et al., 2018;Cheng et al., 2019;Li et al., 2017a;Mubin et al., 2019;Ordway et al., 2019), 4) products going from oil palm land cover maps to more detailed datasets on plantation structure, e.g. tree counting (Li et al., 3 focused on the continuous oil palm change detection (Carlson et al., 2013;Gaveau et al., 2016;Vijay et al., 2018). These studies adopted visual or semi-automatic interpretation for oil palm plantation, which is labor-extensive and not appropriate for long-term annual oil palm plantation monitoring. Automatic identification can overcome this difficulty by using classification algorithms based on Landsat and PALSAR/PALSAR-2 data, which were successfully applied to produce the 2015 land cover map of insular Southeast Asia with discrimination of oil palm plantation (Miettinen et al., 2017). So far, 70 however, the annual dynamics of oil palm plantations (expansion and shrinkage) remains unquantified for Malaysia and Indonesia.
The annual oil palm mapping in tropical areas such as insular South-East Asia is a challenge due to the persistent cloudy conditions (Gong et al., 2013;Yu et al., 2013). Multi-temporal optical images can help reduce cloud effects  but it is still difficult to obtain effective optical observations in Malaysia and Indonesia (51.88% of the region is without annual 75 Landsat images, Figure S1). Microwave remote sensing is not affected by clouds, and is considered to be the most efficient source in separating forested vegetation and oil palms (Ibharim et al., 2015;Teng et al., 2015). The long-time span of 25 m resolution PALSAR/PALSAR-2 data provides opportunities for mapping oil palm at high spatiotemporal resolutions. Recently the PALSAR/PALSAR-2 data have been successfully used in charactering oil palm change for the whole Malaysia for six years using PALSAR (2007)(2008)(2009)(2010) and PALSAR-2 (2015PALSAR-2 ( -2016  . However, the gap years (2011-2014) 80 between PALSAR and PALSAR-2 hampered continuous tracking of oil palm plantation dynamics. One potential way to achieve annual mapping is to use optical earth observation data e.g., Landsat images for the PALSAR gap period (Chen et al., 2018;Shen et al., 2019). However, this requires abundant Landsat images (>4) (Xu et al., 2018a) that are not available in the humid tropical regions and may cause "false changes" and "inter-annual inconsistency" (Broich et al., 2011). Recently, a superresolution mapping method (Li et al., 2017b;Qin et al., 2017;Xu et al., 2017) was used to reconstruct missing forest cover 85 change during 2011-2014 (Zhang et al., 2019) by fusing the PALSAR/PALSAR-2 and the MODIS normalized difference vegetation index (NDVI) with dense temporal resolution and phenological information. However, it is difficult to separate oil palm and natural forest with similar NDVI variation using such classification-based fusion. A new approach based on change detection in a given period using time-series observations (i.e., MODIS NDVI, GIMMS NDVI) was successfully applied to fill the data-missing years in developing a nominal 30 m annual China land use and land cover dataset (Xu et al., in review). 90 This approach takes advantage of dense observations by detecting break points in a time-series using change detection algorithms, combined with the pre-knowledge from the mapped years and thus reduces the inter-annual inconsistency.
The objectives of this study are (i) to develop a robust and consistent approach capable of detecting annual oil palm changes in Southeast Asia using multiple remote sensing datasets based on image classification and breakpoint detection, (ii) to produce a nominal 100 m annual oil palm plantation dataset (AOPD) in Malaysia andIndonesia from 2001 to 2016, and(iii) to quantify 95 the spatial and temporal patterns of oil-palm change dynamics since 2001. Specifically, we developed the annual oil palm plantation dataset in Malaysia and Indonesia by using a two-stage method. The first step is random forest-based image classification using PALSAR during 2007-2010 and PALSAR-2 data during 2015-2016 (the periods with PALSAR/PALSAR-2 data available). Combined with the oil palm maps produced in the first step during the years with PALSAR coverage, MODIS 2010a), to fill the data-gap years (2011)(2012)(2013)(2014) outside the PALSAR years and extend the oil palm land cover mapping period back to 2001. Oil palm in this study refers to both young and mature oil palm trees from industrial plantation and smallholders with the minimum size of 1 ha (oil palm smallholders is defined as 50 hectares or less of cultivated land producing palm oil controlled by smallholder farmers (the definition used by the RSPO) with an average of 2 ha (Bank, 2010)).

Study area
Insular South-East Asia was originally occupied by evergreen moist tropical forest, which is one of the most biologically diverse terrestrial ecosystem on Earth. The natural environment, with humid tropical climates and low-lying topography, is suitable for the oil palm (Elaeis guineensis) (Fitzherbert et al., 2008). Since 1911 when the first commercially oil palm plantation in Southeast Asia settled in Sumatra, oil palm plantation expanded rapidly in Sumatra and peninsular Malaysia and 110 then spread to Sarawak and Sabah in Malaysia and Kalimantan in Indonesia (Corley and Tinker, 2008). Industrial oil palm plantations spurred the economic sectors in Southeast Asian countries but also raised concerns on the negative social and environmental impacts (Obidzinski et al., 2012;Sayer et al., 2012). Recently, oil palm plantations expansion became one of the dominant drivers of deforestation in Malaysia and Indonesia (Austin et al., 2018;Gaveau et al., 2016). Thus, we chose as a study area the whole Malaysia, Sumatra and Kalimantan in Indonesia, encompassing 96% of the total oil palm production in 115 Indonesia (Petrenko et al., 2016). Oil palm plantations in these two countries account for 67.51% of world's total oil palm plantation area (FAOSTAT, 2017), and dramatic land cover conversion happened in this region due to human induced modifications.

Overview of the AOPD producing
The development of AOPD includes two major stages: 1) oil palm mapping using PALSAR/PALSAR-2 data (Section 2.3) 120 and 2) change-detection based oil palm updating using MODIS NDVI during the gap years in operation between ALOS and ALOS-2 (Section 2.4). The first stage aimed at producing the oil palm maps for using PALSAR and 2015 using PALSAR-2 datasets. The detailed procedures include the pre-process of the original PALSAR/PALSAR-2 data, training sample collection and image classification and final production of oil palm maps for the target years after postprocessing using ancillary datasets. In the second stage, we combined oil palm maps produced in the first stage with MODIS 125 NDVI data. Time series of MODIS NDVI data and change maps were prepared in the data preparation step, followed by the breakpoint test using change-detection algorithm, BFAST to detect the change year (change from other land cover types to oil palm and the reverse) in the PALSAR/PALSAR-2 data missing period. After the post-processing, we derived the oil palm maps in these gap years and traced the oil palm distribution back to 2001. Combining the results from the two stages we obtained the annual oil palm plantation maps from 2001 to 2016 at 100 m spatial resolution, forming the AOPD dataset. The 130 whole workflow is shown in Figure 1.

PALSAR/PALSAR-2 product and data preparation
We used multi-source remote sensing images to fully cover the whole study period including ALOS PALSAR, ALOS-2 PALSAR-2 and MODIS NDVI. The Landsat archives were not used because of the low data availability in this region caused 135 by frequent thick cloud cover ( Figure S1). Japan Aerospace Exploration Agency (JAXA) provided the 25 m resolution global PALSAR/PALSAR-2 mosaic by mosaicking SAR images of backscattering coefficient (http://www.eorc.jaxa.jp/ALOS/en/palsar_fnf/data/index.htm).
Although the product was compiled at an annual frequency, one product a year is sufficient to identify the oil palm changes since oil palm is a perennial crop without significant phenological variations in the tropics. To cover the whole study area, 15 140 patches of 5°×5° PALSAR/PALSAR-2 grids for six years  from PALSAR-2) were used. Since ALOS satellite stopped working in 2011, no data was available between 2011 and 2014 until the operation of ALOS-2. The product contains data of HH (i.e. horizontal transmit and horizontal receive) and HV (i.e. horizontal transmit and vertical receive) digital numbers ( ) acquired by PALSAR/PALSAR-2 in Fine Beam Dual (FBD) mode with orthorectification and topographic correction. For PALSAR/PALSAR-2, HH and HV DN values were converted to normalized 145 backscattering coefficients (unit: decibel (dB)) using the following Eq. (1) formula (Rosenqvist et al., 2007): where is a calibration factor (−83.0 dB) in PALSAR/PALSAR-2 data (Shimada et al., 2009). Two additional layers, and , were produced by calculating the ratio and difference from and of decibels as followings Eq.
(2) and Eq. (3): 150 Although the ALOS PALSAR and ALOS-2 PALSAR-2 have different satellite microwave sensor properties (e.g., frequency, off-nadir angle), the backscatter signals are relatively stable for the given period (2007-2010 and 2015-2016) as seen by comparing the distribution of backscattering values (HH and HV) of 250000 randomly generated pixels (using ArcGIS 10.3) 155 in the study area between different years (see Figure S2). The similar findings for the stability of PALSAR/PALSAR-2 data was also given in previous studies Qin et al., 2017). Meanwhile, the HH and HV values for oil palm and forest is also shown in Figure S3 and indicate the separability between the two land cover types for both PALSAR/PALSAR-the study period. One problem of using PALSAR/PALSAR-2 data, however, is the "salt and pepper" noise (Zhang et al., 2019), 160 which may cause misclassification and false changes in the subsequent process. Previous studies showed that the resampling method reached higher accuracy and better visual results in oil palm mapping compared to the commonly used filter method (Cheng et al., 2018). The identification and area estimation of oil palm plantations have also been proven to perform better at 100 m resolution (Cheng et al., 2018). Therefore, we resampled the original 25 m PALSAR/PALSAR-2 images to 100 m resolution for every year to reduce "salt and pepper" noise. 165

Training sample collection and image classification
In this study, a multi-year training sample set (2007-2010, 2015 and 2016) was used to map the oil palm extent in Indonesia and Malaysia from 2007 to 2016. We used the training sample set for Malaysia from our previous study (Cheng et al., 2017) and interpreted the training datasets for Indonesia using the same interpretation method. The sample collection was mainly based on the high-resolution (<1m) images from Google Earth with the assistance of PALSAR/PALSAR-2 images. We first Four land cover types in this training sample set were included: oil palm (mature and young oil palm-identified by the canopy shape using very high-resolution images from Google Earth), water, other vegetation (forest, shrubland and other plantations 175 such as rubber), and others (impervious, cropland and bare land). Mixed land cover types were found in "other vegetation" and "others" because it is difficult to further separate these types within the categories. The detailed distribution of training data is presented in Table 1. Other vegetation types consist of ~52.9% of the total sample, secondly ranked the oil palm samples (26.7%), while "others" and water types only account for ~20.4% of the total training samples, which is consistent with the real land cover distribution. 180 Thereafter, we used a random forest (RF) classifier in the image classification step. The HH and HV digital number of decibels, the derived difference (HH-HV) and ratio (HH/HV) images were all used as inputs to the RF classifier to derive the original annual oil palm maps for the six years. The MODIS NDVI is not used as input to RF model for classification because of the similarity between tropical forest and oil palm and the coarse resolution which may negate the benefits of our classification based on higher spatial resolution PALSAR data. 185

Post-processing and oil palm map
Post-processing after the initial results is necessary because of the limitation in training set, unavoidable classification errors and the difficulty in describing heterogeneous real land surface. To obtain reliable oil palm dataset, we adopted several steps including mode filtering, terrain filtering, intact forest and mangrove filter in post-process to improve the final oil palm maps in stage 1 for 2007, 2008, 2009, 2010, 2015 and 2016. 7 Mode filtering is used to filter the very small patches (mainly single pixel) in the initial results since it is more likely to be errors or noise induced by PALSAR/PALSAR-2 data rather than real oil palm plantation. The topographic factor such as slope and elevation will cause the confusion of backscattering signals from satellite sensors, particularly in area with undulating terrain. Therefore, we applied terrain filter to reduce the confusion by topographic factor using the Shuttle Radar Topography Mission (SRTM) 30-m digital elevation model (DEM). The altitude threshold of 1000 m was applied since the oil palm is 195 mainly distributed in lowland (mostly <300 m) and regions higher than 1000 m are not suitable for oil palm cultivation (Austin et al. 2015;Carlson et al. 2013;Corley and Tinker 2008). Subsequently, we used two additional layers, intact forest landscape (IFL) in 2016 from (Potapov et al., 2008) and the Global Mangrove Atlas (GMA, available at: http://geodata.grid.unep.ch/results.php) to filter out non-oil palm areas and reduce the misclassification. The intact forest map denotes natural forest ecosystem without human caused disturbances where oil palm plantation is not supposed to be cultivated. 200 The mangrove swamp area is subsequently flooded by sea water, which is not suitable for oil palm cultivation due to the significant negative impact on the fresh fruit bunch and oil production (Henry and Wan, 2012).
Another problem when developing oil palm maps is the replantation of oil palm trees. Oil palm has a long-life cycle of 25 to 30 years. After that, the trees will be cleared and replaced because of a decrease in palm oil yield (Röll et al., 2015). However, from the satellite observations, the land cover type is bare land at the time of oil palm logging whereas the land use property 205 remains unchanged as oil palm plantation backwards and forwards. Given the limitation of satellite observation, we provided two versions of our oil palm datasets. The first version is the oil palm datasets after the post-processing mentioned above. Here replantation is not considered, and this version includes conversion from other land cover types to oil palm (oil palm expansion) as well as the opposite one (oil palm shrinkage). In the second version, we assumed that oil palm expansion is a unidirectional activity due to the growing demand of palm oil. The time-series filtering was conducted by using the 2007 oil palm extent to 210 filter all pixels classified as "non-oil palm" in the subsequent years. As a result, this version of the oil palm plantation dataset has continuously expanding areas from 2007 to 2016. The second version includes the impact of oil palm replantation and the thriving oil palm industry in South-East Asian countries but ignored any possible decrease of oil palm (e.g. abandonment, conversion to cropland) in some areas.

MODIS NDVI is an important index of vegetation conditions and has been widely used in vegetation and land cover change
studies (Clark et al., 2010;Ding et al., 2016;Estel et al., 2015). NDVI in the recent updated MODIS vegetation index data (MOD13Q1) collection 6 from 2000-2007 and from 2010-2015 (downloaded from https://lpdaac.usgs.gov/) was used to fill the gap years (2000-2006 and 2011-2014) of PALSAR/PALSAR-2 datasets using change detection algorithms. The 220 MOD13Q1 product has a spatial resolution of 250 m and is composited every 16 days. In total, 6 MODIS tiles with 23 scenes per year (181 and 138 scenes for the two study periods, 2000-2007, P1 and 2010-2015, P2) were required to cover the study area (h27v08, h27v09, h28v08, h28v09, h29v08 and h29v09). All the MODIS images were projected from its original sinusoidal projection to a geographic grid with a WGS 1984 spheroid and resized to 100 m to match the resolution of the oil palm maps using the nearest neighbor resampling approach. The pixel quality and reliability layers in the MOD13Q1 product 225 were used to further exclude the poor-quality pixels. During the whole study period, 53.64% of the observations have good quality while 46.36% were interpolated using spline interpolation. For those pixels with less than 30 good-quality observations (4.79% in P1 and 9.64% in P2 of the total change area), we didn't apply the BFAST algorithm. For the remaining area, 61.67% (P1) and 58.24% (P2) of pixels had 12 (~50% in 23) good-quality observations annually.
A change map for the microwave data gap period between PALSAR and PALSAR-2 (2011-2014) was extracted using the 230 change pixels in 2010 and 2015 oil palm maps with spatial locations and "from-to" types. Here, we assumed the change from classification was reliable because of the high resolution of PALSAR data. We then sought the exact change year within the intervals in the next step (Section 2.4.2) using temporal NDVI files extracted from each change pixel. Frequent changes such as two or three shifts during the gap years were assumed to be of low probability and thus not considered in this study. For the period during 2001-2006 without PALSAR/PALSAR-2 data and oil palm distribution in 2000, we assumed a unidirectional 235 expansion of oil palm and the oil palm extent in 2007 was used as the potential change regions in the next step. In total, we derived two versions of change maps (one with bi-directional change and the other with only unidirectional oil palm expansion) for the two periods.

Breakpoint test using change-detection algorithm, BFAST
Change detection analysis was conducted in the change pixels derived from the last step to identify the exact change time 240 within the two periods (2011-2014 and 2001-2006) based on the time-series MODIS NDVI from 2010 to 2015 and 2000 to 2007, respectively. Here we aimed to capture an abrupt NDVI changes (breakpoints) in the two given periods, which is assumed to be caused by the conversion of the original land cover type to the oil palm cultivation. Many change detection algorithms and their derivatives have been developed in recent years to detect subtle or abrupt changes in a dense time-series satellite profiles (Broich et al., 2011;Kennedy et al., 2010;Verbesselt et al., 2010b). Most of these algorithms were applied in 245 forest change monitoring and all reach high consistency in detecting significant change (Cohen et al., 2017). A recent algorithm, Bayesian Estimator of Abrupt change, Seasonal change, and Trend (BEAST), aggregating the competing models than the conventional single-best-model, performed well in capturing multiple and subtle phenological changes (Zhao et al., 2019b).
Here we used BFAST to capture the oil palm conversion time within the two study periods (2011-2014 and 2001-2006).
BFAST has been successfully applied in monitoring forest disturbance and regrowth and has proved robust with different 250 sensors (DeVries et al., 2015;Verbesselt et al., 2012). Based on the structural change methods, the BFAST algorithm is able to find the structural breakpoint between different segments in the observation time series (DeVries et al., 2015), and thus can be used to detect the time and number of abrupt or gradual changes as well as to characterize the magnitude and direction. The BFAST method decomposes the time series into trend, seasonality, and residuals sections (Verbesselt et al., 2010b). The model can be expressed as Eq. (4): 255 where C is the observed value at time , C is the trend section, C is the seasonal section and C is the noise section.
An ordinary least square residuals-based moving sum test (Zeileis, 2005) was used to test whether breakpoints occurred in the trend or seasonal components. Then, test was conducted to determine the number and optimal position of the breaks using Bayesian Information Criteria and the minimum of the residual sum of squares. The trend and seasonal coefficients were then 260 computed using robust regression. A harmonic seasonality model (with three harmonic terms) was used to describe the seasonality of the satellite data (Eq. 6) (Verbesselt et al., 2010b). For each piecewise linear ( C ) from I * to IK0 * where 0 * , … , L * is the assumed break points which defines the p+1 segment, C can be expressed as follows: where is the index of the breaks, i=1, … , . I and I are the intercept and slope of the fitted piecewise linear model. 265 For the 0 # , … , R # seasonal break points, C is the harmonic model for S # to SK0 # : where, = 1, … . is the number of harmonic terms in the periodic model (default value = 3); S,U is the amplitude; is the frequency; S,U is the time phase. For the MODIS NDVI used in this study, the value was 23 (i.e. 23 observations of MODIS observations per year) (Verbesselt et al., 2010b). Here, the maximum number of breaks was artificially set to 1 because of the 270 assumption of one time change for each period based on prior knowledge from the oil palm maps. Figure 2(a) shows two examples of the breakpoint detection of the MODIS NDVI using BFAST algorithm. In the first example, no obvious break detected in the coarse resolution time-series, whereas significant change was captured in the trend section after time-series decomposition in the second example (Figure 2(b)). More details of the BFAST algorithm are referenced in (Verbesselt et al., 2010b;Verbesselt et al., 2012). To evaluate the validity of using coarse MODIS time series in oil palm change detection, we 275 visually interpreted an extra 100 change points based on the PALSAR images from 2007 to 2010 and applied the BFAST algorithms using the MODIS NDVI. An example of the comparison between the BFAST based change results and visual interpretation from PALSAR images was shown in Figure S4. The break time detected from MODIS NDVI showed the similar conversion year compared with the microwave satellite images (a total of 86% agreement with 62% matched the same change year and 24% within a 1-year interval). Moreover, we did a test using the subsequent NDVI fragments to replace the original 280 NDVI fragments after the break detected time and compared the break results to show the robustness of the algorithm considering the effect of oil palm plantation stand age ( Figure S5).

Annual oil palm results updating
The previous steps generated annual oil palm maps for six years (Section 2.3) and the oil palm change time in the missing periods (2011-2014 and 2001-2016) (Section 2.4.1 and 2.4.2). In the final step, all these data were combined to update the 285 continuous oil palm dataset from 2001 to 2016 following Xu et al (under review).
For the gap period from 2011 to 2014, the oil palm updating was based on the "from-to" land cover types (L1 and L2) of the start (t1) and the end years (t2) with the detected change time (ti). Then L2 was allocated between ti and t2 while L1 was assigned before ti (t1 to ti). types. So, the land cover type between 2001 and change time (ti) was classified as non-oil palm, and oil palm was assigned to the period after ti (ti to t2). Thereafter, the oil palm maps between 2001 to 2016 were updated. Quality maps ( Figure S6 and S7) were also generated to indicate the availability of valid NDVI values (i.e., not under cloud cover), the spatial resolution of the 295 dataset used and the consistency of change time detection from different breakpoint test approaches in BFAST algorithms (the ordinary least squares residuals-based MOving SUM test (OLS-MOSUM), the supremum of a set of Lagrange multiplier statistics (SupLM) and Bayesian information criterion test (BIC), (Zeileis, 2005)). The annual oil palm updating process was applied in both the bi-directional and unidirectional versions. And two versions of the oil palm datasets (AOPD-bi and AOPDuni) were developed. 300

Evaluation
Our product of annual oil palm maps, AOPD, was evaluated in three aspects: 1) independent annual oil palm sample set for Malaysia (2007) and Indonesia (2010 to evaluate the annual mapping results for the classified maps using PALSAR/PALSAR-2 data and gap years using change detection method, 2) a change sample set aimed at assessing the accuracy of detected change years and 3) comparison with statistical inventories (e.g., FAO, USDA, Malaysian 305 Palm Oil Board (MPOB) (2011-2016), Badan Pusat Statistik (BPS-Statistics Indonesia) (2011-2015)), the existing oil palm maps from Gaveau et al. (2016) and the Landsat based deforestation maps (Hansen et al., 2013). FAO and USDA agricultural statistical data provided the harvested area of oil palm using data collected by official and unofficial outlets. MPOB is a government agency providing oil palm planted area in Malaysia based on the data reported by state agencies, institutions, private estates and independent smallholders. BPS-Statistics Indonesia, a non-ministry government agency, provided statistical 310 data for public including oil palm planted area compiled from Quarterly (SKB17-Oil Palm) and Annually (SKB17-Annual) Plantation Estate Survey, custom documents from Directorate General of Customs and secondary data from Directorate General of Estate Crops.
Two sets of annual oil palm samples set were used to validate the mapping results in Malaysia and Indonesia according to the sampling protocol of Gong et al. (2013). The independent annual sample set in Malaysia was from the previous studies (Cheng 315 et al., 2019;Cheng et al., 2017). All pixel-based samples were randomly produced in equal-area hexagonal grid (95.98 km 2 for each grid cell), therefore the distribution of the samples among different land cover types has minimum bias with the real land cover composition. All the testing samples were manually checked using high-quality Google Earth (<1 m) at the first round and then double checked by the time-series PALSAR images (25 m) since it is easy to identify the crown of palm trees in the high-resolution datasets and recognize the regular oil palm plantations in the microwave satellite datasets. Once the start 320 and the end of the period is determined of oil palm based on Google Earth images or PALSAR data, the middle years can be checked by the stable spectral/backscatter coefficient information in the continuous PALSAR images. The annual sample set contains ~3000 samples with four land cover types (~16% were oil palm samples) and it covers the whole Malaysia (see the green points in Figure 3, only oil palm samples presented). The second annual Indonesia sample set was developed following the protocol of (Cheng et al., 2017). This sample set contains 7663 samples in total (601 were oil palms and the rest were non-325 oil palm types) during 2010 to 2016 (see the blue points in Figure 3). The details of the number and spatial distribution of validation samples is presented in Figure 3 and Table 2. More information on the randomized sampling method could be referred to Cheng et al., 2017).
The change sample set was developed to evaluate the detected change year by the breakpoint detection analysis. Time lapses of high-resolution imagery from Google Earth covering the change period were used to check the change time detected by the 330 BFAST algorithm. We randomly selected 5000 points (implemented with ArcGIS 10.3 software) in the change area but there were only limited samples (370, 25.07% of the total 1476 oil palm samples) with continuous high-resolution images from Google Earth and cloud-free Landsat time series. We compared our detected change years with the actual oil palm conversion time for these test samples. A confidence interval of ±1 years was used considering uncertainty in visual interpretation of the change time (Dara et al., 2018). Detailed distribution of the testing samples can be seen from Figure 3. 335

Spatial and temporal characterizes of oil palm expansion
The annual changes of oil palm plantations from 2001 to 2016 are shown in Figure 4. The spatial and temporal dynamics of oil palm changes vary in Malaysia and Indonesia. In the study area, most oil palm plantations are located on lowland areas (elevation <250 m, slope <2.5 degree), and few are distributed in gently undulating hills (elevation >500 m, slope >5 degree) 340  (Figure 4(a)). Figure 4 indicate the oil palm changes (expansion and shrink) at early years while the dark colors are the changes in more recent years. Oil palm plantations expanded rapidly during the study period in peninsular Malaysia and 12 Sumatra and Borneo. In Indonesia, rapid expansion first occurred in Sumatra and was then surpassed by Kalimantan (Gunarso, 2013;Petrenko et al., 2016). This can also be observed in our maps where more changes happened in earlier years in Sumatra (lighter colors in Figure 4) and later in Kalimantan (darker colors). The decrease in oil palm plantations was also detected ( Figure 4(b)), although it is difficult to separate the oil palm replantation after one rotation (i.e. still oil palm in land use) from the permanent oil palm loss (i.e. change to other land use types). Compared to the period before 2007 using change-detection 350 in NDVI data, our data product in the gap period of 2011-2014 would be of better quality since the net changes were constrained by the oil palm maps in 2010 and 2015 derived from PALSAR and PALSAR-2 data, respectively. This is because the unidirectional version is temporally filtered based on the assumption of one-way expansion of oil palm plantation, while the bi-directional version considered the conversion from oil palm to other land cover types (Section 2.3.3).

Accuracy assessment 375
The mapping performance of AOPD was evaluated first using independent annual oil palm sample set for 2007, 2008, 2009, 2010, 2015 and 2016. The mapping accuracy from the previously developed datasets over Malaysia  were also compared. The results of the annual accuracy (F-score) with producer accuracy (PA) and user accuracy (UA) are shown in Table 3 and 4. PA shows how correctly the reference samples are classified and indicated the omission error (1-PA) while UA represent what percentage of the classes has been correctly classified and is linked with commission error (1-UA). The 380 average annual accuracy for oil palm areas in Malaysia reached 86.22%, which is 8.27% higher than the annual maps from the previous study . The improvement of the oil palm mapping performance is mainly due to the different postprocessing (one-way expansion and bi-directional oil palm change strategies) and the introduction of the ancillary data (IFL and GMA). Meanwhile, there is no significant difference in the oil palm mapping accuracy among the six years in Malaysia (all above 85% with less than 2% differences, Table 3), indicating the stability and robustness of AOPD. The evaluation using 385 the second annual oil palm sample set in Indonesia shown the average mapping accuracy of 74.20% and the F-score of 0.74 during 2010-2016. The oil palm mapping accuracy was relatively stable during the gap years and the classified years (higher than 72% with 3% fluctuations, Table 4). Figure 6 shows the direct comparison of the change maps with the images from Google Earth and Landsat, which document the change process. We use time lapse of images when the annual high-resolution images from Google Earth were not available.

Comparison of our results with statistics and other products 405
We first compared the oil palm plantation area from our AOPD product with oil palm harvested area from FAO and USDA,  Figure 5(a)). Therefore, the areas from FAO inventory should be used with caution due to the lack of reliable on-field data sources (Ordway et al., 2019).

Compared to FAO and USDA statistics, the annual mean differences from 2001 to 2016 of our results in Malaysia and
Indonesia are positive and amount to 2.00 M ha and 1.18 M ha, respectively. The differences were limited to an average of 0.08 M ha (FAO) and 0.55 M ha (USDA) in Malaysia but were relatively higher in Indonesia (1.88 M ha compared to FAO 415 and 0.60 M ha compared to USDA), probably because of more confusion from other plantations (i.e., coconuts, rubber and Acacia) and / or more smallholder growth in Indonesia (Lee et al., 2014). There are also small differences of oil palm plantation area in comparison with local national statistics: MPOB (average annual difference of 0.20 M ha) and BPS-Statistics Indonesia (-0.17 M ha). These differences only consist 3.14% and 1.37% of the total oil palm plantation area in 2016 in the two countries.
Trends of oil palm expansion in our mapping results (upper and lower boundary lines) are also compared with statistical data 420 (FAO and USDA from 2001, MPOB and BPS-Statistics from 2011 (Table S1). Generally, the overall trends of our mapping results (0.758-0.941 M ha/yr) are higher than the FAO (0.561 M ha/yr) and USDA (0.630 M ha/yr) records during the past 16 years, with larger discrepancy in Malaysia (47.07-59.40% higher than FAO and 39.45-53.55% higher than USGS) than Indonesia (16.84-31.68% higher than FAO and 5.99-22.76% higher than USGS). The higher estimation may be induced by the confusion in other woody plantations such as coconuts and pulp. Although there is high separability between 425 rubber, wattles and palms in PALSAR data , the coconuts which belongs to palm trees and have a fan-like shape showed less differences with oil palm compared to other plantations. Another possible reason is the difference in the oil palm plantation definitions (mature and immature oil palm or only mature oil palm included in FAO inventory). in Indonesia. This is also in consistent with the higher uncertainty in the early period and higher reliability in recent years.
During the study period, the oil palm export price (total export value/export amount, data source: FAOSTAT) rapidly increased from 402.67 dollars/t in 2006 to the peak (1080.72 dollars/t) in 2011 ( Figure S9) but subsequently fell. The crop price is closely related to demand and may further impact the oil palm market and production (Turner et al., 2011). However, although there 440 is a ~10-20% slowdown of the conversion rate, oil palm plantation area continuously increased after 2011. The land conversion to oil palm may also be affected by multiple factors such as agricultural rent, wages and market-mediated effects (such as tax) (Furumo and Aide, 2017;Taheripour et al., 2019), and the relationship between oil palm expansion and price fluctuation still requires further exploration. compare our mapping results. The oil palm plantation in Gaveau's dataset was visually interpreted using Landsat datasets in 1973,1990,1995,2000,2005,2010 and 2015 in Borneo. The overall distribution of oil palm extent in Borneo are similar between our mapping results (the unidirectional version) and the Gaveau's results (Figure 8a and 8b). The differences were scattered across the whole island with more oil palm plantation areas in our results than in Gaveau's results in the south of Borneo (Figure 8c, aggregated to proportional maps at 5 km × 5 km to zoom in the difference). Generally, 7.45, 9.23 and 9.86 450 M ha oil palm plantation area were mapped in AOPD for Borneo during 2010, 2015 and 2016, which is 23.98%, 12.61% and 18.83% larger than the estimates from Gaveau's dataset. Our higher estimation of oil palm plantation area is possibly because some of the smallholder oil palm plantation (1-50 ha in size) is captured in our results whereas only industrial plantations were visually interpreted in Gaveau's results. Misclassification (commission errors) in our results may however also contribute to our estimation being higher. 455 The oil palm concession area for Indonesia and Malaysia (Sarawak) for 2014 from global forest watch (www.globalforestwatch.org) is also used in the comparison. This dataset indicated the boundaries of areas allocated by government to companies for oil palm plantation. The oil palm concession area in Indonesia and Malaysia (Sarawak) for 2014 is 12.98 M ha, which is slightly higher (8.7%) than our mapping results (11.85 M ha). However, since the concession data was compiled from various countries and sources (such as governments and other organizations) with different quality, some 460 location of the existing concessions may be inaccurate (Figure 9(a)) or omitted (Figure 9(b)) compared to our mapping results with PALSAR-2 data. Many concessions are not fully developed (i.e. not planted with oil palm yet) and the number reached more than half of the total 11 M ha (~5.5 M ha) in Sumatra and Kalimantan islands in 2010 (Slette and Wiyono, 2011). Another possible reason for the differences is the inclusion of very small oil palm plantations in our dataset of less than 50 ha, while most of the oil palm concessions (81.71%) were larger than 1000 ha. 465 Oil palm expansion is one of the major drivers of deforestation in the studied region (Austin et al. 2018). Therefore, the forest area loss map from Hansen et al. (2013) was overlaid with the AOPD map, and the results are shown for selected areas in Figure 10. in areas (a) and (c), where the year of oil palm expansion is roughly coincides with the year of forest clearance. In other case such as area (b), a larger discrepancy was found in the two maps because of different causes. For example, forest loss is not always caused by oil palm expansion but timber plantation, logging, fires, conversion from forest to grassland and 470 agriculture (Austin et al., 2018;Kamlun et al., 2016). Meanwhile, expansion of oil palm plantation didn't always occur in forest area, but also in non-forest area. In some regions, the oil palm was planted after the logging of forest immediately (area filled with same color in Figure 10) but in other regions, lands may experience first a forest clearance and then oil palm plantation several years later (indicated by the patches filled with darker color in AOPD than in the forest loss map ( Figure   10)). However, the difference of the spatial resolution (30m vs 100m) may also cause some differences, particularly in 475 smallholder and newly developed oil palms. According to our result, 28.20% of total oil palm expansion area overlapped with Hansen's forest loss area (5.38% with the exact same change time, 15.37% later than forest loss year and the remaining 7.46% earlier than the forest loss time). Among the overlapped area, 19.16% of the area has the same change time, 23.67% in 1-year intervals (may be caused by the time lag between clearance and cultivation), and 38.11% of oil palm expanding areas in AOPD coincide with forest area loss with a lag of at least 2 years. These latter areas may experience first forest clear-cut for other 480 applications or logged and remained unused for several years and then converted to oil palm plantation.

Uncertainty of AOPD
Mapping annual oil palm plantation using remote sensing data in Malaysia and Indonesia is challenging. We developed the first annual oil palm land cover maps (AOPD) from 2001 to 2016 at 100-m resolution combining optical and microwave 485 satellite observations. However, the uncertainties of AOPD, coming from both mapping and change detection, should be acknowledged for the future applications of our dataset. In the mapping procedure, our results showed a good separation between primary forest and oil palm trees but confusion may occur in some impervious area and plantations of other species such as coconuts. As a result, the accuracy of the change detection in the second step was also influenced by the oil palm maps generated from PALSAR/PALSAR-2 data in the first stage. Although oil palm maps for the six years of PALSAR/PALSAR-490 2 data reached high accuracy at nearly 90% in Malaysia and ~75% in Indonesia, inaccurate inputs in some pixels may lead to cumulative errors in the change detection during the PALSAR data gap years, particularly in Indonesia. The oil palm maps during 2001-2006 without "from-to" inputs, therefore, have more biases compared with the results from 2011 to 2014.
Uncertainties could also be induced in the change detection process. Even though the change pixels during the data gap period are constrained by the 100-m oil palm maps from PALSAR before and after that period, the use of moderate resolution MODIS 495 data at 250 m may cause the loss of spatial information and false identification of the change times. Some studies suggested that the fusion of coarse and fine resolution satellite data requires fine resolution images at a certain frequency (Zhang et al., 2017). However, when aiming to conduct consecutive mapping and changes detection, there will always be a trade-off between spatial and temporal resolution (Yin et al., 2018) considering the availability of satellite data such as MODIS and Landsat data (i.e., MODIS has denser observations but coarser spatial resolution than Landsat data). In addition to the satellite data, the 500 change detection algorithm may also bring uncertainties. Because the accuracy of the detected change time by BFAST within a time series is influenced by the signal-to-noise ratio (Verbesselt et al., 2010b), cloud contamination and poor data quality in some regions from MODIS reduced the amount of valid information. And the bias may also be found in the gap years when no breakpoint could be found using BFAST algorithm and the errors were accumulated to years when switching to MODIS before and after PALSAR. However, it is difficult to identify whether the errors are originated from the classification during 505 PALSAR period or the change detection in the gap period. Further improvement could be the use of algorithms which combines the different models (i.e., BEAST) rather than the single-best model (Zhao et al., 2019a). When applying the change detection algorithms, we assumed one-time change in two periods (2001-2007 and 2011-2014). However, multiple changes may occur in the deforestation area when the logging activity is applied first and followed by the replantation of oil palm several years later. More importantly, oil palm will be cut down and replanted after 20 to 25 years for the next rotation in order to make the maximum profits. This would cause confusion with the transitions between oil palm and other land use types. Therefore, we provided two versions of AOPD: one is the original results with bi-directional oil palm area change, and the other is the unidirectional datasets by assuming all the oil palm loss is from rotation and that a loss is followed by a new oil palm plantation.
Despite of these uncertainties, the AOPD annual oil palm maps integrated the strengths of microwave (SAR) and optical satellite observations. SAR has the capability in identifying the oil palm from forest regardless of the weather condition, and 515 MODIS time series has a hyper-temporal density and long-time span. Also, our study gives a good example of integrating fine and coarse datasets. Instead of directly using the coarse dataset, the oil palm maps combined the overall change information for the whole data gap period from fine PALSAR/PALSAR-2 data and the detection of exact change year using coarse MODIS data. In recent years, there is a transition from annual classification to change information mining in remote sensing interpretation to reduce the false changes (Xu et al., 2018b). This method can be used not only in monitoring global oil palm 520 dynamics but also in producing annual land cover maps where only discrete fine resolution observations are available. Since the data scarcity of successive Landsat imagery is common across the world, the algorithm described in this study provides an effective way of combining coarse data to update the annual land cover change. Further, inventory compilation and manual visualization of oil palm change in large extent would remain labour and time consuming (Gaveau et al., 2016;Miettinen et al., 2016;Vijay et al., 2018). Our semi-automatic algorithm in oil palm mapping may thus help to establish a long-term 525 monitoring for oil palm, that can be improved over time with regular validation using ground-based observation or very highresolution images such as Google Earth.

Applications of AOPD
The 100-m annual oil palm maps from AOPD produced in this study can be used in a number of applications. First of all, it can be readily to be used as a cross-validation reference data for other regional oil palm datasets (e.g., FAO inventory). Second, 530 the annual data can be further used to quantify the spatiotemporal characteristics of oil palm change, estimate the annual oil palm yields, identify the potential oil palm planted area and predict the boundary of oil palm expansion in the future and so on. Overlapping the AOPD with forest maps, peatland maps and other land cover maps can give a clue on how the oil palm expansion influences different ecosystems and their carbon balance. For example, oil palm expansion is the largest single driver of deforestation in Indonesia, which contributed to 2.08 M ha of deforestation (23%) in Indonesia from 2001 to 2016 535 (Austin et al. 2018). The protected areas were also at long-term risk of deforestation from oil palm cultivation (Vijay et al., 2018). Previous studies revealed that oil palm directly replaced 3.1 M ha (27%) peatland in Peninsular Malaysia, Sumatra and Borneo from 2007 to 2015 (Miettinen et al., 2016), causing the carbon-rich tropical peatland to a strong carbon source (Miettinen et al. 2017a). AOPD at fine spatiotemporal resolution can also serve as land-use change forcing data in the bookkeeping models (Hansis et al., 2015;Houghton and Nassikas, 2017) and possibly dynamic global vegetation models 540 (DGVM) (Sitch et al., 2015) (provided that those models include a specific PFT to represent oil palm (Fan et al., 2015)) to better simulate the carbon emissions and hydrology dynamics. It would improve the carbon budget greatly in Southeast Asia if DGVMs could systematically simulate biomass, litter and soil carbon changes caused by shifts in oil palm plantation, primary forest, peatlands and fire using accurate and compatible land-use change data.
Another vision lies in the sustainable future of oil palm industry. As the major contributor to the economy that supports 545 thousands of people in the tropical countries, developing oil palm industry has been one of the priorities in these countries (Mahmud et al., 2010;Sayer et al., 2012). At the same time, the possible environmental and ecological consequences of monocultures need to be taken into account for the sustainable development of oil palm industry. For example, Roundtable on Sustainable Palm Oil (RSPO) is established to formulate the standards for the industrial oil palm plantation in South-East Asia, followed by the foundation of Africa Palm Oil Initiative. Voluntary zero-deforestation commitments in the palm oil industry 550 were also implemented since 2010 (Focus, 2016). However, how many and to what extent large corporations will pay real attention to the rights of local populations remains unknown (Barr and Sayer, 2012).
It is crucial to balance between the rural economic development and environmental protection, especially in the regions with high-biodiversity primary forest and carbon-rich peatlands like Southeast Asia. More complete information on oil palm plantation (e.g. spatiotemporal changes of oil palm and its consequences) would help to reduce the disputes and provide 555 strategies for oil palm's sustainable development. Our annual oil palm maps would thus contribute to the policy formulation as well as policy evaluation (e.g. national moratorium on new permits for the oil palm conversion from primary natural forests and peat lands (Busch et al., 2015)).

Data availability
The AOPD in Malaysia and Indonesia from 2001 to 2016 at 100-m resolution are available to the public at 560 https://doi.org/10.5281/zenodo.3467071 . The dataset includes a set of GeoTIFF images in the WGS_1984_World_mercator projected coordinate system. It can be opened/reprocessed in GIS applications (e.g., QGIS, ArcGIS) and other opening computing environment (R, matlab, etc.). Value 1 represents oil palm while value 0 is Null value.
In this study, we used PALSAR/PALSAR-2 and MODIS NDVI datasets to produce AOPD and SRTM DEM, Intact Forest Combining the optical and microwave satellite observations, we developed the first annual oil palm maps (AOPD) in Malaysia and Indonesia from 2001 to 2016 at 100-m resolution using the image classification and change detection analysis. The dataset reached a high accuracy in both annual classification and change-detection. As a result, this dataset provided insights and details on dynamic oil palm changes for Malaysia and Indonesia from the perspective of remote sensing and can serve as a 575 supplement for statistics. Further applications of the dataset include but is not limited to regional carbon studies, water and agricultural management, biodiversity and conservation protection and the sustainable development of oil palm industry. The annual updating method in this study that fully used information from discrete fine resolution data and continuous coarse resolution data is also expected to be applicable in other regions facing data scarcity.     , whereas the lower boundary lines are the lower limit according to our results. Note that during the gap between the two periods, no uncertainty could be derived, which does not mean that the uncertainty was small.