Twenty-meter annual paddy rice area map for mainland Southeast Asia using Sentinel-1 synthetic-aperture-radar data

Over 90 % of the world’s rice is produced in the Asia–Pacific region. Synthetic-aperture radar (SAR) enables all-day and all-weather observations of rice distribution in tropical and subtropical regions. The complexity of rice cultivation patterns in these regions makes it difficult to construct a representative data-relevant rice crop model, which in turn makes it harder to extract rice distributions from SAR data. To address this problem, this study proposes a rice area mapping method for large tropical or subtropical regions based on time-series Sentinel-1 SAR data. Based on an analysis of rice backscattering characteristics in mainland Southeast Asia, a combination of spatiotemporal statistical features with good generalization ability was selected and input into the U-Net semantic segmentation model, and WorldCover data were then used to reduce false alarms. The result is a 20 m resolution rice area map of five countries in mainland Southeast Asia for 2019. The proposed method achieved an accuracy of 92.20 % on the validation sample set, and good agreement was obtained when comparing the rice area map with statistical data and other rice area maps at the national and provincial levels. The maximum coefficient of determination $R^2$ was 0.93 at the national level and 0.97 at the provincial level. These results demonstrate the advantages of the proposed method for rice area mapping under complex cropping patterns and the reliability of the generated rice area maps. The 20 m annual paddy rice area map for mainland Southeast Asia is available at https://doi.org/10.5281/zenodo.7315076 (Sun et al., 2022b).


Introduction
Sustainable Development Goal 2, "Zero Hunger", was set by the United Nations in 2015 (Desa, 2016): the dual pressure of population and environment threatens the sustainability of global food security (Faostat, 2010; Godfray et al., 2010). Rice feeds more than half of the world's population as a staple food and is a major crop for world food security (Kuenzer and Knauer, 2012). Asia is the largest rice-producing region in the world (Chen et al., 2012), and Southeast Asia accounts for 40 % of global rice exports (Yuan et al., 2022).
High-precision rice-planting area maps are the basis for monitoring rice growth and forecasting yields, the cornerstone for governments, planners and policymakers to formulate reasonable policies, and a guarantee of global food security (Mosleh et al., 2015; Laborte et al., 2017; Clauss et al., 2018; Jin et al., 2018; Yu et al., 2020; Hoang-Phi et al., 2021).
Remote-sensing technology plays a crucial role in rice growth monitoring and distribution mapping (Weiss et al., 2020; Zhao et al., 2021; Tsokas et al., 2022). Rice area mapping at the national scale usually uses medium- and low-resolution optical remote-sensing data, such as MODIS and Landsat data. Some researchers used MODIS multitemporal data to produce rice area maps of China with resolutions of 500, 500, 250 and 500 m, respectively (Xiao et al., 2005; Sun et al., 2009; Clauss et al., 2016; Qiu et al., 2022). Guan et al. (2016) produced rice area maps of Vietnam at 500 m resolution using MODIS time-series data in 2010. The National Agricultural Statistics Service (NASS) released the state-based Crop Data Layer (CDL), a 30 m resolution crop distribution map product for the entire continental United States, using multisource medium-resolution remote-sensing data (Landsat, IRS-p6, DEIMOS-1, etc.) (Johnson and Mueller, 2010). Luo et al. (2020) and Wei et al. (2022) used Landsat time-series data to produce 1 km and 30 m resolution rice data sets for China, respectively. Recently, the Sentinel-2 satellite sensor opened up new possibilities for paddy rice monitoring. Liu et al. (2022) obtained medium-resolution rice area maps of China using Sentinel-2 time-series data in 2020.
C. Sun et al.: Twenty-meter annual paddy rice area map
At the continental scale, MODIS time-series data were frequently used to map the distribution of rice cultivation (Dong et al., 2016a, b). Xiao et al. (2006), Gumma et al. (2011a, b, 2014) and Bridhikitti and Overcamp (2012) produced low- and medium-resolution rice area maps for several South and Southeast Asian countries using MODIS data at 500 m spatial resolution. Nelson and Gumma (2015) extracted a 500 m spatial-resolution general rice extent map of Asia from 2000 to 2012 using MODIS data. Using MODIS time-series data, Zhang et al. (2017) generated rice acreage maps for China and India from 2000 to 2015. Han et al. (2022) used MODIS data to complete 500 m annual rice maps for the Asian monsoon region from 2000 to 2020. Satellite pour l'Observation de la Terre (SPOT) data were also used for continent-wide rice area mapping. Manjunath et al. (2015) used 2009-2010 multitemporal SPOT vegetation (VGT) normalized difference vegetation index (NDVI) data to produce 1 km resolution rice area maps for South and Southeast Asia.
Most of the rice in the world is distributed in hot and rainy areas. Optical data are easily obscured by clouds, which poses a challenge for rice area extraction in humid and subhumid climates with abundant water resources such as Southeast Asia (Liu et al., 2019; Sun et al., 2021). Compared with traditional optical remote sensing, synthetic-aperture radar (SAR) is an active microwave sensor that operates day and night in all weather, can penetrate clouds and is very sensitive to the geometric structure and dielectric properties of crops (Huang et al., 2017; Orynbaikyzy et al., 2019; Sun et al., 2022a). In recent years, free SAR data represented by Sentinel-1 data have been widely used in rice mapping over large regions. Singha et al. (2019) obtained seasonal rice maps at 10 m resolution for Bangladesh and northeastern India using time-series Sentinel-1 vertical-horizontal polarization (VH) data for 2017. Pan et al. (2021) used 2016-2020 Sentinel-1 VH data to produce 10 m spatial-resolution double-season rice maps for nine provinces in southern China. Xu et al. (2021) used time-series Sentinel-1 VH data to obtain a 20 m rice area map for Thailand in 2019.
To take full advantage of multisource remote-sensing data, some researchers combined optical and SAR time-series data in large-scale rice mapping studies (Thenkabail et al., 2009; Zhang et al., 2018; You and Dong, 2020). Phan et al. (2021) used Sentinel-1/Sentinel-2 and Landsat data to produce the first Vietnam land use/land cover annual data set with 30 m resolution from 1990 to 2020. Han et al. (2021) obtained 10 m resolution rice maps from 2017 to 2019 in Northeast and Southeast Asia using Sentinel-1 and MODIS time-series data.
At present, large-scale rice mapping methods based on remote-sensing data can be divided into two categories: one is the combination of phenological information and remote-sensing images, and the other is the combination of time-series data and machine learning relying on image information. The phenology-based approach refers to the extraction of rice by defining phenological indicators or identifying rice-growing stages by combining the time-series data covering the rice growth cycle and the analysis of the rice phenological calendar (Nelson et al., 2014; Chen et al., 2016; Nguyen and Wagner, 2017; Liu et al., 2018; Xin et al., 2020; Ni et al., 2021). The growing stages such as transplanting, heading and maturity are most often used to extract rice. Shew and Ghosh (2019) combined vegetation indices extracted from Landsat time-series data with a rule-based algorithm for growing stages to produce a 30 m dry-season rice map of Bangladesh from 2014 to 2018. Li et al. (2020) extracted the minimum and maximum values of permanent water backscatter coefficients and three thresholds of phenological characteristics, namely, the date of the beginning of the season, the date of maximum backscatter during the peak growing season and the length of the vegetative stage, from 402 scenes of Sentinel-1 data in 2017 to map rice paddies in the Mun River basin, Thailand. Kang et al. (2022) completed a 10 m resolution rice map of Cambodia from Sentinel-1 (2015) and Sentinel-2 (2015-2017) time-series data using three key rice phenological periods in the dry and rainy seasons, respectively.
However, the phenology-based methods rely heavily on human intervention and are not suitable for rice area extraction with complex cropping cycles. The approaches based on the combination of time-series data and machine-learning methods refer to the direct use of time series as the input features for machine learning (Ndikumana et al., 2018; Chang et al., 2020; Mansaray et al., 2021; Yang et al., 2021). Machine-learning methods are used to extract rice information by mining fixed relationships across growth periods of rice (Yang et al., 2019; You et al., 2021). Torbick et al. (2017) used Sentinel-1, Landsat-8 and PALSAR-2 time-series data and a random-forest algorithm to map the rice-planting area and planting intensity of Myanmar with 20 m resolution in 2015. Inoue et al. (2020) developed a 30 m resolution map of paddy rice in Japan for 2018 using Sentinel-1 SAR data and Sentinel-2 data with conventional decision tree methods. Wei et al. (2021) completed rice area mapping for the Arkansas River basin, USA, by entering dual-polarized Sentinel-1 data from 2017 to 2019 into a modified U-Net model. Soh et al. (2022) used Sentinel-1 and Sentinel-2 time-series data and a K-means clustering method to map rice in West Malaysia.
The climate in tropical or subtropical regions such as Southeast Asia is suitable for rice growth throughout the year, which makes it harder to extract information on the distribution of rice areas. First, it is difficult to obtain accurate phenological information, because the hot and humid climate in Southeast Asia allows flexible timing of rice seeding and transplanting (Xu et al., 2021). Thus, it is difficult to determine effective phenological indicators and to accurately identify rice-growing stages. Second, rice cultivation patterns in Southeast Asia are too complex to construct a representative rice growth model (Kang et al., 2022). This poses obstacles for rice area extraction methods that rely on time-fixed relationships in time-series data.
Current publicly downloadable remote-sensing data-based rice products for Southeast Asia include the Asian rice map (International Rice Research Institute, IRRI, rice data, 500 m) (Nelson and Gumma, 2015), Vietnam-wide annual land use/land cover data sets from 1990 to 2020 (VLUCDs, 30 m) (Phan et al., 2021), annual paddy rice maps for Northeast and Southeast Asia from 2017 to 2019 (NESEA-Rice10, 10 m) (Han et al., 2021) and annual rice maps of the Asian monsoon region from 2000 to 2020 (500 m) (Han et al., 2022). Except for Vietnam's VLUCD, the source data for the public rice maps in Southeast Asia were mainly MODIS. Rice area maps using MODIS data contained a large number of mixed pixels due to low spatial resolution (Dong et al., 2015; Shew and Ghosh, 2019), which affected the accuracy of rice area maps.
Therefore, in this study, to meet the requirements of high-precision rice area mapping in Southeast Asia, the objectives accomplished using Sentinel-1 time-series data are as follows.
A new feature-extraction method is proposed by analyzing the time-series backscattering variation of rice in mainland Southeast Asia. The method does not need to summarize a general evolutionary model from rice backscatter coefficients with diverse cultivation patterns. Using three simple but effective temporal statistical features defined in this study, it is possible to capture key information about the rice-growth process. This study provides a new idea for rice area mapping methods in tropical or subtropical regions.
A deep combination of the above features and the U-Net model is used to fully exploit the pixel-level semantic features to complete the annual rice area mapping of five Southeast Asian countries in 2019, enriching the available Southeast Asian rice area maps and providing supporting information for the scientific community and scientific decision-making.
The rest of the paper is organized as follows. Section 2 describes the study area and the data used. Sect. 3 presents the rice area mapping scheme. Sect. 4 presents the rice area mapping results and accuracy assessment. Sect. 5 discusses the results. Sect. 6 gives the data addresses, and Sect. 7 draws conclusions.

Study area
Approximately 90 % of the world's rice is grown on 140 million hectares of land in Asia. Rice production in mainland Southeast Asia accounts for about 15 % of the world's rice production (Fao, 2020). The study area comprises five countries in mainland Southeast Asia, namely, Myanmar, Thailand, Laos, Cambodia and Vietnam, as shown in Fig. 1. These countries have more land under rice cultivation than any other crop, and Vietnam and Thailand are the two largest rice exporters in the world (Yuan et al., 2022). Indeed, changes in rice production in these countries could destabilize international rice markets and have a clear impact on global food security.
Southeast Asia has a tropical monsoon climate with an average annual temperature of 20-27 °C and abundant rainfall. Therefore, rice can be grown at any time of the year. Agricultural systems in Southeast Asia are dominated by rainfed lowland rice and irrigated lowland rice (Kuenzer and Knauer, 2012). Under suitable irrigation conditions, rice can be harvested two to three times per year.

Satellite imagery and auxiliary data
The European Space Agency (ESA) provides a free data source for global land cover monitoring through Sentinel-1A, launched in 2014, and Sentinel-1B, launched in 2016 (Torres et al., 2012). The Sentinel-1 satellites carry a C-band (5.405 GHz) synthetic-aperture radar with a 12 d revisit period. In this study, the 2019 dual-polarized (VV/VH) ground-range-detected (GRD) products in interferometric wide-swath (IW) mode were downloaded from the Alaska Satellite Facility (ASF) website. In total, 12 tracks, 90 frames and 2665 scenes of data were acquired. Details are shown in Table 1.
The digital elevation model (DEM) and land use/land cover product were also collected. The Shuttle Radar Topography Mission (SRTM) 3 arcsec DEM product was used for terrain correction of SAR data. WorldCover data were used to reduce false alarms caused by water and woodland. WorldCover is a global land cover product produced by the ESA and several scientific institutions using Sentinel-1 and Sentinel-2 data (Zanaga et al., 2021). It provides information on 11 land cover types for 2020 with a resolution of 10 m and an overall accuracy of 80.7 % for the Asian region.
https://doi.org/10.5194/essd-15-1501-2023 Earth Syst. Sci. Data, 15, 1501-1520, 2023

Agricultural statistics
The statistical yearbooks of each country were collected to compile annual census data of rice-harvested areas at different administrative levels in these countries. The administrative levels include national and subnational levels (state, province or region, uniformly represented by province in this study). The unit of area in the statistical data is uniformly converted to hectares (ha).

Available rice maps based on remote-sensing data
From the perspective of resolution and coverage area, two publicly downloadable rice maps were selected for comparison.
(1) VLUCDs
Researchers from the Japan Aerospace Exploration Agency (JAXA) produced the first 30 m resolution VLUCDs using multiple sources of data (including Landsat and Sentinel-1/Sentinel-2) and a random forest algorithm (Phan et al., 2021). The VLUCDs contain annual land cover products for 1990-2020, including a primary classification (10 different categories of primary land cover, including rice) and a secondary classification (18 different categories of secondary land cover, including rice). The rice layer was extracted from the 2019 annual land cover product for comparison.
(2) Rice data of Asia from IRRI
IRRI is an international agricultural research and training organization with its headquarters in Los Baños, Laguna, in the Philippines, and offices in 17 countries. IRRI is one of 15 agricultural research centers in the world that form the Consultative Group for International Agricultural Research (CGIAR), a global partnership of organizations engaged in research on food security. IRRI is also the largest nonprofit agricultural research center in Asia. IRRI produced a 500 m resolution map of the general distribution of rice in Asia from 2001 to 2012 using MODIS time-series data (Nelson and Gumma, 2015) that is freely available to the public.
Table 2 shows details of the SAR data, auxiliary data, available rice maps, land cover data and statistics used in the study.

Method
The flowchart of this study is shown in Fig. 2. First, the Sentinel-1 time-series images were preprocessed. Then, key features of the rice-growth process were extracted from the time-series SAR data. To make full use of the pixel-level semantics of the features, the extracted features were fed into the U-Net model to obtain rice area extraction results with spatial details. Finally, to reduce false alarms from water bodies and non-rice vegetation, the results were postprocessed using masks generated from high-precision land cover products to obtain the annual rice area map of five Southeast Asian countries.

Preprocessing
The Sentinel-1 time-series data were preprocessed using the Sentinel Application Platform (SNAP) software (Filipponi, 2019). SNAP is a common architecture for all Sentinel toolboxes. ESA and ESRIN provide the SNAP user tool free of charge to the Earth observation community. The steps were as follows.
(1) Orbit correction: this operation replaces the imprecise orbit-state vectors provided in the metadata of a SAR product with the precise orbit files, which become available days to weeks after the product is generated.
(2) Thermal-noise removal: because SAR images are contaminated by additive thermal noise, this step is introduced to mitigate thermal-noise effects.
(3) Radiometric calibration: this process produces an image in which the pixel values can be directly related to the radar backscatter of the scene.
(4) Co-registration: this step co-registers the multitemporal intensity images.
(5) Terrain correction: this process converts SAR data from the slant- or ground-range projection to a geographic coordinate projection and corrects the geometric distortions that occurred during acquisition (layover, shadow).
(6) Multitemporal speckle filtering: this operation reduces speckle, which degrades the quality of the image and makes interpretation of features more difficult.
(7) Conversion to decibels: this step converts the multitemporal intensity images to sigma nought ($\sigma^0$) on the decibel (dB) scale using a logarithmic transformation.
The final $\sigma^0$ images were obtained at 20 m resolution in the WGS84 geographic coordinate system.
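Step (7) amounts to a single logarithmic transform. The following is a minimal sketch in Python with NumPy, not the SNAP implementation: the function name `intensity_to_db` and the clipping floor are illustrative assumptions introduced here to avoid taking the logarithm of zero.

```python
import numpy as np

def intensity_to_db(sigma0_linear, floor=1e-6):
    """Convert linear backscatter intensity to decibels: 10 * log10(x)."""
    x = np.asarray(sigma0_linear, dtype=np.float64)
    # Clip to a small positive floor (assumed value) so log10 never sees 0.
    return 10.0 * np.log10(np.maximum(x, floor))

# Linear intensities of 1, 0.1 and 0.01 map to 0, -10 and -20 dB.
sigma0_db = intensity_to_db([1.0, 0.1, 0.01])
print(sigma0_db)
```

Working on the dB scale compresses the large dynamic range of SAR intensity, which is why the statistical features in the next section are computed on $\sigma^0$ in dB.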

Feature extraction
As described in many previous studies (Singha et al., 2019; Chang et al., 2020; Crisóstomo De Castro Filho et al., 2020; Sun et al., 2022a), VH polarization is more sensitive to the flooding period of rice than VV polarization and has been more widely used for rice area extraction. Therefore, Sentinel-1 VH polarization time-series data were selected in this study. To analyze the time-series characteristics of the backscattering coefficients of rice and other land cover types in the study area, representative sample plots of four typical land cover types (rice, water bodies, buildings and non-rice vegetation) were selected. Based on Google Earth data and other land cover data, four rice regions that belong to different cropping systems were chosen. The average VH polarization time-series data of these land cover types were calculated, as shown in Fig. 3.
In Fig. 3, the backscattering coefficients of water bodies were small, as they exhibited specular scattering, and their return power was lower than that of other land covers. In contrast, buildings exhibited double-bounce scattering and their return powers were much stronger, leading to larger backscattering coefficients. The scattering process of radar waves over non-rice vegetation was more complicated, and the backscattering coefficients of non-rice vegetation lay between those of buildings and water bodies. For the different rice samples, the curves fluctuated significantly due to the effects of flooding and multiseason planting patterns. However, their backscattering intensities generally ranged between those of buildings and water bodies. More specifically, during the observation period, two seasons of rice were planted in the land parcel of Rice 1, the first from April to July and the second from August to October. The land parcel of Rice 2 was planted with only one season of rice, from April to September. The land parcel of Rice 3 was planted with two seasons of rice: the first season was from March to July and the second season was from July to October. The land parcel of Rice 4 was planted with three seasons of rice: the first season was from February to June, the second season was from June to October and only part of the third season (October-December) was observed. It can be seen that the timing of each growing season for Rice 1-Rice 4 was inconsistent. In fact, the high heterogeneity of rice backscattering coefficients in Southeast Asia is caused by the high heterogeneity in climate and topography. This makes the backscatter coefficient curves of the rice growth cycle more diverse and does not allow a generalized model of rice evolution to be summarized. Therefore, it is difficult to accomplish the rice field extraction task by relying directly on a fixed relationship between phenology and time.
Through a large number of comparative experiments, combined with our previous research work (Xu et al., 2021), three time-series statistical features that describe the most significant SAR characteristics during rice growth were selected for rice area mapping in the study area, namely, the sharpness of the change in $\sigma^0$ ($\sigma^0_{\mathrm{var}}$), the minimum value of the backscatter coefficients in the time series ($\sigma^0_{\min}$) and the maximum value of the backscatter coefficients in the time series ($\sigma^0_{\max}$). The interaction between the crop canopy and microwave radiation varies with time during plant growth. In contrast, the backscattering coefficients of non-crops, such as water bodies, buildings and woodland, are more stable. Therefore, the sharpness of the change in $\sigma^0$ with time is a key factor in distinguishing cropland from other land cover types. $\sigma^0_{\mathrm{var}}$ is given by Eq. (1).
$$\sigma^0_{\mathrm{var}} = \frac{1}{n}\sum_{i=1}^{n}\left(\sigma^0_i - \sigma^0_{\mathrm{mean}}\right)^2, \tag{1}$$
where $\sigma^0_{\mathrm{mean}} = \frac{1}{n}\sum_{i=1}^{n}\sigma^0_i$, $\sigma^0_i$ is the backscatter coefficient in the $i$th image and $n$ is the number of images. During the flooding stage, the backscattering characteristics of rice are significantly different from those of other crops that do not require extensive irrigation and are close to those of water. Therefore, this study identified the flooding stage by calculating the minimum value of the backscatter coefficient in the time-series images to distinguish rice from other crops. $\sigma^0_{\min}$ is given by Eq. (2).
$$\sigma^0_{\min} = \min_{1 \le i \le n} \sigma^0_i \tag{2}$$
The seasonal backscattering variation exhibited by water bodies can interfere with the identification of rice. In contrast to the seasonal variation of water bodies, the backscatter coefficient of rice shows a substantial increase during the growth process. Therefore, false alarms generated by water bodies can be reduced by identifying the maximum value of the backscatter coefficients in the time-series images. $\sigma^0_{\max}$ is given by the following equation.
$$\sigma^0_{\max} = \max_{1 \le i \le n} \sigma^0_i \tag{3}$$
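The three temporal features can be computed per pixel from a co-registered stack of VH images. The sketch below is an illustration rather than the processing chain used for the map: it assumes that the variation feature is the population variance of the dB time series (NumPy's default, `ddof=0`), and the function name `temporal_features` and the two synthetic pixels are hypothetical.

```python
import numpy as np

def temporal_features(stack_db):
    """Per-pixel temporal statistics of a (n_dates, rows, cols) stack in dB.

    Returns (variance, minimum, maximum) arrays of shape (rows, cols).
    """
    stack = np.asarray(stack_db, dtype=np.float64)
    s_var = stack.var(axis=0)  # population variance (assumed definition)
    s_min = stack.min(axis=0)  # lowest value, expected during flooding
    s_max = stack.max(axis=0)  # highest value, expected near peak growth
    return s_var, s_min, s_max

# Two synthetic pixels over six dates: a rice-like series and stable water.
rice = [-22.0, -20.0, -16.0, -14.0, -15.0, -21.0]
water = [-24.0] * 6
stack = np.array([rice, water]).T.reshape(6, 1, 2)

s_var, s_min, s_max = temporal_features(stack)
print(s_var[0], s_min[0], s_max[0])
```

In this toy case the rice-like pixel has a much larger variance and maximum than the water pixel, while both have low minima, which mirrors the reasoning given above for choosing the three features.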
A pseudo-color image is synthesized in the order of R: $\sigma^0_{\max}$, G: $\sigma^0_{\min}$ and B: $\sigma^0_{\mathrm{var}}$, as shown in Fig. 4. Due to the higher $\sigma^0_{\mathrm{var}}$ and $\sigma^0_{\max}$ and lower $\sigma^0_{\min}$ of rice, the color of rice in the pseudo-color composite image is mainly purplish red and sometimes red or dark blue. Compared to other land covers, water bodies have lower $\sigma^0_{\mathrm{var}}$, $\sigma^0_{\max}$ and $\sigma^0_{\min}$. Therefore, water bodies are black in the pseudo-color image. Land covers with less variation in backscatter intensity, such as settlements and non-rice vegetation, generally have smaller $\sigma^0_{\mathrm{var}}$ and higher $\sigma^0_{\min}$. Therefore, the colors of these land covers are usually yellow or green in the pseudo-color image.
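The composite can be sketched as a linear stretch of each feature to 8 bits followed by channel stacking. The stretch ranges in `RANGES` are assumptions chosen only for illustration (the text does not state the scaling used), and `stretch` and `pseudo_color` are hypothetical helper names.

```python
import numpy as np

# Assumed display ranges: dB for max and min, dB^2 for the variance.
RANGES = {"max": (-25.0, -5.0), "min": (-30.0, -10.0), "var": (0.0, 20.0)}

def stretch(band, lo, hi):
    """Linearly map [lo, hi] to [0, 255], clipping values outside the range."""
    scaled = (np.asarray(band, dtype=np.float64) - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)

def pseudo_color(s_max, s_min, s_var, ranges=RANGES):
    """Stack R = max, G = min, B = variance into a (rows, cols, 3) image."""
    r = stretch(s_max, *ranges["max"])
    g = stretch(s_min, *ranges["min"])
    b = stretch(s_var, *ranges["var"])
    return np.dstack([r, g, b])

# One rice-like pixel (high max, low min, high variance) and one water pixel.
rgb = pseudo_color(
    s_max=np.array([[-12.0, -24.0]]),
    s_min=np.array([[-24.0, -29.0]]),
    s_var=np.array([[12.0, 0.5]]),
)
print(rgb[0, 0], rgb[0, 1])
```

With these assumed ranges the rice-like pixel has strong red and blue channels and a weak green channel, giving a purplish tone, while the water pixel is dark in all three channels.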

Training and validation sets
The above analysis shows that the rice and non-rice land covers of these Southeast Asian countries have consistent features in the pseudo-color image; i.e., a model trained on one scene is applicable to all other scenes with good transferability. Therefore, a training data set generated from the orbit-frame 99-16 images of Thailand from a previous work (Xu et al., 2021) was used, as shown in Fig. 1. A sliding window with a pixel size of 224 × 224 was used to slice the training images into image patches with 50 % overlap. The training data set consisted of 15 659 image patches with a pixel size of 224 × 224. A validation sample set for accuracy evaluation was collected using auxiliary data such as Google Earth optical images and other rice maps. The validation samples were divided into two categories, rice and non-rice: the number of samples is shown in Table 3, and their distribution is shown in Fig. 1.
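The patch slicing described above corresponds to a sliding window with a stride of half the window size. A minimal sketch follows, assuming that incomplete windows at the right and bottom edges are discarded (the handling of edges is not stated in the text); `slice_patches` is a hypothetical name.

```python
import numpy as np

def slice_patches(image, size=224, overlap=0.5):
    """Cut a (rows, cols, channels) image into overlapping square patches."""
    stride = int(size * (1.0 - overlap))  # 112 pixels for 50 % overlap
    rows, cols = image.shape[:2]
    patches = []
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            patches.append(image[top:top + size, left:left + size])
    return np.stack(patches)

# A 448 x 672 three-band feature image yields 3 x 5 = 15 patches.
image = np.zeros((448, 672, 3), dtype=np.float32)
patches = slice_patches(image)
print(patches.shape)
```

The 50 % overlap means that every interior pixel appears in up to four patches, which increases the number of training samples drawn from a single scene.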

U-Net model
In this paper, high-precision rice area mapping was accomplished using the U-Net model. U-Net is a classical semantic segmentation model widely used in biomedical image segmentation and remote sensing (Wei et al., 2019; Xu et al., 2021; Lin et al., 2022). It outputs semantically labeled pixel-by-pixel images corresponding to the input image while extracting high-level semantic features so that the spatial details of the input image can be maintained (Ronneberger et al., 2015). A SAR image covers a large spatial area, which includes multiple ground objects with complex and rich semantic information; rice fields are spatially characterized by a continuous and large distribution. Therefore, U-Net is used to fully combine the spatial and semantic information in SAR images to achieve high-precision rice area extraction.
The structure of the U-Net model is shown in Fig. 5. U-Net consists of an encoder (contracting path) and a decoder (expansive path). The encoder is used for feature extraction, and the decoder is used to restore the size of the input image. U-Net has 23 convolutional layers: eighteen 3 × 3 convolutional layers, four 2 × 2 convolutional layers and one 1 × 1 convolutional layer. The encoder part consists of five downsampling units, where each unit consists of two 3 × 3 convolutional layers and a 2 × 2 max-pooling layer. The output of each downsampling unit is passed to the next downsampling unit by max pooling. The decoder contains four upsampling units, each of which consists of two 3 × 3 convolutional layers and a 2 × 2 deconvolutional layer. In the final stage of the decoder, the feature vector of the last upsampling unit is converted into a probability map by the 1 × 1 convolutional layer. The probability map has two channels, and the pixel values indicate the probability that the pixel belongs to rice or non-rice.
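The layer counts quoted above can be checked with a short tally. The sketch below is bookkeeping only, not a network implementation: it assumes that pooling is applied between consecutive encoder units (four times in total) and that the 3 × 3 convolutions use padding that preserves the spatial size, neither of which is stated explicitly in the text. Both function names are hypothetical.

```python
def unet_layer_tally(n_encoder_units=5, n_decoder_units=4):
    """Count convolutional layers for the unit structure described above."""
    conv3x3 = 2 * n_encoder_units + 2 * n_decoder_units  # two per unit
    upconv2x2 = n_decoder_units                           # one per decoder unit
    conv1x1 = 1                                           # final classifier
    return {
        "conv3x3": conv3x3,
        "upconv2x2": upconv2x2,
        "conv1x1": conv1x1,
        "total": conv3x3 + upconv2x2 + conv1x1,
    }

def feature_map_sizes(input_size=224, n_poolings=4):
    """Spatial size after each 2 x 2 pooling, assuming size-preserving convs."""
    sizes = [input_size]
    for _ in range(n_poolings):
        sizes.append(sizes[-1] // 2)
    return sizes

tally = unet_layer_tally()
sizes = feature_map_sizes()
print(tally)
print(sizes)
```

Under these assumptions, a 224 × 224 input patch is reduced to 14 × 14 at the deepest level before the decoder restores the original size.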
Meanwhile, thanks to the U-shaped structure and skip connections, each downsampling stage is concatenated with the corresponding upsampling stage, and this fusion of features at different scales greatly helps the upsampling path to recover pixel-level detail. Specifically, at shallow levels the downsampling factor is small and the feature map retains more detailed rice spatial distribution features (low-level spatial features), while at deep levels the downsampling factor is large and the information is heavily condensed with a large loss of spatial detail. However, the high-level semantic features obtained from deep downsampling help in the determination of rice regions. Fusing the high-level and low-level features therefore helps to improve the segmentation accuracy.
To solve the problem of uneven data distribution, we added a batch-normalization (BN) layer (Ioffe and Szegedy, 2015) before each convolutional layer. The BN layer makes the inputs to each layer follow a consistent distribution, which also regularizes the model.
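As an illustration of what a BN layer computes, the following NumPy sketch normalizes each channel of a batch to zero mean and unit variance and then applies a scale and shift. It shows the training-mode forward pass only and is an assumption-laden simplification: learned parameters are replaced by fixed scalars `gamma` and `beta`, and the running statistics used at inference are omitted.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Training-mode batch normalization for a (batch, channels, h, w) array."""
    # Statistics are taken per channel over the batch and spatial axes.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Synthetic feature batch with a dB-like offset and spread.
rng = np.random.default_rng(0)
x = rng.normal(loc=-15.0, scale=5.0, size=(8, 3, 16, 16))
y = batch_norm(x)
print(y.mean(axis=(0, 2, 3)), y.std(axis=(0, 2, 3)))
```

After normalization every channel has approximately zero mean and unit standard deviation regardless of the original offset, which is the sense in which the inputs follow a consistent distribution.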

Postprocessing
In rice area mapping, water bodies (e.g., rivers and lakes) can be confused with the flooding signal of rice. In addition, non-rice vegetation may cause some disturbances due to weather effects.
Therefore, following many studies (Cué La Rosa et al., 2019; Sun et al., 2021), water body masks and woodland masks derived from WorldCover were used to reduce false alarms in the rice field extraction results.
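The masking step can be sketched as a per-pixel lookup against the land cover map. The example below assumes that the WorldCover raster has already been resampled to the 20 m grid of the rice map, and that the water and woodland masks correspond to the WorldCover classes "Permanent water bodies" (code 80) and "Tree cover" (code 10); which classes were actually used is not specified in the text, and `mask_false_alarms` is a hypothetical name.

```python
import numpy as np

TREE_COVER = 10        # WorldCover class code for tree cover
PERMANENT_WATER = 80   # WorldCover class code for permanent water bodies

def mask_false_alarms(rice_map, worldcover, exclude=(TREE_COVER, PERMANENT_WATER)):
    """Set rice pixels to 0 wherever the land cover map shows an excluded class."""
    rice = np.asarray(rice_map).astype(bool)
    keep = ~np.isin(worldcover, exclude)  # True where the pixel may be rice
    return (rice & keep).astype(np.uint8)

rice_map = np.array([[1, 1, 1],
                     [0, 1, 1]])
worldcover = np.array([[40, 80, 10],
                       [40, 40, 30]])
cleaned = mask_false_alarms(rice_map, worldcover)
print(cleaned)
```

In this toy example the two rice detections that fall on water and tree cover are removed, while detections on cropland (code 40) and grassland (code 30) are kept.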

Accuracy evaluation
In this study, several strategies were used to evaluate our rice map product, including accuracy assessments based on validation sets and comparisons with statistical data and other rice maps at the national and provincial levels. First, common accuracy metrics based on the validation set were calculated to measure the classification effectiveness of the model, including accuracy, precision, recall and Kappa (Congalton, 1991; Vapnik, 1999; McHugh, 2012).
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$\mathrm{Kappa} = \frac{\mathrm{Accuracy} - P_e}{1 - P_e}$$
$$P_e = \frac{(TP + FP)(TP + FN) + (FN + TN)(FP + TN)}{(TP + TN + FP + FN)^2}$$
Here, TP denotes the number of pixels correctly classified as rice, TN denotes the number of pixels correctly classified as non-rice, FP denotes the number of non-rice pixels misclassified as rice, FN denotes the number of rice pixels misclassified as non-rice and $P_e$ is the expected accuracy, i.e., the agreement expected by chance.
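These metrics follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions of accuracy, precision, recall and Cohen's Kappa; the counts in the example are invented for illustration and are not the validation samples of this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and Kappa from binary confusion counts."""
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Expected (chance) agreement from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - p_e) / (1.0 - p_e)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "kappa": kappa}

# Hypothetical counts: 45 TP, 40 TN, 5 FP, 10 FN.
metrics = classification_metrics(tp=45, tn=40, fp=5, fn=10)
print(metrics)
```

Kappa discounts the agreement that would be expected by chance, so it is lower than the raw accuracy whenever the two classes are not trivially separable.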
Second, the spatial consistency of the rice field extraction results with statistical data and other rice maps was compared at the national and provincial levels. The coefficient of determination ($R^2$) between the rice area map and the statistical data or other rice area maps was calculated using the following equation (Draper and Smith, 1998).
$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(k_i - \bar{k}\right)\right]^2}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 \sum_{i=1}^{n}\left(k_i - \bar{k}\right)^2}$$
Here, $n$ is the total number of administrative units, $x_i$ is the area of extracted rice, $\bar{x}$ is its corresponding mean value, $k_i$ is the area from statistical data or other rice maps and $\bar{k}$ is its corresponding mean value.
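A sketch of the agreement measure is given below, under the assumption that $R^2$ here is the squared Pearson correlation between the two sets of areas, which is the form suggested by the two mean values defined above. The province areas in the example are made-up numbers, and `r_squared` is a hypothetical name.

```python
import numpy as np

def r_squared(x, k):
    """Squared Pearson correlation between extracted and reference areas."""
    x = np.asarray(x, dtype=np.float64)
    k = np.asarray(k, dtype=np.float64)
    dx = x - x.mean()
    dk = k - k.mean()
    return (dx @ dk) ** 2 / ((dx @ dx) * (dk @ dk))

# Hypothetical rice areas (ha) for five provinces: extracted vs. reference.
extracted = [120_000, 80_000, 45_000, 200_000, 15_000]
reference = [130_000, 70_000, 50_000, 210_000, 20_000]
r2 = r_squared(extracted, reference)
print(round(r2, 4))
```

Because this form measures linear association only, a high $R^2$ does not by itself imply that the points lie on the 1 : 1 line, which is why the scatter plots are also compared against that line.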

Results
The 2019 rice area map for mainland Southeast Asia derived from Sentinel-1 SAR data is shown in Fig. 6. According to the extraction result, the main rice-producing areas in Myanmar are located in the Ayeyarwady, Bago and Yangon delta regions, which are crossed by river systems. In addition, Mandalay, Sagaing and Magway in the northern arid mountainous region also play an important role in rice production. Thailand's rice fields are concentrated in the central plains, north and northeast. The main rice-producing areas in Laos are located in the central and southern lowland areas. Many of the major rice-producing provinces are located along the Mekong River, such as Borikhamxay, Khammouane, Savannakhet, Salavan and Champasak. Rice fields in Cambodia are concentrated in the Tonlé Sap Lake basin and the southern Mekong River basin. In Vietnam, the representative rice-planting areas are the Mekong delta and the Red River delta.
Next, the rice area map was evaluated as comprehensively as possible at three levels. First, the validation sample set introduced in the previous section was used to evaluate the accuracy of rice area mapping at the methodological level. Second, at the national level, the rice area map was compared with statistical data on rice-harvested area and with other available rice area maps. Finally, at the provincial level, more detailed comparisons were made with statistical data and other provincial rice area maps to measure the spatial consistency between the extracted rice distribution and these data.

Accuracy based on the validation set
The accuracies of the rice area map based on the validation sample set are shown in Table 4. The accuracy was 92.20 % and the Kappa was 0.8425, indicating that the proposed method had good classification performance. The precision was 92.45 %, indicating that the method could effectively reduce false alarms in the rice area extraction results. These accuracy metrics show that the rice mapping results were in good agreement with the validation samples. This further demonstrates the capability of the proposed method for rice area mapping in large tropical regions.

Comparison with statistical data and other rice area maps at the national scale
Figure 7 shows the comparison of the extracted rice area with statistical data and the IRRI rice data at the national scale for five Southeast Asian countries. As seen from the figure, the extraction results were consistent with both the statistical data and the IRRI rice data, and most points were distributed in the vicinity of the 1 : 1 line. The extraction result was more consistent with the IRRI data ($R^2$ of 0.93) than with the statistical data ($R^2$ of 0.78). Table 5 shows the statistical area of rice, the area from other rice area maps and the extracted rice area for five Southeast Asian countries. In Vietnam, the statistical rice-harvested area was much higher than the extracted rice area. The statistical data are the total rice-harvested areas over the different growing seasons of each year, whereas the extracted rice area is the land area on which rice was planted. In Vietnam, there are three rice seasons, namely spring, fall and winter rice; the harvested areas of spring and fall rice are comparable, and that of winter rice is smaller. As a result, a large proportion of the paddy land is counted more than once in the harvested-area statistics, which makes the statistical rice area larger than the extracted rice area. Although other countries also have multiple rice seasons, the rice area in the main season is large and that in the other seasons is small, so the proportion of area counted repeatedly is small. The extracted rice area was closer to the paddy land area in the statistical yearbook of Vietnam and to the VLUCD, indicating that the extraction result was reliable.
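The difference between harvested area and planted land area can be illustrated with a short calculation. This is a sketch under assumed values: the land area and the seasonal cropping fractions below are hypothetical and are not taken from the Vietnamese statistics.

```python
def harvested_area(paddy_land_area, season_fractions):
    """Sum the seasonal harvested areas.

    Each fraction is the share of the physical paddy land cropped in that
    season, so land cropped in several seasons is counted several times.
    """
    return sum(paddy_land_area * f for f in season_fractions.values())


# Hypothetical values for illustration only.
land = 4.0  # million ha of physical paddy land (what a land-area map measures)
fractions = {"spring": 0.75, "fall": 0.75, "winter": 0.25}

total_harvested = harvested_area(land, fractions)  # what harvest statistics report
ratio = total_harvested / land
print(total_harvested, ratio)
```

With two large seasons of comparable size, the harvested area is well above the land area, whereas a country with one dominant season would show a ratio close to 1.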

Comparison with statistical data and other rice area maps at the provincial scale
Figure 8 shows the comparison of the extracted rice area with the statistical data of rice-harvested area and IRRI rice data at the provincial scale for five Southeast Asian countries. The available rice maps comprise a 500 m resolution rice map of mainland Southeast Asia (IRRI rice data) and a 30 m resolution rice map of Vietnam (VLUCD) (see Sect. 2.2.3 for details). In general, the rice area extraction results were in good agreement with the statistical area, the IRRI data and the VLUCD. The $R^2$ ranged from 0.82 to 0.88 with statistical data and from 0.83 to 0.97 with IRRI data, as shown in Fig. 8. As shown in Fig. 8a and b, the rice-planting areas in Thailand and Cambodia extracted by our method correlated well with the statistical data and IRRI data at the provincial scale, with $R^2$ values in the range of 0.83-0.88, and no provinces showed large deviations.
In Myanmar (Fig. 8c), the $R^2$ between the extracted rice area and the statistical data and IRRI rice data was 0.83 and 0.84, respectively. However, the extracted rice area of Ayeyarwady province was significantly lower than that of the statistical data and IRRI data. The extraction results for Ayeyarwady are compared with the IRRI data in Fig. 9. As reported by Han et al. (2021), due to the influence of mixed pixels, the IRRI data classify many rivers and much non-rice vegetation as rice, whereas the extracted rice area map retains the details of rivers and roads.
In Laos, the $R^2$ between the extracted rice area and the statistical data was 0.82, and the $R^2$ with the IRRI data was the highest among the five countries, at 0.97, as shown in Fig. 8d. For the same reason as in Ayeyarwady province, the extracted rice area in Savannakhet province was lower than the IRRI data because the details of rivers and roads were preserved in the extraction results.
Unlike the other subfigures, the Vietnam panel (Fig. 8e) additionally includes a comparison with the 30 m VLUCD. The extraction results in Vietnam correlated well with the statistical data, VLUCD and IRRI data, with all $R^2$ values greater than 0.80. The extracted rice area in Vietnam agreed better with the VLUCD ($R^2$ of 0.87) than with the statistics ($R^2$ of 0.86) or the IRRI rice data ($R^2$ of 0.83). Most of the VLUCD points were distributed along the 1 : 1 line.
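As an illustration of the agreement measure used in these comparisons, the sketch below computes a coefficient of determination between two sets of area estimates. It assumes that $R^2$ is the squared Pearson correlation of a simple linear fit, which the text does not state explicitly; the function name and the numbers are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear fit of y on x,
    computed as the squared Pearson correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical areas for four provinces (arbitrary units, illustration only).
extracted = [1.0, 2.0, 3.0, 4.0]
reference = [1.0, 3.0, 2.0, 4.0]
r2 = r_squared(extracted, reference)
print(r2)
```

Note that a high $R^2$ under this definition indicates a strong linear relationship but not necessarily closeness to the 1 : 1 line, which is why the scatter around that line is also inspected.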

Discussion
In this study, an annual rice area map for five Southeast Asian countries in 2019 was generated using temporal features extracted from Sentinel-1 SAR time series and an improved U-Net model. Accuracy, precision and recall based on the validation set exceeded 90 %, with a Kappa of 0.8425. The accuracy evaluation showed that the proposed temporal features were able to portray the unique growth characteristics of rice, and that the improved U-Net model was able to suppress the sporadically distributed false alarms caused by complex topography. The proposed method is therefore well suited to mapping rice distribution in large tropical regions.
The rice area extraction results were compared with statistical data at the national and provincial levels in Sects. 4.2 and 4.3. The results of multiple comparisons show that our rice area extraction results are in high agreement with the statistical data. However, there were also minor inconsistencies. A possible reason is that the statistical cycle is not strictly aligned with the SAR data collection cycle. The rice area extracted in this study is the total area of all fields that were planted with rice in a year. Most agricultural statistics record the total area of rice planted in different growing seasons on an annual basis or even from one month of one year to the next. In addition, the statistical methods themselves may introduce errors. The statistics mainly consider the well-organized rice-growing seasons, and the random and irregular planting behavior of individual farmers is inevitably ignored. Considering the data collection conditions and statistical errors, it is understandable that the extracted rice map differs from the official statistics.
The comparison between rice area products extracted from different remote-sensing data showed that our rice area extraction results were in good agreement with the available rice products at the national and provincial levels. To further demonstrate the reliability of the rice extraction results, three subregions of the rice map in Thailand and Vietnam were selected for comparison, as shown in Fig. 10. As mentioned in other literature (Dong et al., 2015; Han et al., 2021), the MODIS-based IRRI rice map with 500 m resolution contains a large number of mixed pixels, and thus misclassification exists in that rice area map. The spatial distribution characteristics of our rice area map were generally consistent with those of the IRRI data, and our rice area map retained more details with fewer mixed pixels. In addition, our rice area map also agreed well with the spatial distribution and detailed information of rice in the VLUCD. Overall, comparisons based on the validation set, statistical data and other rice area map products confirmed the reliability of our rice area map.
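The effect of mixed pixels on mapped area can be illustrated with a toy example. The sketch below assumes a simple majority rule for labelling a coarse pixel; this is only an illustration and not the way the IRRI map was produced. The grid values, the function name and the aggregation factor are hypothetical.

```python
def coarsen(fine, factor, threshold=0.5):
    """Aggregate a binary fine-resolution grid into coarse cells.

    A coarse cell is labelled rice (1) when at least `threshold` of its
    fine pixels are rice; otherwise it is labelled non-rice (0).
    """
    rows, cols = len(fine), len(fine[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [fine[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(1 if sum(block) / len(block) >= threshold else 0)
        coarse.append(coarse_row)
    return coarse


# Toy fine-resolution map: 1 = rice, 0 = non-rice. The column of zeros in the
# left block stands for a narrow river or road running through rice fields.
fine = [
    [1, 1, 0, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
]

fine_area = sum(map(sum, fine))                # rice area in fine-pixel units
coarse = coarsen(fine, factor=4)
coarse_area = sum(map(sum, coarse)) * 4 * 4    # same units as fine_area
print(fine_area, coarse, coarse_area)
```

In this toy case the coarse map absorbs the narrow non-rice feature into a rice cell and therefore reports a larger rice area than the fine map, while the isolated rice pixel in the right block is lost.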
In this study, the temporal features of areas along rivers and wetlands were similar to those of paddy rice and showed similar colors in the feature pseudo-color image, so these areas can easily be misclassified as rice. The backscatter of scattered rice fields is subject to interference from topography and surrounding non-rice land cover, resulting in missed detections. Improvements can be made in future studies by using water masks extracted from higher-precision land cover data or by adding more negative samples.

Data availability
The 20 m annual paddy rice area map for mainland Southeast Asia can be accessed in the Zenodo data set at the following DOI: https://doi.org/10.5281/zenodo.7315076 (Sun et al., 2022b). The spatial reference system of the data set is EPSG:4326 (WGS84).
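Because the data set is distributed in geographic coordinates (EPSG:4326), the ground area of a pixel varies with latitude, which matters when summing rice pixels into areas. The following is a minimal sketch of a per-pixel area calculation under a spherical-Earth approximation. The pixel size in degrees used here is an assumption chosen to be roughly 20 m at the equator; the actual grid spacing should be read from the data set's metadata.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def pixel_area_m2(lat_center_deg, pixel_size_deg):
    """Approximate ground area of a square lat/lon pixel on a sphere."""
    half = pixel_size_deg / 2
    lat_south = math.radians(lat_center_deg - half)
    lat_north = math.radians(lat_center_deg + half)
    dlon = math.radians(pixel_size_deg)
    # Area of a latitude/longitude cell: R^2 * dlon * (sin(north) - sin(south))
    return EARTH_RADIUS_M ** 2 * dlon * (math.sin(lat_north) - math.sin(lat_south))


PIXEL_SIZE_DEG = 0.00018  # ASSUMED grid spacing, about 20 m at the equator

area_equator = pixel_area_m2(0.0, PIXEL_SIZE_DEG)
area_20n = pixel_area_m2(20.0, PIXEL_SIZE_DEG)
print(area_equator, area_20n)
```

Summing rice pixels weighted by such latitude-dependent areas, or reprojecting to an equal-area projection first, avoids overestimating area at higher latitudes within the study region.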

Conclusions
Ending hunger and malnutrition is essential, and rice plays a critical role. Satellite-based remote sensing offers the most practical means of monitoring rice cultivation in mainland Southeast Asia. Questions remain, however, as to appropriate timing, number of satellite observations, spatial resolution of satellite imagery and effective data-processing methods for rice distribution and production information.
To perform large-scale rice area mapping in tropical and subtropical regions, an efficient rice area mapping method based on time-series SAR features and a deep-learning model was proposed. A 20 m spatial-resolution rice area map of mainland Southeast Asia was produced using the 2019 Sentinel-1 time-series data and the proposed method. The accuracy of the proposed method on the validation sample set was 92.20 %. Our rice area map correlated well with statistical data and was consistent with other rice area maps. These results demonstrate the advantages of the proposed method for rice area mapping with complex cropping patterns. The rice area map we produced will provide data support for agricultural resource studies, such as yield prediction and agricultural management.

Figure 1. Location of the study area. The Sentinel-1 data with orbit frame 99-16 were used for the training samples, and the rice and non-rice flags show the distribution of the validation sample set. The base map is from Esri.

Figure 2. Flowchart of the proposed rice area mapping method using Sentinel-1 data.

Figure 3. The average VH polarization backscattering coefficient curve of typical land covers (the shaded areas refer to the standard deviation calculated from the sample points).

Figure 4. The pseudo-color image synthesized from three SAR feature parameters (R: $\sigma^0_{\max}$; G: $\sigma^0_{\min}$; B: $\sigma^0_{\mathrm{var}}$) and the corresponding optical image from © Google Earth.

Figure 5. Structure of the U-Net model.

Figure 6. Twenty-meter resolution rice area map of five countries in mainland Southeast Asia in 2019.

Figure 8. Comparison of the extracted rice area with the statistical rice-harvested area and IRRI data set at the provincial scale. N is the number of provinces in each country.

Figure 9. Comparison of our extracted rice area map of Ayeyarwady Province with IRRI data: (a) our rice area extraction result; (b) the enlarged view of the red box in panel (a); (c) the pseudo-color image of SAR features corresponding to panel (b); (d) IRRI rice data; (e) the enlarged view of the red box in panel (d); (f) the optical image (© Google Earth) corresponding to panel (e).

Figure 10. Comparison of our rice area map with available rice area products in typical regions. Our extraction results (a1-c1, a2-c2, a3-c3); VLUCD rice map (d1-f1, d2-f2); IRRI rice data (d3-f3). The figures in the second column show the enlarged views of the red boxes in the figures of the first column, and the figures in the third column show the enlarged views of the red boxes in the figures of the second column.

Table 1. List of Sentinel-1 SAR data in 2019 used in this study.

Table 2. Data information used in the study.

Table 3. Information of the validation sample set.

Table 4. Accuracy of the rice area map based on the validation set.

Figure 7. Comparison of the extracted rice area with the statistical rice-harvested area and IRRI data set at a national-level scale. M is the number of countries.

Table 5. Statistics, other rice area maps and the extracted rice area for five Southeast Asian countries.

As shown in Fig. 7 and Table 5, compared with IRRI rice data, the extraction area of Cambodia, Laos and Thailand was close to that of IRRI, while that of Myanmar and Vietnam was slightly lower. Compared with the statistical data, the extraction areas of Cambodia and Laos were in good agreement with the statistical data. The extraction areas of rice in Myanmar and Vietnam were lower, while those in Thailand were slightly higher.