Geometric accuracy assessment of global coarse resolution satellite data sets: a study based on AVHRR GAC data at the subpixel level

Abstract: AVHRR GAC (Global Area Coverage) data provide daily global coverage of the Earth and are widely used for global environmental and climate studies. However, their geolocation accuracy has not been comprehensively evaluated because of the difficulty caused by onboard resampling and the resulting coarse resolution, which hampers their usefulness in various applications. In this study, a Correlation-based Patch Matching Method (CPMM) is proposed to characterize and quantify the AVHRR GAC geolocation accuracy at the subpixel level. This method is not limited to landmarks and does not suffer from errors caused by false detection due to the effect of mixed pixels, and thus enables a more robust and comprehensive geometric assessment. Data from the NOAA-17, MetOp-A, and MetOp-B satellites were selected to test the geocoding accuracy. The three satellites predominantly present westward shifts in the across-track direction, with average values of -1.69 km, -1.9 km, and -2.56 km and standard deviations of 1.32 km, 1.1 km, and 2.19 km for NOAA-17, MetOp-A, and MetOp-B, respectively. The large shifts and uncertainties are partly induced by larger satellite zenith angles (SatZ) and partly by the terrain effect, which is related to SatZ and becomes apparent at large SatZ. It is thus suggested that GAC data with SatZ less than 40° should be preferred in applications. The along-track geolocation accuracy is clearly better than the across-track accuracy, with average shifts of -0.7 km, -0.02 km, and 0.96 km and standard deviations of 1.01 km, 0.79 km, and 1.70 km for NOAA-17, MetOp-A, and MetOp-B, respectively. The data can be accessed from http://www.esa-cloud-cci.org/ (Stengel et al., 2017) and https://ladsweb.modaps.eosdis.nasa.gov/ (Didan, 2015).

of quantifying the geometric accuracy of coarse resolution satellite data available as fundamental climate data records (FCDR) for global applications (Hollmann 2013).
We show the procedure based on AVHRR GAC data, which are used in the ESA CCI cloud project (Stengel et al., 2017) and now also in the ESA CCI+ snow project. The assessment is conducted at the sub-pixel level and is not affected by the mixed pixel problem. The method is applied to test data from NOAA-17, MetOp-A, and MetOp-B. Furthermore, the potential factors that cause geometric distortions are explored and discussed.

the band-to-band (BBR)

corrections, the spatial misplacement of the GAC scene caused by these factors can occasionally be up to 25-30 km (Devasthale et al., 2016). For geocoding of AVHRR data, a two-step approach is usually used: 1) geocoding based on an orbit model, ephemeris data, and the time of the onboard clock (Van et al., 2008), achieving an accuracy within 3-5 km depending on the accuracy of the orbit parameters and model (Khlopenkov et al., 2010); 2) using ground control points (GCPs) (e.g., road or river intersections, coastlines) to improve the geocoding (Takagi, 2004; Van et al., 2008). Additionally, an orthorectification would be needed to eliminate the ortho-shift caused by elevation (Aguilar et al., 2013; Khlopenkov et al., 2010). The dataset used in this study is from the ESA (European

From the standpoint of geometric accuracy assessment, the reflectances in bands 1 and 2 were employed in this study. However, these two bands are affected not only by the atmosphere but also by the Earth surface anisotropy characterized by the bidirectional reflectance distribution function (BRDF) (Cihlar et al., 2004). Given that BRDF effects can be reduced through the calculation of vegetation indices such as NDVI (Lee & Kaufman, 1986), the NDVI is employed in this study, which is derived from the reflectance in bands 1 and 2 according to

$$\mathrm{NDVI} = \frac{R_2 - R_1}{R_2 + R_1}$$

where $R_1$ and $R_2$ refer to the reflectance in bands 1 and 2, respectively. It is important to note

requirement of an order of magnitude better than one-tenth of the image spatial resolution (Aksakal, 2013), which means 400 m for the AVHRR GAC data. The NDVI provided by the MOD13A1 V006 product was introduced as the source of reference data for the geometric quality assessment, because the sub-pixel accuracy of the MODIS product is sufficient to satisfy this requirement (Wolfe et al., 2002). The high geolocation accuracy of MODIS products was achieved by an advanced data processing system, in which the models of spacecraft and instrument orientation have been updated several times since launch. Consequently, the various geolocation biases resulting from instrument effects and sensor orientation are removed (Wolfe et al., 2002). The NDVI data with dates corresponding to those of the AVHRR GAC data were obtained from the Level-1 and Atmosphere Archive & Distribution System (LAADS) Distributed Active Archive Center (DAAC) (https://ladsweb.modaps.eosdis.nasa.gov/) in the sinusoidal projection at a spatial resolution of 500 m and a temporal resolution of 16 days. The detailed description of the MOD13A1 V006 product can be found in Didan (2015).
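The derivation of NDVI from the band 1 and band 2 reflectances described above can be sketched as follows. This is a minimal illustration that assumes the standard normalized-difference definition, with band 1 as the red channel and band 2 as the near-infrared channel; the function name `ndvi` and the handling of zero denominators are assumptions made here, not details taken from the study.

```python
import numpy as np

def ndvi(band1, band2):
    """NDVI from band 1 (red) and band 2 (near-infrared) reflectance.

    Assumption: standard definition (R2 - R1) / (R2 + R1).
    Pixels with a zero denominator are returned as NaN.
    """
    r1 = np.asarray(band1, dtype=float)
    r2 = np.asarray(band2, dtype=float)
    total = r1 + r2
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN.
    np.divide(r2 - r1, total, out=out, where=total != 0)
    return out
```

Because the index is a ratio of a difference to a sum, a multiplicative change that affects both bands equally cancels out, which is consistent with the motivation given above for preferring NDVI over the raw reflectances.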

Geographical regions of interest
The purpose of this study is not only to assess the geolocation accuracy of 4 km AVHRR GAC data, but also to explore the potential factors affecting geolocation accuracy. Therefore, the investigations were made at different latitudes and longitudes, at different locations with different SatZ, for different land covers, and for different topographies. The swaths covering parts of Europe (including the Alpine mountains) and Africa were used since they fit the study needs (Fig. 1). Investigations were based on six regions of interest (ROIs) as shown in Figs. 1 and 2. ROIs 1 to 6 enable us to investigate the geolocation accuracy at different SatZ, topographies, latitudes, and longitudes. Their locations and extents are consistent for the scenes from NOAA-17 and MetOp-A (Fig. 1), which enables the comparison of geolocation accuracy between these two sensors. Each ROI was made as large as possible in order to obtain more significant and comprehensive results. On the other hand, areas covered by cloud and water had to be avoided, resulting in the different sizes of these ROIs. Half of the ROIs (ROIs 2, 4, 6) serve as good examples of typical mountainous areas on Earth. The other half (ROIs 1, 3, 5) mainly cover relatively flat areas. Since the NOAA-17 scene was almost unaffected by cloud, another ROI (ROI 7) was selected to check the geolocation accuracy at nadir. The MetOp-B scene was influenced by cloud but served as a good example to illustrate the combined effect of topography and large SatZ (Fig. 2). Although six ROIs were also selected for this scene, their sizes and extents are totally different from those of the above two scenes. In order to include the terrain area, two subsets were used (Figs. 2a and c).
Each grid cell in the ROI represents the minimum unit (namely the patch) on which the geometric quality analysis is based.

A Correlation-based Patch Matching Method (CPMM) is proposed to find the best match between small image patches taken from the reference images and the AVHRR GAC images. This method is expected to be more suitable for the geometric accuracy assessment of coarse resolution images than the current methods, i.e. the CGM, LFM, and co-registration using shorelines, because it is not limited to a certain landmark such as a lake or sea shoreline and thus enables a more comprehensive assessment over different areas in the satellite scene. Moreover, this method does not suffer from errors caused by false detection due to the effect of mixed pixels because it is applied directly to the pixel values. The framework of CPMM is shown in Fig. 3, and a detailed description of the method is provided below.

The AVHRR GAC data set is stored in the Network Common Data Form (NetCDF), with latitude and longitude assigned to each pixel. In order to achieve a higher accuracy of image matching, the data need to be reprojected. The AVHRR GAC scene was reprojected into the Lambert Conformal Conic (LCC) projection by building the Geographic Lookup Table (

Patch matching and geometric assessment
In the process of matching the AVHRR GAC data with the reference MODIS data, a patch size of 7 × 7 AVHRR pixels (corresponding to approximately 28 km × 28 km) was used. These

proven to be most ideal for these criteria during the test of different patch sizes.

For each patch in the ROI, the AVHRR GAC data within the patch were extracted. Then the patch was shifted in the Y- and X-directions as indicated by the blue arrows in Fig. 3. Shifts were conducted stepwise in order to achieve sub-pixel accuracy, beginning with 500 m and increasing up to 8 km (i.e., ±2 pixels) at a step of 500 m (equivalent to the MODIS pixel size) in any combination of Y- and X-directions. Consequently, 33 × 33 combinations of X- and Y-shifts were simulated (16 steps on either side of zero, plus the zero shift, along each axis). For each shift, the MODIS NDVI pixels within the extent of the patch were extracted and aggregated to 4 km by spatial averaging. Afterwards, the correlation between the 4 km rescaled MODIS NDVI and the 4 km AVHRR NDVI was calculated for each shift in the X- and Y-directions. The displacement of a patch was given by the shift combination with the best correlation, which represents the geolocation accuracy of the patch. In this way, the geolocation errors were transformed into the across-track and along-track directions at the sub-pixel level for correlation with possible error sources.

It is expected that the results from each patch are different. Therefore, the general accuracy
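The shift search described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the AVHRR patch and the MODIS reference NDVI are already on the same projected grid, that one AVHRR pixel covers exactly 8 × 8 MODIS pixels, and that the data contain no missing values. The names `block_mean`, `best_shift`, `FACTOR`, `PATCH` and `MAX_SHIFT` are hypothetical, and the sign convention (positive `dy` toward increasing row index, positive `dx` toward increasing column index) is an assumption. The conversion of X/Y shifts into across-track and along-track components is not covered.

```python
import numpy as np

FACTOR = 8      # MODIS pixels per AVHRR pixel along one axis (4 km / 500 m)
PATCH = 7       # patch size in AVHRR pixels
MAX_SHIFT = 16  # +/- 8 km expressed in 500 m steps

def block_mean(a, f):
    """Aggregate a 2-D array by averaging non-overlapping f x f blocks."""
    h, w = a.shape
    return a.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def best_shift(avhrr_patch, modis, row0, col0):
    """Return (dy_km, dx_km, r) for the shift with the highest correlation.

    avhrr_patch: PATCH x PATCH array of AVHRR NDVI.
    modis: 2-D reference NDVI at 500 m on the same projection.
    row0, col0: MODIS index of the patch's upper-left corner at zero shift.
    """
    size = PATCH * FACTOR
    best = (0.0, 0.0, -np.inf)
    # 33 x 33 candidate shifts: -8 km to +8 km in 0.5 km steps on each axis.
    for dy in range(-MAX_SHIFT, MAX_SHIFT + 1):
        for dx in range(-MAX_SHIFT, MAX_SHIFT + 1):
            r0, c0 = row0 + dy, col0 + dx
            if r0 < 0 or c0 < 0:
                continue  # shifted window falls off the reference image
            window = modis[r0:r0 + size, c0:c0 + size]
            if window.shape != (size, size):
                continue
            coarse = block_mean(window, FACTOR)  # 500 m -> 4 km
            r = np.corrcoef(coarse.ravel(), avhrr_patch.ravel())[0, 1]
            if np.isfinite(r) and r > best[2]:
                best = (dy * 0.5, dx * 0.5, r)
    return best
```

An exhaustive search is affordable here because each patch requires only 1089 correlations over 49 values. Real scenes would additionally need masking of cloud and water pixels before the correlation is computed.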

Influence factor
The influence of potential variables on the geometric accuracy was studied, including SatZ, topography, latitude, and longitude. To achieve this, the information on these factors was also extracted for each patch in the scene.

showing the spatial distribution of correlation between the MODIS reference scene and the AVHRR data (Fig. 4). The color coding indicates a high correlation in dark green and reddish-

ROIs 7, 3, 1, and 4 show slightly larger mean shifts, but their magnitudes are still less than 2.5 km. These results are unexpected, because the ROIs over terrain areas (ROIs 2 and 6) show smaller shifts in the across-track direction than those over relatively flat areas (ROIs 7, 3, 1, 4). One possible reason is that the SatZ for ROIs 2 and 6 are not large (less than 40°) (Fig. 1b), so that the terrain effect on geolocation accuracy is counterbalanced by the small SatZ. This also indicates that the influence of small SatZ may be stronger than the terrain effect. But it is surprising that ROI 7 (Fig. 5g), which is located in the nadir area (Fig. 1b)

When combining the results of all ROIs together (Fig. 5h)

The shifts in the along-track direction are mainly negative throughout these ROIs, indicating that the NOAA-17 scene is dominated by southward shifts in the along-track direction.

0.28 and -0.29, respectively. These shifts are generally small in these three regions given that the maximum shift is no more than 3.5 km (Table 2). In contrast, ROIs 2, 5, 6 and 7 present systematic shifts to the south, which are mostly distributed within the range of -2 to 0 km, with

Furthermore, the distribution of shifts in the along-track direction is less widely spread than that in the across-track direction, demonstrating the smaller uncertainty of geocoding in the along-track direction, as indicated by the smaller StdDev values throughout these ROIs (Table 2). Moreover, the geolocation errors in the across-track direction are greater than those in the along-track direction (Fig. 5), which is expected because of the applied clock drift correction.

Table 2. Summary of the results for the scene of NOAA-17. The unit of the shift is km.

respectively). These results demonstrate that SatZ plays a crucial role in determining the uncertainty of the shifts in the across-track direction. This conclusion also agrees with previous research conducted by Aguilar et al. (2013). When combining the results of all ROIs (Fig. 6g),

A scene are slightly closer to the nadir area than those on the NOAA-17 scene (Figs. 1b and d). This can be further confirmed by the consistently smaller StdDev values in the along-track direction than those in the across-track direction as shown in Table 3.

Table 3. Summary of the results for the scene of MetOp-A. The unit of the shift is km.

(Table 3) than those for NOAA-17 (Table 2). Therefore, it can be concluded that the MetOp-A scene shows a better geolocation accuracy and less uncertainty than the NOAA-17 scene in the along-track direction.
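The kind of per-ROI summary reported in the result tables (minimum, maximum, mean, standard deviation and number of patches for the shifts along each axis) can be sketched as follows. The function name `summarize_shifts` is hypothetical. Whether the reported StdDev is the sample or the population standard deviation is not stated in the text, so the use of `ddof=1` (sample standard deviation) is an assumption.

```python
import numpy as np

def summarize_shifts(x_km, y_km):
    """Summary statistics of per-patch shifts (in km) along X and Y.

    Assumption: StdDev is the sample standard deviation (ddof=1).
    """
    stats = {}
    for name, values in (("X", x_km), ("Y", y_km)):
        v = np.asarray(values, dtype=float)
        stats[f"Min({name})"] = float(v.min())
        stats[f"Max({name})"] = float(v.max())
        stats[f"Mean({name})"] = float(v.mean())
        stats[f"StdDev({name})"] = float(v.std(ddof=1))
    # One X shift and one Y shift per patch, so N is the number of patches.
    stats["N"] = int(np.asarray(x_km).size)
    return stats
```

Reporting N alongside the other statistics matters for interpretation, since a standard deviation estimated from only a few patches is itself uncertain.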
It can be seen that the shifts in the along-track direction are still significantly smaller than those in the across-track direction. Furthermore, the uncertainties of the shifts in the along-track direction are generally smaller than those in the across-track direction when the results of ROI 1 are excluded because of its limited number of patches (Table 4). This further verifies that, after removing clock drift errors, the geolocation in the along-track direction is generally more accurate and less uncertain than in the across-track direction.

Table 4. Summary of the results for the scene of MetOp-B. The unit of the shift is km.

Min(X) Max(X) Mean(X) StdDev(X) Min(Y) Max(Y) Mean(Y) StdDev(Y) N

The comparison of Fig. 7g with Fig. 6g and Fig. 5h reveals that the MetOp-B scene is significantly inferior to the MetOp-A scene in terms of the geolocation accuracy in the along-track direction, with the shifts of the former concentrated around 1 km and those of the latter around 0 km. Furthermore, the uncertainty of the shifts of the MetOp-B scene (StdDev = 1.7 km) is much larger than that of the MetOp-A scene (StdDev = 0.79 km). As for the performance of the MetOp-B scene relative to the NOAA-17 scene, they are comparable with regard to the magnitude as well as the distribution of the shifts in the along-track direction. However, the MetOp-B scene shows larger uncertainties than NOAA-17.

accuracy are also explored. As shown in Figs. 8a-c, the shifts in the across-track direction vary considerably for all SatZ, and this is particularly evident in the results of MetOp-B (Fig. 8c). This demonstrates that, besides the SatZ effects, the geolocation accuracy is also influenced by other factors. Furthermore, the spread at each fixed SatZ tends to become larger at larger SatZ (larger than 20°) (Figs. 8a-b). The large variability of the MetOp-B scene shifts at small SatZ (less than 20°) (Fig. 8c) is mainly due to the effect of thin cloud or cloud shadow as explained before. Despite the dispersion of the shifts for all SatZ, the shifts in the across-track direction do not change much when SatZ is less than 20° (Figs. 8a-b and Table 5). A slightly decreasing trend (an increasing trend in magnitude) can be observed from 20° to 40° (Table 5), which becomes more apparent at SatZ larger than 40° (Fig. 8c and Table 5).

be seen that the shifts in the along-track direction are relatively stable at each level of SatZ for SatZ smaller than 15°, but become more variable for greater SatZ. A similar phenomenon can be observed in Fig. 8f, where the shifts are relatively stable with SatZ ranging from 20° to 35°, but become more variable at each level of SatZ for values larger than 35°. It is noteworthy that the wide spread of shifts with SatZ less than 20° is mainly caused by cloud contamination.

(an increasing trend in magnitude) for longitudes larger than 5°. Given the fact that the latitude of the nadir area is distributed between 10°-15° for NOAA-17, 8°-15° for MetOp-A, and -8°-

the satellite, as it shows almost no influence in the nadir area. The influence increases with the difference of the longitude relative to that of the nadir area.
This is understandable, as the influence of longitude is equivalent to that of SatZ in the across-track direction. The variation of the shifts (in the along-track direction) with latitude also depends on the situation (Figs. 8j-l). The magnitudes of shifts at larger latitudes (larger than 45°) are generally greater than those at smaller latitudes (less than 40°) on the NOAA-17 (Fig. 8j) and MetOp-B scenes (Fig. 8l). This is not visible for the MetOp-A scene (Fig. 8k), where the shifts exhibit almost no change with latitude. This can be attributed to the fact that the clock drift errors are corrected more thoroughly for the MetOp-A satellite than for the NOAA-17 and MetOp-B satellites. Furthermore, in contrast to the NOAA satellites, the MetOp satellites have an on-board stabilization to keep them in the right position and orientation in orbit.
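The dependence of the shifts on SatZ, longitude or latitude discussed above can be examined by grouping the per-patch shifts into intervals of the factor of interest. The following is a minimal sketch of such a grouping. The function name `shifts_by_factor` and the default interval edges (0°, 20°, 40°, 90°, chosen to mirror the SatZ ranges mentioned in the text) are assumptions made for illustration, as is the use of the sample standard deviation.

```python
import numpy as np

def shifts_by_factor(factor, shift_km, edges=(0, 20, 40, 90)):
    """Group per-patch shifts by intervals of an influence factor.

    factor: per-patch value of the factor (e.g. SatZ in degrees).
    shift_km: per-patch shift in km along one direction.
    Returns one tuple (lower, upper, n, mean, std) per interval [lower, upper).
    """
    f = np.asarray(factor, dtype=float)
    s = np.asarray(shift_km, dtype=float)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (f >= lo) & (f < hi)
        n = int(sel.sum())
        # Mean needs at least one patch; sample std needs at least two.
        mean = float(s[sel].mean()) if n > 0 else float("nan")
        std = float(s[sel].std(ddof=1)) if n > 1 else float("nan")
        rows.append((lo, hi, n, mean, std))
    return rows
```

Comparing the mean across intervals indicates a trend in the magnitude of the shifts, while comparing the standard deviation indicates how the spread at a given factor level changes, which are the two aspects distinguished in the discussion above.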

The geometric accuracy of satellite data is crucial for most applications, as geometric inaccuracy can bias the obtained results. Therefore, the assessment of the geolocation accuracy is important for providing satellite data of high quality that enable successful applications. In this study, a correlation-based patch matching method was proposed to characterize and quantify the AVHRR GAC geolocation accuracy. The method offers significant advantages over existing approaches and makes it possible to achieve a subpixel geo-positioning accuracy for coarse resolution scenes. It is free from the impact of false detection due to the influence of mixed pixels and is not limited to a certain landmark (e.g., shoreline), and therefore enables a more comprehensive geometric assessment. This method was used to characterize the geolocation accuracy of AVHRR GAC scenes from the NOAA-17, MetOp-A, and MetOp-B satellites. The study is based on several ROIs comprising numerous patches over different land cover types, latitudes, and topographies. The scenes from these satellites all present westward shifts in the across-track direction, with an average shift of -1.69 km and a StdDev of 1.32 km for NOAA-17, -1.9 km and 1.1 km respectively for MetOp-A, and -2.56 km and 2.19 km respectively for

using the combined data from NOAA-17 and MetOp will result in additional uncertainty in time series applications. From the results above, it can be found that the geolocation accuracy in the along-track direction is always higher and less uncertain than in the across-track direction, which is consistent with previous related studies. This is understandable since the GAC dataset from the ESA cloud CCI project has been corrected for clock drift errors but has no ortho-correction, which is not feasible because of the onboard sampling characteristics.
SatZ plays a decisive role in determining the magnitude as well as the uncertainty of the shifts in the across-track direction. Larger SatZ generally induce greater shifts and uncertainties in this direction. The combined effect of SatZ and topography on geolocation accuracy in the across-track direction has also been shown, and significant terrain effects appear only in the case of large SatZ (>40° for this study). It is important to note that the effect of SatZ on the magnitude and uncertainty of shifts in the along-track direction is not negligible, but this effect is likely to be intertwined with other factors. The impact of longitude on the shifts in the across-track direction is equivalent to that of SatZ, while the effect of latitude is related to how thoroughly the clock drift errors are corrected. It was found that the clock drift errors are more thoroughly corrected for MetOp-A than for NOAA-17 and MetOp-B. Although this assessment was only conducted for a single scene of each satellite, it provides an important preliminary geolocation assessment for AVHRR GAC data. It is a first step towards a more precise geolocation and thus improves the application of coarse-resolution satellite data. For instance, it identifies the threshold of SatZ under which the GAC data should be preferred in applications. Furthermore, the CPMM geolocation assessment method proposed in this study is also applicable to other coarse-resolution satellite data.

Xiaodan Wu was responsible for the main research ideas and writing the manuscript. Kathrin Naegeli contributed to the data collection. Stefan Wunderle contributed to the manuscript organization. All the authors thoroughly reviewed and edited this paper.

The authors declare that they have no conflict of interest.