Status of the Tibetan Plateau observatory (Tibet-Obs) and a 1 10-year (2009-2019) surface soil moisture dataset 2

. The Tibetan Plateau observatory of plateau scale soil moisture and soil temperature (Tibet-Obs) 16 was established ten years ago, which has been widely used to calibrate/validate satellite- and model-based 17 soil moisture (SM) products for their applications to the Tibetan Plateau (TP). This paper reports on the status 18 of the Tibet-Obs and presents a 10-year (2009-2019) surface SM dataset produced based on in situ 19 measurements taken at a depth of 5 cm collected from the Tibet-Obs that consists of three regional-scale SM 20 monitoring networks, i.e. the Maqu, Naqu, and Ngari (including Ali and Shiquanhe) networks. This surface 21 SM dataset includes the original 15-min in situ measurements collected by multiple SM monitoring sites of 22 the three networks, and the spatially upscaled SM records produced for the Maqu and Shiquanhe networks. 23 Comparisons between four spatial upscaling methods, i.e. arithmetic averaging, Voronoi diagram, time stability,


Naqu network 144
The Naqu network is located in the Naqu river basin with an average elevation of 4500 m. The climate type 145 is characterized as cold-semiarid with cold dry winters and rainy summers. Over three-quarters of total annual 146 precipitation (400 mm) falls between June and August (Su et al.  Table A5, and the typical characteristics of topography and land cover within the network are shown in Fig.  151 5 as well. The Decagon 5TM ECH2O probes were installed at depths of 5/2.5, 10/7.5, 15, 30, and 60 cm to 152 measure the SMST, and the soil-specific calibration was also performed by van der Velde (2010) that yields 153 a RMSE of about 0.029 m 3 m -3 . Table 4 provides the specific periods of data missing during each year and 154 the total data lengths of surface SM for each site. Among these sites, only two sites (Naqu and MS sites in 155 Table A5) have collected more than six years of SM measurements, while the data records for the others are 156 less than four years. Similar to the Ali network, the Naqu network will also not be used for the further analysis 157 in this study due to limited number of monitoring sites and data records. 158

Precipitation data 159
The precipitation data is from two weather stations, i.e. Maqu (34°00'N, 102°05'E) and Shiquanhe (32°30'N, 160 80°05'E), operated by the China Meteorological Administration (CMA) which provides the near-surface 161 meteorological data of about 700 weather stations in China. The daily precipitation data can be downloaded 162 from https://data.cma.cn/dataService/cdcindex/datacode/SURF_CLI_CHN_MUL_DAY.html that is in 163 Chinese. The data is only available to agreement users, which is not allowed to be shared without permission 164 from the CMA. the data is available from https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-land?tab (last 171 verified on 11 March 2021). More information about the ERA5-land product can be referred to  Sabater et al., (2018). In this study, the data of volumetric total soil water content for the top soil layer (0-7 173 cm) is used for the analysis. 174

MERRA2 soil moisture product 175
MERRA2 is an atmospheric reanalysis dataset produced by NASA using the Goddard Earth Observing 176 System Model version 5 (GEOS-5) and atmospheric data assimilation system (ADAS), version 5.12.4. 177 MERRA2 provides SM data currently available from 1980 to present at hourly time interval with a spatial 178 resolution of 0.5° (latitude) by 0.625° (longitude), and the data is available from 179 https://disc.gsfc.nasa.gov/datasets/M2T1NXLND_5.12.4/summary (last verified on 11 March 2021). More 180 information about the MERRA2 product can be referred to GMAO (2015). In this study, the data of 181 volumetric liquid soil water content for the surface layer (0-5 cm) is used for the analysis. More information about the GLDAS Noah product can be referred to Rodell et al. (2004). In this study, the 189 data of soil water content for the top soil layer (0-10 cm) is used for the analysis. represents the total number of SM monitoring sites, represents the time (e.g. the t th day), and [-] represents 199 the weight vector. 200 In this study, only the surface SM measurements taken from the Maqu and Shiquanhe networks are upscaled 201 to obtain the regional-scale SM for a long-term period due to the availability of much longer data records in 202 comparison to the Naqu and Ali networks (see Section 2.1). Four upscaling methods are investigated and 203 inter-compared with each other to find the most suitable method for the application to the Tibet-Obs. Brief 204 descriptions of the selected upscaling methods are given in Appendix B. The arithmetic averaging method 205 (hereafter "AA") assigns an equal weight coefficient to each SM monitoring site (see Appendix B.1), while 206 the Voronoi diagram method (hereafter "VD") determines the weight based on the geographic distribution 207 of all the SM monitoring sites (see Appendix B.2). On the other hand, the time stability method (hereafter 208 "TS") regards the most stable site as the representative site of the SM monitoring network (see Appendix 209 B.3), while the apparent thermal inertia (ATI) method is based on the close relationship between apparent 210 thermal inertia ( ) and SM (see Appendix B.4). 211

Trend analysis 212
The Mann-Kendall test and Sen's slope estimate (Gilbert, 1987;Mann, 1945;Smith et al., 2012) are adopted 213 in this study to analyze the trend of 10-year upscaled SM time series and model-based products (i.e. ERA5-214 land, GLDAS Noah, and MERRA2). Specifically, the trend analysis is based on the monthly average SM, 215 and all the missing data is regarded as an equal value smaller than other valid data. The test consists of 216 calculating the seasonal statistics S and its variance VAR(S) separately for each month during the 10-year 217 period, and the seasonal statistics are then summed to obtain the Z statistics. 218 For the month (e.g. January), the statistics can be computed as: 219

221
where k and l represent the different year and l > k, SMi,l and SMi,k represent the monthly average SM for the 222 month of the year k and l, respectively. 223 The ( ) is computed as: 224 where is the length of the record for the month (e.g. the 10 year data record in this study with =10), 226 is the number of equal-value data in month , , is the number of equal-value data in the th group for 227 month . 228 After obtaining the and ( ), the statistic ′ and ( ′ ) for the selected season (e.g. warm season 229 between May and October and cold season between November and April in this study) can be summed as: 230 where M represents the number of months in the selected season, e.g. M = 12 for the full year, while M = 6 233 for the warm and cold season, respectively. 234 Then the statistics Z can be computed as: 235 If the statistics is positive (negative) and its absolute value is greater than 1− /2 (here = 0.05, 1− /2 = 237 1.96), the trend of the SM time series is regarded as upward (downward) at the significance level of . 238 Otherwise, we accept the hypothesis that there is not significant trend found for the SM time series. 239 If the trend is monotonous, we will further estimate the slope (change per unit time) with Sen's method (Sen, 240 1968 where represents the SM that is considered as the ground truth, and ̅ represents the upscaled SM. 251 The closer the metric is to zero, the more accurate the estimation is. 252 The metrics used for the correlation analysis are the Nash-Sutcliffe efficiency coefficient (NSE [-]) as: 253 (4) 254 The value of the NSE ranges from -∞ to 1, and the closer the metric is to 1, the better the match of the 255 estimated SM with the reference ( ). 256 The metrics used to define the most representative SM time series (i.e. the best upsclaed SM) is the

Preprocessing of model-based soil moisture products 264
The performance of the ERA5-land, MERRA2, and GLDAS Noah SM products are assessed using the 265 upscaled SM data of the Maqu and Shiquanhe networks for a 10-year period. The corresponding regional-266 scale SM for each product can be obtained by averaging the grid data falling in the network areas. The 267 numbers of grids covering the Maqu and Shiquanhe networks are 77 and 20 for the ERA5-land product, 12 268 and 4 for the GLDAS Noah product, and only one for the MERRA2 product. Moreover, the ERA5-land and 269 MERRA2 products with the temporal resolution of hourly and 3-hourly are averaged to daily-scale, and the 270 unit of GLDAS Noah SM is converted from kg m −2 to m 3 m −3 . The uppermost layer of the ERA5-land (0-7 271 cm), MERRA2 (0-5 cm), and GLDAS Noah (0-10 cm) SM products are considered to match the in situ 272 observations at depth of 5 cm. 273 4 Results 274

Inter-comparison of soil moisture upscaling methods 275
In this section, four upscaling methods (see Section 3.1) are inter-compared first with the input of the 276 maximum number of available SM monitoring sites for a single year in the Maqu and Shiquanhe networks 277 to find the most suitable upscaled SM that can best represent the areal conditions (i.e. ground truth, SMtruth). 278 Later on, the performance of the four upscaling methods is further investigated with the input of reducing 279 number of SM monitoring sites to find the most suitable method for producing long-term (~10 year) upscaled 280 SM for the Maqu and Shiquanhe networks. 281 and August). On the other hand, the SMATI-max for the Shiquanhe network is comparable to SMAA-max and 290 SMVD-max, while deviation is observed for the SMTS-max. It seems that the ATI method performs better in the 291 Shiquanhe network due to the existence of a stronger relationship between τ and θ in the desert ecosystem. 292 Table B1 lists the values of MRD (see Eq. (B4) in Appendix B), ( ) (Eq. (B3)), and CEC (Eq. (B6)) 293 calculated for the upscaled SM produced by the four upscaling methods. The CEC is used here to determine 294 the most suitable upscaled SM that can best represent the areal conditions for the two networks. It can be 295 found that the SMAA-max yields the lowest CEC values for both networks, indicating that the SMAA-max can be 296 used to represent actual areal conditions, which will thus be regarded as the ground truth for following 297 analysis (i.e. SMtruth). The arithmetic average of the dense in situ measurements was also used as the ground 298 and four (i.e. SQ02, SQ03, SQ06, and SQ14) monitoring sites that provided more than nine years of in situ 302 SM measurement data for the Maqu and Shiquanhe networks, respectively (see Tables 2 and 3). This 303 indicates that the minimum number of available monitoring sites can be used to produce the long-term (~10 304 year) consistent upscaled SM are three and four for the Maqu and Shiquanhe networks, respectively. Fig. 7  305 shows the daily average SM time series produced by the four upscaling methods based on the minimum 306 available monitoring sites (hereafter "AA-min", "TS-min", "VD-min", and "ATI-min"). The SMtruth obtained 307 by the AA-max is also shown for comparison purpose. For the Maqu network, the upscaled SM produced by 308 the AA-min, VD-min, and TS-min generally capture well the SMtruth variations, while the upscaled SM of 309 the ATI-min shows dramatic deviations. Similarly, the upscaled SM produced by the AA-min and VD-min 310 are consistent with the SMtruth for the Shiquanhe network with slight overestimations, while significant 311 deviations are noted for the upscaled SM of the TS-min and ATI-min. Table B2 lists the error statistics (e.g. 312 Bias, RMSE, ubRMSE, and NSE) computed between the upscaled SM produced by these four upscaling 313 methods with the input of the minimum available sites and the SMtruth. The upscaled SM produced by the 314 AA-min shows better performance for both networks as indicated by the lower RMSE and higher NSE values 315 in comparison to the other three upscaling methods. 316 Apart from the maximum and minimum numbers of available SM monitoring sites mentioned above, there 317 are about 14, 10, 8, and 6 available monitoring sites during different time spans for the Maqu network, and 318 for the Shiquanhe network are about 11, 10, 6, and 5 available monitoring sites (see Tables A2 and A4 in the 319 Appendix A). Fig. B2 shows the radar graph of error statistics (i.e. RMSE and NSE) computed between the 320 SMtruth and the upscaled SM produced by the four upscaling methods based on the input of different numbers 321 of available monitoring sites. For the Maqu network, the performances of the AA and VD methods are better 322 than the TS and ATI methods as indicated by smaller RMSEs and higher NSEs for all the estimations. A 323 similar conclusion can be obtained for the Shiquanhe network, while the performance of the ATI method is 324 largely improved when the number of available monitoring sites is not less than 10. It is interesting to note 325 that the upscaled SM produced by the AA-min are comparable to those produced with more available sites 326 (e.g. 10 sites) as indicated by comparable RMSE and NSE values for both networks. It indicates that the AA-327 min is suitable to produce long-term (~10 years) upscaled SM for both networks, which yield RMSEs of 328 0.022 and 0.011 m 3 m -3 for the Maqu and Shiquanhe networks in comparison to the SMtruth produced by the 329 AA-max based on the maximum available monitoring sites. 330

Long-term analysis of upscaled soil moisture measurements 331
In this section, the AA-min is first adopted to produce the consecutive upscaled SM time series (hereafter 332 "SMAA-min") for an approximately 10-year period for the Maqu and Shiquanhe networks, respectively. In 333 addition, the other time series of upscaled SM are produced by the AA method with input of all available SM 334 monitoring sites regardless of the continuity (hereafter "SMAA-valid"), which is widely used to validate the precipitation of the Maqu network area for the full year, warm seasons, and cold seasons in a 10-year period. 347 As described in Section 3.2, the time series would present a monotonous trend if the absolute value of 348 statistics Z is greater than a critical value, i.e. Z0.05 = 1.96 in this study. The results show that there is not 349 significant trend found for both precipitation and SMAA-min time series, while the SMAA-valid shows a drying 350 trend with a Sen's slope of -0.008 for warm seasons. The drying trend of the SMAA-valid is caused by the 351 change of available SM monitoring sites (see Table A2). Specifically, several monitoring sites (e.g. NST11-352 NST15) located in the wetter area were damaged since 2013, and four new monitoring sites (i.e. NST21-353 NST25) were installed in the drier area in 2015 (see Table 2) that affect the trend of the SMAA-valid. 354 to the change of available SM monitoring sites (see Table A4) as the Maqu network. Specifically, several 362 monitoring sites (e.g. SQ11 and SQ12) located in the wetter area were damaged around 2014, and five new 363 monitoring sites (i.e. SQ17-21) were installed in the drier area in 2016 (see Table 3). 364 In summary, the SMAA-valid are likely affected by the change of available SM monitoring sites over time that 365 leads to inconsistent trend with the SMAA-min. This indicates that the SMAA-min is superior to the SMAA-valid for 366 the production of the long-term consistent upscaled SM time series. 367

Application of the long-term upscaled soil moisture to validate the model-based products 368
In this section, the long-term upscaled SM time series (i.e. SMAA-min) produced for the two networks are 369 applied to validate the reliability of three model-based SM products, i.e. ERA5-land, MERRA2, and GLDAS 370 Noah, to demonstrate the uniqueness of this dataset for validating existing reanalysis datasets for a long term 371 period (~10 years). Since the ERA5-land product only provides the data of volumetric total soil water content, 372 the period when the soil is subject to freezing and thawing transition (i.e. November-April) is excluded for 373 this evaluation. 374  Table 5 as well. Although the three products generally capture the seasonal 392 variations of the SMAA-min, both GLDAS Noah and MERRA2 products overestimate the SMAA-min for the 393 entire study period leading to positive bias values, and overestimation is also noted for the ERA5-land product 394 in the warm season with bias of about 0.002 m 3 m -3 . The trend analysis for the three SM products are also 395 shown in Fig. 9b. Both the ERA5-land and MERRA2 products are able to reproduce the wetting trend found 396 for the SMAA-min, while the GLDAS Noah product cannot capture well the trend. 397 In summary, the currently model-based SM products still show deficiencies in representing the trend and 398 variation of measured SM dynamics for a long-term period (~10 years) in the Tibetan grassland and desert 399 ecosystems that dominate the landscape of the TP.  Table  418 6. In general, the percentage increases with increasing number of monitoring sites at any RMSE levels. It can 419 be noted that more than 50% of combinations are able to comply with the RMSE requirement of 0.  Table 6). In order to find the optimal combination with 12 sites for the Maqu network, 435 all the possible combinations (i.e. the number of 6188) are ranked by RMSE values from the smallest to 436 largest, and Table 7 lists the examples of ranking 1-5 th and 95-100 th . It can be noted that the 100 th combination 437 contains the largest number of currently available monitoring sites (i.e. 7 sites including CST03, CST05, 438 NST01, NST03, NST05, NST06, and NST10) with a RMSE of less than 0.006 m 3 m -3 . Therefore, the 100 th 439 combination of 12 monitoring sites (as shown in Table 7) is suggested for the Maqu network. 440 In summary, it is suggested to maintain well current 12 monitoring sites for the Shiquanhe network, while 441 for the Maqu network it is suggested to restore five old monitoring sites, i.e. CST02, NST11, NST13, NST14, 442 and NST15.  LPRM AMSR-E SM product, ASCAT SM product i) The weighted average of SM depended on the percentage spatial coverage strata can be regarded as the ground reference.
ii) The AMSR-E and ASCAT products are able to provide reasonable area SM during monsoon seasons. i) The ECV and ERA products give the best performance, and all products are able to capture the SM dynamic except for the NASA product.
ii) The JAXA AMSR-E/AMSR2 products underestimate SM, while the ASCAT product overestimates it.
iii) The SMOS product exhibits big noise and bias, and the LPRM AMSR-E product shows a significantly larger seasonal amplitude. i) The JAXA AMSR2 product underestimates area SM while the LPRM AMSR2 product overestimates it.
ii) The SMOS-IC product exhibits some noise of SM temporal variation.
iii) The SMAP product has the highest accuracy among the five products while FY3B shows relatively lower accuracy.
Yang et al.
Maqu and Ngari network, period between 2008 and 2011 AMSR-E brightness temperature product The assimilated SM products exhibit higher accuracy than the AMSR-E product and LSM simulations for wet areas, whereas their accuracy is similar for dry areas.

B.1 Arithmetic averaging
The arithmetic averaging method assigns an equal weight coefficient to each SM monitoring site of the network, which can be formulated as: 730 where represents the i th SM monitoring site.

B.2 Voronoi diagram
The Voronoi diagram method divides the network area into several parts according to the distances between each SM monitoring site. This approach determines the weight of each site ( [-]) based on the geographic 735 distribution of all the SM monitoring sites within the network area, which can be formulated as:

B.3 Time stability
The time stability method is based on the assumption that the spatial SM pattern over time tends to be consistent (Vachaud et al., 1985), and the most stable site can be regarded as the representative site of the 740 network. For each SM monitoring site within the time window (M days in total), the mean relative represents the mean SM measured at all available monitoring sites on the t th day.
quantifies the bias of each SM monitoring site to identify a particular location is wetter or drier than regional mean, and σ( ) The most stable site is identified by the lowest value.

B.4 Apparent thermal inertia
The apparent thermal inertia (ATI) method is based on the close relationship between apparent thermal inertia The core issue of the ATI approach is to obtain the ̅ and minimize the cost function of Eq. (B8) to obtain 765 β and R. The ̅ can be retrieved from the apparent thermal inertia by the empirical regression g( ), and has strong connection with the surface status, e.g. land surface temperature and albedo, which is defined as: The amplitude of the diurnal LST A is estimated as LSTmax -LSTmin for a single day. Finally, we use the regression analysis between in situ SM measurements ( ) at each monitoring site and corresponding ATI (τ) to obtain the g(•) form. 785 There are 17 and 12 monitoring sites participate in the regression analysis for the Maqu and Shiquanhe networks during the periods of 11/2009-10/2010 and 8/2018-7/2019, respectively. The ATI cannot be obtained for each monitoring site in every day since the satellite-based LST data are contaminated by clouds.
In order to make full use of the data, we make the ATI-SM pair for the 1st monitoring site on the 1st day as No. 1, the pair for the 17 th (or 12 th ) monitoring site in the Maqu (or Shiquanhe) network on the 1st day as the 790 No. 17 (or No. 12), the pair for the 1st monitoring site at the 2nd day as the No. 18 (No. 13), and so on. Later on, we select a certain number of ATI-SM pairs (e.g. 40, 50, 60, 70, 80, 90, and 100) as a group to compute the averaged ATI and SM and construct the most reasonable regression relationship between them. If the ATI or SM data at one day is missing, this pair is ignored. As shown in Fig. B1, the empirical relationship is generated from 80-pair-averaged ATI and SM for the Maqu and Shiquanhe networks. 795 When the empirical relationship g(•) is determined, the regional-average SM can be derived from gridaveraged ATI by the function g(•), which it is regarded as ̅ in Eq. (B8). Finally, the optimal (̂) is obtained by minimizing the cost function (i.e. Eq. (B8)), and the upscaled SM can be estimated as: The detailed description of the ATI method is referred to Qin et al. (2013). 800