GHRSAT: the first global hourly dataset of all-sky remotely sensed estimates of surface air temperature
Abstract. Spatially continuous surface air temperature (SAT) is critically important for a wide range of fields such as eco-environmental assessments and hydrology. Remotely sensed estimation models based on satellite-derived thermal infrared data provide a structurally different approach for reconstructing SAT compared to spatial interpolation of ground observations of SAT and numerical modelling, which are mainly limited by the coverage of stations and by coarse spatial resolutions, respectively. However, the data products of remotely sensed SAT estimates developed in previous studies are only available at daily or monthly resolutions and are primarily restricted to local regions. In this study, we generated the first hourly dataset (GHRSAT) of all-sky remotely sensed SAT estimates for the global land areas except Antarctica between 2011 and 2023. The hourly estimates in GHRSAT were reconstructed from land surface temperature using hybrid estimation models that integrate random forest (RF) models and kriging techniques. The hybrid models were developed for different regions on a monthly basis. To model the site residuals from the RF models, we adopted ordinary kriging (OK) for regions with low-density stations and fixed rank kriging (FRK) for regions with high-density stations. Our results show that the hybrid models used to generate GHRSAT achieve overall cross-validation RMSEs between 1.48 °C and 2.28 °C. Compared to the RF models, the hybrid models significantly reduce the mean RMSE for estimating hourly SAT by 0.18–0.41 °C. We analyzed the variability in the predictive errors of estimating hourly SAT across regions, months and sites. This variability clearly decreases when the hybrid models are used. We found that the RF models themselves are less sensitive to their parameter tuning, whereas this tuning greatly affects the hybrid models.
Improving the RF models through parameter tuning can substantially improve the hybrid models built on them. Additionally, we found that the performance difference between OK and FRK in developing the hybrid models for regions with large numbers of stations is slight, with a mean RMSE difference of 0.05 °C. In summary, the hybrid modelling scheme achieves satisfactorily higher performance in estimating SAT and is generally applicable to regions at various scales. The GHRSAT dataset is publicly available at http://doi.org/10.11888/RemoteSen.tpdc.301540 (Zhang, 2024).
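The hybrid scheme described above (a regression estimate plus a kriged site residual) can be illustrated with a minimal sketch. This is not the implementation used to produce GHRSAT: the RF prediction is passed in as a plain number rather than computed, the exponential semivariogram and its parameters are assumptions chosen for illustration, and all function names (`exp_variogram`, `ok_weights`, `hybrid_predict`) are hypothetical. Only the ordinary kriging (OK) variant is sketched; FRK is not shown.

```python
import math


def exp_variogram(h, sill=1.0, range_=1.0, nugget=0.0):
    """Exponential semivariogram; sill, range_ and nugget are illustrative assumptions."""
    return 0.0 if h == 0 else nugget + sill * (1.0 - math.exp(-h / range_))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ok_weights(sites, target, variogram):
    """Ordinary kriging weights for `target` given station coordinates `sites`.

    Solves [[Gamma, 1], [1^T, 0]] [w, mu] = [gamma_0, 1], where the last
    unknown mu is the Lagrange multiplier that forces the weights to sum to 1.
    """
    n = len(sites)
    a = [[variogram(math.dist(sites[i], sites[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(s, target)) for s in sites] + [1.0]
    return solve(a, b)[:n]  # drop the Lagrange multiplier


def hybrid_predict(base_at_target, sites, residuals, target, variogram=exp_variogram):
    """Hybrid estimate: base-model prediction plus the OK-interpolated residual.

    `base_at_target` stands in for the RF estimate at the target location;
    `residuals` are observed-minus-predicted values at the stations.
    """
    weights = ok_weights(sites, target, variogram)
    return base_at_target + sum(w * r for w, r in zip(weights, residuals))


if __name__ == "__main__":
    stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # made-up coordinates
    station_residuals = [0.5, -0.2, 0.1]             # made-up residuals in °C
    print(hybrid_predict(10.0, stations, station_residuals, (0.4, 0.3)))
```

Because the semivariogram is zero at zero distance, the kriged residual at a location that coincides with a station equals that station's residual, so the hybrid estimate reproduces the station observation there. Away from stations, the correction is a weighted average of nearby residuals, with weights that sum to one.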