HiTIC-Monthly: a monthly high spatial resolution (1 km) human thermal index collection over China during 2003–2020

Human-perceived thermal comfort (known as human-perceived temperature) measures the combined effects of multiple meteorological factors (e.g., temperature, humidity, and wind speed) and can be aggravated under the influences of global warming and local human activities. With the most rapid urbanization and the largest population, China is being severely threatened by aggravating human thermal stress. However, the variations of thermal stress in China at a fine scale have not been fully understood. This gap is mainly due to the lack of a high-resolution gridded dataset of human thermal indices. Here, we generated the first high spatial resolution (1 km) dataset of monthly human thermal index collection (HiTIC-Monthly) over China during 2003–2020. In this collection, 12 commonly used thermal indices were generated by the Light Gradient Boosting Machine (LightGBM) learning algorithm from multi-source data, including land surface temperature, topography, land cover, population density, and impervious surface fraction. Their accuracies were comprehensively assessed based on the observations at 2419 weather stations across the mainland of China. The results show that our dataset has desirable accuracies, with the mean R², root mean square error, and mean absolute error of 0.996, 0.693 °C, and 0.512 °C, respectively, by averaging the 12 indices. Moreover, the data exhibit high agreements with the observations across spatial and temporal dimensions, demonstrating the broad applicability of our dataset. A comparison with two existing datasets also suggests that our high-resolution dataset can describe a more explicit spatial distribution of the thermal information, showing great potentials in fine-scale (e.g., intra-urban) studies. Further investigation reveals that nearly all thermal indices exhibit increasing trends in most parts


Introduction
Global climate change has brought significant challenges to human society and natural systems (Arias et al., 2021; Haines and Ebi, 2019) by inducing higher air temperature and more frequent extreme weather and climate events around the world (Arias et al., 2021; Schwingshackl et al., 2021). Heat-related disasters, e.g., heatwaves, droughts, and wildfires, are occurring more frequently and becoming more intense (Tong et al., 2021; Arias et al., 2021; Luo et al., 2022), exacerbating the thermal environment and threatening the tolerance limits of humans, animals, and plants (Raymond et al., 2020). Substantial warming and increasing extreme weather and climate events aggravate human thermal comfort and increase the exposures to uncomfortable thermal environments (Brimicombe et al., 2021), thus posing adverse impacts on public health, socio-economy, and agricultural productivities (Budhathoki and Zander, 2019; Moda et al., 2019; Tuholske et al., 2021; Sun et al., 2019; Zhao et al., 2017).
The thermal stress that human beings actually perceive is not only related to air temperature, but also jointly influenced by other environmental variables such as humidity, wind, and/or direct sunlight (Mistry, 2020; Djongyang et al., 2010). These variables alter the heat balance that maintains the core temperature of human bodies by influencing the heat exchange (e.g., radiation, convection, conduction, and evaporation) between humans and the surrounding environment (Periard et al., 2021; Stolwijk, 1975). High atmospheric humidity can exacerbate the thermal stress on human bodies by reducing evaporation from the skin through sweating when the air temperature is high (Li et al., 2018; Rogers et al., 2021; Luo and Lau, 2021). Furthermore, abnormal weather with a combination of extremely high air temperature, humidity, and/or wind can reduce labor capacity and human performance (Roghanchi and Kocsis, 2018; Lazaro and Momayez, 2020; Enander and Hygge, 1990), leading to temperature-related discomfort, stress, morbidity, and even death (Di Napoli et al., 2018; Kuchcik, 2021; Nastos and Matzarakis, 2011), particularly during heatwaves. For example, in the summers of 2017, 2018, and 2019, there were 1489, 1700, and 161 heatwave-related deaths, respectively, in the United Kingdom (Rustemeyer and Howells, 2021). Additionally, vulnerable groups including children, the elderly, chronic patients, and poor communities are at higher risk of being affected by thermal stress (Patz et al., 2005; Wang et al., 2019), which is likely to be further exacerbated by global population aging and climate warming (United Nations, 2017).
The changes and impacts of human thermal stress have attracted increasing attention in recent years (Schwingshackl et al., 2021; Krzysztof et al., 2021; Li et al., 2018; Rahman et al., 2022; Ren et al., 2022; Luo and Lau, 2021). For instance, Szer et al. (2022) estimated the impact of heat stress on construction workers based on the Universal Thermal Climate Index (UTCI). Ren et al. (2022) and Luo and Lau (2021) quantified the contribution of urbanization and climate change to urban human thermal comfort in China. Schwingshackl et al. (2021) assessed the future severity and trend of global heat stress based on Coupled Model Intercomparison Project phase 6 (CMIP6). These studies were mainly based on meteorological stations or coarse-gridded data. However, the meteorological stations are sparsely distributed (Peng et al., 2019), particularly in undeveloped and mountainous areas, and thus cannot reveal spatially continuous distributions of air temperature and thermal stress conditions (He et al., 2022). Additionally, existing low spatial resolution image products (Mistry, 2020; Di Napoli et al., 2020) cannot be applied to fine-scale studies because they cannot provide information with spatial details and variations. As a result, the changes in human thermal stress at a fine scale (e.g., 1 km × 1 km) remain much less understood. This research gap is mainly caused by the unavailability of a high spatial resolution (high-resolution) gridded dataset of human thermal stress.
Although extensive studies have been conducted to generate high-resolution land surface temperature (LST) products (such as the Land Surface Temperature in China, LSTC, Zhao et al., 2020, and the global seamless land surface temperature dataset, Zhang et al., 2022b; Hong et al., 2022) or near-surface air temperature (SAT) products (such as ERA5, Copernicus Climate Change Service, 2017; TerraClimate, Abatzoglou et al., 2018; and GPRChinaTemp1km, He et al., 2022), human thermal stress datasets were generally produced at low-resolution levels, such as ERA5-HEAT (Di Napoli et al., 2020), HDI_0p25_1970_2018 (hereafter, HDI) (Mistry, 2020), and HiTiSEA (Yan et al., 2021). ERA5-HEAT was derived from ERA5 and includes two global hourly human thermal stress indices (UTCI and mean radiant temperature (MRT)) from January 1979 to the present (Di Napoli et al., 2020). The HDI dataset was generated using 3 h climate variables of the global land data assimilation system (GLDAS), and it contains 10 daily indices with a spatial resolution of 0.25° × 0.25°, covering 90° N-60° S from 1970 to 2018 (Mistry, 2020). HiTiSEA contains 10 daily human thermal stress indices from 1981 to 2017, with a spatial resolution of 0.1° × 0.1° over South and East Asia (Yan et al., 2021), which was derived from the ERA5-Land and ERA5 reanalysis products. However, these existing thermal index datasets have very coarse spatial resolutions. There is an urgent need for a high-resolution (e.g., 1 km) data collection of multiple human thermal stress indices.
Various indices have been proposed to measure human thermal stress, but there is no universal thermal stress index that works in all climate zones (Schwingshackl et al., 2021; Brake and Bates, 2002; Roghanchi and Kocsis, 2018; Luo and Lau, 2021). Existing human thermal stress indices considered different climate conditions, direct or indirect exposures to weather elements, human metabolism, and the local working environment (Di Napoli et al., 2020), which were designed to evaluate or quantify the comprehensive environmental pressure of meteorological factors (e.g., temperature, humidity, wind) on human bodies (Epstein and Moran, 2006). These indices are based on the thermal exchange between the human and surrounding environments or empirical relationships gained by studying human responses to various environmental factors, varying in complexity, applicability, and capacity (Staiger et al., 2019). For example, the heat index (HI) is used for meteorological service (NWS, 2011); wet-bulb temperature (WBT) is used to measure the upper physiological limit of human beings (Raymond et al., 2020); physiologically equivalent temperature (PET) and UTCI are used to estimate human thermal comfort (Varentsov et al., 2020). Therefore, a high-resolution dataset that contains different commonly used human thermal stress indices is urgently called for in global and regional studies, particularly for those with complex climate conditions (e.g., China).
China has been threatened by deteriorating thermal environments under global climate change and rapid local urbanization over the past decades (Ren et al., 2022; Luo and Lau, 2019). The changes and characteristics of human thermal stress across China have attracted extensive attention in recent years (Yan, 2013; Tian et al., 2022; Li et al., 2022). Wang et al. (2021) found that the frequency of extreme human-perceived temperature events increases in summer and decreases in winter in most urban agglomerations (UAs) of China. Li et al. (2022) showed that the frequency of thermal discomfort days in China exhibits a significant increasing trend from 1961 to 2014, and there will be more threats from thermal discomfort in the future. Therefore, a long-term and high-resolution dataset with multiple human thermal stress indices in China is of great importance for investigating detailed spatial and temporal variations of human thermal stress across the country. Such a dataset has the potential to (1) assess population exposure to extreme thermal conditions and heat-related health risks, (2) reveal the spatiotemporal evolution of human thermal stress and its influence on public health, tourism, industries, military, epidemiology, and biometeorology at a fine scale, and (3) provide policymakers with data for formulating targeted strategies to mitigate heat stress and protect vulnerable people.

Meteorological data
Daily mean surface air temperature, relative humidity, and wind speed recorded at the 2419 weather stations across China (Fig. 1) during 2003-2020 were collected from the China Meteorological Data Service Center (CMDC) at http://data.cma.cn/en (last access: 16 November 2021). All station records were subjected to strict quality control and evaluation, including homogenization based on a statistical approach (Xu et al., 2013) and evaluation of temporal inhomogeneity based on the Easterling-Peterson method (Li et al., 2004).

Covariates
Human thermal stress is related to temperature, topography, land cover, population density, surface water, and vegetation (Wang et al., 2020; Rahman et al., 2022; Krzysztof et al., 2021). In this study, eight variables reflecting the changes and spatial distribution characteristics of temperature were used to predict human thermal indices (Table 1) in addition to the meteorological variables. As LST is one of the most essential parameters for predicting human thermal indices, the seamless LST dataset created by Zhang et al. (2022b) was used. Monthly LST values were calculated by averaging daily LST, which was obtained by averaging four observations in a day, including mid-daytime and mid-nighttime observations from ascending and descending orbits of MOD11A1 (Terra) and MYD11A1 (Aqua). More details about the LST data are described in Zhang et al. (2022b). The land cover dataset (MCD12Q1 Version 6) developed by Sulla-Menashe and Friedl (2019) based on a supervised classification method was downloaded via Google Earth Engine (GEE). The Multi-Error-Removed Improved-Terrain (MERIT) elevation dataset developed by Yamazaki et al. (2017) was downloaded from GEE. This dataset was generated after removing the errors from existing digital elevation models (DEMs), such as SRTM3 and AW3D-30m, based on multi-source satellite data and filtering algorithms. The spatial resolution of this dataset is 3 arcsec (i.e., ∼ 90 m at the Equator). In addition, the slope was also extracted from the elevation data to act as the topography predictor. As the artificial surface is closely related to human activities (Zhao and Zhu, 2022), the dataset of global artificial impervious area (GAIA) produced by Gong et al. (2020) and obtained from GEE was used to delineate human footprints. The overall accuracy of GAIA is greater than 90 % (Gong et al., 2020). The population dataset was downloaded from the WorldPop Project (Gaughan et al., 2013). Then, the abovementioned eight datasets were preprocessed to have the same spatial extent, projection, and spatial resolution (1 km) through image mosaicking, reprojection, resampling, clipping, aggregating, and monthly synthesizing. Moreover, year and month of the year were also used as covariates. Note that we did not include precipitation as a covariate because the precipitation data are not normally distributed. More importantly, they exhibit many zero values in many regions of China (especially in the dry season), which would increase the uncertainty of the spatial prediction.
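The two-stage LST averaging described above (four satellite overpasses to a daily mean, then daily means to a monthly mean) can be sketched for a single pixel as follows. This is a minimal illustration only: the function names and sample values are hypothetical, and the handling of missing observations in the actual seamless LST product is not represented.

```python
def daily_mean_lst(terra_day, terra_night, aqua_day, aqua_night):
    """Average the four daily overpass observations (degrees Celsius)."""
    return (terra_day + terra_night + aqua_day + aqua_night) / 4.0


def monthly_mean(daily_values):
    """Average a sequence of daily means into one monthly value."""
    daily_values = list(daily_values)
    if not daily_values:
        raise ValueError("at least one daily value is required")
    return sum(daily_values) / len(daily_values)


# Hypothetical overpass values for three days at one pixel.
days = [
    daily_mean_lst(30.0, 10.0, 32.0, 8.0),
    daily_mean_lst(31.0, 11.0, 33.0, 13.0),
    daily_mean_lst(33.0, 15.0, 35.0, 13.0),
]
month_value = monthly_mean(days)
```
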

Calculation of human thermal indices
In addition to SAT, the calculation of human thermal indices used in this study is described in  were derived by averaging daily values in each month.
$$E_s = 6.112 \times \exp\left(\frac{17.67\,T}{T + 243.5}\right) \qquad (1)$$

Here $E_s$ is saturation vapor pressure (hPa) near the surface, $T$ (°C) is air temperature at 2 m above the ground, and RH (%) is relative humidity at 2 m above the ground.
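Equation (1) translates directly into code. The sketch below implements it as written; the second function, which scales saturation vapor pressure by RH to obtain the actual vapor pressure, follows the standard definition of relative humidity and is an assumption here, since the corresponding equation is not reproduced in this text. Both function names are hypothetical.

```python
import math


def saturation_vapor_pressure(t_celsius):
    """Saturation vapor pressure E_s in hPa from Eq. (1)."""
    return 6.112 * math.exp(17.67 * t_celsius / (t_celsius + 243.5))


def actual_vapor_pressure(t_celsius, rh_percent):
    """Actual vapor pressure in hPa.

    Assumption: E_a = RH / 100 * E_s, the standard definition of
    relative humidity; this relation is not stated in the text above.
    """
    return rh_percent / 100.0 * saturation_vapor_pressure(t_celsius)


# Example inputs: 2 m air temperature of 20 degrees Celsius, RH of 50 %.
e_s = saturation_vapor_pressure(20.0)
e_a = actual_vapor_pressure(20.0, 50.0)
```
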

Prediction of human thermal indices using LightGBM
The Light Gradient Boosting Machine (LightGBM) algorithm was employed to predict human thermal indices during 2003-2020. LightGBM is one of the gradient boosting decision tree (GBDT) algorithms developed by Microsoft Research (Ke et al., 2017). This algorithm has become a very popular nonlinear machine learning algorithm due to its superior performance in machine learning competitions and efficiency (Candido et al., 2021). Its performance has been evaluated and shows desirable results in different applications, such as evapotranspiration estimation (Fan et al., 2019), land cover classification (Candido et al., 2021; Mccarty et al., 2020), air quality prediction (Su, 2020; Zeng et al., 2021; Tian et al., 2021), subsurface temperature reconstruction (Su et al., 2021), and above-ground biomass estimation (Tamiminia et al., 2021). Furthermore, LightGBM adopts the Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) algorithms to improve the training speed (Su et al., 2021). Here, GOSS is used to select data instances with larger gradients and to exclude a considerable proportion of small gradient data instances (Ke et al., 2017), and EFB is used to merge features (Ke et al., 2017). Compared with traditional GBDT algorithms including eXtreme gradient boosting (XGBoost) and Stochastic Gradient Boosting (SGB), LightGBM effectively decreases the training time without reducing the accuracy (Los et al., 2021; Ke et al., 2017).
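The GOSS idea described above can be sketched in plain Python. This illustrates the sampling rule from Ke et al. (2017) and is not LightGBM's internal code: instances with the largest absolute gradients are all kept, a random subset of the remaining instances is drawn, and the sampled small-gradient instances are up-weighted by (1 − a)/b so that the gradient statistics stay approximately unbiased. The function name `goss_sample` and the default rates are hypothetical choices for this sketch.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by a GOSS-style rule.

    top_rate (a): fraction of instances kept for having the largest
    absolute gradients. other_rate (b): fraction of all instances drawn
    at random from the remainder.
    """
    n = len(gradients)
    # Rank instances by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))
    # Up-weight the sampled small-gradient instances by (1 - a) / b.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights
```

With 100 instances and the default rates, the 20 largest-gradient instances are retained with weight 1 and 10 others are sampled with an amplified weight, so each boosting iteration sees 30 % of the data.
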

Accuracy assessment
Four statistical metrics, namely the coefficient of determination (R²), mean absolute error (MAE), RMSE, and bias (Rice, 2006), were used to evaluate the prediction accuracy of the human thermal indices. Ranging from 0 to 1, R² measures the proportion of variance explained by the model, representing how well the human thermal indices were predicted compared to the observations. MAE represents the average absolute error between the predictions and the observations. RMSE is the standard deviation of the residuals and is sensitive to outliers. Bias describes the differences between the predictions and the observations. These metrics are computed as follows.
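The equations themselves are not reproduced in this text. The block below gives the standard textbook definitions of the four metrics, which we assume correspond to those used here; the symbol $y_i$ for an individual observed value and the sign convention of the bias (prediction minus observation) are assumptions consistent with the descriptions above.

```latex
\begin{align}
R^2 &= 1 - \frac{\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}
               {\sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2} \\
\mathrm{MAE} &= \frac{1}{N} \sum_{i=1}^{N} \left|\hat{y}_i - y_i\right| \\
\mathrm{RMSE} &= \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2} \\
\mathrm{Bias} &= \frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)
\end{align}
```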
where ŷ is the predicted value of human thermal indices, ȳ is the mean of the observed human thermal indices calculated from meteorological stations, and N is the number of samples.
values (i.e., < 0.4 °C, Figs. 5e and 6e). The MAE and RMSE of NET and WCT decrease from northwestern to southeastern China (Figs. 5i, 5l, 6i, 6l). For other indices, small MAE and RMSE values are mainly observed in plains including NCP, while large values tend to appear in regions with complex topography, such as arid Northwest China, mountainous Northeast and South China, and the Hengduan Mountains. These differences are related to the uneven distribution of weather stations, i.e., dense in plains and sparse in complex terrain areas. The bias values range from −0.3 to 0.3 °C (Fig. 7). Positive bias values tend to be distributed in northern China while negative values are mainly located in the south. This spatial variability is likely caused by the generally lower temperatures in the north and higher temperatures in the south. In particular, the extremely small values in the north and the extremely large values in the south may be overestimated and underestimated to some extent, respectively, due to limited samples of extremely small and large values (compared with the rest of the samples) when training the machine learning model. The overestimation and underestimation issues caused by limited training samples of extreme values are quite common in machine learning (Wu et al., 2022; Li et al., 2020; Uddin et al., 2022; Cho et al., 2020).

Accuracies in major urban agglomerations
More than half of the national population in China lives in cities, particularly in UAs (also known as city clusters). Here we assessed the prediction accuracies in 20 major UAs in China, which hold 62.83 % and 80.57 % of the total population and gross domestic product (GDP) of the country, respectively (Fang and Yu, 2016). These accuracy assessments are presented in Tables S1-S4 in the Supplement. As shown in Table S1, all UAs have R² values higher than 0.9837, with an average of 0.9947.
S3). The biases in the 20 UAs range from −0.160 to 0.123 °C (Table S4). These results suggest that all predicted human thermal indices in different UAs across China are of good quality at the local scale. This implies that our prediction model and results have great potential in evaluating local thermal environment changes (e.g., in urban areas or cities).

Spatial variations of the human thermal indices
The abovementioned assessments show that our model based
Figure 9 shows the monthly distribution of the predicted ET in 2020, which exhibits obvious seasonality with higher temperatures in summer and lower in winter. The temperature shows a significant zonal difference with colder temperatures in northern than in southern China. The temperature has a close relationship with topography and decreases with elevation, varying from plateaus to plains. The Qinghai-Tibet Plateau (TP) has the lowest temperature, while southern China, the Sichuan Basin, and the Gobi regions in Northwest China witness the highest temperatures. The distribution of temperature exhibits different patterns among the four seasons, especially between winter (e.g., January) and summer (e.g., July). In winter, the temperature increases from northern to southern areas and is the coldest in Northeast and Northwest China and the warmest on the island of Hainan. In the summer, the hottest temperature appears in the Tarim and Junggar basins of Xinjiang. The NCP region also has a high temperature in summer, which might be related to local urbanization (Liu et al., 2008) and irrigation (Kang and Eltahir, 2018).
The spatial variations of the predicted human thermal indices in summer (which is often characterized by severe heat stress) are examined in Fig. 10 by taking July 2020 as an example. As it shows, the 12 indices exhibit similar distribution patterns. There are significant differences in temperature among Northwest, northeastern, and southeastern China. Generally, the temperature decreases from the southeast to the northwest, and the southeast and northwest parts have the highest and lowest temperatures, respectively.
HMI exhibits the highest temperature while NET shows the lowest in July 2020. The dominant modes of these indices are further examined by applying the empirical orthogonal function (EOF) analysis (Figs. S10-S13). As Fig. S10 shows, the leading EOF (EOF1) of all 12 thermal indices exhibits a highly consistent spatial distribution with higher values in the northern region and lower values in the south. Their temporal variations are also similar to each other (Fig. S11). The second and third EOF modes (EOF2 and EOF3) are also similar among different thermal indices (except EOF3 of NET, Figs. S11-S13). These results demonstrate the desirable quality of our products.

Temporal changes in the human thermal indices
The yearly evolutions of the annual mean human thermal indices during 2003-2020 are displayed in Fig. 11. Despite the interannual fluctuation in the time series, all indices exhibit upward trends except for NET and WCT, of which the decreasing trends are mainly affected by the recovering wind speed in the recent decade (Zeng et al., 2019). The fastest warming appears in HMI (0.303
The possible reasons for the prominent warming trends in North China are explained as follows. The urbanization process has been prevailing in this area, with rapid growth in the economy and population. This process is accompanied by dramatic increases in impervious surfaces and decreases in green spaces. These changes lead to warmer surface and near-surface air temperature, known as urban heat islands (UHIs), thus increasing thermal stress in this region. The urbanization effects on local heat stress have also been reported by Luo and Lau (2021). Moreover, North China has a large number of croplands with prominent irrigation activities, which may increase air humidity near the surface and exacerbate the combined effects of temperature and humidity, leading to increased heat stress (Kang and Eltahir, 2018).
In addition, this area has experienced a weakening of surface wind speed (Zhang et al., 2021), which also exacerbates thermal stress, especially in NET and WCT. Furthermore, different indices have different degrees of increasing trends. HMI has the largest increasing magnitude (Fig. 12h), and ET is seen with relatively slight increases across China (Fig. 12f). The trends of NET and WCT have similar spatial distribution patterns, with large proportions having cooling trends since 2003 (Fig. 12j and l). Most parts of Xinjiang, northeastern and southern China have obvious decreasing trends, and the Inner Mongolia Plateau (IMP), NCP, eastern TP, YRD, and YGP have slightly increasing trends.
The temporal trends of the human thermal indices in different seasons were also examined (Fig. 13). The fastest warming tendency is observed in the spring season. The rising trends of spring HMI, HI, MDI, AT_in, and AT_out exceed 0.4 °C per decade, and the trends of other indices (except ET and NET) are larger than 0.3 °C per decade (Fig. S2). Summer also has been experiencing significant increasing trends in all indices, i.e., at a rate of > 0.2 °C per decade (except ET and NET). The trends in summer HMI, HI, WBT, MDI, DI, sWBGT, AT_in, and AT_out exceed 0.3 °C per decade (Fig. S3). Differing from spring and summer, the human thermal indices (except WCT and NET) in the autumn season show slightly cooling trends (Fig. S4). Autumn WCT and NET have significantly strong decreasing trends, i.e., −0.349 and −0.507 °C per decade, respectively. Similarly strong cooling trends of WCT and NET appear in winter, i.e., −0.661 and −0.453 °C per decade, respectively, while other indices experience marginal long-term changes (Fig. S5).
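Trends expressed in °C per decade, such as those quoted above, can be obtained from an annual or seasonal series with a linear fit. The sketch below uses an ordinary least-squares slope; this estimator is an assumption, since the text only states that linear trends were computed, and the function name and the synthetic series are hypothetical.

```python
def trend_per_decade(years, values):
    """Ordinary least-squares slope of values against years, per decade."""
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired samples")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    # Slope is in units per year; multiply by 10 for units per decade.
    return 10.0 * sxy / sxx


# Synthetic series warming by exactly 0.03 degrees Celsius per year.
years = list(range(2003, 2021))
series = [10.0 + 0.03 * (y - 2003) for y in years]
decadal_trend = trend_per_decade(years, series)
```
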
Figure S6
In the spring, increases in all thermal indices are observed in most parts of China (Fig. S8), particularly in northern regions, such as central Inner Mongolia, parts of NCP, and Northeast China, while parts of southern China have slight decreases. These decreases are noticeable in NET and WCT (Fig. S8j and m). In contrast to spring, the autumn season shows decreases in the thermal indices in the north and increases in the south (e.g., Southwest China, Fig. S9).

Comparison with existing human thermal index datasets
We compared our HiTIC-Monthly with two existing datasets, i.e., HDI (Mistry, 2020) and HiTiSEA (Yan et al., 2021), which have coarser spatial resolutions of 0.25° × 0.25° and 0.1° × 0.1° (Table 4), respectively. We derived monthly mean AT_in in July 2018 from HDI and HiTiSEA and compared them with HiTIC-Monthly over the mainland of China, with a particular highlight in the four largest UAs, including Beijing-Tianjin-Hebei (BTH), YRD, middle Yangtze River Valley (mYRV) and Pearl River Delta (PRD) (Fig. 14). The summer of 2018 was selected because it was included in all three datasets and frequent heat events occurred in this summer (Zhou et al., 2020). Generally, the three datasets depict similar spatial patterns. However, our HiTIC-Monthly dataset provides more detailed and clearer spatial information on human thermal stress than the other two. Additionally, the observed AT_in values at individual weather stations are also compared (Fig. 14). It can be seen that HDI and HiTiSEA overestimate AT_in, and such an overestimation is especially severe for HDI, while our dataset is in good agreement with the observed AT_in at individual weather stations. Therefore, our predicted temperature can describe the spatial variations in the city areas well, thereby providing fundamental support for fine-scale climate studies, such as urban climate research.

Limitations and future works
There are 12 commonly used human thermal indices in the HiTIC-Monthly dataset produced in this study. Nine of these indices were computed from temperature and humidity (or water vapor) and the other three (i.e., AT_out, NET, and WCT) were derived from temperature, humidity, and wind speed. In addition, other indices considering the combined effect of environmental variables such as sunlight (Blazejczyk, 1994; Fanger, 1970; Höppe, 1999; Yaglou and Minaed, 1957) were proposed, including wet-bulb globe temperature (WBGT), predicted mean vote (PMV), UTCI, physiological equivalent temperature (PET), etc. These thermal indices were not included in our study due to the lack of sunshine and radiative flux data. Since LST is the most important variable for predicting the 11 human thermal indices, the uncertainty in the LST dataset may influence the accuracy of the human thermal indices.
https://doi.org/10.5194/essd-15-359-2023 Earth Syst. Sci. Data, 15, 359-381, 2023
The LST variable in our prediction is collected from a global seamless 1 km resolution daily LST dataset (Zhang et al., 2022b).This dataset was generated based on spatiotemporal gap-filling algorithms and the MODIS LST data.It may overestimate LST in some cases because the LST under cloudy weather was filled based on the data in clear sky conditions (Zhang et al., 2022b).A high-quality LST dataset would further improve the prediction accuracy of the human thermal indices.
The human thermal indices dataset is at a monthly scale, but this temporal resolution may not be sufficient for the research of extreme weather events (e.g., heatwaves and cold spells) and related environmental health (e.g., heat-related mortality). A daily high-resolution human thermal index collection (HiTIC-Daily) will be produced and released in our future studies. In the current study, we provided the first national-level dataset over the mainland of China with multiple high-resolution human thermal indices in a monthly interval, which shows high prediction accuracies in all climate regimes across China. A global dataset of multiple human thermal indices is also expected in the near future.

Data availability
The high spatial resolution monthly human thermal index collection (HiTIC-Monthly) generated in this study is freely available to the public in network common data form (NetCDF) from Zenodo at https://doi.org/10.5281/zenodo.6895533 and the National Tibetan Plateau Data Center (TPDC) of China at https://data.tpdc.ac.cn/disallow/036e67b7-7a3a-4229-956f-40b8cd11871d (last access: 7 November 2022) (Zhang et al., 2022a). The human thermal indices include surface air temperature (SAT), indoor apparent temperature (AT_in), outdoor shaded apparent temperature (AT_out), discomfort index (DI), effective temperature (ET), heat index (HI), humidex (HMI), modified discomfort index (MDI), net effective temperature (NET), simplified wet-bulb globe temperature (sWBGT), wet-bulb temperature (WBT), and wind chill temperature (WCT). This dataset has a spatial resolution of 1 km × 1 km and covers the mainland of China from 2003 to 2020, stacked by year. Each stack is composed of 12 monthly images. The unit of the dataset is 0.01 degrees Celsius (°C): the values are stored as 16-bit integers (Int16) to save storage space and need to be divided by 100 to obtain values in degrees Celsius. The projection coordinate system is Albers equal-area conic projection. The naming rule and other detailed information can be found in "README.pdf".
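The scaling rule stated above (stored Int16 values divided by 100) can be sketched as follows. The example uses only the Python standard library and a small in-memory array of hypothetical pixel values; with the real NetCDF files one would first read the variable with a NetCDF reader, and any fill value defined in the files (not described here) would need to be masked before scaling.

```python
from array import array


def to_celsius(raw_values):
    """Convert stored integer values (units of 0.01 degrees Celsius)
    to floating-point degrees Celsius by dividing by 100."""
    return [value / 100 for value in raw_values]


# Hypothetical Int16 pixel values ('h' is a signed 16-bit type code).
raw = array("h", [2531, -1200, 0])
celsius = to_celsius(raw)
```
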

Conclusions
A long-term and high-resolution dataset of multiple human thermal indices is of great significance for monitoring detailed spatiotemporal changes of human thermal stress in different climate regions across China and assessing the health risks of people exposed to extreme heat at a fine scale. However, the current datasets of human thermal indices (e.g., HDI and HiTiSEA) only have coarse spatial resolutions (≥ 0.1°). In this study, we generated a dataset of monthly human thermal index collection with a high spatial resolution of 1 km over the mainland of China (HiTIC-Monthly). In this collection, 12 human thermal indices from 2003 to 2020 were predicted, including SAT, AT_in, AT_out, DI, ET, HI, HMI, MDI, NET, sWBGT, WBT, and WCT.
The HiTIC-Monthly dataset was produced by LightGBM based on multi-source data, including MODIS LST, DEM, land cover, population density, and impervious surface fraction. This dataset shows a desirable performance, with mean R², RMSE, MAE, and bias of 0.996, 0.693 °C, 0.512 °C, and 0.003 °C, respectively. Our predictions also exhibit good agreements with the observations in both spatial and temporal dimensions. Moreover, the comparison with two existing datasets (i.e., HDI and HiTiSEA) suggests that HiTIC-Monthly has more detailed spatial information. Further investigation shows that almost all the indices show warming trends in most parts of China during 2003-2020, particularly for North China, Southwest China, TP, and parts of Northwest China. Additionally, the warming tendency is faster in spring and summer. WCT and NET show similar and strong cooling trends in autumn and winter, while other indices exhibit slight long-term changes. HiTIC-Monthly has broad applicability due to its high spatiotemporal prediction accuracy. Moreover, HiTIC-Monthly can offer significant support for studies that require fine-scale human thermal information.
Author contributions. ML and YZ conceptualized and designed the study. HZ collected the data, conducted the analyses, and wrote the first draft of the paper. All authors discussed the results and edited the paper.
Competing interests. The contact author has declared that none of the authors has any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
gram of China, the Pearl River Talent Recruitment Program of Guangdong Province (grant no. 2017GC010634), and the Innovation Group Project of Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) (grant no. 311021008). The authors are grateful to the editor and two reviewers, whose comments and suggestions have significantly improved the quality of our manuscript.
Review statement. This paper was edited by Qingxiang Li and reviewed by Minyan Wang and one anonymous referee.

Figure 1. Spatial distribution of meteorological stations in the mainland of China, with color shadings indicating the elevation in meters.
used the Python package Scikit-Learn to perform the LightGBM training, and the hyperparameters of LightGBM were tuned by grid search. The observed monthly human thermal indices at the 2419 weather stations across the mainland of China during 2003-2020 were randomly split into a training set (80 %) for hyperparameter tuning and model training and a testing set (20 %) for model evaluation.
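The two steps described above, a random 80/20 split and an exhaustive grid search, can be illustrated with a dependency-free sketch. This is not the authors' code: in practice one would likely rely on `train_test_split` and `GridSearchCV` from scikit-learn together with a LightGBM regressor. The helper names, the grid values, and the toy scoring function below are hypothetical; a real scoring function would train the model on the training set and return a validation error.

```python
import itertools
import random


def split_records(records, train_fraction=0.8, seed=0):
    """Randomly split records into a training list and a testing list."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination; lower scores are better."""
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical station-month record identifiers.
records = list(range(100))
train, test = split_records(records)

# Hypothetical grid; the parameter names are LightGBM's, the values are illustrative.
grid = {"num_leaves": [31, 63, 127], "learning_rate": [0.01, 0.05, 0.1]}


def toy_score(params):
    """Stand-in for a validation RMSE; a real version would fit a model on `train`."""
    return abs(params["num_leaves"] - 63) / 100 + abs(params["learning_rate"] - 0.05)


best_params, best_score = grid_search(grid, toy_score)
print(len(train), len(test), best_params)
```

Keeping the testing set out of the grid search, as the text describes, means the reported accuracies are not influenced by the hyperparameter choice.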
4.1 Evaluation of the predicted human thermal indices

4.1.1 Overall accuracy

The prediction accuracies of the 12 human thermal indices were evaluated based on the validation data introduced in Sect. 3.2. All predicted human thermal indices exhibit high accuracies. Figure 2 shows the scatter plots of the observed versus the predicted values of the 12 human thermal indices. As the figure displays, the data points of all indices

https://doi.org/10.5194/essd-15-359-2023 Earth Syst. Sci. Data, 15, 359-381, 2023

Figure 4. Spatial distribution of R² of the 12 human thermal index predictions at individual meteorological stations over the mainland of China during 2003-2020.

Figure 8. Annual prediction accuracies of the 12 human thermal indices over the mainland of China during 2003-2020: (a) RMSE, (b) MAE, and (c) bias.

Figure 9. Spatial distributions of the monthly mean ET over the mainland of China in 2020.

Figure 10. Spatial distributions of the 12 human thermal indices over the mainland of China in July 2020.

Figure 11. Temporal changes of the 12 annually averaged human thermal indices over the mainland of China during 2003-2020. The line shows the linear trend, the number in square brackets gives the corresponding trend per decade, and an asterisk next to the number indicates that the trend is significant at the 0.05 level.
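A per-decade linear trend of the kind annotated in this figure can be sketched as an ordinary least-squares slope multiplied by 10. The snippet below assumes an OLS fit of annual means against calendar year; the estimator actually used for the figure is not specified in this section, the significance test is omitted, and the series is synthetic.

```python
def trend_per_decade(years, values):
    """Ordinary least-squares slope of values against years, scaled to 10 years."""
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need at least two paired (year, value) samples")

    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    if sxx == 0:
        raise ValueError("all years are identical; slope is undefined")

    slope_per_year = sxy / sxx
    return slope_per_year * 10.0


# Synthetic annual means (deg C) warming by 0.03 deg C per year over 2003-2020.
years = list(range(2003, 2021))
annual_means = [20.0 + 0.03 * (year - 2003) for year in years]
print(round(trend_per_decade(years, annual_means), 3))
```

Applying the same function to each grid cell's annual series would yield a trend map, and applying it to seasonal means would yield seasonal trends.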
maps the spatial patterns of the trends of summer mean human thermal indices over the mainland of China during 2003-2020. All indices show warming trends in most parts of China, particularly in NCP and TP. Because NCP is one of the most densely populated regions in China, the prominent increases in thermal indices there indicate that the local population has been experiencing an increasing threat from intensifying heat stress. Among the 12 indices, AT_out, HI, NET, and WCT tend to have a slight cooling trend in southeastern China. This cooling trend is consistent with the corresponding summer SAT. The spatial distributions of the winter trends across the mainland of China during 2003-2020 are depicted in Fig. S7. The trend patterns in winter are similar to those in summer to some degree. The warming trends are concentrated in Southwest China, most parts of Northwest China, and parts of East China (e.g., YRD). The cooling

Figure 12. Spatial distributions of the linear trends (unit: °C per decade) in the 12 annually averaged human thermal indices over the mainland of China during 2003-2020.

Figure 13. Temporal trends of the 12 annually and seasonally averaged human thermal indices over the mainland of China during 2003-2020. The numbers give the linear trend per decade. An asterisk indicates that the trend is significant at the 0.05 level.

Figure 14. Comparison of the spatial patterns among HDI_0p25_1970_2018 (HDI), HiTiSEA, and HiTIC-Monthly for AT_in over the mainland of China and its four largest UAs in July 2018: Beijing-Tianjin-Hebei (BTH), Yangtze River Delta (YRD), middle Yangtze River Valley (mYRV), and Pearl River Delta (PRD). Colored circles indicate the observed AT_in values at individual meteorological stations.

Financial support. This research has been supported by the National Natural Science Foundation of China (grant no. 41871029), the National Key Research and Development Program of China (grant no. 2019YFC1510400), the Natural Science Foundation of Guangdong Province (grant no. 2019A1515011025), and the Guangdong Provincial Pearl River Talents Program (grant no. 2017GC010634).

Table 1. Gridded datasets used in this study.

Table 2. Equations of the human thermal indices for each station.

Table 4. Comparisons of the four thermal index datasets.