A lacustrine surface-sediment pollen dataset covering the Tibetan Plateau and its potential in past vegetation and climate reconstructions
Abstract. A spatially well-distributed dataset of pollen from lake surface sediments is essential for pollen-based reconstructions of past vegetation and climate. We collected 90 lake surface-sediment samples from the Tibetan Plateau (TP), covering the major vegetation types. We established a comprehensive modern pollen dataset by integrating these newly obtained data with existing modern lacustrine pollen datasets. The combined dataset covers the full range of climatic gradients across the TP, with mean annual precipitation (Pann) from 97 to 788 mm, mean annual temperature (Tann) from -9.09 to 6.93 °C, mean temperature of the coldest month (Mtco) from -23.48 to -2.65 °C, and mean temperature of the warmest month (Mtwa) from 1.77 to 19.26 °C. Numerical analyses revealed that Pann is the primary climatic determinant of pollen distribution, while net primary production (NPP) is a valuable variable reflecting vegetation conditions. To quantify the relationships between pollen and both Pann and NPP, we employed weighted-averaging partial least squares (WA-PLS) and the random forest (RF) algorithm. The performance of both models suggests that this modern pollen dataset has good predictive power for estimating past NPP and Pann, with RF performing slightly better on this dataset. We consider the dataset reliable for reconstructing vegetation and climate from pollen spectra from the central TP, but caution is needed when applying it to pollen spectra from the marginal regions of the TP or to spectra covering the Last Glacial period, because analogue quality is poor in those cases. The dataset, including site locations, pollen percentages, NPP, and climate data for 90 lakes, is available at the National Tibetan Plateau Data Center (TPDC; Tian, 2025; https://doi.org/10.11888/Paleoenv.tpdc.302470).
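To illustrate how a pollen-climate calibration of the kind summarized above works in principle, the following is a minimal sketch of a simple weighted-averaging (WA) transfer function with inverse deshrinking. It is not the WA-PLS or RF implementation used in this study: plain WA is swapped in here because it is the simplest member of the same family and needs no external libraries. The function names (`taxon_optima`, `wa_estimate`, `fit_wa`, `predict_wa`), the three-taxon training set, and the Pann values are all hypothetical and chosen only for illustration.

```python
# Minimal sketch of a weighted-averaging (WA) transfer function with inverse
# deshrinking. This is plain WA, NOT the WA-PLS or RF models of the study.
# All names and numbers below are hypothetical and for illustration only.


def taxon_optima(abundances, env):
    """Return each taxon's WA optimum: the abundance-weighted mean of env."""
    optima = []
    for k in range(len(abundances[0])):
        total = sum(row[k] for row in abundances)
        if total > 0:
            weighted = sum(row[k] * x for row, x in zip(abundances, env))
            optima.append(weighted / total)
        else:
            optima.append(None)  # taxon absent from the training set
    return optima


def wa_estimate(sample, optima):
    """Abundance-weighted mean of the optima of the taxa present in a sample."""
    pairs = [(y, u) for y, u in zip(sample, optima) if y > 0 and u is not None]
    if not pairs:
        raise ValueError("sample shares no taxa with the training set")
    return sum(y * u for y, u in pairs) / sum(y for y, _ in pairs)


def fit_wa(abundances, env):
    """Fit optima plus an inverse deshrinking regression (env on initial estimates)."""
    optima = taxon_optima(abundances, env)
    initial = [wa_estimate(row, optima) for row in abundances]
    n = len(env)
    mean_i, mean_e = sum(initial) / n, sum(env) / n
    sxx = sum((a - mean_i) ** 2 for a in initial)
    sxy = sum((a - mean_i) * (b - mean_e) for a, b in zip(initial, env))
    slope = sxy / sxx
    return optima, mean_e - slope * mean_i, slope


def predict_wa(model, sample):
    """Apply a fitted WA model (optima, intercept, slope) to one pollen spectrum."""
    optima, intercept, slope = model
    return intercept + slope * wa_estimate(sample, optima)


# Invented training set: rows are lakes, columns are three pollen taxa (%).
train_pollen = [
    [70.0, 10.0, 20.0],
    [40.0, 40.0, 20.0],
    [15.0, 75.0, 10.0],
    [5.0, 90.0, 5.0],
]
train_pann = [150.0, 300.0, 500.0, 650.0]  # invented Pann values (mm)

model = fit_wa(train_pollen, train_pann)
fossil_sample = [30.0, 55.0, 15.0]  # invented fossil spectrum (%)
print(round(predict_wa(model, fossil_sample), 1))
```

In a real application, the training matrix would be the modern pollen percentages and the corresponding Pann or NPP values from the dataset, and model performance would be assessed by cross-validation before any fossil spectrum is interpreted.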