Simple noise estimates and pseudoproxies for the last 21k years

Abstract. Climate reconstructions are means to extract the signal from uncertain paleo-observations, i.e. proxies. It is essential to evaluate these to understand and quantify their uncertainties. Similarly, comparing climate simulations and proxies requires approaches to bridge, e.g., the temporal and spatial differences between both and to address their specific uncertainties. One way to achieve these two goals is so-called pseudoproxies. These are surrogate proxy records within, e.g., the virtual reality of a climate simulation. They in turn depend on an understanding of the uncertainties of the real proxies, i.e. the noise characteristics disturbing the original environmental signal. Common pseudoproxy approaches have so far concentrated on data with high temporal resolution from, e.g., tree-rings or ice-cores over approximately the last 2,000 years. Here we provide a simple but flexible noise model for potentially low-resolution sedimentary climate proxies for temperature on millennial time-scales, the code for calculating a set of pseudoproxies from a simulation, and, for one simulation, the pseudoproxies themselves. The noise model considers the influence of other environmental variables, a dependence on the climate state, a bias due to changing seasonality, modifications of the archive (e.g., bioturbation), potential sampling variability, and a measurement error. Model, code, and data should allow the development of new ways of comparing simulation data with proxies on long time-scales. Code and data are available at https://doi.org/10.17605/OSF.IO/ZBEHX.

Some aspects of statistical climate reconstruction methods can be evaluated in so-called pseudoproxy experiments. In these experiments, the reconstruction methods are mimicked, for example, in the controlled conditions provided by climate simulations with Earth System Models. However, for these tests surrogate proxy records have to be produced which are compatible with the climate simulated by these models: the pseudoproxies. In testing the reconstruction methods, pseudoproxies eventually replace the real paleo-observations in the method, and the virtual climate of the Earth System simulation stands in for the real climate.
Our use of the term pseudoproxy follows the literature since Mann and Rutherford (2002). That is, a pseudoproxy represents a modification of observational data, reanalysis data, or simulation output. It replaces real-world proxies in an application. The term does not necessarily refer to substitutes for specific proxy records or particular proxy types. That is, the term pseudoproxy does not by itself imply that the modifications of the input data validly represent the uncertainties or characteristics of real-world data. This view of the term pseudoproxy is in line with the past literature (compare, for example, Mann and Rutherford, 2002; Osborn and Briffa, 2004; Von Storch et al., 2004; Jones et al., 2009; Graham and Wahl, 2011; Thompson et al., 2011; Lehner et al., 2012; Smerdon, 2012; Hind et al., 2012; Annan and Hargreaves, 2013; Kurahashi-Nakamura et al., 2014; Steiger and Hakim, 2016). Modifications of the input data may be as simple as adding white or colored noise, or they may invoke more complex forward approaches (for example mechanistic Proxy System Models, Evans et al., 2013, see below). Studies of the climate of the past 2,000 years regularly use such pseudoproxy approaches, mimicking annually resolved proxies such as dendroclimatological ones. Smerdon (2012) reviews the approach of using pseudoproxy experiments to evaluate reconstruction methods with a focus on the last millennium. Such methods basically originated in the work of Mann and Rutherford (2002) focussing on climate-field reconstructions. The review by Smerdon (2012) emphasizes the essential contribution of pseudoproxy experiments to our understanding of past climates and to evaluating our methods of studying past climates. To date, most studies using pseudoproxies concentrated on the last few millennia. Few studies considered periods further in the past (e.g., Laepple and Huybers, 2013; Dolman and Laepple, 2018; Dee et al., 2018).
For a useful test of reconstruction methods, the pseudoproxies should be as realistic as possible, with statistical properties similar to the real proxies. This is achieved by contaminating the climate variables simulated by the Earth System Model with statistical noise with a certain amplitude and statistical characteristics. These properties ideally are based on estimates of a realistic or at least plausible noise to successfully mimic the behavior of real-world proxies.
In our understanding there are various approaches to obtain such pseudoproxies. These range from most comprehensive to most simplified. We can try to obtain a comprehensive representation of a so-called proxy system (Evans et al., 2013), from the environmental influences on a sensor to our measurement, and formulate this into a mechanistic forward model of the system of interest. Such models can be very complex or they may concentrate solely on a core set of processes (compare the full and reduced implementations of the Vaganov-Shashkin approach to modelling tree-rings presented by Evans et al., 2006; Tolwinski-Ward et al., 2011). That is, the first approach to obtaining pseudoproxies is process-based. Other, more reduced approaches potentially ignore this mechanistic process understanding and focus on stochastic expressions of the noise that influences our inferences about past climates. Such an approach can try to formulate mathematically tractable expressions for statistical noise terms, which represent the different processes or effects influencing the stages from the original environmental conditions to our final observation (Dolman et al., in preparation; A. Dolman, personal communication, 2018; T. Laepple, personal communication, 2017). Another way of producing pseudoproxies by focussing on stochastic noise expressions uses simple estimates of plausible errors. These different approaches can be specific for certain proxy types or very general. They can focus on one stage of the proxy system from environment to measurement or consider multiple stages.
All these approaches fit into the concept of a proxy system model as described by Evans et al. (2013). The idea of forward models to study the behavior of proxies and proxy systems is not new (e.g., Schmidt, 1999; Tolwinski-Ward et al., 2011; Thompson et al., 2011), but Evans et al. (2013) were the first to clearly delineate the modelling of proxy systems. A proxy system represents the biological, chemical, geological, and possibly also documentary system that translates environmental influences into an archived state on which researchers make observations. We usually refer to these observations when speaking of climate proxies. A proxy system model is a representation, based on our understanding, of how the proxy system translates the environmental influences into our observations. Evans et al. (2013) present a generalized concept of this modelling approach, which consists of three components: First, a sensor model reacts to the environmental influences. Second, an archive model transforms these sensor recordings into archive units. A third model translates the archive into representations of what we usually observe on an archive. For example, the sensor 'tree' records the environmental influences in its archive 'wood', and we can make measurements on this archive in the form of tree-ring counts and widths, etc. The full system from recording to observation is the proxy system.
Each stage in this system and its model representations adds uncertainty, and each stage omitted in a generalization also increases uncertainty. For example, the environment and the final reconstruction process can be additional stages, but we can try to include the associated uncertainties in any of the three stages proposed by Evans et al. (2013). That is, considering the reconstruction stage, the calibration introduces additional uncertainty, which is not a priori captured by the stages sensor, archive, and measurement. We can argue to include this additional source of error in the measurement stage. We can also argue that these uncertainties are de facto uncertainties resulting from processes at the sensor stage or at the archiving stage and include them there. Similarly, the sensor model does not necessarily account for all uncertainties of the environmental influences. An additional environmental stage could provide weighted data of various environmental influences (compare, e.g., Dee et al., 2018). These processes, however, can also be included in the sensor model, or uncertainties can be assumed to mostly affect the measurement model. In short, inferences about past climates from proxy-data are based on observations on an archive that accumulated a property of a system. This (property of the) system was sensitive to and recorded an environmental process at some date. From the recording stage to our inference there are multiple sources of error.
The potential errors include different sources of noise related to laboratory uncertainties like measurement errors and reproducibility, local disturbances, dating uncertainty, time resolution, and serial autocorrelation, all possibly dependent on the overall climate state. Further uncertainty includes habitat preferences, seasonal biases, the variability in the relation between sensor and environment, long-term changes in this relation, long-term modifications of the archive, sampling variability and sampling disturbances, and not least generally erroneous assumptions on the researcher's side on the relation between recording sensor and environment, i.e., the calibration relation. A recipe for calculating pseudoproxies may include potential error estimates not only for parts of the assumed proxy system but also for the relation between the 'observed' data and time, that is, the anchoring of the data in time.
Regarding dating/age uncertainty, there are various approaches to dealing with it (e.g., Breitenbach et al., 2012; Carré et al., 2012; Anchukaitis and Tierney, 2013; Comboul et al., 2014; Goswami et al., 2014; Brierley and Rehfeld, 2014; Rehfeld and Kurths, 2014; Kopp et al., 2016; Boers et al., 2017), of which a number try to transfer the dating uncertainty towards the proxy-record uncertainty (e.g., Breitenbach et al., 2012; Goswami et al., 2014; Boers et al., 2017). Our interest explicitly is to include the uncertainty from the dating in a statistical noise term for a pseudoproxy time-series. Therefore, we do not consider Bayesian or Monte Carlo methods but take a simple approach to develop an error term for the uncertainty in the dating. We also do not include explicit age-modelling (compare, e.g., Haslett and Parnell, 2008; Blaauw and Christen, 2011; Trachsel and Telford, 2017).

Besides evaluating reconstruction methods, a plausible estimate of the noise within the proxies can also assist in comparison studies between model simulations and the proxy records or among different model simulations. This increases our understanding of past climate changes by consolidating information from all available sources, which are proxy records and model simulations. The lack of high-quality observations with small uncertainty is always going to hamper efforts to assess the quality of model simulations of past climates. Such comparisons have to rely on the paleo-observations from proxies, and even the highest-quality proxy records have an irreducible amount of uncertainty. Most often, data-model comparisons take place in the virtual reality of the model and use the modelled variables. In the case of proxies, the comparison is between, for example, a temperature reconstruction and a model. The alternative is to compare both in the proxy space using a proxy representation of the model climate. Pseudoproxies, or a recipe for how to compute them, may be part of an interface between the data on the one side and the model simulations on the other side.

Recent years saw an intensification of the research on forward modelling proxies for understanding proxies, testing reconstruction methods, and evaluating simulation output (see, for example, Dolman and Laepple, 2018; Dee et al., 2015, 2018; Konecky et al., 2019). Many of these approaches follow the concept of considering sensor, archive, and observations as distinct steps in the process. Still, few of these approaches consider transient time-scales beyond the late Holocene. Nevertheless, particularly the work by Dolman and Laepple (2018) and also Dee et al. (2018) allow for the calculation of different sedimentary proxies over multi-millennial time-scales based on knowledge of certain processes in the respective proxy systems.
In this paper, we adopt the conceptual sub-divisions of Evans et al. (2013) to present a formal but still simple noise-based approach to describe the disturbances masking the signal in proxy records. This approach can also be applied to produce pseudoproxies for time-scales longer than the last few millennia, that is, proxies with coarser time resolutions than interannual and afflicted by larger degrees of dating uncertainty. Thereby, this work extends previous pseudoproxy approaches, which often concentrated on well-dated proxy systems affected by fewer sources of uncertainty.
The following presents a set of assumptions on proxy noise and estimates for some of the mentioned error sources. We further provide pseudoproxies based on these assumptions for the TraCE-21ka simulation (Liu et al., 2009), which covers the last 21,000 years. We concentrate on proxies which are subject to some kind of sedimentary process. Thus, our work appears to be particularly similar to the proxy system model for sedimentary proxies implemented by Dolman and Laepple (2018). Dolman and Laepple (2018) also consider the long time-scales since the last glacial maximum and rely on output from the TraCE-21ka simulation for their forward modelling. Both the present manuscript and Dolman and Laepple (2018) follow the concept outlined by Evans et al. (2013). The main difference between Dolman and Laepple (2018) and the present study is that they provide a simple process-focussed model of the proxy system, whereas we try to provide a simple characterisation of the noise in the proxy system that finally influences the proxies. The process-based formulation of Dolman and Laepple (2018) concentrates on two types of marine proxies, whereas our noise-based approach tries to generalize over sedimentary proxy types. We regard both approaches as complementary and want to emphasize the value of having a multitude of methods to assess model simulations and reconstruction methods.
Our approach contributes to the existing proxy system modelling and pseudoproxy computation applications by being an intermediate step between complex forward modelling approaches and the noise-based approaches, of which the latter may ignore the workings of the proxy system. Our code simplifies and generalizes more complex assumptions. The noise focus and the generalizations allow us to provide global pseudoproxy data and an ensemble of pseudoproxy data using the TraCE-21ka simulation over the time-scale of the last 21 thousand years. The manuscript assets at https://doi.org/10.17605/OSF.IO/ZBEHX provide the generated pseudoproxy data and also include sample code. Thereby the manuscript provides, for one simulation, the data to make an informed comparison with real proxies and the data to evaluate reconstruction techniques. Code and assumptions enable any interested user to produce similar pseudoproxies for their simulation of interest. We consider the measurement error, local changes to the original proxy recording (compare, e.g., Laepple and Huybers, 2013), the basic climate state, a potential bias, and a simple estimate of the effect of dating uncertainty. All noise expressions are coded in a way that flexibly allows for different colors and types of noise.

We use the annual mean temperature at each grid-point of the TraCE-21ka simulation (Liu et al., 2009). To date, this is the only available interannual transient Earth System Model simulation covering the last 21,000 years. Specific technical considerations, for example those related to freshwater pulses and sea-level adjustments, lead to some artefacts in the simulation output data fields.
A brief description of the simulation can be found at http://www.cgd.ucar.edu/ccr/TraCE/ and in the Ph.D. dissertation of He. On multi-millennial time-scales, we have to consider changes in the insolation caused by changes in Earth's orbital elements. Global insolation data is calculated using the R (R Core Team, 2017) package palinsol (Crucifix, 2016). We use simple Gaussian noise for most noise processes. However, as the code is flexible, the user can easily change this.
In defining what we consider as noise, we first have to state the signal which we assume the proxy system records. That is, do we assume that the proxy records local or regionally accumulated signals? Here, we take the signal of interest to be local, that is, non-local influences enter the noise term and are not part of the signal. In addition, there are further local factors which affect the recording of the signal but are not part of the signal of interest. The appendix provides tables (Tables A1 to A4) summarising the considered parameters and noise models in the various steps of our considerations.
In the following, we distinguish between different sources of errors related to the concepts of sensor, archive, and measurements of Evans et al. (2013). Figure 1 summarises our procedure. Each section contains a discussion of the implications of the respective error term. Afterwards we discuss the results of applying the respective step in the framework to the output of the TraCE-21ka simulation.

Assumptions on essential error sources 1: Sensor

The sensor, that is, for example, an organism or a physical or biogeochemical process, reacts to multiple parts of its environment.
Researchers' interest is often in only one of the environmental variables. The sensor, S, is likely a nonlinear function of the environment, S(E), where E = {e_i}, with e_i being components of the environment. If our interest is only in the sensor's reaction to one variable, T, then, under this assumption, further components of the environment besides T contribute only noise components η_i to the reaction of the sensor. Errors due to noise are not necessarily additive but can also be multiplicative or could bias the estimate. In a first step, we here assume the sensor reaction to be additive, S(t) = T(t) + Σ_i η_i(t). Any of these errors or noise processes may show auto-correlation in space, in time, or in both. Any such process may, in turn, add memory to the sensor system. Indeed, this memory effect and spatial or temporal correlations may be large. For example, if a process takes place in an environment with slowly and fast varying components, and our interest is in one of the fast components, the low-frequency variations add a noise or error with high auto-correlation in time.
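As a minimal sketch of this additive assumption: the published code accompanying this manuscript is written in R, so the following Python snippet, with hypothetical names and noise amplitudes, only mirrors the idea of adding one Gaussian noise component per additional environmental influence.

```python
import numpy as np

rng = np.random.default_rng(42)

def sensor_record(T, noise_sds):
    """Additive sensor model sketch: the record is the target variable T
    plus one Gaussian noise component per additional environmental
    influence (a simplification; real effects may be multiplicative
    or may bias the estimate)."""
    T = np.asarray(T, dtype=float)
    noise = sum(rng.normal(0.0, sd, T.shape) for sd in noise_sds)
    return T + noise

T = np.linspace(10.0, 12.0, 1000)           # toy 'temperature' series
S = sensor_record(T, noise_sds=[0.3, 0.1])  # two environmental noise terms
print(S.shape)
```

Multiplicative or biasing effects would replace the sum by, e.g., a product or an offset, which is why the later sections treat the bias separately.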

The sensor reacts to all, potentially high-frequency, changes in its environment. This local environment is unlikely to be isolated from the larger-scale system. Thus, additional noise may be due to the sensor reacting to advected environmental properties instead of 'local' ones, or due to the environment redistributing the sensor or the record. In the marine realm but also in lake domains, currents may influence the sensor, while in many domains the wind may affect the recording of the signal.
Furthermore, small- and large-scale spatial variations of the process may affect the signal and contribute to the record. Our approach regards these contributions as noise. All these influences may introduce spatial and, considered here to be of more importance, temporal correlations in those environmental properties which we here consider as part of the noise term. We assume that advection from other regions by currents and wind is especially important in contributing autocorrelation to our noise process. One can see these non-local factors as noise in the archive rather than the sensor.
Besides simple noise, redistributions of the environmental signal may also introduce biases in our estimate of the environment. Any bias is likely not fully time-constant but evolves with the environment on interannual, multi-decadal, and multi-centennial to millennial time-scales. The different time-scales result from the different time-scales of the environment. This is relevant for recent climate changes and interannual to interdecadal climate variability, but it becomes even more important for multi-millennial time-scales where we have to deal with the effects of changing seasons, glaciation, deglaciation, changes in bathymetry, and lithospheric adjustments. All of these processes may lead to biases, and such biases also lead to autocorrelation in the error.

One example of such time-evolving biases are changes in the seasonality of the environmental sensor. While one can see this again as a source of uncertainty in a narrowly defined proxy system from sensor to reconstruction, it is in the end a bias of our attribution of the measurement to one season. We consider this bias on the sensor level. There are other potentially erroneous attributions besides the processes' seasonality. These are the location of the process in all three dimensions, for example, the habitat of living organisms, and a generally only partially correct calibration relationship. Again, these are environmental factors influencing the sensor and we consider them as noise here. However, they reflect a non-stationarity of our reconstruction-calibration relation. Nevertheless, the idea that the modern relations between environment and proxy system worked over the full period of interest is a fundamental assumption of paleo-climatology (e.g., Bradley, 2015).
In the following, we assume three components to be important disturbances of the signal at the sensor level: the environmental noise, the redistribution, and the attribution errors. We reduce the latter to the potential biases due to changes in the seasonality. Taking all three components, the sensor record becomes S(t) = T(t) + η_env(t) + η_redistr(t) + η_season(t), where we for the moment replace η_i by η_env. In the following, we reduce these three components to two terms in our modifications of the input data.

Noise
First, we assume that η_i includes both the effects of environmental dependencies and of redistribution. That is, η_i = η_env + η_redistr. This is the first error term. This in fact implies that we should consider auto-correlated noise processes. If we only modify the model output and concentrate on one parameter T, for example, temperature data, our pseudoproxy at this point becomes P_T(t) = T(t) + η_i(t). The current version of η_i is only a weakly correlated autoregressive (AR) process of order one, which we additionally scale by an ad hoc scaling factor. It thereby only includes a small part of the potential correlations among errors due to redistribution and other processes. The innovations are sampled dependent on time and climate background from a normal distribution whose standard deviation is proportional to S(t_i), a time-dependent standard deviation. The time-dependence mimics a dependence of the noise on the background climate variability on long time-scales. Here, we use a 1000-year moving standard deviation S(t_i) = σ(T(t_{i−499} : t_{i+500})). Our general formulation assumes that noise variability increases with increasing variability in the parameter T.
Obviously, it could also be that the noise variability reduces or reacts totally differently relative to the variability of T. The code includes a switch to invert the moving standard deviation about its mean or to randomize the orientation.
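A climate-state-dependent AR(1) noise term of this kind can be sketched as follows; Python is used for illustration only (the actual implementation is in the provided R code), and the AR coefficient, scaling factor, and edge treatment of the moving window are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def moving_std(x, window=1000):
    """Centered moving standard deviation; edges fall back to the
    full-series standard deviation (an assumption of this sketch)."""
    x = np.asarray(x, dtype=float)
    half = window // 2
    out = np.full(x.shape, x.std())
    for i in range(half, len(x) - half):
        out[i] = x[i - half + 1 : i + half + 1].std()
    return out

def state_dependent_ar1(T, phi=0.3, scale=0.5, window=1000):
    """Weakly correlated AR(1) noise whose innovation standard deviation
    follows a moving std of the climate series T, mimicking noisier
    proxies under more variable background climates."""
    s = scale * moving_std(T, window)
    eta = np.zeros(len(T))
    for i in range(1, len(T)):
        eta[i] = phi * eta[i - 1] + rng.normal(0.0, s[i])
    return eta

T = np.cumsum(rng.normal(0, 0.1, 3000))  # toy red-noise 'climate'
eta = state_dependent_ar1(T)
print(len(eta))
```

Inverting the moving standard deviation about its mean, as the switch mentioned above allows, would simply transform `s` before the AR recursion.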

Bias
We can consider the changes of the seasonality, η_season, as an orbitally influenced bias term. We compute it for any latitude of interest. We apply the orbital bias term as additive, but one may see it as a multiplicative or a nonlinear effect in many cases.
Therefore the code uses it after the noise term η_i. The bias is the second error term in our formulation of modifications at the sensor level. The bias term is a scaling of the changes in annual latitudinal insolation, but it is possible to choose different sub-annual time-periods of interest. The scaling is arbitrary and we refer to the provided code for details. The bias is zero in the year 0 BP. We set it to be positive if the insolation is larger; this can be randomized in the code. The amplitude of the bias is scaled by an ad hoc constant. The bias becomes notable at some latitudes but may be rather negligible elsewhere. We take the bias as Bias(t) = β · I_n, where β is the scaling constant and I_n is a normalised and shifted insolation. The chosen time-period influences the statistics included in the scaling of I_n; the exact calculation is given in the provided code. We consider the insolation since 150,000 BP. q_0.25 is the 25th percentile of the insolation data, and u is generally 1 but can be sampled from U = {−1, 1}.
The pseudoproxy becomes P_T(t) = T(t) + η_i(t) + Bias(t).
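The bias computation can be sketched as follows. The normalisation chosen here, by the distance between the present-day insolation value and the 25th percentile, is an assumption of this sketch, since the exact scaling is defined in the provided R code; the toy insolation series and parameter values are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def insolation_bias(insol, idx_present, beta=0.5, randomize_sign=False):
    """Orbital bias sketch: shift the insolation series so it is zero at
    the 'present' index (0 BP), normalise it, and scale by an ad hoc
    constant beta; optionally randomize the sign (u in {-1, 1})."""
    insol = np.asarray(insol, dtype=float)
    shifted = insol - insol[idx_present]          # zero at 0 BP
    q25 = np.quantile(insol, 0.25)                # 25th percentile q_0.25
    norm = shifted / max(abs(insol[idx_present] - q25), 1e-12)
    u = rng.choice([-1.0, 1.0]) if randomize_sign else 1.0
    return beta * u * norm

# toy precession-like insolation over 150 kyr at 100-year steps
t = np.arange(0, 150_000, 100)
insol = 480 + 20 * np.sin(2 * np.pi * t / 23_000)
bias = insolation_bias(insol, idx_present=0)
print(bias[0])
```

In the actual application, the insolation would come from palinsol for the latitude and sub-annual period of interest rather than from a toy sinusoid.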

Assumptions on essential error sources 2: Archive
So far our approach describes a record of an environmental influence plus two error terms. This record subsequently becomes integrated in an archive. Afterwards, various processes may modify the archive or redistribute it. Modifications include the selective destruction of parts of the record, by processes acting all the time, by sparse random events, or by continually acting random processes. Examples are bioturbation or re-suspension. These processes may result either in a noise correlated in time and space or simply in white noise. Other de facto white-noise errors may result from our finite and random sampling of the archive.
However, this may rather be part of the observational noise.
Such modifications of the archive and sampling issues represent an important step in using inverse reconstruction methods, because it is a priori not clear how the archive is generated and whether an individual measurement represents mean environmental states or relates to single events. In this context, forward models and pseudoproxy approaches of sedimentary proxies are a crucial tool in disentangling the controlling climatic environmental factors in the generation of sediment cores and their interpretation.

Smoothing and noise
Because we focus on sedimentary proxies, we argue that the archiving process foremost is a filter of variability above a certain frequency level, for example, by diffusive processes or bioturbation (compare Dolman and Laepple, 2018, and their references).
Depending on the system in question, this may only affect the very high frequencies, but for other systems it may extend to multidecadal or even centennial to millennial frequencies. On top of this smoothing of the archive, there may be additional noise, as the smoothing function is unlikely to be homogeneous. We assume such a filtering to be the fundamental modification of the record in the archive and, thus, only consider this process in our archive modelling.
Inspired by the simple proxy forward formulations of Laepple and Huybers (2013, see also Dolman and Laepple, 2018), we produce five different versions of the archived pseudoproxy series. The first and second series are simple running averages of the sensor record to which we add a highly autocorrelated AR-process of order one. The two versions differ in the length of the averaging window, the AR-coefficients, and the standard deviations of the innovations. The versions three and four similarly differ in the amount of average smoothing, but we use random window lengths for each date. The rationale for the two different smoothing lengths is to represent both strongly and only slightly smoothed proxies.

The fifth version aims to mimic the behavior of proxies when researchers use only a small part of an available proxy, e.g., pick only a certain number of samples. An example is the simple forward formulation for Mg/Ca proxies by Laepple and Huybers (2013, see also Dolman and Laepple, 2018).
Smoothing lengths and random factors in this approach could depend on the background climate. Indeed, the code includes options for the random smoothing lengths to depend on the mean climate or the climate variability. The provided data uses an approach where the random smoothing lengths follow an autoregressive process around a climate-dependent reference smoothing length, where, considering Vardaro et al. (2009), warmer climates result in shorter smoothing intervals. The smoothed archive records are then either g_1 ∗ P_T, where g_1(t) is the time-dependent filter, or g_2 ∗ P_T + η_AR, where g_2 is the constant smoothing and η_AR is an AR-process added to account for the inhomogeneities in the smoothing.
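The two archive treatments, constant-window smoothing and randomly varying window lengths, can be sketched as below; the window parameters are hypothetical, and for simplicity the random window lengths are drawn independently here, whereas the provided R code lets them follow an AR process around a climate-dependent reference length.

```python
import numpy as np

rng = np.random.default_rng(2)

def smooth_constant(x, width=30):
    """Centered running mean with a constant window (the g_2 case)."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

def smooth_random(x, mean_width=30, spread=10):
    """Backward-looking mean over a window of random length per date,
    a crude stand-in for bioturbation-like mixing (the g_1(t) case)."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for i in range(len(x)):
        w = max(1, int(rng.normal(mean_width, spread)))
        out[i] = x[max(0, i - w + 1) : i + 1].mean()
    return out

x = rng.normal(0, 1, 500)   # toy sensor-level pseudoproxy
a1 = smooth_constant(x)
a2 = smooth_random(x)
print(a1.shape, a2.shape)
```

Note that the backward-looking random window shifts the series relative to the centered constant filter, which is the offset discussed in the results section below.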
The fifth version of the pseudoproxies subsamples over the random filter interval and adds a noise term to mimic a seasonal uncertainty. That is, we sample n years within the filter interval and take the mean over the temperature and the noise for these years. We add another noise term to represent the intra-annual seasonal uncertainty. P_T in this case becomes P_T(t) = (1/n) Σ_{t_j ∈ h(t)} (T(t_j) + η_i(t_j)) + η_s(t), where h(t) represents the sub-sampling and η_s the intra-annual noise. We do not include the bias term for the subsampled proxies. On the one hand, we apply the bias only for the mean annual temperature, i.e., individual seasons show different biases.
While we could account for this by sampling the biases over the different seasons or even months in producing h(t) or η_s, we prefer to keep our model simpler. Excluding the bias term may be interpreted as the seasonal subsampling cancelling out the bias. In reality, any cancellation would not result in a convergence on the simulated climate state but more likely on a recorded value between the biased and the 'true' climate. The coded version of the sub-sampling still includes the bias term as a comment.
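The subsampling step can be sketched as follows, assuming a backward-looking window and hypothetical parameter values; following the text, the bias term is omitted.

```python
import numpy as np

rng = np.random.default_rng(3)

def subsampled_proxy(T, eta, n=5, width=30, sd_season=0.5):
    """Fifth pseudoproxy version (sketch): for each date, draw n years
    from the preceding window h(t), average signal plus noise over
    them, and add an intra-annual noise term eta_s."""
    T = np.asarray(T, dtype=float)
    out = np.empty_like(T)
    for i in range(len(T)):
        lo = max(0, i - width + 1)
        picks = rng.integers(lo, i + 1, size=n)  # sample n years in window
        out[i] = (T[picks] + eta[picks]).mean() + rng.normal(0.0, sd_season)
    return out

T = np.linspace(5.0, 8.0, 400)       # toy 'temperature'
eta = rng.normal(0.0, 0.2, 400)      # toy sensor-level noise
P = subsampled_proxy(T, eta)
print(P.shape)
```

Sampling the biases per season or month, as discussed above, would amount to drawing the picks from season-specific series instead of a single annual one.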

Results
The biased moving average already shows the differences between the target temperature and the pseudoproxy-record (compare Figure 2). The pseudo-archive-series in Figure 3a shows this more clearly. Here we use a randomized smoothing interval.
Differences are less visible for shorter random smoothing intervals (compare Figure 3b). Further panels of Figure 3 add the constant smoothing archive approximations which we modify by an additional highly correlated AR-process (Figure 3c and d).
This procedure randomly amplifies, dampens, or inverts certain biases in the presented case. That is, while the simple random smoothing may emphasize the bias, the AR-procedure overlies this bias with additional variations.
The panels highlight an apparent offset between the randomly smoothed archive series, the constantly smoothed archive series, and the smoothed input data. The smoothed version of the input data as well as the constant filtering use a centered approach, that is, they are symmetric about their date. The time-varying smoothing tries more realistically to imitate a bioturbation approach (compare Dolman and Laepple, 2018, and their references) and thus introduces a shift in the series.
Figure 3a also shows the seasonally subsampled pseudo-archive-proxy. The data ignores the bias term and the resulting series is by construction symmetric around the original data, our target. Nevertheless, there are pronounced deviations from the original data. Considering only the deviations from the moving mean of the target temperature highlights that this approach is notably more noisy than the filtered data but preserves pronounced longer-term excursions of the input data (not shown).

Assumptions on essential error sources 3: Measurements
The archiving also represents a transformation from time-units to archive-distance units: to depths, rings, distances. The proxy becomes a tuple of date and data. Now the dates are uncertain, as each data-point includes information from different original dates due to the smoothing function. The sampling may lead to additional uncertainties due to disturbances of the archive, and the dating of our samples is a profoundly uncertain process.

Measurement error
Prior to dealing with errors due to dating uncertainty, we add a noise term to represent measurement errors and apply it at each date to account for the potentially imperfectly measured series. The term includes not only the errors introduced by our assumed methods of measuring the proxies and the methods' potential to make mistakes. This "true" measurement error may result in biases due to limits of what our methods can detect, or in systematic offsets due to a laboratory-specific, potentially erroneous, approach to the measurement. Potential offsets imply that we should generally expect a certain amount of auto-correlation in this noise. The term further has to account for the accidental handling of the records in the laboratory, for example, influences from storage or from other processing of the samples and the data, which may result in autocorrelated errors if these influences have a systematic component. Thus, it is not necessarily the case that we can consider inter-laboratory reproducibility as white noise. However, the intra-laboratory repeatability is likely indeed a white random process. We also assume repeatability and reproducibility to be part of our measurement error term. While we just mentioned various reasons to assume autocorrelation in this error-term, we only provide a white noise term for the measurement noise. Again, the code allows modifying this.

We apply the measurement error term at the end. However, we introduce this term before dealing with the dating uncertainty since we also provide proxies without dating uncertainty. The measured proxy-series becomes PP^M(t) = PP^T(t) + η^M(t). In reality, we do not have a continuously sampled series, but obtain only samples at certain intervals. Assuming N samples, the sampled pseudoproxy becomes PP_i = PP^M(t_i), i = 1, ..., N. The sampling of the archive likely produces errors in the samples; we assume these are included in the measurement uncertainty. We provide at each grid-point sampled series of the pseudoproxies detailed above.
We do not distinguish between different sampling techniques. We simply sample the records at certain dates and add the described noise term.
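Sampling the archived series and adding the white measurement noise can be sketched as follows (the function name, the regular sampling dates, and the noise level are illustrative assumptions):

```python
import numpy as np

def sample_and_measure(archive, n_samples, sigma_m, rng):
    """Sample an archived series at n_samples dates and add white
    Gaussian measurement noise (no autocorrelation, as in the text)."""
    dates = np.linspace(0, len(archive) - 1, n_samples).astype(int)
    measured = archive[dates] + rng.normal(0.0, sigma_m, size=n_samples)
    return dates, measured

rng = np.random.default_rng(0)
archive = np.sin(np.linspace(0, 20, 22000))   # stand-in for a pseudo-archive
dates, proxy = sample_and_measure(archive, 200, 0.1, rng)
```

Other sampling schemes (irregular dates, depth-based spacing) would only change how `dates` is constructed; the noise term is applied in the same way.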

Dating uncertainty
Dating uncertainty represents a large part of our overall uncertainty for many proxies, especially for sedimentary proxy-records.
In our framework, already the smoothing function redistributes information from one date across the archive. Usually one considers this temporal uncertainty separately from the proxy-record error. For assessing reconstruction methods and simulations, it would be beneficial to be able to include the dating uncertainty within the proxy-error. That is, if we consider proxies as tuples of data and date, we have to transform the uncertainty of the date into an error-term for the data. In the following we distinguish between the dating uncertainty, that is, the uncertainty that a sample is from a certain date, and the dating error, by which we mean the potential error in our (pseudo)proxy due to the uncertain dating.
There are a number of approaches to transfer the dating uncertainty to the proxy-record error (e.g., Breitenbach et al., 2012; Goswami et al., 2014; Boers et al., 2017). Ensemble and Bayesian age-depth modelling approaches also allow inferring an additional error term (e.g., Haslett and Parnell, 2008; Blaauw and Christen, 2011). However, in the present application we want to capture the error in a time-series. Thus, we take a very simple approach, which assumes that the error due to dating uncertainties is related to the climate state over the period of the dating uncertainty. Nevertheless, since we provide sample dates and random sampling uncertainties, the application of age modelling to the pseudoproxies is in principle possible (e.g., following the approach of Dee et al., 2015, 2018).
The code includes several variations of our estimation of an effective dating error. These reflect different amounts of dependence between subsequent samples. In all variants, we only consider dependence between two subsequent samples, while for real proxies the correlations may extend across larger portions of the proxy-record. The following general approach is common to all variations of our procedure: First, we sample uncertainties in time for each sample date. We take these as dating uncertainty standard deviations. These uncertainties can be sampled fully randomly or dependent on the available smoothing interval data from the archive stage. Then we take the effective dating error at each sample date/depth to be a random sample from a normal distribution.
The mean of this distribution is the difference between the sample-data and the mean over the data within plus and minus two dating uncertainty standard deviations. The standard deviation of the distribution is the standard deviation of the differences between the individual data points within this interval and this mean. The effective dating error is then ε^D_i ~ N(μ_i, σ²_i), where μ_i is the mean deviation over the region of influence and σ²_i is the variance of the distribution.
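The described distribution can be sketched as follows (the index-based windows and the helper name are our assumptions; the dating uncertainty is given here in sample-index units):

```python
import numpy as np

def effective_dating_error(series, sample_idx, sigma_dt, rng):
    """Draw the effective dating error for one sample.

    mean: difference between the sample value and the mean of the data
          within +/- two dating-uncertainty standard deviations;
    sd:   standard deviation of the individual deviations from that mean.
    """
    lo = max(0, sample_idx - 2 * sigma_dt)
    hi = min(len(series), sample_idx + 2 * sigma_dt + 1)
    window = series[lo:hi]                 # region of influence
    mu = series[sample_idx] - window.mean()
    sd = (window - window.mean()).std()
    return rng.normal(mu, sd)

rng = np.random.default_rng(1)
series = np.sin(np.linspace(0, 8 * np.pi, 5000)) + rng.normal(0, 0.2, 5000)
err = effective_dating_error(series, 2500, sigma_dt=100, rng=rng)
```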
In the simplest formulation, ignoring the dependence between subsequent dates, the sampled pseudoproxies become PP_i + ε^D_i; the code includes alternative formulations of the pseudoproxy. This initial formulation of the effective dating uncertainty error ignores potential correlation between the dating errors. The most simple way to account for this makes subsequent errors dependent on each other. This formulation has only a minor influence on the results and is included in the code via a binary switch.
A slightly more complex formulation makes the error term at each date dependent on the previous sample's age uncertainties and mean data. Previous refers to archive units instead of time units. Then the dating error becomes ε^D_i = ρ ε^D_{i−1} + ξ^D_i, where ξ^D_i are the random innovations for date i. Our initial choice of ρ = 0.9 can give large effective dating uncertainty errors. A switch in the code allows using this inter-dependent error. Another switch allows considering the dependence between samples as a function of their dates and the dating uncertainty, i.e. a time-dependent ρ(t). The time-dependent dating uncertainty for each date, σ_d(t), is generated randomly (compare σ_D above). We provide data for the case with a time-dependent ρ(t).
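Assuming the AR(1)-type dependence with coefficient ρ sketched above, a chain of correlated effective dating errors could look like this (the per-sample means and standard deviations are taken as given here; in the full procedure they come from the window statistics described above):

```python
import numpy as np

def correlated_dating_errors(mus, sds, rho, rng):
    """AR(1)-like chain of effective dating errors:
    eps_i = rho * eps_{i-1} + xi_i, with innovations xi_i drawn from the
    per-sample normal distributions N(mu_i, sd_i^2)."""
    eps = np.zeros(len(mus))
    for i in range(len(mus)):
        xi = rng.normal(mus[i], sds[i])
        eps[i] = (rho * eps[i - 1] if i > 0 else 0.0) + xi
    return eps

rng = np.random.default_rng(2)
mus = np.zeros(200)          # per-sample distribution means (simplified)
sds = np.full(200, 0.3)      # per-sample distribution sds (simplified)
errs = correlated_dating_errors(mus, sds, rho=0.9, rng=rng)
# With rho = 0.9 the errors are strongly persistent and can grow large.
```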
Alternative simple formulations may include different noise processes, like noise generated from Gamma-distributions. The available smoothing interval data can inform the sampled dating uncertainty. We could further use this information to provide a deterministic, not random, error for each sampled date, that is, we could take a bias based on all dates influencing the selected date within the dating uncertainty.
In our current setup the age uncertainty does not depend on the measurement noise. The measurement error is added afterwards to the series including the effective dating uncertainty error. This decision is arbitrary. On the one hand, a classical dating uncertainty affects the measured value; then PP^T above should already include the measurement error. On the other hand, the dating uncertainty affects the archived values independently of the measurement noise. Therefore we keep both independent.
The measured proxy-series becomes PP^M_i = PP_i + ε^D_i + η^M_i. The final proxy is in temperature units, as is the initial input data. We ignore a separate term for potentially non-linear and climate-state dependent errors in our calibration relationship and assume the measurement noise term accounts for this as well.

A separate term could again be a state-dependent Gaussian noise, or a noise from a skewed distribution whose mode depends on the background climate. A state-dependent bias term could simulate a mis-specified calibration relation, while a time-dependent bias term could simulate a degenerative effect over time within the archived series.
None of these are included in the current version.

Figure 4. [caption truncated] ... 501-year moving mean of the input data, the pseudo-archive series with shorter average smoothing lengths, and the constant smoothing plus AR series with added measurement noise; c) 501-year moving mean of the input data, the subsampled record, and the subsampled record with added measurement noise.

Results
Nevertheless, the biased estimates are occasionally poor matches for the original data. This is also the case for the subsampled data, for which we did not include the bias. Comparing the sampled pseudoproxy series to the smoothed original temperature data (compare Figure 5a) highlights that estimates for past climates may well fall within the range of the original interannual temperature variability but may nevertheless strongly misrepresent the mean climate represented by the sample.
Considering the effective dating uncertainty error, the discrepancies between input data and pseudoproxy are rather small for uncorrelated or weakly correlated age uncertainties. However, in the case of strong dependencies between subsequent data, pronounced biases and mismatches may occur (not shown). The assumed correlation between two dates has a strong influence on the size of these mismatches. We show the case of a time-dependent correlation between subsequent dates, which gives intermediately sized mismatches.

Figure 5. Visualising the sampled records: a) input data and its 501-year moving mean, the pseudo-archive series with longer average smoothing lengths plus the effective dating error, and plus the effective dating error and measurement noise, b) input data and its 501-year moving mean, the constantly smoothed record with longer smoothing length plus AR series with added effective dating error, and with added effective dating error and measurement noise, c) input data and its 501-year moving mean, the pseudo-archive series with shorter average smoothing lengths plus the effective dating error, and plus the effective dating error and measurement noise, d) input data and its 501-year moving mean, the constantly smoothed record with shorter smoothing length plus AR series with added effective dating error, and with added effective dating error and measurement noise, e) input data and its 501-year moving mean, the subsampled record with added effective dating error, and with added effective dating error and measurement noise.

General Results
Figures 2 to 4 present the different versions of the pseudoproxies for the chosen location. Under our assumptions, the influence of the orbital bias term is notable. The approaches using time-dependent smoothing or simple smoothing plus an AR-process may nearly or fully cancel the bias. This effect is less prominent for the time-dependent filter. Generally, both approaches seem to have similar effects. Figure 4 includes the effect when we hypothetically add measurement noise at every date. Under our assumptions this noise is still smaller than or only as large as the original interannual variability but, including biases, mean estimates may fall outside the interannual variability of the original data. In these examples, the variability of the subsampled proxies is comparable to the smoothed ones after a measurement error is added. It is interesting to note that for the smaller smoothing the AR-process seems to cancel the orbital bias more strongly in Figure 3. Figure 5 shows the data-sets sampled at N = 200 dates. It clarifies the error described for the interannual data. The document assets provide equivalent visualisations for another grid-point, which generally confirm the above descriptions. Figure 6 adds a comparison of power spectral densities computed from a wavelet-based approach similar to the Weighted Wavelet Z-transform of Foster (1996); the approach is described by Mathias et al. (2004). For the records it results in prominent differences for multi-centennial to millennial periods. On the other hand, differences are smaller for the irregularly sampled input temperature data but still notable for millennial periods. However, there is an offset between the irregularly sampled data and the regularly sampled input data.

Spectral power
Spectra for full and late records of the various pseudoproxies are generally similar to the irregularly sampled input data spectra (Figure 6b-f), but the offset to the input data can be smaller than in Figure 6a. Differences between sampled late and full records are often largest at intermediate millennial periods. Deviations are largest for the subsampled pseudoproxy approach at long periods (Figure 6f), but they also become notable for the constant smoothing approaches at shorter periods in the centennial band (Figure 6c,e). This is mainly due to the characteristics of the full-period spectra for the constant smoothing, which show an increase in power spectral density for both shorter and longer periods. That is, the constant smoothing full-period spectra are reminiscent of grey-noise spectra. Despite these differences and the apparent offset to the input data spectra, the irregularly sampled spectra for all cases are rather similar.
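Our spectra use the wavelet-based approach of Mathias et al. (2004); as a lighter-weight stand-in, a Lomb-Scargle periodogram also handles irregularly sampled records (this substitution is our suggestion for quick checks, not the method behind Figure 6):

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(3)

# Irregular sample dates over 21 kyr and a red-noise-like record
t = np.sort(rng.uniform(0, 21000, 300))
x = np.cumsum(rng.normal(0, 1, 300))      # strongly autocorrelated series
x = (x - x.mean()) / x.std()              # centre and normalise first

periods = np.logspace(2, 4, 50)           # 100 yr to 10 kyr
freqs = 2 * np.pi / periods               # lombscargle expects angular frequencies
power = lombscargle(t, x, freqs)
# Red-noise-like records concentrate power at the long-period end.
```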

Global data
The supplementary assets for this manuscript include plots of selected series from our analyses at all grid-points starting from the south towards the north (Supplementary document 1 Figure 1 at https://doi.org/10.17605/OSF.IO/ZBEHX/). These series are the input data at the grid-point, the smoothed-plus-AR-process series at the grid-point, and its subsampled version including all uncertainties.

These plots highlight three main points. First, peaks and troughs at some locations are clearly attributable to the specific implementation of the forcing in the TraCE-21ka simulation (He, 2011; see also Liu et al., 2009). That is, these signals are not realistic but due to technical decisions in the production of the simulations. Furthermore, there is potentially unrealistic variability at some grid-points for some periods. Second, the bias term in its current version may have only a small influence at certain latitudes. Third, our noise model often shows larger effects in the mid-latitudes and the tropics. The assets also include plots against the original data on the x-axis for a small selection of grid-points, highlighting the common lack of a clear relation besides the deglaciation.

Figure 7 provides correlation coefficients between the sampled interannual grid-point data and the pseudoproxies including all uncertainties for the strong smoothing plus AR. The four panels show correlations for the samples within the first, second, and third 5,000-year chunks of the original data, and for the samples in the remaining years. We present the data this way to avoid detrending the data over the deglaciation interval. Relations between original data and pseudoproxies are generally weakest in the tropical belt. In the period until present, correlations are overall weak. High-latitude correlations are most notable during the deglaciation and slightly less notable during the first millennia of the Holocene. In these periods, correlations appear to be largest in areas with glacial remnants.

Figure 8 adds, for the first, the last, and the full period, the relative standard deviation σ_T21k/σ_P in the left column and the bias T̄_21k − T̄_P in the right column. T21k refers to the simulation, P to the pseudoproxies. For the standard deviation ratios, we use 501-year moving averages of the TraCE-21ka data.
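Computing correlations in fixed chunks, as done for Figure 7, can be sketched as follows (the chunk length, the toy series, and the function name are illustrative; the actual analysis uses the sampled grid-point data):

```python
import numpy as np

def chunked_correlations(target, proxy, chunk_len):
    """Pearson correlation between two equally sampled series,
    computed separately for consecutive chunks of chunk_len samples.
    Chunking avoids detrending across the deglaciation interval."""
    n_chunks = len(target) // chunk_len
    corrs = []
    for k in range(n_chunks):
        sl = slice(k * chunk_len, (k + 1) * chunk_len)
        corrs.append(np.corrcoef(target[sl], proxy[sl])[0, 1])
    return np.array(corrs)

rng = np.random.default_rng(4)
target = np.cumsum(rng.normal(0, 1, 400))     # stand-in climate series
proxy = target + rng.normal(0, 2, 400)        # noisy pseudoproxy
r = chunked_correlations(target, proxy, 100)  # four chunks
```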
Variability is generally larger in the pseudoproxies except for the North Atlantic and the northern high latitudes in the early period, and it is larger in the pseudoproxies more or less everywhere in the late period. Over the full period, variability is notably larger mainly in the tropics and the Southern Hemisphere; it is about equal over Antarctica and wide regions of the Northern Hemisphere. The variability is clearly larger in the input data only over a small region in the northern Pacific.
The overall largest bias occurs off the coast of southeastern Greenland in the early period in Figure 8. Otherwise, there is a spatial separation between the mid-to-high latitudes and the tropics and subtropics for both periods. The bias is more prominent in the higher latitudes, where it is predominantly positive in the early period but predominantly negative in the late period. The general latitudinal bias pattern arises by construction, since we define the bias as a function of latitudinal insolation.

On generalizations of the errors
While we already chose comparatively simple procedures for our approach to obtaining pseudoproxies from a model simulation, further simplification is likely possible. Such a general expression for the error in proxies over multi-millennial time-scales may be more usable in a number of ad-hoc model evaluations and model-data comparisons. Most importantly, such a generalized approach also allows one to quickly produce ensembles of pseudoproxies.
Following our previous assumptions, the easiest way to obtain such a generalized error-model is to assume a simple, potentially correlated noise model for the sensitivity of the sensor to the environment. Here, we use an AR-process of order one with AR-coefficient φ = 0.7. Either at this point or later, one scales the series or adds a bias term to account for changing seasonality over multi-millennial time-scales. The sum of the input data and this error is then subject to a simple moving average, which captures the essence of the error. In short, the generalized pseudoproxy becomes PP_i = g{T + η_i + Bias}(t_i) + AR(t_i) + ε^D_i + η^M_i, where g is the smoothing, η_i is the initial noise, Bias is the bias term, ε^D is the effective dating error, and η^M is the measurement error. This is conceptually identical to the smoothing plus AR approach presented above, although its derivation is less grounded in real proxies. The provided data differ only in the amount of autocorrelation in the noise terms.

Figure 9. Visualising the simplified essence of the surrogate proxy calculations: a) input data and 501-point moving mean, b) input data plus initial noise and bias term, c) moving mean of input data plus noise plus bias, and the same record plus an AR(1)-process, d) smoothed temperature plus noise plus bias plus AR-process sampled at 200 dates, this record plus the effective dating error, and this record plus the effective dating error and measurement noise, e) smoothed temperature plus noise plus bias plus AR-process sampled at 100 dates, this record plus the effective dating error, and this record plus the effective dating error and measurement noise.
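The generalized recipe can be sketched end-to-end (all noise amplitudes, the smoothing length, the stand-in input series, and the simplified dating error are illustrative assumptions):

```python
import numpy as np

def ar1(n, phi, sigma, rng):
    """AR(1) noise with coefficient phi and innovation sd sigma."""
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + rng.normal(0.0, sigma)
    return x

def generalized_pseudoproxy(temp, bias, window, n_samples, rng):
    """Sketch of the generalized recipe:
    PP_i = g{T + eta + Bias}(t_i) + AR(t_i) + eps_D_i + eta_M_i."""
    n = len(temp)
    eta = ar1(n, 0.7, 0.2, rng)                  # initial sensor noise, phi = 0.7
    kernel = np.ones(window) / window
    smoothed = np.convolve(temp + eta + bias, kernel, mode="same")  # g
    record = smoothed + ar1(n, 0.9, 0.1, rng)    # archive-stage AR term
    idx = np.linspace(0, n - 1, n_samples).astype(int)
    eps_d = rng.normal(0.0, 0.3, n_samples)      # simplified effective dating error
    eta_m = rng.normal(0.0, 0.1, n_samples)      # white measurement noise
    return idx, record[idx] + eps_d + eta_m

rng = np.random.default_rng(5)
years = 22000
temp = 0.5 * np.sin(np.linspace(0, 12 * np.pi, years)) + np.linspace(-2, 0, years)
bias = np.linspace(0.0, 0.5, years)              # stand-in seasonality bias
idx, pp = generalized_pseudoproxy(temp, bias, 501, 200, rng)
```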
Figure 9 summarises results for the generalized approach. It clarifies that, while an error may mask certain features of the past climate evolution, this simple generalized pseudoproxy-generation is unlikely to distort the proxy completely if we take the assumptions made above to be approximately appropriate. Interestingly, the generalization appears to modify the input signal slightly less than the more complex approach. However, as we display slightly different data comparisons here, it is more appropriate to note that the dating uncertainty has only a minor effect compared to the initial bias and AR-process modifications and compared to the subsequent addition of the measurement noise.
While researchers may validly wish for such simplified recipes for producing pseudoproxies, using a full or at least more complex process-based approach is advisable if it is necessary to account for effects of biology, environmental long-term changes, and other weakly constrained uncertainties. More complex approaches further allow better mimicking of non-linearities between the climate and the sensor and thus a truly non-linear pseudoproxy. We additionally compute an ensemble of pseudoproxies at grid-points close to proxies included in Shakun et al. (2012), Clark et al. (2012), or Marcott et al. (2013); Figure 10 shows the locations. Using the generalized approach provides an ensemble based on the most reduced formulation. The provided code allows users to produce ensembles for their input data of interest.
Modifications to the code are as follows: First, we use a number of parameter values sampled from either uniform distributions around the otherwise fixed value or from a list of values. Second, we consider random orientations for the bias and moving standard deviations, that is, we take S as S·u, where we sample u from U = {−1, 1}. We provide the script for the ensemble production as supplementary example code at https://doi.org/10.17605/OSF.IO/ZBEHX. As mentioned above, these changes relax our assumptions on the effect of changes in the background climate.
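Drawing one ensemble member's parameters, including the random orientation u ∈ {−1, 1}, can be sketched as follows (the parameter names and ranges are illustrative, not the exact values in our supplementary script):

```python
import numpy as np

def sample_member_params(rng):
    """Draw one ensemble member's parameters: values from uniform ranges
    around illustrative defaults or from a list of values, plus a random
    orientation u in {-1, +1} applied to the scaling S."""
    return {
        "phi": rng.uniform(0.5, 0.9),            # AR(1) coefficient
        "window": rng.choice([201, 501, 1001]),  # smoothing length (years)
        "sigma_m": rng.uniform(0.05, 0.2),       # measurement noise sd
        "u": rng.choice([-1, 1]),                # random orientation for S
    }

rng = np.random.default_rng(6)
members = [sample_member_params(rng) for _ in range(50)]
```

Each parameter dictionary would then drive one run of the pseudoproxy generation, yielding an ensemble that samples around the fixed default configuration.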
Figure 11 shows results for this ensemble.

Conclusions

This publication presents a flexible yet simple approach for describing the error originating from climatic and non-climatic sources in proxy-records over multi-millennial time-scales, including the last deglaciation. The assumptions are relatively simple, but they are based on similar assumptions in process-based proxy-system forward models.
The approach can easily be extended to compute ensembles of proxies for single locations. We chose to give one set of pseudoproxies for each grid-point of the TraCE-21ka simulation and an ensemble of pseudoproxies at locations close to real proxy-locations. This simulation has a specific climatology (Liu et al., 2009), but a comparison to real proxy data may easily be achieved by considering only anomalies (as done, e.g., by Marsicek et al., 2018). The provided pseudoproxy data and the code to compute further pseudoproxies allow the application of our pseudoproxy-approach for the evaluation of models, the comparison of models to paleo data, and the testing of reconstruction and data-assimilation methods.
We choose only one possible set of parameters in our pseudoproxy-model, but we sample around this set for the ensemble of pseudoproxies. We choose these specific parameters to provide some disturbance to the data without straying too far from the original state. For example, it is quite likely that we face larger biases in reality than represented by our choice. Users should make their own choice of parameters according to their assumptions on the various noise-contributions.
One can easily extend the chosen approach to even longer time-scales. Some modifications of the dating uncertainty may then be advisable to account for the likely sparser data further back in time, to better accommodate the increasing uncertainty, and especially to treat the effective dating uncertainty error for the pseudoproxy data more realistically. Similarly, we do not consider spatial correlations in the noise. Such correlations between locations are probably relevant for some noise-terms, while they are probably less important for others.
We focused on the time-series approach and did not choose a probabilistic approach like, for example, Breitenbach et al. (2012) or Goswami et al. (2014). Nor does our approach as of now explicitly link to probabilistic age-modelling approaches as described by Haslett and Parnell (2008), Blaauw and Christen (2011), or Trachsel and Telford (2017).

There are a variety of other potential approaches for obtaining simple pseudoproxies from the model data. One such example would be to consider an envelope around the model state, randomly select a set of dates from the original data, fit a smooth through this set, and then sample again around this uncertain smooth. Similarly, Gaussian Process Models or Generalized Additive Models may be valuable means of producing pseudoproxies for paleoclimate studies over time-scales longer than the Common Era of the last 2,000 years. For example, Simpson (2018) shows the benefits of Generalized Additive Models for studies on paleoenvironmental time series.
The present approach ignores a variety of possible complications. For example, we currently do not consider hiatuses in the sensor. Furthermore, the dependency on the background climate is small. Nevertheless, we are confident that this approach is of value for the comparison of simulation data and proxy data over long periods, for testing reconstruction methods, and for evaluating different model simulations against each other.
Tables A1 to A4 summarise the considered parameters and noise models. They also clarify whether the parameter settings are used for a global field of surrogate proxies, a more generalized approach, an ensemble calculation, or all.