the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
A regional pCO_{2} climatology of the Baltic Sea from in situ pCO_{2} observations and a modelbased extrapolation approach
Henry C. Bittig
Erik Jacobs
Thomas Neumann
Gregor Rehder
Ocean surface pCO_{2} estimates are of great interest for the calculation of air–sea CO_{2} fluxes, oceanic uptake of anthropogenic CO_{2}, and eventually the Global Carbon Budget. They are accessible from direct observations, which are discrete in space and time and thus always sparse, or from biogeochemical models, which only approximate reality. Here, a combined method for the extrapolation of pCO_{2} observations is presented that uses (1) modelbased patterns of variability from an empirical orthogonal function (EOF) analysis of variability with (2) observational data to constrain EOF pattern amplitudes in (3) an ensemble approach, which locally adjusts the spatial scale of the mapping to the density of the observations. Thus, dataconstrained, gap and discontinuityfree mapped fields including local error estimates are obtained without the need for or dependence on ancillary data (e.g. satellite sea surface temperature maps). This extrapolation approach is generic in that it can be applied to any oceanic or coastal region covered by a suitable model and observations. It is used here to establish a regional pCO_{2} climatology of the Baltic Sea (Bittig et al., 2023: https://doi.org/10.1594/PANGAEA.961119), largely based on ICOSDE ship of opportunity (SOOP) Finnmaid surface pCO_{2} observations between LübeckTravemünde (Germany) and Helsinki (Finland). The climatology can serve as improved input for atmosphere–ocean CO_{2} flux estimation in this coastal environment.
 Article
(18828 KB)  Fulltext XML
 BibTeX
 EndNote
The ocean plays a major role in the global carbon cycle and has a controlling function on the atmospheric CO_{2} content on longer timescales (DeVries, 2022). Since the rise of atmospheric CO_{2} concentrations since the beginning of the preindustrial period (year 1750), the ocean has taken up ∼25 % of the CO_{2} released from human activities (Friedlingstein et al., 2022), with the annual uptake mainly related to the increase in the air–sea pCO_{2} imbalance. The role of coastal and continental shelf waters is more complex. Apart from atmospheric CO_{2} levels, changes in nutrient loads and organic matter supply from land, changes in weathering in the drainage basin, and even changes in the functioning and composition of biological key players on various levels can lead to changes in the inorganic carbon system and thus the source–sink function of coastal seas (e.g. Laruelle et al., 2018; Müller et al., 2016; Carstensen and Duarte, 2019; Kuliński et al., 2022). Moreover, coastal seas provide an important conduit of landderived carbon into the open ocean's interior (e.g. Thomas et al., 2004).
For the Baltic Sea, several attempts have been made to quantify the net CO_{2} air–sea balance in the form of a pCO_{2} climatology as well as to derive trends in surface water pCO_{2}, with partly contradictory results (e.g. Omstedt et al., 2009; Parard et al., 2016, 2017; Becker et al., 2021; Neumann et al., 2022; Wesslander et al., 2010). Most of the approaches either used the output from biogeochemical models or tried to create pCO_{2} fields from mapped proxy data and observational data using extrapolation approaches. Seasonal mapping of pCO_{2} is particularly challenging for the Baltic Sea due to its high regional and temporal variability, a salinity gradient affecting e.g. CO_{2} equilibria, and a large seasonal amplitude caused by high net productivity in spring and summer and entrainment of waters enriched in remineralization products due to mixedlayer deepening in autumn and winter (Schneider and Müller, 2018).
Climatologies of pCO_{2} on a global or oceanwide scale are an important tool for the quantification of the oceanic CO_{2} sink in the framework of e.g. the Global Carbon Budget (Landschützer et al., 2013; Friedlingstein et al., 2022). Tailored regional analysis can help to gain insight into changes in the source–sink behaviour of distinct regions, as recently demonstrated for the northern European shelf, including the Baltic Sea (Becker et al., 2021).
A robust climatology and trend for the Baltic Sea has, apart from refining the estimate of the net CO_{2} flux in relation to the Global Carbon Budget, several implications. As the entire Baltic Sea area belongs to the territorial waters or exclusive economic zones (EEZs) of one of the panBaltic nations, the air–sea CO_{2} fluxes might be of importance for current and future carbon management and accounting schemes in the framework of national emission reduction targets (Luisetti et al., 2020). A pCO_{2} climatology could also serve as a baseline for potential negative emissions applications, including blue carbon or coastal alkalinity enhancement (GESAMP, 2019). Knowledge of monthly pCO_{2} fields and their variability might also help to identify and quantify the impact of perturbations and extreme events, e.g. heat waves (Humborg et al., 2019).
In this work, we build a foundation for such applications. We first present a novel extrapolation approach, followed by construction of a Baltic Sea pCO_{2} climatology. The extrapolation approach is then evaluated and put into context with existing mapping methods. Notable features of the seasonal pCO_{2} climatology are discussed and special attention given to the regional longterm pCO_{2} trend before we conclude our work.
2.1 Extrapolation approach
For mapping from scarce observational data to spatially filled maps of the Baltic Sea, we use an ensemble of truncated empirical orthogonal function (EOF) reconstructions. For a more detailed description than the brief summary below, please consult Appendix A.
EOF decomposition and singular value decomposition (SVD) have been used widely in atmospheric and ocean science (e.g. Lorenz, 1956; Weare et al., 1976; Weare and Newell, 1977; Hannachi et al., 2007). They can be used to efficiently reduce the dimensionality of the original dataset (Lorenz, 1956; Davis, 1976; Preisendorfer, 1988; Monahan et al., 2009; Jolliffe and Cadima, 2016). Here, we use the EOF decomposition of a model dataset X of pCO_{2} in the Baltic Sea to obtain spatial EOF patterns e_{i} of pCO_{2} variability.
In a second step, observational pCO_{2} data Y are used in conjunction with these spatial patterns of variability in a truncated EOF reconstruction to constrain the EOF spatial patterns' amplitudes (Kaplan et al., 2000; Preisendorfer, 1988) and thus to extrapolate from scarce observational data to the full domain (Sect. A3). This step represents an optimization of a cost function Q (Eq. A19), which in our case also takes the observational data uncertainty into account to avoid overfitting (Sect. A4). For a given EOF reconstruction with l spatial modes, we thus obtain both an extrapolated field as well as an estimate of the mapping uncertainty σ_{P}, representing the uncertainty in the extrapolation with the given data constraints. In addition, we derive a representational error estimate ${\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}{P}^{\prime}}$ for the variance not resolved by the given EOF reconstruction due to truncation and use of just l modes (Sect. A4).
However, the choice of how many EOF modes to use for a given truncated EOF reconstruction, or at which level to truncate, is an arbitrary one often inspired by a certain threshold of the total variance explained (e.g. Kaplan et al., 2000).
In a third step, we therefore use an ensemble approach to circumvent this problem: instead of a single EOF reconstruction with a fixed number of modes l, we use a series of EOF reconstructions from just one mode up to a maximum number of l_{max} modes (Sect. A5). This series is then combined by a weighted ensemble at each location with the individual EOF reconstruction's mapping uncertainties as weights (Eqs. A25 and A26; Sect. A6).
Note that the weights are spatially resolved and depend on the available data constraints, i.e. the ensemble weights provide for locality, which includes adaptation of the mapping's spatial scales to the data constraint density, thus providing for a more robust extrapolation and more realistic uncertainty estimates than with a fixed number l of EOF modes.
We thus obtain an ensemble mean pCO_{2} value x_{reconstr}, the average number of modes $\stackrel{\mathrm{\u203e}}{\mathit{l}}$ used in the ensemble of reconstructions at a given location, and an uncertainty estimate σ_{reconstr} (Eq. A28), which consists of the sample variance about the ensemble mean value, the ensembleaveraged mapping uncertainty σ_{P} (from each truncated EOF reconstruction with l from 1…l_{max}), and the ensembleaveraged representational error ${\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}{P}^{\prime}}$ (from each EOF reconstruction) (Sect. A6). All these quantities are obtained without gaps on the full spatial domain of the original model dataset X.
Finally, due to the high temporal dynamics of our Baltic Sea environment, we use an expansion of the EOF reconstruction approach to reconstruct not only the data value, but both the data value and a (shortterm) linear trend, in order to temporally collate (temporally extended) observations into a timecoherent, synoptic picture (Sect. A7; Elken et al., 2019).
2.2 Baltic Sea pCO_{2} climatology
Given limitations of modelled data, we aimed to produce an observationbased pCO_{2} climatology. For this, we used (a) the above extrapolation approach with spatial patterns based on ecosystem model data and (b) observations of surface pCO_{2} from the Surface Ocean CO_{2} Atlas (SOCAT) to produce a monthly climatology of pCO_{2} as well as of the linear, shortterm pCO_{2} trend to temporally collate pCO_{2} observations (Sect. A7) within a month (see Fig. 1 for a visualization of the approach).
2.2.1 Spatial patterns of variability
Spatial patterns of variability e_{i} for our extrapolation approach are based on model data from a Baltic Sea setup of the Ecological ReGional Ocean Model (ERGOM version 1.2). This version of ERGOM includes a simple carbon cycle as described in Kuznetsov and Neumann (2013) with amendments to allow for nonRedfield stoichiometry as outlined in Neumann et al. (2022). In principle, any carboncontaining biogeochemical ocean model of choice can be used as a basis for the patterns of variability e_{i}. Nonetheless, a better carbon representation in the model likely provides bettersuited e_{i} patterns of variability.
The model data variability characteristics are illustrated in the Appendix (Fig. A1). Less than 1 % of the locations show a temporal decorrelation scale ≤7 d, so that we chose a weekly aggregation of the model data for variability pattern extraction. For the climatology itself, we chose a more typical monthly resolution.
ERGOM has been shown to adequately mirror observations of the largescale nitrate, phosphate, oxygen, and carbon distribution (Neumann et al., 2022; Eilola et al., 2011; Neumann et al., 2015). However, shortcomings still exist in the exact magnitude and timing of phytoplankton carbon uptake and release throughout the seasons (Neumann et al., 2022; Fig. 4).
From an ERGOM version 1.2 model run from 1948 to 2020, we used the last 20 years of modelled surface pCO_{2} from 2001 to 2020 averaged into 1044 weekly means. The model run has a horizontal resolution of 3 nautical miles, which yields a dataset X with 12 010 grid points m for the Baltic Sea area south of the Skaggerak (i.e. HELCOM subbasins 2 to 17, HELCOM Secretariat, 2017) and 1044 time steps n.
From these model data X, patterns of variability e_{i} were extracted by EOF analysis (see Appendix A for details). The analysis of our model data retained 224 EOF modes with a minimal crossvalidation error of 22.6 µatm against the model data and a total explained variance of 98.6 %. The spatial patterns e_{i} have the same 3 nautical mile resolution as the model data.
2.2.2 Surface pCO_{2} observations
Surface pCO_{2} measurements in the Baltic Sea were obtained from SOCAT version 2022 (Pfeil et al., 2013; Bakker et al., 2016; Bakker et al., 2022), which collects surface pCO_{2} data from underway observations of ships of opportunity (SOOPs) or research vessels. All observations are based on CO_{2} measurements in air equilibrated with sea surface waters (Körtzinger et al., 1996; Pierrot et al., 2009) with a typical accuracy of 2–5 µatm (in some cases ≤10 µatm), which is indicated by the respective quality flags A–E (Pierrot et al., 2009; Bakker et al., 2016).
Here, Baltic Sea pCO_{2} data for the period June 2003 to December 2021 were used. They were from the ICOSDE SOOP Finnpartner/Finnmaid line (Schneider et al., 2006; Gülzow et al., 2011) between LübeckTravemünde and Helsinki, which covers the southern and central Baltic Sea, or the ICOSSE SOOP Tavastland line between LübeckTravemünde and Oulu or Kemi, which additionally covers the northern basins starting from 2019 (Fig. 2a). Additional surface pCO_{2} data originated from RV Aranda cruise ARA04/2019 in February and March 2019 and RV Elisabeth Mann Borgese cruise EMB214 (Rehder et al., 2021) in May and June 2019 (Fig. 2b), which were carried out as part of the BONUS INTEGRAL project. Data processing and quality control followed the SOCAT guidelines. In total, data coverage in the southern and central Baltic Sea is high, with locally up to 189 out of 223 months covered (i.e. June 2003 to December 2021). Data in the northern Baltic Sea, however, are only available since February 2019, with observations during a total of 15 out of 35 months.
2.2.3 Monthly climatology construction
For every month t of the 189 months with observations, the observations were centred temporally on the 15th of each month as t_{∘} (see Sect. A7), so that the extrapolation approach provides a field $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ with a spatial distribution of

the reconstructed pCO_{2} (x_{t,reconstr}, Eq. A27) at t_{∘},

the pCO_{2} error estimate (σ_{t,reconstr}, Eq. A28) at t_{∘},

the (shortterm) pCO_{2} trend at t_{∘},

an error estimate of the pCO_{2} trend at t_{∘}, and

the average number of patterns e_{i} used in the reconstruction ($\stackrel{\mathrm{\u203e}}{\mathit{l}}$, Eq. A25)
for each month with observations (e.g. Fig. 3 for $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$s of May 2013 as an illustration).
The construction of a monthly pCO_{2} climatology for the entire Baltic Sea has its bottleneck in the temporal coverage of the northern basins, i.e. the Bothnian Sea, the Quark, and the Bothnian Bay. While the central and southern Baltic Sea has approximately 85 % of all the months covered since June 2003, observations in the northern basins only start in February 2019, with 43 % monthly coverage until the end of 2021. While the extrapolation approach provides fully filled maps of pCO_{2}, including the northern basins, also for months preFebruary 2019, they are associated with larger uncertainties, i.e. limited explanatory power, in the northern basins (e.g. Fig. 3b). To not create a wrong impression of seasonality or interannual variability, especially if the mapping uncertainty fields σ_{t,reconstr} are not respected by a casual user, we chose to provide a mean seasonal climatology Y. Here, the weighting of the climatology (Eq. 1; see below) ensures the preference of reconstructions with proper observation constraints (i.e. February 2019 and onwards for the northern basins) over reconstructions without them (i.e. preFebruary 2019).
For a given location m, χ_{m} denotes the (reconstructed) time series of $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ (with size 1×189) and y_{m} the time series of the mean monthly climatology Y (with size 1×12), respectively. The monthly means y_{m} were then obtained by
where w_{m} is the time series of inverse variance weights (Eq. 2) with size 1×189 at the given location m, O_{m} a time series operator or matrix of size 12×189 that selects the monthly mean y_{m} corresponding to the month of χ_{m}, t−t_{ref} the time difference (in decimal years) between each of the 189 monthly $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ and a reference time, t_{ref}, and g_{m} an extra degree of freedom to allow for a linear longterm trend in $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ for each location m (e.g. surface pCO_{2}). As a reference time, the middle of 2013 was chosen, which closely corresponds to the mean date of all the observations.
Equation (1) represents a system of 189 linear equations with 13 degrees of freedom to calculate 12 weighted mean values y_{m} and a longterm trend g_{m}. Through the inverse variance weights (Eq. 2), monthly maps χ_{m} with betterconstrained data, i.e. smaller variance of reconstruction ${\mathit{\sigma}}_{m,\mathrm{reconstr}}^{\mathrm{2}}$ at the given location m (and for the given month), obtain preference in the weighted mean (Eq. 1).
When done for each point m, one obtains the mean seasonal climatology Y (with size m×12), centred on 2013, as well as a map of its mean linear (longterm) trend G (with size m×1) for each of the five fields $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ discussed.
2.2.4 Dataset description
The dataset (Bittig et al., 2023) consists of two plaintext files. One contains the mean seasonal climatology Y and the other the mean linear (longterm) trend G. Based on the 3 nautical mile resolution of the original model data, both feature 12 010 grid points m for HELCOM subbasins 2 to 17. For the months January through to December, the climatology file gives pCO_{2} (µatm), a pCO_{2} error estimate (1σ) (µatm), a (shortterm) pCO_{2} trend for the 15th of each month (µatm d^{−1}), a (shortterm) pCO_{2} trend error estimate (1σ) (µatm d^{−1}), and the average number of patterns ($\stackrel{\mathrm{\u203e}}{\mathit{l}}$) (see Figs. A3 and A4). The longterm trend file gives both the longterm linear trend G (µatm yr^{−1}) (Fig. 7) and its error estimate (1σ) (µatm yr^{−1}).
3.1 Extrapolation approach
The mapping approach gives fully filled fields on the entire spatial domain from scattered observational data. The mapped pCO_{2} is in a similar range and frequency distribution to observations with improvements in terms of distribution and correlation over the original model pCO_{2} (Fig. 4a). The mapped fields are smooth and without spatial gaps or discontinuities. The pCO_{2} error estimate is reduced in the spatial vicinity of observations. In contrast, pCO_{2} error is markedly increased both near shore as well as in subbasins not covered by observations, e.g. the northern basins or the Gulf of Riga for a SOOP Finnmaidbased reconstruction (e.g. Fig. 3b). Still, pCO_{2} values remain within reasonable margins even in those areas (e.g. Fig. 3a). We cannot observe a tendency of the mapping approach to give extreme values or outliers in the absence of observations (compare Fig. 4a). Rather, with few or no data constraints, variability pattern e_{i} amplitudes in the ensemble tend towards zero, i.e. towards zero deviation, which is equivalent to the spatially resolved temporal mean pCO_{2} of the model (Appendix A).
Addition of a linear temporal trend to collate observations to a common time t_{∘} helps to reduce the mismatch between temporally spread observations and a fulldomain mapping of a dynamic coastal system (compare e.g. Figs. 3e, f and A2c, d, respectively).
However, for about half the grid points of the 189 monthly mappings, the magnitude of the shortterm temporal pCO_{2} trend (e.g. Fig. 3c) at a given location m is insignificant, i.e. within its trend error estimate (e.g. Fig. 3d). At the same time, a considerable portion of the significant pCO_{2} trends is found at grid points m in the direct footprint of observations where the trend error estimates are small in the first place. This demonstrates that a monthly temporal resolution is sufficient for construction of a Baltic Sea pCO_{2} climatology with our approach.
3.2 Quality of interpolation and extrapolation
To assess the quality of the obtained fields, we consider (a) the mapped result against the original observations and (b) a comparison of mappings from concurrent subsets of observations.
For the first aspect, we consider the residual pCO_{2} against SOCAT observations for all 189 monthly mapped fields from the climatology construction (e.g. Fig. 3e, f). We use typical metrics for interpolation methods (correlation, bias, standard deviation). As the monthly fields $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ were constructed from those observations, this assesses primarily to which degree the data constraints were taken into account for the mapping, e.g. the balance between pointtopoint reproduction and smoothing with the given patterns of variability e_{i}.
The comparison shows highly correlated pCO_{2} data (R^{2}=0.95), a similar pCO_{2} distribution (Fig. 4a), and an unbiased mapping with a standard deviation of 26 µatm (Fig. 4b). This includes both misfits of the mapped fields as well as variations within observations of a given grid point location (compare the spread between observations and mapping in Fig. 3e). The error estimate is of a similar magnitude (Fig. 4c) and is well collocated. That is, residuals are within 1× the error estimate in 68 % of the cases (Fig. 4d).
The second aspect, how data constraints in one area transfer to accurate predictions in another part of the domain, is more difficult to assess objectively. It requires the observational data to be split into a subset that is used for mapping and a subset that is deemed independent and used for evaluation. How this subsetting is done, however, directly influences the outcome of the evaluation. That is, while such evaluations may seem instructive and generalizable, they are highly dependent on design and thus carry some degree of arbitrariness.
To illustrate this, we consider a series of monthly mappings for May 2019, the month with the highest density of observations in our 189 monthly mappings. We use chessboardlike grids with alternating white and black boxes for subsetting, where observations in the white boxes are used for mapping and observations in the black boxes are used for evaluation. Grid box sizes vary from 1^{∘} × 1^{∘}, 2^{∘} × 2^{∘}, to 3^{∘} × 3^{∘} subsetting grids (see Fig. 5a, e, and i for example grids). To check for dependence on a specific data distribution, the 1^{∘} × 1^{∘}, 2^{∘} × 2^{∘}, and 3^{∘} × 3^{∘} subsetting grids were shifted spatially in latitude and longitude so that every grid size had eight different realizations, each with different sets of data in the mapping and evaluation subsets, respectively (Fig. 5).
For 1^{∘} × 1^{∘} subsetting, the eight realizations show a relatively coherent picture: differences ΔpCO_{2} between mapped and observed pCO_{2} (Fig. 5b) are slightly larger (by 5–12 µatm) than for interpolated data (Fig. 4b) and are mostly unbiased. pCO_{2} error estimates are elevated (Fig. 5c vs. Fig. 4c), while the ratio of differences and error estimates, i.e. the collocation of bias and error estimate, shows a similar distribution for both extrapolation (Fig. 5d) and interpolation (Fig. 4d). With increasing spatial scales of the 2^{∘} × 2^{∘} and 3^{∘} × 3^{∘} subsetting, pCO_{2} differences and error estimates become larger, as would be expected (Fig. 5f, j and g, k). Encouragingly, however, the ratio of differences and error estimates preserves a similar shape (Fig. 5h, l). That is, the mapping approach seems to provide suitable mapping error estimates.
More concerning, however, are increasing differences between the eight realizations with increasing spatial scales. Especially for the 3^{∘} × 3^{∘} subsetting, some realizations show a pCO_{2} bias, which additionally can be different between realizations (blue vs. orange realizations, Fig. 5j), as well as noticeable differences in pCO_{2} error estimates and thus mapped pCO_{2} (e.g. green realization, Fig. 5k). This illustrates a strong dependence on the specific choice of subsetting, i.e. which data are included in mapping and which data are excluded for validation.
We conclude that, with a given choice of subsetting, one implicitly chooses which statistical values one wants to get out. While our evaluation could be described as being well designed, it turns out to be startingpointdependent. If a month with less data coverage were chosen, differences between realizations (i.e. which data end up in the training or validation subset) would be even larger. Therefore, based on the above illustration, any subsettingbased evaluation needs to be taken with a sufficient amount of caution, as the subsetting design will imprint itself on the outcome.
3.3 Baltic Sea pCO_{2} climatology
3.3.1 pCO_{2} distribution
The mean seasonal pCO_{2} distribution x_{reconstr} matches previously published seasonal cycles of surface pCO_{2} in the western and central Baltic Sea (Schneider and Müller, 2018): a strong spring bloom pCO_{2} drawdown occurs between March and May to a low around 200 µatm in May, some slight increase in June, and a second summer bloom low in July, followed by relaxation towards atmospheric levels in September and October and supersaturated levels during late autumn and winter, peaking around 500 µatm in December (Fig. 6a; Figs. A3 and A4, first column).
Regionally, the western basins lead the mean seasonal cycle, while the central Gotland Basin and the Gulf of Finland trail behind. The productive season is even shorter in the northern basins, with the major pCO_{2} drawdown in the Bothnian Sea occurring in June (Figs. A3 and A4, first column). Similarly, the seasonal amplitude is less pronounced in the western or northern basins than in the Gotland Basin and is most intense in the Gulf of Finland. For the northern basins, this is the first fully seasonal climatology from pCO_{2} observations.
The pCO_{2} error estimate σ_{reconstr} shows a minimum of 12 µatm. Close to observations, the pCO_{2} error estimate of the mean seasonal climatology is of the order of 20 µatm and as such close to the value of the individual mappings (compare Fig. 6b and Fig. 4c). Further away, σ_{reconstr} increases up to a level around 90 µatm (95th percentile), with a spatial preference for high σ_{reconstr} in the northern basins as well as in the Gulf of Riga, the Oder Bight, and other sheltered, nearshore areas (Figs. A3 and A4, second column).
3.3.2 pCO_{2} shortterm trends
In general, the shortterm pCO_{2} trends follow the mean seasonal pCO_{2} cycle with on average negative trends, i.e. decreasing pCO_{2}, during winter and spring. Positive trends prevail during late summer and autumn (Fig. 6c), with the strongest trend magnitudes occurring in the northern Baltic proper and the Gulf of Finland (Figs. A3 and A4, third column).
However, in 8 out of the 12 months, the majority of estimated trends are insignificant. That is, more than 70 % are smaller than their error estimate, indicating that inclusion of a shortterm trend in the mapping may not be required for these months. Only in March and April as well as in August and September are the majority of pCO_{2} trends significant (despite elevated pCO_{2} trend error estimates occurring at the same time in some areas; see Fig. 6d). That is, significant shortterm pCO_{2} trend magnitudes fall together with the strong springtime pCO_{2} drawdown as well as with autumnal mixedlayer deepening and entrainment of highpCO_{2} waters (Fig. 6; more in Jacobs et al., 2021).
Like for the pCO_{2} distribution, trend error estimates are higher in the northern basins as well as the Gulf of Riga, Oder Bight, and other sheltered, nearshore areas (Figs. A3 and A4, fourth column).
3.3.3 Number of patterns or mapping scales
The average number of patterns $\stackrel{\mathrm{\u203e}}{\mathit{l}}$ per month can be seen as a qualitative indicator to show preference for reconstructions with larger or smaller spatial scales, respectively, in the weighted mean (Eq. A25) of the reconstruction.
For the climatology Y, $\stackrel{\mathrm{\u203e}}{\mathit{l}}$ shows a maximum around 105 patterns (Fig. 6e). This level is at about 45 % of the maximum number of patterns, i.e. indicating only a slight preference for reconstructions with largerthanaverage scales in the weighted mean. However, a notable fraction with a higher number of patterns $\stackrel{\mathrm{\u203e}}{\mathit{l}}$ and a peak around 140 patterns exists (Fig. 6e). These are located in the vicinity of the observations, where the preference in the weighted mean is for reconstructions with stronger smallscale features (see e.g. Fig. 3g).
There is no seasonal imprint on the average number of patterns $\stackrel{\mathrm{\u203e}}{\mathit{l}}$, which is in contrast to the pCO_{2} and pCO_{2} shortterm trend distributions (Fig. 6).
3.4 Longterm pCO_{2} trend
The mean climatology Y is constructed from the 189 monthly mappings $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ and a linear longterm trend G (see Eq. 1 and Sect. 2.2.3). For pCO_{2}, the analysis shows a significant longterm pCO_{2} increase of $+\mathrm{1.4}\pm \mathrm{0.5}$ µatm yr^{−1} (1σ) in the southern Baltic Sea from the Great Belt to the Bornholm Basin and southern parts of the Gotland Basin (Fig. 7). A similar increase ($+\mathrm{1.5}\pm \mathrm{0.7}$ µatm yr^{−1}) is obtained in selected regions of the northern Baltic proper, the Åland Sea, and the Gulf of Finland. A positive trend in pCO_{2} is also visible in coastal areas along the Finnish coast in the northern basins as well as in the Gulf of Riga. However, multiyear observations in those areas are unavailable to support a longterm pCO_{2} trend (see Sect. 2.2.2; Fig. 2). In all the other areas, the data available do not indicate a significant longterm trend, i.e. one that is outside the 68.3 % (1σ) confidence bound. This includes in particular the western and eastern Gotland Basin but also the northern basins (Fig. 7), where the observational time series is likely too short to allow for the detection of a significant longterm trend (Fig. 2).
4.1 Extrapolation approach
The extrapolation approach uses an EOF analysis of the data covariance Q as a foundation. When working with an EOF reconstruction, one needs to bear in mind the purely mathematical nature of the EOF modes' e_{i} construction. They are built by (1) maximizing the (partial) variance they explain and by (2) being orthogonal to all previous EOF modes (Preisendorfer, 1988; Monahan et al., 2009; Dommenget, 2015). They may therefore indicate some causal connection within the original data X, which requires confirmation by other means. They may, however, just as well group some portion of (partial) variance in one part of the domain with some portion of (partial) variance in another part of the domain just to mathematically maximize the amount of (partial) variance of the given EOF mode e_{i}. This can give rise to apparent EOFmode “teleconnections”, where one part of the data or spatial domain is seemingly tightly coupled to another part of the data, e.g. for the second EOF mode e_{i=2}, which always has a dipole pattern. Often, there is a temptation to attribute such couplings or “teleconnections” to a driving mechanism. However, given its origin, patterns of EOF modes should be interpreted with great caution (and only with ancillary, supporting data) and are best seen to be of a mathematical rather than mechanistic nature (Monahan et al., 2009; Dommenget, 2015). We therefore did not try to assign physical mechanisms or drivers to specific EOF patterns e_{i} in our work.
The use of EOF modes as a basis for extrapolation ensures that the extrapolated map covers the full spatial domain, that it is gapfree, and that it is discontinuityfree.
A key aspect of truncated EOF reconstructions is the number l of significant modes included. Our crossvalidation approach provided a relatively large number of l_{max}=224 modes. The relatively large number of modes is a prerequisite for a small representational error P^{′}, i.e. for the covariance not resolved by the collection of truncated EOF modes, which gives a lower bound on the uncertainty of the mapping σ_{reconstr} (Eq. A28). In our case, the average ${\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}{P}^{\prime}}$ when using all 224 patterns e_{i} amounts to 14.7 µatm. If only the first 50 patterns e_{i} were included, the average representational error would be as high as 29.1 µatm.
A reconstruction with a small number of modes provides for a more uniform, largescale homogeneous field, where gaps in the data are filled by the largescale picture. However, such reconstructions may lack the flexibility to reproduce real features of the observations, e.g. through too strong smoothing. Conversely, a large number of modes provides for a fine, smallscale field with high flexibility. However, features in some areas without nearby observations may be badly constrained with the risk of “ghost” signals.
By using an ensemble of reconstructions that cover the entire range of 1…l_{max} modes and by using the mapping variance ${\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}P}^{\mathrm{2}}$ as weights (Eqs. A25 and A26), we find an optimal tradeoff in our mapping approach, where both aspects are balanced. In this way, the scales used in the mapping are adjusted spatially according to the distribution of observations as constraints.
The cost function of the mapping approach minimizes the residual between observations and mapped data (Eq. A19). In this way, a synoptic pCO_{2} mapping is obtained. If observations, however, are temporally extended compared to the system's temporal dynamics, the pCO_{2} mapping can be distorted and cannot represent a synoptic picture, depending on the spatial and temporal pattern by which observations are obtained (Elken et al., 2019).
The introduction of a temporal trend at each location (Sect. A7) as an additional degree of freedom reduces such distortion caused by sampling. It both improves the fit to the data (compare Fig. A2c, d vs. Fig. 3e, f) and provides extra information.

For a given mapping, a strong or weak trend indicates where temporal dynamics are high or low and informs on where the frequency of observations should be enhanced or not, respectively.

For a time series of mappings, information on both the value and trend allows for a more accurate interpolation by a cubic Hermite spline (Appendix B) compared to a standard, linear interpolation of the values alone (e.g. Fig. 8).
For our mapping of surface pCO_{2} in monthly intervals, we conclude that the inclusion of a temporal trend in the mapping is required, particularly for spring and autumn. For months without strong trends, we do not see adverse effects in the mapped pCO_{2} compared to an extrapolation without inclusion of a temporal trend.
The dynamics of the pCO_{2} field are prescribed by the model data covariance matrix Q (Fig. A1a). The closer the model and realworld variability match, the easier it is for the mapping to reflect reality and to provide a most realistic picture. Nonetheless, also for a model dataset X that may lack some realworld features or their magnitude, the observational constraints provide an improvement in the mapped field compared to the original model (Fig. 4).
The mapping error estimates are elevated where observations are scarce or dynamics are high to start with. For our pCO_{2} mapping, the model covariance Q is the origin of a pronounced coast–basin difference in the pCO_{2} variance caused by both physical and biological drivers in the model. This prescribed data covariance distribution is retained in the mapping error estimates σ_{reconstr}. For example, they are enlarged in dynamic river plumes, in the sheltered, shallow Gulf of Riga, or in other nearshore areas, unless there are observations to constrain dynamics and thus reduce σ_{reconstr}. In regions with few observations, σ_{reconstr} approaches the prescribed data variance from Q (Fig. A1a).
We do not observe a seasonality in the number of patterns $\stackrel{\mathrm{\u203e}}{\mathit{l}}$ of the mapping (Fig. 6e). At the same time, the pCO_{2} value varies strongly between seasons. In addition, there is a slight preference for observations to occur in spring, summer, and autumn over winter months. However, there is no discernible difference in the spatial coverage of pCO_{2} observations between seasons (data not shown). The aseasonality of $\stackrel{\mathrm{\u203e}}{\mathit{l}}$ can be explained by the “largescale” mapping error σ_{P}, used for the ensemble weights (Eq. A26) to only depend on the distribution of samples H_{t} and their respective observational error estimates S. Therefore, the extrapolation approach does not discriminate or distinguish by (pCO_{2}) value, and there is no seasonal imprint in the number of patterns.
4.2 Comparison with other pCO_{2} mapping approaches
SOCAT observations have been used previously to build surface pCO_{2} climatologies of the Baltic Sea (see Fig. 9 for examples). They rely on observationdriven extrapolation approaches, which can be grouped into different categories: (a) one where sparse data points are extrapolated based on some statistic metric of influence, e.g. a spatial decorrelation length or timescale, and (b) one where parametrizations with proxy data in combination with filled fields of those proxies are used for extrapolation (Rödenbeck et al., 2015).
Our mapping approach belongs to the first category. For a given month or time window, we use the available pCO_{2} data directly and one to one without an added transformation to inform the extrapolated pCO_{2} map. As a (spatial) metric of influence, the variability patterns are used. This gives a direct link between observations of a given month or time window and the mapped field. Conversely, if there are no pCO_{2} observations in a given time window, this directly translates to a (time) gap in the mapped fields for any approach of the first category.
Most previous pCO_{2} mappings in the Baltic Sea belong to the second category (e.g. Parard et al., 2016; Becker et al., 2021). In a first step, they use all available pCO_{2} data together with collocated proxies (e.g. location, sea surface temperature, chlorophylla concentration, water depth, distance to the coast, season) to establish a relationship or parametrization between proxies and pCO_{2}. In a second step, distributions of those proxy parameters are used to establish a pCO_{2} distribution in time and space. Here, pCO_{2} data of a given month do not directly inform the extrapolated pCO_{2} map but only indirectly through the established relationship using all the data. Conversely, if there are no pCO_{2} observations in a given time window but proxy data are available, they can be used to estimate a pCO_{2} map nonetheless. That is, (temporal) gap filling is possible with the twostep procedure of approaches of the second category.
As a drawback, mapped pCO_{2} fields of the second category rely on the validity of the proxy relationship and, equally important, on good proxy input data. In some cases, e.g. satellitebased sea surface temperature or chlorophylla concentration, proxies may be (seasonally) prone to artefacts, e.g. due to stronger cloud cover in winter months. Such artefacts then directly translate into mapped pCO_{2} fields and may create unrealistic patterns or outofrange values (e.g. Fig. 9a, d; Parard et al., 2016). A mapping approach of the first category such as the one presented here, which uses the observations directly, is not affected by externalsource data quality (Fig. 9i–l).
Wesslander et al. (2010)Wesslander et al. (2010)Laruelle et al. (2018)Schneider and Müller (2018)Becker et al. (2021)Becker et al. (2021)4.3 Longterm pCO_{2} trend
The longterm trend of surface pCO_{2} in Baltic Sea surface waters has been studied before (Table 1): Wesslander et al. (2010) describe an increase in monthly pCO_{2} (approximately +20 µatm yr^{−1}) for the 1993–1998 period in the eastern Gotland Basin, which seems mostly driven by less intense summer pCO_{2} minima (from approximately 50 µatm in 1993 to approximately 250 µatm in 1998). For the longer period 1993–2009, they do not detect a significant trend. While Wesslander et al. (2010) use pCO_{2} calculated from pH and alkalinity, all further studies are based on the same surface pCO_{2} observations archived in SOCAT (Bakker et al., 2016), albeit using different methods and time extents.
Laruelle et al. (2018) found an increase of $+\mathrm{2.9}\pm \mathrm{2.4}$ µatm yr^{−1} (1σ) for wintertime pCO_{2} from 1995 to 2011 in the southern and central Baltic Sea. At the same time, they see a similar spatial gradient to our work (Fig. 7) with a stronger increase in the southwest and a zero or even negative trend in the northern Gotland Basin.
Schneider and Müller (2018) describe an increase of +4.6 to +6.1 µatm yr^{−1} (1σ between 0.6 and 1.5 µatm yr^{−1}) for different areas for the 2008–2015 period. However, their analysis did not deseasonalize the observations prior to linear trend analysis. During their analysed period, there is an increase in observations during autumn and winter (with typically high pCO_{2}) and a decrease in early spring observations (with typically low pCO_{2}), while latespring and summer observations have a similar data coverage over the years. Their relatively high pCO_{2} trend estimate therefore seems to be partially caused by a seasonal shift in data availability that was left unaccounted for in the analysis.
Becker et al. (2021) derived trends of $+\mathrm{2.05}\pm \mathrm{0.12}$ µatm yr^{−1} for the southern Baltic Sea (south of 56^{∘} N) and $+\mathrm{1.84}\pm \mathrm{0.21}$ µatm yr^{−1} for the central Baltic Sea (between 56 and 61^{∘} N) for the period 1995–2016. However, they observe a stronger trend in earlier years (starting in the 1990s) than for later periods (starting in the mid2000s).
Our analysis covers the more recent period 2003–2021 and gives significant trends of $+\mathrm{1.4}\pm \mathrm{0.5}$ µatm yr^{−1} (1σ) in the southern Baltic Sea (south of 56.5^{∘} N) and $+\mathrm{1.5}\pm \mathrm{0.7}$ µatm yr^{−1} in the Åland Sea, parts of the northern Baltic proper, as well as the Gulf of Finland (between 59 and 61^{∘} N). The central Baltic Sea and the northern basins themselves show no significant trend. The positive trend ($+\mathrm{1.2}\pm \mathrm{0.6}$ µatm yr^{−1}) in coastal waters of the northern basins along the Finnish coast as well as in the Gulf of Riga (Fig. 7) currently has no support from direct observational constraints in those areas (Fig. 2). In contrast, it likely originates from high similarity to other coastal waters (e.g. along the southern Baltic shore) in the model data variability analysis and thus in the variability patterns e_{i} used for mapping. As such, the pCO_{2} increase may be realistic but lacks field data support.
Together with the literature, our results seem to indicate a reduction in overall surface pCO_{2} increase in the Baltic Sea during the past decades or even its complete absence like in the central Baltic Sea (Fig. 7) in recent periods. While atmospheric pCO_{2} levels are still rising (approximately +2.4 µatm yr^{−1} in the past decade; Lan et al., 2022), this has implications for the CO_{2} balance between the atmosphere and the Baltic Sea. This supports previous claims of coastal seas becoming a less intense source of CO_{2} or eventually turning into (an increasing) CO_{2} sink (Laruelle et al., 2018).
Surface pCO_{2} data are available from SOCAT (https://doi.org/10.25921/1h9fnb73, Bakker et al., 2022; Bakker et al., 2016) and Rehder et al. (2021). ERGOM model output data are available at https://threddsiow.iowarnemuende.de/thredds/catalogs/projects/integral/catalog_pocNP_V04R25_3nm_agg_time.html?dataset=IOWTHREDDSBaltic_pocNP_V04R25_3nm_agg_time_2020112012 (Neumann, 2021). The climatology is available from Bittig et al. (2023): https://doi.org/10.1594/PANGAEA.961119.
In this work, we developed an extrapolation approach that combines two worlds: models, specifically the distribution and connectivity that exist in model data variability; and observations in that they provide constraints of the realworld picture.
The most notable features of the approach are that it does not tend to give extreme, outofrange values even with few data constraints and that it provides local error estimates, which reflect both underlying variability, e.g. coast–basin gradients, and observational data constraints. We consider of particular merit the fact that the extrapolation scheme adapts its spatial scales to the number of observations in a certain area, leading to a sound representation of less uncertainty where more data are available.
Used together with highquality surface pCO_{2} data from SOCAT, we established a climatology that covers the entire Baltic Sea. Given the present data scarcity in the northern basins of the Baltic Sea, a mean pCO_{2} climatology is what is reliably achievable today. It can serve as a seasonal baseline of regional pCO_{2} evolution that enables assessment of interannual variations, e.g. with respect to timing or amplitude. With sustained and/or enhanced observations in such datapoor regions, a fully monthly resolved pCO_{2} dataset will become realistic in the future. To this end, SOOP Tavastland and SOOP Silja Serenade (crossing between Stockholm and Helsinki) provide a positive outlook for uncertainty reduction. Both SOOP lines were recently adopted by ICOS.
Finally, our extrapolation approach as well as the method to establish a climatology are not limited to pCO_{2} or the Baltic Sea. Instead, they are transferable to other areas and parameters where the present work can serve as a template.
A1 Notation and mathematical background
To represent a spatial dataset at a given time t with m spatial points, we use the data vector x_{t} with size m×1. For a spatial time series of n times, we use the data matrix X with size m×n, where the n columns correspond to n data vectors x_{t}. Data in x_{t} and X are the deviation from the spacedependent temporal mean ${\stackrel{\mathrm{\u203e}}{\mathit{x}}}_{m}$. The empirical data covariance matrix Q_{XX} with size m×m is given by
The singular value decomposition (SVD) or empirical orthogonal function (EOF) decomposition of the data matrix X decomposes X into a series of orthonormal factors or principal components according to
where E is a m×m orthonormal matrix, Σ a m×n rectangular diagonal matrix, and A a n×n orthonormal matrix. With the above convention on m and n, the column vectors e_{i} (with size m×1) in E represent spatial patterns, while the column vectors a_{i} (with size n×1) in A represent amplitude time series. They are also called leftsingular vectors and rightsingular vectors, respectively. The diagonal elements of Σ are the socalled singular values s_{i} of X. Equation (A2) can be written in vector form as
where α_{t}=Σa_{t} is the socalled dimensional amplitude at a given time t.
Based on a SVD of X (Eq. A2), the m×m matrix B=XX^{T} can be expressed as
where $\stackrel{\mathrm{\u0303}}{\mathbf{\Lambda}}$ is a diagonal matrix whose diagonal elements are ${\stackrel{\mathrm{\u0303}}{\mathit{\lambda}}}_{i}={s}_{i}^{\mathrm{2}}$. Equation (A4) states the eigendecomposition of B. Since matrix B is proportional to the covariance matrix Q (Eq. A1), we can state the eigendecomposition of the data covariance matrix Q as
with corresponding eigenvalue problem
and
The spatial patterns e_{i} with size m×1 are eigenvectors of the data covariance matrix, and the diagonal elements λ_{i} of Λ the corresponding eigenvalues λ_{i}, which give the amount of variance associated with each e_{i}. From the set of eigenvectors e_{i} and their corresponding amount of variance λ_{i}, the spatial distribution x_{t} at a given time t can be reconstructed by determination of a suitable amplitude vector a_{t} (Eq. A3) (see textbooks of linear algebra or statistical data analysis, e.g. Dommenget, 2015).
A2 Truncation of eigenvalue modes
For practical purposes, reconstruction often uses only the first l leading eigenvectors instead of all m eigenvectors e_{i} ($\mathrm{1}\le l\le m$). That is, E, Λ, and A are split into a matrix E, Λ, and A of size m×l, l×l, and n×l, respectively, which contain the first l leading eigenvector modes, and a matrix E^{′}, Λ^{′}, and A^{′}, which contain the remaining m−l eigenvector modes. Equations (A2), (A3), and (A5) thus become
The leading eigenvector modes e_{i} cover most of the data variance and often represent more “largescale” spatial patterns, while the modes higher than l contain only a small amount of the data variance and can be seen as “smallscale” spatial patterns (Lorenz, 1956; Kaplan et al., 2000; Dommenget, 2015). Correspondingly, the first and second terms in Eq. (A10) give the largescale and smallscale variances P and P^{′}, respectively, whereas in Eq. (A8) and its vector form Eq. (A9) they give the largescale and smallscale contributions, respectively, to the spatial data vector x_{t}.
This split can also be interpreted as decomposition into a “signal” part and a “noise” part. Equations (A8)–(A10) with only the l leading modes (i.e. with only the first term) are called a truncated EOF reconstruction in which dimensionality is reduced from m to l modes. Higherorder modes (in E^{′} and Λ^{′}, i.e. the second term) are assumed to be dominated by noise and error and are discarded in a truncated eigenvector or EOF reconstruction (Lorenz, 1956; Kaplan et al., 2000).
A3 Extrapolation from observational data
To represent a set of k observations at a given time t, we use the observational data vector y_{t} and the observational error σ_{t}, both with size k×1. The transfer operator H_{t} is then a “sampling” or observation function that depends on the spatial configuration of the observation points and samples from the data vector x_{t} of size m×1 so that the result ${\widehat{\mathit{x}}}_{t}$,
has the same size k×1 as the observations. To be comparable with ${\widehat{\mathit{x}}}_{t}$, the observational data y_{t} are expressed as a deviation from the spacedependent temporal mean ${\stackrel{\mathrm{\u203e}}{\mathit{x}}}_{m}$ too (Elken et al., 2019).
The eigenvalue reconstruction in truncated form (Eq. A9) then becomes
Minimization of a suitable cost function Q, e.g. for leastsquares optimization as in Elken et al. (2019),
yields a system of l linear equations with l unknowns (Elken et al., 2019):
which provides the observationbased eigenvector amplitudes α_{t} or a_{t} (Eq. A14), respectively.
With Eq. (A9) in truncated form and Eq. (A14), the data vector x_{t} can be obtained, i.e. an extrapolation from k×1 observational data points y_{t} to the entire spatial domain of m×1 data points x_{t} with the help of the truncated eigenvalue decomposition of m×l eigenvectors and l eigenvalues e_{i} and λ_{i}, or E and Λ, respectively.
A4 Error considerations
Both “observational error” and “representational error” impact the determination of the eigenvector amplitudes a_{t}. Observational error σ_{t} includes both instrumental and sampling error and can be used to limit the impact of observational data constraints y_{k} (Kaplan et al., 2000). Representational error is the error made by truncation, i.e. by using only the l largest, leading eigenvectors and neglecting the remainder of the spectrum, which can be derived from the “smallscale” covariance P^{′} contribution (Kaplan et al., 2000).
The error covariance matrix ℛ is the sum of the observational data error covariance S and a term that accounts for the covariance created in the truncated modes E^{′} not resolved by the analysis:
where S is a diagonal matrix with the observations' variance ${\mathit{\sigma}}_{t}^{\mathrm{2}}$ on the diagonal and H_{t} is the aforementioned sampling operator (Kaplan et al., 2000).
By addition of constraints on the cost function Q (a) to limit the estimated eigenvector amplitudes based on the amount of variance they explain in the original eigenvalue decomposition and (b) that a_{t} determination is limited by the uncertainty of observations and from truncation (see above), Eq. (A13) is modified to (Kaplan et al., 2000)
and the solution Eqs. (A15)–(A17) thus become (Kaplan et al., 2000)
On the other hand, the largescale portion of the data covariance Q (Eq. A10) can be obtained from this solution according to Kaplan et al. (2000):
P can be seen as the mapping uncertainty of the extrapolation.
Note that the calculation of P only requires the sampling operator H_{t}, i.e. where samples are present, as well as an observational error estimate S, but not the actual observations or amplitudes at these locations. In the absence of any observation, H_{t} contains only zeros and Eqs. (A21) and (A23) then simplify to $\mathbf{D}={\mathbf{\Lambda}}^{\mathrm{1}}$ and P=EΛE^{T} (compare Eq. A10), respectively. Addition of observational constraints thus reduces the mapping uncertainty P.
For our purposes, we assume that offdiagonal elements in the covariance P^{′} are negligible, i.e. that the truncated eigenvector modes E^{′} are sufficiently smallscale and that their variances Λ^{′} are small enough that they do not show a noticeable covariance contribution (Eq. A18). We therefore approximate P^{′} by a diagonal matrix whose diagonal elements are obtained from rearrangement of Eq. (A10):
where Q is accessible from the data X (Eq. A1) and the second term from the truncated eigenvalue decomposition.
A5 Number of significant modes
A critical aspect before any extrapolation from observations is how many modes e_{i} to use, or – in other words – where to truncate reconstruction, i.e. how to distinguish between modes representing desired variability and modes representing “noise”. Often, an arbitrary threshold of the total variance covered by the l leading modes is chosen (e.g. Kaplan et al., 2000).
Here, we apply the DINEOF (Data Interpolating Empirical Orthogonal Functions) variant of SVD or EOF decomposition of the dataset X to find the l_{max} leading eigenvectors e_{i}. The number l_{max} of significant modes is determined by a crossvalidation procedure from the data X (Beckers and Rixen, 2003), and we use this l_{max} as the upper bound for l in our ensemble approach (see below).
A6 Ensemble approach for robustness and locality
Depending on the spatial distribution or clustering of observations y_{t}, not all amplitudes of the l_{max} spatial eigenvectors e_{i} may be well constrained. In some configurations, e.g. with few or clustered observations, where H_{t} selects only a few of the m spatial data points, a reconstruction with fewer modes than l_{max} may give a solution with a smaller mapping uncertainty P than the truncated reconstruction with all l_{max} modes e_{i}.
We therefore use an ensemble approach over the truncated reconstructions, where we vary l from just 1 mode to l_{max} modes:
where the weights w_{l} (with size m×1) are based on the diagonal elements of P, i.e. the variance ${\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}P,t}^{\mathrm{2}}$ or mapping uncertainty of the extrapolation:
As the mapped variance diag(P) varies in magnitude with the number of eigenvectors included, the weights w_{l} for a given number of modes l are normalized by the total variance over all m data points, which gives a normalized spatial weight vector w_{l} (Eq. A26). For a given spatial data point m, preference is thus given to reconstructions where it is among the betterconstrained ones for the given number of modes l.
For 𝓧_{l}, each using only the first l (≤l_{max}) eigenvector modes (Eq. A25), we insert the following.

The reconstructed data vector x_{t} (from Eq. A9 in truncated form) to yield as $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ the ensemble mean $\stackrel{\mathrm{\u203e}}{{\mathit{x}}_{t}}$

The term ${\left(\stackrel{\mathrm{\u203e}}{{\mathit{x}}_{t}}{\mathit{x}}_{t}\right)}^{\mathrm{2}}$ to yield as $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ the biased weighted sample variance about the ensemble mean ${\mathit{\sigma}}_{t,\mathrm{mean}}^{\mathrm{2}}$

The mapping variance ${\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}P,t}^{\mathrm{2}}$ (i.e. the diagonal of P; Eq. A23) to yield as $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ the remaining mean “largescale” variance $\stackrel{\mathrm{\u203e}}{{\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}P,t}^{\mathrm{2}}}$

The approximation of the truncated variance ${\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}{P}^{\prime},t}^{\mathrm{2}}$ (i.e. the diagonal of P^{′}; Eq. A24) to yield as $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ the mean “smallscale” variance $\stackrel{\mathrm{\u203e}}{{\mathit{\sigma}}_{\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}{P}^{\prime},t}^{\mathrm{2}}}$

The number of modes l itself to yield as $\stackrel{\mathrm{\u203e}}{\mathcal{X}}$ the average number of modes $\stackrel{\mathrm{\u203e}}{\mathit{l}}$ used in the reconstruction as a spatial vector of size m×1 each, i.e. on the full spatial domain
The reconstructed data vector x_{t,reconstr} and the total variance of reconstruction ${\mathit{\sigma}}_{t,\mathrm{reconstr}}^{\mathrm{2}}$ are given by
A7 Temporal coherence
So far, we have considered the observations y_{t} and the reconstructed data vector x_{t} to originate and be valid for a given time t, respectively. However, in practice, observations y will spread over a certain time extent Δt, e.g. days for a single SOOP transect to weeks for a basincovering cruise. If this time extent (e.g. duration of a cruise) is comparable to the timescale of the system's dynamics (e.g. spring surface warming or bloom onset), distortions in the reconstructed fields x_{t} may occur (Elken et al., 2019) as different “states” of the system could be sampled at different space–time locations.
To collate temporally extended observations into a common synoptic reconstruction without artefacts, Elken et al. (2019) added a linear temporal (shortterm) trend for each eigenvector mode to the optimization, thus collating observations made at different observation times t to a common reference time instance t_{∘}.
To this end,

the time difference Δt between observation and reference times is introduced:
$$\begin{array}{}\text{(A29)}& {\displaystyle}\mathbf{\Delta}\mathit{t}=\mathit{t}{t}_{\circ}.\end{array}$$ 
The sampling operator H_{t} is extended to not only sample from the eigenvector e_{i} the value χ corresponding to the observation, but also the value times the time difference, χ⋅Δt, corresponding to a (shortterm) time gradient. H_{t}E thus becomes a matrix of size k×2l.

The l eigenvalues are replicated with a constant scaling factor, so that Λ becomes a 2l×2l diagonal matrix. That is, both the value (first l eigenvalues) and its time difference (second l eigenvalues times the scaling factor) follow the same eigenvector e_{i} order and importance, with the scaling factor to determine the balance between both. Based on preliminary tests, we used an empirical scaling between the spatial value pattern and temporal trend pattern of $\mathrm{9}\times {\mathrm{10}}^{\mathrm{4}}$, equal to a ratio of approximately 1 µatm per 33 d.

The eigenvalue amplitude vector a_{t} is doubled in size to 2l×1, where the first l values continue to represent the eigenvectors' e_{i} amplitudes, while the second l values represent their temporal derivative.
Thus, the system of linear equations (Eqs. A20–A22) becomes a system of 2l equations with 2l unknowns, and the reconstruction (Eq. A9) provides both an extrapolation to the entire spatial domain of m×1 data locations at reference time t_{∘} as well as a temporal trend of change in the m×1 data locations at reference time t_{∘} (Elken et al., 2019).
With a pCO_{2} value x and linear pCO_{2} trend d given for each month of the climatology, the seasonal evolution of pCO_{2} at a given location m can be calculated as a cubic Hermite spline according to
with
Here, t_{k} and t_{k+1} are the times of the climatological month before and after the time t of interest, and γ gives the normalized time. h(γ) are Hermite basis functions and x_{k} and x_{k+1} as well as d_{k} and d_{k+1} are the associated monthly pCO_{2} value and pCO_{2} trend, respectively, which determine the seasonal pCO_{2} x_{t} at time t.
HCB and GR conceived the study. The method was developed by HCB with important input by EJ and GR. TN performed the model simulations and HCB the analysis. HCB led the manuscript writing with contributions by all the coauthors.
The contact author has declared that none of the authors has any competing interests.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
Efforts by Anna WillstrandWranne (SMHI) and Tobias Steinhoff (GEOMAR/NORCE) around SOOP Tavastland observations are highly acknowledged. The Surface Ocean CO_{2} Atlas (SOCAT) is an international effort endorsed by the International Ocean Carbon Coordination Project (IOCCP), the Surface Ocean Lower Atmosphere Study (SOLAS), and the Integrated Marine Biosphere Research (IMBeR) programme to deliver a uniformly qualitycontrolled surface ocean CO_{2} database. The many researchers and funding agencies responsible for the collection of data and quality control are thanked for their contributions to SOCAT. The authors want to thank two anonymous reviewers and the handling editor for their valuable and constructive comments which improved the manuscript.
This work was funded by the projects CSCOPE (grant no. 03F0877D) and BONUS INTEGRAL (grant no. 03F0773A), which received funding from BONUS (Art. 185), funded jointly by the EU, the German Federal Ministry of Education and Research, the Swedish Research Council Formas, the Academy of Finland, the Polish National Centre for Research and Development, and the Estonian Research Council. Computational power was provided by the North German Supercomputing Alliance (HLRN). Measurements on SOOP Finnmaid were temporarily (2009–2011) funded by the German Federal Ministry of Education and Research in the framework of BONUS projects BalticC (grant no. 03F0486A), Baltic Gas (grant no. 03F0488B), and, since 2012, ICOSD (grant nos. 01LK1101F and 01LK1224D). Additional support came from the JERICOS3 project funded by the European Commission's H2020 Framework Programme under grant agreement no. 871153.
The publication of this article was funded by the Open Access Fund of the Leibniz Association.
This paper was edited by Giuseppe M. R. Manzella and reviewed by two anonymous referees.
Bakker, D. C. E., Pfeil, B., Landa, C. S., Metzl, N., O'Brien, K. M., Olsen, A., Smith, K., Cosca, C., Harasawa, S., Jones, S. D., Nakaoka, S., Nojiri, Y., Schuster, U., Steinhoff, T., Sweeney, C., Takahashi, T., Tilbrook, B., Wada, C., Wanninkhof, R., Alin, S. R., Balestrini, C. F., Barbero, L., Bates, N. R., Bianchi, A. A., Bonou, F., Boutin, J., Bozec, Y., Burger, E. F., Cai, W.J., Castle, R. D., Chen, L., Chierici, M., Currie, K., Evans, W., Featherstone, C., Feely, R. A., Fransson, A., Goyet, C., Greenwood, N., Gregor, L., Hankin, S., HardmanMountford, N. J., Harlay, J., Hauck, J., Hoppema, M., Humphreys, M. P., Hunt, C. W., Huss, B., Ibánhez, J. S. P., Johannessen, T., Keeling, R., Kitidis, V., Körtzinger, A., Kozyr, A., Krasakopoulou, E., Kuwata, A., Landschützer, P., Lauvset, S. K., Lefèvre, N., Lo Monaco, C., Manke, A., Mathis, J. T., Merlivat, L., Millero, F. J., Monteiro, P. M. S., Munro, D. R., Murata, A., Newberger, T., Omar, A. M., Ono, T., Paterson, K., Pearce, D., Pierrot, D., Robbins, L. L., Saito, S., Salisbury, J., Schlitzer, R., Schneider, B., Schweitzer, R., Sieger, R., Skjelvan, I., Sullivan, K. F., Sutherland, S. C., Sutton, A. J., Tadokoro, K., Telszewski, M., Tuma, M., van Heuven, S. M. A. C., Vandemark, D., Ward, B., Watson, A. J., and Xu, S.: A multidecade record of highquality fCO_{2} data in version 3 of the Surface Ocean CO_{2} Atlas (SOCAT), Earth Syst. Sci. Data, 8, 383–413, https://doi.org/10.5194/essd83832016, 2016. a, b, c, d
Bakker, D. C. E., Alin, S. R., Becker, M., Bittig, H. C., CastañoPrimo, R., Feely, R. A., Gkritzalis, T., Kadono, K., Kozyr, A., Lauvset, S. K., Metzl, N., Munro, D. R., Nakaoka, S., Nojiri, Y., O'Brien, K. M., Olsen, A., Pfeil, B., Pierrot, D., Steinhoff, T., Sullivan, K. F., Sutton, A. J., Sweeney, C., Tilbrook, B., Wada, C., Wanninkhof, R., Willstrand Wranne, A., Akl, J., Apelthun, L. B., Bates, N., Beatty, C. M., Burger, E. F., Cai, W.J., Cosca, C. E., Corredor, J. E., Cronin, M., Cross, J. N., De Carlo, E. H., DeGrandpre, M. D., Emerson, S. R., Enright, M. P., Enyo, K., Evans, W., Frangoulis, C., Fransson, A., GarcíaIbáñez, M. I., Gehrung, M., Giannoudi, L., Glockzin, M., Hales, B., Howden, S. D., Hunt, C. W., Ibánhez, J. S. P., Jones, S. D., Kamb, L., Körtzinger, A., Landa, C. S., Landschützer, P., Lefèvre, N., Lo Monaco, C., Macovei, V. A., Maenner Jones, S., Meinig, C., Millero, F. J., Monacci, N. M., Mordy, C., Morell, J. M., Murata, A., Musielewicz, S., Neill, C., Newberger, T., Nomura, D., Ohman, M., Ono, T., Passmore, A., Petersen, W., Petihakis, G., Perivoliotis, L., Plueddemann, A. J., Rehder, G., Reynaud, T., Rodriguez, C., Ross, A. C., Rutgersson, A., Sabine, C. L., Salisbury, J. E., Schlitzer, R., Send, U., Skjelvan, I., Stamataki, N., Sutherland, S. C., Sweeney, C., Tadokoro, K., Tanhua, T., Telszewski, M., Trull, T., Vandemark, D., van Ooijen, E., Voynova, Y. G., Wang, H., Weller, R. A., Whitehead, C., and Wilson, D.: Surface Ocean CO2 Atlas Database Version 2022 (SOCATv2022) (NCEI Accession 0253659), NOAA National Centers for Environmental Information [data set], https://doi.org/10.25921/1h9fnb73, 2022. a, b
Becker, M., Olsen, A., Landschützer, P., Omar, A., Rehder, G., Rödenbeck, C., and Skjelvan, I.: The northern European shelf as an increasing net sink for CO2, Biogeosciences, 18, 1127–1147, https://doi.org/10.5194/bg1811272021, 2021. a, b, c, d, e, f, g, h
Beckers, J. M. and Rixen, M.: EOF Calculations and Data Filling from Incomplete Oceanographic Datasets, J. Atmos. Ocean. Tech., 20, 1839–1856, https://doi.org/10.1175/15200426(2003)020<1839:ECADFF>2.0.CO;2, 2003. a
Bittig, H. C., Jacobs, E., Neumann, T., and Rehder, G.: A regional pCO_{2} climatology of the Baltic Sea, PANGAEA [data set], https://doi.org/10.1594/PANGAEA.961119, 2023. a, b, c
Carstensen, J. and Duarte, C. M.: Drivers of pH Variability in Coastal Ecosystems, Environ. Sci. Technol., 53, 4020–4029, https://doi.org/10.1021/acs.est.8b03655, 2019. a
Davis, R. E.: Predictability of Sea Surface Temperature and Sea Level Pressure Anomalies over the North Pacific Ocean, J. Phys. Oceanogr., 6, 249–266, https://doi.org/10.1175/15200485(1976)006<0249:POSSTA>2.0.CO;2, 1976. a
DeVries, T.: The Ocean Carbon Cycle, Annu. Rev. Environ. Resour., 47, 317–341, https://doi.org/10.1146/annurevenviron120920111307, 2022. a
Dommenget, D.: An Introduction to Statistical Analysis in Climate Research, https://users.monash.edu.au/~dietmard/teaching/honours.statistics/dommenget.statistics.lecture.notes.pdf (last access: 31 January 2024), 2015. a, b, c, d
Eilola, K., Gustafsson, B. G., Kuznetsov, I., Meier, H. E. M., Neumann, T., and Savchuk, O. P.: Evaluation of biogeochemical cycles in an ensemble of three stateoftheart numerical models of the Baltic Sea, J. Marine Syst., 88, 267–284, https://doi.org/10.1016/j.jmarsys.2011.05.004, 2011. a
Elken, J., Zujev, M., She, J., and Lagemaa, P.: Reconstruction of LargeScale Sea Surface Temperature and Salinity Fields Using SubRegional EOF Patterns From Models, Front. Earth Sci., 7, 232, https://doi.org/10.3389/feart.2019.00232, 2019. a, b, c, d, e, f, g, h
Friedlingstein, P., O'Sullivan, M., Jones, M. W., Andrew, R. M., Gregor, L., Hauck, J., Le Quéré, C., Luijkx, I. T., Olsen, A., Peters, G. P., Peters, W., Pongratz, J., Schwingshackl, C., Sitch, S., Canadell, J. G., Ciais, P., Jackson, R. B., Alin, S. R., Alkama, R., Arneth, A., Arora, V. K., Bates, N. R., Becker, M., Bellouin, N., Bittig, H. C., Bopp, L., Chevallier, F., Chini, L. P., Cronin, M., Evans, W., Falk, S., Feely, R. A., Gasser, T., Gehlen, M., Gkritzalis, T., Gloege, L., Grassi, G., Gruber, N., Gürses, Ö., Harris, I., Hefner, M., Houghton, R. A., Hurtt, G. C., Iida, Y., Ilyina, T., Jain, A. K., Jersild, A., Kadono, K., Kato, E., Kennedy, D., Klein Goldewijk, K., Knauer, J., Korsbakken, J. I., Landschützer, P., Lefèvre, N., Lindsay, K., Liu, J., Liu, Z., Marland, G., Mayot, N., McGrath, M. J., Metzl, N., Monacci, N. M., Munro, D. R., Nakaoka, S.I., Niwa, Y., O'Brien, K., Ono, T., Palmer, P. I., Pan, N., Pierrot, D., Pocock, K., Poulter, B., Resplandy, L., Robertson, E., Rödenbeck, C., Rodriguez, C., Rosan, T. M., Schwinger, J., Séférian, R., Shutler, J. D., Skjelvan, I., Steinhoff, T., Sun, Q., Sutton, A. J., Sweeney, C., Takao, S., Tanhua, T., Tans, P. P., Tian, X., Tian, H., Tilbrook, B., Tsujino, H., Tubiello, F., van der Werf, G. R., Walker, A. P., Wanninkhof, R., Whitehead, C., Willstrand Wranne, A., Wright, R., Yuan, W., Yue, C., Yue, X., Zaehle, S., Zeng, J., and Zheng, B.: Global Carbon Budget 2022, Earth Syst. Sci. Data, 14, 4811–4900, https://doi.org/10.5194/essd1448112022, 2022. a, b
GESAMP: High Level Review of a Wide Range of Proposed Marine Geoengineering Techniques, Rep. Stud. GESAMP No. 98, (IMO/FAO/UNESCOIOC/UNIDO/WMO/IAEA/UN/UN Environment/UNDP/ISA Joint Group of Experts on the Scientific Aspects of Marine Environmental Protection), http://www.gesamp.org/publications/highlevelreviewofawiderangeofproposedmarinegeoengineeringtechniques (last access: 31 January 2024), 2019. a
Gülzow, W., Rehder, G., Schneider, B., Schneider v. Deimling, J., and Sadkowiak, B.: A new method for continuous measurement of methane and carbon dioxide in surface waters using offaxis integrated cavity output spectroscopy (ICOS): An example from the Baltic Sea, Limnol. Oceanogr.Methods, 9, 176–184, https://doi.org/10.4319/lom.2011.9.176, 2011. a
Hannachi, A., Jolliffe, I. T., and Stephenson, D. B.: Empirical orthogonal functions and related techniques in atmospheric science: A review, Int. J. Climatol., 27, 1119–1152, https://doi.org/10.1002/joc.1499, 2007. a
HELCOM Secretariat: HELCOM subbasins 2017 (level 2), https://metadata.helcom.fi/geonetwork/srv/eng/catalog.search#/metadata/d4b6296cfd19462c94d24c81b9313d77 (last access: 14 May 2019), 2017. a
Humborg, C., Geibel, M. C., Sun, X., McCrackin, M., Morth, C.M., Stranne, C., Jakobsson, M., Gustafsson, B., Sokolov, A., Norkko, A., and Norkko, J.: High Emissions of Carbon Dioxide and Methane From the Coastal Baltic Sea at the End of a Summer Heat Wave, Front. Mar. Sci., 6, 493, https://doi.org/10.3389/fmars.2019.00493, 2019. a
Jacobs, E., Bittig, H. C., Gräwe, U., Graves, C. A., Glockzin, M., Müller, J. D., Schneider, B., and Rehder, G.: Upwellinginduced trace gas dynamics in the Baltic Sea inferred from 8 years of autonomous measurements on a ship of opportunity, Biogeosciences, 18, 2679–2709, https://doi.org/10.5194/bg1826792021, 2021. a
Jolliffe, I. T. and Cadima, J.: Principal component analysis: a review and recent developments, Phil. Trans. R. Soc. A, 374, 20150202, https://doi.org/10.1098/rsta.2015.0202, 2016. a
Kaplan, A., Kushnir, Y., and Cane, M. A.: Reduced Space Optimal Interpolation of Historical Marine Sea Level Pressure: 1854–1992, J. Climate, 13, 2987–3002, https://doi.org/10.1175/15200442(2000)013<2987:RSOIOH>2.0.CO;2, 2000. a, b, c, d, e, f, g, h, i, j, k
Kuliński, K., Rehder, G., Asmala, E., Bartosova, A., Carstensen, J., Gustafsson, B., Hall, P. O. J., Humborg, C., Jilbert, T., Jürgens, K., Meier, H. E. M., MüllerKarulis, B., Naumann, M., Olesen, J. E., Savchuk, O., Schramm, A., Slomp, C. P., Sofiev, M., Sobek, A., Szymczycha, B., and Undeman, E.: Biogeochemical functioning of the Baltic Sea, Earth Syst. Dynam., 13, 633–685, https://doi.org/10.5194/esd136332022, 2022. a
Kuznetsov, I. and Neumann, T.: Simulation of carbon dynamics in the Baltic Sea with a 3D model, J. Marine Syst., 111–112, 167–174, https://doi.org/10.1016/j.jmarsys.2012.10.011, 2013. a
Körtzinger, A., Thomas, H., Schneider, B., Gronau, N., Mintrop, L., and Duinker, J. C.: Atsea intercomparison of two newly designed underway pCO_{2} systems – Encouraging results, Mar. Chem., 52, 133–145, https://doi.org/10.1016/03044203(95)000836, 1996. a
Lan, X., Tans, P., and Thoning, K. W.: Trends in globallyaveraged CO_{2} determined from NOAA Global Monitoring Laboratory measurements. Version 202211, https://gml.noaa.gov/ccgg/trends/ (last access: 15 November 2022), 2022. a
Landschützer, P., Gruber, N., Bakker, D. C. E., Schuster, U., Nakaoka, S., Payne, M. R., Sasse, T. P., and Zeng, J.: A neural networkbased estimate of the seasonal to interannual variability of the Atlantic Ocean carbon sink, Biogeosciences, 10, 7793–7815, https://doi.org/10.5194/bg1077932013, 2013. a
Laruelle, G. G., Cai, W.J., Hu, X., Gruber, N., Mackenzie, F. T., and Regnier, P.: Continental shelves as a variable but increasing global sink for atmospheric carbon dioxide, Nat. Commun., 9, 454, https://doi.org/10.1038/s4146701702738z, 2018. a, b, c, d
Lorenz, E. N.: Empirical orthogonal functions and statistical weather prediction, Scientific Report No. 1, Statistical Forecasting Project, Massachusetts Institute of Technology, Department of Meteorology, Cambridge, MT, USA, https://eapsweb.mit.edu/sites/default/files/Empirical_Orthogonal_Functions_1956.pdf (last access: 31 January 2024), oCLC: 2293210, 1956. a, b, c, d
Luisetti, T., Ferrini, S., Grilli, G., Jickells, T. D., Kennedy, H., Kröger, S., Lorenzoni, I., Milligan, B., van der Molen, J., Parker, R., Pryce, T., Turner, R. K., and Tyllianakis, E.: Climate action requires new accounting guidance and governance frameworks to manage carbon in shelf seas, Nat. Commun., 11, 4599, https://doi.org/10.1038/s4146702018242w, 2020. a
Monahan, A. H., Fyfe, J. C., Ambaum, M. H. P., Stephenson, D. B., and North, G. R.: Empirical Orthogonal Functions: The Medium is the Message, J. Climate, 22, 6501–6514, https://doi.org/10.1175/2009JCLI3062.1, 2009. a, b, c
Müller, J. D., Schneider, B., and Rehder, G.: Longterm alkalinity trends in the Baltic Sea and their implications for CO_{2}induced acidification, Limnol. Oceanogr., 61, 1984–2002, https://doi.org/10.1002/lno.10349, 2016. a
Neumann, T.: ERGOM 1.2 model hindcast 1948–2020, IOW [data set], https://threddsiow.iowarnemuende.de/thredds/catalogs/projects/integral/catalog_pocNP_V04R25_3nm_agg_time.html?dataset=IOWTHREDDSBaltic_pocNP_V04R25_3nm_agg_time_2020112012 (last access: 31 January 2024), 2021. a
Neumann, T., Siegel, H., and Gerth, M.: A new radiation model for Baltic Sea ecosystem modelling, J. Marine Syst., 152, 83–91, https://doi.org/10.1016/j.jmarsys.2015.08.001, 2015. a
Neumann, T., Radtke, H., Cahill, B., Schmidt, M., and Rehder, G.: NonRedfieldian carbon model for the Baltic Sea (ERGOM version 1.2) – implementation and budget estimates, Geosci. Model Dev., 15, 8473–8540, https://doi.org/10.5194/gmd1584732022, 2022. a, b, c, d
Omstedt, A., Gustafsson, E., and Wesslander, K.: Modelling the uptake and release of carbon dioxide in the Baltic Sea surface water, Cont. Shelf Res., 29, 870–885, https://doi.org/10.1016/j.csr.2009.01.006, 2009. a
Parard, G., Charantonis, A. A., and Rutgersson, A.: Using satellite data to estimate partial pressure of CO_{2} in the Baltic Sea, J. Geophys. Res.Biogeo., 121, 1002–1015, https://doi.org/10.1002/2015JG003064, 2016. a, b, c, d
Parard, G., Rutgersson, A., Raj Parampil, S., and Charantonis, A. A.: The potential of using remote sensing data to estimate air–sea CO_{2} exchange in the Baltic Sea, Earth Syst. Dynam., 8, 1093–1106, https://doi.org/10.5194/esd810932017, 2017. a
Pfeil, B., Olsen, A., Bakker, D. C. E., Hankin, S., Koyuk, H., Kozyr, A., Malczyk, J., Manke, A., Metzl, N., Sabine, C. L., Akl, J., Alin, S. R., Bates, N., Bellerby, R. G. J., Borges, A., Boutin, J., Brown, P. J., Cai, W.J., Chavez, F. P., Chen, A., Cosca, C., Fassbender, A. J., Feely, R. A., GonzálezDávila, M., Goyet, C., Hales, B., HardmanMountford, N., Heinze, C., Hood, M., Hoppema, M., Hunt, C. W., Hydes, D., Ishii, M., Johannessen, T., Jones, S. D., Key, R. M., Körtzinger, A., Landschützer, P., Lauvset, S. K., Lefèvre, N., Lenton, A., Lourantou, A., Merlivat, L., Midorikawa, T., Mintrop, L., Miyazaki, C., Murata, A., Nakadate, A., Nakano, Y., Nakaoka, S., Nojiri, Y., Omar, A. M., Padin, X. A., Park, G.H., Paterson, K., Perez, F. F., Pierrot, D., Poisson, A., Ríos, A. F., SantanaCasiano, J. M., Salisbury, J., Sarma, V. V. S. S., Schlitzer, R., Schneider, B., Schuster, U., Sieger, R., Skjelvan, I., Steinhoff, T., Suzuki, T., Takahashi, T., Tedesco, K., Telszewski, M., Thomas, H., Tilbrook, B., Tjiputra, J., Vandemark, D., Veness, T., Wanninkhof, R., Watson, A. J., Weiss, R., Wong, C. S., and YoshikawaInoue, H.: A uniform, quality controlled Surface Ocean CO2 Atlas (SOCAT), Earth Syst. Sci. Data, 5, 125–143, https://doi.org/10.5194/essd51252013, 2013. a
Pierrot, D., Neill, C., Sullivan, K., Castle, R., Wanninkhof, R., Lüger, H., Johannessen, T., Olsen, A., Feely, R. A., and Cosca, C. E.: Recommendations for autonomous underway pCO_{2} measuring systems and datareduction routines, DeepSea Res. Pt. II, 56, 512–522, https://doi.org/10.1016/j.dsr2.2008.12.005, 2009. a, b
Preisendorfer, R. W.: Principal component analysis in meteorology and oceanography, no. 17 in Developments in atmospheric science, Elsevier, Amsterdam, 1988. a, b, c
Rehder, G., VelascoSobeck, S., Kreuzer, L., Gutierrez Loza, L., Koziorowska, K., Makuch, P., Jacobs, E., Otto, S., Bittig, H., Lainela, S., Honkanen, M., and Kolbe, M.: Continuous nutrient, trace gas and CO_{2} measurements during EMB214, May–June 2019, PANGAEA [data set], https://doi.org/10.1594/PANGAEA.931313, 2021. a, b
Rödenbeck, C., Bakker, D. C. E., Gruber, N., Iida, Y., Jacobson, A. R., Jones, S., Landschützer, P., Metzl, N., Nakaoka, S., Olsen, A., Park, G.H., Peylin, P., Rodgers, K. B., Sasse, T. P., Schuster, U., Shutler, J. D., Valsala, V., Wanninkhof, R., and Zeng, J.: Databased estimates of the ocean carbon sink variability – first results of the Surface Ocean pCO2 Mapping intercomparison (SOCOM), Biogeosciences, 12, 7251–7278, https://doi.org/10.5194/bg1272512015, 2015. a
Schneider, B. and Müller, J. D.: Biogeochemical Transformations in the Baltic Sea: Observations Through Carbon Dioxide Glasses, Springer Oceanography, Springer International Publishing, https://doi.org/10.1007/9783319616995, 2018. a, b, c, d
Schneider, B., Kaitala, S., and Maunula, P.: Identification and quantification of plankton bloom events in the Baltic Sea by continuous pCO_{2} and chlorophyll a measurements on a cargo ship, J. Marine Syst., 59, 238–248, https://doi.org/10.1016/j.jmarsys.2005.11.003, 2006. a
Thomas, H., Bozec, Y., Elkalay, K., and de Baar, H. J. W.: Enhanced Open Ocean Storage of CO_{2} from Shelf Sea Pumping, Science, 304, 1005–1008, https://doi.org/10.1126/science.1095491, 2004. a
Weare, B. C. and Newell, R. E.: Empirical orthogonal analysis of atlantic ocean surface temperatures, Q. J. Roy. Meteor. Soc., 103, 467–478, https://doi.org/10.1002/qj.49710343707, 1977. a
Weare, B. C., Navato, A. R., and Newell, R. E.: Empirical Orthogonal Analysis of Pacific Sea Surface Temperatures, J. Phys. Oceanogr., 6, 671–678, https://doi.org/10.1175/15200485(1976)006<0671:EOAOPS>2.0.CO;2, 1976. a
Wesslander, K., Omstedt, A., and Schneider, B.: Interannual and seasonal variations in the air–sea CO_{2} balance in the central Baltic Sea and the Kattegat, Cont. Shelf Res., 30, 1511–1521, https://doi.org/10.1016/j.csr.2010.05.014, 2010. a, b, c, d, e
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Data availability
 Conclusions
 Appendix A: Extrapolation approach
 Appendix B: Seasonal pCO_{2} from value and linear trend climatology by cubic Hermite spline calculation
 Author contributions
 Competing interests
 Disclaimer
 Acknowledgements
 Financial support
 Review statement
 References
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Data availability
 Conclusions
 Appendix A: Extrapolation approach
 Appendix B: Seasonal pCO_{2} from value and linear trend climatology by cubic Hermite spline calculation
 Author contributions
 Competing interests
 Disclaimer
 Acknowledgements
 Financial support
 Review statement
 References