Ocean acidification has profoundly altered the ocean's carbonate chemistry since preindustrial times, with potentially serious consequences for marine life. Yet, no long-term, global observation-based data set exists that allows us to study changes in ocean acidification for all carbonate system parameters over the last few decades. Here, we fill this gap and present a methodologically consistent global data set of all relevant surface ocean parameters, i.e., dissolved inorganic carbon (DIC), total alkalinity (TA), partial pressure of CO

The oceans have taken up roughly one-quarter of the anthropogenic CO

These chemical changes, collectively described as ocean acidification, will have a profound impact on marine organisms, especially those that form shells made of CaCO

At the global scale, most of what we know about the progression of ocean acidification in the recent decades has come from either models

In line with the goal of the OceanSODA project, we aim to develop a global, observation-based data set documenting the progression of ocean acidification over the recent decades.
Such a data set will be crucial to put the current trends of ocean acidification into the context of the changes over the last few decades.
By describing the level of variability in ocean acidification around the long-term trend, it will also help us to better understand the challenges that marine organisms are facing.
Additionally, it will permit us to explore in much more detail how ocean acidification has unfolded regionally and potentially deviated from the simple model of it being dependent on the rise in atmospheric CO

The well-measurable parameters of the marine carbonate system are dissolved inorganic carbon (DIC), total alkalinity (TA), pH, and the partial pressure of carbon dioxide (

We use here the pair

Measurements of

The actual spatial and temporal coverage for any of these parameters is very low.
Even for

By far the most established efforts are those that interpolate and extrapolate the ocean

The extrapolation of TA onto a global grid is also well established

In comparison, very few efforts have attempted to interpolate and extrapolate DIC.

Here, to map TA and

The rest of the paper describes the data and methods used to calculate this data set for ocean acidification. The uncertainties of the predictions are assessed, followed by the presentation of the data with a focus on the seasonal cycle. Last, we discuss the implications of the uncertainties for the use of the derived marine carbonate system.

To reconstruct the global progression of all parameters of the surface ocean carbonate system over the last three decades (1985 through 2018), we follow the three steps depicted by the flow diagram in Fig.

Next, we describe the concept of the GRaCER method and then detail its implementation for

Schematic flow diagram showing the three steps required to reconstruct the surface ocean carbonate system. In the first step (yellow hexagons), the GRaCER (Geospatial Random Cluster Ensemble regression) method is used to develop statistical models for the observed TA (left) and

A schematic showing the steps used in the GRaCER method for a single month. Panels

The GRaCER algorithm builds conceptually on a series of cluster-regression algorithms that have been successfully used for the interpolation and extrapolation of surface ocean

For the clustering step, we use monthly climatological data of

The regression is then performed individually for each of the clusters (Fig.

The ensemble members are created by performing the cluster-regression step multiple times. Creating an ensemble is possible due to the fact that each clustering instance is slightly different (Fig.
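The cluster-regression ensemble described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the GRaCER implementation: it uses a toy 1-D k-means on a single hypothetical climatological feature (`clim`), an ordinary least-squares line in place of the SVR or gradient-boosted regressions, and synthetic data. All function and variable names are invented for the example.

```python
import random
from statistics import fmean


def nearest(value, centres):
    """Index of the cluster centre closest to `value`."""
    return min(range(len(centres)), key=lambda j: abs(value - centres[j]))


def kmeans_1d(values, k, rng, n_iter=20):
    """Toy 1-D k-means; the random start makes every run slightly different."""
    centres = rng.sample(values, k)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centres)].append(v)
        # Keep the old centre if a cluster happens to be empty.
        centres = [fmean(g) if g else c for g, c in zip(groups, centres)]
    return centres


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def fit_member(clim, xs, ys, k, rng):
    """One ensemble member: cluster on `clim`, then regress within each cluster."""
    centres = kmeans_1d(clim, k, rng)
    labels = [nearest(c, centres) for c in clim]
    models = {}
    for j in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        models[j] = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
    return centres, models, fit_line(xs, ys)  # global fit kept as a fallback


def predict(members, clim_value, x):
    """Average the per-cluster predictions over all ensemble members."""
    preds = []
    for centres, models, fallback in members:
        a, b = models.get(nearest(clim_value, centres), fallback)
        preds.append(a + b * x)
    return fmean(preds), preds


# Synthetic data: two regimes, separated by the clustering feature.
rng = random.Random(0)
clim = [rng.random() for _ in range(400)]
xs = [rng.random() for _ in range(400)]
ys = [1 + 2 * x if c < 0.5 else 4 - x for c, x in zip(clim, xs)]

# Repeat the cluster-regression step to build the ensemble (16 members here).
members = [fit_member(clim, xs, ys, k=4, rng=rng) for _ in range(16)]
mean_pred, member_preds = predict(members, clim_value=0.1, x=0.5)
```

Because each member starts its clustering from a different random state, the cluster boundaries differ between members, and the ensemble mean is an average over these slightly different partitions.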

For the estimation of TA, we employ the support vector regression (SVR) method with 12 clusters and 16 ensemble members. The clustering is performed on climatological mean TA, sea-surface salinity (SSS), sea-surface temperature (SST), and nitrate (

A similar exhaustive search was used to determine the number of clusters. The number of ensemble members was set to the value above which performance no longer improves, analogous to choosing the number of trees in a random forest. The test data are a subset of years spaced 3 years apart, starting in 1985. We ensure that the models are not overfitted by selecting hyper-parameters using
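As a small illustration of the year-based hold-out, the sketch below reads "spaced 3 years apart starting in 1985" as every third year within the 1985 through 2018 study period. That reading, and the variable names, are assumptions made for the example.

```python
FIRST_YEAR, LAST_YEAR = 1985, 2018  # study period stated in the text

# Every third year, starting in 1985, is withheld as test data.
test_years = list(range(FIRST_YEAR, LAST_YEAR + 1, 3))
train_years = [y for y in range(FIRST_YEAR, LAST_YEAR + 1) if y not in test_years]


def split_by_year(records):
    """Split (year, payload) records into training and test lists."""
    train = [r for r in records if r[0] in train_years]
    test = [r for r in records if r[0] in test_years]
    return train, test
```

Holding out whole years, rather than random samples, keeps observations from the same cruise or season from appearing on both sides of the split.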

To regress and map TA, we use SSS, SST, silicic acid (H

For the estimation of

Clustering is performed on climatological values of

Details of the regression method and of the hyper-parameter selection are given in Sect.

The regression and mapping is performed with the following variables as predictors: SST, SSS, the logarithm of chlorophyll

It is important to note that the predictors are proxies for the spatiotemporal changes in

Data are used to develop the two-step GRaCER model, i.e., clustering and regression, and to evaluate the estimates. Table

Variables used as the clustering features and predictor variables for regression. Details about these data are given in the text. Note that clustering features are all resampled to monthly climatologies.

Data sources used in this study.

For the clustering of TA, we used the mapped product of total alkalinity (TA

For the clustering step of

SSS is from the Simple Ocean Data Assimilation (SODA) analysis

Chlorophyll

For the regression step of TA, the bottle measurements from the GLODAP v2 product are used as the target variable

For the regression of

The discrete measurements of

Finally, outliers are removed from gridded

We use sea-surface temperature from OSTIA for both TA and

The machine learning estimates of TA,

For DIC and pH, we use the directly measured data from GLODAP v2.2019

Three long-term time series stations are used to provide direct independent comparisons for DIC and TA, namely the Hawaii Ocean Time-series at 22.57

Data present in the Lamont-Doherty Earth Observatory

Finally, we include Argo float measurements of pH from the Southern Ocean Carbon and Climate Observations and Modeling project (SOCCOM)

The remaining parameters of the marine carbonate system, i.e., DIC, pH, and

An important consideration in these calculations is the internal consistency of the marine carbonate system, i.e., the error due to uncertainties in the equations and coefficients that describe the marine carbonate system.

Any application of our data product requires a firm understanding of the errors and uncertainties associated with each of the reported parameters of the surface ocean carbonate system. We first discuss the errors and uncertainties associated with the statistically modeled quantities TA and

We identify three sources of errors that contribute to the total uncertainty for

The measurement error reflects the combination of potential biases (systematic errors) from sampling and measurement as well as random errors associated with sampling and the imprecise nature of the measurement system. Since both TA and

The representation error,

The uncertainty associated with the prediction error,
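The way the three error sources are combined is not reproduced in this excerpt. A common choice, assuming the measurement, representation, and prediction errors are independent, is to add them in quadrature. The sketch below shows that rule only as an assumption; the function name and the numbers are hypothetical and not taken from the study.

```python
import math


def total_uncertainty(measurement, representation, prediction):
    """Root-sum-of-squares of three error terms assumed to be independent."""
    return math.sqrt(measurement ** 2 + representation ** 2 + prediction ** 2)


# Hypothetical values chosen only to make the arithmetic easy to follow.
example_total = total_uncertainty(3.0, 4.0, 12.0)
```

With this rule the largest term dominates the total, so reducing a small contribution has little effect on the overall uncertainty.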

We summarize these uncertainties with mean biases (

Summary of the uncertainties of total alkalinity and

We adopt an uncertainty

Owing to the sparseness of the TA observations, we cannot estimate the uncertainty

The uncertainty

While the global bias of the TA product of OceanSODA-ETHZ is near zero (0.5

Test metrics for total alkalinity

A good check on the model prediction error is provided by comparing the estimated TA against independent observations.
To this end, we use data from the Hawaii Ocean Time-series (HOT), the Bermuda Atlantic Time-series Study (BATS), and the Irminger station shown in Fig.

A comparison of a subset of measurements from long-term observation stations (gray) with predicted total alkalinity (TA)

Comparison of training and independent data sources with various methods for the open-ocean region using the COSCATs coastal mask by

For the uncertainty

We estimate the uncertainty

From the RMSE of our test data, we estimate an uncertainty

The comparison between the regression estimated and observed

The time series comparisons show that the seasonal cycle is well represented at BATS and HOT with

Root mean squared error (RMSE –

A comparison of observations from long-term observation stations (gray) with predicted dissolved inorganic carbon (DIC). The top row

We determine the uncertainties of the calculated parameters in two ways. First, we propagate the uncertainties of

In the global mean, the computed DIC in OceanSODA-ETHZ has a very low bias compared with in situ GLODAP measurements (0.5

It is interesting to point out that the computed DIC in OceanSODA-ETHZ compares very favorably to directly estimated DIC products, such as that provided by NNGv2. Our uncertainty associated with the prediction error for DIC of 16.3

The comparison of the DIC time series data (BATS, HOT, and Irminger stations) supports the findings of the global top-down estimates (Fig.

The pH comparison with the GLODAP pH measurements shows that OceanSODA-ETHZ has a negligible bias (0.001).
As with DIC, regional biases in pH are larger than the global average, with the coastal and high-latitude oceans contributing significantly to the regional biases. The RMSE of pH with respect to GLODAP is also low (0.024) but is slightly outperformed by the RMSE of pH calculated with LIARv2 TA and FFNNv2

For the bottom-up estimate, we propagate the total uncertainty of

A comparison of propagated uncertainties with independent uncertainties as an assertion of the validity of uncertainty estimates. The map

The comparison of top-down vs. bottom-up uncertainty estimates for open and coastal oceans is shown in Fig.

A spatial comparison between OceanSODA-ETHZ and existing products can reveal biases in our product if the same bias appears in all comparisons.
We compare TA against LIARv2 and NNGv2

The differences in TA between OceanSODA-ETHZ and NNGv2 and LIARv2 are on the same order of magnitude in the open ocean as the prediction error (13

A comparison of the mean differences between TA

The differences between OceanSODA-ETHZ

A basin–mean comparison of OceanSODA-ETHZ

We also show the temporal evolution of the basin–mean differences between OceanSODA-ETHZ

The comparison with other methods illustrates that while gap-filling methods are converging on a global scale, there are regional differences. Further, large differences in

The climatological mean spatial distribution of TA,

Mean maps of the GRaCER-based estimates for the period 1985–2018 for

Total alkalinity shows the largest differences between basins, with the mean alkalinity being much higher in the saltier Atlantic than in the Pacific and Indian basins

Dissolved inorganic carbon is more homogeneous across the basins but has a much larger meridional gradient than TA, amounting to more than 150

The spatial distribution and seasonal cycles of

Hovmöller plots

The spatial distribution of

The OceanSODA-ETHZ data set can provide important novel constraints on the long-term trends in ocean acidification. We determine the long-term trends by a linear regression approach, restricting the period to 1990 through 2018, thus leaving out the 1980s, when the estimates are much more uncertain.
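The trend calculation can be sketched as an ordinary least-squares fit restricted to 1990 through 2018, reported per decade together with its standard error. This is a minimal sketch, assuming annual mean values and uncorrelated residuals; the function name and the synthetic series are illustrative only.

```python
import math


def decadal_trend(years, values, start=1990, end=2018):
    """OLS slope and its standard error, both expressed per decade."""
    pairs = [(y, v) for y, v in zip(years, values) if start <= y <= end]
    n = len(pairs)
    mean_x = sum(y for y, _ in pairs) / n
    mean_y = sum(v for _, v in pairs) / n
    sxx = sum((y - mean_x) ** 2 for y, _ in pairs)
    slope = sum((y - mean_x) * (v - mean_y) for y, v in pairs) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((v - (intercept + slope * y)) ** 2 for y, v in pairs)
    std_err = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return 10 * slope, 10 * std_err


# Synthetic series: 2 units per year plus a small alternating wiggle.
years = list(range(1985, 2019))
values = [2.0 * (y - 1985) + (0.1 if y % 2 == 0 else -0.1) for y in years]
trend, trend_se = decadal_trend(years, values)
```

If the residuals are autocorrelated, this standard error will be too small, so it should be read as a lower bound in that case.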

The global- and basin-scale trends for

The basin-scale consistency holds true for pH as well (

Linear trends and their standard errors for OceanSODA-ETHZ variables for the period 1990 to 2018. All columns show increases per decade (decade

Here we consider two notable decisions that have a large impact on the final estimates: (1) the use of the ensemble approach and (2) the choice of regression algorithm.
For details on the minor choices, see Sect.

As previously motivated, we opt for the cluster-regression approach, which can generalize estimates to sparsely sampled regions because information is shared within a cluster. However, cluster boundaries are often semi-discrete, which produces artificial discontinuities in the estimates. This makes the output of single cluster-regression approaches less suitable for studies that assess gradients over short time periods or distances, e.g., the detection of extreme events. Our ensemble approach removes these boundaries and improves the robustness of the estimates by eliminating, to a large extent, the sensitivity of the regression to the clustering algorithm.
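The effect of ensemble averaging on cluster boundaries can be illustrated with a toy 1-D example. In this sketch, which is an assumption-laden simplification, each member replaces a smooth field by its mean within fixed-width clusters, and deterministic boundary offsets stand in for the random differences between clustering runs.

```python
from statistics import fmean

N_CELLS, WIDTH = 100, 20
field = [float(i) for i in range(N_CELLS)]  # smooth "true" field


def member_estimate(offset):
    """Piecewise-constant estimate: mean of the field inside each cluster."""
    edges = sorted({0, N_CELLS, *range(offset, N_CELLS, WIDTH)})
    estimate = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        estimate += [fmean(field[lo:hi])] * (hi - lo)
    return estimate


def max_jump(estimate):
    """Largest difference between neighbouring cells."""
    return max(abs(b - a) for a, b in zip(estimate[:-1], estimate[1:]))


# Each member uses the same cluster width but shifted boundaries.
offsets = [0, 4, 8, 12, 16]
members = [member_estimate(o) for o in offsets]
ensemble = [fmean(cells) for cells in zip(*members)]

single_jump = max_jump(members[0])   # step at every cluster boundary
ensemble_jump = max_jump(ensemble)   # steps are spread out and much smaller
```

Because the boundaries fall in different places in each member, every step is diluted by the number of members, which is why the ensemble mean is better suited to gradient-based analyses than any single member.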

The second major consideration is the choice of regression algorithm.
Our choice of different algorithms for TA and

One of the novel contributions of this study is that we are able to assert the validity of our results by comparing the bottom-up (propagated) with the top-down uncertainties (in situ comparisons).
Using this approach, we show that the uncertainty estimates of DIC are remarkably well constrained, with the top-down estimate being within 5 % of the bottom-up uncertainty estimate for the open ocean (Fig.

To assess this problem from the bottom-up perspective, we need to consider the uncertainties of

The source of the mismatch must thus be driven primarily by uncertainties in the top-down perspective, where it may be that the representation error of pH is larger than for DIC.
We rule out the measurement error as a contributor to the mismatch, as the bias of the measurements (provided accurate calibration to reference samples) should be normally distributed around zero.
Thus, the representation error is the most likely candidate, due to the temperature- and pressure-sensitive nature of pH compared to the conservative nature of DIC with respect to the same variables

The last two decades have seen major improvements in the accuracy and precision of the TA and

In contrast, the prediction uncertainty is the largest contributor to the total uncertainty for both DIC and TA, suggesting that this could be a fruitful avenue to pursue.
However, current literature suggests that this is unlikely.

This leaves the representation error, which contributes a moderate fraction to the total

Why are these gains smaller than hoped?
It may be that our gradient approach for calculating the representation error breaks down as the resolution increases due to the decreasing number of adjacent grid points.
This is hardly surprising considering that 78 % of grid cells in the SOCAT v2019 monthly gridded product are represented by sampling on a single day that falls within that period

Attribution of DIC, TA, and temperature to the seasonal cycle of

Here we demonstrate one of the possible ways in which the OceanSODA-ETHZ data can be used to gain further insight into the marine carbonate system.

We decompose and attribute the mean seasonal cycle variability of

The seasonal amplitude of

In order to use the OceanSODA-ETHZ product in an optimal manner, it is important to be aware of its strengths and weaknesses.

The primary use of the OceanSODA-ETHZ data set is to determine and assess the seasonality, interannual variations, and trends of ocean acidification, which is possible because it contains all relevant parameters of the marine carbonate system

The product is also very well suited for assessing models. Thanks to the spatially resolved estimates of uncertainty for TA and

A strength of the OceanSODA-ETHZ product is that it extends further into the coastal margin than most previous studies

The total uncertainties of our estimates in the coastal ocean are considerably larger than those in the open ocean (Fig.

Software for the GRaCER framework is available on Zenodo

Our approach for estimating TA and

We find that our estimates of TA are comparable to those of previous methods, with a prediction error (root mean square error) of 13

We demonstrate a use case of the OceanSODA-ETHZ data set in which we decompose the seasonal variability of

Finally, OceanSODA-ETHZ will be maintained and updated for future work.

The first outlier removal method requires the

Secondly, we exclude data that lie outside the expected ranges for the monthly climatology of

In this study, one of the avenues that we explored was to predict

The results appeared promising, but on further investigation we found that the regressions that were trying to predict

Hyper-parameters for the support vector regression (SVR) were chosen on a per-cluster basis using grid search cross validation, where unshuffled

We used the LightGBM package to perform the gradient boosted regression with decision trees (GBDTs).
The GBDT algorithm was trained using early stopping, which determines the number of trees used in the model – typically one of the most important hyper-parameters.
Every fifth year from 1987 to 2019 was set aside as the validation data used in the early stopping.
The total number of leaves per tree and the minimum number of training points per terminal leaf were both set to
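The validation years and the logic of early stopping can be sketched without reference to any particular library. The code below is a generic illustration, not LightGBM's implementation: `early_stopping_round`, the `patience` parameter, and the loss values are hypothetical.

```python
# Every fifth year from 1987 to 2019 is set aside for validation.
validation_years = list(range(1987, 2020, 5))


def early_stopping_round(val_losses, patience):
    """Return the 1-based boosting round with the best validation loss,
    stopping once `patience` rounds pass without improvement."""
    best_round, best_loss = 0, float("inf")
    for i, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_round, best_loss = i, loss
        elif i - best_round >= patience:
            break  # no improvement for `patience` rounds
    return best_round


# Hypothetical validation losses after each added tree.
losses = [1.0, 0.8, 0.7, 0.65, 0.66, 0.67, 0.68, 0.5]
n_trees = early_stopping_round(losses, patience=3)
```

In this example training stops at round 7 and keeps 4 trees; the lower loss at round 8 is never reached, which is the usual trade-off of early stopping.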

Given that the problem of solving

Time series of TA (orange) and

One of the advantages of the GRaCER approach is that any metric can be mapped from the results to the appropriate clusters, resulting in an ensemble of metric scores. Possible metrics include bias, root mean squared error, and mean absolute error. Further, these metrics can be computed from test data, i.e., data that the model has not seen during training, so the resulting scores give a realistic representation of the uncertainty. Because the clusters used in this study are climatological, we obtain fully mapped climatological estimates of uncertainty.
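The mapping of test metrics to clusters can be sketched as follows. This is a minimal illustration with invented names and toy numbers: it assumes that every cluster present on the grid contains at least one test observation and that the residual is defined as prediction minus observation.

```python
import math
from collections import defaultdict


def cluster_scores(labels, y_true, y_pred):
    """Bias, RMSE, and MAE of test residuals, grouped by cluster label."""
    residuals = defaultdict(list)
    for lab, truth, pred in zip(labels, y_true, y_pred):
        residuals[lab].append(pred - truth)
    return {
        lab: {
            "bias": sum(r) / len(r),
            "rmse": math.sqrt(sum(e * e for e in r) / len(r)),
            "mae": sum(abs(e) for e in r) / len(r),
        }
        for lab, r in residuals.items()
    }


def map_scores(grid_labels, scores, metric):
    """Average one metric over ensemble members for every grid cell."""
    mapped = []
    for cell in range(len(grid_labels[0])):
        vals = [s[labels[cell]][metric] for labels, s in zip(grid_labels, scores)]
        mapped.append(sum(vals) / len(vals))
    return mapped


# Toy example: 4 grid cells, 2 ensemble members with different clusterings.
y_true = [10.0, 10.0, 20.0, 20.0]
y_pred = [11.0, 13.0, 20.0, 22.0]
grid_labels = [[0, 0, 1, 1], [0, 1, 1, 1]]  # cluster of each cell, per member
obs_labels = grid_labels                    # one test observation per cell

scores = [cluster_scores(lab, y_true, y_pred) for lab in obs_labels]
mapped_bias = map_scores(grid_labels, scores, "bias")
mapped_rmse = map_scores(grid_labels, scores, "rmse")
```

Each grid cell inherits the score of the cluster it belongs to in every member, and averaging over members yields a spatially varying estimate of the test error.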

The uncertainty of TA remains fairly constant between summer and winter, with the Amazon plume showing increased uncertainty in the Northern Hemisphere summer (Fig.

The seasonal difference is larger for

The Huber test scores mapped to the ensemble clusters for total alkalinity (TA) and

Similarly, the spatial distribution of feature importances can be determined with the GRaCER approach when using gradient boosted decision trees as the regression method (Fig.

Feature importances determined by gradient boosted decision trees for

Ocean basin boundaries used in Table

The GRaCER method introduces the idea of using an ensemble of clusters, thus removing the variability that may be introduced in the clustering step.
The location of the clusters varies from ensemble member to ensemble member.
This creates a high-variance–low-bias scenario that is used by other ensemble methods such as random forests

Map of the position of cluster boundaries across all ensemble members and months for

LG and NG conceived the study and developed the method. LG performed the analysis and testing of the method and wrote the paper with substantial input from NG.

The authors declare that they have no conflict of interest.

We are deeply indebted to the scientists who sampled, analyzed, and contributed to the global databases for ocean carbon data, namely the Surface Ocean CO

This research has been supported by the European Space Agency (OceanSODA project, grant no. 4000112091/14/I-LG), the European Commission (COMFORT project, grant no. 820989), and Horizon 2020 (4C project, grant no. 821003).

This paper was edited by Giuseppe M. R. Manzella and reviewed by two anonymous referees.