Barium in seawater: Dissolved distribution, relationship to silicon, and barite saturation state determined using machine learning
Abstract. Barium is widely used as a proxy for dissolved nutrients and particulate organic carbon fluxes in seawater. However, these proxy applications are limited by insufficient knowledge of the dissolved distribution of Ba ([Ba]). For example, there is significant spatial variability in the Ba–Si relationship, and ocean chemistry may influence sedimentary Ba preservation. To help address these issues, we developed 4,095 models for predicting [Ba] using Gaussian Process Regression machine learning. These models were trained to predict [Ba] from standard oceanographic observations using GEOTRACES data from the Arctic, Atlantic, Pacific, and Southern Oceans. Trained models were then validated by comparing predictions against withheld [Ba] data from the Indian Ocean. We find that a model using depth, T, S, [O2], [PO4], and [NO3] as predictors can accurately predict [Ba] in the Indian Ocean with a mean absolute percentage deviation of 6.3 %. We use this model to simulate [Ba] on a global basis using these same six predictors in the World Ocean Atlas. The resulting [Ba] distribution constrains the total Ba budget of the ocean to 122±8 × 10^12 mol and clarifies the global relationship between dissolved Ba and Si. We also calculate the saturation state of seawater with respect to barite, revealing that the ocean below 1,000 m is, on average, at or near saturation. We describe a number of possible applications for our model output, ranging from use in biogeochemical models to paleoproxy calibration. Our approach could be extended to other trace elements with relatively minor adjustments and demonstrates the utility of machine learning to accurately simulate the distributions of tracers in the sea.
Öykü Mete et al.
Status: open (until 27 Apr 2023)
- RC1: 'Comment on essd-2023-67', Anonymous Referee #1, 14 Mar 2023
- RC2: 'Comment on essd-2023-67', Christophe Monnin, 21 Mar 2023
- RC3: 'Comment on essd-2023-67', Frank Pavia, 28 Mar 2023
- RC4: 'Comment on essd-2023-67', Anonymous Referee #4, 28 Mar 2023
Distribution of dissolved barium in seawater determined using machine learning https://www.bco-dmo.org/dataset/885506
The manuscript “Barium in seawater: Dissolved distribution, relationship to silicon, and barite saturation state determined using machine learning” by Mete et al. develops a Gaussian Process Regression machine learning (ML) approach to predict dissolved Ba ([Ba]) in the ocean. This study is significant for understanding the marine Ba cycle because it provides a global picture of the vertical and spatial distribution of [Ba] and Ωbarite and suggests factors that are closely linked to [Ba]. It is exciting that the ML-derived Ba profiles are in excellent agreement with in situ data. The manuscript is well-reasoned and well-written. I enjoyed reading it and am happy to recommend it for publication if the following concerns can be addressed. I hope the authors find my comments constructive and that they help make the manuscript more impactful.
Sect. 2 and 3.1: The ML approach splits the observations into two partitions: data from the Arctic, Atlantic, Pacific, and Southern Oceans were used for model training, whereas data from the Indian Ocean were reserved for model testing. As the authors indicate, this location-based training-testing separation is intended to minimize overfitting. However, the training data happen to span the full range of [Ba], from minimum to maximum (according to Figs. 4A-7A), which may explain why [Ba] in the Indian Ocean is so well predicted. I would like to know whether the ML model also performs well when the testing data fall outside the range of the training data. The authors should therefore include results from a randomly assigned training-testing separation in the appendix for comparison.
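To make the range-coverage concern concrete, the check I have in mind can be sketched as follows. This is a minimal illustration, not the authors' workflow: the function names and the sample [Ba] values are hypothetical and invented purely to show the mechanics of a location-based split, a random split, and a test for whether the held-out values lie inside the training range.

```python
import random


def split_by_region(samples, held_out):
    """Location-based split: hold out every sample from one basin."""
    train = [s for s in samples if s["basin"] != held_out]
    test = [s for s in samples if s["basin"] == held_out]
    return train, test


def split_randomly(samples, test_fraction, seed=0):
    """Random split: assign samples to train/test regardless of location."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]


def fraction_outside_training_range(train, test, key="ba"):
    """Share of test values outside the [min, max] of the training values."""
    lo = min(s[key] for s in train)
    hi = max(s[key] for s in train)
    outside = [s for s in test if s[key] < lo or s[key] > hi]
    return len(outside) / len(test)


# Hypothetical [Ba] values (nmol/kg), invented for illustration only.
samples = [
    {"basin": "Atlantic", "ba": 40.0},
    {"basin": "Atlantic", "ba": 75.0},
    {"basin": "Pacific", "ba": 35.0},
    {"basin": "Pacific", "ba": 150.0},
    {"basin": "Indian", "ba": 45.0},
    {"basin": "Indian", "ba": 110.0},
]

# Holding out the Indian Ocean: every test value is bracketed by training data.
train, test = split_by_region(samples, "Indian")
indian_outside = fraction_outside_training_range(train, test)

# Holding out the Pacific instead forces the model to extrapolate.
train_p, test_p = split_by_region(samples, "Pacific")
pacific_outside = fraction_outside_training_range(train_p, test_p)

print(indian_outside, pacific_outside)
```

Reporting this fraction alongside the prediction error for each choice of held-out basin, and for a random split, would show whether the good Indian Ocean performance depends on the test data being bracketed by the training data.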
For the paper to benefit the community, additional discussion of its implications for existing interpretations that rely on Ba* would be of great interest. Unlike [Ba] or Ωbarite, the scientific significance of Ba* is not clear in the current version of the manuscript. Specifically, what do positive and negative values of Ba* mean? Does the global heterogeneity in Ba* shown in Figs. 4-7 reveal oceanographic and biogeochemical processes that affect the dissolved Ba-Si relationship? I believe the relationship to silicon is one of the main targets of this study and therefore requires in-depth discussion.
Sect. 5.1: To identify the optimal predictor model, the authors eliminated the features that offer the least improvement to ML model performance. Why were only MLD and chlorophyll a eliminated, and not salinity, which offers an equally small improvement (i.e., -3%)? Given its high p-value, including salinity is unlikely to change the MAD much. The authors need to justify this choice further.
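The consistency issue can be illustrated with a small tie check on the elimination criterion. This is only a sketch under my own assumptions: the function name, the tolerance, and the values for [O2] and depth are hypothetical, and I take each number to be the percentage change in MAD attributed to a feature, so that a smaller magnitude means a smaller improvement.

```python
def weakest_features(change_in_mad_pct, tolerance=0.5):
    """Return every feature whose improvement is within `tolerance`
    percentage points of the smallest improvement, sorted by name."""
    smallest = min(abs(v) for v in change_in_mad_pct.values())
    return sorted(
        name
        for name, v in change_in_mad_pct.items()
        if abs(v) - smallest <= tolerance
    )


# The -3 % entries follow the comment above; the other two are invented.
change_in_mad_pct = {
    "MLD": -3.0,
    "chlorophyll a": -3.0,
    "S": -3.0,
    "[O2]": -12.0,   # hypothetical value
    "depth": -20.0,  # hypothetical value
}

tied = weakest_features(change_in_mad_pct)
print(tied)
```

If a criterion of this kind flags salinity together with MLD and chlorophyll a, then retaining salinity needs an explicit justification beyond the stated rule.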
L81: The solubility product Ksp is constant only at a given temperature and pH, so Ksp values differ between depths. The text needs to clarify how [SO4] and Ksp are assigned.
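For reference, the relationship I am assuming here is the simplest concentration-based form of the saturation state. The manuscript may instead use activities or a stoichiometric Ksp, which is exactly why the assignment of each term should be stated:

```latex
\Omega_{\mathrm{barite}}
  = \frac{[\mathrm{Ba}^{2+}]\,[\mathrm{SO}_4^{2-}]}{K_{\mathrm{sp}}}
```

Here Ksp is evaluated at the in situ conditions of each sample, with Ω > 1 indicating supersaturation and Ω < 1 undersaturation. Any depth dependence in Ksp or [SO4] therefore propagates directly into the reported Ωbarite.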
L266-267: Are [Si]in situ and [Ba]in situ taken from the WOA, and [Ba]predicted from the ML model output? Please clarify.
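My question about the data sources assumes the conventional definition of Ba*, in which [Ba]predicted comes from a linear Ba-Si regression applied to [Si]in situ. The sketch below shows that convention only. The slope and intercept are placeholder values, not the manuscript's regression coefficients, and the function name is my own.

```python
def ba_star(ba_in_situ, si_in_situ, slope, intercept):
    """Ba* = [Ba]in situ - [Ba]predicted, where [Ba]predicted is the value
    expected from a linear Ba-Si relationship at the observed [Si]."""
    ba_predicted = slope * si_in_situ + intercept
    return ba_in_situ - ba_predicted


# Placeholder regression coefficients, chosen only to give round numbers.
SLOPE = 0.5       # hypothetical, nmol Ba per umol Si
INTERCEPT = 40.0  # hypothetical, nmol/kg

# Positive Ba*: more Ba than the Ba-Si relationship predicts.
excess = ba_star(ba_in_situ=110.0, si_in_situ=100.0, slope=SLOPE, intercept=INTERCEPT)

# Negative Ba*: less Ba than the Ba-Si relationship predicts.
deficit = ba_star(ba_in_situ=45.0, si_in_situ=20.0, slope=SLOPE, intercept=INTERCEPT)

print(excess, deficit)
```

Under this convention the sign of Ba* depends on which dataset supplies each input, so stating the source of each term explicitly would remove the ambiguity.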
Fig. 3B and L502-513: The authors attribute the deviation between observed and ML-modeled [Ba] from SS259 in the deep Bay of Bengal to uncertainty in the in situ [Ba] measurements. Could this deviation instead result from the factors eliminated from Model #3336? This possibility should at least be discussed.