A volumetric census of the Barents Sea in a changing climate

. The Barents Sea, located between the Norwegian Sea and the Arctic Ocean, is one of the main pathways of the Atlantic Meridional Overturning Circulation. Changes in the water mass transformations in the Barents Sea potentially affect the thermohaline circulation through the alteration of the dense water formation process. In order to investigate such changes, we present here a seasonal atlas of the Barents Sea including both temperature and salinity for the period 1965–2016. The atlas is 5 built as a compilation of datasets from the World Ocean Database, the Polar Branch of Russian Federal Research Institute of Fisheries and Oceanography, and the Norwegian Polar Institute using the Data-Interpolating Variational Analysis (DIVA) tool. DIVA allows for a minimization of the expected error with respect to the true ﬁeld. The atlas is used to provide a volumetric analysis of water mass characteristics and an estimation of the ocean heat and freshwater contents. The results show a recent "Atlantiﬁcation" of the Barents Sea, that is a general increase of both temperature and salinity, while its density remains stable. 10 The atlas is made freely accessible as user-friendly NetCDF ﬁles to encourage further research in the Barents Sea physics (https://doi.org/10.21335/NMDC-2058021735, Watelet et al. (2020)).

. Bathymetry of the Barents Sea and its neighbouring seas. Our analyses on the Barents Sea correspond to the shaded region.
The Barents Sea Opening, located between the Norwegian coast and Bear Island, and the Kola sections are shown as blue and red circles respectively. BI stands for Bear Island, Sv for Svalbard, FJL for Franz Jozef Land and NZ for Novaja Zemlja. processes in the Barents Sea affect the thermohaline circulation of the North Atlantic and Arctic oceans (Swift et al., 1983;Kuhlbrodt et al., 2009;Mauritzen et al., 2013;Lozier et al., 2019).
The Barents Sea is the largest shelf sea of the Arctic Ocean, and it is bounded by Norway and the Kola Peninsula (Russia) to the south, the Svalbard and Franz Josef Land archipelagos to the north, and Novaya Zemlya to the east (see Fig. 1). The Barents Sea is connected to the Norwegian Sea to the west through the Barents Sea Opening (BSO), and to the Arctic Ocean 25 to the north and northeast. Together with the Fram Strait between Svalbard and Greenland, the BSO is the main gateway between the North Atlantic and the Arctic and, thus, a main pathway for Atlantic Water transport northwards from the Nordic Seas to the Arctic Ocean (Knipowitsch, 1905;Helland-Hansen and Nansen, 1909). Due to its climatic importance and vast marine resources, the Barents Sea area is sampled and monitored on a seasonal timescale (Eriksen et al., 2018). However, the coverage varies between seasons and years, especially during winter and spring, and the spatial coverage is sometimes only 30 semi-synoptic or concentrated at fixed sections.
Satellite remote sensing provides observations of sea surface temperature, and recently sea surface salinity, with high resolution in both space and time. For example, using AVHRR data, Comiso and Hall (2014) found the northern Barents Sea to be one of the areas within the Arctic that shows the highest temperature increase for the period 1981-2012. Furthermore, they found a significant decline in sea-ice cover between the two periods: 1979-1995 and 1996-2012. However, to investigate regional 35 climate processes, such as water mass transformation and property changes, in situ observations are needed. In situ data often have disadvantages of a limited coverage in space (e.g. repeated hydrographic sections) and/or time (e.g. ship surveys). Thus, providing these observations on a regular grid is desirable in order to examine spatio-temporal changes.
Here, we present a gridded dataset of temperature and salinity in the Barents Sea region at seasonal temporal resolution for the period 1965-2016, based on all available in situ observations. The dataset is compiled using the Data-Interpolating 40 Variational Analysis (DIVA) tool. We provide the dataset including fields of expected error, and present two examples of usage where this gridded dataset has an advantage over the non-gridded raw data: volumetric analysis of water mass characteristics, and estimation of ocean heat and freshwater content.

Data sources
In situ hydrographic data were obtained from three different sources, the World Ocean Database 2013 (WOD13), the Norwegian DIVA is a statistical software designed to generate continuous fields from heterogeneously distributed in situ data using a Variational Inverse Method (Brasseur, 1995;Troupin et al., 2012). The result of its variational analysis are gridded fields which minimise the expected errors with respect to the unknown true fields. Under a few assumptions on the correlations, the 70 Variational Inverse Method (VIM) is equivalent to the popular Optimal Interpolation (Rixen et al., 2000). In practice, the aim of the VIM is to minimize the following cost function J: where the N d observations d j are used to reconstruct the analysed field ϕ and with where α 0 penalizes the field itself (anomalies with respect to a reference field, e.g., a climatological average), α 1 penal-80 izes gradients (no trends), α 2 penalizes variability (regularization), and µ j penalizes data-analysis misfits (objective analysis) (Troupin et al., 2016).
Unless specified otherwise, we always use the command line version of DIVA in this study. This version comes with the full set of options, for instance regarding the optimization of the statistical parameters later used in the analyses.
The Barents Sea bathymetry to be used in the atlas processing was extracted from the General Bathymetric Charts of the Oceans (GEBCO) at a spatial resolution of 30 seconds by using Diva-on-web (http://ec.oceanbrowser.net/emodnet/diva.html).

90
This bathymetry was then smoothed to a resolution of 1/8°by using a 2D convolution low-pass filter followed by a linear interpolation to avoid too complex shapes when computing the coastlines for each depth level. Besides, several fjords were removed from the bathymetry. All the interpolated data falling outside these smoothed coastlines or outside the full domain (6.9-66.1°E ; 69-83°N) shown in Fig. 1 were removed. A data range check was also performed and excluded temperature data falling outside -1.9-20°C and salinity data outside 30-36. The remaining data availability per season is shown in Fig. 2 for 95 temperature and in Fig. 3 for salinity.
For each of the 23 depth levels, the objective is to perform one analysis for each season and for each year between 1965-2016. Based on data availability from regular cruise activity, we chose the seasons as follows: November to January (winter), February to April (spring), May to July (summer) and August to October (autumn). The first season is thus November 1964 to January 1965, the last being August to October 2016. The analysis is carried out in two steps. A reference field, or a first guess 100 state, needs to be created before each analysis is carried out. The reference fields are created by collecting all data for each season across 11 years centred around the year to be analysed. A moving window centred at the year of interest is used due  to the strong multidecadal variability of the region (e.g. Smedsrud et al., 2013). Near the beginning and end of the period the window size is reduced to the available years (i.e., the reference field for 1965 is based on data from the period 1965-1970).
The horizontal average is used as a constant first guess when creating the reference fields. Therefore, 4 reference fields are 105 generated per year, that is one per season. By subtracting the reference field from the original data, DIVA directly works with anomalies of temperature and salinity before adding back the reference to the optimal analysis. In this way, the analysis tends to smoothly reach the reference values in the absence of data.
In the reference fields, the correlation length is estimated by a fit between the empirical data correlation function as a function of the distance and its theoretical counterpart, while the signal to noise ratio is approximated by cross validation techniques 110 (Craven and Wahba, 1978). Both the correlation length and the signal-to-noise ratio are thus estimated on the basis of the data sets. Moreover, they are both filtered vertically to avoid unrealistic discontinuities between depth levels. To avoid an overconfidence in the data accuracy, the signal-to-noise ratio is capped at 10 for salinity and 3 for temperature, because of its higher temporal variability. Using these statistical parameters, the reference fields are computed by the Variational Inverse Method with DIVA over the same 11 years, for each season.

115
Then, each analysis is performed using the corresponding 11 year-reference field and the associated statistical parameters.
We decided to use the statistical parameters based on the larger amount of data (11 years) in order to increase their robustness and decrease their variability. For temperature, a logit transformation was applied on data beforehand, so as to ensure the results are constrained between -1.9 and 20°C after applying a reciprocal function to the analyses. This extra precaution for temperature is justified by the sea ice formation around -1.9°C. The analyses are stored on an output grid with a resolution of 120 0.1°in latitude and 0.25°in longitude. Other atlas products, such as the WOA, are also provided on regular lat-lon grids, as well as most operational ocean models. Hence, it makes some of the usages more straightforward.
In order to assess the reliability of the analyses, an error field associated with each of them is computed by using the clever poor man's method, a good compromise between the computation time and the accuracy (see Beckers et al. (2014)). The poor man's error is computed by analysing a "data" vector with unit values and is very cost-effective (Troupin et al., 2010), but the 125 error field is too optimistic. It is shown that using the same method with a correlation length divided by a factor ∼ 1.7 requires a similar computation time and yields a more realistic estimate of the error, that is, the clever poor man's error. This analysis error is then compared to the first guess error, and the ratio of those errors yields the relative error field which thus consists in a value between 0 and 1. Qualitatively, this figure measures the added value brought by in situ data to the analysis: 0 would be the true field while 1 corresponds to an absence of data, that is an analysis equal to the first guess.

130
4 Temperature and salinity atlas The temperature and salinity atlas is available at the Norwegian Marine Data Centre as two NetCDF files. Each file contains analyses of temperature or salinity, respectively, for all seasons and years at all depths, and also includes the error field associated with each analysis. The statistical parameters (correlation length and signal to noise ratio) and the analysed fields restricted to the most reliable areas are also available. These latter analyses are masked if the relative error exceeds 0.3 or 0.5. As shown 135 in Fig. 2 and 3, there are several seasons with data gaps. In such cases, the atlas only contains a missing value, for both the analysis and the error field. The data gaps for salinity are mainly found before 1970 and after 2010, while the temperature has only exceptional data gaps. Between 1970-2010, there are data gaps in the salinity atlas during the 1971-1972 winter period and in both temperature and salinity atlas during the 1996-1997 winter period. Besides, other gaps appear sometimes in the deepest layers. In Section 5, we explain how to make use of the error field to take into account the data coverage before 140 applying any analysis. The data is accessible at https://doi.org/10.21335/NMDC-2058021735 (Watelet et al., 2020).
The hydrographic atlas presented here complements global gridded data products, such as the World Ocean Atlas Zweng et al., 2018), by providing a regional approach tailored to the specific region by offering a higher spatiotemporal resolution allowed by the higher regional data coverage. The presented gridded dataset provides researchers with readily available observation-based data, including error estimates, for several key purposes, such as numerical ocean model characteristics (e.g. Skagseth et al., 2020) for regional climate studies.

Uncertainties and use of error field
In the following sections we demonstrate how the error field provided in the atlas can be utilized to objectively limit the data in time or space before applying the desired analysis. Moreover, we give some examples of possible usages of the atlas product.  First, we consider uncertainties by investigating the error field from the atlas. As the data coverage in the Barents Sea varies between years, seasons and sub-regions, the error field varies accordingly (Fig. 4). The geographical patterns of the error fields are similar at other depths (not shown). Generally, the errors are larger in the northern and eastern parts of the Barents Sea compared with the western and southern parts, due to differences in data coverage (see Section 5.2; Fig. 4

; Supplementary
Material). Moreover, the data coverage is generally better in the autumn season and, hence, the error is generally smaller 165 compared with the other seasons. For this reason, we decided to focus on the autumn only when considering the whole Barents Sea. For studies needing the whole Barents Sea climatology in other seasons (e.g. winter), other data sources could prove necessary.
Volumetric T-S diagrams for both 1994-1998 and 2006-2010 were compiled by summing all the pixels falling inside the T-S classes defined by temperature ranging from -1 to 7°C and salinity varying between 33 and 35.5, using steps of 0.05°C 170 and 0.025, respectively. In this calculation, each pixel is weighted by its vertical extent for each corresponding layer to get a proportional representation of to the water volume within each T-S class. Moreover, the horizontal extent of each pixel is weighted by the latitude ϕ relative to the average latitude ϕ 0 of all the grid cells, due to the narrowing of the longitudinal bands towards the north, using the function The average T-S properties in both periods is shown in Fig. 5a, while the difference between the two periods is shown in Further utilizing the error field, we provide an estimation of the uncertainties for both the two 5-year periods included in the above analysis. Comparing the error fields in both periods (Fig. 5c, d) with the changes in the T-S properties between the 185 two periods (Fig. 5b), as well as the T-S diagrams of both periods (Fig. 5a), it is clear that the error is small for the T-S classes that have the largest presence and also are showing the largest changes. This strengthens the reliability of the findings of T-S changes in the Barents Sea in autumn.

Most reliable area
In this Section, we focus on the spatial pattern of the error field. We first limit our study area to the area where the average 190 relative error for temperature is less than 0.5 (Fig. 6)     associated with the "Great Salinity Anomaly of the 1980s" Dickson et al. (1988) is seen as a distinct maximum of salinities below 34.8. Finally, the potential density relative to the surface is shown in Fig. 12 where classes range between 1027.2 and 1028.8 kg m −3 with a step of 0.2 kg m −3 . The potential density does not display large changes on the long term, similarly to 220 the conclusions made above by using profiles. However, water masses with densities above 1028.0 kg m −3 , associated with dense water production, has rarely exceeded 20 percent of the total water mass within the MRA after year 2000.

Ocean Heat Content
The Ocean Heat Content (OHC) change at the MRA is calculated following the method described in Boyer et al. (2007): where t and s are temperature and salinity averages at each location between 1970-2010, ρ is the density of seawater averaged over 1970-2010 for each grid point, c p is the specific heat of seawater taken here as 3985 J kg −1 K −1 (Hill, 1962) and ∆t is the temperature anomaly with respect to the averaged temperature on the reference period 1970-2010, that is 2.73°C. with a R 2 of 0.36, which is significant at a confidence level of 95%. We followed the Fisher-Snedecor test of significance de-230 scribed in Chouquet (2009) and Montgomery et al. (2012) augmented by a penalization of autocorrelation (Wilks, 1995).
The temperature from the BSO extracted from ICES (https://ocean.ices.dk/iroc/#) is also shown. The correlation between the   temperature at the BSO and the OHC is 0.89 (winter 1976-autumn 2015) and also significant at a confidence level of 95%, indicating that the temperature observed at the BSO is a reliable proxy for the OHC downstream in the southern part of the Barents Sea.

Equivalent freshwater content
To investigate changes in salinity in the MRA, we use the Boyer et al. (2007) method to compute the Ocean FreshWater (OFW) anomaly.
OF W = − ρ(t, s, p) ρ(t, 0, p) ∆s s + ∆s dxdydz (2) where ∆s is the salinity anomaly with respect to averaged salinity on the reference period 1970-2010, that is 34.88, ρ is the 240 density of seawater at each grid point.
In Fig. 13b, changes in the OFW in the MRA are shown between 1970-2010. The slope is −1.722 × 10 7 m 3 d −1 with a R 2 of 0.11, which means the negative trend is not significant at a confidence level of 95%, although very close to the significance threshold. We followed the same method as for the OHC to examine the significance. The salinity at the BSO extracted from ICES (https://ocean.ices.dk/iroc/#) is also shown. The correlation with the OFW between winter 1976-1977 and winter 2010-