Next generation of Bluelink ocean reanalysis with multiscale data assimilation: BRAN2020

BRAN2020 (2020 version of the Bluelink ReANalysis) is an ocean reanalysis that combines observations with an eddy-resolving, near-global ocean general circulation model to produce a four-dimensional estimate of the ocean state. The data assimilation system employed is ensemble optimal interpolation, implemented with a new multiscale approach that constrains the broad-scale ocean properties and the mesoscale circulation in two steps. There is a separation in the scales that are corrected in the two steps: the high-resolution step corrects the mesoscale dynamics in the same way as previous versions of BRAN, while the extra coarse step is effective at correcting biases that develop at large scales. The reanalysis currently spans January 1993 to December 2019 and assimilates observations of in situ temperature and salinity, as well as of satellite sea-level anomaly and sea surface temperature. BRAN2020 is planned to be updated to within months of real time after this initial release, until an updated version of BRAN is available. Reanalysed fields from BRAN2020 generally show much closer agreement to observations than all previous versions with misfits between reanalysed and observed fields reduced by over 30 % for some variables, for subsurface temperature and salinity in particular. The BRAN2020 dataset is comprised of daily averaged fields of temperature, salinity, velocity, mixed-layer depth and sea level. Reanalysed fields realistically represent all of the major current systems within 75 S and 75 N, excluding processes relating to sea ice but including boundary currents, equatorial circulation, Southern Ocean variability and mesoscale eddies. BRAN2020 is publicly available at https://doi.org/10.25914/6009627c7af03 (Chamberlain et al., 2021b) and is intended for use by the research community.


Introduction
An ocean reanalysis combines ocean observations with a model to produce gridded estimates of the ocean state. The dataset presented here is the 2020 version of the Bluelink ReANalysis (BRAN2020; 2020 refers to the year in which the configuration was finalised). Previous versions of BRAN include BRAN1 (Oke et al., 2005), BRAN1p5 , BRAN2 (Schiller et al., 2008b), BRAN3 (Oke et al., 2013b), BRAN2015 (Oke et al., 2018, when the naming convention changed) and BRAN2016 (the same configuration as BRAN2015 but initialised in the 1990s). Overall, the accu-racy of BRAN2020 is much better than all previously presented versions.
BRAN has been developed as part of the Bluelink project, an Australian partnership between CSIRO, the Bureau of Meteorology and the Department of Defence. The development of BRAN has supported operational oceanographic forecasting and focused on the dynamics of the upper ocean around Australia and surrounding regions. The long, highresolution ocean reanalyses in the various BRAN products have found various operational and research applications (see Schiller et al., 2020, for an overview). For instance, previously BRAN output has been used to support the study of Published by Copernicus Publications.
hard-to-observe features such as boundary currents in the Asian-Australia region (Schiller et al., 2008a) and zonal currents across the Indian Ocean (Divakaran and Brassington, 2011) and anomalous temperature events in the Coral Sea (Schiller et al., 2009) and along New South Wales (Oke and Griffin, 2011). The reanalyses also provide boundary conditions for regional models (e.g. the Great Barrier Reef; Steven et al., 2019).
There are two main differences between BRAN2020 and all previous versions of reanalyses in this series. The first difference is the adoption of a multiscale data assimilation technique (Chamberlain et al., 2021a) that constrains the broad-scale features of the ocean and the mesoscale features separately in two steps. Chamberlain et al. (2021a) showed that this delivers an improvement in accuracy -measured by the misfit to assimilated and withheld observations -of up to 35 %. The same improvement is realised here in BRAN2020. Multiscale data assimilation has also been applied to regional-and basin-scale ocean models (Li et al., 2015;Carrier et al., 2019;Tissier et al., 2019), which see similar improvements in overall performance. The second difference is the assimilation of a larger observational dataset that includes observations from marine mammals, moorings and ship-borne surveys that were not assimilated into previous versions of BRAN. Previous versions only assimilated observations from the conventional observation platforms (Argo; eXpendable BathyThermographs, XBTs; shipborne surveys; satellite sea-level anomalies, SLAs; and satellite sea-surface temperature, SST). The new sources of data offer greater spatial coverage, with fewer gaps. This paper is intended to provide a description of how BRAN2020 data are generated and the nature of modelobservation differences, with sufficient detail that a user can assess whether the data are suitable for their intended application. The structure of the paper is as follows. Section 2 describes the reanalysis cycle, ocean model, data assimilation system and observations assimilated. The discussion of the data assimilation system includes a summary of the multiscale data assimilation approach, with technical details of the assimilation reported in Appendix A. Section 3 describes the results, including an evaluation of the quality of reanalysed fields and inter-comparisons with previous BRAN experiments. Section 4 describes the output available and access and structure of files, and Sect. 5 discusses potential uses of the BRAN2020 dataset and concludes.

Ocean reanalysis methods
In this section we describe the procedures followed to generate BRAN2020. We provide descriptions of the analysis cycle, how the model and observations are combined, the ocean model, the data assimilation system, and the observations assimilated.

Analysis cycle
The initial condition for the ocean state at the beginning of BRAN2020 is interpolated from climatology, using the 2013 version of the World Ocean Atlas (WOA13; Zweng et al., 2013;Locarnini et al., 2013). The first 3 months of the integration (October-December 1992; comprising 30 analysis cycles) is not included in the publicly available dataset. There are large adjustments in the first several analysis cycles as the reanalysed fields approach the observed ocean state from initial conditions, as seen in reductions in averaged modelobservation differences with each analysis cycle. After 30 cycles these differences have stabilised.
The analysis cycle involves alternate running of the ocean model and data assimilation system. For BRAN2020, observations are assimilated every 3 d, as used in previous versions (Oke et al., 2018). The instantaneous fields for salinity, sea level and horizontal velocity and daily averaged fields of temperature (for the day immediately preceding the analysis time) are used as the background field for the data assimilation step. Daily averaged temperature is used as the background field instead of instantaneous fields to avoid systematic, regional biases due to the diurnal cycle of surface temperatures. Temperature fields are also converted from potential temperature (used by the model) to in situ temperature to match the observations to be assimilated. This conversion has no impact at the surface but changes the temperature by about 0.2 • C at 2000 m depth.
The differences between the background field and observations are calculated by interpolating the model fields to the observation locations, yielding a vector of background innovations. Background innovations represent the differences between the model and observations before assimilation. The data assimilation system is used to calculate the analysis field (fully described in Sect. 2.3). The differences between the analysis field and observations are also calculated, yielding a vector of analysis innovations, also referred to as "residuals". Analysis innovations represent the differences between the model and observations after assimilation. The differences between the analysis field and the background fields are here referred to as the increments. Increments include values for all variables of the model state -temperature, salinity, horizontal velocities and sea level -for all model grid points. The increments are added to the instantaneous fields at the end of the previous ocean model step, for all variables, to reinitialise the model for the next analysis cycle. For salinity, horizontal velocities and sea level, this means that the model is initialised with the analysis fields. But because temperature is treated slightly differently, as described above, the initialised temperature does not precisely match the analysis fields.
This analysis cycle is repeated every 3 d for the period of January 1993 to December 2019. Background and analysis innovations, for each cycle, are used to quantify the accuracy of the reanalysis in Sect. 3. Daily averaged fields of temperature, salinity, velocity, sea level and mixed-layer depth are stored for each day of integration. The mixed-layer depth here is the depth over which the buoyancy exceeds a threshold of 0.0003 m/s 2 , as described by Griffies (2012). These daily averaged fields comprise the BRAN2020 product that is presented here.

Ocean model
The model configuration used here is the Ocean Forecasting Australian Model, version 3 (OFAM3). A comprehensive description of OFAM3 is presented by Oke et al. (2013a). Briefly, OFAM3 is a near-global, eddy-resolving, z * configuration of the Modular Ocean Model (version 5; Griffies, 2012), developed principally for the purpose of reanalyses and forecasting upper-ocean conditions across the globe, excluding the polar regions. The model grid has a 0.1 • grid resolution between 75 • S and 75 • N, with a 5 m vertical resolution down to 40 m depth and a 10 m vertical resolution to 200 m depth. OFAM3 is forced with atmospheric conditions from JRA-55 (Kobayashi et al., 2015;Tsujino et al., 2020) using bulk formulas to calculate surface fluxes of heat, freshwater and momentum. The model is ocean-only and does not include sea ice. Forcing fields are masked for sea ice extent from the atmospheric reanalysis to avoid unrealistic surface fluxes that would result from bulk formulas with open water exposed to cold winter conditions at high latitudes. Where sea ice is present, values in the atmospheric reanalysis fields are replaced with values expected below sea ice; for instance, winds and downward shortwave radiation are set to zero while downward thermal radiation is set to the thermal radiation at freezing temperature.
Surface salinity is restored to a monthly climatology, using WOA13 , with a restoring timescale of 14 d. Temperature and salinity below 2000 m are also relaxed towards WOA13 climatology Locarnini et al., 2013), with a restoring timescale of 1 year. The configuration of OFAM3 used here employs the k-ε scheme (Rodi, 1987) to calculate unresolved turbulent mixing in the upper ocean.

Ensemble optimal interpolation
The assimilation system used to constrain BRAN2020 is EnKF-C (Sakov, 2014), implemented in the ensemble optimal interpolation (EnOI) mode. Technical details, specific to the EnKF-C software including all parameters and settings, are presented in Appendix A. For EnOI, the state of the ocean model, w (comprised of temperature, salinity, horizontal velocity and sea level), is updated by projecting background innovations onto an ensemble of model anomalies, A, and calculating weightings, c, for each member. The weights vary horizontally and are recalculated for each analysis cycle. The increment, used to update the ocean state, is a weighted sum of ensemble members: where superscripts a, b and inc denote analysis, background and increment fields, respectively. Subscript i denotes the ith ensemble member; n is the ensemble size; and x, y and z are dimensions in space. The weights, in Eq.
(2), depend on many factors, including the assumed relative size of the observation error variance and background field error variance. These weights also depend on the correlations between the background innovations and the anomalies in the ensemble members. If, for example, there is no combination of ensemble members that can "fit" the background innovations, then the ensemble can be considered rank-deficient -lacking sufficient degrees of freedom to fit the data. In that case, the analyses will not match the observations well. The details of the assumed observation errors and the details of the ensembles can have significant impact on the performance of the data assimilation and the overall reanalysis.

Multiscale data assimilation
A new feature in BRAN2020 is the adoption of multiscale data assimilation which is particularly effective at reducing the bias in the model in the subsurface. The method is described briefly here; for details please see Chamberlain et al. (2021a).
In BRAN2020, two ensembles are constructed based on anomalies derived from two previous experiments using freerunning models with no data assimilation. A strength of this method is that it is multivariate. This means that observations of one variable can be used to calculate increments for other variables. For example, the assimilation of SLA produces an update to velocity, even when velocity observations are not assimilated. To achieve this, the ensemble fields are used to characterise the statistical relationships -the covariancebetween each model variable. In practice, the covariability of different variables at different locations is represented by the ensemble members. By constructing the ensembles in this way, we are making an assumption that the background field errors can be represented by the sample of anomalies in these ensembles. Moreover, here with multiscale data assimilation, we are assuming that there are background field errors on both broad scales and mesoscales.
For BRAN2020, the data assimilation is implemented in two steps -the coarse-and high-resolution assimilationsusing two different ensembles at the different resolutions.  Delworth et al., 2006). The coarse ensemble is constructed with monthly climatological anomalies, with one member constructed for each month of a free-running 40-year model simulation. This ensemble is intended to represent the potential broadscale background field errors. The high-resolution ensemble is generated by an OFAM3 spinup (Oke et al., 2013a). The high-resolution ensemble contains seasonal-scale anomalies, constructed by centred differences of 3 d from 3-month averages, with one member constructed for each month of a 12year model simulation, without data assimilation. This ensemble is intended to represent the potential mesoscale background field errors.
The first step, the coarse-resolution assimilation, is intended to correct the broad scales of the ocean state. To calculate increments for the coarse assimilation step, the background field on the high-resolution grid, w b high-res , is first spatially averaged onto a 1 • grid, yielding w b coarse . The spatially averaged background field is compared to observations, yielding the background innovations for the coarseresolution assimilation step. EnOI is used to project these innovations onto the 480-member coarse-resolution ensemble of anomalies to produce a coarse increment w inc coarse . The increments are interpolated back to the 0.1 • grid, yielding w inc inter-coarse , and added to w b high-res . This gives an updated background field on the high-resolution grid w updated-b high-res : ready for the second step of the assimilation. We find that this first step largely constrains the broad scales of the ocean state, without impacting the mesoscales. The second step, the high-resolution assimilation, is intended to correct the mesoscale structures of the ocean state and uses the same configuration as recent versions of BRAN (Oke et al., 2018). The updated background field on the highresolution grid, w updated-b high-res , is compared to observations (the same observations assimilated in the first step), yielding the background innovations for the high-resolution assimilation step. EnOI is used to project these innovations onto a 144member high-resolution ensemble of anomalies to produce high-resolution increments w inc high-res . These increments are added to the respective variables in the model's restart files, and the model is integrated forwards in time. We find that this step largely constrains the mesoscales of the ocean, mainly adjusting the locations and properties of mesoscale eddies.
As demonstrated in Chamberlain et al. (2021a), even though the same observations are used, there is a clear separation of scales in the corrections from each assimilation step. The length scale of the corrections are determined when the assimilation system projects the observation-model differences onto ensemble members. Here, the ensemble members of the coarse step are climatological anomalies with scales much longer than those of the high-resolution ensemble with seasonal-scale anomalies.

Ocean observations
BRAN2020 assimilates observations from satellite SST, satellite SLAs, and in situ temperature and salinity. BRAN2020 uses updated sources of observational data relative to previous versions of BRAN. Figure 1 is a Gantt chart summarising the temporal extent of various observation products that are assimilated into BRAN2020.
For most years, the European Space Agency SST Climate Change Initiative (CCI-SST; Merchant et al., 2019) is the source of Advanced Very High Resolution Radiometer (AVHRR; Embury et al., 2019a) and Along Track Scanning Radiometer (ATSR; Embury et al., 2019b) SST observations. Data from CCI-SST include updated calibrations and quality controls, spanning 1981 to 2016; "best-quality" (level-5) day-and night-time temperatures at depth (0.2 m) data are assimilated into BRAN. Note these SST data are compared to a daily averaged background; the difference between instantaneous and daily averaged surface temperature is small in BRAN since the 5 m resolution is greater than the typical diurnal variation (∼ 1 m). From year 2017 onwards, AVHRR data (again, day-and night-time observations) are sourced from NAVO (Naval Oceanographic Office, 2014a, b, c, d). Night-time microwave SST data from AMSR-E (Gentemann et al., 2010) and AMSR2 (retrieved from the Japan Aerospace Exploration Agency, ftp://ftp.eorc. jaxa.jp/AMSR2/GDS2.0/, last access: 30 November 2021; for further details see https://suzaku.eorc.jaxa.jp/GHRSST/, last access: 30 November 2021) are assimilated from 2009, where available. Microwave SST data are not as accurate as infrared observations, but microwave SST data are not as affected by clouds (in the absence of rain), thus providing a good complement to infrared-based observations. Nighttime SST data from the Visible Infrared Imaging Radiometer Suite (VIIRS; Petrenko et al., 2014; NOAA Office of Satellite and Product Operations, 2019) aboard Suomi National Polar-orbiting Partnership (NPP) and NOAA-20 satellites are included from 2012 and 2018, respectively. The VIIRS SST data have somewhat superior spatial coverage to AVHRR SST data, due to the higher spatial resolution (0.75 to 1.5 km for VIIRS compared with 9 to 30 km).
Observations of SLAs are assimilated into BRAN2020, using along-track satellite altimeter data from various satellite platforms that have been available since the 1990s (Fig. 1). SLA data are sourced from the Radar Altimeter Database System (RADS Ver. 4, retrieved from http://rads. tudelft.nl, last access: 30 November 2021; Scharroo et al., 2013) and include corrections for astronomical tides and inverted barometer effects. SLAs in the model are referenced to a mean sea level, calculated from an 18-year mean of an OFAM3 experiment without data assimilation (the same experiment that was used to construct the high-resolution ensemble). Consequently, the mean barotropic currents of BRAN2020 will be very similar to those of this free-running experiment. SLA observations are not assimilated in water less than 200 m.
In situ observations of temperature and salinity are assimilated into BRAN2020, using measurements from Argo floats, surface drifting buoys, XBT (temperature only), sea mammals and moorings, as well as ship-borne surveys and underway systems. These data are sourced from the Coriolis Ocean Dataset for Reanalysis (CORA, versions 5.0 and 5.1; Cabanes et al., 2013;Tanguy et al., 2019) and from a nearreal-time (NRT) database maintained at the Australian Bureau of Meteorology. CORA databases include data that have been processed with delayed-mode quality control. CORA observation types with higher uncertainties, such as XBT and sea mammals, are assigned larger errors in types TEM2 and SAL2 in Table 1. The NRT database, used from 2018 onwards in BRAN2020 ( Fig. 1), includes data sourced from the Global Telecommunications System and the Argo Global Data Assembly Centres (Argo, 2021). These data are subject to some NRT quality control and are duplicate-checked to exclude multiple versions of the same observations (Brassington et al., 2012).
For every assimilated observation, an explicit observation error estimate is required. Specifically, the standard deviations of observation errors need to be estimated. Observations are assumed to be unbiased. Observation errors include measurement error and representation error (e.g. Oke and Sakov, 2008). Because observation errors (especially representation errors) are relatively poorly known, the estimates used for BRAN2020 were tuned somewhat to improve the performance based on ocean reanalyses simulating recent years. Table 1 summarises observation classes and the assumed standard deviations of each observation type. As is common in data assimilation problems, it is possible to force the model ocean state to closely match observation data, but this can lead to overfitting and may cause unrealistic artefacts in the model that can degrade the quality of the reanalysis. As a result, assumed observation errors tend to be larger than some might expect. The one exception to this here is the assumed estimate of the AVHRR SST error (Table 1). For BRAN2020, we assume that the standard deviation of the AVHRR SST error is 0.1 • C. In retrospect, based on the recent results presented by Merchant et al. (2019, their Fig. 6) and the assessment in Sect. 3, we think that the BRAN2020 performance may have been better for upper-ocean temperature if we had assumed a larger error for AVHRR SSTperhaps with a value between 0.2 and 0.4 • C. We plan to assess this for future versions of BRAN in the future. This is discussed further in Sect. 3.1.
The observation errors actually used by EnKF-C in the data assimilation calculations are influenced by a process to build "super-observations". This is a preprocessing step applied at the start of each data assimilation cycle to combine observations of the same type and at approximately the Table 1. Summary of observations and errors used when assimilated into BRAN. TEM (temperature) and SAL (salinity) include Argo, CTD, moorings, etc.; TEM2 and SAL2 includes XBT and marine mammals.

Observation
Assumed standard deviation of type and platform observation errors 0.15 psu same locations and times (in most cases, within the same model grid point). The error variance of a super-observation is calculated from the inverse of the sum of inverse variances of the individual observations used to construct the superobservation (Sakov, 2014). As a result, the error variance of each super-observation is often smaller than the assumed observation error variances of the individual observations. The practice of building super-observations is common, reducing the computational cost of the assimilation step and, as a conservative measure, eliminating some anomalous observations that may represent fluctuations not represented by the model (e.g. sub-grid-scale features).

Evaluation and assessment of ocean reanalysis
To demonstrate the quality of the BRAN2020 dataset, we present two classes of metrics. First, we present the modelobservation differences computed during the assimilation steps -specifically looking at the background and analysis innovations, or model-observation misfits. Second, we compare daily averaged reanalysed fields (the fields released as part of the BRAN2020 dataset) with daily observations, observation-based indices, and observation-based properties.

Analysis of innovations
Here, we report statistics of the background and analysis innovations, before and after assimilation. This provides insight into how closely BRAN2020 analyses fit the assimilated observations (using the analysis innovations) and how faithfully the model integrates analyses forwards in time (using the background innovations). Comparison of the analysis and background innovations can help identify potential cases of overfitting. Ideally, we hope for small analysis innovations and small background innovations. If analysis inno- Figure 1. Gantt chart showing when data from different types and sources are assimilated into BRAN2020. Altimeter data are accessed from RADSv4, accessing data from 13 different satellite altimeters (black bars). SST data are accessed from five different sources (red bars), including data from individual satellites (VIIRS, AMSR-E, AMSR2) and from datasets that have combined AVHRR and ATSR data from multiple satellites (CCI AVHRR, CCI ATSR and NAVO AVHRR). In situ data are accessed from three sources -two from the Copernicus Marine Environment Monitoring Service (CMEMS; Coriolis Ocean Dataset for Reanalysis, CORA5.0 and CORA5.1) and a near-real-time dataset maintained by the Bureau of Meteorology.
vations are small but background innovations are large, we might suspect overfitting. The mean absolute deviations of the analysis and background innovations are presented in Tables 2 and 3, respectively. BRAN2020 statistics are presented for decadally and globally averaged absolute innovations for SST, SLA, and subsurface temperature and salinity in different depth ranges. The decade averages are presented for the 1990s, 2000s and 2010s to show how the quality of the reanalysed fields changes with time as the observing system changes (Fig. 1). For comparison, the same statistics are included for BRAN2016. Background innovations of both BRAN2020 and BRAN2016 are after 3 d of integration from the previous analysis. Note the values reported for each version are with respect to the observation products assimilated in that version; most data are common to both versions, though as noted, BRAN2020 contains extra data not included in BRAN2016 such as sea mammals and moorings. However, the BRAN2020 results here are consistent with those reported in Chamberlain et al. (2021a), who tested a multiscale reanalysis configuration using the same observations as used in BRAN2016. For almost all metrics reported in Tables 2 and 3, BRAN2020 fields are in better agreement with observations than BRAN2016. The only exception to this is temperature in the top 50 m in the 2000s.
The greatest improvement in BRAN2020, compared to BRAN2016, is in subsurface temperature and salinity fields (Tables 2 and 3). For depths greater than 500 m, the analysis and background innovations for both temperature and salinity are of the scale of 30 % smaller in BRAN2020 than they were in BRAN2016. In the decade of the 2000s, between 50 and 500 m depth, the analysis and background innovations are 20 %-40 % smaller in BRAN2020 for both temperature and salinity. For depths shallower than 50 m in the 2000s, the analysis and background innovations for temperature are 11 %-14 % greater in BRAN2020 than they were in BRAN2016; and for salinity, they are 3 %-17 % smaller in BRAN2020.
BRAN2020 also demonstrates modest improvements relative to BRAN2016 in the innovations of the surface metrics. In the 2000s, the analysis and background innovations of SST are reduced by 8 %-23 %; for SLAs, they are reduced by 3 %-6 %. These small changes are consistent with the results of Chamberlain et al. (2021a), who found improvements in the subsurface and minimal change in surface metrics associated with the implementation of multiscale data assimilation. Based on this we conclude the improvements obtained with BRAN2020 are associated with the updated observation products. In addition to the improvements in mean absolute differences shown here, BRAN2020 also reduces biases in the ocean model, or the mean values of the innovations. The subsurface fields were known to have relatively large biases in earlier versions of BRAN (e.g. Oke et al., 2013b). These seem to be mostly eliminated with the use of the multiscale data assimilation. For instance, the averages of the background observation-minus-model innovations for all subsurface temperature and salinity in the 2010s of BRAN2016 are −0.185 • C and +0.00771 psu, respectively; in BRAN2020 these are reduced to −0.0456 • C and +0.00196 psu. These global values can mask regional structures with significant biases that average out. However, even these regional biases are mostly eliminated with multiscale data assimilation (see Fig. 6 of Chamberlain et al., 2021a). As discussed by Chamberlain et al. (2021a), the two assimilation steps correct the ocean state at separate scales. The fine scale corrects mesoscale features (as done in BRAN2016), and the extra coarse step corrects large-scale biases.
There is a slight degradation in temperature, shallower than 50 m, that we think is an indication that the BRAN2020 analyses are overfitting SST. As noted in Sect. 2.4, we think that a better result may have been achieved if we had assumed a larger observation error for AVHRR SST for BRAN2020 (Table 1). In short tests of ∼ 20 cycles, the BRAN system evaluated the impact of increasing the observation error in AVHRR SST from 0.1 to 0.3 • C. The results of mean absolute analysis innovations for subsurface temperatures, 0-50 m, decreased by ∼ 3 % with the larger SST error, and there were smaller reductions of ∼ 0.5 % in forecast innovations, explaining part of the differences between BRAN2020 and BRAN2016 in Tables 2 and 3. Note again there are more observations assimilated and assessed in the BRAN2020 relative to BRAN2016. We note that VIIRS SST is reported as higher in spatial resolution than AVHRR SST and generally agrees more closely with drifting-buoy SST (Minnett et al., 2020;O'Carroll et al., 2019, their Fig. 10), although this is not reflected in our error estimates in Table 1 -and so we surmise that post-2012 (when VIIRS data are assimilated) the impact of this slight overfitting to AVHRR is much less. Note also that ATSR has less error relative to AVHRR, as assessed with drifting buoys   Fig. 6), so the assumed ATSR error of 0.1 • C (Table 1) is reasonable.
To better understand whether BRAN2020 analyses overfit any of the assimilated observations, we present differences between the mean absolute values of background and analysis innovations for BRAN2020 and BRAN2016 in Table 4. Comparison between BRAN2016 and BRAN2020 (in the 2000s, columns 3 and 6 of Table 4) suggests that the error growth between analysis cycles during the 2000s is faster for BRAN2020. This may be because either BRAN2020 overfits some observations or BRAN2016 underfits some observations -or perhaps both. We cannot be sure of which explanation is true. But as noted, we think that BRAN2020 overfits AVHRR SST; and we are quite sure that BRAN2016 underfits subsurface temperature and salinity below 50 m and particularly below 500 m. These results demonstrate one of the main challenges of performing these big reanalysis experiments. There are many factors to "tune" in these systems, and there are competing goals of "fitting" data as closely as possible but not overfitting. It looks like for BRAN2020, the configuration is achieving a good balance -fitting the data much more closely than previous versions across most of the ocean state but perhaps overfitting some variables in some places.
Time series of the mean absolute values of the background and analysis innovations are presented for SLA, SST, temperature (all depths) and salinity (all depths) in Fig. 2. These are the same metrics used to prepare Tables 2, 3 and 4. SLA innovations (Fig. 2a) fluctuate modestly in time, with a few step changes when the composition of satellite altimeters changed (Fig. 1). For example, the innovations increase when data from the TOPEX/Poseidon satellites become unavailable at the end of the 1990s, and they reduce again when data from the Jason and Envisat satellites are available after 2002.
Over the course of BRAN2020, there was a gradual decrease in SST innovations (Fig. 2b), as the number of satellites increased and sensors improved (O'Carroll et al., 2019). As noted in Fig. 1, the main source of SST observations for BRAN2020 is CCI AVHRR. But BRAN2020 also assimilated microwave SST from 2009, and VIIRS SST from 2012. There is no clear step change when microwave SST is used, but there is a clear reduction in both analysis and background innovations when VIIRS SST data are assimilated from 2012. VIIRS SST data have been observed to generally have smaller standard deviations when compared with driftingbuoy SSTs than either AVHRR or AMSR2 SST products (O'Carroll et al., 2019;Helen Beggs, personal communication, 2021).
The constraint of subsurface temperature and salinity improves consistently over the course of BRAN2020 (Fig. 2c,  d). This is particularly true over the decade of the 2000s directly associated with the expansion of the Argo float array that provides systematic observations of subsurface ocean properties with near-global coverage (Roemmich et al., 2019). Prior to the Argo array, in situ data were scattered in space and sporadic in time. Not only do more in situ observations improve analyses -with smaller analysis innovations -but they also result in better-quality "forecasts", with smaller background innovations. Another prominent feature of Fig. 2c and d is a seasonal cycle in the globally averaged innovations from the beginning of BRAN2020 until about 2008. The seasonal cycle in innovations largely disappears once observations from the global Argo array nominally reached its target density in about 2008 (Wong et al., 2020).
The analysis of innovations computed during the assimilation steps of BRAN2020, presented above, is intended to give an indication of the quality of the BRAN2020 product. A few caveats regarding these comparisons are warranted. A long, centred observation window is used for subsurface observations in the coarse-resolution data assimilation step of BRAN2020 (see Appendix A). This means that some of the observations included to calculate the background innovations are assimilated in previous assimilation cycles. For SST and SLA, the observations used for the background innovations are all independent -having not been used to constrain the reanalysis. But for subsurface temperature and salinity, this is not always the case.
Regarding the comparisons between BRAN2020 and BRAN2016, we again note that the data sources and volume of data assimilated in each experiment are different, and the reanalyses are initialised and forced differently. To more clearly attribute the improvements in BRAN2020, Chamberlain et al. (2021a) tested a multiscale data assimilation system using precisely the same observations and boundary conditions as BRAN2016. Their results show substantial improvements in subsurface temperature and salinity with multiscale assimilation relative to the BRAN2016 configuration and smaller changes in surface metrics of the ocean state (SST and SLA). We therefore conclude that the main improvements evident in BRAN2020, compared to BRAN2016, relate to the new multiscale data assimilation approach presented in Chamberlain et al. (2021a). Table 4. Decadal averages of differences between the mean absolute values of background and analysis innovations for BRAN2020 and BRAN2016. These statistics provide an indication of how the errors grow between each analysis cycle. Fast error growth may indicate that analyses are overfit or that the observing system is too sparse to adequately constrain the reanalysis.

Point-wise comparisons
The analysis of innovations presented above is based on metrics calculated during the data assimilation steps of BRAN2020. However, the BRAN2020 dataset that is publicly available is comprised of daily averaged, reanalysed fields that are produced by the model in between analysis updates. To assess these fields explicitly, we now compare the daily averaged fields with observations for each day of BRAN2020: 9861 daily comparisons from the start of 1993 to the end of 2019. For SLA and SST and for subsurface temperature and salinity in depth ranges of 0-5 and 50-500 m, we group the model-data differences into 10 × 10 • bins for the period 2008-2019 across the entire model domain. The results for each of these fields are presented in Figs. 3, 4, 5 and 6. For each figure, the top panel shows the map of mean absolute deviation (MAD) between observations and daily mean reanalysed fields and the bottom panels show the difference between the MAD on day 3 (the day before each assimilation step) and day 1 (the day after each assimilation step) of each analysis cycle. These fields are intended to demonstrate the level of agreement between BRAN2020 fields and observations and how differences grow in each analysis cycle. We expect that this should provide users with a clear idea of how accurate BRAN2020 fields are for all regions.
The MADs between reanalysed and observed SLAs on day 1 of each assimilation cycle average 4.5 cm globally, with values of 6-12 cm in western boundary currents (WBCs), 4-5 cm along the Antarctic Circumpolar Current (ACC), and 2-3 cm in equatorial regions and in the more quiescent parts of the ocean (Fig. 3). These MADs tend to grow by 1-2 cm in WBCs, by less than 2 cm along the ACC, and by less than 1 cm in the equatorial and less variable regions of the ocean.
For SST, the MADs between reanalysed and observed fields on day 1 of each assimilation cycle average 0.26 • C globally, 0.2-0.6 • C in WBCs, 0.3-0.4 • C along the ACC, and less than 0.2 • C in equatorial and quiescent parts of the ocean (Fig. 4). These MADs tend to grow by 0.1-0.3 • C in WBCs and along the ACC and by less than 0.05 • C for much of the equatorial ocean. Unlike the growth in SLAs, MADs for SST grow significantly in eastern boundary current regions, with increases of 0.1-0.2 • C. Figures 5 and 6 show similar distributions of MAD and MAD growth within the assimilation cycle for temperature and salinity, in two depth bands, 0-50 and 50-500 m. For temperature in the top 50 m, the MADs between reanalysed and observed fields on day 1 of each assimilation cycle average 0.31 • C globally, with values of 0.4-0.8 • C in WBCs, less than 0.3 • C along most of the ACC, and less than 0.2 • C in the equatorial and quiescent ocean (Fig. 5). These MADs for temperature in the top 50 m tend to grow by 0.2-0.3 • C in WBCs, by 0.1-0.2 • C along the ACC and by less than 0.1 • C for much of the equatorial ocean. Like SST, the upper-ocean temperature MAD growth by day 3 is larger in eastern boundary currents, with increases of up to 0.3 • C.
For temperature between 50 and 500 m depth, the MADs between reanalysed and observed fields on day 1 of each assimilation cycle average 0.4 • C globally, with values of 0.4-0.6 • C in WBCs, less than 0.3 • C along most of the ACC, about 0.5 • C in equatorial regions and 0.3 • C in the quiescent ocean (Fig. 5). These MADs tend to grow by 0.3-0.6 • C in WBCs, by 0.3-0.4 • C along the ACC and by 0.2-0.3 • C for much of the equatorial ocean. For salinity in the top 50 m, the MADs between reanalysed and observed fields on day 1 of each assimilation cycle average 0.08 psu globally, with values of 0.04-0.2 psu in WBCs, less than 0.05 psu along most of the ACC, and less than 0.06 psu in equatorial and quiescent ocean (Fig. 6).
These MADs for salinity in the top 50 m tend to grow by about 0.1 psu equatorwards of 30 • S and 30 • N and by less than 0.03 psu in the Southern Ocean.
For salinity between 50 and 500 m depth, the MADs between reanalysed and observed fields on day 1 of each assim-    depth for different time periods. Clearly shown is the decrease in MAD from the 1990s to the 2010s ( Fig. 7a; compare the solid and dashed lines) as the number of subsurface observations increases with data from the Argo float array becoming available (Fig. 7c). Separate profiles of averaged MAD (Fig. 7a) are presented for regions of high and low variability, as indicated in the mask map ( Fig. 7b; blue denotes high-variability regions), and averaged over times with and without Argo data. For this analysis, the regions of high variability are where the standard deviation of the detrended SLA exceeds 8 cm. The choice of this cutoff value is arbitrary, but this value means that the high-variability region includes all of the WBC regions, the path of the ACC and some of the high-variability tropical current systems. Figure 7 shows that MAD is lowest in regions of low variability when Argo is available and highest in highvariability regions pre-Argo. Figure 7d-f show Hovmöller plots of model-data MAD fields for temperature for all regions (panel d), high-variability regions (panel e) and low- variability regions (panel f). For temperature, MAD is highest in the upper thermocline and below the surface (∼ 50-200 m) and a seasonal cycle is evident in the pre-Argo period, peaking in the middle of each year. MAD at the surface is relatively low, where reanalysed fields are well-constrained by SST observations. At depth (> 500 m) temperature MADs are low everywhere, even at the beginning of BRAN2020, but reduce further with Argo data.
The profiles of salinity MAD with depth (Fig. 8) are similar to those of temperature, with variability decreasing everywhere with Argo and an increasing number of global observations. Without the equivalent of high-quality SST observations for salinity, the MAD is high at the surface. There is less difference in salinity MAD between high-and lowvariability regions, compared to temperature; most of this difference is in the top 200 m.

Sea-level variability
To further assess the variability in the reanalysed circulation, we compare estimates of sea-level variability, here quantified using the standard deviation of the detrended sea level (Fig. 9), for BRAN2020, a free-running model (Spinup-EI) and an observation-based analysis product (using Aviso SSALTO/DUACS). Spinup-EI is a simulation with the same global ocean model as used for BRAN2020 but is run without data assimilation (Oke et al., 2013a). The run used here is equivalent to the run that is used to construct the high-resolution ensemble described in Sect. 2.3. The Aviso-SSALTO/DUACS product is generated by analysing along-track SLA observations from multiple missions to construct maps of SLA on a 1/4 • grid and spans 1993 to 2014 (from http://www.aviso.altimetry.fr, last access: 30 November 2021). The sea-level variability fields from models are calculated over the same years. The comparisons in Fig. 9 show that the simulated SLA variability is very similar for all products. The model simulation (Spinup-EI) realistically reproduces the key features of the observed field. However, we consider the comparisons between BRAN2020 and the Aviso-SSALTO/DUACS product to be significantly better. The locations, spatial extent and amplitude of the local maxima of SLA variability in Fig. 9 agree more closely between BRAN2020 and Aviso-SSALTO/DUACS than between BRAN2020 and Spinup-EI. Figure 9 confirms that the reanalysed circulation in BRAN2020 is realistic and closely aligned with other observational estimates. Figure 10 shows a comparison of some key SST-based climate indices, calculated using observations (black) and BRAN2020 (red) fields. This includes comparisons of Niño 3 and Niño 4 using data from NOAA's Climate Prediction Center based on optimally interpolated SST (Banzon et al., 2014), as well as the Interdecadal Pacific Oscillation (IPO; as defined by Henley et al., 2015) and the Indian Ocean Dipole (IOD) based on HadISST (Rayner et al., 2003). There is a very close match between the observed and reanalysed Niño indices and the IPO (Fig. 10a-c). The agreement for the IOD (Fig. 10d) is not as good. For this comparison, the observed IOD is calculated from a 1 • resolution HadISST product. The discrepancies shown in Fig. 10 for the IOD are mostly associated with the eastern node, a significant part of which crosses islands of Indonesia (see the boxes in Fig. 11a). SST estimates in different products depend on resolution, particularly in coastal regions with high variability. As a result, the details of the "observed" and reanalysed IOD differ in detail. Despite this, the observed and reanalysed IODs are still very consistent. Figure 11 shows maps of SST variability, here quantified as the standard deviation of SST anomalies, for three products: BRAN2020, Spinup-EI and the gridded CCI-SST observation product (version 2.1; Merchant et al., 2019;. All estimates are calculated for the period spanning 2000-2009, with respect to climatologies from the same period. SST variability is greatest in regions of significant mesoscale eddy variability, such as WBCs and along the path of the ACC. SST variability is also high in places associated with interannual modes, such as the equatorial regions and the midlatitudes, which include the nodes of the SST indices indicated. Figure 11 shows that the SST variability simulated by a model without data assimilation is realistic. But the comparison between the observation-based estimate (CCI-SST) and BRAN2020 shows exceptionally good agreement. In practice, it is difficult to identify any region of disagreement for the colour scales shown here. The only subtle differences, evident in this comparison, are in regions of seasonal sea ice cover, which BRAN does not include.

Australian boundary currents
Here we calculate volume transports along some of the major boundary current systems in the Australasian region and compare results with previous versions of BRAN. Currents and velocities are not assimilated directly into the ocean state, making these comparisons a useful evaluation of the whole ocean reanalysis system. In this case, we compare transport from BRAN2020 with previous versions of the reanalysis and also with the "free-running" SPINUP-EI experiment without data assimilation, albeit with data assimilated into surface forcing driving the model. Schiller et al. (2008b) examined boundary currents of Australia from an earlier version of BRAN, presenting the average transports at several sections of the Indonesian Throughflow (ITF), the Leeuwin, the South Australian Current (SAC) and the East Australian Current (EAC). The positions of these sections are shown in Fig. 12 with the long-term average of currents in the upper ocean. Volume transport averages and ranges are shown again in Table 5, along with equivalent transports calculated from BRAN2020 over both the time corresponding to the Schiller et al. (2008b) analysis and the full time span available in BRAN2020. Since the reanalysis simulates mesoscale dynamics of the ocean, the variability and range of transports are large and new transport values from BRAN2020 fall well within the ranges of Schiller et al. (2008b), and in most cases the differences in transports from the two models are significantly less than the range. Figure 13 shows time series of volume transports through a few selected positions from Table 5, comparing BRAN2020 transports with the recent BRAN2016 and a free-running experiment with the same ocean model setup. Significant multiyear variability is evident in the volume transport through the Timor Sea. While the free-running model captures the seasonal cycle and even much of the year-to-year variability (due to climate signal contained within the surface forcing), only the reanalyses with data assimilation capture the decreasing trend in westward flow over the years 2000 to 2015, which BRAN2020 shows to strengthen again in 2016. This decreasing trend is also present in BRAN2016, albeit with a weaker magnitude.
The position of the transport shown for the EAC (bottom panel, Fig. 13) is past the point where it separates from the coast and part of the EAC extension, where the volume transport is less than the flow north of the separation point. While the means of the three experiments shown are consistent, there is more variability in the BRAN versions relative to the free-running model, indicating eddies and mesoscale variability are not as prominent at this position in the ocean model without data assimilation. Also, southward transport here is stronger than the mean reported by Schiller et al. (2008b), Table 5.
Interannual variability in the Leeuwin Current system is less than variability over short timescales here, and the mean transport in the Leeuwin is weaker than the other systems presented here. The transport from the reanalyses shows larger extremes relative to the free-running experiment. There are some persistent extremes in the BRAN transports, such a strengthening southward flow in the years 1999 and 2008, none of which are captured in the free-running model.

Conclusions
For most metrics, BRAN2020 outperforms all previously presented reanalyses in this series of BRAN products. We attribute the improved performance of BRAN2020 mainly to the adoption of multiscale data assimilation (Chamberlain et al., 2021a). This delivered the greatest benefit for subsurface temperature and salinity, where there are previously acknowledged biases (Oke et al., 2013b). As a result, agreement between observed and reanalysed temperature and salinity fields below 50 m is of the order of 30 % less in BRAN2020 compared to previous versions of BRAN. The learnings from performing BRAN2020 and the multiscale data assimilation paper (Chamberlain et al., 2021a) were applied to develop a new observational product, called Blue Maps (Oke et al., 2021). In that paper, Oke et al. (2021) demonstrate again the benefits of using EnOI with a diverse ensemble of anomalies that represent multiple timescales and length scales.

Appendix A: Ensemble optimal interpolation configuration
For BRAN2020, EnOI is applied in a two-step multiscale data assimilation approach, as described in Sect. 2.1 and by Chamberlain et al. (2021a). Here, we present the technical details of the data assimilation, with specific reference to the parameters and settings (in Table A1) used with the opensource EnKF-C software (Sakov, 2014).
The details in Table A1 include the data assimilation cycle length, the ensemble size for each assimilation step, the method for ensemble construction, the localisation radii, the stride, the R factor, the K factor, observation windows and the domain over which observations are assimilated. Each of these points is explained below. The data assimilation parameters are different for the two data assimilation steps: values from the high-resolution step are unchanged from the previous version, BRAN2016. The values used for BRAN2020 are those found to give good performance in tests of the multiscale data assimilation system in Chamberlain et al. (2021a).
The cycle length for BRAN2020 is 3 d, meaning that the ocean state is integrated forwards 3 d between each application of data assimilation. Data are assimilated at both resolutions in each cycle. The ensemble size for the coarse- and high-resolution data assimilation steps is 480 members and 144 members, respectively. The ensemble for the coarseresolution assimilation is intended to represent broad-scale anomalies. The ensemble for the high-resolution assimilation is intended to represent anomalies associated with mesoscale features. In practice, we would like to make the ensemble sizes as large as possible. These are limited by computational resources. Larger ensembles are feasible for the coarse-resolution data assimilation step, as the ocean state vectors and ensemble members are much smaller and relatively inexpensive computationally compared to those of the high-resolution data assimilation step. The use of two data assimilation steps increases the effective ensemble size but also increases the diversity in ensemble members used, from the different scales of anomalies captured in the different resolutions. Both these ensemble aspects, size and diversity, are shown to improve data assimilation performance in an observational product, Blue Maps (Oke et al., 2021). For the coarse-resolution assimilation, data are only assimilated within 65 • S and 65 • N to avoid impacts of sea ice that are represented in the coarse-resolution ensemble. For the coarse-resolution assimilation step, in situ data are assimilated for a 10 d data window, centred on the analysis time. For all other observations, data are assimilated for a 3 d centred window. The 10 d in situ data window matches the cycle of Argo floats, so this step can assimilate data from the full Argo array when available. Note the data window is greater than the reanalysis cycle, so some data are assimilated in three consecutive cycles in the coarse data assimilation step. We see no evidence of overfitting (degradation in forecast/background innovations) from this. Rather, increasing the density of in situ observations is beneficial in constraining the subsurface ocean that is poorly sampled and subject to dynamics and bias.
Following the equations presented in Sect. 2.1, the correction term applied to the ocean state, to reduce the misfit to observations, can be written as a sum of the ensemble members, or The values of c are related to the relative uncertainties in the model and observations, with higher magnitudes of c for higher model uncertainty or lower observation uncertainty.
The magnitude for a particular member (c i ) is higher when it correlates with the background innovations. For further details of the procedure used here, please refer to Sakov (2014) and Oke et al. (2013b). In theory, without localisation, c (in Eq. A1) does not vary in space. But for most practical applications, with a large state dimension, this works poorly because the ensemble is rank-deficient -lacking a sufficient number of degrees of freedom to appropriately fit the background innovations (e.g. . For BRAN2020, as well as for most other applications of EnOI, localisation is used to increase the effective rank of the ensemble and to eliminate unrealistic, long-distance ensemble-based covariances. This results in spatially inhomogeneous weightings that are a function of the horizontal location, i.e. c(x, y).
Use of localisation also means that observations beyond the localising length scale from a grid point have no impact on analyses at that grid point. This is exploited by EnKF-C for computational efficiency, by projecting background innovations onto the ensemble separately for individual horizontal grid points. This means that instead of a single -practically impossible -calculation, the assimilation problem is reduced to a large number of smaller calculations, with one calculation at individual horizontal grid points. The localising function used by EnKF-C is a quasi-Gaussian function, where the function goes to exactly zero over the localisation radius. The effective e-folding length scale of the localising function is ∼ 3.5 times smaller than the localisation length scale reported here (and elsewhere in the literature). The localisation radius used with high-resolution data assimilation is 250 km, consistent with the scale of the mesoscale features that are corrected in this step. We found that increasing the localisation radius to 1600 km improved the efficacy of the course-resolution assimilation step by including more neighbouring in situ profiles in the assimilation at any point, noting that the typical spacing of Argo floats is ∼ 300 km.
It is expected that the weighting coefficients in c vary smoothly in space, with values for adjacent grid points being very similar. This permits a further simplification with the use of "stride" (Sakov, 2014). In practice, weights are only calculated on a subset of grid points, determined by the stride; for instance, a stride of 3 for the high-resolution step means that coefficients are only calculated for every third grid point in each horizontal direction, significantly reducing the computational cost. When a stride greater than 1 is used, weights (c) are linearly interpolated onto the model grid before constructing the increments.
EnKF-C includes an option to use a so-called "K factor", described in Sakov and Sandery (2017). The K factor is in-tended to prevent overfitting by modifying observation variances so that the corresponding increments do not exceed the specified value times the spread of the ensemble.
Another parameter used to tune the data assimilation performance of EnKF-C is the "R factor". The R factor scales the assumed observation error variances, affecting the ratio of the background error variance (the ensemble variance) to observation error variances and adjusting the impact of observations on the calculated corrections to the ocean state (Sakov, 2014). The R factor is typically increased for applications using larger localisation radii that include more observations for each calculation. This increases the relative assumed observation error variance, again, to avoid overfitting. In practice, using an R factor of 4 effectively reduces the amplitude of the ensemble variance by 4, which is the same as reducing the effective background error (ensemble spread) by Competing interests. The contact author has declared that neither they nor their co-authors have any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements. Production of BRAN2020 is supported by the Bluelink project, a partnership between CSIRO, the Australian Bureau of Meteorology and the Australian Department of Defence. We thank Pavel Sakov for advice about and assistance in data assimilation and EnKF-C. Data were sourced from the Integrated Marine Observing System (IMOS, http://www.aodn.org.au, last access: 30 November 2021). IMOS is a national collaborative research infrastructure, supported by the Australian Government. CCI L2P SST data from AVHRR and ATSR sensors were produced at the Met Office as part of the European Space Agency (ESA) SST Climate Change Initiative (CCI) project, funded by ESA. AMSR-E L2P SST data were provided by the Group for High Resolution Sea Surface Temperature (GHRSST) and REMSS. AMSR2 L2P SST data were provided by GHRSST and JAXA EORC. The AVHRR L2P SST data from the Naval Oceanographic Office are made available under the Multi-sensor Improved Sea Surface Temperature (MISST) project sponsorship by the Office of Naval Research (ONR). The VIIRS L3U SST data were provided by GHRSST and the National Oceanic and Atmospheric Administration (NOAA). Argo data were collected and made freely available by the International Argo Program and the national programs that contribute to it (https://argo.ucsd.edu/, last access: 30 November 2021; https:// www.ocean-ops.org/board?t=argo, last access 30 November 2021).