A gridded surface current product for the Gulf of Mexico from consolidated drifter measurements

A large set of historical surface drifter data from the Gulf of Mexico—3761 trajectories spanning 27 years and more than a dozen data sources—are collected, uniformly processed and quality controlled, and assimilated into a spatially and temporally gridded dataset called GulfFlow. This dataset is available in two versions, with one-quarter degree or one-twelfth degree spatial resolution respectively, both of which have overlapping monthly temporal bins with semimonthly spacing, and extend from the years 1992 through 2019. Together these form a significant resource for studying the circulation and variability 5 in this important region. The uniformly processed historical drifter data interpolated to hourly resolution from all publicly available sources are also distributed in a separate product called GulfDriftersOpen. Forming a mean surface current map by directly bin-averaging the hourly drifter data is found to lead to severe artifacts, a consequence of the extremely inhomogeneous temporal distribution of the drifters. Averaging instead the already monthly-averaged data in GulfFlow avoids these problems, resulting in the highest-resolution map of the mean Gulf of Mexico surface currents yet produced. The consolidated drifter 10 dataset is freely available from https://doi.org/10.5281/zenodo.3985916 (Lilly and Pérez-Brunius, 2020a), while the gridded products are available for noncommercial use at https://doi.org/10.5281/zenodo.3978793 (Lilly and Pérez-Brunius, 2020b), the latter being freely available for noncommercial use only for reasons discussed herein. 1 https://doi.org/10.5194/essd-2020-241 O pe n A cc es s Earth System Science Data D icu ssio n s Preprint. Discussion started: 15 October 2020 c © Author(s) 2020. CC BY 4.0 License.

1994, see Fig. 4a; data from a related experiment, LATEX-C, could not be located. The drifters, referred to as "WOCE-type" drifters by Howard and DiMarco (1998), are described by those authors as follows: The drifters consisted of a spherical, 33.7-cm diameter, foam-filled fiberglass surface float attached by a tether to a 91-cm diameter hoop which supported a 6-m cylindrical drogue made of heavy canvas. The canvas cylinder had a 180 series of circular holes in it, which is why this type of drogue is commonly referred to as a "holey-sock". Eighteen drifters had a 3-m tether which placed the bottom of the 6-m drogue at 9-m depth. Two drifters (07834 and 07833) had longer tethers which placed the drogue bottom at 50-m depth and one (07839) had an even longer tether which placed the drogue bottom at 100-m depth. Thus the nominal drogue depth of most drifters was at 7.5 m. For this data, raw position estimates from Argos tracking had been spline-fitted onto six-hourly trajectories. In our processing, brief initial deployments and recoveries of drifters 69341 and 78331, lasting only a few days, are omitted.

The Surface Current and Lagrangian Drift Program (SCULP)
The Surface Current and Lagrangian Drift (SCULP) Program described by Ohlmann and Niiler (2005)  and are seen to provide dense coverage over much of the U.S. continental shelf in the Gulf.
The SCULP experiments used Argos-tracked drifters patterned after the Coastal Dynamics Experiment (CODE) drifters of Davis (1985) and manufactured by Technocean, Inc., http://www.technocean.com. These drifters have submerged sails roughly 195 1 m wide by 1 m deep acting as a simple drogue, and take the form of a plus sign (+) when viewed from above.
Upstream processing of the SCULP datasets is described by Ohlmann and Niiler (2005), and included despiking by flagging time points where velocities exceeded 250 cm s −1 . A final interpolation step is described as follows: The despiked position data were then interpolated onto a uniform three-hour time grid by fitting an analytic correlation function to the Fourier transform of a model spectrum based on 10 days of unequally spaced data centered 200 on the day of interest (Ohlmann et al., 2001;Van Meurs, 1995). The correlation function includes parameters to represent a low-frequency spectral amplitude, a tidal amplitude, and a tidal peak width.
This represents a very different and more complex interpolation step than has been employed in any of the other drifter datasets, and calls for specialized processing steps to detect occasional oscillatory artifacts that are discussed later. While it appears that such events are rare, and although we have done our best to identify suspect time periods, the possibility of such artifacts should 205 be kept in mind.

NOAA's Global Drifter Program (GDP) drifters
The United States' National Oceanic and Atmospheric Administration (NOAA) produces a large global dataset of surface drifters through its Global Drifter Program (GDP), http://www.aoml.noaa.gov/phod/dac. The physical design of the GDP drifters is based on instruments developed for the Surface Velocity Program (SVP) of the Tropical Ocean Global Atmosphere 210 (TOGA) experiments, see Lumpkin and Pazos (2007). Consequently, the drifters employed by the GDP are known as "SVP drifters". While there are several design variants, a common feature is a holey-sock drogue centered at 15 m depth that is intended to reduce wind slippage. Drogues can be lost during the drifter lifetime, which will alter the response to wind forcing.
A flag for the presence or absence of the drogues is provided as a part of the GDP dataset using the methods described by Lumpkin et al. (2013). The standard GDP dataset is a 6-hourly product that uses the quality control process of Hansen and Poulain (1996), as well as the kriging method of interpolation described therein. This processing involves heavy interpolation, dating back to a time when the typical temporal density of position fixes was much less than it is currently. Position fixes were historically determined using Argos tracking, but since 2013 a steadily increasing fraction of drifters has been tracked with the Global Positioning System (GPS). Further details on this dataset may be found in Lumpkin and Pazos (2007).

220
An updated, higher temporal resolution version of this dataset is the hourly product of Elipot et al. (2016), constructed using local polynomial fitting or "loess" (Fan and Gijbels, 1996;Cleveland, 1979). While the hourly dataset mostly contains data after 2005-when a change to the tracking arrangement with Argos led to many more position fixes-it also contains some trajectories at earlier times when the average sampling rate happened to be sufficiently high. Drifters tracked by the Argos system, as well as those using the much higher accuracy GPS tracking, are both included; see Elipot et al. (2016) for a detailed 225 discussion of the errors expected for each of these two tracking methods.
Because of the very different tracking and interpolation methods employed, the GDP drifter dataset is separated into three

Uniform processing
The uniform processing methodology is as follows. We begin with data that has been interpolated to a uniform sampling interval, possibly with gaps. Trajectory segments lying entirely within the study region shown in Fig. 1 are isolated. Depth is found by looking up drifter positions within the Smith and Sandwell global one minute bathymetry dataset v. 19.1 (Sandwell and Smith, 1997). Data points for which the depth is negative, indicating a location on land, are flagged as bad.

345
A visual inspection is then carried out in order to identify trajectory segments that appear suspicious. Several different types of features are interpreted as cause to flag a data segment: stationary locations, likely indicating a grounded drifter; extended periods of linearly varying positions, likely indicating a linear interpolation over a data gap; isolated, patchy data segments of valid data near the end of a record; unusually noisy or jagged data segments; high-speed segments terminating near the shore, likely arising from grounding due to wind or wave activity; isolated anomalous points; and finally conspicuous, 350 rapidly-changing oscillations. These latter only in the SCULP drifters and apparently indicate a gap that has been filled with the vigorous interpolation applied to those datasets. Such features are clearly distinguished from eddies in that they appear as "knots" rather than loops when viewed on a map.
All data points that have been flagged as bad or missing are removed. The data are then interpolated to a uniform hourly spacing using interpolation with a piecewise cubic Hermite interpolation polynomial, the "pchip" method. A filled flag is 355 created to indicate when this interpolation has been applied. Within the hourly dataset, the filled value of interpolated data point is set to true if no valid un-interpolated data points were present within plus or minus six hours and five minutes. In the case of the hourly HARGOS and HGDP datasets, a gap field is available that gives the time gap that has been interpolated over in the upstream interpolation. For these two datasets, our filled flag is also set to true if the time gap in the upstream interpolation exceeds six hours. For the other datasets, similar information regarding upstream interpolation gaps is not available.

360
From this interpolated hourly dataset, velocity is computed using the first central difference on the sphere, see Appendix A.
Trajectory beginning or ending segments lacking good data are discarded, as are any trajectories containing no good data.
The remaining trajectories from all experiments are combined, with an integer source field added to indicate the originating dataset.
Next, acting on the combined hourly dataset, several objective criteria are applied to identify possibly problematic points.

365
Data points having instantaneous speeds less than 0.1 cm s −1 or greater than 250 cm s −1 are flagged, as well as those failing to pass a minimum acceleration criterion. The minimum acceleration criterion identifies times for which the acceleration magnitude |u (t) + iv (t)|, smoothed with a 168-point (two week) Hanning filter, is smaller than 10 −4 cm s −2 ; time periods exhibiting such a high degree of smoothness generally either reflect that the drifter is grounded, or that a data gap that has been interpolated over in earlier processing. Two criteria are also applied to isolate several unrealistically fast segments in shallow 370 water seen in the SCULP datasets.
Data flagged at this stage is marked by setting the filled field to true so that they can be excluded from future analysis, and then the flagged values are re-interpolating over using pchip. The fraction of data points that are filled at this secondary level is very small, about two per thousand valid data points. Finally, velocity and depth are both re-computed in the same https://doi.org/10.5194/essd-2020-241 It is useful at this stage to introduce notation for different types of averages. For convenience we represent the velocity as 470 a vector, u ≡ [u v] T , where the superscript "T " denotes the transpose. Let an overbar, u, denote an average over a spatial bin and over all times, and let angled brackets, u , denote an average over a spatial bin and a particular temporal bin. Thus, u is a function of time while u is not. We refer to u as the local average, u as the global average, and u as the double average.
Note that these averages do not commute; u is the same as u and is not equal to u .
The GulfFlows product contain the local average u at grid location and each time slice. As we have seen, a global time 475 average over all velocities in a spatial bin, u, is a poor way to form an estimate of the time mean currents, see Fig. 2c. A local average followed by the global average, u , gives the improved mean flow estimate seen in Fig. 2d and discussed earlier in Sect. 2.

A variance decomposition
In the same way that one can different define versions of an space/time averaged flow field, one can similarly define different 480 versions of the velocity covariance matrix. Within each bin, three different time-varying covariance matrices are which involve different combinations of local and global averages. The first of these, Γ, is the autocovariance within a 485 space/time bin for all observed velocities u relative to the time-independent double-mean velocity u . The second, E, is the autocovariance within a space/time bin for all observed velocities u relative to the time-dependent local mean velocity u in that bin. The third, Σ, is a covariance-like quantity involving the deviation between the local mean velocity u in a space-time bin and the double-mean velocity u .
Unlike the other matrices, Σ is not an averaged quantity, but instead is the outer product of a deviation vector with itself; 490 thus it is not technically a covariance. However, its time-average Σ will be a covariance matrix, so it is sensible to extend that term to this quantity as well.
The matrices Γ, E, and Σ will respectively be called the total, local, and bulk time-dependent covariance matrices. In fact these three matrices are related by the identity into the definition of Γ, followed by carrying out the indicated average. Taking the time average of Eqn. (4) followed by the matrix trace-that is, the sum of the diagonal elements-one finds where γ 2 ≡ tr Γ , 2 ≡ tr E , and σ 2 ≡ tr Σ , and with "tr" being the trace. This is a partitioning of the velocity variance at each 2D spatial bin into two parts: variability that is resolved by the gridded product, in σ 2 , together with unresolved subgridscale variability, in 2 .
The GulfFlow products contain E at each 3D bin, in order to quantify the subgrid-scale variability that is lost in forming the local average velocity u . The bulk covariance Σ is readily formed from the local average velocity u , and Γ, if desired, can 505 be reconstructed from their sum.
The standard deviations and σ for GulfFlow-1/4 • are shown in Fig. 7. The former represents subgrid-scale variability from the original drifter data that is lost in averaging to form the local average velocities u on a three-dimensional grid, while the latter represents temporal variability that is resolved by those velocities. Note the difference in magnitudes, as well as the very different spatial patterns. Whereas σ is large over the energetic and variable Loop Current, as expected, is elevated 510 over the continental shelf in addition to the Loop Current region, and especially in the vicinity of the Mississippi outflow.
This difference in spatial patterns is consistent with the physical expectation that variability on the shelf will be dominated by smaller horizontal scale structures than in deep water, see e.g. Bracco et al. (2016) and Luo et al. (2016). Thus, it is likely that meaningful information regarding temporal variability of small-scale energy is contained within the time-varying local variance E.

Sampling properties
The spatial and spatiotemporal sampling of the GulfFlow-1/4 • gridded product is shown in Fig. 8. Here, a bin is considered sampled if it contains at least one hourly drifter data point. The spatial distribution map in Fig. 8a shows the percent of possible time slices, out of a total of 648, that are sampled. In general we see that the GulfFlow-1/4 • is sparsely sampled, with at most a fifth or a quarter of the overlapping monthly slices being sampled at any spatial bin.

520
Averaging within monthly slices has led to a distribution that presents much less spatial variability than that seen in the distribution of the original drifter positions in Fig. 6a. Moreover, the nature of the distribution has changed. Total data densities were seen to be generally higher on the Texas-Louisiana continental shelf than in deep water in Fig. 6a. However, the opposite tendency is apparent in Fig. 8a. Here we see that shallow bins are consistently sampled during fewer time slices than deeper bins, with the transition between these roughly coinciding with the 500 m isobath; note the similarity between this isobath and 525 the 10% sampling contour. This in part reflects the fact that the continental shelves have typically been the subject of intense experiments carried out over short durations, as opposed to the repeated deployments over long time periods found in some deepwater experiments. It also reflects a tendency of drifters not to cross from the deep interior to the continental shelves.
One sees that the longitude/time distribution of sampled slices for the GulfFlow-1/4 • dataset, Fig. 8b, exhibits a change in behavior between the western and eastern Gulf. Overall the eastern Gulf is considerably less well-sampled than the west.  Horizon Marine, https://www.horizonmarine.com, was crucial during the planning, execution, and data acquisition for the SGOM drifter program. Data from drifters deployed in US waters in support of the EddyWatch® program, https://www.horizonmarine.com/eddywatch, have also been provided as a part of a data exchange agreement between Horizon Marine and CICESE; these comprise the NGOM dataset.
Online access to the two OCG datasets require the user to agree to include particular acknowledgement in any publication. The required 635 acknowledgement for the OilSpill portion of the dataset is: "Drifter data were obtained courtesy of Prof. R. H. Weisberg and the USF Ocean Circulation Group, through a coordinated ocean observing and modeling program focusing on the West Florida Shelf (Weisberg et al., 2005).
The drifters were deployed in the eastern Gulf of Mexico region in summer 2010 as a rapid response to the Deepwater Horizon oil spill (Liu et al., , 2013a, and the data were used in monitoring the Gulf of Mexico Loop Current and the shelf circulation (Liu et al., 2013b) and in evaluating a trajectory models Liu et al., 2014)." Similarly, that for the RedTide portion is "Drifter data