the Creative Commons Attribution 4.0 License.
Gap-filled subsurface mooring dataset off Western Australia during 2010–2023
Abstract. Coastal moorings allow scientists to collect long-term datasets that are valuable for understanding shelf dynamics, detecting climate variability and change, and evaluating their impacts on marine ecosystems. Continuous time series from moorings are often disrupted by mooring losses or instrument failures, which prevents us from obtaining complete and accurate information on the marine environment. Here, we present an updated version of the 14-year subsurface mooring dataset off the southwest coast of Western Australia during 2010–2023 (https://doi.org/10.25919/myac-yx60, Bui and Feng, 2024). This updated dataset offers continuous daily temperature and current data with a 5-meter vertical resolution, collected from six coastal Integrated Marine Observing System (IMOS) moorings at depths between 48 m and 500 m. A Self-Organizing Map (SOM) machine learning technique is applied to fill the data gaps in the previous version. The use of the in-filled data product is demonstrated by detecting subsurface marine heatwaves on the Rottnest shelf. The data products can be used to characterise subsurface features of extreme events, such as marine heatwaves and marine cold-spells, influenced by the Leeuwin Current and the wind-driven Capes Current, and to detect long-term change signals along the coast.
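To illustrate the general idea of SOM-based gap filling mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-written SOM in NumPy, synthetic linear temperature profiles, and hypothetical function names (`train_som`, `fill_gaps`). The map is trained on complete profiles; a gappy profile is then matched to its best-matching unit using only the observed depths, and the missing depths are copied from that unit.

```python
import numpy as np

def train_som(data, n_rows=4, n_cols=4, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on complete samples (rows of `data`, no NaNs)."""
    rng = np.random.default_rng(seed)
    n_units = n_rows * n_cols
    # Grid coordinates of each unit, used by the neighbourhood function.
    grid = np.array([(i, j) for i in range(n_rows) for j in range(n_cols)], dtype=float)
    # Initialise the unit weights from randomly chosen training samples.
    weights = data[rng.integers(0, len(data), n_units)].astype(float).copy()
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                    # learning rate decays to zero
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # neighbourhood radius shrinks
        x = data[rng.integers(0, len(data))]
        # Best-matching unit: smallest Euclidean distance to the sample.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def fill_gaps(sample, weights):
    """Fill NaNs in one sample with the values of its best-matching unit."""
    known = ~np.isnan(sample)
    # Distance is computed over the observed components only.
    d2 = ((weights[:, known] - sample[known]) ** 2).sum(axis=1)
    bmu = np.argmin(d2)
    filled = sample.copy()
    filled[~known] = weights[bmu, ~known]
    return filled

# Synthetic example (assumed data, not mooring observations).
depths = np.arange(0, 55, 5)                        # 0, 5, ..., 50 m
rng = np.random.default_rng(1)
surface = rng.uniform(17.0, 23.0, size=500)         # surface temperature, deg C
train = surface[:, None] - 0.05 * depths[None, :]   # linear decrease with depth
weights = train_som(train)

truth = 20.0 - 0.05 * depths
gappy = truth.copy()
gappy[3:7] = np.nan                                 # simulate a failed instrument
filled = fill_gaps(gappy, weights)
```

Observed values are left untouched and only the missing depths are replaced. In practice the map size, the training length, and the ancillary inputs (such as SST and sea level, as the referee reports mention) would need to be chosen and validated against withheld data.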
Status: final response (author comments only)
RC1: 'Comment on essd-2024-449', Alejandro Orfila, 10 Nov 2024
Rv. Ms essd-2024-449 “Gap-filled subsurface mooring dataset off Western Australia during 2010-2023” by Bui et al.
The Ms. presents the use of a SOM approach to fill gaps in time series of temperature and velocities from a mooring array at the southwest shelf of Western Australia. The system is trained with 14.5 years of daily temperatures at 3 moorings and with nearly 13 years of daily velocities at 5 moorings. For both gap filling approaches SOM is trained in conjunction with daily SST and coastal sea level.
Although the use of Self-Organizing Maps for gap filling of time series is not new in ocean studies, this work provides access to a complete dataset of velocity and temperature data at different depths in the Rottnest Shelf area. These data are of great interest, as the authors state, for the analysis of the seasonal and interannual variability of the Leeuwin Current (LC) and for assessing the influence of large-scale modes of variability on it. The methodology is sound and the paper is well written, although I would appreciate some aspects being treated in more detail.
- The first question, which should at least be discussed, is why the authors do not use local winds (at least at one station) as ancillary data to train the SOM since, as they state in the introduction, the strength of the LC is largely influenced by winds. I also suggest showing in Figure 6 the time series of sea level used in the study.
- Lines 118-120. Could you please discuss why such a large number of neurons was selected? Would the results be expected to be the same if the number were reduced by, say, 50 or 70%?
- Lines 182-184: highly speculative. Besides, what can be concluded from Figure 7?
Overall, I think this is a good manuscript, well written, that provides a good set of filled data.
Citation: https://doi.org/10.5194/essd-2024-449-RC1
RC2: 'Comment on essd-2024-449', Anonymous Referee #2, 13 Nov 2024
The paper by Bui and coauthors describes a dataset based on gap-filled mooring observations for temperature and currents.
Substantial modifications are suggested to improve the quality of the presentation and clarify the value of the proposed dataset. The level of detail is insufficient in some places, and potentially confusing statements on crucial aspects are present.
The novelty of the dataset for potential users needs to be clarified. Are these data otherwise not available, and not assimilated in models (see e.g. Siripatana et al. 2020, https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2020JC016580, or Santana et al. 2023, https://gmd.copernicus.org/articles/16/3675/2023/, for a different region)? How large is the gain from this dataset, compared to the cited Sloyan et al. (2024)? Can you mention some example applications which would benefit from this dataset?
Percentages of missing data aggregated for each station are given (Table 2), but perhaps the dependence on depth should be provided too. The amount of missing data seems generally low (<30%), so the added value of the SOM estimation is not evident. Other methods used for mooring data (e.g. Cao et al. (2015), https://journals.ametsoc.org/view/journals/atot/32/7/jtech-d-14-00221_1.xml) should be discussed. A plot showing SOM estimates should be provided, since Figure 6 is not very clear.
Different instruments are mentioned, but differences between them are not explained. In the SOM description, it is not clear to me whether the procedure is applied for each station or if they are aggregated. Have you tested various possibilities? Since stations are in a rather small area, how are measurements correlated? More graphical examples should be provided to illustrate the method's performance.
Only a few scatter plots are shown, while more quantitative metrics need to be used to assess the goodness of the SOM-based estimates, besides the numbers (RMSE) provided in section 2.3. A baseline method, e.g. climatology or an AR process, should be compared. Perhaps the analysis can be shown both for 'average' conditions and extremes, such as MHW/MCS states, and Fig S7 can serve as a starting point. However, what do you do in such plots when two profiles are overlapping? Are you showing an average?
Detection of MHW/MCS events seems a reasonable application, but referencing and context seem to be missing. Please revise this part.
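As a sketch of what the baseline comparison requested above could look like, the following example withholds a block of a synthetic daily series, fills it with a day-of-year climatology, and scores the result with RMSE against the withheld truth. The data, the gap position, and the function names (`rmse`, `climatology_fill`) are assumptions for illustration only; they are not taken from the manuscript.

```python
import numpy as np

def rmse(estimate, truth):
    """Root-mean-square error between two equally shaped arrays."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

def climatology_fill(series, day_of_year):
    """Fill NaNs with the mean of observed values sharing the same day of year."""
    filled = series.copy()
    missing = np.isnan(series)
    for d in np.unique(day_of_year[missing]):
        same_day = day_of_year == d
        filled[missing & same_day] = np.nanmean(series[same_day])
    return filled

# Synthetic four-year daily temperature series: seasonal cycle plus noise.
rng = np.random.default_rng(0)
n_days = 4 * 365
day_of_year = np.arange(n_days) % 365
truth = 20.0 + 3.0 * np.sin(2 * np.pi * day_of_year / 365) + rng.normal(0, 0.3, n_days)

# Withhold a 30-day block to act as a validation gap.
gappy = truth.copy()
gap = slice(400, 430)
gappy[gap] = np.nan

filled_clim = climatology_fill(gappy, day_of_year)
# Even simpler reference: fill with the long-term mean of the observed data.
filled_mean = np.where(np.isnan(gappy), np.nanmean(gappy), gappy)

rmse_clim = rmse(filled_clim[gap], truth[gap])
rmse_mean = rmse(filled_mean[gap], truth[gap])
print(f"climatology RMSE: {rmse_clim:.2f}, long-term mean RMSE: {rmse_mean:.2f}")
```

The same withheld gaps could be filled by the SOM and by the baseline, so that the RMSE values are directly comparable. Restricting the score to days flagged as MHW/MCS would give the 'extremes' variant of the comparison.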
Line-specific comments
L19 long term trends with less than 15 years of data is debatable
L27 CSIRO undefined
Fig 1 currents should be plotted from analysis data, rather than sketched manually
L33 You are referring to the Ningaloo here?
L45 Explain acronyms, such as SBE, in the caption. They are expanded
in some cases later but the table is unclear as it stands. What's ADCP?
Table 1 it would be good to expand acronyms for locations here
L62 (and elsewhere) typo 'Euclidean'
L77 not sure what 'completion' means here
Table 2 does an empty cell mean zero?
L110 please either use lower or uppercase for MATLAB consistently
Fig 2 I wonder if showing actual examples could be more informative.
For example, a case with a small fraction of missing data and a more
difficult one (of course from the validation set, so to compare with ground truth).
Fig 3a looks quite worse than the other two. Please explain why.
L154 please make this quantitative
Fig 5 using white for missing data is an unfortunate choice given the
colorbar. Please change this to avoid ambiguity, as in S1. Also right now it looks as if measurements
are continuous in the vertical, which is not the case. Please use a different
plot, e.g. a scatter plot.
L164 If you mention La Niña, then the time series should be included.
As for the comment above, not sure if the Pacific or Ningaloo.
How can one anticipate this situation? Do you have a reference?
L168 Please provide background. How do you calculate this? Please reference properly
Table 3 Time has units I guess (days). There is a misplaced bracket.
L218 What's the empty bullet?
L235 In this part or before you should provide references to earlier works
on sea temperature extremes in the area, if any
L240 is it reasonable to assign 0 degrees Celsius in the water column?
L246 I am confused, aren't you using ITCOMPSOM as stated earlier?
L325 what is '2015' here?
Citation: https://doi.org/10.5194/essd-2024-449-RC2
RC3: 'Comment on essd-2024-449', Giuseppe M.R. Manzella, 17 Nov 2024
The paper presents data collected in the coastal area of Western Australia and applies a machine learning method to fill gaps in the time series.
Data in coastal areas continue to be insufficient for the objectives of oceanographic research and those of the blue economy, so this contribution is highly appreciated.
The tool presented in the paper follows a line of work going back to, e.g., Weare et al. (1976), who developed a system for the selection of variables with the same characteristics; Davis (1976) and Preisendorfer (1988), who developed statistical models for short-term prediction; and Fukumori and Wunsch (1991), who obtained the most efficient representation of the original fields.
Continuing with the bibliography, it is worth mentioning Rutherford (1972), who applied Gandin's classical least-squares minimization method to the statistical interpolation of short-range weather forecast error fields; Tencaliec et al. (2015), who estimated missing data in river time series by applying regressive-autoregressive analysis; and Mariethoz et al. (2012), who used spectral analysis for the spatiotemporal reconstruction of gaps in multivariate fields.
Machine learning follows in the wake of the methods described in the cited bibliography.
- General comments
There are essential elements to consider that do not seem to be well clarified in the article:
- How much do the non-linearity and variability of phenomena in the coastal area weigh on the method?
- Are the data sufficiently representative of the physical state of the sea? (Perhaps the answers are in the articles cited by the authors, but a brief summary would have been very useful.)
Figures 3 and 4 should be discussed on the basis of the first point. Before even getting to them I was in fact convinced that the method worked well with parameters such as temperature (or even salinity) but would have significant errors with velocities.
- Specific comments
Line 77. Satellite data are used to extend the temperature to the surface. Since these data are part of sea truth exercises, a very brief presentation of associated precisions and uncertainties would be useful.
Line 78. The temperature in each mooring is extended to the surface with a linear interpolation. No problem with the seasonal thermocline?
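The surface extension questioned in the two comments above can be sketched as follows, assuming satellite SST is taken as the temperature at 0 m and joined linearly to the shallowest mooring sensor. The function name `extend_to_surface` and all values are hypothetical and serve only to illustrate the concern.

```python
import numpy as np

def extend_to_surface(sst, sensor_depths, sensor_temps, target_depths):
    """Linearly interpolate between SST at 0 m and the mooring sensors below."""
    depths = np.concatenate(([0.0], sensor_depths))   # must be increasing
    temps = np.concatenate(([sst], sensor_temps))
    return np.interp(target_depths, depths, temps)

# Hypothetical single-day example: shallowest sensor at 20 m.
sensor_depths = np.array([20.0, 30.0, 40.0])
sensor_temps = np.array([21.0, 20.5, 20.0])
target_depths = np.arange(0.0, 45.0, 5.0)             # 0, 5, ..., 40 m
profile = extend_to_surface(22.0, sensor_depths, sensor_temps, target_depths)
```

In this example the interpolated temperature at 10 m is 21.5 °C, halfway between the SST and the 20 m sensor. If a well-mixed surface layer sat above a sharp seasonal thermocline between 0 m and 20 m, a straight line would not represent it, which is the situation the comment on Line 78 asks about.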
Conclusion
The data are interesting and should be published. But I agree with one of the other referees: possible applications of the gap-filled data should be discussed, not only heat waves.
With corrections the article can be accepted.
Bibliography
Davis, R.E., 1976: Predictability of sea surface temperature and sea level pressure anomalies over the north Pacific Ocean, J. Phys. Oceanogr., 6, 249 - 266.
Fukumori, I., and C. Wunsch, 1991: Efficient representation of north Atlantic hydrographic and chemical distributions, Progress in Oceanography, 27, 111-195.
Preisendorfer, R. W., 1988: Principal component analysis in meteorology and oceanography, Elsevier, New York, 425 pp.
Rutherford, I.D., 1972: Data assimilation by statistical interpolation of forecast error fields. Journal of Atmospheric Sciences, 29, 809 – 815.
Tencaliec, P., A.-C. Favre, C. Prieur, and T. Mathevet, 2015: Reconstruction of missing daily streamflow data using dynamic regression models, Water Resour. Res., 51, 9447–9463, doi:10.1002/2015WR017399.
Weare, B.C., A. Navato and R.E. Newell, 1976: Empirical orthogonal analysis of Pacific Sea surface temperature, J. Phys. Oceanogr., 6, 671-678.
Citation: https://doi.org/10.5194/essd-2024-449-RC3
Data sets
Gap-filled, gridded subsurface physical oceanography time series dataset derived from selected mooring measurements off the Western Australia coast during 2009-2023, T. Bui and M. Feng, https://doi.org/10.25919/myac-yx60
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
154 | 50 | 11 | 215 | 21 | 6 | 6 |