https://doi.org/10.5194/essd-2024-138
03 May 2024
Status: this preprint has been withdrawn by the authors.

Data mining-based machine learning methods for improving hydrological data: a case study of salinity field in the Western Arctic Ocean

Shuhao Tao, Ling Du, and Jiahao Li

Abstract. The Western Arctic Ocean contains the Beaufort Gyre, the largest freshwater reservoir in the Arctic Ocean. Long-term changes in freshwater reservoirs are critical for understanding the Arctic Ocean, so data from various sources, particularly measured and reanalyzed data, must be used to the greatest extent possible. Over the past two decades, numerous intensive field observations and ship surveys in the Western Arctic Ocean have yielded a large volume of conductivity-temperature-depth (CTD) data. We evaluated and merged multiple machine learning methods to reconstruct an annual salinity product for the Western Arctic Ocean over the period 2003–2022. The data mining-based machine learning methods use variables determined by physical processes, such as sea level pressure, sea ice concentration, and drift. Because sea surface salinity is more susceptible to atmospheric, sea ice, and oceanic changes, and is therefore more variable, we constrained the average root mean square error (RMSE) of the sea surface salinity field against CTD and EN4 data to within 0.25 psu during machine learning training. The uncertainty in predicted sea surface salinity is 0.24 % when constrained by CTD data and falls to 0.02 % when constrained by EN4 data. During data merging and post-calibration, the weight coefficients are constrained by imposing limits on the uncertainty value. Compared with the EN4 and ORAS5 salinity products commonly used in the Arctic Ocean, our salinity product provides more accurate descriptions of the freshwater content of the Beaufort Gyre and of depth variations at the base of its halocline.
This approach to evaluating and integrating the results of multiple machine learning methods has potential applications beyond the salinity field, including hydrometeorology, sea ice thickness, polar biogeochemistry, and other related fields. The datasets are available at https://zenodo.org/records/10990138 (Tao and Du, 2024).
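
The abstract describes two quantitative steps: keeping the RMSE of predicted sea surface salinity within 0.25 psu, and merging several machine learning results with weights limited by an uncertainty value. The exact weighting scheme is not given here, so the following Python sketch is only an illustration under stated assumptions: it uses inverse-variance weights, treats each model's RMSE against observations as its uncertainty, and gives zero weight to any model above the cap. The function names `rmse` and `merge_predictions` and all salinity values are hypothetical and are not taken from the preprint or its dataset.

```python
import numpy as np


def rmse(predicted, observed):
    """Root mean square error between predicted and observed salinity (psu)."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


def merge_predictions(predictions, uncertainties, max_uncertainty):
    """Merge model predictions with inverse-variance weights.

    ASSUMPTION: inverse-variance weighting with a hard uncertainty cap is an
    illustrative choice, not the scheme documented in the preprint.

    predictions:     array of shape (n_models, n_points)
    uncertainties:   array of shape (n_models,), strictly positive
    max_uncertainty: models above this value receive zero weight
    """
    predictions = np.asarray(predictions, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    if np.any(uncertainties <= 0):
        raise ValueError("uncertainties must be strictly positive")

    keep = uncertainties <= max_uncertainty
    if not keep.any():
        raise ValueError("no model satisfies the uncertainty limit")

    # Weight each retained model by 1 / sigma^2, then normalise to sum to 1.
    weights = np.where(keep, 1.0 / uncertainties**2, 0.0)
    weights = weights / weights.sum()

    # Weighted sum over the model axis gives the merged field.
    merged = weights @ predictions
    return merged, weights


# Hypothetical observed and modelled sea surface salinity values (psu).
observed = np.array([30.0, 31.0, 32.0])
models = np.array([
    [30.1, 31.1, 32.1],  # model A
    [30.2, 30.8, 32.2],  # model B
    [30.5, 31.5, 31.5],  # model C
])

# ASSUMPTION: use each model's RMSE against observations as its uncertainty.
model_rmse = np.array([rmse(m, observed) for m in models])
merged, weights = merge_predictions(models, model_rmse, max_uncertainty=0.25)

print("per-model RMSE:", model_rmse)
print("weights:", weights)
print("merged RMSE within 0.25 psu:", rmse(merged, observed) <= 0.25)
```

In this toy case the third model exceeds the 0.25 psu cap and is excluded, while the remaining two are weighted in inverse proportion to their squared RMSE. A real workflow would also evaluate the merged field against independent data held out from training.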


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'No signs of cross-validation against independent data, probable overfitting', Anonymous Referee #1, 24 Jun 2024
    • AC1: 'Reply on RC1', Ling Du, 16 Jul 2024
  • RC2: 'Comment on essd-2024-138', Anonymous Referee #2, 10 Jul 2024
    • AC2: 'Reply on RC2', Ling Du, 16 Jul 2024


Data sets

Data mining-based machine learning methods for improving hydrological data: a case study of salinity field in the Western Arctic Ocean
Shuhao Tao and Ling Du
https://zenodo.org/records/10990138


Viewed

Total article views: 680 (including HTML, PDF, and XML)
  • HTML: 535
  • PDF: 98
  • XML: 47
  • Total: 680
  • BibTeX: 40
  • EndNote: 39

Viewed (geographical distribution)

Total article views: 658 (including HTML, PDF, and XML), of which 658 have a defined geography and 0 are of unknown origin.
Latest update: 20 Nov 2024

Short summary
Using the salinity field of the Western Arctic Ocean as an example, we construct a novel data mining method for polar seas that combines multiple machine learning methods, integrates multiple data sources, and incorporates physical processes. The approach has potential applications beyond the salinity field, including hydrometeorology, sea ice thickness, and polar biogeochemistry, among others.