Global open-ocean daily turbulent heat flux dataset (1992–2020) from SSM/I via deep learning

Wang, Haoyu; Wang, Mengjiao; Li, Xiaofeng

doi:10.5194/essd-2025-545

Preprints

06 Oct 2025

Status: this preprint is currently under review for the journal ESSD.

Abstract. Air–sea turbulent heat fluxes – latent heat flux (LHF) and sensible heat flux (SHF) – are fundamental to the Earth’s energy and moisture budgets and to ocean–atmosphere coupling. Global flux estimates via bulk aerodynamic algorithms depend on sea surface temperature (SST), surface wind speed (SSW), near-surface air temperature (T_a), and specific humidity (Q_a), but orbital sampling and cloud contamination leave gaps in satellite inputs that propagate uncertainty to T_a/Q_a and hence to LHF/SHF. Here we present DeepFlux, a global daily 1° × 1° heat-flux dataset for 29 years (January 1992–December 2020). The dataset is produced with a concise completion-then-retrieval workflow: Special Sensor Microwave/Imager (SSM/I) variables (SSW, cloud liquid water, total column water vapor, and rain rate) are first gap-filled using the AI-based Generalized Data Completion Model (GDCM) to yield spatiotemporally continuous inputs; these – together with Optimum Interpolation SST (OISST) – are then used to retrieve T_a and Q_a via the AI-based Matrices-Points Fusion Network (MPFNet). LHF and SHF are then computed using a bulk algorithm. Validation against in-situ buoy observations shows that the dataset agrees closely with the measurements, with RMSEs of 0.53 °C (T_a), 0.70 g kg⁻¹ (Q_a), 5.53 W m⁻² (SHF), and 25.28 W m⁻² (LHF). Comparisons with widely used flux products indicate differences among products, reflecting variability in flux estimates from different sources. DeepFlux provides an open, consistent, observation-constrained view of near-surface meteorology and air–sea heat exchange for climate diagnostics, model evaluation, and process studies. DeepFlux v1.0 is openly available under CC BY 4.0 at [repository] (DOI: http://dx.doi.org/10.12157/IOCAS.20250823.001). The dataset can also be downloaded without registration from https://zenodo.org/records/17160579.
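The last step of the workflow, computing LHF and SHF from SST, SSW, T_a, and Q_a, can be illustrated with a minimal sketch of a bulk aerodynamic formula. The abstract does not name the bulk algorithm used, so this sketch assumes constant transfer coefficients and constant air density, where a full algorithm would derive them from wind speed and atmospheric stability. The function names, coefficient values, and the Magnus-type saturation formula are illustrative assumptions, not the DeepFlux implementation.

```python
import math

# All constants below are assumed, typical textbook values.
RHO_AIR = 1.2      # air density, kg m^-3 (held constant for simplicity)
CP_AIR = 1004.0    # specific heat of air, J kg^-1 K^-1
L_V = 2.5e6        # latent heat of vaporization, J kg^-1
C_H = 1.2e-3       # sensible heat transfer coefficient (assumed constant)
C_E = 1.2e-3       # latent heat transfer coefficient (assumed constant)


def saturation_specific_humidity(sst_c, pressure_hpa=1013.25):
    """Saturation specific humidity (kg/kg) over seawater at sst_c (deg C)."""
    # Magnus-type saturation vapor pressure in hPa, reduced by 2% for salinity.
    e_s = 0.98 * 6.112 * math.exp(17.62 * sst_c / (243.12 + sst_c))
    return 0.622 * e_s / (pressure_hpa - 0.378 * e_s)


def bulk_fluxes(sst_c, ta_c, qa_gkg, wind_ms):
    """Return (SHF, LHF) in W m^-2, positive from ocean to atmosphere."""
    q_s = saturation_specific_humidity(sst_c)
    q_a = qa_gkg / 1000.0  # convert g/kg to kg/kg
    shf = RHO_AIR * CP_AIR * C_H * wind_ms * (sst_c - ta_c)
    lhf = RHO_AIR * L_V * C_E * wind_ms * (q_s - q_a)
    return shf, lhf
```

Because both fluxes scale with the air–sea differences (SST minus T_a, and saturation humidity minus Q_a), errors in the retrieved T_a and Q_a carry directly into SHF and LHF, which is why the abstract reports RMSEs for all four quantities.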

Received: 05 Sep 2025 – Discussion started: 06 Oct 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 4284 KB)

Supplement (1059 KB)

Status: final response (author comments only)

RC1:
'Comment on essd-2025-545', Anonymous Referee #1, 19 Feb 2026
This is a novel approach to calculating turbulent heat fluxes over the ocean. Aside from some usual manuscript omissions and corrections (noted below), my main concern is the authors' description and understanding of the data sets used, in particular the SSM/I products that they presumably obtained from RSS. In my opinion, they lack a full understanding of this data set and its limitations, but use it at face value. Here are just some of my concerns:
I don't even see a citation to the SSM/I data set used, aside from the very end.

There are numerous SSM/I data sets available, why was this one selected?

What intercalibration did RSS employ to allow for a seamless transition between satellites and from SSM/I to SSMIS?

Starting with F10, DMSP operated two satellites, one around 6 am/6 pm, the other around 10 am/10 pm. You fail to describe or recognize this.

These satellites drift in time; you don't mention this. (You do note the overlap periods in Table 3; some aspects of the overpass times would be nice to include.)

My primary suggestion would be to include more details on this data set and demonstrate your understanding of what you are using. Otherwise, it just seems to be a huge data exercise.
Some general comments (and this is not all of them):
Not all references are in the list and some are out of order, in particular those from Wang et al.

Figures - I find them extremely crowded and too small to read.

Line 56 - replace "bottleneck" with "obstacle"

Lines 97-100: SSM/I was designed to support US Naval operations on a weather scale. It has become a cornerstone for climate studies, but you should be aware of its weather usage. There are numerous applications: hurricane monitoring, heavy rainfall, etc.

The formatting in Table 1 is poor; it's hard to distinguish the various data sets. Could you use a table with gridlines to stratify the information?

I would be consistent in defining ascending and descending orbits throughout the paper - using North and South is not commonly used.
Citation: https://doi.org/10.5194/essd-2025-545-RC1
- AC1: 'Reply on RC1', Haoyu Wang, 11 Mar 2026
  
  Thank you for your recognition of our work and for your suggestions. I have revised the article content in light of the comments from both reviewers. The response document is attached.
  
  Citation: https://doi.org/10.5194/essd-2025-545-AC1
  - RC3: 'Reply on AC1', Anonymous Referee #1, 16 Mar 2026
    
    The authors have done an excellent job addressing my questions and concerns. The manuscript is greatly improved and I find it acceptable.
    
    Citation: https://doi.org/10.5194/essd-2025-545-RC3
    
    - AC3: 'Reply on RC3', Haoyu Wang, 17 Mar 2026
      
      Thank you for recognising our work.
      
      Citation: https://doi.org/10.5194/essd-2025-545-AC3
RC2:
'Comment on essd-2025-545', Anonymous Referee #2, 26 Feb 2026

This paper presents a method and dataset for global ocean heat flux over an almost 30-year period. The method draws on re-analysis data to machine-learn how to "complete" SSMI-series satellite fields of data. From those completed fields for variables such as surface temperature, humidity and wind speeds, a bulk-formulae based module computes fluxes (one has to read another paper to understand how that step is formulated more fully). In comparison with in situ based measurements of these components and associated fluxes, the authors present results suggesting the new product is scientifically competitive with established products. Some analysis of the drivers of long-term trends in the fluxes of the new product is shown.
It is an interesting contribution to the development of better quantification of air-sea fluxes. My critical comments on the work are as follows.
The method of "SSMI completion" is heavily machine-learning led. The approach is presented, inevitably, at a relatively high level. It sounds methodical and reasonable, but nonetheless, with such an approach, many specific design choices are made that affect the result. Other choices could have been made, and this structural uncertainty in the design is not explored. This seems to me to be a general problem with machine-learning approaches, where choices for processing are not really based on physical understanding or hypotheses: the inability to attribute the outcomes to scientific uncertainties, because machine-learning design choices are at least as important.
In this context, the total independence of comparison data from training data becomes crucial. But this is not always easy to be clear about, especially when re-analysis fields have been used as part of the training, as the assimilation products may well have ingested all or much of the comparison data. Comments focussed on and acknowledging any limitations of independence would help on this topic.
I was left a little unclear what the precise measurand heat-flux is. I infer the product aims for a global completed heat flux equivalent to the instantaneous heat fluxes one would be able to retrieve from the satellite data, and that the comparison data are matched to the nearest SSMI comparison time. If so, there is an issue with using the product for long-term change analysis in that there is a subdaily cycle in heat fluxes, and the satellite observation times are not consistent over the full period. (Targeting an explicitly daily-mean heat flux by machine learning might be a useful approach and could be validated against in situ data at high temporal resolution aggregated to daily values.)
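The aggregation step suggested in the parenthesis above, reducing high-frequency in situ values to daily means, could look like the following minimal sketch. The function name `daily_means`, the `(timestamp, value)` record layout, and the `min_samples` threshold of 18 are hypothetical choices made for illustration; nothing here is taken from the manuscript or the review.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(records, min_samples=18):
    """Average (timestamp, value) pairs by the calendar day of each timestamp.

    Days with fewer than min_samples valid values are dropped, so that a
    partially sampled diurnal cycle does not bias the daily mean.
    """
    by_day = defaultdict(list)
    for timestamp, value in records:
        if value is not None:  # skip missing observations
            by_day[timestamp.date()].append(value)
    return {
        day: sum(values) / len(values)
        for day, values in sorted(by_day.items())
        if len(values) >= min_samples
    }
```

For hourly buoy data, a threshold of 18 keeps days with at least three quarters of the diurnal cycle sampled; a stricter or looser threshold trades matchup count against sampling bias.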
OISST is used for SST trends. This is an unfortunate choice among the available options for a long-term SST record, as OISST's operational mode of production is associated with inconsistency of bias referencing over time (Journal of Climate 34, 2923–2939 (2021)), causing relatively out-of-family trends (instability) over the period of this dataset. (See: 10.1175/JCLI-D-20-0793.1 ; https://climate.esa.int/documents/2370/SST_CCI_D5.1_CAR_v1.1-signed.pdf.)
I would like to see in the paper a comparison of the accuracy statistics for matches that were present in the SSMI swaths compared to the infilled times-and-places. This would be a good measure of the effectiveness of the infilling in providing a "daily" complete product.
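The comparison requested here amounts to computing the same accuracy statistic separately for matchups inside an SSM/I swath and for matchups in gap-filled cells. A minimal sketch, assuming each matchup is a hypothetical `(estimate, reference, observed)` tuple in which `observed` flags whether the satellite actually sampled that time and place:

```python
import math


def rmse(pairs):
    """Root-mean-square error of (estimate, reference) pairs."""
    return math.sqrt(sum((est - ref) ** 2 for est, ref in pairs) / len(pairs))


def rmse_by_coverage(matchups):
    """Compute RMSE separately for observed and gap-filled matchups.

    Each matchup is (estimate, reference, observed), where observed is True
    for cells inside a swath and False for cells filled by the completion step.
    Groups with no matchups are omitted from the result.
    """
    groups = {"observed": [], "infilled": []}
    for est, ref, observed in matchups:
        groups["observed" if observed else "infilled"].append((est, ref))
    return {name: rmse(pairs) for name, pairs in groups.items() if pairs}
```

A markedly larger RMSE in the `infilled` group than in the `observed` group would indicate that the gap-filling step, rather than the retrieval itself, dominates the error.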
Overall, the paper is well written and presented. Acronyms are presented inconsistently, sometimes italicised and sometimes not, and are sometimes named differently in figures (e.g. SSW and WS). Table 1 is very confusing and needs to be aligned so that the reader can understand what is connected to what.

Citation: https://doi.org/10.5194/essd-2025-545-RC2
- AC2: 'Reply on RC2', Haoyu Wang, 11 Mar 2026
  
  Thank you for your recognition of our work and for your suggestions. I have revised the article content in light of the comments from both reviewers. The response document is attached.
  
  Citation: https://doi.org/10.5194/essd-2025-545-AC2

Supplement

https://doi.org/10.5194/essd-2025-545-supplement

Data sets

DeepFlux v1.0: A Global Open Oceans Daily Heat Flux Dataset For 1992–2020 From SSMI Satellite Data Using Deep Learning Models Haoyu Wang et al. https://doi.org/10.12157/IOCAS.20250823.001

Model code and software

GDCM Haoyu Wang et al. https://doi.org/10.12157/IOCAS.20250823.001

MPFNet Haoyu Wang et al. https://doi.org/10.12157/IOCAS.20250823.001

Viewed

Total article views: 726 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
469	208	49	726	114	37	72

Views and downloads (calculated since 06 Oct 2025)

Month	HTML	PDF	XML	Total
Oct 2025	181	26	6	213
Nov 2025	62	17	9	88
Dec 2025	24	64	7	95
Jan 2026	82	42	7	131
Feb 2026	39	28	6	73
Mar 2026	81	31	14	126

Cumulative views and downloads (calculated since 06 Oct 2025)

Month	HTML	PDF	XML	Total
Oct 2025	181	26	6	213
Nov 2025	243	43	15	301
Dec 2025	267	107	22	396
Jan 2026	349	149	29	527
Feb 2026	388	177	35	600
Mar 2026	469	208	49	726

Viewed (geographical distribution)

Total article views: 737 (including HTML, PDF, and XML) Thereof 737 with geography defined and 0 with unknown origin.

Latest update: 26 Mar 2026

Short summary

DeepFlux provides a global, gap-free, daily record of air temperature, humidity, and turbulent heat flux from 1992 to 2020. Using satellite data and deep learning, it fills missing observations and delivers continuous estimates. Tests against in situ measurements show it is closer to reality and more reliable than existing products. This open resource supports improved climate studies and model evaluation.
