Articles | Volume 18, issue 5
https://doi.org/10.5194/essd-18-3391-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
Bowen ratio-constrained global dataset of bulk air–sea turbulent heat fluxes from 1993 to 2017
Download
- Final revised paper (published on 19 May 2026)
- Supplement to the final revised paper
- Preprint (discussion started on 05 Jun 2025)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on essd-2025-272', Anonymous Referee #1, 19 Jun 2025
  - AC3: 'Reply on RC1', Ronglin Tang, 18 Aug 2025
- RC2: 'Comment on essd-2025-272', Anonymous Referee #2, 03 Jul 2025
  - AC2: 'Reply on RC2', Ronglin Tang, 18 Aug 2025
- EC1: 'Comment on essd-2025-272', Tobias Gerken, 11 Jul 2025
  - AC1: 'Reply on EC1', Ronglin Tang, 18 Aug 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Ronglin Tang on behalf of the Authors (12 Sep 2025)
ED: Referee Nomination & Report Request started (18 Sep 2025) by Tobias Gerken
RR by Anonymous Referee #1 (27 Sep 2025)
RR by Tobias Gerken (17 Dec 2025)
ED: Reconsider after major revisions (17 Dec 2025) by Tobias Gerken
AR by Ronglin Tang on behalf of the Authors (27 Jan 2026)
ED: Referee Nomination & Report Request started (12 Feb 2026) by Tobias Gerken
RR by Anonymous Referee #1 (02 Mar 2026)
ED: Publish subject to minor revisions (review by editor) (12 Mar 2026) by Tobias Gerken
AR by Ronglin Tang on behalf of the Authors (18 Mar 2026)
ED: Publish as is (25 Mar 2026) by Tobias Gerken
AR by Ronglin Tang on behalf of the Authors (28 Mar 2026)
Post-review adjustments
AA – Author's adjustment | EA – Editor approval
AA by Ronglin Tang on behalf of the Authors (13 May 2026)
EA: Adjustments approved (13 May 2026) by Tobias Gerken
Summary and Merit:
Global air-sea flux estimates are useful for understanding the transport of heat and water around the globe. For this dataset, the authors use a physics-constrained, data-driven method to generate a product at moderate resolution (0.25 degrees) covering 1993 to 2017. A key improvement is a realistic representation of the ratio of SHF to LHF. The work is a very interesting exercise and has strong potential to be a useful dataset, but I have a significant concern that I would like to see discussed.
Main comment:
I am not entirely convinced that the training dataset has large enough spatial and temporal coverage for the neural network to generalize accurately and produce a product with global-scale coverage. In particular, Figure 2 suggests that the training observations come disproportionately from the tropical ocean. Outside of the tropics, only the northeast Pacific and North Atlantic appear to have (visually) reasonable coverage. To evaluate performance on “unseen” locations, the authors employ spatially informed cross-validation. While this procedure demonstrates that predictions are reasonably accurate in the different spatial domains that are part of the training set, it does not indicate that predictions will be accurate in regions where no data exist. For instance, many locations in the Southern Hemisphere are presumably characterized by different dynamics than the locations in the training dataset. The comparisons between basins presented later also reflect only the locations in Fig 2, I think. An additional concern is that many of the variables used in training likely have a relationship with air-sea fluxes that is very location-specific.
I do appreciate that the authors attempt to address this issue with the above procedure, but I don’t think this goes far enough. I also acknowledge that this is not an easy comment to address (i.e., more buoy measurements cannot be used if the buoys do not exist). Still, I think the discussion of this could be improved. One idea might be to perform an even more targeted form of cross-validation, e.g., removing one of the isolated locations from training to see how well the neural network performs, and using this to quantify uncertainty. For example, remove the single location south of Australia from training, and see how the NN performs for predictions at that location when only the others are used in training. The current Figures 3-5 lump data together from different regions, so it is not possible to determine how good performance is at the isolated locations. Such an approach could be repeated for other isolated locations to get a general idea of uncertainty at several remote locations not included in training. There are probably other ways to address this as well. In any case, there needs to be some manner of disclaimer: the R values and RMSE shown represent performance at the locations used in training and do not necessarily indicate the same performance in a generalized global sense.
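The targeted cross-validation suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function names (`leave_one_site_out_rmse`, `fit_linear`, `predict_linear`), the site labels, and the synthetic data are all hypothetical, and an ordinary least-squares fit stands in for the neural network.

```python
import numpy as np

def fit_linear(X, y):
    # Stand-in for the neural network (assumption): least squares with an intercept.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef

def leave_one_site_out_rmse(X, y, site_ids, fit=fit_linear, predict=predict_linear):
    """Return {site: RMSE}, where each site is predicted by a model
    trained only on the data from all other sites."""
    scores = {}
    for site in np.unique(site_ids):
        held_out = site_ids == site
        model = fit(X[~held_out], y[~held_out])
        residual = predict(model, X[held_out]) - y[held_out]
        scores[str(site)] = float(np.sqrt(np.mean(residual ** 2)))
    return scores

# Synthetic demo (hypothetical data): two similar sites and one "isolated"
# site whose flux responds differently to the first predictor.
rng = np.random.default_rng(0)
n_per_site = 200
site_ids = np.repeat(np.array(["tropics_a", "tropics_b", "isolated"]), n_per_site)
X = rng.normal(size=(3 * n_per_site, 2))
slope = np.where(site_ids == "isolated", -1.0, 2.0)
y = slope * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=len(X))

scores = leave_one_site_out_rmse(X, y, site_ids)
for site, rmse in scores.items():
    print(f"{site}: held-out RMSE = {rmse:.2f}")
```

Reporting the held-out RMSE per site, rather than pooled across sites, is what would make it possible to see whether the isolated locations are predicted markedly worse than the well-sampled ones.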
Line-by-line comments and suggestions:
Title/abstract – It might be helpful to explicitly mention that these are bulk flux predictions
L66 – typo seriously “imped”
L68 – change “ascribed” to “attributed”
L70-77 – I think this section should be more explicit on what the problems are with existing parameterizations
L78 – clarify what upscaling means in this context
L93 – “patterns”
L103 – I don’t understand what “their synergistic changes” refers to
L107 – ambiguous whether “this work” refers to the 2024 work or the present paper
L118 – “three fold”
L146-161 – I think these datasets should be listed in table form, not as a long paragraph. It would make this much easier to read.
L202 – It might be helpful to clarify that “forcing variables” means the variables used in training the neural network
L214 – I am not sure it’s necessary to list these out in paragraph form. To be concise, it might be better simply to refer to the relevant table.
L276 – I am concerned that the relationships between air sea fluxes and these 11 variables are not globally generalizable.
L316 – Might be helpful to add a short explanation on why you chose these metrics
L363-383, Fig 5 – While performance in terms of RMSE is clearly improved as explained, depending on the application it might be considered a deficiency that BrTHF does not reproduce extreme values of Bowen ratio that we know exist from the observations (i.e. the distribution is not necessarily better represented than the other models). I think this needs to be explicitly discussed.
L400+ – I think it might be useful to compare the performance by basin to the amount of data coverage between basins. This might help explain why the model performed the way it did.
Fig 7 – I would recommend using a color other than blue for the second and third columns. As is, it is confusing that dark blue = poor performance in column 1, but dark blue = good performance in columns 2 and 3.
I also think it should be made very clear that the basins here represent only the buoy locations that are available in those basins, not uniform coverage across them.
L448-449 – Judging from Figure 8, that looks true for all datasets, not just BrTHF. I would recommend clarifying this.
Fig 8-9 – Is there a measure of uncertainty in these long-term averages that could be included on the plots?
L472 – “rest of the products”
L482-483 – I would recommend speculating on which regions or mechanisms may have caused this positive trend, as it differs from the other products.
Sec 3.3 – This section implies that performance between BrTHF and Seaflux-ERA5 is similar, even in regard to Bowen ratio which earlier seemed to be the point of significant improvement for BrTHF. Please comment on this.
Fig 13 – It’s a bit confusing that the labels on the color bar are below the plots on the left. It might be more intuitive to add a title above each subplot rather than a colorbar label.
L553-555 – Do we trust these results, considering that there was significant uncertainty at high latitudes (and the NN was trained on few observations from high latitudes)? Could this be an artifact of the training data/procedure?
L588 – “custom”
L590 – I’m unconvinced that the absence of outliers is an improvement, since outliers exist in the observations. Please comment on this.
L609-618 – Judging from Figure 2, I’m not sure that this isn’t also true for the present dataset
L666 – Performance in terms of SHF/LHF did not clearly look superior based on the plots. Please clarify that the largest improvement is in Bowen ratio.