the Creative Commons Attribution 4.0 License.
Optical complexity of North Sea, Baltic Sea, and adjacent coastal and inland waters derived from satellite data
Abstract. Despite advances in remote sensing, consistent monitoring of water quality across freshwater-marine systems remains challenging due to methodological fragmentation. Here, we provide an overview of an exemplary dataset on water quality characteristics in inland waters, coasts, and the open sea estimated from optical satellite data. Specifically, this is Sentinel-3 OLCI data for the entire North Sea and Baltic Sea region for the period June to September 2023. The dataset includes daily aggregated observational data with a spatial resolution of approximately 300 m of reflectance at the top-of-atmosphere and for cloud-free water areas remote-sensing reflectance, inherent optical properties of the water, and an estimation of the concentrations of water constituents, e.g. related to the aquatic carbon content. These are the results of the novel A4O atmospheric correction and the ONNS water algorithm. The dataset serves as a prototype for understanding the processing chain and interdependencies, but also for developing a high degree of connectivity for answering various scientific questions; we do not perform an actual validation of the 73 individual parameters in the dataset. The aim is to show how fragmentation in water quality monitoring along the aquatic continuum from lakes, rivers to the sea can be overcome by applying an optical water type-specific and neural network-based processing scheme for Copernicus satellite data. Emphasis of this work is on analysing the optical complexity of remote-sensing reflectance in the North Sea, Baltic Sea, coastal, and inland waters. Results of a new optical water type classification show that almost all (99.7 %) remote-sensing reflectances delivered by A4O are classifiable and that the region exhibits the full range of optical water diversity. The dataset can serve as a blueprint for a holistic view of the aquatic environment and is a step towards an observation-based digital twin component of the complex system.
Status: final response (author comments only)
- AC1: 'Comment on essd-2025-443 - Information about changes in the underlying dataset', Martin Hieronymi, 25 Sep 2025
RC1: 'Comment on essd-2025-443', Anonymous Referee #1, 06 Nov 2025
The manuscript documents a daily aggregated, Level‑3 dataset produced by merging all available Sentinel‑3 OLCI observations from both S3A and S3B overpasses for each day across the North Sea and Baltic Sea during June to September 2023. The processing chain uses the A4O atmospheric correction followed by the ONNS neural network water processor. The data product includes remote‑sensing reflectance at 16 OLCI bands, a suite of inherent optical properties, concentrations such as chlorophyll, TSM, DOC and POC derived from IOPs, the Forel‑Ule color index, optical water type classifications, and several quality and context flags including cloud masks, adjacency risk, glint risk, bright pixel flags, and a whitecap fraction parameterization. The paper states that this is a prototype release, that no full validation of the many variables is provided here, and that the code will only be released in the medium term. The archived dataset is now at WDCC with a DOI, CC‑BY 4.0 license, and a variables document.
The dataset targets a well known gap in ocean color for optically complex waters at the land sea interface where standard processing is often limited. Using both S3A and S3B improves daily spatial coverage, and a single pipeline across inland and coastal waters is attractive for monitoring and for synoptic biogeochemical analyses. Coupling an OWT framework to both atmospheric correction and water property retrieval is methodologically coherent, and providing OWT outputs alongside geophysical variables is useful for quality screening and for science use. As an ESSD submission, the core value is the accessible, gridded Level‑3 product with metadata, DOIs, and a usage context. These positive aspects are clear in the paper.
The manuscript explicitly states that publication of the code is only planned in the medium term and does not include a Code availability section. ESSD allows data‑only descriptions, yet it strongly encourages deposition of software and algorithms in FAIR repositories and requires a Code availability section when code is part of the work. For complex EO processing pipelines that strongly condition the resulting data, ESSD policy emphasizes transparency and reproducibility as core principles. In its current form, the work falls short of these expectations because independent users cannot reproduce the dataset or verify implementation details of A4O or the specific ONNS configuration used. At minimum, a versioned, citable container image or repository with the exact A4O and ONNS code paths, trained weights, and runtime environment is needed, together with a Code availability section that points to those DOIs.
The retrieval chain uses neural networks and an AC method. Small choices in training data, preprocessing, and band handling materially change IOPs and derived concentrations. Without the code or at least a fully specified ATBD with the exact trained model artifacts, an independent group cannot regenerate the L3 product from S3 Level‑1 data. That limits reuse and undermines the central ESSD promise of transparent, reusable data products.
The ONNS basis is documented in Frontiers in Marine Science and is citable, which is a strength. However, the present chain departs from the 2017 ONNS in key ways. The paper indicates that concentrations now come from ONNS‑derived IOPs rather than directly from class‑specific networks, and that the OWT scheme used here is the newer Bi and Hieronymi framework. Those choices are reasonable, yet they change the forward model and error propagation, so they must be documented with enough specificity to be reproducible. The A4O method has been compared against other ACs, but a full methodological description plus code or trained models are still not publicly archived.
The manuscript is explicit that it does not perform a full validation of the many variables. For an ESSD data description that is acceptable only if adequate demonstration of fitness for purpose is provided and if uncertainties and quality information are delivered in a way that users can apply. Here, the validation evidence is mostly qualitative, which is a weakness. The paper even notes a possibly erroneous blueward tendency of A4O in some conditions and that reflectance magnitude is often underestimated, which is significant because Rrs is the driver for all IOP and concentration products. Users need at least some quantitative, OWT‑stratified matchup statistics versus in situ Rrs and against IOP and concentration measurements, with uncertainty budgets that follow accepted EO data record practice. A concise validation plan can be staged, but the first ESSD version requires some validation.
The variables list in the paper and on WDCC is helpful, but several names and units would benefit from alignment with existing community standards. For NetCDF, CF conventions recommend using standard_name attributes where possible and consistent units and descriptive long_name fields. For ocean color, ESA CCI and NASA ocean color products provide a de facto vocabulary, for example RRS for remote‑sensing reflectance, CHLOR_A for chlorophyll, K_490 for diffuse attenuation at 490 nm, APH for phytoplankton absorption, ADG for CDOM‑plus‑detritus absorption, and BBP for particulate backscattering. The present ONNS variable names such as ONNS_a_g_440, ONNS_b_p_510, and the use of the term Gelbstoff for CDOM are understandable in context but may confuse users who expect CF‑style names and common ocean color acronyms.
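As an illustration of the kind of alignment suggested above, a CF-style attribute mapping could be attached to each variable. The following sketch is illustrative only: the ONNS names come from the paper's variable list, but the proposed target names and attribute values are my assumptions, not taken from the dataset (the chlorophyll standard_name does exist in the CF standard name table).

```python
# Illustrative mapping from ONNS-style variable names to CF/ocean-color
# style metadata. The attribute values are assumptions for illustration,
# not the dataset's own metadata.
CF_STYLE_ATTRS = {
    "ONNS_chl": {
        "proposed_name": "chlor_a",
        "standard_name": "mass_concentration_of_chlorophyll_a_in_sea_water",
        "units": "mg m-3",
        "long_name": "Chlorophyll-a concentration",
    },
    "ONNS_a_g_440": {
        "proposed_name": "ag_440",  # CDOM ("Gelbstoff") absorption
        "units": "m-1",
        "long_name": "CDOM absorption coefficient at 440 nm",
    },
    "ONNS_b_p_510": {
        "proposed_name": "bp_510",
        "units": "m-1",
        "long_name": "Particulate scattering coefficient at 510 nm",
    },
}

def cf_attrs(onns_name):
    """Return the CF-style attributes proposed for an ONNS variable name."""
    return CF_STYLE_ATTRS.get(onns_name, {})
```

Such a table, embedded in the NetCDF files, would let users with CF-aware tooling find variables by standard_name while keeping the original ONNS identifiers intact.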
On reflectance terminology, the manuscript lists A4O_Rrs_n as normalized remote‑sensing reflectance. In ocean color there is potential confusion between fully normalized water‑leaving radiance nLw, remote‑sensing reflectance Rrs, and various normalization schemes. The paper should define exactly what normalization means in A4O, how it differs from standard Rrs, and why the units remain sr‑1. That definition should also be embedded in the NetCDF metadata so that users do not misinterpret the quantity.
The provision of flags is welcome, including cloud masks, cloud risk near edges, adjacency, glint risk, bright pixels, and a special flag for very high biomass or floating algae. The inclusion of a whitecap fraction parameter (A4O_A_wc) is scientifically useful because whitecaps increase broadband water‑leaving signal and can bias retrievals if not handled. The whitecap parameterization is cited to satellite‑based work, which is appropriate. What is missing is a clear, file‑embedded description of how users should combine these flags for robust quality screening and what the recommended filters are for computing spatial or temporal aggregates. Given that the paper acknowledges artifacts near clouds and adjacency and a blueward bias in some regions, the dataset should come with a documented, conservative quality mask and a short tutorial for users.
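A documented, conservative screening rule of the kind requested might look like the following sketch. The flag names and the reject-if-any-flag-is-raised policy are assumptions for illustration, not the authors' documented recommendation.

```python
import numpy as np

def conservative_quality_mask(cloud, cloud_risk, adjacency, glint, bright):
    """Return a boolean mask that is True where a pixel passes all screens.

    Each argument is a boolean array in which True marks an affected pixel.
    The policy here (reject if ANY flag is raised) is deliberately
    conservative and is an assumption, not the dataset's documented rule.
    """
    rejected = cloud | cloud_risk | adjacency | glint | bright
    return ~rejected
```

Spatial or temporal aggregates would then be computed only over pixels where the mask is True; a file-embedded note stating exactly this kind of rule would spare users from guessing.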
The dataset is built from both Sentinel‑3A and Sentinel‑3B OLCI sensors merged to daily Level‑3. That is effectively a dual‑sensor product within a single mission. The title reads "derived from satellite data", which could be interpreted as multisensor across missions. The abstract clarifies that the source is Sentinel‑3 OLCI, and the methods section explicitly states S3A and S3B. To avoid misunderstanding, I recommend reflecting the instrument in the title or at least stating prominently on first mention that this is an OLCI‑only product that merges S3A and S3B.
Summary
ESSD requires a Data availability section and encourages authors to archive software and provide a Code availability section. The paper satisfies data availability through the WDCC DOI. It does not yet satisfy the spirit of ESSD reproducibility for algorithmic data products, because the code is not accessible and the AC is not documented at the ATBD level. ESSD explicitly invites authors to deposit code and even supports literate programming submissions to maximize transparency. This manuscript should follow that guidance for acceptance. The dataset fills a scientific gap but the present paper is not ready for acceptance because reproducibility and quantitative validation are not yet sufficient, and because naming, terminology, and user guidance need revision for broad reuse. If the authors release the processing code, add validation and/or uncertainty descriptions, align variable metadata with CF and common ocean color practice, clarify scope, and document flagging rules, I would recommend acceptance after those changes.
Citation: https://doi.org/10.5194/essd-2025-443-RC1
CC1: 'Comment on essd-2025-443', Elizabeth C Atwood, 22 Dec 2025
General comments:
Excellent overview in the Introduction, albeit with some specific comments below. The method shows good promise, but I have some reservations, noted in the Specific comments below. The speculative additional features of the data (§5) are not especially unique to this dataset. The provided dataset, while impressive, is not validated; this is mentioned in the manuscript and should not be forgotten by potential users (or in future references to this paper).
Specific comments
L50-51: I don't see a result in either reference saying that precipitation alone can be a source of CDOM?
L52: Water movement in rivers and lakes belongs to a different category than ocean currents and should be explicitly included, since your main point is to cover everything from the limnological to the oceanographic.
L71-72: It is the aim to cover all water areas uniformly and with sufficient data quality, not only of this study but of these other services, no matter that there do still exist areas for improvement. I suggest removing “Thus”.
L77: Compromised when a single AC method optimized for a specific water type is used.
L78-79: What about Atwood et al (2024, doi: 10.3390/rs16173267), which focused precisely on transitional water systems? Spyrakos et al (2018, doi: 10.1002/lno.10674) also focused on coastal as well as inland waters.
L90-92: This is not as novel as it is being presented, such a system to continuously monitor from rivers, through coastal areas to the ocean has been demonstrated through the CERTO project.
L113-116: This has good potential for large overrepresentation of more oceanic waters in the training dataset, thus skewing the cluster optimization results toward oceanic water conditions rather than the smaller-by-area transitional water systems.
L144-146: To cite a method for the current study in a manuscript that is still being prepared is questionable.
L155-156: What was the range of time offsets between S-3A and 3B? For coastal systems which are strongly tidal, a difference of over 45 min during the right tidal cycle can greatly change the water optical properties in a single location. This is less of a worry for the Baltic Sea, but certainly for the North Sea, in particular the English Channel. I question then what a mean pixel-reflectance value represents when taken from images at different times/tidal conditions for those coastal regions.
L173: With regard to the SST assumption for same latitude, not including elevation probably makes less of a difference over the study area, but in the next sentence you mention Dead Sea and Great Salt Lake. Surely on a global scale it would further be important to include additional aspects (like elevation, distance from coast, etc) to estimate freshwater system SST, other than just latitude.
L283-284: The c-mean algorithm provides membership values to all classes, not a subset as suggested with “three to six classes usually contributing”. If you aren’t providing memberships to all classes, how do you determine what subset of classes to calculate memberships for?
L285-286: A minimum requirement for total membership of 0.0001? I think this is written differently than it was meant; that would be impossibly low for a summed membership even if unbounded. I also don't find "0.0001" in Hieronymi et al. (2023a). I am going to assume this is meant as a minimum threshold for a valid dominant OWT membership, and I would argue this threshold is far too low. For bounded memberships, i.e. those summing to 1, a good rule of thumb for a minimum membership is 1/C, where C is the total number of OWT classes. This value results if a point is assigned equal membership to all clusters, which suggests it does not belong well to any cluster. This rule of course deviates in the case of unbounded memberships that can sum to more or less than 1, as in Moore et al. (2001), Jackson et al. (2017), and Bi & Hieronymi (2024). But it is argued in Bi & Hieronymi (2024) that memberships should still sum to close to 1 if there is no redundancy or underrepresentation in the classification results. Thus a threshold somewhere close to 1/C should still be advisable, which for the Bi & Hieronymi (2024) OWT class set with 10 classes would be 0.1. The threshold of 0.0001 therefore seems unreasonably low to me. This point also relates to Fig. 12 and the conclusions drawn from it.
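The 1/C rule of thumb above can be made concrete with a small sketch; the function name and the use of -1 for "unclassifiable" are arbitrary illustration choices, not part of either the paper's or the reviewer's method.

```python
import numpy as np

def dominant_owt(memberships, rule_threshold=None):
    """Pick the dominant OWT from fuzzy c-means class memberships.

    Following the 1/C rule of thumb, the default threshold is 1/C, where
    C is the number of classes: equal membership to all clusters
    (max = 1/C) suggests the point belongs to no cluster in particular.
    Returns the class index, or -1 if the maximum membership does not
    exceed the threshold (unclassifiable).
    """
    m = np.asarray(memberships, dtype=float)
    threshold = rule_threshold if rule_threshold is not None else 1.0 / m.size
    idx = int(np.argmax(m))
    return idx if m[idx] > threshold else -1
```

Passing rule_threshold=1e-4 reproduces the paper's stated cutoff, under which almost any nonzero membership vector yields a classification; the contrast with the 1/C default illustrates why the choice of threshold matters for the reported 99.7 % classifiability.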
L343-345: High variability of the OWT class could come from aspects other than shallow water and a visible seafloor – unless that is also changing greatly? The variability could also be due to a location in a narrow water connection between two large water bodies. This also relates to the conclusion on L389 regarding using OWT classes as a mask for optically shallow water areas – does that still hold given the high variability in the OWT class signal?
L360-361: Point well taken that low optical variability over time is important for ground-truthing and SVC, but the point in the parentheses “but rather for clear waters” is confusing. Do you mean that only clearer waters are important for ground truth and SVC? I would argue that optically complex water, but with low variability, would also be important for these efforts so as to cover a larger portion of the full spectrum.
L447-449: It is at the moment speculative whether OWTs serve to better characterize aquatic carbon, including the example OWT 3b bloom, and this should be phrased accordingly.
L504-506: Mélin & Vantrepotte (2015) and Spyrakos et al. (2018) both focus specifically on coastal/transitional water systems, thus your conclusion is not fully supported. Also a critique that 17 classes may be too many simply on the basis of lacking in-situ data is not a very strong argument.
Technical corrections
L55: The sentence starting here rather belongs thematically to the next paragraph.
L100-101: You should indicate the location of Glasgow in Fig 1 if you reference it in the text. Same goes for the Elbe River catchment.
L130: What do the various boxes (white or red) represent? This is not explained in the figure caption.
L184: Suggested rewording for clarity: “Products from climatologies and their derivatives, such as white cap fraction, refer…”
L200: The first three parameters (OWT_AVW, OWT_Area, OWT_NDI) are rather spectral curve characteristics directly from the Rrs, that are not dependent on output from an OWT classification? And OWT_index, is this the dominant OWT class?
L278: Is the AVW meant as in Vandermeulen et al. (2020)? Then it shouldn't be confused with a normal weighted mean; better to change to "weighted harmonic mean" so this remains clear.
L560: Remove either "to" or "must"; the sentence is incorrect with both.
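On the AVW point above (comment on L278): Vandermeulen et al. (2020) define the apparent visible wavelength as the Rrs-weighted harmonic mean of the band wavelengths, not an ordinary weighted mean. A minimal sketch of that definition:

```python
import numpy as np

def apparent_visible_wavelength(rrs, wavelengths):
    """AVW after Vandermeulen et al. (2020):

        AVW = sum_i Rrs(lambda_i) / sum_i (Rrs(lambda_i) / lambda_i)

    i.e. the harmonic mean of the band wavelengths weighted by Rrs.
    """
    rrs = np.asarray(rrs, dtype=float)
    wl = np.asarray(wavelengths, dtype=float)
    return rrs.sum() / (rrs / wl).sum()
```

With equal Rrs at 400 and 600 nm this gives 480 nm (the unweighted harmonic mean), whereas an arithmetic weighted mean would give 500 nm; that gap is exactly why the distinction in terminology matters.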
Citation: https://doi.org/10.5194/essd-2025-443-CC1
RC2: 'Comment on essd-2025-443', Timothy Moore, 29 Dec 2025
General comments
Overall: The present manuscript describes an end-to-end processing chain for generating regional ocean color products from a processing scheme that incorporates flexible choices for atmospheric correction and bio-optical algorithms through a classification scheme developed in a previous manuscript. This comprehensive approach is a realization of how optical water type classification schemes can be fully utilized in a processing chain, both for blending intermediate products and as stand-alone products in and of themselves. The developed scheme covers a wide variety of waters, from inland systems to the coastal and open oceans of the Baltic/North Atlantic region, and addresses the problem of choosing algorithms across such a diversity of systems.
The presentation of the overall manuscript is reasonable. However, notable omissions hinder important aspects of the manuscript. First, the OWT scheme, which is the central 'glue' of the processing chain, is not visualized here and is only referenced to a previous manuscript. The OWTs are never shown or clearly explained in terms of what they represent. A new section should be devoted to this. What are the centroids and distributions of the 3 variables? This could be shown in 3-D space. What do the Rrs shapes look like, what are the spectral/optical characteristics, and why are there 'a' and 'b' subdivisions; are these variations? I think a table and/or figure showing the OWTs is crucial, as these are referenced throughout the paper.
Secondly, an overarching figure showing the scheme would also be very helpful. The 'algorithm' is really a fully developed scheme that incorporates A/C with classification and the production of in-water properties as products. Using the term 'algorithm' to describe the full processing chain is somewhat misleading, as many readers may associate 'algorithm' with a formula for generating a specific bio-optical product from Rrs.
Lastly, the final optical products are not validated with any field data, only qualitatively assessed. Operationally, extra processing effort (= time and computing power) is required for such a processing chain, which is justifiable if it leads to product improvement. Without comparisons against validation data, that question is left dangling. While this has been demonstrated in other, earlier OWT-based studies, it is left unanswered here.
Specific Comments:
L29: PACE now expands wavelength range into the UV
L44: What is meant by the 'view'? Is this a geometric view from satellite, or a philosophical view of ocean color?
L98-104: Not familiar with the geographic landmarks mentioned and they are not indicated on the map
L124: so, 4 months of image data used in total?
L134: What is meant by the 'algorithm'? It sounds like it at least partly includes an atmospheric correction. Should this be the 'processing scheme' or 'algorithm scheme'? Also, the 'used' algorithm could be phrased better: the 'developed' algorithm or the 'proposed' algorithm? A schematic figure showing the processing chain and/or evolution of the 'algorithm' would be a useful (critical!) figure for this section.
L153: the 'used processing chain' is a bit awkward phrasing; alternative suggestions: the 'developed', 'implemented', or 'presented' processing chain. This section is a bit confusing because it now appears that daily averaging is part of the 'algorithm', along with in-water product generation. I really think it needs to be clarified which components are involved in the 'algorithm' - A/C correction, water typing, averaging, product generation (a schematic would help here).
L155-160: Unclear how many pixels were used for developing the OWT scheme.
L165-238: Elements described here could be better visualized if included in an overarching schematic.
L252: 'already favoured'->'promoted' or 'advanced'?
L270: Comment: the OWT scheme is based on satellite data (it was unclear up to this point).
L275: This is an important section and it is a bit unclear what the premise of the OWT scheme is. The statement that 'Rrs is not compared directly..' and '3 optical variables' is critical, but the 3 variables should immediately be listed, as it wasn't totally clear from the following sentences what those were. I recommend listing them directly as 1), 2) and 3) here. It appears that they are OWT_area, AVW, and the NDI? It is unclear how many OWTs there are, and what their characteristics are in terms of the 3 parameters. A figure or table would be helpful.
L293: Need references for 'fuzzy logic approach' or more description of its use.
L326: It is unclear what is meant by 'the variability is broad' - environmental, optical, both? This whole sentence is a bit ambiguous and the phrasing is unclear: the sentence continues with 'possible problems of the water algorithms used', which switches topic from optical variability to (leading to?) higher uncertainties in optical products from the applied algorithms (that is how I interpret this, which could be wrong).
L327: 'ambiguous variability' is ambiguous in itself. Do the authors mean, unpredictable Rrs shapes resulting from errors in atmospheric correction?
L328: '...with A4O having a possibly erroneous tendency towards more blue water (i.e. OWT 1)': this statement is unclear - I'm not sure what this means. Can you clarify and rephrase?
L329: ' The other AC methods would probably result...' - this sounds like conjecture - was this observed?
L337: What is the reason for ascribing OWTs labeled with 'a' and 'b'?
L344: Use of the word 'visible' twice in the same sentence for different reasons is confusing - suggest rephrasing.
L345: The underlying assumptions meaning that algorithms assume optically deep conditions?
L347-348: By 'too blue', is the inference that pixels are erroneously classified? It is unclear what the OWTs represent without a table, figure, or other explanation.
L360: A frequency map of OWTs would illustrate the permanence or variability of areas in relation to OWTs (this could be a sub-figure to Fig. 10 or 11).
L383: 'not shown' implies this was tested, but the statement says '..it would be expected..' - it needs to be clarified whether the authors did in fact try this, or whether this is speculation or based on some other result, which would need references in that case.
L390: It is not evident to the reader at this point what any OWTs look like or represent. This needs to be presented in a section devoted solely to the characteristics of the OWTs earlier in the manuscript.
L396: This statement needs some references.
L414-415: Are these global numbers or over the study area?
L435: What does 'guaranteed' mean? This is a strong statement, and would recommend to modify.
L430-437: Not sure anything new or relevant comes from this section.
Citation: https://doi.org/10.5194/essd-2025-443-RC2
Data sets
Sentinel-3 OLCI daily averaged earth observation data of water constituents Martin Hieronymi et al. https://doi.org/10.26050/WDCC/AquaINFRA_Sentinel3
Viewed

| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 1,415 | 75 | 41 | 1,531 | 43 | 50 |
For your information, we created a new version (2) of the satellite dataset and made it freely available at WDCC with new DOI. This was necessary to comply with the ESSD data license policy. The data is available without restriction under CC-BY 4.0. While reproducing the data, we made minor modifications to some metadata, including adding a reference to the description of the dataset in this ESSD article. The actual data and all values remain unchanged. A change note will be included in the final paper.
Hieronymi, Martin; Bi, Shun; Behr, Daniel (2025). Sentinel-3 OLCI daily averaged earth observation data of water constituents (Version 2). World Data Center for Climate (WDCC) at DKRZ. https://doi.org/10.26050/WDCC/AquaINFRA_Sentinel3_v2