the Creative Commons Attribution 4.0 License.
Early-life dispersal traits of coastal fishes: an extensive database combining observations and growth models
Marine Di Stefano
David Nerini
Itziar Alvarez
Giandomenico Ardizzone
Patrick Astruch
Gotzon Basterretxea
Aurélie Blanfuné
Denis Bonhomme
Antonio Calò
Ignacio Catalan
Carlo Cattano
Adrien Cheminée
Romain Crec'hriou
Amalia Cuadros
Antonio Di Franco
Carlos Diaz-Gil
Tristan Estaque
Robin Faillettaz
Fabiana C. Félix-Hackradt
José Antonio Garcia-Charton
Paolo Guidetti
Loïc Guilloux
Jean-Georges Harmelin
Mireille Harmelin-Vivien
Manuel Hidalgo
Hilmar Hinz
Jean-Olivier Irisson
Gabriele La Mesa
Laurence Le Diréach
Philippe Lenfant
Enrique Macpherson
Sanja Matić-Skoko
Manon Mercader
Marco Milazzo
Tiffany Monfort
Joan Moranta
Manuel Muntoni
Matteo Murenu
Lucie Nunez
M. Pilar Olivar
Jérémy Pastor
Ángel Pérez-Ruzafa
Serge Planes
Nuria Raventos
Justine Richaume
Elodie Rouanet
Erwan Roussel
Sandrine Ruitton
Ana Sabatés
Thierry Thibaut
Daniele Ventura
Laurent Vigliola
Dario Vrdoljak
Vincent Rossi
Download
- Final revised paper (published on 28 Aug 2024)
- Supplement to the final revised paper
- Preprint (discussion started on 19 Feb 2024)
- Supplement to the preprint
Interactive discussion
Status: closed
RC1: 'Comment on essd-2024-22', Sebastiaan A.L.M. Kooijman, 19 Mar 2024
title: Early-life dispersal traits of coastal fishes: a long-term database combining observations and growth models
authors: Marine Di Stefano et al
journal: Earth System Science Data (ESSD)
date in: 2024/03/19
date out: 2024/03/20
General:
I very much agree with the authors about the ecological relevance of the early life stages of fish, and the need to become better organised on structuring information.
The von Bertalanffy growth curve is indeed very popular, but most authors use the Beverton & Holt formulation that has the extra parameter t_0: the time before birth at which L=0.
This obviously lacks any biological realism/relevance and results from fitting length-at-age data at 1, 2, .. years, excluding the early stages, which deviate from the von Bertalanffy growth curve.
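
For reference, the two formulations contrasted here can be written as below. This is the standard textbook form, not an equation taken from the manuscript; the symbols L_\infty (ultimate length), k and r_B (von Bertalanffy growth rate) and L_b (length at birth) are assumed notation.

```latex
% Beverton & Holt form: the curve is anchored at the hypothetical age t_0 where L = 0
L(t) = L_\infty \left( 1 - e^{-k\,(t - t_0)} \right)

% Form anchored at birth (t = 0, L = L_b): no t_0 is needed
L(t) = L_\infty - (L_\infty - L_b)\, e^{-r_B\, t}

% With k = r_B the two curves coincide when
t_0 = \frac{1}{k} \ln\!\left( 1 - \frac{L_b}{L_\infty} \right) < 0
```

In the second form the curve starts from an observable length at an observable event, so no extrapolation to a length of zero is involved.
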
Only 231 of the 35600 fish species in fishbase have data on egg development, which illustrates the problem.
The paper is well written and hopefully initiates an important data structuring effort.
Development is very sensitive to local temperature, and temperature varies with location, season and, especially, depth of the fish.
While the neonates typically live very close to the surface, later stages live in deeper (and cooler) waters, linked to their food.
Food selection is very much coupled to size, so changes with growth. How is this taken into account?
What are the plans to further develop and maintain the data-base? (see remark for lines 59,76).
Remarks are made about further developments (line 365, section 4.4), but they do not make clear whether or not these will become part of the present data base.
What is the max capacity of Excel sheets? What about searching and extraction options?
Who will do the maintenance?
Is there a mechanism for submitting and curating new submissions? Are there any actions to place the data base under the fishbase-umbrella?
Specific:
7 dates -> data? Later formulations talk about missing information.
59,76 Does "final database" imply that the data base is what it is, and will not be maintained? No further additions?
Or is "the data base in its present state" meant?
85 DEB theory takes "birth" as the event when feeding starts. Many newly hatched larvae have no mouth or their mouth is still closed.
It can take several days before the mouth becomes functional. The significance of this detail in a DEB context is in the investment into reproduction:
the mother paid (via the yolk sac) for all development until the start of feeding.
Weight at birth is in the AmP collection derived from the (typical) volume of an egg at spawning, assuming a specific density of 1 g/cm^3.
The values on egg development in fishbase typically refer to hatch, not to birth.
107 The use of length for the early life stages is a bit tricky, since they can change in shape substantially. DEB theory conserves mass, making mass more valuable than length.
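
The egg-volume conversion mentioned in the remark on line 85 can be sketched as follows. This is a minimal illustration, not code from the AmP collection: the function name, the spherical-egg assumption and the example diameter are all hypothetical; only the specific density of 1 g/cm^3 comes from the remark.

```python
import math


def egg_wet_weight_g(diameter_cm: float, density_g_per_cm3: float = 1.0) -> float:
    """Wet weight (g) of an egg, ASSUMING it is a sphere of the given diameter."""
    # Volume of a sphere expressed through its diameter: V = pi/6 * d^3
    volume_cm3 = math.pi / 6.0 * diameter_cm ** 3
    # Mass = volume * specific density (1 g/cm^3 by default, as in the remark)
    return volume_cm3 * density_g_per_cm3


# Hypothetical example: an egg of 1 mm (0.1 cm) diameter
weight_g = egg_wet_weight_g(0.1)
```

Because weight scales with the cube of the diameter, small errors in the measured egg size translate into roughly three times larger relative errors in the derived weight.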
130 Although the std DEB model simplifies to a von Bertalanffy curve AT CONSTANT FOOD AND TEMPERATURE for length-at-age, it differs from it at varying food.
Many AmP fish-entries have varying food.
The von Bertalanffy growth rate is not a DEB parameter (but can be derived from DEB parameters in combination with food and temperature).
DEB theory also deals with reproduction, while von Bertalanffy (or better Pütter) does not. This can be used to understand that fecundity is approximately proportional to weight,
and increases non-linearly with length.
Table 1: t_j is not a parameter, but depends on DEB parameters and food availability, such that the growth rate at the end of the exponential stage equals that at the start of the vBert stage.
So it is not possible to see the transition in a time-length curve, as is clear from Fig 2.
Fig 2: In DEB theory, length-at-time is more complex during the embryo-stage due to the depletion of reserve.
But since L_b is typically small for ray-finned fish, this might be a detail in the present context.
159 This only applies for a given food type. Since length increase during ontogeny is enormous in ray-finned fish, size-dependent changes in diet are the rule,
rather than the exception, and f is no longer restricted to the interval (0,1).
Fig 3: I could not find the family Tunidae in fishbase or Catalog of life. Do you mean Scombridae?
295 Lika et al 2014 (https://www.zotero.org/groups/500643/deb_library/items/RUCFIFB3/item-list) found a coupling between temperature (summer vs winter spawners)
and the acceleration factor in Mediterranean perciformes. The paper suggests that dispersal is key to this.
374 The abj model is a one-parameter extension of the std model.
If the maturity-parameter E_Hj is close to E_Hb, the length at the end of acceleration L_j will be close to that at birth L_b.
Indeed if E_Hj=E_Hb, we have L_j=L_b and the abj model reduces to the std model.
The choice for std or abj is typically made at the family or higher levels, not at the species level.
All entries for which a std model is fitted can also be fitted with an abj model, with the same or smaller mean relative error.
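
The points made about Table 1 and about the abj model can be illustrated with a small sketch of length-at-age at constant food and temperature: exponential growth from L_b to L_j, followed by von Bertalanffy growth. This is an assumption-laden illustration, not the authors' implementation or AmP code. The function and argument names are hypothetical, reserve dynamics are ignored, and the exponential rate is not an input but is fixed by requiring the same growth rate on both sides of the transition.

```python
import math


def abj_transition(L_b, L_j, L_inf, r_B):
    """Return (r_j, t_j): exponential rate and age at the end of acceleration.

    Neither is a free parameter here: both follow from L_b, L_j, L_inf, r_B.
    """
    # Continuity of dL/dt at L = L_j:  L_j * r_j / 3 = r_B * (L_inf - L_j)
    r_j = 3.0 * r_B * (L_inf - L_j) / L_j
    # Time needed to grow exponentially from L_b to L_j
    t_j = 3.0 * math.log(L_j / L_b) / r_j
    return r_j, t_j


def abj_length(t, L_b, L_j, L_inf, r_B):
    """Length at age t since birth (t >= 0), constant food and temperature."""
    r_j, t_j = abj_transition(L_b, L_j, L_inf, r_B)
    if t < t_j:
        # Acceleration stage: exponential growth in length
        return L_b * math.exp(r_j * t / 3.0)
    # After acceleration: von Bertalanffy growth starting from L_j at t_j
    return L_inf - (L_inf - L_j) * math.exp(-r_B * (t - t_j))
```

Setting L_j equal to L_b gives t_j = 0, and the curve becomes a plain von Bertalanffy curve starting at L_b, which mirrors the reduction of the abj model to the std model described above. Since both length and growth rate are continuous at t_j, the transition leaves no visible kink in a time-length plot.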
408 I fully agree with the remark that the AmP collection has shortcomings and see the data base as part of a long-term maturation process,
where entries with little or unreliable data are replaced and updated.
The hope is that researchers will see the benefits of a high-quality data base, recognize what info is essential, and start to collect data with DEB theory in mind.
429 I very much agree with this remark
Citation: https://doi.org/10.5194/essd-2024-22-RC1
RC2: 'Comment on essd-2024-22', Anonymous Referee #2, 10 Apr 2024
-Manuscript-
I’m not sure the use of “long-term” in the title is fitting. While the database as a whole spans 29 years, most of the data may not come from long-term data sets. I would prefer wording such as “extensive” or “comprehensive” database, the former being how the authors themselves describe the collection on line 47 (“extensive compilation”).
Line 93, change “thanks to” to “using a”; the same line might also explain that this method is described in detail in the following section 2.2.
Line 140, check for consistency in wording across documents and the database: the supplement uses “eggs and small larvae”, while this line uses “eggs and just-hatched larvae”. It would also be helpful if the authors addressed the caveats of using this classification. For example, do they have an opinion on whether most of the species included in this database generally fit this classification model, which was meant as a general concept? This was addressed to some extent in the last sentence on lines 163-164, but I don’t know whether it is also marked as such in the database notes, which would be helpful.
Section 3.1, it would be helpful to also provide summary information on depth. I also wonder whether Fig 4 presents the data in the best way: I would consider displaying each location with an equal-sized dot, so we can get a better feel for the regional comprehensiveness, and coloring the circles based on the number of entries sampled per node, so that the crowding does not overshadow understanding of geographic coverage. Additionally, we don’t have a good feel for how many species are represented in the samples geographically, which would be useful to have visualized.
In general, some of the presentation choices present a challenge, especially because the 110000 entries are all treated the same, even though 15 or so families represent the bulk of the entries and so carry more weight. I find this particularly troublesome when looking at a figure like Fig. 7, because I cannot tell how many of the families, species, etc. in the database I am looking at, or how they weight the observations graphed.
Lastly, over 80% of the dates are “reconstructed”, so it would be helpful if the paper were clearer on best practices for reuse of this information and on caveats. As pointed out in the earlier comment, we already see from Fig. 6 and 7 (which were helpful to include) that the reconstructed dates may underestimate seasonality and/or overestimate PLD, although without knowing the family/species details it is difficult to say outright whether this is really a pattern, since the species-specific details of what is observed versus estimated will critically matter. I am left wanting more context in section 4.3, so that people who want to reuse the database information have some clear recommendations. I appreciate the frankness about mismatches, as presented for example in the last paragraph of section 4.3, but that leads me to ask: what is reliable, what isn’t, and what are the bounds for a sensitivity analysis if the estimates are included in a model, for example?
-CSV file-
The special characters did not display well in the CSV file nor import well into Excel; all of these should be double-checked, please. For example, one series of data had the following data for a site: Marseille - Endoume - Petits Fonds Hétérogènes. Another site was shown as Jávea.
The CSV file was missing direct DOI links for each of the reference datasets (oddly called “Projects”, when I would prefer “reference” or “source”). DOIs to the original data are essential. Without them, it takes multiple steps for someone using the database to get back to the original datasets and read more comprehensive information about the data, because the information has to be pieced together from the references, which can be tedious.
-General considerations-
After reading section 4.4, one is left mostly contemplating what the future of this dataset will be. As is, this mostly seems to be a meta-analysis of 44 studies, which one would anticipate the authors themselves may reuse for different analyses, rather than what end-users would consider a “long-term database”. One is left wondering about the upkeep and broader dissemination. For example, is this information going to be added to fishbase for each species included?
The dataset brings together 44 datasets and provides some new estimates, but perhaps its biggest value is the authors’ identification of significant gaps in geographic, species, and temporal coverage despite having almost 30 years of information. Specific comments about these gaps would be beneficial to include clearly in the abstract and conclusions, as a message and recommendations to the monitoring and research community on improving comprehensiveness going forward.
Citation: https://doi.org/10.5194/essd-2024-22-RC2
RC3: 'Comment on essd-2024-22', Pierre Pepin, 03 May 2024
Review ESSD 2024 22 ELH Dispersal traits database
The manuscript led by Di Stefano, Nerini, Rossi and a suite of collaborators describes and reports on the development of a database of early-life dispersal traits of coastal fishes based on data from the western Mediterranean. The database consists of 111866 entries merged from 44 datasets from sample collections from 1993 to 2021. An extensive set of metadata serves to characterize each entry, which generally consists of the information from a single individual (92823) or several specimens. The database includes information about spawning coordinates (17130) or settlement coordinates (101536). The database includes the reconstruction of missing data using dynamic energy budget (DEB) models.
Overall, this unique database represents a potentially very valuable asset for scientists interested in the study of early life history stages. The collation of data from a variety of studies in a harmonized manner provides an archive of information that would likely not be available otherwise, because data from each investigation could be lost in filing cabinets, dead computer drives, or scattered among various archives that would have to be harmonized by each investigator interested in the material. The use of DEB models to infer spawning date, pelagic larval duration, or settlement dates represents a standardized product that adds value to each observation. As a result, anyone having concerns about the information contained in the database can revisit the approach taken to determine how the overall results could be altered.
The manuscript itself details the approach to harmonizing the information, including an assessment of the collection methods, and provides a comprehensive and repeatable description of the process followed by the contributors. The writing is somewhat uneven, with more grammatical issues (e.g., consistency of verb tense, direct translation of some terms) from the introduction to the results about the nature of the database, while the discussion was rather more consistent in terms of the quality of the writing. The discussion addresses important sources of uncertainty in the database itself and its elements, but one concern not discussed is that collections over a 29-year period, from a variety of locations, may have been affected by the changing status of the ecosystems in which data were collected. The influence of anthropogenic pressures and climate change could be something to evaluate in a more profound assessment of the data, but there should be some discussion about the potential consequences of changing baselines. Dealing with the issue per se is well beyond the scope of the harmonization process, but how changing baselines could affect data collated from multiple sources is seldom considered. This likely represents a minor addition but it should be acknowledged as clearly as the other elements of uncertainty.
My overall assessment is that this work represents a valuable contribution to the scientific community. The data reported in the dataset were developed based on a consistent and repeatable approach. Note that the data are delimited using semicolons, not commas.
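
For readers reusing the file, a minimal loading sketch that accounts for the semicolon delimiter noted here is given below. It uses only the Python standard library. The function name and the file name in the usage line are hypothetical, and the UTF-8 default is an assumption about the file's encoding, not something stated in the review.

```python
import csv


def load_entries(path, encoding="utf-8"):
    """Read a semicolon-delimited CSV file into a list of dicts, one per row."""
    # newline="" is the mode the csv module documentation recommends for reading
    with open(path, newline="", encoding=encoding) as handle:
        # delimiter=";" because the fields are separated by semicolons, not commas
        return list(csv.DictReader(handle, delimiter=";"))


# Hypothetical usage (the file name is an assumption):
# rows = load_entries("database.csv")
```

Stating the encoding explicitly, rather than relying on the platform default, is one way to avoid garbled accented site names of the kind reported in RC2, if those problems come from an encoding mismatch. If the file starts with a byte-order mark, passing encoding="utf-8-sig" strips it.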
Specific comments:
L.36 “…highly studied over several decades with…”
L.50 “…we aim to gather…”
L.52 “…one original aspect of this work…2020) which allows us to enrich observations…”
L.54 “…to provide evidence and interpret…”
L.56 “…data collected over several decades with…”
L.61 “…allow us to describe overall taxonomic and spatiotemporal coverage, and evaluate potential…”
L.71 “Data were collected along…”
L.74 The authors need to define UVC at the first occurrence in the text. Currently, the definition appears at line 200, which caused confusion as I worked my way through the document.
L.111 “One unique feature of this database is the compilation…”
L.123 “…data, which validates and substantiates our choice…”
L.126 “It helps to provide a realistic physiological basis of growth dynamics…”
L.130 The variables describing different aspects of body length should be described in the body of the text, not only in the legend to Table 1 (although they will need to be repeated there).
L.150-155 It is important that the database stands alone, so a summary of the parameters applied in the DEB models should be provided in the supplemental material, and linked to each of the 44 datasets, so that the work can be reproduced independently of the AmP website.
L.154 Lsettl varies among individuals. How is this dealt with in the harmonization of the datasets?
L.157 I don’t understand what is meant by “ranging in reasonable intervals”?
L.160 “…0.7, 1] was selected arbitrarily, because the individuals collected have…”
L.170 Spell out fifty-five: when starting a sentence with a number always use words; same applies when starting with a Latin species name.
L.328 “…researchers from planning fieldwork in coastal areas, particularly for UVC…”
L.333 “…15 days), spawning can only be estimated as occurring at the end of winter and in the spring.”
Citation: https://doi.org/10.5194/essd-2024-22-RC3
AC1: 'Response letter', Marine Di Stefano, 03 Jul 2024