the Creative Commons Attribution 4.0 License.
Metazoan zooplankton in the Bay of Biscay: a 16-year record of individual sizes and abundances obtained using the ZooScan and ZooCAM imaging systems
Paul Bourriau
Edwin Daché
Marie-Madeleine Danielou
Mathieu Doray
Christine Dupuy
Bertrand Forest
Laetitia Jalabert
Martin Huret
Sophie Le Mestre
Antoine Nowaczyk
Pierre Petitgas
Philippe Pineau
Justin Rouxel
Morgan Tardivel
Jean-Baptiste Romagnan
This paper presents two metazoan zooplankton datasets obtained by imaging samples collected on the Bay of Biscay continental shelf in spring during the PELGAS (PELagique GAScogne) integrated surveys over the 2004–2019 period. The samples were collected at night with a 200 µm mesh-size WP2 net fitted with a Hydrobios (back-run stop) mechanical flowmeter and hauled vertically from the sea floor to the surface, with the maximum depth set at 100 m when the bathymetry was deeper than this. The first dataset originates from samples collected from 2004 to 2016 and imaged on land with the ZooScan and is composed of 1 153 507 imaged and measured objects. The second dataset originates from samples collected from 2016 to 2019 and imaged onboard the R/V Thalassa with the ZooCAM and is composed of 702 111 imaged and measured objects. The imaged objects are composed of zooplankton individuals, zooplankton pieces, non-living particles and imaging artefacts ranging from 300 µm to 3.39 mm in equivalent spherical diameter which were individually imaged, measured and identified. Each imaged object is geolocated and associated with a station, a survey, a year and other metadata. Each object is described by a set of morphological and grey-level-based features (8 bit encoding, 0 = black, 255 = white), including size, that were automatically extracted from each individual image. Each object was taxonomically identified using the web-based application Ecotaxa with built-in random-forest and CNN-based semi-automatic sorting tools, which was followed by expert validation or correction. The objects were sorted into 170 taxonomic and morphological groups. Each dataset features a table combining metadata and data at individual-object granularity from which one can easily derive quantitative population and community descriptors such as abundances, mean sizes, biovolumes, biomasses and size structure. Each object's individual image is provided along with the data.
These two datasets can be used in combination for ecological studies, as the two instruments are interoperable, or they can be used as training sets for ZooScan and ZooCAM users. The data presented here are available at the SEANOE data portal: https://doi.org/10.17882/94052 (ZooScan dataset, Grandremy et al., 2023c) and https://doi.org/10.17882/94040 (ZooCAM dataset, Grandremy et al., 2023d).
Metazoan planktonic organisms, hereafter referred to as zooplankton, encompass an immense diversity of life forms which have successfully colonized the entire ocean, from eutrophic estuarine shallow areas to the oligotrophic open ocean and from the sunlit ocean to hadal depths. Their body sizes span 5 to 6 orders of magnitude in length: from µm to tens of metres (Sieburth and Smetacek, 1978). Zooplankton plays a pivotal role in marine ecosystems (Banse, 1995). It transfers the organic matter produced in the epipelagic domain by photosynthesis to the deeper layers of the ocean (Siegel et al., 2016) by producing fast-sinking aggregates (Turner, 2015) and by diel vertical migration (Steinberg et al., 2000; Ohman and Romagnan, 2016). Zooplankton therefore participates in mitigating the anthropogenic carbon dioxide buildup in the atmosphere that is responsible for climate change. Moreover, zooplankton is an exclusive trophic resource for commercially important fish during their larval stage, so a shift in zooplankton species or phenology can have dramatic effects on recruitment (e.g. for North Sea cod; Beaugrand et al., 2003). In addition, it is a major trophic resource for adult small planktivorous pelagic fish known as forage fish (van der Lingen, 2006). Recent studies suggest that zooplankton dynamics may have a significant effect on small pelagic fish population dynamics and individual body condition (Brosset et al., 2016; Menu et al., 2023) and therefore impact wasp-waist ecosystem-based fisheries and socio-ecosystems that are dependent on those fisheries worldwide (Cury et al., 2000).
Despite zooplankton being of such global importance in both climate change effects on ecosystems and the management of fisheries (Chiba et al., 2018; Lombard et al., 2019), it is still technically difficult to monitor compared to other marine ecological compartments. Zooplankton biomass, diversity and spatio-temporal distributions cannot be estimated from spaceborne sensors, unlike those of phytoplankton (Uitz et al., 2010), and commercial exploitation data of zooplankton do not yet exist, unlike the corresponding data for fish. One noticeable exception is the CPR Survey network, which enables zooplankton data generation at spatio-temporal scales that are fine enough to study climate change and diversity-related zooplanktonic processes (Batten et al., 2019). Yet, generating zooplankton data often requires dedicated surveys at sea, specific sampling instruments and trained taxonomic analysts. Moreover, besides actual observation, modelling zooplankton remains a challenging task due to the diversity of traits, such as life forms, life cycles, body sizes, and physiological processes, exhibited by zooplankton (Mitra and Davis, 2010; Mitra et al., 2014). However, over the past 2 decades, the development of imaging and associated machine-learning semi-automatic identification tools (Irisson et al., 2022) has greatly improved the capability of scientists to analyse long (Feuilloley et al., 2022), high-frequency (Romagnan et al., 2016) or spatially resolved (Grandremy et al., 2023a) zooplankton time series as well as trait-based data (Orenstein et al., 2022). Imaging and machine learning have particularly enabled the increased development of combined size and taxonomy zooplankton ecological studies (e.g. Vandromme et al., 2014; Romagnan et al., 2016; Benedetti et al., 2019). Yet, the use of these machine-learning tools is not trivial because they require abundant, scientifically qualified, sensor-specific training image data (i.e. a learning set and test set; Irisson et al., 2022) and complex hardware and software setups (Panaïotis et al., 2022). One good example of such an image dataset is the ZooScanNet dataset (Elineau et al., 2018), which features an extensive ZooScan (Gorsky et al., 2010) imaging dataset usable as a training set for ecologists as well as for imaging and machine-learning scientists.
The objective of this paper is to present two freely available zooplankton imaging datasets originating from two different instruments, the ZooScan (Gorsky et al., 2010) and the ZooCAM (Colas et al., 2018). These datasets originate from the PELGAS (PELagique GAScogne) integrated survey in the Bay of Biscay (Doray et al., 2018a), a continental shelf ecosystem supporting major European fisheries (ICES, 2021). Combined, these datasets make up a 16-year time series of sized and taxonomically resolved zooplankton, along with context metadata allowing the calculation of quantitative data, covering the whole Bay of Biscay continental shelf from the French coast to the continental slope and from the Basque Country to southern Brittany in spring. These datasets can be used for ecological studies (Grandremy et al., 2023a), machine-learning studies and modelling studies.
2.1 Sampling
Zooplankton samples were collected during successive PELGAS integrated surveys carried out over the Bay of Biscay (BoB) French continental shelf every year in spring from 2004 to 2019 onboard the R/V Thalassa. The aim of this survey is to assess small pelagic fish biomass and monitor the pelagic ecosystem to inform ecosystem-based fisheries management. Fish data, hydrology, phyto- and zooplankton samples, and megafauna sightings (marine mammals and seabirds) are concomitantly collected to build long-term spatially resolved time series of the BoB pelagic ecosystem. The PELGAS sampling protocols combine daytime en-route data collection (small pelagic fish and megafauna) with night-time depth-integrated hydrology and plankton sampling at fixed points. Detailed PELGAS survey protocols can be found in Doray et al. (2018a, 2021). The PELGAS survey datasets providing hydrological, primary producers, fish and megafauna data are available as gridded data in the SEANOE data portal (Doray et al., 2018b) at the following link: https://doi.org/10.17882/53389.
The number of zooplankton samples obtained per year varied between 41 (2005) and 65 (2019) due to adjustments in the sampling strategy and the weather conditions, with 889 zooplankton samples collected in total. From 2004 to 2006, samples were collected in the southern Bay of Biscay up to the Loire estuary only (Fig. 1). Sampling was carried out in vertical tows performed during the night using a 200 µm mesh-size WP2 net, generally from 100 m depth (or 5 m above the seabed) to the surface. In 2004 and 2005, the targeted maximum sampling depth was 200 m. In 2004, 15 samples were collected deeper than 100 m, among which 11 were deeper than 120 m; in 2005, 20 samples were collected deeper than 100 m, among which 13 were deeper than 120 m. Before 2014, the sampled water volume was estimated by multiplying the cable length by the net opening surface (0.25 m²), whereas the net has been equipped with a Hydrobios back-run stop flowmeter since 2014. The samples originating from the 2004 to 2016 surveys were preserved in 4 % formaldehyde (final concentration) and analysed on land in the laboratory with the ZooScan, while they have been analysed live onboard with the ZooCAM since 2016.
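The pre-2014 volume estimate described above is a single multiplication, sketched here in Python. The function name and the 100 m example haul are illustrative assumptions; only the 0.25 m² net opening surface comes from the text, and the sketch does not cover the flowmeter-based estimate used since 2014.

```python
NET_OPENING_M2 = 0.25  # WP2 net opening surface given in the text


def volume_from_cable(cable_length_m: float,
                      opening_m2: float = NET_OPENING_M2) -> float:
    """Estimate the sampled water volume (m^3) as cable length x net opening.

    This mirrors the pre-2014 procedure only; it assumes a strictly
    vertical haul, so the cable length stands in for the towed distance.
    """
    if cable_length_m < 0:
        raise ValueError("cable length must be non-negative")
    return cable_length_m * opening_m2


# Hypothetical 100 m vertical haul
print(volume_from_cable(100.0))
```

With this estimate, any cable angle during the haul would inflate the computed volume, which is one reason a flowmeter reading is generally preferable when available.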
2.2 Sample processing and analyses
2.2.1 Digitization with the ZooScan
Preserved samples were digitized with the ZooScan (Gorsky et al., 2010), a flatbed scanner generating 16 bit grey-level high-resolution images (2400 dpi, pixel size: 10.56 µm, image size: 15 × 24 cm equivalent to 14 200 × 22 700 pixels). It is well suited for the imaging of preserved organisms ranging in size from 300 µm to several centimetres. The ZooScan is run by the custom-made, ImageJ-based, ZooProcess software, which generates a single large image for each scan that contains up to 2000 organisms, depending on the size of the imaged organisms.
Prior to digitization, the seawater and formaldehyde solution was filtered through a 180 µm mesh sieve into a trash tank under a fume hood. The organisms were then gently but thoroughly rinsed with freshwater over the tank in the sieve. They were then size fractionated with a 1 mm sieve into organisms larger and smaller than 1 mm. This size-splitting step is recommended when using the ZooScan as it addresses the possible bias due to the underrepresentation of large objects caused by the necessary subsampling. Each size fraction was subsampled separately with a Motoda splitter to obtain two subsamples containing 500–1000 objects for the large-organism size fraction and 1000–2000 objects for the small-organism size fraction. To mitigate the number of overlapping objects, each subsample was imaged after the manual separation of objects on the scanning tray, as recommended in Vandromme et al. (2012). Overall, 699 samples were digitized following this protocol, corresponding to 1397 scans (one sample was not divided into size fractions as it did not contain organisms larger than 1 mm).
2.2.2 Digitization with the ZooCAM
The ZooCAM is an in-flow imaging instrument designed to digitize preserved as well as live zooplankton samples onboard, immediately after net collection (Colas et al., 2018). The ZooCAM features a cylindrical transparent tank in which the zooplankton sample is mixed with filtered seawater. Depending on the richness of the sample and the subsampling (if necessary), the volume of seawater can be adjusted between 2 and 7 L. The organisms were pumped at 1 L min⁻¹ from the tank to a flow cell inserted between a CCD camera (pixel size: 10.3 µm) and a red LED flashing device, where they were imaged at 16 fps. Given the flow cell volume, the size of the field of view, the imaging frequency and the flow rate, all the seawater volume containing the organisms was imaged (Colas et al., 2018). Once all the initial volume had been imaged, the tank and the tubing were carefully and thoroughly rinsed with filtered seawater to ensure the imaging of all the organisms poured into the tank. For each sample, the ZooCAM generates a stack of small-size (∼ 1 MB) raw images that are subsequently analysed with the ZooCAM software. Depending on the initial water content of the tank and the rinsing, a ZooCAM run can generate up to 10 000 raw images from which the individual organism vignettes will be extracted. A ZooCAM run on a live sample often generates 5000 to 10 000 vignettes of individual organisms. It is very important to subsample the initial samples with a dichotomic splitter (a Motoda splitter was used here) to ensure that the object concentration in subsamples is low enough to reduce the risk of imaging overlapping objects and avoid any dependency on the water volume imaged when reconstructing quantitative estimates of zooplankton, as the initial and rinsing volumes are variable. Overall, 190 samples were digitized live onboard with the ZooCAM.
2.3 Image processing
Both instruments generate grey-level working images (8 bit encoding, 0 = black, 255 = white). In both cases, image processing consisted of (i) a “physical” background homogenization in which an empty background image was subtracted from each sample image (one for ZooScan and as many as there were raw images for ZooCAM), (ii) a thresholding of each raw image (the threshold value was 243 for ZooScan and 240 for ZooCAM), and (iii) the segmentation of each object imaged. The ZooProcess software was set to detect and segment objects with an area equal to or larger than 631 pixels, whereas the ZooCAM software was set to detect objects with an area equal to or larger than 667 pixels, which in both cases equals 300 µm equivalent spherical diameter (ESD), or a biovolume of 0.014 mm³ (using a spherical biovolume model; Vandromme et al., 2012).
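The correspondence between the pixel-area thresholds, the 300 µm ESD and the 0.014 mm³ biovolume can be checked with a few lines of Python. This is a minimal sketch assuming that ESD is the diameter of the disc with the same area as the object and that biovolume is that of a sphere of diameter ESD; the function names are illustrative and are not part of ZooProcess or the ZooCAM software.

```python
import math


def esd_um(area_px: float, pixel_size_um: float) -> float:
    """Equivalent spherical diameter (µm) from an object's area in pixels."""
    area_um2 = area_px * pixel_size_um ** 2
    return 2.0 * math.sqrt(area_um2 / math.pi)


def spherical_biovolume_mm3(esd_in_um: float) -> float:
    """Volume (mm^3) of a sphere whose diameter is the ESD."""
    diameter_mm = esd_in_um / 1000.0
    return math.pi / 6.0 * diameter_mm ** 3


# Detection thresholds and pixel sizes quoted in the text
zooscan_min = esd_um(631, 10.56)   # ZooScan: 631 px at 10.56 µm per pixel
zoocam_min = esd_um(667, 10.3)     # ZooCAM: 667 px at 10.3 µm per pixel

print(round(zooscan_min, 1), round(zoocam_min, 1))
print(round(spherical_biovolume_mm3(300.0), 4))
```

Both thresholds land within about 1 µm of 300 µm, which is why the two different pixel counts describe the same lower size limit.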
Morphological features were then extracted for each detected object. Features generated by the ZooScan are defined in Gorsky et al. (2010) and those generated by the ZooCAM are defined in Colas et al. (2018). ZooScan images were processed with the ZooProcess v7.39 (4 October 2020) open-source software. ZooCAM images were processed with the proprietary ZooCAM custom-made software, which uses the MIL (Matrox Imaging Library, Dorval, Québec, Canada) as the individual-object processing kernel. Each detected object was finally cropped from the working sample images and saved as a unique labelled vignette in a sample-specific folder, along with a sample-specific single text file containing the object features arranged as a table with objects arranged in lines and features in columns.
2.4 Touching objects
ZooProcess features a tool that enables the digital separation of possibly touching objects in the final image dataset for each sample. As touching objects may impair abundance and size-structure estimations (Vandromme et al., 2012), the individual ZooScan vignettes were searched for remaining touching objects, which were then separated manually with the ZooProcess separation tool to improve the quality of subsequent identifications, counts and size-structure estimates. The ZooCAM software does not offer such a tool.
2.5 Taxonomic identification of individual images
All individual vignettes from both instruments were sorted into two instrument-specific separated sets and identified with the help of the online application Ecotaxa (Picheral et al., 2017). Ecotaxa features a random forest algorithm (Breiman, 2001) and a series of instrument-specific tuned spatially sparse convolutional neural networks (Graham, 2014) that were used in a combined approach to predict identifications of unidentified objects. First, an automatic classification of non-identified individual vignettes into coarse zooplankton and non-zooplankton categories was carried out. In both cases (ZooScan and ZooCAM), Ecotaxa hosted instrument-specific image datasets, previously curated and freely available, that were used as initial learning sets. These initial classifications were then visually inspected, manually validated or corrected when necessary, and taxonomically refined when possible. After a few thousand images had been validated in each project, they were used as dataset-specific learning sets to improve the initial coarse automatic identifications. This process was iterated until all the individual vignettes were classified up to their maximum reachable level of taxonomical detail. A subsequent quality check of automatic taxonomic identifications was carried out in a two-step process: first, a complete review (validation and/or correction) of all individual automatic identifications was done by Nina Grandremy and Jean-Baptiste Romagnan; then, trained experts reviewed and curated the ZooScan and the ZooCAM datasets (Laetitia Jalabert handled the ZooScan dataset and Antoine Nowaczyk handled the ZooCAM dataset) at the individual level. Although some identification errors may still remain in the datasets, we consider this double check process to be sufficient to provide taxonomically qualified data.
2.6 Intercalibration of the two instruments
The two datasets are usable separately. However, considered together, they form a 16-year-long spatio-temporal time series. A comparison study was done to ensure these datasets are homogeneous and can thus be combined for ecological studies (Grandremy et al., 2023b). All the zooplankton samples from year 2016 (61 sampling stations over the whole BoB continental shelf) were imaged with both instruments. In brief, all non-zooplankton and touching-object images were removed from the initial datasets. Then, the interoperable size range was determined with an assessment based on a comparison of the normalized biovolume–size spectra (NB-SSs) for the instruments. This size interval ranges between 0.3–3.39 mm ESD. Finally, the zooplankton communities as seen by the ZooScan and the ZooCAM were compared by taxa and by station using 27 taxonomic groups. Poorly represented taxa as well as non-taxonomically identified objects were not taken into account in the computation of zooplankton variables and in community structure analyses. Both instruments showed similar NB-SS slopes for 58 out of 61 stations and depicted equivalent abundances, biovolumes and mean organism sizes as well as similar community compositions for a majority of the sampling stations. They also estimated similar spatial patterns of the zooplankton community at the scale of the Bay of Biscay. However, some taxonomic groups showed discrepancies between instruments, which originate from differences in the sample preparation protocols before image acquisition, the imaging techniques and quality, and whether the samples were imaged live or fixed. For example, the mineralized protists (Rhizaria here) dissolve in formalin and are considered to be underestimated in preserved seawater samples (Biard et al., 2016). 
Also, the random orientation of objects in the ZooCAM flow cell leads to a loss of taxonomic identification accuracy due to the difficulty in spotting the specific features needed for the identification (Colas et al., 2018; Grandremy et al., 2023b). This is particularly acute for copepods, for which the ZooScan seems to provide better identification capabilities for experts, as the organisms are imaged in a lateral view most of the time, whereas the ZooCAM often images them in a non-lateral, randomly oriented view, preventing the visualization of specific features. A detailed discussion about how to explain the discrepancies between the ZooScan and the ZooCAM can be found in Grandremy et al. (2023b). We assume that the two presented datasets form a single 16-year-long spatio-temporal time series of abundances (Fig. 2) and sizes of zooplanktonic organisms (Fig. 3) from which biovolumes, biomasses, the Shannon index (Fig. 4) and the zooplankton community size structure can be derived (Vandromme et al., 2012).
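To make the NB-SS comparison mentioned above more concrete, the following Python sketch computes a normalized biovolume–size spectrum and its log-log slope for one sample. It is an illustration under stated assumptions, not the procedure of Grandremy et al. (2023b): the doubling (octave) size classes, the use of 1.5 times the lower bound as class midpoint, the ordinary least-squares slope and the function names are all choices made here, and the input biovolumes are assumed to be already corrected for subsampling.

```python
import math
from collections import defaultdict

V_MIN_MM3 = 0.014  # biovolume at 300 µm ESD, from the text


def nbss(biovolumes_mm3, sampled_volume_m3, v_min=V_MIN_MM3):
    """Return {class lower bound (mm^3): normalized biovolume}.

    Each class spans [lower, 2 * lower). The normalized biovolume is the
    total biovolume in the class per m^3 of seawater, divided by the
    class width in mm^3.
    """
    totals = defaultdict(float)
    for v in biovolumes_mm3:
        if v < v_min:
            continue  # below the detection threshold
        k = math.floor(math.log2(v / v_min))  # index of the doubling class
        totals[k] += v
    spectrum = {}
    for k in sorted(totals):
        lower = v_min * 2 ** k
        width = lower  # upper bound is 2 * lower, so the width equals lower
        spectrum[lower] = totals[k] / sampled_volume_m3 / width
    return spectrum


def loglog_slope(spectrum):
    """Least-squares slope of log10(NB-SS) against log10(class midpoint)."""
    xs = [math.log10(1.5 * lower) for lower in spectrum]
    ys = [math.log10(value) for value in spectrum.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("at least two size classes are needed")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic sample: equal total biovolume in three successive classes
sample = [0.02] * 8 + [0.04] * 4 + [0.08] * 2
spectrum = nbss(sample, sampled_volume_m3=1.0)
slope = loglog_slope(spectrum)
print(len(spectrum), round(slope, 3))
```

In the synthetic sample each class holds the same total biovolume while the class width doubles, so the normalized spectrum halves from one class to the next and the slope is close to -1. Comparing such slopes station by station is one simple way to check whether two instruments see the same size structure.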
3.1 Taxonomic groups and operational morphological groups
The ZooScan dataset is composed of 1 153 507 zooplankton individuals, zooplankton parts, non-living particles and imaging artefacts individually imaged and measured with the ZooScan and ZooProcess (Gorsky et al., 2010) and sorted into 127 taxonomic and morphological groups. The ZooCAM dataset is composed of 702 111 zooplankton individuals, zooplankton parts, non-living particles and imaging artefacts individually imaged and measured with the ZooCAM (Colas et al., 2018) and sorted into 127 taxonomic and morphological or life-stage groups. The total number of different groups identified with both instruments combined is 170, among which 84 are common to both datasets (Table 1), 43 belong to the ZooScan dataset only and 43 others belong to the ZooCAM dataset only (Table 2). The identified groups were divided into actual taxa and operational morphological groups (OMGs). Typically, OMGs are either non-adult life stages of taxa, aggregated morphological groups, or non-living groups (see Tables 1 and 2). Among the groups common to both instruments, 45 are actual taxa and 39 are OMGs (Table 1). Among the ZooScan-only groups, 22 are taxa and 21 are OMGs, and among the ZooCAM-only groups, 18 are taxa and 25 are OMGs (Table 2).
The differences in identified groups, in the taxa/OMGs ratio and in the associated counts arose from several aspects of the data generation. Firstly, the two imaging methods differ in their technical set-up. The main difference is that fixed organisms are laid down and arranged manually on the imaging sensor and digitized in a lab (i.e. steady 2-D) set-up when using the ZooScan, whereas organisms are imaged live in a moving fluid in a 3-D environment (the flow cell) onboard when digitized with the ZooCAM. Their position in front of the camera may not enable an identification as precise as that achieved when they are laid on the scanner tray (Grandremy et al., 2023b; Colas et al., 2018). Secondly, the datasets are sequential in time: the ZooCAM dataset follows the ZooScan's. Zooplankton communities in the Bay of Biscay may have changed over time, even if their biomass as aggregated groups shows remarkable space-time stability (Grandremy et al., 2023a). Thirdly, we cannot guarantee that there is no adverse effect on taxonomic identification, as validation involved several experts (Culverhouse, 2007). Although we paid great attention to homogenizing the final detailed datasets, we recommend that taxa and OMGs should be aggregated and the biological resolution should be reduced for ecological studies (Grandremy et al., 2023a, b). Additionally, numerous identified and sorted taxa and OMGs do not belong to the metazoan zooplankton or they are non-adult life stages or parts of organisms. Those were included in the presented datasets because they are always found in natural samples. They need to be separated from entire organisms to ensure that abundance estimations are as accurate as possible, and they must be taken into account to ensure accurate biovolume or biomass estimations. A good example is the siphonophore issue: numerous swimming bells of degraded siphonophore individuals can be found and imaged in a sample.
Determining an accurate siphonophore abundance may not be easy, but this could be overcome by considering the biovolume or biomass of siphonophores by adding up the biovolumes or biomasses of the numerous parts of the organisms imaged.
OMG names mainly take the form of two words separated by “_” (underscore). Although we tried to name them as explicitly as possible, a few potentially needed clarifications can be found in Table 3.
3.2 Data and images
3.2.1 Data
The data are divided into two datasets available as tab-separated files, one for each instrument. Within each dataset, the data are organized as a table containing text data as well as numerical data. Each dataset contains a combination of actual data and metadata at individual-object granularity. For each object, the user can find descriptors originating from the image processing (i.e. features), sampling metadata (e.g. latitude and longitude of the sampling station, date and time of sampling, sampling device), and sample processing metadata (e.g. subsampling factor, seawater sampled volume, pixel size) in columns and individual objects in lines. The column headers are defined in Tables A1 and A2 for the ZooCAM and ZooScan datasets respectively. The following prefixes enable the segregation of types of data and metadata: (i) “object_”, which identifies variables assigned to each object individually; (ii) “sample_”, which identifies variables assigned to each sample; (iii) “acq_”, which identifies variables assigned to each data acquisition for the same sample (note that this type of variable is only found in the ZooScan dataset, as ZooScan samples were split into two size fractions corresponding to two acquisitions); and (iv) “process_”, which identifies variables describing key image processing features (e.g. pixel size). Those prefixes originate from the use of the Ecotaxa web application to sort and identify the images (Picheral et al., 2017), which promotes this specific formatting. The ZooCAM dataset is arranged as a matrix with 72 columns (variables) × 702 111 rows (individual imaged objects), and the ZooScan dataset is arranged as a matrix with 71 columns (variables) × 1 153 507 rows (individual imaged objects).
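As an illustration of how the column prefixes can be used, the Python sketch below reads the header of a tab-separated table and groups the column names by prefix. The in-memory table is a stand-in for the real file, and the column names object_area and process_pixel are hypothetical examples chosen here; the actual field names are those listed in Tables A1 and A2.

```python
import csv
import io
from collections import defaultdict

PREFIXES = ("object_", "sample_", "acq_", "process_")


def read_header(handle):
    """Return the first row of a tab-separated file as a list of names."""
    return next(csv.reader(handle, delimiter="\t"))


def group_columns(header):
    """Group column names by the prefixes described in the text."""
    groups = defaultdict(list)
    for name in header:
        for prefix in PREFIXES:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
        else:
            groups["other"].append(name)  # e.g. objid, taxon
    return dict(groups)


# Stand-in for a dataset file; object_area and process_pixel are hypothetical
fake_file = io.StringIO(
    "objid\ttaxon\tobject_area\tsample_tot_vol\tacq_sub_part\tprocess_pixel\n"
    "1\tCalanoida\t900\t25\t4\t10.56\n"
)
groups = group_columns(read_header(fake_file))
for prefix, names in groups.items():
    print(prefix, names)
```

For the real files, `open(path, newline="", encoding="utf-8")` could replace the `io.StringIO` object, although the file encoding is an assumption that should be checked against the archive.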
Among the 70+ variables, it is worth noting the following ones:
- i. objid: this is a unique individual object numerical identifier that enables a single data line to be linked to a corresponding single image in the image dataset.
- ii. taxon: this is the taxonomic or OMG identification of the imaged objects, written as they appear in Tables 1 and 2.
- iii. lineage: this is the full taxonomic lineage of the taxon. Lineage may be used to aggregate taxa at higher taxonomic levels (respecting taxonomic lineages).
- iv. classif_id: this is a unique numerical taxon identifier.
- v. sample_sub_part/acq_sub_part: these are the subsampling ratios for ZooCAM and ZooScan respectively, which are needed to reconstruct the quantitative estimates of the sample abundances.
- vi. sample_fishingvolume/sample_tot_vol: these are the total sampled seawater volumes for ZooCAM and ZooScan respectively, which are needed to normalize the sample concentrations by the seawater volume.
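One can therefore calculate quantitative abundance estimates for a taxon in a sample as follows:
$$\mathrm{Ab} = \frac{n \times \mathrm{sub\_part}}{\mathrm{vol}}$$
where Ab is the abundance in ind m⁻³, n is the number of individuals for “taxon”, sub_part is the subsampling ratio (sample_sub_part or acq_sub_part) and vol is the total sampled seawater volume (sample_fishingvolume or sample_tot_vol).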
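The abundance calculation can be sketched in Python as a sum over individual objects. Two assumptions are made and should be verified against the field definitions shipped with the data: the subsampling ratio is taken to be a multiplication factor (for example 4 when one quarter of the sample was imaged), and a column identifying the sample, named sample_id here, is hypothetical. Summing sub_part / volume object by object gives n × sub_part / volume when all objects share the same ratio, and also handles ZooScan samples whose two size fractions have different ratios.

```python
from collections import defaultdict


def abundances(rows,
               sub_part_col="acq_sub_part",
               volume_col="sample_tot_vol",
               sample_col="sample_id"):  # sample_id is a hypothetical name
    """Return {(sample, taxon): abundance in ind m^-3}.

    `rows` is an iterable of dicts, for instance from csv.DictReader.
    Each object contributes sub_part / volume individuals per m^3.
    """
    result = defaultdict(float)
    for row in rows:
        key = (row[sample_col], row["taxon"])
        result[key] += float(row[sub_part_col]) / float(row[volume_col])
    return dict(result)


# Illustrative rows: two objects from one size fraction (ratio 4),
# one from the other (ratio 2), all from a 25 m^3 haul
rows = [
    {"sample_id": "S1", "taxon": "Calanoida", "acq_sub_part": "4", "sample_tot_vol": "25"},
    {"sample_id": "S1", "taxon": "Calanoida", "acq_sub_part": "4", "sample_tot_vol": "25"},
    {"sample_id": "S1", "taxon": "Calanoida", "acq_sub_part": "2", "sample_tot_vol": "25"},
    {"sample_id": "S1", "taxon": "detritus", "acq_sub_part": "2", "sample_tot_vol": "25"},
]
result = abundances(rows)
print(result)
```

For the ZooCAM table, the same function could be called with `sub_part_col="sample_sub_part"` and `volume_col="sample_fishingvolume"`. Non-living categories such as the illustrative "detritus" row would normally be filtered out before reporting zooplankton abundances.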
3.2.2 Images
There are two sets of individual images sorted into folders by category (Tables 1 and 2) in each dataset. For the ZooCAM only, the images from years 2016 and 2017 contain printed region of interest (ROI) bounding-box limits and text at the bottom of each image and a non-homogenized background within and around the ROI bounding box; images from year 2018 contain a non-homogenized background within the ROI bounding box only; and images from 2019 have a completely homogeneous and thresholded background around the object. These differences arose from successive ZooCAM software updates that did not modify the calculation of an object's features. The ZooScan images all have a completely homogeneous and thresholded background around the object, with no bounding-box limits nor text printed in the images. All images for the two instruments' datasets have a 1 mm scale bar printed at the bottom left corner.
The ZooScan dataset can be found as the PELGAS Bay of Biscay ZooScan zooplankton Dataset (2004–2016) at the SEANOE data portal (https://doi.org/10.17882/94052; Grandremy et al., 2023c). Individual object images can be freely viewed and explored by anyone using the Ecotaxa web application (https://ecotaxa.obs-vlfr.fr/, Picheral et al., 2017; no registration): search for the project name “PELGAS Bay of Biscay ZooScan zooplankton Dataset (2004–2016)” under the tab “explore images”.
The ZooCAM dataset can be found as the PELGAS Bay of Biscay ZooCAM zooplankton Dataset (2016–2019) at the SEANOE data portal (https://doi.org/10.17882/94040; Grandremy et al., 2023d). Individual object images can be freely viewed and explored by anyone using the Ecotaxa web application (https://ecotaxa.obs-vlfr.fr/; no registration): search for the project name “PELGAS Bay of Biscay ZooCAM zooplankton Dataset (2016–2019)” under the tab “explore images”.
Each dataset comes as a .zip archive that contains
- one tab-separated file containing all data and metadata associated with each imaged and identified object
- one comma-separated file containing the name, type, definition and unit of each field (column)
- one comma-separated file containing the taxonomic list of the dataset with counts and the nature of the content of the category
- a directory named individual_images containing images of individual objects that are named according to the object id objid and sorted into subdirectories according to taxonomic identification, year and sampling station.
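Since images are named after objid, a data line can be linked to its image with a recursive search of the individual_images directory. The Python sketch below makes no assumption about the nesting order of the taxon, year and station subdirectories or about the image file extension, neither of which is specified here; the function name and the example path are illustrative.

```python
from pathlib import Path


def find_image(images_root, objid):
    """Return the first file under images_root named '<objid>.<ext>', or None.

    The search is recursive, so it does not depend on how the taxon,
    year and station subdirectories are nested.
    """
    for path in Path(images_root).rglob(f"{objid}.*"):
        if path.is_file():
            return path
    return None


# Hypothetical usage once the archive has been extracted:
# image_path = find_image("individual_images", 12345)
```

A recursive search over several hundred thousand files is slow if repeated for every object; building a dictionary from file stem to path once, and then looking up each objid in it, would be a more practical design for bulk work.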
Recent studies showed that small pelagic fish (SPF) communities have suffered from a drastic loss of condition in the Mediterranean Sea and in the Bay of Biscay (Van Beveren et al., 2014; Doray et al., 2018d; Saraux et al., 2019) over the last 20 years. This loss of condition is expressed in particular as a constant decrease in SPF size- and weight-at-age (Doray et al., 2018d; Véron et al., 2020) and is possibly explained by a change in SPF trophic resource composition, size and quality (Brosset et al., 2016; Queiros et al., 2019; Menu et al., 2023). Identifying and measuring zooplankton at appropriate temporal and spatial scales is not an easy task, but it can be addressed with imaging. These datasets were assembled in an effort to make possible the exploration of the relationship between the observed dynamics of SPF in the Bay of Biscay and the dynamics of their main food resource, metazoan zooplankton. This zooplankton imaging data series is a significant output of Nina Grandremy's PhD (2019–2023), and is currently being exploited (Grandremy et al., 2023a), and the intention is to continue this series and update it on a yearly basis in the framework of the PELGAS program to better understand the underlying processes presiding over the long-term SPF dynamics. Moreover, those two zooplankton datasets can be associated with the PELGAS survey datasets published in 2018 (which can also be found at the SEANOE data portal), which feature hydrological, primary producers, fish and megafauna data arranged as gridded data (Doray et al., 2018b). Together, all these datasets allow all of the pelagic ecosystem compartments to be studied simultaneously with a coherent spatial domain (the Bay of Biscay continental shelf), resolution and time series. Nevertheless, spatial gridding of the data is highly recommended (as presented in Figs. 2, 3 and 4), since the spatial coverage of the sampling protocols can vary between years (Fig. 
1) within and between each pelagic ecosystem compartment. A procedure for such batch data spatial smoothing is presented in, for example, Petitgas et al. (2009, 2014). See also Doray et al. (2018c) and Grandremy et al. (2023a) for application examples. As several descriptors of the spring zooplankton community (abundances, sizes, biovolumes, biomass) can be derived from this 16-year-long spatially resolved time series at several taxonomic levels, these datasets are intended to be used in various ecological studies that include the zooplankton compartment, especially in modelling studies, where zooplankton is usually underrepresented (Mitra and Davis, 2010; Mitra et al., 2014). Finally, these datasets could also be applied as learning datasets when machine learning is used in plankton studies.
NG scanned and validated most of the ZooScan dataset, assembled the datasets, and led the drafting. PB collected and managed the samples from 2004 on and participated in the manual validation of identifications. ED scanned a substantial fraction of the ZooScan samples and participated in the initial sorting of vignettes. MMD participated in the collection of samples and was involved in the development of the ZooCAM. MD was chief scientist on the PELGAS surveys and participated in the drafting. CD supervised NG's work and participated in the drafting. BF developed, improved and maintained the ZooCAM software. LJ curated a substantial fraction of the manual validations of identifications for the ZooScan dataset. MH participated in the collection of samples, led the DEFIPEL project and participated in the drafting. SLM participated in the collection of samples and managed the ZooCAM. AN curated a substantial fraction of the manual validations of identifications for the ZooScan and the ZooCAM datasets. PP supervised NG's work and participated in the drafting. PhP participated in the collection of samples and participated in the drafting. JR supervised the development and improvement of the ZooCAM. MT developed and improved the ZooCAM and participated in the collection of samples. JBR supervised NG's work, participated in the collection of samples, curated a substantial fraction of the manual validations of identifications for the ZooCAM dataset, and led the drafting.
The contact author has declared that none of the authors has any competing interests.
Data are published without any warranty, express or implied. The user assumes all risk arising from his/her use of data. Data are intended to be research-quality, but it is possible that the data themselves contain errors. It is the sole responsibility of the user to assess if the data are appropriate for his/her use, and to interpret the data accordingly. Authors welcome users to ask questions and report problems.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
Nina Grandremy acknowledges the funding of her PhD by Region Pays de la Loire, France and Ifremer. The authors wish to thank Jean-Yves Coail, Gérard Guyader and Patrick Berriet (Ifremer – Département Ressources physiques et Ecosystèmes de fond de Mer (REM), Unité Recherches et Développements Technologiques (RDT), and Service Ingénierie et Instrumentation Marine (SIIM)) for their contribution to the hardware assembly of the ZooCAM. The authors acknowledge the work of Elio Raphalen in scanning the 2005 samples. The authors thank the European Marine Biological Resource Centre (EMBRC) platform PIQs (Quantitative Imaging Platform of Villefranche-sur-Mer) for image analysis. This work was supported by EMBRC France, whose French state funds are managed by the French National Research Agency within the Investments of the Future program under reference ANR-10-INBS-02. Finally, the authors also wish to thank the many other students, technicians and scientists who participated in the sampling and sample imaging onboard, and the successive crews of the R/V Thalassa involved in the PELGAS surveys from 2004 to 2019.
This research has been supported by France Filière Pêche (Enjeux d'avenir) through the project DEFIPEL (Développement d'une approche intégrée pour la filière petits pélagiques française).
This paper was edited by François G. Schmitt and reviewed by two anonymous referees.
Ahlstrom, E. H.: Studies on the Pacific Pilchard Or Sardine (Sardinops Caerulea): Influence of Temperature on the Rate of Development of Pilchard Eggs in Nature, U.S. Fish and Wildlife Service, 206 pp., 1943.
Banse, K.: Zooplankton: Pivotal role in the control of ocean production: I. Biomass and production, ICES J. Mar. Sci., 52, 265–277, https://doi.org/10.1016/1054-3139(95)80043-3, 1995.
Batten, S. D., Abu-Alhaija, R., Chiba, S., Edwards, M., Graham, G., Jyothibabu, R., Kitchener, J. A., Koubbi, P., McQuatters-Gollop, A., Muxagata, E., Ostle, C., Richardson, A. J., Robinson, K. V., Takahashi, K. T., Verheye, H. M., and Wilson, W.: A Global Plankton Diversity Monitoring Program, Front. Mar. Sci., 6, 321, https://doi.org/10.3389/fmars.2019.00321, 2019.
Beaugrand, G., Brander, K. M., Lindley, J. A., Souissi, S., and Reid, P. C.: Plankton effect on cod recruitment in the North Sea, Nature, 426, 661–664, https://doi.org/10.1038/nature02164, 2003.
Benedetti, F., Jalabert, L., Sourisseau, M., Becker, B., Cailliau, C., Desnos, C., Elineau, A., Irisson, J.-O., Lombard, F., Picheral, M., Stemmann, L., and Pouline, P.: The Seasonal and Inter-Annual Fluctuations of Plankton Abundance and Community Structure in a North Atlantic Marine Protected Area, Front. Mar. Sci., 6, 214, https://doi.org/10.3389/fmars.2019.00214, 2019.
Biard, T., Stemmann, L., Picheral, M., Mayot, N., Vandromme, P., Hauss, H., Gorsky, G., Guidi, L., Kiko, R., and Not, F.: In situ imaging reveals the biomass of giant protists in the global ocean, Nature, 532, 504–507, https://doi.org/10.1038/nature17652, 2016.
Breiman, L.: Random Forests, Mach. Learn., 45, 5–32, https://doi.org/10.1023/A:1010933404324, 2001.
Brosset, P., Le Bourg, B., Costalago, D., Bănaru, D., Van Beveren, E., Bourdeix, J.-H., Fromentin, J.-M., Ménard, F., and Saraux, C.: Linking small pelagic dietary shifts with ecosystem changes in the Gulf of Lions, Mar. Ecol. Prog. Ser., 554, 157–171, https://doi.org/10.3354/meps11796, 2016.
Chiba, S., Batten, S., Martin, C. S., Ivory, S., Miloslavich, P., and Weatherdon, L. V.: Zooplankton monitoring to contribute towards addressing global biodiversity conservation challenges, J. Plankton Res., 40, 509–518, https://doi.org/10.1093/plankt/fby030, 2018.
Colas, F., Tardivel, M., Perchoc, J., Lunven, M., Forest, B., Guyader, G., Danielou, M. M., Le Mestre, S., Bourriau, P., Antajan, E., Sourisseau, M., Huret, M., Petitgas, P., and Romagnan, J. B.: The ZooCAM, a new in-flow imaging system for fast onboard counting, sizing and classification of fish eggs and metazooplankton, Prog. Oceanogr., 166, 54–65, https://doi.org/10.1016/j.pocean.2017.10.014, 2018.
Culverhouse, P. F.: Human and machine factors in algae monitoring performance, Ecol. Inform., 2, 361–366, https://doi.org/10.1016/j.ecoinf.2007.07.001, 2007.
Cury, P., Bakun, A., Crawford, R. J. M., Jarre, A., Quiñones, R. A., Shannon, L. J., and Verheye, H. M.: Small pelagics in upwelling systems: patterns of interaction and structural changes in “wasp-waist” ecosystems, ICES J. Mar. Sci., 57, 603–618, https://doi.org/10.1006/jmsc.2000.0712, 2000.
Doray, M., Huret, M., Authier, M., Duhamel, E., Romagnan, J.-B., Dupuy, C., Spitz, J., Sanchez, F., Berger, L., Dorémus, G., Bourriau, P., Grellier, P., Pennors, L., Masse, J., and Petitgas, P.: Gridded maps of pelagic ecosystem parameters collected in the Bay of Biscay during the PELGAS integrated survey, SEANOE [data set], https://doi.org/10.17882/53389, 2018a.
Doray, M., Petitgas, P., Huret, M., Duhamel, E., Romagnan, J. B., Authier, M., Dupuy, C., and Spitz, J.: Monitoring small pelagic fish in the Bay of Biscay ecosystem, using indicators from an integrated survey, Prog. Oceanogr., 166, 168–188, https://doi.org/10.1016/j.pocean.2017.12.004, 2018b.
Doray, M., Hervy, C., Huret, M., and Petitgas, P.: Spring habitats of small pelagic fish communities in the Bay of Biscay, Prog. Oceanogr., 166, 88–108, https://doi.org/10.1016/j.pocean.2017.11.003, 2018c.
Doray, M., Petitgas, P., Romagnan, J. B., Huret, M., Duhamel, E., Dupuy, C., Spitz, J., Authier, M., Sanchez, F., Berger, L., Dorémus, G., Bourriau, P., Grellier, P., and Massé, J.: The PELGAS survey: Ship-based integrated monitoring of the Bay of Biscay pelagic ecosystem, Prog. Oceanogr., 166, 15–29, https://doi.org/10.1016/j.pocean.2017.09.015, 2018d.
Doray, M., Boyra, G., and van der Kooij, J.: ICES Survey Protocols – Manual for acoustic surveys coordinated under ICES Working Group on Acoustic and Egg Surveys for Small Pelagic Fish (WGACEGG), https://doi.org/10.17895/ICES.PUB.7462, 2021.
Elineau, A., Desnos, C., Jalabert, L., Olivier, M., Romagnan, J.-B., Costa Brandao, M., Lombard, F., Llopis, N., Courboulès, J., Caray-Counil, L., Serranito, B., Irisson, J.-O., Picheral, M., Gorsky, G., and Stemmann, L.: ZooScanNet: plankton images captured with the ZooScan, SEANOE [data set], https://doi.org/10.17882/55741, 2018.
Feuilloley, G., Fromentin, J.-M., Saraux, C., Irisson, J.-O., Jalabert, L., and Stemmann, L.: Temporal fluctuations in zooplankton size, abundance, and taxonomic composition since 1995 in the North Western Mediterranean Sea, ICES J. Mar. Sci., 79, 882–900, https://doi.org/10.1093/icesjms/fsab190, 2022.
Gorsky, G., Ohman, M. D., Picheral, M., Gasparini, S., Stemmann, L., Romagnan, J.-B., Cawood, A., Pesant, S., García-Comas, C., and Prejger, F.: Digital zooplankton image analysis using the ZooScan integrated system, J. Plankton Res., 32, 285–303, https://doi.org/10.1093/plankt/fbp124, 2010.
Graham, B.: Spatially-sparse convolutional neural networks, arXiv [preprint], https://doi.org/10.48550/arXiv.1409.6070, 2014.
Grandremy, N., Romagnan, J.-B., Dupuy, C., Doray, M., Huret, M., and Petitgas, P.: Hydrology and small pelagic fish drive the spatio–temporal dynamics of springtime zooplankton assemblages over the Bay of Biscay continental shelf, Prog. Oceanogr., 210, 102949, https://doi.org/10.1016/j.pocean.2022.102949, 2023a.
Grandremy, N., Dupuy, C., Petitgas, P., Le Mestre, S., Bourriau, P., Nowaczyk, A., Forest, B., and Romagnan, J.-B.: The ZooScan and the ZooCAM zooplankton imaging systems are intercomparable: A benchmark on the Bay of Biscay zooplankton, Limnol. Oceanogr.-Meth., 21, 718–733, https://doi.org/10.1002/lom3.10577, 2023b.
Grandremy, N., Bourriau, P., Daché, E., Danielou, M. M., Doray, M., Dupuy, C., Huret, M., Jalabert, L., Le Mestre, S., Nowaczyk, A., Petitgas, P., Pineau, P., Raphalen, E., and Romagnan, J.-B.: PELGAS Bay of Biscay ZooScan zooplankton Dataset (2004–2016), SEANOE [data set], https://doi.org/10.17882/94052, 2023c.
Grandremy, N., Bourriau, P., Danielou, M. M., Doray, M., Dupuy, C., Forest, B., Huret, M., Le Mestre, S., Nowaczyk, A., Petitgas, P., Pineau, P., Rouxel, J., Tardivel, M., and Romagnan, J.-B.: PELGAS Bay of Biscay ZooCAM zooplankton Dataset (2016–2019), SEANOE [data set], https://doi.org/10.17882/94040, 2023d.
ICES: Bay of Biscay and Iberian Coast ecoregion – Fisheries overview, ICES Advice: Fisheries Overviews, https://doi.org/10.17895/ices.advice.9100, 2021.
Irisson, J.-O., Ayata, S.-D., Lindsay, D. J., Karp-Boss, L., and Stemmann, L.: Machine Learning for the Study of Plankton and Marine Snow from Images, Annu. Rev. Mar. Sci., 14, 277–301, https://doi.org/10.1146/annurev-marine-041921-013023, 2022.
Lombard, F., Boss, E., Waite, A. M., Vogt, M., Uitz, J., Stemmann, L., Sosik, H. M., Schulz, J., Romagnan, J.-B., Picheral, M., Pearlman, J., Ohman, M. D., Niehoff, B., Möller, K. O., Miloslavich, P., Lara-Lopez, A., Kudela, R., Lopes, R. M., Kiko, R., Karp-Boss, L., Jaffe, J. S., Iversen, M. H., Irisson, J.-O., Fennel, K., Hauss, H., Guidi, L., Gorsky, G., Giering, S. L. C., Gaube, P., Gallager, S., Dubelaar, G., Cowen, R. K., Carlotti, F., Briseño-Avena, C., Berline, L., Benoit-Bird, K., Bax, N., Batten, S., Ayata, S. D., Artigas, L. F., and Appeltans, W.: Globally Consistent Quantitative Observations of Planktonic Ecosystems, Front. Mar. Sci., 6, 196, https://doi.org/10.3389/fmars.2019.00196, 2019.
Menu, C., Pecquerie, L., Bacher, C., Doray, M., Hattab, T., van der Kooij, J., and Huret, M.: Testing the bottom-up hypothesis for the decline in size of anchovy and sardine across European waters through a bioenergetic modeling approach, Prog. Oceanogr., 210, 102943, https://doi.org/10.1016/j.pocean.2022.102943, 2023.
Mitra, A. and Davis, C.: Defining the “to” in end-to-end models, Prog. Oceanogr., 84, 39–42, https://doi.org/10.1016/j.pocean.2009.09.004, 2010.
Mitra, A., Castellani, C., Gentleman, W. C., Jónasdóttir, S. H., Flynn, K. J., Bode, A., Halsband, C., Kuhn, P., Licandro, P., Agersted, M. D., Calbet, A., Lindeque, P. K., Koppelmann, R., Møller, E. F., Gislason, A., Nielsen, T. G., and St. John, M.: Bridging the gap between marine biogeochemical and fisheries sciences; configuring the zooplankton link, Prog. Oceanogr., 129, 176–199, https://doi.org/10.1016/j.pocean.2014.04.025, 2014.
Moser, H. G. and Ahlstrom, E. H.: Staging anchovy eggs, National Marine Fisheries Service, Southwest Fisheries Center, NOAA, P.O. Box 271, La Jolla, CA 92038, NOAA Tech. Rep. NMFS, 36, 37–41, 1985.
Ohman, M. D. and Romagnan, J.-B.: Nonlinear effects of body size and optical attenuation on Diel Vertical Migration by zooplankton, Limnol. Oceanogr., 61, 765–770, https://doi.org/10.1002/lno.10251, 2016.
Orenstein, E. C., Ayata, S., Maps, F., Becker, É. C., Benedetti, F., Biard, T., de Garidel-Thoron, T., Ellen, J. S., Ferrario, F., Giering, S. L. C., Guy-Haim, T., Hoebeke, L., Iversen, M. H., Kiørboe, T., Lalonde, J., Lana, A., Laviale, M., Lombard, F., Lorimer, T., Martini, S., Meyer, A., Möller, K. O., Niehoff, B., Ohman, M. D., Pradalier, C., Romagnan, J., Schröder, S., Sonnet, V., Sosik, H. M., Stemmann, L. S., Stock, M., Terbiyik-Kurt, T., Valcárcel-Pérez, N., Vilgrain, L., Wacquet, G., Waite, A. M., and Irisson, J.: Machine learning techniques to characterize functional traits of plankton from image data, Limnol. Oceanogr., 67, 1647–1669, https://doi.org/10.1002/lno.12101, 2022.
Panaïotis, T., Caray–Counil, L., Woodward, B., Schmid, M. S., Daprano, D., Tsai, S. T., Sullivan, C. M., Cowen, R. K., and Irisson, J.-O.: Content-Aware Segmentation of Objects Spanning a Large Size Range: Application to Plankton Images, Front. Mar. Sci., 9, 870005, https://doi.org/10.3389/fmars.2022.870005, 2022.
Petitgas, P., Goarant, A., Massé, J., and Bourriau, P.: Combining acoustic and CUFES data for the quality control of fish-stock survey estimates, ICES J. Mar. Sci., 66, 1384–1390, https://doi.org/10.1093/icesjms/fsp007, 2009.
Petitgas, P., Doray, M., Huret, M., Massé, J., and Woillez, M.: Modelling the variability in fish spatial distributions over time with empirical orthogonal functions: anchovy in the Bay of Biscay, ICES J. Mar. Sci., 71, 2379–2389, https://doi.org/10.1093/icesjms/fsu111, 2014.
Picheral, M., Colin, S., and Irisson, J.-O.: EcoTaxa, a tool for the taxonomic classification of images, https://ecotaxa.obs-vlfr.fr/ (last access: 1 June 2023), 2017.
Queiros, Q., Fromentin, J.-M., Gasset, E., Dutto, G., Huiban, C., Metral, L., Leclerc, L., Schull, Q., McKenzie, D. J., and Saraux, C.: Food in the Sea: Size Also Matters for Pelagic Fish, Front. Mar. Sci., 6, 385, https://doi.org/10.3389/fmars.2019.00385, 2019.
Romagnan, J. B., Aldamman, L., Gasparini, S., Nival, P., Aubert, A., Jamet, J. L., and Stemmann, L.: High frequency mesozooplankton monitoring: Can imaging systems and automated sample analysis help us describe and interpret changes in zooplankton community composition and size structure – An example from a coastal site, J. Marine Syst., 162, 18–28, https://doi.org/10.1016/j.jmarsys.2016.03.013, 2016.
Saraux, C., Van Beveren, E., Brosset, P., Queiros, Q., Bourdeix, J.-H., Dutto, G., Gasset, E., Jac, C., Bonhommeau, S., and Fromentin, J.-M.: Small pelagic fish dynamics: A review of mechanisms in the Gulf of Lions, Deep-Sea Res. Pt. II, 159, 52–61, https://doi.org/10.1016/j.dsr2.2018.02.010, 2019.
Sieburth, J. McN., Smetacek, V., and Lenz, J.: Pelagic ecosystem structure: Heterotrophic compartments of the plankton and their relationship to plankton size fractions, Limnol. Oceanogr., 23, 1256–1263, https://doi.org/10.4319/lo.1978.23.6.1256, 1978.
Siegel, D. A., Buesseler, K. O., Behrenfeld, M. J., Benitez-Nelson, C. R., Boss, E., Brzezinski, M. A., Burd, A., Carlson, C. A., D'Asaro, E. A., Doney, S. C., Perry, M. J., Stanley, R. H. R., and Steinberg, D. K.: Prediction of the Export and Fate of Global Ocean Net Primary Production: The EXPORTS Science Plan, Front. Mar. Sci., 3, 22, https://doi.org/10.3389/fmars.2016.00022, 2016.
Steinberg, D. K., Carlson, C. A., Bates, N. R., Goldthwait, S. A., Madin, L. P., and Michaels, A. F.: Zooplankton vertical migration and the active transport of dissolved organic and inorganic carbon in the Sargasso Sea, Deep-Sea Res. Pt. I, 47, 137–158, https://doi.org/10.1016/S0967-0637(99)00052-7, 2000.
Turner, J. T.: Zooplankton fecal pellets, marine snow, phytodetritus and the ocean's biological pump, Prog. Oceanogr., 130, 205–248, https://doi.org/10.1016/j.pocean.2014.08.005, 2015.
Uitz, J., Claustre, H., Gentili, B., and Stramski, D.: Phytoplankton class-specific primary production in the world's oceans: Seasonal and interannual variability from satellite observations, Global Biogeochem. Cy., 24, GB3016, https://doi.org/10.1029/2009GB003680, 2010.
Van Beveren, E., Bonhommeau, S., Fromentin, J.-M., Bigot, J.-L., Bourdeix, J.-H., Brosset, P., Roos, D., and Saraux, C.: Rapid changes in growth, condition, size and age of small pelagic fish in the Mediterranean, Mar. Biol., 161, 1809–1822, https://doi.org/10.1007/s00227-014-2463-1, 2014.
van der Lingen, C., Hutchings, L., and Field, J.: Comparative trophodynamics of anchovy Engraulis encrasicolus and sardine Sardinops sagax in the southern Benguela: are species alternations between small pelagic fish trophodynamically mediated?, Afr. J. Mar. Sci., 28, 465–477, https://doi.org/10.2989/18142320609504199, 2006.
Vandromme, P., Stemmann, L., García-Comas, C., Berline, L., Sun, X., and Gorsky, G.: Assessing biases in computing size spectra of automatically classified zooplankton from imaging systems: A case study with the ZooScan integrated system, Meth. Oceanogr., 1–2, 3–21, https://doi.org/10.1016/j.mio.2012.06.001, 2012.
Vandromme, P., Nogueira, E., Huret, M., Lopez-Urrutia, Á., González-Nuevo González, G., Sourisseau, M., and Petitgas, P.: Springtime zooplankton size structure over the continental shelf of the Bay of Biscay, Ocean Sci., 10, 821–835, https://doi.org/10.5194/os-10-821-2014, 2014.
Véron, M., Duhamel, E., Bertignac, M., Pawlowski, L., and Huret, M.: Major changes in sardine growth and body condition in the Bay of Biscay between 2003 and 2016: Temporal trends and drivers, Prog. Oceanogr., 182, 102274, https://doi.org/10.1016/j.pocean.2020.102274, 2020.