<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="review-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">ESSD</journal-id><journal-title-group>
    <journal-title>Earth System Science Data</journal-title>
    <abbrev-journal-title abbrev-type="publisher">ESSD</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Earth Syst. Sci. Data</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1866-3516</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/essd-18-945-2026</article-id><title-group><article-title>Benchmark of plankton images classification: emphasizing features extraction over classifier complexity</article-title><alt-title>Benchmark of plankton images classification</alt-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1 aff2">
          <name><surname>Panaïotis</surname><given-names>Thelma</given-names></name>
          <email>thelma.panaiotis.pub@proton.me</email>
        <ext-link ext-link-type="uri" xlink:href="https://orcid.org/0000-0001-5615-6766">https://orcid.org/0000-0001-5615-6766</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2 aff3">
          <name><surname>Amblard</surname><given-names>Emma</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff4">
          <name><surname>Boniface-Chang</surname><given-names>Guillaume</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff5">
          <name><surname>Dulac-Arnold</surname><given-names>Gabriel</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff6">
          <name><surname>Woodward</surname><given-names>Benjamin</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2">
          <name><surname>Irisson</surname><given-names>Jean-Olivier</given-names></name>
          
        </contrib>
        <aff id="aff1"><label>1</label><institution>National Oceanography Centre, European Way, Southampton, SO14 3ZH, UK</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Laboratoire d'Océanographie de Villefranche, Sorbonne Université, 181 Chemin du Lazaret, 06230 Villefranche-sur-Mer, France</institution>
        </aff>
        <aff id="aff3"><label>3</label><institution>Fotonower, 48 Rue René Clair, 75018 Paris, France</institution>
        </aff>
        <aff id="aff4"><label>4</label><institution>Google Research, 6 Pancras Sq, London N1C 4AG, UK</institution>
        </aff>
        <aff id="aff5"><label>5</label><institution>Google Research, 8 Rue de Londres, 75009 Paris, France</institution>
        </aff>
        <aff id="aff6"><label>6</label><institution>CVision AI, 81 West St, Medford, MA 02155, USA</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Thelma Panaïotis (thelma.panaiotis.pub@proton.me)</corresp></author-notes><pub-date><day>5</day><month>February</month><year>2026</year></pub-date>
      
      <volume>18</volume>
      <issue>2</issue>
      <fpage>945</fpage><lpage>967</lpage>
      <history>
        <date date-type="received"><day>23</day><month>May</month><year>2025</year></date>
           <date date-type="rev-request"><day>6</day><month>June</month><year>2025</year></date>
           <date date-type="rev-recd"><day>15</day><month>December</month><year>2025</year></date>
           <date date-type="accepted"><day>12</day><month>January</month><year>2026</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2026 Thelma Panaïotis et al.</copyright-statement>
        <copyright-year>2026</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026.html">This article is available from https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026.html</self-uri><self-uri xlink:href="https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026.pdf">The full text article is available as a PDF file from https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d2e164">Plankton imaging devices produce vast datasets, the processing of which can be greatly accelerated through machine learning. This is a challenging task due to the diversity of plankton, the prevalence of non-biological classes, and the rarity of many classes. Most existing studies rely on small, unpublished datasets that often lack realism in size, class diversity and proportions. We therefore also lack a systematic, realistic benchmark of plankton image classification approaches. To address this gap, we leverage both existing and newly published, large, and realistic plankton imaging datasets from widely used instruments (see Data Availability section for the complete list of dataset DOIs). We evaluate different classification approaches: a classical Random Forest classifier applied to handcrafted features, various Convolutional Neural Networks (CNN), and a combination of both. This work aims to provide reference datasets, baseline results, and insights to guide future endeavors in plankton image classification. Overall, CNN outperformed the classical approach, but only significantly for uncommon classes. Larger CNN, which should provide richer features, did not perform better than small ones, and the features of small ones could even be further compressed without affecting classification performance. Finally, we highlight that the nature of the classifier is of little importance compared to the content of the features. Our findings suggest that compact CNN (i.e. with a modest number of convolutional layers and consequently relatively few parameters) are sufficient to extract relevant information to classify small grayscale plankton images. This has consequences for operational classification models, which can afford to be small and quick. On the other hand, this opens the possibility for further development of the imaging systems to provide larger and richer images.</p>
  </abstract>
    
<funding-group>
<award-group id="gs1">
<funding-source>Agence Nationale de la Recherche</funding-source>
<award-id>ANR-18-BELM-0003-01</award-id>
</award-group>
<award-group id="gs2">
<funding-source>HORIZON EUROPE Food, Bioeconomy, Natural Resources, Agriculture and Environment</funding-source>
<award-id>101059915</award-id>
</award-group>
<award-group id="gs3">
<funding-source>Schmidt Futures</funding-source>
<award-id>CALIPSO</award-id>
</award-group>
</funding-group>
</article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d2e176">Plankton, defined as organisms unable to swim against currents, are crucial components of oceanic systems as they form the basis of food webs and contribute to organic carbon sequestration (Ware and Thomson, 2005; Falkowski, 2012). They have been the subject of scientific research for centuries (Péron and Lesueur, 1810). The definition of planktonic organisms, based on motility and ecological niche rather than phylogeny, means that the group encompasses a wide range of taxonomic clades (Tappan and Loeblich, 1973). Furthermore, within these clades, plankton is known to be particularly diverse (Hutchinson, 1961). Thus, planktonic organisms cover a wide range of sizes (from a few micrometers to several meters), shapes, opacities, colors, etc. While some planktonic taxa are ubiquitous (e.g. copepods), many are rare and sparsely distributed (e.g. fish larvae, scyphomedusae) (Ser-Giacomi et al., 2018).</p>
      <p id="d2e179">Historically, plankton was studied by sampling with nets and pumps followed by identification and counting by taxonomists. These approaches, still used today, are precise but time-consuming. Quantitative imaging and automated identification are now complementing traditional methods of plankton observation, with various imaging instruments developed to generate quantitative data (Lombard et al., 2019). Some of these instruments image collected samples, such as the ZooScan (Gorsky et al., 2010), the FlowCAM (Sieracki et al., 1998), or the ZooCAM (Colas et al., 2018). Others acquire images in situ, such as the Underwater Vision Profiler (UVP; Picheral et al., 2010, 2022), the In Situ Ichthyoplankton Imaging System (ISIIS; Cowen and Guigand, 2008), the Imaging FlowCytobot (IFCB; Olson and Sosik, 2007), or the ZooGlider (Ohman et al., 2019). These instruments vary significantly in terms of targeted size range, imaging technique, and deployment requirements, each necessitating distinct processing pipelines. Moreover, the growing availability and ease of use of these instruments are generating an ever-increasing volume of plankton imaging data. Most of this data is now processed through automated algorithms. Among the various processing tasks, detecting or identifying organisms is commonly performed using supervised machine learning, where an algorithm learns patterns from training data and then generalizes these patterns to new data. Despite significant advances in hardware for high-throughput plankton imaging, these new instruments do not always come with a solid and easy-to-use software pipeline (Bi et al., 2015 is a rare counter-example), leaving operators with the burden of coding or adapting one themselves. Even once the data is processed, many current analysis workflows still rely on aggregating and summarizing the classified images, since the usual statistical tools used in ecology are not meant to handle such large numbers of data points. This limits our ability to leverage the full richness of these new datasets (Malde et al., 2020).</p>
      <p id="d2e183">Automated classification of plankton images is a challenging computer science task. To begin with, planktonic communities (Ser-Giacomi et al., 2018), and therefore the resulting image datasets (Eftekhari et al., 2025; Schröder et al., 2019), exhibit significant class imbalance. In other words, a few classes account for a substantial part of the dataset, while other classes are poorly represented. This specificity of plankton image datasets contrasts with standard benchmark image datasets, where classes are almost evenly distributed: between 732 and 1300 images for each of the 1000 classes in ImageNet (Russakovsky et al., 2015). As a consequence, rare planktonic classes are usually harder for automated algorithms to predict (Lee et al., 2016; Van Horn and Perona, 2017; Schröder et al., 2019), although classes with highly distinctive morphologies can still be correctly classified even with few training images (Kraft et al., 2022). Secondly, planktonic organisms encompass a wide range of taxa and form a morphologically heterogeneous group, varying in size, shape and opacity. More specifically, certain classes can exhibit significant intraclass variation: for instance, when morphological differences arise from life stages (e.g., doliolids) or when a class includes diverse, but rare, objects grouped together, as they are too uncommon to warrant separate classes (e.g., fish larvae). This variability can lead to confusion between classes (Grosjean et al., 2004). In addition to diverse classes of living organisms, real-world plankton image datasets comprise a considerable amount of non-living objects, such as marine snow aggregates or bubbles (Benfield et al., 2007); these classes often constitute the majority of the datasets (Ellen et al., 2019; Schröder et al., 2019; Irisson et al., 2022). Finally, plankton images collected by quantitative instruments are typically low in resolution (with edges measuring only a few hundred pixels or less) and are often grayscale or with little variation in color; therefore, the distinction among classes needs to be made from a relatively small amount of information.</p>
      <p id="d2e186">Historically, the automatic classification of plankton images involved training machine learning classifiers using handcrafted features extracted from the images. These manually extracted features – intended to capture plankton traits (observable characteristics, primarily morphological) – aim to summarize the image content in numerical form, providing a concise representation that facilitates the classification process. Typical handcrafted features were global image moments (size, average gray, etc.; Tang et al., 1998), texture features such as gray-level co-occurrence matrices (Hu and Davis, 2005), or shape features from Fourier transforms of the contour (Tang et al., 1998). Classifiers included Support Vector Machines (SVM; Luo et al., 2004; Hu and Davis, 2005; Sosik and Olson, 2007), Random Forests (RF; Gorsky et al., 2010) or Multi-Layer Perceptrons (MLP; Culverhouse et al., 1996). Several studies compared various classifiers trained on a common set of features, revealing varying results depending on the dataset, but ultimately no significant difference in their performance (Grosjean et al., 2004; Blaschko et al., 2005; Gorsky et al., 2010; Ellen et al., 2015, 2019). This suggests that the performance of classical approaches is not driven by the classifier as much as by the number and diversity of features that are fed to it. Indeed, classification performance usually increases with a richer set of features (Blaschko et al., 2005). Nevertheless, this may not be true if some features are redundant or introduce noise into the data, hence the importance of feature selection (Sosik and Olson, 2007; Guo et al., 2021b). Because handcrafted features are designed for a particular imaging system, a single universal set that works across all instruments is difficult to define; the optimal set of features tends to be instrument and dataset dependent (Orenstein et al., 2022). One solution would be to define a very large, universal feature set and leave it to the classifier to select the relevant ones for each task. But this would be challenging, as it requires expertise both in biology, for many taxa (to know what to extract), and in computer science (to know how to do it); feature engineering has therefore emerged as a full-fledged research field (Guyon and Elisseeff, 2003). In the following, we will refer to these two-step methods (1 – handcrafted feature extraction and 2 – classification) as “classic approaches”, in contrast to the “deep approaches” introduced later, which bypass manual feature design by training feature extractors that automatically learn relevant features for the task at hand (Irisson et al., 2022).</p>
      <p id="d2e190">Among classifiers, RF is a tree-based ensemble learning method that has shown high accuracy and versatility across computer vision tasks (Hastie et al., 2009). Each decision tree in the “forest” is trained on a random subset of the data (i.e. a bootstrap sample), and at each step, it considers a random selection of predictors (or features) to split the data according to labeled classes. The tree keeps splitting until it reaches a stopping point, such as a maximum number of splits. During prediction, each object passes through the tree until it reaches a terminal leaf, where it is classified based on the majority class within that leaf. By averaging the results from multiple trees, RF reduces the risk of overfitting (Breiman, 2001). Fernández-Delgado et al. (2014), who evaluated the performance of nearly 180 classifiers on various datasets, concluded that RF outperformed all others. Gorsky et al. (2010) previously reached this conclusion on a ZooScan image dataset, resulting in widespread use of RF classifiers later on. The IFCB data processing pipeline also switched from SVM to RF (Anglès et al., 2015). Finally, EcoTaxa (Picheral et al., 2017), a web application dedicated to the taxonomic annotation of images, initially implemented a RF classifier to classify unlabeled images.</p>
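The classical pipeline described above (handcrafted features fed to a Random Forest) can be sketched in a few lines of scikit-learn. This is an illustrative toy with synthetic stand-in features, not one of the instrument pipelines: the bootstrap sampling and random feature selection at each split are handled internally by `RandomForestClassifier`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 8))                    # 8 handcrafted features per object (stand-ins)
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)  # toy 2-class labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
# Each tree is fit on a bootstrap sample; max_features="sqrt" limits the
# predictors considered at each split, as described in the text.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X_tr, y_tr)
acc = rf.score(X_te, y_te)
print(f"held-out accuracy: {acc:.2f}")
```

The final prediction aggregates the votes of the 100 trees, which is what reduces overfitting relative to a single tree.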
      <p id="d2e193">However, since 2015, an increasing proportion of plankton image classification studies have employed deep learning methods, especially Convolutional Neural Networks (CNN). CNN are a kind of artificial neural network, typically used for pattern recognition tasks like image segmentation or classification. Their architecture is inspired by the visual cortex of animals, where each neuron reacts to stimuli from a restricted region (Dyck et al., 2021). In the case of an image classification task, a CNN directly takes an image as input (as opposed to classic approaches, for which image features need to be extracted first), transforms it in various ways (the “Convolutional” part), combines the resulting features as input for a set of interconnected “neurons” that further reduce the information (the “Neural Network” part), and finally outputs a probability for the image to belong to each class; the class of highest probability is chosen as the predicted label. In contrast to the classical approaches described above, the classification task with CNN is performed in a single step, where the feature extractor and the classifier are trained simultaneously. This process optimizes the deep features specifically for the classification task. Moreover, those features can be used to train any kind of classifier, often resulting in better classification performance than with handcrafted features (Orenstein and Beijbom, 2017).</p>
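The stages a CNN chains together can be made concrete with a purely illustrative, untrained NumPy sketch (a hand-picked edge kernel and random dense weights, nothing learned): convolution produces a feature map, pooling condenses it, and a dense layer plus softmax turns the pooled features into per-class probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))                 # toy grayscale "plankton" image

kernel = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], float)   # fixed vertical-edge filter (learned in a real CNN)

# "Convolutional" part: slide the 3x3 kernel to build a 6x6 feature map
fmap = np.empty((6, 6))
for i in range(6):
    for j in range(6):
        fmap[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
fmap = np.maximum(fmap, 0)               # ReLU non-linearity

# 2x2 max pooling halves each dimension of the feature map
pooled = fmap.reshape(3, 2, 3, 2).max(axis=(1, 3))

# "Neural Network" part: a dense layer maps pooled features to class scores,
# and softmax converts scores to a probability per class
w = rng.normal(size=(pooled.size, 4))    # 4 hypothetical classes
scores = pooled.ravel() @ w
probs = np.exp(scores) / np.exp(scores).sum()
print("predicted class:", probs.argmax())
```

In an actual CNN, the kernel and dense weights are optimized jointly by backpropagation, which is what makes the deep features task-specific.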
      <p id="d2e196">CNN, first developed in the late 1980s (Le Cun et al., 1989) and popularized in 2012 (Krizhevsky et al., 2012), were applied to plankton image classification for the first time in 2015, during a challenge hosted on the online platform Kaggle (<uri>https://www.kaggle.com/c/datasciencebowl/</uri>, last access: 10 December 2025). Since then, numerous studies have demonstrated the effectiveness of CNN in recognizing plankton images (Dai et al., 2016; Lee et al., 2016; Luo et al., 2018; Cheng et al., 2019; Ellen et al., 2019; Lumini and Nanni, 2019; Schmid et al., 2020). On a few plankton image datasets, CNN have been shown to reach higher prediction accuracy than the classical approach of handcrafted feature extraction followed by classification (Orenstein et al., 2015; Kyathanahally et al., 2021; Irisson et al., 2022). Currently, research on the classification of plankton images, or images of any other type of marine organisms, is dominated by CNN (Irisson et al., 2022; Rubbens et al., 2023; Eerola et al., 2024). While CNN remain a dominant method for image classification, they have been surpassed by vision transformers (Vaswani et al., 2017), a newer state-of-the-art approach. However, vision transformers are less data-efficient than CNN, requiring larger datasets and greater computational resources for effective training (Raghu et al., 2021). When applied to plankton image classification, vision transformers have shown only marginal improvements over CNN (Kyathanahally et al., 2022; Maracani et al., 2023).</p>
      <p id="d2e202">A relatively recent review (Irisson et al., 2022) revealed that over 175 papers have addressed the topic of automated classification of plankton images. As shown earlier, a few compared classifiers explicitly, with varying outcomes. But overall, these 100<inline-formula><mml:math id="M1" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> studies used different datasets, often only one per study, most of which were not publicly released. The datasets varied in terms of number of classes and number of images, two factors that significantly affect performance. They also reported different performance metrics, and the one most commonly reported (global accuracy) is unrepresentative for unbalanced datasets (Soda, 2011). Indeed, out of the 10 most cited papers in the field (Irisson et al., 2022), 8 conducted a plankton classification experiment, but only 4 reported per-class metrics or a confusion matrix (the others only report global metrics such as accuracy). A similar pattern is observed among the papers cited here: of the 33 papers that performed a plankton classification task, only half reported metrics beyond global metrics (Table S1 in the Supplement). Looking at the bigger picture, it appears that performance has remained relatively stable over time, while the taxonomic classification tasks became increasingly difficult because, with richer and larger datasets, more taxa were labeled (Irisson et al., 2022). This suggests that classifiers improved, although this is unquantifiable for all the reasons above. Earlier plankton image datasets were modest in size, typically containing a dozen or a few dozen classes (Benfield et al., 2007), but were crucial for establishing the first classification methods. Building on that foundation, three major plankton image datasets have been published and used in several studies (Table 1), while a few other studies have focused on smaller versions of these datasets (Dai et al., 2016; Zheng et al., 2017; Lumini and Nanni, 2019). These benchmark datasets share several important characteristics: they are large (though this is debatable for PlanktonSet 1.0), representative of true data (with minimal alteration of class distribution and inclusion of all classes, such as detritus or miscellaneous), and accessible online. This highlights that a move towards standardization and intercompatibility is ongoing. Beyond publishing large reference datasets, as we strive to do in this work, another avenue for progress is the collection of many diverse, albeit smaller, datasets. This is typically the first step for the creation of “universal” foundation-type models. The push towards more open and reproducible science has helped in this respect, and several local datasets have been published: e.g. Table 1 in Kareinen et al. (2025), Table 2 in Eerola et al. (2024).</p>

<table-wrap id="T1" specific-use="star"><label>Table 1</label><caption><p id="d2e215">Common plankton images benchmark datasets.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="6">
     <oasis:colspec colnum="1" colname="col1" align="justify" colwidth="2cm"/>
     <oasis:colspec colnum="2" colname="col2" align="justify" colwidth="2cm"/>
     <oasis:colspec colnum="3" colname="col3" align="justify" colwidth="2cm"/>
     <oasis:colspec colnum="4" colname="col4" align="justify" colwidth="1cm"/>
     <oasis:colspec colnum="5" colname="col5" align="justify" colwidth="1cm"/>
     <oasis:colspec colnum="6" colname="col6" align="justify" colwidth="7cm"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1" align="left">Name</oasis:entry>
         <oasis:entry colname="col2" align="left">References</oasis:entry>
         <oasis:entry colname="col3" align="left">Imaging instrument</oasis:entry>
         <oasis:entry rowsep="1" namest="col4" nameend="col5" align="center">Composition </oasis:entry>
         <oasis:entry colname="col6" align="left">Relevant publications</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left"/>
         <oasis:entry colname="col2" align="left"/>
         <oasis:entry colname="col3" align="left"/>
         <oasis:entry colname="col4" align="left">Images</oasis:entry>
         <oasis:entry colname="col5" align="right">Classes</oasis:entry>
         <oasis:entry colname="col6" align="left"/>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">WHOI-plankton</oasis:entry>
         <oasis:entry colname="col2" align="left">Orenstein et al. (2015), Sosik et al. (2015)</oasis:entry>
         <oasis:entry colname="col3" align="left">IFCB</oasis:entry>
         <oasis:entry colname="col4" align="left">3.5 M</oasis:entry>
         <oasis:entry colname="col5" align="right">103</oasis:entry>
         <oasis:entry colname="col6" align="left">Callejas et al. (2025), Ciranni et al. (2025), Lee et al. (2016), Dai et al. (2017), Orenstein and Beijbom (2017), Cui et al. (2018), Hassan et al. (2025), Kraft et al. (2022), Kyathanahally et al. (2021, 2022), Langeland Teigen et al. (2020), Liu et al. (2018), Maracani et al. (2023), Venkataramanan et al. (2021)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">ZooScanNet</oasis:entry>
         <oasis:entry colname="col2" align="left">Elineau et al. (2024)</oasis:entry>
         <oasis:entry colname="col3" align="left">ZooScan</oasis:entry>
         <oasis:entry colname="col4" align="left">1.4 M</oasis:entry>
         <oasis:entry colname="col5" align="right">93</oasis:entry>
         <oasis:entry colname="col6" align="left">Callejas et al. (2025), Ciranni et al. (2025), Guo and Guan (2021), Malde and Kim (2019), Schröder et al. (2019), Kyathanahally et al. (2021, 2022), Maracani et al. (2023)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1" align="left">PlanktonSet 1.0</oasis:entry>
         <oasis:entry colname="col2" align="left">Cowen et al. (2015)</oasis:entry>
         <oasis:entry colname="col3" align="left">ISIIS</oasis:entry>
         <oasis:entry colname="col4" align="left">30 336</oasis:entry>
         <oasis:entry colname="col5" align="right">121</oasis:entry>
          <oasis:entry colname="col6" align="left">Dieleman et al. (2016), Du et al. (2020), Geraldes et al. (2019), Guo and Guan (2021), Guo et al. (2021a), Langeland Teigen et al. (2020), Li and Cui (2016), Li et al. (2019), Py et al. (2016), Rodrigues et al. (2018), Uchida et al. (2018), Kyathanahally et al. (2021, 2022), Maracani et al. (2023), Yan et al. (2017)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d2e345">Currently, despite several years of active research on the topic and while CNN have been applied to plankton images for more than five years (Luo et al., 2018), a systematic, global comparison of classifier performance is still lacking. Leveraging both previously published and newly published plankton imaging datasets, the motivation for this study is to provide such a systematic, operational benchmark that evaluates practical and accessible approaches suitable for real-world applications. This includes starting with a classical feature-based image classification approach and exploring a few deep-learning methods. All are applied on large, realistic, and publicly released datasets from six commonly used plankton imaging instruments, to encompass some of the variability in imaging modalities, processing pipelines, and target size ranges present in plankton imaging. For the classical approach, we use the handcrafted features natively extracted by the software associated with the instrument, assuming that they were engineered to be relevant for those images, and a RF classifier, given its popularity and performance on plankton images. For the deep approach, our base model is a relatively small and easy-to-train CNN (MobileNet V2), readily accessible to non-ML specialists and trainable without state-of-the-art hardware. In addition to this benchmark, we perform additional comparisons to tackle the following questions: (i) In which conditions do CNN strongly improve classification performance over the classical approach? (ii) Is per-class weighting of errors effective in countering the effect of class imbalance in plankton datasets? (iii) How rich do features need to be for plankton image classification: are larger CNN needed or, on the contrary, can features be compressed? (iv) What are the relative effects of features (deep vs. handcrafted) and classifier (MLP vs. RF) on classification performance?</p>
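Two of the issues raised above, global accuracy being misleading on imbalanced datasets, and per-class weighting of errors as a counter-measure, can be illustrated with a small synthetic sketch (not the benchmark itself) using scikit-learn's `class_weight="balanced"` option and the balanced accuracy metric.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score

rng = np.random.default_rng(0)
# 95 % majority class (e.g. detritus), 5 % rare plankton class
n = 2000
y = (rng.random(n) < 0.05).astype(int)
X = rng.normal(size=(n, 4)) + y[:, None]   # weak signal separating the rare class

split = n // 2
results = {}
for weight in (None, "balanced"):
    rf = RandomForestClassifier(n_estimators=50, class_weight=weight, random_state=0)
    rf.fit(X[:split], y[:split])
    pred = rf.predict(X[split:])
    results[weight] = (accuracy_score(y[split:], pred),
                       balanced_accuracy_score(y[split:], pred))
print(results)
```

Even a classifier that largely ignores the rare class scores a high global accuracy here, because predicting the majority class alone is already right about 95 % of the time; balanced accuracy, which averages per-class recall, exposes that failure.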
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Material and method</title>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Datasets</title>
<sec id="Ch1.S2.SS1.SSS1">
  <label>2.1.1</label><title>Imaging tools</title>
      <p id="d2e370">We used datasets from six widely used plankton imaging instruments, each with distinct properties such as deployment methods or the size range of targeted organisms (Table 2). For a detailed review of these instruments, refer to Lombard et al. (2019).</p>

<table-wrap id="T2" specific-use="star"><label>Table 2</label><caption><p id="d2e376">Main characteristics of the plankton imaging instruments used to collect the datasets.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="left"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Instrument</oasis:entry>
         <oasis:entry colname="col2">Deployment</oasis:entry>
         <oasis:entry colname="col3">Covered size range</oasis:entry>
         <oasis:entry colname="col4">Reference</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">FlowCAM</oasis:entry>
         <oasis:entry colname="col2">Ex situ (laboratory, ship)</oasis:entry>
         <oasis:entry colname="col3">20 to 200 <inline-formula><mml:math id="M2" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi><mml:mi mathvariant="normal">m</mml:mi></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4">Sieracki et al. (1998)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">IFCB</oasis:entry>
         <oasis:entry colname="col2">In situ (mooring)</oasis:entry>
         <oasis:entry colname="col3">10 to 100 <inline-formula><mml:math id="M3" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi><mml:mi mathvariant="normal">m</mml:mi></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4">Olson and Sosik (2007)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ISIIS</oasis:entry>
         <oasis:entry colname="col2">In situ (ship-towed)</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M4" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 1 mm to several cm</oasis:entry>
         <oasis:entry colname="col4">Cowen and Guigand (2008)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">UVP6</oasis:entry>
         <oasis:entry colname="col2">In situ (CTD rosette, mooring, AUV)</oasis:entry>
         <oasis:entry colname="col3">620 <inline-formula><mml:math id="M5" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi><mml:mi mathvariant="normal">m</mml:mi></mml:mrow></mml:math></inline-formula> to a few cm</oasis:entry>
         <oasis:entry colname="col4">Picheral et al. (2022)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ZooCAM</oasis:entry>
         <oasis:entry colname="col2">Ex situ (laboratory, ship)</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M6" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 300 <inline-formula><mml:math id="M7" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi><mml:mi mathvariant="normal">m</mml:mi></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4">Colas et al. (2018)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ZooScan</oasis:entry>
         <oasis:entry colname="col2">Ex situ (laboratory)</oasis:entry>
         <oasis:entry colname="col3">200 <inline-formula><mml:math id="M8" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi><mml:mi mathvariant="normal">m</mml:mi></mml:mrow></mml:math></inline-formula> to a few cm</oasis:entry>
         <oasis:entry colname="col4">Gorsky et al. (2010)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
<sec id="Ch1.S2.SS1.SSS2">
  <label>2.1.2</label><title>Image processing</title>
      <p id="d2e572">Each imaging tool had its own specific image processing and feature extraction pipeline. The motivation here is to use these tools “out of the box”, as other plankton ecologists would. ISIIS data was processed using Apeep (Panaïotis et al., 2022), and features were extracted using Scikit-image (Walt et al., 2014). The IFCB data processing relied on several MATLAB scripts (Sosik and Olson, 2007) to segment objects and extract different types of features. The UVPapp application (Picheral et al., 2022) was developed to process UVP6 images and extract features. Both ZooScan and FlowCAM data were processed using ZooProcess (Gorsky et al., 2010), which generates crops of individual objects together with a set of features, extracted by ImageJ (Schneider et al., 2012). The processing of ZooCAM data was very similar to that of ZooScan and FlowCAM data (Colas et al., 2018). Thus, for all datasets, each grayscale image was associated with a label and a set of handcrafted features, which depended on the instrument but were mostly global features related to shape and gray levels.</p>
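These per-instrument pipelines are used as-is, but the kind of global shape and gray-level descriptors they output can be illustrated with a minimal NumPy sketch; the fixed threshold and the feature names below are our own illustrative choices, not those of Apeep, ZooProcess, or the other tools named above.

```python
import numpy as np

def global_features(gray, threshold=200):
    """Toy global shape and gray-level descriptors of the dark object
    in a grayscale crop (illustrative, not any instrument's pipeline)."""
    mask = gray < threshold                     # object = dark pixels
    ys, xs = np.nonzero(mask)
    height = int(np.ptp(ys)) + 1                # bounding-box height
    width = int(np.ptp(xs)) + 1                 # bounding-box width
    area = int(mask.sum())
    return {
        "area": area,
        "height": height,
        "width": width,
        "extent": area / (height * width),      # fill ratio of the bbox
        "mean_gray": float(gray[mask].mean()),  # gray-level summary
    }

img = np.full((8, 8), 255, dtype=np.uint8)      # white background
img[2:6, 2:6] = 50                              # one dark 4x4 object
feats = global_features(img)
```

Real pipelines compute dozens of such descriptors per object (31 to 72 here, see Table 3), which then feed the classifiers of Sect. 2.2.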
</sec>
<sec id="Ch1.S2.SS1.SSS3">
  <label>2.1.3</label><title>Dataset assembly and composition</title>
      <p id="d2e583">All datasets were generated in a similar way: complete, real-world datasets were sorted by human operators; all classifications were reviewed by one independent operator for each dataset. Except for IFCB and ZooCAM, samples particularly rich in some rare classes were added to the dataset (all images, not just those of the class of interest). Classes still containing fewer than <inline-formula><mml:math id="M9" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 100 objects were merged into a taxonomically and/or morphologically neighboring class. If no relevant merging class could be found, objects were assigned to a miscellaneous class together with objects impossible to classify. Therefore, every single object from the original samples was included in the classification task, ensuring that the metrics computed on these datasets were as relevant to a real-world situation as possible. The IFCB images were taken from Sosik et al. (2015) (years 2011–2014); the images for other instruments were taken from EcoTaxa (Picheral et al., 2017), with the permission of their owners. Full references for each dataset are provided in Table 3. The number of images in the resulting datasets ranged from 301 247 to 1 592 196, in 32 to 120 classes (Table 3). As expected, the datasets collected in situ (ISIIS, UVP6, and IFCB) were particularly rich in marine snow and other non-living objects, resulting in a low proportion of plankton.</p>

<table-wrap id="T3" specific-use="star"><label>Table 3</label><caption><p id="d2e596">References and dataset composition in terms of the numbers of images, classes and handcrafted features, as well as the proportion of plankton (i.e. living organisms, as opposed to detritus and imaging artifacts).</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="6">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Instrument</oasis:entry>
         <oasis:entry colname="col2">Dataset reference</oasis:entry>
         <oasis:entry rowsep="1" namest="col3" nameend="col6" align="center">Composition </oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3"># images [min; max per class]</oasis:entry>
         <oasis:entry colname="col4">Classes</oasis:entry>
         <oasis:entry colname="col5">Features</oasis:entry>
         <oasis:entry colname="col6">% plankton</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">FlowCAM</oasis:entry>
         <oasis:entry colname="col2">Jalabert et al. (2024)</oasis:entry>
         <oasis:entry colname="col3">301 247 [74; 69 085]</oasis:entry>
         <oasis:entry colname="col4">93</oasis:entry>
         <oasis:entry colname="col5">47</oasis:entry>
         <oasis:entry colname="col6">36.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ISIIS</oasis:entry>
         <oasis:entry colname="col2">Panaïotis et al. (2024)</oasis:entry>
         <oasis:entry colname="col3">408 166 [70; 321 335]</oasis:entry>
         <oasis:entry colname="col4">32</oasis:entry>
         <oasis:entry colname="col5">31</oasis:entry>
         <oasis:entry colname="col6">15.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">UVP6</oasis:entry>
         <oasis:entry colname="col2">Picheral et al. (2024)</oasis:entry>
         <oasis:entry colname="col3">634 459 [87; 508 817]</oasis:entry>
         <oasis:entry colname="col4">54</oasis:entry>
         <oasis:entry colname="col5">62</oasis:entry>
         <oasis:entry colname="col6">7.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ZooCAM</oasis:entry>
         <oasis:entry colname="col2">Romagnan et al. (2024)</oasis:entry>
         <oasis:entry colname="col3">1 286 590 [81; 204 132]</oasis:entry>
         <oasis:entry colname="col4">93</oasis:entry>
         <oasis:entry colname="col5">48</oasis:entry>
         <oasis:entry colname="col6">67.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ZooScan</oasis:entry>
         <oasis:entry colname="col2">Elineau et al. (2024)</oasis:entry>
         <oasis:entry colname="col3">1 451 745 [90; 241 731]</oasis:entry>
         <oasis:entry colname="col4">120</oasis:entry>
         <oasis:entry colname="col5">48</oasis:entry>
         <oasis:entry colname="col6">71.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">IFCB</oasis:entry>
         <oasis:entry colname="col2">Sosik et al. (2015)</oasis:entry>
         <oasis:entry colname="col3">1 592 196 [90; 1 177 499]</oasis:entry>
         <oasis:entry colname="col4">69</oasis:entry>
         <oasis:entry colname="col5">72</oasis:entry>
         <oasis:entry colname="col6">12.6</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d2e788">To assess performance at a coarser taxonomic level, which may be sufficient in some applications and is more comparable to older papers tackling automated classification of plankton images (e.g. Culverhouse et al., 1996; Sosik and Olson, 2007; Gorsky et al., 2010), each class was assigned to a broader group (Tables 4, S2–S7). Each class/group was then categorized as planktonic or non-planktonic (i.e. detritus and imaging artifacts), allowing metrics to be computed for planktonic organisms only, excluding the sometimes dominant non-living objects (Table 3). The datasets were split, per class, into 70 % for training, 15 % for validation and 15 % for testing, once, before all experiments. This split ensured that the majority of the data was used for training, maximizing model learning, while preserving a sufficient portion for validation and testing (at least 10 objects for the rarest classes in the FlowCAM and ISIIS datasets).</p>
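A per-class 70/15/15 split of this kind can be sketched as follows; the function name and random seed are illustrative, not taken from the benchmark code.

```python
import numpy as np

def split_per_class(labels, seed=0, frac=(0.70, 0.15, 0.15)):
    """Split object indices into train/val/test within each class,
    so every class (including rare ones) keeps ~70/15/15 proportions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train, val, test = [], [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.nonzero(labels == c)[0])  # shuffle class
        n_train = int(round(frac[0] * len(idx)))
        n_val = int(round(frac[1] * len(idx)))
        train.extend(idx[:n_train])
        val.extend(idx[n_train:n_train + n_val])
        test.extend(idx[n_train + n_val:])
    return train, val, test

labels = ["copepod"] * 100 + ["detritus"] * 20  # one common, one rarer class
train, val, test = split_per_class(labels)
```

Splitting within each class (rather than globally) guarantees that even the rarest classes contribute objects to the validation and test splits.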
</sec>
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Classification models</title>
      <p id="d2e800">Each dataset was classified using different models, described below. The training procedure was the same for all models and datasets: (i) models were fitted to the training split, according to a loss metric, (ii) hyperparameters were assessed based on the same loss metric but computed on the independent validation split to limit overfitting, (iii) the model with optimal hyperparameters was used to predict the never-seen-before test split, only once, and various performance metrics were computed.</p>
      <p id="d2e803">The RF classifiers were implemented using Scikit-learn (Pedregosa et al., 2011). The CNN models were implemented using Tensorflow (Abadi et al., 2016). Training and evaluation were performed on two Linux machines, depending on the model: a Dell server equipped with a Quadro RTX 8000 GPU and a node of the Jean-Zay supercomputer, equipped with a V100 SXM2 GPU.</p>
      <p id="d2e806">The code to reproduce all results is available at <ext-link xlink:href="https://doi.org/10.5281/zenodo.17937437" ext-link-type="DOI">10.5281/zenodo.17937437</ext-link> (Panaïotis and Amblard, 2025).</p>
<sec id="Ch1.S2.SS2.SSS1">
  <label>2.2.1</label><title>Classic approach</title>
      <p id="d2e819">A RF classifier was trained on handcrafted features extracted from images by the software dedicated to each instrument. Their number ranged from 31 to 72 depending on the software (Table 3). Most features were global features, computed on the whole object: morphological features were computed on the object silhouette; gray-level features were summaries of the distribution of gray levels in the object. In the case of IFCB, additional texture features were extracted, in the form of gray-level co-occurrence matrices. The diversity of features is known to be crucial for the performance of the classifiers (Blaschko et al., 2005).</p>
      <p id="d2e822">The loss metric used during training and validation was categorical cross-entropy, which optimizes the model's confidence in predicting the correct class by minimizing the difference between predicted probabilities and actual labels. While this helps improve accuracy, it does not directly optimize for accuracy itself, which is based solely on whether predictions are correct, not on the confidence of those predictions. In terms of hyperparameters, the number of features used to compute each split was set to the square root of the number of features (which is the default for a classification task, Hastie et al., 2009) and the minimum number of samples in a terminal node was set to 5. The optimal number of trees was investigated using a grid search procedure, over the values 100, 200, 350, and 500; for each, the classifier was fitted on the training split and evaluated on the validation split. The number of trees leading to the lowest validation loss was selected. This classic approach is illustrated in the first row of Fig. 1.</p>
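Assuming the RF was configured as described (scikit-learn defaults plus the stated hyperparameters), the grid search over the number of trees can be sketched on synthetic data; the feature matrix and labels below are stand-ins, not the benchmark datasets.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
# synthetic stand-in for handcrafted features: 10 columns, 3 classes
X = rng.normal(size=(600, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int) + (X[:, 0] > 1).astype(int)
X_train, y_train = X[:400], y[:400]
X_val, y_val = X[400:], y[400:]

best_n, best_loss = None, np.inf
for n_trees in (100, 200, 350, 500):      # grid from the text
    rf = RandomForestClassifier(
        n_estimators=n_trees,
        max_features="sqrt",              # sqrt of the number of features
        min_samples_leaf=5,               # >= 5 samples per terminal node
        random_state=0,
    ).fit(X_train, y_train)
    # categorical cross-entropy on the independent validation split
    loss = log_loss(y_val, rf.predict_proba(X_val), labels=rf.classes_)
    if loss < best_loss:
        best_n, best_loss = n_trees, loss
```

The value of `best_n` minimizing the validation loss is then used to predict the test split, once.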
</sec>
<sec id="Ch1.S2.SS2.SSS2">
  <label>2.2.2</label><title>Convolutional neural network</title>
      <p id="d2e834">Since our goal here is to assess the performance of easy-to-use, turnkey models that most research teams should be able to deploy, we chose a rather small CNN (MobileNet V2; Sandler et al., 2018) as our reference model. In addition, we evaluated the performance of much larger CNNs: EfficientNet V2 (Tan and Le, 2021), in its S and XL versions.</p>
      <p id="d2e837">Images were resized and padded to match the input dimensions required by each CNN model (MobileNet V2: 224 <inline-formula><mml:math id="M10" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 224 <inline-formula><mml:math id="M11" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 3; EfficientNet V2 S: 384 <inline-formula><mml:math id="M12" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 384 <inline-formula><mml:math id="M13" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 3; EfficientNet V2 XL: 512 <inline-formula><mml:math id="M14" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 512 <inline-formula><mml:math id="M15" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 3). To preserve aspect ratio, each image was resized so that its longest side equaled the model's input size, then padded to a square format using the median value of the border pixels to maintain a homogeneous background (Orenstein et al., 2015). Since all images are resized and padded to a common pixel grid, the large natural size variation of plankton is compressed, limiting the amount of scale-specific detail that can be exploited by the CNN. Each image was originally single-channel, so the grayscale channel was replicated across the three color channels expected by these CNNs. Since training a CNN from scratch is time- and data-consuming, we applied transfer learning by using a feature extractor pre-trained on the ImageNet dataset. The pre-trained feature extractor could be used as is, since the features extracted by a model trained on generic datasets have proven relevant for other tasks (Yosinski et al., 2014), such as plankton classification (Orenstein and Beijbom, 2017; Rodrigues et al., 2018; Kyathanahally et al., 2021), but it can also be fine-tuned on the target dataset to achieve better performance (Yosinski et al., 2014), which is what we did here, for each dataset.</p>
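The resize-and-pad preprocessing can be sketched with NumPy as follows; nearest-neighbour resampling and the centering of the object on the canvas are our own simplifications, not necessarily those of the actual pipeline.

```python
import numpy as np

def resize_pad(gray, size=224):
    """Resize so the longest side equals `size` (nearest neighbour),
    pad to a square with the median border value, replicate to 3 channels."""
    h, w = gray.shape
    scale = size / max(h, w)
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    rows = np.arange(nh) * h // nh          # nearest-neighbour row indices
    cols = np.arange(nw) * w // nw          # nearest-neighbour col indices
    resized = gray[rows][:, cols]
    # homogeneous background: median of the original border pixels
    border = np.concatenate([gray[0], gray[-1], gray[:, 0], gray[:, -1]])
    canvas = np.full((size, size), np.median(border), dtype=gray.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    # replicate the single grayscale channel across 3 color channels
    return np.repeat(canvas[:, :, None], 3, axis=2)

img = np.full((50, 100), 200, dtype=np.uint8)   # wide grayscale crop
out = resize_pad(img, size=224)                 # MobileNet V2 input size
```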
      <p id="d2e884">In a CNN, the typical classifier following the feature extractor is a MLP. To prevent overfitting, we added a dropout layer (rate <inline-formula><mml:math id="M16" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.5) immediately after the feature vector, preventing the model from relying on a few key neurons only (Srivastava et al., 2014). This was followed by a fully connected layer with either 600 or 50 neurons, depending on the model, to explore how the layer size impacts performance. Finally, the model ended with a classification head, the size of which depended on the number of classes to predict. This resulted in 4.5 M parameters for the smaller CNN and 208 M for the larger one. All models are described in Fig. 1.</p>
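As a sanity check on these sizes, the parameter count of the classification head alone can be computed by hand; the sketch below assumes the 1792-long MobileNet V2 feature vector mentioned in Sect. 2.2.3 and takes the 120 classes of the ZooScan dataset as an example.

```python
def head_params(n_features=1792, fc_size=600, n_classes=120):
    """Trainable parameters of the classifier head: dropout has none,
    then one fully connected layer and one softmax classification layer."""
    fc = n_features * fc_size + fc_size        # weights + biases
    out = fc_size * n_classes + n_classes      # weights + biases
    return fc + out

p600 = head_params(fc_size=600)  # head of the MLP600 variant
p50 = head_params(fc_size=50)    # head of the MLP50 variant
```

The head is thus a small fraction of the total parameter counts quoted above; most parameters sit in the feature extractor.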
      <p id="d2e894">Data augmentation (Shorten and Khoshgoftaar, 2019) was used to improve model generalization ability and performance, especially for rare classes. Images from the training set were randomly flipped vertically and horizontally, zoomed in and out (up to 20 %), and sheared (up to 15°). Such a process increases the diversity of examples seen during training (Dai et al., 2016). Images were not rotated because objects from a few classes have a characteristic orientation (e.g. vertical lines in the ISIIS dataset, or some organisms imaged in situ). As for the RF, the loss metric was the categorical cross-entropy. At the end of each training epoch (i.e. a complete run over all images in the training split), both loss and accuracy were computed on the validation split, to check for overfitting, and model parameters were saved.</p>
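A NumPy-only sketch of the flip and zoom transforms (shear is omitted for brevity; this is illustrative, not the TensorFlow implementation used for training):

```python
import numpy as np

def augment(img, rng):
    """Random horizontal/vertical flips and up to +/-20 % zoom.
    No rotation, since some classes have a characteristic orientation."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                 # horizontal flip
    if rng.random() < 0.5:
        img = img[::-1, :]                 # vertical flip
    zoom = rng.uniform(0.8, 1.2)           # zoom in or out by up to 20 %
    h, w = img.shape
    rows = np.clip((np.arange(h) / zoom).astype(int), 0, h - 1)
    cols = np.clip((np.arange(w) / zoom).astype(int), 0, w - 1)
    return img[rows][:, cols]              # nearest-neighbour resampling

rng = np.random.default_rng(0)
img = np.arange(64, dtype=np.uint8).reshape(8, 8)
aug = augment(img, rng)                    # fresh random transform per image
```

Each training image receives a fresh random transform at every epoch, so the network rarely sees the exact same pixels twice.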
      <p id="d2e898">The feature extractor, fully connected and classification layers were trained for 10 epochs (5 epochs for EfficientNets). Monitoring the loss on the validation set revealed that this was sufficient for exhaustive training (Fig. S1). The optimizer used the Adam algorithm, with a decaying learning rate from an initial value of 0.0005 and a decay rate of 0.97 per epoch. Similarly to the optimization of the number of trees of the RF models, the number of training epochs was optimized by retaining the parameters associated with the epoch presenting the minimum validation loss, hence reducing overfitting (Smith, 2018).</p>
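The resulting learning-rate schedule is a simple exponential decay, which can be written as:

```python
def learning_rate(epoch, initial=0.0005, decay=0.97):
    """Learning rate at a given epoch: 3 % exponential decay per epoch."""
    return initial * decay ** epoch

lrs = [learning_rate(e) for e in range(10)]  # schedule for 10 epochs
```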
</sec>
<sec id="Ch1.S2.SS2.SSS3">
  <label>2.2.3</label><title>Hybrid approaches</title>
      <p id="d2e909">Finally, to disentangle the effects of the feature extractor (either handcrafted or deep) and the classifier (either a RF or a MLP), the deep features produced by the fine-tuned MobileNet V2 (<inline-formula><mml:math id="M17" display="inline"><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> 1792) were used to train a RF classifier. Furthermore, to compare RFs trained on similar numbers of features and to evaluate the importance of feature richness, we reduced the dimension of those deep features from 1792 to 50 using a principal component analysis (PCA) fitted on the training set only, before feeding them into the RF classifier. These two “hybrid” approaches are illustrated in the last two rows of Fig. 1.</p>
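This hybrid pipeline can be sketched with scikit-learn; the random arrays below stand in for the 1792-long deep features and labels of a real dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# stand-ins for the 1792-long deep features of the fine-tuned MobileNet V2
X_train = rng.normal(size=(300, 1792))
y_train = rng.integers(0, 3, size=300)
X_test = rng.normal(size=(50, 1792))

# PCA with 50 components, fitted on the training split only
pca = PCA(n_components=50).fit(X_train)

# RF trained on the reduced deep features, applied to the test split
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(pca.transform(X_train), y_train)
pred = rf.predict(pca.transform(X_test))
```

Fitting the PCA on the training split only avoids leaking information from the validation and test splits into the dimensionality reduction.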
</sec>
<sec id="Ch1.S2.SS2.SSS4">
  <label>2.2.4</label><title>Class weights</title>
      <p id="d2e931">In an unbalanced dataset, well-represented classes are given more importance because examples from these classes are more frequent in the loss calculation, while very small classes are almost negligible. As a result, performance on these small classes is often very poor (Luo et al., 2018; Schröder et al., 2019). To address this imbalance, training data can be resampled to achieve a more balanced distribution (e.g. oversampling poorly represented classes and/or undersampling dominant classes), a set of methods known as dataset-level approaches (Sun et al., 2009). Alternatively, the classifier can be tuned so that the misclassification cost is higher for small classes (i.e. algorithm-level approaches). Although both types of methods were shown to improve classification performance in some situations (e.g. for a binary classification task, McCarthy et al., 2005), resampling forces the model to learn on an artificial, balanced class distribution; when the real-world data have a different (often skewed) distribution, the learned decision thresholds become mis-calibrated and performance degrades (Moreno-Torres et al., 2012; González et al., 2017). Thus, a class-weighted loss was implemented to increase the cost of misclassifying rare plankton classes. Class weights can be set as the inverse frequency of classes, or as smoother alternatives such as the square root or fourth root of the inverse frequency (Cui et al., 2019); here, the fourth root was used, which gives, for class <inline-formula><mml:math id="M18" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>:

              <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M19" display="block"><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="normal">max</mml:mi><mml:mo>(</mml:mo><mml:mi>c</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">0.25</mml:mn></mml:msup></mml:mrow></mml:math></disp-formula>

            The effect of these per-class weights was investigated by training both weighted and non-weighted versions of a RF on native features and of the reference CNN (Mob <inline-formula><mml:math id="M20" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP<sub>600</sub>; Fig. 1).</p>
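Equation (1) can be implemented in a few lines; the class counts below are illustrative, not those of any benchmark dataset.

```python
import numpy as np

def class_weights(counts, power=0.25):
    """Per-class weights from Eq. (1): w_i = (max(c) / c_i) ** power.
    power=0.25 is the fourth root of the inverse relative frequency."""
    counts = np.asarray(counts, dtype=float)
    return (counts.max() / counts) ** power

# illustrative class sizes: one dominant class, two rare ones
w = class_weights([10000, 100, 10])
```

The fourth root keeps the weights of very rare classes moderate: a class 1000 times smaller than the dominant one is upweighted by a factor of about 5.6 rather than 1000.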

      <fig id="F1" specific-use="star"><label>Figure 1</label><caption><p id="d2e993">Description of the models tested. Each model consists of a feature extractor and a classifier, and is named accordingly. For each model, the brown line represents the feature vector and its length is indicated. For MLPs, the number in subscript gives the size of the fully connected layer. RF <inline-formula><mml:math id="M22" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> Random Forest, MLP <inline-formula><mml:math id="M23" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> Multilayer Perceptron, NW <inline-formula><mml:math id="M24" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> no weights (i.e. learning not weighted by class size), PCA <inline-formula><mml:math id="M25" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> Principal Component Analysis. The colors defined here are consistent with other figures. The MobileNet V2 with a fully connected layer of size 600 (Mob <inline-formula><mml:math id="M26" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP<sub>600</sub>, in dark blue) will be considered as a reference model and repeated in all figures.</p></caption>
            <graphic xlink:href="https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026-f01.png"/>

          </fig>

</sec>
<sec id="Ch1.S2.SS2.SSS5">
  <label>2.2.5</label><title>Model evaluation</title>
      <p id="d2e1055">After each model in Fig. 1 was trained and tuned for either the number of trees (for classical models) or the number of epochs (for CNN) on each dataset, models were evaluated on the test split, to which they had not been previously exposed. Usual metrics were computed: accuracy score (percentage of objects correctly classified), balanced accuracy, macro-averaged <inline-formula><mml:math id="M28" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>-score, micro-averaged <inline-formula><mml:math id="M29" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>-score, class-wise precision (percentage correct in the predicted class) and recall (percentage correct within the true class).</p>
      <p id="d2e1080">In datasets with strong class imbalance – such as many plankton datasets – accuracy alone can be misleading. For instance, in an 11-class dataset with one dominant class comprising 90 % of the data (and each of the other classes making up only 1 %), a classifier that always predicts the dominant class would achieve 90 % accuracy but would provide no insight into the ten minority classes. A random classifier that draws labels according to the empirical class distribution would yield a lower-bound 81 % accuracy (0.9<inline-formula><mml:math id="M30" display="inline"><mml:mrow><mml:msup><mml:mi/><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>+</mml:mo></mml:mrow></mml:math></inline-formula> 10  <inline-formula><mml:math id="M31" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 0.01<sup>2</sup>). This baseline reflects the underlying distribution while still producing a full confusion matrix that can be used to compute metrics such as precision and recall. In addition, the balanced accuracy score, computed as the simple average of per-class recall scores, was also computed, as it is a better estimate of model performance in such a scenario (Kelleher et al., 2020).</p>
      <p id="d2e1111">Furthermore, in the case of plankton datasets, the dominant classes are often not plankton (detritus, mix, etc.). The accuracy value is mostly driven by these classes (Orenstein et al., 2015) and, therefore, does not provide any information about the performance on plankton classes, which are often the subject of study. To focus on these classes, we also computed the average of precision and recall per class, weighted by the number of objects in the class, but using only plankton classes, i.e. the target classes (Owen et al., 2025). Averaged plankton recall gives a direct indication of the proportion of planktonic organisms that were correctly predicted, while averaged plankton precision reflects how free the predicted plankton classes are from false positives.</p>
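These metrics can be sketched with scikit-learn on a toy test set with one plankton class (copepod) and one dominant non-plankton class (detritus); the counts are illustrative.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, precision_score, recall_score

plankton = {"copepod", "appendicularia"}           # target (living) classes

# toy test set: 10 copepods, 90 detritus
y_true = np.array(["copepod"] * 10 + ["detritus"] * 90)
# 8/10 copepods found; 5 detritus wrongly predicted as copepods
y_pred = np.array(["copepod"] * 8 + ["detritus"] * 2
                  + ["copepod"] * 5 + ["detritus"] * 85)

bacc = balanced_accuracy_score(y_true, y_pred)     # mean of per-class recalls

# precision and recall averaged over plankton classes only,
# weighted by the number of objects per class in the test set
labels = sorted(set(y_true) & plankton)
p_prec = precision_score(y_true, y_pred, labels=labels, average="weighted")
p_rec = recall_score(y_true, y_pred, labels=labels, average="weighted")
```

Here plankton recall is 0.8 (8 of 10 copepods found) while plankton precision is lower (8 of 13 predicted copepods are correct), illustrating how false positives from the dominant detritus class dilute the target classes.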
</sec>
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Results</title>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>Training time</title>
      <p id="d2e1131">Training and evaluation times were always shorter for the classical approach (using pre-extracted handcrafted features and a RF classifier) than for CNNs (which combined feature extraction and classification). Running on 12 CPU cores, grid search, training, and evaluation for the RF classifier based on native features took less than an hour for the smallest dataset (ISIIS, <inline-formula><mml:math id="M33" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 400 000 objects) and a few hours for the IFCB dataset (<inline-formula><mml:math id="M34" display="inline"><mml:mo lspace="0mm">∼</mml:mo></mml:math></inline-formula> 1.6 M objects). The extraction of handcrafted features could not be reliably timed, as it is performed using very different software, but is usually on the order of hours for about a million objects. In contrast, it took 5 h to train the MobileNet V2 <inline-formula><mml:math id="M35" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP<sub>600</sub> for 10 epochs on the ISIIS dataset but 15 h for the same number of epochs on the IFCB dataset, using a Quadro RTX 8000 GPU.</p>
</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Benchmark performance of MobileNet V2, our reference model</title>
      <p id="d2e1172">On the six large and realistic plankton image datasets included in this study, a small CNN model (MobileNet V2) trained with per-class weights achieved strong performance while remaining easy to implement. The balanced accuracy across all classes ranged from 79 % to 90 %, with plankton class precision and recall reaching 80 %, except for the ISIIS and UVP6 datasets. These benchmark results are further compared to other approaches in the following sections.</p>

<table-wrap id="T4a" specific-use="star"><label>Table 4</label><caption><p id="d2e1178">Classification report for detailed classes in the ZooScan dataset. Reported values are <inline-formula><mml:math id="M37" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>-scores. <inline-formula><mml:math id="M38" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> test indicates the number of objects in the test set for each class. A colored version of this table is available in Table S7.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="7">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Class</oasis:entry>
         <oasis:entry colname="col2">Grouped</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M39" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> test</oasis:entry>
         <oasis:entry colname="col4">Nat <inline-formula><mml:math id="M40" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Mob <inline-formula><mml:math id="M41" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6">Eff S <inline-formula><mml:math id="M42" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">Mob <inline-formula><mml:math id="M43" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> PCA</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">RF</oasis:entry>
         <oasis:entry colname="col5">MLP600</oasis:entry>
         <oasis:entry colname="col6">MLP600</oasis:entry>
         <oasis:entry colname="col7"><inline-formula><mml:math id="M44" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> RF</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry namest="col3" nameend="col7" align="center">Plankton classes </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Actinopterygii</oasis:entry>
         <oasis:entry colname="col2">Actinopterygii</oasis:entry>
         <oasis:entry colname="col3">289</oasis:entry>
         <oasis:entry colname="col4">23.8</oasis:entry>
         <oasis:entry colname="col5">87.9</oasis:entry>
         <oasis:entry colname="col6">91.6</oasis:entry>
         <oasis:entry colname="col7">94.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">egg <inline-formula><mml:math id="M45" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Actinopterygii</oasis:entry>
         <oasis:entry colname="col2">Actinopterygii</oasis:entry>
         <oasis:entry colname="col3">689</oasis:entry>
         <oasis:entry colname="col4">35.3</oasis:entry>
         <oasis:entry colname="col5">88.3</oasis:entry>
         <oasis:entry colname="col6">88.3</oasis:entry>
         <oasis:entry colname="col7">90.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Neoceratium</oasis:entry>
         <oasis:entry colname="col2">Alveolata</oasis:entry>
         <oasis:entry colname="col3">53</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">92.3</oasis:entry>
         <oasis:entry colname="col6">89.5</oasis:entry>
         <oasis:entry colname="col7">92.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Noctiluca</oasis:entry>
         <oasis:entry colname="col2">Alveolata</oasis:entry>
         <oasis:entry colname="col3">980</oasis:entry>
         <oasis:entry colname="col4">54.6</oasis:entry>
         <oasis:entry colname="col5">92.7</oasis:entry>
         <oasis:entry colname="col6">90.2</oasis:entry>
         <oasis:entry colname="col7">92.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Amphipoda</oasis:entry>
         <oasis:entry colname="col2">Amphipoda</oasis:entry>
         <oasis:entry colname="col3">125</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">82.7</oasis:entry>
         <oasis:entry colname="col6">86.1</oasis:entry>
         <oasis:entry colname="col7">90.1</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Cumacea</oasis:entry>
         <oasis:entry colname="col2">Amphipoda</oasis:entry>
         <oasis:entry colname="col3">78</oasis:entry>
         <oasis:entry colname="col4">30.4</oasis:entry>
         <oasis:entry colname="col5">91.2</oasis:entry>
         <oasis:entry colname="col6">94.0</oasis:entry>
         <oasis:entry colname="col7">94.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Hyperiidea</oasis:entry>
         <oasis:entry colname="col2">Amphipoda</oasis:entry>
         <oasis:entry colname="col3">289</oasis:entry>
         <oasis:entry colname="col4">26.1</oasis:entry>
         <oasis:entry colname="col5">90.2</oasis:entry>
         <oasis:entry colname="col6">93.4</oasis:entry>
         <oasis:entry colname="col7">92.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Annelida</oasis:entry>
         <oasis:entry colname="col2">Annelida</oasis:entry>
         <oasis:entry colname="col3">349</oasis:entry>
         <oasis:entry colname="col4">21.3</oasis:entry>
         <oasis:entry colname="col5">85.0</oasis:entry>
         <oasis:entry colname="col6">85.9</oasis:entry>
         <oasis:entry colname="col7">87.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">larvae <inline-formula><mml:math id="M46" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Annelida</oasis:entry>
         <oasis:entry colname="col2">Annelida</oasis:entry>
         <oasis:entry colname="col3">50</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">72.9</oasis:entry>
         <oasis:entry colname="col6">75.2</oasis:entry>
         <oasis:entry colname="col7">75.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">part <inline-formula><mml:math id="M47" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Annelida</oasis:entry>
         <oasis:entry colname="col2">Annelida</oasis:entry>
         <oasis:entry colname="col3">149</oasis:entry>
         <oasis:entry colname="col4">35.7</oasis:entry>
         <oasis:entry colname="col5">86.2</oasis:entry>
         <oasis:entry colname="col6">85.4</oasis:entry>
         <oasis:entry colname="col7">88.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Tomopteridae</oasis:entry>
         <oasis:entry colname="col2">Annelida</oasis:entry>
         <oasis:entry colname="col3">83</oasis:entry>
         <oasis:entry colname="col4">7.0</oasis:entry>
         <oasis:entry colname="col5">92.1</oasis:entry>
         <oasis:entry colname="col6">91.8</oasis:entry>
         <oasis:entry colname="col7">89.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Fritillariidae</oasis:entry>
         <oasis:entry colname="col2">Appendicularia</oasis:entry>
         <oasis:entry colname="col3">1820</oasis:entry>
         <oasis:entry colname="col4">28.1</oasis:entry>
         <oasis:entry colname="col5">89.7</oasis:entry>
         <oasis:entry colname="col6">88.9</oasis:entry>
         <oasis:entry colname="col7">90.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Oikopleuridae</oasis:entry>
         <oasis:entry colname="col2">Appendicularia</oasis:entry>
         <oasis:entry colname="col3">4967</oasis:entry>
         <oasis:entry colname="col4">39.4</oasis:entry>
         <oasis:entry colname="col5">94.2</oasis:entry>
         <oasis:entry colname="col6">94.5</oasis:entry>
         <oasis:entry colname="col7">95.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">tail <inline-formula><mml:math id="M48" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Appendicularia</oasis:entry>
         <oasis:entry colname="col2">Appendicularia</oasis:entry>
         <oasis:entry colname="col3">1243</oasis:entry>
         <oasis:entry colname="col4">48.6</oasis:entry>
         <oasis:entry colname="col5">85.2</oasis:entry>
         <oasis:entry colname="col6">84.4</oasis:entry>
         <oasis:entry colname="col7">86.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">trunk</oasis:entry>
         <oasis:entry colname="col2">Appendicularia</oasis:entry>
         <oasis:entry colname="col3">193</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">67.3</oasis:entry>
         <oasis:entry colname="col6">67.1</oasis:entry>
         <oasis:entry colname="col7">72.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Chaetognatha</oasis:entry>
         <oasis:entry colname="col2">Chaetognatha</oasis:entry>
         <oasis:entry colname="col3">7859</oasis:entry>
         <oasis:entry colname="col4">75.4</oasis:entry>
         <oasis:entry colname="col5">97.3</oasis:entry>
         <oasis:entry colname="col6">97.6</oasis:entry>
         <oasis:entry colname="col7">97.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">head <inline-formula><mml:math id="M49" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Chaetognatha</oasis:entry>
         <oasis:entry colname="col2">Chaetognatha</oasis:entry>
         <oasis:entry colname="col3">190</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">56.9</oasis:entry>
         <oasis:entry colname="col6">69.8</oasis:entry>
         <oasis:entry colname="col7">72.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">tail <inline-formula><mml:math id="M50" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Chaetognatha</oasis:entry>
         <oasis:entry colname="col2">Chaetognatha</oasis:entry>
         <oasis:entry colname="col3">555</oasis:entry>
         <oasis:entry colname="col4">15.3</oasis:entry>
         <oasis:entry colname="col5">73.0</oasis:entry>
         <oasis:entry colname="col6">75.0</oasis:entry>
         <oasis:entry colname="col7">77.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">cirrus</oasis:entry>
         <oasis:entry colname="col2">Cirripedia</oasis:entry>
         <oasis:entry colname="col3">60</oasis:entry>
         <oasis:entry colname="col4">9.1</oasis:entry>
         <oasis:entry colname="col5">68.5</oasis:entry>
         <oasis:entry colname="col6">59.5</oasis:entry>
         <oasis:entry colname="col7">68.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">cypris</oasis:entry>
         <oasis:entry colname="col2">Cirripedia</oasis:entry>
         <oasis:entry colname="col3">147</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">87.9</oasis:entry>
         <oasis:entry colname="col6">92.8</oasis:entry>
         <oasis:entry colname="col7">91.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">nauplii <inline-formula><mml:math id="M51" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Cirripedia</oasis:entry>
         <oasis:entry colname="col2">Cirripedia</oasis:entry>
         <oasis:entry colname="col3">649</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">92.2</oasis:entry>
         <oasis:entry colname="col6">92.4</oasis:entry>
         <oasis:entry colname="col7">94.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Evadne</oasis:entry>
         <oasis:entry colname="col2">Cladocera</oasis:entry>
         <oasis:entry colname="col3">5003</oasis:entry>
         <oasis:entry colname="col4">17.1</oasis:entry>
         <oasis:entry colname="col5">96.8</oasis:entry>
         <oasis:entry colname="col6">97.1</oasis:entry>
         <oasis:entry colname="col7">97.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Penilia</oasis:entry>
         <oasis:entry colname="col2">Cladocera</oasis:entry>
         <oasis:entry colname="col3">3592</oasis:entry>
         <oasis:entry colname="col4">39.9</oasis:entry>
         <oasis:entry colname="col5">96.8</oasis:entry>
         <oasis:entry colname="col6">97.0</oasis:entry>
         <oasis:entry colname="col7">97.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Podon</oasis:entry>
         <oasis:entry colname="col2">Cladocera</oasis:entry>
         <oasis:entry colname="col3">292</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">88.3</oasis:entry>
         <oasis:entry colname="col6">87.8</oasis:entry>
         <oasis:entry colname="col7">87.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Acartiidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">8853</oasis:entry>
         <oasis:entry colname="col4">24.2</oasis:entry>
         <oasis:entry colname="col5">95.5</oasis:entry>
         <oasis:entry colname="col6">95.4</oasis:entry>
         <oasis:entry colname="col7">95.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Calanidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">6190</oasis:entry>
         <oasis:entry colname="col4">33.0</oasis:entry>
         <oasis:entry colname="col5">96.3</oasis:entry>
         <oasis:entry colname="col6">96.4</oasis:entry>
         <oasis:entry colname="col7">97.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Calanoida</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">22 713</oasis:entry>
         <oasis:entry colname="col4">57.6</oasis:entry>
         <oasis:entry colname="col5">94.3</oasis:entry>
         <oasis:entry colname="col6">94.3</oasis:entry>
         <oasis:entry colname="col7">94.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Calocalanus pavo</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">71</oasis:entry>
         <oasis:entry colname="col4">2.7</oasis:entry>
         <oasis:entry colname="col5">84.2</oasis:entry>
         <oasis:entry colname="col6">85.5</oasis:entry>
         <oasis:entry colname="col7">89.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Candaciidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">1767</oasis:entry>
         <oasis:entry colname="col4">11.9</oasis:entry>
         <oasis:entry colname="col5">95.5</oasis:entry>
         <oasis:entry colname="col6">95.1</oasis:entry>
         <oasis:entry colname="col7">95.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Centropagidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">6890</oasis:entry>
         <oasis:entry colname="col4">32.8</oasis:entry>
         <oasis:entry colname="col5">94.6</oasis:entry>
         <oasis:entry colname="col6">94.6</oasis:entry>
         <oasis:entry colname="col7">95.1</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Copilia</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">99</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">88.5</oasis:entry>
         <oasis:entry colname="col6">94.2</oasis:entry>
         <oasis:entry colname="col7">95.1</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Corycaeidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">3576</oasis:entry>
         <oasis:entry colname="col4">28.5</oasis:entry>
         <oasis:entry colname="col5">96.3</oasis:entry>
         <oasis:entry colname="col6">96.6</oasis:entry>
         <oasis:entry colname="col7">97.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Eucalanidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">183</oasis:entry>
         <oasis:entry colname="col4">16.8</oasis:entry>
         <oasis:entry colname="col5">88.4</oasis:entry>
         <oasis:entry colname="col6">90.2</oasis:entry>
         <oasis:entry colname="col7">91.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Euchaetidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">1019</oasis:entry>
         <oasis:entry colname="col4">21.3</oasis:entry>
         <oasis:entry colname="col5">94.2</oasis:entry>
         <oasis:entry colname="col6">94.1</oasis:entry>
         <oasis:entry colname="col7">96.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Haloptilus</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">407</oasis:entry>
         <oasis:entry colname="col4">31.8</oasis:entry>
         <oasis:entry colname="col5">95.6</oasis:entry>
         <oasis:entry colname="col6">95.4</oasis:entry>
         <oasis:entry colname="col7">96.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Harpacticoida</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">832</oasis:entry>
         <oasis:entry colname="col4">0.2</oasis:entry>
         <oasis:entry colname="col5">90.7</oasis:entry>
         <oasis:entry colname="col6">92.7</oasis:entry>
         <oasis:entry colname="col7">93.1</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Heterorhabdidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">355</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">87.6</oasis:entry>
         <oasis:entry colname="col6">86.2</oasis:entry>
         <oasis:entry colname="col7">89.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Metridinidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">2439</oasis:entry>
         <oasis:entry colname="col4">14.7</oasis:entry>
         <oasis:entry colname="col5">94.6</oasis:entry>
         <oasis:entry colname="col6">94.6</oasis:entry>
         <oasis:entry colname="col7">95.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Oithonidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">9847</oasis:entry>
         <oasis:entry colname="col4">59.2</oasis:entry>
         <oasis:entry colname="col5">96.6</oasis:entry>
         <oasis:entry colname="col6">96.6</oasis:entry>
         <oasis:entry colname="col7">97.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Oncaeidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">3070</oasis:entry>
         <oasis:entry colname="col4">9.1</oasis:entry>
         <oasis:entry colname="col5">93.4</oasis:entry>
         <oasis:entry colname="col6">94.2</oasis:entry>
         <oasis:entry colname="col7">94.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Pontellidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">1080</oasis:entry>
         <oasis:entry colname="col4">54.8</oasis:entry>
         <oasis:entry colname="col5">97.0</oasis:entry>
         <oasis:entry colname="col6">96.5</oasis:entry>
         <oasis:entry colname="col7">98.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Rhincalanidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">35</oasis:entry>
         <oasis:entry colname="col4">52.0</oasis:entry>
         <oasis:entry colname="col5">70.2</oasis:entry>
         <oasis:entry colname="col6">78.3</oasis:entry>
         <oasis:entry colname="col7">85.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Sapphirinidae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">162</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">91.8</oasis:entry>
         <oasis:entry colname="col6">91.2</oasis:entry>
         <oasis:entry colname="col7">91.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Temoridae</oasis:entry>
         <oasis:entry colname="col2">Copepoda</oasis:entry>
         <oasis:entry colname="col3">4549</oasis:entry>
         <oasis:entry colname="col4">23.4</oasis:entry>
         <oasis:entry colname="col5">96.0</oasis:entry>
         <oasis:entry colname="col6">96.0</oasis:entry>
         <oasis:entry colname="col7">96.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Ctenophora</oasis:entry>
         <oasis:entry colname="col2">Ctenophora</oasis:entry>
         <oasis:entry colname="col3">137</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">67.0</oasis:entry>
         <oasis:entry colname="col6">72.3</oasis:entry>
         <oasis:entry colname="col7">81.1</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">cyphonaute</oasis:entry>
         <oasis:entry colname="col2">cyphonaute</oasis:entry>
         <oasis:entry colname="col3">1334</oasis:entry>
         <oasis:entry colname="col4">29.8</oasis:entry>
         <oasis:entry colname="col5">98.4</oasis:entry>
         <oasis:entry colname="col6">98.5</oasis:entry>
         <oasis:entry colname="col7">98.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">larvae <inline-formula><mml:math id="M52" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Luciferidae</oasis:entry>
         <oasis:entry colname="col2">Decapoda</oasis:entry>
         <oasis:entry colname="col3">98</oasis:entry>
         <oasis:entry colname="col4">16.4</oasis:entry>
         <oasis:entry colname="col5">95.2</oasis:entry>
         <oasis:entry colname="col6">95.4</oasis:entry>
         <oasis:entry colname="col7">97.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">larvae <inline-formula><mml:math id="M53" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Porcellanidae</oasis:entry>
         <oasis:entry colname="col2">Decapoda</oasis:entry>
         <oasis:entry colname="col3">748</oasis:entry>
         <oasis:entry colname="col4">64.2</oasis:entry>
         <oasis:entry colname="col5">96.2</oasis:entry>
         <oasis:entry colname="col6">97.4</oasis:entry>
         <oasis:entry colname="col7">98.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">megalopa</oasis:entry>
         <oasis:entry colname="col2">Decapoda</oasis:entry>
         <oasis:entry colname="col3">213</oasis:entry>
         <oasis:entry colname="col4">27.9</oasis:entry>
         <oasis:entry colname="col5">95.9</oasis:entry>
         <oasis:entry colname="col6">95.2</oasis:entry>
         <oasis:entry colname="col7">96.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">protozoea <inline-formula><mml:math id="M54" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Penaeidae</oasis:entry>
         <oasis:entry colname="col2">Decapoda</oasis:entry>
         <oasis:entry colname="col3">59</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">84.2</oasis:entry>
         <oasis:entry colname="col6">87.6</oasis:entry>
         <oasis:entry colname="col7">92.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">protozoea <inline-formula><mml:math id="M55" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Sergestidae</oasis:entry>
         <oasis:entry colname="col2">Decapoda</oasis:entry>
         <oasis:entry colname="col3">89</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">78.5</oasis:entry>
         <oasis:entry colname="col6">71.7</oasis:entry>
         <oasis:entry colname="col7">81.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">zoea <inline-formula><mml:math id="M56" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Brachyura</oasis:entry>
         <oasis:entry colname="col2">Decapoda</oasis:entry>
         <oasis:entry colname="col3">1750</oasis:entry>
         <oasis:entry colname="col4">40.0</oasis:entry>
         <oasis:entry colname="col5">95.7</oasis:entry>
         <oasis:entry colname="col6">96.7</oasis:entry>
         <oasis:entry colname="col7">97.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">zoea <inline-formula><mml:math id="M57" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Galatheidae</oasis:entry>
         <oasis:entry colname="col2">Decapoda</oasis:entry>
         <oasis:entry colname="col3">759</oasis:entry>
         <oasis:entry colname="col4">1.3</oasis:entry>
         <oasis:entry colname="col5">88.1</oasis:entry>
         <oasis:entry colname="col6">88.3</oasis:entry>
         <oasis:entry colname="col7">89.3</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<table-wrap id="T4b" specific-use="star"><label>Table 4</label><caption><p id="d2e2747">Continued.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="7">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Class</oasis:entry>
         <oasis:entry colname="col2">Grouped</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M58" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> test</oasis:entry>
         <oasis:entry colname="col4">Nat <inline-formula><mml:math id="M59" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Mob <inline-formula><mml:math id="M60" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6">Eff S <inline-formula><mml:math id="M61" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">Mob <inline-formula><mml:math id="M62" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> PCA</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">RF</oasis:entry>
         <oasis:entry colname="col5">MLP600</oasis:entry>
         <oasis:entry colname="col6">MLP600</oasis:entry>
         <oasis:entry colname="col7"><inline-formula><mml:math id="M63" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> RF</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry namest="col3" nameend="col7" align="center">Plankton classes </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Doliolida</oasis:entry>
         <oasis:entry colname="col2">Doliolida</oasis:entry>
         <oasis:entry colname="col3">1461</oasis:entry>
         <oasis:entry colname="col4">37.7</oasis:entry>
         <oasis:entry colname="col5">93.2</oasis:entry>
         <oasis:entry colname="col6">92.4</oasis:entry>
         <oasis:entry colname="col7">93.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">larvae <inline-formula><mml:math id="M64" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Echinodermata</oasis:entry>
         <oasis:entry colname="col2">Echinodermata</oasis:entry>
         <oasis:entry colname="col3">76</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">80.6</oasis:entry>
         <oasis:entry colname="col6">76.6</oasis:entry>
         <oasis:entry colname="col7">84.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">pluteus <inline-formula><mml:math id="M65" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Echinoidea</oasis:entry>
         <oasis:entry colname="col2">Echinodermata</oasis:entry>
         <oasis:entry colname="col3">361</oasis:entry>
         <oasis:entry colname="col4">26.8</oasis:entry>
         <oasis:entry colname="col5">86.7</oasis:entry>
         <oasis:entry colname="col6">87.8</oasis:entry>
         <oasis:entry colname="col7">89.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">pluteus <inline-formula><mml:math id="M66" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Ophiuroidea</oasis:entry>
         <oasis:entry colname="col2">Echinodermata</oasis:entry>
         <oasis:entry colname="col3">542</oasis:entry>
         <oasis:entry colname="col4">13.4</oasis:entry>
         <oasis:entry colname="col5">91.0</oasis:entry>
         <oasis:entry colname="col6">92.5</oasis:entry>
         <oasis:entry colname="col7">92.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Eumalacostraca</oasis:entry>
         <oasis:entry colname="col2">Eumalacostraca</oasis:entry>
         <oasis:entry colname="col3">3453</oasis:entry>
         <oasis:entry colname="col4">61.3</oasis:entry>
         <oasis:entry colname="col5">91.4</oasis:entry>
         <oasis:entry colname="col6">91.7</oasis:entry>
         <oasis:entry colname="col7">92.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Eumalacostraca potentially protozoea</oasis:entry>
         <oasis:entry colname="col2">Eumalacostraca</oasis:entry>
         <oasis:entry colname="col3">225</oasis:entry>
         <oasis:entry colname="col4">26.1</oasis:entry>
         <oasis:entry colname="col5">83.0</oasis:entry>
         <oasis:entry colname="col6">81.4</oasis:entry>
         <oasis:entry colname="col7">83.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">larvae <inline-formula><mml:math id="M67" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Mysida</oasis:entry>
         <oasis:entry colname="col2">Eumalacostraca</oasis:entry>
         <oasis:entry colname="col3">14</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">72.7</oasis:entry>
         <oasis:entry colname="col6">88.9</oasis:entry>
         <oasis:entry colname="col7">82.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Mysida</oasis:entry>
         <oasis:entry colname="col2">Eumalacostraca</oasis:entry>
         <oasis:entry colname="col3">120</oasis:entry>
         <oasis:entry colname="col4">76.5</oasis:entry>
         <oasis:entry colname="col5">86.4</oasis:entry>
         <oasis:entry colname="col6">91.6</oasis:entry>
         <oasis:entry colname="col7">94.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Harosa</oasis:entry>
         <oasis:entry colname="col2">Harosa</oasis:entry>
         <oasis:entry colname="col3">244</oasis:entry>
         <oasis:entry colname="col4">1.6</oasis:entry>
         <oasis:entry colname="col5">76.7</oasis:entry>
         <oasis:entry colname="col6">75.1</oasis:entry>
         <oasis:entry colname="col7">74.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Isopoda</oasis:entry>
         <oasis:entry colname="col2">Isopoda</oasis:entry>
         <oasis:entry colname="col3">83</oasis:entry>
         <oasis:entry colname="col4">67.1</oasis:entry>
         <oasis:entry colname="col5">98.8</oasis:entry>
         <oasis:entry colname="col6">97.6</oasis:entry>
         <oasis:entry colname="col7">98.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Atlanta</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">68</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">84.8</oasis:entry>
         <oasis:entry colname="col6">83.9</oasis:entry>
         <oasis:entry colname="col7">90.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Bivalvia <inline-formula><mml:math id="M68" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Mollusca</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">777</oasis:entry>
         <oasis:entry colname="col4">12.6</oasis:entry>
         <oasis:entry colname="col5">95.0</oasis:entry>
         <oasis:entry colname="col6">95.5</oasis:entry>
         <oasis:entry colname="col7">95.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Cavolinia inflexa</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">662</oasis:entry>
         <oasis:entry colname="col4">58.2</oasis:entry>
         <oasis:entry colname="col5">97.5</oasis:entry>
         <oasis:entry colname="col6">96.2</oasis:entry>
         <oasis:entry colname="col7">97.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Creseidae</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">767</oasis:entry>
         <oasis:entry colname="col4">47.4</oasis:entry>
         <oasis:entry colname="col5">93.7</oasis:entry>
         <oasis:entry colname="col6">94.0</oasis:entry>
         <oasis:entry colname="col7">94.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Creseis acicula</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">1294</oasis:entry>
         <oasis:entry colname="col4">67.6</oasis:entry>
         <oasis:entry colname="col5">94.5</oasis:entry>
         <oasis:entry colname="col6">94.4</oasis:entry>
         <oasis:entry colname="col7">94.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Cymbulia peroni</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">14</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">80.0</oasis:entry>
         <oasis:entry colname="col6">72.7</oasis:entry>
         <oasis:entry colname="col7">76.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">egg <inline-formula><mml:math id="M69" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Mollusca</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">129</oasis:entry>
         <oasis:entry colname="col4">1.5</oasis:entry>
         <oasis:entry colname="col5">76.7</oasis:entry>
         <oasis:entry colname="col6">77.0</oasis:entry>
         <oasis:entry colname="col7">75.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Gymnosomata</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">79</oasis:entry>
         <oasis:entry colname="col4">60.4</oasis:entry>
         <oasis:entry colname="col5">92.8</oasis:entry>
         <oasis:entry colname="col6">95.7</oasis:entry>
         <oasis:entry colname="col7">95.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Limacinidae</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">2113</oasis:entry>
         <oasis:entry colname="col4">25.3</oasis:entry>
         <oasis:entry colname="col5">96.1</oasis:entry>
         <oasis:entry colname="col6">96.3</oasis:entry>
         <oasis:entry colname="col7">96.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">part <inline-formula><mml:math id="M70" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Mollusca</oasis:entry>
         <oasis:entry colname="col2">Mollusca</oasis:entry>
         <oasis:entry colname="col3">255</oasis:entry>
         <oasis:entry colname="col4">2.2</oasis:entry>
         <oasis:entry colname="col5">61.9</oasis:entry>
         <oasis:entry colname="col6">55.3</oasis:entry>
         <oasis:entry colname="col7">60.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Actiniaria</oasis:entry>
         <oasis:entry colname="col2">other_Cnidaria</oasis:entry>
         <oasis:entry colname="col3">22</oasis:entry>
         <oasis:entry colname="col4">16.7</oasis:entry>
         <oasis:entry colname="col5">93.0</oasis:entry>
         <oasis:entry colname="col6">93.3</oasis:entry>
         <oasis:entry colname="col7">89.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ephyra</oasis:entry>
         <oasis:entry colname="col2">other_Cnidaria</oasis:entry>
         <oasis:entry colname="col3">179</oasis:entry>
         <oasis:entry colname="col4">36.7</oasis:entry>
         <oasis:entry colname="col5">86.4</oasis:entry>
         <oasis:entry colname="col6">91.5</oasis:entry>
         <oasis:entry colname="col7">91.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Hydrozoa</oasis:entry>
         <oasis:entry colname="col2">other_Cnidaria</oasis:entry>
         <oasis:entry colname="col3">579</oasis:entry>
         <oasis:entry colname="col4">13.6</oasis:entry>
         <oasis:entry colname="col5">74.6</oasis:entry>
         <oasis:entry colname="col6">75.1</oasis:entry>
         <oasis:entry colname="col7">78.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Obelia</oasis:entry>
         <oasis:entry colname="col2">other_Cnidaria</oasis:entry>
         <oasis:entry colname="col3">147</oasis:entry>
         <oasis:entry colname="col4">18.2</oasis:entry>
         <oasis:entry colname="col5">85.9</oasis:entry>
         <oasis:entry colname="col6">85.7</oasis:entry>
         <oasis:entry colname="col7">88.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">part <inline-formula><mml:math id="M71" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Cnidaria</oasis:entry>
         <oasis:entry colname="col2">other_Cnidaria</oasis:entry>
         <oasis:entry colname="col3">125</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">14.8</oasis:entry>
         <oasis:entry colname="col6">44.0</oasis:entry>
         <oasis:entry colname="col7">44.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">calyptopsis</oasis:entry>
         <oasis:entry colname="col2">other_Crustacea</oasis:entry>
         <oasis:entry colname="col3">1205</oasis:entry>
         <oasis:entry colname="col4">12.2</oasis:entry>
         <oasis:entry colname="col5">93.5</oasis:entry>
         <oasis:entry colname="col6">94.3</oasis:entry>
         <oasis:entry colname="col7">93.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">larvae <inline-formula><mml:math id="M72" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Stomatopoda</oasis:entry>
         <oasis:entry colname="col2">other_Crustacea</oasis:entry>
         <oasis:entry colname="col3">245</oasis:entry>
         <oasis:entry colname="col4">46.5</oasis:entry>
         <oasis:entry colname="col5">95.6</oasis:entry>
         <oasis:entry colname="col6">96.5</oasis:entry>
         <oasis:entry colname="col7">98.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">metanauplii <inline-formula><mml:math id="M73" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Crustacea</oasis:entry>
         <oasis:entry colname="col2">other_Crustacea</oasis:entry>
         <oasis:entry colname="col3">37</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">81.8</oasis:entry>
         <oasis:entry colname="col6">85.3</oasis:entry>
         <oasis:entry colname="col7">93.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">nauplii <inline-formula><mml:math id="M74" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Crustacea</oasis:entry>
         <oasis:entry colname="col2">other_Crustacea</oasis:entry>
         <oasis:entry colname="col3">845</oasis:entry>
         <oasis:entry colname="col4">4.6</oasis:entry>
         <oasis:entry colname="col5">91.5</oasis:entry>
         <oasis:entry colname="col6">91.8</oasis:entry>
         <oasis:entry colname="col7">93.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Ostracoda</oasis:entry>
         <oasis:entry colname="col2">other_Crustacea</oasis:entry>
         <oasis:entry colname="col3">1169</oasis:entry>
         <oasis:entry colname="col4">46.4</oasis:entry>
         <oasis:entry colname="col5">96.4</oasis:entry>
         <oasis:entry colname="col6">96.7</oasis:entry>
         <oasis:entry colname="col7">97.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">part <inline-formula><mml:math id="M75" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Crustacea</oasis:entry>
         <oasis:entry colname="col2">other_Crustacea</oasis:entry>
         <oasis:entry colname="col3">3065</oasis:entry>
         <oasis:entry colname="col4">2.6</oasis:entry>
         <oasis:entry colname="col5">63.2</oasis:entry>
         <oasis:entry colname="col6">65.3</oasis:entry>
         <oasis:entry colname="col7">68.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Pyrosomatida</oasis:entry>
         <oasis:entry colname="col2">Pyrosomatida</oasis:entry>
         <oasis:entry colname="col3">75</oasis:entry>
         <oasis:entry colname="col4">22.2</oasis:entry>
         <oasis:entry colname="col5">93.9</oasis:entry>
         <oasis:entry colname="col6">95.4</oasis:entry>
         <oasis:entry colname="col7">94.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Foraminifera</oasis:entry>
         <oasis:entry colname="col2">Rhizaria</oasis:entry>
         <oasis:entry colname="col3">469</oasis:entry>
         <oasis:entry colname="col4">25.7</oasis:entry>
         <oasis:entry colname="col5">89.7</oasis:entry>
         <oasis:entry colname="col6">89.8</oasis:entry>
         <oasis:entry colname="col7">90.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Phaeodaria</oasis:entry>
         <oasis:entry colname="col2">Rhizaria</oasis:entry>
         <oasis:entry colname="col3">8106</oasis:entry>
         <oasis:entry colname="col4">55.1</oasis:entry>
         <oasis:entry colname="col5">96.6</oasis:entry>
         <oasis:entry colname="col6">96.2</oasis:entry>
         <oasis:entry colname="col7">96.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">endostyle</oasis:entry>
         <oasis:entry colname="col2">Salpida</oasis:entry>
         <oasis:entry colname="col3">135</oasis:entry>
         <oasis:entry colname="col4">16.0</oasis:entry>
         <oasis:entry colname="col5">60.4</oasis:entry>
         <oasis:entry colname="col6">58.2</oasis:entry>
         <oasis:entry colname="col7">61.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">juvenile <inline-formula><mml:math id="M76" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Salpida</oasis:entry>
         <oasis:entry colname="col2">Salpida</oasis:entry>
         <oasis:entry colname="col3">67</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">82.3</oasis:entry>
         <oasis:entry colname="col6">84.0</oasis:entry>
         <oasis:entry colname="col7">81.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">nucleus</oasis:entry>
         <oasis:entry colname="col2">Salpida</oasis:entry>
         <oasis:entry colname="col3">222</oasis:entry>
         <oasis:entry colname="col4">11.5</oasis:entry>
         <oasis:entry colname="col5">68.6</oasis:entry>
         <oasis:entry colname="col6">71.4</oasis:entry>
         <oasis:entry colname="col7">74.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Salpida</oasis:entry>
         <oasis:entry colname="col2">Salpida</oasis:entry>
         <oasis:entry colname="col3">2460</oasis:entry>
         <oasis:entry colname="col4">42.1</oasis:entry>
         <oasis:entry colname="col5">92.9</oasis:entry>
         <oasis:entry colname="col6">92.3</oasis:entry>
         <oasis:entry colname="col7">93.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Bassia</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">15</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">57.1</oasis:entry>
         <oasis:entry colname="col6">50.0</oasis:entry>
         <oasis:entry colname="col7">56.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">bract <inline-formula><mml:math id="M77" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Abylopsis tetragona</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">185</oasis:entry>
         <oasis:entry colname="col4">34.9</oasis:entry>
         <oasis:entry colname="col5">91.2</oasis:entry>
         <oasis:entry colname="col6">89.0</oasis:entry>
         <oasis:entry colname="col7">89.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">bract <inline-formula><mml:math id="M78" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Diphyidae</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">2185</oasis:entry>
         <oasis:entry colname="col4">12.0</oasis:entry>
         <oasis:entry colname="col5">85.9</oasis:entry>
         <oasis:entry colname="col6">86.0</oasis:entry>
         <oasis:entry colname="col7">87.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">eudoxie <inline-formula><mml:math id="M79" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Abylopsis tetragona</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">98</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">90.3</oasis:entry>
         <oasis:entry colname="col6">92.1</oasis:entry>
         <oasis:entry colname="col7">89.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">eudoxie <inline-formula><mml:math id="M80" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Diphyidae</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">525</oasis:entry>
         <oasis:entry colname="col4">2.9</oasis:entry>
         <oasis:entry colname="col5">84.3</oasis:entry>
         <oasis:entry colname="col6">86.9</oasis:entry>
         <oasis:entry colname="col7">89.9</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">gonophore <inline-formula><mml:math id="M81" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Abylopsis tetragona</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">199</oasis:entry>
         <oasis:entry colname="col4">12.1</oasis:entry>
         <oasis:entry colname="col5">90.9</oasis:entry>
         <oasis:entry colname="col6">90.2</oasis:entry>
         <oasis:entry colname="col7">93.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">gonophore <inline-formula><mml:math id="M82" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Diphyidae</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">2460</oasis:entry>
         <oasis:entry colname="col4">30.0</oasis:entry>
         <oasis:entry colname="col5">93.2</oasis:entry>
         <oasis:entry colname="col6">93.4</oasis:entry>
         <oasis:entry colname="col7">94.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">nectophore <inline-formula><mml:math id="M83" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Abylopsis tetragona</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">173</oasis:entry>
         <oasis:entry colname="col4">20.7</oasis:entry>
         <oasis:entry colname="col5">88.6</oasis:entry>
         <oasis:entry colname="col6">87.6</oasis:entry>
         <oasis:entry colname="col7">91.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">nectophore <inline-formula><mml:math id="M84" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Diphyidae</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">4417</oasis:entry>
         <oasis:entry colname="col4">63.1</oasis:entry>
         <oasis:entry colname="col5">92.9</oasis:entry>
         <oasis:entry colname="col6">92.2</oasis:entry>
         <oasis:entry colname="col7">93.1</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">nectophore <inline-formula><mml:math id="M85" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Hippopodiidae</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">17</oasis:entry>
         <oasis:entry colname="col4">18.2</oasis:entry>
         <oasis:entry colname="col5">73.3</oasis:entry>
         <oasis:entry colname="col6">81.1</oasis:entry>
         <oasis:entry colname="col7">85.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">nectophore <inline-formula><mml:math id="M86" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Physonectae</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">1386</oasis:entry>
         <oasis:entry colname="col4">59.5</oasis:entry>
         <oasis:entry colname="col5">87.4</oasis:entry>
         <oasis:entry colname="col6">81.8</oasis:entry>
         <oasis:entry colname="col7">84.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">part <inline-formula><mml:math id="M87" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Siphonophorae</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">412</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">66.8</oasis:entry>
         <oasis:entry colname="col6">67.4</oasis:entry>
         <oasis:entry colname="col7">69.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Physonectae</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">16</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">43.5</oasis:entry>
         <oasis:entry colname="col6">48.5</oasis:entry>
         <oasis:entry colname="col7">66.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">siphonula</oasis:entry>
         <oasis:entry colname="col2">Siphonophorae</oasis:entry>
         <oasis:entry colname="col3">144</oasis:entry>
         <oasis:entry colname="col4">19.2</oasis:entry>
         <oasis:entry colname="col5">90.3</oasis:entry>
         <oasis:entry colname="col6">86.1</oasis:entry>
         <oasis:entry colname="col7">89.0</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Coscinodiscus</oasis:entry>
         <oasis:entry colname="col2">Stramenopiles</oasis:entry>
         <oasis:entry colname="col3">1075</oasis:entry>
         <oasis:entry colname="col4">41.2</oasis:entry>
         <oasis:entry colname="col5">97.3</oasis:entry>
         <oasis:entry colname="col6">96.8</oasis:entry>
         <oasis:entry colname="col7">97.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">actinula <inline-formula><mml:math id="M88" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Solmundella bitentaculata</oasis:entry>
         <oasis:entry colname="col2">Trachylina</oasis:entry>
         <oasis:entry colname="col3">19</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">68.8</oasis:entry>
         <oasis:entry colname="col6">78.9</oasis:entry>
         <oasis:entry colname="col7">82.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Aglaura</oasis:entry>
         <oasis:entry colname="col2">Trachylina</oasis:entry>
         <oasis:entry colname="col3">455</oasis:entry>
         <oasis:entry colname="col4">57.9</oasis:entry>
         <oasis:entry colname="col5">91.8</oasis:entry>
         <oasis:entry colname="col6">91.7</oasis:entry>
         <oasis:entry colname="col7">93.0</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<table-wrap id="T4c" specific-use="star"><label>Table 4</label><caption><p id="d2e4433">Continued.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="7">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Class</oasis:entry>
         <oasis:entry colname="col2">Grouped</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M89" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> test</oasis:entry>
         <oasis:entry colname="col4">Nat <inline-formula><mml:math id="M90" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Mob <inline-formula><mml:math id="M91" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6">Eff S <inline-formula><mml:math id="M92" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">Mob <inline-formula><mml:math id="M93" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> PCA</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">RF</oasis:entry>
         <oasis:entry colname="col5">MLP600</oasis:entry>
         <oasis:entry colname="col6">MLP600</oasis:entry>
         <oasis:entry colname="col7"><inline-formula><mml:math id="M94" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> RF</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry namest="col3" nameend="col7" align="center">Plankton classes </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Liriope <inline-formula><mml:math id="M95" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> Geryoniidae</oasis:entry>
         <oasis:entry colname="col2">Trachylina</oasis:entry>
         <oasis:entry colname="col3">34</oasis:entry>
         <oasis:entry colname="col4">0.0</oasis:entry>
         <oasis:entry colname="col5">52.0</oasis:entry>
         <oasis:entry colname="col6">73.0</oasis:entry>
         <oasis:entry colname="col7">78.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Rhopalonema velatum</oasis:entry>
         <oasis:entry colname="col2">Trachylina</oasis:entry>
         <oasis:entry colname="col3">373</oasis:entry>
         <oasis:entry colname="col4">49.1</oasis:entry>
         <oasis:entry colname="col5">85.6</oasis:entry>
         <oasis:entry colname="col6">85.2</oasis:entry>
         <oasis:entry colname="col7">87.2</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Solmundella bitentaculata</oasis:entry>
         <oasis:entry colname="col2">Trachylina</oasis:entry>
         <oasis:entry colname="col3">56</oasis:entry>
         <oasis:entry colname="col4">3.5</oasis:entry>
         <oasis:entry colname="col5">67.4</oasis:entry>
         <oasis:entry colname="col6">70.6</oasis:entry>
         <oasis:entry colname="col7">73.4</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">average</oasis:entry>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">22.9</oasis:entry>
         <oasis:entry colname="col5">85.5</oasis:entry>
         <oasis:entry colname="col6">86.6</oasis:entry>
         <oasis:entry colname="col7">88.5</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry namest="col3" nameend="col7" align="center">Non plankton classes </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">artefact</oasis:entry>
         <oasis:entry colname="col2">artefact</oasis:entry>
         <oasis:entry colname="col3">7718</oasis:entry>
         <oasis:entry colname="col4">76.7</oasis:entry>
         <oasis:entry colname="col5">80.8</oasis:entry>
         <oasis:entry colname="col6">80.0</oasis:entry>
         <oasis:entry colname="col7">79.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">badfocus <inline-formula><mml:math id="M96" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> artefact</oasis:entry>
         <oasis:entry colname="col2">badfocus</oasis:entry>
         <oasis:entry colname="col3">6046</oasis:entry>
         <oasis:entry colname="col4">19.6</oasis:entry>
         <oasis:entry colname="col5">63.1</oasis:entry>
         <oasis:entry colname="col6">62.9</oasis:entry>
         <oasis:entry colname="col7">63.1</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">bubble</oasis:entry>
         <oasis:entry colname="col2">bubble</oasis:entry>
         <oasis:entry colname="col3">2432</oasis:entry>
         <oasis:entry colname="col4">19.0</oasis:entry>
         <oasis:entry colname="col5">92.2</oasis:entry>
         <oasis:entry colname="col6">91.0</oasis:entry>
         <oasis:entry colname="col7">91.2</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">detritus</oasis:entry>
         <oasis:entry colname="col2">detritus</oasis:entry>
         <oasis:entry colname="col3">36 260</oasis:entry>
         <oasis:entry colname="col4">55.2</oasis:entry>
         <oasis:entry colname="col5">82.9</oasis:entry>
         <oasis:entry colname="col6">81.4</oasis:entry>
         <oasis:entry colname="col7">81.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">fiber <inline-formula><mml:math id="M97" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> detritus</oasis:entry>
         <oasis:entry colname="col2">fiber</oasis:entry>
         <oasis:entry colname="col3">6708</oasis:entry>
         <oasis:entry colname="col4">62.9</oasis:entry>
         <oasis:entry colname="col5">74.6</oasis:entry>
         <oasis:entry colname="col6">74.7</oasis:entry>
         <oasis:entry colname="col7">74.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Insecta</oasis:entry>
         <oasis:entry colname="col2">Insecta</oasis:entry>
         <oasis:entry colname="col3">169</oasis:entry>
         <oasis:entry colname="col4">27.1</oasis:entry>
         <oasis:entry colname="col5">84.3</oasis:entry>
         <oasis:entry colname="col6">86.9</oasis:entry>
         <oasis:entry colname="col7">89.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">egg <inline-formula><mml:math id="M98" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> other</oasis:entry>
         <oasis:entry colname="col2">other_egg</oasis:entry>
         <oasis:entry colname="col3">2015</oasis:entry>
         <oasis:entry colname="col4">59.7</oasis:entry>
         <oasis:entry colname="col5">92.2</oasis:entry>
         <oasis:entry colname="col6">91.0</oasis:entry>
         <oasis:entry colname="col7">92.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">other <inline-formula><mml:math id="M99" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> living</oasis:entry>
         <oasis:entry colname="col2">other_living</oasis:entry>
         <oasis:entry colname="col3">40</oasis:entry>
         <oasis:entry colname="col4">16.3</oasis:entry>
         <oasis:entry colname="col5">39.2</oasis:entry>
         <oasis:entry colname="col6">59.3</oasis:entry>
         <oasis:entry colname="col7">73.7</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">seaweed</oasis:entry>
         <oasis:entry colname="col2">seaweed</oasis:entry>
         <oasis:entry colname="col3">1272</oasis:entry>
         <oasis:entry colname="col4">35.3</oasis:entry>
         <oasis:entry colname="col5">68.0</oasis:entry>
         <oasis:entry colname="col6">68.2</oasis:entry>
         <oasis:entry colname="col7">66.3</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">average</oasis:entry>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">41.3</oasis:entry>
         <oasis:entry colname="col5">75.2</oasis:entry>
         <oasis:entry colname="col6">77.3</oasis:entry>
         <oasis:entry colname="col7">79.2</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
<sec id="Ch1.S3.SS3">
  <label>3.3</label><title>Rare classes are where CNNs outperform classical approaches</title>
      <p id="d2e4957">In terms of overall accuracy, the CNN showed only a modest improvement over the classical approach of handcrafted features and an RF classifier on five datasets (<inline-formula><mml:math id="M100" display="inline"><mml:mo lspace="0mm">+</mml:mo></mml:math></inline-formula>3.5 % to <inline-formula><mml:math id="M101" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula>13.8 %) (Fig. 2). The exception was the UVP6 dataset, where the improvement was more pronounced (<inline-formula><mml:math id="M102" display="inline"><mml:mo lspace="0mm">&gt;</mml:mo></mml:math></inline-formula> 40 %). The use of class weights slightly decreased the accuracy of both the deep and classical approaches, as it focused training on small classes at the expense of large classes, which contribute more to the computation of accuracy. Note that a random classifier achieved 55 %, 61 % and 63 % accuracy on the detritus-dominated IFCB, ISIIS and UVP6 datasets, respectively. While the accuracies of all non-random models were higher, they must be gauged in terms of the increase over the random model, not in absolute terms.</p>
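The expected scores of such a random, frequency-matched classifier can be checked from the class proportions alone: its expected accuracy is the sum of squared class frequencies, so a detritus-dominated dataset yields a deceptively high baseline. A minimal sketch, using assumed class proportions rather than the actual distributions of the benchmark datasets:

```python
import numpy as np

# A "random" classifier that predicts each class with its training
# frequency p_i is correct on class i with probability p_i, so its
# expected accuracy is sum_i p_i^2, while its expected balanced
# accuracy is mean_i p_i = 1 / n_classes.
# Hypothetical, detritus-dominated class proportions (assumed values):
p = np.array([0.74, 0.10, 0.06, 0.05, 0.03, 0.02])

expected_accuracy = float(np.sum(p ** 2))       # ~0.565
expected_balanced_accuracy = float(np.mean(p))  # 1/6 ~ 0.167
print(expected_accuracy, expected_balanced_accuracy)
```

With one class holding 74 % of the images, the "random" baseline already exceeds 56 % accuracy, which is why models must be judged by their increase over this baseline rather than in absolute terms.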

      <fig id="F2" specific-use="star"><label>Figure 2</label><caption><p id="d2e4983">Performance comparison between a small CNN (Mob <inline-formula><mml:math id="M103" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP600), a RF trained on handcrafted features and a random classifier on all six datasets. Both class weighted and non-weighted versions of the models were evaluated. The models are described in Fig. 1. Plain bars show the value of each metric at the finest taxonomic level, striped bars show the value after regrouping objects into broader ecological groups. All values, including <inline-formula><mml:math id="M104" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>-scores, are reported in Table S8.</p></caption>
          <graphic xlink:href="https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026-f02.png"/>

        </fig>

      <p id="d2e5010">Deep approaches showed much higher balanced accuracies than classical ones, as well as higher precision and recall averaged over plankton classes; this was true both with and without class weights (Fig. 2). The balanced accuracy of the random classifier was very poor on all datasets, confirming that this metric is more relevant for datasets with many small classes. The same applies to <inline-formula><mml:math id="M105" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>-scores: macro-<inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> captures the failure of the random classifiers, while micro-<inline-formula><mml:math id="M107" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> mirrors accuracy (Fig. S2). The improvements brought by the CNN stemmed from its better performance on non-dominant classes (e.g. Tables 4, S2–S7).</p>
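The behaviour of these metrics on an imbalanced dataset can be illustrated with scikit-learn on toy labels (not the benchmark data): a classifier that outputs only the dominant class scores high on accuracy and micro-F1 but poorly on macro-F1 and balanced accuracy.

```python
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             f1_score)

# Toy 3-class problem dominated by class 0 ("detritus"); the classifier
# predicts the dominant class for every sample.
y_true = [0] * 80 + [1] * 15 + [2] * 5
y_pred = [0] * 100

acc = accuracy_score(y_true, y_pred)               # 0.80
micro = f1_score(y_true, y_pred, average="micro")  # 0.80, mirrors accuracy
macro = f1_score(y_true, y_pred, average="macro",
                 zero_division=0)                  # ~0.30, exposes the failure
bal = balanced_accuracy_score(y_true, y_pred)      # ~0.33
print(acc, micro, macro, bal)
```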
      <p id="d2e5047">Class weights improved balanced accuracy for both deep (up to <inline-formula><mml:math id="M108" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula>8.2 % for the UVP6 dataset) and classical approaches (up to <inline-formula><mml:math id="M109" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula>18.0 % for the UVP6 dataset). Thus, as expected, giving more weight to small classes helped the classifier learn them, and this was especially true for RF models. Weighting decreased plankton precision for both models on all datasets: errors involving samples from large classes were less penalized, resulting in a greater contamination of plankton classes, i.e. lower precision. Symmetrically, the use of class weights improved the recall of plankton classes for all models (except MobileNet on the FlowCam dataset). Again, this improvement is expected since plankton classes, which typically contain fewer images than non-plankton ones (e.g. detritus), are given more weight, reducing the number of false negatives, i.e. increasing recall. Since applying class weights improved detection of underrepresented classes (primarily plankton), only the weighted versions of each model are evaluated in the subsequent analysis.</p>
</sec>
<sec id="Ch1.S3.SS4">
  <label>3.4</label><title>Small CNN are sufficient for plankton image classification</title>
      <p id="d2e5072">Using a larger and supposedly richer feature extractor, such as EfficientNet S or EfficientNet XL, did not markedly improve performance metrics (Fig. 3). If anything, performance was lower with EfficientNet XL, likely due to immediate overfitting after the first epoch: the model adhered too closely to the training data, impairing its ability to generalize. This may be due to the training dataset being small relative to the number of model parameters, which increases the risk of overfitting. The effect was especially pronounced with the UVP6 dataset, which is not only small (<inline-formula><mml:math id="M110" display="inline"><mml:mo lspace="0mm">∼</mml:mo></mml:math></inline-formula> 635 000 images) but also has a low proportion of plankton images (7.7 %); both balanced accuracy and plankton-specific metrics (average precision and recall) were notably impacted. On the other hand, compressing the features before classification, by using a fully connected layer of size 50 instead of 600 after the MobileNet feature extractor, did not reduce classification performance (Fig. 3). Both results suggest that a relatively small model is enough to extract all informative content from the small, grayscale plankton images in these datasets.</p>

      <fig id="F3" specific-use="star"><label>Figure 3</label><caption><p id="d2e5084">Performance comparison between our reference CNN (Mob <inline-formula><mml:math id="M111" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP600), a CNN with a larger feature extractor (Eff S <inline-formula><mml:math id="M112" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP600 and Eff XL <inline-formula><mml:math id="M113" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP600) and a MobileNet followed by a smaller MLP (Mob <inline-formula><mml:math id="M114" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP50) on all six datasets. The models are described in Fig. 1. Plain bars show the value of each metric at the finest taxonomic level, striped bars show the value after regrouping objects into broader ecological groups. All values, including <inline-formula><mml:math id="M115" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>-scores, are reported in Table S8.</p></caption>
          <graphic xlink:href="https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026-f03.png"/>

        </fig>

</sec>
<sec id="Ch1.S3.SS5">
  <label>3.5</label><title>The features are more important than the classifier</title>
      <p id="d2e5141">Moving from native features to MobileNet deep features before the RF classifier significantly increased all classification metrics (Fig. 4). By contrast, performance stayed the same when the MLP600 classifier was replaced by a RF after the same MobileNet feature extractor. This suggests that the classifier itself is of relatively little importance; rather, it is the quality of the features that determines performance. Because the features are optimized during CNN training, they are tuned to exactly the patterns that improve classification accuracy.</p>

      <fig id="F4" specific-use="star"><label>Figure 4</label><caption><p id="d2e5146">Performance comparison between our reference CNN (Mob <inline-formula><mml:math id="M116" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> MLP600), a RF trained on deep features extracted by a MobileNet V2 without (Mob <inline-formula><mml:math id="M117" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> RF) and with (Mob <inline-formula><mml:math id="M118" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> PCA <inline-formula><mml:math id="M119" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> RF) feature reduction, and a RF trained on handcrafted features on all six datasets. The models are described in Fig. 1. Plain bars show the value of each metric at the finest taxonomic level, striped bars show the value after regrouping objects into broader ecological groups. All values, including <inline-formula><mml:math id="M120" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>-scores, are reported in Table S8.</p></caption>
          <graphic xlink:href="https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026-f04.png"/>

        </fig>

      <p id="d2e5194">Finally, compressing features with a classification-agnostic dimension reduction method (PCA here) had very little effect on classification performance (Fig. 4). This supports the idea, stated in the previous section, that the information required to classify the relatively small, gray-scale plankton images captured by the instruments considered here can be efficiently summarized in only a few numbers (50 here). This opens operational possibilities since the feature extractor, the feature compressor and the classifier can be separated.</p>
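This decoupling can be sketched in a few lines (a toy illustration with random vectors standing in for deep features; the 1792-to-50 compression follows the dimensions discussed above, and the injected class signal is purely synthetic):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

# Random vectors standing in for 1792-dimensional deep features
rng = np.random.default_rng(0)
deep_features = rng.normal(size=(300, 1792))
labels = rng.integers(0, 3, size=300)
deep_features[labels == 0, :10] += 2.0  # inject a weak, learnable class signal

# Feature compressor (PCA down to 50 components) and classifier are
# separable, interchangeable steps of the pipeline
model = make_pipeline(PCA(n_components=50),
                      RandomForestClassifier(random_state=0))
model.fit(deep_features, labels)

compressed = model.named_steps["pca"].transform(deep_features)
```

Because the extractor, compressor and classifier are separate objects, each can be retrained or swapped independently, which is the operational opportunity mentioned above.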
</sec>
<sec id="Ch1.S3.SS6">
  <label>3.6</label><title>Performance on coarser groups</title>
      <p id="d2e5205">Regrouping classes into broader ecological groups improved all performance metrics (accuracy, plankton precision and plankton recall) across all datasets and approaches (Figs. 2, 3, and 4), as it made the classification task easier, in line with previous results (Kraft et al., 2022). However, it is important to note that our method – regrouping classes after training on detailed classes – differs from retraining a model on grouped classes alone. In the latter approach, regrouping would increase the number of examples within each group, likely enhancing performance. Yet, this could also introduce more diversity within each class, sometimes referred to as “within-class subconcepts” (He and Garcia, 2009), which might reduce accuracy in certain morphologically diverse groups (e.g. both Appendicularia bodies and houses being labeled as Appendicularia). This decrease in performance is especially evident in miscellaneous classes containing objects that could not be assigned to other categories (Tables 4, S2–S7). The performance increase between detailed and coarse classes was larger for classical approaches, particularly on the ZooCam and ZooScan datasets (Fig. 2). This highlights the fact that classical approaches often confused fine-scale taxa subsumed within larger groups. A good example is Copepoda, which has 22 subclasses in the ZooCam dataset and 20 in the ZooScan dataset. The classification of some of these <inline-formula><mml:math id="M121" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 20 classes was often poor with classical models while the classification of Copepoda, as a whole, was rather good. Since Copepoda represented a large percentage of the images in each dataset, 38 % and 34 % respectively, classification metrics improved markedly when they were grouped.</p>

      <fig id="F5" specific-use="star"><label>Figure 5</label><caption><p id="d2e5217">Density distribution (i.e. continuous histogram) of the difference in performance metrics per class when going from RF on native features to different deep models (colors), on the ZooScan datasets, at two taxonomic levels (rows).</p></caption>
          <graphic xlink:href="https://essd.copernicus.org/articles/18/945/2026/essd-18-945-2026-f05.png"/>

        </fig>

      <p id="d2e5226">The other side of the same coin is that performance improvements when going from a RF on native features to different deep models were larger when the taxonomic level was more detailed. In Fig. 5, most classes show better performance with the deep models (to the right of zero), and the increase is more pronounced with detailed classes (top) than with regrouped ones (bottom), for precision in particular. In other words, deep models beat classical ones on almost all classes (most differences in per-class metrics were above zero) but, on datasets with more and smaller classes, CNN beat classical approaches more often and by a wider margin than on coarser datasets. This further supports the conclusion that CNN are better than classical approaches specifically at classifying rare classes.</p>
</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Discussion</title>
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>Costs and benefits of using CNN</title>
      <p id="d2e5245">In terms of accuracy alone, CNN did not appear to offer a significant performance improvement over the classical approach of handcrafted feature extraction followed by a RF classifier. However, the high scores of a purely random classifier on this metric show how flawed it can be on unbalanced datasets. Instead, balanced accuracy (Kelleher et al., 2020) and metrics on plankton classes only both showed that CNN performed better in classifying objects, especially in low abundance classes (and when class weights were used). This was further confirmed by the fact that the difference between CNN and the classical approach was smaller when classification was performed at a coarser taxonomic level. This makes the use of pretrained CNN especially relevant for plankton image datasets, which are particularly diverse, contain many small classes, and in which the dominant classes are often composed of various detritus and artifacts.</p>
      <p id="d2e5248">Giving more weight to poorly represented classes resulted in better performance, especially for RF. One plausible explanation would be that weighted RF (Chen et al., 2004) actually make use of class weights twice: weights are used to compute the criterion to generate the splits (entropy in our case) when building the tree; weights are also used when voting for the majority class in terminal nodes. On the other hand, class weights are only used to compute a weighted loss in CNN (Cui et al., 2019).</p>
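The weighting scheme itself is simple; the following toy sketch (synthetic data, illustrative class counts) shows inverse-frequency "balanced" weights, the common heuristic, passed to a weighted RF, where they enter both the split criterion and the terminal-node votes as described above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.class_weight import compute_class_weight

# Toy unbalanced dataset: 180 "detritus"-like vs. 20 "plankton"-like samples
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.array([0] * 180 + [1] * 20)
X[y == 1] += 1.5  # shift the rare class so it is learnable

# Inverse-frequency weights: n_samples / (n_classes * count(class))
w = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)
# class 0 -> 200 / (2 * 180) ~ 0.56, class 1 -> 200 / (2 * 20) = 5.0

# In a weighted RF, these weights affect both split criteria and leaf votes
clf = RandomForestClassifier(class_weight={0: w[0], 1: w[1]}, random_state=0)
clf.fit(X, y)
```

In a CNN, by contrast, the same weight vector would typically only rescale each sample's contribution to the loss.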
      <p id="d2e5251">While CNN took longer to train than RF in terms of overall training duration, the comparison is not straightforward. First, training a RF model requires extracting features from the images beforehand. This feature extraction is coded, not trained, so this part cannot be directly compared. Additionally, it can be challenging to know when feature extraction is truly complete, as the optimal set of features often depends on the specific dataset and task. But even in terms of pure evaluation (i.e. extracting features and predicting the class of new images), the computation of some handcrafted features can take a non-negligible amount of time and a CNN may prove faster, notably thanks to the use of GPUs by the underlying software libraries (Chellapilla et al., 2006). Additionally, the training time of CNN depends heavily on the number of parameters. For instance, our lightweight model (MobileNet V2) trained in under 100 h, which is fast compared to larger models (Zebin et al., 2019). Since lightweight CNN models demonstrated performance comparable to larger ones for plankton classification tasks (e.g. Kraft et al., 2022), they present an appealing choice: their computational demands are often modest and compatible with most recent computers. Finally, a metric that may be more relevant than computational time for many applications is the total time investment of the scientific team, including model setup, training, and output validation. In this respect, we argue that CNN are actually simpler to adopt. Modern deep learning libraries such as Tensorflow (Abadi et al., 2016) or Pytorch (Paszke et al., 2019) are free and open-source, and the abundance of tutorials and pre-trained models means that users need little image processing or coding expertise to get started, whereas extracting relevant handcrafted features typically requires domain-specific knowledge. Although training a CNN may involve some technical steps (e.g. 
configuring a data loader), the deployment stage is extremely lightweight, often only a few lines of code to load the saved model and run inference. Consequently, the resulting model packages the whole pipeline (from image pre-processing to classification) and can be deployed on various devices. As GPU resources become increasingly available to the scientific community, these powerful tools become ever more accessible (Malde et al., 2020).</p>
      <p id="d2e5254">Finally, our results highlight the efficacy of both CNN and classical methods for accurate prediction of well-represented plankton classes. However, rare classes still require manual validation by a taxonomist. Importantly, improved prediction quality achieved by CNN compared to classical approaches is likely to save time by reducing the need for prediction corrections, as reported by Irisson et al. (2022).</p>
</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>Importance of the quality and number of features</title>
      <p id="d2e5265">Models using a CNN feature extractor, which generated features much more numerous than the handcrafted ones (<inline-formula><mml:math id="M122" display="inline"><mml:mo lspace="0mm">&gt;</mml:mo></mml:math></inline-formula> 1000 vs. <inline-formula><mml:math id="M123" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 50), performed better as expected from the literature (Orenstein and Beijbom, 2017). Increasing the size of the feature extractor, hence yielding potentially richer features (keeping their number in the same order of magnitude: 1792 for the MobileNet V2 vs. 1280 for the EfficientNet V2) did not lead to a significant improvement in classification performance; but it did lengthen the training time. Reducing the number of features from a CNN to an amount similar to the number of handcrafted features (50), using PCA or compression within a small fully connected layer, did not significantly affect classification performance either. These results show that the richness and diversity of features is important, but only to a certain extent with plankton images. Although features from CNN cannot be individually interpreted, texture features were shown to be important for image classification by CNN (Baker et al., 2018). Moreover, visualization techniques have been developed to provide insights into the convolutional layers of CNN, revealing that convolutional layers detect patterns like edges and textures (Zeiler and Fergus, 2014). By contrast, most handcrafted feature sets were poor in texture-related features, which may explain their lower performance.</p>
      <p id="d2e5282">The fact that the number of features can be greatly reduced (from 1792 to 50, a 36-fold reduction, in our case; from 216 to 25, an 8-fold reduction, in Guo et al., 2021b) suggests there is only a limited amount of relevant information in plankton images for CNN to extract. These images are typically small (<inline-formula><mml:math id="M124" display="inline"><mml:mo lspace="0mm">∼</mml:mo></mml:math></inline-formula> 100 <inline-formula><mml:math id="M125" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 100 pixels for the average ZooScan image) and often grayscale, which restricts the amount of useful information available to any classifier. Consequently, increasing network depth or size does not yield appreciable performance gains, because the intrinsic information in the images is already fully exploited by a small CNN.</p>
      <p id="d2e5299">Therefore, improvements in classification accuracy are more likely to come from richer inputs than from larger network architectures. One way to achieve this is by increasing the quantity of annotated plankton images; pooling data from multiple instruments and sampling conditions has been shown to improve CNN accuracy (Ellen and Ohman, 2024) and this is the first step towards building a so-called foundation model for plankton images. A second, independent route is to enhance the informational content of each image. For example, color cameras such as those used in the planktoscope (Pollina et al., 2022) or the Scripps Plankton Camera (Orenstein et al., 2020b), should capture more information by using multiple channels. Beyond color, additional fluorescence channels can be obtained using environmental high content fluorescence microscopy, enriching the information content of images (Colin et al., 2017); but this method can only be applied ex situ. Expanding the amount of training data and capturing richer image information should both yield gains in classification performance, albeit at the cost of greater storage and processing requirements. Our findings also open an opportunity to simplify plankton image classification models, by performing a wise feature selection through recursive feature elimination for example (a backward selection of less informative features until only informative features remain; Guyon et al., 2002; Guo et al., 2021b). Dimension reduction techniques, such as PCA (Legendre and Legendre, 2012), can also be used to remove both correlations and noise in the features. The combination of deep feature extraction, dimension reduction, and a robust classifier, such as RandomForest, is lightweight and quick to train, yet yields high quality results (Fig. 4). 
Because of these advantages, this approach has been implemented in the EcoTaxa web application (Picheral et al., 2017), allowing users to apply such methods to their own plankton image datasets.</p>
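Recursive feature elimination of the kind cited above can be sketched as follows (synthetic data; the feature counts are illustrative, not those of Guo et al., 2021b):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for a table of handcrafted features
X, y = make_classification(n_samples=200, n_features=40, n_informative=8,
                           random_state=0)

# Backward selection: repeatedly drop the 5 least important features
# (ranked by RF importance) until only 10 remain
selector = RFE(RandomForestClassifier(n_estimators=50, random_state=0),
               n_features_to_select=10, step=5)
selector.fit(X, y)
X_reduced = selector.transform(X)  # keeps only the selected columns
```

Unlike PCA, which builds new uncorrelated components, this keeps a subset of the original, interpretable features.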
      <p id="d2e5302">The similar performance between a full CNN and a deep feature extractor combined with a RF classifier (Fig. 4) suggests that the nature of the features is much more important than the nature of the classifier. These results are consistent with those comparing different classifiers on handcrafted features, where no significant differences could be highlighted (Grosjean et al., 2004; Blaschko et al., 2005; Gorsky et al., 2010; Ellen et al., 2015). Still, in highly unbalanced datasets (IFCB, ISIIS and UVP6), the plankton precision was slightly higher with the RF than with the MLP<sub>600</sub>, reflecting a lower contamination of plankton classes by dominant detritus. Its stronger sensitivity to class weights is another possible explanation in our case.</p>
</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>Alternative approaches for plankton image classification</title>
      <p id="d2e5322">A potential drawback of CNN is that they may not account for the real size of objects, since all images are rescaled to the same dimensions before input. One solution to capture size would be not to scale down images larger than the input dimension but to pad the smaller ones with the background color. However, very small objects may be reduced to just 1 pixel after a few pooling layers and all information in the original image could be lost. Another common solution would be to concatenate size information from handcrafted features (e.g. area, Feret diameter) or simply the image diagonal size to one of the fully connected layers to create a model that accounts for both image aspect and object size. Still, despite the a priori relevance of size to recognize plankton taxa, such approaches do not necessarily provide a large improvement in classification performance: Kerr et al. (2020) report a small improvement when geometric features are concatenated, while Kyathanahally et al. (2021) report a negligible gain. Ellen et al. (2019) evaluated the effect of concatenating different types of “metadata” (geometric, geotemporal and hydrographic) to fully connected layers: geometric features alone did not improve model performance, whereas geotemporal and hydrographic metadata each yielded a noticeable boost, and adding geometric metadata on top of those provided an additional improvement. One possible explanation is that deep features already capture the essential information needed for classification, making additional geometric features redundant. However, adding geotemporal and hydrographic features (individually or combined) enhanced prediction performance, which is unsurprising given the patchy nature of plankton organisms. 
Plankton taxa tend to exhibit positive correlations within groups (Greer et al., 2016; Robinson et al., 2021), and are often associated with specific environmental parameters – a relationship that machine learning algorithms can leverage (e.g., relating plankton biomass to environmental conditions, as shown in Drago et al., 2022). However, one should keep in mind that incorporating metadata features during training may hinder subsequent analyses linking these organisms to their environment, since the classifier learned a correlation between the abundance of some organisms and some environmental conditions from the training set, and will therefore induce it in its predictions.</p>
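The concatenation strategy discussed above can be sketched as a small PyTorch head (dimensions are illustrative; the hypothetical `n_meta` slot would hold geometric, geotemporal or hydrographic variables):

```python
import torch
import torch.nn as nn

class FeatureMetadataHead(nn.Module):
    """Hypothetical classification head: deep image features are
    concatenated with per-object metadata before the fully connected
    layers, so the MLP sees both image aspect and, e.g., object size."""

    def __init__(self, n_deep=1792, n_meta=8, n_classes=20):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(n_deep + n_meta, 600),
            nn.ReLU(),
            nn.Linear(600, n_classes),
        )

    def forward(self, deep_features, metadata):
        # Metadata simply extends the feature vector fed to the MLP
        return self.fc(torch.cat([deep_features, metadata], dim=1))

head = FeatureMetadataHead()
logits = head(torch.randn(4, 1792), torch.randn(4, 8))  # batch of 4 objects
```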
      <p id="d2e5325">As highlighted above, plankton datasets are often highly unbalanced, with few objects in plankton classes while the largest classes often consist of non-living objects such as marine snow. There are both “algorithm-level” and “data-level” methods for dealing with class imbalance (Krawczyk, 2016), which can be used separately or simultaneously. Algorithm-level methods include the use of class weights to give more importance to poorly represented classes in the loss computation (Cui et al., 2019), as we did here. Another algorithm-level method is to use a different loss function, such as sigmoid focal cross entropy (Lin et al., 2020), which penalizes hard examples (small classes) more than easier ones (large classes). Data-level methods include oversampling small classes and undersampling large classes, thereby rebalancing the distribution of classes in the training set (Krawczyk, 2016). While this practice often improves performance on a test set to which the same modifications are applied, it can lead to poor performance when evaluating the model on a real, therefore unbalanced, dataset, because the model has learned an unrepresentative class distribution from the training set. This problem is known as “dataset shift” (Moreno-Torres et al., 2012). Typically, using a model trained on an idealized training set to classify objects from a new, real dataset leads to poor prediction quality (González et al., 2017). Similarly, a model trained for specific conditions (such as location, depth, or time) will likely fail to generalize to images acquired under different circumstances. To mitigate this, a potential solution would be to assemble a training set from samples that match the context of the future deployment (similar climate and season), hoping that similar context will give rise to similar class distributions. 
Alternatively, and more generically, the training set can be made as exhaustive as possible by spanning a wide range of spatial and temporal conditions; its global class distribution would minimize the average differences with the class distribution of new samples. Consequently, the impact of the dataset shift depends directly on how representative the training data are of the spatial and temporal regimes of interest. All types of classification models, including cutting-edge architectures like vision transformers, are susceptible to dataset shift (Zhang et al., 2022). Today, there is no obvious solution to deal with dataset shift in classification tasks and other approaches, such as quantification, should be considered (González et al., 2019; Orenstein et al., 2020a).</p>
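The focusing behaviour of the focal loss mentioned above can be illustrated in a few lines (a per-example sketch of the focusing term with the commonly used exponent of 2; the class-balancing term of the full formulation is omitted):

```python
import numpy as np

def focal_term(p_true, gamma=2.0):
    """Focal loss on the probability assigned to the true class:
    -(1 - p)^gamma * log(p). Easy examples (p close to 1) are strongly
    down-weighted relative to plain cross-entropy, shifting the training
    signal towards hard examples, typically those from small classes."""
    p = np.asarray(p_true, dtype=float)
    return -((1.0 - p) ** gamma) * np.log(p)

easy, hard = focal_term(0.9), focal_term(0.1)    # focal losses
ce_easy, ce_hard = -np.log(0.9), -np.log(0.1)    # plain cross-entropy losses
```

The hard-to-easy loss ratio is far larger under the focal term than under cross-entropy, which is precisely the rebalancing effect sought.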
      <p id="d2e5328">Weighting improves the recall of rare classes but reduces their precision, reflecting the classic precision–recall trade-off. When downstream analysis involves manual verification, higher recall is advantageous because a few false positives in rare classes can easily be corrected, while missed detections would likely be lost among the most numerous classes and not easily recovered. Conversely, in high-throughput monitoring through imaging, where human review of all samples is infeasible, emphasizing precision reduces spurious detections at the cost of under-estimating true abundances. In such settings, post-hoc confidence thresholding (e.g. Faillettaz et al., 2016; Luo et al., 2018) offers a pragmatic compromise, albeit an imperfect one. In all situations, using various intensities of class weighting is a flexible solution to adapt the classifier to the study's objective.</p>
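A minimal sketch of such post-hoc thresholding (toy softmax outputs; the 0.5 cut-off is arbitrary here and would in practice be tuned, possibly per class):

```python
import numpy as np

# Toy predicted class probabilities for 5 objects over 3 classes
probs = np.array([[0.90, 0.05, 0.05],
                  [0.40, 0.35, 0.25],
                  [0.10, 0.80, 0.10],
                  [0.34, 0.33, 0.33],
                  [0.20, 0.15, 0.65]])

threshold = 0.5                       # arbitrary cut-off for this sketch
pred = probs.argmax(axis=1)           # predicted class per object
confident = probs.max(axis=1) >= threshold

# Objects below the threshold are routed to manual review or discarded;
# only confident predictions are kept automatically
auto_labeled = pred[confident]
```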
      <p id="d2e5331">The rarity of some plankton classes means that some classes will inevitably be absent from the training set. Because a conventional classifier is trained on a fixed label list, every object is forced into one of these known classes, causing novel or poorly characterized organisms to be misclassified. In these situations, approaches such as unsupervised, self-supervised or semi-supervised learning (e.g. autoencoders) or specific open-set classifiers can be employed (Bendale and Boult, 2016; Ciranni et al., 2025; Masoudi et al., 2024). These methods can leverage the rich feature embeddings produced by a CNN while detecting objects that do not belong to any of the known training classes (Malde and Kim, 2019; Schröder et al., 2020).</p>
</sec>
</sec>
<sec id="Ch1.S5">
  <label>5</label><title>Data availability</title>
      <p id="d2e5344">The datasets used in this study are: <ext-link xlink:href="https://doi.org/10.1575/1912/7341" ext-link-type="DOI">10.1575/1912/7341</ext-link> (Sosik et al., 2015), <ext-link xlink:href="https://doi.org/10.17882/101950" ext-link-type="DOI">10.17882/101950</ext-link> (Panaïotis et al., 2024), <ext-link xlink:href="https://doi.org/10.17882/101961" ext-link-type="DOI">10.17882/101961</ext-link> (Jalabert et al., 2024), <ext-link xlink:href="https://doi.org/10.17882/101948" ext-link-type="DOI">10.17882/101948</ext-link> (Picheral et al., 2024), <ext-link xlink:href="https://doi.org/10.17882/101928" ext-link-type="DOI">10.17882/101928</ext-link> (Romagnan et al., 2024), and <ext-link xlink:href="https://doi.org/10.17882/55741" ext-link-type="DOI">10.17882/55741</ext-link> (Elineau et al., 2024).</p>
</sec>
<sec id="Ch1.S6">
  <label>6</label><title>Code availability</title>
      <p id="d2e5374">All the code supporting this study is available at <ext-link xlink:href="https://doi.org/10.5281/zenodo.17937437" ext-link-type="DOI">10.5281/zenodo.17937437</ext-link> (Panaïotis and Amblard, 2025).</p>
</sec>
<sec id="Ch1.S7" sec-type="conclusions">
  <label>7</label><title>Conclusion and perspectives</title>
      <p id="d2e5388">In summary, a small CNN achieved strong performance at plankton image classification across six realistic plankton image datasets, while being easy to apply. It unsurprisingly outperformed the classical approach of extracting a small number of handcrafted features and using a RF classifier, particularly for rare classes. Applying per-class weighting improved the detection of underrepresented classes. Surprisingly, using a large CNN did not lead to better classification performance than a much smaller one and deep features could be quite heavily compressed without loss of performance. This is likely related to the fact that plankton images, which are typically small and grayscale, provide relatively little information content for CNN. Richer images (e.g. higher resolution, colour or multispectral data) produced by next-generation imaging systems would provide additional discriminative information that bigger models could leverage. Finally, the nature of the features dominated the outcome: deep features drove the performance gains, while the choice of classifier had little impact. Overall, these findings suggest that larger and more diverse training sets and/or advances in imaging hardware, rather than ever larger models, will be key to further improving plankton classification. Furthermore, metrics that emphasize the classes of interest – often the minority classes in plankton datasets – should be prioritized.</p>
      <p id="d2e5391">The results presented here are in line with the shift towards the use of deep learning models for plankton classification tasks (Rubbens et al., 2023), which was made possible by advances in computational performance through easier access to dedicated hardware, the release of sufficiently large datasets, and the development of turnkey deep learning libraries such as Tensorflow (Abadi et al., 2016) or Pytorch (Paszke et al., 2019). Datasets in this study are made publicly available to facilitate future benchmarking of new classification methods.</p>
</sec>

      
      </body>
    <back><app-group>
        <supplementary-material position="anchor"><p id="d2e5393">The supplement related to this article is available online at <inline-supplementary-material xlink:href="https://doi.org/10.5194/essd-18-945-2026-supplement" xlink:title="pdf">https://doi.org/10.5194/essd-18-945-2026-supplement</inline-supplementary-material>.</p></supplementary-material>
        </app-group><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d2e5404">JOI and TP conceived the study; GBC and GDA developed a first CNN classifier; TP and EA implemented the RF classifier and the final CNN classifier from the initial work of GBC and GDA, with guidance from BW; EA performed the experiments under the supervision of TP and JOI; TP wrote the original draft; all authors reviewed and approved the final manuscript.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d2e5410">Emma Amblard was employed by Fotonower. Guillaume Boniface-Chang was employed by Google Research, London. Gabriel Dulac-Arnold was employed by Google Research, Paris. Ben Woodward was employed by CVision AI. The peer-review process was guided by an independent editor, and the authors have no other competing interests to declare.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d2e5416">Views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.</p>
  </notes><ack><title>Acknowledgements</title><p id="d2e5425">We would like to acknowledge the scientists, crew members and technicians who contributed to data collection and the expert taxonomists who sorted the images to build the datasets. Special thanks go to Eric Orenstein for providing scripts to extract handcrafted features from IFCB images and for his valuable feedback on the manuscript.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d2e5430">This work was carried out within the projects “World Wide Web of Plankton Image Curation”, funded by the Belmont Forum through the Agence Nationale de la Recherche (ANR-18-BELM-0003-01) and the National Science Foundation (NSF ICER1927710), and LOVNOWER, funded by the program “France relance” from 21 December 2020. TP's doctoral fellowship was granted by the French Ministry of Higher Education, Research and Innovation (3500/2019). This work was granted access to the HPC resources of IDRIS under the allocation AD011013532 made by GENCI. TP was supported by the projects CALIPSO, funded by Schmidt Sciences, and BIOcean5D, funded by EU Horizon Europe (grant no. 101059915). The article processing charges for this open-access publication were covered by the National Oceanography Centre.</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d2e5442">This paper was edited by Sebastiaan van de Velde and reviewed by Kaisa Kraft and Jeffrey Ellen.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bib1"><label>1</label><mixed-citation>Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1603.04467" ext-link-type="DOI">10.48550/arXiv.1603.04467</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib2"><label>2</label><mixed-citation>Anglès, S., Jordi, A., and Campbell, L.: Responses of the coastal phytoplankton community to tropical cyclones revealed by high-frequency imaging flow cytometry, Limnology and Oceanography, 60, 1562–1576, <ext-link xlink:href="https://doi.org/10.1002/lno.10117" ext-link-type="DOI">10.1002/lno.10117</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib3"><label>3</label><mixed-citation>Baker, N., Lu, H., Erlikhman, G., and Kellman, P. J.: Deep convolutional networks do not classify based on global object shape, PLOS Computational Biology, 14, e1006613, <ext-link xlink:href="https://doi.org/10.1371/journal.pcbi.1006613" ext-link-type="DOI">10.1371/journal.pcbi.1006613</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib4"><label>4</label><mixed-citation>Bendale, A. and Boult, T. E.: Towards Open Set Deep Networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1563–1572,  <uri>https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Bendale_Towards_Open_Set_CVPR_2016_paper.html</uri> (last access: 15 December 2025), 2016.</mixed-citation></ref>
      <ref id="bib1.bib5"><label>5</label><mixed-citation>Benfield, M., Grosjean, P., Culverhouse, P., Irigolen, X., Sieracki, M., Lopez-Urrutia, A., Dam, H., Hu, Q., Davis, C., Hanson, A., Pilskaln, C., Riseman, E., Schulz, H., Utgoff, P., and Gorsky, G.: RAPID: Research on Automated Plankton Identification, Oceanography, 20, 172–187, <ext-link xlink:href="https://doi.org/10.5670/oceanog.2007.63" ext-link-type="DOI">10.5670/oceanog.2007.63</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bib6"><label>6</label><mixed-citation>Bi, H., Guo, Z., Benfield, M. C., Fan, C., Ford, M., Shahrestani, S., and Sieracki, J. M.: A Semi-Automated Image Analysis Procedure for In Situ Plankton Imaging Systems, PLOS ONE, 10, e0127121, <ext-link xlink:href="https://doi.org/10.1371/journal.pone.0127121" ext-link-type="DOI">10.1371/journal.pone.0127121</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib7"><label>7</label><mixed-citation>Blaschko, M. B., Holness, G., Mattar, M. A., Lisin, D., Utgoff, P. E., Hanson, A. R., Schultz, H., Riseman, E. M., Sieracki, M. E., and Balch, W. M.: Automatic in situ identification of plankton, in: 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION'05), Vol. 1, 79–86,  <ext-link xlink:href="https://doi.org/10.1109/ACVMOT.2005.29" ext-link-type="DOI">10.1109/ACVMOT.2005.29</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bib8"><label>8</label><mixed-citation>Breiman, L.: Random Forests, Machine Learning, 45, 5–32, <ext-link xlink:href="https://doi.org/10.1023/A:1010933404324" ext-link-type="DOI">10.1023/A:1010933404324</ext-link>, 2001.</mixed-citation></ref>
      <ref id="bib1.bib9"><label>9</label><mixed-citation>Callejas, S., Lira, H., Berry, A., Martí, L., and Sanchez-Pi, N.: No Plankton Left Behind: Preliminary Results on Massive Plankton Image Recognition, in: High Performance Computing, Cham, 170–185, <ext-link xlink:href="https://doi.org/10.1007/978-3-031-80084-9_12" ext-link-type="DOI">10.1007/978-3-031-80084-9_12</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bib10"><label>10</label><mixed-citation>Chellapilla, K., Puri, S., and Simard, P.: High Performance Convolutional Neural Networks for Document Processing, Tenth International Workshop on Frontiers in Handwriting Recognition,  <uri>https://inria.hal.science/inria-00112631v1</uri> (last access: 15 December 2025), 2006.</mixed-citation></ref>
      <ref id="bib1.bib11"><label>11</label><mixed-citation>Chen, C., Liaw, A., and Breiman, L.: Using Random Forest to Learn Imbalanced Data, <uri>https://statistics.berkeley.edu/sites/default/files/tech-reports/666.pdf</uri> (last access: 15 December 2025), 2004.</mixed-citation></ref>
      <ref id="bib1.bib12"><label>12</label><mixed-citation>Cheng, K., Cheng, X., Wang, Y., Bi, H., and Benfield, M. C.: Enhanced convolutional neural network for plankton identification and enumeration, PLOS ONE, 14, e0219570, <ext-link xlink:href="https://doi.org/10.1371/journal.pone.0219570" ext-link-type="DOI">10.1371/journal.pone.0219570</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib13"><label>13</label><mixed-citation>Ciranni, M., Gjergji, A., Maracani, A., Murino, V., and Pastore, V. P.: In-domain self-supervised learning for plankton image classification on a budget, Proceedings of the Winter Conference on Applications of Computer Vision, 1588–1597,  <uri>https://openaccess.thecvf.com/content/WACV2025W/MaCVi/html/Ciranni_In-domain_self-supervised_learning_for_plankton_image_classification_on_a_budget_WACVW_2025_paper.html</uri> (last access:  15 December 2025), 2025.</mixed-citation></ref>
      <ref id="bib1.bib14"><label>14</label><mixed-citation>Colas, F., Tardivel, M., Perchoc, J., Lunven, M., Forest, B., Guyader, G., Danielou, M. M., Le Mestre, S., Bourriau, P., Antajan, E., Sourisseau, M., Huret, M., Petitgas, P., and Romagnan, J. B.: The ZooCAM, a new in-flow imaging system for fast onboard counting, sizing and classification of fish eggs and metazooplankton, Progress in Oceanography, 166, 54–65, <ext-link xlink:href="https://doi.org/10.1016/j.pocean.2017.10.014" ext-link-type="DOI">10.1016/j.pocean.2017.10.014</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib15"><label>15</label><mixed-citation>Colin, S., Coelho, L. P., Sunagawa, S., Bowler, C., Karsenti, E., Bork, P., Pepperkok, R., and de Vargas, C.: Quantitative 3D-imaging for cell biology and ecology of environmental microbial eukaryotes, eLife, 6, e26066, <ext-link xlink:href="https://doi.org/10.7554/eLife.26066" ext-link-type="DOI">10.7554/eLife.26066</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib16"><label>16</label><mixed-citation>Cowen, R. K. and Guigand, C. M.: In situ ichthyoplankton imaging system (ISIIS): system design and preliminary results, Limnology and Oceanography: Methods, 6, 126–132, <ext-link xlink:href="https://doi.org/10.4319/lom.2008.6.126" ext-link-type="DOI">10.4319/lom.2008.6.126</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bib17"><label>17</label><mixed-citation>Cowen, R. K., Sponaugle, S., Robinson, K. L., Luo, J., Oregon State University, and Hatfield Marine Science Center: PlanktonSet 1.0: Plankton imagery data collected from F. G. Walton Smith in Straits of Florida from 2014-06-03 to 2014-06-06 and used in the 2015 National Data Science Bowl, NCEI Accession 0127422, <ext-link xlink:href="https://doi.org/10.7289/v5d21vjd" ext-link-type="DOI">10.7289/v5d21vjd</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib18"><label>18</label><mixed-citation>Cui, J., Wei, B., Wang, C., Yu, Z., Zheng, H., Zheng, B., and Yang, H.: Texture and Shape Information Fusion of Convolutional Neural Network for Plankton Image Classification, in: 2018 OCEANS – MTS/IEEE Kobe Techno-Oceans (OTO), 2018 OCEANS – MTS/IEEE Kobe Techno-Oceans (OTO), 5 pp., <ext-link xlink:href="https://doi.org/10.1109/OCEANSKOBE.2018.8559156" ext-link-type="DOI">10.1109/OCEANSKOBE.2018.8559156</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib19"><label>19</label><mixed-citation>Cui, Y., Jia, M., Lin, T.-Y., Song, Y., and Belongie, S.: Class-Balanced Loss Based on Effective Number of Samples, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9268–9277,  <uri>https://openaccess.thecvf.com/content_CVPR_2019/html/Cui_Class-Balanced_Loss_Based_on_Effective_Number_of_Samples_CVPR_2019_paper.html</uri> (last access: 15 December 2025), 2019.</mixed-citation></ref>
      <ref id="bib1.bib20"><label>20</label><mixed-citation>Culverhouse, P. F., Simpson, R. G., Ellis, R., Lindley, J. A., Williams, R., Parisini, T., Reguera, B., Bravo, I., Zoppoli, R., Earnshaw, G., McCall, H., and Smith, G.: Automatic classification of field-collected dinoflagellates by artificial neural network, Marine Ecology Progress Series, 139, 281–287, <ext-link xlink:href="https://doi.org/10.3354/meps139281" ext-link-type="DOI">10.3354/meps139281</ext-link>, 1996.</mixed-citation></ref>
      <ref id="bib1.bib21"><label>21</label><mixed-citation>Dai, J., Wang, R., Zheng, H., Ji, G., and Qiao, X.: ZooplanktoNet: Deep convolutional network for zooplankton classification, in: OCEANS 2016 – Shanghai, OCEANS 2016 – Shanghai, 6 pp., <ext-link xlink:href="https://doi.org/10.1109/OCEANSAP.2016.7485680" ext-link-type="DOI">10.1109/OCEANSAP.2016.7485680</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib22"><label>22</label><mixed-citation>Dai, J., Yu, Z., Zheng, H., Zheng, B., and Wang, N.: A Hybrid Convolutional Neural Network for Plankton Classification, in: Computer Vision – ACCV 2016 Workshops, Cham,  102–114, <ext-link xlink:href="https://doi.org/10.1007/978-3-319-54526-4_8" ext-link-type="DOI">10.1007/978-3-319-54526-4_8</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib23"><label>23</label><mixed-citation>Dieleman, S., Fauw, J. D., and Kavukcuoglu, K.: Exploiting Cyclic Symmetry in Convolutional Neural Networks, in: Proceedings of The 33rd International Conference on Machine Learning, International Conference on Machine Learning, 1889–1898, 2016.</mixed-citation></ref>
      <ref id="bib1.bib24"><label>24</label><mixed-citation>Drago, L., Panaïotis, T., Irisson, J.-O., Babin, M., Biard, T., Carlotti, F., Coppola, L., Guidi, L., Hauss, H., Karp-Boss, L., Lombard, F., McDonnell, A. M. P., Picheral, M., Rogge, A., Waite, A. M., Stemmann, L., and Kiko, R.: Global Distribution of Zooplankton Biomass Estimated by In Situ Imaging and Machine Learning, Frontiers in Marine Science, 9, <ext-link xlink:href="https://doi.org/10.3389/fmars.2022.894372" ext-link-type="DOI">10.3389/fmars.2022.894372</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib25"><label>25</label><mixed-citation>Du, A., Gu, Z., Yu, Z., Zheng, H., and Zheng, B.: Plankton Image Classification Using Deep Convolutional Neural Networks with Second-order Features, in: Global Oceans 2020: Singapore – U.S. Gulf Coast, Global Oceans 2020: Singapore – U.S. Gulf Coast, 5 pp., <ext-link xlink:href="https://doi.org/10.1109/IEEECONF38699.2020.9389034" ext-link-type="DOI">10.1109/IEEECONF38699.2020.9389034</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib26"><label>26</label><mixed-citation>van Dyck, L. E., Kwitt, R., Denzler, S. J., and Gruber, W. R.: Comparing Object Recognition in Humans and Deep Convolutional Neural Networks – An Eye Tracking Study, Frontiers in Neuroscience, 15, 750639, <ext-link xlink:href="https://doi.org/10.3389/fnins.2021.750639" ext-link-type="DOI">10.3389/fnins.2021.750639</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib27"><label>27</label><mixed-citation>Eerola, T., Batrakhanov, D., Barazandeh, N. V., Kraft, K., Haraguchi, L., Lensu, L., Suikkanen, S., Seppälä, J., Tamminen, T., and Kälviäinen, H.: Survey of automatic plankton image recognition: challenges, existing solutions and future perspectives, Artif. Intell. Rev., 57, 114, <ext-link xlink:href="https://doi.org/10.1007/s10462-024-10745-y" ext-link-type="DOI">10.1007/s10462-024-10745-y</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib28"><label>28</label><mixed-citation>Eftekhari, N., Pitois, S., Masoudi, M., Blackwell, R. E., Scott, J., Giering, S. L. C., and Fry, M.: Improving in Situ Real-Time Classification of Long-Tail Marine Plankton Images for Ecosystem Studies, in: Computer Vision – ECCV 2024 Workshops, Cham, 121–134, <ext-link xlink:href="https://doi.org/10.1007/978-3-031-92387-6_8" ext-link-type="DOI">10.1007/978-3-031-92387-6_8</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bib29"><label>29</label><mixed-citation>Elineau, A., Desnos, C., Jalabert, L., Olivier, M., Romagnan, J.-B., Costa Brandao, M., Lombard, F., Llopis, N., Courboulès, J., Caray-Counil, L., Serranito, B., Irisson, J.-O., Picheral, M., Gorsky, G., and Stemmann, L.: ZooScanNet: plankton images captured with the ZooScan, SEANOE [data set], <ext-link xlink:href="https://doi.org/10.17882/55741" ext-link-type="DOI">10.17882/55741</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib30"><label>30</label><mixed-citation>Ellen, J., Li, H., and Ohman, M. D.: Quantifying California current plankton samples with efficient machine learning techniques, in: OCEANS 2015 – MTS/IEEE Washington, OCEANS 2015 – MTS/IEEE Washington, 9 pp., <ext-link xlink:href="https://doi.org/10.23919/OCEANS.2015.7404607" ext-link-type="DOI">10.23919/OCEANS.2015.7404607</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib31"><label>31</label><mixed-citation>Ellen, J. S. and Ohman, M. D.: Beyond transfer learning: Leveraging ancillary images in automated classification of plankton, Limnology and Oceanography: Methods, 22, 943–952, <ext-link xlink:href="https://doi.org/10.1002/lom3.10648" ext-link-type="DOI">10.1002/lom3.10648</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib32"><label>32</label><mixed-citation>Ellen, J. S., Graff, C. A., and Ohman, M. D.: Improving plankton image classification using context metadata, Limnology and Oceanography: Methods, 17, 439–461, <ext-link xlink:href="https://doi.org/10.1002/lom3.10324" ext-link-type="DOI">10.1002/lom3.10324</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib33"><label>33</label><mixed-citation>Faillettaz, R., Picheral, M., Luo, J. Y., Guigand, C., Cowen, R. K., and Irisson, J.-O.: Imperfect automatic image classification successfully describes plankton distribution patterns, Methods in Oceanography, 15–16, 60–77, <ext-link xlink:href="https://doi.org/10.1016/j.mio.2016.04.003" ext-link-type="DOI">10.1016/j.mio.2016.04.003</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib34"><label>34</label><mixed-citation>Falkowski, P.: Ocean Science: The power of plankton, Nature, 483, S17–S20, <ext-link xlink:href="https://doi.org/10.1038/483S17a" ext-link-type="DOI">10.1038/483S17a</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib35"><label>35</label><mixed-citation>Fernández-Delgado, M., Cernadas, E., Barro, S., and Amorim, D.: Do we need hundreds of classifiers to solve real world classification problems?, The Journal of Machine Learning Research, 15, 3133–3181, 2014.</mixed-citation></ref>
      <ref id="bib1.bib36"><label>36</label><mixed-citation>Geraldes, P., Barbosa, J., Martins, A., Dias, A., Magalhães, C., Ramos, S., and Silva, E.: In situ real-time Zooplankton Detection and Classification, in: OCEANS 2019 – Marseille, OCEANS 2019 – Marseille, 6 pp., <ext-link xlink:href="https://doi.org/10.1109/OCEANSE.2019.8867552" ext-link-type="DOI">10.1109/OCEANSE.2019.8867552</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib37"><label>37</label><mixed-citation>González, P., Álvarez, E., Díez, J., López-Urrutia, Á., and del Coz, J. J.: Validation methods for plankton image classification systems, Limnology and Oceanography: Methods, 15, 221–237, <ext-link xlink:href="https://doi.org/10.1002/lom3.10151" ext-link-type="DOI">10.1002/lom3.10151</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib38"><label>38</label><mixed-citation>González, P., Castaño, A., Peacock, E. E., Díez, J., Del Coz, J. J., and Sosik, H. M.: Automatic plankton quantification using deep features, Journal of Plankton Research, 41, 449–463, <ext-link xlink:href="https://doi.org/10.1093/plankt/fbz023" ext-link-type="DOI">10.1093/plankt/fbz023</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib39"><label>39</label><mixed-citation>Gorsky, G., Ohman, M. D., Picheral, M., Gasparini, S., Stemmann, L., Romagnan, J.-B., Cawood, A., Pesant, S., Garcia-Comas, C., and Prejger, F.: Digital zooplankton image analysis using the ZooScan integrated system, Journal of Plankton Research, 32, 285–303, <ext-link xlink:href="https://doi.org/10.1093/plankt/fbp124" ext-link-type="DOI">10.1093/plankt/fbp124</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib40"><label>40</label><mixed-citation>Greer, A. T., Woodson, C. B., Smith, C. E., Guigand, C. M., and Cowen, R. K.: Examining mesozooplankton patch structure and its implications for trophic interactions in the northern Gulf of Mexico, Journal of Plankton Research, 38, 1115–1134, <ext-link xlink:href="https://doi.org/10.1093/plankt/fbw033" ext-link-type="DOI">10.1093/plankt/fbw033</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib41"><label>41</label><mixed-citation>Grosjean, P., Picheral, M., Warembourg, C., and Gorsky, G.: Enumeration, measurement, and identification of net zooplankton samples using the ZOOSCAN digital imaging system, ICES J. Mar. Sci., 61, 518–525, <ext-link xlink:href="https://doi.org/10.1016/j.icesjms.2004.03.012" ext-link-type="DOI">10.1016/j.icesjms.2004.03.012</ext-link>, 2004.</mixed-citation></ref>
      <ref id="bib1.bib42"><label>42</label><mixed-citation>Guo, C., Wei, B., and Yu, K.: Deep Transfer Learning for Biology Cross-Domain Image Classification, Journal of Control Science and Engineering, 2021, 2518837, <ext-link xlink:href="https://doi.org/10.1155/2021/2518837" ext-link-type="DOI">10.1155/2021/2518837</ext-link>, 2021a.</mixed-citation></ref>
      <ref id="bib1.bib43"><label>43</label><mixed-citation>Guo, J. and Guan, J.: Classification of Marine Plankton Based on Few-shot Learning, Arab. J. Sci. Eng., 46, 9253–9262, <ext-link xlink:href="https://doi.org/10.1007/s13369-021-05786-2" ext-link-type="DOI">10.1007/s13369-021-05786-2</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib44"><label>44</label><mixed-citation>Guo, J., Ma, Y., and Lee, J. H. W.: Real-time automated identification of algal bloom species for fisheries management in subtropical coastal waters, Journal of Hydro-environment Research, 36, 1–32, <ext-link xlink:href="https://doi.org/10.1016/j.jher.2021.03.002" ext-link-type="DOI">10.1016/j.jher.2021.03.002</ext-link>, 2021b.</mixed-citation></ref>
      <ref id="bib1.bib45"><label>45</label><mixed-citation>Guyon, I. and Elisseeff, A.: An introduction to variable and feature selection, Journal of Machine Learning Research, 3, 1157–1182, 2003.</mixed-citation></ref>
      <ref id="bib1.bib46"><label>46</label><mixed-citation>Guyon, I., Weston, J., Barnhill, S., and Vapnik, V.: Gene Selection for Cancer Classification using Support Vector Machines, Machine Learning, 46, 389–422, <ext-link xlink:href="https://doi.org/10.1023/A:1012487302797" ext-link-type="DOI">10.1023/A:1012487302797</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bib47"><label>47</label><mixed-citation>Hassan, M., Salbitani, G., Carfagna, S., and Khan, J. A.: Deep learning meets marine biology: Optimized fused features and LIME-driven insights for automated plankton classification, Computers in Biology and Medicine, 192, 110273, <ext-link xlink:href="https://doi.org/10.1016/j.compbiomed.2025.110273" ext-link-type="DOI">10.1016/j.compbiomed.2025.110273</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bib48"><label>48</label><mixed-citation>Hastie, T., Tibshirani, R., and Friedman, J.: The elements of statistical learning: data mining, inference, and prediction, Springer Science &amp; Business Media, ISBN-13 978-0387952840, 2009.</mixed-citation></ref>
      <ref id="bib1.bib49"><label>49</label><mixed-citation>He, H. and Garcia, E. A.: Learning from Imbalanced Data, IEEE Transactions on Knowledge and Data Engineering, 21, 1263–1284, <ext-link xlink:href="https://doi.org/10.1109/TKDE.2008.239" ext-link-type="DOI">10.1109/TKDE.2008.239</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bib50"><label>50</label><mixed-citation>Hu, Q. and Davis, C.: Automatic plankton image recognition with co-occurrence matrices and Support Vector Machine, Marine Ecology Progress Series, 295, 21–31, <ext-link xlink:href="https://doi.org/10.3354/meps295021" ext-link-type="DOI">10.3354/meps295021</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bib51"><label>51</label><mixed-citation>Hutchinson, G. E.: The Paradox of the Plankton, The American Naturalist, 95, 137–145, 1961.</mixed-citation></ref>
      <ref id="bib1.bib52"><label>52</label><mixed-citation>Irisson, J.-O., Ayata, S.-D., Lindsay, D. J., Karp-Boss, L., and Stemmann, L.: Machine Learning for the Study of Plankton and Marine Snow from Images, Annu. Rev. Mar. Sci., 14, 277–301, <ext-link xlink:href="https://doi.org/10.1146/annurev-marine-041921-013023" ext-link-type="DOI">10.1146/annurev-marine-041921-013023</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib53"><label>53</label><mixed-citation>Jalabert, L., Signoret, G., Caray-Counil, L., Vilain, M., Martins, E., Lombard, F., Picheral, M., and Irisson, J.-O.: FlowCAMNet: plankton images captured with the FlowCAM, SEANOE [data set], <ext-link xlink:href="https://doi.org/10.17882/101961" ext-link-type="DOI">10.17882/101961</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib54"><label>54</label><mixed-citation>Kareinen, J., Eerola, T., Kraft, K., Lensu, L., Suikkanen, S., and Kälviäinen, H.: Self-Supervised Pretraining for Fine-Grained Plankton Recognition, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.2503.11341" ext-link-type="DOI">10.48550/arXiv.2503.11341</ext-link>, 9 May 2025.</mixed-citation></ref>
      <ref id="bib1.bib55"><label>55</label><mixed-citation>Kelleher, J. D., Mac Namee, B., and D'Arcy, A.: Fundamentals of machine learning for predictive data analytics: algorithms, worked examples, and case studies, MIT Press, ISBN 9780262044691, 2020.</mixed-citation></ref>
      <ref id="bib1.bib56"><label>56</label><mixed-citation>Kerr, T., Clark, J. R., Fileman, E. S., Widdicombe, C. E., and Pugeault, N.: Collaborative Deep Learning Models to Handle Class Imbalance in FlowCam Plankton Imagery, IEEE Access, 8, 170013–170032, <ext-link xlink:href="https://doi.org/10.1109/ACCESS.2020.3022242" ext-link-type="DOI">10.1109/ACCESS.2020.3022242</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib57"><label>57</label><mixed-citation>Kraft, K., Velhonoja, O., Eerola, T., Suikkanen, S., Tamminen, T., Haraguchi, L., Ylöstalo, P., Kielosto, S., Johansson, M., Lensu, L., Kälviäinen, H., Haario, H., and Seppälä, J.: Towards operational phytoplankton recognition with automated high-throughput imaging, near-real-time data processing, and convolutional neural networks, Front. Mar. Sci., 9, <ext-link xlink:href="https://doi.org/10.3389/fmars.2022.867695" ext-link-type="DOI">10.3389/fmars.2022.867695</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib58"><label>58</label><mixed-citation>Krawczyk, B.: Learning from imbalanced data: open challenges and future directions, Prog. Artif. Intell., 5, 221–232, <ext-link xlink:href="https://doi.org/10.1007/s13748-016-0094-0" ext-link-type="DOI">10.1007/s13748-016-0094-0</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib59"><label>59</label><mixed-citation>Krizhevsky, A., Sutskever, I., and Hinton, G. E.: ImageNet Classification with Deep Convolutional Neural Networks, in: Advances in Neural Information Processing Systems 25, edited by: Pereira, F., Burges, C. J. C., Bottou, L., and Weinberger, K. Q., Curran Associates, Inc., 1097–1105, 2012.</mixed-citation></ref>
      <ref id="bib1.bib60"><label>60</label><mixed-citation>Kyathanahally, S. P., Hardeman, T., Merz, E., Bulas, T., Reyes, M., Isles, P., Pomati, F., and Baity-Jesi, M.: Deep Learning Classification of Lake Zooplankton, Frontiers in Microbiology, 12, <ext-link xlink:href="https://doi.org/10.3389/fmicb.2021.746297" ext-link-type="DOI">10.3389/fmicb.2021.746297</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib61"><label>61</label><mixed-citation>Kyathanahally, S. P., Hardeman, T., Reyes, M., Merz, E., Bulas, T., Brun, P., Pomati, F., and Baity-Jesi, M.: Ensembles of data-efficient vision transformers as a new paradigm for automated classification in ecology, Sci. Rep., 12, 18590, <ext-link xlink:href="https://doi.org/10.1038/s41598-022-21910-0" ext-link-type="DOI">10.1038/s41598-022-21910-0</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib62"><label>62</label><mixed-citation>Langeland Teigen, A., Saad, A., and Stahl, A.: Leveraging Similarity Metrics to In-Situ Discover Planktonic Interspecies Variations or Mutations, in: Global Oceans 2020: Singapore – U.S. Gulf Coast, Global Oceans 2020: Singapore – U.S. Gulf Coast, Biloxi, MS, USA,  8 pp., <ext-link xlink:href="https://doi.org/10.1109/IEEECONF38699.2020.9388998" ext-link-type="DOI">10.1109/IEEECONF38699.2020.9388998</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib63"><label>63</label><mixed-citation>Le Cun, Y., Jackel, L. D., Boser, B., Denker, J. S., Graf, H. P., Guyon, I., Henderson, D., Howard, R. E., and Hubbard, W.: Handwritten digit recognition: applications of neural network chips and automatic learning, IEEE Communications Magazine, 27, 41–46, <ext-link xlink:href="https://doi.org/10.1109/35.41400" ext-link-type="DOI">10.1109/35.41400</ext-link>, 1989.</mixed-citation></ref>
      <ref id="bib1.bib64"><label>64</label><mixed-citation>Lee, H., Park, M., and Kim, J.: Plankton classification on imbalanced large scale database via convolutional neural networks with transfer learning, in: 2016 IEEE International Conference on Image Processing (ICIP), 2016 IEEE International Conference on Image Processing (ICIP),  3713–3717, <ext-link xlink:href="https://doi.org/10.1109/ICIP.2016.7533053" ext-link-type="DOI">10.1109/ICIP.2016.7533053</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib65"><label>65</label><mixed-citation>Legendre, P. and Legendre, L.: Numerical ecology, Elsevier, 990 pp., ISBN-13 978-0444538680, 2012.</mixed-citation></ref>
      <ref id="bib1.bib66"><label>66</label><mixed-citation>Li, X. and Cui, Z.: Deep residual networks for plankton classification, in: OCEANS 2016 MTS/IEEE Monterey, OCEANS 2016 MTS/IEEE Monterey,  4 pp., <ext-link xlink:href="https://doi.org/10.1109/OCEANS.2016.7761223" ext-link-type="DOI">10.1109/OCEANS.2016.7761223</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib67"><label>67</label><mixed-citation>Li, X., Long, R., Yan, J., Jin, K., and Lee, J.: TANet: A Tiny Plankton Classification Network for Mobile Devices, Mobile Information Systems, 2019, 6536925, <ext-link xlink:href="https://doi.org/10.1155/2019/6536925" ext-link-type="DOI">10.1155/2019/6536925</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib68"><label>68</label><mixed-citation>Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollár, P.: Focal Loss for Dense Object Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 318–327, <ext-link xlink:href="https://doi.org/10.1109/TPAMI.2018.2858826" ext-link-type="DOI">10.1109/TPAMI.2018.2858826</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib69"><label>69</label><mixed-citation>Liu, J., Du, A., Wang, C., Yu, Z., Zheng, H., Zheng, B., and Zhang, H.: Deep Pyramidal Residual Networks for Plankton Image Classification, in: 2018 OCEANS – MTS/IEEE Kobe Techno-Oceans (OTO), 2018 OCEANS – MTS/IEEE Kobe Techno-Oceans (OTO), 5 pp., <ext-link xlink:href="https://doi.org/10.1109/OCEANSKOBE.2018.8559106" ext-link-type="DOI">10.1109/OCEANSKOBE.2018.8559106</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib70"><label>70</label><mixed-citation>Lombard, F., Boss, E., Waite, A. M., Vogt, M., Uitz, J., Stemmann, L., Sosik, H. M., Schulz, J., Romagnan, J.-B., Picheral, M., Pearlman, J., Ohman, M. D., Niehoff, B., Möller, K. O., Miloslavich, P., Lara-López, A., Kudela, R., Lopes, R. M., Kiko, R., Karp-Boss, L., Jaffe, J. S., Iversen, M. H., Irisson, J.-O., Fennel, K., Hauss, H., Guidi, L., Gorsky, G., Giering, S. L. C., Gaube, P., Gallager, S., Dubelaar, G., Cowen, R. K., Carlotti, F., Briseño-Avena, C., Berline, L., Benoit-Bird, K., Bax, N., Batten, S., Ayata, S. D., Artigas, L. F., and Appeltans, W.: Globally Consistent Quantitative Observations of Planktonic Ecosystems, Front. Mar. Sci., 6, <ext-link xlink:href="https://doi.org/10.3389/fmars.2019.00196" ext-link-type="DOI">10.3389/fmars.2019.00196</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib71"><label>71</label><mixed-citation>Lumini, A. and Nanni, L.: Deep learning and transfer learning features for plankton classification, Ecological Informatics, 51, 33–43, <ext-link xlink:href="https://doi.org/10.1016/j.ecoinf.2019.02.007" ext-link-type="DOI">10.1016/j.ecoinf.2019.02.007</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib72"><label>72</label><mixed-citation>Luo, J. Y., Irisson, J.-O., Graham, B., Guigand, C., Sarafraz, A., Mader, C., and Cowen, R. K.: Automated plankton image analysis using convolutional neural networks, Limnology and Oceanography: Methods, 16, 814–827, <ext-link xlink:href="https://doi.org/10.1002/lom3.10285" ext-link-type="DOI">10.1002/lom3.10285</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib73"><label>73</label><mixed-citation>Luo, T., Kramer, K., Samson, S., Remsen, A., Goldgof, D. B., Hall, L. O., and Hopkins, T.: Active learning to recognize multiple types of plankton, in: Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004, Cambridge, UK, Vol. 3, 478–481, <ext-link xlink:href="https://doi.org/10.1109/ICPR.2004.1334570" ext-link-type="DOI">10.1109/ICPR.2004.1334570</ext-link>, 2004.</mixed-citation></ref>
      <ref id="bib1.bib74"><label>74</label><mixed-citation>Malde, K. and Kim, H.: Beyond image classification: zooplankton identification with deep vector space embeddings, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1909.11380" ext-link-type="DOI">10.48550/arXiv.1909.11380</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib75"><label>75</label><mixed-citation>Malde, K., Handegard, N. O., Eikvil, L., and Salberg, A.-B.: Machine intelligence and the data-driven future of marine science, ICES Journal of Marine Science, 77, 1274–1285, <ext-link xlink:href="https://doi.org/10.1093/icesjms/fsz057" ext-link-type="DOI">10.1093/icesjms/fsz057</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib76"><label>76</label><mixed-citation>Maracani, A., Pastore, V. P., Natale, L., Rosasco, L., and Odone, F.: In-domain versus out-of-domain transfer learning in plankton image classification, Scientific Reports, 13, 10443, <ext-link xlink:href="https://doi.org/10.1038/s41598-023-37627-7" ext-link-type="DOI">10.1038/s41598-023-37627-7</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bib77"><label>77</label><mixed-citation>Masoudi, M., Giering, S. L. C., Eftekhari, N., Massot-Campos, M., Irisson, J.-O., and Thornton, B.: Optimizing Plankton Image Classification With Metadata-Enhanced Representation Learning, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 17, 17117–17133, <ext-link xlink:href="https://doi.org/10.1109/JSTARS.2024.3424498" ext-link-type="DOI">10.1109/JSTARS.2024.3424498</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib78"><label>78</label><mixed-citation>McCarthy, K., Zabar, B., and Weiss, G.: Does cost-sensitive learning beat sampling for classifying rare classes?, in: Proceedings of the 1st international workshop on Utility-based data mining, New York, NY, USA, 69–77, <ext-link xlink:href="https://doi.org/10.1145/1089827.1089836" ext-link-type="DOI">10.1145/1089827.1089836</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bib79"><label>79</label><mixed-citation>Moreno-Torres, J. G., Raeder, T., Alaiz-Rodríguez, R., Chawla, N. V., and Herrera, F.: A unifying view on dataset shift in classification, Pattern Recognition, 45, 521–530, <ext-link xlink:href="https://doi.org/10.1016/j.patcog.2011.06.019" ext-link-type="DOI">10.1016/j.patcog.2011.06.019</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib80"><label>80</label><mixed-citation>Ohman, M. D., Davis, R. E., Sherman, J. T., Grindley, K. R., Whitmore, B. M., Nickels, C. F., and Ellen, J. S.: Zooglider: An autonomous vehicle for optical and acoustic sensing of zooplankton, Limnology and Oceanography: Methods, 17, 69–86, <ext-link xlink:href="https://doi.org/10.1002/lom3.10301" ext-link-type="DOI">10.1002/lom3.10301</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib81"><label>81</label><mixed-citation>Olson, R. J. and Sosik, H. M.: A submersible imaging-in-flow instrument to analyze nano-and microplankton: Imaging FlowCytobot, Limnology and Oceanography: Methods, 5, 195–203, <ext-link xlink:href="https://doi.org/10.4319/lom.2007.5.195" ext-link-type="DOI">10.4319/lom.2007.5.195</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bib82"><label>82</label><mixed-citation>Orenstein, E. C. and Beijbom, O.: Transfer Learning and Deep Feature Extraction for Planktonic Image Data Sets, in: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), 1082–1088, <ext-link xlink:href="https://doi.org/10.1109/WACV.2017.125" ext-link-type="DOI">10.1109/WACV.2017.125</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib83"><label>83</label><mixed-citation>Orenstein, E. C., Beijbom, O., Peacock, E. E., and Sosik, H. M.: WHOI-Plankton – A Large Scale Fine Grained Visual Recognition Benchmark Dataset for Plankton Classification, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1510.00745" ext-link-type="DOI">10.48550/arXiv.1510.00745</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib84"><label>84</label><mixed-citation>Orenstein, E. C., Kenitz, K. M., Roberts, P. L. D., Franks, P. J. S., Jaffe, J. S., and Barton, A. D.: Semi- and fully supervised quantification techniques to improve population estimates from machine classifiers, Limnology and Oceanography: Methods, 18, 739–753, <ext-link xlink:href="https://doi.org/10.1002/lom3.10399" ext-link-type="DOI">10.1002/lom3.10399</ext-link>, 2020a.</mixed-citation></ref>
      <ref id="bib1.bib85"><label>85</label><mixed-citation>Orenstein, E. C., Ratelle, D., Briseño-Avena, C., Carter, M. L., Franks, P. J. S., Jaffe, J. S., and Roberts, P. L. D.: The Scripps Plankton Camera system: A framework and platform for in situ microscopy, Limnology and Oceanography: Methods, 18, 681–695, <ext-link xlink:href="https://doi.org/10.1002/lom3.10394" ext-link-type="DOI">10.1002/lom3.10394</ext-link>, 2020b.</mixed-citation></ref>
      <ref id="bib1.bib86"><label>86</label><mixed-citation>Orenstein, E. C., Ayata, S.-D., Maps, F., Becker, É. C., Benedetti, F., Biard, T., de Garidel-Thoron, T., Ellen, J. S., Ferrario, F., Giering, S. L. C., Guy-Haim, T., Hoebeke, L., Iversen, M. H., Kiørboe, T., Lalonde, J.-F., Lana, A., Laviale, M., Lombard, F., Lorimer, T., Martini, S., Meyer, A., Möller, K. O., Niehoff, B., Ohman, M. D., Pradalier, C., Romagnan, J.-B., Schröder, S.-M., Sonnet, V., Sosik, H. M., Stemmann, L. S., Stock, M., Terbiyik-Kurt, T., Valcárcel-Pérez, N., Vilgrain, L., Wacquet, G., Waite, A. M., and Irisson, J.-O.: Machine learning techniques to characterize functional traits of plankton from image data, Limnology and Oceanography, 67, 1647–1669, <ext-link xlink:href="https://doi.org/10.1002/lno.12101" ext-link-type="DOI">10.1002/lno.12101</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib87"><label>87</label><mixed-citation>Owen, B. M., Tweedley, J. R., Moheimani, N. R., Hallett, C. S., Cosgrove, J. J., and Silberstein, L. P. O.: What is “accuracy”? Rethinking machine learning classifier performance metrics for highly imbalanced, high variance, zero-inflated species count data, Limnology and Oceanography: Methods, <ext-link xlink:href="https://doi.org/10.1002/lom3.70009" ext-link-type="DOI">10.1002/lom3.70009</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bib88"><label>88</label><mixed-citation>Panaïotis, T. and Amblard, E.: ThelmaPana/plankton_classif, Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.17937437" ext-link-type="DOI">10.5281/zenodo.17937437</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bib89"><label>89</label><mixed-citation>Panaïotis, T., Caray-Counil, L., Woodward, B., Schmid, M. S., Daprano, D., Tsai, S. T., Sullivan, C. M., Cowen, R. K., and Irisson, J.-O.: Content-Aware Segmentation of Objects Spanning a Large Size Range: Application to Plankton Images, Frontiers in Marine Science, 9, <ext-link xlink:href="https://doi.org/10.3389/fmars.2022.870005" ext-link-type="DOI">10.3389/fmars.2022.870005</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib90"><label>90</label><mixed-citation>Panaïotis, T., Caray-Counil, L., Jalabert, L., and Irisson, J.-O.: ISIISNet: plankton images captured with the ISIIS (In-situ Ichthyoplankton Imaging System), SEANOE [data set], <ext-link xlink:href="https://doi.org/10.17882/101950" ext-link-type="DOI">10.17882/101950</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib91"><label>91</label><mixed-citation>Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., and Chintala, S.: PyTorch: An Imperative Style, High-Performance Deep Learning Library, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1912.01703" ext-link-type="DOI">10.48550/arXiv.1912.01703</ext-link>, 3 December 2019.</mixed-citation></ref>
      <ref id="bib1.bib92"><label>92</label><mixed-citation>Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, É.: Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research, 12, 2825–2830, 2011.</mixed-citation></ref>
      <ref id="bib1.bib93"><label>93</label><mixed-citation>Péron, F. and Lesueur, C. A.: Tableau des caractères génériques et spécifiques de toutes les espèces de méduses connues jusqu'à ce jour, in: Annales du Muséum d'Histoire Naturelle, 325–366, 1810.</mixed-citation></ref>
      <ref id="bib1.bib94"><label>94</label><mixed-citation>Picheral, M., Guidi, L., Stemmann, L., Karl, D. M., Iddaoud, G., and Gorsky, G.: The Underwater Vision Profiler 5: An advanced instrument for high spatial resolution studies of particle size spectra and zooplankton, Limnology and Oceanography: Methods, 8, 462–473, <ext-link xlink:href="https://doi.org/10.4319/lom.2010.8.462" ext-link-type="DOI">10.4319/lom.2010.8.462</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib95"><label>95</label><mixed-citation>Picheral, M., Colin, S., and Irisson, J.-O.: EcoTaxa, a tool for the taxonomic classification of images, <uri>https://ecotaxa.obs-vlfr.fr/</uri> (last access: 13 November 2020), 2017.</mixed-citation></ref>
      <ref id="bib1.bib96"><label>96</label><mixed-citation>Picheral, M., Catalano, C., Brousseau, D., Claustre, H., Coppola, L., Leymarie, E., Coindat, J., Dias, F., Fevre, S., Guidi, L., Irisson, J. O., Legendre, L., Lombard, F., Mortier, L., Penkerch, C., Rogge, A., Schmechtig, C., Thibault, S., Tixier, T., Waite, A., and Stemmann, L.: The Underwater Vision Profiler 6: an imaging sensor of particle size spectra and plankton, for autonomous and cabled platforms, Limnology and Oceanography: Methods, 20, 115–129, <ext-link xlink:href="https://doi.org/10.1002/lom3.10475" ext-link-type="DOI">10.1002/lom3.10475</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib97"><label>97</label><mixed-citation>Picheral, M., Courchet, L., Jalabert, L., Motreuil, S., Carray-Counil, L., Ricour, F., and Petit, F.: UVP6Net: plankton images captured with the UVP6, SEANOE [data set], <ext-link xlink:href="https://doi.org/10.17882/101948" ext-link-type="DOI">10.17882/101948</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib98"><label>98</label><mixed-citation>Pollina, T., Larson, A. G., Lombard, F., Li, H., Le Guen, D., Colin, S., de Vargas, C., and Prakash, M.: PlanktoScope: Affordable Modular Quantitative Imaging Platform for Citizen Oceanography, Frontiers in Marine Science, 9, <ext-link xlink:href="https://doi.org/10.3389/fmars.2022.949428" ext-link-type="DOI">10.3389/fmars.2022.949428</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib99"><label>99</label><mixed-citation>Py, O., Hong, H., and Zhongzhi, S.: Plankton classification with deep convolutional neural networks, in: 2016 IEEE Information Technology, Networking, Electronic and Automation Control Conference, 132–136, <ext-link xlink:href="https://doi.org/10.1109/ITNEC.2016.7560334" ext-link-type="DOI">10.1109/ITNEC.2016.7560334</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib100"><label>100</label><mixed-citation>Raghu, M., Unterthiner, T., Kornblith, S., Zhang, C., and Dosovitskiy, A.: Do Vision Transformers See Like Convolutional Neural Networks?, in: Advances in Neural Information Processing Systems, 12116–12128, 2021.</mixed-citation></ref>
      <ref id="bib1.bib101"><label>101</label><mixed-citation>Robinson, K. L., Sponaugle, S., Luo, J. Y., Gleiber, M. R., and Cowen, R. K.: Big or small, patchy all: Resolution of marine plankton patch structure at micro- to submesoscales for 36 taxa, Science Advances, 7, eabk2904, <ext-link xlink:href="https://doi.org/10.1126/sciadv.abk2904" ext-link-type="DOI">10.1126/sciadv.abk2904</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib102"><label>102</label><mixed-citation>Rodrigues, F. C. M., Hirata, N. S., Abello, A. A., Leandro, T., La Cruz, D., Lopes, R. M., and Hirata Jr, R.: Evaluation of Transfer Learning Scenarios in Plankton Image Classification, in: VISIGRAPP (5: VISAPP), 359–366, 2018.</mixed-citation></ref>
      <ref id="bib1.bib103"><label>103</label><mixed-citation>Romagnan, J.-B., Panaïotis, T., Bourriau, P., Danielou, M.-M., Doray, M., Dupuy, C., Forest, B., Grandremy, N., Huret, M., Le Mestre, S., Nowaczyk, A., Petitgas, P., Pineau, P., Rouxel, J., Tardivel, M., and Irisson, J.-O.: ZooCAMNet: plankton images captured with the ZooCAM, SEANOE [data set], <ext-link xlink:href="https://doi.org/10.17882/101928" ext-link-type="DOI">10.17882/101928</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib104"><label>104</label><mixed-citation>Rubbens, P., Brodie, S., Cordier, T., Destro Barcellos, D., Devos, P., Fernandes-Salvador, J. A., Fincham, J. I., Gomes, A., Handegard, N. O., Howell, K., Jamet, C., Kartveit, K. H., Moustahfid, H., Parcerisas, C., Politikos, D., Sauzède, R., Sokolova, M., Uusitalo, L., Van den Bulcke, L., van Helmond, A. T. M., Watson, J. T., Welch, H., Beltran-Perez, O., Chaffron, S., Greenberg, D. S., Kühn, B., Kiko, R., Lo, M., Lopes, R. M., Möller, K. O., Michaels, W., Pala, A., Romagnan, J.-B., Schuchert, P., Seydi, V., Villasante, S., Malde, K., and Irisson, J.-O.: Machine learning in marine ecology: an overview of techniques and applications, ICES Journal of Marine Science, 80, 1829–1853, <ext-link xlink:href="https://doi.org/10.1093/icesjms/fsad100" ext-link-type="DOI">10.1093/icesjms/fsad100</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bib105"><label>105</label><mixed-citation>Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L.: ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, 115, 211–252, <ext-link xlink:href="https://doi.org/10.1007/s11263-015-0816-y" ext-link-type="DOI">10.1007/s11263-015-0816-y</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib106"><label>106</label><mixed-citation>Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.-C.: MobileNetV2: Inverted Residuals and Linear Bottlenecks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4510–4520, 2018.</mixed-citation></ref>
      <ref id="bib1.bib107"><label>107</label><mixed-citation>Schmid, M. S., Cowen, R. K., Robinson, K., Luo, J. Y., Briseño-Avena, C., and Sponaugle, S.: Prey and predator overlap at the edge of a mesoscale eddy: fine-scale, in-situ distributions to inform our understanding of oceanographic processes, Scientific Reports, 10, 1–16, <ext-link xlink:href="https://doi.org/10.1038/s41598-020-57879-x" ext-link-type="DOI">10.1038/s41598-020-57879-x</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib108"><label>108</label><mixed-citation>Schneider, C. A., Rasband, W. S., and Eliceiri, K. W.: NIH Image to ImageJ: 25 years of image analysis, Nature Methods, 9, 671–675, <ext-link xlink:href="https://doi.org/10.1038/nmeth.2089" ext-link-type="DOI">10.1038/nmeth.2089</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib109"><label>109</label><mixed-citation>Schröder, S.-M., Kiko, R., Irisson, J.-O., and Koch, R.: Low-Shot Learning of Plankton Categories, in: Pattern Recognition, vol. 11269, edited by: Brox, T., Bruhn, A., and Fritz, M., Springer International Publishing, Cham, 391–404, <ext-link xlink:href="https://doi.org/10.1007/978-3-030-12939-2_27" ext-link-type="DOI">10.1007/978-3-030-12939-2_27</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib110"><label>110</label><mixed-citation>Schröder, S.-M., Kiko, R., and Koch, R.: MorphoCluster: Efficient Annotation of Plankton Images by Clustering, Sensors, 20, 3060, <ext-link xlink:href="https://doi.org/10.3390/s20113060" ext-link-type="DOI">10.3390/s20113060</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib111"><label>111</label><mixed-citation>Ser-Giacomi, E., Zinger, L., Malviya, S., De Vargas, C., Karsenti, E., Bowler, C., and De Monte, S.: Ubiquitous abundance distribution of non-dominant plankton across the global ocean, Nature Ecology &amp; Evolution, 2, 1243–1249, <ext-link xlink:href="https://doi.org/10.1038/s41559-018-0587-2" ext-link-type="DOI">10.1038/s41559-018-0587-2</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib112"><label>112</label><mixed-citation>Shorten, C. and Khoshgoftaar, T. M.: A survey on Image Data Augmentation for Deep Learning, Journal of Big Data, 6, 60, <ext-link xlink:href="https://doi.org/10.1186/s40537-019-0197-0" ext-link-type="DOI">10.1186/s40537-019-0197-0</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib113"><label>113</label><mixed-citation>Sieracki, C. K., Sieracki, M. E., and Yentsch, C. S.: An imaging-in-flow system for automated analysis of marine microplankton, Marine Ecology Progress Series, 168, 285–296, <ext-link xlink:href="https://doi.org/10.3354/meps168285" ext-link-type="DOI">10.3354/meps168285</ext-link>, 1998.</mixed-citation></ref>
      <ref id="bib1.bib114"><label>114</label><mixed-citation>Smith, L. N.: A disciplined approach to neural network hyper-parameters: Part 1 – learning rate, batch size, momentum, and weight decay, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1803.09820" ext-link-type="DOI">10.48550/arXiv.1803.09820</ext-link>, 24 April 2018.</mixed-citation></ref>
      <ref id="bib1.bib115"><label>115</label><mixed-citation>Soda, P.: A multi-objective optimisation approach for class imbalance learning, Pattern Recognition, 44, 1801–1810, <ext-link xlink:href="https://doi.org/10.1016/j.patcog.2011.01.015" ext-link-type="DOI">10.1016/j.patcog.2011.01.015</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bib116"><label>116</label><mixed-citation>Sosik, H. M. and Olson, R. J.: Automated taxonomic classification of phytoplankton sampled with imaging-in-flow cytometry, Limnology and Oceanography: Methods, 5, 204–216, <ext-link xlink:href="https://doi.org/10.4319/lom.2007.5.204" ext-link-type="DOI">10.4319/lom.2007.5.204</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bib117"><label>117</label><mixed-citation>Sosik, H. M., Peacock, E. E., and Brownlee, E. F.: WHOI-Plankton. Annotated Plankton Images – Data Set for Developing and Evaluating Classification Methods, MBLWHOI Library [data set], <ext-link xlink:href="https://doi.org/10.1575/1912/7341" ext-link-type="DOI">10.1575/1912/7341</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib118"><label>118</label><mixed-citation>Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research, 15, 1929–1958, 2014.</mixed-citation></ref>
      <ref id="bib1.bib119"><label>119</label><mixed-citation>Sun, Y., Wong, A. K. C., and Kamel, M. S.: Classification of imbalanced data: a review, International Journal of Pattern Recognition and Artificial Intelligence, 23, 687–719, <ext-link xlink:href="https://doi.org/10.1142/S0218001409007326" ext-link-type="DOI">10.1142/S0218001409007326</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bib120"><label>120</label><mixed-citation>Tan, M. and Le, Q.: EfficientNetV2: Smaller Models and Faster Training, in: Proceedings of the 38th International Conference on Machine Learning, 10096–10106, <uri>https://proceedings.mlr.press/v139/tan21a.html</uri> (last access: 15 December 2025), 2021.</mixed-citation></ref>
      <ref id="bib1.bib121"><label>121</label><mixed-citation>Tang, X., Stewart, W. K., Huang, H., Gallager, S. M., Davis, C. S., Vincent, L., and Marra, M.: Automatic Plankton Image Recognition, Artificial Intelligence Review, 12, 177–199, <ext-link xlink:href="https://doi.org/10.1023/A:1006517211724" ext-link-type="DOI">10.1023/A:1006517211724</ext-link>, 1998.</mixed-citation></ref>
      <ref id="bib1.bib122"><label>122</label><mixed-citation>Tappan, H. and Loeblich, A. R.: Evolution of the oceanic plankton, Earth-Science Reviews, 9, 207–240, <ext-link xlink:href="https://doi.org/10.1016/0012-8252(73)90092-5" ext-link-type="DOI">10.1016/0012-8252(73)90092-5</ext-link>, 1973.</mixed-citation></ref>
      <ref id="bib1.bib123"><label>123</label><mixed-citation>Uchida, K., Tanaka, M., and Okutomi, M.: Coupled convolution layer for convolutional neural network, Neural Networks, 105, 197–205, <ext-link xlink:href="https://doi.org/10.1016/j.neunet.2018.05.002" ext-link-type="DOI">10.1016/j.neunet.2018.05.002</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib124"><label>124</label><mixed-citation>Van Horn, G. and Perona, P.: The Devil is in the Tails: Fine-grained Classification in the Wild, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1709.01450" ext-link-type="DOI">10.48550/arXiv.1709.01450</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib125"><label>125</label><mixed-citation>Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I.: Attention is All you Need, in: Advances in Neural Information Processing Systems, 2017.</mixed-citation></ref>
      <ref id="bib1.bib126"><label>126</label><mixed-citation>Venkataramanan, A., Laviale, M., Figus, C., Usseglio-Polatera, P., and Pradalier, C.: Tackling Inter-class Similarity and Intra-class Variance for Microscopic Image-Based Classification, in: Computer Vision Systems, Cham, 93–103, <ext-link xlink:href="https://doi.org/10.1007/978-3-030-87156-7_8" ext-link-type="DOI">10.1007/978-3-030-87156-7_8</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib127"><label>127</label><mixed-citation>van der Walt, S., Schönberger, J. L., Nunez-Iglesias, J., Boulogne, F., Warner, J. D., Yager, N., Gouillart, E., and Yu, T.: scikit-image: image processing in Python, PeerJ, 2, e453, <ext-link xlink:href="https://doi.org/10.7717/peerj.453" ext-link-type="DOI">10.7717/peerj.453</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib128"><label>128</label><mixed-citation>Ware, D. M. and Thomson, R. E.: Bottom-Up Ecosystem Trophic Dynamics Determine Fish Production in the Northeast Pacific, Science, 308, 1280–1284, <ext-link xlink:href="https://doi.org/10.1126/SCIENCE.1109049" ext-link-type="DOI">10.1126/SCIENCE.1109049</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bib129"><label>129</label><mixed-citation>Yan, J., Li, X., and Cui, Z.: A More Efficient CNN Architecture for Plankton Classification, in: Computer Vision, Springer, Singapore, 198–208, <ext-link xlink:href="https://doi.org/10.1007/978-981-10-7305-2_18" ext-link-type="DOI">10.1007/978-981-10-7305-2_18</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib130"><label>130</label><mixed-citation>Yosinski, J., Clune, J., Bengio, Y., and Lipson, H.: How transferable are features in deep neural networks?, in: Advances in Neural Information Processing Systems, 27, 2014.</mixed-citation></ref>
      <ref id="bib1.bib131"><label>131</label><mixed-citation>Zebin, T., Scully, P. J., Peek, N., Casson, A. J., and Ozanyan, K. B.: Design and Implementation of a Convolutional Neural Network on an Edge Computing Smartphone for Human Activity Recognition, IEEE Access, 7, 133509–133520, <ext-link xlink:href="https://doi.org/10.1109/ACCESS.2019.2941836" ext-link-type="DOI">10.1109/ACCESS.2019.2941836</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib132"><label>132</label><mixed-citation>Zeiler, M. D. and Fergus, R.: Visualizing and Understanding Convolutional Networks, in: Computer Vision – ECCV 2014, 818–833, <ext-link xlink:href="https://doi.org/10.1007/978-3-319-10590-1_53" ext-link-type="DOI">10.1007/978-3-319-10590-1_53</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib133"><label>133</label><mixed-citation>Zhang, C., Zhang, M., Zhang, S., Jin, D., Zhou, Q., Cai, Z., Zhao, H., Liu, X., and Liu, Z.: Delving Deep into the Generalization of Vision Transformers under Distribution Shifts, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.2106.07617" ext-link-type="DOI">10.48550/arXiv.2106.07617</ext-link>, 8 March 2022.</mixed-citation></ref>
      <ref id="bib1.bib134"><label>134</label><mixed-citation>Zheng, H., Wang, R., Yu, Z., Wang, N., Gu, Z., and Zheng, B.: Automatic plankton image classification combining multiple view features via multiple kernel learning, BMC Bioinformatics, 18, 570, <ext-link xlink:href="https://doi.org/10.1186/s12859-017-1954-8" ext-link-type="DOI">10.1186/s12859-017-1954-8</ext-link>, 2017.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>Benchmark of plankton images classification: emphasizing features extraction over classifier complexity</article-title-html>
<abstract-html/>
<ref-html id="bib1.bib1"><label>1</label><mixed-citation>
      
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.1603.04467" target="_blank">https://doi.org/10.48550/arXiv.1603.04467</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>2</label><mixed-citation>
      
Anglès, S., Jordi, A., and Campbell, L.: Responses of the coastal
phytoplankton community to tropical cyclones revealed by high-frequency
imaging flow cytometry, Limnology and Oceanography, 60, 1562–1576,
<a href="https://doi.org/10.1002/lno.10117" target="_blank">https://doi.org/10.1002/lno.10117</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>3</label><mixed-citation>
      
Baker, N., Lu, H., Erlikhman, G., and Kellman, P. J.: Deep convolutional
networks do not classify based on global object shape, PLOS Computational
Biology, 14, e1006613, <a href="https://doi.org/10.1371/journal.pcbi.1006613" target="_blank">https://doi.org/10.1371/journal.pcbi.1006613</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>4</label><mixed-citation>
      
Bendale, A. and Boult, T. E.: Towards Open Set Deep Networks, Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition, 1563–1572,  <a href="https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Bendale_Towards_Open_Set_CVPR_2016_paper.html" target="_blank"/>
(last access: 15 December 2025),
2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>5</label><mixed-citation>
      
Benfield, M., Grosjean, P., Culverhouse, P., Irigolen, X., Sieracki, M.,
Lopez-Urrutia, A., Dam, H., Hu, Q., Davis, C., Hanson, A., Pilskaln, C.,
Riseman, E., Schulz, H., Utgoff, P., and Gorsky, G.: RAPID: Research on
Automated Plankton Identification, Oceanography, 20, 172–187,
<a href="https://doi.org/10.5670/oceanog.2007.63" target="_blank">https://doi.org/10.5670/oceanog.2007.63</a>, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>6</label><mixed-citation>
      
Bi, H., Guo, Z., Benfield, M. C., Fan, C., Ford, M., Shahrestani, S., and
Sieracki, J. M.: A Semi-Automated Image Analysis Procedure for In Situ
Plankton Imaging Systems, PLOS ONE, 10, e0127121,
<a href="https://doi.org/10.1371/journal.pone.0127121" target="_blank">https://doi.org/10.1371/journal.pone.0127121</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>7</label><mixed-citation>
      
Blaschko, M. B., Holness, G., Mattar, M. A., Lisin, D., Utgoff, P. E.,
Hanson, A. R., Schultz, H., Riseman, E. M., Sieracki, M. E., and Balch, W.
M.: Automatic in situ identification of plankton, in: 2005 Seventh IEEE
Workshops on Applications of Computer Vision (WACV/MOTION'05), Vol. 1,
79–86,  <a href="https://doi.org/10.1109/ACVMOT.2005.29" target="_blank">https://doi.org/10.1109/ACVMOT.2005.29</a>, 2005.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>8</label><mixed-citation>
      
Breiman, L.: Random Forests, Machine Learning, 45, 5–32,
<a href="https://doi.org/10.1023/A:1010933404324" target="_blank">https://doi.org/10.1023/A:1010933404324</a>, 2001.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>9</label><mixed-citation>
      
Callejas, S., Lira, H., Berry, A., Martí, L., and Sanchez-Pi, N.: No
Plankton Left Behind: Preliminary Results on Massive Plankton Image
Recognition, in: High Performance Computing, Cham, 170–185, <a href="https://doi.org/10.1007/978-3-031-80084-9_12" target="_blank">https://doi.org/10.1007/978-3-031-80084-9_12</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>10</label><mixed-citation>
      
Chellapilla, K., Puri, S., and Simard, P.: High Performance Convolutional
Neural Networks for Document Processing, Tenth International Workshop on
Frontiers in Handwriting Recognition,  <a href="https://inria.hal.science/inria-00112631v1" target="_blank"/>
(last access: 15 December 2025),
2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>11</label><mixed-citation>
      
Chen, C., Liaw, A., and Breiman, L.: Using Random Forest to Learn Imbalanced
Data, <a href="https://statistics.berkeley.edu/sites/default/files/tech-reports/666.pdf" target="_blank"/>
(last access: 15 December 2025), 2004.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>12</label><mixed-citation>
      
Cheng, K., Cheng, X., Wang, Y., Bi, H., and Benfield, M. C.: Enhanced
convolutional neural network for plankton identification and enumeration,
PLOS ONE, 14, e0219570,
<a href="https://doi.org/10.1371/journal.pone.0219570" target="_blank">https://doi.org/10.1371/journal.pone.0219570</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>13</label><mixed-citation>
      
Ciranni, M., Gjergji, A., Maracani, A., Murino, V., and Pastore, V. P.:
In-domain self-supervised learning for plankton image classification on a
budget, Proceedings of the Winter Conference on Applications of Computer
Vision, 1588–1597,  <a href="https://openaccess.thecvf.com/content/WACV2025W/MaCVi/html/Ciranni_In-domain_self-supervised_learning_for_plankton_image_classification_on_a_budget_WACVW_2025_paper.html" target="_blank"/>
(last access: &thinsp;15 December 2025), 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>14</label><mixed-citation>
      
Colas, F., Tardivel, M., Perchoc, J., Lunven, M., Forest, B., Guyader, G.,
Danielou, M. M., Le Mestre, S., Bourriau, P., Antajan, E., Sourisseau, M.,
Huret, M., Petitgas, P., and Romagnan, J. B.: The ZooCAM, a new in-flow
imaging system for fast onboard counting, sizing and classification of fish
eggs and metazooplankton, Progress in Oceanography, 166, 54–65, <a href="https://doi.org/10.1016/j.pocean.2017.10.014" target="_blank">https://doi.org/10.1016/j.pocean.2017.10.014</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>15</label><mixed-citation>
      
Colin, S., Coelho, L. P., Sunagawa, S., Bowler, C., Karsenti, E., Bork, P.,
Pepperkok, R., and de Vargas, C.: Quantitative 3D-imaging for cell biology
and ecology of environmental microbial eukaryotes, eLife, 6, e26066,
<a href="https://doi.org/10.7554/eLife.26066" target="_blank">https://doi.org/10.7554/eLife.26066</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>16</label><mixed-citation>
      
Cowen, R. K. and Guigand, C. M.: In situ ichthyoplankton imaging system
(ISIIS): system design and preliminary results, Limnology and Oceanography:
Methods, 6, 126–132, <a href="https://doi.org/10.4319/lom.2008.6.126" target="_blank">https://doi.org/10.4319/lom.2008.6.126</a>,
2008.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>17</label><mixed-citation>
      
Cowen, R. K., Sponaugle, S., Robinson, K. L., Luo, J., Oregon State
University, and Hatfield Marine Science Center: PlanktonSet 1.0: Plankton
imagery data collected from F. G. Walton Smith in Straits of Florida from
2014-06-03 to 2014-06-06 and used in the 2015 National Data Science Bowl, NCEI Accession 0127422, <a href="https://doi.org/10.7289/v5d21vjd" target="_blank">https://doi.org/10.7289/v5d21vjd</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>18</label><mixed-citation>
      
Cui, J., Wei, B., Wang, C., Yu, Z., Zheng, H., Zheng, B., and Yang, H.:
Texture and Shape Information Fusion of Convolutional Neural Network for
Plankton Image Classification, in: 2018 OCEANS – MTS/IEEE Kobe Techno-Oceans
(OTO), 2018 OCEANS – MTS/IEEE Kobe Techno-Oceans (OTO), 5 pp., <a href="https://doi.org/10.1109/OCEANSKOBE.2018.8559156" target="_blank">https://doi.org/10.1109/OCEANSKOBE.2018.8559156</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>19</label><mixed-citation>
      
Cui, Y., Jia, M., Lin, T.-Y., Song, Y., and Belongie, S.: Class-Balanced
Loss Based on Effective Number of Samples, Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition, 9268–9277, <a href="https://openaccess.thecvf.com/content_CVPR_2019/html/Cui_Class-Balanced_Loss_Based_on_Effective_Number_of_Samples_CVPR_2019_paper.html" target="_blank">https://openaccess.thecvf.com/content_CVPR_2019/html/Cui_Class-Balanced_Loss_Based_on_Effective_Number_of_Samples_CVPR_2019_paper.html</a>
(last access: 15 December 2025), 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>20</label><mixed-citation>
      
Culverhouse, P. F., Simpson, R. G., Ellis, R., Lindley, J. A., Williams, R.,
Parisini, T., Reguera, B., Bravo, I., Zoppoli, R., Earnshaw, G., McCall, H.,
and Smith, G.: Automatic classification of field-collected dinoflagellates
by artificial neural network, Marine Ecology Progress Series, 139, 281–287,
<a href="https://doi.org/10.3354/meps139281" target="_blank">https://doi.org/10.3354/meps139281</a>, 1996.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>21</label><mixed-citation>
      
Dai, J., Wang, R., Zheng, H., Ji, G., and Qiao, X.: ZooplanktoNet: Deep
convolutional network for zooplankton classification, in: OCEANS 2016 –
Shanghai, OCEANS 2016 – Shanghai, 6 pp.,
<a href="https://doi.org/10.1109/OCEANSAP.2016.7485680" target="_blank">https://doi.org/10.1109/OCEANSAP.2016.7485680</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>22</label><mixed-citation>
      
Dai, J., Yu, Z., Zheng, H., Zheng, B., and Wang, N.: A Hybrid Convolutional
Neural Network for Plankton Classification, in: Computer Vision – ACCV 2016
Workshops, Cham,  102–114, <a href="https://doi.org/10.1007/978-3-319-54526-4_8" target="_blank">https://doi.org/10.1007/978-3-319-54526-4_8</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>23</label><mixed-citation>
      
Dieleman, S., Fauw, J. D., and Kavukcuoglu, K.: Exploiting Cyclic Symmetry in Convolutional Neural Networks, in: Proceedings of The 33rd International Conference on Machine Learning, International Conference on Machine Learning, 1889–1898, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>24</label><mixed-citation>
      
Drago, L., Panaïotis, T., Irisson, J.-O., Babin, M., Biard, T.,
Carlotti, F., Coppola, L., Guidi, L., Hauss, H., Karp-Boss, L., Lombard, F.,
McDonnell, A. M. P., Picheral, M., Rogge, A., Waite, A. M., Stemmann, L.,
and Kiko, R.: Global Distribution of Zooplankton Biomass Estimated by In
Situ Imaging and Machine Learning, Frontiers in Marine Science, 9,
<a href="https://doi.org/10.3389/fmars.2022.894372" target="_blank">https://doi.org/10.3389/fmars.2022.894372</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>25</label><mixed-citation>
      
Du, A., Gu, Z., Yu, Z., Zheng, H., and Zheng, B.: Plankton Image
Classification Using Deep Convolutional Neural Networks with Second-order
Features, in: Global Oceans 2020: Singapore – U.S. Gulf Coast, Global
Oceans 2020: Singapore – U.S. Gulf Coast, 5 pp., <a href="https://doi.org/10.1109/IEEECONF38699.2020.9389034" target="_blank">https://doi.org/10.1109/IEEECONF38699.2020.9389034</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>26</label><mixed-citation>
      
Dyck, L. E. van, Kwitt, R., Denzler, S. J., and Gruber, W. R.: Comparing
Object Recognition in Humans and Deep Convolutional Neural Networks – An Eye
Tracking Study, Frontiers in Neuroscience, 15, 750639, <a href="https://doi.org/10.3389/fnins.2021.750639" target="_blank">https://doi.org/10.3389/fnins.2021.750639</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>27</label><mixed-citation>
      
Eerola, T., Batrakhanov, D., Barazandeh, N. V., Kraft, K., Haraguchi, L.,
Lensu, L., Suikkanen, S., Seppälä, J., Tamminen, T., and
Kälviäinen, H.: Survey of automatic plankton image recognition:
challenges, existing solutions and future perspectives, Artif. Intell. Rev.,
57, 114, <a href="https://doi.org/10.1007/s10462-024-10745-y" target="_blank">https://doi.org/10.1007/s10462-024-10745-y</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>28</label><mixed-citation>
      
Eftekhari, N., Pitois, S., Masoudi, M., Blackwell, R. E., Scott, J.,
Giering, S. L. C., and Fry, M.: Improving in Situ Real-Time Classification
of Long-Tail Marine Plankton Images for Ecosystem Studies, in: Computer
Vision – ECCV 2024 Workshops, Cham, 121–134, <a href="https://doi.org/10.1007/978-3-031-92387-6_8" target="_blank">https://doi.org/10.1007/978-3-031-92387-6_8</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>29</label><mixed-citation>
      
Elineau, A., Desnos, C., Jalabert, L., Olivier, M., Romagnan, J.-B., Costa
Brandao, M., Lombard, F., Llopis, N., Courboulès, J., Caray-Counil, L.,
Serranito, B., Irisson, J.-O., Picheral, M., Gorsky, G., and Stemmann, L.:
ZooScanNet: plankton images captured with the ZooScan, SEANOE [data set], <a href="https://doi.org/10.17882/55741" target="_blank">https://doi.org/10.17882/55741</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>30</label><mixed-citation>
      
Ellen, J., Li, H., and Ohman, M. D.: Quantifying California Current
plankton samples with efficient machine learning techniques, in: OCEANS 2015
– MTS/IEEE Washington, OCEANS 2015 – MTS/IEEE Washington,  9 pp., <a href="https://doi.org/10.23919/OCEANS.2015.7404607" target="_blank">https://doi.org/10.23919/OCEANS.2015.7404607</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>31</label><mixed-citation>
      
Ellen, J. S. and Ohman, M. D.: Beyond transfer learning: Leveraging
ancillary images in automated classification of plankton, Limnology and
Oceanography: Methods, 22, 943–952, <a href="https://doi.org/10.1002/lom3.10648" target="_blank">https://doi.org/10.1002/lom3.10648</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>32</label><mixed-citation>
      
Ellen, J. S., Graff, C. A., and Ohman, M. D.: Improving plankton image
classification using context metadata, Limnology and Oceanography: Methods,
17, 439–461, <a href="https://doi.org/10.1002/lom3.10324" target="_blank">https://doi.org/10.1002/lom3.10324</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>33</label><mixed-citation>
      
Faillettaz, R., Picheral, M., Luo, J. Y., Guigand, C., Cowen, R. K., and
Irisson, J.-O.: Imperfect automatic image classification successfully
describes plankton distribution patterns, Methods in Oceanography, 15–16,
60–77, <a href="https://doi.org/10.1016/J.MIO.2016.04.003" target="_blank">https://doi.org/10.1016/J.MIO.2016.04.003</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>34</label><mixed-citation>
      
Falkowski, P.: Ocean Science: The power of plankton, Nature, 483, S17–S20,
<a href="https://doi.org/10.1038/483S17a" target="_blank">https://doi.org/10.1038/483S17a</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>35</label><mixed-citation>
      
Fernández-Delgado, M., Cernadas, E., Barro, S., and Amorim, D.: Do we need hundreds of classifiers to solve real world classification problems?, The Journal of Machine Learning Research, 15, 3133–3181, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>36</label><mixed-citation>
      
Geraldes, P., Barbosa, J., Martins, A., Dias, A., Magalhães, C., Ramos,
S., and Silva, E.: In situ real-time Zooplankton Detection and
Classification, in: OCEANS 2019 – Marseille, OCEANS 2019 – Marseille, 6 pp.,
<a href="https://doi.org/10.1109/OCEANSE.2019.8867552" target="_blank">https://doi.org/10.1109/OCEANSE.2019.8867552</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>37</label><mixed-citation>
      
González, P., Álvarez, E., Díez, J., López-Urrutia, Á.,
and del Coz, J. J.: Validation methods for plankton image classification
systems, Limnology and Oceanography: Methods, 15, 221–237, <a href="https://doi.org/10.1002/lom3.10151" target="_blank">https://doi.org/10.1002/lom3.10151</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>38</label><mixed-citation>
      
González, P., Castaño, A., Peacock, E. E., Díez, J., Del Coz,
J. J., and Sosik, H. M.: Automatic plankton quantification using deep
features, Journal of Plankton Research, 41, 449–463, <a href="https://doi.org/10.1093/plankt/fbz023" target="_blank">https://doi.org/10.1093/plankt/fbz023</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>39</label><mixed-citation>
      
Gorsky, G., Ohman, M. D., Picheral, M., Gasparini, S., Stemmann, L.,
Romagnan, J.-B., Cawood, A., Pesant, S., Garcia-Comas, C., and Prejger, F.:
Digital zooplankton image analysis using the ZooScan integrated system,
Journal of Plankton Research, 32, 285–303, <a href="https://doi.org/10.1093/plankt/fbp124" target="_blank">https://doi.org/10.1093/plankt/fbp124</a>, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>40</label><mixed-citation>
      
Greer, A. T., Woodson, C. B., Smith, C. E., Guigand, C. M., and Cowen, R.
K.: Examining mesozooplankton patch structure and its implications for
trophic interactions in the northern Gulf of Mexico, Journal of Plankton
Research, 38, 1115–1134, <a href="https://doi.org/10.1093/plankt/fbw033" target="_blank">https://doi.org/10.1093/plankt/fbw033</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>41</label><mixed-citation>
      
Grosjean, P., Picheral, M., Warembourg, C., and Gorsky, G.: Enumeration,
measurement, and identification of net zooplankton samples using the ZOOSCAN
digital imaging system, ICES J. Mar. Sci., 61, 518–525, <a href="https://doi.org/10.1016/j.icesjms.2004.03.012" target="_blank">https://doi.org/10.1016/j.icesjms.2004.03.012</a>, 2004.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>42</label><mixed-citation>
      
Guo, C., Wei, B., and Yu, K.: Deep Transfer Learning for Biology
Cross-Domain Image Classification, Journal of Control Science and
Engineering, 2021, 2518837, <a href="https://doi.org/10.1155/2021/2518837" target="_blank">https://doi.org/10.1155/2021/2518837</a>, 2021a.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>43</label><mixed-citation>
      
Guo, J. and Guan, J.: Classification of Marine Plankton Based on Few-shot
Learning, Arab. J. Sci. Eng., 46, 9253–9262, <a href="https://doi.org/10.1007/s13369-021-05786-2" target="_blank">https://doi.org/10.1007/s13369-021-05786-2</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>44</label><mixed-citation>
      
Guo, J., Ma, Y., and Lee, J. H. W.: Real-time automated identification of
algal bloom species for fisheries management in subtropical coastal waters,
Journal of Hydro-environment Research, 36, 1–32, <a href="https://doi.org/10.1016/j.jher.2021.03.002" target="_blank">https://doi.org/10.1016/j.jher.2021.03.002</a>, 2021b.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>45</label><mixed-citation>
      
Guyon, I. and Elisseeff, A.: An introduction to variable and feature
selection, Journal of Machine Learning Research, 3, 1157–1182, 2003.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>46</label><mixed-citation>
      
Guyon, I., Weston, J., Barnhill, S., and Vapnik, V.: Gene Selection for
Cancer Classification using Support Vector Machines, Machine Learning, 46,
389–422, <a href="https://doi.org/10.1023/A:1012487302797" target="_blank">https://doi.org/10.1023/A:1012487302797</a>, 2002.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>47</label><mixed-citation>
      
Hassan, M., Salbitani, G., Carfagna, S., and Khan, J. A.: Deep learning
meets marine biology: Optimized fused features and LIME-driven insights for
automated plankton classification, Computers in Biology and Medicine, 192,
110273, <a href="https://doi.org/10.1016/j.compbiomed.2025.110273" target="_blank">https://doi.org/10.1016/j.compbiomed.2025.110273</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>48</label><mixed-citation>
      
Hastie, T., Tibshirani, R., and Friedman, J.: The elements of statistical
learning: data mining, inference, and prediction, Springer Science &amp;
Business Media, ISBN-13 978-0387952840, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>49</label><mixed-citation>
      
He, H. and Garcia, E. A.: Learning from Imbalanced Data, IEEE Transactions
on Knowledge and Data Engineering, 21, 1263–1284, <a href="https://doi.org/10.1109/TKDE.2008.239" target="_blank">https://doi.org/10.1109/TKDE.2008.239</a>, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib50"><label>50</label><mixed-citation>
      
Hu, Q. and Davis, C.: Automatic plankton image recognition with
co-occurrence matrices and Support Vector Machine, Marine Ecology Progress
Series, 295, 21–31, <a href="https://doi.org/10.3354/meps295021" target="_blank">https://doi.org/10.3354/meps295021</a>, 2005.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib51"><label>51</label><mixed-citation>
      
Hutchinson, G. E.: The Paradox of the Plankton, The American Naturalist, 95,
137–145, 1961.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib52"><label>52</label><mixed-citation>
      
Irisson, J.-O., Ayata, S.-D., Lindsay, D. J., Karp-Boss, L., and Stemmann,
L.: Machine Learning for the Study of Plankton and Marine Snow from Images,
Annu. Rev. Mar. Sci., 14, 277–301, <a href="https://doi.org/10.1146/annurev-marine-041921-013023" target="_blank">https://doi.org/10.1146/annurev-marine-041921-013023</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib53"><label>53</label><mixed-citation>
      
Jalabert, L., Signoret, G., Caray-Counil, L., Vilain, M., Martins, E.,
Lombard, F., Picheral, M., and Irisson, J.-O.: FlowCAMNet: plankton images
captured with the FlowCAM, SEANOE [data set], <a href="https://doi.org/10.17882/101961" target="_blank">https://doi.org/10.17882/101961</a>,
2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib54"><label>54</label><mixed-citation>
      
Kareinen, J., Eerola, T., Kraft, K., Lensu, L., Suikkanen, S., and
Kälviäinen, H.: Self-Supervised Pretraining for Fine-Grained
Plankton Recognition, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.2503.11341" target="_blank">https://doi.org/10.48550/arXiv.2503.11341</a>, 9 May 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib55"><label>55</label><mixed-citation>
      
Kelleher, J. D., Mac Namee, B., and D'Arcy, A.: Fundamentals of machine
learning for predictive data analytics: algorithms, worked examples, and
case studies, MIT Press, ISBN 9780262044691, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib56"><label>56</label><mixed-citation>
      
Kerr, T., Clark, J. R., Fileman, E. S., Widdicombe, C. E., and Pugeault, N.:
Collaborative Deep Learning Models to Handle Class Imbalance in FlowCam
Plankton Imagery, IEEE Access, 8, 170013–170032, <a href="https://doi.org/10.1109/ACCESS.2020.3022242" target="_blank">https://doi.org/10.1109/ACCESS.2020.3022242</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib57"><label>57</label><mixed-citation>
      
Kraft, K., Velhonoja, O., Eerola, T., Suikkanen, S., Tamminen, T.,
Haraguchi, L., Ylöstalo, P., Kielosto, S., Johansson, M., Lensu, L.,
Kälviäinen, H., Haario, H., and Seppälä, J.: Towards
operational phytoplankton recognition with automated high-throughput
imaging, near-real-time data processing, and convolutional neural networks,
Front. Mar. Sci., 9, <a href="https://doi.org/10.3389/fmars.2022.867695" target="_blank">https://doi.org/10.3389/fmars.2022.867695</a>,
2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib58"><label>58</label><mixed-citation>
      
Krawczyk, B.: Learning from imbalanced data: open challenges and future
directions, Prog. Artif. Intell., 5, 221–232, <a href="https://doi.org/10.1007/s13748-016-0094-0" target="_blank">https://doi.org/10.1007/s13748-016-0094-0</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib59"><label>59</label><mixed-citation>
      
Krizhevsky, A., Sutskever, I., and Hinton, G. E.: ImageNet Classification
with Deep Convolutional Neural Networks, in: Advances in Neural Information
Processing Systems 25, edited by: Pereira, F., Burges, C. J. C., Bottou, L.,
and Weinberger, K. Q., Curran Associates, Inc., 1097–1105, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib60"><label>60</label><mixed-citation>
      
Kyathanahally, S. P., Hardeman, T., Merz, E., Bulas, T., Reyes, M., Isles,
P., Pomati, F., and Baity-Jesi, M.: Deep Learning Classification of Lake
Zooplankton, Frontiers in Microbiology, 12, <a href="https://doi.org/10.3389/fmicb.2021.746297" target="_blank">https://doi.org/10.3389/fmicb.2021.746297</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib61"><label>61</label><mixed-citation>
      
Kyathanahally, S. P., Hardeman, T., Reyes, M., Merz, E., Bulas, T., Brun,
P., Pomati, F., and Baity-Jesi, M.: Ensembles of data-efficient vision
transformers as a new paradigm for automated classification in ecology, Sci.
Rep., 12, 18590, <a href="https://doi.org/10.1038/s41598-022-21910-0" target="_blank">https://doi.org/10.1038/s41598-022-21910-0</a>,
2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib62"><label>62</label><mixed-citation>
      
Langeland Teigen, A., Saad, A., and Stahl, A.: Leveraging Similarity Metrics
to In-Situ Discover Planktonic Interspecies Variations or Mutations, in:
Global Oceans 2020: Singapore – U.S. Gulf Coast, Global Oceans 2020:
Singapore – U.S. Gulf Coast, Biloxi, MS, USA,  8 pp., <a href="https://doi.org/10.1109/IEEECONF38699.2020.9388998" target="_blank">https://doi.org/10.1109/IEEECONF38699.2020.9388998</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib63"><label>63</label><mixed-citation>
      
Le Cun, Y., Jackel, L. D., Boser, B., Denker, J. S., Graf, H. P., Guyon, I., Henderson, D., Howard, R. E., and Hubbard, W.: Handwritten digit recognition: applications of neural network chips and automatic learning, IEEE Communications Magazine, 27, 41–46, <a href="https://doi.org/10.1109/35.41400" target="_blank">https://doi.org/10.1109/35.41400</a>, 1989.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib64"><label>64</label><mixed-citation>
      
Lee, H., Park, M., and Kim, J.: Plankton classification on imbalanced large
scale database via convolutional neural networks with transfer learning, in:
2016 IEEE International Conference on Image Processing (ICIP), 2016 IEEE
International Conference on Image Processing (ICIP),  3713–3717, <a href="https://doi.org/10.1109/ICIP.2016.7533053" target="_blank">https://doi.org/10.1109/ICIP.2016.7533053</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib65"><label>65</label><mixed-citation>
      
Legendre, P. and Legendre, L.: Numerical ecology, Elsevier, 990 pp., ISBN-13 978-0444538680, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib66"><label>66</label><mixed-citation>
      
Li, X. and Cui, Z.: Deep residual networks for plankton classification, in:
OCEANS 2016 MTS/IEEE Monterey, OCEANS 2016 MTS/IEEE Monterey,  4 pp., <a href="https://doi.org/10.1109/OCEANS.2016.7761223" target="_blank">https://doi.org/10.1109/OCEANS.2016.7761223</a>,
2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib67"><label>67</label><mixed-citation>
      
Li, X., Long, R., Yan, J., Jin, K., and Lee, J.: TANet: A Tiny Plankton
Classification Network for Mobile Devices, Mobile Information Systems, 2019,
6536925, <a href="https://doi.org/10.1155/2019/6536925" target="_blank">https://doi.org/10.1155/2019/6536925</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib68"><label>68</label><mixed-citation>
      
Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollár, P.: Focal Loss
for Dense Object Detection, IEEE Transactions on Pattern Analysis and
Machine Intelligence, 42, 318–327, <a href="https://doi.org/10.1109/TPAMI.2018.2858826" target="_blank">https://doi.org/10.1109/TPAMI.2018.2858826</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib69"><label>69</label><mixed-citation>
      
Liu, J., Du, A., Wang, C., Yu, Z., Zheng, H., Zheng, B., and Zhang, H.: Deep
Pyramidal Residual Networks for Plankton Image Classification, in: 2018
OCEANS – MTS/IEEE Kobe Techno-Oceans (OTO), 2018 OCEANS – MTS/IEEE Kobe
Techno-Oceans (OTO), 5 pp., <a href="https://doi.org/10.1109/OCEANSKOBE.2018.8559106" target="_blank">https://doi.org/10.1109/OCEANSKOBE.2018.8559106</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib70"><label>70</label><mixed-citation>
      
Lombard, F., Boss, E., Waite, A. M., Vogt, M., Uitz, J., Stemmann, L.,
Sosik, H. M., Schulz, J., Romagnan, J.-B., Picheral, M., Pearlman, J.,
Ohman, M. D., Niehoff, B., Möller, K. O., Miloslavich, P., Lara-López,
A., Kudela, R., Lopes, R. M., Kiko, R., Karp-Boss, L., Jaffe, J. S.,
Iversen, M. H., Irisson, J.-O., Fennel, K., Hauss, H., Guidi, L., Gorsky,
G., Giering, S. L. C., Gaube, P., Gallager, S., Dubelaar, G., Cowen, R. K.,
Carlotti, F., Briseño-Avena, C., Berline, L., Benoit-Bird, K., Bax, N.,
Batten, S., Ayata, S. D., Artigas, L. F., and Appeltans, W.: Globally
Consistent Quantitative Observations of Planktonic Ecosystems, Front. Mar.
Sci., 6, <a href="https://doi.org/10.3389/fmars.2019.00196" target="_blank">https://doi.org/10.3389/fmars.2019.00196</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib71"><label>71</label><mixed-citation>
      
Lumini, A. and Nanni, L.: Deep learning and transfer learning features for
plankton classification, Ecological Informatics, 51, 33–43, <a href="https://doi.org/10.1016/j.ecoinf.2019.02.007" target="_blank">https://doi.org/10.1016/j.ecoinf.2019.02.007</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib72"><label>72</label><mixed-citation>
      
Luo, J. Y., Irisson, J.-O., Graham, B., Guigand, C., Sarafraz, A., Mader,
C., and Cowen, R. K.: Automated plankton image analysis using convolutional
neural networks, Limnology and Oceanography: Methods, 16, 814–827,
<a href="https://doi.org/10.1002/lom3.10285" target="_blank">https://doi.org/10.1002/lom3.10285</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib73"><label>73</label><mixed-citation>
      
Luo, T., Kramer, K., Samson, S., Remsen, A., Goldgof, D. B., Hall, L. O.,
and Hopkins, T.: Active learning to recognize multiple types of plankton,
in: Proceedings of the 17th International Conference on Pattern Recognition,
ICPR 2004, Proceedings of the 17th International Conference on
Pattern Recognition, 2004. ICPR 2004, Cambridge, UK, Vol. 3, 478–481, <a href="https://doi.org/10.1109/ICPR.2004.1334570" target="_blank">https://doi.org/10.1109/ICPR.2004.1334570</a>, 2004.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib74"><label>74</label><mixed-citation>
      
Malde, K. and Kim, H.: Beyond image classification: zooplankton
identification with deep vector space embeddings, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.1909.11380" target="_blank">https://doi.org/10.48550/arXiv.1909.11380</a>,
2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib75"><label>75</label><mixed-citation>
      
Malde, K., Handegard, N. O., Eikvil, L., and Salberg, A.-B.: Machine
intelligence and the data-driven future of marine science, ICES J. Mar. Sci.,
77, 1274–1285, <a href="https://doi.org/10.1093/icesjms/fsz057" target="_blank">https://doi.org/10.1093/icesjms/fsz057</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib76"><label>76</label><mixed-citation>
      
Maracani, A., Pastore, V. P., Natale, L., Rosasco, L., and Odone, F.:
In-domain versus out-of-domain transfer learning in plankton image
classification, Sci. Rep., 13, 10443, <a href="https://doi.org/10.1038/s41598-023-37627-7" target="_blank">https://doi.org/10.1038/s41598-023-37627-7</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib77"><label>77</label><mixed-citation>
      
Masoudi, M., Giering, S. L. C., Eftekhari, N., Massot-Campos, M., Irisson,
J.-O., and Thornton, B.: Optimizing Plankton Image Classification With
Metadata-Enhanced Representation Learning, IEEE Journal of Selected Topics
in Applied Earth Observations and Remote Sensing, 17, 17117–17133,
<a href="https://doi.org/10.1109/JSTARS.2024.3424498" target="_blank">https://doi.org/10.1109/JSTARS.2024.3424498</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib78"><label>78</label><mixed-citation>
      
McCarthy, K., Zabar, B., and Weiss, G.: Does cost-sensitive learning beat
sampling for classifying rare classes?, in: Proceedings of the 1st
international workshop on Utility-based data mining, New York, NY, USA,
69–77, <a href="https://doi.org/10.1145/1089827.1089836" target="_blank">https://doi.org/10.1145/1089827.1089836</a>, 2005.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib79"><label>79</label><mixed-citation>
      
Moreno-Torres, J. G., Raeder, T., Alaiz-Rodríguez, R., Chawla, N. V.,
and Herrera, F.: A unifying view on dataset shift in classification, Pattern
Recognition, 45, 521–530, <a href="https://doi.org/10.1016/j.patcog.2011.06.019" target="_blank">https://doi.org/10.1016/j.patcog.2011.06.019</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib80"><label>80</label><mixed-citation>
      
Ohman, M. D., Davis, R. E., Sherman, J. T., Grindley, K. R., Whitmore, B.
M., Nickels, C. F., and Ellen, J. S.: Zooglider: An autonomous vehicle for
optical and acoustic sensing of zooplankton, Limnology and Oceanography:
Methods, 17, 69–86, <a href="https://doi.org/10.1002/lom3.10301" target="_blank">https://doi.org/10.1002/lom3.10301</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib81"><label>81</label><mixed-citation>
      
Olson, R. J. and Sosik, H. M.: A submersible imaging-in-flow instrument to
analyze nano-and microplankton: Imaging FlowCytobot, Limnology and
Oceanography: Methods, 5, 195–203, <a href="https://doi.org/10.4319/lom.2007.5.195" target="_blank">https://doi.org/10.4319/lom.2007.5.195</a>, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib82"><label>82</label><mixed-citation>
      
Orenstein, E. C. and Beijbom, O.: Transfer Learning and Deep Feature
Extraction for Planktonic Image Data Sets, in: 2017 IEEE Winter Conference
on Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on
Applications of Computer Vision (WACV),
1082–1088, <a href="https://doi.org/10.1109/WACV.2017.125" target="_blank">https://doi.org/10.1109/WACV.2017.125</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib83"><label>83</label><mixed-citation>
      
Orenstein, E. C., Beijbom, O., Peacock, E. E., and Sosik, H. M.:
WHOI-Plankton – A Large Scale Fine Grained Visual Recognition Benchmark
Dataset for Plankton Classification, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.1510.00745" target="_blank">https://doi.org/10.48550/arXiv.1510.00745</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib84"><label>84</label><mixed-citation>
      
Orenstein, E. C., Kenitz, K. M., Roberts, P. L. D., Franks, P. J. S., Jaffe,
J. S., and Barton, A. D.: Semi- and fully supervised quantification
techniques to improve population estimates from machine classifiers,
Limnology and Oceanography: Methods, 18, 739–753, <a href="https://doi.org/10.1002/lom3.10399" target="_blank">https://doi.org/10.1002/lom3.10399</a>, 2020a.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib85"><label>85</label><mixed-citation>
      
Orenstein, E. C., Ratelle, D., Briseño-Avena, C., Carter, M. L., Franks,
P. J. S., Jaffe, J. S., and Roberts, P. L. D.: The Scripps Plankton Camera
system: A framework and platform for in situ microscopy, Limnology and
Oceanography: Methods, 18, 681–695, <a href="https://doi.org/10.1002/lom3.10394" target="_blank">https://doi.org/10.1002/lom3.10394</a>, 2020b.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib86"><label>86</label><mixed-citation>
      
Orenstein, E. C., Ayata, S.-D., Maps, F., Becker, É. C., Benedetti, F.,
Biard, T., de Garidel-Thoron, T., Ellen, J. S., Ferrario, F., Giering, S. L.
C., Guy-Haim, T., Hoebeke, L., Iversen, M. H., Kiørboe, T., Lalonde,
J.-F., Lana, A., Laviale, M., Lombard, F., Lorimer, T., Martini, S., Meyer,
A., Möller, K. O., Niehoff, B., Ohman, M. D., Pradalier, C., Romagnan,
J.-B., Schröder, S.-M., Sonnet, V., Sosik, H. M., Stemmann, L. S.,
Stock, M., Terbiyik-Kurt, T., Valcárcel-Pérez, N., Vilgrain, L.,
Wacquet, G., Waite, A. M., and Irisson, J.-O.: Machine learning techniques
to characterize functional traits of plankton from image data, Limnology and
Oceanography, 67, 1647–1669, <a href="https://doi.org/10.1002/lno.12101" target="_blank">https://doi.org/10.1002/lno.12101</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib87"><label>87</label><mixed-citation>
      
Owen, B. M., Tweedley, J. R., Moheimani, N. R., Hallett, C. S., Cosgrove, J.
J., and Silberstein, L. P. O.: What is “accuracy”? Rethinking machine
learning classifier performance metrics for highly imbalanced, high
variance, zero-inflated species count data, Limnology and Oceanography:
Methods, <a href="https://doi.org/10.1002/lom3.70009" target="_blank">https://doi.org/10.1002/lom3.70009</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib88"><label>88</label><mixed-citation>
      
Panaïotis, T. and Amblard, E.: ThelmaPana/plankton_classif, Zenodo [code], <a href="https://doi.org/10.5281/zenodo.17937437" target="_blank">https://doi.org/10.5281/zenodo.17937437</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib89"><label>89</label><mixed-citation>
      
Panaïotis, T., Caray–Counil, L., Woodward, B., Schmid, M. S., Daprano,
D., Tsai, S. T., Sullivan, C. M., Cowen, R. K., and Irisson, J.-O.:
Content-Aware Segmentation of Objects Spanning a Large Size Range:
Application to Plankton Images, Frontiers in Marine Science, 9, <a href="https://doi.org/10.3389/fmars.2022.870005" target="_blank">https://doi.org/10.3389/fmars.2022.870005</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib90"><label>90</label><mixed-citation>
      
Panaïotis, T., Caray-Counil, L., Jalabert, L., and Irisson, J.-O.:
ISIISNet: plankton images captured with the ISIIS (In-situ Ichthyoplankton
Imaging System), SEANOE [data set], <a href="https://doi.org/10.17882/101950" target="_blank">https://doi.org/10.17882/101950</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib91"><label>91</label><mixed-citation>
      
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G.,
Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf,
A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner,
B., Fang, L., Bai, J., and Chintala, S.: PyTorch: An Imperative Style,
High-Performance Deep Learning Library, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.1912.01703" target="_blank">https://doi.org/10.48550/arXiv.1912.01703</a>, 3 December 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib92"><label>92</label><mixed-citation>
      
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel,
O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J.,
Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, É.:
Scikit-learn: Machine Learning in Python, Journal of Machine Learning
Research, 12, 2825–2830, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib93"><label>93</label><mixed-citation>
      
Péron, F. and Lesueur, C. A.: Tableau des caractères
génériques et spécifiques de toutes les espèces de
méduses connues jusqu'à ce jour, in: Annales du Muséum
d'Histoire Naturelle, 325–366, 1810.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib94"><label>94</label><mixed-citation>
      
Picheral, M., Guidi, L., Stemmann, L., Karl, D. M., Iddaoud, G., and Gorsky,
G.: The Underwater Vision Profiler 5: An advanced instrument for high
spatial resolution studies of particle size spectra and zooplankton,
Limnology and Oceanography: Methods, 8, 462–473, <a href="https://doi.org/10.4319/lom.2010.8.462" target="_blank">https://doi.org/10.4319/lom.2010.8.462</a>, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib95"><label>95</label><mixed-citation>
      
Picheral, M., Colin, S., and Irisson, J.-O.:
EcoTaxa, a tool for the taxonomic classification of images, <a href="https://ecotaxa.obs-vlfr.fr/" target="_blank">https://ecotaxa.obs-vlfr.fr/</a> (last access: 13 November 2020), 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib96"><label>96</label><mixed-citation>
      
Picheral, M., Catalano, C., Brousseau, D., Claustre, H., Coppola, L.,
Leymarie, E., Coindat, J., Dias, F., Fevre, S., Guidi, L., Irisson, J. O.,
Legendre, L., Lombard, F., Mortier, L., Penkerch, C., Rogge, A., Schmechtig,
C., Thibault, S., Tixier, T., Waite, A., and Stemmann, L.: The Underwater
Vision Profiler 6: an imaging sensor of particle size spectra and plankton,
for autonomous and cabled platforms, Limnology and Oceanography: Methods,
20, 115–129, <a href="https://doi.org/10.1002/lom3.10475" target="_blank">https://doi.org/10.1002/lom3.10475</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib97"><label>97</label><mixed-citation>
      
Picheral, M., Courchet, L., Jalabert, L., Motreuil, S., Caray-Counil, L.,
Ricour, F., and Petit, F.: UVP6Net: plankton images captured with the UVP6, SEANOE [data set],
<a href="https://doi.org/10.17882/101948" target="_blank">https://doi.org/10.17882/101948</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib98"><label>98</label><mixed-citation>
      
Pollina, T., Larson, A. G., Lombard, F., Li, H., Le Guen, D., Colin, S., de
Vargas, C., and Prakash, M.: PlanktoScope: Affordable Modular Quantitative
Imaging Platform for Citizen Oceanography, Frontiers in Marine Science, 9, <a href="https://doi.org/10.3389/fmars.2022.949428" target="_blank">https://doi.org/10.3389/fmars.2022.949428</a>,
2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib99"><label>99</label><mixed-citation>
      
Py, O., Hong, H., and Zhongzhi, S.: Plankton classification with deep
convolutional neural networks, in: 2016 IEEE Information Technology,
Networking, Electronic and Automation Control Conference,
132–136, <a href="https://doi.org/10.1109/ITNEC.2016.7560334" target="_blank">https://doi.org/10.1109/ITNEC.2016.7560334</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib100"><label>100</label><mixed-citation>
      
Raghu, M., Unterthiner, T., Kornblith, S., Zhang, C., and Dosovitskiy, A.:
Do Vision Transformers See Like Convolutional Neural Networks?, in: Advances
in Neural Information Processing Systems, 12116–12128, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib101"><label>101</label><mixed-citation>
      
Robinson, K. L., Sponaugle, S., Luo, J. Y., Gleiber, M. R., and Cowen, R.
K.: Big or small, patchy all: Resolution of marine plankton patch structure
at micro- to submesoscales for 36 taxa, Science Advances, 7, eabk2904,
<a href="https://doi.org/10.1126/sciadv.abk2904" target="_blank">https://doi.org/10.1126/sciadv.abk2904</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib102"><label>102</label><mixed-citation>
      
Rodrigues, F. C. M., Hirata, N. S., Abello, A. A., De La Cruz, L. T.,
Lopes, R. M., and Hirata Jr., R.: Evaluation of Transfer Learning Scenarios
in Plankton Image Classification, in: VISIGRAPP (5: VISAPP), 359–366, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib103"><label>103</label><mixed-citation>
      
Romagnan, J.-B., Panaïotis, T., Bourriau, P., Danielou, M.-M., Doray,
M., Dupuy, C., Forest, B., Grandremy, N., Huret, M., Le Mestre, S.,
Nowaczyk, A., Petitgas, P., Pineau, P., Rouxel, J., Tardivel, M., and
Irisson, J.-O.: ZooCAMNet: plankton images captured with the ZooCAM, SEANOE [data set],
<a href="https://doi.org/10.17882/101928" target="_blank">https://doi.org/10.17882/101928</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib104"><label>104</label><mixed-citation>
      
Rubbens, P., Brodie, S., Cordier, T., Destro Barcellos, D., Devos, P.,
Fernandes-Salvador, J. A., Fincham, J. I., Gomes, A., Handegard, N. O.,
Howell, K., Jamet, C., Kartveit, K. H., Moustahfid, H., Parcerisas, C.,
Politikos, D., Sauzède, R., Sokolova, M., Uusitalo, L., Van den Bulcke,
L., van Helmond, A. T. M., Watson, J. T., Welch, H., Beltran-Perez, O.,
Chaffron, S., Greenberg, D. S., Kühn, B., Kiko, R., Lo, M., Lopes, R.
M., Möller, K. O., Michaels, W., Pala, A., Romagnan, J.-B., Schuchert,
P., Seydi, V., Villasante, S., Malde, K., and Irisson, J.-O.: Machine
learning in marine ecology: an overview of techniques and applications, ICES
Journal of Marine Science, 80, 1829–1853, <a href="https://doi.org/10.1093/icesjms/fsad100" target="_blank">https://doi.org/10.1093/icesjms/fsad100</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib105"><label>105</label><mixed-citation>
      
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang,
Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L.:
ImageNet Large Scale Visual Recognition Challenge, Int. J. Comput. Vis., 115,
211–252, <a href="https://doi.org/10.1007/s11263-015-0816-y" target="_blank">https://doi.org/10.1007/s11263-015-0816-y</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib106"><label>106</label><mixed-citation>
      
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.-C.: MobileNetV2: Inverted Residuals and Linear Bottlenecks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4510–4520, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib107"><label>107</label><mixed-citation>
      
Schmid, M. S., Cowen, R. K., Robinson, K., Luo, J. Y., Briseño-Avena,
C., and Sponaugle, S.: Prey and predator overlap at the edge of a mesoscale
eddy: fine-scale, in-situ distributions to inform our understanding of
oceanographic processes, Sci. Rep., 10, 1–16,
<a href="https://doi.org/10.1038/s41598-020-57879-x" target="_blank">https://doi.org/10.1038/s41598-020-57879-x</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib108"><label>108</label><mixed-citation>
      
Schneider, C. A., Rasband, W. S., and Eliceiri, K. W.: NIH Image to ImageJ: 25 years of image analysis, Nat. Methods, 9, 671–675, <a href="https://doi.org/10.1038/nmeth.2089" target="_blank">https://doi.org/10.1038/nmeth.2089</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib109"><label>109</label><mixed-citation>
      
Schröder, S.-M., Kiko, R., Irisson, J.-O., and Koch, R.: Low-Shot
Learning of Plankton Categories, in: Pattern Recognition, vol. 11269, edited
by: Brox, T., Bruhn, A., and Fritz, M., Springer International Publishing,
Cham, 391–404, <a href="https://doi.org/10.1007/978-3-030-12939-2_27" target="_blank">https://doi.org/10.1007/978-3-030-12939-2_27</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib110"><label>110</label><mixed-citation>
      
Schröder, S.-M., Kiko, R., and Koch, R.: MorphoCluster: Efficient
Annotation of Plankton Images by Clustering, Sensors, 20, 3060,
<a href="https://doi.org/10.3390/s20113060" target="_blank">https://doi.org/10.3390/s20113060</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib111"><label>111</label><mixed-citation>
      
Ser-Giacomi, E., Zinger, L., Malviya, S., De Vargas, C., Karsenti, E.,
Bowler, C., and De Monte, S.: Ubiquitous abundance distribution of
non-dominant plankton across the global ocean, Nat. Ecol. Evol., 2, 1243–1249,
<a href="https://doi.org/10.1038/s41559-018-0587-2" target="_blank">https://doi.org/10.1038/s41559-018-0587-2</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib112"><label>112</label><mixed-citation>
      
Shorten, C. and Khoshgoftaar, T. M.: A survey on Image Data Augmentation for
Deep Learning, Journal of Big Data, 6, 60,
<a href="https://doi.org/10.1186/s40537-019-0197-0" target="_blank">https://doi.org/10.1186/s40537-019-0197-0</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib113"><label>113</label><mixed-citation>
      
Sieracki, C. K., Sieracki, M. E., and Yentsch, C. S.: An imaging-in-flow
system for automated analysis of marine microplankton, Marine Ecology
Progress Series, 168, 285–296, <a href="https://doi.org/10.3354/meps168285" target="_blank">https://doi.org/10.3354/meps168285</a>, 1998.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib114"><label>114</label><mixed-citation>
      
Smith, L. N.: A disciplined approach to neural network hyper-parameters:
Part 1 – learning rate, batch size, momentum, and weight decay, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.1803.09820" target="_blank">https://doi.org/10.48550/arXiv.1803.09820</a>, 24 April 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib115"><label>115</label><mixed-citation>
      
Soda, P.: A multi-objective optimisation approach for class imbalance
learning, Pattern Recognition, 44, 1801–1810,
<a href="https://doi.org/10.1016/j.patcog.2011.01.015" target="_blank">https://doi.org/10.1016/j.patcog.2011.01.015</a>, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib116"><label>116</label><mixed-citation>
      
Sosik, H. M. and Olson, R. J.: Automated taxonomic classification of
phytoplankton sampled with imaging-in-flow cytometry, Limnology and
Oceanography: Methods, 5, 204–216, <a href="https://doi.org/10.4319/lom.2007.5.204" target="_blank">https://doi.org/10.4319/lom.2007.5.204</a>, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib117"><label>117</label><mixed-citation>
      
Sosik, H. M., Peacock, E. E., and Brownlee, E. F.: WHOI-Plankton. Annotated
Plankton Images – Data Set for Developing and Evaluating Classification
Methods, MBLWHOI Library [data set], <a href="https://doi.org/10.1575/1912/7341" target="_blank">https://doi.org/10.1575/1912/7341</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib118"><label>118</label><mixed-citation>
      
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and
Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from
overfitting, The Journal of Machine Learning Research, 15, 1929–1958, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib119"><label>119</label><mixed-citation>
      
Sun, Y., Wong, A. K. C., and Kamel, M. S.: Classification of imbalanced
data: a review, Int. J. Patt. Recogn. Artif. Intell., 23, 687–719,
<a href="https://doi.org/10.1142/S0218001409007326" target="_blank">https://doi.org/10.1142/S0218001409007326</a>, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib120"><label>120</label><mixed-citation>
      
Tan, M. and Le, Q.: EfficientNetV2: Smaller Models and Faster Training, in:
Proceedings of the 38th International Conference on Machine Learning,
International Conference on Machine Learning, 10096–10106, <a href="https://proceedings.mlr.press/v139/tan21a.html" target="_blank">https://proceedings.mlr.press/v139/tan21a.html</a>
(last access: 15 December 2025), 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib121"><label>121</label><mixed-citation>
      
Tang, X., Stewart, W. K., Huang, H., Gallager, S. M., Davis, C. S., Vincent,
L., and Marra, M.: Automatic Plankton Image Recognition, Artificial
Intelligence Review, 12, 177–199,
<a href="https://doi.org/10.1023/A:1006517211724" target="_blank">https://doi.org/10.1023/A:1006517211724</a>, 1998.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib122"><label>122</label><mixed-citation>
      
Tappan, H. and Loeblich, A. R.: Evolution of the oceanic plankton,
Earth-Science Reviews, 9, 207–240, <a href="https://doi.org/10.1016/0012-8252(73)90092-5" target="_blank">https://doi.org/10.1016/0012-8252(73)90092-5</a>, 1973.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib123"><label>123</label><mixed-citation>
      
Uchida, K., Tanaka, M., and Okutomi, M.: Coupled convolution layer for
convolutional neural network, Neural Networks, 105, 197–205, <a href="https://doi.org/10.1016/j.neunet.2018.05.002" target="_blank">https://doi.org/10.1016/j.neunet.2018.05.002</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib124"><label>124</label><mixed-citation>
      
Van Horn, G. and Perona, P.: The Devil is in the Tails: Fine-grained
Classification in the Wild, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.1709.01450" target="_blank">https://doi.org/10.48550/arXiv.1709.01450</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib125"><label>125</label><mixed-citation>
      
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.
N., Kaiser, L., and Polosukhin, I.: Attention is All you Need, in:
Advances in Neural Information Processing Systems, 2017.


    </mixed-citation></ref-html>
<ref-html id="bib1.bib126"><label>126</label><mixed-citation>
      
Venkataramanan, A., Laviale, M., Figus, C., Usseglio-Polatera, P., and
Pradalier, C.: Tackling Inter-class Similarity and Intra-class Variance for
Microscopic Image-Based Classification, in: Computer Vision Systems, Cham,
93–103, <a href="https://doi.org/10.1007/978-3-030-87156-7_8" target="_blank">https://doi.org/10.1007/978-3-030-87156-7_8</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib127"><label>127</label><mixed-citation>
      
van der Walt, S., Schönberger, J. L., Nunez-Iglesias, J., Boulogne, F.,
Warner, J. D., Yager, N., Gouillart, E., and Yu, T.: scikit-image: image
processing in Python, PeerJ, 2, e453, <a href="https://doi.org/10.7717/peerj.453" target="_blank">https://doi.org/10.7717/peerj.453</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib128"><label>128</label><mixed-citation>
      
Ware, D. M. and Thomson, R. E.: Bottom-Up Ecosystem Trophic Dynamics
Determine Fish Production in the Northeast Pacific, Science, 308,
1280–1284, <a href="https://doi.org/10.1126/SCIENCE.1109049" target="_blank">https://doi.org/10.1126/SCIENCE.1109049</a>, 2005.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib129"><label>129</label><mixed-citation>
      
Yan, J., Li, X., and Cui, Z.: A More Efficient CNN Architecture for Plankton
Classification, in: Computer Vision, Springer, Singapore,
198–208, <a href="https://doi.org/10.1007/978-981-10-7305-2_18" target="_blank">https://doi.org/10.1007/978-981-10-7305-2_18</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib130"><label>130</label><mixed-citation>
      
Yosinski, J., Clune, J., Bengio, Y., and Lipson, H.: How transferable are features in deep neural networks?, Advances in neural information processing systems, 27, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib131"><label>131</label><mixed-citation>
      
Zebin, T., Scully, P. J., Peek, N., Casson, A. J., and Ozanyan, K. B.:
Design and Implementation of a Convolutional Neural Network on an Edge
Computing Smartphone for Human Activity Recognition, IEEE Access, 7,
133509–133520, <a href="https://doi.org/10.1109/ACCESS.2019.2941836" target="_blank">https://doi.org/10.1109/ACCESS.2019.2941836</a>,
2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib132"><label>132</label><mixed-citation>
      
Zeiler, M. D. and Fergus, R.: Visualizing and Understanding Convolutional Networks, in: Computer Vision – ECCV 2014, 818–833, <a href="https://doi.org/10.1007/978-3-319-10590-1_53" target="_blank">https://doi.org/10.1007/978-3-319-10590-1_53</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib133"><label>133</label><mixed-citation>
      
Zhang, C., Zhang, M., Zhang, S., Jin, D., Zhou, Q., Cai, Z., Zhao, H., Liu,
X., and Liu, Z.: Delving Deep into the Generalization of Vision Transformers
under Distribution Shifts, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.2106.07617" target="_blank">https://doi.org/10.48550/arXiv.2106.07617</a>, 8 March 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib134"><label>134</label><mixed-citation>
      
Zheng, H., Wang, R., Yu, Z., Wang, N., Gu, Z., and Zheng, B.: Automatic
plankton image classification combining multiple view features via multiple
kernel learning, BMC Bioinformatics, 18, 570,
<a href="https://doi.org/10.1186/s12859-017-1954-8" target="_blank">https://doi.org/10.1186/s12859-017-1954-8</a>, 2017.

    </mixed-citation></ref-html>--></article>
