Global marine plankton functional type biomass distributions: Phaeocystis spp.

Abstract. The planktonic haptophyte Phaeocystis has been suggested to play a fundamental role in the global biogeochemical cycling of carbon and sulphur, but little is known about its global biomass distribution. We have collected global microscopy data of the genus Phaeocystis and converted abundance data to carbon biomass using species-specific carbon conversion factors. Microscopic counts of single-celled and colonial Phaeocystis were obtained both through the mining of online databases and by accepting direct submissions (both published and unpublished) from Phaeocystis specialists. We recorded abundance data from a total of 1595 depth-resolved stations sampled between 1955–2009. The quality-controlled dataset includes 5057 counts of individual Phaeocystis cells resolved to species level and information regarding life-stages from 3526 samples. 83% of stations were located in the Northern Hemisphere while 17% were located in the Southern Hemisphere. Most data were located in the latitude range of 50–70° N. While the seasonal distribution of Northern Hemisphere data was well-balanced, Southern Hemisphere data was biased towards summer months. Mean species- and form-specific cell diameters were determined from previously published studies. Cell diameters were used to calculate the cellular biovolume of Phaeocystis cells, assuming spherical geometry. Cell biomass was calculated using a carbon conversion factor for prymnesiophytes. For colonies, the number of cells per colony was derived from the colony volume. Cell numbers were then converted to carbon concentrations. An estimation of colonial mucus carbon was included a posteriori, assuming a mean colony size for each species. Carbon content per cell ranged from 9 pg C cell−1 (single-celled Phaeocystis antarctica) to 29 pg C cell−1 (colonial Phaeocystis globosa). 
Non-zero Phaeocystis cell biomasses (without mucus carbon) range from 2.9 × 10⁻⁵ to 5.4 × 10³ µg C l⁻¹, with a mean of 45.7 µg C l⁻¹ and a median of 3.0 µg C l⁻¹. The highest biomasses occur in the Southern Ocean below 70° S (up to 783.9 µg C l⁻¹) and in the North Atlantic around 50° N (up to 5.4 × 10³ µg C l⁻¹). The original and gridded data can be downloaded from PANGAEA, doi:10.1594/PANGAEA.779101.


Introduction
Plankton functional types (PFTs; Le Quéré et al., 2005) and marine ecosystem composition are important for the biogeochemical cycling of many abundant elements on Earth, such as carbon, nitrogen, and sulphur (e.g. Weber and Deutsch, 2010). In recent decades, changes have been observed in marine plankton communities (Chavez et al., 2003; Reid et al., 2007; Hatun et al., 2009; Beaugrand and Reid, 2003), and these changes are likely to affect local and global biodiversity, fisheries and biogeochemical cycling. Marine ecosystem models based on PFTs (Dynamic Green Ocean Models; DGOMs) have been developed in order to study the lower trophic levels of marine ecosystems and the potential impact of changes in their structure and distribution (Le Quéré et al., 2005). DGOMs have been applied to a wide range of biological and biogeochemical questions (Aumont and Bopp, 2006; Hashioka and Yamanaka, 2007; Moore and Doney, 2007; Vogt et al., 2010; Weber and Deutsch, 2010). However, the validation of these models has proven difficult due to the scarcity of observational abundance and biomass data for individual PFTs.
The MARine Ecosystem DATa (MAREDAT) initiative is a community effort to provide marine ecosystem modellers with global biomass distributions for the major PFTs currently represented in marine ecosystem models (Buitenhuis et al., 2012; silicifiers, calcifiers, nitrogen fixers, DMS producers, picophytoplankton, bacteria, microzooplankton, mesozooplankton and macrozooplankton). MAREDAT is part of the MARine Ecosystem Model Intercomparison Project (MAREMIP). All MAREDAT biomass fields are publicly available for use in model evaluation and development, and for other applications in biological oceanography.
The haptophyte Phaeocystis has been suggested to play a fundamental role in the global biogeochemical cycling of carbon and sulphur (Le Quéré et al., 2005). Phaeocystis is a globally distributed genus of marine phytoplankton with a polymorphic life cycle, alternating between flagellated, free-living cells of 3-9 µm in diameter and colonial stages which form colonies reaching several mm-cm (Peperzak et al., 2000; Peperzak and Gäbler-Schwarz, 2012; Chen et al., 2002; Schoemann et al., 2005). Three of the six recognised Phaeocystis species are known to form massive blooms of gelatinous colonies (Medlin and Zingone, 2007), which may contribute significantly to carbon export (Riebesell et al., 1995; DiTullio et al., 2000), although recent observations suggest that the contribution of Phaeocystis spp. to vertical flux of organic matter is small (Reigstad and Wassmann, 2007). In addition, Phaeocystis cells are important producers of dimethylsulphoniopropionate (DMSP), which is the marine precursor of the trace gas dimethylsulphide (DMS). DMS has been suggested to play an important role in cloud formation, and DMS production is the main recycling pathway of sulphur from the ocean to the land. Furthermore, Phaeocystis has been well documented as associated with marked increases in seawater viscosity (Jenkinson and Biddanda, 1995; Seuront et al., 2007). In their review, Schoemann et al. (2005) conclude that it should be possible to derive a single unique parameterisation of Phaeocystis growth for global modelling. Hence, Phaeocystis has recently been included in a number of regional and global DGOMs (e.g. Wang and Moore, 2011).
Here, we present biomass data that were estimated from direct cell counts of colonial and single-celled Phaeocystis. We show the spatial and temporal distribution of Phaeocystis biomass, with a particular emphasis on the seasonal and vertical patterns. We discuss in detail our method for converting abundance to carbon biomass and note the uncertainties in the carbon conversions. Our biomass estimates are tailored to suit the needs of the modelling community for marine ecosystem model validation and model development, but they are also intended to aid biological oceanographers in the exploration of the relative abundances of different PFTs in the modern ocean and their respective biogeochemical roles, for the study of ecological niches in marine ecosystems and the assessment of marine biodiversity.

Origin of data
Our data consist of abundance measurements from several databases (BODC, OBIS, OCB DMO, Pangaea, WOD09, US JGOFS), and published and unpublished data from several contributing authors (E. Breton, M. Estrada, J. Gibson, D. Karentz, M. A. Van Leeuwe, J. Peloquin, L. Peperzak, V. Schoemann, J. Stefels, C. Widdicombe). Often, the online databases did not denote the method used for the quantitative analysis of Phaeocystis abundances. However, most known counts have been made using the common inverted microscopy and epifluorescence methods (Karlson et al., 2010). Both methods require the sampling of Phaeocystis colonies in Niskin bottles and the subsequent preservation of cells in Lugol's solution or another preservative. After storage of the sample prior to analysis, many scientists concentrate the sample through settling in counting chambers or filtration onto a polycarbonate filter.
Most conventional preservation agents cause the disintegration of the colonial matrix, such that colonial and single cells can no longer be distinguished. One preservation method based on a mixture of Lugol's, glutaraldehyde and iodine (Guiselin et al., 2009; Sherr and Sherr, 1993; Rousseau et al., 1990) is able to maintain colony structure (e.g. Karentz and Spero, 1995; Riebesell et al., 1995; Brown et al., 2008; Wassmann et al., 2005), but this is not widely used. Due to these difficulties, only a few measurements resolve Phaeocystis life stages or morphotypes. Table 1 summarizes the origin of all our data, sorted by database, principal investigator and the project during which measurements were taken. At present, the database contains 5057 individual data points from 3526 samples of 1595 depth-resolved stations.

Quality control
Given the low number of data points and the fact that Phaeocystis is a bloom-forming genus with a wide range of biomass concentrations, the identification and rejection of outliers in our dataset is challenging. We use Chauvenet's criterion to identify statistical outliers in the log-normalized biomass data (Glover et al., 2011; Buitenhuis et al., 2012). Based on this analysis, no station yielded biomasses whose deviation from the mean had a probability smaller than 1/(2n), with n = 2547 being the number of non-zero data points summed over all stations (two-sided z-score: |zc| = 3.72). In addition to the statistical testing of the biomass distribution, we also quality controlled the range of our cell abundances. We found that our maximum reported abundance of 19 × 10⁷ cells l⁻¹ is within the range of previously reported abundances: Schoemann et al. (2005) report maximum cell abundances of the order of ca. 10⁷ cells l⁻¹ in areas of colony occurrence (http://www.nioz.nl/projects/ironages). The largest bloom of P. antarctica was observed in Prydz Bay (http://www.nioz.nl/projects/ironages), with cell abundances measured up to 6 × 10⁷ cells l⁻¹. Eilertsen et al. (1989) reported a maximum of 1.2 × 10⁷ cells l⁻¹ of P. pouchetii in the Konsfjord. For P. globosa, a maximal abundance of 20 × 10⁷ cells l⁻¹ has been observed, corresponding to a total biomass of ca. 10 mg C l⁻¹ including mucus (Cadée and Hegeman, 1986; Schoemann et al., 2005). The latter biomass value is about twice the maximal biomass we report (5.4 × 10³ µg C l⁻¹). Thus, based on statistical and observational evidence, none of the data were flagged.
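As a rough illustration of the statistical test described above, the following Python sketch applies Chauvenet's criterion to log-transformed values. The function names and the example values are hypothetical and chosen for illustration only; this is a minimal sketch using the standard library, not the processing code used for the database.

```python
import math
from statistics import NormalDist, mean, stdev

def critical_z(n):
    """Two-sided z-score whose exceedance probability equals 1/(2n)."""
    # Each tail carries half of 1/(2n), i.e. 1/(4n).
    return NormalDist().inv_cdf(1.0 - 1.0 / (4.0 * n))

def chauvenet_flags(values):
    """Return one boolean per value: True if it would be rejected.

    Values must be strictly positive; they are log10-transformed
    before the mean and sample standard deviation are computed.
    """
    logs = [math.log10(v) for v in values]
    mu, sigma = mean(logs), stdev(logs)
    zc = critical_z(len(logs))
    return [abs(x - mu) / sigma > zc for x in logs]

# Hypothetical example: twenty identical values and one extreme value.
example = [1.0] * 20 + [1e9]
flags = chauvenet_flags(example)
```

With n = 2547, `critical_z` returns a value that rounds to the 3.72 quoted in the text.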

Biomass conversion
We distinguish between single, colonial and unspecified Phaeocystis cells. While Phaeocystis is generally observed and counted under bloom conditions, a significant fraction of cells is non-colonial even during bloom conditions (V. Schoemann, auxiliary data). Hence, in order to calculate the lower limit biomass, we have assumed unspecified cells to be single cells. To first order, this choice does not affect the order of magnitude of our cell biomass estimates, since cell carbon is of the same order of magnitude for both colonial and single cells (see below). We define total Phaeocystis biomass to consist of cell biomass and biomass contained in the mucus surrounding Phaeocystis colonies. For our calculation of total biomass, we chose unidentified cells to be in the colonial stage. Hence, our cell biomass estimates represent a lower limit, and our total biomass estimates including colonial mucus represent an upper limit for global Phaeocystis biomass.
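The lower and upper limits defined above can be sketched as follows. This is an illustrative Python example, not the authors' code: the function name and the cell counts are hypothetical, while the carbon contents are the P. globosa values quoted elsewhere in this paper (13 pg C per flagellate, 29 pg C per colonial cell, 33.6 pg C per colonial cell including mucus).

```python
PG_TO_UG = 1e-6  # 1 pg = 1e-6 µg

def biomass_bounds(single, colonial, unspecified,
                   c_single, c_colonial, c_total_colonial):
    """Return (lower, upper) biomass in µg C per litre.

    Counts are in cells per litre, carbon contents in pg C per cell.
    Lower limit: cell carbon only, unspecified cells treated as single cells.
    Upper limit: unspecified cells treated as colonial, mucus included.
    """
    lower = (single + unspecified) * c_single + colonial * c_colonial
    upper = single * c_single + (colonial + unspecified) * c_total_colonial
    return lower * PG_TO_UG, upper * PG_TO_UG

# Hypothetical sample: 1e6 single, 2e6 colonial and 1e6 unspecified cells per litre.
lower, upper = biomass_bounds(1e6, 2e6, 1e6, 13.0, 29.0, 33.6)
# lower is about 84.0 and upper about 113.8 µg C per litre
```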
Biomass was determined from cell abundance using species- and form-specific conversion factors (Fig. 1). Similar conversion schemes have been previously described (e.g. Schoemann et al., 2005, and references therein). Total cell abundances were divided into single cells, colonial cells and undefined cell types. For each species, the mid-point of the range of reported cell diameters from the literature was used for single and colonial cells (Table 2; P. globosa: Rousseau et al., 2007; Schoemann et al., 2005; P. antarctica: Mathot et al., 2000; Rousseau et al., 2007; Schoemann et al., 2005; P. pouchetii: Wassmann et al., 2005; Rousseau et al., 2007).
Where the species was not specified, Southern Ocean cell counts were assumed to be Phaeocystis antarctica. For cell counts in other regions, the mid-point of the range of cell diameters for P. pouchetii and P. globosa was taken (Table 2; flagellates: 5.0 µm, colonial cells: 6.7 µm). From cell diameter we computed biovolume, assuming spherical geometry of all cell types. We then converted biovolume to carbon biomass using an empirical volume-carbon conversion formula for prymnesiophytes developed by Menden-Deuer and Lessard (2000, Table 2).
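The conversion chain from diameter to biovolume to carbon can be illustrated with a short Python sketch. The log-linear form and the default coefficients below are assumptions: they are the prymnesiophyte regression as commonly quoted from Menden-Deuer and Lessard (2000) and should be checked against Table 2 before use. The function names are hypothetical.

```python
import math

def sphere_volume_um3(diameter_um):
    """Biovolume in µm^3 of a spherical cell with the given diameter in µm."""
    return math.pi / 6.0 * diameter_um ** 3

def cell_carbon_pg(diameter_um, log_a=-0.642, b=0.899):
    """Carbon per cell in pg from log10(C) = log_a + b * log10(V).

    log_a and b are assumed regression coefficients (see lead-in);
    V is the spherical biovolume in µm^3.
    """
    volume = sphere_volume_um3(diameter_um)
    return 10.0 ** (log_a + b * math.log10(volume))

# Mid-point diameters quoted above for cells of unspecified species.
carbon_flagellate = cell_carbon_pg(5.0)
carbon_colonial = cell_carbon_pg(6.7)
```

Under these assumed coefficients, a 5.0 µm flagellate holds roughly 10 pg C, which is of the same order as the per-cell carbon contents reported in this paper.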
Most colonial cells were reported in the form of cell abundances. However, one dataset (P. globosa; number of data points: n = 30) provided colony counts only, but additionally reported the corresponding colony diameters. We used the reported colony diameter to calculate colony volume (assuming spherical colonies), and from this estimated the number of cells per colony using published conversion factors (Table 2; P. globosa: Rousseau et al., 1990; P. antarctica: Mathot et al., 2000; no colony-only cell counts reported for P. pouchetii). Total cell counts per colony were then converted to carbon biomass using the method described above.
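A minimal Python sketch of the colony-to-cell conversion is given below. The log-log form and its default coefficients are assumptions, taken from the P. globosa relationship commonly attributed to Rousseau et al. (1990); the factors actually used are those listed in Table 2. Function names are hypothetical.

```python
import math

def colony_volume_mm3(diameter_um):
    """Volume in mm^3 of a spherical colony with the given diameter in µm."""
    radius_mm = diameter_um / 2000.0
    return 4.0 / 3.0 * math.pi * radius_mm ** 3

def cells_per_colony(diameter_um, slope=0.51, intercept=3.67):
    """Number of cells from log10(N) = slope * log10(V) + intercept, V in mm^3.

    slope and intercept are assumed values (see lead-in).
    """
    volume = colony_volume_mm3(diameter_um)
    return 10.0 ** (slope * math.log10(volume) + intercept)

n_cells_200 = cells_per_colony(200.0)
```

For a 200 µm colony this sketch gives a little under 290 cells, which is consistent with the ratio of the per-colony and per-cell carbon values quoted in the Discussion (9.6 ng C per colony and 33.6 pg C per cell).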
We show biomass estimates based on cell carbon excluding colonial mucus as our lower limit for Phaeocystis biomass. The range of uncertainty for the lower limit biomass estimates is given by the uncertainty in cell diameters. Additional uncertainty is introduced where cell life form is not specified. The uncertainty introduced by this assumption is addressed by calculating a minimum cell biomass estimate treating all undefined cell types as single cells.
Estimates for colonial mucus are included to provide an upper limit for Phaeocystis biomass. Estimating mucus carbon from cell counts alone is problematic, as the ratio of mucus carbon to cell number increases with colony size. Colony size therefore needs to be known in order to calculate accurate estimates of mucus carbon. Only one of the datasets (n = 30) included information on colony size. Consequently, we have used a standard colony diameter of 200 µm for all three species, based on a review of previously reported colony sizes: most P. pouchetii colonies in one study ranged between 20-450 µm; Mathot et al. (2000) observe P. antarctica colonies to range from 9.3-560 µm; and Rousseau et al. (1990) report colony sizes of P. globosa to range from 10 µm-2 mm. In all references, larger colonies occurred, but were rarer than the smaller colonies. In our data, P. globosa colonies range from 11-594 µm in diameter, with a mean diameter of 197 µm.
Given that the previously published samples, including those of Mathot et al. (2000) and Rousseau et al. (1990), cover a similar range of sizes for all three species, and that the dataset that reports colony sizes confirms a mean colony size of ca. 200 µm, the chosen standard diameter appears to be a realistic value for a typical Phaeocystis bloom. Maximum sizes are reported in Schoemann et al. (2005) and Baumann et al. (1994), and range between 9 mm-3 cm for P. globosa, between 1.5-2 mm for P. pouchetii, and around 1.4-9 mm for P. antarctica. Given the lack of data on colony sizes, we are unable to quantify the impact of large colonies on average carbon biomass. However, huge colony sizes are likely to be geographically restricted to specific regions. We assess the uncertainty of our estimates by calculating mucus carbon for the minimum and maximum colony sizes reported for each species (Baumann et al., 1994). Estimates of minimal and maximal total carbon are included in our database, but only mean total carbon including mucus will be discussed below. Conversion factors have previously been published for estimating mucus biomass and number of cells from colony volume for P. antarctica (Mathot et al., 2000) and P. globosa (Rousseau et al., 1990). Using these estimates we calculated the expected mucus biomass per cell (Table 2). Unspecified cell types were assumed to be colonial cells when calculating these upper estimates of Phaeocystis biomass.
For P. pouchetii, no direct mucus carbon conversion factor has been developed, but Verity et al. (2007) provide a conversion factor for colony volume to total colony biomass (Table 2; cells and mucus). Following the same procedure as for the other two species, we used this to calculate total biomass per cell. We then subtracted our cell biomass estimate for colonial cells to obtain an estimate of mucus carbon per cell for comparison with P. globosa and P. antarctica estimates.
Unspecified species outside of the Southern Ocean were given a total biomass per cell of 224 pg, which corresponds to the mean total biomass estimate for P. globosa and P. pouchetii (Table 2).
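The two routes to mucus carbon per cell described in this section can be sketched in Python as follows. The volumetric mucus factor of 335 ng C mm⁻³ is an assumption (the value commonly attributed to Rousseau et al., 1990, for P. globosa), as is the count of 286 cells in a 200 µm colony; the values actually used are those in Table 2. Function names are hypothetical.

```python
import math

def mucus_per_cell_pg(colony_diameter_um, mucus_ng_per_mm3, cells_in_colony):
    """Mucus carbon per cell (pg) from a volumetric mucus factor.

    The colony is treated as a sphere; 1 ng = 1000 pg.
    """
    volume_mm3 = math.pi / 6.0 * (colony_diameter_um / 1000.0) ** 3
    mucus_pg = mucus_ng_per_mm3 * volume_mm3 * 1000.0
    return mucus_pg / cells_in_colony

def mucus_by_subtraction_pg(total_per_cell_pg, cell_carbon_pg):
    """Mucus carbon per cell (pg) as total carbon minus cell carbon."""
    return total_per_cell_pg - cell_carbon_pg

# Route 1: assumed factor of 335 ng C mm^-3 and 286 cells in a 200 µm colony.
direct = mucus_per_cell_pg(200.0, 335.0, 286.0)
# Route 2: P. globosa per-cell values quoted in this paper (33.6 and 29 pg C).
by_difference = mucus_by_subtraction_pg(33.6, 29.0)
```

Both routes give a mucus contribution of a few pg C per colonial cell for P. globosa at the standard colony size.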

Global distribution of abundance data
Of the 1595 stations contained in the database (Fig. 2), 83 % are located in the Northern Hemisphere (NH) and only 17 % in the Southern Hemisphere (SH; Fig. 3). Out of the 3526 samples, 2547 were reported as non-zero biomass, with 2054 non-zero abundances out of 2862 samples for the NH, and 493 non-zero abundances out of 664 samples for the SH (Table 3). Most measurements (53 %) were taken in the latitudinal band of 50-70° N (Fig. 3). When only data points with non-zero abundances are taken into account, we find that most non-zero data were collected between 60-80° N (64 %; Table 3), with relatively few non-zero abundances recorded between 50-60° N (11 %). Several latitudinal bands are undersampled. We could not collect data for the 20-40° S, 0-10° N and 30-40° N latitudinal bands. All in all, we have few non-zero data in tropical and sub-tropical latitudes from 40° S to 40° N, where sampling is targeted at other phytoplankton groups. While 60 % of measurements were taken in the upper 10 m of the water column, the mean sampling depth of our dataset is 27 m, and the median sampling depth is 10 m. Reported cell abundances were maximal at depths between 0-80 m. Observations and laboratory experiments suggest that Phaeocystis is well-adapted to low light conditions (Arrigo et al., 1999; Shields and Smith, 2009). In our database, the deepest occurrence of Phaeocystis was at 292 m at 65° N, 35° W (Barents Sea; OBIS dataset).

Phaeocystis cell biomass distribution (mucus excluded)
Phaeocystis biomass estimates based on cell carbon only, without mucus carbon included, constitute a lower boundary for carbon biomass of this PFT in the global ocean. Since mucus carbon biomass is difficult to quantify based on Phaeocystis cell counts, many marine ecosystem models do not include a parameterisation of mucus carbon for this PFT. Thus, in the following section, our estimates of cell biomass represent a lower limit of carbon biomass for model validation.
Phaeocystis biomasses span a wide range of concentrations, which is why we show log transformed biomass concentrations in all subsequent figures. However, we report only non log-transformed biomass concentrations in this manuscript for better comparability with the original data submission.

Global surface cell biomass characteristics
Phaeocystis biomass estimated from cell carbon alone is depicted in Fig. 5a for the surface layer of the ocean (0-5 m). The maximal biomass calculated from the reported cell abundances is 5449.3 µg C l⁻¹, located at 53° N. Of all calculated cell biomasses, 40.1 % are in the range of 0-0.1 µg C l⁻¹, 55.6 % in the range of 0-1 µg C l⁻¹, and 67.5 % between 0 and 5 µg C l⁻¹. 94.8 % of all cell biomasses lie below 100 µg C l⁻¹. Figure 5b shows the range of uncertainty for cell biomass in % resulting from the uncertainty in cell diameters reported for each species and life stage. Biomasses calculated using the higher estimates of cell diameter are 246 to 355 % higher than estimates calculated using mean cell dimensions. Biomasses calculated using the lower cell diameter estimates are between 4 and 26 % of the mean values. Uncertainties are highest when species or life form is not reported. Biomass estimates are highly sensitive to changes in cell size, and reduced uncertainty is only possible if cell measurements are available in addition to abundance data.
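The strong sensitivity to cell size follows from the near-cubic dependence of carbon on diameter. The Python sketch below illustrates this under the assumption that carbon scales as biovolume to the power b, with b = 0.899 taken as an assumed exponent for prymnesiophytes; the function name is hypothetical, and the 3-9 µm range is the general range for free-living cells given in the Introduction rather than a species-specific range from Table 2.

```python
def biomass_ratio(d_alt_um, d_ref_um, b=0.899):
    """Ratio of cell carbon at two diameters, assuming C ~ V**b and V ~ d**3."""
    return (d_alt_um / d_ref_um) ** (3.0 * b)

# Free-living cells of 3-9 µm, with the mid-point of 6 µm as reference.
ratio_high = biomass_ratio(9.0, 6.0)  # upper diameter vs mid-point
ratio_low = biomass_ratio(3.0, 6.0)   # lower diameter vs mid-point
```

In this illustration the upper diameter roughly triples the biomass estimate, while the lower diameter reduces it to about 15 % of the mid-point value, which is the same kind of asymmetry as reported for Fig. 5b.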

Latitudinal cell biomass distribution
Calculated cell biomasses do not follow a distinct latitudinal pattern (Fig. 6a). The highest cell biomasses occur at latitudes around 50° N and 80° S, and the lowest are calculated for latitudes around 20° S (Peruvian upwelling). Cell biomasses decrease from 50° N towards the pole in the Northern Hemisphere, but Southern Hemisphere concentrations increase polewards towards the Antarctic continent. Given that many of our data stem from coastal regions, we note that our latitudinal distributions are biased towards high coastal concentrations in some areas, as open ocean areas are still undersampled. However, cell biomass distributions confirm previous findings that Phaeocystis blooms occur in the temperate and high latitudes of both hemispheres, and that Phaeocystis is fairly ubiquitous, occurring in all major ocean basins. This suggests that Phaeocystis should be sampled more regularly at depths between 100-300 m and below.

Cell biomass distributions for the Northern and Southern Hemispheres show that the calculated Phaeocystis biomasses reflect those of a typical blooming species (Fig. 8a and b). In the NH, Phaeocystis blooms during the spring months, with the spread of the biomass distribution being a combination of the temporal development of a bloom, and different bloom starting times at different latitudes. In the SH, cell biomasses are highest in December and January. The temporal development mostly reflects Southern Ocean dynamics, as few samples were taken at latitudes below 40° S (compare Fig. 6b).

Total Phaeocystis biomass distribution (mucus included)
Biomass estimates including colonial mucus are given as an upper limit for our biomass estimates (Fig. 9a). Given that the ratio of mucus carbon to cell carbon is highly dependent on colony size, the addition of mucus carbon estimates introduces a high level of uncertainty to total biomass estimates where colony size data are unavailable. Calculating mucus carbon biomass based on the minimum and maximum reported colony sizes for each species gives a huge range of values: percent colony carbon as mucus ranges from 0.2-99.6 % for P. globosa, 1.4-94.3 % for P. antarctica and 55.8-99.8 % for P. pouchetii. Using a standard colony diameter of 200 µm increases biomass estimates by a factor of 1.2 for colonial P. globosa and P. antarctica cells, but by 32.8 for P. pouchetii compared to estimates considering cell biomass alone. The contribution of (standard) mucus to total carbon per cell is 96.9 % for P. pouchetii, and 14.6 % for P. globosa and P. antarctica (Table 2) for this standard colony size. The difference between the three species leads to a larger contribution by the Northern Hemisphere species to total Phaeocystis biomass (Fig. 9a and b). Total Phaeocystis biomass estimates including (standard) mucus range from 2.9 × 10⁻⁵ µg C l⁻¹ to 19 823 µg C l⁻¹. The maximal total biomass (19 823 µg C l⁻¹) is 3.6 times higher than the corresponding data point with the maximal cell biomass of 5449.3 µg C l⁻¹. This data point is associated with high cell numbers during a bloom of P. pouchetii off the coast of the Netherlands in the Wadden Sea.
In contrast, the maximal total biomass in the Southern Hemisphere is only 918 µg C l⁻¹, and thus one order of magnitude lower than maximal total biomasses in the Northern Hemisphere (Fig. 9). The global mean of all reported non-zero total biomass values is 183.8 µg C l⁻¹, and the median is 11.3 µg C l⁻¹. While our publicly available dataset also contains an estimate of maximal and minimal total carbon biomass based on maximal and minimal reported colony sizes (and thus maximal and minimal mucus), we do not visualize these results here. Uncertainties in the mucus contribution to total biomass due to these uncertainties in colony size range from hundreds to thousands of percent, and total carbon biomass estimates are far from certain at this point in time.

Table 4. Seasonal distribution of abundance data for the Northern and Southern Hemispheres. Number of data points for each month. All: all data, non-zero: data with non-zero carbon biomass. 27 observations did not include the month when measurements were taken.
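The multiplicative factors and the mucus percentages quoted in this section are two views of the same quantity, and the relationship can be checked with a few lines of Python. This is an illustrative sketch only; the function names are hypothetical.

```python
def mucus_fraction(total_to_cell_factor):
    """Share of total carbon held in mucus when total = factor * cell carbon."""
    return 1.0 - 1.0 / total_to_cell_factor

def total_to_cell_factor(mucus_share):
    """Factor by which cell carbon is multiplied for a given mucus share."""
    return 1.0 / (1.0 - mucus_share)

share_for_32_8 = mucus_fraction(32.8)           # close to 0.97
factor_for_14_6 = total_to_cell_factor(0.146)   # close to 1.17
max_ratio = 19823.0 / 5449.3                    # close to 3.6
```

A factor of 32.8 thus corresponds to a mucus share of about 97 %, and a mucus share of 14.6 % to a factor of about 1.2 after rounding.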

Discussion
We have estimated the carbon biomass of the haptophyte Phaeocystis from microscopic determinations of cell abundances. This approach is associated with several uncertainties.
First, since the data included in this database are sparse, we may have biases that we cannot account for. Whether the biomass estimates truly represent global averages is unclear. It has been noted that there is always a background concentration of Phaeocystis cells when this genus is present in colonial form. Furthermore, even though Phaeocystis is ubiquitous, our data show a poor spatial resolution and data coverage outside the high-latitude coastal regions. Our biomass estimates for the coastal seas may not be representative of open ocean concentrations. Some areas such as the Pacific Ocean are clearly under-represented and we were not able to acquire any Phaeocystis measurements from the Northwest and West Pacific. Furthermore, there is a gap in our observations in the Arctic waters north of Siberia, and north of North America and in Greenland waters, despite published reports of high biomass off Greenland (Smith Jr., 1993). Our data are also seasonally biased in the Southern Hemisphere, with 58 % of the data acquired during the summer months. In addition, we note that Phaeocystis is only accurately counted at times when it is expected to form large blooms, when there is a strong likelihood that its abundance is high and when scientists are specifically looking for this group. Hence, low background concentrations of single-celled Phaeocystis will often be overlooked. Since the single-celled life stages of Phaeocystis lack a clear morphological distinction, this gap in our current knowledge is unlikely to be resolved using microscopic methods, but will require genetic identification methods.
Second, there are methodological issues with the determination of abundance data that will influence our biomass calculations. Several data contributors do not report the life stage cells were in at the time of sampling, most likely due to the disruption of colony structure during cell fixation. This makes it difficult to distinguish single and colonial cells. Hence, in order to obtain a lower limit on Phaeocystis cell biomass, we chose to assume undefined cells to be in the form of flagellates, which will bias the resulting biomass calculations. The ratio of free-living to colonial cells is highly variable, but a significant background concentration of free-living cells is present even during bloom conditions. Our assumption that all unspecified cells are flagellates is therefore likely to lead to an underestimation of Phaeocystis cell biomass.
Furthermore, non-blooming species such as P. cordata, P. jahnii or P. scrobiculata are not recorded explicitly in our abundance data, but may constitute a non-negligible fraction of total global Phaeocystis biomass in some oceanic regions.
Third, there are large uncertainties associated with the conversion of cell abundances to biomass. Cell measurements were only provided for very few datasets; for the majority of the database, biovolumes were calculated using mean published cell dimensions. Cell size is highly variable for all Phaeocystis species, and using a constant biovolume estimate for each species will underestimate the spatial and temporal variability that occurs in Phaeocystis biomass. Due to the differences in the reported size range, our estimates of cell carbon content are different from some previously reported figures. For example, our estimates of cell carbon content for P. globosa (Table 2; flagellates: 13 pg C cell⁻¹; colonial cells: 29 pg C cell⁻¹) are higher than estimates by Rousseau et al. (1990; flagellates: 11 pg C cell⁻¹; colonial cells: 14 pg C cell⁻¹), and our estimates for P. antarctica (Table 2; flagellates: 9 pg C cell⁻¹; colonial cells: 21 pg C cell⁻¹) are higher than those reported by Mathot et al. (2000; flagellates: 3 pg C cell⁻¹; colonial cells: 14 pg C cell⁻¹) due to these differences in the reported mean cell diameters that were used to calculate the carbon estimates. Furthermore, literature values for the carbon conversion factor are only given for prymnesiophytes in general, but we lack information on the individual species of Phaeocystis, which may have a species-dependent, spatially and temporally varying cell carbon content.
Last, there is a large uncertainty associated with the addition of mucus carbon biomass due to the lack of data on cell forms, colony size and the amount of mucus per colonial cell. Greater use of preservation methods that maintain colony structure, along with routine colony size measurements, would allow for more reliable estimates of colonial mucus carbon. Further data on Phaeocystis colony sizes are clearly needed if mucus carbon is to be included in global biomass estimates and model validation. Moreover, there are uncertainties related to the structure of the mucilaginous carbon surrounding colonies. For example, an alternative method for estimating the total carbon biomass of P. globosa has been suggested by Van Rijssel et al. (1997) based on the observed hollow structure of the colonies. Van Rijssel et al. (1997) compute total biomass per cell based on a linear relationship between colony surface area and carbon content. The two methods yield markedly different estimates of mean total carbon per P. globosa cell. For our standard colonies of 200 µm diameter, we find total P. globosa carbon per cell to be 33.6 pg C cell⁻¹ following Rousseau et al. (1990, Table 2); we compute an amount of 202.5 pg C cell⁻¹ using Van Rijssel et al. (1997). The Rousseau relationship results in 9.6 ng C colony⁻¹, whereas the Van Rijssel relationship would lead to 58 ng C colony⁻¹ for this species. Prior to the publication of Verity et al. (2007), the contribution of mucus carbon to total carbon per cell for P. pouchetii was estimated using the Rousseau et al. (1990) and Mathot et al. (2000) or the Van Rijssel et al. (1997) formulations (Reigstad and Wassmann, 2007). Using these relationships, Reigstad and Wassmann (2007) find a much lower contribution of mucus (10 %) to total carbon per cell than what we find using Verity et al. (2007; 96.9 %). Earlier estimates of P. pouchetii mucus carbon may thus not be compatible with our estimations.
Clearly, future studies are needed to address this uncertainty in colony structure and mucus distribution, and the corresponding volume to biomass conversion factors.

Conclusions
This is the first attempt at creating a global Phaeocystis biomass database. At present, however, we are still far from being able to give a global estimate of Phaeocystis biomass concentration. Data are limited by lack of spatial and temporal resolution, and at most sampling sites we lack a seasonal cycle that would be necessary to determine reasonable estimates for annual mean biomass concentration. Annual and monthly mean biomasses are of particular interest for the modelling community, but these will only be meaningful if further microscopic data can be added to the database. Targeted explorations of marine ecosystems with the aim to determine phytoplankton biomass would be desirable, but such endeavours tend to be expensive and laborious. A marine census of species biomass would shed light on the relative importance of key marine plankton groups and their respective importance for global biogeochemical cycling.
Figure 9. Estimates of (a) log-normalized total mean Phaeocystis biomass including colonial mucus for the surface layer (0-5 m) and (b) fraction of total mean surface biomass composed of mucus carbon. Zero values are not represented. The difference between the ratios of total carbon to cell carbon for the three species leads to a larger contribution by the Northern Hemisphere species to total Phaeocystis biomass.

A1 Data table
A full data table containing all biomass data points can be downloaded from the data archive PANGAEA, doi:10.1594/PANGAEA.779101. The data file contains longitude, latitude, depth, sampling time, abundance counts and biomass concentrations, as well as the full data references.

A2 Gridded netCDF biomass product
Monthly mean biomass data have been gridded onto a 360 × 180° grid, with a vertical resolution of 33 depth levels (equivalent to World Ocean Atlas depths) and a temporal resolution of 12 months (climatological monthly means). Data have been converted to netCDF format for easy use in model evaluation exercises. The netCDF file can be downloaded from PANGAEA, doi:10.1594/PANGAEA.779101. This file contains total and non-zero abundances, cell biomasses and total biomass estimates. For all fields, the means, medians and standard deviations resulting from multiple observations in each of the 1° pixels are given. The ranges in cell and total biomasses due to uncertainties in cell size and life form are not included as variables in the netCDF product, but are given as ranges (minimum cell biomass, maximum cell biomass; minimum total biomass, maximum total biomass) in the data