On modeling the Southern Ocean Phytoplankton Functional Types

. Phytoplankton in the Southern Ocean support important ecosystems and play a key role in the earth’s carbon cycle, hence affecting climate. However, current global biogeochemical models struggle to reproduce the dynamics and co-existence of key phytoplankton functional types (PFTs) in this Ocean. Here we explore the traits important to allow three key PFTs (diatoms, coccolithophores and Phaeocystis ) to have distributions, dominance and composition consistent with observations. In this study 5 we use the Darwin biogeochemical/ecosystem model coupled to the Massachusetts Institute of Technology (MIT) general circulation model (Darwin-MITgcm). We evaluated our model against an extensive synthesis of observations, including in situ microscopy and high-performance liquid chromatography (HPLC), and satellite derived phytoplankton dominance, PFT chlorophyll-a (Chla), and phenology metrics. To capture the regional timing of diatom blooms obtained from satellite required including both a lightly siliciﬁed diatom type and a larger and heavy siliciﬁed type in the model. To obtain the anticipated 10 distribution of coccolithophores, including the Great Calcite Belt, required accounting for a high afﬁnity for nutrients and an ability to escape grazing control of this PFT. The implementation of two life stages of Phaeocystis to simulate both solitary and colonial forms of this PFT (with switching between forms being driven by iron availability) improved the co-existence of coccolithophores and Phaeocystis north of the Polar Front. The dual life-stages of Phaeocystis allowed it to compete both with other phytoplankton of larger size and/or similar sizes. The evaluation of simulated PFTs showed signiﬁcant agreement 15 to a large set of matchups with in situ PFT Chl-a data derived from pigment concentrations. Satellite data provided important qualitative comparisons of PFT phenology and PFT dominance. With these newly added traits the model produced the observed >50 % coccolithophore contribution to the biomass of biomineralizing PFTs in the Great Calcite Belt. The model together with the large synthesis of observations provides a clearer picture of the Southern Ocean phytoplankton community structure, and new appreciation of the traits that are likely important in setting this structure. grazing control for coccolithophores. Next, the model was extended 75 to include both solitary and colonial forms of Phaeocystis . Observational information from in situ and satellite measurements was used to help to deﬁne differences in the PFT traits, to constrain the model, as well as to quantitatively evaluate the model performance to overall ﬁnd a representation of the phytoplankton community in the Southern Ocean that is close to observations. We used the optimized Darwin model to test three hypotheses on the factors controlling the biogeography of Southern Ocean phytoplankton groups:

ithophore traits, the study by Krumhardt et al. (2017), Monteiro et al. (2016), as well as previous studies by Paasche (2001) and Iglesias-Rodríguez et al. (2002), emphasized the high nutrient affinity of the coccolithophores and high grazing protection of this PFT (Monteiro et al., 2016). Nissen et al. (2018) reported on higher grazing pressure on coccolithophores than on diatoms. Krumhardt et al. (2019) used lower grazing pressure on coccolithophores than on diatoms and related the distribution of coccolithophores to a specific temperature function in dependence to its growth rate. However, none of these studies included 70 Phaeocystis in their model simulations.
In our study, we improved the representation of key Southern Ocean PFTs, namely diatoms, coccolithophores and Phaeocystis, using the Darwin biogeochemical model coupled to the Massachusetts Institute of Technology (MIT) general circulation model ( Darwin-MITgcm). In a first step, we modified the Darwin model to account for two distinct size classes of diatoms and for a high affinity for nutrients and an ability to escape grazing control for coccolithophores. Next, the model was extended 75 to include both solitary and colonial forms of Phaeocystis. Observational information from in situ and satellite measurements was used to help to define differences in the PFT traits, to constrain the model, as well as to quantitatively evaluate the model performance to overall find a representation of the phytoplankton community in the Southern Ocean that is close to observations. We used the optimized Darwin model to test three hypotheses on the factors controlling the biogeography of Southern Ocean phytoplankton groups: 80 -Size diversity of the diatoms (Queguiner, 2013;Tréguer et al., 2018) leads to the distribution of small diatoms ("slightly silicified and fast growing") at the lower latitudes and large diatoms ("strongly silicified and slowly growing") at higher latitudes in the Southern Ocean.
-Distribution of coccolithophores in the Great Calcite Belt is not necessarily controlled by temperature  but determined by the ability of this PFT to escape grazing because of their exoskeleton (Nejstgaard et al., 1997;85 Huskin et al., 2000;Monteiro et al., 2016), and to grow under nutrient depleted conditions (especially phosphate and iron) (Paasche, 2001; Iglesias-Rodríguez et al., 2002). These characteristics of coccolithophores would make them more competitive among other phytoplankton of larger or similar size, small diatoms and Phaeocystis.
-Phaeocystis exists in two life stages, solitary cells and colonies, depending on iron availability (Bender et al., 2018). This additional difference in the traits of distinct haptophytes, coccolithophores and Phaeocystis, allows them to co-exist. 90 The paper is organized as follows. Section 2 describes the numerical model set up, experimental design and observations (in situ and satellite retrievals) used for model evaluation, Section 3 presents the results and discussion. Section 4 concludes with summary and outlook. The Darwin biogeochemical model (Dutkiewicz et al., 2015) represents the ocean biogeochemical cycling of phosphorus (P), nitrogen (N), carbon (C), silicon (Si) and iron (Fe). Chlorophyll-a (Chla) and carbon are decoupled given the Geider et al. (1998) photophysiological model. The version of the Darwin model used in our study simulates, among a total of 42 biogeochemical components describing these biogeochemical cycles, two types of zooplankton and six phytoplank-100 ton groups. These six (from initially nine in Dutkiewicz et al. 2015) phytoplankton groups are analogues of diatoms, nanophytoplankton, prochlorophytes, other pico-phytoplankton (including pico-eukaryotes), nitrogen fixing phytoplankton (including Trichodesmium) and coccolithophores. Starting from this reduced with respect to the number of PFTs Dutkiewicz et al. (2015) Darwin configuration, the following steps have been performed to adapt the Darwin model for simulations of the Southern Ocean biogeochemistry and phytoplankton dynamics and diversity: 105 -Diatoms have been introduced as two distinct size classes (as two different model variables): small and "slightly silicified and fast growing" at lower latitudes (introduced instead of "other pico"); large and "strongly silicified slowly growing cells" at high latitudes (Queguiner, 2013).
Thus, in the modified Darwin version the following six PFTs are considered: large and small diatoms, Phaeocystis and coccolithophores, Proclorococcus-like and N-fixers. Although later two PFTs only play a very minor role in the Southern Ocean, their distributions determine the extent and abundance of small phytoplankton and coccolithophores north of the Subantarcic and 115 Suptropical Fronts. Hence, we keep N-fixer and Proclorococcus-like prokarytes (it would also allow to maintain a reasonably good performance of the model globally). Phaeocystis are considered as adjusted (with respect to the traits) "other large" since "other large" did not survive in the original (Dutkiewicz et al., 2015) version that was developed for the global ocean. However, we cannot strictly state that the Phaeocystis-analogue considered is pure Phaeocystis sp., it could be also other misrepresented  In the current model configuration, instead of exploiting the radiative transfer model accounting explicitly for absorption and scattering of spectrally resolved light as in the version by Dutkiewicz et al. (2015), we use a simplified (because of computational limitations) parametrization of the light in terms of shortwave irradiance (I) penetrated over depth. Table 1 summarizes specific traits for the simulated PFTs, which are described by the following physiological parameters: the maximum photosynthetic rate (P C max , day −1 ); the photoinhibition parameter (β) applied to Prochlorococcus; the growth half-saturation constant 125 (k sat , mmol m −3 ); the biomineralizing function (mf unc), whether or not they form biominerals such as opal and calcite.
These main differences between specified traits alter the growth rate (µ j , day −1 ) of the particular phytoplankton (j = 1, 2,..., 6) and the grazing of phytoplankton by small or micro-zooplankton (Gr jk , k = 1, 2) given the palatability factor (r j,k ) and sinking rate (w sink , m day −1 ).
The growth of phytoplankton is parameterized following Geider et al. (1998) to account for decoupling between Chla and 130 C: here P C mj is the light saturated photosynthesis rate; γ η and γ T denote the functions of the growth rate on limiting nutrients (η ji , i = P, N, Si, F e) and temperature, respectively (Dutkiewicz et al., 2015); α j is the initial slope of the photosynthesis vs.
irradiance (P-I, Platt et al. 1980) curve, which is (following Dutkiewicz et al. 2015) a product of the phytoplankton-specific light absorption (considered spectrally averaged, a * j , m 2 mgChla −1 ) and the maximum quantum yield of carbon fixation (φ maxj , 140 mmolC (mol photons) −1 ); θ j is the simulated chlorophyll to carbon ratio. The P C mj and k sati parameters are specified with the use of empirical allometric relationships (Ward et al., 2012(Ward et al., , 2017. The γ T function was considered the same for diatom, coccolithophores, Phaeocystis and prokaryotes given the coefficient τ T = 0.8 normalized the maximum value (unitless), the temperature coefficient A T = -4000 K, and the optimal temperature T 0 = 293.15 K.
Grazing is formulated as a Holling III function: where g max jk is the zooplankton maximum grazing rate on phytoplankton (d −1 , Dutkiewicz et al. 2015), and κ sat k is the half-saturation constant for grazing.
Sinking is expressed given the phytoplankton-specific sinking rate w sinkj as:

5
The described biogeochemical model configuration given the parameters in Table 1 is exploited for our REF experiment.
Most of the biogeochemical model parameters used in our study have been taken from the original study by Dutkiewicz et al. (2015) and from detailed laboratory studies conducted by Trimborn et al. (2017). Hence, Table 1 contains only the parameters used in the parameterizations crucial to drive the differences/diversity in the considered PFT traits. Other parameters (as well 155 as parameterizations) not listed are the same as in Dutkiewicz et al. (2015).
In our additional experiment PHAEO, two distinct Phaeocystis life stages (colonies and solitary cells) have been introduced following Popova et al. (2007) and Kaufman et al. (2017). These two Phaeocystis life stages are considered as a function of iron availability (Bender et al., 2018): if the iron concentration is less than the iron half saturation constant (k satF e ), Phaeocystis is assumed to be present as solitary cells with the mortality rate and grazing pressure being higher by 1.3 and 1.25, respectively, 160 than those cells in a colonial form. Following Popova et al. (2007), we consider Phaeocystis sinking rate (w sink ) dependent of available nutrients, but in our case it is limited to iron concentration as following: k satF e (P haeo cell ) = k satF e (P haeo) * 0.8.
Note that in the model Phaeocystis, independent of the life stage -colonial phase or solitary cells, -is considered as one tracer.

165
However, the assumed morphology and, therefore, physiology (mortality rate, r j,k , k satF e , sinking rate) differ as described above. We have not performed any sensitivity experiments with respect to the new parameters. However, we expect the model to be sensitive to their specification since it will also determine the competition between Phaeocystis and small diatoms.

Physics
The biogeochemical model is coupled to a global configuration of the Massachusetts Institute of Technology general circulation 170 model (MITgcm, 2012) on a cubed-sphere grid (Adcroft et al., 2004) with a mean horizontal grid spacing of 18 km and 50 vertical levels with the resolution ranging from 10 m near the surface to 450 m in the deep ocean (Menemenlis et al., 2005;Losch et al., 2010). The simulation includes a dynamic sea-ice model with a viscous-plastic rheology and a zero-layer thermodynamic submodel (Losch et al., 2010). Penetrating light is attenuated within sea ice with an exponential law (Taylor et al. 2013, Appendix A2).

175
Initial conditions of the physical model were obtained from a short spin-up simulation initialised in January 1979 from rest and from temperature and salinity fields derived from the Polar Science Center Hydrographic Climatology (PHC) 3.0 (Steele et al., 2001). In the spin-up phase, the model is forced until the end of 1991 by 6-hourly atmospheric surface fields derived from the European Centre for Medium-Range Weather Forecasts (ECMWF) 40 year reanalysis (ERA-40) (Uppala et al., 2005). For more details see Losch et al. (2010, Section 3). Starting on January 1st, 1992, the model with biogeochemistry is forced until 180 2012 by 3-hourly atmospheric surface fields of the Japanese 55-year reanalysis (JRA55, Kobayashi et al. 2015). Initially, the model time step had to be decreased to 10 min because of the higher forcing frequency. This constraint was slowly relaxed to 20 min by January 1st, 1996. The change in forcing also required an adjustment of some of the sea-ice model parameters. The albedos for dry ice, wet ice, dry snow, and wet snow were set to 0.75, 0.71, 0.87, and 0.81, respectively; the simulation does not use the replacement pressure method (Kimmritz et al., 2017). After spinning up the biogeochemistry for six years, during 185 which also the physical simulation adjusts to the new forcing, the years 1999 -2012 are integrated and the period of August 2002 -April 2012 is used for analysis.

Biogeochemical tracers initialisation
To initialise (in 1992) the biogeochemical model variables we use the results of the study by Taylor et al. (2013), which used a similar MITgcm configuration coupled with the Regulated Ecosystem Model (REcoM, Schartau et al. 2007) to examine the rest of the period that we consider here. It is not computationally possible to reach a totally adjusted system, and the ecological questions we address in this paper do not require such adjustments.

Evaluation with observational data
To assess our model results, we compare the simulations to several large in situ and satellite datasets, as detailed below and summarized in Table 2. Where the coverage of the observations is similar in respect to time we use our two-weekly 205 model outputs. Where only monthly climatological or composite data (often from different time periods) are available we use monthly climatological model results for the period of [2006][2007][2008][2009][2010][2011][2012]. Where only results for specific months are available from observations we compare our output to these specific months. Table 3 contains the information about the evaluated phytoplankton groups as classified in the model and observations. summer months (see supplemental video materials). The phytoplankton groups for this PFT-Chla dataset were derived using the Diagnostic Pigment Analysis (DPA) following Vidussi et al. (2001) and Uitz et al. (2006) and modified as in Hirata et al. (2011) and Brewin et al. (2015) and adapted to a much larger dataset. Briefly, PFTs have different and specific pigments (marker pigments, e.g. fucoxanthin -diatoms) that allow distinguishing them. The biomass of a specific PFT can be quantified by determining the contribution of the corresponding diagnostic pigment to the total phytoplankton biomass (represented by 220 the weighted sum of the diagnostic pigments). It is worth mentioning that DPA allows also to retrieve other PFTs -like dinoflagellates, cryptophytes and green algae -however, they were not included in this referenced dataset, originally generated for the evaluation of satellite retrievals of diatoms, coccolithophores (haptophytes) and prokaryotes. For more details on the method and data quality control of this in situ dataset, we refer the reader to the study by Losa et al. (2017, Supplementary Material, Section 1 and 3).

In situ observations
225 Figure 1 shows the locations of this available in situ HPLC dataset in the Southern Ocean. As we can see there and in matchup statistics is presented for several biogeochemical provinces (Longhurst, 1998)  Model results of PFT dominance are compared to dominating phytoplankton groups from the monthly climatologies of the satellite product PHYSAT (1998( , Alvain et al. 2008. PHYSAT is based on the analysis of normalized water-leaving radiance anomalies, computed after removing the impact of chlorophyll-a variations. Specific water-leaving radiance spectra anomalies (in terms of spectral shapes and amplitudes) have been empirically associated to the presence of dominant phytoplankton groups, based on in situ diagnostic pigment observations. This product is based on the multispectral Sea-Viewing
We also evaluate the model simulations in terms of PFT Chla ( We compare model climatology of Southern Ocean PFT dominance (averaged over the years 2006 -2012) to the PHYSAT PFT dominance. Dominance of the modeled PFT is defined if its Chla fraction is more than 55% of the total Chla. In line with the evaluation against the PHYSAT PFT dominance, the simulated PFT dominance are compared to similar estimates obtained 275 in the study by Dutkiewicz et al. (2015).
Two SynSenPFT products (at 4 km and daily) -diatom Chla that combines diatom Chla from PhytoDOAS and OC-PFT, and coccolithophore Chla that combines coccolithophore Chla from PhytoDOAS with haptophyte Chla from OC- PFT -are  method proposed and initially applied for assessing the TChla phenology by Siegel et al. (2002). In particular, we use the following indices: the Chla maximum date, the bloom start date, and the bloom end date. These indices are calculated based on the REF Chl simulations for diatoms (including small and large) over the year 2007/2008. We chose this particular year because: 1) with the two-weekly model output the phenological indices can be more precisely calculated than based on the two-weekly or monthly mean climatology; 2) it is a typical year over the period 2006 -2012 with respect to the simulated PFT 290 distribution (after model reached the quasi-steady state) and climate oscillations (Soppa et al., 2016).

Results and Discussions
In the following, we show and discuss the results of our model simulations using   Dutkiewicz et al. (2015). for climatological December, January, February and July. For complete 12 monthly mean climatologies for PFT dominance as retrieved by PHYSAT and predicted in Dutkiewicz et al. (2015) 305 and REF experiment, the reader is referred to the Supplementary Material (Figures S15 -S17, respectively). In general, the PHYSAT Southern Ocean PFT dominance climatology (over the years 1998 -2006) shows a strong seasonal variability of PFT compositions and contributions of PFTs to TChla (Alvain et al., 2008). From November to January south of 40 • S, the diatom contribution is higher than 50%. This high diatom contribution in the austral spring and summer is associated with large diatom blooms starting in October at lower latitudes and moving towards higher latitudes in December -January. The Among the different experiments, the model set up REF gave the best agreement to observed phytoplankton composition, 320 dominance and diatom phenology: This set up includes two size classes of diatoms given the parameters in Table 1, but initially without considering Phaeocystis in two distinct phases. Figure  appearance of diatom blooms in the Southern Ocean simulated by most (global ocean) biogeochemical models ; as well in the Darwin model set up published by Dutkiewicz et al. (2015) and regional models (Nissen et al., 2018)) can be explained by the lack of inclusion of the size diversity in diatoms (Tréguer et al., 2018). did not survive. It happened because there were not sufficient differences between the traits assumed for coccolithophores and "other large" (or Phaeocystis-analogue). As a result, it took longer for the model to get in a quasi-steady state and finally lead to just one of the haptophytes survived (taking over for another). Hence, the experiment REF represents diatoms and haptophytes after reaching a quasi-steady state, but cannot distinguish among haptophytes. In original Darwin-2015model (Dutkiewicz et al. 2015) "other large" did not survive.

Differentiation among haptophytes: coccolithophores vs. Phaeocystis
To cope with the aforementioned problem leading to two different states either with coccolithophores or with Phaeocystis surviving, in experiment PHAEO we introduced additional differences between the traits of these two PFTs. In particular, we considered two distinct life stages of Phaeocystis (colonies and solitary cells) in which its morphological features and physiology depend on iron availability (Bender et al., 2018). To illustrate the simulated Southern Ocean phytoplankton compositions,  Figure S8). These show that the fraction of coccolithophores is higher in austral winter than in summer.  Nissen et al. (2018), who reported on an increased (relative to diatoms) grazing of coccolithophores as a factor controlling the coccolithophore biogeography in the Southern Ocean. Our assumptions on low palatability factor of coccolithophores are, nevertheless, backed up by studies by Nejstgaard et al. (1997), Huskin et al. (2000), Losa et al. (2006) and Monteiro et al. (2016). In the study by Losa et al. 385 (2006) on optimised biogeochemical parameters, it was shown that coccolithophore blooms are associated with low grazing pressure. Based on laboratory experiments, Nejstgaard et al. (1997) and Huskin et al. (2000) concluded that coccolithophores (due to its "stony" structure) do not influence the microzooplankton growth. While the exact mechanisms of how this PFT uses the coccolith to protect itself against grazing is not fully understood (Monteiro et al., 2016), the ability of coccolithophores to escape grazing control has "relatively well-supported evidence" (see Monteiro et al. 2016 for review). High affinity of coc-390 colithophores for nutrients (for phosphate and iron to a larger extent than for nitrogen, Paasche 2001) makes them strongly competitive in environmental conditions with declining nutrient concentrations (Paasche, 2001;Iglesias-Rodríguez et al., 2002;Krumhardt et al., 2017), for instance under strong ocean stratification or nutrient consumption by other PFTs (see Figure 6). Thus Figure 6 shows that the simulated abundance of coccolithophores north of the Subtropical Front (STF) -where phosphate occurs in very low concentrations -is explained by the introduced high affinity of this PFT to phosphate (small half-saturation rate in γ η function) allowing coccolithophores to grow in nutrient depleted conditions. However, in the region between the Subtropical and Subantarctic Fronts the occurrence of coccolithophores is more evidently linked to low grazing pressure on 410 this PFT due to its much lower palatibility for zooplankton in comparison with small diatoms or Phaeocystis presented by single solitary cells. As in the study by Smith et al. (2017) reported biogeography of observed coccolithophores in the Great Calcite Belt, our simulated coccolithophore Chla is distributed in the silica-depleted area, where small diatom cells, even if they could still compete for other nutrients, have higher palatability for grazers. Coccolithophores do not compete with small diatoms on silica resources and might survive due to its lower palatability factor. It could also be that in this area silica limited 415 diatoms slowly grow allowing coccolithophores for earlier access to other (not used yet by diatoms) macronutrients and iron.

General evaluation of experiment PHAEO
In this subchapter we present the quantitative and qualitative evaluation of our final model setup PHAEO results.  and 6, support the evidence of Phaeocystis dominance among haptophytes at these latitudes. Thus, SynSenPFT more likely overestimates coccolithophore Chla at the latitudes higher than 60 • S. For diatoms, modeled Chla exceeds SynSenPFT estimates south of the Antarctic Circumpolar Current Front. However, SynSenPFT diatom Chla is known to be underestimated for the Antarctic Province (see Losa et al. (2017)). At the same time, diatom Chla estimates obtained with PhytoDOAS are higher 450 (see Supplementary Material, Section S2) despite the low coverage of the product, which can indicate that predicted model diatom Chla could be a bit less overestimated than it is suggested by comparison with SynSenPFT.

PHAEO against in situ HPLC-based observations
Two-weekly PHAEO model Chla snapshots for diatoms (large + small), haptophytes (coccolithophores + Phaeocystis) and Longhurst's biogeochemical provinces (Longhurst 1998, see Figure 1). The highest disagreement was obtained for diatoms in the Atlantic Sector of the ANTA province, where the simulated diatom Chla is systematically overestimated by~0.5 mg m −3 .
The best agreement with the HPLC based diatom Chla (excluding small provinces, see Figure 1) was obtained at the SSTC and SANT. For the haptophytes, the highest systematic error towards overestimation has been found at two small provinces shows that statistical criteria, such as MAE and root mean squared error (RMSE) give statistical meaningful metrics with respect to "model minus in situ Chla data" and the evaluation does not necessarily require a logarithmic transformation, as it is often done in ocean colour product validation (Brewin et al., 2010;Losa et al., 2017).
With respect to the agreement between model and observed in situ Chla for prokaryotic pico-phytoplankton (Soppa et. al 2017) depicted in Figure S11 (Supplementary Material) one can conclude that the frequency distributions of the simulated and 485 observed pico-phytoplankton are different, and the frequency distribution of the differences confirms that MAE and RMSE given absolute (Table 6) or logarithmically transformed values can hardly provide satisfactory estimates. Nevertheless, it is worth emphasizing that the largest differences between model and observed in situ prokaryotic pico-phytoplankton are located along the Antarctic Peninsula.
It is worth mentioning that the statistical estimates between model and observation PFT-CHla were carried out using match-490 ups within ± 1 week. Moreover, the model does not explicitly represent sea-ice algae and, therefore, might work less well in the region around the sea-ice. In this respect, we have to point out that all the statistics are presented for a qualitative assessment of the model rather than for a quantitative estimate of model uncertainties, since the representation error (Janjić et al., 2018) related to the differences in spatial and temporal scales considered and sampled by the model vs. observations as well as to the mismatch in grouping phytoplankton  are quite large.

PHAEO against in situ MAREDAT PFT biomass observations
The representation error is even larger for the comparison of PHAEO monthly mean climatology of the diatom, coccolithophores and Phaeocystis biomass (mgC m −3 ) with monthly composites of in situ PFT biomass measurements from the MAREDAT dataset. Figure S13 shows the distribution of MAREDAT seasonal (summer and spring) composites of diatom (panels a and b), coccolithophores (panels d and e) and Phaeocystis (panels g and h) biomass data vs. PHAEO monthly clima-

Perspectives and limitations of the study
Concluding from the gained experience (including sensitivity tests) on constraining the model with available observations which lead to our PHAEO set up, from PHAEO results and their discussion with comparable datasets from in situ, satellite and modelling, we come up with the following crucial points that if addressed could further improve phytoplankton composition predictions in the Southern Ocean.

515
Phytoplankton growth: equation (3) (φ max ). The differences in the phytoplankton growth are presented mostly by the variety of the assumed maximum photosynthesis rate (P C maxj ) and chlorophyll to carbon ratio θ j , resulting in slower growth of coccolithophores than for diatoms and Phaeocystis, which determines the simulated PFT phenology and competition. The initial slope of the P-I curve (α), opposed to the study by Hickman et al. (2010) and Dutkiewicz et al. (2015), was considered identical for all PFTs. The use of PFT specific absorption spectra (Eq. (3)) when setting up the 520 PFT traits allows the initial slope of the P-I curve (α) being distinct for particular phytoplankton. However, an improved representation of the α parameter would also require some differences in the maximum quantum yield of carbon fixation (φ max ) specification (Hiscock et al., 2008). This would further improve the model performance (for instance, Phaeocystis antarctica dominance in the Ross Sea) and would probably bring the assumed P C maxj values (which are currently, to some extent, overestimated) closer to measurements (Tables S5, S6 Supplementary Material, Section S4). However, the φ max 525 is measured given α and phytoplankton specific absorption. That means that biogeochemical models have to differentiate between the α parameter for distinct PFTs.
Phaeocystis colony formation: in this study, we use very simplistic approach to parameterize life cycle transition of Phaeocystis given just one model tracer. In our model this transition is triggered only by iron variability (as reported by Bender et al. 2018), but not by light availability (as previously reported by Pererzak, 1993). Since we reported on 530 our first trial, it is worth keeping in mind that the model is expected to be sensitive to the differences we specify for the mortality and grazing rates and iron uptake for colonial and single cell stage. A careful model calibration of these parameters could further improve the model performance.
Prokaryotes: even though the Prochlorophytes-analogue is not present/dominant in the Southern Ocean, accounting for this pico-phytoplankton is a prerequisite for the simulation of the northern edge of coccolithophores distribution south of 535 the STF (in the SSTC bgc province). In this respect the assumption on photoinhibition for this PFT as well as for other PFTs might need a careful revision.
Remineralization and other parameterized processes: the simulated distribution, competition and co-occurrence of the key Southern Ocean PFTs are generally discussed in terms of differentiating the PFT traits via the specification of phytoplankton growth (with different light acclimation strategies and affinities to nutrients) and palatibilities for zooplankton 540 grazing. However, there are other processes altering the model PFT dynamics. For instance (a model based evidence, not shown), augmenting the model by CDOM affected the remineralization processes altering the nutrient distribution and therefore the spatial and temporal distribution of PFTs competing for the available resources, which indicates sensitivity of the model to the parameterization of remineralization processes. One should also think of limitations due to unresolved sea-ice algae, which might lead to overestimated diatom Chla in the marginal ice zone. of algae/sea-ice interaction. 545 Present-day satellite retrieval algorithms allow to detect biomass (and dominance) of some PFTs including haptophytes in general (OC-PFT, Hirata et al. 2011), coccolithophores in particular (PhytoDOAS, Sadeghi et al. 2012, diatoms and cyanobacteria/prokaryotes ( OC-PFT;PhytoDOAS, Bracher et al. 2009). Though there is the mismatch between the phytoplankton group-ing used in numerical models, satellite algorithms and in situ observations, the information from these different sources can be considered complementary.

550
However, when combining or comparing models and observational information, we have to keep in mind representation errors and limitations of approaches used for deriving PFT information from in situ and satellite observations. Generally, a temporal and spatial scale mismatch exists between in situ or satellite observations and model output depending on the model discretization. In situ measurements in the Southern Ocean are sparse in space and time and only provide a fraction of the information obtained by the model. Scientific cruises in the Southern Ocean are often carried out close to the continents/ice 555 shelf or in regions with high phytoplankton concentration (Figure 1). Satellite observations cover larger areas frequently but only cloud-and ice-free scenes which leads to a temporal bias in the Southern Ocean, where both, sea ice and clouds occur most of the year. In addition, they are limited to only observe the first optical depth, which often limits the detection of the chlorophyll maximum. The development of algorithms for deriving PFT information requires a large in situ dataset with homogeneous temporal and spatial distribution. The DPAs used to estimate PFTs from HPLC pigments assumes that different 560 PFTs have different marker pigments, but it is known that they can also have pigments in common (Hirata et al., 2011).
This ambiguity leads to uncertainties in the in situ database which is, on the one hand, needed as fundamental input for the algorithms of PFT retrievals and, on the other hand, used for direct comparison with model output here. Concerning spectral based methods applied to either in situ or satellite data, it is difficult to distinguish the specific absorption spectra of PFTs (e.g. coccolithophores and Phaeocystis). These and more limitations are discussed by Sathyendranath (2014) and Bracher et al. 565 (2017).

Concluding remarks and outlook
An extensive synthesis of available observational data sets on the Southern Ocean PFTs allowed us to better understand their biogeography. This information was used to infer which types should coexist in which regions, and, therefore, to constrain the model. Leveraging satellite estimates and in situ observations allowed us to define the trait requirements for capturing 570 phytoplankton biogeography in the Southern Ocean, and we set up a model for simulating the distribution of key Southern Ocean PFTs: diatoms, coccolithophores and Phaeocystis. The observed co-occurrence of two different phytoplankton groups, coccolithophores and diatoms in the Subantarctic Zone (Queguiner, 2013;Smith et al., 2017) was clearly simulated by the Darwin-MITgcm model adjusted for the Southern Ocean and in a reasonable agreement with PHYSAT, PhytoDOAS and SynSenPFT satellite products.

575
Our results support the hypothesis that introducing two size classes of diatoms in biogeochemical models is a prerequisite to simulate the observed diatom phenology and PFT distribution in general. We have also shown that the simulated biogeography of coccolithophores is not controlled by temperature itself as reported by Smith et al. (2017), since we did not use a specific for coccolithophores temperature limitation function. It was directly explained by phosphate depleting as well as by low palatability of this PFT for grazers. This confirms our second hypothesis. Nevertheless, we found that the simulation of co-580 occurrence of coccolithophores and Phaeocystis required additional model developments to account for changes in assumed life stage of Phaeocystis (Popova et al., 2007;Kaufman et al., 2017) subject to iron availability (Bender et al., 2018). This parameterization of morphological shifts indeed allows for co-existence of the two types of haptophytes corroborating our third hypothesis on the dependence of Phaeocystis sp. life stages on iron availability. By considering two life stages of Phaeocystis we introduce additional differences in the traits, which along with assumed physiological parameters for coccolithophores 585 makes coccolithophores competitive among phytoplankton of larger cell size requiring higher nutrients concentration to grow or/and among PFTs of similar size -small diatoms and Phaeocystis solitary cells -but of higher palatability factor to be grazed. These additional differences in the traits of distinct haptophytes, coccolithophores and Phaeocystis allows these groups to coexist (e.g. along the Subantarctic and Polar fronts). However, there is still room for improvement, for instance, by specifying more precisely the differences in photophysiology and related optical imprints (Moisan and Mitchell, 2018) for Phaeocystis in 590 single cell and colony phases.
The evidence that coupled ocean/biogeochemical models can capture phytoplankton specific traits in the way that it can consider different aspects of differentiation among phytoplankton groups (biogeochemical role; allometric, photophysiological and optical parameters; accounting for carbon and Chla decoupling) makes them very valuable and skilful instruments. They can combine the knowledge from in situ measurements and remote sensing by exploiting various PFT retrieval principles 595 used (separately) in these observations and relate them to the environmental conditions. Further extension/progress is expected by coupling a radiative transfer model to the biogeochemical model (Gregg and Rousseaux, 2016;Dutkiewicz et al., 2018) allowing to simulate spectrally resolved water leaving radiance and therefore providing perspectives to assimilate explicitly multi-and hyper-spectral satellite information, which might improve PFT prediction.

600
The supplementary material contains a protocol of prior Darwin sensitivity experiments with differently prescribed phytoplankton traits (Section S1); PhytoDOAS diatoms Chla over the Great Calcite Belt (Section S2); seasonal variation of the meridional distribution of zonally averaged phytoplankton composition for four sections of the Southern Ocean (Section S3); in situ and laboratory measurement information on the photophysiological parameters of diatoms and Phaeocystis (Section S4); additional information on model evaluation against in situ HPLC (Soppa et al. 2017) and MAREDAT datasets (Section 605 5); monthly climatology of the PFT dominance obtained with PHYSAT, REF and PHAEO (Section 6). Code and data availability. The model data were generated with the MIT Darwin Project Biogeochemical, Ecosystem, and Optical Model.
The code is part of the MITgcm, which is available from http://mitgcm.org. The specific version of the code used in this study, as well 615 as initial fields can be provided opon request from the corresponding author (svetlana.losa@awi.de). Information about observational data availability is provided in the text (via specified URL).
Competing interests. The authors declare that they have no conflict of interest.  mf unc silicified silicified calcifier P C max is the maximum photosynthetic rate at 30 • C; β is the photoinhibition parameter, applied to Prochlorococcus; ksat is the growth half-saturation constant; r j,k denotes the palatability factor of the particular phytoplankton (j = 1, 2,..., 6), grazing of phytoplankton for small or micro-zooplanktons (k = 1, 2); w sink is the sinking rate; φmax j is the maximum quantum yield of carbon fixation; a * j is the spectrally averaged phytoplankton-specific light absorption; mpj is the mortality rate; mf unc determines the biomineralizing function.   (Longhurst, 1998)   the Southern Ocean fronts Orsi and Harris, 2001) as in Figure 1.  Orsi and Harris, 2001) as in Figure 1.
40 Figure 8. PHAEO climatological monthly mean surface PFT dominance for July (a), December (b), January (c), February (d) (see Figure   2 and Figures S15 and S16 for The model output is masked by the area with sea ice concentration > 87% during respective month. White contours denote the Southern Ocean fronts Orsi and Harris, 2001) as in Figure 1. The model output is masked by the area with sea ice concentration > 75% during respective month.