Quantifying spatiotemporal variability in zooplankton dynamics in the Gulf of Mexico with 1 a physical-biogeochemical model 2

16 Zooplankton play an important role in global biogeochemistry, and their secondary production 17 supports valuable fisheries of the world’s oceans. Currently, zooplankton standing stocks cannot 18 be estimated using remote sensing techniques. Hence, coupled physical-biogeochemical models 19 (PBMs) provide an important tool for studying zooplankton on regional and global scales. 20 However, evaluating the accuracy of zooplankton biomass estimates from PBMs has been a major 21 challenge due to sparse observations. In this study, we configure a PBM for the Gulf of Mexico 22 (GoM) from 1993-2012 and validate the model against an extensive combination of biomass and 23 rate measurements. Spatial variability in a multi-decadal database of mesozooplankton biomass 24 for the northern GoM is well resolved by the model with a statistically significant (p < 0.01) 25 correlation of 0.90. Mesozooplankton secondary production for the region averaged 66 + 8 x 10 6 26 kg C yr -1 , equivalent to ~10% of net primary production (NPP), and ranged from 51 to 82 x 10 6 kg 27 C yr -1 , with higher secondary production inside cyclonic eddies and substantially reduced 28 secondary production in anticyclonic eddies. Model results from the shelf regions suggest that 29 herbivory is the dominant feeding mode for small mesozooplankton (<1-mm) whereas larger 30 mesozooplankton are primarily carnivorous. In open-ocean oligotrophic waters, however, both 31 mesozooplankton groups show proportionally greater reliance on heterotrophic protists as a food 32 source. This highlights an important role of microbial and protistan food webs in sustaining 33 mesozooplankton biomass in the GoM, which serves as the primary food source for early life 34 stages of many commercially important fish species, including tuna.


Introduction
Within marine pelagic ecosystems, zooplankton function as an important energy pathway between the base of the food chain and higher trophic levels such as fish, birds, and mammals (Landry et al., 2019;Mitra et al., 2014).Zooplankton also have a well-documented impact on chemical cycling in the ocean (Buitenhuis et al., 2006;Steinberg and Landry, 2017;Turner, 2015).The ecological roles of zooplankton, however, are varied and taxon dependent.Globally, protistan grazing is the largest source of phytoplankton mortality, accounting for 67% of daily phytoplankton growth (Landry and Calbet, 2004).Protistan zooplankton function primarily within the microbial loop leading to efficient nutrient regeneration in the surface ocean (Sherr and Sherr, 2002;Strom et al., 1997).By contrast, mesozooplankton contribute significantly less to phytoplankton grazing pressure, consuming an estimated 12% of primary production globally (Calbet, 2001), but strongly impact the biological carbon pump.In addition to grazing pressure on phytoplankton, mesozooplankton affect the biological carbon pump through top-down pressure on protistan grazers, production of sinking fecal pellets, consumption of sinking particles, and active carbon transport during diel vertical migration (Steinberg and Landry, 2017;Turner, 2015).
Herbivorous mesozooplankton are particularly important to study as they are often associated with shorter food chains that enable efficient energy transfer from primary producers to higher trophic levels of immediate societal interest such as economically valuable fish species and/or their planktonic larvae.
Zooplankton populations have been identified as being vulnerable to impacts of a warming ocean (Caron and Hutchins, 2013;Pörtner and Farrell, 2008;Straile, 1997), through direct temperature effects on metabolic rates (Ikeda et al., 2001;Kjellerup et al., 2012) and thermal stratificationdriven alterations in food web structure and productivity (Landry et al., 2019;Richardson, 2008).
Studies aimed at monitoring and predicting zooplankton populations are therefore critical for understanding the first-order effects of a warming ocean on marine ecosystems given the importance of secondary production and the impact zooplankton have on biogeochemical cycling.
Despite their importance, zooplankton have been historically sampled with limited temporal and spatial resolution.Unlike ocean hydrodynamics and phytoplankton variability, zooplankton abundance cannot currently be estimated remotely from space.Thus, numerical models provide a useful tool for synoptic assessments of zooplankton stocks on basin and global scales (Buitenhuis et al., 2006;Sailley et al., 2013;Werner et al., 2007).Nonetheless, evaluating the accuracy of zooplankton abundance estimates in numerical experiments, such as three-dimensional physicalbiogeochemical ocean models (PBMs), is a major challenge due to the sparse ship-based observations in most regions (Everett et al., 2017).Consequently, PBMs are typically predominately validated against surface chlorophyll (Chl) from remote sensing products (Doney et al., 2009;Gregg et al., 2003;Xue et al., 2013).
In most marine environments, phytoplankton net growth rates and biomass are determined primarily by the imbalance between phytoplankton growth and zooplankton grazing (Landry et al., 2009).PBMs can accurately predict phytoplankton standing stock (i.e.compare well with satellite Chl observations) despite being driven by the wrong underlying dynamics, leading to major errors in model estimates of secondary production and nutrient cycling (Anderson, 2005;Franks, 2009).For instance, parameter tuning using only surface Chl as a validation metric can allow broad patterns in phytoplankton biomass to be reproduced even with gross over-or underestimation of phytoplankton turnover times.Similarly, even a model that is validated against satellite Chl and net primary production might completely misrepresent the proportion of phytoplankton mortality mediated by zooplankton groups, leading to inaccurate estimates of important ecological metrics like secondary production and carbon export.Hence, validating PBMs against zooplankton dynamics is key to increasing confidence in model solutions.The importance of validation is further evident when considering zooplankton impacts on the behaviors of biogeochemical models (Everett et al., 2017).Differences in simulated zooplankton communities expressed through the number of functional types, various mathematical grazing functions, or the arrangement of transfer linkages have been shown to have substantial impacts on the dynamics of simple and complex biogeochemical models (Gentleman et al., 2003b;Gentleman and Neuheimer, 2008;Mitra et al., 2014;Murray and Parslow, 1999;Sailley et al., 2013).
The Gulf of Mexico (GoM) is a particularly suitable region for examining zooplankton dynamics with PBMs.In the northern and central Gulf, zooplankton abundances have been extensively measured for over three decades (1982-present) by the Southeast Area Monitoring and Assessment Program (SEAMAP).Within the SEAMAP dataset, zooplankton biomass exhibits strong spatiotemporal variability, reflecting complex physical circulation in the GoM.Circulation off the shelf is characterized by substantial upper layer mesoscale activity driven primarily by the energetic Loop Current (Forristall et al., 1992;Maul and Vukovich, 1993;Oey et al., 2005).In contrast, coastal and shelf circulation patterns are predominantly wind-driven (Morey et al., 2003a(Morey et al., , 2013)).Freshwater discharged by the Mississippi River and other smaller rivers is frequently entrained offshore by shelf break interaction with mesoscale features (e.g., anti-cyclonic loop current eddies), leading to strong horizontal and vertical gradients in physical and biogeochemical quantities (Morey et al., 2003b).Overlap of these gradients with the SEAMAP study region result in zooplankton collections across biogeochemically heterogeneous environments, providing a powerful model constraint.For instance, Chl ranges across three orders-of-magnitude (~0.01 -10 mg Chl m -3 ) from oligotrophic to eutrophic waters.
Over the past decade several PBM studies have been conducted in the GoM, all primarily examining nutrient and phytoplankton dynamics.Early work by Fennel et al. (2011) examined phytoplankton dynamics on the Louisiana and Texas continental shelf, concluding that loss terms (e.g., grazing) rather than growth rates dictated accumulation rates of phytoplankton biomass.With the same biogeochemical model, Xue et al. (2013) conducted the first gulf-wide PBM study to investigate broad seasonal biogeochemical variability and to constrain a shelf nitrogen budget.
More recently, Gomez et al. (2018) implemented a biogeochemical model with multiple phytoplankton and zooplankton functional types to gain a more detailed understanding of nutrient limitation and phytoplankton dynamics in the GoM.To examine phytoplankton seasonality and biogeography in the oligotrophic Gulf, Damien et al. (2018) validated a PBM based on a unique subsurface autonomous glider dataset.Together, these studies have demonstrated the utility of PBMs for investigating GoM lower trophic levels and have also highlighted the key ecosystem roles of zooplankton.Specifically, both Fennel et al. (2011) and Gomez et al. (2018) identified the importance of zooplankton in modulating the simulated seasonal patterns of phytoplankton biomass, emphasizing the importance of top-down control on the shelf.Although simulated zooplankton community results were not presented, Damien et al. (2018) noted that biotic processes such as grazing pressure, are "essential to fully understanding the functioning of the GoM ecosystem."However, in all of these studies, zooplankton validation was largely absent.
In this study, we configured a PBM for the GoM to estimate zooplankton abundance and analyze zooplankton community dynamics.The PBM is forced by three-dimensional hydrodynamic fields from a data assimilative Hybrid Coordinate Ocean Model (HYCOM) hindcast of the GoM (http://www.hycom.org)and is based on the biogeochemical model NEMURO (North Pacific Ecosystem Model for Understanding Regional Oceanography; Kishi et al., 2007), which is substantially modified here for application to the GoM.The model is integrated over 20-years (1993-2012) and validated against an extensive combination of remote and in situ measurements including total and size-fractioned mesozooplankton biomass and grazing rates, microzooplankton grazing rates, phytoplankton growth rates and net primary production as well as validation of surface chlorophyll and vertical profiles of chlorophyll and nitrate.Our goals were: 1) to develop and validate a PBM to estimate mesozooplankton abundance, 2) to characterize the spatiotemporal variability in mesozooplankton dietary composition, and 3) to quantify regional mesozooplankton secondary production.We focus primarily on the oligotrophic open-ocean GoM, where prey (i.e. zooplankton) availability may be limiting for fish, their larvae and other higher trophic levels.

NEMURO model description
The biogeochemical model for this study is based on NEMURO (Kishi et al., 2007) but has been modified and parameterized to more accurately reflect the ecology of the GoM (herein called NEMURO-GoM).NEMURO is a concentration-based, lower-trophic-level ecosystem model originally developed and parameterized for the North Pacific.Like most marine functional-group biogeochemical models, it is structured around simplified representations of the lower food web originating from earlier nutrient-phytoplankton-zooplankton models (Fasham et al., 1990;Franks, 2002;Riley, 1946).Complexity is added through additional state variables and transfer functions with the specific goal of resolving dynamics within the nutrient, phytoplankton and zooplankton pools.In total, NEMURO has eleven state variables: six non-living state variablesnitrate (NO3), ammonium (NH4), dissolved organic nitrogen (DON), particulate organic nitrogen (PON), silicic acid (Si(OH)4) and particulate silica (Opal); two phytoplankton state variablessmall (SP) and large phytoplankton (LP); and three zooplankton state variablessmall (SZ), large (LZ) and predatory zooplankton (PZ).
Each biological state variable in NEMURO is an aggregated representation of taxonomically diverse plankton groups that function similarly in the ecosystem.The phytoplankton community is modeled as two functional types of obligate autotrophs: small phytoplankton (SP, predominantly cyanobacteria and picoeukaryotes in the GoM) and large phytoplankton (LP, diatoms).Small zooplankton (SZ) represent heterotrophic protists, and metazoan zooplankton are divided into suspension-feeding mesozooplankton (LZ) and predatory zooplankton (PZ).Here, we assume that LZ and PZ are non-migratory.Heterotrophic bacteria are implicitly represented by temperaturedependent decomposition rates, which represent nitrification and remineralization processes.
NEMURO uses nitrogen as a model "currency" since it is the major limiting macronutrient in much of the ocean.Silica is also included as a potentially co-limiting nutrient for diatoms (i.e.LP).
By default, sinking in NEMURO is restricted to PON and Opal, and benthic processes are not included.Here, because of the large shelf area in the GoM, we implemented a simple diagenesis of PON/Opal to NO3/SiO4 and removal of PON/Opal through sedimentation, where 1% of the flux sinking out of bottom cell was removed and 10% converted back into NO3/SiO4.However, we found that this had no significant impact on the simulated surface Chl or mesozooplankton biomass on the shelf.The inclusion of a more complex sediment diagenesis model (including denitrification) would have added further realism (Fennel et al., 2011).However, our main focus was to evaluate zooplankton dynamics in the oligotrophic region where higher trophic levels that depend on mesozooplankton secondary production may experience food limitation and where benthic processes are negligible.
NEMURO was chosen for the present study because it distinguishes SZ, LZ and PZ, permitting a detailed analysis of dynamics for multiple functional types in the GoM zooplankton community.
During initial GoM simulations, default NEMURO parameterizations, configured for the North Pacific (Kishi et al., 2007), substantially overestimated surface nitrate, surface Chl, and mesozooplankton biomass relative to observations.We attribute these differences to: 1) substantially higher temperatures in the GoM compared with the North Pacific, which significantly increase decomposition and growth rates in the model resulting in higher nutrient recycling and elevated near-surface stocks of phytoplankton and zooplankton, and 2) distinct differences in taxonomic composition of phytoplankton and zooplankton communities in the GoM and North Pacific, with significant differences in key parameter values associated with growth and grazing.
For more details on the specific processes represented and the interactions between state variables in NEMURO, we direct readers to Kishi et al. (2007) (USGS) and due to a lack of observations for other rivers was prescribed for all 37 rivers represented in the model.

Modifications to default NEMURO model
To improve realism for application to the GoM, five structural changes were made to the original NEMURO model.First, we removed the SP to LZ grazing pathway.The original SP state variable for the North Pacific represents nanophytoplankton (e.g.coccolithophores), which can be important prey of copepods and other mesozooplankton.In the GoM, however, cyanobacteria and picoeukaryotes (too small for direct feeding by most mesozooplankton) comprise much of the phytoplankton biomass and hence are represented as SP in our model.In addition to adding ecological realism, this change in direct trophic connection between SP and LZ allowed the model to produce a more realistic LP-dominated phytoplankton community on the shelf (see Discussion).
Next, quadratic mortality was replaced with linear mortality for all biological state variables with the exception of predatory zooplankton (PZ).In biogeochemical models, quadratic mortality is often used for numerical stability and/or to represent implicit loss terms to an un-modeled parasite or predator that covaries in abundance with its prey (e.g.viral lysis of phytoplankton or predation by un-modeled higher predators) (Anderson et al., 2015).However, grazing mortality is explicitly modeled in NEMURO and viral mortality is generally not a substantial loss term for bulk phytoplankton (Staniewski and Short, 2018).Quadratic mortality was retained for PZ, to account for predation pressure of un-modeled predators (e.g.planktivorous fish).During the model tuning process, we found that removal of quadratic mortality from the four other plankton functional groups was an important parameterization change that allowed the model to simulate more realistic mesozooplankton biomass in the oligotrophic GoM (see Discussion).
The default ammonium inhibition term and light limitation functional form in NEMURO were replaced in NEMURO-GoM with more widely adopted parameterizations.The exponential ammonium inhibition term in the nitrate limitation function was replaced with the term described by Parker (1993), as has been done in previous PBM studies (Fennel et al., 2006)

NEMURO-GoM model tuning procedure
In total, NEMURO includes 71 parameters, 23 of which were modified in the present study.For initial model tuning, we used an idealized one-dimensional model designed to mimic the oligotrophic GoM.To guide our tuning procedure, we relied on a semi-quantitative approach where the one-dimensional model solution was evaluated based on five ecosystem benchmarks.
Target values for benchmarks and other ecosystem attributes were determined from observations or a theoretical basis.Ecosystem benchmarks included: surface Chl, mesozooplankton biomass, DCM depth, DCM magnitude, and SP:LP ratio.Surface Chl and mesozooplankton biomass were chosen as benchmarks to evaluate the realism of plankton biomass in the model.The DCM depth and magnitude were chosen to evaluate the vertical structure of the simulated ecosystem, and SP:LP ratio was used to gauge the realism of the plankton community composition (i.e.high SP:LP is expected in the oligotrophic GoM).The model was also tuned by considering the relative magnitudes of loss terms for phytoplankton (grazing, mortality, respiration, and excretion), total protistan zooplankton grazing relative to mesozooplankton grazing, as well as surface and deep nitrate concentrations.We outline each parameter change, justification and the resulting impact on the ecosystem benchmarks simulated by the idealized one-dimensional model in Supplement S3.
Where possible, we modified parameters in groups so that relative changes were consistent throughout the model (e.g.doubling all zooplankton mortality terms).After tuning in the onedimensional model, parameter sets were implemented into the full three-dimensional model where additional tuning was performed.Once a final parameter set was determined we conducted a parameter sensitivity analysis over 18 individual experiments to identify impacts of parameter changes from default NEMURO values (S4).

Description of the offline numerical environment
To run large numbers of three-dimensional simulations efficiently for basin-scale tuning, NEMURO-GoM was run offline using the MITgcm offline tracer advection package.MITgcm was selected as it contains convenient packages for running offline simulations (McKinley et al., 2004).That is, the dynamical equations of motion are not computed during the NEMURO-GoM integration, but rather the physical prognostic variables (i.e., temperature, salinity and threedimensional velocity fields) are prescribed from daily-averaged flow fields saved from a previous hydrodynamic model integration.This allows the recycled use of flow fields leaving only the tracer equations to be computed.In the offline MITgcm package, the prognostic variables provide input to an advection scheme and mixing routine that conservatively handles offline advection and diffusion of the biogeochemical tracer fields.MITgcm has many options for linear and non-linear advection schemes.Here we use a 3 rd order direct space time flux limiting scheme.Sub grid-scale mixing of the biogeochemical fields is handled offline through the nonlocal K-Profile Parameterization (KPP) package based on mixing schemes developed by Large et al. (1994).For more information about the MITgcm packages, we direct readers to the MITgcm manual (http://mitgcm.org/).
There are two main advantages to running PBMs in an offline environment: 1) the momentum equations are not integrated during the model run; and 2) the physical time step is no longer bound by the dynamical Courant-Friedrichs-Lewy (CFL) numerical stability criterion, which together significantly reduces the computational cost.Instead, the stability of the tracer advection scheme and time scales needed to resolve biological/physical processes of interest set the limits on the time steps and prescription frequencies of flow fields.When the physical time step is shorter than the flow field prescription frequency, a simple linear interpolation of the flow fields is performed between time steps.Offline simulations of tracer advection have been found to closely resemble online runs (that is, computed together with the integration of the hydrodynamic model's prognostic equations) when the three-dimensional flow fields are prescribed at a frequency that is at or below the inertial period (T = 2π/f, TGoM >24 hr) for a region (Hill et al., 2005).
In the present study, the offline time step (30 minutes) is an order of magnitude greater than the hydrodynamic model's (HYCOM-GoM, described in Section 2.2.2) baroclinic time step (120 seconds).For reference, HYCOM-GoM required ~76 days to run to completion on 64 parallel cores.These time requirements would increase considerably with the 11 additional biogeochemical tracers used in NEMURO.In contrast, NEMURO-GoM ran significantly faster, taking a total of ~50 hours on 80 parallel cores.Offline models offer a valuable tool for integrating PBMs particularly as spatial resolution and complexity in these models continues to increase (e.g., DARWIN (Follows et al., 2007), GENOME (Coles et al., 2017)).While computationally advantageous, however, offline simulations have inherently greater input and output (I/O) demands that can become bottlenecks in some applications.Issues with conservation can also arise as three-dimensional advection schemes are only approximately positive definite.

Description of the offline dynamical fields
The NEMURO-GoM model is "forced" by daily averaged three-dimensional velocity, temperature approximately identical profiles of vertical velocity to those in H-GoM fields.For mapping of temperature and salinity fields (used in the KPP mixing routine and for scaling biological temperature dependent rates), a simple linear interpolation was performed.

Surface chlorophyll observations
A benchmark for surface Chl was determined using the Sea-Viewing Wide Field-of-View Sensor

Mesozooplankton biomass observations
To evaluate model mesozooplankton biomass estimates, we used plankton tows collected during Southeast Area Monitoring and Assessment Program (SEAMAP) surveys in the northern and central GoM.In total, 11,781 plankton tows were collected from 1983-2017, with two main annual surveys in the spring (offshore) and fall (shelf) (Fig. 1).On average, SEAMAP collected approximately 300 samples per year with a specific sampling array offshore and more general sampling coverage on the shelf.In total, 6,835 samples were used for direct point-to-point modeldata comparisons.Samples were collected using standard gear consisting of a 61-cm diameter bongo frame fitted with two 333-μm mesh nets.The nets were fished in a double-oblique tow pattern from the surface down to 200 m or 5 m off the bottom and back to the surface.Simultaneous samples were also collected using a 202-μm mesh net during 82 tows.Of these samples, roughly half were collected in the oligotrophic GoM.The average ratio between biomass measured in the 333-and 202-μm bongo tows (0.5093 + 0.12) was used to convert 333-μm samples so that direct comparisons could be made with model mesozooplankton (LZ+PZ) biomass fields.Zooplankton biomass was originally quantified as displacement volumes (DV).Carbon mass (CM) equivalents were subsequently calculated as log10(CM) = (log10(DV) +1.434)/0.820(Wiebe, 1988;Moriarty and O'Brien, 2013).
For comparison to in situ biomass measurements size ranges were assumed for zooplankton state variables (see section 2.3.4).In NEMURO-GoM, SZ represent heterotrophic protists (e.g.ciliates) and early stages of mesozooplankton, which are typically < 200 μm in the ocean.Hence, SZ are assumed to range in size from 2-200 μm.LZ are considered to represent small mesozooplankton ranging in size from 0.2 to 1.0 mm while PZ represent large mesozooplankton ranging in size from 1.0 to 5.0 mm.For comparison to the SEAMAP climatology the model LZ and PZ fields were first converted to units of carbon assuming Redfield C:N ratio and summed over the entire simulation before being depth averaged to the bottom or 200 m.For point-to-point model-data comparisons, total mesozooplankton biomass fields were interpolated to SEAMAP sample locations/times and then depth averaged to the corresponding sample tow depth.

Observed vertical profiles of chlorophyll and nitrate
Depth profiles of Chl were also collected during SEAMAP surveys using a SeaBird WETStar fluorometer attached to a CTD.Calibration of the fluorimeter was infrequent, and thus profiles were used to determine the depth of the fluorescence maxima for comparisons to DCM depths in the model.In total, 2,435 profiles were collected from 2003 to 2012, with 1,052 profiles overlying bottom depths >1000 m.Profiles were available for earlier SEAMAP surveys; however, no standard QA/QC protocol for fluorometer data was in place prior to 2003.
To evaluate DCM magnitudes in the model, we used 145 fluorescence profiles collected during May 2017 and 2018 process study cruises (see section 2.3.4).The fluorometer was attached to a CTD and calibrated using 126 in situ Chl samples.Chl concentrations were determined from filtered samples collected at depths ranging from 5 to 115 m using High Performance Liquid Chromatography (HPLC).Since the cruise sampling does not overlap with our NEMURO-GoM simulation period, model-data comparisons were made for all 20 years of the model run using sample locations and time of the year.This was also done with other field measurements from the process cruises (see section 2.3.4).
For model-data comparisons of nitrate, we utilized profiles from the World Ocean Database (WOD).In total, 96 profiles were available during our simulation period and located in the oligotrophic GoM (>1000 m isobath).Profiles were collected during all months except March and December with the majority of samples collected during May, July and August (Fig. 1A).

Biomass and rate measurements from process study cruises
Although in situ rate measurements are made much less frequently than biological standing stock measurements, they offer very powerful constraints for validating the internal dynamics of a biogeochemical model (Franks, 2009).Consequently, we made phytoplankton and zooplankton rate measurements on two cruises in the open ocean GoM in May 2017 and 2018 and used these measurements to validate the model (Fig. 1A).On the process study cruises, we utilized a quasi-Lagrangian sampling scheme to investigate plankton dynamics in the oligotrophic GoM.Two drifting arrays (one sediment trap array and one in situ incubation array) were deployed to serve as a moving frame of reference during ~4-day studies ("cycles") characterizing the water parcel (Landry et al., 2009;Stukel et al., 2015).During these cycles, we measured daily profiles of Chl, photosynthetically active radiation, phytoplankton growth rates and productivity, protistan grazing rates, and size-fractionated mesozooplankton biomass and grazing rates.
Size-fractionated mesozooplankton biomass and grazing rates were determined from daily daynight paired oblique ring-net tows (1-m diameter, 202-μm mesh).In total, 40 oblique bongo net tows (16 in 2017 and 24 in 2018) sampled the oligotrophic GoM mesozooplankton community from near surface to a depth ranging from 100-135 m.Upon recovery, the sample was anesthetized using carbonated water, split using a Folsom splitter, filtered through a series of nested sieves (5, 2, 1, 0.5, and 0.2 mm), filtered onto pre-weighed 200-µm Nitex filters, rinsed with isotonic ammonium formate to remove sea salt, and flash frozen in liquid nitrogen.In the lab, defrosted samples were weighed for total wet weight, and subsampled in duplicate (wet weight removed) for gut fluorescence analyses.The remaining wet sample was dried and subsequently reweighed and combusted for CHN analyses to determine total dry weight and C and N biomasses.Gut fluorescence subsamples were homogenized using a sonicating tip, extracted in acetone, and measured for Chl and phaeopigments using the acidification method.The phaeopigment concentrations in the zooplankton guts were the basis for calculated grazing rates using gut turnover times based on temperature relationships for mixed zooplankton assemblages.For additional details, see Décima et al. (2011) and Décima et al. (2016).
Briefly, one 2.8-L polycarbonate bottle was gently filled with whole seawater taken from six depths (from the surface to the depth of the mixed layer).A second 2.8-L bottle was then filled with 33% whole seawater and 67% 0.2-μm filtered seawater.Both bottles were then placed in mesh bags and incubated in situ at natural depths for 24 h.These experiments were conducted on each day of the ~4-day cycle.After 24 h, the bottles were retrieved, filtered onto glass fiber filters, and Chl concentrations were determined using the acidification method (Strickland and Parsons., 1972).Net growth rates (k=ln(Chlfinal/Chlinit)) in each bottle were determined relative to initial Chl samples.Phytoplankton specific mortality rates resulting from the grazing pressure of protists were calculated as m = (kd -k0)/(1-0.33),where kd is the growth rate in the dilute bottle and k0 is the growth rate in the control bottle.Phytoplankton specific growth rates were calculated as μ = k0 + m.For additional details, see Landry et al. (2016) and Selph et al. (2016).Phytoplankton net primary production was quantified at the same depths by H 13 CO3 -uptake experiments.Triplicate 2.8-L polycarbonate bottles and a fourth "dark" bottle were spiked with H 13 CO3 -and incubated in situ for 24 h at the same sampling depths as for the dilution experiments.Samples were then filtered and the 13 C: 12 C ratios of particulate matter determined by isotope ratio mass spectrometry.

Surface chlorophyll model-data comparisons
Model surface Chl estimates were found to agree closely with satellite observations reproducing patterns in both the oligotrophic and shelf region (Fig. 2).Spatial covariance between SeaWIFS climatology and model surface Chl climatology (calculated with daily cloud cover mask applied) is statistically significant (p < 0.01) with a correlation (ρ) of 0.72.When model estimates are compared to all 22,244,513 SeaWIFS measurements at corresponding times and locations (i.e. daily grid cell pairs), we find a ρ value of 0.50 (p < 0.01).To facilitate more detailed model-data comparisons, the GoM domain was divided into an oligotrophic region (>1000 m bottom depth) and a shelf region (<1000 m bottom depth).In the oligotrophic region, the correlation between model-data daily grid cell pairs is significant but weak (ρ = 0.17, p < 0.01) as a result of relatively low large-scale spatial variability, and hence dominance at the mesoscale.However, bias is quite low (-0.014mg Chl m -3 ), equivalent to 10% of the observed mean.In the shelf region, the correlation is higher (ρ = 0.47, p < 0.01) yet the bias is greater (+0.90 mg Chl m -3 ), equivalent to 92% of the mean.Previous GoM studies have determined ρ values for monthly averages, which we calculate here for comparison.Based on 30-day averages, the ρ values are 0.70 (p < 0.01) for the oligotrophic region and 0.26 (p < 0.01) for the shelf region.
In addition to resolving the dominant spatiotemporal variability, the model also captures the amplitude of the seasonal surface Chl signal reasonably well.In the oligotrophic region, the model accurately estimates the observed annual surface Chl minimum (Model: 0.065 + 0.005 vs.
SeaWIFS: 0.065 + 0.007 mg Chl m -3 ) while slightly underestimating the observed annual maximum (Model: 0.47 + 0.15 vs. SeaWIFS: 0.75 + 0.55 mg Chl m -3 ).When model estimates for the entire oligotrophic region are taken into account (i.e.not restricted to satellite measurement locations and times), the annual minimum develops in early September, and the maximum develops in late January (Table 1).In the shelf region, greater model-data mismatch exists for surface Chl, with the model overestimating the observed annual minimum by 15% (Model: 0.23 + 0.09 vs. SeaWIFS: 0.20 + 0.07 mg Chl m -3 ) and the observed annual maximum by 102% (Model: 8.09 + 1.31 vs. SeaWIFS: 4.01 + 1.23 mg Chl m -3 ).Here, we find the annual surface Chl seasonal cycle almost completely out of phase with the oligotrophic region, with the annual minimum developing in early February and the annual maximum developing at the end of July (Table 1).

Regional mesozooplankton biomass model-data comparisons
Model mesozooplankton biomass (i.e.LZ + PZ) fields also agree closely with observations in both the oligotrophic and shelf region (Fig. 3).Spatial covariance between SEAMAP climatology and Although seasonal cycles in the oligotrophic and shelf regions could not be derived from the SEAMAP dataset given the significant differences in sampling locations over the course of a year, we investigated model-data mismatches for each month.The model closely matches or slightly underestimates mesozooplankton biomass for most of the year, with the exception of January, May and August (Fig. 3A).The largest model-data mismatch occurs during March, June, July and December, when the model underestimates mesozooplankton biomass by approximately 35%.
Unlike surface Chl, the total mesozooplankton biomass (i.e.depth-integrated) seasonality is similar in both regions of the GoM.In the oligotrophic region, the annual biomass minimum (maximum) occurs at the beginning of January (middle of May), while in the shelf region, the annual minimum (maximum) occurs in late December (end of May) (Table 1).

Chlorophyll and nitrate profile model-data comparisons
To

Size fractionated mesozooplankton biomass and grazing model-data comparisons
To further constrain the phytoplankton and zooplankton community simulated by NEMURO-GoM, we utilized in situ measurements collected during the process study cruises.First, we compared the relative proportions of LZ and PZ biomass to four discrete size classes measured at sea (Fig. 5A, C).In both model and observations, we find nearly identical size distributions assuming that LZ approximates the smallest two size classes of mesozooplankton sampled ("small mesozooplankton", 0.2-1.0-mm)and PZ approximates the largest two size classes ("large mesozooplankton", 1.0-5.0mm).In the field data, small mesozooplankton biomass varied from 33 to 46 % (median = 40%, at 95% C.I.), while model estimates of LZ biomass vary from 31 to 46% (median = 40%).Large mesozooplankton biomass in the field data varied from 54 to 67% (median = 60%), while model estimates of PZ biomass vary from 54 to 69% (median = 60%).Whiskers extend to 95% confidence interval.Outliers for model estimates are not shown.
Mesozooplankton specific grazing rates measured during the process study cruises were also used to validate the simulated mesozooplankton community.Field measurements showed that specific grazing rates (μg Chl mg C -1 d -1 ), decreased consistently with increasing mesozooplankton sizeclass (Fig. 5B).For model-data comparisons, we computed grazing on LP by LZ and PZ at each depth.Grazing terms were converted into units of Chl using the model estimated C:Chl ratio for LP before being depth-integrated to the corresponding net tow depth and normalized to simulated depth-integrated LZ and PZ biomasses.We find that model mesozooplankton grazing estimates capture the general trend of decreased specific grazing rates with increasing mesozooplankton size (Fig. 5D).However, the model overestimates grazing by small mesozooplankton while underestimating grazing by large mesozooplankton.In the field data, small mesozooplankton grazing ranges from 1.34 to 2.51 μg Chl mg C -1 d -1 (median = 1.85) while model estimates of LZ grazing rates vary from 3.64 to 8.14 μg Chl mg C -1 d -1 (median = 6.01).Field measurements of large mesozooplankton grazing range from 0.76 to 1.44 μg Chl mg C -1 d -1 (median = 0.94), while model estimates of PZ grazing vary from 0.44 to 0.70 μg Chl mg C -1 d -1 (median = 0.58).In terms of total grazing, the model average is also considerably higher (2.99 + 2.20 μg Chl mg C -1 d -1 ) relative to field measurements (1.38 + 0.59 μg Chl mg C -1 d -1 ) (see Discussion).

Phytoplankton growth and microzooplankon grazing model-data comparisons
Measurements of specific phytoplankton growth rates, phytoplankton mortality due to microzooplankon grazing, and net primary production (NPP) were used to evaluate protistan dynamics in the model.We find the model underestimates phytoplankton growth and microzooplankton grazing while overestimating NPP (Fig. 6A, B).Phytoplankton specific growth rates from dilution experiments range from 0.50 to 0.66 d -1 (median = 0.55 d -1 ) while model estimates of phytoplankton (SP+LP) specific growth rates vary from 0.13 to 0.27 d -1 (median = 0.21 d -1 ).In terms of microzooplankton grazing rates, field data range from 0.19 to 0.55 d -1 (median = 0.39 d -1 ) while model estimates of SZ vary from 0.10 to 0.21 d -1 (median = 0.16 d -1 ).NPP estimates show better agreement with observations with rates from 276 to 360 mg C m -2 d -1 (median = 321 mg C m -2 d -1 ) in field data while model estimates vary from 190 to 741 mg C m -2 d -1 (median = 431 mg C m -2 d -1 ).
Although the model underestimates phytoplankton growth and microzooplankton grazing rates, the relative proportion of NPP consumed by protists in the model (67 -85%; median = 76%) compares reasonably well to field measurements (55 -92%; median = 72%) (Fig. 6C).Notably, the model average proportion of phytoplankton production consumed by protists closely matches the mean for all tropical waters reported by Calbet & Landry (2004).When phytoplankton mortality due to mesozooplankton grazing was evaluated in the model at cruise sample locations we find mesozooplankton grazing accounts for 13 + 8 %, which also closely agrees with the global average (Calbet et al., 2001).

Simulated mesozooplankton diet
After model tuning and validation, we utilized NEMURO-GoM to investigate spatiotemporal variability in diet and secondary production of the GoM mesozooplankton community.First, we examined the trophic level of LZ and PZ in the model, which provides a measure of their cumulative diet.Trophic level is calculated by computing the dietary contributions of each prey in LZ (i.e.LP and SZ) and PZ diets (i.e.LP, SZ, and LZ), assuming that the trophic level of LP = 1 and SZ = 2.In the oligotrophic region, both LP and SZ contribute approximately 50% to LZ diet, as indicated by a mean trophic level near 2.5 (2.54 + 0.02) for LZ (Fig. 7A).In the same region, PZ have a trophic level of 2.78 + 0.04 indicating a higher contribution of zooplankton to their diet (i.e.SZ and/or LZ) (Fig. 7B).In the shelf region, LZ are more herbivorous, as indicated by a decrease in trophic level to 2.31 + 0.01, while PZ are more carnivorous, as indicated by an increase in trophic level to 2.90 + 0.04.
Despite little evidence for LZ diets dominated by zooplankton in the annual average (in contrast to PZ, which often have a trophic level ~3), we commonly find regions in instantaneous fields during both winter and summer conditions where SZ are the dominant prey source for LZ (Fig. 7C, E).These regions, typically in the Loop Current or Loop Current Eddies (LCEs), highlight the episodic importance of heterotrophic protists as prey sources for small mesozooplankton in the GoM.High proportions of SZ in LZ diets can be attributed to the competitive advantage of SP over LP in extremely low nutrient environments such as in the Loop Current, resulting in high abundances of SP and their predators (SZ) relative to LP. Instantaneous fields also reveal that phytoplankton can be important prey for PZ as well, particularly during summer, as indicated by trophic levels of around 2.5 in the western GoM (Fig. 7F).In addition to strong variability in trophic positions, there are also regions in the oligotrophic GoM, most clearly in the centers of LCEs during summer, where the model predicts no feeding by mesozooplankton (Fig. 8E).The convergent anti-cyclonic circulation of LCEs is typically associated with low phytoplankton biomass, which at times may fall near or below feeding thresholds in the NEMURO grazing formulation.This formulation is intended to simulate suppression of feeding activity for zooplankton at mean prey densities that cannot support the energy expended while searching for prey.
To investigate which prey contributes most to LZ and PZ diets, we computed each prey source term for both LZ and PZ at each grid cell (Fig. 8).As one would expect, the dominant prey for LZ and PZ align closely with spatial variability in their respective trophic positions.For LZ diet, herbivory dominates throughout the GoM, except for the Loop Current (Fig. 8A).LP contribution to LZ diet is highest on the shelf, where LP biomass is also high due to the competitive advantage of LP over SP in high nutrient conditions.In contrast, PZ diet varies with the relative availability of SZ and LZ prey.In the oligotrophic region, PZ feed mainly on SZ (heterotrophic protists) because LZ biomass is relatively low.On the shelf, they consume primarily LZ (Fig. 8D).Despite the significant change in diet between the shelf and oligotrophic regions, PZ trophic positions remain fairly consistent (Fig. 7D).This is due to the fact that SZ in the oligotrophic region and LZ in the shelf region both feed predominantly on phytoplankton and hence occupy similar trophic levels.In the instantaneous fields for winter (Fig. 8B, E) and summer (Fig. 8C, F), the dominant prey for both LZ and PZ show substantial mesoscale variability, indicating that oceanographic features such as fronts and eddies influence not only biomass but also zooplankton ecological roles (see Discussion).

Simulated mesozooplankton secondary production
To our knowledge, regional secondary production for the GoM has not been quantified previously.
Based on our model, secondary production due to mesozooplankton averages 66 ± 8 x 10 6 kg C yr - 1 , and ranged from a minimum of 51 x 10 6 kg C (in 1999) to a maximum of 82 x 10 6 kg C (in 2011).In the oligotrophic region, LZ secondary production averages 35 ± 5 mg C m -2 d -1 while PZ secondary production is 11 ± 2 mg C m -2 d -1 (Fig. 9).The annual secondary production minimum develops at the end of December while the annual maximum occurs at the beginning of June (Table 1).In this region, mesozooplankton secondary production is 14 ± 2 x 10 6 kg C yr -1 , equivalent to 6% of NPP.On the shelf, secondary production is about 4-fold higher, with LZ production of 146 ± 17 mg C m -2 d -1 and PZ production of 42 ± 5 mg C m -2 d -1 .Here, the annual minimum also occurs at the end of December while the maximum occurs later at the end of July (Table 1).On the shelf, secondary production constitutes a higher proportion of NPP (13%) and averages 51 ± 6 x 10 6 kg C yr -1 .
In addition to differences in total secondary production, significant differences were found in the mesozooplankton community response to changes in total phytoplankton biomass on the shelf and in the oligotrophic region.On the shelf, the average ratio between LZ and PZ secondary production is 3.51 and remains almost constant with increasing phytoplankton biomass (ρ = 0.13, p < 0.01).
Although we find a similar average value in the oligotrophic region (3.14), ratios are more variable and strongly dependent on phytoplankton biomass (ρ = 0.52, p < 0.01).Ratios of LZ to PZ secondary production reached values of ~2.5 in the lowest phytoplankton biomass regions of the open ocean GoM and increased to ~4.0 during times and places where local phytoplankton biomass was high.These differences likely stem from the longer turnover times of PZ, which make them less sensitive to variability in bottom-up drivers and allows them to have a proportionally greater role in oligotrophic settings.
As witnessed in the instantaneous fields of diet and secondary production, mesoscale eddies are common features in the GoM and hence important to quantify for regional zooplankton dynamics.
To investigate secondary production inside cyclonic and anticyclonic eddies we implement the TOEddies eddy detection algorithm (Laxenaire et al., 2018) which uses surface velocities along closed contours of sea surface height (SSH) for detection of mesoscale eddies (Chaigneau et al., 2011;Laxenaire et al., 2019;Pegliasco et al., 2015).Grid cells located inside each eddy are defined within the SSH contour associated with the maximum mean surface velocity (interior grid cells).Grid cells located between the outer most closed contour and within 1.5 radius of the eddy center and not within another eddy were used to define background conditions outside of eddies (exterior grid cells).Only eddies with areas larger than an equivalent circular diameter of 100 km and not within the Loop Current were considered in the analysis.On average, 3.78 cyclonic and 3.33 anticyclonic eddies were identified in each daily velocity field.

Discussion
Many parameters in biogeochemical models are poorly constrained by observations and laboratory studies and/or highly variable in the environment.The numbers and uncertainties around these parameters allow PBMs with varying degrees of tuning to reproduce a single ecosystem attribute (e.g., surface Chl) even if multiple processes are inaccurately represented (Anderson, 2005;Franks, 2009).Once validated, one of the main values of coupling physical and biogeochemical models (i.e.PBMs) is their utility for making inferences about portions of the lower trophic level that are under sampled and/or difficult to measure in the field.If PBMs are to be utilized to explain variability rather than simply fit an observational dataset, multiple ecosystem attributes, must be validated and the underlying model structure and assumptions critically evaluated.In the section below, we further justify changes to model structure by evaluating the underlying assumptions in default NEMURO and discuss model-data mismatch before drawing conclusions about the GoM zooplankton community and the implications of its dynamics for higher trophic levels.

Justification for NEMURO modifications
The phytoplankton community in the North Pacific (NP) domain where NEMURO was originally designed is largely composed of nanoplankton (original SP) and microplankton (original LP).By default, SP is assumed to represent coccolithophores and autotrophic nanoflagellates, which can be important prey of copepods and other mesozooplankton in temperate and subpolar regions (Kishi et al., 2007).However, in tropical regions such as the GoM, smaller picophytoplankton taxa typically dominate, particularly in highly oligotrophic regions.Common picophytoplankton of the GoM include cyanobacteria and picoeukaryotes, which are too small for most mesozooplankton to feed on.Hence, the SP to LZ grazing pathway was removed in our model.We found that removal of this grazing pathway allowed the model to simulate a more realistic phytoplankton community on the shelf region.Despite intuition, SP largely dominated the shelf region in the model when LZ were allowed to graze on SP.After closer inspection, we found that grazing of SP sustained LZ biomass on the shelf to levels where top-down pressure constrained LP standing stocks.This prevented large blooms of LP, leading to a competitive advantage for SP even in highly eutrophic conditions (the Mississippi river delta), which was observed for a wide range of LP maximum growth rates, LP half-saturation constants, and LZ/PZ grazing rates.Thus, removal of the SP to LZ grazing pathway added ecological realism and improved the model solution.
During the model tuning process (outlined in the supplemental), we also found that, despite a wide range of tested parameter sets, the model was unable to simulate mesozooplankton biomass low enough to match SEAMAP observations in the oligotrophic region.Even with unrealistically low phytoplankton biomass, equivalent to approximately 50% of surface Chl observed in SeaWIFS images, the model overestimated mesozooplankton biomass.To achieve realistic levels of mesozooplankton biomass in the oligotrophic region, default LZ and PZ mortality parameter values needed to be increased by an order of magnitude.However, this produced unrealistically high loss rates on the shelf region, leading to mesozooplankton biomass estimates that were substantially lower than SEAMAP shelf observations.Implementation of linear mortality on all biological state variables (except PZ) resolved this issue by providing the model with a greater dynamic range.In NEMURO, and other biogeochemical models, quadratic mortality is often used to increase model stability and/or is mechanistically justified as representing the impact of unmodeled predators that co-vary in abundance with prey (Gentleman and Neuheimer, 2008;Steele and Henderson, 1992).However, grazing losses of all state variables (except PZ), are already explicitly modeled in NEMURO by default.Hence, removal of quadratic mortality also added ecological realism and improved the model solution.Quadratic mortality was retained for PZ to account for the implicit predation pressure of un-modeled predators (e.g.planktivorous fish).

Surface chlorophyll discrepancies
Within our model-data comparisons of surface Chl we find that NEMURO-GoM reproduces important patterns in both the oligotrophic and shelf region.The latter of which, apart from the northern shelf, has not been well resolved by previous PBMs (e.g., Gomez et al., 2018;Xue et al., 2013).The absence of a shelf Chl signature may, in some cases, be overly attributed to bias in satellite measurement due to high concentrations of colored dissolved organic matter (CDOM).
While a clear shelf signature is well resolved in NEMURO-GoM, the model-data mismatch is greater on the shelf compared to oligotrophic regions.This is an expected result considering that the model incorporates climatological river forcing while actual variability is much more complex.
Furthermore, the absence of CDOM in the model likely contributes to the overestimation of phytoplankton biomass on the shelf.
In future studies, the inclusion of daily nutrient data like that produced for the Mississippi River by USGS starting in 2011 is needed for PBMs to better resolve variability on the shelf.Including benthic processes, such as denitrification (Fennel et al., 2006), may also reduce model-data mismatch in shelf regions.Implementing more realistic light attenuation (e.g.wavelength-specific light attenuation or inclusion of CDOM) could further improve estimates of phytoplankton biomass on the shelf as primary production can be sensitive to different light attenuation formulations (Anderson et al., 2015).In our model, it was difficult to simulate deep DCMs in the oligotrophic GoM while also simulating DCMs on the shelf that were shallow enough to maintain high nitrate.This may reflect the need for more realistic light attenuation in the model.
Quantifying uncertainty in C:Chl ratios is also an important task moving forward as future PBMs  . Knapp, pers. comm.).
Although modifying the boundary conditions may be justified, deepening the nitracline at the boundaries made it increasingly difficult to sustain realistic surface phytoplankton biomass in the oligotrophic GoM.This may point to the importance of nitrogen fixing cyanobacteria, which provide an alternative source of new nitrogen (other than upwelling and mixing) that could be supporting phytoplankton at the surface given the strong stratification and deep nitraclines in the GoM.In the process of model tuning, we noticed that increasing the DON pool by increasing the PON to DON decomposition rate was necessary to maintain both relatively deep nitraclines and realistic surface Chl by providing a slow leeching of ammonium near the surface through bacterial communities.The need for this slow production of ammonium in surface layers may compensate for nitrogen fixation, which is not included in NEMURO (Holl et al., 2007;Mulholland et al., 2006).In future studies, including diazotrophs as a separate phytoplankton functional type would be essential for evaluating the importance of nitrogen fixation in the GoM.
Despite the model-data mismatch on the Campeche Bank, this discrepancy appears to have little impact on the rest of the GoM.However, the model overestimates surface Chl in the southwestern GoM, which can likely be attributed to entrainment of high Chl water originating from the Campeche Bank.Locally, the ecological impact is likely more significant.Higher phytoplankton biomass would be expected to support higher mesozooplankton grazing rates and secondary production.Indeed, some of the highest model rates of secondary production occur on the Campeche Bank.Hence, the surface Chl model-data mismatch may lead to an overestimation of secondary production for this region.

Deep chlorophyll maximum discrepancies
Since most PBMs focus on validating against satellite derived surface chlorophyll, the dynamics of the DCM is often insufficiently investigated.Consequently, many models predict DCM depths that are far too shallow.Identifying this issue in the literature proved to be difficult because most studies do not provide profiles of simulated Chl (an exception is the recent GoM PBM by Damien et al. (2018)).We note that DCM depths in the DIAZO model (Stukel et al., 2014) were often quite shallow or completely nonexistent in the portion of the domain that included the oligotrophic GoM region.Underestimates of DCM depth in the unmodified COBALT biogeochemical model has also been identified (Moeller et al., 2019).In our investigation of the GoM PBM implemented by Gomez et al. (2018), we found that DCMs in the oligotrophic region were commonly shallow and weak.In the default NEMURO simulation, DCM depths in the oligotrophic region were typically at a depth of 25 m, which is much shallower than observed (SEAMAP: 80 + 25 m, Process cruises: 107 + 21 m).While this issue may seem insignificant, particularly if a study is focused on mixedlayer dynamics, accurate placement of the DCM can have profound impacts on PBM behaviors, because the DCM is typically co-located with the nitracline.Unrealistically shallow DCMs and nitraclines permit unrealistically high nitrate fluxes into the surface layer following mixing events; thus, validating the DCM in PBMs is critical.
For these reasons, we devoted substantial effort to tuning phytoplankton dynamics in the DCM.
Modifications to α (the slope of the photosynthesis-irradiance curve) and attenuation coefficients allowed the model to estimate deeper more realistic DCM depths.Inclusion of a variable C:Chl module was also implemented to better resolve the DCM.However, an additional issue was present in the default NEMURO simulations, the NEMURO-GoM, and every simulation that we attempted.In all simulations that formed DCMs, the location of the DCM was always co-located with a maximum in phytoplankton specific growth rate, even though field measurements indicate that phytoplankton growth rates and NPP are either relatively constant with depth or decline in the DCM.This is not surprising, given the low photon flux at the base of the euphotic zone and the energetic demands required to up-regulate cellular density of light harvesting pigments.
Additionally, our field measurements show that the DCM was not associated with a biomass maximum (biomass was fairly constant with depth), suggesting that DCM formation in the GoM is physiologically driven.
We believe this DCM dynamical issue was responsible, in part, for the underestimation of specific phytoplankton growth and microzooplankon grazing rates by the model despite estimating higher NPP (Fig. 4D).An underestimation of C:Chl ratios or overestimation of phytoplankton biomass may also contribute to the model-data mismatch at the DCM.The ecological impact of this mismatch is likely to be greatest in the model during the summer months when the mixed layer depth is shallow and the DCM reaches its maximum depth.Elevated growth rates or biomass at the DCM during this period could significantly decrease nitrate flux into the mixed layer causing the model to underestimate near surface primary and secondary production.
Future PBM studies need to focus more effort on resolving ecological dynamics responsible for the formation of the DCM.Errors originating from the hydrodynamic model or use of temporally averaged velocity fields used in offline models may also contribute to model-data mismatch at the DCM.Vertical mixing is particularly important to PBMs, but is often poorly validated in hydrodynamic models.Greater coordination between physical modelers and biologists to constrain vertical fluxes should be considered an important avenue for improving PBM simulations moving forward.

Mesozooplankton grazing discrepancies
Novel to this study, model estimates of mesozooplankton biomass are shown to agree closely with observations on the shelf and in the oligotrophic GoM.To our knowledge, this study includes the first quasi-regional zooplankton biomass model validation in a PBM.Our model also provides the first model-data comparisons of size-specific zooplankton biomass and grazing rates for the GoM.
Such comparisons provide valuable insights into the potential biases of traditional functional group biogeochemical models pertaining to zooplankton dynamics (Everett et al., 2017).While NEMURO-GoM shows broad agreement with zooplankton observations, some model-data mismatch occurs, particularly for mesozooplankton grazing rates.
We identify three factors that may explain the model-data mismatch for mesozooplankton grazing rates.The first and most obvious factor is the temporal sampling discrepancy as measurements were collected outside our model simulation period.Model-data mismatch may also arise from inaccuracies in the field measurements.During our process cruises, the zooplankton gut pigment measurements were based solely on phaeopigment content due to phytodetrital aggregates and Trichodesmium colonies found in our zooplankton net tows, which can lead to substantial contamination.Thus, true mesozooplankton grazing rates were likely underestimated because undegraded Chl can be abundant in the foreguts of mesozooplankton.Furthermore, the gut pigment approach assumes that any group of mesozooplankton has a constant gut throughput time (as a function of temperature), which is an oversimplification.
Uncertainties in model grazing formulations could also contribute to model-data mismatch (Gentleman et al., 2003a;Sailley et al., 2015).Future in situ grazing measurements are needed to enable an objective selection of grazing formulations and parameter values.In particular, field studies that shed light on prey selectivity would be useful for parameterizing PBMs with multiple mesozooplankton functional groups, such as NEMURO-GoM.Such studies are challenging, however, because the difficulty of making in situ grazing measurements on mesozooplankton, combined with the inherent uncertainty of these measurements can make it challenging to differentiate between, for instance Ivlev and Holling's disk grazing formulations (e.g., Fig. 4 in Morrow et al. (2018)).Nevertheless, differences in parameterizing grazing can lead to substantially different model behavior (Anderson et al., 2010;Sailley et al., 2015;Wainwright et al., 2007).In NEMURO-GoM, secondary production and dietary preferences of the mesozooplankton community are both strongly influenced by model grazing formulation.While we carefully chose parameterizations that gave reasonable fits to extensive field datasets of zooplankton biomass and grazing rates, this does not preclude the possibility that other functional forms would have more accurately simulated zooplankton dynamics.Hence, future PBMs should investigate how different grazing formulations impact zooplankton dynamics in the region.We especially recommend collaborations between experimentalists (potentially using new techniques such as DNA metabarcoding of gut contents) and modelers to develop synthetic approaches with the potential to quantitatively assess the realism of different grazing formulations.
Clear model-data mismatch is also evident in the proportion of grazing mediated by PZ and LZ.
This may be due to the fact that PZ is by default explicitly defined and parameterized as a higher trophic level mesozooplankton that can feed on LZ.In reality, while there is a correlation between size and trophic level in the ocean, many predatory zooplankton are <1 mm, and many suspensionfeeding zooplankton are >1 mm; hence, the overlap of taxonomic groups with different functional roles and sizes makes it difficult to directly compare model categories to field data.For example, shelf suspension-feeding zooplankton are likely larger than their counterparts in the oligotrophic GoM although their functional role in the ecosystem does not change between environments.
The ecological impact of the model's potential overestimation of LZ grazing rates is most likely to manifest through an increase in the ratio of secondary production to mesozooplankton biomass.
Since both LZ and PZ biomasses are accurately modeled by NEMURO-GoM, the overestimation of grazing rates suggests that LZ turnover times may be too high, thus leading to higher estimates of secondary production.However, this interpretation may oversimplify the complex interactions within pelagic protistan communities.In the oligotrophic region where our model overestimates LZ grazing rates, the model indicates that heterotrophic protists comprise approximately half of the LZ diet.Thus, overestimates of grazing on LP do not necessarily lead to overestimates in total consumption if <1-mm zooplankton derive substantial nutrition from non-phototrophic sources in the field.Furthermore, the model's construction (i.e., LZ and PZ are functional groups, while the field data are size classes) suggests that part of the model-data mismatch in Fig. 5d may result from the presence of some suspension-feeders (i.e., LZ) in the >1-mm zooplankton size class and some carnivorous zooplankton (i.e., PZ) in the <1-mm zooplankton size class.In this case, the model may simply attribute too high of a LP:SZ prey ratio to LZ.If this is the issue, the model's estimate of LZ secondary production may be accurate, but its trophic level too low (or, conversely, the trophic level of PZ too high).Direct assessments of zooplankton trophic position (e.g., by compound specific isotopic analysis of amino acids, (Chikaraishi et al., 2009;Décima et al., 2017) may help resolve these issues.

Mesozooplankton dynamics in the open-ocean oligotrophic Gulf of Mexico
Despite its nutrient-poor conditions, the open-ocean GoM ecosystem is a key region for spawning and larval development of many commercially important fishes, including Atlantic bluefin tuna, yellowfin tuna, skipjack tuna, sailfish and mahi mahi (Cornic and Rooker, 2018;Kitchens and Rooker, 2014;Lindo-Atichati et al., 2012;Muhling et al., 2017;Rooker et al., 2012Rooker et al., , 2013)).Why so many species choose such oligotrophic waters as habitat for their larval stages is unknown, but may be due to reduced predation risk (Bakun, 2013;Bakun and Broad, 2003).Regardless, rapid growth and survival through the larval period depends on mesozooplankton prey that are suitably abundant and appropriately sized for these larval fishes.These prey taxa may be especially sensitive to increased stratification and oligotrophication associated with climate change, making investigation of their dynamics and production an important topic of research.
Mesozooplankton biomass in the oligotrophic GoM was found to be strikingly low in both observations and model estimates, approximately an order of magnitude less than on the shelf.
Model results clearly show that this low biomass condition arises from bottom-up resource limitation.Our results suggest that low phytoplankton biomass in oligotrophic regions, and particularly within Loop Current Eddies, may even lead to localized and episodic regions where phytoplankton concentrations approach thresholds low enough that trigger collapse of mesozooplankton grazing.Prey limiting conditions for mesozooplankton and their predators would be expected more frequently in the GoM during warmer ocean conditions.Higher sea surface temperatures and increased thermal stratification could suppress vertical mixing, resulting in lower phytoplankton biomass.
Despite extreme oligotrophy and dominance of picophytoplankton, our model shows that both PZ and LZ can be sustained at modest abundances in the oligotrophic GoM.Indeed, the substantial abundances of large (>1 mm) mesozooplankton, equivalent to 60% of total mesozooplankton in both observations and model results (Fig. 4A, C), is an important result that helps explain the success of larval fish in the region.Our results show that large mesozooplankton (PZ) occupy a trophic position of approximately 3.0 in the open ocean GoM, which is marginally lower than on the shelf where they feed primarily on small mesozooplankton (LZ).The change in trophic position is associated with a switch from carnivory to feeding predominantly on heterotrophic protists in the oligotrophic region.This result highlights the importance of intermediate protistan trophic levels in sustaining mesozooplankton communities in oligotrophic regions.Indeed, both LZ and PZ ingest proportionally more SZ in the open ocean than on the shelf.Notably, these protistan trophic steps cannot be quantified by routine field techniques because they have no pigment signature to make them visible in gut pigment measurements and may not enrich in bulk 15 N leading to isotopic invisibility from a trophic perspective (Gutiérrez-Rodríguez et al., 2014).
Despite their importance to phytoplankton grazing, they are sometimes missing from GoM ecosystem models (e.g., Fennel et al., 2011) and severely underrepresented or even absent in complex mass-balance constrained models (Arreguin-Sanchez et al., 2004;Geers et al., 2016).
New insights may arise from focused investigations of phytoplanktonprotistcrustacean linkages in oligotrophic regions in both model and experimental studies.This will likely require the use of next-generation technologies such as compound specific isotopic analyses of specific amino acids that have been shown to enrich in protists (Décima et al., 2017) or DNA metabarcoding to assess zooplankton gut contents (Cleary et al., 2016).
Another robust finding from this study is the dynamic mesoscale variability in zooplankton abundance, diet, and trophic position.These model results highlight the impact of Loop Current Eddies and mesoscale fronts and other features in modifying the biogeochemistry and food web of the GoM.The existence of hot spots of productivity in the GoM has been noted in observational studies (Biggs and Ressler, 2001), and the importance of GoM mesoscale features to fish larvae has been hypothesized (Domingues et al., 2016;Lindo-Atichati et al., 2012;Rooker et al., 2012).
Indeed, cyclonic eddies were found to have enhanced secondary production in our model, while secondary production was depressed within anticyclonic eddies.Our results further suggest that these mesoscale structures may not only modify zooplankton abundances, but also their trophic roles in the ecosystem, with implications for the transfer efficiencies of carbon and nitrogen in the pelagic food web.

Conclusions
In this study we developed a PBM for the GoM and after extensive validation used the model to investigate zooplankton community dynamics.The model was able to capture broad ecosystem attributes including phytoplankton and mesozooplankton abundances, depth of the DCM and nutricline, as well as growth and grazing patterns.Based on model estimates of trophic position and diet our results suggest that on the shelf small mesozooplankton are strongly herbivorous while large mesozooplankton are strongly carnivorous.However, distinct changes in diet were noted in the oligotrophic GoM, where both groups rely more on protistan prey.Changes in diet and secondary production highlighted in this study have the potential to impact food availability to higher trophic levels, such as pelagic larval fishes.In future work, we plan to couple NEMURO-GoM to an individual-based model to evaluate the extent to which mesozooplankton abundance limits larval fish feeding, growth, and survival along transport pathways in the GoM.Insights from this ecosystem-based approach may help to better resolve stock-recruitment relationships that are needed for sustainable fisheries management and improved stock-assessment models.
and salinity fields from a pre-existing20-year (1993-2012)  HYCOM (HYbrid Coordinate Ocean Model)(Chassignet et al., 2003) regional GoM hindcast (H-GoM).H-GoM is based on version 2.2.99B of the HYCOM code, originally provided by the Naval Oceanographic Office (NAVOCEANO) Major Shared Resource Center.H-GoM was run at 1/25 th degree (~4 km) horizontal resolution with 36 vertical hybrid coordinate layers and assimilated historic, in situ and satellite observations.The domain encompasses the entire GoM and extends south of the Mexican-Cuba Yucatan channel to 18 °N and as far east as 77 °W (Fig.1).Further details on H-GoM (experiment ID: GOMu0.04/expt_50.1),including model forcing and the main model configuration file (i.e.blkdat.input_501),can be found at https://www.hycom.org.The H-GoM flow fields were mapped from the HYCOM native hybrid vertical coordinate to zlevels used by the MITgcm.NEMURO-GoM was configured for 29 vertical z-levels (10-m intervals from 0-150 m, 25-m intervals from 150-300 m, 50-m intervals from 300-500m, and 1000 m, 2000 m, ~4000 m).Mapping was performed by computing total zonal and meridional transports across the lateral boundaries of each MITgcm grid cell (e.g., 0-10 m bin; which may include multiple HYCOM layers) and then dividing by the area of the respective cell face.This vertical mapping approach is consistent as both HYCOM and MITgcm use an Arakawa C-grid orientation for model variables.The H-GoM bathymetry was adjusted such that no partial cells existed in the domain to avoid thin cells.The continuity equation was subsequently used to calculate vertical velocities.The use of transports in this approach ensures conservation and

(
Figure 1 (A-E): Spatial and temporal coverage of observational data sets used for model validation.Total number of non-zero values available in daily SeaWIFS images from September 1997 to 10 December, 2010 along with nitrate profiles from the World Ocean Database (black dots) and the location of in situ measurements collected during May 2017 (circles) and May 2018 (triangles) process study cruises (A).Total annual sampling of the SEAMAP surveys from 1983-2017 with samples overlapping with the PBM simulation period denoted in red (B).Samples organized by month (D).Total sample density within each 0.5° x 0.5° box (C) along with the number of years with at least one sample (E).1000 m isobaths and coastline are denoted by black continuous lines.

Figure 2 (
Figure 2 (A-F): Comparison of surface chlorophyll (mg m -3 ) between SeaWIFS observations (left column) and model (right column) from 4 September 1997 to 10 December 2010.Surface chlorophyll climatology computed from daily SeaWIFS images (A) and the corresponding log10 field (C).Surface chlorophyll climatology computed from daily averages over the entire model simulation (1993-2012) (C) and the corresponding log10 field (D).Time series of simulated 30day average surface chlorophyll based on SeaWIFS observations (black) and model (red) for the oligotrophic region (grid cells with bottom depths >1000 m) (E) and shelf region (grid cells with bottom depths <1000 m) (F).
model is statistically significant (p < 0.01) with a ρ value of 0.90.When model estimates are compared to SEAMAP tows at corresponding sample times and locations for the 6,835 measurements in the simulation period, the ρ value is 0.55 (p < 0.01).In the oligotrophic region, the model slightly overestimates mesozooplankton biomass (Model: 4.09 + 1.82 mg C m -3 vs. SEAMAP: 3.52 + 3.44 mg C m -3 ) with ρ value of 0.23 (p < 0.01) with a bias of 0.57 mg C m -3 , equivalent to 16% of the observed mean.Conversely, in the shelf region the model underestimates mesozooplankton biomass (Model: 17.40 + 13.58 mg C m -3 vs. SEAMAP: 20.91 + 24.62 mg C m - 3 ), with a ρ value of 0.49 (p < 0.01) and a bias of -3.5 mg C m -3 , equivalent to 17% of the observed mean.Model estimates and SEAMAP measurements also compare well with total mesozooplankton biomass measurements (0.2-5 mm) collected in the oligotrophic region during the process study cruises (Model: 5.55 + 2.87 mg C m -3 vs. Cruise: 4.33 + 2.28 mg C m -3 ).
validate the vertical structure of the simulated ecosystem, we utilized observed profiles of fluorescence, Chl, and nitrate.When simulated DCM depths were compared to all 2,435 SEAMAP fluorescence profiles, we find a statistically significant correlation (ρ = 0.59, p < 0.01) with the observed maximum fluorescence depth.The maximum fluorescence depth ranged from the surface to 143 m while model values show a similar variability ranging from the surface to 163 m (Fig. 4A).In the oligotrophic region, the model overestimates the DCM depth (Model: 95 + 20 m vs. SEAMAP: 80 + 25 m) and has a ρ value of 0.38 (p < 0.01) with a bias of 15 m, equivalent to 19% of the observed mean.In the shelf region, the model also overestimates DCM depth (Model: 63 + 26 m vs. SEAMAP: 53 + 23 m) and has a ρ value of 0.49 (p < 0.01) with a bias of 10 m, equivalent to 19% of the observed mean.In contrast, the model slightly underestimated the DCM depth when compared to calibrated fluorescence profiles collected during the process cruises (Model: 100 + 18 m vs. Observed: 107+ 21 m) (Fig. 4B).In terms of magnitude, the model overestimates DCM Chl (Model: 0.74 + 0.35 mg Chl m -3 vs. Observed: 0.38 + 0.13 mg Chl m -3 ), although most of the observations fall within one standard deviation of the model average.Despite this model-data mismatch, simulated nitrate profiles closely match profiles from the World Ocean Database (WOD).In both model and observations, the mean nitracline occurred at approximately 75 m (Fig. 4C).On average, model nitrate tended to be lower at the surface and higher at depth relative to observations.Above the nitracline, model nitrate was 0.071 + 0.39 mmol N m -3 while observed nitrate was 0.55 + 1.29 mmol N m -3 .Below 200 m, model and data show better agreement, with deep nitrate in the model of 24.92 + 3.28 mmol N m -3 compared to 23.55 + 5.21 mmol N m -3 in WOD profiles.

Figure 4 (
Figure 4 (A-C): Model-data comparisons in the vertical distribution of chlorophyll (A&B) and nitrate (C).Comparison between maximum fluorescence depths observed in SEAMAP profiles and simulated maximum chlorophyll depths (m) (A).Comparison between in situ chlorophyll samples collected during May, 2017 and 2018 process study cruises and simulated chlorophyll concentrations (mg m -3 ) (B).Grey dots represent cruise sample values and red dashed line denotes

Figure 5 (
Figure 5 (A-C): Specific phytoplankton growth (μ, d -1 ) and microzooplankon grazing (m, d -1 ) between model (red) and field data (black) (A).Depth-integrated net primary production (mg C m -2 d -1 ) (B).The fraction of phytoplankton growth that is grazed by protists in the model and field data (C).Whiskers extend to the 95% confidence intervals.Outliers for model estimates are not shown.

Figure 7 (
Figure 7 (A-F): Trophic levels of simulated large zooplankton (LZ, top) and predatory zooplankton (PZ, bottom).Annual-averaged trophic positions of LZ (A) and PZ (D).Instantaneous model field of trophic positions during winter conditions for LZ (B) and PZ (E) on 4 February 2012.Instantaneous model field of trophic positions during summer conditions for LZ (C) and PZ (F) on 2 August 2011.

Figure 8 (
Figure 8 (A-F): Dominant prey source for simulated large zooplankton (LZ, top) and predatory zooplankton (PZ, bottom).Annual-averaged diet contributions for LZ (A) and PZ (D).Instantaneous model field of zooplankton diet during winter conditions for LZ (B) and PZ (E) on 4 February 2012.Instantaneous model field of zooplankton diet during summer conditions for LZ (C) and PZ (F) on 2 August 2011.

Figure 9 (
Figure 9 (A-F): Vertically integrated secondary production (mg C m -2 d -1 ) of simulated large zooplankton (LZ, top) and predatory zooplankton (PZ, bottom).Annual-averaged secondary production for LZ (A) and PZ (D).Instantaneous model field of secondary production during winter conditions for LZ (B) and PZ (E) on 4 February 2012.Instantaneous model field of secondary production during summer conditions for LZ (C) and PZ (F) on 2 August 2011.
will likely continue to depend heavily on satellite Chl for the bulk of model validation.To correctly parameterize C:Chl ratios more in situ samples are needed that resolve changes in phytoplankton light harvesting pigments along gradients from coastal to oligotrophic regions and from the surface to the DCM.Without these observations, it is difficult to gauge mismatches between model and satellite ocean color products (e.g.discrepancies on the Campeche Bank in our model) or in situ profiles of Chl.In addition to validation, these measurements are needed to avoid erroneous model tuning.For instance, a model that exhibits significant mismatch with respect to surface Chl may in fact accurately estimate carbon based phytoplankton biomass while using unrealistic C:Chl ratios.One could arrive at incorrect conclusions about regional ecosystem dynamics as a result of modifying model parameters or structure in an effort to better fit Chl observations.Given the importance of C:Chl ratios in PBMs, future studies should quantify uncertainty in modeled Chl through sensitivity experiments focused on C:Chl model parameters and formulation, with explicit comparison to direct field measurements of phytoplankton C:Chl.In our model, the most noticeable surface Chl model-data mismatch occurs on the southern GoM shelf (Campeche Bank), where the model consistently overestimates surface Chl.This bias was also notable in the GoM PBM implemented by Damien et al. (2018), particularly in winter.We believe this discrepancy is driven by a combination of errors involving overestimation of shelf mixing by the hydrodynamic model, entrainment of high Chl water (given the overestimated DCM magnitude in the model), or errors in the open boundary conditions which result in an overestimation of upwelled nutrients/biomass near the Yucatan Peninsula that are transported westward by shelf currents.We found that the Campeche Bank model-data mismatch was reduced when open boundary conditions included nitracline depths of greater than 100 m.This may reflect realistic in situ conditions considering that Caribbean water entering the GoM is highly oligotrophic.During our process cruises, nitrate was often undetectable above 100 m in samples collected near the Loop Current (A Briefly, daily average shortwave radiation fields obtained from Climate Forecast System Reanalysis (CFSR) were used to force light limitation of phytoplankton.Once a final parameter set was determined (see section 2.1.3),initial and open boundary conditions for all state variables were prescribed from a spun up idealized onedimensional version of NEMURO-GoM.After initializing, the three-dimensional model was spun up over four years before conducting the full 20-year experiment.River nutrient input from the Mississippi was prescribed using nitrate samples collected by United States Geological Survey . All model equations are provided in the supplement to this manuscript.Biogeochemical model forcing, initial and open boundary conditions are also outlined in Supplement S1.