Ocean biogeochemistry (OBGC) models span a wide variety of complexities, including highly simplified nutrient-restoring schemes, nutrient–phytoplankton–zooplankton–detritus (NPZD) models that crudely represent the marine biota, models that represent a broader trophic structure by grouping organisms as plankton functional types (PFTs) based on their biogeochemical role (dynamic green ocean models) and ecosystem models that group organisms by ecological function and trait. OBGC models are now integral components of Earth system models (ESMs), but they compete for computing resources with higher resolution dynamical setups and with other components such as atmospheric chemistry and terrestrial vegetation schemes. As such, the choice of OBGC in ESMs needs to balance model complexity and realism alongside relative computing cost. Here we present an intercomparison of six OBGC models that were candidates for implementation within the next UK Earth system model (UKESM1). The models cover a large range of biological complexity (from 7 to 57 tracers) but all include representations of at least the nitrogen, carbon, alkalinity and oxygen cycles. Each OBGC model was coupled to the ocean general circulation model Nucleus for European Modelling of the Ocean (NEMO) and results from physically identical hindcast simulations were compared. Model skill was evaluated for biogeochemical metrics of global-scale bulk properties using conventional statistical techniques. The computing cost of each model was also measured in standardised tests run at two resource levels. No model is shown to consistently outperform all other models across all metrics. Nonetheless, the simpler models are broadly closer to observations across a number of fields and thus offer a high-efficiency option for ESMs that prioritise high-resolution climate dynamics. 
However, simpler models provide limited insight into more complex marine biogeochemical processes and ecosystem pathways, and a parallel approach of low-resolution climate dynamics and high-complexity biogeochemistry is desirable in order to provide additional insights into biogeochemistry–climate interactions.
Table 1. Biogeochemical cycles represented in each candidate model.
Ocean biogeochemistry is a key part of the Earth system: it regulates the
cycles of major biogeochemical elements and controls the associated feedback
processes between the land, ocean and atmosphere. As a result, changes in
ocean biogeochemistry can have important implications for climate (Reid et
al., 2009). Marine ecosystems are indirectly affected by anthropogenic
environmental change (Jackson et al., 2001), particularly through
climate-induced changes in physical properties and CO
With the recent publication of the Intergovernmental Panel on Climate Change
(IPCC) fifth assessment report (AR5), global efforts are already underway
to develop the next generation of Earth system models (ESMs) to support
climate policy development and any further IPCC assessment report. Ocean biogeochemistry
(OBGC) models coupled to ESMs can help address a series of overarching scientific
questions such as: how will the ocean contribute to atmospheric trace gas
composition (e.g. CO
For an anticipated sixth IPCC assessment report it is generally
agreed that these global-scale questions, with direct implications for
climate policies, will again be the main focus of ocean biogeochemical
models within ESMs. In addition, the ESM model archive is increasingly being
used for activities within the Inter-Sectoral Impact Model Intercomparison
Project
(
Table 2. Composition of the marine ecosystems represented in each candidate model along with the total number of biogeochemical tracers (including those detailed in Table 1).
Within the UK, the Integrated Global Biogeochemical Modelling Network (iMarNet) project aims to advance the development of ocean biogeochemical models through collaboration between existing modelling groups at Plymouth Marine Laboratory (PML), National Oceanography Centre (NOC), University of East Anglia (UEA) and the Met Office Hadley Centre (UKMO). As part of iMarNet we conducted an intercomparison of six current UK models to help inform the selection of a baseline OBGC model for the next UK Earth system model (UKESM1). This intercomparison focused on the ability of the models to reproduce global-scale bulk properties – such as nutrient and carbon distributions – that broadly characterise the activity of marine biota (and thus the carbon cycle) in the ocean. To limit the role of errors originating from modelled physics, all of the examined model simulations were performed within the same physical ocean general circulation model (GCM), under the same external forcing and following the same experimental protocol. As all of the models examined have been previously published, our analysis does not include an assessment of their underlying biological fidelity (i.e. the extent to which structures, parameterisations and parameter sets of candidate models are a priori realistic). However, while primarily focused on model skill, the intercomparison also considers the computational cost of the models in relation to the realism that they offer. Previous authors have performed biogeochemical model intercomparisons with parallels to this study (e.g. Friedrichs et al., 2007; Kriest et al., 2010; Steinacher et al., 2010; Popova et al., 2012). These have differed from this study and from each other in a number of ways. For instance, this study is 3-D rather than 1-D (cf. Friedrichs et al., 2007), global rather than regional (cf. Popova et al., 2012), uses identical rather than diverse physics (cf. Steinacher et al., 2010) and spans a more functionally diverse range of biogeochemical models (cf. Kriest et al., 2010). The latter two factors, in particular, distinguish this study, permitting us both to formally separate the impact of physics from that of biogeochemical dynamics and to do so across a broad range of model complexity, from nutrient–phytoplankton–zooplankton–detritus (NPZD) models to state-of-the-art plankton functional type (PFT) models with considerable ecological sophistication. This study is still constrained by the use of a single ocean circulation and by a bespoke gradation of model complexity (although PlankTOM6 and PlankTOM10 partially inform this). Nonetheless, this study represents an intercomparison along separate lines to those previously conducted.
All participating models made use of a common version (v3.2) of the Nucleus
for European Modelling of the Ocean (NEMO) physical ocean general
circulation model (Madec, 2008) coupled to the Los Alamos sea–ice model
(CICE) (Hunke and Lipscomb, 2008). This physical framework is configured at
approximately 1
Simulations were initialised at the year 1890 from an extant physics-only
spin-up (ocean and sea–ice) to minimise undesirable transient behaviour in
ocean circulation. In terms of ocean biogeochemistry, all model runs made
use of a common data set of three-dimensional fields for the initialisation
of major tracers. Nutrients (nitrogen, silicon and phosphorus) and dissolved
oxygen in this data set were drawn from the World Ocean Atlas 2009 (Garcia et
al., 2010a, b), while dissolved inorganic carbon (DIC)
and alkalinity were drawn from the Global Ocean Data Analysis Project
(GLODAP) (Key et al., 2004). GLODAP does not include a DIC field that is
directly valid for 1890, so a temporally interpolated field was produced
based on GLODAP's “pre-industrial” (i.e.
Observational (Takahashi et al., 2009; top left) and modelled
annual average surface ocean
Observational (World Ocean Atlas, 2009; top left) and modelled
annual average surface ocean dissolved inorganic nitrogen (mmol m
After initialisation at the year 1890, the models were run for 60 years (1890–1949 inclusive) under the so-called “normal year” of version 2 forcing for common ocean–ice reference experiments (CORE2-NYF; Large and Yeager, 2009). Subsequently, the models were run under transient interannual forcing from the same data set (CORE2-IAF) for a further 58 years (1950–2007 inclusive). CORE2 provides observationally derived geographical fields of downwelling radiation (separate long and short wave), precipitation (separate rain and snow) and surface atmospheric properties (temperature, specific humidity and winds) and is used in conjunction with bulk formulae to calculate net heat, freshwater and momentum exchange between the atmosphere and the ocean.
Observational (SeaWiFS; top left) and modelled annual average
surface ocean chlorophyll (mg m
For all models, some degree of tuning occurred prior to this study, albeit in physical frameworks that differ (to varying degrees) from the one used here. Tuning during this study was limited or absent, depending on the model, but some models, such as HadOCC and MEDUSA, may have benefited from having been previously tuned within the NEMO framework (although in a different version and grid configuration).
Figure S7 in the Supplement compares the common NEMO physics with observations (temperature, Locarnini et al., 2010; salinity, Antonov et al., 2010; mixed layer depth, Monterey and Levitus, 1997) for several key physical fields. In terms of SST, NEMO represents observed patterns well despite simulating a warmer Gulf Stream and noticeably cooler temperatures in the vicinity of the Labrador Sea. In conjunction with lower salinities in the north Atlantic (results not shown), these differences result in shallower depths of the mixed layer and pycnocline in this region. In contrast, in the Southern Ocean both mixed layer depths and the modelled pycnocline are markedly deeper than in observations. This latter regional bias has biogeochemical consequences across all of the models examined here (see later).
The models evaluated within this study vary significantly in biological complexity. The key features of the participating models are summarised below.
The Hadley Centre Ocean Carbon Cycle (HadOCC) model is a
simple NPZD representation
that uses N nutrient as its base currency but with coupled flows of C,
alkalinity and O
The Diat-HadOCC model is a development of the HadOCC model that includes two phytoplankton classes (diatoms and “other phytoplankton”) and representations of the Si and Fe cycles, as well as a dimethyl sulphide (DMS) sub-model. The model is the ocean biogeochemistry component of HadGEM2-ES (Collins et al., 2011), the UK Met Office's Earth system model used to run simulations for CMIP5 and the Intergovernmental Panel on Climate Change (IPCC) fifth assessment report (AR5).
The Model of Ecosystem Dynamics, nutrient Utilisation, Sequestration and Acidification (MEDUSA) is an
“intermediate complexity” plankton ecosystem model designed to incorporate
sufficient complexity to address key feedbacks between
anthropogenically driven changes (climate, acidification) and oceanic
biogeochemistry. MEDUSA-2 resolves a size-structured ecosystem of small
(nanophytoplankton and microzooplankton) and large (microphytoplankton and
mesozooplankton) components that explicitly includes the biogeochemical
cycles of N, Si and Fe nutrients as well as the cycles of C, alkalinity and
O
PlankTOM is a dynamic green ocean model that represents lower-trophic level
marine ecosystems based on PFTs. A hierarchy of
PlankTOM models exists that vary in the number of PFTs resolved. Two members
drawn from this stable were used in this study. PlankTOM6 includes six PFTs –
diatoms, coccolithophores, mixed phytoplankton, bacteria, protozooplankton
and mesozooplankton – while PlankTOM10 includes an additional four PFTs –
nitrogen fixers,
The European Regional Seas Ecosystem Model (ERSEM) is a generic lower-trophic level model designed to represent the
biogeochemical cycling of C and nutrients as an emergent property of
ecosystem interaction. The ecosystem is subdivided into three functional
types – producers (phytoplankton), decomposers (bacteria) and consumers
(zooplankton) – and then further subdivided by trait – size and silica uptake – to
create a food web. Physiological (ingestion, respiration, excretion and
egestion) and population (growth, migration and mortality) processes are
included in the descriptions of functional group dynamics. Four
phytoplankton (picophytoplankton, nanophytoplankton, diatoms and
non-siliceous macrophytoplankton), three zooplankton (microzooplankton,
heterotrophic nanoflagellates and mesozooplankton) and one bacterium are
represented, along with the cycling of C, N, P, Si and O
The intercomparison process required limited changes to model organisation
and code, and models retained disparate parameterisations for several
overlapping processes, including ocean carbonate chemistry and air–sea
exchange (HadOCC, Diat-HadOCC – Dickson and Goyet, 1994; Nightingale et
al., 2000; MEDUSA – Blackford and Gilbert, 2007; PlankTOM6, PlankTOM10 – Edmond
and Gieskes, 1970; Broecker et al., 1982; Wanninkhof, 1992; ERSEM – Artioli et
al., 2012). In the case of calcium carbonate (CaCO
Figure 4. Frequency distributions of best to worst performances for each model in terms of correlation coefficients and normalised standard deviations of annual surface fields and depth-integrated primary productivity.
The representation of biogeochemical cycles and biota in each model is summarised in Tables 1 and 2 respectively.
Assessment against observational data sets was made for a set of bulk ocean
biogeochemical properties that were common across all models:
Observational fields used within the model intercomparison comprise
World Ocean Atlas 2009 DIN (Garcia et al., 2010a), chlorophyll (O'Reilly et
al., 1998) and
These fields were selected for several reasons. Firstly, they are ocean or biogeochemical bulk properties for which there are global-scale observations. Secondly, these fields broadly represent foundational aspects of marine biogeochemical cycles. For instance, nutrients play a critical role in regulating the distribution and occurrence of marine plankton, while phytoplankton photosynthesis represents the vast majority of the primary energy source to marine ecosystems. Thirdly, the measurement of these fields is relatively well defined with long-established standard methodologies. Properties that are directly related to biological entities, for instance biomass abundances, can be less precisely defined, difficult to match up with modelled quantities or even absent from some models examined here. That said, the observational field of global-scale primary production used here has a relatively high uncertainty because it is drawn from three methodologies that exhibit a large range (cf. Yool et al., 2013). Finally, the examined properties are those which, if modelled poorly, legitimately cast doubt over the wider utility of a biogeochemical model in an Earth systems context. Model results always depart from observations, but systematic disagreement with these basic observations is strongly suggestive of problems with process representation within a model. The model comparison focuses on the mean and seasonal cycle. It does not include evaluation of variability over interannual or longer timescales, in part because of limited data availability.
Figures 1–3 (and Figs. S1–S3 in the Supplement) show annual average fields
from each of the models for a series of ocean properties together with
comparable observational fields. The figures also include a panel that shows
the corresponding model–observation Taylor diagram (Taylor, 2001). These
illustrate both the correlation between (azimuthal position) and the relative
variability of (radial axis) model and observations, such that models more
congruent with observations generally appear closer to the reference marker
on the
Figure 1 shows annual average surface
Model–observation correlation coefficients (
The negative
Figure 2 illustrates model performance of annual average surface dissolved inorganic nitrogen (DIN) concentrations. Here, all models capture global patterns relatively well, with correlation coefficients > 0.8, in part because of the initialisation from observations in 1890. The model with the highest spatial pattern correlation coefficient is ERSEM, although it slightly underestimates the global variability of DIN. The other models have lower spatial pattern correlation coefficients and generally overestimate the global variability of DIN. PlankTOM6 performs below other models, while PlankTOM10 has a similar performance to the simpler models. In general, aside from ERSEM and PlankTOM10, most models show elevated Pacific DIN, and the simpler models, MEDUSA-2 in particular, exhibit high equatorial anomalies. Finally, while ERSEM shows good agreement throughout most of the world ocean, both the north Atlantic and north Pacific show anomalously low annual average DIN concentrations.
Surface DIN concentrations are influenced by both the efficiency of primary production and the efficiency of remineralization, both of which differ between models. Although we do not explore the differences in remineralization, the models which show positive DIN biases in the equatorial Pacific (HadOCC, Diat-HadOCC and MEDUSA-2) are generally shown to also have positive integrated primary production biases in this region (Fig. S1). To a lesser extent, the reverse is true of the models with negative DIN biases in the equatorial Pacific (PlankTOM10 and ERSEM).
Figure 3 shows low correlation (
In addition to the ocean properties shown in Figs. 1–3, complementary figures for alkalinity, DIC and primary production can be found in the supplementary material (Figs. S1–S3). In each case, global annual average fields are shown together with the corresponding Taylor diagram.
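The statistics summarised in a Taylor diagram can be sketched in a few lines. The example below is a minimal illustration, not the authors' analysis code: the function name `taylor_statistics` is hypothetical, the input fields are toy arrays, and the calculation is unweighted, whereas a global analysis on a model grid would normally weight each cell by its area.

```python
import numpy as np

def taylor_statistics(model, obs):
    """Return (correlation, normalised std, normalised centred RMS difference).

    Unweighted sketch: every valid grid cell counts equally (an assumption;
    area weighting would be usual for global fields).
    """
    model = np.asarray(model, dtype=float).ravel()
    obs = np.asarray(obs, dtype=float).ravel()
    # Keep only cells where both fields are defined (e.g. ocean points).
    valid = np.isfinite(model) & np.isfinite(obs)
    m, o = model[valid], obs[valid]
    m_anom = m - m.mean()
    o_anom = o - o.mean()
    sigma_m = np.sqrt(np.mean(m_anom ** 2))
    sigma_o = np.sqrt(np.mean(o_anom ** 2))
    corr = np.mean(m_anom * o_anom) / (sigma_m * sigma_o)
    norm_std = sigma_m / sigma_o  # radial axis of the diagram
    crmsd = np.sqrt(np.mean((m_anom - o_anom) ** 2)) / sigma_o
    return corr, norm_std, crmsd

# Toy fields: the "model" has the right pattern but twice the variability.
obs_field = np.array([1.0, 2.0, 3.0, 4.0, np.nan])
model_field = 2.0 * obs_field + 1.0
corr, norm_std, crmsd = taylor_statistics(model_field, obs_field)
```

The three quantities are linked by the relation crmsd² = norm_std² + 1 − 2 · norm_std · corr, which is what allows them to be shown on a single two-dimensional diagram. A model identical to observations would give a correlation of 1, a normalised standard deviation of 1 and a centred RMS difference of 0.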
Table 3 shows the correlation coefficients and standard deviations
normalised relative to observations of the models for all six of the ocean
properties (five surface fields plus depth-integrated primary production).
The range of correlation coefficients over all of the
models is shown for each field. As already suggested above, model
performance varies both between fields and between models. All models
perform consistently and relatively well for DIN and DIC in part because of
the “memory” of initial distributions. Model performance varies more
widely for
Monthly Taylor plots for
Figure 4 summarises the data in Table 3 by showing the distribution of performance rankings (both correlation coefficients and normalised standard deviations) across the selected fields for each model, i.e. the number of first, second, etc. rankings for each model. No model is shown to consistently outperform all other models across all metrics. Indeed, all models perform best in at least one metric, and similarly all models perform worst in at least one metric. There is little discernible relationship between model complexity and model performance. Indeed, Table 3 shows that for four out of six fields the best performing model in terms of correlation coefficients is a simpler model (i.e. HadOCC, Diat-HadOCC or MEDUSA-2) and for five out of six fields the best performing model in terms of normalised standard deviations is a more complex model (i.e. PlankTOM6, PlankTOM10 or ERSEM).
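A ranking summary of this kind can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the procedure actually used for Fig. 4: correlation coefficients are ranked from highest to lowest, normalised standard deviations by closeness to 1, and ties are broken by input order. The function names, model labels and numbers are all hypothetical placeholders and are not taken from Table 3.

```python
from collections import Counter

def rank_models(scores, target=None):
    """Rank models from best (1) to worst for one field and one metric.

    target=None : higher score is better (e.g. correlation coefficient).
    target=x    : closer to x is better (e.g. normalised std, target 1.0).
    """
    if target is None:
        key = lambda name: -scores[name]
    else:
        key = lambda name: abs(scores[name] - target)
    ordered = sorted(scores, key=key)  # stable sort: ties keep input order
    return {name: rank for rank, name in enumerate(ordered, start=1)}

def rank_distribution(tables):
    """Count how often each model achieves each rank.

    tables: list of (scores, target) pairs, one per field and metric.
    """
    counts = {}
    for scores, target in tables:
        for name, rank in rank_models(scores, target).items():
            counts.setdefault(name, Counter())[rank] += 1
    return counts

# Placeholder values for three hypothetical models and one field.
example = [
    ({"A": 0.9, "B": 0.7, "C": 0.8}, None),  # correlation coefficients
    ({"A": 1.4, "B": 1.1, "C": 0.5}, 1.0),   # normalised standard deviations
]
dist = rank_distribution(example)
```

In this toy case model A ranks first for correlation and second for normalised standard deviation, so `dist["A"]` records one first place and one second place. Plotting these counts per model gives a histogram of the kind described for Fig. 4.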
These findings for annual average model performance hold when the models are examined at monthly timescales (Fig. 5).
While the majority of biological activity in the ocean is concentrated in
its surface layers, biogeochemical fields in the deep ocean have a complex
structure created through the interaction of ocean physics with
biologically mediated processes such as export and remineralization. As
such, model performance cannot be assessed solely from surface fields of
ocean biogeochemical properties. To examine this, Figs. 6 and 7 show the annual
average depth profiles of DIC and alkalinity for three important regions:
the north Atlantic (Atlantic 0–60
In Fig. 6, all models are shown to capture the DIC profile in the
equatorial Pacific, though HadOCC, Diat-HadOCC and MEDUSA-2 are somewhat
closer to observations than ERSEM and the PlankTOM models. A similar
situation is seen in the north Atlantic where the depth profiles of
MEDUSA-2, HadOCC and Diat-HadOCC are closest to observations, although
surface agreement is greater than that at depth. All models are shown to
perform relatively poorly in the Southern Ocean, with much weaker gradients
with depth than observations. HadOCC, Diat-HadOCC and ERSEM show gradients
that are marginally closer to those observed, but all of the models
consistently fail to reproduce the observed > 100 mmol m
The annual average depth profiles of alkalinity are shown in Fig. 7. In
the north Atlantic, HadOCC and Diat-HadOCC are closer to observations while
ERSEM and, particularly, MEDUSA-2 are further away from observations (but in
opposite directions). Again, and for the same reasons as outlined above, no
model performs well at capturing the depth profile observed in the Southern
Ocean. In the equatorial Pacific, all of the models have similar alkalinity
at depth but diverge from observations towards the surface. The near-surface
depth profiles in HadOCC, Diat-HadOCC and MEDUSA-2 are closest to
observations in that region. Alkalinity shows very little variability with
depth in the PlankTOM6, PlankTOM10 and ERSEM models and is higher than
observations in near-surface waters (> 100 meq m
Observed (black; GLODAP) and modelled profiles of dissolved
inorganic carbon (mmol C m
The depth profiles of DIN and O
Computational timing tests (CPU time) were carried out relative to the ocean component of the HadGEM3 (Hewitt et al., 2011) model (ORCA1.0L75), on standard configurations of 128 and 256 processors on an IBM Power7 machine. As would be expected, the cost of candidate ocean biogeochemical models is found to be higher for models with more tracers, regardless of the number of processors used. While there are deviations in both directions between the models, there is a broadly linear relationship between the number of model tracers and computational cost (Fig. S6 in the Supplement), reflecting the significant cost of applying advection and mixing terms to each tracer.
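The broadly linear dependence of cost on tracer count can be checked with an ordinary least-squares fit. The sketch below is only an illustration: the function name `fit_cost_vs_tracers` is hypothetical and the tracer counts and costs are placeholder numbers chosen to lie exactly on a line, not the timings measured in this study.

```python
import numpy as np

def fit_cost_vs_tracers(n_tracers, relative_cost):
    """Least-squares line through (tracer count, relative cost) points.

    Returns (slope, intercept, r_squared). Relative cost is assumed to be
    expressed as a multiple of the physics-only simulation cost.
    """
    n = np.asarray(n_tracers, dtype=float)
    c = np.asarray(relative_cost, dtype=float)
    slope, intercept = np.polyfit(n, c, 1)
    predicted = slope * n + intercept
    ss_res = np.sum((c - predicted) ** 2)
    ss_tot = np.sum((c - c.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared

# Placeholder data: cost = 1 + 0.1 * tracers (illustrative only).
tracers = [5, 10, 20, 40]
cost = [1.5, 2.0, 3.0, 5.0]
slope, intercept, r_squared = fit_cost_vs_tracers(tracers, cost)
```

With cost defined relative to a physics-only run, the slope estimates the extra cost per additional tracer, and an intercept near 1 would be consistent with a zero-tracer configuration costing the same as the physics alone. Departures of individual models from the fitted line would then reflect costs beyond tracer transport, such as the biogeochemical source and sink calculations themselves.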
Using ERSEM (the computationally most expensive model) increases computational cost approximately 6-fold relative to HadOCC when 128 processors are used. This relative increase in computational cost is reduced to approximately 4.5-fold when 256 processors are used. PlankTOM10 has the greatest relative reduction (36.6 %) in computational cost when run on 256 processors as opposed to 128, although this model would still increase the total cost of the ocean component by a factor of 5 relative to a physics-only ocean, compared to a factor of 1.5 for HadOCC (Table 4).
Our model comparison suggests that for global annual average surface fields, global monthly average surface fields and annual average depth profiles in three oceanographic regions, there is little evidence that increasing the complexity of OBGC models leads to improvements in the representation of large-scale ocean patterns of bulk properties. In some cases, the comparison suggests that simpler OBGC models are closer to observations than intermediate or complex models for the standard assessment metrics used here.
Observed (black; GLODAP) and modelled profiles of alkalinity (meq m
The biologically simpler models HadOCC, Diat-HadOCC and MEDUSA-2 are shown
to have generally higher global spatial pattern correlation coefficients of
Table 4. Computational cost of each candidate model when coupled to the ocean component of HadGEM3, relative to a physics-only simulation with the same ocean model (ORCA1.0L75). A cost of 2 indicates that adding the biogeochemistry model doubles total simulation cost. Timings are shown for simulations carried out on 128 and 256 processors of an IBM Power7 machine.
There are, however, ocean biogeochemical fields for which greater biological complexity tends to equate to improved model skill. The annual and monthly global correlation coefficients of the PlankTOM models are shown to be closest to observations for chlorophyll and primary production fields (Fig. 3 and Table 3). These PlankTOM models do not consistently produce the annual chlorophyll and primary production field standard deviations closest to observations (Table 3); however, at monthly resolution their field standard deviations are the most consistent across models (Fig. 5).
The comparison of depth profiles shows that, despite all models being initialised from the same observational fields, there is considerable divergence even at depths of less than 1000 m. In some cases, such as alkalinity in the Southern Ocean (Fig. 7), all models have a similar systematic bias compared to observations. This is suggestive of the influence of errors within the physical ocean model. That is, the ocean biogeochemistry may be influenced to a greater extent by the physical ocean model and hence there is a common response across models. For other fields such as DIN in the Southern Ocean and equatorial Pacific (Fig. S5), models have both positive and negative biases compared to observations, suggestive of a greater relative role of the OBGC model than the physical model.
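The reasoning in this paragraph amounts to a simple sign test on model-minus-observation biases, which can be sketched as below. This is a reading aid only, not a diagnostic used in the study: the function name `classify_bias`, the `tolerance` parameter, the model labels and the bias values are all hypothetical.

```python
def classify_bias(biases, tolerance=0.0):
    """Classify model-minus-observation biases for one field and region.

    biases: mapping of model name -> bias (model minus observation).
    Returns "common" if all non-negligible biases share a sign (consistent
    with a shared physical driver), "mixed" if both signs occur (consistent
    with a larger role for the biogeochemistry), or "none" otherwise.
    """
    signs = set()
    for value in biases.values():
        if value > tolerance:
            signs.add(1)
        elif value < -tolerance:
            signs.add(-1)
    if len(signs) == 1:
        return "common"
    if len(signs) == 2:
        return "mixed"
    return "none"

# Placeholder biases in arbitrary units (illustrative only).
same_sign = classify_bias({"m1": -5.0, "m2": -2.0, "m3": -8.0})
both_signs = classify_bias({"m1": -5.0, "m2": 3.0, "m3": 1.5})
```

A "common" result does not by itself prove that the physics is responsible, since the biogeochemical models could share a structural deficiency; it only indicates that a shared driver is plausible.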
It is clear that more biologically complex models are required to more
completely assess the impacts of environmental change on marine ecosystems.
By representing processes that are not present in simpler models, the more
complex models are also able to represent additional factors such as
climatically active gases (e.g. DMS, N
It should be noted that models implemented within the NEMO physical ocean framework prior to this intercomparison project had an advantage over those new to this framework. This is a somewhat unavoidable consequence of what is also one of this intercomparison study's main strengths, namely that the models were adapted to use the same ocean physics framework. Specifically, the HadOCC and MEDUSA-2 model developers were familiar with NEMO v3.2 and had some previous opportunity to tune models. Linked to this is the question of how dependent the results were on parameter values. Although model developers were afforded a limited opportunity to tune parameters, given further time to tune one would expect improved performance, especially for those models that had not been previously implemented within NEMO v3.2.
The rationale for the chosen fields of intercomparison was, as stated previously, that they are common across all models and are key facets of global marine biogeochemistry. It could, however, be argued that these bulk fields were insufficient to adequately assess all models and in particular the most complex models. Further analysis beyond the scope of this paper will be necessary to evaluate mechanistically the implications of the different biological components in each model.
Finally, although computational cost is discussed as a pragmatic driver of OBGC model selection, it should be noted that computer power is continuously increasing and that the intercomparison results presented here may differ for an ocean grid of a different spatial resolution that requires greater computational resources. In addition, ongoing efforts to transport passive ocean tracers on degraded spatial scales (e.g. Levy et al., 2012) have the potential to result in computational savings that would realistically permit the implementation of higher complexity OBGC models within ESMs.
The six ocean biogeochemical models analysed within this intercomparison cover a large range of ecosystem complexity (from 7 tracers in HadOCC to 57 in ERSEM) and therefore span an approximately fivefold range in computational cost (from increasing the cost of the physical ocean model by a factor of 2 to a factor of 10). Results suggest little evidence that higher biological complexity implies better model performance in reproducing observed global-scale bulk properties of ocean biogeochemistry.
As no model is found to have the highest skill across all metrics and all are most or least skilful for at least one metric, our results suggest that it is in the interest of the international climate modelling community to maintain a diverse suite of ocean biogeochemical models.
One priority for the next generation of Earth system models (CMIP6) is to enhance model resolution in the hope that it will resolve some of the existing biases in climate models. This puts pressure on the computing time available for representing biological complexity. Our results suggest that intermediate complexity models (such as MEDUSA-2 and Diat-HadOCC) offer a good compromise between the representation of biological complexity (through their inclusion of an iron cycle) and computer time given their relatively good performance in reproducing bulk properties. However, intermediate complexity models are limited in the detail to which they can address climate feedbacks, and it may be that more complex models can in future provide additional insight based on ongoing measurements and data syntheses.
The quest for increasing resolution in ESMs is unlikely to end soon, as the
resolution needed to resolve eddies in the ocean (
This work was funded by the UK Natural Environment Research Council Integrated Marine Biogeochemical Modelling Network to Support UK Earth System Research (i-MarNet) project (NE/K001345/1) and the UK Met Office. MFR was partially funded by the EC FP7 GreenSeas project. Edited by: J. Middelburg