Describing the underlying processes of oceanic plankton dynamics is crucial for determining energy and mass fluxes through an ecosystem and for estimating biogeochemical element cycling. Many planktonic ecosystem models have been developed to resolve major processes so that flux estimates can be derived from numerical simulations. These results depend on the type and number of parameterizations incorporated as model equations, and the values assigned to the respective parameters specify a model's solution. Representative model results are those that can explain data; data assimilation methods are therefore used to yield optimal parameter estimates while fitting model results to data. The central difficulties are that (1) planktonic ecosystem models are imperfect and (2) data are often too sparse to constrain all model parameters. In this review we explore how problems of parameter identification are approached in marine planktonic ecosystem modelling.

We provide background information about model uncertainties and estimation methods, and how these are considered when assessing misfits between observations and model results. We explain differences in evaluating uncertainties in parameter estimation, thereby also discussing issues of parameter identifiability. Aspects of model complexity are addressed, and we describe how results from cross-validation studies provide considerable insight in this respect. Moreover, we discuss approaches that consider time- and space-dependent parameter values. We further discuss the use of dynamical/statistical emulator approaches, and we elucidate issues of parameter identification in global biogeochemical models. Our review reveals many facets of parameter identification: we found many commonalities between the objectives of different approaches, but scientific insight differed between studies. To learn more from the results of planktonic ecosystem models, we recommend finding a good balance in the level of sophistication between mechanistic modelling and the statistical data assimilation treatment used for parameter estimation.

The growth, decay, and interaction of planktonic organisms drive the
transformation and cycling of chemical elements in the ocean. Understanding
the interconnected and complex nature of these processes is critical to
understanding the ecological and biogeochemical function of the system as a
whole. The development of biogeochemical models requires accurate
mathematical descriptions of key physiological and ecological processes, and
their sensitivity to changes in the chemical and physical environment. Such
mathematical descriptions form the basis of integrated dynamical models,
typically composed of a set of differential equations that allow credible
computations of the flux and transformation of energy (light) and mass
(nutrients) within the ecosystem (US Joint Global Ocean Flux Study
Planning Report Number 14,

Generalized mechanistic
descriptions of how energy is absorbed and how mass becomes distributed in an
ecosystem already exist, such as dynamic energy budget models

Dynamical marine, as well as limnic, ecosystem models usually start from a description of the build-up of biomass by photoautotrophic organisms (phytoplankton) as these take up dissolved nutrients from the water column and exploit light energy by photosynthesis. Phytoplankton biomass, as a product of primary production, is subsequently removed by natural mortality (cell lysis due to starvation, senescence, and viral attack), predation by zooplankton, and vertical export away from surface ocean layers via sinking of single or aggregated cells and of fecal pellets. Parameterizations of these three loss processes can be interlinked, e.g. grazing of phytoplankton aggregates by large copepods. Depending on the trophic levels considered in a model, the predation among different zooplankton types (e.g. between herbivores, carnivores, or omnivores) can be explicitly parameterized. Mortality and aggregation of phytoplankton cells and the excretion of organic matter (fecal pellets) by zooplankton act as primary sources of dead particulate organic matter (detritus) that can be exported to depth via sinking. Exudation by phytoplankton and bacteria can be a major source of labile dissolved organic matter that represents diverse substrates for remineralization. The transformation of particulate and dissolved organic matter back to inorganic nutrients is parameterized as hydrolysis and remineralization processes. Often hydrolysis and remineralization are assumed to be proportional to the biomass of heterotrophic bacteria, which is considered in many models. Heterotrophic bacteria remain unresolved in some models where microbial remineralization is parameterized only as a function of concentration and quality of organic substrates.

At some level most models include a parameterization to account for the net effect of higher trophic levels that are not explicitly resolved. This is usually formulated as a closure flux back to the nutrient pools, whose rate simply depends on the biomass of the highest trophic level resolved. These closure assumptions ensure mass conservation while neglecting the actual mass loss to higher trophic levels like fish, which would be subject to fish movements and changes in biomass on multi-annual rather than seasonal timescales. Every marine planktonic ecosystem model can thus be described as a simplification of the dynamics inherent to a system of nutrients, phytoplankton, zooplankton, detritus, dissolved organic matter, and possibly bacteria.
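To make this structure concrete, a minimal nutrient–phytoplankton–zooplankton–detritus (NPZD) model with a closure-like recycling of unassimilated grazing can be sketched as follows. All parameter values are hypothetical placeholders chosen for illustration, not calibrated estimates:

```python
def npzd_step(N, P, Z, D, dt=0.1):
    """One Euler step of a minimal NPZD model (nitrogen units; every
    rate below is an illustrative placeholder, not a calibrated value)."""
    mu_max, k_N = 1.0, 0.5    # max phytoplankton growth rate (1/d), half-saturation
    g_max, k_P = 0.6, 1.0     # max grazing rate (1/d), grazing half-saturation
    m_P, m_Z = 0.05, 0.1      # phytoplankton / zooplankton mortality (1/d)
    r_D = 0.08                # detritus remineralization rate (1/d)
    growth = mu_max * N / (k_N + N) * P        # nutrient-limited primary production
    grazing = g_max * P / (k_P + P) * Z        # zooplankton grazing on phytoplankton
    dN = -growth + r_D * D + 0.3 * grazing     # closure-like recycling of unassimilated grazing
    dP = growth - grazing - m_P * P
    dZ = 0.7 * grazing - m_Z * Z               # 70 % assimilation efficiency
    dD = m_P * P + m_Z * Z - r_D * D
    return N + dt * dN, P + dt * dP, Z + dt * dZ, D + dt * dD

state = (4.0, 1.0, 0.5, 0.2)       # initial N, P, Z, D (mmol N m-3)
total0 = sum(state)
for _ in range(50):
    state = npzd_step(*state)
# Total nitrogen is unchanged: every loss term reappears as a gain elsewhere.
```

Because every loss from one pool appears as a gain in another, total nitrogen is conserved exactly, which is the purpose of the closure assumption.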

In many cases marine ecosystem models are embedded in an existing physical
ocean model set-up that simulates environmental conditions, advection, and
mixing of the biological and chemical state variables. Feedbacks from the
ecosystem model states on physical variables can be relevant

One of the most influential model approaches to studying the nitrogen flux
through such a marine plankton ecosystem at a local site was proposed by

In practice, there are always some fixed model parameters that need to be assigned values, whether they describe the behaviour of fixed plankton functional types or the distributions of traits in a stochastic community. In the end, it is the choice of these parameter values that determines a specific model solution of any ecological or biogeochemical model set-up.

Model solutions of interest are typically those that can simulate and explain
complex data. Model calibration, which can be considered a form of data
assimilation (DA), is the process by which model parameter values are
inferred from the observational data. Optimal parameter values are regarded
as those that generate model results that match observations (minimize the data–model
misfit) but that are also in accordance with the range of values known e.g. from
experiments or from preceding DA studies. To determine optimal parameter
estimates we must account for uncertainties in the data as well as in the
model dynamics, both of which are specified by an error model. Parameter estimates are thus
conditioned by (a) the dynamical model equations, (b) the data, (c) our prior
knowledge about the range of possible parameter values, and (d) the
underlying error model

Situations can occur where model results that are compared with data are
insensitive to variations of some parameters. Values of those parameters
remain unconstrained by the available data, which is a problem of parameter
identifiability. The availability (type and number) of data thus places
limitations on the number of model parameters whose values become
identifiable, and values of some parameters may never be fully constrained.
This in turn sets restrictions on the complexity of plankton interactions
that can be unambiguously constrained during ecosystem model calibration

Much of the literature on DA in oceanography is focussed on state estimation

Thorough reviews of common DA methods applied in marine biogeochemical
modelling are given by

In our review we primarily focus on topics related to parameter identification, thereby including basic aspects of DA. Parameter identification in marine planktonic ecosystem modelling is a wide field and we do not attempt to discuss differences between various DA tools or techniques. Rather, we put emphasis on models, including parameterizations of ecosystem processes, statistical (error) models, model uncertainties, and structural complexity. We adopt and explain mathematical notation that is often used for DA studies in operational meteorology and oceanography. On the one hand, we provide background information that should make the DA literature more accessible. On the other hand, we aim to elucidate typical objectives and common problems encountered when simulating a marine planktonic system. In this manner we hope to foster a mutual understanding between ecologically/biogeochemically and mathematically/statistically motivated studies.

The paper starts with some theoretical background information
(Sect.

The term parameter identification is used broadly to describe parameter
estimation problems, including the specification of uncertainties in
parameter estimates and model parameterizations. It involves the following
procedures.

Parameter sensitivity analyses: the evaluation of how model results change with variations of parameter values.

Parameter estimation: the calibration of model results by adjusting parameter values in light of the data.

Parameter identifiability analyses: the specification of parameter uncertainties in order to reveal structural model deficiencies and shortages in data availability/information.

The prognostic dynamical equations of a marine ecosystem model can be
expressed as a set of difference equations:

For stochastic dynamical models,

To relate the dynamical model output of Eq. (

A general relationship between the true state and model state can be
expressed as

How we interpret and specify Eq. (

The observation vector

The simplest possible example of an observational error model assumes
additive Gaussian errors. Equation (

We now consider how to estimate uncertain parameters

In general the likelihood can be expressed as an integral over probabilities
conditioned on particular values of the model state and true state:

The Bayesian approach encourages us to explicitly quantify our prior
knowledge about the parameter values through the prior

Once the likelihood is formulated and a prior distribution is prescribed,
classical Bayes estimates (BEs) may be computed from posterior mean or
posterior median values of

In maximum likelihood (ML) estimation we seek the parameter values

Historically, Bayesian methods

In some problems, assimilating all the data at once from all available
sampling times can be computationally impractical. This is particularly
likely for models with stochastic dynamics (

To see how this works, suppose we know the probability density

Once the predictive filtering densities

Alternatively,

The various types of filter differ essentially in terms of how the integrals
in Eqs. (

At present there appears to be some ambiguity regarding the term “variational” in the context of DA. It is sometimes used to describe approaches explicitly based on control theory or “inverse methods” that may not include explicit assumptions about error distributions and where cost functions are defined a priori, rather than being derived from statistical or probabilistic models. However, a distribution-free approach seems difficult to recommend in general for marine ecosystem model parameter estimation, given the strong nonlinearity, non-Gaussianity, and relatively weak data constraint often encountered in such problems. Within the marine ecosystem modelling community, the term “variational DA” is often used more broadly to refer to all non-sequential methods that involve the minimization of a cost function, whether or not this is based on a probability model.

In any case, there are some powerful mathematical tools developed for
variational DA that can be applied to minimize cost functions. Adjoint
methods allow the gradient of the cost function with respect to all fitted
parameters to be computed in an extremely efficient manner; see

Much recent interest has focused on combined state and parameter estimation,
whereby model parameters

In some other recent studies emphasis is put on “hierarchical” error models

Another important initiative is the estimation of hyperparameters of the
kinematic error model along with the ecosystem parameters

The choice of a suitable estimation method for marine ecosystem model
parameters should be mainly based on the availability of relevant prior
information, as well as on the basic error assumptions (Eqs.

As a simple but common example, consider a deterministic model with no model
error and data with additive Gaussian observational errors, Eq. (
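Under these assumptions, minimizing the negative log-likelihood is equivalent (up to an additive constant) to minimizing a weighted least-squares cost. A minimal sketch, using a hypothetical one-parameter exponential-growth model and synthetic observations:

```python
import numpy as np

def gaussian_nll(theta, model, t, y, sigma):
    """Negative log-likelihood for independent additive Gaussian errors.
    Up to a constant, this is the familiar weighted least-squares cost."""
    resid = y - model(t, theta)
    return 0.5 * np.sum((resid / sigma) ** 2)

# Hypothetical one-parameter model: exponential biomass increase.
model = lambda t, theta: np.exp(theta * t)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 5.0, 30)
sigma = 0.2
y = model(t, 0.3) + rng.normal(0.0, sigma, t.size)   # synthetic "observations"

# Crude grid search for the growth rate that minimizes the cost.
grid = np.linspace(0.0, 0.6, 601)
costs = [gaussian_nll(th, model, t, y, sigma) for th in grid]
theta_hat = float(grid[int(np.argmin(costs))])
```

In practice the grid search would be replaced by a gradient-based or global optimizer, but the cost function itself is unchanged.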

Alternatively, nonnegativity constraints on the variables and parameters may
lead us to prefer the lognormal observational error model. Likewise, we can
assume lognormal priors for the parameters. In this case the posterior
density becomes
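A minimal sketch of such a lognormal (multiplicative) error model, in which misfits are judged on the log scale and hence in relative terms:

```python
import math

def lognormal_nll(y_obs, y_mod, sigma_log):
    """Negative log-likelihood, up to an additive constant, for a
    multiplicative (lognormal) error model: errors are Gaussian on the
    log scale, so observations and model output stay positive."""
    return sum(0.5 * ((math.log(yo) - math.log(ym)) / sigma_log) ** 2
               for yo, ym in zip(y_obs, y_mod))

# An underestimate by a factor of 1.25 costs (essentially) the same as an
# overestimate by the same factor -- misfit is relative, not absolute.
c_low  = lognormal_nll([1.0], [0.8], 0.3)
c_high = lognormal_nll([1.0], [1.25], 0.3)
```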

We close this section with some cautionary remarks about different
terminology that the reader may encounter in the literature. First, many DA
papers and textbooks start by assuming a certain cost function, based on
variational or optimal control theory, rather than deriving it from a
probabilistic treatment as herein

Deviant parameter estimates of a model may point towards a deficiency in model structure, forcing, or boundary conditions. Estimates of effectively the same parameters may differ between dissimilar plankton ecosystem models, even when those models have been calibrated with the same data and share an identical physical (environmental) set-up. To understand why parameter estimates can differ, it is helpful to unravel some of the basic differences between major parameterizations that describe growth and loss rates of phytoplankton.

A crucial element of most plankton ecosystem models is the description of phytoplankton growth as a function of light, temperature, and nutrient availability. How algal growth is parameterized matters, and the associated parameter values affect, for example, the timing and intensity of a phytoplankton bloom in model solutions.

The build-up of phytoplankton biomass depends on how much of the available
nutrients can be utilized and how much energy can be absorbed from sunlight.
Under nutrient-replete and light-saturated conditions, the carbon fixation
(gross primary production, GPP) reaches a (temperature-dependent) maximum
rate, described as a parameter (

In practice an analogy between

In many marine ecosystem models two separate limitation functions are
combined: one that expresses the photosynthesis vs. light relationship
(P–I curve) and another that describes the dependence between ambient
nutrient concentrations and nutrient uptake. The two functions are similar in
their characteristics, starting from zero (no light or no nutrients) and
approaching saturation at some high light and at replete nutrient
concentration. Three approaches are generally found in marine ecosystem
models to limit algal growth by photosynthesis and nutrient uptake. The first
is to apply Blackman's law
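The contrast between a Blackman-type (minimum-law) combination and a multiplicative combination of limitation terms can be sketched as follows. A Monod nutrient term and a Smith-type P–I form are common choices; all parameter values here are hypothetical:

```python
import math

def monod(N, k_N):
    """Michaelis-Menten / Monod nutrient limitation, in [0, 1]."""
    return N / (k_N + N)

def light_limitation(I, alpha, mu_max):
    """Saturating P-I response (a Smith-type form is one common choice)."""
    return alpha * I / math.sqrt(mu_max ** 2 + (alpha * I) ** 2)

# Hypothetical values: mu_max (1/d), alpha (initial slope), k_N (mmol N m-3)
mu_max, alpha, k_N = 1.2, 0.05, 0.5
N, I = 0.4, 40.0

f_N = monod(N, k_N)
f_I = light_limitation(I, alpha, mu_max)

growth_blackman = mu_max * min(f_N, f_I)   # Blackman: most limiting factor controls
growth_product = mu_max * f_N * f_I        # multiplicative: co-limitation
```

Since both limitation factors lie between 0 and 1, the minimum-law growth rate is never smaller than the multiplicative one; the two parameterizations therefore imply different effective growth rates for identical parameter values.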

In a P–I curve the level of increase from low to high irradiance is
specified by the initial slope parameter (the maximum of the first derivative
of the P–I curve with respect to light), also referred to as photosynthetic
efficiency (

Typical parameterizations of growth limitation by nutrient availability
(ambient nutrient concentrations) are expressed with the half-saturation
constant (

More complex growth dependencies are described with models that consider
intracellular acclimation dynamics

Parameterizations of phytoplankton cell losses involve lysis (starvation and/or viral infection), the aggregation of cells together with all other suspended matter, and grazing by zooplankton. Exudation and leakage are processes of organic matter loss that occur while the physiology of the algae is functional. Cell lysis, exudation, and leakage are usually expressed as a single rate parameter and this loss of organic matter is assumed to be proportional to the phytoplankton biomass.

Parameterizations of phytoplankton losses due to the coagulation and sinking
of phytoplankton and detrital aggregates are essentially derived from
coagulation theory. The application of coagulation
theory to simulate phytoplankton aggregation is well established for models
that resolve size classes of particles (of phytoplankton cells and detritus)
explicitly

A common problem is to find constraints that allow for a clear distinction
between phytoplankton losses due to the export of aggregated cells and the
loss because of grazing. Both processes can be responsible for the drawdown
of phytoplankton biomass, and data that cover the onset, peak, and decline of
a bloom are needed for a possible distinction. How the complex nature of
predator–prey interaction is parameterized remains a critical element of
plankton ecosystem models. Compared to the approaches that describe algal
growth, an even larger number of different parameterizations exist for
grazing

Elaborate analyses of mesozooplankton and microzooplankton biomass, grazing,
and mortality rates were done by

The explicit distinction between zooplankton size classes, like
mesozooplankton and microzooplankton, was bypassed in

Parameter values of acclimation models have typically been adjusted to
explain laboratory measurements

Explicit error assumptions for parameter optimizations and for comparisons of
acclimation model results with laboratory data were introduced by

Collecting diverse data that fully resolve the onset, peak, and decline of an
algal bloom at ocean sites is difficult. Data derived from remote
sensing, e.g. Chl

In contrast to laboratory measurements, data from mesocosm experiments reflect some natural variability of the plankton community, mainly captured by replicate mesocosms. The availability of measurements from replicate mesocosms is also helpful when defining error models that specify the statistical treatment of the data used for parameter estimation.

Error models define our assumptions about uncertainties and the statistical relationships between observed data, the true state, model output, model inputs (forcings and initial/boundary conditions), and model parameters. Here we review error models that have been applied to address the various sources of uncertainty in marine ecosystem models and consider their implications for parameter identification. An explicit treatment of each source of uncertainty may not be necessary, but we do recommend reflecting on how these uncertainties can be accounted for when modelling plankton dynamics and biogeochemical cycles.

The simplest and most common models for observational error assume that the
observational errors

The additive normal assumption (i) is straightforward but also restricted, as
it does not capture three common characteristics of some ecosystem data such
as Chl

For power-normal, gamma, or proportional error assumptions we have the
difficulty that the variance on the original scale approaches zero at low
values. This may be unrealistic, at least in regard to instrumental noise. In
normal models this problem can be addressed by adding a constant term to the
variance

Time evolution of parameter estimates in a simulation test of an
ensemble Kalman filter using untransformed data (

The validity of the constant variance assumption (ii) may be improved by a
scale transformation, although the transformation that best normalizes the
error distribution (see above) may not best promote the homogeneity of
variance. Spatiotemporal variations in the error variance may naturally
occur, for example due to seasonal modulations of the unresolved variability
and hence the representativeness error component. Accounting for this
variation should improve parameter estimates and uncertainty assessment

In some contexts, e.g. mesocosms, the error covariance matrix might be
estimated from experimental replicates prior to fitting the model
(Sect.

The assumption of independent errors between samples and variable types (iii)
can be invalidated in cases where contributions from representativeness error
or kinematic model error are large, or where the data have been derived by
interpolation or application of a regression model. Neglected correlation may
result in parameter estimates that are less efficient (higher variance) and
more strongly correlated (e.g. see example in Sect. 5.4). Pre-averaging the
data can help to promote independence (and normality, via the
central limit theorem), but might also remove some of the informative
variability. One common ad hoc intervention in the cost function is to scale
the residual error variance with the sample size of each data type, to avoid
biasing the fit in favour of better-sampled variables
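A sketch of this ad hoc weighting, in which each data type's summed squared misfit is divided by its own sample size (variable names and values are hypothetical):

```python
import numpy as np

def scaled_cost(residuals_by_type, sigma_by_type):
    """Cost in which each data type's summed squared misfit is divided by
    its own sample size, so heavily sampled types do not dominate the fit."""
    total = 0.0
    for vtype, resid in residuals_by_type.items():
        resid = np.asarray(resid, dtype=float)
        total += np.sum((resid / sigma_by_type[vtype]) ** 2) / resid.size
    return float(total)

# Hypothetical misfits: many chlorophyll samples but few zooplankton samples.
resids = {"chl": [0.1] * 100, "zoo": [0.1] * 4}
sigmas = {"chl": 0.2, "zoo": 0.2}
cost = scaled_cost(resids, sigmas)   # both types now contribute equally
```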

Whatever the assumptions of the observational/residual error model, it is
possible to test their validity using the assimilated data, either by
analysing the residuals and performing lack-of-fit tests
(

Finally, we caution that certain interpolated or derived data may strictly
invalidate the observational error model, not only due to error correlation
(see above), but also due to the introduction of

Prior uncertainty plays an important role in estimating model parameters.
Typically, there is not enough information in the assimilated data to
constrain all parameters of a biogeochemical model. The results may well be
sensitive to the “error model of prior uncertainty”. Prior uncertainty can
be represented by prior probability densities in Bayesian approaches or
plausible ranges in non-Bayesian approaches. To account for nonnegativity
constraints, prior distributions typically include lognormal

Quantifying the prior uncertainty in

When posterior uncertainty becomes unacceptably high, it can be reduced by
reducing the prior uncertainty in

Dynamical marine ecosystem models are usually specified by differential
equations that are first-order in time, and their solution therefore requires
one initial condition (IC) for each state variable at every grid cell or
spatial location in the model. These inputs are, in general, uncertain, and liable to impact the
model output, at least during a transient relaxation period, or indefinitely
if the uncertainty spans more than one basin of attraction of the dynamical
system or if the model dynamics are chaotic

In some cases it is possible to neglect IC error because of accurate
measurements, or because a steady state (equilibrium or seasonal cycle) that
is only sensitive to

In non-spatial (0-D) models, IC errors have been modelled as both fixed
parameters

Marine ecosystem models are usually modulated by time- and space-dependent
environmental drivers (forcings) and boundary conditions that are not
predicted by the model dynamics but are necessary inputs to determine the
evolution of the model state variables. Studies have demonstrated the
sensitivity of biogeochemical variables to errors in bottom-up forcings such
as wind stress and vertical mixing

There are basically two approaches to modelling the effects of BC/forcing
error: (1) to consider individual or net impacts on model dynamics as
dynamical model errors (

The kinematic approach (Eq.

In either case, BC/forcing error models may fall short in describing
potential errors in

For some problems, in particular for chaotic systems, the phase noise may be
too intense or ill-defined to allow effective use of a parametric phase lag
model. A better approach here might be to use a “synthetic likelihood”

Even with perfectly known parameters, forcings, and initial/boundary conditions, we would still not expect the modelled fluxes such as primary productivity and grazing to perfectly reproduce the true fluxes, or the state variables to perfectly follow the true variability. Aggregation of species into model functional groups, effects of finite spatial and temporal resolution, and inherent approximations in the flux parameterizations and model structure may all contribute to “structural error” in the model dynamics.

One promising approach to account for structural error is to add stochastic
noise (dynamical model errors) to the ecosystem model
parameters

We note that some structural errors may impose persistent or intermittent
biases in the model output that may not be amenable to a simple statistical
description. For example, a succession in blooming phytoplankton species
might extend or multiply the bloom periods in ways that are not “random”
and that are difficult to reconcile with a single model functional group,
even with stochastic parameters. Limited spatial resolution can also impose
persistent biases that lead to poor extrapolation properties when we try to
correct them by adjusting

An alternative approach might be to employ the tools of multimodel inference

The determination of parameter uncertainties has many facets, reaching to the
core of the debate between Bayesian and frequentist approaches and
interpretations

In general, if we wish to make inference about uncertainties of parameter
estimates (

For an unbiased ML estimator, the

Uncertainty regions in parameter space can be determined in basically two
different ways, based on either Bayesian or frequentist interpretations.
According to the Bayesian interpretation, a credible region is specified by the
conditional probability distribution of the true value given the data. For
maximizations of the likelihood

In case of classical BEs no tolerance limit

A fundamentally different approach to the BE methods is to repeat parameter
optimizations many times, each with a different data subsample or resampled
data set. Large data sets are split into a series of subsamples that should be
as independent as possible, or many synthetic data sets are created by using
a random number generator to independently draw bootstrap samples
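The bootstrap idea can be sketched with a deliberately simple stand-in estimator. Here the "optimization" is just a sample mean; in practice each refit would be a full parameter optimization:

```python
import random

def fit_mean(sample):
    """Stand-in 'parameter optimization': the estimator here is just the
    sample mean of one observed quantity."""
    return sum(sample) / len(sample)

def bootstrap_estimates(data, n_boot=2000, seed=0):
    """Refit to many data sets resampled with replacement, approximating
    the sampling distribution of the parameter estimate."""
    rng = random.Random(seed)
    return [fit_mean([rng.choice(data) for _ in data]) for _ in range(n_boot)]

# Hypothetical observations of a single quantity.
data = [0.8, 1.1, 0.9, 1.3, 1.0, 1.2, 0.7, 1.1, 0.95, 1.05]
boots = sorted(bootstrap_estimates(data))
ci_low, ci_high = boots[50], boots[1950]   # approximate 95 % interval
```

The spread of the bootstrap estimates approximates the sampling uncertainty of the estimator without requiring any distributional assumption beyond independence of the resampled units.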

An alternative to ensemble-based sequential, MCMC, and bootstrap methods for
determining uncertainties of parameter estimates is the construction of 1-D
or 2-D profile likelihoods
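A 1-D profile likelihood (here expressed as a profile cost) fixes the parameter of interest on a grid and minimizes over the remaining nuisance parameters at each grid point. A sketch with a hypothetical quadratic cost with correlated parameters:

```python
import numpy as np

def profile_cost(theta1_grid, cost, theta2_grid):
    """1-D profile: for each fixed value of the parameter of interest,
    minimize the cost over the remaining (nuisance) parameter."""
    return np.array([min(cost(t1, t2) for t2 in theta2_grid)
                     for t1 in theta1_grid])

# Hypothetical quadratic cost with correlated parameters a and b.
cost = lambda a, b: (a - 1.0) ** 2 + (b - 2.0) ** 2 + 1.2 * (a - 1.0) * (b - 2.0)
a_grid = np.linspace(0.0, 2.0, 201)
b_grid = np.linspace(1.0, 3.0, 201)
prof = profile_cost(a_grid, cost, b_grid)
a_best = float(a_grid[int(np.argmin(prof))])
```

The width of the profile around its minimum indicates how well the parameter of interest is constrained once all other parameters are allowed to compensate.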

A single point in parameter space is identified by ML and MAP estimators,
i.e.

A first approach considers a linearization (first-order Taylor expansion) of
the model's observation vector

Another more common approach for a point-wise approximation of

Hessian matrices have often been approximated with a finite central
differences approach for first and second derivatives of

The problem of choosing the increment size is reduced if first derivatives of

Computations of the Hessian, Eq. (

Ideally, every eigenvector would have only a single nonzero component, meaning
that the value of every parameter could be estimated independently of the other
parameters' values. In practice this is the case for only a few parameters of a
planktonic ecosystem model. Eigenvectors with two or more distinct components
disclose those parameters whose estimated values are correlated and for which
correlation coefficients can be explicitly derived
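A sketch of this procedure: approximate the Hessian of the cost function by central finite differences, then inspect its eigendecomposition and the correlation implied by the inverse Hessian. The quadratic two-parameter cost used here is hypothetical:

```python
import numpy as np

def hessian_fd(cost, theta, h=1e-4):
    """Central finite-difference approximation of the Hessian of a
    scalar cost function at parameter vector theta."""
    n = len(theta)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            tpp = theta.copy(); tpp[i] += h; tpp[j] += h
            tpm = theta.copy(); tpm[i] += h; tpm[j] -= h
            tmp = theta.copy(); tmp[i] -= h; tmp[j] += h
            tmm = theta.copy(); tmm[i] -= h; tmm[j] -= h
            H[i, j] = (cost(tpp) - cost(tpm) - cost(tmp) + cost(tmm)) / (4 * h * h)
    return H

# Hypothetical cost with strongly correlated parameters.
cost = lambda th: (th[0] - 1.0) ** 2 + (th[1] - 2.0) ** 2 \
                  + 1.8 * (th[0] - 1.0) * (th[1] - 2.0)
H = hessian_fd(cost, np.array([1.0, 2.0]))
eigvals, eigvecs = np.linalg.eigh(H)       # ascending eigenvalues

# Parameter correlation from the inverse Hessian (~ covariance matrix):
C = np.linalg.inv(H)
corr = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])
```

The small eigenvalue marks a poorly constrained direction in parameter space, and the strong off-diagonal correlation shows that only a combination of the two parameters is well determined by the cost function.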

In Sect.

Cost function contours when varying values of a combination of two
parameters

Details of the cost functions and the corresponding mapping from model
results

Figure

Another peculiarity is that the ranges of the MCMC's posterior indicate
larger uncertainties if the cost function without covariance information is
applied (right side of Fig.

Overall, these results exemplify the uncertainty in constraining major loss
parameters in the presence of grazing, if no explicit prior information about
grazing rates or data of zooplankton biomass are available. Collinearities
between grazing parameters and other phytoplankton biomass losses may be
reduced by testing model performance against independent data, e.g. as done
for the mesozooplankton and microzooplankton grazing in

Good performance should be attributable to a model capturing the predominant
plankton dynamics under varying conditions in different environments.
Parameter values are often optimized for local ocean sites, but ideally,
parameter estimates from one site should improve model performance at other
locations as well. The generality of optimized models can be tested by
cross-validating against independent data, providing a direct and effective
test of predictive skill

Parameter optimizations can often improve the fit of a model by selecting unrepresentative parameter values that serve only to compensate for misfits between data and model results. It is therefore essential to check whether the resultant "optimized" model is giving the right answer for the right reasons.

This is the principle of cross-validation, in which an optimized model is
tested in terms of its ability to reproduce data that were not included in
the calibration phase. This is often achieved by excluding a subset of the
original calibration data set, for later use in model evaluation. For
example, in a variational data assimilation exercise for the Arabian Sea,
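A minimal hold-out sketch of this idea, with a hypothetical linear model standing in for an ecosystem model (calibrate on a training subset, score on withheld data):

```python
import numpy as np

def holdout_skill(t, y, fit, predict, test_frac=0.3, seed=0):
    """Calibrate on a training subset and score misfit on held-out data,
    a minimal form of cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(t.size)
    n_test = int(test_frac * t.size)
    test, train = idx[:n_test], idx[n_test:]
    theta = fit(t[train], y[train])
    resid = y[test] - predict(t[test], theta)
    return float(np.sqrt(np.mean(resid ** 2)))   # RMSE on unseen data

# Hypothetical linear 'model' calibrated by least squares.
fit = lambda t, y: np.polyfit(t, y, 1)
predict = lambda t, theta: np.polyval(theta, t)

rng = np.random.default_rng(42)
t = np.linspace(0.0, 10.0, 50)
y = 0.5 * t + 1.0 + rng.normal(0.0, 0.2, t.size)   # synthetic "observations"
rmse = holdout_skill(t, y, fit, predict)
```

For time-series data a random split like this is usually too optimistic because of temporal autocorrelation; withholding contiguous blocks or entire variables is closer to the practice described in the text.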

The cross-validation approach has the advantage of testing one of the key
attributes of marine biogeochemical models, namely their predictive skill.
The technique is, however, not without its difficulties. The first issue is
that it is important to ensure the test data are truly independent of the
training data. In this regard,

The potential to select unrealistic, compensatory, parameter values may not
always be obvious, especially if good estimates of the “true” (or at least
sensible) values of the model parameters are not well known a priori. Such
errors may, nonetheless, strongly impact the ability of a model to reproduce
anything but the assimilated data. This issue appears to be a common theme in
simple marine biogeochemical models calibrated to time-series data, as a
number of studies

Of the many factors that affect the ability of a biogeochemical model to reproduce and predict observations, the appropriate degree of model complexity in any given situation is both one of the most important, and one of the least well defined. This is because there exists a fundamental trade-off between simplicity and complexity. Simple models have the advantage of being easier to understand, and with fewer parameters they should also be better constrained (both before and after optimization). Nonetheless, simplification requires a degree of abstraction, and it can sometimes be difficult to draw parallels with the complexities of the observed system.

At the other end of the spectrum, a highly complex model can explicitly
resolve more processes, allowing more detailed comparison with observations.
As models become more complex, the number of degrees of freedom increases,
and the calibrated model will generally be able to match the observations
better than a simpler model. If insufficient observations are available, the
extra degrees of freedom can lead to the introduction of compensatory errors
at the assimilation site, which could then increase uncertainty at other
locations, as illustrated by

A range of statistical techniques are available to assess this trade-off, and
a useful review is given by

Aside from directly assessing a model's predictive skill using
cross-validation, a number of alternative approaches are available to
identify the minimum number of model parameters that are supported by the
available data. One of the simplest techniques (in terms of its
applicability) is the Akaike information criterion (AIC,
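Computing the AIC itself is straightforward once a (negative) log-likelihood is available; the fit values below are hypothetical:

```python
def aic(neg_log_likelihood, n_params):
    """Akaike information criterion: twice the number of fitted
    parameters penalizes complexity against goodness of fit."""
    return 2.0 * n_params + 2.0 * neg_log_likelihood

# Hypothetical comparison: a 5-parameter model fits slightly better
# (lower negative log-likelihood) than a 3-parameter model.
aic_simple  = aic(neg_log_likelihood=120.0, n_params=3)   # = 246.0
aic_complex = aic(neg_log_likelihood=119.5, n_params=5)   # = 249.0
```

Here the simpler model attains the lower AIC: its slightly worse fit does not justify two extra parameters.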

Predictive skill for five ecosystem models of different complexity,
after assimilation of satellite data (black) and after assimilation of
satellite data with 20 % added noise (grey)

A perhaps more intuitive approach is given by the likelihood ratio test (LRT)
for comparing, for example, so-called nested models, in which the simpler model is a
special case of the more complex model, in the sense that
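For nested least-squares models with Gaussian errors, the test statistic can be written as D = n ln(RSS_simple/RSS_complex), compared against a chi-squared distribution with as many degrees of freedom as there are extra parameters. A small sketch with made-up residual sums of squares:

```python
import numpy as np
from scipy.stats import chi2

def lrt_pvalue(rss_simple, rss_complex, n_obs, extra_params):
    """Likelihood ratio test for nested least-squares models with
    Gaussian errors: D = n*ln(RSS_simple/RSS_complex) ~ chi2(extra_params)
    under the null hypothesis that the simpler model is adequate."""
    D = n_obs * np.log(rss_simple / rss_complex)
    return chi2.sf(D, df=extra_params)

# Hypothetical numbers: the complex model has 2 extra parameters and
# reduces the residual sum of squares from 4.1 to 3.2 over 60 observations.
p = lrt_pvalue(4.1, 3.2, 60, 2)
print(round(p, 4))  # a small p-value means the extra complexity is supported
```

A p-value below the chosen significance level rejects the simpler model; otherwise the extra parameters are not warranted by the data.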

The theory mentioned above is well described by

Theoretical arguments, as well as results from cross-validations, have
revealed problems with the portability of locally calibrated models

Model selection metrics at sites of the Bermuda Atlantic Time-series
Study (BATS) and the North Atlantic Bloom Experiment (NABE), as a function of
complexity across a suite of nested ocean biogeochemical models

For spatial or temporal variation to be useful we have to make sure that the
corresponding parameter adjustments reflect changes in the actual underlying
(real-world) dynamics. Assessing whether this condition is met is a
particularly challenging problem that has yet to be adequately addressed.
Direct comparisons are needed between optimizations that allow variation in
posterior parameter vectors and those that do not. In studies where direct
comparisons are made, a common finding is a reduction in the model misfit to
the assimilated data by allowing these kinds of variations, but this tells us
little. A reduction of the cost function is expected, as a direct consequence
of an effective increase in the number of adjustable parameters. As pointed
out by

Switching between different parameter sets in time or for
specific regions may not necessarily be a solution per se but may indicate
where model refinements have to be investigated

Satellite ocean colour data are widely used to investigate spatial
differences in parameter estimates. In many cases, a local calibration method
is applied where parameters are optimized separately to fit Chl

Pronounced regional and seasonal differences are not restricted to adjacent
seas and coastal areas. Large-scale studies for the North Atlantic have shown
comparably strong regional differences between parameter estimates

Patterns of spatial variation in parameters are not easily validated as most parameters do not
have well-observed equivalents in nature. Nevertheless,

The presence of parameter variation between sites or regions for which a
model was calibrated independently does not refute the existence of a common
parameter vector with which the model could achieve similar results.

Spatially varying estimates for the phytoplankton maximum growth
rate (

There is a clear advantage of combining sites or regions, in that it makes
more data available to constrain parameters. It also creates a representative
sample for the domain of interest, reducing the risk of over-fitting. In
contrast, when assimilating data at a single site,

Application of the method to the North Atlantic data set used by

Geographic extent of the two sub-domains giving the optimal
calibration in the split-domain calibration study of

The idea of representing seasonal variation in part by temporal variations in
the parameters has been examined in various studies

In a more recent BATS assimilation study with a simpler NPZD model,

An experiment allowing both time and space variation in biogeochemical
parameters that includes cross-validation is presented by

As shown in this section, a variety of approaches have been explored for DA with parameters varying in space or time or both. We conclude the section by considering what might be learnt from these types of studies. A common finding is that the posterior misfit cost with respect to the assimilated data is reduced by allowing variation, but this provides no evidence in itself to support the case for parameter variation. Allowing parameter variation increases the number of parameter values to be optimized, making it easier to fit a given data set.

Goodness-of-fit statistics that penalize model complexity in terms of number
of parameters (e.g. the F-score of

Allowing parameters to vary reduces the extent to which their values can be
constrained by a given set of observations, making an already
under-determined problem worse. It could therefore be argued that parameter
variation is justified only when there is good evidence to infer that a given
model cannot adequately represent the observed variability under the uniform
parameter vector constraint. The evidence should be statistically robust,
taking into account all relevant sources of uncertainty. The consideration of
these additional uncertainties, motivated by its potential for improving
parameter estimates

Heterogeneity in the parameter vector is most likely to be useful for
structurally simple models. Those models may lack the required flexibility to
capture some distinct spatial features observed within large domains or they
may fail to resolve specific events during a complete annual cycle.
Introducing parameter heterogeneity may be a sensible alternative to
increasing structural complexity, as it does not increase the computational demands of 3-D
simulations. From an ecological point of view, the need to introduce space
and time variations in parameter values reflects limitations in resolving
physical environmental changes, or deficiencies in physiological or
ecological processes, or all of these factors together. For example,
variations in plankton elemental stoichiometry, e.g. variable
Chl

If good reasons are found to support the use of parameter variation for model
improvement, then the issue of how to benefit from this spatio-temporal
information must be addressed. Spatially varying parameters can be applied
directly in 3-D models

Systematic approaches for parameter optimization that were successfully
applied in 0-D or 1-D set-ups may become too costly as resolution in space
is increased and if the time period for integration is prolonged. This is the
case when spatially 3-D models with high resolution or steady annual cycles
(i.e. periodic solutions) are considered. For the computation of a steady
annual cycle (or fixed point) typically thousands of years of model time are
necessary, which may result in a number of time steps in the order
of
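The fixed-point character of a steady annual cycle can be illustrated with a toy sketch in which one "model year" is a simple relaxation map; the tracer values and relaxation rate below are entirely hypothetical, but the repeated-integration structure mirrors the long spin-up of a real model:

```python
import numpy as np

def annual_map(state, relaxation=0.05):
    """Toy stand-in for one model year: relax the tracer state toward a
    (hypothetical) equilibrium set by circulation and biogeochemistry."""
    target = np.array([2.0, 0.5, 0.1])  # assumed equilibrium tracer values
    return state + relaxation * (target - state)

def spin_up(state, tol=1e-10, max_years=20000):
    """Fixed-point (spin-up) iteration: repeat annual cycles until the
    year-to-year change drops below tol, mimicking the thousands of model
    years needed to reach a steady annual cycle."""
    for year in range(1, max_years + 1):
        new_state = annual_map(state)
        if np.max(np.abs(new_state - state)) < tol:
            return new_state, year
        state = new_state
    return state, max_years

steady, years = spin_up(np.zeros(3))
print(years, steady.round(3))
```

Even this trivially cheap map needs hundreds of "years" to converge; for a real 3-D model every iteration is a full annual integration, which is what makes direct optimization so expensive.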

The application of emulators has emerged in many different fields of science
and thus the theoretical background is relatively well developed

A dynamic emulator (or reduced order or surrogate model) is a substitute for
the original model

Dynamical or physical emulators are based on a simplified model
version (

When using dynamic emulators, it is often insufficient to take the output of
the faster but less accurate coarse model during optimization, because the
accuracy of the coarse model

The alignment operator in optimization step

The method was successfully applied for parameter identification of a
transient 1-D configuration with an NPZD ecosystem model and for periodic
states with climatological forcing in a 3-D setting in a N-based model with
dissolved organic phosphorus (DOP)

In contrast to a dynamical emulator, statistical emulators relate the input
parameters statistically to the model output and thus to
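A minimal sketch of a statistical emulator, assuming a Gaussian-process (squared-exponential kernel) interpolation of a one-parameter cost function; the "full model" here is a cheap stand-in and the kernel settings are illustrative only:

```python
import numpy as np

def rbf_kernel(a, b, length=0.2, variance=1.0):
    """Squared-exponential covariance between parameter vectors."""
    d2 = np.sum((a[:, None, :] - b[None, :, :])**2, axis=-1)
    return variance * np.exp(-0.5 * d2 / length**2)

# Hypothetical "full model" summary: a smooth cost function of one
# parameter, evaluated for a small set of expensive training runs.
def full_model_cost(theta):
    return (theta - 0.6)**2 + 0.1 * np.sin(3 * theta)

theta_train = np.linspace(0.0, 1.0, 8)[:, None]   # training designs
y_train = full_model_cost(theta_train[:, 0])

# Gaussian-process emulator: posterior mean at new parameter values
K = rbf_kernel(theta_train, theta_train) + 1e-8 * np.eye(8)
alpha = np.linalg.solve(K, y_train)

def emulate_cost(theta_new):
    Ks = rbf_kernel(np.atleast_2d(theta_new), theta_train)
    return Ks @ alpha

theta_new = np.array([[0.55]])
print(emulate_cost(theta_new)[0], full_model_cost(0.55))
```

Once trained, evaluating the emulator costs microseconds, so an optimizer or MCMC sampler can query it freely in place of the full model.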

Simulated

In Fig.

Figure

Another example for statistical emulation in biogeochemical modelling is
presented by

Another study of

While emulations based on statistical approaches are comparatively fast, such methods rely on sufficiently large sets of training data (i.e. full model simulations). Generating such training data can be costly, especially for 3-D models with high spatial resolution. To overcome this problem, one might consider a combination of statistical and dynamical emulators.

A two-stage emulation process is suggested by

The ultimate aim of the two-stage procedure would be to use a sufficiently large
number of state estimates of the model, based on a (sufficiently precise) dynamical emulator,
for the construction of a statistical emulator for a cost function or similar metric. The dynamical
emulator would effectively bridge the gap between a small reference ensemble
that is practical to generate with the full 3-D model and the statistical
emulator that requires a relatively large training set. The respective metric
must incorporate an error model that takes into account all sources of
uncertainty in the statistical emulation of the full model. Thus, the
uncertainty estimates obtained when training the statistical emulator must be
inflated by combining them with the dynamical emulator's own uncertainty
estimates. Stage 1 emulation results suggest that it may be important to
first extend the latter to include temporal covariance estimates for the
parametric uncertainty associated with the averaged 3-D model output used.
Another important consideration is that global 3-D models require long spin-up
times to overcome an initial model drift (see Sect.

Global biogeochemical ocean models are commonly used to investigate the
mutual interactions between ocean biota and climate change, a famous example
being coupled Earth system models (ESMs) applied in the fifth assessment of
the Intergovernmental Panel on Climate Change

A major challenge in calibrating biogeochemical models on global scale is
that the simulations require many millennia until tracer distributions are in
equilibrium with the given circulation field and the biogeochemical processes

Attaining equilibrated biogeochemical cycling requires considerable
computational time, which makes it particularly difficult to employ methods
that explore the parameter–cost function manifold with a large ensemble of
model runs, such as MCMC. The derivation and application of emulators,
as described in Sect.
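The expense of MCMC stems from needing one full model evaluation per proposed sample. A random-walk Metropolis sketch over a toy quadratic misfit (all numbers hypothetical) makes this explicit:

```python
import numpy as np

def metropolis(cost, theta0, n_samples=5000, step=0.1, seed=0):
    """Random-walk Metropolis sampler of the posterior exp(-cost(theta)).
    Every iteration requires one cost evaluation, i.e. one full model
    run, which is what makes MCMC expensive for global models."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    c = cost(theta)
    chain = []
    for _ in range(n_samples):
        proposal = theta + step * rng.normal(size=theta.shape)
        c_prop = cost(proposal)
        if np.log(rng.random()) < c - c_prop:  # accept with prob. exp(c - c_prop)
            theta, c = proposal, c_prop
        chain.append(theta.copy())
    return np.array(chain)

# Toy quadratic misfit: the posterior is Gaussian around theta = 0.7
chain = metropolis(lambda t: 0.5 * np.sum((t - 0.7)**2) / 0.05**2,
                   np.array([0.0]))
burned = chain[1000:]  # discard burn-in
print(burned.mean(), burned.std())
```

Thousands of cost evaluations are needed even for this one-parameter toy; replacing `cost` with a multi-millennia spin-up of a global model is clearly impractical without an emulator.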

Some speed-up of long-term model simulations can also be achieved with an
appropriate balance between a model's spatial resolution and the complexity
of biogeochemical tracer dynamics, as approached by

Some DA applications may not require equilibrated tracer dynamics to maintain
steady seasonal cycles, e.g. when applying sequential DA approaches with
recurrent analyses steps and corrections of the simulated state variables. An
example is the study of

In summary, various procedures exist for calibrating large-scale and global biogeochemical ocean circulation models, but they are presently limited by the computational time required to approach equilibrated steady cycles in biogeochemical tracer distributions. The limited availability of data on the global scale further restricts the observational constraints for parameter identification of global biogeochemical models.

With regard to the ocean's key role in global carbon cycling and hence for
the climate system, four different types of data are typically considered for
assessing and calibrating global biogeochemical ocean models: (i) data of
dissolved inorganic tracers, e.g. distributions of nutrients, oxygen,
alkalinity, and dissolved inorganic carbon, (ii) data products derived from
remote sensing measurements, e.g. of chlorophyll

For the calibration and assessment of large-scale or global biogeochemical
models, many studies resort to using climatological data sets, e.g. of
nutrients and oxygen, components of the carbonate system

One reason for the fallback to rather basic data types such as climatological
nutrient concentrations for global model evaluation is the sparse
distribution of open-ocean in situ observations. One example is the scarcity
of global microzooplankton biomass observations in the ocean, as depicted in

Ocean measurements of rates are particularly valuable, but these may not be
straightforward to accomplish, e.g. isotopic measurements on a research
vessel. Some rate measurements may also suffer from large methodological
uncertainties, e.g. measurements of nitrogen fixation. Of similar value
are observations of oceanic particle flux,
as obtained from sediment traps or from optical methods

The joint effect of particle flux and remineralization is often described by
one or two parameters in global models. Early models referred to an
exponential function of remineralization with depth
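Two widely used one- or two-parameter descriptions are an exponential profile with a remineralization length scale and the power-law "Martin curve", F(z) = F(z0) * (z/z0)^(-b). The sketch below evaluates both with illustrative flux values; b = 0.858 is the commonly cited open-ocean estimate, while the 300 m length scale is hypothetical:

```python
import numpy as np

def flux_exponential(z, f0=10.0, z0=100.0, scale=300.0):
    """Exponential attenuation of particle flux below the export depth z0,
    with a remineralization length scale (300 m here, hypothetical)."""
    return f0 * np.exp(-(z - z0) / scale)

def flux_martin(z, f0=10.0, z0=100.0, b=0.858):
    """Power-law ("Martin curve") flux profile, F(z) = F(z0)*(z/z0)**(-b);
    b = 0.858 is the commonly cited open-ocean value."""
    return f0 * (z / z0)**(-b)

z = np.array([100.0, 500.0, 1000.0, 4000.0])  # depths in metres
print(flux_exponential(z).round(2))
print(flux_martin(z).round(2))
```

Note how differently the two profiles behave at depth: the exponential form transfers essentially no flux below a few length scales, while the power law sustains a small but non-negligible flux to the deep ocean, with corresponding consequences for simulated nutrient and oxygen distributions.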

Under steady state conditions

So far few attempts have been made to systematically calibrate
parameterizations of particle export and remineralization in global
biogeochemical models.

Projections from the parameter–cost function manifold
(

In a recent study of

In the presence of very diverse timescales and space scales, which is typical
in global biogeochemical ocean modelling, the selection of data sets and the
definition of the error model strongly affect parameter identification. We
also stress that parameter estimates of global biogeochemical modelling
studies are conditioned by the applied circulation, which can have a large
impact on simulated tracer fields

A typical large-scale application of marine biogeochemical models is their
use in ESMs from which projections of future climate change can be derived
for different emission and land-use scenarios. Output of such models helps to
inform scientists, but also society and policymakers about possible
consequences of human action on the climate system. A key example is the most
recent assessment report of the IPCC that featured ESMs with fully
interactive carbon cycles

A comprehensive attempt to account for uncertainties in the models when
determining likelihoods of reaching certain climate goals, like the
politically widely accepted 2

Note that reproducing the current climate state is merely a necessary condition for model skill, but may not constrain the model's ability to correctly simulate the sensitivity to natural or anthropogenic environmental change. Observational information on past climate change, such as glacial–interglacial changes, may help to better constrain the models' sensitivity to changing environmental conditions, even though no historical analogue of the current anthropogenic perturbation is known in terms of the rapid rate of change. Still, any information about model sensitivities to applied perturbations is extremely valuable, be it derived from laboratory or mesocosm experiments or from historical information. DA is a promising tool for combining such information on very different space scales and timescales and for developing an improved understanding of how the Earth system works and may respond to ongoing environmental change.

The survey of

The theoretical backbone for studies of parameter estimation and uncertainty builds first of all on how model errors and observational errors are treated. Specifying the error model is an essential first step in the workflow of parameter identification, enabling the subsequent derivation of conditional probabilities and cost functions. Our review shows that there is no ultimate standard error model or procedure but a meaningful practice is to become explicit about these errors and to reconsider the underlying assumptions for discussions of parameter estimates and model results. Whether the DA approach conserves mass and/or energy is relevant in this respect, depending on the scientific problem addressed. Some ecosystem model applications may not critically depend on mass conservation, e.g. when simulating plankton growth to act as food source in regional simulations of fish stock size and recruitment. In biogeochemical applications the conservation of mass can be essential, in particular for large-scale or global ocean applications.
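As a concrete example of the simplest such error model, independent Gaussian observational errors lead to a weighted least-squares cost; the sketch below uses made-up observations and error standard deviations:

```python
import numpy as np

def misfit_cost(model, obs, sigma):
    """Weighted least-squares misfit assuming independent Gaussian
    observational errors with standard deviations sigma. This is the
    simplest error model; real applications may require correlated
    or non-Gaussian error structures."""
    return 0.5 * np.sum(((model - obs) / sigma)**2)

obs = np.array([1.2, 3.4, 2.8])      # hypothetical observations
sigma = np.array([0.2, 0.5, 0.4])    # assumed observational errors
model = np.array([1.0, 3.0, 3.0])    # simulated counterparts
print(misfit_cost(model, obs, sigma))
```

Every assumption encoded in `sigma` (and in the independence of the errors) propagates directly into the parameter estimates and their uncertainties, which is why being explicit about the error model matters.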

As in many other fields of science, the basic estimation methods considered in plankton ecosystem DA studies are Bayesian estimation and maximum likelihood (ML). Their major differences lie in how prior information enters the DA approach and how estimates and uncertainties are evaluated. The consideration of prior parameter values from preceding studies is meaningful and likely alleviates parameter identification problems. A drawback then is that asymptotic (point-wise) approximations of posterior uncertainty covariance matrices, as described herein, may not apply. But when the model parameters in question have been estimated before in a number of comparable settings, it may seem a tragic waste of effort and information to pursue an ML approach without prior information. A similar issue arises in specifying an "ignorance" prior, and the choice of using Bayesian estimation when no prior information is available can also be questioned.

We included a section on typical basic parameterizations of plankton models,
mainly to stress that the treatment of light- and nutrient limitation may
differ between modelling studies. Furthermore, we touched on the problem of
resolving phytoplankton losses specified by e.g. grazing and aggregation
parameters. The latest plankton growth models account for physiological
acclimation effects, responsible for variations between carbon fixation,
cellular allocation of nitrogen and phosphorus, and Chl

Many acclimation- or optimality-based models have been qualitatively calibrated with data from laboratory experiments. DA approaches for parameter estimation were only done in a few of these studies. Going from laboratory data to the assimilation of data from mesocosm experiments can be a useful intermediate step for testing e.g. acclimation or adaptive models and for assessing uncertainty ranges of parameter values. In this respect, parameter estimates of one experiment can be used for cross-validation with data of another independent mesocosm experiment. On the one hand, simulations of the physical environment of mesocosms are easier to implement, compared e.g. to setting up a 1-D model for an ocean site. On the other hand, parameter estimates obtained from the assimilation of mesocosm data might not be representative for ocean simulations. Although more difficult, model cross-validations between different ocean sites or regions provide valuable insight, eventually specifying a model's predictive skill under oceanic conditions.

Some studies have shown that an increase in model complexity may not automatically improve predictive skill. This can be partially attributed to over-fitting, which can yield parameter estimates that improve model–data misfits at one site but induce unreasonable model results at other ocean sites. Such results illustrate the vital role played by well-designed cross-validation experiments. A critical element of cross-validation is whether the assimilated data are truly independent from the data used for testing model skill. This is, for instance, not typically the case if observations from different years but of the same characteristic region are used unless inter-annual variability dominates over the repeating seasonal dynamics. Regional differences between parameter estimates are informative and have the potential to reveal a model's limitations in a way that can suggest improvements.

Parameter identification becomes more difficult as we go from local and regional-scale to large-scale and global model simulations. Algorithms for parameter optimization require multiple model evaluations, which can be computationally expensive for global biogeochemical models. The procedure for optimizing parameter values can be accelerated with the application of an emulator. We discussed the use of dynamical and statistical emulators. A dynamical emulator is a simpler representation of a computationally expensive full model, approximating the underlying model dynamics. A statistical emulator interpolates model output from a set of training runs with different values assigned to the parameter vector. Based on the derived statistics it can be applied to approximate unknown model output for other input parameters. Both emulator approaches have been shown to efficiently support the search for optimal parameter values. The development and use of emulators of biogeochemical models will likely gain in importance along with improved computer performance. A promising approach is to apply models with coarser resolution or a series of 1-D models (distributed over ocean regions) as dynamical emulators for 3-D global biogeochemical model simulations. Studies have shown that sufficient accuracy of the emulator can be achieved with repeated intermediate alignments of the dynamical emulator. Alternatively, differences between 1-D and 3-D results can be statistically quantified as emulator uncertainty, which affects the parameter search and can be used to modify the emulator-based cost function.

Parameter identification in global marine biogeochemical circulation models is still in its infancy, due to the high computational requirements, the huge range of spatial and temporal scales to be covered, and the comparatively sparse spatio-temporal distribution of data in the ocean. In contrast to local optimizations, the consideration of all relevant spatial and temporal scales has one major advantage: it provides the opportunity to rigorously test and benchmark biogeochemical models. In addition to the tasks and complications mentioned in our review, care must be taken in the selection of appropriate data sets, assuring their relevance (or potential) for answering the questions posed. Moreover, a critical evaluation of the respective roles of physics, biogeochemistry, exchanges across the model's boundaries and, possibly, ecology remains an as yet unresolved task.

A recurring problem associated with parameter optimization is that marine
biogeochemical models are often unrealistically simplified, while at the same
time remaining unconstrained by data. Ideally, models should be developed to
minimize the number of uncertain parameters yet maintain a level of
complexity that is suited to their intended use in answering specific
questions

A commonality of new model formulations is to focus on principles, e.g. by
considering the adaptation of traits towards optimal trade-offs

Perhaps one of the most remarkable developments is the revival of
thermodynamically inspired ecosystem theories for modelling biogeochemical
cycling in the oceans

The use of previously underexploited data sets

A substantial fraction of recent fluorescence measurements from Bio-Argo
platforms has already been included in a new global Chl

Data products from remote sensing measurements are continuously improved and
new empirical relationships between photosynthesis and respiration are
derived to estimate net community production (NCP) on the global scale

The application of DA methods has become standard for calibrating marine
ecosystem and biogeochemical models. But scientific insight can differ
between DA studies considerably. In the literature we find that there is
often an imbalance between the level of sophistication of the ecosystem model
used and the DA method employed. This is likely due to the fact that marine
ecosystem/biogeochemical modelling studies integrate knowledge from different
scientific fields, of which each has its own foci, objectives, and expertise,
i.e. plankton ecology, physical oceanography, marine geochemistry, and
mathematics and statistics. It is difficult to track major advancements in
marine ecosystem modelling when considering the different views from each of
these research fields. Furthermore, the design of experimental studies and
the collection of field data are often achieved without harmonizing the needs
of biologists with the modellers' exigencies

Facets of parameter identification in biological modelling disclose major commonalities and disparities between the objectives expressed in the different research fields. Discussions on parameter identification are therefore helpful to achieve a common understanding and to promote communication between observers, modellers, and statisticians. Problems of parameter identification may thus be well addressed by pooling expertise across multiple disciplines, without losing sight of scientific objectives. Such joint efforts should help planktonic ecosystem models to fulfil their potential as quantitative tools for aquatic sciences.

Results presented in Figs. 2, 7, and 8 and Figs. A1, A2, and B1 are made available by the respective authors. The results are centrally stored. Please send requests to mschartau@geomar.de.

In a variable lag fit (VLF), we assume that the truth at time

For the observational error in Fig.

Caution must be exercised here regarding the estimation of

To investigate estimation of the time lag variance parameter

Demonstration of the variable lag fit (VLF) applied to a simulated
data set.

Profiles of the variable lag fit cost function (

True parameter values and means

For our example we account for six different types of measurements from
mesocosms of the Pelagic Ecosystem CO

The standard errors (

Correlations during exponential growth
(

Observations of nine mesocosms (red asterisks), resampled data (grey
markers), and optimized simulation results (blue lines): dissolved inorganic
nitrogen and carbon (DIN and DIC), particulate nitrogen and carbon (PON and
POC), and chlorophyll

Adjoint models can be used to efficiently compute the derivative (or
gradient) of the cost function

The needed derivatives of the model variables

The idea behind adjoint models is to avoid this direct computation, whose
effort grows linearly with the number of parameters
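A toy sketch of this contrast, using a scalar linear model with a hand-derived adjoint (the model and observations are hypothetical): one forward plus one backward sweep yields the full gradient, whereas finite differences need additional model runs for each parameter.

```python
import numpy as np

def run_model(a, b, x0, T):
    """Forward model: x_{t+1} = a*x_t + b (a toy stand-in for one
    time step of an ecosystem model)."""
    x = np.empty(T + 1)
    x[0] = x0
    for t in range(T):
        x[t + 1] = a * x[t] + b
    return x

def cost(a, b, x0, obs):
    x = run_model(a, b, x0, obs.size - 1)
    return 0.5 * np.sum((x[1:] - obs[1:])**2)

def adjoint_gradient(a, b, x0, obs):
    """One forward and one backward sweep give the full gradient,
    independent of the number of parameters."""
    T = obs.size - 1
    x = run_model(a, b, x0, T)
    lam = np.zeros(T + 1)            # lam[t] = dJ/dx_t
    dJda = dJdb = 0.0
    for t in range(T, 0, -1):
        lam[t] += x[t] - obs[t]      # misfit forcing
        lam[t - 1] = a * lam[t]      # adjoint of x_{t+1} = a*x_t + b
        dJda += lam[t] * x[t - 1]
        dJdb += lam[t]
    return np.array([dJda, dJdb])

# Finite differences require one extra model run per parameter
obs = run_model(0.9, 0.5, 1.0, 20)   # synthetic "observations"
a, b = 0.8, 0.4
g_adj = adjoint_gradient(a, b, 1.0, obs)
eps = 1e-6
g_fd = np.array([
    (cost(a + eps, b, 1.0, obs) - cost(a - eps, b, 1.0, obs)) / (2 * eps),
    (cost(a, b + eps, 1.0, obs) - cost(a, b - eps, 1.0, obs)) / (2 * eps),
])
print(np.allclose(g_adj, g_fd, rtol=1e-4))
```

With only two parameters the savings are trivial, but for a biogeochemical model with dozens of parameters and expensive time stepping the single backward sweep replaces dozens of full model integrations.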

A useful overview of adjoint model construction and applications is given in

Another approach is based on a discretized extended Lagrange equation. Under
certain mathematical assumptions, a solution of Eq. (

The multipliers

The adjoint model construction starting from a discretized extended Lagrange
equation, Eq. (

Individual sections of our review were written by one or more lead author(s), with contributions from the other authors (Phil Wallhead, PW; John Hemmings, JH; Ben Ward, BW; Ulrike Löptien, UL; Thomas Slawig, TS; Iris Kriest, IK; Andreas Oschlies, AO; and Markus Schartau, MS). All authors were involved in mutual revisions of the individual sections. The sections' lead authors are (1) Introduction (MS), (2) Theoretical background (PW, MS, and JH), (3) Typical parameterizations of plankton models (MS), (4) Error models (PW), (5) Parameter uncertainties (MS), (6) Cross-validation and model complexity (BW), (7) Space–time variations in model parameters (JH), (8) Emulator approaches (UL and TS), (9) Parameter estimation of large-scale biogeochemical ocean circulation models (IK, AO, and MS), (10) Summary and perspectives (MS), Appendix A (PW), Appendix B (MS), and Appendix C (MS and TS). Shubham Krishna performed parameter optimizations, MCMC computations of the mesocosm modelling example, as well as calculations of the 2-D parameter arrays.

The authors declare that they have no conflict of interest.

We gratefully acknowledge the support from the International Space Science Institute (ISSI). This publication is an outcome of the ISSI's Working Group on “Carbon Cycle Data Assimilation: How to consistently assimilate multiple data streams”. We would like to thank four anonymous referees who provided constructive and helpful comments. The time and effort they spent on our manuscript is much appreciated. The examples of mesocosm data assimilation are based on the mesocosm modelling environment designed for the large integrated projects Surface Ocean Processes in the Anthropocene (SOPRAN, 03F0662A) and BIOACID (03F0728A), both funded by the German Federal Ministry of Education and Research (BMBF). Contributions from Iris Kriest, Ulrike Löptien, and Thomas Slawig were supported by the BMBF-funded PalMod – Paleo Modelling: A national paleo climate modelling initiative. Edited by: M. Scholze Reviewed by: four anonymous referees