Micronekton – small marine pelagic organisms around 1–10 cm in size – are a key component of the ocean ecosystem, as they constitute the main source of forage for all larger predators. Moreover, the mesopelagic component of micronekton that undergoes diel vertical migration (DVM) likely plays a key role in the transfer and storage of CO

Micronekton organisms are at the midtrophic level of the ocean ecosystem and thus have a central role, both as prey of larger predator species such as tuna, swordfish, turtles, sea birds or marine mammals and as a potential new resource in the blue economy

Observations and biomass estimations of micronekton traditionally rely on net sampling and active acoustic sampling

While these techniques for collecting observational estimates of biomass are progressing, advances are also being made in modeling the micronekton components of the ocean ecosystem. SEAPODYM (Spatial Ecosystem And Population Dynamics Model) is an Eulerian ecosystem model that includes one lower- (zooplankton) and six midtrophic (micronekton) functional groups and detailed fish populations

A method to estimate the model parameters has been developed using a maximum likelihood estimation (MLE) approach

For this purpose, we use Observing System Simulation Experiments (OSSEs,

The paper is organized as follows: Sect. 2 describes the model setup and forcings as well as the method developed to characterize regions of observations and the metrics used to evaluate the parameter estimation. Section 3 describes the outcome of the clustering method to define oceanographic regimes and synthesizes the main results of our estimation experiments. The results are then discussed in Sect. 4 in the light of biological and dynamical processes. Some applications and limitations of our study are also identified along with suggestions for possible future research.

SEAPODYM-MTL (midtrophic levels) simulates six functional groups of micronekton in the epipelagic and upper and lower mesopelagic layers at a global scale. These layers encompass the upper 1000 m of the ocean. The euphotic depth (

This work is based on a 10-year (2006–2015) simulation of SEAPODYM-MTL, hereafter called the nature run (NR). Euphotic depth, horizontal velocity and temperature fields come from the ocean dynamical simulation FREEGLORYS2V4 produced by Mercator Ocean. FREEGLORYS2V4 is the global, nonassimilated version of the GLORYS2V4 (

We define the spatiotemporal discrete observable space

We define a

To perform realistic OSSEs, a rigorous protocol needs to be followed

A schematic view of the OSSE system. The synthetic observations are generated using the simulation with the reference configuration (nature run). The control run is used to perform the estimation experiments. The evaluation of the OSSE is done by comparing the estimated parameters with the reference parameters.

The nature run (NR) used to perform the OSSE is generated using the reference configuration of SEAPODYM-MTL described in Sect.

SEAPODYM-MTL parameters used for the two different simulations: the nature run (NR) and the control run (CR).

The control run (CR) used to perform the parameter estimate is generated using perturbed forcing fields (Fig.

An MLE is used as the assimilation module to estimate model parameters from observations. Its implementation is based on an adjoint technique

In the framework of OSSE, we perform estimation experiments with different sets of synthetic observation points of size

Spatial division of the different regimes as defined in Table

The estimation experiments are evaluated using three metrics: (i) the performance of the estimation, (ii) its accuracy and (iii) its convergence speed.

The performance is measured with the mean relative error between the estimated coefficients and the reference coefficients as defined in

The accuracy is measured by the residual value of the likelihood, which provides a good estimate of the discrepancy between the estimated and the observed biomass.

The convergence speed is measured by the iteration number of the optimization scheme.
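As an illustration of the first metric, the following minimal Python sketch computes a mean relative error between estimated and reference coefficients. It assumes the error is the absolute relative deviation averaged over all coefficients; the function name `mean_relative_error` and the numerical values are hypothetical and serve only as an example.

```python
def mean_relative_error(estimated, reference):
    """Average of |estimated - reference| / |reference| over all coefficients.

    Assumption: absolute relative deviations, averaged with equal weights.
    """
    if len(estimated) != len(reference):
        raise ValueError("estimated and reference must have the same length")
    if any(r == 0 for r in reference):
        raise ValueError("reference coefficients must be nonzero")
    errors = [abs(e - r) / abs(r) for e, r in zip(estimated, reference)]
    return sum(errors) / len(errors)


# Hypothetical coefficients for six functional groups (illustration only).
reference = [0.10, 0.20, 0.30, 0.15, 0.15, 0.10]
estimated = [0.11, 0.19, 0.30, 0.16, 0.15, 0.09]
error_percent = 100.0 * mean_relative_error(estimated, reference)
print(f"mean relative error: {error_percent:.1f} %")
```

A value of zero means that every reference coefficient is recovered exactly; the other two metrics are read directly from the optimizer output rather than computed afterwards.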

The residual likelihood and iteration number metrics are provided by the Automatic Differentiation Model Builder (ADMB) algorithm

The number of points per regime, obtained from the clustering (Sect.

Outcome of the clustering method (Sect.

Based on these results, we construct all possible configurations, using the methodology described in Sect.

Table

Mean relative error (

The influence of the current velocity regimes (high-current-velocity system or low-current-velocity system) on the performance of the parameter estimation is studied considering three groups of experiments (Table

From these sets of experiments, it appears that the performance of the parameter estimation decreases with higher current velocity at the observation points. This conclusion is valid regardless of the regime of the secondary variables: either low or high temperatures, positive or null bloom index, and weak or strong stratification (Table

Note that the influence of low and high velocities is not explored for all secondary-variable fixed regimes. Indeed, even within fixed regimes, the secondary-variable distribution along observation points might not be statistically comparable between two experiments. This could lead to a potential bias introduced by a secondary variable, which is not the target of the study. For instance, the influence of velocity in a polar temperature regime can be investigated by comparing the configurations

Scatter plot and marginal distribution from kernel density estimation

Scatter plot and marginal distribution from kernel density estimation in the plane

Although the distributions of the secondary variables are not always shown in the following experiments, they have been examined to ensure that the OSSE results are not biased by systematic differences in the secondary variables. Experiments with significant cross-correlation between indicator variables are not presented; this concerns 9 out of the 26 possible experiments.

In Exp. 2a to d (Table

The influence of stratification is first investigated with a set of three configurations combining the tropical-temperature regime; low-velocity regime; null bloom index regime; and three regimes of weak (Exp. 3a), intermediate (Exp. 3b) and strong (Exp. 3c) stratification. A marginal distribution plot of observation sets for all experiments (not shown) indicates that the three datasets differ only along the stratification variable (primary variable). The observation points display a temperature between

Experiment table. List of conducted experiments, their corresponding configurations and the evaluation diagnostics: mean relative error on the coefficients, residual likelihood and number of iterations. The tested regime (primary variable) is specified in the first column. The number of observable points belonging to each configuration is indicated in the fourth column, with their relative proportion in brackets. Note that, even if the number of observable points differs between configurations, each experiment was conducted with 400 observations randomly chosen among those belonging to the configuration. The section that describes each experiment is mentioned in the last column.
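The random draw of 400 observations per configuration can be sketched as follows. This is a minimal illustration and not the code used in the study: the point structure, the regime labels and the function name `draw_observations` are assumptions introduced for the example.

```python
import random


def draw_observations(points, configuration, n_obs=400, seed=0):
    """Randomly select n_obs points, without replacement, among the
    observable points whose regime labels match the configuration."""
    candidates = [p for p in points if p["regime"] == configuration]
    if len(candidates) < n_obs:
        raise ValueError("not enough observable points in this configuration")
    return random.Random(seed).sample(candidates, n_obs)


# Hypothetical observable space: each point carries a (temperature, velocity)
# regime label. Real configurations combine four indicator variables.
rng = random.Random(1)
labels = [("tropical", "low"), ("tropical", "high"), ("polar", "low")]
points = [{"id": i, "regime": rng.choice(labels)} for i in range(5000)]

subset = draw_observations(points, ("tropical", "low"))
print(len(subset))
```

Fixing the sample size keeps the experiments comparable even though the configurations contain very different numbers of observable points.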

In order to investigate the influence of primary production on the performance of the estimation, we compare the results of estimation in configurations with different bloom index regimes (primary variable). Temperature, stratification index and velocity have been fixed (secondary variables) to subpolar, weak, and low regimes respectively (Exp. 4a and b) and to tropical, strong, and low regimes for Exp. 4c and d.
Distributions of the observation points along the secondary variables indicate that the experiments are not biased by secondary variables, as the distributions present similar modes centered at

Both Exp. 4a and b result in an averaged error of 7 % on the estimated parameters (Table

When considering all possible experiments, and given the fact that all these configurations are associated with specific locations and times, it is possible to represent a global map of averaged estimation errors (Eq.

Averaged absolute value of relative error (

The above experiments are based on a random selection of observation points within a large subset. This technique was chosen to avoid any bias related to potential temporal or spatial autocorrelation in observation networks. However, sampling at sea is rarely randomly distributed and can generate correlations. To relax this strong assumption, we perform experiments based on positions from real acoustic transects (underway ship measurements). Two regions are compared using the transects from the PIRATA cruises in the equatorial Atlantic

The same forcing, method and initial parameterization were used with a random noise amplitude (

In the following, we discuss a possible theoretical interpretation of the outcome of the estimation experiments (Sect. 4.1) and a potential application of our results (Sect. 4.2). Section 4.3 closes the discussion by examining the particular framework used to conduct this study and by opening some perspectives for future work.

The differences in the performance of parameter estimation can be interpreted in the light of the characteristic timescales of physical and biological processes. The parameters we want to estimate (

The predicted micronekton biomass at a given time and location (grid cell) results from two main mechanisms. First, the potential production (

Map of PIRATA and BAS ship transects for the years 2013–2015.

An observation will thus be most effective for the estimation of parameters if it carries the information of the initial distribution of primary production into functional groups. This is the case if the biomass is renewed quickly enough compared to the time it takes for currents and diffusion to mix it. This condition can be seen in terms of equilibrium between the biological processes (production, recruitment and mortality) and the physical processes (advection and diffusion). For an observation to be the most useful to the parameter estimation, it is necessary that the characteristic timescale governing biological processes (

This interpretation highlights the problem of observability of the parameters

The clustering approach we propose allowed the identification of oceanic regions that provide optimal oceanic characteristics for our parameter estimation. It separates regions where the distribution of biomass is driven by physical processes from regions where it is driven by biological processes. This could be seen as a new definition of ecoregions based on similar ecosystem structuring dynamics. The definition of ocean ecoregions has been proposed based on various criteria

We have chosen to model the error between the true state of the ocean and the modeled state by adding a white noise perturbation to the forcings of the NR as input of the CR. Our idealized approach does not take into account the possible spatial distribution of uncertainty and errors of ocean models, and other approaches would be interesting to explore. For instance, implementing an error proportional to the deviation from the climatological field should be more realistic because it would be based on the natural and intrinsic variability of the ocean. Indeed, we expect forcing fields to be less accurate where the ocean has strong variability. However, for the purpose of our study, a spatially homogeneous error was preferable to avoid introducing any bias. Random noise ensures that the results obtained in different locations are directly comparable. Conducting a sensitivity analysis with respect to the choice of forcing error modeling was beyond the scope of this study.
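A spatially homogeneous white noise perturbation of a forcing field can be sketched as below. The sketch assumes additive Gaussian noise with the same standard deviation in every grid cell and no correlation between cells; the distribution, the amplitude `sigma`, the function name `add_white_noise` and the small temperature field are all hypothetical choices made for illustration.

```python
import random


def add_white_noise(field, sigma, seed=0):
    """Return a copy of a 2-D field with independent Gaussian noise of
    standard deviation sigma added to every grid cell.

    Assumption: additive, zero-mean noise that is uncorrelated between
    cells and has the same amplitude everywhere.
    """
    rng = random.Random(seed)
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in field]


# Hypothetical 3 x 4 temperature field in degrees Celsius (illustration only).
temperature = [
    [20.0, 20.5, 21.0, 21.5],
    [19.0, 19.5, 20.0, 20.5],
    [18.0, 18.5, 19.0, 19.5],
]
perturbed = add_white_noise(temperature, sigma=0.5)
for row in perturbed:
    print([round(value, 2) for value in row])
```

Because the noise statistics do not depend on location, differences in estimation performance between regions cannot be attributed to the perturbation itself.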
In addition to the uncertainty of ocean model outputs, other sources of uncertainties remain to be explored to progress toward more realistic estimation experiments. For instance, we considered that the observation operator (Eq.

Understanding and modeling marine ecosystem dynamics is considerably challenging. It generally requires sophisticated models relying on a certain number of parameterized physical and biological processes. SEAPODYM-MTL provides a parsimonious approach with only a few parameters and an MLE to estimate these parameters from observations. Among them, the energy transfer efficiency coefficients are of great importance because they directly control the biomass of micronekton functional groups, including those that undergo DVM and contribute to the sequestration of carbon dioxide into the deep ocean

SEAPODYM-MTL is based on a system of advection–diffusion–reaction equations for each functional group

The initial conditions for this system are

Following

Note that these coefficients are also defined for negative temperature values.

A module estimates SEAPODYM-MTL parameters using a variational data assimilation method: a maximum likelihood estimation (MLE)

The gradient of the likelihood function is computed using the adjoint state method. The parameters are then estimated using a quasi-Newton algorithm implemented by the Automatic Differentiation Model Builder (ADMB) algorithm
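The principle of gradient-based likelihood maximization can be illustrated with a toy problem. The sketch below is not the SEAPODYM-MTL implementation: it assumes a Gaussian likelihood, a hypothetical linear model in which the biomass of each group equals a coefficient times a known production term, an analytic gradient in place of the adjoint-computed one, and plain gradient descent in place of the quasi-Newton scheme of ADMB. All names and values are illustrative.

```python
def neg_log_likelihood(coeffs, production, observed, sigma=1.0):
    """Gaussian negative log-likelihood, up to an additive constant."""
    total = 0.0
    for p, obs in zip(production, observed):
        for c, o in zip(coeffs, obs):
            total += (o - c * p) ** 2
    return total / (2.0 * sigma ** 2)


def gradient(coeffs, production, observed, sigma=1.0):
    """Analytic gradient of neg_log_likelihood with respect to coeffs."""
    grad = [0.0] * len(coeffs)
    for p, obs in zip(production, observed):
        for i, (c, o) in enumerate(zip(coeffs, obs)):
            grad[i] -= (o - c * p) * p / sigma ** 2
    return grad


def estimate(production, observed, start, step=0.01, tol=1e-10, max_iter=10000):
    """Gradient descent; returns (coefficients, residual likelihood, iterations)."""
    coeffs = list(start)
    for iteration in range(1, max_iter + 1):
        grad = gradient(coeffs, production, observed)
        coeffs = [c - step * g for c, g in zip(coeffs, grad)]
        if max(abs(g) for g in grad) < tol:
            break
    return coeffs, neg_log_likelihood(coeffs, production, observed), iteration


# Hypothetical noise-free synthetic observations for two functional groups.
true_coeffs = [0.2, 0.3]
production = [1.0, 2.0, 3.0, 4.0]
observed = [[c * p for c in true_coeffs] for p in production]

estimated, residual, n_iter = estimate(production, observed, start=[0.5, 0.5])
print([round(c, 6) for c in estimated], n_iter)
```

The residual likelihood and the iteration count returned here play the same role as the accuracy and convergence-speed diagnostics reported for the estimation experiments.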

Physical oceanographic data of the free GLORYS2V4 ocean circulation simulation are available at the Copernicus Marine Environment Monitoring Service (CMEMS:

All authors contributed to the design of the study. AD developed the method, conducted the experiments, analyzed the results and wrote the original manuscript. AC and OT contributed to the development of the parameter estimation component of SEAPODYM-MTL. OT prepared the forcing fields and contributed to the revision of the manuscript. PL coordinated the AtlantOS activity at CLS and contributed to the analysis of results and the revision of the manuscript.

The authors declare that they have no conflict of interest.

This study has been conducted using E.U. Copernicus Marine Service Information. The authors thank the Groupe Mission Mercator Coriolis (Mercator Ocean) for providing the ocean general circulation model FREEGLORYS2V4 simulation and Jacques Stum and Benoit Tranchant at Collecte Localisation Satellite for processing satellite primary production and ocean reanalysis data. We also thank Bernard Bourlès and Jérémie Habasque from the Institut de Recherche pour le Développement and Sophie Fielding from the British Antarctic Survey for making the PIRATA (

This research has been supported by Horizon 2020 (grant nos. AtlantOS (633211) and MEESO (817669)).

This paper was edited by Stefano Ciavatta and reviewed by Jann Paul Mattern and one anonymous referee.