Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. Calibration with DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions, while AM only identifies one mode. The application suggests that DREAM is well suited to calibrating complex terrestrial ecosystem models, where the number of uncertain parameters is usually large and the existence of local optima is always a concern. In addition, this effort justifies, through residual analysis, the assumptions of the error model used in Bayesian calibration. The results indicate that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and that the likelihood function constructed accordingly can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
Prediction of future climate heavily depends on accurate predictions of the
concentration of carbon dioxide (CO
Various parameter estimation methods have been applied to TEMs; for an overview, one can refer to the OptIC (Optimization InterComparison) project (Trudinger et al., 2007) and the REFLEX (REgional FLux Estimation eXperiment) project (Fox et al., 2009). In classical optimization-based approaches, inverse problems with a large number of parameters can often be ill-posed in that the solution may not be unique or may not even exist (O'Sullivan, 1986). As an alternative, the Bayesian framework provides a comprehensive solution to this problem. In Bayesian methods, the model parameters are treated as random variables and their posterior probability density functions (PPDFs) represent the estimation results. The PPDF incorporates prior knowledge of the parameters, the mismatch between model and observations, and observation uncertainty (Lu et al., 2012). Thus, compared to other approaches to inverse problems, Bayesian inference not only estimates model parameters but also quantifies the associated uncertainty using a full probabilistic description.
Two types of Bayesian methods are widely used in parameter estimation of TEMs, variational data assimilation (VAR) methods (Talagrand and Courtier, 1987) and Markov chain Monte Carlo (MCMC) sampling. VAR methods are computationally efficient; however, they assume that the prior parameter values and the observations follow a Gaussian distribution, and they require the model to be differentiable with respect to all parameters for optimization. In addition, VAR methods can only identify a local optimum and approximate the PPDF by a Gaussian function (Rayner et al., 2005; Ziehn et al., 2012). In contrast, MCMC sampling makes no assumptions about the structure of the prior and posterior distributions of model parameters or observation uncertainties. Moreover, the MCMC methods, in principle, can converge to the true PPDF with an identification of all possible optima. Although it is more computationally intensive than VAR approaches, MCMC sampling is being increasingly applied in the land surface modeling community (Dowd, 2007; Zobitz et al., 2011).
One widely used MCMC algorithm is adaptive Metropolis (AM) (Haario et al., 2001). For example, Fox et al. (2009) applied AM in their comparison of different algorithms for the inversion of a terrestrial ecosystem model; Järvinen et al. (2010) utilized AM for estimation of ECHAM5 climate model closure parameters; Hararuk et al. (2014) employed AM to improve a global land model against soil carbon data; and Safta et al. (2015) used AM to estimate parameters in the data assimilation linked ecosystem carbon model. The AM algorithm uses a single Markov chain that continuously adapts the covariance matrix of a Gaussian proposal distribution using the information of all samples collected in the chain so far (Haario et al., 1999). As a single-chain method, AM has difficulty traversing a multidimensional parameter space efficiently when there are numerous significant local optima, and it can be inefficient for estimating PPDFs that exhibit strong correlations, as correlated dimensions are best updated jointly (Vrugt, 2016). In addition, the AM algorithm uses a multivariate Gaussian distribution as the proposal to generate candidate samples and evolve the chain. AM is therefore particularly suitable for Gaussian-shaped PPDFs, but it may not converge properly to distributions with multiple modes. Moreover, AM suffers from uncertainty about how to initialize the covariance of the Gaussian proposal; poor initialization of the proposal covariance matrix results in slow adaptation and inefficient convergence.
The Gaussian proposal is also widely used in non-AM MCMC studies that involve TEMs. For example, Ziehn et al. (2012) used a Gaussian proposal for the MCMC simulation of the BETHY model (Knorr and Heimann, 2011), and Ricciuto et al. (2008, 2011) utilized Gaussian proposals in their MCMC schemes to estimate parameters in a terrestrial carbon cycle model. Single-chain, Gaussian-proposal MCMC approaches are limited in their ability to explore the full parameter space and converge slowly when sampling non-Gaussian-shaped PPDFs; they may thus end up in a local optimum with an inaccurate uncertainty representation of the parameters. This raises the question of whether AM and the widely used MCMC algorithms with Gaussian proposals generate a representative sample of the posterior distribution of the underlying model parameters. While we expect that computationally expensive sampling methods for parameter estimation yield a global optimum with an accurate probabilistic description, in reality we may in many cases obtain a local optimum with an inaccurate PPDF due to the limitations of these algorithms.
In this study, we employ the differential evolution adaptive Metropolis (DREAM) algorithm (Vrugt et al., 2008, 2009a; Lu et al., 2014) for an accurate Bayesian calibration of an ecosystem carbon model. The DREAM scheme runs multiple interacting chains simultaneously to explore the entire parameter space globally. During the search, DREAM does not rely on a specific distribution, like the Gaussian distribution used in most MCMC schemes, to move the chains. Instead, it uses the differential evolution optimization method to generate the candidate samples from the collection of chains (Price et al., 2005). This feature of DREAM eliminates the problem of initializing the proposal covariance matrix and enables efficient handling of complex distributions with strong correlations. In addition, as a multi-chain method, DREAM can efficiently sample multimodal posterior distributions with numerous local optima. Thus, the DREAM scheme is particularly applicable to complex and multimodal optimization problems. Recently, Post et al. (2017) reported a successful application of DREAM in estimation of the complex Community Land Model (CLM) using 1-year records of NEE observations. They found that the posterior parameter estimates were superior to their default values in the ability to track and explain the measured NEE data.
While multimodality is a potential feature of parameters in complex models (Kinlan and Gaines, 2003; Stead et al., 2005; Thibault et al., 2011; Zhang et al., 2013), its existence has not been well documented in terrestrial ecosystem modeling due to the limitations of the methods applied in most previous studies. In addition, while the importance of the choice of likelihood function for Bayesian calibration is well recognized (Trudinger et al., 2007), the use of an appropriate likelihood function has barely been explored in land surface modeling. Here we apply DREAM and AM to a TEM to estimate the parameter distributions based on a set of synthetic data and real measurement data. In both cases, we estimate the PPDFs of 21 process parameters in the data assimilation linked ecosystem carbon (DALEC) model. The objectives of this study are to (1) present a statistically sound methodology to solve parameter estimation problems in complex TEMs and to improve the model simulation; (2) characterize parameter uncertainty in detail using accurately sampled posterior distributions; (3) investigate the effects of model calibration methods on parameter estimation and model performance; and (4) justify the choice of likelihood function and explore its influence on the model calibration results. This work should provide ecological practitioners with valuable information on the calibration and understanding of TEMs.
In the following Sect. 2, we first briefly summarize the general idea of Bayesian calibration and describe the AM and DREAM algorithms. Then in Sect. 3, we apply both algorithms to the DALEC model in a synthetic and a real-data study. Next in Sect. 4, we discuss the influence of the likelihood function on parameter estimation and model performance. Finally in Sect. 5, we close this paper with our main conclusions.
Bayesian calibration of a model states that the posterior distribution of the model parameters is proportional to the product of the likelihood function and the prior parameter distribution (Bayes' theorem).
The likelihood function measures how well the model fits the observations. Selecting a likelihood function suitable for a specific problem is still under study (Vrugt et al., 2009b). A commonly used likelihood function is based on the assumption that the differences between model simulations and observations are multivariate normally distributed, leading to a Gaussian likelihood, as in Fox et al. (2009), Hararuk et al. (2014), and Ricciuto et al. (2008, 2011). In this work, we also use the Gaussian likelihood, with heteroscedastic and uncorrelated variances evaluated from the provided daily observation uncertainties. The assumptions of normality and independence are investigated through residual analysis. In addition, we explore the influence of different choices of the likelihood function on parameter estimation and model performance. The effect of data correlations on the inferred parameters was also assessed in our previous study (Safta et al., 2015).
In most environmental problems, the posterior distribution cannot be obtained analytically and is typically approximated by sampling methods such as MCMC. The MCMC method approximates the posterior distribution by constructing a Markov chain whose stationary distribution is the target distribution of interest. As the chain evolves and approaches stationarity, the samples drawn after chain convergence are used to approximate the posterior distribution, while the samples drawn before convergence, which are affected by the starting state of the chain, are discarded as burn-in.
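This chain-construction-and-burn-in procedure can be sketched with the simplest random-walk Metropolis sampler. The one-dimensional target, step size, chain length, and burn-in fraction below are illustrative assumptions, not values used in this study:

```python
import numpy as np

def metropolis(log_post, x0, n_iter, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D target density."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_iter)
    x, lp = x0, log_post(x0)
    for i in range(n_iter):
        cand = x + step * rng.standard_normal()   # symmetric proposal
        lp_cand = log_post(cand)
        # Accept with probability min(1, p(cand) / p(x))
        if np.log(rng.uniform()) < lp_cand - lp:
            x, lp = cand, lp_cand
        chain[i] = x
    return chain

# Target: standard normal; discard the first half of the chain as burn-in
chain = metropolis(lambda x: -0.5 * x**2, x0=5.0, n_iter=20000)
posterior = chain[10000:]
```

The retained `posterior` samples approximate the target distribution; in practice the burn-in length is chosen with convergence diagnostics rather than fixed in advance.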
Well-constructed MCMC schemes have been theoretically proven to converge
to the appropriate target distribution
The adaptive Metropolis (AM) algorithm is a modification to the standard
Metropolis sampler (Haario et al., 2001). The key feature of the AM algorithm
is that it uses a single Markov chain that continuously adapts to the target
distribution via its calculation of the proposal covariance using all
previous samples in the chain. The proposal distribution employed in the AM
algorithm is a multivariate Gaussian distribution with means at the current
iteration
To apply the AM algorithm, an initial covariance
The construction of
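A minimal sketch of the AM update described above, assuming a generic log-posterior; the adaptation start, update interval, and regularization term are illustrative choices, while the scaling factor 2.4^2/d follows Haario et al. (2001):

```python
import numpy as np

def adaptive_metropolis(log_post, x0, cov0, n_iter, t0=1000, eps=1e-8, seed=0):
    """Single-chain AM: after t0 iterations the Gaussian proposal covariance
    is repeatedly re-estimated from the full chain history."""
    rng = np.random.default_rng(seed)
    d = len(x0)
    sd = 2.4**2 / d                          # scaling of Haario et al. (2001)
    chain = np.empty((n_iter, d))
    x = np.asarray(x0, float)
    lp = log_post(x)
    cov = np.asarray(cov0, float)
    for t in range(n_iter):
        if t > t0 and t % 100 == 0:          # adapt (done recursively in practice)
            cov = sd * (np.cov(chain[:t].T) + eps * np.eye(d))
        cand = rng.multivariate_normal(x, cov)
        lp_cand = log_post(cand)
        if np.log(rng.uniform()) < lp_cand - lp:   # Metropolis acceptance
            x, lp = cand, lp_cand
        chain[t] = x
    return chain
```

For a 2-D Gaussian target, e.g. `adaptive_metropolis(lambda x: -0.5 * (x @ x), [3.0, -3.0], 0.1 * np.eye(2), 20000)`, the adapted proposal covariance approaches the scaled posterior covariance; note how the quality of `cov0` controls how quickly the adaptation becomes useful.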
To improve efficiency in high-dimensional cases, Haario et al. (2005) extended the standard AM method to componentwise adaptation. This strategy applies AM to each parameter separately: the proposal distribution of each component is a one-dimensional normal distribution, adapted in a similar manner as in the standard AM algorithm. However, componentwise adaptation does not work well for distributions with strong correlations. Safta et al. (2015) applied an iterative algorithm that breaks the original high-dimensional problem into a sequence of steps of increasing dimensionality, with each intermediate step starting from an appropriate proposal covariance based on a test run. This technique provided a reasonable proposal distribution, but the computational cost of defining the proposal was high.
AM is a single-chain method and, as such, may suffer from difficulties in
judging convergence. Sometimes even the most powerful diagnostics cannot
guarantee that the chain has converged to the target distribution (Gelman
and Shirley, 2011). One way to alleviate the problem is to run multiple
independent chains with widely dispersed starting points and then use
diagnostics designed for multi-chain schemes, such as the univariate and
multivariate Gelman–Rubin statistics.
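The univariate Gelman–Rubin diagnostic compares between-chain and within-chain variance; values near 1 indicate convergence. A sketch for a single parameter (the array layout and the commonly used threshold of about 1.2 are conventions, not values from this study):

```python
import numpy as np

def gelman_rubin(chains):
    """Univariate Gelman-Rubin statistic for one parameter.

    chains : (m, n) array holding m chains of n post-burn-in samples.
    Returns the potential scale reduction factor; values close to 1
    (e.g., below about 1.2) suggest convergence.
    """
    chains = np.asarray(chains, float)
    m, n = chains.shape
    means = chains.mean(axis=1)
    B = n * means.var(ddof=1)               # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_hat / W)
```

Chains sampling the same distribution give values near 1, while chains stuck in different regions inflate the between-chain term and push the statistic well above 1.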
The DREAM algorithm is a multi-chain method (Vrugt, 2016). Multi-chain
approaches use multiple chains running in parallel for global exploration of
the posterior distribution, so they have several desirable advantages over
the single-chain methods, particularly when addressing complex problems
involving multimodality and having a large number of parameters with strong
correlations. In addition, the application of multiple chains allows
utilizing a large variety of statistical measures to diagnose convergence,
including the univariate and multivariate Gelman–Rubin statistics.
DREAM uses the Differential Evolution Markov Chain (DE-MC) algorithm (ter Braak, 2006) as its main building block. The key feature of the DE-MC scheme is that it does not specify a particular distribution as the proposal but instead proposes candidate points using the differential evolution method based on the current samples collected in the multiple chains. Thus, DE-MC applies to a wide range of problems whose distribution shapes need not resemble the proposal distribution, and it also removes the requirement of initializing the covariance matrix as in AM. In addition, DE-MC can successfully sample multimodal distributions, because it directly uses the current locations of the multiple chains to generate candidate points, allowing direct jumps between different modes.
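The differential-evolution proposal at the heart of DE-MC can be sketched as follows; the function name and jitter size are illustrative, while the default jump rate follows ter Braak (2006):

```python
import numpy as np

def demc_proposal(X, i, gamma=None, b=1e-4, rng=None):
    """DE-MC candidate for chain i: x_i + gamma * (x_r1 - x_r2) + e.

    X is an (n_chains, d) array of current chain states. gamma defaults
    to 2.38 / sqrt(2 d), the choice of ter Braak (2006); e is a small
    uniform jitter of half-width b that keeps the chain ergodic.
    """
    rng = rng or np.random.default_rng()
    n, d = X.shape
    if gamma is None:
        gamma = 2.38 / np.sqrt(2 * d)
    # Draw two distinct chains r1 != r2, both different from chain i
    r1, r2 = rng.choice([j for j in range(n) if j != i], size=2, replace=False)
    e = b * rng.uniform(-1.0, 1.0, size=d)
    return X[i] + gamma * (X[r1] - X[r2]) + e
```

The candidate is then accepted or rejected with the usual Metropolis rule; because the difference vectors come from the chains themselves, the jumps automatically match the scale and correlation structure of the posterior, which is why no proposal covariance needs to be specified.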
The DREAM algorithm retains the desirable features of DE-MC but greatly accelerates chain convergence. More information about the DREAM algorithm can be found in Vrugt et al. (2008, 2009a), Laloy and Vrugt (2012), Lu et al. (2014), and Vrugt (2016).
Since multimodality is a potential feature of complex problems including terrestrial ecosystem models (Stead et al., 2005; Thibault et al., 2011), it is important to understand the strategies of AM and DREAM and to investigate their capabilities in sampling the multimodal distributions.
The AM sampler is typically tuned for distributions with a single mode. For distributions with closely connected modes, AM can work well given suitable initial values. On the other hand, for distributions consisting of disconnected modes separated by regions of low probability, even with a reasonably wide covariance matrix, AM can converge slowly and end up in only one mode (e.g., Fig. 5 in Vrugt, 2016). To remedy this problem, AM needs an overly dispersed Gaussian proposal with large initial variances to allow it to transit between the different modes. But this may result in a very low acceptance rate, as many of the jumps will fall outside the target distribution in regions of nearly zero density. To alleviate this problem, Haario et al. (2006) proposed the DRAM algorithm, which combines delayed rejection (DR) with AM. The DR algorithm allows for a very expansive search at the beginning by using a large proposal covariance matrix; the proposal covariance is then reduced by a freely chosen scale factor when a candidate is rejected. By creating multiple proposal stages, DRAM enables an extensive search while alleviating the overshooting problem and improving the acceptance rate. However, as dimensionality increases, multimodality becomes more difficult for algorithms using a Gaussian proposal, because different dimensions are likely to have different variances, while a constant scale factor can only shrink all dimensions of the covariance simultaneously.
In contrast, DREAM is designed for sampling high-dimensional and multimodal problems by running multiple different chains simultaneously for global exploration. It automatically tunes the scale and orientation of the proposal in randomized subspaces during the search (Vrugt et al., 2009a). As DREAM directly uses the current locations of the multiple chains, instead of the covariance of a Gaussian proposal, to generate candidate points, it enables direct jumps between different modes (including relatively far disconnected modes) as long as the initial samples of the chains are widely distributed over the parameter space. Laloy and Vrugt (2012) demonstrated that DREAM can successfully sample a 25-dimensional trimodal distribution with an equal separation of 10 units between modes; for the same problem with the same number of function evaluations, AM and DRAM converged to only one mode. Note that to sample a distribution with many modes, one needs some prior information about their rough locations; otherwise no method can guarantee finding all the modes, especially when the distance between the modes is very large and not constant.
In this section, we applied the DREAM algorithm to the data assimilation linked ecosystem carbon (DALEC) model to estimate the posterior distributions of its parameters; for comparison, the AM algorithm was also applied. DALEC is a relatively simple carbon pool and flux model designed specifically to enable parameter estimation in terrestrial ecosystems. We used DALEC to evaluate the performance of AM and DREAM in model calibration: we compared the accuracy of their sampled parameter PPDFs, the models' goodness of fit, and the predictive performance of the calibrated models. Previous studies based on MCMC methods with Gaussian proposals have not reported multimodality in the marginal PPDFs of the model parameters, so it is important to know whether the parameters are multimodal; if multimodality exists, we assess whether DREAM can identify the multiple modes and thereby improve the calibration results and the predictive performance.
The DALEC v1 model is used here (Williams et al., 2005; Fox et al., 2009) with some structural modifications (Safta et al., 2015). DALEC consists of six process-based submodels that simulate carbon fluxes between five major carbon pools: three vegetation carbon pools for leaf, stem, and root and two soil carbon pools for soil organic matter and litter. The fluxes calculated on any given day impact carbon pools and processes in subsequent days.
Nominal values and ranges of the 21 parameters for optimization in the DALEC model, and the maximum a posteriori (MAP) estimates based on the AM and DREAM samplers.
Parameter units refer to Table 1 of Safta et al. (2015). The LL represents the log likelihood evaluated at the MAP parameter estimates; the larger the value is, the better the model fit.
The six submodels in DALEC are photosynthesis, phenology, autotrophic
respiration, allocation, litterfall and decomposition. Photosynthesis is
driven by the aggregate canopy model (ACM) (Williams et al., 2005), which
itself is calibrated against the soil–plant–atmosphere model (Williams et
al., 1996). DALEC v1 was modified to incorporate the phenology submodel used
in Ricciuto et al. (2011), driven by six parameters. This phenology submodel
controls the current leaf area index (LAI) proportion of the seasonal maximum
LAI (
So for the first three plant submodels, deciduous phenology has six
parameters; ACM shares one parameter,
The allocation model partitions carbon to several vegetation carbon pools.
Leaf allocation is first determined by the phenology model, and the remaining
available carbon is allocated to the root and stem pools depending on the
fractional stem allocation parameter (
Estimated marginal posterior probability density functions (PPDFs) of the 21 parameters using the AM and DREAM algorithms, along with the true parameter values used to generate the pseudo-data in the synthetic case.
Model parameters are summarized in Table 1. These parameters were grouped
according to the six submodels that employ them, except for
In order to reduce computational time, we employed transient assumptions for
running DALEC. That is, for any given set of parameter values, DALEC was run
one cycle only, for the 15 years between 1992 and 2006 for which observation data are
available. Under this assumption, four additional parameters were used to
describe the initial states of two vegetation carbon pools
(
The calibration data consist of the Harvard Forest daily net ecosystem exchange (NEE) values, which were processed for the NACP site synthesis study (Barr et al., 2013) based on flux data measured at the site (Urbanski et al., 2007). The daily observations cover a period of 15 years starting with the year 1992; part of the data in the year 2005 is missing. Hill et al. (2012) estimated that the daily NEE values followed a normal distribution, with standard deviations estimated by bootstrapping half-hourly NEE data (Papale et al., 2006; Barr et al., 2009). These standard deviations have values between 0.2 and 2.5, with a mean of about 0.7. In total, 5114 daily NEE data spanning 14 years (1992 to 2004 and 2006) were considered here for model calibration, and their corresponding standard deviations were used to construct the heteroscedastic, diagonal covariance matrix of the Gaussian likelihood function under the assumption that the data were uncorrelated. In Sect. 4, we examine the independent, Gaussian error assumption using residual analysis and investigate the influence of error models on parameter estimation and model performance.
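With uncorrelated, heteroscedastic Gaussian errors, the log-likelihood reduces to a weighted sum of squared residuals, with each daily residual scaled by its own standard deviation. A sketch, where the NEE arrays and sigmas are placeholders rather than the actual Harvard Forest data:

```python
import numpy as np

def gaussian_loglik(obs, sim, sigma):
    """Heteroscedastic, uncorrelated Gaussian log-likelihood: each daily
    residual is weighted by its own observation standard deviation."""
    obs, sim, sigma = (np.asarray(a, float) for a in (obs, sim, sigma))
    resid = obs - sim
    return -0.5 * np.sum(np.log(2.0 * np.pi * sigma**2) + (resid / sigma)**2)

# Placeholder example with three daily NEE values and their uncertainties
ll = gaussian_loglik(obs=[1.0, -2.0, 0.5],
                     sim=[0.8, -1.5, 0.7],
                     sigma=[0.2, 0.7, 2.5])
```

Days with small observation uncertainty thus dominate the fit, which is the practical effect of the heteroscedastic diagonal covariance described above.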
We first applied AM and DREAM to a synthetic case to evaluate their capability in parameter estimation. Daily NEE data for the same period were generated using the nominal parameter values in Table 1. These synthetic calibration data were then corrupted with Gaussian errors having zero means and the same standard deviations as the observed NEE values.
DREAM launched 10 parallel chains starting at values randomly drawn from the
parameter prior distributions. AM used a single chain with the same
initialization as DREAM. In addition, AM also requires initialization
of the covariance matrix of its Gaussian proposal. We first drew some samples
from the parameter space and computed the initial covariance. However, this
initialization caused slow convergence of AM with an extremely small
acceptance rate (about 0.01 % after 1
Chain convergence was assessed via the Gelman–Rubin
In addition, Fig. 1 indicates that about half of the parameters are well constrained, where we define a parameter as well constrained if its posterior distribution occupies at most half the range of the prior distribution (Keenan et al., 2013). This result is consistent with some previous studies on DALEC calibration using NEE data alone. For example, in the synthetic study of Fox et al. (2009), the MCMC simulation (M1) showed that 16 of 17 parameters were well constrained. Similarly, the synthetic study in Hill et al. (2012) indicated that 20 of 23 parameters had 90 % confidence intervals occupying less than half of the prior range.
Whether a parameter is identifiable depends on the model, the model parameters, and the calibration data. When the processes related to a parameter are necessary to simulate the model outputs and the corresponding observation data are sensitive to that parameter, the parameter can usually be identified and sometimes well constrained. For example, Keenan et al. (2013) showed that in their FöBAAR model with 40 parameters, many parameters could not be constrained even when several data streams were considered together; they found that these unidentifiable parameters might be redundant in the model structure representation. Roughly speaking, the parameters of a simple model with few parameters tend to be more identifiable than those of a complex model with many parameters (Richardson et al., 2010; Weng and Luo, 2011). On the other hand, if the simulated outputs are sensitive to the parameters, even a complex model can sometimes be well constrained by a single type of observations. For example, Post et al. (2017) estimated eight CLM parameters using 1-year records of half-hourly NEE observations at four sites and found that for most sites the CLM parameters could be well constrained, with their 95 % confidence intervals close to the maximum a posteriori estimates. For the only site where the parameter uncertainties were relatively large, they concluded that the simulated NEE was less sensitive to these parameters. In our synthetic study and those of Fox et al. (2009) and Hill et al. (2012), all the parameter-related processes are necessary for the DALEC simulation and most parameters were shown to be sensitive to the observation data (Safta et al., 2015), which explains to some extent why many DALEC parameters can be well constrained in these synthetic studies.
In the real-data study, the measured NEE data with given standard deviations were used for DALEC calibration, and both the AM and DREAM algorithms were applied to infer the unknown parameters. Unlike the synthetic case, the real-data study involves model structural errors in addition to measurement errors. We again use the heteroscedastic, uncorrelated, Gaussian likelihood function for calibration and examine these error assumptions in Sect. 4 through residual analysis.
Univariate and multivariate Gelman–Rubin
Estimated marginal posterior probability density functions (PPDFs) of the 21 parameters using the AM and DREAM algorithms in the real-data study.
DREAM launched 10 parallel chains starting at values randomly drawn from the
parameter prior distributions, and each chain evolved for 300 000 iterations.
Chain convergence was assessed via both the univariate and multivariate
Gelman–Rubin
AM used a single chain whose first sample had the same initialization as
DREAM. For the initialization of the Gaussian covariance in the
AM proposal, we first drew some samples from the parameter space and
constructed the covariance. However, this initialization caused a high
rejection rate and ended up with essentially a single parameter state after
hundreds of thousands of iterations. To facilitate the convergence of AM, we
constructed the initial covariance based on the first 200 000 samples from
the DREAM simulation. We conducted 10 independent AM runs, so the same
The estimated PPDFs from AM and DREAM are presented in Fig. 3, and the
optimal parameter estimates, as represented by the maximum a posteriori
(MAP), are summarized in Table 1. Figure 3 shows that more than half of the
parameters are constrained and that some well-constrained parameters are edge
hitting, where the mode occurs near one of the edges of the allowable range
and most of the parameter values are clustered near the edge, such as
AM and DREAM results for parameters
Comparing the results of AM and DREAM, Fig. 3 indicates that they
produce very similar PPDFs for many parameters, such as
In addition, the simulated joint PPDFs of the two parameters
Posterior distributions of parameters
The existence of two modes for
Figure 6a highlights the years in red where the model based on the right mode
of
Figure 6c depicts the recorded lowest temperature of the days between
1 September and 20 November for years 1992 and 1994, where the red line
highlights the period between the first and the last leaf drop in 1994.
The blue line highlights the corresponding period of leaf fall in 1992. Since
senescence was triggered in early September of 1994, the temperature
triggering leaf fall was relatively high, about 8.1
The bimodality identified in the DREAM simulation and examined in the
scenarios above reflects the inability of the model structure to predict the
observations consistently with a single set of parameters. This bimodality
examined in DREAM may be caused in part by an incomplete representation of
the senescence process. Using a temperature threshold (parameter
The difference in estimated parameters between AM and DREAM causes different
simulations of NEE, especially during the autumn. As an example, Fig. 7
illustrates the comparison of the simulated NEE to observations for a month
in autumn of the year 1995 based on MAP estimates obtained under AM and
DREAM. Visual inspection indicates that the simulated NEE from the
DREAM-calibrated parameters provides a better fit to the observations, as
also indicated by the smaller root mean squared errors (RMSEs). In addition,
the maximum log likelihoods listed in Table 1 suggest that overall the
DREAM-estimated parameters produce a better model fit to the observations,
comparing
Simulated NEE values based on the optimal parameters (i.e., the MAP values listed in Table 1) estimated by the AM and DREAM algorithms in October 1995. The root mean square error (RMSE) indicates that DREAM produces a better model fit than AM.
To further compare the calibration results between AM and DREAM, we explore
their predictive skills based on the sampled PPDFs of model parameters. We
employed the Bayesian posterior predictive distribution (Lynch and Western,
2004) to assess the adequacy of the calibrated models. Specifically, the
posterior distribution for the predicted NEE data,
From the estimated
95 % confidence intervals of the simulated NEE values in year 1995 based on the parameter samples from AM and DREAM. Two measures of predictive performance, CRPS statistic and predictive coverage, indicate that DREAM outperforms AM in prediction.
In order to quantitatively compare the predictive performance of the calibrated models based on AM and DREAM, we computed two metrics: the continuous ranked probability score (CRPS) and the predictive coverage. The CRPS (Gneiting and Raftery, 2007) measures the difference between the cumulative distribution function (CDF) of the observed data and that of the predicted data; the lower the CRPS, the better the predictive performance. The predictive coverage measures the percentage of observations that fall within a given predictive interval; a larger predictive coverage suggests better predictive performance. Figure 8 shows that AM gives a CRPS value of 0.48, while DREAM gives 0.43. The lower value of DREAM indicates that, on average, DREAM produces tighter marginal predictive CDFs that are better centered around the NEE data, suggesting predictive performance superior to AM in terms of both accuracy and precision. In addition, the predictive coverage of DREAM is larger than that of AM, attesting once again to its superior performance in prediction.
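Both metrics can be computed directly from posterior predictive samples, as sketched below. `crps_ensemble` uses the standard sample-based CRPS estimator E|X - y| - 0.5 E|X - X'| (Gneiting and Raftery, 2007); the function names are illustrative:

```python
import numpy as np

def crps_ensemble(samples, obs):
    """Sample-based CRPS for one observation: E|X - y| - 0.5 E|X - X'|,
    estimated from posterior predictive samples X. The O(n^2) pairwise
    term is acceptable for modest ensemble sizes."""
    s = np.asarray(samples, float)
    term1 = np.mean(np.abs(s - obs))
    term2 = 0.5 * np.mean(np.abs(s[:, None] - s[None, :]))
    return term1 - term2

def coverage(lower, upper, obs):
    """Fraction of observations falling inside [lower, upper]."""
    obs = np.asarray(obs, float)
    return np.mean((obs >= lower) & (obs <= upper))
```

In practice the CRPS is averaged over all observation days, and the interval bounds for the coverage are taken from the quantiles of the predictive samples (e.g., the 2.5th and 97.5th percentiles for a 95 % interval).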
Results of two independent chains of AM with the initial covariance
matrix constructed using the converged DREAM samples. The
Bayesian calibration of TEMs is challenging due to high model nonlinearity, high computational cost, a large number of model parameters, large observation uncertainties, and the existence of local optima. Thus, a robust and efficient MCMC algorithm is desired to give reliable probabilistic descriptions of the TEM parameters.
In this section, we investigate the influence of the proposal initialization
on the computational efficiency and reliability of AM. In the above analysis,
the initial covariance matrix of AM was constructed based on the converged DREAM samples.
As a single-chain sampler, AM can conceivably become trapped in a single
mode (Jeremiah et al., 2011). Consider a distribution with two widely
separated modes and assume that the chain is initialized near one of them,
in both the starting samples and the proposal covariance. At the beginning
of the sampling, AM will explore the area around the mode where it is
initialized and start identifying that first mode. Since the candidate
samples generated by the Gaussian proposal have higher
Metropolis ratios (Eq.
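This trapping behavior is tied to how AM adapts: the Gaussian proposal covariance is built from the chain's own history (Haario et al., 2001), so a history confined to one mode keeps proposing jumps scaled to that mode. A minimal sketch of the adaptation rule (the function name is ours):

```python
import numpy as np

def am_covariance(history, s_d=None, eps=1e-6):
    """Adaptive Metropolis proposal covariance (Haario et al., 2001):
    C_t = s_d * (Cov(x_0, ..., x_t) + eps * I),
    where eps * I guards against a singular covariance.

    history : (t, d) array of the chain's past states
    """
    history = np.asarray(history, dtype=float)
    t, d = history.shape
    if s_d is None:
        s_d = 2.4 ** 2 / d  # scaling suggested by Gelman et al. (1996)
    return s_d * (np.cov(history, rowvar=False) + eps * np.eye(d))
```

If the history covers only one mode, this covariance never grows large enough to propose jumps to a far-separated mode, which is the failure mode discussed above.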
Residual analysis of the calibration using Gaussian likelihood with
heteroscedastic and uncorrelated errors.
Although the two AM chains can only simulate one of the two modes for
The choice of likelihood function plays an important role in Bayesian parameter estimation, and the likelihood construction depends on the assumed error model. In this study, we assumed a heteroscedastic, uncorrelated, Gaussian error model. However, this simple assumption may not be realistic for complex TEMs. In this section, we examine whether the assumed error model provides an accurate representation of the residuals between the simulated and observed NEE. If the assumptions are not satisfied, we consider a more flexible error model and investigate the influence of the corresponding likelihood function on parameter estimation and model performance.
Figure 10 presents results of the residual analysis based on the heteroscedastic, uncorrelated, Gaussian assumption. The plot of residuals versus simulated NEE in Fig. 10a justifies the assumption of heteroscedastic variances, and the density plot of residuals in Fig. 10b justifies the assumption of normality, but the autocorrelation plot of residuals in Fig. 10c indicates that the errors are significantly correlated at a lag of 4, which violates the independence assumption. This violation has been reported for several time-series data models, such as the TEM in Ricciuto et al. (2008), the rainfall–runoff model in Feyen et al. (2007), and the groundwater reactive transport model in Lu et al. (2013). Correlated errors are likely to be observed in models with systematic structural errors, such as the DALEC model in this study.
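The autocorrelation check of Fig. 10c can be reproduced in a few lines: lags whose sample autocorrelation falls outside roughly ±1.96/√n are significant at the 5 % level. A sketch (the function name is ours):

```python
import numpy as np

def residual_autocorr(obs, sim, max_lag=10):
    """Sample autocorrelation of calibration residuals at lags 1..max_lag,
    plus the approximate 95 % significance bound +/- 1.96/sqrt(n)."""
    res = np.asarray(obs, dtype=float) - np.asarray(sim, dtype=float)
    res = res - res.mean()
    n = res.size
    var = np.dot(res, res)
    acf = np.array([np.dot(res[:-k], res[k:]) / var
                    for k in range(1, max_lag + 1)])
    bound = 1.96 / np.sqrt(n)
    return acf, bound
```

Residuals from an adequate uncorrelated error model should stay within the bound at all lags; sustained excursions, as seen here, motivate a correlated error model.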
According to the residual analysis, we consider a heteroscedastic,
correlated, Gaussian error model.
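The paper's exact parameterization is not reproduced here, but a common way to build such a likelihood is to whiten the residuals with a lag-1 autoregressive (AR(1)) coefficient and let the error standard deviation grow with the magnitude of the simulated flux. A hedged sketch under those assumptions (the names `sigma0`, `sigma1`, and `phi` are illustrative, not the paper's notation):

```python
import numpy as np

def correlated_gauss_loglik(obs, sim, sigma0, sigma1, phi):
    """Log-likelihood under a heteroscedastic, AR(1)-correlated,
    Gaussian error model (illustrative parameterization).

    sigma0, sigma1 : heteroscedastic std, sigma_t = sigma0 + sigma1*|sim_t|
    phi            : lag-1 autocorrelation of the raw errors
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    sigma = sigma0 + sigma1 * np.abs(sim)
    raw = obs - sim
    # AR(1) whitening: eps_t = e_t - phi * e_{t-1}
    # (the first residual is left un-whitened for simplicity)
    eps = raw.copy()
    eps[1:] = raw[1:] - phi * raw[:-1]
    return np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2)
                  - 0.5 * (eps / sigma) ** 2)
```

With `phi = 0` and `sigma1 = 0` this reduces to the ordinary homoscedastic, uncorrelated Gaussian likelihood, so the uncorrelated model is nested inside the correlated one.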
Estimated posterior probability density functions (PPDFs) of the six error model parameters.
Residual analysis of the calibration using Gaussian likelihood with
heteroscedastic and correlated errors.
Estimated marginal posterior probability density functions (PPDFs) of the 21 TEM parameters using the uncorrelated and correlated Gaussian likelihoods.
Simulated NEE values based on the MAP estimates from the uncorrelated and correlated Gaussian likelihoods in October 1995.
Figure 11 indicates that the six error model parameters are well identified
within their current parameter ranges. The heteroscedastic parameters
The PPDFs of the 21 TEM parameters using the correlated Gaussian likelihood
are presented in Fig. 13, alongside the results from the uncorrelated
Gaussian likelihood. In comparison, we found that the two error model
assumptions produced different PPDFs for most parameters. The most remarkable
difference is that the bimodality of parameters
The difference in the parameter PPDFs from the two likelihood functions leads to different model performance, as shown in Fig. 14, where we take the simulations in October 1995 as an example. Although the overall RMSEs are similar, the simulations on individual days differ. This is not surprising: MCMC performs a Bayesian calibration, and the calibration results depend on the choice of the likelihood function, mainly through the assumptions of the error model. In this study, the heteroscedastic, correlated, Gaussian error model is more reasonable than the uncorrelated one.
In this work, we applied two advanced MCMC algorithms, AM and DREAM, to the Bayesian calibration of the terrestrial ecosystem model DALEC. In both synthetic and real-data studies, we found that AM is sensitive to the algorithm initialization. When it starts from a proper initialization, obtained through prior information, test runs, or dimension-reduction strategies, AM can produce a reasonable approximation of the parameter posterior distributions. However, AM still shows difficulties in sampling multimodal distributions with its Gaussian proposal. By comparison, DREAM's performance does not depend on the initialization of the algorithm, and it converges quickly to high-dimensional and multimodal distributions. Thus, DREAM is particularly suitable for calibrating complex terrestrial ecosystem models, where the number of uncertain parameters is usually large and the existence of local optima is always a concern. The application indicates that, compared to AM, DREAM can accurately simulate the posterior distributions of the model parameters, resulting in a better model fit, superior predictive performance, and possibly identifying structural errors or process differences between the model and the ecosystem whose observations were used for calibration.
In Bayesian calibration, the choice of likelihood function plays an important role in parameter estimation. In this effort, we justified the assumptions of the error model used in constructing the likelihood function and found that a heteroscedastic, correlated, Gaussian error model is reasonable for this problem, as supported by the residual analysis.
The NEE observation data used in this study are available
from Oak Ridge National Laboratory Distributed Active Archive Center
(
The authors declare that they have no conflict of interest.
This research was conducted by the Terrestrial Ecosystem Science – Science Focus Area (TES-SFA) project, supported by the Office of Biological and Environmental Research in the DOE Office of Science. The Harvard Forest flux tower is part of the AmeriFlux network supported by the Office of Biological and Environmental Research in the DOE Office of Science and is additionally supported by the National Science Foundation as part of the Harvard Forest Long-Term Ecological Research site. The NACP site-synthesis activity supported assembling the data set. Oak Ridge National Laboratory is managed by UT-BATTELLE for DOE under contract DE-AC05-00OR22725. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the DOE's National Nuclear Security Administration under contract DE-AC04-94-AL85000.
Edited by: Trevor Keenan
Reviewed by: Jasper Vrugt and one anonymous referee