Monte Carlo-based calibration and uncertainty analysis of a coupled plant growth and hydrological model

Computer simulations are widely used to support decision making and planning in the agriculture sector. On the one hand, many plant growth models use simplified hydrological processes and structures – for example, by the use of a small number of soil layers or by the application of simple water flow approaches. On the other hand, in many hydrological models plant growth processes are poorly represented. Hence, fully coupled models with a high degree of process representation would allow for a more detailed analysis of the dynamic behaviour of the soil–plant interface. We coupled two of such high-process-oriented independent models and calibrated both models simultaneously. The catchment modelling framework (CMF) simulated soil hydrology based on the Richards equation and the van Genuchten–Mualem model of the soil hydraulic properties. CMF was coupled with the plant growth modelling framework (PMF), which predicts plant growth on the basis of radiation use efficiency, degree days, water shortage and dynamic root biomass allocation. The Monte Carlo-based generalized likelihood uncertainty estimation (GLUE) method was applied to parameterize the coupled model and to investigate the related uncertainty of model predictions. Overall, 19 model parameters (4 for CMF and 15 for PMF) were analysed through 2 × 106 model runs randomly drawn from a uniform distribution. The model was applied to three sites with different management in Müncheberg (Germany) for the simulation of winter wheat ( Triticum aestivum L. ) in a cross-validation experiment. Field observations for model evaluation included soil water content and the dry matter of roots, storages, stems and leaves. The shape parameter of the retention curve n was highly constrained, whereas other parameters of the retention curve showed a large equifinality. We attribute this slightly poorer model performance to missing leaf senescence, which is currently not implemented in PMF. The most constrained parameters for the plant growth model were the radiation-use efficiencyand thebase temperature . Cross validation helped to identify deficits in the model structure, pointing out the need for including agricultural management options in the coupled model.


Introduction
Plant growth and hydrological models are widely used to evaluate strategies such as climate adaption, risk management of pesticide, or fertilizer application in agricultural sciences and politics.However modelling comes along with limitations.Different models can lead to deviating results even if they are driven by the same input and forcing data.Such effects are represented by model uncertainty.Furthermore, the selection of input parameters can change the results and also increase uncertainty.This effect is commonly known as parameter uncertainty.Hence, a good knowledge of these uncertainties is crucial, especially when plant growth models are used to project food supply and hydrological models are applied in order to develop strategies for water resource management.The importance of a comprehensive knowledge about the capabilities and limitations of such models applied in the field of decision making has also been highlighted by Kersebaum et al. (2007).
Most current plant growth models integrate plant growth and hydrological processes tightly, leading to very complex models.Therefore, the calibration of such models is often T. Houska et al.: Monte Carlo-based calibration and uncertainty analysis done step by step.In a number of studies (e.g.Pathak et al., 2012;Wang et al., 2005;Iizumi et al., 2009) the hydrological model has been calibrated as a first step and the plant growth model as a second step in order to reduce the number of parameters varied in one calibration step.However, in such a setup, feedbacks between biomass production and hydrology are not considered (Pauwels et al., 2007).Alternatively, the past years have seen modular model developments and the promotion of comprehensive model-coupling strategies (Priesack et al., 2006).Kraft et al. (2011) coupled the catchment modelling framework (CMF) (Kraft et al., 2011) with the plant growth modelling framework (PMF) (Multsch et al., 2011) to simulate the direct feedbacks of soil hydrological conditions on plant development.However, their coupled version of CMF and PMF has so far only been used for virtual modelling experiments (Multsch et al., 2011, Kraft et al., 2011) and not yet for real observed data.
Instead of calibrating single models step by step, we favour the use of a Monte Carlo algorithm to screen the hyperdimensional parameter space for behavioural model runs of the entire coupled model and apply the GLUE (generalized likelihood uncertainty estimation) method, a widespread Bayesian technique to investigate model performance and parameter uncertainty (Beven and Binley, 1992).GLUE results in a range of parameter sets which all lead to acceptable model runs rather than only one "optimal" calibrated parameter set (Candela et al., 2005).This behaviour is known as "equifinality".Model realizations are grouped into behavioural and non-behavioural model runs and associated parameter sets.The former describes an acceptable model application, allowing for some degree of error in simulating a target value (defined in an a priori threshold criterion).The latter describes parameter sets which return unacceptable model outputs and can be deleted (Beven, 2006).A further distinction is made between constrained and unconstrained parameters (Christiaens and Feyen, 2002).The more sensitive a model parameter for predicting a given target value is, the more constrained it becomes in the remaining behavioural parameter sets.
The level of improvement of the model by the GLUE approach depends on the used likelihood function threshold criterion and the number of sampled parameters.However, the choice of the likelihood function itself also has a strong influence on the results, which has also been reported by He et al. (2010), who additionally highlight the importance of the likelihood function to ensure statistical validity.A number of likelihood functions have been applied: for example, the inverse error variance with a shaping factor (Beven and Binley, 1992), the Nash and Sutcliffe model efficiency (Freer et al., 1996), scaled maximum absolute residuals (Keesman and van Straten, 1990) and the index of agreement (Wilmott, 1981), model bias and coefficient of determination.
A number of studies applied the GLUE method to achieve a better understanding of plant growth models and their parameters.For example, Wang et al. (2005) utilized the GLUE method for evaluation of the EPIC model with the mean squared error as a likelihood function.He et al. (2010) tested the influence of different likelihood functions with the crop environment resource synthesis (CERES)-Maize model.They used modifications of the variance of model errors and mean squared error as likelihood functions.Mo and Beven (2004) applied the method with the index of agreement as a likelihood function for calibration of a soilvegetation-atmosphere transfer model.Pathak et al. (2012) considered bias, root-mean-squared error (RMSE) and the index of agreement as likelihood functions in the uncertainty assessment of the CSM-CROPGRO-Cotton model.
In this study, the combined set of model parameters from a fully coupled plant growth model (PMF) and a hydrological model (CMF) was calibrated parallelly.Hence, the objectives of this study were as follows: -In-depth analysis of the coupled model setup through a GLUE analysis to investigate the sensitivity of plant growth and hydrological model parameters and to derive a range of behavioural model runs.
-Cross validation of parameter transferability for three different sites against observed soil moisture and biomass data for storages and roots as well as stems and leaves (further summarized as plant dry matter) of winter wheat.
To describe the "goodness of fit" of our model prediction, we used a set of three likelihood functions (model efficiency, bias and coefficient of determination).Subsequently, we will distinguish between (i) forcing data (e.g.meteorological observations), (ii) input data (e.g.soil information) and (iii) model parameters (e.g.plant coefficients).

Catchment modelling framework (CMF)
A plot-scale hydrological model for the unsaturated zone was built by using the catchment modelling framework (CMF) (Kraft, 2011).CMF is a computer program used for setting up individual hydrological models.A programming library facilitates the design of water transport models between soil layers in up to three dimensions.It allows the development of detailed mechanistic models as well as lumped large-scale linear storage-based models.A model in CMF functions as a network of storages and boundary conditions connected by flux-calculating sub-models.It works as an extension to Python and can easily be coupled with other models.The specific realization of CMF was done with a onedimensional setup.Water fluxes were simulated with the Richards equation.We simulated the soil moisture with the van Genuchten-Mualem model of the soil hydraulic properties ( van Genuchten, 1980) for 50 soil layers with a uniform thickness of 0.05 m.The k sat parameter was used to simulate the saturated conductivity.The porosity parameter is defined by pore volume per soil volume, while alpha and n as known van Genuchten parameters.The interaction of the lowest soil layer with the groundwater is modelled as a Neumann boundary condition.To initiate the water content of CMF we used meteorological data for the year 1992 and calibrated it for the growing season 1993/1994.

Plant growth modelling framework (PMF)
As a plant growth model, we used the plant growth modelling framework (PMF), developed by Multsch et al. (2011).PMF is a dynamic and integrative tool for setting up individual plant models.In general, PMF consists of four core elements: (i) plant model, (ii) process library, (iii) plant building set and (iv) crop database.The basic idea of PMF is to divide the plant into its parts -root, shoot, stem, leaf and storage -which interact during the growth process.This structure builds up the plant model.A process library contains mathematical formulations of biophysical processes, such as biomass accumulation, water uptake and development.The user can connect the plant model with a set of biophysical processes by using the plant building set.The plant parameters are taken from the crop database.
The biomass accumulation is affected by the radiation use efficiency (RUE).The higher the RUE, the higher is the biomass accumulation.RUE is used to calculate the biomass growth with the biomass radiation-use-efficiency concept (Monteith and Moss, 1977).The mathematical solution of the radiation use efficiency in PMF is based on Acevedo et al. (2002).The photosynthetically active absorbed radiation is calculated by solar radiation and its intercepted fraction.The simulation of biomass accumulation from photosynthetic active radiation is performed with the canopy extinction coefficient (k), where the radiation is directly transformed into dry matter or assimilated CO 2 .An overview of this concept is given by Sinclair and Horie (1989).The minimum temperature for plant development is defined by the base temperature (t base ).This acts as a threshold temperature above which development occurs.Each plant development step is defined by a temperature sum.If the temperature sum is reached, then the developing process begins.If this parameter is too high, then the plant starts its growing process too late, and vice versa.For simplicity, the parameter t base is independent from further environmental influences.Development stages are used to control biomass allocation between plant organs.The plant development is divided into the eight stages by a thermal time threshold: emergence, leaf development, tillering, stem elongation, anthesis, seed fill, dough stage and maturity.Root elongation determines the daily root growth rate.Root water uptake in PMF equals a macroscopic approach type 2 (Feddes et al., 2001;Hopmans and Bristow, 2002).In this concept the actual water uptake is distributed over the rooting zone and occurs as a sink term in the Richards equation.The allocation of water uptake in PMF depends on the relation of the root biomass in each soil layer and the total root biomass in the rooting zone.Influences of water are incorporated according to the soil moisture conditions with a crop-specific response function.The response function is related to the soil matrix potential and the water content.According to Feddes et al. (1978), the response function is determined by three thresholds defining oxygen deficiency, wilting point and optimal water uptake conditions.The crop-specific response function includes a dimensionless water stress index.The resulting actual water uptake from each layer is the product of the stress index and the available water.
The root growth takes place during sowing and the development stage anthesis.During this part of the growing season, a fraction of the total biomass is allocated to the root.Root growth includes the calculation of the total underground biomass as a fraction from the total plant biomass, the calculated vertical growth (elongation) and the distribution of the root biomass over the rooting zone (branching).The last group of parameters are the basal crop coefficients (kcb ini , kcb mid and kcb end ), which are used to assess plant transpiration from potential evapotranspiration (Table 1).The transpiration and evaporation in PMF are simulated according to the dual-coefficient approach described by Allen et al. (1998).The potential PET is adjusted with crop-specific coefficients to account for different growing stages.This crop coefficient is low at the end of the growing period of winter wheat to account for a lower transpiration as a consequence of leaf senescence.For our case study, because of the lack of available phenological data, we did not activate the vernalization module.The important process of vernalization will be tested in future when phenological data are available.Plant partitioning is done according to biomass fractions of each plant organ according to a table given by De Vries (1989).The fraction of biomass which is allocated to each organ depends on the growing stage.Root growth and stem elongation occurs until anthesis.After that stage, dry matter is only allocated to the above-ground biomass.At the very end of the growing season the storage organs are filled.All PMF parameters are chosen on the basis of their influence on roots, stems and leaves or storages dry-matter outputs, based on oneparameter-at-a-time sensitivity analyses and expert knowledge.

Coupling CMF and PMF
Since both models contain Python interfaces which expose all states and fluxes, the implementation of "glue" code to connect the exchange of data is trivial.However, the definition of the functional boundary between the models is a major issue in model coupling -that is, which processes are covered by which model.It necessary to be ensure that Table 1.Parameter ranges of the Monte Carlo simulation for the coupled CMF-PMF for site 1 in Müncheberg.PAR stands for photosynthetically active radiation.Minimal to maximal input is the range for the GLUE analysis, while the output is the constrained range of the observed behavioural parameter sets (cf.Figs. 2 and 3).Uncertainty reduction in the output over 30 % is reflected in bold type.processes are not simulated by both models.In this study, the problem is solved by the definition of interfaces in each model to transport information transparently.

Parameter
The plant model contains methods for the connection to the environmental interfaces ("soil", "atmosphere") and the classes of the process library ("water", "ET").The methods of the classes "soil" and "atmosphere" are implemented to query the state variables and parameters of CMF directly.Hence, at any code block, where PMF needs information about the actual matrix potential, water content and meteorological conditions, this information is routed through these classes to CMF, which sends the data back.
The class "ET" calculates the evapotranspiration and the class "water" estimates the plant's water stress due to soil moisture conditions.During every day in the growing season, the plant model calls the class ET for the calculation of the potential transpiration (T pot ).A water stress value for each layer (ws) is calculated and contained in w stress .Finally, the actual transpiration is determined by the sum of the stressdriven water uptake from the rooting zone and the water uptake of the plant from each layer by the proportion of fineroot content and water stress of each layer.
Each layer of the water transport model has an accessible Neumann boundary condition representing the system boundary between the soil and the roots.The flux of the boundary conditions is set by the coupling code to the wa-ter uptake calculated by PMF.Using the interface approach, the governing equation of water transport, plant growth and potential transpiration can be changed without changing the coupling system.An extension from plots to larger scales such as a hillslope, as shown for a virtual case by Kraft et al. (2011), or a full 3-D landscape model is hence feasible without the need to change PMF and the coupling mechanism.
The two models are run consecutively after each time step using an operator split approach.First the plant growth model PMF simulates one time step, t, using the states of the water transport model CMF at t − 1.After that, CMF proceeds to t using the water uptake of PMF as a loss term at each soil layer.

Likelihood functions
The performance of a parameter set to predict observations was evaluated by goodness-of-fit values, represented by several likelihood functions.The choice of the likelihood function depends on the situation and is often subjective (Beven and Binley, 1992).Nevertheless, the choice of only one objective function for the calibration is inaccurate in most cases (Vrugt et al., 2003).We therefore used a combination of three likelihood functions: 1.The bias function was used as a central statistical measurement to summarize overall model performance: where L θ j |Y is the likelihood measure for each model run with parameter set θ, N is the total number of measurements, Y i is the measured value for the ith measurement and Ŷi is the corresponding output of the model.The bias measures the differences between measurements and model outputs.For underpredictions of the model, the bias is positive; for overpredictions, the bias is negative.Thus, it is a useful measure for assessing whether structural changes of the model equations are necessary for reducing the overall bias of the prediction.(Wallach, 2006).However, bias alone is not sufficient to evaluate model errors, as a bias of zero could also be due to cancellation of large errors with different signs (Wallach, 2006).
2. In order to measure the deviation of model prediction and measurement data we used the coefficient of determination, which is defined as where Ȳ is the average of the measured and Ȳ is the average of the simulated data.A maximum value of R 2 = 1 indicates that a perfect linear relationship between measured and calculated values exists, while the minimum value of R 2 = 0 indicates a low performance of the model.R 2 alone is also not a good measure of the model agreement with the observations, as R 2 could also be equal to 1 if the model systematically over-or under-predicts.
3. Finally, we employed the Nash-Sutcliff index (Nash and Sutcliffe, 1970) for measuring the model's sensitivity to outliers.This widely used function (e.g.Garnier et al., 2001;Beven and Binley 1992) is calculated as follows: If the model predicts the measurements perfectly, we have Y i = Ŷi , implying NSE = 1.If Ŷi = Ȳ for all i, then NSE = 0. Thus, a model which gives NSE = 0 has the same goodness of fit as using the average of the measured data for every situation (Wallach, 2006).
The three proposed likelihood functions cover most aspects in an adequate manner.They are widely used in hydrology (e.g.Li et al., 2010;Besalatpour et al., 2012;Pathak et al., 2012) with high explanatory power.It has to be noted that other choices for the likelihood function would certainly be imaginable (Beven and Freer, 2001).

GLUE sequence
The general GLUE method proceeds in several consecutive steps (Beven and Binley, 1992), which were adapted to this specific study: a. Selection of sensitive parameters: a full list of all parameters considered in GLUE from CMF and PMF is given in Table 1.Fifteen plant specific parameters from PMF which influence plant development, transpiration and biomass production were altered in the analysis.The hydrological parameters were given by the van Genuchten-Mualem parameters.These parameters were selected on the basis of a one-parameter-ata-time sensitivity analysis, which is not presented in this study.
b. Creation of a priori distribution: a random function was used to create 2 × 10 6 parameter sets, whereby each parameter set consisted of 19 parameters.Since their a priori distribution was unknown, a uniform distribution was assumed.The parameter ranges were selected on the basis of expert knowledge and other publications.
c. Execution of model runs: the 2 × 10 6 realizations of the coupled CMF-PMF model were forced with the same climate data for three different sites by using a high-performance computing cluster.
d. Creation of posteriori distribution: the simulated variables of both models were compared with observed data by using the three likelihood functions NSE, R 2 and bias.The variables of PMF are root, stem and leaves as well as storage dry matter and soil moisture in the case of CMF.Three threshold criteria were used to obtain parameter settings which fit the measured data equally well.All parameter sets that resulted in a bias > ±500 kg ha −1 for the plant dry matter and > ±10 % soil moisture, an NSE < 0 and a R 2 < 0.3, were discarded.
These four steps resulted in behavioural parameter sets for the coupled model for each study site.In order to test the limitations of the behavioural parameter sets, a full cross validation for all three sites was conducted.

Study site and data
Study site: the coupled CMF-PMF model was parameterized and evaluated using data from three agricultural field sites.(Mirschel, 2007).This extensive data set was used in several previous modelling studies (e.g.Wegehenkel, 2000;Palosuo et al., 2011;Kersebaum et al., 2007).Sites are characterized by a primarily sandy Eutric Cambisol, with a homogenous volumetric sand content of 80 to 90 % in a soil profile with 2.25 m depth.Silt and clay content contribute 5 to 10 %.The soil is mediumtextured with good structural stability.The bulk density is around 1.5 g cm −1 and the organic matter content in the first 0.3 m amounts to 0.6 %.An in depth description of soil physical and chemical properties is given by Mirschel (2007).
Forcing data: the climate data comprise daily sum of precipitation, minimum and maximum temperature, mean relative humidity, early morning vapour pressure, global radiation and mean wind speed.During 1994 (the year for which winter wheat was cultivated), the conditions were relatively humid, with an annual precipitation of 714 mm and an annual average temperature of 9.1 • C.During the growing season (October-July), precipitation of 588 mm and an average temperature of 15.8 • C were measured.The day of sowing and the day of harvest were set to observed dates (15 October 1993 and 29 July 1994).
Evaluation data: average gravimetric soil moisture measurements are available for three soil depths (0-0.3,0.3-0.6 and 0.6-0.9m) on 15 days during the observation period from 1993 to 1994 for every site.The average soil moisture ranged from 12.1 to 12.9 % at site 1-3, with a minimum of 3 % and a maximum of 21.2 %.Soil moisture was very similar across all three sites.The sites differed in their management strategies, with high-level intensive (site 1), organic (site 2) and extensive management (site 3) and in their winter wheat cultivar, namely 'Bussard', 'Ramiro', and 'Greif'.Crop growth data for winter wheat are available for five different days in 1994.Data on root dry matter [kg ha −1 ], stem and leaves dry matter [kg ha −1 ] and storage dry matter [kg ha −1 ] are given for all three sites.

Parameter uncertainty
To assess the range of each parameter in the behavioural parameter sets, we need to take a closer look at the parameter distributions for the different likelihood functions.Table 1 summarizes the results of the GLUE approach, providing the a priori and posteriori parameter ranges for the 19 model parameters as well as the reduction of the parameter uncertainty.For five parameters we were able to substantially reduce their uncertainty bounds by 30 to 70 %, while 11 parameters were rather unconstrained, with uncertainty reduction of less than 10 %.Out of the eight parameters that define the growing stage through the thermal time requirement [ • days] only the parameter tillering showed a large uncertainty reduction potential.This indicates that many of the parameters identifying plant growth stages lead to a high grade of equifinality.
A selection of eight model input parameters in terms of behavioural model runs is shown in Fig. 1, where four CMF and four PMF parameters are given as interaction scatter plots.The figure depicts the mean NSE (calculated from the single, equally weighted NSE for soil moisture, roots, stems and leaves, as well as storage dry matter) for site 1 as an example for constrained as well as non-constrained parameters of Table 1.On the interaction scatter plots, no correlations between parameters of PMF and CMF can be detected (Fig. 1).As the GLUE method cannot deal with such correlations, this is an important precondition of the GLUE method (Jin et al., 2010).The interaction scatter plots also show a clear prediction boundary for the parameter n at 1.3 [-] and for RUE at 6 g MJ −1 PAR.A setting of RUE above 6 g MJ −1 PAR and n below 1.3 [-] can never lead to an adequate model prediction for winter wheat and soil moisture in 1994 for site 1 in Müncheberg, regardless of which values are selected for other model input parameters.
The most constrained parameter of PMF is RUE (Table 1, Fig. 1), which influences biomass accumulation.Good settings for RUE were found from 1.5 to 4.9 g MJ −1 PAR (Table 1).This range is in line with most other applications.Acevedo et al. (2002) suggested a RUE of 3.0 g MJ −1 PAR for wheat, which was used in the setup of Multsch et al. (2011).Calderini et al. (1997) found RUE across their investigated wheat cultivars between 1.08 and 1.16 g Mj-1 PAR.Lindquist et al. (2005) suggested a RUE of 3.8 g MJ 1 PAR.DSSAT version 4.0 uses RUE with a setting of 4.2 g MJ −1 PAR (Jones et al., 1998).Due to large influence of weather variability to the RUE response, CERES-Maize developers do not recommend using RUE as a calibration parameter (Ma et al., 2011).We therefore suggest fixing RUE at our found optimum of 2.02 g MJ −1 PAR for further applications of PMF.
The second-most-constrained parameter (Table 1, Fig. 1) is the CMF shape parameter of the retention curve n, with a strict optimum at 1.45 [-] and a range reduction of 60 % through the threshold criteria.Christiaens and Feyen (2002) found n to be not greatly constrained from 1.2 to 1.6 for the MIKE SHE model.In contrast, Vogel et al. (2000) reported the parameter to be quite sensitive.Ippisch et al. (2006) demonstrated that the van Genuchten-Mualem model caused convergence problems with n close to 1.0 for the numerical solver, which we can confirm.
When looking at the density distribution of the behavioural model runs in Fig. 1, we can locate an optimal parameter range with 0.015-0.025[−] for alpha.But even within this range there is no guarantee of a good model response.Several parameter values in the considered range of alpha yield poor prediction with an NSE < 0, depending on the settings of other model parameters.The parameter t base provides best results within the range of 1.5 to 2.5 • C (Table 1).These settings are higher than the value given for PMF by Multsch et al. (2011) with 0 • C, which was based on a study by Mc-Master and Wilhelm (1997).A wide range from 0 to 10 • C for t base can be found in the literature, and is strongly dependent on cultivars (Porter and Gawith, 1999).The root growth shows a local optimum at 0.6 and a global optimum at 2.4 cm day −1 .Following our GLUE results the k parameter could not be confined; nevertheless, Pathak et al. (2012) found the k parameter constrained to around 0.64 [-] for the CROPGRO-Cotton model.All other investigated parameters are unconstrained within their boundaries to the outputs of various plant dry matter and soil moisture (Table 1, Fig. 1).
Finding mainly insensitive parameters corresponds well to prior studies using the GLUE procedure for models with a similar large number of model parameters (e.g.Viola et al., 2009;Rankinen et al., 2006).Part of the problem is that parameter-rich models allow for equifinality, levelling out the impact of certain parameters.days and lower during moderate moisture conditions.But the GLUE method per se has the tendency to overestimate uncertainty during low and high simulation events (Vrugt et al., 2008).In the soil depth from 0.3 to 0.6, as well as in the soil depth from 0.6 to 0.9 m, we have a constant uncertainty in the prediction of around 5 %.The median of the behavioural model run in the upper soil layer has an NSE of 0.57, a bias of 2 % soil moisture and an R 2 of 0.84.Model performance criteria in the soil depths below are of similar quality, with poorer performance for R 2 values but improved biases (Fig. 2).

Soil water balance
Compared with other studies, the median output for soil moisture after calibration was on the same quality level as previously reported findings.For example, Christiaens and Feyen (2002) published results of the GLUE method used for the MIKE SHE model with an NSE close to zero.Jiménez-Martínez et al. ( 2009), who simulated the soil moisture under melons growing in southeastern Spain, found a van Genuchten parameter set for Hydrus-1D model resulting in a high R 2 of 0.9 and 0.029 % RMSE for soil moisture.Scharnagl et al. (2011) found a RMSE of 0.009 water content and NSE of 0.87 for their Hydrus-1D model setup at the site of Selhausen, near Jülich, Germany.Their uncertainty bounds for soil moisture varied around 3 %, with higher uncertainty during dry and wet situations, which is consistent with our findings.We obtained similar model efficiencies as the best-performing model CERES (NSE = 0.66) in predicting the soil water content over all soil depths as a study of Kersebaum et al. (2007) at the same study site 1 in Müncheberg.
Despite those results, the prediction uncertainty could be further reduced by using more model runs and a stricter setting of threshold likelihood function.However, single model run time and the number of model runs had already pushed the overall computer run time of the uncertainty estimation provided here to three months.An efficient way might be the use of the DREAM algorithm that is able to solve complex posteriori probability density functions for a large number of parameters.This algorithm could reduce model runs and the uncertainty of the posteriori distribution (Vrugt et al., 2008).The GLUE method was chosen because of its easy implementation and the possibility of parallelization.
Nevertheless, the simulated parameter uncertainty can also depend on the chosen likelihood function and is not independent of errors in measured data (Mo and Beven, 2004).Thus, instead of attributing remaining model predictive uncertainty to the coupled CMF-PMF model structure itself, we should be aware that there are other sources of global uncertainty that impact on the overall model performance.

Plant growth
Results for the root, stem and leaves as well as storage dry matter are given in Fig. 3.This distribution shows good results for the root dry-matter simulation.All observed values fall within the 50 % probability range.A high NSE of 0.94 and R 2 of 0.98 along with a very low bias of −58.2 kg ha −1 indicate acceptable model performance.The median of the stem and leaves dry-matter simulation quality is lower than for the other simulated outputs with NSE of 0.79, bias of 221.7 kg ha −1 and R 2 of 0.87.Looking at the uncertainty boundaries, we can locate a relatively large uncertainty starting especially from July onwards and a somewhat lower uncertainty during the first half of June, without matching the observed value on 14 June.The observed value on this day is even higher than the next observation on 26 July, which may, however, occur in reality owing to decaying leaves (senescence).In the current model version of PMF the model cannot represent a reduction of biomass during the growing season due to leave senescence.In this sense, GLUE even facilitates the investigation of the model structure and identification of clear model limitations.The uncertainty of the prediction is constant around 500 kg ha −1 for the root dry matter, while the stem and leaves as well as the storage dry matter have a mean uncertainty of around 2000 kg ha −1 .The storage dry-matter simulation fits the measured data within the 50 % probability boundary.
In comparison to previously reported studies, we obtained acceptable results for the prediction of plant dry matter.For example, Jégo et al. (2010) found an RMSE of 1000 kg ha −1 for spring wheat biomass simulation using the STICS model.Results are also excellent when compared to the model intercomparison study of plant growth models that was realized for the same forcing and evaluation data set by Kersebaum et al. (2007).Eight models were applied, resulting in RM-SEs from 773 kg ha −1 to 3329 kg ha −1 and NSE spanning from 0.19 to 0.96 in simulation of above-ground biomass for site 1.The best-performing model was AGROSIM (note that this was the worst-performing model in the intercomparisons project by Kersebaum et al., 2007, who looked at soil moisture prediction), while the CANDY model returned the worst results.

Cross comparison of sites and parameter sets
We obtained reliable soil moisture and plant dry-matter outputs for three experimental sites in Müncheberg with our coupled CMF-PMF model.But the observed data set is relatively small, as is mostly the case in measured plant data analyses.Therefore it is even more essential to evaluate model performance across different field sites.We therefore chose to apply a cross-validation method in which parameter settings for one site were tested on another site and vice versa in order to investigate the general model and parameter transferability (Pathak et al., 2012).This procedure has become an established method in dealing with small data sets in the course of model parameterization, calibration and validation (Nassif et al., 2012).
A comparison of the range of the behavioural parameter sets of the three sites is shown in Fig. 4. We can see small and similar ranges for the most-constrained parameters -RUE, t base , alpha and n -over all sites.In this case, site 1 shows the widest range for the constrained parameters k sat , alpha,  t base and root growth.Medians shown as red lines in the box plots indicate optimal parameter settings for the constrained parameters.They are located more or less at the same position for the four constrained parameters, while this position varies throughout the sites for the other parameters.We conclude that in further applications of CMF-PMF, ranges for the constrained parameters as given in Table 1 can be substantially reduced to obtain improved model runs.One could also consider fixing the parameter to the median and excluding them from further calibration.
We employed a cross validation with each of the behavioural parameter sets we obtained for one site on the remaining two sites.As examples we show results of the cross validation for soil moisture 0.3-0.6 m [%] (Fig. 5) as well as for stem and leaf (Fig. 6).Transferability of model parameter sets worked well for soil moisture.In comparison to the other sites, we found the smallest uncertainty ranges to be at site 3.While site 1 has a mean uncertainty around 10 %, site 3 has only 5 %.Nevertheless, parameter sets found for site 1 worked well for the other sites.The NSE dropped from a high the behavioural boundary condition (NSE>0, bias<±500 kg ha -1 plant dry matter and R²>0.3). 5 The yellow area is the 95% probability range of the simulation, the orange area the 50% 6 probability range.7 Fig. 6.Cross validation of stem and leaves prediction with uncertainty boundaries.Grey-shaded sites are calibrated.Black dots are observed values, and the red dashed line is the median of the behavioural boundary condition (NSE > 0, bias < ±500 kg ha −1 plant dry matter and R 2 > 0.3).The yellow area is the 95 % probability range of the simulation, and the orange area the 50 % probability range.level of 0.48 for site 1 to NSE = 0.31 for site 2 and an NSE of 0.37 for site 3.The same cross validation at the other sites resulted in a small but constant range of NSE between 0.23 and 0.37.The bias remains the same across all sites ranging between very low soil moisture of −0.6 and −1.0 %.
Cross validation for stem and leaves output of the CMF-PMF model worked well for sites 1 and 3, but less well for site 2 (Fig. 6).This is most likely related to the similar observed values for the stem and leaves dry matter at the intensively managed site (site 1) and the extensively managed site (site 3), while the stem and leaves dry matter at the organically managed site (site 2) is significantly lower.Uncertainty boundaries for the organic site growths during the simulation period are up to 2000 kg ha −1 , while the uncertainty for the other sites increases up to 3000 kg ha −1 , with a very low uncertainty around the 14 June 1994.We found similar NSEs of 0.79 for site 1 and 0.74 for site 2 and 3. Validated on the other sites, parameter settings for site 1 resulted in an acceptable NSE of 0.35 for site 2 and a good NSE of 0.88 for the extensive site (site 3).R 2 remains on the same level through all tested sites, between 0.79 and 0.91, indicating that the general dynamics of crop growth were captured for all sites.The variation of performance criteria in the cross-validation experiment (i.e.stable R 2 across all sites versus a drop of NSE and an increase in bias from one site to another) highlight the importance of using a set of different likelihood functions.
One source of uncertainty in the prediction quality of the coupled model with regard to dry-matter production is to be seen in our disregarding of further field management strategies.Even though fertilization is considered in the simulation of crop growth in the current model setup of CMF-PMF, we neglected other agricultural management options that significantly influence biomass production, e.g.pesticide application in conventional field management or weeding in organic farming.Accordingly, the water balance of the soil has been simulated without a nutrient balance.For this reason, only water stress restricts crop growth.Fertilizer demand does not constrain plant growth in our model setup, as fertilizer is provided as unlimited.In general, PMF is capable of accounting for active and passive nitrogen uptake.Selection of cultivars also has a significant impact on yields, which is also not considered in PMF.The reason for this is simple: PMF does not have a management tool.Site 1 was managed with a highlevel intensive strategy, site 2 an organic one and site 3 an extensive one.The winter wheat cultivar was also adapted to these strategies with elite winter wheat 'Bussard' for site 1, infrequently used 'Ramiro' for site 2 and elite winter wheat 'Greif' for site 3.While these differences in management and cultivar lead to a high variability in plant matter production across sites (Fig. 6), they does not impact soil moisture conditions to a similar degree (Fig. 5).Consequently, to apply PMF in the sense of a full crop growth model for agricultural application, a management module is required that considers typical management strategies in agriculture.Instead of full inclusion of this management tool in the PMF model itself, we promote following the idea of the framework strategy of PMF as well as CMF and apply an external farm management model (Aurbacher et al., 2013;Windhorst et al., 2012).

Conclusions
In keeping in line with standards for the development and evaluation of environmental models, our implementation of four consecutive steps in implementing the GLUE method (sect.2.2.2) for the validation of our coupled model is consistent with the postulations of Jakeman et al. (2006).Through the investigation of the parameter uncertainty, the CMF-PMF model performance was found to crucially depend on the parameter values for n (CMF) and RUE (PMF).Their uncertainty boundaries could be reduced by 60 and 77 %, respectively, through the GLUE analyses.Other parameters -including k, emergence, stem elongation and anthesisshowed only a minor influence on the model outputs.The performance of our CMF-PMF model setup was found to be better than some previously tested models given that model performance was good for soil moisture and plant dry matter at the same site.Overall, approximately 90 % of the observed soil moisture data were within the predicted uncertainty bounds that were determined through the GLUE method.The model performances for simulating observed plant dry matter were found to be in an uncertainty range from 500 to 2000 kg ha −1 , with only one measured value missed.The posteriori parameter settings found can be used for a more efficient calibration of the CMF-PMF model in future case studies.
The cross validation at different sites showed only slight reductions of the likelihood functions.From this, we conclude that the model is transferable in space, at least under similar soil and weather conditions.Next steps should include a model test over several growing periods for which other crops need to be covered by PMF to be able to simulate crop rotation patterns.
Structural model uncertainty was identified with regard to the need of including agricultural management options and the missing capability of representing senescence in PMF.While the latter should be improved by considering processes reflecting senescence within PMF, we propose coupling of an agricultural management model and CMF-PMF, an essential step if PMF is to be used as a full crop growth model.

Fig. 1 .
Fig. 1.Parameter uncertainty and interaction.The scatter plots show parameter interaction and correlations for behavioural model runs coloured from yellow to red for NSE > 0 at site 1 in Müncheberg for the coupled CMF-PMF model.PMF parameters are given on the x axis, and CMF parameters are plotted on the y axis.The density distributions at the top and to the right depict the parameter uncertainty.NSE are reported as mean, equally weighted NSE for soil moisture, root, stem and leaves, as well as storage dry matter.

Figure 2
Figure2summarizes the capability of the coupled CMF-PMF model for predicting the soil moisture output in three depths.In addition to the median of the GLUE-derived behavioural model runs, we also showed the 50 and 95 % uncertainty bounds.Overall, approximately 90 % of the observed data were within the predicted uncertainty bounds.The uncertainty of the prediction was higher during dry and wet

Figure 2 . 6 Fig. 2 .
Figure 2. Probabilistic time series for the simulation of soil moisture with behavioural 3 (NSE>0, bias<±10% soil moisture and R²>0.3)CMF-PMF model runs on site 1 for three soil 4 depths.Inserts: The likelihood functions quantify the median of the prediction range.5 6

Figure 3 . 7 Fig. 3 .
Figure 3. Probabilistic time series for the simulation of plant dry matters with behavioural 3 (NSE>0, bias<±500 kg ha -1 plant dry matter and R²>0.3)CMF-PMF model runs on site 1. 4 Inserts: The likelihood functions quantify the median of the prediction range.Note differences 5 in scale of y-axis.6 7

Figure 4 .Fig. 4 . 2 Figure 5 .Fig. 5 .
Figure 4. Range of behavioural parameter sets considering all three threshold criteria of the 3 CMF-PMF model for the three sites in Muencheberg.Results are shown for the same set of 4 model input parameters as in Fig. 1. 5 6

Figure 6 .
Figure 6.Cross-validation of stem and leaves prediction with uncertainty boundaries.Grey 3 shaded sites are calibrated.Black dots are observed values, red dashed line is the median of 4

2082, 2014 2074 T. Houska et al.: Monte Carlo-based calibration and uncertainty analysis
They are located at the Müncheberg experimental stations, 50 km to the east of Berlin, Germany, where the ZALF (Leibniz Centre for Agricultural Landscape Research) recorded a comprehensive experimental data set www.biogeosciences.net/11/2069/2014/Biogeosciences, 11, 2069-