Interactive comment on “ A new concept for simulation of vegetated land surface dynamics – Part 1 : The event driven phenology model ” by V . Kovalskyy

The paper presents a new approach to model vegetation phenology called the ‘event driven phenology model’ (EPDM) and illustrates its performance at two Ameriflux crop sites. The paper is well written and the EPDM concept is interesting. I also appreciate the aspect of satellite data assimilation and propagation of uncertainties. I recommend publishing the paper in BG if the raised points below will be adequately addressed in a revised version.


Introduction
Phenology has been used increasingly to indicate changes in climate (IPCC, 2007).Focused on temporal shifts in biogeophysical cycles, recent studies have detected signs of significant local and regional changes in observations of phenol-ogy both at ground level (Kramer et al., 2000;Parmesan and V. Kovalskyy and G. M. Henebry: The event driven phenology model progress, being of vital importance to agricultural production and silviculture (Wisiol and Hesketh, 1987;Hay and Walker, 1989;Kaduk and Heimann, 1996;Thornley and Johnson, 2000;Bondeau et al., 2007;Ahrends et al., 2008;El Hajj et al., 2009).Simplified phenological models have been adapted into land surface modules (LSMs) of both regional and global climate models (RCMs or GCMs) to track the seasonality of fractional vegetation cover (FVC) and/or leaf area index (LAI) for LSM parameterization (Richardson et al., 2011).For instance, the Community Climate System Model (CCSM3) developed at NCAR (Blackmon et al., 2001) and used in the IPCC Fourth Assessment Report (AR4; IPCC, 2007) includes several functional vegetation types in its dynamic global vegetation model (DGVM).For each vegetation type the DGVM requires daily LAI generated by a phenology sub-module.To meet the demands of high volume computing, the sub-module mimics only three types of LAI development (summer-up, rain-up, and grass) based on accumulated growing degree-days or precipitation (Bonan et al., 2003;Levis et al., 2004).In BIOME-BGC, the simplification is taken to the level of time driven LAI growing function (Wang et al., 2009).Most often, models use static climatologies of canopy characteristics (Hasumi and Emori, 2004;Senay, 2008).
Simplifying phenologies can, however, have drawbacks in terms of error tracking and accuracy.Although phenological transitions are linked to calendar time or one of several possible proxy variables, the sub-modules often have no way to be driven by the weather simulated in the main RCM or GCM.Yet, phenology modules in the global climate models such as CCSM3 (Bonan et al., 2003;Levis et al., 2004) and MIROC (Hasumi and Emori, 2004) or the watershed model SWAT (Neitsch et al., 2002) are based on these principles.Demonstrating the shortfall of this approach, Stöckli et al. (2008b) showed largely inadequate performance of TRIFFID (Topdown Representation of Interactive Foliage and Flora Including Dynamics; Hughes et al., 2004), IBIS (Integrated BIosphere Simulator of NCAR Community Land Model; Foley et al., 1996) and NC (Carbon-Nitrogen dynamics simulator in BIOME BGC; Thornton et al., 2002).A comparable problem was identified by Bondeau et al. (2007), evaluating their phenology module for the LPJ DGVM.Moreover, none of these models provides the means to track uncertainties in phenology.Pitman et al. (2009) identified crop phenology as one of the key weaknesses in GCM LSMs which inhibits a model intercomparison on the effects of land cover change on modeled climate.
The challenges for modern phenological modeling extend beyond issues of accuracy in canopy state estimation and quantifying uncertainty.The availability of spatially explicit observations of the modeled variables calls for data assimilation to become mandatory for LSP modeling (Walker et al., 2001;Nagler, 2008;Turner et al., 2008).A pioneering attempt to use a data assimilation scheme with phenological model was made by Stöckli et al. (2008b).Based on Grow-ing Season Index (GSI) developed by Jolly et al. (2005), it presented a major advancement over traditional phenological models by relying on three environmental factors that can limit plant growth.The line of this research continues (Stöckli et al., 2011) but addressing spatial variability in phenology also requires inclusion of disturbances such as fire, grazing, hail, heat waves, floods, etc., that can affect -or even reset -plant growth and development.Thus, there is the need for a modern phenological module that could offer a flexible generic interface for a greater variety of factors while coupling to other models or in a standalone application.
In this paper we demonstrate an implementation of a new approach to phenological modeling that can (a) interact with ongoing meteorological conditions, (b) work in both prognostic and diagnostic modes, (c) track uncertainties (via error propagation); and (d) use remotely sensed data to adjust outcomes (via data assimilation).The key feature of our approach is the representation of the driving forces in the form of events that can influence plant growth and development; thus, we call it the event driven phenology model (EDPM).While it is easy to understand an event as an abrupt environmental disturbance, e.g., rainfall, hailfall, moisture stress, or frost, the approach also brings insolation and air temperature (growing degree-days) into the form of events.The transformation of continuous factors into events relies on partitioning that depends on canopy responses.Here, the range of possible factor values is divided into segments based on vegetation responses.For instance, air temperature can be partitioned into freezing temperatures, growth supporting temperatures, and heat stress that later can be treated as discrete events.We will demonstrate that this modeling approach also enables representing interactions of multiple drivers/events that is crucial for capturing the temporal variability of vegetation (Seastedt and Knapp, 1993;Knapp and Smith, 2001;Zhang et al., 2010;Schwalm et al., 2010).
Here we start the evaluation of event driven approach for potential regional application to predict seasonal trajectories of a key characteristic of the vegetated land surface while estimating the timing of phenological transitions.As the first step towards realizing its potential, the approach is tested on flux tower site-level, at daily time steps simulating temporal development of maize and soybean canopies at two Amer-iFlux locations: Mead, Nebraska (NE), and Bondville, Illinois (IL).The case study aims to show that, after training the EDPM can capture and reproduce the response patterns seen in the dynamics of canopy properties after different events during three broad phenological phases (or phenophases): green-up, reproduction, and senescence.In this fashion the model can reconstruct past trajectories as well as project future dynamics of a canopy attribute, such as the "tower normalized difference vegetation index" (TNDVI) derived from flux tower records of instantaneous insolation and photosynthetically active radiation (PAR) (Huemmrich et al., 1999).Further, we plug the EDPM into a one-dimensional Kalman Filter (1DKF) scheme to enhance its performance with assimilation of observations from the NASA's Moderate Resolution Imaging Spectroradiometer (MODIS).This paper is the first step in the effort to evaluate and validate basic capabilities of the EDPM.A companion paper (Kovalskyy and Henebry, 2011) compares performances of alternative phenological representations with EDPM results on four flux tower locations.Spatially explicit validations and comparisons have also been made, but the report of those results is currently in preparation.

AmeriFlux sites and data used for model development and testing
In order to acquire empirical knowledge of vegetation responses to events, the EDPM needs training on consistent observations of canopy dynamics and microclimatological records.The AmeriFlux network offers such data, but, not all sites in the network provide comparably coherent records.
There are many temporal gaps in the archives as well as inconsistencies in the lists of recorded attributes that vary across time and locations.Our choice of sites was driven by the need to test the EDPM on different types of herbaceous vegetation exhibiting strong seasonal and interannual variation.Moreover, we sought to verify model performance at different locations to evaluate model generality for the two principal commodity crops in the central US.
The data for model development and testing were obtained from the web resource of the AmeriFlux network (http://ameriflux.ornl.gov/).We selected the three sites at Mead, NE, for model training because these archives were the longest and most consistent.Two sites at Bondville, IL, were selected and reserved for model validation, but not every year had records of sufficient quality for testing (the deficient years are indicated by " * " in Table 1).All five sites have been used to grow either maize only or soybeans in annual rotation with maize; thus enabling the model to train on phenologies from different photosynthetic pathways (C3 for soybean and C4 for maize).Other AmeriFlux sites offered less temporal coverage for the principal variables used in model training and subsequent simulations.These variables included downwelling shortwave radiation, upwelling shortwave radiation, downwelling photosynthetically active radiation (PAR), upwelling PAR, air temperature at 2 m above ground, precipitation, and vapor pressure deficit.
The radiation flux data enabled us to characterize the vegetation canopies using TNDVI (Huemmrich et al., 1999;Wittich and Kraft, 2008).We deemed this derived canopy variable to be more consistent than the sporadic leaf area index measurements found in the site records.TNDVI was found to be linearly related to remotely sensed NDVI (Appendix C. 3, Huemmrich et al., 1999;Kovalskyy et al., 2011b).With the proximately sensed vegetation index as a variable estimated by the EDPM, the link to remotely sensed data was straightforward.To be consistent with an overpass timing of the polar orbiting Terra satellite, we calculated the TNDVI values out of instantaneous records of global shortwave radiation and PAR taken at 11:00 LT.We took advantage of the MODIS Land Products ASCII subsets available from the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC, 2009).The MODIS NDVI values and errors propagated into this vegetation index from observed reflectances were calculated using nadir bi-directional reflectance distribution function (BRDF) adjusted reflectance (NBAR) data from MODIS bands 1 and 2. Availability of records from the two flux towers in Bondville (∼0.5 km apart) enabled us to approximate the footprint of a MODIS MCD43A4 pixel (500 m) when validating the assimilation scheme.

The concept of events and their implementation
The approach taken in the development of event driven model differs from traditional approaches to phenology www.biogeosciences.net/9/141/2012/Biogeosciences, 9, 141-159, 2012 V. Kovalskyy and G. M. Henebry: The event driven phenology model and vegetation dynamics modeling.Many physics-based vegetation models use incremental additive approaches (Bonan, 2003) to construct the dynamics of vegetation properties, e.g., allocation of C to various plant compartments, including the leaves.Empirical methods based on proxies typically estimate the value of the current vegetation property without reference to previous values.In contrast, the EDPM uses a sequential approach that calculates the value for the next step in a multiplicative fashion, based on the previous value.Also, the model treats environmental conditions as temporally discrete events at the granularity of a single day rather than as constantly acting forces.
The EDPM forms a seasonal trajectory of some canopy attribute by connecting the subsequent attribute values linearly in the way that resembles a geometric progression: where A t and A t+1 are the parameter values at consecutive time steps and E t is the step-change coefficient (or slope) effecting the change from t to t + 1.Unlike a geometric sequence, the slope E t in (1) is not constant, but an exponential growth and decay depending onE t resembles a geometric progression.The multiplier E t is assumed to be dependent on current conditions of plant growth and phenological phase of plant development.Since the model treats these conditions as events, E t represents a mixed step-change slope induced by various events happening in the period t.
The mixed step-change slope is modeled as the geometric mean of individual change slopes from each event type occurring at the step t: where e kt is the change coming from event type k during t, n is the number of impacting events during t, and t is the step index.The choice of geometric mean for derivation of mixed effects is purely empirical.Both a simple product and an arithmetic mean were considered for the derivation ofE t , but the formulation (2) performed better during early model development.The unfortunate consequence of using (1) and ( 2) is a rapid growth of uncertainty.However, this approach also allows for straightforward assimilation of external observations, enabling uncertainties to be constrained within reasonable limits by application of a one-dimensional Kalman filter (Appendix C3).Another consideration is that events are limited in time and so are the uncertainties associated with events.The interaction of events may result in cancellation of effects that dampens the uncertainty.A central feature of the EDPM is an event priority queue that administers the effect hierarchy of the detected events.For instance, the impact of a heat stress is canceled if a rainfall event follows, or the impact of precipitation is ignored if it coincides with a frost event.The functioning of this queue addresses the specifics of the vegetation type and event timing (Appendix C1).
The only criterion for selection of event types is the ecological relevance for the specific vegetation type.Events in the EDPM should be connected to one or more meteorological variables so that the model can automatically detect events, as described in detail in Appendix B. They also must produce some typical fluctuation in the canopy phenology that can be traced by the procedures presented further in the Model Training section.We call these fluctuations the event traces.The EDPM uses knowledge about event traces to generate the canopy dynamics for a growing season.Organized into step-by-step change sequences (e 1 ,e 2 ,e 3, . . .e n ), actual traces from events and their combinations determine the particular step changes (E t in (1)) and thereby drive the behavior of the modeled vegetation attribute (e.g., TNDVI in this case study and potentially FaPAR or LAI).
The step-change slope e t has the next level of detail where it is subdivided into static part and variable scaling factor.Using these two parts, a general event trace is transformed into particular event trace.The static part represents a general event trace, which is a string of typical step-change slopes produced by a given event type.Each event trace contains a sequence of step-change slopes also referred to as event sensitivities (s 1 ,s 2 , . . .s n ).These sensitivities are derived empirically from flux tower observations.The use of sequences comes from the fact that the effects from events can last for more than one time step and can produce different step-change slopes at each consecutive time step.The sequences of static step-change slopes are stored in the EDPMs library to be called upon detection of events.Every event type has separate strings of sensitivities for each vegetation type during different phenophases.In this fashion the model takes into account phenophase-dependent differences of the canopy responses to events.For instance, the same magnitude of heat stress may have a markedly different effect on TNDVI dynamics during green-up than during senescence.
The static step-change slopes in event traces represent a general case and are brought closer to a particular detected event through the use of variable scaling factor or intensity.These intensities are produced by event detection procedures that scan time series of meteorological data looking for indicators of events (Appendix B).The scaling factor or intensity (i) is a normalized departure from some threshold, following Jolly et al. (2005).However, here the threshold is not based on local spatial averages; rather, it is uniform for similar vegetation types.For the events that cause a decrease in TNDVI values, the intensity values have the range of 0 to 1.For the events that cause increases (growth), the values vary from 1 to maximum positive change (maximum step-change slope discussed in the model training section and Appendix B).The intensity value of 1 constitutes no effect on dynamics of the modeled canopy property.The thresholds (lower and upper bounds) for event detections were taken from various literature sources meant for broad range of conditions for maize and soybean cultivation in the US.The sources are listed in last column of Table 2.  Sadok and Sinclair, 2009;Yazar et al., 1999 * The upper bound of heat stress with step-change slope around 0 has been found neither in observation nor in literature.Therefore the arbitrary value of 70 • C was adopted as an effective upper bound to avoid computational overflows in detection process and provide an appropriately shallow slope for intensities.
The thresholds and limits were adjusted and rounded to be uniform for both crops modeled in the study case.Even though these adjustments may result in increases of errors, they greatly simplified the division into intensities and sensitivities making the model more robust.Rewriting (2) in terms of sensitivities and intensities yields where E is the change coefficient, s is sensitivity of the land cover to event e, i is the intensity of event e, n is the number of impacting events during t, and t is the step index.It follows from (3) that the EDPM takes events as signals for change, uses intensities to scale the change, and relies on sensitivities to direct the impacts of events depending on the vegetation type and phenophase.This approach is similar to the use of indices in plant modeling (e.g., Duru et al., 2009) to simplify impacts from changes in nutrient availability and other environmental factors.

Phenophase control: driving factors
The control of phenophases is the key to how the EDPM calls up event traces and thereby generates canopy dynamics.The EDPM has two ways to control phenophase transitions: either prescribed by the user (predefined phenophase transitions dates obtained from external sources) or environmentally triggered (automatic phenophase transitions).Environmental triggering is based on accumulation of controlling variables during the growing season.Previous studies have also used accumulated values to track phenology and to estimate transition points between phenophases (Nielsen, 2002;de Beurs and Henebry, 2004;Setiyono et al., 2007).In the EDPM, this approach utilizes historical records of key controlling variables to estimate cumulative probabilities of phenophase transitions.The details of how such records were collected and deployed in the Phenophase Control Module are presented in the Sect.3.3.
The collected data made up the distributions that helped the EDPM determine the chances of phenological phase transition to occur at any given date.Initially, we assumed that the distributions of controlling variables during phase transitions can be approximated by a cumulative distribution function (CDF).Based on the CDF, the Phenophase Control procedures decide the most likely moment of phenological transitions.Taking advantage of multiple control variables, we combine the information from different sources into collective/joint probability of phenophase transition.This feature enables triggering of phenophases change (e.g., from reproductive phase to senescence) when one (or more) of the controlling variables reaches the average historical value for the given phenological transition point (PTP).We have chosen a geometric mean to combine phase transition probabilities from different sources.This choice is empirical, but it proved to approximate phenological transition points better than an arithmetic mean.Presuming the most likely point of phenophase transition to be the trigger value of 0.5, this probability must be reached conjunctively through calculation of geometric mean of cumulative probabilities from each controlling variable (see Appendix C1 for details).
For the case study presented here we used three controlling variables to demonstrate the feasibility of the approach: i Accumulated Growing Degree-Days (AGDD): where AGDD is the accumulated growing degree-days, T max t is the maximum temperature for day t, T min t is the minimum temperature for day t, and BT is the base temperature, here, 0  where AI is the accumulated insolation, and Ins total incoming solar radiation for the day t; and iii Elapsed Days (ED): where ED is reset at the beginning of each phenophase.
The starting points for the Phase Control Module to begin accumulating temperatures (4), insolation (5), and days (6) are different for each phenological transition point (PTP).The countdown for start of season is started at the first calendar day of the year.To estimate the ending date of a phenophase (green-up, reproductive phase or senescence), each accumulation restarts at the beginning of a new phenophase.Parallel to the accumulations for individual phases, the module also keeps track of controlling variables for the whole season calculating the chances for growing cycle to be over.Additionally, the PTP of the end of the last phase (senescence) finishes the growing season.The choice of time point to commence accumulation can be used to address the particularities of phenological development of various vegetation types in different geographic regions.

Phenophase control: calibration
The Phenophase Control Module required historical records of phenological timing to parameterize its procedures.We extracted the dates of the phenological transitions from in situ data, viz., the TNDVI time series derived from flux tower records.The technique we used for determining PTPs is retrospective and is based on the dynamics of derivatives from NDVI trajectories also used by Viña et al. (2004).A least squares linear split algorithm (Wu, 1993) was applied to the time series of TNDVI to confirm the PTP dates.When the difference between the two estimates was larger than five days, we took the middle point between the dates into our collection (a procedure that we needed to use a few times in this study due to missing records in data from Bondville site).Further, we collected the values of accumulated controlling variables observed up to the PTP dates for the three Mead sites.The means and standard deviations of the durations for the phenophases and the growing season durations (Table 3) were used to parameterize inverse normal CDF and derive the cumulative probabilities of phenophase change associated with each controlling variable.

Model training
The objective of the EDPM training was to obtain canopy responses to the events listed in Table 2. Using thresholds from this table, events of seven types were extracted from the flux tower data representing three positive drivers (growing degrees, adequate insolation, precipitation) and four negative drivers (frost, heat stress, vapor pressure deficit (VPD) stress, low insolation).We examined the TNDVI dynamics to collect event traces observed after specific kinds of events.We overlaid the detected events along the observed dynamics of TNDVI and iteratively derived the crop sensitivities (step-change slopes for each day after the event) following the optimization approach of Mangiarotti et al. (2008).We used the minimization of the objective function (7) that at the same time determines the variance (J ) associated with a given sensitivity (s i ): where i is intensity,s i is given canopy sensitivity from a range of considered s values (0 < s i < s max ), et denotes total number of events of one type that were used in training, n is number of events occurred in the same day as the event of interest, k is the sequential event index, and A current and A next are the consecutive observations of TNDVI.The value of s max was simply 2.5, which is more than double of maximum daily step change coefficient seen in observations from Mead, NE; the increment step for iteration of s was 0.0001 Each event type (except for growing degree-days and adequate insolation) was tuned in this manner for three phenological phases (green-up, reproduction, and senescence) of each crop separately.We used data from both rainfed and irrigated sites in Mead, NE for training.However, since rainfall and heat stress showed weaker responses on the irrigated sites (as expected), we excluded the irrigated sites from the model training only for those two event types.
The growing degree-days and adequate insolation were the main drivers for canopy growth during the green-up phase, where the EDPM works in the first of its two trajectory building tactics.Transformed into events, these factors had their event traces extracted from the records of TNDVI dynamics before the training stage.These event traces were depicted as the maximum step-change slopes attributed solely to temperature and insolation.We also had lower bounds for insolation (average daily insolation before the start of season, see Table 2) and temperature (0 • C, see Table 2) that constitute values sufficient to support crop growth.Connecting these maximum step-change slopes and slope of 1 associated with lower boundary values yielded the range of canopy responses to both growing degree-day and daily insolation.Based on the maximum slope and selected bounds, we were able to rescale actual records of air temperature and daily insolation into intensities of events (see Appendix B for additional details).
Unlike during the green-up phase, TNDVI loses its positive response to temperature and insolation during the reproductive and senescent phases.Therefore, during these phenophases, we decided to rely on average step-change slopes extracted from observed TNDVI time series during corresponding segments of each season.The average stepchange slopes for senescence and reproductive phase were triggered by elapsed days when building seasonal curves during the simulations.In this fashion the model was made to follow two different tactics of developing TNDVI trajectories: (1) building the canopy with growing degree-days, insolation, and precipitation events occurring during the greenup; and (2) sustaining the canopy with some average change rate that can also be influenced by other events occurring during reproduction and senescence.The first tactic puts the positive step-change slopes E's from the three factors into the geometric sequence (1) and, thus, it resembles the simple exponential growth model described in Thornley and Johnson (2000).For reproduction and senescence many detailed models of vegetation dynamics use a balancing of basic growth and defoliation rates (Levis et al., 2004;Duru et al., 2009;Wang et al., 2009).This balancing involves growing degree-days to determine both rates (e.g., both growth rate and defoliation are driven by thermal time).For simplicity we decided to use a single average step-change slope as the second tactic for driving the canopy dynamics subsequent to green-up.
After capturing the sensitivities of one event type, we were able to remove the effects of those events from TNDVI trajectories.Residuals were then used in training of the event traces of other event types.Since the EDPM is sequential, reversing just one step coinciding with an event required the same removal operation to be done to all TNDVI values observed after the event in question.Therefore, the removal process had to multiply all the subsequent values of TNDVI by the inverse of the particular event trace, viz., (s et i et ) −1/n .Training for events with traces lasting more than one day was carried out in the similar manner through obtaining and removing one sensitivity (s et ) at a time.The order in which event traces were identified and removed was permuted to yield the list sum of J i values in (7) for all event types.Only temporally isolated events were chosen for model training.Unfortunately, this approach was not able to separate the effects of heat stress and VPD stress on TNDVI trajectories, neither in terms of timing nor in the magnitude of step-change slopes leaving only one event type for further use.No training of frost events was possible on crops due to a lack of observed events during a growing season in the training or validation data.Three other types of disturbance events were captured for every phenophase for each crop: heat stress events, insufficient insolation events, and rainfall events.
Without prior knowledge about durations of the event effects the raining procedures provided the EDPM with variances (J in ( 7)) and event traces of standard length 14 days.In total we received 18 strings of sensitivities (9 for each crop).However, it was unreasonable to assume that all events had the same duration of their effects on canopy.Hence, durations of individual event effects were determined by limiting the sums of uncertainties introduced by events.Summing the doubled standard deviations (2 √ J) from each step in the event trace, we stopped when the sum reached 0.1 for the TNDVI (∼10% of the range during a growing season).Thus the effects of heat stress events lasted 3 days for soybean during the reproductive phase, but just 2 days for maize during the same phase.This alternative solution replaced the previous choice of Student's t-test determining the significance of obtained sensitivity slopes (s et <> 1) contrasted with no

Event AggregaƟon
Fig. 1.Workflow diagram of the Event Driven Phenological Model (EDPM) software system (participation of modules with dashed outlines is optional.)Event loading module takes unprocessed inputs from various sources, converts them in to events and aggregates the events by the day when they occur (See Appendix B for details).Event processing module takes events and builds seasonal trajectories of canopy properties and uncertainties behind them based on knowledge of phenological timing and event response patterns collected during the EDPM training (See Appendix C1 and C2 for details).Output processing module applies areal weights (portion cover) to the produced LSP if it is built for a pixel.Data assimilation module uses 1DKF scheme to correct predictions with available observations (See Appendix C3 for details).slope (s et = 1).The end of an event trace was assumed to be the step at which the t-score loses its significance.In many cases this approach produced event traces lasting one or even two weeks, thereby absorbing the noise in flux tower data.By restricting the accumulated standard deviations not to exceed 0.1, we sought to limit noise and enhance signal in the trained traces.Details of error propagation in the EDPM are presented in the Appendix C2.

Setup of the pilot study
This pilot study was planned to assess overall performance of the EDPM and to evaluate the functioning of the modules (Fig. 1).The workflow diagram in Fig. 1 illustrates that the model can be deployed in multiple regimes producing results organized and conditioned in different fashions to appear in Sect. 4 of this paper and in its companion (Kovalskyy and Henebry, 2011).
The model was supplied with training and independent forcings to validate its basic capabilities on observations.Four goals were set to demonstrate success of the event driven approach.The first goal was to examine the ability of the automatic phenophase control module to capture phenological transitions.Special attention was paid to the implications of EDPMs shortcomings in estimating PTPs automatically.The second goal was to illustrate the ability of the model to capture details of canopy dynamics for the entire growing season and for each phenophase separately.The third goal was to demonstrate the generality of the EDPM using forcings from a distant location that were not used in the training phase.The fourth goal was to show how the model could work within the assimilation scheme using MODIS observations (Appendix C3).
In our analyses we were looking at differences with observations and correlation between modeled and observed TNDVI values.Therefore, root mean square error (RMSE) and coefficient of determination (r 2 ) were used as measures of model accuracy collected during exhaustive test-runs of the EDPM working in different regimes and modes.Model outcomes produced in prognostic mode were evaluated on forcings from different years and locations using two regimes of controlling phenophase transition dates: automatic estimation or predefined timing of PTPs.(The values for the predefined PTPs are shown in the Table 4 and Appendix Table A1.)Model testing with predefined PTPs aimed to show the ability of the EDPM to mimic canopy responses to various events isolated from errors in estimation of phenophase timing that can be eliminated with more training data.In diagnostic mode the EDPM was run within the assimilation scheme (Appendix C3) on the location and forcings that were not used for the model training.In addition to collecting RMSE and r 2 in diagnostic mode, we preserved the record and later analyzed the dynamics of propagated errors..

Tests of the EDPM in orognostic mode
Our primary concern in the performance of the EDPM was the accuracy of the automatically estimated phenophase transition dates.Using parameters in Table 3, the automated phenological control module performed with inconsistent deviations from the references.The differences between estimated and reference dates for maize were often reaching the level of two standard deviations.The predicted dates of Start of Season were missed by 10 days at most for maize and by 4 days for soybeans, which was comparable to retrospective results obtained by Brown and de Beurs ( 2008) and by Zhang et al. (2009).The automatic phase control module had a more difficulty estimating the durations of reproductive phase and senescence in maize (2-3 week differences in Table 4).The relative differences still show that the automatic PTPs were somewhat better in soybeans than in maize.Yet with just three seasons of results, we should not yet conclude that the automatic estimation of phenophase transitions failed for maize since the length of entire season was better captured for maize than soybean (maximum difference of 15 versus 22 days).
Figure 2 reveals minor differences between simulated and observed TNDVI dynamics for both crops.The automatic phase control module overestimated, underestimated, and even hit the timing of phases (Fig. 2a).The automatic PTP estimation performed comparably on both sets of inputsrecords used for model training and independent data reserved for validation.However, the forcings from Bondville sites (independent) had many runs of missing days in it.Having no inputs to accumulate the system was preset for overestimation of phase durations because the triggers of phase transitions do not account for missing records of control variables.Therefore, of the three independent runs, only a few underestimations in phenological timing were found; whereas, on the training site the automatic phenological control module produced early and late estimates in 50/50 proportion.The implications of error in PTPs become apparent after the end of greenup, where observations and predictions split their trajectories (Fig. 2a).At that point, the RMSE crosses the 0.1 mark almost for all runs with auto-matic phenophase control (Table 5).The errors remained high until the end of season unless the start of senescence was missed as well.In the case of missed starts of both reproductive phase and senescence, the RMSE grew beyond 0.3 for the last phenophase, resulting in seasonal of 0.2.The magnitude of such errors constitute almost one third of TNDVI range observed during growing seasons.These results reinforce the importance of accurate estimation of key phenophase transition dates.Figure 2 and Table 5 also show that the TNDVI dynamics at the two locations were captured with different levels of precision.The missing data at Bondville (independent) yielded anticipated systematic errors.Meanwhile, the differences with observations at Mead showed no consistent bias.The coefficients of determination for tests with automatic PTP estimation dropped to 0.5 for some seasons in Bondville.Yet, with predefined PTPs, the r 2 stayed above 0.8.At Mead (training) site, higher levels of r 2 (> 0.8) were achieved in both PTP regimes (Table 5).The levels of RMSE in tests with predefined phenological dates were the same on both locations.Similar RMSE values point to an adequate capturing of fine temporal details and indicate potential for regional generalization of the model.

Test of assimilation scheme
The data assimilation scheme (Appendix C3) was tested at the independent site (Bondville) during 2005, 2006, and 2007 growing seasons.The experiment involved mixing of observations and modeled TNDVI values in a hypothetical 500 m MODIS pixel half covered by maize and half by soybean, consistent with NASS data for the area.The EDPM with automatic PTP matched followed the 2005 observation well (Fig. 3a), but the performance dropped in during 2006 and 2007 due to missing forcing from flux towers (Fig. 3b  and 3c).The divergence from observations grew mostly during the reproductive phase, but remained low during greenup and senescence.Nevertheless, the RMSE (0.082), coefficient of determination (0.9) and a slope close to unity for the fitted linear model (Fig. 3d) were all evident of close agreement between modeled and observed TNDVI.The propagated errors, however, told a different story.Due to underlying formulations (1) and (3), each error value in the modeled trajectory carries the uncertainties from previous estimates (Appendix C2). Figure 4 illustrates that propagated errors (quantified by standard deviationσ ) grew up to 0.5 by the end of season in either maize or soybean.Hence, despite the good agreement with observations, the uncertainty of the prognosis increased quickly.Both soybean and maize had average uncertainty level greater than 0.3 (Fig. 4a and b).The level of propagated errors in the two-crop mixture dropped to less than 0.35 at the end of season (Fig. 4c) with the overall average of just above 0. through periodic MODIS updates creating the saw-toothed pattern (Fig. 4d).Reduction of average propagated errors to below 0.1 values (0.072) helped reaching the desired level of uncertainty planned during the model design in Sect.3.4.Overall, the data assimilation scheme of the EDPM increased its accuracy (two times smaller RMSE over prognostic with automatic PTPs) and substantially lowered uncertainties (five times smaller mean seasonal propagated errors).

Potential of the EDPM
The case study illustrated the good performance and diverse prediction capabilities of the EPDM while giving material and motivation for improvements to the internal model logic.
The model was able to estimate not only the TNDVI values, but also the uncertainties of those estimates.The EDPM predicted phenophase transitions achieved an accuracy that can be useful in both basic and applied research.The model did not require regular or frequent observation updates in its assimilation scheme.This robustness could prove useful, since land surface observations from space are often compromised by cloud cover (Roy et al., 2006).Despite the efforts of compositing procedures to compensate for the missing data, producing regular observations remains a problem that can be mitigated by the EDPM and its 1DKF assimilation scheme.Even though the growth rate of propagated errors was quite high, weekly or biweekly prognoses or interpolations could be reasonably accurate.This study presented the results from just a few point locations, but it uncovered many potentials of the EDPM.The performance evaluation in regional spatially explicit application of the model is in progress (Kovalskyy at al., 2011a).The EDPM can well facilitate two-way interactions between the lower layers of the troposphere and the vegetated land surface needed for effective land-atmosphere coupling in biogeophysical models (Richardson at al., 2011).The event driven approach is suited to model fluctuations of canopy dynamics induced by sudden factors such as insect outbreaks, canopy damage by hailfall, etc.The interface of events could also be adapted to factors that influence vegetation before and after the growing season (e.g., off-season precipitation, snowpack depth and duration).The model could also be used in yield predictions similar to how the NDVI has been used in previous work (e.g., Bastiaanssen and Ali, 2003;Doraiswamy et al., 2004;Prasad et al., 2006;Dente et al., 2008).

Shortcomings of the EDPM
At this relatively early stage in model development, we can identify some disadvantages of the event driven approach.First was the need for well-organized and consistent training data.The training was done on gap-filled records from the Mead sites, but having the data available in that condition is rare.At Bondville sites, the data records reach into early 1990's, but the gaps and inconsistency in formatting made only a few years suitable for use as forcings for the EDPM.In some years, gaps in data prevented the automatic PTP estimation module from starting the season at all; during other years, the system simply could not reach senescence due to the data gaps.Lower quality records led to event traces picking up noise during training.This propagated noise drove the outcomes into very wrong directions.Thus, care must be exercised when using noisy or temporally inconsistent observations during model training.One solution is to screen www.biogeosciences.net/9/141/2012/Biogeosciences, 9, 141-159, 2012 for outlier slopes produced by sudden changes of equipment, illumination conditions (Huemmrich et al., 1999), or other transient occurrences.
The event driven approach does not offer any mechanism to propagate uncertainties in the forcings into model output.Based on thresholds, event detection procedures simply cut off the unnecessary information.Although these errors could have some impacts on predicted canopy dynamics, currently they are presumed to be negligible.Instead during the training phase, the EDPM acquires and then propagates internal (sensitivity related) errors.The rates at which the internal errors grow can pose a major disadvantage.Due to this setback the EDPM could not offer the level of uncertainty low enough to be used in longer term forecasting (one month and beyond) without updates from observations.Rapid growth of propagated errors in the EDPM calls for further work to refine the performance of the model and improve its predictive capabilities.
As we mentioned earlier, the impact of false starts and delays arising from the automatic PTP estimation requires improvement.Depending on the magnitude of the discrepancy between the estimated and the observed transition point, the errors in predicted TNDVI values can grow substantially.The graph of these errors would resemble the uprising stairs with the number of steps equal to the number of phenophases.Current underperformance of automated phenological control module can be related to methods used for extraction of PTPs for training.Since the actual phenology records were not available we had to use retrospective techniques to capture growing season metrics.These "observed" dates could only approximate the actual phenological transitions (Zhang et al., 2009) determined by daylength for soybean (Setiyono et al., 2007), and by temperature for maize (Tojo Soler et al., 2005).Also, the choice of the geometric mean for combining phenophase change probabilities yields more conservative estimates for transition points and this may result in overestimation of phase durations.A key limitation was that the available flux tower data offered few seasons of observations to derive reliable estimates of the first and second moments of PTP distributions that are needed to test these distributions in multiple CDFs.However, as the body of flux tower data grows, it should be possible to improve the accuracy of predicted PTPs.

Directions for improvement and further development
The EDPM should not be considered a closed or completed modeling system.The EDPM framework is flexible and potentially can accommodate additional phenophases or even double and triple cropped growing seasons.The principal constraint is only the availability of high quality observations to calibrate phenophase transitions.Also, the model is not restricted to a single indicator of canopy status.In fact, we suspect that the TNDVI is limiting the model capabilities due to its suppressed response to changes in LAI above a certain level (Wittich and Kraft, 2008).The model realizations of soybean and maize crops suggest that the EDPM may be implemented for other kinds of vegetation.Although agricultural management practices are relevant to the EDPM, a lack of data about the timing of sowing, fertilizing, irrigating, and harvesting prevented their incorporation into the experiment.We continue investigating the issues encountered in this study along with expansion of the EDPM capabilities.The next milestone for the EDPM is the test in a coupling scheme for estimation of some land surface flux.This test has been carried out and reported in the companion paper (Kovalskyy and Henebry, 2011) where the model was used to parameterize the VegET scheme for estimation of actual evapotranspiration (Senay, 2008).With the ability to simulate changing vegetation conditions during the growing season, the EDPM was used to simulated dynamics of phenology driven factor (K cp ) through vegetation index relationships.The interactive approach of the EDPM improved the performance of the VegET that usually uses static climatologies of NDVI for derivation of K cp .

Conclusions
This pilot study investigated a new concept in modeling land surface phenologies.Results of this study showed the event driven concept to be viable, flexible, and yet precise tool for predicting temporal dynamics of TNDVI.The trajectories of TNDVI dynamics produced by the model matched well with observation producing high r 2 (0.5 and more) and RMSE as low as 0.08.The EDPM was designed with an abstraction level that allows quantifying impacts of extreme events of both natural and anthropogenic origin; thus, the potential for model application is broad.Yet, the need remains to reduce uncertainty of prognoses and increase accuracy in estimation of phenological dates.After the success of the first model test, we can anticipate coupling the EDPM to regional land surface models to parameterize carbon fluxes or evapotranspiration.
Finally, the EDPM software system has a high level of technological readiness for the tasks set before modern phenological models.Capable of data assimilation, error tracking, and performing stand-alone predictions of canopy states, the model has most of its training and dynamics building procedures automated.All modules are written in C++ with the use of standard libraries and external database server and can be compiled on Windows and Linux platforms.The spatial extension module is being finalized now to enable the EDPM system to work with raster data sources.Other data assimilation schemes are being considered for the model to work with.Model training on different vegetation types will be pursued in the near future.day.It applies restrictions to avoid event conflicts, e.g., heat stress effect is superseded by sufficient precipitation event.In the current implementation the restrictions are organized as custom functions for each crop and each phenophase that assign the intensities of irrelevant/conflicting events to a value of 1, i.e., no effect.The EC communicates with Phenological Phase Controller to know the phenophase, and the time (measured in ED, AGDD, and AI) elapsed since the start of that phenophase.At each step Phenological Phase Controller accumulates controlling variables (4, 5, 6) and uses inverse normal CDF [incdf ()] to calculate cumulative probabilities of phenophase transitions according to each controlling variable c.All probabilities of phase transitions are combined into one with geometric mean as follows: Where p t is cumulative probability of transition to the next phase, c i -value of a controlling variable at current model step, i -sequential number of controlling variables, N total number of controlling variables (3 in the study case).When queried, the Phenological Phase Controller calculates p t and if it is greater than 0.5 initiates the phenophase transition, but otherwise reports the name of the current phenophase.
Based on information about ongoing phenophase the EC module then receives Traces and Errors for all active and relevant events from the corresponding library and later passes those Traces and Errors to the Phenology Building and Er-ror Propagation sub-modules.Events are kept active in the Effect Controller for period of their relevance (one step or more).Expired events (for which the period of relevance/duration is over) are removed from the queue.
Phenology Building sub-module uses intensities and sensitivities in (3) to derive step-change coefficient E and then (1) to predict the next value of TNDVI.

C2 Error propagation
Error propagation in the EDPM model follows the general scheme where variance of the function's outcome equals the sum of variances associated with each participating variable multiplied by squared partial derivative of the function by that variable (Goldenstein, 2004).Being a function of event combinations in time t, the single step slope in (1) derived via (3) will have the errors propagated in the following manner: where σ 2 is the variance, Et is a notation relating σ 2 to daily change coefficient, s is the sensitivity of a given vegetation type to the event, i is the intensity of the event, n is the total number of occurring events at time t, e is the sequential event index and σ 2 Se denotes variance associated with particular sensitivity.Note that (C1) can be further simplified analytically to have σ 2 Et as a sum of σ 2 Se resulting in greater error (worst case).The simplification is based on the fact that any σ 2 Se is less than 1 while s *i is always close to 1 and the coefficient for every σ 2 Se depends mostly on 1/n.Hence, the worst case is n = 1 where σ 2 Se is not attenuated.This logic was used in the actual error propagation code to decrease computation time.Accordingly, the model propagates error in each step as follows: Using MODIS observations in the assimilation scheme, we propagated errors associated with reflectances of MODIS bands 1 and 2 involved in derivation of the NDVI as follows: where ρ N and ρ R are the reflectances of near infrared and red band and σ 2 N and σ 2 R are the associated variances obtained by Roy et al. (2005).

C3 Data assimilation scheme
In this study, we set a rather simple objective for the assimilation scheme; namely, to update the EDPM with remotely sensed observations of modeled vegetation properties.The one dimensional Kalman Filter (1DKF) scheme provides a ready solution since (1) in the main manuscript and (C2) can directly serve as the first and second steps of the 1DKF (Goldenstein, 2004).A simple mandatory requirement for the assimilation was for the NDVI observations to be linearly related to the TNDVI.We found such a linear relationship existed between the MODIS NDVI and Tower NDVI on our data conforming with other findings (Huemmrich et al., 1999;Kovalskyy et al., 2011b).At Mead site, ordinary least squares regression obtained a slope of 1.113 with variance 0.012, with significance of p < 0.01 and a coefficient of determination of 0.69 (Fig. C2).The regression parameter coefficients slightly varied between crops but did not produce a significant difference in Student's t tests.The differences in regression slopes between locations may be due to reasons discussed in detail elsewhere (Kovalskyy et al., 2011b).
Along with the linear model errors Eq. (C4), we propagated the error from NDVI calculation based on Nadir BRDF Adjusted Reflectances Eq. (C5): The Kalman Gain (K) is thus calculated as: An update by a MODIS NDVI observation modifies the model state as in Eq. (C6) and the model variance as in Eq. (C7): where K is the current value of Kalman gain, O is the observation or NDVI from MODIS,σ 2 mTNDVI is the variance of the relationship between the NDVI and the TNDVI, U is the notation of updated value.
We need to stress here that this assimilation scheme is designed to correct EDPM predictions through the occasional observation updates rather than to smooth the outcome.During the test runs, the scheme used (1) from the main manuscript and (C2) in daily steps and (C5-C7) only upon availability of MODIS observations (once in 8 or more days).Similar approach was used by Walker et al. (2001).

Fig. 4 .
Fig. 4. Temporal dynamics of propagated errors from EDPM simulations (A) maize; (B) soybean, (C) mixed crop and in the onedimensional Kalman Filter (1DKF) assimilation scheme (D) for 2005 growing season in Bondville.Dashed lines are the seasonal averages of daily propagated errors shown as solid grey lines.

Table 1 .
Descriptions of flux tower sites used in the study.
* indicates insufficient records.** indicates sites reserved for validation.

Table 2 .
Thresholds and bounds used for deriving events.

Table 3 .
Durations of phenological phases and seasons for different crops measured in control variables (AGDD = accumulated growing degree-days, AI = accumulated insolation, ED = elapsed days).

Table 4 .
Phenological transition dates estimated by automatic phenological control module compared with observed PTPs.Other phase controlling parameters are reported in Appendix TableA1.

Table A1 .
Values of phenophase controlling variables observed at automatically estimated phenological transition points (PTPs) compared with the corresponding values observed at predefined PTPs considered as the reference.AGDD = accumulated growing degree-days, AI = accumulated insolation, percent difference = 100*(automatic-predefined)/predefined.