Introduction
Forest ecosystems absorb and store a large fraction of anthropogenic carbon
dioxide (CO2) emissions (Le Quéré et al., 2015; Pan et
al., 2011) and supply wood products to a growing human population (Shvidenko
et al., 2005). Therefore, predicting future carbon sequestration and timber
supply is critical for adapting forest management practices to future
environmental conditions and for using forests to help reduce atmospheric CO2 concentrations. The key sources of information for
developing these predictions are results from global change ecosystem
manipulation experiments, observations of forest dynamics across
environmental gradients, and process-based ecosystem models. The challenge
is integrating these three sources into a common framework for creating
probabilistic predictions that provide information on both the expected
future state of the forest and the probability distribution of those future
states.
Data assimilation (DA), or data–model fusion, is an increasingly used
framework for integrating ecosystem observations into ecosystem models
(Luo et al., 2011; Niu et al., 2014; Williams et al.,
2005). DA integrates observations with ecosystem models through statistical,
often Bayesian, methods that can generate probability distributions for
ecosystem model parameters and initial states. DA allows for the explicit
accounting of observational uncertainty (Keenan et al., 2011), the
incorporation of multiple types of observations with different timescales
of collection (MacBean et al., 2016; Richardson et al., 2010), and
the representation of prior knowledge through informed parameter prior
distributions or specific relationships among parameters (Bloom and
Williams, 2015).
Using DA to parameterize ecosystem models with observations from multiple
locations that leverage ecosystem manipulation experiments and environmental
gradients will allow for predictions to be consistent with the rich history
of global change research in forest ecosystems. Ecosystem manipulation
experiments provide a controlled environment in which data collected can be
used to describe how forests acclimate and operate under altered
environmental conditions (Medlyn et al., 2015) and can potentially allow
for the optimization of model parameters associated with the altered
environmental factor in the experiment. Furthermore, the assimilation of
data from ecosystem manipulation experiments may increase parameter
identifiability (reducing equifinality; Luo et al., 2009), where two
parameters have compensating controls on the same processes, by isolating
the response to a manipulated driver. Observations that span environmental
gradients include measures of forest ecosystem stocks and fluxes across a
range of climatic conditions, nutrient availabilities, and soil water
dynamics. These studies leverage time and space to quantify the sensitivity
of forest dynamics to environmental variation. However, covariation among
environmental factors can make it difficult to separate the responses to
individual factors. Overall, assimilating observations from a
region that includes environmental gradients and manipulation experiments is
a useful extension of prior DA research focused on DA at a single site with
multiple types of observations (Keenan et al., 2012;
Richardson et al., 2010; Weng and Luo, 2011).
Southeastern US planted pine forests are ideal ecosystems for exploring
the application of DA to carbon cycle and forest production predictions.
These ecosystems are dominated by loblolly pine (Pinus taeda L.), thus allowing for a
single parameter set to be applicable to a large region containing many soil
types and climatic gradients. Loblolly pine represents more than one half of
the standing pine volume in the southern United States (11.7 million ha) and
is by far the single most commercially important forest tree species for the
region, with more than 1 billion seedlings planted annually (Fox et al.,
2007; McKeand et al., 2003). There is also a rich history of experimental
research located across the region focused on global change factors that
have included nutrient addition (Albaugh
et al., 2016; Carlson et al., 2014; Raymond et al., 2016), water exclusion
(Bartkowiak et al., 2015; Tang et
al., 2004; Ward et al., 2015; Will et al., 2015), and water addition
experiments (Albaugh et al., 2004; Allen et al., 2005;
Samuelson et al., 2008). The region also includes a multiyear ecosystem
CO2 enrichment study (McCarthy et al., 2010). Furthermore, many
of these experiments are multi-factor with water exclusion by nutrient
addition (Will et al., 2015), water addition by nutrient addition
(Albaugh et al., 2004; Allen et al., 2005; Samuelson et al.,
2008), and CO2 by nutrient addition treatments
(McCarthy et al., 2010; Oren et al., 2001). Beyond
experimental treatments, southeastern US loblolly pine ecosystems include
at least two eddy-covariance sites with high-frequency measurements of C and
water fluxes along with biometric observations over many years
(Noormets et al., 2010; Novick et al., 2015) and sites with
multiyear sap flow data (Ewers et al., 2001;
Gonzalez-Benecke and Martin, 2010; Phillips and Oren, 2001). Finally, there
are studies that include plots that span the regional environmental
gradients and extend back to the 1980s (Burkhart et al., 1985). Overall,
the multi-decadal availability of observations of C stocks (or biomass),
leaf area index (LAI), C fluxes, water fluxes, and vegetation dynamics in
plots with experimental manipulation and plots across environmental
gradients is well suited to potentially constrain model parameters and
predictions of how carbon cycling responds to environmental change.
Regional observational data streams used in data assimilation.

| Data stream | Measurement frequency | Measurement or estimation technique | Uncertainty | Stream ID for Table 3 |
| --- | --- | --- | --- | --- |
| Foliage biomass (pine) | Annual or less | Allometric relationship | Based on propagating the allometric model uncertainty in Gonzalez-Benecke et al. (2014). Varied by observation. | 1 |
| Foliage biomass (hardwood) | Annual or less | Allometric relationship | Assumed zero | 2 |
| Stem biomass (pine) | Annual or less | Allometric relationship | Based on propagating the allometric model uncertainty in Gonzalez-Benecke et al. (2014). Varied by observation. | 3 |
| Stem biomass (hardwood) | Annual or less | Allometric relationship | Assumed zero | 4 |
| Coarse root biomass (combined) | Annual or less | Allometric relationship | Assumed zero∗ | 5 |
| Fine root biomass (combined) | Annual or less | Allometric relationship | SD: 10 % of observation | 6 |
| Foliage biomass production (combined) | Annual | Litterfall traps | SD: 10 % of observation | 7 |
| Fine root biomass production (combined) | Annual | Mini-rhizotrons | SD: 10 % of observation | 8 |
| Pine stem density | Annual or less | Counting individuals | 1 % (assumed small) | 9 |
| Leaf area index (pine) | Monthly to annual | Litter traps or LI 2000 | SD: 10 % of observation | 10 |
| Leaf area index (hardwood) | Monthly to annual | Litter traps or LI 2000 | SD: 10 % of observation | 11 |
| Leaf area index (combined) | Only used if not separated into pine and hardwood | Litter traps or LI 2000 | SD: 10 % of observation | 12 |
| Gross ecosystem production | Monthly | Modeled from flux eddy-covariance net ecosystem exchange | SD: 10 % of observation | 13 |
| Evapotranspiration | Monthly | Eddy covariance | SD: 10 % of observation | 14 |

∗ The relatively low number of observations prevented convergence when using
the observational uncertainty model, so observational uncertainty was assumed
to be zero to allow convergence.
Using loblolly pine plantations across the southeastern US as a focal
application, our objectives were to (1) develop and evaluate a new DA
approach that integrates diverse data from multiple locations and
experimental treatments with an ecosystem model to estimate the probability
distribution of model parameters, (2) examine how the predictive capacity and
optimized parameters differ between an assimilation approach that only uses
environmental gradients and an assimilation approach that uses both
environmental gradients and ecosystem manipulations, and (3) demonstrate the
capacity of the DA approach to predict, with uncertainty, regional forest
dynamics by simulating how forest productivity responds to drought, nutrient
fertilization, and elevated atmospheric CO2 across the southeastern
US.
Map of loblolly pine distribution, plot locations used in data
assimilation, and the experiment type associated with each plot. The
control-only treatments were plots without any associated experimental
treatment or flux measurements. Fertilized treatments were plots with nutrient
additions. CO2 treatments were plots with free-air concentration enrichment
treatments. The flux treatments were plots with eddy-covariance measurements
of ecosystem-scale carbon and water exchange. The water treatments included
throughfall exclusion and irrigation experiments.
Methods
Observations
We used 13 different data streams from 294 plots at 187 unique
locations spread across the native range of loblolly pine trees to constrain
model parameters (Table 1; Fig. 1). The data streams covered the period
between 1981 and 2015. The Forest Modeling Research Cooperative (FMRC)
Thinning Study provides the largest number of plots that span the region
(Burkhart et al., 1985). In this study, we only used the control plots
that were not thinned. The Forest Productivity Cooperative (FPC) Region-wide
18 (RW18) study included control and nutrient addition plots
(134.4 kg ha$^{-1}$ N + 13.44 kg ha$^{-1}$ P applied biannually) that span the
region (Albaugh et al., 2015). The Pine Integrated Network: Education, Mitigation, and
Adaptation Project (PINEMAP) study included four locations dispersed across the region, each with a replicated factorial
experiment with control, nutrient fertilization (224 kg ha$^{-1}$ N + 27 kg ha$^{-1}$ P + micronutrients, applied once at project initiation), throughfall
reduction (30 % reduction), and fertilization by throughfall reduction treatments
(Will et al., 2015). The Southeast Tree Research and Education
Site (SETRES) study was located at a single location
and included replicated control, irrigation (∼ 650 mm of added
water per year), nutrient fertilization (∼ 100 kg N ha$^{-1}$ + 17 kg P ha$^{-1}$
with micronutrients, applied annually, with the absolute
amount depending on foliar nutrient ratios), and fertilization by irrigation
treatments (Albaugh et al., 2004). The Waycross study was a single
site with a non-replicated fertilization treatment. The annual application
of nutrient fertilization was focused on satisfying the nutrient demand by
the trees and resulted in one of the most productive stands in the region
(Bryars et al., 2013). These five studies included data streams of
stand stem biomass (defined as the sum of stem wood, stem bark, and branches)
and live stem density. Waycross and SETRES included LAI measurements from
litterfall traps (Waycross) or estimates from LI-COR LAI-2000 (SETRES).
SETRES also included fine root and coarse root measurements. In the PINEMAP,
SETRES, and RW18 studies we only used foliage biomass estimates from the
control plots. We excluded the foliage biomass estimates from the treatment
plots because they were derived from allometric models that may not have
captured changes in allometry due to the experimental treatment. We did use
LAI measurements from both control and treatment plots where available (SETRES).
We also included observations from the Duke Free-Air Carbon Enrichment (FACE) study where the atmospheric
CO2 was increased by 200 ppm above ambient concentrations. Based on the
data presented in McCarthy et al. (2010), the study included six
control plots, four CO2-fumigated rings (including the unfertilized
half of the prototype), two nitrogen fertilization treatments (115 kg N ha$^{-1}$ yr$^{-1}$), and one CO2 by nitrogen
addition treatment (fertilized half of prototype). The Duke FACE study
included observations of stem biomass (loblolly pine and hardwood), coarse
root biomass (loblolly pine and hardwood), fine root biomass (combined
loblolly pine and hardwood), stem density (loblolly pine only), leaf
turnover (combined loblolly pine and hardwood), fine root production
(combined loblolly pine and hardwood), and monthly LAI (loblolly pine and
hardwood).
A diagram of the monthly time-step 3-PG model used in this study.
The stocks are represented by the boxes and the fluxes by the arrows. An
influence of a stock on a flux that is not directly related to that stock is
represented by the dotted lines. The environmental influences on a flux are described using italics. A description of the model can be found in the
Supplement.
Finally, we included two AmeriFlux sites with eddy-covariance towers in
loblolly pine stands. The US-DK3 site was located in the same forest as the
Duke FACE site described above (Novick et al., 2015). The US-NC2 site
was located in coastal North Carolina (Noormets et al., 2010). We
used monthly gross ecosystem production (GEP; modeled gross primary
productivity from net ecosystem exchange measured at an eddy-covariance
tower) and evapotranspiration (ET) estimates from the sites. The monthly
GEP and ET were gap-filled by the site principal investigator. The GEP was a flux-partitioned
product created by the site principal investigator. The biometric data from the US-DK3 site were assumed to be the same as the first control ring. The biometric data from
the US-NC2 site included observations of stem biomass (loblolly pine and
hardwood), coarse root biomass (loblolly pine and hardwood), fine root
biomass (combined loblolly pine and hardwood), stem density (loblolly pine
only), leaf turnover (combined loblolly pine and hardwood), and fine root
production (combined loblolly pine and hardwood).
Ecosystem model
We used a modified version of the Physiological Principles Predicting Growth
(3-PG) model to simulate vegetation dynamics in loblolly pine stands
(Bryars et al., 2013; Gonzalez-Benecke et
al., 2016; Landsberg and Waring, 1997). 3-PG is a stand-level vegetation
model that runs at a monthly time step and includes vegetation carbon
dynamics and a simple soil water bucket model (Fig. 2). While a complete
description of the 3-PG model and our modifications can be found in the
Supplement Sect. 1, the key concept for interpreting the
results is that gross primary productivity (GPP) was simulated using a
light-use efficiency approach where the absorbed photosynthetically active
radiation (APAR) was converted to carbon based on a quantum yield
(Supplement Sect. 1.1). Quantum yield was simulated using a
parameterized maximum quantum yield (alpha) that was modified by
environmental conditions including atmospheric CO2, available soil
water (ASW), and soil fertility (Supplement Sect. 1.2–1.3). The
ASW and soil fertility modifiers were values between 0 and 1, while the
atmospheric CO2 modifier had a value of 1 at 350 ppm (thus values
greater than 1 at higher CO2 concentrations).
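The light-use efficiency calculation described above can be sketched as follows. This is a minimal illustration, not the model code: the function name, argument names, and example values are hypothetical, and the purely multiplicative combination of modifiers is an assumption based on the standard 3-PG formulation. The equations actually used are given in the Supplement.

```python
def gross_primary_production(apar, alpha, co2_mod, asw_mod, fertility_mod, other_mod=1.0):
    """Return monthly GPP (mol C m-2 month-1) from APAR (mol PAR m-2 month-1).

    alpha is the maximum quantum yield (mol C mol PAR-1). The modifiers are
    assumed to combine multiplicatively (an assumption, see lead-in).
    """
    # The ASW and fertility modifiers are described as values between 0 and 1.
    for name, value in (("asw_mod", asw_mod), ("fertility_mod", fertility_mod)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    # The CO2 modifier equals 1 at 350 ppm and exceeds 1 at higher concentrations.
    if co2_mod <= 0.0:
        raise ValueError("co2_mod must be positive")
    quantum_yield = alpha * co2_mod * asw_mod * fertility_mod * other_mod
    return apar * quantum_yield


# Hypothetical month: moderate water stress on a site of intermediate fertility.
gpp_ambient = gross_primary_production(600.0, 0.04, 1.0, 0.8, 0.5)
# Same month with a hypothetical CO2 modifier above 1 (elevated CO2).
gpp_elevated = gross_primary_production(600.0, 0.04, 1.2, 0.8, 0.5)
```

Because the CO2 modifier is the only term allowed to exceed 1, it is the only modifier in this sketch that can raise the realized quantum yield above alpha.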
Elevated CO2 modified tree physiology by increasing quantum yield,
based on an increasing but saturating relationship with atmospheric CO2
(Supplement Sect. 1.2). Based on initial results from the data
assimilation, we also added a function where the allocation to foliage
relative to stem biomass decreased as atmospheric CO2 increased
(Supplement Sect. 1.2). ASW and quantum yield were positively
related through a logistic relationship between relative ASW and the quantum
yield modifier, where relative ASW was the ratio of simulated ASW to a
plot-level maximum ASW. Soil fertility and quantum yield were proportionally
related, where quantum yield was scaled by an estimate of relative
stand-level fertility (a value of 1 was the maximum fertility). The
fertility modifier (or soil fertility rating, FR) was constant throughout a simulation of a plot and
was either based on site characteristics or directly optimized as a
stand-level parameter (Supplement Sect. 1.3). For plots with
nutrient fertilization, FR was a directly optimized parameter or set to 1,
depending on the level of fertilization (see below). For unfertilized plots,
we used site index (SI), a measure of the height of a stand at a specified
age (25 years), to estimate FR. This approach is in keeping with previous
efforts (Gonzalez-Benecke et al., 2016; Subedi et al.,
2015); however, SI does not solely represent the nutrient availability of an
ecosystem. For a given climate, SI captures differences in soil fertility,
with a lower SI corresponding to a site with lower fertility, but regional
variation in SI also includes the influence of climate on growth rates, which
is already accounted for by the other environmental modifiers in the 3-PG
model. When a climate term is not used in the empirical FR model, FR is
relative to the highest SI in the region, which does not occur in the
northern extent of the region even in fertilized plots due to climatic
constraints. Thus, we also included the historical (1970–2011) 35-year mean
annual temperature (MAT) as an additional predictor, resulting in an
empirical relationship that predicted FR as an increasing, but saturating,
function of SI within areas of similar long-term temperature. For our
application of the 3-PG model using DA, we removed the previously simulated
dependence of total root allocation on FR (Bryars
et al., 2013; Gonzalez-Benecke et al., 2016) because we separated coarse and
fine roots. Other environmental conditions influenced GPP, including
temperature, frost days, and vapor pressure deficit (VPD). A description of
these modifiers can be found in Supplement Sect. 1.2.
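To make the relationship between relative ASW and the quantum yield modifier concrete, the sketch below uses the soil water modifier from the standard 3-PG formulation, with arguments named after the SWconst and SWpower parameters. Whether the modified model uses exactly this form is an assumption (the definitive equations are in the Supplement), and the example values are hypothetical.

```python
def soil_water_modifier(asw, max_asw, sw_const, sw_power):
    """Return a modifier in (0, 1] that declines as the soil dries.

    Assumed form (standard 3-PG): 1 / (1 + (deficit / sw_const) ** sw_power),
    where deficit = 1 - ASW / maximum ASW.
    """
    if max_asw <= 0.0 or sw_const <= 0.0:
        raise ValueError("max_asw and sw_const must be positive")
    # Relative ASW is the ratio of simulated ASW to the plot-level maximum.
    relative_asw = min(max(asw / max_asw, 0.0), 1.0)
    deficit = 1.0 - relative_asw
    return 1.0 / (1.0 + (deficit / sw_const) ** sw_power)


# Hypothetical plot with a 200 mm maximum ASW.
full_soil = soil_water_modifier(200.0, 200.0, sw_const=0.5, sw_power=9.0)
half_soil = soil_water_modifier(100.0, 200.0, sw_const=0.5, sw_power=9.0)
dry_soil = soil_water_modifier(20.0, 200.0, sw_const=0.5, sw_power=9.0)
```

In this form the modifier equals 0.5 when the moisture ratio deficit equals sw_const, and sw_power controls how abruptly the modifier drops around that point.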
Key climatic and stand characteristic inputs to the regional 3-PG
simulations: (a) mean annual temperature (1979–2011) as a summary of the
gradient in monthly temperature inputs used in simulations, (b) maximum
available soil water for the top 1.5 m of soil from SSURGO, (c) mean
annual precipitation (1979–2011) as a summary of the gradient in monthly
precipitation inputs used in simulations, and (d) site index. The area shown
is the natural range of loblolly pine (Pinus taeda L.).
Each month, net primary production (a parameterized and constant proportion
of GPP) was allocated to foliage, stem (stem wood, stem bark, and branches),
coarse roots, and fine roots (Supplement Sect. 1.4). Differing
from previous applications of 3-PG to loblolly pine ecosystems, we modified
the model to simulate fine roots and coarse roots separately. 3-PG also
simulated simple population dynamics by including stem density as a state
variable. Stem density and stem biomass pools were reduced by both
density-dependent mortality, based on the concept of self-thinning
(Landsberg and Waring, 1997), and density-independent mortality, a
new modification where a constant proportion of individuals die each month
(Supplement Sect. 1.5). Finally, we added a simple model of
hardwood understory vegetation to enable the assimilation of GEP and ET
observations from eddy-covariance tower studies with significant
understories (Supplement Sect. 1.7).
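The allocation step can be illustrated with a small sketch. It assumes that the four allocation fractions sum to 1 and that they follow from three ratios named after the pFS, pRF, and pCRS parameters (foliage to stem, fine roots to foliage, and coarse roots to stem). That derivation is an assumption made for illustration, the function names are hypothetical, and the model's actual allocation equations are in the Supplement.

```python
def allocation_fractions(p_fs, p_rf, p_crs):
    """Return the fraction of NPP sent to each pool, given allocation ratios.

    With stem fraction s, foliage = p_fs * s, fine roots = p_rf * foliage,
    and coarse roots = p_crs * s. Requiring the fractions to sum to 1 gives s.
    """
    stem = 1.0 / (1.0 + p_fs + p_rf * p_fs + p_crs)
    return {
        "foliage": p_fs * stem,
        "stem": stem,
        "fine_root": p_rf * p_fs * stem,
        "coarse_root": p_crs * stem,
    }


def allocate_npp(gpp, npp_to_gpp, p_fs, p_rf, p_crs):
    """Split monthly NPP (a constant proportion of GPP) among the four pools."""
    npp = npp_to_gpp * gpp
    fractions = allocation_fractions(p_fs, p_rf, p_crs)
    return {pool: fraction * npp for pool, fraction in fractions.items()}


# Hypothetical values chosen only to show the bookkeeping.
fractions = allocation_fractions(p_fs=0.5, p_rf=0.4, p_crs=0.25)
monthly_growth = allocate_npp(gpp=10.0, npp_to_gpp=0.5, p_fs=0.5, p_rf=0.4, p_crs=0.25)
```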
The water cycle was a simple bucket model with transpiration predicted using
a Penman–Monteith approach (Bryars et al.,
2013; Gonzalez-Benecke et al., 2016; Landsberg and Waring,
1997) (Supplement Sect. 1.6). The canopy conductance used in the
Penman–Monteith subroutine was modified by environmental conditions. The
modifiers included the same ASW and VPD modifier as used in the GPP
calculation. Maximum canopy conductance occurred when simulated LAI exceeded
a parameterized value of LAI (LAIgcx). Evaporation was equal to the
precipitation intercepted by the canopy. Runoff occurred when the ASW
exceeded a plot-specific maximum ASW. As in prior applications of 3-PG, ASW
was not allowed to take a value below a minimum ASW, resulting in an implicit
irrigation in very dry conditions. This assumption may cause the model to be
less sensitive to low ASW, but the optimized parameterization may compensate for this.
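The bucket bookkeeping described above can be sketched as a single monthly update. This is an illustrative simplification: the function and argument names are hypothetical, transpiration is passed in rather than computed with the Penman–Monteith equation, and interception is represented by a fixed fraction of precipitation, which is an assumption.

```python
def update_asw(asw, precipitation, interception_fraction, transpiration,
               max_asw, min_asw=0.0):
    """Advance available soil water (mm) by one month.

    Returns (new_asw, runoff, evaporation, implicit_irrigation), all in mm.
    """
    # Evaporation equals the precipitation intercepted by the canopy.
    evaporation = interception_fraction * precipitation
    throughfall = precipitation - evaporation
    unconstrained = asw + throughfall - transpiration
    # Water above the plot-specific maximum ASW leaves as runoff.
    runoff = max(unconstrained - max_asw, 0.0)
    new_asw = min(unconstrained, max_asw)
    # ASW cannot fall below the minimum; the shortfall acts as implicit irrigation.
    implicit_irrigation = max(min_asw - new_asw, 0.0)
    new_asw = max(new_asw, min_asw)
    return new_asw, runoff, evaporation, implicit_irrigation


# Hypothetical wet month: the bucket fills and the excess runs off.
wet = update_asw(100.0, 120.0, 0.25, 40.0, max_asw=130.0)
# Hypothetical dry month: demand exceeds supply, so ASW is held at the minimum.
dry = update_asw(10.0, 0.0, 0.25, 30.0, max_asw=130.0)
```

Tracking the implicit irrigation term separately makes it easy to see how often, and by how much, the minimum-ASW assumption adds water that the site did not receive.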
The 3-PG model used in this study simulated the monthly change in 11
state variables per plot: four stocks for loblolly pines, five stocks for
understory hardwoods, loblolly pine stem density (stems ha$^{-1}$), and ASW.
The key fluxes that were used for DA included monthly GEP, monthly ET,
annual root turnover, and annual foliage turnover. In total, 46 parameters
were required by 3-PG. The model required mean daily maximum temperature,
mean daily minimum temperature, mean daily PAR, total frost days per month,
total rain per month, annual atmospheric CO2, and latitude. Each plot
also required maximum ASW, SI, MAT, and the initial condition of the 11
state variables as model inputs (Fig. 3).
We used the first observation at the plot as the initial conditions for the
loblolly pine vegetation states (foliage biomass, stem biomass, coarse root
biomass, fine root biomass, and stem number). When observations of coarse
root biomass and fine root biomass were not available, these stocks were
initialized as a mean region-wide proportion of the observed stem biomass.
However, the value of initial root biomass in plots without observations was
not important because root biomass did not influence any other functions in
the model. The hardwood understory stocks at US-DK3 and US-NC2 were also
initialized using the first set of observations. Initial fine root and
coarse root biomass were distributed between loblolly pine and hardwoods based on
their relative contribution to total initial foliage biomass. The
initialized ASW was assumed to be equal to the maximum ASW because most
plots were initialized in winter months when plant demand for water was
minimal. The maximum ASW in each plot was extracted from the Soil Survey Geographic Database (SSURGO) soils
dataset (Soil Survey Staff, 2013). The value we used corresponded to the
maximum ASW for the top 1.5 m of the soil. We assumed that the minimum ASW
was zero. Because we focused on a region-wide optimization, we used
region-wide 4 km estimates of observed monthly meteorology as inputs and to calculate the 35-year MAT for each plot (Abatzoglou, 2013). SI was
based on height measurements at age 25 in each plot or calculated by
combining observations of height at younger ages with an empirical model
(Dieguez-Aranda et al., 2006).
We simulated ecosystem manipulation experiments in the 3-PG model by
altering the environmental modifiers or by modifying the environmental
inputs. Nutrient addition experiments were simulated by setting FR equal to
1 for the studies that applied nutrients at regular intervals to remove
nutrient deficiencies (RW18, SETRES, Waycross). FR was directly estimated
for fertilized plots in two of the studies either because nutrients were
only added once at the beginning of the study (PINEMAP), thus potentially
not removing nutrient limitation, or because nitrogen was the only element added
(Duke FACE), thus allowing the potential for nutrient limitation by other
elements. For these plots, we also assumed that the FR of the fertilized
plot was equal to or larger than the control plot. Throughfall exclusion
experiments were simulated by decreasing the throughfall by 30 % in the
treatment plots. The SETRES irrigation experiments were simulated by adding
650 mm to ASW between April and October. CO2 enrichment experiments
were simulated by setting the atmospheric CO2 input equal to the
treatment mean from the elevated CO2 rings (570 ppm). One plot (US-NC2)
included a thinning treatment during the period of observation. We simulated
the thinning by specifying a decrease in the stem count that matched the
proportion removed at the site, with the biomass of each tree equivalent to
the average of trees in the plot.
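Two of the treatment adjustments described above, throughfall reduction and thinning, are simple enough to sketch directly. The function names and example values are hypothetical. The thinning function assumes that every biomass pool is reduced by the same proportion as the stem count, which follows from treating each removed tree as an average tree.

```python
def reduce_throughfall(throughfall_mm, reduction=0.30):
    """Return throughfall (mm) after an exclusion treatment removes a fixed share."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must lie in [0, 1]")
    return throughfall_mm * (1.0 - reduction)


def thin_stand(stem_density, biomass_pools, proportion_removed):
    """Remove a proportion of stems, each carrying the plot-average tree biomass."""
    if not 0.0 <= proportion_removed < 1.0:
        raise ValueError("proportion_removed must lie in [0, 1)")
    remaining = 1.0 - proportion_removed
    # Average-sized trees mean every pool shrinks by the same proportion.
    thinned_pools = {pool: mass * remaining for pool, mass in biomass_pools.items()}
    return stem_density * remaining, thinned_pools


# Hypothetical stand: 1000 stems ha-1 with a quarter of the stems removed.
density, pools = thin_stand(1000.0, {"stem": 80.0, "foliage": 6.0}, 0.25)
treated_throughfall = reduce_throughfall(100.0)
```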
Data assimilation method
We used a hierarchical Bayesian framework to estimate the posterior
distributions of parameters, latent states of stocks and fluxes, and process
uncertainty parameters. The latent states represented a value of the stock
or flux before uncertainty was added through measurement. The approach was
as follows.
Consider a stock or flux ($m$) for a single plot ($p$) at time $t$ ($q_{p,m,t}$).
$q_{p,m,t}$ is influenced by the processes represented in the 3-PG model and
a normally distributed model process error term,
$$q_{p,m,t} \sim N\left(f(\boldsymbol{\theta}, FR_p),\ \sigma_m^2\right), \tag{1}$$
where $f(\cdot)$ is the 3-PG model prediction, $\boldsymbol{\theta}$
is a vector of parameters that are optimized, $FR_p$ is
the site fertility, and $\sigma_m$ is the model process error. Not shown
are the vector of parameters that were not optimized (Supplemental
Table S1),
the plot ASW, an array of climate inputs, and the initial conditions
because these were assumed known and not estimated in the hierarchical
model. The process error assumed that the error linearly scales with the
magnitude of the prediction:
$$\sigma_m^2 = \gamma_m + \rho_m f(\boldsymbol{\theta}, FR_p). \tag{2}$$
While the structure of the Bayesian model allowed for all data streams to
have process uncertainty that scales with the prediction, in this
application we only allowed stem biomass, GEP, and ET process uncertainty to
scale because they had large variation across space (stem biomass) and
through time (i.e., there should be lower process uncertainty in the winter
when GEP is lower). For the other data streams, the linear scaling term was
removed by fixing $\rho_m$ at 0.
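The scaling of process uncertainty with the prediction can be illustrated numerically. The sketch below assumes the relationship given above, in which the process error variance is the sum of a constant term and a term proportional to the prediction. The function name and the example values are hypothetical.

```python
import math


def process_variance(prediction, gamma, rho):
    """Return the process error variance for one prediction.

    gamma is the constant term; rho scales the variance with the prediction.
    Setting rho to 0 removes the linear scaling.
    """
    variance = gamma + rho * prediction
    if variance <= 0.0:
        raise ValueError("process variance must be positive")
    return variance


# Hypothetical monthly GEP predictions in winter (low) and summer (high).
winter_sd = math.sqrt(process_variance(50.0, gamma=1.0, rho=0.1))
summer_sd = math.sqrt(process_variance(300.0, gamma=1.0, rho=0.1))
# A data stream without scaling has the same variance for every prediction.
fixed_variance = process_variance(300.0, gamma=1.0, rho=0.0)
```

With these illustrative values the process standard deviation is smaller in winter than in summer, which is the behavior the scaling term is meant to capture for GEP.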
$FR_p$ did not have an explicit probability distribution. Rather, the
probability density was evaluated as 1 if the plot was not fertilized, thus
causing $FR_p$ to be estimated from SI and MAT (Supplement
Eq. 15), or if it was a fertilized plot and had an $FR_p$ equal to or
higher than that of its non-fertilized control plot. The probability density
was evaluated as 0 if the estimated $FR_p$ in a fertilized plot was less than
the $FR_p$ in the control plot or if $FR_p$ was not contained in the
interval between 0 and 1.
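$$
FR_p \sim
\begin{cases}
1 & \text{if non-fertilized, } FR_p \ge 0 \text{, and } FR_p \le 1 \\
1 & \text{if } FR_p = 1 \text{ and fertilization levels are assumed to remove nutrient deficiencies} \\
0 & \text{if } FR_p < 1 \text{ and fertilization levels are assumed to remove nutrient deficiencies} \\
1 & \text{if fertilized but levels are not assumed to remove deficiencies and } FR_p \ge FR \text{ of control plot} \\
0 & \text{if fertilized but levels are not assumed to remove deficiencies and } FR_p < FR \text{ of control plot} \\
0 & \text{if } FR_p < 0 \text{ or } FR_p > 1
\end{cases}
\tag{3}
$$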
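The piecewise rule for the fertility rating can be written as a small indicator function. This is a sketch of the logic as stated above, with hypothetical function and argument names. It is not the authors' Fortran implementation.

```python
def fr_prior_density(fr, fertilized, removes_deficiency=False, control_fr=None):
    """Return 1.0 if a fertility rating is admissible for a plot, else 0.0."""
    # FR must always lie in the interval between 0 and 1.
    if fr < 0.0 or fr > 1.0:
        return 0.0
    if not fertilized:
        return 1.0
    if removes_deficiency:
        # Fertilization assumed to remove nutrient deficiencies: FR fixed at 1.
        return 1.0 if fr == 1.0 else 0.0
    if control_fr is None:
        raise ValueError("control_fr is required for directly estimated FR")
    # Otherwise FR is estimated but may not fall below the control plot's FR.
    return 1.0 if fr >= control_fr else 0.0


# Hypothetical fertilized plot whose control plot has FR = 0.6.
accepted = fr_prior_density(0.7, fertilized=True, control_fr=0.6)
rejected = fr_prior_density(0.5, fertilized=True, control_fr=0.6)
```

In a sampler, a proposal with a density of 0 is rejected outright, so these rules act as hard constraints rather than as a smooth prior.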
Our model included the effect of observational errors for measurements of
stocks and fluxes. For a single stock or flux for a plot at time $t$ there
was an observation ($y_{p,m,t}$). The normally distributed observation error
model was
$$y_{p,m,t} \sim N\left(q_{p,m,t},\ \tau_{p,m,t}^2\right), \tag{4}$$
where $\tau_{p,m,t}^2$ represented the measurement error of the observed
state or flux. By including the observational error model, $q_{p,m,t}$ represented the latent, or unobserved, stock or flux. The variance was
unique to each observation because it was represented as a proportion of the
observed value. The $\tau_{p,m,t}^2$ was assumed known (Table 1) and not
estimated in the hierarchical model.
The hierarchical model required prior distributions for all optimized
parameters, including the parameters for the 3-PG model ($\boldsymbol{\theta}$), $FR_p$, and
the process error parameters. The prior distributions for $\boldsymbol{\theta}$ ($p(\boldsymbol{\theta})$) are
specified in Table 3. Some parameters were informed by previous research in
loblolly pine ecosystems, while other parameters were “uninformative” with
flat distributions that had broad, but physically reasonable, bounds. The
prior distributions for the process error parameters were non-informative
and had a uniform distribution with upper and lower bounds that spanned the
range of reasonable error terms.
$$\gamma_m \sim U(0.001,\ 100), \tag{5}$$
$$\rho_m \sim U(0,\ 10). \tag{6}$$
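By combining the data, process, and prior models, our joint posterior that
includes all 13 data streams, plots, months with observations, and
fitted parameters was
$$
\begin{aligned}
p(\boldsymbol{\theta}, \boldsymbol{\rho}, \boldsymbol{\gamma}, \boldsymbol{q} \mid \boldsymbol{y}, \boldsymbol{\tau}, \text{priors}) \propto{} & \prod_{p=1}^{P}\prod_{m=1}^{M}\prod_{t=1}^{T} N\left(q_{p,m,t} \mid f(\boldsymbol{\theta}, FR_p),\ \gamma_m + \rho_m f(\boldsymbol{\theta}, FR_p)\right) \\
& \times \prod_{p=1}^{P}\prod_{m=1}^{M}\prod_{t=1}^{T} N\left(y_{p,m,t} \mid q_{p,m,t},\ \tau_{p,m,t}^2\right) \\
& \times \prod_{p=1}^{P} p(FR_p) \prod_{f=1}^{F} p(\theta_f) \prod_{m=1}^{M} p(\gamma_m) \prod_{m=1}^{M} p(\rho_m),
\end{aligned}
\tag{7}
$$
where bolded components represent vectors, $P$ is the total number of plots,
$M$ is the total number of data streams, $T$ is the total months with
observations, and $F$ is the total number of 3-PG parameters that are
optimized.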
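As a rough guide to how the terms of the joint posterior combine, the sketch below evaluates an unnormalized log posterior for one data stream in one plot. It is a simplification under stated assumptions: the model predictions are supplied as a precomputed list instead of being produced by 3-PG, the fertility rating term is taken to be admissible and omitted, the prior on the 3-PG parameters is passed in as a single number, and all function names are hypothetical.

```python
import math


def normal_logpdf(x, mean, variance):
    """Log density of a normal distribution parameterized by its variance."""
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    return -0.5 * (math.log(2.0 * math.pi * variance) + (x - mean) ** 2 / variance)


def uniform_logpdf(x, lower, upper):
    """Log density of a uniform prior; -inf outside the bounds."""
    return -math.log(upper - lower) if lower <= x <= upper else -math.inf


def log_posterior(latent, observed, tau, predictions, gamma, rho, log_prior_theta=0.0):
    """Unnormalized log posterior for one data stream in one plot."""
    # Priors on the process error parameters use the bounds given in the text.
    total = log_prior_theta + uniform_logpdf(gamma, 0.001, 100.0) + uniform_logpdf(rho, 0.0, 10.0)
    if total == -math.inf:
        return total
    for q, y, t, f in zip(latent, observed, tau, predictions):
        # Process model: latent state around the prediction, scaled variance.
        total += normal_logpdf(q, f, gamma + rho * f)
        # Data model: observation around the latent state, known SD tau.
        total += normal_logpdf(y, q, t ** 2)
    return total


# Hypothetical two-month example; tau is set to 10 % of each observation.
observed = [10.5, 11.5]
tau = [0.1 * abs(y) for y in observed]
lp_near = log_posterior([10.4, 11.6], observed, tau, [9.0, 13.0], gamma=1.0, rho=0.1)
lp_far = log_posterior([20.0, 2.0], observed, tau, [9.0, 13.0], gamma=1.0, rho=0.1)
```

Working on the log scale turns the products over plots, data streams, and months into sums, which avoids numerical underflow when many observations are combined.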
We numerically estimated the joint posterior distribution using the Markov
chain Monte Carlo Metropolis–Hastings (MCMC-MH) algorithm (Zobitz et al.,
2011). This approach has been widely used to approximate parameter
distributions in ecosystem DA research (Fox et al., 2009; Trudinger et al.,
2007; Williams et al., 2005; Zobitz et al., 2011). Briefly, the algorithm
proposed new values for the model parameters, uncertainty parameters, latent
states, and FR. The proposed values were generated using a random draw from a
normal distribution with a mean equal to the previously accepted value for
that parameter and standard deviation equal to the parameter-specific jumping
size. The ratio of the proposed calculation of Eq. (7) to the previously
accepted calculation of Eq. (7) was used to determine if the proposed
parameter was accepted. If the ratio was greater than or equal to 1, the
proposed value was always accepted. If the ratio was less than 1, a random
number between 0 and 1 was drawn and the proposed value was accepted if the
ratio was greater than the random number. This allowed less probable
parameter sets to be accepted, thus sampling the posterior distribution. We
adapted the jump size for each parameter to keep the acceptance
rate between 22 and 43 % (Ziehn et al., 2012),
adjusting the jump size whenever the acceptance rate for a parameter fell outside
that range. All MCMC-MH chains were run for 30 million
iterations with the first 15 million iterations discarded as the burn-in.
Four chains were run and tested for convergence using the Gelman–Rubin
convergence criterion, where a value for the criterion less than 1.1
indicated an acceptable level of convergence. We sampled every 1000th
parameter set in the final 15 million iterations of the MCMC-MH chain and used
this thinned chain in the analysis described below. The 3-PG model and
MCMC-MH algorithm were programmed in Fortran 90 and used OpenMP to
parallelize the simulation of each plot within an iteration of the MCMC-MH
algorithm.
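The sampling and convergence steps described above can be sketched in a few dozen lines. This is an illustrative toy, not the Fortran implementation: it updates one parameter at a time, adapts the jump sizes only during burn-in, uses multiplicative adjustment factors of 0.8 and 1.2, and discards the first half of each chain. All of those choices, and the function names, are assumptions. The target here is any user-supplied log density.

```python
import math
import random


def metropolis_hastings(log_target, start, jump, n_iter, rng, adapt_every=100):
    """Random-walk Metropolis-Hastings; returns samples after a 50 % burn-in."""
    current, jump = list(start), list(jump)
    current_lp = log_target(current)
    accepted = [0] * len(current)
    burn_in = n_iter // 2
    samples = []
    for it in range(1, n_iter + 1):
        for i in range(len(current)):
            # Propose from a normal centered on the last accepted value.
            proposal = list(current)
            proposal[i] = rng.gauss(current[i], jump[i])
            proposal_lp = log_target(proposal)
            log_ratio = proposal_lp - current_lp
            # Accept if the ratio is >= 1, else with probability equal to the ratio.
            if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
                current, current_lp = proposal, proposal_lp
                accepted[i] += 1
        if it % adapt_every == 0:
            if it <= burn_in:
                for i in range(len(current)):
                    rate = accepted[i] / adapt_every
                    if rate < 0.22:
                        jump[i] *= 0.8   # too few acceptances: smaller jumps
                    elif rate > 0.43:
                        jump[i] *= 1.2   # too many acceptances: larger jumps
            accepted = [0] * len(current)
        samples.append(list(current))
    return samples[burn_in:]


def gelman_rubin(chains):
    """Gelman-Rubin statistic for equal-length chains of one scalar parameter."""
    m, n = len(chains), len(chains[0])
    means = [sum(chain) / n for chain in chains]
    grand_mean = sum(means) / m
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    within = sum(
        sum((x - mu) ** 2 for x in chain) / (n - 1) for chain, mu in zip(chains, means)
    ) / m
    pooled_variance = (n - 1) / n * within + between / n
    return math.sqrt(pooled_variance / within)
```

A typical use would run several chains from dispersed starting values, compute `gelman_rubin` for each parameter, and treat values below 1.1 as acceptable before thinning the chains for analysis.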
Descriptions of the studies used in data assimilation.

| Study name | Number of locations | Number of plots per site | Experimental treatments (plots) | Data streams (Table 2) | Measurement years | Measurement stand ages (years) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FMRC$^a$ thinning study | 163 | 1 | None | 1, 3, 9 | 1981–2003 | 8–30 | Burkhart et al. (1985) |
| FPC$^b$ Region-wide 18 | 18 | 2 | Nutrient addition | 1, 3, 9 | 2011–2014 | 12–21 | Albaugh et al. (2015) |
| PINEMAP$^c$ | 4 | 16 | Nutrient addition, 30 % throughfall, nutrient × throughfall | 1, 3, 9 | 2011–2015 | 3–13 | Will et al. (2015) |
| Waycross | 1 | 2 | Nutrient addition | 3, 9, 10 | 1991–2010 | 4–23 | Bryars et al. (2013) |
| SETRES$^d$ | 1 | 16 | Nutrient addition, irrigation, nutrient × irrigation | 1, 3, 5, 6, 9, 10 | 1991–2006 | 8–23 | Albaugh et al. (2004) |
| Duke FACE$^e$ and US-DK3 flux | 1 | 12 | CO2, nutrient addition, CO2 × nutrient addition | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14 | 1996–2004 | 13–22 | McCarthy et al. (2010); Novick et al. (2015) |
| NC2 flux | 1 | 1 | None | 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14 | 2005–2014 | 12–22 | Noormets et al. (2010) |
| Total | 187 | 294 | | | 1981–2014 | 4–30 | |

$^a$ Forest Modeling Research Cooperative. $^b$ Forest Productivity
Cooperative. $^c$ Pine Integrated Network: Education, Mitigation, and Adaptation Project. $^d$ Southeast Tree Research and Education
Site. $^e$ Free-Air Carbon Enrichment.
The prior distributions of all 3-PG model parameters optimized using data assimilation. NPP: net primary production.

| Parameter | Parameter description | Units | Prior distribution | Prior parameters | Reference for prior (see footnote) |
| --- | --- | --- | --- | --- | --- |
| Allocation and structure | | | | | |
| pFS2 | Ratio of foliage to stem allocation at stem diameter: 2 cm | – | Uniform | Min: 0.08; Max: 1.00 | Uninformed |
| pFS20 | Ratio of foliage to stem allocation at stem diameter: 20 cm | – | Uniform | Min: 0.10; Max: 1.00 | Uninformed |
| pRF | Ratio of fine roots to foliage allocation | – | Uniform | Min: 0.05; Max: 2.00 | Uninformed |
| pCRS | Ratio of coarse roots to stem allocation | – | Uniform | Min: 0.15; Max: 0.35 | 1 |
| SLA0 | Specific leaf area at stand age 0 | m$^2$ kg$^{-1}$ | Normal | Mean: 5.53; SD: 0.44 | 2 |
| SLA1 | Specific leaf area for mature aged stands | m$^2$ kg$^{-1}$ | Normal | Mean: 3.58; SD: 0.11 | 2 |
| tSLA | Age at which specific leaf area is 0.5 (SLA0 + SLA1) | Years | Normal | Mean: 5.97; SD: 2.15 | 2 |
| fCpFS700 | Proportional decrease in allocation to foliage between 350 and 700 ppm CO2 | – | Uniform | Min: 0.50; Max: 1.00 | Uninformed |
| StemConst | Constant in stem mass vs. diameter relationship | – | Normal | Mean: 0.022; SD: 0.005 | 3 |
| StemPower | Power in stem mass vs. diameter relationship | – | Normal | Mean: 2.77; SD: 0.2 | 3 |
| Canopy photosynthesis, autotrophic respiration, and transpiration | | | | | |
| alpha | Canopy quantum efficiency (pines) | mol C mol PAR$^{-1}$ | Uniform | Min: 0.02; Max: 0.06 | Uninformed |
| y | Ratio NPP / GPP | – | Uniform | Min: 0.30; Max: 0.65 | 4 |
| MaxCond | Maximum canopy conductance | m s$^{-1}$ | Uniform | Min: 0.005; Max: 0.03 | 2 |
| LAIgcx | Canopy LAI for maximum canopy conductance | – | Uniform | Min: 2; Max: 5 | 2, 5, 6 |
| Environmental modifiers of photosynthesis and transpiration | | | | | |
| kF | Reduction rate of production per °C below zero | – | Normal | Mean: 0.18; SD: 0.016 | 2 |
| Tmin | Minimum monthly mean temperature for photosynthesis | °C | Normal | Mean: 4.0; SD: 2.0 | 2, 5, 6 |
| Topt | Optimum monthly mean temperature for photosynthesis | °C | Normal | Mean: 25.0; SD: 2.0 | 2, 5, 6 |
| Tmax | Maximum monthly mean temperature for photosynthesis | °C | Normal | Mean: 38.0; SD: 2.0 | 2, 5, 6 |
| SWconst | Moisture ratio deficit when downregulation is 0.5 | – | Uniform | Min: 0.01; Max: 1.8 | Uninformed |
| SWpower | Power of moisture ratio deficit | – | Uniform | Min: 1; Max: 13 | Uninformed |
| CoeffCond | Defines stomatal response to VPD | mbar$^{-1}$ | Normal | Mean: 0.041; SD: 0.003 | 2 |
| fCalpha700 | Proportional increase in canopy quantum efficiency between 350 and 700 ppm CO2 | – | Uniform | Min: 1.00; Max: 1.8 | Uninformed |
| MaxAge | Maximum stand age used to compute relative age | Years | Uniform | Min: 16; Max: 200 | Uninformed |
| nAge | Power of relative age in the age modifier | – | Uniform | Min: 0.2; Max: 4.0 | Uninformed |
| rAge | Relative age to where age modifier was 0.5 | – | Uniform | Min: 0.01; Max: 3.00 | Uninformed |
| FR1 | Fertility rating parameter 1 (mean annual temperature coefficient) | – | Uniform | Min: 0.0; Max: 1.0 | Uninformed |
| FR2 | Fertility rating parameter 2 (site index age 25 coefficient) | – | Uniform | Min: 0.0; Max: 1.0 | Uninformed |
| Mortality | | | | | |
| wSx1000 | Maximum stem mass per tree at 1000 trees ha$^{-1}$ | kg tree$^{-1}$ | Normal | Mean: 235; SD: 25 | 2, 5, 6 |
ThinPower
Power in self-thinning law
–
Uniform
Min: 1.0 Max: 2.5
2, 5, 6
mS
Fraction of mean stem biomass per tree on dying trees
–
Uniform
Min: 0.1 Max: 1.0
Uninformed
Rttover
Average monthly root turnoverrate
month-1
Uniform
Min: 0.017 Max: 0.042
7
MortRate
Density-independent mortalityrate (pines)
month-1
Uniform
Min: 0.0002 Max: 0.004
Uninformed
Understory hardwoods
alpha_h
Canopy quantum efficiency(understory hardwoods)
mol C mol PAR-1
Uniform
Min: 0.005 Max: 0.07
Uninformed
pFS_h
Ratio of foliage to stem parti-oning (understory hardwoods)
–
Uniform
Min: 0.2 Max: 3.0
Uninformed
pR_h
Ratio of foliage to fine roots(understory hardwoods)
–
Uniform
Min: 0.05 Max: 2
Uninformed
SLA_h
Specific leaf area (understoryhardwoods)
m2 kg-1
Normal
mean: 16 SD: 3.8
8
fCalpha700_h
Proportional increase in canopy quantum efficiency between350 and 700 ppm CO2 (understory hardwood)
–
Uniform
Min: 1.00 Max: 2.5
Uninformed
1: Albaugh et al., 2005. 2: Gonzalez-Benecke et
al., 2016. 3: Gonzalez-Benecke et al., 2014. 4: DeLucia et al., 2007. 5: Bryars et al.,
2013. 6: Subedi et al., 2015. 7: Matamala et al.,
2003. 8: LeBauer et al., 2010. Uninformed priors had large,
ecologically reasonable bounds.
Description of the different data assimilation approaches used.

| Simulation name | Treatments included in assimilation | Number of plots |
|---|---|---|
| Base | All plots and experiments in the region were used simultaneously. Includes unique pCRS, wSx1000, and ThinPower parameters for plots in the Duke FACE study. | 294 |
| NoExp | Same as Base assimilation but excluding all plots with experimental manipulations. Includes control plots that are part of experimental studies. | 208 |
| NoDkPars | Same as Base assimilation but without unique pCRS, wSx1000, and ThinPower parameters for plots in the Duke FACE and US-DK3 studies. | 294 |
Data assimilation evaluation
Using the observations, model, and hierarchical Bayesian method described
above, we assimilated both the non-manipulated and manipulated plots (Base
assimilation; Table 4). We assessed model performance first by calculating
the RMSE and bias of stem biomass predictions (the most common data stream).
In the evaluation, we used only the most recent observed value for each plot to
maximize the time between initialization and validation. Second, we assessed
the predictive capacity by comparing model predictions to data not used in
the parameter optimization in a cross-validation study. In this evaluation,
we repeated the Base assimilation without 160 FMRC thinning study plots
(Table 2), predicted the 160 plots using the median parameter values, and
calculated the RMSE and bias of stem biomass for this independent set of plots.
Because holding out all 160 plots from a single assimilation did not
generate a converged chain, we divided the 160 plots into four unique sets
of 40 plots and repeated the assimilation for each set. Finally, we compared
the predicted responses to experimental manipulation to the observed
responses. We focused the comparison on the percentage difference in stem
biomass between the control and treatment plots. We used a paired t test to
test for differences between the predicted and observed responses within an
experimental type (irrigated, drought, nutrient addition, and elevated
CO2). We combined the single and multi-factor treatments for analysis.
For the analysis of the nutrient addition studies, we only used plots where
FR was assumed to be 1 so that we were able to simulate the treatments
without requiring the optimization of a site-specific FR parameter.
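The evaluation statistics and the four-way hold-out described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, the bias is assumed to be the mean prediction error expressed as a percentage of the mean observation, and the assignment of plots to hold-out sets is assumed to be random.

```python
import math
import random


def rmse(predicted, observed):
    """Root mean square error of paired predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def percent_bias(predicted, observed):
    """Mean error as a percentage of the mean observation (assumed definition)."""
    return 100.0 * (sum(predicted) - sum(observed)) / sum(observed)


def holdout_sets(plot_ids, n_sets=4, seed=0):
    """Split plots into n_sets disjoint, equally sized hold-out sets.

    Random assignment is an assumption; the text only states that the
    160 plots were divided into four unique sets of 40.
    """
    ids = list(plot_ids)
    random.Random(seed).shuffle(ids)
    size = len(ids) // n_sets
    return [ids[i * size:(i + 1) * size] for i in range(n_sets)]


if __name__ == "__main__":
    sets = holdout_sets(range(160))
    print([len(s) for s in sets])
```

Each hold-out set would then be excluded from one repeat of the assimilation and predicted with the resulting median parameters before the statistics are pooled.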
During preliminary analysis, we found that the Base assimilation predicted
lower stem biomass than observed in the elevated CO2 plots in the Duke
FACE study. Further analysis investigating the cause of the bias in the
CO2 plots showed that three parameters (wSx1000, ThinPower, and pCRS)
were required to be unique to the Duke FACE study in order to reduce the
bias. Therefore, the Base assimilation included unique wSx1000, ThinPower,
and pCRS parameters for all plots in the Duke FACE and
US-DK3 studies. To highlight the need for the site-specific parameters, we
repeated the Base assimilation approach without the three additional
parameters for the Duke studies (NoDkPars assimilation).
The optimized posterior medians, the range of the 99 % quantile intervals of the
posterior distributions, and the prior ranges. The prior range is the 99 % quantile range for
normally distributed priors or the range between the lower and upper bounds for
uniform priors. C.I.: credible interval.

| Parameter | Posterior median | Posterior 99 % C.I. range | Prior range | Posterior/prior range |
|---|---|---|---|---|
| Allocation and structure | | | | Parameter group mean: 0.38 |
| pFS2 | 0.58 | 0.55–0.61 | 0.08–1.00 | 0.06 |
| pFS20 | 0.57 | 0.55–0.59 | 0.10–1.00 | 0.05 |
| pR | 0.11 | 0.07–0.15 | 0.05–2.00 | 0.04 |
| pCRS | 0.26 | 0.25–0.27 | 0.15–0.35 | 0.11 |
| pCRS (Duke) | 0.21 | 0.18–0.23 | 0.15–0.35 | 0.20 |
| SLA0 | 8.44 | 7.67–9.25 | 4.4–6.66 | 0.70 |
| SLA1 | 2.84 | 2.72–2.96 | 3.59–4.16 | 0.43 |
| tSLA | 4.13 | 3.88–4.41 | 0.43–11.51 | 0.05 |
| fCpFS700 | 0.74 | 0.60–0.90 | 0.50–1.00 | 0.60 |
| StemConst | 0.022 | 0.009–0.035 | 0.009–0.035 | 1.00 |
| StemPower | 2.78 | 2.29–3.27 | 2.25–3.29 | 0.95 |
| Canopy photosynthesis, autotrophic respiration, and transpiration | | | | Parameter group mean: 0.14 |
| alpha | 0.029 | 0.026–0.031 | 0.02–0.06 | 0.14 |
| y | 0.50 | 0.47–0.53 | 0.30–0.65 | 0.15 |
| MaxCond | 0.011 | 0.01–0.012 | 0.005–0.03 | 0.09 |
| LAIgcx | 2.2 | 2.0–2.48 | 2.0–5.0 | 0.16 |
| Environmental modifiers of photosynthesis and transpiration | | | | Parameter group mean: 0.61 |
| kF | 0.16 | 0.12–0.2 | 0.14–0.22 | 1.04 |
| Tmin | -5.56 | -8.88 to -2.69 | -1.15 to 9.15 | 0.60 |
| Topt | 23.42 | 21.1–26.31 | 19.85–30.15 | 0.51 |
| Tmax | 39.56 | 34.71–44.39 | 32.85–43.15 | 0.94 |
| SWconst | 1.09 | 0.91–1.56 | 0.01–1.8 | 0.36 |
| SWpower | 8.86 | 3.39–12.98 | 1.00–13.00 | 0.80 |
| CoeffCond | 0.036 | 0.029–0.043 | 0.034–0.048 | 0.91 |
| fCalpha700 | 1.33 | 1.18–1.52 | 1.0–1.80 | 0.43 |
| MaxAge | 151.5 | 54.4–199.6 | 16.0–200.0 | 0.79 |
| nAge | 3.35 | 1.77–3.99 | 1.00–4.00 | 0.74 |
| rAge | 2.25 | 0.81–2.99 | 0.01–3.00 | 0.73 |
| FR1 | 0.073 | 0.061–0.086 | 0.00–1.00 | 0.03 |
| FR2 | 0.17 | 0.15–0.19 | 0.0–1.0 | 0.04 |
| Mortality | | | | Parameter group mean: 0.37 |
| wSx1000 | 176.9 | 169.6–184.4 | 165.6–294.4 | 0.15 |
| wSx1000 (Duke) | 243.3 | 196.89–305.02 | 165.6–294.4 | 0.76 |
| ThinPower | 1.68 | 1.60–1.78 | 1.00–2.5 | 0.12 |
| ThinPower (Duke) | 1.26 | 1.00–1.85 | 1.00–2.5 | 0.56 |
| mS | 0.52 | 0.37–0.71 | 0.10–1.00 | 0.38 |
| Rttover | 0.023 | 0.017–0.031 | 0.017–0.042 | 0.55 |
| MortRate | 0.001 | 0.0009–0.0011 | 0.0002–0.004 | 0.06 |
| Understory hardwoods | | | | Parameter group mean: 0.28 |
| alpha_h | 0.02 | 0.02–0.02 | 0.005–0.07 | 0.01 |
| pFS_h | 1.78 | 1.54–2.06 | 0.2–3.0 | 0.19 |
| pR_h | 0.21 | 0.06–0.43 | 0.05–2.00 | 0.19 |
| SLA_h | 16.3 | 14.1–19.0 | 6.2–25.8 | 0.25 |
| fCalpha700_h | 1.84 | 1.58–2.17 | 1.0–2.50 | 0.74 |
Sensitivity to the inclusion of ecosystem experiments
We also evaluated how parameter distributions and the associated
environmental sensitivity of model predictions depended on the inclusion of
ecosystem experiments in data assimilation. First, we repeated the Base
assimilation, this time excluding the plots that included the manipulated
treatments (NoExp). We removed all manipulation types at once, rather than
individual experimental types, because all experimental types involved
multi-factor studies. The NoExp assimilation had the same number of data
streams as the Base assimilation because it included the control treatments
from the experimental studies. The NoExp assimilation represented the
situation where only observations across environmental gradients were
available. Second, we compared the parameterization of the ASW, soil
fertility, and atmospheric CO2 environmental modifiers from the Base to
the NoExp assimilation. The modifier equations are described in
Supplement Sects. 1.2 and 1.3. Third, we repeated the same
independent validation exercise for the 160 FMRC plots as described above
for the Base assimilation. Fourth, we predicted the treatment plots in the
irrigated, drought, nutrient addition (only plots where FR was assumed to be
1), and elevated CO2 plots. As for the Base assimilation, we used a
t test to compare the experimental response between the NoExp assimilation
and observed values and between the NoExp and Base assimilations. Since the
experimental treatments were not used in the optimization, this was an
independent evaluation of predictive capacity.
Model evaluation of stem biomass when assimilating (a) observations across environmental gradients and ecosystem manipulation
experiments (Base; Table 4) and (b) only observations across
environmental gradients (NoExp; Table 4). The gray circles correspond to
predictions where all plots were used in data assimilation. The black
triangles correspond to predictions where 160 plots were not included in
data assimilation and represent an independent evaluation of model
predictions (cross-validation). For each plot, we used the measurement
with the longest interval between initialization and measurement for
evaluation.
The mean response, expressed as a percentage change in stem
biomass from the control treatment, for irrigation, drought (as a reduction
in throughfall), nutrient addition, and elevated CO2 experiments. The
observed response and the response simulated by the Base, NoExp, and
NoDkPars assimilation approaches are shown. The # sign signifies that the value below
the marker was significantly different from the observed response (p < 0.05). The * sign signifies that the value below the marker was significantly different from
the response in the Base assimilation (p < 0.05). Error bars are
±1 standard deviation.
Regional predictions with uncertainty
To demonstrate the capacity of the data assimilation system to create
regional predictions with uncertainty, we simulated the regional response to
a decrease in precipitation, an increase in nutrient availability, and an
increase in atmospheric CO2 concentration, each as a single factor
change from a 1985–2011 baseline. Each prediction included uncertainty by
integrating across the parameter posterior distributions using a Monte Carlo
sample of the parameter chains. Our region corresponded to the native range
of loblolly pine and used the HUC12 (USGS 12-digit Hydrological Unit Code)
watershed as the scale of simulation. For each HUC12 in the region, we used
the mean SI, 30-year mean annual temperature, ASW aggregated to the HUC12
level, and monthly meteorology from Abatzoglou (2013) as inputs (Fig. 3).
The SI of each HUC12 was estimated from biophysical variables in the HUC12
using the method described in Sabatia and Burkhart (2014).
This SI corresponded to an estimated SI for stands without intensive
silvicultural treatments or advanced genetics of planted stock.
Median and range of the 99 % quantile intervals of the posterior distributions for the parameters in the NoExp and NoDkPars assimilations.

| Parameter | NoExp median | NoExp 99 % range | NoDkPars median | NoDkPars 99 % range |
|---|---|---|---|---|
| Allocation and structure | | | | |
| pFS2 | 0.63 | 0.61–0.68 | 0.57 | 0.55–0.60 |
| pFS20 | 0.63 | 0.60–0.65 | 0.57 | 0.55–0.59 |
| pR | 0.11 | 0.06–0.16 | 0.11 | 0.08–0.15 |
| pCRS | 0.29 | 0.27–0.30 | 0.26 | 0.25–0.27 |
| pCRS (Duke) | 0.25 | 0.23–0.28 | n/a | n/a |
| SLA0 | 7.47 | 6.57–8.41 | 8.56 | 7.73–9.32 |
| SLA1 | 3.00 | 2.88–3.12 | 2.89 | 2.79–2.99 |
| tSLA | 4.75 | 4.30–5.26 | 4.12 | 3.90–4.38 |
| fCpFS700 | 0.50 | 0.50–0.53 | 0.94 | 0.83–1.00 |
| StemConst | 0.022 | 0.01–0.04 | 0.02 | 0.01–0.04 |
| StemPower | 2.79 | 2.27–3.26 | 2.77 | 2.28–3.30 |
| Canopy photosynthesis, autotrophic respiration, and transpiration | | | | |
| alpha | 0.030 | 0.028–0.033 | 0.029 | 0.026–0.031 |
| y | 0.48 | 0.45–0.51 | 0.49 | 0.46–0.52 |
| MaxCond | 0.017 | 0.015–0.021 | 0.011 | 0.011–0.012 |
| LAIgcx | 4.4 | 3.9–5.0 | 2.1 | 2.0–2.5 |
| Environmental modifiers of photosynthesis and transpiration | | | | |
| kF | 0.15 | 0.11–0.20 | 0.16 | 0.11–0.20 |
| Tmin | -7.8 | -10.97 to -4.95 | -6.04 | -9.06 to -3.03 |
| Topt | 21.55 | 19.15–24.39 | 22.71 | 20.54–25.42 |
| Tmax | 40.56 | 36.51–45.62 | 39.82 | 35.62–44.56 |
| SWconst | 0.93 | 0.8–1.1 | 1.14 | 0.91–1.62 |
| SWpower | 6.27 | 2.98–11.49 | 7.99 | 3.29–12.95 |
| CoeffCond | 0.041 | 0.034–0.047 | 0.036 | 0.030–0.042 |
| fCalpha700 | 1.01 | 1.00–1.06 | 1.15 | 1.10–1.25 |
| MaxAge | 152.84 | 54.18–199.5 | 152.0 | 49.2–199.3 |
| nAge | 3.36 | 1.93–3.99 | 3.36 | 1.89–3.99 |
| rAge | 2.26 | 0.80–2.99 | 2.24 | 0.83–2.99 |
| FR1 | 0.12 | 0.09–0.14 | 0.08 | 0.07–0.09 |
| FR2 | 0.20 | 0.16–0.24 | 0.17 | 0.15–0.19 |
| Mortality | | | | |
| wSx1000 | 191.6 | 180.2–210.2 | 181.32 | 173.26–196.32 |
| wSx1000 (Duke) | 235.1 | 175.0–297.5 | n/a | n/a |
| ThinPower | 1.76 | 1.61–1.92 | 1.59 | 1.46–1.72 |
| ThinPower (Duke) | 1.42 | 1.01–2.02 | n/a | n/a |
| mS | 0.54 | 0.33–0.80 | 0.50 | 0.25–0.71 |
| Rttover | 0.019 | 0.02–0.03 | 0.022 | 0.017–0.030 |
| MortRate | 0.0013 | 0.0011–0.0014 | 0.0011 | 0.0009–0.0013 |
| Understory hardwoods | | | | |
| alpha_h | 0.031 | 0.025–0.040 | 0.02 | 0.017–0.023 |
| pFS_h | 2.39 | 1.86–2.96 | 1.79 | 1.59–2.09 |
| pR_h | 0.25 | 0.05–0.67 | 0.21 | 0.06–0.41 |
| SLA_h | 12.37 | 9.96–15.07 | 16.42 | 14.37–18.55 |
| fCalpha700_h | 1.08 | 1.00–1.83 | 1.83 | 1.56–2.15 |

n/a: not applicable; NoDkPars assimilation did not include Duke-specific parameters.
Optimized environmental response functions in the 3-PG model for
the (a) soil fertility influence on photosynthesis, (b) available soil
water influence on photosynthesis and conductance, and (c) atmospheric
CO2 influence on photosynthesis. The function shapes were derived
from the parameters in the Base, NoExp, and NoDkPars assimilations (Table 4).
To sample parameter uncertainty, we randomly drew 500 samples, with replacement, from the Base
assimilation MCMC chain and simulated forest development from a 1985
planting to age 25 in 2011 in each HUC. We chose age 25 as the final age
because it is a typical age of harvest in the region. For each sample, we
repeated the regional simulation with (1) a 30 % reduction in
precipitation, (2) FR set to 1, and (3) atmospheric CO2 increased by 200 ppm.
Within a parameter sample, we calculated the percent change in stem
biomass at age 25 between the control simulation and the three simulations with
the environmental changes. We focused our regional analysis on the
distribution of the percent change in stem biomass.
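The sampling procedure above can be sketched for a single HUC12 unit as follows. This is an illustrative outline rather than the DAPPER implementation: `simulate` is a hypothetical stand-in for a 3-PG run that returns stem biomass at age 25, the scenario labels are invented for the example, and the chain is assumed to be a list of parameter sets.

```python
import random
import statistics


def percent_change(control, scenario):
    """Percent change in stem biomass relative to the control simulation."""
    return 100.0 * (scenario - control) / control


def summarize(changes):
    """Median and width of the central 95 % quantile interval."""
    # n=40 yields cut points at 2.5 %, 5 %, ..., 97.5 %
    cuts = statistics.quantiles(changes, n=40, method="inclusive")
    return statistics.median(changes), cuts[-1] - cuts[0]


def scenario_response(chain, simulate, scenario, n_draws=500, seed=1):
    """Propagate parameter uncertainty into the response to one scenario.

    chain: posterior parameter sets from the MCMC chain.
    simulate: hypothetical callable (params, scenario) -> stem biomass at age 25.
    """
    rng = random.Random(seed)
    draws = rng.choices(chain, k=n_draws)  # sampling with replacement
    changes = []
    for params in draws:
        # control and scenario share the same parameter sample
        control = simulate(params, "control")
        changed = simulate(params, scenario)
        changes.append(percent_change(control, changed))
    return summarize(changes)
```

Pairing the control and scenario runs within each parameter sample means the resulting interval reflects uncertainty in the response itself, not the combined spread of two independent sets of simulations.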
(a) Regional predictions of stem biomass stocks for a 25-year-old
stand planted in 1985. Parameters used in the predictions were from the Base
assimilation approach described in Table 5. (b) The width of the 95 %
quantile interval associated with uncertainty in model parameters.
Predictions of the percentage change in stem biomass at age 25 in
response to (a, b) a 200 ppm increase in atmospheric CO2 over 1985–2011
concentrations, (c, d) a 30 % reduction in precipitation from 1985–2011
levels, and (e, f) a removal of nutrient limitation by setting the soil
fertility rating in the model equal to 1. The left column is the median
prediction and the right column is the width of the 95 % quantile
interval (C.I.: credible interval) associated with parameter uncertainty. The predictions used the Base
assimilation.
Results
Data assimilation evaluation
Our multisite, multi-experiment, multi-data stream DA approach (Base
assimilation) increased confidence in the model parameters (Table 5).
Averaged across parameters, the posterior 99 % quantile range from the
Base assimilation was 60 % less than the prior range. The largest
reduction in parameter uncertainty was for the parameters associated with
light-use efficiency (alpha) and the conversion of GPP to net primary productivity (NPP) (y), which on
average had ranges that were 85 % lower in the posterior than the prior.
Parameters associated with allocation and allometry had a 63 % reduction
in the range while parameters associated with mortality processes had a 70 %
reduction in the range. Parameters associated with environmental modifiers
had the least reduction in the range with a 40 % decrease. In addition to
the parameters associated with the 3-PG model, the model process error
parameters for each data stream were well constrained with large reductions
in the range (> 99 % decrease; Supplemental Table S2).
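The posterior-to-prior range ratio behind these percentages can be reproduced with a short calculation. The sketch below assumes that the prior range of a normally distributed prior is its central 99 % quantile interval, as stated in the Table 5 caption; the function names are illustrative, and the SLA0 values are taken from the prior and posterior tables.

```python
from statistics import NormalDist


def normal_prior_range(mean, sd, level=0.99):
    """Central quantile interval of a normal prior (99 % by default)."""
    tail = (1.0 - level) / 2.0
    dist = NormalDist(mean, sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def range_ratio(post_low, post_high, prior_low, prior_high):
    """Width of the posterior interval relative to the prior range."""
    return (post_high - post_low) / (prior_high - prior_low)


if __name__ == "__main__":
    # SLA0: normal prior with mean 5.53 and SD 0.44;
    # posterior 99 % interval of 7.67-9.25
    low, high = normal_prior_range(5.53, 0.44)
    print(round(low, 2), round(high, 2))
    print(round(range_ratio(7.67, 9.25, low, high), 2))
```

A ratio near 1 indicates that the data added little constraint beyond the prior, while a ratio near 0 indicates a strongly constrained parameter.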
The Base assimilation reliably predicted data from the regionally
distributed non-manipulated plots that were not used in the optimization.
The mean bias in stem biomass of the cross-validation was -3.7 % and the
RMSE was 21.8 Mg ha-1 (Fig. 4a). Furthermore, the response of stem
biomass to irrigation (df= 7, p= 0.18), nutrient addition (df= 26,
p= 0.29), and elevated CO2 (df= 4, p= 0.43) was not
significantly different between the observed and the Base assimilation
(Fig. 5). The Base assimilation was significantly more sensitive to
drought than observed (n= 31, p < 0.001; Fig. 5).
The plots at the Duke Forest study had a higher carrying capacity of stem
biomass before self-thinning (wSx1000), a lower self-thinning rate (ThinPower), and a smaller allocation to coarse roots (pCRS) than values
optimized from the other plots across the region (Table 6). The DA approach
without these three study-specific parameters (NoDkPars) predicted
significantly lower accumulation of stem biomass in response to elevated
CO2 than observed (df= 4, p= 0.002; Fig. 5). The NoDkPars
assimilation optimized the CO2 fertilization parameter (fCalpha700) to
a value that predicted 45 % less light-use efficiency at 700 ppm (1.13 in
NoDkPars vs. 1.33 in Base; Table 6) than the Base assimilation.
Sensitivity to the inclusion of ecosystem experiments
Excluding the experimental treatments from the data assimilation did not
strongly influence the predictive capacity of the model. The RMSE for the
validation plots decreased slightly in the NoExp assimilation compared to the
Base assimilation (21.8 to 18.0 Mg ha-1), while the magnitude of the bias
increased slightly (-3.7 to -4.1 %) (Fig. 4b). Excluding the experimental treatments resulted in a
significantly lower response of stem biomass to elevated CO2 than
observed (df= 4, p < 0.001; Fig. 5). Furthermore, there was a
slight negative response of stem biomass to CO2 in the NoExp
assimilation because the parameter governing the change in foliage
allocation at elevated CO2 (fCpFS700) was unconstrained by observations
(Table 6). This led to convergence on the lower bound of the prior
distribution (0.5) where foliage allocation decreased with increased
atmospheric CO2. The predictions of irrigation, drought, and nutrient
addition experiments were not significantly different between the Base and
NoExp assimilations (Fig. 5).
The parameters and associated response functions in the 3-PG model for nutrients,
ASW, and atmospheric CO2 differed between the Base and NoExp
assimilations (Fig. 6). First, the parameterization of the soil fertility rating (FR) showed a stronger dependence on SI in the NoExp assimilation
than in the Base assimilation (Fig. 6a). For a given SI, there was a lower
FR, and thus stronger nutrient limitation, when experimental treatments were
excluded from assimilation. Second, the parameterization of the function
relating photosynthesis and canopy conductance to ASW resulted in lower
photosynthesis and maximum conductance when ASW was less
than 50 % of the maximum ASW in the NoExp assimilation than in the Base assimilation (Fig. 6b). Finally, the
response of photosynthesis to atmospheric CO2 was functionally zero in
the NoExp assimilation, thus highlighting the importance of the elevated
CO2 treatments in the Duke FACE study for constraining the
parameterization of the CO2 response function (Fig. 6c).
Regional predictions with uncertainty
Regionally (i.e., the native range of loblolly pines), stem biomass at age
25 ranged from 52 to 292 Mg ha-1 with the most
productive areas located in the coastal plains and the interior of
Mississippi and Alabama (Fig. 7a). The least productive locations were the
western and northern extents of the native range. The width of the 95 %
quantile interval for each HUC12 unit ranged from 6.2 to 29.8 Mg ha-1
with the largest uncertainty located in the most productive HUC12 units and in
the far western extent of the region (Fig. 7b).
The predicted change in stem biomass at age 25 from an additional 200 ppm of
atmospheric CO2 (over the 1985–2011 concentrations) was similar
to the change associated with a removal of nutrient limitation (by setting
FR to 1) (Fig. 8a, e). The median change associated with elevated CO2
for a given HUC12 unit ranged from 19.2 to 55.7 % with a regional median
of 21.7 % (Fig. 8a). The change associated with the removal of nutrient
limitation ranged from 6.9 to 303.7 % for a given HUC12 unit, with a
regional median of 24.1 % (Fig. 8e). The response to elevated CO2
was more consistent across space than the response to nutrient addition. The
largest potential gains in productivity from nutrient addition were
predicted in central Georgia, the northern extent of the region, and the
western extents, areas with the lowest SI (Fig. 3).
Stem biomass was considerably less responsive to a 30 % decrease in
precipitation than to nutrient addition and an increase in atmospheric
CO2. The median change in stem biomass when precipitation was reduced
from the 1985–2011 levels ranged from -11.6 to -0.1 % for a given HUC12
unit with a regional median of -5.1 % (Fig. 8c). Central Georgia was the
most responsive to precipitation reduction, reflecting the relatively low
annual precipitation and warm temperatures (Fig. 3).
For a given location, the predicted response to elevated CO2 had larger
uncertainty than the predicted response to precipitation reduction and
nutrient limitation removal (Fig. 8b, d, f). The uncertainty, defined as the
width of the 95 % quantile interval, was consistent across the region for
the response to elevated CO2 (Fig. 8b). The uncertainty in the
response to precipitation reduction and nutrient limitation removal was
largest in the regions with the largest predicted change (Fig. 8d, f).
Discussion
Using DA to parameterize models for predicting ecosystem change requires
disentangling the vegetation responses to temperature, precipitation,
nutrients, and elevated CO2. To address this challenge, we introduced a
regional-scale hierarchical Bayesian approach (Data Assimilation to Predict Productivity
for Ecosystems and Regions, DAPPER) that assimilated data
across environmental gradients and ecosystem manipulation experiments into a
modified version of the 3-PG model. Furthermore, we synthesized observations
of carbon stocks, carbon fluxes, water fluxes, vegetation structure, and
vegetation dynamics that spanned 35 years of forest research in a region
(Table 1, Fig. 1) with large and dynamic carbon fluxes (Lu et al.,
2015). By combining the DAPPER system with the regional set of observations,
we were able to estimate parameters in a model with high predictive capacity
(Fig. 4) and with quantified uncertainty on parameters (Table 5) and
regional simulations (Figs. 7 and 8).
Our hierarchical approach (Eq. 7) was designed to partition uncertainty
among parameters, model process, and measurements (Hobbs and Hooten, 2015).
Separating the parameter and process uncertainty is required to estimate
prediction intervals, as prediction intervals only include parameter and
process errors (Dietze et al., 2013; Hobbs and Hooten, 2015).
Previous forest ecosystem DA efforts have focused either on parameter
uncertainty, by using measurement uncertainty as the variance term in a
Gaussian cost function (Bloom and Williams, 2015; Keenan
et al., 2012; Richardson et al., 2010), or on total uncertainty, by directly
estimating the Gaussian variance term (Ricciuto et al., 2008). Our
approach allowed the estimation of the probability distribution of forest
biomass before uncertainty is added through measurement. Considering that
the method of DA can potentially have a large influence on posterior
parameter distributions (Trudinger et al., 2007), future research
should focus on comparing the hierarchical approach presented here to other
approaches by using the same data constraints with alternative cost
functions.
Sensitivity to the inclusion of ecosystem experiments
The most important experimental manipulation for constraining model
parameters was the Duke FACE CO2 fertilization study because the
CO2 fertilization parameters (fCalpha700 and fCpFS700) converged on the
lower bounds of their prior distributions when the experiments were excluded
from the assimilation. In contrast, excluding the nutrient fertilization,
drought, and irrigation studies did not substantially alter the predictive
capacity of the model. This finding suggests that data assimilation using
plots across environmental gradients alone can constrain parameters
associated with water and nutrient sensitivity. However, regardless of
whether the experiments were included in the assimilation, the optimized
model predicted higher sensitivity to drought than observed, highlighting
that future studies should focus on improving the sensitivity to drought.
The 3-PG model included a highly simplified representation of interactions
between the water and carbon cycles that resulted in parameterizations that
may contain assumptions that require additional investigation. First,
transpiration was modeled as a function of a potential canopy transpiration
that occurred if leaf area was not limiting transpiration. The LAI at which
leaf area was no longer limiting was a parameter that was optimized (LAIgcx
in Table 5), resulting in a value of 2.2. Interestingly, this optimized
value is consistent with the scant literature on this topic. In their
analysis of multiyear measurements of transpiration in loblolly pine,
Phillips and Oren (2001) observed that transpiration per unit leaf area
was relatively insensitive to increases in leaf area above an LAI of
approximately 2.5. Iritz and Lindroth (1996) reviewed transpiration
data from a range of crop species and found only small increases in
transpiration above LAI of 3–4. These authors suggest that the
threshold-type responses observed were related to the range of LAI at which
self-shading increases most rapidly, therefore limiting increases in
transpiration. The resulting model behavior of “flat” transpiration above
2.2 LAI, with gradually decreasing photosynthesis above that value, results
in increasing water use efficiency at higher LAI values. Second, the
relationship between relative ASW and the modifier of photosynthesis and
transpiration predicted a modifier value greater than zero when the relative
ASW was zero. This resulted in positive values for photosynthesis and
transpiration when the average ASW during the month was zero. In practice,
the monthly ASW was rarely zero during simulations, which presents a
challenge for constraining the shape of the ASW modifier. The priors for the two
ASW modifier parameters (SWconst and SWpower) had ranges that permitted the modifier
to be zero. Therefore, additional data are likely needed during very dry
conditions to develop a more physically based parameterization.
Alternatively, the parameterization of a non-zero soil moisture modifier at
zero ASW may be due to trees having access to water at soil depths deeper
than the top 1.5 m of soil represented by the bucket in 3-PG. Overall, it is
important to view the parameterization presented here as a phenomenological
relationship that is consistent with observations from drought and
irrigation experiments as well as observations across regional gradients in
precipitation.
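The non-zero modifier at zero ASW can be illustrated with the soil water modifier of the standard 3-PG model. The sketch below assumes that standard form, which matches the parameter descriptions given for SWconst and SWpower; the modified model used here is defined in the Supplement and may differ, so the numbers should be read as an illustration only. The parameter values are the posterior medians of the Base and NoExp assimilations.

```python
def asw_modifier(relative_asw, sw_const, sw_power):
    """Soil water modifier in the standard 3-PG form (an assumption here).

    relative_asw: ASW divided by the maximum ASW (0 to 1).
    sw_const: moisture ratio deficit at which the modifier equals 0.5.
    sw_power: power of the moisture ratio deficit.
    """
    deficit = 1.0 - relative_asw
    return 1.0 / (1.0 + (deficit / sw_const) ** sw_power)


if __name__ == "__main__":
    # posterior medians: Base (SWconst 1.09, SWpower 8.86)
    # and NoExp (SWconst 0.93, SWpower 6.27)
    for label, const, power in [("Base", 1.09, 8.86), ("NoExp", 0.93, 6.27)]:
        values = [asw_modifier(x / 4.0, const, power) for x in range(5)]
        print(label, [round(v, 2) for v in values])
```

Because the Base median of SWconst exceeds 1, the deficit never reaches the value at which the modifier would equal 0.5, so under this form the modifier stays above 0.5 even when the relative ASW is zero.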
Constraining the sensitivity to atmospheric CO2 differs from
constraining the sensitivity to ASW because, unlike the multiple constraints
on water sensitivity (drought, irrigation, and gradient studies),
environmental conditions created by the few elevated CO2 plots provided
a unique constraint on parameters. Our finding demonstrated that DA efforts
should test for bias in unique ecosystem experiments before finalizing a set
of model parameters used in optimization. In particular, we found that the
parameter governing the photosynthetic response to elevated CO2
(fCalpha700) was substantially lower when all parameters were assumed to be
shared across all plots than when the CO2 fertilization experiment was
allowed to have unique parameters. The need for the three unique parameters
at the Duke FACE study can be explained by the constraint
provided by multiple data streams and multiple plots. An assumption of the
model was that an increase in stem biomass caused a decrease in stem density
through self-thinning, unless the average tree stem biomass was below a
parameterized threshold (wSx1000). Therefore, an increase in photosynthesis
and stem biomass through CO2 fertilization could cause a decrease in
stem density. For a single study, it is straightforward to simultaneously
fit the CO2 fertilization and self-thinning parameters to fit stem
biomass and stem density observations for the site. However, regional DA
presents a challenge because the self-thinning parameters are well
constrained by the stem biomass and stem density observations across the
region but the CO2 fertilization parameters are not. As a result of the
regional DA, the self-thinning parameters caused a stronger decrease in stem
density than observed in the Duke FACE study. Therefore, the optimization
favored a solution where there was a lower response to CO2 and thus a
smaller decrease in stem density. Allowing the Duke FACE study to have
unique self-thinning parameters resulted in lower rates of
self-thinning and allowed for simulated stem biomass to respond to CO2
in a way that matched the observations without penalizing the optimization
by degrading the fit to the stem density.
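The effect of the study-specific self-thinning parameters can be illustrated with the self-thinning threshold of the standard 3-PG model, in which the maximum mean stem mass per tree scales with stem density raised to ThinPower. The sketch below assumes that standard rule; the modified model used here may differ in detail. The parameter values are the Base posterior medians for the region and for the Duke studies, and the stem density of 1500 trees ha-1 is an arbitrary example.

```python
def max_stem_mass(stems_per_ha, wsx1000, thin_power):
    """Mean stem mass per tree (kg) above which self-thinning begins.

    Standard 3-PG self-thinning rule (assumed here):
    wSx = wSx1000 * (1000 / N) ** ThinPower
    """
    return wsx1000 * (1000.0 / stems_per_ha) ** thin_power


if __name__ == "__main__":
    density = 1500.0  # trees per hectare, arbitrary example
    regional = max_stem_mass(density, wsx1000=176.9, thin_power=1.68)
    duke = max_stem_mass(density, wsx1000=243.3, thin_power=1.26)
    print(round(regional, 1), round(duke, 1))
```

Under this rule, the Duke parameters allow a larger mean stem mass at the same density before trees are removed, which is consistent with the lower self-thinning rates described above.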
Our finding that the Duke FACE study required unique self-thinning
parameters to reduce bias in the simulated stem biomass suggests that when
using DA to optimize parameters that are shared across plots, careful
examination of prediction bias in key sites that provide a unique constraint
on certain parameters (like the Duke FACE) is critical. Based on this
example, we suggest that DA efforts using multiple studies and multiple
experiment types identify whether particular experiments at a limited number
of sites have the potential to uniquely constrain specific parameters. In
this case, additional weight or site-specific parameters may be needed to
avoid having the signal of the unique experiment overwhelmed by the large
amount of data from the other sites and experiments. Additionally, the
finding suggests that multisite DA should consider using hierarchical
approaches to predicting mortality, particularly because mortality is often
not simulated as mechanistically as growth. A hierarchical approach, where
each plot has a set of mortality parameters that are drawn from a regional
distribution, could prevent unexplained variation in mortality rates
from biasing the parameterization of growth-related processes (i.e.,
growth responses to CO2, drought, nutrient fertilization). The
hierarchical approach to mortality could also highlight patterns in
mortality rates across a region and allow for additional investigations
into the mechanisms driving the patterns.
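The hierarchical structure suggested here can be sketched as a prior term added to the log posterior. The sketch below is a minimal illustration that assumes a normal regional distribution. The function names and the example values are hypothetical, and a full analysis would also place priors on the regional mean and standard deviation.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution evaluated at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def hierarchical_log_prior(plot_mortality, regional_mean, regional_sd):
    """Sum the log densities of plot-level mortality parameters under a
    shared regional distribution (assumed normal for this sketch)."""
    if regional_sd <= 0:
        return -math.inf  # an invalid regional spread gets zero prior mass
    return sum(normal_logpdf(m, regional_mean, regional_sd)
               for m in plot_mortality)


# Plots near the regional mean are favored over a plot far from it.
similar = hierarchical_log_prior([0.02, 0.02], 0.02, 0.01)
outlier = hierarchical_log_prior([0.02, 0.08], 0.02, 0.01)
```

Because each plot keeps its own mortality parameter, unexplained plot-level variation is absorbed by the regional spread instead of being forced into the shared growth parameters.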
Regional predictions with uncertainty
Our predictions of how stem biomass responds to elevated CO2, nutrient
addition, and drought were designed to illustrate the capacity of the DAPPER
approach to simulate the uncertainty in future predictions. Because they were
derived using DA, our regional predictions and their uncertainty are
consistent with observations, but they carry key caveats. First, only
parameter uncertainty was presented in the regional simulations. There is
additional uncertainty associated with model process error. We showed the
parameter uncertainty
because it isolated the capacity to parameterize the individual
environmental response functions in the model. Second, the response to
drought may be too strong because of the bias in the model predictions of
the drought studies. However, it is possible that the drought studies
underestimated the sensitivity to ASW because they are relatively short term
(< 5 years) and manipulate local ASW without manipulating large-scale ASW
(i.e., regional water tables). Third, the responses to nutrient
fertilization at the western and northern extents of the study region may be
too large. The large responses are attributed to the low SI and the low
predicted site fertility rating (FRp). The low SI may be attributable to
water and temperature limitations that are not fully accounted for
in the parameterization. Additional nutrient addition experiments in the
northern and western extents along with further development of the
representation of nutrient availability in the 3-PG model may allow for a
more robust representation of soil fertility. Finally, the baseline
fertility used in our regional analysis was derived from an empirical model
of SI that was developed using field plots with minimal management
(Sabatia and Burkhart, 2014). Consequently, our estimate of
baseline fertility is likely on the low end of forest stands currently in
production and the response to nutrient addition may be higher than that of a
typical stand under active management.
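The first caveat, that only parameter uncertainty was propagated, can be illustrated with a minimal sketch. The toy growth model, the hypothetical posterior draws, and the process error magnitude are all assumptions made for illustration and do not represent the 3-PG model or the fitted posterior. For simplicity, process error is added once to the final prediction instead of at every time step.

```python
import random
import statistics


def toy_model(growth_rate, years=10, start=50.0):
    """Toy stand-in for an ecosystem model: compound growth of stem biomass."""
    return start * (1.0 + growth_rate) ** years


def predictive_spread(posterior_draws, process_sd=0.0, seed=1):
    """Standard deviation of predictions across posterior parameter draws.

    With process_sd == 0 only parameter uncertainty is propagated; a positive
    process_sd adds normally distributed process error to each prediction.
    """
    rng = random.Random(seed)
    preds = [toy_model(g) + rng.gauss(0.0, process_sd) for g in posterior_draws]
    return statistics.stdev(preds)


rng = random.Random(0)
draws = [rng.gauss(0.05, 0.005) for _ in range(2000)]  # hypothetical posterior
param_only = predictive_spread(draws)                   # parameter uncertainty
with_process = predictive_spread(draws, process_sd=5.0) # plus process error
```

In this sketch the spread that includes process error is wider than the parameter-only spread, which is why intervals based on parameter uncertainty alone are best read as a lower bound on total predictive uncertainty.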