Optimization of the seasonal cycles of simulated CO 2 flux by fitting simulated atmospheric CO 2 to observed vertical profiles

Abstract. An inverse of a combination of atmospheric transport and flux models was used to optimize the Carnegie-Ames-Stanford Approach (CASA) terrestrial ecosystem model properties such as light use efficiency and temperature dependence of the heterotrophic respiration separately for each vegetation type. The method employed in the present study is based on minimizing the differences between the simulated and observed seasonal cycles of CO2 concentrations. In order to compensate for possible vertical mixing biases in a transport model we use airborne observations of CO2 vertical profile aggregated to a partial column instead of surface observations used predominantly in other parameter optimization studies. Effect of the vertical mixing on optimized net ecosystem production (NEP) was evaluated by carrying out 2 sets of inverse calculations: one with partial-column concentration data from 15 locations and another with near-surface CO2 concentration data from the same locations. We confirmed that the simulated growing season net flux (GSNF) and net primary productivity (NPP) are about 14% higher for northern extra-tropical land when optimized with partial column data as compared to the case with near-surface data.


Introduction
Accurate estimation of the global distribution of CO 2 flux is important not only for making a basis for imposing the emission restriction of CO 2 gases on each country under international agreement, but also for understanding both natural and anthropogenic processes controlling the CO 2 fluxes.One common approach for estimation of CO 2 flux is to use atmospheric transport inversions (Gurney et al., 2002;Ro-Correspondence to: S. Maksyutov (shamil@nies.go.jp) denbeck et al., 2003).With increasing number of CO 2 observation data becoming available recently, the use of atmospheric transport inversion will produce more reliable results (Maksyutov et al., 2003).Equally important in increasing the reliability of the atmospheric transport inversions is to increase the reliability of the background CO 2 fluxes that are used to derive the a-priori values of CO 2 concentration fields for solving the inverse problems.
Fluxes of CO 2 due to net ecosystem production (NEP) of terrestrial ecosystem, fossil fuel combustions, biomass burning, and exchange with ocean are major contributors to the seasonal cycle of CO 2 in atmosphere.Among all of these fluxes, NEP makes the largest contribution to variability in CO 2 in the atmosphere although it is very close to neutral over the course of a year (Tucker et al., 1986).To better understand the carbon cycle in the terrestrial ecosystem, several models have been developed to date.For example, Potsdam Model Intercomparison study compared a total of 17 global terrestrial biogeochemistry models, and analyzed these models from several aspects such as the simulated net primary productivities (NPP), using the common input data (Cramer et al., 1999).
Methods to optimize terrestrial ecosystem models with atmopsheric CO 2 seasonal cycle vary from a model to model.One way is to adjust the model parameters one by one until a simulated physical quantity is close enough to the observed value.On the other hand, statistical approaches are commonly used to adjust model parameters.Fung et al. (1987) optimized temperature sensitivity of the ecosystem respiration globally to get a better fit of the simulated northern hemispheric CO 2 seasonality to the observations, and achieved quite reasonable results for the amplitude of seasonal cycle although with some problems in the phase.Later, Randerson et al. (2002) simultaneously optimized parameters of the Carnegie-Ames-Stanford Approach (CASA) terrestrial ecosystem model by incrementally varying the values of two parameters and constructing a three-dimensional plot of a cost function describing the weighted difference between modeled and observed CO 2 concentrations.In their study, they used the Goddard Institute for Space Studies tracer transport model to simulate the atmospheric CO 2 concentrations from CASA fluxes with different values of parameters (Randerson et al., 2002).Kaminski et al. (2002) simultaneously optimized 24 parameters of the Simple Diagnostic Biosphere Model (SDBM) by assimilating seasonal cycles of CO 2 concentrations from 41 observing sites.Further, Rayner et al. (2005) elaborated on the carbon cycle data assimilation system developed by Kaminski et al. (2002) and simultaneously optimized 57 parameters of Biosphere Energy Transfer Hydrology Scheme (BETHY) using the observed data of CO 2 for 1979 to 1999.
To our knowledge, these studies which used the observed CO 2 concentrations to optimize parameters of terrestrial ecosystem model relied upon available CO 2 data which are dominated by surface level measurements.However, more recent studies have revealed that the vertical mixing biases in transport models result in bias in the optimized fluxes.For example, Stephens et al. (2007) suggested that a number of transport models compared in the TransCom-3 study (Gurney et al. 2002) do have vertical mixing biases which were revealed by comparing optimized concentration fields with observed vertical profiles not used in the inversion.Models with both too step and too shallow vertical gradients were present.Similarly, Yang et al. (2007) used ground-based FTS and aircraft measurements to suggest that use of CO 2 concentration data in boundary layer in the atmospheric inversions can bias the estimated fluxes, and pointed to a weak vertical mixing bias on average in a number of the transport models of TransCom-3.They implied that the use of CO 2 column data could be more relevant for the reliable optimization of terrestrial ecosystem models.Mean weak mixing bias in TransCom-3 models by (Gurney et al., 2002) can be attributed to using mostly offline models with missing or simplified physical process parameterizations such as shallow and penetrative cloud convection and boundary layer turbulence.Some of more recent transport models, such as compared by Law et al. (2008) involve complete online transport schemes and are expected to do better in vertical mixing.
In the present study, we optimized CASA with partial column data of CO 2 obtained by aircraft measurements, and separately, with near-surface data of CO 2 for comparison.We applied the atmospheric transport inversion method, which is widely used to estimate regional fluxes of CO 2 (e.g.Gurney et al., 2004), to estimate two parameters of the CASA flux model (light use efficiency and temperature dependence of the heterotrophic respiration) independently for each of the 11 vegetation types.By analyzing the vertical profiles of simulated and observed CO 2 , it was found that the transport model used in this study has a weak vertical mixing especially in the northern mid latitude during winter and this inaccuracy of the mixing led to the underestimation of NEP seasonality when near-surface data was used exclusively.The optimization with partial column data of CO 2 , on the other hand, is less affected by mixing scheme of a transport model and expected to result in more accurate optimization of seasonal cycles of NEP field.

Methods
In this section, we first present the overall description of the inversion method used for the CASA parameter optimization, followed by the detailed description of each part of the optimization process as well as the models used in this study.

Carbon cycle model
We used the Carnegie-Ames-Stanford Approach (CASA) to simulate terrestrial biosphere.Specifically, the CASA described by van der Werf et al. (2003) was used with following modifications.The fire activities in CASA were turned off by setting the burned fraction to zero at every grid cell of CASA for all times.This is because we are only interested in the seasonal cycle of NEP in the present study, and the interannual variability of the forest fire activities is too erratic to account for in the average seasonal cycle (van der Werf et al., 2006).As input data for CASA, we used the same dataset as described by van der Werf et al. (2003) except for monthly normalized difference vegetation index (NDVI).We used NDVI data from Pathfinder AVHRR Land dataset (Agbu and James, 1994) for 1981 to 2001, and derived the monthly climatology of NDVI following the method described by Randerson et al. (1997).Figure 1 shows the distributions of the vegetation types in CASA as well as the abbreviation for each vegetation type of CASA used throughout the rest of this paper.We used CASA with spatial resolution of 1 • latitude × 1 • longitude and monthly time step.In the rest of this sub section, the algorithms of CASA used to derive NPP and flux of carbon due to heterotrophic respiration R h are briefly introduced since the parameters that control these two quantities were optimized in this study.
The net ecosystem exchange (NEE) in CASA is obtained as a difference between the net primary productivity (NPP) and the sum of fluxes due to R h , fuel wood burnings, and consumptions of plants by herbivores.In CASA, the NPP at a grid cell g and time t is given by NPP(g,t) = IPAR(g,t)ε(g,t) (1) where IPAR is intercepted photosynthetically active radiation and ε is light use efficiency.The value of IPAR in Eq. ( 6) is a function of NDVI and proportional to photosynthetically active radiation PAR (Bishop and Rossow, 1991).On the other hand, ε is a production efficiency of an ecosystem for a given IPAR and is expressed as where factors F T and F W are dependent on temperature and soil moisture and account for stresses induced by temperature and soil water availability, respectively, and E max is a maximum light use efficiency.To our knowledge, E max has been taken as a universal constant common to all ecosystem types in the original CASA (e.g.0.5 gC (MJ PAR) −1 as used by van der Werf et al., 2003).
Likewise, conditions of soil moisture and temperature dominate the control over R h.The effect of temperature on R h is expressed as F R which is an exponential function of a factor Q 10 : where T (g, t) is a surface temperature.In this study, we simultaneously optimized E max and Q 10 of each vegetation type; that is, the size of parameter vector p is 22 (i.e. 2 parameters × 11 vegetation types).Furthermore, we used 0.5 gC (MJ PAR) −1 and 2.00 as the initial values of E max and Q 10 , respectively, and 0.25 gC (MJ PAR) −1 and 0.30 as the prior uncertainty of E max and Q 10 , respectively.

Formalism of the parameter optimization
In this study, we optimized a set of the CASA parameters, p, using the Bayesian inversion in which the weighted mismatches between the modeled and observed concentrations of atmospheric CO 2 concentrations are minimized.This is equivalent to minimizing the cost function J J = (4) where x is a matrix consisting of the observed CO 2 concentrations, M is a transport model which maps p to simulated concentrations of CO 2 , p 0 is the initial values of p, and C x and C p 0 are the covariance matrices of x and C p 0 , respectively.The operator M consists of atmospheric transport model (A) and CASA (B), i.e.M (p) = A B (p).As shown in the following section, B is nonlinear while A is linear, so in order to minimize Eq. ( 4) we expanded B around p 0 in Taylor series and approximated it up to the 1st-order term: where G is the first derivative of B(p) with respect to p at p = p 0 .We evaluated G(p − p 0 ) numerically assuming a linear relationship between the first derivative and p for a small change in p.Furthermore, the solutions of p which minimizes Eq. ( 1) is and the associated covariance matrix of p is The detailed derivations of Eqs. ( 6) and ( 7) were previously shown, for example, by Enting (2002) and Bousquet et al. (1999).In this study, the minimization of J was done iteratively since we used the linear approximation in Eq. ( 5).Throughout the iterative process, the values of p 0 and C p 0 were fixed at the values described in the following section.Note that, because Eq. ( 5) is not exact, neither p nor C p obtained by Eqs. ( 6) and ( 7) are exact solutions to minimize J. Thus, to assign the measure of the improvements in the simulation, we calculated χ 2 which is the mean-square mismatch between the observed and simulated concentrations: where N obs is the number of observations (i.e. the size of x), and M(p) is in its exact form.

Atmospheric transport model
The NIES transport model (Maksyutov and Inoue, 2000) was used to simulate the global distributions of CO 2 resulting from a given surface CO 2 flux.It is an off-line model and uses National Centers for Environmental Prediction (NCEP) reanalysis meteorology (Kalnay et al., 1996).The model has a resolution of 2.5 • latitude × 2.5 • longitude, 15 vertical levels (from ∼0.15 to 20 km in altitude), and the time step of 15 min.The advection scheme is semi-Lagrangian with tracer mass adjustment for the conservation of tracer.The monthly climatological day-time mean planetary boundary layer (PBL) height, derived from the GEOS-1 reanalysis (Schubert et al., 1995), was used to define the PBL height in the model.The detailed description of the model's scheme for vertical mixing can be found in Appendix A of Ishizawa et al. (2006).For this study, the transport model was run for 3 www.biogeosciences.net/6/2733/2009/Biogeosciences, 6, 2733-2741, 2009 model-years with the meteorology of 1997-1999 and appropriate background fluxes (described below), and the result from the 3rd year was used to represent the seasonal cycle of the CO 2 concentration for a given surface flux.Annual anthropogenic carbon fluxes for 1990 (Andres et al., 1996) and 1995 (Brenkert, 1998) and monthly oceanic flux (Takahashi et al., 2002) were used as the background fluxes.The linear trend of the simulated CO 2 concentration at each station was subtracted from each station data to prepare a detrended seasonal cycle at each station.The propagation of response function G (see Eq. 5) in the atmosphere was simulated with the NIES transport model and used to evaluate Eqs. ( 6) and (7).

Observed data of CO 2
We used data of vertical profiles of CO 2 concentration from GLOBALVIEW-CO 2 (2007).The locations of the 15 vertical profiles used in this study are shown in Fig. 1, and the vertical coverage at each data point is listed in Table 1.The error of each seasonal cycle was obtained using the method described by Kaminski et al. (2002).The discrete vertical profiles were converted to a partial column concentration, assuming that the each data point represents a concentration of CO 2 in a column of atmosphere having a thickness of 1000 m centered at the altitude at which the data was taken (see Table 1).We used weighted mean of the uncertainty of each data point in the vertical profile to obtain the uncertainty of the partial column concentration.In addition to the dataset of partial column concentrations, the CO 2 concentrations at the lowest level of each vertical profile were collected to prepare the "near-surface" dataset of the CO 2 concentrations.

Results and discussions
In this section, we first describe the values of optimized parameters and the changes in their uncertainties.Then, the results of the seasonal cycles obtained from the partial column data and near surface data will be compared from several aspects.

Optimized parameters
The values of both Q 10 and E max stabilized after five iterative calculations to minimize Eq. ( 4) with the observed seasonal cycles of partial column data.However, the values of Q 10 and E max fluctuated quite significantly throughout the optimization with near-surface data.Thus, we chose to use the results which resulted in the smallest value of χ 2 since we derived χ 2 without any approximations.We found that the value of χ 2 decreased from 1.84 to 0.60 after optimization with the partial-column data, while it decreased from 2.60 to 1.67 after optimization with the near-surface data.
The optimization with partial-column data resulted in an average E max of 0.54 gC (MJ PAR) −1 and Q 10 of 1.81 for 11 vegetation types with standard deviations of ±0.20 gC (MJ PAR) −1 and 0.29, respectively; while the optimization with near-surface data resulted in average E max of 0.49 gC (MJ PAR) −1 and Q 10 of 1.81 with standard deviations of ±0.27 gC (MJ PAR) −1 and 0.27, respectively.The optimized values of E max and Q 10 for each vegetation type are shown in Fig. 2. The value of E max optimized with partialcolumn CO 2 were greater than or approximately equal to the E max optimized with the near-surface CO 2 data for all vegetation types except for BNF.Moreover, E max of BNF was more tightly constrained by the near-surface data than by the partial-column data (Fig. 3).On the other hand, near-surface and partial-column inversions resulted in the values of Q 10 that are significantly different from each other for AGR and NEF, although these two vegetation types had the opposite trends in E max and Q 10 (Fig. 2).Interestingly, near-surface data of CO 2 used in this study constrained E max more than partial-column CO 2 data while the trend was vice versa for Q 10 of all vegetation types except for AGR (Fig. 3).
At the same time, it has to be emphasized that the optimizations of other parameters could have led to the comparable reduction in χ 2 and thus the physical meanings of the optimized parameters shown in Fig. 2 need to be carefully interpreted.Moreover, the available data on seasonal cycles of vertical profiles of CO 2 are quite limited at this point, and thus the results of this study are strongly biased toward the location of the available data as shown in Fig. 3 which shows that some of the vegetation types which have no nearby observation points have no significant reduction in the parameter's uncertainty.Therefore, increasing the number of the reliable vertical profile data is expected to improve the confidence level of the resulting parameters.

Growing season net flux and NPP
To analyze the amplitude of seasonality of NEP of CASA optimized in this study, we calculated growing season net flux (GSNF) which is defined as the sum of NEP for the months when NEP is positive (Randerson et al., 1997).The use of GSNF is valuable in this study since CASA is de-signed to have no annual net flux (i.e.zero annual NEP) for each model grid, and so we can use GSNF as a measure of the productivity of ecosystem in CASA.The values of GSNF were higher when CASA was optimized with the partial-column CO 2 data than with the near-surface data at almost all latitudes except for around 40 • to 45 • (Fig. 4).We compared the values of GSNF and NPP for each vegetation type (Table 2), and found that GSNF decreased notably for BNF when we changed the CO 2 data for inversion from the near-surface to partial-column data which account for the low value of GSNF from partial-column inversion between 40 types obtained by inversion with the partial column data were either approximately equal to or greater than those obtained with the near-surface data, accumulating to 15.8% and 17.0% increases in the total annual NPP and GSNF, respectively, upon changing the data choice from near-surface to partial column concentrations (Table 2).At the same time, Randerson et al. (1997) predicted that the global sums of NPP and GSNF for 1990 were 54.9 PgC y −1 and 13.6 PgC y −1 , respectively, and both of these values are slightly larger than corresponding values obtained in this study (see Table 2).
Correctly identifying the cause of this discrepancy is out of scope of the present study, since the datasets used for CASA in their study are different from those in the present study.Thus, directly comparing the results of these two studies is difficult, and so we limit our discussion to the compar- ison of our own results in this paper.Furthermore, using column concentrations of CO 2 observed by a ground-based FTS, Yang et al. (2007) found that the actual GSNF north of 30 • is approximately 28% larger than the GSNF predicted by Randerson et al. (1997) using CASA.However, in their study, Yang et al. (2007) did not directly evaluate the effects of utilizing column or partial column concentrations of CO 2 instead of boundary concentration data, and so no conclusion was made on how much of this 28% is due to the weak vertical mixing in transport models.In the present study, we can directly compare these two cases.For example, our analysis indicates that the use of near-surface data of CO 2 resulted in GSNF that was 14% less than the case with partial-column data for north of 30 • N. At the same time, we note here that this value (14%) can be expected to be slightly larger when total column concentrations (e.g. from ground-based FTS measurements) are used instead of partial columns used in this study.

Seasonal cycle and vertical profiles of CO 2 with optimized CASA NEP
Using two sets of optimized CO 2 flux field from CASA along with background fluxes, we simulated seasonal cycle of global CO 2 concentration field.Figure 5 shows that the optimized seasonal cycles of partial-column concentrations resulted in the better fits to observations of partial column concentrations than those simulated with prior values of E max and Q 10 , for both cases of optimizations.Furthermore consistent with the trend of GSNF and NPP, the seasonal cycle of CO 2 partial-column concentrations simulated with CASA optimized with near-surface data had a smaller amplitude than those optimized with partial-column data (Fig. 5; results for only selected locations are shown).We also compared the vertical profiles of the observed and simulated CO 2 concentrations, by averaging vertical profiles for Northern Hemisphere summer (July, August, and September) and winter (January, February, and March) (Fig. 6).By comparing the vertical profiles simulated with 2 cases of optimized CASA, we found that the vertical gradients of their CO 2 concentrations are almost identical while the amplitude of seasonal cycle at a given altitude is greater for the CO 2 concentration simulated with CASA optimized with partial column data.On the other hand, for both of these simulated vertical profiles of many locations, the simulated vertical gradients are too strong compared with the observed vertical gradients especially in winter (Fig. 6).This indicates that the vertical mixings in the transport model at these locations are not sufficient.Moreover, similarly to what was suggested by Yang et al. (2007) for the average of 12 transport models used in TransCom-3, NIES transport model has insufficient rates of vertical mixing both between the planetary boundary layer and upper troposphere (Fig. 6).This weak vertical mixing in the transport model is attributed as a cause of the GSNF and NPP of CASA that was underestimated when CASA was optimized with the near-surface data.That is, low (in summer) and high (in winter) concentrations of CO 2 in boundary layer, caused by the net flux of CO 2 due to activities of terrestrial ecosystem (i.e.photosynthesis and respiration), are not effectively propagated to the higher altitudes due to the insufficient vertical mixing in the transport model, and this results in artificially high amplitudes of seasonal cycle of CO 2 concentration near surface even when the correct amount of CO 2 flux from CASA is given to a transport model.Thus, when only near-surface data of CO 2 concentrations are used to op-timize CASA, the amplitudes of seasonal cycles of NEP in CASA are underestimated.On the other hand, when column concentrations of CO 2 are used, the optimization of CASA is affected less by the inaccuracy of vertical mixing in the transport model and more reliable results can be obtained although other problems in the transport model as well as other parameters of CASA may bias the results.Furthermore, since the method described in this paper can correct the seasonality of CASA NEP without being much affected by a scheme of vertical mixing in a transport model, it can be used to prepare flux fields of CO 2 which can be used as a reference for tuning vertical mixing processes in a transport model, and could be complementary to other widely used vertical mixing tracers such as radon.

Summary
The seasonality of the CASA ecosystem model was optimized using the vertical profiles of the observed CO 2 concentrations and the inverse of transport model with CASA.We found that the method employed in this study can effectively optimize the seasonality of CASA NEP.Moreover, we found that the CASA NEP simulated with the partial column concentrations of CO 2 has larger seasonal amplitude than that simulated with the near-surface data.Our analysis showed that annual GSNF predicted with the partial column data was 14% larger than that predicted with the near-surface data.Furthermore, the analysis of the vertical profiles showed that www.biogeosciences.net/6/2733/2009/Biogeosciences, 6, 2733-2741, 2009 the low GSNF predicted with near-surface data is due to the weak vertical mixing in the transport model used in this study.In conclusion, optimization of an ecosystem model for CO 2 flux in conjunction with an atmospheric transport model can be more reliably achieved with CO 2 column concentrations than only with the near-surface data, especially when a vertical mixing scheme in a transport model is not accurate enough.As a result, we arrived at the CO 2 flux model which fits CO 2 column observations better and is less dependent on the mixing properties of the transport model used in the parameter optimization process.Better fit to the partial column average concentration can potentially improve a fit of the forward model simulations to the observations of the CO 2 by ground based and space based remote sensing instruments.Transport model tuning is left beyond a scope of this study because the main purpose of producing correct NEP seasonality is achieved by using partial CO 2 column observations, although it would be even more efficient to simultaneously tune transport and surface fluxes, that would allow including surface-only observation sites data consistently with vertical profiles.

Fig. 2 .
Fig. 2. (a) E max (b) Q 10 and of each vegetation type optimized with partial column concentrations of CO 2 and near-surface CO 2 concentration.The dotted and dashed lines represent the initial value and its uncertainty of respective parameter, respectively.

Fig. 5 .
Fig. 5. Seasonal cycles of CO 2 partial column concentrations.Observed values are plotted with the results of 2 cases of CASA optimizations, as well as their prior values.

Fig. 6 .
Fig.6.Vertical profiles of the simulated and optimized CO 2 concentrations at each location.The simulated profiles were made using the CASA parameters obtained with partial column of CO 2 and near surface CO 2 data.
Map of vegetation types in CASA.TRF: tropical rainforests, BDF: broadleaf deciduous forests; BNF: broadleaf and needleleaf forests; NEF: needleleaf evergreen forests; SVN: savannas, GSL: perennial grasslands, BSB: broadleaf shrubs with bare soil, TUN: tundra, DST: desert, AGR: agriculture.Red squares on the map indicate the locations of the vertical profile data used for this study (see Table1).

Table 1 .
Locations and amplitudes of the CO 2 vertical profile data used for this study.The data were obtained from GLOBALVIEW-CO 2 (2007).

Table 2 .
NPP and GSNF of each vegetation type after CASA optimizations with near-surface and partial columns of CO 2 .The global totals are also shown (note the unit change).