A data-model fusion approach for upscaling gross ecosystem productivity to the landscape scale based on remote sensing and flux footprint modelling

In order to use the global available eddycovariance (EC) flux dataset and remote-sensing measurements to provide estimates of gross primary productivity (GPP) at landscape (10 1–102 km2), regional (103–106 km2) and global land surface scales, we developed a satellitebased GPP algorithm using LANDSAT data and an upscaling framework. The satellite-based GPP algorithm uses two improved vegetation indices (Enhanced Vegetation Index – EVI, Land Surface Water Index – LSWI). The upscalling framework involves flux footprint climatology modelling and data-model fusion. This approach was first applied to an evergreen coniferous stand in the subtropical monsoon climatic zone of south China. The EC measurements at Qian Yan Zhou tower site (26 4448 N, 1150413 E), which belongs to the China flux network and the LANDSAT and MODIS imagery data for this region in 2004 were used in this study. A consecutive series of LANDSAT-like images of the surface reflectance at an 8-day interval were predicted by blending the LANDSAT and MODIS images using an existing algorithm (ESTARFM: Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model). The seasonal dynamics of GPP were then predicted by the satellite-based algorithm. MODIS products explained 60% of observed variations of GPP and underestimated the measured annual GPP (= 1879 g C m −2) by 25–30%; while the satellite-based algorithm with default static parameters explained 88% of observed variations of GPP but overestimated GPP during the growing seasonal by about 20–25%. The optimization of the satellite-based algorithm using a data-model fusion technique with the assistance Correspondence to: B. Chen (baozhang.chen@igsnrr.ac.cn) of EC flux tower footprint modelling reduced the biases in daily GPP estimations from about 2.24 g C m −2 day−1 (non-optimized,∼43.5% of mean measured daily value) to 1.18 g C m−2 day−1 (optimized,∼22.9% of mean measured daily value). The remotely sensed GPP using the optimized algorithm can explain 92% of the seasonal variations of EC observed GPP. These results demonstrated the potential combination of the satellite-based algorithm, flux footprint modelling and data-model fusion for improving the accuracy of landscape/regional GPP estimation, a key component for the study of the carbon cycle.


Introduction
Growing interest in climate change has stimulated recent research that aims to quantify components of the natural carbon (C) cycle.The eddy-covariance technique (EC) is commonly used to directly measure the CO 2 , water vapour and energy exchange between the atmosphere and terrestrial ecosystems (Baldocchi, 2008).Today, there exists more than 500 EC-flux towers across the continents.EC measurements are a rich source of information on temporal variability and environmental controls of CO 2 exchange between the atmosphere and terrestrial ecosystems (Law et al., 2000).These global EC datasets provide investigators with opportunities and information to (1) explore emergent-scale properties by quantifying how the metabolism of complex ecosystems respond to perturbations in climate variables on diurnal, seasonal, interannual and decadal time scales and elucidate physical and biological controlling factors (Law et al., 2000;Baldocchi, 2008); (2) examine the carry-over effects that may be introduced by either favourable or deleterious B. Chen et al.: Upscaling GPP to the landscape scale conditions during antecedent years (Barford et al., 2001); (3) observe a disturbance and the recovery from it or to span a natural sequence of ecological development coupled with fluctuations in climate (Amiro et al., 2006;Stoy et al., 2006); and (4) test and validate ecosystem process models (Chen et al., 2007;Urbanski et al., 2007), since most of these models span timescales from hours to decades.Although the available EC data have been rapidly accumulating, much of this information is of limited use because of the difficulties/uncertainties in (i) assessing/interpreting the associated measuring biases of EC data and (ii) upscaling of the EC fluxes at the ecosystem (typically less than 1-3 km 2 for each site) to larger scales, e.g.landscape and regional scales.
The EC method is based on measurements of turbulent fluctuations of the vertical velocity and the concentration of a passive tracer.Knowledge of the area's soil and vegetation that impacts the EC flux is clearly important both in planning the site tower location and in the interpretation of measured fluxes (Finnigan, 2004).The adoption of the EC technique to estimate surface exchange is based on the assumption that certain meteorological conditions (e.g.horizontal homogeneity, steady-state, and non-advection) are satisfied (Gökede et al., 2004).Since such conditions are often violated in complex terrain, e.g. at flux monitoring sites in forests, correct interpretation of EC-data is still a matter of some difficulty (Sogachev et al., 2004).In particular, the spatial variability of vegetation density influences the lower atmospheric circulation and surface exchange of energy, water and C over a wide range of scales (e.g., Shen and Leclerc, 1995;Buermann et al., 2001;Cosh and Brutsaert, 2003).As a result, the evaluation of the spatial representativeness of long-term accumulated EC-flux measurements is still challenging (Chen et al., 2009a).
Footprint analysis is a recognized part of the establishment and siting of flux towers and the analysis of their output (Finnigan, 2004).The interpretation of EC flux measurements over a heterogeneous surface depends largely on the footprint over which the fluxes are sampled.The temporal and spatial variability of footprints and the associated influence of varying site heterogeneities on tower flux measurements has yet to be fully investigated, although this information is critically needed for interpretation and for making a more wider use of the globally available EC datasets which have been rapidly accumulating.
In order to use EC flux measurements to provide estimates of components of the natural C cycles at landscape (10 1 -10 2 km 2 ), regional (10 3 -10 6 km 2 ), or hemispheric to global land surface (10 7 -10 8 km 2 ) scales, they must be reasonably "upscaled" using either models and/or remote-sensing measurements (e.g.Earth observation -EO -data).However, it has been proven that, it is an extremely challenging task to scale up those EC measurements from stand-level to a region or global scales because of the large spatial heterogeneity and temporal dynamics of ecosystems across complex landscapes and regions and the nonlinearity inherent in ecophysiological processes (Levy et al., 1999;Chen et al., 2007;Hilker et al., 2008).
Satellite remote sensing can provide consistent and systematic observations of vegetation and ecosystems over large spatial extents on variable spatial and temporal resolutions.For example, the Moderate Resolution Imaging Spectroradiometer (MODIS) is a 36 band spectrometer providing a global data set every 1-2 days.The spatial resolution of MODIS (pixel size at nadir) is 250 m for channel 1 and 2 (0.6 µm-0.9 µm), 500 m for channel 3 to 7 (0.4 µm -2.1 µm) and 1000 m for channel 8 to 36 (0.4 µm-14.4µm), respectively.The LANDSAT TM (Thematic Mapper) and ETM+ (Enhanced Thematic Mapper Plus) sensors onboard the LANDSAT 5 and 7 platform have acquired images of the Earth at 30 m spatial resolution with a 16-day repeat cycle.Data from the satellite-borne MODIS are currently used in the calculation of global weekly gross primary productivity (GPP) at 1-km spatial resolution (Running et al., 2004;Coops et al., 2007).These data with variable spatial and temporal resolutions, however, require appropriate methods for upscaling and interfacing EC-measurements to satellite observations (Drolet et al., 2008;Hall et al., 2008).It is challenging to compare the estimated GPP using MODIS satellite data at a 1-km resolution with the EC-derived GPP because of mismatch between them in spatial scales (Xiao et al., 2004).Whereas, the LANDSAT imagines have fine resolutions (i.e. 30 m), which are ideal for spatial scaling and the comparing of C fluxes derived from satellite-borne and from EC data with the assistance of flux footprint analysis (Chen et al., 2009a).
The seasonal phenologically dynamics of canopy development (leaf flush, expansion, senescence, fall), in relation to their biophysical, biochemical (e.g., chlorophyll and other pigments, nitrogen) and optical properties, in turn influence both biophysical parameters (e.g., albedo, latent and sensible heat fluxes) and biogeochemical parameters (e.g., photosynthesis) of the land surface (Xiao et al., 2004;Li et al., 2007).Therefore, the time series of vegetation indices have the potential to provide valuable insight into the processes (e.g., growing season length and water condition) that regulate ecosystem C exchange.Clearly, there is a need to examine those advanced vegetation indices for all available satellite sensors in relation to leaf phenology and the seasonal dynamics of GPP across the flux tower sites in various biomes.Limited number of studies had evaluated radiometric and biophysical performance of vegetation indices from the LANDSAT data, probably because of the mismatch of spatial resolutions between the LANDSAT data and EC flux towers.
The objective of this study is twofold: (1) to examine biophysical performance of vegetation indices in relation to seasonal dynamics of CO 2 fluxes; and (2) to develop a practical approach for upscaling GPP to the landscape scale using ECtower data and remote-sensing observations and a recently developed footprint model (Chen et al., 2009a).Qian Yan Zhou (QYZ) EC flux tower (26 • 44 48 N, 115 • 04 13 E, and elevation 102 m) located in Southeast China was selected as an experiment site.
The EC data measured at this tower and the LANDSAT and MODIS imagery acquired in 2004 were used in this study.Firstly, a consecutive series of LANDSAT-like images of the surface reflectance at an 8-day interval were predicted by the blending of the LANDSAT and MODIS images using the Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM; Gao et al., 2006;Zhu et al., 2010); Secondly, footprint climatologies of the QYZ tower across multi-temporal scales (i.e.daily, weekly, monthly and annual) were calculated using the Simple Analytical Footprint model on Eulerian coordinates (SAFE; Chen et al., 2008Chen et al., , 2009a)); Thirdly, weekly GPP maps at a 30-m resolution were produced using a light-use efficiency modelling approach (e.g.Xiao et al., 2004) in combination with an optimization algorithm using the data-model fusion technique and footprint modelling; and Finally, these year-round estimates of spatially distributed GPP at a 30-m resolution with and without optimized parameters were integrated with footprint weighting factors to the ecosystem-scale and then compared to that directly derived from EC measurements and MODIS products.

Site description
As part of the Chinese Ecosystem Research Network (CERN) and China FLUX network, The QYZ Experimental Station is located in the subtropical continental monsoon region, the mean annual air temperature and mean annual precipitation were 17.9 • C and 1485 mm, respectively, according to the long-term records of the adjacent weather station (1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004).Most areas in the same latitude zone as QYZ around the world are arid steppes or deserts.The warm and humid environment in QYZ is the result of a unique southeast monsoon.QYZ is located on a gentle undulating terrain with slopes between the crest and valley of the hill from 2.88 and 13.58 • (Wen et al., 2006).The QYZ ecological experimental station was established in 1983, at that time the area was mainly covered with wild grasslands, shrub lands and some sparse Pinus massoniana.After the experimental station setup, the land covers of QYZ were changed much by scientists (Huang et al., 2007).The EC flux tower was established in late August of 2002.The forest cover reaches 90% in the 1-km 2 area surrounding the tower and 70% in the 100-km 2 area (Liu et al., 2006).The dominating trees in the flux footprint area are Pinus elliottii, Pinus massoniana and Cunninghamia lanceolata.The stand characteristics at QYZ based on a survey made in 2005 (Wen et al., 2006) (Wen et al., 2006).(Huang et al., 2007).The understory shrub includes mainly Loropetalum Chinense and Lyonia compta.The soil is red soil, which weathered from red sand rocks.

EC flux tower measured data
Year-round half-hourly flux and meteorological measurements made in 2004 were used in this study.These data include component C fluxes and sensible and latent heat fluxes and air temperature (T a ), wind direction (WD), wind speed (u), standard deviation of wind speed (σ u ), friction velocity (u * ).

Data processing and quality controls
The half-hourly net ecosystem exchange (NEE) was computed as the sum of EC-measured CO 2 flux (F c , positive upward) and the rate of change in CO 2 storage (S c ) in the air column from the ground to the EC measurement height, i.e.NEE = F c +S c .F c was calculated as F c = ρa w c , where ρ a is the mean molar density of dry air, w c is the covariance of vertical velocity (w, in m s −1 ) and CO 2 molar mixing ratio (c, in µmol mol −1 ), where the over bar and prime indicate mean and fluctuation, respectively.A coordinate rotation was applied to bring the mean vertical and horizontal wind speed components to zero for each half hour.Time lags, due to sample travel time and adsorption in the sample line, were determined by maximizing the covariance between w and c. S c was computed as ρa ∂ ∂t c(z)dz based on the profile measurement of CO 2 molar mixing ratio using the c measured in the half hours before and after the current half hour.Net ecosystem productivity (NEP) was calculated as NEP = −NEE.
Half-hourly data were filtered to exclude high rates of errors in the sonic and IRGA error flags typically attributable to heavy rainfall.After calibration, quality control procedures were implemented.Fluxes with statistics that did not fall within reasonable limits and/or occurred during instrumental function were removed.We only accepted the half-hourly NEE measurements under well-mixed conditions indicated by the friction velocity (u * ) being larger than 0.2 m s −1 during nighttime.The valid measured half-hourly NEE coverage ratios after applying data quality criteria and removing the data when u * < 0.2 m s −1 during nighttime for 2004 were 44.7%, which was slightly lower than typical values (∼50%) for continuous EC-measurement towers (Falge et al., 2001).

Gap-filling, C flux partitioning, time integration and uncertainties
The missing half-hourly meteorological data were filled by the combined look-up table and mean diurnal variation methods (Reichstein et al., 2005).For short gaps (<2 h), the missing data were linearly interpolated.Ecosystem respiration (R e ) was estimated as NEE during periods when GPP was known to be zero, i.e., at night and during the cold season.Missed R e at night and daytime R e (R ed ) were estimated as a function of soil temperature and soil moisture with a yearly interval (Reichstein et al., 2002), where R ref , b 1 and b 2 are fitted parameters, R ref is the respiration rate at a reference temperature T ref (here 10 • C was used), and S w is the soil water content (m 3 m −3 ) at 5 cm depth, and T s is the soil temperature ( • C) at 5 cm depth.GPP then was estimated as NEP + R ed (warm season periods when air temperature > 0 • C) or zero (cold season periods when air temperature < = 0 • C).Gaps in GPP were filled using an empirical light response curve, fit to the measured data.The rectangular hyperbolic model (Hollinger et al., 1999;Lee et al., 1999;Griffis et al., 2003) was used, where Q is photosynthetically active radiation (PAR), α is the quantum yield, A max is the light-saturated asymptote for canopy photosynthetic capacity and p w is a time-varying parameter that quantifies the fractional departure of GPP from the mean GPP vs. Q relationship.We evaluated the seasonal variations in p w , α and A max by fitting Eq. ( 2) to measured, daytime NEP and Q data using a flexible, moving window of 240 measured (not-missing) data points, moved in increments of 48 points.The width of the moving window was typically 7-14 days but increased during periods with significant data gaps.Gaps in NEP during the daytime were filled as the difference between modelled R e using Eq. ( 1) and modelled GPP using Eq. ( 2).
Although the methodology used to fill gaps in NEP and partitioning NEP into R e and GPP introduced a small degree of uncertainty in the absolute magnitudes of R e and GPP, sensitivity testing showed that the effects on annual NEP, R e and GPP were relatively small.We also compared our gapfilled dataset with one produced using the neural network method of Papale and Valentini (2003) and found that the two datasets agreed well.
To assign statistical uncertainties to the annual NEP, R e and GPP, we first estimate an error distribution for each halfhour, consisting of (i) instrumentation error calculated via daily difference methods of Hollinger and Richardson (2005) and Richardson et al. (2006), (ii) error associated with the model fit for R e and GPP, and (iii) error associated with filling T s and Q during gaps.These error distributions were then used to generate different 500 simulations of the NEP time series, from which confidence intervals were bootstrapped using the Monte Carlo method (Efron and Tibshirani, 1993).The range of annual NEP estimation was determined using one-sampling percentile method and at 95% confidence intervals (C.I.).Random sampling of NEP error populations was used to determine the uncertainties in the annual sums of NEP.Year-round data were divided into 24 periods (12 seasons, day and night).The uncertainties for gap-filled halfhourly NEP were estimated by randomly drawing from the error population defined by half-hours with valid NEP observations (NEP observed − NEP estimated ).The mean difference between observations and estimates was 0 and such random error in the estimates of annual NEP was found to be within ± 30 g C m −2 .The biases for valid half-hourly NEP observations were also estimated by randomly drawing errors from a double exponential probability distribution with shape factors for daytime and nighttime measurements from Richardson et al. (2006).

Aqusition of LANDSAT and MODIS image data
Seven cloud-free LANDSAT scenes (1 scene of TM and 6 scenes of ETM+, Table 2) acquired between mid-April and the end of November 2004 were available for the study site and were acquired through the USGS (US Geological Survey) GLOVIS portal (http://glovis.usgs.gov/).Additionally, 33 eight-day MODIS composite nadir BRDF-adjusted reflectance (NBAR) products for both the Aqua (MYD09A1) and Terra (MOD09A1) platforms acquired in 2004 with a spatial resolution of 500 m and the corresponding GPP products with a 1-km resolution were obtained from the EOS

Image data corrections
The LANDSAT ETM+ data acquired after the 31 May 2003 Scan Line Corrector (SLC) failure.All of the scenes with SLC− off were gap-filled following the gap-fill algorithm developed by the USGS Earth Resources Observation Systems (EROS) Data Center (EDC), which is available at http: //landsat.usgs.gov/documents/L7SLCGapFilledMethod.pdf.The LANDSAT imagery was georeferenced and atmospherically corrected using the cosine approximation model (COST) of Wu et al. (2005) and radiometrically normalized following the method of Hall et al. (1991) with respect to the 2004 imagery in order to simplify the data comparison.The MODIS data were reprojected to the Universal Transverse Mercator (UTM) projection using the MODIS reprojection tool (Kalvelage and Willems, 2005), clipped to the extent of the available LANDSAT imagery, and resampled to a 30-m spatial resolution using a nearest neighbour approach.

Blending of LANDSAT and MODIS data to predict consecutive LANDSAT-like imagery at an 8-day interval
LANDSAT imagery at a 30-m spatial resolution is wellsuited for characterising landscape-level forest structure and is ideal for comparison of C fluxes derived from satelliteborne and EC data with assistance of flux footprint analysis (Chen et al., 2009a).Its 16-day revisit-cycle, however, generally lengthened by cloud contamination (Ju and Roy, 2007), can limit LANDSAT's use for monitoring biodynamics (Ranson et al., 2003;Roy et al., 2008).At the QYZ site, for instance, there were only 7 scenes available in 2004, which were not sufficient for studying the seasonality of GPP.In contrast, MODIS has a higher temporal resolution and coarser spatial resolutions.Combination of LAND-SAT and MODIS data is able to capitalize on the spatial detail of LANDSAT and the temporal regularity of MODIS acquisitions.In this research, the LANDSAT (see Table 3, bands 2, 3 and 4) and MODIS (bands 1, 2 and 3) acquired in 2004 were blended using the ESTARFM model (Gao et al., 2006;Zhu et al., 2010) in a 30 km × 30 km area centred at the QYZ tower.Reflectance data for the selected MODIS channels with a 500-m resolution were resampled to a 30-m resolution image and were then fused with the LANDSAT data to produce 33 synthetic LANDSAT-like images with an 8-day interval encompassing the 2004 growing season.No significant difference between real (observed) and synthetic (predicted) reflectance values was found by comparing, on a channel-by-channel basis, the surface reflectance values of seven real LANDSAT images with the corresponding closest date of synthetic LANDSAT imagery.Similarly, a pixelbased regression analysis shows that predicted and observed reflectance values for the seven LANDSAT dates were highly correlated (mean r 2 = 0.71 for the NIR band; r 2 = 0.56 for the red band; p < 0.01).

Vegetation indices used in photosynthesis modelling
A number of vegetation indices have been developed for broad-waveband optical sensors which, used in this study, include the Enhanced Vegetation Index (EVI), the Normalized Difference Vegetation Index (NDVI) and the Land Surface Water Index (LSWI).NDVI (Tucker, 1979;Field et al., 1995)   green vegetation cover or vegetation canopy density and has been shown to be well correlated with green LAI and biomass (e.g., Sellers, 1985;Myneni et al., 1995).
EVI is similar in design to NDVI but uses spectral information from the blue band (ρ blue ).Following Huete et al. (1997) it was computed, where G = 2.5, C 1 = 6, C 2 = 7.5, and L = 1.EVI is found to be significantly correlated with the fraction of the photosynthetically active radiation absorbed by leaf chlorophyll in the canopy providing a good surrogate of the spatial variability index for photosynthesis rate.
LSWI (Xiao et al., 2002) is a useful water index and was calculated as the normalized difference between the NIR (0.78-0.89 µm) and AWIR (1.58-1.75µm) spectral bands: LSWI = (ρ nir − ρ swir )/(ρ nir + ρ swir ), where ρ nir and ρ swir are the reflectance of near infrared bands, red bands and short infrared bands, respectively.The produced consecutive LANDSAT-like predictions of surface reflectance with an 8-day interval at an area of 30 km × 30 km centred at the QYZ tower in 2004 (totally 33 scenes) were used to derive these three vegetation indices.

Upscaling framework using data-model fusion technique and processing steps
An algorithm for estimating landscape and regional C fluxes (e.g.GPP) includes the following four steps (Fig. 1): (i) A satellite-based vegetation photosynthesis model (VPM; Xiao et al., 2004) was adopted to produce GPP maps at a 30 m resolution using LANDSAT and climate data measured at the EC tower; (ii) EC flux footprints for the corresponding LANDSAT-like imagery available periods (an 8-day interval) were calculated using a recently developed footprint model (SAFE; Chen et al., 2009a); (iii) By assuming the footprint integration of remotely sensed GPP to be comparable with the EC-derived GPP values, several key parameters of the satellite-based VPM model were optimized using the data-model fusion technique; and (iv) The updated/optimized satellite-based VPM was applied to estimating landscape/regional GPP, which was compared/validated with other satellite data (e.g.MODIS products).

Brief description of the adopted VPM model
Satellite-based studies have used the light-use efficiency (ε) approach to estimate GPP (Prince and Goward, 1995;Running et al., 2000Running et al., , 2004;;Behrenfeld et al., 2001) or net primary production (NPP) (Field et al., 1995;Ruimy et al., 1999).Significant effort and progress have been made in developing the satellite-based GPP algorithms (Running et al., 2004;Xiao et al., 2004Xiao et al., , 2005)).Similar to the MODIS GPP algorithm (Running et al., 2004) and the VPM model (Xiao et al., 2004), the algorithm (Fig. 2) used in this study relies on the light-use efficiency (ε) approach relating GPP to the amount of absorbed photosynthetically active radiation (APAR) (Monteith, 1966(Monteith, , 1972) ) such that, where PAR is the photosynthetically active radiation (in µmol photosynthetic photon flux density, PPFD), f PAR chl is the fraction of PAR absorbed by leaf chlorophyll in the canopy, and ε is the light use efficiency (µ mol CO 2 /µmol PPFD).
Light use efficiency (ε) is affected by leaf phenology, temperature, and water: where ε 0 is the apparent quantum yield or maximum light use efficiency (µmol CO 2 /µmol PPFD) for a given land cover type or vegetation function type, and P m , W m and T m are the modifiers for the effects of leaf phenology, water and temperature on light use efficiency of vegetation, respectively.Different parameters and inputs for the satellite-based algorithm were estimated in different ways: (i) the fraction of  PAR absorbed by leaf chlorophyll in the canopy (f PAR chl ) and the modifiers (P m , W m ) were estimated using the synthetic LANDSAT-like imagery data; (ii) PAR and temperature modifier (T m ) were calculated using climate data (either from tower measurements or climate models); and (iii) the maximum light use efficiency (ε 0 ) was referred to the landcover-related look-up table and then modified/optimized using EC tower C measurements and footprint climatology.
To accurately estimate f PAR chl in forests is a challenge to both radiative transfer modelling and field measurements.Significant efforts and progress have been made in developing advanced vegetation indices that are optimized for the retrieval of f PAR from individual optical sensors (Gobron et al., 1999;Govaerts et al., 1999).In this study, the f PAR chl within the photosynthetically active period of vegetation was estimated as a linear function of the EVI, The parameter P m was estimated using the NDVI and the LSWI and is calculated at two different phases, depending upon the life expectancy of leaves (deciduous versus evergreen): During bud burst to leaf full expansion 1 After leaf full expansion .
The timings of bud burst and leaf full expansion can be identified using NDVI.The effect of water on plant photosynthesis (W m ) has been estimated as a function of available soil content in the plant root zone and water vapour pressure deficit (VPD) in a number of process-based ecosystem models (e.g.Chen et al., 2007) and remote-sensing based models (e.g.Running et al., 2000).Soil moisture represents water supply to the leaves and canopy, and VPD represents evapourative demand in the atmosphere.Leaf and canopy water content is largely determined by the dynamics of both soil moisture and VPD.As the first order of approximation, here following the alternative and simple approach that uses a satellite-derived water index (Xiao et al., 2004), the seasonal dynamics of W m was estimated, where α is a magnifier (its default value equals 1.0) and LSWI max is the maximum LSWI within the plant growing season for individual pixels.The temperature modifier T m was estimated at each time step, using the equation developed for the terrestrial ecosystem model (Raich et al., 1991), where T min , T max and T opt are the minimum, maximum and optimal temperature for photosynthetic activities, respectively.Their default values are respectively set to be 0, 35 and 20 • C in this study.If air temperature falls below T min , T m is set at zero.The ε 0 values vary with vegetation types and the information about ε 0 for individual vegetation types can be obtained from a survey in the literature (Ruimy et al., 1995) and optimized using EC tower measurements.According to the work (Zhang et al., 2006), the default ε 0 value was estimated to be 0.032 µmol CO 2 /µmol PPFD in this study stand in 2004.

Modelling daytime footprint and footprint climatology
We used the SAFE footprint model (Chen et al., 2009a), based on two-dimensional Eulerian advection-diffusion equation (Horst and Weil, 1992), to compute the footprint function.Both atmospheric stability and the wind velocity power law above the canopy are taken into account in the model, allowing it to be applicable to all conditions of atmospheric stability.The detailed description of the footprint model and the footprint climatology calculations can been found in Chen et al. (2009a).
The inputs of the footprint model include the EC sensor height (h m ), canopy height (h c ), roughness length (z 0 ), friction wind speed threshold for EC flux calculation (u th * ), and half-hourly meteorological variables (WD, u, σ u and u * ) and sensible and latent heat fluxes measured at the EC sensor height.Values of h m , h c , u th * and z 0 for the QYZ site are 39.6 m, 13 m, 0.2 m s −1 and 0.1 m, respectively.
The flux footprints were calculated at a grid size of 30 m × 30 m (consistent with the LANDSAT spatial resolution) covering the area (domain) centred on the towers of 6 km ×6 km.The model was run at half-hourly time steps during daytime when photosynthesis is on.The daytime www.biogeosciences.net/7/2943/2010/Biogeosciences, 7, 2943-2958, 2010 B. Chen et al.: Upscaling GPP to the landscape scale half-hourly footprint f ( x,y) was rotated along the wind direction and accumulated to yield footprint climatology at 8day intervals and at monthly and annual time steps, φ (x,y).
The total footprint = φ dxdy within the model domain ( ) was normalized to equal 1.The calculated footprint provides a map of the contribution for the area around the tower to the integral EC-measured flux component.

Optimizing the VPM model's parameters
The EC technique is based on measurements of turbulent fluctuations of the vertical velocity and the concentration of a passive tracer.The EC flux tower measured flux (e.g.GPP), representing an integrated flux over its footprint area (typically 1-3 km 2 ), is unable to simply compare with the remotely sensed GPP based on the LANDSAT (30-m resolution) or MODIS (1-km resolution) images because their spatial scales do not match.We first up-scale the GPP field at a 30-m resolution derived from LANDSAT to the ecosystem scale and then minimize the differences between the modelled and EC-measured GPP using a data-model fusion algorithm to optimize the adopted VPM model's parameters.

Scaling of remotely sensed GPP to the ecosystem scale
The ecosystem-scale overall GPP (F GPP,rs ) was up-scaled from the spatial distributed 30-m GPP field (F GPP,rs ), where φ pure is the pure footprint and is the upwind footprint source area for the accumulative footprint percentage of 99%.The footprint function φ pure was estimated using the SAFE model and F GPP,rs was estimated from the data-fused LANDSAT-like images using the adapted VPM model and both of them were at a 30-m resolution.The footprint integrated F GPP,φ matches the scale of the EC derived GPP.

Model parameter optimization algorithm
The Ensemble Kalman filter (EnKF) data-model assimilation technique, known as a stochastic-dynamic system, was used in this study.The optimum values of the model parameters, therefore, are assumed to correspond to the minima of the cost function J (x) (Tarantola, 1987), where x is the vector of unknown parameters and x b is the a priori values of x, O is the vector of observations (i.e.EC derived GPP), Y is the nonlinear model (VPM).The covariance matrices C o and P b describe a priori uncertainties on observations and on parameters, respectively.Both matrices are diagonal as we assume these two kinds of uncertainties to be independent.The model parameters for MODIS and LANDSAT were respectively optimized using Eq. ( 13) by setting Y (x) to be the VPM simulated GPP at the MODIS pixel which contains the QYZ tower and to be the footprint integrated GPP from the modelled GPP field based on the synthetic LANDSAT-like images using Eq. ( 12).Following Diego et al. (2007), we adopted a gradient-based algorithm which converges more rapidly than standard Monte-Carlo methods to solve Eq. ( 13) for optimizing model parameters, typically converging to a minimum of J (x) within 100 iterations.Nine parameters of the VPM model were optimized for each MODIS scene and synthetic LANDSAT-like scene (Table 3) such that the parameters are allowed to vary seasonally with a temporal resolution of approximately 8 days.

VPM model verification and comparison
GPP fields over a 30 km × 30 km region centred at the QYZ tower were estimated from all the 33 MODIS reflectance images (500 m resolution) and the synthetic LANDSATlike scenes (30 m resolution) in 2004, respectively, using the adopted VPM model in two scenarios: with default static parameters (the priori values in Table 3) and with the optimized parameters which varied seasonally.The calculated 8-day footprint climatologies overlain on the respective LANDSAT GPP maps and the ecosystem-level GPP values were estimated using Eq. ( 12).The footprint integrated LANDSAT GPP values (VPM estimated with both optimized and non-optimized parameters) were validated/compared with the EC derived GPP and also compared with the MODIS GPP products and the predicted GPP using the optimized VPM model for the pixel which contains the QYZ tower.The MODIS GPP values used in this study were the averages of the 8-day composite GPP values acquired from the Terra and Aqua platforms because their acquired time (descending node) over the QYZ site were 10:30 and 13:30 LT (local time), respectively.The MODIS GPP fields at a 1-km resolution were re-sampled to a 30 m resolution for comparison with modelled LANDSAT GPP using VPM at a 30-m resolution.Linear regression analysis between the EC measured GPP and the remotely sensed GPP was chosen to test the optimized VPM model behaviour.Root mean square error (RMSE) was used to estimate the model errors.The significant levels of differences between two data series were detected using t-test.

Seasonal variations in C fluxes
As shown in Fig. 3, NEP, GPP and R e exhibited significant seasonal variability.Photosynthesis was active through the Measured net ecosystem productivity (NEP) was partitioned into gross primary productivity (GPP) and ecosystem respiration (R e ) using procedures described in Sect.2.2.
year, exhibiting high rates during May to September and showing low rates subjected to drought events.

Variations in seasonal footprint climatology
Figure 4 shows the pure mean monthly daytime footprints and the corresponding cumulative footprint contours for every other month in 2004 for QYZ.Most months showed elliptical shapes of the footprint distributed along the NNW-SSE prevailing wind directions while the seasonal variations in size and orientation of footprint for QYZ were obvious.The areas of 90% cumulative footprints were less than 1 km 2 in the winter months whilst as large as 2.5 km 2 in the summer months.

Comparing estimated values of remotely sensed and EC derived GPP
The remotely sensed weekly mean GPP maps at a 30-m resolution were produced based on the synthetic LANDSATlike imagery data using the VPM model in two scenarios.
Figure 5 shows a comparison between annual mean GPP maps estimated using VPM with static default parameters (Fig. 5a) and with optimized dynamic parameters (Fig. 5b) and MODIS products (Fig. 5c: average of Aqua and Terra).
The GPP fields at a 30-m resolution were up-scaled to the ecosystem scale by overlaying the weekly footprint φ pure using Eq. ( 12).The regression analysis results of remotely sensed (up-scaled from the spatially distributed 30-m GPP fields predicted by VPM with static default and optimized dynamic parameters and MODIS products and predicted GPP using MODIS imagery data with optimized VPM) versus the EC-derived weekly ecosystem-scale GPP are shown in Fig. 6 and Table 4. MODIS GPP products explained 60% of observed variations of GPP but apparently underestimated EC-derived GPP (slope = 0.7).The VPM predicted MODIS GPP at the tower's pixel with the optimized parameters can explain 90% of the EC derived GPP but with a large RMSE value of 1.41 g C m −2 day −1 .The VPM predicted LANDSAT GPP with the static default parameters explained 88% of the variations of GPP but overestimate EC-derived GPP (slope = 1.7).The VPM model with optimized parameters significantly improved the GPP estimates (RMSE decreasing from 2.24 to 1.18 g C m −2 day −1 , the difference between these two estimates was significant, studenttest p < 0.001, Table 4).Figure 7 compares the seasonal dynamics of these four different remote estimates of ecosystem-scale GPP with the EC derived GPP.The remotely sensed GPP reasonably captured the seasonal variations of observed GPP.During winter time (when GPP < =∼3 g m −2 day −1 ), the remote-sensing algorithms underestimated the measured GPP, while the MODIS products underestimated the EC-derived GPP through the year.The estimation of MODIS GPP using the VPM model with optimized dynamic parameters was improved with the default static parameters during warmer months (i.e. when the EC-derived GPP >=∼4 g m −2 day −1 ).The non-optimized VPM model using the LANDSAT data overestimated the observed GPP from April through to September (when GPP >=∼7 g m −2 day −1 ).The optimization of VPM with the combining footprint and data-model assimilation algorithm reduced the differences between the remotely sensed GPP and the EC-derived GPP.
The annual total GPP values in 2004 at the ecosystem scale were 1879,1751,1979,1763 and 1361 g C m −2 yr −1 for the EC-derived, predicted using LANDSAT data with and without optimization and the tower pixel's GPP of MODIS with and without (products) optimization, respectively.The difference in estimated annual GPP between with and without optimization was 228 g C m −2 yr −1 (=∼−12% of the observed annual GPP of 1879 C m −2 ).We note that the overall annual bias of the non-optimized VPM model was smaller than that of the optimized model (100 vs. −128 g C m −2 yr −1 ) due to the modelling error compensation of the overestimation during summer months while underestimation during winter months using the nonoptimized VPM model (see Fig. 7).
Figure 8 shows the differences in estimated GPP between without and with the optimized footprint integrated algorithm (non-optimized -optimized) using the synthetic LANDSAT-like images in 2004 over a 30 km × 30 km region centered at QYZ.The estimated GPP values with the optimized VPM model were systematically lower than that without optimization and their differences varied from 0 to 371 g C m −2 yr −1 with large spatial variations.Figure 9 shows the spatial variations in the differences of the estimated annual GPP between derived from LAND-SAT and MODIS (LANDSAT -MODIS) over the same region as Fig. 8.The differences of annual GPP between LANDSAT and MODIS (LANDSAT -MODIS) varied from about −75% to 125% of observed GPP (=−1400 to 2400 g C m −2 yr −1 ).The estimated GPP values from LAND-SAT were larger than that from MODIS in the north area to the tower while lower in the south-east area to the tower (Fig. 9).

Discussion
The performance of a new algorithm based on the advanced vegetation indices using the LANDSAT data was evaluated with assistance of a footprint model.A good agreement between predicted with optimized parameters and EC measured weekly GPP in 2004 in an evergreen needleleaf forest at QYZ (R 2 = 0.92, p < 0.001, Fig. 6) indicates that there exist a good quantitative relationship between the LANDSAT vegetation indices and CO 2 flux data, in terms of the seasonal phase and magnitude of photosynthesis.However, the discrepancies between remotely sensed and EC-derived daily GPP are still large (∼22.9%),especially in winter and in summer when  photosynthesis was suppressed by drought (Fig. 7).Those discrepancies may be attributed to three sources of errors.The first source is errors in the time-series data of vegetation indices from LANDSAT satellite images.We used the synthetic weekly LANDSAT-like data that have no BRDF   correction or normalization and, thus, the effect of angular geometry on surface reflectance and vegetation indices remained.The predicted and observed reflectance values were significantly correlated, however, the former can only explain the variance of the latter about 71% for the NIR band and 56% for the red band.EVI is a semi-empirical mathematic transformation of observed reflectance from individual spectral bands (blue, red and NIR) of optical sensors (Huete et al., 2002).The second source is the biases in footprint climatology estimates, which will lead to uncertainty in the optimization of VPM model's parameters and transfer errors to the scaled ecosystem-level GPP.SAFE applies to most conditions of atmospheric stability because its mathematical framework is a stationary gradient diffusion formulation with height-independent crosswind dispersion and the assumption of independence of vertical and crosswind dispersion allows the reduction of the continuity equation to a two-dimensional advection-diffusion equation.Kljun et al. (2003) reported a model comparison result showing that in stable or neutral conditions, when the mean plume height was low, the analytical models tend to be biased with a long tail, shorter peak value and the peak location farther from the receptor.To avoid the model biases resulting from this limitation of the analytical model, the source area is considered to be all of the grids (cells) that have f (x,y,z m ) larger than the cutoff point, where a given cumulative fraction (e.g.95%, or 97%) is achieved (Chen et al., 2009a).The biases in footprint climatology estimation using SAFE approximated within 5% by model comparison.The third source is the errors in EC-derived GPP.The EC measurements themselves are not free from error.The partitioning of NEE into its component fluxes (i.e.GPP and R e ) has large uncertainties (Falge et al., 2001;Chen et al., 2009b).The two major steps to derive GPP are the gap filling of NEE and estimation of daytime ecosystem respiration, and both of them carried on uncertainties.The random error in the estimates of annual NEP at QYZ was found to be within ± 30 g C m −2 using multiple assessing methods.In summary, the biases in estimated GPP at the landscape scale may be mainly led by the uncertainties associated with the blended weekly LANDSAT-like data because of the limited data availability of LANDSAT and the limitation of the ESTRFM model.
MODIS products underestimated annual GPP by 25-30% while the VPM model with default static parameter overestimated GPP during the growing season about 20-25%.Their biases in daily GPP estimation (RMSE) were 2.38 and 1.18 g C m −2 day −1 (Table 4), respectively.The bias in GPP estimates at daily time scale using the optimized VPM model with assistance of footprint analysis decreased from 2.24 (non-optimized, ∼43.5% of mean measured daily value) to 1.18 g C m −2 day −1 (optimized, ∼22.9% of mean measured daily value).The optimized VPM model significantly improved the GPP estimation may be mainly attributed to the sensitivity of the remote-sensing algorithm to seasonality of PAR and T m and the parameters in EVI calculation (Eq.4).The parameters for estimating T m in Eq. ( 11), T min , T max and T opt are vegetation-type and climate-zone dependent, and may vary with different seasonal phases.In the modelling scenario without optimized parameters, we simply assume those parameters have no seasonal variations.This may lead to ill parameterization of T m , such as over-corrected (smaller values of T m ) during the low-temperature periods while under-corrected (larger values of T m ) during the hightemperature periods (see Fig. 7).
It is worthy to notice that the footprint integrated LAND-SAT GPP with optimized parameters is significantly different from estimated MODIS GPP at the tower's pixel using the VPM model with the same optimization algorithm (Fig. 7).This mainly reflects the high land surface heterogeneity (including land cover types, vegetation indices, etc.) within the EC flux footprint area.The EC flux footprint is generally less than 1-3 km 2 and varies depending on the wind speed, wind direction and the atmospheric stability.The spatial resolutions of MODIS (pixel sizes at nadir vary from 250 m to 1 km for different channels) are apparently too coarse for linking the estimated MODIS C fluxes to the EC measured C fluxes.Caution should be taken in comparing the estimated MODIS GPP with the EC-derived GPP (e.g.Xiao et al., 2004) because of mismatch between them in spatial scales.A good agreement between the MODIS annual GPP at the tower's pixel and the EC derived GPP does not imply that the modelled or upscaled landscape/regional GPP using MODIS imagery data has high accuracy.We obtained higher accuracy in predicted daily GPP based on LANDSAT imagery data at a 30-m resolution than MODIS data at a 500m resolution (Table 4) with the same model-parameter optimization algorithm (Eq.13).We, therefore, concluded that the LANDSAT imagines with a 30-m resolution are ideal for optimizing the satellite-based models (e.g.VPM) from EC data with assistance of flux footprint analysis (Chen et al., 2009a) and consequently the optimized models can be further used to extrapolate the EC measured C fluxes to landscape/regional scales.The optimized parameters of VPM with EC measurements made at the QYZ tower are appropriate for similar climate regions with the same dominant vegetation type of evergreen conifer.However, they may not be suitable for other ecoregions or other vegetation types because different vegetation types generally have markedly different functional (e.g.photosynthetic) processes.The algorithm developed in this study for upscaling GPP to the landscape and regional scales based on EC-tower measurements and LANDSAT and MODIS imagery may be applicable to most of the world-widely distributed EC flux towers (current more than 500 EC towers across the globe).Its application to other tower sites may need to be further examined, and it would also be useful to cross-verify with other independent upscaling methods (e.g.top-down estimations based on CO 2 concentration measurements).
The maximum light use efficiency (ε 0 ) is the basis for the remote-sensing based algorithms or models and the accurate estimating of ε 0 is one of the key steps for using the satellite data to estimate either GPP or NPP (Running et al., 1999).In nature, ε 0 is determined by many biological and biophysical factors and soil nutrient conditions.Much attention should be given to the variability of ε 0 among vegetation www.biogeosciences.net/7/2943/2010/Biogeosciences, 7, 2943-2958, 2010 types across a heterogeneous landscape (Li et al., 2007).Since ε 0 differs significantly among vegetation types, these differences should be accounted for when estimating GPP using remotely sensed data.These globally available EC datasets provide investigators with opportunities to estimate the ecosystem-scale photosynthetic (including ε 0 ) and respiratory parameters.A widely-used method for those parameter estimations is the Michaelis-Menten approach (Falge et al., 2001).Caution is advised when we apply the EC-derived ε 0 to remote-sensing-based algorithms for GPP estimates because: (i) there are large uncertainties in the EC-derived ε 0 ; (ii) the values of EC-derived ε 0 vary seasonally and interannually; and (iii) the EC-derived ε 0 represents the EC flux footprint area, whose sizes and orientations vary with time, in other words, the spatial representation of the EC-derived ε 0 is normally different from the satellite image pixels.Making use of the satellite data with fine resolution (e.g.LANDSAT data) to estimate GPP, flux footprint modelling should be involved to optimize the remote-sensing-based algorithms' parameters, such as ε 0 , T m , etc.These optimized parameters can then be transfered to the applications of other satellite images with coarse resolution (e.g.MODIS) by applying data-model fusion techniques.Combining vegetation indices (e.g.EVI, LSWI) from different multi-temporal/spatial satellite sensors' data, climate data (PAR, temperature), optimized ε 0 parameter for individual vegetation types, and flux footprint modelling and data-model fusion, the remotesensing-based algorithms is a powerful tool for estimating landscape/regional or global GPP.

Fig. 1 .
Fig. 1.Flow diagram of the GPP upscaling algorithm based on LANDSAT and EC tower data.

Fig. 2 .
Fig. 2. Flow diagram of the major processing steps of the LANSATbased GPP algorithm.The definitions of symbols and abbreviations are given in text.

Fig. 3 .
Fig.3.Five-day ensemble means of CO 2 component fluxes for the year of 2004 measured at the Qian Yinzhou flux tower in an evergreen needleleaf forest, China.Measured net ecosystem productivity (NEP) was partitioned into gross primary productivity (GPP) and ecosystem respiration (R e ) using procedures described in Sect.2.2.

Fig. 4 .
Fig. 4. Mean monthly daytime pure footprints and the corresponding cumulative footprint contours for every other month in 2004 for the QYZ flux tower in an evergreen needleleaf plantation, China.

Fig. 5 .
Fig. 5. Remotely-sensed annual total gross primary production (GPP) map for 2004 at a 30-m resolution over a 30 km ×30 km region centred at the QYZ tower modelled using VPM with default (a) and optimized (b) parameters.The MODIS products at 1-km resolution are also shown for comparison.

Fig. 6 .
Fig. 6.A comparison of overall 8-day gross primary production (GPP) values between the observed (EC-derived) and predicted (remotely sensed) in 2004 at an ecosystem scale in an evergreen needleleaf forest at QYZ, China.

Fig. 7 .
Fig. 7.A comparison of the seasonal dynamics between the observed and remotely sensed gross primary productivity (GPP) in 2004 in an evergreen needleleaf forest at QYZ, China.The inset shows the cumulative curves of the five estimations of GPP from the beginning to the end of the year 2004.

Fig. 8 .
Fig.8.Differences in estimated annual total gross primary production (GPP) using the VPM model -between, without and with the optimized footprint integrated algorithm (non-optimized -optimized) based on the synthetic LANDSAT-like images in 2004 in an evergreen needleleaf forest over a 30 km × 30 km region centered at QYZ, China.

Fig. 9 .
Fig. 9. Differences in annual total gross primary productivity (GPP) between the values derived from the synthetic LANDSATlike images with the optimized footprint integrated algorithm and the MODIS products (LANDSAT -MODIS) in 2004 in an evergreen needleleaf forest over a 30 km × 30 km region centered at QYZ, China.

Table 1 .
are shown in Table 1.According to the field measurements in August of 2003, the leaf area index (LAI) of the plantation was 4.5 Stand characteristics at Qian Yinzhou experiment station in southeastern China * .

Table 2 .
Acquired LANDSAT imagery information for the QYZ site * .The scene path and row are 122 and 041, respectively. *

Table 3 .
Optimized VPM model parameters a subtropical plantation coniferous forest at QYZ, China.

Table 4 .
Characteristics of linear regression analysis (y =ax +b) of 8-day averaged values of remotely sensed GPP versus that of ECderived GPP.The comparative model bias (%) was estimated as RMSE/Mean × 100% and the ECderived mean daily GPP in 2004 at QYZ site was 5.15 g C m −2 day −1 . a