Improving a plot-scale methane emission model and its performance at a northeastern Siberian tundra site

In order to better address the feedbacks between climate and wetland methane (CH 4) emissions, we tested several mechanistic improvements to the wetland CH 4 emission model Peatland-VU with a longer Arctic data set than any other model: (1) inclusion of an improved hydrological module, (2) incorporation of a gross primary productivity (GPP) module, and (3) a more realistic soil-freezing scheme. A long time series of field measurements (2003–2010) from a tundra site in northeastern Siberia is used to validate the model, and the generalized likelihood uncertainty estimation (GLUE) methodology is used to test the sensitivity of model parameters. Peatland-VU is able to capture both the annual magnitude and seasonal variations of the CH 4 flux, water table position, and soil thermal properties. However, detailed daily variations are difficult to evaluate due to data limitation. Improvements due to the inclusion of a GPP module are less than anticipated, although this component is likely to become more important at larger spatial scales because the module can accommodate the variations in vegetation traits better than at plot scale. Sensitivity experiments suggest that the methane production rate factor, the methane plant oxidation parameter, the reference temperature for temperature-dependent decomposition, and the methane plant transport rate factor are the most important parameters affecting the data fit, regardless of vegetation type. Both wet and dry vegetation cover are sensitive to the minimum water table level; the former is also sensitive to the runoff threshold and open water correction factor, and the latter to the subsurface water evaporation and evapotranspiration correction factors. These results shed light on model parameterization and future improvement of CH4 modelling. However, high spatial variability of CH4 emissions within similar vegetation/soil units and data quality prove to impose severe limits on model testing and improvement.


Introduction
Northern high latitudes contain a large quantity of potentially climate-vulnerable carbon (Van Huissteden and Dolman, 2012;Hugelius et al., 2013).Peatlands are a common feature of this region and may cover an area as large as 4.0 × 10 6 km 2 (Yu et al., 2010).
Methane (CH 4 ) emissions from peat soils strongly influence the atmospheric CH 4 concentration (Yu et al., 2013;IPCC, 2007;Umezawa et al., 2012).These emissions are the net result of a balance between CH 4 production by methanogenic microorganisms within anaerobic soil and CH 4 oxidation by methanotrophic microorganisms in aerated soil and in plants (Van Huissteden et al., 2009).These processes are controlled by water table position, soil temperature, methane transport pathways, and substrate availability and quality (Walter and Heimann, 2000).
Key drivers of methanogenesis and oxidation are predicted to change (IPCC, 2013).The feedbacks between climate and peatland CH 4 emission, however, are complex.Precisely how climate change will impact the northern high latitudes Published by Copernicus Publications on behalf of the European Geosciences Union.
is not fully understood.Increases in air temperatures and/or precipitation levels may result in an increase in active layer depth, anoxic soil conditions, and raised soil temperatures, therefore producing elevated CH 4 emissions.Alternatively, CH 4 emissions may decrease if permafrost degradation improves soil drainage (Van Huissteden et al., 2011).The latter is particularly true in discontinuous permafrost regions (Smith, 2005).
To constrain these uncertainties, several research stations around the Arctic have started to measure methane fluxes and associated parameters.However, field measurements from the northern high latitudes are spatially limited due to the vast size and remoteness of this region, the extreme climate, and logistical difficulties.Process-based computer simulations could bridge the gap between the field situation and the global knowledge level, and increase our understanding of these feedbacks.Ultimately, this process enables us to make observationally justifiable projections of future climate via the up-scaling of available field data.
Several process-based models have been developed to simulate CH 4 fluxes at different spatial scales (Liebner et al., 2011;Cao et al., 1996;Christensen et al., 1996;Bekki and Law, 1997;Wania et al., 2009Wania et al., , 2010;;Zhuang et al., 2004).Compared with large-scale models, site-scale models have the advantage of using site-specific chemical and physical properties and plant data to parameterize the model, thus enabling an interactive process representation of soil, biosphere, and atmosphere.In addition, they can be validated against field measurements under a variety of conditions (Granberg et al., 2001;Zhang et al., 2002;Arah and Stephen, 1998;Van Huissteden et al., 2006;Comer et al., 2000).
Conversely, large-scale modelling of CH 4 fluxes always requires aggregated and simplified information on vegetation and soil, which are more difficult to parameterize or validate.However, these models, particularly when coupled to climate models, are highly important in understanding the feedbacks between climate change and CH 4 emissions (Gedney, 2004;Kaplan, 2002;Chen and Prinn, 2006;Nicolsky et al., 2007).
Large-scale models are usually up-scaled from small-scale models (Walter et al., 2001;Segers and Leffelaar, 2001).If a parameter has a strong influence on the modelled fluxes on the plot scale, then it is likely that it also has a large influence in an up-scaled version of this model.Depending on model structure, this may hold for other models that use the same or similar parameters (Van Huissteden et al., 2009).Careful parameter sensitivity analysis and analysis of the effects of model structure are therefore necessary to reduce modelling uncertainty and eliminate the parameters that do not contribute significantly to model accuracy in order to eliminate redundant computational enterprise.
In this study we test the updated plot-scale version of the methane emission model Peatland-VU on the Kytalyk tundra site in northeastern Siberia, which has a long data series (2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010).The new model version (compared to Van Huissteden et al. (2006)) has been improved by (1) including a hydrological module to dynamically calculate the water table when actual water table data are not available, as this is a key factor controlling the environment for CH 4 production; (2) adopting a gross primary productivity (GPP) module to simulate gross primary productivity for each vegetation type, which affects the substrate availability; and (3) improving the soil-freezing scheme with a more realistic calculation of the soil thermal conductivity.
We employ the GLUE (generalized likelihood uncertainty estimation) methodology (Lamb et al., 1998;Beven, 2004;Van Huissteden et al., 2009) to test the model sensitivity with validation data from on-site chamber measurements.We also test the effects of changes in model structure, such as the addition of the GPP module.

Study area
The study area is located in the Kytalyk nature reserve (70 • 49 N, 147 • 29 E) in Sakha Republic, northeastern Siberia (Fig. 1).This is a continuous permafrost zone with a permafrost thickness in excess of 300 m (Shahgedanova, 2002).
The climate of this region is continental.Climatic records from the nearest weather station (Chokurdakh, 70 • 37 N, 147 • 53 E; elevation 48 m; Köppen climate classification ET; polar tundra; data from 1961 to 1990) show a mean annual air temperature of −14.3 • C, a July average of 9.5 • C, and a January average of −34.6 • C. The mean annual precipitation is 232 mm, approximately half of which falls as rain during the growing season, whilst the rest falls as snow.Although this amount of precipitation is similar to the yearly total in semi-arid areas, total evaporation is much lower; thus the soil remains very wet and the wilting point is normally not exceeded (Parmentier et al., 2011).
Summer temperatures are highly variable due to the large contrast between winds from the north and south: northern winds blow cold air from the East Siberian Sea, while southern winds bring hot summer air from the Siberian interior.The wind direction can change over hours, and therefore affects both air and soil surface temperature.Through this effect on temperature, wind direction has an indirect influence on the CH 4 flux.
Snowmelt occurs between mid-May and the start of June, while bud break begins at the end of June or early July.The growing season usually stops at the end of August or early in September, when temperatures begin to drop below 0 • C and sunlight diminishes.
We include two morphological units in this study: the frequently inundated river floodplain and a river terrace with tundra vegetation.Floodplain sediments consist of silt or silty peat in the back swamps, and silt and fine sand in the levees.Mineral soils dominate, but soils below dense sedge  (Rekacewicz, 1998).Right: landscape of Kytalyk site (Van Huissteden et al., 2005).
vegetation have a thin (< 10 cm) organic horizon.Active layer thickness ranges from 25 to 55 cm, averaging 42 cm, and is thickest below the levees.The river terrace soil consists of silt overlain by 15 to 30 cm of peat.The active layer in dry areas ranges from 12 to 28 cm, while it is markedly thicker in wet areas (22 to 50 cm) (Van Huissteden et al., 2005).
The floodplain levees are overgrown with tall Salix shrubs, while the back swamps are dominated by meadows with grasses and sedges, ranging from low grass (Arctophila fulva) in the lowest, wettest parts to tall grasses and sedges (Carex arctisibirica, Ranunculus glacialis, Eriophorum angustifolium) in the slightly less wet parts.The tundra terrace shows a larger diversity of vegetation types: dry shrubs such as Betula nana and Salix pulchra in dryer regions (polygon rims, low palsas), moist tundra with Eriophorum vaginatum tussocks in less well drained areas, and wet areas with Sphagnum sp. and Carex sp. and some Eriophorum angustifolium.
The water table depth is spatially heterogeneous.In the sedge-dominated wet depressions it could be 20 cm above the soil surface, while in the shrub-dominated dry areas the soil may be unsaturated down to the top of the permafrost.Although the water table depth is highly heterogeneous, soil moisture changes are less abrupt and most soils remain moist during the growing season.
Depending on geomorphology, soil moisture content, and vegetation, the measurements are categorized into floodplain dry (FD) and wet (FW1, FW2) groups and river terrace dry (TD) and wet (TW1, TW3, TW4) groups, where "wet" is defined as largely water-saturated soils, with a water table not lower than 5 cm below the surface.These locations are shown in Fig. 1.
FD is associated with floodplain levees, dominated by tall Salix shrub.FW1 and FW2 are floodplain back swamps with different vegetation.TD represents higher elevations in the micromorphology associated with permafrost and high soil ice content (low palsas and ridges along ice wedges), with thin active layers and unsaturated soil horizons above the permafrost.TW1 is situated in diffuse drainage lines consisting of interconnected depressions and characterized by dense and species-poor Carex eriophorum vegetation.TW3 and TW4 are associated with ice wedge polygon centres, which show more species-rich vegetation with Sphagnum sp., sedges, and Potentilla palustris.

Model description
Peatland-VU is a process-based model designed to simulate CH 4 and carbon dioxide (CO 2 ) flux from a column of peat soil (Van Huissteden et al., 2006).In this study we focus on improving the performance of the CH 4 flux module and applying it to permafrost peat soils.
The CH 4 flux depends on CH 4 production in the anaerobic soil zone, consumption by methanotrophic bacteria in the aerated zone, and the different transport pathways to the atmosphere (Walter and Heimann, 2000).The model subdivides the soil column into 15 layers of equal thickness (0.1 m) and calculates the flux rate of each layer, before integrating over all layers to obtain the total flux: where D CH 4 (t, z) is the CH 4 concentration at time t and depth z, F diff is the diffusive flux, Q eb and Q pl represent ebullition and plant transport, and R pr and R ox are the CH 4 sources and sinks due to CH 4 production and oxidation.
CH 4 production is temperature-dependent and linearly related to substrate availability (Eq.2), where R 0 (µM h −1 ) is a constant rate factor, C fresh is the fresh carbon concentration in the soil carbon reservoirs (Fig. 2), T (t, z) is the soil temperature at depth z and time t, and T ref is a reference temperature approximating the mean summer soil temperature below the water table.
The soil temperature gradient is calculated using a thermal diffusion equation.The thermal diffusivity is estimated based on the volumetric heat capacity and thermal conductivity.An improved method for estimating soil thermal conductivity from generic soil composition data (Balland and Arp, 2005) has replaced the calculation in the previous version of Peatland-VU, which was based on Williams and Smith (1991).This adjustment allows the model to simulate a more realistic active layer depth due to a better approximation of the thermal conductivity of the frozen soil.
The substrate pool is the sum of several labile soil organic matter (SOM) classes (manure, root exudates, dead roots and litter, dead microbes) and stable SOM (peat, humic matter).The labile SOM reservoirs are replenished by GPP and manure addition (the latter is used in managed wetlands and not included in the experiments described here).In the previous version, GPP is estimated from the temperature of the topmost 10 cm soil or fed by another model.We incorporate a GPP module into Peatland-VU in this study, which has two options: method (i), adopted from Lund-Potsdam-Jena Dynamic Global Vegetation Model (LPJ) (Sitch et al., 2003;Haxeltine et al., 1996), within which GPP is dependent on vegetation structures and phenology, driven by input of climatology, soil type, and atmospheric CO 2 concentration and method (ii), a simpler set of equations based on Shaver et al. (2007), within which GPP is primarily related to temperature, light, and leaf area index (LAI).
CH 4 oxidation is temperature sensitive and its calculations depend on the CH 4 concentration at each time step (Eq.3), where K m (µM) and V max (µM h −1 ) are the Michaelis-Menten constants.Q 10,ox determines the temperature sensitivity of the process.
The anaerobic-aerated zone boundary is defined by water table, which in the previous version of Peatland-VU is prescribed.We add a hydrological module based on Granberg et al. (1999) and Yurova et al. (2007) to simulate dynamic water table positions.This has been extended to include saturated zone water transport dependent on water head, distance to nearest drainage line, and saturated hydraulic conductivity.
Finally, the model simulates CH 4 transport in three ways: transport by diffusion above and below water

Measurements
Methane flux data are available for Kytalyk for every year between 2003 and 2010, recorded using round static chambers attached to an Innova 1312 photoacoustic gas analyser (Van Huissteden et al., 2005).
During the campaign days, the flux data were collected from 53 different chamber placements that are representative of the measurement groups mentioned above.Accompanying each flux measurement, active layer depth and water table level were measured manually next to the chamber collar.These data are compared with model simulations.Despite all measurements having been carefully screened, uncertainties in the data are inevitable and may lead to poor model-data fits.
Key sources of uncertainties include induced ebullition during measurements or possible chamber leakage, the CH 4 flux calculation method (the change in CH 4 concentration within the chamber is presumed to be linear, which in fact is not the case; Conen and Smith, 1998;Forbrich et al., 2010;Levy et al., 2011), the spatial and temporal variability (which is difficult to capture due to the limited distribution of measurement spots and measurement frequency), and the measurements error (e.g. when the measuring plot is ponded, we take the vertical distance between the water surface and soil surface as water table level.However, it is difficult to separate the soil surface from the vegetation roots, particularly with the presence of tall sedges or peat moss hummocks)., 11, 3985-3999,  The major parameters are listed in Table 1.The CH 4 and soil organic matter production modules are described in detail in Van Huissteden et al. (2006, 2009); their values are set up in accordance with the vegetation types.The parameter values of the soil physic module are derived from laboratory experiments on soil samples from Kytalyk.The hydrological parameters are as follows: W min is the lowest water table level of the modelling period, in metres.Negative values denote subsurface water tables.E WT is a correction factor that reduces evaporation if the water table is below the surface.FW2 has a slightly higher E WT value due to the shallowest water table level.Z runoff is the threshold value above which a ponded water layer produces runoff, depending on topography condition.The floodplain back swamp groups FW1 and FW2 can hold a small amount of water above the ground surface before generating runoff, while the low-depression groups TW1 and TW4 have much higher Z runoff values.E o and E veg are evaporation correction factors for open water and vegetation properties respectively.K sat is the horizontal hydraulic conductivity of saturated soil.D D and D L are the distance to the nearest drainage line and the water level limit above which the ground water starts to drain respectively.

Biogeosciences
Major parameters of the GPP module are K Beer , Beer's law molar extinction coefficient in unit ground area per unit leaf area; F par , fraction of photosynthetically active radiation; P maxL , light-saturated photosynthetic rate per leaf area, which is high (∼ 20) for tussock-dominated group TD2 and low (∼ 14) for Betula-nana-dominant TD1; and F phe , the phenology characteristics, including the base for calculating the heat sum (growing degree days), F phe1 ; the heat sum when maximum LAI is reached, F phe2 ; and the maximum LAI, F phe3 .
Snow depth, evapotranspiration, and precipitation data, essential to drive the model, are collected from the meteorological tower of the study site and the Chokurdakh weather station.The latter one is situated 30 km southwest of the study site and is used only to gap-fill the in situ observations.To complete the evapotranspiration data set, we use the FAO (Food and Agriculture Organization of the United Nations) potential evaporation calculator ETo (Raes, 2012), a formula based on air temperature, wind speed, and solar radiation.
Kytalyk site is located north of Chokurdagh, and a heat island effect for the Chokurdagh weather station cannot be excluded.For the earlier years there are no or fragmentary data from the research sites during the winter months.For the winter of 2011 and 2012, we have complete air temperature data sets from both the research site and the Chokurdakh weather station.By comparing the wintertime data for both stations, it proves that the winter air temperatures at the research site are on average 0.96 • C lower than those at the Chokurdakh station.We prepare two air temperature data sets to test to what extent the model is sensitive to the air tem-perature input change: (1) a 2 m air temperature record from Chokurdakh, augmented with local site data when available (in summer), and (2) subtracting 1 • C from the data set one for the winter data.

GLUE method
In order to assess the influence of input parameters, which are difficult to quantify, we use the Monte-Carlo-based GLUE methodology (Beven, 2009).For each parameter, the value range is predefined.A set of parameters is selected randomly within their own ranges to complete a model run.
For each group, 3000 Monte Carlo simulations were run to test the performances of the hydrology and CH 4 flux calculations separately.The CH 4 module test involves three model structure assessments: (a) switch off the GPP module and estimate GPP based on topsoil temperature and minimum/maximum GPP, (b) GPP calculated according to method (i), and (c) GPP calculated according to method (ii).We pre-ran 1000 simulations to rule out the parameters that do not have significant influences on the model output and use fixed values for them, and, incorporating the results from Van Huissteden et al. (2009), we select the parameters to be tested.These parameters are shown in Figs.7 to 10 in the Results section.
The results of each model run are compared with site measurements, and evaluated by objective function values.Van Huissteden et al. (2009) explain each objective function and its applicable scope.We choose the Nash-Sutcliffe (NS) efficiency coefficient to assess the model performance, which is essentially a measure of how well the model performs in predicting the data with respect to an estimate of the fluxes based on the data average: 1 indicates a perfect simulationdata fit; values close to or below 0 indicate an error variance of the same magnitude, or larger, than the variance of the observations: where E is the NS efficiency, σ 2 e is the error variance, σ 2 o is the variance of the observations, ŷt is the predicted value at time t, and y t is the observed value.Maximum LAI n/a 0.7 to 2.0 most years the model also follows temporal variations in the data.

Active layer thickness
Summer campaign data suggest an average maximum active layer depth of 50-60 cm (Van Huissteden et al., 2005); however, the measurements are sparse and do not cover the whole thawing season.Here we only present the modelled active layer thickness from the tundra wet group, TW1.The other groups show similar behaviours.
Forcing the model by data set one (data from Chokurdagh airport weather station) results in an overestimation of the active layer depth by 125 % (not shown).Using data set two as the model input improves the simulated active layer thickness considerably; the bias is 12 %.The maximum depth of thaw is about 80 cm, attained in autumn.For most years there is an exact match between modelled and observed active layer thickness (Fig. 3).

Water table level
Figure 4 compares the measured (with ±1 standard error bars) and modelled water table levels.We present the results from one dry group (FD), one floodplain wet group (FW2), and one tundra wet group (TW1), as these have the largest number of measurements.The model shows plausible yearly cycles for all groups.However, since the spatial and temporal variability in the data is high, the coefficient of determination (R 2 ) values of all the groups are low (0.13 for FD, 0.15 for FW2, and 0.23 for TW1).
The model failed to reproduce the water table values for the extremely wet year 2007 for both the wet groups, FW2 and TW1.During this year, FW2 was influenced by prolonged river flooding, and the nearby TW1 sites by impeding drainage.However, we do not expect this has a considerable effect on CH 4 production as when the water table is above soil surface the soil profile is already fully saturated, creating an optimal anaerobic environment for methanogenesis; therefore the change of ponded water level does not affect this anaerobic condition.In this case, CH 4 production is mostly controlled by changes in the soil temperature and organic substrate.
Despite realistic trends and temporal variations, some simulated water table positions are significantly deeper than the observed ones, for instance, in the year 2005 for FW2, and 2004 and 2009 for TW1.This can be partly explained by the limited number of measurement locations and high data variability.In addition, this underperformance can also be due to the physical structure of model.This one-dimensional hydrological module does not consider an upstream runoff water source; therefore water table rises due to runoff input from neighbouring areas are not captured by the model.

CH 4 flux
We use the summed CH 4 flux of the three transport pathways (diffusion, ebullition, and plant-mediated) from model simulations to compare with each of the flux chamber measurements.Figures 5 and 6 present the comparison between simulated and observed (with +/−1 standard error bars) from groups FW2 and TW1.The other groups yield similar results (not shown).
The modelled magnitude of the total CH 4 flux generally agrees well with the data and the simulated seasonal pat-tern is clear.The R 2 values are low for all groups, regardless of whether the GPP module is on or which calculations are used.However, this mismatch should not compromise the capability of the model to investigate the spatial and temporal CH 4 flux patterns since the model still explains part of the variance of the data, as indicated by the NS efficiency in the sensitivity analysis section.
Measured fluxes reveal a considerable range among groups, from less than 0 (FD) to more than 100 (FW1) mg m −2 h −1 .Encouragingly, the model reproduced this variation adequately.The lowest emissions are from the tundra dry group TD, with values around 0, while the highest one originates from FW1 during the summer of 2006, with a value of 77.18 mg m −2 h −1 .Mean daily fluxes during growing seasons produced by Peatland-VU range from 0.05 (TD) to 15.1 mg m −2 h −1 (FW1).
Some of the data from the dry groups show small negative fluxes, suggesting that CH 4 is taken up from the oxic topsoil or atmosphere by methanotrophic microorganisms.The model represents this mechanism, but does not reproduce these fluxes very well, although it simulates low to negative fluxes.These very small fluxes also have a very large measurement uncertainty, which makes the assessment of the model performance impossible.

Model sensitivity analysis
The water table simulation tests show that most of the parameter sensitivities are group-dependent, except for W min , which is highly sensitive at all groups.The dry groups are also sensitive to E WT , while the wet groups are strongly www.biogeosciences.net/11/3985/2014/Biogeosciences, 11, 3985-3999, 2014  affected by Z runoff and E o .The vegetation correction factor for evaporation, E veg , is highly important to both the dry groups and the wet groups, FW1 and TW1.Conversely, the drainage distance, D D , does not have a large influence on any group.
Figure 7 shows the deviations of cumulative distributions of the parameters, between the top 100 model performances  For the CH 4 testing we include all field data except the floodplain groups from 2007.This was due to the unrepresentativeness of these particular measurements for the floodplain as a whole, and the likelihood that they may have been subject to disturbance (F.J. W. Parmentier, personal communication, 2013).
The distributions of all groups indicate that the CH 4 production rate factor, R 0 ; the CH 4 plant oxidation parameter, F ox p; the reference temperature for temperature-dependent decomposition, T ref ; and the CH 4 plant transport rate factor, F plant , contribute significantly to the best model-data fits, and may mask the effects of the GPP module.
When the GPP module is switched off, the model performs better with high R 0 value ranges (0.35 to 0.50) for groups FW1 and TW3 in comparison with lower values, and conversely for the other groups.As for F ox , groups FW2 and TW4 perform better over lower values (0.1 to 0.5) than the other groups (not shown).
The improvement created by the addition of the GPP module is generally insignificant.The higher objective function results for groups FW2, TW1, and TW4 indicate a better model performance.However, turning off the GPP module results in higher NS values for FW1 and TW3 (Table 2).Emissions from FW1 tend to be underestimated, especially for the large values, while fluxes from TW4 are overestimated, particularly for low values.
Most of the parameters in both GPP calculation methods show little sensitivity to the total emissions except a plant phenology factor, number of growing degree days (heat sum), F phe1 .This suggests that a complicated GPP module is not necessary for plot-scale CH 4 emission modelling; instead meteorological inputs can be used as a surrogate and the effect of photosynthesis can be parameterized in the subroutines.
The two GPP calculation methods do not give different results in most cases except for FW2, where method (i) gives a much higher value than method (ii).The maximum objective function values are listed in Table 2 -FW2 runs with GPP module on -and all the TW4 simulations have passed the F test p = 0.1 probability limit (NS > 0.3212).
Same as in Fig. 7, Figs. 8 to 10 show the deviations of cumulative distributions of the parameters, between the top 100 model performances (blue lines) and all 3000 simulations (red lines), with and without GPP functions for group TW1.
We also tested the sensitivity of CH 4 fluxes to the calculated soil temperature and active layer thickness.Forcing the model by the second air temperature data set (1 • C lower than the data from Chokurdakh weather station in wintertime), a much better match for the active layer depth was achieved.However, this does not give a better model-data fit with respect to the CH 4 fluxes.With the lower winter air temperature input, the best model fit resulting from the GLUE analysis is slightly lower than that with the higher air temperature input, although the number of well-fitting model runs is higher.This holds for experiments with and without the GPP module.Well-performed model runs using the higher air temperature time series have a lower Q 10 (2-3.5)value Table 2.The maximum objective function values of GLUE experiment, with GPP module off, GPP calculated by method (i), and GPP calculated by method (ii).

Discussion
The model-data comparison shows that the modelled active layer depth and soil temperature is very sensitive to soil surface temperature input.Comparing the air temperature data sets from the research site and Chokurdakh station (winter 2011 to 2012) shows the air temperatures at the research site are on average 0.96 • C lower than those at the Chokurdakh station.Consequently, subtracting 1 • C from the winter air temperature improves the modelled active layer thickness considerably.These experiments show the importance of modelling the temperature gradient between the soil surface and top of the canopy/top of the snow cover for accurate modelling of active layer thickness.Importantly, it shows a strong sensitiv-ity of the model to small temperature differences resulting from spatial separation between modelled site and temperature data collection.
The new water table module allows Peatland-VU to simulate more realistic conditions for CH 4 production when actual water table data are not available; an essential capability when modelling this data-sparse biome.Long-term tests show that Peatland-VU is able to reproduce the magnitude, and spatial and temporal patterns of the water table, even though the spatial variation between floodplain wet groups and tundra dry groups can be large.Nonetheless, detailed daily variations are difficult to evaluate due to limited data on this scale.
The model failed to reproduce the water table values for the wet groups in 2007 -an extremely wet year.This is likely due to the assignment of a specific runoff threshold value in the water table algorithm for each group.This was implemented according to our general knowledge of the normal yearly flooding situation, over which we assume the surface water is drained as runoff.This approach ignores flooding in situations where water input occurs from upstream, such as in TW1 and FW groups (the TW1 groups also act as drainage channels in very wet conditions).Moreover, in 2007, spring snowmelt, high summer precipitation and poor drainage conditions on the floodplain exacerbated the runoff, which is not reflected in the model structure.This a justifiable omission as the depth of excess flood water does not change the fact that the subsurface is still anoxic and the organisms behave similarly, thus carrying forward negligible changes in CH 4 production regarding the influences of water table position.
The model captured the magnitude and seasonal pattern of the CH 4 flux and the disparities between wet and dry groups.The emission fluctuations due to water table level, temperature, and active layer thickness are also reflected well in the simulations.However, model-data comparisons give low R 2 values, although the NS objective function values are still positive, indicating that the model captures at least part of the variance of the data.This is partly due to the uncertainties in measurements.There is in some cases considerable uncertainty as to the exact location of soil surface and water level, in particular for sedge and Sphagnum vegetation (see methods).The CH 4 flux measurements are laced with an immeasurable uncertainty due to induced ebullition during measurement; in particular large fluxes could have been influenced although care has been taken to scrutinize the data (see methods).Furthermore, this could also be due to the fact that Peatland-VU is a one-dimensional model, which is ap-propriate to reproduce the important processes from a characteristic soil profile and vegetation unit, while the measurement plots show a very high spatial variability of measured fluxes, even within one vegetation unit, reflected by the error bars on the measurements in Figs. 5 and 6.This variability and uncertainty limits model-data comparison, but this mismatch should not compromise the capability of the model to investigate the spatial and temporal CH 4 flux patterns.
The GLUE model sensitivity experiment on the water table parameters shows that most of the sensitivity to these parameters is group-dependent.The dry groups are sensitive to the minimum ground water table level, W min , and the subsurface water evaporation correction factor, E WT , while the wet groups are sensitive to W min ; the runoff threshold, Z runoff ; and the open water correction factor for evaporation, E o .This is in accordance with the fact that the dry groups in general have a lower water table position (below soil surface) compared with the wet groups (ponded).
The results of the CH 4 flux sensitivity experiment suggest that the CH 4 production rate, R 0 ; the CH 4 plant oxidation parameter, F ox ; the reference temperature for temperaturedependent decomposition, T ref ; and the CH 4 plant transport rate factor, F plant , are the most important factors affecting the data fit of all groups.From the model runs with higher objective function values, the parameter range differences in vegetation types are found.For example, the tall sedges dominating group FW2 and TW4 generally have low F ox values, in accordance with high vascular plants methane transport ability (Bubier et al., 1995;Joabsson and Christensen, 2001).However for the swamp group, FW1, and Sphagnum group, TW3, R 0 values are higher, which suggests that the highs and lows of CH 4 production rate control the flux variability.
The sensitivity experiments for the inclusion of GPP module give higher objective function results for groups FW2, FD, TW1, and TW4, but the results are reversed for FW1 and TW3 groups.We attribute this to the fact that FW1 and TW3 are groups with a very low vegetation canopy, with mossdominated vegetation and low vegetation total biomass, under which conditions the photosynthesis is driven more by soil water and temperature than by light.Similar results can be found from recent studies (Zona et al., 2011;Street et al., 2012).The results from two GPP calculation methods do not differ significantly except for FW2.Most of the parameters in both GPP calculation methods are not sensitive at any group except a plant phenology factor, F phe1 .Therefore we conclude that a GPP module is not essential in process-based, plot-scale CH 4 emission modelling, although the model generally performs slightly better with one of the GPP models that we tested, compared to a simple approach based on soil temperature.However, on the small scale, a model can be fine-tuned to local conditions, making a GPP module less useful.On the larger scale, variations in vegetation traits are to be expected and fine-tuning to one particular site will lead to poor model performance.In such cases, a GPP module may accommodate these spatial variations better, and be of additional benefit to the modelling effort.
The model experiments also show the effect of variations between modelled and measured soil temperature on the modelled fluxes.Deviations in the modelled soil temperature do affect the CH 4 result, but may be compensated by changes in the Q 10 and T ref parameters.The model may be improved by adding a vegetation canopy temperature gradient model.However, derivation of soil surface temperature from air temperature based on empirical relations derived from measurement data can also be applied.
There are a number of published plot-scale models attempting to simulate CH 4 emissions from wetlands and permafrost tundra available for comparison, especially when it comes to model performance.The generally low R 2 values may suggest that this type of methane emission model performs poorly.However, the R 2 goodness of fit assesses a more exact point-by-point modeldata agreement, and high spatial/temporal variability in the data, in the shape of measurement errors or high spatial and temporal variability of fluxes within the same vegetation units, severely limits the usefulness of this model-data validation method.Other choices of objective which may be better suited to this type of data, need to be considered.Our GLUE experiments used Nash-Suttcliffe efficiency to assess the model performance, which is essentially a measure of how well the model performs in predicting the data with respect to an estimate of the fluxes based on the data average.The positive objective function values indicate that the model still performs better than a flux estimate based on data averages and variance.
Alternatively, Zürcher et al. (2013) smoothed the data and model results using a spline approximation method with a cut-off period of 2 months; a better fit has been achieved.However, the longest data coverage for Kytalyk is 2007, covering 19 growing days; for the other years, the data coverage could be as short as only a few days.This makes it difficult to do a long-term model validation against the cumulative flux.Meanwhile, we expect that, when applied on largerscale wetlands, such as landscape scale, the model performance could be improved as the within-plot heterogeneity can be integrated.However, careful model parameterization is needed, and adequate validation data, such as eddy covariance measurements, are necessary to calibrate the model.
CH 4 models are typically tested with a few years of data or less (e.g.Van Huissteden et al. (2006, 2009)); our study shows that a longer data series does not contribute to a better model-data fit and even may result in a lower data fit, probably because of long-term variation that is not (or cannot be) accounted for by the model.Longer data series may contain more inhomogeneities due to measurement methods, instrumental drift not accounted for by calibration, and natural year-to-year variability (Mastepanov et al., 2013), resulting in a poorer fit.

Conclusions
Peatland-VU is able to capture both the annual magnitude and seasonal variations of the CH 4 flux, water table position, and soil thermal properties.However, detailed daily variations are difficult to evaluate due to the large uncertainty, mainly caused by spatial variation, in the data.The testing of two different models for vegetation primary produc-tion shows that improvements can be gained by adding primary production modelling, although the improvement is not very large and, for differences in model structure, marginal only.However, with a primary production module, our model could also be coupled into climate models and dynamic vegetation models in order to better explain spatial and temporal variations in CH 4 emissions from northern permafrost, and to predict responses under future change scenarios.However, the micro-topographical features within one wetland or even one vegetation type, which control the hydrology and biogeochemical processes and therefore influence the CH 4 fluxes, can vary widely; therefore careful parameterization is needed when up-scaling.The large uncertainty in the data due to high spatial variability limits model-data comparison.It may be useful to experiment with objective functions that can behave better under such a situation than the oft-used R 2 measure.

Figure 2 .
Figure 2. Schematic of the carbon partitioning simulated by Peatland-VU.
model is run over 8 years, from 2003 to 2010.Due to the low measurement frequency in comparison to the model time step (1 day), the model may show flux variations that were not recorded by the data; therefore the exact amplitudes of data and model output cannot be compared.Although the observation time series for each year are short, the simulations show realistic magnitude and seasonal patterns; for www

Figure 3 .
Figure 3. Soil temperature and active layer depth produced by Peatland-VU, tundra wet group TW1, 2004 to 2010.The red vertical lines toward the top of the figure indicate the measured active layer thickness, while the length of the line indicates one standard deviation.The black lines show the modelled active layer boundaries.

Figure 4 .
Figure 4. Water table produced by Peatland-VU compared with data, floodplain dry group (FD), floodplain wet group (FW2), and tundra wet group (TW1), 2003 to 2010.For data (red), the standard error is shown by an error bar.Note the different scales on the y axes.

Figure 5 .
Figure 5. CH 4 flux produced by Peatland-VU compared with data, floodplain wet group (FW2), 2003 to 2010.For data (red), the standard error is shown by an error bar.

Figure 6 .
Figure 6.CH 4 flux produced by Peatland-VU compared with data, tundra wet (TW1) group, 2003 to 2010.For data (red), the standard error is shown by an error bar.

(
blue lines) and all 3000 simulations (red lines), group TW1.A large deviation of the red and blue lines indicates a high sensitivity of the model to the parameter.There is a clear sensitivity to the parameters E o , E veg , W min , Z runoff , and D L .Model responses to the other parameter changes are negligible.

Figure 7 .
Figure 7. Deviations of cumulative distributions of parameters in hydrology module (TW1).Blue: top 100 model performance.Red: all 3000 model runs.E o , evaporation correction factor for open water; E veg , evaporation correction factor for crop/vegetation; E WT , evaporation correction factor for ground water; W min , deepest water table position; Z runoff , ponded water depth limit; K sat , horizontal saturated hydraulic conductivity; D D , downslope drainage distance; and D L , downslope drainage water level limit.

Figure 8 .
Figure 8. Deviations of cumulative distributions of parameters in CH 4 module (TW1), GPP function off.Blue: top 100 model performance.Red: all 3000 model runs.R 0 , CH 4 production rate factor; F plant , CH 4 plant transport rate factor; F ox , CH 4 plant oxidation factor; Q 10 , CH 4 production Q 10 factor; P max , maximum primary productivity; F exu , plant root exudate factor; and T ref , reference temperature for decomposition.

Figure 9 .
Figure 9. Deviations of cumulative distributions of parameters in CH 4 module (TW1), GPP calculated by method (i).Blue: top 100 model performance.Red: all 3000 model runs.R 0 , CH 4 production rate factor; F plant , CH 4 plant transport rate factor; F ox , CH 4 plant oxidation factor; Q 10 , CH 4 production Q 10 factor; F phe1 , plant phenology factor, base for calculating heat sum; F phe2 , plant phenology factor, heat sum when maximum LAI is reached; F phe3 , plant phenology factor, maximum LAI; F exu , plant root exudate factor; and T ref , reference temperature for decomposition.

Figure 10 .
Figure 10.Deviations of cumulative distributions of parameters in CH 4 module (TW1), GPP calculated by method (ii).Blue: top 100 model performance.Red: all 3000 model runs.R 0 , CH 4 production rate factor; F plant , CH 4 plant transport rate factor; F ox , CH 4 plant oxidation factor; Q 10 , CH 4 production Q 10 factor; F phe1 , plant phenology factor, base for calculating heat sum; F phe2 , plant phenology factor, heat sum when maximum LAI is reached; F phe3 , plant phenology factor, maximum LAI; F exu , plant root exudate factor; T ref , reference temperature for decomposition; and K Beer , Beer's law extinction coefficient.
Segers and Leffelaar (2001) apply a process-based model at the Nieuwkoopse Plassen (western Netherlands, non-permafrost).Although their modelled CH 4 fluxes have the same order of magnitude as the measurements, the model fails in capturing the seasonal pattern.Zhang et al. (2002) present the Wetland-DNDC (Denitrification-Decomposition) model and report the simulations of three North America sites, whose R 2 values range from 0.03 to 0.76.Wania et al. (2010) parameterize LPJ-WHyMe (Lund-Potsdam-Jena wetland Hydrology and Methane DGV Model) at three permafrost sites: BOREAS, Abisko, and Ruoergai.The simulated daily flux compared well with data for the Ruoergai site; however, the model-data fits are poor for the other two sites.Tang et al. (2010) test three process-based models of different complexities at two Michigan peatlands (non-permafrost): Hollow Bog and Big Cassandra Bog.The simulated fluxes from the latter site are less agreeable than the first one, with the highest R 2 values of 0.31 and 0.60 respectively.
table, transport by bubble formation (ebullition) below water table, and transport through plants.

Table 1 .
Major parameters and their ranges used in Peatland-VU model.
maxL Light-saturated photosynthetic rate per leaf area µmol m −2 s −1 13 to 20F phe1Base for calculating heat sum (growing degree days) n/a 3 to 10 F phe2Heat sum when maximum LAI is reached n/a 50 to 150 F phe3