Vegetation Influence and Environmental Controls on Greenhouse Gas Fluxes from a Drained Thermokarst Lake in the Western Canadian Arctic

Thermokarst features are widespread in ice-rich regions of the circumpolar Arctic. The rate of thermokarst 10 lake formation and drainage is anticipated to accelerate as the climate warms. However, it is uncertain how these dynamic features impact the terrestrial Arctic carbon cycle. Methane (CH4) and carbon dioxide (CO2) fluxes were measured during peak growing season using eddy covariance and chambers at Illisarvik, a 0.16 km thermokarst lake basin that was experimentally drained in 1978 on Richards Island, Northwest Territories, Canada. Vegetation in the basin differs markedly from the surrounding dwarf-shrub tundra and included patches of tall shrubs, grasses and sedges 15 with some bare ground and a small pond in the centre. During the study period, temperature and wind conditions were highly variable and soil water content decreased steadily. Basin scaled net ecosystem exchange (NEE) measured by eddy covariance was -1.5 [ CI95% ± 0.2] g C-CO2 m d; NEE followed a marked diurnal pattern with no trend during the study period. NEE was primary controlled by photosynthetic photon flux density and influenced by vapor pressure deficit, volumetric water content and the presence of shrubs. By contrast, net methane exchange (NME) was low (8.7 20 [CI95% ± 0.4] mg CH4 m dand had little impact on the carbon balance of the basin during the study period. NME displayed high spatial variability, sedge areas in the basin were the strongest source of CH4 while upland areas outside the basin were a net sink. Soil moisture and temperature were the main environmental factors influencing NME, having a positive and negative effect respectively. 25


Introduction
Thermokarst lakes are a widespread feature in poorly drained, sedimentary permafrost lowlands with excess ground ice volume (French, 2013). These lakes drain, sometimes catastrophically, via bank overflow, ice wedge erosion, coastal erosion, and stream migration (Billings and Petersons, 1980;Mackay, 1999). Thermokarst lakes and drained 30 thermokarst lake basins (DTLB) are prominent landscape features of the western Canadian Arctic (Mackay, 1999;Marsh et al, 2009;Lantz & Turner, 2015). Lake formation and drainage is a natural part of the thaw lake cycle, but it https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License. is anticipated that climate change will accelerate this cycle resulting in more lake formation and drainage (Jones et al., 2018).
Thermokarst lakes are well recognized sources of methane (CH4) which is 28 times as potent as carbon dioxide (CO2) 35 on a 100-year time scale (Walter et al., 2007). Thermokarst lake formation and expansion is expected to exert a positive feedback on climate change and accelerate Arctic warming in the near term, but one modelling study suggests that drainage may limit expansion and result in decreased lake area by the end of the century (van Huissteden et al., 2011). Post drainage, DTLB undergo rapid ecological succession. In colder tundra environments, meadows or polygonal landscapes dominated by sedges, grasses and rushes will form (Lara et al., 2015). In slightly warmer, 40 boreal or transitional regions, DTLB often become dominated by willows and other shrubs (Lantz and Turner, 2015).
Carbon exchange in DTLB of various ages has been examined by a few studies, all of which focused on the Barrow Peninsula in Northern Alaska. In general, DTLB are net sinks for CO2 with greatest uptake in younger basins and decreasing net uptake of CO2 as the basin ages (Zona et al., 2010;Zulueta et al., 2011;Lara et al., 2015). DTLB source/sink strength of CH4 was found to be highly variable depending on vegetation and ground conditions (Lara et 45 al., 2015). Net methane exchange (NME) is highest in wet or flooded meadows and remnant ponds while upland tundra surrounding DTLB can be a methane sink (Zona et al., 2009;Zona et al., 2012;Lara et al., 2015;Sturtevant and Oechel, 2013).
In this study, fluxes of CO2 and CH4 were measured at Illisarvik, an experimentally drained thermokarst lake bed on Richard's Island in the western Canadian Arctic, Northwest Territories, Canada. Fluxes of CO2 and CH4 were 50 measured during the peak growing season using a combination of closed chamber and eddy covariance (EC) measurements. Net ecosystem exchange (NEE) of CO2 was calculated from fluxes and storage change and NEE was separated into ecosystem respiration (ER) and gross primary productivity (GPP), = + . Here we report on: 1) the spatial and temporal variability of the NEE and NME, 2) the vegetation and environmental factors NEE and NME, 3) how the carbon balance at Illisarvik compares to other DTLB, and 4) potential future trajectories as Illisarvik 55 continues to evolve.

Study Site and Data Collection
The study took place at Illisarvik, a DTLB on Richards Island (69˚28'47.5" N, 134˚35'18.7" W), that was drained experimentally in 1978 (Mackay, 1981). Illisarvik has since served as the focus of studies on permafrost growth, 60 active layer development and vegetation succession (Ovenden, 1986;Mackay and Burn, 2002;O'Neil et al., 2012;Wilson et al., 2019). At the nearby Tuktoyaktuk climate station mean annual air temperature (Ta) is -10°C, July is the warmest month with a mean of 11°C and January is the coldest at -27°C. Mean annual precipitation is 160.7 mm yr -1 , the majority falling as rain in the summer and autumn. Snow cover typically lasts from September or October to In the 39 years since drainage, Illisarvik has undergone rapid vegetation succession. After drainage, there were two remnant ponds. In the first five years after drainage, vegetation colonized the basin margins and wetter areas (Ovenden, 1986). By 1999, low vegetation had proliferated across most of the basin and taller willows had become established along the basin margins (Mackay and Burn;2002). By 2010, some of the willows had grown to be 3 m in 70 height (O'Neil and Burn;. Current vegetation at Illisarvik is diverse relative to the dwarf-shrub tundra of the surrounding uplands; the basin hosts a mix of woody shrubs (Salix spp., Betula spp., & Alnus spp), wetland vegetation (Carex spp.,Arctophila fulva, etc.), and various grasses (Pocacea spp.). The basin is partly ringed by a terrace of peat that formed after a partial drainage event ~ 2000 years BP and supports vegetation similar to the uplands (C Burn, personal communication 2016). An ancient DTLB is located 100 m to the south of the Illisarvik basin and the Arctic 75 Ocean is to the west of the basin, separated by a ridge of upland tundra about 50 m wide at its narrowest (Fig 1).
A vegetation survey of species presence and percent cover was done on a 50 m grid in and around the basin during the 2016 study period (Wilson, 2019). A vegetation map was created with ten units based on plant functional type and vegetation structure, with sub-units denoting sub-canopy vegetation. The unit boundaries between grid points were estimated visually by traversing the grid lines. Additional survey data on vegetation units & canopy height were 80 collected manually with a GPS in the proximity of the EC station because greater resolution was needed for footprint modelling. Drone imagery was collected on July 23 rd over two flights using a Phantom 2 drone (DJI, Shenzhen, China). The GPS points and drone imagery were used to cross reference and modify the map of Wilson et al. (2019).
The ten units were then aggregated into 6 broader surface cover classes (listed from largest to smallest areal fraction within the footprint climatology): shrub, grass, sedge, upland, sparse, and water classes (Fig 1 & Table 1). 85

Weather and Soil Measurements
Weather data were logged on a CR1000 datalogger (Campbell Scientific Inc, Logan, UT, USA; CSI) at 5-minute intervals. A NRLite net radiometer (Kipp & Zonen, Delft, Netherlands) measured net all-wave radiation (Rn), SQ-110 quantum sensor (Apogee Instruments, Logan, UT, USA) measured photosynthetic photon flux density (PPFD), and a HMP35 (CSI) measured Ta and humidity (RH). A tipping bucket rain gauge (R.M Young Company, Travers City, 90 MI, USA) was placed 3 m to the west of the main tripod. Soil temperature and moisture data were recorded at 30minute intervals on CR10x dataloggers (CSI) within two soil pits with different vegetation types near the tripod: Grass (30 m to the east) and Shrub (40 m to the north). Each system measured ground heat flux (G) with custom made heat flux plates, soil temperatures (Ts) with custom made type-T thermocouples at depths of 0.08 m, and volumetric water content ( ) with CS616 water content reflectometers (CSI). There was one repetition of each observation per pit. 95 The climate and soil stations operated uninterrupted from July 10 th (day 192) and July 11 th (day 193) respectively, until August 7 th , 2016 (day 220). On July 11 th and August 6 th thaw depth was measured at each of the 10 chamber sites. Thaw depth was measured by inserting a graduated steel probe into the ground to point of refusal. Each site was probed five times: the median value has been used as the thaw depth at each location. Between day 193 and 197, a large herd of reindeer (500 + animals) visited Illisarvik. They mostly avoided the tripod but did graze near it on a few 100 occasions.

EC Fluxes
An EC system was placed in the southwestern portion of the basin (69° 28' 47.82", -134° 35' 18.6") and measured fluxes of CO2 (FCO2) and CH4 (FCH4) for the full study period. The EC system consisted of an open-path infrared CO2/H2O gas analyser (IRGA) (model LI-7500, LI-COR Inc., Lincoln, NK, USA; LI-COR), an open-path CH4 105 analyser (model LI-7700, LI-COR) and a CSAT3 sonic anemometer (CSI) mounted on a tripod at a measurement height (zm) of 3 m (Fig 2.). The EC data and air pressure (Pa) were logged at 10 Hz on the LI-7550 Analyzer Interface Unit (LI-COR). The CSAT3 was oriented to the northeast (40°) because climatology for Tuktoyaktuk indicated northerly and easterly winds are typical for July and August (Environment Canada, 2016).
Half-hourly fluxes were calculated with EddyPro V.6.2.0 (LI-COR). The software performed statistical assessments 110 (Vickers and Mart, 1997), low and high frequency spectral corrections (Moncrieff et al., 1997 and2004), a double rotation (Wilczak et al., 2001), applied the WPL correction to account for density fluctuations (Webb et al., 1980), and quality control . Post processing treatments included: storage correction (calculating the net flux as the sum of the observed scalar flux and the rate of change in scalar concentration at zm), filtering fluxes by friction velocities ( * ) below 0.1 m s -1 , and removing spurious half hourly measurements (Papale et al., 2006).

115
Additionally, observations with mean winds from 220˚ ± 30˚ were removed as these to avoid uncertainties associated with the wake of the sonic anemometer, and observations were removed during precipitation events and when the open-path analyzers indicated there were any other obstructions within the path. The data were gap-filled using neural networks (NN) which have been applied to FCO2 and FCH4 in other studies (Moffat et al., 2010;Dengel et al., 2013).
Details of the NN methodology are discussed in Appendix A. 120 The flux footprint represents the influence of upwind areas on a measured scalar flux and the footprint climatology is the average of individual footprints over a time period. Evaluation of the flux footprints and climatology help evaluate the reliability of the dataset and estimate the source area of each individual datapoint of the EC flux measurement. A scalar flux sampled at (0,0, ), can be represented as the integral of the flux footprint function f(x,y) and the distribution of sources/sinks ( ) over a domain D (Kljun et al., 2015): The flux contribution of upwind source areas increases sharply upwind from the measurement location to a peak then decrease gradually with increasing distance (Schmid, 2002). The empirically derived flux footprint function of Klujn et al., (2015) was used to estimate the source area of each half hourly flux measurement.
The model requires boundary layer heights which were not measured onsite. Half hourly boundary layer heights were 130 interpolated from three-hour estimates obtained from the Global Data Assimilation System of the U.S. National Oceanic and Atmospheric Administration. The model also requires the aerodynamic roughness length ( 0 ) which is influenced by the canopy height and spacing. Canopy height (Ch) varied considerably within the basin (from >1 m in the north to ~0 m in the bare ground). Canopy height variability was lower in the vicinity of the EC tripod but ranged L is the Obukhov length. 0 was found to be insensitive to changes in the zero-plane displacement height, d, as zm >> d, so the mean value of d around the tripod was used, where d = 2/3 Ch. Zero-plane displacement did not change significantly over the course of the study so 0 remained fixed over the study period for each wind sector.
For each 30-minute flux observation, was solved at one-meter resolution over a 1 km 2 domain centred on the EC 140 tripod. f(x,y)i were intersected with the surface classes to determine their relative contribution each flux observation (referred to as Shrub%, Sedge%, etc.). The footprint function is technically infinite so a fraction of each f(x,y)i was not contained within the model domain. The out-of-domain source fraction ranged from 1.8% -4.9% with a mean of 3.2% and assumed to have minimal impact on the analysis. The half hourly flux footprints were then averaged over the study to calculate the flux footprint climatology. All post-processing and footprint modelling were carried out 145 using Python.

Closed Chamber Measurements
In addition to EC measurements, fluxes of CO2, CH4, and nitrous oxide (N2O) were sampled using a static non-steady state chamber flux technique on 11 dates between July 12 and August 5, 2016 (Laforce, 2018). Only the bare ground was a significant source of N2O, so those results are not shown here. Chamber collars were located at ten sites, eight 150 within and two outside the basin ( Figure 1). Each surface cover class was represented by at least one chamber site, except for open water. At each site a pair of collars were installed 20 cm apart and the above ground biomass was removed from one of the collars, with the exception of the 'sparse' cover class where only one collar was installed.
PVC collars 30 cm long and 24.3 cm in diameter were inserted to a depth of approximately 15 cm. The chambers were 34 cm tall and made out of polycarbonate covered in black opaque tape to maintain dark conditions inside the 155 chamber (for more details, see Martin et al., 2018). The chambers contained a small vent (10 cm coiled 1/8" diameter copper pipe) to ensure a constant pressure during measurements. The use of opaque chambers means that FCO2 represents only ER. The independent estimation of ER is important as it is difficult to estimate ER using standard EC techniques at high latitude sites during Arctic summer at this latitude. Standard approaches require night-time data, a condition which was only found during a small fraction of the study period. 160 Measurements were made between 9:00 and 17:00 starting at a different site each day to randomize the sampling order to avoid a bias due to diurnal ranges on the 11 days when performing chamber measurements. During gas flux measurements, the chambers were sealed to the top of the collars within a groove filled with water and five 24 mL air samples were collected into evacuated 12 mL vials sealed with doubled septa. Each vial contained a small amount of magnesium perchlorate to dry the air sample. Samples were collected at 0, 5, 10, 15 and 20 minutes after the chambers 165 were set on the collars. Air within the chamber was mixed with a 60 mL syringe attached to a three-way stopcock before each air sample was taken. Samples were stored until analysis the following fall. To monitor the integrity of the vials through shipping, storage and analysis, a number of the evacuated vials were filled with helium in the lab before the field season began.
Concentrations of CO2, CH4 and N2O were determined at Carleton University, using a CP 3800 gas chromatograph 170 (Varian Inc., Pao Alto, CA, USA) as described by Wilson and Humphreys (2010). Three replicates of five CO2/ CH4 standards varying from 383.1 to 15212.6 ppm CO2 and from 1.08 to 22.11 ppm CH4 were included in every set of https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License. measurements to create a linear relationship between gas concentration and chromatogram area. The fluxes (FC) were calculated as follows:

=
( 2)  175 where (dc/dt) is the linear rate of change in the mixing ratio of the gas, A is the chamber area (0.0464 m 3 ), V is the chamber volume (between 0.0182 and 0.0242 m 3 adjusted for m 2 collar depth at each collar location), R is the ideal gas constant, P is pressure in Pa and T is the air temperature in Kelvin. Chamber measurements were upscaled to for comparison to EC observations. ER was calculated using eq. 4 and the Q10 and R10 coefficients derived by Laforce (2018) explained in section 2.4.1. ER was calculated for each surface class separately and then weighted by footprint climatology (see Table 2). For NME, there are no standard empirical 185 functions, so mean NME for each class was weighted by the footprint climatology instead. The weighted estimates were compared to EC observations and to help verify our findings.

Factor Selection and Gap Filling
We used an exploratory approach to identify the smallest set of factors that best predicted half hourly NEE and NME without overfitting the dataset. We started with 10 factors: four meteorological variables [ (PPFD), Ta, vapor pressure 190 deficit (VPD), wind speed (U)], two soil variables [volumetric water content (VWC) and soil temperature (Ts)], and four source area fractions [shrub (FShrub), grass (FShrub), sedge (FSedge), and upland, (FUpland)]. The four source area variables correspond to surface classes sampled by the chamber samples. We excluded sparse (FSparse) because its average contribution to the EC observations was only 2.1% and chamber samples indicated ER was low and NME was not significantly different from zero. A number of these variables are highly correlated but it was necessary to 195 include them so the model could account for source area heterogeneity.
A series of neural networks (NN) were trained iteratively on bootstrapped datasets. First NN were trained on each factor individually and the one with the lowest MSE was selected. Next, NN were trained on that factor in combination with one of the remaining nine. The best performing additional factor was again selected and this process was repeated until MSE failed to improve. The most parsimonious model was identified using the one standard error (SE) rule. 200 Dybowski and Roberts (2001) give the standard error of a bootstrap estimate of a given error metric (eg. = ) to be where is the mean of the bootstrapped samples. The smallest set of factors where was within one of the minimum for both NEE and NME were selected for further analysis. The models trained on those factor 205 https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License.
sets are referred to as NNNEE and NNNME respectively. NN modelling was done using the Keras Python library (Chollet et al., 2015), see the Appendix A for a more detailed explanation of the NN analysis.
Multiple Imputation (MI) was then used to gap fill the NEE and NME with the outputs from NNNEE and NNNME respectively (Vitale et al., 2018). Of the 1296 half hourly flux observations 28.9% of FCO2 and 31.3% of FCH4 were missing or filtered out. There were a few gaps in the source area fractions needed to gap-fill the flux time series 210 because the footprint function is not valid when * < 0.1 m s -1 . When source area fractions were missing, they were gap-filled by using the mean source are fraction observed for winds within ± 5˚of the observed wind direction. The meteorological and soil data were continuous and did not need to be gap-filled.

Flux Partitioning
NEE is negative when there is uptake of CO2 by the ecosystem and positive when there is net emission. ER and GPP 215 are always positive, ER represents the sum of heterotrophic and autotrophic respiration and GPP represents photosynthetic uptake of CO2. Night-time NEE observations (eg. PPFD <= 10 µmol m -2 s -1 ) are typically used to quantify ER because GPP ~ 0 (Aubinet et al., 2012). Some NN analyses of NEE have trained separate models for night-time and daytime conditions (Papale & Valenini, 2003). However, these methods are not practical during the Arcitc summer, the sun did not set at Illisarvik until July 28 th , over half way through the study period. There were not 220 enough night-time samples (n=100) to be worth training a separate NN. Instead, we estimated ER by calculating the intercept of NNNEE at PPFD = 0 µmol m -2 s -1 for all observations. This estimate is referred to as NNER. Laforce (2018) fit a Q10 equation to the chamber ER observations: where R10 is the base respiration at 10 C˚ and Q10 is the temperature sensitivity coefficient. We fit the night-time EC 225 observations available (n=100) to this Q10 equation for comparison (Aubinet et al., 2012).

Factor Analysis and Upscaling
The trained NNs were used to investigate how individual factors influenced NEE and NME. The partial first derivative of the model response to one controlling factor was calculated while keeping all other inputs fixed. For example, the partial first derivative, ∂ ∂PPFD , is an approximation of the NEE light response curve under a specific set of conditions. 230 Similarly, NNNME can be used to approximate NME response to controls like VWC or Ts. For both fluxes, the selected model contained at least one source area fraction variable, indicating the vegetation type(s) which had significant influence over both NEE and NME. We mapped NNNEE and NNNME to full coverage for the surface classes to approximate their fluxes and compare with the chamber observations.

Results & Discussion 235
During the 29-day study, half-hourly Ta and Ts ranged between 0.4 and 26.2 C˚ and 4.4 and 11.0 C˚, respectively (Fig   2a). Day length and maximum solar altitude decreased from 24 hours to 19.25 hours and 41.6˚ to 35.4˚, but daily https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License. PPFD was more influenced by variations in cloud cover. Precipitation (19 mm) fell on 14 of the 28 days with trace snowfall on three of those days, but VWC decreased throughout the period (Fig 2b). At the onset of the study period, VWC was high and soils were saturated with ponding in the sedge areas. By the end of the study most of this surface 240 water had dried up. On July 11 th average thaw depth (cm) was 37, 45, 51, 64, 81 at upland, sedge, grass, shrub, and sparse classes, respectively. By August 6 th , average thaw depth was 45, 62 and 66 cm at upland, sedge and grass surface classes and over 100 cm at both the shrub and sparse classes.
A strong low-pressure system stalled off the coast between day of year (DOY) 199 and 204. This caused westerly winds to occur much more frequently than is typical for July and August. The 50%, 80% and 90% flux footprint 245 climatology contours are shown in Fig 1a. Mean source area fractions indicate the EC observations were skewed towards the grass surface class and under-sampled the shrub class, but the range of surface classes sampled was diverse enough to allow for testing of the impact of source area fraction on the fluxes (Table 2).

EC & Chamber Observations
EC observations indicate that the flux footprint area was a carbon sink during the peak growing season, but fluxes 250 varied considerably from day to day. Gap-filled daily NEE ranged from -3.7 to -0.2 g C-CO2 m -2 d -1 with a mean -1.5 [CI95% ± 0.2] g C-CO2 m -2 d -1 (Figure 2c). Estimated ER was 2.2 [CI95% ± 0.9] g C-CO2 m -2 d -1 with corresponding GPP 3.7 g C-CO2 m -2 d -1 . There were no notable trends in NEE or ER over the study period. Our observations are within ranges observed from young DTLB on the Barrow Peninsula. NEE was greater than (ie. less carbon uptake) EC observations of from four wetter, sedge dominated DTLB, where peak season NEE was -2.5 g C-CO2 m -2 d -1 , ER 255 (1.5 g C-CO2 m -2 d -1 ) was lower than at Illisarvik while GPP (4.0 g C-CO2 m -2 d -1 ) was slightly higher (Zona et al., 2010). But NEE was less than (ie. more carbon uptake) upscaled chamber estimates for wet meadow DTLB (-0.9 g C-CO2 m -2 d -1 ), while ER (2.7 g C-CO2 m -2 d -) and GPP (3.5 g C-CO2 m -2 d -) were higher and lower than Illisarvik (Lara et al., 2015). Mid-day NEE at was comparable to aircraft observations from young DTLB (Zulueta et al., 2011).
Gap-filled daily NME was modest, ranging from 2.0 to 25.1 mg C-CH4 m -2 d -1 with a mean of 8.7 [CI95% ± 0.4] mg 260 C-CH4 m -2 d -1 (Fig 3d). Even accounting for the greater GWP of CH4, NME did not constitute a significant component of the carbon balance. Daily NME decreased over the study period as soils in the basin dried. NME at Illisarvik was much less than EC observations from in a wet sedge dominated DTLB (18.4 mg C-CH4 m -2 d -1 ) and chamber observations from DTLB (26.1 mg C-CH4 m -2 d -) on the Barrow Peninsula (Zona et al., 2009;Lara et al., 2015).
Chamber NME was much smaller than ER but there was significantly more variability among vegetation classes (Fig   3b). Sedge was a very strong CH4 source at 114.7 [CI95% ± 15.3] mg C-CH4 m -2 d -1 , which is comparable to chamber observations from vegetated ponds on the Barrow Peninsula (Lara et al., 2015). Sites across the arctic with sedges 275 have higher NME than sites without sedges (Olefeldt et al., 2013). Shrub and Grass were very weak sources, 0.7 [CI95% ± 0.3] and 0.4 [CI95% ± 0.3] mg C-CH4 m -2 d -1 , respectively and Sparse was neutral. Upland was a net sink for CH4 -1.1 [CI95% ± 0.4] mg C-CH4 m -2 d -1 . Lara et al. (2015) did not find upland areas on the Barrow Peninsula to be CH4 sinks, but upland tundra is known to be a globally significant methane sink (Whalen and Reedburgh, 1990;Whalen et al., 1996) Upland fluxes were negatively correlated with Ts, Sparse and Grass had no clear relationships, 280 while Shrub and Sedge had significant positive correlations with VWC and Ts, but there were no straightforward empirical functions to model NME like there is for ER.
Footprint scaled chamber estimates of ER and NME were about 32% and 31% greater than the EC estimates. Mean estimated ER estimated was 3.2 g C-CO2 m -2 d -1 and NME was 12.9 [CI95% ± 8.1] mg C-CH4 m -2 d -1 The discrepancies between the EC and chamber observations is often observed and has been attributed to differences in measurement 285 techniques, the small sample size of chamber observations, and sampling bias since all chamber measurements were taken during the day with fair weather (Katayanagi et al., 2005;Chaichana et al., 2018). Meijide et al. (2011) found that chamber NEE could be up to twice as large as EC observations and Riederer et al. (2014) also found chamber NME estimates were about 30% higher than EC estimates.

NEE Response to Environmental Factors and Vegetation Type 290
NNNEE (r 2 = 0.91) had four factors: PPFD, VPD, VWC, and Shrub. For comparison, NEE estimated using the Q10 paired with a light response curve (r 2 = 0.8) or a random forest regression tree (r 2 = 0.86) trained with the factors selected by NNNEE were not as accurate as NNNEE. PPFD is the primary control over NEE, a NN trained on PPFD alone provided a reasonable fit (r 2 = 0.83). The three additional factors: VPD, VWC, and Shrub, helped NNNEE fit a wider variety of conditions. Looking at the partial first derivative of NNNEE under different conditions, we can inspect 295 the modelled light response curves (α). The minimum of α is the peak light use efficiency. With increasing PPFD, light use becomes less efficient and α approaches zeros as the light response nears saturation.
VPD was a secondary control over NEE, which is consistent with the findings of another study using NN to analyse NEE (Moffat et al., 2012). Increasing VPD increased peak light use efficiency and net C uptake until a threshold, above which it had a strong limiting effect (fig 4a & b). For example, under dry conditions (eg. VPD = 1500 Pa), 300 peak light use is less efficient (-12 nmol CO2 µmol -1 photon) than under optimal conditions (-18 nmol CO2 µmol -1 photon). The value of this threshold was dependent upon soil moisture: from 1000 pa when VWC was highest to 200 pa when VWC was low.
Mapping NNNEE and NNER at Shrub% = 100%, Shrub% = 0%, and Shrub% = 36% (footprint climatology), shows that VWC and Shrub% were the primary controls over ER and thus influenced NEE (fig 4c & d). We can see from the 305 partial first derivates of NNER that increasing VWC increases ER from shrub areas. In the absence of shrubs, increasing VWC inhibits ER. The partial first derivative of NNNEE shows that VWC slightly limits NEE from non-Shrub areas and significantly reduces it in Shrub areas. The VWC relationships support Zona et al. (2010) who found VWC https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License. explained 70% of the variability in daily peak season NEE and ER and Kittler et al. (2016) who found drier soils increased both NEE after a wet tundra drainage experiment in Siberia. The chamber data supports the inclusion of 310 Shrub% because Shrub ER was significantly lower than the other vegetation classes.

NME Response to Environmental Factors and Vegetation Type
NNNME (r 2 = 0.62) had five factors: Sedge%, Shrub% VWC, TS, VPD, and U. A random forest regression tree (r 2 = 0.51) and a GLM (r 2 = 0.25) trained on the factors selected by NNNME were not as accurate as NNNME. NME was more chaotic and less dependent on any one factor than NEE which is why the NNNME needed an extra factor and had 315 a lower r 2 score. Source area had a significant effect on NME, and it is encouraging that the model contains Sedge% and Shrub% since Sedge and Shrub were the strongest CH4 source and largest footprint component respectively. These two factors can combine to map NME under three general situations: we can extrapolate to Sedge% = 100 % & Shrub% = 0 % or Sedge% = 0 % & Shrub% 100 %, or to footprint climatology ( Table 2). The footprint climatology means that some upland tundra is also included in the estimate, which brings NME down. 320 VWC was the primary climatic driver identified by NNNME. Wetter soils had a consistent positive effect on NME which was strongest when Sedge% was high (Fig 5a & b). Between driest and wettest conditions, estimated NME increased: by an order of magnitude at Sedge% = 100 %, 4-fold at Shrub% = 100%, and from neutral to a source at footprint climatology. The effect of soil moisture on CH4 production and emission has been noted in many other studies (e.g. Zona et al., 2009;Nadeau et al., 2013;Olefeldt et al., 2013). 325 Higher Ts generally had a negative effect on NME (Fig 5c & d). Higher soil temperatures increase the oxidation potential of methanotrophs (Liu et al., 2016;King and Adamsen, 1992), so this result was expected for the drier portions of the basin and upland tundra. However, this wasn't expected for the sedge areas because most studies find NME in sedges is positively correlated to Ts (Olefeldt et al., 2013). There was diurnal cycle in NME with NME peaking in the morning when Ts was at its daily minimum, which supports this finding. NNNME performance improved 330 less with the addition of U indicating the NNNME was near saturation and its effects are less relevant. Higher wind speed had a weak limiting effect on NME when VWC was high and increased NME when VWC was low (not shown).
High winds were mainly associated with two strong storm events. In order to better resolve this relationship a longer dataset would be needed.

Future Trajectories 335
Presently, NEE and ER at Illisarvik are within ranges observed within similar landscape features on the Barrow Peninsula but NME was considerably lower. However, Illisarvik will continue to evolve and the trajectory it takes could significantly alter the carbon balance. Most DTLB on Richards Island and the Tuktoyaktuk Peninsula evolve into wet sedge-moss peatlands (Ovendend, 1986). As do DTLB on the Barrow Peninsula (Lara et al., 2015). This would cause NME to increase significantly. NNNME estimated that at Sedge% = 100% mean NME would be 17.9 340 [CI95% ± 10.6] mg C-CH4 m -2 d -1 , which is comparable to EC observations from a wet sedge DTLB on the Barrow Peninsula (Zona et al., 2009). If the basin becomes wetter and the shrubs are displaced by sedges and grasses net https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License. carbon uptake may increase. NNNEE and NNER estimated that at Shrub% = 0% NEE and ER over the study period would be -1.9 [CI95% ± 0.5] g C-CO2 m -2 d -1 and 1.8 [CI95% ± 1.1] g C-CO2 m -2 d -1 respectively.
However, active maintenance of the basin's outlet channel (C. Burn, personal communication 2016) has artificially 345 lowered soil moisture and potentially limited this transition. This coupled with climate change will promote shrub expansion and Illisarvik could end up more like the shrub dominated DTLB of Old Crow Flats, Yukon (Lantz et al., 2015). NNNEE and NNER estimated that at Shrub% = 100% NEE and ER over the study period would be -0.9 [CI95% ± 1.5] g C-CO2 m -2 d -1 and 2.4 [CI95% ± 1.6] g C-CO2 m -2 d -1 respectively. Other studies have also suggested that shrub expansion could influence NEE by increasing ER (Merbould et al., 2009;Meyer-Smith et al., 2011). NNNME estimated 350 NME would be 13.1 [CI95% ± 9.4] mg C-CH4 m -2 d -1 at Shrub% = 100%, meaning the basin will likely remain a net source of CH4.

Conclusions
This study investigated NEE, ER and NME in the Illisarvik DTLB using EC and chamber data. To our knowledge this is the first such study conducted in a DTLB outside of the Barrow Peninsula. Our observations were generally in 355 agreement with other studies but show how Illisarvik differs from the colder, wetter DTLB on the Barrow Peninsula.
Illisarvik is a carbon sink during the growing season with NME only having a small positive effect on the net carbon balance. Illisarvik NEE was similar to young DTLB on the Barrow Peninsula, while NME was lower due to better drainage, drier conditions, and more diverse vegetation. However, higher NME in early July indicated we likely missed a key period of CH4 emissions earlier in the season. A longer, more comprehensive study would be needed to 360 resolve the annual carbon budget for Illisarvik.
Chamber measurements of ER and NME from different land cover classes within and outside the basin added context to the EC observations. Vegetation class (and associated difference in terrain and soil properties) had only a slight impact on NEE and ER but was one of the dominant controls over NME. Sedge areas were a strong source of CH4, other vegetation types in the basin were weak sources, and upland areas were a net sink. These results suggest that 365 NME in particular is expected to shift as both the terrain and the vegetation of the Illisarvik DTLB continues to evolve.

Appendix A: Neural Networks analysis and uncertainty calculations
Typically, NEE is gap-filled using flux-partitioning algorithms that model ER and GPP separately using TS and PPFD, respectively (e.g. Lee et al., 2017;Aubinet, 2012). However, this method requires night-time observations and thus does not perform well for Arctic summertime measurements due to the limited number of samples available during 370 low light conditions. There are no widely agreed upon functional relationships for gap-filling NME since CH4 production and consumption vary considerably both between different landcover types and environmental conditions. Some methods that have been used include classification and regression trees (CART) (Nadeau et al., 2013;Sachs et al., 2008), general linear models (GLM) (Zona et al., 2009), and mean diurnal variation (Nadeau et al., 2014). We attempted to use a GLM, CART, and random forest regression trees but they were not flexible enough to account for 375 source area variability. https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License.
Neural networks (NN) are flexible machine learning methods that are ideally suited to perform non-linear, multivariate regression. They make no a priori assumptions about the functional relationships between the factors and responses. (Melesse and Hanley, 2005;Desai et al., 2008). NN are universal approximators; given enough hidden nodes a NN is capable of mapping any continuous function to an arbitrary degree of accuracy (Hornik et al., 1991). If all relevant 380 climate and ecosystem information are available to a network, the remaining variability can be attributed to noise in the measurement (Moffat et al., 2010).
NN have been shown to be among the best performing methods for gap-filling NEE data for forest sites in Europe (Moffat et al., 2007). They have also been used to gap-fill NME time series in sub-arctic wetlands, tundra sites, and wet sedge tundra (Dengel et al., 2013). NN have been used to identify and model factors influencing NEE and to 385 partition NEE into ER and GPP (Moffat et al., 2010). NNs have even been used to upscale fluxes from the ecosystem level to the continental scale (Dou and Yang, 2018;Papale et al, 2003).
A NN approximates a true regression function ( ): where ( ) is the target function and ( ) the noise (Khosravi and Nahavandi, 2010 The network approximates ( ) as ( , ) by mapping the relationship between and the target. Here we used feed-forward dense NN with a single hidden layer: is a non-linear transfer function, here we used the rectified linear activation unit (ReLu) (Anders and Korn, 1999). 395 denotes the number of hidden nodes in the network and must be assigned before training. Too many hidden nodes and the NN will overfit the training data, too few and it will underfit. Early stopping will prevent NN from overfitting training sets (Weigend, 1993;Sarle, 1995;Tetko et al., 1995). Therefore, it is more important to ensure a NN has enough hidden nodes to adequately map the target function (Smith, 1994). We set H to a function M, the number of training samples (N), and the number of targets (1): 400 This rule of thumb ensures a NN has sufficient flexibility to approximate the target response. The weights = [ 1 … , 10 … ] are randomly initialized and after each model iteration is updated by backpropagating the error through the network. N denotes the number of observations or targets. The error metric most commonly used is the mean squared error, MSE: 405 The weights are adjusted in the direction that will decrease the error and training continues until a stopping criterion is reached. We chose to set aside 20% of the training data as a test set to be used for early stopping, and terminated training when the MSE of the test set failed to improve for 10 consecutive iterations.
Bootstrapping is used to account for model variability and estimate confidence and prediction intervals by training 410 NN on B different realizations of the dataset, where B is the number of bootstrapped samples, we used B = 30 (Heskes, 1997;Khosravi & Nahavandi, 2010). An individual NN generates point outputs approximating a target function with no information on the confidence in those estimates (Khosravi & Nahavandi, 2010). However, there are usually https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License. multiple ( , ) that approximate ( ) because of the random weight initializations (Weigend & LeBaron, 1994).
As such, there are two sources of error we are concerned with, the accuracy of our estimation of ( ) and the accuracy 415 of our estimates with respect to the target. A confidence interval describes the first (e.g. ( ) − ( , )) while a prediction interval describes the latter (eg. ( ) − ( , )) (Heskes, 1997). By definition, a prediction interval contains the confidence interval because: For b = 1 … B, a random sample with replacement of size p is drawn from the original dataset. Setting p equal to the 420 size of the original dataset yields a set of B training sets each containing approximately 67% of the original dataset.
The 33% leftover from each bootstrap sampled can be used for model validation (Heskes, 1997). The average of our ensemble of networks can then serve as our approximation of F(X): The variance of the model outputs is: 425 A confidence interval (CI) for ( ) can be calculated as ( ) ± (1−∝, ) ( ), where tscore is the students t-score, 1-α is the desired confidence level, and df are the degrees of freedom which are set to the number of bootstrapped samples B. NN performance can be seen to improve with the inclusion of more factors, until the model saturates and becomes over-parametrized ( Figure A1). 430

Data & Code availability
Our data and code are available on github: https://github.com/June-Spaceboots/Illisarvik_CFluxes   https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License. Figure 5: a. Modelled NME response to VWC at different source area fractions and b. the partial first derivatives of NME with respect to VWC. c. Modelled NME response to Ts at different source area fractions and d. the partial first derivatives of NME with respect to Ts. The shaded areas in a & c are 95% confidence intervals and grey circles are the EC observations. https://doi.org/10.5194/bg-2019-477 Preprint. Discussion started: 12 February 2020 c Author(s) 2020. CC BY 4.0 License.    Night-time EC observations (n=100) 1.6 2.8