Dispersal distances and migration rates at the arctic treeline in Siberia – a genetic and simulation-based study

A strong temperature increase in the Arctic is expected to lead to latitudinal treeline shift. This tundra–taiga turnover would cause a positive vegetation–climate feedback due to albedo decrease. However, reliable estimates of tree migration rates are currently lacking due to the complex processes involved in forest establishment, which depend strongly on seed dispersal. We aim to fill this gap using LAVESI, an individual-based and spatially explicit Larix vegetation simulator. LAVESI was designed to simulate plots within homogeneous forests. Here, we improve the implementation of the seed dispersal function via field-based investigations. We inferred the effective seed dispersal distances of a typical open-forest stand on the southern Taymyr Peninsula (northern central Siberia) from genetic parentage analysis using eight nuclear microsatellite markers. The parentage analysis gives effective seed dispersal distances (median ∼ 10 m) close to the seed parents. A comparison between simulated and observed effective seed dispersal distances reveals an overestimation of recruits close to the releasing tree and a shorter dispersal distance generally. We thus adapted our model and used the newly parameterised version to simulate south-to-north transects; a slow-moving treeline front was revealed. The colonisation of the tundra areas was assisted by occasional long-distance seed dispersal events beyond the treeline area. The treeline (∼ 1 tree ha−1) advanced by ∼ 1.6 m yr−1, whereas the forest line (∼ 100 trees ha−1) advanced by only ∼ 0.6 m yr−1. We conclude that the treeline in northern central Siberia currently lags behind the current strong warming and will continue to lag in the near future.


Introduction
Changing climate forces species worldwide to migrate (Arctic Climate Impact Assessment, 2004;IPCC, 2013).This is exceptionally challenging for sessile organisms such as plants, as they may strongly lag behind their moving climate envelope (Harsch et al., 2009;Loarie et al., 2009;Moran and Clark, 2012).Warming is particularly pronounced in the Arctic, where the tundra-taiga ecotone demarks the transition from forest stands to treeless areas, which is expected to move northwards (Harsch et al., 2009;Holtmeier and Broll, 2005).Such tree range expansion is of major interest because the establishment of forest in the dwarf-shrub tundra would reduce the surface albedo and promote a positive feedback to global temperature (Bonan, 2008).
Trees migrate via seed dispersal and face several ecological barriers (Svenning et al., 2014): first, viable seeds need to be produced; second, these need to be dispersed; and, third, seeds need to germinate and survive to grow to new individuals.This process, called "effective seed dispersal" (Connell, 1971;Janzen, 1970), determines the speed and spatial pattern of a species' response to climate change.For example, closely dispersed seeds and a long generation time re-Published by Copernicus Publications on behalf of the European Geosciences Union.
sult in a slow-moving front, while a patchy pattern will form from many long-distance seed dispersal events (Clark, 1998;Nathan and Muller-Landau, 2000;Ritchie and MacDonald, 1986).Migration can speed up if there are relict trees from an earlier wider extent of forest which have survived in refugia ahead of the recent treeline (Stewart and Lister, 2001;Väliranta et al., 2011).
To project future species ranges, the potential migration rate under global warming is estimated via simulation studies (Kaplan and New, 2006;Roberts and Hamann, 2016;Snell and Cowling, 2015).However, these simulations depend strongly on the dispersal configuration of the model (Bhagwat and Willis, 2008;McLachlan et al., 2005;Stewart et al., 2010;Willis and Van Andel, 2004).Most empirical attempts to estimate historical migration rates are based on records of fossil pollen and macrofossils in sediment cores as indicators of species presence (MacDonald et al., 2008;Pisaric et al., 2001), but the interpretation is compromised because of a lack of knowledge of glacial refugia, particularly small "cryptic" refugia that can be easily overlooked in the fossil record (e.g.Petit et al., 2008).Therefore, more reliable estimates of dispersal distances of tree taxa are needed in order to predict the treeline response under high-latitude warming (Snell, 2014;Snell and Cowling, 2015).
Understanding treeline changes on the southern Taymyr Peninsula is of particular relevance, as the area is characterised by a strong warming trend (IPCC, 2013).It represents an ideal study area because the treeline is formed of monospecific tree stands of Larix Mill.taxa and was thus the focus of several treeline studies (IPCC, 2013;Naurzbaev et al., 2002;Sidorova et al., 2010).The response to warming seems to differ with timescale: while millennial-scale warming during the mid-Holocene is reflected by a treeline location 200 km further north on the Taymyr Peninsula (Andreev et al., 2002;Klemm et al., 2016;MacDonald et al., 2008), the decadal-scale ongoing warming generates no response (Niemeyer et al., 2015;Wieczorek et al., 2017), possibly because of low seed availability.
To study the responses and migration dynamics of treeline tree stands under climate change, LAVESI, an individualbased and spatially explicit simulation model for Larix (Kruse et al., 2016;Wieczorek et al., 2017), was developed.In comparison to other dynamic vegetation models, it handles each individual larch tree beginning from a seed to an established seedling until becoming a mature tree, producing seeds itself and thus starting a new generation.This model includes wind-dependent seed dispersal and density-dependent growth and mortality processes.The representation of the full life cycle allows in-detail simulation experiments to unravel the influences of previously overlooked feedbacks (further details in Kruse et al., 2016;Wieczorek et al., 2017).However, the seed dispersal component had not been validated by observations.Traditional methods to track seed dispersal distances include seed traps and seed-bank analyses (Brown et al., 1988;Greene et al., 2004;Stoehr, 2000), which are time-consuming and prone to underestimate distances (Ashley, 2010;Pairon et al., 2006).Fortunately, genetic analyses provide an alternative modern approach.Repetitive sequences in the nuclear genome (short sequence repeats -SSR -or microsatellites) are sufficiently variable genetic markers to resolve parentage (Ashley, 2010;Hartl and Clark, 2007;Schlötterer, 2000).Using such an approach, the dispersal of pollen and seeds in a landscape can be tracked, and effective seed dispersal distances can be inferred (e.g.Pairon et al., 2006;Piotti et al., 2009;Pluess, 2011;Steinitz et al., 2011).For example, microsatellite studies have helped to elucidate the recruitment source of spruce juveniles and the dispersal patterns at an elevational treeline that recently shifted upwards (Piotti et al., 2009).Furthermore, a range expansion of larch following a glacier retreat could be tracked without a decrease in genetic diversity (Pluess, 2011).Genetic analyses can thus be used to provide a more realistic implementation of seed dispersal in simulation models.
With this study, we aim at improving seed dispersal and establishment processes in the simulation model LAVESI to make it applicable for simulating treeline migration rates.Therefore, we undertook a genetic parentage analysis of a treeline stand on the southern Taymyr Peninsula by applying an assay of eight nuclear microsatellites to get a reliable estimate of the effective seed dispersal distance (1).This information was used to improve the individual-based model LAVESI (2), which we then ran to simulate treeline advances into the tundra and estimate migration rates (3).

Sample collection
Needle samples from larch individuals (Larix gmelinii (Rupr.)Rupr.) were collected from a tree stand during fieldwork in the summer of 2013 on the southern Taymyr Peninsula, Krasnoyarsk region, in northern-central Siberia (plot name: TY04VI; 72.409 • N and 105.448 • E; Fig. 1).The open canopy forest stand with ∼ 300 trees ha −1 belongs to the forest tundra and has shown enhanced recent recruitment (site code FTe_1; Wieczorek et al., 2017).We sampled all individuals > 0.4 m in height in a 20 × 20 m area as well as all trees > 2 m high or bearing cones from the surrounding 100 × 100 m area (Fig. 3).Additionally, in the central 12 × 12 m area, individuals < 0.4 m were collected.Larch individuals from the 20 × 20 m plot were accurately mapped with a tape measure, while a standard GPS device (Garmin) was used to map the individuals in the 100 × 100 m area.We recorded the height of each individual and collected short twigs with needles and dried them in the field on silica gel.

Genotyping of individuals
Genomic DNA was extracted from 50 to 100 mg of dried needles after shock freezing them in liquid nitrogen and Figure 1.Overview of the larch forests (Larix gmelinii) growing at study site TY04VI on the southern Taymyr Peninsula and the sampling scheme.The green circumartic line on the maps marks the modern treeline (Walker et al., 2005).Topography in the enlarged area ranges between 1 and 2521 m (WorldClim 1.4; Hijmans et al., 2005).Rivers and lakes are given in blue colours (GSSHS updated version 2.2.2, 1 January 2013, first published by Wessel and Smith, 1996).Photo is from Stefan Kruse (16 July 2013).
grinding them in a FastPrep ® (MP Biomedicals) device with steel beads using the silica-column-based extraction kit Invisorb ® Spin Plant Mini Kit (Stratec Molecular).Subsequently, three multiplex polymerase chain reactions (PCRs; 10 µL) were set up, including all eight primer pairs used in this study (bcLK211, bcLK253, Ld101, bcLK056, bcLK228, bcLK263, and bcLK189;Isoda and Watanabe 2006;and Ld101, Ld42 and Ld56;Wagner et al., 2012; further details in Sect.S1 and Table S1 in the Supplement) using the Multiplex PCR Master Kit (QIAGEN).Fragment length determination was done by Source BioScience (Oxford, UK).Allele sizes were scored in Geneious (version 7.1.5,Biomatters Ltd.) using the Microsatellite Plugin (version 1.4.0).Raw allelic data were imported into R version 3.2.2(R Core Team, 2015), and peaks were binned to step sizes of two base pairs.The dataset was converted to the "genind" format using the package "adegenet" (Version 1.4-2; Jombart, 2008;Jombart and Ahmed, 2011) for subsequent analyses.Details on the microsatellite primer selection, PCR protocol, and subsequent data analysis, including binning of allele frequent lengths, were described in Kruse et al. (2018a) and are available online at https://doi.org/10.1594/PANGAEA.885765.

Parentage analyses based on microsatellite data
The effective seed dispersal function was estimated from the results of a parentage analysis.We used eight highly diverse nuclear loci; five of which were sufficiently informative to distinguish all individuals, as is mandatory for parental assignment studies (Fig. S2).We determined parents from allele frequency data with a likelihood-based approach implemented in CERVUS version 3.0.7 (Kalinowski et al., 2007).During the analyses, we allowed for a 1 % error in genotyping and a minimum of seven loci typed in the final analysis.All individuals (612 in total) were analysed, and we searched for parents of recruits (height < 2 m) from all potential tree individuals (height > 0.4 m).Following the program documentation, we simulated the heritage in CERVUS for 10 000 seeds with a 10 % chance of a parent sampled and 1 % error (Marshall et al., 1998;Slate et al., 2000) to determine thresholds for the "log of the overall likelihood ratio" (LOD) scores in this analysis.Only those with positive assignments and a high confidence level exceeding 95 % were retained for further analyses (Fig. 3b).However, we kept assignments where the parent pair fell below the high-confidence LOD threshold when one of the two parents could be assigned with high confidence.Assignments to identical genotypes were excluded from subsequent analyses (Fig. 3a).The goodness of the assignment was calculated as the mean of the exclusion probability over all tested offspring (which is one minus the non-exclusion probability calculated in CERVUS).Finally, we distinguished the more distant parent as the pollen donator (father) and the other as the seed source (mother) and all single-parent cases as seed dispersal events, following Dow and Ashley (1996).We ran simulations with the individual-based and spatially explicit model LAVESI (Kruse et al., 2016;Wieczorek et al., 2017).This model simulates the life cycle of larch individuals from seeds to mature trees and was parameterised for Larix gmelinii.On a simulation area of user-specified size, individuals grow where seeds settle and germinate, and competition among individuals is handled by a fine sub-grid of cells with an area of 20 × 20 cm.The model was established to improve our understanding of past and future treeline displacements under changing climate.The relevant processes (growth, seed production and dispersal, establishment, and mortality) are incorporated as submodules which are parameterised on the basis of field evidence (Wieczorek et al., 2017), complemented with data from literature.Seed dispersal into the environment is estimated by a Gaussian and fat-tailed dispersal function.With this and the detailed representation of competition that the model realistically simulates, similar to Janzen's (1970) andConnell's findings (1971), recruits have the highest chance of surviving at intermediate distances to the producing tree, not directly at it.Fine-tuning the model parameters of involved processes, which includes the impact strength that competition has on smaller trees, allows adapting the effective seed dispersal distance.Simulation runs proceed in yearly time steps and are forced by monthly temperature and precipitation time series.The original model of Kruse et al. (2016) was updated with the following processes (details in Sect.S2): (i) seed dispersal distances now depend on species-specific traits (tree height and seed properties) and wind speed and direction (Kruse et al., 2018b), (ii) the tree diameter growth function is newly calibrated to the climate forcing (Epp et al., 2018), and (iii) the active-layer thaw depth directly influences the tree's growth, which is used to estimate its seed production and mortality.Following these updates, the parameter set-tings of the original model were revised to simulate stands comparable to the site TY04VI (Kruse et al., 2018b).

Tuning the dispersal process in the model
We performed model runs to simulate larch stands in 100 × 100 m areas with closed boundaries.This means that seeds which leave the area on one side enter the field from the other side.To tune the model's processes in order to capture the observed effective seed dispersal distribution, we tested several combinations of model parameters and introduced new variables into formulae used in the program code of the model (details in Sect.S2 and in Tables S4 and S5).Each simulation begins with a 2000-year spin-up phase followed by an 80-year experimental phase (1934( -2013 CE) CE).
Climate forcing data were derived from the CRU TS 3.22 gridded data (0.5 • resolution; Harris et al., 2014).We calculated the distance-to-centre weighted mean of all monthly temperature and precipitation values of the weather data of the TY04VI grid cell and its eight surrounding grid cells.Wind data from May to August were extracted from the ERA-Interim reanalysis data (1979( -2012 CE; CE;Bromwich et al., 2016).During the spin-up phase, weather data for each year were randomly sampled from the years 1939-2008 CE to allow the tree stand to reach a quasi-stable state.The period excludes a 5-year margin at the beginning and end of the climate observations available from the Khatanga station (see Kruse et al., 2016, for further information).For years not covered by wind data, the model randomly selected a year from the available wind data series.Simulations were run for 10 repeats, and the outcomes of all individuals present in the final simulation year of 2013 CE were recorded.
The distance from the seed source tree for each established individual, i.e. the effective seed dispersal, was inferred from the simulation results.We resampled these simulated distances to consider the same frequency of observed parenthood in the central 20 × 20 m as in the surrounding 100 × 100 m area (sampling scheme details in Sect.2.1 sample collection).We included only simulations which had at least 10 individuals present in the central 20 × 20 m area in further analyses.Distances were binned to classes of 1 m steps.We evaluated the simulation results by calculating Pearson's product moment correlation coefficient between simulated and observed dispersal distances.Furthermore, we reconstructed the proportion of on-site reproductive success in the final year as the ratio between on-site recruitment and all recruits.The differences between simulated and observed on-site recruitment ratios were tested by one-sided Student's t tests.

Simulation experiments to depict migration rate
The best model that resulted from the tuning process was used to simulate larch migration in a hypothetical area of 1000 m (east-west) by 5000 m (north-south).Tree growth was initialised during the first 100 years by introducing 20 000 seeds yr −1 into the southernmost 100 m area.This setting was run under two contrasting climate scenarios: first, with homogeneous temperature and precipitation forcing named "EvenClim" and second, with linearly decreasing temperatures from south to north -mimicking the real southto-north climate gradient -named "GradClim".The gradients of mean annual temperature and annual precipitation are described as being −6.24 × 10 −6 • C m −1 and +3.26 × 10 −6 mm (year m) −1 , respectively, in a northward direction starting at TY04VI, as inferred from an analysis of globally interpolated monthly climate data for 1960-1990(Hijmans et al., 2005)).Simulations under both scenarios were repeated 10 times and run for 2000 years using climate series from random years out of the available period 1939-2008 CE.Data for each established tree individual were collected every 10 years of the simulation.
We analysed stand densities for the entire simulated area.The number of trees (> 2 m) was calculated from 100 × 100 m plots (= 1 ha) by iteration over the entire area in 50 m steps (x and y).To reduce the errors introduced by the strict boundary conditions at the edges of the hypothetical area, the outer 100 m borders were excluded.The advance of larch stands into tundra was estimated from forest density by mean number of trees on east-west transects.We defined two relevant thresholds of tree densities (see also Fig. 2): (i) the "treeline" when the mean tree density fell below 1 tree ha −1 and (ii) the "forest line" when mean tree density fell below 100 trees ha −1 .These thresholds are in accordance with observations in treeline areas in Siberia (Kharuk et al., 2006;Montesano et al., 2016;Wieczorek et al., 2017).The migration rates of these thresholds in metres advanced per year were calculated as the slope of a linear model describing the position of the treeline or forest line as a function of simulation time.The rates were tested with a two-sample t test for significant differences.Analyses were performed in R version 3.2.2(R Core Team, 2015).
The eight chosen microsatellite loci were highly polymorphic and varied from 11 to 41 different alleles, with only 0.49 % missing alleles in total (Figs.S1, S4).The information content reached a plateau at four loci which could separate 597 ± 2 individuals (> 99 %), and the power increased slightly towards 600±1 separated individuals with seven loci (Fig. S2).Accordingly, we included all eight loci in the sub-  sequent analyses to separate all individuals.In total, 601 sampled trees could be distinguished, and 22 individuals were identified as 10 clonal groups, of which 11 were subsequently excluded from further analyses (Fig. 3a).The maximum distance between two individuals within these groups was 30 m but was mostly < 5 m (Fig. 3a).
In the parenthood analyses, we aimed to find the parents for 266 individuals (< 2 m) from the remaining 464 individ- uals (> 0.4 m).The exclusion probability for one parent was ∼ 99.9910 ± 0.0172 % and was 100 % for a pair of parents in the used assay.A single parent or both parents were assigned for 151 individuals with a high confidence (> 95 %; LOD threshold for the parent pair of 15.13 and for a single parent 5.76; examples in Fig. 4).This is ∼ 53 % on-site recruitment in respect to all tested offspring.Among these, in 49 cases we found both parents (18.4 %), and for 92 cases only a single parent (34.6 %) was assigned.One of the largest trees (H = 7 m) was assigned to 8 % of the recruits (Fig. 4).Trees with many assignments are generally larger than those with few (Fig. S5).Mostly, parental trees exceeded their offspring's height, but in 3.7 % cases recruits were higher than their assigned parents.
We identified 150 effective seed dispersal distances when assuming that the closest parent is the seed source when two parents were assigned or only a single parent was identified.The observed mean distance of effective seed dispersal is ∼ 15.0 m (median of ∼ 9.8 m), with a minimum of 0.8 m and a maximum of 56.1 m (Fig. 5).

The adapted effective seed dispersal in LAVESI
In the original model (Kruse et al., 2016;Wieczorek et al., 2017), effective seed dispersal distances follow a rightskewed distribution (Fig. 5).Mean dispersal for this model is 13.5 m (median 9.98 m), with a minimum of 0.6 m and a maximum of 60.3 m.In general, it captures the observed data, but it overemphasises the recruits close to the mother tree at distances of 1-3 m.Furthermore, the simulated maximum probability peak between distances of ∼ 4-7 m is roughly 3 m closer to the seed source than observed.Also, the observed tail approached zero probability faster at the far distances compared to the observed effective distances.These three deviations were used to guide the model adaptation.
The best-fitting model has effective dispersal distances that match well to the observed distances (r = 0.93) and has the smallest sum of residuals (0.0063) compared to the other parameter sets (0.0066-0.1623).It is driven with a combination of parameters which increase the dispersal (S dist = 1, r GaussExpDisp = 1.0, and d GaussCentre = 4.0) and seed production rate (f s = 11) compared to the original model (parameter set "I"; Figs.S7, S8, and Table S5).The resulting mean dispersal distance is 12.3 m (median 8.85 m), with a range of 2.7-71.1 m (Fig. 5).

Simulating migration dynamics in the tundra-taiga transition zone
Simulations were run for a hypothetical 5000 m long southnorth transect, initialised by introducing seeds to the southern 100 m wide area.In the homogeneous climate scenario, EvenClim, single trees spread up to ∼ 3600 m during the  than at the beginning (0-1000 years).The migration rates of the forest line were roughly half in the GradClim scenario.

Discussion
4.1 Seed dispersal distances inferred from genetic heritage analysis Our assay of eight highly polymorphic microsatellites distinguished all 601 genotyped individuals, allowing us to infer the local recruitment pattern of a tree stand and, from that, the effective seed dispersal distances.In this analysis, we needed to exclude the observed clonal groups that are consequences of exceptional reproduction.We are confident that these are true observations of clones, as we minimised the chance of full sibs sharing the same genotype by using highly polymorphic nuclear microsatellites that are not in linkage disequilibrium (Kruse et al., 2018).Nevertheless, we cannot rule out that selfing or back crossing have occurred that could yield to offspring being genetically identical to one of the parents.If those modes of inheritance regularly occur and would have caused a misidentification of full siblings as clones, we would expect to observe an continuously increasing number of transitional states from identical genotypes (zero different alleles) to sharing 50 % of their alleles (eight different alleles).However, it sharply drops from the identified clonal groups to a very low value and increases again beginning at three to four differences (Fig. S6).This gives us the confidence to classify such identical individuals as clones.An explanation for these could be that windthrown trees can survive in non-favourable conditions producing horizontal branches rather than upright stems forming krummholtz (own observations; Wieczorek et al., 2017).By producing adventitious roots from branches touching the ground (Kajimoto, 2010;Cooper, 1911) and subsequent separation of the main stem or horizontal branch, two genetically identical individuals can be found if both parts survive.We were able to identify at least one parent for a majority of the offspring in the parentage analysis (53 % : 18.4 % for both parents and 34.6 % for one parent), even though only those cases with a high degree of confidence (> 95 %) were regarded and an area of only 100 × 100 m was analysed.Unfortunately, the labour-intensive sample collection and genetic analyses restricted the analysis to a rather small area in comparison to the large area of the treeline transition zone.Assessing the parentage across a broader scale and for different positions in the treeline ecotone would further help in understanding dispersal dynamics at the treeline, but the additional knowledge gain does not scale with effort.The observed effective seed dispersal ranged between ∼ 1 and 56 m (median: 10 m).This aligns with the short dispersal distances generally reported for larches.For example, it was found that most seeds (94 %) fell within 18 m of the releasing trees in a study of Larix laricina in the northern USA (Dun-can, 1954).This result, however, is not directly comparable to effective seed dispersal, because all dispersed seeds are included in the estimation and not only those which germinated and were established as a new individual.Effective seed dispersal distances of 2-48 m were found in dense forests of Larix decidua in the Swiss Alps using an approach similar to ours (Pluess, 2011).Higher effective seed dispersal distances have been observed, however, for other wind-dispersed tree taxa (e.g.27-58 m for Pinus pinaster in González-Martínez et al., 2002;39-833 m for Picea in Piotti et al., 2009).One explanation for the observed differences might be the density of the tree stands because established trees reduce the wind speed.In our tree stand, which is comparatively denser than those studied above, shorter dispersal distances are more likely than in open areas (Antonovics and Levin, 2016).Furthermore, dispersal is dependent on the release height (Matlack, 1987), which in our stand was rather low due to the shortness of the trees (mean: ∼ 4.5 m, max: 9 m).Most cones occurred on branches at roughly half the tree's height (see Fig. 1), as is typical for open stands.
The observed amount of on-site recruitment is high compared to other studies (Piotti et al., 2009;Pluess, 2011) but lower than that found in orchards (Funda et al., 2008).We found both parents for one-fifth of the offspring within the analysed area (compared to only 11.1 % in Piotti et al., 2009).As a result, recruits effectively immigrated at a rate of 47 % from the exterior of the analysed 100 × 100 m area, which is similar to a study of Picea abies in the Italian Alps (43.3 %; Piotti et al., 2009).
The parenthood in our investigated site was dominated by a few relatively tall trees.For example, a 6 m tall tree generated 8 % (17 cases) of all identified recruits.This observation is reasonable, as the chance of producing viable offspring increases with age and size, and, once a tree is well established, its seeds are released into the local environment.Other studies make similar observations.For example, Dow and Ashley (1996) found that Quercus saplings often established close to the releasing tree, and the majority of the offspring were assigned to 4 out of 62 mature trees.Schnabel et al. (1998) observed that three Gleditsia trees produced 58 % of the offspring, and Piotti et al. (2009) found that six local adults produced ∼ 62 % of Picea juveniles.
Overall, our results indicate that incorporating individual seed dispersal (such as implemented in LAVESI) rather than introducing a certain seed sum needs to be implemented in models to realistically model tree migration processes.
We were unfortunately unable to distinguish between fatherhood and motherhood using nuclear inherited markers (DeVerno et al., 1993;Szmidt et al., 1987).It is a valid criticism that we simply assume that single-parent assignments represent seed dispersal events (following Dow and Ashley, 1996;Moran and Clark, 2012;Piotti et al., 2009).In the extremely unlikely case that the more distant parent was instead the seed source (results not shown), the effective dispersal distance would increase to a median of ∼ 23 m.This would lead to a further decrease in on-site recruitment, which is already slightly underestimated.Furthermore, our approach risks assigning missing parents to extra-site recruitment if the local parents have died, leading to an overemphasis of the fat tail.We consider this risk to be low in our analysed tree stand, as we found only a few dead trees or saplings within the 100 × 100 m area, and they were already largely decayed (Wieczorek et al., 2017).

Genetic-model comparison and model adaptation
Using observations of parentage from a treeline stand of Larix gmelinii we improved the seed dispersal function in LAVESI so that it will better represent larch migration.
The original dispersal function modelled seed distribution using a simple Gaussian density function with a fat tail (Kruse et al., 2016), as is implemented in a number of models (e.g.Levin et al., 2003;Snell, 2014), but, in contrast to most other models, dispersal in LAVESI is related to wind speed and direction (Kruse et al., 2018b).The most realistic simulation results are achieved via a combination of parameter adjustments, i.e. shifting the implemented Gaussian distribution term 4 m away from the centre, increasing the factor scaling the distance by roughly 6 times the original value and increasing the influence of the Gaussian term twice (model "I" in Table S5).With these adjustments, the simulated effective dispersal distance aligns fairly well with the observed values.The new function slightly overestimates dispersal and therefore allows ∼ 7 % more recruits to immigrate into the plot.This discrepancy, however, might also be an artefact related to the shortcomings of the genetic parentage analyses.Regardless, with the dispersal function parameterised to the observed effective seed dispersal, simulations are now more realistic than with the original version of the model (Kruse et al., 2016).

Treeline migration rates
We performed simulation experiments with the best-fitting model to estimate the potential migration rate of the treeline on the southern Taymyr Peninsula.Under the scenario of even temperature and precipitation (EvenClim), the northward migration rate of the forest line is ∼ 0.6 m year −1 and is ∼ 1.6 m year −1 for the treeline.Under the more realistic climate gradient scenario of northward decreasing temperature and slightly increasing precipitation, an even slower advancing treeline and forest line is implied.Overall, we find an astonishingly low migration rate, even though the best-fitting model slightly overestimates immigration at the stand level.Our simulations may still be conservative, as we cannot rule out that the dispersal function underestimates far distance dispersal at the same time as overestimating intermediate distance dispersal.Nevertheless, the slow recruitment ahead of the treeline is in accordance with field observations at northernmost single-tree stands in the tundra, which show creep-ing growth forms (krummholtz) and no apparent recruitment (Wieczorek et al., 2017).Our estimated migration rate is quite slow compared to the observed spread of larch individuals into the tundra by 3-11 m year −1 , as mapped by Kharuk et al. (2006).However, the stand that Kharuk et al. (2006) investigated is an exceptional open-forest island close to a river ahead of the modern forest line where winds might be stronger, leading to higher dispersal rates (Antonovics and Levin, 2016;Duncan, 1954).Another field-based study reports a treeline expansion of 50 m year −1 in arctic Alaska (Lloyd, 2005), whereas an elevational range shift for larch in the Polar Urals of 20-60 m during the last century is reported by Devi et al. (2008) as well as a general upward shift of 20-50 m between 1910 and 2000 of open forest in this mountainous area (Shiyatov et al., 2005;Shiyatov and Mazepa, 2012).During the Holocene Thermal Maximum, boreal forests expanded on the Taymyr Peninsula to their northernmost position during the Holocene, which was likely assisted by glacial refugial populations ahead of the treeline (MacDonald et al., 2000(MacDonald et al., , 2008)).The treeline responded with a centennial lag to environmental improvement, for example solar insolation and reached its maximum position at ∼ 8000 to 4000 yr BP, and subsequently declined to reach its modern limits around 3000 yr BP (MacDonald et al., 2000).Recently, global warming has been ameliorating conditions for Larix forests in Siberia, and evidence can be found that treeline stands are starting to respond, but at a slower rate than one might expect given the strong increase in temperatures (Wieczorek et al., 2017;Harsch et al., 2009).A possible explanation for the slow advance may be because we report the advance of a forest line rather than single trees.Furthermore, we analysed only one tree stand, and effective dispersal rates will likely differ among sites depending on a variety of abiotic or biotic factors (Moran and Clark, 2012).The actual dispersal distance depends on stand density, amongst others, as more trees reduce the wind speed (Antonovics and Levin, 2016), and establishment will be affected by local density-dependent mortality due to seed predation close to the releasing tree (Janzen-Conell effect;Janzen, 1970, andConell, 1971).Furthermore, the probability of seeds surviving and forming a seed bank and the survival rates of seedlings strongly determine the colonisation speed.This is linked to the availability of microsites where seedlings benefit from shelter, thus lowering their mortality rates (e.g.Resler et al., 2005;Maher et al., 2006;Germino et al., 2011).These effects are not explicitly simulated but are implicitly taken into account by our model parameterisation (Kruse et al., 2016).Migration corridors along rivers are not taken into account, but they likely assist colonisation in these landscapes because of deeper active-layer depths close to the rivers and also from downstream seed dispersal (Neilson et al., 2005;Wieczorek et al., 2017).Nevertheless, the positive impact of an increased survivorship on migration rates can be observed in our migration simulation experiments.

S. Kruse et al.: Dispersal distances and migration rates at the arctic treeline in Siberia
The mortality rate ahead of the treeline is lower under homogeneous climate than in the linearly decreasing climate gradient scenario, with the consequence that the migration enters the exponential phase earlier (Figs.6 and 7).In addition, we based our model adaptations on an area that is only one hectare in size, and with this we cannot directly assess the long-distance seed dispersal to which our implemented kernel should be fit.To account for these cases, we implemented a Gaussian dispersal kernel combined with an exponential shaped with a fat tail (Kruse et al., 2016).In this study, this allows seeds to be dispersed to far distances and led to a higher immigration into the simulated forest plot than observed.In consequence, the simulated migration rate tends to be overestimated.
This comprehensive study from genetic analyses to a model application is a first attempt at showing the importance of undertaking these timely model parameterisation studies and should be enhanced by, for example, inferring the parentage for other positions in the treeline ecotone on the southern Taymyr Peninsula.
Our results show that small uncertainties in the implementation of dispersal in a model impact the timing and shape of the simulated tundra colonisation.This is in accordance with a simulated lag in the vegetation response to climate change, when seed dispersal in a global dynamic vegetation model is constrained rather than using the usual unlimited seed-bank approach (Snell, 2014;Snell and Cowling, 2015).However, further processes on smaller scales can constrain the response of tree stands and should therefore not be neglected in simulation studies: an advancing front is shaped by short-distance dispersal and spatially explicit processes, such as competition between individuals.A simulation model with spatially explicit seed dispersal combined with a representation of small-scale population processes helps to give realistic estimates on the migration rates.We have demonstrated that the LAVESI model allows a realistic implementation and parameterisation of dispersal processes.
In summary, our results suggest that the current climate change will lead to a lagged response by decades to centuries.In particular, the first step of migration will be slow, although the subsequent infilling could be rapid.It seems likely, therefore, that recent strong warming will cause a highly nonlinear response in forest and treeline advance.

Conclusions
We parameterised and applied the individual-based model and spatially explicit LAVESI to estimate migration rates of the treeline and forest line advance under current climate conditions.First, we inferred the effective seed dispersal distance from a genetic parentage analysis based on nuclear microsatellites and second, we improved the dispersal process of the model according to the observed dispersal pattern.
In our genetic analyses, we found a genetically diverse tree population at a location within the treeline close to the tundra in Siberia.The parentage analysis revealed that the majority of recruits (∼ 60 %) have a local origin.Knowing the positions of the parent trees, we could estimate the effective seed dispersal distances between parent and offspring, which are mostly short (∼ 10 m), although longer distances (up to ∼ 60 m) are possible.Simulations with the adapted LAVESI model improved our knowledge of the likely treeline migration response.The rate is surprisingly slow, being just a few metres northwards per year.To find out if the estimated slow migration is an outlier coming from overfitting to only one study site or the general response rate under current warming, further similar studies at other treeline positions would be necessary.The simulated migration pattern also showed that occasional long-distance seed dispersal events far beyond the treeline area assisted the colonisation of the tundra.Our migration rate estimates are in the lower range of those observed and are significantly slower than those inferred from palaeoecological studies or from simulated vegetation responses to climate change in dynamic global vegetation models.These findings indicate that the treeline in northern central Siberia will lag behind the recent strong warming (which is moving by ∼ 1000 m yr −1 ; Loarie et al., 2009), but if isolated trees occasionally establish in the tundra, they could become nuclei for a rapid colonisation of the tundra.Should this rapid colonisation occur, the albedo of these populated tundra areas will reduce, and thus a positive feedback to climate warming will follow the lagged response of tundra-taiga transition.Such a scenario could be run in a large-scale simulation experiment using the improved version of the LAVESI model in an attempt to learn more about the impacts of such a vegetation-climate feedback in the upcoming decades.

2. 4
Simulation study 2.4.1 The individual-based and spatially explicit Larix model LAVESI

Figure 2 .
Figure 2. Simulated treeline advances along a hypothetical southnorth transect across the modern forest line.Simulations were initialised with seeds in the southernmost 100 m area and run for 2000 years.In the beginning, trees established beyond the treeline (mean density falling below 1 tree ha −1 ) and formed "single tree" stands in the tundra, which then acted as nuclei for further range expansion, so that in following years the treeline and the forest line (mean density falling below 100 trees ha −1 ) could advance northwards, forming open forest.

Figure 3 .
Figure 3. Map of genotyped Larix gmelinii individuals at site TY04VI; individuals sharing the same genotype (clones) are marked by filled points of the same colour in (a), and local recruits are marked by filled bright circles, whereas parents by filled black circles, in (b).

Figure 4 .
Figure 4. Map of assigned relationships between parents (filled black circles) and their offspring (filled bright circles) of Larix gmelinii individuals.Additional genotyped individuals are given as open circles.Note the non-linear scale of the coordinates.

Figure 5 .
Figure 5. Seed dispersal of Larix gmelinii as simulated with the original simulation model LAVESI (black line surrounded by a grey band showing the 95 % confidence interval) and the adapted model version (yellow line and band showing the 95 % confidence interval) compared to observed effective seed dispersal (green thick line).Dispersal distances were binned to 1 m classes.

Figure 6 .
Figure 6.Simulated migration rates of the forest line (mean density falling below 100 trees ha −1 ) and the treeline (mean density falling below 1 tree ha −1 ) estimated from the best-fitting model.The simulations were forced by two contrasting climate scenarios, either homogeneous temperature across the area (EvenClim: grey shading) or linearly decreasing temperature from south to north (GradClim: darker blue shading).

Figure 7 .
Figure 7. Simulations along a hypothetical transect at the tundra-taiga transition reveal the northward advance of (a) the forest line (mean density falling below 100 trees ha −1 ) and (b) treeline (mean density falling below 1 tree ha −1 ).Simulations were forced by two contrasting temperature scenarios (homogeneous temperature EvenClim -grey shading -and northward linearly decreasing temperature and precipitation GradClim -darker blue shading).Shaded areas give the 99 % and 90 % confidence intervals around the mean value of 10 simulation repeats.