Natural disturbances are the dominant form of forest regeneration and dynamics in unmanaged tropical forests. Monitoring the size distribution of treefall gaps is important to better understand and predict the carbon budget in response to land use and other global changes. In this study, we model the size frequency distribution of natural canopy gaps with a discrete power law distribution. We use a Bayesian framework to introduce and test, using Markov chain Monte Carlo and Kuo–Mallick algorithms, the effect of local physical environment on gap size distribution. We apply our methodological framework to an original light detection and ranging dataset in which natural forest gaps were delineated over 30 000 ha of unmanaged forest. We highlight strong links between gap size distribution and environment, primarily hydrological conditions and topography, with large gaps being more frequent on floodplains and in wind-exposed areas. In the future, we plan to apply our methodological framework at a larger scale using satellite data. Additionally, although gap size distribution variation is clearly under environmental control, variation in gap size distribution over time should be tested against climate variability.
Natural disturbances caused by forest gaps play an important role in tropical
rainforest dynamics. Canopy gaps caused by the death of one or more trees are
the dominant form of forest regeneration because the creation of canopy
openings continuously reshapes forest structure as gaps are filled with
younger trees
Many studies have investigated the effect of treefall gaps on biodiversity,
particularly animal communities
Airborne light detection and ranging (lidar)
platforms therefore offer a solution to this problem. Recent developments in
lidar have significantly advanced our ability to derive accurate measurements
of canopy forest structure, to detect gaps, and to assess the effect of
spatial and temporal variation on carbon balance
In the present study, we use a digital canopy model (DCM) derived from airborne lidar across a 30 000 ha tropical forest landscape in the Régina forest in French Guiana. This approach provides high-resolution maps of canopy gaps and helps us to understand the environmental determinants of gap occurrence in tropical forests. Our specific aims were therefore
(1) to define canopy gaps from canopy height data using a probabilistic
approach; (2) to model gap size distribution by inferring a likelihood-explicit discrete power law distribution in a Bayesian
framework; and (3) to introduce the environment into the scaling parameter of the power law distribution and test its predictive
ability.
The study site is located in the Régina forest (4
Lidar data were acquired by aircraft in 2013 over 30 000 ha of forest by a
private contractor, Altoa (
The DCM was derived from the raw point cloud consisting of the pooled dataset from the two acquisitions. Raw data points were first processed to extract ground points using the TerraScan (TerraSolid, Helsinki) ground routine, which classifies ground points by iteratively building a triangulated surface model. Ground points typically made up less than 1 % of the total number of return pulses. The DCM has a resolution of 1 m. To avoid delineating “false” gaps due to river beds, we removed areas within a 20 m buffer of all natural river shorelines. Additionally, a 25 m buffer was applied to exclude anthropogenic tracks.
We use six environmental variables to synthesize the observed environmental
gradients. All variables were computed from a lidar digital terrain model
(DTM) with 5 m
The slope was derived from the lidar DTM. Slope was computed at a grid cell as the maximum rate of change in elevation from that cell to its eight neighboring cells over the distance between them.
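As an illustration of this slope definition, the following minimal Python sketch computes the maximum rate of elevation change between a cell and its eight neighbors. It is not the GIS routine used in the study. The function name `max_slope`, the toy grid, the default 5 m cell size, the use of absolute elevation differences, and the handling of edge cells are all assumptions made for the example.

```python
import math

def max_slope(dtm, row, col, cell_size=5.0):
    """Maximum absolute rise-over-run between a cell and its eight neighbors.

    dtm is a list of rows of elevations (m); cell_size is the grid resolution (m).
    Neighbors falling outside the grid are skipped (an assumption for edge cells).
    """
    n_rows, n_cols = len(dtm), len(dtm[0])
    best = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the focal cell itself
            r, c = row + dr, col + dc
            if not (0 <= r < n_rows and 0 <= c < n_cols):
                continue
            # Diagonal neighbors are sqrt(2) cell sizes away from the focal cell.
            distance = cell_size * math.hypot(dr, dc)
            rate = abs(dtm[r][c] - dtm[row][col]) / distance
            best = max(best, rate)
    return best

# Hypothetical 3 x 3 elevation grid (m) with one raised neighbor.
toy_dtm = [
    [10.0, 10.0, 10.0],
    [10.0, 10.0, 15.0],
    [10.0, 10.0, 10.0],
]
slope_ratio = max_slope(toy_dtm, 1, 1)                 # rise over run
slope_degrees = math.degrees(math.atan(slope_ratio))   # same slope as an angle
```

Dividing by the actual distance to each neighbor, rather than by the cell size alone, keeps diagonal neighbors from being over-weighted.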
We use the TOPographic EXposure (TOPEX) index to measure topographic
exposure to wind
Drained area (DA) measures the surface area of the hydraulic basin that drains through a cell. A low value indicates that a cell is located at the border between two basins, whereas high values indicate cells located downstream.
The hydraulic altitude (HA) of each cell, its altitude above the closest stream of its hydraulic basin, was computed from the third-order hydraulic system. Low values, including 0, indicate that the forest plot is potentially temporarily flooded, whereas high values indicate that it is located on a hilltop.
The terrain ruggedness index (TRI) captures the difference between flat and
mountainous landscapes. TRI was calculated using SAGA GIS
The height above the nearest drainage (HAND) model normalizes topography with
respect to the drainage network by applying two procedures to the DTM. The
initial basis for the HAND model came from the definition of a drainage
channel: perennial streamflow occurs at the surface, where the soil substrate
is permanently saturated. It follows that the terrain at and around a flowing
stream must be permanently saturated, independently of the height above sea
level at which the channel occurs. Streamflow indicates the localized
occurrence of homogeneously saturated soils across the landscape. The second
basis for the HAND model came from the distinctive physical features of water
circulation. Land flows proceed from the land to the sea in two phases: in
restrained flows at the hillslope surface and subsurface, and in freer flows
(or discharge) along defined natural channels
To identify discrete canopy gaps, we had to choose a gap threshold height.
Some authors define this threshold at 2 m
In our study, we define the minimum area of a gap as
The statistical analyses were performed in R
We use a Kolmogorov–Smirnov (KS) distance criterion in order to determine the error
between the observed distribution and the Pareto distribution. KS is defined as the
maximum distance between the cumulative distribution functions (CDFs) of the data and
the fitted function
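The KS distance can be illustrated with a short Python sketch for integer-valued gap sizes. Here `fitted_cdf` stands for the CDF of whichever fitted distribution is being tested. The function name, the toy sizes, and the toy uniform CDF (used only to keep the arithmetic easy to check) are hypothetical.

```python
from bisect import bisect_right

def ks_distance(sizes, fitted_cdf, x_min):
    """Largest absolute gap between the empirical and fitted CDFs.

    Both CDFs are step functions on the integers, so it is enough to
    compare them at every integer from x_min to the largest observed size.
    """
    ordered = sorted(s for s in sizes if s >= x_min)
    n = len(ordered)
    distance = 0.0
    for x in range(x_min, ordered[-1] + 1):
        empirical = bisect_right(ordered, x) / n  # share of observations <= x
        distance = max(distance, abs(empirical - fitted_cdf(x)))
    return distance

# Toy check against a uniform distribution on {1, 2, 3, 4}.
toy_sizes = [1, 1, 2, 4]
toy_distance = ks_distance(toy_sizes, lambda x: x / 4, x_min=1)
```

A criterion of this kind can be evaluated for several candidate minimum gap sizes, keeping the candidate that yields the smallest distance.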
Having set the height threshold and minimum gap size, the gap size frequency distribution (GSFD) is modeled with a discrete Pareto distribution.
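To make the likelihood concrete, the sketch below assumes one common form of the discrete power law, in which each size x at or above the minimum size has probability proportional to x raised to the power of minus the exponent. This parameterization, the names `lam` and `x_min`, and the toy sizes are assumptions for illustration and may differ from the exact formulation used in the study. The normalizing constant is approximated by an exact partial sum plus an integral estimate of the tail.

```python
import math

def normalizing_constant(lam, x_min, n_terms=10_000):
    """Approximate the sum of k**(-lam) over all integers k >= x_min (lam > 1).

    The first n_terms are summed exactly; the remaining tail is approximated
    by an integral plus a half-term (Euler-Maclaurin) correction.
    """
    if lam <= 1.0:
        raise ValueError("the sum diverges unless lam > 1")
    upper = x_min + n_terms
    head = sum(k ** -lam for k in range(x_min, upper))
    tail = upper ** (1.0 - lam) / (lam - 1.0) + 0.5 * upper ** -lam
    return head + tail

def log_likelihood(sizes, lam, x_min):
    """Log-likelihood of integer gap sizes under the assumed discrete power law."""
    log_norm = math.log(normalizing_constant(lam, x_min))
    return sum(-lam * math.log(x) - log_norm for x in sizes)

# Hypothetical gap sizes (in pixels), all at or above a minimum size of 10.
toy_sizes = [10, 10, 12, 15, 23, 40, 11, 10, 18, 95]
toy_ll = log_likelihood(toy_sizes, lam=2.0, x_min=10)
```

Because the log-likelihood is a sum over gaps, the exponent can in principle be replaced by a gap-specific value that depends on covariates, which is the idea developed in the following paragraphs.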
We use a Bayesian framework to estimate model parameters. Here, the value of a parameter is estimated by its posterior distribution, which by definition is proportional to the product of the likelihood of the model and the parameter prior distribution. The prior distribution is based on prior knowledge of the possible values of a parameter. The posterior densities of the different parameters were estimated using a Markov chain Monte Carlo (MCMC) algorithm.
As the model contains many parameters, we built a Metropolis–Hastings (MH)
algorithm in which all parameters are updated together. Details on the
algorithm are given below:
The first values of the parameter vector are initialized as
For each step
Acceptance or rejection of the new candidate
The candidate
The algorithm is run for 1000 iterations. We use the median of the posterior densities to estimate parameter values, and the distribution of the posterior densities to estimate parameter credibility intervals.
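A generic random-walk Metropolis–Hastings sampler with a joint update of all parameters can be sketched as follows. This is a minimal illustration, not the authors' implementation. The Gaussian proposal, the step size, the seed, and the toy target `toy_log_posterior` are assumptions, and the function names are hypothetical.

```python
import math
import random
import statistics

def metropolis_hastings(log_posterior, initial, step_size, n_iter, seed=0):
    """Random-walk Metropolis-Hastings with all parameters proposed together."""
    rng = random.Random(seed)
    current = list(initial)
    current_lp = log_posterior(current)
    chain = []
    for _ in range(n_iter):
        # Propose new values for every parameter at once (joint update).
        candidate = [value + rng.gauss(0.0, step_size) for value in current]
        candidate_lp = log_posterior(candidate)
        # The Gaussian proposal is symmetric, so the acceptance ratio
        # reduces to the ratio of posterior densities.
        log_ratio = candidate_lp - current_lp
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_lp = candidate, candidate_lp
        chain.append(list(current))
    return chain

def toy_log_posterior(params):
    """Toy target (assumption): independent unit normals centered at (1, -1)."""
    a, b = params
    return -0.5 * ((a - 1.0) ** 2 + (b + 1.0) ** 2)

chain = metropolis_hastings(toy_log_posterior, [0.0, 0.0], step_size=0.8, n_iter=1000)
# Posterior medians serve as point estimates, one per parameter.
posterior_medians = [statistics.median(draws) for draws in zip(*chain)]
```

In practice, the toy target would be replaced by the log of the likelihood times the priors, and credibility intervals would be read from the quantiles of each column of the chain.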
To improve model inference, parameter significance and interpretation, we
first transformed some environmental variables:
The environmental variables are then centered and scaled with R function “scale”.
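For readers working outside R, the default behavior of `scale` (subtracting the mean and dividing by the sample standard deviation) can be mirrored with the Python sketch below. The function name `center_and_scale` and the toy values are hypothetical.

```python
import statistics

def center_and_scale(values):
    """Center on the mean and divide by the sample standard deviation (n - 1),
    which matches the default behavior of R's scale() for a single column."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical values of one environmental covariate.
toy_covariate = [2.0, 4.0, 6.0, 8.0]
scaled = center_and_scale(toy_covariate)
```

After this transformation every covariate has mean 0 and standard deviation 1, so the magnitudes of their coefficients can be compared directly.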
We first consider each environmental covariate independently. These
covariates are included one-by-one in the model to constrain the exponent
We first investigated the collinearity of environmental data through principal component analysis (PCA) of the normalized environmental dataset.
Canopy height distribution. Canopy height is considered as a mixture distribution of two ecological features. The first (blue curve) is the natural variation in canopy height, modeled as a normal distribution. The second (red curve) is linked to the presence of low heights in the total canopy height distribution, likely to be due to forest gaps. We set the gap threshold to the 0.001th percentile of the blue curve density, i.e., 11 m.
List of environmental variables, abbreviations, units, and values of the posteriors in univariate models.
To build the final model, we used the results of the univariate model
(Table
To select the significant covariates and build the final model, we used the method proposed
by
We start the KM algorithm with
Model inference and data analysis were conducted with R software (R Core
Team, 2012). All maps and geographical information were computed with SAGA
In this study, we used a forest canopy height mixture model to define the
maximum height of a given pixel to be included in a forest gap. This
probabilistic method produced results that fit the observed canopy height
distribution. We retained the 11 m threshold that corresponds to the 0.001th
percentile of the canopy height distribution (Fig.
We mapped 12 293 gaps with vegetation
Results of the principal component analysis of the environmental variables.
All variables had an effect on gap size distribution
(Table
To define the final multivariate predictive model, we used the significant results of the univariate models together with the output of the PCA, in order to avoid multicollinearity.
The first three PCA axes explained more than 80 % of the data variance.
The first axis, which accounted for 36.45 % of the variance, was
positively correlated with relative HAlt and negatively correlated with HAND
and DA, and thus clearly highlighted the local altitudinal gradient. The
second axis explained an additional 28.5 % of variance and was positively
correlated with the TRI and slope. The third axis explained a further
15.2 % of the variance and was correlated only with TOPEX
(Fig.
The observed gap size frequency distributions modeled as a power law
function with
Environmental covariates with posterior KM values close to 1, namely slope,
TOPEX, and HAND (Eq.
Results of the Kuo–Mallick algorithm for variable selection. Variables were included in the final model when their value was close to 100 %: slope, TOPEX and HAND.
Posterior distribution of the environmental variables in the final multivariate model.
Delineating forest gaps is a persistent challenge for foresters and
ecologists. Brokaw's (1982) gap definition remains widely used: a gap is
“a `hole' in the forest extending through all levels down to an average
height of 2 m above the ground,” which must be identified by an experienced
observer. Several studies do not use this 2 m threshold to define gaps, but
instead use 10 m (e.g., Hubbell et al., 1999; van der Meer and Bongers, 1996;
Welden et al., 1991). In this study, however, we chose a probabilistic
approach, modeling the height distribution as a mixture of two normal
distributions. We found a threshold height of 11 m, which is much higher than
that in Brokaw's definition but is consistent with our field experience,
where woody debris, dead canopy tree boles, and residual saplings (i.e.,
remnants that survive the gap formation event) may rise well
above 2 m. For example,
Defining minimum gap size is also a delicate proposition. Some authors,
working with high-definition lidar data, have considered a minimum gap size
(
We built on previous studies that show that gap size distribution follows a
power law distribution. However, the underlying mechanisms that control this
distribution are still unclear. The Bayesian framework we developed allowed
us to detail the contributions of each environmental variable to the size of
each individual gap. Because the precise environmental variables were
explicitly taken into account in the model likelihood of each gap, we were
able to predict gap size distribution from environmental covariates, a
difficult task when the scale exponent is estimated once, at the forest
level, and compared between forests. The global scale exponent that we
estimated for an average environment (
For the first time, gap size distribution integrates environmental variables
as a linear combination of the scale parameter (
Steep slopes are well known to directly impact tropical forest canopy
structure
HAND is a binary variable that takes the value 1 on water-saturated soils.
Because
The effect of topographic exposure on
To our knowledge, this is the first study where the precise environmental
descriptors associated with each canopy gap were explicitly taken into
account in the calculation of the model likelihood. We were able to do so
because we wrote the general model likelihood as the product of all the
single likelihoods (i.e., each gap had its own likelihood depending on the
environmental covariate values). In doing so, we were able to predict gap size
distribution from fine-scale environmental covariates, an impractical task when
the scale exponent is estimated once at the forest level (i.e., pooling all
detected gaps together) and compared between forests a posteriori. We also
put forward an innovative method to define a height threshold and minimum gap
size using two probabilistic approaches. The modeled distribution of canopy
height as a mixture of two distributions provides a clear height threshold,
while the minimization of KS distance between observed and predicted data
proves to be efficient for setting the minimum gap size. We use a Bayesian
framework in which the model likelihood of each gap is expressed as a
function of the unique environment local to the gap, highlighting the
predominant role of the topographic exposure and waterlogging in determining
gap size distribution. We expected that slope would also play an important
role, with steeper slopes leading to larger gap sizes. However, we found that
a steeper slope led to smaller
gaps, as already highlighted by
Datasets used in this study are the property of the French National Forest Service, a private company. They are available upon request at laurent.descroix@onf.fr.
List of environmental variables, abbreviations, units, and values of the posteriors in univariate models for a height threshold equal to the 0.0001th percentile of the height distribution of the canopy.
List of environmental variables, abbreviations, units, and values of the posteriors in univariate models for a height threshold equal to the 0.001th percentile of the height distribution of the canopy.
List of environmental variables, abbreviations, units, and values of the posteriors in univariate models for a height threshold equal to the 0.01th percentile of the height distribution of the canopy.
The authors declare that they have no conflict of interest.
We thank the handling editor, Ervan Rutishauser, and Marijn Bauters for their thoughtful comments on a previous version of this paper. This study is part of the GFclim project funded by PO-Feder Région Guyane. Bruno Hérault was supported by a grant from the Investing for the Future program (managed by the French National Research Agency – ANR, labex CEBA, ref. ANR-10-LABX-0025).
Edited by: A. Rammig
Reviewed by: E. Rutishauser and M. Bauters