The Moderate Resolution Imaging Spectroradiometer Vegetation Continuous Fields (MODIS VCF) Earth observation product is widely used to estimate forest cover changes and to parameterize vegetation and Earth system models and as a reference for validation or calibration where field data are limited. However, although limited independent validations of MODIS VCF have shown that MODIS VCF's accuracy decreases when estimating tree cover in sparsely vegetated areas such as tropical savannas, no study has yet assessed the impact this may have on the VCF-based tree cover data used by many in their research. Using tropical forest and savanna inventory data collected by the Tropical Biomes in Transition (TROBIT) project, we produce a series of calibration scenarios that take into account (i) the spatial disparity between the in situ plot size and the MODIS VCF pixel and (ii) the trees' spatial distribution within in situ plots. To identify if a disparity also exists in products trained using VCF, we used a similar approach to evaluate the finer-scale Landsat Tree Canopy Cover (TCC) product. For MODIS VCF, we then applied our calibrations to areas identified as forest or savanna in the International Geosphere-Biosphere Programme (IGBP) land cover mapping product. All IGBP classes identified as “savanna” show substantial increases in cover after calibration, indicating that the most recent version of MODIS VCF consistently underestimates woody cover in tropical savannas. We also found that these biases are propagated in the finer-scale Landsat TCC. Our scenarios suggest that MODIS VCF accuracy can vary substantially, with tree cover underestimation ranging from 0 % to 29 %. Models that use MODIS VCF as their benchmark could therefore be underestimating the carbon uptake in forest–savanna areas and misrepresenting forest–savanna dynamics. Because of the limited in situ plot number, our results are designed to be used as an indicator of where the product is potentially more or less reliable. Until more in situ data are available to produce more accurate calibrations, we recommend caution when using uncalibrated MODIS VCF data in tropical savannas.
Tree cover values derived from Earth observation (EO) data form a fundamental part of ecological research. They are used to estimate forest cover change, biomass, and carbon stocks (Bastin et al., 2019; Giriraj et al., 2017; Saatchi et al., 2011; Song et al., 2014) and to help identify key areas for conservation efforts (Miles et al., 2006) and as a basis for climatic and vegetation modelling and model evaluation (Brovkin et al., 2013; Burton et al., 2019; Kelley et al., 2013). All this research, in turn, plays a vital role in informing local, regional, and global environmental policies (Harris et al., 2012). As such, an EO product's accuracy is important to consider, as any errors in the initial tree cover estimate can be further compounded in downstream work.
Only a handful of EO products provide global maps of percent tree cover or forest and shrub cover distributions (Bartholomé and Belward, 2005; Bicheron et al., 2008), and fewer still provide information stretching over at least a decade (Friedl et al., 2002; Hansen et al., 2003; Sexton et al., 2013; DiMiceli, 2017). Of these, one of the products most widely used in ecological modelling is the Moderate Resolution Imaging Spectroradiometer Vegetation Continuous Fields (MODIS VCF) product (DiMiceli, 2017). MODIS VCF is a yearly product that provides percent tree cover globally at a spatial resolution of 250 m. The most recent iteration (Collection 6) is available for the years 2000 through to 2020. Its quantitative measure of woody cover is recorded annually and is described as a percentage of ground cover, making it particularly suited for use in evaluating dynamic global models (Lasslop et al., 2018; Rabin et al., 2017), as a proxy for in situ data that are harder to collect (Kelley et al., 2019), and to help define parameters for calculating global tree restoration potential (Bastin et al., 2019). MODIS VCF is also used to train alternative products, such as the newer finer-scale Landsat Tree Canopy Cover (TCC) product (Sexton et al., 2013).
As the VCF product has progressed from Collection 1 to its current Collection 6, several validations using in situ field data or higher-resolution remotely sensed data as a reference measurement have been carried out. These have been few and limited to sites within a biome (Montesano et al., 2009), a region (Hansen et al., 2005; White et al., 2005), or within a country (Gao et al., 2014; Sexton et al., 2013). The MODIS VCF product evaluated was the most recent collection available at the time (i.e. Collection 3: Hansen et al., 2005; White et al., 2005; Collection 4: Montesano et al., 2009; Collection 5: Gao et al., 2015; Sexton et al., 2013). To our knowledge, no such independent validation experiment has yet been conducted on Collection 6, which produces tree cover estimates in the same manner as Collection 5 but with improvements made to the upstream inputs to enhance its accuracy (DiMiceli, 2017). Likewise, validation of the finer-scale TCC product has been limited to its penultimate version and to the taiga–tundra circumpolar region (Montesano et al., 2016).
The validations found that MODIS VCF may be less suitable for estimating tree cover in sparsely vegetated areas. Huang and Siegert (2006) noted that MODIS VCF classified large areas of land as “bare” where their land cover classification system identified it as sparsely vegetated. Montesano et al. (2009) found that MODIS VCF data (Collection 4) overestimated cover in areas of low tree cover in taiga–tundra transition zones. Sexton et al. (2013) found that the Collection 5 product overestimated cover in areas of low cover (below 20 %) and underestimated in areas of higher tree cover, while Gao et al. (2015) found that MODIS VCF can only partially discriminate between tropical forest and non-forest, struggling in areas that have greater heterogeneity. Similarly to MODIS VCF (Montesano et al., 2009), Montesano et al. (2016) revealed an overestimation of the taiga–tundra low tree covers in the finer-scale Landsat TCC, suggesting that using VCF as training has propagated these overestimations into the higher-resolution product. What is clear from the history of these validation and comparison experiments is that MODIS VCF has accuracy issues in areas with low woody-vegetation cover, which has implications when its tree cover estimates are treated as accurately representative of real-world conditions. Failure to account for VCF's difficulty in estimating low woody covers can, therefore, lead to miscalibrated models and estimations that do not reflect real-world conditions. This, in turn, has knock-on effects on environmental policymaking, conservation efforts, and future ecological research, especially in areas with vegetation cover types that are most prone to error.
Tropical savannas have woody covers that fall within the range particularly affected by the reported MODIS VCF errors. A large proportion of these savannas can be found in tropical developing countries (Boval and Dixon, 2012) and are predicted to be home to half of the world's population by 2050 (Archer et al., 2020). Tropical savannas are therefore highly vulnerable to anthropogenic change. In the face of a growing population, land fragmentation, and changing climate, a savanna's ability to maintain robust ecosystem functions is directly linked to the amount of woody cover present (Sankaran et al., 2006). As a result, the ability to accurately monitor the state, dynamics, and woody cover trends of tropical savannas is a vital part of understanding how and why savannas are changing in the tropics (Harris et al., 2012; Miles et al., 2006), while also improving modelled climate projections and vegetation dynamics for this complex biome.
In this study, we evaluate MODIS VCF Collection 6 in tropical savannas and forest areas by comparing VCF's tree cover percentage to corresponding field data. Similarly, we evaluate Landsat TCC (version 4) to explore if, when VCF is used as training, VCF biases are propagated. We then, for MODIS VCF, characterize the observed bias in woody covers across both savanna and forest ecosystems and apply our calibration across the tropics to highlight the regions most likely affected by these inaccuracies. We finish by discussing the implications the uncovered biases may have on tropical vegetation and terrestrial biogeochemical modelling.
We used the MODIS VCF Collection 6 product (250 m spatial resolution,
DiMiceli, 2017) with tree cover values averaged across the years 2006
through to 2009 to reflect the range of the field data collection period.
MODIS VCF was downloaded using the MODIS R package (Hijmans, 2017) in R 3.5.2
(R Core Team, 2018). We used the 2005 and 2010 30 m Landsat TCC version 4
product (
The in situ field data were sourced from the “Tropical Biomes in Transition”
project (TROBIT) (
All the sites fall within the tropics, that is, within 23.5
Location of sampling sites in Africa, Australia, and South America
from the TROBIT project (based on Fig. 2 of Torello-Raventos et al., 2013)
shown in MODIS VCF (DiMiceli, 2017). Of the 63 field sites, only the 48
sites with available GPS coordinates were selected (
The classification of the TROBIT sites as either “forest” or “savanna” is based on the parameters described in Torello-Raventos et al. (2013) and Veenendaal et al. (2015). A savanna is a natural land cover that is not a forest, bare ground, or body of water. Forest is defined as woody vegetation with an average tree height of or exceeding 6 m and a canopy area index (CAI) value of at least 0.3 for open forests and 0.7 for forests. In addition, floristic differences (i.e. dominance of savanna species) are used to differentiate forests from taller-growing savannas that have similar CAIs and tree heights (see Fig. 9 of Torello-Raventos et al., 2013).
There is some ambiguity in how savannas and “grasslands” are defined. Some modelling-based research treat the two biomes as different (Whitley et al., 2017), while studies based on plant functional traits group them together (Solofondranohatra et al., 2018; White et al., 2000). As there is some concern that MODIS VCF will struggle to pick up woody cover in areas with really sparse vegetation, in this paper we decided to treat grasslands as part of the savanna domain.
CAI is defined as the sum of the projected areas of individual tree crowns
divided by the ground area. In the TROBIT project (Torello-Raventos et al.,
2013; Veenendaal et al., 2015), plot-wide CAI is made up of the sum of
the upper-stratum, mid-stratum, and subordinate-stratum crown areas.
Membership to a stratum is determined by the tree's dbh (diameter at breast height; upper stratum: dbh
CAI values do not account for within-site tree canopy distribution patterns and the overlap between individual tree canopies. We account for this by converting each CAI value into a probability distribution function incorporating the following two extreme scenarios: “enforced overlap”, where the location probability of individual canopies increases linearly from 0 to 1 across a site, and “unenforced overlap”, where individual canopies follow a uniform random distribution pattern and canopy overlap is not purposefully introduced (Fig. 2). We repeated this 1000 times per CAI measurement to determine the probability distribution of expected CAI for each field plot.
Unlike CAI, which is the fraction of ground covered by tree crowns, the percent tree cover value from MODIS VCF (and so Landsat TCC) is defined as “the portion of the skylight orthogonal to the surface which is intercepted by trees” (Hansen et al., 2002). To make MODIS VCF and Landsat TCC comparable to tree cover derived from TROBIT plot CAIs, we divided these product values by 0.8 as suggested by Hansen et al. (2002). This is also the standard approach in most modelling studies using VCF (e.g. Lasslop et al., 2020; Kelley et al., 2013; Burton et al., 2019). The 0.8 value can be thought of as a gap correction factor (GCF) that accounts for within-canopy gaps. Although the GCF has been shown to vary with vegetation type (Lloyd et al., 2008; 0.34–0.60) and crown cover (Tang et al., 2019b; 0.70–0.96), we opted to use 0.8, as we found that it yielded more conservative results compared to a variable GCF. It also avoided introducing additional parameters into our analysis.
Next, to account for the difference in size between the MODIS VCF pixel (250 m
Visual representation of the effects of enforcing overlap within a
(100 m
For Landsat TCC, where the Landsat TCC pixels (30 m
We compared both MODIS VCF and Landsat TCC with TROBIT under four different
scenarios: (1) unenforced overlap and clumping, (2) enforced overlap and
unenforced clumping, (3) unenforced overlap and enforced clumping, and (4) enforced overlap and clumping. Comparisons were conducted by fitting the
following logit function:
We evaluated the impact of the MODIS VCF biases inferred from these
regression equations across the tropics by inverting our calculation of
MODIS VCF bias (Fig. A1) as follows: first, the inverse (i.e. solving for
We then used the MCD12Q1 product to identify the areas of forest and
savanna across the tropics in the MODIS VCF product. MCD12Q1 is widely
used by the global land surface modelling community (e.g. Sellar et al.,
2019; Wiltshire et al., 2021) and describes land cover in terms of 17 global
land cover classes as per the International Geosphere-Biosphere Programme
(IGBP, Table 3 in Sulla-Menashe and Friedl, 2018). The product is based on
the same spectroradiometer (MODIS) and temporal resolution as the VCF
product. Referring to the definition of savanna of Veenendaal et al. (2015), the following land cover classes were chosen to represent savanna:
closed shrublands, open shrublands, woody savanna, savanna, and grassland;
meanwhile forest encompasses the following: evergreen needleleaf forests, evergreen
broadleaf forests, deciduous needleleaf forests, deciduous broadleaf
forests, and mixed forests. We subset MCD12Q1 to the tropical zone between
For a more detailed land-cover-specific evaluation, we extracted the calibrated 250 m MODIS VCF pixel values for each corresponding 500 m MCD12Q1 pixel to construct land-cover-specific MODIS VCF tree cover frequency distributions (Fig. A3). Our tree cover calibration by cover type for the four clumping–overlap regression combinations was then calculated by multiplying each cover type MODIS VCF frequency distribution (Fig. A3) with curves representing the median, 5 %, and 95 % confidence lines of the calibration equation ensembles.
MODIS VCF underestimates tree cover within the 19 % to 81 % range across all four combinations of enforced–unenforced overlap and clumping (black line, Fig. 4). Below 12 %, MODIS VCF tree cover values do not significantly disagree with TROBIT field data and may instead be overestimating tree cover (50 % confidence, dashed line, Fig. 4). A similar pattern is seen when tree cover exceeds 84 %: MODIS VCF does not differ significantly from TROBIT when there is enforced overlap (i.e. when tree canopies are clustering towards one side increasing the degree of canopy overlap, Fig. 4 right) but may underestimate tree cover when overlap is not enforced (i.e. tree canopies are spaced randomly within the site, Fig. 4 left).
MODIS VCF percent tree cover versus percent tree cover from TROBIT field data, taking into account uncertainties associated with tree cover spatial distributions within a MODIS pixel and/or field site. The four combinations are the following: (1) no overlap and no clumping, where tree canopies are randomly distributed within both the pixel and site; (2) no overlap and maximum clumping, where tree canopies are clustered in one area of the pixel and randomly distributed throughout the field site; (3) with overlap and no clumping, where tree canopies are randomly distributed within the pixel but overlap substantially within the field site; and (4) with overlap and maximum clumping, where tree canopies are clustered to one side within the pixel and overlap substantially within the site. The bolded dashed line in black shows the 1 : 1 relationship. The solid lines represent the median of the respective regressions (green for forest, orange for savanna, and black for forest and savanna combined), and the thin lines represent the 5 % and 95 % confidence interval of their respective regression lines. The vertical error bars represent uncertainty introduced by clumping; the horizontal error bars represent the uncertainty introduced by overlap.
There is a clear difference in how accurately MODIS VCF estimates tree cover in forested areas (in green, Fig. 4) as opposed to areas identified as savannas (in orange, Fig. 4). In savanna sites, MODIS VCF significantly and consistently underestimates tree cover regardless of the amount of overlap and clumping. Significant underestimation (at 95 % confidence) occurs when in situ tree cover exceeds 19 %–21 % (without enforced clumping) or 11 %–12 % (with enforced clumping). In forest sites, MODIS VCF does not show the same pattern of systematic underestimation. Divergence does occur at high covers, depending on the enforcement of overlap or clumping. MODIS VCF underestimates tree cover where tree cover exceeds 84 % (at the 95 % confidence interval) when neither overlap nor clumping is enforced and overestimates where tree cover exceeds 78 % (at 5 % confidence interval) when both overlap and clumping are enforced.
Landsat TCC percent tree cover versus percent tree cover from TROBIT field data, taking into account uncertainties associated with tree cover spatial distributions within a TCC pixel and/or field site. The four combinations are the following: (1) no overlap and no clumping, where tree canopies are randomly distributed within both the pixel and site; (2) no overlap and maximum clumping, where tree canopies are clustered in one area of the pixel and randomly distributed throughout the field site; (3) with overlap and no clumping, where tree canopies are randomly distributed within the pixel but overlap substantially within the field site; and (4) with overlap and maximum clumping, where tree canopies are clustered to one side within the pixel and overlap substantially within the site. The bolded dashed line in black shows the 1 : 1 relationship. The solid lines represent the median of the respective regressions (green for forest, orange for savanna, and black for forest and savanna combined), and the thin lines represent the 5 % and 95 % confidence interval of their respective regression lines. The vertical error bars represent uncertainty introduced by clumping; the horizontal error bars represent the uncertainty introduced by overlap.
Similar patterns can be observed with Landsat TCC (black line, Fig. 5). There is a significant underestimation of tree cover in the lower cover ranges up to 59 % when there is enforced overlap and up to 82 % when overlap is not enforced. In savanna sites (orange line, Fig. 5) the underestimation (at 95 % confidence) is significant and consistent for covers below 75 %–80 % (without enforced overlap) or below 52 %–60 % (with enforced overlap). In forest sites (green line, Fig. 5) there is no systematic difference.
We assessed the impact of VCF's underestimation of intermediate tree covers across the tropics (IGBP forest and savanna land cover classes), using a calibration based on the combined forest and savanna sites (black curve, Fig. 4) instead of using the savanna-only sites for a savanna-specific calibration (orange curve, Fig. 4). This is because there were few TROBIT sites representing savanna with MODIS VCF tree cover values exceeding 40 %, and global land cover maps disagree on the distribution of savannas within the forest–savanna ecotone (Herold et al., 2008).
The distribution of tree cover change after calibrating against field data
are similar across the four scenarios (Fig. A2), and the regions where all
four scenarios agree on the direction of change (positive and negative) are
substantial (Fig. 6). However, there are some differences caused by the
uncertainty introduced by different extents of overlap and clumping. While
we see a significant increase in tree cover across all clumping–overlap
combinations in many regions of tropical savannas and grasslands (Pennington
et al., 2018), such as in the forest–savanna mosaics that surround Congolian
rainforests, we do not see the same pattern in the Cerrado of Brazil. This
is likely because the African forest–savanna regions fall within the range
of MODIS VCF values that consistently undergo a positive calibration
(
Using field plots over a limited geographic extent creates uncertainty that may be unaccounted for in our analysis when our calibration is broadly applied across the highly variable tropical-forest–savanna ecotone. By multiplying the uncertainty range of our calibrations with the geographical distance to the closest sampled TROBIT site, we identified priority regions for further field surveying (Fig. 6 bottom). We found Southeast Asia, Central America, and Mexico are areas where additional in situ observations would greatly help improve confidence. Field data from the northwestern region of South America, the southeast of the African continent, and Madagascar would also help.
As our calibrations were based on a limited number of sites in a limited number of regions, it is important to note that the maps shown in Figs. 6 and A2 are not definitive. For instance, we found a significant tree cover decrease in the Sahel post-calibration in multiple scenarios, which runs counter to the results of Brandt et al. (2020), who found that tree cover was underestimated in this region. This disparity may be due to our lack of field sites in these more arid regions, further highlighting the importance of more in situ data for more accurate and precise calibration. Therefore, our calibrations are most useful in identifying areas where MODIS VCF estimates may be more or less reliable.
When looking at our calibration in more detail, we see that MODIS VCF significantly underestimates tree cover in all the IGBP land cover classes that we considered, regardless of overlap or clumping (95 % confidence interval) (Fig. 7). The most substantial and significant underestimation is in the classes “woody savannas” and savannas. The underestimation is the largest in woody savannas, except when clumping and overlap are enforced (in purple, Fig. 7). This is because the peak in the tree cover frequency distribution for savannas aligns with where the calibration for maximum overlap and clumping is the largest (i.e. at about 20 % tree cover; see Fig. A3), while the peak in cover distribution for woody savannas (26 %–67 %, Fig. A3) aligns with the cover range that undergoes the greatest change (Fig. 7) in the other clumping and overlap scenarios.
“Open shrublands” only show a small underestimation of tree cover, despite the woody cover definition (10 %–60 %) matching the range where MODIS VCF most underestimates tree cover (26 %–67 % cover). The discrepancy may be because the majority of the open-shrublands class commission error is with the grasslands class (see Table S6 in Sulla-Menashe et al., 2019). The MODIS VCF tree cover in areas classified as open shrublands is therefore likely to be lower than the IGBP definition would suggest (see Fig. A3), resulting in calibrations that are more conservative.
We found significant increases in tree cover for forests in every calibration scenario, though net change is not significant (95 % confidence) when overlap is enforced. This can be explained by the presence of both negative and positive calibrations in the higher ranges of tree cover when overlap is enforced. Similarly, the net change is insignificant across all clumping and overlap scenarios for the IGBP classes matching the lower ranges of tree cover (grasslands, closed shrublands, and open shrublands).
Percent change in tree cover post-calibration (clockwise: no
enforced clumping or overlap (black), enforced clumping and no enforced
overlap (blue), no enforced clumping and enforced overlap (red), and enforced
clumping and overlap (pink)) in the forest supercategory and the five savanna
classes. The palest tone indicates positive change; the mid tone indicates negative
change; and the darkest tone indicates net change. Error bars denote the
5 %–95 % confidence interval; if the error bar extends past the
While MODIS VCF is a powerful and accessible tool to map tree cover, our field-data-based calibrations indicate that the latest MODIS VCF Collection 6 is missing a lot of woody cover, even when uncertainty introduced by site canopy overlap and clumping within the MODIS VCF pixel are accounted for. The Landsat TCC product, which may be viewed as an alternative with a higher spatial resolution, behaves in a similar manner. Our map (Fig. 6 top) highlights that this potential underestimation of woody cover is mainly occurring in tropical savannas. Moreover, the highest underestimation in the savanna classes occurs when there is no enforced overlap (Fig. A2) (i.e. when there is a uniform random distribution of trees) which is the scenario that most likely reflects the TROBIT savanna plots. This is evidenced by work done by Veenendaal et al. (2015), where TROBIT plots were tested for complete spatial randomness, and only minor indications of overlap were found. Woody savannas, as an example, may have their tree cover underestimated by up to 32 % (95 % confidence) when neither clumping nor overlap is enforced (grey tones, Fig. 7). If our results are representative of the tropics, then overall, MODIS VCF may be underestimating tropical tree cover by between 7 %–29 % for unenforced clumping and overlap or 0 %–21 % for when either clumping or overlap are enforced (5 %–95 % confidence).
An overestimation at the lower end of the cover (
On the other hand, when looking at the relationship between TROBIT's upper-stratum canopy height and the difference between TROBIT and VCF, we would
have expected an increasing underestimation in the lower height ranges.
Instead we found a low
We also found discrepancies between the tree cover values derived from MODIS VCF and the corresponding class definition of the MCD12Q1 product (Fig. A3) which again suggests that the 5 m height threshold may not always apply in MODIS VCF. For example, MODIS VCF recorded tree cover in the open-shrublands and “closed-shrublands” classes of the MCD12Q1 product (Fig. A3), even though the height range for these classes is 1–2 m. For the savanna class, MODIS VCF yields a percent tree cover range that matches closely with the savanna class definition (between 10 % and 30 %), despite the differing tree thresholds for MODIS VCF and IGBP (5 m minimum for MODIS VCF and 2 m minimum for IGBP). These discrepancies suggest one of the following three things: open/closed shrublands and savannas contain trees taller than 5 m, MODIS VCF is distinguishing trees below the 5 m threshold, or some combination of both.
Another explanation for the discrepancy between the IGBP class definitions and those estimated through MODIS VCF could be the between-class omission and commission errors (Fig. 5 in the present paper and Table S6 in Sulla-Menashe et al., 2019). For example, the accuracy for closed shrublands is particularly low. It is mainly confused with open shrublands, woody savannas, and savannas. The majority of the open-shrublands class commission error is with the grasslands class, and there is confusion to a lesser extent between open shrublands, woody savannas, and savannas. Also, the “cropland/natural vegetation mosaics” class is often mapped as closed shrublands, woody savannas, savannas, or grasslands.
More work needs to be done to evaluate how effective both MODIS VCF, Landsat TCC, and MCD12Q1 are at implementing the height thresholds in their respective “tree” definitions, as this may have implications when MODIS VCF, Landsat TCC, and MCD12Q1 are used for global model calibration or validation.
Overall, our results suggest that the biases found in the previous collections may have persisted in Collection 6, despite reported improvement in accuracy (DiMiceli et al., 2017). This indicates that the biases introduced by binning the training data (Gerard et al., 2017) and using a CART (classification and regression tree) model (Hanan et al., 2013) are inherent and still present within this version of MODIS VCF. Similar results for MODIS VCF and Landsat TCC also suggest that by training TCC with VCF tree cover these biases have been propagated into the finer-scale product. There is a risk of bias propagation when MODIS VCF (or related products) is used as a single source for benchmarking models (e.g. Brandt et al., 2017; Lasslop et al., 2020; Burton et al., 2019; Kelley et al., 2019, 2021). To avoid this, studies should try use multiple data sources (e.g. for woody cover possibly the GEDI – Global Ecosystem Dynamics Investigation – lidar-based canopy cover product, Tang et al., 2019a) whether for model calibration/validation (e.g. Sellar et al., 2019; Wiltshire et al., 2021; Burton et al., 2021) or hypothesis testing (e.g. Taylor et al., 2012).
We suggest that while MODIS VCF and landsat TCC give a good overview of tree cover on a global scale, both should be used cautiously in savanna regions. Special care should be taken in savannas, a biome that has long been noted as being challenging for EO products to characterize, as solitary trees in the landscape tend to be missed by global tree cover products (Jung et al., 2006; Brandt et al., 2020). The poor performance of MODIS VCF and Landsat TCC in savannas in particular (Gaughan et al., 2013; Gross et al., 2018; Kumar et al., 2019) emphasizes the importance of continuous independent validation and re-calibration of these products. The ecosystem functions of savannas can vary drastically with just a slight difference in tree cover (Gaughan et al., 2013), and even slight errors may create issues in how we interpret the state and dynamics of the biome, which in turn affects how the land is managed.
Work on forest restoration potential would also be impacted. Bastin et al. (2019), for example, used MODIS VCF to estimate tree cover in agricultural land. As this tree cover is likely to have been underestimated substantially, the derived available land space for replanting may be less than projected, with the restoration potential being overestimated. However, our results also indicate an underestimated tree cover in woodier savannas and forests. Accounting for this, the restoration potential could actually be greater than anticipated because the carrying capacity of a unit of land could be greater than previously thought. Calibration could also result in a more uniform cover distribution across regions, producing a more gradual transition between low-cover savannas and high-cover forests. This could have implications for work that, for example, uses MODIS VCF to study forest–savanna dynamics and bi-stability (Lasslop et al., 2018; Wuyts et al., 2017; Xu et al., 2016).
To ensure the appropriate use of both products, we suggest that where field data are available, the products should be calibrated for use in the target region. However, calibrating on a large scale using field data as a reference presents several challenges. Firstly, different in situ measurement techniques tend to measure different types of tree cover (e.g. Fiala et al., 2006; Korhonen et al., 2006; Rautiainen et al., 2005), and each will require a specific conversion method to enable direct comparison with MODIS VCF or Landsat TCC. For example, the comparison of Montesano et al. (2016) did not acknowledge VCF's and thus TCC's “within canopy gaps”, which may explain their observed underestimation in covers above 80 %. In our case, to account for gaps between tree crowns, we applied the 0.8 “gap correction factor” to the CAI. However, the GCF and resulting tree cover could vary widely on a plot-by-plot basis (Lloyd et al., 2008). With further in situ data that describe tropical vegetation-type-specific GCF variation, we may be able to incorporate site-specific GCFs into our analysis.
There is also the uncertainty associated with the field data collection. In our case, the site-specific CAI standard errors (Supplement B in Torello-Raventos et al., 2013) are small and show no systematic bias and are therefore not expected to significantly change our results. However, our results in Fig. 6 have been extrapolated from a limited number of field sites, with somewhat limited distribution across the tropics. While our uncertainty map gives a good idea of areas of concern, for a more robust description of MODIS VCF's accuracy, we would need substantially more widely distributed in situ sites to represent the tropics. Field data collection remains costly and labour intensive, and adopting protocols that ensure standardization and data sharing via information platforms would facilitate large-scale validation exercises. In the case of validating woody cover products, including direct tree cover measurements such as crown area would also help.
However, the reality for much of pre-existing data is that they are not standardized, and field data (whether standardized or not) are not designed to be directly comparable to remote-sensing-derived variables. Here, our technique offers a potential solution to this. By accounting for uncertainties that arise from differences between the available in situ data and the remotely sensed variable, we are able to use in situ data, which may initially seem unsuitable, to carry out a more conservative evaluation.
Finally, factors such as cloud cover, landscape heterogeneity, phenology, vegetation type, and soil type affect the accuracy of remotely sensed products like MODIS VCF and Landsat TCC (Hansen et al., 2003; Huete et al., 1997; Smith et al., 2002). Data characterizing these at the plot level would help identify potential confounding factors affecting performance and so help further constrain uncertainties.
Alternatively, comparing MODIS VCF to other land cover maps or higher-resolution remotely sensed data are recommended (Gross et al., 2018; Lary and Lait, 2006; Montesano et al., 2016), though without a large-scale effort to re-calibrate MODIS VCF and products trained using VCF like Landsat TCC, the question of how appropriate MODIS VCF is for use in both forests and savannas in the tropics will remain. By highlighting the extent to which MODIS VCF struggles to estimate tree cover in tropical forests and savannas, we hope to inform the future use of this product and improve its useability.
We found that MODIS VCF significantly underestimates tree cover in tropical forests and savannas, even when within-site and pixel–site variation are accounted for in calibration. We also found that using MODIS VCF for training likely propagates these biases, even in the finer-scale Landsat TCC. As MODIS VCF is a product that is commonly used in a wide variety of ecological research including vegetation modelling, estimating restoration potential, and identifying forest–savanna bimodality, we stress that more independent work on validating and re-calibrating is required before its tree cover estimates can be relied upon in the tropics.
Site names, locations, canopy area index values, MODIS VCF percent tree cover values, cover type, and TROBIT site descriptions of the 48 TROBIT project plots used in this study. TROBIT site descriptions are based on Table S1 of Veenendaal et al. (2015).
The calibration curves developed for MODIS VCF based on four pixel–site mismatch scenarios (no clumping and no overlap, enforced clumping and no overlap, no clumping and enforced overlap, and enforced clumping and enforced overlap). The dashed line signifies the “ideal” 1 : 1 relationship wherein calibrated MODIS VCF is unchanged from the original MODIS VCF values. The shaded regions represent the 5 % to 95 % confidence intervals for the respective calibrated MODIS VCF values.
The change in tree cover post-calibration for all four scenarios. Black dots indicate areas where the post-calibration values have a 95 % certainty of being positive (increasing cover) or negative (decreasing cover) calibrations. These uncertainty maps are indicators of areas where MODIS VCF estimates may be more or less reliable and cannot be used as definitive calibrations due to the limited number of field sites used as reference.
Frequency distributions of percent tree cover value as estimated by MODIS VCF across the forest supercategory and the following IGBP classes that by our definition count as part of the savanna domain: closed shrublands, open shrublands, woody savannas, savannas, and grasslands. Specific class definitions as per the user guide for the MODIS land cover product (Sulla-Menashe and Friedl, 2018).
TROBIT plot upper-stratum height versus the difference between MODIS VCF and TROBIT percent tree cover for the four clumping and overlap scenarios. Upper and lower bars represent the uncertainty range's 10th and 90th percentile, respectively, based on the convolution of MODIS VCF and TROBIT cover uncertainties from Fig. 4.
The code and data used to support the findings of this study are archived at
Upon being granted access, the data are available at
MODIS VCF is accessible via the MODIS R package (
TCC data are accessible at
RA, DIK, and FG designed the TROBIT–MODIS VCF pixel–site comparison technique. RA and FG collated TROBIT and corresponding MODIS VCF values, and CG collated Sexton et al. (2013) values. DIK and ND performed regression analysis and constructed global maps. RA wrote the first draft of the paper with input from DIK and FG. DIK plotted the figures. MTR, EV, TRF, OLP, SLL, BS, HT, BSM, TD, LA, GD, and GS carried out the extensive TROBIT field campaigns. MTR was the main person responsible for field data quality checking and digitizing. RA, DIK, FG, and ND contributed to the final manuscript.
The contact author has declared that neither they nor their co-authors have any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The contribution by Douglas I. Kelley was supported by the UK Natural Environment Research Council through the UK Earth System Modelling project (UKESM, grant no. NE/N017951/1). France Gerard was supported by the SUNRISE project (grant no. NE/R000131/1). Ning Dong was supported by the Australia Research Council (grant no. DP170103410). Beatriz Schwantes Marimon was supported by the National Council for Scientific and Technological Development (CNPq, grant no. 301153/2018-3).
Maps in Figs. 1 and 3 were constructed using raster 2.6-7 (Hijmans, 2017) and mapproj 1.2-5 (Brownrigg et al., 2017) in R 3.2.0 (R Core Team, 2018). Coastlines were obtained from maps 3.1.0 (Becker et al., 2016).
Special thanks go to Jon Lloyd for supervising Rahayu Adzhar at the start of the project, to Jeanette Kemp for her work in the Australian field sites, and to Azemiyah Abdul Rahim for her support during the production of this paper.
This research has been supported by the Natural Environment Research Council (grant nos. NE/N017951/1 and NE/R000131/1) and the Australian Research Council (grant no. DP170103410). Beatriz Schwantes Marimon was supported by the National Council for Scientific and Technological Development (CNPq, grant no. 301153/2018-3).
This paper was edited by Alexandra Konings and reviewed by three anonymous referees.