Scale variance in the carbon dynamics of fragmented, mixed-use landscapes estimated using model–data fusion

Milodowski, David T.; Smallman, T. Luke; Williams, Mathew

doi:https://doi.org/10.5194/bg-20-3301-2023

Articles | Volume 20, issue 15

Research article

11 Aug 2023

Abstract

Many terrestrial landscapes are heterogeneous. Mixed land cover and land use generate a complex mosaic of fragmented ecosystems at fine spatial resolutions with contrasting ecosystem stocks, traits, and processes, each differently sensitive to environmental and human factors. Representing spatial complexity within terrestrial ecosystem models is a key challenge for understanding regional carbon dynamics, their sensitivity to environmental gradients, and their resilience in the face of climate change. Heterogeneity underpins this challenge due to the trade-off between the fidelity of ecosystem representation within modelling frameworks and the computational capacity required for fine-scale model calibration and simulation. We directly address this challenge by quantifying the sensitivity of simulated carbon fluxes in a mixed-use landscape in the UK to the spatial resolution of the model analysis. We test two different approaches for combining Earth observation (EO) data into the CARDAMOM model–data fusion (MDF) framework, assimilating time series of satellite-based EO-derived estimates of ecosystem leaf area and biomass stocks to constrain estimates of model parameters and their uncertainty for an intermediate complexity model of the terrestrial C cycle. In the first approach, ecosystems are calibrated and simulated at pixel level, representing a “community average” of the encompassed land cover and management. This represents our baseline approach. In the second, we stratify each pixel based on land cover (e.g. coniferous forest, arable/pasture) and calibrate the model independently using EO data specific to each stratum. We test the scale dependence of these approaches for grid resolutions spanning 1° to 0.05° over a mixed-land-use region of the UK. Our analyses indicate that spatial resolution matters for MDF. 
Under the community average baseline approach biological C fluxes (gross primary productivity, R_eco) simulated by CARDAMOM are relatively insensitive to resolution. However, disturbance fluxes exhibit scale variance that increases with greater landscape fragmentation and for coarser model domains. In contrast, stratification of assimilated data based on fine-resolution land use distributions resolved the resolution dependence, leading to disturbance fluxes that were 40 %–100 % higher than the baseline experiments. The differences in the simulated disturbance fluxes result in estimates of the terrestrial carbon balance in the stratified experiment that suggest a weaker C sink compared to the baseline experiment. We also find that stratifying the model domain based on land use leads to differences in the retrieved parameters that reflect variations in ecosystem function between neighbouring areas of contrasting land use. The emergent differences in model parameters between land use strata give rise to divergent responses to future climate change. Accounting for fine-scale structure in heterogeneous landscapes (e.g. stratification) is therefore vital for ensuring the ecological fidelity of large-scale MDF frameworks. The need for stratification arises because land use places strong controls on the spatial distribution of carbon stocks and plant functional traits and on the ecological processes controlling the fluxes of C through landscapes, particularly those related to management and disturbance. Given the importance of disturbance to global terrestrial C fluxes, together with the widespread increase in fragmentation of forest landscapes, these results carry broader significance for the application of MDF frameworks to constrain the terrestrial C balance at regional and national scales.

Dates

Received: 23 Aug 2022 – Discussion started: 29 Aug 2022 – Revised: 31 May 2023 – Accepted: 02 Jun 2023 – Published: 11 Aug 2023

1 Introduction

Over the past decade, terrestrial ecosystems have provided a global net carbon (C) sink sequestering ∼ 3.4 ± 0.9 PgC yr⁻¹, ∼ 30 % of anthropogenic CO₂ emissions, despite estimated emissions of ∼ 1.6 ± 0.7 PgC yr⁻¹ associated with land use and land cover change (Friedlingstein et al., 2020). The future trajectory of the terrestrial carbon sink will therefore have a significant impact on global efforts to achieve the goal of the UN Framework Convention on Climate Change to avoid dangerous climate change, reaffirmed in the Glasgow Climate Pact (UNFCCC, 2021). Quantification of spatial and temporal variations in exchange magnitude, alongside their associated uncertainties, is therefore essential to understanding the stability of the terrestrial carbon sink in the face of rapid environmental change (Hurlbert et al., 2019), and a prerequisite to robust national reporting of land-based CO₂ emissions and their attribution to different sectors (Grassi et al., 2017; Jones and Friedlingstein, 2020; McGlynn et al., 2022). Terrestrial biosphere models provide a means of quantifying the land carbon balance in a systemic, ecologically coherent way (Bonan and Doney, 2018). However, the current and future dynamics of terrestrial C exchange are highly uncertain, largely due to uncertainties in the structure and parameter constraints of the biosphere models themselves (Lovenduski and Bonan, 2017; Smallman et al., 2021).

Global land use and land cover change has increased the fragmentation of ecosystems, creating highly heterogeneous landscapes that host a mosaic of land cover and uses (Lindenmayer and Fischer, 2013; Brink et al., 2017; Matricardi et al., 2020). This heterogeneity juxtaposes ecosystems with contrasting C stocks, traits and ecological processes, management, and environmental sensitivity, at length scales of 10–100 m. For example, within the UK, landscapes comprise a patchwork of managed arable land and pasture, semi-natural and plantation woodland, heath, and settlements (Fig. 1). Insight into the dynamics of this patchwork of ecosystems has been greatly accelerated by the proliferation of Earth observation (EO) data from satellites that monitor ecosystems with ever-increasing spatial and temporal resolution (Exbrayat et al., 2019). A major challenge is to synthesise this expanding range of EO data to generate systemic understanding of the terrestrial C cycle, thus transforming ecosystem observation into ecological understanding that can inform policy development and facilitate land management (Smallman et al., 2022).

Model–data fusion (MDF) frameworks provide the means to integrate EO observations with spatially explicit process-based ecosystem models that encapsulate our understanding of how C flows through ecosystems (Luo et al., 2011), thereby providing key, mass-balanced, constraints on the fluxes of C between the atmosphere and land surface alongside their associated uncertainties (Niu et al., 2014; Bloom et al., 2016; Peylin et al., 2016; MacBean et al., 2018; Smallman et al., 2021). MDF frameworks that exploit intermediate complexity models of the terrestrial C cycle, such as CARDAMOM (Bloom et al., 2016; Exbrayat et al., 2018; Lopez-Blanco et al., 2019; Smallman et al., 2021), are able to generate “local” calibrations based on pixel-level inversions of EO and auxiliary data streams. Calibrating ecosystem models to local data is important, because the functional traits of ecosystems vary in space (Smith et al., 2013; Reich et al., 2014; Butler et al., 2017; Exbrayat et al., 2018; Lopez-Blanco et al., 2019; Smallman et al., 2021), with trait differences within biomes often exceeding differences between biomes (Van Bodegom et al., 2012; Butler et al., 2017); failure to account for such variations may lead to biases in the estimated dynamics (Scheiter et al., 2013; Exbrayat et al., 2018). However, the computational intensity of large-scale MDF frameworks limits their spatial resolutions to 10–100 km, several orders of magnitude greater than the length scales relevant to differentiating the ecosystems within landscape mosaics (e.g. Kaminski et al., 2012; Smith et al., 2013; Kuppel et al., 2014; Bloom et al., 2016; Peylin et al., 2016; Yin et al., 2020; Smallman et al., 2021).

The scale disparity between model domains and the ecological fabric those domains represent poses a major challenge to large-scale modelling of terrestrial C dynamics in heterogeneous landscapes (Stoy et al., 2009; Fisher and Koven, 2020; Levy et al., 2022). In typical spatially distributed MDF applications, available observations are aggregated to pixel-level “community averages” prior to inversion. There are usually sufficient degrees of freedom in ecological process models to fit the observed temporal changes in aggregated stocks and fluxes, based on available observation constraints (Beven, 2006; Famiglietti et al., 2021). Nevertheless their ecological fidelity may be limited in heterogeneous landscapes, for which the parameters retrieved by these community average models provide intermediate representations of the distinct ecosystems present. This limitation compromises efforts to attribute fluxes to specific land uses and raises potential for significant sources of bias when estimating the terrestrial carbon balance and its environmental sensitivity. Firstly, the C cycle represents the interplay of a number of nonlinear ecological processes, and therefore upscaling raises the familiar foe of Jensen's inequality (Jensen, 1906; Levy et al., 2022), whereby for a set of input variables, X, the expectation value for a nonlinear function f (i.e. E[f(X)]), will not yield the same estimate as the same nonlinear function applied to the average values of those variables (f(E[X])), leading to scale variance. In the case of terrestrial C fluxes, land use places strong controls both on the distribution of carbon stocks and plant functional traits within the landscape and on the processes controlling the fluxes of C through landscapes, particularly those related to exogenous processes such as management and disturbance. 
The inversion of the pixel-average environmental signals may provide biased diagnostics, in particular where pixels comprise a mixture of different ecosystem types and management. As the model domains become coarser, the length scales over which the environmental signal is averaged increase. Failing to account for the co-location of stocks and processes imposed by land use in mixed-use landscapes (e.g. concentration of C stocks in woodland, where timber harvest is focused) provides a clear source of potential systematic scale-variant bias in derived flux estimates across large scales. Additionally, community average models may miss or poorly represent processes specific to certain land uses (White et al., 2019; Kondo et al., 2020). To ensure the ecological fidelity of large-scale ecosystem C-cycle models, it is therefore vital to adequately capture the essential processes controlling the fluxes of C through these different ecosystems and their potentially divergent temporal dynamics and environmental sensitivities (Levy et al., 2022).
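The co-location argument above can be illustrated with a toy calculation. The sketch below assumes a hypothetical pixel with two land uses and made-up stocks and clearance fractions (none of the numbers come from this study); it compares a disturbance flux computed within each stratum and then area-weighted against the same flux computed from pixel-average quantities.

```python
# Hypothetical pixel: 20 % conifer woodland, 80 % pasture (illustrative values only).
fractions = {"conifer": 0.2, "pasture": 0.8}
wood_c = {"conifer": 100.0, "pasture": 5.0}    # wood C stock per unit area of each stratum
loss_frac = {"conifer": 0.05, "pasture": 0.0}  # fraction of each stratum that is cleared

# Stratified estimate: compute the flux inside each stratum, then area-weight.
flux_stratified = sum(fractions[k] * wood_c[k] * loss_frac[k] for k in fractions)

# Community average estimate: average the stock and the cleared fraction first,
# then multiply the two pixel-level means.
mean_c = sum(fractions[k] * wood_c[k] for k in fractions)
mean_loss = sum(fractions[k] * loss_frac[k] for k in fractions)
flux_average = mean_c * mean_loss

print(flux_stratified, flux_average)
```

Because the stock and the clearance are concentrated in the same stratum, the product of the averages is smaller than the average of the products in this toy case, which is the direction of bias expected when harvest is focused where C stocks are largest.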

https://bg.copernicus.org/articles/20/3301/2023/bg-20-3301-2023-f01

Figure 1. Perspective view of Sentinel-2 imagery over a typical landscape sampled from the study area illustrating the fine-scale mosaic of land use characteristic of this region. The spatial extent of the displayed domain is 10 km × 10 km and comprises part of the North River Tyne catchment in Northumberland, with the spatially extensive coniferous woodlands of Kielder Forest encroaching into the NW of the scene (top). Contains modified Copernicus Sentinel data (2021).

In this study, we specifically address the impact of the resolution trade-off in spatially explicit MDF frameworks between ecological fidelity and computational intensity by investigating how simulated carbon cycling in a mixed-use landscape in the UK responds to the spatial resolution of the model grid. We test two different MDF approaches that assimilate EO information of ecosystem characteristics to constrain model parameters and uncertainty for an intermediate complexity model of the terrestrial C cycle, DALEC (Williams et al., 2005; Bloom et al., 2016; Smallman et al., 2017, 2021). In the first approach, ecosystems are calibrated and modelled at the pixel level, representing a community average of the encompassed land cover and management. This corresponds to the approach commonly employed in large-scale ecosystem MDF frameworks (e.g. Smith et al., 2013; Bloom et al., 2016; Yin et al., 2020; Smallman et al., 2021). In the second, we stratify each pixel based on land cover and calibrate the model independently using remotely sensed data specific to each stratum, aligning more closely with the tiled plant functional type (PFT) approach employed in many terrestrial biosphere models (e.g. Sitch et al., 2008; Kaminski et al., 2012; Kuppel et al., 2014). The novelty of introducing stratification within a MDF context is that we use fine-scale ecosystem information contained within EO data to retrieve locally calibrated parameter ensembles for the ecosystems represented by each stratum and therefore retain that key advantage of MDF systems, which enables calibrated traits to vary across environmental gradients, within the constraints of the available observations and ecological knowledge (Smallman et al., 2022). However, by stratifying pixels we attempt to minimise the extent to which the environmental signals being inverted are averaged across functionally distinct ecosystems, thus ameliorating one source of scale-variant error in the estimated C fluxes. 
Furthermore, the model parameters retrieved through MDF synthesise the ecological information relayed by the assimilated data streams, within the constraints imposed by model structure and data quality. At coarser scales, aggregating observation streams results in the loss of ecological information, which we expect to be particularly marked in heterogeneous landscapes. Stratification provides one mechanism through which this ecological information loss may be reduced. Of course, the ecological fidelity of the model calibrations may still be limited where the model structure cannot adequately represent important processes, or where there are systematic errors and biases in the assimilated data; stratification by itself does not resolve these components of ecological fidelity, but it does open avenues through which they may be addressed.

We test the two MDF approaches – the novel sub-pixel stratification approach and the traditional pixel average (baseline) approach – on grid resolutions spanning 1° to 0.05°. Specifically, we address the following hypotheses:

H1. Estimated C fluxes will be scale variant, with stronger resolution sensitivity exhibited by exogenous fluxes (i.e. disturbance) compared to biogenic fluxes (e.g. gross primary productivity, GPP).
H2. C fluxes will be more consistent across grid resolutions when the framework explicitly accounts for sub-pixel heterogeneity in land use; estimates from the baseline (unstratified) experiments will converge on the stratified estimates at finer spatial resolutions.
H3. Aggregating data to coarser spatial resolutions will result in parameter estimates that increasingly fail to capture functional variations between land cover types, but stratifying the landscape prior to aggregation reduces this functional information loss.

Using MDF to combine models and data at local scale offers huge potential for rigorous quantification of the state and dynamics of the terrestrial C cycle across large spatial scales, with propagation of uncertainty through analyses (Bloom et al., 2016; Smallman et al., 2022). In testing the hypotheses outlined above, we seek to address a key challenge relating to the mismatch between the scales of ecological processes and of large-scale MDF frameworks through the development of a novel stratified MDF framework. Our approach retains the core advantages of MDF, namely local calibration with local information, while also capturing the fine-scale ecosystem heterogeneity common to fragmented or mixed-use landscapes.

2 Methods

2.1 Study area

The study area covers northern England and the Scottish borders, spanning approximately 30 000 km² across 3° of longitude and 1° of latitude (Fig. 2). The region comprises a mosaic of land cover types, including coniferous plantation forest (including the nationally significant forestry estates of Kielder Forest, Eskdalemuir Forest, and Galloway Forest), fragments of broadleaf woodland, upland heath, arable agriculture, and pasture. The longitudinal extent stretches from coast to coast, from the Firth of Clyde in the west to the North Sea in the east. Elevation varies from sea level to a high of 978 m on Scafell Pike in the Lake District. These gradients in longitude and elevation are associated with gradients in both precipitation and temperature (Jenkins et al., 2009). Precipitation decreases from west to east in response to the prevailing westerly wind direction and orographic enhancement of rainfall in areas of high topography; temperature gradients are broadly controlled by elevation.

https://bg.copernicus.org/articles/20/3301/2023/bg-20-3301-2023-f02

Figure 2. Map of the study area, spanning 1° of latitude and 3° of longitude across southern Scotland and northern England. The land cover types displayed are aggregated from the LCM2015 land cover map of Great Britain (Rowland et al., 2017), regridded to approximately 300 m resolution.

2.2 Model–data fusion with CARDAMOM (CARbon DAta MOdel fraMework)

2.2.1 DALEC

At the core of our model–data fusion framework sits DALEC, an intermediate complexity model of the terrestrial C cycle (Williams et al., 2005; Bloom and Williams, 2015; Smallman et al., 2017; Famiglietti et al., 2021). DALEC is a mass balance model of the C cycle with carbon moving through different pools based on parameterised fluxes (Fig. 3). A number of variants of DALEC have been created representing ecosystem carbon dynamics with varying degrees of complexity (Famiglietti et al., 2021; Smallman et al., 2021). The specific version of DALEC used here corresponds to the C6 model outlined in Famiglietti et al. (2021), which combines the C-cycle structure from Bloom and Williams (2015) with the revised photosynthesis model from Smallman and Williams (2019). There are four live biomass pools, specifically relating to carbon stored in foliage, labile carbon, fine roots and wood, and two dead organic carbon pools: litter and soil organic carbon. Carbon enters the system through GPP, modelled using the photosynthesis model ACM2 (Smallman and Williams, 2019), wherein GPP is simulated as a function of modelled leaf area, estimated canopy photosynthetic efficiency, absorbed solar radiation, atmospheric CO₂ concentration, air temperature, and a stomatal conductance model that balances potential water supply from the soil (assumed to be at field capacity) through the roots with atmospheric demand, determined by absorbed solar radiation and vapour pressure deficit (VPD). Carbon is lost from the system via autotrophic and heterotrophic respiration. Net primary production (NPP) is allocated between autotrophic respiration and the live pools based on fixed fractions. Canopy growth is driven by a combination of direct allocation from GPP and transfer of carbon from the labile pool. 
The flux of carbon from the labile pool to foliage and canopy senescence, driving litterfall, are controlled by a simple day-of-year phenology model with a parameterised leaf life span (Bloom and Williams, 2015). Carbon flows from the roots and wood into the litter and soil organic carbon pools, respectively, based on first-order turnover rates. Heterotrophic respiration fluxes also follow first-order kinetics, but with an exponential temperature sensitivity. A full list of model parameters is provided in the Appendix (Table A1). The relative simplicity compared to other terrestrial biosphere models makes DALEC amenable to calibration in model–data fusion frameworks and allows propagation of uncertainties through large ensemble simulations (Bloom et al., 2016; Exbrayat et al., 2018; Famiglietti et al., 2021; Smallman et al., 2021).
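As a rough illustration of the first-order kinetics described above, the sketch below advances a single dead organic carbon pool by one time step. The function name, the parameter names (`k_litter`, `theta`) and the `exp(theta * T)` form of the temperature response are assumptions made for illustration; the actual DALEC formulation and parameter definitions are those given in the cited references and Table A1.

```python
import math

def step_litter_pool(c_litter, litter_input, k_litter, theta, temp_c, dt=1.0):
    """Advance a litter C pool by one step with first-order, temperature-scaled loss.

    c_litter     : current pool size
    litter_input : C input per unit time (e.g. litterfall)
    k_litter     : first-order turnover rate at the reference temperature (per unit time)
    theta        : exponential temperature sensitivity (per degree C), assumed form
    temp_c       : air temperature in degrees C
    """
    temp_factor = math.exp(theta * temp_c)
    # The loss cannot exceed what is in the pool.
    loss = min(c_litter, c_litter * k_litter * temp_factor * dt)
    new_pool = c_litter + litter_input * dt - loss
    return new_pool, loss

# Example: the same pool loses more carbon in a warmer month.
cool_pool, cool_loss = step_litter_pool(100.0, 2.0, 0.05, 0.07, 5.0)
warm_pool, warm_loss = step_litter_pool(100.0, 2.0, 0.05, 0.07, 15.0)
```

Mass balance holds by construction: the change in the pool equals the input minus the loss, which is the property that makes the simulated fluxes mass-balanced constraints.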

2.2.2 Model–data fusion

Our model–data fusion framework, CARDAMOM (Bloom et al., 2016), uses a Bayesian approach within an adaptive proposal Markov chain Monte Carlo (AP-MCMC) framework (Haario et al., 2001) that can assimilate a range of information (Bloom et al., 2016), including remotely sensed leaf area index (LAI) and aboveground biomass (see Sect. 2.3). The premise of the approach is to take driving data describing the meteorology and disturbances such as forest clearance and fire and search the model parameter space to find parameter combinations that provide simulated dynamics that are consistent with the available data. Specifically, given a set of observations, O, with uncertainty σ, the probability of a given parameter set x, P(x|O), is calculated as a function of the likelihood of the observations given the current parameters, P(O|x), and any prior knowledge on the parameter distributions, P(x):

$$ P(x \mid O) \propto P(O \mid x) \cdot P(x) \tag{1} $$

The likelihood P(O|x) is calculated based on the misfit between the N available observations and the equivalent simulated state variables and fluxes for each parameter set, M:

$$ P(O \mid x) = \exp\left(-0.5 \cdot \sum_{n=1}^{N} \left(\frac{O_{n} - M_{n}}{\sigma_{n}}\right)^{2}\right) \tag{2} $$
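A minimal sketch of the likelihood in Eq. (2), written in log space and assuming NumPy, is given below. The function name `log_likelihood` is hypothetical and is not taken from the CARDAMOM code base.

```python
import numpy as np

def log_likelihood(obs, sim, sigma):
    """Log of Eq. (2): -0.5 * sum(((O_n - M_n) / sigma_n)^2)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return float(-0.5 * np.sum(((obs - sim) / sigma) ** 2))

# Example: three LAI observations compared with simulated values.
ll = log_likelihood([2.0, 3.0, 4.0], [2.5, 3.0, 3.0], [0.5, 0.5, 1.0])
```

Working with the logarithm avoids numerical underflow when many observations are combined, and the log prior from Eq. (1) can then simply be added to it.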

We use the Gelman–Rubin convergence criterion to determine whether multiple chains at each pixel have converged. The AP-MCMC (Haario et al., 2001) does not stipulate or target an acceptance rate; the emergent acceptance rate typically varies between 5 % and 25 %. The covariance matrix used in adapting the parameter sampling is generated from an initial phase of the MCMC. No hyperparameters are estimated as part of the process.
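For readers unfamiliar with the convergence check, the following sketch computes one common formulation of the Gelman–Rubin potential scale reduction factor for a single parameter, assuming NumPy. The exact variant and the acceptance threshold applied within CARDAMOM are not stated here, so this should be read as an illustration only.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains : array of shape (n_chains, n_samples), one row per MCMC chain.
    """
    chains = np.asarray(chains, dtype=float)
    n_chains, n_samples = chains.shape
    within = chains.var(axis=1, ddof=1).mean()              # mean within-chain variance
    between = n_samples * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    pooled = (n_samples - 1) / n_samples * within + between / n_samples
    return float(np.sqrt(pooled / within))

# Example: chains sampling the same distribution versus chains stuck in different places.
rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 2000))
stuck = mixed + 5.0 * np.arange(4)[:, None]
r_mixed, r_stuck = gelman_rubin(mixed), gelman_rubin(stuck)
```

Values close to 1 indicate that the chains are sampling a common distribution; values well above 1 indicate that they have not yet converged.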

To facilitate the calibration process, we employ a series of ecological dynamic constraints, EDCs (Bloom and Williams, 2015; Smallman et al., 2017). EDCs comprise a series of mathematical rules and functions that impose conditions on the inter-relationships between model parameters to ensure ecological “realism” in the accepted parameter sets and are based on ecological theory (Bloom and Williams, 2015). For example, the turnover of the wood carbon pool must be slower than foliage turnover. Where EDCs are not satisfied, the likelihood is set to zero. By restricting the acceptable parameter space, the EDCs therefore reduce the effective model complexity (Famiglietti et al., 2021) and tend to reduce bias and equifinality in the calibrated ensembles (Bloom and Williams, 2015). The resulting ensemble of parameter sets encapsulates the uncertainty in the calibration within the available observational constraints.
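The way an EDC enters the calibration can be sketched as a simple accept/reject rule. The example below encodes only the one constraint mentioned above (wood turnover slower than foliage turnover); the parameter names are hypothetical, and the full set of EDCs is described in Bloom and Williams (2015).

```python
import math

def edc_log_probability(params):
    """Return 0.0 if the parameter set passes the EDC, -inf otherwise.

    A log probability of -inf corresponds to setting the likelihood to zero.
    """
    if params["wood_turnover"] >= params["foliage_turnover"]:
        return -math.inf
    return 0.0

# Example: the second parameter set would always be rejected by the sampler.
plausible = {"wood_turnover": 1e-4, "foliage_turnover": 1e-2}
implausible = {"wood_turnover": 1e-2, "foliage_turnover": 1e-3}
```

Adding this term to the log likelihood and log prior means that any proposal violating the constraint has zero posterior probability and can never be accepted.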

2.2.3 Stratification approach for mosaic landscapes

We handle fine-scale heterogeneity during the model–data fusion process through sub-pixel stratification by land use (Fig. 3). Stratification is achieved by sampling the spatially gridded EO data products at their native spatial resolution according to a reference land cover map, which is resampled to the same resolution using the modal category.

The specific land cover product used is the LCM2015 land cover map produced by the UK Centre for Ecology and Hydrology (CEH) (Rowland et al., 2017), which we aggregate to four classes: coniferous woodland, broadleaf woodland, arable/pasture, and heathland, which includes semi-natural grasslands and widespread areas of non-wooded upland heath (Fig. A1 in the Appendix). Urban and coastal areas are masked from all analyses. For each pixel, separate ensembles are calibrated independently, yielding a suite of ensembles that maintain the ecological fidelity of the calibrated parameters. While aggregation of observations to coarser resolution unavoidably results in information loss, stratification preserves the distinctions between functionally distinct ecosystems; therefore in this sense, the ecological fidelity of the resultant suite of ensembles is maintained within the limitations of the model structure and data quality. This is a contrast to the “traditional” model–data fusion approach, which aggregates the data constraints into pixel-level community averages prior to calibration, yielding calibrated parameter combinations that may be attempting to account for a multitude of distinct ecological processes. However, when considering the ecological fidelity of the calibrated models, it is important to note that CARDAMOM will attempt to find parameters that satisfy the observation constraints and their uncertainties, and therefore systematic errors associated with particular data streams will propagate to lead to parameter estimates that do not provide a good representation of the ecology. We discuss this in relation to the impact of overly seasonal LAI observations for conifer woodland canopies and the resultant impact on canopy parameters in Sect. 4.3.
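The stratification steps described above can be sketched as follows, assuming NumPy, integer land cover codes, and grids whose dimensions are exact multiples of the aggregation factor. The function names and the tiny example arrays are hypothetical; real products would also need reprojection and handling of masked cells, which are omitted here.

```python
import numpy as np

def block_mode(classes, factor):
    """Resample an integer class map to a coarser grid using the modal class per block."""
    ny, nx = classes.shape
    blocks = classes.reshape(ny // factor, factor, nx // factor, factor).swapaxes(1, 2)
    blocks = blocks.reshape(ny // factor, nx // factor, factor * factor)
    n_classes = int(classes.max()) + 1
    counts = np.stack([(blocks == c).sum(axis=-1) for c in range(n_classes)], axis=-1)
    return counts.argmax(axis=-1)  # ties resolve to the lowest class code

def stratum_means(values, classes, factor):
    """Aggregate values to model pixels separately for each class.

    Returns per-class mean values and per-class fractional cover of each model pixel.
    """
    ny, nx = values.shape
    shape = (ny // factor, factor, nx // factor, factor)
    means, fracs = {}, {}
    for c in range(int(classes.max()) + 1):
        mask = (classes == c).reshape(shape)
        total = np.where(mask, values.reshape(shape), 0.0).sum(axis=(1, 3))
        count = mask.sum(axis=(1, 3))
        means[c] = np.where(count > 0, total / np.maximum(count, 1), np.nan)
        fracs[c] = count / (factor * factor)
    return means, fracs

# Fine land cover map (0 = pasture, 1 = conifer) resampled to a 2 x 2 EO grid.
lc_fine = np.array([[0, 0, 1, 1], [0, 1, 1, 1], [1, 0, 0, 0], [1, 1, 0, 0]])
lc_eo = block_mode(lc_fine, 2)
lai = np.array([[1.0, 5.0], [4.0, 2.0]])
means, fracs = stratum_means(lai, lc_eo, 2)  # one model pixel, two strata
```

In this toy case the community average LAI of the model pixel is 3.0, whereas the stratified data streams retain the contrast between 1.5 for one stratum and 4.5 for the other.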

The stratification approach is very flexible. The number of categories can be refined as necessary, within the data constraints. Regarding aggregation of uncertainties, we do not have constraints on the extent to which pixel uncertainties are correlated in space. Therefore for each stratum, spatial aggregation of uncertainty conservatively assumes correlated uncertainties (Exbrayat et al., 2018). However, we assume that individual strata, representing different ecosystems, are uncorrelated with other strata when aggregating sub-pixel ensembles to pixel level. For simplicity of comparison across the experiments in this study against the baseline experiments (i.e. no stratification), we use only one model structure across all strata and pre-process the assimilated data streams in the same way. For strata where woody tissues are not part of the dominant vegetation types, for example in areas covered by crop and pasture, the C_Wood pool also provides a reservoir for non-woody structural tissue, with the differential allocation patterns and turnover rates reflected in the retrieved parameters. Importantly, different ecosystems could in the future be modelled with distinct, ecosystem-specific models that better capture their functional process dynamics. Relevant ecosystem-specific model variants have previously been integrated within the CARDAMOM framework, for example woodlands (Smallman et al., 2017), pasture (Myrgiotis et al., 2021), and arable agriculture (Revill et al., 2021). Given the computational limitations on the resolution of the model domain, stratification would be prerequisite to the inclusion of ecosystem-specific models within regional CARDAMOM applications.
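The two aggregation assumptions stated above have different consequences for the combined uncertainty, which the following sketch makes explicit for area-weighted quantities. It uses simple analytical rules on standard deviations and assumes NumPy; the function names are hypothetical, and the study itself works with parameter ensembles rather than these closed-form expressions.

```python
import numpy as np

def aggregate_correlated(sigmas, weights):
    """Fully correlated uncertainties: weighted standard deviations add linearly."""
    return float(np.sum(np.asarray(weights) * np.asarray(sigmas)))

def aggregate_uncorrelated(sigmas, weights):
    """Uncorrelated uncertainties: weighted standard deviations add in quadrature."""
    return float(np.sqrt(np.sum((np.asarray(weights) * np.asarray(sigmas)) ** 2)))

# Within one stratum, across two equally weighted pixels (conservative assumption).
within_stratum = aggregate_correlated([1.0, 1.0], [0.5, 0.5])
# Across two strata occupying equal fractions of a pixel.
across_strata = aggregate_uncorrelated([1.0, 1.0], [0.5, 0.5])
```

For the same inputs the correlated rule gives 1.0 while the uncorrelated rule gives roughly 0.71, which is why assuming correlation within strata is the conservative choice.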

https://bg.copernicus.org/articles/20/3301/2023/bg-20-3301-2023-f03

Figure 3. Schematic flow diagrams illustrating the two model–data fusion approaches employed in this study: the “traditional” approach, whereby the input data are aggregated to pixel-level “community averages”, and the stratified approach, whereby stratification based on land use leads to calibration of a suite of land-use-specific ensembles. At the heart of both MDF approaches sits DALEC, an intermediate complexity model of the terrestrial C cycle (Bloom and Williams, 2015). While the presented time series show specifically the change in live biomass, dC_bio, it is important to note that similar information is retrieved for all fluxes and stocks within DALEC, alongside pixel-specific parameter ensembles.

2.3 Data

2.3.1 Meteorological drivers

Meteorological drivers, comprising temperature, short-wave radiation, VPD, and wind speed, are drawn from the CRU-JRAv1.1 dataset, a 6-hourly 0.5° × 0.5° reanalysis (CRU, 2019). Atmospheric CO₂ concentration is taken from the Mauna Loa global CO₂ concentration record (https://www.esrl.noaa.gov/gmd/ccgg/trends/, last access: 22 August 2020).

2.3.2 Copernicus LAI 300 m

LAI data are obtained from the 300 m Copernicus LAI product v1.0 (Fuster et al., 2020) for the period 2014–2019. The LAI estimates in the Copernicus 300 m product represent 10 d composites of daily LAI estimates, which are generated by applying a neural network to daily top-of-atmosphere input reflectances detected by the PROBA-V satellite. These 10 d LAI estimates were aggregated to monthly averages prior to assimilation. Pixel-wise uncertainty estimates are also provided with this product, calculated as the root-mean-square difference between the individual daily neural network estimates and the 10 d average. Previous work has indicated that these uncertainty estimates underestimate the true uncertainties associated with this product (Zhao et al., 2020). We therefore used a more conservative temporal aggregation approach based on the maximum uncertainty within the aggregation period.
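The temporal aggregation described above can be sketched for a single pixel and month as follows, assuming NumPy. The function name is hypothetical, and skipping missing composites via NaN-aware functions is an assumption rather than a documented detail of the processing chain.

```python
import numpy as np

def monthly_lai(lai_10d, sigma_10d):
    """Aggregate the 10 d composites of one month to a monthly value.

    The monthly LAI is the mean of the composites; the monthly uncertainty is the
    maximum of the composite uncertainties (the conservative choice described above).
    """
    lai_10d = np.asarray(lai_10d, dtype=float)
    sigma_10d = np.asarray(sigma_10d, dtype=float)
    return float(np.nanmean(lai_10d)), float(np.nanmax(sigma_10d))

# Example: three composites within one month.
lai_month, sigma_month = monthly_lai([1.0, 2.0, 3.0], [0.1, 0.5, 0.2])
```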

2.3.3 ESA Biomass CCI aboveground biomass 2017 and 2018

Aboveground biomass (AGB) estimates and associated uncertainty were extracted from the global maps published within the ESA Biomass Climate Change Initiative (CCI) collection (Version 2), comprising two estimates for the years 2017 and 2018 with a spatial resolution of 100 m (Santoro et al., 2021). The ESA CCI Biomass data are derived from synthetic aperture radar (SAR) backscatter data, specifically ALOS PALSAR L-band SAR backscatter combined with Sentinel-1 C-band SAR backscatter. Uncertainty estimates are provided with this product, calculated as the standard deviation associated with the AGB estimate after propagating errors through the SAR measurement, SAR–AGB modelling framework, and merging of L-band and C-band estimates into an overall AGB estimate (Santoro et al., 2021).

The DALEC wood carbon pool represents the combination of above- and below-ground carbon (i.e. including the coarse root component). The contribution from below-ground biomass (BGB) to the woody biomass pool, alongside the associated uncertainty, is modelled using an allometric relationship following Saatchi et al. (2011):

$$ \mathrm{BGB} = 0.489 \cdot \mathrm{AGB}^{0.89} \tag{3} $$
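Equation (3) translates directly into code. The sketch below uses hypothetical function names and computes only the central estimate of total woody biomass; the propagation of the associated uncertainty is not shown because its details are not given here.

```python
def below_ground_biomass(agb):
    """Eq. (3): BGB = 0.489 * AGB^0.89, with AGB in the units assumed by the allometry."""
    return 0.489 * agb ** 0.89

def total_woody_biomass(agb):
    """Above- plus below-ground biomass, matching the scope of the DALEC wood pool."""
    return agb + below_ground_biomass(agb)

# Example: the below-ground share shrinks as AGB grows because the exponent is below 1.
ratio_low = below_ground_biomass(10.0) / 10.0
ratio_high = below_ground_biomass(200.0) / 200.0
```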

2.3.4 SoilGrids2 soil organic carbon (SOC)

Soil organic carbon estimates and associated uncertainties were obtained from SoilGrids2, which provides 250 m resolution spatial maps of depth profiles for various soil properties (Poggio et al., 2021). These maps were produced using EO and auxiliary spatial data within a machine learning framework trained on over 230 000 individual soil profile observations. As there is no date associated with the SoilGrids2 dataset, we use the extracted SOC estimates, with their uncertainty, to provide a prior constraint on the initial SOC stock. This contrasts with our treatment of the LAI and AGB data, which are associated with specific time periods and therefore used as observational constraints on the simulated time series. However, the aggregation of the original SoilGrids2 data layers for the baseline and stratified experiments follows the same procedure.

2.3.5 Disturbance

Disturbance is imposed on DALEC based on satellite observations of tree cover loss and burned area. Disturbances related to tree cover loss are driven by observations from the Global Forest Watch (GFW) dataset (Hansen et al., 2013), which provides annual constraints on tree cover loss at 30 m resolution based on Landsat data. Note that other mechanisms of disturbance, such as agricultural harvests and pasture management, are not considered in the current analysis. To convert area estimates of tree cover loss into changes in C stocks, we use a simple clearance model in which a fraction of the C stored in C_wood, C_foliage, and C_labile is removed based on the pixel fraction (or stratum-specific sub-pixel fraction) identified within the GFW dataset as experiencing tree cover loss. In practice, most tree cover loss occurs in the conifer woodlands and is therefore concentrated in these woodlands in the stratified analysis, compared to the baseline experiments, in which we do not consider the sub-pixel distribution of land cover. Fire is imposed based on monthly aggregated burned area fractions in the MODIS MCD64A1 product (Giglio et al., 2018), which maps fire-affected areas at 500 m resolution based on changes in surface reflectance, although the occurrence of MODIS-detected fires throughout the model domain was very low. Emissions from fire are estimated by assuming that a fraction of simulated biomass either undergoes combustion, and is therefore immediately released to the atmosphere, or is transferred to the litter pool, based on tissue-specific combustion-completeness factors (Exbrayat et al., 2018).
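The clearance model described above amounts to scaling three live pools by the cleared fraction. The sketch below uses hypothetical pool names and a plain dictionary for the model state; it only reports the total C removed and makes no assumption about the fate of that carbon.

```python
def apply_clearance(pools, cleared_fraction):
    """Remove a fraction of wood, foliage and labile C from a pixel or stratum.

    pools            : dict of C stocks keyed by pool name
    cleared_fraction : fraction of the pixel (or stratum) with tree cover loss, 0 to 1
    """
    affected = ("wood", "foliage", "labile")
    removed = {name: pools[name] * cleared_fraction for name in affected}
    remaining = dict(pools)
    for name in affected:
        remaining[name] -= removed[name]
    return remaining, sum(removed.values())

# Example: 10 % of a conifer stratum is cleared; roots, litter and soil are untouched.
state = {"wood": 80.0, "foliage": 4.0, "labile": 1.0, "roots": 3.0, "litter": 5.0, "som": 200.0}
after, flux = apply_clearance(state, 0.10)
```

In the stratified experiments the cleared fraction is evaluated per stratum, so the same observed loss acts on the stock of the stratum where it actually occurred rather than on the pixel-average stock.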

2.4 Experimental setup

To test how simulated C fluxes varied with grid resolution, we calibrated DALEC across the target domain at four different grid resolutions: 0.05, 0.25, 0.50, and 1.00^∘, at a monthly time step spanning the period 2014–2019. We compared the retrieved parameters and simulated C fluxes for two MDF approaches: the proposed stratified CARDAMOM calibration that explicitly accounts for sub-pixel heterogeneity in land use and the traditional pixel aggregate CARDAMOM calibration. The latter serves as a baseline. In the baseline experiments, the observation streams were aggregated to the domain resolutions, and these community average environmental signals were assimilated into a single ensemble. In the stratified experiments, the individual observation streams were stratified at their native resolutions based on the dominant category from the high-resolution land cover map, and then these strata-specific subsets of observations were aggregated to the resolution of the model domains before assimilation. In all cases we use the same underlying DALEC model structure within the MDF framework (Fig. 3). Emergent differences in the retrieved parameters, stocks, and fluxes between experimental runs are therefore a consequence of the resolution at which the observations are aggregated, rather than ecosystem-specific differences in model structure. The resolution of the meteorological data in all cases is 0.5^∘; therefore, our analysis does not allow us to test the extent to which resolving fine-scale variations in meteorological forcing affects the overall C balance.
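The difference between the baseline and stratified aggregation can be illustrated with a short Python sketch. It assumes, for simplicity, that the land cover map has already been resampled to the native grid of the observation stream (one dominant class per observation pixel) and that the model grid is an integer multiple of that grid; the function `aggregate`, the class codes, and the LAI values are hypothetical and not taken from the study.

```python
import numpy as np


def aggregate(obs, factor, mask=None):
    """Block-average a fine-resolution 2-D array onto a coarser grid.

    obs: 2-D array of observations at native resolution.
    factor: number of fine pixels along each side of one coarse pixel.
    mask: optional boolean array selecting the fine pixels to include.
    Returns (mean, coverage): the block mean of the selected pixels and the
    fraction of each coarse pixel that they occupy.
    """
    ny, nx = obs.shape
    if ny % factor or nx % factor:
        raise ValueError("array shape must be a multiple of factor")
    if mask is None:
        mask = np.ones(obs.shape, dtype=bool)
    valid = mask & np.isfinite(obs)
    blocks = (ny // factor, factor, nx // factor, factor)
    total = np.where(valid, obs, 0.0).reshape(blocks).sum(axis=(1, 3))
    count = valid.reshape(blocks).sum(axis=(1, 3))
    mean = np.full(count.shape, np.nan)            # NaN where a stratum is absent
    np.divide(total, count, out=mean, where=count > 0)
    coverage = count / factor**2
    return mean, coverage


# Hypothetical 4 x 4 LAI field: half conifer (class 1), half arable (class 2).
lai = np.array([[4.0, 4.0, 1.0, 1.0]] * 4)
landcover = np.array([[1, 1, 2, 2]] * 4)

# Baseline: one community-average value per coarse pixel.
baseline_mean, _ = aggregate(lai, 4)

# Stratified: one value and one fractional coverage per land cover class.
stratified = {c: aggregate(lai, 4, landcover == c) for c in (1, 2)}
```

Here the baseline pixel carries a single LAI of 2.5, whereas the stratified inputs retain the contrast between the two classes together with the fractional cover needed to recombine them.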

We characterise the calibration performance for each ensemble based on the RMSE and the bias with respect to the N assimilated observations:

$$ \mathrm{RMSE} = \sqrt{\frac{\sum_{n = 1}^{N} (O_{n} - M_{n})^{2}}{N}} \tag{4} $$

$$ \mathrm{Bias} = \frac{\sum_{n = 1}^{N} (O_{n} - M_{n})}{N} \tag{5} $$

In both cases, we calculate pixel-level metrics and then weight the contributions from individual pixels when aggregating across the domain based on the fractional coverage contributed by each stratum. As the observations are themselves associated with significant uncertainty, we also consider the ratios of the RMSE and bias to the product uncertainty (σ) as a measure of agreement within the uncertainty constraints provided by the assimilated data. Values >1 indicate situations where the model was not able to fit the observations to within their associated uncertainty. Different data streams may provide inconsistent and/or incompatible information, for example due to data biases or incorrect specification of uncertainty (Zhao et al., 2020). This could lead to larger $RMSE / σ$ ratios as CARDAMOM attempts to balance inconsistent information. Larger model–data mismatch could also indicate model structural error. Values <1 may indicate improved constraints arising from the combination of complementary assimilated data streams and the ecological knowledge embedded in the model and EDCs.
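The metrics in Eqs. (4) and (5), together with the coverage weighting and the ratio to the observation uncertainty, can be written compactly in Python. The sketch below is illustrative only: O_n and M_n are read as the observed and modelled values, the function names are hypothetical, and the numbers are invented to show the arithmetic.

```python
import numpy as np


def rmse(obs, mod):
    """Root-mean-square error over the N assimilated observations (Eq. 4)."""
    obs, mod = np.asarray(obs, dtype=float), np.asarray(mod, dtype=float)
    return float(np.sqrt(np.mean((obs - mod) ** 2)))


def bias(obs, mod):
    """Mean of observation minus model (Eq. 5); positive if the model is low."""
    obs, mod = np.asarray(obs, dtype=float), np.asarray(mod, dtype=float)
    return float(np.mean(obs - mod))


def coverage_weighted(values, coverage):
    """Combine per-pixel (or per-stratum) metrics using fractional coverage."""
    values = np.asarray(values, dtype=float)
    coverage = np.asarray(coverage, dtype=float)
    return float(np.sum(values * coverage) / np.sum(coverage))


# Hypothetical LAI series for one stratum in one pixel.
obs = [2.0, 4.0]
mod = [1.0, 2.0]
sigma = 2.0                                   # assumed observation uncertainty

pixel_rmse = rmse(obs, mod)                   # sqrt((1 + 4) / 2)
pixel_bias = bias(obs, mod)                   # (1 + 2) / 2
rmse_ratio = pixel_rmse / sigma               # < 1: fit within the uncertainty

# Two strata covering 75 % and 25 % of the domain.
domain_rmse = coverage_weighted([1.0, 3.0], [0.75, 0.25])
```

With the sign convention of Eq. (5), a positive bias means the simulated values fall below the observations.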

We are able to address our first two hypotheses (H1, H2), relating to the impact of resolution and sub-pixel stratification on diagnostic analyses of C cycle dynamics, by comparing the changes in C stocks and fluxes over the data assimilation period. H3 is addressed by comparing the retrieved parameters for each run, including the individual land use classes in the stratified analysis and how these distributions shift depending on the land use and on the spatial resolution of the analysis. To understand the potential impact of any emergent differences in the retrieved parameters on future trajectories, we then ran forward simulations of our DALEC ensembles to 2100 under the SSP2-4.5 W m⁻² scenario extracted from the UK Earth System Model (UKESM; Sellar et al., 2019) contribution to CMIP6 (Eyring et al., 2016), which corresponds to a middle-of-the-road scenario with a projected mean global warming of 2.7 ^∘C (O'Neill et al., 2016). We do not impose future disturbance fluxes, so emergent differences in C dynamics will be driven by the interactions between climate and the retrieved parameters for each ensemble. To avoid step changes in meteorology between the historical meteorology (from observations) and future meteorology (simulated by UKESM), we apply the future trajectories for each scenario based on the anomaly in the UKESM forecast relative to 2019 (following Smallman et al., 2021).
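The anomaly approach can be sketched as follows. This is a minimal illustration that assumes an additive, month-by-month anomaly relative to the reference year; the text does not specify whether every driver is treated additively, and the function name and temperature values are hypothetical.

```python
import numpy as np


def anomaly_corrected(obs_ref, sim_ref, sim_future):
    """Add the simulated change since the reference year to observed drivers.

    obs_ref: 12 observed monthly values for the reference year (here 2019).
    sim_ref: 12 monthly values from the climate model for the same year.
    sim_future: array of shape (n_years, 12) with the simulated future values.
    Returns an (n_years, 12) array of drivers continuous with the observations.
    """
    obs_ref = np.asarray(obs_ref, dtype=float)
    sim_ref = np.asarray(sim_ref, dtype=float)
    sim_future = np.asarray(sim_future, dtype=float)
    if obs_ref.shape != (12,) or sim_ref.shape != (12,):
        raise ValueError("reference series must contain 12 monthly values")
    if sim_future.ndim != 2 or sim_future.shape[1] != 12:
        raise ValueError("sim_future must have shape (n_years, 12)")
    return obs_ref + (sim_future - sim_ref)    # assumed additive anomaly


# Hypothetical air temperatures: the climate model runs 2 degrees warmer than
# the observations in the reference year and warms by 1.5 degrees afterwards.
obs_2019 = np.full(12, 10.0)
sim_2019 = np.full(12, 12.0)
sim_future = np.vstack([sim_2019, sim_2019 + 1.5])
drivers = anomaly_corrected(obs_2019, sim_2019, sim_future)
```

Because only the change relative to the reference year is carried forward, the offset between the climate model and the observed meteorology does not appear as a step change at the start of the forward simulation.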

3 Results

3.1 Calibration performance

The two MDF approaches tested provided comparable fits to the calibration data. Both the baseline and stratified CARDAMOM calibrations were able to fit the assimilated C_Wood and LAI observations to well within the levels of observation uncertainty, with the RMSE between simulated and observed variables less than 50 % of the uncertainties attached to the assimilated observations (Table 1, Figs. A2, A3). In general the RMSE values for LAI were comparable between the stratified and baseline experiments (Table 1), while the RMSE for C_Wood was slightly lower for the stratified experiment (mean RMSE for C_Wood across spatial resolutions: 14.0 % for the baseline experiment; 13.6 % for the stratified experiment). For both LAI and C_Wood, the RMSE tended to increase at finer spatial resolutions in both the baseline experiments and the aggregated stratified experiment (Table 1), although the resolution-dependent trend was not consistent between individual strata (Table A2). An increase in RMSE at finer spatial resolutions can be rationalised by the smoothing effect of aggregating the remote sensing products over larger spatial scales. This not only removes the impact of high-frequency random noise in the assimilated signal but also removes variability generated by local processes (e.g. management) that are not accounted for in the relatively simple treatment of canopy dynamics encoded in the model. In the stratified experiment, the bias in C_Wood was dominated by the contributions from the woodland strata (Table A2), corresponding to their much greater C_Wood stocks, which were over 4 times higher in coniferous woodland than in arable/pasture and 6 times higher than in the heathland class (Table A2, Fig. A3). Notably, all strata, including the coniferous woodland class, exhibited a strong seasonal cycle of monthly LAI in both the assimilated observations and the simulations (Fig. A2).
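The smoothing effect of aggregation can be demonstrated with synthetic data. The sketch below is not based on the study data: it assumes spatially uncorrelated Gaussian noise superimposed on a smooth gradient, and the field size, noise level, and function name are arbitrary choices for illustration.

```python
import numpy as np


def block_mean(field, factor):
    """Average a 2-D field over non-overlapping factor x factor blocks."""
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("array shape must be a multiple of factor")
    return field.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))


rng = np.random.default_rng(0)                       # fixed seed for repeatability

# Smooth west-east gradient standing in for the true LAI signal.
signal = np.tile(np.linspace(1.0, 4.0, 100), (100, 1))
noise = rng.normal(0.0, 0.5, size=signal.shape)      # uncorrelated retrieval noise
fine = signal + noise

coarse = block_mean(fine, 10)                        # aggregate 10 x 10 pixel blocks

# Residual noise at each resolution, measured against the noise-free signal.
noise_fine = float((fine - signal).std())
noise_coarse = float((coarse - block_mean(signal, 10)).std())
```

For uncorrelated noise, averaging 100 pixels reduces the noise standard deviation by roughly a factor of 10, so the coarse field is easier to fit; spatially correlated errors or real local variability would be reduced less strongly.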

Table 1. Summary of calibration performance, aggregated across the domains for the baseline and stratified experiments. σ represents the uncertainty of the assimilated observation data; thus $RMSE / σ$ provides the ratio of the RMSE to the uncertainty attached to the observation constraint. For an equivalent breakdown of calibration performance of the individual strata in the stratified experiment, see Table A2. The values in parentheses following the RMSE and bias estimates indicate the percentage relative to the mean of the observations across the domain.


Table 2. Summary of domain-aggregated carbon budgets for the baseline and stratified experiments. Fluxes are gross primary productivity (GPP), total ecosystem respiration (R_eco), cumulative changes in live (dC_bio) and dead (dC_soil) organic C pools integrated over the 6-year assimilation period (2014–2019), and carbon losses due to harvest and other tree cover loss (harvest). Values represent the median pixel-level estimates averaged across the domain, alongside the 5th and 95th percentiles, i.e. assuming fully correlated uncertainties.



Figure 4. Spatially aggregated time series for GPP and ecosystem respiration ($R_{eco} = R_{a} + R_{het}$) and the cumulative change in carbon stocks in the live (dC_bio) and soil (dC_soil) pools, shown for the baseline and stratified ensembles for four spatial resolution domains. Only the median estimates are shown for clarity. Confidence levels for the 1 and 0.05^∘ domains for the same time series are provided in Fig. A5 in the Appendix.
