Grasslands cover around two-thirds of the agricultural land area of Great Britain (GB) and are important reservoirs of organic carbon (C). Direct assessments of the C balance of grasslands require continuous monitoring of C pools and fluxes, which is only possible at a small number of experimental sites. By relying on our quantitative understanding of ecosystem C biogeochemistry we develop models of grassland C dynamics and use them to estimate grassland C balance at various scales. Model-based estimation of the C budget of individual fields and across large domains is made complex by the spatial and temporal variability in climate and soil conditions, as well as in livestock grazing, grass cutting and other management activities. In this context, earth observations (EOs) provide subfield-resolution proxy data on the state of grassland canopies, allowing us to infer information about vegetation management, to apply observational constraints to the simulated ecosystems and, thus, to mitigate the effects of model input data uncertainty. Here, we show the potential of model–data fusion (MDF) methods to provide robust analyses of C dynamics in managed grasslands across GB. We combine EO data and biogeochemical modelling by implementing a probabilistic MDF algorithm to (1) assimilate leaf area index (LAI) time series (Sentinel-2); (2) infer defoliation instances (grazing, cutting); and (3) simulate livestock grazing, grass cutting, and C allocation and C exchanges with the atmosphere. The algorithm uses the inferred information on grazing and cutting to drive the model's C removals-and-returns module, according to which
Grasslands, natural and managed, are important biomes globally, with large soil carbon (C) pools and a key role in the cycling of water and nutrients
Grasses fix C through photosynthesis (gross primary production, GPP) and allocate a fraction of this C to grow stems, leaves and roots. Plant senescence results in transfers of biomass C to litter and dead organic matter in the soil which undergo decomposition. Defoliation, through grazing and cutting, is a major disturbance to C cycling
The potential of managed grasslands in GB, and beyond, to act as C sinks
Quantitative understanding of the dynamics of C pools and fluxes in grasslands is gained through field and lab-based experiments. This understanding is incorporated into models of ecosystem C biogeochemistry, which are conceptually coherent structures of mathematical equations that track the fluxes of C in the atmosphere–plant–soil–livestock system
Advances in satellite-based remote sensing methods, i.e. earth observation (EO), over the past decade have increased the volume and resolution of spatial data on grassland states (e.g. sward biomass, chlorophyll content) and soil factors (e.g. soil moisture and temperature)
In previous analyses at two grassland eddy flux sites in GB, we have shown that calibrating biogeochemical model parameters with ground-based LAI observations allowed robust diagnoses of the effects of grazing and cutting on independently measured net C exchanges. Here, we address four science questions. (1) Can we detect realistic variations in grassland vegetation management over national domains at field scale by assimilating EO information on LAI? (2) What is the C balance of managed grasslands and how does it vary across GB? (3) Which factors control the predicted C balance and biomass removals? (4) How large is the analytical uncertainty on C cycling and which factors affect it?
The novelty of this research lies in combining EO data and modelling to infer management of grasslands at the field scale across a nation and then to simulate the role of management on grassland C exchanges. The advent of highly resolved satellite data from Sentinel-2 makes this possible, allowing tracking of
For the identification of the location and limits of representative grassland fields, we used the 2018 UKCEH Land Cover plus map (LCM), which is updated annually by the Centre for Ecology and Hydrology (CEH) of the UK (
DALEC-Grass (Fig.
Schematic description of the DALEC-Grass model. DALEC-Grass simulates the dynamics of five C pools (C): leaf, stem, roots, litter and SOC. C is allocated to the five C pools via NPP allocation (A) and litter production (L). Vegetation removals (VRs) can occur due to grazing or cutting. DALEC-Grass determines whether a vegetation removal is caused by grazing, cutting or neither (see Sect.
The Carbon Data Model Framework (CARDAMOM) is a Bayesian MDF framework that is tailored for use in ecosystem biogeochemistry studies
Bayesian inference is performed in CARDAMOM using the root mean square error (RMSE) between the simulated and the EO-based LAI time series to calculate and attribute likelihoods to every sampled parameter vector. In this study, the simulated annealing (SA) algorithm is used to implement the probabilistic parameter sampling process
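The idea of scoring sampled parameter vectors by the RMSE between simulated and observed LAI, and searching the prior ranges with simulated annealing, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_lai_model` is a hypothetical two-parameter stand-in for DALEC-Grass, the geometric cooling schedule and Gaussian proposal are generic choices, and the routine returns a single best vector rather than the posterior ensemble that CARDAMOM produces.

```python
import math
import random


def rmse(simulated, observed):
    """Root mean square error between two equally long series."""
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n)


def toy_lai_model(params, weeks):
    """Hypothetical stand-in for DALEC-Grass: a seasonal sine curve."""
    base, amplitude = params
    return [base + amplitude * math.sin(2 * math.pi * w / 52) for w in weeks]


def anneal(observed, weeks, bounds, n_iter=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Generic simulated annealing over uniform prior ranges (illustrative only)."""
    rng = random.Random(seed)
    # Start from a draw out of the uniform priors.
    current = [rng.uniform(lo, hi) for lo, hi in bounds]
    current_cost = rmse(toy_lai_model(current, weeks), observed)
    best, best_cost = list(current), current_cost
    for i in range(n_iter):
        # Geometric cooling from t_start to t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (i / (n_iter - 1))
        # Gaussian step scaled to the prior width, clipped to the prior range.
        proposal = [min(hi, max(lo, p + rng.gauss(0.0, 0.1 * (hi - lo))))
                    for p, (lo, hi) in zip(current, bounds)]
        cost = rmse(toy_lai_model(proposal, weeks), observed)
        # Always accept improvements; accept worse fits with a
        # temperature-dependent probability.
        if cost < current_cost or rng.random() < math.exp((current_cost - cost) / temp):
            current, current_cost = proposal, cost
            if cost < best_cost:
                best, best_cost = list(proposal), cost
    return best, best_cost
```

Calling `anneal(observed_lai, weeks, [(0.0, 5.0), (0.0, 3.0)])` with a weekly LAI series returns the best-fitting toy parameters and their RMSE; the bounds play the role of the uniform prior ranges.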
A uniform distribution was used for each of the 31 DALEC-Grass parameter priors, and the range for each parameter prior is presented in Table
Two independent EO datasets on leaf area index (LAI) were used in this study. The first dataset is the Copernicus Global Land Service (CGLS,
The EO-based LAI data that are assimilated in CARDAMOM were calculated from Sentinel-2 (S2) images. Sentinel-2 is an EO mission of the European Space Agency (ESA) that consists of two optical-imaging polar-orbiting satellites that were launched in 2015 (S2A) and 2017 (S2B). Atmospherically corrected images at 10, 20 and 60 m resolutions (L2A product) were downloaded from the Amazon Web Services (AWS) S2 data pool (
Six meteorological drivers are used in DALEC-Grass to drive variations in the biogeochemical process: (1) minimum and maximum temperature (
Agricultural census-based data on the number of sheep and cattle (beef and dairy) were obtained from the EDINA AgCensus database
Implementing the MDF algorithm for the thousands of fields that are classified as improved grassland in the LCM database is computationally demanding and time consuming. In addition, the spatial resolution of the CGLS (Proba-V-based) vegetation reduction data is 300 m (9 ha). Taking into account that the average managed grassland field is 5–9 ha in area, we set a minimum limit of 6 ha (and a maximum of 13 ha) when filtering the LCM dataset to obtain the location of fields. Moreover, the number of EO data points available for each field depends on the time of image capturing and the amount of cloud cover at overpass. As a consequence, the number of dates of available EO data can vary considerably between fields. We set a limit of having at least 30 S2 data points (for 2017–2018) for a field to be selected for simulation. The fields that met the conditions were allocated to 25 km
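The field selection rules described above (area between 6 and 13 ha, at least 30 S2 data points for 2017–2018) amount to a simple filter. The sketch below is illustrative: the `Field` record and its attribute names are hypothetical, and treating the area limits as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Field:
    field_id: str   # hypothetical identifier
    area_ha: float  # field area in hectares
    n_s2_obs: int   # number of Sentinel-2 LAI data points in 2017-2018


def select_fields(fields, min_area_ha=6.0, max_area_ha=13.0, min_s2_obs=30):
    """Keep fields that satisfy the area and data-availability limits."""
    return [
        f for f in fields
        if min_area_ha <= f.area_ha <= max_area_ha and f.n_s2_obs >= min_s2_obs
    ]
```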
Topographic map of Great Britain (GB) with red symbols showing the locations of sampled fields. Built-up areas are shown in black. Digital elevation model from
We used the two independent EO datasets on LAI to inform the analysis. The first dataset (CGLS) was used to estimate weekly absolute LAI change (vegetation reduction time series) and, based on the magnitude of change, to derive whether cutting or grazing had occurred. CGLS data therefore provide an independent estimate of management operations and act as a driver of LAI loss in the model, week to week. The value of CGLS is the availability of continuous weekly data, with no gaps in coverage. Their weakness is the large uncertainty on LAI change, particularly for smaller fields (
In more detail, DALEC-Grass simulates weekly biomass growth driven by weather. LAI loss is then imposed by CGLS estimates. Broad uncertainty on the CGLS estimates recognises the potential bias in this driving dataset
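As an illustration of what a weekly vegetation reduction time series might look like, the sketch below derives week-to-week LAI decreases from a gap-free weekly LAI series. It is an assumption that only decreases are retained and that increases are set to zero; the function name is hypothetical, and the attribution of each reduction to grazing, cutting or neither is left to the model and is not reproduced here.

```python
def weekly_vegetation_reduction(lai_weekly):
    """Absolute week-to-week LAI decrease (m2 m-2); increases are set to zero."""
    return [max(0.0, prev - cur) for prev, cur in zip(lai_weekly, lai_weekly[1:])]
```

The returned series has one element fewer than the input, because each value describes the change between two consecutive weeks.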
Description of how S2 LAI observations, (CGLS) vegetation reduction time series, and DALEC-Grass are used to infer and calculate managed vegetation removals (grazing, cutting). The DALEC-Grass biophysical module simulates weekly leaf growth and senescence driven by weather data. The DALEC-Grass management module simulates weekly vegetation removals driven by the vegetation reduction data. The CARDAMOM MDF algorithm calibrates the parameters of DALEC-Grass in order to achieve the smallest possible error (RMSE) between S2 LAI and simulated LAI time series. LAI
The micrometeorological sign convention is used when presenting C balance variables, whereby a positive (
The effectiveness of the LAI assimilation process is assessed by quantifying the level of fit between MDF-predicted and EO-based LAI time series using (1) the percentage of overlap between the EO-based data points (field mean) and the corresponding MDF-predicted ranges (95 % confidence interval), (2) the RMSE, and (3) the bias between the simulated and observed time series. To account for the possibility that some of the simulated fields may no longer be managed grasslands, owing to changes in management, despite being classified as such in the LCM data, we remove from the results any fields for which the estimated overlap is
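The three fit measures can be sketched as follows. This is a minimal illustration: the function and argument names are hypothetical, the confidence bounds are treated as inclusive, and bias is taken as the mean of simulated minus observed values, which is an assumed sign convention.

```python
import math


def fit_metrics(observed, sim_mean, sim_lower, sim_upper):
    """Return (overlap in %, RMSE, bias) for one field's LAI time series.

    observed  : EO-based LAI values (field mean)
    sim_mean  : MDF-predicted mean LAI on the same dates
    sim_lower : lower bound of the predicted 95 % confidence interval
    sim_upper : upper bound of the predicted 95 % confidence interval
    """
    n = len(observed)
    inside = sum(1 for o, lo, hi in zip(observed, sim_lower, sim_upper) if lo <= o <= hi)
    overlap = 100.0 * inside / n
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(sim_mean, observed)) / n)
    bias = sum(s - o for s, o in zip(sim_mean, observed)) / n  # simulated minus observed
    return overlap, rmse, bias
```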
To answer our first science question, the MDF-predicted weekly grazed biomass is converted into livestock units (LU) per hectare following the assumptions that (1) one head of cattle is 1 LU and one sheep is 0.11 LU, (2) 1 LU weighs 650 kg, (3) an animal demands
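The livestock unit equivalences stated above can be applied to census head counts as in the sketch below. Only the 1 LU per head of cattle, 0.11 LU per sheep and 650 kg per LU figures come from the text; the function names and the per-hectare normalisation by a grassland area argument are assumptions, and the step that converts grazed biomass into LU is not shown because it depends on an intake figure not given here.

```python
CATTLE_LU = 1.0        # one head of cattle is 1 LU
SHEEP_LU = 0.11        # one sheep is 0.11 LU
LU_WEIGHT_KG = 650.0   # assumed live weight of 1 LU


def livestock_density(n_cattle, n_sheep, grass_area_ha):
    """Livestock density in LU per hectare from head counts."""
    if grass_area_ha <= 0:
        raise ValueError("grass_area_ha must be positive")
    return (n_cattle * CATTLE_LU + n_sheep * SHEEP_LU) / grass_area_ha


def live_weight_per_ha(density_lu_ha):
    """Live weight carried per hectare (kg ha-1) for a given LU density."""
    return density_lu_ha * LU_WEIGHT_KG
```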
To answer our second science question, we present and examine the annual and seasonal C balance and the cumulative annual fluxes of the simulated fields. To assess what controls the predicted C balance of the simulated grasslands (our third science question), we quantify the correlation coefficient between meteorological model drivers, management-related model parameters and MDF predictions of C cycling. In order to provide a more quantitative assessment of the factors that control grassland C dynamics, we quantify the relative impact of management and climate on the MDF-predicted NBE. We use the model meteorological drivers and the posterior model parameters related to management and climatic controls for every simulated field to train a random forest (RF) model that estimates NBE. A total of 75 % of the data are used to train the RF model and 25 % to assess its predictive ability (coefficient of determination). Thereafter, we use the Shapley additive explanations (SHAP) method to quantify how much each RF predictor affects the RF-predicted NBE
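The 75 %/25 % split and the coefficient of determination used to judge predictive ability can be sketched in a few lines. The random forest and the SHAP analysis themselves are not reproduced here; the function names, the shuffling with a fixed seed and the rounding of the training set size are assumptions made for illustration.

```python
import random


def split_train_test(rows, train_fraction=0.75, seed=0):
    """Shuffle rows and split them into training and test subsets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In this setting, each row would hold one field's predictors (meteorological drivers and posterior parameters) together with its MDF-predicted NBE, and `r_squared` would be evaluated on the held-out 25 %.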
For each simulated field and model output, the MDF algorithm produces a mean and 95 % confidence interval. To answer our fourth science question, we quantify the predictive uncertainty around an output by calculating its relative confidence range (RCR). RCR is equal to the size of the MDF-predicted 95 % confidence interval divided by the corresponding mean and expressed as %. We present and examine the estimated RCRs to identify the key factors that affect uncertainty.
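The RCR definition translates directly into a short function. In the sketch below the function name is hypothetical, and dividing by the absolute value of the mean (so that the RCR stays positive for outputs that can be negative, such as NEE) is an assumption.

```python
def relative_confidence_range(mean, ci_lower, ci_upper):
    """RCR (%): width of the 95 % confidence interval relative to the mean."""
    if mean == 0:
        raise ValueError("RCR is undefined for a mean of zero")
    return 100.0 * (ci_upper - ci_lower) / abs(mean)
```

For example, a predicted mean of 200 with a 95 % confidence interval of 150 to 260 gives an RCR of 55 %.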
For 12 % of the initial dataset (2108 fields) our analysis failed to generate a simulated-vs.-observed LAI overlap
A comparison of MDF predictions of livestock density against census-based data (Fig.
Cartograms of census-based and MDF-predicted livestock density (livestock units, LU ha
The analysis suggests that the 1855 simulated fields were managed with varying intensity. The majority of simulated fields were grazed-only (75 %) and no cut-only fields were simulated. Grazed biomass exceeded cut biomass in 85 % of the fields (GCD
Most of the MDF-predicted first grass cuts (85 %) occurred between the first half of May and the second half of July. For the fields where more than one cut was identified, the period between the first and last cut was
MDF-based C cycle estimates show that management affected the C balance of the simulated grassland ecosystems significantly. The difference between grazed and cut biomass volume (GCD) is used to present the impact that these two biomass removal methods have on C balance. The mean annual GPP across GB fields was 30 % higher (
Violin plots of GPP, Reco, removed biomass and C flux from litter to SOC based on MDF predictions (2017–2018) for all simulated fields. Violin plots are split according to whether grazing or cutting removed most grass biomass. The blue side of each violin plot shows results for fields in which most biomass was removed via grazing (GCD
Seasonal NEE varied across GB, with strongest sinks in spring and summer, strongest sources in autumn, and close to neutral net exchanges during winter (Fig.
Spatial mean (
Cartograms of cumulative NEE for summer 2017, summer 2018 and change in summer NEE between 2017 and 2018 (cumulative NEE 2018
Correlation coefficients (Fig.
Heatmap of correlation coefficients (
We expanded on the correlations-based analysis of the MDF results by using (1) the MDF-predicted data on NBE, as well as (2) the corresponding meteorological drivers and (3) model parameters describing climatic and management controls on grass growth, to train a RF model that estimates NBE. The resulting RF model was able to explain 93 % of the variance in MDF-predicted NBE (
Normalised SHAP values for RF-based estimation of annual NBE in 2017 and 2018.
The size of the uncertainty around MDF estimates is quantified using the RCR (relative confidence range) of MDF outputs. The mean RCR is
Cartograms of relative confidence range (RCR, 100
Process modelling combined with earth observation can identify grassland vegetation management effectively over large spatial domains. The distribution of MDF-based livestock densities across GB mirrors the independently determined census-based numbers of cattle and sheep per area; MDF-estimated livestock density is 0.7 (
The MDF-predicted GB-average pasture dry matter yield (
The presented results are probabilistic model-based estimates produced by carefully upscaling our quantitative understanding of grassland C cycling under GB conditions. DALEC-Grass is a process-based C biogeochemical model that has been calibrated and validated against in situ data on C pools and fluxes collected over 11 years and at two variably managed grassland sites in GB
The results of this study show that the majority of managed grasslands in GB were net C sinks during 2017 (NEE
Our study shows that biomass removals were key determinants of the C balance of managed grasslands. The role of cutting relative to grazing as a biomass removal method was found to be particularly important. Grasslands in which most biomass was removed via cutting had a lower GPP and Reco than grasslands in which grazing was the main biomass removal method (Fig.
Management and climate have a combined effect on C dynamics, and disentangling their individual impacts is a challenge
The conclusions that we draw with regard to which factors have more influence on grassland C dynamics are based on two assumptions. Firstly, we assume that the simulated grasslands are well optimised for the intended use, i.e. to sustain different types of livestock (e.g. dairy and/or beef cattle and/or sheep). This means that each sward is maintained in good condition and that farmers manage their fields optimally based on their long-term experience. Secondly, the fact that a large share of the simulated fields (especially in the southern half of England) experienced continuous weeks of unusually hot and dry weather conditions during one (2018) of the two simulated years is treated as a climate anomaly; i.e. climate in 2018 is not representative of normal climatic controls on C balance. Based on these assumptions, we argue that the simulated vegetation management, as inferred from the observational data, was adapted to the seasonal weather anomaly. Therefore, significant changes in ecosystem C cycling were beyond the control of human management and can be mostly attributed to the seasonal weather anomaly.
Our findings on the role of management are in agreement with findings in a number of relevant studies, notwithstanding differences in methodologies and ecoclimatic conditions.
In summary, we conclude that management is a key determinant of the C balance of managed grasslands in GB. We note that climatic anomalies, such as heat waves and droughts, can reduce the relative importance of management as a determinant of grassland C balance. In simple terms, human decisions can adjust grassland sink or source strength, and this depends mostly on the soil's existing C stock, the sward's composition and condition, and the timing and intensity of livestock grazing and grass cutting. Climate change can change this fine C balance substantially, and prolonged heat and drought is one way in which this can occur in regions with temperate maritime climate.
We use the RCR to quantify the uncertainty around the MDF-predicted variables. RCR shows how wide the 95 % confidence intervals (i.e.
The estimated predictive uncertainty for LAI, GPP and grazed biomass was noticeably higher for fields that were mostly cut (GCD
This study uses an MDF algorithm that depends on EO data and process modelling of C dynamics in grasslands. The Proba-V-based vegetation reduction time series that are used to drive DALEC-Grass have a resolution (9 ha) that is coarse when compared to the average size of grassland fields in GB. These noisy data on vegetation reduction cause increased uncertainty in MDF predictions, especially with regard to the timing of cutting events. Moreover, most areas of GB are affected by frequent cloudiness, which means that the number of Sentinel-2-based LAI data points per year and simulated field is limited compared to other parts of the world. However, we ensured at least 30 images per field over 2 years in our selection process, and this richness of information at field resolution and for national domains is unprecedented in such an analysis.
DALEC-Grass was developed and tested under GB conditions, showing high skill in predicting C allocation and CO
In general, the ability to use field-specific observed information on key aspects of grassland vegetation and to infer vegetation grazing and cutting are the key advances presented in this study. The majority of grassland-focused model-based estimates for large domains typically rely on uncertain information on grazing and cutting. Also, with few recent exceptions, most relevant studies do not include field-specific validation of model predictions, which results in highly uncertain estimates. On the other hand, the calculation of some lateral flows of C (manure input) remains an outstanding challenge as this depends on information that cannot be inferred from EO-based time series. The size, type and age of livestock significantly affect aboveground biomass and the turnover of grazed biomass C
Our approach cannot detect human-managed manure application from EO data and so does not consider this method of manure-C addition to the soil. Grazed biomass-to-manure-C conversion factors are used in DALEC-Grass to estimate the amount of manure C produced and added to the simulated soil litter pool at each time step. This way of applying manure is a simplification of what happens in reality where grazing livestock will deposit some manure to the soil while some of their manure will be collected during periods of housing and stored to be applied across a farm's fields and/or traded to other grassland or arable farms. Typically, most manure is applied in GB grasslands during spring and autumn. Despite all this, GB livestock is primarily grass fed, and the volume of manure produced in a farm is directly related to the biomass productivity and the livestock density maintained at the different farm fields, which MDF can detect and deduce respectively
Our overarching aim is to produce a computational ecosystem modelling framework that is (1) able to utilise the swathes of EO data that are increasingly becoming available while (2) being easy to adapt and incorporate new knowledge gained from field/lab experiments and observations. This study showed that the MDF algorithm will benefit most from improving the temporal resolution and quality of EO LAI data used. We believe that by advancing on this front the algorithm will be able to produce more accurate estimates across grasslands in Europe and other regions with similar agroclimatic conditions. Introducing soil moisture and N-cycling-related processes to DALEC-Grass will pave the way for more detailed consideration of the effects of fertiliser use and different grass mixtures, as well as for its application at climatically critical rangelands and pastures across the world (e.g. tropical and dry regions). DALEC-Grass has a structure that facilitates the incorporation of modelling advances made with other DALEC-based models such as those presented in
This study presented how, by fusing EO data and biogeochemical modelling of managed grassland C dynamics at field resolution across a national domain, an MDF framework can detect biomass removals and use this information to predict grassland C fluxes and balance probabilistically. In addition, the study showed how field-specific model predictions of grassland vegetation can be validated against field-specific EO-based LAI time series. We argue that both of these uses of EO data in model-based studies represent key advancements that increase the credibility of field-scale estimates of C dynamics in managed grasslands. Our results show that MDF-predicted annual yields and livestock density mirror ground-based information well. In agreement with a range of studies on temperate grasslands in Europe and beyond, our study reaffirms the C sink potential of managed grasslands in GB. In contrast to previous measurements and model-based studies, however, we showed how MDF can quantify and interpret C dynamics across a large domain (GB) while also resolving subfield-scale variability in vegetation management. This granularity is vital as our results show how management differences between fields have strong effects on net C balance. It is widely accepted that climate change is manifesting itself, among other ways, as more frequent droughts in northern Europe
National targets for C neutrality in the agricultural sector and the unfolding of climate change create a challenging future for GB grassland farming. The estimation of grassland C balance using MDF has a number of limitations, including the lack of field-scale data on soil C and fertiliser and manure application across large domains. Yet, these limitations can be addressed if MDF is used as part of a land C management monitoring system, in which farmers report field-scale activity data (i.e. fertiliser and manure use) and measure soil C for validation purposes. Overall, the strength of probabilistic MDF is its potential to utilise disparate observational data and provide estimates with well-defined uncertainties. In this respect, the volume and resolution of observational data on plant and soil conditions of grasslands continue to grow driven by advances in EO science and infrastructure and by an increasing interest and investment in environmental monitoring technologies (e.g. low-cost proximal sensing, integrated network of sensors and stations). We argue that in the near future farmers and governments alike will be able to benefit from MDF approaches that provide key monitoring tools for C balance, as well as guidance on adaptation and mitigation of climate change effects on agriculture towards meeting net-zero goals.
DALEC-Grass parameters (number, description, units and prior min/max values).
Ecological and dynamic constraints (EDCs).
Schematic description of data sources and data flow in the model–data fusion process. The DALEC-Grass model is driven by weekly weather and vegetation reduction data (see Sect.
Cartograms of overlap (%), root mean square error (RMSE) (m
Kernel density estimates plot (inner part) and distributions (outer part) of MDF-predicted (
Mean month of year of first simulated grass cutting per GB region.
Cartograms of MDF-predicted GPP, Reco, NEE, NBE, removed biomass and C flux to SOC. The mean value for 2017–2018 across all fields in each cell is presented. The size of cells is adjusted according to the number of simulated fields within it. Unit: gC m
Cartograms of MDF-predicted NBE for 2017 and 2018. The mean across all fields in each cell is presented. The size of cells is adjusted according to the number of simulated fields within it. Unit: gC m
Map of inter-annual (2017–2018) difference in 3-week average VPD (Pa) per season. The map is a 25 km grid of GB. Only grid cells that contain at least one simulated field are presented.
Cartograms of cumulative NEE per season for 2017 (December 2016–November 2017) and 2018 (December 2017–November 2018), and change in seasonal NEE from 2017 to 2018. The mean MDF-predicted seasonal NEE of all fields in each cell is presented. The size of cells is adjusted according to the number of simulated fields within it.
The code and data used in this study are available on the Edinburgh DataShare digital repository of research data (
VM and MW devised the study concept. VM developed DALEC-Grass, implemented the MDF and undertook the analysis with support from all authors. VM led the writing, with support from MW and TLS.
The contact author has declared that none of the authors has any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This study was supported by the Natural Environment Research Council (NERC) of the UK through several projects: the Soils Research to deliver Greenhouse Gas REmovals and Abatement Technologies (Soils-R-GGREAT) project (NE/P018920/1), DARE-UK (NE/S003819/1), and GREENHOUSE (NE/K002619/1). Mathew Williams acknowledges support from NCEO and the Royal Society. We acknowledge the inputs and support of the CARDAMOM development team. We thank Anthony Bloom (NASA Jet Propulsion Laboratory) for his support.
This research has been supported by the Natural Environment Research Council (grant nos. NE/P018920/1, NE/S003819/1 and NE/K002619/1).
This paper was edited by Sönke Zaehle and reviewed by Aiming Qi and two anonymous referees.