The carbon budget of the managed grasslands of Great Britain informed by earth observations

. Grasslands cover around two thirds of the agricultural land area of Great Britain (GB) and are important reservoirs of organic carbon (C). Outside a few well-monitored sites the quantiﬁcation of C dynamics in managed grasslands is limited. Using process models to extrapolate site data is made complex by the spatio-temporal variability of climate, soils, and of grazing and cutting patterns. However, new earth observations (EO) provide sub-ﬁeld resolution proxy data on the state of 5 grassland canopies to support upscaling. Here, we show the potential of model-data fusion (MDF) to provide robust near-real time C analyses of managed grasslands across GB. We combine EO data and biogeochemical modelling to estimate grassland C balance and to infer the role of management and environment. We implement a MDF algorithm to (1) infer grassland cutting and grazing from data on vegetation volume (Proba-V), (2) optimise model parameters by assimilating leaf area index (LAI) times series data (Sentinel-2) and (3) simulate livestock grazing, grass cutting, and C allocation and C exchanges with 10 the atmosphere. The MDF algorithm was applied for 2017 and 2018 to generate probabilistic C estimates at 1855 ﬁelds sampled from across GB. The algorithm was able to effectively assimilate the Sentinel-2 based LAI time-series (overlap=80%, RMSE=1.1 m − 2 m − 2 , bias=0.35 m 2 m − 2 ) and predict livestock densities per area that correspond with independent agricultural census-based data ( r =0.68, RMSE=0.45 LU ha − 1 , bias=-0.06 LU ha − 1 ). The mean total removed biomass across all simulated ﬁelds was 6 ( ± 1 . 8 ) tDMha − 1 y − 1 . The simulated grassland ecosystems were on average C sinks in 2017 and 2018; the net 15 biome exchange (NBE) was -191 ± 81 (2017) and -49 ± 69 gCm − 2 y − 1 (2018). Our results show that the 2018 European summer drought reduced the strength of C sinks in UK grasslands and led to a 9-fold increase in the number ﬁelds that were annual C sources (NBE>0) in 2018 (18% of ﬁelds) compared to 2017 (2% of ﬁelds). The ﬁeld-scale analysis showed that management in the form of timing, intensity and type of defoliation were key determinants of the C balance of managed grasslands, with cut ﬁelds acting as weaker C sinks compared to grazed ﬁelds. Nevertheless, extreme weather, such as prolonged droughts, can 20 convert grassland C sinks to sources. calibrations of DALEC-Grass parameters by assimilating local EO LAI time-series. The MDF algorithm is implemented for 2017-2018 on a large sample of 1855 managed grassland ﬁelds in GB (England, Wales and Scotland). We obtain a sample representative of the different grassland ecosystems and management types by randomly selecting 1 ﬁeld per 25 km 2 across GB from the UK’s 115 Land Cover Map (vector land parcels). Grazing intensity, cutting timing and yields and C pools and ﬂuxes are predicted by DALEC-Grass for every simulated ﬁeld. In order to evaluate our MDF analysis, we compare predictions of annual grass yields (grazed and cut biomass) to biomass utilisation data from the relevant literature and to livestock density data from the most recent GB agricultural census data. In 2018, GB was affected by a summer heat and drought wave (summer 2018 was ≈ 1 o C warmer than summer 2017) allowing us to examine the impact of climate anomalies on grassland C balance (Kendon 120 2019) The aim of this study is answer questions:

grassland canopies to support upscaling. Here, we show the potential of model-data fusion (MDF) to provide robust near-real time C analyses of managed grasslands across GB. We combine EO data and biogeochemical modelling to estimate grassland C balance and to infer the role of management and environment. We implement a MDF algorithm to (1) infer grassland cutting and grazing from data on vegetation volume (Proba-V), (2) optimise model parameters by assimilating leaf area index (LAI) times series data (Sentinel-2) and (3) simulate livestock grazing, grass cutting, and C allocation and C exchanges with 10 the atmosphere. The MDF algorithm was applied for 2017 and 2018 to generate probabilistic C estimates at 1855 fields sampled from across GB. The algorithm was able to effectively assimilate the Sentinel-2 based LAI time-series (overlap=80%, RMSE=1.1 m −2 m −2 , bias=0.35 m 2 m −2 ) and predict livestock densities per area that correspond with independent agricultural census-based data (r=0.68, RMSE=0.45 LU ha −1 , bias=-0.06 LU ha −1 ). The mean total removed biomass across all simulated fields was 6 (±1.8) tDMha −1 y −1 . The simulated grassland ecosystems were on average C sinks in 2017 and 2018; the net 15 biome exchange (NBE) was -191±81 (2017) and -49±69 gCm −2 y −1 (2018). Our results show that the 2018 European summer drought reduced the strength of C sinks in UK grasslands and led to a 9-fold increase in the number fields that were annual C sources (NBE>0) in 2018 (18% of fields) compared to 2017 (2% of fields). The field-scale analysis showed that management in the form of timing, intensity and type of defoliation were key determinants of the C balance of managed grasslands, with cut fields acting as weaker C sinks compared to grazed fields. Nevertheless, extreme weather, such as prolonged droughts, can 20 convert grassland C sinks to sources. management plays a major role (Chang et al., 2021).
Quantitative understanding of the dynamics of C pools and fluxes in grasslands is gained through field and lab-based experiments. This understanding is incorporated into models of ecosystem C biogeochemistry, which are conceptually-coherent structures of mathematical equations that track the fluxes of C in the atmosphere-plant-soil-livestock system (Snow et al., 2014;Chang et al., 2013;Ma et al., 2015;Sándor et al., 2018;Puche et al., 2019). Biogeochemical models can upscale knowledge on 80 ecosystem C dynamics across large areas and over time. Model-based upscaling represents a robust way for diagnosing the role of climate and management on C exchanges, and exploring C sensitivity of future climate and alternate management scenarios.
Models require information on environmental conditions as inputs. Providing these inputs across space introduces uncertainty (input uncertainty) to model predictions because the relevant data come from spatial extrapolations of point measurements (i.e. soil surveys, weather stations). Another key source of input uncertainty is the lack of accurate spatial data on grassland 85 management i.e. harvest and grazing patterns and manure and fertiliser use, which must therefore be inferred by some means (Chang et al., 2015a;Fetzel et al., 2017;Vuichard et al., 2007;Rolinski et al., 2018;Blanke et al., 2018;Abdalla et al., 2018;Chang et al., 2021). Model credibility can be supported by effective calibration with ground data and validating predictions using independent data. Providing uncertainty estimates on model outputs provides robust contexts for model interpretation (Kennedy and O'Hagan, 2001;Dietze, 2017). 90 Advances in satellite-based remote sensing methods, i.e. Earth Observation (EO), over the past decade have increased the volume and resolution of spatial data on grassland states (e.g. sward biomass, chlorophyll content) and soil factors (e.g. soil moisture and temperature) (Reinermann and Asam, 2020;Ustin and Middleton, 2021). There is an opportunity to use EO data in model-based studies to perform upscaling with reduced input and parametric uncertainty and with constrained predictive uncertainty and model bias. In this context, high resolution (<100 m), frequently-retrieved (∼weekly) EO data on the state of 95 grassland vegetation can be assimilated and used to validate relevant model predictions and to calibrate model parameters at the scale of individual grassland fields (Patenaude et al., 2008;Oijen et al., 2011;Maselli et al., 2013;Pique et al., 2020b, a).
In addition, time-series of EO-based vegetation indices can be used to monitor vegetation volume change and identify the timing of the relevant management i.e. grass harvesting and livestock grazing (Dusseux et al., 2014;Giménez et al., 2017;Yu et al., 2018;Reichstein et al., 2019). Leaf area index (LAI) conveys information on vegetation structure and volume, and 100 can be estimated from multispectral optical EO data (Munier et al., 2018). The volume of EO-derived LAI data is, however, dependant on the frequency of satellite overpasses and the level of cloudiness.
In previous analyses at two grassland eddy flux sites in GB we have shown that calibrating biogeochemical model parameters with ground-based LAI observations allowed robust diagnoses of the effects of grazing and cutting on independently measured net C exchanges (Myrgiotis et al., 2020). In a follow-up study at another grassland research farm in GB we demonstrated 105 that model calibration with satellite-based LAI observations was effective for monitoring biomass removals and quantifying management impacts on field-scale C balance (Myrgiotis et al., 2021). Here, we build on this earlier work to demonstrate how EO data and biogeochemical modelling can be combined to (1) detect vegetation-related management operations (i.e. grass cutting and grazing intensity) and (2) to estimate the variation in C dynamics over a large domain (GB) and at fine resolution (sub-field scale). We use a parsimonious process-based biogeochemical model of grassland C dynamics (DALEC-Grass) that 110 is integrated into a probabilistic model-data fusion (MDF) algorithm (CARDAMOM). DALEC-Grass is driven by weather data and field-specific EO-based data on weekly change in vegetation volume. CARDAMOM performs field-specific calibrations of DALEC-Grass parameters by assimilating local EO LAI time-series. The MDF algorithm is implemented for 2017-2018 on a large sample of 1855 managed grassland fields in GB (England, Wales and Scotland). We obtain a sample representative of the different grassland ecosystems and management types by randomly selecting 1 field per 25 km 2 across GB from the UK's 115 Land Cover Map (vector land parcels). Grazing intensity, cutting timing and yields and C pools and fluxes are predicted by DALEC-Grass for every simulated field. In order to evaluate our MDF analysis, we compare predictions of annual grass yields (grazed and cut biomass) to biomass utilisation data from the relevant literature and to livestock density data from the most recent GB agricultural census data. In 2018, GB was affected by a summer heat and drought wave (summer 2018 was ≈ 1 o C warmer than summer 2017) allowing us to examine the impact of climate anomalies on grassland C balance (Kendon et al.,120 2018, 2019) The aim of this study is to answer four key questions: 1. Can we detect realistic variations in grassland vegetation management over national domains at field scale by assimilating EO information on LAI? 2. What is the C balance of managed grasslands and how does it vary across GB?
3. Which factors control the predicted C balance and biomass removals? The novelty of this research is to combine EO data and modelling to infer management of grasslands at field-scale across a nation and then to simulate the role of management on grassland C exchanges. The advent of highly-resolved satellite data from Sentinel 2 makes this possible, allowing tracking of ∼weekly changes in LAI at sub-field scales for a national sample of grassland fields. The intermediate complexity model employed means that Bayesian approaches to model calibration can 130 explore the uncertainty of parameters and estimates of C cycling. The key innovation is to combine observations of changes in grassland LAI from space with expected changes in LAI (i.e. grass growth rates) derived from process modelling. The difference between observed and expected change in LAI is used to infer consumption by grazing livestock or removals by grass cutting. The C cycle estimate is then updated based on this estimation. Map plus (LCM), which is updated annually by the Centre of Ecology and Hydrology (CEH) of the UK (www.ceh.ac.uk/crops2015).
The LCM includes geo-referenced polygons of improved grassland fields in GB that are identified as such by using a combi-140 nation of reflectance data. The LCM data are validated against ground observations of land-use type.

DALEC-Grass Model
DALEC-Grass (Fig. 1) is a process-orientated model of intermediate complexity representing the C cycling of grassland ecosystems. DALEC-Grass uses meteorological information to calculate gross primary productivity, autotrophic and heterotrophic respiration, changes in leaf area index (LAI), the C turnover of different plant and soil pools as well as the removal of C via grazing and grass cutting. Photosynthesis is calculated using the Aggregated Canopy Model (ACM) and phenology is calculated using the Growing Season Index (GSI) approach (Williams et al., 1997;Smallman et al., 2017). DALEC-Grass uses a dynamic scheme to allocate C to above and below-ground plant tissues, which is based on the assumption that C allocation to roots increases after sufficient leaf area has been developed such that there are diminishing returns on further canopy expansion (Myrgiotis et al., 2020). The model uses a simple scheme to describe C allocation to soil with two pools considered; a more 150 labile litter pool to which dead plant material and manure-C are added, and a more recalcitrant soil organic carbon pool (SOC), which receives C from the litter pool only. At each time-step the volume of simulated grazed biomass-C is converted to animal respiration CO 2 -C, digestion CH 4 -C and manure-C using generic conversion factors (see Fig. 1) extracted from the relevant literature (Parsons et al., 2009;Zeeman et al., 2010;Worrall and Clay, 2012;Bell et al., 2016;Lee et al., 2017). These generic conversion factors are used because the type, weight and age of animals grazing on individual fields can neither be inferred 155 from EO data nor be reliably estimated from available datasets of livestock spatial distribution (e.g. agricultural census). The conversion factors reflect an averaging of relevant data for beef and dairy cattle and sheep, which are the main types of livestock in the GB. The model's 31 parameters are presented in Table A1 in Appendix A. In this study, DALEC-Grass is implemented at a weekly time-step.  Figure 1. Schematic description of the DALEC-Grass model. DALEC-Grass simulates the dynamics of 5 C pools (C) : leaf, stem, roots, litter and SOC. C is allocated to the 5 C pools via NPP allocation (A) and litter production (L). Vegetation removals (VR) can occur due to grazing or cutting. DALEC-Grass determines whether a vegetation removal is caused by grazing, cutting or neither (see section 2.2.2). When cutting is simulated (VRc>0) cut biomass (Bc) is removed from the ecosystem. When grazing is simulated (VRg>0) 32% of grazed biomass (Bg) is converted to manure, 54% of grazed biomass (Bg) is converted to animal respiration ( a CO2) and 4% of grazed biomass (Bg) is converted to methane ( a CH4). Dotted lines (· · ·) show outward fluxes of C. Solid lines ( ) show internal and inward fluxes of C

160
The CARbon DAta MOdel FraMework (CARDAMOM) is a Bayesian MDF framework that is tailored for use in ecosystem biogeochemistry studies (Bloom et al., 2016). By assimilating observational data CARDAMOM updates the distribution of model parameters following the rules of Bayesian inference. A key aspect of CARDAMOM is the use of ecological and dynamic constraints (EDC), which are conditions applied to the parameter sampling process in order to ensure the mathematical, ecological and biogeochemical sensibility (or "common sense") of the simulated system (Bloom and Williams, 2015). In 165 simple terms, CARDAMOM examines whether the simulated pools and fluxes that result from implementing the model with a sampled parameter vector behave in realistic ways; i.e. do not exceed certain user-defined, widely-accepted and literaturebased limits. The EDCs used in CARDAMOM are presented in Table A2 in Appendix A. A schematic description of how CARDAMOM and DALEC-Grass are connected is provided in Figure A1 in Appendix A.
Bayesian inference is performed in CARDAMOM using the root mean square error (RMSE) between the simulated and 170 the EO-based LAI time-series to calculate and attribute likelihoods to every sampled parameter vector. In this study, the Simulated Annealing (SA) algorithm is used to implement the probabilistic parameter sampling process (Kan et al., 2016).
After testing the SA algorithm to identify the optimal number of repetitions for achieving acceptable chain convergence the number of parameter proposals was set to 5,000,000. The uncertainty around the assimilated (Sentinel-2) LAI data was set to 15% (relative standard deviation) in this study. However, it should be noted that the uncertainty around remote sensing-based 175 LAI data is poorly determined but largely underestimated (Zhao et al., 2020).
A uniform distribution was used for each of the 31 DALEC-Grass parameter priors and the range for each parameter prior is presented in Table A1 (Appendix A). In Myrgiotis et al. (2020) DALEC-Grass parameter priors were refined through implementing the model using known vegetation management (cutting dates, livestock density time-series) and by assimilating field-measured LAI and CO 2 flux data (chamber-based and eddy covariance). In Myrgiotis et al. (2021) these model pri-180 ors have been tested and refined further using EO-based vegetation reduction time-series (as vegetation management-related model drivers) and by assimilating Sentinel-2 based LAI time-series. The limits of the uniform parameter distributions used in the present study are based on the results of Myrgiotis et al. (2021) but 4 parameters were allowed to vary more than these results suggested in order to better consider the variability in management factor across GB grasslands (indicated with * in Table A1 in Appendix A). These parameters are the plant photosynthetic N use efficiency (PNUE), the leaf mass C per area 185 (LCA), and the pre-grazing and pre-cutting biomass.

Earth observation data
EO-based information of vegetation canopy cover was used to infer vegetation-related management operations (i.e. grazing or cutting). This canopy cover time-series was generated for each simulated field from LAI data from the Copernicus Global Land Service (CGLS) EO database. The CGLS LAI data comprise LAI estimates from on Proba-V satellite images that have 190 a spatial resolution of 300m and a temporal resolution of 10 days. Gaps in the CGLS LAI time-series due to cloud coverage are filled using a machine learning model built with time-series for past years and neighbouring pixels (Smets et al., 2018). For each simulated field the corresponding time-series has been converted, from their original 10-day time-step, to a weekly time step using linear interpolation. Thereafter, the reduction between subsequent dates in the time-series was calculated. When the change between week n and n + 1 was positive the reduction value for week n is 0. Hereafter, we refer to this time-series as 195 the "vegetation reduction" time-series.
The EO-based LAI data assimilated in CARDAMOM were calculated from Sentinel-2 images. Atmospherically-corrected images at 20m and 60m resolutions (L2A product) were downloaded from the Amazon Web Services (AWS) Sentinel-2 datapool. The images were processed to remove pixels with cloud and haze and, then, used to calculate LAI (at 20m) using the sen2cor algorithm (Weiss and Baret, 2016). When available, the field-average Sentinel-2-based LAI value that corresponds to 200 the day closest to the first day of every simulated week is added to the weekly observational LAI time-series that are assimilated through the CARDAMOM MDF framework.
Using two independent EO datasets on LAI provides a more robust and continuous estimate of vegetation dynamics. Due to frequent cloud cover field-specific, high-resolution (20m) and continuous weekly LAI time-series are not available from Sentinel-2 images alone. Hence we use a separate EO-based data product (CGLS LAI) in which observational LAI time-series 205 are gap-filled using techniques that are more robust than simple temporal interpolation. In order to control the impacts of the observational uncertainty of the vegetation reduction data (due to their 300m resolution) we : (1) use relevant mechanisms in DALEC-Grass (see section 2.2.2) and (2) adjust the process of sampling grassland fields (see section 2.2.1) accordingly to match the coarser resolution of vegetation reduction data.

Environmental and management data 210
Six meteorological drivers are used in DALEC-Grass to drive variations in biogeochemical process: (1) minimum and maximum temperature ( o C), (2) total short-wave radiation (MJm −2 d −1 ), (3) atmospheric CO 2 concentration (ppm), (4) 21-day average photoperiod (sec), (5) 21-day average minimum T and (6) 21-day average vapour pressure deficit (Pa). Data were obtained from the ERA5 global atmospheric reanalysis database of the European Centre for Medium-Range Weather Forecasts (ECMWF). Values of soil C (gCm −2 at 60cm depth) at 300m resolution were obtained from the the SoilGrids database (Hengl 215 et al., 2017). For every simulated field the mean and standard deviation (SD) of the corresponding SoilGrids pixels are used to define the range of the model's initial SOC pool size parameter.
Agricultural census-based data on the number of sheep and cattle (beef and dairy) were obtained from the EDINA AgCensus database (AgCensus, 2020). They are used in this study to independently evaluate the estimates from the MDF implementation.
The AgCensus data are produced by spatially disaggregating the numbers of cattle and sheep recorded at the level of local 220 administrative units (for each UK country), into a 5km grid of the UK. The most recently available livestock data for each constituent country of the UK refer to different years; 2010 for England, 2015 for Wales and 2017 for Scotland.

Sampling of grassland fields from the UK land cover map
Implementing the MDF algorithm for the thousands of fields that are classified as improved grassland in the LCM database is 225 computationally demanding and time consuming. In addition, the resolution of the CGLS (Proba-V-based) vegetation reduction data is 300 m (9 ha). Taking into account that the average managed grassland field is 5-9 ha in area, we set a minimum limit of 6 ha (and a maximum of 13 ha) when filtering the LCM dataset to obtain the location of fields. Moreover, the number of EO data points available for each field depends on the time of image capturing and the amount of cloud cover at overpass.
As a consequence, the number of dates of available EO data can vary considerably between fields. We set a limit of having at 230 least 30 Sentinel-2 data points (during 2017-2018) for a field to be selected for simulation. The fields that met the conditions were allocated to 25km 2 cells of a 5km grid of GB. One field was randomly selected from each cell, which resulted in a set of 2108 fields (Fig, 2). The CARDAMOM MDF algorithm was implemented for each of the selected fields for 2017 and 2018 by running DALEC-Grass at a weekly-time step while assimilating the corresponding available EO-based LAI data. We refer to the outputs of this implementation as "MDF predictions".

Identifying and calibrating grazing and cutting from EO data
Field-scale vegetation management activities are inferred from EO-based reduction data. At each time step the vegetation reduction information is used to infer whether a field has undergone a grass cutting event or a livestock grazing event or neither. For a vegetation reduction input data point to be simulated as a cutting: (1) the event should occur between April and October, when cutting tends to occur; (2) the simulated aboveground biomass at the time of cutting should be greater than the 240 pre-cutting biomass parameter (P28, see Table A1) indicating that there is enough grass for a cut to be worthwhile; and (3) the resulting yield should be > 80 g C m −2 , another economic test for cutting to be likely. If any of these 3 conditions is not met then a grazing event is indicated and simulated if the simulated aboveground biomass exceeds the pre-grazing biomass parameter (P27, see Table A1). Otherwise, if neither a cut nor a grazing event can be simulated, no vegetation removal is simulated.

245
In addition to this, DALEC-Grass parameters are locally calibrated at every simulated field by assimilating Sentinel-2 based LAI data. This means that the vegetation management-related decisions made by DALEC-Grass during the simulated period are conditioned on observational LAI data also (Fig. 3). Assimilation is performed by implementing DALEC-Grass while sampling from the model's parameter space in order to minimise the error (RMSE) between observational and simulated LAI time-series. 250 Figure 3. Description of how Sentinel-2 LAI observations, (CGLS) vegetation reduction time-series and DALEC-Grass are used to infer and calculate managed vegetation removals (grazing, cutting). The DALEC-Grass biophysical module simulates weekly leaf growth and senescence driven by weather data. The DALEC-Grass management module simulates weekly vegetation removals driven by the vegetationreduction data. The CARDAMOM MDF algorithm calibrates the parameters of DALEC-Grass in order to achieve the smallest possible error (RMSE) between Sentinel-2 LAI and simulated LAI time-series. LAI = C leaf ÷ P15 (see table A1 for details on parameter P15). The DALEC-Grass management module determines whether LAI reduction is due to grazing (Lg) or cutting (Lc) or neither. When a grazing reduction is identified (VRg>0 thus VRc=0) the livestock C-turnover process (described in 2.1.2) is implemented. When a reduction is identified as produced by cutting (VRc>0 thus VRg=0) most leaf biomass is removed; parameters P27, P28, P29, P31 (table A1) play a direct role in cut yield estimation. When neither grazing nor cutting is identified then VRc=0 and VRg=0. Boxes in black show EO-based information. C leaf : leaves C pool, A: C allocation, L: litter production, VRg: vegetation removal due to grazing, VRc: vegetation removal due to cutting, t: time.

Calculation of presented variables and sign convention
The micrometeorological sign convention is used when presenting C balance variables, whereby a positive (+) sign before a NEE and NBE value signifies an addition of C to the atmosphere, and a negative sign (-) signifies a removal of C from the atmosphere. The net change (gCm −2 ) in the size of the soil organic C (SOC) pool is presented and referred to as ∆ SOC .
Positive ∆ SOC values signify an increase in the SOC pool size and negative values signify a decrease in the SOC pool size. We 255 use the difference between total annually grazed and cut biomass (GCD) to quantify the relative impact of these two vegetation removals methods. A negative (-) GCD value signifies that more biomass was removed from the simulated field via cutting than it was via livestock grazing. A positive (+) GCD value signifies that more biomass was removed via livestock grazing than it was via grass cutting.

Assessment and analysis of MDF predictions 270
To assess the effectiveness of the LAI assimilation process we quantify the level of fit between MDF-predicted and EO-based LAI time-series using (1) the % of overlap between the EO-based data points (field mean) and the corresponding MDFpredicted ranges (95% confidence interval); and (2) the root mean square error (RMSE); and (3) the bias between the simulated and observed time-series. To account for the possibility that some of the simulated fields may not be managed grasslands due to changes in management, but classified as such in the LCM data, we remove from the results any fields for which the estimated 275 overlap is < 50% (see results for size of post-MDF dataset).
To answer our first science question, the MDF-predicted weekly grazed biomass is converted into livestock units (LU) per ha following the assumptions that : (1) 1 cattle is 1 LU and one sheep is 0.11 LU; (2) 1 LU weighs 650kg; (3) an animal demands ≈ 2.5% of its weight in the form grass dry matter (DM) when grazing; and (4) 47.5% of DM consists of C (Vertès et al., 2018).
The MDF-predicted and independent census-based LU ha −1 are compared using the correlation coefficient (r) and the RMSE To answer our second science question we present and examine the annual and seasonal C balance and the cumulative annual fluxes of the simulated fields. To assess what controls the predicted C balance of the simulated grasslands (our third science question) we quantify the correlation coefficient between meteorological model drivers, management-related model parameters 285 and MDF predictions of C cycling. In order to provide a more quantitative assessment of the factors that control grassland C dynamics we quantify the relative impact of management and climate on the MDF-predicted NBE. We use the model meteorological drivers and the posterior model parameters related to management and climatic controls for every simulated field to train a random forest (RF) model that estimates NBE. 75% of the data are used to train the RF model and 25% to assess its predictive ability (coefficient of determination). Thereafter, we use the Shapley Additive Explanations (SHAP) method to 290 quantify how much each RF-predictor affects the RF-predicted NBE (Rodríguez-Pérez and Bajorath, 2020). The SHAP method examines the structure of the RF model and provides the weight (SHAP value) that the model gave to each predictor. SHAP values can be seen as the machine learning equivalent of the coefficient of determination (r 2 ). The estimated SHAP values are normalised (0-1) to be comparable to r 2 . We note that RF is used in this study solely to support MDF data analysis and not for predictive purposes.

295
For each simulated field and model output the MDF algorithm produces a mean and 95% confidence interval. To answer our fourth science question, we quantify the predictive uncertainty around an output by calculating its relative confidence range (RCR). RCR is equal to the size of the MDF-predicted 95% confidence interval divided by the corresponding mean, and expressed as %. We present and examine the estimated RCRs to identify the key factors that affect uncertainty.

Assimilation of EO-based LAI data
For 12% of the initial dataset of 2108 simulated fields, our analysis failed to generate a simulated-vs-observed LAI overlap > 50%. These fields were removed from the analysis and the final reported dataset includes 1855 fields. Based on the 1855 fields, three performance metrics indicated that CARDAMOM effectively assimilated the provided EO-based LAI time-series  The analysis suggests that the 1855 simulated fields were managed with varying intensity. The majority of simulated fields were grazed-only (75%) and no cut-only fields were simulated. Grazed biomass exceeded cut biomass in 85% of the fields 320 (GCD > 0) and cut biomass exceeded grazed biomass in the remaining 15% (GCD < 0). The mean MDF-predicted annual yield (grazed and cut biomass) was 6±1.8 tDMha −1 y −1 (5th|25th|75th|95th percentiles : 2.8|4.6|7.3|8.5 tDMha −1 y −1 ). These results reflect biomass utilisation per grassland management intensity in the UK. Rough grazing grasslands (40% of UK grassland area) have an annual yield (total removed biomass) of 3.09±1.56 tDMha −1 y −1 , permanent grasslands (50% of UK grassland area) have an annual yield of 7.41±2.02 tDMha −1 y −1 and temporary grasslands (10% of UK grassland area) have a 325 yield of 9.76±2.03 tDMha −1 y −1 (Qi et al., 2017(Qi et al., , 2018. Most of the MDF-predicted first grass cuts (85%) occurred between the first half of May and the second half of July. For the fields where more than one cut was identified the period between the first and last cut was ≈ 2 months. The MDF-predicted day-of-year of first cut increased northwards with the mean date of first cut in northern England and Scotland being 3-6 weeks later than in the southern half of GB. This spatial pattern is likely the combined effect of differences in the onset and duration of 330 the grass growing season and in related management decisions. Due to the small share of cut-and-grazed fields in the simulated dataset, and in order to make the spatial pattern more visible, we present the average month of first cut on a regional basis ( Fig.   A4 in Appendix A ).

Predicted C balance and dynamics
MDF-based C cycle estimates show that management affected the C balance of the simulated grassland ecosystems signifi-335 cantly. The difference between grazed and cut biomass volume (GCD) is used to present the impact that these two biomass removal methods have on C balance. The mean annual GPP across GB fields was 30% higher (1992±400 gCm −2 y −1 ) for fields where most biomass was removed via grazing (GCD>0) compared to those where cut biomass exceeded grazed biomass (GCD<0) (1518±426 gCm −2 y −1 ) ( Figure 5). Reco was higher for fields dominated by grazing also. The mean NEE across GB was -232±94 gCm −2 y −1 , the relative role of grazing compared to cutting did not affect NEE significantly, and 95% of 340 the simulated fields were net C sinks at the ecosystem scale. When considering the role of cutting and grazing C removals and returns to the ecosystem, the impact of cutting as a biomass removal method becomes important. The NBE of fields dominated by cutting removals (GCD<0) was 38±70 gCm −2 y −1 while fields dominated by grazing removals (GCD>0) had a NBE of -126±95 gCm −2 y −1 . On average, 60% more C was removed (grazed and cut) in mostly-cut (GCD>0) grasslands than in mostly-grazed (GCD<0) grasslands. The flux of C into the SOC pool was, on average, 66% larger in mostly-grazed 345 (GCD<0) than mostly-cut (GCD>0) fields. The annual change in the size of the SOC pool (∆ SOC ) for mostly-cut (GCD<0) grasslands was 116±52 gCm −2 y −1 and 36±40 gCm −2 y −1 for grazing-dominated (GCD>0) fields. The spatial distribution of MDF-predicted GPP, Reco, NEE, NBE, removed biomass and C flux into SOC is presented in the cartograms of Figure A5 in  (Table 1). The GB-average NEE and NBE increased between 2017 and 2018, indicating a reduction in sink strengths (Fig. 5). While only 2% of the simulated grasslands had a NBE>0 in 2017 this share increased 9-fold to 18% in 2018 ( Fig. A6 in Appendix A ). The GB-average total removed biomass in the drought-affected 2018 was 27% higher than in 2017. Reductions in cut yields and increases in grazed biomass underlie this increase in GB-average removed 360 biomass (Fig. 5). In this context, the area-mean grazed biomass during the 3 months of spring 2018, in the southern half of GB, and the 3 months of summer 2018, in the northern half, was higher than the respective seasons in 2017 (results not presented).

Controls on C cycling in UK grasslands
Correlation coefficients (Fig. 7) generated across the 1855 fields show the links between meteorological drivers, key processes (model parameters) and model outputs (C exchanges). There are strong positive correlations between GPP, Reco, C inputs 365 to soil and root:shoot ratio, and these factors are strongly negatively correlated with NBE and NEE. The most productive  * the sum of grazed and cut biomass in a year ** manure is produced by simulated grazing livestock and directly enters the soil litter pool (external manure inputs are not considered) fields (higher GPP) are associated with high inputs of C to soils and are the strongest C sinks (more negative NEE and NBE).
Among modelled processes, the ratio of C allocation to roots relative to stems and leaves (root:shoot ratio) is the most strongly correlated with the net C balance of the simulated fields (Fig. 7). More-frequent and higher-yielding grass cuts reduce root:shoot ratio and, therefore, reduce the flux of C to litter and, subsequently, to the recalcitrant soil C pool (SOC We expanded on the correlations-based analysis of the MDF results by using (1) the MDF-predicted data on NBE, and (2) the corresponding meteorological drivers and (3) model parameters describing climatic and management controls on grass growth, to train a RF model that estimates NBE. The resulting RF model was able to explain 93% of the variance in MDFpredicted NBE (r 2 =0.93) using 5 predictors: VPD, mean T, solar radiation, all posterior DALEC-Grass parameters related to 380 climatic effects on grass growth and all posterior parameters related to grassland management. The weight (normalised SHAP) attributed by the RF model to these 5 predictors suggests that management parameters (aggregated weight for all parameters in the group) were the most important factor for grassland NBE over the simulated period (

Predictive uncertainty
The size of the uncertainty around MDF estimates is quantified using the RCR (relative confidence range) of MDF outputs.
The mean RCR is 42±9% for LAI, 21±10% for GPP, 18±6% for Reco and 26±16% for grazed biomass (Figure 8). MDF 390 predictive uncertainty is therefore a small faction of the mean estimate of these scalar variables. The GB-average RCR for LAI, GPP and grazed biomass prediction increased from 44%, 26% and 27% in fields where cut biomass did not exceed grazed biomass (GCD>0) to 54%, 40% and 52% in fields where cut biomass exceeded grazed biomass (GCD<0). The higher RCR (mean and SD) for LAI and grazed biomass is caused by the spatio-temporal uncertainty in the vegetation reduction time-series. This input-related uncertainty leads to the MDF algorithm being less effective in identifying cutting instances in 395 some simulated fields; i.e. sampled parameter vectors produce varying predictions on the timing and intensity of grass cuts.
The impact of input data uncertainty on RCR is also reflected on the shape of RCR distributions for GPP and grazed biomass (violin plots in Fig. 8). The assimilated EO LAI time-series condition the simulated LAI, which combined with the simulated removals (grazing/cutting), determine the weekly GPP at each simulated field. Reducing the uncertainty (spatial and temporal) around the vegetation reduction data is expected to lead to less variable predictions of cutting timing and intensity and, thus, to 400 lower predictive uncertainty overall.  (Chang et al., 2015b).
The MDF-predicted GB-average pasture dry matter yield (6±1.8 tDMha −1 y −1 ) is within the range for UK permanent 410 pastures (7.41±2.02 tDMha −1 y −1 ) as estimated by Qi et al. 2017 using statistical extrapolation of field-measured data. Due to the field-size limits (6-13ha) used in sampling for fields across GB, the share of less intensively managed grasslands is likely biased high in the simulated fields' dataset (Qi et al., 2018). Cut-only grasslands were under-represented in our analysis, as we expected ≈ 10% of field to be in temporary management for cutting. We believe this is an artefact of the noise in the vegetation reduction time-series, which led to cut-only grasslands failing to pass the 50% overlap limit and being excluded from the final 415 dataset. The inclusion of fields that are 6-13ha in size could have led to an under-representation of cut-only fields but we note that no data exist on the percentage of managed grasslands that are grazed-only, cut-and-grazed and cut-only.

The C balance of managed grasslands
The majority of managed grasslands in GB were net C sinks during 2017 (NEE=-289±76 gCm −2 y −1 ) and 2018 (NEE=-174±74 gCm −2 y −1 ) at the ecosystem level, i.e. based on CO 2 gas exchanges. Numerous flux tower-based studies have con-420 cluded that managed temperate grasslands are, on average, C sinks but NEE estimates vary greatly between ≈ -700 gCm −2 y −1 to almost C-neutrality Gilmanov et al., 2007;Skinner, 2008). The scale of NEE increase between 2017 and 2018 is comparable to past field-based estimates under normal and heat-wave conditions (Klumpp et al., 2011). When considering C fluxes that also included grazing/cutting removals and manure return to the soil (i.e. NBE), the simulated grasslands were net C sinks during 2017 (NBE=-191±82 gCm −2 y −1 ) and close to C neutral in 2018 (NBE=-49±70 gCm −2 y −1 ). These 425 NBE estimates are comparable to those in the literature but the inconsistency in the variables included in NBE calculations makes comparisons less than straightforward Skinner, 2008  Our study shows that biomass removals were key determinants of the C balance of managed grasslands. The role of cutting 435 relative to grazing as a biomass removal method was found to be particularly important. Grasslands in which most biomass was removed via cutting had a lower GPP and Reco as opposed to grasslands in which grazing was the main biomass removal method (Fig. 5). However, when GPP and Reco are summed the resulting NEE did not vary significantly between mostly grazed and mostly cut grasslands. All of the simulated grasslands were grazed and underwent more or less frequent defoliation during the simulated period. When cutting occurs the leaf area of a grassland is reduced close to zero, which represents a 440 diminution of the grassland's photosynthetic capacity. According to DALEC-Grass, in the post-cutting period the simulated grassland allocates almost all of its C to aboveground tissues (stems,leaves) in order to build up the leaf area necessary to increase photosynthetic activity and sustain growth. This causes a smaller root-to-shoot ratio in grazed and cut grasslands compare to grazed-only ones as well as a reduction of root-based C inputs to the litter pool (lower R h ). Grazing that occurs during the post-cut period maintains the volume of leaves at relatively low levels. This leads to a reduced annual GPP for 445 grasslands that are both grazed and cut and also means lower manure-C returns to the soil, which explains the weaker C sinks (higher NBE) of most grazed-and-cut grasslands compared to grazed-only fields.

Factors that control managed grassland C dynamics
Management and climate have a combined effect on C dynamics and disentangling their individual impacts is a challenge (Ammann et al., 2020). Here, we used a correlation matrix of model drivers, parameters and outputs to understand how climate The conclusions that we draw in regards to which factors have more influence on grassland C dynamics are based on two assumptions. Firstly, we assume that the simulated grasslands are well-optimised for the intended use; i.e. to sustain different 460 types of livestock (e.g. dairy and/or beef cattle and/or sheep). This means that each sward is maintained in good condition and that farmers manage their fields optimally based on their long-term experience. Secondly, the fact that a large share of the simulated fields (especially in the southern half of England) experienced continuous weeks of unusually hot and dry weather conditions during one (2018) of the two simulated years is treated as a climate anomaly; i.e. climate in 2018 is not representative of normal climatic controls on C balance. Based on these assumptions, we argue that the simulated vegetation management, as 465 inferred from the observational data, was adapted to the seasonal weather anomaly. Therefore, significant changes in ecosystem C cycling were beyond the control of human management and can be mostly attributed to the seasonal weather anomaly.
Our findings on the role of management are in agreement with findings in a number of relevant studies, notwithstanding differences in methodologies and eco-climatic conditions. Skinner 2008 found that higher biomass removals increase NBE based on C flux measurements in cut-and-grazed temperate grasslands in the USA. Koncz et al. 2017  has a strong influence on C fluxes and balance.
In summary, we conclude that management is a key determinant of the C balance of managed grasslands in GB. We note that climatic anomalies, such as heat waves and droughts, can reduce the relative importance of management as a determinant of 480 grassland C balance. In simple terms, human decisions can adjust grassland sink or source strength and this depends mostly on the soil's existing C stock, the sward's composition and condition and the timing and intensity of livestock grazing and grass cutting. Climate change can change this fine C balance substantially and prolonged heat and drought is one way in which this can occur in regions with temperate maritime climate.

Predictive uncertainty 485
We use the RCR to quantify the uncertainty around the MDF-predicted variables. RCR shows how wide the 95% confidence intervals (i.e. 2×SD, assuming normality) are relative to the mean value. The assimilated LAI data come from processing Sentinel-2 images and have an uncertainty attached to them. This observational uncertainty is not always examined in relevant studies but a relative SD of 15% is considered as representative (Zhao et al., 2020). This means that the average RCR of the assimilated observational LAI data is 30%. Considering that MDF predictions incorporate model parametric uncertainty as 490 well, the mean analysis LAI RCR of 45% is, as expected, larger than, but of similar magnitude to, the observational uncertainty of 30% (Fig. 8).
The estimated predictive uncertainty for LAI, GPP and grazed biomass was noticeably higher for fields that were mostly cut (GCD<0) (Fig. 8). The MDF algorithm does not infer cutting simply by translating large reductions in vegetation as cuts. The MDF algorithm examines each weekly vegetation reduction input to decide whether to simulate it as cutting, grazing or ignore 495 it depending on the simulated amount of foliar biomass at the time. The simulated amount of foliar biomass is constrained through the assimilation of field-specific EO-based LAI time-series. A weekly vegetation reduction will be simulated as a cut when it is reasonable both biophysically and agronomically based on the EDCs (see section 2.1.2). This higher predictive uncertainty when cutting occurs suggests that the best way to obtain more accurate predictions is to improve the spatial and temporal resolution of the vegetation reduction time-series and/or estimates of LAI. Using radar (e.g. Sentinel-1) to derive LAI 500 in spite of cloud cover would be a valuable advance.

Limitations
This study uses a MDF algorithm that depends on EO data and process modelling of C dynamics in grasslands. The Proba-V-based vegetation reduction time-series that are used to drive DALEC-Grass have a resolution (9ha) that is coarse when compared to the average size of grassland fields in GB. These noisy data on vegetation reduction cause increased uncertainty 505 in MDF predictions especially in regards to the timing of cutting events. Moreover, most areas of GB are affected by frequent cloudiness, which means that the number of Sentinel 2-based LAI data points per year and simulated field is limited compared to other parts of the world. However, we ensured 30 images per field over two years in our selection process, and this richness of information at field resolution and for national domains is unprecedented in such an analysis.
DALEC-Grass was developed and tested under UK conditions, showing high skill in predicting C allocation and CO 2 fluxes 510 under variable management and different soil conditions (Myrgiotis et al., 2020). However, DALEC-Grass can only infer effects on grass growth through the processes it simulates, and so can mis-attribute effects arising from missing processes. For instance, the MDF algorithm can adjust a specific plant growth rate parameter (DALEC parameter P10,photosynthetic N use efficiency) between fields based on observed LAI dynamics and weather. Inferred P10 variation among fields might be linked to spatial patterns in soil fertility. But because there is no direct soil moisture constraint on LAI in DALEC-grass to be adjusted, 515 a real spatial soil moisture limitation on LAI might be mis-interpreted as a restriction on P10. So we should be cautious in interpreting process variation and assigning with certainty to a particular forcing. Also, a single P10 estimate is made for each field covering both 2017 and 2018, so the current analysis does not allow field nutrient supply (and therefore P10) to change between years. The strong differences in sink strengths observed in the 2017 and 2018 analyses are informative. The flux differences cannot arise from parameter differences between years, as these parameters are constant. Instead differences must 520 arise from process changes (e.g. GPP) resulting from changes to the forcing (VPD, Fig. 2) and changes to the assimilated data (LAI) between years. The larger LAI uncertainty in southern GB (Fig. 8 may be related to soil moisture impacts on grass growth that we fail to identify with the current model structure. Finally, DALEC-Grass has been validated against data from grasslands dominated (>90%) by perennial ryegrass (Lolium perenne) and its ability to simulate swards with larger shares of herbs and forbs has not been tested.

525
Overall, the ability to use field-specific observed information on key aspects of grassland vegetation and to infer its management (vegetation grazing and cutting) are the key advancements presented in this study. In this context, the majority of grassland-focused model-based estimates at large scales typically rely on uncertain information on grazing and cutting. Also, with few recent exceptions most relevant studies do not include field-specific validation of model predictions, which results in highly uncertain estimates. Despite that, the calculation of lateral flows of C remains an outstanding challenge as it depends on 530 information that cannot be inferred from EO-based time-series. The size, type and age of livestock significantly affect livestock turnover of grazed biomass-C. While we cannot infer this information from EO data, our probabilistic MDF framework allow us to attribute uncertainty to livestock C turnover and, thus, to quantify their impact on predictions. This attribution can be done by treating grazed biomass-C conversion factors ( Fig. 1) as model parameters. It is, currently, not possible to detect manure spreading events from EO data. Therefore, we do not consider external manure-C addition to the soil.

Future work
Our overarching aim is to produce a computational ecosystem science framework that is (1) able to utilise the swathes of EO data that are increasingly becoming available while (2) being easy to adapt and incorporate new knowledge gained from field/lab experiments and ground observations. This study showed that the MDF algorithm will benefit most from improving the temporal resolution and quality of EO LAI data used. We believe that by advancing on this front the algorithm will be able 540 to produce more accurate estimates across European grasslands and other with similar agro-climatic conditions. Introducing soil moisture and N cycling-related processes to DALEC-Grass will pave the way for more detailed consideration of the effects of fertiliser use and different grass mixtures, and for its application at climatically-critical rangelands and pastures across the world (e.g. tropical and dry regions). DALEC-Grass has a structure that facilitates the incorporation of modelling advances made with other DALEC-based models such those presented in Revill et al. 2021 for foliar N and Smallman and Williams 545 the credibility for MDF estimates.

Conclusions
This study presented how, by fusing EO data and biogeochemical modelling of managed grassland C dynamics at field resolu-550 tion across a national domain, a MDF framework can detect biomass removals and use this information to predict grassland C fluxes and balance probabilistically. In addition, the study showed how field-specific model predictions of grassland vegetation can be validated against field-specific EO-based LAI time-series. We argue that both of these uses of EO data in model-based studies represent key advancements that increase the credibility of field-scale estimates of C dynamics in managed grasslands.
Our results show that MDF-predicted annual yields and livestock density mirror ground based information well. In agreement 555 with a range of studies on temperate grasslands in Europe and beyond, our study reaffirms the C sink potential of managed grasslands in GB. In contrast to previous measurements and model-based studies, however, we showed how MDF can quantify and interpret C dynamics across a large domain (GB) while also resolving sub-field scale variability in vegetation management.
This granularity is vital as our results show how management differences between fields have strong effects on net C balance. It is widely accepted that climate change is manifesting itself, among other ways, as more frequent droughts in northern Europe 560 (Peters et al., 2020). Our study showed how the most prolonged drought (2018) that has been recorded since 2000 affected the C balance of managed grasslands. It highlights that the ability of temperate maritime grasslands to sequester C could be significantly affected by prolonged heat waves and drought. Various climatic and management-related factors affect both the annual C balance and the seasonal grassland biomass utilisation in livestock farming in GB; and northwest Europe in general.
National targets for C neutrality in the agricultural sector and climate change create a challenging future for UK grassland 565 farming. MDF represents a robust method for monitoring the biophysical state and C dynamics of any grassland field, over any domain in near-real time. We argue that farmers and governments alike can use MDF approaches like this to provide needed monitoring tools for C balance, and guidance on adaptation and mitigation of climate change effects on agriculture towards meeting net-zero goals.
Appendix A 570 The turnover rate of the soil organic matter pool cannot be faster than that of the litter pool 2 Initial SOC pool cannot be < the sum of all other pools (litter, roots, aboveground) 3 The soil organic matter pool cannot lose or gain > 5% of its C in a simulated year 4 Annual GPP and ecosystem respiration cannot be <800 g C m −2 or >2800 g C m −2 (Xia et al., 2015;Gilmanov et al., 2007) 5 Weekly mean GPP cannot be >25 g C m −2 (Xia et al., 2015;Gilmanov et al., 2007) 6 Cutting yield cannot be <80 g C m −2 or >385 g C m −2 (Qi et al., 2017) 7 No more than 4 cuts can occur each simulated year (Qi et al., 2017) Figure A1. Schematic description of data sources and data flow in the model-data fusion process. The DALEC-Grass model is driven by weekly weather and vegetation reduction data (see sections 2.1.5 and 2.1.4 ). The initial size of the soil organic C (SOC) pool for each simulated field is obtained from the SoilGrids database (Hengl et al., 2017) and used as a DALEC-Grass parameter (with uncertainty attributed to it). DALEC-Grass produces outputs on weekly C pools, fluxes and removals (see Fig. 1). It also produces weekly time-series of LAI. Observational time-series on LAI are assimilated by reducing the RMSE between observational and simulated LAI time-series (see section 2.1.4). The assimilation is performed in CARDAMOM by using the simulated annealing (SA) method/algorithm (see section 2.1.3).     AgCensus, E.: EDINA AgCensus 2020, http://agcensus.edina.ac.uk, 2020.