Three decades of simulated global terrestrial carbon ﬂuxes from a data assimilation system confronted with different periods of observations

. During the last decade, carbon cycle data assimilation systems (CCDAS) have focused on improving the simulation of seasonal and mean global carbon ﬂuxes over a few years by simultaneous assimilation of multiple data streams. However, the ability of a CCDAS to predict longer-term trends and variability of the global carbon cycle and the constraint provided by the observations have not yet been assessed. Here, we evaluate two near-decade-long assimilation experiments of the Max Planck Institute – Carbon Cycle Data Assimilation System (MPI-CCDAS v1) using spaceborne estimates of the fraction of absorbed photosynthetic active radiation (FAPAR) and atmospheric CO 2 concentrations from the global network of ﬂask measurement sites from either 1982 to 1990 or 1990 to 2000. We contrast these simulations with independent observations from the period 1982–2010, as well as a third MPI-CCDAS assimilation run using data from the full 1982–2010 period, and an atmospheric inversion covering the same data and time. With 30 years of data, MPI-CCDAS


Introduction
The observed contemporary increase in atmospheric CO 2 is driven by anthropogenic emissions from fossil fuels, land use change (2007-2016 average: 11.1 ± 0.6 GtC yr −1 ), and the concurrent net carbon uptake of the ocean and land from the atmosphere, which take up approximately 22.4 % and 28 % of the anthropogenic flux, respectively.Despite recent advances in atmospheric observations and ocean and land modeling, there is an imbalance of 5.6 % (0.6 GtC yr −1 ) between the ocean and land sinks, carbon emissions, and changes in the atmospheric CO 2 concentration (Le Quéré et al., 2018).Throughout past decades, notable progress has been made to improve the performance of terrestrial biosphere models, but the simulated global terrestrial carbon fluxes and the net land carbon balance still have the highest uncertainties from all of the components of the global carbon cycle (Friedlingstein et al., 2014;Le Quéré et al., 2018).Quantifying the magnitude and dynamics of the global terrestrial carbon cycle across different temporal scales, and their contribution to the global carbon cycle, is challenging because the substantial heterogeneity and complexity in land ecosystems and challenges in the quantification of contemporary effects and response of these ecosystems to increasing postindustrial CO 2 concentrations (Lienert and Joos, 2018;Stocker et al., 2014;Wang et al., 2017).
One strategy to reduce the mismatch between carbon flux predictions from land surface models and measured atmospheric CO 2 concentrations is through data assimilation (DA) techniques, meaning to "train" the land models by confronting them systematically with observations of carbonrelated variables (Raupach et al., 2005).During DA, process parameters of land surface models are adjusted through numerical minimization techniques to reduce the misfit between model results and actual observations under consideration of the statistical properties of both data sets.While atmospheric transport inversions are a method used to infer the sinks and sources of CO 2 between the atmosphere and land, or ocean, from atmospheric CO 2 measurements (Newsam and Enting, 1988;Peylin et al., 2013;Rayner et al., 1999;Rödenbeck et al., 2003), the application of carbon cycle data assimilation systems (CCDASs) provides additional opportunities.In CCDAS, the process-based carbon cycle mechanisms in land surface models are informed with measurements to support a better estimate of the terrestrial carbon cycle, and improve the capacity to project its dynamics.With this purpose, several CCDASs have been developed in the past (e.g., Kaminski et al., 2012Kaminski et al., , 2013;;Lienert and Joos, 2018;Peylin et al., 2016;Scholze et al., 2016).The difference among some of these models is the variational or sequential statistical approach they follow during the data assimilation process (Montzka et al., 2012).A common characteristic in these models is their capacity for integrating long-term and time-consistent globally available observational records related to the carbon cycle such as atmospheric CO 2 measurements from flask and in situ networks (Conway et al., 1994), as well as remote-sensing products of canopy phenology properties such as MODIS NDVI (Moderate Resolution Imaging Spectroradiometer -Normalized Difference Vegetation Index) (Rouse et al., 1974) and FA-PAR (Disney et al., 2016;Pinty et al., 2011a).
Previous studies have analyzed the prognostic capability of the data assimilation systems (e.g., Rayner et al., 2011Rayner et al., , 2005;;Scholze et al., 2007;Schürmann et al., 2016), but only for a few years of prognosis after the assimilation.Scholze et al. (2007) concluded that the CCDAS built around BETHY (Biosphere Energy Transfer Hydrology) is capable of providing a prognostic period of 4 years (2000)(2001)(2002)(2003) of atmospheric CO 2 after data assimilation of 21 years (1979 to 1999) of CO 2 concentrations.Schürmann et al. (2016) discussed the prognosis capability of the Max Planck Institute -Carbon Cycle Data Assimilation System (MPI-CCDAS v1) for 2 years after a short assimilation period of 5 years.Rayner et al. (2011) showed that the uncertainty related to model parameters during the prediction of CO 2 fluxes with a CCDAS is considerably reduced when the model parameters are constrained with 2 decades of atmospheric measurements; however, these results were obtained with a model that ignores the interacting effects of water, energy, and phenology on the carbon cycle predictions.
The overarching aim of this work is to understand the ability of the MPI-CCDAS v1 to make decadal projections of the land C cycle when the assimilation is confronted with different temporal windows from two observational constraints: FAPAR from remote-sensing data and atmospheric CO 2 concentrations from the global flask measurement network.For this, we present 3 decades of modeled land carbon fluxes with the MPI-CCDAS and investigate the effect of withholding information from recent decades in the projected carbon fluxes and the ability of the model to reproduce the observations during the period of data assimilation.We also analyze trends and seasonal variations in the simulated signals during the periods of the assimilation and compare to independent results to evaluate the model performance.With these results, we gain insights into the number of observations (in terms of decadal scale) necessary in data assimilation systems to improve the representation of the global terrestrial carbon cycle components.These results open the possibility of including newly measured data in CCDAS that are only available for periods of less than a decade.

MPI-CCDAS
The MPI-CCDAS was built around the Jena Scheme Biosphere-Atmosphere Coupling in Hamburg (JSBACH) land surface model (Dalmonech and Zaehle, 2013;Raddatz et al., 2007;Reick et al., 2013) and follows a variational approach that simultaneously reduces the model-data misfit for multiple independent carbon cycle data sets (Kaminski et al., 2013).Since its first development based on the BETHY -CCDAS, the MPI-CCDAS has undergone several code modifications and improvements, as well as tests of the assimilation of new observational data sets (e.g., Kaminski et al., 2012Kaminski et al., , 2013;;Rayner et al., 2005;Scholze et al., 2016;Schürmann et al., 2016), with the aim of further improving the representation of land carbon fluxes.The history of the MPI-CCDAS and other current CCDASs is extensively discussed in Scholze et al. (2017).
The code of the MPI-CCDAS version in this work is identical to the one used in Schürmann et al. (2016).The model calculates the half-hourly storage and surface fluxes of energy, water, and carbon in terrestrial ecosystems at coarse spatial resolution (8 • × 10 • grid) (Fig. 1).This horizontal resolution allows computational feasibility and a realistic computational cost for the set of experiments presented in this work.Furthermore, previous evidence has shown that a higher spatial resolution in global vegetation models does not exert a considerable influence in the simulated carbon fluxes at global or regional scales when compared to results obtained with a coarse grid (Müller and Lucht, 2007).The lack of influence to improve the simulated global C fluxes due to changes in the model spatial resolution might also apply to CCDAS (Peylin et al., 2016).
The spatial distribution of the different plant functional types (PFTs) in JSBACH is shown in Fig. S1 (Supplement).The selected parameters for the assimilation procedure and their prior and range of values were based on Schürmann et al. (2016), where an extensive sensitivity study led to retaining those parameters with a substantial effect on the simulated carbon and water fluxes, as well as on phenology.The majority of the selected parameters for the optimization are linked to phenology, but there are also parameters related to photosynthesis and global parameters that control the land carbon turnover during the assimilation.The final list of parameters together with their initial value obtained from an independent forward simulation of JSBACH 3.0 (see Sect. 2.3.1) are shown in Table 1.
The MPI-CCDAS starts with an initial guess for the model control vector (p pr ) of carbon cycle properties, model states, and their Gaussian uncertainty ("prior") with covariance C pr .The model control vector p is iteratively updated to minimize a joint cost function J (Eq. 1) describing the misfit between observational data streams (d; FAPAR and atmospheric CO 2 , both with covariance C d ) and the corresponding simulated observation operators of the MPI-CCDAS M(p), taking into account the uncertainties in the observational data assuming a Gaussian distribution and the information from the prior.
During the optimization procedure, a new model trajectory is determined in each iteration (i.e., in every cycle when the model recalculates the cost function for the difference between the model parameters and the observational constraint), such that energy and mass are conserved through the entire assimilation window (Kaminski and Mathieu, 2017).The gradient of the cost function with respect to the model control vector ( ∂J ∂p ) is evaluated with a tangent-linear version of JSBACH 3.0, which was generated through automatic differentiation using a TAF (transformation of algorithms in Fortran) compiler tool (Giering and Kaminski, 1998).With this tangent-linear version of the model code, the derivatives for the parts of the model code where J (p) is evaluated (i.e., code parts that depend on the control variables) are accurately calculated following the chain rule of calculus.Thus, the mathematical formulation of the code involved in the cost function must be differentiable.Since this was not the case for the phenological code of JSBACH 3.0, the phenology   scheme was updated following Knorr et al. (2010), where the minimum and maximum calculations in the entire code were replaced by smoothing functions to avoid abrupt transitions (Schürmann et al., 2016).

FAPAR
The fraction of the radiation that is absorbed by plants during photosynthesis (FAPAR) is a component of the land surface radiation budget that dynamically indicates the status of the vegetation canopy over space and time (Gobron et al., 2006).In a previous study, MPI-CCDAS was constrained by MODIS TIP (Two-Stream Inversion Package) FAPAR (hereafter TIP-FAPAR) generated from the inversion of a 1-D radiation transfer model (Pinty et al., 2006(Pinty et al., , 2007) ) using the MODIS broadband visible and near-infrared spectral white sky surface albedo as input (Clerici et al., 2010;Pinty et al., 2011a, b).For this study, the TIP-FAPAR product was available only from 2003 to 2011, making it unsuitable for the indented longer assimilation period.While there are longterm remotely sensed proxies of FAPAR, such as the NDVI (Rouse et al., 1974), it has been found previously that NDVI was less reliable than TIP-FAPAR in terms of the seasonal cycle amplitude of vegetation seasonality (Dalmonech and Zaehle, 2013;Dalmonech et al., 2015).Therefore, we used as FAPAR proxy the Global Inventory Monitoring and Modeling System (GIMMS) NDVI product for the period 1982 to 2006 (Tucker et al., 2005), and merged it with the TIP-FAPAR product to provide a longer record of vegetation greenness.The maximum and minimum NDVI values were rescaled at the pixel level to coincide with those from the TIP-FAPAR for the overlapping periods (i.e., 2003 to 2006) following where x is the period from 2003 to 2006 for each data set, and NDVI is the full NDVI product from 1982 to 2006, with minimum values given by NDVI min and maximum values by NDVI max .TIP min and TIP max are the corresponding minimum and maximum values from the TIP-FAPAR product.With this approach, the resulting merged product maintains the maximum and minimum values from TIP-FAPAR while preserving the temporal dynamics of NDVI.The median un-certainty of the available TIP-FAPAR data was considered to be the uncertainty for the entire time series.Due to a technical failure in the MPI-CCDAS, the final FAPAR mod product used in the assimilation procedure only spans from 1982 to 2006 and the last 4 years from the TIP-FAPAR product were not considered.For this study, this product was aggregated to match the model grid horizontal resolution considering background snow-free and snow-covered conditions separately (Schürmann et al., 2016).
To discard pixels in the global FAPAR data that might lead to bias during the assimilation procedure, we applied a mask to the global FAPAR grid following three criteria.(1) We masked out the grid cells with crop-dominating phenology of > 20 % since no explicit crop phenology is described in JSBACH.This step has consequences in areas where other relevant functional types are also present in the same grid cells, such as deciduous broadleaves that are also abundant in the USA and Europe.As a result, the parameters related to deciduous broadleaves are constrained from other locations; (2) we further masked out pixels that hold a low correlation (R 2 < 0.2) when the prior model results are compared to the observations, as we had previously found that the MPI-CCDAS is incapable of correcting such poor model behaviors (Schürmann et al., 2016).Finally, (3) we masked out pixels located in areas where phenology abundance is low, i.e., deserts, because they would influence the optimization causing significant bias due to global compensating effects.The final FAPAR product used during the assimilation contains 40 % of the original number of pixels after the applied mask, resulting in more pixels distributed in the Northern Hemisphere compared to the southern areas.These observational data will be referred to hereafter as FAPAR obs (see Fig. 1 for the global distribution of mean FAPAR obs from 1982 to 2006).

Atmospheric CO 2 concentrations and observation operator
Measurements of atmospheric CO 2 mixing ratios were taken from the flask data continuous record of 28 sites in the NOAA CMDL station network (Conway et al., 1994;Rödenbeck et al., 2003).The selection criteria included the length of the record (on average 19 years) (Fig. A1 in Appendix A) and focused on remote and ocean stations with low impact of local carbon sources and sinks of carbon (Schürmann et al., 2016) (see the location of CO 2 stations in Fig. 1).In the MPI-CCDAS, the atmospheric transport of CO 2 is calculated by integrating the simulated half-hourly net CO 2 fluxes to monthly values followed by the transport calculation with the Jacobian representation of the atmospheric transport model TM3 that is driven with meteorology fields from NCEP (National Centers for Environmental Prediction) reanalysis (Heimann and Körner, 2003;Rödenbeck et al., 2003).During the generation of the monthly transport matrices, the precise sampling time of flask measurements as well as the 3-hourly atmospheric transport was considered to minimize the representation error due to short-term fluctuations in atmospheric transport and to minimize the impact of synoptic atmospheric transport variability on the simulated seasonal and long-term dynamics of atmospheric CO 2 at the monitoring stations.Through this approach, the nonlinear effect of weather anomalies on the surface fluxes were also taken into account.TM3 runs at horizontal "fine grid" (fg) resolution of 4 • × 5 • .Due to computational demands, it is not possible at this stage to use the MPI-CCDAS at the same fine grid resolution as in the TM3.The treatment of uncertainties is performed in the same way as in the TM3 atmospheric inversion (Rödenbeck et al., 2003) but imposing a floor value of 1 ppm on the uncertainties (Rayner et al., 2005) to allow a range for the comparison to the observational operator.
We also compare the fluxes from the assimilation to fluxes obtained from an atmospheric transport inversion (referred to as INV).Similar to the MPI-CCDAS, the atmospheric transport inversion is constrained by atmospheric CO 2 data linked to surface fluxes through a tracer transport model, but the land surface CO 2 fluxes are adjusted directly rather than through changes in the parameters of a land surface process model.The inversion setup used in this study is similar to Jena CarboScope v4.1 (Rödenbeck, 2005;Rödenbeck et al., 2003), involving the same TM3 model as in the MPI-CCDAS.To make the inversion results as comparable as possible to those from the MPI-CCDAS, we used in the inversion the same prior fluxes from fossil fuel emissions and ocean (Sect.2.2.3), as well as the same CO 2 stations.This comparison also helps to gauge the impact of non-land surface fluxes on the ability to reproduce the observations.

Background carbon fluxes
To account for the total carbon balance during the comparison between the land fluxes from MPI-CCDAS and atmospheric concentrations, it is necessary to include background carbon fluxes (i.e., from fossil fuel emissions, use and change of land cover, and the ocean).
Land use and land cover change (LULCC).The LULCC fluxes were obtained from a transient simulation carried out with the JSBACH 3.0 forced with prescribed annual maps of modified cover fractions (Hurtt et al., 2006).These fluxes do not consider disturbances such as fluxes from fires.
Fossil fuel (FF) emissions.The FF emissions used for this work are the result of a merged product from various data sets to complete a long record of emissions, i.e., 1980 to 2012.This product was prepared for the GEOCAR-BON project (http://www.geocarbon.net,Version 3, last access: June 2014) by Philippe Peylin after merging and harmonizing various data sets: (1) for the period 1980 to 1989, the CDIAC (Carbon Dioxide Information Analysis Center; http://cdiac.ess-dive.lbl.gov/, last access: May 2013) product prepared for the CMIP5 exercise (Andres et al., 2013(Andres et al., , 2011(Andres et al., , 1996)); (2) for the period 1990 to 2009, the IER-EDGAR (Institute of Energy and Rational use of Energy, Stuttgart, Germany -Emission Database for Global Atmospheric Research; http://www.carbones.eu/wcmqs/project/,last access: May 2013) product where the FF emissions are constructed using the EDGAR v4.2 data set (http://edgar.jrc.ec.europa.eu/overview.php?v=42, last access: May 2013) and completed with profiles for different countries, emission sectors, and time zones available for different temporal resolutions; and (3) for the period 2010 to 2012, the CarbonTracker product derived at the NOAA Climate Monitoring and Diagnostics Laboratory (CMDL; https://www.esrl.noaa.gov/gmd/ccgg/carbontracker/, last access: May 2013).

Spin-up and preparation of initial files
The MPI-CCDAS was forced with meteorology from CRU-NCEP (the Climate Research Unit from the University of East Anglia, analysis of the NCEP reanalysis atmospheric forcing) version 6.1, available at daily resolution from 1901 to 2014 and a spatial resolution of 0.5 • (Viovy and Ciais, 2015).The atmospheric forcing fields (i.e., wind speed, air temperature, precipitation, downward short-and longwave radiation, and specific humidity) were remapped to the coarse (8 • × 10 • ) model grid.Prescribed annual means (one annual global mean value) of atmospheric CO 2 were also included as part of the forcing fields for the model (https: //www.esrl.noaa.gov/gmd/ccgg/trends/global.html, last access: July 2015).
Before the assimilation experiments, the JSBACH 3.0 model was spun up to equilibrium of the vegetation and soil carbon pools with 1901 atmospheric CO 2 , land cover, and 1901-1910 climate.The spin-up procedure was performed for a model period of 1000 years with repeated cycles of atmospheric forcing data.After this period, a transient model simulation was also performed with JSBACH 3.0 for the period 1901 to 2012.This transient simulation included a change in atmospheric CO 2 , climate, and land cover.The purposes of this simulation were (i) to obtain the initial conditions for the CCDAS experiments and (ii) to derive spatially resolved land use emissions from a JSBACH 3.0 simulation as additional forcing (see Sect. 2.2.2).Due to technical limitations, the cover fraction of each PFT is kept constant in MPI-CCDAS during data assimilation, and thus remained fixed through the simulation period to account for the imprint of the space-time dynamics of land use change emissions on atmospheric CO 2 concentrations.After the spin-up procedure, an initial global scaling factor was set for the slowly varying carbon pool (f slow , also selected as optimization parameter) to account for non-steady-state conditions at the beginning of the assimilation (Carvalhais et al., 2008;Schürmann et al., 2016).

Assimilation experiments
During the assimilation procedure, the model was forced with the same daily reanalysis atmospheric data used during the model spin up.In this study we present the results of three long-term experiments using the MPI-CCDAS, which differ in the timeframe of the observational records used during the assimilation: (1) ALL covers data in 1980-2010 and includes the complete available timeframe of the two observational data sets, i.e., for FAPAR from 1982 to 2006 and for the atmospheric CO 2 concentrations from 1982 to 2010; (2) DEC1 covers observations from the two data sets available from 1982 to 1990; and (3) DEC2 covers measurements available from the two data sets from 1990 to 2000 (Fig. A2).Because of the different lengths of the CO 2 records for some stations, this ultimately leads to a different number of observations used for each experiment (Fig. A1).
The simulation period in the three assimilation experiments is from 1970 to 2010.The first 10 years (1970 to 1979) of the results are discarded because during this period the phenology, vegetation productivity, and the fast land C pools adjust to the new model control vector p.Through this adjustment any imprint of the initial conditions on the calculation of the cost function is avoided.The soil C pool at the beginning of the experiment was included in the model control vector and only results from 1980 are reported below.The results of the assimilation for the periods of time that fall within the observational temporal window are considered for model diagnostic, whereas the periods that fall outside the assimilation window on each experiment are periods of model prognosis, i.e., the prognosis period in DEC1 is from 1991 to 2010, and in DEC2 for 2001 to 2010.

Results
We first evaluate the long-term trends, seasonal and spatial variability of FAPAR, and carbon fluxes from the different assimilation experiments (Sect.3.1 to 3.3), and based on these analyze the prognostic ability of the MPI-CCDAS (Sect.3.4).To facilitate the analysis in some of our results, the global land is divided into eight regions: boreal west and east (BW and BE, for latitudes north of 60 • N), temperate northwest and northeast (TNW and TNE, between latitudes 20 and 60 • N); tropical west and east (TW and TE, between latitudes 20 • N and 20 • S); temperate southwest and southeast (TSW and TSE, for latitudes south of 20 • S) (Fig. 1).

Phenology
In all assimilation experiments, the RMSE and the bias between the modeled and observed FAPAR for 1982 to 2006 is reduced compared to the prior (Table 2).One important cause for this improvement is the change in the spatial distribution of the yearly maximum leaf area index (LAI) due to the optimization of the PFT-specific maximum LAI ( max ) parameter (Fig. S2) (see also Sect.A1 and Fig. A3 in the Appendix for more specific results of parameter changes due to the assimilation).The improvement occurs in all regions (Fig. 2), despite notable differences between the different assimilation experiments.In the decadal experiments DEC1 and DEC2, the largest error reduction compared to the prior is 19 % for boreal regions, while in the temperate areas this reduction is about 16 %.In the ALL experiment, larger reductions of 21 % on average are obtained in the tropical regions TE and TW.
One important factor in the error reduction is a substantial increase in the linear global correlation (R 2 ) in FAPAR during spring and autumn in experiments DEC1 (0.42 and 0.48, respectively) and DEC2 (0.48 and 0.47, respectively) with respect to the prior (0.31 and 0.33, respectively), with changes mostly taking place in the Northern Hemisphere (Fig. S3).An analysis for representative pixels (Fig. 1) shows that the assimilation procedure results in a better representation of the timing and amplitude of the mean seasonal cycle, particularly in the temperate and boreal zones of the Northern Hemisphere (Fig. S4).As a result, the average global R 2 between modeled and observed FAPAR increased with respect to the prior experiment from 0.17 in the prior to 0.20 for ALL and 0.34 for both DEC1 and DEC2 (Table 2, Fig. S3).Further details on the pixel-level analysis are presented in Sect.A2 of the Appendix.
The observed FAPAR signal exhibits positive long trends, indicating a greening trend of vegetation for most of the regions, with the exception of the TSW region, where the longterm trend indicates a decrease in FAPAR (i.e., browning).In most of the regions, the assimilation results agree on a positive long-term trend as in the observations; the magnitude of this trend is in disagreement with the observations (Fig. 3).Particularly in the BE region, the prior experiment overestimates the FAPAR obs trend by almost double.After the assimilation, the simulated FAPAR trend is reduced, leading instead to a slight underestimation of the growth rate in all of the posterior experiments.In the TWS region, the assimilation improved the long-term trend from a positive to a negative growth rate in the three posterior experiments.The most substantial disagreement between FAPAR obs and FAPAR mod occurs in the TW region, where the observations show a positive trend in FAPAR during the period of analysis, whereas this is not captured in the prior and all the posterior experiments.Despite these trend adjustments, the model-data error (based on the 4-year mean differences to the observations at regional scale) remains more or less constant across the 30year period (Fig. 4).
The observed FAPAR signal also contains a small amount of interannual variability (Fig. S5).Compared to observations, the simulated IAV of FAPAR (obtained from the monthly signal for each experiment) is improved only in the predominantly temperature-controlled boreal regions, whereas in temperate and tropical areas with a larger contribution of moisture-controlled phenology, the assimilation does not improve the variability (Fig. S5).

Atmospheric CO 2
To diagnose the performance of the MPI-CCDAS with respect to the atmospheric mole fractions of CO 2 , we compare the observed and simulated values, in terms of the mean seasonal cycle, IAV, and monthly growth rate, at three stations: (1) Alert (ALT) in the Northern Hemisphere, (2) Mauna Loa (MLO) in the tropics, and (3) South Pole (SPO) in the Southern Hemisphere.Results of this comparison are shown in Fig. 5.For MLO and ALT, the timing of the seasonal cycle is already well reproduced in the prior simulation, and the assimilation corrects errors in the amplitude of the seasonal cycle and the long-term trend.At SPO, there are also large relative differences between the model results and the observations, however, of a much smaller magnitude than for the two other stations.After the assimilation, the seasonal phase of CO 2 is shifted by approximately a month to better match the pattern in the measurements in the three experiments, and the amplitude of the seasonal cycle is in better agreement with the observations than compared to the prior.
Figure 6 demonstrates that these examples are broadly representative of the global changes due to the assimilation.Fig-  ure 6a shows a reduction in the CO 2 amplitude for stations of the Northern Hemisphere (> 40 • N) after the assimilation, which is in better agreement with the observations than the prior simulation.The most substantial amplitude reduction occurs at the northernmost station (ALT), where the seasonal amplitude decreases from 23.5 ppm in the prior experiment to 16.5 ppm in the ALL experiment, bringing it closer to the observed amplitude of 14.4 ppm.The latitudinal distribution of the linear correlation coefficient between the observed and simulated mean seasonal cycles is depicted in Fig. 6b, and demonstrates a very good agreement (R 2 > 0.9) in the Northern Hemisphere in all of the experiments (including the prior simulation).In the tropics (specifically between 20 and 40 • N), the misfit of the phasing of the seasonal cycle is improved after the assimilation, as evidenced by an increased linear correlation.However, this is achieved at the expense of a considerable reduction in the amplitude of the seasonal cycle compared to the observations.The results from the atmospheric inversions (INV) show a closer statistical agreement with the observations, as shown in Figs. 5 and 6.
During the nearly 30 years of atmospheric CO 2 data available, the time series of the CO 2 mole fractions in the prior model results strongly underestimate the long-term trend, and start to deviate in the first 5 years of the time series.In all the assimilation experiments, the long-term atmospheric CO 2 trend is in much better agreement with the 30-year trend of the measurements in the entire period of the assimilation (leftmost panels of Figs. 5 and 6c).The mean growth rate calculated from the results of the ALL experiment is in good agreement with the results in the observations (0.15 ppm month −1 in both cases) compared to the prior model (0.087 ppm month −1 ).Despite the moderate improvement, the MPI-CCDAS is incapable of improving the IAV of the atmospheric CO 2 concentration substantially, with the most notable deviations from the observed signals remaining unchanged after the assimilation procedure (Fig. 5).

Global and regional carbon pools and fluxes
We next compare the simulated land carbon cycle in the prior and posterior experiments to independent data.In the posterior experiments, the vegetation C pool decreased between 14 % and 20 % of the value in the prior but remained within the range of the literature estimate (442 ± 146 PgC).The global soil C stock showed significant changes after the assimilation.In all the posterior experiments, the soil C pool decreased by 45 %, 43 %, and 53 % with respect to the value in the prior.Still, the total C in the soil (1362 PgC) in the ALL experiment after the assimilation is in closer agreement with the estimate from the Harmonized World Soil Database (http://webarchive.iiasa.ac.at/Research/LUC/External-World-soil-database/HTML, last access: January 2015) of 1343 PgC (Table 3).As for the total global vegetation C stock, the prior and assimilation are in closer agreement with the lower end of the estimate by Carvalhais et al. (2014) (296 PgC).
The simulated latitudinal GPP values agree well with the data-driven model tree ensemble (MTE) estimate from Jung et al. (2011) for the period from 1982 to 2010 north of 30 • N.However, the assimilation results are low biased in the tropics, which propagated into lower estimates of global GPP in all the posterior results (Fig. 7d and Table 3).After the assimilation, the global GPP and NPP are reduced in the three posterior experiments compared to the prior (118.8 and 54.5 PgC yr −1 , respectively).In contrast to the posterior global mean of GPP, the value in the prior simulation compares favorably well to the global mean from the MTE product (118.9PgC yr −1 ) for the same period of analysis.The global mean GPP is reduced by up to 26 PgC yr −1 on average in the three posterior experiments compared to the prior experiment.Simulation DEC1 experienced the largest reduction in the global photosynthetic C uptake (83.1 PgC yr −1 ) relative to the prior value (Table 3 and spatial distribution of the GPP difference to the prior in Fig. S6d, f, and h).
At the large scale, the variation in the NBE (net biome exchange of CO 2 with the atmosphere, calculated as the net ecosystem exchange (NEE) minus the flux related to land use change) from all of the simulations through the time series is similar to that from the Global Carbon Project 2017 (GCP17; Le Quéré et al., 2018) and INV, with the significant anomalies collocated in time (Figs.7a, A4).Contrary to the prior simulation, the total annual NBE from the three posterior experiments falls within the uncertainty (shaded green area in Fig. A4d calculated as ±1 standard deviation) of the mean NBE from the terrestrial ecosystem models in the GCP17.However, the 1982-2010 mean net biome exchange in all of the assimilation experiments through the time series is on average 1.4 PgC yr −1 lower than the flux in the prior simulation (−2.06 PgC yr −1 ) and 0.8 PgC yr −1 less than the GCP17 value (−1.27±0.97PgC yr −1 ) (Table 3, Figs.A4d and S7 for summary of C balance).
In all MPI-CCDAS simulations, the NEE is reduced relative to the prior in most of the Southern Hemisphere, while it is increased in the Northern Hemisphere (Fig. S6c, e, and g).Temperate northern areas contribute the most to the global net CO 2 uptake.In the boreal east and west regions (BE and BW), the net land C emissions increased in all of the posterior experiments compared to the prior (Fig. S6c, e,  and g) with the most significant increase in BE for DEC2 (−0.29 PgC yr −1 ) relative to the corresponding value in the prior (−0.09PgC yr −1 ).The decrease in GPP in the tropics is depicted in the latitudinal gradient of NBE shown in Fig. 7c and in the spatial distribution of the NEE difference between the prior and the posterior experiments (Fig. S6c, e,  and g).As in the tropics, the NEE in the southern temperate region is consistently reduced after the assimilation experiments, also switching the NEE of the TSE region from a C sink of −0.18 PgC yr −1 in the prior to a mean C source to the atmosphere of 0.016 PgC yr −1 in the DEC2 experiment.For each station the panels show the time series of the mean monthly values, the mean seasonal cycle, the interannual variability, and the monthly growth rate for the entire period of the simulation .
The magnitude of the global NBE and GPP is smaller in the posterior experiments than in the prior.However, the trend in the anomaly of these fluxes (calculated relative to the temporal mean of each time series) is comparable in all the experiments (Fig. 7a and b), suggesting that the response to the environmental conditions is similar through the simulation period after the assimilation as well.This robust response shows in GPP a similar and gradual increasing C uptake (positive trend) during the period of analysis, only with a slightly reduced slope in the prior experiment (Fig. 7b).

Prognostic capability of MPI-CCDAS
Finally, we evaluate the goodness of the model-data fit of the decadal assimilation runs with respect to their long-term carbon cycle simulation relative to (i) that of the prior and (ii) that of the assimilation run using data from the 30-year experiment as a reference case for "best possible" modeldata match given the structural limitations of the MPI-CCDAS to match the observations (as evaluated in Sect.3.1 and 3.2).We calculate the 4-year mean differences between the atmospheric CO 2 mole fraction measurements and the CO 2 model results and also the INV results, for all of the stations (Fig. 8).In the ALL assimilation experiment, the atmospheric CO 2 concentration consistently matches the observations across the entire assimilation period (that also corresponds to the window of assimilation) with a −0.03 ± 1 ppm average bias to the observations (Fig. 8).This is comparable to the trend (Fig. 6c), and 4-year mean differences inferred by the inversions, where the 4-year mean results in the ALL fall within the standard deviation of the 4-year mean of the INV (Fig. 8).This is in striking contrast to the prior experiment, where the 4-year mean of the CO 2 mole fraction at the end of this simulation is 18.8 ppm lower than observed.For the DEC1 experiment, the 4-year mean difference among the measurements and the model results is between −0.3 and 0.3 ppm in the 1980s.This level of model-data agreement remains for the 1990s, where the experiment did not see any observations.After the year 2000, the fit increasingly degrades, with an underestimation of the CO 2 mole fraction by 1.6 ppm for the last 4-year average.However, this is still a 90 % reduction in misfit compared to the prior experiment.
We next quantify the RMSE between the CO 2 measurements and model results for each station for four different periods: 1982-1990, 1990-2000, 2000-2010, and 1982-2010 (Figs. 9 and A2) (Figs. 9 and A2).The large bias of the prior is reflected in the RMSE where the results of this experiment have the most substantial error for all of the stations and periods (between 2.8 and 18.7 ppm) (Fig. 9).For the posterior experiments with a decadal window of assimilation (DEC1 and DEC2), the performance of the assimilation of CO 2 mole fraction also improves substantially across all time periods.Within the period of the assimilation, the difference to the measurements and RMSE is most strongly reduced, and the error increases somewhat outside of the window of assimilation.The model results show that when only the first decade of data is assimilated (DEC1), a more significant deviation to the long-term trend of atmospheric CO 2 occurs between 2000 and 2010 compared to DEC2 and ALL (Fig. 9c).Similarly, a larger bias is also observed in the results from DEC2 where the lowest 4-year mean difference between the observations and the assimilation results takes place in the period of the window of assimilation for this experiment (1990-2000) (Figs. 8 and9b for RMSE).During this period, the model overestimates the CO 2 atmospheric concentration only by 0.15 ppm on average, whereas for the periods out-  side the window of assimilation, the CO 2 concentration is underestimated by 0.64 ppm in the period 1982-1990, and by 1.04 ppm in the period 2000-2010.Thus, in experiment DEC2 the prognostic skill of CCDAS is also reduced outside the window of assimilation, and the long-term trend is less well reproduced than in the ALL experiment.
The analysis of the 4-year mean differences for the period 1982-2006 between FAPAR obs and the results of the prior and assimilation experiments at the regional scale (areas in Fig. 1) reveals, contrary to the CO 2 observations, a near-constant 4-year mean FAPAR difference within the time series and each of the experiments (Fig. 4).In general terms, the decadal experiments are better able to reproduce the mean FAPAR across all regions.The largest difference between posterior results to the observations is in the tropical regions, where the FAPAR 4-year mean difference showed that the observations remained consistently larger than the ALL results by on average 0.042 in TE and 0.095 in TW (Fig. 4).Importantly, however, the trend correction for the boreal and temperate areas (Fig. 3) is similar across the different assimilation experiments, suggesting that important biases of the JSBACH 3.0 model, including the tendency to simulate too strong boreal greening, can be readily reduced with only 10 years of data, as further improvement with the 30-year record is small.

Discussion
The parameter optimization with a simultaneous assimilation of long-term spaceborne FAPAR and atmospheric CO 2 measurements in the MPI-CCDAS, resulted in a considerable reduction in the cost function and norm of the gradient, which can be seen as an overall improvement in the modeled global carbon fluxes with a decrease in the root-meansquared error of the MPI-CCDAS compared to the CO 2 and FAPAR and observations (Figs. 9 and 2).The trajectory of model parameters involved in the optimization differed for each experiment and each phenotype.While some paramewww.biogeosciences.net/16/3009/2019/Biogeosciences, 16, 3009-3032, 2019 ters were consistently retrieved after the assimilation, such as the maximum leaf area of grasses and shrubs and the correction parameter for the initial soil pool size, some final parameter estimates varied considerably between the three experiments, e.g., the tropical maximum leaf area index and some of the parameters controlling the seasonality of the phenology (Fig. A3).These variations lead to regional differences in the simulated compartment flux GPP and ecosystem respiration, which are not well constrained from the observations.Interestingly, these differences result in very similar absolute values in global carbon fluxes and their trends.This demonstrates a certain degree of equifinality in the results and cautions a too stringent interpretation of the MPI-CCDAS outcome in terms of improving understanding about biosphere processes and their long-term trends.

FAPAR
MPI-CCDAS is capable of extracting information about the seasonal cycle and the long-term trends from the FAPAR observations.Using decade-long FAPAR data during the assimilation (DEC1 and DEC2) already leads to notable improvement of the simulated seasonal phenology of the land surface within and outside the window of assimilation, i.e., maintaining these changes during the prognosis periods.This improvement is predominantly the result of the ability in the model to simulate the timing of green-up and brown-down in spring and summer through the optimization of parameters that regulate the onset and end of the growing season (i.e., parameters for temperature and day-length thresholds).The greening effect is especially taking place in the Northern Hemisphere, dominated by the phenotypes deciduous and evergreen needleleaf and extratropical deciduous trees.The long-term greening trend in the vegetation of boreal regions previously observed in spaceborne data (Forkel et al., 2016;Lucht et al., 2002) was captured in the results of MPI-CCDAS before the assimilation, but it was mostly overestimated in northern regions and underestimated in the Southern Hemisphere.After the assimilation experiments, the greening trend was improved primarily in the boreal areas and is in closer agreement to the reported satellite FAPAR data.The modest improvements achieved in the simulated greening trend of temperate areas in the western hemisphere are associated with a decreased performance in the eastern hemisphere, indicating that the model structure of MPI-CCDAS is incapable of reconciling regional differences.This difference could be an indicator of the need to parameterize both hemispheres differently in terms of their phenological response to the underlying driving factors (such as temperature, moisture availability, and day length); also, this could be due to the lack of process to account for the land use or vegetation dynamics in the MPI-CCDAS.Despite these broadscale improvements, the MPI-CCDAS does not reproduce the magnitude of the greening trend and its interannual variability in all the posterior experiments at pixel and regional levels.This is likely a result of the MPI-CCDAS structure, which relies on few globally relevant PFTlevel parameters.Although some of the phenological parameters in CCDAS adapt to local mean growing season temperature, other thresholds are only globally applicable, causing a trend to temperature grasslands that cover a wide climatological range.For example, some of the global parameters such as f aut_leaf and f slow imply that improvements of modeled fluxes in the boreal regions directly affect fluxes in the tropics, inducing parameter changes to compensate for the altered C fluxes.Defining instead such global parameters per PFT would alleviate this issue but will compromise the computational cost and might not necessarily reduce the overall uncertainty.Another technical challenge is the use of a spatially mixed signal at the coarse spatial model resolution (due to the high computational requirements necessary to increase model resolution) to infer PFT-specific parameters.A likely better strategy for constraining PFT-specific parameters would be to resample the highly resolved satellite product to PFT-specific FAPAR classes per pixel before the aggregation into a global grid.This change would allow us to find more spatially refined classes and provide PFT-specific FAPAR maps to the CCDAS to reduce issues in the identification of phenological parameters for different climatic regions.
Except for the tropical latitudes, the difference between the regional IAV of the observations and model output is small compared to seasonal variations.The modeled signal remains within a range of 0.05 (dimensionless) FAPAR obs .The signal and the model-data difference is also smaller than the global mean retrieval error of the FAPAR product, which is ±0.2088 (Schürmann et al., 2016).This error was used to quantify the observational FAPAR uncertainty in the assimilation, thereby reducing the ability of the MPI-CCDAS to detect and correct such smaller variation.Overall, the lacking match of the IAV may therefore be of little overall concern.Nevertheless, the lower than observed IAV in the tropical bands may be indicative of too weak drought response in the maximum leaf area index of the model.Although the assimilation procedure allows changes in the phenology response to water stress (given by parameter τ w ), the assimilation procedure decreased the drought sensitivity of tropical phenology given the entire spatially explicit FAPAR time series, and therefore did not allow capture of the regional drought events that could be in principle linked to changes in LAI.
The technical error during the assimilation procedure to not include the period from 2007 to 2010 in the FAPAR mod product does not however influence the decadal results observed here because the main information gain of the CC-DAS in terms of phenology stems from the seasonal cycle, with little effect on the overall trends between the three assimilation experiments with different time periods.
Bearing in mind the different spatial resolution of methods (i.e., model grids and remote-sensing pixels), a robust comparison between the mean and maximum LAI values before and after the assimilation per region are presented in Table A1 of the Appendix.The results fall within LAI values from MODIS and lidar reported in the literature.Groundbased observations in the tropical Amazon-savanna transition forest between 2005 and 2008 show an annual mean LAI value for the total canopy of 7.4 ± 0.6 m 2 m −2 , and for the seasonally flooded forest the value of 3.4 ± 0.1 m 2 m −2 .For the remote-sensing data from MODIS, the reported values are 6.2 ± 0.2 and 5.8 ± 0.3 m 2 m −2 , respectively (Biudes et al., 2014).In the eastern Amazon forest, the remote-sensingbased LAI has been reported as 6.2 m 2 m −2 from lidar, and 4.8 m 2 m −2 with a low end of 2.0 m 2 m −2 from MODIS (Qu et al., 2011).

Atmospheric CO 2
The considerable improvement of the seasonal amplitude and the long-term trend of atmospheric CO 2 at Northern Hemisphere stations is independent of the different periods of data used for the assimilation.However, the MPI-CCDAS consistently fails to resolve some of the features of the year-to-year variability detected in the measured atmospheric CO 2 stations, which translates into an acceptable, but far from perfect fit to the inferred annual carbon budget of the GCP17 (Le Quéré et al., 2018).We compared the performance to the results from an atmospheric CO 2 inversion (INV) with the same input fields and atmospheric transport model as MPI-CCDAS to illustrate that these deviations do not reflect uncertainties in the representation of the atmospheric transport.It needs to be mentioned that both the choice of the atmospheric transport model (and associated imperfections at resolving the vertical and lateral atmospheric transport of CO 2 ) and the method to aggregate atmospheric observations to obtain an estimate of the annual growth rate in the global carbon budget introduce some error in any forecast of the interannual variability.As a consequence, only the occurrence of more significant model-data mismatches can be interpreted as an actual result of the MPI-CCDAS' inability to correctly resolve the carbon flux variation.
Notably, the model lacks the representation of some key processes that contribute to climate-induced interannual variability of the carbon cycle, such as the possibility to dynamically account for fire disturbance (Lasslop et al., 2014), tropical peatland fires related to El Niño-Southern Oscillation (ENSO) (van der Werf et al., 2008), or the increase in terrestrial carbon uptake after large-scale volcanic eruptions such as for Mt.Pinatubo in 1991 (Lucht et al., 2002;Mercado et al., 2009).Omitting fluxes in the current model configuration due to fire events may impair the ability of the model to infer the atmospheric growth rate of CO 2 associated with El Niño events (Frölicher et al., 2011(Frölicher et al., , 2013)).One way to overcome the IAV mismatch would be to include fire fluxes in the model by prescribing them from the Global Fire Emissions Database (GFED; van der Werf et al., 2010); however the latest version of this data set (version 4.0) is only available for years from 1997, which is a limiting factor for the timeframe of the simulations in this work.However, the contribution of these interannual variations to the overall CO 2 cost function is low in comparison to the signal contained in the seasonal cycle and deviations in the long-term trend, such that the MPI-CCDAS may simply not be sensitive enough to these aggregate system properties like the response of the tropical carbon cycle to El Niño events given the uncertainty in the atmospheric transport and the observational representation error.

Carbon cycle simulation
Independent of the amount of data used in the assimilation window, our results show that the GPP and NEE were consistently reduced globally compared to the prior run, i.e., less carbon uptake by plants leading to the model results being in closer agreement with other independent estimates such as the GCP17.The MPI-CCDAS suggests a somewhat lower average annual atmospheric CO 2 growth rate (calculated by the sum of the net C fluxes from the ocean, land, and fossil fuel emissions) than the one estimated in the GCP17 (Le Quéré et al., 2018), even if the MPI-CCDAS estimate falls within the uncertainty of the GCP17 (Figs. 7 and S7).Most of the difference stems from small differences in the assumed fossil and ocean carbon fluxes.In the case of the carbon fluxes from fossil fuels, the data prescribed in MPI-CCDAS do not contain fluxes due to cement and flaring; thus the magnitude of the annual carbon sources through the time series is consistently lower but still within the ±5 % uncertainty of the GCP17 data (Le Quéré et al., 2018) (Fig. A4).As for the ocean carbon sink, the annual mean values prescribed in MPI-CCDAS are also of lower magnitude than the mean value in the GCP17 but falling in the lower limit of the uncertainty value (Figs.A4c and S7).The flux due to LULCC prescribed in MPI-CCDAS is also of smaller magnitude than that from the GCP17 because the simulation made by JSBACH 3.0 does not consider disturbances like fires and gross transitions, which might have also contributed to the lower land C sink obtained in the assimilation experiments compared to the total land C sink in GCP17 (Fig. A4d).
The MPI-CCDAS GPP matches well the observationbased product MTE-GPP (Jung et al., 2007) in regions with a distinct light-and temperature-driven seasonal cycle (i.e., north of approx.30 • N), translating to a reduction in modeled GPP by 0.7 PgC yr −1 in boreal regions.However, similar to the results in Schürmann et al. (2016) with only 5 years of assimilation, the tropical productivity is strongly reduced by the assimilation to estimates that are substantially lower than independent estimates such as MTE.This feawww.biogeosciences.net/16/3009/2019/Biogeosciences, 16, 3009-3032, 2019 ture is likely the result of a global compensating effect on heterotrophic respiration, and this effect is observed in the drop of the photosynthetic capacity (f photos ) in the tropical evergreen and deciduous PFTs, as well as in the reduction of the maximum tropical LAI in the three assimilation experiments compared to the prior.In addition, another critical factor influencing the global reduction of GPP and the tropical uptake of C appears to be related to the difference in data availability of CO 2 stations between the defined assimilation windows.Specifically, this is evident in the results of the data-poorer experiment DEC1, where the topical GPP is substantially lower than in the independent estimates and in the assimilation experiments that use more stations (DEC2 and ALL).As a result, the mean tropical land C source to the atmosphere in the prior experiment (mean NBE value of 0.12 PgC yr −1 , and minimum value of −0.07 PgC yr −1 , reflecting C uptake in the 4 • S latitudinal band) was increased to 0.37 ± 0.17 PgC yr −1 on average for all the posterior results.
The NPP : GPP ratio in ALL and DEC2 decreased to 0.35 and 0.31, respectively, when compared to the prior value (0.45).This reduction might be mainly because the NPP is not well constrained from the atmospheric record.In JS-BACH 3.0, autotrophic respiration (Ra) is directly coupled to GPP; hence the fraction of GPP partitioned to Ra leads to an increase in the seasonal cycle of the ecosystem respiration.An increase in Ra with respect to the prior (which is only visible in the global average value in DEC2; Table 3) results in a reduced net land carbon uptake, masking the smaller changes in the vegetation turnover.
The reduction in the soil C pool after the assimilation can be explained due to an unavoidable effect in the model.The MPI-CCDAS was initially spun up until the soil C pools reached equilibrium considering preindustrial forcing; however, this new initial state does not consider climate variability.To compensate for this and to reduce the respiration when the MPI-CCDAS is confronted with contemporary changes in the climate, the model creates an artificial C sink that leads to a reduction in the soil C stocks.It is important to note that the JSBACH 3.0 version used in this MPI-CCDAS does not include permafrost processes; therefore, the global soil C stock might still be underestimated.

Value of long-term data sets to constrain CCDAS
Notwithstanding the MPI-CCDAS conceptual issues, the setup of this study enables us to test by how much the quality of the data-model agreement is reduced after exposing the MPI-CCDAS to shorter observational time series.In terms of FAPAR, there is no apparent degradation of fit with time, despite that in general terms, the trend in the data is best matched with the ALL experiment.This is foremost a consequence of comparatively small trends in the observed FA-PAR, implying that extracting the mean seasonal patterns and amplitude for a few years is essential for simulating current and near-term FAPAR.Issues with model structure and with the assimilation setup prevent a better model-data fit irrespective of the length of the record.This would suggest that a focus of assimilation on high-quality and highly spatially resolved FAPAR should be a priority over the use of long-term data sets.The results are different for the case of projecting atmospheric CO 2 , where a long record of atmospheric CO 2 measurements favorably contributes to a better representation of the long-term values after the assimilation, whereas a shorter window leads to deviations to the observations in the periods outside the assimilation years.The model-data agreement is of approximately ±0.5 ppm during the assimilation period and starts to deviate for the DEC1 experiment later than 10 years after the end of the assimilation window, whereas in the DEC2 experiment, the degradation of the model-data match already starts after approximately 5 years.Still, the average deviation from the observations by using shorter assimilation periods is not far from the upper limit of the uncertainty when using the longest record.Nonetheless, with the caveat that MPI-CCDAS does not fully explain the interannual variability of the land net carbon flux, this suggests a reasonable short-term (for a small number of years) forecasting skill of atmospheric CO 2 .

Conclusion
The MPI-CCDAS is capable of simultaneously integrating two independent observational data sets over three consecutive decades at the global scale to estimate global carbon fluxes.The results demonstrate that assimilating only 1 decade of observations, for two observational data sets (FA-PAR and atmospheric CO 2 concentrations), leads to broadly comparable results and trends in the global carbon cycle components than using the entire time series of available observations (30 years).Currently, the system can confidently predict the carbon fluxes on short timescales (up to 5 years after the end of the window of assimilation), e.g., for atmospheric CO 2 concentrations at the site level, and the mean prediction remains within the uncertainty of the observations.However, long-term forecasts with CCDAS are less certain, as the observational record does not sufficiently constrain the interannual variability of the simulated land carbon fluxes and longer-term changes in the decadal net carbon uptake.Nevertheless, the comparatively small error of 2 ± 1.3 ppm after 15-19 years of prognostic simulation shows the potential for midterm carbon cycle predictions constrained using the CCDAS approach.
The MPI-CCDAS is a computationally expensive system, and the demonstration that large-scale carbon fluxes can be improved by only using a limited period of observations increases the feasibility of using data assimilation systems to constrain the land carbon budget in land surface models.However, we also show that there are considerable variations in the estimated parameters and the regional distribution of the land C uptake, suggesting that further improvements in the land surface model, especially in the current structure and design, must be first solved to improve the model and computational efficiencies of the system.This is recommended before an attempt to include another observational stream or other modifications aiming to test an enhancement on the prognostic skill in the full MPI-CCDAS.

Appendix A A1 Assimilation performance
Figure A3 depicts the final posterior value (X f ) for each optimization parameter I and in each assimilation experiment.The last parameter value is normalized to its corresponding prior value (X p , shown in Table 1), i.e., (X f /X p ) − 1; this is done to make a comparison between parameters on their response and the assimilation because each parameter holds a different range of values.The normalized result is also shown for each phenotype for the phenology and photosynthesisrelated parameters, and also for the initial leaf growth rate (ξ ), CO 2 initial offset, and land carbon turnover parameters that are applied globally.
More significant changes in some phenology parameter values are observed; e.g., the maximum LAI ( max ) decreased in almost all PFTs and in all experiments, except for the phenotypes CE (coniferous evergreen) in the ALL experiment and ETD (temperate broadleaf evergreen and deciduous; mostly dominating in Europe and the eastern USA and Asia).In CD (coniferous-deciduous trees; located in northeast Asia, specifically in the east Siberian taiga) the max value increased notably in the DEC1 and DEC2 experiments (Fig. A3e).
In the tropical forest areas, the reduction of the max was from 3.17 in the prior experiment to 2.27 (33 %) in ALL for the TW area, and from 3.27 in the prior to 2.43 (26 %) in ALL for the TE area.For the other assimilation experiments the average maximum LAI moderately decreased in TW from 3.17 in the prior to 2.89 (8.8 %) in DEC1 and from 3.17 in the prior to 3.00 (5.3 %) in DEC2.
In other extratropical areas results from experiments DEC1 and DEC2 experienced an average increase in max by 5.6 % in BE (from 2.29 in the prior to 2.42), 24 % in BW (from 1.62 in the prior to 2.01), and 3.8 % in TNW (from 3.11 in the prior to 3.23).
As a result, the temperature-and daylight-related parameters were modulated such that the largest decrease with respect to the prior value in the temperature at leaf onset (T ϕ) was also observed for these two PFTs, especially for CD in the DEC1 and DEC2 experiments.Also, the day length at leaf shedding (t c ) and the timescale of leaf senescence (leaf shedding timescale, 1/τ 1 ) primarily increased for CD.As for the PFTs influenced by temperature and water (TeH, TeCr, TrH, and TrCr), the most significant change with respect to the prior value took place in the posterior value for the C 3 crops (TeCr; distributed in Europe, the USA, and East Asia) whose value decreased considerably for the water stress tolerance (τ w ) in experiments DEC1 and DEC2, whereas the value of the timescale of leaf senescence (leaf shedding timescale, 1/τ 1 ) also increased considerably for the same experiments.These changes seemed to be a response of the large decrease in the foliar area max for this PFT, which took place in all three experiments.The value of the photo-  synthesis rate modifier (f photos ) influences the productivity at leaf level.Thus, a lower value of f photos will lead to lower GPP (less carbon uptake and a potential increase in NEE).Our results show that after the assimilation experiments the value of f photos decreased with respect to the prior experiment, mainly for the C 3 grasses and pasture (TeH; distributed mostly in the Northern Hemisphere) as well as for the tropical evergreen and deciduous trees (TrBE and TrBD), and this is more noticeable in the DEC1 experiment.
As for the global parameters, significant deviations from the prior value are observed in the parameter that controls the initial size of the slow soil C pool (f slow ) and also in the parameter that defines the initial atmospheric CO 2 mole fraction (CO offset 2 ), which is globally set to be constant.The posterior value of both of these parameters decreased in the three posterior experiments.Variations in f slow induce changes in  the global heterotrophic respiration, controlling in this way the disequilibrium between GPP and the ecosystem respiration.Because JSBACH tends to overestimate the soil C pool, optimizing f slow is a means to improve this estimation; however, the spatial distribution of the carbon pools remains unchanged, and the prior value controls the prior value, meaning that the GPP and ER relation remains similar in the posterior experiments to that in the prior experiment.Since the magnitude of the initial slow carbon pool was set, this might influence the other modeled carbon pools compared to the soil carbon pool, leading to biased soil and vegetation carbon stocks; therefore, the assessment on the predicted pools should be made with care.We compare the resulting global total soil and vegetation carbon pools robustly to independent estimates from the literature or other products, and results are shown in the main text of the Discussion section.

A2 Pixel-level phenology analysis
The FAPAR analysis at the pixel level shows that in pixels P1 (located in eastern Siberia), P2 (located in eastern Brazil), and P6 (located in Canada), the magnitude of the mean seasonal cycle is better represented when compared to the observations (Fig. S4).Also, the timing of the mean seasonal cycle is corrected, e.g., in pixels with large seasonal amplitude such as in P1 and in P6.While in the prior experiment (and ALL experiment) the onset and peak of the growing season in P1 and P6 are delayed by up to 2 months, in the results from experiments DEC1 and DEC2 this delay is reduced to only 1 month.This correction might be partially due to changes in some optimized parameters: increase in the day length at leaf shedding (t c ) and reduction in the temperature at leaf onset T ϕ detected for both the CE and CD, as well as for ETD, TeCr, and TeH phenotypes (Fig. A3c, d, e, and g); this is because these parameters control the onset and end of the vegetation activity.Despite changes in T ϕ and t c after the assimilation in TrH, this temporal shift is less evident in P2.In this pixel, the amplitude of the seasonal cycle is small, and only changes in the magnitude of the amplitude are visible after the assimilation (Fig. S4).In the results of DEC1 and DEC2 for pixel P3 (located in the USA and dominated by TeCr), the water stress tolerance time (τ w ) and T ϕ were primarily reduced, whereas the leaf shedding timescale (1/τ l ; earlier shedding) increased.

Figure 1 .
photos a

Figure 2 .
Figure 2. RMSE for FAPAR from the model results and observations for the period 1982-2006 and for different regions.

Figure 3 .
Figure 3. Mean monthly growth rate of FAPAR for 1982-2006 on each analyzed geographical region for the satellite observations and results of prior and the posterior experiments.

Figure 4 .
Figure 4. Time series of the 4-year mean of the FAPAR anomaly for the satellite data for each model experiment in six selected model pixels.The error bar indicates the ±1 standard deviation of the 4year differences.The first marker (as asterisk) in the time series is the single value for 1982.

Figure 5 .
Figure 5. Statistical analysis of atmospheric CO 2 at three flask measurement sites: Alert (ALT; top panels), Mauna Loa (MLO, center panels), and South Pole (SPO, bottom panels), from the measurements, prior, posterior experiments (ALL, DEC1, and DEC2), and inversion (INV1).For each station the panels show the time series of the mean monthly values, the mean seasonal cycle, the interannual variability, and the monthly growth rate for the entire period of the simulation.

Figure 6 .
Figure 6.(a) Latitudinal distribution of the mean CO 2 seasonal amplitude for the 28 flask measurement stations from the observations, prior, and posterior experiments.(b) Latitudinal distribution of R 2 obtained from the correlation between the observations and each simulation result of the mean atm.CO 2 seasonal cycle and (c) average atmospheric CO 2 monthly growth rate across stations for the observations and model results.The star on each bar is the mean of the atm.CO 2 monthly growth rate, the horizontal middle black line on each box is the median, the red whiskers depict the error as 1 standard deviation, and the grey dots on each box are the actual monthly growth rate values for all the stations in each data set.

Figure 7 .
Figure 7. Time series of the anomaly to the temporal mean of the time series (a, b), and latitudinal gradient (c, d) of the total net ecosystem exchange (NEE including the influence of LULCC) (a, c) and gross primary production (b, d) for the results of each model simulation.NEE from the model is compared to the GCP 2017 and INV data set (a, c).GPP is compared to the MTE data-data driven estimate of Jung et al. (2011) (b, d).

Figure 8 .
Figure 8.Time series of the 4-year mean of the atm.CO 2 anomaly to the observations for each model experiment and inversion results, for all the stations.The y axis is limited to the results in the posterior experiments.The error bar is 1 standard deviation to the 4-year mean of the differences to the observations.The first marker to the left in the time series (as asterisk) is the single value for 1982 not included in the subsequent 4-year means.

Figure 9 .
Figure 9. RMSE for different periods between CO 2 atm.concentrations from measurements and model results for the different assimilation experiments and inversion results for each of the flask measurement stations.

Figure A2 .
Figure A2.Experimental setup for posterior experiments ALL, DEC1, and DEC2 with different temporal windows for the assimilation of FAPAR and molar fractions of atmospheric CO 2 .

Figure A3 .
Figure A3.Final value for each parameter p at the end of the assimilation experiments, normalized to the prior value (p pr ), i.e., (p/p pr ) − 1.This is shown for each model plant functional type (a-h) and globally for the land C turnover parameters (i, j).

Figure A4 .
Figure A4.Time series of the annual mean of the major components of the C cycle used as background fluxes in CCDAS compared to those from the GCP 2017.The atm.CO 2 growth from the model output is the result of the sum of fossil fuel, ocean, and land C fluxes.The blue shadow in the ocean C sink of the GCP 2017 data is the standard deviation of the mean sink from the models that contributed to the GCP.The land C flux is the total NEE with contribution of the flux due to LULCC.The green shadow area is the standard deviation of the mean land C flux from the terrestrial models that contributed to the GCP.

Table 1 .
Model parameters selected for the optimization.Nos. 1 to 6: related to phenology; no.7 to photosynthesis and nos.8 to 11 to land carbon turnover.The values in the table for each PFT (where applies only) are for the prior conditions: p

Table 2 .
Statistical analysis of FAPAR for 1982-2006in all of the experiments, and also for the periods of the window of assimilation only for DEC1 and DEC2.R 2 is obtained from the linear correlation between FAPAR obs and FAPAR mod calculated for the entire period and by seasons.NRMSE is the normalized root-mean-squared error, defined as RMSE divided by the mean (FAPAR obs ).

Table 3 .
Global average of the terrestrial carbon cycle components and carbon stocks in results from the assimilation experiments and prior, and other independent estimates (see table footnote for description).

Table A1 .
Regional mean and maximum leaf area index in prior and posterior experiments.