Evaluating two soil carbon models within a global land surface model using surface and spaceborne observations of atmospheric CO2 mole fractions

The trajectories of soil carbon (C) in the changing climate are of utmost importance, as soil carbon is a substantial carbon storage with a large potential to impact the atmospheric carbon dioxide (CO2) burden. Atmospheric CO2 observations integrate all processes affecting C exchange between the surface and the atmosphere. Therefore they provide a benchmark for carbon cycle models. We evaluated two distinct soil carbon models (CBALANCE and YASSO) that were implemented to a global land surface model (JSBACH) against atmospheric CO2 observations. We transported the biospheric carbon fluxes 5 obtained by JSBACH using the atmospheric transport model TM5 to obtain atmospheric CO2. We then compared these results with surface observations from Global Atmosphere Watch (GAW) stations as well as with column XCO2 retrievals from the GOSAT satellite. The seasonal cycles of atmospheric CO2 estimated by the two different soil models differed. The estimates from the CBALANCE soil model were more in line with the surface observations at low latitudes (0◦ N-45◦ N) with only 1 % bias in the seasonal cycle amplitude (SCA), whereas YASSO was underestimating the SCA in this region by 32 %. YASSO 10 gave more realistic seasonal cycle amplitudes of CO2 at northern boreal sites (north of 45◦ N) with underestimation of 15 % compared to 30 % overestimation by CBALANCE. Generally, the estimates from CBALANCE were more successful in capturing the seasonal patterns and seasonal cycle amplitudes of atmospheric CO2 even though it overestimated soil carbon stocks by 225 % (compared to underestimation of 36 % by YASSO) and its predictions of the global distribution of soil carbon stocks was unrealistic. The reasons for these differences in the results are related to the different environmental drivers 15 and their functional dependencies of these two soil carbon models. In the tropical region the YASSO model showed earlier increase in season of the heterotophic respiration since it is driven by precipitation instead of soil moisture as CBALANCE. In the temperate and boreal region the role of temperature is more dominant. There the heterotophic respiration from the YASSO model had larger annual variability, driven by air temperature, compared to the CBALANCE which is driven by soil 1 https://doi.org/10.5194/bg-2020-7 Preprint. Discussion started: 10 February 2020 c © Author(s) 2020. CC BY 4.0 License.

reveal a more spatially integrated CO 2 signal compared to surface site observations and together they provide a complementary dataset. These two data sources have been used together to study the carbon cycle with "top-down" inversion modelling (Crowell et al., 2019). This kind of modelling framework uses atmospheric CO 2 observations to constrain a priori biospheric and ocean fluxes, based on the Bayesian inversion technique, which results in optimized estimates (a posteriori) of the fluxes (Maksyutov et al., 2013;Rödenbeck et al., 2003;van der Laan-Luijkx et al., 2017;Wang et al., 2019). The estimates for fossil emissions are often assumed as known, i.e., not optimized in the inversion.
In this study we investigate in how far atmospheric CO 2 observations, both from surface network and space-born observations, can be used to benchmark these two soil carbon models included in the land surface model JSBACH, one of the models participating in CMIP6. Previous studies evaluating the new YASSO soil carbon model performance when implemented in JSBACH, have shown better performance in relation to observations of soil carbon stocks when compared to the older soil car-70 bon model of JSBACH, CBALANCE (Goll et al., 2015;Thum et al., 2011). Since the only difference between the two model versions is the description of the underlying soil processes and include no major feedbacks between soil and vegetation in the model set-up (excluding the effect of litter accumulation on fire fluxes), the difference in the release of carbon to the atmosphere originates only from the soil carbon models. The two soil carbon models used in this work have different environmental drivers. CBALANCE has soil moisture and soil temperature as driving variables and the YASSO model precipitation and air 75 temperature. The models have differing response functions to these environmental variables as well as different carbon pool structures, and they are both first-order decay models. This framework allows us to investigate how these above-mentioned differences in soil carbon modes influence atmospheric CO 2 . To transfer terrestrial carbon fluxes from the surface to the atmosphere, we use the transport model TM5 and the anthropogenic and ocean fluxes from an inversion framework (van der Laan-Luijkx et al., 2017). In the analysis we also use a simple box model calculation to further understand the main causes in 80 the different outcomes of the models.
JSBACH (Giorgetta et al., 2013) and the newer model, YASSO, has been used in simulations for the annual global carbon budget (Le Quéré et al., 2015;Le Quéré et al., 2018) and is used in CMIP6 simulations of JSBACH (Mauritsen et al.,2019).
It is furthermore also used in JSBACH4, a re-implementation of JSBACH for the ICON-ESM (Giorgetta et al., 2018;Nabel et al., 2019).
Independent of the used sub-model for soil carbon, JSBACH uses three carbon pools for the living vegetation carbon: 95 a wood pool, containing woody parts of plants, and a green and a reserve pool containing the non-woody parts. JSBACH simulates different processes that lead to losses from the vegetation pools, such as grazing, shedding of leaves and natural or anthropogenic disturbances. Depending on the process, some of the vegetation carbon is lost as CO 2 into the atmosphere, while the remaining part is transferred as dead vegetation into the litter and soil pools of the used sub-model for soil carbon, where it is then subject to the internal processes of the soil carbon sub-model. The only process outside of the soil carbon sub-model 100 which influences dead material are the fire processes, burning parts of above ground litter carbon.

The soil carbon model CBALANCE
CBALANCE (CBA) is the original soil carbon sub-model of JSBACH (Raddatz et al., 2007), which has been used in CMIP5.
The environmental drivers for decomposition in CBA are soil temperature (at soil depth of 30 to 120 cm below the surface) and a relative soil moisture (α) of the upper-most soil layer, which is 5 cm thick. The α varies between zero and one. 105 The function for soil temperature dependence, f CBA,T soil of decomposition follows a Q10 formulation as f CBA,T soil (T soil ) = Q T soil 10 • C

10
(1) with a Q 10 value of 1.8 and T soil is soil temperature in Celsius (shown in Fig. S1a) (Raddatz et al., 2007). The dependency on relative soil moisture α is linear (Fig. S1 b) and it is calculated as where α crit is 0.35 and α min is 0.1 (Knorr, 2000).
Together these functions are modulating the rate of decomposition, so that the R h from each pool (denoted by i) is where C i is the carbon content of each pool and τ i is the turnover rate of each pool in days. CBA uses five different carbon pools having different turnover times:

115
-Two green litter pools: one above-and one below ground in which the non-woody plant parts are decomposed with turnover times between 1.8 and 2.5 years (Goll et al., 2015) -Two woody litter pools: one above-and one below ground in which the woody plant parts are decomposed with turnover times of several decades -One slow pool receiving its input from the four litter pools and its turnover time is in the order of a century. 120

The soil carbon model YASSO
The original soil carbon model of JSBACH was replaced by YASSO (YAS) (Thum et al., 2011;Goll et al., 2012). JSBACH's YAS implementation is based on the Yasso07 model (Tuomi et al., 2009). Development of Yasso07 has relied heavily on litter bag and other observational data sets that have been used to estimate the model parameters (Tuomi et al., 2009(Tuomi et al., , 2011. Owing to its strong connection to experiments, its environmental drivers are quasi-monthly air temperature and precipitation.
The decomposition depends on precipitation P a (m) as
Similar to CBA, YAS has slowly and fast decomposing pools, but its pool dynamics are more structured. YAS uses 18 carbon pools, nine for the decomposition of woody litter and nine for the decomposition of non-woody litter. The only difference in the calculation of the decomposition rates between non-woody and woody pools is an additional parameter that increases 135 the turnover rates with increasing size of the woody litter (Tuomi et al., 2011). In addition to the distinction into woody and non-woody litter which is also done in CBA, YAS takes the chemical composition of the litter into account. YAS uses four chemically distinct pool kinds: acid soluble, water soluble, ethanol soluble and non-soluble. For each of these four chemical compositions one above and one below ground pool are used. In addition there is one humus pool for woody and one for non-woody material. The dynamics of the YAS carbon pools are described in (Tuomi et al., 2009) with decomposition fluxes 140 5 https://doi.org/10.5194/bg-2020-7 Preprint. Discussion started: 10 February 2020 c Author(s) 2020. CC BY 4.0 License.
causing redistributions among the different pools or losses to the atmosphere. Each of the pools has a decay constant, which is modified by the environmental dependencies in Eqs. (4) and (5).
Each PFT used in JSBACH has a distinct chemical composition. Furthermore, the branch size of the woody litter is PFTdependent.
2.2 The model simulations: The JSBACH set-up 145 JSBACH model simulations followed the TRENDY v4 protocol in terms of JSBACH version, simulation protocol and forcing data (Le Quéré et al., 2015;Sitch et al., 2015). Climate forcing was based on CRUNCEP v6 (Viovy, 2010) and the global atmospheric CO 2 was obtained from ice core and NOAA monitoring station data (Sitch et al., 2015). For both set-ups the model was separately run into equilibrium, i.e. until the soil carbon pools of the applied carbon sub-model were at steadystate. The two different transient simulations were then done for the period 1860 to 2014. Anthropogenic land cover change 150 was forced by the LUHv1 dataset (Hurtt et al., 2011) and was simulated as described in Reick et al. (2013). While fire and windthrow were simulated, natural land cover changes and the nitrogen cycle were not activated. Simulations were done at T63 spatial resolution (approximately 1.9 • , 200 km). For further details on the spin-up and the model version please refer to the SI.

155
To estimate atmospheric CO 2 , we used the global Eulerian atmospheric transport model TM5 (Krol et al., 2005;Huijnen et al., 2010). TM5 was run globally at 6 • x 4 • (latitude x longitude) resolution with two-way zoom over Europe, where the European domain was run at 1 • x 1 • resolution. This is the pre-existing set-up that was readily available. The 3-hourly meteorological fields from ECMWF ERA-Interim (Dee et al., 2011) were used as constraints. Linear interpolation was done to obtain CO 2 estimates at the exact locations and times of the observations. 160 We fed TM5 with daily biospheric, weekly ocean and annual fossil fuel fluxes to obtain realistic atmospheric CO 2 . Values of GPP and total ecosystem respiration were taken from the JSBACH simulations for the two different soil model formulations.
Also, carbon release from vegetation and soil owing to land-use change, fires and herbivores were taken from the JSBACH model results as part of terrestrial biosphere carbon fluxes. In addition, we used the posterior biospheric flux estimates from CarbonTracker Europe (CTE2016, later referenced to as CTE; van der Laan-Luijkx et al. (2017)) to provide some guidance on 165 the ability of TM5 to represent the individual site observations. The ocean fluxes were the a posterior estimates from the same study.  Table S1.
Simulations with TM5 were done for 2000-2014, but the first year was considered as spin-up and omitted from the analysis.

The surface observations
Surface observations of atmospheric CO 2 from NOAA weekly discrete air samples (ObsPack product: GLOBALVIEWplus v2.1; ObsPack (2016)) were used to evaluate the effect of different soil carbon models on tropospheric atmospheric CO 2 175 seasonal cycles at sites around the globe. The sites used in the evaluation are shown in Fig. 1. From the data, samples reflective of well-mixed background air were selected (based on flag criteria) similar to van der Laan-Luijkx et al. (2017) to minimize the influence on the observation of transport model errors in our analysis.

The satellite retrievals
GOSAT (Greenhouse Gases Observing Satellite) from Japan Aerospace Exploration Agency (JAXA) was launched in 2009 and 180 observes column XCO 2 with the TANSO-FTS instrument (Kuze et al., 2009). These data were used to evaluate the different simulations and to assess the model performance at larger spatial scale.
XCO 2 from the simulation results were calculated using global 4 • x 6 • x 25 (latitude x longitude x vertical levels) daily average 3-dimensional (3-D) atmospheric CO 2 fields. For each satellite retrieval, the global 3-D daily mean gridded atmospheric CO 2 estimates were horizontally interpolated to the location of the retrievals to create the vertical profile of simulated CO 2 .

185
Averaging kernels (AKs) (Rodgers and Connor, 2003) were applied to model estimates to ensure reliable comparison with GOSAT retrievals: whereĈ is XCO 2 , scalar c a is the prior XCO 2 of each retrieval, h is a vertical summation vector, a is an absorber-weighted AK of each retrieval, x is a model profile and x a is the prior profile of the retrieval. Each retrieval had a prior profile (Yoshida 190 et al., 2013). The retrievals for different terrestrial TransCom (TC) areas ( Fig. 1) were compared with those calculated from the two model simulations. For comparison with GOSAT XCO 2 , the estimates of 3D fields at 6 • x 4 • resolution were used, but not those from the zoom grids due to technical reasons. Differences in XCO 2 due to model resolution were not significant within the context of this study. In this work GOSAT observations (NIES retrieval V02.21 and V02.31) between July 2009 and the end of 2014 were used. was similar in both model formulations (Table 1). The small difference is caused by fire fluxes, which have small differences in their magnitude but similar spatial patterns (Fig. S3). The global estimate for total soil carbon by CBA was 4.5-fold larger than by YAS (Table 1). The global estimate for the litter simulated by YAS pool was larger than that simulated by CBA.
The net CO 2 flux shows a slightly larger land sink for YAS than CBA (Table 2) These have been calculated from the carbon pools over the whole time period and the mean annual R h . The models show longer turnover times in northern high latitudes and dry areas. CBA predicts longer turnover times to southern Europe, whereas YAS does not. On the other hand, YAS predicts longer turnover times close to Saharan desert, different to CBA. YAS also 220 consistently predicts longer turnover times to northern latitudes, but CBA does not do this for the northern European region.
The global distribution of soil carbon is very different between the model formulations ( September. In the region 10 • S -30 • S YAS has 54 % larger amplitude in R h than CBA.

240
The variation in R h seasonal dynamics of these two model formulations can be linked to the differences in their environmental drivers and functions. In Table 3  soil moisture variable α in any of these regions (Table 3 and Fig. S9). In the tropical region the soil moisture for CBA and precipitation for YAS are more important drivers compared to soil and air temperatures. In the high latitudes the temperature has larger effect on R h with both models, even though in the Northern Hemisphere also the precipitation has a significant role for YAS.
To assess, whether the higher amplitude of the seasonal cycle in R h by YAS is caused by the larger litter pool or the 255 environmental response functions, a simple box model calculation was performed (a detailed description in Appendix).
When the global respiration was calculated with the turnover times and soil carbon pools of the YAS model, but using the environmental responses and drivers of the CBA model, the annual magnitude decreased by 29 % compared to the original YAS model (Table A1). However, the yearly maximum value did not change by much. When the opposite was done, and the turnover time and soil carbon pools of CBA were used with the environmental responses and inputs of the YAS model, the magnitude 260 of the global heterotrophic respiration was increased approximately 1.4-fold (Fig. A1). The increase in the amplitude was 83 % (Table A1). Therefore this simple analysis suggests that the environmental variables and their response functions cause the larger global amplitude of R h in the YAS model formulation. To further disentangle, whether this change was caused by the different environmental drivers or their functional dependencies, we made additional tests.
The amplitudes of the seasonal cycle of R h (difference between the maximum and minimum values) are shown in Table A1.  (Table A1). This increase is occurs when either driving variables or response functions are changed individually. However, with the change of the response functions the change in the amplitude 270 is larger, 74 %. Therefore the response functions have a more pronounced role in the changes than only the driving variables and this was true for both of the models.

Evaluation against surface observations
Seasonal cycle amplitudes (SCAs) of atmospheric CO 2 are successfully simulated by the modeling framework across different latitudes (Fig. 5a). The r 2 values of the observed seasonal cycle and the model estimates are high across latitudes, despite some 275 lower values in mid-latitudes of the Northern Hemisphere (Fig. 5b). Averaged over all the latitudes the r 2 value, calculated as linear correlation of simulated and observed averaged annual cycles, was 0.93 for the CTE, 0.90 for CBA and 0.87 for YAS.
The CBA model is able to capture the timing of the seasonal cycle in northern latitudes, but has a tendency to overestimate the SCA by 30 % north of 45 • N. In this region the underestimation of SCA by CTE is approximately 5 % and by YAS 14 %.
In the region 0 This behavior is further illustrated from comparisons of the detrended seasonal cycle at four stations in the Northern Hemi-285 sphere ( Fig. 6 and Table S2). To confirm the general quality of the TM5 model used for both YAS and CBA we plotted its biospheric posterior fluxes from CarbonTracker Europe 2016 (CTE); indeed, deviations of CTE to observations are much smaller than from the JSBACH model at all sites. At the high-latitude sites, Alert and Pallas (Fig. 6a, b), CBA overestimates the seasonal cycle amplitude, while YAS shows some phase-shift of the cycle. The observed SCAs are smaller at the two more southern sites, Niwot Ridge and Mauna Loa. For those sites, CBA is generally successful in capturing their magnitude (Table   290 S2), whereas YAS underestimates them strongly. YAS is also having difficulty capturing the seasonal pattern at Niwot Ridge.
This was happening generally in the temperate region, as is also seen in the lower r 2 values of the YAS model at the different sites ( Fig. 5.
When comparing the overall bias at these four sites between the observations and the model simulations, CBA overestimated The results at surface sites show that CBA largely overestimated SCA at high northern latitudes, whereas YAS almost consistently underestimated the SCA in the Northern Hemisphere. CBA captured the seasonal cycle patterns better than YAS 300 across different latitudes.

Column XCO 2 comparisons for TransCom regions
This evaluation of the two soil modules against satellite column XCO 2 was carried out for the different TransCom (TC) regions and Australia (TC=10). For completeness we show the analysis also for these regions in Table S3.  (Thoning et al., 1989)), but for regions with missing data or no clear seasonal cycle, we averaged over all years of data.
To further illustrate the results from this comparison, we show data for two regions having a clear seasonal cycle and 310 reflecting the same behaviour that was noticed at the four surface observation sites that were looked deeper into in the previous section. In TC region 2, the southern part of North America, CBA is more successful in capturing the observed SCA than YAS ( Fig. 7 a), even though CBA reaches the minimum XCO 2 later than observations. YAS underestimates SCA by 56% and has a different seasonal pattern than observations, so the minimum is reached earlier than in the observations and also the shape during the summer period is different to the observations. In TC region 11, Europe, both models capture the SCA (Fig. 7, Table   315 S3) and the seasonal cycle in the first part of the year. The increase of CO 2 is not as well captured by the simulations.
Overall, observed and simulated XCO 2 differ from each other similar to the surface site observations. Estimates of SCA by YAS are too small in mid-latitudes (Fig. 7a) and in TCs 2, 5 and 8 compared to the observations, and CBA is better in capturing the observed annual cycles. At TC=1 (the northern part of North America), CBA overestimates the SCA, while YAS better captures it. However, the seasonal cycle pattern is better captured with CBA (Table S2) than with YAS. Generally YAS had 320 smaller SCAs than the observations and CBA was more consistent with the observations in most TC regions (Table S3). CBA is also better than YAS in capturing the seasonal pattern of XCO 2 in the all TC regions (Table S3).
There occurs bias between the space-born observations and the model simulations. When averaged over the time period used and the TC regions, CBA overestimates the GOSAT observations by 3.37 ppm and YAS by 2.33 ppm. These values were in line with the bias estimates at the four surface sites.

Discussion
In this work our aim was to use atmospheric observations to benchmark soil carbon models. Our main finding was that the two models predicted differently the annual cycle of the global R h , with the YAS model having a larger amplitude. This in turn lead to pronounced clear differences in the model predictions of seasonal cycles of the atmospheric CO 2 mole fractions. To be able to attribute the differences between the two models we need to compare their results from different aspects and to also 330 judge whether our model simulations are reasonable in the light of previous research.
Similarly to the earlier studies (Goll et al., 2015;Thum et al., 2011), in our results the YAS model was more successful than CBA in estimating the observed global soil carbon stocks, which is approximately 1500 PgC (Scharlemann et al., 2014) including large uncertainties. The distribution of soil carbon stocks was also more realistic in YAS than in CBA. The large soil carbon stocks in the mid-latitudes predicted by CBA (Fig. 2 a) are unrealistic compared to current estimates of global soil 335 carbon distribution (Scharlemann et al., 2014). The large carbon stocks at high latitudes predicted by the YAS model (Fig. 2) are more in line with the observations. However, the version of JSBACH used does not include peatlands and is modelling only mineral soils, therefore the large carbon reservoirs of peatlands are not captured by the model, as they are now described as mineral soils. The YAS model is widely used in different applications at smaller scale and its performance to estimate soil carbon stocks has been found to be good (Hernández et al., 2017). Comparability between the model-calculated and the 340 observed carbon stocks is relevant for any analyses of the carbon fluxes because in the both models the fluxes are proportional to the stocks (flux = decomposition rate * stock).
The global turnover rate of soil carbon by CBA was somewhat larger than in an earlier study, where it was estimated to be 40.8 years (Todd-Brown et al., 2014). This value was in the higher end of the CMIP5 models. The global turnover rate value from YAS, which was 14.8 years, is more in the range of the other CMIP5 models (Todd-Brown et al., 2014). The spatial 345 distribution of the turnover rate anomalies show the the differences caused by the environmental drivers and their dependencies at annual timescales. When comparing these overall turnover rates of the total soil carbon, it is important to keep in mind that both models consisted of carbon pools that had widely varying turnover rates. For example, despite of the higher overall turnover rate, the turnover rate of the most recalcitrant carbon pool of YAS was an order of magnitude lower than that of CBA.
Moving to the monthly time scales, we can see that the global seasonal R h cycle had a larger amplitude with YAS than 350 with CBA (Fig. 3)  have showed that the response functions themselves lead to pronounced differences between soil carbon models (Wieder et al., 2018).
The annual heterotrophic respiration was 66.1 PgCyr −1 for CBA and 65.5 PgCyr −1 for YAS (Table 2), which falls in the range of estimates from the Earth System Models (41.3-71.6 PgCyr −1 ) and is close the observation based estimates of 60 PgCyr −1 (Shao et al., 2013). The similar values of R h by YAS and CBA are caused by the way the models are run into steady-360 state in the beginning of the simulation in 1860. Part of this difference is caused by the fire fluxes. The YAS model has a larger litter pool that behaves as fuel for fires. Therefore, to have the system at steady state, global heterotrophic respiration by YAS must be less. Also, the simulation time of 140 years before the beginning of the analysis might also cause some divergence between the model runs.
When R h is compared by latitudinal zones, differences between the model formulations are visible in the variability and 365 timing of the seasonal cycles in many regions (Fig. 4). R h showed strong correlations to the environmental drivers of the models in different latitudinal zones (Table 3). Both models are largely influenced by their moisture dependency in the tropical region (Table 3). CBA is driven by soil moisture with a linear dependence and YAS is driven by precipitation. While at annual timescales these two variables (air vs. soil temperature and precipitation vs. soil moisture) are similar, since precipitation is affecting soil moisture and on longer timescales air temperature determines soil temperature in the top soil layers, the seasonal 370 cycles of the variables are different. At annual timescales, at which the YAS model has been originally developed, the dynamics of these variables are not likely to be as different as at these shorter timescales. Precipitation begins earlier in the season in the tropical region, and it causes YAS to reach yearly maximum R h earlier than CBA, which is driven by soil moisture in this region. Also in the temperate region, where the temperature has a larger role, the air temperature has larger variability than soil temperature and this leads to different kind of seasonal pattern of the R h predictions by the models.

375
Different moisture dependencies of R h have earlier been found to be important (Exbrayat et al., 2013). At the global level is the response of the simulated R h to soil moisture conditions, since R h is not attenuated in very moist conditions and they found a need to improve the moisture dependency of the YAS model. This is in line with our findings, that a model that has been parameterized at annual time scales requires further development before it can be reliably applied at shorter timescales.
Precipitation has been originally used in the YAS model as a proxy for soil moisture, since enough accurate soil moisture observations for model development haven't been available. Clearly, this idea needs reconsideration as our results show that at 385 zonal spatial scales and monthly temporal scale the R h from YAS is not at all correlated to soil moisture variable α.
The global GPP, being 165 PgCyr −1 in this study, was overestimated, compared to the FLUXCOM estimate. Different

390
The GPP of JSBACH is relatively high, but it was the same for both of the model formulations and only contributed to the amount of litter fall. Therefore we do not expect the variation of its magnitude to have substantial influence to our results, bearing also in mind that the seasonal cycle in different latitudinal zones was captured by the JSBACH model (Fig. S2).
The differences by the two models in the seasonal cycle of atmospheric CO 2 were strong. CBA better reproduced the seasonal cycle amplitudes capturing the shape of the seasonal cycle both for surface sites and comparisons in the TC regions, even 395 though its soil carbon distribution had lower performance compared to YAS. CBA exaggerated the seasonal cycle amplitudes at high northern latitudes, as has been found earlier (Dalmonech and Zaehle, 2013).  (Table 2) and is therefore lower than the land sink by CTE2016.

The biases between XCO
Since the ocean fluxes are a posteriori fluxes from CTE2016 simulation, they cause a bias to the simulated atmospheric CO 2 mole fraction when used together with the land fluxes from the JSBACH simulation. The net land sink of YAS version is slightly closer to CTE2016 value, and this leads to lower bias also at Mauna Loa (Fig. S10).
Additionally, the spaceborn observations also contain bias. GOSAT retrievals were evaluated against ground-based Total
In this study we concentrated on analysing the SCAs and the pattern of the seasonal cycle and emphasized the differences between the two different soil carbon models. Therefore we do not consider this bias to play a big role in these analysis.
13 https://doi.org/10.5194/bg-2020-7 Preprint. Discussion started: 10 February 2020 c Author(s) 2020. CC BY 4.0 License. Also the transport model contains biases (Gurney et al., 2004), but since only one transport model was used in this study, it is expected that the biases are similar between the two model runs and are not the cause for the large differences seen in the two 410 different simulations.

Conclusions
We demonstrated how atmospheric CO 2 observations can be used to benchmark soil carbon models and that it is important to benchmark models across several different variables. This work highlighted the importance of the model drivers and their functional dependencies. The YAS model better captured the magnitude and spatial distribution of soil carbon stocks globally 415 and resulted in similar global turnover rate compared to other Earth System Models, in comparison to the much higher turnover rate by the CBA model.
The YAS model showed biases in the atmospheric CO 2 cycle at temperate latitudes in the Northern Hemisphere. The CBA model showed better performance in capturing the seasonal cycle pattern of the CO 2 mole fraction, but it had biases in the high latitudes in the Northern Hemisphere. When considering both land based and atmospheric based observations, it is not 420 straightforward to say which model performed better. However, the R h from the YAS model showed misalignment with soil water content in the tropical regions, as they were negatively correlated with each other. This suggests that use of precipitation as a proxy for soil moisture might not be sensible in sub-annual time scales.
In addition to the surface observations of CO 2 , also space-born XCO 2 observations were used. They were providing a largerscale confirmation for the results obtained from the surface observations and thus worked as a complimentary information 425 source.
Soil carbon models have several development needs van Groenigen et al., 2017) that are now partly being answered with the next generation models including more mechanistic representation of several below ground processes (Wieder et al., 2015;Yu et al., 2019). The development of moisture dependency from simple empiric relationships is moving towards mechanistic approaches, which may yield more reliable results in the long term (Yan et al., 2018).

430
Code and data availability. A simple box model calculation was performed to evaluate the importance of the dependencies on environmental drivers and the soil carbon pool sizes on the larger global seasonal cycle amplitude in R h as predicted by YAS. In this box model, we assume that heterotophic respiration R h is a product of environmental dependencies and the turnover time as where R h,Y AS is the heterotophic respiration of model YAS, b is a scalar that takes into account the different magnitudes of 445 the response functions, T air is the air temperature, P a is the annual precipitation, C soil,Y AS are the total soil carbon pools and τ Y AS is the turnover time of the total soil carbon pools. T soil is the soil temperature and α is the relative soil moisture. This formulation in A1 refers to the YAS model. The response functions are as shown in Section 2.1.2. For the CBA model the formulation is as These response have been introduced in Section 2.1.1.
The equations were used for monthly data averaged over the years 2001-2014 of heterotrophic respiration, environmental drivers and soil carbon stocks to estimate the turnover times for each grid point for YAS using eq. A1 and for CBA using eq. A2.
Using these turnover times, we calculated the global R h with the turnover times and soil carbon pools of each model by making different tests. First, we used the environmental responses and drivers of the other model (lines B in Table A1). Additionally 455 we changed the driving variables, but kept the original response functions (lines C in Table A1). Then we changed only the response functions of the original model while keeping the original driving variables (lines D in Table A1).
Since the driving variables of soil moisture and annual precipitation differed in magnitudes approximately four-fold, the soil moisture was multiplied by four when using the function for annual precipitation (f Y AS,Pa ) and when annual precipitation was used in the function for soil moisture (f CBA,α ) it was divided by four. The annual cycles of R h are shown in Fig. A1 and the 460 amplitudes in Table A1. 29 https://doi.org/10.5194/bg-2020-7 Preprint. Discussion started: 10 February 2020 c Author(s) 2020. CC BY 4.0 License.  33 https://doi.org/10.5194/bg-2020-7 Preprint. Discussion started: 10 February 2020 c Author(s) 2020. CC BY 4.0 License.