Global variability of carbon use efficiency in terrestrial ecosystems

Xiaolu Tang, Nuno Carvalhais, Catarina Moura, Bernhard Ahrens, Sujan Koirala, Shaohui Fan, Fengying Guan, Wenjie Zhang, Sicong Gao, Vincenzo Magliulo, Pauline Buysse, Shibin Liu, Guo Chen, Wunian Yang, Zhen Yu, Jingjing Liang, Leilei Shi, Shengyan Pu, Markus Reichstein

Department Biogeochemical Integration, Max Planck Institute for Biogeochemistry, Jena, Germany
College of Earth Science, Chengdu University of Technology, Chengdu, Sichuan, China
State Environmental Protection Key Laboratory of Synergetic Control and Joint Remediation for Soil & Water Pollution, Chengdu University of Technology, Chengdu, China
Departamento de Ciências e Engenharia do Ambiente, DCEA, Faculdade de Ciências e Tecnologia, FCT, Universidade Nova de Lisboa, Caparica, Portugal
Key laboratory of Bamboo and Rattan, International Centre for Bamboo and Rattan, Beijing, P.R. China
State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Beijing, China
School of Life Science, University of Technology Sydney, NSW, Australia
CNR Institute for Mediterranean Agricultural and Forest Systems, Via Patacca 85, Ercolano (Napoli), Italy
UMR ECOSYS, INRA-AgroParisTech, Université Paris Saclay, Thiverval-Grignon, France
Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
Department of Forestry and Natural Resources, Purdue University, 715 W. State St, West Lafayette, IN, USA
Laboratory of Geospatial Technology for the Middle and Lower Yellow River Regions, College of Environment and Planning, Henan University, Jinming Avenue, Kaifeng, China
State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu, Sichuan, China
German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103 Leipzig, Germany
Michael Stifel Center Jena (MSCJ) for Data-Driven & Simulation Science, 07743 Jena, Germany

Correspondence to: Xiaolu Tang (lxtt2010@163.com)


Introduction
Rising atmospheric CO2 concentrations and climate change have highlighted the need for a better understanding of terrestrial carbon cycling and its responses to climate change. Gross primary production (GPP), net primary production (NPP) and autotrophic respiration (Ra) are among the most important and closely related components of the carbon cycle. The carbon fixed by photosynthesis is allocated to a variety of uses in plants, including growth respiration, maintenance respiration and biomass accumulation. The allocation proportion is highly relevant to understanding ecosystem carbon stocks and carbon cycles, because it strongly affects the residence time and location of carbon in ecosystems (Zhang et al., 2014). For example, the residence time of carbon used for maintenance respiration and of carbon stored in the structural biomass of organs differs dramatically, ranging from a few hours to decades or even centuries (Campioli et al., 2011). Although a growing number of studies have examined carbon exchange in different ecosystems, questions remain about the fate of the carbon taken up by ecosystems and its relationships with environmental variables and ecosystem types.

Carbon use efficiency (CUE), defined as the ratio of NPP to GPP, is an important parameter describing the transfer of carbon from the atmosphere to terrestrial biomass (Bradford and Crowther, 2013). A CUE value of 0.5 means that 50% of the acquired carbon is allocated to biomass. NPP, which is the more direct and robust estimate, is usually calculated from the annual biomass increment of wood, leaves and litter. GPP is more complex, as it consists of the photosynthetic carbon gain by all leaves, in both overstory and understory, and it is typically not measured directly (DeLucia et al., 2007). Alternatively, GPP can be calculated as the sum of NPP and Ra (DeLucia et al., 2007; Curtis et al., 2005). Because of these methodological challenges, a constant CUE value of 0.5 has been widely used in modelling carbon cycling.
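The two definitions in this paragraph can be combined into a single expression. The short derivation below is only a restatement of the text (CUE as NPP over GPP, GPP as NPP plus Ra); the symbol $R_a$ for autotrophic respiration is a notational choice made here.

```latex
\[
\mathrm{CUE} = \frac{\mathrm{NPP}}{\mathrm{GPP}},
\qquad
\mathrm{GPP} = \mathrm{NPP} + R_a
\;\;\Longrightarrow\;\;
\mathrm{CUE} = \frac{\mathrm{NPP}}{\mathrm{NPP} + R_a} = 1 - \frac{R_a}{\mathrm{GPP}} .
\]
```

Written this way, a constant CUE of 0.5 is equivalent to assuming that autotrophic respiration always consumes half of GPP.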
Theoretically, if Ra is proportional to GPP across terrestrial ecosystems that vary in vegetation type, age, climate and soil fertility, CUE should be constant. On the other hand, if Ra is proportional to biomass, CUE should vary with differences in allocation (DeLucia et al., 2007). The assumption of a constant CUE has been challenged by both field observations and modelling studies, which found that CUE varies with ecosystem type, climate, soil nutrients and geographic location (Albrizio and Steduto, 2003; Maseyk et al., 2008; Xiao et al., 2003; Zhang et al., 2009). These variations have significant effects on landscape estimates of carbon cycling. For example, an error of 20% in the constant CUE (0.5) used in landscape models (i.e. a range from 0.4 to 0.6) can misrepresent a substantial amount of carbon, comparable to total anthropogenic CO2 emissions when scaled to the entire terrestrial biosphere (DeLucia et al., 2007).
Although global distributions of GPP and NPP have been established, for example from MODIS and Dynamic Global Vegetation Models (DGVMs) (DeLucia et al., 2007; Zhang et al., 2009), GPP and NPP do not change in the same way, so CUE follows a different pattern from either. For example, the photosynthesis rate reaches its maximum at temperatures of 25-30 °C, while the respiration rate increases exponentially with temperature (Piao et al., 2010; Ryan et al., 1997), which results in a decrease of NPP and CUE. Using the DGVMs from the TRENDY ensemble and MODIS-derived GPP and NPP, He et al. (2018), Zhang et al. (2009) and Zhang et al. (2014) estimated global CUE and plotted it along geographical and climate gradients. These studies have advanced our understanding of the global distribution of CUE. However, the validity of their conclusions might be sensitive to simplified parameters applied in the NPP and GPP product algorithms, including varying plant functional types, a constant maximum radiation conversion efficiency, and the respiration rate per unit of leaf and wood biomass used in different biomes (Zhang et al., 2009).

Biogeosciences Discuss., https://doi.org/10.5194/bg-2019-37. Manuscript under review for journal Biogeosciences. Discussion started: 14 February 2019. © Author(s) 2019. CC BY 4.0 License.
Previous studies, based on individual observations or process-based model estimates, indicate that site fertility and management are important drivers of CUE because they increase resource availability for plants (Vicca et al., 2012; Campioli et al., 2015). However, whether these factors are the dominant drivers of the temporal and spatial variability of CUE has not been assessed. Additionally, atmospheric nitrogen deposition, largely overlooked so far, might be another confounding factor affecting GPP (Fleischer et al., 2013) and NPP (Stevens et al., 2015; LeBauer and Treseder, 2008), and hence the prediction of the spatial variability of CUE. Therefore, diagnosing the co-variation of CUE with climate and other environmental factors is fundamental to understanding its drivers and to filling current gaps in knowledge about the controls on CUE.

In this study, we compiled a new dataset consisting of 415 site-year CUE observations from 188 sites distributed across global terrestrial ecosystems and climate regions (Fig. 1), updated from the databases of Luyssaert et al. (2007) and Campioli et al. (2015) and from other peer-reviewed publications prior to February 2017. For global CUE mapping and imputation, 15 global variables classified into four types were extracted at each set of site coordinates for the measurement year (Table S1). Furthermore, we compiled additional local attributes, including climate region, site management practice and ecosystem type. The objectives of this study were to: (1) study the ecosystem gradients of CUE; (2) explore the spatial variability of CUE and its potential climatic, edaphic and management drivers; and (3) estimate CUE-derived NPP.

Dataset
This study established a global database of site-year CUE based on observations from 188 sites, extending the databases of Luyssaert et al. (2007) and Campioli et al. (2015). Five ecosystem types were distinguished: cropland, forest, grassland, wetland and tundra. The key rule for inclusion in the database was that measurements of NPP and GPP were available for the same year, and each single-year measurement was taken as an independent observation according to our selection criteria ("Criteria of selecting publications" in Supporting information). NPP included both above- and below-ground growth, which could be estimated by harvest, biometric models or increment cores (below-ground). Following the procedure of Vicca et al. (2012) and Campioli et al. (2015), missing NPP components, such as understory and herb NPP, were gap-filled in forest ecosystems. After gap-filling, a seven percent increase in CUE was observed (Fig. S2).
The integrated and updated database contained 415 observations. The maximum plausible CUE was 0.84, and 8 observations were excluded after the plausibility check ("Plausibility check" in Supporting information). Managed forest sites, mainly fertilized and thinned sites, were excluded from modelling the temporal and spatial patterns of CUE. Although mean CUE differed statistically between managed and non-managed forests (Fig. S3), management as a covariate contributed little to the statistical power of the RF model. Furthermore, information on management practices is scarce globally, which prevents its use as an upscaling covariate, and DGVMs generally also lack a description of the management dynamics that lead to these differences. Finally, the dataset included 286 observations for forest, 33 for grassland, 27 for wetland, 56 for cropland and 5 for tundra, which were used in a one-way analysis of variance (ANOVA) to test for differences in CUE among the five ecosystem types. The one-way ANOVA results were the same before and after removing the managed forest sites and indicated that CUE varied with ecosystem type (Figs. 2 and S4).
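The two screening rules described in this section (NPP and GPP measured in the same year, and CUE not above the stated maximum of 0.84) can be sketched as a small filter. This is only an illustration under assumptions: the record layout, the field names and the toy values are hypothetical, and the full plausibility check in the Supporting information may contain further rules not reproduced here.

```python
from dataclasses import dataclass

MAX_PLAUSIBLE_CUE = 0.84  # upper bound quoted in the text


@dataclass
class SiteYear:
    site: str        # hypothetical site identifier
    npp_year: int    # year of the NPP measurement
    gpp_year: int    # year of the GPP measurement
    npp: float       # g C m-2 a-1
    gpp: float       # g C m-2 a-1


def screen(records):
    """Return (site, year, CUE) for records that pass both rules."""
    kept = []
    for r in records:
        if r.npp_year != r.gpp_year:    # NPP and GPP must share a year
            continue
        if r.gpp <= 0:                  # CUE is undefined without positive GPP
            continue
        cue = r.npp / r.gpp
        if cue > MAX_PLAUSIBLE_CUE:     # plausibility check on the upper bound
            continue
        kept.append((r.site, r.npp_year, cue))
    return kept


# Toy values for illustration only, not taken from the database.
records = [
    SiteYear("A", 2005, 2005, 600.0, 1200.0),   # kept
    SiteYear("A", 2006, 2007, 610.0, 1250.0),   # dropped: years differ
    SiteYear("B", 2010, 2010, 900.0, 1000.0),   # dropped: CUE above 0.84
]
screened = screen(records)
print(screened)
```

Each retained site-year is treated as one observation, which mirrors the statement that single-year measurements are taken as independent.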

Global variable selection
We used 15 variables of four types to predict CUE globally (Table S1). Since NPP and GPP observations were collected from the publications on a yearly scale, monthly climate data were

Soil fertility is an important driver of CUE (Vicca et al., 2012). However, determining soil fertility is challenging because it depends not only on soil nutrient contents but also on the interaction of soil texture, pH, depth and bulk density. Therefore, in this study, soil organic matter was used as an integrated indicator of soil fertility because it is a nutrient sink and source, enhances soil physical and chemical properties, and promotes biological activity (de la Paz Jimenez et al., 2002).
Originally, there were 17 land cover types. Because there were not enough observations for each land cover type in our dataset, we aggregated them into four land cover types (cropland, forest, grassland and wetland) at a spatial resolution of 0.5° × 0.5°.
Since most of the process models report monthly NPP and GPP, the monthly values were first summed to annual totals before calculating CUE. Because TRENDY models differ in spatial resolution, TRENDY-CUE was aggregated to 0.5° × 0.5° when comparing it among ecosystem types.
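The order of operations matters here: annual CUE is the ratio of the annual sums, not the mean of twelve monthly ratios. A minimal sketch, assuming one grid cell with twelve monthly values in consistent units; the function name and the toy numbers are made up for illustration.

```python
def annual_cue(monthly_npp, monthly_gpp):
    """Annual CUE from monthly fluxes: sum first, then take the ratio."""
    if len(monthly_npp) != 12 or len(monthly_gpp) != 12:
        raise ValueError("expected 12 monthly values for each flux")
    npp = sum(monthly_npp)
    gpp = sum(monthly_gpp)
    if gpp <= 0:
        return None          # CUE is undefined where there is no carbon uptake
    return npp / gpp


# Toy monthly fluxes (g C m-2 per month), illustrative only.
gpp = [20, 20, 40, 80, 120, 160, 160, 140, 100, 60, 30, 20]
npp = [5, 5, 15, 40, 70, 90, 85, 70, 45, 25, 10, 5]
cue = annual_cue(npp, gpp)
print(round(cue, 3))
```

Summing first keeps months with very small GPP from dominating the result, which they would if monthly ratios were averaged.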

Data analysis
One-way ANOVA with a post hoc Tukey's HSD test was conducted to test whether CUE varied with ecosystem type and management.
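For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed in a few lines. The sketch below is a plain-Python illustration with made-up CUE values, not the authors' analysis; it stops at the F statistic and does not cover the p-value or the Tukey HSD post hoc step.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean_g = sum(g) / len(g)
        # Variation of group means around the grand mean
        ss_between += len(g) * (mean_g - grand_mean) ** 2
        # Variation of observations around their own group mean
        ss_within += sum((x - mean_g) ** 2 for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical CUE samples per ecosystem type, illustrative only.
toy_groups = [
    [0.45, 0.47, 0.46, 0.44],   # "forest"
    [0.60, 0.62, 0.59, 0.61],   # "wetland"
    [0.55, 0.58, 0.57, 0.56],   # "cropland"
]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f_stat, df_b, df_w)
```

The p-value follows from the upper tail of the F distribution with these degrees of freedom, which in practice would come from a statistics library.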
Random Forest (RF) is a machine learning approach that builds an ensemble of a large number of regression trees, each using a random selection of predictive variables (Breiman, 2001). RF not only accounts for nonlinear relationships and interactions among variables, but also assesses the importance of each variable. The importance of a given variable is expressed as the mean decrease in accuracy (or increase in mean square error, %IncMSE; Breiman, 2001). The higher the importance value, the more important the variable.
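The idea behind the increase-in-MSE importance can be shown without a forest: shuffle one predictor, re-predict, and see how much the error grows. The sketch below is a simplified illustration under assumptions. The `predict` function is a hypothetical stand-in for a trained model, the data are synthetic, and the error is computed on one fixed dataset rather than on the out-of-bag samples of each tree as in Breiman's procedure.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of predict() over the rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def mse_increase(predict, rows, targets, col, n_repeats=20, seed=0):
    """Mean increase in MSE after shuffling predictor column `col`."""
    rng = random.Random(seed)
    base = mse(predict, rows, targets)
    total = 0.0
    for _ in range(n_repeats):
        values = [r[col] for r in rows]
        rng.shuffle(values)   # break the link between this predictor and the target
        permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, values)]
        total += mse(predict, permuted, targets) - base
    return total / n_repeats


def predict(row):
    # Hypothetical fitted model: uses column 0 only, ignores column 1.
    return 0.4 + 0.2 * row[0]


# Synthetic data: column 0 is informative, column 1 is irrelevant.
rows = [(i / 10, float((i * 7) % 5)) for i in range(8)]
targets = [predict(r) + 0.01 * (-1) ** i for i, r in enumerate(rows)]

importance = [mse_increase(predict, rows, targets, col) for col in (0, 1)]
print(importance)
```

A predictor the model never uses gets an importance of zero, while shuffling an informative predictor raises the error.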

To reduce the number of variables and improve model efficiency, the "rfcv" function of the "randomForest" package in R was used (Kabacoff, 2015). This function showed the cross-

CUE varied widely from 0.201 to 0.822 (Fig. 2), while the overall mean CUE across ecosystem types, climate regions and management practices was 0.488 ± 0.136 (mean ± 1 standard deviation). CUE varied significantly among ecosystem types (p < 0.001), with the highest value found in wetland (0.607 ± 0.133), followed by tundra (0.573 ± 0.125), cropland (0.566 ± 0.145), forest (0.464 ± 0.127) and grassland (0.457 ± 0.109). Cropland and wetland CUE were significantly higher than forest and grassland CUE, while forest CUE did not differ significantly from grassland CUE. Tundra CUE did not differ from that of cropland, forest, grassland or wetland. Lower CUE in forests indicates a higher respiration requirement to sustain the higher biomass production of forest ecosystems compared to other ecosystems (Piao et al., 2010). The lowest CUE was found in grassland, presumably due to heightened respiration caused by limited precipitation (Shao et al., 2016). Moreover, the lack of oxygen in the saturated soil of wetlands may suppress belowground Ra, while fertilization and intensive management in cropland help to increase biomass yield and reduce the proportion of respiration (Campioli et al., 2015; Snyder et al., 2009). Thus, our results imply that CUE is not constant among ecosystems and that a constant CUE of 0.5 could lead to biased estimates in C cycle modelling across temporal and spatial scales (see "Practical implication for NPP estimation" section).

Our conclusion differs from that of Campioli et al. (2015), who proposed a constant CUE across ecosystems. The difference can be attributed to (1) different grouping strategies and (2) the stricter filtering criteria used in our study. We grouped ecosystems into five classes (see above) because of the limited number of observations for some individual ecosystems. In our database, we only included publications reporting NPP and GPP for the same year; NPP and GPP reported for different years were excluded even at the same site, since climatic variability can lead to significant variability in GPP (Anav et al., 2015; Jung et al., 2011) and NPP (Li et al., 2017) ("Criteria of selecting publications" in Supporting information). Measurements of each single year were taken as independent observations. Additionally, a plausibility check of CUE was conducted for each given year, and the maximum acceptable CUE was 0.84 ("Plausibility check" in Supporting information).
When all ecosystem types were pooled, management practice increased CUE (Fig. S3), consistent with Campioli et al. (2015). This is likely attributable to (1) increased carbon allocation to biomass production (Campioli et al., 2015); (2) decreased belowground C flux (Litton et al., 2007); and (3) decreased allocation of GPP to Ra (Vicca et al., 2012). Considered by ecosystem type, management practice increased CUE only in forests, not in grasslands (Fig. S3). Therefore, managed forest sites were excluded when modelling the temporal and spatial distribution of CUE (Yang et al., 2014).

However, it should be noted that the one-way ANOVA results did not change when the managed forest sites were excluded (Fig. S4).

Spatial variability of CUE
Random Forest (RF) analysis (Breiman, 2001) indicated that ecosystem type was the most important driver of CUE (Fig. S5) among the climate, satellite (GIMMS NDVI, LAI and fPAR), soil and site variables considered (Table S1). This result corroborates our finding that CUE varied significantly with ecosystem type. CUE increased toward higher latitudes, from 0.42 at 10° S to 0.58 at 65° N (Fig. 3). The latitudinal pattern was consistent with MODIS-based CUE and can be explained by changes in temperature and the sensitivity of CUE to temperature (Ryan et al., 1994).
Normally, the rate of respiration increases exponentially with temperature (Ryan et al., 1994; Ryan et al., 1995) or is more sensitive to temperature than GPP (Curtis et al., 2005), and with increasing temperature plants have higher energy requirements to maintain living tissues (Ryan et al., 1994) or a longer growing season (Piao et al., 2007), while the photosynthesis rate is stable over a wide range of temperatures, i.e. 20-35 °C (Teskey et al., 1995). Thus, plants allocate relatively more C to respiration in warmer areas. However, the highest CUE was observed in intensively cropped regions, such as central North America, central Europe and North China. There was no significant latitudinal pattern of CUE for cropland (Fig. S9), except for a few grid cells above 60° N. This result indicates that cropland CUE is controlled more strongly by management practices that maximize production, and hence increase CUE, while climate variability plays a smaller role. In addition, nutrient availability is another important control on CUE (Vicca et al., 2012). For example, tropical areas are generally constrained by soil nutrient availability, particularly by low phosphorus concentration (Reich et al., 2009). These results further challenge the conventional assumption that CUE is constant and independent of environmental conditions (Campioli et al., 2015; Waring et al., 1998; Maier et al., 2004). However, CUE showed no significant temporal trend during 1982-2011 (data not shown).

[Figure caption fragment] ... predicted by Random Forest. The grey band shows the 2.5 to 97.5 percentile range of the predicted CUE.
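A latitudinal profile with a 2.5 to 97.5 percentile band, as mentioned in the caption fragment, could be built from a gridded CUE map roughly as follows. This is a sketch under assumptions: the band is taken here as the spread across grid cells within each latitude band, which may differ from how the authors derived it, and the band width, function names and toy cells are hypothetical. Cells are not area-weighted.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    if not sorted_vals:
        raise ValueError("empty sample")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def latitudinal_profile(cells, band_width=5.0):
    """cells: iterable of (latitude_deg, cue).

    Returns {band_start_deg: (mean, p2.5, p97.5)}, ordered south to north.
    """
    bands = {}
    for lat, cue in cells:
        start = (lat // band_width) * band_width   # southern edge of the band
        bands.setdefault(start, []).append(cue)
    profile = {}
    for start in sorted(bands):
        vals = sorted(bands[start])
        mean = sum(vals) / len(vals)
        profile[start] = (mean, percentile(vals, 2.5), percentile(vals, 97.5))
    return profile


# Toy grid cells (latitude, CUE), illustrative only.
cells = [(62.3, 0.5), (63.0, 0.6), (-7.25, 0.4)]
profile = latitudinal_profile(cells)
print(profile)
```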
TRENDY models have been widely used to estimate the temporal and spatial variability of NPP (Shao et al., 2016) and GPP (Jung et al., 2017), providing a valuable tool to analyse the temporal and spatial variability of CUE. However, because plant functional types are defined differently among TRENDY models, different parameters are applied to the same plant functional type across space in different models. This leads to differences in the magnitude and spatial patterns of GPP (Anav et al., 2015) and NPP (Shao et al., 2016), which in turn affect the temporal and spatial patterns of CUE. We compared CUE derived from the 13 TRENDY model simulations for (1) the same number of observations at the same site locations for each ecosystem type and (2) the spatial patterns. TRENDY model mean CUE varied from 0.460 for wetland to 0.527 for tundra, a narrower range than in the observations (Fig. 2). There was no significant difference between observed and TRENDY CUE (p = 0.0715-0.539), except for forest (p = 0.018). Latitudinally, however, we found a large spread among models (Figs. 4 and S10). TRENDY-CUE was more variable than predicted CUE, particularly at high latitudes (> 60° N), suggesting that TRENDY models overestimated or underestimated CUE in high-latitude areas. This result is consistent with Xia et al. (2017), who reported that TRENDY models overestimated CUE in permafrost areas. Eight of the 13 TRENDY-CUE estimates decreased with latitude, and OCN- and LPX_Bern-CUE were lowest at high latitudes, while JSBACH_v2.5-, LPJ- and LPJ-GUESS-CUE showed an increasing pattern in tropical areas. HYL-CUE was constant across all latitudes due to a fixed ratio (0.5) of plant respiration to total photosynthesis (Levy et al., 2004). Similar patterns were found for TRENDY-CUE of each ecosystem type (Fig. S11). These different CUE patterns may have several causes:
First, different plant functional types are used in different TRENDY models, and constant parameters are used for each plant functional type across time and space (Xia et al., 2017).

Second, different sets of equations and parameters can lead to different estimates of GPP and NPP, further contributing to the differences in modelled CUE (He et al., 2018). Except for the HYL model, none of the TRENDY models uses a fixed CUE; TRENDY-CUE is thus determined by the difference between GPP and Ra, which includes both maintenance and growth respiration. However, the simulated maintenance and growth respiration vary greatly among TRENDY models (Xia et al., 2017).

Based on a global database and on upscaling tree-level Ra estimates to the stand level, annual Ra is linearly related to biomass (Piao et al., 2010), indicating that sites with higher biomass require more maintenance respiration. Although TRENDY models simulate growth respiration dynamically, they use fixed growth respiration coefficients, such as 0.25 for JULES (Clark et al., 2011) and TRIFFID (Cox, 2001), 0.28 for ORCHIDEE (Krinner et al., 2005) and 0.30 for CLM4.5 (Lawrence et al., 2011). Using constant growth respiration coefficients ignores the inter-annual variability of climatic and soil nutrient controls and yields a simplistic representation of plant respiration that cannot describe how plant respiration responds to climate change in time and space, making it a major source of uncertainty in CUE. For example, maintenance respiration varies with temperature, and growth respiration contributes 40-60% of total respiration in the growing season (Stockfors and Linder, 1998). Even if growth or maintenance respiration were a constant fraction of GPP, the respiration rate would change between years due to the variability of GPP. Therefore, further studies are needed to explore how maintenance and growth respiration respond to climate change across time and space.
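How a fixed growth respiration coefficient enters modelled CUE can be illustrated with one common formulation, assumed here for the sake of the example: growth respiration is a fixed fraction g of the carbon left after maintenance respiration (Rm) has been paid. The individual models cited above differ in detail, so this is a generic sketch rather than the scheme of any particular model, and the flux values are made up.

```python
def cue_fixed_growth_coefficient(gpp, rm, g):
    """CUE when growth respiration is a fixed fraction g of (GPP - Rm).

    Assumed formulation: Rg = g * (GPP - Rm), NPP = GPP - Rm - Rg.
    """
    if gpp <= 0:
        raise ValueError("GPP must be positive")
    available = max(gpp - rm, 0.0)   # carbon left after maintenance costs
    rg = g * available               # growth respiration
    npp = available - rg
    return npp / gpp


# Same GPP and coefficient, different maintenance respiration (toy values).
cool_site = cue_fixed_growth_coefficient(gpp=1000.0, rm=300.0, g=0.25)
warm_site = cue_fixed_growth_coefficient(gpp=1000.0, rm=450.0, g=0.25)
print(cool_site, warm_site)
```

Under this assumption CUE equals (1 - g)(1 - Rm/GPP), so with g fixed all modelled variation in CUE has to come from the maintenance term.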
Third, most models do not consider nutrient constraints, such as nitrogen, and therefore ignore the GPP or NPP increment induced by increasing nitrogen deposition (Anav et al., 2015; Shao et al., 2016). Fourth, due to the lack of an explicit representation of CO2 diffusion within leaves (Sun et al., 2014), TRENDY models underestimate the photosynthetic responsiveness to increasing atmospheric CO2 (Anav et al., 2015). Last but not least, because TRENDY models do not represent agricultural management, crop physiology and fertilization, which are important practices for increasing production (Guanter et al., 2014), they generally underestimated crop CUE, and no model captured the spatial change of CUE in croplands (Fig. S11). Additionally, although the same climate data are used for all TRENDY models to remove the uncertainty from different meteorological forcing, using one particular forcing can introduce systematic errors that propagate to the output of carbon models (Anav et al., 2015). Our observed CUE therefore indicates that the capability of models to predict CUE needs to be improved to better represent terrestrial C cycling. At the same time, both predicted CUE and TRENDY-CUE challenge a constant CUE and call for a variable CUE when modelling global C cycling across space and time for different ecosystem types.

Practical implication for NPP estimation
et al., 2017). Using this approach, the global NPP estimate of this study was 59.1 ± 0.2 Pg C a⁻¹ (Table 1), which is close to the value of 60 Pg C a⁻¹ reported by the IPCC (Ciais et al., 2013). This result highlights the potential of estimating NPP as a proportion of GPP, particularly in inaccessible areas and at sites with complex structure.
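Estimating NPP as a proportion of GPP on a grid amounts to multiplying the two maps cell by cell and summing with cell areas. The sketch below assumes a regular 0.5° latitude-longitude grid on a spherical Earth and GPP in g C m⁻² a⁻¹; the function names, the radius constant and the single toy cell are assumptions made for illustration, and the land fraction of each cell is ignored.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean radius, spherical approximation


def cell_area_m2(lat_south, dlat=0.5, dlon=0.5):
    """Area of a latitude-longitude cell whose southern edge is lat_south."""
    phi1 = math.radians(lat_south)
    phi2 = math.radians(lat_south + dlat)
    return EARTH_RADIUS_M ** 2 * math.radians(dlon) * (math.sin(phi2) - math.sin(phi1))


def global_npp_pg(cells):
    """cells: iterable of (lat_south_deg, gpp_gC_m2_yr, cue).

    Returns the summed NPP in Pg C per year.
    """
    total_g = 0.0
    for lat_south, gpp, cue in cells:
        total_g += cue * gpp * cell_area_m2(lat_south)   # NPP = CUE x GPP
    return total_g / 1e15   # 1 Pg = 1e15 g


# One toy cell at the equator, illustrative values only.
print(global_npp_pg([(0.0, 1000.0, 0.5)]))
```

Because cell area shrinks toward the poles, the same CUE error matters more in low-latitude cells, which also carry most of the GPP.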
Second, this study shows that using flexible CUE values improves the prediction accuracy of global C cycling for different ecosystem types. Our results indicate that CUE varies spatially (Fig. 3); thus, deriving NPP from a constant CUE may bias the NPP estimate. Using the modelled (spatially varying) CUE to derive NPP, as in this study, could reduce this bias, so that such an NPP estimate could serve as a 'ground truth' or benchmark. NPP estimated as Jung's GPP multiplied by a constant CUE (0.5) was 61.9 ± 0.2 Pg C a⁻¹, which overestimated global NPP by 2.8 Pg C a⁻¹ (Fig. S12).
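The numbers in this paragraph can be checked with a short back-of-the-envelope calculation. It assumes that the 59.1 and 61.9 Pg C a⁻¹ estimates are based on the same GPP field and ignores rounding and the stated uncertainties; the implied GPP and the flux-weighted mean CUE are inferences made here, not values reported in the text.

```latex
\begin{align*}
\mathrm{GPP}_{\text{implied}} &\approx \frac{61.9}{0.5} = 123.8\ \mathrm{Pg\,C\,a^{-1}},\\
\overline{\mathrm{CUE}}_{\text{flux-weighted}} &\approx \frac{59.1}{123.8} \approx 0.477,\\
\Delta\mathrm{NPP} &= 61.9 - 59.1 = 2.8\ \mathrm{Pg\,C\,a^{-1}}.
\end{align*}
```

Under these assumptions, a global mean CUE only about 0.02 below 0.5 is enough to account for the 2.8 Pg C a⁻¹ difference.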
Third, our NPP estimate points to the need to improve MODIS algorithms. MODIS NPP was 54.4 ± 0.9 Pg C a⁻¹, which is 4.7 Pg C a⁻¹ lower than the estimate of this study, a difference equal to 50% of anthropogenic CO2 emissions (Janssens-Maenhout et al., 2017). This conclusion is supported by a previous study showing that MODIS underestimated production due to light saturation in tropical areas (Propastin et al., 2012). The underestimation can also be seen in Fig. S13.
Fourth, our NPP estimate highlights the need for better parameterization to improve the representation of processes controlling NPP in TRENDY models. We calculated TRENDY NPP as a proportion of TRENDY GPP, and NPP of the different TRENDY models ranged from 47.2 ± 1.2 to 64.3 ± 1.6 Pg C a⁻¹ from 1982 to 2011 (Table 1). This result indicates that TRENDY models underestimated or overestimated NPP because they simply represent growth and maintenance respiration as a proportion of GPP and lack a representation of site management and CO2 fertilization effects (Anav et al., 2015). Considering

(3) large differences in CUE between observation-based estimates and TRENDY models, and among TRENDY models, particularly in high-latitude areas, highlighting the need for a better representation of the processes controlling CUE in TRENDY models.
Our data analysis further indicated that the mismatch between RF-CUE and TRENDY-CUE was caused by both (1) differences among ecosystem types (significant difference for forest in Fig. 2) and (2) differences in the global distribution of land cover [e.g. different plant functional types or land covers used in TRENDY models (Xia et al., 2017)]. However, it remains an open question whether the mismatch in CUE between RF and TRENDY is also related to the misrepresentation of vegetation C stocks or of the sensitivity of CUE to environmental controls. Additionally, further improvements of the approach should overcome the shortcomings arising from limited data availability and from the mismatch in spatial resolution between covariates and in situ CUE.

Figure 1. Site distribution and the number of observations for forest, grass, wetland, crop and tundra ecosystems. The geographical distribution of the observational sites in the database is not even. Western Europe has excellent coverage, while the eastern part and Russia have only sparse sites. Asian sites are mostly grouped in coastal areas, while most of Africa is not represented. From a biome point of view, forest sites are largely over-represented with respect to the others.

annually averaged (e.g. temperature) or summed (e.g. precipitation). Since GIMMS NDVI covered July 1981 to December 2015 and GIMMS LAI and fPAR covered July 1981 to December 2011, values of the closest year were extracted for observational years outside these ranges.
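The covariate preparation described in the passage above (annual averaging or summing of monthly climate data, and falling back to the closest available year for satellite products) can be sketched as two small helpers. The function names are hypothetical, and treating 1982 as the first usable year of the GIMMS records is an assumption made here because the records start in July 1981.

```python
def annual_climate(monthly_temp_c, monthly_precip_mm):
    """Annual mean temperature and annual precipitation total from monthly data."""
    if len(monthly_temp_c) != 12 or len(monthly_precip_mm) != 12:
        raise ValueError("expected 12 monthly values")
    mean_temp = sum(monthly_temp_c) / 12.0   # simple mean, months not day-weighted
    total_precip = sum(monthly_precip_mm)
    return mean_temp, total_precip


def closest_available_year(year, first_year, last_year):
    """Clamp an observation year to the range covered by a data product."""
    return min(max(year, first_year), last_year)


# Assumed coverage: first complete calendar year 1982 (hypothetical choice).
NDVI_YEARS = (1982, 2015)
LAI_FPAR_YEARS = (1982, 2011)

print(closest_available_year(1975, *NDVI_YEARS))      # before the record starts
print(closest_available_year(2014, *LAI_FPAR_YEARS))  # after the LAI/fPAR record ends
print(annual_climate([float(m) for m in range(12)], [10.0] * 12))
```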
validated prediction of models with sequentially reduced number of predictors (ranked by variable importance), and a 10-fold cross-validation was applied in this study. Finally, six variables (ecosystem type, annual mean precipitation, annual mean temperature, nitrogen deposition, latent heat flux and diurnal temperature range; Fig. S5) were selected to predict the temporal and spatial patterns of CUE using Random Forest (RF). The six variables explained 49% of the variance in CUE.

Six types of cross-validation were conducted (Figs. S6-8): leave-one-site-out cross-validation (LOSOCV; all yearly observations of one site are left out and predicted from the site-years of the remaining sites, repeated for each site); mean-site cross-validation (MSCV; an RF model is built using all observations and validated against mean-site CUE as a new dataset); leave-one-latitude-out cross-validation (LOLOCV; all yearly observations at the same latitude are left out and predicted from the remaining site-years, repeated for each latitude); mean-latitude cross-validation (MLCV; an RF model is built using all observations and validated against mean-latitude CUE as a new dataset); multi-year cross-validation (MCV; CUE extracted from the predicted CUE map is validated against observed CUE only for sites with more than four years of observations); and "range" cross-validation (RCV; predicted and observed CUE are validated within the same range of each predictor variable). These cross-validations helped to assess the uncertainty of predicting time series of CUE at unknown sites. Finally, the Pearson correlation coefficient, model efficiency and root mean square error were calculated.
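The leave-one-site-out scheme and the three evaluation metrics can be sketched in plain Python. Two assumptions are made here: "model efficiency" is taken to be the Nash-Sutcliffe form (one minus the ratio of squared error to the variance of the observations), which the text does not spell out, and the site identifiers and values are toy data.

```python
import math


def leave_one_site_out(site_ids):
    """Yield (site, train_idx, test_idx); all years of a site are held out together."""
    for site in sorted(set(site_ids)):
        test = [i for i, s in enumerate(site_ids) if s == site]
        train = [i for i, s in enumerate(site_ids) if s != site]
        yield site, train, test


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def model_efficiency(obs, pred):
    """Assumed Nash-Sutcliffe form: 1 - SSE / sum of squares around the observed mean."""
    mo = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sse / sst


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Toy example: five site-years from three hypothetical sites.
site_ids = ["A", "A", "B", "C", "C"]
splits = list(leave_one_site_out(site_ids))
print(splits)
```

Holding out whole sites keeps years from the same site from appearing in both training and test sets, which would otherwise flatter the scores.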

Figure 2. Carbon use efficiency (CUE) in cropland, forest, grassland, tundra and wetland for both observed and TRENDY CUE. The capital letters (A and B) on error bars of observed and model mean CUE indicate significant difference among five ecosystem types for observed and model mean CUE,

Figure 4. Latitudinal analysis of TRENDY carbon use efficiency (CUE).The bold purple curve represents predicted CUE using Random forest.
This study was supported by postdoc funding from the Max Planck Institute for Biogeochemistry. It was also jointly supported by the National Natural Science Foundation of China (31800365 and 41671432); the Fundamental Research Funds of the International Centre for Bamboo and Rattan (1632018003 and 1632018009); Innovation funding of Remote Sensing Science and Technology of Chengdu University of Technology (KYTD201501); and Starting Funding of Chengdu University of Technology (10912-2018KYQD-06910). We thank all the authors who contributed to the data collection from the publications, and Dr. Matteo Campioli for his critical comments on the dataset. "The MOD12Q1 data product was retrieved from the online Data Pool, courtesy of the NASA Land Processes Distributed Active Archive Center (LP DAAC), USGS/Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota, https://lpdaac.usgs.gov/data_access/data_pool".

http://dx.doi.org/10.1038/ngeo2553, 2015.
Cao, L.: An Earth system model of intermediate complexity: Simulation of the role of ocean mixing parameterizations and climate change in estimated uptake for natural and bomb radiocarbon and anthropogenic CO2, J. Geophys. Res., 110, http://dx.doi.org/10.1029/2005jc002919, 2005.
Ciais, P., Sabine, C., Bala, G., Bopp, L., Brovkin, V., Al., E., and House, J. I.: Carbon and Other Biogeochemical Cycles, in: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Stocker, T. F., Qin, D., Plattner, G.-K., Tignor, M., Allen, S. K., Boschung, J., Nauels, A., Xia, Y., Bex, V., and Midgley, P. M., Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 159-254, 2013.

Table 1. NPP prediction from observations, MODIS and TRENDY models

Our results had important practical implications, particularly for estimation of global NPP. First, our study paves the way to derive NPP directly from established GPP estimates, such as Jung's GPP (Jung

the inter-annual variability of the respiration coefficient is an important step to reduce the major source of uncertainty in C flux and CUE. Last, our global CUE map facilitates ground-truthing of NPP estimates, thus providing a viable alternative to existing MODIS and TRENDY estimates of global biosphere carbon fixation rates. Our study shows that previous global MODIS NPP estimates, which are based on fixed CUE values, can be 4.7 Pg C a⁻¹ lower than the actual value, an underestimation that is four times greater than the total annual fossil-fuel CO2 emissions of the entire European Union (Janssens-Maenhout et al., 2017). Therefore, it is of great socioeconomic importance to account for the global variability of CUE in terrestrial ecosystems when estimating the carbon fixation rate of the biosphere.

In summary, although data-derived CUE may serve as a benchmark for ecosystem models, direct upscaling from observations has not been reported. This study presents an approach to fill this knowledge gap by compiling a global CUE database and predicting CUE from global environmental variables using the RF algorithm, providing a global CUE product at a moderate resolution of 0.5° × 0.5°. Presently, robust findings include: (1) pronounced CUE variation between and within ecosystem types, challenging the perspective that CUE is independent of environmental controls; (2) strong spatial variability of CUE, with higher CUE at higher latitudes and lower CUE in tropical areas;