From land use to land cover: restoring the afforestation signal in a coupled integrated assessment – earth system model and the implications for CMIP5 RCP simulations

Climate projections depend on scenarios of fossil fuel emissions and land use change, and the Intergovernmental Panel on Climate Change (IPCC) AR5 parallel process assumes consistent climate scenarios across integrated assessment and earth system models (IAMs and ESMs). The CMIP5 (Coupled Model Intercomparison Project Phase 5) project used a novel “land use harmonization” based on the Global Land use Model (GLM) to provide ESMs with consistent 1500–2100 land use trajectories generated by historical data and four IAMs. A direct coupling of the Global Change Assessment Model (GCAM), GLM, and the Community ESM (CESM) has allowed us to characterize and partially address a major gap in the CMIP5 land coupling design: the lack of a corresponding land cover harmonization. For RCP4.5, CESM global afforestation is only 22 % of GCAM’s 2005 to 2100 afforestation. Likewise, only 17 % of GCAM’s 2040 afforestation, and zero pasture loss, were transmitted to CESM within the directly coupled model. This is a problem because GCAM relied on afforestation to achieve RCP4.5 climate stabilization. GLM modifications and sharing forest area between GCAM and GLM within the directly coupled model did not increase CESM afforestation. Modifying the land use translator in addition to GLM, however, enabled CESM to include 66 % of GCAM’s afforestation in 2040, and 94 % of GCAM’s pasture loss as grassland and shrubland losses. This additional afforestation increases CESM vegetation carbon gain by 19 PgC and decreases atmospheric CO2 gain by 8 ppmv from 2005 to 2040, which demonstrates that CESM without additional afforestation simulates a different RCP4.5 scenario than prescribed by GCAM. Similar land cover inconsistencies exist in other CMIP5 model results, primarily because land cover information is not shared between models. 
Further work to harmonize land cover among models will be required to increase fidelity between IAM scenarios and ESM simulations and realize the full potential of scenario-based earth system simulations.


Introduction
Land use plays a major role in determining terrestrial-atmosphere mass and energy exchange (Adegoke et al., 2007; Raddatz, 2007), which in turn influences local to global climate (Brovkin et al., 2013; A. D. Jones et al., 2013; Pitman et al., 2009). Despite much recent progress, we still have a limited understanding of how historical land use has affected, and continues to affect, climate (Brovkin et al., 2013; A. D. Jones et al., 2013; Pitman et al., 2009) and carbon (Anav et al., 2013; Arora and Boer, 2010; Houghton, 2010; Houghton et al., 2012; Hurtt et al., 2006; Jain et al., 2013; Jain and Yang, 2005; C. Jones et al., 2013; Smith and Rothwell, 2013), and high uncertainty as to how land use might evolve in the future (Hurtt et al., 2011; van Vuuren et al., 2011a; Wise et al., 2009). Part of the uncertainty in future land use trajectories is due to the inherent unpredictability of human actions, and part to the high diversity of potential climate mitigation and adaptation scenarios. Several energy and land strategies have been proposed to mitigate climate change (Rose et al., 2012; P. Smith et al., 2013), and while these have similar overall goals, some strategies will likely compete for land and other resources if implemented simultaneously. For example, afforestation and bioenergy production both aim to reduce atmospheric CO2 concentrations, but both activities require land area, and both strategies would impact crop production and markets through effects on crop area (Reilly et al., 2012).
Published by Copernicus Publications on behalf of the European Geosciences Union. A. V. Di Vittorio et al.: From land use to land cover
Reflecting this limited understanding of land use effects on climate and carbon, global climate models (GCMs), and also next generation earth system models (ESMs) that include fully coupled atmosphere-land-ocean carbon cycles, implement a wide range of land use/cover approaches with varying degrees of detail and limited inclusion of managed ecosystems and land use practices (Brovkin et al., 2013; Pitman et al., 2009). The Land Use and Climate, IDentification of robust impacts (LUCID) activity employed seven GCMs to determine whether land use change has significant regional climate impacts and farther-reaching teleconnections due to biophysical changes in land surface. The results for 1972–2002 revealed significant but inconsistent changes in temperature, precipitation, and latent heat in some areas where land use change had occurred. The authors concluded that the model disagreement was due mainly to differences in land use and land cover change implementations and corresponding land cover distributions, with contributions from methodological differences in crop phenology, albedo, and evapotranspiration (Pitman et al., 2009). The environmental factors addressed by LUCID are also key factors for determining carbon uptake by vegetation, and thus it is not surprising that the Coupled Climate-Carbon Cycle Model Intercomparison Project (C4MIP) activity generated ESM projections that range from the land being a carbon source to a large carbon sink by 2100 (Friedlingstein et al., 2006).
To advance the scientific understanding of the effects of land use change on climate, phase 5 of the Coupled Model Intercomparison Project (CMIP5) (Taylor et al., 2012) applied a novel "land use harmonization" approach to produce the required land use change information for all participating GCMs and ESMs. The Global Land use Model (GLM) was used for this land use harmonization to generate the first set of continuous, spatially gridded land use change scenarios for the years 1500–2100 (Hurtt et al., 2011). GLM computes land use states and transitions annually at half-degree, fractional spatial resolution, including secondary land age, area, and biomass, and the spatial patterns of shifting cultivation and wood harvesting (Hurtt et al., 2006). Land use products from GLM have successfully been used as inputs to both regional and global dynamic land models (Baidya Roy et al., 2003; Hurtt et al., 2002; Shevliakova et al., 2009) and fully coupled ESMs (Jones et al., 2011; Shevliakova et al., 2013). The land use harmonization process ensures a continuous transition from the historical reconstructions to the future projections made by integrated assessment models (IAMs).
The land use harmonization methodology was designed to satisfy the demands of a broad range of models and to provide a consistent set of land use inputs for GCMs and ESMs. The historical period of the land use harmonization was based on version 3.1 of the Historical Database of the Environment (HYDE; Klein Goldewijk et al., 2011) and Food and Agriculture Organization (FAO) wood harvest data. For the future period, the land use harmonization process utilized land use data from the four representative concentration pathways (RCPs), each provided by a different IAM. The RCP scenarios were designed to each meet a different radiative forcing target (2.6, 4.5, 6.0, and 8.5 W m−2), and due to differences among the IAMs these scenarios spanned a range of approaches in all sectors, including land use, for meeting the targets (van Vuuren et al., 2011a). As a result, forest cover change varied widely from deforestation to afforestation across the scenarios. Once the land use data were passed through the land use harmonization, each GCM/ESM utilized a unique subset of the harmonized outputs, based on model capabilities, and applied it to a unique set of land use and land cover types (e.g., Lawrence et al., 2012). Although this process was largely successful in enabling the first spatially explicit land use driven climate change experiments, it introduced considerable uncertainty into the climate response for a given RCP, in part because of model-specific translation requirements between harmonized land use outputs and GCM/ESM simulated land cover. This uncertainty due to inconsistent land cover distributions among models precluded robust intercomparison of land-atmosphere processes (e.g., carbon uptake, evapotranspiration) because differences among models were dominated by the differences among simulated land cover distributions (Brovkin et al., 2013). As land use and land cover are interdependent, a more detailed specification of the relationship between land use and land cover may reduce uncertainty in earth system simulations such that experiments can focus on land-atmosphere process uncertainty rather than be confounded by inconsistent land use/cover distributions.
Recent analyses of CMIP5 results using prescribed CO2 concentrations have also shown the land ranging from a carbon source to a sink in 2100 for a given scenario (Brovkin et al., 2013; C. Jones et al., 2013). The LUCID activity was repeated for five CMIP5 ESMs and the results demonstrated that large inter-model spreads of key regional land surface variables (temperature, precipitation, albedo, latent heat, and available energy) were still due mainly to differences in land use and land cover change implementations and corresponding land cover distributions. Inter-model spreads of CO2 emissions, however, were attributed mainly to differences in land carbon cycle process parameterizations. As a result, different land cover distributions among the models gave significantly different regional changes in climate associated with land use change, but with insignificant effects on global mean temperature. Furthermore, the range of net cumulative land use change emissions from 2006 to 2100 for RCP8.5 was 34 to 205 PgC, with the high estimate likely due to the combination of relatively high levels of land carbon and the inclusion of all land use transitions rather than just net land use change (Brovkin et al., 2013). Additionally, not all of the models used the GLM wood harvest data, further contributing to the spread of model results. For comparison, estimates of net cumulative carbon emissions during 1700–2000 (1850–2000) range from 138 to 250 PgC (110 to 210 PgC) (Table 3 in Smith and Rothwell, 2013). The differences in land use and land cover implementations are also a main factor in the large spread of 21st century land carbon uptake and of compatible fossil fuel emissions allowable for a given RCP. In fact, the inter-model spreads in land carbon uptake for individual scenarios are greater than the inter-scenario spreads for individual models (C. Jones et al., 2013). It is apparent that further work is needed to resolve inconsistencies among land use and land cover approaches to reduce climate uncertainty, especially for regional impact assessment.
Additional sources of climate uncertainty related to land use are the RCP radiative forcing targets, which include only emissions of greenhouse gases (GHGs) and some aerosols and reactive gases (van Vuuren et al., 2011a). These targets do not include radiative forcing from albedo change or other direct climate effects associated with land use change. In a recent modeling experiment, two different carbon tax policies with dramatically different land use scenarios met the same radiative forcing target (4.5 W m−2) in the IAM used for RCP4.5 but had significantly different radiative forcing in an ESM (difference of 1 W m−2) due to albedo differences between the land use scenarios (A. D. Jones et al., 2013). Likewise, the shared socioeconomic pathways (SSPs) for mitigation, adaptation, and impact studies in the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) are likely to produce different land use scenarios that meet the same RCP target, but have different radiative forcing in the ESMs due to the direct effects of land use and land cover change on climate. However, one of the goals of the RCP process was to provide a set of radiative forcing targets for ESMs that remains consistent with respect to the diversity of SSPs associated with each RCP target (Moss et al., 2010). As a result of the wide range of land use and land cover related uncertainties in climate projections, an increased emphasis on land use and land cover dynamics is a high priority for CMIP6 (Meehl et al., 2014).
A more consistent and complete land use and land cover coupling between IAMs and ESMs will facilitate more accurate projections of global change scenarios and more robust multi-model intercomparisons of climate and carbon cycle interactions with anthropogenic drivers such as fossil fuel emissions and land use change. These expected outcomes are in line with a primary goal of a scenario-based approach, such as the RCPs, which is "to better understand uncertainties in order to reach decisions that are robust under a wide range of possible futures" (Moss et al., 2010, p. 747). The RCPs were designed to better understand uncertainties in global climate projections by providing distinct scenarios of atmospheric radiative forcing and land use change. Intra-scenario comparison of ESM simulations offers insights into uncertainties in ESM processes, while inter-scenario comparison of ESM simulations offers insights into uncertainties due to a range of possible futures. However, the efficacy of this approach depends on the fidelity of the ESM simulations to the RCP scenarios. Without this fidelity, intra-scenario comparison is not possible, because the ESMs are not simulating the same scenario, and inter-scenario comparison might include futures outside the prescribed range of possibility.
The IAMs projected a complete terrestrial surface (along with ice, rock, and urban) for each given scenario because land use and land cover are interdependent. For example, carbon stocks in various ecosystems might be valued under a carbon price policy, so land cover would need to be determined along with land use. Or a land policy might restrict certain land cover conversions. Within the CMIP5 coupling process, however, GCMs and ESMs determine their own land cover while remaining consistent with the land use harmonization data, thus potentially reducing the fidelity of the full climate simulations to the RCP scenarios. This was a practical design that obviated the redesign of GCM/ESM land use and land cover implementations, but also precluded analysis of the climate impacts of different land cover responses to land use change because such analysis is robust only within a single model where everything but land cover response remains consistent. Another challenge posed by the interdependence of land use and land cover is the implementation of geographic shifts in land cover due to bioclimatic changes. While these shifts are often implemented within ESMs, such shifts are a second-order effect that is superposed upon land use change and might be better implemented as a feedback from ESMs to IAMs to inform land use and land cover projection. Incorporating both land use and land cover into the coupling between IAMs and ESMs is a fundamental step toward realizing the full potential of the scenario-based RCP process.
Our approach to addressing inconsistencies between IAMs and ESMs is to integrate an IAM and an ESM into the first fully coupled model that directly simulates human-environment feedbacks. The resulting integrated ESM (iESM) includes climate feedbacks on vegetation productivity and ecosystem carbon from the Community ESM (CESM) to the Global Change Assessment Model (GCAM) to facilitate land use projection at 5-year intervals. The iESM uses GLM as in the CMIP5 land use harmonization, along with the CESM Land Use Translator (LUT) that converts land use harmonization outputs to CESM land cover and wood harvest area. Our initial iESM simulations showed that time varying factors based on CESM simulated net primary production (NPP) and heterotrophic respiration (HR) were successfully used by GCAM for land use projection. However, these simulations also demonstrated that the large RCP4.5 afforestation signal was not being passed through from GCAM to CESM. GCAM simulated afforestation as a carbon-sequestering strategy to help meet the RCP4.5 target, but this additional forest area was not included in the land use harmonization. As a result, most of this forest area was not included in CESM simulations, both for CMIP5 and in an early version of iESM.
Here we test the feasibility of restoring the lost afforestation signal by using the iESM as a test bed to explore alternative coupling strategies. We focus on modifications to the CESM LUT because initial modifications to GLM did not restore CESM afforestation. One advantage of focusing on a post-land use harmonization approach is that it could be applied to other ESMs independently without changing the land use harmonization product. Section 2 includes model description and experimental design, Sect. 3 presents results and demonstrates that this problem exists in CMIP5, and Sect. 4 discusses the limitations of our current approach and the implications for the CMIP5 archive with respect to land use and climate. We conclude with suggestions for improving IAM to ESM land coupling for future model intercomparisons.

iESM description
The iESM integrates GCAM, GLM, and CESM to evaluate the effects of human-environment feedbacks on the earth system (Fig. 1). We have completed the first coupling stage that allows GCAM to project land use distribution in 5-year increments based on the previous 5 years of CESM vegetation productivity. Here we give an overview of how the three main components interact. A more detailed description of iESM development will be presented in a forthcoming paper (Collins et al., 2014).
GCAM v3.0 (Calvin et al., 2011; henceforth referred to as GCAM) is a tightly coupled IAM of human and biogeophysical processes associated with climate change. GCAM's human system components simulate global economic activity within energy, agriculture, and forest product markets with respect to 14 geopolitical regions. A previous version of GCAM projected land use and land cover distributions for each of the 14 geopolitical regions (Wise et al., 2009) and was used to generate the CMIP5 RCP4.5 scenario (Thomson et al., 2011). Currently, GCAM incorporates a range of improvements to the Agriculture and Land Use (AgLU) module, including the capacity to operate on 151 geographical land units to generate a more detailed and accurate spatial distribution of land use. There are three land cover types that remain constant over time (urban, tundra, and rock/ice/desert) and 24 land use and land cover types available for redistribution, including 12 food and feed crops, five bioenergy crops, and seven managed and unmanaged ecosystems (Kyle et al., 2011; Wise and Calvin, 2011). The "geographical land units" are defined by intersecting 18 global agro-ecological zones (Lee et al., 2005) with the 14 geopolitical regions. In the iESM, GCAM projects land use and land cover distributions within each of these land units at 5-year intervals. These distributions are based on profit shares calculated from agricultural costs, prices, yields, and the application of a carbon price to vegetation and soil carbon densities.
In a second and intermediate step, GLM uses GCAM's cropland, pasture, and forest areas (and wood carbon harvest) to compute all annual, fractional land use states and transitions. As part of this process it disaggregates GCAM's geographical land unit data to a half-degree global grid by computing spatial patterns and also ensures consistency with the historical land use reconstructions (Hurtt et al., 2006, 2011). GLM has been slightly modified from its CMIP5 implementation to better facilitate forest area change matching with GCAM (Sect. 2.3.2). This modification enables GLM to use forest area output from GCAM that was not incorporated into the CMIP5 land use harmonization. Nonetheless, iESM still follows the CMIP5 implementation for CESM in using these GLM land use harmonization outputs: cropland, pasture, primary, and secondary land area, as well as wood harvest areas on primary and secondary forested and non-forested land.
CESM (Bitz et al., 2011; Gent et al., 2011) has fully coupled atmosphere, ocean, land, and sea ice components. Within CESM, the Community Land Model v4.0 (CLM; Lawrence et al., 2011) receives the selected GLM outputs via a translator that converts these outputs to 16 CLM plant functional types (PFTs; eight forest, three grass, three shrub, one bare soil, and one crop) (Lawrence et al., 2012). The CLM dynamic vegetation module, which estimates bioclimate-driven geographical shifts in CLM PFTs, cannot run at the same time as the land use change module presented here; only one of these modules can change CLM PFT areas per simulation. While the iESM does not directly estimate bioclimatic shifts in land cover, the NPP and HR feedbacks to GCAM do incorporate bioclimatic effects on ecosystems into GCAM's land use and land cover projections. The version of iESM used in this study was based on CESM v1.0beta9, which is a pre-release version of the model used for the CMIP5 simulations.
The iESM climate feedbacks on vegetation and carbon were implemented by passing annual climate scaling factors from CESM to GCAM based on NPP and HR. These factors were used to scale GCAM crop yields and vegetation and soil carbon densities every 5 years. To calculate the scaling factors, the per-pixel, PFT-specific CESM 5-year annual average NPP and HR values for a given GCAM time step were divided by base-period average annual values (1990–2004). These NPP and HR ratios were then filtered to exclude outliers based on a median absolute deviation method, and finally aggregated to GCAM's geographical land units and land use and land cover types (for details see Bond-Lamberty et al., 2014). Crop yields and vegetation carbon densities for GCAM's next land use projection were scaled by the NPP ratio, while soil carbon densities were scaled by a combination of the NPP and HR ratios ((NPP ratio + (1 − (HR ratio − 1))) / 2).
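The scaling described above can be illustrated with a minimal Python sketch. It is not the iESM implementation: the function names, the outlier threshold of 3.5 median absolute deviations, the simple unweighted mean used for aggregation, and all input values are assumptions chosen only for illustration (the actual procedure is described by Bond-Lamberty et al., 2014). Only the soil carbon formula is taken directly from the text.

```python
from statistics import median


def scaling_ratio(period_mean, base_mean):
    """Ratio of a 5-year mean (NPP or HR) to its base-period mean."""
    return period_mean / base_mean


def mad_filter(ratios, threshold=3.5):
    """Drop ratios far from the median, measured in units of the median
    absolute deviation (MAD). The threshold is a hypothetical choice."""
    center = median(ratios)
    mad = median(abs(r - center) for r in ratios)
    if mad == 0:
        return list(ratios)  # no spread to judge outliers against
    return [r for r in ratios if abs(r - center) / mad <= threshold]


def soil_carbon_scalar(npp_ratio, hr_ratio):
    """Soil carbon scalar as given in the text:
    (NPP ratio + (1 - (HR ratio - 1))) / 2."""
    return (npp_ratio + (1.0 - (hr_ratio - 1.0))) / 2.0


def mean(values):
    """Unweighted mean, standing in for the aggregation to land units."""
    return sum(values) / len(values)


# Hypothetical per-pixel NPP ratios for one land unit and land type.
npp_ratios = [1.00, 1.02, 0.98, 1.01, 5.00]
kept = mad_filter(npp_ratios)      # the 5.00 outlier is dropped
npp_scalar = mean(kept)            # would scale crop yield and vegetation carbon
soil_scalar = soil_carbon_scalar(npp_scalar, hr_ratio=1.10)
```

In this form an HR ratio above 1 (faster respiration) lowers the soil carbon scalar, while an NPP ratio above 1 raises it, and both ratios equal to 1 leave soil carbon density unchanged.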

Simulations
Our iESM simulations cover 2005 to 2040 with fully coupled CESM components and prescribed RCP4.5 emissions and carbon price path. These simulations use the land use change module, a dynamic ocean (R. Smith et al., 2013), Community Atmosphere Model v4 physics (Gent et al., 2011), carbon-nitrogen biogeochemistry (Thornton et al., 2007), and active land-atmosphere-ocean carbon dynamics, at approximately 1° resolution (0.9375° × 1.25°). The iESM initial conditions are the culmination of a CESM spinup run followed by a CESM 1850–2005 transient historical run with land use change. GCAM initial conditions are calibrated to 2005 wood harvest, land use area, and energy and agriculture costs and production, as reported by individual countries and processed and archived by international organizations (e.g., FAO, International Energy Agency). The GCAM RCP4.5 scenario was described fully by Thomson et al. (2011).
We performed two fully integrated simulations to compare two iESM cases: (1) original CESM LUT (OLDLUT) and (2) modified CESM LUT (NEWLUT) (Table 1). In fact, OLDLUT was our initial fully integrated simulation with iESM and, as reported below, it revealed inconsistencies within iESM that needed to be addressed prior to scientific experimentation. OLDLUT also showed that the updated GLM did not increase CESM afforestation with respect to a previous simulation performed by manually passing data between the respective iESM models. The NEWLUT case was used to test our hypothesis that the lost afforestation signal could be recovered by modifying only the CESM component of iESM. These fully integrated runs included climate feedbacks on vegetation productivity and ecosystem carbon in GCAM's land use projections, which occurred at 5-year intervals. Analysis of the effects of introducing these feedbacks on land use, carbon, and climate will be presented in a forthcoming paper.

OLDLUT land use coupling within iESM
The OLDLUT iESM land use coupling followed the CMIP5 land use harmonization algorithm (Fig. 2), but with a slightly modified version of GLM (see Sect. 2.3.2). The coupling was designed to match GCAM and CESM changes in absolute cropland and pasture area. For CMIP5, GLM received only crop and pasture areas from GCAM, but for the iESM GLM also receives forest area from GCAM to better facilitate forest area change matching (see Sect. 2.3.2). GLM also receives wood products demand from GCAM (in tons of carbon), which is spatially distributed to determine the extent of harvested area in each of five wood harvest types (primary forest harvest, primary non-forest harvest, secondary mature forest harvest, secondary immature forest harvest, and secondary non-forest harvest). The OLDLUT (Fig. 3) uses only the cropland and pasture area outputs from GLM to update CESM PFT areas in conjunction with maps of potential vegetation (the vegetation most likely to be present if no land use change had occurred; Ramankutty and Foley, 1999). Noncrop PFT area reductions are made in proportion to their respective existing grid cell fractions, while additions are made in proportion to their respective potential vegetation grid cell fractions. The OLDLUT does not use the primary and secondary land area information for updating PFT areas because CESM does not keep track of these land use designations.
The OLDLUT does, however, use the primary and secondary land area to calculate the harvested fraction of GLM harvestable area (sum of the five wood harvest type areas divided by the total area of primary and secondary land). This fraction is applied to forest area to determine the harvested area in CESM (Lawrence et al., 2012).
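As a worked illustration of the harvested-fraction calculation, the short Python sketch below follows the definition in the text. The function names, dictionary keys, and area values are hypothetical and are not taken from the CESM LUT code.

```python
def harvested_fraction(harvest_areas, primary_area, secondary_area):
    """Sum of the five wood harvest type areas divided by the total
    area of primary and secondary land (GLM harvestable area)."""
    harvestable = primary_area + secondary_area
    if harvestable <= 0:
        return 0.0  # nothing to harvest in this grid cell
    return sum(harvest_areas.values()) / harvestable


def cesm_harvested_area(fraction, cesm_forest_area):
    """Apply the GLM harvested fraction to CESM forest area."""
    return fraction * cesm_forest_area


# Hypothetical areas for a single grid cell (km2).
harvest = {
    "primary_forest": 1.0,
    "primary_nonforest": 0.5,
    "secondary_mature_forest": 2.0,
    "secondary_immature_forest": 0.5,
    "secondary_nonforest": 1.0,
}
fraction = harvested_fraction(harvest, primary_area=60.0, secondary_area=40.0)
area = cesm_harvested_area(fraction, cesm_forest_area=80.0)
```

With these made-up inputs, 5 km2 of harvest on 100 km2 of harvestable land gives a fraction of 0.05, which corresponds to 4 km2 of harvest on 80 km2 of CESM forest.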
The OLDLUT makes specific assumptions about pasture area change because CESM does not keep track of pasture area (Fig. 3). Changes in GLM cropland result directly in CESM changes in crop PFT area, but changes in pasture area are constrained by forest PFT area and reflected in changes in grass and shrub PFT area. More specifically, pasture addition is limited to replacement of existing forest PFT area with grass PFT area, and pasture removal is limited to the replacement of grass and shrub PFT area by potential forest PFT area. This means that grass and shrub PFT area changes associated with pasture area change can be only as large as the available existing or potential forest area.
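The pasture constraint can be sketched for a single grid cell as follows. This is a simplified illustration rather than the OLDLUT code: the three aggregated PFT classes, the function name, the treatment of potential forest as a single available fraction, and the proportional split of losses between grass and shrub are all assumptions.

```python
def oldlut_pasture_update(pfts, potential_forest, pasture_change):
    """Return updated grid cell fractions and the change actually realized.

    pfts: dict with 'forest', 'grass', and 'shrub' grid cell fractions.
    potential_forest: fraction available for forest regrowth (assumed input).
    pasture_change: requested change in pasture fraction (+ add, - remove).
    """
    pfts = dict(pfts)  # do not modify the caller's dictionary
    moved = 0.0
    if pasture_change > 0:
        # Pasture addition: grass replaces existing forest only.
        moved = min(pasture_change, pfts["forest"])
        pfts["forest"] -= moved
        pfts["grass"] += moved
    elif pasture_change < 0:
        # Pasture removal: potential forest replaces grass and shrub.
        herbaceous = pfts["grass"] + pfts["shrub"]
        moved = min(-pasture_change, potential_forest, herbaceous)
        if herbaceous > 0:
            for name in ("grass", "shrub"):
                pfts[name] -= moved * pfts[name] / herbaceous
        pfts["forest"] += moved
    return pfts, moved


cell = {"forest": 0.10, "grass": 0.30, "shrub": 0.10}
# Request 0.20 of new pasture: only the 0.10 of existing forest converts.
added, moved_in = oldlut_pasture_update(cell, 0.05, 0.20)
# Request removal of 0.20 of pasture: only 0.05 of potential forest regrows.
removed, moved_out = oldlut_pasture_update(cell, 0.05, -0.20)
```

In the second call only a quarter of the requested pasture loss becomes forest, which illustrates how a pasture signal can be truncated when little potential forest area is available in a grid cell.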

Modifying the GLM spatial distribution algorithm
For the iESM, GLM was modified to better facilitate forest area change matching with GCAM in an effort to increase the forest area simulated by CESM. These modifications included operating on GCAM's 151 geographical land units (rather than the 14 regions used for CMIP5) in addition to using GCAM's forest area output, which was not previously shared between the models. For CMIP5, GLM applied the cropland and pasture area changes to the 2005 half-degree map of cropland and pasture while preserving the total cropland and pasture area changes within GCAM regions. Spatial allocation of cropland and pasture areas to the half-degree grids was done with a preference for expanding agricultural area onto non-forested land and reducing agricultural area where GLM would expect a forest to grow, while also preserving 2005 spatial patterns of land use by allocating new cropland and pasture near existing agricultural areas (Hurtt et al., 2011).
The new GLM algorithm uses GCAM forest area from each geographical land unit at each time step and attempts to preserve the forest area changes within each geographical land unit in addition to preserving the cropland and pasture area changes. GLM has previously defined "forest" as natural vegetation that is growing on land where the potential biomass density, based on an internal potential vegetation growth model, is greater than 2 kg C m−2. Using this definition, the potential forestland within GLM is fixed and, as a result, the GLM algorithm cannot grow forest outside of this forestland. In the new algorithm, GLM matches GCAM forest area changes by moving cropland and pasture around within each geographical land unit to "expose" enough potential forestland for regrowth to meet the GCAM forest area changes (see the following steps a–c). In addition, to meet GCAM's land requirements for afforestation, GLM uses a different definition of "forest" (potential biomass density greater than 1 kg C m−2, rather than 2 kg C m−2) than the definition used elsewhere in the GLM code (e.g., for computing the spatial pattern of wood harvesting). The new GLM algorithm operates in three main steps:
a. Decreases in cropland and pasture occur first on the highest potential biomass land and increases in cropland and pasture occur first on the lowest potential biomass land.
b. If the forest area change within a geographical land unit is not met, a redistribution of cropland and pasture within that geographical land unit occurs such that, when possible, existing cropland and pasture is moved from high biomass density land to low biomass density land.
c. If the forest area change within a geographical land unit is still not met, the algorithm attempts to allocate any "unmet" forest area change within another land unit (or across multiple land units) within the same region, using a similar method to (b) above.
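Step (a) of the algorithm above can be sketched as a greedy allocation over the grid cells of one geographical land unit. The following Python sketch is a deliberately simplified illustration, not GLM code: the cell fields (`potential_biomass`, `agriculture`, `other`), the greedy fill, and the treatment of all non-agricultural area on potential forestland as exposed for regrowth are assumptions. Only the ordering rule and the 1 kg C m−2 threshold come from the text.

```python
FOREST_THRESHOLD = 1.0  # kg C m-2, forest definition used for afforestation


def allocate_agricultural_change(cells, change):
    """Apply a signed agricultural area change to the cells of a land unit.

    Decreases are taken first from the highest potential biomass cells,
    increases are placed first on the lowest potential biomass cells.
    Returns the updated cells and any change that could not be placed.
    """
    cells = [dict(c) for c in cells]  # work on copies
    remaining = abs(change)
    decreasing = change < 0
    ordered = sorted(cells, key=lambda c: c["potential_biomass"],
                     reverse=decreasing)
    for cell in ordered:
        if remaining <= 0:
            break
        source, target = (("agriculture", "other") if decreasing
                          else ("other", "agriculture"))
        moved = min(remaining, cell[source])
        cell[source] -= moved
        cell[target] += moved
        remaining -= moved
    return cells, remaining


def exposed_potential_forest(cells):
    """Non-agricultural area on land that meets the forest threshold."""
    return sum(c["other"] for c in cells
               if c["potential_biomass"] > FOREST_THRESHOLD)


# Hypothetical land unit with three grid cells (areas in arbitrary units).
unit = [
    {"potential_biomass": 0.5, "agriculture": 10, "other": 5},
    {"potential_biomass": 3.0, "agriculture": 10, "other": 0},
    {"potential_biomass": 1.5, "agriculture": 10, "other": 2},
]
after_decrease, unmet_decrease = allocate_agricultural_change(unit, -12)
after_increase, unmet_increase = allocate_agricultural_change(unit, 6)
```

In this example the decrease of 12 units clears the highest-biomass cell first, raising the exposed potential forestland from 2 to 14 units. Steps (b) and (c) would then redistribute remaining agricultural area within and across land units if the forest area target were still unmet; they are not implemented here.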

Modifying the CESM land use translation algorithm
To test our hypothesis that the lost afforestation signal could be recovered solely by the ESM component, we focused on modifying the LUT (NEWLUT; Fig. 4) to capture GCAM afforestation via changes in agricultural land. This approach is more expedient than redesigning the coupling code and LUT to receive forest area changes directly from GLM because such redesign would logically require implementation of a single, consistent land surface and carbon cycle among all iESM components. Specifically, the NEWLUT adds tree PFTs when cropland and pasture are removed. Furthermore, the NEWLUT preferentially removes tree PFTs when cropland and pasture are added. Forest area information is still not shared between GLM and the NEWLUT (other than forest harvest). The NEWLUT also includes proper grid cell fraction matching between GLM and CESM, which primarily affects crop, grass, and shrub PFTs.

CMIP5 RCP4.5 land use and land cover distributions among GCAM, GLM, and CESM
The OLDLUT iESM land use coupling was also used in CMIP5, albeit with 14 regions rather than 151 geographical land units and without the GLM modifications and climate feedbacks described above, and so we explored the extent to which the afforestation signal was lost in the CMIP5 simulations. We compared the RCP4.base code for iESM and thus contains the same versions of the model components.

CMIP5 RCP4.5 land use and land cover area inconsistencies
The GCAM afforestation signal was dramatically decreased in the CESM simulations, and the total area covered by CESM herbaceous (grass and shrub) PFTs increased while GCAM pasture decreased (Fig. 5). CESM forest area increased by 23 % of the 4.82 million km2 of afforestation between 2005 and 2020, and by 22 % of the 10.98 million km2 of afforestation by 2100. GLM captured 64 and 56 % of the afforestation in 2020 and 2100, respectively. GCAM and GLM pasture decreased by 4.69 million km2 from 2005 to 2100 while CESM herbaceous PFTs increased by 1.11 million km2 over the same period. The changes in global cropland area were faithfully transmitted (CESM decreases were only 7 % less than GCAM decreases), but absolute CESM cropland area was approximately 1.5 million km2 less than GCAM cropland area throughout the simulation (data not shown). Changes in GLM pasture and cropland areas were essentially identical to GCAM changes, and GLM absolute area values were slightly higher and lower, respectively, than GCAM pasture and cropland areas (cropland data not shown).
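To make the percentages above concrete, the following Python snippet converts them back to approximate areas. The GCAM afforestation totals and percentages are those reported in the text; the variable and function names are illustrative, and because the reported percentages are rounded, the derived areas are approximate.

```python
def transmitted_area(source_change, percent):
    """Area change received by a model, given the source change and the
    percentage of that change that was transmitted."""
    return source_change * percent / 100.0


# GCAM afforestation relative to 2005 (million km2), as reported above.
gcam_afforestation = {2020: 4.82, 2100: 10.98}
cesm_percent = {2020: 23, 2100: 22}
glm_percent = {2020: 64, 2100: 56}

cesm_area = {year: transmitted_area(gcam_afforestation[year], cesm_percent[year])
             for year in gcam_afforestation}
glm_area = {year: transmitted_area(gcam_afforestation[year], glm_percent[year])
            for year in gcam_afforestation}
missing_in_cesm = {year: gcam_afforestation[year] - cesm_area[year]
                   for year in gcam_afforestation}

for year in sorted(gcam_afforestation):
    print(year, round(glm_area[year], 2), round(cesm_area[year], 2),
          round(missing_in_cesm[year], 2))
```

By 2100 this corresponds to roughly 2.4 million km2 of afforestation in CESM, leaving about 8.6 million km2 of GCAM's afforestation unrepresented in the CESM land cover.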

Restored afforestation in iESM
The OLDLUT simulation revealed that only changes in crop area were being faithfully transmitted from GCAM to CESM (Fig. 6; changes in global area). In contrast, CESM forest area increased by only 17 % of GCAM's 5.40 million km² of afforestation between 2015 and 2020, and by only 17 % of the 7.73 million km² of afforestation between 2015 and 2040. Changes in GLM forest area, on the other hand, reflected changes in GCAM forest area quite well (Fig. 6), but at the cost of dramatically overestimating absolute forest area within GLM due to a low biomass threshold for defining forest (Fig. 7; absolute values of global area). Within GLM, the new algorithm captured 93 % of afforestation between 2015 and 2020 and 84 % between 2015 and 2040, as compared to the original GLM algorithm, which captured only 14 and 20 % over the respective periods in a previous simulation performed by manually passing data between the respective iESM models (data not shown). Changes in GCAM pasture were not reflected by changes in CESM herbaceous PFTs, but were faithfully output by GLM (Fig. 6).
The NEWLUT simulation shows improved forest and cropland area changes in CESM with a corresponding change in CESM herbaceous PFT area. The main improvement is that CESM forest area increases by 64 % of GCAM's 2015-2020 afforestation and by 66 % of the 7.71 million km² of afforestation from 2015 to 2040 (Fig. 6). This additional forest area in NEWLUT reduces total area covered by CESM herbaceous PFTs by 94 % of the 4.36 million km² of GCAM pasture loss by 2040. Figure 8 shows the spatial tradeoff between forest and herbaceous PFTs that achieves this level of afforestation, and Fig. 9 demonstrates a sustained increase in average annual land carbon uptake after 2020 due to additional afforestation. In comparison to OLDLUT, the NEWLUT increase in land carbon uptake results in a 19 PgC increase in vegetation carbon gain and an 8 ppmv decrease in atmospheric CO2 gain from 2005 to 2040 (Fig. 10).
NEWLUT also improves the CESM absolute cropland area (Fig. 7) through proper matching of GLM and CESM grid cell fractions. The effect of this proper matching is apparent in the cropland and pasture area changes from 2005 to 2006 (Figs. 6 and 7). GLM NEWLUT outputs follow the GCAM NEWLUT outputs with relationships between GLM and GCAM similar to those for OLDLUT (data not shown).

Discussion
The iESM and CMIP5 land cover area discrepancies (Figs. 5-7) result from a gap in the original CMIP5 land coupling design that allows inconsistent forest area and land cover type definitions across models (Fig. 2), along with different underlying carbon cycles. The land use harmonization was, however, ambitious and largely successful in developing consistent land use definitions and data without requiring extensive redevelopment of land use and land cover components of all participant models (Hurtt et al., 2011). As our study attests, such redevelopment is challenging and model-specific, but might be required for ESMs to adequately simulate the IAM-prescribed anthropogenic drivers and their corresponding effects on carbon and climate. Thus, while this is a specific case, the lost iESM afforestation signal illustrates the shortcomings of the CMIP5 design, and the restoration of this signal offers insights into improving land use and land cover coupling for model intercomparisons.
A primary challenge for improving the CMIP5 land coupling is to increase the amount of specific land cover information being shared between IAM (and historical) scenarios and ESMs. For CMIP5, the land use harmonization was designed to harmonize land use data between models, and as such GLM did not receive forest area or any other land cover information from any of the IAMs (Masui et al., 2011; Riahi et al., 2011; Thomson et al., 2011; van Vuuren et al., 2011b). Thus, at the first coupling step, scenario-prescribed land cover associated with any IAM policy that valued carbon within unmanaged ecosystems (e.g., grassland, wetland, forest) was lost. GLM does keep track internally of forested and non-forested land (according to its own definition of forest, which likely differs from those within IAMs and ESMs), but the output land use harmonization product includes only cropland, pasture, primary, and secondary land areas and transitions, and the age and biomass density of secondary land (and harvest areas, carbon amounts, and transitions, which we do not address here). As each ESM characterizes the land surface by its own suite of vegetation and management types (Brovkin et al., 2013), additional land use and land cover information could be lost in the second coupling step between GLM and the ESMs. For example, some ESMs were able to use the primary, secondary, and transition information, but they might have been applying this information to different land covers than those used by GLM, thus introducing a second shift away from the original IAM scenario. Our specific case demonstrates an even greater inconsistency due to the use of only cropland and pasture information. GCAM has 17 crop types (the CMIP5 version had 10) and seven managed and unmanaged land cover types while CESM has 16 PFTs, only one of which is a crop type. The LUT algorithm uses only the GLM cropland and pasture area information to adjust PFTs because CLM does not keep track of primary versus secondary land. The resulting spatial pattern of non-crop PFTs is determined by the existing PFT distribution and CESM's internal representation of potential vegetation cover (Lawrence et al., 2012; Ramankutty and Foley, 1999). An additional source of error that we did not investigate here is the relationship between individual PFTs and land cover types that may comprise several PFTs (e.g., forest land may consist of 60 % trees and 40 % grass).
A. V. Di Vittorio et al.: From land use to land cover
Due to the lack of a prescribed land cover input associated with the land use input, forest area changes in CESM (and iESM) are effectively residual changes that are only indirectly linked to GCAM forest area through changes in cropland and pasture areas. The LUT calculates cropland area changes first and pasture area changes second (Figs. 3 and 4). In CMIP5 CESM simulations, cropland area changes cause non-crop PFTs to be added or removed in proportion to their potential or existing grid-cell fractions, respectively. Pasture is more complicated because it is not tracked as such: pasture is not a single PFT and its changes are represented as changes in herbaceous and tree PFTs. Specifically, tree PFTs are removed when pasture is added, and non-crop PFTs are added in proportion to their potential vegetation grid-cell fractions when pasture is removed (Lawrence et al., 2012). This residual PFT determination, combined with independent and unique forest definitions across GCAM, GLM, and CESM, prevents the bulk of prescribed afforestation from appearing in the CESM land surface. As a direct consequence, CESM grass area (and shrub area to a lesser extent) increases while GCAM pasture decreases dramatically (Fig. 5). CESM has this same limitation for all four RCP scenarios, and the other CMIP5 ESMs exhibit similar inconsistencies to varying degrees due to the lack of specific vegetation types in the land coupling between IAMs and ESMs. For example, Davies-Barnard et al. (2014) recently reported that the HadGEM2-ES RCP4.5 forest area increased 11 % from 2005 to 2100, while the GCAM forest area increased by 24 %. Additionally, the GCAM 2005 forest area was 41.1 M km² and the GLM 2005 forest area was 39.9 M km², but the MPI-ESM 2005 forest area was about 24 M km². As a result, the 35 % increase in MPI-ESM RCP4.5 forest area by 2100 (Wilkenskjeld et al., 2014) was still only 77 % of GCAM's afforestation. It is apparent from these inconsistencies that interdependent land use and land cover need to be faithfully transmitted from IAMs to ESMs to robustly simulate the effects of prescribed scenarios on the earth system.
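The MPI-ESM comparison above shows why a larger relative increase can still deliver less absolute afforestation when the baseline forest area is smaller. A short check of the arithmetic follows, assuming that "GCAM's afforestation" refers to the 10.98 million km² of 2005-2100 afforestation reported for the CMIP5 comparison; the function name is hypothetical.

```python
def share_of_target(base_area, relative_increase, target_gain):
    """Fraction of a target area gain achieved by a relative increase on a baseline."""
    return base_area * relative_increase / target_gain


# Values taken from the text (areas in million km^2).
mpi_base_2005 = 24.0          # approximate MPI-ESM 2005 forest area
mpi_relative_increase = 0.35  # MPI-ESM RCP4.5 increase by 2100
gcam_afforestation = 10.98    # assumed reference: GCAM 2005-2100 afforestation

mpi_gain = mpi_base_2005 * mpi_relative_increase
mpi_share = share_of_target(mpi_base_2005, mpi_relative_increase, gcam_afforestation)
print(f"MPI-ESM gain: {mpi_gain:.1f} M km^2, {100 * mpi_share:.0f} % of GCAM afforestation")
```

Under that assumption the 35 % increase corresponds to roughly 8.4 million km², or about 77 % of the GCAM value, consistent with the figure quoted above.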
Even partial restoration of the lost afforestation signal in iESM demonstrates the potentially dramatic effect on global carbon and climate of using IAM land cover and land use information in ESMs. As soon as 25 years after the initial increase in forest area, and with only 66 % of GCAM's afforestation area, the NEWLUT has a significant impact on global carbon balance (Fig. 9). The assumption that forest exclusively replaces abandoned cropland and pasture in GCAM's land use projection (Figs. 6-8) sets the upper limit for CESM because there is no other information to constrain forest area, and may be applicable only to the RCP4.5 scenario. Although this limits NEWLUT to including only two-thirds of the total afforestation, adding more forest area to CESM would be arbitrary without additional land cover information. Nonetheless, the increased afforestation in NEWLUT results in an increase in net land carbon uptake over the OLDLUT case due to a sustained increase in average annual land carbon uptake after 2020 (Fig. 9). As a result, the NEWLUT simulation increases vegetation carbon gain by 19 PgC and decreases atmospheric CO2 gain by 7.7 ppmv from 2005 to 2040 in comparison to OLDLUT (Fig. 10). The NEWLUT simulation also decreases soil carbon gain by about 1.5 PgC over this period (data not shown).
Simple linear extrapolation of the iESM vegetation carbon gain and atmospheric CO2 gain from 2005 to 2100 increases these changes to approximately 52 PgC and 21 ppmv, and extending CESM forest area to match GCAM total afforestation could potentially increase these changes to 88 PgC and 36 ppmv in 2100. These are rough estimates that use 2005 as a starting point to reduce the high slope associated with the initial increase from 2015 to 2020, and also assume that additional forest area continues to gain carbon for 60-80 years after it is established. Regardless of the absolute accuracy of these extrapolations, the potential gain in vegetation carbon alone for CESM with full afforestation is on the order of estimates of net cumulative land use change emissions during 1850-2000, which range from 110 to 210 PgC (Table 3 in Smith and Rothwell, 2013). For comparison, the range of CMIP5 vegetation carbon stock gains for RCP4.5 is about 50 to 300 PgC from 2005 to 2100, with most gains being less than 150 PgC and relatively linear (Fig. 2 in C. Jones et al., 2013). An increase in gain of 88 PgC would dramatically shift CESM vegetation carbon dynamics in relation to the other ESMs. The corresponding 36 ppmv decrease in atmospheric CO2 is nearly one-third of the difference between the prescribed 2100 concentrations of the RCP4.5 (~540 ppmv) and RCP2.6 (~420 ppmv) scenarios (Fig. 1 in C. Jones et al., 2013). More importantly for CESM's ability to robustly simulate the effects of the RCP scenarios on the earth system, the prognostic CESM atmospheric CO2 concentration in 2100 for RCP4.5 is 610 ppmv (Keppel-Aleks et al., 2013), and a decrease from 610 to 574 ppmv corresponds to a decrease in radiative forcing of approximately 0.33 W m⁻², which is nontrivial with respect to the 4.5 W m⁻² target. While these carbon cycle changes in the CESM component of iESM may have a significant effect on climate, it is important to note that the carbon cycle effects of afforestation in CESM are not identical to those in GCAM or GLM because these three models have different biogeochemistry and vegetation models. These differences in carbon cycles, however, do not obviate the need for making both land cover and land use consistent between IAMs and ESMs in order to best match the prescribed radiative forcing scenario.
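The rough estimates in this paragraph can be reproduced with a few lines of arithmetic. The sketch below scales the 2005-2040 differences linearly to 2100 and converts the 610 to 574 ppmv change into a forcing change. The text does not state which forcing expression was used; the commonly used simplified logarithmic form, ΔF = 5.35 ln(C/C_ref) W m⁻², is an assumption here, as are the function names.

```python
import math


def linear_extrapolate(gain, start_year, end_year, target_year):
    """Scale a gain accumulated over [start_year, end_year] linearly to target_year."""
    return gain * (target_year - start_year) / (end_year - start_year)


def co2_forcing_change(c_new, c_ref, alpha=5.35):
    """Simplified logarithmic CO2 forcing change in W m^-2 (assumed expression)."""
    return alpha * math.log(c_new / c_ref)


# NEWLUT minus OLDLUT differences for 2005-2040, taken from the text.
veg_carbon_2100 = linear_extrapolate(19.0, 2005, 2040, 2100)  # PgC
co2_2100 = linear_extrapolate(7.7, 2005, 2040, 2100)          # ppmv

# Forcing change for the 36 ppmv reduction from the prognostic 610 ppmv.
delta_forcing = co2_forcing_change(610.0 - 36.0, 610.0)       # W m^-2

print(f"{veg_carbon_2100:.0f} PgC, {co2_2100:.0f} ppmv, {delta_forcing:.2f} W m^-2")
```

With these inputs the extrapolation gives roughly 52 PgC and 21 ppmv, and the assumed forcing expression gives a change of about -0.33 W m⁻², matching the values quoted above. The further scaling to 88 PgC and 36 ppmv for full afforestation is not reproduced here because the text does not specify the scaling factor.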
Different implementations of land cover and land use among IAMs and ESMs also reduce the fidelity between RCP scenarios and their associated effects on the earth system. Figure 8 shows that most of the additional forest area in NEWLUT occurs on grassland and shrubland, and that these lands generally coincide with areas of limited potential forest. The OLDLUT could not add forest area where no potential forest area exists, and the rate of forest carbon accumulation is constrained by environmental conditions. GLM also limits forest area and growth based on potential forest and environmental conditions, but with a different growth model and map of potential forest area than used by CESM. On the other hand, GCAM afforestation is a strategy to expand forest area for carbon sequestration, and assumes that it is cost effective to use agricultural inputs (e.g., water, fertilizer) to achieve the expected forest growth. This disagreement among the three models hampers communication of forest area changes and contributes to the differences in forest area among the models, both in CMIP5 (Fig. 5) and in the iESM (Figs. 6 and 7). Nonetheless, sharing forest area between GCAM and GLM does improve the fidelity between GCAM and GLM's forest area changes (Figs. 5 and 6). GLM and CESM do not simulate agricultural inputs for forests, yet the NEWLUT can simulate most, but not all, of the prescribed afforestation (Figs. 6 and 7) by adding forest area based on GCAM's cropland and pasture changes, rather than on potential forest area. The additional forest might not grow as well in CESM as in GCAM, but the CESM forest productivity is fed back to GCAM for subsequent land use projections, so environmental restrictions on forest growth will influence future land use and land cover. This feedback does not, however, fully compensate for the lack of bioclimatic or agricultural input availability constraints on GCAM's land use projection, which might contribute to an overly optimistic afforestation projection. More generally, this feedback mechanism opens a path for more robustly simulating interdependent land use and land cover through incorporation of potential bioclimate-driven geographic shifts in land cover. ESMs could estimate bioclimatic drivers or geographic shifts for given land use/cover scenarios, and then feed this information back to the IAMs for incorporation into land use/cover projection. Implementing such a feedback for scenario-based simulations would consolidate land use/cover determination into internally consistent modules within the IAMs, thereby increasing fidelity between the scenario-prescribed land surface and the one used by the ESMs.
We have focused on understanding the effects of mismatched land cover areas on global simulations, rather than on mismatched carbon cycles, because the spatial distribution of land cover and land use is a scenario-determined boundary condition for ecosystem-specific processes such as biogeochemical dynamics. For global simulations this boundary condition is generally provided by historical data and IAMs, and, as we have shown, a mismatch in this boundary condition causes CESM to simulate non-scenario effects on carbon and climate (due to a non-scenario land surface), rather than the scenario-driven effects of the land surface prescribed for meeting the RCP4.5 target. Mismatched carbon cycles among IAMs and ESMs, on the other hand, along with differences in atmospheric radiation code, will preclude exact matches in radiative forcing for a given RCP scenario, but should not cause significant deviations among models in the carbon and climate effects of a given scenario.
While we plan to completely reconcile land use and land cover inconsistencies within the iESM by implementing a single carbon cycle with consistent land surface characterization among the components, it is not desirable, nor feasible, for all IAMs and ESMs to have the same biogeochemistry and vegetation growth components. For example, a diversity of terrestrial models can help characterize uncertainty in global simulations. This uncertainty, however, is most useful if these models simulate the same spatial distribution of land cover and land use change. Therefore, iESM redevelopment that ensures land use and land cover consistency between GCAM and CESM could provide a template for improving the fidelity between IAM scenarios and ESM simulations in the next CMIP. In fact, land cover information is currently planned to be included in the CMIP6 land coupling, along with a more extensive land use model intercomparison project (Meehl et al., 2014).

Conclusions
We have identified the lack of specific land cover type information being shared among GCAM, GLM, and CESM in the iESM as the primary cause of CESM having very little afforestation and effectively no change in herbaceous PFT area in contrast to GCAM's large RCP4.5 afforestation and corresponding pasture reduction. Initial efforts to fix this problem through GLM modifications and the sharing of forest area between GCAM and GLM improved only the fidelity of forest area changes between GCAM and GLM. We then focused on modifying the algorithm that translates GLM land use harmonization outputs to CESM PFTs. While these land use translator modifications have been successful at capturing two-thirds of GCAM's RCP4.5 afforestation signal and corresponding reductions in herbaceous PFT area, they are not sufficient to completely overcome the limitations imposed by not passing specific land cover types from GCAM through to CESM. These modifications are also specific to the GCAM RCP4.5 scenario, and might need to be altered for the other RCP scenarios. Furthermore, we have not addressed the lack of constraints on GCAM forest area expansion, nor mismatches between land cover and PFT definitions. Nonetheless, this partial restoration of afforestation has a significant impact on iESM's global carbon cycle through increased vegetation carbon and decreased atmospheric CO2 concentration.
The iESM framework follows the CMIP5 land coupling design, and as such we have characterized a major gap in this design that precludes accurate translation of projected IAM land surface scenarios to ESMs by focusing only on land use such as cropland and pasture (albeit successfully), and not including specific land cover types such as forest, grassland, and shrubland. The relationship between land use and land cover is handled uniquely by individual ESMs, which means that the effects of scenario mismatch will be model-specific and more relevant for some RCPs than others. The resulting land cover discrepancies are likely most pronounced for the large RCP4.5 afforestation signal, which was greatly reduced in the CMIP5 CESM and HadGEM2-ES (see Davies-Barnard et al., 2014) simulations, but could also arise for other large land cover changes such as the extensive deforestation of RCP8.5. Because total land area is conserved, errors in the distribution of one land cover are complemented by errors in the distributions of other land covers. In GCAM's RCP4.5 scenario, pasture decreases over the 21st century, but the CMIP5 CESM runs have increasing grass and shrub areas over the same period. It is very important that the land use and land cover changes (which determine land use change emissions and the total capacity for vegetation carbon assimilation) match between the IAMs and ESMs because the CMIP5 experimental design is predicated on the fidelity between IAM scenarios and ESM simulations such that they have similar, specific radiative forcings for a given scenario, including CO2 emissions from land use change (Moss et al., 2010). Furthermore, future radiative climate targets are likely to include the biogeophysical forcings of land use change because it has been shown that the modeled climate system is sensitive to changes in these forcings due to the spatial distribution of land use and land cover change (Brovkin et al., 2013; A. D. Jones et al., 2013; Pitman et al., 2009), making it imperative that IAM and ESM land use and land cover distributions match as closely as possible. Maintaining the diversity of global biogeochemical and vegetation models also calls for GCMs and ESMs to match historical and projected land cover and land use distributions as closely as possible, so as to isolate carbon cycle contributions to uncertainty from contributions due to differences in land use and land cover. Fortunately, our results indicate that it might be possible to adjust land cover in other CMIP5 models to better match RCP4.5 afforestation and the corresponding climate scenario, while still using the standard land use harmonization data.
We conclude that the land coupling between IAMs and ESMs for future model intercomparisons needs to ensure greater consistency in land cover and land use among the models in order to realize the full potential of scenario-based earth system simulations. In short, the models need to agree on the actual land area and the annual spatial distribution of major (non-)vegetation land covers and land uses. In other words, the ESMs need to simulate the same basic land surface as prescribed by the IAM-generated RCP scenarios. To achieve the required consistency, we suggest that the next CMIP land coupling design provide land cover and land use information, and a standard mapping between land cover and plant functional types. Fortunately, this is an emerging priority for the CMIP6 Land Use Model Intercomparison Project (LUMIP, http://www.wcrp-climate.org/index.php/modelling-wgcm-mip-catalogue/modelling-wgcm-mips/318-modelling-wgcm-catalogue-lumip, http://www.wcrp-climate.org/wgcm/WGCM17/LUMIP_proposal_v4.pdf). The following gridded data with fractional shares within grid cells are specifically recommended:
1. Annual land cover states with complete, contiguous spatial coverage within grid cells. Land cover needs to include at least the basic categories of cropland, grassland, shrubland, woodland, forest, and other (bare/sparse, ice, urban, water). This will allow consistency in major (non-)vegetation types for model intercomparison (with the "other" category having fixed area). The "other" categories could also be separated out for models that can use them, and in preparation for changing their areas also.
2. Annual land use states including primary and secondary land, wood harvest, and pasture (cropland should coincide with the land cover state). These uses should be provided with respect to the land cover categories. Wood harvest and pasture should include both area and amount of biomass/carbon harvested or removed by grazing.
3. A standard present-day land area data set to be used by all models. Land area includes all land cover and land use categories as described above.
4. Annual land use and land cover transitions. Land use transitions need to be accompanied by corresponding land cover transitions with complete, contiguous spatial coverage within grid cells. Net land use/cover transitions, which should be used for model intercomparison, are annual changes in individual land use and cover states, and may include additional detail about sources of wood harvest and grazed biomass. Gross land use/cover transitions are the transitions among particular land use/covers occurring within a particular year. These transitions sum to the net land use/cover transitions, and should also be provided to characterize shifting cultivation and other gross land conversions. While gross land use/cover transitions are very important and make a significant difference in the carbon cycle, until more models are able to make use of gross transitions they should not be included in model intercomparisons.
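The relationship between gross and net transitions in the fourth recommendation can be illustrated with a small sketch. The data layout (a dictionary keyed by source and destination category), the function name, and the area values are hypothetical and chosen only to show how gross transitions within one grid cell and year sum to net changes in each state.

```python
def net_from_gross(gross):
    """Sum gross transitions into net area changes per land use/cover category.

    gross -- dict mapping (source, destination) to the area converted in one year
    """
    net = {}
    for (src, dst), area in gross.items():
        net[src] = net.get(src, 0.0) - area  # area leaving the source category
        net[dst] = net.get(dst, 0.0) + area  # area entering the destination category
    return net


# Hypothetical gross transitions for one grid cell and year (arbitrary area units).
gross = {
    ("forest", "cropland"): 3.0,   # clearing under shifting cultivation
    ("cropland", "forest"): 2.0,   # regrowth on abandoned cropland
    ("pasture", "forest"): 1.5,    # afforestation of pasture
}
net = net_from_gross(gross)
print(net)
```

In this example the forest state changes by a net +0.5 even though 6.5 units of land changed category, which is the information that is lost when only net transitions are exchanged. Because every gross transition removes from one category what it adds to another, the net changes always sum to zero, reflecting conservation of total land area.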

Figure 1. Implementation of iESM terrestrial feedbacks. The light blue arrows show information flow from GCAM to CESM. The light green arrows show information flow from CESM to GCAM. The dashed gray outline, including the arrow crossed out by green, represents the CMIP5 land coupling. The solid green outline, minus the arrow crossed out by green, depicts the iESM implementation used in this study.

Figure 2. General iESM land use coupling algorithm. Forest area is not passed from GCAM to GLM in the CMIP5 land use coupling, but it is passed in the iESM simulations used in this study. AEZ, agro-ecological zone.

Figure 3. OLDLUT algorithm for dynamic PFT coverage. When cropland and pasture decrease, non-crop PFTs are added in proportion to potential vegetation fractions. When cropland and pasture increase, non-crop PFTs are removed in proportion to reference year fractions.

Figure 4. NEWLUT algorithm for dynamic PFT coverage. When cropland and pasture decrease, tree PFTs are added in proportion to potential vegetation fractions. When cropland and pasture increase, tree PFTs are removed first, then other non-crop PFTs, in proportion to reference year fractions.

Figure 6. iESM land use and forest area changes with respect to 2015. The GLM-NEWLUT forest and pasture data are nearly identical to the GLM-OLDLUT data and are not shown for clarity. Similarly, the GLM-NEWLUT cropland data are nearly identical to the GCAM-NEWLUT data.

Figure 7. iESM land use and forest area. The GLM-NEWLUT forest and pasture data are nearly identical to the GLM-OLDLUT data and are not shown for clarity. Similarly, the GLM-NEWLUT cropland data track the GCAM-NEWLUT data, but with the same offset as for the GLM-OLDLUT data.

Figure 9. Net ecosystem exchange (NEE) comparison between iESM simulations. (a) NEE for each simulation. (b) NEE difference (NEWLUT minus OLDLUT). These data show more land carbon uptake (negative NEE), associated with the additional trees, for the NEWLUT simulation during the afforestation period (2015 forward).

Figure 10. Comparison between iESM simulations of (a-b) vegetation carbon and (c-d) atmospheric CO2 concentration. Differences are NEWLUT minus OLDLUT. Due to additional forest area, the NEWLUT simulation significantly increases vegetation carbon gain and decreases atmospheric CO2 gain over the OLDLUT simulation.

Table 1. Two iESM simulations performed for this study.