Evaluating and improving the Community Land Model's sensitivity to land cover

Modeling studies have shown the importance of biogeophysical effects of deforestation on local climate conditions but have also highlighted the lack of agreement across different models. Recently, remote-sensing observations have been used to assess the contrast in albedo, evapotranspiration (ET), and land surface temperature (LST) between forest and nearby open land on a global scale. These observations provide an unprecedented opportunity to evaluate the ability of land surface models to simulate the biogeophysical effects of forests. Here, we evaluate the representation of the difference of forest minus open land (i.e., grassland and cropland) in albedo, ET, and LST in the Community Land Model version 4.5 (CLM4.5) using various remote-sensing and in situ data sources. To extract the local sensitivity to land cover, we analyze plant functional type level output from global CLM4.5 simulations, using a model configuration that attributes a separate soil column to each plant functional type. Using the separated soil column configuration, CLM4.5 is able to realistically reproduce the biogeophysical contrast between forest and open land in terms of albedo, daily mean LST, and daily maximum LST, while the effect on daily minimum LST is not well


Introduction
While the forested area has stabilized or is even increasing over Europe and North America, deforestation is still ongoing at a fast pace in some areas of South America, Africa, and southeast Asia (Huang et al., 2009; Hansen et al., 2013; Margono et al., 2014; McGrath et al., 2015). In addition, carbon sequestration by re- or afforestation has been proposed as a strategy to mitigate anthropogenic climate change (Brown et al., 1996; Sonntag et al., 2016; Seneviratne et al., 2018), making forest loss or gain likely an essential component of future climate change. Changes in forest coverage impact climate by altering both the carbon cycle (Ciais et al., 2013) and various biogeophysical properties of the land surface such as albedo, evaporative fraction, and roughness length (Bonan, 2008; Pitman et al., 2009; Davin and de Noblet-Ducoudré, 2010; Akkermans et al., 2014; Li et al., 2015). However, there exist considerable discrepancies in the representation of biogeophysical effects amongst land surface models, thus generating a need for a thorough evaluation of the representation of these effects in individual models.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Model simulations indicate that the biogeophysical effects of historical deforestation have been rather small on a global scale (Davin et al., 2007; Findell et al., 2007; Davin and de Noblet-Ducoudré, 2010; de Noblet-Ducoudré et al., 2012; Malyshev et al., 2015). However, they have likely been significant on regional and local scales, especially over areas which experienced intense deforestation rates (Pongratz et al., 2010; de Noblet-Ducoudré et al., 2012; Kumar et al., 2013; Malyshev et al., 2015; Lejeune et al., 2017, 2018). Similarly, present-day observational data, either based on in situ (Juang et al., 2007; Lee et al., 2011; Zhang et al., 2014; Bright et al., 2017) or remote-sensing measurements (Li et al., 2015; Alkama and Cescatti, 2016; Li et al., 2016; Duveiller et al., 2018), show that biogeophysical effects of forests can strongly influence local climate conditions. Among the different biophysical effects, the increased surface albedo (cooling effect), the alteration of the evaporative fraction (warming or cooling effect, depending on the region and season), and the lower surface roughness causing a reduction of the turbulent heat fluxes (warming effect) have been identified as the three main drivers of the climate impact of deforestation (Bonan, 2008; Pitman et al., 2009; Davin and de Noblet-Ducoudré, 2010; Li et al., 2015). However, some of these biogeophysical processes are not well represented in current land surface models. The model intercomparison projects LUCID (Land-Use and Climate, IDentification of robust impacts) and CMIP5 (Coupled Model Intercomparison Project Phase 5) exposed the lack of model agreement concerning the biogeophysical impacts of historical land use and land cover change (LULCC), especially regarding the impact on evapotranspiration (ET) and temperature during the warm season over the midlatitudes of the Northern Hemisphere (de Noblet-Ducoudré et al., 2012; Kumar et al., 2013; Lejeune et al., 2017). In addition, distinct discrepancies between present-day temperature observations and the simulated historical effects of LULCC over North America were identified (Lejeune et al., 2017). This highlights the need for systematic evaluation and improvement of the representation of biogeophysical processes in land surface models.
Observing the local climatic impact of LULCC is not straightforward. When temporally comparing observational data over an area undergoing LULCC, it is difficult to disentangle the effect of the LULCC forcing from other climatic forcings (e.g., greenhouse gas forcing). To overcome this difficulty, observational studies often spatially compare nearby sites of differing land cover, assuming that they receive the same atmospheric forcing (e.g., von Randow et al., 2004; Lee et al., 2011). Hence, the sensitivity of land surface models to land cover can be evaluated best with observational data by spatially comparing different land cover types in models. Recently, Malyshev et al. (2015) employed a new approach to assess the local impacts of LULCC in land surface models by comparing climate variables over tiles corresponding to different plant functional types (PFTs) located within the same grid cell. Since PFT tiles within the same grid cell experience exactly the same atmospheric forcing, the resulting subgrid land cover signal extracted by this method achieves good comparability to local observations which contrast neighboring forest and open land sites (Lee et al., 2011; Li et al., 2015; Alkama and Cescatti, 2016; Li et al., 2016).
Here, we aim to evaluate and improve the sensitivity of the Community Land Model version 4.5 (CLM4.5) to land cover, using observational data of the local contrast between forest and open land (i.e., grassland and cropland). In Sect. 3.1 of this study, we systematically analyze the representation of the local difference of forest minus open land in albedo, ET, and land surface temperature (LST) in CLM4.5 against the newly released observational remote-sensing-based products of Li et al. (2015). The forest signal in CLM4.5 is extracted by comparing tiles corresponding to forest and open land, similar to Malyshev et al. (2015). Given the uncertainties in observation-based ET estimates, we further extend our evaluation by including data from the Global Land Evaporation Amsterdam Model (GLEAM) version 3.1a (Miralles et al., 2011; Martens et al., 2017) and the Global ET Assembly (GETA) 2.0 (Ambrose and Sterling, 2014), which are based on remote-sensing and in situ observations, respectively. Finally, a sensitivity experiment is presented in Sect. 3.2, which explores the possibilities to better represent the ET impact of forests in CLM4.5. This configuration of CLM4.5 incorporates modifications in root distribution, plant water uptake, light limitation of photosynthesis, and maximum rates of carboxylation.
2 Methods and data

2.1 Model description and setup

CLM is the land surface component of the Community Earth System Model (CESM), a state-of-the-art Earth system model widely applied in the climate science community (Hurrell et al., 2013). CLM represents the interaction of the terrestrial ecosystem with the atmosphere by simulating fluxes of energy, water, and a number of chemical species at the interface between the land and the atmosphere. The represented biogeophysical processes include absorption and reflection of both diffuse and direct solar radiation by the vegetation and soil surface, emission and absorption of longwave radiation, latent and sensible heat fluxes from the soil and canopy, and heat transfer into the snow and soil. Subgrid heterogeneity is taken into account in CLM by the subdivision of each land grid cell into five land units (glacier, wetland, vegetated, lake, and urban). The vegetated land unit is further divided into 16 tiles representing different PFTs (including bare soil). We run CLM version 4.5 at 0.5° resolution for the period 1997-2010. A 5-year (1997-2001) spin-up period is excluded from the analysis to minimize the impact of the model initialization. The analysis of CLM4.5 therefore covers the period of 2002 to 2010, which matches well with the observation period of 2002 to 2012 of Li et al. (2015).

Biogeosciences, 15, 4731-4757, 2018, www.biogeosciences.net/15/4731/2018/

Assuming that the feedback of the land surface to the atmosphere is of minor importance for the subgrid contrast between forest and open land tiles, simulations are performed in offline mode using atmospheric forcing from the CRUNCEP v4 reanalysis product (Viovy, 2009; Harris et al., 2014).
The land cover map and vegetation state data are prescribed based on MODIS observations (Lawrence and Chase, 2007, Fig. A1). The land cover map from the year 2000 is kept static during the entire simulation period, since no land cover change is required to retrieve a spatial contrast between forest and open land. The optional carbon and nitrogen module of CLM4.5 as well as the crop and irrigation modules are kept inactive in our simulations.
By default, all PFTs within a grid cell in CLM4.5 share a single soil column (Oleson et al., 2013), implying that all PFTs experience the same soil temperature and soil moisture (SM). Further, the surface energy balance at PFT level is closed using the ground heat flux (GHF; i.e., GHF is calculated as the residual of the other energy fluxes). Hence, the soil warms in the case of an energy excess at the land surface, and vice versa. Warmer (cooler) soil in turn will result in increased (decreased) sensible and latent heat fluxes away from the ground and/or increased (decreased) emitted longwave radiation, thereby counteracting the initial energy imbalance. Consequently, this model architecture eventually results in near-zero daily mean GHF, once the soil temperature has adjusted to an equilibrium state with a near-zero energy imbalance. On shared soil columns (ShSCs), however, GHFs can reach unrealistically high values for individual PFTs (Fig. A2a and c), because a common soil temperature is artificially maintained for all PFTs, which differs from their individual equilibrium states. This assumption leads to a net GHF into the soil over open land PFTs and out of the soil over forest PFTs for the majority of the locations across the globe, implying a lateral subsurface heat transport from open land towards forests (Schultz et al., 2016). To resolve this issue, Schultz et al. (2016) proposed a modification of CLM4.5 which attributes a separate soil column (SeSC) to each PFT. This modification allows the soil of individual PFTs to equilibrate to a different temperature (Fig. A3) and suppresses these unrealistically high (lateral) GHFs (Fig. A2b and d).
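The residual closure of the energy balance described above can be illustrated with a minimal sketch. The function name, the sign convention (GHF positive into the soil, turbulent fluxes positive away from the surface), and all flux values are assumptions made for illustration; they are not taken from the CLM4.5 source code or from the simulations.

```python
def ground_heat_flux(sw_net, lw_net, sensible, latent):
    """Ground heat flux (W m-2) as the residual of the other energy fluxes.

    Assumed sign convention (illustrative only):
      sw_net   : net absorbed shortwave radiation, positive towards the surface
      lw_net   : net emitted longwave radiation, positive away from the surface
      sensible : sensible heat flux, positive away from the surface
      latent   : latent heat flux, positive away from the surface
    A positive result means heat flows into the soil.
    """
    return sw_net - lw_net - sensible - latent


# Hypothetical daily mean fluxes for two PFT tiles that share one soil column.
ghf_open_land = ground_heat_flux(sw_net=150.0, lw_net=60.0, sensible=40.0, latent=40.0)
ghf_forest = ground_heat_flux(sw_net=170.0, lw_net=55.0, sensible=70.0, latent=55.0)

print(ghf_open_land, ghf_forest)
```

With these made-up numbers the open land tile gains heat through the ground while the forest tile loses it, which corresponds to the implied lateral heat transport on a shared soil column; on separate soil columns each tile's soil temperature would instead adjust until its own daily mean residual is close to zero.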
Here, we present results from a simulation on SeSCs, called CLM-BASE, unless stated otherwise (Table A4). We also performed a simulation on ShSCs named CLM-DFLT.
Further, we present a sensitivity experiment, named CLM-PLUS in Sect. 3.2, in which we try to alleviate detected biases in ET. Besides the SeSCs, four aspects in the parameterization of vegetation transpiration (VTR) are modified in this sensitivity experiment:

- The first aspect is a shallower root distribution for grassland and cropland PFTs. CLM4.5 accounts for SM stress on transpiration through a stress function β_t, which ranges from 0 (when soil moisture limitation completely suppresses VTR) to 1 (corresponding to no soil moisture limitation of VTR). Forests for the most part experience higher SM stress than open land in CLM-DFLT except in the northern high-latitude winter (Fig. A4), partly caused by the similar root distribution for all PFTs but evergreen broadleaf trees (Fig. A5). In reality, observed maximum rooting depths are considerably deeper for forests than for grassland and cropland (Canadell et al., 1996; Fan et al., 2017). Likewise, in situ observations in the tropics show that grassland ET decreases during dry periods, because grasses have only limited access to water reservoirs located below a depth of 2 m (von Randow et al., 2004). Hence, we aim to increase SM stress of open land PFTs and reduce their ability to extract water from the lower part of the soil, by introducing a shallower root distribution for these PFTs (Fig. A5). This root distribution was not fitted to a particular observed root distribution. However, the new root distribution agrees better with the average rooting depth of annual grass reported by Fan et al. (2017).
- The second aspect is dynamic plant water uptake. Tropical forests are often observed to exhibit increased ET during dry periods, due to increased incoming shortwave radiation (da Rocha et al., 2004; Huete et al., 2006; Saleska et al., 2007). That is, despite the upper soil being dry, tropical trees still have sufficient access to water from deeper soil layers (Jipp et al., 1998; von Randow et al., 2004). We aim to allow a similar behavior in CLM4.5 by introducing a dynamic plant water uptake, where plants only extract water from the 10 % of the roots with best access to SM (example in Fig. A6).
- The third aspect is light limitation reduction for all C3 PFTs and enhancement for C4 PFTs. In CLM-BASE, ET of boreal PFTs is underestimated compared to GETA 2.0 (Fig. 3f).
- The last aspect is modified maximum rates of carboxylation (V_cmax; Table A1). This PFT-specific parameter is suitable to tune VTR, since it is not well constrained from observations and VTR in models is highly sensitive to this parameter (Bonan et al., 2011). The new values were chosen with the aim to alleviate biases relative to GETA 2.0 (Fig. 3f) and still lie well within the range of observations collected in the TRY plant trait database (Boenisch and Kattge, 2017). Additionally, the minimum stomatal conductance of C4 plants, which is by default 4 times larger than that of C3 plants, is reduced.
A technical description of these modifications as well as a discussion of the effect on ET by each individual modification is provided in Appendix A.

Observational data
The data published in Li et al. (2015) are used to evaluate the effects of forests on local climate variables in CLM4.5. This data set was created by applying a window-searching algorithm to remote-sensing LST, albedo, and ET products from the MODerate resolution Imaging Spectroradiometer (MODIS) to systematically compare these variables over forest and open land on a global scale. The data of this study, hereafter referred to as MODIS, cover the period of 2002 to 2012 and were aggregated from the initial window size of 0.45° × 0.25° to 0.5° × 0.5° spatial resolution. Hence, the similar spatial scale of the MODIS data and the CLM4.5 simulations allows for good comparability between these two data sources.
We also use two additional observation-based data sets of ET to consider uncertainties in present-day ET estimates. Various global ET products are available which, however, exhibit substantial discrepancies (Mueller et al., 2011, 2013; Wang and Dickinson, 2012; Michel et al., 2016; Miralles et al., 2016). In particular, the algorithm from Mu et al. (2011) used to retrieve the MODIS ET product was found to systematically underestimate ET compared to in situ and catchment-scale observations (Michel et al., 2016; Miralles et al., 2016). In addition, algorithms used to infer ET from remote-sensing observations make assumptions on how the land cover type influences ET, preventing an independent identification of the influence of LULCCs on ET. We therefore complement our evaluation of the ET impact of forest in CLM4.5 with two additional data sets: GLEAM version 3.1a and GETA 2.0.
GLEAM was introduced in 2011 (Miralles et al., 2011) and revised twice, resulting in the current version (3.1; Martens et al., 2017). It provides estimates of potential ET for tall canopy, bare soil, and low vegetation after Priestley and Taylor (1972). Potential ET of vegetated land surfaces is converted into actual ET using vegetation-dependent parameterizations of evaporative stress. Canopy interception evaporation is calculated separately using the parameterization of Gash and Stewart (1979). GLEAM uses surface radiation, near-surface air temperature, surface SM, precipitation, snow water equivalent, and vegetation optical depth observations to estimate ET globally at 0.25° resolution. To maximize spatial and temporal overlap with the MODIS observations, we choose GLEAM version 3.1a (hereafter referred to as GLEAM), which incorporates reanalysis input besides satellite observations. We compare the ET estimates for tall canopy and low vegetation to model output for forests and open land, respectively. Since interception loss is only estimated for tall canopy, it was fully attributed to ET from forests.
GETA 2.0 (Ambrose and Sterling, 2014) is a suite of global-scale fields of actual ET for 16 separate land cover types (LCTs), derived from a collection of in situ measurements between 1850 and 2010. Using a linear mixed effect model with air temperature, precipitation, and incoming shortwave radiation as predictors, yearly ET estimates for each of these 16 different LCTs have been obtained with a global coverage and 1° spatial resolution. We then use the same land cover map employed for the CLM4.5 simulations to weight the different LCTs in this data set and retrieve an ET value for forest and open land (see Sect. 2.3 for more details). Since our CLM4.5 simulations were conducted without irrigation, we did not include the GETA 2.0 irrigation layer. We refer to this data set as GETA in this study.

Model evaluation
The forest signal in CLM4.5 is extracted by comparing the area-weighted mean of the variables of interest over all forest tiles to the corresponding values over open land tiles (i.e., grassland and cropland), similar to Malyshev et al. (2015). As such, it becomes possible to infer a forest signal for every model grid cell containing any forest and any open land PFT, no matter how small the fraction of the grid cell covered by these PFTs. The different PFT tiles within a 0.5° × 0.5° grid cell in our CLM4.5 simulations are subject to the exact same atmospheric forcing and are hence comparable to the almost local effect of forests retrieved at a resolution of 0.45° × 0.25° in MODIS. It needs to be noted that the MODIS observations can only be retrieved under clear-sky conditions, thereby potentially impairing the comparability to our CLM4.5 data which are not filtered for clear-sky days. Nevertheless, it was decided to include cloudy days for the analysis of the CLM4.5 simulations, to preserve the comparability to studies which do not distinguish between cloudy and clear-sky days (e.g., GLEAM; GETA; da Rocha et al., 2004; von Randow et al., 2004; Liu et al., 2005).
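The tile comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used in the study: the array layout (PFT tiles along the first axis, grid cells along the second), the helper names `class_mean` and `forest_minus_open`, the PFT indices, and all numbers are hypothetical.

```python
import numpy as np


def class_mean(values, fractions, members):
    """Area-weighted mean over the PFT tiles in `members` for each grid cell.

    values, fractions : arrays of shape (n_pft, n_cells)
    members           : list of PFT indices belonging to one class
    Returns NaN for grid cells in which the class covers no area.
    """
    v = values[members]
    w = fractions[members]
    total = w.sum(axis=0)
    # Tiles with zero area are ignored, even if their value is missing (NaN).
    weighted = np.where(w > 0, w * v, 0.0).sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = weighted / total
    return np.where(total > 0, mean, np.nan)


def forest_minus_open(values, fractions, forest_pfts, open_pfts):
    """Subgrid signal: forest class mean minus open land class mean."""
    return (class_mean(values, fractions, forest_pfts)
            - class_mean(values, fractions, open_pfts))


# Hypothetical example: 4 PFT tiles (0-1 forest, 2-3 open land), 3 grid cells.
values = np.array([[10.0, 12.0, np.nan],
                   [14.0, np.nan, np.nan],
                   [20.0, np.nan, 18.0],
                   [22.0, 15.0, 19.0]])
fractions = np.array([[0.3, 0.5, 0.0],
                      [0.1, 0.0, 0.0],
                      [0.4, 0.0, 0.6],
                      [0.2, 0.5, 0.4]])

signal = forest_minus_open(values, fractions, [0, 1], [2, 3])
print(signal)
```

In this sketch the third grid cell contains no forest tile, so no signal can be derived there and the result is NaN, while the other two cells yield a signal regardless of how small the forest or open land fraction is.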
A total of 12 of the 16 PFTs of CLM4.5 are attributed to either the forest or the open land class as described in Table A2. Consistent with Li et al. (2015), open land was considered the combination of grassland and cropland. Hence, bare soil as well as shrubland are excluded from our analysis. Forest and open land ET of GETA was aggregated similarly using the same land cover (LC) map as in the CLM4.5 simulations, with the LCTs of GETA attributed to the different CLM4.5 PFTs as listed in Table A3. To ensure a consistent comparison with the LST data from MODIS, we derive a radiative temperature (T_rad) from the emitted longwave radiation output (LW_up) in CLM4.5 according to Stefan-Boltzmann's law (assuming that emissivity is 1 as in Eq. 4.10 of Oleson et al., 2013):

$$T_{\mathrm{rad}} = \left( \frac{LW_{\mathrm{up}}}{\sigma} \right)^{1/4},$$

with σ being the Stefan-Boltzmann constant (5.67 × 10^−8 W m^−2 K^−4). Hereafter, T_rad will be referred to as LST.
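The conversion from emitted longwave radiation to radiative temperature described above can be sketched in a few lines, assuming an emissivity of 1 as stated in the text. The function and constant names are chosen for illustration and do not correspond to CLM4.5 variable names.

```python
import numpy as np

SIGMA = 5.67e-8  # Stefan-Boltzmann constant in W m-2 K-4, as given in the text


def radiative_temperature(lw_up):
    """Radiative temperature (K) from emitted longwave radiation (W m-2).

    Inverts Stefan-Boltzmann's law, LW_up = sigma * T_rad**4,
    under the assumption that the surface emissivity equals 1.
    """
    return (np.asarray(lw_up, dtype=float) / SIGMA) ** 0.25


# Usage: a surface at 300 K emits SIGMA * 300**4 (about 459 W m-2).
print(radiative_temperature(SIGMA * 300.0**4))
```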
For the local difference of forest minus open land in albedo, ET, daily mean LST, daily maximum LST, and daily minimum LST, we will use the symbols α(f − o), ET(f − o), LST_avg(f − o), LST_max(f − o), and LST_min(f − o), respectively.

To evaluate the different CLM4.5 simulations objectively, three different metrics are calculated over the following eight Köppen-Geiger climate zones (Kottek et al., 2006): equatorial humid (E-h), equatorial seasonally dry (E-sd), arid (Arid), warm temperate winter dry (T-wd), warm temperate summer dry (T-sd), warm temperate fully humid (T-fh), snow warm summer (S-ws), and snow cold summer (S-cs) (Fig. 1). As a first metric, the area-weighted mean for a given variable over these climate zones (x̄) is calculated as follows:

$$\bar{x} = \frac{\sum_i A_i \, x_i}{\sum_i A_i},$$

where x_i is the difference of forest minus open land in variable x of all the grid cells i belonging to the respective climate zone and A_i their areas. Secondly, the CLM4.5 simulations are compared in terms of the area-weighted root mean squared deviation (RMSD) to the observation-based data sources:

$$\mathrm{RMSD} = \sqrt{\frac{\sum_i A_i \left( x_i^{\mathrm{sim}} - x_i^{\mathrm{obs}} \right)^2}{\sum_i A_i}},$$

where x_i^sim and x_i^obs are the simulated and observed differences of forest minus open land in variable x. RMSD for a Köppen-Geiger climate zone is calculated from a data pool collecting all monthly values with data in CLM4.5 and the given observational data which lie within the respective climate zone (except when comparing to GETA for which only long-term annual means are available).
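The two area-weighted metrics described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the inputs are flat arrays pooled over one climate zone, and missing values are assumed to be encoded as NaN and dropped from the pool.

```python
import numpy as np


def area_weighted_mean(x, area):
    """Area-weighted mean of x over all entries with finite data."""
    x = np.asarray(x, dtype=float)
    area = np.asarray(area, dtype=float)
    valid = np.isfinite(x)
    return np.sum(area[valid] * x[valid]) / np.sum(area[valid])


def area_weighted_rmsd(x_sim, x_obs, area):
    """Area-weighted RMSD, using only entries where both data sets have data."""
    x_sim = np.asarray(x_sim, dtype=float)
    x_obs = np.asarray(x_obs, dtype=float)
    area = np.asarray(area, dtype=float)
    valid = np.isfinite(x_sim) & np.isfinite(x_obs)
    squared_dev = (x_sim[valid] - x_obs[valid]) ** 2
    return np.sqrt(np.sum(area[valid] * squared_dev) / np.sum(area[valid]))


# Hypothetical pool of three grid cells; the last has no observation.
sim = np.array([1.0, 3.0, 2.0])
obs = np.array([0.0, 3.0, np.nan])
area = np.array([1.0, 3.0, 5.0])

print(area_weighted_mean(sim[:2], area[:2]))  # (1*1 + 3*3) / 4 = 2.5
print(area_weighted_rmsd(sim, obs, area))     # sqrt(1 * 1**2 / 4) = 0.5
```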
Lastly, the index of agreement (IA; Duveiller et al., 2016) was calculated for the same data pools as RMSD.This dimensionless metric describes the agreement between two data sets, with 0 indicating no agreement and 1 indicating perfect agreement.By definition, this metric is set to 0 if the two compared data sets exhibit a negative Pearson correlation.Since results of this metric generally support those of RMSD, they are shown in the Appendix (Fig. A7).
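A sketch of such an index is given below. The exact expression is an assumption based on our reading of the formulation in Duveiller et al. (2016): one minus the mean squared deviation divided by the sum of the two population variances and the squared difference of the means, set to 0 for negative Pearson correlation as stated in the text. The function name and the unweighted treatment of the data pool are likewise assumptions, so the original publication should be consulted before relying on this form.

```python
import numpy as np


def index_of_agreement(x, y):
    """Dimensionless agreement between two data sets (0 = none, 1 = perfect).

    Assumed form: 1 - MSD / (var(x) + var(y) + (mean(x) - mean(y))**2),
    set to 0 when the Pearson correlation is negative (or undefined).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    valid = np.isfinite(x) & np.isfinite(y)
    x, y = x[valid], y[valid]

    r = np.corrcoef(x, y)[0, 1]
    if not r >= 0:  # also catches NaN, e.g. for constant input
        return 0.0

    msd = np.mean((x - y) ** 2)
    denom = np.var(x) + np.var(y) + (np.mean(x) - np.mean(y)) ** 2
    return 1.0 - msd / denom


x = np.array([1.0, 2.0, 3.0, 4.0])
print(index_of_agreement(x, x))        # identical data sets
print(index_of_agreement(x, -x))       # negative correlation
print(index_of_agreement(x, x + 1.0))  # perfectly correlated but biased
```

In this form a constant offset between two perfectly correlated data sets lowers the index below 1, so the metric penalizes bias as well as lack of correlation.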

Results
3.1 Evaluation of the local effect of forests in CLM4.5

Albedo
The MODIS satellite observations and CLM-BASE agree on a generally negative α(f − o) (Fig. 2). In fact, MODIS observations show slightly positive α(f − o) for some latitude-month combinations concentrated in the tropics and subtropics (Fig. 2); however, these differences are mostly insignificant and must be considered in the light of uncertainties in the MODIS observations, which are more sparse over these regions due to frequent cloud coverage (Li et al., 2015). The negative albedo difference is amplified towards the poles and in wintertime due to the snow masking effect (Harding et al., 2001). Among the non-snow climate zones, the albedo contrast between forest and open land is strongest in the Arid and the T-sd climate zones (Fig. 3a). This could be related to the occurrence of dry periods in these climate zones during which open land dries out more easily than forests due to its shallower root profiles (Canadell et al., 1996; Fan et al., 2017). As green leaves have lower albedo than dry leaves and the soil, the albedo contrast between the still-green forest and the dried-out open land would be intensified in such a scenario (Dorman and Sellers, 1989). α(f − o) tends to be more negative in CLM-BASE than in the satellite observations in all Köppen-Geiger climate zones, especially in the snow climate zones. RMSD values over the climate zones follow similar tendencies to the magnitudes of mean α(f − o) and are of roughly the same magnitude (Fig. 4a). The exceptions are the tropical climate zones, where the magnitude of RMSD is considerably higher than the mean values of α(f − o). This is likely related to the fact that MODIS observes only a weak albedo signal of forests in these climate zones.

Evapotranspiration
All of the considered observation-based ET products indicate that annual mean ET(f − o) is positive in every climate zone, despite considerable variations in the magnitude of this difference (Fig. 3e). GLEAM suggests a near-zero ET(f − o) in the Arid climate zone, most likely because it uses surface SM data as an input to estimate ET. Also, GLEAM exhibits positive ET(f − o) throughout the year in the midlatitudes, unlike MODIS which has a negative ET(f − o) during winter (Fig. 6). Paired-site FLUXNET studies offer an additional opportunity to compare ET over forest and over open land on a point scale. Overall, they report higher ET for tropical forests (Jipp et al., 1998; von Randow et al., 2004; Wolf et al., 2011). In the midlatitudes and high latitudes, a number of FLUXNET studies observe a positive ET(f − o) during summer, and a near-zero negative ET(f − o) during winter, similar to MODIS (Fig. 6; Liu et al., 2005; Stoy et al., 2006; Juang et al., 2007; Baldocchi and Ma, 2013; Vanden Broucke et al., 2015; Chen et al., 2018). On the other hand, negative ET(f − o) values have been observed at some paired FLUXNET sites in the tropics (Van der Molen et al., 2006) and in the midlatitudes during summer (Teuling et al., 2010).
The considered global ET data sets, however, consistently exhibit higher ET over forests in most regions (Fig. 5). This agreement across the different independent global data sources gives some confidence in the fact that ET is generally higher over forests. Nevertheless, it needs to be noted that ET(f − o) in GETA shows fundamentally different results when considering the data over irrigated crops instead of data over rainfed crops (resulting in negative ET(f − o) at many locations). Therefore, distinguishing irrigated from rainfed crops in future evaluations would be essential but remains beyond the scope of this study.
CLM-BASE exhibits considerable discrepancies in ET(f − o) to the observation-based data sets both for the annual mean values (Fig. 5) and the seasonal cycle (Fig. 6). ET(f − o) in CLM-BASE is near zero in all climate zones (Fig. 3e), and even negative in the [...] 2018) and ours show that the lower soil evaporation signal only arises for the configuration with SeSCs (data of CLM-DFLT are not presented here). Thus, lower soil evaporation around the Equator in CLM-BASE is likely related to the diminution of the soil temperature and of the available energy mentioned earlier in this section. It appears reasonable that, in comparison with open land, forests have lower soil evaporation since (1) the forest soil surface receives less incoming solar radiation, (2) more of the incoming precipitation is intercepted by the canopy, and (3) the water vapor concentrations within the canopy are higher. Yet soil evaporation and canopy interception evaporation contribute a larger proportion to total ET in CLM4.5 (31 and 19 %) compared to GLEAM (14 and 10 %; Martens and Miralles, 2017). It is thus possible that the strength of this effect is too large in CLM4.5. However, most ET measurement techniques cannot distinguish among the different components of ET, making it difficult to assess which partitioning is more realistic. Overall, negative ET(f − o) values in CLM-BASE typically coincide with negative differences for its VTR component, in particular during the wet season in the tropics and subtropics and during summer at higher latitudes (Fig. 7c and f), whereas negative values in the soil evaporation difference are partly compensated by positive values in interception evaporation (Fig. 7d and e). It is therefore likely that VTR is the main driver behind the ET(f − o) bias even though the contribution of the individual ET components to the total signal cannot be evaluated with observations. For this reason, the modifications in the CLM-PLUS sensitivity experiment are targeted at altering vegetation transpiration.
In summary, ET(f − o) in CLM4.5 exhibits considerable discrepancies to the considered global ET data sets and in situ observations.The SeSC configuration amplifies these discrepancies, which are typically driven by the difference in VTR of forest minus open land.

Land surface temperature
The overall local temperature impact of forests is the result of several biogeophysical properties acting simultaneously. They include lower albedo of forests (warming effect), higher surface roughness (cooling effect if land surface is warmer than boundary layer), and alteration of the evaporative fraction (Bonan, 2008; Pitman et al., 2009; Davin and de Noblet-Ducoudré, 2010; Li et al., 2015). For daily mean LST, forests exhibit a cooling effect in MODIS except for the winter months at latitudes exceeding 30° (Fig. 8a). This implies that the cooling effects of higher surface roughness and generally higher evaporative fraction over forests are stronger than the warming effect due to their lower albedo.
LST_avg(f − o) and LST_max(f − o) are positive only in the presence of snow, as α(f − o) is amplified due to the snow masking effect (moreover, sensible heat fluxes are often directed towards the land surface during winter at high latitudes, resulting in warmer forests due to their higher surface roughness inducing stronger turbulent heat fluxes; Liu et al., 2005). The observed magnitude of LST_max(f − o) tends to be larger than that of LST_avg(f − o), likely because the observed daytime effect is partly compensated by an opposing nighttime effect (Fig. 3b, c, and d). MODIS exhibits an overall cooling effect of forests on daily mean LST in all climate zones, including the snow climate zone where the sign of the difference changes seasonally (Fig. 8d). Further, this data set shows a slightly negative LST_min(f − o) in tropical and subtropical regions and even a positive LST_min(f − o) in the midlatitudes (Fig. 8g). This nighttime signal in the midlatitudes is observed in several observational studies but its source is not yet fully determined (Lee et al., 2011; Vanden Broucke et al., 2015; Li et al., 2015).
CLM-BASE generally captures the sign and magnitude of LST_avg(f − o) and LST_max(f − o) compared to MODIS (Fig. 8). The SeSCs used in CLM-BASE allow for larger LST differences between forest and open land than the default version of CLM4.5 (CLM-DFLT) on ShSCs, resulting in a better agreement with the observed magnitudes. This is because the GHF on ShSCs counteracts the soil temperature difference and thereby also the LST difference between forest and open land. Nevertheless, there are still some discrepancies between the LST signal in CLM-BASE and the MODIS observations. It appears that LST_avg(f − o) in CLM-BASE has a positive bias in the equatorial, the Arid, and the snow climate zones, and a negative bias in the T-wd and T-fh climate zones (Fig. 3b). LST_max(f − o) in CLM-BASE appears qualitatively similar to the MODIS observations (Fig. 8d, e, and f) but is biased positively in all climate zones (Fig. 3c). In contrast, daily minimum LST shows much larger discrepancies between CLM-BASE and MODIS (Fig. 8g, h, and i). In CLM-BASE, LST_min(f − o) is similar to LST_avg(f − o) and LST_max(f − o); i.e., forests have an overall nighttime cooling effect in all climate zones except for the neutral signal in the snow climate zones, whereas MODIS exhibits only a weak nighttime cooling effect in the tropical climate zones and a clear nighttime warming effect in all other climate zones (Fig. 3d). The weak performance of CLM-BASE in terms of LST_min(f − o) is also visible in the RMSD values, which are considerably larger than the mean LST_min(f − o) signal (compare Figs. 3d and 4d).
Interestingly, and in contrast to LST, CLM4.5 simulates a small year-round warming effect of forests on daily maximum 2 m air temperature (T2M, Fig. 9). This contradicts a number of observational studies which show that the T2M difference of forest minus open land (T2M(f − o)) has the same sign but is attenuated compared to LST(f − o) (Li et al., 2015; Vanden Broucke et al., 2015; Alkama and Cescatti, 2016; Li et al., 2016). The fact that we use offline simulations in our experiments might explain this behavior, because some land-atmosphere feedbacks are not represented. However, Lejeune et al. (2017) report similar discrepancies of T2M(f − o) in CLM with observational data for coupled simulations, suggesting that the behavior of T2M(f − o) in our simulations may not be related to the lack of atmospheric feedbacks.

Sensitivity experiment to alleviate ET biases in CLM4.5
In the previous section, striking discrepancies between the effect of forests in CLM-BASE and observation-based data were found for ET(f − o). An important driver responsible for these differences was identified to be VTR (Fig. 7). In addition, it became apparent that the SeSC configuration impairs ET(f − o) compared to the ShSC configuration (Fig. 6), despite improving LST avg (f − o) and LST max (f − o) (Fig. 8). Hence, in this section, we aim to improve the comparability of modeled ET(f − o) to observation-based results by testing a modified parameterization of VTR in a sensitivity experiment called CLM-PLUS. This model configuration comprises (1) a shallower root distribution for open land PFTs, (2) a modified plant water uptake scheme whereby plants only extract water from the 10 % of the roots with best access to SM, (3) altered light limitation of photosynthesis (decreased for C3 plants and increased for C4 plants), and (4) altered V_cmax values to alleviate ET biases at PFT level compared to the GETA data.
α(f − o) is only marginally affected by the modifications of CLM-PLUS compared to CLM-BASE (Fig. 3a). This is expected since the modifications are targeted at modifying VTR, which is not linked directly to albedo. ET(f − o) in CLM-PLUS becomes more positive than in CLM-BASE in all climate zones, thereby better matching the observation-based estimates (Fig. 3e). The improvement is also apparent in the RMSD values, which are reduced in CLM-PLUS for all data sets and climate zones, except for GETA in the E-h climate zone (Fig. 4e). The bias in average ET compared to GETA is smaller in CLM-PLUS than in CLM-BASE for all PFTs except for boreal deciduous needleleaf trees and crops (Fig. 3f). Some discrepancies with observation-based ET products nevertheless remain. ET(f − o) in CLM-PLUS is still mostly less positive compared to remote-sensing-based observations and GETA, and remains of opposite sign during the warm season in the temperate regions and in a narrow band around the Equator (Figs. 6 and 3e). This band originates from a negative ET(f − o) around the western part of the Equator in Africa and over Indonesia (Fig. 5). GLEAM and GETA observations cover these areas, which explains the only moderate reduction of RMSD of CLM-PLUS against GLEAM and the increase in RMSD against GETA in the E-h climate zone. On the other hand, the RMSD against MODIS is reduced considerably in CLM-PLUS, since MODIS observations are sparse over Africa and Indonesia (Fig. 4e). Also, relative to the in situ observations of von Randow et al. (2004), biases in CLM-PLUS are reduced, yet not completely eliminated (Table 1). As a consequence of the improved ET(f − o), we find that CLM-PLUS partly alleviates the positive bias in LST max (f − o) compared to the MODIS data, especially in the equatorial climate zone, which also reduces the RMSD in all but the Arid climate zone (Figs. 3c and 4c). This suggests that a realistic representation of ET(f − o) is crucial for resolving the underestimated cooling effect of forests on daily maximum LST. Similarly, RMSD of LST avg (f − o) decreases in the equatorial and Arid climate zones, whereas it increases in the temperate and snow climate zones (Fig. 4b). At the same time, the RMSD of LST min (f − o) is only marginally increased in all climate zones (Fig. 4d).

Discussion
The combination of SeSCs and the further modifications introduced in CLM-PLUS led to substantial improvements in CLM4.5's capability to represent the forest/open land contrast. Nevertheless, some biases still persist. In particular, CLM4.5 is still unable to represent the nighttime warming effect of forests in the midlatitudes exhibited by observational data (Lee et al., 2011; Zhang et al., 2014; Vanden Broucke et al., 2015; Li et al., 2015, 2016; Alkama and Cescatti, 2016). Additionally, there is a remaining positive bias of LST max (f − o) compared with MODIS even though this bias is alleviated to some extent due to the more positive ET(f − o). Inadequate representation or omission of several processes in CLM4.5 could be the source of these discrepancies with MODIS. The biases in both LST max (f − o) and LST min (f − o) could be alleviated by accounting for vegetation heat storage, a process which is currently disregarded in CLM4.5. Observed diurnal vegetation heat storage fluxes reach an amplitude of 10-20 W m−2 in the midlatitudes and high latitudes (McCaughey and Saxton, 1988; Lindroth et al., 2010; Kilinc et al., 2012) and 20-70 W m−2 in the tropics (Moore and Fisch, 1986; Meesters and Vugts, 1996; dos Santos Michiles and Gielow, 2008). Fluxes of this magnitude are sufficient to considerably alter the diurnal temperature cycle in forests and hence potentially resolve the discrepancies in LST max (f − o) and LST min (f − o) of CLM4.5 with MODIS.
While ET(f − o) in CLM-PLUS is improved against all the considered ET data sets in almost every climate zone, some biases persist, especially concerning the seasonality in the midlatitudes and high latitudes as well as annual mean values around the Equator. In CLM-PLUS, the focus was on VTR, thereby neglecting the contribution from soil and interception evaporation. However, soil evaporation is considerably lower over forests around the Equator in CLM-PLUS, which might explain the remaining negative ET(f − o) in this region. We therefore encourage additional sensitivity experiments which also focus on the other components of ET. When testing new model configurations, care should be taken that the implemented modifications do not impair other features of the model, related not only to the water but also the energy and carbon budgets. Reassuringly, we find that global ET averages are only weakly affected in the sensitivity experiment, with an average of 1.43 mm day−1 in CLM-BASE compared to 1.41 mm day−1 in CLM-PLUS. These values lie within the range of 1.2 to 1.5 mm day−1 estimated from surface water budgets (Wang and Dickinson, 2012). Nevertheless, it would be desirable in future studies to evaluate the biogeochemical effects of the different model configurations investigated here alongside the biogeophysical effects.
For comparison with LST data, we used the radiative temperature in CLM4.5 rather than the more common T2M diagnostic, which exhibits an observation-contradicting sign in CLM4.5 during daytime (compare Figs. 8e and 9). Such T2M-specific discrepancies with observations could be related to a differing definition of T2M over forests in the model and observations. For example, the differing sign of T2M max (f − o) in climate models using CLM and the observations of Lee et al. (2011) found in Lejeune et al. (2017) might be related to the fact that T2M observations were made 2 to 15 m above the forest canopy, whereas T2M of CLM4.5 lies within the forest canopy (Oleson et al., 2013). Therefore, T2M in CLM4.5 should be used with care when comparing to observations.
There are several factors which may affect the comparability of the signal extracted from our CLM4.5 simulations and the considered observational data sets. (1) The different data sources use differing land cover information. For example, GLEAM uses the MOD44B product which provides the fraction of each grid cell covered by trees, non-tree vegetation, and non-vegetated land surfaces, whereas MODIS uses the MCD12C1 product which provides the dominant International Geosphere-Biosphere Programme (IGBP) land cover type (Li et al., 2015; Martens et al., 2017). Further, the definition of forest and open land in the Li et al. (2015) data set can be a source of model-data discrepancy. The methodology applied by Li et al. (2015) relies on the definition of a threshold (80 %) in the coverage of forest (open land) for a pixel to be classified as forest (open land). There are therefore some mixing effects between the forest and open land categories in this data set, whereas our evaluation method isolates pure signals over forest and open land in CLM4.5. In fact, MODIS albedo retrievals were found to underestimate albedo over grass- and cropland, especially under the presence of snow, and overestimate it over forests due to the heterogeneity of land cover within pixels (Cescatti et al., 2012; Wang et al., 2014). Therefore, it is possible that the magnitude of α(f − o) is underestimated in MODIS rather than overestimated in CLM4.5. Consistently, in situ observations of paired forest and open land sites support the higher α(f − o) found in CLM-BASE (von Randow et al., 2004; Liu et al., 2005). (2) MODIS LST data are retrieved under clear-sky conditions only, whereas we do not mask out cloudy days in the evaluation of the CLM4.5 simulations.
(3) The overpass times of the MODIS satellite system are at 01:30 LT and 13:30 LT, hence not necessarily coinciding with the daily maximum and minimum LST in CLM4.5. (4) Finally, the meteorological conditions within one search window of MODIS may vary among the different pixels, whereas the different PFT tiles in our CLM4.5 simulations were subject to the exact same atmospheric forcing. However, Li et al. (2015) partly accounted for this effect by applying an elevation adjustment. Moreover, they found little sensitivity of the forest minus open land signal to the size of the chosen window.
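Factor (3) can be illustrated with a minimal sketch. It assumes an idealized sinusoidal diurnal LST cycle; the function names and all numerical values (means, amplitudes, peak hours) are hypothetical and not taken from CLM4.5 or MODIS. The sketch only shows that a forest minus open land difference sampled at fixed overpass times need not equal the difference of the true daily extremes when the two surfaces peak at different hours.

```python
import math

DAY_OVERPASS_H = 13.5    # 13:30 LT, as stated in the text
NIGHT_OVERPASS_H = 1.5   # 01:30 LT, as stated in the text


def diurnal_lst(hour, mean, amplitude, peak_hour):
    """Idealized sinusoidal LST (deg C) at local time `hour` (hypothetical shape)."""
    return mean + amplitude * math.cos(2.0 * math.pi * (hour - peak_hour) / 24.0)


def extremes_and_overpass(mean, amplitude, peak_hour, step_hours=0.25):
    """Return true daily max/min and the values seen at the two overpass times."""
    hours = [i * step_hours for i in range(int(24 / step_hours))]
    cycle = [diurnal_lst(h, mean, amplitude, peak_hour) for h in hours]
    return {
        "true_max": max(cycle),
        "true_min": min(cycle),
        "overpass_day": diurnal_lst(DAY_OVERPASS_H, mean, amplitude, peak_hour),
        "overpass_night": diurnal_lst(NIGHT_OVERPASS_H, mean, amplitude, peak_hour),
    }


# Hypothetical surfaces: the forest has a damped cycle that peaks one hour later.
forest = extremes_and_overpass(mean=20.0, amplitude=5.0, peak_hour=14.5)
open_land = extremes_and_overpass(mean=21.0, amplitude=9.0, peak_hour=13.5)

max_diff_true = forest["true_max"] - open_land["true_max"]
max_diff_sampled = forest["overpass_day"] - open_land["overpass_day"]
min_diff_true = forest["true_min"] - open_land["true_min"]
min_diff_sampled = forest["overpass_night"] - open_land["overpass_night"]

print(f"max (f - o): true {max_diff_true:.2f}, at overpass {max_diff_sampled:.2f}")
print(f"min (f - o): true {min_diff_true:.2f}, at overpass {min_diff_sampled:.2f}")
```

In this toy setup the overpass-based daytime contrast is slightly stronger than the contrast of the true maxima, purely because of the timing offset; the direction and size of such an effect in real data depend on the actual diurnal cycles.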
In this study, we focused on the contrast between forest and open land. However, we acknowledge that future studies should consider other types of land conversions or land management changes, as an increasing number of studies have demonstrated that LULCCs other than de- or reforestation also have remarkable biogeophysical effects (e.g., Davin et al., 2014; Malyshev et al., 2015; Naudts et al., 2016; Thiery et al., 2017; Chen et al., 2018). The two new observation-based data sets of Bright et al. (2017) and Duveiller et al. (2018) assess the biogeophysical consequences of a series of different LULCCs globally, thereby enabling the evaluation of the sensitivity to additional types of land cover in future studies. An additional advantage of these two studies is that they both provide a signal for a complete conversion from one land cover type to another (i.e., they do not rely on a coverage threshold, unlike the MODIS-based data). In our evaluation approach, we focus on the local climatic impact of forests, thereby neglecting feedback mechanisms between the atmosphere and the land surface. While they appear to be relevant in many climate models (Winckler et al., 2017; Devaraju et al., 2018), their evaluation is prevented by the lack of observations at the moment.

Conclusions
In this study, we evaluate the representation of the local biogeophysical effects of forests in CLM4.5, using recently published MODIS-based observations of the albedo, ET, and LST difference between forest and nearby open land. Given the uncertainties in observation-based ET estimates, we further extend our evaluation for this variable by including data from GLEAM v3.1a and GETA 2.0. In our model evaluation, we extract a local signal of forests by analyzing PFT-level model output, allowing for good comparability with the high-resolution satellite observations. Further, we use a modified version of CLM4.5 which attributes a separated soil column to each PFT, resulting in a more realistic subgrid contrast between forest and open land.
Overall, the lower albedo over forests in CLM4.5 is in line with the MODIS observations. However, the albedo contrast between forests and open land is somewhat more pronounced in the model. Ground observations support the stronger albedo contrast in CLM4.5, suggesting that MODIS albedo observations should be used carefully when contrasting different land cover types, as satellite observations tend to retrieve a mixed signal of various land cover types due to their limited spatial resolution. By suppressing lateral ground heat fluxes, the soil column separation considerably improved the representation of the impact of deforestation on daily mean and maximum LST, resulting in a good agreement with the MODIS observations. Both exhibit an overall cooling effect of forests on these variables, except for winter at latitudes exceeding 30°. Nevertheless, it appeared that the LST difference of forest minus open land in CLM4.5 tends to have a positive bias compared to observational studies. Also, it emerged that caution is required when comparing 2 m air temperature in CLM4.5 to observational data. This variable is only diagnostic in CLM4.5 and might not conform with measurements, despite realistic LST values. The nighttime warming effect of forests in the midlatitudes, which emerged in a number of recent observational studies, is not reproduced by CLM4.5. The biases in the daily maximum and minimum LST signals of forests might be at least partly alleviated by accounting for heat storage in the vegetation biomass. We therefore encourage a modification of CLM which enables the representation of canopy heat storage.
Observation-based ET estimates generally agree on higher ET over forests than open land throughout the year at low latitudes and during summer at midlatitudes and high latitudes. This was, however, not represented by the CLM4.5 configuration using separated soil columns. In fact, the soil column separation impaired the ET signal of forests in CLM4.5, despite improving the LST signal of forests considerably. Hence, a complete evaluation and verification of this modification of CLM4.5 should be undertaken before including it in future versions of CLM. We succeeded in attenuating the biases in ET and also daily maximum LST in a sensitivity experiment which incorporated modifications to four aspects of the parameterization of vegetation transpiration: the root distribution, a dynamic plant water uptake instead of the current static one, the light limitation, and the maximum rate of carboxylation.
Deforestation, historically the most important LULCC process, is still ongoing in large parts of South America, Africa, and southeast Asia. A realistic representation of the biogeophysical effects of LULCC in climate models is needed, as a number of observational studies revealed that they can have a considerable impact on the local climate. An appropriate representation of the effects of LULCC is not only a feature land surface models require to understand the climate of the past and project future climate but is also a chance to achieve a more realistic simulation of processes at the land surface. To this end, the analysis of model output at PFT level can help reveal model deficiencies that otherwise would have been hidden below the veil of grid-scale aggregation.
Data availability. CLM4.5 is publicly accessible as described in www.cesm.ucar.edu/models/cesm1.2/cesm/doc/usersguide/x290.html (CESM Software Engineering Group, 2014). The model output and modifications to the model code are available from the corresponding authors upon request. The GETA 2.01 database is available at http://sterlinglab.ca/databases (Sterling, 2018). GLEAM version 3.1a can be downloaded from https://www.gleam.eu/#downloads (Martens et al., 2017).

In CLM4.5, ET is strongly and positively correlated to SM at most locations, indicating that SM limitation exerts a strong control on the magnitude of ET (not shown). In CLM-DFLT, where SM is the same for all PFTs within a grid cell, forest mostly experiences higher SM stress except for the northern high-latitude winter (Fig. A4a). Once the SeSCs are introduced in CLM-BASE, the differences in the SM stress are also influenced by the differences in SM, which in turn are affected by the various ET rates over forest and open land.
In other words, it is possible that forests experience less SM stress than open land but only because they evaporate less water, and vice versa (Fig. A4b). We argue that the difference in the SM stress of forest minus open land in CLM-DFLT is more representative, because it is unaffected by the ET rates of the individual PFTs in this model configuration. Under this assumption, forests are often more SM-limited than open land in CLM4.5. In contrast, two observational studies comparing SM profiles of forest and nearby pasture sites in the Amazon reveal that forests have a considerably higher capacity to access water from the soil below a depth of 2 m (Jipp et al., 1998; von Randow et al., 2004). Further, there are a number of studies reporting increased forest ET during the dry season due to the higher amount of incoming shortwave radiation, whilst the response is the opposite over pasture (Jipp et al., 1998; da Rocha et al., 2004; von Randow et al., 2004; Huete et al., 2006; Saleska et al., 2007). Altogether, these studies indicate that forest ET should be less SM-limited than open land ET. It is thus possible that forests experience too high and/or open land too little SM stress in CLM4.5. CLM4.5 accounts for SM stress on VTR through a stress function β_t, which ranges from 0 (when soil moisture limitation completely suppresses VTR) to 1 (corresponding to no SM limitation on VTR). This function is calculated according to Eq. (A1) as the sum of the root fraction in each soil layer (r_i) multiplied by a PFT-dependent wilting factor (w_i). The original root distributions in CLM4.5 were adapted from Zeng (2001) and are rather similar for all PFTs, especially for needleleaf trees, broadleaf deciduous trees, and grassland in the lower part of the soil (Fig. A5). Therefore, there is no considerable difference in the default configuration of CLM4.5 regarding the ability to extract water from the lower part of the soil between forests and open land PFTs (except for broadleaf evergreen trees). Furthermore, all tree PFTs have a less negative soil matrix potential at which the stomata are fully closed and opened than the open land ones; i.e., tree PFTs have their permanent wilting point at a higher SM content than open land and hence use water more conservatively. In order to increase SM limitation for open land PFTs and thus reduce their ability to extract water from the lower part of the soil, we conduct a sensitivity experiment, called CLM-ROOT, with a much shallower root distribution for open land PFTs. The new values for the root distribution factors (r_a and r_b) are shown in Table A1 and the resulting root distribution in Fig. A5.
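The link between the root distribution factors and the stress function can be sketched as follows. This is a minimal illustration, assuming the two-parameter exponential root profile commonly attributed to Zeng (2001) and the sum of r_i times w_i described in the text; the layer depths, the r_a and r_b values, and the wilting factors below are hypothetical and are not the values of Table A1.

```python
import math


def root_fractions(layer_bottoms_m, r_a, r_b):
    """Root fraction per soil layer from a two-parameter exponential profile.

    Assumed form: cumulative root fraction above depth z is
    1 - 0.5 * (exp(-r_a * z) + exp(-r_b * z)); the deepest layer
    receives all remaining roots so that the fractions sum to 1.
    """
    def cumulative(z):
        return 1.0 - 0.5 * (math.exp(-r_a * z) + math.exp(-r_b * z))

    fractions = []
    top = 0.0
    for i, bottom in enumerate(layer_bottoms_m):
        is_last = i == len(layer_bottoms_m) - 1
        upper = 1.0 if is_last else cumulative(bottom)
        fractions.append(upper - cumulative(top))
        top = bottom
    return fractions


def beta_t(root_frac, wilting_factor):
    """Soil moisture stress factor: sum over layers of r_i * w_i (0 = full stress)."""
    return sum(r * w for r, w in zip(root_frac, wilting_factor))


# Hypothetical layer bottoms (m) and a dry-top, wet-bottom wilting factor profile.
layer_bottoms = [0.1, 0.3, 0.7, 1.5, 3.0]
wilting = [0.0, 0.1, 0.4, 0.8, 1.0]

deep = root_fractions(layer_bottoms, r_a=6.0, r_b=2.0)       # illustrative "default-like" profile
shallow = root_fractions(layer_bottoms, r_a=12.0, r_b=10.0)  # illustrative shallower profile

beta_deep = beta_t(deep, wilting)
beta_shallow = beta_t(shallow, wilting)
print(f"beta_t deep roots:    {beta_deep:.3f}")
print(f"beta_t shallow roots: {beta_shallow:.3f}")
```

With a dry upper soil, concentrating roots near the surface lowers β_t in this sketch, which is the qualitative mechanism by which a shallower root distribution increases SM limitation for open land PFTs.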
The modified root distributions strongly reduce the ET of non-arctic open land PFTs, especially ET of C4 grass (Table A5). Also, the ET of grassland at the location of the pasture site in the Amazon in the study of von Randow et al. (2004) is considerably reduced during the dry period, even overcompensating the positive bias in CLM-BASE (Table A6). On the other hand, it does not affect ET during the wet season, when ET is not SM-limited. Overall, this experiment reveals that modifying the root distribution has high potential to alleviate biases of CLM4.5 in ET, except for the arctic region, where temperature and incoming shortwave radiation are likely the main factors limiting VTR.

A2 Sensitivity to dynamic plant water uptake
In the tropics, forests often exhibit increased ET during dry periods, due to increased light availability (da Rocha et al., 2004; Huete et al., 2006; Saleska et al., 2007), even though the upper soil is dry, as they still have sufficient water supply from the lower part of the soil (Jipp et al., 1998; von Randow et al., 2004). We aim to allow a similar behavior in CLM4.5 by introducing a dynamic plant water uptake, where plants only extract water from the 10 % of the roots with the highest wilting factor (i.e., best access to SM) for the calculation of the β_t factor and the extraction of soil water (example in Fig. A6). The resulting model simulation, called CLM-10PER, was conducted by adding this modification to the configuration from the CLM-ROOT experiment. This modification generally reduces SM stress for plants and hence increases ET for all non-arctic PFTs (Table A5). Its impact is limited for arctic PFTs, where temperature and shortwave radiation are more important limiting factors of VTR than water availability. A notable improvement can be observed for tropical deciduous broadleaf trees, for which average ET is increased by 0.11 mm day−1, thereby alleviating the negative bias compared to GETA. Furthermore, it improves the seasonal dynamics of forest ET in the tropics. With the 10 % modification, forests show increased ET during the dry period at the forest site of da Rocha et al. (2004)
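The 10 % rule described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the layer values are invented, and the renormalization by the used root fraction (so that β_t stays between 0 and 1) is an assumption; the authors' worked example is the one in Fig. A6.

```python
def beta_t_default(root_frac, wilting_factor):
    """Default stress factor: sum over all layers of r_i * w_i."""
    return sum(r * w for r, w in zip(root_frac, wilting_factor))


def beta_t_top_fraction(root_frac, wilting_factor, used_fraction=0.10):
    """Stress factor computed only from the roots with the highest wilting factor.

    Layers are visited from best to worst access to soil moisture, and roots
    are taken until `used_fraction` of all roots has been collected.
    Assumption: the result is renormalized by `used_fraction`.
    """
    remaining = used_fraction
    weighted = 0.0
    for w, r in sorted(zip(wilting_factor, root_frac), reverse=True):
        take = min(r, remaining)   # cannot take more roots than the layer holds
        weighted += w * take
        remaining -= take
        if remaining <= 0.0:
            break
    return weighted / used_fraction


# Hypothetical five-layer example: most roots sit in the dry upper soil.
root_frac = [0.45, 0.30, 0.15, 0.07, 0.03]
wilting = [0.0, 0.1, 0.3, 0.8, 1.0]

beta_default = beta_t_default(root_frac, wilting)
beta_10per = beta_t_top_fraction(root_frac, wilting)
print(f"default beta_t: {beta_default:.3f}")
print(f"10 % beta_t:    {beta_10per:.3f}")
```

In this toy profile the few deep roots in moist soil dominate the 10 % calculation, so the stress factor is much closer to 1 than in the default sum, consistent with the reduced SM stress reported for CLM-10PER.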

A3 Sensitivity to light limitation
As arctic PFTs are only weakly affected by the previously introduced modifications of SM stress as well as the maximum rate of carboxylation described in the next section, we performed a sensitivity experiment with altered light limitation, which is called CLM-LIGHT. Since ET values are strongly negatively biased for boreal deciduous broadleaf trees and C3 arctic grass (Table A5), the light limitation of photosynthesis for C3 plants was lessened by increasing the factor 0.5 in Eq. (8.7) of Oleson et al. (2013) to 0.6. Because ET of C4 grass exhibits a strong positive bias, their quantum efficiency was reduced from 0.05 to 0.025 mol CO2 mol−1 photon, thereby increasing their light limitation.
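The direction of these two changes can be illustrated with a short sketch. It assumes commonly used forms of the light-limited rates: a C4 rate proportional to quantum efficiency times absorbed photons, and a C3 electron transport rate given by the smaller root of a quadratic in which the light term carries the factor 0.5 (or 0.6). The function names, the conversion factor, the photosystem II yield, the curvature, and the radiation and J_max values are assumptions for illustration; the exact equations should be checked against Oleson et al. (2013).

```python
import math

PHOTONS_PER_JOULE = 4.6  # µmol photons per J of absorbed PAR (assumed conversion)


def c4_light_limited_rate(absorbed_par_w_m2, quantum_efficiency):
    """Assumed C4 light-limited rate: quantum efficiency times absorbed photon flux."""
    return quantum_efficiency * PHOTONS_PER_JOULE * absorbed_par_w_m2


def c3_electron_transport(absorbed_par_w_m2, j_max, light_factor,
                          phi_psii=0.85, theta=0.7):
    """Assumed C3 electron transport rate J.

    J is the smaller root of theta*J**2 - (I + j_max)*J + I*j_max = 0,
    with I = light_factor * phi_psii * 4.6 * absorbed PAR.
    """
    light_term = light_factor * phi_psii * PHOTONS_PER_JOULE * absorbed_par_w_m2
    b = light_term + j_max
    discriminant = b * b - 4.0 * theta * light_term * j_max
    return (b - math.sqrt(discriminant)) / (2.0 * theta)


par = 100.0          # hypothetical absorbed PAR (W m-2)
j_max = 100.0        # hypothetical maximum electron transport rate (µmol m-2 s-1)

j_default = c3_electron_transport(par, j_max, light_factor=0.5)
j_new = c3_electron_transport(par, j_max, light_factor=0.6)
c4_default = c4_light_limited_rate(par, quantum_efficiency=0.05)
c4_new = c4_light_limited_rate(par, quantum_efficiency=0.025)

print(f"C3 electron transport: factor 0.5 -> {j_default:.1f}, factor 0.6 -> {j_new:.1f}")
print(f"C4 light-limited rate: 0.05 -> {c4_default:.2f}, 0.025 -> {c4_new:.2f}")
```

Under these assumptions, raising the factor from 0.5 to 0.6 increases the C3 electron transport rate (less light limitation), while halving the quantum efficiency halves the C4 light-limited rate (more light limitation).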
Altering the light limitation of photosynthesis impacts ET in all climate zones (Table A5). Its impact is strongest in the tropics and remains small in boreal regions. Of the C3 PFTs, tropical evergreen broadleaf trees are affected most strongly. The implemented modification alleviates the negative ET bias for evergreen broadleaf trees during the dry season but slightly increases the positive bias during the wet season, overall still leading to a further improvement of the difference between the two seasons (Table A6). Additionally, the increased light limitation reduces ET of C4 grass during the wet season, similar to the observations over the grassland site in von Randow et al. (2004). This is likely responsible for the increased ET during the dry season as well, since the reduced SM consumption during the wet season is carried over to the following dry season, therefore reducing the SM stress.

A4 Sensitivity to the maximum rate of carboxylation
V_cmax appears to be a suitable parameter to tune VTR values, since it is not well constrained from observations and VTR in models is highly sensitive to this parameter (Bonan et al., 2011). In CLM4.5, the values reported by Kattge et al. (2009) are used except for tropical evergreen broadleaf trees, for which a higher value was chosen to alleviate model biases (Bonan et al., 2012; Oleson et al., 2013). In order to test the sensitivity of the PFT-specific ET values to V_cmax, we conduct a final sensitivity experiment with new values of this parameter in addition to the other modifications presented beforehand, with the aim of alleviating the biases relative to GETA (Table A1). Additionally, the minimum stomatal conductance of C4 plants, which is by default 4 times larger than that of C3 plants, was reduced from 40 000 µmol m−2 s−1 to 20 000 µmol m−2 s−1 (see Eq. 8.1 in Oleson et al., 2013) in this sensitivity experiment, which we call CLM-PLUS.
As already shown by Bonan et al. (2011), photosynthetic activity of C3 PFTs is strongly influenced by the choice of V_cmax, except for the boreal ones where light or temperature are more important limiting factors of photosynthesis. The CLM-PLUS simulation alleviates biases in ET averaged for the individual PFTs compared to GETA, in particular by reducing ET over temperate evergreen needleleaf trees, both temperate and tropical evergreen broadleaf trees, and C4 grass, as well as by increasing ET of tropical deciduous broadleaf trees (Table A5). The mismatch between results of CLM4.5 and the in situ measurements of von Randow et al. (2004), on the other hand, supports a stronger tuning for this particular PFT in order to further reduce its ET.

Figure 2. Seasonal and latitudinal variations of α(f − o) in (a) the MODIS observations and (b) CLM-BASE. Points with a mean which is insignificantly different from zero in a two-sided t test at 95 % confidence level are marked with a black dot. Only grid cells containing valid data in the MODIS observations were considered for the analysis of CLM-BASE. All data from the 2002-2010 analysis period corresponding to a given latitude and a given month are pooled to derive the sample set for the test. Panel (c) shows the zonal annual mean of both MODIS (in green along with the range between the 10th and 90th percentiles in grey) and CLM-BASE (in red, the range between the 10th and 90th percentiles in orange). Note that on this subfigure results have been smoothed with a 4° latitudinal running mean.

Figure 6. Seasonal and latitudinal variations of ET(f − o) in (a) the MODIS and (b) GLEAM observations, (c) CLM-DFLT, (d) CLM-BASE, and (e) CLM-PLUS. Points with a mean which is insignificantly different from zero in a two-sided t test at 95 % confidence level are marked with a black dot. All data from the 2002-2010 analysis period corresponding to a given latitude and a given month are pooled to derive the sample set for the test.

Figure 7. Seasonal and latitudinal variations of ET(f − o) in (a) MODIS, (b) GLEAM, and difference of forest minus open land in (c) total ET, (d) soil evaporation, (e) canopy interception evaporation, and (f) vegetation transpiration in CLM-BASE. Points with a mean which is insignificantly different from zero in a two-sided t test at 95 % confidence level are marked with a black dot.

Figure 8. Seasonal and latitudinal variations of LST avg (f − o) in (a) the MODIS observations, (b) CLM-DFLT, and (c) CLM-BASE. Points with a mean which is insignificantly different from zero in a two-sided t test at 95 % confidence level are marked with a black dot. Only grid cells containing valid data in the MODIS observations were considered for the analysis of CLM-DFLT and CLM-BASE. All data from the 2002-2010 analysis period corresponding to a given latitude and a given month are pooled to derive the sample set for the test. Panel (d) shows the zonal annual mean of MODIS (green, range between the 10th and 90th percentiles in grey), CLM-DFLT (blue, range between the 10th and 90th percentiles in blue), and CLM-BASE (red, range between the 10th and 90th percentiles in orange). Note that on this subfigure results have been smoothed with a 4° latitudinal running mean. The same was done for LST max (f − o) in panels (e), (f), (g), and (h), and for LST min (f − o) in panels (i), (j), (k), and (l).

Figure 9. Seasonal and latitudinal variations of (a) daily maximum T2M difference of forest minus open land and (b) LST max (f − o) in CLM-BASE. Points with a mean which is insignificantly different from zero in a two-sided t test at 95 % confidence level are marked with a black dot. All data from the 2002-2010 analysis period corresponding to a given latitude and a given month are pooled to derive the sample set for the test. Only grid cells containing valid data in the MODIS observations were considered for the analysis.

Figure A3. Difference in vertically averaged annual mean soil temperature of forest minus open land in CLM-BASE.

Figure A4. Seasonal and latitudinal variations of β_t-factor differences of forest minus open land in (a) CLM-DFLT and (b) CLM-BASE. Points with a mean which is insignificantly different from zero in a two-sided t test at 95 % confidence level are marked with a black dot.

Figure A5. Vertical root fraction distribution of the different PFTs in the default version of CLM4.5 and (in light blue) the modified root fraction distribution of open land PFTs used in CLM-PLUS. The asterisks mark the reported maximum rooting depths of Fan et al. (2017) for annual grass (yellow), evergreen needleleaf trees (dark blue), deciduous broadleaf trees (light green), and evergreen broadleaf trees (dark green).

Figure A6. Example of the calculation of the β_t factor with the 10 % modification. Shown are five soil layers, with the fraction of the roots in these layers in brown and the wilting factor in blue. On the bottom, the calculation of β_t for this particular example with the 10 % modification (β_t^10PER) and the default calculation in CLM4.5 (β_t^DFLT), assuming the roots not shown have a wilting factor of 0. The root fractions eventually used to calculate β_t^10PER are shaded in red.

Figure A8. Seasonal and latitudinal variations of ET(f − o) in CLM-DFLT for (a) all tree PFTs minus open land, (b) deciduous tree PFTs only minus open land, and (c) evergreen tree PFTs only minus open land. Points with a mean which is insignificantly different from zero in a two-sided t test at 95 % confidence level are marked with a black dot. All data from the 2002-2010 analysis period corresponding to a given latitude and a given month are pooled to derive the sample set for the test. The same is done for CLM-BASE in panels (d), (e), and (f).

Table 1. ET and latent heat flux in situ observations from various studies and the values in CLM-BASE and CLM-PLUS at the respective locations. EBT indicates broadleaf evergreen tree, DBT indicates broadleaf deciduous tree, and ENT indicates evergreen needleleaf tree.

−1 relative to GETA. The biases of these PFTs can have a large effect on the overall ET(f − o) as they cover a large proportion of the land surface (9.5, 8.0, and 8.0 %, respectively). Similarly, CLM-BASE overestimates ET compared to in situ measurements conducted over a pasture site in the Amazon by von Randow et al. (2004) and underestimates ET compared to the two forest sites in Alaska reported in the study of Liu et al. (2005) (Table 1). Interestingly, deciduous trees are mostly responsible for this discrepancy in ET(f − o) at latitudes below 30° (Fig. A8). In the midlatitudes, on the other hand, both deciduous and evergreen trees show lower ET than open land

The land cover specific variables used in this study are available upon request from Brecht Martens. The MODIS-based data are available from Yan Li upon request.

Table A1. The PFT-specific values of V_cmax (µmol m−2 s−1), r_a, and r_b in the default version of CLM4.5 and in CLM-PLUS.

Table A3. The land cover types from Ambrose and Sterling (2014) (GETA) used in this study and the numbers of the respective PFTs in CLM4.5 applied to the different land cover types (Table A2).

Table A4. Overview of the different modifications of CLM4.5 incorporated in the simulations presented in this study.

Table A5. Area-weighted annual mean ET for each PFT analyzed in this study according to the GETA data and in the different configurations of CLM4.5 and fraction of the land surface covered by the different PFTs. The global integral of annual ET is listed at the bottom.
Biogeosciences, 15, 4731-4757, 2018, www.biogeosciences.net/15/4731/2018/

Table A6. ET and latent heat flux in situ observations from various studies and the values of the different CLM4.5 sensitivity tests at the respective locations.