Coincidences of climate extremes and anomalous vegetation responses: comparing tree ring patterns to simulated productivity

Climate extremes can trigger exceptional responses in terrestrial ecosystems, for instance by altering growth or mortality rates. Such effects are often manifested in reductions in net primary productivity (NPP). Investigating a Europe-wide network of annual radial tree growth records confirms this pattern: we find that 28 % of tree ring width (TRW) indices are below two standard deviations in years in which extremely low precipitation, high temperatures or the combination of both noticeably affect tree growth. Based on these findings, we investigate possibilities for detecting climate-driven patterns in long-term TRW data to evaluate state-of-the-art dynamic vegetation models such as the Lund-Potsdam-Jena dynamic global vegetation model for managed land (LPJmL). The major problem in this context is that LPJmL simulates NPP but not explicitly the radial tree growth, and we need to develop a generic method to allow for a comparison between simulated and observed response patterns. We propose an analysis scheme that quantifies the coincidence rate of climate extremes with some biotic responses (here TRW or simulated NPP). We find a relative reduction of 34 % in simulated NPP during precipitation, temperature and combined extremes. This reduction is comparable to the TRW response patterns, but the model responds much more sensitively to drought stress. We identify 10 extreme years during the 20th century during which both model and measurements indicate high coincidence rates across Europe. However, we detect substantial regional differences in simulated and observed responses to climatic extreme events. One explanation for this discrepancy could be the tendency of tree ring data to originate from climatically stressed sites. The difference between model and observed data is amplified by the fact that dynamic vegetation models are designed to simulate mean ecosystem responses on landscape or regional scales. 
We find that both simulation results and measurements display carry-over effects from climate anomalies during the previous year. We conclude that radial tree growth chronologies provide a suitable basis for generic model benchmarks. The broad application of coincidence analysis in generic model benchmarks along with an increased availability of representative long-term measurements and improved process-based models will refine projections of the long-term carbon balance in terrestrial ecosystems.


Introduction
A. Rammig et al.: Tree ring extremes as model benchmarks. Published by Copernicus Publications on behalf of the European Geosciences Union.

Extreme climate events are known to trigger exceptional responses in terrestrial ecosystems (Reyer et al., 2012; Smith, 2011; Zscheischler et al., 2014a, c). Understanding which ecosystem processes exceed their natural range of variability in the wake of environmental extremes is crucial for anticipating the fate of land ecosystems under climate change scenarios (Cotrufo et al., 2011; Jentsch et al., 2011). For instance, anomalous ecosystem responses induced by drought events (Schwalm et al., 2012) may decrease the economic returns from forest ecosystems (Hanewinkel et al., 2013) or lead to substantial net CO2 emissions, amplifying climate change (Reichstein et al., 2013). One prominent example is the 2003 heat wave in Europe, when carbon emissions of ∼ 0.5 Pg C yr⁻¹ were released from forests that usually act as carbon sinks (Ciais et al., 2005; Janssen et al., 2003). However, it is important to note that extreme events may have differential effects in different biomes, e.g. enhanced vegetation growth during the 2003 heat wave at high elevations in the Alps (Jolly et al., 2005).
Water stress and high temperatures reduce evapotranspiration and productivity in many mid- and low-latitude areas (Granier et al., 2007). Yet, the general applicability of such studies is challenged by the different climate responses of forests across biomes and tree species (Babst et al., 2013b; Granier et al., 2007; Lindner et al., 2010). Furthermore, the extent to which increasing amounts of atmospheric CO2 may serve as a buffer against drought by enhancing water use efficiency is being discussed (e.g. Andreu-Hayles et al., 2011; Penuelas et al., 2011; Keenan et al., 2013). The magnitude of these competing effects remains poorly constrained and may differ among tree species or tree age classes (Gedalof and Berg, 2010) and likewise depends on nutrient availability (e.g. Norby et al., 2010). Another known, yet understudied aspect of forest growth dynamics is the role of lagged responses to the previous year's climate extremes. These can influence current forest productivity, e.g. via decreased nonstructural carbohydrate reserves (e.g. Dietze et al., 2014; Fritts, 1976; Richardson et al., 2013) or altered mortality rates (Bréda et al., 2006; Moreno et al., 2013). Carbon sequestered in the second half of the growing season is generally not used for radial growth but supports a combination of cell-wall thickening and storage (Babst et al., 2014a).
The growing recognition of the role that climate extremes play in land ecosystems (e.g. Williams et al., 2014; Zscheischler et al., 2013) and the implications for global carbon cycling require developing new data analysis tools. While there is a large body of literature on the quantification of extremes in climate variables, we lack techniques to quantify extremes in biospheric responses and, most importantly, a solid framework for linking climatic and biospheric extreme events (Smith, 2011). A suitable methodological approach in this direction requires detecting both instantaneous and lagged responses of a biospheric variable (e.g. tree ring width index, TRW, and net primary productivity, NPP) to climatic extremes (e.g. temperature, precipitation or the combination of both). We therefore propose a generic method to evaluate the impacts of climate extremes on biospheric variables; this method quantifies the coincidence rates of extremes in long climatic and biospheric time series (> 50 years). Thereby, we create a unit-free metric that enables us to compare different measures of vegetation productivity. We apply coincidence analysis, a method that was put forward by Donges et al. (2011) in a different context. We exemplify our approach by evaluating a set of European tree ring data and the output from a dynamic vegetation model (LPJmL).
We focus on exploring the potential of annual radial growth increments (tree ring chronologies) for model evaluation purposes. This data source is recognized as one of the few opportunities for quantifying ecosystem responses to multiple extreme events on long-term timescales (Babst et al., 2012). Tree ring chronologies can, with certain restrictions, be regarded as proxies for the variability in stand-scale productivity and offer a possibility to relate long-term tree growth to climate fluctuations and extremes on regional to continental scales (e.g. Babst et al., 2012; Battipaglia et al., 2010). Likewise, tree rings show pronounced lagged effects and a positive relationship with the previous fall's climate (Wettstein et al., 2011). Depending on their sign, climate anomalies in this season may either enhance or mitigate the impact of extremes on forest growth in the subsequent year because they directly affect the growing season length and, related to this, the replenishment of carbon storage (Kuptz et al., 2011). Also, the interaction of carbon accumulation with seed production (i.e. mast years) may sometimes lead to low-growth anomalies regardless of climatic conditions. Such masting events and non-climatic drivers of forest growth (e.g. management or disturbances) may challenge the interpretation of biotic responses to climate extremes because they alter carbon allocation patterns (Mund et al., 2010). Nevertheless, tree ring chronologies are widely regarded as robust and unique long-term indicators of biospheric responses to climate anomalies (Babst et al., 2014b; Jones et al., 2009; Pederson et al., 2014).
Despite extensive observational studies, the impacts of extreme events under current and past environmental conditions remain insufficiently documented. This is a natural consequence of the low occurrence probability of the events along with chronically scarce long-term observations (Innes, 1998; Smith, 2011). Hence, it is difficult to project the impacts of expected changes in frequencies and intensities of extreme events (Barriopedro et al., 2011; Field et al., 2012) on the terrestrial carbon cycle (Reichstein et al., 2013). In this context terrestrial biosphere models play a crucial role in quantifying the impact of climate extremes on the terrestrial carbon cycle and, most importantly, on NPP (Keenan et al., 2012; Zscheischler et al., 2014b; Williams et al., 2014). One prerequisite is, however, that models are well tested for their capacity to reproduce the relevant signatures of extreme impacts in the recent past. Year-to-year variation and impacts of extreme events in these models are best reflected by simulated NPP, and we analyse the impact of climate extremes on simulated NPP within our coincidence framework, thereby considering the role of lagged events.
Our scale-free approach allows us to directly compare response patterns identified in the observed tree rings with simulated productivity, which is a straightforward way of testing models for their capacity to reproduce the relevant signatures of extreme impacts in the recent past. Indeed, it has been recognized that it is essential for advanced modelling studies to converge to suitable benchmarks for testing terrestrial biosphere models (cf. Dalmonech and Zaehle, 2013; Kelley et al., 2013; Luo et al., 2012).
For instance, Luo et al. (2012) conclude that suitable model benchmarks are characterized by "objectivity, effectiveness, and reliability for evaluating model performance". Hence, the goal has to be a suite of metrics that can embrace characteristic response functions. Along these lines, our study evaluates the potential of the coincidence analysis framework to become an element of a generic model benchmarking system. We address this issue by working on the following specific questions:
- do state-of-the-art dynamic vegetation models agree with observed responses to climate extremes?
- how can long-term observations help us understand biotic responses to extreme events?
Materials and methods

Measurements of tree ring widths
We compiled TRW chronologies from 606 sites across Europe and parts of northern Africa (10 For the detection of growth extremes, low frequency variability including the biological age/size trend characteristic of tree ring data was removed from the constituent tree ring time series at each site using a spline detrending with a 50 % frequency cutoff response at 30 years (Babst et al., 2012). Prior to detrending, the variance of each time series was stabilized using an adaptive power transformation as described by Cook and Peters (1997), and the mean tree ring chronologies were corrected for changes in sample replication (Frank et al., 2007) to reduce biases in the detection of growth extremes induced by variance changes over time. The tree ring detrending and standardization procedure converts the tree ring width data into dimensionless indices (so-called tree ring width indices, TRWs) with a mean of approximately unity. The tree ring data set spans most of terrestrial Europe, but is not evenly distributed across the continent (see Babst et al., 2013, and Fig. 3). Conifer sites are most frequent in Scandinavia, in the Alpine region and in the Mediterranean, while broadleaved species are predominantly located in central Europe and northern Spain (Babst et al., 2013).

Climate data
We  (Dee et al., 2011). Daily temperature, precipitation and solar radiation were used to drive the model runs. For the coincidence analysis with TRW and simulated NPP, we calculate mean annual temperature (T) and annual precipitation sums (P) over the growing season from the climate data set (see below).

Simulated net primary productivity (NPP)
Simulations of monthly NPP are performed with the dynamic global vegetation model LPJmL (Bondeau et al., 2007; Sitch et al., 2003) with a fully coupled carbon and water cycle (Gerten et al., 2004). The model is driven by temperature, radiation, precipitation and atmospheric CO2 concentration. The productivity of vegetation (GPP) for each plant functional type (PFT) is simulated by a process-based photosynthesis scheme based on Farquhar et al. (1980) that adjusts carboxylation capacity and leaf nitrogen seasonally and within the canopy profile (Haxeltine and Prentice, 1996). Net primary production (NPP) is derived by subtracting maintenance and growth respiration from GPP. LPJmL simulates the allocation of accumulated carbon to the plant's compartments (leaves, stem, root and reproductive organs) according to allometric constraints. Responses of the modelled vegetation to climate extremes include the inhibition of photosynthesis and increased maintenance respiration at high temperatures, and reduced stomatal conductance and thus reduced photosynthesis under water stress.
For the present study, we ran LPJmL in its natural vegetation mode, not considering land management and land-use change. Process-based simulation of fire is included by the so-called SPITFIRE model, which is coupled to LPJmL (Thonicke et al., 2010). Simulation runs were performed at 0.  2004) and Sitch et al. (2003).

Determination of the growing season for TRW and simulated NPP
To determine the length of the growing season (GS_obs), we use the fraction of absorbed photosynthetically active radiation (FAPAR) derived from remote sensing with the Moderate-resolution Imaging Spectroradiometer (MODIS; Pinty et al., 2011) and interpolated to daily values at each geographical coordinate of a tree ring time series. The MODIS-TIP (two-stream inversion package) provides broadband albedo values resulting in time series that are comparable across vegetation types. Given that there is no commonly accepted approach to derive growing season length from remote sensing observations via a threshold approach (see, e.g., White et al., 2009), we apply a mixture of absolute and relative heuristic criteria. First, we flag days as non-growing season when FAPAR values drop below 0.12 or are below −0.8 standard deviations of FAPAR. To ensure that the second criterion does not affect evergreen sites, we reset all values > 0.43 to growing season. Importantly, we consider only the longest phase of points that follow these criteria to be GS_obs, and we only allow for one single GS_obs in Europe as we assume that double growing seasons do not play a substantial role in the latitudes under scrutiny. The dynamic definitions of the GS_obs derived from FAPAR allow deriving typical growing seasons at each geographical point, thereby assuming that the average climate data for these days of the year are robust proxies for the growing season throughout the entire observational period.
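The FAPAR criteria for the observed growing season can be illustrated with a minimal Python sketch. It assumes that "below −0.8 standard deviations" refers to the standardized anomaly of the daily FAPAR series at a site, which the text does not state explicitly. The function and parameter names (`growing_season_from_fapar`, `abs_min`, `z_min`, `evergreen`) are hypothetical and not taken from the original analysis.

```python
import numpy as np


def longest_true_run(mask):
    """Return (start, stop) of the longest run of True values (stop exclusive)."""
    best_start, best_stop = 0, 0
    run_start = None
    # A trailing False closes a run that reaches the end of the series.
    for i, flag in enumerate(list(mask) + [False]):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start > best_stop - best_start:
                best_start, best_stop = run_start, i
            run_start = None
    return best_start, best_stop


def growing_season_from_fapar(fapar, abs_min=0.12, z_min=-0.8, evergreen=0.43):
    """Longest growing-season phase from one year of daily FAPAR values."""
    fapar = np.asarray(fapar, dtype=float)
    # Assumption: the relative criterion uses the z-score of the series itself.
    z = (fapar - fapar.mean()) / fapar.std()
    non_growing = (fapar < abs_min) | (z < z_min)
    # High absolute FAPAR always counts as growing season (evergreen sites).
    non_growing[fapar > evergreen] = False
    return longest_true_run(~non_growing)
```

For a synthetic deciduous-like series with low FAPAR in winter and high FAPAR in summer, the function returns the day-of-year range of the summer plateau; for a series that stays above 0.43 all year it returns the whole year.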
To determine the length of the growing season for each simulated grid cell (GS_sim), we use the simulation results for NPP. We define GS_sim as the longest period of consecutive months per year in which the monthly NPP is greater than 0.
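The definition of the simulated growing season translates directly into a search for the longest run of positive values. The following sketch treats each calendar year on its own (no wrap-around across the year boundary), which is an assumption; the function name `simulated_growing_season` is illustrative only.

```python
from itertools import groupby


def simulated_growing_season(monthly_npp):
    """Return (first, last) 0-based month indices of the longest run with NPP > 0.

    Returns None if no month has positive NPP.
    """
    best = None
    index = 0
    # groupby splits the year into consecutive runs of positive / non-positive NPP.
    for positive, group in groupby(monthly_npp, key=lambda v: v > 0):
        length = len(list(group))
        if positive and (best is None or length > best[1] - best[0] + 1):
            best = (index, index + length - 1)
        index += length
    return best
```

If two runs have the same length, this sketch keeps the earlier one; the text does not say how such ties are resolved.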

Preprocessing of climate data, TRW and simulated NPP for coincidence analysis
The coincidence analysis requires pairing each point in the TRW data set with local T and P variability. Accordingly, each TRW site is associated with the site-specific (i.e. geographically encompassing) climate grid cell of the WATCH-ERA-Interim data at 0.25° × 0.25° spatial resolution. In this way, we obtain 606 pairs of time series representing tree ring growth and climatological data. Monthly temperature and precipitation data are averaged and summed, respectively, over the growing season (see Sect. 2.1.5). The maximum temporal overlap between each pair of time series determines the length of the period for coincidence analysis.
To obtain pairs of time series for the comparison of simulated NPP with P and T in the same way as for TRW, we calculate the sum of simulated NPP over the growing season (see Sect. 2.1.5). Analogously to the coincidence analysis between TRW and climate data, we compute the average temperature and total precipitation over the growing season, where NPP > 0. We obtain pairs of simulated NPP and climate drivers for each grid cell.
The TRW data set consists of 606 time series at selected measurement sites throughout Europe. For comparison with simulated NPP, we select the corresponding grid cell centres nearest to the measurement sites.

Coincidence analysis and definition of extreme events
For our analysis, we search for coincidences (Donges et al., 2011) between specific percentiles in the pairs of biotic and climate time series. In the case of TRW and NPP, values smaller than the 10th percentiles were used (low-productivity extremes). In the climate records, all values exceeding the 90th percentile of mean growing season temperature (hot extremes) and all values below the 10th percentile of the total growing season precipitation (dry extremes) were defined as extreme events. This combination of climatic and biotic extremes tests the link between extremely high temperature, low precipitation or the combination of both in causing low-growth responses at all sites. At alpine or boreal sites, particularly high temperatures may even lead to better growth conditions (e.g. Jolly et al., 2005). Similarly, extremely low temperatures during the growing season could cause low-growth extremes, e.g. in the Alps or the boreal zone (e.g. Babst et al., 2012). We therefore interpret our results carefully regarding these issues.
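The percentile definitions can be written as boolean event series. The sketch below is illustrative: the helper names are hypothetical, NumPy's default linear-interpolation percentile is an assumption, and a combined extreme is taken to be a year that is both hot and dry.

```python
import numpy as np


def low_extremes(values, q=10):
    """Flag values below the q-th percentile (dry or low-productivity extremes)."""
    values = np.asarray(values, dtype=float)
    return values < np.percentile(values, q)


def high_extremes(values, q=90):
    """Flag values above the q-th percentile (hot extremes)."""
    values = np.asarray(values, dtype=float)
    return values > np.percentile(values, q)


# Toy growing-season series for 20 years: warming trend, drying trend.
temperature = np.arange(20.0)
precipitation = temperature[::-1]

hot = high_extremes(temperature)
dry = low_extremes(precipitation)
hot_and_dry = hot & dry  # combined P and T extremes
```

In this toy example the two warmest years are also the two driest, so all three event series flag the same two years.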
To obtain the number of coincidences, K, between two given time series, we count the number of extreme events that both time series have in common simultaneously or allowing for a predefined lag. For the determination of K there are two parameters in the coincidence analysis: (1) t determines the width of the time window (in years, y) in which a TRW or NPP extreme can fall after a P, T or combined P and T extreme. For t = 1 y, only coincidences between TRW or NPP with P, T or combined P and T extremes during 1 year are counted. For t = 2 y, coincidences between TRW or NPP with P, T or combined P and T extremes during a time window of 2 years are counted; all coincidences falling in this time window are counted as K = 1. (2) τ determines the time lag between the TRW or NPP and the P, T or combined P and T extreme. We distinguish between t = 1 y with τ = 0 y (which accounts for coincidences occurring in the same year, i.e. instantaneous growth responses) and t = 1 y with τ = 1 y, which captures lagged effects, i.e. extreme growth responses in the following year (Fig. 1; see also Donges et al., 2011). We then normalize K by the total number of extreme events N in the climate time series of P, T or combined P and T to obtain the coincidence rate r with 0 ≤ r ≤ 1 (0 if no coincidences occur and 1 if the maximum number of possible coincidences occurs).
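A minimal sketch of the coincidence rate follows, assuming yearly boolean event series. The argument names `window` and `lag` stand for t and τ and are chosen here for illustration. Truncating windows that extend past the end of the record is an assumption, as the text does not describe the boundary treatment.

```python
import numpy as np


def coincidence_rate(climate_extreme, biotic_extreme, window=1, lag=0):
    """Coincidence rate r = K / N for yearly boolean event series.

    A climate extreme in year i counts as one coincidence if at least one
    biotic extreme falls in the years i + lag, ..., i + lag + window - 1.
    """
    climate_extreme = np.asarray(climate_extreme, dtype=bool)
    biotic_extreme = np.asarray(biotic_extreme, dtype=bool)
    n_events = int(climate_extreme.sum())
    if n_events == 0:
        return float("nan")  # rate is undefined without climate extremes
    k = 0
    for year in np.flatnonzero(climate_extreme):
        start = year + lag
        # Several biotic extremes inside one window still count only once.
        if biotic_extreme[start:start + window].any():
            k += 1
    return k / n_events


climate = [1, 0, 0, 1, 0, 0, 1, 0]
biotic = [1, 0, 0, 0, 1, 0, 0, 0]
r_instant = coincidence_rate(climate, biotic, window=1, lag=0)
r_lagged = coincidence_rate(climate, biotic, window=1, lag=1)
r_two_year = coincidence_rate(climate, biotic, window=2, lag=0)
```

In the toy series, one of three climate extremes has a same-year response and one has a response in the following year, so the instantaneous and lagged rates are both 1/3, while the 2-year window captures both responses (2/3).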

Testing the significance of coincidences
Autocorrelations as well as the specific shape of the distribution of amplitudes in the considered climatological and biotic time series can have a profound influence on the observed bivariate coincidence rates. To control for these effects and assess the statistical significance of the computed coincidence rates r, we create 1000 iAAFT (iterative Amplitude Adjusted Fourier Transformation; Schreiber and Schmitz, 2000; Venema et al., 2006) surrogate time series for each site and grid cell. The iAAFT surrogates are fully statistically independent from the original time series but characterized by the same amplitude distribution and, most importantly, the same autocorrelation properties. Hence, we can investigate the coincidence rate of extremes that would be expected to arise by chance between two time series of a given autocorrelation structure. We calculate for each site and grid cell the distribution of the coincidence rates of the iAAFT surrogate time series. The coincidence rate r (calculated from climate and biotic extremes) is assumed to be significant if it is higher than the 90th percentile of the surrogate distribution. In the following analysis, we only consider TRW sites and simulated grid cells with a significant coincidence rate r. For brevity, these locations are subsequently addressed as significant sites or grid cells.
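The surrogate test can be sketched in a few lines of NumPy. This is an illustrative implementation of the standard iAAFT iteration, not the code used for the study. It assumes that only the biotic series is replaced by surrogates and that dry extremes are tested for instantaneous coincidences. The function names and the fixed iteration count `n_iter` are hypothetical choices.

```python
import numpy as np


def iaaft_surrogate(x, rng, n_iter=50):
    """One iAAFT surrogate: same values as x, approximately the same spectrum."""
    x = np.asarray(x, dtype=float)
    sorted_x = np.sort(x)
    target_amplitude = np.abs(np.fft.rfft(x))
    s = rng.permutation(x)  # random start destroys the original timing
    for _ in range(n_iter):
        # Step 1: impose the power spectrum of x, keep the current phases.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amplitude * np.exp(1j * phases), n=len(x))
        # Step 2: impose the amplitude distribution of x by rank ordering.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
    return s


def instantaneous_rate(climate, biotic, q=10):
    """Share of dry extremes (< q-th percentile) with a same-year biotic extreme."""
    dry = climate < np.percentile(climate, q)
    low = biotic < np.percentile(biotic, q)
    return (dry & low).sum() / dry.sum()


def surrogate_test(climate, biotic, n_surrogates=1000, level=90, seed=0):
    """Return (observed rate, True if it exceeds the surrogate percentile)."""
    rng = np.random.default_rng(seed)
    observed = instantaneous_rate(climate, biotic)
    null = [instantaneous_rate(climate, iaaft_surrogate(biotic, rng))
            for _ in range(n_surrogates)]
    return observed, observed > np.percentile(null, level)
```

Because the last step of each iteration is the rank-ordering step, every surrogate contains exactly the original values, so the percentile thresholds for extremes are unchanged.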

Detection of European-wide extreme years
To identify years with a pronounced European-wide forest response to climate extremes, i.e. years that yield a high number of coincidences across the continent, we take the sum over all coincidences at significant sites/grid cells occurring during a specific year and divide it by the number of all significant sites/grid cells, again yielding a number between 0 and 1 (0 if no coincidences occur at significant sites/grid cells and 1 if all significant sites show a coincidence in the year considered). We define as "European-wide extreme years" all years in which this value lies more than one standard deviation above the average annual significant coincidence rate.
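As a sketch of this step, assume a boolean matrix with one row per significant site or grid cell and one column per year, where True marks a coincidence. The function name and the use of the population standard deviation are assumptions made for illustration.

```python
import numpy as np


def european_extreme_years(coincidences, years):
    """Years whose continental coincidence rate exceeds mean + 1 SD.

    coincidences: array of shape (n_sites, n_years), True where a
    significant site shows a coincidence in that year.
    """
    coincidences = np.asarray(coincidences, dtype=float)
    annual_rate = coincidences.mean(axis=0)  # share of sites per year, 0..1
    threshold = annual_rate.mean() + annual_rate.std()
    return [int(y) for y, r in zip(years, annual_rate) if r > threshold]


# Toy example: four sites, five years, all sites respond in 2003 only.
toy = np.zeros((4, 5), dtype=bool)
toy[:, 4] = True
extreme_years = european_extreme_years(toy, [1999, 2000, 2001, 2002, 2003])
```

In the toy example the annual rates are 0, 0, 0, 0 and 1, giving a threshold of 0.2 + 0.4 = 0.6, so only 2003 is returned.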

Analysis of downregulation of forest growth by extreme events
To estimate the potential downregulation of forest growth by extreme events, we assume that years with coincidences of extremes in P, T and combined P and T with TRW and NPP represent extreme years for the ecosystem. For this analysis, the TRW and NPP time series were rescaled to zero mean and unit variance (z-scores) in order to guarantee comparability in the summary statistics (note that the coincidence analysis itself is scale free and does not require this preprocessing step). We then select z-scored TRW and NPP values during extreme years, i.e. during years with significant coincidences between extreme response and extreme climate. We calculate the proportion of z-scored NPP and TRW values during P, T and combined P and T extremes that are below two standard deviations in relation to the total number of TRW and NPP extremes.
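The z-score step can be illustrated for a single series. The sketch below returns the share of coincidence years whose z-score lies below a chosen number of standard deviations. Normalizing by the coincidence years of one series is a simplification, as pooling across sites and extreme types would have to be handled separately. The name `share_below` is hypothetical.

```python
import numpy as np


def share_below(series, coincidence_years, n_sd=2.0):
    """Share of coincidence years with a z-score below -n_sd."""
    series = np.asarray(series, dtype=float)
    z = (series - series.mean()) / series.std()  # zero mean, unit variance
    selected = z[np.asarray(coincidence_years, dtype=bool)]
    return float((selected < -n_sd).mean())


# Toy series: 18 normal years followed by one severe and one moderate low-growth year.
toy_series = [0.0] * 18 + [-10.0, -2.0]
toy_mask = [False] * 18 + [True, True]
share_two_sd = share_below(toy_series, toy_mask, n_sd=2.0)
```

Here the series has mean −0.6 and standard deviation 2.2, so the severe year has a z-score of about −4.3 and the moderate year about −0.6; one of the two coincidence years falls below two standard deviations.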

Results and Discussion
In this section we first discuss the general picture of the impact of climate extremes on measured TRW and modelled NPP and then focus on patterns of spatial and temporal coincidence rates at significant sites/grid cells.

Downregulation of forest growth by extreme events
To estimate potential effects of extreme events on tree growth and productivity, we quantified TRW and NPP anomalies in years with extreme climate conditions. Generally, carbon losses are strongest during combined precipitation and temperature extremes, particularly for TRW. A total of 815 TRW extremes significantly coincide with P, T and combined P and T extremes. Thereby, 9, 6 and 13 % (34, 28 and 38 %) of the TRW values are below two (one) standard deviations in years with extremely high temperatures, low precipitation and combined temperature and precipitation extremes, respectively (Fig. 2a, b, c, Table S1). At TRW sites, 1491 extremes in simulated NPP are detected. The higher number of detected NPP extremes may partially reflect the fact that the same climate forcing data are used to drive NPP simulations (resulting in a higher probability of coincidences between climate and NPP extremes than coincidences between climate and TRW extremes). Of the simulated NPP values at TRW sites, 20, 3 and 11 % (56, 10 and 27 %) are below two (one) standard deviations in years with P, T and combined P and T extremes, respectively (Fig. 2d, e, f, Table S1). The strong reduction in simulated NPP during precipitation extremes may be related to an overestimation of the modelled P sensitivity of NPP (Babst et al., 2013; Beer et al., 2010). Our definition focusses on extremes during the growing season and thereby neglects any impacts of extreme events that occur outside the growing season and could have a significant impact on forest productivity, such as respiratory carbon losses in autumn and winter that deplete carbon storage pools and reduce growth in the following year (e.g. Piao et al., 2008). Extremely warm temperatures in winter may also increase the snow amount in the boreal zone, leading to a delayed start of the next growing season (Helama et al., 2013). By contrast, warm winter/spring temperatures are beneficial for an earlier start of the next growing season (Polgar and Primack, 2011; Rossi et al., 2014). Keeping this in mind, our results for growing season extremes (Fig. 2) suggest that TRW and simulated NPP show subtle differences in their response to different climate extremes and that the seasonality of climate anomalies may be of vital importance in this respect. Hence, in the following, we carry out an in-depth investigation as to how these differences can be attributed.

Extreme years as determined from coincidences across Europe
To analyse the responses of forest growth to drought and heat extremes in models and observations, it is necessary to first evaluate whether the timing of the climate-driven reductions in TRW and simulated NPP matches reasonably well.
In this context, we determine European-wide extreme years as described in Sect. 2.4. For both TRW and simulated NPP, we identify the years 1911, 1921, 1945, 1947, 1976 and 2003 (Fig. 3, dark grey boxes) as dry extremes with substantial biotic impacts. Extremely hot years are detected in 1934, 1945, 1947, 1949, 1950, 2002 and 2003 (Fig. 3). Coincidences of combined P and T extremes with NPP and TRW are detected in 1945, 1947, 1994 and 2003 (Fig. 3). These results are in good agreement with earlier studies that identified extreme events during these years. In their analysis, Babst et al. (2012) show that 1947 had extremely low growth in southern, southeastern and central Europe due to dry conditions. Neuwirth et al. (2007) reveal 1921 as a negative extreme year in the Rhône Valley, Jura, northern Bavaria and northern Germany and 1947 as a negative extreme year in western Poland, northwestern Germany and Slovenia. Battipaglia et al. (2010) reconstructed temperature extremes from tree rings and found extremely warm conditions in 1911, 1921, 1964 and 2003. Extreme fire years are reported in 1947 and 1976 in Germany (Goldammer, 2001). In 1994, temperature anomalies of up to 2 °C in comparison to 1961-1990 were recorded throughout Europe (Halpert et al., 1995). The effects of the extreme year 2003 in Europe are well known (e.g. Ciais et al., 2005). This list demonstrates the capability of our coincidence analysis to identify European-wide heat and drought extremes.

Figure caption (fragment): The colour bar gives the coincidence rate r for the coincidence analysis with t = 2. Only grid cells with significant coincidence rates are coloured; nonsignificant grid cells are marked in grey. Note that the significance level for each grid cell is determined separately.

Spatial distribution of responses to extreme events
In the next step, we focus on the regional patterns of biotic responses revealed by coincidence analysis. The value of tree ring records as model benchmarks under climate extremes will depend crucially on the matches in these spatial patterns. Figure 4 identifies areas where simulated NPP of broadleaved and needle-leaved trees shows significant coincidences with precipitation, temperature and combined temperature and precipitation extremes, respectively. Figure 5 shows the analogous picture for TRW. Generally, we find more significant grid cells with high coincidence rates between simulated NPP and precipitation (n = 259 in grid cells at TRW sites) than with temperature extremes (n = 74 in grid cells at TRW sites; Fig. 4, upper row) during the growing season. As mentioned before, this may be related to an overestimation of the modelled P sensitivity of NPP. It also shows that water is an important driver at many sites, particularly under extreme conditions (Reichstein et al., 2013; Zscheischler et al., 2014c). For the observed TRW values, we find almost the same number of significant sites for coincidences with P (n = 189) as for coincidences with T (n = 139; Fig. 5). In contrast to observed TRW, simulated NPP displays generally low or insignificant coincidence rates with high-temperature and low-precipitation extremes in mountainous areas (Figs. 4 and 5). This may be due to the spatial resolution of the climate data, where grid cells cannot resolve climatic differences along steep altitudinal gradients in the Alps and the related responses displayed by TRW (e.g. King et al., 2013). Climate extremes (low precipitation, high temperatures) in such areas may therefore not limit simulated NPP during the growing season. This calls for higher-resolution long-term climate data sets (or ideally a denser network of climate stations in complex terrain) to better capture site-level climate extremes and improve their representation in simulated NPP anomalies. Drought conditions may not only result from a lack of rainfall but also from high temperature, which drives vapour pressure deficit in dry areas such as the Mediterranean region (Williams et al., 2012). Therefore, we also show the coincidence rates r for the combined P and T extremes (Figs. 4 and 5, lower row). A characteristic feature is that these coincidence rates are generally lower because combined P and T events are rare. However, we find a relatively high number of significant grid cells with combined events (n = 243 in grid cells at TRW sites, n = 242 for TRW out of 606 TRW sites).

Figure caption (fragment): Note that for some bins, for TRW or site NPP only one value exists.
Zonal patterns become more obvious through binning of the results (Fig. 6). For both simulated and observed growth responses, we find a ∼ 40 % probability that a climatic extreme is associated with a biotic extreme, i.e. reduced growth response in the current or subsequent year. We find a ∼ 10 and ∼ 20 % probability that combined temperature and precipitation extremes are associated with a biotic extreme for TRW and NPP, respectively. The simulated NPP at tree ring sites displays an increase in the coincidence rate r along a mean annual temperature gradient with lower r in low-temperature zones and higher r in high-temperature zones (Fig. 6, blue dots). The coincidence rates between simulated NPP and P range from r ≈ 0.38 to 0.54 (Fig. 6a); for NPP and T they scatter around ∼ 0.4 (Fig. 6b), whereas for both NPP and TRW and combined T and P extremes, r ranges between 0.1 and 0.2 (Fig. 6c). Again, the overestimation of the P sensitivity of simulated NPP in comparison to TRW is visible. TRW displays rather constant coincidence rates of ∼ 0.4 with T (Fig. 6a, red dots) and P (Fig. 6b) along the temperature gradient, whereas for combined extremes the increase in the coincidence rates is similar to that of NPP (Fig. 6c). The lower coincidence rates in TRW may be driven by adverse effects of extreme T, e.g. in mountainous areas, where high temperatures during the growing season may even lead to increased growth (Jolly et al., 2005). Also, the importance of nonstructural carbohydrates (NSCs) should not be underestimated. NSCs can be stored for up to 10 years, used as resources during unfavourable growth conditions, and thereby buffer the negative effects of extreme events (e.g. Carbone et al., 2013; Richardson et al., 2013; Klein et al., 2014). The low coincidence rates of NPP and TRW with combined T and P extremes again result from the rarity of these events (Fig. 6c).

Figure caption (fragment): Instantaneous (r calculated for t = 1 and τ = 0, x axis) vs. lagged (r calculated for t = 1 and τ = 1, y axis) coincidences of extreme events. The size of the dots is proportional to the number of significant coincidences, i.e. larger dots indicate a higher number of sites/grid cells with significant coincidences. Note that the regular grid of coincidence rates results from consistent time series length for simulated NPP, while the length of the time series for TRW differs.

Instantaneous and lagged responses to extreme events
To further assess the growth responses to climate extremes found in models and observations, it is necessary to analyse their dynamics. Lagged biotic responses to extreme events are of particular interest. We therefore compare the coincidence rate r in the same year (i.e. instantaneous responses, calculated with t = 1 and τ = 0) with the coincidence rate in the year after the climate extreme (i.e. lagged responses, calculated with t = 1 and τ = 1; Fig. 7 and see also Fig. 1). We find a high number of coincidences in the year after the extreme compared to the instantaneous response, particularly during combined P and T extremes, indicating lagged responses (Fig. 7e, f). Overall, negative precipitation anomalies combined with positive temperature extremes lead to reduced growth not only in the current, but also in the following year. This is in line with other studies (e.g. Babst et al., 2012; Franke et al., 2013) that have emphasized the importance of considering lagged effects in measured TRW. Babst et al. (2012) found that particularly late growing season extremes lead to reduced growth in the following year. The pattern is less pronounced in simulated NPP (Fig. 7e). In the model, lagged effects in NPP are simulated when unfavourable climate conditions lead to low productivity and high respiration costs during the current year and thus less accumulation of biomass. Constant or less accumulated biomass then leads to reduced simulated NPP during the following year. Because simulated NPP represents a rather short-term measure of carbon use compared to observed TRW, it responds more instantaneously to changes in photosynthesis and respiration during extreme events. In contrast, observed TRW integrates carbon accumulation and growth over a whole growing season, relies in part on stored carbohydrates, and may even be influenced by longer-term responses to canopy and root architecture. These considerations may explain some of the observed differences between TRW and simulated NPP under extreme climate conditions.
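The distinction between instantaneous and lagged coincidence rates can be sketched in Python as follows. This is a simplified illustration and not the implementation used in this study: the function names, the example event series, and the reading of the window t and lag τ as "biotic extremes in years i + τ to i + τ + t − 1 after a climate extreme in year i" are assumptions chosen to be consistent with the counting rules described for Fig. 1.

```python
import numpy as np

def lower_extremes(series, quantile=0.10):
    """Flag values at or below the lower quantile of the series as extreme events."""
    series = np.asarray(series, dtype=float)
    return series <= np.quantile(series, quantile)

def coincidence_rate(climate_events, biotic_events, window=1, lag=0):
    """Fraction of climate extremes that coincide with at least one biotic extreme.

    A biotic extreme coincides with a climate extreme in year i if it falls in
    years i + lag, ..., i + lag + window - 1 (assumed reading of t and tau).
    """
    climate_events = np.asarray(climate_events, dtype=bool)
    biotic_events = np.asarray(biotic_events, dtype=bool)
    years = np.flatnonzero(climate_events)
    if years.size == 0:
        return float("nan")
    hits = 0
    for i in years:
        start = i + lag
        # Several biotic extremes inside one window still count as one coincidence
        if biotic_events[start:start + window].any():
            hits += 1
    return hits / years.size

# Hypothetical annual event series (1 = extreme year), not data from this study
climate = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]
biotic = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]

r_instant = coincidence_rate(climate, biotic, window=1, lag=0)  # same year
r_lagged = coincidence_rate(climate, biotic, window=1, lag=1)   # following year
r_window2 = coincidence_rate(climate, biotic, window=2, lag=0)  # either year
```

Plotting `r_instant` against `r_lagged` for each site or grid cell would give a scatter of the kind shown in Fig. 7. In the hypothetical series, the third climate extreme is followed by biotic extremes in both the same and the following year, yet it contributes only one coincidence to `r_window2`.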

Conclusions
We present a simple method, based on coincidence analysis, for detecting impacts of extreme events in time series of climate and forest growth. The coincidence metric can be viewed as a "unit-free", neutral measure of biotic responses to climate impacts. The method is general and does not require converting tree ring width to NPP for comparison with model output; instead, we can compare the results of the coincidence analysis to test for possible causal relationships between extreme climate and extreme growth responses. Tree rings are long-term observational time series related to forest productivity and are thus valuable archives for improving our process understanding of forest responses to extreme events and, thus, for evaluating dynamic vegetation models. Our study shows that low precipitation, high temperature and combined extremes lead to substantial losses in forest productivity: ∼ 30 % of the coinciding growth responses fall more than two standard deviations below the mean during extreme years. We identified years with climate extremes that caused extreme ecosystem responses in Europe during the 20th century, consistent with previously reported evidence.
Our study has shown the potential of standardized tree ring data for evaluating the ability of dynamic global vegetation models (DGVMs) to simulate growth responses to climate extremes. Earlier model evaluation studies have lacked this type of analysis. As climate extremes can have long-lasting impacts, DGVMs need to be able to simulate such effects and capture the processes that are responsible for multiyear lagged effects. The combination of improved DGVMs and the method of coincidence analysis can then be applied to quantify the impacts of extreme events, e.g. on the long-term fate of the global carbon balance.
www.biogeosciences.net/12/373/2015/ Biogeosciences, 12, 373-385, 2015
The Supplement related to this article is available online at doi:10.5194/bg-12-373-2015-supplement.
25° × 0.25° spatial resolution based on the WATCH-ERA-Interim daily climate data. A global value of annual atmospheric CO2 concentration was prescribed for the 1901-2010 period based on data from the NOAA Earth System Research Laboratory (NOAA ESRL, 2013). The transient runs from 1901 to 2010 were preceded by a spin-up of 1000 years using [...] carbon pools and fluxes and vegetation cover. Model parameterization and soil types followed Gerten et al. (

Figure 1. Example of coincidence analysis between a time series of (a) precipitation (P) and (b) tree ring width indices (TRW). The dashed horizontal line represents the lower 10 % quantile. Events in precipitation or tree ring width that fall below this threshold are counted as extreme events, as indicated by the dotted vertical lines for P (blue) and TRW (red). Grey bars indicate the coincidence of extreme P and TRW events within a time window of 2 years (t = 2); each is counted as one coincidence. In this example, we count 11 climate extremes and 7 coincidences with TRW for t = 2, resulting in a coincidence rate of r = 7/11 = 0.64. Note that if two TRW extremes coincide with one P extreme in the time window of t = 2, this counts as only one coincidence. The letters "I" and "L" indicate instantaneous and lagged effects (i.e. carry-over effects), respectively. When setting the time window to t = 1 and τ = 0, five coincidences are counted, and for t = 1 and τ = 1, two coincidences are counted in this example.

Figure 2. Histograms of deviations of growth responses of TRW (a, b, c) and NPP (d, e, f) during extreme years in units of standard deviations (z-scores). The vertical dashed grey line marks two negative standard deviations. The numbers denote the proportion of coinciding events below two standard deviations (see also Table S1 in the Supplement).
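The proportion of coinciding events below two standard deviations can be illustrated with a short Python sketch. It assumes that z-scores are computed against the mean and sample standard deviation of the full series; the reference period, the function name `share_below_two_sd` and the example values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def share_below_two_sd(series, coinciding_years):
    """Share of coinciding events whose z-score lies below -2.

    z-scores use the mean and sample standard deviation of the whole
    series (an assumption about the reference period).
    """
    series = np.asarray(series, dtype=float)
    z = (series - series.mean()) / series.std(ddof=1)
    z_events = z[np.asarray(coinciding_years, dtype=int)]
    return float(np.mean(z_events < -2.0))

# Hypothetical growth index: 18 ordinary years, one severe and one mild reduction
growth = [1.0] * 18 + [0.0, 0.9]
share = share_below_two_sd(growth, coinciding_years=[18, 19])
```

In this hypothetical series only the severe reduction falls below the −2 threshold, so half of the two coinciding events are counted.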

Figure 4. Map of coincidence rates between extremes in simulated NPP and precipitation for (a) broadleaved and (b) needle-leaved trees. Coincidence rates between extremes in simulated NPP and temperature for (c) broadleaved and (d) needle-leaved trees. Coincidence rates for simulated NPP and combined precipitation and temperature extremes for (e) broadleaved and (f) needle-leaved trees. The colour bar gives the coincidence rate r for the coincidence analysis with t = 2. Only grid cells with significant coincidence rates are coloured; nonsignificant grid cells are marked in grey. Note that the significance level for each grid cell is determined separately.

Figure 5. Map of tree ring sites and coincidence rates at each site. Coincidence rates between extremes in TRW and precipitation for (a) broadleaved, (b) needle-leaved and (c) other tree species are provided. In the middle row (d, e, f), coincidence rates between extremes in TRW and temperature for the different tree species are displayed. In the lower row (g, h, i), coincidence rates for extremes in TRW and combined precipitation and temperature extremes are given. Nonsignificant sites are marked with transparent dots. The colour bar gives the coincidence rate r for the coincidence analysis with t = 2.

Figure 6. Significant coincidence rates of TRW (red dots) and simulated NPP at TRW sites (blue dots) with (a) precipitation, (b) temperature, and (c) the combination of both, in climate space (as given in 2.5 °C temperature bins, x axis) averaged over all tree species. Sites/grid cells with significant coincidences are aggregated in 2.5 °C mean annual temperature bins. Error bars give the standard deviation among sites/grid cells, and numbers in the plot denote the n of significant sites/grid cells for each 2.5 °C temperature bin (numbers for TRW in red and for simulated NPP in blue). Note that for some bins only one value exists for TRW or site NPP.