Spatio-temporal variations and uncertainty in land surface modelling for high latitudes: univariate response analysis
- 1School of Mathematics and Statistics, University of Sheffield, Sheffield, UK
- 2Centre for Ecology and Hydrology, Wallingford, UK
- 3Leverhulme Centre for Climate Change Mitigation, Animal and Plant Sciences Department, University of Sheffield, Sheffield, UK
- 4Laboratoire des Sciences du Climat et de l’Environnement, Institut Pierre Simon Laplace, Université Paris-Saclay, Gif-sur-Yvette, France
- anow at: The European Centre for Medium-Range Weather Forecasts, Shinfield Park, Reading, RG2 9AX, UK
Correspondence: Didier G. Leibovici (firstname.lastname@example.org)
A range of applications analysing the impact of environmental changes due to climate change, e.g. geographical spread of climate-sensitive infections (CSIs) and agriculture crop modelling, make use of land surface modelling (LSM) to predict future land surface conditions. There are multiple LSMs to choose from that account for land processes in different ways and this may introduce predictive uncertainty when LSM outputs are used as inputs to inform a given application. For useful predictions for a specific application, one must therefore understand the inherent uncertainties in the LSMs and the variations between them, as well as uncertainties arising from variation in the climate data driving the LSMs. This requires methods to analyse multivariate spatio-temporal variations and differences. A methodology is proposed based on multiway data analysis, which extends singular value decomposition (SVD) to multidimensional tables and provides spatio-temporal descriptions of agreements and disagreements between LSMs for both historical simulations and future predictions. The application underlying this paper is prediction of how climate change will affect the spread of CSIs in the Fennoscandian and north-west Russian regions, and the approach is explored by comparing net primary production (NPP) estimates over the period 1998–2013 from versions of leading LSMs (JULES, CLM5 and two versions of ORCHIDEE) that are adapted to high-latitude processes, as well as variations in JULES up to 2100 when driven by 34 global circulation models (GCMs). A single optimal spatio-temporal pattern, with slightly different weights for the four LSMs (up to 14 % maximum difference), provides a good approximation to all their estimates of NPP, capturing between 87 % and 93 % of the variability in the individual models, as well as around 90 % of the variability in the combined LSM dataset. The next best adjustment to this pattern, capturing an extra 4 % of the overall variability, is essentially a spatial correction applied to ORCHIDEE-HLveg that significantly improves the fit to this LSM, with only small improvements for the other LSMs. Subsequent correction terms gradually improve the overall and individual LSM fits but capture at most 1.7 % of the overall variability. Analysis of differences between LSMs provides information on the times and places where the LSMs differ and by how much, but in this case no single spatio-temporal pattern strongly dominates the variability. Hence interpretation of the analysis requires the summation of several such patterns. Nonetheless, the three best principal tensors capture around 76 % of the variability in the LSM differences and to a first approximation successively indicate the times and places where ORCHIDEE-HLveg, CLM5 and ORCHIDEE-MICT differ from the other LSMs. Differences between the climate forcing GCMs had a marginal effect up to 6 % on NPP predictions out to 2100 without specific spatio-temporal GCM interaction.
The rise in surface temperatures under global warming is predicted to be most severe in the Arctic, where it is already altering surface conditions and perturbing ecological systems (Overland et al., 2014). This will have multiple societal impacts, not least on the health of animals and humans (IPCC AR5 WG2 A, 2014). The term climate-sensitive infection (CSI) refers to diseases whose epidemiological aspects are driven, at least in part, by climatic factors (McMichael et al., 2006; Ebi et al., 2017; Cayol et al., 2017). In the Arctic, climate change is likely to cause enhanced CSI risk in terms of increased incidence, more frequent outbreaks, geographic spread of existing affected zones, and occurrence of newly affected zones (Pauchard et al., 2016; Sajanti et al., 2017; Waits et al., 2018) The complex ecology of CSI organisms presents a challenge to modelling and predicting their epidemiology (Ostfeld, 2010; Carvalho et al., 2014; Ruscio et al., 2015; Sormunen et al., 2016; Li et al., 2016; Gilbert, 2016; White et al., 2018). However, such modelling is needed as disease vectors, such as ticks, mosquitoes, badgers and roe deer, which are associated, for example, with Lyme disease (borreliosis) and tularemia, are expanding their ranges northwards (Jaenson et al., 2012; Jore et al., 2014; Andersen and Davis, 2017; Laaksonen et al., 2017; Blomgren et al., 2018). Under climate change and human-induced land use changes, fragmentation of the landscape was found to affect Lyme disease (Simon et al., 2014), whilst mosquito abundance was associated with outbreaks of tularemia in boreal forest (Rydén et al., 2012). CSIs can also be non-vector diseases, as climate change may force increasing proximity and contacts between animals, e.g. pestivirus affects mammals (livestock or wild) and thus reindeer (Kautto et al., 2012). This could threaten the successful bovine pestivirus eradication programmes existing in Scandinavia since the 1990s (Tryland, 2013).
The Nordic Centre of Excellence (NCoE) CLINF, “Climate change effects on the epidemiology of infectious diseases and the impacts on Northern societies” (http://www.glinf.org/, last access: 2 March 2020), is an interdisciplinary project supported by NordForsk (https://www.nordforsk.org/en, last access: 2 March 2020), covering an area extending across Norway, Sweden, Finland and north-west Russia. Its aim is to study how climate change will affect the prevalence of human and animal CSIs and the consequences for Arctic societies. To do so it needs to characterize how a changing climate will change the environmental and societal conditions affecting a range of CSIs in Nordic regions. Besides predicting environmental changes likely to affect the spread of CSIs, CLINF also aims to gather and generate information on the societal impacts of climate change. Achieving this aim requires tools to model land surface and aquatic changes under climate forcing. This paper focuses on land surface models (LSMs) and the extent to which existing LSMs could provide forecasts useful for the purposes of predicting CSI epidemiology.
An important factor in discussing the predictive value of these models is the variability in their outputs. This variability arises from two sources: variability in the climate drivers, since there are many global circulation models (GCMs), and differences between LSMs, whose core concepts are similar but with many differences in process representation and parameterization. This leads to three key questions:
How does the choice of the GCM affect the CSI-relevant outputs of a given LSM?
For a given GCM, how different are the CSI-relevant outputs of the different LSMs?
How do the joint effects of GCM and LSM differences translate into variability in predictions of CSI-relevant quantities?
Addressing these questions requires methods to describe spatio-temporal differences in models, and the first part of this paper describes such methods. The treatment here is relevant to a range of applications and is generic, but the evaluation of the methods in the latter part of the paper is couched in terms of differences between LSM predictions of net primary production (NPP), i.e. a single model output variable indicating vegetation activity, hence with relevance to CSI modelling involving changes in habitat for specific vectors, as well as carbon fluxes and ecosystem functioning (Koca et al., 2006; Rafique et al., 2016).
It is important that we quantify the uncertainty in any variable derived from an LSM model as a predictor in CSI modelling, so that the full uncertainty in the predictions (and associated risk) is available to public health decision-making. Typically, the uncertainty in the predictions from a single LSM is poorly known, and we instead treat the spread in data simulated by a range of leading LSMs as a proxy for this uncertainty. Since Arctic CSIs are the underlying motivation for this work, we only consider LSMs that represent characteristics of Nordic areas, including high-latitude processes, vegetation and landscapes. These are CLM5 (the Community Landscape Model version 5) (Lawrence et al., 2019), JULES (the Joint UK Land Environment Simulator) (Clark et al., 2011; Comyn-Platt et al., 2020) and two versions of ORCHIDEE (Organising Carbon and Hydrology in Dynamic Ecosystems), ORCHIDEE-MICT (OR_MICT) (Guimberteau et al., 2018) and ORCHIDEE-HLveg (OR_HL) (Druel et al., 2017). The simulated climate data cover the historical period from December 1997 to December 2013, while for JULES we also analysed data from 100-year forecasts to the end of the 21st century under forcing by 34 different GCMs (Comyn-Platt et al., 2020). The specifics of the four models and the driving climate data are briefly described in Sect. 1.2.
Section 2 motivates the use of a multiway methodology to characterize variations between LSMs, and the essentials of such a methodology are described in Sect. 3. In Sect. 4 we use this methodology to analyse the differences between the four selected LSMs, while Sect. 5 shows how the methodology can be applied directly to differences between the LSMs. The same approach is then used in Sect. 6 to assess how the choice of a particular GCM affects the NPP predictions from the JULES LSM. Section 7 gives our discussion and conclusions.
1.1 LSM and ecological modelling aspects relevant to CSI prediction
Climate change is driving the spread of a range of CSI disease vectors (Zuliani et al., 2015; Andersen and Davis, 2017; Blomgren et al., 2018), so understanding the spatio-temporal distribution and evolution of characteristics, such as habitat suitability of these vectors or reservoirs, is essential. These characteristics can then be used in an ecological model that could be coupled with epidemiological models to estimate future risks of disease incidence, e.g. where and when the transmission risks are likely to be highest. For example, changes in abundance and extent of habitat suitability are important factors to be considered in dynamic landscape epidemiology modelling (Lambin et al., 2010). Changes due to global warming could affect both abundance and geographical extent and also extend the period of transmission, e.g. by affecting the vector life cycle (Rose et al., 2015). Extreme weather conditions and events may either introduce outbreaks of abundance, thereby increasing the risk of pathogen occurrence in disease vectors, or wipe out a species at a given location.
Under different scenarios of climate drivers, such as the Representative Concentration Pathways (RCPs) developed by the Intergovernmental Panel on Climate Change (IPCC), LSMs can simulate future atmospheric and land conditions that can be related to vector habitat suitability. Variables simulated by combined climate and land surface models, such as surface temperature, soil moisture, precipitation and land cover, can be used in ecological models or as part of an epidemiological model, e.g. for a species distribution model (Booth et al., 2014). However, the predictive uncertainty in these variables may lead to significant uncertainty in the predictions from CSI modelling (Asghar et al., 2016). This paper describes and quantifies the spatio-temporal uncertainty arising from the choice of LSM alone, i.e. without assessing its impact on CSI predictions, but provides an essential component in understanding the uncertainty in any statistical or mathematical predictions of CSI epidemiology and ecology (Beale and Lennon, 2012; Zuliani et al., 2015; Metcalf et al., 2017) that use LSM outputs as predicting variables.
1.2 Land surface model and data description
The four LSMs used in this study, CLM5, two versions of ORCHIDEE (OR_MICT and OR_HL) and JULES, were chosen because of their high degree of maturity and their ability to model characteristics of Nordic areas, including high-latitude processes, vegetation and landscapes. Table 1 summarizes these characteristics; details can be found in the references. OR_MICT (Guimberteau et al., 2018) includes major high-latitude adaptations, including snow and soil thermal interaction, plant primary productivity constrained by high-latitude conditions, and soil carbon stocks with feedback dynamics. OR_HL (Druel et al., 2017, 2019) adapts ORCHIDEE with specific plant functional types (PFTs) such as non-vascular plants (mosses, lichens), Arctic C3 grass and boreal shrubs. CLM5 (Lawrence et al., 2019) includes permafrost modelling and snow processes, C3 Arctic grass and deciduous boreal shrubs as part of its 15 PFTs (see Appendix C) but no non-vascular plants. The version of JULES (Clark et al., 2011) used here has been extended to be suitable for high latitudes (Comyn-Platt et al., 2020) by including processes such as permafrost–carbon feedbacks (Burke et al., 2017).
For all the LSMs, the initial PFTs were derived from land cover maps. JULES and the two versions of ORCHIDEE use the land cover product from the European Space Agency Climate Change Initiative (ESA-CCI) (Poulter et al., 2015). The supplementary material to Druel et al. (2017) describes the correspondence between land cover and the added Arctic PFTs. CLM5 uses the Land Use Harmonised data version 2, a product of the Land Use Intercomparison Project (LUMIP) (Lawrence et al., 2016), to define its initial spatial distribution of PFTs. For the historical analyses, the data were re-gridded to the finest grid spacing, 0.5∘ N × 0.5∘ E, by simple disaggregation which introduces a limitation when comparing the LSMs. All analyses were performed for a sub-area of the CLINF zone between 4.5–34.5∘ E and 58.5–70.5∘ N. Note that the climate forcing data are not the same for the different LSMs (see Table 1) since the LSM data were provided by different modelling groups, each of which uses preferred GCMs. This is unlikely to have any significant impact on the LSM comparisons (see Sect. 5).
Unpublished analysis within the CLINF project has identified specific variables whose spatio-temporal behaviour is correlated with CSI incidence; these include vegetation activity (here represented by net primary production, NPP), soil moisture, soil surface temperature, snow cover, precipitation and land cover. This section concerns analysis of how predictions of such variables differ between LSMs.
For a given variable, say NPP, the data simulated by an LSM can be arranged as a two-way spatial × temporal table, where the spatial dimension has as many entries as latitude–longitude positions, and the temporal dimension represents monthly values for each year over the period studied. For our dataset, the historical data simulations from December 1997 to December 2013 have 193 monthly entries over the selected zone of 1152 grid cells. Therefore for the four LSMs we get a three-way data table per variable or a four-way table if we include all the variables given above. Since the LSMs provide NPP for each PFT, the PFT dimension could also be added, but this is not done here.
Analysing such structured datasets to understand spatial, temporal and between-model variations can be challenging when there are long-tail distributions (as is the case in our dataset: see Fig. 1, which shows the histogram of NPP values in the combined historical datasets simulated by the four LSMs) which preclude the use of classical geostatistical methods, and due to their multivariate nature. For two-way tables, singular value decomposition (SVD) is a powerful tool to extract associations of variables and patterns within data, e.g. clusters and trends. The SVD of a data matrix finds pairs of vectors (components) that successively extract decreasing fractions of the variation in the data and are uncorrelated with previous pairs. Visual description of these optimal vectors can be obtained by plotting the component weights, e.g. a spatial effect as a map and a temporal effect as a time series plot.
With more than two dimensions, the data are a multiway table; combining different dimensions to obtain a two-way data table, i.e. a matrix, suitable for SVD would lead to difficulties in interpreting the results. We could instead compare the SVDs of the four spatio-temporal (1152×193) tables of NPP for each LSM, which may indicate whether the models behave similarly but would not readily highlight their differences. When the spatio-temporal effects extracted from the four SVDs are similar, one would say it is a trend in NPP and the small differences would be interpreted as uncertainty due to intra-model variations. But when no similarities in patterns could be read across the SVDs for each of the four LSMs, one could infer larger inter-model uncertainty with specific spatio-temporal variations per LSM without other means of comparisons. Such considerations have led to the development of methods to extend the SVD to multiway tables; these are briefly described below, before giving a fuller description in Sect. 3 of the R package PTAk method used in this paper (Leibovici, 2010), which is an optimal nested decomposition of the data variation.
Let X be an n×p matrix, which we can regard as a collection of np-dimensional vectors or pn-dimensional vectors. The matrix XtX is positive semi-definite, so all its eigenvalues are positive, and its eigenvectors, φh, are mutually orthogonal, i.e. if . The matrices XtX and XXt have the same eigenvalues, , and the sum of squares of the elements of X is given by , where x is the matrix X vectorized as a np-dimensional vector.
The SVD of any matrix X is defined by the series of decreasing σh, the singular values, each associated with a pair of unit vectors φh (p-dimensional) and ψh (n-dimensional), with , which explain a fraction of the variability of X (defined as the sum of squares of the elements of X), i.e. . Hence SVD can be used for dimension reduction by defining a p′-dimensional subspace () that captures most of the variability in X:
For a suitable p′, the residual variation is small enough to be considered insignificant. As shown in Eq. (1), this decomposition can be written in tensorial form, since . The rank-1 matrix is known as a decomposed rank-1 tensor (Leibovici, 2010). The term σ1ψ1⊗φ1 is the best rank-1 tensor approximation to the matrix X in the sense of capturing the maximum fraction of variability in X among all rank-1 tensors, i.e. the maximum value of σ=ψtXφ. Subsequent rank-1 tensors in the decomposition in Eq. (1), given by the other eigenvectors, are orthogonal to the previous ones and successively extract decreasing fractions of the variability. Matrices can be seen as order 2 tensors and multiway tables as order k tensors, where k is the number of dimensions of the table. For k=2 the SVD can be seen as an optimal basis vector system in each dimension and in the tensor space, and generalizations of SVD to tensors of the order k≥3 aim to find equivalent optimal systems.
Several algorithms to extend SVD to tables with more than two entries have been proposed (Tucker, 1966; Carroll and Chang, 1970; Harshman, 1970; Kroonenberg, 1983; Leibovici, 2010), and development of methods and their applications is still very active (Demšar et al., 2013; Kroonenberg, 2016; Takeuchi et al., 2017; Lock and Li, 2018). Most extensions aim to find an optimum decomposition of a multiway table that allows dimension reduction by looking for a decomposition similar to Eq. (1) under specific optimization criteria. The algebraic embedding of two-way data tables as tensors of the order of 2 (equivalent to matrices) and by extension of k-way data tables as tensors of the order k facilitates the understanding of these extensions. For a multiway table X with k≥3 entries this takes the generic form
where hi is the index of the basis vectors of the individual vector spaces making up the k-dimensional data table and ϵ expresses the residual of the approximation given by the summation. This residual depends on the method and the number of components used in the decomposition and can be zero (as would be the case if we retained all the terms in a SVD).
The decompositions carried out by the CANDECOMP and PARAFAC methods (Carroll and Chang, 1970; Harshman, 1970) fix the number of rank-1 tensors in the decomposition but do not impose an orthogonality constraint, while PCA-3modes (Kroonenberg, 1983) considers both orthogonality and rank within each vector space. However, all three methods need to choose in advance the number of rank-1 tensors in their optimization and obtain decompositions that are not nested as with SVD, in which the rank p′′ approximation of X contains the approximation obtained for p′ (with ). This property is often desirable for environmental data analysis (Frelat et al., 2017), as decomposition of the variance or sum of squares has a practical interpretation.
To address this, Leibovici and Sabatier (1998) developed the PTAk method, which is a hierarchical decomposition giving nested approximation by construction. For k=2, the PTAk algorithm is the same as SVD, while for k=3 it is given by
The notation ⊗i means that the vector on the left takes the ith place in the tensor product, e.g. and “..” indicates the contraction operation (defined in Appendix B along with definitions of the other notation used in Eq. 3). Note that the PTA3 algorithm is recursive as the last line of Eq. (3) calls another PTA3. This process can be continued until it leads to a null table, but normally a stopping rule is imposed by requiring the decomposition to capture a prescribed fraction of the overall variability or specifying the desired number of order k rank-1 tensors (right-hand side of Eq. 3).
Similarly to SVD, the PTA3 algorithm first retrieves the rank-1 tensor approximation to X, , which captures the maximum possible fraction of the variability in X and is termed the first principal tensor (PT) of X. The second, third and fourth lines in Eq. (3) correspond to optimizations associated with this first PT in which the decomposed tensors share one of the components in the first PT. The corresponding PTA2 analyses are complete SVD decompositions into series of tensor products. Given this decomposition, descriptive statistics or plots of the triple of components (ψ1, φ1, ϕ1) can then be used to visualize the pattern or effect associated with the fraction of the variability captured by each of the tensors.
The generalization of Eq. (3) to k-way data tables is straightforward. In a PTAk decomposition, the first rank-1 tensor will have associated PTA(k−1) values which will recursively end up at associated PTA2s, i.e. SVDs.
This principal aims of this section are to perform a PTA3 analysis of the three-way spatial × temporal × LSM table X of NPP and to interpret the results. However, it is useful to first examine some of the properties of the distributions of NPP for each LSM. The histogram of the NPP values in the full data table X, displayed in Fig. 1, conceals distinct differences between the LSMs. Some of these differences are indicated by Table 2, which gives the mean NPP and sum of the squares of NPP for each LSM, and Table 3, which shows for each LSM the fraction of NPP values in each decile of the reference distribution in Fig. 1. In Table 2, OR_MICT stands out by its low mean NPP (23 % less than JULES) and low variability (significantly less than the other LSMs and 37 % less than OR_HL). The LSMs also exhibit different distributions (see Table 3): notably CLM5 has 35 % of its NPP values in the first decile of the reference distribution, while OR_HL and JULES have very few values in this decile, and the decile with peak occupancy is different for all four LSMs. However, all the LSMs place around 10 % of their NPP estimates in each of the higher deciles (70 % to 100 %).
These distributional differences tell us nothing about the spatio-temporal differences between the LSMs, and for that we use the decomposition provided by the R package PTAk (Leibovici, 2010) of which the first 10 terms are displayed in Fig. 2. This describes the hierarchical and nested decomposition of the sum of squares of X into PTs and associated PTs. Each row corresponds to a PT, identified by a label and number,
-no-, and its singular value; e.g.
-no-1 correspond to the first line of Eq. (3) giving the best rank-1 approximation of X, with singular value . The row with label
vs222 gives the singular value corresponding to the next order-3 rank-1 approximation, which corresponds to the recursive step in the last line of Eq. (3).
The rows between
vs222 correspond to PTs associated with
vs111, which are derived from PTA2s, i.e. SVDs. The labels given to these decomposed components start with the dimension of the component used in contracting the tensor X (see Appendix B) and continue with the label of the PT from which they are derived and the dimensions of the contracted tensor; e.g.
1152 vs111 193 4 identifies the results from the PTA2 of the 193×4 matrix (i.e. an SVD), where ψ1 is the 1152-dimensional vector forming the spatial component of PT
-no-1. Therefore the associated PTs
4 have the same spatial component as tensor
-no-1. Similarly, the rank-1 tensors
7 are associated PTs along the temporal component of
vs111. Note that Fig. 2 displays only PTs with a contribution exceeding 0.1 % of the total sum of squares, as indicated in the bottom line in the figure. This means that we show only the first two terms from each of the PTA2s associated with
vs111, one of the associated PTs associated with
vs222 and no associated PTs for
The other terms in the rows of Fig. 2 are the singular values associated with each PT (
SingVal) and the percentage of the variability in X explained by each of the PTs (
ssPT %). The variability explained is given by the square of the singular value. Tensors
-no- 2, 5 and
8 are missing as they are repeats of already listed rank-1 tensors. This arises from the way the code implements Eq. (3); see Leibovici (2010) for further details.
The contribution by the main PTs decreases from
vs333, etc. Each of the associated tensors makes a smaller contribution than its main PT but this may be larger than the next main PT; e.g. tensor
-no-3 captures more variability than tensor
-no-11. There is no particular ordering in the tensors associated with different components, between
-no-3, which is associated with the spatial component, and
-no-6, which is associated with the temporal component, but the PTs associated with a given component are ordered since they derive from the same PTA2 (i.e. SVD); e.g.
-no-4. Figure 2 then allows one to select the PTs or associated PTs that successively capture the variability in X.
It is helpful to visualize the first PT, whose components are displayed in Fig. 3, as an optimal approximation to the initial data table in which each of the four layers is the same 2-D spatio-temporal “map”, but scaled by the weight for a particular LSM, given by ϕ1. The spatial pattern at each time is the same (ψ1, as in Fig. 3a), but with a weight appropriate to that particular time. Similarly, the time series at each spatial location is the same (φ1, shown as Fig. 3b), but with a weight appropriate to that location. To recover the NPP from this approximation at a particular position, time and for a given LSM, the corresponding values in ψ1, φ1 and ϕ1 are multiplied together and then multiplied by its singular value, σ1. Exactly the same construction applies to each of the rank-1 tensors in the decomposition.
The spatial effect (Fig. 3a), which is always positive, places higher weights in Sweden, the Baltic states and north-west Russia and lower values in Norway and northern Finland, with values varying between 22 % and 138 % of a uniform spatial weighting (i.e. equal weights of ). For display, the temporal component, a vector of length 193 (December 1997 to December 2013), has been split into annual segments which express the monthly weights over the 16-year period (Fig. 3b). As expected, there is a strong seasonal effect, with the summer months (June to August) having large positive weights, while values are very small from November to March and include negative values from December to February in nearly all years. Two other groups of months can be distinguished: October paired with April as just before or after winter and May with September as just before and after the seasonal peak of production. The months from May to September all display significant upward trends in NPP over the 16 years, with average increases of 1.48 %, 0.80 %, 0.63 %, 0.67 % and 0.51 % per annum respectively. The other months show no significant trends. April, May, June and August have more inter-annual variability than the other months, and April, May and June all show peaks in 2002. Over the 16 years, the maximum is in July 2013 and is 217 % greater than for uniform temporal weighting (), while the minimum in winter (December 2006) represents −8 % of uniform weighting.
Since these spatial and temporal patterns are the same for all the LSMs, the difference between them is expressed by the LSM weights (Fig. 3c). For identical LSM simulations, the weights would be 1∕2, since each vector in the decomposition is normalized to unity (i.e. ), but the weights lie between 0.460 and 0.523, with JULES and OR_HL respectively giving values 3 % and 5 % higher than for equal weights and OR_MICT giving a value 8 % lower. Hence there is only a weak dependence on the LSM in this first PT.
The proportion of the variability in the first PT due to each LSM is given by the squares of the LSM weights, i.e. 21.2 %, 27.4 %, 25.1 % and 26.3 % for OR_MICT, OR_HL, CLM5 and JULES respectively. Multiplying these values by gives the sum of squares of NPP in the spatio-temporal maps for
vs111 for each LSM (see Table.2). Several points should be noted about the approximation to X given by
The squares of the LSM weights are in the ratio , while the values of the original sum of squares of NPP (see Table 2) are in the ratio . Hence the first PT correctly picks up the ordering of the variability amongst the LSMs, but not its full value, since it is effectively a smoothing of the dataset.
The spatio-temporal maps for the individual LSMs capture 92.0 %, 86.5 %, 87.3 % and 93.1 % of the original variability of OR_MICT, OR_HL, CLM5 and JULES respectively. Hence each one is a reasonable approximation to the original LSM simulation, particularly for OR_MICT and JULES.
The mean NPP represented by
vs111is kg m−2 s−1, which is very close to that of the mean of X ( kg m−2 s−1), though the individual NPP spatio-temporal maps for each LSM track the original mean NPP less closely (+3.6 %, −0.7 %, +6.3 % and −5.6 % for OR_MICT, OR_HL, CLM5 and JULES respectively; see Table 2).
As noted above, recovering the NPP at a particular position and time and for a given LSM in
vs111 requires multiplying together the corresponding weights in the spatial, temporal and LSM dimensions and then multiplying by the singular value. So, for example, the maximum value of NPP in the first PT over the whole time period is in July 2013, in the darkest red cell of Fig. 3a and for OR_HL, the LSM with maximum weight. Since and in this cell the spatial, temporal and LSM weights are 0.040, 0.156 and 0.523 respectively, this yields a maximum NPP of kg m−2 s−1. There are small negative temporal weights from December to February, leading to negative values of NPP and an overall minimum NPP of kg m−2 s−1 in December 2006, which will again occur for OR_HL and at the same position as the overall maximum NPP.
Results from Table 2 and this first PT makes the important point that a single spatio-temporal pattern does well at capturing the NPP from the four LSMs. Whilst this expresses a common trend between the LSM, their weight similarity is up to 14 % differences, showing a variation in intensity from one to another. Despite similar photosynthesis modules in most LSMs, parameter settings, such as the choice of PFTs together with different climate datasets (GCM; see Table 1) and settings in other modules, induce these variations. The subsequent PTs provide a series of corrections to this common pattern, expressing LSM specificities, such as how the PFTs are parameterized.
The second best PT in the decomposition,
-no-6, is a temporally associated PT, so it has the same temporal component as
vs111 and expresses 3.72 % of the variability. Its spatial component (Fig. 4a) has positive (red) weights in the north and west and negative (green) weights to the south and east. The most striking feature of this tensor is in its LSM component (Fig. 4c) which shows a marked contrast between OR_HL, with a large negative weight, and the other LSMs, for which the weights are significantly smaller and positive. Hence, after multiplying the weights in the different components, all the LSMs except OR_HL will see an increase in NPP in the red areas in the summer months and decrease in the green areas, while the opposite effect occurs for OR_HL. When the temporal weights are negative, as occurs for most of the winter, these sign changes in NPP are reversed. As can be seen from Table 2, including the contribution from this PT increases the captured fraction of variability in OR_HL from 86.5 % to 95.9 %, with much smaller gains for the other LSMs. Therefore, PT
-no-6, mostly contributing to fitting OR_HL (9.36 % of its variability), is highlighting a specificity relatively to the others. Without ground truth, one cannot tell if this specificity is a bias or a better modelling than the other LSMs and just expresses a well-defined uncertainty.
Figure 5 shows the components of the third best PT,
-no-3, which captures 1.72 % of the variability and is associated with the same positive spatial pattern as PT
-no-1. Here the temporal effect is positive for the months from August to October, close to zero for November and July, and negative for the other months, especially April to June, which show higher inter-annual variations around their trends than the other months. CLM5 and JULES have large positive and negative weights respectively while OR_MICT has a smaller negative weight and OR_HL has a weight very close to zero. Hence for CLM5 this tensor acts to increase NPP from August to October and reduce it for all other months except November and July, while for JULES and OR_MICT it does the opposite. As expected from the weights, including this tensor mainly acts to improve the fit of CLM5 and JULES to their original values (Table 2). There are more between-year variations for the months of May to July than post-peak-production months, with September and October showing the most stable year-to-year variations among the months contributing to the tensor.
-no-9 is the fourth best in the decomposition and captures 1.38 % of total variability. Since it is associated with the LSM component of PT
-no-1 it is the same for all LSMs. Its spatial component Fig. 6a exhibits a strong latitudinal gradient with positive values in the north and negative values in the south. The temporal component has positive weights in July and August and negative values in April, May and June, while for other months the weights are near-zero. This is relatively constant over the 16-year period, the year-to-year variations being smaller than the separation of the three groups of months, except in 2002 where June joins the near zero group and July gets a significantly higher value than August whilst getting a significantly lower weight (close to the zero group) than August in 2012. Also, one must notice in 2006 a relatively parallel shift from 2005 values for all months having a contribution. The years 2002, 2006, 2011 and 2012, showing local temporal similarity for the growing season, correspond to extreme events mentioned in the literature (Høgda et al., 2013; Bjerke et al., 2014; Park et al., 2016). Hence, since the LSM weights are all positive, in July and August this tensor acts to increase NPP in the north and reduce it in the south, while in April to June it does the opposite. These effects will be slightly greater for OR_HL because of its greater weight. Though its contribution to the overall sum of squares is only 1.38 %, it provides improvements for all LSMs (see Table 2).
None of the other PTs contribute more than 1 % to the overall variability and their components are not displayed, although the contributions for all terms in Fig. 2 are given in Table 2. For example, the next best PT (
-no-11), which derives from the last line in Eq. (3), captures 0.42 % of the variability and principally improves the fit to the variability captured by CLM5 and, to a lesser extent, JULES. The summation of all 10 PTs that each represent at least 0.1 % of the variability captures 98.4 % of the variability in X and between 97.4 % and 99.0 % of the variability in the individual LSMs (last line of Table 2).
Overall, this analysis shows that a single optimal spatio-temporal pattern, with slightly different weights for the four LSMs (up to 14 % maximum difference), provides a reasonably good approximation for all their estimates of NPP, capturing between 87 % and 93 % of the variability in the individual models, as well as around 90 % of the variability in the combined LSM dataset. The next best adjustment to this pattern is a spatial correction that principally applies to OR_HL and significantly improves the fit of the approximation to this LSM, with only small improvements for the other LSMs. The second best adjustment adds a temporal pattern that mainly affects CLM5 and JULES and improves the fit to these LSMs, with less effect on OR_MICT and none on OR_HL. The third best adjustment adds a new spatio-temporal pattern whose spatial component is roughly the opposite of that in the first PT (i.e. it is spatially similar but with opposite signs) but a quite different temporal component that is positive in the later summer months, negative in the late spring and early summer months, and roughly zero at other times. The improvement in the overall fit from the next best PT and all succeeding ones is less than 0.9 %, and, although in two instances the fits to individual models improve by over 1 %, in most cases the improvements are much smaller (see Table 2).
Summing the 10 PTs whose individual contribution to the overall variability exceeds 0.1 % (Fig. 2) provides an approximation to the overall data table that captures 98.4 % of the overall variability and between 97.4 % and 99.0 % of the variability in the individual LSMs (Table 2). However, also of interest is the point-wise goodness of fit of the approximation, not just the variability it captures. This is represented by the table of residuals, i.e. the ϵ term in Eq. (2). Around 75 % of the absolute values of these residuals are less than 8.4 % of the overall mean NPP, so in most cases there is a good point-wise fit to the original data, but the maximum absolute value of the residuals () is around the third quartile of NPP (). Hence, in some cases the approximation may be significantly different from the correct value despite the residuals contributing less than 1.62 % to the overall variability.
Section 4 identified differences between the LSMs captured by an optimal decomposition of the associated three-way table. In this section we instead directly analyse the variability in the differences between the LSMs, in order to localize where and when the LSMs disagree and thus to quantify spatio-temporally the uncertainty in NPP associated with the choice of a particular LSM. We in fact analyse LSM differences normalized by the maximum value of NPP, i.e. (NNP1–NPP2) ∕ NPPmax, where NNP1 and NNP2 refer to NPP values in two different LSMs and NPPmax is the maximum NPP over all four LSMs. Note that for each pair of LSMs we have chosen arbitrarily whether to use (NPP1–NPP2) or (NPP2–NPP1). This choice of sign does not affect the PTAk optimization since this is based on the sum of squares, but the sign does matter when identifying which of a pair of LSMs gives higher NPP values. The sign convention used is indicated in the relevant figures (Figs. 9–11).
Figure 7 displays the histograms of (a) the absolute values of normalized differences, which has a peak near zero but also a fairly long right-hand tail, and (b) the absolute values of (NPP1–NPP2) ∕ (NPP1 + NPP2), which is fairly flat across most of the range, with a small peak near zero, but with a large peak near 1. The latter indicates that for many times and places the NPP values in one LSM are very small relative to one of the other LSMs. This occurs much more frequently in winter when CLM5 gives NPP values that are very small compared to those from the other LSMs. However, since NPP is small in winter, these large relative differences have little impact on overall annual production. Indeed Table 2 shows that the mean annual NPP from CLM5 exceeds that from OR_MICT.
The results of the PTA3 for the table of normalized NPP differences are shown in Fig. 8. The first and second PTs respectively extract 43.4 % and 21.7 % of the variation, both with well-structured patterns in their components. The first, shown in Fig. 9, has a spatial pattern with negative (green) value areas to the south and east, and positive (red) values in the north and west, as well as in south-west Finland. The temporal component is always positive and displays a seasonal effect (Fig. 9b) with the same ordering of the months as Fig. 3. However, despite being very similar to the temporal pattern in Fig. 3, it shows more year-to-year variations, and the May profile differs from September with an increasing trend. All the differences involving OR_HL have significant weights but for the other differences they are close to zero. Hence the effects of this principal vector essentially translate into differences between OR_HL and the other LSMs. Taking into account the signs of the spatial, temporal and LSM weights (the last to be interpreted as an LSM difference), this means that for this PT over the whole time period CLM5 > JULES > OR_MICT > OR_HL in the red areas in Fig. 9a while these orderings are reversed in the green areas. However, the small weights on the differences not involving OR_HL indicate that the other LSMs all give similar NPP values.
The second best PT, Fig. 10, expressing 21.7 % of the variability, has a quite complex positive spatial pattern, with the strongest effects in northern Finland and the weakest near Lake Ladoga, Russia, in the south-east of the region. The temporal weights are positive in June, May and to a lesser extent April; weakly positive for the winter months from December to March; nearly zero in July; variable but mainly negative in November; and negative from August to October. The weights for all differences involving JULES are negative but are positive for the other differences. This means that for this PT in all locations and for all years JULES > OR_MICT > OR_HL > CLM5 from April to June, but this ordering is reversed from August to October. This is a persistent monthly pattern but with much greater inter-annual variability from May to July than in the other months. So this ordering of LSMs is more sensitive to yearly variations from April to June than its reverse counterpart during post peak production months from August to October.
The third and fourth most important PTs,
7, are associated with the temporal component of
vs111 and capture 10.94 % and 4.85 % of the variability respectively. Their spatial and LSM components are depicted in Fig. 11. The first displays little spatial structure apart from significant negative values along the east coast of Sweden. This may be due to differences in data resolution before grid transformation but also occurs where C3 grass is the dominant PFT (all LSMs). All LSM differences have positive weights except CLM5 – JULES, which is negative but small, and all differences involving OR_MICT have significantly larger weights than other combinations. Since the temporal component is positive everywhere (Fig. 9b), the net effect is that OR_MICT > OR_HL > JULES > CLM5 in the red areas (with a high value north of Lake Ladoga), and this order is reversed in the green areas. However, the differences between LSMs other than those involving OR_MICT are small.
The spatial component of PT
-no-7 (Fig. 11c) is weakly positive except for a very small area near Tromsø in northern Norway. All the LSM differences involving JULES have negative weights and have greater magnitude than the other differences, which are all positive, meaning that JULES > OR_HL > OR_MICT > CLM5 everywhere except near Tromsø, where this ordering is reversed.
As indicated by Fig. 8, the next best PTs (not displayed) are
-no-13, associated with the spatial component of vs222,
10, associated with the LSM component of
-no-19, associated with the LSM component of vs222. Hence PT
-no-13 modulates the temporal pattern of differences depicted in Fig. 10 with a distinct temporal pattern that has different positive weights for each of the LSM differences (the contribution from OR_HL – CLM5 is almost zero and OR_MI – JULES gets the larger positive weight). A contrast between July (positive weights) and May (negative weights) stands out clearly from the other months by the size of its contribution to the variability, for reasons which are not clear. In Fig. 10, July and OR_MI – JULES weights were close to zero. Because PTs
10 are associated with the LSM component of
vs111, the spatio-temporal table given by summing the spatial × temporal terms in all three PTs can be analysed together; this would mainly reveal spatio-temporal differences between OR_HL and the other LSMs (see Fig. 9c). However, this combined analysis cannot be displayed as separate spatial and temporal plots. With the same LSM weights as in Fig. 10, PT
-no-19 exhibits a clear north–south gradient and a temporal pattern in which June clearly contributes more to the variability than the other months. This is similar to what is seen for July in PT
-no-13, again for unknown reasons. All the rest of the PTs cumulatively contribute only 10 % to the overall variability and individually less than 0.8 %.
Also analysed was the variability in the quantity ∣NPP1 – NPP NPP1 + NPP2∣ but this is not displayed, since its main contribution is to show that the large peak near 1 seen in Fig. 7b can mainly be attributed to small values of CLM5 relative to the other LSMs in winter in the north of the region.
The analysis in this section adds significantly to that in Sect. 4 by providing specific information on the times and places where the LSMs differ and by how much. However, in this case no single spatio-temporal pattern strongly dominates the variability so interpretation of the analysis requires consideration of several such patterns. Nonetheless, the three best PTs capture around 76 % of the variability in the LSM differences. The first essentially tells us that over a well-defined spatial pattern and a clearly ordered temporal pattern that with a maximum in summer and a minimum in winter OR_HL gives different values from the other LSMs, which are all similar. The second PT principally identifies times and places where CLM5 differs from the other LSMs, while the third does the same for OR_MICT.
This section analyses the effects of different GCM drivers on the NPP estimated by JULES, so it is a partial answer to question (i) in Sect. 1. Two global warming scenarios that stabilize at 1.5 and 2.0 ∘C above pre-industrial levels by the year 2100 were used, with 34 GCMs as climate forcing in JULES (Comyn-Platt et al., 2020). The ensemble of the GCMs is taken to represent the uncertainty in climate prediction, from which one can get an idea of the associated uncertainty in the JULES estimates of NPP. Note however, that this commonly used approach to quantifying climate uncertainty is not entirely satisfactory, since it identifies inter-GCM model variability with the internal uncertainty in climate prediction (Hawkins and Sutton, 2009; Kay et al., 2015).
For each scenario a PTA3 analysis was performed on a spatial × temporal × GCM table. The decompositions for both the 1.5 and 2.0 ∘C targets capture almost all the variation in their first PT (99.15 % and 99.16 % respectively); hence very similar spatio-temporal patterns of NPP are produced whichever GCM is used. The spatial patterns are shown in Figs. 12a and 13a. The temporal and GCM weights are given as a percentage relative deviation from uniform weighting, i.e. , where cp indicates the weight while unif = for the temporal dimension and unif = for the GCMs.
Over the 100 years, all months exhibit an initial increase, which is sharper for the 2 ∘C scenario, followed by a flattening out and minor decrease; this decrease sets in around 2070 for the 1.5 ∘C scenario and slightly later for the 2.0 ∘C scenario. The maximum increase from 2000 (indicated on each monthly curve in Figs. 12b and 13b) is higher in every month for the 2.0 ∘C scenario, e.g. 20 % and 32 % in July for the 1.5 and 2.0 ∘C case respectively. The differences between the GCMs are indicated by histograms of the relative deviation of the GCM weights from uniform weighting (Figs. 12c and 13c). These differences are up to 7 % for the 2.0 ∘C scenario and 4.5 % for the 1.5 ∘C scenario. For both scenarios, the groups of GCMs giving the lowest or highest difference from equal weighting were the same, though the precise ordering was different (see Appendix D). If the singular value associated with this first PT are expressing the same amount of variability, the latter is 10 % higher for the 2.0 ∘C case than for 1.5 ∘C, which simply expresses the sharper increase in NPP values produced under a more intense warming.
This paper investigates the uncertainty associated with choosing a given LSM and GCM to predict the effects of climate change on net primary production in northern Europe. More precisely, it provides a spatio-temporal analysis that captures the principal similarities and differences between LSM estimates of NPP, which need to be taken into account if these LSMs are to be used to provide scenarios for applications. Its primary motivation is to provide information relevant to studying climate-sensitive infections (CSIs), but here the CSI context is used only to reduce the number of LSMs to those that contain adequate descriptions of key high-latitude processes. NPP was chosen as a representative, substantial output variable from any LSM. It is based on a multiway data science methodology that extends the SVD of a matrix to a multitable in order to analyse spatio-temporal variations between LSMs. This allows quantification of the similarities and differences between the LSMs structured into spatio-temporal patterns that identify where, when and between which LSMs they occur, together with analysis of the variability arising from using different climate forcing models (GCMs) by then reflecting on the arising uncertainties when estimating NPP.
Detailed results of each multiway data analysis are given at the end of each section and we focus here on integrating the many highlighted aspects of uncertainties arising from comparing the four LSMs: OR_MICT, OR_HL, CLM5 and JULES.
Global statistical differences were found between the LSMs, with OR_MICT exhibiting significantly lower mean NPP and variability than the other LSMs, and CLM5 producing a very high proportion of low values associated with the winter season, particularly in the north of the CLINF region. However, all the LSMs tend to agree for higher NPP values (above the 70 % decile), which mainly indicates that they give similar values in summer. Despite these global differences, to a first approximation the spatio-temporal behaviour of all the LSMs could be well-fitted by the tensor product of a single spatial and temporal pattern, in which the west and north of the region exhibited lower NPP values than the east and south, and there was a strong seasonal pattern. Differences between LSMs for this single pattern were fairly small, with weights lying between 92 % and 105 % of a uniform weighting of 0.5 or 14 % maximum difference between them. This combined pattern captured around 90 % of the overall variability in simulations covering 16 years for the whole Fennoscandian region. Across this time period, this first approximation displayed statistically significant increases in NPP from May to September, with the largest increase in the earlier months. This is likely to be caused by the growing season starting earlier and lasting longer.
The LSM requiring the most adjustment to this first approximation for an improved fit was OR_HL; this adjustment is in the spatial pattern, decreasing the spatial weights in Norway and northern Finland and increasing them in Sweden and southern Finland. The next adjustment, which has no effect on OR_HL, is to modify the temporal pattern; this particularly improves the fit to CLM5. The approximation achieved with just these adjustments captures 95 % of the overall variation and between 93 % and 96 % of the variation in the individual models. It also has the advantage of being fairly simple to interpret because OR_HL dominates the first adjustment while CLM5 (and to a lesser extent JULES) dominates the second. Further terms in the approximation yield smaller gains that tend to be spread more evenly across the LSMs.
While the first analysis provides information on temporal and spatial patterns characterizing the main common structure in the LSM estimates of NPP, together with a systematic analysis of how different models diverge from this pattern, more specific information on how they differ from each other is gained by analysing their differences. Here no single pattern dominates the overall variability between the LSMs, but the three best PTs capture around 76 % of the variability in the LSM differences and can be fairly well-interpreted in terms of how individual LSMs differ in space and time from the others. Successively they show where and when individually OR_HL, CLM5 and OR_MICT differ from the other LSMs, and also where different LSMs agree. Moreover, this analysis on differences enabled us to add specificities (geographical and temporal differences) to each LSM, e.g. OR_HL with a large difference in summer for NPP values with any other models (with similar values) in most parts of Finland and eastern Sweden and CLM5 with smaller NPP values than the others in May and June but higher values from August to October. Quantitatively the former represents 43.37 % of variability of the differences and the latter 21.7 %.
Besides the main trend of increase in NPP over the 16 years (first analysis), 24 % in May and down to 8 % in September (see Fig. 3b), no other noticeable year-to-year patterns were identified in both analyses based on comparing the four LSMs and in the 100-year horizon analysis with JULES. The year-to-year variations were less important than intra-annual patterns, either seasonal or other months' patterns. In other words, the monthly patterns were relatively steady over the 16-year period. However, within the monthly patterns the between-year variations could be very different, illustrating either relatively steady monthly patterns with differences among the LSMs (e.g. in Fig. 3) or very variable intensity of the monthly pattern from year to year expressed by a principal tensor, some with similar variability across months (e.g. Fig. 9) or with different levels of uncertainty for certain months (e.g. in Figs. 5 and 10). These various levels of inter-annual variability linked to the effects (i.e. month pattern, spatial and LSM differences) already described are to modulate the uncertainties associated with LSM choice.
Our analysis of the impact of the choice of GCM on the simulations of NPP was restricted to runs with JULES out to 2100 driven by 34 different GCMs. This showed that a single spatio-temporal pattern captured over 99 % of the variability of NPP in the combined dataset for climate change scenarios, leading to either 1.5 or 2.0 ∘C atmospheric warming, and that none of the GCM weightings differed by more than 3 % from uniform weighting (maximum difference of 6 %). The temporal pattern showed increases in NPP up to the 2070s, with small decreases thereafter. Although this analysis was only carried out for JULES, there is no reason to expect significantly different findings for the other LSMs, since they all use a form of the Farquhar photosynthesis model to derive gross primary production, of which some fraction is allocated to NPP. Moreover, this single PT expressing 99 % of variability highlights a strong effect correlated temporally to the findings of the first analysis with the four LSMs (Fig. 3). Hence the insensitivity of the simulated NPP to the choice of GCM is likely to be repeated in the other LSMs.
We return to the three key questions posed in Sect. 1:
How does the choice of the GCM affect the CSI-relevant outputs of a given LSM?
For a given GCM, how different are the CSI-relevant outputs of the different LSMs?
How do the joint effects of GCM and LSM differences translate into variability in predictions of CSI-relevant quantities?
The analysis in this paper suggests that, at least for NPP, we can neglect the effect of different GCMs and need only deal with question (ii). Quantitative answers are provided to this question in terms of both spatio-temporal patterns and differences and similarities of LSMs. However, we have only considered one of the six variables listed at the start of Sect. 2 that are considered to be of major importance for climate-sensitive infections (CSIs), and we may find different behaviour for the others. In particular, initial investigation indicates very different representations of land cover between the four LSMs and how land cover will evolve under climate change in the 21st century. This variable is likely to be the one showing the most differences between the LSMs because it is very much controlled by the PFTs used, how they are parameterized and the rules by which PFTs compete over time.
Of significant interest would be analysis of multiple variables and their co-variation. We intended to address this issue in a future paper using the PTAk method used here, since this can be readily extended to multiple variables. While this does not present any methodological difficulties, it will only become clear how useful this is when we find how easy it is to interpret the outputs of the analysis.
The next major step is to couple the findings from this paper (and its extension to other variables) to ecological models for CSI vectors and statistical epidemiological models in order to establish the sensitivity of predicted CSI behaviour under climate change to the choice of GCM and LSM. Currently only a small number of CSIs have well-developed predictive models (notably tularemia (Rydén et al., 2012; Andersen and Davis, 2017; Desvars-Larrive et al., 2017); Lyme disease (Simon et al., 2014; Li et al., 2016)) and these will provide the basis for such a study. However, CLINF is in the process of developing more comprehensive statistical CSI models at high latitudes, which will lend themselves readily to combination with the approach adopted in this paper. Besides understanding better the variations from one LSM to another, geographically and temporally, which are important aspects in CSI models, the methodology developed in this paper allows some controls on the predictive uncertainty arising from choosing one LSM. In particular, the variations in CSI prediction due to the use of different LSMs can be systematically analysed as the result of a sequence of progressively less important deviations from an overall common pattern, for NPP as predictor in this paper.
Note the slightly different scientific notations throughout the paper, for example 0.00000002 as or or .
For X and Y two multiway data tables , their inner product is defined as . The contraction operation is the extension to tensors of the linear combination of the columns or rows of a matrix to give a vector. If X is a tensor of the order of 3, equivalent to a table , then with the variables and vectors of length n, p and q respectively, the contraction is a p×q matrix with , the contraction is a n×q matrix with , and is a n×p matrix with . Contacting X successively by two vectors gives for example , and is equivalent to the inner product for the multiway data tables.
B2 Orthogonal projector
Without loss of generality let a, b and c be unit vectors of dimensions n, p and q respectively. If X is a tensor represented by an array, one can write , where is the linear orthogonal projection of X onto and . From the orthogonality constraints, , so .
Moreover, if then . This property extends easily to any subspace of E, F and G; i.e is equivalent to .
This appendix lists the PFTs for the versions of the LSMs used in this paper (see Sect. 1.2). JULES, ORCHIDEE_MICT (OR_MICT), ORCHIDEE-HL-Veg (OR_HL) and CLM5 have 14, 13, 16 and 15 PFT PFTs respectively. The version of JULES used for the 34 simulations over 100 years used 10 PFTs (where C3 or C4 crops or pastures are set as C3 or C4 grass).
The analyses were performed using the R package PTAk (https://CRAN.R-project.org/package=PTAk, last access: 2 March 2020, Leibovici, 2010). The multiway data tables used in the paper can be requested from the first author. CLM5.0 is publicly available through the Community Terrestrial System Model (CTSM) git repository (https://github.com/ESCOMP/ctsm, last access: 2 March 2020, Lawrence et al., 2019); all model data are archived and publicly available at the UCAR/NCAR Climate Data Gateway, https://doi.org/10.5065/d6154fwh, last access: 2 March 2020, Oleson, 2018.
EC-P, GH, MVM, MG, AD, DZ and PC provided data and comments on the draft manuscript. DGL designed the methodologies, performed the analyses and drafted the manuscript. SQ and DGL finalized the article.
The authors declare that they have no conflict of interest.
This work was carried out under NordForsk funding to CLINF, a Nordic Centre of Excellence (NCoE) led by Professor Birgitta Evengård (https://www.nordforsk.org/en/programmes-and-projects/projects/climate-change-effects-on-the-epidemiology-of-infectious-diseases-and-the-impacts-on-northern-societies-clinf, last access: 2 March 2020). Maria Val Martin was supported by the Leverhulme Trust through a Leverhulme Research Centre Award (RC-2015-029).
This research has been supported by the NordForsk (grant no. 76413).
This paper was edited by Alexey V. Eliseev and reviewed by two anonymous referees.
Andersen, L. K. and Davis, M. D. P.: Climate change and the epidemiology of selected tick-borne and mosquito-borne diseases: update from the International Society of Dermatology Climate Change Task Force, Int. J. Dermatol., 56, 252–259, https://doi.org/10.1111/ijd.13438, 2017. a, b, c
Asghar, N., Petersson, M., Johansson, M., and Dinnetz, P.: Local landscape effects on population dynamics of Ixodes ricinus, Geospatial Health, 11, 283–289, https://doi.org/10.4081/gh.2016.487, 2016. a
Bjerke, J. W., Karlsen, S. R., Høgda, K. A., Malnes, E., Jepsen, J. U., Lovibond, S., Vikhamar-Schuler, D., and Tømmervik, H.: Record-low primary productivity and high plant damage in the Nordic Arctic Region in 2012 caused by multiple weather events and pest outbreaks, Environ. Res. Lett., 9, 084006, https://doi.org/10.1088/1748-9326/9/8/084006, 2014. a
Blomgren, E., Hesson, J. C., Schäfer, M. L., and Lundström, J. O.: Pest occurrence of Aedes rossicus close to the Arctic Circle in northern Sweden, J. Vector Ecol., 43, 36–43, https://doi.org/10.1111/jvec.12280, 2018. a, b
Booth, T. H., Nix, H. A., Busby, J. R., and Hutchinson, M. F.: bioclim: the first species distribution modelling package, its early applications and relevance to most current MaxEnt studies, Divers. Distrib., 20, 1–9, https://doi.org/10.1111/ddi.12144, 2014. a
Burke, E. J., Ekici, A., Huang, Y., Chadburn, S. E., Huntingford, C., Ciais, P., Friedlingstein, P., Peng, S., and Krinner, G.: Quantifying uncertainties of permafrost carbon–climate feedbacks, Biogeosciences, 14, 3051–3066, https://doi.org/10.5194/bg-14-3051-2017, 2017. a
Carvalho, C., Lopes de Carvalho, I., Zé-Zé, L., Núncio, M., and Duarte, E.: Tularaemia: A challenging zoonosis, Comparative Immunology, Microbiology and Infectious Diseases, 37, 85–96, https://doi.org/10.1016/j.cimid.2014.01.002, 2014. a
Cayol, C., Koskela, E., Mappes, T., Siukkola, A., and Kallio, E. R.: Temporal dynamics of the tick Ixodes ricinus in northern Europe: Epidemiological implications, Parasite. Vector., 10, 166, https://doi.org/10.1186/s13071-017-2112-x, 2017. a
Clark, D. B., Mercado, L. M., Sitch, S., Jones, C. D., Gedney, N., Best, M. J., Pryor, M., Rooney, G. G., Essery, R. L. H., Blyth, E., Boucher, O., Harding, R. J., Huntingford, C., and Cox, P. M.: The Joint UK Land Environment Simulator (JULES), model description – Part 2: Carbon fluxes and vegetation dynamics, Geosci. Model Dev., 4, 701–722, https://doi.org/10.5194/gmd-4-701-2011, 2011. a, b
Comyn-Platt, E., Hayman, G., Huntingford, C., et al.: Carbon budgets for 1.5 and 2 ∘C targets lowered by natural wetland and permafrost feedbacks, Nat. Geosci., 11, 568–573, https://doi.org/10.1038/s41561-018-0174-9, 2018. a, b, c, d, e
Demšar, U., Harris, P., Brunsdon, C., Fotheringham, A. S., and McLoone, S.: Principal Component Analysis on Spatial Data: An Overview, Ann. Assoc. Am. Geogr., 103, 106–128, https://doi.org/10.1080/00045608.2012.689236, 2013. a
Desvars-Larrive, A., Liu, X., Hjertqvist, M., SjöStedt, A., Johansson, A., and RydéN, P.: High-risk regions and outbreak modelling of tularemia in humans, Epidemiol. Infect., 145, 482–490, https://doi.org/10.1017/S0950268816002478, 2017. a
Druel, A., Peylin, P., Krinner, G., Ciais, P., Viovy, N., Peregon, A., Bastrikov, V., Kosykh, N., and Mironycheva-Tokareva, N.: Towards a more detailed representation of high-latitude vegetation in the global land surface model ORCHIDEE (ORC-HL-VEGv1.0), Geosci. Model Dev., 10, 4693–4722, https://doi.org/10.5194/gmd-10-4693-2017, 2017. a, b, c
Druel, A., Ciais, P., Krinner, G., and Peylin, P.: Modeling the vegetation dynamics of northern shrubs and mosses in the ORCHIDEE land surface model, J. Adv. Model. Earth Sy., 11, 2020–2035, https://doi.org/10.1029/2018MS001531, 2019. a
Ebi, K. L., Ogden, N. H., Semenza, J. C., and Woodward, A.: Detecting and Attributing Health Burdens to Climate Change, Detecting and Attributing Health Burdens to Climate Change, Environ. Health Persp., 125, 085004–085004, https://doi.org/10.1289/EHP1509, 2017. a
Frelat, R., Lindegren, M., Denker, T. S., Floeter, J., Fock, H. O., Sguotti, C., Stäbler, M., Otto, S. A., and Möllmann, C.: Community ecology in 3D: Tensor decomposition reveals spatio-temporal dynamics of large ecological communities, PLOS ONE, 12, e0188205, https://doi.org/10.1371/journal.pone.0188205, 2017. a
Gilbert, L.: How landscapes shape Lyme borreliosis risk, in: Ecology and Control of Vector-borne diseases, edited by: Braks, M. A., van Wieren, S. E., Takken, W., and Sprong, H., Wageningen Academic Publishers, The Netherlands, 4, 161–171, https://doi.org/10.3920/978-90-8686-838-4_11, 2016. a
Guimberteau, M., Zhu, D., Maignan, F., Huang, Y., Yue, C., Dantec-Nédélec, S., Ottlé, C., Jornet-Puig, A., Bastos, A., Laurent, P., Goll, D., Bowring, S., Chang, J., Guenet, B., Tifafi, M., Peng, S., Krinner, G., Ducharne, A., Wang, F., Wang, T., Wang, X., Wang, Y., Yin, Z., Lauerwald, R., Joetzjer, E., Qiu, C., Kim, H., and Ciais, P.: ORCHIDEE-MICT (v8.4.1), a land surface model for the high latitudes: model description and validation, Geosci. Model Dev., 11, 121–163, https://doi.org/10.5194/gmd-11-121-2018, 2018. a, b
Harshman, R. A.: Foundations of the PARAFAC Procedure: Models and Conditions for “an Explanatory” Multi-Modal Factor Analysis, UCLA Working Papers in Phonetics 16, UCLA, (UMI Serials in Microform, No. 10085), 1970. a, b
Høgda, K. A., Tømmervik, H., and Karlsen, S. R.: Trends in the start of the growing season in Fennoscandia 1982–2011, Remote Sens., 5, 4304–4318, 2013. a
IPCC AR5 WG2 A: Climate Change 2014: Impacts, Adaptation, and Vulnerability, Part A: Global and Sectoral Aspects, Contribution of Working Group II (WG2) to the Fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change (IPCC), 2014. a
Jaenson, T. G., Jaenson, D. G., Eisen, L., Petersson, E., and Lindgren, E.: Changes in the geographical distribution and abundance of the tick Ixodes ricinus during the past 30 years in Sweden, Parasite Vector, 5, PMC3311093, https://doi.org/10.1186/1756-3305-5-8, 2012. a
Jore, S., Vanwambeke, S. O., Viljugrein, H., Isaksen, K., Kristoffersen, A. B., Woldehiwet, Z., Johansen, B., Brun, E., Brun-Hansen, H., Westermann, S., Larsen, I.-L., Ytrehus, B., and Hofshagen, M.: Climate and environmental change drives Ixodes ricinus geographical expansion at the northern range margin, Parasite Vector, 7, 11, https://doi.org/10.1186/1756-3305-7-11, 2014. a
Kautto, A. H., Alenius, S., Mossing, T., Becher, P., Belák, S., and Larska, M.: Pestivirus and alphaherpesvirus infections in Swedish reindeer (Rangifer tarandus tarandus L.), Vet. Microbiol., 156, 64–71, https://doi.org/10.1016/j.vetmic.2011.10.018, 2012. a
Kay, J. E., Deser, C., Phillips, A., Mai, A., Hannay, C., Strand, G., Arblaster, J. M., Bates, S. C., Danabasoglu, G., and Edwards, J.: The Community Earth System Model (CESM) large ensemble project: A community resource for studying climate change in the presence of internal climate variability, B. Am. Meteorol. Soc., 96, 1333–1349, 2015. a
Koca, D., Smith, B., and Sykes, M. T.: Modelling Regional Climate Change Effects On Potential Natural Ecosystems in Sweden, Clim. Change, 78, 381–406, https://doi.org/10.1007/s10584-005-9030-1, 2006. a
Laaksonen, M., Sajanti, E., Sormunen, J. J., Penttinen, R., Hänninen, J., Ruohomäki, K., Sääksjärvi, I., Vesterinen, E. J., Vuorinen, I., Hytönen, J., and Klemola, T.: Crowdsourcing-based nationwide tick collection reveals the distribution of Ixodes ricinus and I. persulcatus and associated pathogens in Finland, Emerg. Microbes Infec., 6, e31, https://doi.org/10.1038/emi.2017.17, 2017. a
Lambin, E. F., Tran, A., Vanwambeke, S. O., Linard, C., and Soti, V.: Pathogenic landscapes: Interactions between land, people, disease vectors, and their animal hosts, Int. J. Health Geogr., 9, 54, https://doi.org/10.1186/1476-072X-9-54, 2010. a
Lawrence, D., Fisher, R., Koven, C., Oleson, K. W., Swenson, S., and et al.: The Community Land Model version 5 (CLM5), J. Adv. Model. Earth Sy., 11, 4245–4287, https://doi.org/10.1029/2018MS001583, 2019. a, b, c
Lawrence, D. M., Hurtt, G. C., Arneth, A., Brovkin, V., Calvin, K. V., Jones, A. D., Jones, C. D., Lawrence, P. J., de Noblet-Ducoudré, N., Pongratz, J., Seneviratne, S. I., and Shevliakova, E.: The Land Use Model Intercomparison Project (LUMIP) contribution to CMIP6: rationale and experimental design, Geosci. Model Dev., 9, 2973–2998, https://doi.org/10.5194/gmd-9-2973-2016, 2016. a
Leibovici, D. and Sabatier, R.: A Singular Value Decomposition of k-Way Array for a Principal Component Analysis of Multiway Data, PTA-k, Linear Algebra Appl., 269, 307–329, 1998. a
Li, S., Gilbert, L., Harrison, P. A., and Rounsevell, M. D. A.: Modelling the seasonality of Lyme disease risk and the potential impacts of a warming climate within the heterogeneous landscapes of Scotland, J. R. Soc. Interface, 13, 27030039, https://doi.org/10.1098/rsif.2016.0140, 2016. a, b
Metcalf, C. J. E., Walter, K. S., Wesolowski, A., Buckee, C. O., Shevliakova, E., Tatem, A. J., Boos, W. R., Weinberger, D. M., and Pitzer, V. E.: Identifying climate drivers of infectious disease dynamics: recent advances and challenges ahead, P. Roy. Soc. A, 284, 28814655, https://doi.org/10.1098/rspb.2017.0901, 2017. a
Ostfeld, R.: Lyme disease: the ecology of a complex system, OUP USA, 216 pp., 2010. a
Oleson, K., Lawrence, D., Lombardozzi, D., and Wieder, W.: CLM land-only release, https://doi.org/10.5065/d6154fwh, 2018. a
Overland, J. E., Wang, M., Walsh, J. E., and Stroeve, J. C.: Future Arctic climate changes: Adaptation and mitigation time scales, Earth's Future, 2, 68–74, https://doi.org/10.1002/2013EF000162, 2014. a
Park, T., Ganguly, S., Tømmervik, H., Euskirchen, E. S., Høgda, K.-A., Karlsen, S. R., Brovkin, V., Nemani, R. R., and Myneni, R. B.: Changes in growing season duration and productivity of northern vegetation inferred from long-term remote sensing data, Environ. Res. Lett., 11, 084001, https://doi.org/10.1088/1748-9326/11/8/084001, 2016. a
Pauchard, A., Milbau, A., Albihn, A., Alexander, J., Burgess, T., Daehler, C., Englund, G., Essl, F., Evengård, B., Greenwood, G. B., Haider, S., Lenoir, J., McDougall, K., Muths, E., Nuñez, M. A., Olofsson, J., Pellissier, L., Rabitsch, W., Rew, L. J., Robertson, M., Sanders, N., and Kueffer, C.: Non-native and native organisms moving into high elevation and high latitude ecosystems in an era of climate change: new challenges for ecology and conservation, Biol. Invasions, 18, 345–353, https://doi.org/10.1007/s10530-015-1025-x, 2016. a
Poulter, B., MacBean, N., Hartley, A., Khlystova, I., Arino, O., Betts, R., Bontemps, S., Boettcher, M., Brockmann, C., Defourny, P., Hagemann, S., Herold, M., Kirches, G., Lamarche, C., Lederer, D., Ottlé, C., Peters, M., and Peylin, P.: Plant functional type classification for earth system models: results from the European Space Agency's Land Cover Climate Change Initiative, Geosci. Model Dev., 8, 2315–2328, https://doi.org/10.5194/gmd-8-2315-2015, 2015. a
Rafique, R., Zhao, F., de Jong, R., Zeng, N., and Asrar, G. R.: Global and Regional Variability and Change in Terrestrial Ecosystems Net Primary Production and NDVI: A Model-Data Comparison, Remote Sens., 8, 177, https://doi.org/10.3390/rs8030177, 2016. a
Rose, H., Wang, T., van Dijk, J., and Morgan, E. R.: GLOWORM-FL: A simulation model of the effects of climate and climate change on the free-living stages of gastro-intestinal nematode parasites of ruminants, Ecol. Model., 297, 232–245, https://doi.org/10.1016/j.ecolmodel.2014.11.033, 2015. a
Ruscio, B. A., Brubaker, M., Glasser, J., Hueston, W., and Hennessy, T. W.: One Health – a strategy for resilience in a changing arctic, Int. J. Circumpol. Heal., 74, 27913, https://doi.org/10.3402/ijch.v74.27913, 2015. a
Rydén, P., Björk, R., Schäfer, M. L., Lundström, J. O., Petersén, B., Lind, A., Forsman, M., Sjöstedt, A., and Johansson, A.: Outbreaks of Tularemia in a Boreal Forest Region Depends on Mosquito Prevalence, The Journal of Infectious Diseases, 205, 297–304, https://doi.org/10.1093/infdis/jir732, 2012. a, b
Sajanti, E., Virtanen, M., Helve, O., Kuusi, M., Lyytikäinen, O., Hytönen, J., and Sane, J.: Lyme Borreliosis in Finland, 1995–2014, Emerg. Infect. Dis., 23, 1282–1288, https://doi.org/10.3201/eid2308.161273, 2017. a
Simon, J. A., Marrotte, R. R., Desrosiers, N., Fiset, J., Gaitan, J., Gonzalez, A., Koffi, J. K., Lapointe, F.-J., Leighton, P. A., Lindsay, L. R., Logan, T., Milord, F., Ogden, N. H., Rogic, A., Roy-Dufresne, E., Suter, D., Tessier, N., and Millien, V.: Climate change and habitat fragmentation drive the occurrence of Borrelia burgdorferi, the agent of Lyme disease, at the northeastern limit of its distribution, Evol. Appl., 7, 750–764, https://doi.org/10.1111/eva.12165, 2014. a, b
Sormunen, J. J., Klemola, T., Vesterinen, E. J., Vuorinen, I., Hytönen, J., Hänninen, J., Ruohomäki, K., Sääksjärvi, I. E., Tonteri, E., and Penttinen, R.: Assessing the abundance, seasonal questing activity, and Borrelia and tick-borne encephalitis virus (TBEV) prevalence of Ixodes ricinus ticks in a Lyme borreliosis endemic area in Southwest Finland, Ticks Tick-Borne Dis., 7, 208–215, https://doi.org/10.1016/j.ttbdis.2015.10.011, 2016. a
Takeuchi, K., Kawahara, Y., and Iwata, T.: Structurally Regularized Non-negative Tensor Factorization for Spatio-Temporal Pattern Discoveries, in: Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, Springer, 582–598, https://doi.org/10.1007/978-3-319-71249-9_35, 2017. a
Tucker, L.: Some mathematical notes on three-mode factor analysis, Psychometrika, 31, 279–311, 1966. a
Waits, A., Emelyanova, A., Oksanen, A., Abass, K., and Rautio, A.: Human infectious diseases and the changing climate in the Arctic, Environ. Int., 121, 703–713, 2018. a
White, L. A., Forester, J. D., and Craft, M. E.: Dynamic, spatial models of parasite transmission in wildlife: Their structure, applications and remaining challenges, J. Anim. Ecol., 87, 559–580, 2018. a
Zuliani, A., Massolo, A., Lysyk, T., Johnson, G., Marshall, S., Berger, K., and Cork, S. C.: Modelling the Northward Expansion of Culicoides sonorensis (Diptera: Ceratopogonidae) under Future Climate Scenarios, PLOS ONE, 10, e0130294, https://doi.org/10.1371/journal.pone.0130294, 2015. a, b