The dynamics of biochemical processes in terrestrial ecosystems are tightly coupled to local meteorological conditions. Understanding these interactions is an essential prerequisite for predicting, e.g. the response of the terrestrial carbon cycle to climate change. However, many empirical studies in this field rely on correlative approaches and only very few studies apply causal discovery methods. Here we explore the potential for a recently proposed causal graph discovery algorithm to reconstruct the causal dependency structure underlying biosphere–atmosphere interactions. Using artificial time series with known dependencies that mimic real-world biosphere–atmosphere interactions, we address the influence of non-stationarities, i.e. periodicity and heteroscedasticity, on the estimation of causal networks. We then investigate the interpretability of the method in two case studies. Firstly, we analyse three replicated eddy covariance datasets from a Mediterranean ecosystem. Secondly, we explore global Normalised Difference Vegetation Index time series (GIMMS 3g), along with gridded climate data to study large-scale climatic drivers of vegetation greenness. We compare the retrieved causal graphs to simple cross-correlation-based approaches to test whether causal graphs are considerably more informative. Overall, the results confirm the capacity of the causal discovery method to extract time-lagged linear dependencies under realistic settings. For example, we find a complete decoupling of the net ecosystem exchange from meteorological variability during summer in the Mediterranean ecosystem. However, cautious interpretations are needed, as the violation of the method's assumptions due to non-stationarities increases the likelihood of detecting false links. Overall, estimating directed biosphere–atmosphere networks helps unravel complex multidirectional process interactions.
Unlike classical correlative approaches, our findings are constrained to a few meaningful sets of relations, which can provide powerful insights for the evaluation of terrestrial ecosystem models.

The terrestrial biosphere responds to atmospheric drivers such as radiation intensity, temperature, vapour pressure deficit, and composition of trace gases. On the other hand, the biosphere influences the atmosphere via partitioning incoming net radiation into sensible, latent, and ground heat fluxes, as well as via controlling the exchange of trace gases and volatile organic compounds. Over the past few decades, many of these processes have been identified and their physical, chemical, and biological effects have been investigated

Multiple ecological monitoring systems have been set up to monitor ecosystem dynamics. Networks of eddy covariance towers continuously monitor carbon, water, and energy fluxes in high temporal resolution

The study of biosphere–atmosphere interactions using observations typically relies on correlative approaches, or is based on model data, i.e. requires a priori knowledge.
In recent years, a new branch in statistics aiming for causal inference from empirical data has experienced substantial progress.
The idea of causal inference had already emerged in the early 20th century

Aiming to mitigate some of the limitations of the traditional Granger causality,

Ecological and climate data are often time-ordered. This property can be exploited to construct time series graphs

In this study, we explore the potential of PCMCI for disentangling and quantifying interactions and feedbacks between terrestrial biosphere state and fluxes and meteorological variables. The study is structured as follows. In Sect.

Monitoring an ecosystem with continuous observations of net ecosystem exchange (NEE), the underlying gross primary production (GPP), and ecosystem respiration (

Two prominent methods that aim for directional dependencies are Granger causality and transfer entropy

PCMCI addresses this issue by reducing the set of conditions

PCMCI assesses the causal structure of a multivariate dataset or process

The time order within the time series allows us to orient directed links, which point forward in time only. This reflects the fact that causal information propagates forward in time, i.e. the cause must precede the effect.
Therefore, a directed causal link

At the core of PCMCI there are conditional independence tests

To efficiently estimate

The PC step starts by initialising the whole past of a process:

Every conditional independence test is evaluated at a significance threshold

MCI is the actual causal discovery step that ascribes a

The link strength in the PCMCI framework can be given by the effect size of the conditional independence test statistic measure CI used in combination with MCI. In case of ParCorr, the effect size is given by the partial correlation value, which is between
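
As an illustration of the partial correlation used as effect size, the following minimal sketch regresses a set of conditions out of two variables and correlates the residuals. It is not the TIGRAMITE implementation; the function name `partial_correlation` and the synthetic variables `driver`, `x`, and `y` are assumptions made for this example only.

```python
import numpy as np

def partial_correlation(x, y, z):
    """Correlation of x and y after linearly regressing out the conditions z.

    x, y: 1-D arrays of equal length; z: 1-D or 2-D array of conditions
    (one column per condition). Illustrative helper, not a library function.
    """
    design = np.column_stack([np.ones(len(x)), z])  # intercept + conditions
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])

# Synthetic example: x and y share a driver but do not influence each other.
rng = np.random.default_rng(0)
n = 2000
driver = rng.normal(size=n)
x = 0.8 * driver + rng.normal(size=n)
y = 0.8 * driver + rng.normal(size=n)

plain = float(np.corrcoef(x, y)[0, 1])       # inflated by the shared driver
partial = partial_correlation(x, y, driver)  # close to zero once conditioned
```

In this toy setting the plain correlation is clearly non-zero, whereas the partial correlation given the driver is close to zero. A conditional independence test is meant to exploit exactly this difference.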

We tested the algorithm on artificial datasets prior to its application to real-world data. The artificial dataset was created using a test model that takes time series of measured global radiation (

The parameters

The model was fitted to real observational data (using radiation, temperature, and land–atmosphere fluxes) of daily time resolution, measured by the eddy-covariance method

From each of the 72 sets of parameters we generated four sets of time series, each having a length of 500 years. The time series generation was initiated using two types of data: first, uncorrelated, normally distributed noise, and second, unprocessed radiation data as used during the fitting (the available radiation data were repeated to 500 years). The resulting datasets are called baseline dataset and seasonality dataset, respectively. In both cases, the model was run twice, once with homoscedastic (constant variance of

Data from three towers located in Majadas de Tiètar (ES-LMa, ES-LM1, ES-LM2), a Mediterranean savanna in central Spain, are used (coordinates of the central tower: 39

We expect the causal imprints in the data to vary between seasons and during the course of the day. To satisfy causal stationarity, we estimate networks separately for each month and consider only samples for which the potential radiation was above four-fifths of the potential daily maximum, which corresponds to midday samples.
We used a mask type that limits only the receiver variable to the respective month and daytime values (see Table
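
The sample selection described above can be sketched as a boolean mask. This is a minimal sketch under several assumptions: 48 half-hourly samples per day, the threshold applied to each day's own maximum of potential radiation, and a synthetic sine-shaped radiation curve. The names `midday_mask`, `pot_rad`, and `month` are hypothetical.

```python
import numpy as np

def midday_mask(pot_rad, samples_per_day=48, fraction=0.8):
    """True where potential radiation exceeds `fraction` of that day's maximum.

    Assumes the series length is a multiple of `samples_per_day`.
    """
    days = np.asarray(pot_rad, dtype=float).reshape(-1, samples_per_day)
    daily_max = days.max(axis=1, keepdims=True)
    return (days > fraction * daily_max).ravel()

# Two synthetic days of half-hourly potential radiation (zero at night).
t = np.arange(2 * 48)
pot_rad = np.clip(np.sin(2 * np.pi * (t % 48 - 12) / 48), 0.0, None)
month = np.repeat([4, 5], 48)      # first day in April, second in May

mask = midday_mask(pot_rad)        # midday samples of both days
selected = mask & (month == 4)     # midday samples in April only
```

Here `True` marks samples that are kept. Whether the software in use expects `True` to mark kept or excluded samples should be checked before passing such a mask on.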

The second observational case study was performed on a global dataset.
We used data with 0.5

To examine the influence of radiation, temperature, and precipitation on NDVI by means of PCMCI, we used the following settings. We compute the anomalies by subtracting a smoothed seasonal mean. A maximal time lag of 3 months was chosen based on the largest lag with significant partial correlation among all pairs of variables, partialling out only the autocorrelation of each variable. The receiver variable was limited to the growing season defined by

As an example, in Fig.

The artificial datasets are generated with a prescribed interaction structure (true network), which is obtained by fitting the test model to the FLUXNET sites. Here we show for four time series lengths the process graphs estimated via both lagged correlation and PCMCI. The data used stem from the homoscedastic realisation of the seasonality dataset of the Hainich site. The significance level was set to 0.01. The number of time lag labels was limited to five in the correlation networks. However, for the longest time series typically the whole range of lags (0–25) was significant.

The distribution of false positive detection rates estimated for the baseline dataset, the seasonality dataset, and the anomalised seasonality dataset (mean seasonal cycle subtracted). The distributions are given for different time series lengths (number of datapoints). Additionally, the distributions are split to show the impact of heteroscedastic noise (orange) compared to normally distributed noise (blue). The significance level of 0.01 is given by a blue horizontal line.

The distribution of the true positive detection rates for each link in the test model estimated for the baseline dataset, the seasonality dataset, and the anomalised seasonality dataset. The distributions are given for different time series lengths (number of datapoints). Additionally, the distributions are split to show the impact of heteroscedastic noise (orange) compared to normally distributed noise (blue).

The FPR of homoscedastic time series in the baseline dataset is in the expected range around 0.01, the chosen significance level, indicating a well-calibrated test due to fulfilled assumptions. The assumption of stationarity is violated as soon as heteroscedasticity or seasonality is present. The effect on the FPR is an increase above 0.01 for time series lengths of 1 and 5 years, with a much stronger increase due to seasonality (factors of 4 and 8, respectively) than for heteroscedasticity (factor of 2).
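
For readers who want to reproduce such rates, the following minimal sketch computes a false positive rate and a true positive rate from one estimated network and the known true network. Both are assumed to be boolean arrays indexed by (driver, receiver, lag). How the study aggregates over links and realisations is not spelled out here, so the function `detection_rates` is only a hypothetical helper.

```python
import numpy as np

def detection_rates(estimated, truth):
    """Return (FPR, TPR) for boolean link arrays of identical shape.

    FPR: fraction of absent links that were detected.
    TPR: fraction of true links that were detected.
    """
    estimated = np.asarray(estimated, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    fpr = (estimated & ~truth).sum() / (~truth).sum()
    tpr = (estimated & truth).sum() / truth.sum()
    return float(fpr), float(tpr)

# Toy network: 3 variables, lags 0 and 1, two true links.
truth = np.zeros((3, 3, 2), dtype=bool)
truth[0, 1, 1] = True
truth[1, 2, 0] = True

estimated = np.zeros_like(truth)
estimated[0, 1, 1] = True   # correctly detected link
estimated[2, 0, 1] = True   # spurious link

fpr, tpr = detection_rates(estimated, truth)
```

For a well-calibrated test, the FPR averaged over many realisations should approach the chosen significance level.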

The effect of non-stationarities on the TPR differs among the links. The detection of linear links (

Comparing the TPRs of the non-linear links shows some disparity. The links

In summary, the seasonality dataset exhibits high TPR even for non-linear links. Compared to stationary time series, the detection of non-linear links actually benefits from seasonality. The high detection rate, though, comes at the cost of a high false positive rate for time series lengths of 1 year and above. To a certain degree, the increase in FPR can be counteracted by anomalisation.
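
Anomalisation in its simplest form, i.e. subtracting the mean seasonal cycle, can be sketched as follows. The sketch assumes daily data, a fixed period of 365 days, and no smoothing of the cycle. The function name `subtract_mean_seasonal_cycle` and the synthetic series are illustrative assumptions.

```python
import numpy as np

def subtract_mean_seasonal_cycle(series, period=365):
    """Subtract the multi-year mean of each position in the cycle.

    Trailing samples that do not fill a complete cycle are dropped.
    Returns the anomalies and the estimated mean cycle.
    """
    series = np.asarray(series, dtype=float)
    n_cycles = len(series) // period
    cycles = series[: n_cycles * period].reshape(n_cycles, period)
    mean_cycle = cycles.mean(axis=0)
    return (cycles - mean_cycle).ravel(), mean_cycle

# Five synthetic years: a sinusoidal seasonal cycle plus unit-variance noise.
rng = np.random.default_rng(1)
days = np.arange(5 * 365)
series = 10.0 * np.sin(2 * np.pi * days / 365) + rng.normal(size=days.size)

anomalies, mean_cycle = subtract_mean_seasonal_cycle(series)
```

This removes the seasonal cycle of the mean only. A seasonal cycle in the variance, for instance, would remain in the anomalies.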

First, we look at the link consistency by comparing networks that were obtained for each tower within a month. The comparison is done for 2 months with strongly differing climate conditions: April and August. In Fig.

Comparison of the networks of three eddy covariance measurement stations (LMa, LM1, LM2) located in Majadas de Tiètar (Spain). Links that are found to be significant in one of the three networks are included. For each link, the calculated strength of all three networks is plotted with its 90 % confidence interval. The colours blue, orange, and green correspond to the towers LMa, LM1, and LM2, respectively. The significance threshold is 0.01. If a link does not pass the significance threshold, it is marked by a black dot. The links are grouped into lag 0 (top), lag 1 (middle), and all lags greater than 1 (bottom). Negative NEE is associated with carbon uptake by the ecosystem. Links at lag 0 are left undirected (–), yet as

The difference among the seasons is further investigated in Fig.

To visualise the gradual changes in interaction structure the networks of the three towers are combined for each month. The number of significant occurrences of a link is given by its width. The link strength, given by the link colour, is calculated by averaging the significant links of the towers. The link's lag is shown in the centre of each arrow, sorted in descending order of link strength. The resulting graphs are shown for April 2014 until March 2015. The significance threshold is 0.01. The networks of April and August, illustrated in Fig.
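
The combination of tower networks described in this caption can be sketched as follows. The sketch assumes that link strengths and significance flags are stored as arrays of shape (tower, driver, receiver, lag). The function name `combine_networks` and the toy values are hypothetical.

```python
import numpy as np

def combine_networks(strengths, significant):
    """Count significant occurrences per link and average their strengths.

    Links that are significant for no tower get a mean strength of NaN.
    """
    strengths = np.asarray(strengths, dtype=float)
    significant = np.asarray(significant, dtype=bool)
    count = significant.sum(axis=0)
    total = np.where(significant, strengths, 0.0).sum(axis=0)
    mean_strength = np.divide(
        total, count, out=np.full(total.shape, np.nan), where=count > 0
    )
    return count, mean_strength

# Toy case: 3 towers, 2 variables, 1 lag; one link significant at two towers.
strengths = np.zeros((3, 2, 2, 1))
significant = np.zeros((3, 2, 2, 1), dtype=bool)
strengths[:, 0, 1, 0] = [0.4, 0.6, 0.1]
significant[:, 0, 1, 0] = [True, True, False]

count, mean_strength = combine_networks(strengths, significant)
```

In this toy case the link is drawn with width 2 and a mean strength of 0.5, because the non-significant value of 0.1 is excluded from the average.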

The above results demonstrate that PCMCI is sensitive enough to capture seasonal differences and certain physiologically reasonable biosphere behaviour. Moreover, PCMCI yields a more interpretable network structure than pure correlation approaches.

The significant lags and MCI values of each climatic variable on NDVI were subject to inspection. In Fig.

Influence of climatic drivers on NDVI as calculated by PCMCI. The first column shows the estimated causal influence given as maximal absolute MCI value of climatic drivers on NDVI. The second column gives the time lag at which the maximal absolute MCI value occurs (in months).

Map of the strongest climatic driver (largest absolute MCI value) per grid point.
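
The mapped quantities (maximal absolute MCI value, its lag, and the strongest driver per grid point) could be derived along the following lines. The sketch assumes that the MCI values of all drivers on NDVI are stored in an array of shape (driver, lag, latitude, longitude), with non-significant values already set to zero. The function `strongest_driver` and the driver ordering are illustrative assumptions.

```python
import numpy as np

def strongest_driver(mci):
    """Per grid point: strongest driver, max |MCI| per driver, and its lag."""
    abs_mci = np.abs(np.asarray(mci, dtype=float))
    max_abs = abs_mci.max(axis=1)        # (driver, lat, lon)
    lag_of_max = abs_mci.argmax(axis=1)  # (driver, lat, lon), lag index
    driver = max_abs.argmax(axis=0)      # (lat, lon), driver index
    return driver, max_abs, lag_of_max

# Toy grid with 1 x 2 cells; assumed driver order:
# 0 = radiation, 1 = temperature, 2 = precipitation; lags 0 to 3 months.
mci = np.zeros((3, 4, 1, 2))
mci[0, 0, 0, 0] = 0.3    # radiation, lag 0, first cell
mci[2, 1, 0, 0] = -0.5   # precipitation, lag 1, first cell
mci[1, 0, 0, 1] = 0.4    # temperature, lag 0, second cell

driver, max_abs, lag_of_max = strongest_driver(mci)
```

In the first cell precipitation at lag 1 dominates despite its negative sign, because the comparison uses absolute values.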

The dominant lags are found to be 0 and 1. Only a very small fraction of the total area shows a maximal MCI value at a higher lag of two or three months.
The lags are also not equally distributed among the climatic drivers. Radiation and temperature are predominantly strongest at lag 0, while precipitation has a much larger fraction of area showing the strongest response at lag 1. Regions where the impact of

In summary, PCMCI estimates coherent interaction patterns that match well with anticipated behaviour based on vegetation type and prevailing climatic conditions.

Causal discovery methods promise an improved understanding and can help to come up with new hypotheses about the interaction between biosphere and atmosphere

With regard to expected non-linearities in biosphere–atmosphere interactions, using a linear independence test within the PCMCI framework may not be adequate. We justify our choice with the following arguments: first, non-linearities are often approximated linearly. Second, a linear-regression-based test has a much higher power for detecting linear links than a non-parametric test

The probability of detecting a link with PCMCI depends strongly on a link's MCI effect size, which is larger for strong variance in the driver and a low variance in the receiver (see Sect.

Seasonality and heteroscedasticity constitute violations of the stationarity assumption underlying the independence test ParCorr.
Seasonality constitutes a common driver in this model. In general, such common drivers increase the dependence among the variables and hence lead to a higher detection rate for true links (TPR) as well as a higher false positive rate (FPR) for absent links if this driver is not conditioned out properly. This additionally causes the TPR and the FPR to increase in the seasonality model.
As shown in

Summarising the results of the test model, the different detection rates, disparity among non-linear links, and the detection of multiplicative links are largely explainable via the effect of the variance on the link detection. Yet, the discussion revealed the need for further research in several aspects. On the one hand, feedback loops are not included in the test model yet are an important aspect in natural systems. On the other hand, removing non-stationarities is essential to keep the false positive rate in the expected range, but standard procedures of subtracting the mean seasonal cycle are not sufficient. Further, the effect of non-stationarity on the causal network structure needs to be investigated.

In both the half-hourly time-resolved eddy covariance data and the monthly global dataset, the predominant type of dependence found is contemporaneous. PCMCI leaves these undirected since no time order indicating the flow of causal information is available. Further, as discussed in Sect.

Nevertheless, robust patterns were identified in our studies that are also consistent with other studies. Furthermore, a causal analysis has the advantage of an enhanced interpretability compared to correlative approaches. First of all, we could show that the networks' estimated link strengths are consistent for observational data, even though measurement error affects the data. The dataset used was suitable for this analysis, as the measurement stations are located in a reasonably homogeneous ecosystem that shows only little spatial variation

The global study of climatic drivers of vegetation shows a general pattern of lags and dependence strengths of vegetation on climatic variables that is easily interpretable. The boreal regions appear energy limited and especially driven by temperature (see Fig.

In summary, we pointed out the need for careful interpretations in applying causal discovery methods and especially highlighted the challenges linked to the study of biosphere–atmosphere interaction via PCMCI. We demonstrated that the network structures estimated from observational data are explainable with respect to plant physiology and climatic effects. Finally, our study shows that causal methods can deliver better interpretability and a much improved process understanding in comparison to correlation and bivariate Granger causality analyses that are ambiguous to interpret since they do not account for common drivers.

The preceding discussion has shed light on the merits of PCMCI, as well as the challenges of applying causal discovery methods.

Here we tested PCMCI, an algorithm that estimates causal graphs from empirical time series. We specifically explored two types of datasets that are highly relevant in biogeosciences: eddy covariance measurements of land–atmosphere fluxes and global satellite remote sensing of vegetation greenness. The causal graphs estimated from the eddy covariance data collected in a Mediterranean site confirm patterns we would expect in these ecosystems: during plant senescence in the dry season, for instance, the ecosystem's carbon cycle (NEE) decouples from meteorological variability. On the contrary, during the main growing season, with warm and humid conditions, strong links between NEE,

The eddy covariance data of the FLUXNET sites can be downloaded from the official web page (

CRU temperature and precipitation data are available at

CRUNCEP radiation data can be downloaded via

The NDVI dataset is available at

The TIGRAMITE software package that includes PCMCI can be found on GitHub

CK and MDM designed the study with contributions from JR and DGM. CK conducted the analysis and wrote the manuscript. All authors helped to improve the manuscript. AC, MM, TEM, and OPP conducted field experiments in Majadas de Tiètar, processed and provided its data, and helped with their interpretation.

The authors declare that they have no conflict of interest.

We thank Maha Shadaydeh, Rune Christiansen, Jonas Peters, Milan Flach and Markus Reichstein for useful comments. Christopher Krich thanks the Max Planck Research School for global Biogeochemical Cycles for supporting his PhD project. The authors affiliated with the Max Planck Institute for Biogeochemistry thank the European Space Agency for funding the “Earth System Data Lab” project.

This work used eddy covariance data acquired and shared by the FLUXNET community, including these networks: AmeriFlux, AfriFlux, AsiaFlux, CarboAfrica, CarboEuropeIP, CarboItaly, CarboMont, ChinaFlux, Fluxnet-Canada, GreenGrass, ICOS, KoFlux, LBA, NECC, OzFlux-TERN, TCOS-Siberia, and USCCC. The ERA-Interim reanalysis data are provided by ECMWF and processed by LSCE. The FLUXNET eddy covariance data processing and harmonisation was carried out by the European Fluxes Database Cluster, AmeriFlux Management Project, and Fluxdata project of FLUXNET, with the support of CDIAC; ICOS Ecosystem Thematic Center; and the OzFlux, ChinaFlux, and AsiaFlux offices.

The article processing charges for this open-access publication were covered by the Max Planck Society.

This paper was edited by Ivonne Trebs and reviewed by Benjamin L. Ruddell and two anonymous referees.