Impact of temperature and water availability on microwave-derived gross primary production

Vegetation optical depth (VOD) from microwave satellite observations has received much attention in global vegetation studies in recent years due to its relationship to vegetation water content and biomass. We have recently shown that VOD is related to plant productivity, i.e., gross primary production (GPP). Based on this relationship between VOD and GPP, we developed a theory-based machine learning model to estimate global patterns of GPP from passive microwave VOD retrievals. The VOD-GPP model generally showed good agreement with site observations and other global data sets in temporal dynamics but tended to overestimate annual GPP across all latitudes. We hypothesized that the reason for the overestimation is the missing effect of temperature on autotrophic respiration in the theory-based machine learning model. Here we aim to further assess and enhance the robustness of the VOD-GPP model by including the effect of temperature on autotrophic respiration within the machine learning approach and by assessing the interannual variability of the model results with respect to water availability. We used X-band VOD from the VOD Climate Archive (VODCA) data set for estimating GPP and used global state-of-the-art GPP data sets from FLUXCOM and MODIS to assess residuals of the VOD-GPP model with respect to drought conditions as quantified by the Standardized Precipitation and Evaporation Index (SPEI).
Our results reveal an improvement in model performance for correlation when including the temperature dependency of autotrophic respiration (average correlation increase of 0.18). This improvement in temporal dynamics is larger for temperate and cold regions than for the tropics. For unbiased root-mean-square error (ubRMSE) and bias, the results are regionally diverse and are compensated in the global average. Improvements are observed in temperate and cold regions, while decreases in performance are obtained mainly in the tropics. The overall improvement when adding temperature was less than expected and thus may only partly explain previously observed differences between the global GPP data sets. On interannual timescales, estimates of the VOD-GPP model agree well with GPP from FLUXCOM and MODIS. We further find that the residuals between VOD-based GPP estimates and the other data sets do not significantly correlate with SPEI, which demonstrates that the VOD-GPP model can capture responses of GPP to water availability even without including additional information on precipitation, soil moisture or evapotranspiration. Exceptions to this rule were found in some regions: significant negative correlations between VOD-GPP residuals and SPEI were observed in the US corn belt, Argentina, eastern Europe, Russia and China, while significant positive correlations were obtained in South America, Africa and Australia. In these regions, the significant correlations may indicate different plant strategies for dealing with variations in water availability.
Overall, our findings support the robustness of global microwave-derived estimates of gross primary production for large-scale studies on climate-vegetation interactions.
Published by Copernicus Publications on behalf of the European Geosciences Union.
I. E. Teubner et al.: Impact of temperature and water availability


Introduction
Vegetation optical depth (VOD) from microwave satellite observations provides the opportunity for studying large-scale vegetation dynamics due to its sensitivity to the vegetation water content and aboveground biomass. Different studies have employed VOD for deriving various plant properties or vegetation characteristics that can be related to the plant's water content, including biomass estimation (Liu et al., 2015; Brandt et al., 2018; Rodríguez-Fernández et al., 2018; Chaparro et al., 2019; Fan et al., 2019; Frappart et al., 2020; Wigneron et al., 2020; Li et al., 2021), crop yield (Chaparro et al., 2018), tree mortality (Sapes et al., 2019), analysis of burned area, ecosystem-scale isohydricity (Konings and Gentine, 2017), plant water uptake during dry downs (Feldman et al., 2018) and plant water storage. VOD, or microwave satellite observations in general, has also been analyzed for its potential in detecting the impact of drought (Song et al., 2019; Crocetti et al., 2020). Despite the sensitivity of VOD to vegetation water content, the relationship between VOD and gross primary production (GPP) has not yet been analyzed with regard to how the relationship responds to varying conditions of dryness or wetness.
Recently, we have shown that VOD is related to plant productivity, i.e., GPP. Based on these findings, we developed a theory-guided machine learning model to estimate GPP from VOD (VOD-GPP model) and trained the model using eddy covariance estimates of GPP from the FLUXNET network. The VOD-GPP model relies on estimating carbon sink terms, i.e., net primary production (NPP) and autotrophic respiration (Ra), based on VOD as a proxy for aboveground living biomass. The VOD-GPP model thus represents a carbon-sink-driven approach. Since the VOD-GPP model uses biomass as its main input, the estimation of GPP does not rely on input variables that are commonly used in source-driven approaches, e.g., absorption of photosynthetically active radiation as primary input term or vapor pressure deficit as controlling factor for stomatal conductance (Running et al., 2000; Turner et al., 2005; Goodrich et al., 2015; Zhang et al., 2016, 2017). Although different studies are tackling the question of how much information on biomass is actually contained in the VOD signal (Momen et al., 2017; Vreugdenhil et al., 2018; Zhang et al., 2019), it is worth noting that the water content is an important aspect of our model approach since it represents the living part of the vegetation, and only living cells, which contain water, are able to respire. We have shown that the VOD-GPP model can represent temporal dynamics of GPP well but that it overestimates GPP, especially in temperate and boreal regions. We hypothesize that this overestimation may be caused by a missing representation of the temperature dependency of autotrophic respiration in the VOD-GPP model. Ra is the process through which chemical energy that was stored by building up carbohydrates during photosynthesis is gained by converting carbohydrates back into carbon dioxide. It is generally known that Ra is a temperature-dependent process (e.g., Atkin and Tjoelker, 2003).
Modeling the response of Ra to temperature, however, is complex due to the existence of thermal acclimation (Atkin and Tjoelker, 2003). Ra is commonly represented through an exponential function with Q10 as the base, which is multiplied by a basal respiration rate (e.g., Smith and Dukes, 2013). The base value Q10 describes the factor by which Ra changes when temperature changes by 10 °C (e.g., Atkin et al., 2008). Although global models often use constant values for either one parameter or both parameters (Gifford, 2003; Smith and Dukes, 2013), studies have shown that both basal respiration rate and Q10 may vary with temperature (Tjoelker et al., 2001; Wythers et al., 2013). The implementation of such temperature acclimation yields a functional representation that decreases again at higher temperatures and thus takes into account that respiration may decrease outside an optimum temperature range (Smith and Dukes, 2013).
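The constant-Q10 form described above can be illustrated with a minimal sketch. The function name, the reference temperature and all default values are illustrative assumptions and are not taken from the study; thermal acclimation (a temperature-dependent Q10 or basal rate) is deliberately left out.

```python
def autotrophic_respiration(temp_c, r_basal=1.0, q10=2.0, t_ref=20.0):
    """Respiration rate at temp_c (deg C) for a constant Q10.

    r_basal is the rate at the reference temperature t_ref. All default
    values are illustrative assumptions, not values from the study.
    """
    return r_basal * q10 ** ((temp_c - t_ref) / 10.0)


# Warming by 10 deg C multiplies the rate by Q10.
ratio = autotrophic_respiration(30.0) / autotrophic_respiration(20.0)
print(ratio)
```

With a constant Q10 the rate grows monotonically with temperature, which is why acclimation has to be added separately to obtain the decrease at high temperatures mentioned above.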
Here we aim to assess the impact of the temperature dependency of Ra in the VOD-GPP model and whether it can improve model performance. Furthermore, we will test the plausibility of the model by comparing the estimated interannual variability of GPP with independent state-of-the-art global data sets of GPP and by assessing model residuals with respect to variations in climatological water availability as represented by the Standardized Precipitation and Evaporation Index (SPEI). Since source (GPP) and sink terms (NPP + Ra) should theoretically be in balance, any differences between the two approaches that are related to variations in water availability may give insight into different plant strategies for dealing with dry or wet conditions and thus may be of interest for ecological or plant-physiological studies at a large scale.
Data and methods

Choice of microwave frequency
The VOD-GPP model relies on biomass as input. Nevertheless, the choice of microwave frequency for estimating GPP may look counterintuitive. On the one hand, VOD from low microwave frequencies like L band has been demonstrated to be better suited as a proxy for mapping total aboveground biomass than high-frequency VOD, i.e., X-band VOD, as L-band VOD saturates less at high biomass values (Chaparro et al., 2019; Frappart et al., 2020; Li et al., 2021). On the other hand, previous analyses demonstrated that X-band VOD shows a closer agreement with GPP (Kumar et al., 2020). In Fig. A1, we further corroborated this observation by a correlation analysis between in situ GPP and VOD from L and X band, respectively (for details about the single-sensor VOD data sets, see Teubner et al., 2018). Despite the high fraction (38 %) of forest pixels used for this computation, higher correlations were obtained for X band than for L band. An explanation could be that whole plant biomass was found to be less suited for estimating GPP than the biomass of metabolically active plant parts like leaves and fine roots (Litton et al., 2007). Based on these findings, we concluded that higher-frequency VOD appears to be better suited for estimating GPP, and therefore we used X-band VOD in our analysis.

Data sets
We analyzed different GPP data sets derived from microwave and optical sensors as well as SPEI. As input to the VOD-GPP model, we used X-band VOD data from the VOD Climate Archive (VODCA). Since global coverage for VODCA X-band data starts in 2003 (Moesinger et al., 2020) and SPEI data are available through 2015, we used the common period from 2003 to 2015 for our analysis. Temporal median maps for the global GPP data sets are displayed in the Supplement (Fig. A2).

VODCA
VOD retrievals from single sensors often span only a certain period in time, which may hamper the analysis of longer periods. To overcome this problem, we used a merged single-frequency VOD from the VOD Climate Archive (VODCA; Moesinger et al., 2020) as input to our model. The VODCA X-band product (VODCAX) contains nighttime observations of passive VOD derived from TMI (10.7 GHz; variable overpass time), AMSR-E (10.7 GHz; descending 01:30 LECT, local equatorial crossing time), WindSat (10.7 GHz; descending 06:00 LECT) and AMSR2 (10.7 GHz; descending 01:30 LECT). The VOD input data are obtained from the Land Parameter Retrieval Model (LPRM; van der Schalie et al., 2017). The use of nighttime observations on the one hand meets the LPRM assumption of homogeneous temperature conditions (Owe et al., 2001) and on the other hand is better suited as a proxy for plant water status than daytime observations. Due to diurnal differences in plant water status and the refilling during the night (El Hajj et al., 2019; Konings and Gentine, 2017), nighttime observations are closer to the predawn water potential, which is commonly used as an estimator for the daily vegetation water status (Konings and Gentine, 2017; Konings et al., 2019). During the processing of VODCAX, data are masked for radio frequency interference (RFI) (Moesinger et al., 2020) since RFI can introduce spurious retrievals (Li et al., 2004; Njoku et al., 2005). Data are available at daily resolution and 0.25° grid spacing.

Independent global GPP data sets
The MOD17A2H v006 product provides global estimates of GPP that are derived from surface reflectances (Running et al., 2004, 2015). The algorithm is based on the light use efficiency concept by Monteith (1972) and uses the fraction of absorbed photosynthetically active radiation for deriving plant productivity (Running et al., 1999, 2000). Data are produced as 8 d GPP estimates at 500 m resolution. FLUXCOM presents an upscaling of GPP from eddy covariance measurements using an ensemble of machine learning approaches (Jung et al., 2020). The data set is available at 8 d resolution and 10 km grid spacing. FLUXCOM estimates are produced in two setups: FLUXCOM remote sensing (RS) is based on remote sensing data as input to the machine learning models, while FLUXCOM RS+METEO uses meteorological data and only the mean seasonal cycle of remote sensing data (Jung et al., 2020). Since our approach is mainly based on remote sensing data, i.e., VOD observations, we used FLUXCOM RS in our analysis. The FLUXCOM algorithm uses the following MODIS variables as input: enhanced vegetation index, leaf area index, MODIS band 7 middle infrared reflectance, normalized difference vegetation index and normalized difference water index.

In situ GPP estimation from FLUXNET
The Fluxnet2015 data set (Pastorello et al., 2020) provides daily in situ estimates of carbon, water and heat fluxes, which are determined using the eddy covariance technique. GPP estimates are available for two flux partitioning methods, i.e., the daytime and the nighttime partitioning method. We used the mean of both partitioning methods, as suggested by Pastorello et al. (2020), with variable friction velocity threshold (GPP_DT_VUT_REF, GPP_NT_VUT_REF) from the freely available station data set (Tier1 v1). Since data are available until 2014, we used data for the period from 2003 to 2014 as training data for estimating GPP based on VOD. An overview of the FLUXNET sites is given in Fig. A3 and Table A1.

SPEI
For analyzing the impact of variations in water availability, we used SPEI from the SPEIbase (Beguería et al., 2017; Vicente-Serrano et al., 2010). The climatological water balance is calculated on different timescales ranging from 1 up to 48 months. Since drought can act on different timescales, we used SPEI at two different aggregations, 3 and 12 months, for investigating the response to dry and wet conditions. The 3-month SPEI (SPEI03) represents short-term effects, while the 12-month SPEI (SPEI12) relates to dry or wet conditions at an annual timescale. Although SPEI cannot be used to express actual water shortage for plants, it allows for the indication of relative deviations from mean conditions. Because of the use of both precipitation and temperature, SPEI further enables the comparison between different biomes (Vicente-Serrano et al., 2010). The SPEI data have a monthly resolution and a grid spacing of 0.5°.

ERA5-Land
ERA5-Land, produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) (C3S, 2019; Muñoz-Sabater, 2019), provides a reanalysis data set of meteorological parameters. ERA5 uses a 4D variational data assimilation scheme and a simplified extended Kalman filter (Hersbach et al., 2020). We used skin temperature and snow data for masking VOD. In the VOD-GPP model, we incorporated 2 m air temperature (T2M) for representing the temperature dependency of autotrophic respiration. T2M was used in our analysis since this parameter is most common for describing the temperature dependency of autotrophic respiration for aboveground vegetation (e.g., Ryan et al., 1997; Running et al., 2000; Ceschia et al., 2002; Drake et al., 2016). The data have an hourly resolution and 9 km spatial sampling.

Data processing
VODCAX data were masked for low temperature (skin temperature < 0 °C) and snow cover (snow depth > 0 cm) and then aggregated to 8 d estimates by computing the mean over 8 d to match the temporal resolution of GPPmodis and GPPfluxcom. These 8 d values were then used as input to the VOD-GPP model and for further analysis throughout the study. GPPfluxcom and GPPmodis were aggregated to 0.25° to match the spatial sampling of VODCAX. For the comparison with SPEI, 8 d GPP estimates were further resampled to monthly resolution, while SPEI was spatially resampled to 0.25° using the nearest-neighbor method.
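The masking, 8 d averaging and nearest-neighbor resampling steps can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: the function names and array layout (time, lat, lon) are hypothetical, the 8 d blocks are taken as consecutive runs of days (the actual 8 d periods of the reference products may restart each year), and the 0.25° grid is assumed to nest exactly inside the 0.5° SPEI grid.

```python
import numpy as np


def mask_and_aggregate(vod, skin_temp_c, snow_depth_cm, window=8):
    """Mask frozen or snow-covered days, then average consecutive 8 d blocks.

    All inputs are daily arrays of shape (time, lat, lon). Trailing days
    that do not fill a whole block are dropped (an assumption).
    """
    vod = np.asarray(vod, dtype=float)
    invalid = (np.asarray(skin_temp_c) < 0.0) | (np.asarray(snow_depth_cm) > 0.0)
    masked = np.where(invalid, np.nan, vod)
    n_blocks = masked.shape[0] // window
    blocks = masked[: n_blocks * window].reshape(n_blocks, window, *masked.shape[1:])
    valid = ~np.isnan(blocks)
    count = valid.sum(axis=1)
    total = np.where(valid, blocks, 0.0).sum(axis=1)
    # Mean over valid days; NaN where a block has no valid observation.
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)


def nearest_neighbor_upsample(field, factor=2):
    """Resample (..., lat, lon) to a grid `factor` times finer by repeating cells."""
    return np.repeat(np.repeat(field, factor, axis=-2), factor, axis=-1)
```

Repeating each coarse cell is equivalent to nearest-neighbor resampling only for aligned, nested grids; for offset grids an index lookup on the cell centers would be needed instead.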

GPP estimation based on VOD
The approach of estimating GPP based on microwave radiation and the corresponding equations are described in detail in Teubner et al. (2019). In short, the VOD-GPP model uses VOD as a proxy of aboveground living biomass (Eq. 1). It determines GPP by estimating sinks for carbohydrates, i.e., the sum of NPP and Ra, which are represented through different VOD-derived variables: (i) time series of the bulk VOD signal (VOD; 8 d aggregated native VOD time series), (ii) time series of the temporal change in VOD (ΔVOD; ΔVOD_t = VOD_t − VOD_{t−1}, computed from the smoothed 8 d aggregated VOD time series) and (iii) the grid cell median of VOD (mdnVOD; calculated over the entire VOD time series of the grid cell; used as a proxy for vegetation cover). While NPP is related to ΔVOD, Ra is related to both VOD and ΔVOD using the concept proposed by Ryan et al. (1997) of dividing Ra into maintenance and growth respiration (Eq. 2). By assuming that belowground biomass terms are proportional to aboveground biomass (i.e., biomass B can be expressed through aboveground biomass AGB) and adding a static term c supporting the conversion in Eq. (2), GPP can be represented through a differential equation with VOD as input (Eq. 3).
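The sink-driven reasoning of this paragraph can be summarized in a schematic form. The following is only a sketch of the argument and not a reproduction of Eqs. (1) to (3) in Teubner et al. (2019): the proportionality factors a and b are hypothetical lumped coefficients, R_m and R_g denote maintenance and growth respiration, and the proportionalities are assumptions consistent with the maintenance/growth concept of Ryan et al. (1997).

```latex
\begin{align}
\mathrm{GPP} &= \mathrm{NPP} + R_a, \qquad R_a = R_m + R_g \\
R_m &\propto B, \qquad R_g \propto \frac{\mathrm{d}B}{\mathrm{d}t}, \qquad
\mathrm{NPP} \propto \frac{\mathrm{d}B}{\mathrm{d}t} \\
B &\propto \mathrm{AGB} \propto \mathrm{VOD}
\;\Rightarrow\;
\mathrm{GPP} \approx a\,\mathrm{VOD} + b\,\frac{\mathrm{d}\,\mathrm{VOD}}{\mathrm{d}t} + c
\end{align}
```

In this reading, the VOD term carries the maintenance part of Ra, while the temporal change in VOD carries both NPP and growth respiration.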
The formulation in GAM for this previous model, which uses only VOD variables as input (GPPvod; Eq. 4), then reads as follows:
GPPvod = s(VOD) + s(ΔVOD) + s(mdnVOD), (4)
where s denotes spline terms for representing the functions between each input variable and the response variable GPP in the two-dimensional space.
For adding the temperature dependency of Ra, we consider the two terms of Ra, i.e., maintenance and growth respiration. Since the temperature sensitivity mainly applies to the maintenance term (Ryan et al., 1997), we only incorporate an interaction term with temperature for the maintenance part of the model formulation. Although all terms may potentially depend on temperature due to the general temperature dependency of enzymatic activity, the temperature dependency for modeling growth-related sink terms (growth respiration and net primary production) may be of less importance. For the current model formulation (GPPvodtemp; Eq. 5), we now introduce an interaction term between VOD and temperature:
GPPvodtemp = te(VOD, T2M) + s(ΔVOD) + s(mdnVOD), (5)
where te stands for a tensor term, which represents the interaction between VOD and temperature and spans a surface in the three-dimensional space. Consistent with our previous model, we used GAM as the regression method for deriving GPP. The pyGAM (Servén and Brummitt, 2018) version 0.8.0 provides the possibility of adding an interaction term. An advantage of GAM is that the relationships between input variables and response variable are not required to be known beforehand but instead can be estimated from the data themselves (Hastie and Tibshirani, 1987). Since the relationship between VOD and GPP as well as its relationship with temperature is difficult to determine a priori, this method is well suited for our approach.
In GAM, a number of basis spline functions are fitted to the data and the resulting function is further smoothed to obtain the final response function (Servén and Brummitt, 2018). The degree of smoothing is determined by the smoothing factor, which yields strong smoothing for high values and low smoothing for low values. For the current models we used a smoothing factor of 2, which is lower than for the model in Teubner et al. (2019). This was done since the response function for the tensor term was too smooth using the default number of 10 splines for tensor terms and resulted in unrealistically high GPP values at high VOD. For ΔVOD, the default number of 20 splines for spline terms was used, while for mdnVOD we reduced the number of splines to 5 in order to obtain a smooth relationship.

Statistical analysis
For model comparison, we computed Pearson correlation, unbiased root-mean-square error (ubRMSE) and bias. For studying the error characteristics, ubRMSE was used instead of RMSE to exclude the impact of bias, which was observed during our analysis. In addition, cross-validation was computed for the above metrics using the leave-site-out method, where the model performance is evaluated at each site by omitting the respective site data from model training and then using the left-out data for computing the statistics. The analysis was carried out for the full signal and the anomalies from the mean seasonal cycle.
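The three metrics and the leave-site-out splitting can be sketched as follows. The function names are hypothetical, and the usage example substitutes a simple linear fit for the GAM purely to keep the sketch self-contained; the site labels and values are made up for illustration.

```python
import numpy as np


def evaluation_metrics(estimate, reference):
    """Return Pearson correlation, ubRMSE and bias of estimate vs. reference."""
    est = np.asarray(estimate, dtype=float)
    ref = np.asarray(reference, dtype=float)
    diff = est - ref
    bias = diff.mean()
    # Removing the mean difference before squaring excludes the bias.
    ubrmse = np.sqrt(np.mean((diff - bias) ** 2))
    r = np.corrcoef(est, ref)[0, 1]
    return r, ubrmse, bias


def leave_site_out(site_ids):
    """Yield (site, train_mask, test_mask) with one site held out per fold."""
    site_ids = np.asarray(site_ids)
    for site in np.unique(site_ids):
        test = site_ids == site
        yield site, ~test, test


# Usage with a stand-in linear model (the study fits a GAM instead).
sites = np.array(["A", "A", "A", "B", "B", "B"])
x = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
gpp = 10.0 * x + 1.0
for site, train, test in leave_site_out(sites):
    slope, intercept = np.polyfit(x[train], gpp[train], 1)
    r, ubrmse, bias = evaluation_metrics(slope * x[test] + intercept, gpp[test])
    print(site, r, ubrmse, bias)
```

Because ubRMSE is computed on the bias-corrected differences, a constant offset between two data sets leaves it unchanged and shows up only in the bias.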
In case of analyzing annual GPP anomalies as a measure for interannual variability and residuals of the VOD-GPP model, we based our analysis on standardized annual or 8 d time series data (z scores). This was done in order to analyze GPP data in the absence of systematic differences between the data sets. The standardization for the 8 d or the annual data was applied to each grid cell time series by subtracting the mean and dividing by the standard deviation.
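The standardization described above amounts to a z-score per grid cell along the time axis. A minimal sketch, assuming arrays with time as the first axis and using the population standard deviation (the source does not state which estimator was used):

```python
import numpy as np


def standardize(series, axis=0):
    """Z scores along the time axis: subtract the mean, divide by the std."""
    series = np.asarray(series, dtype=float)
    mean = np.nanmean(series, axis=axis, keepdims=True)
    std = np.nanstd(series, axis=axis, keepdims=True)
    # Constant series have no variability; return NaN instead of dividing by 0.
    std = np.where(std > 0, std, np.nan)
    return (series - mean) / std
```

Two series that differ only by a constant offset and a positive scale factor yield identical z scores, which is why systematic differences between the data sets drop out of the comparison.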
For generating the smoothed time series in the calculation of ΔVOD and for aiding visual comparison in time series plots, we applied a Savitzky-Golay filter with a window size of 11 data points.
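A Savitzky-Golay smoother and the resulting ΔVOD can be sketched with NumPy alone. The polynomial order of 2, the edge padding and the assumption of a gap-free series are choices made for this illustration; the source only specifies the window size of 11. Library implementations such as SciPy's treat the series edges differently by default.

```python
import numpy as np


def savgol_smooth(y, window=11, polyorder=2):
    """Savitzky-Golay smoothing via a least-squares polynomial fit per window."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the fitted value at the window center.
    kernel = np.linalg.pinv(design)[0]
    padded = np.pad(y, half, mode="edge")
    return np.convolve(padded, kernel[::-1], mode="valid")


def delta_vod(vod_8d, window=11, polyorder=2):
    """Temporal change of the smoothed series: smooth[t] - smooth[t-1]."""
    smooth = savgol_smooth(vod_8d, window, polyorder)
    delta = np.full_like(smooth, np.nan)
    delta[1:] = smooth[1:] - smooth[:-1]
    return delta
```

Away from the edges, the filter reproduces any polynomial up to the chosen order exactly, so a linear trend in VOD yields a constant ΔVOD.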

Model representation of temperature dependency
We find that the sensitivity of VOD to GPP increases with temperature as shown by the partial dependency plots (Fig. 1). For low temperatures, the sensitivity of the VOD-GPP relationship is relatively low (Fig. 1a). As temperature increases, the sensitivity also increases and further exhibits an optimum behavior. At high temperatures, however, the maxima of the curves are lower than for moderate temperatures. The partial dependency for T2M (Fig. 1d) shows an optimum behavior with a peak around 20 °C, which slightly differs between the VOD values. The partial dependencies for ΔVOD and mdnVOD (Fig. 1b, c) are consistent with the previous model and yield an increasing relationship with GPP for ΔVOD in the middle part of the value range and a general decreasing relationship for mdnVOD.
In addition to identifying the underlying relationships, we can further assess the magnitude of the contribution to GPP for the input variables based on the data range in the partial dependency plots. The main contribution to GPP in the model comes from the interaction term between VOD and T2M with a range of about 12 g C m−2 d−1, which is followed by ΔVOD with a range of about 6 g C m−2 d−1 and mdnVOD with a range of about 4 g C m−2 d−1. The contribution of the maintenance part, as represented through the interaction term, is thus higher than for ΔVOD, which represents the sum of NPP and the growth term in Ra.

Evaluation at site level
At FLUXNET in situ stations, global GPP data sets overall show similar results (Fig. 2). GPPvod exhibits a slight accumulation of GPP values at around 4 g C m−2 d−1, while the density for GPPvodtemp is relatively smooth and comparable to GPPfluxcom and GPPmodis. Both GPPvod and GPPvodtemp show a relatively high number of non-zero GPP at around zero GPPfluxnet, which is less pronounced for GPPvodtemp than for GPPvod. Cross-validation results in Table A2 further confirm a higher performance of GPPvodtemp compared to GPPvod. For the full signal as well as for the anomalies from the mean cycle, correlation, ubRMSE and bias generally yield higher performance for GPPvodtemp. The increase in performance is more pronounced for the full signal than for the anomalies. Despite an overall agreement of GPPvodtemp, GPPfluxcom and GPPmodis with in situ GPP, all three data sets exhibit an underestimation of GPP at high values of GPP compared with in situ GPPfluxnet. At annual timescale, the difference with GPPfluxnet at high GPP becomes much lower for GPPvodtemp compared to GPPfluxcom and GPPmodis (Fig. A4), which indicates on the one hand that GPPvodtemp is able to match the in situ training data and on the other hand suggests that differences in GPP already exist between the training data set used in our study and the independent global GPP data sets, which may contribute to differences at global scale. The observed overestimation of GPP for GPPvodtemp at low in situ GPP can also be observed at annual timescale. This may be an explanation for the general tendency for overestimation of microwave-derived GPP estimates and appears not to be entirely related to the temperature sensitivity of Ra, since it is still present for GPPvodtemp.

Impact of adding temperature dependency at the global scale
Performance metrics for GPPvod and GPPvodtemp were assessed with respect to both GPPfluxcom and GPPmodis. Since the results for GPPfluxcom and GPPmodis are similar, we are only showing results for GPPfluxcom. Correlations with GPPfluxcom (Fig. 3a) reveal widespread strongly positive values with a global mean of 0.63. Some areas in the tropics and in the Australian desert exhibit an inverse temporal dynamic with GPPfluxcom. Compared with GPPvod, correlations increase in large parts of the world (Fig. 3b) with a global average difference of 0.18. Regions that benefit most from adding temperature as input are temperate and cold regions, which could be expected since these regions by definition are strongly controlled by temperature. Tropics and subtropics, however, mainly show only minor changes in correlation coefficient with a few exceptions of decreasing correlations. Since the annual temperature amplitude in these regions is low, the model's sensitivity to temperature is also low, which makes the interaction term mainly controlled by VOD.
[Figure 1 caption, partially recovered: ... (b) and (c) denote the 95 % confidence interval. The interaction between VOD and T2M (a, d), which represents a surface in the three-dimensional space, is displayed as projection on the 2D plane for each of the two input variables. For this, the parameter space was divided into 10 equally spaced bins between minimum and maximum of the respective variable. The bin edges are displayed as colored lines as indicated in the legend.]
The global average for ubRMSE between GPPvodtemp and GPPfluxcom (Fig. 3c) yields a value of 1.20. Consistent with the increase in performance for the correlation, areas in the temperate and cold region show an improvement in error, i.e., a decrease of ubRMSE compared to GPPvod (Fig. 3d). Other regions, however, exhibit an increase in ubRMSE. The global average of the difference between results for GPPvodtemp and GPPvod is −0.05. Therefore, gains and losses in error are largely compensated at the global scale.
The bias between GPPvodtemp and GPPfluxcom (Fig. 3c) is generally positive everywhere with a global average of 1.64. This finding is also evident from the higher range in the median maps for GPPvodtemp compared with GPPfluxcom and GPPmodis (Fig. A2). Comparing the results for GPPvod and GPPvodtemp, the addition of temperature shows an increase in bias mainly in the tropics (Fig. 3d), which is also evident for the difference of the median maps (Fig. A2e). Despite this increase in the tropics, regions with a reduction in bias also exist, which are mainly found in temperate and cold regions. On the global scale, decreases and increases in bias compensate and yield an average difference of −0.05.
The latitudinal distribution of annual GPP (Fig. 4a) further demonstrates that the addition of temperature yields a reduction of GPP mainly for regions outside −35 and +60° N. The reduction in the zonal mean, however, is smaller than may have been expected, probably due to compensating effects. For the region between +30 and +60° N, where reductions in bias were observed on the global map, positive and negative values for the bias appear to compensate, yielding no net reduction in the zonal mean. In the tropical region, the increase in bias for GPPvodtemp compared with GPPvod is again evident. When considering the latitudinal distribution of annual GPP relative to the latitudinal maximum, however, the distribution for GPPvodtemp is actually closer to the independent data sets than GPPvod (Fig. 4b). This suggests that although the bias largely increases in the tropics, the relative distribution between tropics and temperate to boreal regions is better represented by the setup that includes temperature.
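The relative latitudinal distribution can be illustrated with a short sketch. The function name and the (lat, lon) array layout are assumptions, and the zonal mean is taken as an unweighted average over longitude, ignoring missing cells.

```python
import numpy as np


def relative_zonal_mean(annual_gpp):
    """Zonal mean of a (lat, lon) map divided by its latitudinal maximum."""
    zonal = np.nanmean(np.asarray(annual_gpp, dtype=float), axis=1)
    return zonal / np.nanmax(zonal)
```

Dividing by the latitudinal maximum removes any multiplicative offset between data sets, so the comparison focuses on the shape of the profile rather than on absolute GPP.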
For a region in Europe (5 to 15° E and 46 to 51° N), where we generally observed an improvement in all three performance metrics, we find that for GPPvod mainly wintertime estimates of GPP are too high compared to GPPfluxcom and GPPmodis (Fig. 5). By adding temperature as input to the model, winter estimates are markedly dampened and summer estimates are only slightly increased. Nevertheless, even when including the temperature dependency, winter GPP estimates are still slightly higher for GPPvodtemp than for GPPfluxcom or GPPmodis. A similar behavior is observed for other temperate regions (Fig. A5).
In the remainder of the study, due to the observed bias (both at site level and global scale), we analyze relative rather than absolute values for comparing interannual variability and the impact of water availability. In addition, we focus our further analysis on GPPvodtemp since this setup overall showed higher performance than GPPvod. Results for GPPvod are displayed in the Supplement for comparison with GPPvodtemp.

Interannual variability and varying conditions of water availability
The latitudinal distribution of annual GPP anomalies reveals a general agreement between the GPP data sets (Figs. 6 and A6). Although differences exist between all data sets, key Despite the fact that these key features are found in all data sets, we also observe that the magnitude of the anomalies often differs between the data sets, which yields relatively high variability between all data sets. In terms of the overall latitudinal pattern, it appears that GPPvodtemp is more similar to GPPmodis than to GPPfluxcom. For the correlation of the residuals between standardized GPP (GPPvodtemp − GPPfluxcom or GPPvodtemp − GPPmodis) and SPEI, we find that large areas show no significant correlation with SPEI03 (Fig. 7a, b). For the long-term climatological water balance, i.e., SPEI12 (Fig. 7c, d), these areas with non-significant correlations further increase. In terms of model applicability, the non-significant correlations are the desired result. Given that correlations between GPPvodtemp and GPPfluxcom or GPPmodis are high in these regions, this demonstrates that GPPvodtemp shows a similar behavior to GPPfluxcom or GPPmodis in response to variations in dry or wet conditions. This finding thus provides a strong indication that the VOD-GPP relationship generally remains similar under varying conditions of water availability.
Significant correlations, both positive and negative, occur at both timescales. Negative correlations indicate that during dry conditions GPPvodtemp is higher relative to the reference GPP than during wet conditions, while positive correlations mean that during dry conditions GPPvodtemp is lower relative to the reference GPP than during wet conditions. The spatial distribution of these significant correlations is largely consistent between GPPfluxcom and GPPmodis. For the short-term response to SPEI (Fig. 7a, b), negative correlations are more frequent than positive correlations, indicating that the response to short-term drought events is often a reduction of source-driven GPP relative to sink-driven GPP. Negative correlations are mainly observed in the US corn belt, Argentina, eastern Europe, Russia and China, with the strongest negative correlations being in the US, Argentina and Russia. Positive correlations are obtained mainly over South America, Africa and Australia. For the long-term response to SPEI (Fig. 7c, d), the number of positive correlations increases. Similar to the short-term response, positive correlations are mainly found over South America, Africa and Australia.
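The residual analysis can be illustrated for a single grid cell with the sketch below. Several points are assumptions rather than the study's procedure: standardization is taken to be a z-score per data set, the correlation is Pearson's r, and significance is approximated with the Fisher z-transform, which is crude for short series. All names and values are hypothetical.

```python
import math
from statistics import NormalDist, mean, pstdev

def standardize(series):
    """z-score a series (assumed form of the standardization)."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def residual_spei_correlation(gpp_model, gpp_reference, spei):
    """Correlate (standardized model - standardized reference) with SPEI."""
    n = len(spei)
    if n <= 3:
        raise ValueError("need more than three time steps")
    residuals = [m - r for m, r in
                 zip(standardize(gpp_model), standardize(gpp_reference))]
    r = pearson(residuals, spei)
    # Approximate two-sided p-value via the Fisher z-transform.
    r_safe = max(min(r, 0.999999), -0.999999)
    z = math.atanh(r_safe) * math.sqrt(n - 3)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return r, p_value

# Hypothetical cell: both series share a seasonal cycle, but the model
# responds only weakly to SPEI while the reference responds strongly.
seasonal = [1, 3, 1, 3, 1, 3, 1, 3]
spei = [-1, -1, 1, 1, -1, -1, 1, 1]
gpp_reference = [s + w for s, w in zip(seasonal, spei)]
gpp_model = [s + 0.2 * w for s, w in zip(seasonal, spei)]

r, p_value = residual_spei_correlation(gpp_model, gpp_reference, spei)
```

In this constructed case the correlation is negative: the model is reduced less than the reference under dry conditions, so the residual is higher when SPEI is low.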
The analysis of GPPvod residuals reveals results similar to those for GPPvodtemp (Fig. A7). For GPPvod, however, the number of grid cells with non-significant correlations in the four analyses is lower by about 2 % to 4 % than for GPPvodtemp, while the global average correlation is nearly identical. The higher number of non-significant correlations for GPPvodtemp than for GPPvod is expected because the addition of temperature accounts for some variation in the VOD-based GPP estimation.
For specific regions indicated in Fig. 7, we analyzed the time series of the standardized GPP (Fig. 8) and the response to SPEI categories (Fig. A8) in order to inspect under which situations negative or positive correlations with SPEI occur.
For the region in the US corn belt (Fig. 8a), where we found moderately negative correlations with SPEI, all three GPP data sets show a reduction in summer GPP in 2006 and 2012. Compared with other years, however, the reduction of GPPvodtemp tends to be less than for GPPfluxcom and GPPmodis. This behavior can be verified by considering the residuals along the SPEI12 gradient (Fig. A8a). During dry conditions, the residuals are higher than during wet conditions. Since higher residuals indicate that GPPvodtemp is higher relative to the reference data sets, this result confirms the findings for the time series.
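Grouping residuals along an SPEI gradient can be sketched as below. The category edges are an assumption borrowed from the dryness levels used for shading in Fig. 8 (SPEI12 < −1, −1 ≤ SPEI12 < 0, SPEI12 ≥ 0); the categories in Fig. A8 may be finer. The labels, the use of the median and the sample values are hypothetical.

```python
from statistics import median

# Assumed category edges as (label, lower bound inclusive, upper bound exclusive).
SPEI_BINS = [
    ("dry", float("-inf"), -1.0),
    ("moderately dry", -1.0, 0.0),
    ("wet", 0.0, float("inf")),
]

def median_residual_by_category(residuals, spei, bins=SPEI_BINS):
    """Median residual per SPEI category; None for empty categories."""
    grouped = {label: [] for label, _, _ in bins}
    for residual, value in zip(residuals, spei):
        for label, lower, upper in bins:
            if lower <= value < upper:
                grouped[label].append(residual)
                break
    return {label: (median(values) if values else None)
            for label, values in grouped.items()}

# Hypothetical residuals (model minus reference, standardized) and SPEI12.
residuals = [0.6, 0.4, 0.1, 0.0, -0.2, -0.4]
spei12 = [-1.5, -1.2, -0.5, -0.1, 0.3, 1.1]

by_category = median_residual_by_category(residuals, spei12)
```

A decrease of the median residual from the dry to the wet category, as in this constructed example, corresponds to a negative correlation between residuals and SPEI.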
In Argentina (Fig. 8b), we observed strongly negative correlations for the analysis with SPEI. For this region, a pronounced dry condition is observed at the end of 2008 and beginning of 2009. In this period, GPPfluxcom and GPPmodis are reduced more strongly than GPPvodtemp. In the first following year, the GPPvodtemp peak is slightly lower than for GPPfluxcom and GPPmodis at the end of 2009. In the second following year (the end of 2011), GPPvodtemp is similar to that of GPPfluxcom and GPPmodis again. This result is further supported by the pronounced decrease of the residuals with SPEI12 in Fig. A8b. In addition to the interannual variability, we also find that the spring peak is more pronounced in GPPfluxcom and GPPmodis than in GPPvodtemp, which might point towards a surplus of carbohydrates in spring that are incorporated for building up biomass later in the year or may be related to differences in land cover.
For the example in Africa (Fig. 8c), where correlations with SPEI12 were positive, GPPvodtemp generally appears to be a bit higher relative to GPPfluxcom and GPPmodis at the end of each growing period. In the face of dry conditions, however, GPPvodtemp shows a stronger reduction in GPP than GPPfluxcom and GPPmodis at the end of the growing season, as observed in 2006 and 2009. Despite some differences in the time series between GPPvodtemp and the reference data sets, the temporal dynamic is generally similar between the data sets. This indicates that the sink-driven GPP shows a slightly different response to changes in environmental conditions for this region, which then results in the observed positive correlations with SPEI. Considering the residuals along the SPEI12 gradient for this region, we find that the residuals increase with SPEI12 for all categories except for very wet conditions (Fig. A8c).

The time series for Australia (Fig. 8d) shows that GPPvodtemp is generally reduced during dry conditions and increases relative to GPPfluxcom and GPPmodis during wet conditions. The increase in GPPvodtemp relative to the reference data sets appears to be strongest in the year following long-term dry conditions, i.e., in 2009, 2011 and 2012. The residuals consistently show a clear increase along the SPEI12 categories (Fig. A8d).

Impact of adding temperature as model input
The performance of the VOD-GPP model was shown to improve with the addition of an interaction term between VOD and temperature mainly in terms of temporal dynamic. Our results showed that the improvement in temporal dynamic was mainly observed for temperate and cold regions. Since the growing season in these regions is largely controlled by temperature, this indicates that the improvement may largely be a seasonal effect. When analyzing the temperature response of respiration across biomes, both spatial and temporal differences resulting from thermal acclimation need to be taken into account (Vanderwel et al., 2015). On the spatial scale, temperature sensitivity largely varies with mean annual temperature across biomes (Piao et al., 2010; Vanderwel et al., 2015). On the temporal scale, temperature-corrected respiration rates, as observed for stem respiration of deciduous trees or for needle-leaved evergreen trees, exhibit a seasonal variation, leading to higher respiration rates during summer than during winter (Maier et al., 1998; Ceschia et al., 2002; Vose and Ryan, 2002; Zha et al., 2004). Consistently, we observed a dampening of GPPvodtemp during winter compared to GPPvod. The addition of temperature thus seems to enable the model to reflect differences in basal respiration rates between growing and dormant periods in these regions. Although the temporal component of thermal acclimation of respiration appears to be the dominant contribution, the resulting dependency on temperature represents the cumulative effect of spatial and temporal thermal acclimation of respiration, as the relationship for the temperature dependency was estimated from the data without a priori assumptions.
In addition to the temperature dependency, Ra also varies with tissue nitrogen content (Maier et al., 1998; Ceschia et al., 2002; Vose and Ryan, 2002; Tjoelker et al., 2008), which may thus contribute to uncertainties in the GPP estimation derived from VOD. Ra is also known to vary between plant tissues (Vose and Ryan, 2002; Gifford, 2003). The respiration of woody tissue is generally lower than for leaves (Vose and Ryan, 2002). Since VOD generally increases with the fraction of woody vegetation (Chaparro et al., 2019), using the median of VOD as model input may potentially compensate at least partly for differences in respiration rates of stems and branches versus leaves within a grid cell.

Bias between GPP data sets
The addition of temperature dependency revealed contrasting results for the bias. While reductions in bias were observed for temperate and cold regions, a strong increase in bias was found for the tropics. Since the interaction term between VOD and T2M represents a relationship in the three-dimensional space, certain combinations of VOD and T2M intervals in the parameter space may not be well represented by the training data. FLUXNET stations are not evenly distributed around the globe, as the majority of stations are located in the temperate region. This may have caused the model to be poorly constrained in certain regions, e.g., where temperature and VOD are very high, and thus might have contributed to the increase in bias in the tropics. Therefore, additional FLUXNET stations might help to better constrain the VOD-GPP model. Nevertheless, differences between the data sets were already evident at the site level, which suggests that the observed difference at global scale may at least partly be caused by differences in the training data set. In general, the agreement in annual GPP estimates is lowest in the tropics (Anav et al., 2015). Estimates for the FLUXCOM RS setup, which was used in our study, were reported to yield lower global estimates than the FLUXCOM RS+METEO setup or GPP estimates from vegetation models (Jung et al., 2020). Similarly, MODIS was found to underestimate GPP in the tropics (Turner et al., 2006). The need for better constraints for GPP estimates especially in the tropics is well recognized (MacBean et al., 2018) and tackled in different studies (e.g., MacBean et al., 2018; Sun et al., 2018; Wu et al., 2020) but is usually hampered by the availability of in situ estimates.

Implications of possible saturation of VOD at high biomass
The choice of microwave frequency for the estimation of GPP may have certain implications. Different studies have demonstrated that L-band VOD yields more robust estimates of total aboveground biomass than X-band VOD, as low-frequency VOD does not saturate at high biomass values (Chaparro et al., 2019; Frappart et al., 2020; Li et al., 2021). Nonetheless, the impact of such potential saturation with biomass on the estimation of GPP is less trivial, especially with regard to densely vegetated areas like the tropics. Nonlinearity in the conversion between VOD and AGB should ideally be reflected in the partial dependency plot of GAM, which was also the reason for choosing this type of modeling approach. Scatterplots of the resulting GPPvodtemp estimates did not show clear signs of saturation at high in situ GPP. The FLUXNET training data set, however, only has few stations in the tropics, and thus the robustness of the model may be limited by the availability of in situ stations. Apart from this, the relationship between VOD and GPP has been found to be in closer agreement for X-band VOD than for L-band (Kumar et al., 2020), which was also observed for the correlation with in situ FLUXNET GPP (Fig. A1). At first glance, this might appear contradictory to the above-mentioned better performance of L-band VOD for biomass estimation.
A comparison of biomass estimates from different plant components with GPP, however, demonstrated that large structural components, which make up a large fraction of the total biomass, may contribute less to GPP than metabolically active plant parts (Litton et al., 2007). Since high-frequency VOD is more sensitive to small plant parts like leaves and twigs (Woodhouse, 2017), this could be an explanation why X-band VOD might be better suited for the estimation of GPP and why saturation at high total aboveground biomass may be less of an issue here.

Figure 8. Regional mean of standardized GPP values for regions as indicated in Fig. 7 over the study period. Shaded areas denote the standard deviation for the regional aggregated time series. Vertical grey areas indicate periods with different levels of dryness conditions for regional aggregated SPEI12: SPEI12 < −1 (dark grey), −1 ≤ SPEI12 < 0 (light grey) and SPEI12 ≥ 0 (white areas). Data were smoothed to aid visual comparison.

Independence of global GPP data sets
For the comparison with VOD-based GPP estimates, we used independent global data sets from FLUXCOM and MODIS. Both data sets include to some extent information from FLUXNET data. FLUXCOM has been trained against FLUXNET data (Tramontana et al., 2016; Jung et al., 2020) but with a larger number of stations than in the freely available Tier1 data set that was used for our model. In addition, MODIS has been partly calibrated to some FLUXNET stations (Running et al., 1999). Therefore, FLUXCOM and MODIS may not be fully independent of our VOD-based GPP estimates. Nevertheless, there is no alternative to using FLUXNET data for constraining absolute GPP estimates at a global scale. In addition, the agreement between GPP and VOD-based GPP estimates was also confirmed at site level using leave-site-out cross-validation. Since this analysis is independent of the comparison with global data sets, it supports the use of VOD for deriving GPP.
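The idea of leave-site-out cross-validation can be sketched as follows. For brevity, a simple least-squares line of GPP on VOD stands in for the GAM used in the study, so this is an illustration of the validation scheme only; the site identifiers and values are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares slope and intercept (stand-in for the GAM)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def leave_site_out(samples):
    """samples: list of (site_id, vod, gpp).

    For each site, the model is fitted on all other sites and evaluated on
    the held-out site. Returns {site_id: [(observed, predicted), ...]}.
    """
    results = {}
    for held_out in sorted({site for site, _, _ in samples}):
        train = [(vod, gpp) for site, vod, gpp in samples if site != held_out]
        slope, intercept = fit_line([v for v, _ in train],
                                    [g for _, g in train])
        results[held_out] = [(gpp, slope * vod + intercept)
                             for site, vod, gpp in samples if site == held_out]
    return results

# Hypothetical samples from three sites with a shared VOD-GPP relationship.
samples = [
    ("A", 0.2, 2.0), ("A", 0.4, 4.0),
    ("B", 0.3, 3.0), ("B", 0.6, 6.0),
    ("C", 0.5, 5.0), ("C", 0.8, 8.0),
]
cv_results = leave_site_out(samples)
```

Because no observation from the held-out site enters the fit, the resulting predictions indicate how well the relationship transfers to locations not seen during training.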

The "zero-GPP problem" and non-structural carbohydrates
For GPPvodtemp, we observed that winter GPP values for an example over Europe were slightly higher compared to GPPfluxcom and GPPmodis. This issue of estimating GPP values close to zero was also observed in the scatterplots between GPPvodtemp and in situ GPPfluxnet. The reason for the overestimation at low GPP may be on the one hand an artifact related to the rehydration of plant residues after rain events and on the other hand may be explained by the sink-driven nature of our approach. In the latter case, the non-zero GPPvodtemp values may be caused by perennial vegetation. Both evergreen and deciduous vegetation respire throughout the dormant period (Maier et al., 1998; Vose and Ryan, 2002) and concurrently contain water. In turn, this presence of vegetation water content is detected through microwave sensors, leading to non-zero GPPvodtemp estimates. It thus may point towards the existence of a storage term. In plants, photosynthetic assimilates can be stored in the form of non-structural carbohydrates (NSCs), which can be converted back to plant usable sugars to support respiration during the dormant period and growth at the start of the growing season (e.g., Martínez-Vilalta et al., 2016). For tropical forest plots, the balancing of plot-level measurements of source and sink terms showed a decoupling between the two in response to drought, which the authors attributed to the existence of NSC (Doughty et al., 2015). Such a storage term can thus support a temporary imbalance between sources and sinks of carbon, which may translate into differences between source- and sink-driven GPP.

Magnitude of input terms
Based on the partial dependency plots, we found that for the maintenance-related term, i.e., the interaction term between VOD and T2M, the value range is higher than for VOD. Although our model represents the sum of NPP and growth Ra and not just growth Ra, the magnitude of the two input terms is consistent with studies that analyzed the contribution of maintenance and growth to Ra. For whole plants and for stem respiration of boreal needle-leaved trees, maintenance respiration was shown to play the dominant role for Ra, with a contribution of 70 % (Chambers et al., 2004) and 80 % (Zha et al., 2004), respectively.

Response to water availability
The analysis of VOD-GPP residuals with respect to FLUXCOM and MODIS revealed that GPPvodtemp largely showed a similar behavior to the independent GPP data sets as demonstrated by the widespread non-significant correlations with SPEI. This result is further supported by the general agreement in interannual variability. In addition to the possible impact of NSC, occurrences of significant correlations between VOD-GPP residuals and SPEI may indicate different plant strategies for dealing with changes in dry or wet conditions. For negative correlations, this could be mainly related to differences in plant hydraulics, while for positive correlations, it might indicate shifts between aboveground and belowground carbon allocation. Different plant strategies with regard to hydraulics can be expressed with the concept of isohydricity, which describes the regulation of stomatal control (Konings and Gentine, 2017; Giardina et al., 2018; Martínez-Vilalta and Garcia-Forner, 2017). At an ecosystem level, this parameter can be obtained using the difference in twice-daily overpasses of microwave observations (Konings and Gentine, 2017). Although Martínez-Vilalta and Garcia-Forner (2017) argue that the regulation of water potential may not necessarily be strongly coupled with the assimilation during drought, the degree of isohydricity may still be an explanation for the observed variation in GPPvodtemp relative to GPPfluxcom and GPPmodis. Pronounced negative correlations for the analysis of GPP residuals were found in Argentina and the US corn belt, which are regions where Konings and Gentine (2017) observed high values of isohydricity.
Corn, which exhibits isohydric behavior (Lambers and Oliveira, 2019; Martínez-Vilalta and Garcia-Forner, 2017), i.e., it maintains water potential through strong regulation of stomata, additionally has the ability, like other grasses, to roll up leaves in response to drought to reduce water loss through the plant's cuticle (e.g., Ribaut et al., 2009). In conjunction with the isohydric behavior, this might be an explanation for the strong signal reduction of GPPfluxcom and GPPmodis relative to GPPvodtemp observed over Argentina. Although our analysis is based on 8 d time steps, characteristics of plant hydraulics which are retrieved from sub-daily data show similar features to those of our analysis of residuals between source- and sink-driven GPP in response to changes in water availability.
In contrast to the isohydric behavior, anisohydric behavior should not lead to pronounced differences between GPPvodtemp and GPPfluxcom or GPPmodis as stomatal conductance and leaf water potential are both reduced in response to dry conditions (Lambers and Oliveira, 2019). The anisohydric behavior thus potentially relates to the non-significant correlations. Nevertheless, the degree of isohydricity may also vary between wet and dry seasons (Konings and Gentine, 2017), which also needs to be taken into account for the interpretation of the residuals.
The observed positive correlations, i.e., reductions of GPPvodtemp relative to GPPfluxcom or GPPmodis, could be associated with a stronger shift of assimilates to belowground plant organs. Different studies have shown that root growth may increase in the face of drought to maintain water access (Sanaullah et al., 2012; Burri et al., 2014) and consequently also nutrient supply (Lambers and Oliveira, 2019). Since VOD observations only detect aboveground living vegetation, a shift towards belowground plant organs may lead to apparently lower GPPvodtemp. Nevertheless, the inverse, i.e., an increase of allocation to shoots, was also observed in the presence of legume species during drought (Sanaullah et al., 2012) and for tropical forest plots after drought (Doughty et al., 2015).
Comparisons of GPPvodtemp with in situ observations of vegetation properties during such extreme events like drought, however, may be needed to improve the understanding of the plant's response to changes in environmental conditions at the ecosystem to global scale.

Conclusions
The VOD-GPP model was analyzed with regard to the impact of adding temperature as model input in order to account for the temperature dependency of autotrophic respiration. The resulting GPP estimates, GPPvodtemp, showed a high consistency with GPPfluxcom and GPPmodis for the temporal dynamic both at intra- and interannual timescales.
For bias and error, the addition of temperature resulted in a regionally diverse response with a general improvement for temperate and cold regions and a decrease in performance mainly in the tropics. The improvement upon adding temperature, however, was less than might have been expected, which indicates that the previous lack of temperature dependency in the model formulation can only partly account for the observed differences between the global GPP data sets. Nevertheless, this result demonstrates that an improvement by adding temperature is possible but might require further model constraints for a more robust estimation of GPP.
The analysis of the VOD-GPP residuals revealed that GPPvodtemp largely behaves similarly to GPPfluxcom and GPPmodis with respect to SPEI. This highlights that the relationship between VOD and GPP generally may be valid even under varying conditions of water availability. For some regions, where significant correlations were observed, the observed differences between GPPvodtemp and GPPfluxcom or GPPmodis may indicate different plant strategies for dealing with drought conditions.
Author contributions. IET conceived the study, carried out the analysis and drafted the manuscript with contributions from WD and MF regarding the study design. BW contributed to data preparation. LM provided VOD estimates from VODCA. All authors discussed the results and commented on the manuscript.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "Microwave remote sensing for improved understanding of vegetation-water interactions (BG/HESS inter-journal SI)". It is a result of the EGU General Assembly 2020, 3-8 May 2020.