A double peak in the seasonality of California’s photosynthesis as observed from space

Solar-Induced chlorophyll Fluorescence (SIF) has been shown to be a powerful proxy for photosynthesis and gross 1 primary productivity (GPP). The recently launched TROPOspheric Monitoring Instrument (TROPOMI) features the required 2 spectral resolution and signal-to-noise ratio to retrieve SIF from space. Here we present a downscaling method to obtain 3 500-m spatial resolution SIF over California. We report daily values based on a 14-day window. TROPOMI SIF data show a 4 strong correspondence with daily GPP estimates at AmeriFlux sites across multiple ecosystems in California. We find a linear 5 relationship between SIF and GPP that is largely invariant across ecosystems with an intercept that is not significantly different 6 from zero. Measurements of SIF from TROPOMI agree with MODIS vegetation indices (NDVI, EVI, and NIRv) at annual 7 timescales but indicate different temporal dynamics at monthly and daily timescales. TROPOMI SIF data show a double peak 8 in the seasonality of photosynthesis, a feature that is not present in the MODIS vegetation indices. The different seasonality 9 in the vegetation indices may be due to a clear-sky bias in the vegetation indices, whereas previous work has shown SIF to 10 have a low sensitivity to clouds and to detect the downregulation of photosynthesis even when plants appear green. We further 11 decompose the spatio-temporal patterns in the SIF data based on land cover. The double peak in the seasonality of California’s 12 photosynthesis is due to two processes that are out of phase: grasses, chaparral, and oak savanna ecosystems show an April 13 maximum while evergreen forests peak in June. An empirical orthogonal function (EOF) analysis corroborates the phase offset 14 and spatial patterns driving the double peak. The EOF analysis further indicates that two spatio-temporal patterns explain 84% 15 of the variability in the SIF data. Results shown here are promising for obtaining global GPP at sub-kilometer spatial scales 16 and identifying the processes driving carbon uptake. 17


18
Photosynthesis is the process by which plants and other organisms use sunlight to synthesize carbon dioxide (CO 2 ) and water 19 to glucose and oxygen. Accurate knowledge of gross primary productivity (GPP) through photosynthesis is crucial for under-20 standing the land-atmosphere carbon exchange, which is one of the largest and most uncertain aspects of the global carbon 21 resolution. Efforts are underway to create a multi-decadal SIF record using different space-borne instruments (Parazoo et al.,  25 different remote sensing techniques for retrieving SIF from space-borne measurements. 26 Some work has shown SIF to be a better measure of carbon uptake than other vegetation indices that look at canopy "green-27 ness". This is, in part, because indices like the normalized difference vegetation index (NDVI) are a measure of photosynthetic 28 capacity (Sellers et al., 1986) whereas SIF is a measure of the photosynthetic activity and is coupled to the radiation regime. 29 For example, Luus et al. (2017) showed that the seasonal cycle of a biosphere model driven by SIF agreed with measurements 30 of CO 2 whereas the seasonal cycle from the model driven by the Enhanced Vegetation Index (EVI) was markedly different 31 from the CO 2 observations. Joiner et al. (2011) showed that the seasonal cycles of SIF and EVI agree in some regions, but not 32 others. Walther et al. (2016) showed a decoupling of the photosynthesis and greenness dynamics in boreal evergreen forests 33 by comparing SIF and EVI to model estimates of GPP, with SIF better capturing the seasonality of both deciduous broadleaf 34 and evergreen needleleaf forests. Again, this is likely due to SIF capturing photosynthetic activity, rather than photosynthetic 35 capacity. More recently, Magney et al. (2019a) demonstrated a mechanistic link between SIF and GPP over the course of a 1 year in a winter-dormant Northern Hemisphere conifer forest, despite retaining chlorophyll through the winter. Magney et al. 2 (2019a) highlighted the potential for new satellite measurements of SIF from TROPOMI and OCO-2 to track GPP at coarse 3 spatial resolution (3.5×7 km 2 ). Here we present an oversampling and downscaling method to obtain daily estimates of SIF at 500-m resolution. To our 5 knowledge, this is the highest resolution SIF dataset from satellite measurements. We then compare this down-scaled 500-m SIF 6 data to AmeriFlux sites across the state of California to assess the relationship between SIF and GPP. We finish by decomposing 7 California's spatio-temporal patterns of photosynthesis and carbon uptake into the dominant modes using empirical orthogonal 8 functions (EOFs). Here we focus on California because there are a number of eddy flux towers and it encompasses a range of 9 diverse ecosystems including: deciduous and evergreen forests, irrigated croplands, and grasslands (see Fig. 1). 10 3 2 Measurements of SIF, vegetation, and GPP 1 2.1 Satellite measurements of Solar-Induced chlorophyll Fluorescence (SIF) from TROPOMI 2 The TROPOspheric Monitoring Instrument (TROPOMI; Veefkind et al., 2012) is a nadir-viewing imaging spectrometer with 3 bands in the UV, visible, near-infrared, and shortwave infrared aboard the Sentinel-5 Precursor satellite. The Sentinel-5 Precur- 4 sor satellite was launched into low earth orbit on October 13, 2017 with an equatorial crossing time of 13:30 local solar time 5 and a 16 day orbit cycle. TROPOMI has a wide swath (2,600 km across track) enabling near-daily global coverage. The spatial 6 resolution of the ground pixels is 7 km along track 1 and 3.5-15 km across track (3.5 km at nadir and 15 km at the edge of the 7 swath). Of particular relevance here is the near-infrared band (725-775 nm) that covers the far-red part of SIF emission and is devoid of atmospheric absorption features. TROPOMI has a spectral resolution of ∼0.4 nm and a signal-to-noise ratio of 12 ∼2,500 in this retrieval window. The TROPOMI SIF retrieval uses a singular value decomposition to derive the spectral basis 13 functions from TROPOMI data over vegetation-free areas (e.g., oceans, ice, and deserts).
14 One particularly attractive feature of space-borne SIF retrievals is the low sensitivity to atmospheric scattering by aerosols 15 and clouds. Specifically, Frankenberg et al. (2012) showed that 80% of the emitted SIF could be retrieved in the presence of 16 clouds with low-to-moderate optical thickness. As such, Köhler et al. (2018) filtered out pixels with cloud fractions larger 17 than 80% based on VIIRS observations; we use this same cloud filtering here. This weak sensitivity to clouds is in contrast 18 to reflectance based measures of vegetation (e.g., NDVI) that can only be made in clear-sky conditions, potentially inducing a 19 clear-sky bias in reflectance based vegetation indices.

20
Here, we apply one additional bias correction to the TROPOMI retrievals that was not included in Köhler et al. (2018). We 21 find some mostly barren regions have systematically negative SIF values, which is non-physical. This bias is thought to be 22 related to bright surfaces and is likely due to the choice of training data for the spectral basis functions. We are investigating 23 ways to correct this globally. In the interim, we compute a spatio-temporal bias correction b i,j,k (where i, j are the spatial 24 coordinates and k is the temporal coordinate) such that the mean SIF for a given location over a 30-day moving window is 25 always positive. That is to say, 27 wheres i,j,k is the 1-month block average for the k th day at location i, j. This still allows for negative SIF values due to vari-28 ability and noise but will shift the mean SIF for a given 500-m grid cell to be positive. In practice, this bias correction is small 29 with 78% having no bias correction at all and 89% of the grid cells having a bias correction smaller than 0.1 mW/m 2 /sr/nm.

30
The bias correction primarily impacts desert regions in Southeastern California (see Supplemental Fig. S1).
where ρ NIR , ρ red , and ρ Blue are the reflectances in their respective MODIS bands and G, C 1 , C 2 , and L are coefficients for the 13 MODIS EVI algorithm (L = 1, C 1 = 6, and C 2 = 7.5, G = 2.5). 14 2.3 GPP estimates from AmeriFlux eddy covariance sites 15 AmeriFlux is a network of long-term eddy covariance sites that launched in 1996 (Baldocchi et al., 2001). These eddy covari-16 ance sites provide a direct measure of net ecosystem exchange (NEE; CO 2 fluxes) (Baldocchi et al., 1988) and can be used 17 to evaluate both bottom-up models and satellite proxies of carbon exchange. Disentangling the CO 2 fluxes into GPP (CO 2 18 uptake) and total ecosystem respiration (R eco ; CO 2 released) generally requires making assumptions about the temperature 19 dependence of the respiration which can induce biases in the GPP estimate (Reichstein et al., 2005). Nevertheless, these eddy 20 covariance sites provide the best estimate of site level GPP across multiple ecosystems in California including: croplands, wet-21 lands, woody savannas, and grasslands. Here we use data from 11 AmeriFlux sites across California (see Table 1) to evaluate 22 the SIF retrievals from TROPOMI. NEE flux partitioning at these sites was performed using artificial neural networks from  land. A few features that immediately stand out are: 28 1. The strong correspondence between EVI and NIR v . We find a nearly linear relationship between these two indices 29 (r 2 = 0.98).  2. All three MODIS indices are well correlated with eachother (r 2 > 0.85). We do observe a weakly non-linear relationship 1 between NIR v and EVI or NDVI (see the curvature in the NIR v row).  Of the three vegetation indices examined here, we find the strongest relationship between NIR v and SIF, but it only explains 6 half of the variability on daily timescales (r 2 = 0.52). The agreement improves at coarser temporal scales (annual r 2 = 0.83-  As mentioned above, the nominal spatial resolution of the ground pixels from TROPOMI are 3.5×7 km 2 at nadir. However, 4 the wide swath from TROPOMI (2600 km across-track) often results in multiple observations per day (see right panel of 5 Fig. 3). Additionally, the orientation of these swaths differ over the 16-day orbit cycle allowing us to infer higher spatial 6 resolution than the nominal spatial resolution. This idea has been widely used with the space-borne OMI instrument that 7 preceded TROPOMI (see Sun et al., 2018a, and references therein for a discussion of oversampling with OMI observations). 8 However, the spatial resolution of TROPOMI is a factor of 15 finer than OMI (3.5×7 km 2 for TROPOMI and 14×26 km 2 for 9 OMI, both at nadir). Oversampling with OMI often required years of observations (e.g., Zhu et al., 2014). The wide swath and 10 dense spatial coverage of TROPOMI allow us to perform biweekly oversampling. 11 8 Fig. 3 shows a schematic of how the oversampling is performed. The left panel shows two hypothetical swaths from 1 TROPOMI overlaid on a 500-m grid (same spatial resolution as the MODIS NBAR dataset). Areas where the swaths overlap 2 allow us to partition the information down to a finer spatial scale. For example, the yellow pixel in swath B overlaps with all 3 four pixels from swath A. As such, the signal from that pixel in swath B can be sub-divided to finer spatial scales. Each unique 4 shade of color would correspond to unique information in the left panel of Fig. 3. The right panel of Fig. 3 shows the sampling 5 density of TROPOMI over California on a single day in June 2018, the dark blue region indicates where two TROPOMI swaths 6 overlapped that day. 7 We find that, on average, each 500-m grid cell is within the bounds of ∼0.6 TROPOMI scenes with a successful retrieval per 8 day. By using biweekly oversampling (a moving 14-day window) we obtain approximately 8 different swath orientations over 9 a 14-day period for the oversampling. These 8 swath orientations allow us to further refine our grid to following the schematic 10 shown in Figure 3. It also means that the daily values presented here are representative of 14-day moving averages (centered 11 about that day). 12 We can take the oversampling a step further by pre-weighting the SIF signal in a TROPOMI scene by the underlying 13 vegetation fraction, we refer to this as "downscaling". That is to say, we assume the observed SIF from TROPOMI in a given 14 scene likely originates from more vegetated regions within that scene. Here we use a relative weighting for this downscaling: where s is the retrieved SIF from TROPOMI for a single scene, s i,j is the SIF spatially downscaled to 500-m, v i,j are the  produced at weekly temporal frequency whereas we produce daily estimates using a 14-day moving window. The oversampling 28 and downscaling methods both yield consistent large-scale patterns and seasonal cycles (left panels in Fig. 4). The main impact 29 of the MODIS-based local downscaling is a sharpening effect. This can be seen in the right column of Fig. 4. Importantly, the oversampling and downscaling methods and the nuanced difference in processing allow for analysis at much finer spatio-1 temporal scales. That is to say that we are not inducing large-scale changes in the spatio-temporal patterns with these different 2 methods of processing, those are robustly driven by the underlying SIF retrievals. 3 4 Inferring GPP from SIF 4 Previous work has shown strong empirical relationships between SIF and GPP at coarse spatial scales (e.g., Walther et al., and GPP, this follows from a simple relational analysis. From Monteith theory (Monteith, 1972) we can write: where Φ CO2 is the light use efficiency of CO 2 assimilation, I is the photosynthetically active radiation (PAR), and α is the where Φ F is the the light-use efficiency of SIF and β is the probability of SIF photons escaping the canopy. Rearranging yields: at sub-daily timescales, which implicitly points to non-linearities in Φ CO2 /(βΦ F ). β will be a function of the canopy structure 20 and likely differs between ecosystems, although some studies have argued that reflectance measurements could be used to infer 2019b), owing the increased linearity at the canopy scale to averaging SIF and GPP over many different leaf angles exposed to 25 highly heterogenous light environments.
26 Figure 5 compares the TROPOMI SIF retrievals to observations from AmeriFlux sites across California (see Table 1 and  dots are the actual TROPOMI SIF retrievals at that location that have the scene-specific relative weighting from the MODIS 1 NIR v (Eq. 5). No temporal smoothing has been applied in Fig. 5. We find a strong correspondence between SIF and GPP AmeriFlux sites in California indicates larger inter-ecosystem differences in the SIF-GPP relationship than intra-ecosystem 17 differences, lending credence to this universal scaling factor. However, there are two important caveats: 1) we do not have an 18 eddy covariance site in an evergreen forest, which is a major limitation as much of California is dominated by evergreen forests 19 and 2) we are not directly measuring GPP with SIF. As such, we refer to this SIF-estimated GPP as: 20 GPP * := 18.5 · SIF.

21
This single scaling from Eq. 9 seems to be a reasonable relation given the available information, with the caveat that there 22 could be differences between ecosystems that are unaccounted for. To reiterate, there is an clear correspondence between the 23 observed SIF and GPP estimated for the different AmeriFlux sites (see Fig. 5) but we have a limited number of AmeriFlux sites 24 in California that do not cover all ecosystems. As such, we do not report GPP here and have included an asterisk to highlight 25 the caveats with the relationship presented here. Future work to obtain a more robust SIF-GPP relationship covering more 26 ecosystems is warranted.  We can use the CropScape database (see Fig. 1) to determine the ecosystems driving the spatio-temporal patterns in the 1 TROPOMI SIF data as it provides land cover classifications across the state of California at 30-m spatial resolution. However, 2 a notable limitation of the classifications from the CropScape data is the lack of discrimination for non-cropland areas. For 3 example, grasslands and pastures are combined into a single land type that seems to also include regions that would typically 4 be defined as oak savanna and chaparral. In lieu of a better sub-kilometer land cover dataset, we use the classifications from 5 the CropScape database for this work.
6 Figure 6a shows a breakdown of the regions contributing to the statewide SIF signal based on the land cover data from 7 the CropScape database. We find the California grasslands and pastures (a single classification that also includes chaparral 8 and oak savanna) have a single peak that coincides with the first statewide peak in April, this is consistent with the seasonal 9 cycle at California grassland sites in the AmeriFlux network (Fig. 5) that show a unimodal peak in the spring that ends in 10 May. Figs. 6e and 6e' show the mean spatial pattern in April 2018 and 2019, respectively, where we see that the April peak 11 coincides with a statewide increase in SIF. There are a few pertinent hotspots in grasslands or pastures during this April peak.  The second peak in June shows a dominant contribution from evergreen forests (Fig. 6a). This can also be seen in the 16 spatial patterns from Fig. 6f and 6f' where the evergreen forests in Northern California exhibit a strong SIF signal. California's 17 Central Valley can be clearly distinguished as the surrounding hills have dried out (predominantly oak savanna and chaparral). 1 The observed photosynthesis from the Central Valley is maintained by heavily irrigated cropland throughout the valley. 2 The yellow line in Fig. 6b indicates the fraction of SIF in California that comes from cropland. We see the largest relative 3 contribution occurring in the fall. However, this is is primarily because all other ecosystems have gone dormant (see Fig. 6g) 4 as opposed to an increase in photosynthetic activity from cropland. The only region that shows an increase in photosynthesis 5 are the rice fields in the Sacramento Valley (the valley surrounding Sutter Buttes at 39.1 • N, 121.5 • W) in Northern California. 6 The rice fields show a SIF signal in excess of 2.5 mW/m 2 /sr/nm during the fall (GPP * in excess of 45 µmol/m 2 /s). Both 2018 and 2019 show a double peak in the seasonal cycle, however, the onset of the grassland driven peak differs 8 substantially between the two years. This difference is likely driven by the increased precipitation in 2019 (blue line in Fig. 6d). 9 There was 50% more precipitation in 2019 compared to 2018 and the precipitation occurred earlier in the water year. By mid- 10 February 2019 there was more precipitation than the annual total from 2018. This early precipitation allowed for an earlier and 11 longer growing season for the grasses. Figure 8 shows the difference in spatial patterns between 2019 and 2018. In general, we 12 find reasonable consistency between the two years in April and June but substantial differences in the Fall. We find a factor of 2 negative anomalies in Fig. 8b), the impact of these fires is currently the focus of forthcoming work. An additional feature that 5 stands out is the positive SIF anomaly in Southern California, this increase in 2019 is due to the low rainfall the previous year. 6 Interestingly, none of the MODIS vegetation indices (NDVI, EVI, or NIR v ) show this double peak in photosynthesis 7 (Fig. 6c). The seasonal cycles from the three vegetation indices show a greening that starts in mid-winter (begins in De- 8 cember 2018) and increases roughly linearly to a peak in April. All three vegetation indices maintain that peak until July when higher elevations, making water available for many of the needleleaf evergreen species in the Sierras and Coastal ranges, then 28 the water resources become depleted and temperatures cool prompting these evergreen species to go back into a photoprotec- 29 tive state, resulting in a short photosynthetically active growing season that has been shown to be more well characterized by 30 SIF from GOME-2 than MODIS NDVI and EVI (Zuromski et al., 2018). Future work comparing SIF and MODIS indices with 31 measured PAR at AmeriFlux sites would be useful in further evaluating the role of radiation and physiology in the double peak 32 feature. Figure 7 shows a Hovmöller diagram for three transects across Northern California (see Figure 1 for the location of the 34 transects). Transect A in Figure 7 shows the short but strong SIF signal from the rice fields. The timing of the SIF signal 35 from the rice fields agrees with the growing cycle for rice in California. Rice in the Sacramento Valley is typically planted in 1 mid-to-late May, the fields are then flooded, and harvested in mid-to-late September (University of California at Davis, 2018). interpreted as physical modes of the system. 15 Figure 9 shows the first two EOFs and their associated PCs for the TROPOMI SIF data over California, the corresponding 16 eigenvalue spectrum can be seen in Fig. S5. The first two EOFs corroborate the findings from Section 5 and, taken together, 17 explain 84% of the variability in the TROPOMI SIF data:

18
-EOF 1: the mean signal 19 -EOF 2: the double peak 20 The first EOF (Fig. 9a) represents the mean signal and explains 74% of the variability in the TROPOMI SIF data. From the 21 spatial pattern we can see that it includes most of the biomass in California and is strongly correlated to the state-wide mean 22 SIF: r 2 = 0.99. The associated principal component bears a strong similarity to the statewide mean SIF seasonal cycle (Fig. 6a). 23 This finding is not entirely surprising because we are using un-normalized SIF data for the matrix factorization. This means 24 that the most important mode of variability is the mean signal and that the following EOFs are anomalies relative to the mean 25 signal. 26 The second EOF (Fig. 9b) represents the double peak in the timing of California's photosynthesis. This EOF combines the 27 signal from the grasslands (positive phase of EOF 2) and the evergreen forests (negative phase of EOF 2). We find EOF 2 28 to be positively correlated with the grassland fraction from the CropScape database (r = 0.55) and negatively correlated with 29 the evergreen forests (r = -0.36). There is also a negative correlation with the rice fields (r = -0.32). The associated principle 30 component serves to amplify the seasonal cycle from EOF 1 in grasslands during April and amplify the forest peak in June. 31 This is because the red region (grasslands) in Fig. 9b will contribute a positive anomaly in April and a negative anomaly in June. Conversely, the blue region (evergreen forests and rice) will contribute a negative anomaly in April and a positive anomaly in 1 June. This EOF arises because the grasslands and forests are both spatially separated and out of phase with each other, allowing 2 the matrix factorization to place them into a single EOF that represents the processes driving the double peak in the timing of 3 California's photosynthesis. 4 It should be noted that these EOF patterns found here are unlikely to be true "physical modes" (see, for example, Monahan 5 et al., 2009). That is to say, we would not necessarily expect the response to a perturbation to follow patterns shown in Fig. 9.
6 EOF 2 is a good example of this because it seems unlikely that the grasslands and forests will exhibit opposing responses to a 7 forcing. Grasslands and forests are combined into a single EOF simply because there is little loss of information by combining 8 them due to the spatial separation and phase offset. This is not to argue against the utility of EOFs. EOFs are a useful method 9 for identifying structure in geophysical datasets, as evidenced here by their identification of the double peak in the timing of 1 California's photosynthesis. We present an oversampling and downscaling method to produce daily estimates of Solar Induced chlorophyll Fluorescence 4 (SIF), a proxy for photosynthetic activity, at 500-m spatial resolution from TROPOMI. To our knowledge, this is the highest 5 spatial resolution SIF dataset from satellite measurements. We find a double peak in the seasonality of photosynthesis in 6 California during 2018 and 2019, a feature that is not present in the MODIS vegetation indices (NDVI, EVI, or NIR v ). Analysis 7 of the spatial and temporal patterns of the SIF data indicates that the double peak is due to two processes that are out of phase 8 with each other: woody grasslands (e.g., grasslands, chaparral, and oak savanna) and evergreen forests. 9 Our work applies methods developed for previous satellite retrievals (oversampling) and uses estimates of sub-grid scale 10 vegetation (downscaling) to produce daily 500-m spatial resolution SIF from TROPOMI over California. We use a 14-day 11 moving window to produce this esimate. The oversampling method results in a smooth spatial field and removes artifacts due 12 to complex topography and the wide TROPOMI swath. The downscaling method further refines the high resolution spatial 13 patterns by bringing in a priori information on the sub-grid vegetation patterns. The oversampling and downscaling methods

14
do not alter the large scale spatio-temporal patterns as they conserve the SIF signal over a single scene. 15 TROPOMI SIF data and MODIS vegetation indices are reasonably consistent at annual timescales over California, but show 16 weaker relationships at daily and monthly timescales. This implies that TROPOMI SIF is providing some information that is 17 distinct from the MODIS vegetation indices. TROPOMI SIF data show a strong correspondence with half-hourly estimates of 18 GPP at multiple AmeriFlux sites across different ecosystems including: cropland, grassland, savanna, and wetlands. We find a 19 linear relationship between SIF and GPP that is largely invariant across ecosystems with an intercept that is not significantly 20 different from zero. As such, we use SIF as an estimate of GPP * with the caveat that some ecosystems are not represented in 21 our California analysis.

22
The double peak in the seasonality of California's photosynthesis observed by TROPOMI SIF is due to two processes that are 23 out of phase with each other: grasses show a maximum in April and evergreen forests peak in June. An empirical orthogonal 24 function (EOF) analysis corroborates the phase offset and spatial patterns driving the double peak. The EOF analysis also 25 indicates that two spatio-temporal patterns explain 84% of the variability in the TROPOMI SIF data. 26 The results shown here are promising for obtaining global near-daily GPP at sub-kilometer spatial scales using satellite 27 measurements. This, in turn, may prove helpful in addressing long-standing questions regarding the mechanisms and locations 28 driving carbon uptake in the Northern Hemisphere. It would also allow us to monitor climate change impacts on vulnerable 29 ecosystems at local-to-global scales.    36 Gamon, J. A., Huemmrich, K. F., Stone, R. S., and Tweedie, C. E.: Spatial and temporal variation in primary productivity (NDVI) of