Source partitioning of H 2 O and CO 2 fluxes based on high-frequency eddy covariance data : a comparison between study sites

For an assessment of the roles of soil and vegetation in the climate system, a further understanding of the flux components of H2O and CO2 (e.g., transpiration, soil respiration) and their interaction with physical conditions and physiological functioning of plants and ecosystems is necessary. To obtain magnitudes of these flux components, we applied source partitioning approaches after Scanlon and Kustas (2010; SK10) and after Thomas et al. (2008; TH08) to high-frequency eddy covariance measurements of 12 study sites covering different ecosystems (croplands, grasslands, and forests) in different climatic regions. Both partitioning methods are based on higher-order statistics of the H2O and CO2 fluctuations, but proceed differently to estimate transpiration, evaporation, net primary production, and soil respiration. We compared and evaluated the partitioning results obtained with SK10 and TH08, including slight modifications of both approaches. Further, we analyzed the interrelations among the performance of the partitioning methods, turbulence characteristics, and site characteristics (such as plant cover type, canopy height, canopy density, and measurement height). We were able to identify characteristics of a data set that are prerequisites for adequate performance of the partitioning methods. SK10 had the tendency to overestimate and TH08 to underestimate soil flux components. For both methods, the partitioning of CO2 fluxes was less robust than for H2O fluxes. Results derived with SK10 showed relatively large dependencies on estimated water use efficiency (WUE) at the leaf level, which is a required input. Measurements of outgoing longwave radiation used for the estimation of foliage temperature (used in WUE) could slightly increase the quality of the partitioning results. A modification of the TH08 approach, by applying a cluster analysis for the conditional sampling of respiration–evaporation events, performed satisfactorily, but did not result in significant advantages compared to the original method versions developed by Thomas et al. (2008). The performance of each partitioning approach was dependent on meteorological conditions, plant development, canopy height, canopy density, and measurement height. Foremost, the performance of SK10 correlated negaPublished by Copernicus Publications on behalf of the European Geosciences Union. 1112 A. Klosterhalfen et al.: Source partitioning of H2O and CO2 fluxes tively with the ratio between measurement height and canopy height. The performance of TH08 was more dependent on canopy height and leaf area index. In general, all site characteristics that increase dissimilarities between scalars appeared to enhance partitioning performance for SK10 and TH08.

radiation used for the estimation of foliage temperature (used in WUE) could slightly increase the quality of the partitioning results. A modification of the TH08 approach, by applying a cluster analysis for the conditional sampling of respiration/evaporation events, performed satisfactorily, but did not result in significant advantages compared to the original method versions developed by Thomas et al. (2008). The performance of each partitioning approach was dependent on meteorological conditions, plant development, canopy height, canopy density, and measurement height. Foremost, the 5 performance of SK10 correlated negatively with the ratio between measurement height and canopy height. The performance of TH08 was more dependent on canopy height and leaf area index. In general, all site characteristics which increase dissimilarities between scalars appeared to enhance partitioning performance for SK10 and TH08.

Introduction
The eddy covariance (EC) method is a micrometeorological technique commonly used to measure the energy, water vapor, 10 and carbon dioxide exchange between biosphere and atmosphere across a large range of scales in time and space (Baldocchi et al., 2001;Reichstein et al., 2012). The measurements help to understand the temporal and spatial variations of these fluxes at the land-atmosphere interface. However, the EC method quantifies only net fluxes of water vapor, i.e. evapotranspiration (ET), and the net ecosystem exchange of CO 2 (NEE). Thus, for a better assessment of the role of soil and vegetation in the climate system, a further understanding of the flux components of H 2 O and CO 2 and their interaction with physical 15 conditions and physiological functioning of plants and ecosystems is necessary (Baldocchi et al., 2001). To obtain magnitudes of transpiration, evaporation, photosynthesis, and respiration by soil and vegetation, certain measurements with additional instrumentation independent of the EC technique can be conducted. Alternatively or additionally, so-called source partitioning approaches can be applied to the net fluxes obtained with the EC method. For instance, with the notion that during night no CO 2 is assimilated by plants (and hence, observed NEE equals respiration), respiratory fluxes are often 20 estimated based on semi-empirical models describing the relationship between a physical driver (e.g., temperature) and respiration (Lloyd and Taylor, 1994;Reichstein et al., 2005Reichstein et al., , 2012. To estimate soil surface fluxes of both H 2 O and CO 2 directly from high frequency EC data without assumptions on such drivers, two distinct partitioning approaches were developed by Scanlon and coauthors (Scanlon and Sahu, 2008;Scanlon and Kustas, 2010), and Thomas et al. (2008). Both approaches rely on the assumption that the presence of multiple sources and sinks in and below the canopy will lead to 25 decorrelation of the high frequency scalar concentrations measured by the EC method above the canopy. This decorrelation contains information about the strength of these sinks and sources, which can be quantified by applying the flux-variance similarity theory or conditional sampling strategies. The scalar-scalar-correlations of H 2 O and CO 2 are however not only influenced by the sink-source distribution, but also by height (atmospheric surface layer, roughness sublayer), surface heterogeneity (Williams et al., 2007), canopy density, and coherent structures (Edburg et al., 2012;Huang et al., 2013). 30 The source partitioning approach after Scanlon and Sahu (2008) and Scanlon and Kustas (2010) has already been applied to data acquired above a corn field (eastern USA; Scanlon and Kustas, 2012), has been compared to an isotopic H 2 O flux partitioning method (Good et al., 2014) and to the Noah Land Surface Model  both for grasslands, and has been evaluated on a forest site on a decadal time scale (Sulman et al., 2016). Zeeman et al. (2013) further investigated the partitioning approach after Thomas et al. (2008) in association with coherent structures. To better assess these two approaches and their theoretical background, an intercomparison at a variety of study sites is necessary . 5 The objective of this study is to compare and evaluate the source partitioning approaches after Scanlon and Kustas (2010) and after Thomas et al. (2008) by applying them to high frequency scalar measurements of various study sites in different ecosystems. In addition to testing slight modifications of both partitioning methods, conditions and characteristics of study sites are identified under which the methods perform best. Based on findings of the above-mentioned authors and a large eddy simulation (LES) study (Klosterhalfen et al., 2019), we hypothesize that the methods' performance is dependent on the 10 canopy height (h c ), which should represent the vertical separation of sinks and sources of H 2 O and CO 2 between canopy top and soil surface, on the canopy density (leaf area index LAI, or expressed as the ratio LAI h c -1 ), and on the ratio between measurement height (z) and h c , respectively. All these factors affect the degree of mixing of the scalars detected by the EC sensors. With a high and sparse canopy and a low z h c -1 , we hypothesize a larger dissimilarity between scalar fluctuations and a more precise partitioning result of both source partitioning approaches. To summarize, goals of this study are: 15 -The comparison and evaluation of the partitioning results obtained with the approaches after Scanlon and Kustas (2010) and after Thomas et al. (2008) for various ecosystems, and testing slight modifications of the approaches -An analysis of the two approaches with respect to their dependence on their underlying assumptions -The description of the interrelations between performance of the partitioning methods, turbulence characteristics, and site characteristics (such as canopy type, h c , z h c -1 , LAI, and LAI h c -1 ) 20 -The identification of characteristics of a data set (i.e. of study site and period properties), which lead to a satisfactory performance of the partitioning methods, if such characteristics exist.

Source Partitioning after Scanlon and Kustas (2010) -SK10
To estimate the contributions of transpiration (T), evaporation (E), photosynthesis as net primary production (NPP), and soil 25 respiration (R soil, autotrophic and heterotrophic sources) to the measured net fluxes, Scanlon and Sahu (2008) and Scanlon and Kustas (2010) proposed a source partitioning method using high frequency time series from a typical EC station. This method (SK10 in the following) is based on the spatial separation and relative strength of sinks and sources of water vapor and CO 2 below the canopy (source of both water vapor and CO 2 ), in the canopy (source of water vapor and sink of CO 2 during daylight), and the atmosphere. Assuming that air from those sinks and sources is not yet perfectly mixed before 30 reaching the EC sensors, partitioning is estimated based on the separate application of the flux-variance similarity theory to the stomatal and non-stomatal components of the scalars, as well as an estimation of canopy water use efficiency (WUE).
The slope of the relationship between water vapor fluctuations (q') and CO 2 fluctuations (c') originating from stomatal and non-stomatal processes usually differs from the WUE at the leaf level and the correlation between the two scalars (ρ q'c' ) usually deviates from -1 during daytime. This deviation of the slope of the q' versus c' relationship from WUE at leaf-level and the reduction of correlation are used to estimate the composition of the measured fluxes (Scanlon and Kustas, 2010;Scanlon and Sahu, 2008). For a detailed analytical description of SK10 see Scanlon and Albertson (2001), Scanlon and Sahu 5 (2008), Kustas (2010, 2012), and Palatella et al. (2014). Furthermore, Skaggs et al. (2018) implemented SK10 in the open source Python 3 module Fluxpart. In the present study, SK10 was applied to high frequency EC data and the flux components were estimated using the implementation of SK10 as described by Klosterhalfen et al. (2019).
As mentioned before, the WUE at the leaf level has to be estimated for the application of SK10. WUE at the leaf level describes the relation between the amount of CO 2 uptake through stomata (photosynthesis) and the corresponding amount of 10 H 2 O loss (T). One way to derive WUE (without additional measurements at leaf-level) is to relate the difference in mean CO 2 concentration between air outside and inside the leaf to the difference in mean water vapor concentration between air outside and inside the leaf including a factor that accounts for the difference in diffusion rate between H 2 O and CO 2 through the stomatal aperture (Campbell and Norman, 1998;Scanlon and Sahu, 2008). The mean H 2 O and CO 2 concentrations just outside the leaf can be inferred from EC measurements by considering logarithmic mean concentration profiles 15 implementing the Monin-Obukhov similarity theory (MOST; Kustas, 2010, 2012;Scanlon and Sahu, 2008). For the internal CO 2 concentration, a constant value of 270 or 130 ppm was presumed, typical for C 3 or C 4 plants, respectively (Campbell and Norman, 1998;Špunda et al., 2005;Williams et al., 1996;Xue et al., 2004). Values for the internal water vapor concentration were estimated based on 100% relative humidity at foliage temperature. Measurements of foliage temperature were not available at the study sites, so for the source partitioning foliage temperature was set equal to measured 20 air temperature (WUE meanT ; Scanlon and Sahu, 2008). Additionally, to investigate the sensitivity of WUE, foliage temperature was also derived by means of measured outgoing longwave radiation (WUE OLR ; with a surface emissivity of 0.98), or calculated similar to the external concentrations by considering a mean profile based on MOST (WUE MOST ). Thus, three different approaches of SK10 with differing inputs for WUE were applied to all study sites. Thomas et al. (2008) presented a new method (TH08 in the following) to estimate daytime sub-canopy respiration of forests directly from EC raw data by conditional sampling. In an analogous way, evaporation can be quantified by exchanging c' with q' in the equations given by Thomas et al. (2008Thomas et al. ( , equations 1-11, pages 1212Thomas et al. ( -1215. The method assumes that occasionally air parcels moving upward (vertical wind fluctuations w' > 0) carry unaltered H 2 O/CO 2 concentration combinations of the sub-canopy. Looking at the fluctuations q' and c', both normalized with the corresponding standard 30 deviation, respiration/evaporation signals should occur within the part of the joint probability distribution where w', q' and c' are positive, i.e. in the first quadrant in the q'-c' plane (where q' > 0 and c' > 0). Additionally, Thomas et al. (2008) introduced a hyperbolic threshold criterion within quadrant 1, in order to only sample all data points above this hyperbola. Thomas et al. (2008) found realistic respiration estimates with a hyperbolic threshold of 0.25, which was also applied here.

Source Partitioning after Thomas et al. (2008) -TH08 25
Subsequently, daytime evaporation and respiration can be determined from the conditionally sampled w', q', and c' time series within quadrant 1 (Q1) or using the hyperbola threshold criterion (H). For the determination of the turbulent H 2 O and CO 2 fluxes either the covariance between w' and the corresponding scalar (CV) can be used, or the relaxed eddy accumulation (REA) technique (Businger and Oncley, 1990) using the coefficient β as described in equation 4, page 1213 5 and statements on page 1215 in Thomas et al. (2008). Hence, Thomas et al. (2008) applied four different approaches to quantify the respiration/evaporation events, combining the two conditional sampling criteria (Q1 or H) and the two calculation strategies (CV or REA technique).
For some averaging periods in our data, a potential respiration/evaporation 'cloud' was evident but did not occur (completely) within quadrant 1 (Fig. 1). As a modification of the conditional sampling strategy and a more tolerant detection 10 of respiration/evaporation events, a distribution-based cluster analysis was conducted (fifth approach, GMM). With the Gaussian Mixture Model (Canty, 2010) using the Expectation-Maximization Algorithm, two clusters were defined for each averaging period: the respiration/evaporation 'cloud' and all further points associated with T and photosynthesis independent of the sign of w'. Soil surface fluxes were calculated by CV from data in the respiration/evaporation 'cloud', where the deviations from the averages of all sampled cluster data points (instead of all data points) were used for q' and c' (w' kept 15 unchanged). Because the sampled respiration/evaporation 'cloud' by GMM would not always lie within quadrant 1 (often in quadrant 1 and 4, or in 1 and 2), and q' and/or c' of the defined 'cloud' could correlate negatively with w', the corresponding surface flux would often be negative (Fig. 1). If this was the case for H 2 O and/or CO 2 soil fluxes, the corresponding flux was recalculated considering the deviations from the averages of all data points for w', q', and c', and only including data points within quadrant 1 of the original q'-c' plane and with w' > 0. This recalculated flux represented only a minimal fraction of 20 the corresponding flux component in the considered averaging period. Also, as a result of this procedure the number of data points could differ between H 2 O and CO 2 for TH08 CV GMM depending on the calculation step used.

Study Sites and Data Processing
For the application and evaluation of the source partitioning methods, various study sites in a number of countries with differing cover types, canopy densities (represented by LAI), and measurement heights were chosen (Table 1). Almost all 25 study sites are part of the FLUXNET network (Baldocchi et al., 2001). Detailed site and measurement descriptions can be found in the listed references. Besides coniferous and deciduous forests with closed canopy cover, grasslands, and croplands, some sites represent special canopy types: in Forest_SC (for site abbreviations see Table 1) EC measurements have been conducted above a Mediterranean oak savanna (dehesa; Andreu et al., 2018); in Wüstebach an area of about 9 ha was deforested in 2013 and so measurements were obtained above the still present spruce forest (Forest_WU) and the deforested 30 area (Grass_WU) (Graf et al., 2014;Wiekenkamp et al., 2016), where grass, shrubs, and young deciduous trees have been regrowing swiftly; and in Forest_LA a coniferous forest has been regrowing gradually after a non-cleared windthrow in 2007 (Matiu et al., 2017). These three study sites represent the most heterogeneous landcover types in this study.
For each study site, measurements from days with a high-productive state of the vegetation and fair-weather conditions were selected to exclude factors interfering with the performance of the partitioning method. Time periods with precipitation events were excluded. Furthermore, the quality assessment scheme after Mauder et al. (2013) was applied to each data set and source partitioning was only conducted for time periods with the highest or intermediate quality flag levels assigned by this scheme. We only considered partitioning results of daytime data, because both methods require the presence of 5 photosynthesis. Here, daytime was determined by calculating sunrise and sunset times by means of local time. Additionally, the TH08 method was only applied to time periods with a negative ρ q'c' , and if less than 1% of the total data points in one half-hour time period were sampled as the respiration/evaporation 'event', the partitioning result was disregarded.
The high frequency H 2 O and CO 2 time series of all study sites were pre-processed and prepared for the application of the source partitioning approaches as described by Klosterhalfen et al. (2019). For each study site, physically impossible values 10 and spikes were excluded in the high frequency EC data of vertical wind, total H 2 O and CO 2 concentrations. The time delay was corrected, missing raw data within a half-hour period were gap-filled by linear interpolation, and a planar-fit rotation was conducted, where the rotation matrix was calculated for only a maximum time period of two weeks. Further, the EC data was corrected for density fluctuations after Detto and Katul (2007). Then, the source partitioning approaches were applied to half-hourly time series of these pre-processed high frequency data, partitioning fractions (E/ET or R soil /NEE, respectively) 15 were calculated, and applied to the post-processed half-hourly EC data.

Evaluation of Source Partitioning Results
The evaluation of the source partitioning performance was conducted in multiple ways for the various study sites depending on data-availability. At some study sites, R soil was measured additionally with closed-chamber measurements independently of the EC measurements. In Grass_RO and the cropland in Selhausen (Wheat_SE, Barley_SE, Intercrop_SE, 20 SugarBeet_SE), continuous measurements of multiple longterm-chambers were available for the considered time periods (half-hourly at Selhausen and hourly interpolated to half-hourly at Grass_RO). In Maize_DI, Forest_WU, and Grass_WU, R soil was measured with survey-chambers at several measurement points on one day during the considered time periods, so spatial and temporal averages for the hours in question could be calculated. For all study sites, soil evaporation (E soil ) was estimated as a fraction of measured ET based on Beer's law depending on LAI (E soil = ET exp(-0.6 LAI); Campbell and 25 Norman, 1998;Denmead et al., 1996). Thus, the root mean square error (RMSE) and the bias could be calculated between the partitioning results for E or R soil and the estimated E soil or chamber measurements, respectively. RMSE was sensitive to bias and outliers, and the distribution of errors was skewed. The positive outliers/errors (overestimations) were larger than negative errors (underestimations). An overestimation of the flux component magnitude may result in a larger RMSE than an underestimation. Therefore, we also calculated a version of the RMSE based on log-transformed data (RMSE ln ; data 30 transformed with ln(x +1)) before computing differences between estimated and reference E or R soil . Furthermore, one has to keep in mind that the measurements of R soil and LAI can also contain errors and that E soil is only a rough model approximation and can only give an order of magnitude to expect.
In addition, partitioned CO 2 fluxes were evaluated in reference to results of the established partitioning approach after Reichstein et al. (2005), if available; even though this approach targets other flux components (total ecosystem respiration TER and gross primary production GPP). For Forest_MMP and Forest_WA, results of this partitioning approach were not available, thus, we chose for these sites maximal margins for GPP and TER based on partitioning results of previous years and experience. For all sites, the estimated NPP and R soil for every time step were classified as reasonable if their magnitudes 5 were smaller than the determined GPP or TER, respectively. Since all data sets were from the main growing season and for weather conditions favorable to high respiration, we assumed that R soil should additionally be larger than 1 μmol m -2 s -1 . In the following, NPP and R soil estimates meeting these criteria ("hits in range") will be counted as HiR GPP (magnitude of NPP smaller than magnitude of GPP) and HiR TER (R soil smaller than TER and larger than 1 μmol m -2 s -1 ). We calculated the percent fraction of HiR GPP and HiR TER in relation to the count of time steps with valid partitioning solutions. Within 10 this evaluation step two source partitioning approaches (approach after Reichstein et al., 2005 versus SK10 or TH08) were examined and compared including their different assumptions and uncertainties, and the results have to be handled with care.
An evaluation of the estimated flux magnitudes was also possible for some study sites by means of prior publications.

Analysis of Source Partitioning Approaches
To compare the strengths and limitations of SK10 and TH08 and to gain a better insight in their functionality and 15 dependencies on turbulence and site characteristics, a correlation analysis was conducted between HiR GPP or HiR TER and the variables z, h c , z h c -1 , LAI, or LAI h c -1 . Here, we have chosen HiR GPP and HiR TER as the criteria of partitioning performance, because these could be calculated for all considered study sites, unlike the error metrics (RMSE, bias, etc.) regarding R soil . Different subsets of sites were considered for the calculation of the correlations: all study sites, only forest sites, or only cropland and grassland sites. 20 SK10 was already thoroughly analyzed by means of synthetic high frequency data derived by LES (Klosterhalfen et al., 2019). To obtain a better understanding of the strengths and limitations of TH08, we constructed a conceptual model to generate simple, synthetic data sets of w', q', and c' (with sample sizes of N = 100) with different degrees of mixing between scalar sinks and sources from the soil, canopy, and boundary layer (Fig. 7, upper panels). We considered no mixing, complete mixing, and partial mixing between scalars originating from soil and canopy (with positive w'). For all three sets, 25 mixing with scalars originating from the boundary layer (with negative w') was excluded. Averages of fluctuations were all specified as zero, and each scalar sink/source strength was determined such that the net H 2 O flux equals to 1 mmol m -2 s -1 and the net CO 2 flux to -1 μmol m -2 s -1 . To each generated data point of w', q' and c' a random number, sampled from a standard normal distribution and rescaled to a standard deviation of 5% of the magnitude of the variable, was added to simulate additional sources of variance not related to the degree of mixing. TH08 was applied to these synthetic data sets and 30 could be validated with the true known partitioning fractions

Results and Discussion
For each study site, the number of half-hourly time steps during daylight per considered time period is shown in Table A1 in the Appendix. Also, the fraction of daylight time steps of high-quality (HQ) data which were used in the application of SK10 and TH08 are shown, where for SK10 only a good or intermediate quality flag (after Mauder et al., 2013) and no precipitation were required, and for TH08 additionally a negative ρ q'c' . Furthermore, the fraction of these HQ-time steps, for 5 which partitioning solutions were found, is shown for each method version. Thus, from the original data, only a part remained for the partitioning, and for only a part of the remaining data a partitioning result could be obtained.

Flux Components Magnitudes
In the following, figures are shown for some selected sites, which were deemed most representative for all study sites, and/or 10 for some selected method versions of SK10 and TH08, which usually exhibited the best partitioning performance. In Fig (deviation from the mean by more than ten times of the standard deviation) were excluded. Figure 6 shows the error quantities, RMSE ln and bias relative to R soil chamber measurements, HiR GPP, HiR TER, and E soil estimation, for each site and method version. In all figures, timestamps are in local time.
In general, the partitioned CO 2 fluxes showed a higher variability and more spikes than the partitioned H 2 O fluxes for all sites (e.g., at Forest_HH, Fig. S2 in Supplementary material). Furthermore, SK10 and TH08 gave differing results for each 25 study site and performed disparately between method versions. In Fig. 2-5, it is apparent that TH08 mostly resulted in lower magnitudes of the flux components originating from the soil surface or sub-canopy than SK10. The source partitioning results of Forest_LO (Fig. 2, 4,5) were an exception to this rule. For this study site, the partitioning fractions of SK10 and TH08 were very similar and thus suggest a low uncertainty of the results. For the other study sites, larger discrepancies were observed between SK10 and TH08. Furthermore, the partitioning fractions E/ET and NPP/NEE varied much less between 30 sites for TH08 than for SK10 ( Results of the TH08 approach and estimated E soil imply a relatively larger fraction of T. At Forest_SC, the results of the different source partitioning methods were impacted by water stress. For a very dry period in August 2016, both partitioning 15 approaches were not applicable, because transpiration and photosynthesis almost ceased due to water stress, and the correlations between H 2 O and CO 2 fluxes were almost always positive (not shown). In April 2017, partitioning results were obtained showing an increase in R soil estimated with SK10 and a decrease in estimated E (Fig. S7 in Supplementary material). Spring 2017 was considered as relatively dry in this region, and the last precipitation event was five days before the respective time period, so that it can be assumed that water stress increased steadily in April 2017. No 20 respiration/evaporation events were apparent in the q'-c' planes, which could be caused by the sub-canopy in the oak savanna, thus, TH08 probably underestimated soil fluxes substantially.
In Grass_RO the continuous chamber measurements of R soil and TER estimated with the approach after Reichstein et al. (2005) did not agree well. TER decreased steadily over the seven days (this could also be observed for Grass_FE) and was mostly lower than measured R soil (Fig. S8 in Supplementary material). In comparison to measured R soil , SK10 still 25 overestimated and TH08 underestimated R soil fluxes. For Forest_WU and Grass_WU, TH08 yielded results matching comparatively well with the modeled estimate E soil and the gap-filling approach after Reichstein et al. (2005) (Fig. S3, S9 in Supplementary material). As mentioned before, Grass_WU is a very heterogeneous site with regrowing vegetation of grasses, shrubs, and trees on dry and wet areas. Thus, the measured signals might display fluxes originating from different sinks and sources distributed horizontally rather than vertically. The present variety of plant types increased the uncertainty 30 in the estimation of WUE. Usage of WUE OLR improved the partitioning by SK10 significantly, but could not avoid overestimation of R soil (in reference to chamber measurements and TER). For Forest_LA, we observed a behavior similar to Grass_WU (Fig. S5 in Supplementary material). Here, the forest is also regrowing, but spruce trees are already more abundant and larger.
For Maize_DI in 2007, Jans et al. (2010 reported a mean R soil flux of 3.16 μmol m -2 s -1 and a peak R soil of 23 μmol m -2 s -1 . R soil estimates by SK10 were often as large as this peak, but the maximum observed by Jans et al. (2010) was triggered by precipitation, which does not apply in our case (Fig. S11 in Supplementary material). The partitioning results for the cropland in Selhausen (Wheat_SE, Barley_SE, Intercrop_SE, SugarBeet_SE) showed large differences between crops and were more robust for H 2 O fluxes than CO 2 fluxes. 5 Figure 6 shows the error metrics RMSE ln and bias relative to chamber measurements of R soil , HiR GPP, HiR TER, and RMSE ln and bias relative to E soil estimation, for each site and method version. A clear pattern in the performance of the source partitioning depending on method version or on study site characteristics could not be identified in the error metrics ( Fig. 6). However, the following general statements can be made: 10

Error Metrics
1) The RMSE in R soil was usually larger for SK10 than for TH08 (not shown). Considering RMSE ln in R soil , SK10 performed better at forest sites than TH08, and slightly worse at crop-and grasslands (Fig. 6a). The bias in R soil was always positive for SK10 (except for Forest_WU) and often negative for TH08 (except for TH08 REA H; Fig. 6b); SK10 has the tendency to overestimate and TH08 to underestimate R soil compared to respiration chamber measurements. The lowest RMSE, RMSE ln , and bias were found for the SK10 method versions in Forest_WU and for TH08 in Forest_WU, Grass_WU, and 15 SugarBeet_SE_09.
2) When using the gap-filling model after Reichstein et al. (2005) as a reference, high HiR GPP were relatively frequent for TH08, with a minimum of 66.7% for SugarBeet_SE_06, while HiR GPP for SK10 were considerably lower (Fig. 6c). For HiR TER, such a clear difference in performance could not be observed (Fig. 6d). While SK10 mostly overestimated TER, TH08 often estimated soil fluxes smaller than the minimum R soil threshold of 1 μmol m -2 s -1 . TH08 REA H usually gave the 20 best results for HiR TER and the worst for HiR GPP within the method versions of TH08. Also, the performance of SK10 improved for CO 2 in Maize_DI with increasing crop height and lower LAI (Fig. 4, 6).
3) The RMSE (not shown), RMSE ln , and bias of E (in reference to E soil estimated using Beer's law) were mostly similar or slightly larger for SK10 than for TH08 except for the low crop canopies, Forest_LO, Forest_MMP, and Forest_SC ( Fig. 6e, f). These sites also had a relatively low LAI. The error metrics were low in Forest_WU and Grass_WU for SK10 25 and TH08. The worst performance regarding E could be found in Forest_HH for SK10, and in Forest_SC, Maize_DI_06, and Intercrop_SE for TH08. The bias indicated that SK10 underestimated E for all canopies with a LAI lower than 2.3 (Forest_LO, Forest_SC, Maize_DI_06, SugarBeet_SE_06, Intercrop_SE, the latter three have relatively short canopies). This could also be explained by the larger E soil estimates based on Beer's law due to the smaller LAIs, thus preventing an overestimation by SK10. 30 To summarize, for TH08 the calculation of the fluxes via REA yielded larger fluxes than via CV (Fig. 2, 3, 5). Because averaging in the flux calculation is performed differently for CV and REA (i.e. equations 1, page 1212 and equation 8, page 1214 in Thomas et al., 2008), and fewer data points are sampled with the hyperbolic threshold than using data from the entire Q1, the largest magnitudes were obtained by using REA with the hyperbolic threshold (REA H). In some time steps, no respiration/evaporation 'cloud' was apparent in the q'-c' plane, thus, the applied conditional sampling strategies were not as effective as intended, and an assessment of a correct sampling was not possible. Using GPP and TER estimated with the gapfilling model after Reichstein et al. (2005) as reference, components estimated by TH08 almost always were within this prescribed range (i.e. magnitude of NPP smaller than magnitude of GPP, and R soil smaller than TER) because of their small 5 resulting fluxes, whereby R soil was often below the assumed minimum threshold of 1 μmol m -2 s -1 ; thus, we assume these values to be underestimated (Fig. 6, S1-S13 in Supplementary material). Regarding the error metrics in Fig. 6, TH08 REA H, among all TH08 method versions, yielded the best result for the largest number of sites and error metrics. Partitioning results obtained by TH08 CV GMM were not systematically different from the other method versions, but showed no extreme spikes in the soil flux components. 10 The SK10 approach had the tendency to produce very high values of the soil flux components. Considering the diurnal dynamics and averages (Fig. 3-5), results of SK10 were satisfactory, but still relatively large. For most of the study sites, the magnitudes and variability in the half-hourly results of the soil flux components were decreased by using WUE MOST

Analysis by Means of Correlation Analysis
We studied the interrelations between partitioning performance (expressed in HiR GPP and HiR TER) and site 20 characteristics such as canopy height h c , LAI, canopy density (using LAI h c -1 as proxy), measurement height z, and the position of the measurements relative to the roughness sublayer (using z h c -1 as a proxy) by means of a correlation analysis (Tables 2, 3). Here, h c represents the vertical separation of sinks and sources of passive scalars between canopy top and soil surface. For the chosen study sites, LAI correlated with h c when considering a specific ecosystem type (forest, cropland, or grassland). Thus, LAI h c -1 was also considered to distinguish between their impacts on partitioning performance. The 25 ecosystem type "cropland" included only two different sites, Maize_DI and Selhausen (Wheat_SE, Barley_SE, Intercrop_SE, SugarBeet_SE), and thus only two different measurement heights z, but a total of nine data sets resulting from the considered time periods and various crops (Table 1). Therefore, the correlation coefficients with z including this ecosystem type have to be handled with care. All these site characteristics contain some information about the characteristics of the observed turbulence and also affect the degree of mixing of the scalars when they reach the EC sensor. Correlation coefficients between partitioning performance and site characteristics were calculated for all sites together, for forests only, or for crop-and grasslands only, respectively (Tables 2, 3). For the SK10 method versions, the correlation coefficients showed similar relations between variables and partitioning results for both HiR GPP and HiR TER, because 5 SK10 had the tendency to overestimate both NPP and R soil . For the TH08 method versions, relations slightly differ between HiR GPP and HiR TER, because TH08 had the tendency to underestimate R soil fluxes (< 1 μmol m -2 s -1 ), thus HiR TER were smaller than HiR GPP. For the forest sites, the correlations were relatively high between variables and partitioning performance, even though mostly not significantly different from zero.
The performance of all SK10 method versions correlated negatively with LAI h c -1 and z h c -1 , and positively with h c and z, 10 where the correlation with z h c -1 was often significant. The correlation coefficients regarding LAI, despite being also positive, were the smallest. Therefore, partitioning performance of SK10 was mostly enhanced with a sparse canopy and measurements obtained close to the canopy (close to or within roughness sublayer). For the TH08 method versions, LAI had larger effects on partitioning performance than for SK10 method versions, and h c , z h c -1 , and LAI h c -1 had smaller effects.
Correlation coefficients of LAI and LAI h c -1 were mostly positive with a few exceptions (e.g., regarding HiR TER for crop-15 and grasslands). For the TH08 method versions, all site characteristics correlated positively with HiR GPP, except for z h c -1 considering all study sites. The correlations between site characteristics and HiR TER were weak while considering all study sites. For forest sites, HiR TER correlated negatively with LAI h c -1 and z h c -1 and positively with h c , LAI, and z. For crop-and grasslands, similar results were obtained, except the negative correlation between HiR TER and LAI. Also, the correlations with h c and z increased in significance. Apparently, a dense canopy yielded too low sub-canopy fluxes derived 20 by TH08, but more reasonable canopy fluxes.
The variable LAI mostly correlated positively with partitioning performance for TH08 method versions and very weak with partitioning performance for SK10 method versions, which contradicted our initial hypotheses. Also, the correlation between partitioning performance by TH08 and LAI h c -1 at forest sites contradicted our assumption that a higher plant density would have a strong negative effect. Next to canopy density, LAI could also be connected to larger sinks and sources of canopy 25 fluxes (T and photosynthesis) relative to soil surface fluxes due to larger biomass, and to the appearance and frequency of coherent structures. A dense canopy prevents frequent ejections of air parcels from the sub-canopy, but provokes higher scalar concentrations in such air parcels because of a longer accumulation under the canopy. Respiration/evaporation events could occur less frequently but be of higher magnitude. Also, small gaps in an otherwise dense canopy can play an important role regarding ejection events. Thus, how canopy density affects scalar-scalar-correlation measured above the canopy (and 30 associated with that the partitioning performance), cannot be easily assessed. In this study, canopy density (LAI and LAI h c -1 ) and partitioning performance (especially regarding HiR TER) correlated negatively at crop-and grassland sites and mostly positively at the forest sites for TH08. Assuming gaps in the canopy can be more frequent in forests than in crop-or grasslands, these results support the above-mentioned aspects. Zeeman et al. (2013) found a clear connection between the appearance of coherent structures and the detection of respiration/evaporation events following the TH08 approach, where the best results were obtained for an open canopy (Forest_MMP). They found a temporal separation of 10-20 s between sub-, mid-, and above-canopy measurements. In order to assess to what extent these effects play a role in the current data sets, an estimate of the (large-scale) heterogeneity and density of the vegetation at all study sites (gap fraction, canopy openness) would be necessary, which is beyond the scope of this paper. 5

Analysis by Means of a Conceptual Model
SK10 was already thoroughly analyzed by means of the synthetic high frequency data derived by LES (Klosterhalfen et al., 2019). In the present study, TH08 was applied to various synthetic w'-, q'-, and c'-data sets including soil, canopy, and boundary layer scalar sink/sources derived by a simple conceptual model as described above (Fig. 7, top panel). Defined by the conditional sampling concept, we hypothesized that TH08 would work perfectly with no mixing of the scalars from the 10 three different origins, would not obtain any partitioning fractions in case of the complete mixing, and would underestimate the soil fluxes in case of partial mixing. TH08 behaved as hypothesized except for TH08 REA H (see below; Fig. 7, bottom panel). For the partial mixing, a small difference in TH08-derived partitioning fractions (especially for H 2 O) was observed between the sampling in Q1 and with H, because one data point was not sampled with the hyperbolic threshold, but was located within Q1. TH08 REA H did not 15 yield any partitioning results in case of no or partial mixing. This is due to the different definitions of β in the application of REA with the sampling in Q1 or with H ( Thomas et al., 2008, equation 4, page 1213 and statement on page 1215). β is an empirical constant and can be approximated by the ratio between the standard deviation of w' (σ w' ) and the difference between the mean vertical velocities in updrafts and downdrafts (w + ̅̅̅̅w -̅ ). For the conditional sampling approach within Q1, β is derived including all data points (disregarding the sign of q' or c'). For the approach including the hyperbolic threshold 20 criterion, β is derived from w' data points which satisfy the hyperbolic threshold criterion for positive q' and c'. In case of our conceptual model for the partial mixing, no data point with negative w' satisfied this criterion, so without w -̅ , β and a partitioning fraction could not be calculated. Figure 7 shows the partitioning fractions for TH08 REA H while applying β as calculated in TH08 REA Q1 (non-filled markers). TH08 CV GMM performed similar to the other method versions: it sampled the correct respiration/evaporation 'cloud' in case of no mixing and no 'cloud' in case of complete mixing. 25 However, in case of the partial mixing all data points with q' > 0 were sampled by TH08 CV GMM, thus, considering also the fraction originating from the canopy. For the latter, the covariances applying the averages of q or c of the sampled cluster, and considering only data points with w' > 0, were negative for H 2 O and CO 2 (not shown). Thus, E and R soil were recalculated with the covariance taking the deviations of the average of q or c considering all data points, and including only data points with w' > 0, within quadrant 1, and within the sampled cluster. This way of correcting the sampling by GMM 30 resulted in a similar partitioning fraction as the other method versions.

Summary and Conclusions
For all sites and all applied method versions, the partitioned CO 2 fluxes generally showed a higher variability and more spikes than the partitioned H 2 O fluxes. Mean diurnal cycles averaged over each site's specific time period yielded satisfactory results. The partitioning approaches after Scanlon and Kustas (2010;SK10) and after Thomas et al. (2008; TH08) gave differing results and performed disparately between method versions. TH08 mostly resulted in lower 5 magnitudes of the flux components originating from the soil surface than SK10. In addition, TH08 had the tendency to underestimate these flux components in reference to soil respiration flux measurements and estimates of E soil based on Beer's law. SK10 usually had the tendency to overestimate soil flux components and yielded larger error metrics (RMSE and bias).
The RMSE depends on the bias and the error distribution was asymmetric. The positive errors (overestimations) were larger than negative errors (underestimations). Decreasing the weight of outliers by log-transforming R soil data from chamber 10 observations and partitioning estimations revealed a lower RMSE ln for SK10 at forest sites than for TH08. SK10 was used with a variety of estimates of WUE. Estimating input WUE using foliage temperature derived from the observed outgoing longwave radiation often improved the partitioning performance. For TH08, various options were tested regarding the conditional sampling and flux calculation. Applying a Gaussian Mixture Model for the conditional sampling approach in TH08 did not improve partitioning performance significantly, because obtaining a positive and correct flux 15 estimation was difficult for data points outside quadrant 1 in the q'-c' plane. For TH08, conditional sampling including a hyperbolic threshold and calculating flux components based on the relaxed eddy accumulation technique yielded the best partitioning results.
The dependencies of the partitioning performance on turbulence and site characteristics were analyzed based on a correlation analysis and the application of TH08 to synthetic, conceptual data sets of scalar fluctuations. Foremost, the performance of 20 SK10 was improved for sparse canopies and especially with a low ratio between measurement height and canopy height. The performance of TH08 was more dependent on canopy height and leaf area index. Partitioning performance of TH08 improved with increasing canopy density for forests, whereas the opposite was observed for grass and crops. In general, site characteristics which increase dissimilarities between scalars (due to less mixing, large sink-source separation, coherent structures, ejections, etc.) appeared to enhance partitioning performance for SK10 and TH08. 25 For the forest site Loobos in The Netherlands, SK10 and TH08 obtained similar partitioning results and sufficient error metrics suggesting a low uncertainty. At this site with a relatively low leaf area index, high canopy, and low ratio between measurement and canopy height, conditions for both partitioning approaches seemed to be appropriate.

Appendix A
In Table A1 the number of half-hourly time steps during daylight per considered time period is shown for each study site. 30 Also, the fraction of daylight time steps of high-quality (HQ) which were used in the application of SK10 and TH08 are shown, where for SK10 only a good or intermediate quality flag (after Mauder et al., 2013) and no precipitation were required, and for TH08 additionally a negative ρ q'c' . Furthermore, the fraction of these HQ-time steps, for which partitioning solutions were found, is shown for each method version. With TH08 by sampling in the first quadrant (Q1) a partitioning result could be obtained for almost every time step (minimum of 98.2%). With the hyperbolic threshold criterion and with GMM fewer solutions could be found, because quite often the number of sampled data points was less than 1% of the total number in one half-hour time period. SK10 sometimes could not find a partitioning solution, when the measured and 5 estimated ρ q'c' were not equal and removing large-scale processes by Wavelet-transform could not help either to solve the system of equations. The most solutions were found for Forest_MMP and the least for Grass_RO, suggesting a dependence on vegetation height. For crop sites Maize_DI and SugarBeet_SE, the number of solutions with SK10 increased with development stage of the maize or sugar beet, respectively, while the ratio between measurement height and h c decreased. At the same sites the number of solutions for TH08 with hyperbolic threshold and GMM decreased (the conditional sampling in 10 Q1 was not affected). Generally, for the grasslands and the lower crop canopies more solutions were obtained with TH08 than SK10. An exception was the low intercrop in Selhausen (Intercrop_SE).

Supplement Link.
Competing interests. The authors declare that they have no conflict of interest.