Phosphorus Transport in Subsurface Flow at Beech Forest Stands: Does Phosphorus Mobilization Keep up with Transport?

Phosphorus (P) is a limiting factor of primary productivity in most forest ecosystems, but little is known about the retention of P within forests and its losses from them. Subsurface flow (SSF) is one of the important pathways of P export, but few attempts have been made to quantify it. We present results of sprinkling experiments with ca. 150 mm of ²H-labelled total rainfall, conducted on 200 m² plots on hillslopes with slopes between 14° and 28° at three beech forests in Germany in summer and spring. We aimed to quantify vertical and lateral SSF and the associated P transport in the forest floor, the mineral soil and the saprolite. The study sites differed in soil depth, skeleton content and soil P stocks (between 678 g/m² and 209 g/m² in the first 1 m of soil depth). Vertical SSF in the mineral soil and in the saprolite was at least two orders of magnitude larger than lateral SSF at the same depth. Vertical and lateral SSF consisted mainly of pre-event water that was displaced by sprinkling water (piston flow mechanism). Short spikes of event water at the beginning of the experiment at two of the sites with high skeleton content indicate that preferential flow occurred in parallel to matrix flow. We observed a significant decrease in P concentrations in SSF with increasing soil depth, suggesting effective retention of P by adsorption to soil particles in all three forest ecosystems. Higher P concentrations in SSF at the beginning of the experiments indicate nutrient flushing, but P concentrations were nearly constant thereafter despite a strong increase in SSF. P concentrations also did not change significantly with an increasing share of event water in SSF. These chemostatic transport conditions suggest that P mobilization rates were similar to transport rates at both P-rich and P-poor sites. The observed first flush effect implies that P export by SSF will increase, as rainfall events with high transport capacity are predicted to occur more frequently under future climatic conditions.


Introduction
Phosphorus (P) is a major component of plant nutrition and has been reported to be a potential limiting factor of primary productivity in forest ecosystems (Achat et al., 2016; Elser et al., 2000, 2007). In recent decades, a decrease in foliar P concentrations and an increase in the nitrogen to phosphorus (N:P) ratio have been observed in forests (Braun et al., 2010; Duquesnay et al., 2000; Jonard et al., 2015; Kabeya et al., 2017). One of the largest losses of P from forest ecosystems occurs via surface flow or subsurface flow (SSF) (Jardine et al., 1990; Kaiser et al., 2000; Missong et al., 2018b; Sohrt et al., 2018).
Changes in the elemental composition of SSF provide insight into the processes along the flow paths (e.g., dilution, enrichment, precipitation) and into whether P mobilization can keep up with P transport caused by SSF (Bol et al., 2016; Heathwaite and Dils, 2000; Julich et al., 2017b; Steegen et al., 2001). Soil-internal measurements of lateral and vertical SSF and the associated elemental concentrations are the prerequisite for characterizing ecosystems in terms of their overall capacity to retain available nutrients (Lang et al., 2017). Unlike for agricultural land, only very few studies exist that quantify the fluxes of P in forest environments, as elemental concentrations are low and measuring vertical and lateral SSF is challenging (Bol et al., 2016; Mayerhofer et al., 2017). In a small number of field studies, water was sampled from surface waters, i.e., forest streams (Cole and Rapp, 1981; Gottselig et al., 2017; Kunimatsu et al., 2001; Schindler and Nighswander, 1970; Tayor et al., 1971; Zhang et al., 2008), or from groundwater wells in riparian zones near the stream, which are more easily accessible than SSF (Carlyle and Hill, 2001; Fuchs et al., 2009; Vanek, 1993). However, the elemental composition of stream water represents an integrated signature of the entire catchment and is therefore not appropriate for detailed process identification within soil compartments. Stream water is also subject to in-stream retention and mobilization of P and is therefore not necessarily representative of transport conditions in the hillslopes (Gregory, 1978; Hill et al., 2010; Mulholland and Hill, 1997; Sohrt et al., 2019; Stelzer et al., 2003).
Therefore, soil solution below forest stands has been collected using suction lysimeters (Cole and Rapp, 1981; Compton and Cole, 1998; Kaiser et al., 2000; Qualls et al., 2002), but such samples alone do not allow P fluxes to be calculated. A very limited number of studies used a trench to measure fluxes and P concentrations in lateral SSF. Timmons et al. (1977) were among the first; they installed a 1.8 m long drain at the boundary between the A and B horizons at 33 cm soil depth to measure P concentrations and water fluxes in an aspen-birch forest in Minnesota. Sohrt et al. (2018) used 10 m wide trenches to measure and collect lateral subsurface flow from the forest floor, mineral soil and saprolite at three beech forest stands in Germany during natural rainfall events. Jackson et al. (2016) performed an artificial sprinkling experiment on a 200 m² plot on a hillslope covered by pine trees at the Savannah River Research Site (South Carolina) and measured water flux and P concentrations in lateral SSF at 1.25 m soil depth. They applied dye and conservative tracers to identify dominant flow paths and the fractions of event and pre-event water involved in solute transport. Makowski et al. (2020) were the first to measure vertical SSF, using zero-tension lysimeters with a size of 40 x 50 cm installed at 10 cm and 35 cm soil depth at two soil profiles of a beech forest in Saxony, Germany. The mineral soil of the Haplic Cambisol had an average P content between 194 and 221 mg/kg (Makowski et al., 2020). They performed artificial sprinkling experiments on 3 m² plots above the lysimeters with a sprinkling intensity of 20 mm/h over 4 hours. They reported significantly higher P concentrations in the sampled vertical SSF in the first 2 hours (P flushing), followed by a decrease and finally constant P concentrations for the rest of the experiment. Initial P concentrations as well as SSF and P fluxes were higher in the subsoil than in the topsoil.
They reported strong differences in water and P fluxes between the two soil profiles, which were likely caused by soil heterogeneity and can only be reduced by using larger lysimeters and sprinkling plots. From field studies as described above and from lab experiments using soil columns we know that transport of P during high-intensity rainfall events occurs mainly along preferential flow paths (Cox et al., 2000; Fuchs et al., 2009; Julich et al., 2017b; Missong et al., 2018a). Preferential flow paths allow subsurface flow to bypass the soil matrix, which otherwise has been shown to retain P effectively (Compton and Cole, 1998; Ilg et al., 2009; Johnson et al., 2016; Qualls et al., 2002). Biopores (e.g., earthworm burrows, root channels) have been identified as biochemical hotspots that can show significantly higher P concentrations than the soil matrix, especially in fine-textured soils (Backnäs et al., 2012; Hagedorn and Bundt, 2002). Preferential flow paths can extend below the rooting depth and are thus considered one important pathway of P losses from forest ecosystems (Julich et al., 2017a; Sohrt et al., 2017).
Here we present a comparative study based on hillslope-scale artificial sprinkling experiments on 200 m² plots at three beech (Fagus sylvatica) forest sites that differ in terms of their soil depth, skeleton content and SSF flow paths as well as their soil P stocks. We used an experimental setup that allowed us to measure soil-depth-specific lateral and vertical flow of water and transported hydrochemicals (incl. P). We performed two sprinkling experiments at each site to capture potential differences in P fluxes between summer/fall and spring. The rationale is that microbial activity, which is responsible for P mineralization, is strongly dependent on moisture and temperature conditions, and that litter fall is not evenly distributed across the year (Brinson, 1977; Kirschbaum, 1995). In particular, we address the following research questions:
1. What are the main runoff generation mechanisms, flow paths and temporal delays in runoff response (lateral versus vertical, shallow versus deep) in three forest stands with different soil properties (i.e., skeleton content and soil depth) during long, moderately intense sprinkling events?
2. What is the dynamic of P concentrations in lateral and vertical SSF, measured at different soil depths during artificial sprinkling events, and does it differ seasonally and among sites with different skeleton content and soil P availability?
3. Is P in the soil solution diluted during large sprinkling experiments or can P mobilization keep up with P transport (chemostatic conditions)?
https://doi.org/10.5194/bg-2020-118 Preprint. Discussion started: 20 April 2020 © Author(s) 2020. CC BY 4.0 License.

Study Sites
For this analysis, three beech forest sites in Germany with contrasting soil hydrological properties (i.e., skeleton content and soil depth) and P stocks were selected. Their site characteristics and the corresponding references are summarized in Tab. 1. Mitterfels (MIT) (48° 58' 32" N; 12° 52' 37" E) is located ca. 70 km east of Regensburg in the Bavarian Forest at 1023 m a.s.l. Its mean annual precipitation is 1299 mm. The site is characterized by a Hyperdystric chromic, folic Cambisol with a loamy topsoil (0-35 cm) and a sandy-loamy subsoil (35-130 cm). The stone content in the topsoil and subsoil is 23 % and 26 %, respectively, and the P stock in the upper 1 m of the soil profile is 678 g/m². The saprolite reaches a total depth of 7 m but is less weathered below 2 m depth. The parent material below the saprolite is paragneiss. Conventwald (CON) (48° 01' 16" N; 7° 57' 56" E) is located 20 km east of Freiburg in the Black Forest at 840 m a.s.l. and has a mean annual total rainfall of 1749 mm. The parent material, main soil type and vegetation are similar to those at MIT, but the soils differ considerably in skeleton content (CON: 87 % topsoil, 67 % subsoil), the depth of the saprolite (CON: 17 m, but less weathered below 3 m) and the P stock in the upper 1 m of the soil profile (CON: 231 g/m²). Tuttlingen (TUT) (47° 58' 42" N; 8° 44' 50" E) is located 125 km south of Stuttgart at 835 m a.s.l. and has 900 mm annual rainfall. Due to its carbonate parent material, a Rendzic Leptosol with a clayey topsoil and subsoil has developed that has lower soil P stocks (209 g/m²) than MIT and CON. The soil profile has a 20-40 cm deep topsoil and a 60-80 cm deep subsoil directly overlying the fractured carbonate parent material (Tab. 2); the stone content of the topsoil and subsoil is 50 % and 67 %, respectively. The site is also covered by beech (Fagus sylvatica), but the stand is younger than at MIT and CON (TUT: 100 years). Bulk densities at CON and TUT are more similar to each other than to that at MIT, but all three soil profiles show considerable variation in bulk density with depth (Fig. 1).

Experimental Setup and Lab-Analysis
At each of the three sites we delineated an experimental plot of 200 m² (10 by 20 m), which was separated from its uphill neighboring area by a plastic foil inserted into the soil profile. At the downhill side of the plots, a trench (TR) was dug down into the saprolite until refusal of a hydraulic shovel excavator, and drainage mats and drainage pipes were installed at three (MIT, CON) or two (TUT) depths (Fig. 2) to collect lateral flow. The actual depth of the pipes varied according to the site-specific soil profile (Tab. 2) but was chosen such that water draining from the forest floor (L, Of, Oh), the mineral soil (A and B horizon) and the saprolite (Cw) could be sampled. Plastic foil was installed across the entire 10 m width of the trench and down to the depth of the three soil compartments so that all water flowing laterally towards the trench was captured in the appropriate drainage pipe. To also measure vertical flow, we installed zero-tension lysimeters (LY), for which we used steel piling plates with dimensions of 1.0 by 0.6 m. To install them, a trench was dug at the side of the hillslope and the steel piling plates were pushed from the side into the undisturbed soil profile with heavy-duty hydraulic jacks. In this way, effects of excavation and refill on soil structure were prevented and the mixing of soil P stocks from different soil horizons was avoided. We installed the LY at depths similar to those of the TR. At MIT and CON we installed an additional LY right below the A horizon at 30 to 40 cm soil depth; at TUT the shallow soil depth did not allow a LY to be installed below 60 cm. In the following, the TR and LY are numbered TR1B to TR3B and LY1B to LY4B with increasing soil depth (Tab. 2). ("B" indicates the sprinkling plot and allows distinction from another dataset not used in this paper but mentioned in subsequent publications.) All trenches were backfilled after installation.
In the upper and lower half of the hillslope, volumetric water content and soil temperature were monitored at two soil profiles at 20, 40, 60, 80 and 120 cm soil depth (no 120 cm measurements at TUT due to the shallower soil). We used SMT100 sensors (Truebner GmbH) attached to CR1000 data loggers (Campbell Scientific) and recorded volumetric water content and soil temperature at 5 min intervals.
At each plot we performed two artificial sprinkling experiments: one in spring at the start of the growing season and one in late summer/early fall during or towards the end of the growing season but before leaf senescence. The two periods were chosen to reflect potential seasonal differences in soil moisture and P supply, i.e., higher soil moisture after snowmelt but less P mineralization due to colder temperatures in spring, versus drier soil conditions and advanced P mineralization towards the end of the growing season with warmer soil temperatures in summer/fall. The mean difference in the median volumetric water content over the 7 days preceding the experiment between summer and spring was 8 vol %, 3 vol % and 8 vol % for MIT, CON and TUT, respectively. We sprinkled the 200 m² plots with a mean rainfall intensity of 15 to 20 mm/h over 10 to 12 h. This represents a large rainfall event with a return period of more than 100 years for all sites (DWD, 2010). Nevertheless, the selected rainfall intensity allowed all irrigation water to infiltrate into the soil and did not generate surface runoff. 60,000 liters of water were trucked to the site and run through an industrial deionizer (VE-300 (6x50 Liter), AFT GmbH & Co.KG) to remove all minerals (especially P), which are also typically not found in high concentrations in natural rainfall. The deionized water had an electrical conductivity of less than 20 μS/cm. It was stored in a large water pillow (60,000 L, Sturm Feuerschutz GmbH) between 50 and 100 m above the sprinkling plot. The resulting hydrostatic gradient was sufficient to run the sprinkling without a pump. The 6 radial sprinklers (Xcel-wobbler and pressure regulator manufactured by Senninger), installed at a height of 2 m, sprinkled 60 % of the total water onto the 200 m² plot and 40 % outside of it to reduce boundary effects with the otherwise dry surrounding area. 1 kg (first sprinkling experiment) and 2 kg (second experiment) of 99.96 atom % deuterium were added while filling the pillow to elevate the natural background deuterium isotopic signature of the sprinkling water by ca. 100 per mille. Water samples were collected before and after adding the deuterium and during sprinkling (with totalizers on the experimental plot) to measure the background isotopic composition and to check for a constant isotopic label signature over the course of the experiment.
The subsurface flow (SSF) from the TR and LY was routed outside the hillslope via a pipe system to tipping buckets that recorded the flow volume of SSF at 5 min intervals. The pipe system had been flushed via access tubes the day before the experiments to guarantee its function and cleanness. Over the course of the sprinkling experiment (ca. 10 hours), the SSF of all TR and LY was sampled every 30 minutes into 100 ml brown glass bottles using automatic samplers (custom made by the Chair of Hydrology, University of Freiburg). The sampling was continued for 12 hours after the end of the sprinkling with a sampling interval of 2 h. The water samples were transported in cooling boxes to the lab directly after the experiment for subsequent hydrochemical and isotope analysis. To measure total phosphorus concentrations (Ptot), 50 ml of the sample was digested by adding 0.5 ml of 4.5 M sulfuric acid (H2SO4) and processed in an autoclave. Ptot was analyzed by the molybdenum blue photometric method based on DIN EN ISO 6878 (DIN, 2004) using a Unicam AquaMate photometer (Spectronic Unicam) with a 5 cm cuvette at 700 nm. We determined the limit of quantitation for Ptot (0.026 mg/l) and the limit of detection (0.013 mg/l) based on DIN 32645 with a significance bound of 99 % for the limit of quantitation and 77 % for the limit of detection (DIN, 2008). The remaining 50 ml of each sample was filtered with a 0.45 µm cellulose filter (PERFECT-FLOW, WICOM) and used for analysis of the stable water isotopes ¹⁸O and ²H using a Cavity Ring-Down L2130-i Isotopic Liquid Water Analyser (Picarro Inc.). Based on the background isotopic signature and the isotopic signature of the sprinkling water, event and pre-event water fractions were calculated using a two-end-member mass balance approach, also called two-component isotope hydrograph separation (Sklash and Farvolden, 1979).
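The two-end-member mass balance mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the analyses were run in R); the function name, the clipping to the interval 0 to 1 and all isotope values are assumptions chosen for illustration only.

```python
def event_water_fraction(delta_sample, delta_pre_event, delta_event):
    """Event water fraction from a two-end-member isotope mass balance.

    All arguments are isotope signatures (e.g., delta-2H in per mille).
    The result is clipped to [0, 1] to absorb analytical noise
    (the clipping is an assumption of this sketch).
    """
    span = delta_event - delta_pre_event
    if span == 0:
        raise ValueError("end members must have different signatures")
    fraction = (delta_sample - delta_pre_event) / span
    return min(1.0, max(0.0, fraction))


# Hypothetical values: background soil water and labelled sprinkling water
pre_event = -60.0   # per mille, assumed pre-event signature
event = 40.0        # per mille, assumed signature of the labelled water
samples = [-60.0, -50.0, -10.0, 40.0]

fractions = [event_water_fraction(d, pre_event, event) for d in samples]
pre_event_fractions = [1.0 - f for f in fractions]
```

The pre-event water fraction is simply the complement of the event water fraction, so a sample that carries the background signature yields an event water fraction of zero.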

Data Analysis
TR and LY volumetric measurements were scaled to 1 m² of plot area and expressed in mm/h, which allows direct comparison with the incoming sprinkling water. We determined the time lag between the start of the sprinkling experiment and the time when 20 % of the total rise in SSF had been reached or exceeded (called trise20 in SSF). In a similar way we determined the time lag between the start of the sprinkling event and the time when the event water fraction had reached 20 % (called trise20 in event water fraction). The threshold of 20 % was chosen as a clear indication of the first response in SSF and event water fraction. If trise20 in event water fraction is longer than trise20 in SSF, this indicates that flow celerity is faster than flow velocity. If trise20 in event water fraction is shorter than trise20 in SSF, this indicates preferential flow, as velocity is faster than celerity.
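A minimal Python sketch of how the two response times could be derived from a time series is given below. It assumes that the total rise in SSF is the difference between the maximum and the first value of the series, and that the event water fraction is compared against an absolute threshold of 0.2; the function names and the example data are hypothetical.

```python
def t_rise20(times_h, values, threshold=0.20, relative_to_rise=True):
    """First time at which a series reaches the 20 % threshold.

    relative_to_rise=True : threshold is a share of the total rise
                            (max minus first value), as used for SSF.
    relative_to_rise=False: threshold is an absolute value, as used
                            for the event water fraction.
    Returns None if the threshold is never reached.
    """
    if relative_to_rise:
        baseline = values[0]
        total_rise = max(values) - baseline
        if total_rise <= 0:
            return None
        target = baseline + threshold * total_rise
    else:
        target = threshold
    for t, v in zip(times_h, values):
        if v >= target:
            return t
    return None


def interpret(t_ssf, t_event):
    """Compare the two response times as described in the text."""
    if t_ssf is None or t_event is None:
        return "no response"
    if t_event > t_ssf:
        return "celerity faster than velocity"
    if t_event < t_ssf:
        return "preferential flow"
    return "simultaneous response"


# Hypothetical series: hours since start, SSF in mm/h, event water fraction
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ssf = [0.0, 0.0, 1.0, 4.0, 8.0, 10.0]
event_fraction = [0.0, 0.0, 0.05, 0.10, 0.25, 0.40]

t_ssf = t_rise20(times, ssf)
t_event = t_rise20(times, event_fraction, relative_to_rise=False)
result = interpret(t_ssf, t_event)
```

In this example the flow responds at 1.5 h and the event water at 2.0 h, which corresponds to the case in which the pressure response precedes the arrival of the labelled water.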
To investigate whether P mobilization was able to keep up with P transport, we plotted P concentrations as a function of SSF in log-log space. If this were the case, we would expect near-chemostatic conditions, i.e., P concentrations would not vary much with increasing or decreasing SSF, and the data points would plot parallel to the x-axis in log-log space. In the case of simple dilution, P concentrations would decrease proportionally with increasing SSF and the data points would be aligned along a 1:1 dilution line (slope of -1) in log-log space. As a further measure we calculated the ratio between the range in observed SSF values and the respective range in observed P concentrations. Under predominantly chemostatic conditions, the range in SSF would be much larger than the range in P concentrations.
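The two diagnostics described above can be sketched in Python as follows. The sketch assumes an ordinary least-squares fit in log10 space and expresses each range as the ratio of maximum to minimum; both choices, the function names and the example values are assumptions, since the text does not specify them.

```python
import math


def loglog_slope(flow, conc):
    """Ordinary least-squares slope of log10(conc) against log10(flow)."""
    x = [math.log10(q) for q in flow]
    y = [math.log10(c) for c in conc]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def range_ratio(flow, conc):
    """Range in flow divided by range in concentration (each as max/min)."""
    return (max(flow) / min(flow)) / (max(conc) / min(conc))


# Hypothetical data: SSF in mm/h, Ptot in mg/l
flow = [0.1, 1.0, 10.0]
chemostatic = [0.05, 0.05, 0.05]    # concentration independent of flow
diluted = [0.5, 0.05, 0.005]        # concentration proportional to 1/flow

slope_chemostatic = loglog_slope(flow, chemostatic)
slope_diluted = loglog_slope(flow, diluted)
ratio_chemostatic = range_ratio(flow, chemostatic)
ratio_diluted = range_ratio(flow, diluted)
```

A slope near zero together with a large range ratio points to chemostatic behavior, whereas a slope near -1 and a range ratio near one would be expected for simple dilution.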
In addition, we investigated the relation between Ptot concentrations and event water fractions and checked whether the slope of a linear regression through the data points was significantly different from zero. Using a Mann-Whitney test, we also tested whether this regression slope was significantly different from a slope describing dilution due to proportional mixing of pre-event water with our deionized sprinkling water. The slope describing simple dilution was determined by a regression with the best linear fit through the data points and the additional constraint of Ptot = 0 mg/l for an event water fraction of 1. All data analyses were performed in R 3.4.2 (R Developer Team).
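The difference between the unconstrained regression and the constrained dilution line can be sketched in Python. Under the constraint Ptot = 0 at an event water fraction of 1, the line has the form Ptot = a (1 - f), so its slope is -a and a follows from a one-parameter least-squares fit. This is only an illustration with hypothetical names and data; the significance tests used in the study are not reproduced here.

```python
def ols_slope(x, y):
    """Unconstrained ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def dilution_slope(event_fraction, ptot):
    """Slope of the best-fit line forced through Ptot = 0 at fraction = 1.

    Model: Ptot = a * (1 - f). Least squares gives
    a = sum(Ptot * (1 - f)) / sum((1 - f)**2); the slope is -a.
    """
    weights = [1.0 - f for f in event_fraction]
    a = sum(c * w for c, w in zip(ptot, weights)) / sum(w * w for w in weights)
    return -a


# Hypothetical samples: event water fraction and Ptot in mg/l
f = [0.0, 0.25, 0.5, 0.75]
ptot_mixing = [0.08, 0.06, 0.04, 0.02]       # proportional mixing
ptot_chemostatic = [0.05, 0.05, 0.05, 0.05]  # no change with event water

slope_mixing = ols_slope(f, ptot_mixing)
slope_constant = ols_slope(f, ptot_chemostatic)
slope_dilution_line = dilution_slope(f, ptot_mixing)
```

For data that follow proportional mixing, the unconstrained slope and the constrained dilution slope coincide; for chemostatic data, the unconstrained slope is close to zero while the dilution slope remains clearly negative.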

Lateral versus vertical SSF
In general, vertical SSF (measured by LY) dominated the total water flux during all sprinkling experiments. Depending on site and soil depth, between 89 and 99 % of total SSF percolated vertically to greater depth and only < 1 to 11 % of the sprinkling water flowed laterally towards the trench (Fig. 3). At all study sites LY1B yielded steady flow with a mean rate of 10 to 15 mm/h, which corresponds to the sprinkling rate. This confirms that the LY, even though positioned at the boundary of the hillslope, experienced rainfall intensities that are representative of the rest of the hillslope. The LY at greater soil depth showed a slower increase in SSF than LY1B below the forest floor but also reached a mean flow rate of 10 to 15 mm/h towards the end of the sprinkling experiment (except LY3B and LY4B at MIT in summer and spring and LY3B at TUT in summer) (Fig. 3). Lateral SSF (measured by TR) was at least two orders of magnitude lower than vertical SSF but increased steadily towards the end of the sprinkling experiments. An exception was TR2B at TUT, which reached a plateau (ca. 1 mm/h) during the spring experiment, most probably due to wetter antecedent conditions (see discussion section). Maximum SSF from TR2B at CON and TUT was 0.51 and 0.70 mm/h during the experiment in summer and 0.72 and 1.09 mm/h in spring. TR3B at MIT yielded neither vertical nor lateral SSF in any of the two sprinkling experiments. This is likely attributable to the lower skeleton content and higher storage capacity of the soil at MIT.

Event and pre-event water fractions
Vertical SSF in the topsoil, subsoil and saprolite (i.e., all LY except LY1B) at all sites and during all experiments was predominantly pre-event water (Fig. 3, Tab. 3). In contrast, the mean pre-event water fraction from the forest floor (i.e., LY1B) during the sprinkling events was low (i.e., MIT: 15 % and 12 %; CON: 16 % and 10 %; TUT: 4 % and 4 % in summer and spring, respectively). The mean pre-event water fractions of vertical SSF increased with depth at all sites and in all events, and already at 35 to 40 cm soil depth (LY2B) the mean pre-event water fractions during the events in summer and spring were 88 % and 83 % at MIT, 58 % and 60 % at CON, and 83 % and 64 % at TUT. The mean pre-event water fraction in the vertical SSF continued to increase with soil depth (see LY3B and LY4B in Tab. 3).
For the lateral SSF a similar increase in the pre-event water fraction with soil depth was observed. However, the mean pre-event water fractions of lateral SSF in the subsoil (i.e., TR2B) were typically smaller than the pre-event water fractions in vertical SSF at similar depth (i.e., LY3B). The mean pre-event water fractions of TR2B during the events in summer and spring were 33 % and 44 % at MIT, 55 % and 63 % at CON and 47 % and 70 % at TUT. The mean pre-event water fractions of LY3B during the events in summer and spring were 93 % and 95 % at MIT, 83 % and 63 % at CON and 70 % and 95 % at TUT. Mean pre-event water fractions of the lateral and vertical SSF in the saprolite (i.e., TR3B and LY4B) were similar (i.e., 74 % and 78 % for TR3B and 78 % and 86 % for LY4B at CON during the experiments in summer and spring).

Differences in SSF and response time between summer and spring
In general, SSF differed less between the sprinkling experiments in summer and spring than between the three sites. This was particularly true for the peak flows in SSF at all sites. An exception was TUT, where peak flows for LY2B and LY3B were higher in spring than in summer, which is likely due to pronounced differences in antecedent wetness between summer and spring at TUT (see details above).
SSF in summer and spring differed mainly with respect to the response time of the individual TR and LY. This was most pronounced at MIT and least at CON (Fig. 3, Tab. 4). At MIT the trise20 in SSF was on average 1.9 times longer in summer than in spring (except TR2B: 0.7 times shorter) (Tab. 4). At CON it was the opposite, i.e., the trise20 in SSF was on average 0.7 times shorter in summer than in spring, except for SSF in the saprolite (i.e., LY3B, LY4B, TR3B), for which trise20 in SSF was on average 1.3 times longer than in spring (Tab. 4). At TUT trise20 in the topsoil and subsoil was 1.8 times longer in summer than in spring, but trise20 in the forest floor (LY1B and TR1B) was 0.3 times shorter. At CON and TUT the trise20 did not always increase systematically with soil depth (e.g., LY3B responded earlier than LY2B at CON in spring and at TUT in summer; TR3B responded earlier than TR2B at CON in summer and spring). At MIT in summer LY4B did not respond until 1.5 h after the end of the sprinkling experiment, and TR3B did not respond at all in summer or spring.
The trise20 of the event water fraction (Tab. 4) was typically longer than the trise20 of SSF (Tab. 5). Only for TR2B and TR3B at CON in summer and spring and TR2B at TUT in summer was the trise20 of event water fraction shorter than the trise20 of SSF. For these TR, the differences between the trise20 of event water fraction and the trise20 of SSF were in the range of 3 hours to almost 4 hours for CON and 1 h for TUT. At MIT, SSF from LY3B and LY4B did not reach 20 % event water fraction during the experiments in summer and spring. The same was true for LY3B at TUT in spring.

Dynamics of P concentrations
Ptot concentrations of vertical SSF (LY) were up to one order of magnitude higher during the first 1 to 2 hours after the first response of each flow component than during the remaining time of the sprinkling experiment (Fig. 4a). This was particularly true for the vertical flow from the forest floor and the topsoil (LY1B and LY2B) but also apparent, albeit less than one order of magnitude, for the subsoil and the saprolite (LY3B, LY4B).
The Ptot concentrations of vertical flow from the forest floor were significantly higher than in the topsoil (at 30 to 40 cm soil depth), except for CON and TUT in summer (Fig. 4a). Similarly, the Ptot concentrations of vertical flow in the subsoil and saprolite were significantly lower than in the topsoil, except for MIT in summer and spring and TUT in summer. All Ptot concentrations were above the limit of quantitation (i.e., 0.009 mg/l).
The Ptot concentrations in the same vertical flow component were not significantly different between summer and spring, except for LY1B and LY3B at MIT; LY1B at CON; and LY1B, LY2B and LY3B at TUT. However, the Ptot concentrations of the same flow component (e.g., LY1B) at the P-rich site MIT were generally significantly higher than at the P-poorer sites CON and TUT, except for LY2B in spring. In a similar vein, Ptot concentrations at CON were significantly higher than at TUT, except for LY3B in summer.
Ptot concentrations in the lateral flows (TR) also showed a sharp decline during the first 1 to 2 hours after the onset of flow of each component, except for TR1B and TR2B at MIT in spring and TR2B at CON in spring (Fig. 4b). Ptot concentrations of TR2B at CON in summer showed a steady decline, and for TR3B the decline occurred with a longer delay (i.e., 5 h after the first response) than in the other experiments. At CON and TUT the Ptot concentrations in the lateral flow of the forest floor (TR1B) were significantly higher than in the subsoil (TR2B). In contrast, at MIT the Ptot concentrations in lateral flow of the forest floor (TR1B) were significantly lower than in the subsoil (TR2B), both in summer and spring (Fig. 4b). The difference in Ptot concentrations in the same flow component between summer and spring was not significant, except for TR1B at TUT.

P concentration as a function of instantaneous flow
The range in SSF was typically larger by factors, if not orders of magnitude, than the range in Ptot concentrations during the sprinkling experiments, except for those flow components that yielded little SSF in general (e.g., TR1B at CON during summer and spring, LY3B and TR2B at MIT during summer) (Fig. 5). Data points were in general aligned roughly parallel to the x-axis in the log-log plots of Ptot concentration versus SSF and not along the 1:1 line (Fig. 5). This suggests that transport in most subsurface flow components was chemostatic rather than dominated by dilution. However, we observed weak anti-clockwise hysteresis effects, i.e., median Ptot concentrations were higher on the rising limb than on the falling limb of the SSF hydrographs, except for TR1B at MIT in summer, LY4B at CON in spring and LY2B at TUT in summer. Most differences in Ptot concentrations between rising and falling limbs were, however, small (e.g., 0.007 mg/l, 0.028 mg/l and 0.081 mg/l for the 25 %, 50 % and 75 % quantiles of all differences), indicating that the hysteresis effect was small (Suppl. Tab. 1).
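The comparison of median concentrations on the rising and falling limbs can be sketched in Python as follows. The sketch assumes that the hydrograph has a single peak and that the peak sample is assigned to the rising limb; the function name and the data are hypothetical.

```python
from statistics import median


def limb_medians(flow, conc):
    """Median concentration on the rising and falling limb of a hydrograph.

    The sample at peak flow is counted as part of the rising limb
    (an assumption of this sketch).
    """
    peak_index = flow.index(max(flow))
    rising = conc[: peak_index + 1]
    falling = conc[peak_index + 1:]
    if not falling:
        raise ValueError("no samples after peak flow")
    return median(rising), median(falling)


# Hypothetical samples: SSF in mm/h and Ptot in mg/l, in time order
flow = [0.1, 0.5, 1.0, 0.6, 0.2]
conc = [0.09, 0.07, 0.06, 0.05, 0.05]

median_rising, median_falling = limb_medians(flow, conc)
difference = median_rising - median_falling
```

A positive difference means that concentrations were higher while flow was increasing than at comparable flow during the recession; its magnitude can then be compared with the overall concentration range to judge how strong the hysteresis is.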

P concentration as a function of event water fraction
Ptot concentrations also did not change significantly with increasing event water fraction (Fig. 6). Only in the forest floor (LY1B and TR1B) were the slopes of the linear regression lines fitted to the Ptot concentrations as a function of event water fraction significantly different from zero, except for LY1B at CON in summer and TR1B at MIT in spring (Tab. 6). The transport conditions in the lateral and vertical SSF in the forest floor were therefore predominantly non-chemostatic. In the mineral soil and saprolite, lateral and vertical SSF was predominantly chemostatic (except LY2B and LY4B at MIT in spring, LY3B at CON in summer, LY2B at CON in spring, LY2B and TR2B at TUT in summer and TR2B at TUT in spring) (Tab. 6). Most of these regression slopes were, however, close to zero.
For the regression slopes indicating non-chemostatic behavior, we tested whether they were not significantly different from a slope describing proportional mixing. In general this was not the case, except for some flow components from the forest floor (i.e., LY1B at MIT and TUT in summer, TR1B at CON in spring; see italic values in Tab. 6) and from the mineral soil (i.e., TR2B at MIT in summer, TR2B at TUT in summer and spring and LY3B at CON in summer). At least for the latter three, the regression slope was, however, close to zero. As a rough generalization, the regression slopes of most lateral and vertical SSF in the mineral soil and saprolite tended to be small or close to zero (the majority was chemostatic), whereas regression slopes of flow components from the forest floor were typically significantly different from zero (some indicating proportional mixing).

What are the main SSF paths during long, moderately intense sprinkling events?
In general, vertical SSF dominated total flow during all of our sprinkling experiments, and lateral SSF was at least two orders of magnitude lower than vertical SSF (Fig. 3). This finding implies that previous studies at trenched hillslopes at sites with well-drained soils and moderately permeable bedrock likely failed to measure and sample an important loss term of the water and nutrient balance (Jackson et al., 2016; Sohrt et al., 2018; Timmons et al., 1977). This is partly due to the different research foci of these studies but mainly attributable to the technical difficulty of measuring and sampling vertical SSF. Our large zero-tension lysimeters are a successful way to capture vertical flow. They yield more representative results than traditional small lysimeters of a few cm² or suction cups, which are more likely to be affected by soil heterogeneity. The steel piling plates that we pressed into the undisturbed soil profile from the side of the hillslope also preserve the natural soil profile with its soil texture, structure and horizon-specific P stocks. However, their installation requires considerable effort.
Vertical and lateral SSF in the mineral soil and saprolite at all sites and in all experiments was predominantly pre-event water (Fig. 3, Tab. 3). This is generally in agreement with Jackson et al. (2016), who performed a tracer experiment at the Savannah River Site (South Carolina) with a loamy sand topsoil overlying a sandy clay-loam subsoil. However, their maximum pre-event water fraction was 50 %, while it was typically higher in our study (mean pre-event water fraction in vertical and lateral SSF of the mineral soil was 83 % and 63 %, respectively; see also Tab. 3). Our findings suggest that SSF runoff generation was dominated by piston flow, i.e., incoming event water pushes pre-event water down into the soil profile and thereby initiates SSF. A peak in event water fraction at the beginning of the sprinkling experiments, a non-sequential onset of SSF with soil depth and a shorter trise20 of event water than trise20 of SSF in some lateral flow components at CON and TUT suggest, however, that preferential flow occurred in parallel to matrix flow at CON and TUT. The occurrence of preferential flow is important because it allows P to bypass the soil matrix in its soluble and colloidal form, and it is therefore considered a very prominent pathway of P loss from the ecosystem (Jardine et al., 1990; Kaiser et al., 2000; Missong et al., 2018b). The fact that we observed indications of preferential flow predominantly at CON and TUT but not at MIT, and mainly in lateral flow and less in vertical flow, may be explained by differences in soil properties, especially skeleton content and soil bulk density, of the three sites (Fig. 1). Swelling and shrinking due to the higher clay content at TUT might be another mechanism leading to preferential flowpaths. At MIT the lateral and vertical flow components in the saprolite did not respond, or responded with a strong time delay, to the sprinkling experiments (Fig. 3, Tab. 4), which suggests very efficient storage of the sprinkling water in the soil.
Therefore, at MIT, which is characterized by relatively low skeleton content but high soil storage capacity, the shallow flowpaths were most important and the deeper flowpaths yielded little or no flow. At CON the opposite was the case. The higher skeleton content allowed water to reach greater soil depths, and therefore deeper flowpaths yielded more flow than shallower ones. At TUT the clay-rich topsoil led to more lateral flow at shallow depth than at CON, but as the total soil depth was much smaller than at the two other sites, the storage capacity was lower and therefore the mineral soil also yielded significant amounts of flow.
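The event water fractions and trise20 values discussed above can be derived along the following lines. This is a minimal sketch that assumes a standard two-component mixing model based on the 2H signature of pre-event water and sprinkling water, and linear interpolation between samples to find the time at which the event water fraction first reaches 20 %. The function names, the clipping to the range 0 to 1 and all numbers are illustrative assumptions, not the procedure documented for Tab. 3 or Tab. 5.

```python
def event_water_fraction(delta_sample, delta_pre, delta_event):
    """Two-component mixing: share of event water in a sample (0 to 1)."""
    if delta_event == delta_pre:
        raise ValueError("end members must differ for the separation to work")
    fraction = (delta_sample - delta_pre) / (delta_event - delta_pre)
    # Clip values that fall outside the end members due to measurement noise
    return min(1.0, max(0.0, fraction))


def time_to_fraction(times_min, fractions, threshold=0.2):
    """First time (min) at which the fraction reaches the threshold, else None."""
    previous = None
    for t, f in zip(times_min, fractions):
        if f >= threshold:
            if previous is None:
                return float(t)  # threshold already met at the first sample
            t0, f0 = previous
            # Linear interpolation between the two bracketing samples
            return t0 + (threshold - f0) * (t - t0) / (f - f0)
        previous = (t, f)
    return None  # threshold never reached (an empty cell in a table)


# Hypothetical 2H values (per mil) and sampling times (minutes since start)
delta_pre, delta_event = -60.0, 40.0
times = [0, 30, 60, 90]
samples = [-60.0, -50.0, -30.0, -10.0]

fractions = [event_water_fraction(d, delta_pre, delta_event) for d in samples]
trise20 = time_to_fraction(times, fractions)
print(fractions, trise20)
```

The pre-event water fraction is simply one minus the event water fraction. Comparing the resulting time with the corresponding time to a 20 % rise in SSF would then indicate whether event water arrived before the bulk flow response.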

P concentration dynamics in vertical and lateral SSF
Median Ptot concentrations in vertical SSF from the forest floor were significantly higher than those from the mineral soil and saprolite (Fig. 4). This suggests that P losses from the forest floor were efficiently retained by adsorption in the mineral soil.
This was the case for all three sites regardless of natural soil P stocks. The rate at which adsorption occurred appeared to be faster than the flow rate. The Ptot concentrations in vertical and lateral SSF declined with increasing soil depth, except at MIT, where the median Ptot concentration in lateral SSF of the mineral soil was higher than that of the forest floor during both sprinkling experiments. This is explained by the difference in P stocks of the forest floor and mineral soil of the three sites.
While Ptot stocks in the forest floor at MIT are only 7 g/m2, they are almost twice as high at CON (13 g/m2) and almost three times as high at TUT (19 g/m2) (see Tab. 1). In contrast, the Ptot stocks in the mineral soil at MIT (624 g/m2) are almost three times as high as at CON (230 g/m2) and more than three times as high as at TUT (189 g/m2). The only other study on P concentrations in vertical SSF did not find a significant difference in Ptot concentrations of vertical SSF between 10 cm and 35 cm soil depth at a beech forest site in Saxony, Germany, with a Haplic Cambisol (Makowski et al., 2020).
While in our study P concentrations in SSF decreased with soil depth at all three sites, the Ptot concentrations of the same flow component at different sites were typically highest at the P-rich site (MIT) and lowest at the P-poorest site (TUT), except for lateral flow from the forest floor (TR1B) at MIT. This suggests that similar P leaching and retention mechanisms occur in soils at P-rich and P-poor sites but that the actual P concentrations in SSF differ between forest ecosystems with different P availability.
Ptot concentrations in vertical and lateral SSF at all soil depths were typically significantly higher during the first 1 to 2 hours after the first response of each flow component than during the remaining time of the sprinkling experiment (Fig. 4). This nutrient flushing effect (Hornberger et al., 1994) has also been described in other studies as a prominent feature of lateral export of nutrients, mainly N (DON) and C (DOC) (Qualls and Haines, 1991; van Verseveld et al., 2008; Weiler and McDonnell, 2006) but also P (DOP) (Burns et al., 1998; Missong et al., 2018a; Qualls et al., 2002; Sohrt et al., 2018). Makowski et al. (2020), who measured P concentrations in vertical SSF, also reported nutrient flushing in the first 2 hours of their sprinkling experiments.
In our study, Ptot concentrations after the nutrient flushing were relatively constant regardless of further increasing SSF. The change in Ptot concentrations was several factors smaller than the change in SSF (Fig. 5), and the regression slope of Ptot concentration versus SSF was not significantly different from zero for most flow components in the mineral soil (Tab. 6). This suggests that P transport in SSF in the mineral soil was chemostatic, which further suggests that P mobilization could keep up with P transport. This was different in the forest floor, as Ptot concentrations in SSF continued to decline at a slow rate after the flushing phase (Fig. 4). However, the slope of a regression line of Ptot concentration versus event water fraction of most flow components in the forest floor was significantly different from simple dilution, which would be the case if a given amount of P were leached and diluted by an increasing amount of event water in SSF (Fig. 6, Tab. 6). It is therefore likely that dilution happened at a lower rate than assumed by proportional mixing, i.e., the rate at which P was supplied by biogeochemical processes was equal to or slightly lower than the transport capacity. This would also fit our other observation that the median Ptot concentration was higher on the rising than on the falling limb of all flow components (Fig. 5, Suppl. Tab. 1).
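The comparison between the change in Ptot concentration and the change in SSF can be made explicit with a simple index. The sketch below uses the ratio of the coefficients of variation of concentration and flow, a commonly used indicator of chemostatic behavior; values much smaller than 1 mean that concentration varies far less than flow. This index is an assumption introduced here for illustration and is not the analysis behind Fig. 5 or Tab. 6. The function names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean must be non-zero")
    return pstdev(values) / m


def cv_ratio(concentration, flow):
    """CV of concentration relative to CV of flow (near 0: chemostatic)."""
    cv_flow = coefficient_of_variation(flow)
    if cv_flow == 0:
        raise ValueError("flow must vary to evaluate chemostatic behavior")
    return coefficient_of_variation(concentration) / cv_flow


def fold_change(values):
    """Largest value divided by the smallest value."""
    return max(values) / min(values)


# Hypothetical post-flush series: SSF (L/h) and Ptot concentration (mg/L)
ssf = [2.0, 8.0, 20.0, 35.0, 40.0]
ptot = [0.050, 0.048, 0.047, 0.049, 0.046]

print(fold_change(ssf), fold_change(ptot), cv_ratio(ptot, ssf))
```

With series like these, flow changes by more than an order of magnitude while concentration changes by only a few percent, which is the pattern the text describes as chemostatic.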

Seasonal differences in SSF and P concentrations
The differences in SSF between the sprinkling experiments in summer and spring were smaller than the differences in SSF between the three sites and were mainly related to SSF response timing (Fig. 3, Tab. 4). This is likely due to the relatively small differences in antecedent soil moisture conditions between the two sprinkling experiments in summer and spring (except at TUT).
This was particularly true at CON, where the high skeleton content allowed the soils to drain quickly to field capacity. Soil properties likely also explain why seasonal differences in SSF response timing were more pronounced at MIT than at CON (Tab. 4, Tab. 5). At TUT the difference in SSF dynamics between the experiments in summer and spring is more likely related to differences in antecedent wetness conditions. The 7-day average of the median volumetric water content of the soil profile at TUT was 23 vol % in spring compared to 15 vol % in summer. Still, the dominance of pre-event water during both events at TUT suggests that piston flow, rather than fast preferential flow, was the dominant process during both experiments at TUT. P concentrations in the same SSF flow component were typically not significantly different between summer and spring, except for vertical flow from the forest floor at all sites. This might be an effect of the different biogeochemical processes and degree of decomposition of the litter material between summer and spring, as we had hypothesized. The reason why we did not measure a significant difference in Ptot concentration in lateral flow from the forest floor (TR1B) could be that lateral flow typically has a longer flow distance to the trench, along which adsorption can occur. The mean lateral distance to the trench is 10 m, whereas the total vertical soil depth is 3 m.

Conclusions
We present results of six sprinkling experiments conducted at 200 m2 hillslopes at three beech forest sites in Germany that differ in their soil depth, skeleton content and soil P stocks to quantify vertical and lateral SSF and associated P transport. Vertical SSF in the mineral soil and saprolite was at least two orders of magnitude larger than lateral SSF and consisted mainly of pre-event water that was likely replaced by sprinkling water (piston flow mechanism). Short spikes of event water at the beginning of the experiment at CON and TUT indicate, however, that preferential flow occurred in parallel at sites with a higher skeleton content. Absent or strongly delayed SSF from the saprolite at MIT also showed the importance of soil storage capacity in terms of retaining water and nutrients; differences between seasons were, however, minor at all sites, except for vertical flow from the forest floor. We observed a significant decrease in Ptot concentrations in SSF with increasing soil depth in all three forest ecosystems. It was especially strong in the mineral topsoil, which suggests efficient retention of P by adsorption.
However, the actual P concentrations in SSF were highest at the P-rich site (MIT) and lowest at the P-poor site (TUT). Ptot concentrations in SSF at all soil depths and all sites were particularly high in the first 1 to 2 h after the first response of each flow component. For the remaining time of the experiments, transport conditions in the mineral soil and saprolite were, however, close to chemostatic. For example, the change (decrease) in Ptot concentrations was factors, if not orders of magnitude, smaller than the change in SSF during the experiments. Ptot concentrations in the mineral soil also did not change significantly with the increasing fraction of event water in SSF towards the end of the experiments. This suggests that P mobilization in the mineral soil could keep up with P transport. As P transport is closely linked to SSF, P export from forest stands will likely increase in the long term, as the number of large rainfall events associated with SSF is predicted to increase under future climatic conditions.

Data availability
The data will be available from the data repository freiDoc. Until then, please contact the authors with any requests.

Author contribution
MW wrote the grant; MR was responsible for the project management, experimental design and field installations together with two technicians. MR planned and organized the sprinkling experiments and the laboratory analyses, which were conducted by two technicians. MR was responsible for data pre-processing and all analyses. JK and FL provided the data on soil characterization and valuable thoughts and discussion on soil ecological aspects of the study. MR wrote the manuscript including all figures and tables. All other co-authors discussed the results and provided valuable feedback on the text and figures.

Competing interests:
The authors declare that they have no conflict of interest.

Acknowledgments
This project was carried out in the framework of the priority program SPP 1685 "Ecosystem Nutrition: Forest Strategies for  (Lang et al., 2017), c (Diaconu et al., 2017), all other information is based on own data)

Tab. 5: Time to 20 % event water fraction (trise20) in [min] for all flow components and all sprinkling experiments; X indicates that trise20 could not be calculated due to missing data at the beginning of the event (MIT in spring); NA indicates that this LY or TR did not exist (TUT) or yielded no flow (MIT). Empty cells indicate that a 20 % event water fraction was not reached during the event; numbers in bold indicate that the time to 20 % event water fraction was shorter than the time to 20 % rise in SSF (see Tab. 4).