Response to reviewers on our manuscript “Memory effects on greenhouse gas emissions (CO2, N2O and CH4) following grassland restoration?”

A five-year greenhouse gas (GHG) exchange study of the three major gas species (CO2, CH4 and N2O) from an intensively managed permanent grassland in Switzerland is presented. Measurements comprise two years (2010/2011) of manual static chamber measurements of CH4 and N2O, five years of continuous eddy covariance (EC) measurements (CO2/H2O, 2010-2014) and three years (2012-2014) of EC measurement of CH4 and N2O. Intensive grassland management included both regular and sporadic management activities. Regular management practices encompassed mowing (3-5 cuts per year) with subsequent organic fertilizer amendments and occasional grazing whereas sporadic management activities comprised [...] grassland in central Switzerland accounting for changes in GHG exchange following frequent management activities; (ii) to compare two different measurement techniques, namely eddy covariance and static greenhouse gas flux chambers to quantify the GHG exchange in a business-as-usual year; and (iii) to provide a five year GHG budget of the site and quantify losses/gains of C and N. Based on our results we provide suggestions for future research approaches to further understand ecosystem GHG exchange, to mitigate GHG emissions and to ensure nutrient retention at the site for sustainable production from permanent grasslands in the future.

Major concerns 1. CH4 fluxes: I have major concerns with the usage of the CH4 fluxes as presented in this manuscript. Firstly, while the authors present a comparison of N2O chamber and eddy covariance data (Figure 3), they do not for CH4. I believe this is likely as the comparison does not suggest any 1:1 relationship (based on my interpretation of Figure 4b). The authors then use this chamber data to derive annual CH4 fluxes for the years without EC data and assume to be comparable with the EC derived annual fluxes. From the data presented, I see no evidence to believe this to be the case (unlike N2O). Given the two chamber years suggest a small uptake of CH4, while the last three a release of CH4 coinciding with a difference in measurement methodology, I question whether the authors really believe these years are comparable. While the authors discuss these methodology differences in detail in the discussion section, and overall the contribution of CH4 to the GHG budget is small, I believe further attention needs to be given to this, and ideally the equivalent plot to figure 3b is presented for CH4. Based on the timing of management events (pasture restoration) and change in measurement methodology it could be easily interpreted as pasture restoration changes grassland CH4 exchange from an uptake to release.
These are indeed relevant points, and we certainly do not want to give the impression that pasture restoration changes grassland CH4 exchange from an uptake to a release, as this cannot be proven by the data presented in this study (see following response). We would have preferred to show a comparison similar to the one given for N2O; however, the methane concentration measurements were not reliable in 2013 due to a flame ionization detector (FID) malfunction in the gas chromatograph. Overall, we also did not expect to find a similar relation between the methane flux measurements obtained by eddy covariance and chambers because of the small magnitude of the fluxes measured. As stated in the original manuscript "We calculated detection limits for the individual GHGs from our manual chambers following (Parkin et al., 2012). Detection limits were 0.34 ± 0.26 nmol m-2 s-1, 0.05 ± 0.02 nmol m-2 s-1, and 0.06 ± 0.06 µmol m-2 s-1 for CH4, N2O and CO2, respectively, clearly indicating that methane fluxes measured by GHG chambers in 2010/2011 were on average -0.16 ± 0.16 nmol CH4 m-2 s-1, (see Table 2) and thus below the actual detection limit." However, we did compare our eddy covariance methane flux values (methane fluxes fluctuating around 0 with an overall range of -40 up to +40 nmol CH4 m-2 s-1 (Figure 4b)) with the values reported by Felber et al. (2015) from a similar grassland system in Western Switzerland. Felber et al. (2015) have shown that such values measured by the EC technique represent a soil signal (Figure 6 in Felber et al. 2015).
Following this, we agree that we should not have computed annual sums for the years 2010/2011 for methane and will remove these in the revised manuscript. We will only present the gap-filled numbers for methane for 2012-2014 and show the actual measurements derived with GHG flux chambers for the years 2010/2011 only (Figure 4b).
Overall, we would like to point out again that methane fluxes are of minor importance for the carbon and greenhouse gas budget of the site under the current management (see also our response to the second concern as well as the concern made by reviewer #2 on the influence of grazing animals on methane fluxes).
2. The impact of grazing needs further consideration. While harvesting is more common in this study, the impact of grazing needs further clarification and/or modification of the presented results. Firstly, it is unclear to me how the grazing off-take was estimated (please clarify), and whether the deposition of excreta C was included in the C balances. While I'm not familiar with sheep grazing, at least for cattle this can be in the order of one-third of consumption, and therefore not an insignificant component (especially for 2014, Parcel A with 1769.9 kg C ha-1 of grazing removal according to table S1) and requiring acknowledgement of how this is currently dealt with, or included in the C balance (e.g. Table 2). Furthermore, the authors state they did not detect any CH4 release with grazing (lines 432-433). Using the example of Parcel A in 2014, which was primarily grazed by cattle, and assuming ~3% was converted and released as CH4 (e.g. Felber et al. (2016)), 53.1 kg C ha-1 would have been emitted from the grazers as CH4, which when converted to g CO2-eq m-2 calculated to 240 g CO2-eq m-2 or much larger than the 55 g CO2-eq m-2 reported in table. If this was not detected, then I suggest the authors reconsider how grazing related CH4 is dealt with in this manuscript given they are reporting ecosystem scale GHG budgets.
Indeed, methane emissions from grazing animals need to be considered in annual budgets of methane and carbon. We argue that these are already accounted for in our data, since our observation boundary is the ecosystem and thus we only include CH4 from animals when these are on the field. Grazing intensity was extremely low, and grazing lasted for only a few days in the specific years (2010, 2011, 2014). Also, most of the grazing animals were sheep, and cattle were only present in 2014 in Parcel A for less than four weeks in total at an average stocking rate of 4.04 heads per hectare. Thus, the reviewer's statement that Parcel A was primarily grazed by cattle in 2014 is incorrect. Furthermore, we are aware of the 3% assumption and while this approach could be taken, we were not able to follow the numbers presented by the reviewer. Possibly some additional explanation could be provided on how the values given were derived.
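One reading that comes close to the reviewer's figures is sketched below. It is a tentative reconstruction, not a confirmed derivation: it assumes that 3% of the grazing offtake quoted from Table S1 is released as CH4-C, that CH4-C is converted to CH4 mass with the molar mass ratio 16/12, and that a 100-year global warming potential of 34 is applied. The GWP value is an assumption, since the reviewer does not state the factor used, and all function and constant names are illustrative.

```python
# Illustrative constants; names are hypothetical and chosen for this sketch only.
GRAZING_OFFTAKE_KG_C_PER_HA = 1769.9  # Parcel A, 2014, as quoted from Table S1
CH4_C_FRACTION = 0.03                 # ~3% of grazed C released as CH4-C
MOLAR_MASS_CH4 = 16.0                 # g mol-1
MOLAR_MASS_C = 12.0                   # g mol-1
GWP100_CH4 = 34.0                     # ASSUMPTION: factor not stated by the reviewer


def kg_per_ha_to_g_per_m2(value_kg_per_ha):
    """Convert kg ha-1 to g m-2 (1 kg ha-1 = 1000 g / 10000 m2 = 0.1 g m-2)."""
    return value_kg_per_ha * 0.1


def grazer_ch4_co2_eq(offtake_kg_c_per_ha,
                      ch4_c_fraction=CH4_C_FRACTION,
                      gwp=GWP100_CH4):
    """Return (CH4-C in kg C ha-1, CH4 in g CO2-eq m-2) for a grazing offtake."""
    ch4_c_kg_per_ha = offtake_kg_c_per_ha * ch4_c_fraction
    ch4_c_g_per_m2 = kg_per_ha_to_g_per_m2(ch4_c_kg_per_ha)
    # Convert carbon mass to CH4 mass before applying the GWP.
    ch4_g_per_m2 = ch4_c_g_per_m2 * MOLAR_MASS_CH4 / MOLAR_MASS_C
    return ch4_c_kg_per_ha, ch4_g_per_m2 * gwp


ch4_c, co2_eq = grazer_ch4_co2_eq(GRAZING_OFFTAKE_KG_C_PER_HA)
print(f"CH4-C emitted: {ch4_c:.1f} kg C ha-1")
print(f"CO2-equivalent: {co2_eq:.0f} g CO2-eq m-2")
```

Under these assumptions the sketch gives about 53.1 kg C ha-1 and roughly 241 g CO2-eq m-2, close to the 240 g CO2-eq m-2 quoted in the comment. A different GWP factor would change the second number proportionally.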
At the same time, we propose another approximation for methane emissions from enteric fermentation from cattle as follows and in relation to the study by Felber et al. (2016). Felber [...] The total CH4 emissions calculated are thus 48.96 kg CH4 per ha (4.89 g CH4 m-2). When we convert this to C, we derive emissions of 4.07 kg CH4-C per ha (0.40 g CH4-C m-2). This would be the value we would also expect to see with the EC flux tower under perfect conditions with a non-movable point source. Unfortunately, such perfect conditions are not the reality and we may not have captured all of these emissions due to shifts in wind direction, changes in turbulence as well as the actual animal movement out of the fetch. Also, as indicated by Felber et al. (2016), the distance from the cow to the EC tower determines how much methane one measures with the EC tower. Moreover, 4.07 kg CH4-C ha-1 (0.40 g CH4-C m-2) has minimal influence on both the C budget and the GHG budget of the site (see Table 4). In order to clarify this point, we add this information on the issue of grazing in the revised manuscript.
Grazing removal was quantified experimentally by having areas in both parcels from which the animals were excluded. At the end of each grazing period, the grass in the enclosures was cut, similar to the approach taken when estimating harvests, with subsequent laboratory analysis for C and N. Grazing is included in the harvest in Table 4, as this is a removal of biomass from the system. The return of nutrients via excreta (approx. 32% C; Felber et al., 2016) represents a recycling of nutrients within the system, and associated GHG emissions would be included in the EC measurements. Following our previous argument of the very low stocking density, this is unlikely to have a considerable effect on the results of the study.
Moderate Concerns 3. The focus (or perhaps title?) of this manuscript needs sharpening. The title indicates a focus on pasture restoration which is matched by the abstract, yet much attention is given to methodological considerations. Specific goal (ii) states "briefly compare two different measurement techniques" however the first two-thirds of the discussion (i.e. not briefly) comments on this aspect! While important and noteworthy, either change the title/abstract, or return the primary focus of the discussion to management effects. Additionally, goal (iii) is not really explored in this manuscript -perhaps combine with goal (i)?
Thank you for this suggestion. In the revised manuscript we combined the goals (i) and (iii) and shortened the discussion on the methodological aspects while giving more attention to the primary goal of the study. As a consequence, the former objective (iv) has now become the new objective (iii) (see the version of the manuscript with track changes).
4. Providing a partial N budget provides little useful information. Including individual components is beneficial, but to sum them up as an incomplete "budget" is not. If the authors choose to retain the N budget, please include some further context including some ballpark estimates of the remaining components to aid interpretation.
We agree that, particularly in terms of N, providing a partial budget is not as good as providing a full N budget. At the same time, we decided after careful consideration not to provide an N budget with ballpark estimates, as some fluxes would be largely uncertain due to little data availability from such systems (e.g. nitrate leaching) or overall limited data availability across agricultural systems (e.g. losses in the form of NOx and N2). Yet we are aware that losses of nitrogen via e.g. NH3, N2 and NOx can be much larger than the losses via N2O. Consequently, we rephrased the respective objective (previously iv, now iii) to "(iii) to provide a GHG budget of the site". We further changed the wording from C and N budgets to C and N gains and losses; with the losses, we specifically refer to losses of N via N2O.
5. While N2O flux gap filling is difficult, the use of running medians may be problematic, and especially for gaps occurring during pulse emissions (e.g. the restoration period/fertiliser applications). The authors should comment on limitations of this approach, especially in the absence of any uncertainties (which I accept is rarely done in N2O flux studies so do not see them as a requirement here).
Minor/Technical Concerns
Lines 33-34: grazing is listed as both a regular and sporadic management activity. Please clarify which it is.
We apologize for the misleading wording and will rephrase as follows: "Grazing is a typical management activity in such intensive grassland. At our site, we observe grazing with either sheep or cattle for a few days at the beginning or end of most years."
Line 37: Missing the word "out" (or similar) after "carried".
Thank you for pointing this out. Actually, we had the hypothesis of increased CO2 uptake already in the manuscript (L. 89-90). We reworded these lines as follows: "Prior to our measurements we hypothesized short-term losses of CO2 after restoration and more continuous losses of primarily N2O following dramatic management events such as ploughing occurring at irregular time intervals. We further hypothesized an increased carbon uptake strength compared to the pre-ploughing years."
Lines 89-90: If you expect CO2 losses (as per the above point), why would you expect a C gain? Please adjust this and align with the previous sentence to clarify your hypothesis.

See our comment to the previous remark made by the reviewer.
Line 108: Do you mean CH4 emissions from the land or the grazers? In fact, this point needs clarity throughout the manuscript -are the grazers included within the system boundary, and therefore their emissions?
We actually refer to both, land emissions/uptake as well as CH4 emissions from grazers. In terms of system boundaries, these are set to the ecosystem here; thus we account for the GHG emissions made by grazers (CH4 from enteric fermentation, as well as CH4 and N2O from excreta) for the years 2012-2014. Given that the stocking rate was low and the actual time of grazing short, we expected little effect of grazing on the budget, while still aiming at being inclusive as we wanted to include all the management activities occurring in this field. We further included the offtake due to grazing in the budget calculations. The recycling of nutrients from grazing animals and their deposits is included in the eddy covariance measurements, although this may not be the case for 2011/2012. Given the small stocking rate, and as explained before, this is likely of minor importance and surely will not change the results.
We are not sure what the reviewer refers to here as these are two sentences in the original manuscript. However, in order to improve the reading flow, the suggested lines will be adjusted as follows in the revised manuscript: "The study by Hörtnagl et al. (2018) further elaborated the variability in management intensity and related variations in GHG exchange across sites, stressing the need for more case studies based on continuous GHG observations to improve existing knowledge and close remaining knowledge gaps. To complete the picture on factors impacting ecosystem GHG exchange, irregularly occurring events such as dry spells or extraordinary wet periods can further lead to enhanced or reduced GHG emissions (Chen et al., 2016; Hartmann and Niklaus, 2012; Hopkins and Del Prado, 2007; Mudge et al., 2011; Wolf et al., 2013)."
Line 130: "adaptations" should be "adaptation" (no "s").

Done
Line 137: "respectively" is not needed -please delete.
That was an oversight and we added the A. The correction was applied.
Lines 241-249: It was unclear to me what QA/QC procedures were applied to the raw (10/20Hz) and which to the 30-minute data. I suggest improving the clarity here.
We rephrased this section by clearly distinguishing between raw data and raw time series (high frequency) and specifically state when we refer to 30-minute data.
Line 248: what was considered the physically plausible range? Please include this information.

Done
Line 280: Order of words: "no longer closed" should be "closed no longer".

Done
Line 314: Remove the word "Up"

Done
Lines 477-478: I think the before and after restoration periods should be separated. I don't believe averaging the two periods to be fair as part of the purpose of restoration is to improve growth, and therefore modification of CO2 exchange should also be expected.
Line 538: Correct the format of the reference
Done
Line 579-580: Are you referring to the measured CO2 exchange to be +/-50 g C m-2 y-1, or the uncertainty? This sentence is very unclear as no uncertainty has been presented, so please clarify.
This refers to the statement made by Baldocchi et al. (2003), who stated that annual numbers presented from EC measurements can vary by as much as +/-50 g C m-2 y-1. Thus, we encourage readers to keep this uncertainty in mind when evaluating annual budgets derived by the EC technique.
Table 1: I find the "max data availability" columns repetitive -perhaps just a single column of this data?
Good point, thank you! We removed the repetitive statement of numbers in the revised manuscript and also removed the columns presenting the water fluxes as these are not referred to in the manuscript.
We thank the reviewer for pointing us towards these references and we refer to these in the revised version of the manuscript.

Reviewer #2:
We would like to thank reviewer #2 for the overall positive evaluation and for providing feedback on the points that the reviewer encourages to be addressed. Our responses to the questions/concerns are given in italic font.
The manuscript "Memory effects on greenhouse gas emissions (CO2, N2O and CH4) following grassland restoration?" by Merbold et al. is a well written longterm study of GHGs from a grazed grassland system in Switzerland. The team have used a mixture of measurement methods over a 5 year period to get a very good picture of a full GHG budget for the field. This is a very valuable study as such longterm observations are rare and it answers some questions that are not well studied. I found the manuscript interesting to read, and it was written to a very high standard and I do believe that it should be published after some amendments. I do have some comments that I feel should be addressed by the authors that I believe would improve the quality and usefulness of the study for others. Although these comments are numerous and not entirely simple to address, if the authors can amend their study to incorporate them I feel the work would benefit greatly.
Thank you for the positive evaluation and we suggest ways forward point by point below.
A large assumption made by the study is that the eddy covariance measurements are entirely truthful of the conditions in the field. It has been observed in the past that long-term carbon budgets derived from eddy covariance can be biased due to assumptions made by the method. Often negative carbon fluxes are reported in similar systems, however, when investigating deep soil cores there was found to be no significant difference in C content of the soil (see Jones et al., doi:10.5194/bg-14-2069-2017 for one such study). The manuscript does not provide evidence of the C stock in the soil beyond the Eddy C measurements to back up the evidence which would have made it a much more significant study. This does not invalidate the study by any means, but without clarification of potential uncertainties, it increases the danger that the study provides "concrete" evidence of mitigation methods (i.e. grazing animals is a carbon sink) that has been used recently by advocates of the meat industry to justify the long-term environmental aspects of livestock farming. I would advise a short message of discussion to highlight that there is room for error in the measurements and that soil carbon was not measured to validate the measurements. Alternatively, if the soil measurements are there, please include them.
As we primarily provide a GHG budget, after having revised the objectives, these numbers do not represent a full farm-scale assessment.
I do not agree with the way that the N2O flux data has been handled in the study. N2O fluxes measured using chambers almost always follow a log-normal distribution in space, so any data analysis must take this into account when handling means and uncertainties. A simple arithmetic mean with associated uncertainty (not sure what the error bars on Fig 3 and 4 represent?) will not be an adequate way to represent this data (although commonly used wrongly in previous studies). This will result in a skewing of the data and large overestimates in minimum confidence intervals and underestimations of maximum confidence intervals. An example is when uncertainties of N2O cross the negative threshold when no observations of flux dip below zero. This is not a satisfactory way to present the data. I recommend using a more sophisticated analysis technique and showing 95% confidence intervals where possible for a thorough comparison of the measurement techniques.
We thank the reviewer for the critical assessment. Our

Given that we aimed at deriving an annual budget which is relatively conservative, we chose the running median approach. First of all, this way we are less likely to overestimate N2O emissions compared to e.g. the daily average approach. Linear interpolation would also have led to an overestimation of N2O emissions, particularly for the years 2010 and 2011 with few data points. Certainly, we see the lowest influence of gap filling errors for the years with EC measurements, whereas there may be a larger bias for the years with chamber measurements. Based on our 5-year observation period, which indicated N2O emission peaks during the growing season only and primarily following fertilization events (except 2012), we are confident that we covered the majority of these peaks during the years 2010 and 2011 when only chamber measurements were available. Thus, we decided to retain the chosen approach, as we do not think it is beneficial to state values which are likely to be more biased.
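The difference between filling gaps with a running median and with an overall mean can be illustrated with a minimal sketch. The function name, the centred window and the synthetic weekly flux values below are assumptions made for illustration only; they do not reproduce the processing or the data of the manuscript. Only the 60-day window for chamber data is taken from the text.

```python
from statistics import mean, median


def running_median_fill(days, fluxes, window_days):
    """Fill missing daily values (None) with the median of all measured
    values lying within +/- window_days / 2 of the gap (centred window)."""
    half = window_days / 2
    filled = []
    for day, flux in zip(days, fluxes):
        if flux is not None:
            filled.append(flux)  # measured values are kept unchanged
            continue
        neighbours = [f for d, f in zip(days, fluxes)
                      if f is not None and abs(d - day) <= half]
        # Leave the gap open if no measurement falls inside the window.
        filled.append(median(neighbours) if neighbours else None)
    return filled


# Synthetic example: weekly chamber sampling over 8 weeks with one
# short emission peak (arbitrary flux units, hypothetical values).
days = list(range(56))
fluxes = [None] * 56
weekly_values = [0.2, 0.3, 5.0, 0.4, 0.2, 0.3, 0.2, 0.3]
for d, f in zip(range(0, 56, 7), weekly_values):
    fluxes[d] = f

median_filled = running_median_fill(days, fluxes, window_days=60)
measured = [f for f in fluxes if f is not None]
mean_filled = [f if f is not None else mean(measured) for f in fluxes]

budget_median = sum(median_filled)
budget_mean = sum(mean_filled)
print(f"cumulative flux, running median fill: {budget_median:.1f}")
print(f"cumulative flux, overall mean fill:   {budget_mean:.1f}")
```

In this synthetic case the single peak raises the overall mean and therefore every mean-filled gap, whereas the running median keeps gap values near the background level while retaining the measured peak itself. Whether this is conservative or an underestimate depends on how well the sampling captured the real peaks.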
L303: Due to the log-normal distribution of N2O emissions measured using chambers, most measurements will be very close to zero and ppb differences in gas samples will hover around detection limits of the analysis instrument. In such cases, the R2 value of the fits will be very low for many, but the regression between points will still be valid (effectively an average of the instrument noise with a slope near zero). By cutting data with R2 lower than 0.8 I assume that a very large number of small fluxes are removed from the dataset. If this is the case I would recommend a threshold on this QC method, or a more detailed explanation of what impact this had on the data in the text if this is not the case (as I read it, the method would likely contribute to a large bias is flux estimates).
We implemented thorough QC criteria concerning the N2O flux calculations. All details, including the R2 threshold and how many data points were dismissed, are provided in Imer et al. (2013). Overall, the low fluxes that are part of our observations were not below the limit of detection and have thus been included in this study.
Uncertainties in cumulative emissions are not presented which makes it difficult to compare with other studies or what impact gap-filling and weather may have had on the study. This should be easily manageable for CO2 for which models exist, and probably for CH4 using simple gap-filling as it was found to be approximate zero throughout the study. I understand that there is no definitive way to gapfill N2O, however a running median is not a statistically defensible way to "model" data. As a result no uncertainty will be calculated from this method. If the authors want to estimate uncertainties in cumulative N2O fluxes, they will have to develop a more sophisticated approach to gap-filling.
We agree with the reviewer that there are different approaches to gap-fill GHG flux data. Certainly, the gap-filling approaches for CO2 and CH4 are better developed than for N2O. The running median approach was chosen following Hörtnagl et al. (2018) (see above), as this seemed at the time to be the best possible way to fill N2O flux data gaps given the ecosystem observed.
I feel a nitrogen budget without NH3, NOx and N2 is not very useful. Combined, these gases will likely contribute approx. 50% of nitrogen losses from the system. Perhaps a better way to confer N losses is to calculate the emission factors of the fertiliser applications, as that is a more generally used term for such activities in literature and is a better description of the presented results in the study.
We are in full agreement with the reviewer that other N compounds make up a large part of the N budget. We thus adjusted the manuscript to only show the GHG budget and avoid stating a full N budget, as this could only be based on very rough estimates. We also decided to adjust the text and mention only C and N gains/losses.
Is there a way to estimate the N content of the fodder/grass on the field before tillage to assess the emissions from the herbage being tilled into the soil?
We have thought about this too when preparing the manuscript and realized that we had not taken such measurements. However, to our current knowledge the additional N being incorporated into soil during tillage should be very small due to the very low vegetation height at this time of the year.
Does the carbon budget take into account vehicle use? Is it insignificant or does tractor diesel have a role to play?
The currently presented budget does not include C emissions from vehicle use for two reasons: (1) farm vehicles are used on this field for only a few hours over the course of the year given the small size of the fields, so these emissions are negligible; and (2) this negligibility was further underlined by an MSc thesis that investigated full farm-gate budgets in the years prior to this study.
L225: Can you explain what you mean by an internal reference cell in the instrument for the QCLAS? To my knowledge, these cells are used to find absorption lines on the spectra and not for calibration as they leak over time. The QCLAS system typically does not require calibration as it operates on the principles that the absorption follows Hitran quantum mechanics laws.
Thank you for this comment; this seems to be a misunderstanding of what we have written. We stated that the infrared gas analyser was calibrated regularly, while we also wrote that the QCLAS was fitted against an internal reference cell. For better clarity, we changed this sentence as follows: "The QCLAS did not need calibration due to its operating principles, and an internal reference cell (mini-QCL manual, Aerodyne Research Inc., Billerica, MA, USA) eased finding the absorption spectra after each restart of the analyzer."
Some minor corrections
L283: I think there is a bit of wording here that is confusing. Flushing the chamber with the syringe isn't technically correct. I think it would be better to say that the syringe was used to pump the chamber to circulate the air to avoid the concentration gradients?
Done
L471: here the order of the sentences makes it sound like CH4 contributed to 70% of the budget. Please re-order.
Our results stress the inclusion of grassland restoration events when providing cumulative sums of C sequestration potentials and/or global warming potentials (GWPs). Consequently, this study further highlights the need for continuous long-term GHG exchange observations as well as the implementation of our findings into biogeochemical process models to track potential GHG mitigation objectives as well as to predict future GHG emission scenarios reliably.
[...] (Table S1) (Table S1). [...] corrections were applied to raw fluxes, accounting for high-pass and low-pass filtering for the CO2 signal based on the open-path IRGA as well as for the closed-path CH4 and N2O data. All fluxes were calculated using the software EddyPro (version 6.0, LiCor Biosciences, Lincoln, NE, USA) (Fratini and Mauder, 2014). The quality of half-hourly raw time series was assessed during flux calculations following (Vickers and Mahrt, 1997).
Raw data were rejected if (a) spikes accounted for more than 1 % of the time series, (b) more than 10 % of available data points were significantly different from the overall trend in the 30 min time period, (c) raw data values were outside a plausible range (± 50 µmol m-2 s-1 for CO2, ± 300 nmol m-2 s-1 for N2O and ± 1 µmol m-2 s-1 for CH4) and [...] (Table 1). The number of available flux values for N2O and CH4 was lower, since we were only able to continuously measure both gases from 2012 onwards (Table 1). [...]
To date a common strategy to fill gaps in EC data of CH4 and N2O has not been agreed on. The commonly used methods are simple linear approaches (Mishurov and Kiely, 2011) or the application of more sophisticated tools such as artificial neural networks (Dengel et al., 2011). The difficulty of finding an adequate gap-filling strategy results from the fact that emission pulses of either N2O or CH4 remain challenging to predict. Similarly, different measurement approaches, i.e. low temporal resolution manual GHG chambers compared to high temporal resolution eddy covariance measurements, need different gap-filling approaches (Mishurov and Kiely, 2011; Nemitz et al., 2018). In order to keep the gap-filling methods as simple and reliable as possible, we used a running median (30 and 60 days for eddy covariance based and chamber N2O fluxes, respectively). A similar approach was recently chosen by Hörtnagl et al. (2018) due to its sensitivity to peaks in the N2O exchange data. The approach was particularly chosen as it minimizes the bias occurring from linear gap filling or simply using an overall average value. While the gap-filling approach may be of less importance for EC flux measurements with its high temporal data availability, it is all the more important for less frequently available GHG fluxes derived via manual chambers.
Given the occurrence of sporadic N2O peaks, which occur mostly in relation to management activities and last for a few hours/days only, as well as the labour needed to carry out GHG chamber measurements, researchers commonly aim at having weekly or biweekly flux data (e.g. Imer et al. 2013). The respective sampling is commonly designed to capture potential N2O flux peaks as well as some background values (Mishurov and Kiely, 2011). If one then uses either a linear interpolation or an overall average value, one can derive a budget which is then likely an overestimation of the annual flux budget caused by the few flux peaks observed in such managed systems. The same bias is likely to occur if just flux averages are used since few very high emission peaks will affect such an average. Thus, and in order to simulate N2O emission peaks more reliably, we have chosen the approach as taken by Hörtnagl et al. (2018).
In contrast to CH4 and N2O, various well-established approaches to fill CO2 flux data exist (Moffat et al., 2007). Here, we filled gaps in CO2 exchange data following the marginal distribution sampling method (Reichstein et al., 2005) which was implemented in the R package REddyProc (https://r-forge.r-project.org/projects/reddyproc/).
[...] and average daily temperature in winter dropped as low as -12.7 °C (6th February 2012, Figure 1b) with soil temperature following in a dampened pattern (Figure 1b). Average daily photosynthetic photon flux density did not differ considerably over the five-year observation period (Figure 1c). The site rarely experienced snow cover during winter (Figure 1b). The complexity in management activities becomes apparent when comparing business as usual years (e.g. 2011) with the restoration year (2012, Figure 2a

Temporal variation of GHG exchange
Fluxes of CO2 and N2O showed considerable variation between and within years. This variation primarily occurs due to management activities and seasonal changes in meteorological variables (Figures 1 and 4). In contrast, methane fluxes did not show a distinct seasonal pattern. [...] Table 2). All four non-ploughing years revealed largest CO2 uptake rates in late spring (daily averaged peak uptake rates were >10 µmol CO2 m-2 s-1, March and April, Figure 4a). Besides the seasonal effects a clear impact of harvest events could be identified, with abrupt changes from net uptake of CO2 to either reduced uptake or net loss of CO2 (light blue arrows indicate harvest event, Figure 4a). A similar but less pronounced effect was found following grazing periods (light and dark brown arrow, Figure 4a). A complete switch from net uptake to net CO2 release was observed during the first three months of 2012, after ploughing and during re-cultivation of the grassland. In this specific year, the site only experienced snow cover for a few days (Figure 1c) and temperatures below 5 °C occurred more regularly than in all other years (Figure 1b). Seasonal CO2 exchange was characterized by net release of CO2 in winter (DJF), highest CO2 uptake rates in spring (MAM), constant uptake rates during summer (JJA), which however were lower than those measured in spring, and very low net release of CO2 in fall (Table 3). Average winter CO2 exchange for the five-year observation period (gap-filled 30 min data) was 0.28 ± 5.68 µmol CO2 m-2 s-1 (SE = 0.04, Table 3 [...] The individual static chamber measurements (2010 & 2011) were often below the detection limit and fluctuated around zero similar to the eddy covariance measurements (Figure 4b). Any methane peaks expected due to freezing and thawing in late winter and early spring were not observed.
Also, commonly reported net emissions of methane during grazing of animals were not seen (Figure 4b). Seasonal differences of methane exchange did not show a clear pattern (Table 3)

The global warming potential (GWP), expressed as the yearly cumulative sum of all gases after their conversion to CO2-equivalents, was negative during all years (between -387 and -2577 CO2-eq. m-2) except for the ploughing year 2012 (+2629 CO2-eq. m-2). Overall, CO2 exchange contributed more than 90% to the total GHG balance in 2011, 2013 and 2014. Clearly, CH4 exchange was of minimal importance for the GHG budget (Table 2)

restoration year the site lost 395 g CO2-C m-2 (3950 kg C ha-1) (Table 2). Carbon losses (and/or gains) from methane were < 1 g CH4-C m-2 during all five years. Carbon was gained in both parcels during the pre-ploughing years (Table 4)

(Table 2). We calculated detection limits for the individual GHGs from our manual chambers following Parkin et al. (2012). Detection limits were 0.34 ± 0.26 nmol m-2 s-1, 0.05 ± 0.02 nmol m-2 s-1, and 0.06 ± 0.06 µmol m-2 s-1 for CH4, N2O and CO2, respectively. Accordingly, methane flux measurements were frequently below this limit of detection, hence we did not calculate

here show the overall trend on C uptake/release of the site and clearly exceed the uncertainty of ± 50 g C per year for eddy covariance studies as suggested by Baldocchi (2003). Methane was of negligible importance for the C budget of this site. We did not observe distinct peaks in CH4 emissions in relation to grazing, which is primarily due to the low grazing pressure

Nitrogen inputs and losses via N2O varied largely between the years before and after ploughing.
While the site was characterized by large N amendments and reduced harvest prior to ploughing, the picture was completely the opposite during the years after ploughing, with considerably lower N inputs compared to the nitrogen removed from the field via harvests.

This study, in combination with an overview of available datasets on grassland restoration and their consequences for GHG budgets, highlights the overall need for additional observational data. While restoration changed the previous C sink to a C source at the Chamau site, the wider implications in terms of the GWP of the site when including other GHGs have long-term consequences (e.g. in mitigation assessments). Furthermore, this study showed the large variations in N inputs and N outputs from this grassland and the difficulty farmers face when aiming for balanced N budgets in the field. Still, the current study focused on GHGs only and thus cannot constrain the N budget but only assess the losses of N via N2O. Losses in the form of NH3, N2 and NOx will have to be quantified to fully assess N budgets, besides the overall fact that GHG data following grassland restoration remain too limited to investigate long-term consequences. Fortunately, such data are likely to become available in the near future through the establishment of environmental research infrastructures (e.g. ICOS in Europe, NEON in the USA or TERN in Australia) that aim at standardized, high quality and high temporal resolution trace gas observation of major ecosystems, including permanent grasslands. With these additional data, another major constraint on producing defensible GHG and nutrient budgets, namely gap-filling procedures, will likely be overcome. New and existing data can be used to derive reliable functional relations and artificial neural networks (ANNs) at field to ecosystem scale that are capable of reproducing in-situ measured data.
Once this step is achieved, both the available data as well as the functional relations can be used to improve, to train and to validate existing biogeochemical process models (Fuchs et al., 2020). Subsequently, reliable projections of both nutrient and GHG budgets at the ecosystem scale that are driven by anthropogenic management as well as climatic variability become reality.

The study stresses the necessity of including management activities occurring at low frequency, such as ploughing, in GHG and nutrient budget estimates. Only then can the effect of potential best-bet climate change mitigation options be thoroughly quantified. The next steps in GHG observations from grassland must not only focus on observing business as usual activities, but also aim at testing the just mentioned best-bet mitigation options jointly in the field and simultaneously in combination with existing biogeochemical process models.

Values are given as all data possible, raw processed values and high quality (HQ) data, which were then used in the analysis. High quality data are data with a quality flag "0" and "1" from the EddyPro output only. Grey shaded areas represent time periods where both methods (EC and static chambers) were used simultaneously to estimate FN2O. Static chamber flux data are highlighted in italic font.

Table 2: with "-" and losses/exports are indicated with "+". While management information was available for both parcels (A and B), flux measurements integrate over both parcels. n.c. = not calculated

Table 5: Existing studies investigating the GHG exchange over pastures following ploughing. Results presented show the flux magnitude following ploughing and are rounded versions of the individual values presented in the papers. Values were converted to similar units (mg CO2-C m-2 h-1, µg CH4-C m-2 h-1 and µg N2O-N m-2 h-1).
Based on a Web of Knowledge search on July 15th 2017 with the search terms "grassland", "pasture", "greenhouse gas", "ploughing" and/or "tillage". Only two studies representing conversion from pasture to cropland or other systems were included in this table.

Table S1: whether carbon (C in kg ha-1) and/or nitrogen (N in kg ha-1) were amended to, or exported from the site ("Fo" and "Fo*" = organic fertilizers, slurry/manure (red); "Fm" = mineral fertilizer (light orange); "H" = harvest (light blue); "Gs" and "Gc" = grazing with sheep/cows (light/dark brown)). Other colored arrows visualize any other management activities such as pesticide application ("Ph" = herbicide (light pink); "Pm" = molluscicide (dark pink); "T" = tillage (black); "R" = rolling (light grey) and "S" = sowing (dark grey)), which occurred predominantly in 2010 (parcel B) and 2012 (parcels A and B). Carbon imports and exports are indicated by black and grey bars. Black indicates the start of the specific management activity and grey the duration (e.g. during grazing, "Gs"). Green colors indicate nitrogen amendments or losses, with dark green visualizing the start of the activity and light green indicating the duration. Sign convention: positive values denote export/release, negative values import/uptake.