Reply on RC1

In “Carbon sequestration potential of street tree plantings in Helsinki” Havu and collaborators present model simulations over an urban area in Helsinki to estimate the carbon sequestration of urban trees. The authors used the SUEWS and Yasso models to simulate the carbon cycle for the vegetation and soil, respectively. Based on the model results, they estimated that the simulated urban area will become carbon neutral in 12 or 14 years. In my opinion, the manuscript is a valuable addition to the literature since many of the land surface model studies omit urban areas despite their growing importance. Also, the manuscript describes the most pressing points to improve modeling of energy and material cycles in urban areas such as soil carbon process in the urban area and groundwater access. However, I think the manuscript would benefit from revisions to make the method and results clearer.

We thank the referee for highlighting the relevance of the manuscript and we agree with the suggestions to make the methods and results clearer. Below, we discuss how this will be done in detail.

Major comments
The metrics used for model-data evaluation were not well justified. Four metrics were used, i.e., nRMSE, RMSE, nMBE, and MBE.
We agree with the reviewer that the metrics used in the manuscript need further explanation. We used two metrics, RMSE and MBE, and their normalized versions. We chose RMSE to evaluate model accuracy, as it measures the average magnitude of the error. We chose MBE to indicate whether there is a systematic over- or underestimation. Both RMSE and MBE are useful for evaluating model performance, although they depend on the scale of the data. As the scales varied notably between the datasets used in this manuscript, we chose to focus on the normalized versions, which make comparisons between sites and years clearer.
An improved description of the metrics will be added to Section 2.6 of the manuscript as follows:
Modified text (L335): "The normalized metrics are mainly used in the analysis as they allow comparison between different scales. The nRMSE is used to evaluate the accuracy of the models and the nMBE indicates whether the models have a systematic over- or underestimation."
We think it is valuable for the reader to see all four metrics in Tables 4 and 5, but in the text we will focus on the normalized metrics, which will make the analysis more consistent. Results sections 3.1 and 3.2 will be clarified in the revised manuscript so that the reader does not get lost among the metrics.
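To illustrate why the scale-dependent and normalized metrics can rank two sites differently, a minimal sketch follows. It assumes normalization by the mean of the observations, which is only one common convention and is not stated in this reply. The function name and the numbers are hypothetical placeholders, not values from the manuscript.

```python
import math

def evaluation_metrics(observed, modelled):
    """Return RMSE, MBE and their normalized versions.

    Assumption: normalization by the mean of the observations.
    """
    errors = [m - o for o, m in zip(observed, modelled)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mbe = sum(errors) / len(errors)  # positive means overestimation
    obs_mean = sum(observed) / len(observed)
    return {
        "RMSE": rmse,
        "MBE": mbe,
        "nRMSE": rmse / obs_mean,
        "nMBE": mbe / obs_mean,
    }

# Placeholder data: site B has observations four times smaller than site A.
site_a = evaluation_metrics([4.0, 8.0, 12.0], [5.0, 9.0, 13.0])
site_b = evaluation_metrics([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])

# Site B looks better on RMSE but worse on nRMSE.
print(site_a["RMSE"], site_a["nRMSE"])
print(site_b["RMSE"], site_b["nRMSE"])
```

With these placeholder values the site with the smaller signal has the lower RMSE but the higher nRMSE, which is the kind of reversal discussed for the transpiration comparison.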
For transpiration (starting L358), we originally used RMSE and MBE, which were better at the Tilia site because transpiration there was four times smaller than at the Alnus site. Because the scales differ, we now look at the normalized values, for which the Alnus site performed better in both cases. This will be corrected in the revised manuscript.
For the Yasso model (starting L372), RMSE was better at the Alnus site and nRMSE at the Tilia site. We wanted to use the same metrics for both SUEWS and Yasso, but it is worth noting that the number of observations might be too small to use nRMSE for Yasso. We will clarify this in the revised manuscript. However, nMBE indicates systematic under- or overestimation and will thus be more useful in this case.
Modified text (L372): "Yasso model performance was evaluated using only four measurement points in time, and therefore the following statistical values should be treated with caution. The model performance was best in soil 3, as nMBE was lowest at both sites (Table 5)."

The study showed that the street trees in Helsinki could become carbon neutral or a slight carbon sink 14 years after planting, but it is not clear what happens after those 14 years. How long does an average street tree live? What is the soil volume affected by a street tree? What happens with the wood of street trees once they are replaced? How much carbon is emitted in the management of street trees? How does all of this affect the carbon cycle of street trees over their life cycle? Thinking about how the carbon balance changes after those 14 years is likely to give important insights and implications for street tree management.
One of the strengths of the model would be that a theoretical experiment is possible even though it needs several assumptions. Extending the model application would make the manuscript more suitable for Biogeosciences as it would shift the focus of the manuscript from model development to new insights in the biogeochemical cycle of street trees.
We thank the referee for the important insights on the carbon cycle of street trees. For the revised manuscript, a simplified estimate of the carbon sequestration potential over the street tree lifespan was made with both models. Even though old street trees do exist, the expected lifespan of a street tree is only approximately 20-30 years due to various construction works (Roman and Scatena, 2011). Therefore, the estimate was made for 30 years after street tree transplanting. For SUEWS, both photosynthesis and plant respiration were averaged over the pruned years (2008-2016), and we assumed that the leaf mass and the averaged CO2 exchange rates remain the same for the rest of the 30-year period. Similarly, a new simulation was made with Yasso to estimate the soil carbon stock for the 30-year period, using averaged monthly meteorology calculated for the same years (2008-2016) and assuming stable root litter input. In addition, we included litter input from leaves and pruned woody parts. As these are removed from the area of interest, their decomposition takes place elsewhere but is naturally linked to the carbon sequestration of the trees of interest. In the revised MS, we will estimate the carbon sequestration potential of street trees as in Fig. 9 for 30 years from tree planting, in units of kg C per year per tree. Therefore, the canopy area and the soil affected (25 m2) will be taken into consideration.
The text in the methods section will be modified as follows: "A simplified estimation of carbon sequestration potential throughout the expected street tree lifespan was made using both models. The expected lifespan of a street tree is approximately 20-30 years (Roman and Scatena, 2011), and therefore, the estimation was made for 30 years (2002-2032) after the street tree planting. For SUEWS, both annual photosynthesis and plant respiration were averaged over the pruned years (2008-2016), and it was assumed that the calculated average rates of photosynthesis and plant respiration continue for 2017-2032. For Yasso, an additional model run using the average monthly mean air temperature and precipitation from the same years with stable root biomass was conducted."
We acknowledge that this manuscript does not fully cover the street tree carbon cycle throughout its life cycle, as the SUEWS model is developed to describe the net CO2 flux on a local scale. However, we will add more insights on the full life cycle to the Discussion, as previous estimates of the carbon content of the leaves of these street trees exist. Between 2002 and 2011, the leaves would contribute cumulatively 12.5 kg C per tree, of which approximately 7.3 kg C per tree still remained in 2011. The total C in pruning was 0.7 kg C per tree, of which 0.6 kg C still remained in 2011, as the trees were pruned only after 2008 (Riikonen et al., 2017). We will add the new analysis to the results section in the revised manuscript and will revise the Discussion to include more insights on the carbon cycle of street trees during their lifespan.
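The extension of the simulated years to the full 30-year period can be sketched as below. This is only an illustration of the averaging assumption described above, not the SUEWS or Yasso code. The function name and the annual values are hypothetical placeholders with no relation to the simulated fluxes.

```python
def extend_with_mean(annual, mean_years, last_year):
    """Extend a {year: value} series through last_year.

    Years after the last simulated year are filled with the mean
    of the values in mean_years (here, the pruned years).
    """
    mean_value = sum(annual[y] for y in mean_years) / len(mean_years)
    extended = dict(annual)
    for year in range(max(annual) + 1, last_year + 1):
        extended[year] = mean_value
    return extended

# Placeholder annual net uptake per tree for 2002-2016 (arbitrary units).
simulated = {year: float(year - 2002) for year in range(2002, 2017)}

# Average over the pruned years 2008-2016 and extend to 2032.
extended = extend_with_mean(simulated, range(2008, 2017), 2032)

# Cumulative balance over the whole period.
cumulative = sum(extended[year] for year in sorted(extended))
print(extended[2017], cumulative)
```

The same pattern would apply separately to photosynthesis and plant respiration; the sketch treats a single net series only to keep the example short.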

L43 Explain why those methods are not suitable for climate change.
We will add a clarification that i-Tree is not yet suitable for examining the effect of climate change on carbon cycling in Finland, as its climate conditions are adjusted for the US and it currently does not take future climate into account.
Modified text (L43): "However, these methods are incapable of capturing the correct response of the urban biogenic carbon cycle to local environmental conditions and changes in local climate, as climate conditions have been adjusted for the US, and thus lack high temporal resolution. In addition, the model cannot simulate carbon cycling in future climates."

L51 I could not find the necessity of the sentence starting with 'Furthermore'. It may mean that simulating the right temperature is very important due to the interaction with urban structures. Does it?
As the focus is on the urban biogenic carbon cycle, we explained how both photosynthesis and respiration will be modelled. "Furthermore" was just to add how respiration would be modelled, but we will remove it.
Modified text (L51): "Plant and soil respiration is modelled to depend exponentially on air temperature."

L63 How come the SOC will be increased in urban soils compared to the natural environment? Don't urban structures inhibit such a process?
Pataki et al. (2006) explain that the increase is observed in the most highly managed soils because of the larger carbon input that these management practices leave in the soil. The impact is visible in parks, but in general the structure of cities affects the soil beneath buildings and paved areas, preventing such processes. A clarification on this will be added to the MS.

L69 Starting with 'in addition' mentioning that the information was not referred to beforehand, was confusing.
The sentence will be modified.
Modified text (L69): "Soil carbon decomposition depends on the size of the SOC pool, and on temperature and precipitation (Davidson and Janssens, 2006)."

L101 It was difficult to understand how the three different soils were laid in the experimental sites.
A clarification on this will be added to the MS:
Modified text (L101-102): "The different soils were installed as planting pockets separated by compacted gravel at the Alnus site or as a continuous strip at the Tilia site."

L124, 128, 134 This content might be more suitable for the results section.
We agree. The sentence in L124, "Comparing the water use of the different tree species…", will be removed from the methods, as this is already clear from the results. The sentences in L128-131, "Overall, Tilia site had higher SWC...", will be removed from the methods and added to results section 3.1.1:
Modified text (L340-342): "In general, Tilia site was moister than Alnus site as also seen in the observed groundwater level. Furthermore, the catchment area at the Tilia site is large, whereas Alnus site is fed mainly with local rainfall (Riikonen et al., 2011). During the summers from 2008 to 2011, the SWC was on average 27 and 13 % for Tilia and Alnus sites, respectively."
The sentence in L134, "The soil carbon stock estimates…", and the following sentence, "The proportion of carbon...", will be removed, and section 2.6 in the methods will be modified:
Modified text (L314-315): "The proportion of carbon in the LOI was assumed as 0.56 (Hoogsteen et al., 2015)."

L143 Why were the additional measurements primarily used? Was it closer to the sites?
The quality of the PWD measurements is much better than that of the old precipitation measurements with Ott. We will include this reasoning in Section 2.3 of the revised MS: "…and these were primarily used when available due to their better quality than the Ott measurements."

L155 Do you mean the air pressure?
Yes, this will be corrected in the revised manuscript.

L258, 347 How was the value for the input (0.06) decided?
We will add clarification to the MS that the value was chosen by sensitivity testing so that the soil doesn't limit the modelled transpiration.
Modified text (L258): "The limit was chosen by sensitivity testing such that the soil does not dry and limit the modelled transpiration."

L349 a slight morning maximum: unclear expression.
We will remove the expression and modify the sentence.
Modified text (L349): "The diurnal maximum of observed transpiration reached 0.27 mm h-1 in the morning at the Tilia site."

L356 Why were only two years of the four-year evaluation period referred to?
Only two years were referred to because the worst and best cases were always the years 2010 and 2011. The years 2008 and 2009 always fell between these two. The years in brackets will be removed in the revised manuscript to avoid confusion.

L365 You mean similarly, respiration is higher in Alnus?
Yes, we will add clarification.

L538 The values have a different unit from
The soil depth was 1 m, so kg C m-2 and kg C m-3 mean the same thing here. However, we agree with the reviewer that the units should be consistent throughout the manuscript, and we will change all of them to m-2, which is commonly used for fluxes. This will affect L372 and the y-axis label of Figure 7.

Table 1
It seems to be shown way too early, as Table 1 is only referred to 4 pages later.
Table 1 is first referred to on the previous page, as it also describes the site and connects to Figure 1. Thus, we prefer keeping Table 1 in its original place.

Yes, thank you for noticing this mistake. It will be corrected in the revised manuscript.

Thank you for noticing this mistake. In previous articles, these same functions have been named either g or f functions, and we had accidentally used both here. All the f() functions will be changed to g() to match Figure 2 in the revised manuscript.

Figure 8
Not clear if 'estimated' points out simulation or observation since the term 'estimates' has been used for observations.
We will change "Estimated" to "Simulated" to clarify that these values are modelled.