Interactive comment on “An improved parameterization of leaf area index (LAI) seasonality in the Canadian Land Surface Scheme (CLASS) and Canadian Terrestrial Ecosystem Model (CTEM) modelling framework”

to the CLASS-CTEM model. The NSC module allows a better representation of leaf area seasonality and provides trees with a mobile carbohydrate pool that increases their resilience to disturbances in the absence of photosynthesis. It is tested at three FLUXNET sites, where model projections of GPP, LAI, and heat fluxes (incident radiation, latent heat and sensible heat) are compared against observations. In my opinion, this is an interesting and thorough piece of work, in which the authors demonstrate that the addition of the NSC module clearly improves model performance. My major concern about the present paper is its novelty. Most process-based forest simulation models currently include an NSC module (Fontes et al., 2010), implemented in a way similar to the new module for the CLASS-CTEM model. In my opinion, the current manuscript therefore does not make clear what is novel about the work. Furthermore, the manuscript makes little reference to other models that include this key compartment. I think this would be a valuable element to include in the discussion, as there are plenty of other studies in which the addition of NSC to a given model clearly improves its performance.

various stages in the phenological cycle, and corresponding improvement in the timing of carbon fluxes (though limited improvement in energy fluxes). This is of use to the land surface modeling community, as not all LSMs currently include specific NSC pools and related processes. As the authors have discussed, NSC processes are relevant for other components of biogeochemical cycling and ecosystem functioning, in addition to the phenology they focus on in this study (though this discussion could be expanded).
The paper is clearly written and structured (although the goals of their model development work might be better stated as questions or hypotheses). However, I have some concerns about the lack of breadth of the study and depth of the analyses undertaken, which are detailed more in specific comments and outlined briefly here.
1) The addition of non-structural carbon pools and associated fluxes is the primary focus of the study; however, there are no observations of NSCs used to evaluate the authors' modifications to the model. I would have expected that any model modification would be tested against observations that are directly relevant to the new processes added in the model even though I appreciate that NSC data are scarce (see Dietze et al., 2014 for a review of previous such studies in the literature). Why was this not the case? Although the authors state the sites chosen were those with available LAI data, was it not possible to evaluate this model at any site that had observations of non-structural carbon pools (even if those data came from sites that did not also have LAI data)? The authors only chose three sites representing only one plant functional type. I would think there are a greater number of sites with LAI data that this model could be tested against.
2) While the authors detail improvements in their modified model in comparison with observations (though less so for energy fluxes), it is not clear which of the model modifications made (detailed in Sections 2.1.1 to 2.1.4) are responsible for the improvements in the simulated carbon and energy fluxes. The authors could show the impact of each modification individually, before evaluating all together. I think the modeling community would appreciate knowing how each of the different modifications made to the model contributed to the overall improvement in the model. Such an analysis would help them ascertain if there are potential structural deficiencies in their own model, thus placing this work in a wider context.
3) The analysis lacks depth; namely, there is no rigorous quantitative evaluation of the modified model. It would be useful to include metrics that quantify the improvements simulated by the modified model (simple correlations, for example). In addition, given that the authors state that their primary goal is to address the issue of delayed leaf phenology, their analyses should be focused only on that question; general discussions of model behavior and magnitude of fluxes are distracting, especially given that they have decided not to run a historical or transient simulation of the model after the spinup, with increasing CO2 and climate.
4) The authors state in the discussion around lines 534-535 that the omission of NSC pools in the original model was a structural error. However, they do not provide definitive evidence to support this claim. While their results show that this process can improve model LAI temporal dynamics, they have not conclusively shown that this is the only process that could be responsible for any discrepancies between the model and the observations, and therefore how important it is to add these specific processes. Incorporation of NSC pools and fluxes may not be the only process that can alleviate the problems in the simulated LAI. As they go on to state, biological systems are complex and difficult to represent with physical equations in models. To ensure that we do have the right model behavior, the processes we include must be rigorously tested against data corresponding to that process. Ideally, the authors would test alternative functions available in the literature for the processes they have implemented, in order to estimate the structural uncertainty associated with the new model developments. A Bayesian model selection framework could be used to select the most parsimonious model based on a model selection criterion (such as the Akaike Information Criterion; see Melaas et al., 2013 for an example). I would also be interested to see an analysis of the uncertainty related to the parameters they have implemented. It might then be useful to discuss other NSC-related processes that remain poorly understood and are not captured by their new model.

5) The discussion lacks depth on how the NSC formulation implemented here compares with other studies that have already implemented NSC models, and it lacks a discussion of any caveats to this modeling work related to the points I mention here. See specific comments.
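To illustrate the kind of model selection suggested in point 4, the following is a minimal sketch, not a prescription. It assumes each candidate formulation has been fitted by least squares with independent Gaussian errors, so that the AIC reduces to n ln(RSS/n) + 2k up to a constant shared by all candidates. The function names and the layout of the `candidates` dictionary are hypothetical and are not part of CLASS-CTEM.

```python
import numpy as np

def aic_least_squares(obs, sim, n_params):
    """AIC for a least-squares fit, assuming independent Gaussian errors.

    AIC = n * ln(RSS / n) + 2 * k, up to an additive constant that is
    identical for all candidate models evaluated on the same data.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    valid = ~(np.isnan(obs) | np.isnan(sim))  # skip gaps in either series
    n = int(valid.sum())
    rss = float(np.sum((obs[valid] - sim[valid]) ** 2))
    return n * np.log(rss / n) + 2 * n_params

def rank_models(obs, candidates):
    """Return delta-AIC per candidate (0 marks the preferred model).

    candidates: dict mapping a label to (simulated series, number of
    free parameters). Labels and parameter counts are supplied by the user.
    """
    scores = {name: aic_least_squares(obs, sim, k)
              for name, (sim, k) in candidates.items()}
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}
```

With identical fits, each additional free parameter costs two AIC units, which is how the criterion favours the more parsimonious formulation. A perfect fit (RSS of zero) is not handled by this sketch.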

Specific Comments
Introduction
Lines 109-111: Unless I have misunderstood, this model has been used in a phenology comparison at these sites (Richardson et al., 2012). If I have the right model, it seems to me that the problems in the behavior of CTEM (for simulating LAI) shown in Richardson et al. (2012) differ from those in Anav et al. (2013). Does this suggest that there might be other issues in the phenology models already implemented in CTEM, due to differences between versions or parameterizations, independent of the addition of new processes or modifications to the model?
Model, data and methods
Sections 2.1.1 and 2.1.2: There is a lack of references and/or reasoning for some of the mechanisms implemented in the model and for the various assumptions made in doing so (e.g. the assumption that respiratory losses occur from the non-structural part, line 176, and the references or reasoning behind the formulation in equations 5 and 6). In addition, the reasoning behind fixing certain parameter values needs to be detailed.
Line 186: Maybe refer to Section 2.1.2 for how Tj is calculated?
Section 2.1.3: Are you referring to the fact that CLASS-CTEM has a flat peak of around 2 months in Anav et al., 2013 Fig 11, as opposed to the sharper (∼1 month) peak seen in the observations and other models? In that case, it might be good to just state this in parentheses, as I was distracted by the fact that LAI simulated by CLASS-CTEM does not start to decline until long after the summer solstice (Anav et al., 2013 Fig 11) and much later than the observations. In any case, is there evidence in the literature that allocation fractions are modulated by day length, as represented in Section 2.1.3? No references are given to support the addition of this process. Could this slower rate of decline be due to incorrect parameters or processes related to senescence? Initially I was more distracted by the fact that the seasonal cycle is delayed (out of phase) by a month or so. I am therefore not convinced that this correction factor based on day length is needed on top of the other structural changes in the model.
Section 2.1.4: Similarly to Section 2.1.1 (above), why is a value of 12 °C now used for Tleaf_cold? Is this based on the literature, on experiments, or on a calibration exercise? Please give details and/or references as to how this value was chosen.
Section 2.2.2: Please could you detail where you got the site meteorological data from, and which method and/or software you used to gap-fill the met data? Also, please could you detail why you chose to use a CO2 concentration of 350 ppm (this is detailed around line 418 in the results, but needs to be put here). Finally, please could you detail how the LAI measurements were made at each site? Are there differences between sites? This information would be helpful for readers.

Results
Figures 4-6: It would be good to state in the figure captions that both simulated and observed values represent daily values averaged across all years for which data are present.
Section 3.1: It would be helpful to have some metrics that show improvement (or lack thereof) between model versions for the full timeseries at each site. Even just RMSE or R would be helpful to quantify this and help put the results in context. This could be added to Table 3, for example.
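As an indication of how little is needed for the metrics requested above, here is a minimal sketch that computes RMSE and the Pearson correlation coefficient R between an observed and a simulated timeseries, skipping days on which either value is missing. The function name is hypothetical, and the treatment of gaps (pairwise deletion) is an assumption rather than a recommendation for the authors' data.

```python
import numpy as np

def rmse_and_r(obs, sim):
    """Return (RMSE, Pearson R) over time steps where both series are valid."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    valid = ~(np.isnan(obs) | np.isnan(sim))  # pairwise deletion of gaps
    o, s = obs[valid], sim[valid]
    rmse = float(np.sqrt(np.mean((s - o) ** 2)))
    r = float(np.corrcoef(o, s)[0, 1])  # off-diagonal of the 2x2 matrix
    return rmse, r
```

Applied to the original and modified model at each site, for LAI and GPP for example, this would yield a small table of paired values that quantifies the improvement directly.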
Lines 382-384: It would be helpful if the authors showed a comparison of the observations and the model for each of the different modifications made in this study (as described in Sections 2.1.1-2.1.4), in addition to the overall improvement brought about by all modifications together. That way, other modeling groups can assess which modifications might be necessary for their own model, thus making the study useful in a wider context. These comparisons could be placed in supplementary figures or tables, but it would still be useful to discuss them in the text.
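The set of runs I have in mind could be enumerated as in the sketch below. The switch names are hypothetical labels for the four modifications in Sections 2.1.1-2.1.4 and do not correspond to actual CLASS-CTEM options. The sketch also assumes that each modification can be switched on independently, which may not hold if one depends on another.

```python
from itertools import product

# Hypothetical labels, one per modification described in Sections 2.1.1-2.1.4
MODIFICATIONS = ("sec_2_1_1", "sec_2_1_2", "sec_2_1_3", "sec_2_1_4")

def one_at_a_time(mods=MODIFICATIONS):
    """Original model, each modification alone, then all modifications together."""
    runs = [{m: False for m in mods}]
    runs += [{m: (m == active) for m in mods} for active in mods]
    runs.append({m: True for m in mods})
    return runs

def full_factorial(mods=MODIFICATIONS):
    """Every on/off combination, useful if interactions between changes matter."""
    return [dict(zip(mods, flags))
            for flags in product((False, True), repeat=len(mods))]
```

The one-at-a-time design needs six runs per site and the full factorial sixteen, so either should be affordable for three sites.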
Lines 390-391: This is true, and a good result. However, I also noted that there seems to be an offset between the start of LAI and GPP in the observations at both US-MMS and US-UMB. At US-MMS the onset of LAI now better matches the observations, but there is now a bias towards a too early rise in GPP. Similarly, although the LAI at US-UMB now better matches the observations, and the GPP matches the observations very well, there is still this offset. Why do you think that is? Is there a discrepancy between the two types of observations?
Lines 396-414: Why haven't the authors run a historical simulation after their spinup using increasing CO2 values, so that they can compare to the observed NEE and ecosystem respiration more directly, rather than comparing the (naturally offset/biased) equilibrium state of the model? I appreciate that the lack of a site and disturbance history would result in biases in the model simulations, but this spinup + historical simulation protocol is very common, and I presume it is normally used to run CTEM for model inter-comparisons as well as climate change simulations? The authors state that their primary objective is to evaluate the temporal dynamics. I therefore do not see any issue with running a historical run, as is the commonly used protocol, and then stating more clearly that their only goal is to look at the temporal dynamics. In any case, the decision to only compare the model at its equilibrium state (as detailed in lines 421-425, for example) should be put in the methods, not in the results, so that the reader is fully aware of it before reaching the results.
Section 3.2, Line 440: I am a bit confused, as the stem's NSC pool does not appear to be depleted in Figs 7-9c. It decreases a little, but not by a large amount as a fraction of its size. I would also expect that, given that the addition of NSC pools is the main focus of this study, the model should be evaluated at sites which do have NSC data.
Section 3.3: I find this section somewhat distracting given that, aside from the last sentence, the differences between the original and modified model are not discussed much. In fact, the differences are very small. The authors note this, but do not provide any discussion as to why the change in seasonality of the simulated LAI does not alter energy fluxes more, as one might expect.

Discussion and conclusions
Aside from the conclusions, I find that this section lacks in-depth discussion in places. There is some discussion of future perspectives to further improve the modeling of LAI (lines 513-525), and of the possibility of including other processes such as drought mortality and the N cycle, given the requirement to model N in leaf NSC pools (lines 552-554). However, there could also be more discussion of the results that might place them in a wider context. For example, what are the implications for the wider modeling community? How do your results compare with the ways NSC-related processes have been implemented in other NSC modeling studies (see review and references in Dietze et al., 2014)? A discussion of any caveats to the work would also be useful. These might include some of the points I raised in my general comments, or a discussion of the uncertainty in the NSC processes implemented and/or those that remain poorly understood (as the authors stated in the introduction).