Comment on bg-2020-433

The submitted article by Wei Zhang et al. presents a new module to better describe the thermal dynamics within the hydro-biogeochemical model CNMM-DNDC. The authors set up a catchment-scale modelling approach and test it against observed soil temperature, water-filled pore space (WFPS), and CH4 and N2O emissions from three alpine ecosystems of the Tibetan Plateau. The authors conclude that their proposed module improves the reliability of all investigated measured criteria. The manuscript is well written and the results demonstrate the benefit of the modified modelling approach.

General Comments
Starting very generally, I doubt that a default model setup can be used to determine whether a model is capable of reproducing a measured criterion or not. As such, the presented simulations of the "original" as well as the "modified" model are, to me, just random, given the vast number of internal model parameters. In my opinion, both models would need to be calibrated to the local data set first, before judging whether the model is capable of reproducing the observed criteria. There may be a parameter combination in the original model that performs much better than the modified version. As such, I would reject this manuscript given this general point. However, I leave this point to the Editor, as I know that it is still common to investigate uncalibrated models in the biogeochemical sciences. In the hydrological community, it is not. This paper covers both disciplines.
Another general point, which makes judging this manuscript difficult for me, is that neither the model, nor the observation data, nor the model setups are accessible to me. Given the open access policy of BG, I was quite surprised to see that. Under these circumstances, and given the not very detailed Materials and Methods section, which cites a lot but gives very few details, I had to guess a lot. I made these guesses in favor of the authors, assuming scientific correctness, without being able as a reviewer to check, e.g., whether the given equations are properly implemented in the model code or whether the model was correctly set up for the characteristics of the study catchment. I feel that a lot of improvement is necessary to give readers the possibility to understand and reproduce the results of this study. This point is not easy to address given the current stage of the manuscript. I also leave this second general point to the Editor to decide whether it is a reason to reject the manuscript.
My further comments can be seen as a major revision, which will hopefully help the authors to improve their manuscript, either for future publication in this journal or, after potential rejection, in another journal.
Table 1: Please specify "n" in the table caption. Please add a column for data resolution. Regarding the CH4 and N2O fluxes, the table says they are daily, while the text (Line 167) says they are weekly. Further, the statistical indices (IA, NSI, Slope, R² and P) are not in line with chapter 2.4 (IA, NSE, R², ZIR and MRB). Please also give the full names of the statistical indices in the table caption. Use NSE (Nash-Sutcliffe efficiency) instead of NSI throughout the manuscript. How is it possible that there are no values for R² and P, if you could calculate IA, NSI and Slope? Please give the equations for ZIR and MRB in chapter 2.4, as they are not so commonly used.
Figure 1: In general, a good simulation of the observed soil temperature is nothing special for state-of-the-art environmental models. The "original" model just seems to be wrong, so it is not a big challenge to improve on it. It is, however, still a necessary task, which shows the importance of this work. What I am missing in the methods chapter is a clear description of the differences between the "modified" and the "original" model. Maybe a figure would help, showing the setup of both models and highlighting the differences.

Specific comments
Line 221-224: This statement is far too bold. Firstly, you tested the model in one region only. Secondly, the model's best achieved NSI value is 0.32. That is very far from a "reliable" prediction (please add some citations for comparison in the Discussion chapter; there are tons, e.g. Ford et al., 2014). Thirdly, you tested the model on topsoil WFPS only. How can you generalize from there to "reliable water movement" in general? Please delete this passage and be more accurate throughout the manuscript.
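To put a value of 0.32 in context, the two indices can be computed as in the minimal sketch below. It assumes the standard Nash-Sutcliffe definition and, for IA, Willmott's index of agreement; whether the manuscript uses exactly these formulas is not stated, and the function names and sample values are hypothetical illustrations, not data from the study.

```python
# Minimal sketch of two goodness-of-fit indices (assumed standard definitions).
# Function names and the sample values below are hypothetical.

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def index_of_agreement(obs, sim):
    """Willmott's index of agreement (assumed to be what 'IA' denotes)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    potential = sum(
        (abs(s - mean_obs) + abs(o - mean_obs)) ** 2 for o, s in zip(obs, sim)
    )
    return 1.0 - sse / potential


if __name__ == "__main__":
    # Illustrative WFPS-like fractions, not measurements from the manuscript.
    observed = [0.2, 0.4, 0.6, 0.8]
    simulated = [0.3, 0.35, 0.7, 0.6]
    print("NSE:", nse(observed, simulated))
    print("IA:", index_of_agreement(observed, simulated))
```

Under these definitions, a perfect simulation gives NSE = 1, while a simulation that simply returns the observed mean at every time step gives NSE = 0. An NSE of 0.32 is therefore only a modest gain over using the mean of the observations as the predictor.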
Line 235-237: I do not understand how the results show that the model "simulated the CH4 fluxes [..] at the catchment scale". The setup might be fully distributed (which is a guess on my part, as it is not very clearly stated in the Materials and Methods section). However, you test the model on local-scale measurements. So, you can only state that the model is able to reproduce the local measurements with the given statistical accuracy. Please be consistent with this comment throughout the manuscript. Further, I do not understand why a fully distributed model setup is needed here to test against the local measurements, as you do not have any spatial measurements. If it is needed, it requires some justification within the manuscript and some results at the spatial scale, e.g. a map of the N2O and CH4 emissions.
Figure 3: From the differences in WFPS and soil temperature, I do not see any reason why the original model would produce such a high N2O peak while the modified model does not. Assuming that the described soil temperature routine was the only thing changed (which is not 100% clear in chapter 2.1.2), these differences do not explain such a vast difference in either the denitrification or the nitrification process. This needs a more detailed discussion in chapter 4.3, including a presentation of the model-internal nitrification and denitrification processes of the two model setups that explains the difference.
Line 344-347: Again a very bold statement: speaking of "hydrology" when testing a model with WFPS measured in the upper 6 cm of the soil (what about evaporation, overland flow, infiltration, groundwater recharge, ...); speaking of "nitrogen cycling" when testing against N2O emission, which is only 1% of the total N emissions (what about N2, NH3, NO, the nitrogen stored in the soil, ...); speaking of "carbon cycling" while not looking at CO2 emissions or the changing carbon storage. Please rephrase and stick to the investigated processes throughout the manuscript.
Section 4.4: An interesting section; however, at the moment it is out of the scope of the manuscript (which so far implies the testing of a changed module in a model). If this section is supposed to be included in the manuscript, please extend the title, abstract and methods section accordingly, and add a figure or table to visualize this discussion. I would recommend deleting it.
Please add a comparison of the model results with other studies investigating wetlands, meadows and forests, as well as a comparison with other models that implemented and tested thermal dynamics. And what about other studies investigating freeze-thaw cycles with hydro-biogeochemical models, e.g. de Bruijn et al. (2009)? I am surprised not to see much of the relevant literature within the discussion; please add it.

Technical corrections
Line 22: Change "as" to "is".
Line 34: I don't understand how the model can be used to "evaluate the sustainability". Please rephrase.
Line 44: The expression "during long periods" is vague; please be more precise and give some numbers.
Line 125: Needs a reference to the concept.
Line 134-136: Please simplify the structure of the sentence.
Line 134-140: It remains unclear to me whether these changes were already made in the cited publications, or whether this is the new part.
Line 149: What are these numbers behind organic matter, mineral, water, ice and air? I have to assume they are dynamic for each time step.
Equations 1-6: Why not add an example with given organic matter, mineral, water, ice and air values?
Line 153: Why 35 m? Is that a constant? Or the depth to which the model can be applied?
Line 175: Details about the instruments used for the soil temperature and WFPS measurements are relevant within this manuscript, please add.
Line 192: Why was the model run at a 3-hour resolution if the meteorological input data are available hourly?
Line 250-252: Again, too broad a generalization from one model run and one study area. Please restrict the wording to the investigated processes, e.g. "These results indicate that the modified CNMM-DNDC has the potential to estimate N2O emissions in a seasonally frozen region."
Line 262-272: Please move this to the discussion chapter. Also, I see what the authors want to express here; however, investigating the climate impact of different land uses is not stated as being within the scope of the manuscript.
Line 311: How was the influence of the clay fraction on the CH4 uptake investigated? An interesting point, but this statement comes out of the blue, as it was not part of the Methods and Results sections.
Line 328: How did the authors achieve and control an inundation in the model?
Line 335-336: I do not understand this sentence. Are the processes of "disruption of soil aggregates" as well as the "structure, population and activity of the microbes" really included in CNMM-DNDC? The Materials and Methods section is missing a description of the relevant included processes. I assume these processes are not included, so please rephrase.
Line 344: What is the "detection limit" of the N2O measurement technique used?
Line 387: Change to "implies that a hydro-biogeochemical model".