Comment on bg-2021-352

I think the study is interesting and innovative, and the topic is highly relevant to Biogeosciences. This is challenging simulation work with large uncertainties in Kenya and Ethiopia, but the authors do a good job of comparing their results with meta-analyses from the published literature and give an in-depth discussion of the model limitations for each management practice. Overall, the manuscript is well structured and well written, with a reasonable simulation design and evaluation results. My comments are minor and mostly concern passages that are a bit difficult to follow, especially the description of the model protocol runs.

Specific comments
L23: What is 'standard management' here? It refers to Fstd in Table 3, but this is unclear to readers until they have read the whole text; please consider revising it.
L90: How does LPJ-GUESS regionally represent crop sowing and harvest? More details on the computation of crop phenology would be appreciated.
L122: Are N-fixing grasses represented as intercrops in the model, or not?
L127: The residue removal fraction is set to 90% here, but the proportion is 50% for the site-scale evaluation in Sect. 2.3.1 (L179). Please clarify the difference in setup between the regional and site simulations.
L176: Modelling the second maize growing period as N-fixing grass does not make sense to me because of the extra N added to the soil through BNF. If this is the case, what is the N fixation rate of the grass? Why not use a non-N-fixing cover crop as a replacement?
L178: 'assuming an N content of 1.75% in the farmyard manure'. Please provide a reference.
L188: 'assumed that INM3 was under grassland systems for the period 1901-2002'. How much above-ground biomass is removed from the ecosystem here? The authors state in Sect. 2.1 (L87) that 50% of the above-ground (AG) biomass of C3 grass is harvested every year in the model, but no biomass should be removed here, since the grassland is natural vegetation without any management.
L246: 'All simulated outputs in the last ten years of the model experiments were taken for analysis'. Do you mean that only the outputs from 2091-2100 were taken for analysis? Perhaps this period should not be labelled '2091-2100': because you are using constant climate and CO2, it does not correspond to real calendar years as in the RCP scenarios.
L255-259: The authors try to quantify the potential shift in Fopt caused by future climate change by comparing C2 and C3, but B2 is much more representative of the present-day climate. Why not compare B2 with C3? Is there any difference in the spatial pattern of Fopt between B2 and C2?
Figs. 2-3: Please also add the simulated SOC results between 1901 and 2002 at the two sites.
Table 4: Please consider changing the SOC unit from 't C ha-1' to 'Mg C ha-1'; the latter is more common in soil carbon studies.
Table 3 the fraction of residue retention is 100%.

Figs.1-3 and
L381: Looking at crop production in Fig. 4, sorghum and pulse yields are largely overestimated, indicating that the updated growth parameters described in Sect. 2.3.2 do not work well. I would expect some explanation of these deviations.
Table 5: The modelled total SOC stocks, N losses and crop production in both countries seem to agree fairly well with previous studies. The question is whether the simulated crop-specific areas are also comparable with other statistics. For instance, the cropland area in 2014 is 6,222,100 ha for Kenya and 17,433,400 ha for Ethiopia (L398). How do we know these numbers are reliable?
L441: If the 50% residue retention in the model setup is not fully equivalent to the observed input of 2 t ha-1, how much residue, in absolute terms, was left in the field in the simulations?
L447: Land-use history prior to the experiments is most likely the reason for the declining SOC at the two Kenyan sites. Looking at Fig. 6, the soil C pools take almost 20-30 years to reach a new equilibrium after a shift in management. The 10+ years of cultivation at the evaluated sites is thus not long enough for the C-N pools to stabilise.
L452: 'because of the prevailing warm and moist climate', but to my mind most of Kenya has a semi-arid climate, no?
L533: 'crop+cover-crop rotation'. Do you mean 'main crop (long rainy season) + cover crop (short rainy season)' or 'main crop 1 (long rainy season) + main crop 2 (short rainy season) + cover crop (dry season)'?
L538-547: Good discussion of SOC storage from a temporal perspective, in addition to the comparison of different management practices. It is interesting that only the combined Fconserv run shows net CO2 uptake in the future, while the other strategies show stable or declining SOC (Fig. S7). Would it be possible to simulate Integrated Soil Fertility Management (ISFM) here? It is the best-known soil-conserving technique in SSA (Sommer et al., 2018).
L567-570: What if you combine the hydrological N losses and gaseous N emissions in Xia et al. (2018)? Would the simulated total N losses then be consistent with the meta-analysis?
L610-615: 'the fixed N amount under legume-based cropping systems in SSA can be as low as 0-50 kg N ha-1 yr-1'. It is unclear to me whether the N fixation rate of 0-50 kg N ha-1 yr-1 refers to legume crops or to N-fixing grass. What is the BNF rate of the cover crop in the FCC-BNF simulation? I ask because the N benefit to soil fertility from green manure is directly correlated with N fixation capacity.