Reply on RC2

Wilson and Gerber present a thoughtful analysis of challenges related to scaling microbial dynamics to ecosystem scales. Their work takes a deep dive into the mathematics of these transitions across scales. The work is well reasoned, well supported, and well written. My chief suggestion with this paper is to encourage the authors to take a step back from the mathematical rigor of their analysis and connect their ideas more broadly with theories and measurements related to SOM turnover, persistence, and vulnerability. My major suggestions are aimed at making these connections.

The mathematical focus on variability (especially related to substrates and microbes) that this paper explores seems connected to the more theoretical ideas in Schaffer 2012 and Lehman et al 2020. I wonder if revisions can reach a broader audience by connecting the quantitative depth of this paper with these broader concepts?
CHW: We will expand the discussion to tie our approach with the concepts raised in these two papers, which we agree are important references in this field. Our analysis does not address molecular diversity as in Lehman et al. (2020), whereas we address spatial heterogeneity through the correlation between microbes and substrate as well as their respective variabilities. Interestingly, the empirical soil moisture function may be considered as a mean field characterization of access as discussed in response to referee 1, and further elaborated below.
Water seems like the big unknown here. If the colocation of microbes and substrates is largely dependent on liquid water availability (as well as SOM-mineral interaction) then high heterogeneity of soil moisture within sites seriously complicates the feasibility of actually capturing the local scale heterogeneity for which the authors seem to be advocating. This is not a deal-breaker for publication, but it seems like a topic that could be discussed? More details in the minor comments below.
CHW: We agree that soil moisture is a critical component of environmental variability, and likely to induce scale transition effects. As discussed above, water influences access, but also oxygen limitation. In the framework of Tang and Riley (2019), the formulation of an appropriate soil moisture function can be thought of as defining an "effective substrate affinity" term, which in itself accounts for some of the microscale heterogeneity. But we absolutely agree, as noted above, that over larger spatial scales, soil moisture variation should be studied within the analytical framework of the scale transition. For this paper, we studied the impact of temperature, another major driver, whose kinetics are generally well-behaved, i.e. "Q10 scaling". By contrast, many soil moisture functions (e.g. Yan et al. 2018) are piecewise, rather than smoothly continuous, rendering their analysis substantially more complicated. In the face of these complications, we can see a few strategies for future work that we would be happy to include in our discussion. First, the soil moisture function could be replaced by a simplified, perhaps polynomial or Gaussian approximation, whose analytical properties would be amenable to our treatment. Alternately, given at least a continuously differentiable function and a probability kernel, a full convolution integral could be set up and either solved if possible or studied numerically.
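As an illustration of the numerical route described in this response, the following minimal Python sketch averages a smooth response function over a normal kernel and compares the result with the mean-field value. The Gaussian-shaped moisture function, its parameters (theta_opt, width), the normal kernel, and all function names are hypothetical choices made for illustration; they are not taken from the manuscript or from Yan et al. (2018).

```python
import numpy as np


def moisture_response(theta, theta_opt=0.35, width=0.12):
    """Hypothetical Gaussian-shaped stand-in for a piecewise moisture function."""
    return np.exp(-0.5 * ((theta - theta_opt) / width) ** 2)


def q10_response(temp, q10=2.0, t_ref=15.0):
    """Q10 temperature scaling relative to a reference temperature (assumed values)."""
    return q10 ** ((temp - t_ref) / 10.0)


def mean_response(f, mean, sd, n_nodes=40):
    """Approximate E[f(X)] for X ~ Normal(mean, sd) by Gauss-Hermite quadrature."""
    # Nodes and weights for the weight function exp(-x**2 / 2)
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    # Dividing by sqrt(2*pi) normalizes the weights to a standard normal density
    return np.sum(weights * f(mean + sd * nodes)) / np.sqrt(2.0 * np.pi)


def scale_correction(f, mean, sd):
    """Ratio of the averaged response to the mean-field response f(mean)."""
    return mean_response(f, mean, sd) / f(mean)


if __name__ == "__main__":
    for sd in (0.02, 0.05, 0.10):
        c = scale_correction(moisture_response, 0.35, sd)
        print(f"moisture sd={sd:.2f}: correction={c:.3f}")
    c = scale_correction(q10_response, 15.0, 3.0)
    print(f"temperature sd=3.0: correction={c:.3f}")
```

Under these assumptions, the moisture correction falls below 1 when the mean sits at the optimum (the response is concave there; in closed form it equals width / sqrt(width^2 + sd^2)), whereas the convex Q10 function yields a correction above 1. A normal kernel ignores the physical bounds on soil moisture, so a bounded distribution may be the better choice in practice.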
Line 230, this statement seems inconsistent with the results in Fig. 1c, which converge at 0.9. This seems relevant, especially if one take-home message from the text as presented is that 'a first order model is good enough'. This may be true, but what are the implications of having a scale correction factor that does not equal 1, even at large lambdas? Are the conditions required for this to occur plausible in natural systems?
CHW: This is a sharp observation. For large enough lambda, however, the correction factor does converge to 1, as indeed it must according to equation 15. The apparent convergence to 0.9 is thus an artefact of the axis range. We nevertheless believe that this phenomenon is useful for highlighting how *slow* the convergence is for large variabilities. We will revise our legend and text to make this clearer! As to the realism, we believe that a useful next project would involve a meta-analytic collaboration to compile empirical data on the relevant correlation and variability terms where they can be gleaned from published research.

Minor and Technical Comments
Throughout, I'd encourage the authors to be precise and consistent in their terminology for things like "scale transition correction", 'scale correction factor', 'mean field correction', etc. This will avoid confusion and help clarify the ideas in the text.

CHW: We will do that in the revision; good suggestion.
Line 54, I might add Wang et al. 2016 to this list.
Line 78, can the authors just write out process-based models throughout? There are enough acronyms in the text; removing this non-standard one that's sparingly used will aid readability.
Line 80, Although not related to trace gas fluxes, Bradford et al 2017, 2021 raise similar concerns related to litter decomposition rates. These references are also good ones related to the major concern about soil moisture, above.
CHW: We will include the references mentioned here at the appropriate places, and follow the editorial suggestions.
Line 210. This does seem like valuable insight. Notably, the variation in soil moisture (which I'd argue impacts substrate availability) is very high (Loescher et al. 2014). I wonder what implications this has for looking at flux estimates within and among sites?
CHW: As noted in responses above and further below, we agree that soil moisture is a really important component of both spatial and temporal variability, and look forward to strengthening our discussion accordingly.
Should the y-axes on fig 1 be held constant so that it's more obvious that, as the CV(MB) increases, the magnitude of the scale factor changes? Also, should Fig 1 & 2 have the same y-axis label? They're both showing a scale correction factor.

CHW: We will explore improvements to the figure labeling and legends for both figures.
Line 235 maybe replace 'virtus' with something like 'benefit'.

CHW: Will do.
Line 255, microbial biomass is commonly used in models, but really it's the 'active' microbial biomass that matters here, which seems to be much more variable (and harder to quantify).

CHW: We will clarify the issue regarding live biomass to that effect.
Section 3.2. This is a nice example for temperature, but given the much higher variance in moisture (again see Loescher et al 2014), it seems like the real environmental variance we need to care about within sites is moisture. Maybe this doesn't need a mathematical proof, but it does merit a brief discussion.
CHW: See our response regarding soil moisture to earlier comments. Briefly, we agree that discussion of soil moisture will strengthen our paper, and look forward to doing so.
Line 321, this sounds ideal, but I wonder where this is ever going to occur for all the variables of relevance, again see Loescher et al (2014)?!
CHW: We will amend the sentence with the reference, and acknowledge that this is a tall order.
Extending a bit on the comment above I have additional thoughts, listed below. These are NOT intended to be stinging critiques of the work presented, but I would encourage the authors to discuss some of the more practical challenges that would be involved with putting the research plan they outline into place:
SOC does not equal available C and microbial biomass does not equal active biomass.
CHW: Yes that is true, and we can bring this in along with the discussion about access, and its relationship to soil moisture.
How do these measurements integrate with depth?

CHW: In the context of looking at respiration flux and NEE, spatial heterogeneity also includes depth heterogeneity. Perhaps a first pass would be to integrate each soil horizon (or depth interval within a decomposition model) separately, where each layer will have its own terms. All in all, this strengthens our argument that a flux-tower-derived parameter is almost surely not a true physiological parameter, but carries the issue of scale transition.
How do we bridge the jump between the variability in soil and trying to infer heterotrophic respiration fluxes from NEE measured at the flux tower?
CHW: Ultimately, there are many considerations here. Our preference would be to develop a large joint generative statistical model that assimilates all the available data, estimates parameters, and can then be used for forecasting and inference via the posterior predictive distribution (e.g. Caughlin et al. 2021, Dietze 2017). Based on our analysis in this paper, we believe that an important component of this model development will be to include relevant scale transition terms as needed in both the measurement and process specifications of this model. For instance, rather than fitting a mean-field model for the flux (F) as in equation 3, we might include, say, the spatial colocation terms and the corrected model in equation 15 (were the right data available).
Even if all this could be measured with enough fidelity at a single site, how do we extend such insights to the resolution of a grid cell in an ESM (nominally 1 degree or 100x100 km)?
CHW: We agree that this is a tall order, but having an awareness of scale transition, and perhaps an understanding of where scale transitions are likely to be especially large, is critically important. Ultimately, our hope is that this work is the start of a conversation about systematic scale transition effects, and that reconciliation between theory and data via rigorous statistical models will help untangle critical aspects to constrain and model when going from plot to field to site to ESM grid cell.
The logic and math presented here is fascinating, and it does help prioritize the measurements that need to be taken, but I do wonder if it's realistic to make the measurements given existing technology & infrastructure?
CHW: We agree that in some cases the measurements may be logistically infeasible. The first step, in our view, is to work within existing footprints of eddy covariance towers, and fit microbial models with and without the scale transition terms, which in turn would be informed by systematic sampling of the relevant covariates. Measuring spatial variations in SOC, extractable microbial biomass (or alternatively, SIR estimates of active biomass), soil moisture and texture, and so on should be tractable at this scale. But, overall, we agree with Will Wieder that this is a challenging task that will require collaboration between theorists, statistical modelers, and empiricists!
Line 335. I'd push back a bit on this statement, because if the aim of these models is to faithfully capture soil flux measurements, then I'd assume there's not much benefit in using anything but a first order model. If, however, the aim of these models is to more broadly explore our theoretical understanding of microbial and soil controls over SOM persistence and vulnerabilities, then microbial explicit models may be useful (see Wieder et al. 2015).
CHW: We agree with the notion that it must be a goal to work with conceptually sound models. But if parameterization is a goal, it follows that it will be difficult for more complex and perhaps physiologically more realistic models to obtain proper parameterization. Conceptually sound models are still useful to define the range and limits of simplified, first order models. Moreover, even first order models may have critical scale transitions in their response to environmental drivers such as temperature and soil moisture!
I don't love presenting a new figure in the conclusion of a paper, but this is more of a stylistic comment than a serious critique of the work.
CHW: We think the nature of this figure is conceptual and illustrates an issue following the discussion, and therefore deserves the exception.