The effects of land use on soil carbon stocks in the UK

Levy, Peter; Bentley, Laura; Danks, Peter; Emmett, Bridget; Garbutt, Angus; Heming, Stephen; Henrys, Peter; Keith, Aidan; Lebron, Inma; McNamara, Niall; Pywell, Richard; Redhead, John; Robinson, David; Wickenden, Alexander

doi:10.5194/bg-21-4301-2024

Articles | Volume 21, issue 19

https://doi.org/10.5194/bg-21-4301-2024

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | Highlight paper | 02 Oct 2024

Final revised paper (published on 02 Oct 2024)
Preprint (discussion started on 14 Aug 2023)

Interactive discussion

Status: closed

CC1:
'Comment on egusphere-2023-1681', Marguerite Mauritz, 24 Sep 2023

This manuscript examines the effect of land use change on soil carbon stocks across the UK. Estimates in this analysis represent depth-related changes in Sc using a logarithmic depth function, with Bayesian estimation to account for within-site variation and for non-independence of samples from the same core. The authors include a larger dataset than previously used by incorporating samples from multiple surveys. Assumptions inherent in space-for-time substitutions are examined via effects of land-use change at different levels of mean Sc to account for pre-existing soil type differences. The authors show that land use effects vary by analytical approach, that the effects of land use change are difficult to differentiate from mean soil carbon stocks, and that background soil C content mattered particularly in the woodland class.
I think this type of analysis is important and examination of how to accurately quantify soil C stock changes is essential for implementing climate mitigation strategies. Improvements to the structure and conciseness of the manuscript would help to highlight the unique aspects of this work. Below, I have some suggestions.

Narrative on space-for-time: I think that the space-for-time assumptions affecting land use change impacts can come earlier in the abstract and introduction. After the second sentence of the abstract, an immediate question that came to my mind was how time affects land use change impacts on soil C. I wondered this again by line 27 of the introduction. Since this paper attempts to deal explicitly with space-for-time assumptions, I suggest highlighting that aspect early on.

Intro: condense 3 sections into one to make the writing more concise. Sections 1.2 and 1.3 repeat the prose of section 1.

Methods: start with brief description of each dataset used in the manuscript.
Methods are a bit difficult to follow. Perhaps organize according to the three main approaches to improve estimates outlined in lines 56-59?
Model – how is model fit determined? Is the depth distribution of Sc modeled as part of the Bayesian framework? If yes, is there a need to control feedbacks in the model so that information only flows one way (e.g., so that estimates of main LU effects do not feed back to the depth distribution)? See Ogle et al. 2013 and Ogle and Pendall 2015.
The results of this paper show smaller soil C stock estimates compared to Bradley. Would it be possible to run the analysis only on the Bradley data to get some sense for the extent to which the change is due to modeling choices and the extent to which it’s due to a larger and more extensive dataset?
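
To make the model questions above concrete, here is a minimal sketch of fitting one intercept and one slope per land use on the log scale. It assumes a log-log form, log(Sc) = a + b·log(depth), which is only one reading of the "logarithmic depth function"; it uses ordinary least squares in place of the manuscript's Bayesian estimation; and the land-use parameters and simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" (intercept, slope) per land use on the log scale.
true_params = {
    "cropland": (1.0, 0.8),
    "grassland": (1.3, 0.7),
    "woodland": (1.5, 0.6),
}

def simulate(intercept, slope, n=200, noise_sd=0.1):
    """Simulate stock to depth d, assuming log(S) = a + b*log(d) + noise."""
    depth = rng.uniform(0.05, 1.0, size=n)  # sampling depth in metres
    log_stock = intercept + slope * np.log(depth) + rng.normal(0.0, noise_sd, size=n)
    return depth, np.exp(log_stock)

def fit_log_log(depth, stock):
    """Least-squares fit of log(stock) on log(depth); returns (intercept, slope)."""
    slope, intercept = np.polyfit(np.log(depth), np.log(stock), deg=1)
    return intercept, slope

# Fit each land use separately: the intercept and slope are the land-use effects.
fits = {}
for land_use, (a, b) in true_params.items():
    depth, stock = simulate(a, b)
    fits[land_use] = fit_log_log(depth, stock)
    print(land_use, fits[land_use])
```

Because each land use has only an intercept and a slope, there is no separate depth sub-model for the land-use estimates to feed back into; a hierarchical version would add site-level and core-level terms around these same two parameters.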

Results:
Is the model fit reported anywhere? I may have missed this.
Develop the narrative by giving context first and then describing results. Eg, line 248: ‘Figure 1 shows some of the soil survey data…’ what is some? What’s the question this figure addresses or what patterns should the reader pay attention to?
Lead with a statement that directs the reader to the figure (Figure X). Eg, line 276 instead of ‘these are shown in Figure 4’, end the previous sentence with (Figure 4).
Given figure 3 with all the data by depth, is figure 2 for specific cores, necessary? What does figure 2 add that cannot be seen in figure 3?
In the figures showing prediction lines, are these lines drawn based on the Bayesian model parameters?

Discussion: discuss the importance of showing confidence intervals and prediction intervals? Depending whether confidence or prediction intervals are interpreted, the conclusions would be substantially different, particularly for results shown in figures 4 and 5.
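
The distinction can be illustrated with a small sketch on simulated data. It uses a simple linear regression and a normal approximation (z = 1.96) rather than the manuscript's Bayesian model, so the numbers are purely illustrative and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: a straight line plus noise (hypothetical values).
n = 100
x = rng.uniform(0.0, 1.0, size=n)
y = 2.0 + 1.5 * x + rng.normal(0.0, 0.5, size=n)

# Ordinary least-squares fit and residual standard deviation.
slope, intercept = np.polyfit(x, y, deg=1)
resid = y - (intercept + slope * x)
s = np.sqrt(np.sum(resid**2) / (n - 2))
sxx = np.sum((x - x.mean()) ** 2)

def intervals(x0, z=1.96):
    """Approximate 95 % confidence and prediction intervals at x0."""
    fit = intercept + slope * x0
    leverage = 1.0 / n + (x0 - x.mean()) ** 2 / sxx
    se_mean = s * np.sqrt(leverage)        # uncertainty in the mean response
    se_pred = s * np.sqrt(1.0 + leverage)  # adds scatter of a single new observation
    ci = (fit - z * se_mean, fit + z * se_mean)
    pi = (fit - z * se_pred, fit + z * se_pred)
    return ci, pi

ci, pi = intervals(0.5)
print("confidence interval width:", ci[1] - ci[0])
print("prediction interval width:", pi[1] - pi[0])
```

The confidence interval narrows as the sample size grows, whereas the prediction interval stays bounded below by the residual scatter. This is why a mean land-use effect can be well constrained while predictions for an individual site remain broad.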

Data availability: point to specific data sources?

Figure 1: explain abbreviations in caption. Label axes as latitude and longitude? The left panel has two color scales that are both spatial and I am unsure how to read the graph.

Figure 2: Do the points in this figure represent an average or individual points? If average, is there an error or uncertainty associated? If individual samples, why were these chosen? What are the titles of each facet (226, 7, 9, ED_2004_18)? include log in the y-axis name? Consider a flipped-axis in which the x-axis is shown vertically and the y-axis is shown horizontally? I think it’s a more intuitive way to look at a variable like depth.

Figure 3: include log in the y-axis name? Consider a flipped-axis in which the x-axis is shown vertically and the y-axis is shown horizontally?

Figure 5: ‘For the latter’ includes which studies? Only Guo and Gifford and Poeplau, or also ELUM and RAC? To clarify, perhaps add an NA or ND to the figure itself where no data are available.

Figure 3-6: keep LU categories in the same order in all graphs.

Minor
Line 10: What does ‘This’ refer to? It reads as if ‘this’ is the variability in mean effects, but I think it’s meant to refer to the results on how land use ranks in soil C storage?

Line 30: Why say >400 instead of listing an exact number?

Line 30: does this need a citation? The following paragraph implies that the samples from >400 soil series come from a particular dataset if other data has been collected since 2005.

Line 33: ‘These data’ – I got a bit lost as to which data are being referred to. The surface soil data collected since 2005? At this point in reading I am not sure if this paragraph is leading the reader to the fact that the current study is going to use data since 2005 to update estimates or that data since 2005 can’t be used?

Line 34: ‘there are some issues’ is vague. Perhaps reword the sentence to be more specific? E.g.: ‘Assumptions inherent in previous estimates (Bradley et al. 2005) may have important limitations for interpretation of soil C stock changes.’

Line 37: ‘some of the details…with respect to land use are unclear’ – is it possible to be specific about what exactly is unclear?

Line 38: is there a citation for choices/assumptions giving different results? Or are the authors stating that it’s important to be more clear in the assumptions? Or even to run multiple models that vary assumptions in a type of sensitivity analysis?

Line 39: keep with previous paragraph as this is about limitations in prior studies? Does ‘the general approach’ refer to what Bradley did? Or more broadly, to soil stock/LUC analyses?

Line 53: Is the Countryside Survey data the data referred to in line 30? Perhaps briefly describe the Countryside dataset? Many readers won’t be familiar with it.

Line 57: point i), the way that lines 31-34 are written, it seemed as if the more recent data are difficult to incorporate for analysis. Perhaps re-phrase line 31-34 to better lead the reader toward the potential utility of more recent survey data.

Line 65: Define S_c

Line 66: Why does the use of pedotransfer functions make the data not comparable? Is the problem that each survey uses a different bulk density estimate approach?

Line 68: Is estimating Sc as a function of depth an alternative to correcting with bulk density?
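
For context on the bulk density questions, the conventional layer-by-layer stock calculation multiplies carbon fraction, bulk density, and layer thickness. The sketch below uses hypothetical layer values and a hypothetical helper name; it shows only the arithmetic, not how any particular survey measured or modelled bulk density.

```python
def layer_stock(carbon_fraction, bulk_density, thickness):
    """Carbon stock of one layer in kg C m^-2.

    carbon_fraction: kg C per kg dry soil
    bulk_density:    kg dry soil per m^3
    thickness:       layer thickness in m
    """
    return carbon_fraction * bulk_density * thickness

# Hypothetical profile: (carbon fraction, bulk density, thickness) per layer.
layers = [
    (0.04, 1100.0, 0.15),
    (0.02, 1300.0, 0.15),
    (0.01, 1400.0, 0.30),
]

total = sum(layer_stock(*layer) for layer in layers)
print("profile stock (kg C m^-2):", total)

# A 10 % error in a modelled bulk density carries straight through to the stock.
biased = sum(layer_stock(f, 1.1 * rho, t) for f, rho, t in layers)
print("relative bias from a 10 % bulk density error:", biased / total - 1.0)
```

Since the stock is a product, any error in a pedotransfer estimate of bulk density propagates proportionally, which is one reason stocks built on different bulk density methods are hard to compare.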

Line 78-96: For a while I thought this was already the methods section. I think the narrative here can be condensed with brief and general mention of benefits from Bayesian modeling.

Line 97-119: condense this section and incorporate into section 1 of the introduction as 1.2 and 1.3 repeat ideas at the end of 1, but in more detail.

Line 154: Is it possible to refer the reader to a specific section below for more discussion of statistical issues? It could otherwise be hard to find.

Line 178: Do I understand correctly that survey effects are not shown in equation 8, but are included in the model?

Line 243: Are data from the meta-analyses different from the other datasets? If meta-analyses are used only for comparison, this section could be removed and instead used in the discussion.

Line 251: the details on data availability may be better to add in the methods? I also got confused which data are limited – presumably not the data used in this analysis? If the availability of data has an impact on the analysis, mention that. If it does not then is it important to mention at all?

Line 260: instead of ‘relatively infrequent’ is it possible to give a percentage of complicated soil profiles?

Citation: https://doi.org/10.5194/egusphere-2023-1681-CC1
- CC2: 'Reply on CC1', Marguerite Mauritz, 24 Sep 2023
  
  References
  Ogle, K., Barber, J., and Sartor, K. (2013), Feedback and Modularization in a Bayesian Meta–analysis of Tree Traits Affecting Forest Dynamics. Bayesian Anal., 8(1), 133–168. https://doi.org/10.1214/13-BA806
  Ogle, K. and Pendall, E. (2015), Isotope partitioning of soil respiration: A Bayesian solution to accommodate multiple sources of variability. J. Geophys. Res. Biogeosci., 120, 221–236. doi: 10.1002/2014JG002794.
  
  Citation: https://doi.org/10.5194/egusphere-2023-1681-CC2
RC1:
'Comment on egusphere-2023-1681', Marguerite Mauritz, 26 Sep 2023

Due to a technological misunderstanding I accidentally posted my comment as a community comment. The text of this referee comment is identical to CC1 above, and its two references (Ogle et al. 2013; Ogle and Pendall 2015) are listed in CC2.

Citation: https://doi.org/10.5194/egusphere-2023-1681-RC1
- AC1: 'Reply on RC1', Peter E. Levy, 16 Nov 2023
  
  We thank the referee for the time taken.
  Their comments are shown in italics; our response is beneath in normal font.
  Narrative on space-for-time: I think that the space-for-time assumptions affecting land use change impacts can come earlier in the abstract and introduction. After the second sentence of the abstract, an immediate question that came to my mind was how time affects land use change impacts on soil C. I wondered this again by line 27 of the introduction. Since this paper attempts to deal explicitly with space-for-time assumptions, I suggest highlighting that aspect early on.
  
  - we introduce the space-for-time assumption on line 13 in the first paragraph of the Introduction. Personally I don't see how it can come any earlier.
  
  Intro: condense 3 sections into one to make the writing more concise. Sections 1.2 and 1.3 repeat the prose of section 1.
  
  - no, we feel a limited amount of repetition is useful.
  
  Methods: start with brief description of each dataset used in the manuscript.
  
  - no, we need to introduce the general method and notation before we can go into the specifics of each data set.
  Methods are a bit difficult to follow. Perhaps organize according to the three main approaches to improve estimates outlined in lines 56-59?
  
  - we already have sections on two of these approaches, but we can add the third (space-for-time).
  Model – how is model fit determined?
  
  - also requested by other referees - we can add details in the model development section in the revision.
  Is the depth distribution of Sc modeled as part of the Bayesian framework?
  
  - yes, this is explicit in Eqn 8 and throughout the text.
  If yes, is there a need to control feedbacks in the model so that information only flows one way (e.g., so that estimates of main LU effects do not feed back to the depth distribution)? See Ogle et al. 2013 and Ogle and Pendall 2015.
  
  - no, the model structure is simple enough that this is not an issue: we are only fitting intercept and slope terms for each land use; these *are* the main effects.
  The results of this paper show smaller soil C stock estimates compared to Bradley. Would it be possible to run the analysis only on the Bradley data to get some sense for the extent to which the change is due to modeling choices and the extent to which it’s due to a larger and more extensive dataset?
  
  - that comparison is already shown in Fig 5. We already say that the difference is largely due to modelling choices rather than the additional data.
  
  Results:
  Is the model fit reported anywhere? I may have missed this.
  
  - no, and the same point was made by referee 2. We will add this as suggested.
  Develop the narrative by giving context first and then describing results. Eg, line 248: ‘Figure 1 shows some of the soil survey data…’ what is some? What’s the question this figure addresses or what patterns should the reader pay attention to?
  
  - This is a good point - we added a map to show the wide coverage, and out of habit. However, given that we cannot disclose the locations of several of the large data sets, Fig 1 rather fails in this respect, and probably is not needed. We can remove it, or at least make the point more clearly.
  Lead with a statement that directs the reader to the figure (Figure X). Eg, line 276 instead of ‘these are shown in Figure 4’, end the previous sentence with (Figure 4).
  
  - noted. Will add some variety in the revision.
  Given figure 3 with all the data by depth, is figure 2 for specific cores, necessary? What does figure 2 add that cannot be seen in figure 3?
  
  - Fig 2 shows that the linear relationship is a close fit in individual cores. This point is lost when all cores are plotted together in Fig 3.
  In the figures showing prediction lines, are these lines drawn based on the Bayesian model parameters?
  
  - this is explicit in the figure captions.
  
  Discussion: discuss the importance of showing confidence intervals and prediction intervals? Depending whether confidence or prediction intervals are interpreted, the conclusions would be substantially different, particularly for results shown in figures 4 and 5.
  - This is a good point and we assumed prior knowledge; we can discuss this point in the revision.
  
  Figure 1: explain abbreviations in caption. Label axes as latitude and longitude? The left panel has two color scales that are both spatial and I am unsure how to read the graph.
  - redundant if we are deleting Fig 1.
  Figure 2: Do the points in this figure represent an average or individual points? If average, is there an error or uncertainty associated? If individual samples, why were these chosen? What are the titles of each facet (226, 7, 9, ED_2004_18)? include log in the y-axis name? Consider a flipped-axis in which the x-axis is shown vertically and the y-axis is shown horizontally? I think it’s a more intuitive way to look at a variable like depth.
  
  Figure 3: include log in the y-axis name? Consider a flipped-axis in which the x-axis is shown vertically and the y-axis is shown horizontally?
  - The values shown are in original units; it is the axis scaling that is transformed here. Personally, I dislike flipped axes like that.
  Figure 5: ‘For the latter’ includes which studies? Only Guo and Gifford and Poeplau, or also ELUM and RAC? To clarify, perhaps add an NA or ND to the figure itself where no data are available.
  - we will change "for the latter" to "in the meta-analyses"
  Figure 3-6: keep LU categories in the same order in all graphs.
  - noted to be changed in the revision.
  Minor
  Line 10: What does ‘This’ refer to? It reads as if ‘this’ is the variability in mean effects, but I think it’s meant to refer to the results on how land use ranks in soil C storage?
  
  - it refers to the variability, and we can change this to "This variability ...".
  
  Line 30: Why say >400 instead of listing an exact number?
  
  - because it is the magnitude that is important, not the exact number. The exact number depends on how you count them (all cases, series with some missing values, etc.).
  
  Line 30: does this need a citation? The following paragraph implies that the samples from >400 soil series come from a particular dataset if other data has been collected since 2005.
  
  - This has been cited at the end of the previous sentence. We are still in the same paragraph, so still talking about the same thing, so we wouldn't cite it again.
  
  Line 33: ‘These data’ – I got a bit lost as to which data are being referred to. The surface soil data collected since 2005?
  
  - yes.
  At this point in reading I am not sure if this paragraph is leading the reader to the fact that the current study is going to use data since 2005 to update estimates or that data since 2005 can’t be used?
  
  - the point is they haven't been used (until now) because no one had a method to do so.
  
  Line 34: ‘there are some issues’ is vague. Perhaps reword the sentence to be more specific? E.g.: ‘Assumptions inherent in previous estimates (Bradley et al. 2005) may have important limitations for interpretation of soil C stock changes.’
  
  - we say what these issues are in the next sentence.
  
  Line 37: ‘some of the details…with respect to land use are unclear’ – is it possible to be specific about what exactly is unclear?
  
  - we can say something more in the revision. There are no equations in the paper and no surviving source code of their data analysis, so there is a general lack of clarity in how one interprets the paper into a reproducible mathematical/code form.
  
  Line 38: is there a citation for choices/assumptions giving different results? Or are the authors stating that it’s important to be more clear in the assumptions? Or even to run multiple models that vary assumptions in a type of sensitivity analysis?
  
  - all of these things. I don't think the general point needs a citation, but we show the importance here in Table 1, which shows the effect of choosing log-transformation or not.
  
  Line 39: keep with previous paragraph as this is about limitations in prior studies? Does ‘the general approach’ refer to what Bradley did? Or more broadly, to soil stock/LUC analyses?
  
  - this refers more broadly to any soil stock/LUC analysis which uses the space-for-time substitution.
  
  Line 53: Is the Countryside Survey data the data referred to in line 30? Perhaps briefly describe the Countryside dataset? Many readers won’t be familiar with it.
  
  - not sure why the comment for line 53 refers to line 30. Assuming this is intended - no, line 30 is a continuation from the previous sentence, expanding upon the data of "Milne and Brown, 1997, Bradley et al. (2005)" (line 28).
  
  Line 57: point i), the way that lines 31-34 are written, it seemed as if the more recent data are difficult to incorporate for analysis. Perhaps re-phrase line 31-34 to better lead the reader toward the potential utility of more recent survey data.
  
  - this is my point - more recent data *have been* difficult to incorporate, because no one had a method to include other data sets which used different depths.
  
  Line 65: Define Sc
  
  - defined on line 20.
  
  Line 66: Why does the use of pedotransfer functions make the data not comparable? Is the problem that each survey uses a different bulk density estimate approach?
  
  - exactly, yes. Ideally it should be measured on the same sample as the carbon fraction, otherwise we add all the additional uncertainty in the modelled (pedotransfer function) estimate of bulk density, which can be substantial but is rarely quantified.
  
  Line 68: Is estimating Sc as a function of depth an alternative to correcting with bulk density?
  
  - no. Bulk density is never a correction.
  
  Line 78-96: For a while I thought this was already the methods section. I think the narrative here can be condensed with brief and general mention of benefits from Bayesian modeling.
  
  Line 97-119: condense this section and incorporate into section 1 of the introduction as 1.2 and 1.3 repeat ideas at the end of 1, but in more detail.
  
  Line 154: Is it possible to refer the reader to a specific section below for more discussion of statistical issues? It could otherwise be hard to find.
  
  - we will add "(section 2.2 below)"
  
  Line 178: Do I understand correctly that survey effects are not shown in equation 8, but are included in the model?
  
  - correct. We tried variants of the model with various terms included/excluded, and we will show the results as requested by Referee 2.
  
  Line 243: Are data from the meta-analyses different from the other datasets? If meta-analyses are used only for comparison, this section could be removed and instead used in the discussion.
  
  - yes they are different and used only for comparison. However, it is useful to plot them in Fig 5, so they need to be introduced before the Results section.
  
  Line 251: the details on data availability may be better to add in the methods? I also got confused which data are limited – presumably not the data used in this analysis? If the availability of data has an impact on the analysis, mention that. If it does not then is it important to mention at all?
  
  - the issue is that for some data sets we do not know the spatial location/coordinates, and for others we cannot disclose the spatial location/coordinates, so they cannot appear on Figure 1. We can explain this better in the revision.
  
  Line 260: instead of ‘relatively infrequent’ is it possible to give a percentage of complicated soil profiles?
  
  - unfortunately, the data we have do not contain explicit information about the presence/depth/nature of soil horizons, so we cannot be more quantitative about it.
  
  Citation: https://doi.org/10.5194/egusphere-2023-1681-AC1
RC2:
'Comment on egusphere-2023-1681', José Lucas Safanelli, 28 Sep 2023
The paper “The Effects of Land Use on Soil Carbon Stocks in the UK” provides interesting pieces of evidence using Bayesian inference with a bigger dataset that collated national and regional soil inventories containing varying degrees of precision and nature. The main upshot of this study is the rigorous analysis for estimating the effect sizes of land uses on SOC stock in the UK and also for describing the limitation of the dataset and the analysis, i.e., it has been assuming the space-for-time substitution. The paper is well-organized and written, with the results and discussion being supported by the methodology. However, I think that readers (including myself) would greatly benefit from further clarification that I list below:
In the abstract, they mention the compilation of more than 15k cores (Line 3), while the Discussion refers to more than 25k (line 381). Just make sure to double-check this number and provide a consistent value across sections.

While the authors introduce the importance of land use averages of SOC stocks, which are important for model parametrization for nationwide accountability of land use emissions (under the LULUCF/IPCC guidelines), I think that the authors could better emphasize this in the discussion and propose that their results are an improvement over Bradley et al. (2005) and can be employed under LULUCF/IPCC guidelines. It is well explained, however, that the space-for-time substitution assumption (with their results seeming to still be affected by it) is still limited and requires further analysis.

It is interesting to note the space-for-time substitution assumption has also been explored in several other types of studies (e.g., digital soil mapping or spatial prediction) and we are still not sure about their general validity and precision, due to the same reasons: lack of longitudinal studies across several sites under several treatments (that can in principle be mapped). This study, in turn, explores and describes it in the context of land use effects analysis in a creative way.

Hierarchical modeling: I think that presenting the reasons for adopting a logarithm transformation of SOC with theoretical reasons and proper references is completely valuable. The explanation of the reasons for adopting a Bayesian hierarchical model was good too. However, I think that from a modeling standpoint, many of the things explained in that section were not clearly described in the Methods. For example, the definition of the priors of their analysis, the goodness-of-fit metrics for evaluating their predictions, etc. Adding this information (even as supplements) is greatly appreciated.

It is always good to present and discuss the assumptions and notations for estimating soil carbon stocks, and this paper can serve as a reference for other works that seek to relate SOC stocks to land use change, especially because it links to UNFCCC guidelines. The rationale behind their model development is very interesting and I wonder if that can be adapted to other model structures. I mean, using the proposed whole-profile estimation by identifying the depth of the arithmetic mean, like for example, running some simple regression analysis with predicted SOC (model agnostic) and depths, etc. Of course, this depends a lot on the prediction algorithm and the way depth is treated as the predictor but seems very promising.

The authors mention that for the sake of brevity, treating the differences between surveys is not represented in their main model (which is already encompassed by the group-specific terms), but they provide predictions for those different sources in the results. In addition, it seems that several versions of the main model are generated for inference, and I wonder how the authors can be more transparent about all model’s capacity for drawing their conclusions. They mention in the results (Line 263) that their full model (having site and location as random effects) achieved an explained variance of 90%. What about the simpler forms, e.g., when they exclude the random intercepts and slopes from group-specific terms for making Figure 4, or when another model is used to make Figure 5 (that requires the sources as random terms)? This addition is highly recommended (a table placed in supplements). This will certainly help to understand why the mean effects were significant while the prediction intervals were broad among uses and/or sources (possibly because of the low performance in some versions?).

I wonder why the y-axis in Figure 2 and Figure 3 are correctly displayed in log space (but with back-transformed labels), while Figure 4 and Figure 5 are not. I can understand it is because of the effect size visualization and the change of variable of interest, but I wonder if the 95% confidence and prediction intervals are properly displayed. I would expect higher upper intervals due to the log effect. Maybe the log effect is smoothed by bulk density when estimating the SOC stock. Would greatly appreciate a further verification of this. Similarly, I’d greatly appreciate having a better explanation of the back transformation steps in the model development.

Figure 1: It seems that at least two data points (or cores) are misplaced in the sea. This brings me to the question about the data quality. We are not sure about which source those data points come from but might indicate that other potential errors might have happened (both for soil series and land use labeling). In fact, I went over the original Bradley paper (on the journal website it says published in 2006, but you cited 2005, must check) and there are several limitations of that dataset. The first we can spot is that the employed soil maps were made in very coarse scales (1:250,000) and this can have an enormous impact on the definition of soil series and help to explain why the authors found issues with the space-for-time substitution. Also, it sort of explains the huge prediction intervals.

This makes me think about other ways of estimating SOC stock effect sizes across land uses, like using more advanced and very performant machine learning algorithms to make spatial maps and compare them with land use maps. There are many other sources of uncertainty involved in model building, prediction, and inference, but they could indicate the same effects with lower uncertainty in the predictions. Actually, the authors defend in the discussion that interpretability is impacted by this approach. Considering the recent advances in model interpretability (partial dependences, Shapley values, etc.), uncertainty estimation via conformal prediction, and non-parametric inference, I think it is worth investigating and comparing with their proposed method. So I recommend not opposing alternative approaches and actually stating that there is room for exploring this research problem in different ways.

Specific comments:
Line 259: “(...) so the variance explained by a linear trend was less”. Maybe lower?

Line 259: I think that “complicated” profiles is not a good term, please rephrase it.

Line 265: “The interpretation of lhe latter…”, amend to “the”.

I recommend adjusting Figure 1, especially by repositioning the legend to the bottom and keeping both map grids the same size and in different panels. Label each panel and indicate them in the caption. Please, make the notation of variables consistent across the text, figures, and tables.
Citation: https://doi.org/10.5194/egusphere-2023-1681-RC2
- AC2: 'Reply on RC2', Peter E. Levy, 16 Nov 2023
  
  We thank the referee for the time taken. Their comments are shown in italics; our response is beneath in normal font.
  In the abstract, they mention the compilation of more than 15k cores (Line 3), while the Discussion refers to more than 25k (line 381). Just make sure to double-check this number and provide a consistent value across sections.
  
  - both are correct: "15790 soil cores" (Line 3), and "more than 25,000 core sections" (line 381) i.e. most cores are split into several sections by depth. We can add the word "depth" to make this more obvious.
  While the authors introduce the importance of land use averages of SOC stocks, which are important for model parametrization for nationwide accountability of land use emissions (under the LULUCF/IPCC guidelines), I think that authors could better emphasize this in the discussion and propose that their results are an improvement over Bradley et al. (2005).
  
  - agreed - we can make this point more explicitly in the revision.
  Hierarchical modeling: I think that presenting the reasons for adopting a logarithm transformation of SOC with theoretical reasons and proper references is completely valuable. The explanation of the reasons for adopting a Bayesian hierarchical model was good too. However, I think that from a modeling standpoint, many of the things explained in that section were not clearly described in the Methods. For example, the definition of the priors of their analysis, the goodness-of-fit metrics for evaluating their predictions, etc. Adding this information (even as supplements) is greatly appreciated.
  
  - we can add the details suggested. In brief, the marginal and conditional r^2 values were used to assess goodness-of-fit, as defined for mixed-effect models by Nakagawa et al (2017).
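As an illustration of the goodness-of-fit metrics named above, the following is a minimal sketch of the marginal and conditional R² for a Gaussian mixed model with a single random-effect variance. It is not the code used for the manuscript; the function name and the variance components are hypothetical and chosen only to show the arithmetic.

```python
def nakagawa_r2(var_fixed, var_random, var_resid):
    """Marginal and conditional R^2 for a Gaussian mixed-effects model.

    var_fixed  : variance of the fixed-effect predictions
    var_random : variance attributed to the group-specific (random) terms
    var_resid  : residual variance
    """
    total = var_fixed + var_random + var_resid
    r2_marginal = var_fixed / total                    # fixed effects only
    r2_conditional = (var_fixed + var_random) / total  # fixed + random effects
    return r2_marginal, r2_conditional


# Hypothetical variance components (illustrative only, not from the paper)
r2_m, r2_c = nakagawa_r2(var_fixed=2.0, var_random=7.0, var_resid=1.0)
print(f"marginal R2 = {r2_m:.2f}, conditional R2 = {r2_c:.2f}")
```

A large gap between the two values, as in this made-up example, would indicate that most of the explained variance comes from the group-specific terms rather than from the fixed effects.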
  It is always good to present and discuss the assumptions and notations for estimating soil carbon stocks, and this paper can serve as a reference for other works that seek to relate SOC stocks to land use change, especially because it links to UNFCCC guidelines. The rationale behind their model development is very interesting and I wonder if that can be adapted to other model structures. I mean, using the proposed whole-profile estimation by identifying the depth of the arithmetic mean, like for example, running some simple regression analysis with predicted SOC (model agnostic) and depths, etc. Of course, this depends a lot on the prediction algorithm and the way depth is treated as the predictor but seems very promising.
  
  - This sounds interesting, but I don't really understand what the referee means here.
  The authors mention that for the sake of brevity, treating the differences between surveys is not represented in their main model (which is already encompassed by the group-specific terms), but they provide predictions for those different sources in the results. In addition, it seems that several versions of the main model are generated for inference, and I wonder how the authors can be more transparent about all model’s capacity for drawing their conclusions. They mention in the results (Line 263) that their full model (having site and location as random effects) achieved an explained variance of 90%. What about the simpler forms, e.g., when they exclude the random intercepts and slopes from group-specific terms for making Figure 4, or when another model is used to make Figure 5 (that requires the sources as random terms)? This addition is highly recommended (a table placed in supplements). This will certainly help to understand why the mean effects were significant while the prediction intervals were broad among uses and/or sources (possibly because of the low performance in some versions?).
  
  - we can add detail about the effect of the different terms on the variance explained as suggested.
  I wonder why the y-axis in Figure 2 and Figure 3 are correctly displayed in log space (but with back-transformed labels), while Figure 4 and Figure 5 are not. I can understand it is because of the effect size visualization and the change of variable of interest, but I wonder if the 95% confidence and prediction intervals are properly displayed. I would expect higher upper intervals due to the log effect. Maybe the log effect is smoothed by bulk density when estimating the SOC stock. Would greatly appreciate a further verification of this. Similarly, I’d greatly appreciate having a better explanation of the back transformation steps in the model development.
  
  - Figs 2 & 3 show the highly skewed raw data, which need the log scaling to display appropriately. Figs 4 & 5 show only the means and effect sizes, so can be shown in the original untransformed units. The intervals should also be on the same untransformed scale in Figs 4 & 5. We will double-check that this has been done correctly.
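The back-transformation at issue here can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the manuscript: it assumes a quantity that is normally distributed on the log scale, and the function name and numerical inputs are hypothetical. In a Bayesian analysis one would more likely exponentiate the posterior draws directly and summarise them.

```python
import math


def back_transform_interval(mu_log, sd_log, z=1.96):
    """Back-transform a log-scale estimate and its interval to original units."""
    lower = math.exp(mu_log - z * sd_log)
    median = math.exp(mu_log)
    upper = math.exp(mu_log + z * sd_log)
    # The mean of a lognormal variable exceeds its median by exp(sd^2 / 2)
    mean = math.exp(mu_log + 0.5 * sd_log ** 2)
    return lower, median, mean, upper


# Hypothetical log-scale mean and standard deviation (illustrative only)
lower, median, mean, upper = back_transform_interval(mu_log=2.0, sd_log=0.5)
print(f"lower={lower:.2f} median={median:.2f} mean={mean:.2f} upper={upper:.2f}")
```

An interval that is symmetric on the log scale becomes asymmetric after exponentiation, with the upper limit further from the median than the lower limit, which is the pattern the referee expected to see.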
  Figure 1: It seems that at least two data points (or cores) are misplaced in the sea. This brings me to the question about the data quality. We are not sure about which source those data points come from but might indicate that other potential errors might have happened (both for soil series and land use labeling). In fact, I went over the original Bradley paper (on the journal website it says published in 2006, but you cited 2005, must check) and there are several limitations of that dataset. The first we can spot is that the employed soil maps were made in very coarse scales (1:250,000) and this can have an enormous impact on the definition of soil series and help to explain why the authors found issues with the space-for-time substitution. Also, it sort of explains the huge prediction intervals.
  
  - we will revise Fig. 1, correcting the errors.
  
  - from the SUM journal website:
  
  How to cite:
  
  "Bradley, R.I., Milne, R., Bell, J., Lilly, A., Jordan, C. and Higgins, A. (2005), A soil carbon and land use database for the United Kingdom. Soil Use and Management, 21: 363-369. https://doi.org/10.1079/SUM2005351"
  
  - the 1:250,000 maps were used for spatial mapping, and were not involved in the raw soil core data we analysed here (as far as we can tell - the methods are not completely clear, and no equations are given).
  This makes me think about other ways of estimating SOC stock effect sizes across land uses, like using more advanced and very performant machine learning algorithms to make spatial maps and compare them with land use maps. There are many other sources of uncertainty involved in model building, prediction, and inference, but they could indicate the same effects with lower uncertainty in the predictions. Actually, the authors defend in the discussion that interpretability is impacted by this approach. Considering the recent advances in model interpretability (partial dependences, Shapley values, etc.), uncertainty estimation via conformal prediction, and non-parametric inference, I think it is worth investigating and comparing with their proposed method. So I recommend not opposing alternative approaches and actually stating that there is room for exploring this research problem in different ways.
  
  - this is a fair comment - there is certainly scope for exploration of other methods and approaches. My general concern with machine learning methods is that they are less interpretable, almost by definition. Although they may give the appearance of lower uncertainty, it is hard to be sure this is not a result of over-fitting to the sample at hand, even when cross-validation is used within the available sample. We are very happy to look further into this, and can add something to the Discussion on this point.
  Specific comments:
  
  Line 259: “(...) so the variance explained by a linear trend was less”. Maybe lower?
  
  - if we think of variance as an amount, "less" seems more appropriate.
  Line 259: I think that “complicated” profiles is not a good term, please rephrase it.
  
  - Perhaps "soils with more complex vertical structures".
  Line 265: “The interpretation of lhe latter…”, amend to “the”.
  
  - corrected
  I recommend adjusting Figure 1, especially by repositioning the legend to the bottom and keeping both map grids the same size and in different panels. Label each panel and indicate them in the caption. Please, make the notation of variables consistent across the text, figures, and tables.
  
  - agreed. We will revise as suggested here and by Referee 3, or delete as implied by Referee 1.
  Reference
  
  Nakagawa, S., Johnson, P.C.D., Schielzeth, H., 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. Journal of The Royal Society Interface 14, 20170213. https://doi.org/10.1098/rsif.2017.0213
  
  Citation: https://doi.org/10.5194/egusphere-2023-1681-AC2
RC3:
'Comment on egusphere-2023-1681', Stephen Chapman, 30 Sep 2023

General comments
Understanding the effects of land use change on soil carbon stocks is a necessary step to predicting how the land use change matrix impacts carbon fluxes from soil and how UNFCCC reporting can be implemented, if indeed it can. The application of the logarithmic model for changes with depth and the use of Bayesian statistics takes the process beyond what had been previously achieved in this area and, while the application is to the UK, it clearly has value for world-wide soils.
One concern is the strict applicability of the logarithmic function (soil carbon vs depth) to crop soils. These are typically ploughed to 30 cm or so, though a minor fraction may be under minimum till. This means that there is a continual dilution of the surface horizon (0-15 cm) with soil from below, and a continual deposition of carbon lower down (15-30 cm). Interestingly, we have found that plough depth in Scotland has increased over the years (Lilly and Chapman, 2015). This shift in carbon would not have been evident in the data of Jobbagy and Jackson (2000), who appeared to deal mainly with grasslands, shrublands and forests. What is also clear from the data presented here (Figure 3) is that there is quite a large data gap just in this region (on either side of the 0.25 m line). In fact, it is evident in all four land use types. What it might mean for the crops is that the surface values are less than what they would be in the absence of ploughing and that the slopes of the regression lines are less than what they would otherwise be. The net result would be a reduction of the total carbon values for these crop soils.
A secondary consideration is the status of improved grasslands. These are also periodically ploughed but usually the time since last ploughing is unknown. It may be known to some extent where repeated samples are taken, such as in the Countryside Survey. Time since being under grass will affect how close the C stock is to typical crop values and how close it is to more typical ‘grass’ values. As argued above, there will also be an effect on the distribution of C over the Ap (ploughed) horizon.
Lilly, A. and Chapman, S. (2015) Assessing changes in carbon stocks of Scottish soils: Lessons learnt. IOP Conf. Ser.: Earth Environ. Sci. 25, 012016.
Specific comments
LL75-77 This sentence seems to be a repeat of what has already been stated in LL73-74.
Figure 1 The inclusion of altitude is not really helpful, and for most of England and Wales cannot be seen. I recommend omission. Also some data points appear in the sea or in Eire!
Technical corrections
L83 assuming
L252 availability
L257 Figures 2 & 3 (not S2 & S3)
L265 the
L267 (and elsewhere) Use of the word ‘outwith’ is fine by me but may raise some eyebrows outwith Scotland.
L332 the assumption
L356 rejects

Citation: https://doi.org/10.5194/egusphere-2023-1681-RC3
- AC3: 'Reply on RC3', Peter E. Levy, 16 Nov 2023
  
  We thank the referee for the time taken and the particular attention to detail. Their comments are shown in italics; our response is beneath in normal font.
  
  One concern is the strict applicability of the logarithmic function (soil carbon vs depth) to crop soils. These are typically ploughed to 30 cm or so, though a minor fraction may be under minimum till. This means that there is a continual dilution of the surface horizon (0-15 cm) with soil from below, and a continual deposition of carbon lower down (15-30 cm). Interestingly, we have found that plough depth in Scotland has increased over the years (Lilly and Chapman, 2015). This shift in carbon would not have been evident in the data of Jobbagy and Jackson (2000), who appeared to deal mainly with grasslands, shrublands and forests. What is also clear from the data presented here (Figure 3) is that there is quite a large data gap just in this region (on either side of the 0.25 m line). In fact, it is evident in all four land use types. What it might mean for the crops is that the surface values are less than what they would be in the absence of ploughing and that the slopes of the regression lines are less than what they would otherwise be. The net result would be a reduction of the total carbon values for these crop soils.
  
  - These are good points. The simple effect of ploughing would be to reduce the slope of soil carbon vs depth because of the mixing effect of the plough. This in itself should already be accounted for (we fit separate slopes and intercepts for each land use), and the logarithmic decline is still likely to be reasonable. More importantly, the effect of ploughing might be to create two different layers (above and below the plough depth) with different slopes (soil carbon vs depth), so something like a "broken stick" model might be more appropriate. The practical problem is that we typically don't have enough samples to have the resolution to see such effects. We don't see clear evidence of this in Figure 3, but perhaps only for this reason. If we knew which sites had been ploughed and to what depth, we could include this in the analysis, but the information is not generally available. We can include discussion of the above in the revision.
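The "broken stick" idea mentioned above can be sketched as a continuous piecewise-linear fit of log carbon density against depth, with separate slopes above and below a plough depth. This is only an illustration, not part of the analysis in the manuscript: the break depth is assumed known (0.3 m is taken from the referee's comment), the function name is hypothetical, and the data are synthetic.

```python
import numpy as np


def fit_broken_stick(depth, log_c, break_depth):
    """Least-squares fit of a continuous two-segment line with a known break.

    Returns (intercept, slope_above_break, slope_below_break), where "above"
    means shallower than break_depth and "below" means deeper.
    """
    depth = np.asarray(depth, dtype=float)
    log_c = np.asarray(log_c, dtype=float)
    # Hinge term is zero above the break and grows linearly below it
    hinge = np.maximum(depth - break_depth, 0.0)
    X = np.column_stack([np.ones_like(depth), depth, hinge])
    coef, *_ = np.linalg.lstsq(X, log_c, rcond=None)
    intercept, slope_above, delta = coef
    return intercept, slope_above, slope_above + delta


# Synthetic, noise-free profile: weak decline in the mixed plough layer,
# steeper decline beneath it (all values hypothetical)
depth = np.linspace(0.05, 1.0, 20)  # m
log_c = 4.0 - 0.5 * depth - 3.0 * np.maximum(depth - 0.3, 0.0)
intercept, slope_top, slope_sub = fit_broken_stick(depth, log_c, break_depth=0.3)
print(intercept, slope_top, slope_sub)
```

With only a few depth sections per core, as noted in the reply, the two slopes would be poorly constrained, and estimating the break depth itself would need a non-linear fit.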
  A secondary consideration is the status of improved grasslands. These are also periodically ploughed but usually the time since last ploughing is unknown. It may be known to some extent where repeated samples are taken, such as in the Countryside Survey. Time since being under grass will affect how close the C stock is to typical crop values and how close it is to more typical ‘grass’ values. As argued above, there will also be an effect on the distribution of C over the Ap (ploughed) horizon.
  
  - as above, this is a valid point, but in practice, the information on ploughing history is not generally available. The intention is that the sample is large enough to be representative of the range found in improved grasslands, averaging over all post-ploughing states. This is clearly not perfect, but it is at least a very large sample. We can include discussion of the above in the revision.
  
  Specific comments
  LL75-77 This sentence seems to be a repeat of what has already been stated in LL73-74.
  
  - There are two points, which we did not make clear enough:
  
  - the decline with depth is exponential, and so forms a linear relationship when carbon density is log-transformed;
  
  - the frequency distribution would be expected to be lognormal on the basis of the multiplication of bulk density and carbon fraction, and this is seen in the data e.g. Jobbagy and Jackson (2000).
  
  - We will rewrite this more clearly in the revision.
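The first of the two points above can be illustrated with a short sketch: if carbon density declines exponentially with depth, a straight line fitted to log carbon density recovers the decline rate, and the fitted curve can be integrated analytically to give a stock over a chosen depth. This is a simplified illustration, not the hierarchical model in the manuscript; the function names, units and numbers are hypothetical.

```python
import numpy as np


def fit_log_linear(depth, carbon_density):
    """Fit log(carbon density) = intercept + slope * depth by least squares."""
    slope, intercept = np.polyfit(depth, np.log(carbon_density), 1)
    return intercept, slope


def stock_to_depth(intercept, slope, max_depth):
    """Integrate exp(intercept + slope * z) over z from 0 to max_depth."""
    if slope == 0.0:
        return np.exp(intercept) * max_depth
    return np.exp(intercept) * (np.exp(slope * max_depth) - 1.0) / slope


# Synthetic, noise-free profile (hypothetical values):
# 60 kg C m-3 at the surface, declining as exp(-2 z) with depth z in metres
depth = np.array([0.05, 0.15, 0.30, 0.50, 0.80])
carbon_density = 60.0 * np.exp(-2.0 * depth)

intercept, slope = fit_log_linear(depth, carbon_density)
stock = stock_to_depth(intercept, slope, max_depth=1.0)  # kg C m-2 to 1 m
print(intercept, slope, stock)
```

With real, noisy data, exponentiating the fitted line gives a median rather than a mean profile, so a lognormal correction would be needed before integrating.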
  Figure 1 The inclusion of altitude is not really helpful, and for most of England and Wales cannot be seen. I recommend omission. Also some data points appear in the sea or in Eire!
  
  - agreed. We will revise or delete Fig 1 accordingly.
  Technical corrections
  L83 assuming
  
  - corrected
  L252 availability
  
  - corrected
  L257 Figures 2 & 3 (not S2 & S3)
  
  - corrected. We did initially make a document with these extra figures plotted, but decided that they were not enough to merit supplementary information.
  L265 the
  
  - corrected
  L267 (and elsewhere) Use of the word ‘outwith’ is fine by me but may raise some eyebrows outwith Scotland.
  
  - Synonyms "beyond" or "outside" don't sound quite as good to me, but this is up to the Editor's discretion (I hadn't realised this was a Scottish English word).
  L332 the assumption
  
  - corrected
  L356 rejects
  
  - corrected
  
  Citation: https://doi.org/10.5194/egusphere-2023-1681-AC3

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

ED: Reconsider after major revisions (20 Nov 2023) by Sara Vicca

AR by Peter E. Levy on behalf of the Authors (01 Aug 2024)

ED: Referee Nomination & Report Request started (01 Aug 2024) by Sara Vicca

RR by Anonymous Referee #2 (12 Aug 2024)

ED: Publish subject to technical corrections (12 Aug 2024) by Sara Vicca

AR by Peter E. Levy on behalf of the Authors (19 Aug 2024)

Editorial statement

This study proposes revising the effect sizes of land use on SOC stock across the UK using a large dataset and a more robust analysis. It may serve as the basis for new reports of the nationwide land use emissions following the guidelines of the UNFCCC agreement. In addition, the study demonstrates the limitation of the space-for-time substitution assumption for estimating these effects.