Observational benchmarks inform representation of soil organic carbon dynamics in land surface models

Nyaupane, Kamal; Mishra, Umakant; Tao, Feng; Yeo, Kyongmin; Riley, William J.; Hoffman, Forrest M.; Gautam, Sagar

doi:https://doi.org/10.5194/bg-21-5173-2024

Articles | Volume 21, issue 22

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article

19 Nov 2024

Download

Final revised paper (published on 19 Nov 2024)
Supplement to the final revised paper
Preprint (discussion started on 21 Mar 2023)
Supplement to the preprint

Interactive discussion

Status: closed

RC1:
'Comment on bg-2023-50', Lorenzo Menichetti, 08 May 2023

The study is interesting because of the general perspective it gives, but it contains also quite some self-evident remarks and some related misunderstandings. It also lacks context, since most of the advancements in modeling in the last 50 years (starting from the temperatures and soil water content relationships with microbial activity) are not considered but you are still making recommendations to modelers.
In order to make your statements, you need first to study and briefly review (connecting your results to them) how these processes are represented in ESM, to be honest it sometimes seems you have just a vague idea.
The main of these remarks are all the discussions about introducing in models any variable related to soil moisture or temperature. These relationships are very well known, we know that those are the main factors influencing SOC decomposition and models consider this already quite well.
Using drought instead of precipitations is no improvement. Most SOC models (or maybe all) consider soil moisture simulating it based on precipitations and evapotranspiration (with more or less refined water balance), and have relatively refined response functions of decomposition responding to soil moisture. When the soil moisture falls, the microbial activity in the models (often represented by the kinetics) reduces. This is the main impact of drought on SOC. Most parametric decomposition models go further, representing also a decrease in activity towards the end of the curve, when soil gets close to saturation (this time due to lack of oxygen). You could for example start from the review by Moyano et al., 2013 to have an overview of the discussion about moisture. Concerning temperature effects, you could check first Lloyd and Taylor (1994), a good citation classic. But even if temperature is, of the two, the easier bit, there’s still discussion going on (for example Ratkowsky). And here we are still just considering first order kinetic models, representing these interactions as external forcing variables as a scaling of the kinetics. There are much more complex models that represent these effects internally to the model itself, for example representing the effect of soil texture on moisture or explicitly considering diffusion of nutrients.
If drought works better in your model than precipitation the main reason (given that you are using models with potentially "infinite degrees of freedom", at least I personally define ML models like that when I use them, being quite illiterate on the topic) is probably that you are not including evapotranspiration in your model (while most/all ESM will probably do, calculating the soil water balance) while drought contains, in some sense, information about that too.
Also your speculation about the causes of why drought was so important in your model (lines 262-264) is not so convincing. For sure it might be that there is also an effect on inputs, but those should be considered in your model(s) already by using NPP. While it is very well known (and how much, numerically) that soil moisture affects microbial activity.
Summarizing, you are probably asking the wrong questions to your model(s). Saying that temperature related and moisture related indicators (whatever those are, since your model has "infinite" degrees of freedom) matter has been extremely self evident for half a century. While asking to your model how much would it matter to include also some edaphic parameter, and which one, on a global scale for predictions, that would be an interesting question to read about.
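The scaling approach described in this comment can be sketched as follows. This is a minimal illustration, not the code of any particular ESM: the temperature scalar uses the Lloyd and Taylor (1994) form with its published constants (E0 = 308.56 K, T0 = 227.13 K, reference temperature 283.15 K), while `moisture_scalar`, its `optimum`, and its `wet_floor` are hypothetical placeholders chosen only to show a response that rises with wetting and declines again near saturation.

```python
import math


def lloyd_taylor_scalar(temp_k, ref_temp_k=283.15, e0=308.56, t0=227.13):
    """Temperature scalar after Lloyd and Taylor (1994); equals 1 at ref_temp_k."""
    return math.exp(e0 * (1.0 / (ref_temp_k - t0) - 1.0 / (temp_k - t0)))


def moisture_scalar(saturation, optimum=0.6, wet_floor=0.4):
    """Hypothetical moisture scalar on relative saturation (0 to 1).

    Rises linearly from 0 (dry) to 1 at `optimum`, then declines linearly
    to `wet_floor` at full saturation (oxygen limitation). The shape and
    both parameters are assumptions for illustration only.
    """
    if not 0.0 <= saturation <= 1.0:
        raise ValueError("saturation must lie in [0, 1]")
    if saturation <= optimum:
        return saturation / optimum
    return 1.0 - (1.0 - wet_floor) * (saturation - optimum) / (1.0 - optimum)


def decay_rate(k_ref, temp_k, saturation):
    """First-order decay constant scaled by temperature and moisture."""
    return k_ref * lloyd_taylor_scalar(temp_k) * moisture_scalar(saturation)
```

Because both scalars multiply a single rate constant, two models that use different functional forms here will show different emergent sensitivities even when driven by identical forcing.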
Line 293-294: you don’t need this kind of study to demonstrate different controls of moisture for different ESM. You can simply read which functions they rely upon, if those functions are different (they are) then the controls will be different. I think you should study the main functions available for that, and which function has been implemented in which model.
Your conclusions seem off track. Line 310-311: I would say that there’s no disagreement, all SOC models are using temperature and moisture of the microbial environment to control decomposition, those processes are well taken care of (better than using drought alone). Different models will of course rely on different variables to represent the same processes, the fact yours relies on diurnal temperature instead of soil temperature or daily temperature does not allow you to make inferences on other models, it depends on the functions they use. But they all agree that we need to represent the impact that the water present in the microbial micro-environment has on kinetics, and the effect that temperature in such micro-environment has also on the kinetics.
So, concluding, the study could have some potential but it requires a much better and extensive work on documenting the state of the art in detail. Understanding how the problems you talk about are already dealt with in models (and I mean at the level of the single functions) will also help you to repurpose your conclusions.

I also suggest you to shift your focus a bit, you are probably having a bit too ambitious goals (of making a big impact on modeling). Your approach is interesting for me (I am a modeler myself) because it offers insights on processes that ok, we know well in principle, but still they vary in different environments, there might be interactions with environmental factors changing the relationships, and so on. The global perspective of your study is interesting already, even in case you won't revolutionize anything.

I have minor (but still important) concerns about validation too. How did you train your RF models? Can you ensure that the validation is completely independent? For example if you used Caret to train the metaparameters, you might have a spillover of the training in validation (because you select the metaparameters with the crossvalidation results). Another big issue would be to ensure that the data points of each fold of the crossvalidation are not correlated with any data point in the training. For example if you have more than one profile from a single site, some would end in validation and some in training (for each fold), injecting information from validation into training. If you are selecting instead at the site level (or if it corresponds to the data point level) it’s all fine.
Concerning the GAM, how did you validate them? You say you used R^2, but based on which dataset did you calculate it?
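The site-level selection mentioned in this comment can be sketched as below. It is a minimal standard-library illustration, not the authors' procedure: the function name `split_by_site`, the record layout, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_site(records, test_fraction=0.3, seed=0):
    """Split records so every profile from a given site lands on one side only."""
    by_site = defaultdict(list)
    for rec in records:
        by_site[rec["site"]].append(rec)

    # Shuffle site identifiers, not individual profiles.
    sites = sorted(by_site)
    random.Random(seed).shuffle(sites)

    n_test = max(1, round(test_fraction * len(sites)))
    test_sites = set(sites[:n_test])

    train = [r for s in sites if s not in test_sites for r in by_site[s]]
    test = [r for s in sites if s in test_sites for r in by_site[s]]
    return train, test


# Hypothetical toy data: 10 sites holding 1 to 3 profiles each.
records = [
    {"site": f"S{i}", "profile": j}
    for i in range(10)
    for j in range(1 + i % 3)
]
train, test = split_by_site(records)
```

Splitting on the site key rather than on individual rows is what prevents correlated profiles from appearing on both sides of the split; the same idea applies to each fold of a cross-validation.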
In general, please be extremely specific about your validation approaches, in particular discussing why you believe there is no spillover of information between training and validation and why the two are supposed independent.
You also need to describe better the study on the ESM data. What are the SOC data you mention on line 156? Are those simulated data or measured?

Some references:
Moyano, Fernando E., Stefano Manzoni, and Claire Chenu. “Responses of Soil Heterotrophic Respiration to Moisture Availability: An Exploration of Processes and Models.” Soil Biology and Biochemistry 59 (April 2013): 72–85. https://doi.org/10.1016/j.soilbio.2013.01.002.
Lloyd, J., and J. A. Taylor. “On the Temperature Dependence of Soil Respiration.” Functional Ecology 8, no. 3 (June 1994): 315. https://doi.org/10.2307/2389824.

Citation: https://doi.org/10.5194/bg-2023-50-RC1
- AC1: 'Reply on RC1', Umakant Mishra, 10 Oct 2023
  
  We would like to thank the reviewer for their time and thoughtful critique of this manuscript. We have addressed all the comments and believe that the article has been improved because of the valuable feedback. Please find our point-by-point responses below.
  #Reviewer 1 comments:
  
  The study is interesting because of the general perspective it gives, but it contains also quite some self-evident remarks and some related misunderstandings. It also lacks context, since most of the advancements in modeling in the last 50 years (starting from the temperatures and soil water content relationships with microbial activity) are not considered but you are still making recommendations to modelers. In order to make your statements, you need first to study and briefly review (connecting your results to them) how these processes are represented in ESM, to be honest it sometimes seems you have just a vague idea.
  Response: We thank the reviewer for their thoughtful comments. To address these comments, we have added a separate paragraph in the Introduction section which summarizes how different environmental factors are currently used in ESMs. We also reviewed existing literature on SOC decomposition and its relationship with soil moisture, temperature, and microbial activity as suggested.
  The main of these remarks are all the discussions about introducing in models any variable related to soil moisture or temperature. These relationships are very well known, we know that those are the main factors influencing SOC decomposition and models consider this already quite well.
  Response: We appreciate the reviewer’s comment regarding the environmental factors influencing SOC decomposition. However, we feel that the wide range of soil moisture and temperature functions used in models indicates that these relationships are not robustly known. For example, Sierra et al. (2015) Fig 4a and 4c, reproduced below, show the very large differences in temperature and moisture functions used in several common land models. Although these curves reflect the diversity of functional forms used in CMIP6 land models, they are not the same as the relationships we are quantifying from observations, which are better termed ‘emergent functional relationships’.
  
  Our results also indicate that ESM land models in CMIP6 do not capture the emergent moisture and temperature controllers on SOC decomposition rates. For example, we found that 3 environmental factors explain more than 96% of variability in ESMs. In contrast, 14 environmental factors only explain 61% of variability in observations. Half of these observationally-inferred environmental factors are edaphic factors which are not adequately represented in ESMs. Among the 3 environmental factors which are major predictors in both ESMs and observations, the emergent functional relationships are very different.
  Using drought instead of precipitations is no improvement. Most SOC models (or maybe all) consider soil moisture simulating it based on precipitations and evapotranspiration (with more or less refined water balance), and have relatively refined response functions of decomposition responding to soil moisture. When the soil moisture falls, the microbial activity in the models (often represented by the kinetics) reduces. This is the main impact of drought on SOC. Most parametric decomposition models go further, representing also a decrease in activity towards the end of the curve, when soil gets close to saturation (this time due to lack of oxygen). You could for example start from the review by Moyano et al., 2013 to have an overview of the discussion about moisture. Concerning temperature effects, you could check first Lloyd and Taylor (1994), a good citation classic. But even if temperature is, of the two, the easier bit, there’s still discussion going on (for example Ratkowsky). And here we are still just considering first order kinetic models, representing these interactions as external forcing variables as a scaling of the kinetics. There are much more complex models that represent these effects internally to the model itself, for example representing the effect of soil texture on moisture or explicitly considering diffusion of nutrients.
  Response: We thank the reviewer for these helpful suggestions. As suggested, we have reviewed each of the suggested publications and expanded the literature review both in the Introduction and Discussion sections of the manuscript.
  If drought works better in your model than precipitation the main reason (given that you are using models with potentially "infinite degrees of freedom", at least I personally define ML models like that when I use them, being quite illiterate on the topic) is probably that you are not including evapotranspiration in your model (while most/all ESM will probably do calculate the soil water balance) while drought contains, in some sense, information about that too.
  Response: In our study, we tried to include a wide range of environmental factors that are available and can be related to SOC dynamics. The Palmer drought index that we used in this study explicitly includes evapotranspiration, precipitation, and temperature. Therefore, as you mentioned, the drought index includes controls on the soil water balance, making it a better predictor than precipitation alone. In the revised manuscript, we have defined the drought index and its calculation and discussed its importance in predicting SOC.
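To illustrate why an index that combines precipitation with evaporative demand can carry more information than precipitation alone, here is a minimal single-bucket water balance. It is deliberately much simpler than the Palmer drought index and is not how that index is computed; the function name, the `capacity` and `initial` values, and the two toy sites are assumptions for illustration.

```python
def bucket_soil_water(precip, pet, capacity=150.0, initial=75.0):
    """Track soil water storage (mm) with a single-bucket balance.

    precip and pet are equal-length sequences of precipitation and
    potential evapotranspiration per time step (mm).
    """
    if len(precip) != len(pet):
        raise ValueError("precip and pet must have the same length")
    storage = initial
    series = []
    for p, e in zip(precip, pet):
        # Actual evapotranspiration is scaled down as the bucket empties.
        aet = e * storage / capacity
        # Water above capacity is lost (runoff or drainage); storage cannot go negative.
        storage = min(capacity, max(0.0, storage + p - aet))
        series.append(storage)
    return series


# Two hypothetical sites: identical precipitation, different evaporative demand.
precip = [50.0] * 12
cool = bucket_soil_water(precip, [40.0] * 12)
hot = bucket_soil_water(precip, [120.0] * 12)
```

In this toy setup both sites receive the same precipitation, yet the site with higher evaporative demand ends with a drier soil, which is the kind of contrast a precipitation-only predictor cannot express.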
  Also your speculation about the causes of why drought was so important in your model (like 262-264) are not so convincing. For sure it might be that there is also an effect on inputs, but those should be considered in your model(s) already by using NPP. While it is very well known (and how much, numerically) that soil moisture affects microbial activity.
  Response: We agree with the reviewer that the text in L262-264 can be improved and made more focused. As suggested we have modified that text in the revised manuscript.
  Summarizing, you are probably asking the wrong questions to your model(s). Saying that temperature related and moisture related indicators (whatever those are, since your model has “infinite” degrees of freedom) matter has been extremely self evident for half a century. While asking to your model how much would it matter to include also some edaphic parameter, and which one, on a global scale for predictions, that would be an interesting question to read about.
  Response: While we agree that temperature and moisture are very well known to affect SOC decomposition rates, their emergent functional relationships are not well known and vary strongly between land models (e.g., Sierra et al. 2015). Therefore, the objective of our study was to benchmark environmental control representations in the current generation of ESMs using existing observations and ML approaches. The study that we conducted clearly showed (1) dominant environmental controllers of SOC stocks at global scale in observations and CMIP6 ESMs, and (2) the mathematical relationships between the dominant environmental controllers and SOC stocks in both observations and ESMs. To address this concern, we have added text to the revised manuscript discussing the known importance of soil moisture and temperature on SOC stocks, and highlighting the wide range of functional forms included in CMIP6 land models. We also attempted to address the reviewer’s comment about how edaphic parameters inform global-scale SOM predictions. Out of 14 dominant environmental factors that our ML models selected as global predictors of SOC stocks in observations, 7 were edaphic factors. In the three CMIP6 ESMs we evaluated, only CESM used 5 of these edaphic factors. Interestingly, the cation exchange capacity, which is the most dominant environmental factor inferred from observations, is not used in any CMIP6 ESMs that we evaluated. We have modified the text in the Discussion section to highlight these findings as the reviewer suggested.
  Line 293-294: you don’t need this kind of study to demonstrate different controls of moisture for different ESM. You can simply read which functions they rely upon, if those functions are different (they are) then the controls will be different. I think you should study the main functions available for that, and which function has been implemented in which model.
  Response: To clarify, our objective was to benchmark how the existing environmental controls are related to the emergent SOC stocks in ESMs and observations. These emergent relationships may differ from the relationships coded in land models for several reasons, including interactions between multiple stressors (e.g., nutrients, moisture, temperature, light), time scales of analysis, and model differences in calculating moisture and temperature. We have added text to the revised manuscript to clarify these points. Our results show varying influences of different variables on SOC stocks across different ESMs. In addition to the dominant environmental controllers of SOC stocks, we also report observationally-inferred relationships between dominant variables and SOC stocks. As suggested by the reviewer in this and earlier comments, we have also modified the text in the Introduction section of the manuscript to include existing functions that ESMs currently include to represent control of these environmental factors.
  Your conclusions seem off track. Line 310-311: I would say that there’s no disagreement, all SOC models are using temperature and moisture of the microbial environment to control decomposition, those processes are well taken care of (better than using drought alone). Different models will of course rely on different variables to represent the same processes, the fact yours relies on diurnal temperature instead of soil temperature or daily temperature does not allow you to make inferences on other models, it depends on the functions they use. But they all agree that we need to represent the impact that the water present in the microbial micro-environment has on kinetics, and the effect that temperature in such micro-environment has also on the kinetics.
  Response: As mentioned above, we agree with the reviewer that it is well known that soil moisture and temperature are important controllers of SOM decomposition. To clarify this point, we have added a sentence to this effect in the revised manuscript. However, our results demonstrate that the models and observations have very different inferences of the emergent functional form of these relationships, and importantly, the number of controllers that need to be considered. The models dramatically underestimate the number (3 versus 16) and type (e.g., edaphic) of important controllers that we inferred from observations. We note that a wealth of literature exists documenting discrepancies between ESM land models and observed SOC stocks and dynamics. Consistent with our study, multiple previous studies (Collier et al., 2018; Georgiou et al., 2021; Luo et al., 2012; Todd-Brown et al., 2013) document the need for model benchmarking studies to identify discrepancies and improve model structures to reduce uncertainties in predicting carbon climate feedbacks.
  So, concluding, the study could have some potential but it requires a much better and extensive work on documenting the state of the art in detail. Understanding how the problems you talk about are already dealt with in models (and I mean at the level of the single functions) will also help you to repurpose your conclusions.
  Response: Thank you for indicating the merits of our study. As suggested by the reviewer, we have modified the manuscript text in multiple places, added new references, and provided greater details about the environmental controllers that are represented in models.
  I also suggest you to shift your focus a bit, you are probably having a bit too ambitious goals (of making a big impact on modeling). Your approach is interesting for me (I am a modeler myself) because it offers insights on processes that ok, we know well in principle, but still they vary in different environments, there might be interactions with environmental factors changing the relationships, and so on. The global perspective of your study is interesting already, even in case you won't revolutionize anything.
  Response: Thank you for your comments. Our goals are actually quite clear: to document existing differences between global observations and ESMs regarding: (1) dominant environmental controllers of SOC stocks, and (2) the emergent relationships between the dominant environmental controllers and SOC stocks.
  I have minor (but still important) concerns about validation too. How did you train your RF models? Can you ensure that the validation is completely independent? For example if you used Caret to train the metaparameters, you might have a spillover of the training in validation (because you select the metaparameters with the crossvalidation results). Another big issue would be to ensure that the data points of each fold of the crossvalidation are not correlated with any data point in the training. For example if you have more than one profile from a single site, some would end in validation and some in training (for each fold), injecting information from validation into training. If you are selecting instead at the site level (or if it corresponds to the data point level) it’s all fine.
  Response: Thank you for the comments. The model was trained on 70% of the data and tested on the remaining 30%. We did not use multiple profiles from a single site: each site contained a single soil profile, so there was no spillover effect. To split the data into calibration and validation datasets, we used a standard procedure from our other studies that splits the data in a spatially balanced way (Mishra et al., 2022; Mishra et al., 2021; Mishra et al., 2020).
  Concerning the GAM, how did you validate them? You say you used R^2, but based on which dataset did you calculate it?
  Response: A similar model validation approach was used for both the RF and GAM approaches: 70% of the data were used for training the model and 30% for model testing. We have modified the text in the revised manuscript to more clearly explain how model calibration and validation were done.
  In general, please be extremely specific about your validation approaches, in particular discussing why you believe there is no spillover of information between training and validation and why the two are supposed independent.
  Response: Thank you for your suggestions. As described above, the model was trained and tested using independent data sets, i.e., 70% of the data were used for training and 30% for testing. Each site contained a single soil profile, so there are no spillover effects. To split the data into calibration and validation datasets, we used standard procedures from our previous studies that split the data in a spatially balanced way (Mishra et al., 2020; Mishra et al., 2021; Mishra et al., 2022).
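One way to realize a spatially balanced 70/30 split is sketched below. The response does not describe the exact procedure used in the cited studies, so this is only an assumed variant: sites are binned into latitude/longitude grid cells and the 70/30 draw is made within each cell. The function name, the `cell_deg` size, and the toy sites are hypothetical.

```python
import random
from collections import defaultdict


def spatially_stratified_split(sites, cell_deg=30.0, train_fraction=0.7, seed=0):
    """Draw the train/test split separately inside each lat/lon grid cell."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for site in sites:
        key = (int(site["lat"] // cell_deg), int(site["lon"] // cell_deg))
        cells[key].append(site)

    train, test = [], []
    for key in sorted(cells):
        members = cells[key][:]
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Hypothetical toy data: 40 single-profile sites spread evenly over 4 grid cells.
sites = [
    {"id": k, "lat": 5.0 + 30.0 * (k % 2), "lon": 5.0 + 30.0 * ((k // 2) % 2)}
    for k in range(40)
]
train, test = spatially_stratified_split(sites)
```

Cells that hold only one or two sites cannot be split 70/30, so in practice the cell size has to be chosen so that sparsely sampled regions still contribute to both subsets.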
  You also need to describe better the study on the ESM data. What are the SOC data you mention on line 156? Are those simulated data or measured?
  Response: Thank you for your comments. We have added text in the Methods section of the revised manuscript and provided greater details about both field observations and CMIP6 ESM data that we used in this study.
  References:
  Collier, N., Hoffman, F. M., Lawrence, D. M., Keppel-Aleks, G., Koven, C. D., Riley, W. J., Mu, M., and Randerson, J. T.: The International Land Model Benchmarking (ILAMB) system: design, theory, and implementation, Journal of Advances in Modeling Earth Systems, 10, 2731–2754, https://doi.org/10.1029/2018MS001354, 2018.
  Georgiou, K., Malhotra, A., Wieder, W. R., Ennis, J. H., Hartman, M. D., Sulman, B. N., Berhe, A. A., Grandy, A. S., Kyker-Snowman, E., Lajtha, K., Moore, J. A. M., Pierson, D., and Jackson, R. B.: Divergent controls of soil organic carbon between observations and process-based models, Biogeochemistry, 156, 5–17, https://doi.org/10.1007/S10533-021-00819-2, 2021.
  Luo, Y. Q., et al.: A framework for benchmarking land models, Biogeosciences, 9, 1899–1944, https://doi.org/10.5194/bgd-9-1899-2012, 2012.
  Mishra, U., Yeo, K., Adhikari, A., Riley, W. J., Hoffman, F., Hudson, C., and Gautam, S.: Empirical relationships between environmental factors and soil organic carbon produce comparable prediction accuracy as the machine learning, Soil Science Society of America Journal, 86, 1611–1624, https://doi.org/10.1002/saj2.20453, 2022.
  Mishra, U., et al.: Spatial heterogeneity and environmental predictors of permafrost region soil organic carbon stocks, Science Advances, 7, eaaz5236, https://doi.org/10.1126/sciadv.aaz5236, 2021.
  Mishra, U., Gautam, S., Riley, W. J., and Hoffman, F.: Ensemble machine learning approach better predicts the spatial heterogeneity of surface soil organic carbon stocks in data-limited northern circumpolar region, Frontiers in Big Data, 3, 40, https://doi.org/10.3389/fdata.2020.528441, 2020.
  Sierra, C. A., Trumbore, S. E., Davidson, E. A., Vicca, S., and Janssens, I.: Sensitivity of decomposition rates of soil organic matter with respect to simultaneous changes in temperature and moisture, Journal of Advances in Modeling Earth Systems, 7, 335–356, 2015.
  Todd-Brown, K. E. O., Randerson, J. T., Post, W. M., Hoffman, F. M., Tarnocai, C., Schuur, E. A. G., and Allison, S. D.: Causes of variation in soil carbon simulations from CMIP5 Earth system models and comparison with observations, Biogeosciences, 10, 1717–1736, https://doi.org/10.5194/BG-10-1717-2013, 2013.
  
  Citation: https://doi.org/10.5194/bg-2023-50-AC1
RC2:
'Comment on bg-2023-50', Anonymous Referee #2, 24 Sep 2023

Review for Observational benchmarks inform representation of soil organic carbon dynamics in land surface models, bg-2023-50

General comments
The paper is well structured, easy to follow and interesting. The subject is very important and the results of this paper can be useful for modelling of climate change and carbon cycles. It shows that some processes that are often omitted in the process modelling of the GCMs have an important influence on the soil C stocks, and that the physical properties of the soils have a larger influence on soil C stocks in observations than in the studied models.
Specific comments
Lines 85-86. The upper meter of soil isn’t necessarily the whole profile. Topsoil also isn’t always 30 cm thick, but the thickness depends on the soil formation at the soil profile. It would be better to just use something like “upper 30 cm of soil” and “upper meter of soil” and not use already existing names. Also, don’t you underestimate the carbon stock of wetlands if you only count the uppermost meter or 30 cm, when many peat lands have a larger depth of peat?
Line 118. There are no factors that cover biotic factors like presence or non-presence of certain species that affect soil structure and soil carbon more than others (spruce trees and earthworms for example) and no factors that cover details in management by humans, like prescribed burnings, use of organic or inorganic fertiliser or no fertiliser, irrigation of agricultural soils, presence or no presence of water draining ditches in forests and agricultural land, whole tree removal on clear cuts in forests or removal of only stems, with tops and branches left on site, forests or shrublands used heavily for fire wood collection that removes most dead wood or not. Soil moisture is only included by a drought index, which, I assume, does not capture the average soil moisture but the drier extremes. Nitrogen availability is also not included, even though nitrogen affects decomposition of organic matter and NPP.
Line 180 ff. Is it reasonable to divide land uses in so few classes? The averages in soil C stock are close to each other between the different classes you use? Would other, finer, classes yield similar averages, or would there be forest classes with much higher and much lower average C stocks than the average of all forest? Would the main driving factors be the same for all forest subclasses as they are for the one forest class you are using, or would they be different for e.g. tropical forest and temperate deciduous forest etc. And the same for subclasses of barren land and your other land classes, would the subclasses react in the same way as the large class, even though they might be very different (barren land can be barren for very different reasons for example, and urban land can be mostly concrete and asphalt or mostly gardens, depending on population density and other factors)?

Discussion: Discuss the implications of your model being a statistical model, whereas the GCMs are far more complex process-based models. How can they incorporate your findings? A drought index, for example, is only an index describing the result of the interaction between temperature, precipitation, evaporation and other processes that are already in the GCMs, and it will not be constant when climate is changing. What ways could there be to incorporate it? CEC is also changing with time and soil pH, soil carbon content etc., and must in a process-based model be modelled, together with the important processes related to soil pH, such as base cation concentrations.
Line 288. There is no Figure 5a, it is Figure 6a.
Line 293. There is no Figure 5b. Figure 6e is the figure with temperature effect, but I don’t understand what you mean by eventually reduce soil carbon – the curve is falling already at low temperatures.
Line 297. Refer to Figure 6b?

Figure 6. Some measure of the spread around the average of the observed values would be very interesting. Also, it should be the same y-axis scale on all figures a-f. Figure 6f: The curve of the observed values is rather flat for 0 - 2000 m above sea level and relatively little land has an elevation higher than that. Have you made sure that the data from higher elevations don’t have a disproportionate effect on your model? And that the effect of high elevation on soil carbon isn’t only an effect of exposure to erosion?

Citation: https://doi.org/10.5194/bg-2023-50-RC2
- AC2: 'Reply on RC2', Umakant Mishra, 10 Oct 2023
  
  First, we would like to thank the reviewer for their time and thoughtful critique of this manuscript. We have addressed all the comments and believe that the article has been improved because of the valuable feedback. Please find our point-by-point responses below.
  
  General comments
  The paper is well structured, easy to follow and interesting. The subject is very important and the results of this paper can be useful for modelling of climate change and carbon cycles. It shows that some processes that are often omitted in the process modelling of the GCMs have an important influence on the soil C stocks, and that the physical properties of the soils have a larger influence on soil C stocks in observations than in the studied models.
  Response: We sincerely appreciate the thoughtful and encouraging comments of the reviewer.
  Specific comments
  Lines 85-86. The upper meter of soil isn’t necessarily the whole profile. Topsoil also isn’t always 30 cm thick; its thickness depends on the soil formation at the soil profile. It would be better to just use something like "upper 30 cm of soil" and "upper meter of soil" and not use already existing names. Also, don’t you underestimate the carbon stock of wetlands if you only count the uppermost meter or 30 cm, when many peatlands have a larger depth of peat?
  Response: We thank the reviewer for these suggestions. As suggested, in the revised manuscript, we have modified the text describing soil depths. We agree with the reviewer that depth intervals of 30 cm and 1 m will not account for total peatland SOC stocks, as peatlands store carbon to a much greater depth. However, because we included soil samples from all kinds of soils, we used depth intervals of 30 cm and 1 m, which are often used in the literature.
  Line 118. There are no factors that cover biotic factors like presence or non-presence of certain species that affect soil structure and soil carbon more than others (spruce trees and earthworms for example) and no factors that cover details in management by humans, like prescribed burnings, use of organic or inorganic fertiliser or no fertiliser, irrigation of agricultural soils, presence or no presence of water draining ditches in forests and agricultural land, whole tree removal on clear cuts in forests or removal of only stems, with tops and branches left on site, forests or shrublands used heavily for firewood collection that removes most dead wood or not. Soil moisture is only included by a drought index, which, I assume, does not capture the average soil moisture but the drier extremes. Nitrogen availability is also not included, even though nitrogen affects decomposition of organic matter and NPP.
  Response: We agree with the reviewer that the SOC dynamics of natural and managed ecosystems are different. We also agree that the results may have been different if different environmental predictors had been used. We attempted to conduct a global model benchmarking study such that the findings can inform the current generation of ESMs. In this study, our specific objectives were to (1) identify the dominant environmental controllers of SOC stocks at the global scale in both observations and CMIP6 ESMs, and (2) derive and compare the mathematical relationships between the dominant environmental controllers and SOC stocks in both observations and ESMs. To meet these objectives, we used 46 environmental factors, which cover all the environmental factors that have been used in the current generation of CMIP6 ESMs.
  To appropriately address the reviewer’s concerns we have added a separate paragraph at the end of the Discussion section in the revised manuscript explaining the limitations of our approach. In particular, we mentioned that ecosystem specific (for example croplands and forests) environmental factors should be used in future studies as they may improve the SOC prediction accuracy in observations.
  Line 180 ff. Is it reasonable to divide land uses in so few classes? Are the average soil C stocks close to each other across the classes you use? Would other, finer, classes yield similar averages, or would there be forest classes with much higher and much lower average C stocks than the average of all forest? Would the main driving factors be the same for all forest subclasses as they are for the one forest class you are using, or would they be different for e.g. tropical forest and temperate deciduous forest etc.? And the same for subclasses of barren land and your other land classes: would the subclasses react in the same way as the large class, even though they might be very different (barren land can be barren for very different reasons for example, and urban land can be mostly concrete and asphalt or mostly gardens, depending on population density and other factors)?
  Response: We agree that the dominant environmental controllers and predictive power of ML models will differ if the SOC stock observations are divided using different land cover categories. However, the ML approach is data-intensive and requires a large number of data points to produce stable results. That is why we divided the entire dataset into eight different land cover classes. In order to better address the reviewer’s concerns, we have re-categorized our datasets using global biomes (Olson et al., 2001).
  Discussion: Discuss the implications of your model being a statistical model, whereas the GCMs are far more complex process-based models. How can they incorporate your findings? A drought index, for example, is only an index describing the result of the interaction between temperature, precipitation, evaporation and other processes that are already in the GCMs, and it will not be constant when climate is changing. What ways could there be to incorporate it? CEC is also changing with time and soil pH, soil carbon content etc., and must in a process-based model be modelled, together with the important processes related to soil pH, such as base cation concentrations.
  Response: Thanks for these insightful comments. In response to these comments, we have modified the text in the Discussion section of the manuscript and suggested ways that edaphic controls, which are half of the dominant controllers of SOC in observations, can be incorporated in ESMs. We also mentioned that our mathematical relationships can be used to benchmark ESM results. While our results cannot directly be used to develop model parameterizations, they can: (1) point to categories of functional forms for controllers; (2) inform where effort may best be applied to improve model functional forms (e.g., to the dominant controllers); and (3) inform modelers of where their model may have very different functional forms for emergent relationships than exist in the observations.
  Line 288. There is no Figure 5a, it is Figure 6a.
  Response: Thanks for indicating the error; the figure number is corrected in the revised manuscript.
  Line 293. There is no Figure 5b. Figure 6e is the figure with temperature effect, but I don’t understand what you mean by eventually reduce soil carbon – the curve is falling already at low temperatures.
  Response: Thanks for indicating the error. The manuscript text has been modified to correct the figure number and the description of the observed and modeled trends in SOC with increasing temperature.
  Line 297. Refer to Figure 6b?
  Response: Thank you for finding the error; the figure number is corrected in the revised manuscript.
  Figure 6. Some measure of the spread around the average of the observed values would be very interesting. Also, it should be the same y-axis scale on all figures a-f. Figure 6f: The curve of the observed values is rather flat for 0 - 2000 m above sea level and relatively little land has an elevation higher than that. Have you made sure that the data from higher elevations don’t have a disproportionate effect on your model? And that the effect of high elevation on soil carbon isn’t only an effect of exposure to erosion?
  Response: Thank you for these suggestions. We have made the scale of the Y-axis the same in all figures. We have reanalyzed the relationships between elevation and SOC and updated the figures and text in the revised manuscript.
  
  Reference:
  Olson, D. M. et al. Terrestrial Ecoregions of the World: A New Map of Life on Earth: A new global map of terrestrial ecoregions provides an innovative tool for conserving biodiversity. Bioscience 51, 933-938, doi:10.1641/0006-3568(2001).
  
  Citation: https://doi.org/10.5194/bg-2023-50-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Reconsider after major revisions (17 Nov 2023) by Kirsten Thonicke

AR by Umakant Mishra on behalf of the Authors (11 Feb 2024) Author's response Author's tracked changes

EF by Sarah Buchmann (13 Feb 2024) Manuscript

ED: Referee Nomination & Report Request started (20 Mar 2024) by Kirsten Thonicke

RR by Anonymous Referee #3 (03 Apr 2024)

Suggestions for revision or reasons for rejection

This paper investigates the representation of soil organic carbon (SOC) dynamics in Earth System Models (ESMs) by leveraging a substantial dataset of SOC field observations and employing machine learning models to identify key environmental factors influencing global SOC stocks. The study is rooted in addressing the challenge of accurately predicting carbon climate feedbacks, a crucial aspect of understanding and mitigating climate change impacts. The authors employed two machine learning approaches, Random Forest (RF) and Generalized Additive Modeling (GAM), to analyze a comprehensive dataset comprising 54,000 SOC field observations and geospatial datasets of 46 environmental factors. The objectives were to identify dominant environmental controllers of SOC stocks globally and across biomes, derive functional relationships between these controllers and SOC stocks, and compare these findings with the assumptions and results of current ESMs, particularly those involved in the Coupled Model Intercomparison Project phase six (CMIP6).
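To illustrate how a ranking of "dominant environmental controllers" of this kind can be produced, the following minimal sketch computes permutation importance: each predictor column is shuffled in turn and the resulting increase in prediction error is recorded. This is not the authors' code, and this page does not state which importance measure they used with their Random Forest models. The predictor `toy_predict`, the feature names, and the synthetic data are hypothetical stand-ins for a trained model and real observations.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, names, n_repeats=5, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(targets, [predict(r) for r in rows])
    scores = {}
    for j, name in enumerate(names):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            error = mse(targets, [predict(r) for r in shuffled])
            increases.append(error - baseline)
        scores[name] = sum(increases) / n_repeats
    return scores


# Hypothetical predictors: cation exchange capacity, temperature, and a
# column that the toy model ignores.
names = ["cec", "temperature", "unrelated"]
data_rng = random.Random(42)
rows = [
    (data_rng.uniform(0, 50), data_rng.uniform(-10, 30), data_rng.uniform(0, 1))
    for _ in range(500)
]


def toy_predict(row):
    """Stand-in for a trained model (assumption, not a fitted Random Forest)."""
    cec, temperature, _ = row
    return 0.3 * cec - 0.1 * temperature + 10.0


# Synthetic "observed" SOC: the toy relationship plus Gaussian noise.
targets = [toy_predict(r) + data_rng.gauss(0, 0.5) for r in rows]

scores = permutation_importance(toy_predict, rows, targets, names)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In practice `toy_predict` would be replaced by the prediction function of a fitted model, and the same procedure could be run separately on observations and on ESM output to compare the two rankings.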

Although the study provides critical insights into the environmental controllers of SOC stocks and highlights gaps in the representation of SOC dynamics in current ESMs, I find the manuscript could be improved by:

Major comments:

1) Representation of Deep Soil Carbon. The analysis is limited to the upper 1 m of soil. Considering the significant SOC stocks in deeper soil layers, especially in peatlands, extending the depth range could provide a more comprehensive understanding of SOC dynamics;
2) Comparison with other SOC global products (SoilGrids, Harmonized World Soil Database, WISE, NCSCDv2, etc). See Endsley et al. (2020);
3) Thorough Model intercomparison. The authors cite Collier et al. (2018) but do not use ILAMB as a validation tool. I suggest the authors add their newly created SOC product into ILAMB and perform a validation of the proposed CMIP6 models;
4) Lack of discussion about future work and ways to include the newly discovered impact of environmental variables in ESMs, particularly cation exchange capacity;
5) Lack of references to critical literature. I suggest a more thorough literature review.

Specific comments:

Figures 2 and 3 need a map indicating where those plant functional groups are.

Figure 2: organize the x-axis from largest to smallest SOC.

Figure 4 y-axis should be the same as Figure 3 y-axis.

Line 29-40: See Terrer et al. (2021) and Crowther et al. (2019).

Line 49-50: See Braghiere et al. (2023).

Line 39: The punctuation mark before the citation "(Friedlingstein et al., 2014.; Arora et al., 2020)" seems to be a typo. Consider changing it to "(Friedlingstein et al., 2014; Arora et al., 2020)."

Lines 93-95: The phrase "in this study in both global field observations and ESMs" could be made more concise to improve readability, such as "in this study's global field observations and ESM evaluations."

Line 115: The phrase "we may not be accounting for the total SOC stocks" slightly undermines the study's robustness. It could be strengthened by stating the limitation more firmly and suggesting it as an area for future research.

Lines 117-118: The mixture of passive and active voice ("The 2019 snapshot of the WoSIS dataset contained") might confuse readers. Consider maintaining a consistent voice, preferably active, throughout the document.

Line 161: "off-the-shelf model" might benefit from clarification or a brief explanation, especially for readers less familiar with machine learning terminology.

Lines 198-200: The transition from discussing average SOC stock to variability and standard deviation might be smoother with a brief introductory phrase highlighting the shift in focus.

Lines 295-297: The explanation of cation exchange capacity’s role could be enhanced by briefly discussing its significance in SOC dynamics, making the text more informative for readers not specialized in soil science.

References

Endsley, K. A., Kimball, J. S., Reichle, R. H., & Watts, J. D. (2020). Satellite Monitoring of Global Surface Soil Organic Carbon Dynamics Using the SMAP Level 4 Carbon Product. Journal of Geophysical Research: Biogeosciences, 125(12), e2020JG006100. https://doi.org/10.1029/2020JG006100

Braghiere, R. K., Fisher, J. B., Miner, K. R., Miller, C. E., Worden, J. R., Schimel, D. S., & Frankenberg, C. (2023). Tipping point in North American Arctic-Boreal carbon sink persists in new generation Earth system models despite reduced uncertainty. Environmental Research Letters, 18(2), 025008. https://doi.org/10.1088/1748-9326/acb226

Collier, N., Hoffman, F. M., Lawrence, D. M., Keppel‐Aleks, G., Koven, C. D., Riley, W. J., Mu, M., & Randerson, J. T. (2018). The International Land Model Benchmarking (ILAMB) System: Design, Theory, and Implementation. Journal of Advances in Modeling Earth Systems, 10(11), 2731–2754. https://doi.org/10.1029/2018MS001354

Terrer, C., Phillips, R. P., Hungate, B. A., Rosende, J., Pett-Ridge, J., Craig, M. E., Groenigen, K. J. van, Keenan, T. F., Sulman, B. N., Stocker, B. D., Reich, P. B., Pellegrini, A. F. A., Pendall, E., Zhang, H., Evans, R. D., Carrillo, Y., Fisher, J. B., Sundert, K. van, Vicca, S., & Jackson, R. B. (2021). A trade-off between plant and soil carbon storage under elevated CO2. Nature, 591(7851), 599–603. https://doi.org/10.1038/s41586-021-03306-8

Crowther, T. W., van den Hoogen, J., Wan, J., Mayes, M. A., Keiser, A. D., Mo, L., Averill, C., & Maynard, D. S. (2019). The global soil community and its influence on biogeochemistry. Science, 365(6455). https://doi.org/10.1126/science.aav0550

RR by Anonymous Referee #4 (03 Sep 2024)

Suggestions for revision or reasons for rejection

This paper presents an interesting analysis of global soil organic carbon (SOC), comparing drivers of measured carbon stocks with drivers of predicted carbon stocks from multiple Earth system models (ESMs). It is a valuable contribution to the model-data evaluation literature, exploring the oversimplification of SOC representations within ESMs. Most of the paper is easy to read, but there are some paragraphs (especially in the Discussion) where the flow of the literature review feels a bit stilted and could be improved. There are also some issues with the figure presentations and statistics; resolving them would help communicate the main findings. I have listed these below.

Microbial dynamics are mentioned as a primary link between soil moisture and temperature and SOC dynamics (lines 89-92), but this concept is not tied back in to the discussion of how the three ESMs differ in their SOC representations. Each of the ESMs had contrasting relationships between SOC and precipitation, soil texture, and elevation - but there is no discussion of why underlying ESM model structures might produce different inaccuracies or oversimplifications, and whether these are connected to differences in microbial activity representations. Figure 6 could also be paired with statistics or a summary figure showing the overall performance of SOC predictions from the three ESMs (especially the predictive spread from each model), which might help indicate whether the simplifications in the ESMs translate to overconfidence or underconfidence in SOC predictions.

The authors have not provided the data or code used for the analysis. The datasets that the authors aggregated from WoSIS and Mishra et al. (2021) were processed to estimate bulk density and SOC, and data from the three ESMs were aggregated as well. Sharing the relevant code and processed datasets would increase data accessibility for the broader research community.

Minor comments:

Figure 1 inset is hard to read. I suggest reformatting the x-axis labels and providing a more descriptive figure caption about the data sources and measurement depths.

Figure 2 should have statistics related to the boxplots, since it is used to support the statements about biome differences in SOC made in lines 201-205.

Figures 3 and 4 are initially hard to distinguish: they have the same axis labels, but the Figure 4 caption mentions “strength” of environmental factors. More descriptive captions would be helpful. Figure 3 and Figure 5 could potentially be combined to better demonstrate how the important factors vary between observations and ESM predictions.

Figure 4: The figure legend labels are covering part of the data. This figure is hard to read with the 8 biomes plotted as lines for each variable. Reformatting the figure to give each variable more space (or faceting the figure by biome) would improve the figure’s interpretability.

Figure 6: The black line, labeled “observations”, seems instead to be a fitted or smoothed line (actual observations should be points, or distributions that show the spread of the data, as a previous reviewer also noted). There also seem to be errors in the labels of the insets, where some of the fitted lines are labeled blue instead of red.

Typo line 161 (“model” should be “models”)

ED: Publish as is (25 Sep 2024) by Kirsten Thonicke

AR by Umakant Mishra on behalf of the Authors (26 Sep 2024) Manuscript

Short summary

Representing soil organic carbon (SOC) dynamics in Earth system models (ESMs) is a key source of uncertainty in predicting carbon–climate feedbacks. Using machine learning, we develop and compare predictive relationships in observations (Obs) and ESMs. We find different relationships between environmental factors and SOC stocks in Obs and ESMs. SOC prediction in ESMs may be improved by representing the functional relationships of environmental controllers in a way consistent with observations.
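The comparison of functional relationships described in this summary can be illustrated with a small, hedged sketch: SOC is averaged within bins of one environmental driver for both an observed and a modeled dataset, and the two binned curves are then compared by root-mean-square difference. This is an assumption-laden illustration, not the authors' method (the discussion above mentions Random Forest and GAM fits). The driver, the bin edges, and both synthetic datasets are hypothetical.

```python
import math
import random
from statistics import mean


def binned_response(driver, soc, edges):
    """Mean SOC within each driver bin; None where a bin is empty."""
    bins = [[] for _ in range(len(edges) - 1)]
    for x, y in zip(driver, soc):
        for i in range(len(edges) - 1):
            if edges[i] <= x < edges[i + 1]:
                bins[i].append(y)
                break
    return [mean(b) if b else None for b in bins]


def curve_rmse(curve_a, curve_b):
    """Root-mean-square difference over bins populated in both curves."""
    pairs = [(a, b) for a, b in zip(curve_a, curve_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no bins in common")
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))


rng = random.Random(7)
temperature = [rng.uniform(-10, 30) for _ in range(1000)]  # hypothetical driver

# Hypothetical "observed" SOC: declines nonlinearly with temperature.
soc_obs = [20.0 * math.exp(-0.05 * t) + rng.gauss(0, 1.0) for t in temperature]
# Hypothetical "modeled" SOC: a simpler linear decline.
soc_esm = [15.0 - 0.2 * t + rng.gauss(0, 1.0) for t in temperature]

edges = [-10, 0, 10, 20, 30]
obs_curve = binned_response(temperature, soc_obs, edges)
esm_curve = binned_response(temperature, soc_esm, edges)
mismatch = curve_rmse(obs_curve, esm_curve)
print(obs_curve, esm_curve, mismatch)
```

A large mismatch for a given driver would flag a relationship where the modeled functional form departs from the observed one. Reporting the spread within each bin, as the referees requested for Figure 6, would be a natural extension.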