A differentiable ecosystem modeling framework for large-scale inverse problems: demonstration with photosynthesis simulations
Abstract. Photosynthesis plays an important role in carbon, nitrogen, and water cycles. Ecosystem models for photosynthesis are characterized by many parameters that are obtained from limited in-situ measurements and applied uniformly across the same plant types. Previous site-by-site calibration approaches could not leverage big data and faced issues like overfitting or parameter non-uniqueness. Here we developed a programmatically differentiable (meaning gradients of the outputs with respect to the variables used in the model can be obtained efficiently and accurately) version of the photosynthesis process representation within the Functionally Assembled Terrestrial Ecosystem Simulator (FATES) model. This model is coupled to neural networks that learn the parameterization from observations of photosynthesis rates. We first demonstrated that the framework was able to recover multiple assumed parameter values concurrently using synthetic training data. Then, using a real-world dataset consisting of many different plant functional types, we learned parameters that performed substantially better and dramatically reduced biases compared to literature values. Further, the framework allowed us to gain insights at a large scale. Our results showed that the carboxylation rate at 25 °C (Vc,max25) was more impactful than a factor representing water limitation, although tuning both was helpful in addressing biases with the default values. This framework could potentially enable a substantial improvement in our capability to learn parameters and reduce biases for ecosystem modeling at large scales.
Doaa Aboelyazeed et al.
Status: final response (author comments only)
RC1: 'Comment on bg-2022-211', Anonymous Referee #1, 14 Dec 2022
- AC2: 'Reply on RC1', Chaopeng Shen, 26 Jan 2023
  - AC4: 'Reply on AC2', Chaopeng Shen, 31 Jan 2023
RC2: 'Comment on bg-2022-211', Anonymous Referee #2, 23 Jan 2023
- AC1: 'Reply on RC2', Chaopeng Shen, 23 Jan 2023
- AC3: 'Reply on RC2', Chaopeng Shen, 31 Jan 2023
  - RC3: 'Reply on AC3', Anonymous Referee #2, 01 Feb 2023
    - AC5: 'Reply on RC3', Chaopeng Shen, 01 Feb 2023
    - AC6: 'Reply on RC3', Chaopeng Shen, 08 Feb 2023
This paper presents a nice example of combining theory-based models and machine learning to efficiently identify parameters of an ecosystem model, exploiting observation data recorded at multiple sites. The approach is valid and the results are interesting. However, the documentation of data and methods is currently deficient to a degree that makes it hard to grasp the main messages and interpret the results. Section 2 of the paper does in my eyes require a thorough revision, including new explanatory figures, restructuring, and replacement of text blocks. For this reason I recommend a major revision or rejection with an invitation to resubmit.
1. I assume a key point of the developed framework is that it enables direct backpropagation from the outputs through the model equations to the neural networks. This is not clear from the paper at all. Much of the framework description reads as if you feed NN predictions of parameters through a black-box physics-based model, which is a standard approach. I suggest a dedicated subsection, possibly including a figure, to clarify this detail. A minimal code sketch of the distinction is given below.
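To make the point concrete, the end-to-end gradient flow could be illustrated with something as simple as the following sketch (hypothetical PyTorch code, not the authors' implementation; the toy physics function, network sizes, and all variable names are placeholders):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the FATES photosynthesis equations, written
# inside the autodiff framework so gradients flow through it -- unlike a
# black-box external model that only receives NN-predicted parameters.
def photosynthesis_step(vcmax25, forcing):
    return vcmax25 * torch.tanh(forcing)  # toy placeholder equation

# Neural network mapping site attributes to a physical parameter.
param_net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

attributes = torch.randn(64, 8)   # per-site predictors (hypothetical)
forcing = torch.randn(64, 1)      # meteorological forcing (hypothetical)
observed = torch.randn(64, 1)     # observed photosynthesis rates

vcmax25 = param_net(attributes)                    # NN-predicted parameter
simulated = photosynthesis_step(vcmax25, forcing)  # differentiable model run
loss = ((simulated - observed) ** 2).mean()
loss.backward()  # gradients propagate through the model equations back
                 # into the network weights -- the key point at issue
```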
2. The datasets used for training and testing are not properly documented. We do not know how many data points are included or over which time periods. The random holdout suddenly appears in the results, and in general we do not know how the training/validation/testing splits are defined. The CLM4.5 standard parameters play a central role in the results, but we know nothing about where they come from, how they are defined, and whether, for example, all or only a subset of the values are used for comparison.
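Documenting the split could be as simple as stating something equivalent to the following (a hypothetical sketch; the fractions, seed, and dataset size are placeholders, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # fixed seed for reproducibility
n_points = 1000                        # total number of data points (placeholder)
idx = rng.permutation(n_points)        # random shuffle of all indices
n_test = int(0.2 * n_points)           # e.g., a 20% random holdout
test_idx, train_idx = idx[:n_test], idx[n_test:]
```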
3. The explanation of the ecosystem model suffers from a clear struggle between trying not to include the entire set of equations in the paper and providing sufficient detail. For me the level of detail provided in the paper was actually confusing, because it required constant looking up in the appendix to understand the context, distracting from the main messages. A way out could be to include a figure that summarizes the main blocks of the model (including which parts correspond to f1 and f2), include only the changed equations in the paper, and otherwise keep the full model description in the appendix. On a side note: is f2 not the same as an observation equation, as commonly used in state-space models?
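For reference, the standard state-space form the side note alludes to (generic textbook notation, not taken from the paper under review):

```latex
% x_t: model states, u_t: forcings, y_t: observations, \theta: parameters
\begin{aligned}
  x_{t+1} &= f_1(x_t, u_t; \theta) && \text{(process / state equation)} \\
  y_t     &= f_2(x_t; \theta)      && \text{(observation equation)}
\end{aligned}
```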
4. Details on hyperparameters (number of neural network layers, activation functions, learning rates, etc.) are not provided at all. The key information should be given in the paper, with a reference to the supporting information or the code for further details.
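The minimum I would expect is a compact summary along these lines (all values below are placeholders for illustration, not the authors' settings):

```python
# Hypothetical example of the hyperparameter documentation requested above.
hyperparameters = {
    "hidden_layers": 2,        # depth of the parameter network
    "hidden_units": 128,       # width of each hidden layer
    "activation": "ReLU",      # nonlinearity
    "optimizer": "Adam",       # optimization algorithm
    "learning_rate": 1e-3,     # step size
    "batch_size": 256,         # minibatch size
    "epochs": 300,             # training duration
}
```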
line 61: non-uniqueness will also be a problem if we employ newer frameworks like PINNs or dPL
line 110: it might be worthwhile to start with a reference to figure 1 and a down-to-earth explanation of the objective of your work, i.e., to calibrate model parameters across many sites, to capture the variation of parameters using neural networks, and to employ differentiable programming to speed up the identification process
line 118: please explain PFT again in this section
line 140: If you preserve eqs. 4 and 5 in the paper, I think they should be presented in reverse order (f1 first, f2 second)
lines 146-164: please include only methodological descriptions that are relevant to the results. If the Julia implementation was not used, it should not be described and discussed
line 183: you do not describe anywhere in your data section how many PFTs you consider. It is therefore also unclear here how many dummy variables this model receives as input (see the sketch below).
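For instance, the number of dummy inputs follows directly from the PFT count via one-hot encoding (a hypothetical illustration; the PFT count and labels are placeholders, since the paper does not document them):

```python
import torch
import torch.nn.functional as F

# With e.g. 10 PFTs, the categorical PFT label becomes 10 dummy
# (one-hot) input variables to the parameter network.
n_pfts = 10                                    # placeholder; not stated in the paper
pft_labels = torch.tensor([0, 3, 7])           # example site PFT indices
pft_dummies = F.one_hot(pft_labels, num_classes=n_pfts).float()
print(pft_dummies.shape)                       # torch.Size([3, 10])
```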
lines 190-205: I think this information is not needed to understand the main message
eq. 10: why is psi_max replaced by psi_0? (missing explanation)
eq. 11: what is F_om?
line 232: the CLM4.5 data points should be documented in a dedicated data section. In general, I suggest that you separate the description of data and experiments
line 239: were all calculations performed only for the topsoil layer in all experiments?
Table 1: missing symbol explanations for means and standard deviations
line 383: please include time series for observations and model predictions
fig. 5: the symbols in the legend cannot be distinguished. Are the results shown for the test dataset?
line 426: I would add that you have identified parameter values that are optimized for the considered set of model equations and forcings, both of which have limitations: the equations may be wrong, ERA5 is rather uncertain, and measurement principles can vary between stations. This is both a limitation and a strength of your framework. The parameter values will not be transferable to other inputs, but on the other hand you can obtain optimized predictions for the given set of forcings.