the Creative Commons Attribution 4.0 License.
A differentiable ecosystem modeling framework for large-scale inverse problems: demonstration with photosynthesis simulations
Doaa Aboelyazeed
Chonggang Xu
Forrest M. Hoffman
Alex W. Jones
Chris Rackauckas
Kathryn E. Lawson
Abstract. Photosynthesis plays an important role in carbon, nitrogen, and water cycles. Ecosystem models for photosynthesis are characterized by many parameters that are obtained from limited in-situ measurements and applied to the same plant types. Previous site-by-site calibration approaches could not leverage big data and faced issues like overfitting or parameter non-uniqueness. Here we developed a programmatically differentiable (meaning gradients of outputs with respect to variables used in the model can be obtained efficiently and accurately) version of the photosynthesis process representation within the Functionally Assembled Terrestrial Ecosystem Simulator (FATES) model. This model is coupled to neural networks that learn parameterization from observations of photosynthesis rates. We first demonstrated that the framework was able to recover multiple assumed parameter values concurrently using synthetic training data. Then, using a real-world dataset consisting of many different plant functional types, we learned parameters that performed substantially better and dramatically reduced biases compared to literature values. Further, the framework allowed us to gain insights at a large scale. Our results showed that the carboxylation rate at 25 °C (Vc,max25) was more impactful than a factor representing water limitation, although tuning both was helpful in addressing biases with the default values. This framework could potentially enable a substantial improvement in our capability to learn parameters and reduce biases for ecosystem modeling at large scales.
Status: final response (author comments only)
RC1: 'Comment on bg-2022-211', Anonymous Referee #1, 14 Dec 2022
This paper presents a nice example of combining theory-based models and machine learning to efficiently identify parameters of an ecosystem model, exploiting observation data recorded at multiple sites. The approach is valid and the results are interesting. However, the documentation of data and methods is currently deficient on a level that makes it hard to grasp the main messages and interpret the results. Section 2 of the paper does in my eyes require a thorough revision, including new explanatory figures, restructuring, and replacement of text blocks. For this reason I recommend a major revision or rejection with an invitation to resubmit.
Major comments
1. I assume a key point of the developed framework is that it enables one to backpropagate directly from the outputs through the model equations to the neural networks. This is not clear from the paper at all. Much of the framework description reads as if you feed NN predictions of parameters through a black-box physics-based model, which is a standard approach. I suggest a dedicated subsection, possibly including a figure, to clarify this detail.
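To make the distinction concrete, here is a minimal, dependency-free sketch of the end-to-end gradient flow the reviewer is asking about (our own toy illustration, not the authors' code): a one-weight "network" p = w·a predicts a physical parameter, the parameter enters a stand-in process model f, and the loss gradient is chain-ruled through f back to the network weight, rather than treating f as a black box.

```python
# Toy differentiable-parameter-learning sketch (illustrative only).
# "Network": p = w * a (weight w, site attribute a).
# Process model: f(p, x) = p * x / (x + 1), forcing x.
# Loss: L = (f - y)^2 against an observation y.

def model(p, x):
    # stand-in for the differentiable process model
    return p * x / (x + 1.0)

def loss_and_grad(w, a, x, y):
    p = w * a                      # network predicts the parameter
    f = model(p, x)                # parameter enters the process model
    L = (f - y) ** 2
    # chain rule THROUGH the model, back to the network weight:
    dL_df = 2.0 * (f - y)
    df_dp = x / (x + 1.0)
    dp_dw = a
    return L, dL_df * df_dp * dp_dw

w, a, x, y = 0.7, 2.0, 3.0, 1.0
L, g = loss_and_grad(w, a, x, y)

# sanity check: chain-rule gradient matches a finite difference
eps = 1e-6
L2, _ = loss_and_grad(w + eps, a, x, y)
assert abs(g - (L2 - L) / eps) < 1e-4
```

In the paper's actual setting the analytic chain rule is replaced by automatic differentiation (PyTorch/Julia), but the structure is the same: gradients of the loss reach the NN weights only by passing through the process-model equations.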
2. The datasets used for training and testing are not properly documented. We don't know how many datapoints are included over which time periods. The random holdout suddenly appears in the results, and in general we don't know how training/validation/testing splits are defined. CLM4.5 standard parameters play a central role in the results, but we know nothing about where they come from / how they are defined and if, for example, all or a subset of values are used for comparison.
3. The explanation of the ecosystem model suffers from a clear struggle between trying not to include the entire set of equations in the paper, while providing sufficient detail. For me the level of detail provided in the paper was actually confusing, because it required constant looking up in the appendix to understand the context, distracting from the main messages. I think a way out could be to include a figure that summarizes the main blocks of the model (including what parts correspond to f1 and f2), include only the changed equations in the paper, and otherwise keep the full model description in the appendix. On a sidenote: is f2 not the same as an observation equation, that is commonly used in state space models?
4. Details on hyperparameters (neural network # of layers, activation functions, learning rates, etc.) are not provided at all. Some key information should be provided in the paper, and a reference to supporting information or the code should be provided for details.
Detailed comments
line 61: nonuniqueness is also going to be a problem if we employ newer frameworks like PINNs or dPL
line 110: it might be worthwhile to start with a reference to figure 1 and a down to earth explanation of the objective of your work, i.e. to calibrate model parameters across many sites, to capture the variation of parameters using neural networks, and to employ differentiable programming to speed up the identification process
line 118: please explain PFT again in this section
line 140: If you preserve eq. 4 and 5 in the paper, I think they should be presented in reverse order (f1 first, f2 second)
line 146-164: please include only methodological descriptions that are relevant for the results. If the Julia implementation was not used, then it should not be described and discussed
line 183: you don't describe anywhere in your data how many PFTs you consider. It is therefore also unclear here how many dummy variables this model receives as input.
line 190-205: I think this information is not needed to understand the main message
eq. 10: why is psi_max replaced by psi_0? (missing explanation)
eq. 11: what is F_om?
line 232: the CLM4.5 data points should be documented in a dedicated data section. In general, I suggest that you separate the description of data and experiments
line 239: were all calculations performed only for the topsoil layer in all experiments?
Table 1: missing symbol explanations for means and standard deviations
line 383: please include time series for observations and model predictions
fig. 5: symbols in legend cannot be distinguished. are results shown for the test dataset?
line 426: I would add that you have identified parameter values that are optimized for the considered set of model equations and forcings. Both of these have limitations: equations may be wrong, ERA5 is rather uncertain, and measurement principles can vary between stations. This is both a limitation and a strength of your framework. Parameter values will not be transferable to other inputs. On the other hand, you can obtain optimized predictions for the given set of forcings.
Citation: https://doi.org/10.5194/bg-2022-211-RC1
AC2: 'Reply on RC1', Chaopeng Shen, 26 Jan 2023
Dear Reviewer #1,
Thank you for your constructive criticism. Please see the attached PDF for detailed responses. We have actually run new cross validation and we will show you some new figures. We have also made some clarifications in response to your comments.
AC4: 'Reply on AC2', Chaopeng Shen, 31 Jan 2023
Dear Reviewer:
A small correction. Toward the end of our previous response, we attached figures coming from random holdout with replacements. In the attached PDF, we present figures from cross validation in which no data points were repeatedly selected as test points.
RC2: 'Comment on bg-2022-211', Anonymous Referee #2, 23 Jan 2023
The authors of the manuscript ‘A differentiable ecosystem modeling framework for large-scale inverse problems: demonstration with photosynthesis simulations’ describe the application of the ‘differentiable parameter learning’(dPL) framework to the photosynthesis module of FATES model. The framework, and concept, overcomes extrapolation limitations from site-by-site calibration approaches and allows leveraging information content in large-scale datasets towards a global parameterization of photosynthesis models. Neither the concept (Tsai et al., Nature Communications, https://www.nature.com/articles/s41467-021-26107-z, 2021; Bao et al., Authorea, https://www.authorea.com/doi/full/10.1002/essoar.10512186.3, 2022) nor the dPL framework (Tsai et al., 2021; Feng et al., 2022ab) are new. However, the framework is used in the FATES model for the first time and the results would be of interest for further model development, but also to the scientific community at large.
At this point, the experiment focuses on inverting two parameters, Vcmax25 and B, resulting in that the accuracy of the simulated net photosynthesis rate being slightly improved. The main concerns at this stage relate to apparently incorrect formulations of some key equations, to issues about the validation strategy, to the fact that the forcing data and the experiments are not described sufficiently, challenging the acceptance of the study, while hampering any reproducibility efforts. Please see below for details.
Major comments:
- Two key equations are incorrect in the paper:
1) line 140: equation 5, Ci=Ca-An*Patm*(1.4gs+1.6gb)/(gs+gb);
2) line 505: equation A1, Ac=Vcmax*(Ci-Γ*)/(Ci+Kc*(1+Ko/Oi)).
According to the user guide of the FATES model (https://fates-users-guide.readthedocs.io/projects/tech-doc/en/latest/fates_tech_note.html#fundamental-photosynthetic-physiology-theory), the equations should be:
1) Ci=Ca-An*Patm*(1.4gs+1.6gb)/(gs*gb);
2) Ac=Vcmax*(Ci-Γ*)/(Ci+Kc*(1+Oi/Ko)).
Since the FATES model is reimplemented in Julia and PyTorch by the authors, the code might also be wrong. If so, the unit of Ci will be incorrect, leading to errors in the inversion of Vcmax25 and B. The wrong computation of the effective Michaelis-Menten coefficient (=Kc*(1+Oi/Ko)) might only have a slight effect if the temperature is close to 25°C, but is a concern if the temperature is very low or high (and I do see some points with low leaf temperature in the 'Lin15' database). Thus, I have doubts about the current results and relevant analysis.
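For reference, a small numeric sketch of the corrected relations quoted from the FATES user guide (our own illustrative Python with placeholder magnitudes, not the authors' reimplementation). Note that (1.4gs+1.6gb)/(gs·gb) is algebraically the familiar series-conductance form 1.4/gb + 1.6/gs for the boundary-layer and stomatal diffusion pathways.

```python
# Corrected equations as quoted by the reviewer (illustrative sketch only):
#   Ci = Ca - An*Patm*(1.4*gs + 1.6*gb)/(gs*gb)
#   Ac = Vcmax*(Ci - Gamma_star)/(Ci + Kc*(1 + Oi/Ko))

def intercellular_co2(ca, an, patm, gs, gb):
    # (1.4*gs + 1.6*gb)/(gs*gb) == 1.4/gb + 1.6/gs
    return ca - an * patm * (1.4 * gs + 1.6 * gb) / (gs * gb)

def rubisco_limited_rate(vcmax, ci, gamma_star, kc, ko, oi):
    # Rubisco-limited assimilation with the effective
    # Michaelis-Menten coefficient Kc*(1 + Oi/Ko)
    return vcmax * (ci - gamma_star) / (ci + kc * (1.0 + oi / ko))

# dimensionless placeholder values, chosen only so the numbers are sensible
ci = intercellular_co2(ca=40.0, an=1.0, patm=1.0, gs=0.2, gb=2.0)
ac = rubisco_limited_rate(vcmax=60.0, ci=ci, gamma_star=4.0,
                          kc=30.0, ko=30000.0, oi=21000.0)
assert ci < 40.0 and 0.0 < ac < 60.0
```

With the erroneous denominator (gs+gb) the conductance term would no longer have units of a resistance, which is the unit inconsistency the reviewer points out.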
- As all the results are validated only once using the temporal holdout data or the random holdout data, the generalizability of the dPL (or NNB+NNv) is not clear. If the N-fold or leave-one-out cross-validation can be adopted, the statistical metrics can be more justifiable to reflect the model performance.
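The N-fold scheme requested above only needs careful index bookkeeping so that every point is held out exactly once; a minimal sketch (our own, no ML library assumed):

```python
# Minimal k-fold split: every data point appears in exactly one test fold,
# so no test point is reused across folds (unlike random holdout with
# replacement).

def kfold_indices(n, k):
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, test))
    return splits

splits = kfold_indices(10, 5)
covered = sorted(t for _, test in splits for t in test)
assert covered == list(range(10))                      # each point tested once
assert all(set(tr).isdisjoint(te) for tr, te in splits)  # no leakage
```

Metrics are then aggregated over the k held-out folds, which is exactly the scheme the authors adopt in their later replies.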
- The forcing variables and parameters are not clearly differentiated in the paper. For example, is the leaf layer boundary conductance, gb, a constant parameter across sites or a temporally changing variable? If it is a forcing variable for FATES, where is gb from? is θice a forcing variable or a parameter correlated with temperature and θliq? Is the Ca a constant value or variable? The model would be different if the spatial and temporal variability of all these factors are considered. If all these are parameters (i.e., scalars), what are the values?
- Line 216-218: the reason for replacing saturated soil matric potential (Ψsat) with soil matric potential for closed stomata (Ψc) is not explained. Equation 10 shows that the Ψsat is replaced with soil matric potential for open stomata(Ψo), not Ψc. Furthermore, the Ψi was still calculated using Ψsat in Appendix A (equations A16-A18). I’m confused about which variable was used to calculate Ψi.
- Line 218-220: is NNB used to predict Bi or Ψi? B depends on only %clay and Fom according to equations A22-A23, while the authors add %sand, which is related to Ψsat and, therefore, Ψi. I didn’t find a direct relationship between Bi and %sand according to the original equations in the FATES model. If NNB is used to predict Ψi, I think the equation can be Ψi=θliq*NNB(%sand,%clay,PFT,Fom,T), where T represents the factors controlling θice, e.g., temperature.
- I think the neural networks (NNB and NNv) need constraints on Vcmax25 and Ψi. Although the authors declared that the predicted Vcmax25 without any constraints is within a rational range similar to the literature and measurement, the range of the predicted B is not discussed. If the predicted Bi is very large at some point, Ψi can be much higher than Ψo, leading to wi being higher than 1 (i.e., exceeding the range defined in equation A15). Besides, the Vcmax25 is possibly to be inappropriate without any physical constraints at sites not considered in this study.
- Line 235-236: ‘we tested retrieving both Vcmax25 and B, the latter of which varies spatially and temporally.’
If B varies temporally, it should be clarified how the training data is partitioned and how the ‘random holdout test’ is done. For example, is B changing per year or every N years? how many years/points per site are used to estimate B? Do the training points have to be in sequence or not?
- Line 238-239: ‘For simplicity, the computations of B, Ψi, wi, βt were performed for the top soil layer only.’
In the synthetic experiment, only the top soil layer is considered. However, ‘B, Ψi, wi‘ for the other layers are not clarified (=zero or default values in CLM?). Are the other soil layers considered in the real data experiment? If yes, how many ‘B’ was estimated (i.e., how many soil layers and how many temporally changing Bi)? If not, wi can only represent the water availability at the top layer. The βt is equal to wi and the root distribution, ri, at the top layer. What is ri at the top layer (soil depth=0cm according to line 306)?
- Line 239: ‘To retrieve B, we used NNB but exclude the PFT term.’
I think it is not proper if the PFT is excluded from the training but included in the equation. If PFT is excluded, the term should be removed from equation 11. The sentence at line 222 ‘… along with categorical inputs (PFT), we used…’ should be rephrased.
- Line 245: ‘The model passing the test of the synthetic case was then applied to a real dataset…’
The same NN was used for synthetic data and real data, but the NN information (layers, neurons, activation functions) is not clear. As real data is much more complex, using a different NN structure from the synthetic test might have better performance.
- Line 266-267: the loss function is very significant to evaluate the NN, but not explained in the paper. Without the equation of the loss function or the NN information, the dPL framework cannot be assessed by others, in other words, the experiment cannot be repeated. I think this doesn’t fulfil the requirement of Biogeosciences: ‘Is the description of experiments and calculations sufficiently complete and precise to allow their reproduction by fellow scientists (traceability of results)?’.
- Line 268-272: the authors ‘hope to identify parameters that can generalize well in space’, so I think the readers would wonder if the parameters are estimated per site or per PFT. If parameters are estimated per site, how are they aggregated to parameters per PFT in figure 3a and 4a? If estimated per PFT, I’m afraid the spatial variability of the parameters is not fully captured by dPL.
- Line 292-302: the sources of the soil moisture, stomatal conductance, meteorological forcings and the soil properties are mentioned, but the sources of Ca, gb and Patm are not clear.
- The data source of 'Lin15' was not specified. I found a database at Lin et al., 2015, but didn't find the dates information on lines 296-300.
- Line 304-305: the soil organic carbon content is collected, but the unit is not explained. Does the unit need to be transferred to get the soil organic matter fraction?
- Line 410-411: the authors claim that the predicted Vcmax25 ‘were well correlated with’ literature values. However, the correlation coefficient or determination coefficient was never stated in the paper. Too few points are displayed in figure 6b, and the distribution pattern of only four PFT types (crop R, C3 grass, NET Boreal and BDS temperate) is similar to CLM.
- Line 431-432: I cannot identify the C3 grass at the lower left corner of figure 5b. Maybe a violin plot per PFT can be helpful to show the difference between optimizing B or not for a specific plant type. The figures in the paper only show the net photosynthesis rate across all sites. However, the site-level comparison might be more meaningful to assess the four parameterization strategies: Vdef+B, Vdef+Bdef, V+Bdef, and V+B.
- Line 445-450: I didn’t see any significant correlation between the estimated Vcmax25 and the PFT-mean from TRY database or other model default values. The authors should provide the scatter plots and the correlation coefficients between the Vcmax values to conclude that the dPL can get parameters correlated with literature values (line 490).
Minor comments:
- Line 123: the right part looks very similar to the middle part in equation 3, but the subscript ‘W’ beside ‘argmin’ is not explained. As I understand, the ‘argmin’ in the right part is the same as the ‘argmin’ in the middle part.
- Line 142: The short name for CO2 partial pressure at the leaf surface is ‘Ca’, but is ‘Cs’ in the appendix. Please use a uniform short name across the paper.
- Line 187: equation 11 is cited at line 187 for the first time, but the full equation is placed at line 218. The equation should appear close to the first citation.
- Line 193: does ‘i’ represent the soil layer number? I didn’t see the explanation around the equation.
- Line 197: ‘across different soil different layers’ should be ‘across different soil layers’.
- Line 203/equation 9: the second line should be Ti≤Tf-2 ‘or’ θliq≤0.
- Line 205: the short name for the physical parameter at the second blue area should be θ but not θc.
- Line 218/equation 11: B and Fom should have a subscript, i. Bi=NNBi(%sand,%clay,PFT,Fom,i).
- Line 222: the ‘one-hot embedding’ was already stated at line 183. The definition should be explained where it is mentioned for the first time.
- Line 228: the short name for ‘differentiable learning framework’ is defined but not used.
- Line 310/Figure 2: the full names of the land cover types (e.g., BET tropical) are not explained before or around the figure.
- Line 349: table 2 is mentioned for the first time, but the full table is placed after two pages.
- Line 384: the CO2 should be CO2(subscript).
- Line 390/figure 5: I cannot understand the titles of the subplots. What is the meaning of ‘learning B’ and ‘learning Vcmax25’? The B is not optimized in figure 5a.
- Line 514/equation A7: the Cs is not used.
- Line 520-530: the three functions, Φ1, Φ2, and Φ3, need to be clarified.
- Appendix: the citations of equations are wrong (e.g, lines 503-504, 512, 520, 534…). The equations should be cited using A1-A23.
Citation: https://doi.org/10.5194/bg-2022-211-RC2
AC1: 'Reply on RC2', Chaopeng Shen, 23 Jan 2023
Dear Reviewer,
Thank you for your comments! We are preparing a fuller response to your comment, but here are some rapid responses to some comments. Please see the attached PDF.
Essentially, the equations in the manuscript had typos, but they were correctly implemented in the code. The code was correct as we compared carefully against the Fortran code in these subroutines as we developed the differentiable versions of the code. We will be publishing the code as the paper gets closer to acceptance so this can be examined in the code. We apologize for the errors in the manuscript.
Thanks for pointing out a cross validation test, and we will run it. We hypothesize that a randomly-heldout validation test will show better metrics than the temporal test reported in the paper, which is a more stringent test.
Again, please see the attached PDF and we will give you the rest of the replies promptly.
AC3: 'Reply on RC2', Chaopeng Shen, 31 Jan 2023
We thank you for your detailed comments! Wow, this ends up being a 24-page response -- please see the attachment.
As a summary, it seems most of the questions seek clarifications and details about the model. Those comments can help us elucidate the model better. We did not find major comments that require computational experiments or major reorganization. There is a question about cross validation, which we have already run. It shows expected and essentially similar results. Moreover, some metrics were requested and we calculated them and reported them in the responses.
We indeed followed our previous differentiable parameter learning paradigm which was first applied in hydrology (Tsai et al., 2021; Feng et al., 2022), as noted in the manuscript, but this is a novel use in the large domain of ecosystem modeling, which is a very large field of study. The system is also different as here we have a nonlinear system of equations while in hydrologic cases we have ordinary differential equations. The mathematical treatment was different. The Julia software solves the system using adjoint solvers, although it is a relatively minor point as we mainly used the PyTorch version for its high parallel efficiency.
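As a toy illustration of the adjoint/implicit-differentiation idea mentioned above (our own sketch, not the paper's Julia code): for a scalar system F(x, θ) = x² − θ = 0 solved by Newton iteration, the implicit function theorem gives the solution sensitivity dx/dθ = −(∂F/∂x)⁻¹ ∂F/∂θ without differentiating through the solver's iterations.

```python
import math

def solve(theta, x0=1.0, iters=30):
    # Newton iterations for F(x, theta) = x**2 - theta = 0
    x = x0
    for _ in range(iters):
        x = x - (x * x - theta) / (2.0 * x)
    return x

def implicit_grad(theta):
    # implicit-function/adjoint rule at the converged root:
    # dx/dtheta = -(dF/dx)^(-1) * dF/dtheta = 1 / (2*x)
    x = solve(theta)
    dF_dx = 2.0 * x
    dF_dtheta = -1.0
    return -dF_dtheta / dF_dx

theta = 4.0
assert abs(solve(theta) - 2.0) < 1e-10
# analytic check: x = sqrt(theta), so dx/dtheta = 1/(2*sqrt(theta))
assert abs(implicit_grad(theta) - 1.0 / (2.0 * math.sqrt(theta))) < 1e-8
```

This is what distinguishes differentiating a nonlinear system of equations (as here) from differentiating through time-stepped ODEs as in the earlier hydrologic applications.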
We could not have noticed Bao et al., 2022 as it went online after our manuscript did and seems to be undergoing review. Upon some examination, we believe the basic modules are very different. They are using a light-use-efficiency approach and predicted GPP, while our paper focused on photosynthesis using a Farquhar-type model. Hence we don’t think there is much overlap between the two.
RC3: 'Reply on AC3', Anonymous Referee #2, 01 Feb 2023
The authors’ reply is timely and clarifies most of the questions and confusion. I’m looking forward to reading the new version of this manuscript!
According to the new information provided by the authors, the manuscript presented how well the net photosynthesis can be simulated using two parameters (Vcmax25 and Bi) predicted via a simple MLP neural network (one hidden layer) with a few attributes (PFT, %sand, %clay and Fom). After reading the authors’ reply, I still have the following concerns and comments:
- There are limited site-level temporal data, thus the seasonality of net photosynthesis cannot be assessed.
- The violin plots showed the net photosynthesis per PFT, but I think readers would be more interested in how different is the simulated net photosynthesis from the measured net photosynthesis. Maybe the fourth violin plot (measured values) can be added on the right side and the NSE can be displayed at the top. Moreover, I think only the test dataset (or better cross-validated dataset) should be compared with the measured values (e.g., Fig. 5) and used to make the violin plots.
- The authors clarified that the Vcmax25 is predicted per PFT but did not mention Bi. Is Bi predicted per PFT or per site? How is the predicted Bi compared with values from CLM?
- Since the site-level comparison and the site-average An comparison are not possible, the generalizability cannot be evaluated. However, the model performance across sites can be compared to other papers using the Farquhar model (e.g., Fig 1B of Chen at al., PNAS, https://doi.org/10.1073/pnas.2115627119, 2022).
Citation: https://doi.org/10.5194/bg-2022-211-RC3
AC5: 'Reply on RC3', Chaopeng Shen, 01 Feb 2023
Thank you for your quick comments. Since the interactive system is about to close, we can only add a quick reply here:
1. We did not claim the model can simulate seasonality very well at a site, but we do see that the model captures the average behavior quite well. We will clarify this as a limitation in the revised manuscript. This is a question for the next iteration of the work. Many improvements can be made: adding more parameters for estimation, changing the formulation, allowing for memory, perhaps parameterizing on something other than PFT, etc. There are just too many possible things to do. We have to take reasonable, stepwise steps; we cannot expect to do everything in this first article. From the readers' perspective, it would also be exhausting to read too many things at once. The main point of the paper is to show the promise of this approach, which it has done.
2. We don't have time to do this before the system closes. We do expect that, as we divide the results by PFT and by site, the performance will likely go down.
3. Because we changed the formula here, B is no longer directly comparable to the CLM4.5 values; we also do not think the CLM4.5 formula is beyond revision.
4. The paper named above was modeling GPP, which is important but not comparable. We couldn't find a benchmark study for net photosynthesis with a similar number of sites. We would also appreciate it if there is any suggestion regarding a comparable study on net photosynthesis. On the other hand, we know the model captures the overall mean pattern well from the cross-validation metrics. We will clarify what to expect from this model. We also need to put things into context. How good are the previous models? We believe we indeed made big progress but there is obviously a lot of room for improvement.
Thank you, and we will give you a fuller response later if allowed!
Citation: https://doi.org/10.5194/bg-2022-211-AC5
AC6: 'Reply on RC3', Chaopeng Shen, 08 Feb 2023