Modeling polar marine ecosystem functions guided by bacterial physiological and taxonomic traits

Kim, Hyewon Heather; Bowman, Jeff S.; Luo, Ya-Wei; Ducklow, Hugh W.; Schofield, Oscar M.; Steinberg, Deborah K.; Doney, Scott C.

doi:https://doi.org/10.5194/bg-19-117-2022

Articles | Volume 19, issue 1

© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 06 Jan 2022

Final revised paper (published on 06 Jan 2022)
Supplement to the final revised paper
Preprint (discussion started on 02 Sep 2020)
Supplement to the preprint

Interactive discussion

Status: closed

AC: Author comment | RC: Referee comment | SC: Short comment | EC: Editor comment

RC1: 'Review of "Microbial diversity-informed modelling of polar marine ecosystem functions" by H. H. Kim et al.', Anonymous Referee #1, 02 Oct 2020
- AC1: 'Reply on RC1', Hyewon Kim, 31 Mar 2021
RC2: 'Review of “Microbial diversity-informed modelling of polar marine ecosystem functions” by Hyewon H. Kim et al.', Anonymous Referee #2, 06 Nov 2020
- AC2: 'Reply on RC2', Hyewon Kim, 31 Mar 2021

Peer-review completion

AR: Author's response | RR: Referee report | ED: Editor decision

ED: Reconsider after major revisions (08 Apr 2021) by Marilaure Grégoire

AR by Heather Hyewon Kim on behalf of the Authors (08 Apr 2021) Author's response Manuscript

ED: Referee Nomination & Report Request started (12 Apr 2021) by Marilaure Grégoire

RR by Anonymous Referee #2 (03 May 2021)

RR by Anonymous Referee #1 (21 May 2021)

Suggestions for revision or reasons for rejection

Review of "Modelling polar marine ecosystem functions guided by bacterial physiological and taxonomic traits" by H. H. Kim et al.

For the revised ms the authors expanded their model from 0D to 1D (3 layers). The ms has greatly improved, mostly by providing the model equations (although apparently incomplete, see below), whose omission had made it impossible for me to understand the model structure in the previous round. The parameter estimation has much improved and its description has become OK now. But even after the long time it has taken the authors to prepare the revised ms, it still leaves a strong impression of sloppiness. Several sentences are simply incomprehensible and little attention seems to have been paid to the readability, correctness, and design of some of the figures. Only some of the changes to the previous ms are highlighted. The model description in the main text is still very much unclear and this applies also to the mode concept. Nevertheless, having seen the equations, the study seems to be much better than I had feared based on the original ms. After another major revision or two, I now think it could become a useful contribution.

One of the remaining problems is the confusion of assumptions and results. The authors mention in the response letter (introductory para, point 4) that the "larger cell-specific BP and SDOC uptake rates of HNA cells than those of LNA cells" indicate the robustness of their analysis. This finding is also referred to in the results and discussion sections (lines 276, 384–385). But this is a model assumption, not a result: "maximum bacterial growth rate of the HNA group (μHNA, d-1) was ensured to be optimized to be higher than that of the LNA group (μLNA, d-1)" (lines 168–169).

The authors have now clarified that their model considers flexible (Chl:)C:N:P stoichiometry, but this is mentioned only in the equations and the supplement. This information must be provided in the main text, e.g., under Sections 2.1 or 2.2 or a new 2.x section, as this information is quite crucial for understanding the model design. The statement that the model has 12 state variables (line 81) is simply wrong (I counted 32). This misinformation had led me to conclude that the model was based on a fixed-stoichiometry approach in my previous review. Fig. 1 has been amended regarding the flows of inorganic nutrients to phytoplankton. But it still remains a source of confusion. Fig. 1 shows two compartments, "Higher level" and "RDOM", which do not have corresponding differential equations, so the authors should either add the missing equations or modify Fig. 1 to clarify what these are (this applies also to Fig. 5).

The sentence "Total (bulk) bacterial production (BP; BP = BPHNA + BPLNA) was constrained by observations, and therefore, the group-specific production (BPHNA and BPLNA, mmol C m-3 d-1) was determined during optimization:" (lines 99–100) is unclear. Does this mean that Eqs. (3) and (4) apply only during the optimization? How do you calculate BP_HNA and BP_LNA when not optimizing?

On line 104, you state that "The modelling framework consisted of a dynamic (mechanistic) part and a data-driven part (Figure 2)" but Fig. 2 is about the data assimilation scheme and does not show or mention dynamic and data-driven parts.

On lines 109–110, you introduce fmodes as functional modes, but even after reading the whole ms several times, it remains unclear what these are, e.g., which functions the fmodes describe. Since the fmodes are used later on in the statistical analysis, they should be explained clearly.
On lines 310–311, "These results suggest a clear link between the modelled ecosystem functions and observed bacterial taxonomic (modes) and physiological (fHNA) traits observations." This, together with the absence of any significant relations for fmodes, seems to indicate that the functions (not described in the ms) of the functional modes were chosen inappropriately.

The authors moved from a 0D to a 1D setup for the model but do not provide any information about the 1D setup except the depth levels. No indication about vertical mixing is given in the equations either, so they must be considered incomplete. Also, I am not convinced that, given the shallow model domain (20 m), a 1D design provides a significant advantage over 0D. But again, essential information is missing to allow a firm judgement, e.g., the depth of the mixed layer and its seasonal variations. If the mixed layer is usually deeper than 20 m at the modelled site, then a 1D model offers no advantage over a 0D model. Also, no information is provided regarding the vertical geometry (are the three layers of the same height?) or the mixing scheme (implicit, explicit, positive definite, etc.). The authors should also indicate how the mixing coefficients were obtained or calculated. The reference to Kim et al. (2021) is insufficient, as this has not been published. The authors should just add a short section describing the vertical configuration and modify the equations accordingly.

The concept of the bacterial modes remains rather confusing. Since this is one of the main foundations of the present study, this must be clarified. My main problem with Fig. 6 is still that the explanation of the modes (also in the authors' response letter) does not seem to match what is shown in the panels. For example, Candidatus Pelagibacter ubique is supposed to dominate mode 6 (Fig. 6c) and C. Thioglobus singularis should dominate mode 1 (Fig. 6e). However, the relative abundance of C. T. singularis in mode 1 never exceeds 0.25, whereas C. P. ubique has relative abundances between 0.25 and 0.35 in mode 1, according to panel c, so it appears that both modes 1 and 6 are dominated by C. P. ubique. I did go through Bowman et al. (2017) but could not find an explanation there either.

The description of the data assimilation and parameter optimisation has become much more accessible by the added explanations. Still, several points remain unclear.
On lines 166–167, you write "... group-specific bacterial model parameters were optimized in the direction to properly represent the dynamics associated with each group ..." I do not understand what this means, even with the explanation in the next sentence, which describes a constraint imposed in the maximum bacterial growth rates.
On lines 187–188, "When converting Chl to phytoplankton C (N) biomass, the maximum Chl to N ratio was used along with other reference ratios ..." it remains unclear why you use the maximum (rather than, e.g., an average) Chl:N ratio and what the other reference ratios are. This must be clarified. Also, throughout the ms, you mostly refer to C biomass, so it is unclear where, when, or why you convert to N biomass here.
On lines 198–199, you refer to "... normalized costs of individual data types (J’m) ..." The J'm seem to be indicated in Table 2 but they are never defined.
On lines 205–213, you refer to the depth of the mixed layer as affecting the calculation of the target error. Besides the lack of information on the mixed-layer depth, it is also unclear whether the mixed layer was always deeper than 20 m, or whether you always applied the same CV throughout the whole model domain. Please explain clearly.
On lines 240–241, "... 5-7 constrained parameters and 3-6 optimized parameters ..." Please clearly explain how you define and determine constrained and optimized parameters and how they differ. Also, explain CS and OP in Tables S2–S6.
On lines 254–255, "However, the model skill for HNA biomass slightly degraded in the climatological model (Figure 3b), with lower correlations and normalized standard deviation and higher RMSD than the four years together (Figure 3a)." Since the individual years have been optimized individually, this result was to be expected. What could be more informative is a comparison with simulations for the individual years but with the same parameter set, e.g., the most portable one. This could provide insight into the influence of parameter differences compared to that of different boundary conditions and forcings between the different years.

These sentences are incomprehensible and must be corrected. I could not figure out what you wanted to say here:
Lines 273–275: "C stocks and flows averaged over the growth (Figure 5) and normalized by NPP (normalized by NPP in 1-day for C stocks; Figure S9) season for each year summarized an annual snapshot of the group-specific bacterial dynamics."
Lines 283–284: "NO3, POC, and SDOC in unassimilated years were modelled to values comparable to those in other assimilated years (Figure 5)."
Line 399: "Modelled nutrient stocks were above detect limits and indicated the lack of macronutrient limitations."

Fig. 3: The caption for panel a (2010 - 2013) is confusing. The simulations also cover 2014 and this caption gives the impression that the simulations went through 2010–2013 continuously, which is not what you did. You should come up with a better caption. Also, as mentioned above, a third panel showing results for different years with a single parameter set could be useful. The last sentence of the caption seems to make no sense, since the x-axes of both panels are the same.

Fig. 4: The numbers and letters are very hard to read and often overlap. Maybe rearrange in 4 columns (2 for the means and 2 for the CVs)? The units in (b) are wrong, the CV is dimensionless.

Fig. 5: The caption says that the panels show C stocks and flows in units of mmol C m-2 and mmol C m-2 d-1 but the panels also show NH4, NO3, and PO4, so the associated numbers must have different units. The caption explains the numbers in the first rows and the numbers in parentheses but not the numbers in the second rows next to the arrows. Then it says that "N and P flows, as well as the flows smaller than 0.01 mmol C m-3 d-1, are omitted." but the panels show arrows from and to NH4 and to the inorganic nutrient compartments.

Fig. 8 suffers from the same problems as Fig. 4 (% numbers are also dimensionless). In addition, the first rows in (b) should be left out as they are always 0 by definition.

ED: Reconsider after major revisions (16 Jun 2021) by Marilaure Grégoire

AR by Heather Hyewon Kim on behalf of the Authors (14 Jul 2021) Author's response Manuscript

ED: Referee Nomination & Report Request started (21 Jul 2021) by Marilaure Grégoire

RR by Anonymous Referee #3 (03 Aug 2021)

Suggestions for revision or reasons for rejection

This paper uses data from a 1D ecosystem model alongside bacterial genomic data to give insight into the role of bacteria in ecosystem functioning at a site in the West Antarctic Peninsula. The ecosystem model, recently accepted for publication in a separate paper, has been modified to include two functional types of bacteria, HNA and LNA, and it is able to assimilate biomass of these types using flow cytometry data. Outputs from the model are examined in association with bacterial groupings derived from previously published taxonomic analysis of genome data from the same location. This is interesting work, and I was pleased to see a study that combines ecosystem model and genome data. The paper has been much improved by the previous cycles of review. However, I found it difficult to read and I recommend a little further work before it is published, to make it accessible to a wider readership.

First, now that the paper describing the ecosystem model has been accepted for publication (Kim et al., 2021) the methods section can be reduced and simplified. Section 2.1 just needs a summary of the model previously published: the key points are the main functional types, which elements are tracked (C,N,P) and whether there is flexible stoichiometry. Of course the differences from Kim et al. (2021) need to be covered, but this can be relatively brief. I would remove the equations and the references to them – I know that a previous reviewer requested them, but they are identical to the ones in Kim et al. (2021) and can now be read there; lines 80-117 will be much easier to read without the references. If the equations are kept the symbols need to be explained. The description of the data assimilation scheme in sections 2.3-2.5 is very similar to that in Kim et al. (2021) and does not need to be repeated in such detail. Only a summary is needed in this paper, with the emphasis on the differences compared to the first paper, e.g. the difference in calculation of the target error. Reduced in this way, the methods section would be easier to read and give more weight to the work on bacteria, which is the novel part of this paper.

Second, I suggest giving a slightly fuller explanation of the genomic data, for readers like myself who are less familiar with this than with the modelling. The terms mode, functional mode, closest estimated genomes and closest completed genomes all seem to be specific to Bowman et al. (2017) and as such need to be explained more fully (or omitted if they are not needed – I don’t see that referring to closest estimated genomes and closest completed genomes is required for the discussion here). I understand that taxonomic modes are groupings based on 16S rRNA gene sequence, but I don’t understand where the fmodes come from. Line 134 says that “functional modes (fmodes hereafter) were derived from predicted community metabolic structure” – what data was this based on? From reading Bowman et al., 2017, I think it is also the 16S data, but the phrasing here seems to imply that it comes from a different source. I’m also not clear how the distinction between modes and f-modes can be interpreted: in section 3.3 it is stated that “fmode did not have a significant relationship with any of the modelled ecosystem functions examined” – how can we interpret that in terms of the bacterial contribution to ecosystem functioning?

Third, section 2.2 and Figure 2 led me to expect more integration of the genomic data into the model than I actually found. Figure 2 shows the data-driven part of the model, where the bacterial modes are used, but I could not find any description of how this is done in the methods. Line 133 refers to “the data-driven part representing how bacterial modes (Bowman et al. 2017) are compared to final model outputs based on optimized model parameters from the dynamic part”. So is this just a comparison between model outputs and bacterial modes, as presented in sections 3.3 and 4.3? In what sense is the modelling framework data driven, i.e. what does the arrow in Figure 2 pointing from the bacterial modes to the model field represent?

My main scientific concern is that in many cases the model values have a much lower standard deviation than the observations (Figure 3, Figure S2), and more so than in the previous model (Kim et al., 2021). So the model appears to be missing much of the variability observed in the field measurements. Do the authors have any comment on this?
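The variability concern above can be made concrete with a minimal sketch of the skill statistics named in this review history (correlation, normalized standard deviation, RMSD). It assumes the common Taylor-diagram convention, in which the model standard deviation and the centred RMSD are divided by the observed standard deviation; the manuscript's own normalization is not stated here, so that choice is an assumption. The function name `taylor_stats` and all numbers are hypothetical and are not taken from the paper.

```python
import math


def taylor_stats(model, obs):
    """Return normalized SD, correlation, and normalized centred RMSD.

    Assumption: normalization is by the observed standard deviation
    (a common Taylor-diagram convention, not confirmed by the manuscript).
    """
    n = len(obs)
    if n != len(model) or n < 2:
        raise ValueError("model and obs must have the same length (>= 2)")
    m_mean = sum(model) / n
    o_mean = sum(obs) / n
    m_anom = [m - m_mean for m in model]
    o_anom = [o - o_mean for o in obs]
    # Population standard deviations of model and observations
    m_sd = math.sqrt(sum(a * a for a in m_anom) / n)
    o_sd = math.sqrt(sum(a * a for a in o_anom) / n)
    if m_sd == 0 or o_sd == 0:
        raise ValueError("standard deviation must be non-zero")
    corr = sum(a * b for a, b in zip(m_anom, o_anom)) / (n * m_sd * o_sd)
    # Centred RMSD: root-mean-square difference of the anomalies
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_anom, o_anom)) / n)
    return {"norm_sd": m_sd / o_sd, "corr": corr, "norm_crmsd": crmsd / o_sd}


# Hypothetical series: the model tracks the timing of the observations
# perfectly but with half their amplitude.
OBS = [1.0, 2.0, 3.0, 4.0, 5.0]
MODEL = [2.0, 2.5, 3.0, 3.5, 4.0]
STATS = taylor_stats(MODEL, OBS)
```

In this hypothetical case the correlation is perfect while the normalized standard deviation is well below 1, which is the pattern described above: a model can reproduce the timing of the observations and still miss much of their variability.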

With these modifications, I think the paper will be a useful contribution to the literature, with relevance beyond the particular study area.

A few specific points:
Abstract: the abbreviation WAP needs to be explained.
Line 61: r missing from Palmer
Line 81: I agree with the previous review comment that there are more than 12 state variables, and it is still not clear that only the carbon stocks are being considered. But why give a number? I don’t see that this is important for the rest of the text.
Line 83: LDOC in the text is given as LDOM in Figure 1. Is it the same thing? Similarly for SDOC, RDOC.
Line 108: rpesent instead of present
Line 168: avialable instead of available
Line 291 and Figure 3: it is not specified how the standard deviation is normalized.
Tables S2-S6: it would be helpful to explain the abbreviations OP and CS in the legends.
Text S1 to S2: I think much of this is now part of Kim et al. (2021) and can be removed.

RR by Anonymous Referee #1 (18 Aug 2021)

Suggestions for revision or reasons for rejection

Third review of "Modelling polar marine ecosystem functions guided by bacterial physiological and taxonomic traits" by H. H. Kim et al.

I think this is the first time I am writing a third review for a manuscript. In their second revision, the authors have now clarified/corrected most of the technical and language problems of the previous versions. Nevertheless, substantial problems remain, as detailed below, so that I still can not recommend publication of the ms as is. While the amount of required changes is quite small, the confusion of assumptions and results is sufficiently severe, so that it should be considered a major revision.

My main problem now lies in the still unresolved problem that the authors present an immediate consequence of their model assumptions as a result of their study, namely that the "High nucleic acid (HNA) bacteria show relatively high cell-specific productivity, respiration, and utilisation of the semi-labile dissolved organic carbon pool compared to their low nucleic acid (LNA) bacteria counterparts." (Abstract, lines 21–23). The authors' response does not really address this problem as it only applies to the total, rather than the biomass-normalised, rates. Please note that I have no problem accepting as a result the finding that the total (not biomass-normalised) rates are higher for HNA than for LNA, and I also think that this would be actually much more relevant in terms of both ecology and biogeochemistry. I also do not question that the growth rate of the HNA may potentially be lower at low labile DOC concentration because the HNA have a higher half-saturation concentration. But this does not apply (at least not for the parameters shown in Tables S2–S6) for the labile DOC concentrations in this study (Fig. 8, LDOC). In addition, several (not optimised or constrained) loss-rate parameters (RDOC production, mortality, respiration) are higher for HNA, and these must be compensated by faster DOC uptake in order to allow coexistence of HNA and LNA. Clearly, therefore, the higher biomass-specific rates of HNA are imposed by the model assumptions and must not be presented as a result.
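The half-saturation argument in the paragraph above can be illustrated with a minimal sketch. It assumes Monod-type substrate-limited growth, which is an assumption made here for illustration and not necessarily the manuscript's formulation. The function names and every parameter value are hypothetical; none are taken from Tables S2–S6.

```python
def monod(mu_max, k_half, s):
    """Monod growth rate (d^-1) at substrate concentration s (assumed form)."""
    return mu_max * s / (k_half + s)


def crossover(mu_hna, k_hna, mu_lna, k_lna):
    """Substrate concentration at which both groups grow equally fast.

    Obtained by setting the two Monod rates equal:
        s* = (mu_lna * k_hna - mu_hna * k_lna) / (mu_hna - mu_lna)
    Returns None when the curves do not cross at a positive concentration.
    """
    if mu_hna == mu_lna:
        return None
    s_star = (mu_lna * k_hna - mu_hna * k_lna) / (mu_hna - mu_lna)
    return s_star if s_star > 0 else None


# Hypothetical parameters: HNA has the higher maximum growth rate
# and the higher half-saturation concentration.
MU_HNA, K_HNA = 2.0, 1.0
MU_LNA, K_LNA = 1.0, 0.2
S_STAR = crossover(MU_HNA, K_HNA, MU_LNA, K_LNA)
```

With these hypothetical values the LNA group grows faster only below the crossover concentration; at any higher labile DOC concentration the higher maximum rate imposed on the HNA group determines the ordering of the biomass-specific rates. This is the sense in which the ordering follows from the parameter constraint rather than emerging from the simulation.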

The description of the state variables on lines 80–102 is still wrong. The statement that the model has 12 state variables (line 81) is simply not true. I do not understand the hesitation of the authors to correct this obvious mistake.

The problems with the bacterial modes and fmodes largely remain. For example, I had asked what functions the functional modes refer to. The authors refer (also in their response letter) to Bowman et al. (2017) for details but all that Bowman et al. (2017) write about the fmodes is this: "Based on inspection of the within-cluster sum of squares plot, we identified […] eight modes based on inferred metabolic pathways (not shown)." I think it is impossible to judge the validity of the statements regarding the fmodes without concrete information about the associated actual functions (metabolic pathways). How can one know whether these functions are selected in a meaningful or useful manner if no information about them is provided? Regarding the bacterial modes, in their response letter, the authors write that they changed the example in Fig. 6 to Dokdonia. This does not address the problem I described. I never had any problem understanding the Dokdonia case. My problem is understanding how the assignment of the modes to species works and I explained the (apparent?) contradiction between the authors' definition and what is shown in Fig. 6 with the example of Candidatus Pelagibacter ubique and C. Thioglobus singularis. Using a set of species where this problem does not show up obviously does not help here.

The unclear presentation of the mode concept and the above problem of the confusion of assumptions and result are the reasons why I grade the scientific significance and quality as poor. The presentation quality gets a fair grade mainly because of the extremely poor language quality. While I do appreciate the additional information about the model and the corrections in this latest revision, I do in fact expect that authors supply this kind of information already with the initial submission.

The sentence on lines 447–448 is still unclear to me. "… the values above …" (above what?) This is followed by "… detect limits …" It appears that two sentences were merged and something was lost, e.g., "… the values above those required by …" "… this was used to detect the limits or indicate the lack of macronutrient limitation …"

ED: Reconsider after major revisions (06 Sep 2021) by Marilaure Grégoire

Dear Dr. Kim and co-authors,

Two reviewers have evaluated the revised version of your work and they still have important concerns including some already pointed out in the first revised version. I will ask you to consider them very thoroughly in the next revised version. Here are points that need to be carefully addressed:

1) The statement in the abstract (lines 21–23) that "High nucleic acid (HNA) bacteria show relatively high cell-SPECIFIC productivity, respiration, and utilisation of the semi-labile dissolved organic carbon pool compared to their low nucleic acid (LNA) bacteria counterparts." is highlighted as an important result of the model simulations performed in this paper, although it results from the model parameterization (and hence from assumptions). Reviewer #2 pointed out that this conclusion can be considered an output of the model if it applies to the non-normalized quantity. However, as written, it refers to the biomass-normalized quantities. Please carefully check the detailed comment of reviewer #2 and provide a clear answer.

2) The model description needs to be revised and the number of state variables checked. At the time of the first review, the GMD paper was not available. Since it is now published, I suggest that you keep a summary of the model's main characteristics in the main text and move the details of the formulation to the appendix. These details were requested by Reviewer #2, and it is important that they remain accessible in the manuscript. However, to improve the clarity of the paper and make it more accessible, transfer some parts to the supplementary section, as suggested by reviewer #3 in his/her comment.
3) Neither reviewer understands the concept of bacterial modes and fmodes. This absolutely has to be clarified. Referring to Bowman et al. (2017) is not enough. Please clearly explain, in this paper, to what functions the fmodes refer. This comment has been raised many times by reviewer #2, and the third reviewer has exactly the same concern. Please carefully check and answer the detailed comments of Reviewers #2 and #3.
4) Please explain why the variability in model simulations is lower compared to that from observation.

We ask you to prepare a new revised version that addresses all the points mentioned by the two reviewers with a very particular attention to those mentioned above that are critical. Please send a detailed answer to each comment with a revised version including track changes.

Many thanks for your efforts,

Kind regards,

Marilaure Grégoire

AR by Heather Hyewon Kim on behalf of the Authors (08 Oct 2021) Author's response Manuscript

ED: Referee Nomination & Report Request started (18 Oct 2021) by Marilaure Grégoire

RR by Anonymous Referee #3 (28 Oct 2021)

Suggestions for revision or reasons for rejection

Thank you for the opportunity to review the revised version of this paper. The authors have responded to the questions and feedback from the editor and reviewers and the paper is now much improved. The revised methods section is much easier to read and the removal of the fmodes is a helpful simplification. The fact that the fmodes did not show any significant relationships with key ecosystem functions may be interesting and worth investigating further, but I don’t think it is ready for publication yet. The authors have adequately addressed the other points raised in review:
• the link between the higher HNA cell-specific rates and the assumptions built into the model parameters (lines 356-362 and the abstract);
• the reasons for, and the need to improve on, the low variability in the model outputs compared to observations (lines 382-386).
The paper has some novel aspects and its methods could be debated, but in my opinion it is now ready to be published and further discussion can happen through the normal scientific process.

There are just a few very minor points which I think should be addressed, as follows:
Line 205: I suggest adding a reference to Text S3 as well as Tables S2-S6.
Line 232 “the model captured best the temporal and spatial (depth) variability of PP”: I don’t understand why this is “best”. The skill for PP is relatively low, looking at Figure 3. Is the point that for PP the variability is captured better than the absolute value? I think this sentence needs revising.
Line 259 “There was little interannual variability in the average microzooplankton”: the values in Figure 5 range from 0.39 to 0.76, which does not appear to me to be small variability – the highest value is nearly twice the lowest. This sentence needs to be rephrased, or the variability put into context to explain why this variability is little.
Figure S9 legend: I think the units are not correct – if the values are normalized by NPP they should not be in mmol C m-3 etc.
Figure S10a: The number 5 is almost invisible on the yellow hexagon – could it be changed to black?

There are a few spelling mistakes in both the manuscript and the supplementary material.

Use of English: I suggest the following changes for the authors’ consideration. In my opinion they would improve the clarity of the manuscript.
Line 128 “the results from 10 m are only presented in detail”: I think this should be “only the results from 10 m are presented in detail”.
Line 130 “yet to include the adequate number”: change to “yet to include an adequate number”
Line 223: add “of” after “Because”
Line 256 “the variable for a single year’s”: I think this should be “the variable for which a single year’s”
Line 336 “we assigned the identical initial parameter value”: change to “we assigned an identical initial parameter value”
Line 347 “exhibited the intermediate levels”: remove “the”, i.e. “exhibited intermediate levels”
Line 348 “same number of the constrained parameters”: remove “the”
Line 350 “suggesting the connection”: change to “suggesting a connection”
Line 366 “with the cell-specific growth rate”: change to “with a cell-specific growth rate”
Line 374 “the strong interannual variability”: remove “the”
Line 390 “showing the increased HNA growth rates”: remove “the”
Line 406 “often observed during the Antarctic phytoplankton”: remove “the”
Line 447 “was characterized by the negative temperature anomaly”: change “the” to “a”
Line 448 “and the positive sea-ice anomaly”: change “the” to “a”

ED: Publish subject to minor revisions (review by editor) (12 Nov 2021) by Marilaure Grégoire

AR by Heather Hyewon Kim on behalf of the Authors (12 Nov 2021) Author's response

ED: Publish as is (22 Nov 2021) by Marilaure Grégoire

AR by Heather Hyewon Kim on behalf of the Authors (23 Nov 2021)

Short summary

Heterotrophic marine bacteria are tiny organisms responsible for taking up organic matter in the ocean. Using a modeling approach, this study shows that characteristics (taxonomy and physiology) of bacteria are associated with a subset of ecological processes in the coastal West Antarctic Peninsula region, a system susceptible to global climate change. This study also suggests that bacteria will become more active, in particular large-sized cells, in response to changing climates in the region.