Hydrodynamic and Biochemical Impacts on the Development of Hypoxia in the Louisiana–Texas Shelf Part II: Statistical Modeling and Hypoxia Prediction
- 1Department of Oceanography and Coastal Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- 2Department of Experimental Statistics, Louisiana State University, Baton Rouge, LA, 70803, USA
- 3Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- 4Coastal Studies Institute, Louisiana State University, Baton Rouge, LA, 70803, USA
Abstract. In this study, a novel ensemble regression model was developed for hypoxic area (HA) forecast in the Louisiana–Texas (LaTex) Shelf. The ensemble model combines a zero-inflated Poisson generalized linear model (GLM) and a quasi-Poisson generalized additive model (GAM) and considers predictors with hydrodynamic and biochemical features. Both models were trained and calibrated using the daily hindcast (2007–2020) by a three-dimensional coupled hydrodynamic–biogeochemical model embedded in the Regional Ocean Modeling System (ROMS). A promising HA forecast is provided by the ensemble model with a low RMSE (3,204 km²), a high R² (0.8005), and a precise performance in capturing hypoxic area peaks in the summers. To test its robustness, the model was further applied to a global forecast model and produced HA predictions from 2019 to 2020 with the adjusted predictors from the HYbrid Coordinate Ocean Model (HYCOM). Predicted HA shows a high agreement with the ROMS hindcast time series (RMSE = 4,571 km², R² = 0.8178). Our model can also predict the magnitude and onsets of summer HA peaks in both 2019 and 2020 with high accuracy. To the best of our knowledge, this ensemble model is by far the first one providing fast and accurate daily HA predictions for the LaTex Shelf while considering both hydrodynamic and biochemical effects. This study demonstrates that it is feasible to perform regional ocean HA prediction using global ocean forecast.
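For readers unfamiliar with the two skill metrics quoted in the abstract, the following is a minimal sketch of how RMSE and R² could be computed between a reference HA series and a predicted one. It assumes R² means the coefficient of determination (the manuscript may instead report a squared correlation), and the function names and sample values are made up for illustration; they are not the authors' code or data.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between two equal-length series."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily hypoxic areas in km^2 (illustrative values only).
reference_ha = [0.0, 5000.0, 12000.0, 20000.0, 8000.0]
predicted_ha = [1000.0, 4000.0, 13000.0, 18000.0, 9000.0]

print(f"RMSE = {rmse(reference_ha, predicted_ha):.1f} km^2")
print(f"R^2  = {r_squared(reference_ha, predicted_ha):.4f}")
```

Because RMSE carries the units of the target variable, it is usually read against the typical magnitude of HA, which is the point Referee #3 raises about reporting a percentage alongside it.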
Yanda Ou et al.
Status: final response (author comments only)
-
RC1: 'Comment on bg-2022-4', Anonymous Referee #1, 28 Mar 2022
General Comments:
This manuscript applies an ensemble regression approach to produce daily predictions of hypoxic area for the Louisiana-Texas shelf. The manuscript is well written and provides more than adequate descriptions of the methods used to develop, train, and apply the multiple regression models considered for application. Although the ensemble model’s application to global HYCOM seems an important aspect of this work, its sole focus in the discussion feels somewhat like an afterthought. HYCOM presentation in the discussion also presents material that seems better suited for the methods section. Although not essential for publication (in my opinion), I would ask the authors to consider expanding the discussion of HYCOM application in the manuscript by addressing relevant technical details in the methods section and focusing on model outcomes for both the ROMS and HYCOM model in the discussion.
I have no major issues with the manuscript as presented (beyond my recommendation for the discussion of HYCOM), and recommend publication with mostly minor revisions as described below.
Specific Comments:
- Line 55: “The effects of water column stratification are not included or only partially considered” In the previous paragraph you describe several models as incorporating water reaeration and wind velocity in the regression model. Are these not at minimum proxies for water column stratification? Suggest this be re-phrased to address the need to include stratification explicitly.
- Line 155: where was the “temperature-dependent decomposition rate of organic matter” derived?
- Line 179: Is a shelfwide average appropriate for all predictors? Was any attempt made to derive predictors for different longitudinal zones of the shelf? For example, the ROMS grid extends far to the western limits of the hypoxic zone along Texas, and thus stratification predictors averaged across the entire domain may be less dynamic than otherwise expected.
- Figure 4: The 95% CI are not visible. Is this because the confidence intervals are extremely tight and not visible at this multi-annual scale?
- Figure 7: Please clarify in the caption where the hypoxic area from the ROMS hindcast is coming from. Is it from the ensemble model, and if so, should 95% CI’s be applied here?
Technical Corrections:
- Abstract line 20: suggest changing “is by far the first one providing” to “is the first”
- Line 24: superscript “L-1”
- Line 56/57: “The information of future conditions is limited although some models are built upon multiple predictors, thus these forecast models are indeed “pseudo-forecast” ones.” This sentence is awkward. Suggest rewording to: “Information on future conditions is often limited to few predictors, thus limiting these forecast models to “pseudo-forecasts””
- Line 145/146: “However, by far, global forecast model systems like HYCOM does not include biochemical fields” This sentence is a little confusing. When you say “fields”, do you mean “parameters”? HYCOM is strictly a hydrodynamic model, so it is sufficient to say “However, global forecast models such as HYCOM do not simulate biochemical parameters. Therefore, the biochemical-related term SOC needs to be replaced by an alternative term (denoted as SOCalt).”
- Line 150: It may be more appropriate to describe nitrogen as available for plankton growth, not bloom.
- Line 164/165: “For simplification, we denoted this variable as (Qh), W3, and ð as PEAheat, PEAwind, and DCPTemp, respectively.” It took me a few reads to figure out what this sentence was trying to say. Suggest removing the word “this” and modifying to “denoted the variables”
- Line 316: Change “It implies” to “This implies”
-
AC1: 'Reply on RC1', Z. George Xue, 26 Apr 2022
The comment was uploaded in the form of a supplement: https://bg.copernicus.org/preprints/bg-2022-4/bg-2022-4-AC1-supplement.pdf
-
RC2: 'Comment on bg-2022-4', Anonymous Referee #2, 29 Mar 2022
In their manuscript, Ou et al. develop a novel statistical model to forecast/hindcast the size of the hypoxic area in the northern Gulf of Mexico. They use the model to test the feasibility of using HYCOM output and atmospheric data (reanalysis and forecast) to forecast the size of the hypoxic zone. The manuscript is well written and the statistical model seems to be able to retrieve the hypoxic area simulated with the ROMS model (part I paper). I am not familiar with the GLM/GAM statistical techniques and hopefully another reviewer can verify this part of the methodology. My overall assessment is that some improvements are required before the manuscript should be considered for publication. There are a few points that I think are important and would like to raise below. Other, more specific comments are listed afterwards.
1) In its current form, the manuscript is mostly methodological and therefore I don't know if BG is the best fit for it. This could be solved with some improvements. For instance, the Discussion section presents an example of how to use the forecast model. This is a really interesting approach but it feels like a quick addition to justify the model development, that will be "further improved" in the future. A proper set of "forecasts" that are tested against observations would make a much more compelling case for the model's ability to forecast hypoxia. 1985-2021 mid-summer observations are available for this test; I believe that HYCOM and atmospheric forcing data are available in the recent years to carry out this analysis. The forecast input data come with (high?) uncertainty and it would be interesting to know the effect on the hypoxia forecast (compared with the reanalysis input).
2) My second point is a follow-up from above. The manuscript relies exclusively on models. This is fine as a methodological paper but not if the authors aim at improving the current (seasonal) hypoxia forecasts and providing a tool for managers. For instance, it is assumed that the ROMS hindcast is a true representation of LaTex hypoxia. This is obviously not the case (as with any model) and it seems important to include observations in the manuscript to see how/if the forecasts drift away from the observations as we go from ROMS to GLM/GAM to HYCOM. Also note some of the reviewers' comments on the Part I paper referenced here. Furthermore, the model provides a highly temporally resolved forecast, but it is not clear to me if, as a forecast, it does better than the seasonal forecast models (cited in the Introduction) that are, for some of them, spatially and temporally resolved. Some comparison with those (available annually through NOAA, e.g. https://www.noaa.gov/news-release/noaa-forecasts-average-sized-dead-zone-for-gulf-of-mexico) would strengthen the manuscript.
3) The part that needs significant improvement is the Discussion, which is not really available in the current version of the manuscript. Rather, the Discussion section presents an attempt at a "real" forecast using HYCOM. This could be moved to the Results section and a real Discussion section should be provided. What does this new technique bring to the knowledge of LaTex hypoxia? How does it compare with earlier models? How is this useful to managers? What are the caveats and limitations? What are the future developments? How is this technique portable to other systems? All of those are legitimate points that should be discussed.
Specific comments
L36/53: Those are seasonal forecast and cannot include the wind since it is not predictable at this time scale.
L56: Stratification is included indirectly in the statistical models
L58: They are not pseudo forecast, they forecast the mid summer hypoxic area (well in advance). Therefore, they are seasonal forecasts, which is different from the short-term forecasts provided by HYCOM.
L58-59: "fail whenever winds are strong in summers": Note that some of these models provide information on the effect of the wind on the forecast
L76: FYI (related to the main comment above), looking at the comparison between ROMS and observed mid-summer hypoxic area in Part I manuscript, the r-square is 0.58.
L79: could you define the geographical limits that you use for the LaTex shelf? That would be helpful to have a sense of your comparisons as it is not clear if you use the same area as the mid-summer sampling cruises to calculate the hypoxic zone.
L91: what do you mean by up to?
L148: It might be helpful to include these equations here.
L158: Can you discuss the biological meaning of this time lag? It seems to indicate that mid-summer hypoxia is fuelled by early summer loads and therefore that there is no relationship between May load and summer hypoxia.
L176 (Table 1): "Hypoxic area" would be better than "Area of extremely low dissolved oxygen concentration"
L194: Figures are not presented in order, please reorder
L198 (Figure 1a): The lack of relationship between SOCalt and botT is a bit concerning, can you comment?
L198 (Figure 1g): What is the time range of these data, all year, spring-summer, spring-fall?
L223-228: I didn't get how this added term solves the high level of correlation between predictors
L258: "impaired"
L264-274: Not sure if that is a good test of model skill. Excluding randomly half of the years (or 30-40%) would have provided a good dataset for testing. Can you discuss why you did not split the hypoxia data into years, since hypoxia is a seasonal process?
L281 (Figure 4): You should add observations.
L289: the correlation doesn't seem to be significant
L293 (Table 2): What is Pr? does it make any sense to provide a Pr of <1e-16?
L299: "procedure"
L316-317: Early summer or spring? It looks like hypoxia develops in Spring in the time series
L316-323: Do you see all that in Figure 5?
L343: This is not a discussion, see main comment above.
L373: Why not do that for the entire time series?
L376: It is an interesting technique but lacks observations, why didn't you do a real forecast, i.e. a week ahead of the mid-summer cruise, for each year where the input data are available?
L377: "slight": ~20+% difference
L386: Your model forecast doesn't seem to do better than the seasonal forecast in 2019 and misses the pre-sampling mixing event, can you comment? The 2020 mid-summer hypoxic area is also largely overestimated (~20,000 vs 5,000) and seems to be doing worse than seasonal forecasts despite the model's ability to take into account the effect of wind (there was a tropical storm before the mid-summer sampling that year)
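The L264-274 comment above suggests withholding whole years for testing rather than randomly chosen samples. A minimal sketch of such a year-based split is given below, assuming daily samples stored as (date, HA) pairs. The function name `split_by_year`, the 30% holdout fraction, and the synthetic values are hypothetical choices for illustration, not the procedure used in the manuscript.

```python
import random
from datetime import date


def split_by_year(samples, holdout_fraction=0.3, seed=0):
    """Split (date, value) samples so that whole years are held out.

    Returns (train, test, holdout_years). Every sample from a held-out
    year goes to the test set, so no year contributes to both sets.
    """
    years = sorted({d.year for d, _ in samples})
    if len(years) < 2:
        raise ValueError("need at least two distinct years to split")
    # Hold out at least one year, but always leave one for training.
    n_holdout = min(len(years) - 1, max(1, round(len(years) * holdout_fraction)))
    rng = random.Random(seed)  # fixed seed for a reproducible split
    holdout_years = set(rng.sample(years, n_holdout))
    train = [s for s in samples if s[0].year not in holdout_years]
    test = [s for s in samples if s[0].year in holdout_years]
    return train, test, sorted(holdout_years)


# Synthetic example: two summer samples per year over 2007-2020.
samples = []
for year in range(2007, 2021):
    samples.append((date(year, 7, 1), 10000.0))
    samples.append((date(year, 8, 1), 15000.0))

train, test, holdout_years = split_by_year(samples)
print("held-out years:", holdout_years)
print("train size:", len(train), "test size:", len(test))
```

Splitting by year keeps each seasonal hypoxia cycle intact in either the training or the test set, which avoids testing on days that are strongly autocorrelated with neighbouring training days.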
-
AC2: 'Reply on RC2', Z. George Xue, 26 Apr 2022
The comment was uploaded in the form of a supplement: https://bg.copernicus.org/preprints/bg-2022-4/bg-2022-4-AC2-supplement.pdf
-
RC3: 'Comment on bg-2022-4', Anonymous Referee #3, 12 Apr 2022
Review of “Hydrodynamic and Biochemical Impacts on the Development of Hypoxia in the Louisiana–Texas Shelf Part II: Statistical Modeling and Hypoxia Prediction”
Summary
The manuscript employs a novel approach in applying numerical and statistical modeling techniques to more accurately forecast hypoxia area on the Louisiana-Texas shelf in the Gulf of Mexico. After selecting a set of predictors that are well correlated with hypoxic area in the Gulf, a long-term ROMS numerical simulation of this study area (2007-2020) is used to train an ensemble of statistical models using both generalized linear and generalized additive modeling techniques. The most promising techniques are then applied to global model outputs and USGS forcings to develop an accurate forecast over a later time period (2019-2020).
Overall, the manuscript describes a highly applicable and useful approach to rapidly forecast hypoxic conditions using a statistical ensemble. This approach appears to offer multiple benefits to past forecasts, and would serve as a helpful template for other coastal areas as well. The paper utilizes a limited number of explanatory variables to achieve a good fit, and I think that the predictors they use are appropriate and highly applicable to hypoxic area estimates. I’ve tried to include many notes to summarize these points, but this is not an exhaustive list.
Major comments
General:
There is a fair amount of general awkward phrasing and minor grammatical and spelling errors, but I don’t find that they hinder my own understanding of the content.
Introduction:
I think that this section could be broken up into three sections as opposed to the 2 paragraphs it has now. Currently, only one sentence discusses the ecological/societal consequences of hypoxia in this region, and the authors immediately begin discussing the predictive capabilities of previous forecasting efforts. In my opinion, there could be more motivation in the first paragraph that illustrates why hypoxia forecasts are important and useful, and the benefits that environmental managers and others could gain from an accurate forecast. Otherwise, this reads a bit more like an interesting scientific modeling exercise done for its own sake. The second paragraph could then focus on past efforts to create a forecasting system, while the final paragraph could talk about some of the shortcomings that this model ensemble will address.
Methods:
I have some minor questions about the equations described for the hydrodynamic-related predictors section, but I don’t think that they are likely to alter the conclusions of the paper in a meaningful way.
Discussion:
Would suggest renaming this section as “results” since a discussion section is typically what is described in the conclusions section here.
Conclusions:
I think that the paper would benefit from a more comprehensive conclusion that reiterated some of the broader implications and benefits that could come from this hybrid ensemble approach. The final two sentences are really just devoted to saying that this is the first of its kind, which again reinforces some of the issues I mention in the introduction related to this being a pure modeling exercise.
Specific Comments
Line 15: It may benefit the reader to include a percentage value in comparison to the low RMSE value of 3204 square kilometers, which may be quite large in other coastal systems.
Line 20: Suggest removing the words “by far”. Because this model is the first to do this, the modifier “by far” suggests that no other groups are anywhere near this operational capability. I’m not sure if this is the intent, maybe this is meant instead to say that this ensemble model has the highest performance skill “by far”.
Line 25: Suggest changing to “shelf-wide” here and elsewhere in the paper
Line 30: I’ve seen “destruction” of hypoxia used more often than “deconstruction” in the literature, suggest making this change
Line 41-43: Awkward phrasing, cut out “however” from sentence
Line 46-47: Suggest rephrasing as “An additional Bayesian model applied to summer bottom DO predictions accounts for May total nitrogen…”
Line 49-52: Suggest rewording as “Mechanistic prediction methods have also been applied by Laurent and Fennel (2019) to develop a weighted mean forecast that is calibrated using May nitrate loads and three-dimensional hindcast simulations over the period 1985-2018. Once calibrated, the model only requires May nitrate loads as an input to produce the seasonal forecast for a given year.”
Line 55: Suggest changing “shortages” to “drawbacks”
Line 55-59: Remove periods before points 2 and 3, otherwise you can remove the colon and break them all up into single sentences. Point 2 could also be reworded slightly, reads awkwardly now. Change “year-to-year” to “interannual”
Line 61-62: Suggest rewording to something like “Here we aimed to provide a new technique in HA prediction that considers both stratification and biochemical effects, and accurately produces daily forecasts of HA based on selected predictors’ own forecasts.”
Line 65-67: Hypoxic volume really hasn’t been mentioned up to this point in the manuscript, and here you say that it will be neglected because HA is a better predictor anyway. Would suggest removing these sentences altogether.
Line 71-77: I understand that some of the data used for model evaluation are described in the companion paper, but this section seems to be much more focused on derived model inputs (e.g. reanalyses and model outputs). Suggest changing the title of this section to reflect this better.
Line 87: Suggest changing to “… the amount of energy per volume required to homogenize the entire water column”
Line 95: Change “… are other two factors influencing” to “are two other factors that influence”
Line 95-96: Could be worth mentioning that the effect of tidal mixing on stratification is neglected in this study site, since it’s included as an additional term in the Simpson 1981 paper.
Line 98: The first term on the right-hand side of this equation is negative in Simpson et al. (1978), but it seems like the way that this has been defined (reversing the position of water density and depth-integrated water density), that this may actually be referencing the equation of Simpson 1981. Equation 1 in Simpson 1981 also does not have “h” in the first right-hand side term, but I’m unsure if this is an error on Simpson’s part since it appears in the 1978 paper. Suggest changing the reference and/or modifying the equation (may be easier just to change the reference rather than redo calculations/figures).
Line 110-111: Suggest referencing figure 1a here as was done in lines 90-92.
Line 126: Suggest changing “… estimated for the following” to “estimated by”
Line 128: I am having trouble understanding why this equation does not match what is shown in equation 2.27 of Monteith and Unsworth (2014). It looks as if some simplification occurred such that the denominator of the exponential (T-T’, where T’=36K in Monteith and Unsworth) was incorporated into the numerator in the manuscript. However, when I plot the two curves against each other I find that they are unequal, and the gap increases with increasing temperatures. At 20 degrees C, for example, this is equal to a vapor pressure difference of approximately 23 Pa. Is this a relatively minor difference, or is this likely to strongly affect the correlation found when combined with W^3?
Line 142-143: Here I would also suggest pointing the reader to figure 1a as was done in lines 90-92.
Line 145-146: Suggest changing phrasing to “However, global forecast model systems like HYCOM do not currently include biochemical fields.”
Line 156: Suggest removing this sentence and adding the correlation metric to the sentence that describes it first from lines 153-155. This earlier sentence could then read “… calculated as 19 days (R^2=0.8157, Figure A2a).”
Line 158: Is there a reference for this decomposition rate coefficient, or is this described in more detail in the companion manuscript?
Line 163-165: I would suggest immediately describing these variables as PEA_heat, PEA_wind, and DCP_temp, rather than defining them here again.
Line 166: Can you better define what it means when you state that “multicollinearity may become a problem”? Maybe adding a short technical detail on the ramifications of this would be helpful to the reader.
Line 169-170: Are all the grid cells the same size for this model domain? Is this described in more detail in the companion paper?
Line 188: Change “rest” to “remaining”
Line 190-191: Change “is chosen randomly” to “are chosen randomly” and “is grouped into” to “are grouped into”
Line 192: Suggest changing to “split at intervals of 5000 km^2”
Line 272: Some awkward phrasing “… which impose more threatens to the shelf ecosystem.”
Line 299: Misspelling of “procedure”
Line 332-333: Suggest change to “… tends to underestimate HA peak estimates (like those seen at samples 310 and 920)”
Line 351-352: What daily data are referred to here, the outputs derived from HYCOM or the nitrate and nitrite loadings from USGS?
Line 378-381: These two sentences are a bit repetitive and could be combined. I’m also not entirely clear about whether HYCOM is expected to integrate USGS runoff in the future. Is the use of daily estimates part of long-term plans for HYCOM simulations?
Line 399: Some awkward phrasing, suggest changing to “… HA forecast capable of explaining up to 80% of the total variability”
Line 404: “… on HYCOM,s”
-
AC3: 'Reply on RC3', Z. George Xue, 26 Apr 2022
The comment was uploaded in the form of a supplement: https://bg.copernicus.org/preprints/bg-2022-4/bg-2022-4-AC3-supplement.pdf