the Creative Commons Attribution 4.0 License.
Ocean models as shallow sea oxygen deficiency assessment tools: from research to practical application
Sarah Piehl
René Friedland
Thomas Neumann
Gerald Schernewski
Abstract. Oxygen is a key indicator of ecosystem health and part of environmental assessments used as a tool to achieve a healthy ocean. Oxygen assessments are mostly based on monitoring data that are spatially and temporally limited, although monitoring efforts have increased. This leads to an incomplete understanding of the current state and ongoing trends of the oxygen situation in the oceans. Ocean models can be used to overcome spatial and temporal limitations and provide high-resolution 3D oxygen data but are rarely used for policy-relevant assessments. In the Baltic Sea, where environmental assessments have a long history and which is known for the world’s largest permanent hypoxic areas, ocean models are not routinely used for oxygen assessments. Especially for the increasingly observed seasonal oxygen deficiency in its shallower parts, current approaches cannot adequately reflect the high spatio-temporal dynamics. To develop a suitable shallow water oxygen deficiency assessment method for the western Baltic Sea, we first evaluated the benefits of a refined model resolution. Second, we integrated model results and observations by a retrospective fitting of the model data to the measured data using several correction functions as well as a correction factor. Despite their capability to reduce the model error, none of the retrospective correction functions applied led to consistent improvements. One reason is probably the heterogeneity of the used measurement data, which are not consistent in their temporal and vertical resolution. Using the Arkona Basin as an example, we show a potential future approach where only high temporal and/or vertical resolution station data are integrated with model data to provide a reliable and ecologically relevant assessment of oxygen depletion with a high degree of confidence and transparency.
By doing so we further aim to demonstrate strengths and limitations of ocean models and to assess their applicability for policy-relevant environmental assessments.
Status: open (until 14 Dec 2023)
RC1: 'Comment on bg-2023-152', Anonymous Referee #1, 27 Nov 2023
Overall Statements:
The manuscript “Ocean models as shallow sea oxygen deficiency assessment tools: from research to practical application“ by Sarah Piehl, Rene Friedland, Thomas Neumann, and Gerald Schernewski aims to show how the results of a coupled 3D physical-biogeochemical model can be improved so that they provide reliable information to support environmental measures. The authors use very heterogeneous data in the southwestern Baltic Sea to show how this attempt fails. However, a much higher resolution data set in one of the Baltic Sea basins can actually improve the model results there. I have the impression that the first experiment was designed in such a way that no good results were possible. It is described that sometimes contradictory data was recorded at neighbouring locations. It is clear that the model results cannot be improved with such data. In-depth data quality control and a plausibility check in advance would have been appropriate here.
The present manuscript also contains technical errors: data from at least one station is used for both optimisation and validation. Furthermore, one definition of the error metric is incorrect. The methods used appear to have been chosen arbitrarily without providing a justification.
The figures are often of poor quality and are inadequately described in the captions. Some sentences in the text are incorrect or abbreviated. References are missing.
Overall, a retrospective correction of the model results is presented, which is already questionable in its approach. The reader would expect an attempt to be made to improve parts of the model in such a way that quality-tested measured oxygen values are matched. Data assimilation would also be more in line with the current state of modelling.
The take-home message of the manuscript is that it is necessary to use a denser data network to obtain good modelling results. This is not a new scientific insight.
I see a way for this manuscript to deliver a good scientific contribution: use the data of the Arkona Basin to recognise and improve deficits of the model. This requires an in-depth error analysis that not only sheds light on statistical relations, but also examines individual time periods and developments in the model results and the high-resolution data.
Detailed remarks:
L32: Rosenberg: reference missing.
L105: Please insert a chapter on data sets used.
L106: as oxygen dynamics are the focus, the model description should contain the description of the benthic and sediment model system.
L110: “well”.
L118: cite correctly.
L138: Fig. 1 poor quality. Station numbers are barely recognisable.
L158: no profiles in Fig. 1.
L159: Fredland: missing reference
L177: wrong definition, if P(bar) and O(bar) represent the means.
L197ff: please write in more detail. I do not know if a distance between observation and corrected simulation was taken for a simulation cell/time nearest to the observation.
L213: show the position of the MARNET station used.
L220: please explain why you switch to a multiplicative correction.
L224: please say at this point of the manuscript, that the following sections use uncorrected results.
L234: please use unified nomenclature: south-western BS (like in Tab. 1).
L235: AE>0 does not fit to the corresponding profiles.
L247: are the r values for the Kiel Bay correct?
L259: Fig. 3: poor quality. Too high information density.
L260ff: introduce Fig. 3. Which model data is used? Explain the boxes and the lines (percentiles ..)
L267: much stronger model scatter for TF0012.
L269: TF0010 is here used for validation. Fig 1 shows this station with a white dot (for model correction).
L269: Tab 2: please explain which data is compared here. Monthly means? Same location/time?
L273: Fig 4 caption: where are the observations (black)?
L280: also negative values appear.
L304: Fig 5: identify the different basins.
L315: please say model performance is poor.
L316: this is a very interesting section. Please introduce these experiments in more detail: How are the different components (distance to shore, ..) calculated? Did you set a_i, b_i, c_i = 0 in (5) for calculating the distance to shore component only?
L320: you use different correlation functions. Here it is R². In Tab 2 you use r (Pearson’s correlation coefficient), and in Tab 3 the Spearman rank correlation. Please explain your choice.
L320: state that y-axes have different numbers.
L348: I cannot see an improvement of 0.28.
L354: Fig 7: please insert basin names.
L357: I do not understand the categories: spatial near-bottom oxygen, near bottom oxygen and oxygen profiles. Maybe it’s only to adapt the nomenclature to Tab 3.
L375: please also discuss “all basins” in Fig 3.
L379: standard error decreases with increasing number of observations. This holds by definition.
L384: please explain the boxes, the lines and the dots in Fig 8. Why do we have standard errors for each month? Why not one standard error for each month?
L396: “oxygen concentration” in the Arkona Bay?
L396: please give a motivation to use this simple multiplicative correction function here.
L408: please explain MARNET Function and MARNET Factor.
L433: which baseline?
L437: please discuss vertical resolution too.
L480: show the position of this single station.
L501: “other studies” give examples.
L514: “three different thresholds were used and averaged” – this appears indeed senseless.
L517: I would expect an overestimation of oxygen concentrations if the thickness of the model’s deepest layer is larger than 4 m.
L518: “daily resolution”: is this the time step of the model or the resolution of the results which are analysed?
L522: Dias et al: reference missing.
L532: also discuss the possibility to use a retrospective adaption for projections.
L539: the transparency is discussed above in the opposite direction.
L541: in the frame of OSPAR work such exercises were done too.
L569: “improved the representation of the oxygen dynamics”.
L576: discuss also the possibility of data assimilation.
L598ff: please help the reader to recognise each reference (line distance or indent).
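The remark at L320 above, that the manuscript mixes R², Pearson’s r, and Spearman rank correlation without justification, matters because the three measure different things. A minimal editorial sketch with purely synthetic (hypothetical) oxygen values, not the manuscript’s data, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy import stats

# Synthetic near-bottom oxygen values (mg/L) -- illustrative only.
rng = np.random.default_rng(42)
obs = rng.uniform(2.0, 10.0, 60)                # "observed"
sim = 0.1 * obs**2 + rng.normal(0.0, 0.3, 60)   # monotonic but nonlinear "model"

r, _ = stats.pearsonr(obs, sim)     # linear association
rho, _ = stats.spearmanr(obs, sim)  # rank (monotonic) association
r2 = r**2                           # R^2 of a simple linear fit of sim on obs

# For a nonlinear but monotonic relation, Spearman's rho stays near 1,
# while Pearson's r (and hence R^2) also penalises the curvature.
```

Because the three statistics can diverge for skewed or nonlinear model–data relations, a manuscript should pick one and justify it, as the referee requests.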
Citation: https://doi.org/10.5194/bg-2023-152-RC1
RC2: 'Comment on bg-2023-152', Anonymous Referee #2, 06 Dec 2023
In their manuscript, Piehl et al. compare the results of a coupled circulation-biogeochemical model of the Baltic Sea with observations to assess the ability of the model to represent oxygen variability, in particular hypoxic/low oxygen conditions, in the southwestern Baltic Sea. The goal is to use the model in environmental assessments of the Baltic Sea. An observation-based model correction is tested. The authors find that both increased horizontal resolution and applied correction can improve the model performance. They conclude that despite some mismatch with observations the model is appropriate for use in oxygen deficiency assessments.
The title of the manuscript is a bit misleading. The study is limited to model/observations comparison and lacks scientific insights that could be valuable to develop the oxygen deficiency assessment method of the western Baltic Sea. The correction method is an interesting exercise to see where the model fails but should be used to correct the model rather than its results. If the model overestimates oxygen depletion in some regions then oxygen consumption may be overestimated and should be corrected. With the assessment method in mind, the authors should propose metrics (with examples), based on model simulations, that would be useful additions to the current observations used in the environmental assessments of the Baltic Sea. In its current form, the manuscript is more like a technical report and therefore of somewhat limited interest to the broader Biogeosciences community.
Specific comments:
L98-104: There seems to be a disconnect between the proposed work (assess/improve model agreement with observations) and the objective of the study (provide a high resolution monitoring system).
Figure 2: Is this a multi annual summer average?
L270-271: This is a bit optimistic, Fig 4 shows a large discrepancy in simulated subsurface oxygen
Section 3.2: Here you correct your model with a set of data and then assess the corrected model with the same set of data. A more robust assessment of the model skills would be to correct the model with half of the dataset and then compare the corrected model with the second part of the observations.
L365-368: Figure 3b indicates an underestimation of near bottom oxygen at the Bay of Mecklenburg, which seems to contradict this statement.
L412-414: The case study was mainly an assessment of the model skills and not the development of a method for a high resolution monitoring and assessment system. To reach this goal you first need to assess the model and then use its results to develop metrics that are robust and useful additions to the assessment.
L421-422: Of course, this is controlled by air-sea gas exchange
L498-499: This is somewhat contradictory to what was said before, the model has some limitations
L532-535: The model assessment could be used to understand the source of the discrepancies and ultimately to improve the model
L552: Computational power remains an issue, especially for highly resolved models
Minor comments/typos:
L46: Here and elsewhere: "monitoring data" is not the proper terminology, please rephrase
L47: remove monitoring
L49: the quality and thus the reliability...
L51: Nevertheless, data acquisition is limited...
L88: models provide an approximation of in-situ conditions and will never perfectly align with observations.
This sentence is redundant with the previous one though.
L96: However, due to the coarse horizontal resolution of the model (~5 km), nearshore areas...
L98: resolution (~2km) to investigate...
L126: km more appropriate than nautical miles?
vertically resolved z-layers ... from 0.5 m at the surface to 2 m at the bottom in the deepest areas.
L128: Figure 1 does not mention the model. Is the red square on the left panel the limits of the model grid, and does the left panel show the model bathymetry? Or does the model grid cover the entire Baltic Sea?
L135: what is the time unit?
Equation 3: The formula is the mean square error, please correct. Do you mean the Mean Absolute Error (MAE)? Why not use the bias?
Figure 3:
- Can you mention in the caption to refer to Fig 1 for the station locations?
- Can you move the xtick labels leftward to align with the vertical line?
Figure 7: Although the color scale is the same, the color of the observation points varies from Figure 2; can you explain? The captions are the same.
L558: Can you add this reference in the References list?
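The metric definitions queried above (RC1 at L177, RC2 on Equation 3) differ only in how the model–observation residuals are aggregated, but they are not interchangeable. A minimal sketch with illustrative values, not the manuscript’s data:

```python
import numpy as np

def bias(sim, obs):
    """Mean signed error: positive if the model overestimates."""
    return np.mean(sim - obs)

def mae(sim, obs):
    """Mean Absolute Error: average magnitude of the residuals."""
    return np.mean(np.abs(sim - obs))

def mse(sim, obs):
    """Mean Square Error (what an equation averaging squared residuals defines)."""
    return np.mean((sim - obs) ** 2)

def rmse(sim, obs):
    """Root Mean Square Error: MSE on the original units, weights outliers more."""
    return np.sqrt(mse(sim, obs))

# Hypothetical oxygen values (mg/L):
obs = np.array([6.0, 5.0, 4.0, 3.0])
sim = np.array([5.0, 6.0, 4.0, 1.0])
# residuals: -1, +1, 0, -2  ->  bias = -0.5, MAE = 1.0, MSE = 1.5, RMSE ~ 1.22
```

Because bias cancels signed errors while MAE and RMSE do not, stating which of these a paper's "error" equation actually computes, as both referees request, is essential for interpreting the reported skill.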
Citation: https://doi.org/10.5194/bg-2023-152-RC2
RC3: 'Comment on bg-2023-152', Anonymous Referee #3, 07 Dec 2023
I think this analysis was a rigorous test of model performance in a region of the Baltic Sea. The motivation for the effort was also worthy in that we do need to improve model simulations in the shallower, perhaps more dynamic regions of the coastal zone. Where I think the analysis fell short was in the way the model was evaluated. There are two challenges I have for the authors:
(1) Why were the data aggregated into seasonal means for a long time period and compared to similarly aggregated model means? I can understand that this is one reasonable test of model behavior (getting the mean seasonal cycle correct), but for these models to be useful in scenarios of climate change and nutrient abatement, they need to do better than a seasonal climatology. It is also possible that aggregating data into a climatology might obscure how well the model might be doing at capturing particular events (e.g., wet years, heat waves). I am not sure if I wish for the authors to return to the output and data and do a more time-specific comparison, but it would certainly provide another useful test of the model.
(2) The approach for model correction described in this paper is not incorrect, but it is insufficient. To try and post-correct the model with fixed parameters (e.g., distance from shore) seems counterintuitive; surely the more likely reasons for model-data mismatch are an inadequacy in the process rates, which are time and space varying. I think the paper would have been much more effective if the authors did sensitivity tests on some key rate processes (primary production, background diffusivity) to see if that helped improve model performance. This would be an avenue by which the authors might see more meaningful improvements in RMSE and other error metrics.
Below are more specific comments:
(1) Line 99: Is it better to call this "disagreement"? Error implies one is wrong, and the observations, as you correctly point out, have limitations.
(2) Line 110: "wel" should be "well"
(3) Line 118: Just write "as described by Kuznetsov and Neumann (2013)" so you don't need to write the names twice.
(4) Line 121: I am not sure what this means. Do you just mean that the main currency of metabolism is oxygen or carbon, and it is translated to N and P pools through stoichiometry?
(5) Line 126: make "miles" singular
(6) Line 148: should be "emphasizes"
(7) Line 148: Delete the comma after "Both"
(8) Line 158: Do you mean station with vertical oxygen profiles? There is no data in Fig 1
(9) Line 159: Can you offer a little more explanation here? Do I understand that near-bottom values were estimated from near-surface? Presumably, there is uncertainty with this that should be reported here.
(10) line 166: My guess is that you don't really care that the models reproduce a monthly oxygen climatology, but rather you want the models to capture other forms of variability. So why the aggregation here? This is related to one of my main general comments.
(11) Line 198: polynomials?
(12) Line 212: Can you specify what the temporal resolution is here?
(13) Lines 234-243: This writing seems to overlook the fact that the inner part of the Bay of Mecklenburg is an area where the model substantially overestimates DO. This should be addressed.
(14) Figure 3: There does seem to be a case where the models keep oxygen low well into November (Panels a, b, c, d), but the observations don't show this, why might that be? This is a fairly season-specific problem that could be discussed in detail from a mechanistic standpoint.
(15) Line 310: I think you could more clearly point out that RMSE is more often > 2 mg/L for these comparisons. This is rather high, and worth highlighting.
(16) Figure 6 and related text: I would argue that the relative differences in RMSE between variables (purple dots) are pretty small (0.02 to 0.2) relative to the overall RMSE (~2). This would tell me that there are other factors that are more important in driving the discrepancies, likely biogeochemical or hydrodynamic processes in the model. This is not really discussed at all in the paper and is worth more discussion.
(17) Figure 7: It would probably be more effective to show these as "differences from corrected" and forget the observations. This would make it clearer to see where the improvements occurred. It seems that only the deep Arkona basin was really improved by the model.
(18) Line 386: should be "analysis"
(19) Line 339: This sentence is worded really oddly. How about "Since model skill was not equally good...."
Citation: https://doi.org/10.5194/bg-2023-152-RC3