The revised version of the manuscript is a much better read, and the authors have spent considerable effort in addressing my comments. However, I still have some reservations about the methodology and framing in the revised version.
general comments
The authors have incorporated feedback from my previous comments in the manuscript, and importantly, they acknowledge that the uncertainty estimates do not reflect the full model uncertainty. However, the first such acknowledgment appears late in the manuscript, at line 476 of the results and discussion section. Later, the authors still claim that "Our objective extends beyond merely reproducing satellite NPP products. We aim to improve the overall accuracy and uncertainty quantification of NPP estimates by incorporating a robust probabilistic framework." (l 697). But the uncertainty is not fully quantified: in particular, this approach does not capture structural uncertainty, i.e. model bias or inadequacy. The CAFE estimates may be heavily biased, but we do not know, and the uncertainty analysis conducted here would not show it. More careful language is needed.
The authors claim that "The results reveal that both models are competent in quantifying CAFE uncertainty." (l. 726). Beyond the problem mentioned in my comment above, it remains unclear whether the two methods actually capture the main parts of the CAFE signal. Based on Figs. 7 and 10, the NN and Bayes models can capture the seasonal dynamics of the CAFE output. But is there a trend in the CAFE data, and do the two models capture that trend?
Furthermore, what evidence is there that the NN and Bayes models perform better than climatology? My concern is that one could build a simple climatological NPP model for Weizhou Island, with uncertainty, that would produce output very similar to that of the NN or Bayes model. For example, one could use
a + b * sin((c + time)/d) + epsilon
where epsilon ~ Normal(0, sigma) is a random variable. After estimating the model parameters (a, b, c, d, sigma) from CAFE data, the model would require only time as input and produce NPP estimates with uncertainty. Of course, this is a very simple model: every year is the same, there is no trend, and the uncertainty does not vary with time. But then the NN and Bayes models seem to produce nearly identical output for each year as well, and the uncertainty envelopes in Figs. 7 and 10 are very similar from year to year. Thus, it is important to show that the NN and Bayes models perform better than a simple climatology model.
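To make the suggestion concrete, here is a minimal sketch of such a climatological baseline. It is an illustration under stated assumptions, not a prescribed implementation: I assume time is given in days and fix the period at one year, which corresponds to d = 365.25/(2*pi) in the expression above. With the period fixed, b * sin((c + time)/d) can be rewritten as p * sin(w*time) + q * cos(w*time), so the remaining parameters follow from ordinary least squares. All function names are hypothetical, and the synthetic series at the end only checks that the code recovers known parameters; it is not NPP data.

```python
import math
import random

PERIOD_DAYS = 365.25  # assumption: fixed annual period, time measured in days


def design_row(t):
    """Regressors for y = a + p*sin(w*t) + q*cos(w*t)."""
    w = 2.0 * math.pi * t / PERIOD_DAYS
    return [1.0, math.sin(w), math.cos(w)]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_climatology(times, values):
    """Least-squares fit; returns ([a, p, q], sigma)."""
    rows = [design_row(t) for t in times]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    beta = solve_linear(xtx, xty)
    resid = [y - sum(b * x for b, x in zip(beta, r)) for r, y in zip(rows, values)]
    # Residual standard deviation with 3 fitted parameters.
    sigma = math.sqrt(sum(e * e for e in resid) / (len(values) - 3))
    return beta, sigma


def predict(beta, sigma, t, z=1.96):
    """Mean and a constant-width ~95% band (parameter uncertainty ignored)."""
    mean = sum(b * x for b, x in zip(beta, design_row(t)))
    return mean, mean - z * sigma, mean + z * sigma


# Synthetic check only: known seasonal cycle plus Gaussian noise.
rng = random.Random(0)
times = list(range(0, 3 * 365, 8))
values = [500.0 + 200.0 * math.sin(2.0 * math.pi * t / PERIOD_DAYS) + rng.gauss(0.0, 30.0)
          for t in times]
beta, sigma = fit_climatology(times, values)
mean, lower, upper = predict(beta, sigma, 100)
```

Scoring this baseline with the same metrics and the same testing data as the NN and Bayes models would show directly whether the additional inputs add skill beyond the seasonal cycle.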
An aspect that is important but not described well in the manuscript is the required model input compared to that of VGPM, CbPM, and CAFE. In one statement, the authors write: "These inputs overlap substantially with those used in VGPM, CbPM, and CAFE, demonstrating that the NN and Bayesian models do not require additional or more complex inputs." (l. 315). Later the manuscript states: "These probabilistic models do not require additional input variables beyond those used by VGPM, CbPM, and CAFE." (l. 720) Are all 11 inputs listed in Table 1 really used in VGPM, CbPM, and CAFE? Did the authors perform any experiments that further limit the inputs to the NN and Bayes models, to examine which inputs are actually required to produce the output?
When the data available for training an NN or other model are very limited, a common approach is resampling (e.g. bootstrapping or repeated random splitting), i.e. dividing the data into different training and testing datasets repeatedly. Did the authors try different testing and training data configurations? This may shed more light on the differences in the CDF curves that are discussed in Section 3.2.2.
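As a rough illustration of what I mean by repeated training/testing configurations, the sketch below draws many random splits and collects a test score for each, so that the spread of the scores can be reported alongside their mean. The helper name `repeated_split_scores`, the 80/20 split, the number of repeats, and the toy "predict the training mean" model are all assumptions made for the example. For autocorrelated time series, splitting by contiguous blocks (e.g. whole years) may be more appropriate than the purely random splits used here.

```python
import math
import random
import statistics


def repeated_split_scores(n_samples, fit_and_score, n_repeats=50,
                          test_fraction=0.2, seed=0):
    """Score a model on many random train/test splits.

    fit_and_score(train_idx, test_idx) must train on the first index list
    and return a scalar score computed on the second.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    n_test = max(1, round(test_fraction * n_samples))
    scores = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test_idx = indices[:n_test]    # slices copy, so later shuffles are safe
        train_idx = indices[n_test:]
        scores.append(fit_and_score(train_idx, test_idx))
    return scores


# Toy data and toy model, for illustration only.
data_rng = random.Random(1)
y = [data_rng.gauss(10.0, 2.0) for _ in range(200)]


def mean_model_rmse(train_idx, test_idx):
    """'Train' by taking the training mean; score by RMSE on the test set."""
    mu = statistics.fmean([y[i] for i in train_idx])
    return math.sqrt(statistics.fmean([(y[i] - mu) ** 2 for i in test_idx]))


scores = repeated_split_scores(len(y), mean_model_rmse)
score_mean = statistics.fmean(scores)
score_spread = statistics.stdev(scores)
```

Replacing `mean_model_rmse` with a function that trains the NN or Bayes model and returns, for example, the CRPS on the held-out data would indicate how sensitive the reported results are to the particular split.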
Overall, the manuscript reads much better than the initial version. However, the discussion of the results is quite long and feels repetitive at times. I would recommend tightening up Section 3 and removing repetitive statements.
specific comments
L 54: "Conventional methods of NPP measurement, such as ship-based sampling and bottle incubations, are beset with challenges like human errors and inadequacies in capturing spatial and temporal dynamics. This underscores the necessity for more sophisticated and comprehensive methods (Yang et al., 2021; Li et al., 2020)." True, but this study relies very much on monitoring data from a station and thus does not capture spatial dynamics -- it further relies on continuous measurements to capture the temporal dynamics. The authors mention this later: "Due to factors such as equipment malfunctions and adverse weather conditions, some data for the eleven variables were incomplete." (l 198).
L 79: "Currently, the most widely utilized models for estimating NPP include the Vertically Generalized Production Model (VGPM), [...], have been proposed.": This sentence needs to be rephrased.
L 156: "The proportion of excellent water quality in Guangxi's near-shore waters reaches more than 90% all year round": It is not clear what this means. What is this measure of water quality, and is this based on a study or survey that could be cited? Similarly, what does "the quality of the marine ecological environment has remained at the forefront of the country" imply? More specific language and references would be useful here.
L 163: "Weizhou Island, located in the southern subtropical monsoon zone, experiences a pleasant climate with abundant heat and precipitation throughout the year." Phrases like "pleasant climate" or "abundant heat and precipitation" are not specific or quantitative. The next sentence already specifies average (air?) temperatures, so the "pleasant climate" is not necessary here.
Eq. 1: Mention right away what theta and D represent in the equation.
L 367: "In probabilistic forecasting, the focus extends beyond mere point estimates to encompass the shape and dispersion of the probability distribution.": This sentence and the next could go to the beginning of the section to give a better motivation for the use of CRPS.
L 382: "y the predicted value, x the observed value". This works, but is not conventional. Typically, x denotes the predicted values and y the observations.
L 393: The CDF is introduced here, but it has already been used above in the definition of CRPS. I would suggest switching the section order.
L 483: "On using CAFE as a prediction target, both models show more consistent performance.": The term "model" has now been used to describe VGPM, CbPM, and CAFE, but also the NN and Bayesian models. Please ensure that the reader always knows which models are being referenced in the text. Furthermore, this statement about consistent performance for both models seems to contradict a later one: "In addition, for NN model's MAPD index value for CAFE is lower than that for Bayes model" (l 487).
L 490: "Overall evaluation indicates that under both models' assessment criteria, CAFE demonstrates superior accuracy in predicting effects compared to VGPM and CbPM.": This paragraph is not very helpful. What are the two assessment criteria used here? (Fig. 5 uses three metrics, not two.) What does "predicting effects" mean? It is not helpful to the reader that the remainder of the paragraph discusses the VGPM and CbPM results and not CAFE.
L 499: "(1) prior research indicating that CAFE provides relatively accurate estimates of NPP in marine ecosystems with characteristics similar to the Weizhou Island area, due to its advanced parameterization of phytoplankton dynamics". Please cite this prior research or provide some evidence for this statement.
L 520: Is this analysis based on the testing data or the full CAFE-based dataset?
L 523: Are these confidence intervals credible intervals for the Bayesian model?
L 590: "Fig. 8 demonstrates the CDF curves of the predicted mean values after the normalization process and the CDF curves of the CAFE." This sentence and the next are difficult to understand. Are they meant to emphasize the advantages of normalizing the values? Why make this point right after stating that divergence between these two CDFs should be minimal? Please rephrase.
L 671: Is the only difference between the estimates in this section and previous ones the daily resolution?
L 722: "By prioritizing variables such as SST and AP, the models can be optimized to reduce reliance on less influential inputs, improving efficiency without compromising accuracy." Was this actually shown? Did the authors try to run the NN or Bayes model with fewer input variables?
Please find the comments in the attached PDF.