Role of regression model selection and station distribution on the estimation of oceanic anthropogenic carbon change by eMLR

Plancherel, Y.; Rodgers, K. B.; Key, R. M.; Jacobson, A. R.; Sarmiento, J. L.

doi:https://doi.org/10.5194/bg-10-4801-2013

Articles | Volume 10, issue 7

© Author(s) 2013. This work is distributed under
the Creative Commons Attribution 3.0 License.

Research article

16 Jul 2013

Abstract. Quantifying oceanic anthropogenic carbon uptake by monitoring interior dissolved inorganic carbon (DIC) concentrations is complicated by the influence of natural variability. The "eMLR method" aims to address this issue by using empirical regression fits of the data instead of the data themselves, inferring the change in anthropogenic carbon over time from the difference between the predictions generated by the regressions at each time. In principle, the method offers two advantages: a means to filter out natural variability, which theoretically ends up in the regression residuals, and a way to deal with sparsely and unevenly distributed data. The degree to which these advantages are realized in practice is unclear, however. The ability of the eMLR method to recover the anthropogenic carbon signal is tested here using a global circulation and biogeochemistry model in which the true signal is known. Results show that regression model selection is particularly important when the observational network changes in time. When the observational network is fixed, the likelihood that co-located systematic misfits between the empirical model and the underlying, yet unknown, true model cancel is greater, improving eMLR results. Changing the observational network modifies how the spatio-temporal variance pattern is captured by the respective datasets, resulting in empirical models that are dynamically or regionally inconsistent, leading to systematic errors. Consequently, the use of regression formulae that change in time to represent systematically best-fit models at all times does not guarantee the best estimates of anthropogenic carbon change if the spatial distributions of the stations emphasize hydrographic features differently in time. Other factors, such as a balanced and representative station coverage, vertical continuity of the regression formulae consistent with the hydrographic context, and resiliency of the spatial distribution of the residual field, can be used to help guide model selection. The characteristic spatial scales of the modes of inter-annual to decadal variability in relation to the size of the North Atlantic, in concert with the station coverage available, place practical limits on the ability of eMLR to fully account for natural variability. Because of its statistical nature, eMLR only efficiently removes the natural variability whose spatial scales are smaller than the system analyzed.
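The differencing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the predictors (temperature and salinity), the coefficients, the noise level, and the helper names `fit_mlr`, `emlr_delta` and `synthetic_cruise` are all assumptions chosen for illustration, and the data are synthetic. One regression is fitted per survey, and the change in DIC is estimated by applying the difference of the two coefficient sets to a common set of predictor values.

```python
import numpy as np


def fit_mlr(predictors, dic):
    """Least-squares fit of dic = c0 + c1*x1 + c2*x2 + ...; returns coefficients."""
    design = np.column_stack([np.ones(len(dic)), predictors])
    coeffs, *_ = np.linalg.lstsq(design, dic, rcond=None)
    return coeffs


def emlr_delta(coeffs_t1, coeffs_t2, predictors):
    """Difference between the two regressions' predictions on the same predictors."""
    design = np.column_stack([np.ones(predictors.shape[0]), predictors])
    return design @ (coeffs_t2 - coeffs_t1)


rng = np.random.default_rng(0)


def synthetic_cruise(n, anthropogenic):
    """Hypothetical survey: DIC as a linear function of T and S plus random noise."""
    temp = rng.uniform(2.0, 20.0, n)    # temperature, degC (assumed range)
    sal = rng.uniform(34.0, 36.5, n)    # salinity (assumed range)
    natural = rng.normal(0.0, 3.0, n)   # stand-in for natural variability
    dic = 2200.0 - 8.0 * temp + 5.0 * (sal - 35.0) + natural
    if anthropogenic:
        dic += 5.0 + 0.5 * temp         # assumed "true" anthropogenic increase
    return np.column_stack([temp, sal]), dic


# Two surveys with different station sets (different sizes and locations).
x1, dic1 = synthetic_cruise(400, anthropogenic=False)
x2, dic2 = synthetic_cruise(500, anthropogenic=True)

c1 = fit_mlr(x1, dic1)
c2 = fit_mlr(x2, dic2)

# Evaluate both regressions on the second survey's predictors and difference them.
delta = emlr_delta(c1, c2, x2)
true_delta = 5.0 + 0.5 * x2[:, 0]
rmse = float(np.sqrt(np.mean((delta - true_delta) ** 2)))
print(f"RMSE of recovered change: {rmse:.2f}")
```

In this idealized setup the noise is uncorrelated and the same regression formula is correct at both times, so the random component falls into the residuals and the difference recovers the imposed signal closely. The abstract's point is that real surveys violate these assumptions when the station distribution changes or when natural variability has spatial scales comparable to the basin.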

Received: 02 Sep 2012 – Discussion started: 19 Oct 2012 – Revised: 03 Jun 2013 – Accepted: 17 Jun 2013 – Published: 16 Jul 2013