Implications of Observed Inconsistencies in Carbonate Chemistry Measurements for Ocean Acidification Studies

The growing field of ocean acidification research is concerned with the investigation of organism responses to increasing pCO2 values. One important approach in this context is culture work using seawater with adjusted CO2 levels. As aqueous pCO2 is difficult to measure directly in small-scale experiments, it is generally calculated from two other measured parameters of the carbonate system (often A_T, C_T or pH). Unfortunately, the overall uncertainties of measured and subsequently calculated values are often unknown. Especially under high pCO2, this can become a severe problem with respect to the interpretation of physiological and ecological data. In the few datasets from ocean acidification research where all three of these parameters were measured, pCO2 values calculated from A_T and C_T are typically about 30 % lower (i.e. ∼300 µatm at a target pCO2 of 1000 µatm) than those calculated from A_T and pH or C_T and pH. This study presents and discusses these discrepancies as well as likely consequences for the ocean acidification community. Until this problem is solved, one has to consider that calculated parameters of the carbonate system (e.g. pCO2, calcite saturation state) may not be comparable between studies, and that this may have important implications for the interpretation of CO2 perturbation experiments.


Introduction
Since the beginning of the Industrial Revolution, CO2 emissions from the burning of fossil fuels and changes in land use have increased atmospheric CO2 levels from preindustrial values of 280 ppm to currently 390 ppm (www.esrl.noaa.gov/gmd/ccgg/trends; data by Tans and Keeling, NOAA/ESRL). Values are expected to rise to 750 ppm (IPCC scenario IS92a, IPCC, 2007) or even beyond 1000 ppm by the end of this century (Raupach et al., 2007). In addition to its contribution to the broadly discussed greenhouse effect, about 25 % of anthropogenic CO2 has been taken up by the ocean (Canadell et al., 2007), causing a shift of the carbonate chemistry towards higher CO2 concentrations and lower pH (Broecker et al., 1971). This process, commonly referred to as ocean acidification (OA), is already occurring and is expected to intensify in the future (Kleypas et al., 1999; Wolf-Gladrow et al., 1999; Caldeira and Wickett, 2003). Ocean acidification will affect marine biota in many different ways (for reviews see Fabry et al., 2008; Rost et al., 2008).
To shed light on potential responses of organisms and ecosystems, numerous national and international research projects have been initiated (see Doney et al., 2009). An essential part of OA research is based on CO2 perturbation experiments, which represent the primary tool for studying responses of key species and marine communities to acidification of seawater. Marine biologists working in this field have to deal with several problems associated with this type of experiment. Because high pCO2 scenarios are of particular interest, seawater carbonate chemistry needs to be adjusted and kept quasi-constant over the duration of an experiment (in many cases, the carbonate chemistry is not controlled at all after the initial adjustment). Also, the correct determination of at least two parameters is necessary to obtain a valid description of the whole carbonate system and hence to interpret organism responses correctly.
Aqueous pCO2 is difficult to measure in small-scale experiments, and pH has also been under debate due to intricacies concerning pH scales and measurement protocols (Dickson, 2010; Liu et al., 2011). Total alkalinity (A_T) and dissolved inorganic carbon (C_T) are usually favoured as input parameters for carbonate chemistry calculations, because sample preservation and measurements are relatively straightforward. This combination of parameters had also been thought to lead to the most accurate calculations of CO2 concentrations and carbonate saturation states (Riebesell et al., 2010). Still, there is no agreement on which two parameters are to be measured, and, as a consequence, carbonate system calculations in different studies are often based on different input parameters. As will be shown here, this may severely impair comparability of different datasets.
Even though detailed literature on measurement protocols has been published (Dickson et al., 2007; Gattuso et al., 2010), potential pitfalls and problems with uncertainty estimations remain and, as certified reference materials (CRMs) are only available for current surface ocean conditions, the quality of carbonate chemistry measurements at high pCO2 levels is often unknown. Uncertainties of estimated pCO2 values are generally considered to be smaller than 10 % (cf. Gattuso et al., 2010; Hydes et al., 2010). An examination of the few over-determined datasets assessed in OA laboratories (including data from our own laboratory; reported in the Supplement) reveals up to 30 % discrepancies between estimated pCO2 levels derived from different input pairs. This potentially widespread phenomenon has major implications for the comparability and quantitative validity of studies in the OA community. In view of the growing body of OA literature and its impact on public opinion and policy makers (Raven et al., 2005), the identification, quantification and prevention of common errors has to be an issue of high priority.
This publication is based on an earlier manuscript entitled "On CO2 perturbation experiments: Over-determination of carbonate chemistry reveals inconsistencies" (Hoppe et al., 2010).

Results
We present here a comparison of over-determined carbonate chemistry datasets found in the literature together with our own datasets. Only one dataset with more than two parameters of the carbonate system measured in OA laboratories was found in the list of "EPOCA relevant publications" archived in the PANGEA database (Nisumaa et al., 2010; http://www.epoca-project.eu/index.php/data.html): Schneider and Erez (2006); another study was excluded from this analysis because of conflicting values between database and manuscript. In addition, the data from Iglesias-Rodriguez et al. (2008), Thomsen et al. (2010) and our own laboratory (Hoppe et al., 2010) are shown. For all datasets, values reported for relevant parameters (e.g. salinity, temperature, pH scale, etc.) and the dissociation constants of carbonic acid of Mehrbach et al. (1973; as refit by Dickson and Millero, 1987) were used to calculate pCO2 values at 15 °C using the program CO2sys (Pierrot et al., 2006). As information on nutrient concentrations was lacking in the datasets used, values were based on appropriate literature data (see Supplement for details).
These calculations revealed discrepancies in the pCO2 calculated from different input pairs, which increased systematically with increasing pCO2 (Fig. 1). The pCO2 calculated from C_T and A_T was ∼30 % lower than the pCO2 calculated from either C_T and pH or from A_T and pH, the latter pairs yielding comparable results (±5 %). The carbonate system of Iglesias-Rodriguez et al. (2008; as shown in the PANGEA database) was not strictly over-determined. However, if one assumes equilibration of the aerated seawater with the gas mixtures used (280-750 ppm), the deviation of the pCO2 values (calculated from A_T and C_T) from the target pCO2 reveals a similar relationship to that observed in the other datasets (Fig. 1). Even though outgassing in C_T samples cannot be completely excluded as a potential source of the discrepancies in this particular study, the consistent pattern among studies argues strongly against this explanation.
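The kind of comparison described above can be illustrated with a minimal sketch. It is not the CO2sys calculation used in this study: the equilibrium constants are round placeholder values (assumed here, roughly appropriate for seawater at 15 °C and salinity 35), alkalinity is treated as carbonate alkalinity only (borate and other minor contributors are neglected), and all function names are hypothetical. The sketch shows only how pCO2 follows from two different input pairs and how a percentage discrepancy between them can be expressed.

```python
import math

# Placeholder constants (assumptions for illustration only; NOT the
# Mehrbach et al. (1973) constants used in the study).
K0 = 0.0375        # CO2 solubility, mol kg^-1 atm^-1
K1 = 10 ** -5.94   # first dissociation constant of carbonic acid
K2 = 10 ** -9.13   # second dissociation constant of carbonic acid


def pco2_from_ct_ph(ct, ph):
    """pCO2 in uatm from C_T (mol kg^-1) and pH."""
    h = 10 ** -ph
    co2 = ct / (1 + K1 / h + K1 * K2 / h ** 2)  # dissolved CO2, mol kg^-1
    return co2 / K0 * 1e6


def ph_from_ac_ct(ac, ct):
    """pH from carbonate alkalinity and C_T (both mol kg^-1).

    Solves ac*h^2 + K1*(ac - ct)*h + K1*K2*(ac - 2*ct) = 0 for the
    positive root h = [H+].
    """
    a = ac
    b = K1 * (ac - ct)
    c = K1 * K2 * (ac - 2 * ct)
    h = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return -math.log10(h)


def pco2_from_ac_ct(ac, ct):
    """pCO2 in uatm from carbonate alkalinity and C_T."""
    return pco2_from_ct_ph(ct, ph_from_ac_ct(ac, ct))


def percent_discrepancy(p_test, p_ref):
    """Deviation of p_test from p_ref, in percent of p_ref."""
    return 100 * (p_test - p_ref) / p_ref


# Hypothetical over-determined sample: alkalinity, C_T and a measured pH.
ac, ct, ph_measured = 2300e-6, 2200e-6, 7.80
p_atct = pco2_from_ac_ct(ac, ct)
p_ctph = pco2_from_ct_ph(ct, ph_measured)
print(f"A_T/C_T: {p_atct:.0f} uatm, C_T/pH: {p_ctph:.0f} uatm, "
      f"discrepancy: {percent_discrepancy(p_atct, p_ctph):.0f} %")
```

In an internally consistent dataset the two estimates would agree within the measurement uncertainties. A quantitative comparison requires a full carbonate system program such as CO2sys, with the constants and pH scale reported for each dataset.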
With respect to our own dataset, further information is available. Discrepancies of ∼30 % were observed irrespective of whether C_T or A_T was manipulated, and in both natural and artificial seawaters (NSW and ASW, respectively; Supplement, Table 2).

Discussion
Underestimation of pCO2 calculated from measured values of A_T and C_T has been described in a number of studies from the marine chemistry community, in which direct measurements over a range of pCO2 levels (approx. 200-1800 µatm) were compared to calculations from A_T and C_T (Lee et al., 1996, 2000; Wanninkhof et al., 1999; Luecker et al., 2000; Millero et al., 2002). The magnitude of these deviations (∼5-10 %; cf. Fig. 4 in Luecker et al., 2000) is, however, much smaller than the ∼30 % found in the datasets from the OA community. Thus, the phenomenon observed in our study seems to be different from the one documented by marine chemists.
Currently, we do not have an explanation for the discrepancies described here, although a few simple explanations, such as the uncertainties of dissociation constants or uncertainties attributed to A_T, C_T or pH measurements, can be ruled out: systematic errors in measured A_T (5 µmol kg^-1; based on repeated CRM measurements, our own data), C_T (7 µmol kg^-1; based on repeated CRM measurements, our own data), pH (0.02; Liu et al., 2011) and in equilibrium constants (0.01 in pK*_1, 0.02 in pK*_2; Dickson, 2010) would be much too small to explain the large discrepancies in calculated pCO2.
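The scale of such effects can be illustrated with a simple one-at-a-time sensitivity sketch for the C_T and pH input pair, for which pCO2 has a closed-form expression. The uncertainties are those quoted above. The sample composition (C_T = 2200 µmol kg^-1, pH 7.80), the solubility K0 and the pK values are round placeholder assumptions and not the constants used in this study, and the function names are hypothetical. Summing the absolute shifts gives only a rough first-order worst case.

```python
K0 = 0.0375  # placeholder CO2 solubility, mol kg^-1 atm^-1 (assumption)


def pco2_ct_ph(ct, ph, pk1, pk2, k0=K0):
    """pCO2 in uatm from C_T (mol kg^-1), pH and the two pK values."""
    h = 10 ** -ph
    k1, k2 = 10 ** -pk1, 10 ** -pk2
    return ct / (1 + k1 / h + k1 * k2 / h ** 2) / k0 * 1e6


def relative_shifts(base, errors):
    """Relative change in pCO2 when each input is perturbed on its own."""
    ref = pco2_ct_ph(**base)
    shifts = {}
    for name, delta in errors.items():
        perturbed = dict(base, **{name: base[name] + delta})
        shifts[name] = (pco2_ct_ph(**perturbed) - ref) / ref
    return shifts


# Hypothetical sample and placeholder pK values (assumptions).
base = dict(ct=2200e-6, ph=7.80, pk1=5.94, pk2=9.13)
# Uncertainties quoted in the text; signs chosen so each one raises pCO2.
errors = dict(ct=7e-6, ph=-0.02, pk1=0.01, pk2=0.02)

shifts = relative_shifts(base, errors)
worst_case = sum(abs(s) for s in shifts.values())
for name, s in shifts.items():
    print(f"{name}: {100 * s:+.2f} %")
print(f"first-order worst case: {100 * worst_case:.1f} %")
```

The A_T and C_T pair could be treated in the same way, but it requires a numerical solution for pH and is therefore not shown here.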
The contribution of dissolved organic matter (DOM) to alkalinity has recently gained a lot of attention (Kim and Lee, 2009; Koeve et al., 2010). However, changes in A_T due to DOM cannot cause the discrepancies described here, since the phenomenon was also observed in an experiment in which artificial seawater without any organic compounds or organisms was used (Supplement, Table 2). Furthermore, experiments with nutrient-enriched North Sea seawater (our data), probably DOM-rich water from Kiel Bight (Thomsen et al., 2010) and from the oligotrophic Red Sea (Schneider and Erez, 2006) show essentially identical discrepancies (Fig. 1). Nonetheless, DOM contributions can become a significant source of error in high biomass cultures (Kim and Lee, 2009).
It remains puzzling that these discrepancies are observed in experiments involving both A_T and C_T adjustments, different seawater compositions, as well as in several datasets produced with different equipment and procedures (e.g. coulometric, colourimetric and manometric C_T measurements). The fact that several independent studies carried out within the framework of ocean acidification research show similar discrepancies between calculated pCO2 values (Fig. 1) suggests a systematic, as opposed to a random, deviation that will hinder a realistic judgement of the quality of datasets.
Regardless of the reasons for its occurrence, this phenomenon will have consequences for ocean acidification research. Firstly, published pCO2 values may not be comparable if different input parameters were measured and used to calculate pCO2. Secondly, if calculated pCO2 values are underestimated by up to 30 %, an organism's sensitivity to acidification might be severely overestimated. This is especially important at pCO2 levels ≥750 µatm, which are typically applied for the year 2100 scenario and therefore crucial for all CO2 perturbation experiments. As an example, one might refer to the responses of four Emiliania huxleyi strains to different pCO2 levels reported by Langer et al. (2009). For strain RCC1256, the authors report strongly decreasing calcification rates above pCO2 values of 600 µatm (pCO2 values were derived from A_T and C_T measurements). As the study of Langer et al. (2009) was conducted in the same laboratory as this one, the presence of the described discrepancies can be assumed. If the pCO2 values from Langer et al. (2009) are indeed ∼30 % lower than the ones calculated from A_T and pH (or C_T and pH), our study could suggest that calcification increases up to a pCO2 of 750 µatm and only declines at values above 800 µatm. Predictions for this strain for the often proposed 2100 scenario of 750 µatm would thus differ substantially. The discrepancies in calculated pCO2 values described here might also explain the differing results reported by Langer et al. (2009) and Hoppe et al. (2011) with respect to the sensitivity of this strain. Thirdly, depending on the input pair chosen, the calculated carbonate ion concentration and hence the calcite and aragonite saturation states might differ significantly. In this study, discrepancies in saturation states were found to be in the range of 15-30 %.
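The arithmetic behind such a shift in threshold values can be sketched as follows. The helper name is hypothetical, and the sketch assumes a constant ∼30 % offset, which is a simplification because the discrepancy was found to increase with pCO2. Under that assumption, an A_T/C_T-based value of 600 µatm corresponds to roughly 860 µatm on the pH-based scale, and 700 µatm corresponds to the 1000 µatm target mentioned in the abstract.

```python
def rescale_atct_pco2(pco2_atct, fractional_offset=0.30):
    """Convert an A_T/C_T-based pCO2 (uatm) to the value expected from a
    pH-based input pair, assuming the A_T/C_T value is lower by a constant
    fraction (illustrative assumption, not a correction procedure)."""
    if not 0 <= fractional_offset < 1:
        raise ValueError("fractional_offset must lie in [0, 1)")
    return pco2_atct / (1 - fractional_offset)


for reported in (600.0, 700.0):
    print(f"{reported:.0f} uatm (A_T/C_T) -> "
          f"{rescale_atct_pco2(reported):.0f} uatm (pH-based)")
```

Such a rescaling is not a substitute for measuring a third parameter. It only illustrates how strongly reported thresholds depend on the input pair chosen.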
Care must therefore be taken when comparing studies that use different pairs of input parameters or when reporting threshold levels of pCO2 harmful to an organism. To improve comparability between future studies, it may be useful to agree on a certain pair of input parameters as long as the described discrepancies remain. We suggest, for the time being, that the OA community should use A_T and pH as input parameters when calculating the carbonate chemistry and, whenever possible, measure and report additional parameters. This suggestion does not, however, mean that the resulting pCO2 values are "correct". Although choosing a particular pair of parameters provides a pragmatic approach to dealing with such discrepancies, it is unsatisfying and, if the choice results in inaccurate calculations of pCO2 and [CO3^2-], may lead to inappropriate interpretations of organism responses. Currently, we have neither sufficient understanding of the uncertainties of carbonate chemistry measurements, nor a clear demonstration that it is possible to get thermodynamically consistent data of A_T, C_T, pH and pCO2 for seawater samples with pCO2 > 600 µatm (A. Dickson, personal communication, 2011). Further investigations on the source and occurrence of this phenomenon are necessary. Certified reference material with high pCO2, as well as calculation programs including the propagation of errors, could improve estimations of uncertainties in carbonate chemistry measurements and thereby calculations of pCO2 values. It should become common practice to provide and defend estimates of uncertainty. A large-scale inter-comparison of the quality of carbonate chemistry measurements between different laboratories (from the OA but also from the marine chemistry community) would help reveal whether the phenomenon described here is indeed widespread.

Biogeosciences, 9, 2401-2405, 2012 (www.biogeosciences.net/9/2401/2012/)