Cyanobacteria net community production in the Baltic Sea as inferred from profiling pCO2 measurements

Organic matter production by cyanobacteria blooms is a major environmental concern for the Baltic Sea, as it promotes the spread of anoxic zones. Partial pressure of carbon dioxide (pCO2) measurements carried out on Ships of Opportunity (SOOP) since 2003 have proven to be a powerful tool to resolve the carbon dynamics of the blooms in space and time. However, because SOOP measurements are restricted to the sea surface, they cannot directly constrain depth-integrated net community production (NCP) in moles of carbon per surface area. This study tackles the knowledge gap through (1) providing an NCP best guess for an individual cyanobacteria bloom based on repeated profiling measurements of pCO2 and (2) establishing an algorithm to accurately reconstruct depth-integrated NCP from surface pCO2 observations in combination with modelled temperature profiles. Goal (1) was achieved by deploying state-of-the-art sensor technology from a small-scale sailing vessel. The low-cost and flexible platform enabled observations covering an entire bloom event that occurred in July–August 2018 in the Eastern Gotland Sea. For the biogeochemical interpretation, recorded pCO2 profiles were converted to C T, which is the dissolved inorganic carbon concentration normalised to alkalinity. We found that the investigated bloom event was dominated by Nodularia and had many biogeochemical characteristics in common with blooms in previous years. In particular, it lasted for about 3 weeks, caused a C T drawdown of 90 µmol kg−1, and was accompanied by a sea surface temperature increase of 10 °C. The novel finding of this study is the vertical extension of the C T drawdown down to the compensation depth located at around 12 m. Integration of the C T drawdown across this depth and correction for vertical fluxes leads to an NCP best guess of ∼1.2 mol m−2 over the productive period.
Addressing goal (2), we combined modelled hydrographical profiles with surface pCO2 observations recorded by SOOP Finnmaid within the study area. Introducing the temperature penetration depth (TPD) as a new parameter to integrate SOOP observations across depth, we achieve an NCP reconstruction that agrees with the best guess within 10 %, which is considerably better than the reconstruction based on a classical mixed-layer depth constraint. Applying the TPD approach to almost 2 decades of surface pCO2 observations available for the Baltic Sea has the potential to provide new insights into the control and long-term trends of cyanobacteria NCP. This understanding is key for an effective design and monitoring of conservation measures aiming at a Good Environmental Status of the Baltic Sea.

Published by Copernicus Publications on behalf of the European Geosciences Union. J. D. Müller et al.: Cyanobacteria NCP in the Baltic Sea

1 Introduction

1.1 Net community production (NCP) in marine ecosystems

Net community production (NCP) of organic matter triggers many biogeochemical processes that control the functioning and state of marine ecosystems. Globally relevant examples are the biological carbon pump (Henson et al., 2011; Sanders et al., 2014) and the establishment of oxygen minimum zones (Gilly et al., 2013; Oschlies et al., 2018). In this biogeochemical context, we define NCP as the net amount of carbon fixed in organic matter (gross production minus respiration) that is produced in a defined water volume over a defined period. The reliable quantification of NCP is a prerequisite to understand subsequent biogeochemical transformation of the organic matter and its imprint on environmental conditions.

Baltic Sea
On a regional scale, NCP quantification is of particular importance to study the formation of anoxic conditions in stratified water bodies caused by the mineralisation of organic matter that was exported across a permanent pycnocline. This situation is typically encountered in semi-enclosed, silled estuaries such as the Baltic Sea. The deep basins of the Baltic Sea receive substantial amounts of oxygenated, salty water from the North Sea only during occasional major inflow events. Between inflow events, those water masses can stagnate for more than a decade below the permanent halocline (Mohrholz et al., 2015), which is located at around 60 m water depth in the Central Baltic Sea. The export of organic matter into the deep waters is the ultimate cause for the expansion of anoxic areas in the Baltic Sea, which are nowadays considered "the largest anthropogenically induced hypoxic areas in the world [a state which is] primarily linked to increased inputs of nutrients from land" (Carstensen et al., 2014). A quantitative and mechanistic understanding of near-surface organic matter production is key to understand, predict, and eventually counteract the expansion of those anoxic areas. The development of measures to reduce eutrophication and deep water anoxia represents a core component of the EU Marine Strategy Framework Directive (MSFD), which is implemented as the HELCOM Baltic Sea Action Plan (BSAP) and aims at a Good Environmental Status (GES).

Cyanobacteria blooms
The annual cycle of organic matter production in the Baltic Sea can be broadly divided into two events (Schneider and Müller, 2018). After a nitrate-fueled spring bloom, which is usually followed by a so-called blue water period with close-to-zero NCP rates, mid-summer cyanobacteria blooms develop in most years and cause a second pulse of NCP. The cyanobacteria blooms are limited to the months of June to August (Kownacka et al., 2020) and have been a common feature of the Baltic Sea ecosystem since at least the 1960s (Finni et al., 2001). The blooms are a major public concern, because they produce toxins and form [...] (POC). However, POC measurements would not detect the amount of organic matter that was exported between observations (Wasmund et al., 2005) and also fail to achieve the required spatio-temporal resolution due to a low degree of automation. As an alternative, it is possible to quantify NCP through the drawdown of dissolved inorganic carbon (C T ) from the water column (Schneider et al., 2003). From a biogeochemical perspective, the determination of NCP in terms of carbon is ideal, because carbon is the major component of organic matter and directly related to the amount of oxygen (O 2 ) that is consumed during mineralisation. In principle, NCP could as well be estimated from O 2 time series. However, the equilibrium reactions of carbon dioxide (CO 2 ) in seawater result in longer re-equilibration times of CO 2 with the atmosphere compared to O 2 , which results in substantially longer signal preservation and makes C T the preferred tracer for NCP. During the Baltic Sea spring bloom, the tracing of nutrient consumption is a meaningful alternative to quantify NCP and yields results comparable to those of the C T approach (Wasmund et al., 2005). However, nutrient time series do not allow for determining cyanobacteria NCP due to the organisms' ability to fix nitrogen and their highly variable C:P ratios.
In conclusion, the well-established C T approach is the favourable method to determine cyanobacteria NCP. However, it should be noted that NCP estimates derived from this approach include the formation of both POC and dissolved organic carbon (DOC). The produced DOC contributes ~20% to POC (Hansell and Carlson, 1998; Schneider and Kuss, 2004) and is not likely to be vertically exported.

Previous studies
Among previous attempts to trace and quantify the organic matter production of cyanobacteria blooms, automated measurements of the partial pressure of carbon dioxide (pCO 2 ) on the Ship of Opportunity (SOOP) Finnmaid played a pivotal role.
Those measurements were started in 2003 and it was demonstrated that highly accurate time series of changes (not absolute values) in C T can be derived from pCO 2 observations (Schneider et al., 2006). The conversion from pCO 2 to C T relies on a fixed alkalinity (A T ) estimate and is applicable under the condition that internal sources of A T can be excluded, which is the case in the Baltic Sea due to the absence of calcifying plankton (Tyrrell et al., 2008). The derived parameter is comparable to directly measured C T normalised to A T , and in the following referred to as C T *. For several years of SOOP observations, it was shown that the C T * drawdown during mid-summer cyanobacteria blooms occurs in pulses of days to weeks, primarily during calm, sunny days. Further, it was found that the C T * drawdown correlates well with the co-occurring increase in sea surface temperature (SST), rather than with absolute SST. This relationship was attributed to a common driver, which is the light dose received by the water mass under consideration (Schneider and Müller, 2018).
Despite the successful investigation of cyanobacteria blooms through SOOP pCO 2 observations, providing a depth-integrated estimate of NCP in units of moles carbon fixed per surface area remains challenging due to the restriction of SOOP observations to surface waters. Previous studies aiming at a depth-integrated NCP estimate either simply assumed that the C T drawdown reached as far down as the water inlet of the measurement system (Schneider and Müller, 2018) or relied on a modelled mixed layer depth for the integration of surface observations across depth (Schneider et al., 2014). However, in the absence of any vertically resolved measurements, neither approach could be validated. Likewise, remote sensing approaches resolve the spatial coverage of the blooms (Hansson and Hakansson, 2007; Kahru and Elmgren, 2014), but fail to detect their vertical extent (Kutser et al., 2008) and to quantify NCP. Finally, regular research vessel cruises allow for the determination of a full suite of biogeochemical parameters from discrete water samples and even the experimental determination of carbon fixation rates through 14 C incubations (Wasmund et al., 2001, 2005). Incubation experiments can provide valuable information about instantaneous rates of NCP but, in contrast to time series observations such as those obtained by SOOP measurements, do not allow observed changes to be integrated over time and budgets of biogeochemical transformations to be constrained. This integration over time requires several weeks of repeated observations to resolve the progression of entire bloom events, ideally covering a station network to average bloom patchiness.

This study
This study builds upon the previous success to determine NCP based on pCO 2 time series, but extends the approach to vertically resolved observations for the first time. The primary goals of this study are to

(1) provide a best-guess estimate for the depth-integrated NCP of an individual cyanobacteria bloom based on the full suite of depth-resolved in situ measurements and

(2) establish an algorithm to reconstruct depth-integrated NCP based on surface pCO 2 observations and modelled hydrographical profiles.

Achieving goal (2) and applying the algorithm to almost two decades of SOOP pCO 2 observations in the Baltic Sea would not only allow long-term trends of cyanobacteria NCP to be determined, but also enable disentangling its drivers through a comparison of NCP across years characterized by different environmental conditions such as SST, pCO 2 and nutrient availability.

[...] surface measurements were not taken into account. In addition to the sensor measurements, discrete samples for dissolved inorganic carbon (C T ), total alkalinity (A T ) and phytoplankton counts were collected. Track coordinates were continuously recorded with a tablet computer (Galaxy Tab Active, Samsung Electronics, Suwon, South Korea).
In addition to the field sampling campaign, atmospheric measurements of wind speed and pCO 2 were provided by an ICOS (Integrated Carbon Observation System) station permanently operated on the island Östergarnsholm (Fig. 1B). Furthermore, sea surface pCO 2 and temperature (SST) were also determined on the SOOP Finnmaid, regularly crossing the field study area (Fig. 1B). High-resolution hydrographical model data were obtained from the General Estuarine Turbulence Model (GETM) along a vertical section following the Finnmaid track.
2.2 Field sampling campaign

2.2.1 CTD measurements

CTD measurements were performed with an SBE 16 SEACAT instrument (serial number 2557; Sea-Bird Electronics, Bellevue, USA). Temperature and salinity sensors were pre-calibrated at IOW's sensor calibration laboratory. The manual operation of the sensor package was guided by real-time display of data transmitted through a strain-relieved cable. Data stored on an internal memory were used for analysis. The CTD logging interval was 15 seconds and observations were linearly interpolated to match the higher measurement frequency of the pCO 2 sensor (for additional details see Appendix A2). The CTD instrument supplied auxiliary sensors with power and served as a central unit to record and transmit analogue output signals.
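The linear interpolation of the 15 s CTD record onto the denser pCO 2 sensor time axis can be sketched as follows. This is a minimal illustration only: the function name, the example values, and the 5 s sensor spacing are assumptions made for the example and are not taken from the study or from Appendix A2.

```python
from bisect import bisect_right


def interp_linear(t_new, t_known, y_known):
    """Linearly interpolate y_known (sampled at ascending times t_known) onto t_new.

    Times outside the sampled range yield None instead of being extrapolated.
    """
    out = []
    for t in t_new:
        if t < t_known[0] or t > t_known[-1]:
            out.append(None)
            continue
        i = bisect_right(t_known, t) - 1
        if i == len(t_known) - 1:  # t coincides with the last sample
            out.append(y_known[-1])
            continue
        frac = (t - t_known[i]) / (t_known[i + 1] - t_known[i])
        out.append(y_known[i] + frac * (y_known[i + 1] - y_known[i]))
    return out


# Hypothetical example: CTD temperature logged every 15 s,
# pCO2 sensor timestamps every 5 s (assumed spacing).
ctd_time = [0.0, 15.0, 30.0]           # seconds
ctd_temp = [16.0, 15.4, 14.8]          # degrees Celsius
sensor_time = [0.0, 5.0, 10.0, 15.0, 20.0, 35.0]

temp_on_sensor_axis = interp_linear(sensor_time, ctd_time, ctd_temp)
print(temp_on_sensor_axis)
```

Returning None outside the CTD record, rather than extrapolating, is a design choice of this sketch.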

pCO 2 sensor measurements
The submersible CO 2 sensor used in this study, a CONTROS HydroC® CO 2 (formerly Kongsberg Maritime Contros, Kiel, Germany; now -4H-JENA engineering, Jena, Germany), uses membrane equilibration of a headspace and subsequent optical Non-Dispersive Infra-Red (NDIR) absorption to determine the pCO 2 in water (Fietzek et al., 2014). A pre- and post-deployment calibration of the sensor was performed by the manufacturer. pCO 2 data were post-processed taking into account the pre- and post-deployment calibration polynomials, as well as zeroing signals regularly recorded during each deployment. The post-processing resulted in an accuracy of 1% of reading (Fietzek et al., 2014). For details concerning sensor calibration, configuration, and signal post-processing, see Appendices A1-A3.
Although the pCO 2 sensor achieves low and reproducible response times through active pumping of water onto the membrane, a correction of the response time (τ) was applied following Bittig et al. (2018). After the response time correction, the mean absolute pCO 2 difference between the up- and downcast profile was <2.5 µatm in the upper 5 m of the water column and <7.5 µatm across the upper 20 m (Fig. A2). For details concerning the response time correction, see Appendix A4.
The biogeochemical interpretation of the pCO 2 data was based on downcast profiles only. Since downcasts were started after complete equilibration of the pCO 2 sensor in near-surface waters, the applied response time correction has only a minor impact on the derived NCP estimate.

Discrete C T , A T and phytoplankton sampling
Discrete samples were collected at stations 07 and 10 (Fig. 1B) with a manually released Niskin bottle. The sampling depth was estimated based on the length of line paid out. C T and A T samples were filled into 250 mL SCHOTT-DURAN bottles and poisoned with 200 µL saturated HgCl 2 solution within 24 hours after sampling. Samples were stored dark and cool, transported to IOW, and analysed in the laboratory within no more than 21 days after sampling. C T was determined with an Automated Infra Red Inorganic Carbon Analyzer (AIRICA, MARIANDA, Kiel, Germany) and A T was analysed by open cell titration (Dickson et al., 2007). C T and A T measurements were referenced to certified reference materials from batch 173 (Dickson et al., 2003). Phytoplankton samples were fixed with Lugol solution, and community composition and biomass were determined by microscopic counts according to the Utermöhl method (HELCOM, 2017). For details on the analysis of discrete samples, see Appendix B.

Atmospheric measurements
Meteorological observations were provided by the ICOS flux tower (Fig. 1b; Rutgersson et al., 2020). Atmospheric pCO 2 was recorded with an atmospheric profile system (AP200, Campbell Scientific, Logan, USA) mounted with a CO 2 /H 2 O gas analyzer (LI-840A, LI-COR Biosciences, Lincoln, USA). Wind speed was measured with a wind monitor (Young, Michigan, USA) at 12 m above mean sea level. Wind speed and pCO 2 data were averaged over 30 min intervals for further analysis. Measured wind speed was converted to U 10 , the wind speed at 10 m above sea level (Winslow et al., 2016), to be consistent with the gas exchange parameterisation (see Sect. 2.4.3).

The determination of NCP in this study relies on the interpretation of observed temporal changes in the dissolved inorganic carbon concentration (C T *) across the water column. We refer to this estimate as our best-guess, as it is well-constrained by high-quality measurements and therefore as close to the truth as currently possible. Conceptually, our calculations follow the idea of a one-dimensional box model approach, which does not resolve regional variability within the research area, i.e. it neglects lateral water mass transport. With this approach it is possible to calculate NCP from the observed changes in C T * (Sect. 2.4.1) after vertical gridding and regional averaging of the profiles (Sect. 2.4.2) and applying corrections for CO 2 fluxes caused by air-sea gas exchange (Sect. 2.4.3) and vertical mixing (Sect. 2.4.4).

C T * calculation
C T * was calculated from the measured profiles of temperature and response time corrected pCO 2 (Schneider et al., 2014), as well as the mean A T (1720 µmol kg −1 ) and mean salinity (6.9) determined from discrete samples collected across the upper 20 m of the water column and the entire observation period (Fig. B1). Calculations were performed with the R package seacarb (Gattuso et al., 2020), using the CO 2 dissociation constants for estuarine waters from Millero (2010).
The calculated C T * represents an alkalinity- and salinity-normalised estimate of the dissolved inorganic carbon concentration. C T * is suitable to accurately determine changes rather than absolute values of the dissolved inorganic carbon concentration and is therefore the preferred variable to quantify NCP. The uncertainty in the determination of changes of C T * is below 2 µmol kg −1 when the mean A T is constrained within ±30 µmol kg −1 (see Appendix C1 for a detailed assessment).

Vertical gridding and regional averaging
The regional averaging of observations across the study area and the calculation of temporal changes at individual depth levels required a vertical gridding of the profiling sensor measurements. The vertical gridding of individual profiles was achieved by calculating mean values within 1 m depth intervals. Downcast profiles with missing observations from two or more depth intervals caused by zeroing measurements of the pCO 2 sensor were discarded, which affected 8 out of 86 recorded profiles.
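The gridding step described above can be illustrated with a short sketch that averages raw profile samples within 1 m depth intervals and flags profiles with two or more empty intervals. The function names, the example values, and the way the expected number of bins is passed in are assumptions of this illustration, not the processing code used in the study.

```python
import math
from collections import defaultdict


def grid_profile(depths, values, bin_width=1.0):
    """Average values within depth bins; bin k covers [k*bin_width, (k+1)*bin_width)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for z, v in zip(depths, values):
        k = math.floor(z / bin_width)
        sums[k] += v
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


def is_complete(gridded, n_bins, max_missing=1):
    """Keep a profile only if fewer than two of the first n_bins intervals are empty."""
    missing = sum(1 for k in range(n_bins) if k not in gridded)
    return missing <= max_missing


# Hypothetical downcast samples: depth in m, pCO2 in µatm.
depths = [0.2, 0.8, 1.5, 3.1]
pco2 = [100.0, 110.0, 120.0, 140.0]

gridded = grid_profile(depths, pco2)
print(gridded)
print(is_complete(gridded, n_bins=4))  # one empty interval (2-3 m)
print(is_complete(gridded, n_bins=6))  # three empty intervals
```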
For each of eight cruise events (Fig. 2), regionally averaged profiles were further calculated as mean values within each depth interval across all stations. Based on those mean, vertically gridded cruise profiles, incremental and cumulative changes over time were calculated for each depth interval. Throughout the manuscript, observations averaged across the upper 0-6 m of the water column are referred to as surface observations.

The air-sea gas exchange of CO 2 (F) was calculated from sea surface pCO 2 , salinity and temperature, in combination with atmospheric pCO 2 and wind speed (U 10 ) according to Wanninkhof (2014). For the calculation, sea surface observations were linearly interpolated to match the temporal resolution of atmospheric measurements.
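As a rough illustration of the bulk flux calculation, the sketch below uses the quadratic wind speed dependence of the gas transfer velocity given by Wanninkhof (2014), k = 0.251 U 10 ² (Sc/660)^−0.5 in cm h −1 . The Schmidt number Sc and the CO 2 solubility K0 depend on temperature and salinity and are treated here as inputs; their published fits are not reproduced, and the numbers in the example are assumed for illustration only. The sign convention follows the text: negative values denote CO 2 uptake from the atmosphere.

```python
def gas_transfer_velocity(u10, schmidt):
    """Gas transfer velocity in cm/h (Wanninkhof, 2014): 0.251 * U10^2 * (Sc/660)^-0.5."""
    return 0.251 * u10 ** 2 * (schmidt / 660.0) ** -0.5


def air_sea_flux(pco2_sea, pco2_atm, u10, schmidt, k0):
    """Air-sea CO2 flux in mol m-2 d-1 (negative = uptake by the sea).

    pco2_sea, pco2_atm : partial pressures in µatm
    u10                : wind speed at 10 m height in m/s
    schmidt            : Schmidt number of CO2 (input, assumed known)
    k0                 : CO2 solubility in mol L-1 atm-1 (input, assumed known)
    """
    k_m_per_day = gas_transfer_velocity(u10, schmidt) * 0.24  # cm/h -> m/d
    k0_si = k0 * 1e-3  # mol L-1 atm-1 -> mol m-3 µatm-1
    return k_m_per_day * k0_si * (pco2_sea - pco2_atm)


# Assumed illustrative conditions, not measured values from the campaign.
flux = air_sea_flux(pco2_sea=100.0, pco2_atm=400.0, u10=5.0, schmidt=660.0, k0=0.04)
print(flux)  # negative: the undersaturated surface water takes up CO2
```

Summing such fluxes over the 30 min intervals of the atmospheric record would give a cumulative flux for a chosen period.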

Vertical entrainment flux of CO 2 through mixing
Due to the stable thermocline present between July 6 and August 7, vertical mixing of C T * across the 12 m integration depth layer was neglected during this period. However, clear signals of significant vertical entrainment of C T * across this layer were recorded between August 7 and 16. This entrainment was quantified assuming an instantaneous complete vertical mixing to 17 m water depth after August 7. For this simplified scenario, the C T * flux across the 12 m depth layer was estimated based on a mass-balance of C T *, which behaves conservatively with respect to mixing (see Appendix C2 for details).

2.5 NCP reconstruction from surface pCO 2 observations and hydrographical profiles

Calculating depth-integrated NCP from a time series of surface pCO 2 observations, such as provided by SOOP lines, also relies on the conversion of pCO 2 to C T *. Furthermore, the change of C T * over time in the surface water needs to be multiplied with an integration depth estimate to derive an inventory change. Here, we tested two approximations of this integration depth, which are:

• Mixed layer depth (MLD)
• Temperature penetration depth (TPD)

MLD and TPD are described in detail in Sect. 2.5.3. The two parameterisations were further applied to the following two test data sets, both of which contain the required surface pCO 2 and vertically resolved temperature and salinity data:

• In situ data from the BloomSail campaign without pCO 2 data at depth (SV Tina V (surface only))
• Combined SOOP surface pCO 2 observations and modelled salinity and temperature profiles (SOOP Finnmaid + GETM model)

The resulting four reconstructed NCP time series were compared to the best-guess estimate (i.e. the estimate based on the vertically resolved pCO 2 observations from this study).
2.5.1 SOOP Finnmaid surface pCO 2

SOOP Finnmaid regularly commutes between Helsinki in Finland and Travemünde in Germany, thereby crossing the entire Central Baltic Sea and our study area on the east coast of Gotland every 1-2 days. On board SOOP Finnmaid, pCO 2 is measured with a bubble-type equilibrator system supplied with water from an inlet at around 3 m water depth. Details of the measurement set-up are described in Schneider et al. (2014) and data are submitted on a regular basis to the Surface Ocean CO 2 Atlas SOCAT (Bakker et al., 2016). The primary measurement system used to determine pCO 2 in this study is an NDIR sensor (LI-6262, LI-COR Biosciences, Lincoln, USA). The ferrybox unit is also equipped with an additional methane/carbon dioxide analyzer (Greenhouse Gas Analyzer DLT 100, type 908-0011, Los Gatos Research, San Jose, USA), providing independent pCO 2 observations (Gülzow et al., 2011). Intercomparison of both systems is routinely used to ensure the correct functioning of the instrumentation. In this study, a data gap caused by malfunctioning of the primary LI-COR system was filled by including data recorded with the Los Gatos system on six cruises between July 8 and 16 (see Appendix D for details). The mean regional pCO 2 , sea surface temperature (SST) and salinity (SSS) were calculated for each crossing of the study area (Fig. 1B). Based on those mean values, C T * was calculated following the procedure outlined in Sect. 2.4.1. A remaining gap in the SOOP time series was filled with two in situ C T * observations from the BloomSail campaign (July 19 and 24).

GETM model temperature and salinity
Surface SOOP measurements were complemented with the vertical distribution of salinity and temperature from the output of a numerical ocean model of the Baltic Sea. The deployed General Estuarine Turbulence Model (GETM) has a horizontal resolution of 1 nautical mile and 50 vertical terrain-following levels. The uppermost level has a maximum thickness of 50 cm to properly represent SST and ocean-atmosphere fluxes. The computation of the atmospheric fluxes is based on the parameterisation of Kara et al. (2005). The model run covers the period 1961-2019. A detailed analysis of the ocean model performance is given in Placke et al. (2018) and Gräwe et al. (2019). For the present study, we used a model run restarted in 2003 with the atmospheric forcing from the operational reanalysis data set of the German Weather Service (Zängl et al., 2015). Additionally, we implemented the Langmuir-circulation parameterisation of Axell (2002), to account for wind-wave induced variation in the mixed layer depth. Model results were averaged over 24 h and interpolated to a standardised section with 2 km horizontal and 1 m vertical resolution, which follows the mean Finnmaid cruise track. Based on this standard section, daily mean profiles within the study area were computed and linearly interpolated to match the exact times of Finnmaid crossings.

Parameterisation of the integration depth
In this study, two parameters were used to integrate surface observations across depth, namely the classical mixed layer depth (MLD) and the newly introduced temperature penetration depth (TPD).
MLD was defined as the shallowest depth at which seawater density exceeds the density at the surface by more than 0.1 kg m −3 (Roquet et al., 2015). According to this definition, MLD characterises the thermohaline structure of the water column and often (but not necessarily) approximates the depth to which surface water masses are actively mixed. The definition through a fixed density threshold further implies that gradual changes of temperature with depth are not reflected by this parameter.
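The threshold definition of MLD translates directly into a short search over a density profile. The sketch below assumes that density has already been computed from temperature and salinity and that the first profile element represents the surface; the function name and the example profile are hypothetical.

```python
def mixed_layer_depth(depths, densities, threshold=0.1):
    """Return the shallowest depth (m) at which density exceeds the surface
    density by more than `threshold` (kg m-3), or None if it never does.

    depths and densities must be ordered from the surface downwards; the
    first element is taken as the surface reference (an assumption here).
    """
    surface_density = densities[0]
    for z, rho in zip(depths, densities):
        if rho - surface_density > threshold:
            return z
    return None


# Hypothetical gridded density profile (bin centres in m, density in kg m-3).
depths = [0.5, 1.5, 2.5, 3.5, 4.5]
densities = [1004.00, 1004.02, 1004.05, 1004.20, 1004.80]

print(mixed_layer_depth(depths, densities))
```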
TPD was defined as the integrated warming signal across the water column (i.e. the sum of all positive temperature changes within 1 m depth intervals) that occurred between two sampling events, divided by the corresponding SST increase (for illustration see Fig. C4A). TPD is only applicable when SST increases and has units of metres. According to its definition, TPD characterises the mean penetration depth of a warming signal and takes gradual changes of temperature across depth into account. To illustrate the TPD concept, it should be noted that a homogeneous warming signal that ceases abruptly at 10 m water depth would result in the same TPD as a warming signal that decreases linearly from the surface to 20 m water depth (TPD is 10 m in both cases). The TPD approach is motivated by the assumption that primary production and temperature increase are both primarily controlled by the light dose that a water parcel received (Schneider et al., 2014) and therefore show similar patterns.
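A minimal sketch of the TPD calculation is given below. It computes the depth-integrated positive warming divided by the SST increase, which is the reading of the definition that carries units of metres and reproduces the 10 m result for both illustrative warming shapes mentioned above. The function name is hypothetical, and passing the SST increase as a separate argument (rather than taking it from the top bin) is an assumption of this sketch.

```python
def temperature_penetration_depth(temp_change, sst_increase, bin_width=1.0):
    """TPD in metres: integrated positive warming divided by the SST increase.

    temp_change  : temperature change per depth bin between two sampling events (deg C)
    sst_increase : change in sea surface temperature over the same period (deg C)
    bin_width    : thickness of each depth bin in m
    Returns None when SST did not increase, because TPD is then not applicable.
    """
    if sst_increase <= 0:
        return None
    integrated_warming = sum(dt for dt in temp_change if dt > 0) * bin_width
    return integrated_warming / sst_increase


# Case 1: homogeneous 2 deg C warming that ceases abruptly at 10 m.
uniform = [2.0] * 10 + [0.0] * 10

# Case 2: warming that decreases linearly from 2 deg C at the surface to zero
# at 20 m, evaluated at the centres of 1 m bins.
linear = [2.0 * (1.0 - (i + 0.5) / 20.0) for i in range(20)]

print(temperature_penetration_depth(uniform, sst_increase=2.0))
print(temperature_penetration_depth(linear, sst_increase=2.0))
```

Both calls return values of about 10 m, in line with the illustration in the text.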
Based on MLD or TPD, vertically integrated changes of C T * were reconstructed as the product of incremental changes of surface C T * between cruise days and one of the two integration depth estimates. The reconstructed integrated changes of C T * were further corrected for air-sea fluxes of CO 2 according to Sect. 2.4.3. Please note that neither the MLD nor the TPD approach allows vertical entrainment fluxes to be resolved, because profiles of C T * are not reconstructed (compare Sect. 2.4.4).
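The reconstruction described above can be sketched as a simple bookkeeping loop. The sketch multiplies each incremental change of surface C T * by an integration depth, converts the result to mol m −2 , and subtracts the air-sea flux of the same interval (negative flux meaning uptake, so that uptake increases the NCP estimate). The function name, the seawater density of 1004 kg m −3 used for the unit conversion, and all example values are assumptions of this illustration.

```python
def reconstruct_ncp(surface_ct, integration_depths, air_sea_fluxes, density=1004.0):
    """Cumulative NCP (mol m-2) from surface CT* and integration depth estimates.

    surface_ct         : surface CT* at successive sampling events (µmol kg-1), length n
    integration_depths : MLD or TPD for each interval (m), length n-1
    air_sea_fluxes     : air-sea CO2 flux summed over each interval (mol m-2),
                         negative = uptake by the sea, length n-1
    density            : assumed seawater density (kg m-3) for unit conversion
    """
    cumulative = []
    total = 0.0
    for i, (depth, flux) in enumerate(zip(integration_depths, air_sea_fluxes)):
        delta_ct = surface_ct[i + 1] - surface_ct[i]             # µmol kg-1
        inventory_change = delta_ct * depth * density * 1e-6     # mol m-2
        total += -inventory_change - flux                        # drawdown plus uptake
        cumulative.append(total)
    return cumulative


# Hypothetical three-event time series.
ncp = reconstruct_ncp(
    surface_ct=[1600.0, 1570.0, 1540.0],
    integration_depths=[8.0, 10.0],
    air_sea_fluxes=[-0.05, -0.08],
)
print(ncp)
```

Because only surface values enter the calculation, vertical entrainment cannot be accounted for in such a scheme, as noted in the text.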

In analogy to TPD, the penetration depth of C T * drawdown (CPD) was defined as the integrated loss of C T * across the water column divided by the decrease of C T * at the surface (Fig. C4B).

Results
3.1 Dynamics of temperature, pCO 2 , C T * and phytoplankton biomass

Between July 6 and August 16, a total of 78 complete vertical CTD and pCO 2 downcast profiles were recorded (Figs. 2 and 3). C T * was calculated and profiles were regionally averaged for each of the eight cruise events (Fig. 4). From the first cruise of the BloomSail expedition on July 6, sea surface temperature (SST) increased steadily from ~15°C to peak values of 25°C (Figs. 4 and 5) observed on August 3. Sea surface pCO 2 was already as low as ~100 µatm at the beginning of July (Fig. 5a) and decreased further to the lowest values of ~70 µatm on July 24. The drop in pCO 2 and the simultaneous increase in SST correspond to a decrease of C T * of almost 90 µmol kg −1 (Fig. 4). During this period of intense primary production, the regional variability of SST, pCO 2 , and C T * across stations was low compared to their temporal change (Fig. 5a-b; Fig. C3). The regional variability is slightly higher when including the coastal stations 01, 13, and 14 (results not shown), but is generally lower than suggested by the bloom patchiness typically observed through remote sensing (Fig. 1a). With respect to pCO 2 dynamics, it should be noted that (i) the observed temperature increase and C T * drawdown have opposing effects on pCO 2 and (ii) the change of pCO 2 per change in C T * is generally low at low absolute pCO 2 . The observed C T * dynamics in surface waters are clearly attributable to the primary production activity of phytoplankton and go along with an observed increase of the biomass of Nodularia sp. (Fig. B2), which also peaked on July 24.
Between the extremes of pCO 2 and C T * (minimum on July 24) and SST (maximum on August 3), a noticeable increase of surface C T * was observed on July 31, which was accompanied by a higher regional variability across the station network (Fig. 5a,c). The temporary C T * increase was limited to the north-eastern stations 07-10 (Fig. C3) and paralleled by a drop in salinity and elevated A T at the same stations (Fig. B1). It is therefore attributable to the lateral exchange of water masses. All signals of this lateral intrusion vanished within a week. At the other stations (02-06 and 11-12), no noticeable signs of water mass exchange or C T * changes were observed between July 24 and August 3, indicating that NCP had ceased during this period. During the first two weeks of August the study area was affected by increased wind speeds, causing a decrease of SST back to ~18°C. The simultaneous return of surface pCO 2 to ~150 µatm corresponded to a C T * increase of ~100 µmol kg −1 .

The observed surface warming and C T * drawdown extended vertically to a water depth of ~10 m (Fig. 4). On the first cruise day (July 6), the vertical distribution of C T * and temperature was still relatively homogeneous. C T * at 25 m water depth was 70 µmol kg −1 higher than at the surface. Likewise, the temperature gradient covered only ~3°C from 16°C at the surface to 13°C at depth. The warming of surface waters caused an increasingly stable thermocline to be established at around 10 m water depth, reaching a temperature gradient of ~10°C across 5 m on August 3. Continuous and uniform consumption of C T * within the surface layer enhanced the vertical C T * gradient to >150 µmol kg −1 between the surface and 25 m water depth. The C T * drawdown was observed to a maximum depth of 12 m. Between August 7 and 16 the SST drop of ~6°C was accompanied by a temperature increase in deeper water layers (11-17 m) of up to 5°C. This vertical redistribution of heat indicates vertical mixing of water masses, which was also reflected in a steep increase of C T * in the surface water and a loss of C T * between 11 and 17 m (Figs. 3 and 4).

NCP best-guess based on profiling measurements
Net community production (NCP) was determined through vertical integration of the observed consumption of C T * from the surface to a water depth of 12 m. The chosen integration depth reflects the maximum penetration depth of the incremental (i.e. between cruise days), as well as the cumulative (i.e. from July 6-24) C T * drawdown (Fig. 4). Likewise, about 95% of the cumulative warming signal, which refers to positive temperature changes integrated over depth, occurred above 12 m. By July 24, the depth-integrated C T * consumption amounted to ~0.9 mol m −2 (Fig. 5H). This observed C T * consumption was corrected for air-sea fluxes of CO 2 (F). Between July 6 and August 7, the cumulative flux (F cum ) amounted to around -0.5 mol m −2 (Fig. 5G), with a negative sign representing CO 2 uptake from the atmosphere. In the absence of noticeable vertical mixing, this flux was entirely added to the observed C T * consumption. Only between August 7 and 16, when mixing to about 17 m water depth was observed, a significant fraction of the CO 2 taken up from the atmosphere was transported below 12 m water depth. To account for the partial loss of airborne CO 2 to deeper waters during this 9-day period, only 12/17 of F cum during this time (-0.2 mol m −2 ), which is the fraction that would remain in the upper water column, was added to the observed C T * consumption. In addition, a significant amount of C T * entrainment (~0.5 mol m −2 ) into the surface layer was caused by the vertical mixing between August 7 and 16 (Fig. 5H and C2).
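The partitioning of the air-sea flux during the mixing period amounts to scaling the flux by the ratio of integration depth to mixing depth. A minimal sketch of this step, assuming instantaneous and complete mixing as in the text, is shown below; the function name and the example flux value are hypothetical.

```python
def retained_flux(flux, integration_depth, mixing_depth):
    """Part of an air-sea CO2 flux (mol m-2) that stays above the integration depth.

    Assumes the flux is distributed evenly over a layer mixed down to
    mixing_depth (m). If mixing does not reach below the integration depth,
    the whole flux is retained.
    """
    if mixing_depth <= integration_depth:
        return flux
    return flux * integration_depth / mixing_depth


# Hypothetical flux of -0.17 mol m-2 during a period with mixing to 17 m,
# integrated over the upper 12 m.
print(retained_flux(-0.17, integration_depth=12.0, mixing_depth=17.0))

# Without mixing below the integration depth the flux is kept entirely.
print(retained_flux(-0.5, integration_depth=12.0, mixing_depth=10.0))
```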
After correction for air-sea fluxes and vertical entrainment of CO 2 , the cumulative changes of depth-integrated C T * represent the NCP between 0 -12 m water depth (Fig. 5H). The peak NCP value of ~1.2 mol m −2 was observed on July 24 and is of primary interest because it reflects the amount of organic matter that was produced and is potentially available to be either exported or remineralised. After July 24, no signs of continued NCP were observed. Accordingly, the following attempt to reconstruct NCP based on surface pCO 2 observations focuses on the period July 6 -24.
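The budget described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the seawater density, the uniform example profiles, the function names, and the value of -0.3 mol m −2 for the air-sea flux up to July 24 are assumptions chosen so that the numbers land close to those quoted in the text. The entrainment estimate follows the instantaneous-mixing scenario of Appendix C2.

```python
# Sketch of the carbon budget behind the NCP best guess (illustrative values only).

RHO = 1005.0  # ASSUMED seawater density in kg m-3 (not stated in the text)


def depth_integrate(delta_ct, dz=1.0, rho=RHO):
    """Integrate C_T* differences (umol kg-1, one value per layer of
    thickness dz in metres) over depth; returns mol m-2."""
    return sum(delta_ct) * dz * rho * 1e-6


def entrainment_flux(ct_profile, z_int=12.0, dz=1.0, rho=RHO):
    """C_T* entrained into the upper z_int metres (mol m-2) if the profile
    (surface first, one value per layer) is mixed instantaneously and
    completely down to its deepest layer."""
    ct_mix = sum(ct_profile) / len(ct_profile)  # equal layers: plain mean
    n_upper = int(round(z_int / dz))
    deep_layers = ct_profile[n_upper:]
    # C_T* lost from the deep layers equals the gain of the upper layers
    return depth_integrate([c - ct_mix for c in deep_layers], dz, rho)


def ncp(consumption, f_cum, retained_fraction=1.0, entrainment=0.0):
    """NCP (mol m-2) from observed C_T* consumption (positive = loss),
    cumulative air-sea flux (negative = uptake from the atmosphere) and
    entrained C_T*. Uptake and entrainment both mask consumption."""
    return consumption - retained_fraction * f_cum + entrainment


# Hypothetical uniform drawdown of 75 umol kg-1 over twelve 1 m layers
consumption_jul24 = depth_integrate([75.0] * 12)
# ASSUMED cumulative air-sea flux until July 24 (uptake, hence negative)
ncp_jul24 = ncp(consumption_jul24, f_cum=-0.3)

# Hypothetical pre-mixing profile: 0-12 m at 1400, 12-17 m at 1500 umol kg-1
profile_aug7 = [1400.0] * 12 + [1500.0] * 5
entrained = entrainment_flux(profile_aug7)
# Fraction of the air-sea flux that stays above 12 m when mixing reaches 17 m
retained = 12.0 / 17.0

print(round(consumption_jul24, 3), round(ncp_jul24, 3), round(entrained, 3))
```

The sign convention matters here: because a negative flux denotes uptake, subtracting it increases the NCP estimate, which mirrors the statement that the flux was added to the observed consumption.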

NCP reconstruction based on surface pCO 2 and hydrographical profiles
The reconstruction of depth-integrated NCP was tested for two data sets containing the same type of information, namely the observed changes in surface pCO 2 and vertical profiles of seawater salinity and temperature. The first data set ("SV Tina V (surface only)") contains the surface pCO 2 data recorded during the BloomSail expedition, as well as the complete CTD profiles. The second data set ("SOOP Finnmaid + GETM model") combines surface pCO 2 observations from SOOP Finnmaid with seawater salinity and temperature as estimated with the GETM model. For both data sets, C T * time series were calculated based on the same mean A T .
An almost identical decrease of surface C T * of ~50 µmol kg −1 was determined between July 6 and 16 (Fig. 6A), based on the completely independent pCO 2 data recorded on SOOP Finnmaid and SV Tina V. Likewise, a very similar increase in C T * between August 6 and 15 was determined from both independent observational data sets. The good agreement between the independent observations justifies filling a data gap, caused by instrument failure on the SOOP, with two observations from SV Tina V on July 19 and 24 (open circles in Fig. 6A).
Good agreement was also found for the spatio-temporal dynamics of observed and modelled seawater temperature (Fig. 6B).
Observed and modelled SST agreed within 1°C over the entire observation period, despite an absolute change spanning almost 10°C. Slightly higher deviations between observed and modelled temperature were found around the thermocline, where the observational record revealed a stronger temperature gradient. This difference is likely due to an imperfect representation of Langmuir circulation in the model (Axell, 2002), whereas the absence of increased light attenuation caused by phytoplankton particles was previously found to have only minor impacts on modelled SST dynamics (Löptien and Meier, 2011). Most importantly, the mean temperature penetration depths (TPD) derived from the observational and model data differ by less than 1 m, indicating that surface warming and the integrated heat uptake are accurately represented by the model. The TPD (mean the observational and model data, respectively (Fig. 6B). The TPD estimates are considerably higher than the respective mixed layer depth (MLD) estimates (6.0 ± 1.9 m and 5.5 ± 1.2 m) and agree better with the observed penetration depth of C T * drawdown, indicating that TPD is the favourable approximation of the integration depth.
The NCP reconstruction based on TPD is generally higher than the MLD-based estimate (Fig. 6C). Comparing peak cumulative NCP estimates for July 24, the TPD approach results in a ~10% overestimation compared to the best-guess estimate, i.e. the value derived from vertically resolved measurements. In contrast, the MLD-based NCP estimate is ~30% lower than this best-guess estimate. The reconstructed NCP estimates are very similar for both test data sets, as the good agreement between the underlying C T *, MLD and TPD time series suggests.
Comparing the deviation between the best-guess and reconstructed NCP estimates in the light of the lateral variability observed within the study area, it must be emphasised that between July 6 and 24, the mean standard deviation of pCO 2 and C T * across stations amounted to ± 6 µatm and ± 11 µmol kg −1 , respectively. This is higher than the likely uncertainty associated with the pCO 2 measurements (see Methods), as well as its response time correction (see Methods and Appendix A4) or conversion to C T * (see Appendix C1). Therefore, the lateral variability of seawater chemistry and the production signal are generally considered the largest source of uncertainty in our NCP estimates. Still, this lateral variability is small compared to the signal to be resolved (i.e. the C T * consumption of ~90 µmol kg −1 ), but on a relative scale (~10%) it is of about the same order of magnitude as the difference between the best-guess and the TPD-based, reconstructed NCP estimates. In contrast, the lateral variability is smaller than the deviation between the best-guess and the MLD-based, reconstructed NCP estimates.
All reconstructed NCP estimates include the correction of air-sea fluxes of CO 2 , but it is impossible to quantify and correct vertical entrainment fluxes due to mixing, because the vertical distribution of C T * across the water column cannot be resolved.
The strong deviation between the best-guess NCP and the MLD-based reconstruction on August 16 is due to this missing correction of vertical mixing. This deviation highlights that the reconstruction approach is only applicable to production periods with a stable or shoaling thermocline. The TPD-based approach does not allow for any estimate during the last two weeks of the observation period, as the TPD is by definition only applicable to periods of warming surface waters.

2008, 2009 and 2011, the authors found average daily rates of C T * consumption ranging from 3 to 8 µmol kg −1 d −1 , which is comparable to the mean rate of 4.4 µmol kg −1 d −1 determined in this study (i.e. the average C T * drawdown of ~90 µmol kg −1 over 12 days, Fig. 4). The individual production events identified by Schneider et al. (2014) lasted 1 to 5 weeks, similar to the duration described in this study. Finally, Schneider et al. (2014) also provided a depth-integrated NCP estimate based on daily modelled mixing depths, which ranged from 3 -20 m and were derived from the vertical distribution of a tracer one day after its injection into the surface. Although this approach is primarily useful to estimate the vertical distribution of air-sea CO 2 fluxes and does not necessarily reflect the vertical extent of net community production, their determined midsummer NCP estimates (1 -2.1 mol m −2 ) are in the same order of magnitude as the best-guess estimate derived in this study. It should be noted that the NCP estimates by Schneider et al. (2014) refer to the cumulative NCP of one to three production pulses per year, whereas our estimate of ~1.2 mol m −2 refers to a single bloom event.
Wasmund et al. (2001) conducted 14 C incubation experiments at different water depths to determine instantaneous rates of primary production during a cyanobacteria bloom. The obtained daytime carbon fixation rates in surface waters (0.4 -0.8 mmol C m −3 h −1 ) are in the same order of magnitude as the mean rate found in this study. More importantly, the authors also found significantly lower fixation rates below 10 m water depth (< 0.2 mmol C m −3 h −1 ), which agrees with the depth distribution of NCP observed in this study.

Furthermore, the succession of different cyanobacteria genera observed in 2018, with the Nodularia dominated bloom following an earlier presence of Aphanizomenon (Fig. B2), was previously described as a typical pattern (Wasmund, 2017), as well as the fact that increased wind speed and turbulence can inhibit N-fixation of cyanobacteria and cause the termination of the bloom (Wasmund, 1997).
In conclusion, the bloom event duration, C T * drawdown, and NCP, as well as the vertical extent of carbon fixation and the succession of the bloom observed in this study agree well with observations in previous years, and distinct differences cannot be found. We therefore conclude that the findings of this study are representative of Baltic Sea cyanobacteria blooms in general, although the SST and pCO 2 levels in 2018 were at the upper and lower end, respectively, of the conditions observed in previous years (Schneider and Müller, 2018).

Recommendations and caveats for NCP reconstruction from SOOP and model data
The good agreement between our best-guess and the reconstructed, TPD-based NCP estimate of the production peak on July 24 (Fig. 6C) indicates that it is possible to determine NCP from surface pCO 2 observations and vertically resolved seawater temperature with little uncertainty. For the NCP calculation based on surface pCO 2 observations from SOOP and modelled temperature profiles, we recommend the following steps:
1. Convert surface pCO 2 to C T * based on a mean A T estimate for the region under consideration.
2. Identify production pulses dominated by cyanobacteria as periods characterised by a decrease in C T * that occurs between June and August.
3. Integrate observed surface C T * changes to the temperature penetration depth (TPD) estimated from modelled temperature profiles, rather than using a mixed layer depth (MLD) estimate.
4. Restrict the reconstruction to periods with a stable or shoaling thermocline.
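The integration step in the list above can be illustrated with a short Python sketch. It uses the penetration depth definition given with Figure C4 (depth-integrated positive temperature change divided by the change at the surface). The function names, the seawater density, the 1 m layer thickness, and all numbers are assumptions for illustration, and the conversion from pCO 2 to C T * (step 1) is taken as already done.

```python
# Sketch: temperature penetration depth (TPD) and a TPD-based NCP increment.

RHO = 1005.0  # ASSUMED seawater density in kg m-3 (not stated in the text)


def penetration_depth(delta_profile, dz=1.0, sign=1.0):
    """Penetration depth in metres from a profile of incremental changes
    (surface value first, one value per layer of thickness dz).

    sign=+1 integrates positive changes (temperature, TPD);
    sign=-1 integrates negative changes (C_T*, CPD).
    """
    surface = delta_profile[0]
    if surface * sign <= 0:
        # e.g. TPD is only defined while the surface is warming
        raise ValueError("surface change has the wrong sign")
    integral = sum(v for v in delta_profile if v * sign > 0) * dz
    return integral / surface


def ncp_increment(delta_ct_surface, depth, f_air_sea=0.0, rho=RHO):
    """NCP increment (mol m-2) between two observations.

    delta_ct_surface: change of surface C_T* in umol kg-1 (negative = drawdown)
    depth:            integration depth in metres (TPD, or MLD for comparison)
    f_air_sea:        air-sea CO2 flux over the interval in mol m-2
                      (negative = uptake from the atmosphere)
    """
    return -delta_ct_surface * depth * rho * 1e-6 - f_air_sea


# Hypothetical temperature change between two dates on a 1 m grid (degC)
delta_t = [2.0, 2.0, 2.0, 2.0, 1.0, 0.0, -0.5]
tpd = penetration_depth(delta_t)

# Hypothetical surface C_T* drawdown of 20 umol kg-1 with 0.05 mol m-2 uptake
ncp_tpd = ncp_increment(-20.0, tpd, f_air_sea=-0.05)
print(tpd, round(ncp_tpd, 4))
```

Summing such increments over the production period gives a cumulative NCP. Because the function raises an error when the surface is not warming, it also reflects the restriction of the approach to periods of surface warming.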
It should be emphasised that lateral variability and water mass transport are critical for observation-based NCP estimates and constitute the largest source of uncertainty in our estimates. However, SOOP observations allow averaging of observations across large regions, which reduces the impact of lateral water mass transport (Schneider and Müller, 2018). The region for spatial averaging should be chosen large enough to avoid as much as possible the influence of lateral perturbations, which depend on the surface dynamics and the biogeochemical gradients in the surrounding area. Yet, the region for spatial averaging should be chosen small enough to ensure that variations of pCO 2 within the region are small compared to the temporal changes of interest. Another critical aspect of the recommended NCP reconstruction approach is the restriction to periods of a stable or shoaling thermocline. While in principle it is possible that net organic matter production could also occur during periods of a deepening thermocline, this process was observed neither in this study nor in previous years (Schneider and Müller, 2018), which is in line with the planktological finding that increased wind speed causes the termination of the bloom (Wasmund, 1997).
The NCP reconstruction approach presented in this study was derived from observations covering a single bloom event within the Central Baltic Sea. In the absence of observational data as comprehensive as those underlying our best-guess estimate, the applicability of this approach could not be tested for other regions or bloom events. However, the dynamics and intensity of the bloom event described here are comparable to previous, independent descriptions of cyanobacteria blooms. Therefore, it is assumed that the underlying biogeochemical mechanisms are representative and that the NCP reconstruction approach can be applied to other cyanobacteria bloom events. Specifically, we assume that the findings presented here can be applied to evaluate past and future pCO 2 observations made on Finnmaid and other SOOP in the Central Baltic Sea without compromise.
Larger uncertainties should be expected when applying the approach to other basins of (or even outside) the Baltic Sea.

In this study, the depth-integrated quantification of NCP that occurred during a cyanobacteria bloom in the Baltic Sea in 2018 is achieved through the interpretation of profiling measurements of pCO 2 that covered the entire bloom event. Furthermore, it is demonstrated that this best-guess estimate can be reconstructed with small bias from SOOP pCO 2 observations and modelled temperature profiles. Recommendations to apply our reconstruction approach to the comprehensive long-term record of surface pCO 2 data available for the Baltic Sea are given. The application of this approach will allow for the detection and attribution of trends in cyanobacteria NCP across decades. In particular, the comparison of NCP estimates of bloom events that occurred under different environmental conditions will provide a better understanding of the controlling factors. Ultimately, this knowledge will inform the design and monitoring of conservation measures aiming at a Good Environmental Status of the Baltic Sea and potentially other regions.
Website: Following the concept of literate programming and relying on the R package workflowr (Blischak et al., 2019), the code, plain text comments, and graphical output of this study are compiled as a website available at: https://jens-daniel-mueller.github.io/BloomSail/.
Code and raw data: A release of the GitHub repository underlying the website and containing all code was tagged as "os-2020-120_submission" and archived on https://zenodo.org/. All raw data required to run the analysis were uploaded manually to this archive.

Processed environmental data: Processed in situ observations of this study will be made available through https://www.pangaea.de/ upon acceptance of the manuscript.

A2 Sensor configuration and operation
The instrument periodically records zeroing values, during which the CO 2 within the gas stream is scrubbed by a soda lime cartridge. Zeroings of two minutes duration were recorded every five hours during the field deployment. A period of 600 seconds after the zeroing was flagged as a flush period, during which the sensor signal recovers to environmental conditions. Recordings during the flush and zeroing period were removed before further biogeochemical interpretation.

For the majority of the measurements, the sensor was operated with an 8 W pump (SBE-5T; Sea-Bird Electronics, Bellevue, USA) and the logging interval was set to 1 second. Only for the first two cruise days on July 6 and 10, a 1 W pump (SBE-5M, Sea-Bird Electronics) was used and the logging interval was set to 10 seconds.
The downcast profiles were always recorded continuously and with a steady profiling speed of ~2 m min −1 . The upcast profiles were either performed continuously as well, or with a stop to record an equilibrated reference pCO 2 value at a desired depth. Only continuous downcast profiles were used for biogeochemical interpretation.
Zeroing signals were recorded by the CTD unit from the analogue sensor output, as well as in the internal sensor memory.
Both records were used to ensure exact temporal match of the CTD and pCO 2 time series. Only pCO 2 data stored with higher temporal resolution in the internal memory were used during further analysis.
A3 Data post-processing
A drift correction as discussed in Fietzek et al. (2014) was applied to the field data to improve the data quality. This post-processing considers information from the pre- and post-deployment calibrations (i.e. concentration-dependent or span drift) and the regular in situ zeroings (i.e. zero drift).
The first 60 seconds within every zeroing interval were discarded to only consider smooth zero-gas measurements that are not affected by the signal drop from ambient pCO 2 to the zero value. Zero signals for every point of the deployment were obtained by linear interpolation of the zero measurements. In case of data gaps larger than 2 hours within the deployment data, the trend of the two zero signals before or after the gap was linearly extrapolated forward or backward, respectively, instead of interpolating across the gap. A concentration-dependent drift of the sensor was considered by transforming the pre- into the post-deployment calibration polynomial according to the actual sensor runtime (and not according to the course of the zero measurements as applied within Fietzek et al. (2014)).

Approx. 100 unrealistic outliers were found within the sensor temperature record (T sensor parameter) of the HydroC®. These were identified as electronic artefacts that lasted a few seconds at most, and the affected values were replaced by the constant temperatures recorded before and after these events.
Given the statistics of the pre- and post-deployment calibration, the small drift encountered throughout the deployment and the otherwise smooth performance of the sensor during the deployment, the accuracy of the measurements is considered to be 1% of reading, as also found within Fietzek et al. (2014).

A4 pCO 2 response time correction
The actual in situ response times (τ ) of the sensor were determined by fitting an exponential function to the signal recovery following a zeroing (Fig. A1; Fiedler et al., 2013; Fietzek et al., 2014). The determined τ values were used subsequently to correct the signal delay (Fiedler et al., 2013; Fietzek et al., 2014; Atamanchuk et al., 2015).

A4.1 Response time determination
In situ response times (τ ) were determined from pCO 2 data recorded during the flush period after each zeroing. Data recorded during the initial 20 seconds of each flush period were removed as those are affected by the mixing of residual gas volumes inside the sensor. Individual τ values were determined by fitting the non-linear model

$$ p\mathrm{CO}_2(t) = p\mathrm{CO}_2(t_{end}) + \left( p\mathrm{CO}_2(t_0) - p\mathrm{CO}_2(t_{end}) \right) \cdot e^{-dt/\tau} \qquad \mathrm{(A1)} $$

where pCO 2 (t) is the recorded pCO 2 at time t, pCO 2 (t 0 ) and pCO 2 (t end ) are the fitted pCO 2 values at the beginning and the end of the equilibration process, and dt is the time since the beginning of the equilibration process. In situ τ was determined for a fit interval length of 300 seconds. Flush periods were discarded when the mean of absolute residuals from the fit exceeded 1% of the final pCO 2 , a condition which indicated unstable environmental pCO 2 (e.g. due to unintended heaving of the sensor package).
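The fitting procedure can be sketched as follows, assuming the exponential recovery takes the common first-order form p(dt) = p_end + (p_0 - p_end) · exp(-dt/τ). The sketch uses only the Python standard library: for each candidate τ the two pCO 2 parameters follow from linear least squares, and τ itself is found by a coarse grid search refined with a golden-section search. The function names, the search bounds, and the synthetic data are hypothetical; the original analysis may well have used a different fitting routine.

```python
import math


def _fit_amplitudes(dt, p, tau):
    """For fixed tau, least-squares fit of p = p_end + (p_0 - p_end) * exp(-dt/tau).
    Returns (sum of squared errors, p_0, p_end) or None if ill-conditioned."""
    e = [math.exp(-d / tau) for d in dt]
    n = len(dt)
    se, see = sum(e), sum(x * x for x in e)
    sp, sep = sum(p), sum(x * y for x, y in zip(e, p))
    det = n * see - se * se
    if det <= 0.0:
        return None
    b = (n * sep - se * sp) / det  # b = p_0 - p_end
    a = (sp - b * se) / n          # a = p_end
    sse = sum((a + b * x - y) ** 2 for x, y in zip(e, p))
    return sse, a + b, a


def fit_tau(dt, p, tau_min=5.0, tau_max=600.0):
    """Estimate (tau, p_0, p_end); tau in the same time unit as dt."""
    def sse(tau):
        res = _fit_amplitudes(dt, p, tau)
        return math.inf if res is None else res[0]

    # Coarse search on a 1 s grid, then golden-section refinement
    grid = [tau_min + i for i in range(int(tau_max - tau_min) + 1)]
    best = min(grid, key=sse)
    lo, hi = max(tau_min, best - 1.0), min(tau_max, best + 1.0)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if sse(c) < sse(d):
            hi = d
        else:
            lo = c
    tau = 0.5 * (lo + hi)
    res = _fit_amplitudes(dt, p, tau)
    if res is None:
        raise ValueError("fit failed")
    return tau, res[1], res[2]


def fit_is_acceptable(dt, p, tau, p_0, p_end, max_rel=0.01):
    """Keep a flush period only if the mean absolute residual does not
    exceed 1% of the final (fitted) pCO2."""
    resid = [abs(p_end + (p_0 - p_end) * math.exp(-d / tau) - y)
             for d, y in zip(dt, p)]
    return sum(resid) / len(resid) <= max_rel * p_end


# Synthetic flush period: first 20 s dropped, fit interval up to 300 s
dt_s = [float(s) for s in range(20, 301)]
p_syn = [400.0 + (100.0 - 400.0) * math.exp(-s / 80.0) for s in dt_s]
tau_fit, p0_fit, pend_fit = fit_tau(dt_s, p_syn)
print(round(tau_fit, 2), round(p0_fit, 1), round(pend_fit, 1))
```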

Similar to previous studies, a decrease of τ with increasing in situ temperature was found. The dependence of τ on temperature was fitted with linear regression models, separately for the deployments with the 1 W and 8 W pumps. The sensor was carefully cleaned after each cruise, and no signs of a change in sensor response time over time, which would indicate fouling on the sensor's membrane, were detected.

For each recorded pCO 2 value, the corresponding τ was calculated from measured in situ temperature. The response time correction was then applied according to equation (S3) in the supplementary material of Bittig et al. (2018), referred to here as equation A2, where pCO 2,insitu is the true in situ pCO 2 time series, pCO 2,obs the pCO 2 time series as recorded by the sensor, and τ the response time for the interval between t i and t i+1 . Due to the short interval between adjacent observations in our study, the calculated value from the right side of equation A2 was considered directly representative for pCO 2,insitu (t i+1 ), although it is strictly the mean value between two adjacent observations, i.e. 0.5 · (pCO 2,insitu (t i ) + pCO 2,insitu (t i+1 )). Finally, a rolling mean with a window width of 30 s was applied to the response time corrected pCO 2,insitu time series to remove short-term noise. Please note that throughout the rest of the manuscript pCO 2,insitu is referred to as pCO 2 .
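Equation A2 itself is not reproduced in this text. The sketch below therefore assumes the bilinear formulation that Bittig and co-workers published for first-order sensor lag, in which the mean of two adjacent in situ values equals (obs(t i+1 ) - a · obs(t i )) / (2b) with b = (1 + 2τ/Δt)^-1 and a = 1 - 2b. Function names, the centred smoothing window, and the synthetic step signal are hypothetical, and the actual implementation may differ in detail.

```python
def correct_response_time(t, p_obs, tau):
    """Response time correction for a first-order lag (ASSUMED bilinear form).

    t, p_obs: observation times (s) and recorded pCO2, same length n
    tau:      response time (s) for each interval t[i]..t[i+1], length n-1
    Returns n-1 values, each strictly the mean of the in situ pCO2 at
    t[i] and t[i+1].
    """
    corrected = []
    for i in range(len(t) - 1):
        dt = t[i + 1] - t[i]
        b = 1.0 / (1.0 + 2.0 * tau[i] / dt)
        a = 1.0 - 2.0 * b
        corrected.append((p_obs[i + 1] - a * p_obs[i]) / (2.0 * b))
    return corrected


def rolling_mean(values, half_width=15):
    """Centred rolling mean over 2*half_width+1 samples (about 30 s at a
    1 s logging interval); the window shrinks at both ends."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half_width): i + half_width + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def simulate_lag(t, p_true, tau):
    """Hypothetical helper: apply the matching first-order lag filter to a
    known signal, to produce test data for the correction."""
    lagged = [p_true[0]]
    for i in range(len(t) - 1):
        dt = t[i + 1] - t[i]
        b = 1.0 / (1.0 + 2.0 * tau[i] / dt)
        a = 1.0 - 2.0 * b
        lagged.append(a * lagged[-1] + b * (p_true[i + 1] + p_true[i]))
    return lagged


# Synthetic step from 100 to 400 uatm, sampled every second, tau = 60 s
t_s = [float(i) for i in range(200)]
p_true = [100.0 if ti < 50 else 400.0 for ti in t_s]
tau_s = [60.0] * (len(t_s) - 1)
p_lagged = simulate_lag(t_s, p_true, tau_s)
p_corr = correct_response_time(t_s, p_lagged, tau_s)
p_smooth = rolling_mean(p_corr)
```

Because the correction is the exact inverse of the lag filter used to create the test data, the corrected series recovers the step without delay. With real data, the correction amplifies noise, which is why the subsequent smoothing is needed.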

A4.3 Quality assessment
The improvements by the response time correction were investigated based on the difference between up- and downcast pCO 2 profiles vertically gridded into 1 m depth intervals. To focus this quality assessment on the conditions in near-surface waters, which are the subject of this study, profiles were discarded which exceeded a maximum depth of 30 m and/or a maximum pCO 2 of 300 µatm. Those profiles were excluded only for the quality assessment (not for the biogeochemical interpretation) to avoid a bias through exposure to very high pCO 2 at greater depth. Furthermore, profiles with missing observations in two or more depth intervals were removed, which occasionally occurred when a sensor zeroing started while profiling. Based on this subset of response time corrected pCO 2 profiles, it was found that the mean absolute pCO 2 difference between the up- and downcast profile was <2.5 µatm averaged across the upper 5 m of the water column and <7.5 µatm across the upper 20 m. The highest offset was found at around 10 m water depth and results from the steep environmental pCO 2 gradient around the thermocline.

After a first, single addition of hydrochloric acid to reach a pH of 4 -3.5, A T is determined during a continued, stepwise titration to pH 3, during which pH is recorded potentiometrically (Dickson et al., 2007). Measurements were referenced to CRM batch 173 (Dickson et al., 2003). C T * calculated for discrete samples refers to a classical alkalinity-normalised C T , and was defined as C T * = C T · A T,mean / A T . C T * derived from discrete samples or pCO 2 sensor data are directly comparable (Fig. 5c) because they are referenced to the same mean A T of the discrete samples (1720 µmol kg −1 ).

B2 Phytoplankton
Phytoplankton samples were fixed with Lugol solution within no more than 24 hours after sampling. Samples were stored dark, before being transported to IOW and analysed in the laboratory within no more than 3 months after sampling. Phytoplankton community composition and biomass were determined by the Utermöhl method (HELCOM, 2017), which relies on microscope counts and the conversion of cell shape and size to biomass units.

Figure B2. Time series of cyanobacterial biomass, averaged for surface (0 -6 m) and subsurface (6 -25 m) water masses sampled from stations 07 and 10 (Fig. 1). Results are based on microscope counts and distinguish three genera (panels).
is the case in the Baltic Sea due to the absence of calcifying plankton (Tyrrell et al., 2008). To avoid confusion with measured or absolute C T values and for consistency with previous studies, the calculated variable is referred to as C T *.
To evaluate the applicability of this approach under the specific pCO 2 and temperature conditions observed in summer 2018, we calculated C T * changes between July 6 and 24 for a range of A T values covering three times the standard deviation of A T observations (Fig. B1). For assumed A T values of 1747 µmol kg −1 and 1693 µmol kg −1 , which is 1 standard deviation of the observations (27 µmol kg −1 ) higher and lower than the mean A T (1720 µmol kg −1 ), the bias of the derived change in C T * amounts to ± 1.6 µmol kg −1 . This bias is <2% compared to the signal of interest, i.e. the absolute drawdown of C T * (89 µmol kg −1 ).

Figure C1. Bias of changes in CT* as a function of the bias in mean AT used for calculation (see Fig. B1). Results correspond to the pCO2 and temperature conditions observed in this study and are expressed in absolute and relative units. Grey areas highlight ±1 standard deviation around the mean AT.
It should be noted that the bias assessment presented here reflects two types of errors, namely (i) the assignment of an erroneous mean A T value for the calculation and (ii) the lateral exchange of water masses with different A T but identical initial pCO 2 during the observation period. The robustness of this approach to the latter aspect is the reason why pCO 2 observations are more suitable to determine NCP than direct C T measurements, when those are not normalised to corresponding A T measurements.
C2 Calculation of the vertical entrainment flux of C T *
The vertical entrainment flux of C T * that occurred across the 12 m integration depth layer between August 7 and 16 was estimated assuming an instantaneous complete vertical mixing to 17 m water depth after August 7. For this scenario, the hypothetical homogeneous C T * concentration after the mixing event (C T *mix) equals the mean volume-weighted C T * concentration between 0 -17 m (Fig. C2). Furthermore, the entrainment flux (C T *flux) into the surface water column (0 -12 m) is equal to the concentration difference between observed C T * on August 7 and C T *mix, integrated from 12 to 17 m.

C4 Temperature penetration depth (TPD) concept
Figure C4. Illustration of the temperature and CT* penetration depth concept, short TPD and CPD. Shown are exemplary profiles of incremental changes of (a) temperature and (b) CT* observed between the cruises on July 6 and 10. TPD and CPD (red horizontal lines) are defined as the depth-integrated positive (for temperature) and negative (for CT*) changes (grey areas) divided by the change at the surface.
TPD and CPD are expressed in units of metres.

also physically detected and fixed. The resulting difference between the two systems was clearly correlated with absolute pCO 2 , as expected from contamination with ambient air. For data from the transect on July 5, the linear regression model pCO 2,true = pCO 2,LGR + 0.038 * pCO 2,LGR -24.2 was fitted, assuming that the LI-COR system had delivered the "true" pCO 2,true before its failure. Assuming further that the effect of the contamination remained constant, this relationship was then applied to reconstruct pCO 2,true from pCO 2,LGR for the period without LI-COR data. To validate this adjustment, pCO 2,true was also reconstructed from pCO 2,LGR on July 4 and compared to pCO 2 directly measured with the LI-COR system. The mean difference was below 2 µatm for the entire transect as well as for a data subset within the study region, giving confidence in the high accuracy of the adjusted pCO 2,true . It should be noted that the adjusted SOOP pCO 2 data recorded between July 7 and July 16 agree well with the in situ pCO 2 recorded by the sailing campaign, i.e. the standard deviations of all surface measurements in the study region overlap.
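As a worked example of the adjustment, the fitted regression can be written as a small Python function. Only the slope term (0.038) and the offset (-24.2 µatm) come from the text; the function name and the input values are hypothetical.

```python
def adjust_lgr_pco2(pco2_lgr):
    """Reconstruct pCO2 (uatm) from the LGR reading using the regression
    quoted in the text: true = LGR + 0.038 * LGR - 24.2."""
    return pco2_lgr + 0.038 * pco2_lgr - 24.2


# Hypothetical LGR readings in uatm
for reading in (100.0, 250.0, 400.0):
    print(reading, round(adjust_lgr_pco2(reading), 1))
```

With these coefficients the adjustment lowers readings at the low pCO 2 levels typical of the bloom period, and the size of the adjustment shrinks as the reading increases.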