A Baltic Sea estuary as phosphorus source and sink

Internal phosphorus (P) loading from sediments, controlled by hypoxia, is often assumed to hamper the recovery of lakes and coastal areas from eutrophic conditions. We use a box-model to calculate seasonal and annual inputs, export, 10 retention and internal cycling of P in the inner archipelago of Stockholm, Sweden (Baltic Sea) in 1968–2015. The area receives freshwater from Lake Mälaren and treated sewage from the greater Stockholm area. The sewage treatment plants (STPs) have improved their nutrient removal in steps, starting with P in 1972 and nitrogen in 1996. In the first 10-20 years after the main P load reduction in 1972-76, the model shows, in comparison to the load, a small negative annual P balance, probably due to release from legacy sediment P stores. The now stabilized, near neutral P balance indicates no continued 15 internal loading from legacy P, but P retention is low, despite improved oxygen conditions. Seasonally, sediments are a P sink in spring and a P source in summer and autumn. Most of the deep-water P release from sediments in summer-autumn appears to be derived from the settled spring bloom and is exported during winter. Oxygen consumption and P release in the deep water are generally tightly coupled, indicating limited control by P binding to iron-oxyhydroxides under oxic conditions. However, in years of deep-water hypoxia enhanced P release suggest contribution from redox-sensitive P stores. 20 The oxygen conditions in the area have generally improved, probably due both to lower sedimentation of organic matter from the 1970s and lower STP ammonium loads from the late 1990s. Increased oxygen inputs to the intermediate and deep waters due to weakened stratification and enhanced vertical mixing have probably also contributed, while increased respiration rates due to elevated bottom water temperatures probably explain worsened oxygen conditions during the 1990s. Since the P turnover time is short and legacy P minute, measures to bind P in Stockholm inner archipelago sediments would 25 primarily accumulate P imported from the Baltic Sea and from Lake Mälaren inflow, and management here should focus on reducing external nutrient inputs.


Introduction
Bottom water hypoxia (with oxygen concentration < 2 mg L −1 ) and anoxia are common eutrophication problems in stratified lakes and coastal waters (Diaz and Rosenberg, 2008;Conley et al., 2011).Its root cause is excessive inputs of nutrients resulting in increased settling of organic matter, which increases oxygen (O 2 ) consumption.The occurrence of hypoxia is also influenced by variations in deep-water ventilation, which depend on basin morphology and meteorological forcing (Zhang et al., 2010).In addition to its direct adverse effects on benthic fauna, hypoxia also influences processes cycling nitrogen (N) and phosphorus (P), with feedback effects on water quality.It is well known that release of phosphate-P from sediments can be drastically enhanced when bottom waters turn anoxic (e.g.Einsele, 1936;Mortimer, 1941;Middelburg and Levin, 2009).Such "internal P loading" is often assumed to hamper the recovery of eutrophic ecosystems (Schindler, 2006(Schindler, , 2012;;Vahtera et al., 2007;Puttonen et al., 2014;Stigebrandt et al., 2014).
In the Baltic Sea, one of the world's largest brackish ecosystems, control of both external and internal P loading is considered crucial for reducing the recurring summer blooms of nitrogen-fixing cyanobacteria (Vahtera et al., 2007).Here, large fluctuations in the water mass P inventory are linked Published by Copernicus Publications on behalf of the European Geosciences Union.
to the areal extent of hypoxia (Conley et al., 2002;Jilbert et al., 2011) and numerous experimental studies have documented drastically increased sediment P release as formerly oxic sediments turn anoxic (e.g.Koop et al., 1990;Gunnars and Blomqvist, 1997).The mechanism usually invoked is the classical redox-sensitive release of phosphate-P (or dissolved inorganic P, DIP) from P bound to iron(III)oxyhydroxides (Fe-P) as Fe is reduced to Fe(II) (Mortimer, 1941;Mort et al., 2010).The importance of O 2 control of P release from sediments is also supported by in situ flux studies showing that oxygenation of anoxic sediments can drastically decrease the P release (Ekeroth et al., 2016a;Hall et al., 2017).
The long-standing paradigm that "O 2 controls P release from sediments" is however too simple to fully describe the sediment P release processes (Gächter and Müller, 2003;Hupfer and Lewandowski, 2008).On a longer timescale the P release from sediments is due to an imbalance between the P sedimentation and the P binding capacity of the deeper, usually anoxic, sediment (Hupfer and Lewandowski, 2008;Carey and Rydin, 2011;Rydin et al., 2011).Even though Fe(III)-bound P can be abundant in surface sediments in the Baltic Sea, it seems to contribute little to the long-term P retention, since P is permanently buried mainly in organic forms (Jensen et al., 1995;Lukkari et al., 2009;Rydin et al., 2011;Mort et al., 2010) and, under some conditions, as persistent Fe(II)-P minerals (Slomp et al., 2013;Reed et al., 2016).In marine sediments sulfate reduction promotes the binding of Fe as sulfides, decreasing the Fe available for Pbinding, allowing considerable P release from oxic surface sediments (Caraco et al., 1990;Blomqvist et al., 2004).This suggests that the control of sediment P release is tightly coupled to organic matter decomposition (Caraco et al., 1990) rather than directly to the O 2 concentration (Hupfer and Lewandowski, 2008).
Estuaries are central in the land-ocean continuum, acting as strong nutrient filters and transforming terrestrial nutrient inputs by complex biogeochemical interactions, involving geological, physical, chemical, and biological processes, affected by a wide array of forcings, e.g.wind, light, water temperature, freshwater discharge, and, where relevant, tides (Regnier et al., 2013).Due to large nutrient inputs and intense nutrient cycling, productivity is usually high, with a 10-fold variability between estuaries (Cloern et al., 2014).In a recent review, Testa et al. (2017) listed a set of challenges and key research questions for estuarine ecosystem science, e.g.maintaining spatially distributed time-series data sets and investigating likely trajectories of ecosystem recovery in response to restoration, to improve mechanistic knowledge of ecosystem change from comparative studies.The recovery of estuaries can follow different pathways and is further complicated by climate change, emphasizing the need for comparative studies (Duarte et al., 2015).Baltic Sea archipelagos, bays, and estuaries are affected by nutrient inputs with local freshwater runoff, import from the open Baltic Sea, and sometimes also by direct discharges of treated munici-pal sewage water.Internal P loading from sediments is often regarded as an important additional P source (e.g.Puttonen et al., 2014;Rydin et al., 2017).Moreover, it is often claimed (e.g.Schindler and Vallentyne, 2008) that following high P loading, historical P storages in coastal sediments may leak for many years, preventing the recovery of good water quality.Indeed, a high load of organic matter can increase the sediment organic P fraction (Lukkari et al., 2009), which potentially contributes to internal loading as external loading is relieved.Accumulated organic matter will also elevate oxygen demand, delaying improvement of O 2 conditions (Middelburg and Levin, 2009;Reed et al., 2011).In lakes, recurrent seasonal internal P loading can be an important factor delaying ecosystem recovery (Schindler, 2006;Nürnberg and LaZerte, 2016).In contrast, estuaries usually have higher water-turnover rates than lakes and the resulting larger export of released P should facilitate water quality recovery.
The Stockholm archipelago, NW Baltic Sea Proper (Fig. 1), has historically received high nutrient loading in the form of sewage from the Stockholm area (Brattberg, 1986) and with fresh water from Lake Mälaren (at LM in Fig. 1), which drains a large inland area with agriculture, forests and several small to medium towns (< 150 000 inhabitants).Due to increasing discharges of mostly untreated sewage, water quality in the archipelago had deteriorated seriously by the mid-20th century (Brattberg, 1986;Schindler and Vallentyne, 2008).In the early 1970's, effective BOD (biological oxygen demand) reduction and chemical P precipitation was implemented in the major sewage treatment plants (STPs) in Stockholm and around Lake Mälaren (Karlgren and Ljungström, 1975).Archipelago water quality then improved rapidly, with P concentrations and phytoplankton biomass, particularly of cyanobacteria, decreasing substantially in a few years (Anonymous, 1975;Karlgren and Ljungström, 1975;Brattberg, 1986;Johansson and Wallström, 2001).After Sweden's accession to the European Union, the two largest STPs discharging to the inner Stockholm archipelago installed N removal in 1996, which also slightly improved P removal.As a result, water quality improved further, both in the still mostly P-limited inner archipelago and further out, where N now decreased to limiting levels (Boesch et al., 2006;Lännergren and Stehn, 2011).
Over 40 years have passed since the P and BOD loads to the Stockholm inner archipelago were drastically reduced in 1972.Levels of P and chlorophyll have declined and oxygen conditions have improved in the inner archipelago (Boesch et al., 2006;Norkko et al., 2012).Still, the sediments are suggested to be an important P source, judging from the regular late summer increase in P concentrations below the summer pycnocline that propagates to the surface layer in late summer to autumn (Boesch et al., 2006).Schindler and Vallentyne (2008) suggest that legacy P may still influence the area, although Schindler et al. (2008) find that there are indications that P release from sediments is beginning to decrease in response to improved oxygen conditions.It has thus been un- clear how much of the P release is a seasonal recycling of P due to organic matter decomposition, the extent to which the P release is controlled by low O 2 concentrations, and whether historical legacy P is still influencing the area.
While numerous studies have dealt with the short-term O 2dependent release of dissolved inorganic P (DIP) from Baltic Sea coastal sediments (e.g.Ekeroth et al., 2016b), fewer studies have focused on the long-term retention of P relative to external P inputs (e.g.Almroth-Rosell et al., 2016).Even fewer studies explore the dual role of sediments as sinks and sources for P over seasonal and annual scales (but see Almroth-Rosell et al., 2015).This may partly be a method-ological issue.Although direct in situ measurements of sediment release or uptake of DIP are relatively straightforward (e.g.Hall et al., 2017;Sommer et al., 2017), they need sophisticated equipment and require a very large sampling effort to reliably represent the variety of depths, sediment types, and seasons.Moreover, these methods do not quantify the settling fluxes of P to the sediment and thus are not sufficient to determine whether the sediment is a net P sink or source (Nielsen et al., 2001;Hupfer and Lewandowski, 2008).Unfortunately, direct sedimentation measurements are prone to large uncertainties (Blomqvist and Larsson, 1994;Gustafsson et al., 2013).
Box models can help constrain and integrate net nutrient fluxes over larger coastal areas, for full seasonal cycles and for longer time periods (e.g.Savchuk, 2005;Testa and Kemp, 2008;Testa et al., 2017).Previous P budgets for the Stockholm inner archipelago have focused primarily on the longterm P balance and are old (Anonymous, 1975;Karlgren and Ljungström, 1975), based on long-term annual means (Karlsson et al., 2010), or rely on mechanistic benthic models (Almroth-Rosell et al., 2016).In this study, we use a temporally highly resolved dynamic (non-steady-state) box model to analyse a time series of nearly 50 years of water P and O 2 concentrations in the Stockholm inner archipelago.Firstly, long-term changes in the area's annual P balance are used to evaluate whether the previously high P loading, drastically reduced over 40 years ago, is still influencing the coastal area due to legacy sediment P release.Our expectation was that P retention would gradually improve once the high P loading was relieved.We also wanted an improved estimate of the present P retention efficiency and its annual variability, with implications for P export to the open Baltic Sea, where excess P may stimulate N-fixing cyanobacterial blooms.Secondly, we evaluate the common assumption that low O 2 concentrations are the main mechanism controlling P release, and the suggestion that improved O 2 conditions have decreased the P release.Alternatively, P release is merely related to degradation rates of organic matter, which would imply that seasonal P release is mainly controlled by seasonal P deposition.We test these mechanisms by calculating the area's seasonal function as a source and sink for P relative to the O 2 concentrations and the deep-water O 2 consumption.Finally, the temperature in the deep water of the inner archipelago has increased several degrees, although it is unclear whether this is an effect of global warming or of local water circulation patterns.The temperature increase should elevate rates of O 2 consumption and P release, and we use our model to test this, accounting for estimated changes in organic matter deposition rates, which also should influence O 2 consumption.We test the sensitivity of our model results by varying some basic assumptions, e.g. about the origin of saline water flowing into the inner archipelago and about the accuracy of the P concentrations measured in freshwater input and STP loads.

Study site
The Stockholm archipelago comprises ∼ 24 000 islands (Hedenstierna, 1948) (Fig. 1).Its dominant freshwater source is Lake Mälaren (drainage basin 22 000 km 2 ) that empties through Norrström in central Stockholm (LM in Fig. 1) to the relatively enclosed inner archipelago (IA).This connects to the middle archipelago through one major and three shallow minor straits.The 18 m deep Oxdjupet sound (O in Fig. 1) has the only significant deep-water inflow and ∼ 84 % of the surface water outflow (Engqvist and Andrejev, 2003).Water exchange between the inner and middle archipelago is governed mainly by estuarine circulation, generated by the large freshwater flow, while wind and barotropic forcing are less important (Engqvist and Andrejev, 2003), making the IA suitable for mass-balance calculations.Due to the estuarine circulation and large relative volumes of surface water, the deep-water retention time is relatively short (Engqvist and Andrejev, 2003).The IA local drainage area is 323 km 2 , its water surface area 103 km 2 , its mean depth 14 m and its maximum depth 57 m (SMHI, 2003).

Data sources
We used water chemistry data from the Stockholm Vatten monitoring programme for 1968-2015 and the studies by Mats Waern in 1968-1976 (Nordling et al., 1984).Data from 1968 to 1981 were digitized from printed reports.Sampling frequency is detailed in Table S1 in the Supplement.Total phosphorus (TP) and DIP were analysed by standard colorimetric methods (molybdate reactive P), TP after persulfate digestion.Due to analytical problems, nutrient data from 2011 were not used (Walve, 2012).Temperature was measured within the water samplers until 1991, and later with an in situ CTD probe at the same depths.Salinity was calculated from conductivity of water samples in 1968-1991 and 2011-2015, and from conductivity measured in situ by CTD in 1992-2010.Oxygen was measured by Winkler titration (SS-EN 25813-1), and hydrogen sulfide (H 2 S) by a spectrophotometer (SS 028115-1) and converted to negative oxygen.The few times when Stockholm Vatten and the Mats Waern group sampled the same station on the same day, agreement was good (salinity r 2 = 0.99, slope = 0.99, n = 19, TP r 2 = 0.99, slope = 0.98, n = 15).Data on the daily freshwater discharge from Lake Mälaren, calculated by Stockholms Hamnar AB (Ports of Stockholm), were provided by Stockholm Vatten and have an estimated error of < 5 % (SMHI, 2006).Concentrations of TP and DIP in the discharge of Lake Mälaren were measured monthly (1968)(1969)(1970)(1971)(1972)(1973) or weekly (from 1974) by Stockholm Vatten.STP discharges of water and TP to the inner archipelago were obtained from Stockholm Vatten and Käppalaförbundet (Henriksdal -quarterly data 1971-1987, weekly 1988-2013, monthly 2014-15, Käppala -quarterly 1973-1985, weekly 1986-2015, Loudden -quarterly 1971-1988, weekly 1989-2003, Bromma -weekly 1989-2013, monthly 2014-15 and Hemmesta -weekly 2004-2006).For 1968-1970, we only found summed annual TP load from all STPs discharging into the inner archipelago (Anonymous, 1975).DIP load, analysed on unfiltered water with an unknown fraction of Fe-bound P, was available from 1989.The Swedish Meteorological and Hydrological Institute (SMHI) provided precipitation data (station Stockholm 98 210) and local land runoff 1977-2015 (S-HYPE model).We used the monthly means for 1977-2015 to estimate local runoff for 1968-1976.Since local runoff is < 2 % of total annual freshwater inputs, resulting errors are small.

Water and salt box model
We used a box model (Fig. 2) with four fixed depth layers to calculate the water, salt and P budgets of the IA, with two different approaches (see below).The model assumes an estuarine circulation (Engqvist and Andrejev, 2003), with outflowing brackish surface water and a counter-current of inflowing saltier water (SW) from the middle archipelago to the deep water of the IA, creating a corresponding upwelling of deep water in the IA.We also included mixing between water layers.The deep-water layer (D, 20-57 m), which has spatially and vertically relatively homogenous salinity in the IA, is separated from the surface water (S, 0-10 m) by an intermediate pycnocline layer (M, 10-20 m).Since there is often a shallower pycnocline during high freshwater input, the surface layer was divided into an upper surface layer (US, 0-4 m), representing outflowing water, and a lower surface layer (LS, 4-10 m).On most occasions the upper water mass is well-mixed to 10 m, suggesting a deeper outflowing layer, but the 0-4 m layer salinity is still representative due to the weak vertical salinity and nutrients gradients on these occasions.We performed sensitivity tests of model results to different depths of the outflowing layer (see Sect. 2.6).Model box volumes (392 Mm 3 for 0-4 m, 465 Mm 3 for 4-10 m, 465 Mm 3 for 10-20 m, and 214 Mm 3 for 20-57 m) were calculated from hypsographic curves of IA basins (SMHI, 2003), excluding three relatively isolated small, peripheral basins (Brunnsviken, Edsviken and Kyrkfjärden).
We used total freshwater inflow (FW t ), salinity of the IA and salinity of inflowing SW to constrain water flows, with two different approaches.Firstly, we used volume-weighted mean salinity of the IA to constrain water flows, typical for conventional one-dimensional box models.Thus, this "mean model" yields a SW inflow matching the salt amount in the IA and the salinity of outflowing water (as volume-weighted mean for surface water, i.e.Sal out equals Sal US , Fig. 2).We used this approach to estimate representative mean internal P fluxes in the IA.For some analyses, we also used a secondary "boundary model".It adds an extra boundary SW in-  flow needed to match the salinity in the surface water near the boundary, which is higher than the volume-weighted mean surface water salinity of the IA.Thus, the boundary model accounts for the surface water salinity gradient by adding to the water turnover in the outer part of the IA, not represented in the mean model.We give detailed calculations below.We used the boundary model to correct the P input-output budget but do not regard it as representative for the mean internal P budget, since it is mainly a "short-circuit" flow occurring in the outermost sub-basins of the IA.We made several sensitivity tests to check the effects of various assumptions on the results.
Equations for water exchange used Knudsen's theorem for estuarine circulation (Knudsen, 1900), which assumes constant water volume (Eq. 1) and equilibrium of salt transport (Eq. 2), where Q out is the outflowing volume of surface water (with the relatively low salinity Sal out = Sal US in the mean model), Q in is the inflowing volume of deep water (with the relatively high salinity Sal in ), and Q FW t is the total freshwater input to the estuary, i.e. the sum of Lake Mälaren discharge, local land runoff, precipitation and waste water discharge.Q in is calculated from Eqs. (1) and (2) according to Since the Knudsen equations assume steady state, which for salt is a valid approximation only for a period longer than the turnover time of the estuary, where volume-weighted salinity can shift by up to one unit within a month, we modified the salt mass-balance equation to include changes in total salt amount, S IA , of the IA, Q in is calculated from Eqs. (1) and (4) according to A similar box model approach was developed by Hagy et al. (2000) and used by Testa and Kemp (2008), Testa et al. (2008), andBoynton et al. (2008) to calculate salt and water transport in the Patuxent River estuary of Chesapeake Bay.
The model interpolates the daily changes in salt amount S IA from volume-weighted salinity data for each depth layer, calculated from measured salinity profiles (4 m depth resolution) at the central stations (A, AV, H, L, and K in Fig. 1) and water volumes for 1 m depth layers (SMHI, 2003).We tested the representativeness of the central stations with data from all sub-areas for 1998-2005.The central stations slightly underestimated salinity and DIP, but we found no systematic difference for TP (Table S2).We used these relationships to correct data for each depth layer.A model sensitivity test with non-adjusted data showed smaller uncertainty than for other sensitivity tests (see below).In some years absent winter (February) data for the IA central stations were estimated from Solöfjärden (S in Fig. 1) data using correlations with the central stations, and correlations between IA surface water and deeper layers for years with full February data sets (see the sampling frequency in Table S1).December data were generally lacking, but model sensitivity tests, run with extrapolated November or February data, rather than from linear interpolation, showed less divergence than for other sensitivity tests (see below).
The salinity at 20-30 m depth in Trälhavet (T in Fig. 1), just outside the boundary strait Oxdjupet (O in Fig. 1), was used to represent inflowing deep water (Lännergren and Stehn, 2011).This salinity is in good agreement with the bottom water salinity in Solöfjärden (S in Fig. 1), the IA subbasin closest to Oxdjupet.Oxdjupet data also support the use of this layer (Lännergren and Stehn, 2011), but were not used in the model, due to large data gaps.
The wind-driven and diffusive vertical mixing (Fig. 2) was calculated according to the salt mass-balance equations for the deep layer D (Eq.6), intermediate layer M (Eq.7), and upper surface layer US (Eq.8): where S is the change in salt amount in each layer for the time step used, Q D mix is the vertical mixing of the intermediate (10-20 m) and deep layers, Q M mix is the mixing of the lower surface and intermediate layers, and Q S mix is the mixing of upper and lower surface layers (Fig. 2).Mixing terms were calculated by rearranging the equations.
The boundary model adds an extra SW inflow Q inxD according to where Sal b is surface water salinity at the boundary.This inflow generates a corresponding increase in outflow.
In the innermost part of the IA, discharged sewage water creates a four-layered water flow, with an outflowing sewage-enriched layer at 8-20 m, below inflowing shallow and outflowing surface layers and above inflowing deep water (Boesch et al., 2006;Lännergren and Stehn, 2011).These intermediate layers are restricted to the innermost part of the IA and do not affect the water and salt budgets of the box model.Moreover, directing STP flows to either the upper surface or intermediate layer has no effect on our results, since they are aggregated for the upper 0-20 m.
The modelling used the ExtendSim8 software (Imagine That Inc.), which is suitable for sequential computations and interpolations of large data sets.We used a time step of 1 day and computation dt = 100 day −1 .
We correlated model results with water temperature data from the representative station Koviksudde (K in Fig. 1) with the most complete sampling.Data were compiled for depth intervals with consistent data for all sampling occasions.Means were calculated for depth intervals 0-8 m (data mostly from 0, 4, and 8 m), 12-20 m (mostly 12, 16, and 20 m), and 24-32 m (24, 28, and 32 m).Since some sampling occasions were close to the end or the beginning of months, temperature data were interpolated between occasions before calculations of seasonal means.

Phosphorus model
Phosphorus inputs to the IA come from Lake Mälaren, local land runoff, precipitation on the sea surface, STPs, and inflowing saltier water from the middle archipelago.P import from the middle archipelago was calculated from the volume of inflowing water Q in and the mean P concentration at 20-30 m depth in Trälhavet (T in Fig. 1).Atmospheric P deposition on the IA is ∼ 15 kg km −2 yr −1 (Savchuk et al., 2008) -minute compared to other P sources, and therefore disregarded.
The P export of the mean model (defined above) is a function of the mean model water outflow and the volumeweighted P concentration in the upper surface layer.The P concentration in this and all other layers was continuously adjusted to fit interpolated observations, by additions or removal from the model.The summed corrections were used as estimates of net internal losses from, or inputs to, the water column.
The P export of the boundary model was calculated using the out-flow of the mean model and the P concentration at 0-4 m depth near the boundary (defined by stations S, O and TF, Fig. 1), and then adding the extra flow across the boundary that was needed to match the boundary salinities.The TP export calculated by the boundary model also had the advantage of larger data availability of TP concentrations at the boundary than at the central stations of the IA (Table S1).Using the boundary model, monthly and annual external P import-export balance was calculated.Thus, a positive P balance means that import from external sources are larger than the export.The extra flows generated in the boundary model generally resulted in additional annual net P loss from the inner archipelago, due to mostly higher P concentrations in the outflowing surface water than in the inflow (defined by concentration in deep water at station T).

Oxygen budget for the deep water
An O 2 budget was calculated only for the deep layer (> 20 m).In the model, O 2 is imported to the deep water by SW inflow and mixing with overlying water (10-20 m).Oxygen is exported by upwelling and mixing into overlying water.Consumption of O 2 in the deep water was calculated as for P, i.e. with a function continuously adjusting the estimated concentration to fit observations.The O 2 consumption was converted to a theoretical mineralization of P assuming Redfield composition (1 mol P per 106 mol O 2 ) of organic matter.The complete oxidation of ammonium, generated from organic matter, to nitrate would give a 30 % lower P yield (1 mol P per 138 mol O 2 ).We used the higher P yield (1 : 106) because ammonium periodically accumulated in the deep water, and because ammonium transfer to intermediate water layers was unknown, as was also the use of nitrate in deep-water organic matter oxidation (denitrification).

Sensitivity analysis
We tested the robustness of the P budgets by (1) varying the freshwater flow from Lake Mälaren by ±5 %, (2) assuming different recruitment depths for the inflowing SW (12-20,  .Lake Mälaren contributes 93.6 ± 2.1 % (mean and SD for all years) and STPs 3.6 ± 1.2 % of the total freshwater inputs.
16-24 and 24-32 m compared to the default depth 20-30 m), (3) assuming a ±10 % error in the P discharge from the STPs, (4) assuming a ±10 % error in TP and DIP measurements, and (5) changing the depth layer of the outflowing surface water to 0-10 m.

Salinity and temperature
The salinity of the deep water (> 20 m) in the IA mirrors the salinity at 20-30 m depth in Trälhavet (T in Fig. 1), the assumed recruitment depth for inflowing salt water (SW) (Fig. 3a).The SW salinity peaked around 1980 at 6.0-6.5 and has since slowly declined and stabilized at ∼ 5.5.The volume-weighted mean salinity of the IA showed considerable seasonal variation (Fig. 3a), mainly due to variable freshwater input affecting the surface water salinity (Fig. 3b).
The long-term trend of IA mean salinity mirrors that of the deep water but showed a more abrupt decline, influenced by the large FW inflows in 1977-1988 (Fig. 4a).From the late 1980s, a less distinct halocline and a deeper surface mixed layer (Fig. 3b) accompanied the salinity decrease in the deep water.The pycnocline depth (depth of maximum density change per metre) increased from ∼ 10 to ∼ 14 m and its strength (density change per metre) decreased (data not shown but indicated in Fig. 3b).From 1989, there was an abrupt increase (∼ 2 • C) in mean July to October temperature in the intermediate and deep layers (Fig. 3c).There is a seasonal temperature rise from July to October (by ∼ 3 • C), but this changed little, and the higher mean July-October temperature from 1989 on, relative to the previous period, was mostly the result of larger seasonal temperature rises earlier in the year.After 1989, early May temperatures were ∼ 0.5 • C higher than previously in the intermediate layer and ∼ 1 • C higher in the deep layer, with early July temperatures ∼ 2 and ∼ 1 • C higher, respectively.After 2007, temperatures have increased even further (Fig. 3c).

Phosphorus concentrations
Following the reduction of P load from the STPs in the early 1970s (see below), annual mean total P concentrations (TP) in the IA decreased from > 100 µg L −1 to around 49 µg L −1 in 1976-85 and 39 µg L −1 in 1986-1995 (Table 1, Fig. 3d, e).
After the P-treatment upgrade in 1996, TP declined further to 30-32 µg L −1 (Table 1).In 1968-75, TP was lower in the deep than in the surface water, but this was reversed after the P load reduction (in the following, P refers to TP unless otherwise stated).The P decline was less prominent in the deep water than in the surface water and the deep water varied more among years (Fig. 3d, e).In the early 1970s, substantial amounts of dissolved inorganic phosphorus (DIP) remained in the upper 20 m all year (Fig. 3d).From the mid-1970s, summer (June-August) DIP concentrations declined to values close to or below the analytical quantification limit (3 µg L −1 ).In the deep water, DIP was consistently the main P fraction (Fig. 3e).After the P load reduction in the 1970s, the P concentration in the inflowing salt water (SW) from the middle archipelago (Trälhavet) has declined from ∼ 35 to ∼ 25 µg L −1 (Table 2).

Water budget
Total freshwater input (FW t ) to the IA in 1968-2015 (5296 ± 1418 Mm 3 yr −1 , or 168 ± 45 m 3 s −1 , mean ±SD among years) was dominated by Lake Mälaren (94 ± 2.1 %), followed by treated sewage (3.6 ± 1.2 %), local runoff (1.7 ± 0.6 %), and direct precipitation on the water surface (1.1 ± 0.4 %).FW t input shows no trend, but the early 1970s were relatively dry (1976 extremely dry), followed by a period of high discharge in the late 70s to the late 80s (Fig. 4a, Table 1).In the 90s there was again relatively low discharge (1996 and 1997 were especially dry), followed by high discharges around the year 2000 and at the end of the time series (Fig. 4a, Table 1).On average, the annual salt water inflow from the middle archipelago (Q SW ) was 74 ± 22 % of the FW t input when calculated by the mean model, and 136 ± 54 % by the boundary model (Fig. 4b, Table 2).Salt water inflow varied less among years than the FW t discharge (Fig. 4), and on an annual basis, there was no correlation be-  1968-1975 1976-1985 1986-1995 1996-2005 2006 1968-1975 1976-1985 1986-1995 1996-2005  tween these flows.The SW inflow was higher in the second compared to the first half of the time series, especially when calculated by the boundary model (Fig. 4b, Table 2).The sensitivity analyses with variable recruitment depth of SW, or variable discharge from Lake Mälaren, changed the SW inflow only ∼ 6 % relative to default models, except for the scenario with shallowest (12-20 m) recruitment depth (+15 %), and the scenario with export depth 0-10 m (+36 %).The mean model gives an annual mean water turnover time of 2.2 ± 0.6 months.The seasonal variation is large, ranging from an average of 1.7 months in April to 5.2 months in August.

External P loads
In the late 1960s and early 1970s, the direct P discharges from STPs to the inner archipelago were very high, peaking at over 800 t in 1970 (Fig. 5a).In 1976, when P treatment had been implemented, P loads from Stockholm's STPs had decreased by a factor of 10 (Fig. 5a).During this pe-riod, reduced P loads to Lake Mälaren also decreased P inputs substantially (FW, i.e.FW t minus IA STPs).After 1976, the annual P load from STPs has mostly been considerably smaller than the P inputs from FW or SW inflow (Fig. 5b, Table 1).From 1976 to 1995, the P load from STPs gradually decreased further, from ∼ 80 to 45 t yr −1 , and after the 1996 P-treatment upgrade the load was reduced to ∼ 30 t yr −1 , i.e. to approximately 10 % of the total external (STPs, FW, and SW) P load to the inner archipelago.From 1976, the mean annual P loading with FW has been 170 ± 62 t yr −1 , predominantly from Lake Mälaren (162 ± 62 t yr −1 , Table 1).Although there is a significant increase in the modelled inflowing SW volume with time, there is no equivalent increase in the SW P input, because of the decreasing SW P concentrations (in Trälhavet, T in Fig. 1, Table 2).
The Lake Mälaren and SW DIP inputs (data not shown) follow the same pattern as the TP load.From 1989 (when data on DIP loads from STPs become available) on average 15 t of DIP per year originate from STPs, 10 % of the total external DIP inputs to the area, or 17 % of the DIP load from land.In contrast to the decrease in TP load, DIP load from STPs remained virtually unchanged after the P-treatment upgrade in 1996, indicating that mainly particulate P was removed.

Annual and long-term P input-export budget
The annual P export from the inner archipelago closely mirrored (linear regression r 2 = 0.97) the summed annual external P loading (including P in SW inflow) (Fig. 5c).Thus, the annual P loading mainly governed the P export.The inputexport difference shows the net P balance of the area, with positive values indicating that the area is a net P sink (export < input, i.e. positive P balance and (positive) net P retention), while negative values indicate it is a net P source (export > input, i.e. negative P balance).The annual inputexport difference was generally a small fraction (mostly < 10 %) of the inputs and exports and showed considerable variation among years (Fig. 5d).The boundary model sensitivity analysis indicated an uncertainty of the annual P balance of approximately ±25 t (Fig. 5d).Following the major relaxation of external P loads in the 1970s, the long-term mean P balance was initially negative (1976)(1977)(1978)(1979)(1980)(1981)(1982)(1983)(1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995), but later became slightly positive (1996-2015) (Table 3).Sensitivity tests show the largest sensitivity to assumed SW recruitment depth, followed by potential errors in measured TP concentrations.Excluding the most extreme scenario with the shallowest SW recruitment depth, the mean P balance in 1996-2015 ranged from −19 to +15 t yr −1 (Table 3).For 1976-2015, linear regression indicates a weak long-term trend of increasing P retention (0.9 t yr −1 , r 2 = 0.16, p < 0.02), but there was no trend for 1996-2015 (r 2 < 0.01, p > 0.8) (Fig. 5d).
In 1968-74, before effective P reduction in STPs, P retention was highly variable (Fig. 5d).As STPs in this period were the main source of P, assuming 10 % uncertainty in the STP load will mean a proportionally large uncertainty in the retention value.In 1969 and 1970, years with relatively good O 2 conditions (Fig. 6c), a positive P retention was relatively robust.From 1976, when the P loads had decreased substantially, our results indicate a 9-year period with mostly negative P balance, i.e. larger export than input.The discrepancy between the boundary model and the mean model for some of these years (Table 3) was due to better data availability for the boundary model in winter, the season with most of the P export.This was followed by a 5-year period (1985)(1986)(1987)(1988)(1989) with a balanced P input and export which coincided with better O 2 conditions in both the deep (> 20 m) and intermediate (10-20 m) water.As O 2 conditions deteriorated again in the 1990s, there was clear net P loss from the IA.This may partly have been release of P deposited in the previous (1985-89), more O 2 -rich period.In 1992 and 1993, the high P loss from the IA (Fig. 5d) was associated with relatively widespread hypoxia in both the deep water and the 10-20 m layer (Fig. 6c), and exceptionally poor P retention < 20 m (see next section and Fig. 6a), suggesting high P release from bottoms shallower than 20 m.Positive values indicate that the IA accumulates (is net a sink for) P, negative values that it is a net source of P. Year 2011 is excluded.Sensitivity tests included (1) Variable freshwater flow with Lake Mälaren (Q LM ), (2) recruitment depth of inflowing salt water (SW) from the middle archipelago (affecting salinity and P), (3) sewage treatment plant (STP) discharge of P, (4) water P concentrations in the IA and Lake Mälaren (LM), (5) deeper layer for water export from the IA, (6) non-adjusted data from central IA stations (see methods), (7) mean model.Sensitivity test 1976Sensitivity test -1985Sensitivity test 1986Sensitivity test -1995Sensitivity test 1996Sensitivity test -2005Sensitivity test 2006Sensitivity test -2015 1976-1985 1986-1995 1996-2005 2006-2015

Internal P fluxes
Annual P budgets were also derived separately for the deep water (> 20 m) and for the combined surface and intermediate water layers (0-20 m), using the mean model.For each layer, we calculated internal losses or sources of P from the sum of daily adjustments of the P modelled from external inputs that were required to match the observed P. In the upper 0-20 m layer, the net loss of P needed to balance the annual P budget was interpreted as being due to net P sedimentation (Fig. 6a).Conversely, a source of P was needed in the deep water (Fig. 6a), mostly in the form of DIP (results not shown, but see Fig. 3e), and was interpreted as a net sediment P release.low 20 m depth) in 1976-2015, with no trend (Fig. 6a).The annual net P loss from 0 to 20 m was similar, on average 61 ± 29 t for 1976-2015.The estimated deep-water P-release was most sensitive to assumed potential errors in P concentrations and SW recruitment depth (Table 4).
Most net P sedimentation from the upper 20 m layer (∼ 56 % of annual) occurs in spring (March-May), most of the deep-water net P release (∼ 57 %) in summer and autumn (July-October, Figs.6b, 7, and Supplement Fig. S3).Maximum P release usually occurs in September (13 t month −1 , i.e. 13 mg m −2 day −1 or 0.43 mmol m −2 day −1 ).In spring, the net P loss from surface water layers decreases the P pool and removes some of the external P input, and the P export is smaller than the input (Fig. 7).In late summer, when the P pool starts to rebuild, there is net P input in the deep water (partly also advected to surface layers) and a small net P loss from surface layers (Fig. 7).Also contributing to increasing P concentrations in summer and autumn are the larger external P inputs compared to P exports.The annual net P sedimentation in spring (March-May) correlated with deep-water sediment P release in summer and autumn (July-October) (Fig. 8a), which is consistent with a predominantly internal recycling of P within the plankton growth season March-October.Although there is on average a net P loss from the 0-20 m layer also in summer, it varies considerably among years (Fig. 7c).In a few years, there is a net internal P source in the upper layer 0-20 m in summer, suggesting P release from shallow sediments was larger than gross P sedimentation from the surface water layer.
The combined annual net release or loss for all water layers indicates whether the inner archipelago sediments act as a source or sink for P relative to the water column (Fig. 6a).The pattern was similar to the net input-export budget (Fig. 5d); i.e. in years with net P release to the water, there was often, but not always, an annual negative P balance for the IA.Differences are partly due to the different approaches used (mean and boundary models, respectively), but delays between years also contribute, as P accumulated in the water late in the year is partly exported early in the following year.

Oxygen concentrations
Oxygen (O 2 ) concentrations in the deep water vary seasonally, being highest in January-May and lowest usually in September-October (Fig. 7).Anoxia in the central inner basin has been rare since 1976, and hydrogen sulfide was only recorded in 1993 (data not shown).It should, however, be noted that seasonal occurrence of hydrogen sulfide is common in enclosed, peripheral areas of the inner archipelago (Lännergren and Stehn, 2011).The annual minimum volume-weighted O 2 concentration in the deep water shows a general improving trend (+0.07 mg L −1 yr −1 , r 2 = 0.46, p < 0.0001, Fig. 6c).There are also decadal variations, with bad conditions in the 1970s, better in the 1980s, worse again in the 1990s, and finally better from 2001.The annual maximum extent of hypoxic (O 2 < 2 mg L −1 ) bottom areas was estimated using the hypsographic curve (bottom area per depth layer) of the IA.In the period 1968-1980, hypoxic areas varied greatly, from < 10 to 100 % of deepwater bottoms and 0 to 100 % of 10-20 m bottoms (Fig. 6c).In the 1980s only minor deep areas were hypoxic (except for 1984), followed in the 1990s by larger hypoxic areas and then a return to only occasional deep-water hypoxia in the new millennium.From 1994, there has been no hypoxia in the 10-20 m depth interval.

Relationship between O 2 and P release in the deep water
The deep-water P release in July-October correlated negatively with the annual minimum O 2 concentration, usually occurring in September or October (Fig. 8b).This may partly be due to the classic mechanism of enhanced P release at low O 2 concentrations, but is not necessarily a cause and effect relationship.Both P release and O 2 depletion may be a consequence of organic matter degradation, and thus primarily linked to organic matter sedimentation, as is indicated by the correlation between the net P loss in spring and the net P release in summer and autumn (Fig. 8a).To shed light on the cause and effect relationships we also modelled deep-water O 2 consumption, which was a function of the seasonal decrease of volume-weighted O 2 concentration, the SW O 2 inflow (mean model), and the net O 2 input to the deep water from mixing with the 10-20 m layer.The O 2 consumption rate (Fig. 7) was low in spring, increased through the summer and peaked in September-October (at 800 t month −1 , i.e. 0.8 g m −2 day −1 or 26 mmol O 2 m −2 day −1 for 32.3 km 2 bottom area below 20 m depth).The P release rates (P O 2 ) theoretically expected from mineralization of organic matter were calculated from deepwater O 2 consumption, assuming organic matter of Redfield elemental composition (see Sect. 2).In most years, the annual P O 2 was close to the modelled annual TP and DIP release rates (Fig. 6a, b), but tended to be lower than modelled TP and DIP release in years with low minimum O 2 concentrations (e.g. 1984concentrations (e.g. , 1992concentrations (e.g. -93, 1996concentrations (e.g. , 2000concentrations (e.g. , 2008concentrations (e.g. , and 2012) (Figs. 6a and 9a, b)) (Figs. 6a and 9a, b).When the comparison was restricted to July-October, P O 2 explained less of the P release (Figs.6b  and 9c, d), with the largest deviation for September (Fig. 7).Conversely, the modelled P release was generally somewhat lower than P O 2 in May-June (Figs. 7, Supplement Fig. S3).

Factors affecting oxygen conditions
The observed O 2 minimum correlates with modelled O 2 consumption in July-October (Fig. 8c, here converted to P release; see above).Since the calculation of O 2 consumption depends on the observed O 2 minimum concentration, a correlation is expected and should be interpreted with caution.We find a similar correlation between the O 2 minimum, which is usually in September, and the O 2 consumption in July (data not shown).Although our model results indicate that the O 2 consumption rate influences the O 2 concentration of the deep water, hypoxia primarily developed in years with low O 2 inputs to the deep water, i.e. when the sum of O 2 inputs from boundary inflow and mixing with intermediate water was low (Fig. 8c).
Mean O 2 consumption in July-October (converted to P mineralization from organic matter, P O 2 ) did not correlate with water temperature (Fig. 10a).However, there was a tendency of higher P O 2 at high temperatures when normalized to the net organic matter sedimentation in spring (as modelled P loss from upper water mass, Fig. 10b), suggesting a higher degradation rate per amount of settled organic matter at higher temperatures.The results were clearer when the same analysis (P O 2 per P sedimentation as a function of temperature) was done for each month separately (data not shown), with similar slopes in July, August and September, and with correlation coefficients decreasing from July on (r 2 = 0.41, 0.30, 0.18, < 0.01, for each month July-October, respectively).There was no long-term trend in July-October O 2 consumption (Fig. 11).A decrease from the 1970s to the 1980s was followed by increased O 2 consumption in the 1990s (Fig. 11), possibly influenced by increased temperatures.

Discussion
We used two different model approaches to evaluate the internal P fluxes and the P input-export budget of the Stock-holm inner archipelago (IA).By accounting for the surface salinity gradient from the IA central stations to the boundary, the boundary model gives a more realistic boundary flow, considerably higher than in the mean model, where water turnover will balance mean IA salinity.Despite this difference, both models result in similar input-export P budgets, with differences for most years within the uncertainty estimates of the sensitivity tests.We consider the boundary model the most reliable for the input-export budget due to the more realistic boundary flow and the larger data availability at the boundary compared to IA central stations, especially in winter, when net P export is highest.In contrast, water turnover rates of the mean model should be more representative, since it uses the central stations to estimate mean internal fluxes in surface and deep water.Fortunately, the period with the highest internal flux rates (April-October) has the best data availability.The similarity of the input-export budgets of the two models suggests that internal nutrient fluxes in the outer IA section are not fundamentally differ-  ent from that represented by the central stations.More winter data for the boundary model in the period from the late 1970s to the early 1980s may have contributed to the larger P export of the boundary model in this period (Table 3).

Legacy P release and long-term P retention
Our study highlights the dual function of sediments in a coastal area as sink and source for P and shows that this function strongly depends on temporal resolution.Sediments of the IA change from a P sink in spring to a P source in summer and autumn, are an annual net source in some years, especially when there is hypoxia, and retains little of the P inputs to the IA over the long term.Two previous studies have estimated long-term P budgets for the Stockholm IA.Using a mass-balance P budget based on long-term annual means, Karlsson et al. (2010) calculated a P retention of 51 t yr −1 for the period 1982-95 and 53 t yr −1 for 1996-2007.They also calculated a P retention of 70 t yr −1 based on sediment accumulation rates.Almroth-Rosell et al. ( 2016) used coupled physical and biogeochemical models to calculate a P retention of 30 t yr −1 for 1990-2012, corresponding to 18 % of the land-derived P loading.In their biogeochemical model, sediment DIP release was O 2 sensitive and a fraction of P was assumed to be permanently buried, apparently a nearly constant amount per year.Thus, the retention estimate was strongly dependent on the assumed model parameters.Overall, our results indicate a near balance between input and export of P.During the first two decades following the main P treatment upgrade in the 1970s, there was a tendency to net P export ("negative retention"), indicating legacy P release.For the last 20-year period, our sensitivity tests suggest a long-term mean P retention of at most 15 t yr −1 (Table 3), corresponding to 8 % of summed FW and STP inputs, half the value calculated by Almroth-Rosell et al. (2016).Compared to other coastal areas, a relatively low P retention efficiency (< 20 %) should be expected for the Stockholm IA, based on its low water residence time (Nixon et al., 1996;Nielsen et al., 2001;Almroth-Rosell et al., 2016).It is difficult to reconcile a sediment net burial of 50-70 t P yr −1 (Karlsson et al., 2010) with our internal flux estimates of net P sedimentation losses from the 0-20 m water layer in March-May of usually less than 50 t.There is an apparent contradiction between the very low P retention and the presence of accumulation bottoms with a net sediment deposition in the inner archipelago (Jonsson et al., 2003).The difficulty of estimating representative area-integrated net sediment accumulation rates from sediment cores that are sampled in the deepest areas may contribute to this discrepancy.Most of the dry matter sedimentation in archipelago areas is resuspended sediments from shallow areas, with a large fraction having the elemental signature of clay (Blomqvist and Larsson, 1994).Persson and Jonsson (2000) found large variation in annual sediment accumulation rates, related to the occurrence of strong wind events, and concluded that erosion of shallow bottoms contribute much of the sediment accumulation in the deeper areas.Sediment accumulation rates in deep areas therefore do not represent net loss of P.Moreover, mobilization of accumulated P from deeper sediment layers may also contribute to the apparent contradiction between net P release and net accumulation of sediments in deep areas.The net P export after the external P load reduction in the 1970s indicates that sediments were then a net source of P, presumably as a legacy of the previous high P loading.Degradation of organic P, combined with limited retention of Fe-P in surface sediments during poor O 2 conditions, may have driven this high P release.
Legacy P release, observed for up to 20 years after the main P load reduction, was small relative to the previously very high P loading, with accumulated inputs from STPs alone as high as 3900 t in 1968-75 (see Table 1) compared to a negative P balance indicating net P export of 450 t in 1976-1995 (260 t in 1976-1985).Conditions favouring large P export also at high P loading seem to have limited the build-up of a large sediment legacy P storage (Fig. 5c).Sediment cores from the 1990s provide no evidence of a remaining legacy of carbon in IA sediments (Jonsson et al., 2003).Similarly, DIP mass balances for a shallow Danish estuary indicated no substantial legacy P release after reduced external P loading, but rather a close link between P loading and summer DIP release (Staehr et al., 2017).Boynton et al. (2018) reviewed global estuarine and coastal sediment-water flux data and found only a few time series long enough to evaluate remedy actions.In Boston Harbor and Chesapeake Bay sediment processes responded quickly to load reductions and did not exhibit long-term legacy effects.
From 1996, the stabilized, near-neutral P balance of the Stockholm IA means that there was still little P retention, in spite of the improved O 2 conditions.Although oxidation of surface sediments can increase short-term P retention due to accumulation of Fe oxyhydroxides, many studies indicate little effect of this P pool on long-term P retention, which depends on the amount of P permanently buried in the deeper anoxic parts of the sediments (Jensen et al., 1995;Lukkari et al., 2009;Rydin et al., 2011).Recent results indicate that a persistent Fe(II)-P mineral can form in sediments of the Bothnian Sea (Slomp et al., 2013) and to some extent also in the Baltic Proper (Reed et al., 2016).To which extent formation of Fe(II)-P minerals has contributed to long-term retention in the Stockholm IA is unknown, but the low overall P retention suggests this process is currently not very significant.Possibly, preferential Fe(II) sulfide formation in IA sediments competes with Fe(II)-P formation at the current organic loading (Reed et al. 2016).Norkko et al. (2012) have suggested that the recent (post-2005) increase in the invasive polychaetes Marenzelleria spp. in the IA may have enhanced the oxygenation of the sediment and hence increased the binding of P, presumably as Fe-oxyhydroxides.We found neither lower P release rates nor increased P retention in these years.However, our results are not directly comparable to those of Norkko et al. (2012), who used data from the innermost IA only.

Seasonal surface water P budget
In spring, there is net internal TP loss (largely as DIP) from the whole water mass of the IA, due to larger net losses from the upper layers (0-20 m) than net P inputs to the deep water > 20 m (Fig. 7).This DIP and TP loss, presumably to the sediment, was associated with the phytoplankton spring bloom, which typically contributes to nutrient uptake and sedimentation in temperate coastal areas (Blomqvist and Larsson, 1994;Testa and Kemp, 2008;Staehr et al., 2017).In the IA, diatoms, known to have high sedimentation rates, and dinoflagellates dominate the spring bloom (Lännergren and Stehn, 2011).The incorporation in higher trophic levels probably contributes only to a small extent to the P loss, since zooplankton are included in the TP analysis and the build-up of fish biomass occurs mainly in summer (Hjerne and Hansson, 2002).P uptake by benthic vegetation likely contributes little to the overall P budget, since the deep, turbid IA has small bottom areas suitable for macrophytes, while large algae, such as bladderwrack (Fucus vesiculosus), are absent from this low-salinity area.
In summer and autumn, net P losses from the 0-20 m upper layer decrease, while net P inputs, mainly of DIP, increase in the deep water.This requires a net internal P source, which we interpret as net sediment P release, which increases the P concentration in all water layers of the IA (Fig. 7).However, there is no net export of P from the IA until the winter months.
Our model approach can only give net rates for each compartment.That the annual net P loss from the upper 0-20 m is relatively small in some years (Fig. 6a) probably does not reflect lower gross sedimentation rates, but substantial regeneration of P also from shallow sediments.Such P release in summer can explain the much larger variability of 0-20 m layer net P loss for the full year than for the spring only, when P recycling from sediments should generally be low.Bottoms above 20 m depth account for 69 % of the area in the inner archipelago, but only a small proportion of them are accumulation bottoms (Jonsson, 2003).These shallow bottoms are probably particularly important for P regeneration in years when transport to deeper accumulation bottoms is less effective, or when hypoxia affect also shallow areas, e.g. in 1992 and 1993.Sedimented matter apparently is normally efficiently transported to deep accumulation bottoms, as most of the net P sedimentation in spring seems to be regenerated in the deep water in summer and autumn (Fig. 6).However, it remains difficult to estimate the contribution of gross sedimentation P losses from the surface layer to the deep water in summer.

Seasonal deep-water P and O 2 budgets
Our monthly P budgets for the deep water indicate a more or less continuous internal DIP source, with a clear increase in summer and a peak in September, similar to what Jensen et al. (1995) found in Aarhus Bay.The budget for the whole water column shows a net loss of P to the sediments in spring, suggesting deep-water sediments are a net sink for P in spring and early summer.The delayed net P input to the whole water column in summer and autumn suggests that most of this internal P source in the deep water is regeneration from seasonal storage of P in deep-water sediments, but a P redistribution by transport of shallow sediments to the deep water can also contribute.On an annual basis, deep-water O 2 consumption rates correlated strongly with TP and DIP release rates, close to Redfield proportions (Fig. 9a), indicating that P release is mainly short-term (within-year) recycling of sedimented organic P.Moreover, net P release in the deep water correlated with net P spring sedimentation (Figs.8a, 7), suggesting that a large share of the net P release in the deep water in summer and autumn is a seasonal recycling of P from the spring bloom.Staehr et al. (2017) came to the same conclusion based on a DIP mass balance for a shallow Danish estuary, where summer net DIP production coincided with changes in TP inputs, indicating that P inputs in spring were the primary source of P release in summer.
The summer increase in P flux from sediments to the deep water starts in June (Fig. 7), despite O 2 concentrations in June and July well above hypoxic levels, even in the deepest bottom water.In August and September, P release continues to increase also in years with relatively good O 2 conditions.These observations, and the close coupling between deep-water P release and respiration rates, indicate limited classical Fe-P trapping (Einsele, 1936;Mortimer, 1941) in the oxic surface sediments.Similarly, field studies in the Finnish archipelago have shown high DIP release also under oxic conditions, with limited Fe-P formation as the suggested explanation (Lehtoranta and Heiskanen, 2003;Lehtoranta et al., 2009).Monitoring data from the Lake Mälaren outlet Norrström show that the Fe transport is substantial, ∼ 400 t yr −1 , corresponding to an annual mean concentration of ∼ 70 µg L −1 (SLU, 2016).However, much of this Fe will be transported out of the IA by the surwww.biogeosciences.net/15/3003/2018/Biogeosciences, 15, 3003-3025, 2018 face water and much of the Fe that reaches the sediment can be immobilized as solid, unreactive FeS or FeS 2 in anoxic layers of these sulfate-rich coastal sediments, where sulfatereducing bacteria produce hydrogen sulfide (Blomqvist et al., 2004).As a result of low Fe availability relative to the production of DIP from organic matter decomposition, the rate of P release to the water mass is then mainly determined by the rate of remineralization (Caraco et al., 1990;Hupfer and Lewandowski, 2008).We found that P release was somewhat lower than expected from O 2 consumption in spring, and higher in summer.This suggests some early season trapping of P in Fe-P complexes, which partly dissolve in late summer, when the redox-cline in the sediment becomes shallower and higher temperatures increase reduction rates of Fe-oxyhydroxides (Jensen et al., 1995;Jilbert et al., 2011).Jensen et al. (1995) found that P sedimented in spring was largely retained as Fe-bound P until September-October, when half the annual P release occurred.For the Arkona basin of the Baltic Sea, Reed et al. (2011) estimated that sediment DIP release under oxic conditions approximately doubled from winter to summer (0.05-0.1 mmol m −2 day −1 ), recycling much of the P mainly deposited as organic matter in spring.Aerobic degradation of organic matter was the dominant P mineralization process, but P release was largely mediated via rapid Fe-oxyhydroxide turnover.Elevated temperatures in the IA after 1990 may have increased anaerobic respiration and sulfide production (Bågander, 1977), promoting P release by decreasing the available Fe due to Fe-sulfide formation.Lehtoranta et al. (2009) suggested that the Fe-dependent P-binding capacity of marine sediments can be reduced by previous anoxic events, if much of the Fe pool becomes permanently bound as sulfides and is unavailable for Pbinding upon reoxidation, as described by Blomqvist et al. (2004).Effective P-binding capacity will then be regained only slowly, as the Fe pool is replenished.Early investigations show that IA sediments were anoxic long before the rapidly increasing loads of organic matter and nutrients in the mid-20th century (Brattberg, 1986).Hydrogen sulfide was recorded as early as 1909 and was present in the inner parts of the IA (Norrström-Höggarnsfjärden) on all 10 sampling occasions in 1909-1913and on 40 out of 50 occasions in 1930-1944(Anonymous, 1945)).Dredging of the strait Oxdjupet, from 8 to 10.5 m in 1919 and to 12 m in 1929, apparently had little effect, but dredging in 1950 to the present 18 m did improve O 2 conditions.Sediment cores show lamination frequency increased until 1950, followed by decreased frequency, indicative of improved O 2 conditions with more benthic fauna (Jonsson et al., 2003).In 1970-1980, lamination frequency increased again, remaining high at least until the mid-1990s (Jonsson et al., 2003).In 2008, surface sediments were again oxidized (Karlsson et al., 2010).How this history of anoxic events has affected later Fe-and P-cycling in the Stockholm IA is not well understood, but the short water turnover time of the IA and high Fe loading suggests that continuous inputs of Fe should be relatively more important for formation of sediment Fe(III)-P complexes than the longterm cycling of the internal Fe pool.

Low O 2 and P release
In our model, inflowing SW is fully mixed into the deep water and the estimates of deep-water P release are a direct function of the deep-water turnover rates.Thus, if the deep water becomes partly stagnant and the SW inflow partly mixes into shallower (10-20 m) water layers, the model will somewhat overestimate deep-water P release rates.Since stagnation may contribute to hypoxia, this risk is higher in years with hypoxia and therefore the differences in P release between hypoxic and oxic years are more likely overestimated than underestimated.However, variable deep-water turnover rates will have similar effect on deep-water P release and O 2 consumption.Thus, the finding of high P release rates relative to theoretical P remineralization (based on O 2 consumption) in years with low O 2 concentrations (Fig. 9) is not affected by this.Likewise, the P retention estimate for the entire IA is independent of modelled deep-water P cycling.
The relatively high P release in years with deep-water hypoxia (Fig. 9) suggests that classical redox-dependent release of Fe(III)-bound P contributed significantly to P release in those years.We also found high P net export in years when hypoxia affected intermediate water depths (e.g. in 10-20 m layer in 1992-1993), suggesting unusually large P release from sediment Fe-P in these water layers.This is in contrast to the net retention in more O 2 -rich years, when Fe-P pools probably were restored.Puttonen et al. (2014) estimated the reserve of potentially mobile P in Baltic Sea archipelago and Stockholm inner archipelago surface sediments to 3.5 g m −2 , stored predominantly in redox-sensitive Fe-bound forms on accumulation bottoms.This P reserve is similar to our estimated annual net P release from deep IA bottoms (up to 3.1 g m −2 yr −1 and a mean of 2.1 g m −2 yr −1 ) and to the P release of 2.7 g m −2 yr −1 estimated by Rydin et al. (2011) in Torsbyfjärden in the outer IA.Our estimate of the P amount released in the deep water (67 t yr −1 ) is not sensitive to the defined depth of the deep water.However, with 15-57 m instead of 20-57 m as deep-water layer the sediment area increases by 40 % and the area-specific mean P release is only 1.5 g m −2 yr −1 .Still, Fe-P pools (Puttonen et al., 2014) are not very large relative to annual P recycling from deep-water sediments of the IA and the notion that such mobile P pools must have accumulated over decades of high anthropogenic P loads is untenable for the Stockholm IA.

Factors affecting oxygen conditions
Our model indicates that although O 2 consumption is an important factor for deep-water O 2 concentrations in the Stockholm IA, hypoxia is particularly prone to develop in years with low O 2 inputs to the deep water, i.e.O 2 inputs with boundary inflow and mixing of deep water with intermediate water (Fig. 8c).This was evident already in the 1970s.Despite relatively high organic matter loads to the deep water in these years (judging from the P budget and high surface water chlorophyll concentrations), our model results indicate that high O 2 inflow prevented widespread hypoxia in 1969 and 1970.
We find no long-term trend  in July-October O 2 inflow to the IA deep water from the middle archipelago (data not shown).However, there is a trend of increasing O 2 input to the deep water from mixing with intermediate water layers (10-20 m), where O 2 concentrations have increased.STP discharges of O 2 -consuming substances, mostly ammonium, were ∼ 15 000 t O 2 eq.year −1 before sewage treatment was improved in the mid-1990s, and decreased to ∼ 3000 t by 2000 (Lännergren and Stehn, 2011).We made no O 2 budget for the intermediate water, but considering mixing rates, the decreased discharges were large enough to contribute to the disappearance of hypoxia in the 10-20 m depth layer.In addition, the weakening of the stratification that occurred around 1990 promoted vertical mixing and thus O 2 transport to the intermediate layer.
Organic matter inputs ultimately control the total O 2 consumption in the deep water.However, temperaturedependent aerobic and anaerobic microbial processes should influence O 2 consumption rates.We found seasonally increasing O 2 consumption rates from May to September or October, when the temperature increased from ∼ 2 to 9 • C. The maximum O 2 consumption rate, in September (26 mmol O 2 m −2 d −1 ), corresponds well to summer O 2 consumption rates measured in Danish estuaries (AErtebjerg et al., 2003) and Finnish archipelago areas (Kauppi et al., 2017).AErtebjerg et al. (2003) found that temperature explained most of the within-year variation in O 2 consumption.In the Stockholm IA, decomposition rates of organic material should have increased due to elevated temperatures from 1990 (Fig. 3c).We find that the O 2 consumption rate in July-October normalized to organic matter input in spring increased as postulated (Fig. 10b).As expected, the correlation based on monthly data was strongest in early summer, when most organic matter from the spring bloom is still not decomposed.We find no long-term trend in total O 2 consumption in July-October.Lowered organic matter deposition probably explains the decreased O 2 consumption in the late 1970s.However, in the 1990s increased O 2 consumption due to higher temperatures probably counteracted this improvement and contributed to the renewed spread of hypoxia.

Effects and potential management of internal P load
The effect of hypoxia-controlled internal P loading on longterm ecosystem productivity depends on the limiting nutrient and, if P, on the size of the internal P load relative to external P loads and the P inventory in the water mass.In the Baltic Sea as a whole, hypoxia clearly affects the P inventory of the water mass over long periods (Conley et al., 2002).The Baltic Sea P inventory of ∼ 400 000 t fluctuates by 100 000 t between years (Jilbert et al., 2011), much more than the current annual P loading from land of ∼ 28 000 t (HELCOM, 2011).In the well-flushed Stockholm IA, the annual mean P inventory in the water mass is currently ∼ 46 t, fluctuating by only a few tons between years and 30 t seasonally (Fig. 7), much less than the annual P loading from land of 190 t and the total annual P inputs of almost 400 t.Release of P during hypoxic events will largely be exported in winter and will have very limited long-term local effects.
Measures to artificially increase P trapping in sediments, for example by precipitation with aluminium salts, can be successful in archipelago bays with restricted water exchange (Rydin et al., 2017).The short P turnover time in the Stockholm IA suggests that for long-term effects on local water quality, sediment treatment here must enhance long-term annual P retention, probably by repeated treatments.Such measures would primarily bind P imported from the Baltic Sea and Lake Mälaren and to a minor extent the relatively small current P inputs from STPs, rather than legacy P.However, not only management of P should be considered, since intensified P limitation in the IA will lead to a larger N export to the N-limited middle and outer archipelago (cf.Brattberg, 1986;Paerl, 2009;Lännergren and Stehn, 2011).
Our study shows that the often-observed correlation between water O 2 concentrations and P (e.g.Norkko et al., 2012) is only partly a cause and effect relationship, since both O 2 consumption and P release are driven by organic matter degradation (cf.Hupfer and Lewandowski, 2008;Reed et al., 2011).Thus, P binding by Fe compounds can apparently at most give some temporary sediment P retention, but does not prevent P release into the overlying oxic water in the longer term.These results have implications for expected outcomes of artificial methods to oxygenate anoxic deep water.Such efforts may precipitate P in the short term, but a corresponding long-term P storage cannot be expected.

Conclusions
In accordance with several recent studies our results indicate that estuarine and coastal sediments can respond rapidly to load reductions and exhibit small long-term legacy effects.Indeed, the seasonal increase in water column P concentrations in the Stockholm inner archipelago in summer and autumn is due mainly to internal P loading from the sediments.However, most of this P release is simply a recycling of P deposited by the sedimentation of the spring bloom.P release from historical deposits was apparently restricted to the first 10-20 years after the major P load reduction, when model results suggest sediments were net P sources, albeit small.The long-term P balance has now stabilized, with close to www.biogeosciences.net/15/3003/2018/Biogeosciences, 15, 3003-3025, 2018 neutral balance (i.e.export equals inputs) in the last 20-year period, indicating that internal loading from legacy P has essentially ceased.The sensitivity analysis suggests a retention of < 15 t yr −1 , or 8 % of land-based inputs.Most external P inputs are either exported without seasonal delay or are efficiently recycled from sediments in summer and autumn despite the improved O 2 conditions, and then exported in winter.
The coupled P release and O 2 consumption, even under oxic conditions, indicate that Fe binding of P cannot efficiently prevent release of P mineralized from organic matter degradation.However, consistent with the classical view of redox-dependent P release, summer and autumn hypoxia triggered elevated P release, resulting in low annual P retention of the area especially when sediments at intermediate depth were affected by hypoxia.
Hypoxia in the IA deep water was influenced by the size of the spring bloom and by temperature affecting O 2 consumption rates, but also by deep-water renewal rates and the O 2 concentration in the water ventilating the deep water.The generally improving O 2 conditions of the IA are probably due to lowered sedimentation of organic matter (especially since the 1970s), to lowered STP loading of ammonium (from the late 1990s), and to enhanced vertical mixing (from the 1990s).The enhanced mixing has also contributed to higher deep-water temperature and respiration rates.Detailed studies are needed to separate the complex factors influencing the O 2 conditions in the Stockholm IA.
Further water quality improvements in the Stockholm IA will require either further decrease in nutrient load or improved long-term sediment P retention.Since the P turnover is rapid and legacy P small in the Stockholm IA, measures binding P in sediments would primarily accumulate P recently imported from the Baltic Sea, Lake Mälaren, and local STPs, and would need to be repeated regularly for enduring effect.This situation should be typical of well-flushed Baltic Sea coastal areas and bays.
Data availability.Monitoring and runoff data are available from the Swedish Meteorological and Hydrological Institute (SMHI, 2016(SMHI, , 2017) ) and the Swedish University of Agricultural Sciences (SLU, 2016).

Figure 1 .
Figure 1.Maps showing the location of the Stockholm inner archipelago (IA) in the north-western Baltic Proper.Lake Mälaren with drainage area is shown in the top left map and its outlet Norrström in Stockholm (LM) in the lower map.The small local drainage area of the IA is shown in the lower map.The central stations (A, AV, H, L, and K) were used to represent the IA.The main water exchange between the IA and middle archipelago is through Oxdjupet (O).Data from Trälhavet (T) were used to represent inflowing water.Surface water data from Solöfjärden (S), Oxdjupet (O), and Trälhavet Fiskare (TF, weekly station for surface water) were used to represent outflowing water of the boundary model.Depths (m) in maps are spline interpolated from nautical chart data (scale 1 : 25 000, published 1984).

Figure 5 .
Figure 5. Annual total phosphorus (TP) inputs, TP export, and net TP balance.(a) Annual TP load from sewage treatment plants (STPs), from freshwater runoff from Lake Mälaren plus the local catchment (FW), and (b) from modelled inflow of middle archipelago deep water (SW), calculated with boundary and mean models (see methods).(c) Total annual external TP input and export, using the boundary model.(d) Net input-export according to the boundary model (difference of values in c) with error bars showing maximums and minimums from the sensitivity test.Positive values indicate that the inner archipelago accumulates P and negative values that it is a net source of P.

Figure 6 .
Figure 6.Internal total phosphorus (TP) fluxes and oxygen (O 2 ) conditions.(a) Annual net TP release in deep layer (> 20 m), annual net TP loss from 0-20 m, net TP balance for whole water mass, and calculated P remineralization from organic matter (from modelled O 2 consumption).Positive net TP balance indicates net TP release from sediments to water mass and negative balance indicates net TP loss from water to sediments.(b) As in (a) but net TP release in deep layer (> 20 m) for July-October and net TP loss from 0 to 20 m in March-May.(c) Minimum volume-weighted O 2 concentration in deep water and maximum percent hypoxic (O 2 < 2 mg L −1 ) bottom area in deep water and 10-20 m layer (July-October).

Figure 7 .
Figure 7. Monthly means and changes for 1996-2015 (2011 excluded) in (a) total phosphorus (TP) stock and change in the whole water mass and stock in the deep water, (b) TP input and export with the boundary model, (c) net TP release in deep water (> 20 m) and net TP losses in the upper water layers (0-20 m) and calculated P remineralization (from modelled O 2 consumption), with the mean model, and (d) deep-water temperature, O 2 consumption, and minimum O 2 concentration.Error bars show standard deviation.Graphs for other time periods are shown in Supplement Fig. S3.

Figure 8 .
Figure8.(a) Annual net total phosphorus (TP) release (t) in deep water (> 20 m) in July-October, as a function of net TP loss (t) from the layers 0-20 m inMarch-May 1969-2015 (1970 excluded  as an outlier (P loss 222 t), 1975 excluded due to lack of data, and 2011 excluded due to unreliable data).The colour code shows mean bottom water temperature for July-October (station K, 24-32 m; see Fig.3).There is a significant linear regression, y = 19.8+ 0.55x, r 2 = 0.38, p < 0.0001.Only the 1 : 1 line is shown in the graph.(b) Annual net TP release in deep water for July-October (t) as a function of the minimum oxygen (O 2 ) concentration (mg L −1 ).Colour-coded as in (a).There is a significant linear regression, y = 73.2− 8.50x, r 2 = 0.48, p < 0.0001.(c) Minimum O 2 concentration (mg L −1 ) as a function of O 2 consumption rate, converted to expected P release (see methods).The colour code shows model-calculated O 2 input to the deep water normalized to the mean O 2 input for all years.A value > 1 indicates a higher O 2 input than the mean value for all years, and a value < 1 indicates a lower O 2 input than the mean for all years.The linear regression is y = 6.31−0.10x,r 2 = 0.23 (p < 0.001; however, see Sect.3).

Figure 10 Figure 11 .
Figure 10.(a) Annual oxygen consumption for July-October 1969-2015 (1975 and 2011 excluded) recalculated to theoretical deep-water P remineralization, as a function of temperature.The data set is colour-coded according to the minimum oxygen concentration for July-October (y = 28.3− 0.31x, r 2 = 0.006, p = 0.60).(b) As in (a) but with theoretical P remineralization normalized to P loss from the 0 to 20 m layer in March-May (y = −0.030+ 0.13x, r 2 = 0.25, p < 0.001).
Oxdjupet strait (Q in Sal in according to Eq. 2) and forms the inner archipelago deep water, with salt content S D .Upwelling water (equivalent to Q in ) transports salt to the above layers and mixing (Q D mix , Q M mix , and Q S mix ) relocates salt between the layers.

Table 1 .
Period means of annual water discharge (Q), total phosphorus (TP) load, and volume-weighted input TP concentrations (± standard deviation).The volume-weighted TP concentration in the whole water mass and in the surface layers of the model area are also shown.LM: Lake Mälaren; LR: local runoff; P: precipitation; STP: sewage treatment plants; IA: inner archipelago.

Table 2 .
Period means of modelled annual water inflow (Q), total phosphorus (TP) input, and volume-weighted input TP concentrations (± standard deviation).SW: inflowing salt water from the middle archipelago; MM: mean model; BM: boundary model.

Table 3 .
Effects of sensitivity tests of net P export (t yr −1 ) from the inner archipelago (IA) with a boundary model (0-6) and mean model (7).

Table 4 .
Effects of sensitivity tests on deep-layer (> 20 m) P release (t yr −1 ) in the inner archipelago (IA) using the mean model.Year 2011 is excluded.Sensitivity tests included (1) variations in freshwater flow with Lake Mälaren (Q LM ), (2) recruitment depth of inflowing salt water (SW) from the middle archipelago (affecting salinity and P), (3) sewage treatment plant (STP) discharge of P, (4) water P concentrations in the IA and Lake Mälaren (LM), (5) a deeper layer for water export from the IA, (6) non-adjusted data from central IA stations (see methods), and (7) use of the boundary P concentration for export from the IA.