Dissolved organic matter signatures in urban surface waters: spatio-temporal patterns and drivers

Advances in analytical chemistry have facilitated the characterization of dissolved organic matter (DOM), which has improved understanding of DOM sources and transformations in surface waters. For urban waters, however, where DOM diversity is likely high, the interpretation of DOM signatures is hampered by a lack of basic information. Here we explored the spatiotemporal variation of DOM composition in contrasting urban water bodies, based on spectrophotometry and fluorometry, size-exclusion chromatography and ultrahigh-resolution mass spectrometry, to identify linkages between DOM signatures and potential drivers. The highly diverse DOM we observed distinguished lakes and ponds characterized by a high proportion of autochthonous DOM from rivers and streams with more allochthonous DOM. Seasonal variation was apparent in all types of water bodies, driven by the interaction between phenology and urban influences. Specifically, nutrient supply, the percentage of green space adjacent to the water bodies and point source pollution emerged as major urban drivers of DOM composition. Optical DOM properties also revealed the influence of effluents from wastewater treatment plants (WWTPs), suggesting their use in water-quality assessment and monitoring. Furthermore, optical measurements inform about processes both within water bodies and in their surroundings, which could improve the assessment of ecosystem functioning and integrity.
https://doi.org/10.5194/bg-2021-340 Preprint. Discussion started: 5 January 2022 © Author(s) 2022. CC BY 4.0 License.

Figure 1. Map of 32 sampling sites in the city of Berlin, comprising 7 lakes (dark green), 7 ponds (light green), 9 streams (light blue), and 9 rivers (dark blue), among them two heavily polluted stream sites and two heavily polluted river sites (a), and PCA scores for these sites in four different seasons (b, c). Site codes are given in Table S1. Sites marked by asterisks (*) were restricted to 3 seasons and hence excluded from the PCA.

Physico-chemical field measurements and water sampling
During each field visit, we measured water temperature, pH, dissolved oxygen (DO) concentration and electrical conductivity using a hand-held WTW Multiprobe 3320 (pH320, OxiCal-SL, Cond340i, Weilheim, Germany) or a smarTROLL probe (In-Situ, Fort Collins, CO, USA). We also collected integrative water samples (2 L) from the upper 0.5 m water layer for chlorophyll-a and DOM analyses. The water was kept cool in acid-washed polycarbonate Nalgene bottles placed in a cooling box pending filtration in the laboratory (GF75, 0.3 μm average pore size; Advantec, Tokyo, Japan) within 6 hours after sampling. Additional volumes of surface water were filtered through pre-combusted glass fiber filters (GF75) directly in the field. These filters were placed into acid-washed pre-combusted (450 °C, 4 h) glass vials (15-20 mL) sealed with a PTFE septum in a screw-cap for later measurements of dissolved organic carbon (DOC) concentrations, DOM fluorescence and absorbance, and DOM molecular size distribution. The water passed through the filter was collected in acid-washed polyethylene tubes for analyses of soluble reactive phosphorus (SRP), nitrate (NO3-), nitrite (NO2-), ammonium (NH4+) and trace organic compounds (TrOCs). We also took unfiltered water samples for total phosphorus (TP) analysis. For each variable, we collected three replicate samples at each site in each season. We stored all samples in the dark in a cooling box during transport. To preserve samples and remove all inorganic carbon, we acidified the water for DOC, NO3-, NO2- and NH4+ analyses to pH 2 with 2 M HCl within 6 hours after sample collection. DOC concentrations and DOM fluorescence and absorbance were measured within 24 h.

DOM characterization
DOM absorbance and fluorescence were simultaneously determined on an Aqualog instrument (Horiba Ltd, Kyoto, Japan). We used the absorbance spectra to calculate several indexes (Table A2): the specific UV absorption (SUVA254) as a proxy for DOM aromaticity (Weishaar et al., 2003), the ratio of absorbance at 250 and 365 nm (E2:E3) as an (inverse) indicator of molecular size (Peuravuori and Pihlaja, 1997), the ratio E4:E6 as an indicator of humification (Chen et al., 1977), the short-wavelength slope within the wavelength region of 275-295 nm (Helms et al., 2008) as an inverse correlate of molecular weight and aromaticity, and the ratio of slopes (SR) computed from short and long wavelength regions (Loiselle et al., 2009) as another negative correlate of DOM molecular weight. We used the fluorescence data to compute the freshness index β/α (Table A2) (Wilson and Xenopoulos, 2009), which indicates the relative importance of recently produced DOM (Parlanti et al., 2000). Furthermore, we calculated the fluorescence index (FIX) as the ratio of fluorescence intensities at the emission wavelengths of 470 and 520 nm (obtained at an excitation wavelength of 370 nm), which has proved useful to distinguish the relative contributions of terrestrial (FIX~1.4) and aquatic (FIX~1.9) sources of DOM (McKnight et al., 2001). Finally, we computed the humification index (HIX) as a proxy for humic substances (Ohno, 2002). Fluorescence excitation-emission matrices (EEMs) were used for PARAFAC analysis, a multivariate three-way modeling approach decomposing EEMs into individual fluorophores (Bro, 1997; Stedmon and Bro, 2008). We derived 8 components from a total of 116 EEMs and compared their loading spectra with the OpenChrom/OpenFluor database (http://www.openfluor.org) (Murphy et al., 2014). See Supporting Information for details of data processing, including PARAFAC analysis.
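The index definitions above can be illustrated with a minimal Python sketch. It is not the authors' processing code: it assumes decadic absorbance measured in a 1 cm cuvette, a log-linear least-squares fit for the spectral slopes, and 350-400 nm as the long-wavelength region of the slope ratio; all function names and the synthetic spectrum are hypothetical.

```python
import math

def suva254(a254_per_cm, doc_mg_per_l):
    """SUVA254 in L mg-1 m-1: absorbance at 254 nm per metre divided by DOC (mg L-1)."""
    return a254_per_cm * 100.0 / doc_mg_per_l

def spectral_slope(spectrum, lo, hi):
    """Slope S (nm-1) from a least-squares fit of ln(absorbance) against wavelength."""
    pts = [(wl, math.log(a)) for wl, a in spectrum.items() if lo <= wl <= hi and a > 0]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    return -sxy / sxx  # sign flipped so that a decaying spectrum gives a positive S

def absorbance_indices(spectrum, doc_mg_per_l):
    """spectrum: {wavelength in nm: absorbance per cm}."""
    s_short = spectral_slope(spectrum, 275, 295)
    s_long = spectral_slope(spectrum, 350, 400)  # assumed long-wavelength region
    return {
        "SUVA254": suva254(spectrum[254], doc_mg_per_l),
        "E2:E3": spectrum[250] / spectrum[365],
        "S275-295": s_short,
        "SR": s_short / s_long,
    }

def fluorescence_index(emission_at_ex370):
    """Ratio of emission intensities at 470 and 520 nm for excitation at 370 nm."""
    return emission_at_ex370[470] / emission_at_ex370[520]

# Synthetic spectrum: exponential decay with a known slope of 0.015 nm-1
example = {wl: 0.30 * math.exp(-0.015 * (wl - 254)) for wl in range(250, 401)}
indices = absorbance_indices(example, doc_mg_per_l=8.0)
```

Because the synthetic spectrum is a single exponential, the fitted slope recovers the value used to generate it and the slope ratio is 1; real DOM spectra deviate from this, which is what makes SR informative.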
The molecular size distribution of DOM was analyzed by liquid size-exclusion chromatography in combination with UV and IR detection of organic carbon and UV detection of organic nitrogen (LC-OCD-OND) (Huber et al., 2011).

We determined concentrations of three molecular size fractions: humic-like substances (HS-C and HS-N, reported in mg C L-1 and mg N L-1, respectively), high-molecular-weight non-humic substances (reported as HMWS-C and HMWS-N, in mg C L-1 and mg N L-1) and low-molecular-weight substances (LMWS, in mg C L-1).
To examine the molecular composition of DOM, we used ultrahigh-resolution Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS). We extracted DOM on Agilent Bond Elut PPL solid-phase columns (Dittmar et al., 2008) from 1 L of filtered water acidified to pH 2. We then diluted extracts to 10 µg L-1 C in 1:1 ultrapure water/methanol before broadband mass spectrometry on a 15 Tesla Solarix FT-ICR-MS (Bruker Daltonics, Bremen, Germany) in electrospray ionization negative mode (300 accumulated scans, ion accumulation time of 0.1 s, flow rate of 240 µL/h).
We performed internal mass calibration and exported the raw mass lists from 150 to 1000 Da for further data processing using previously established R code (Del Campo et al., 2019). Briefly, we first applied a method detection limit similar to that of Riedel and Dittmar (2014) before aligning m/z values across samples (Del Campo et al., 2019).
Subsequently, we assigned chemical formulas to mean m/z values assuming singly charged deprotonated molecular ions and Cl-adducts for a maximum elemental combination of C100H250O80N4P2S2, respecting chemical constraints and using rigorous mass error assessments, stable isotope confirmation and homologous series assessment (Del Campo et al., 2019). More detail on the FT-ICR-MS methods can be found in the Supplement. To condense the mass-spectrometric information, we derived 12 molecular groups (Lesaulnier et al., 2017) based on elemental composition and calculated the average molecular mass, number of formulas (molecular richness) and total intensity for each of them. In addition, we computed the double-bond equivalents (DBE) and the aromaticity index (AI) as indicators of unsaturated compounds (Koch and Dittmar, 2006), and the molecular lability boundary (MLB) as a measure of lability (D'Andrilli et al., 2015).
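As a hedged illustration of the formula-based descriptors, the Python sketch below computes DBE, an aromaticity index and the van Krevelen coordinates from the element counts of a neutral formula. The aromaticity index has been published in more than one form; the variant used here, with N and P in the numerator, is an assumption about what the authors applied, and the function names are hypothetical.

```python
def dbe(c, h, n=0, p=0):
    """Double-bond equivalents of a neutral CHNOPS formula (O and S do not contribute)."""
    return 1 + c - h / 2 + n / 2 + p / 2

def aromaticity_index(c, h, o=0, n=0, s=0, p=0):
    """Aromaticity index; set to 0 when numerator or denominator is not positive."""
    numerator = 1 + c - o - s - 0.5 * (n + p + h)
    denominator = c - o - s - n - p
    if numerator <= 0 or denominator <= 0:
        return 0.0
    return numerator / denominator

def van_krevelen(c, h, o=0):
    """Return (O:C, H:C), the coordinates of a formula in a van Krevelen plot."""
    return (o / c, h / c)

# Benzene (C6H6) is fully aromatic in its carbon skeleton; glucose (C6H12O6) is not.
benzene_dbe = dbe(6, 6)
benzene_ai = aromaticity_index(6, 6)
glucose_ai = aromaticity_index(6, 12, o=6)
```

Averaging such per-formula values within each molecular group, optionally weighted by peak intensity, would give group-level summaries comparable to those described above.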

Additional water-chemical analyses
We determined total DOC concentrations by high-temperature catalytic combustion and infrared spectrometry on a TOC-V Analyzer (Shimadzu, Kyoto, Japan). NO3-, NO2- and NH4+ were analyzed on a FIAcompact (MLE GmbH, Dresden, Germany). TP was measured using the same technique with unfiltered water samples that were digested with K2S2O8 (30 min at 134 °C). We measured chlorophyll-a concentrations spectrophotometrically (HITACHI U2900; Tokyo, Japan) following hot ethanol extraction (Jespersen and Christoffersen, 1987) of three GF75 filters from each individual water sample. Concentrations of 18 trace organic compounds (TrOCs) were determined by HPLC-MS/MS (Shimadzu, Kyoto, Japan) (Zietzschmann et al., 2016). These included chemicals such as acesulfame (a sweetener), benzotriazole (a corrosion inhibitor), and drug residues like carbamazepine and gabapentin (Table B1).

We used repeated-measures ANOVA to test for differences among types of water bodies and seasonal sampling periods (referred to as seasons hereafter) for a variety of response variables; the interaction between water body type and season was not significant. Further, we assessed the importance of seasonal variation in each water body type by computing a respective variance component using a type-II ANOVA (also known as variance component analysis) for data from each water body type with season and site ID as random factors; this approach assesses temporal variation as a fraction of total variation within each water body type.
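One way to read the variance component step is sketched below in Python. It is a simplified stand-in, not the authors' analysis: it assumes a balanced table with a single value per site and season (for example, the mean of the three replicates), crossed random factors, and classical method-of-moments estimators with negative estimates truncated at zero. The function name and example data are hypothetical.

```python
def seasonal_variance_fraction(table):
    """table[i][j]: value at site i in season j (balanced, one value per cell).

    Returns the seasonal variance component as a fraction of total variance.
    """
    n_sites, n_seasons = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n_sites * n_seasons)
    site_means = [sum(row) / n_seasons for row in table]
    season_means = [sum(row[j] for row in table) / n_sites for j in range(n_seasons)]

    # Sums of squares of the two-way layout without replication
    ss_site = n_seasons * sum((m - grand) ** 2 for m in site_means)
    ss_season = n_sites * sum((m - grand) ** 2 for m in season_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_resid = ss_total - ss_site - ss_season

    ms_site = ss_site / (n_sites - 1)
    ms_season = ss_season / (n_seasons - 1)
    ms_resid = ss_resid / ((n_sites - 1) * (n_seasons - 1))

    # Method-of-moments estimators; negative estimates are truncated at zero
    var_season = max((ms_season - ms_resid) / n_sites, 0.0)
    var_site = max((ms_site - ms_resid) / n_seasons, 0.0)
    var_resid = max(ms_resid, 0.0)
    total = var_season + var_site + var_resid
    return var_season / total if total > 0 else 0.0

# Purely additive example: three sites (effects 0, 2, 4) and four seasons (0, 0, 2, 2)
example = [[site + season for season in (0, 0, 2, 2)] for site in (0, 2, 4)]
fraction = seasonal_variance_fraction(example)
```

In the additive example the site effects have variance 4 and the season effects 4/3, so seasons account for one quarter of the total. Applying the function separately to the table of each water body type would mirror the per-type comparison described above.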
For constrained multivariate analyses we considered land cover adjacent to the water bodies, trophic state and micropollutant load as drivers of variation in DOM chemical composition. We used the percentages of urban green and paved areas as a proxy for land cover and assessed trophic state based on concentrations of TP, NH4+, NO3- and chlorophyll a. Finally, the first axis of a principal component analysis (PCA) based on the TrOC dataset was used as a proxy for micropollutant load.
We followed a three-step approach to analyze the spatio-temporal patterns of DOM composition. First, we identified major axes of variation in DOM composition by a PCA based on quantitative indicators of DOM, analytically accessible fractions thereof or quantitative proxies: DOC concentration, all absorbance and fluorescence data, absolute component-specific fluorescence intensities from PARAFAC, and the results from size-exclusion chromatography.

Only the 27 sites sampled in all four seasons were included in this analysis. All variables were standardized to a mean of zero with a variance of 1 to ensure equal weightings, and projected onto the ordination space using Pearson correlations of the variables with PCA axes in a distance biplot sensu Legendre and Legendre (2012). To explore spatial patterns, we mapped PC1 and PC2 scores onto Berlin's landscape using QGIS (QGIS Development Team, 2017, Open Source Geospatial Foundation Project; http://qgis.osgeo.org).
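The standardization and the correlation-based projection of variables can be sketched in a few lines of Python. This is an illustrative toy, not the R workflow used in the study: it extracts only the first principal axis, uses plain power iteration instead of a full eigendecomposition, and all names and numbers are hypothetical.

```python
import math

def standardize(column):
    """Scale one variable to mean 0 and sample variance 1."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def correlation_matrix(zcols):
    """Pearson correlation matrix of already standardized variables."""
    n = len(zcols[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in zcols] for ci in zcols]

def first_principal_axis(corr, iterations=1000):
    """Leading eigenvalue and unit eigenvector of corr by power iteration."""
    p = len(corr)
    vec = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-symmetric start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0:
            break
        vec = [v / norm for v in new]
        eigenvalue = norm
    return eigenvalue, vec

def pc1_summary(columns):
    """columns: list of variables, each a list of values in the same sample order."""
    z = [standardize(c) for c in columns]
    eigenvalue, axis = first_principal_axis(correlation_matrix(z))
    scores = [sum(axis[k] * z[k][i] for k in range(len(z))) for i in range(len(z[0]))]
    explained = eigenvalue / len(z)  # trace of a correlation matrix = number of variables
    loadings = [a * math.sqrt(eigenvalue) for a in axis]  # Pearson r of each variable with PC1
    return explained, scores, loadings

# Three made-up variables for five samples; the first two are perfectly correlated
example = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [5, 3, 4, 1, 2]]
explained, scores, loadings = pc1_summary(example)
```

Standardizing first means that variables measured in different units (for example mg C L-1 and dimensionless indices) contribute equally, which is the stated purpose of the step. The sign of a principal axis is arbitrary, so only the relative orientation of loadings is meaningful.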

Second, we used the same dataset as dependent matrix in a redundancy analysis (RDA) with the set of potential drivers described above as predictor variables. We started with the full RDA model and forward-selected drivers (Legendre and Legendre, 2012). For hypothesis tests in the RDA, permutations were restricted to account for repeated measurements at the same sites across seasons by first permuting sets of four seasonal measurements across sites and then permuting across seasons within each site. To check our ability to identify drivers behind major variation observed in DOM composition, we used Procrustes analysis to assess the similarity of PCA and RDA ordinations, including a permutation-based test of the non-randomness of the achieved superimposition (Mardia, 1979; Peres-Neto and Jackson, 2001).
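The restricted permutation scheme can be sketched as follows in Python. This is one possible reading of the description, combining both steps in a single draw; it assumes rows ordered site by site with the four seasons of a site stored consecutively, and the function name is hypothetical.

```python
import random

def restricted_permutation(n_sites, n_seasons, rng):
    """Return permuted row indices for data ordered site by site.

    Step 1 shuffles whole blocks of seasonal measurements across sites;
    step 2 shuffles the seasons within each block. Measurements from the
    same site therefore always stay together.
    """
    site_order = list(range(n_sites))
    rng.shuffle(site_order)                  # step 1: move whole site blocks
    permuted = []
    for site in site_order:
        seasons = list(range(n_seasons))
        rng.shuffle(seasons)                 # step 2: reorder seasons within the block
        permuted.extend(site * n_seasons + s for s in seasons)
    return permuted

# 27 sites sampled in 4 seasons give 108 rows; a fixed seed keeps the draw reproducible
idx = restricted_permutation(27, 4, random.Random(0))
```

In a permutation test, the response rows would be reordered with such index vectors many times and the test statistic recomputed each time. Unrestricted shuffling of all 108 rows would instead treat repeated measurements at one site as independent and overstate the evidence.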
Third, we exploited results of the FT-ICR-MS to facilitate interpretation of the two major axes of variation in DOM chemical composition resulting from the PCA. The FT-ICR-MS data were only available for three seasons and were purely compositional (relative intensities), as the many thousands of compounds contained in the spectra cannot be calibrated to yield concentrations. To link the quantitative and compositional datasets, we correlated scores of PCA axes with compound-specific relative intensities of the mass spectra. The compound-specific correlation coefficients were then used as color codes in van Krevelen plots, which locate chemical formulae identified by FT-ICR-MS in a space defined by oxygen richness (O:C) and saturation (H:C). FT-ICR-MS-derived information such as the richness or average weight of specific molecular groups was also projected onto the PCA ordination space as arrows, provided correlation coefficients were >0.2. All statistical analyses and graphs were made with R 3.2.4 (R Core Team, 2016).
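A minimal Python sketch of this linking step is given below. It assumes Pearson correlation (the text does not state the coefficient used for this step) and uses hypothetical formulas, intensities and PC scores; it only prepares the plotting coordinates and color values and does not draw the plot.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def van_krevelen_points(formulas, intensities, pc_scores):
    """formulas: {name: (C, H, O)}; intensities: {name: relative intensity per sample}.

    Returns one record per formula with its O:C and H:C coordinates and the
    correlation of its relative intensity with the PC scores (the color value).
    """
    points = []
    for name, (c, h, o) in formulas.items():
        r = pearson(intensities[name], pc_scores)
        points.append({"formula": name, "O:C": o / c, "H:C": h / c, "r": r})
    return points

# Made-up data: four samples, two formulas with opposite trends along the axis
formulas = {"C10H12O5": (10, 12, 5), "C18H36O2": (18, 36, 2)}
intensities = {"C10H12O5": [0.1, 0.2, 0.3, 0.4], "C18H36O2": [0.4, 0.3, 0.2, 0.1]}
pc_scores = [-1.5, -0.5, 0.5, 1.5]
points = van_krevelen_points(formulas, intensities, pc_scores)
```

Plotting H:C against O:C and coloring each point by `r` shows which regions of compositional space gain or lose relative intensity along a PCA axis.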

Physico-chemical characteristics
Among all physico-chemical variables, only DOC concentration and temperature differed significantly among types of water bodies (p<0.05 and p<0.001, respectively). Temperature varied strongly across seasons, but still proved significantly different among water body types, with lakes and rivers being warmer than ponds and streams. DOC concentrations did not vary across seasons, but were significantly higher in ponds and streams than in lakes and rivers.
Ponds also showed the highest chlorophyll-a concentrations and rivers the lowest, but these differences were not significant.

Separate ANOVAs for each water body type showed that seasonal variation in TP and NH4+ concentrations was highest in rivers and streams. Seasonal variation in NO3- concentrations was generally high, but systematic differences were detected neither among seasons nor among sites (Table A1). Seasonal variation of chlorophyll-a concentrations was also high and similar across types of water bodies.
The analysis of TrOCs identified acesulfame, a widely used artificial sweetener (Buerge et al., 2009), in 72 out of a total of 120 samples taken at 32 sites across all seasons (Table B1). Similarly, two corrosion inhibitors included in the analysis, benzotriazole and methylbenzotriazole (Cotton and Scholes, 1967; Tamil Selvi et al., 2003), occurred in 68 and 63 samples, respectively. Fifteen other TrOCs were detected in at least 2 and up to 62 samples (Tables B1 and B2).

DOM composition
PARAFAC modeling resulted in 8 components referred to as C1-C8 (Table A3, Fig. A1). Components C6 and C8 were previously found to be protein-like, whereas all other components have been reported as humic-like (Table A3).
In contrast to the standard physico-chemical variables we measured and the size-exclusion chromatography (Table A6), the PARAFAC components and absorbance and fluorescence indices generally showed significant differences among water body types (Tables A4 and A5).
The first axis of the PCA analyzing spatio-temporal patterns of DOM chemical composition explained 36% of the total variance (Fig. 2). PC1 was largely defined by the negative loadings for C2 and C1 (representing humic substances originating from waste water treatment), the short-wavelength slope, SUVA254 and LMWS (Fig. 2b). Furthermore, PC1 correlated positively with the absorption slope ratio, E2:E3 (molecular size), β/α and HMWS-C. This axis separated water body types, from lakes on the right to ponds, rivers, and finally streams on the left. The optical proxies identified PC1 as a gradient spanning from lakes, where DOM had lower aromaticity and contained more freshly produced material, to streams, which showed high aromaticity and low proportions of fresh DOM. Pond P4, which was identified as an outlier because of particularly high NH4+ concentrations, also showed a rather distinct DOM composition.
PC2 explained an additional 20% of the total variance and correlated positively with HMWS-N and β/α, and negatively with HIX. An exploration of spatio-temporal variation by plotting site-specific PC scores (Fig. 3) identified PC2 as the axis capturing temporal variation, with the four seasons aligning vertically at most sites. Winter and summer had the lowest and highest PC2 scores, respectively, with transitional seasons located in between. Thus, higher proportions of humic substances in winter contrast with more labile DOM in summer. In agreement with the variable-specific seasonal variance components, the degree of seasonal differentiation differed among water body types also in multivariate space, being higher in streams and ponds than in the larger lakes and rivers (Fig. 3b). Except for sites P4 and H3, two water bodies behaving exceptionally also in many other respects, seasonal variability was poorly reflected by PC1, which largely captured variation among individual water bodies or water body types, separating flowing from standing waters.
Visual inspection of PCA scores mapped across Berlin (Fig. 1b,c) did not reveal a spatial signature transcending types of water bodies. RDA identified the areal percentage of green space adjacent to the water bodies, TP, NH4+, NO3- and the first axis of the PCA based on TrOCs as significant predictors of DOM composition (Fig. C1).
The resulting PCA and RDA ordinations for DOM were significantly correlated (Procrustes rotation 0.73, p<0.001), suggesting that the considered predictors were indeed major drivers of variation in DOM chemical composition.
Furthermore, PC1 was negatively related to black carbon, polyphenols and polycyclic aromatic compounds with aliphatic chains, which are all typical of soil-derived humic material, as well as with unsaturated aliphatics, saturated fatty acids and peptides, indicating that all of these molecular groups were more important in streams. Lastly, the computed molecular lability boundary (MLB), carbohydrates, sugars without heteroatoms (N, S or P) and unsaturated aliphatics were positively related to PC2, while AI, DBE, black carbon and polyphenols were negatively related to PC2.

Spatial patterns and drivers of DOM signatures
Our results show that the chemical composition of DOM in contrasting surface waters of the metropolitan area of Berlin, Germany, is highly diverse. This reflects both aquatic-terrestrial linkages and DOM transformations within the aquatic systems (Fonvielle et al., 2021) and suggests a high ecosystem-level functional diversity across the urban aquatic network. Clear differences among the four types of water bodies we investigated were due to distinct signatures of streams and rivers vs. ponds and lakes. This was revealed especially by the first principal component (PC1) of a PCA (Fig. 2), which reflects the dominant gradient defined by variation in DOM composition across the 32 urban sites included in the study.
Stream DOM exhibited higher aromaticity (as indicated by SUVA254) and lower amounts of recently produced, low-molecular-weight DOM (as indicated by the freshness index or the slope ratio) than lakes at the opposite end of the gradient.
This pattern matches results from agricultural streams near Berlin, where SUVA254 values up to 3 L m-1 mg-1 have been reported (Graeber et al., 2012), and from an urban river in southwestern Korea (SUVA254 values of 2.5 L m-1 mg-1) (Park, 2009). The distinct signature is also reflected in other DOM components, such as the fluorophore C2, which was more important in streams and identified as terrestrial humic material (Murphy et al., 2011). Streams also showed higher levels of humic-like (C1) and protein-like (C8) compounds, whereas higher values of the freshness index characterized lakes. These patterns consistently indicate that the arrangement of sites along PC1 reflects a gradient of allochthonous vs. autochthonous sources of DOM. A corollary of this finding is that despite the potentially pervasive influence of the urbanized surroundings, urban streams in particular are more tightly linked to the terrestrial environment than urban lakes, just as is the case for flowing and standing waters in natural landscapes (Larson et al., 2014).
In contrast to natural landscapes, however, the linkage of urban waters with their terrestrial surroundings is mediated by paved surfaces and engineered flow paths, including roof run-off into rain gutters, extensive (partially leaky) sanitation networks and sewage overflows in WWTPs following heavy rainfall or snowmelt. The urban gradient from allochthonous to autochthonous DOM sources we document could thus be driven by surface run-off rather than soil seepage and subsequent delivery of DOM to surface waters via groundwater. This interpretation is supported by higher levels of proteins (Fig. 2) characterizing the urban streams and rivers, as opposed to soil-derived humic DOM signatures typical of unimpacted streams and rivers (Hutchins et al., 2017). The proteins could originate from surface runoff integrating various sources of urban pollution, but they might also derive from WWTPs, as implied by the nature of some of the PARAFAC components we identified (Table A3). For instance, the humic fluorophore C2 has been reported in WWTP effluents that may be discharged into urban surface waters (Murphy et al., 2011). Point-source inputs were also identified as drivers of DOM composition by the influence of TrOCs in our RDA and their correlations with C2 and C8, all of which are components of WWTP effluents.

Lakes differ from streams by a typically greater importance of autochthonous production fostered by abundant nutrients. Elevated nutrient concentrations should hence coincide with DOM signatures indicative of autochthonous carbon sources, as found in agricultural streams, where the freshness index β/α indicating autotrophic activity was related to high nitrogen concentrations (Wilson and Xenopoulos, 2009). This pattern contrasts with the negative relation between nitrogen concentration and the proportion of fresh DOM found across our study sites, where high nutrient concentrations were instead strongly related to DOM components of WWTP effluents. This typically resulted in an allochthonous DOM character at high-nitrogen sites.
Similarly, the TP concentration was significantly related to DOM composition in our RDA, where phosphorus-rich water bodies also proved to have more allochthonous than autochthonous DOM. This points to inputs from urban surface runoff rather than groundwater inflow, where long flow paths and residence times provide ample opportunities for phosphorus immobilization. As with N, additional phosphorus may derive from WWTP effluents, as suggested by the positive relationship between TP concentration and the fluorophore C2 as a putative tracer of WWTP effluents (Murphy et al., 2011). Overall, the negative relationships between nutrient availability and the importance of autochthonous components in the DOM pool suggest that while streams and rivers may efficiently collect N and P from the urban environment, lakes are more efficient at channeling nutrients into autochthonous production. Thus, the autochthonous DOM signature in urban lakes appears to be largely independent of nutrient supply and rather to be facilitated by longer water residence times, higher water temperature and favorable light conditions.
Our finding that urban surfaces drive allochthonous DOM composition meets our expectation that land cover notably influences the composition of DOM in urban surface waters (Williams et al., 2016; Sankar et al., 2020). This conclusion is supported by results from our RDA, which identified the presence of green spaces in the perimeter of the water bodies as a significant influence. However, the relationship between land cover and DOM composition must be interpreted with caution because all lakes were situated in areas with green spaces in their surroundings, whereas streams ran through areas dominated by buildings and paved surfaces. The urban running waters, more than lakes and ponds, thus received high surface runoff during rain events, including high inputs of pollutants and allochthonous DOM. Given the evident negative relationship between green space and paved surface areas (R=-0.47, p<0.001), green spaces might be used as an inverse proxy for paved surfaces influencing DOM signatures in urban surface waters.
However, since paved surface area per se did not emerge as a significant predictor in our RDA, land cover can at best partly account for the observed variation in DOM composition across the contrasting urban sites we investigated.
Except for ponds and some lakes, all investigated water bodies had direct surface water connections, which could result in spatial autocorrelation (Peterson et al., 2006). In addition, spatial patterns may arise from the prominent land cover gradients in Berlin, ranging from forested areas to densely populated urban centers. Since the sampling design of our study does not lend itself to a formal analysis of spatial autocorrelation, we explored spatial patterns with DOM proxies in maps (Fig. 1b,c) but found no obvious relationships. Instead, type-specific characteristics of the water bodies were pronounced, largely independent of hydrological connections. Factors potentially contributing to the resulting heterogeneity across the surface waters in the city include specific local stressors such as point-source inputs of pollutants, spatially variable urban surface runoff delivering allochthonous DOM, and hydraulic-engineering structures such as sluices. Thus, our map of DOM composition (Fig. 1b,c) could be interpreted as visualizing urban heterogeneity in aquatic ecosystem diversity and condition.

Seasonal patterns and drivers of DOM signatures
Seasonal variation in DOM signatures occurred in all types of water bodies, mostly independently of variation among the four water body types. With a few exceptions, P4 and H3 being the most prominent examples, seasonal variation of DOM composition was consistent across all water body types (Fig. 3a,b). Assessed separately at each site (Fig. 3b), DOM was generally fresher in summer and autumn than in winter and spring, as indicated by higher ratios of β/α and more HMWS-N as indicators of polysaccharides and proteins (Thurman, 1985), whereas humic matter was more abundant in winter, and the pattern in spring was not clear-cut. Our rank-based analysis of PC2 scores (Fig. 3c) suggests a consistent seasonal pattern of changes in DOM composition across sites, which emerged even though the variation within individual sites was limited along PC2.
At least four potential processes could account for the observed seasonal turnover in DOM composition: exudates of aquatic primary producers, microbial and sunlight-induced transformation of DOM, and terrestrial inputs from riparian vegetation (Spencer et al., 2009;Cory et al., 2015), all of which could be influenced by the urban environment.

Seasonal variation in light conditions could be important in influencing DOM composition by primary producers, independent of nutrient supply (see above), and temperature changes might also play a role, especially in determining rates of microbial DOM transformations. Pulses of leaf litter falling or swept or blown into urban water bodies could be an additional source of DOM varying with season (Gessner et al., 1999). This holds particularly for urban green spaces and water courses lined by woody riparian vegetation. However, quantification of the relative importance of different drivers of seasonal patterns remains difficult based on the data currently available for urban settings. The ponds and streams included in our study showed higher and less predictable seasonal turnover than the lakes and rivers, as revealed by the pattern along PC2 (Fig. 3). This indicates that the nature and degree of aquatic-terrestrial coupling in urban settings leaves an imprint on the turnover of DOM. Surveys of DOM dynamics should hence be more informative about ecosystem conditions than assessments based on single grab samples or averaged data. Inputs of DOM from WWTP effluents may also be captured by the seasonal patterns, although that influence is likely variable, as indicated by considerable seasonal turnover of DOM at site H3 contrasting with a minimal turnover at sites S5 and R7 (small ellipses in Fig. 3b), despite the influence of WWTP effluents at those sites.

DOM composition as a potential basis for urban surface water monitoring
The fact that our analysis of DOM composition revealed the behavior of individual water bodies underlines the potential usefulness of DOM descriptors as ecosystem-scale functional indicators that could be included in regular water-quality assessment and monitoring. Some sites deviated from the general pattern observed for water bodies of the same type.
P4, for example, was formerly connected to a sewage farm and appeared to be influenced by previously unrecognized storm water runoff that likely delivered inputs during heavy rain. The site was characterized by high levels of nutrients, especially NH4+, and a distinct DOM composition. Similarly, S5, located immediately downstream of a WWTP, although not specifically selected as a highly polluted site, also showed a distinct DOM composition as reflected by its highly negative PC1 score (Fig. 2a), indicating that the allochthonous influence was likely the strongest among all sites. Site R7 showed the same pattern as S5, and although not initially recognized as being affected by a WWTP, its DOM composition revealed that it had actually received WWTP effluents, which has been the case since the end of 2015 (Nega et al., 2019). The distinct signatures at these individual sites are thus a promising starting point for incorporating information on DOM composition in water-quality assessment and monitoring. Notably, DOM optical indexes are highly cost-effective to apply and yield information that is not easily obtained by classic approaches. The robustness of such assessments would further increase if they were based on continuous time series. This could strengthen the implementation of current legal frameworks such as the EU Water Framework Directive, which aims at an integrative water-quality assessment, including of urban water bodies.

Conclusion
The composition of DOM collected in a suite of contrasting water bodies in a large metropolitan area, the city of Berlin in Germany, is diverse, varying widely in molecular size and other features related to the degree of allochthonous inputs and conveying a distinct urban character. DOM features clearly differentiated water body types, from lakes with highly abundant autochthonous DOM to streams with more allochthonous DOM. Seasonal variation of DOM was prevalent in all water body types but likely driven not only by phenology but also by distinctly urban drivers such as nutrient supply, WWTP inputs, reduced leaf litter input or flashy runoff from sealed surfaces. Nutrient supply, the percentage of green space and concentrations of trace organic pollutants (as proxies for point source influences) were identified as major drivers of DOM composition. Notably, easily measured optical data on DOM were sufficient to detect WWTP effluents, a result that was confirmed by data on TrOCs. This suggests that DOM analyses could be a useful starting point in water-quality monitoring. Optical analyses of DOM are inexpensive and easily implemented, and could be complemented by more sophisticated and potentially automated analyses such as the mass-spectrometric quantification of TrOCs. DOM composition can inform about processes both within water bodies and in the terrestrial surroundings; therefore, water-quality assessments could benefit from integrating information on DOM composition.
Robustness of the assessments would increase if based on time series or even continuous monitoring, knowledge and technology for which are readily available. This could strengthen current assessments as implemented in legal frameworks such as the EU Water Framework Directive, which aims at an integrative assessment of the "ecological status" of water bodies.
Appendix A includes tables that complement the physico-chemical and dissolved organic composition information.

Author contributions. All authors contributed to designing the study. CR and SH collected the data. CR did the optical analysis and the PARAFAC modelling, GS carried out the FT-ICR-MS analysis. CR and GS conducted the statistical analysis. CR led the manuscript writing, jointly with GS. All authors discussed results and edited the manuscript.
Competing interests. The authors declare that they have no conflict of interest.
Data availability. The data will be available at a suitable repository at https://www.re3data.org/.