First pan-Arctic assessment of dissolved organic carbon in lakes of the permafrost region

Lakes in permafrost regions are dynamic landscape components and play an important role for climate change feedbacks. Lake processes such as mineralization and flocculation of dissolved organic carbon (DOC), one of the main carbon fractions in lakes, contribute to the greenhouse effect and are part of the global carbon cycle. These processes are in the focus of climate research, but studies so far are limited to specific study regions. In our synthesis, we analyzed 2167 water samples from 1833 lakes across the Arctic in permafrost regions of Alaska, Canada, Greenland, and Siberia to provide first pan-Arctic insights for linkages between DOC concentrations and the environment. Using published data and unpublished datasets from the author team, we report regional DOC differences linked to latitude, permafrost zones, ecoregions, geology, near-surface soil organic carbon contents, and ground ice classification of each lake region. The lake DOC concentrations in our dataset range from 0 to 1130 mg L−1 (10.8 mg L−1 median DOC concentration). Regarding the permafrost regions of our synthesis, we found median lake DOC concentrations of 12.4 mg L−1 (Siberia), 12.3 mg L−1 (Alaska), 10.3 mg L−1 (Greenland), and 4.5 mg L−1 (Canada). Our synthesis shows a significant relationship between lake DOC concentration and lake ecoregion. We found higher lake DOC concentrations at boreal permafrost sites compared to tundra sites. We found significantly higher DOC concentrations in lakes in regions with ice-rich syngenetic permafrost deposits (yedoma) compared to non-yedoma lakes and a weak but significant relationship between soil organic carbon content and lake DOC concentration as well as between ground ice content and lake DOC. Our pan-Arctic dataset shows that the DOC concentration of a lake depends on its environmental properties, especially on permafrost extent and ecoregion, as well as vegetation, which is the most important driver of lake DOC in this study. 
This new dataset will be fundamental to quantify a pan-Arctic lake DOC pool for estimations of the impact of lake DOC on the global carbon cycle and climate change. Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
At northern high latitudes where mean annual ground temperatures are below 0 °C, permafrost has been an important carbon (C) sink for thousands of years since freezing is one of the most effective mechanisms for long-term C fixation in soils (Schuur et al., 2008; Grosse et al., 2011). Permafrost landscapes store large amounts (∼ 1300 to 1600 Pg C) of soil organic C (Hugelius et al., 2014) and are a potential source for C emissions to the atmosphere when soil temperatures exceed 0 °C and permafrost thaws (Koven et al., 2011). Through recent climate change, Arctic permafrost regions have experienced an increase of permafrost temperatures by 0.5 to 2 °C and a local deepening of the active layer of up to 90 cm since the 1970s (Romanovsky et al., 2010; IPCC, 2013; Biskaborn et al., 2019). More recently, permafrost warmed globally by an average of 0.29 °C ± 0.12 °C over the 2007–2016 period due to higher air temperatures, with some of the strongest warming trends (about 0.9 °C per decade) measured in individual boreholes at the polar stations Marre Sale in northwest Siberia and Samoylov Island in northeast Siberia (Biskaborn et al., 2019). In addition, thermokarst and thermo-erosion processes act as a mechanism for the rapid release of permafrost C in the climate system (Walter Anthony et al., 2018). Hence, the impact of global climate change on permafrost regions and their C cycling has to be thoroughly investigated. Of particular interest is ice-rich permafrost, which is vulnerable to rapid degradation processes, such as thermokarst and thermo-erosion, which lead to ground ice melt, subsequent soil volume loss, and ground subsidence. Consequently, characteristic landforms such as thermoerosional valleys, thaw slumps, and thermokarst lakes form in these regions.
Thermokarst lakes are quite dynamic and widespread landscape features in the Arctic (Grosse et al., 2013; Manasypov et al., 2015), and their biochemical processes play an important role for C cycling and climate change feedbacks in the Arctic and beyond (Walter Anthony et al., 2018).
In lakes, dissolved organic carbon (DOC) is one of the main C fractions (Tranvik et al., 2009). It is mobile and can be chemically labile (Vonk et al., 2013a, b). DOC in lakes can be produced in the lake itself (autochthonous DOC) or in the catchment of the lake (allochthonous DOC) (Sobek et al., 2007). The organic carbon (OC) content of terrestrial soils is the main source for allochthonous DOC. DOC in lakes can be transferred to and stored in lake sediments due to flocculation (Tranvik et al., 2009). DOC can also be degraded by photo-oxidation or microbial activity, resulting in the mineralization of OC to carbon dioxide (CO2) and methane (CH4) and the emission to the atmosphere (Frey and Smith, 2005; Battin et al., 2008; Tranvik et al., 2009; Vonk et al., 2013a, b). These processes are important components of the northern C cycle and affect greenhouse gas emissions from lakes. Vonk et al. (2015) suggested that the C flux from surface waters to the atmosphere and from land to ocean represents roughly one-third to one-half of the net C exchange from land to the atmosphere in the Arctic.
In recent years, DOC concentrations, lability, and mobility in Arctic lake systems, including thermokarst lakes, have been investigated; however, these studies have largely been limited to specific regions. For example, it was found that hydrologic linkages between a pond and its catchment affect the load of DOC in ponds in northern Siberia (Abnizova et al., 2014) and that DOC in different lake basin types responds differently to climate change (Larsen et al., 2017). For specific regions of western Siberia, Shirokova et al. (2013) found a negative correlation between DOC concentration and the size and age of thermokarst lakes. Among global lakes (7500 lakes from 35 land-cover types), Sobek et al. (2007) found no correlation between lake area or other lake properties and DOC concentration, but DOC concentration in lakes was found to depend on catchment properties such as topography and climate. However, permafrost region lakes, which represent approximately 25 % of global lakes (Lehner and Döll, 2004), only comprised about 10 % of the 7500 global lakes studied with respect to DOC (Sobek et al., 2007). Hence, a pan-Arctic-focused analysis of the spatial variability of lake DOC in permafrost regions is still missing.
The objectives of this study are to synthesize existing datasets of lake DOC in northern permafrost regions, to provide first insights into linkages between DOC concentration and environmental parameters (permafrost zone, ecoregion, deposit types, ground ice content, and soil organic carbon content), and to identify drivers for lake DOC concentration in this region affected by rapid climate change. Our synthesis includes published datasets as well as unpublished datasets from the author team to find regional differences in DOC concentration of lakes across the Arctic.

Study areas
In our synthesis, we included 2167 samples from 1833 lakes of 13 study areas (22 sites) across the Arctic, sampled from 1979 to 2017 (Table 1, Fig. A1). Lakes in our study are located from 59.2 to 82.5° N latitude. A total of 49.3 % of our dataset comes from sites in Alaska, 24.2 % from Canada, 23.3 % from Siberia, and 3.2 % from Greenland. The study areas of our dataset are dominated by tundra climate and very cold subarctic climate. The Nunavut study area is also characterized by cool continental climate. The mean annual air temperature of our study areas ranges from −18 °C in the Canadian Arctic Archipelago (Michel, 2011) to −0.7 °C in Whitehorse, Yukon (Bonnaventure and Lewkowicz, 2011). All study lakes are located in landscapes influenced by permafrost (Fig. 1a). Lakes in this synthesis cover the full range of permafrost extents, from continuous and discontinuous to isolated and sporadic permafrost areas.
Study sites of the North and Northwest Alaska study area (Fig. 1a, 1–3) are predominantly located in the continuous permafrost zone (82 % of the lakes studied in this area). Forty-six percent of the studied lakes in this study area are located in the tundra ecoregion, and 54 % in the tundra-boreal transition region. The North and Northwest Alaska study area is mainly composed of fluvial and yedoma deposits (62 %). The Southcentral Alaska study area (Fig. 1a, 4) is predominantly underlain by discontinuous permafrost. Studied lakes in this study area are located in the boreal ecoregion and are surrounded by glacio-moraine (67 %) and mountain alluvium (13 %) deposits. The study sites in the Interior Alaska study area (Fig. 1a, 5–6) are predominantly located in the discontinuous permafrost zone (65 %); 19 % of studied lakes in this study area are located in the isolated permafrost zone, belonging to the Denali National Park and Preserve (Fig. 1a, 5). The Interior Alaska study area is situated in the boreal zone and mainly underlain by fluvial (55 %) and yedoma (16 %) deposits.
The study sites in the Yukon and Northwest Territories study areas (Fig. 1a, 7–10) are predominantly situated in the continuous permafrost zone (65 % of studied lakes in this area) and the discontinuous permafrost zone (30 %), with some lakes in the sporadic permafrost zone (5 %), located in the Whitehorse transect. Studied lakes of this study area can be found in the tundra ecoregion, in the boreal-tundra transition zone, and in the boreal forest, with glacial deposits. Study sites in the Nunavut study area (Fig. 1a, 11–14) are located in the zone of continuous permafrost. Studied lakes in this area are situated in the tundra ecoregion and are surrounded by glacial, bedrock, and colluvial deposits. Studied lakes of the Manitoba study area (Fig. 1a, 15) are located in the continuous permafrost zone and are predominantly situated in the boreal forest, underlain by glacio-marine deposits.
The Qeqqata study area (Fig. 1a, 16) in Greenland is situated in the continuous permafrost zone. Studied lakes in this area are located in the tundra ecoregion and surrounded by eolian deposits.
The Siberian Yamalo-Nenets Autonomous Region (A.R.) study area (Fig. 1a, 21) covers the continuous, discontinuous, and sporadic permafrost zones. Here, 72 % of the studied lakes are situated in boreal forest, and 28 % in the tundra ecoregion, especially on the Yamal Peninsula. The Yamalo-Nenets A.R. study area is underlain by glacial-moraine, glacio-lacustrine, glacio-fluvial, and alluvial deposits. Studied lakes of the Khanty-Mansi A.R. study area (Fig. 1a, 22) are situated in the continuous and isolated permafrost zone. This area is situated in the boreal forest and dominated by glacio-fluvial deposits.
The Chukotka A.R. study area (Fig. 1a, 17) is situated in the continuous permafrost zone and covers the full range of tundra ecoregion, boreal forest, and tundra-boreal transition region. Lakes in this study area are surrounded by ice-rich syngenetic permafrost deposits (yedoma) and fluvial deposits. The Khatanga study site in the Krasnoyarsk Krai study area (Fig. 1a, 20) is located in the continuous permafrost zone. This area is situated in the tundra ecoregion and underlain by lacustrine, alluvial, and eluvial deposits. The Sakha Republic (Yakutia) study area (Fig. 1a, 18–19) includes sites in the Lena River delta (Kurungnakh Island, Sobo-Sise Island, Samoylov Island, and Bykovsky Peninsula) and sites close to the Kolyma River. These sites are situated in the continuous permafrost zone. The Lena Delta study site is situated in the tundra ecoregion, whereas the Kolyma study site is situated in the boreal forest. The study lakes of the Sakha Republic study area are mainly located in ice-rich syngenetic permafrost deposits (yedoma) or fluvial and alluvial deposits.

Data extraction from existing studies
For this synthesis, we searched the scientific literature for the keywords DOC and lakes in permafrost regions and largely focused on local to regional lake DOC syntheses that provided data at the individual lake level (i.e., not averaged values for groups of lakes or regions). From identified references (Pienitz et al., 1997a, b; Hamilton et al., 2001; Lim et al., 2001; Kokelj et al., 2005; Medeiros et al., 2012; Halm and Griffith, 2014; Manasypov et al., 2014, 2015; Northington and Saros, 2016; Larsen et al., 2017; Osburn et al., 2017; Coch et al., 2019; Serikova et al., 2019; Johnston et al., 2020), data for 1757 DOC samples, collected from 1478 individual lakes, were extracted into a database for further analysis. Unpublished field data of the author team were included in the database (410 samples from 355 lakes). The database includes samples that were collected during the period of April to early October (Table 1). When a lake was sampled once in a month for more than 1 year, we calculated the average lake DOC concentration. Samples from the author team, like the vast majority of the synthesized data, were taken at or near the water surface. Although some of the synthesized data do not provide the sampling depth, we can assume that the majority of these Arctic lakes and ponds are shallow and well mixed. Across our synthesis dataset, various well-established methods (Bauer and Bianchi, 2011) were used to quantify DOC concentration, including high-temperature catalytic combustion, low-temperature chemical oxidation, and photochemical oxidation. The 246 samples from the Alfred Wegener Institute (AWI), Helmholtz Centre for Polar and Marine Research, were analyzed with high-temperature catalytic combustion, described in Appendix A.
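The averaging of repeated samples described above can be illustrated with a minimal sketch. It assumes one possible reading of the rule (samples from the same lake and the same calendar month, taken in different years, are averaged); the tuple layout, the function name, and all values are hypothetical and are not taken from the synthesis database.

```python
from collections import defaultdict
from statistics import mean


def average_by_lake_and_month(samples):
    """Average DOC over years for each (lake, month) combination.

    `samples` holds (lake_id, year, month, doc_mg_per_l) tuples. This
    layout is an assumption made for illustration only.
    """
    grouped = defaultdict(list)
    for lake_id, _year, month, doc in samples:
        grouped[(lake_id, month)].append(doc)
    return {key: mean(values) for key, values in grouped.items()}


# Invented example values, not data from the synthesis.
samples = [
    ("lake_a", 2014, 7, 10.0),
    ("lake_a", 2015, 7, 14.0),
    ("lake_a", 2015, 8, 9.0),
    ("lake_b", 2016, 6, 5.5),
]
monthly_means = average_by_lake_and_month(samples)
```

Here the two July samples of the hypothetical lake_a collapse into a single value, while lakes sampled only once in a given month pass through unchanged.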

Sample database and geospatial analysis
We created a geospatial database of permafrost region lakes with DOC data (Permafrost Region Lake-DOC version 1, or PeRL-DOCv1) in the desktop geoinformation system (GIS) ArcMap (10.4.1, Esri) containing all 1833 lakes as point features. Additional data layers were included in the PeRL-DOCv1 GIS for the analysis of lake environmental characteristics, including layers on permafrost and ground ice distribution (Jorgenson et al., 2008), surface geology (Jorgenson et al., 2008), and yedoma distribution (Strauss et al., 2016). For all lakes, a range of parameters (Table A1) were extracted and exported into the spreadsheet database for further analysis.
For the determination of yedoma and non-yedoma areas, we used the Database of Ice-Rich Yedoma Permafrost (IRYP) by Strauss et al. (2016). Using the study site descriptions from the synthesized lake DOC literature and a map of terrestrial ecoregions (Olson et al., 2004), we assigned an ecoregion for each data point.
To infer lake genesis, each data point was assigned a deposit type, which refers to the surrounding deposit type of each lake. For this, we used the Permafrost Characteristics of Alaska map by Jorgenson et al. (2008) for Alaska, Nielsen (2010) for Greenland, the Map of the Quaternary Formations of the Russian Federation (Petrov et al., 2014), the Geological Survey of Canada map of Fulton (1995) for Canada, and the yedoma distribution database of IRYP (Strauss et al., 2016). Furthermore, we added the ice content for the surrounding area of each lake, using the terms "low", "moderate", "high", and "variable" (Jorgenson et al., 2008; Brown et al., 1997). Finally, we used the Northern Circumpolar Soil Carbon Database (NCSCDv2) to add the soil organic carbon content (SOCC) of the area surrounding each lake for the upper 0 to 100 cm, 100 to 200 cm, 200 to 300 cm, and aggregated 0 to 300 cm of soil (Hugelius et al., 2014).
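The attribute extraction itself was carried out in ArcMap. As a rough stand-in for readers without a GIS, the following sketch shows how a lake point could be assigned the class of the map polygon that contains it, using a plain ray-casting test. The zone polygons, names, and coordinates are invented, and the sketch treats longitude and latitude as planar coordinates, which ignores map projection.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_class(lon, lat, class_polygons, default="unclassified"):
    """Return the name of the first polygon containing the point."""
    for name, polygon in class_polygons.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return default


# Hypothetical rectangular zones, for illustration only.
zones = {
    "continuous": [(0, 70), (10, 70), (10, 80), (0, 80)],
    "discontinuous": [(0, 60), (10, 60), (10, 70), (0, 70)],
}
zone_north = assign_class(5, 75, zones)
zone_south = assign_class(5, 65, zones)
zone_outside = assign_class(20, 65, zones)
```

The same lookup pattern would apply to any categorical layer (ecoregion, deposit type, ground ice class), provided the polygons are available in a common coordinate system.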

Statistical analysis
To conduct statistical tests, we used RStudio (version 1.0.153). We tested normality by using the Shapiro-Wilk test. Because our data do not follow a normal distribution, we used the Spearman rank correlation coefficient (ρ) to measure each relationship between DOC concentration and a further parameter (latitude, permafrost zone, ecoregion, ground ice content, deposit type, SOCC) for all lakes in our dataset. We used the Wilcoxon-Mann-Whitney test to determine whether DOC concentrations differed between two populations. To analyze the relationship of DOC and multiple parameters, we performed a principal component analysis. Our dataset contains six samples from Qeqqata in Greenland (Osburn et al., 2017), collected in April in under-ice conditions. For the sake of comparability, these data have not been included in the statistical analysis.
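The analysis was run in R. For illustration only, the Spearman coefficient can be sketched in plain Python as the Pearson correlation of average ranks, which handles ties. The function names and the example values are invented, and the sketch omits the significance test.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Invented example: DOC tends to decrease with latitude.
latitude = [60, 65, 70, 75, 80]
doc = [20.0, 15.0, 16.0, 6.0, 3.0]
rho = spearman_rho(latitude, doc)
```

Categorical parameters such as permafrost zone would first need an ordinal coding; the sign of ρ then depends on the direction chosen for that coding.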

Temporal variability of DOC concentration data
For only 81 of 1833 lakes in our dataset we had multi-temporal data, which means that these lakes were sampled at least two times during the ice-free period.
For 42 % of the multi-temporal subset we found increasing DOC concentrations in a year, regarding the variation of sub-annual samples. For 42 % of the multi-temporal subset we found decreasing DOC concentrations, and for 6 % of the multi-temporal subset we found fluctuating values in sub-annual samples. In some cases, the DOC concentration increased after snowmelt and further decreased until fall or decreased in summer and increased until fall.
In our dataset, 16 lakes were sampled multiple times over the same seasonal period at the study site North Slope in North Alaska, and six lakes were sampled multiple times over the same seasonal period in the study area Qeqqata, Greenland (Osburn et al., 2017). The six lakes located in Qeqqata were sampled in April, June, and August in 2014, whereas lakes in North Slope were sampled in mid-June, late June, July, and August in 2014. For five of the six lakes in Qeqqata, the highest DOC concentration of the respective sampling series was found for April samples. Then, the DOC concentration decreased in June and increased in August (Table 2). For these lakes, DOC concentrations were 30–45 % higher in April and up to 25 % higher in August than in June, which demonstrates a seasonal DOC variability. In contrast to the Qeqqata samples, we found decreasing DOC concentrations in 12 of 16 lakes in North Slope when comparing DOC concentrations of mid-June and August samples (Table 2). We also checked for seasonal variability in a larger dataset available from the study areas Southcentral and Interior Alaska, where different sets of lakes were sampled during each month from May to September. This allowed an analysis of the median DOC concentration for each month for each of the two study areas. For Southcentral Alaska we found a pattern similar to that in Qeqqata, with a 17 % higher DOC concentration in May and September compared to July (Table A2). Additionally, we compared samples of the whole dataset from the months June and August. For these months, in addition to the Qeqqata and North Slope samples, samples from the study areas Yamalo-Nenets A.R., North and Northwest Alaska, Southcentral Alaska, and Interior Alaska were available. In three of the four study areas we also found higher DOC concentrations in August than in June, comparable to the Qeqqata lakes.
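The seasonal comparisons above rest on two simple operations: expressing one sample relative to a reference month, and labelling a within-year series as increasing, decreasing, or fluctuating. The sketch below shows one way to do both. The classification rule (strictly monotonic series count as a trend, everything else as fluctuating) is an assumption, since the exact criteria are not stated, and the values are invented.

```python
def percent_higher(value, reference):
    """Percentage by which `value` exceeds `reference` (negative if lower)."""
    return (value - reference) * 100.0 / reference


def classify_trend(series):
    """Label a chronologically ordered DOC series of one lake and year."""
    if len(series) < 2:
        raise ValueError("need at least two samples")
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "fluctuating"


# Invented April, June, and August values for one hypothetical lake.
april, june, august = 13.0, 10.0, 12.0
april_vs_june = percent_higher(april, june)
august_vs_june = percent_higher(august, june)
trend = classify_trend([april, june, august])
```

With these invented values the series first falls and then rises, so it is labelled as fluctuating rather than as a one-directional trend.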

Variable DOC concentrations across the Arctic
Lakes in our database from sites across the Arctic, covering different permafrost zones, ecoregions, and deposit types, show a high variation of lake DOC concentration. We found differences between the four regions of Alaska, Canada, Greenland, and Siberia, as well as between study areas and study sites within these regions (Figs. 2, 3; Table A3). The median DOC concentration across the entire dataset was 10.8 mg L−1. The concentration ranged from 0 to 1130 mg L−1 (Table 3). A total of 91.8 % of the lakes included in our dataset have a DOC concentration between 0 and 30 mg L−1. Comparing DOC concentrations of lake water in permafrost regions of Alaska, Canada, Greenland, and Siberia, we found median DOC concentrations of 12.3, 4.5, 10.3, and 12.4 mg L−1, respectively. Figure 3 highlights the variability of median DOC concentration in the permafrost regions of Alaska, Canada, Greenland, and Siberia and demonstrates the large range of DOC concentration in Alaska. In contrast, lakes in the Canadian permafrost region had a smaller range of DOC concentrations (Fig. 2d). We found that 80.3 % of samples collected in Canadian lakes had a lower DOC concentration than the dataset median of 10.8 mg L−1. In Alaska and Siberia, we found that about 58 % of the lakes had higher DOC concentrations than the dataset median. Lakes in Greenland showed a 50 : 50 ratio with DOC concentrations below and above the dataset median. A large number of lakes with DOC concentrations above 30 mg L−1 were found in Interior Alaska in the Yukon Flats and Yukon-Charley Rivers National Preserve (Fig. 2c). Four lakes had strikingly high DOC concentrations, more than 10 times higher than the dataset median. In addition, about 25 % of lakes with a DOC concentration above 30 mg L−1 were located in the Yamalo-Nenets A.R. (Fig. 2b). We found that lake DOC concentration was negatively correlated with geographic latitude of a lake (ρ = −0.3; p<0.05; Table 4; Fig. A2).
The DOC concentration of lakes at the southernmost study sites (Yukon Flats and Yukon-Charley Rivers National Preserve) showed a large range from 10.2 to 1130 mg L−1 and 5.0 to 66.7 mg L−1, respectively (Table A3).
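The regional summaries reported here combine a median per region with the share of lakes above the dataset-wide median. A minimal sketch of that bookkeeping, using invented values and hypothetical region names, could look as follows.

```python
from statistics import median


def summarize(doc_by_region):
    """Return the overall median and per-region summary statistics."""
    all_values = [v for values in doc_by_region.values() for v in values]
    overall = median(all_values)
    summary = {}
    for region, values in doc_by_region.items():
        above = sum(v > overall for v in values)
        summary[region] = {
            "median": median(values),
            "pct_above_overall_median": 100.0 * above / len(values),
        }
    return overall, summary


# Invented DOC values (mg per litre), not data from the synthesis.
doc_by_region = {
    "region_a": [2.0, 4.0, 6.0, 20.0],
    "region_b": [8.0, 12.0, 16.0, 30.0],
}
overall_median, summary = summarize(doc_by_region)
```

Because the median is insensitive to a few extreme values, a region can have a very wide concentration range and still show a median close to that of the other regions.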

Higher DOC concentrations in boreal forest lakes
In our dataset, 43.7 % of the lakes were located in the boreal forest ecoregion, 42.6 % in the tundra region, and 13.7 % in a boreal-tundra transition zone. We found a significant relationship between lake DOC concentration and the lake-surrounding ecoregion (ρ = 0.31; p<0.05; Table 4; Fig. A2), with significantly lower DOC concentrations in lakes of the tundra region (p<0.05). The DOC concentration of lakes in the boreal zone ranged from 0.8 to 1130 mg L−1, and the median DOC concentration in the boreal zone was 15.3 mg L−1, whereas the DOC concentration of lakes in the tundra zone ranged from 0 to 816 mg L−1 with a median of 6.8 mg L−1 (Fig. 3). With a median DOC concentration of 8.5 mg L−1, lakes in the boreal-tundra transition zone had significantly lower DOC concentrations than lakes in the boreal forest (p<0.05).

Lower DOC concentrations in lakes of the continuous permafrost zone
Median DOC concentration was highest in lakes of the sporadic permafrost zone (17.3 mg L−1) and negatively correlated with permafrost extent (ρ = 0.37; p<0.05; Figs. 3, A2; Table 4). DOC concentrations in lakes of the discontinuous zone were significantly higher (14 mg L−1) than in lakes in the continuous permafrost zone (8 mg L−1).

Higher lake DOC concentrations in yedoma regions
About 21 % of the 1833 lakes of our dataset were located in regions with ice-rich syngenetic permafrost deposits (yedoma). The DOC concentration in lakes of these regions ranged from 1.7 to 50.6 mg L−1 with a median of 11.8 mg L−1. The DOC concentrations in non-yedoma region lakes, comprising 79 % of the dataset, ranged from 0 to 1130 mg L−1, and the median DOC concentration was 10.3 mg L−1, which is significantly lower than in the yedoma region (p<0.05). Our analysis shows a weak but significant relationship of the lake-surrounding deposit type and lake DOC concentration (ρ = −0.2; p<0.05; Table 4; Fig. A2). The highest median DOC concentrations occur in lakes in areas with mountain alluvium and glacio-lacustrine deposits (15.2 and 15.5 mg L−1, respectively). The lowest median DOC concentrations were found in lakes in areas underlain by bedrock, coastal, and glacial deposits (2.6, 4, and 4 mg L−1, respectively). We found a weakly positive relationship between ground ice content and lake DOC concentrations (ρ = 0.05; p<0.05; Table 4; Fig. A2). In regions of low ground ice content, the median amounts to 9.6 mg L−1, compared to regions of moderate and high ground ice content with median DOC concentrations of 12.7 and 11.4 mg L−1, respectively.
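Group comparisons such as yedoma versus non-yedoma lakes rely on the Wilcoxon-Mann-Whitney test. The sketch below computes only the U statistic and a rank-biserial effect size by direct pair counting; it does not compute a p value, and the function names and sample values are invented for illustration.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: each pair with a > b counts 1, ties 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def rank_biserial(sample_a, sample_b):
    """Effect size in [-1, 1]; positive when sample_a tends to be larger."""
    n_pairs = len(sample_a) * len(sample_b)
    return 2.0 * mann_whitney_u(sample_a, sample_b) / n_pairs - 1.0


# Invented DOC values (mg per litre) for two hypothetical lake groups.
yedoma = [12.0, 15.0, 11.0]
non_yedoma = [5.0, 9.0, 13.0]
u_yedoma = mann_whitney_u(yedoma, non_yedoma)
effect = rank_biserial(yedoma, non_yedoma)
```

Pair counting scales with the product of the two sample sizes, so for datasets with many hundreds of lakes a rank-based formulation in a statistics package is the more practical choice.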

Lake DOC and SOCC
We analyzed the relationship between lake DOC concentrations and lake-surrounding SOCC and found a weak but significant relationship for SOCC in the upper 100 cm (ρ = 0.1; p<0.05; Table 4; Fig. A2). The relationship was slightly weaker for SOCC in the upper 300 cm (ρ = 0.09; p<0.05; Table 4, Figs. 4, A2).

Ecoregion zonation as key factor for pan-Arctic lake DOC
Our study shows the strongest significant relationships between lake DOC concentration and permafrost extent (ρ = 0.37), ecoregion (ρ = 0.31), and geographic latitude (ρ = −0.3). In contrast to Sobek et al. (2007), who assumed a strong relationship between lake DOC and soil OC, we found only a weak connection of lake DOC and surrounding SOCC. Our study provides an insight into potential sources of DOC in pan-Arctic lakes. We particularly found that lakes in the boreal forest region have higher DOC concentrations compared to tundra region lakes (Fig. 3). Soils of boreal forests are rich in organic material, and microbial degradation is low (Sobek et al., 2007). In areas of boreal forest, the frost-free period is extended and the surface water can be in contact with soil C for a longer time, resulting in higher DOC concentrations in boreal lakes. Previous studies confirm that vegetation is an important driver for DOC in permafrost catchments (Harms et al.; Halm and Griffith, 2014). In contrast, higher permafrost extent at northern high latitudes results in lower vegetation density, and lakes are less connected and thus hydrologically isolated, leading to overall lower DOC concentrations. With climate change affecting northern ecosystem structures, a reduced permafrost extent (Vasiliev et al., 2020), shifting vegetation composition (Myers-Smith et al., 2011), and enhanced hydrological connectivity (Chen et al., 2014; Nitze et al., 2017) likely will impact lake DOC concentrations and associated biogeochemical fluxes (Sobek et al., 2005). For example, enhanced DOC concentrations in a lake provide an increased basis for the mineralization of DOC through photo-oxidation and by microbial activity, which may result in higher CO2 emissions from these lakes.
In our first pan-Arctic assessment of DOC in lakes of the permafrost region we found that DOC concentrations in lakes become significantly higher along an ecoregion gradient transitioning from the tundra zone to the tundra-boreal transition zone to the boreal forest zone. In addition, DOC concentrations are overall higher in permafrost zones that are less continuous. Both trends suggest that climate change, projected to result in an expansion of the boreal forest northwards into the tundra zone and a decrease in permafrost continuity, will likely result in higher DOC concentrations in lakes of these regions. Moreover, permafrost loss and a shift of the boreal forest ecoregion might lead to more connected lakes and thus an increase of allochthonous DOC in lakes. This, in turn, may result in higher CO2 emissions from lakes to the atmosphere.

Pan-Arctic lakes in a global view of lake DOC
The median DOC concentration of our dataset (10.8 mg L−1) is almost 3 times higher than the value (3.88 mg L−1) found by Toming et al. (2020), who studied global lakes with a surface area larger than 0.1 km2. Our study across the Arctic shows a high variation of lake DOC concentration. Canada and Greenland had the lowest median DOC concentration, with low inter-site variation (Fig. 3) compared to the high variability observed in Alaska and Siberia. Whereas the Canadian and Greenlandic regions were affected by past glaciation, the majority of the Alaskan and Siberian sites were not glaciated and are characterized by extensive low-lying wetlands. Though we found only a weak but significant relationship between lake DOC concentration and lake-surrounding deposit type, we found the lowest DOC concentrations in lakes surrounded by glacial and bedrock deposits (Fig. 3). In our dataset, these deposit types are mainly located in the formerly glaciated Canadian Arctic. Sepulveda-Jauregui et al. (2015) found a higher DOC content in yedoma lakes, analyzing CO2 emissions from 40 lakes of a north-south transect in Alaska covering all permafrost types. We therefore compared the DOC concentration in lakes of the yedoma region with the DOC concentration in lakes in non-yedoma regions, comprising 79 % of our dataset with different deposit types and including lakes with the four highest DOC concentrations in Interior Alaska characterized by fluvial deposits, eolian deposits, and mountain alluvium deposits. We found significantly higher DOC concentrations in yedoma lakes compared to non-yedoma lakes. This might be attributable to the mobilization of old labile yedoma carbon by thermo-erosion along rapidly expanding lake shores and thermokarst processes (Strauss et al., 2017). We assume that yedoma lake generation influences yedoma lake DOC. The formation of yedoma lakes, due to deep thermokarst subsidence, results in deep and often closed basins (Morgenstern et al., 2011).
As a result of the missing lake connectivity, DOC is locked in the lake, originating partially from eroding organic-rich yedoma deposits (Strauss et al., 2017), melting yedoma ice wedges (Fritz et al., 2015), and the active layer. Furthermore, the lower lake connectivity might prevent flushing of yedoma thermokarst lake water with river water and snowmelt water. Hence, we assume that yedoma thermokarst lakes are more likely to have elevated DOC concentrations than other more connected lakes as well as well-mixed larger and shallower lakes, where photodegradation plays an important role, associated with lower lake DOC concentration. However, to determine the DOC source in yedoma lakes, radiocarbon dating of each sample would be necessary.
While we showed that lake DOC concentration is influenced by permafrost extent and type of ecoregion, these two factors do not explain all of the variability in the dataset. Additional factors regulate DOC. For example, air temperature, precipitation, and solar radiance have an influence on surface water DOC concentration (Cole et al., 2002; Molot et al., 2005; Anderson and Stedmon, 2007). Anderson and Stedmon (2007) analyzed lakes in Low Arctic Greenland and found the highest lake DOC concentrations in areas of low precipitation and low discharge. In those areas, evaporation is high, leading to higher DOC concentrations. For our database, evaporation may also explain the high DOC concentrations of lakes in the Yukon Flats in Interior Alaska. Here, the lakes are less hydrologically connected and the region is very arid, allowing evaporation-driven concentration of DOC (Johnston et al., 2020). While we found that lake latitude is correlated with lake DOC concentration, we did not investigate lake altitude. Sobek et al. (2007) and Toming et al. (2020) found for their global lake databases that lake altitude is another important indicator for lake DOC, with lake DOC concentrations being lowest in areas of high elevation.

The complexity of lake DOC regulation
Analysis of our dataset with available pan-Arctic data have shown significant relationships between ecoregion and lake DOC concentration, between geographical latitude and DOC concentration, and between permafrost extent and DOC concentration, even if these relationships are generally weak. Other studies suggest additional parameters influencing lake DOC concentration. For example, Xenopoulos et al. (2003) analyzed catchment characteristics of lakes and found that lake perimeter and the proportion of the watershed occupied by wetlands are the strongest predictors for DOC in lakes of temperate forests. On a global scale, lake elevation and the proportion of wetlands in a watershed are the strongest predictors for lake DOC. Tranvik et al. (2009) described lake area, which is connected to lake volume and water retention time, possibly being negatively correlated in regional studies but not being an important DOC predictor in a global view. The fact that the majority of predictors for lake DOC differ in regions demonstrates the complexity of the regulation of DOC concentration in lakes. However, several of these parameters are not included in our study, which could be a cause of the often only weak relationships found in our analysis. As a result of limited data availability on detailed hydrological catchments of northern lakes, the hydrological connectivity (vertical and lateral) is also not included in our analysis. However, it is known for example that less allochthonous DOC is transported to a hydrologically isolated lake than to a connected lake (Bogard et al., 2019). In arid regions with rather isolated lakes, such as in the Yukon Flats in Interior Alaska, evapoconcentration of DOC plays an important role (Johnston et al., 2020). Water bodies with the highest DOC concentrations in the Yukon Flats have a water depth less than 1 m. 
Studies in western Siberia showed that ponds receive the highest impact of allochthonous input due to the high ratio of lake drainage area to water volume. This results in a short water residence time, leading to the highest concentrations of DOC (Shirokova et al., 2013; Manasypov et al., 2014, 2015). In addition to allochthonous DOC, autochthonous DOC and the processes that control it, including phytoplankton productivity as well as heterotrophic bacterioplankton respiration (Chupakov et al., 2017), influence the DOC concentration, especially in lakes with low connectivity. For lakes in the Yukon River basin, Bogard et al. (2019) described a minor importance of allochthonous DOC in lakes and highlighted the carbon fixation from atmospheric CO2.
Apart from our analysis of the temporal variability of a subset of our dataset, the sampling month of each sample was not included in the statistical analysis of our pan-Arctic dataset, which may result in uncertainties due to variations in lake DOC concentration over the ice-free period. For Qeqqata, Greenland, higher DOC concentrations were found in samples collected in April (under ice) and August compared to June samples. In winter, nutrients as well as DOC concentrate in lakes (Manasypov et al., 2015; Vonk et al., 2015; Grosbois et al., 2017), resulting in higher DOC concentrations in under-ice samples from April. The spring flood transports large amounts of allochthonous DOC to the lakes, resulting in higher lake DOC concentrations in spring (Manasypov et al., 2015). During summer in this region, characterized by low precipitation, evapoconcentration is a major cause of increasing DOC concentration (Anderson and Stedmon, 2007). Regarding the seasonality of DOC concentration in our dataset, we found two different patterns at two different study sites. This highlights the complexity of regulators and mechanisms of the DOC concentration in a lake over a season.
The influence of biological, hydrological, climatic, and topographical parameters on the DOC concentration of a lake is clearly very complex. Whereas our pan-Arctic dataset provides first insights into the relationship between some environmental parameters and lake DOC concentration, regional studies are necessary to elucidate these complex mechanisms and to determine DOC predictors, which may differ regionally.
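To illustrate why a relationship can be statistically significant and still weak in a dataset of this size, the following minimal Python sketch computes the t statistic of a correlation coefficient and a two-sided p value from a normal approximation. The coefficient r = 0.1 is a purely hypothetical value and not a result of this study, and the Pearson-type test is an assumption that does not necessarily match the statistical tests applied to our dataset; n = 1833 is the number of lakes in the dataset.

```python
import math


def correlation_t_and_p(r, n):
    """t statistic and approximate two-sided p value for a correlation r.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and, because n is large,
    approximates the t distribution by a standard normal distribution.
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 2:
        raise ValueError("n must be larger than 2")
    t = r * math.sqrt((n - 2) / (1 - r * r))
    # Two-sided tail probability of a standard normal variable
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p


r_hypothetical = 0.1   # illustrative value only, not taken from this study
n_lakes = 1833         # number of lakes in the dataset
t_stat, p_value = correlation_t_and_p(r_hypothetical, n_lakes)
explained = r_hypothetical ** 2  # share of variance explained
print(f"t = {t_stat:.2f}, p = {p_value:.1e}, r^2 = {explained:.2f}")
```

In this hypothetical case the relationship is clearly significant although it explains only about 1 % of the variance, which leaves ample room for predictors that were not included in the analysis.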

Challenges of a pan-Arctic DOC assessment
Our synthesis shows a wide range of DOC concentrations in Arctic permafrost region lakes. An important uncertainty factor for analyzing lake DOC concentration in a pan-Arctic context is the still-limited amount of lake DOC data compared to the exceptionally large number of lakes. This region hosts the most lake-rich landscapes on Earth (Lehner and Döll, 2004), and their geologic and hydrologic origins are diverse (Pienitz et al., 2008; Vincent and Laybourne-Parry, 2008; Grosse et al., 2013) but often connected to paleogeographic and cryosphere processes that differ substantially from the world's other lake regions (Smith et al., 2007; Brosius et al., 2021). Lakes in our synthesis dataset were sampled over the past 40 years (Fig. A1). Since then, environmental conditions in some study areas may have changed due to accelerating climate change. For example, thermokarst lakes are very dynamic, and some lakes that were sampled 30-40 years ago may now be completely drained and thus no longer exist. Other environmental characteristics in catchments, such as permafrost extent, vegetation cover, or runoff dynamics, may have changed over time, thereby also affecting lake DOC concentration. The remoteness of many lakes in the Arctic results in multiple challenges to spatially and temporally representative sampling. For example, multitemporal sampling of Arctic lakes is still very rare, which limits our insights into the seasonal and long-term dynamics of lake DOC for many Arctic lake types. To the best of our knowledge, there are no long-term lake DOC studies available for the Arctic that would help elucidate decadal-scale DOC changes and trends and possible correlations with ongoing Arctic change. However, seasonal fluctuations were studied for a small subset of lakes in our dataset (Qeqqata, Greenland).
Further uncertainties result from the still rather coarse-resolution environmental data layers for the pan-Arctic, such as permafrost, ground ice content, soil organic carbon, and ecoregion, as well as the sparseness of high-resolution climate data. New remote-sensing and numerical-modeling-driven approaches to create spatially homogeneous datasets for this large region may provide a much better base for future analyses of lake DOC and its correlation with environmental factors. For example, pan-Arctic remote sensing of permafrost region disturbances may allow correlation of lake DOC data with the processes of rapid permafrost degradation, or global studies of remotely sensed lake abundance and change (Pekel et al., 2016) may help to elucidate the dynamic aspects of lake DOC. To quantify the permafrost region lake DOC pool, an assessment of the volume of the diverse lake types in the Arctic is needed.

Conclusions
DOC is one of the main C fractions in lakes contributing to the greenhouse effect as part of the global C cycle. This first pan-Arctic assessment provides linkages between DOC concentrations and the environment of 1833 lakes in permafrost regions of Alaska, Canada, Greenland, and Siberia. Our study compares DOC concentrations of lakes in the permafrost region with different permafrost extent; tundra and boreal forest ecoregions; regions of different deposit types; areas with high, moderate, low, and variable ground ice content; and different SOCC in the upper 3 m. In these areas, we found a wide range of DOC concentrations from 0 to 1130 mg L−1, with the highest concentrations in lakes in the Yukon Flats in Interior Alaska and the lowest concentrations on the North Slope in Arctic Alaska and in the Canadian Arctic Archipelago. We identified a significant relationship between lake DOC and ecoregion, and we found increasing lake DOC with increasing vegetation from tundra to boreal forest and with decreasing latitude and permafrost extent. We conclude for our dataset that ecoregion zonation is the most important driver for lake DOC concentration in the pan-Arctic region. Nevertheless, the regulation of lake DOC concentration is complex, and some DOC predictors, such as hydrological connectivity, water retention time, and topography, were not included in our analysis due to the lack of appropriately detailed pan-Arctic datasets for these parameters. However, our study of pan-Arctic lake DOC concentration in permafrost regions provides a first broad overview of the connections between lake DOC and lake environment and forms a basis for further detailed analysis. The new PeRL-DOCv1 database will thus be useful for quantification of C pools and fluxes from freshwater bodies across the Arctic.
For DOC analysis of 246 samples collected by authors from AWI, 20 mL of the sample was filtered through a 0.7 µm pore size glass fiber filter; preserved with 20-50 µL of 30 % hydrochloric acid (HCl); and sent to AWI in Potsdam, Germany, for laboratory processing. We then analyzed the samples by high-temperature catalytic combustion. For quality control during the measurement and validation of the results, standard samples with known concentrations of DOC and blank samples of ultrapure water were added to the sample set. The direct method, the so-called NPOC (non-purgeable organic carbon) method, was used to determine the DOC concentration. We filled 9 mL of the sample into a 9 mL glass vial, sealed each vial with aluminum foil, and placed the vials in the vial rack of a Shimadzu TOC-VCPH analyzer. During measurement, the samples were acidified with hydrochloric acid to a pH value of 2-3 and afterwards treated with oxygen gas, which eliminated inorganic C by conversion to CO2. In the next step, the NPOC passes over the catalyst, where it is heated to 680 °C, and the resulting CO2 passes through the NDIR (non-dispersive infrared) detector. The NDIR detector measures the concentration, and the related software calculates the average of up to five measurement procedures of each sample (Manual Shimadzu/TOC-V, 2008). The DOC concentration was recorded in milligrams per liter.
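The averaging of replicate measurements and the check against standards of known concentration can be sketched as follows. This is a minimal Python illustration and not the instrument software: the function names, the coefficient of variation as a spread measure, and all numerical values are hypothetical assumptions; only the limit of five measurements per sample is taken from the description above.

```python
from statistics import mean, stdev

MAX_MEASUREMENTS = 5  # "up to five measurement procedures" per sample


def summarize_npoc(replicates_mg_per_l):
    """Return (mean DOC in mg/L, coefficient of variation in % or None)."""
    n = len(replicates_mg_per_l)
    if not 1 <= n <= MAX_MEASUREMENTS:
        raise ValueError("expected between 1 and 5 replicate measurements")
    average = mean(replicates_mg_per_l)
    # The spread needs at least two replicates and a non-zero mean
    if n > 1 and average > 0:
        cv_percent = 100.0 * stdev(replicates_mg_per_l) / average
    else:
        cv_percent = None
    return average, cv_percent


def standard_recovery_percent(measured_mg_per_l, nominal_mg_per_l):
    """Recovery of a standard with known DOC concentration, in percent."""
    if nominal_mg_per_l <= 0:
        raise ValueError("nominal concentration must be positive")
    return 100.0 * measured_mg_per_l / nominal_mg_per_l


# Hypothetical replicate readings for one sample and one standard
doc_mean, doc_cv = summarize_npoc([10.0, 10.2, 9.8])
recovery = standard_recovery_percent(measured_mg_per_l=9.5, nominal_mg_per_l=10.0)
print(f"DOC = {doc_mean:.1f} mg/L (CV {doc_cv:.1f} %), recovery = {recovery:.0f} %")
```

Reporting a spread measure alongside the mean makes it easier to flag samples whose replicates disagree, although any acceptance threshold would have to follow the laboratory's own protocol.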