This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.
Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter.
For further analysis and data, please refer to the accompanied github repo:
https://github.com/CoraHoerstmann/Kerguelen2021
Data:
The datasets can be downloaded from PANGEAE: https://doi.org/10.1594/PANGAEA.885896 (Hoerstmann et al. 2018)
genomic sequencing data: https://www.ebi.ac.uk/ena/browser/view/PRJEB29488 (Study name: MD206_genomics, accession no: PRJEB29488)
References for used Rpackages:
! For package version use packageVersion() and see within notebook
tidyverse: Wickham et al., (2019) (https://doi.org/10.21105/joss.01686)
ggplot2: Wickham (2016) (https://ggplot2.tidyverse.org)
vegan: Oksanen et al. 2019 (https://CRAN.R-project.org/package=vegan)
zCompositions: Palarea-Albaladejo et al. (2015) (http://dx.doi.org/10.1016/j.chemolab.2015.02.019)
iNEXT: Hsieh et al. (2020) (http://chao.stat.nthu.edu.tw/wordpress/software-download/)
geodist: Padgham & Sumner (2019) (https://CRAN.R-project.org/package=geodist)
reshape2: Wickham (2007) (http://www.jstatsoft.org/v21/i12/)
ape: Paradis & Schliep (2018)
gdm: Fitzpatrick et al. (2020) (//CRAN.R-project.org/package=gdm)
upsetR: Gehlenborg (2019) (https://CRAN.R-project.org/package=UpSetR)
phyloseq: McMurdie & Holmes (2013) (http://dx.plos.org/10.1371/journal.pone.0061217)
magrittr: Bache & Wickham (2014) (https://CRAN.R-project.org/package=magrittr)
Acknowledgement for decisions for statistical analyses:
Buttigieg PL, Ramette A (2014) A Guide to Statistical Analysis in Microbial Ecology: a community-focused, living review of multivariate data analyses. FEMS Microbiol Ecol. 90: 543–550. https://mb3is.megx.net/gustame/home
this is kept seperate from microbial analysis because in microbial analysis we only use the mean vaules of the rates. In this part of the analysis we plot the biological rates against environmental data (temperature) and check for significance between ocean provinces and watermasses.
import, rename the columnheaders and associate provincecs and watermasses
## [1] change column names
## [1] associate provinces and watermasses
Dissolved inorganic nutrients (nitrate, phsophate and silicate) and particulate organic carbon to particulate organic nitrogen ratio are plotted against sea surface temperature. Temperature is the best predictor of watermasses and strongest correlate with environmental data and was thus used in our dataset for data representation. Samples are colored according to water mass (WM)
We performed two sample t-test to check for significant differences between ocean provinces.
## # A tibble: 13 x 11
## Event mean_NO3 min_NO3 max_NO3 mean_P min_P max_P mean_Si min_Si max_Si n
## * <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 OISO~ 24.9 23.8 25.6 1.66 1.64 1.68 20.0 19.2 20.5 3
## 2 OISO~ 8.12 6.98 9.53 0.75 0.71 0.8 0.493 0.365 0.685 6
## 3 OISO~ 1.49 1.10 1.87 0.267 0.24 0.3 0.0267 0 0.06 3
## 4 OISO~ 0 0 0 0.158 0.13 0.205 0.415 0.315 0.48 6
## 5 OISO~ 0 0 0 0.09 0.05 0.17 1.42 1.36 1.48 6
## 6 OISO2 0 0 0 0.0333 0.03 0.04 0.937 0.905 0.965 6
## 7 OISO3 0 0 0 0.0617 0.055 0.07 0.698 0.68 0.73 3
## 8 OISO~ 24.9 24.4 25.3 1.70 1.7 1.70 26.9 26.4 27.3 6
## 9 OISO4 0 0 0 0.103 0.1 0.11 0.06 0.035 0.075 6
## 10 OISO6 20.2 19.2 20.8 1.45 1.40 1.48 2.56 2.4 2.66 6
## 11 OISO7 22.4 22.4 22.6 1.54 1.52 1.55 6.05 5.92 6.14 6
## 12 OISO9 22.1 20.8 23.5 1.59 1.54 1.64 7.14 6.74 7.54 2
## 13 OISOE 19.2 17.7 20.0 1.40 1.34 1.43 1.23 1.10 1.30 6
## # A tibble: 6 x 11
## Event mean_NO3 min_NO3 max_NO3 mean_P min_P max_P mean_Si min_Si max_Si n
## * <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 OISO11 24.9 24.9 24.9 1.66 1.66 1.66 20.0 20.0 20.0 1
## 2 OISO37 24.9 24.9 24.9 1.70 1.70 1.70 26.9 26.9 26.9 1
## 3 OISO6 20.2 20.2 20.2 1.45 1.45 1.45 2.56 2.56 2.56 1
## 4 OISO7 22.4 22.4 22.4 1.54 1.54 1.54 6.05 6.05 6.05 1
## 5 OISO9 22.1 22.1 22.1 1.59 1.59 1.59 7.14 7.14 7.14 1
## 6 OISOE 19.2 19.2 19.2 1.40 1.40 1.40 1.23 1.23 1.23 1
To calculate biomass specific primary productivity (PB) and chlorophyll-specific N2 fixation we normalized the rate measurements by chlorophyll concentration.
PB is mainly a function of irradiance but can also sometimes be used to check phtoplankton efficiency
N2 fixation, normalized by chlorophyll a, gives insights about the relative importance of N2 fixation to primary productivity. As seen in our data, although we were able to measure N2 fixation throughout the entire sampling region, the relative importance of N2 fixation to PP in the Southern Ocean (SO) is low because other N sources (nitrate) are available and PP is >> N2 fixation.
Carbonfix$PB <- Carbonfix$cfix.day/Carbonfix$chl.a_ug.l
We plotted rate measurements and specific rate measurements against sea surface temperature and colored the samples according to water masses (WM). (Fig. 3 and Fig. A10)
## Loading required package: rcompanion
## # A tibble: 13 x 5
## Event mean_c.fix min_PP max_PP n
## * <chr> <dbl> <dbl> <dbl> <int>
## 1 OISO11 1542. 1021. 2065. 3
## 2 OISO14 2636. 994. 3847. 6
## 3 OISO15 1885. 1580. 2342. 3
## 4 OISO16 378. 170. 538. 6
## 5 OISO18 301. 226. 371. 6
## 6 OISO2 190. 63.2 325. 6
## 7 OISO3 643. 350. 846. 3
## 8 OISO37 1186. 587. 1876. 6
## 9 OISO4 1069. 524. 1877. 6
## 10 OISO6 1978. 676. 4242. 6
## 11 OISO7 2129. 943. 4305. 6
## 12 OISO9 3395. 3237. 3553. 2
## 13 OISOE 2646. 2212. 3115. 6
## # A tibble: 13 x 5
## Event mean_PB min_PB max_PB n
## * <chr> <dbl> <dbl> <dbl> <int>
## 1 OISO11 1683. 1115. 2255. 3
## 2 OISO14 665. 248. 979. 6
## 3 OISO15 477. 400. 593. 3
## 4 OISO16 790. 271. 1341. 6
## 5 OISO18 563. 364. 762. 6
## 6 OISO2 484. 257. 762. 6
## 7 OISO3 1215. 662. 1599. 3
## 8 OISO37 795. 118. 1628. 6
## 9 OISO4 531. 315. 841. 6
## 10 OISO6 784. 296. 1609. 6
## 11 OISO7 843. 283. 1889. 6
## 12 OISO9 1924. 1834. 2013. 2
## 13 OISOE 567. 477. 762. 6
##
## Pearson's product-moment correlation
##
## data: Cfix_all$MLD..m. and Cfix_all$mean_PB
## t = 0.7288, df = 11, p-value = 0.4814
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3814760 0.6846442
## sample estimates:
## cor
## 0.2146197
## # A tibble: 13 x 7
## Event MQR_station mean_N2.fix min max n na.rm
## * <chr> <dbl> <dbl> <dbl> <dbl> <int> <lgl>
## 1 OISO11 1.22 0.892 0.225 2.20 3 TRUE
## 2 OISO14 0.718 0.782 -0.0200 2.26 6 TRUE
## 3 OISO15 1.16 0.239 -0.397 1.43 3 TRUE
## 4 OISO16 1.32 1.05 0.388 2.21 6 TRUE
## 5 OISO18 5.00 4.04 0.939 7.92 6 TRUE
## 6 OISO2 2.62 1.58 0.701 2.88 6 TRUE
## 7 OISO3 5.44 2.81 0.649 6.91 3 TRUE
## 8 OISO37 1.22 1.97 0.759 3.09 6 TRUE
## 9 OISO4 3.54 2.01 0.109 4.92 6 TRUE
## 10 OISO6 0.879 0.927 0.174 3.25 6 TRUE
## 11 OISO7 1.19 1.75 1.00 4.39 6 TRUE
## 12 OISO9 0.753 0.882 0.193 2.15 3 TRUE
## 13 OISOE 0.651 0.924 0.177 2.09 6 TRUE
## # A tibble: 5 x 5
## WM mean min max n
## * <chr> <dbl> <dbl> <dbl> <int>
## 1 AZ 1.34 0.892 1.97 15
## 2 ISSG 2.31 1.05 4.04 21
## 3 PFZ 1.27 0.882 1.75 14
## 4 SAF 0.601 0.239 0.782 9
## 5 STF 2.01 2.01 2.01 6
PN and delta 15N ratios (Fig. A4)
## Event PN_CHL province WM
## 1 OISO3 11.482539 ISSG ISSG
## 2 OISO16 11.459569 ISSG ISSG
## 3 OISO2 12.158775 ISSG ISSG
## 4 OISO18 29.737791 ISSG ISSG
## 5 OISO4 7.197732 SSTC STF
## 6 OISO14 5.919934 SSTC SAF
## 7 OISO15 2.708091 SSTC SAF
## 8 OISO7 4.701511 SO PFZ
## 9 OISO9 6.729712 SO PFZ
## 10 OISO6 5.616233 SO PFZ
## 11 OISO37 2.805612 SO AZ
## 12 OISO11 15.286551 SO AZ
## 13 OISOE 3.942750 SO AZ
### import microbial datasets files and data prep
metdata calculates mean for carbon fixation and N2 fixation for using it as metadata for microbial analyses.
We used High throughput pigment liquid chromatography (HPLC) to identify different pigments in our samples. Pigments can be used as markers for phytoplankton taxa and size classes and were calculated after Hirata et al. 2011 and Uitz et al. 2006. How to calculate the diagnostic pigments see supplementary table 4
We tested significance of pigment composition between ocean provinces with PERMANOVA and checked for individual pigments using Welch two-sample t-test.
Size classes had a significant second order polynominal fit.
(Fig. A5)
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = all_pig_T0[7:27] ~ province, data = all_pig_T0, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## province 2 1.74815 0.76496 17.9 0.001 ***
## Residual 11 0.53714 0.23504
## Total 13 2.28529 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = all_pig_T0[7:27] ~ WM, data = all_pig_T0, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## WM 4 1.86158 0.81459 9.8856 0.001 ***
## Residual 9 0.42371 0.18541
## Total 13 2.28529 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Event chl.c3 fuco zea dv_chla province WM
## 1 OISO10 0.08030 0.10100 0.00000 0.00000 SO AZ
## 2 OISO11 0.03684 0.06139 0.00161 0.00000 SO AZ
## 3 OISO14 0.10000 0.08157 0.03120 0.00000 SSTC SAF
## 4 OISO15 0.18600 0.06843 0.04070 0.00716 SSTC SAF
## 5 OISO16 0.00000 0.00325 0.03430 0.01553 ISSG ISSG
## 6 OISO18 0.00000 0.00000 0.05660 0.03250 ISSG ISSG
## 7 OISO2 0.00000 0.00806 0.05720 0.02627 ISSG ISSG
## 8 OISO3 0.00000 0.00614 0.03810 0.01772 ISSG ISSG
## 9 OISO37 0.07509 0.54235 0.01090 0.00000 SO AZ
## 10 OISO4 0.02440 0.01970 0.04040 0.00000 SSTC STF
## 11 OISO6 0.04130 0.10500 0.00417 0.00000 SO PFZ
## 12 OISO7 0.07770 0.13400 0.00578 0.00000 SO PFZ
## 13 OISO9 0.07260 0.07100 0.00098 0.00000 SO PFZ
## 14 OISOE 0.14434 0.43000 0.00117 0.00000 SO AZ
##
## Call:
## lm(formula = pigments_T0$Microplankton ~ poly(pigments_T0$Temp,
## 2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.157301 -0.095663 0.006094 0.052645 0.230833
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.29052 0.03259 8.914 2.3e-06 ***
## poly(pigments_T0$Temp, 2)1 -0.68428 0.12194 -5.611 0.000158 ***
## poly(pigments_T0$Temp, 2)2 0.25995 0.12194 2.132 0.056411 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1219 on 11 degrees of freedom
## Multiple R-squared: 0.7661, Adjusted R-squared: 0.7236
## F-statistic: 18.02 on 2 and 11 DF, p-value: 0.0003384
##
## Call:
## lm(formula = pigments_T0$Picoplankton ~ poly(pigments_T0$Temp,
## 2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.054313 -0.012681 0.000045 0.016527 0.062581
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.248327 0.008615 28.824 1.03e-11 ***
## poly(pigments_T0$Temp, 2)1 0.906679 0.032236 28.127 1.34e-11 ***
## poly(pigments_T0$Temp, 2)2 0.120994 0.032236 3.753 0.00319 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03224 on 11 degrees of freedom
## Multiple R-squared: 0.9865, Adjusted R-squared: 0.9841
## F-statistic: 402.6 on 2 and 11 DF, p-value: 5.162e-11
##
## Call:
## lm(formula = pigments_T0$Nanoplankton ~ poly(pigments_T0$Temp,
## 2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.190742 -0.058067 -0.009707 0.074205 0.168298
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.46115 0.02997 15.385 8.72e-09 ***
## poly(pigments_T0$Temp, 2)1 -0.22240 0.11215 -1.983 0.07289 .
## poly(pigments_T0$Temp, 2)2 -0.38094 0.11215 -3.397 0.00596 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1122 on 11 degrees of freedom
## Multiple R-squared: 0.5844, Adjusted R-squared: 0.5089
## F-statistic: 7.735 on 2 and 11 DF, p-value: 0.007989
to account for compositionality of sequencing data, data needs to be normalized. Most robust is centered-log-ratio transformation, however, it gives negative values and thus, hellinger is better for some analyses.
We choose 3 as the appropriate row-sum cutoff to exclude low abundant ASVs.
source("C:/Users/choerstm/Documents/core_PS113/Genomics/RScripts/general_clr_hellinger_transformation.R", encoding = "UTF-8")
ASV_16S_hellinger <- hellinger(ASV_16S,3)
ASV_18S_hellinger <- hellinger(ASV_18S,3)
ASV_16S_clr <- clr(ASV_16S,3)
## No. corrected values: 27
ASV_18S_clr <- clr(ASV_18S,3)
## No. corrected values: 138
#it'll print the number of removed rows for 16S and 18S, respectively
z scoring unifies metadata by shifting the mean of all variables to 0 and the standard deviation to 1.
meta_16S.z <- meta_16S
meta_18S.z <- meta_18S
meta_16S.z[c(5:15,17, 21, 24,28,30, 31)] <- lapply(meta_16S.z[c(5:15,17, 21, 24,28,30, 31)], function(x) {y <-scale(x, center = TRUE, scale = TRUE)})
meta_18S.z[c(5:15,17, 21, 24,28,30, 31)] <- lapply(meta_18S.z[c(5:15,17, 21, 24,28,30, 31)], function(x) {y <-scale(x, center = TRUE, scale = TRUE)})
rownames(meta_16S.z) <- meta_16S.z$Event
rownames(meta_18S.z) <- meta_18S.z$Event
iNEXT focuses on three measures of Hill numbers of order q: species richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy) and Simpson diversity (q = 2, the inverse of Simpson concentration). For each diversity measure, iNEXT uses the observed sample of abundance or incidence data (called the “reference sample”) to compute diversity estimates and the associated 95% confidence intervals for the following two types of rarefaction and extrapolation (R/E):
Sample‐size‐based R/E sampling curves: iNEXT computes diversity estimates for rarefied and extrapolated samples up to an appropriate size. This type of sampling curve plots the diversity estimates with respect to sample size.
Coverage‐based R/E sampling curves: iNEXT computes diversity estimates for rarefied and extrapolated samples with sample completeness (as measured by sample coverage) up to an appropriate coverage. This type of sampling curve plots the diversity estimates with respect to sample coverage.
iNEXT also plots the above two types of sampling curves and a sample completeness curve. The sample completeness curve provides a bridge between these two types of curves.
(Fig. A6)
A key objective of this study was to investigate correlations between microbial diversity (reflected in richness q=0 and Inverse Simpson q=2, the latter to better reflect taxonomic evenness across a sample).
(Fig. A7, A8)
##
## Pearson's product-moment correlation
##
## data: meta.16S.div$Temp and meta.16S.div$Richness
## t = 2.5826, df = 9, p-value = 0.02957
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.08632694 0.90004306
## sample estimates:
## cor
## 0.6524163
##
## Pearson's product-moment correlation
##
## data: meta.16S.div$Temp and meta.16S.div$Simpson
## t = -0.32251, df = 9, p-value = 0.7544
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.6641759 0.5267633
## sample estimates:
## cor
## -0.1068872
##
## Pearson's product-moment correlation
##
## data: meta.16S.div$mean_c.fix and meta.16S.div$Simpson
## t = 1.1568, df = 9, p-value = 0.2771
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3061684 0.7893074
## sample estimates:
## cor
## 0.3597859
##
## Pearson's product-moment correlation
##
## data: meta.16S.div$mean_N2fix and meta.16S.div$Simpson
## t = -2.9624, df = 9, p-value = 0.0159
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.9162977 -0.1776338
## sample estimates:
## cor
## -0.7026372
## Call:corr.test(x = meta.16S.div[c(8, 26, 28)], y = meta.16S.div$Simpson,
## method = "pearson", adjust = "holm")
## Correlation matrix
## [,1]
## Temp -0.11
## C_umol.L 0.09
## mean_c.fix 0.36
## Sample Size
## [1] 11
## Probability values adjusted for multiple tests.
## [,1]
## Temp 1.00
## C_umol.L 1.00
## mean_c.fix 0.83
##
## To see confidence intervals of the correlations, print with the short=FALSE option
##
## Pearson's product-moment correlation
##
## data: meta.18S.div$Temp and meta.18S.div$Richness
## t = 4.9994, df = 10, p-value = 0.0005378
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5266713 0.9555621
## sample estimates:
## cor
## 0.8451269
##
## Pearson's product-moment correlation
##
## data: meta.18S.div$Temp and meta.18S.div$Simpson
## t = 4.5414, df = 10, p-value = 0.001072
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.4664076 0.9480460
## sample estimates:
## cor
## 0.8206454
##
## Pearson's product-moment correlation
##
## data: meta.18S.div$mean_c.fix and meta.18S.div$Simpson
## t = -2.7942, df = 10, p-value = 0.01898
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.8956849 -0.1423436
## sample estimates:
## cor
## -0.6621531
##
## Pearson's product-moment correlation
##
## data: meta.18S.div$mean_N2fix and meta.18S.div$Simpson
## t = 3.4567, df = 10, p-value = 0.006157
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2843266 0.9215218
## sample estimates:
## cor
## 0.7378321
## Call:corr.test(x = meta.18S.div[c(8, 26, 28)], y = meta.18S.div$Simpson,
## method = "pearson", adjust = "holm")
## Correlation matrix
## [,1]
## Temp 0.82
## C_umol.L -0.49
## mean_c.fix -0.66
## Sample Size
## [1] 12
## Probability values adjusted for multiple tests.
## [,1]
## Temp 0.00
## C_umol.L 0.11
## mean_c.fix 0.04
##
## To see confidence intervals of the correlations, print with the short=FALSE option
another way of analysing the effect of environmental dissmilarity on microbial community dissimilarity is using a general dissimilarity model, since it is more flexible to non-linear effects.
(Fig. 4 c,d; Fig. 11)
source("D:/Kerguelen_project/analysis_with_R/Rscripts/gdm_environmental_distance.R", encoding = "UTF-8")
#decide input metadata for gdm analysis
gdm_input <- c("Event","Latitude","Longitude","Sal", "MLD..m.", "Temp", "oxygen", "NO3umol.l", "NH4_umol.l", "P_umol.l", "Si_umol.l", "chl.a_ug.l", "POCPN_ratio", "mean_c.fix", "mean_PB", "mean_N2fix")
print("prokaryotes")
## [1] "prokaryotes"
gdm_analysis(ASV_16S_hellinger, meta_16S[gdm_input])
## Warning: package 'gdm' was built under R version 4.0.5
##
## To cite package 'gdm' in publications use:
##
## Matthew C. Fitzpatrick, Karel Mokany, Glenn Manion, Matthew Lisk,
## Simon Ferrier and Diego Nieto-Lugilde (2021). gdm: Generalized
## Dissimilarity Modeling. R package version 1.4.2.2.
## https://CRAN.R-project.org/package=gdm
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {gdm: Generalized Dissimilarity Modeling},
## author = {Matthew C. Fitzpatrick and Karel Mokany and Glenn Manion and Matthew Lisk and Simon Ferrier and Diego Nieto-Lugilde},
## year = {2021},
## note = {R package version 1.4.2.2},
## url = {https://CRAN.R-project.org/package=gdm},
## }
##
## ATTENTION: This citation information has been auto-generated from the
## package DESCRIPTION file and may need manual editing, see
## 'help("citation")'.
##
## [1]
## [1]
## [1] GDM Modelling Summary
## [1] Creation Date: Sun May 09 17:50:54 2021
## [1]
## [1] Name: gdm.1
## [1]
## [1] Data: gdmTab
## [1]
## [1] Samples: 55
## [1]
## [1] Geographical distance used in model fitting? TRUE
## [1]
## [1] NULL Deviance: 9.96217539248304
## [1] GDM Deviance: 0.338783639265586
## [1] Percent Deviance Explained: 96.5993005953176
## [1]
## [1] Intercept: 0.0215196942572105
## [1]
## [1] Predictor 1: Geographic
## [1] Splines: 3
## [1] Min Knot: 2.92718633674046
## [1] 50% Knot: 17.7354532857212
## [1] Max Knot: 30.3392664092262
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 2: Sal
## [1] Splines: 3
## [1] Min Knot: 33.73584
## [1] 50% Knot: 34.39721
## [1] Max Knot: 35.85502
## [1] Coefficient[1]: 0.192002997451449
## [1] Coefficient[2]: 0.363645361448136
## [1] Coefficient[3]: 0.172358116206037
## [1]
## [1] Predictor 3: MLD..m.
## [1] Splines: 3
## [1] Min Knot: 7.945
## [1] 50% Knot: 49.576
## [1] Max Knot: 82.281
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0.0608797779263614
## [1]
## [1] Predictor 4: Temp
## [1] Splines: 3
## [1] Min Knot: 3.08
## [1] 50% Knot: 11.59
## [1] Max Knot: 27.38
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 5: oxygen
## [1] Splines: 3
## [1] Min Knot: 203.99899
## [1] 50% Knot: 274.29599
## [1] Max Knot: 324.05301
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0.0710671106952922
## [1] Coefficient[3]: 0.635253969815272
## [1]
## [1] Predictor 6: NO3umol.l
## [1] Splines: 3
## [1] Min Knot: 0
## [1] 50% Knot: 9.53
## [1] Max Knot: 25.585
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 7: NH4_umol.l
## [1] Splines: 3
## [1] Min Knot: 0
## [1] 50% Knot: 0.08
## [1] Max Knot: 1.43
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 8: P_umol.l
## [1] Splines: 3
## [1] Min Knot: 0.04
## [1] 50% Knot: 0.8
## [1] Max Knot: 1.705
## [1] Coefficient[1]: 0.219554335438145
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 9: Si_umol.l
## [1] Splines: 3
## [1] Min Knot: 0
## [1] 50% Knot: 1.29
## [1] Max Knot: 27.34
## [1] Coefficient[1]: 0.191369066479726
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0.0206927190764605
## [1]
## [1] Predictor 10: chl.a_ug.l
## [1] Splines: 3
## [1] Min Knot: 0.48725
## [1] 50% Knot: 2.2327
## [1] Max Knot: 4.96042
## [1] Coefficient[1]: 0.196728675054061
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0.256623591273278
## [1]
## [1] Predictor 11: POCPN_ratio
## [1] Splines: 3
## [1] Min Knot: 5.90375923981978
## [1] 50% Knot: 6.17551726614751
## [1] Max Knot: 8.44946949997637
## [1] Coefficient[1]: 0.414586603405584
## [1] Coefficient[2]: 0.0983292595546666
## [1] Coefficient[3]: 0.127969937128137
## [1]
## [1] Predictor 12: mean_c.fix
## [1] Splines: 3
## [1] Min Knot: 190.3836
## [1] 50% Knot: 1541.95352
## [1] Max Knot: 3395.06772
## [1] Coefficient[1]: 0.636704365876361
## [1] Coefficient[2]: 0.00612600848050092
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 13: mean_PB
## [1] Splines: 3
## [1] Min Knot: 477.38724318191
## [1] 50% Knot: 665.104094312654
## [1] Max Knot: 1923.75820626583
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 14: mean_N2fix
## [1] Splines: 3
## [1] Min Knot: 0.23895
## [1] 50% Knot: 1.58218833333333
## [1] Max Knot: 4.038265
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0.111875297133606
## [1] Coefficient[3]: 0
## [1]
## List of 15
## $ dataname : symbol gdmTab
## $ geo : logi TRUE
## $ sample : int 55
## $ gdmdeviance : num 0.339
## $ nulldeviance: num 9.96
## $ explained : num 96.6
## $ intercept : num 0.0215
## $ predictors : chr [1:14] "Geographic" "Sal" "MLD..m." "Temp" ...
## $ coefficients: num [1:42] 0 0 0 0.192 0.364 ...
## $ knots : num [1:42] 2.93 17.74 30.34 33.74 34.4 ...
## $ splines : num [1:14] 3 3 3 3 3 3 3 3 3 3 ...
## $ creationdate: chr "Sun May 09 17:50:54 2021"
## $ observed : num [1:55] 0.685 0.796 0.921 0.925 0.908 ...
## $ predicted : num [1:55] 0.67 0.816 0.93 0.911 0.876 ...
## $ ecological : num [1:55] 1.11 1.69 2.66 2.42 2.09 ...
## - attr(*, "class")= chr [1:2] "gdm" "list"
## NULL
## List of 2
## $ x: num [1:200, 1:14] 2.93 3.06 3.2 3.34 3.48 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:14] "Geographic" "Sal" "MLD..m." "Temp" ...
## $ y: num [1:200, 1:14] 0 0 0 0 0 0 0 0 0 0 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:14] "Geographic" "Sal" "MLD..m." "Temp" ...
## NULL
## NULL
## NULL
print("eukaryotes")
## [1] "eukaryotes"
gdm_analysis(ASV_18S_hellinger, meta_18S[gdm_input])
## Warning: package 'gdm' was built under R version 4.0.5
##
## To cite package 'gdm' in publications use:
##
## Matthew C. Fitzpatrick, Karel Mokany, Glenn Manion, Matthew Lisk,
## Simon Ferrier and Diego Nieto-Lugilde (2021). gdm: Generalized
## Dissimilarity Modeling. R package version 1.4.2.2.
## https://CRAN.R-project.org/package=gdm
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {gdm: Generalized Dissimilarity Modeling},
## author = {Matthew C. Fitzpatrick and Karel Mokany and Glenn Manion and Matthew Lisk and Simon Ferrier and Diego Nieto-Lugilde},
## year = {2021},
## note = {R package version 1.4.2.2},
## url = {https://CRAN.R-project.org/package=gdm},
## }
##
## ATTENTION: This citation information has been auto-generated from the
## package DESCRIPTION file and may need manual editing, see
## 'help("citation")'.
##
## [1]
## [1]
## [1] GDM Modelling Summary
## [1] Creation Date: Sun May 09 17:50:55 2021
## [1]
## [1] Name: gdm.1
## [1]
## [1] Data: gdmTab
## [1]
## [1] Samples: 66
## [1]
## [1] Geographical distance used in model fitting? TRUE
## [1]
## [1] NULL Deviance: 10.0922602806298
## [1] GDM Deviance: 0.681693129804237
## [1] Percent Deviance Explained: 93.2453869514977
## [1]
## [1] Intercept: 0.288371586039048
## [1]
## [1] Predictor 1: Geographic
## [1] Splines: 3
## [1] Min Knot: 5.03513663071818
## [1] 50% Knot: 17.7298314879299
## [1] Max Knot: 30.3392664092262
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0.207551168078672
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 2: Sal
## [1] Splines: 3
## [1] Min Knot: 33.73584
## [1] 50% Knot: 34.501835
## [1] Max Knot: 35.85502
## [1] Coefficient[1]: 0.210265649532649
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 3: MLD..m.
## [1] Splines: 3
## [1] Min Knot: 0
## [1] 50% Knot: 48.566
## [1] Max Knot: 82.281
## [1] Coefficient[1]: 0.0413560912989392
## [1] Coefficient[2]: 0.433302265916552
## [1] Coefficient[3]: 0.0207487042140491
## [1]
## [1] Predictor 4: Temp
## [1] Splines: 3
## [1] Min Knot: 3.08
## [1] 50% Knot: 10.87
## [1] Max Knot: 27.38
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0.400610092223674
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 5: oxygen
## [1] Splines: 3
## [1] Min Knot: 203.99899
## [1] 50% Knot: 280.792995
## [1] Max Knot: 324.05301
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0.146795437409257
## [1]
## [1] Predictor 6: NO3umol.l
## [1] Splines: 3
## [1] Min Knot: 0
## [1] 50% Knot: 10.745
## [1] Max Knot: 25.585
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 7: NH4_umol.l
## [1] Splines: 3
## [1] Min Knot: 0
## [1] 50% Knot: 0.105
## [1] Max Knot: 1.43
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 8: P_umol.l
## [1] Splines: 3
## [1] Min Knot: 0.04
## [1] 50% Knot: 0.845
## [1] Max Knot: 1.705
## [1] Coefficient[1]: 0.413494785996043
## [1] Coefficient[2]: 0.212583789962081
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 9: Si_umol.l
## [1] Splines: 3
## [1] Min Knot: 0
## [1] 50% Knot: 1.3275
## [1] Max Knot: 27.34
## [1] Coefficient[1]: 0.471212795627758
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0.389875728821161
## [1]
## [1] Predictor 10: chl.a_ug.l
## [1] Splines: 3
## [1] Min Knot: 0.4012
## [1] 50% Knot: 1.998755
## [1] Max Knot: 4.96042
## [1] Coefficient[1]: 0.121823492050891
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0.351256310665088
## [1]
## [1] Predictor 11: POCPN_ratio
## [1] Splines: 3
## [1] Min Knot: 5.90375923981978
## [1] 50% Knot: 6.1985956585102
## [1] Max Knot: 8.44946949997637
## [1] Coefficient[1]: 0.0283205827227883
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0.13814288024884
## [1]
## [1] Predictor 12: mean_c.fix
## [1] Splines: 3
## [1] Min Knot: 190.3836
## [1] 50% Knot: 1363.77054
## [1] Max Knot: 3395.06772
## [1] Coefficient[1]: 0.169528177183697
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 13: mean_PB
## [1] Splines: 3
## [1] Min Knot: 477.38724318191
## [1] 50% Knot: 786.977552148935
## [1] Max Knot: 1923.75820626583
## [1] Coefficient[1]: 0.152837160870431
## [1] Coefficient[2]: 0
## [1] Coefficient[3]: 0
## [1]
## [1] Predictor 14: mean_N2fix
## [1] Splines: 3
## [1] Min Knot: 0.23895
## [1] 50% Knot: 1.3156875
## [1] Max Knot: 4.038265
## [1] Coefficient[1]: 0
## [1] Coefficient[2]: 0.0149004109553122
## [1] Coefficient[3]: 0.177511227651542
## [1]
## List of 15
## $ dataname : symbol gdmTab
## $ geo : logi TRUE
## $ sample : int 66
## $ gdmdeviance : num 0.682
## $ nulldeviance: num 10.1
## $ explained : num 93.2
## $ intercept : num 0.288
## $ predictors : chr [1:14] "Geographic" "Sal" "MLD..m." "Temp" ...
## $ coefficients: num [1:42] 0 0.208 0 0.21 0 ...
## $ knots : num [1:42] 5.04 17.73 30.34 33.74 34.5 ...
## $ splines : num [1:14] 3 3 3 3 3 3 3 3 3 3 ...
## $ creationdate: chr "Sun May 09 17:50:55 2021"
## $ observed : num [1:66] 0.926 0.903 0.948 0.943 0.929 ...
## $ predicted : num [1:66] 0.917 0.92 0.93 0.939 0.92 ...
## $ ecological : num [1:66] 2.49 2.53 2.66 2.8 2.52 ...
## - attr(*, "class")= chr [1:2] "gdm" "list"
## NULL
## List of 2
## $ x: num [1:200, 1:14] 5.04 5.16 5.29 5.42 5.54 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:14] "Geographic" "Sal" "MLD..m." "Temp" ...
## $ y: num [1:200, 1:14] 0.00 1.03e-05 4.14e-05 9.31e-05 1.65e-04 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:14] "Geographic" "Sal" "MLD..m." "Temp" ...
## NULL
## NULL
## NULL
RDA and PERMANOVA test
We perfromed PCA analyses on metadata and excluded data which were correlating with other variables. Input variables are listed uder “RDA input” We checked whether residulas of metadata were normally distributed and used the ones for further analysis in RDA
Significances of differences were tested using PERMANOVA
(Fig. 4 a,b; Fig. A9)
source("D:/Kerguelen_project/analysis_with_R/Rscripts/actual_multivariate_test.R", encoding = "UTF-8")
## Loading required package: ggvegan
## Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
## logical.return = TRUE, : there is no package called 'ggvegan'
## Loading required package: vegan
## This is vegan 2.5-7
## Loading required package: rgr
## Loading required package: fastICA
##
## Attaching package: 'rgr'
## The following object is masked _by_ '.GlobalEnv':
##
## clr
## The following object is masked from 'package:dplyr':
##
## syms
## Loading required package: ggplot2
##
## Attaching package: 'ggplot2'
## The following object is masked from 'package:rgr':
##
## syms
## Loading required package: phyloseq
##
## Attaching package: 'phyloseq'
## The following object is masked from 'package:raster':
##
## distance
## Loading required package: adespatial
## Registered S3 methods overwritten by 'adegraphics':
## method from
## biplot.dudi ade4
## kplot.foucart ade4
## kplot.mcoa ade4
## kplot.mfa ade4
## kplot.pta ade4
## kplot.sepan ade4
## kplot.statis ade4
## scatter.coa ade4
## scatter.dudi ade4
## scatter.nipals ade4
## scatter.pco ade4
## score.acm ade4
## score.mix ade4
## score.pca ade4
## screeplot.dudi ade4
## Registered S3 method overwritten by 'gdata':
## method from
## reorder.factor DescTools
## Registered S3 method overwritten by 'spdep':
## method from
## plot.mst ape
## Registered S3 methods overwritten by 'adespatial':
## method from
## plot.multispati adegraphics
## print.multispati ade4
## summary.multispati ade4
## Loading required package: geodist
RDA_input <- c("Event","Temp","oxygen", "NO3umol.l", "NH4_umol.l","Si_umol.l", "chl.a_ug.l","mean_c.fix", "mean_N2fix")
RDA_analysis(ASV_16S_clr, ASV_16S_hellinger, meta_16S.z, meta_16S.z[RDA_input])
## Call: rda(formula = abundance.clr.t ~ Temp + oxygen + NO3umol.l +
## NH4_umol.l + Si_umol.l + chl.a_ug.l + mean_c.fix + mean_N2fix, data =
## meta.sub)
##
## Inertia Proportion Rank
## Total 2775.5079 1.0000
## Constrained 2490.2661 0.8972 8
## Unconstrained 285.2417 0.1028 2
## Inertia is variance
##
## Eigenvalues for constrained axes:
## RDA1 RDA2 RDA3 RDA4 RDA5 RDA6 RDA7 RDA8
## 1268.8 601.4 234.0 116.4 89.8 86.4 66.8 26.6
##
## Eigenvalues for unconstrained axes:
## PC1 PC2
## 160.86 124.39
## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
##
## Model: rda(formula = abundance.clr.t ~ Temp + oxygen + NO3umol.l + NH4_umol.l + Si_umol.l + chl.a_ug.l + mean_c.fix + mean_N2fix, data = meta.sub)
## Df Variance F Pr(>F)
## Model 8 2490.27 2.1826 0.071 .
## Residual 2 285.24
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## [1] 0.897229
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance.hellinger.t ~ mean_c.fix, data = meta, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## mean_c.fix 1 0.89007 0.3152 4.1426 0.011 *
## Residual 9 1.93371 0.6848
## Total 10 2.82378 1.0000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance.hellinger.t ~ mean_N2fix, data = meta, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## mean_N2fix 1 0.56611 0.20048 2.2567 0.074 .
## Residual 9 2.25767 0.79952
## Total 10 2.82378 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance.hellinger.t ~ province, data = meta, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## province 2 1.7729 0.62784 6.7481 0.001 ***
## Residual 8 1.0509 0.37216
## Total 10 2.8238 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance.hellinger.t ~ WM, data = meta, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## WM 4 2.09517 0.74197 4.3133 0.001 ***
## Residual 6 0.72861 0.25803
## Total 10 2.82378 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RDA_analysis(ASV_18S_clr, ASV_18S_hellinger, meta_18S.z, meta_18S.z[RDA_input])
## Call: rda(formula = abundance.clr.t ~ Temp + oxygen + NO3umol.l +
## NH4_umol.l + Si_umol.l + chl.a_ug.l + mean_c.fix + mean_N2fix, data =
## meta.sub)
##
## Inertia Proportion Rank
## Total 4648.3774 1.0000
## Constrained 3757.7714 0.8084 8
## Unconstrained 890.6060 0.1916 3
## Inertia is variance
##
## Eigenvalues for constrained axes:
## RDA1 RDA2 RDA3 RDA4 RDA5 RDA6 RDA7 RDA8
## 1813.0 635.6 429.0 291.0 216.8 148.5 147.5 76.5
##
## Eigenvalues for unconstrained axes:
## PC1 PC2 PC3
## 448.6 330.3 111.8
## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
##
## Model: rda(formula = abundance.clr.t ~ Temp + oxygen + NO3umol.l + NH4_umol.l + Si_umol.l + chl.a_ug.l + mean_c.fix + mean_N2fix, data = meta.sub)
## Df Variance F Pr(>F)
## Model 8 3757.8 1.5823 0.091 .
## Residual 3 890.6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## [1] 0.808405
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance.hellinger.t ~ mean_c.fix, data = meta, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## mean_c.fix 1 1.0095 0.27315 3.7581 0.006 **
## Residual 10 2.6862 0.72685
## Total 11 3.6957 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance.hellinger.t ~ mean_N2fix, data = meta, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## mean_N2fix 1 0.4957 0.13412 1.549 0.125
## Residual 10 3.2000 0.86588
## Total 11 3.6957 1.00000
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance.hellinger.t ~ province, data = meta, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## province 2 2.0031 0.54201 5.3256 0.001 ***
## Residual 9 1.6926 0.45799
## Total 11 3.6957 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance.hellinger.t ~ WM, data = meta, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## WM 4 2.4801 0.67109 3.5706 0.001 ***
## Residual 7 1.2155 0.32891
## Total 11 3.6957 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
alternative to venn diagram but gives more information through set size and intersection size. Yet, it is only a vague estimate in our analysis, since provinces represent sum of all ASVs detected across samples but not neccessarily occur in other samples. Thus, upset plots should only be performed when differences between provinces were significant (in PERMANOVA) How well each sample fit the sum of its province is summarised below
Reference: https://jokergoo.github.io/ComplexHeatmap-reference/book/upset-plot.html
(Fig. 4 e,f)
source("D:/Kerguelen_project/analysis_with_R/Rscripts/upsetR.R", encoding = "UTF-8")
upset_16Splots(ASV_16S)
##
## Attaching package: 'UpSetR'
## The following object is masked from 'package:lattice':
##
## histogram
## Loading required package: usethis
##
## Attaching package: 'devtools'
## The following object is masked from 'package:permute':
##
## check
## ========================================
## ComplexHeatmap version 2.6.2
## Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
## Github page: https://github.com/jokergoo/ComplexHeatmap
## Documentation: http://jokergoo.github.io/ComplexHeatmap-reference
##
## If you use it in published research, please cite:
## Gu, Z. Complex heatmaps reveal patterns and correlations in multidimensional
## genomic data. Bioinformatics 2016.
##
## This message can be suppressed by:
## suppressPackageStartupMessages(library(ComplexHeatmap))
## ========================================
upset_18Splots(ASV_18S)
Table A5
because it is biased you can check how well each sample is represented in the combined sample. You can only use this analysis if the PERMANOVA is actually significant
for the analysis I choose union mode: 1 means in that set and 0 is not taken into account. When there are multiple 1, the relationship is OR. Then, 1 1 0 means a set of elements in set A or B, and they can also in C or not in C (union(A, B)). Under this mode, the seven combination sets can overlap.
SO S37 - 0 1 2 520 237 218 - “of total presence (455) : 48% S11 - 0 1 2 520 237 218 -”of total presence (455) : 48% SE - 0 1 2 520 202 253 - “of total presence (455) : 56% S7 - 0 1 2 520 214 241 -”of total presence (455) : 53% S9 - 0 1 2 520 228 227 - “of total presence (455) : 50% S10 - 0 1 2 520 201 254 -”of total presence (455) : 56%
SSTC S4 - 0 1 2 456 218 301 - “of total presence (519) : 58% S14 - 0 1 2 456 200 319 -”of total presence (519) : 61% S15 - 0 1 2 456 184 335 - "of total presence (519) : 65%
ISSG S3 - 0 1 2 418 188 369 - “of total presence (557) : 66% S2 - 0 1 2 418 295 262 -”of total presence (557) : 47% S18 - 0 1 2 418 254 303 - "of total presence (557) : 54%
SO S37 - 0 1 2 1923 439 139 - “of total presence (578) : 24% S11 - 0 1 2 1923 342 236 -”of total presence (578) : 41% SE - 0 1 2 1923 216 362 - “of total presence (578) : 63% S7 - 0 1 2 1923 275 303 -”of total presence (578) : 52% S9 - 0 1 2 1923 229 349 - "of total presence (578) : 60%
SSTC S4 - 0 1 2 1917 263 321 - “of total presence (584) : 55% S15 - 0 1 2 1917 145 439 -”of total presence (584) : 75%
ISSG S3 - 0 1 2 586 852 1063 - “of total presence (1915) : 56% S2 - 0 1 2 586 1031 884 -”of total presence (1915) : 46% S16 - 0 1 2 586 1325 590 - “of total presence (1915) : 31% S18 - 0 1 2 586 1002 913 -”of total presence (1915) : 48%
when differences between provinces are significant it is valid to perform a SIMilarity PERcentage (SIMPER) analysis to see which entities contribute most to the observed dissimilarity (Clarke 1993)
[1] SIMPER analysis 16Sotu | contribution | ASV | kingdom | phylum | class | genus | family | species |
---|---|---|---|---|---|---|---|---|
>ASV_8 | -1.254761e+12 | >ASV_8 | Bacteria | Proteobacteria | Alphaproteobacteria | SAR11 clade | Clade II | |
>ASV_149 | -1.608580e+12 | >ASV_149 | Bacteria | Proteobacteria | Alphaproteobacteria | Sphingomonadales | Sphingomonadaceae | Novosphingobium |
>ASV_71 | -1.883222e+12 | >ASV_71 | Bacteria | Proteobacteria | Alphaproteobacteria | SAR11 clade | Clade II | |
>ASV_105 | -1.946567e+12 | >ASV_105 | Bacteria | Proteobacteria | Alphaproteobacteria | Rhodobacterales | Rhodobacteraceae | uncultured |
>ASV_146 | -1.946567e+12 | >ASV_146 | Bacteria | Bacteroidetes | Bacteroidia | Flavobacteriales | Flavobacteriaceae | NS5 marine group |
>ASV_156 | -1.946567e+12 | >ASV_156 | Bacteria | Cyanobacteria | Oxyphotobacteria | Synechococcales | Cyanobiaceae | Prochlorococcus MIT9313 |
>ASV_157 | -1.946567e+12 | >ASV_157 | Bacteria | Bacteroidetes | Bacteroidia | Cytophagales | Cyclobacteriaceae | Marinoscillum |
>ASV_173 | -1.946567e+12 | >ASV_173 | Bacteria | Proteobacteria | Gammaproteobacteria | Cellvibrionales | Porticoccaceae | SAR92 clade |
>ASV_223 | -1.946567e+12 | >ASV_223 | Archaea | Euryarchaeota | Thermoplasmata | Marine Group II | ||
>ASV_233 | -1.946567e+12 | >ASV_233 | Bacteria | Proteobacteria | Gammaproteobacteria | SAR86 clade |
otu | contribution | ASV | kingdom | phylum | class | genus | family | species |
---|---|---|---|---|---|---|---|---|
>ASV_81 | 6.951505e+12 | >ASV_81 | Bacteria | Proteobacteria | Gammaproteobacteria | Betaproteobacteriales | Methylophilaceae | OM43 clade |
>ASV_137 | 6.654805e+12 | >ASV_137 | Bacteria | Proteobacteria | Gammaproteobacteria | Thiotrichales | Thiotrichaceae | uncultured |
>ASV_176 | 6.613872e+12 | >ASV_176 | Bacteria | Proteobacteria | Alphaproteobacteria | Puniceispirillales | SAR116 clade | |
>ASV_218 | 6.577739e+12 | >ASV_218 | Bacteria | Proteobacteria | Alphaproteobacteria | Puniceispirillales | SAR116 clade | |
>ASV_88 | 6.446653e+12 | >ASV_88 | Bacteria | Bacteroidetes | Bacteroidia | Flavobacteriales | NS7 marine group | |
>ASV_230 | 6.218666e+12 | >ASV_230 | Bacteria | Proteobacteria | Alphaproteobacteria | Puniceispirillales | SAR116 clade | |
>ASV_257 | 6.204857e+12 | >ASV_257 | Bacteria | Bacteroidetes | Bacteroidia | Flavobacteriales | Flavobacteriaceae | NS5 marine group |
>ASV_322 | 6.183627e+12 | >ASV_322 | Bacteria | Proteobacteria | Alphaproteobacteria | Rhodobacterales | Rhodobacteraceae | Rhodovulum |
>ASV_186 | 6.166903e+12 | >ASV_186 | Bacteria | Proteobacteria | Gammaproteobacteria | Alteromonadales | Alteromonadaceae | Alteromonas |
>ASV_141 | 6.127317e+12 | >ASV_141 | Bacteria | Proteobacteria | Alphaproteobacteria | SAR11 clade | Clade IV |
otu | contribution | ASV | kingdom | phylum | class | genus | family | species |
---|---|---|---|---|---|---|---|---|
>ASV_304 | 1.038439e+13 | >ASV_304 | Bacteria | Bacteroidetes | Bacteroidia | Flavobacteriales | Flavobacteriaceae | NS2b marine group |
>ASV_348 | 1.001642e+13 | >ASV_348 | Bacteria | Bacteroidetes | Bacteroidia | Flavobacteriales | Flavobacteriaceae | NS4 marine group |
>ASV_390 | 9.719405e+12 | >ASV_390 | Bacteria | Bacteroidetes | Bacteroidia | Flavobacteriales | Cryomorphaceae | uncultured |
>ASV_421 | 9.449232e+12 | >ASV_421 | Bacteria | Proteobacteria | Gammaproteobacteria | SAR86 clade | ||
>ASV_426 | 9.366773e+12 | >ASV_426 | Bacteria | Proteobacteria | Gammaproteobacteria | Betaproteobacteriales | Methylophilaceae | OM43 clade |
>ASV_550 | 8.424683e+12 | >ASV_550 | Bacteria | Cyanobacteria | Oxyphotobacteria | Synechococcales | Cyanobiaceae | Prochlorococcus MIT9313 |
>ASV_475 | 7.989456e+12 | >ASV_475 | Bacteria | Bacteroidetes | Bacteroidia | Flavobacteriales | Flavobacteriaceae | NS5 marine group |
>ASV_618 | 7.989456e+12 | >ASV_618 | Bacteria | Proteobacteria | Gammaproteobacteria | SAR86 clade | ||
>ASV_645 | 7.767083e+12 | >ASV_645 | Bacteria | Proteobacteria | Gammaproteobacteria | Cellvibrionales | Halieaceae | OM60(NOR5) clade |
>ASV_663 | 7.645234e+12 | >ASV_663 | Bacteria | Proteobacteria | Deltaproteobacteria | PB19 |
otu | contribution | ASV | Confidence | Taxon | clade | kingdom | phylum | genus | class | family | species |
---|---|---|---|---|---|---|---|---|---|---|---|
ASV_48 | 8.868194e+12 | ASV_48 | 1.0000000 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Gonyaulacales | Gonyaulacaceae | Gonyaulax | |
ASV_31 | 8.529136e+12 | ASV_31 | 0.8023807 | Eukaryota | Stramenopiles | Ochrophyta | Bacillariophyta | Bacillariophyta_X | Raphid-pennate | Fragilariopsis | |
ASV_53 | 8.519608e+12 | ASV_53 | 0.9999952 | Eukaryota | Alveolata | Dinoflagellata | Syndiniales | Dino-Group-I | Dino-Group-I-Clade-1 | Dino-Group-I-Clade-1_X | Dino-Group-I-Clade-1_X_sp. |
ASV_11 | 8.314783e+12 | ASV_11 | 0.9999997 | Eukaryota | Alveolata | Dinoflagellata | Syndiniales | Dino-Group-I | Dino-Group-I-Clade-1 | Dino-Group-I-Clade-1_X | Dino-Group-I-Clade-1_X_sp. |
ASV_76 | 8.230866e+12 | ASV_76 | 0.9417963 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Dinophyceae_X | Dinophyceae_XX | Dinophyceae_XXX | Dinophyceae_XXX_sp. |
ASV_50 | 8.223533e+12 | ASV_50 | 0.9928710 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Dinophyceae_X | Dinophyceae_XX | Dinophyceae_XXX | Dinophyceae_XXX_sp. |
ASV_22 | 8.192052e+12 | ASV_22 | 0.9361058 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Dinophyceae_X | Dinophyceae_XX | Dinophyceae_XXX | Dinophyceae_XXX_sp. |
ASV_94 | 8.182193e+12 | ASV_94 | 0.9563041 | Eukaryota | Archaeplastida | Chlorophyta | Chloropicophyceae | Chloropicales | Chloropicaceae | Chloroparvula | Chloroparvula_pacifica |
ASV_51 | 7.943274e+12 | ASV_51 | 0.9992372 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Dinophyceae_X | Dinophyceae_XX | Dinophyceae_XXX | Dinophyceae_XXX_sp. |
ASV_170 | 7.862158e+12 | ASV_170 | 0.9707207 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Dinophyceae_X | Dinophyceae_XX | Dinophyceae_XXX | Dinophyceae_XXX_sp. |
otu | contribution | ASV | Confidence | Taxon | clade | kingdom | phylum | genus | class | family | species |
---|---|---|---|---|---|---|---|---|---|---|---|
ASV_33 | 3.672376e+13 | ASV_33 | 1.0000000 | Eukaryota | Stramenopiles | Ochrophyta | Bacillariophyta | Bacillariophyta_X | Polar-centric-Mediophyceae | Chaetoceros | |
ASV_10 | 3.621182e+13 | ASV_10 | 0.9994667 | Eukaryota | Hacrobia | Haptophyta | Prymnesiophyceae | Phaeocystales | Phaeocystaceae | Phaeocystis | Phaeocystis_rex |
ASV_25 | 3.620973e+13 | ASV_25 | 0.9997619 | Eukaryota | Stramenopiles | Ochrophyta | Bacillariophyta | Bacillariophyta_X | Radial-centric-basal-Coscinodiscophyceae | Actinocyclus | Actinocyclus_curvatulus |
ASV_134 | 3.465054e+13 | ASV_134 | 0.9999991 | Eukaryota | Alveolata | Dinoflagellata | Syndiniales | Dino-Group-I | Dino-Group-I-Clade-1 | Dino-Group-I-Clade-1_X | Dino-Group-I-Clade-1_X_sp. |
ASV_78 | 3.413063e+13 | ASV_78 | 1.0000000 | Eukaryota | |||||||
ASV_31 | 3.354308e+13 | ASV_31 | 0.8023807 | Eukaryota | Stramenopiles | Ochrophyta | Bacillariophyta | Bacillariophyta_X | Raphid-pennate | Fragilariopsis | |
ASV_26 | 3.297447e+13 | ASV_26 | 0.9994394 | Eukaryota | Hacrobia | Haptophyta | Prymnesiophyceae | Phaeocystales | Phaeocystaceae | Phaeocystis | Phaeocystis_rex |
ASV_217 | 3.289846e+13 | ASV_217 | 0.9317330 | Eukaryota | Hacrobia | Haptophyta | Prymnesiophyceae | Prymnesiales | Chrysochromulinaceae | Chrysochromulina | |
ASV_206 | 3.284959e+13 | ASV_206 | 0.8823163 | Eukaryota | Hacrobia | Picozoa | Picozoa_X | Picozoa_XX | Picozoa_XXX | Picomonas | Picomonas_judraskeda |
ASV_18 | 3.283145e+13 | ASV_18 | 0.9999653 | Eukaryota | Stramenopiles | Ochrophyta | Bacillariophyta | Bacillariophyta_X | Raphid-pennate |
otu | contribution | ASV | Confidence | Taxon | clade | kingdom | phylum | genus | class | family | species |
---|---|---|---|---|---|---|---|---|---|---|---|
ASV_23 | 1.127798e+13 | ASV_23 | 0.9891629 | Eukaryota | Archaeplastida | Chlorophyta | Chloropicophyceae | Chloropicales | Chloropicaceae | Chloroparvula | Chloroparvula_pacifica |
ASV_94 | 1.003371e+13 | ASV_94 | 0.9563041 | Eukaryota | Archaeplastida | Chlorophyta | Chloropicophyceae | Chloropicales | Chloropicaceae | Chloroparvula | Chloroparvula_pacifica |
ASV_170 | 9.797047e+12 | ASV_170 | 0.9707207 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Dinophyceae_X | Dinophyceae_XX | Dinophyceae_XXX | Dinophyceae_XXX_sp. |
ASV_176 | 9.153583e+12 | ASV_176 | 1.0000000 | Eukaryota | |||||||
ASV_162 | 9.111550e+12 | ASV_162 | 0.9542062 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Dinophyceae_X | Dinophyceae_XX | Dinophyceae_XXX | Dinophyceae_XXX_sp. |
ASV_2 | 8.983205e+12 | ASV_2 | 0.9975556 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Dinophyceae_X | Dinophyceae_XX | Dinophyceae_XXX | Dinophyceae_XXX_sp. |
ASV_66 | 8.840453e+12 | ASV_66 | 0.7037306 | Eukaryota | Stramenopiles | Ochrophyta | Bacillariophyta | Bacillariophyta_X | Polar-centric-Mediophyceae | ||
ASV_18 | 8.792462e+12 | ASV_18 | 0.9999653 | Eukaryota | Stramenopiles | Ochrophyta | Bacillariophyta | Bacillariophyta_X | Raphid-pennate | ||
ASV_5 | 8.757161e+12 | ASV_5 | 0.9999997 | Eukaryota | Archaeplastida | Chlorophyta | Palmophyllophyceae | Prasinococcales | Prasinococcales-Clade-B | Prasinoderma | Prasinoderma_singularis |
ASV_375 | 8.753520e+12 | ASV_375 | 0.9423032 | Eukaryota | Alveolata | Dinoflagellata | Dinophyceae | Gonyaulacales | Pyrophacaceae | Fragilidium |
##
## Welch Two Sample t-test
##
## data: Bacteroidetes_SO$Abundance and Bacteroidetes_ISSG$Abundance
## t = 4.5828, df = 396.05, p-value = 6.157e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.3565637 0.8923132
## sample estimates:
## mean of x mean of y
## 3.285904 2.661465
##
##
## Welch Two Sample t-test
##
## data: Cyano_SO$Abundance and Cyano_ISSG$Abundance
## t = -4.7419, df = 42.62, p-value = 2.385e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.953793 -1.190715
## sample estimates:
## mean of x mean of y
## 1.660626 3.732879
##
##
## Welch Two Sample t-test
##
## data: Cyano_SO$Abundance and Cyano_SSTC$Abundance
## t = -3.8645, df = 42.821, p-value = 0.0003723
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.7558208 -0.8656909
## sample estimates:
## mean of x mean of y
## 1.660626 3.471382