1 Content

This is a set of code and data sets to document and reproduce the computational analyses in Steffen et al., 2020. In brief, it contains three parts: the metabolome, microbiota and inter-omics analyses. For the metabolome, we include detailed descriptions of the metabolomics data acquisition, data processing with xcms, and multivariate analyses with ropls. In addition, signals of known and novel compounds of interest were manually extracted and analysed. For the microbiota, we include community visualisation and ecological analyses in vegan. The inter-omics section contains mantel test and procrustes rotations, as well as the microbial interaction network annotated with the OTU’s correlation with barettin, and their response to depth.

We hope to contribute to good science by providing reproducible documentation of the computational workflow. If you have questions, comments or suggestions please feel free to get in touch.

1.1 Experimental setup

Three demosponge species Geodia barretti (n=20), Stryphnus fortis (n=15), and Weberella bursa (n=17) (Fig 1.6) were sampled in the Davis Strait between Canada and Greenland (61.147942-66.38245 Lat,-68.78077- -57.96573 Lon) from 244 m to 1467 m depth (Fig 1.1, 1.2). Temperature and salinity in situ were recorded. Sample metadata is deposited at PANGAEA.

1.1.1 Map

library(ggplot2)
library(ggmap)
library(maps)
library(mapdata)
library(marmap)
library(ggrepel)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)

sample_coords <- read.csv("data/Steffen_et_al_metadata_PANGAEA.csv", header = T, sep = ";")
sample_coords <- sample_coords[, c("Species", "unified_ID", "Latitude", "Longitude")]
sample_coords <- na.omit(sample_coords)

world <- ne_countries(scale = "medium", returnclass = "sf")
sample_map2 <- ggplot(data = world) + geom_sf() + coord_sf(xlim = c(-95, 0), ylim = c(45, 75), expand = T) + annotate("rect", xmin = -68.78077, xmax = -57.96573, 
    ymin = 61.147942, ymax = 66.38245, alpha = 0.2) + geom_point(data = sample_coords, aes(x = sample_coords$Longitude, y = sample_coords$Latitude, col = Species)) + 
    ggtitle("Sample map") + xlab("Longitude") + ylab("Latitude") + theme_bw() + theme(legend.position = "none")
sample_map2
Geographic placement of the sample site on the Northern hemisphere, between Canada and Greenland. Dots indicate individual samples.

Figure 1.1: Geographic placement of the sample site on the Northern hemisphere, between Canada and Greenland. Dots indicate individual samples.

# Green Sf, Blue=Wb, Red=Gb
map_data <- getNOAA.bathy(-70, -50, 58, 68, resolution = 4, keep = T, antimeridian = FALSE)
sample_map_1 <- autoplot(map_data, geom = c("r", "c")) + scale_fill_gradient2(low = "dodgerblue4", mid = "gainsboro", high = "darkgreen") + geom_point(aes(x = sample_coords$Longitude, 
    y = sample_coords$Latitude, colour = factor(sample_coords$Species)), data = sample_coords) + ggtitle("Sample map") + xlab("Longitude") + ylab("Latitude") + 
    labs(fill = "Depth", col = "Species")
sample_map_1
Detailed map of the sampling site, samples represented by dots are coloured according to sponge species.

Figure 1.2: Detailed map of the sampling site, samples represented by dots are coloured according to sponge species.

# with labels: sample_map_1+geom_label_repel(aes(x=sample_coords$Longitude, y=sample_coords$Latitude, label = sample_coords$unified_ID), box.padding = 0.35,
# point.padding = 0.5, data=sample_coords)
# 3D map
library(marmap)
library(lattice)  #for wireframe

map_data_hires <- getNOAA.bathy(-70, -50, 58, 68, resolution = 1, keep = T, antimeridian = FALSE)
wireframe(unclass(map_data_hires), shade = T, aspect = c(1/2, 0.1), screen = list(z = 0, x = -50), par.settings = list(axis.line = list(col = "transparent")), 
    par.box = c(col = rgb(0, 0, 0, 0.1)))
knitr::include_graphics("data/map_screenshot.png")
3D representation of the threshold and slope in the Davis strait.

Figure 1.3: 3D representation of the threshold and slope in the Davis strait.

1.1.2 Water masses

knitr::include_graphics("data/Argo_Mean_on_Grid_Size_0dec25_2002_3-2018_6_Lambert_Lat_Range_59-65_Vector_IY20191123.png")
Currents in the North Atlantic between Greenland and Canada. Analysis and figure contributed by Igor Yashayaev.

Figure 1.4: Currents in the North Atlantic between Greenland and Canada. Analysis and figure contributed by Igor Yashayaev.

knitr::include_graphics("data/water_masses_2002-2015 T-S Profiles Selected in Topo-range 300-1900 m & Colored by Depth - Labeled.png")
Temperature and salinity (T-S) plot showing the different water masses in the Davis strait by depth: Shelf Water (ShW), Slope Water (SW), Irminger Current (IC), Labrador Sea Water (LSW), and Icelandic slope water (ISW). Analysis and figure contributed by Igor Yashayaev.

Figure 1.5: Temperature and salinity (T-S) plot showing the different water masses in the Davis strait by depth: Shelf Water (ShW), Slope Water (SW), Irminger Current (IC), Labrador Sea Water (LSW), and Icelandic slope water (ISW). Analysis and figure contributed by Igor Yashayaev.

1.1.3 Sponges

knitr::include_graphics("data/sponges.png")
__A__ _G. barretti_ and __B__ _S. fortis_ from West Shetland Channel. Image Crown Copyright ©2006, all rights reserved. Image provided by K. Howell, Plymouth University, UK. __C__ _W. bursa_, picture taken by scuba diving at 40 m in Svalbard by Peter Leopold.

Figure 1.6: A G. barretti and B S. fortis from West Shetland Channel. Image Crown Copyright ©2006, all rights reserved. Image provided by K. Howell, Plymouth University, UK. C W. bursa, picture taken by scuba diving at 40 m in Svalbard by Peter Leopold.

G. barretti and S. fortis are high microbial abundance (HMA) sponges often found living in proximity. W. bursa is likely a low microbial abundance sponges (see diversity metrics, Nicole Boury-Esnault per.comm.). While all sponges originate from the same geographic region, W. bursa typically does not occur in the same habitats as G. barretti and S. fortis (Murillo et al. 2018). It is important to note that S. fortis is whiteish in colour, and the yellow shown in the picture stems from Hexadella dedritifera frequently found overgrowing it.

1.2 Sample metadata analysis

To investigate underlying data patterns, we assessed correlation of our predictor variables, i.e. the meta data of our samples (depth, latitude, longitude, sampling year, salinity and temperature).

1.2.1 Correlations, visual inspection and variance inflation factors (VIF) across all samples

md <- read.csv("data/Steffen_et_al_metadata_PANGAEA.csv", header = T, sep = ";")
md <- md[, c("Species", "Depth", "Latitude", "Longitude", "YEAR", "MeanBotSalinity_PSU", "MeanBottomTemp_Cdeg", "unified_ID")]

a <- cor.test(md$Depth, md$Latitude, method = "spearman")
a <- paste("Rho =", round(a$estimate, digits = 3), " p-value=", round(a$p.value, digits = 3))
b <- cor.test(md$Depth, md$Longitude, method = "spearman")
b <- paste("Rho =", round(b$estimate, digits = 3), " p-value=", round(b$p.value, digits = 3))
c <- cor.test(md$Depth, md$YEAR, method = "spearman")
c <- paste("Rho =", round(c$estimate, digits = 3), " p-value=", round(c$p.value, digits = 3))
d <- cor.test(md$Depth, md$MeanBotSalinity_PSU, method = "spearman")
d <- paste("Rho =", round(d$estimate, digits = 3), " p-value=", round(d$p.value, digits = 3))
e <- cor.test(md$Depth, md$MeanBottomTemp_Cdeg, method = "spearman")
e <- paste("Rho =", round(e$estimate, digits = 3), " p-value=", round(e$p.value, digits = 3))
f <- cor.test(md$MeanBotSalinity_PSU, md$MeanBottomTemp_Cdeg, method = "spearman")
f <- paste("T-S plot: Rho =", round(f$estimate, digits = 3), " p-value=", round(f$p.value, digits = 3))

k <- ggplot(md, aes(x = Latitude, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Latitude") + ylab("Depth") + ggtitle(a) + theme(plot.title = element_text(size = 10))
l <- ggplot(md, aes(x = Longitude, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Longitude") + ylab("Depth") + ggtitle(b) + theme(plot.title = element_text(size = 10))
m <- ggplot(md, aes(x = YEAR, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Year") + ylab("Depth") + ggtitle(c) + theme(plot.title = element_text(size = 10))
n <- ggplot(md, aes(x = MeanBotSalinity_PSU, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Salinity") + ylab("Depth") + ggtitle(d) + 
    theme(plot.title = element_text(size = 10))
o <- ggplot(md, aes(x = MeanBottomTemp_Cdeg, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Temperature") + ylab("Depth") + ggtitle(e) + 
    theme(plot.title = element_text(size = 10))
p <- ggplot(md, aes(x = MeanBotSalinity_PSU, y = MeanBottomTemp_Cdeg, colour = md$Depth)) + geom_point(size = 2) + scale_colour_gradient(guide = guide_colourbar(), 
    high = "#f4c430", low = "#3b3bc4", trans = "reverse") + xlab("Salinity in psu") + ylab("Temperature in °C") + labs(colour = "Depth") + ggtitle(f) + theme(plot.title = element_text(size = 10), 
    legend.key.size = unit(0.3, "cm")) + scale_x_continuous(limits = c(34.4, 35))
# The limits of the x-axis exclude two samples from water with comparably low salinity
library(gridExtra)
grid.arrange(k, l, m, n, o, p, nrow = 3, top = "Spearman correlation of sample associated meta data")
Correlation of depth with other environmental parameters/predictor variables across all samples.

Figure 1.7: Correlation of depth with other environmental parameters/predictor variables across all samples.

rm(a, b, c, d, e, f, k, l, m, n, o, p, df)

In the bottom right plot showing temperature versus salinity, two samples from lower salinity were excluded for visual clarity. From the plots above and their correlation tests, we find that longitude, salinity and temperature are significantly correlated with depth, and that salinity and temperature also correlate with each other.

This is further corroborated in the variance inflation factors. A VIF > 10 indicates redundancy/collinearity. This means that we have to be cautious and avoid including redundant constraints in ecological models.

library(usdm)
md$Species <- NULL
md$unified_ID <- NULL
vif_all <- vif(md)

library(kableExtra)
options(kableExtra.html.bsTable = T)
kable(vif_all, longtable = T, booktabs = T, caption = "VIF for meta data from all samples", row.names = F) %>% kable_styling(bootstrap_options = c("striped", 
    "hover", "bordered", "condensed", "responsive"), full_width = F, latex_options = c("striped", "scale_down"))
Variables VIF
Depth 3.267983
Latitude 6.125058
Longitude 7.189093
YEAR 1.122117
MeanBotSalinity_PSU 16.131495
MeanBottomTemp_Cdeg 12.301386
rm(md)

As we will be focussing on Geodia barretti, we repeat the same inspection procedure for the sample subset.

1.2.2 Correlations, visual inspection and variance inflation factors (VIF) across G. barretti samples

md <- read.csv("data/Steffen_et_al_metadata_PANGAEA.csv", header = T, sep = ";")
md <- md[, c("Species", "Depth", "Latitude", "Longitude", "YEAR", "MeanBotSalinity_PSU", "MeanBottomTemp_Cdeg", "unified_ID")]
md <- md[md$Species == "Geodia barretti", ]

a <- cor.test(md$Depth, md$Latitude, method = "spearman")
a <- paste("Rho =", round(a$estimate, digits = 3), " p-value=", round(a$p.value, digits = 3))
b <- cor.test(md$Depth, md$Longitude, method = "spearman")
b <- paste("Rho =", round(b$estimate, digits = 3), " p-value=", round(b$p.value, digits = 3))
c <- cor.test(md$Depth, md$YEAR, method = "spearman")
c <- paste("Rho =", round(c$estimate, digits = 3), " p-value=", round(c$p.value, digits = 3))
d <- cor.test(md$Depth, md$MeanBotSalinity_PSU, method = "spearman")
d <- paste("Rho =", round(d$estimate, digits = 3), " p-value=", round(d$p.value, digits = 3))
e <- cor.test(md$Depth, md$MeanBottomTemp_Cdeg, method = "spearman")
e <- paste("Rho =", round(e$estimate, digits = 3), " p-value=", round(e$p.value, digits = 3))
f <- cor.test(md$MeanBotSalinity_PSU, md$MeanBottomTemp_Cdeg, method = "spearman")
f <- paste("Rho =", round(f$estimate, digits = 3), " p-value=", round(f$p.value, digits = 3))

k <- ggplot(md, aes(x = Latitude, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Latitude") + ylab("Depth") + ggtitle(a) + theme(plot.title = element_text(size = 10))
l <- ggplot(md, aes(x = Longitude, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Longitude") + ylab("Depth") + ggtitle(b) + theme(plot.title = element_text(size = 10))
m <- ggplot(md, aes(x = YEAR, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Year") + ylab("Depth") + ggtitle(c) + theme(plot.title = element_text(size = 10))
n <- ggplot(md, aes(x = MeanBotSalinity_PSU, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Salinity") + ylab("Depth") + ggtitle(d) + 
    theme(plot.title = element_text(size = 10))
o <- ggplot(md, aes(x = MeanBottomTemp_Cdeg, y = Depth)) + geom_point() + scale_y_continuous(trans = "reverse") + xlab("Temperature") + ylab("Depth") + ggtitle(e) + 
    theme(plot.title = element_text(size = 10))
p <- ggplot(md, aes(x = MeanBotSalinity_PSU, y = MeanBottomTemp_Cdeg, colour = md$Depth)) + geom_point(size = 2) + scale_colour_gradient(guide = guide_colourbar(), 
    high = "#f4c430", low = "#3b3bc4", trans = "reverse") + xlab("Salinity in psu") + ylab("Temperature in °C") + labs(colour = "Depth") + ggtitle(f) + theme(plot.title = element_text(size = 10), 
    legend.key.size = unit(0.3, "cm"))
library(gridExtra)
grid.arrange(k, l, m, n, o, p, nrow = 3, top = "Spearman correlation of Geodia barretti associated meta data")
Correlation of depth with other environmental parameters/predictor variables across samples of _G. barretti_.

Figure 1.8: Correlation of depth with other environmental parameters/predictor variables across samples of G. barretti.

rm(a, b, c, d, e, f, k, l, m, n, o, p, df)
library(usdm)
md$Species <- NULL
md$unified_ID <- NULL
vif1 <- vif(md)
vif1$VIF <- round(vif1$VIF, digits = 2)

md$MeanBotSalinity_PSU <- NULL
md$MeanBottomTemp_Cdeg <- NULL
vif2 <- vif(md)
vif2$VIF <- round(vif2$VIF, digits = 2)

library(dplyr)
vifs <- full_join(vif1, vif2, by = c(Variables = "Variables"))
colnames(vifs) <- c("Variables", "All predictors", "Subset")
vifs[c(5, 6), 3] <- ""

library(kableExtra)
options(kableExtra.html.bsTable = T)

cap <- paste("VIF for extensive and simplified models for", text_spec("G. barretti", italic = T))

kable(vifs, longtable = T, booktabs = T, caption = cap, row.names = F) %>% kable_styling(bootstrap_options = c("striped", "hover", "bordered", "condensed", "responsive"), 
    full_width = F, latex_options = c("striped", "scale_down"))
Variables All predictors Subset
Depth 22.15 5.91
Latitude 11.63 10.31
Longitude 16.59 13.38
YEAR 1.73 1.46
MeanBotSalinity_PSU 61.13
MeanBottomTemp_Cdeg 47.52
rm(md)

With this subset, the collinearity becomes even more pronounced. Salinity, temperature and depth are strongly correlated and the VIFs indicated that they should not be included in a model together. This correlation is partly due to physical properties of water. Density of water increases with increasing salinity and decreasing temperature. To a lesser extent, longitude and lattitude are collinear in our sampling. However, removing the predictor variables salinity and temperature mititgated/lowered the effect of collinearity.

References

Murillo, Fj, E Kenchington, G Tompkins, L Beazley, E Baker, A Knudby, and W Walkusz. 2018. “Sponge Assemblages and Predicted Archetypes in the Eastern Canadian Arctic.” Marine Ecology Progress Series 597 (June): 115–35. https://doi.org/10.3354/meps12589.