非度量多维尺度法(NMDS)是一种将多维空间的研究对象(样本或变量)简化到低维空间进行定位、分析和归类,同时又保留对象间原始关系的数据分析方法。
适用于无法获得研究对象间精确的相似性或相异性数据,仅能得到他们之间等级关系数据的情形。换句话说,当资料不适合直接进行变量型多维尺度分析时,对其进行变量变换,再采用变量型多维尺度分析。其特点是根据样品中包含的物种信息,以点的形式反映在多维空间上,而对不同样品间的差异程度,则是通过点与点间的距离体现的,最终获得样品的空间定位点图。
利用OTU表进行NMDS:利用OTU表做NMDS时,可以获得(样本+物种)两种得分信息,能够找到更多的潜在信息。
library(vegan)
library(ggplot2)
# data --------------------------------------------------------------------
set.seed(13)
otu <- matrix(sample(c(0:200), 1200, replace = TRUE),
ncol = 12, nrow = 100,
dimnames = list(
row_names = paste0("OTU", seq(1:100)),
col_names = paste0("sample", seq(1:12))
))
groups <- data.frame(
sample = paste0("sample", seq(1:12)),
sites = rep(c("root", "soil", "rhizosphere"), 4)
)
# hellinger-transform: square root of method = "total"
otu <- t(otu)
hell_otu <- decostand(otu, "hell")
# The number of points n should be n > 2*k + 1
# default k = 2
NMDS <- metaMDS(hell_otu, k = 4, distance = "bray")
# print NMDS
# stress 应该越小越好,通常阈值为0.2
NMDS
# Get Species or Site Scores from an Ordination
score_species <- scores(NMDS, display = "species")
score_sites <- scores(NMDS, display = "sites")
# get the stress value
stress <- round(NMDS$stress, 4)
# adds group date
# 有时groups中的sample 和score 的结果顺序不一样
# 推荐使用merge 或者 match 函数合并数据
plot_data <- cbind(as.data.frame(score_sites), groups)
# set data for spider plot ------------------------------------------------
# calculate the center of NMDS1, NMDS2, according to groups
cent <- aggregate(cbind(NMDS1, NMDS2) ~ sites,
data = plot_data, FUN = mean)
# summarise_if 有利于自动化
plot_data %>%
group_by(sites) %>%
summarise_if(is.numeric, mean)
cent <- setNames(cent, c("sites", "oNMDS1", "oNMDS2"))
spider_data <- merge(plot_data, cent, by = "sites", sort = FALSE)
# 设置坐标轴刻度label
theme <- function(){
list(scale_x_continuous(breaks = seq(from = -0.1,
to = 0.1,
by = 0.05),
labels = seq(from = -0.1,
to = 0.1,
by = 0.05)),
scale_y_continuous(breaks = seq(from = -0.1,
to = 0.1,
by = 0.05),
labels = seq(from = -0.1,
to = 0.1,
by = 0.05)))
}
# 可视化
ggplot(plot_data, aes(x = NMDS1, y = NMDS2,
color = sites)) +
geom_point() +
coord_fixed() +
stat_ellipse()
ggplot(spider_data, aes(x = NMDS1, y = NMDS2,
color = sites)) +
geom_segment(aes(xend = oNMDS1, yend = oNMDS2)) +
geom_point() +
geom_point()
NMDS结果评估
通常情况下我们可以根据应力值来判断排序模型的合理性,应力值(Stress)最好不要大于0.2。
此外,还可以通过goodness()和stressplot() {vegan}来评估NMDS拟合优度。
# Shepard图: 若R2越大,则NMDS结果越合理
stressplot(NMDS, main = "Shepard Plot")
gof <- goodness(NMDS1)
plot(NMDS, display = "sites", type = "t", main = "goodness of fit statistic")
points(NMDS, cex = gof * 200, col = "red")
网友评论