Over the holiday break, let's review some basics.
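An Introduction to the corrplot Package
Introduction
The corrplot package provides a graphical display of a correlation matrix and its confidence intervals. It also contains several algorithms for reordering the matrix. In addition, corrplot is good at details, including the choice of colors, text labels, color labels, layout, and so on.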
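Visualization methods
The corrplot package offers seven visualization methods (parameter method), named "circle", "square", "ellipse", "number", "shade", "color", and "pie".
Positive correlations are displayed in blue and negative correlations in red. The color intensity and the size of the circle are proportional to the correlation coefficient.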
library(corrplot)
## corrplot 0.84 loaded
M <- cor(mtcars)
corrplot(M, method = "circle")
image
corrplot(M, method = "square")
image
corrplot(M, method = "ellipse")
image
corrplot(M, method = "number") # Display the correlation coefficient
image
corrplot(M, method = "shade")
image
corrplot(M, method = "color")
image
corrplot(M, method = "pie")
image
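Layout
There are three layout types (parameter type):
- "full" (default): display the full correlation matrix
- "upper": display the upper triangle of the correlation matrix
- "lower": display the lower triangle of the correlation matrix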
corrplot(M, type = "upper")
image
corrplot.mixed() is a wrapper function for mixed visualization styles.
corrplot.mixed(M)
image
corrplot.mixed(M, lower.col = "black", number.cex = .7)
image
corrplot.mixed(M, lower = "ellipse", upper = "circle")
image
corrplot.mixed(M, lower = "square", upper = "circle", tl.col = "black")
image
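Reordering the correlation matrix
The correlation matrix can be reordered according to the correlation coefficients. This is important for identifying hidden structure and patterns in the matrix. corrplot provides four methods (parameter order), named "AOE", "FPC", "hclust", and "alphabet". More algorithms can be found in the seriation package.
You can also reorder the matrix "manually" with the function corrMatOrder().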
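- "AOE": the angular order of the eigenvectors. It is calculated from the order of the angles $a_i$,
  $$a_i = \begin{cases} \arctan(e_{i2}/e_{i1}), & \text{if } e_{i1} > 0; \\ \arctan(e_{i2}/e_{i1}) + \pi, & \text{otherwise,} \end{cases}$$
  where $e_1$ and $e_2$ are the eigenvectors belonging to the two largest eigenvalues of the correlation matrix, and $e_{i1}$, $e_{i2}$ are their $i$-th components.
- "FPC": the first principal component order.
- "hclust": the hierarchical clustering order, with "hclust.method" giving the agglomeration method to use. "hclust.method" should be one of "ward", "single", "complete", "average", "mcquitty", "median", or "centroid".
- "alphabet": alphabetical order.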
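The "AOE" angle formula can be traced by hand in base R. The following is a minimal sketch, assuming the angle is taken with the arctangent (atan()) of the ratio of the first two eigenvector components; the variable names (ev, angle, aoe_order) are illustrative and not part of corrplot.

```r
# Sketch: angular order of the eigenvectors (AOE), computed by hand in base R.
M <- cor(mtcars)

# First two eigenvectors of the correlation matrix
# (eigen() sorts eigenvalues in decreasing order).
ev <- eigen(M)$vectors[, 1:2]
e1 <- ev[, 1]
e2 <- ev[, 2]

# Angle of each variable in the plane spanned by the first two eigenvectors.
angle <- ifelse(e1 > 0, atan(e2 / e1), atan(e2 / e1) + pi)

# Sorting the variables by angle gives the AOE ordering.
aoe_order <- order(angle)
colnames(M)[aoe_order]
```

Variables that point in similar directions in this plane end up next to each other, which is why blocks of similar correlations tend to appear after reordering.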
corrplot(M, order = "AOE")
image
corrplot(M, order = "hclust")
image
corrplot(M, order = "FPC")
image
corrplot(M, order = "alphabet")
image
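Manual reordering with corrMatOrder() might look like the sketch below. It assumes the function returns a permutation of the row and column indices, which is then applied to the matrix before plotting; the names ord and M_aoe are illustrative.

```r
library(corrplot)

M <- cor(mtcars)

# corrMatOrder() returns the new order as a vector of indices.
ord <- corrMatOrder(M, order = "AOE")

# Apply the same permutation to rows and columns so the matrix stays symmetric.
M_aoe <- M[ord, ord]

# Plot the already reordered matrix; no order argument is needed here.
corrplot(M_aoe)
```

Doing the reordering yourself is useful when the same variable order has to be reused for several plots or for a companion table.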
If you use "hclust", corrplot() can draw rectangles around the correlation matrix plot based on the results of the hierarchical clustering.
corrplot(M, order = "hclust", addrect = 2)
image
corrplot(M, order = "hclust", addrect = 3)
image
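To see which variables fall into each rectangle, the clustering can be reproduced roughly in base R. This is a sketch under the assumption that the dissimilarity between two variables is 1 - r and that the tree is cut into k groups with cutree(); it is meant as an illustration, not as a description of corrplot's internals.

```r
M <- cor(mtcars)

# Assumption: use 1 - r as the dissimilarity between two variables.
d <- as.dist(1 - M)

# Agglomerative clustering; "complete" is the default linkage of hclust().
tree <- hclust(d, method = "complete")

# Cut the tree into k groups, the role played by addrect = 2 above.
groups <- cutree(tree, k = 2)

# List the variables in each group.
split(names(groups), groups)
```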
# Change background color to lightblue
corrplot(M, type = "upper", order = "hclust",
col = c("black", "white"), bg = "lightblue")
image
Using different color spectra
col1 <- colorRampPalette(c("#7F0000", "red", "#FF7F00", "yellow", "white",
"cyan", "#007FFF", "blue", "#00007F"))
col2 <- colorRampPalette(c("#67001F", "#B2182B", "#D6604D", "#F4A582",
"#FDDBC7", "#FFFFFF", "#D1E5F0", "#92C5DE",
"#4393C3", "#2166AC", "#053061"))
col3 <- colorRampPalette(c("red", "white", "blue"))
col4 <- colorRampPalette(c("#7F0000", "red", "#FF7F00", "yellow", "#7FFF7F",
"cyan", "#007FFF", "blue", "#00007F"))
whiteblack <- c("white", "black")
## using these color spectra
corrplot(M, order = "hclust", addrect = 2, col = col1(100))
image
corrplot(M, order = "hclust", addrect = 2, col = col2(50))
image
corrplot(M, order = "hclust", addrect = 2, col = col3(20))
image
corrplot(M, order = "hclust", addrect = 2, col = col4(10))
image
corrplot(M, order = "hclust", addrect = 2, col = whiteblack, bg = "gold2")
image
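The calls such as col3(20) work because colorRampPalette() returns a function rather than a vector of colors. A small base R sketch of that behavior (pal20 is an illustrative name):

```r
# colorRampPalette() returns a function, not a vector of colors.
col3 <- colorRampPalette(c("red", "white", "blue"))

# Calling that function with n yields n interpolated colors.
pal20 <- col3(20)
length(pal20)

# With n = 3 the interpolation lands exactly on the three anchor colors.
col3(3)
```

A larger n gives a smoother gradient in the plot, while a small n (as in col4(10)) produces visibly discrete color steps.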
You can also use the standard color palettes (package grDevices).
corrplot(M, order = "hclust", addrect = 2, col = heat.colors(100))
image
corrplot(M, order = "hclust", addrect = 2, col = terrain.colors(100))
image
corrplot(M, order = "hclust", addrect = 2, col = cm.colors(100))
image
corrplot(M, order = "hclust", addrect = 2, col = gray.colors(100))
image
Another option is to use the RColorBrewer package.
library(RColorBrewer)
corrplot(M, type = "upper", order = "hclust",
col = brewer.pal(n = 8, name = "RdBu"))
image
corrplot(M, type = "upper", order = "hclust",
col = brewer.pal(n = 8, name = "RdYlBu"))
image
corrplot(M, type = "upper", order = "hclust",
col = brewer.pal(n = 8, name = "PuOr"))
image
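Changing the color and rotation of text labels and legends
The cl.* parameters control the color legend, and the tl.* parameters control the text legend. For the text labels, tl.col (text label color) and tl.srt (text label string rotation) change the text color and rotation.
Here are some examples.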
## remove color legend and text legend
corrplot(M, order = "AOE", cl.pos = "n", tl.pos = "n")
image
## bottom color legend, diagonal text legend, rotate text label
corrplot(M, order = "AOE", cl.pos = "b", tl.pos = "d", tl.srt = 60)
image
## a wider color legend with numbers right aligned
corrplot(M, order = "AOE", cl.ratio = 0.2, cl.align = "r")
image
## text labels rotated 45 degrees
corrplot(M, type = "lower", order = "hclust", tl.col = "black", tl.srt = 45)
image
Dealing with a non-correlation matrix
corrplot(abs(M), order = "AOE", col = col3(200), cl.lim = c(0, 1))
image
## visualize a matrix in [-100, 100]
ran <- round(matrix(runif(225, -100,100), 15))
corrplot(ran, is.corr = FALSE, method = "square")
image
## a beautiful color legend
corrplot(ran, is.corr = FALSE, method = "ellipse", cl.lim = c(-100, 100))
image
If your matrix is rectangular, you can adjust the aspect ratio with the win.asp parameter so that the matrix is rendered as a square.
ran <- matrix(rnorm(70), ncol = 7)
corrplot(ran, is.corr = FALSE, win.asp = .7, method = "circle")
image
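Dealing with missing (NA) values
By default, corrplot renders NA values as "?" characters. With the na.label parameter you can use a different value (at most two characters are supported).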
M2 <- M
diag(M2) <- NA
corrplot(M2)
image
corrplot(M2, na.label = "o")
image
corrplot(M2, na.label = "NA")
image
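Using "plotmath" expressions in labels
Since version 0.78, it is possible to use plotmath expressions in variable names. To activate plotmath rendering, prefix your label with one of the characters ":", "=", or "$".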
M2 <- M[1:5,1:5]
colnames(M2) <- c("alpha", "beta", ":alpha+beta", ":a[0]", "=a[beta]")
rownames(M2) <- c("alpha", "beta", NA, "$a[0]", "$ a[beta]")
corrplot(M2)
image
Combining the correlogram with significance tests
res1 <- cor.mtest(mtcars, conf.level = .95)
res2 <- cor.mtest(mtcars, conf.level = .99)
## mark the insignificant coefficients according to the significance level
corrplot(M, p.mat = res1$p, sig.level = .2)
image
corrplot(M, p.mat = res1$p, sig.level = .05)
image
corrplot(M, p.mat = res1$p, sig.level = .01)
image
## leave insignificant coefficients blank
corrplot(M, p.mat = res1$p, insig = "blank")
image
## add p-values on insignificant coefficients
corrplot(M, p.mat = res1$p, insig = "p-value")
image
## add all p-values
corrplot(M, p.mat = res1$p, insig = "p-value", sig.level = -1)
image
## add a cross on insignificant coefficients
corrplot(M, p.mat = res1$p, order = "hclust", insig = "pch", addrect = 3)
image
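The p.mat argument used above expects a matrix of p-values with the same shape as the correlation matrix. As a rough illustration of what such a matrix contains, it can be built with base R's cor.test(). The helper name pairwise_p is hypothetical, and setting the diagonal to 0 is an assumption chosen so that the diagonal never counts as insignificant.

```r
# Sketch: pairwise p-value matrix built with base R's cor.test().
# pairwise_p is a hypothetical helper, not a corrplot function.
pairwise_p <- function(df) {
  n <- ncol(df)
  # Assumption: the diagonal is set to 0.
  p <- matrix(0, n, n, dimnames = list(names(df), names(df)))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # Pearson test of H0: the true correlation equals 0.
      pv <- cor.test(df[[i]], df[[j]])$p.value
      p[i, j] <- p[j, i] <- pv
    }
  }
  p
}

p_mat <- pairwise_p(mtcars)

# Pairs whose p-value exceeds 0.05 would be flagged as insignificant.
insignificant <- which(p_mat > 0.05 & upper.tri(p_mat), arr.ind = TRUE)
nrow(insignificant)
```

Raising or lowering the threshold in the last step corresponds to changing sig.level in the corrplot() calls.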
Visualizing confidence intervals
corrplot(M, low = res1$lowCI, upp = res1$uppCI, order = "hclust",
rect.col = "navy", plotC = "rect", cl.pos = "n")
image
corrplot(M, p.mat = res1$p, low = res1$lowCI, upp = res1$uppCI,
order = "hclust", pch.col = "red", sig.level = 0.01,
addrect = 3, rect.col = "navy", plotC = "rect", cl.pos = "n")
image
res1 <- cor.mtest(mtcars, conf.level = .95)
corrplot(M, p.mat = res1$p, insig = "label_sig",
sig.level = c(.001, .01, .05), pch.cex = .9, pch.col = "white")
image
corrplot(M, p.mat = res1$p, method = "color",
insig = "label_sig", pch.col = "white")
image
corrplot(M, p.mat = res1$p, method = "color", type = "upper",
sig.level = c(.001, .01, .05), pch.cex = .9,
insig = "label_sig", pch.col = "white", order = "AOE")
image
corrplot(M, p.mat = res1$p, insig = "label_sig", pch.col = "white",
pch = "p<.05", pch.cex = .5, order = "AOE")
image
Customizing the correlogram
# matrix of the p-value of the correlation
p.mat <- cor.mtest(mtcars)$p
head(p.mat[, 1:5])
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.000000e+00 6.112687e-10 9.380327e-10 1.787835e-07 1.776240e-05
## [2,] 6.112687e-10 0.000000e+00 1.802838e-12 3.477861e-09 8.244636e-06
## [3,] 9.380327e-10 1.802838e-12 0.000000e+00 7.142679e-08 5.282022e-06
## [4,] 1.787835e-07 3.477861e-09 7.142679e-08 0.000000e+00 9.988772e-03
## [5,] 1.776240e-05 8.244636e-06 5.282022e-06 9.988772e-03 0.000000e+00
## [6,] 1.293959e-10 1.217567e-07 1.222320e-11 4.145827e-05 4.784260e-06
# Mark the insignificant coefficients according to the significance level
corrplot(M, type = "upper", order = "hclust",
p.mat = p.mat, sig.level = 0.01)
image
# Leave insignificant coefficients blank
corrplot(M, type = "upper", order = "hclust",
p.mat = p.mat, sig.level = 0.01, insig = "blank")
image
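In the figures above, correlations with a p-value > 0.01 are considered insignificant. In that case the correlation coefficient is either left blank or marked with a cross.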
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(M, method = "color", col = col(200),
type = "upper", order = "hclust", number.cex = .7,
addCoef.col = "black", # Add coefficient of correlation
tl.col = "black", tl.srt = 90, # Text label color and rotation
# Combine with significance
p.mat = p.mat, sig.level = 0.01, insig = "blank",
# hide correlation coefficient on the principal diagonal
diag = FALSE)
image
Exploring large feature matrices
# generating large feature matrix (cols=features, rows=samples)
num_features <- 60 # how many features
num_samples <- 300 # how many samples
DATASET <- matrix(runif(num_features * num_samples),
nrow = num_samples, ncol = num_features)
# setting some dummy names for the features e.g. f23
colnames(DATASET) <- paste0("f", 1:ncol(DATASET))
# let's make 30% of all features to be correlated with feature "f1"
num_feat_corr <- num_features * .3
idx_correlated_features <- as.integer(seq(from = 1,
to = num_features,
length.out = num_feat_corr))[-1]
for (i in idx_correlated_features) {
DATASET[,i] <- DATASET[,1] + runif(num_samples) # adding some noise
}
corrplot(cor(DATASET), diag = FALSE, order = "FPC",
tl.pos = "td", tl.cex = 0.5, method = "color", type = "upper")
image