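This post covers Venn diagrams. The plot is simple, but it shows up in papers all the time, so let's look at how to draw the kind of Venn diagram seen in CNS-level (Cell, Nature, Science) articles.
Introduction
A Venn diagram shows the mathematical or logical relationships between groups of things (sets). It is especially suited to showing the rough relationships between sets (or classes), and it is often used to help derive, or follow the derivation of, rules about set (or class) operations. It is typically drawn for 2 to 7 sets.
We want to draw Venn diagrams of this kind, as well as figures for larger numbers of sets.
1. Package installation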
if (!require(VennDiagram)) install.packages("VennDiagram")
if (!require(venn)) install.packages("venn")
if (!require(UpSetR)) install.packages("UpSetR")
library(VennDiagram)
library(venn)
library(UpSetR)
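2. Reading the data
Two data formats are supported, described below:
- Standard Venn format: the first row holds the group names. It is required, and the names appear in the figure. Each column is one group.
- Quantitative matrix format: each row is a gene and each column is a sample. Row names and column names are both required, and the values are quantitative measurements.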
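The second format has to be turned into one set per sample before any Venn function can use it. The sketch below shows one way to do that in base R. The toy matrix, the helper name `matrix_to_sets`, and the rule "a gene belongs to a sample when its value is above the cutoff" are all assumptions made for illustration, not part of the original workflow.

```r
# Toy quantitative matrix: rows are genes, columns are samples (made-up values).
expr <- matrix(
  c(5, 0, 2,
    0, 0, 7,
    3, 1, 0,
    0, 4, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("s1", "s2", "s3"))
)

# Hypothetical helper. Assumption: a gene is a member of a sample's set
# when its value is strictly greater than `cutoff`.
matrix_to_sets <- function(m, cutoff = 0) {
  lapply(
    setNames(colnames(m), colnames(m)),
    function(s) rownames(m)[m[, s] > cutoff]
  )
}

set_list <- matrix_to_sets(expr, cutoff = 0)
str(set_list)
```

The result is a named list of character vectors, the same shape as the list built from the first format, so a different membership rule only requires changing the comparison inside the helper.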
We use the first format here:
dat <- read.table("flower.txt", header = TRUE, sep = "\t")
head(dat)
## c1 c2 c3 c4 c5 c6 c7 c8
## 1 gene1193 gene1253 gene1236 gene1325 gene1246 gene1414 gene1259 gene1249
## 2 gene1194 gene1254 gene1241 gene1327 gene1247 gene1416 gene1260 gene1250
## 3 gene1195 gene1255 gene1243 gene1328 gene1248 gene1417 gene1261 gene1251
## 4 gene1197 gene1256 gene1244 gene1329 gene1249 gene1421 gene1262 gene1253
## 5 gene1199 gene1257 gene1246 gene1330 gene1250 gene1422 gene1263 gene1256
## 6 gene1202 gene1259 gene1247 gene1331 gene1251 gene1425 gene1265 gene1258
dim(dat)
## [1] 1662 8
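# Keep the first seven columns as a list of sets (column 8 is dropped)
venn_list <- as.list(dat[, -8])
# View the intersection details and export the results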
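The comment above mentions inspecting and exporting the intersections. A minimal base-R sketch of that step follows. The toy data frame, the helper `clean_sets`, and the output file name are hypothetical. The cleaning step assumes that shorter columns, if there are any, are padded with empty strings or NA after reading, which would otherwise be counted as set members.

```r
# Toy stand-in for `dat`; shorter columns are padded with "" (made-up gene names).
toy <- data.frame(
  c1 = c("geneA", "geneB", "geneC", "geneD"),
  c2 = c("geneB", "geneC", "geneE", ""),
  c3 = c("geneC", "geneF", "", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop padding (NA or "") and duplicates from every column.
clean_sets <- function(df) {
  lapply(as.list(df), function(x) {
    x <- as.character(x)
    unique(x[!is.na(x) & nzchar(x)])
  })
}
sets <- clean_sets(toy)

# Elements shared by all sets, and elements found only in the first set.
shared_all <- Reduce(intersect, sets)
only_c1 <- setdiff(sets$c1, unlist(sets[-1]))

# Export one row per set; the file name is a placeholder.
out_file <- file.path(tempdir(), "venn_sets.txt")
write.table(
  data.frame(set = names(sets),
             size = lengths(sets),
             members = vapply(sets, paste, character(1), collapse = ",")),
  file = out_file, sep = "\t", quote = FALSE, row.names = FALSE
)
```

`Reduce(intersect, ...)` works for any number of sets, so the same two lines can be reused for a subset such as `sets[c("c1", "c3")]`.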
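3. Drawing multi-set Venn diagrams
We use two packages here, venn and VennDiagram, both well established and very good at drawing Venn diagrams. venn handles 2 to 7 sets and VennDiagram handles 2 to 5, and each has its own look. With more sets than that, a Venn diagram is no longer a good choice. The examples below are especially well suited to showing how many differentially expressed genes are shared between group comparisons in a transcriptome study. If you plan that kind of analysis, follow these examples and the figures will look good.
Note: venn uses a single function, venn, whatever the number of sets. With VennDiagram, the draw.* functions used here change with the number of sets, and you have to count the overlaps yourself. That is manageable for 2 or 3 sets, but from 4 sets on it becomes tedious and is best done with a small script that loops over the combinations. For more than 7 sets a different kind of display is needed, which the UpSetR package provides.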
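The "small script that loops over the combinations" could look like the sketch below. It counts every overlap from a list of sets in base R and names the results after the arguments of the draw.* functions (area1, area2, ..., n12, n13, ..., n123, ...). The helper name `venn_counts` and the toy sets are made up for illustration. Note that draw.pairwise.venn calls its two-set overlap `cross.area` instead of `n12`, so this naming fits the three- to five-set functions.

```r
# Hypothetical helper: size of every set and of every overlap between sets.
venn_counts <- function(sets) {
  sets <- lapply(sets, unique)          # duplicates must not be counted twice
  n <- length(sets)
  counts <- list()
  for (k in seq_len(n)) {
    # All combinations of k sets, as index vectors such as c(1, 3)
    for (idx in combn(n, k, simplify = FALSE)) {
      size <- length(Reduce(intersect, sets[idx]))
      arg <- if (k == 1) {
        paste0("area", idx)
      } else {
        paste0("n", paste(idx, collapse = ""))
      }
      counts[[arg]] <- size
    }
  }
  counts
}

# Toy sets (made up) to show the result.
toy_sets <- list(a = c("g1", "g2", "g3", "g4"),
                 b = c("g2", "g3", "g5"),
                 c = c("g3", "g4", "g6"))
toy_counts <- venn_counts(toy_sets)
str(toy_counts)
```

With VennDiagram loaded, such a list could be handed over with something like `do.call(draw.triple.venn, c(toy_counts, list(category = names(toy_sets))))`. The counts are total overlaps, which is what the draw.* functions expect, not the exclusive regions.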
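2. Two-set Venn diagram
The two-set Venn diagram is the most common and the easiest to understand and draw. We again use both packages:
A. venn {venn}
venn takes a list as input, so the data frame has to be converted to a list first, which as.list() does directly:
venn2List <- as.list(dat[, 1:2])
cross <- venn(venn2List,
  zcolor = "style", # fill colors: "style" uses the predefined palette, "bw" draws no color; custom colors also work
  opacity = 0.3,    # transparency of the fill colors
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # size of the intersection counts
  sncs = 1          # size of the set names
)
B. draw.pairwise.venn {VennDiagram}
Creates a Venn diagram with two sets. When the data meet certain conditions, an Euler diagram is drawn instead.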
cross
## c1 c2 counts
## 0 0 0
## c2 0 1 756
## c1 1 0 777
## c1:c2 1 1 885
# Set sizes and the overlap are summed from the region table returned by venn()
venn.plot <- draw.pairwise.venn(
  area1 = sum(cross$counts[cross$c1 == 1]),
  area2 = sum(cross$counts[cross$c2 == 1]),
  cross.area = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1]),
  category = colnames(dat[, 1:2]),
  fill = c("blue", "red"), lty = "blank", cex = 2, cat.cex = 2,
  cat.pos = c(285, 105), cat.dist = 0.09,
  cat.just = list(c(-1, -1), c(1, 1)),
  ext.pos = 30, ext.dist = -0.05, ext.length = 0.85,
  ext.line.lwd = 2, ext.line.lty = "dashed"
)
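3. Three-set Venn diagram
A. venn {venn}
venn3List <- as.list(dat[, 1:3])
cross <- venn(venn3List,
  zcolor = "style", # fill colors: "style" uses the predefined palette, "bw" draws no color; custom colors also work
  opacity = 0.3,    # transparency of the fill colors
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # size of the intersection counts
  sncs = 1          # size of the set names
)
B. draw.triple.venn {VennDiagram}
Creates a Venn diagram with three sets. When the data meet certain conditions, an Euler diagram is drawn instead.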
cross
## c1 c2 c3 counts
## 0 0 0 0
## c3 0 0 1 477
## c2 0 1 0 596
## c2:c3 0 1 1 160
## c1 1 0 0 581
## c1:c3 1 0 1 196
## c1:c2 1 1 0 255
## c1:c2:c3 1 1 1 630
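The table above lists exclusive regions, while draw.triple.venn expects total set sizes and total overlaps, so each argument is a sum over several rows. The sketch below works through that conversion with the counts copied from the printed table, then checks the result with the inclusion-exclusion formula. The data frame `cross3` and the helper `overlap` are names introduced only for this illustration.

```r
# Region counts copied from the three-set table printed above.
cross3 <- data.frame(
  c1 = c(0, 0, 0, 0, 1, 1, 1, 1),
  c2 = c(0, 0, 1, 1, 0, 0, 1, 1),
  c3 = c(0, 1, 0, 1, 0, 1, 0, 1),
  counts = c(0, 477, 596, 160, 581, 196, 255, 630)
)

# Hypothetical helper: add up every region that lies inside all the named sets.
overlap <- function(tab, sets) {
  inside <- rowSums(tab[, sets, drop = FALSE] == 1) == length(sets)
  sum(tab$counts[inside])
}

area1 <- overlap(cross3, "c1")               # 581 + 196 + 255 + 630
area2 <- overlap(cross3, "c2")               # 596 + 160 + 255 + 630
area3 <- overlap(cross3, "c3")               # 477 + 160 + 196 + 630
n12   <- overlap(cross3, c("c1", "c2"))      # 255 + 630
n13   <- overlap(cross3, c("c1", "c3"))      # 196 + 630
n23   <- overlap(cross3, c("c2", "c3"))      # 160 + 630
n123  <- overlap(cross3, c("c1", "c2", "c3"))

# Inclusion-exclusion: the union must equal the sum of all exclusive regions.
union_size <- area1 + area2 + area3 - n12 - n13 - n23 + n123
stopifnot(union_size == sum(cross3$counts))
```

If the final check fails for a table of your own, at least one overlap was summed over the wrong rows.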
# Overlaps are summed via the 0/1 indicator columns of the region table,
# which avoids matching row names such as "c1:c3" as substrings.
venn.plot <- draw.triple.venn(
  area1 = sum(cross$counts[cross$c1 == 1]),
  area2 = sum(cross$counts[cross$c2 == 1]),
  area3 = sum(cross$counts[cross$c3 == 1]),
  n12 = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1]),
  n23 = sum(cross$counts[cross$c2 == 1 & cross$c3 == 1]),
  n13 = sum(cross$counts[cross$c1 == 1 & cross$c3 == 1]),
  n123 = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1 & cross$c3 == 1]),
  category = colnames(dat[, 1:3]),
  fill = c("blue", "red", "green"), lty = "blank",
  cex = 2, cat.cex = 2, cat.col = c("blue", "red", "green")
)
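4. Four-set Venn diagram
A. venn {venn}
venn4List <- as.list(dat[, 1:4])
cross <- venn(venn4List,
  zcolor = "style", # fill colors: "style" uses the predefined palette, "bw" draws no color; custom colors also work
  opacity = 0.3,    # transparency of the fill colors
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # size of the intersection counts
  sncs = 1          # size of the set names
)
B. draw.quad.venn {VennDiagram}
Creates a Venn diagram with four sets.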
cross
## c1 c2 c3 c4 counts
## 0 0 0 0 0
## c4 0 0 0 1 368
## c3 0 0 1 0 388
## c3:c4 0 0 1 1 89
## c2 0 1 0 0 507
## c2:c4 0 1 0 1 89
## c2:c3 0 1 1 0 104
## c2:c3:c4 0 1 1 1 56
## c1 1 0 0 0 494
## c1:c4 1 0 0 1 87
## c1:c3 1 0 1 0 114
## c1:c3:c4 1 0 1 1 82
## c1:c2 1 1 0 0 159
## c1:c2:c4 1 1 0 1 96
## c1:c2:c3 1 1 1 0 179
## c1:c2:c3:c4 1 1 1 1 451
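Counting overlaps from this table by matching row names needs care: a pattern such as "c1:c4" only matches labels that contain it literally, so "c1:c3:c4", "c1:c2:c4" and "c1:c2:c3:c4" are all missed. The sketch below illustrates the difference with the counts copied from the printed table. The vector `regions` and the variable names are introduced only for this example.

```r
# Row names and counts copied from the four-set table printed above.
regions <- c("c4" = 368, "c3" = 388, "c3:c4" = 89, "c2" = 507, "c2:c4" = 89,
             "c2:c3" = 104, "c2:c3:c4" = 56, "c1" = 494, "c1:c4" = 87,
             "c1:c3" = 114, "c1:c3:c4" = 82, "c1:c2" = 159, "c1:c2:c4" = 96,
             "c1:c2:c3" = 179, "c1:c2:c3:c4" = 451)

# Substring matching finds only the label "c1:c4" itself.
by_substring <- sum(regions[grepl("c1:c4", names(regions), fixed = TRUE)])

# Splitting each label into set names finds every region inside both c1 and c4.
members <- strsplit(names(regions), ":", fixed = TRUE)
in_both <- vapply(members, function(m) all(c("c1", "c4") %in% m), logical(1))
by_membership <- sum(regions[in_both])

print(c(substring = by_substring, membership = by_membership))
```

The membership-based sum (87 + 82 + 96 + 451) is the value that n14 has to take. The same reasoning applies to n24, which must include the "c2:c3:c4" region.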
# Sum the counts of every region that lies inside all of the named sets.
# Using the 0/1 indicator columns avoids the pitfalls of matching row names:
# the pattern "c1:c4" does not match "c1:c3:c4", and "c2:c4" does not match "c2:c3:c4".
overlap <- function(...) {
  sets <- c(...)
  sum(cross$counts[rowSums(cross[, sets, drop = FALSE] == 1) == length(sets)])
}
venn.plot <- draw.quad.venn(
  area1 = overlap("c1"), area2 = overlap("c2"),
  area3 = overlap("c3"), area4 = overlap("c4"),
  n12 = overlap("c1", "c2"), n13 = overlap("c1", "c3"), n14 = overlap("c1", "c4"),
  n23 = overlap("c2", "c3"), n24 = overlap("c2", "c4"), n34 = overlap("c3", "c4"),
  n123 = overlap("c1", "c2", "c3"), n124 = overlap("c1", "c2", "c4"),
  n134 = overlap("c1", "c3", "c4"), n234 = overlap("c2", "c3", "c4"),
  n1234 = overlap("c1", "c2", "c3", "c4"),
  category = colnames(dat[, 1:4]),
  fill = c("orange", "red", "green", "blue"), lty = "dashed",
  cex = 2, cat.cex = 2, cat.col = c("orange", "red", "green", "blue")
)
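5. Five-set Venn diagram
A. venn {venn}
venn5List <- as.list(dat[, 1:5])
cross <- venn(venn5List,
  zcolor = "style", # fill colors: "style" uses the predefined palette, "bw" draws no color; custom colors also work
  opacity = 0.3,    # transparency of the fill colors
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # size of the intersection counts
  sncs = 1          # size of the set names
)
B. draw.quintuple.venn {VennDiagram}
Creates a Venn diagram with five sets.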
# Note: the set sizes and overlaps below are hard-coded illustrative values,
# not counts computed from dat; only the category names come from dat.
venn.plot <- draw.quintuple.venn(
  area1 = 301, area2 = 321, area3 = 311, area4 = 321, area5 = 301,
  n12 = 188, n13 = 191, n14 = 184, n15 = 177, n23 = 194,
  n24 = 197, n25 = 190, n34 = 190, n35 = 173, n45 = 186,
  n123 = 112, n124 = 108, n125 = 108, n134 = 111, n135 = 104,
  n145 = 104, n234 = 111, n235 = 107, n245 = 110, n345 = 100,
  n1234 = 61, n1235 = 60, n1245 = 59, n1345 = 58, n2345 = 57,
  n12345 = 31,
  category = colnames(dat[, 1:5]),
  fill = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
  cat.col = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
  cat.cex = 2, margin = 0.05,
  cex = c(1.5, 1.5, 1.5, 1.5, 1.5, 1, 0.8, 1, 0.8, 1, 0.8, 1, 0.8, 1, 0.8,
          1, 0.55, 1, 0.55, 1, 0.55, 1, 0.55, 1, 0.55, 1, 1, 1, 1, 1, 1.5),
  ind = TRUE
)
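6. Six-set Venn diagram
venn {venn}
venn6List <- as.list(dat[, 1:6])
venn(venn6List,
  zcolor = "style", # fill colors: "style" uses the predefined palette, "bw" draws no color; custom colors also work
  opacity = 0.3,    # transparency of the fill colors
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # size of the intersection counts
  sncs = 1          # size of the set names
)
7. Seven-set Venn diagram
venn {venn}
venn7List <- as.list(dat[, 1:7])
cross <- venn(venn7List,
  zcolor = "style", # fill colors: "style" uses the predefined palette, "bw" draws no color; custom colors also work
  opacity = 0.3,    # transparency of the fill colors
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # size of the intersection counts
  sncs = 1          # size of the set names
)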
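8. More than seven sets
With more than seven sets, drawing overlapping curved shapes is essentially no longer feasible. An alternative is the upset function from the UpSetR package, which shows the intersections in a matrix-style layout. The examples use the movies dataset that ships with UpSetR. We start with eight sets: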
require(ggplot2)
require(plyr)
require(gridExtra)
require(grid)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE,
sep = ";")
upset(movies, nsets = 8, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
"degree"), decreasing = c(TRUE, FALSE))
With nine sets:
upset(movies, nsets = 9, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
"degree"), decreasing = c(TRUE, FALSE))
With ten sets:
upset(movies, nsets = 10, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
"degree"), decreasing = c(TRUE, FALSE))
Finally, take a look at all the figures together. They turn out pretty cool!
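The upset examples above use the movies data, which is already a table of 0/1 membership columns. Gene lists like the ones read earlier would first need the same shape. A minimal base-R sketch of that conversion follows. The helper name `to_membership` and the toy gene lists are made up for illustration.

```r
# Toy gene lists (made up); in practice this would be a cleaned list of sets.
gene_sets <- list(c1 = c("g1", "g2", "g3"),
                  c2 = c("g2", "g3", "g4"),
                  c3 = c("g3", "g5"))

# Hypothetical helper: one row per element, one 0/1 column per set.
to_membership <- function(sets) {
  elements <- unique(unlist(sets, use.names = FALSE))
  m <- vapply(sets,
              function(s) as.integer(elements %in% s),
              integer(length(elements)))
  data.frame(m, row.names = elements, check.names = FALSE)
}

membership <- to_membership(gene_sets)
print(membership)
```

A data frame like this could then be passed to upset(), for example `upset(membership, nsets = ncol(membership))`. UpSetR also provides fromList() for the same kind of conversion, which may be the more convenient route.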