Paper
https://www.sciencedirect.com/science/article/pii/S0092867421008916#da0010
Ancient and modern genomes unravel the evolutionary history of the rhinoceros family
[Image: rhinoceros]
Local copy of the paper: 1-s2.0-S0092867421008916-main.pdf
Data and code download link
https://github.com/liushanlin/rhinoceros-comparative-genome
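In today's post we reproduce Figure 5 from the paper.
[Image]
The dataset is Table S4 (read from mmc4.xlsx in the code below); part of the data looks like this:
[Image]
Load the required R packages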
library(readxl)
library(tidyverse)
library(ggplot2)
library(ggrepel)
Reshape the data into the format needed for plotting
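df <- read_excel("mmc4.xlsx", skip = 1) %>%   # skip the first row of the sheet
  # keep columns 2, 5, 9, 10 and 11 by position
  select(2, 5, 9, 10, 11) %>%
  rename(V2 = `Common name`,
         V1 = `conservation status`,
         V3 = `# Missense`,
         V4 = `# LoF mutation`,
         V7 = `# Silent`) %>%
  # per-row ratios relative to the number of silent mutations
  mutate(V5 = V3 / V7,
         V6 = V4 / V7) %>%
  select(V1, V2, V5, V6) %>%
  # average the ratios within each conservation status / common name pair
  group_by(V1, V2) %>%
  summarise(V5 = mean(V5),
            V6 = mean(V6),
            .groups = "drop") %>%
  # A = Least concern, Near threatened, Data deficient (as in the legend labels);
  # B = all other categories. tolower() makes the match case-insensitive.
  mutate(V3 = case_when(
    tolower(V1) %in% c("least concern", "near threatened", "data deficient") ~ "A",
    TRUE ~ "B"
  ))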
head(df)
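To see what the reshaping step computes, here is a minimal sketch on a made-up table. The column names (status, species, missense, lof, silent), the species labels, and all counts are hypothetical stand-ins, not values from Table S4. The set of lower-risk categories is taken from the legend labels used in the plot.

```r
library(dplyr)

# Hypothetical toy table standing in for the selected columns of Table S4
toy <- tibble(
  status   = c("Least Concern", "Least concern", "Endangered", "Endangered"),
  species  = c("sp_a", "sp_a", "sp_b", "sp_b"),
  missense = c(60, 40, 90, 70),
  lof      = c(2, 2, 6, 4),
  silent   = c(100, 100, 100, 100)
)

# Assumed lower-risk categories, written in lower case
low_risk <- c("least concern", "near threatened", "data deficient")

toy_summary <- toy %>%
  mutate(
    status         = tolower(status),     # harmonise capitalisation before grouping
    missense_ratio = missense / silent,   # missense relative to silent
    lof_ratio      = lof / silent         # loss-of-function relative to silent
  ) %>%
  group_by(status, species) %>%
  summarise(
    missense_ratio = mean(missense_ratio),
    lof_ratio      = mean(lof_ratio),
    .groups = "drop"
  ) %>%
  mutate(risk_group = if_else(status %in% low_risk, "A", "B"))

print(toy_summary)
```

Each species ends up as one row with two averaged ratios and a group label, which is the shape the plotting code expects.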
Plotting code
pdf(file = "output.pdf",
    width = 10,
    height = 8,
    family = "serif")
plota <- ggplot(data = df, aes(x = V5, y = V6)) +
  geom_point(aes(color = V3, shape = V3), size = 4) +
  geom_text_repel(aes(label = V2), size = 4) +
  scale_x_continuous(name = "mean rate of Missense / Silent") +
  scale_y_continuous(name = "mean value of LoF mutation rate") +
  # linear fit without a confidence band
  geom_smooth(method = "lm",
              formula = y ~ x,
              color = "black",
              linewidth = 1,   # ggplot2 >= 3.4.0; use size = 1 on older versions
              se = FALSE) +
  # identical labels on both scales so the shape and colour legends merge
  scale_shape_manual(values = c(15, 19),
                     labels = c("Least concern, Data deficient, Near threatened",
                                "Vulnerable, Endangered, Critically endangered")) +
  scale_color_manual(values = c("#999999", "#D55E00"),
                     labels = c("Least concern, Data deficient, Near threatened",
                                "Vulnerable, Endangered, Critically endangered")) +
  # R^2 and P value are typed in by hand here
  annotate("text", x = 0.6, y = 0.03,
           label = "atop(italic(R) ^ 2 == 0.61, 'P value = 9.43e-5')",
           parse = TRUE, size = 6) +
  theme(panel.background = element_blank(),
        panel.grid = element_blank(),
        axis.line = element_line(),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 12),
        # legend inside the panel; ggplot2 >= 3.5.0 prefers legend.position.inside
        legend.position = c(0.3, 0.9),
        legend.title = element_blank())
print(plota)
dev.off()
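The R² and P value in the annotation are typed in by hand. One way such numbers can be obtained is from a linear model fitted with lm(). The sketch below uses made-up data in place of df, so its numbers will not match the figure. Treating the reported P value as the test on the slope is also an assumption.

```r
# Hypothetical data standing in for the per-species summary (columns V5 and V6)
demo <- data.frame(V5 = seq(0.4, 0.9, length.out = 20))
demo$V6 <- 0.05 * demo$V5 + 0.005 * sin(1:20)   # linear trend plus deterministic noise

# Same model that geom_smooth(method = "lm", formula = y ~ x) fits
fit <- lm(V6 ~ V5, data = demo)
fit_summary <- summary(fit)

r_squared <- fit_summary$r.squared
# P value of the slope; with a single predictor this equals the overall F-test P value
p_value <- fit_summary$coefficients["V5", "Pr(>|t|)"]

# Build a plotmath string in the same form as the hand-written label
label <- sprintf("atop(italic(R)^2 == %.2f, 'P value = %s')",
                 r_squared,
                 format(p_value, digits = 3, scientific = TRUE))
print(label)
```

The resulting string can be passed as label = label to annotate("text", ..., parse = TRUE), so the annotation stays in sync with the data.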
Final result
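[Image]
Feel free to follow my WeChat public account:
小明的数据分析笔记本
The 小明的数据分析笔记本 public account mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics, and population genetics literature for horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.
To get the example data and code, leave a comment and add me on WeChat.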