截取长度列
j是线程数 f 是长度所在列
seqkit fx2tab -j 8 -l -n -i -H HNHP_rep1_1h_S46_L003_R1_val_1.fq.gz| cut -f 2 >Length.txt
得到一列长度数
length
128
150
150
150
150
150
147
150
150
然后在R 中运行画
library(tidyverse)
length <- read_tsv("Length.txt") %>% group_by(length) %>%
summarise(Count = n())
length$length <- as.character(length$length)
sum <- sum(length$Count)
ggplot(length) + geom_col(aes(length, Count), width = 0.8) +
geom_line(aes(length, Count), group = 1) + geom_point(aes(length, Count)) +
scale_y_continuous(sec.axis = sec_axis(~.*100/sum, name = "% Relative Abundance")) + xlab("Length") +
theme_bw() + theme(panel.grid = element_blank(),
axis.title = element_text(size = 15))
ggsave("Length.png", height = 5, width = 8)
ggsave("Length.pdf", height = 5, width = 8)
最后得出一个分布图
![](https://img.haomeiwen.com/i17602949/f760c37d583619b5.png)
参考https://www.jianshu.com/p/31244fb42da1
网友评论