3. RNA-Seq analysis
![](https://img.haomeiwen.com/i15781461/2205fdc5536dc770.png)
- RNASeq1: run differential gene expression analysis on the resulting counts
- RNASeq2: compute FPKM values from the resulting counts
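The FPKM step can be sketched in a few lines of pandas. This is a minimal sketch, not the pipeline used above: it assumes a count table with genes as rows and samples as columns, takes the library size to be the sum of counts assigned to genes in each sample, and uses hypothetical names (`counts_to_fpkm`, `geneA`, `sample1`) purely for illustration.

```python
import pandas as pd


def counts_to_fpkm(counts: pd.DataFrame, gene_length_bp: pd.Series) -> pd.DataFrame:
    """FPKM = count * 1e9 / (gene length in bp * library size).

    Assumption: library size = total counts per sample in this table.
    """
    # Align gene lengths to the row order of the count table.
    lengths = gene_length_bp.reindex(counts.index)
    # One library size per sample (column).
    library_size = counts.sum(axis=0)
    return counts.mul(1e9).div(lengths, axis=0).div(library_size, axis=1)


# Toy data (hypothetical genes and samples).
counts = pd.DataFrame(
    {"sample1": [100, 300, 600], "sample2": [50, 50, 0]},
    index=["geneA", "geneB", "geneC"],
)
lengths = pd.Series({"geneA": 1000, "geneB": 2000, "geneC": 3000})

fpkm = counts_to_fpkm(counts, lengths)
print(fpkm)
```

If the total number of mapped reads per sample is known from the aligner, it may be a better denominator than the column sum used here.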
5. Matching gene BED records to the FPKM groups with Python
import pandas as pd
# Small files (one per FPKM group); encoding="gbk" keeps Chinese characters from being garbled
high = pd.read_csv("high.txt",encoding="gbk")
low = pd.read_csv("low.txt",encoding="gbk")
middle = pd.read_csv("middle.txt",encoding="gbk")
unexpressed = pd.read_csv("unexpressed.txt",encoding="gbk")
# Large file: BED annotation for all genes
rice = pd.read_csv("rice.bed.csv",encoding="gbk")
# Boolean masks: True where a row's locus belongs to the given group
index1 = rice[u'locus'].isin(high[u'locus'])
index2 = rice[u'locus'].isin(low[u'locus'])
index3 = rice[u'locus'].isin(middle[u'locus'])
index4 = rice[u'locus'].isin(unexpressed[u'locus'])
hb = rice[index1]
lb = rice[index2]
mb = rice[index3]
ub = rice[index4]
hb.to_csv('hb.csv', index=False, encoding='gbk')
lb.to_csv('lb.csv', index=False, encoding='gbk')
mb.to_csv('mb.csv', index=False, encoding='gbk')
ub.to_csv('ub.csv', index=False, encoding='gbk')
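The same matching can be written once and applied to every group. The sketch below uses small in-memory tables so it runs on its own; only the `locus` column comes from the code above, while `split_by_group`, the other column names, and the locus IDs are made up for illustration.

```python
import pandas as pd


def split_by_group(annotation: pd.DataFrame, groups: dict, key: str = "locus") -> dict:
    """Return {group name: annotation rows whose key appears in that group's table}."""
    return {
        name: annotation[annotation[key].isin(table[key])]
        for name, table in groups.items()
    }


# Toy stand-in for rice.bed.csv (hypothetical coordinates and IDs).
rice = pd.DataFrame(
    {
        "chr": ["Chr1", "Chr1", "Chr2", "Chr3"],
        "start": [100, 900, 50, 400],
        "end": [500, 1500, 800, 950],
        "locus": ["g1", "g2", "g3", "g4"],
    }
)

# Toy stand-ins for high.txt, middle.txt, low.txt and unexpressed.txt.
groups = {
    "high": pd.DataFrame({"locus": ["g1"]}),
    "middle": pd.DataFrame({"locus": ["g3"]}),
    "low": pd.DataFrame({"locus": ["g2"]}),
    "unexpressed": pd.DataFrame({"locus": ["g4", "g9"]}),  # g9 has no BED record
}

matched = split_by_group(rice, groups)
for name, rows in matched.items():
    print(name, list(rows["locus"]))
```

Loci that appear in a group but not in the annotation (here `g9`) are dropped silently by `isin`, so comparing row counts before and after matching is a cheap sanity check.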
6. Using the shell to prepare one BED file per expression group, then plotting the corresponding peak distribution with deepTools
awk -F " " -v OFS="\t" '{print $1, $2, $3, $4}' ub.bed > unexpressed.bed # Note: BED files are tab-delimited (\t), with no trailing tab after the last column
sed -i "1d" unexpressed.bed # delete the first line (the column header)
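The conversion can also be done in Python directly from the CSV files written in the previous section, followed by the deepTools calls. This is a sketch under several assumptions: the first four CSV columns are chromosome, start, end and locus; `signal.bw` is a hypothetical bigWig track; and the window sizes are placeholders. The commands are only built as argument lists, not executed; check the flags against `computeMatrix --help` and `plotProfile --help` for the installed version.

```python
import os
import tempfile

import pandas as pd


def csv_to_bed(csv_path, bed_path, n_columns=4, encoding="gbk"):
    """Write the first n_columns of a CSV as a headerless, tab-delimited BED file."""
    table = pd.read_csv(csv_path, encoding=encoding)
    table.iloc[:, :n_columns].to_csv(bed_path, sep="\t", header=False, index=False)


def deeptools_commands(bigwig, bed_files, matrix="matrix.gz", figure="profile.png"):
    """Build (but do not run) computeMatrix and plotProfile argument lists."""
    compute = ["computeMatrix", "scale-regions", "-S", bigwig, "-R", *bed_files,
               "-b", "3000", "-a", "3000", "--regionBodyLength", "5000",
               "-o", matrix]
    plot = ["plotProfile", "-m", matrix, "-out", figure]
    return compute, plot


# Demo on a throwaway file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "ub.csv")
    bed_path = os.path.join(tmp, "unexpressed.bed")
    with open(csv_path, "w", encoding="gbk") as handle:
        handle.write("chr,start,end,locus,note\n"
                     "Chr1,100,500,geneA,x\n"
                     "Chr2,10,90,geneB,y\n")
    csv_to_bed(csv_path, bed_path)
    with open(bed_path) as handle:
        bed_lines = handle.read().splitlines()

compute_cmd, plot_cmd = deeptools_commands(
    "signal.bw", ["high.bed", "middle.bed", "low.bed", "unexpressed.bed"]
)
print(bed_lines)
print(" ".join(compute_cmd))
print(" ".join(plot_cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would run the step once deepTools is installed. Listing all four BED files after `-R` makes each one a separate region group in the profile.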