RNA-seq: Installing Hisat2--Samtools--Feat


Author: 墨白的生信学习笔记 | Published: 2022-09-15 15:16

Introduction: these study notes mainly follow 《一文学会常规转录组分析》 (https://www.jianshu.com/p/bdeebd669eb8).

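1. Data acquisition and quality control

Install SRA Toolkit (including prefetch), Aspera, FastQC, and MultiQC beforehand.

Create a file listing the accession numbers of the data to download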

#cat dir_6.txt

SRR3286802

SRR3286803

SRR3286804

SRR3286805

SRR3286806

SRR3286807
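Because the six accessions are consecutive, the list file can also be generated instead of typed by hand. A minimal sketch, assuming the same file name dir_6.txt and a shell where seq is available:

```shell
# Write SRR3286802 through SRR3286807 to dir_6.txt, one accession per line.
for i in $(seq 2 7); do
    echo "SRR328680${i}"
done > dir_6.txt

# Print the file to confirm its contents.
cat dir_6.txt
```

Generating the list this way keeps it in step with the `seq 2 7` loops used later in these notes.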

1.1 Downloading SRA data with Aspera:

Commands used:

Download

Extract

sh ibm-aspera-connect_4.1.3.93_linux.s

#~/.aspera/connect/bin/ascp -i ~/.aspera/connect/etc/asperaweb_id_dsa.putty --mode recv --host ftp-private.ncbi.nlm.nih.gov --user anonftp --file-list dir_6.txt

Error 1:

ascp: destination required

Startup failed, exit

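Fix 1: use an absolute path for .aspera and append a destination directory as the last argument, since ascp requires one. Here Data/ is the directory where the downloaded files will be saved.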

#/home/radish/.aspera/connect/bin/ascp -i /home/radish/.aspera/connect/etc/asperaweb_id_dsa.putty --mode recv --host ftp-private.ncbi.nlm.nih.gov --user anonftp --file-list dir_6.txt Data/

Error 2:

ascp: Failed to open TCP connection for SSH, exiting.

Session Stop  (Error: Failed to open TCP connection for SSH)

2. Download the gff/gtf annotation file and extract the gene/transcript intervals of interest

#less Arabidopsis_thaliana.TAIR10.42.gff3 | awk '{ if($3=="gene") print $0 }' > gene27655.gff
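To illustrate what this filter keeps, here is a small sketch that applies the same column-3 test to a toy annotation. The file names toy.gff3 and toy_genes.gff and the three records are made up for illustration; -F'\t' is added on the assumption that the input is tab-separated, as GFF3 files are.

```shell
# Build a three-line toy GFF3 file (hypothetical records, tab-separated).
printf '1\tTAIR10\tgene\t3631\t5899\t.\t+\t.\tID=gene:AT1G01010\n' > toy.gff3
printf '1\tTAIR10\tmRNA\t3631\t5899\t.\t+\t.\tID=transcript:AT1G01010.1\n' >> toy.gff3
printf '1\tTAIR10\tgene\t6788\t9130\t.\t-\t.\tID=gene:AT1G01020\n' >> toy.gff3

# Keep only records whose third column (feature type) is exactly "gene".
awk -F'\t' '$3 == "gene"' toy.gff3 > toy_genes.gff

# Count the retained records.
wc -l < toy_genes.gff
```

The mRNA record is dropped because its feature type is not "gene"; the same test on the real annotation keeps one line per gene.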


3. Installing Hisat2


3.1 Installing as root, so there is no need to edit bashrc

#anaconda search -t conda hisat2

#anaconda show bioconda/hisat2

#conda install --channel https://conda.anaconda.org/bioconda hisat2

Run

#hisat2

No problems.

3.2 For a regular user, the install directory has to be added to bashrc

#vi ~/.bashrc

#export PATH=/home/radish/bio_soft/hisat2-2.2.0:$PATH

#source ~/.bashrc
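After sourcing bashrc it is worth checking that the shell can actually find the program. A minimal sketch, assuming the hisat2 executable sits directly inside a directory such as $HOME/bio_soft/hisat2-2.2.0 (an assumed location; substitute your own). Note that PATH entries are directories, not the executable itself.

```shell
# Assumed install directory; adjust to wherever hisat2 was unpacked.
tool_dir="$HOME/bio_soft/hisat2-2.2.0"

# Prepend the directory only if it is not already on PATH.
case ":$PATH:" in
    *":$tool_dir:"*) ;;
    *) PATH="$tool_dir:$PATH"; export PATH ;;
esac

# command -v prints the resolved location when the program is found.
if command -v hisat2 >/dev/null 2>&1; then
    echo "hisat2 found at: $(command -v hisat2)"
else
    echo "hisat2 not found; check that $tool_dir contains the executable"
fi
```

The same check should work for the other tools installed this way later in these notes.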

3.3 Aligning the SRA data to the reference genome:

3.3.1 Building the index:

#hisat2-build Arabidopsis_thaliana.TAIR10.dna.toplevel.fa Arabidopsis_thaliana &

3.3.2 Aligning a single sample:

#hisat2 -p 6 -x Arabidopsis_thaliana -1 SRR3286802_1.fastq.gz -2 SRR3286802_2.fastq.gz -S SRR3286802.sam

3.3.3 Aligning all samples with a script:

#cat 3.sh

for i in `seq 2 7`
do
hisat2  -x  ~/bio_soft/Arabidopsis_thaliana  -p  8  \
-1  ~/bio_soft/SRR328680${i}_1.fastq.gz  \
-2  ~/bio_soft/SRR328680${i}_2.fastq.gz  \
-S  ~/bio_soft/SRR328680${i}.sam
done

#sh 3.sh
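The loop can also be driven by the accession list instead of seq, which avoids hard-coding the sample numbers. The sketch below only prints the commands so they can be inspected first. The helper name hisat2_cmd and the variables index and data_dir are illustrative, and the paths mirror those used in 3.sh.

```shell
# Print the hisat2 command for one paired-end sample.
# $1: accession, $2: index prefix, $3: directory holding the FASTQ files
hisat2_cmd() {
    echo "hisat2 -p 8 -x $2 -1 $3/${1}_1.fastq.gz -2 $3/${1}_2.fastq.gz -S $3/${1}.sam"
}

index="$HOME/bio_soft/Arabidopsis_thaliana"
data_dir="$HOME/bio_soft"

# Recreate the accession list so the example is self-contained.
printf 'SRR328680%s\n' 2 3 4 5 6 7 > dir_6.txt

# Emit one command per accession.
while read -r acc; do
    hisat2_cmd "$acc" "$index" "$data_dir"
done < dir_6.txt
```

Once hisat2 is on PATH, piping the printed lines to sh runs them.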

According to the alignment summaries, the alignment rate is high for every sample, above 97%.
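HISAT2 prints its alignment summary to standard error, ending with a line of the form "NN.NN% overall alignment rate". If that output was redirected to one file per sample (for example with 2> SRR3286802.log, a naming assumption made here), the rates can be collected as sketched below. The log content is a made-up stand-in, not real output from this dataset.

```shell
# Hypothetical stand-in for a saved summary; only its last line matters here.
printf '%s\n' '97.50% overall alignment rate' > SRR3286802.log

# Print "<file> <rate>" for every log that contains an overall rate line.
for log in SRR*.log; do
    awk -v f="$log" '/overall alignment rate/ { print f, $1 }' "$log"
done
```

This gives one line per sample, which makes an unusually low rate easy to spot.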

4. Converting SAM to BAM and sorting. Installing Samtools produced this error:

#libncurses.so.5: cannot open shared object file

Fix:

#whereis libncurses.so.5

#ln -s /usr/lib64/libncurses.so.6.1 /usr/lib64/libncurses.so.5

Installing Samtools

Same commands as for Hisat2 above

Run:

Converting and sorting a single sample:

#samtools view -bS SRR3286805.sam > SRR3286805.bam

#samtools sort -n SRR3286805.bam > SRR3286805.n.bam

Converting and sorting all samples with a script:

#cat 1.sh

for i in `seq 2 7`

do

samtools view -@ 8 -Sb SRR328680${i}.sam > SRR328680${i}.bam

samtools sort -@ 8 -n SRR328680${i}.bam > SRR328680${i}.n.bam

done

#sh 1.sh
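The intermediate unsorted BAM is not strictly needed: samtools sort accepts SAM input directly and writes its result to the file given with -o. A sketch under that assumption (samtools 1.x), printing the commands rather than running them. The helper name sort_cmd is illustrative.

```shell
# Print a one-step "SAM -> name-sorted BAM" command for one accession.
# -n sorts by read name, -@ sets additional threads, -o names the output file.
sort_cmd() {
    echo "samtools sort -@ 8 -n -o ${1}.n.bam ${1}.sam"
}

# One command per sample, SRR3286802 through SRR3286807.
for i in $(seq 2 7); do
    sort_cmd "SRR328680${i}"
done
```

Skipping the intermediate file saves disk space, which adds up when the SAM files are large.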

5. Computing expression levels

5.1. Installing FeatureCounts

#export PATH=/home/radish/bio_soft/subread-1.6.0-Linux-x86_64/bin:$PATH

5.2. Installing Stringtie

#wget http://ccb.jhu.edu/software/stringtie/dl/stringtie-1.3.3b.Linux_x86_64.tar.gz

#tar -zvxf stringtie-1.3.3b.Linux_x86_64.tar.gz

#cd stringtie-1.3.3b.Linux_x86_64/

#pwd

Add the printed path to bashrc

#vi ~/.bashrc

#export PATH=/home/radish/bio_soft/stringtie-1.3.3b.Linux_x86_64:$PATH

#source ~/.bashrc

To be continued
