1 Software installation
https://www.jianshu.com/p/eb89ab4af035
Software to install on Linux: fastqc, fastp, hisat2, samtools, htseq
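Before running the later steps it can help to confirm that each tool is on PATH. A minimal sketch: the function name `missing_tools` is made up for illustration, and the executable names are assumptions (HTSeq, for example, is normally invoked as `htseq-count`), so adjust them to your installation.

```shell
# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names below are assumptions; edit to match your setup.
missing=$(missing_tools fastqc fastp hisat2 samtools htseq-count)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All tools found."
fi
```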
2 Download the genome sequence and genome annotation files
Rhesus macaque genome and annotation files:
Macaca mulatta (ID 215) - Genome - NCBI (nih.gov)
Macaca_mulatta - Ensembl genome browser 104
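3 Build the index
# Run only one of the two commands (NCBI or Ensembl assembly); both write to the same prefix, Mmul.
hisat2-build -p 2 GCF_003339765.1_Mmul_10_genomic.fna Mmul
hisat2-build -p 2 Macaca_mulatta.Mmul_10.dna.toplevel.fa Mmul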
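hisat2-build writes eight files named after the prefix (Mmul.1.ht2 to Mmul.8.ht2). A rough check that the build finished, assuming a small index; very large genomes get the .ht2l extension instead. The helper name `index_complete` is hypothetical.

```shell
# Succeed only if all eight HISAT2 index files exist and are non-empty.
# Assumes the small-index extension (.ht2).
index_complete() {
    prefix=$1
    for i in 1 2 3 4 5 6 7 8; do
        if [ ! -s "${prefix}.${i}.ht2" ]; then
            echo "missing: ${prefix}.${i}.ht2" >&2
            return 1
        fi
    done
    return 0
}

if index_complete Mmul 2>/dev/null; then
    echo "index OK"
else
    echo "index incomplete"
fi
```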
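4 Filter raw reads
# Reports go to ./fastp; cleaned reads are written next to the raw files.
mkdir -p fastp
ls *1.fastq.gz|while read -r id;
do
# ${id%_*} strips the trailing "_1.fastq.gz", leaving the sample name.
fastp -5 20 -i "${id%_*}_1.fastq.gz" -I "${id%_*}_2.fastq.gz" \
-o "${id%_*}_1.clean.fq.gz" -O "${id%_*}_2.clean.fq.gz" \
-j "./fastp/${id%_*}.json" -h "./fastp/${id%_*}.html";
done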
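The loop relies on the `${id%_*}` expansion, which removes the shortest trailing match of `_*`. A small dry run shows which files one iteration would read and write without calling fastp. The function names and the file name SRR000001_1.fastq.gz are made-up examples.

```shell
# Strip the shortest trailing "_*" match, i.e. the "_1.fastq.gz" suffix.
sample_prefix() {
    printf '%s\n' "${1%_*}"
}

# Dry run: list the inputs and outputs for one R1 file name.
show_pair() {
    p=$(sample_prefix "$1")
    echo "in:  ${p}_1.fastq.gz ${p}_2.fastq.gz"
    echo "out: ${p}_1.clean.fq.gz ${p}_2.clean.fq.gz"
}

show_pair SRR000001_1.fastq.gz
```

Because only the shortest match is removed, a sample name that itself contains underscores (for example liver_rep1_1.fastq.gz) keeps everything before the final `_1`.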
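5 Alignment
ls *1.clean.fq.gz|while read -r id;
do
# The log uses ${id%_*} like the other names, so samples whose names contain underscores do not overwrite each other's logs.
hisat2 -t -p 3 -x /media/lzx/0000678400004823/Indexs/Hisat2/Macaca_mulatta/Mmul \
-1 "$id" -2 "${id%_*}_2.clean.fq.gz" \
2>"${id%_*}.hisat.log" \
|samtools sort -@ 3 -o "${id%_*}_ht2p.bam" -
done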
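HISAT2 writes its alignment summary to stderr, which the loop saves in the per-sample .hisat.log files. The summary ends with a line of the form "NN.NN% overall alignment rate", so a quick per-sample overview could look like this. The function name `alignment_rate` is hypothetical.

```shell
# Print the overall alignment rate recorded in one HISAT2 log file.
alignment_rate() {
    awk '/overall alignment rate/ { print $1 }' "$1"
}

# Summarize every log in the current directory, if any exist.
for log in *.hisat.log; do
    [ -e "$log" ] || continue
    printf '%s\t%s\n' "${log%.hisat.log}" "$(alignment_rate "$log")"
done
```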