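After rRNA removal, the next step is read alignment. For this step the authors used four aligners: TopHat2, STAR, HISAT2, and BWA.
The corresponding code, pulled out of their scripts:
BWA:
[image-20220524201319092.png]
As before, we extract this code and run it ourselves.
Building the index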
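# conda install -c bioconda bwa -y
# Create the index directory
mkdir BWAIndex
# Build the BWA index
# vim BWAIndex.sh
fasta=Homo_sapiens.GRCh38.dna.primary_assembly.fa
fasta_baseName=GRCh38
cd BWAIndex/
# Use the shell variable defined above; a dot is not valid in a shell variable name
bwa index -p ${fasta_baseName} -a bwtsw ../$fasta
cd ../
# Run in the background and keep a log
nohup sh BWAIndex.sh >BWAIndex.sh.log &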
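The `# vim BWAIndex.sh` comment implies that the indexing commands are saved into a script before `nohup sh BWAIndex.sh` is run. As a minimal sketch, the same script can be written non-interactively with a here-document. The `mkdir -p` line and the `sh -n` syntax check are additions for illustration, not part of the original commands.

```shell
# Write the indexing commands to BWAIndex.sh without opening an editor.
# The quoted 'EOF' keeps $fasta and ${fasta_baseName} literal in the file.
cat > BWAIndex.sh <<'EOF'
fasta=Homo_sapiens.GRCh38.dna.primary_assembly.fa
fasta_baseName=GRCh38
mkdir -p BWAIndex
cd BWAIndex/
bwa index -p ${fasta_baseName} -a bwtsw ../$fasta
cd ../
EOF

# Parse the script without executing it, to catch syntax errors early.
sh -n BWAIndex.sh && echo "BWAIndex.sh: syntax OK"
```

Keeping the commands in a script lets the `nohup` run be repeated and leaves a record of exactly what was executed.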
The index directory contains:
BWAIndex
├── GRCh38.amb
├── GRCh38.ann
├── GRCh38.bwt
├── GRCh38.pac
└── GRCh38.sa
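Since indexing runs in the background, it can help to confirm that all five files listed above exist before starting the alignment. The following is a small sketch; `check_bwa_index` is a hypothetical helper name, and the prefix in the usage comment assumes the directory layout shown above.

```shell
# Hypothetical helper: succeed only if all five BWA index files
# exist and are non-empty for the given index prefix.
check_bwa_index() {
    prefix=$1
    for ext in amb ann bwt pac sa; do
        if [ ! -s "${prefix}.${ext}" ]; then
            echo "missing or empty: ${prefix}.${ext}" >&2
            return 1
        fi
    done
    echo "index complete: ${prefix}"
}

# Usage (prefix assumed from the listing above):
#   check_bwa_index BWAIndex/GRCh38
```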
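Read alignment
Parameter notes
- -f FILE: write output to FILE instead of stdout (used here with bwa aln)
- --library-type: library preparation type (unstranded, first-strand, or second-strand). This is a TopHat2 option and is not used in the bwa commands in this section.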
# Activate the conda environment
conda activate rna
# Create the output directory
mkdir -p alignment/bwa
index_base=../GRCh38/BWAIndex/GRCh38
outdir=alignment/bwa
## Single-end reads
ls alignment/rRNA_dup/SRR10352*gz |while read id
do
sample_name=${id##*/}
sample_name=${sample_name%%.*}
echo "bwa aln -t 12 -f $outdir/${sample_name}.sai $index_base $id && bwa samse $index_base $outdir/${sample_name}.sai $id | samtools view -@ 12 -h -bS - >$outdir/${sample_name}_bwa.bam"
done >bwa.sh
# Run in the background and keep a log
nohup sh bwa.sh >bwa.sh.log &
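The two `sample_name` assignments in the loop rely on shell parameter expansion to turn a file path into a sample ID. A minimal worked example follows; the input file name is hypothetical and chosen only to match the `SRR10352*gz` pattern.

```shell
# Hypothetical input path, following the naming pattern used in the loop.
id=alignment/rRNA_dup/SRR1035213.fastq.gz

# "##*/" removes the longest prefix ending in "/", leaving the file name.
sample_name=${id##*/}            # SRR1035213.fastq.gz

# "%%.*" removes the longest suffix starting at the first ".", leaving the ID.
sample_name=${sample_name%%.*}   # SRR1035213

echo "$sample_name"              # prints SRR1035213
```

Because `%%.*` cuts at the first dot, this only works when the sample ID itself contains no dots.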
Contents of bwa.sh:
[image-20220524195757995.png]
After the run finishes, the following files are generated for each sample in the output directory:
- SRR1035213_bwa.bam
- SRR1035213.sai
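A background job can stop partway and still leave a BAM file behind, so a quick integrity check on the outputs is worthwhile. The sketch below uses `samtools quickcheck`, which verifies that each file has a valid header and an intact end-of-file marker. `check_bams` is a hypothetical helper name, and the directory in the usage comment assumes the `outdir` used above.

```shell
# Hypothetical helper: run samtools quickcheck on every *_bwa.bam in a directory.
# Returns non-zero if any file fails the check.
check_bams() {
    dir=$1
    failed=0
    for bam in "$dir"/*_bwa.bam; do
        [ -e "$bam" ] || continue          # the glob matched nothing
        if samtools quickcheck "$bam"; then
            echo "ok: $bam"
        else
            echo "truncated or invalid: $bam" >&2
            failed=1
        fi
    done
    return $failed
}

# Usage (directory assumed from the alignment step):
#   check_bams alignment/bwa
```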