使用macs3找peaks
ls *.bam | while read id;
do
sample=${id%.*}
macs3 callpeak -f BAMPE -t ${sample}.bam -g mm --shift -50 --extsize 100 -n ${sample} -B -q 0.01 --outdir ../macs3/
done
FRiP 计算
Fraction of reads in peaks (FRiP) 指落入峰区域的 reads 占比,应该要高于 0.3 较好。这个指标非强制性,不同数据表现并不相同,计算也不用追求精确。
分别计算 BAM 文件总 fragments 数和处于 peak 区域的 fragments 数,再相除,两命令得到的 fragments 数目相除便是 FRiP.
ls *.bam | while read id;
do
sample=${id%.*}
totalReads=$(samtools view -c ${sample}.bam)
Reads=$(samtools view -c -L ../macs3/${sample}_peaks.narrowPeak ${sample}.bam)
echo $Reads $totalReads
echo ${sample} '==> FRiP value:' $(bc <<< "scale=2;100*$Reads/$totalReads")'%'
done
507578 7571172
2-cell-1 ==> FRiP value: 6.70%
1344980 12764754
2-cell-2 ==> FRiP value: 10.53%
320245 5536566
2-cell-4 ==> FRiP value: 5.78%
1660877 11085878
2-cell-5 ==> FRiP value: 14.98%
IDR计算重复情况
#去除.narrowPeak文件首行
sed -i '1d' *.narrowPeak
#对MACS3的结果文件narrowPeak根据-log10(p-value)进行排序
ls *.narrowPeak | while read id;
do
sample=${id%.*}
sort -k8,8nr ${sample}.narrowPeak > ${sample}.sorted.narrowPeak
done
#对生物学重复样本间的peak进行鉴定,查看两次重复的peak的IDR(不可重复率)
idr --samples 2-cell-1_peaks.sorted.narrowPeak 2-cell-2_peaks.sorted.narrowPeak --output-file ../idr/2-cell-1-2.idr_peak.narrowPeak --rank p.value --plot --idr-threshold 0.05 --log-output-file ../idr/2-cell-1-2.idr.log
#log文件会给出peaks通过IDR < 0.05的比率
cat ../idr/2-cell-1-2.idr.log
Initial parameter values: [0.10 1.00 0.20 0.50]
Final parameter values: [0.63 1.10 0.61 0.57]
Number of reported peaks - 391/3854 (10.1%)
Number of peaks passing IDR cutoff of 0.05 - 391/3854 (10.1%)
![](https://img.haomeiwen.com/i27423876/62e76e25ad78b565.png)
网友评论