Understanding the RNA-seq workflow: https://zhuanlan.zhihu.com/p/139773946
Installing FastQC
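wget http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.3.zip  # download from the FastQC website
unzip fastqc_v0.11.3.zip  # unpack the archive
# The biosofts folder now contains a new FastQC folder
cd ~/biosofts/FastQC
# Inside the FastQC folder there is a file named fastqc
chmod 755 fastqc  # make it executable
# Next, add it to PATH:
## New trick: instead of editing ~/.bashrc in vim, append the line with echo
echo 'export PATH=~/biosofts/FastQC:$PATH' >>~/.bashrc  # note: the entry before the colon is the FastQC folder, not the fastqc file
source ~/.bashrc
fastqc -h  # prints the help text, so the installation worked
Summary: download, make executable, add to PATH in ~/.bashrc.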
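One caveat with the echo approach is that it appends a new line every time it is run, so repeating the installation leaves duplicate PATH entries in ~/.bashrc. Below is a minimal sketch of an idempotent variant. The helper name `add_to_path_once` is hypothetical (it is not part of FastQC or bash), and the second argument is only there so the function can be tried on a scratch file.

```shell
# Hypothetical helper: append an "export PATH=<dir>:$PATH" line to an rc file
# only if that exact line is not already present.
add_to_path_once() {
    dir=$1
    rc_file=${2:-"$HOME/.bashrc"}
    line="export PATH=$dir:\$PATH"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

Under these assumptions, `add_to_path_once "$HOME/biosofts/FastQC"` followed by `source ~/.bashrc` would replace the echo step, and running it a second time changes nothing.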

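Quality control with FastQC and MultiQC
FastQC needs a Java runtime, which has to be set up first. See:
https://blog.csdn.net/u010993514/article/details/82926514
ls *gz | xargs fastqc -t 2  # -t 2 runs two threads
multiqc ./ --export  # aggregate the QC reports found in the current folder
MultiQC depends on Python 3, and installing Python 3 currently requires access to the external internet:
https://blog.csdn.net/L_15156024189/article/details/84831045
https://www.cnblogs.com/wintest/p/12057170.html
Interpreting FastQC results: https://zhuanlan.zhihu.com/p/20731723
Interpreting MultiQC results: WeChat account 生信小知识, post of 5.31, "3. 质控与去接头" (QC and adapter trimming)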
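Installing Trim Galore
Trim Galore is a wrapper around FastQC and Cutadapt, so both must be installed first. FastQC is already in place; Cutadapt can be installed through the system package manager (for example, sudo apt install cutadapt on Debian or Ubuntu).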
# Check that cutadapt is installed
cutadapt --version
# Check that FastQC is installed
fastqc -v
# Install Trim Galore
curl -fsSL https://github.com/FelixKrueger/TrimGalore/archive/0.6.5.tar.gz -o trim_galore.tar.gz
tar xvzf trim_galore.tar.gz
# Run Trim Galore
~/TrimGalore-0.6.5/trim_galore
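Because Trim Galore calls both Cutadapt and FastQC, it can help to confirm in one step that both are on PATH before launching it. The sketch below is only an illustration: `require_tools` is a hypothetical helper name, and it checks nothing beyond whether each command can be found.

```shell
# Hypothetical helper: report every required tool that is not on PATH.
# Returns 0 only when all listed tools were found.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `require_tools cutadapt fastqc` prints one line per missing tool and returns a non-zero status if either is absent.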
Running Trim Galore failed with an error.

I gave up on Trim Galore and tried fastp instead, another recommended tool for FASTQ quality control:
FASTP
https://www.jianshu.com/p/f223206b3378

# -l 36: discard reads shorter than 36 bp; -q 20: a base counts as qualified at Phred quality >= 20
fastp -i SRR1175538_1.fastq.gz -o SRR1175538_1.fastp.fq.gz \
-I SRR1175538_2.fastq.gz -O SRR1175538_2.fastp.fq.gz \
-l 36 -q 20 --compression=6 -R SRR1175538 \
-h SRR1175538.html -j SRR1175538.json
As a loop over all samples:
for r1 in *_1.fastq.gz
do
i=${r1%_1.fastq.gz}  # strip the suffix to get the sample name
fastp -i "${i}_1.fastq.gz" -o "${i}_1.fastp.fq.gz" \
-I "${i}_2.fastq.gz" -O "${i}_2.fastp.fq.gz" \
-l 36 -q 20 --compression=6 \
-R "${i}" -h "${i}.fastp.html" -j "${i}.fastp.json"
done
The files produced by fastp are named *.fastp.fq.gz.
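The loop above assumes every *_1.fastq.gz file has a matching *_2.fastq.gz mate. If a download is incomplete, that assumption fails partway through the batch. As a rough sketch, the hypothetical helper `list_paired_samples` below prints only the sample names for which both mates exist and warns about the rest; the file naming pattern is taken from the loop, everything else is an assumption.

```shell
# Hypothetical helper: print the sample prefix for every *_1.fastq.gz in a
# directory whose *_2.fastq.gz mate also exists; warn on stderr otherwise.
list_paired_samples() {
    dir=${1:-.}
    for r1 in "$dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue          # the glob matched nothing
        sample=$(basename "$r1" _1.fastq.gz)
        if [ -e "$dir/${sample}_2.fastq.gz" ]; then
            printf '%s\n' "$sample"
        else
            echo "skipping $sample: missing ${sample}_2.fastq.gz" >&2
        fi
    done
}
```

Its output could drive the fastp loop, for instance with `list_paired_samples . | while read -r i; do ...; done`.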

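After fastp, assess the trimmed reads again with FastQC and MultiQC:
ls *gz | xargs fastqc -t 2  # xargs passes the file names to fastqc as arguments
multiqc ./ --export  # --export also writes the plots as separate files, including PDF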


Running FastQC on SRR9856213, from the normal group of GSE135055, fails with:
Failed to process file SRR9856213_1.fastp.fq.gz
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Ran out of data in the middle of a fastq entry. Your file is probably truncated
at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:179)
at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:125)
at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:77)
at java.base/java.lang.Thread.run(Thread.java:834)
This sample is set aside for now.
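The FastQC message suggests the file is truncated. One possible way to confirm this before discarding a sample is to test the gzip stream itself with `gzip -t`, on both the raw download and the fastp output. The sketch below wraps that check in a hypothetical helper, `find_broken_gz`; it only detects damaged or incomplete gzip data and says nothing about the FASTQ content.

```shell
# Hypothetical helper: test each gzip file and print the names of those that
# fail the integrity check. Returns non-zero if any file failed.
find_broken_gz() {
    status=0
    for f in "$@"; do
        if ! gzip -t "$f" 2>/dev/null; then
            printf '%s\n' "$f"
            status=1
        fi
    done
    return "$status"
}
```

Calling `find_broken_gz *.gz` lists the files that would need to be downloaded or processed again.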