CIRCexplorer3 | Pitfalls Along the Way

Author: 苦哈哈的柠檬水 | Published 2022-03-03 00:11


Creating the circexplorer3 environment

Python 2.7 or later is required; the environment below pins 2.7.15.

conda create -n circexplorer3 python=2.7.15
conda activate circexplorer3

Software version issues

Note: pin Samtools=0.1.18.0, Bowtie=1.0.0.0 and tophat=2.0.12.

conda install Samtools=0.1.18.0
conda install Bowtie=1.0.0.0

Following the official documentation at https://circexplorer2.readthedocs.io/en/latest/tutorial/setup/ and https://github.com/YangLab/CLEAR, install the software that CIRCexplorer2 and CIRCexplorer3 depend on.

# Install the dependencies
conda install -y Cufflinks BEDTools STAR MapSplice BWA segemehl pysam pybedtools docopt scipy HISAT2 StringTie
conda install -y -c bioconda ucsc-genepredtogtf
conda install -y -c bioconda ucsc-gtftogenepred
# Install CIRCexplorer2
conda install circexplorer2
# Install CIRCexplorer3 (the CLEAR repository)
git clone https://github.com/YangLab/CLEAR
cd ./CLEAR
python ./setup.py install
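Before moving on, it can help to confirm that the executables used later in this post are actually on PATH. The following is only a minimal sketch: `missing_tools` is a hypothetical helper name, and the list in the usage comment is simply the set of commands that appear elsewhere in this post.

```shell
# Hypothetical helper: print every requested command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage (names taken from the commands used in this post):
# missing_tools fetch_ucsc.py genePredToGtf bowtie-build bowtie2-build hisat2-build clear_quant
```

An empty result means everything was found; any name printed still needs to be installed or added to PATH.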

Note: use tophat=2.0.12

# Install tophat
wget http://ccb.jhu.edu/software/tophat/downloads/tophat-2.0.12.Linux_x86_64.tar.gz
tar -zxvf tophat-2.0.12.Linux_x86_64.tar.gz
export PATH=$PATH:/home/user/bioinfo_software/tophat-2.0.12.Linux_x86_64/ # temporary PATH addition (has to be set again on every login)
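Because the export above is lost at logout, one option is to write it into a shell startup file once. This is a sketch, not part of the original workflow: `add_to_path_file` is a hypothetical helper, and using `~/.bashrc` as the startup file is an assumption about the login shell.

```shell
# Hypothetical helper: append an "export PATH" line to a startup file,
# but only if that exact line is not already present.
add_to_path_file() {
    rc_file=$1
    tool_dir=$2
    line="export PATH=\$PATH:$tool_dir"
    # grep -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}

# Usage (directory taken from the export above; ~/.bashrc is an assumption):
# add_to_path_file "$HOME/.bashrc" /home/user/bioinfo_software/tophat-2.0.12.Linux_x86_64
```

Running the helper twice leaves a single line in the file, so it is safe to keep in a setup script.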

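Data processing

Preparation

Download the GTF, FASTA and other gene annotation files as described in the CIRCexplorer2 documentation, then build the indexes with HISAT2 and Bowtie.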

# Download the GTF- and FASTA-related files
fetch_ucsc.py hg19 ref hg19_ref.txt
fetch_ucsc.py hg19 kg hg19_kg.txt
fetch_ucsc.py hg19 ens hg19_ens.txt
fetch_ucsc.py hg19 fa hg19.fa
# Convert to GTF format
cut -f2-11 hg19_ref.txt|genePredToGtf file stdin hg19_ref.gtf
cut -f2-11 hg19_kg.txt|genePredToGtf file stdin hg19_kg.gtf
cut -f2-11 hg19_ens.txt|genePredToGtf file stdin hg19_ens.gtf
# Build the indexes TopHat needs (specify bowtie); run each build from inside its own index directory
bowtie-build ../hg19.fa bowtie1_index  # build the bowtie index
bowtie2-build ../hg19.fa bowtie2_index # build the bowtie2 index
hisat2-build -p 70 ../hg19.fa hg19.fa  # build the HISAT2 index; arguments are <reference_in> <index_base>
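Once the builds finish, a quick look at the index files can catch a prefix that points at the wrong directory. The sketch below assumes the usual suffixes (`.1.ht2` for HISAT2 and `.1.ebwt` for Bowtie 1, with a trailing `l` for large indexes); `index_exists` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the first index file for a prefix exists.
# $1 = index prefix (the value later passed to -i or -j)
# $2 = suffix family: "ht2" for HISAT2, "ebwt" for Bowtie 1 (assumed suffixes)
index_exists() {
    prefix=$1
    ext=$2
    # Large indexes use the same suffix with a trailing "l"
    [ -e "${prefix}.1.${ext}" ] || [ -e "${prefix}.1.${ext}l" ]
}

# Usage, from the directory where the index was built:
# index_exists hg19.fa ht2        || echo "HISAT2 index not found"
# index_exists bowtie1_index ebwt || echo "Bowtie 1 index not found"
```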

Shell variable names cannot contain “.”; use “_” instead.

hisat_index=/home/user/project/reference/UCSC/HISAT2/hg19.fa
bowtie1_index=/home/user/project/reference/UCSC/bt1/bowtie1_index
hg19_gtf=/home/user/project/reference/UCSC/hg19_ref.gtf
hg19_fa=/home/user/project/reference/UCSC/hg19.fa
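The naming rule can be checked mechanically: a shell variable name may only contain letters, digits and underscores, and may not start with a digit. A minimal sketch, with `is_valid_var_name` as a hypothetical helper:

```shell
# Hypothetical helper: succeed only if $1 is a legal shell variable name.
is_valid_var_name() {
    case $1 in
        # empty, starts with a digit, or contains a character outside [A-Za-z0-9_]
        ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

is_valid_var_name hg19_fa && echo "hg19_fa is a valid name"
is_valid_var_name hg19.fa || echo "hg19.fa is not a valid name"
```

Writing `hg19.fa=/some/path` does not create a variable; the shell treats the whole word as a command name and reports that it cannot be found.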

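Single-sample test

Note: before running, check the environment and the paths to the annotation files.
!!! Do not set the -p thread count too high. Otherwise the run exceeds the Linux default limit on open file descriptors (1024) and keeps failing with “Too many open files” or “Error opening SAM file” !!!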

nohup clear_quant -p 70 -1 SRRnnnnnn_1.fq.gz -2 SRRnnnnnn_2.fq.gz -g $hg19_fa -i $hisat_index -j $bowtie1_index --gtf $hg19_gtf -o ./SRRnnnnnn_output/ &
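The file descriptor limit mentioned above can be inspected, and raised up to the hard limit, from the same shell that launches the job. This is a sketch using the shell's `ulimit` builtin; whether the raised limit is enough for a given -p value depends on the system, so lowering -p remains the fallback.

```shell
# Show the current soft limit on open file descriptors (often 1024).
ulimit -S -n
# Raise the soft limit to the hard limit, for this shell session only.
ulimit -S -n "$(ulimit -H -n)"
# Confirm the new value before launching clear_quant from this shell.
ulimit -S -n
```

Processes started from this shell, including those launched with nohup, inherit the raised limit.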

Batch processing

# Print the annotation and index paths to confirm they are set
echo $hisat_index $bowtie1_index $hg19_gtf $hg19_fa
# Symlink the trimmed FASTQ files into the working directory
ln -s ../04.trim_galore_clean/*.fq.gz ./
# Batch script: write one clear_quant command per sample
for i in *_1.fq.gz
do
    id=${i%_1.fq.gz}   # strip the suffix to get the sample ID
    output_dir=./${id}_output/
    echo "clear_quant -p 70 -1 ${id}_1.fq.gz -2 ${id}_2.fq.gz -g $hg19_fa -i $hisat_index -j $bowtie1_index --gtf $hg19_gtf -o $output_dir"
done >CIRCexplorer3.sh
# Run
nohup bash CIRCexplorer3.sh 1>nohup.log &
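The loop above assumes that every `_1.fq.gz` file has a matching `_2.fq.gz`. A small check before generating the commands can catch a missing mate early. This is a sketch with a hypothetical helper, `paired_ids`, that prints only the sample IDs for which both files exist.

```shell
# Hypothetical helper: list sample IDs in a directory that have both mates.
paired_ids() {
    dir=$1
    for r1 in "$dir"/*_1.fq.gz; do
        [ -e "$r1" ] || continue              # the glob matched nothing
        id=$(basename "$r1" _1.fq.gz)         # strip directory and suffix
        if [ -e "$dir/${id}_2.fq.gz" ]; then
            printf '%s\n' "$id"
        else
            printf 'missing mate for %s\n' "$id" >&2
        fi
    done
}

# Usage: paired_ids . > sample_ids.txt
```

Warnings go to stderr, so the ID list on stdout stays clean and can be fed straight into the command-generating loop.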

Collecting the quantification files

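# Batch collection
mkdir -p quant_file   # cp fails if the target directory is missing
ls *_1.fq.gz | while read id
do
id=${id%_1.fq.gz}
file=${id}_output
echo "cp $file/quant/quant.txt ./quant_file/$id"
done >gather.command
cat gather.command # check that the commands are complete
bash gather.command 2>gather.log # run the commands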
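If a sample failed partway through, its `quant/quant.txt` will be missing or empty and the copy step only records an error in gather.log. The sketch below lists such samples up front; `missing_quant` is a hypothetical helper, and the `*_output/quant/quant.txt` layout is taken from the commands above.

```shell
# Hypothetical helper: print each *_output directory under $1 that has
# no non-empty quant/quant.txt.
missing_quant() {
    base=$1
    for out in "$base"/*_output; do
        [ -d "$out" ] || continue                       # skip if nothing matched
        [ -s "$out/quant/quant.txt" ] || printf '%s\n' "$out"
    done
}

# Usage: missing_quant .
```

An empty result means every sample produced a quantification file.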

Article link: https://www.haomeiwen.com/subject/jcqyrrtx.html