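Start with the FASTA file produced by Trinity.
Install TransDecoder (I installed it through conda; you can also download the release archive, unpack it yourself, and add it to your PATH).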
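A minimal sketch of the conda route, assuming the Bioconda package name `transdecoder` and an existing environment called `bioinforspace` (the environment name visible in the shell prompt in this post). The helper names `install_transdecoder`, `have_cmd`, and `report_transdecoder` are made up for illustration, and the channel order is an assumption to adjust for your own setup.

```shell
# Sketch only: nothing runs until you call one of these functions.

# Hypothetical helper: install TransDecoder into a named conda environment.
# The environment must already exist.
install_transdecoder() {
    td_env="${1:-bioinforspace}"
    conda install -y -n "$td_env" -c conda-forge -c bioconda transdecoder
}

# Hypothetical helper: succeed only if the given command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one line per TransDecoder entry point saying whether it is visible.
report_transdecoder() {
    for td_tool in TransDecoder.LongOrfs TransDecoder.Predict; do
        if have_cmd "$td_tool"; then
            echo "$td_tool: found"
        else
            echo "$td_tool: missing"
        fi
    done
}
```

After activating the environment, calling `report_transdecoder` is a quick way to confirm that both programs used below are reachable before starting a long run.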
Run TransDecoder.LongOrfs
$ TransDecoder.LongOrfs -t /data1/spider/ytbiosoft/data/trinity.all/trinity_out_dir_all.Trinity.fasta
The output looks like this:
(bioinforspace) [spider 04:01:18]/data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs
$ TransDecoder.LongOrfs -t /data1/spider/ytbiosoft/data/trinity.all/trinity_out_dir_all.Trinity.fasta
-- Skipping CMD: /data1/spider/miniconda3/envs/bioinforspace/opt/transdecoder/util/compute_base_probs.pl /data1/spider/ytbiosoft/data/trinity.all/trinity_out_dir_all.Trinity.fasta 0 > /data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs/trinity_out_dir_all.Trinity.fasta.transdecoder_dir/base_freqs.dat, checkpoint [/data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs/trinity_out_dir_all.Trinity.fasta.transdecoder_dir.__checkpoints_longorfs/base_freqs_file.ok] exists.
- extracting ORFs from transcripts.
-total transcripts to examine: 375779
[375700/375779] = 99.98% done
CMD: touch /data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs/trinity_out_dir_all.Trinity.fasta.transdecoder_dir.__checkpoints_longorfs/TD.longorfs.ok
#################################
### Done preparing long ORFs. ###
##################################
Use file: /data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs/trinity_out_dir_all.Trinity.fasta.transdecoder_dir/longest_orfs.pep for Pfam and/or BlastP searches to enable homology-based coding region identification.
Then, run TransDecoder.Predict for your final coding region predictions.
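The log above suggests searching longest_orfs.pep with BlastP and/or Pfam before predicting. This post skips that step, but a rough sketch of it might look like the following. It assumes NCBI BLAST+ (`blastp`) and HMMER (`hmmscan`) are installed, and that you have already prepared a protein BLAST database and a pressed Pfam-A.hmm file. The function name `homology_search`, the output file names, and the thread counts are illustrative choices, not part of the original workflow.

```shell
# Sketch only: the database paths passed in are placeholders for your own files.
# Arguments: <longest_orfs.pep> <blast protein db> <Pfam-A.hmm> [output dir]
homology_search() {
    hs_pep="$1"
    hs_blastdb="$2"
    hs_pfam="$3"
    hs_out="${4:-.}"

    # Refuse to start if the LongOrfs output is missing or empty.
    if [ ! -s "$hs_pep" ]; then
        echo "missing or empty ORF peptide file: $hs_pep" >&2
        return 1
    fi
    mkdir -p "$hs_out" || return 1

    # BlastP of the candidate ORFs against a protein database, tabular output.
    blastp -query "$hs_pep" -db "$hs_blastdb" \
        -max_target_seqs 1 -outfmt 6 -evalue 1e-5 -num_threads 8 \
        > "$hs_out/blastp.outfmt6" || return 1

    # Pfam domain search with HMMER; the per-domain table goes to pfam.domtblout.
    hmmscan --cpu 8 --domtblout "$hs_out/pfam.domtblout" \
        "$hs_pfam" "$hs_pep" > "$hs_out/pfam.log" || return 1
}
```

If you do run these searches, the two result tables can be handed to the prediction step with `--retain_blastp_hits blastp.outfmt6` and `--retain_pfam_hits pfam.domtblout` on the TransDecoder.Predict command line, so that ORFs with homology support are kept.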
Run TransDecoder.Predict
$ TransDecoder.Predict -t /data1/spider/ytbiosoft/data/trinity.all/trinity_out_dir_all.Trinity.fasta
The results are as follows:
-rw-rw-r-- 1 spider spider 1449 Apr 11 19:48 pipeliner.38661.cmds
-rw-rw-r-- 1 spider spider 296 Apr 11 20:01 pipeliner.38864.cmds
-rw-rw-r-- 1 spider spider 3644 Apr 11 20:41 pipeliner.40446.cmds
-rw-rw-r-- 1 spider spider 3350 Apr 11 20:38 pipeliner.41631.cmds
-rw-rw-r-- 1 spider spider 16515704 Apr 11 20:40 trinity_out_dir_all.Trinity.fasta.transdecoder.bed
-rw-rw-r-- 1 spider spider 104584265 Apr 11 20:43 trinity_out_dir_all.Trinity.fasta.transdecoder.cds
-rw-rw-r-- 1 spider spider 75371916 Apr 11 20:39 trinity_out_dir_all.Trinity.fasta.transdecoder.gff3
-rw-rw-r-- 1 spider spider 44730662 Apr 11 20:41 trinity_out_dir_all.Trinity.fasta.transdecoder.pep
drwxrwxr-x 3 spider spider 4096 Apr 11 20:38 trinity_out_dir_all.Trinity.fasta.transdecoder_dir
drwxrwxr-x 2 spider spider 4096 Apr 11 20:43 trinity_out_dir_all.Trinity.fasta.transdecoder_dir.__checkpoints
drwxrwxr-x 2 spider spider 52 Apr 11 20:12 trinity_out_dir_all.Trinity.fasta.transdecoder_dir.__checkpoints_longorfs
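Where:
*.pep (protein sequences encoded by the final candidate ORFs)
*.cds (nucleotide sequences of the predicted coding regions)
*.gff3 (positions of the ORFs on the transcripts)
*.bed (for later visualization in IGV)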
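To get a quick feel for these files, a small sketch like the one below counts the predicted ORFs and tallies them by type. It assumes each FASTA header in the .pep or .cds file carries a `type:<name>` token (for example `type:complete`); check a few headers from your own TransDecoder version first, since the exact header layout is an assumption here. The function names `count_records` and `orf_type_counts` are made up for illustration.

```shell
# Count FASTA records, i.e. header lines starting with '>'.
count_records() {
    grep -c '^>' "$1"
}

# Tally ORF types from the FASTA headers as "<type><TAB><count>" lines.
# Assumes a "type:<name>" token per header; headers without one are ignored.
orf_type_counts() {
    grep '^>' "$1" \
        | sed -n 's/.*type:\([A-Za-z0-9_]*\).*/\1/p' \
        | sort | uniq -c | awk '{print $2 "\t" $1}'
}
```

For example, `count_records trinity_out_dir_all.Trinity.fasta.transdecoder.pep` gives the number of predicted proteins, and `orf_type_counts` on the same file shows how many of them fall into each ORF type.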
Feel free to get in touch so we can learn from each other: 909474045@qq.com