Installing Trinity through conda kept failing, so I uninstalled conda and reinstalled it from scratch.
conda install -c bioconda samtools bowtie2
# Installed successfully on June 28, 2021
conda install -c bioconda trinity
> conda create -n Trinity trinity -y
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed
Creating the environment from the default channels fails, so switch to the bioconda channel:
> conda create -n Trinity trinity -c bioconda
Collecting package metadata (current_repodata.json): done
Solving environment: done
# To activate this environment, use
> conda activate Trinity
> conda install -c bioconda bowtie samtools=1.8
# To deactivate an active environment, use
> conda deactivate
Test the installation:
> Trinity
###############################################################################
#
______ ____ ____ ____ ____ ______ __ __
| || \ | || \ | || || | |
| || D ) | | | _ | | | | || | |
|_| |_|| / | | | | | | | |_| |_|| ~ |
| | | \ | | | | | | | | | |___, |
| | | . \ | | | | | | | | | | |
|__| |__|\_||____||__|__||____| |__| |____/
Trinity-v2.8.5
#
#
# Required:
#
# --seqType <string> :type of reads: ('fa' or 'fq')
#
# --max_memory <string> :suggested max memory to use by Trinity where limiting can be enabled. (jellyfish, sorting, etc)
# provided in Gb of RAM, ie. '--max_memory 10G'
#
# If paired reads:
# --left <string> :left reads, one or more file names (separated by commas, no spaces)
# --right <string> :right reads, one or more file names (separated by commas, no spaces)
#
# Or, if unpaired reads:
# --single <string> :single reads, one or more file names, comma-delimited (note, if single file contains pairs, can use flag: --run_as_paired )
#
# Or,
# --samples_file <string> tab-delimited text file indicating biological replicate relationships.
# ex.
# cond_A cond_A_rep1 A_rep1_left.fq A_rep1_right.fq
# cond_A cond_A_rep2 A_rep2_left.fq A_rep2_right.fq
# cond_B cond_B_rep1 B_rep1_left.fq B_rep1_right.fq
# cond_B cond_B_rep2 B_rep2_left.fq B_rep2_right.fq
#
# # if single-end instead of paired-end, then leave the 4th column above empty.
#
####################################
## Misc: #########################
#
# --include_supertranscripts :yield supertranscripts fasta and gtf files as outputs.
#
# --SS_lib_type <string> :Strand-specific RNA-Seq read orientation.
# if paired: RF or FR,
# if single: F or R. (dUTP method = RF)
# See web documentation.
#
# --CPU <int> :number of CPUs to use, default: 2
# --min_contig_length <int> :minimum assembled contig length to report
# (def=200)
#
# --long_reads <string> :fasta file containing error-corrected or circular consensus (CCS) pac bio reads
# (** note: experimental parameter **, this functionality continues to be under development)
#
# --genome_guided_bam <string> :genome guided mode, provide path to coordinate-sorted bam file.
# (see genome-guided param section under --show_full_usage_info)
#
# --jaccard_clip :option, set if you have paired reads and
# you expect high gene density with UTR
# overlap (use FASTQ input file format
# for reads).
# (note: jaccard_clip is an expensive
# operation, so avoid using it unless
# necessary due to finding excessive fusion
# transcripts w/o it.)
#
# --trimmomatic :run Trimmomatic to quality trim reads
# see '--quality_trimming_params' under full usage info for tailored settings.
#
#
# --no_normalize_reads :Do *not* run in silico normalization of reads. Defaults to max. read coverage of 200.
# see '--normalize_max_read_cov' under full usage info for tailored settings.
# (note, as of Sept 21, 2016, normalization is on by default)
#
# --no_distributed_trinity_exec :do not run Trinity phase 2 (assembly of partitioned reads), and stop after generating command list.
#
#
# --output <string> :name of directory for output (will be
# created if it doesn't already exist)
# default( your current working directory: "/media/lzx/本地磁盘/20201224HepG2/20201020HepG2奥贝胆酸SE/plasmids/trinity_out_dir"
# note: must include 'trinity' in the name as a safety precaution! )
#
# --workdir <string> :where Trinity phase-2 assembly computation takes place (defaults to --output setting).
# (can set this to a node-local drive or RAM disk)
#
# --full_cleanup :only retain the Trinity fasta file, rename as ${output_dir}.Trinity.fasta
#
# --cite :show the Trinity literature citation
#
# --verbose :provide additional job status info during the run.
#
# --version :reports Trinity version (Trinity-v2.8.5) and exits.
#
# --show_full_usage_info :show the many many more options available for running Trinity (expert usage).
#
#
###############################################################################
#
# *Note, a typical Trinity command might be:
#
# Trinity --seqType fq --max_memory 50G --left reads_1.fq --right reads_2.fq --CPU 6
#
# (if you have multiple samples, use --samples_file ... see above for details)
#
# and for Genome-guided Trinity, provide a coordinate-sorted bam:
#
# Trinity --genome_guided_bam rnaseq_alignments.csorted.bam --max_memory 50G
# --genome_guided_max_intron 10000 --CPU 6
#
# see: /home/lzx/miniconda3/opt/trinity-2.8.5/sample_data/test_Trinity_Assembly/
# for sample data and 'runMe.sh' for example Trinity execution
#
# For more details, visit: http://trinityrnaseq.github.io
#
###############################################################################
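Besides reading the help text, the installation can be checked non-interactively. The following is a minimal sketch, assuming it is run inside the activated Trinity environment; `check_tools` is a hypothetical helper name, and the tool list simply mirrors the packages installed earlier in this post.

```shell
# Print every command from the argument list that is not on PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools installed in this post (adjust the list to your own setup).
if not_found=$(check_tools Trinity samtools bowtie2); then
    echo "all tools found"
else
    echo "missing tools:" $not_found
fi
```

If anything is reported missing, the environment is probably not activated, or the package was installed into a different environment.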
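- How can the Trinity transcriptome assembly be improved?
Set --min_kmer_cov 2 (the default is 1).
--min_glue 10 (larger values seem to give a longer N50 and fewer genes)
N50 gets longer and the gene count drops to some extent, but this also means that some transcripts are discarded.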
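The two parameters above can be combined in one run. The sketch below only builds and prints the command, so it is safe to try without any data. The read file names, memory, CPU count and output directory are placeholder assumptions; `--min_kmer_cov` and `--min_glue` are Trinity options listed under `--show_full_usage_info`.

```shell
# Placeholder inputs; replace with your own reads.
LEFT=reads_1.fq
RIGHT=reads_2.fq
MIN_KMER_COV=2   # Trinity default is 1
MIN_GLUE=10      # Trinity default is 2

# Assemble the Trinity command line as a single string.
build_trinity_cmd() {
    echo "Trinity --seqType fq --max_memory 10G --CPU 4" \
         "--left $LEFT --right $RIGHT" \
         "--min_kmer_cov $MIN_KMER_COV --min_glue $MIN_GLUE" \
         "--output trinity_out_dir"
}

cmd=$(build_trinity_cmd)
echo "$cmd"

# Uncomment to actually run the assembly once the inputs exist:
# eval "$cmd"
```

Because stricter settings discard low-coverage transcripts, it may be worth assembling once with the defaults and once with these values, then comparing N50 and gene counts before settling on one.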
Trinity --seqType fq --samples_file $wkd/assembly/samples.txt \
--CPU 10 --max_memory 10G --min_contig_length 150
Trinity --seqType fq --max_memory 11G --single SRR9625467.unmap.fq.gz --SS_lib_type R --CPU 4 --min_contig_length 1000 > trinity.log 2> trinity_err.log
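The first command reads its inputs from a `--samples_file`. As a sketch, such a file could be written and sanity-checked like this; the layout follows the example in the Trinity help text above, and all condition, replicate and file names are placeholders.

```shell
# Tab-delimited columns: condition, replicate, left reads, right reads.
SAMPLES=samples.txt
printf 'cond_A\tcond_A_rep1\tA_rep1_left.fq\tA_rep1_right.fq\n'  > "$SAMPLES"
printf 'cond_A\tcond_A_rep2\tA_rep2_left.fq\tA_rep2_right.fq\n' >> "$SAMPLES"
printf 'cond_B\tcond_B_rep1\tB_rep1_left.fq\tB_rep1_right.fq\n' >> "$SAMPLES"
printf 'cond_B\tcond_B_rep2\tB_rep2_left.fq\tB_rep2_right.fq\n' >> "$SAMPLES"

# Count lines that have neither 3 columns (single-end) nor 4 (paired-end).
bad_lines=$(awk -F'\t' 'NF != 3 && NF != 4 { n++ } END { print n + 0 }' "$SAMPLES")
echo "malformed lines: $bad_lines"
```

A non-zero count usually means spaces were typed instead of tabs, which is easy to miss by eye.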