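The web version of GenomeScope seemed to be down: jobs stayed in the running state and never returned a result, so I had to install it locally.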
conda install -c bioconda genomescope2
### The following error was reported
Package python conflicts for:
python=3.6
genomescope2 -> r-argparse -> python[version='2.7.*|3.4.*|3.5.*|3.6.*']
genomescope2 -> python[version='>=3.2|>=3.6']
The following specifications were found to be incompatible with your system:
- feature:/linux-64::__glibc==2.31=0
- feature:|@/linux-64::__glibc==2.31=0
Your installed version is: 2.31
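After some searching, this appears to be caused by a conflict between the conda-forge and defaults channels. Taking the packages from conda-forge (together with bioconda) instead of defaults solved the problem: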
conda install -c conda-forge -c bioconda jellyfish genomescope2
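If the conflict persists in an existing environment, a fresh environment that searches only conda-forge and bioconda is one way to sidestep it. The block below is a minimal sketch, not the method used above. The environment name genomescope-env and the function name are made up for illustration. The package name kmer-jellyfish is, as far as I know, what bioconda calls the k-mer counter (conda-forge also carries an unrelated Python package named jellyfish), so treat that name as an assumption and check that the jellyfish command works after installing.

```shell
# Sketch: install into a dedicated environment instead of the base one.
# ENV_NAME and create_genomescope_env are hypothetical names.
ENV_NAME="genomescope-env"

create_genomescope_env() {
    # --override-channels makes conda ignore defaults and any .condarc channels,
    # so only the two channels listed here are searched.
    conda create -y -n "$ENV_NAME" \
        --override-channels -c conda-forge -c bioconda \
        kmer-jellyfish genomescope2
}

# Usage (left commented out so that pasting the block installs nothing):
#   create_genomescope_env
#   conda activate "$ENV_NAME"
#   jellyfish --version && genomescope2 --version
```

Keeping the tools in their own environment also means a later solver conflict cannot disturb whatever else is installed in the base environment.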
Use jellyfish to count 21-mers (GenomeScope recommends k = 21, but other values can be tried):
jellyfish count -C -m 21 -s 1000000000 -t 10 *.fastq -o 21.reads.jf
jellyfish histo -t 10 21.reads.jf > reads.21.histo
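Before fitting the model, it can help to confirm that the histogram looks sane. The file written by jellyfish histo has two whitespace-separated columns: the k-mer coverage and the number of distinct k-mers seen at that coverage. The sketch below reports the coverage of the tallest peak. histo_peak is a hypothetical helper, and the default cutoff of 5 used to skip low-coverage error k-mers is an assumption that should be adjusted to the data.

```shell
# histo_peak FILE [MIN_COV]
# Print the coverage value with the highest count, ignoring rows whose
# coverage is below MIN_COV (default 5, an arbitrary cutoff for error k-mers).
histo_peak() {
    awk -v min="${2:-5}" '
        BEGIN { best = 0 }
        $1 + 0 >= min + 0 && $2 + 0 > best { best = $2 + 0; peak = $1 }
        END {
            if (best == 0) exit 1   # no usable rows
            print peak
        }
    ' "$1"
}

# Usage, with the file produced above:
#   histo_peak reads.21.histo
```

If the function prints nothing and returns a non-zero status, no row reached the cutoff, which usually means the file is empty or the sequencing depth is very low.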
Help output:
usage: /mnt/d/conda1/bin/genomescope2 [-h] [-v] [-i INPUT] [-o OUTPUT]
                                      [-p PLOIDY] [-k KMER_LENGTH]
                                      [-n NAME_PREFIX] [-l LAMBDA]
                                      [-m MAX_KMERCOV] [--verbose]
                                      [--no_unique_sequence] [-t TOPOLOGY]
                                      [--initial_repetitiveness INITIAL_REPETITIVENESS]
                                      [--initial_heterozygosities INITIAL_HETEROZYGOSITIES]
                                      [--transform_exp TRANSFORM_EXP]
                                      [--testing] [--true_params TRUE_PARAMS]
                                      [--trace_flag] [--num_rounds NUM_ROUNDS]

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         print the version and exit
  -i INPUT, --input INPUT
                        input histogram file
  -o OUTPUT, --output OUTPUT
                        output directory name
  -p PLOIDY, --ploidy PLOIDY
                        ploidy (1, 2, 3, 4, 5, or 6) for model to use [default
                        2]
  -k KMER_LENGTH, --kmer_length KMER_LENGTH
                        kmer length used to calculate kmer spectra [default
                        21]
  -n NAME_PREFIX, --name_prefix NAME_PREFIX
                        optional name_prefix for output files
  -l LAMBDA, --lambda LAMBDA, --kcov LAMBDA, --kmercov LAMBDA
                        optional initial kmercov estimate for model to use
  -m MAX_KMERCOV, --max_kmercov MAX_KMERCOV
                        optional maximum kmer coverage threshold (kmers with
                        coverage greater than max_kmercov are ignored by the
                        model)
  --verbose             optional flag to print messages during execution
  --no_unique_sequence  optional flag to turn off yellow unique sequence line
                        in plots
  -t TOPOLOGY, --topology TOPOLOGY
                        ADVANCED: flag for topology for model to use
  --initial_repetitiveness INITIAL_REPETITIVENESS
                        ADVANCED: flag to set initial value for repetitiveness
  --initial_heterozygosities INITIAL_HETEROZYGOSITIES
                        ADVANCED: flag to set initial values for nucleotide
                        heterozygosity rates
  --transform_exp TRANSFORM_EXP
                        ADVANCED: parameter for the exponent when fitting a
                        transformed (x**transform_exp*y vs. x) kmer histogram
                        [default 1]
  --testing             ADVANCED: flag to create testing.tsv file with model
                        parameters
  --true_params TRUE_PARAMS
                        ADVANCED: flag to state true simulated parameters for
                        testing mode
  --trace_flag          ADVANCED: flag to turn on printing of iteration
                        progress of nlsLM function
  --num_rounds NUM_ROUNDS
                        ADVANCED: parameter for the number of optimization
                        rounds
Based on this usage, run:
genomescope2 -i reads.21.histo -o 21 -k 21
The results are written to a directory named 21.
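To check the run without opening the plots, the text summary can be printed from the shell. This sketch assumes the output directory contains a file whose name ends in summary.txt. As far as I recall that is what GenomeScope 2.0 writes, but it is not shown in the help text above, so the function falls back to listing the directory when no such file is found. show_summary is a hypothetical helper.

```shell
# show_summary DIR
# Print any *summary.txt found in DIR; otherwise list what DIR contains.
show_summary() {
    dir="$1"
    found=0
    for f in "$dir"/*summary.txt; do
        [ -f "$f" ] || continue      # an unmatched glob stays literal, skip it
        found=1
        cat "$f"
    done
    if [ "$found" -eq 0 ]; then
        ls -l "$dir"
    fi
}

# Usage, with the output directory from the command above:
#   show_summary 21
```

The glob is deliberately loose so that a file carrying the -n name prefix is picked up as well.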
For more details, see the software's manual: GitHub - schatzlab/genomescope: Fast genome analysis from unassembled short reads