Genome estimation software: GenomeScope

Author: 王梓维 | Published: 2021-12-22 10:57

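    The web version of GenomeScope seems to be down: jobs stay in the "running" state and never return a result, so I installed it locally instead.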

    conda install -c bioconda genomescope2
    ### The following error is reported
    Package python conflicts for:
    python=3.6
    genomescope2 -> r-argparse -> python[version='2.7.*|3.4.*|3.5.*|3.6.*']
    genomescope2 -> python[version='>=3.2|>=3.6']
    The following specifications were found to be incompatible with your system:
    
      - feature:/linux-64::__glibc==2.31=0
      - feature:|@/linux-64::__glibc==2.31=0
    
    Your installed version is: 2.31
    

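    After some searching, the cause appears to be a conflict between the conda-forge and defaults channels. Installing with conda-forge (plus bioconda) instead of defaults solved the problem: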

    conda install -c conda-forge -c bioconda jellyfish genomescope2
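
Once the install finishes, it can be worth confirming that both tools are actually on PATH before moving on. The following is a minimal sketch; the helper name `missing_tools` is made up for this example and is not part of conda, jellyfish, or GenomeScope.

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH (one name per line).
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the two programs used in the rest of this post.
missing=$(missing_tools jellyfish genomescope2)
if [ -z "$missing" ]; then
    echo "jellyfish and genomescope2 are both on PATH"
else
    # $missing is left unquoted on purpose so the names print on one line.
    echo "not found on PATH:" $missing
fi
```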
    

    Use jellyfish to count 21-mers (GenomeScope recommends k = 21, but other values can be tried as well):

    jellyfish count -C -m 21 -s 1000000000 -t 10 *.fastq -o 21.reads.jf
    jellyfish histo -t 10 21.reads.jf > reads.21.histo
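
Before fitting the model, a quick look at the histogram can catch obvious problems such as a missing coverage peak. The sketch below is only a rough sanity check and not GenomeScope's model. It assumes the two-column histogram that `jellyfish histo` writes (coverage, number of distinct k-mers), treats everything before the first local minimum as sequencing errors, and assumes the highest remaining bin is the homozygous coverage peak, which may not hold for strongly heterozygous genomes. The function name `estimate_from_histo` is made up for this example.

```shell
# Hypothetical helper: naive genome size estimate from a k-mer histogram.
# Input: a file with two whitespace-separated columns (coverage, count).
estimate_from_histo() {
    awk '
        { cov[NR] = $1; n[NR] = $2 }
        END {
            # Valley: first bin whose count is not larger than the next bin.
            valley = 1
            for (i = 1; i < NR; i++) {
                if (n[i] <= n[i + 1]) { valley = i; break }
            }
            # Peak: highest bin at or after the valley.
            peak = valley
            for (i = valley; i <= NR; i++) if (n[i] > n[peak]) peak = i
            # Total k-mer occurrences from the valley onwards.
            total = 0
            for (i = valley; i <= NR; i++) total += cov[i] * n[i]
            # Naive size: total k-mers divided by the peak coverage.
            printf "valley=%d peak=%d size=%.0f\n", cov[valley], cov[peak], total / cov[peak]
        }
    ' "$1"
}
```

With the file produced above, the call would be `estimate_from_histo reads.21.histo`. If the reported size is far from the GenomeScope result, the peak assumption is the first thing to question.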
    

    Help output:

    usage: /mnt/d/conda1/bin/genomescope2 [-h] [-v] [-i INPUT] [-o OUTPUT]
                                          [-p PLOIDY] [-k KMER_LENGTH]
                                          [-n NAME_PREFIX] [-l LAMBDA]
                                          [-m MAX_KMERCOV] [--verbose]
                                          [--no_unique_sequence] [-t TOPOLOGY]
                                          [--initial_repetitiveness INITIAL_REPETITIVENESS]
                                          [--initial_heterozygosities INITIAL_HETEROZYGOSITIES]
                                          [--transform_exp TRANSFORM_EXP]
                                          [--testing] [--true_params TRUE_PARAMS]
                                          [--trace_flag] [--num_rounds NUM_ROUNDS]
    
    optional arguments:
      -h, --help            show this help message and exit
      -v, --version         print the version and exit
      -i INPUT, --input INPUT
                            input histogram file
      -o OUTPUT, --output OUTPUT
                            output directory name
      -p PLOIDY, --ploidy PLOIDY
                            ploidy (1, 2, 3, 4, 5, or 6) for model to use [default
                            2]
      -k KMER_LENGTH, --kmer_length KMER_LENGTH
                            kmer length used to calculate kmer spectra [default
                            21]
      -n NAME_PREFIX, --name_prefix NAME_PREFIX
                            optional name_prefix for output files
      -l LAMBDA, --lambda LAMBDA, --kcov LAMBDA, --kmercov LAMBDA
                            optional initial kmercov estimate for model to use
      -m MAX_KMERCOV, --max_kmercov MAX_KMERCOV
                            optional maximum kmer coverage threshold (kmers with
                            coverage greater than max_kmercov are ignored by the
                            model)
      --verbose             optional flag to print messages during execution
      --no_unique_sequence  optional flag to turn off yellow unique sequence line
                            in plots
      -t TOPOLOGY, --topology TOPOLOGY
                            ADVANCED: flag for topology for model to use
      --initial_repetitiveness INITIAL_REPETITIVENESS
                            ADVANCED: flag to set initial value for repetitiveness
      --initial_heterozygosities INITIAL_HETEROZYGOSITIES
                            ADVANCED: flag to set initial values for nucleotide
                            heterozygosity rates
      --transform_exp TRANSFORM_EXP
                            ADVANCED: parameter for the exponent when fitting a
                            transformed (x**transform_exp*y vs. x) kmer histogram
                            [default 1]
      --testing             ADVANCED: flag to create testing.tsv file with model
                            parameters
      --true_params TRUE_PARAMS
                            ADVANCED: flag to state true simulated parameters for
                            testing mode
      --trace_flag          ADVANCED: flag to turn on printing of iteration
                            progress of nlsLM function
      --num_rounds NUM_ROUNDS
                            ADVANCED: parameter for the number of optimization
                            rounds
    

    Based on this usage:

    genomescope2 -i reads.21.histo -o 21 -k 21
    

    The results are written to a directory named 21.
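
Trying other k-mer lengths means changing the value in three places at once: `-m` for `jellyfish count`, the file names, and `-k` for `genomescope2`. The sketch below prints the three commands from this post for a given k, so they can be reviewed first and then piped to `sh`. The function name `commands_for_k` and the values 17 and 31 are illustrative choices, not recommendations from the GenomeScope documentation.

```shell
# Hypothetical helper: print the commands for one k-mer length.
# $1 = k-mer length, $2 = thread count (defaults to 10, as used above).
commands_for_k() {
    k=$1
    threads=${2:-10}
    printf 'jellyfish count -C -m %s -s 1000000000 -t %s *.fastq -o %s.reads.jf\n' "$k" "$threads" "$k"
    printf 'jellyfish histo -t %s %s.reads.jf > reads.%s.histo\n' "$threads" "$k" "$k"
    printf 'genomescope2 -i reads.%s.histo -o %s -k %s\n' "$k" "$k" "$k"
}

# Dry run: show the commands for a few k values without executing them.
for k in 17 21 31; do
    commands_for_k "$k"
done
```

To actually run one of them, pipe the output to a shell, for example `commands_for_k 31 | sh`. Each k gets its own output directory, so earlier results are not overwritten.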

    For more details, see the software's manual: GitHub - schatzlab/genomescope: Fast genome analysis from unassembled short reads
