Bacterial Genomes: Identifying Plasmid Sequences with PlasmidFinder

Author: 基因的生物信息学分析 | Published: 2019-08-12 21:22

    Once a bacterial genome has been sequenced, how do you find out whether it contains any plasmids?

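    Plasmids are widespread in nature. They have been reported in bacteria (including actinomycetes), filamentous fungi, macrofungi, yeasts and plants, and plasmid-like extrachromosomal DNA has been described even in human cells. By composition there are DNA plasmids and RNA plasmids; by topology there are linear plasmids and circular plasmids; and the phenotypes they confer are just as varied. Bacterial plasmids are the most commonly used vectors in genetic engineering.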

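    A plasmid is a DNA molecule that exists apart from the chromosome (or nucleoid) in bacteria, yeasts and other organisms. It normally resides in the cytoplasm (yeast is an exception: its 2 μm plasmid is found in the nucleus), replicates autonomously so that daughter cells maintain a stable copy number, and expresses the genetic information it carries. Most plasmids are closed circular double-stranded DNA. Plasmids are not essential for bacterial growth and reproduction; they can be lost spontaneously or eliminated (cured) by treatments such as elevated temperature or ultraviolet light. The genes they carry can give the host particular traits that help it survive under specific environmental conditions.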

    Like most bacterial chromosomes, a typical plasmid is circular double-stranded DNA (covalently closed circular DNA, cccDNA).

    Introduction to PlasmidFinder

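    PlasmidFinder identifies plasmid sequences in bacterial genome sequencing data by searching them against a manually curated database of plasmid replicons. A hit therefore shows that a known replicon is present; it does not by itself delimit the full plasmid sequence.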

    An online version is also available: https://cge.cbs.dtu.dk/services/PlasmidFinder/

    No installation is required: upload your sequences and the results come back quickly.

    Installing PlasmidFinder

    git clone https://bitbucket.org/genomicepidemiology/plasmidfinder.git
    cd plasmidfinder
    

    Downloading and installing the PlasmidFinder database

    # Clone database from git repository (develop branch)
    git clone https://bitbucket.org/genomicepidemiology/plasmidfinder_db.git
    cd plasmidfinder_db
    PLASMID_DB=$(pwd)
    # Install PlasmidFinder database with executable kma_index program
    python3 INSTALL.py kma_index
    

    If kma_index is not installed, see https://bitbucket.org/genomicepidemiology/kma and build KMA from source:

    git clone https://bitbucket.org/genomicepidemiology/kma.git
    cd kma && make
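
Before running INSTALL.py it can help to confirm that a kma_index executable is actually reachable. The following is a minimal Python sketch, not part of PlasmidFinder or KMA: `find_kma_index` is a hypothetical helper name, and the default `kma` directory is an assumption based on the clone-and-make step above (adjust it to wherever you built KMA).

```python
import os
import shutil


def find_kma_index(extra_dirs=("kma",)):
    """Return a path to an executable kma_index, or None if none is found.

    extra_dirs is an assumption: directories (besides PATH) where KMA
    may have been built, e.g. the folder created by `git clone ... kma.git`.
    """
    # First look on PATH, as the shell would.
    found = shutil.which("kma_index")
    if found:
        return found
    # Then look in the extra build directories.
    for directory in extra_dirs:
        candidate = os.path.join(directory, "kma_index")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.path.abspath(candidate)
    return None


path = find_kma_index()
print(path or "kma_index not found; build KMA with make first")
```

If a path is printed, it can be passed to `python3 INSTALL.py` in place of the bare `kma_index` name.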
    

    Using PlasmidFinder

    View the help text:

    $ python3 plasmidfinder.py  -h   
    usage: plasmidfinder.py [-h] [-i INFILE [INFILE ...]] [-o OUTDIR]
                            [-tmp TMP_DIR] [-mp METHOD_PATH] [-p DB_PATH]
                            [-d DATABASES] [-l MIN_COV] [-t THRESHOLD] [-x] [-q]
    
    optional arguments:
      -h, --help            show this help message and exit
      -i INFILE [INFILE ...], --infile INFILE [INFILE ...]
                            FASTA or FASTQ input files.
      -o OUTDIR, --outputPath OUTDIR
                            Path to blast output
      -tmp TMP_DIR, --tmp_dir TMP_DIR
                            Temporary directory for storage of the results from
                            the external software.
      -mp METHOD_PATH, --methodPath METHOD_PATH
                            Path to method to use (kma or blastn)
      -p DB_PATH, --databasePath DB_PATH
                            Path to the databases
      -d DATABASES, --databases DATABASES
                            Databases chosen to search in - if non is specified
                            all is used
      -l MIN_COV, --mincov MIN_COV
                            Minimum coverage
      -t THRESHOLD, --threshold THRESHOLD
                            Minimum threshold for identity
      -x, --extented_output
                            Give extented output with allignment files, template
                            and query hits in fasta and a tab seperated file with
                            allele profile results
      -q, --quiet
    

    Run the command:

    $ python3 plasmidfinder.py -i test/test.fsa -o testout/ -p plasmidfinder_db -x
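
When many assemblies need screening, it can be convenient to assemble this command in a script. Below is a minimal Python sketch that builds the same command line using only flags listed in the help text above. `build_plasmidfinder_cmd` is a hypothetical helper name, not part of PlasmidFinder, and the help text does not state the scale expected by `-l` and `-t`, so check that before passing values.

```python
import shlex


def build_plasmidfinder_cmd(infile, outdir, db_path,
                            min_cov=None, threshold=None, extended=False):
    """Build an argument list for plasmidfinder.py from flags in its -h text."""
    cmd = ["python3", "plasmidfinder.py",
           "-i", infile, "-o", outdir, "-p", db_path]
    if min_cov is not None:
        cmd += ["-l", str(min_cov)]    # -l / --mincov
    if threshold is not None:
        cmd += ["-t", str(threshold)]  # -t / --threshold
    if extended:
        cmd.append("-x")               # -x / --extented_output
    return cmd


cmd = build_plasmidfinder_cmd("test/test.fsa", "testout/",
                              "plasmidfinder_db", extended=True)
print(shlex.join(cmd))

# To actually run it (assumes you are inside the plasmidfinder folder):
#   import os, subprocess
#   os.makedirs("testout/", exist_ok=True)  # harmless if it already exists
#   subprocess.run(cmd, check=True)
```

Keeping the command as a list rather than a single string avoids shell-quoting problems with file names that contain spaces.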
    

    Inspect the output folder:

    $ ls testout
    data.json  Hit_in_genome_seq.fsa  Plasmid_seqs.fsa  results_tab.tsv  results.txt  tmp
    
    $ more testout/results.txt
    plasmidfinder Results
    
    Organism(s): Enterobacteriaceae,Gram Positive
    
    ****************************************************************************************
    Enterobacteriaceae
    **********************************************************************************************************************************
    Plasmid         Identity  Query / Template length    Contig                       Position in contig    Note    Accession number
    **********************************************************************************************************************************
    IncHI1B(R27)         100  540 / 540                  IncHI1B(R27)_1_R27_AF250878  1..540                R27     AF250878
    ==================================================================================================================================
    
    
    ****************************************************************************************
    Gram Positive
    ****************************************************************************************************************
    Plasmid    Identity    Query / Template length    Contig        Position in contig    Note    Accession number
    ****************************************************************************************************************
    -          -           -                          No hit found  -                     -       -
    ================================================================================================================
    
    
    
    
    Extended Output:
    
    # IncHI1B(R27)_AF250878
    template:   ATTCCAGAAAACCGATCTCTTTAAGCTGGCCCAGCGCCTTTTTAACCGTGGCATTCTGGT
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      ATTCCAGAAAACCGATCTCTTTAAGCTGGCCCAGCGCCTTTTTAACCGTGGCATTCTGGT
    
    template:   TACCGAGGTGTGATGACAGTTGGAGTCGTCCACGAAGCCGATCGAATCCGATGCGGTAAA
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      TACCGAGGTGTGATGACAGTTGGAGTCGTCCACGAAGCCGATCGAATCCGATGCGGTAAA
    
    template:   AGGTGCTCGGCAGCTCAGCCAGATACAGGTACAGGGCCTGTGCGGACTCCTTACGGGCCA
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      AGGTGCTCGGCAGCTCAGCCAGATACAGGTACAGGGCCTGTGCGGACTCCTTACGGGCCA
    
    template:   GTTTTTGCAATGTCTTCAGGTAGAGTCGGGTTTTACCGTCGACGCGATACAGCGTATTGA
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      GTTTTTGCAATGTCTTCAGGTAGAGTCGGGTTTTACCGTCGACGCGATACAGCGTATTGA
    
    template:   GCTTCGAATTTGGCTTGATGATGATTTTTCCCGTGGAACTGTCGTAATACGTCGATTCCA
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      GCTTCGAATTTGGCTTGATGATGATTTTTCCCGTGGAACTGTCGTAATACGTCGATTCCA
    
    template:   CCAGGTGCATGTTTATCGTTATCTGATCATCTGTACCGGGTATTTTCTTAATAAATGAAA
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      CCAGGTGCATGTTTATCGTTATCTGATCATCTGTACCGGGTATTTTCTTAATAAATGAAA
    
    template:   TGTTGGTCCGGGCTATACGCGTCAGCGAAGCATCAAAGCGCTCTTTCAGTTGTTTATCAA
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      TGTTGGTCCGGGCTATACGCGTCAGCGAAGCATCAAAGCGCTCTTTCAGTTGTTTATCAA
    
    template:   TGCGCTTGGTATCAAACCCACAAAATTTTGCAAACTCCGGAAAATTCAGCTCCAGCTGAC
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      TGCGCTTGGTATCAAACCCACAAAATTTTGCAAACTCCGGAAAATTCAGCTCCAGCTGAC
    
    template:   CTTCTGAATCAAGCGGCCGGTTAGACAACGCATAAACGATCCCACACCATGATTTGAAAT
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    query:      CTTCTGAATCAAGCGGCCGGTTAGACAACGCATAAACGATCCCACACCATGATTTGAAAT
    

    Reference: PlasmidFinder and pMLST: in silico detection and typing of plasmids. Carattoli A, Zankari E, Garcia-Fernandez A, Voldby Larsen M, Lund O, Villa L, Aarestrup FM, Hasman H. Antimicrob. Agents Chemother. 2014. April 28th.