Comparative genomics: installing and using the synteny tool JCVI


Author: PS小易同学 | Published: 2023-02-20 13:09


    Installing jcvi and its dependencies


    $ conda create -n jcvi -c bioconda -c conda-forge jcvi last

    Workflow

    1. Convert the GFF files to BED files

    $ python -m jcvi.formats.gff bed --type=mRNA --key=Parent Puya_raimondii.rep.gff3 -o Puya_raimondii.bed
    $ python -m jcvi.formats.gff bed --type=mRNA --key=Parent CB5.v20190123.re.gff3 -o CB5.bed
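    # Many genome annotation files contain several transcripts per gene. MCscan does not know that these transcripts belong to the same gene and treats them as separate genes that look like tandem duplicates. If there are too many transcripts, add --primary_only to the BED command above to keep only one transcript per gene.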
    $ python -m jcvi.formats.gff bed --type=mRNA --key=Name --primary_only Puya_raimondii.rep.gff3 -o Puya_raimondii.bed
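
The idea behind keeping one transcript per gene can be pictured with a small sketch. This is not jcvi's implementation. It assumes a hypothetical naming scheme in which a transcript ID is the gene ID plus a `.N` suffix (for example `geneA.1`), and it uses "longest span wins" as one plausible selection rule. The rows are toy data.

```python
def keep_longest_transcript(bed_rows):
    """Keep one BED row per gene: the transcript with the longest span."""
    best = {}
    for row in bed_rows:
        start, end, name = int(row[1]), int(row[2]), row[3]
        # Assumption: transcript IDs look like "<gene>.<n>"
        gene = name.rsplit(".", 1)[0]
        span = end - start
        if gene not in best or span > best[gene][0]:
            best[gene] = (span, row)
    # Return the surviving rows sorted by chromosome and start position
    return sorted((r for _, r in best.values()), key=lambda r: (r[0], int(r[1])))


# Toy BED rows: chrom, start, end, name, score, strand
rows = [
    ("chr1", "100", "500", "geneA.1", "0", "+"),
    ("chr1", "100", "900", "geneA.2", "0", "+"),
    ("chr1", "2000", "2600", "geneB.1", "0", "-"),
]
primary = keep_longest_transcript(rows)
for r in primary:
    print("\t".join(r))
```

With the toy rows, `geneA.1` is dropped because `geneA.2` spans more bases, so MCscan would see a single entry for geneA rather than two neighbouring "genes".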
    

    2. Prepare the CDS or protein sequence files

    # Here I extracted the CDS sequences directly with TBtools
    # Alternatively:
    $ python -m jcvi.formats.fasta format Puya_raimondii.cds.fa Puya_raimondii.cds
    $ python -m jcvi.formats.fasta format CB5.v20190123.cds.fa CB5.cds
    # Note: the sequence IDs in the CDS file must match the IDs in the .bed file. Adjust the value after --key= in "python -m jcvi.formats.gff bed --type=mRNA --key=Parent" as needed for your annotation.
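
Since mismatched IDs between the CDS file and the BED file are a common cause of failures later on, a quick consistency check can help before running the ortholog search. The sketch below is a hypothetical helper, not part of jcvi. It assumes the FASTA ID is the first word after `>` and the BED ID is the fourth tab-separated column. The example records are toy data.

```python
def fasta_ids(lines):
    """Collect sequence IDs: the first word after '>' on each header line."""
    return {
        ln[1:].split()[0]
        for ln in lines
        if ln.startswith(">") and len(ln.strip()) > 1
    }


def bed_ids(lines):
    """Collect the 4th column (feature name) from tab-separated BED lines."""
    ids = set()
    for ln in lines:
        fields = ln.rstrip("\n").split("\t")
        if len(fields) >= 4:
            ids.add(fields[3])
    return ids


def compare_ids(fasta_lines, bed_lines):
    """Return (IDs only in the FASTA, IDs only in the BED), both sorted."""
    f, b = fasta_ids(fasta_lines), bed_ids(bed_lines)
    return sorted(f - b), sorted(b - f)


# Toy inputs standing in for a .cds file and a .bed file
fasta = [">g1 some description\n", "ATGC\n", ">g2\n", "ATGG\n"]
bed = ["chr1\t10\t200\tg1\t0\t+\n", "chr1\t300\t800\tg3\t0\t-\n"]
only_fasta, only_bed = compare_ids(fasta, bed)
print("only in FASTA:", only_fasta)
print("only in BED:", only_bed)
```

For real data the same function accepts open file handles, for example `compare_ids(open("CB5.cds"), open("CB5.bed"))`. Two empty lists mean the IDs agree.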
    

    3. Search for syntenic blocks

    $ ls *.???
    
    $ python -m jcvi.compara.catalog ortholog CB5 Puya_raimondii --no_strip_names
    
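    The command above failed because lastal could not run with multiple threads, so no syntenic block information was produced.
    After trying many approaches, adding --cpu=1 finally solved the problem.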
    $ python -m jcvi.compara.catalog ortholog CB5 Puya_raimondii --no_strip_names --cpu=1
    

    4. Visualize synteny

    $ python -m jcvi.graphics.dotplot CB5.Puya_raimondii.anchors
    

    5. Visualize local synteny

    $ python -m jcvi.compara.synteny mcscan CB5.bed CB5.Puya_raimondii.lifted.anchors --iter=1 -o CB5.Puya_raimondii.i1.blocks
    
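    # Extract the local synteny rows you want to display from CB5.Puya_raimondii.i1.blocks
    # Here I extracted the region covering 21 genes upstream and 20 genes downstream of CB5.17G0008100, CB5.17G0008090, CB5.17G0008050 and CB5.17G0008110, and saved it as CB5.Puya_raimondii.block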
    CB5.17G0007820  PY_026228
    CB5.17G0007830  PY_026226
    CB5.17G0007840  PY_026225
    CB5.17G0007850  PY_026221
    CB5.04G0002850  .
    CB5.17G0007870  .
    CB5.17G0007880  PY_026226
    CB5.17G0007890  .
    CB5.17G0007900  .
    CB5.17G0007910  PY_026217
    CB5.05G0004350  .
    CB5.17G0007930  PY_026215
    CB5.17G0007940  PY_026214
    CB5.17G0007950  PY_026213
    CB5.17G0007960  PY_026213
    CB5.17G0007970  PY_026213
    CB5.17G0008000  PY_026212
    CB5.17G0008010  PY_026211
    CB5.17G0008020  PY_026209
    CB5.17G0008030  PY_026205
    CB5.17G0008110  PY_026205
    CB5.17G0008050  PY_026206
    CB5.17G0008060  PY_026207
    CB5.17G0008070  PY_026208
    CB5.17G0008080  PY_026207
    CB5.17G0008090  PY_026206
    CB5.17G0008100  PY_026205
    CB5.17G0008120  .
    CB5.17G0008130  PY_026204
    CB5.17G0008140  PY_026203
    CB5.17G0008150  PY_026203
    CB5.17G0008160  PY_026202
    CB5.17G0008170  .
    CB5.17G0008180  PY_026200
    CB5.17G0008190  PY_026199
    CB5.17G0008200  PY_026199
    CB5.17G0008210  PY_026198
    CB5.17G0008220  PY_026198
    CB5.17G0008230  .
    CB5.17G0008240  .
    CB5.17G0008250  PY_026197
    CB5.17G0008260  PY_026197
    CB5.17G0008270  .
    CB5.17G0008280  .
    CB5.17G0008290  PY_026196
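
Pulling such a window out of the .i1.blocks file by hand is error-prone, so the extraction can be sketched in a few lines. The helper below is hypothetical and not part of jcvi. It assumes the blocks file has already been read into (reference gene, partner gene) rows in file order, and it runs on toy rows.

```python
def extract_window(rows, targets, upstream, downstream):
    """Return the rows from `upstream` rows before the first target gene
    to `downstream` rows after the last target gene (clipped to the table)."""
    hits = [i for i, row in enumerate(rows) if row[0] in targets]
    if not hits:
        raise ValueError("none of the target genes were found")
    lo = max(0, hits[0] - upstream)
    hi = min(len(rows), hits[-1] + downstream + 1)
    return rows[lo:hi]


# Toy blocks table: (reference gene, partner gene or "." when there is no match)
rows = [(f"ref{i:02d}", "." if i % 3 == 0 else f"qry{i:02d}") for i in range(10)]
window = extract_window(rows, {"ref04", "ref05"}, upstream=2, downstream=1)
for ref, qry in window:
    print(f"{ref}\t{qry}")
```

For a real file, the rows could be built with `[line.split() for line in open("CB5.Puya_raimondii.i1.blocks")]` and the result written back out tab-separated.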
    
    # Prepare the layout file CB5.Puya_raimondii.layout with the following content
    # x,   y, rotation,   ha,     va,   color, ratio,            label
    0.5, 0.6,        0, left, center,       m,     1,       CB5 Chr17
    0.5, 0.4,        0, left, center, #fc8d62,     1, Puya_raimondii Scaffold4
    # edges
    e, 0, 1
    
    # Merge the .bed files
    $ cat CB5.bed Puya_raimondii.bed > CB5.Puya_raimondii.bed
    
    # Draw the local synteny plot
    $ python -m jcvi.graphics.synteny CB5.Puya_raimondii.block CB5.Puya_raimondii.bed CB5.Puya_raimondii.layout
    


    # Show labels for selected genes
    python -m jcvi.graphics.synteny CB5.Puya_raimondii.block CB5.Puya_raimondii.bed CB5.Puya_raimondii.layout --genelabelsize=4 --genelabels=CB5.17G0008100,CB5.17G0008090,CB5.17G0008050,CB5.17G0008110,PY_026206,PY_026205
    

    Next, repeat the steps above to plot the local synteny between CB5 and Acomosus

    $ python -m jcvi.formats.gff bed --type=mRNA --key=Parent Acomosus_321_v3.re.gene.gff3 -o Acomosus.bed
    $ python -m jcvi.formats.fasta format Acomosus_321_v3.re.gene.cds Acomosus.cds
    $ python -m jcvi.compara.catalog ortholog CB5 Acomosus --no_strip_names --cpu=1
    $ python -m jcvi.compara.synteny mcscan CB5.bed CB5.Acomosus.lifted.anchors --iter=1 -o CB5.Acomosus.i1.blocks
    $ cat CB5.bed Acomosus.bed >CB5_Acmosus.bed
    $ python -m jcvi.graphics.synteny CB5.Ac.blocks CB5_Acmosus.bed CB5.Ac.layout
    $ python -m jcvi.graphics.synteny CB5.Ac.blocks CB5_Acmosus.bed CB5.Ac.layout --genelabelsize=4 --genelabels=CB5.17G0008100,CB5.17G0008090,CB5.17G0008050,CB5.17G0008110,Aco023267.1,Aco023266.1,Aco023263.1,Aco023262.1
    

    Next, repeat the steps above to plot the local synteny between Puya_raimondii and rice

    $ python -m jcvi.formats.gff bed --type=mRNA --key=Parent Osativa_323_v7.0.re.gene.gff3 -o Osativa.bed
    $ python -m jcvi.formats.fasta format Osativa_323_v7.0.re.gene.cds Osativa.cds
    $ python -m jcvi.compara.catalog ortholog Puya_raimondii Osativa --no_strip_names --cpu=1
    $ python -m jcvi.compara.synteny mcscan Puya_raimondii.bed Puya_raimondii.Osativa.lifted.anchors --iter=1 -o Puya_raimondii.Osativa.i1.blocks
    $ cat Puya_raimondii.bed Osativa.bed >Puya_raimondii.Osativa.bed
    $ python -m jcvi.graphics.synteny Puya_raimondii.Osativa.blocks Puya_raimondii.Osativa.bed Puya_raimondii.Osativa.layout
    $ python -m jcvi.graphics.synteny Puya_raimondii.Osativa.blocks Puya_raimondii.Osativa.bed Puya_raimondii.Osativa.layout --genelabelsize=4 --genelabels=PY_026206,PY_026205
    

    Next, repeat the steps above to plot the local synteny between rice and banana

    $ python -m jcvi.formats.gff bed --type=mRNA --key=Parent Musa_acuminata_pahang_v4.re.gff3 -o Musa_acuminata.bed
    $ python -m jcvi.formats.fasta format Musa_acuminata_pahang_v4.gene.cds Musa_acuminata.cds
    $ python -m jcvi.compara.catalog ortholog Osativa Musa_acuminata --no_strip_names --cpu=1
    $ python -m jcvi.compara.synteny mcscan Osativa.bed Osativa.Musa_acuminata.lifted.anchors --iter=1 -o Osativa.Musa_acuminata.i1.blocks
    $ cat Osativa.bed Musa_acuminata.bed>Osativa.Musa_acuminata.bed
    $ python -m jcvi.graphics.synteny Osativa.Musa_acuminata.blocks Osativa.Musa_acuminata.bed Osativa.Musa_acuminata.layout
    # To label specific genes, append --genelabelsize=4 --genelabels=<comma-separated gene IDs from this block>
    

    2. Synteny among multiple genomes

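    For convenience, several synteny relationships can be shown in a single figure. As before, first build the synteny blocks for each pair with python -m jcvi.compara.synteny mcscan, then edit the blocks.layout file to describe the additional regions and the edges between them.

    This time I use Acomosus, CB5, MD2, PY, At, Musa, rice and Atr as the example. Acomosus (pineapple) serves as the reference for a blocks file covering all 8 genomes, so CB5, MD2, PY, At, Musa, rice and Atr are each compared against it.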

    2.1 Acomosus vs. MD2

    $ python -m jcvi.compara.catalog ortholog Acomosus ACMD2 --cscore=.99
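    # The command from the official documentation failed with an error saying that the listed IDs do not exist, and no result files were produced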
    


    # Fix: add --no_strip_names
    $ python -m jcvi.compara.catalog ortholog Acomosus ACMD2 --cpu=1 --no_strip_names --cscore=.99
    
    # Generate the blocks file
    $ python -m jcvi.compara.synteny mcscan Acomosus.bed Acomosus.ACMD2.lifted.anchors --iter=1 -o Acomosus.ACMD2.i1.blocks
    

    2.2 Acomosus vs. CB5

    # Find homologous gene pairs
    $ python -m jcvi.compara.catalog ortholog Acomosus CB5 --cpu=1 --no_strip_names --cscore=.99
    # Generate the blocks file from the homology relationships
    $ python -m jcvi.compara.synteny mcscan Acomosus.bed Acomosus.CB5.lifted.anchors --iter=1 -o Acomosus.CB5.i1.blocks
    

    2.3 Acomosus vs. PY

    # Find homologous gene pairs
    $ python -m jcvi.compara.catalog ortholog Acomosus Puya_raimondii --cpu=1 --no_strip_names --cscore=.99
    # Generate the blocks file from the homology relationships
    $ python -m jcvi.compara.synteny mcscan Acomosus.bed Acomosus.Puya_raimondii.lifted.anchors --iter=1 -o Acomosus.Puya_raimondii.i1.blocks
    

    2.4 Acomosus vs. At

    # Find homologous gene pairs
    $ python -m jcvi.compara.catalog ortholog Acomosus Athaliana --cpu=1 --no_strip_names --cscore=.99
    # Generate the blocks file from the homology relationships
    $ python -m jcvi.compara.synteny mcscan Acomosus.bed Acomosus.Athaliana.lifted.anchors --iter=1 -o Acomosus.Athaliana.i1.blocks
    

    2.5 Acomosus vs. Musa

    # Find homologous gene pairs
    $ python -m jcvi.compara.catalog ortholog Acomosus Musa_acuminata --cpu=1 --no_strip_names --cscore=.99
    # Generate the blocks file from the homology relationships
    $ python -m jcvi.compara.synteny mcscan Acomosus.bed Acomosus.Musa_acuminata.lifted.anchors --iter=1 -o Acomosus.Musa_acuminata.i1.blocks
    

    2.6 Acomosus vs. rice

    # Find homologous gene pairs
    $ python -m jcvi.compara.catalog ortholog Acomosus Osativa --cpu=1 --no_strip_names --cscore=.99
    # Generate the blocks file from the homology relationships
    $ python -m jcvi.compara.synteny mcscan Acomosus.bed Acomosus.Osativa.lifted.anchors --iter=1 -o Acomosus.Osativa.i1.blocks
    

    2.7 Acomosus vs. ATR

    # Find homologous gene pairs
    $ python -m jcvi.compara.catalog ortholog Acomosus Amborella_trichopoda --cpu=1 --no_strip_names --cscore=.99
    # Generate the blocks file from the homology relationships
    $ python -m jcvi.compara.synteny mcscan Acomosus.bed Acomosus.Amborella_trichopoda.lifted.anchors --iter=1 -o Acomosus.Amborella_trichopoda.i1.blocks
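
Sections 2.1 to 2.7 repeat the same two commands with only the query genome changed, so the command lines can be generated in a loop. The sketch below only builds and prints them, using the flags shown in the commands above. The function name is a hypothetical helper, and nothing is executed unless the commented `subprocess.run` line is enabled.

```python
def comparison_commands(reference, query, cscore=".99"):
    """Build the two command lines used for one reference/query pair."""
    ortholog = [
        "python", "-m", "jcvi.compara.catalog", "ortholog",
        reference, query, "--cpu=1", "--no_strip_names", f"--cscore={cscore}",
    ]
    mcscan = [
        "python", "-m", "jcvi.compara.synteny", "mcscan",
        f"{reference}.bed", f"{reference}.{query}.lifted.anchors",
        "--iter=1", "-o", f"{reference}.{query}.i1.blocks",
    ]
    return ortholog, mcscan


# The seven query genomes, named after their .bed/.cds file prefixes
queries = [
    "ACMD2", "CB5", "Puya_raimondii", "Athaliana",
    "Musa_acuminata", "Osativa", "Amborella_trichopoda",
]
for q in queries:
    for cmd in comparison_commands("Acomosus", q):
        print(" ".join(cmd))
        # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing each command as a list avoids shell quoting problems, and `check=True` would stop the loop at the first failing comparison.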
    

    2.8 Merge the blocks files from all comparisons

    $ python -m jcvi.formats.base join Acomosus.ACMD2.i1.blocks Acomosus.CB5.i1.blocks Acomosus.Puya_raimondii.i1.blocks Acomosus.Athaliana.i1.blocks Acomosus.Musa_acuminata.i1.blocks Acomosus.Osativa.i1.blocks Acomosus.Amborella_trichopoda.i1.blocks --noheader > Acomosus.ACMD2.CB5.Puya_raimondii.Athaliana.Musa_acuminata.Osativa.Amborella_trichopoda.blocks
    $ python -m jcvi.formats.base join Acomosus.ACMD2.i1.blocks Acomosus.CB5.i1.blocks Acomosus.Puya_raimondii.i1.blocks Acomosus.Athaliana.i1.blocks Acomosus.Musa_acuminata.i1.blocks Acomosus.Osativa.i1.blocks Acomosus.Amborella_trichopoda.i1.blocks --noheader | cut -f1,2,4,6,8,10,12,14 > Acomosus.blocks 
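
Each .i1.blocks file has two columns (reference gene, partner gene), so joining seven of them gives 14 columns with the reference gene repeated in the odd-numbered ones. The `cut -f1,2,4,6,8,10,12,14` step keeps the first reference column and the seven partner columns. The sketch below illustrates the shape of that result on toy tables. It is not jcvi's `join` implementation, and filling gaps with "." is an assumption of the sketch.

```python
def join_blocks(tables):
    """Join two-column block tables on the reference gene (first column).

    Each table is a list of (reference_gene, partner_gene) rows. The result
    has the reference gene once, followed by one partner column per table.
    """
    merged = {}
    order = []  # reference genes in order of first appearance
    for idx, table in enumerate(tables):
        for ref, partner in table:
            if ref not in merged:
                merged[ref] = ["."] * len(tables)
                order.append(ref)
            merged[ref][idx] = partner
    return [[ref] + merged[ref] for ref in order]


# Toy tables standing in for two .i1.blocks files with the same reference
t1 = [("a1", "b1"), ("a2", "."), ("a3", "b3")]
t2 = [("a1", "c1"), ("a2", "c2"), ("a3", ".")]
joined = join_blocks([t1, t2])
for row in joined:
    print("\t".join(row))
```

Every row ends up with one column per genome, which is the form the combined blocks file takes when one track per column is drawn.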
    

    2.9 Prepare the layout file

    # The file content is as follows
    # x,   y, rotation,     ha,     va, color, ratio,            label
    0.5, 0.6,        30, center,    top,      ,     20,       Acomosus LG02
    0.3, 0.4,        0, center, bottom,      ,     5, MD2 LSRQ01005221.1
    0.4, 0.4,        0, center, bottom,      ,     5, MD2 LSRQ01000111.1
    0.7, 0.4,        0, center, bottom,      ,     20, CB5 chr17
    0.5, 0.8,        0, center,    top,      ,    2, Puya_raimondii Scaffold4
    0.7, 0.8,        0, center, bottom,      ,    .2, Oryza_sativa Chr9
    0.3, 0.6,        90, center, bottom,      ,    10, Musa_acuminata chr08
    0.7, 0.6,        90, center, bottom,      ,    10, Musa_acuminata chr07
    0.3, 0.8,        0, center, bottom,      ,    5, Arabidopsis_thaliana Chr2
    0.4, 0.7,        0, center, bottom,      ,    .2, Amborella_trichopoda scaffold00024
    # edges
    e, 0, 1
    e, 0, 2
    e, 0, 3
    e, 0, 4
    e, 0, 5
    e, 0, 6
    e, 0, 7
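    # x and y give the position of each species' syntenic block; both values must lie between 0 and 1, otherwise no figure is produced. rotation is the angle by which the block is rotated.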
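
Because an out-of-range x or y silently breaks the figure, it can be worth checking a layout file before plotting. The checker below is a hypothetical helper, not part of jcvi. It assumes lines starting with `#` are comments, lines starting with `e,` are edges between 0-based track indices, and every other non-empty line is a track whose first two comma-separated fields are x and y. The example layout is deliberately broken in two places.

```python
def check_layout(text):
    """Return a list of problems found in a jcvi-style layout text."""
    problems, n_tracks, edges = [], 0, []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = [f.strip() for f in line.split(",")]
        if fields[0] == "e":
            edges.append((lineno, int(fields[1]), int(fields[2])))
            continue
        x, y = float(fields[0]), float(fields[1])
        if not (0 <= x <= 1 and 0 <= y <= 1):
            problems.append(f"line {lineno}: x={x}, y={y} outside 0-1")
        n_tracks += 1
    # Edges can only be checked once all tracks have been counted
    for lineno, a, b in edges:
        if min(a, b) < 0 or max(a, b) >= n_tracks:
            problems.append(f"line {lineno}: edge {a}-{b} refers to a missing track")
    return problems


# Deliberately broken example: y=1.4 is out of range and track 2 does not exist
layout = """\
# x,   y, rotation,   ha,     va,   color, ratio,            label
0.5, 0.6,        0, left, center,       m,     1,       CB5 Chr17
0.5, 1.4,        0, left, center, #fc8d62,     1, Puya_raimondii Scaffold4
# edges
e, 0, 1
e, 0, 2
"""
problems = check_layout(layout)
for p in problems:
    print(p)
```

An empty list means every track position is within range and every edge points at a track that exists.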
    

    2.10 Merge all the bed files

    $ cat Acomosus.bed ACMD2.bed CB5.bed Puya_raimondii.bed Athaliana.bed Musa_acuminata.bed Osativa.bed Amborella_trichopoda.bed > Acomosus.ACMD2.CB5.Puya_raimondii.Athaliana.Musa_acuminata.Osativa.Amborella_trichopoda.bed
    

    2.11 Draw the multi-species synteny plot

    $ python -m jcvi.graphics.synteny Acomosus.ACMD2.CB5.Puya_raimondii.Athaliana.Musa_acuminata.Osativa.Amborella_trichopoda.blocks Acomosus.ACMD2.CB5.Puya_raimondii.Athaliana.Musa_acuminata.Osativa.Amborella_trichopoda.bed Acomosus.ACMD2.CB5.Puya_raimondii.Athaliana.Musa_acuminata.Osativa.Amborella_trichopoda.blocks.layout
    
