scDAPA: detecting alternative polyadenylation (APA) from single-cell transcriptome data

Author: 周运来就是我 | Published: 2019-11-26 20:59

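    Polyadenylation (poly(A)) is an important modification step that takes place at the 3' end of a transcript during its maturation. Alternative polyadenylation (APA) is a widespread, fundamental regulatory mechanism in eukaryotes: it increases the complexity of the transcriptome and proteome, and it affects the function, stability, localization and translation efficiency of the target RNA. A poly(A) site marks the end of a transcript, so identifying it accurately underpins gene annotation and the study of transcriptional regulation. APA is tissue-specific and plays an important role in cell proliferation and differentiation.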

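    APA plays a key post-transcriptional regulatory role in mRNA stability and function in eukaryotes. Single-cell RNA-seq (scRNA-seq) is a powerful tool for uncovering cell-to-cell heterogeneity in gene expression. The most widely used platform, 10x scRNA-seq, builds 3'-enriched libraries, which makes it possible to study APA at single-cell resolution. So far, however, no computational tool has been available for profiling APA from scRNA-seq data.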

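    To fill this gap, Ye et al. developed scDAPA, a package for detecting and visualizing dynamic APA from scRNA-seq data. Taking bam/sam files and cell cluster labels as input, scDAPA detects APA dynamics with a histogram-based method and the Wilcoxon rank-sum test, and visualizes candidate genes that show dynamic APA. Benchmarking results show that scDAPA effectively identifies genes with dynamic APA among different cell groups in scRNA-seq data. The package is available at https://scdapa.sourceforge.io.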
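    The rank-sum step mentioned above can be illustrated with a small, dependency-free Python sketch. This is not scDAPA's own implementation: the function name `rank_sum_test`, the normal approximation with tie correction (and no continuity correction), and the toy 3' end positions are all assumptions made for illustration, so the p-values are not expected to match R's `wilcox.test` exactly.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Uses the normal approximation with tie correction and no continuity
    correction (an assumption of this sketch). Returns (U statistic of x, p).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one observation")
    n = n1 + n2
    # Pool the values and remember which group each one came from (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at index i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        # All values are identical, so there is no evidence of a shift.
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u, p


if __name__ == "__main__":
    # Toy 3' end positions (hypothetical) for one gene in two cell groups.
    group_a = [120, 130, 150, 160, 900]
    group_b = [880, 900, 910, 950, 960]
    u_stat, p_value = rank_sum_test(group_a, group_b)
    print(f"U = {u_stat}, p = {p_value:.4f}")
```

    A small p-value here only says that the 3' end positions of one group tend to sit upstream or downstream of the other group's; it does not by itself say how different the site usage is.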

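    I. Types of APA:

    (1) 3' UTR APA

    Most APA sites lie in the 3' UTR, a region that contains cis-acting elements. 3' UTR APA affects post-transcriptional gene regulation in many ways, including mRNA stability, mRNA nuclear export and localization, and localization of the encoded protein.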

    Figure 1. Schematic of 3' UTR APA [1]

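    (2) Upstream Region APA (UR-APA)

    UR-APA sites lie upstream of the last exon. UR-APA leads to alternative usage of the terminal exon, which changes both the coding sequence and the 3' UTR of the mRNA. Based on the splicing pattern around the polyadenylation site (PAS), UR-APA falls into two classes: skipped terminal exon and composite terminal exon. In the first class the terminal exon is skipped, whereas a composite terminal exon arises from the extension of an internal exon.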

    Figure 2. Schematic of UR-APA [1]
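
    The scDAPA shell scripts are then run on Cell Ranger output (here the pbmc5k example):

    # activate the conda environment without interference from PYTHONPATH
    unset PYTHONPATH
    source software/miniconda3/bin/activate software/miniconda3/envs/velocyto

    # extract reads from the BAM file using the cluster labels
    10X_RNA/Development/scDAPA/extractReads.sh -r 10X_RNA/Development/velocyto/example/CellRanger/pbmc5k/outs/possorted_genome_bam.bam -c 10X_RNA/Development/velocyto/example/CellRanger/pbmc5k/outs/analysis/clustering/kmeans_10_clusters/clusters.csv -o ./result

    # build the gene-level annotation from the reference GTF
    10X_RNA/Development/scDAPA/extractGenes.sh -i 10X_RNA/pipeline2.1/database/10X_Ref/refdata-cellranger-GRCh38-1.2.0/genes/genes.gtf -o hg38.gene.gff

    # put bedtools on PATH, then annotate the read 3' ends with genes
    export PATH=bedtools2/bin/:$PATH
    10X_RNA/Development/scDAPA/annotate3Ends.sh -d 10X_RNA/Development/scDAPA/example/result/ -g 10X_RNA/Development/scDAPA/example/hg38.gene.gff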
    
    
    Columns of the .anno files in the result directory:

    Column Name      Explanation
    seqname          The name of the sequence
    source           The program that generated this feature
    feature          The name of this type of feature
    start            The starting position of the feature in the sequence
    end              The ending position of the feature
    score            A score between 0 and 1000
    strand           Valid entries include "+", "-", or "."
    frame            If the feature is not a coding exon, the value should be "."
    gene             Gene ID and name
    start of read    The starting positions of reads annotated to this gene, separated by commas
    end of read      The ending positions of reads annotated to this gene, separated by commas
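    To make the column layout concrete, here is a minimal Python sketch that parses one such record. It assumes the file is tab-delimited with exactly the eleven columns listed above, in that order, which the post does not state explicitly. `AnnoRecord`, `parse_anno_line` and `three_prime_offsets` are hypothetical helper names, and the strand-aware definition of "distance to the gene's start site" is likewise an assumption.

```python
from typing import List, NamedTuple


class AnnoRecord(NamedTuple):
    seqname: str
    start: int
    end: int
    strand: str
    gene: str
    read_starts: List[int]
    read_ends: List[int]


def _to_ints(field: str) -> List[int]:
    """Split a comma-separated list of positions, ignoring empty items."""
    return [int(v) for v in field.split(",") if v]


def parse_anno_line(line: str) -> AnnoRecord:
    """Parse one record, assuming 11 tab-separated columns (see table above)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, got {len(fields)}")
    (seqname, _source, _feature, start, end,
     _score, strand, _frame, gene, read_starts, read_ends) = fields
    return AnnoRecord(seqname, int(start), int(end), strand, gene,
                      _to_ints(read_starts), _to_ints(read_ends))


def three_prime_offsets(rec: AnnoRecord) -> List[int]:
    """Distance of each read's 3' end from the gene's start site.

    Assumption: on the '-' strand the gene starts at `end` and a read's
    3' end is its leftmost coordinate; otherwise the gene starts at `start`
    and the 3' end is the read's rightmost coordinate.
    """
    if rec.strand == "-":
        return [rec.end - s for s in rec.read_starts]
    return [e - rec.start for e in rec.read_ends]


if __name__ == "__main__":
    # Made-up record for illustration only.
    example = "chr1\tsrc\tgene\t1000\t5000\t.\t+\t.\tGENE_ID:NAME\t1200,1300\t1290,1390"
    record = parse_anno_line(example)
    print(record.gene, three_prime_offsets(record))
```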

    Load the results above into the R package scDAPAminer:

    > library(scDAPAminer)
    > # create a folder named 'stat' for the output
    > dir.create('./stat', showWarnings = FALSE)
    > # 1. only compare two specific cell groups
    > scDAPAdetect(file1='./result/1.anno',file2='./result/2.anno',type='f2f',output_dir='./stat')
    > 
    > # 2. compare every two cell groups stored in the ./result directory
    > scDAPAdetect(dir='./result',type='d',output_dir='./stat',bin_size=100,count_cutoff=20)
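
    The directory mode above compares every two cell groups, so the number of comparisons grows as k(k-1)/2 for k groups (up to 45 for ten k-means clusters). The Python sketch below only enumerates those pairs. It assumes one `<cluster>.anno` file per cell group in the result directory, and `anno_pairs` is a hypothetical helper, not part of scDAPAminer.

```python
from itertools import combinations
from pathlib import Path


def anno_pairs(result_dir):
    """Return every unordered pair of .anno files found in result_dir."""
    files = sorted(Path(result_dir).glob("*.anno"))
    return list(combinations(files, 2))


if __name__ == "__main__":
    # Lists the pairwise comparisons implied by the files in ./result.
    for file_a, file_b in anno_pairs("./result"):
        print(f"{file_a.name} vs {file_b.name}")
```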
    
    The detection results contain the following columns:

    Column Name    Explanation
    chr            Name of the chromosome/scaffold
    gene           Gene ID and name
    meanlen1       Mean length of 3′ ends to gene's start site in cell group 1
    meanlen2       Mean length of 3′ ends to gene's start site in cell group 2
    SDD            Site distribution difference, SDD ∈ [0,1]
    p.value        Statistical test p values
    p.adjust       Adjusted p values
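
    The table states only that SDD lies in [0,1]. One quantity with that property is half the L1 distance between the two groups' normalized histograms of 3' end positions, sketched below in Python. Whether scDAPA uses exactly this formula is an assumption; the function names and toy positions are hypothetical, and `bin_size=100` simply mirrors the value passed to scDAPAdetect above.

```python
from collections import Counter


def site_histogram(positions, bin_size=100):
    """Count 3' end positions per fixed-width bin."""
    return Counter(p // bin_size for p in positions)


def site_distribution_difference(pos_a, pos_b, bin_size=100):
    """Half the L1 distance between two normalized histograms.

    0 means both groups have the same binned distribution of 3' ends,
    1 means they share no bin at all.
    """
    if not pos_a or not pos_b:
        raise ValueError("both groups need at least one 3' end position")
    hist_a = site_histogram(pos_a, bin_size)
    hist_b = site_histogram(pos_b, bin_size)
    bins = set(hist_a) | set(hist_b)
    # Counter returns 0 for bins that a group never hits.
    return 0.5 * sum(
        abs(hist_a[b] / len(pos_a) - hist_b[b] / len(pos_b)) for b in bins
    )


if __name__ == "__main__":
    # Toy distances of 3' ends from the gene start in two cell groups.
    cell_a = [10, 20, 150, 160]
    cell_b = [10, 250, 260, 270]
    print(site_distribution_difference(cell_a, cell_b))
```

    Unlike the rank-sum p-value, a measure of this kind does not depend on how many reads each group has, which is one reason to report an effect size next to the test result.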

    Visualize a candidate gene with scDAPAview:

    > library(ggplot2)  # provides labs() and theme() used below
    > # `gtf` must already hold the gene annotation; it is not defined in this post
    > dp = scDAPAview(files=c('./result/1.anno','./result/2.anno'),alt_names=c('cell_A','cell_B'),gtf=gtf,gene_id='ENSG00000160062',legend.position = c(0.2,0.8))
    > 
    > # customize colour theme
    > library(ggsci)
    > dp + scale_colour_aaas()
    > 
    > # customize legend title
    > dp + labs(colour = "Cell type")
    > 
    > # customize legend position
    > dp + theme(legend.position = c(0.6, 0.9))
    > 
    > # customize all of them simultaneously
    > dp + scale_colour_aaas() + labs(colour = "Cell type") + theme(legend.position = c(0.6, 0.9))
    
    


    [1] Tian B, Manley J L. Alternative polyadenylation of mRNA precursors[J]. Nature Reviews Molecular Cell Biology, 2017, 18(1):18-30.

    [2]Abdelghany S E, Hamilton M, Jacobi J L, et al. A survey of the sorghum transcriptome using single-molecule long reads[J]. Nature Communications, 2016, 7:11706.

    http://www.frasergen.com/cn/info_173.aspx?itemid=258

    Congting Ye, Qian Zhou, Xiaohui Wu, Chen Yu, Guoli Ji, Daniel R Saban, Qingshun Q Li, scDAPA: detection and visualization of dynamic alternative polyadenylation from single cell RNA-seq data, Bioinformatics, btz701, https://doi.org/10.1093/bioinformatics/btz701

    Applications of high-throughput sequencing technologies in alternative polyadenylation research
