Preface

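In earlier posts of this series, Immugent introduced several features of the disgenet2r package, including the analysis of disease-associated genes and of disease-associated variants; interested readers can catch up through those earlier posts. All of those examples, however, dealt with a single entity at a time. In practice, sequencing rarely yields a single gene. It yields a set of genes that change between the untreated and treated conditions. Interpreting that gene set as a whole, and pulling out the most important signal, is the key to understanding the biological mechanism behind it. This need gave rise to the various gene-set enrichment methods, such as GO, KEGG and Reactome enrichment.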

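In this post, Immugent continues with further features of disgenet2r: running a disease enrichment analysis on a gene set, and feeding disease-associated genes into other workflows such as enrichment analysis.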

Hands-on code

Performing a disease enrichment

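library(disgenet2r)

# Genes associated with disease C0004352 in the GENOMICS_ENGLAND source
list_of_genes <- disease2gene(disease = "C0004352", database = "GENOMICS_ENGLAND")

# Keep only the gene symbols
list_of_genes <- list_of_genes@qresult$gene_symbol

# Disease enrichment of the gene list against the curated sources
res_enrich <- disease_enrichment(entities = list_of_genes, vocabulary = "HGNC",
                                 database = "CURATED")

# First 10 rows of the enrichment table
table1 <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")]

plot(res_enrich, class = "Enrichment", count = 3, cutoff = 0.05, nchars = 70)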

[Figure: enrichment plot produced by the plot() call above]
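To make the Ratio, BgRatio and FDR columns less of a black box, here is a small base-R sketch of an over-representation test for a single disease. As far as I understand the package documentation, disease_enrichment() runs a Fisher test per disease and corrects the p-values with the Benjamini-Hochberg method, but treat that as an assumption. All counts below are invented for illustration and do not come from DisGeNET, and the way the package defines its background is not reproduced here.

```r
# Toy numbers, invented purely for illustration (not DisGeNET output).
n_universe <- 20000  # genes in the background
n_disease  <- 200    # background genes annotated to the disease
n_list     <- 50     # genes in our list
n_overlap  <- 8      # genes in our list that are annotated to the disease

# 2 x 2 contingency table: list membership (rows) vs. disease annotation (columns)
contingency <- matrix(
  c(n_overlap,             n_list - n_overlap,
    n_disease - n_overlap, n_universe - n_list - n_disease + n_overlap),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("in_list", "not_in_list"),
                  c("in_disease", "not_in_disease"))
)

# One-sided Fisher test: is the disease over-represented in the list?
p_value <- fisher.test(contingency, alternative = "greater")$p.value

# The same quantity as a hypergeometric tail probability P(X >= n_overlap)
p_hyper <- phyper(n_overlap - 1, n_disease, n_universe - n_disease, n_list,
                  lower.tail = FALSE)

# Ratios written the way I assume the Ratio / BgRatio columns are laid out
ratio    <- paste0(n_overlap, "/", n_list)
bg_ratio <- paste0(n_disease, "/", n_universe)

# Benjamini-Hochberg correction across several diseases
# (the other two p-values are made up as well)
fdr <- p.adjust(c(p_value, 0.04, 0.20), method = "BH")
```

The point of the sketch is only the logic: a disease is enriched when the share of list genes annotated to it (Ratio) is clearly larger than its share in the background (BgRatio), and the FDR accounts for testing many diseases at once.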

# Same enrichment, this time against all sources instead of the curated ones
res_enrich <- disease_enrichment(entities = list_of_genes, vocabulary = "HGNC",
                                 database = "ALL")

table1 <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")]
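One small caveat about the [1:10, ...] indexing used for table1: if the result has fewer than 10 rows, R pads the table with NA rows. The sketch below shows the difference on a toy data frame that stands in for res_enrich@qresult. The values are invented, and I am only assuming the four column names seen in the code above. Sorting by FDR first is harmless if the table is already ordered.

```r
# Toy stand-in for res_enrich@qresult; the values are invented.
toy <- data.frame(
  Description = c("Disease A", "Disease B", "Disease C"),
  FDR         = c(0.20, 0.001, 0.03),
  Ratio       = c("2/50", "9/50", "5/50"),
  BgRatio     = c("300/20000", "150/20000", "220/20000"),
  stringsAsFactors = FALSE
)
cols <- c("Description", "FDR", "Ratio", "BgRatio")

# Indexing past the last row pads the result with NA rows
padded <- toy[1:10, cols]

# head() returns at most 10 rows; ordering by FDR puts the strongest hits first
top <- head(toy[order(toy$FDR), cols], 10)
```

With a long result table both forms give the same 10 rows, so this only matters for short gene lists or strict filters.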

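# Variants associated with the same disease, taken from ClinVar
list_of_variants <- disease2variant(disease = "C0004352", database = "CLINVAR")

# Keep the variant identifiers as plain character strings
list_of_variants <- as.character(list_of_variants@qresult$variantid)

# Disease enrichment on variants: the vocabulary switches from HGNC to DBSNP
res_enrich <- disease_enrichment(entities = list_of_variants, vocabulary = "DBSNP",
                                 database = "CURATED")

table1 <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")]

plot(res_enrich, class = "Enrichment", count = 6, cutoff = 0.05)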

[Figure: variant enrichment plot produced by the plot() call above]
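If you want the rows behind the plot as a table, a filter along the following lines should get close. This is a sketch under two assumptions that I have not verified against the package source: that count is a minimum number of matching genes or variants per disease and cutoff is an FDR threshold, and that the result table has a column named Count. The data frame is a toy with invented values.

```r
# Toy stand-in for an enrichment result; values and the "Count" column are assumptions.
toy <- data.frame(
  Description = c("Disease A", "Disease B", "Disease C", "Disease D"),
  FDR         = c(0.001, 0.04, 0.20, 0.01),
  Count       = c(12, 6, 30, 3),
  stringsAsFactors = FALSE
)

min_count  <- 6     # mirrors count = 6 in the plot() call
fdr_cutoff <- 0.05  # mirrors cutoff = 0.05

# Keep diseases that pass both thresholds, most significant first
shown <- toy[toy$Count >= min_count & toy$FDR < fdr_cutoff, ]
shown <- shown[order(shown$FDR), ]
```

Whether the package uses a strict or non-strict comparison for count is something to check in its documentation before relying on the exact row set.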

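Retrieving the pathways associated to a given disease

# The later sections load SPARQL before this family of functions, so load it here as well
library("SPARQL")
dis2path <- disease2pathway(disease = "C0018801",
                            database = "ALL", score = c(0.5, 1))
dis2path

# Uncomment to inspect the result table
# head(dis2path@qresult)
# qr <- extract(dis2path)
# head(qr, 3)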

Retrieving the diseases associated to a given pathway

path2dis <- pathway2disease( pathway = "WP1591" ,
                             database = "CURATED",
                             score = c(0.9, 1))
path2dis
head(path2dis@qresult)
qr <- extract(path2dis)
head(qr, 3)

Retrieve the drug targets for a disease

library("SPARQL")
dis2com <- disease2compound(
  disease = c("C0020538"),
  database = "CURATED" )
dis2com
qr <- extract(dis2com)
head(qr, 3)
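As far as I can tell, the functions in this part of the tutorial query a remote service through SPARQL, so a call can fail for reasons that have nothing to do with your code (network, service downtime). When several queries run in one script, it can help to wrap each call so a single failure does not stop everything. safe_query below is a hypothetical helper written for this post, not part of disgenet2r.

```r
# Hypothetical helper (not part of disgenet2r): run an expression and
# return NULL with a message instead of stopping the script on error.
safe_query <- function(expr, label = "query") {
  tryCatch(
    expr,
    error = function(e) {
      message(label, " failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Demonstration with plain R expressions
ok     <- safe_query(1 + 1, label = "toy sum")
failed <- safe_query(stop("service unavailable"), label = "toy failure")

# Intended use with the call from this section (left commented out here
# because it needs the package and network access):
# dis2com <- safe_query(
#   disease2compound(disease = c("C0020538"), database = "CURATED"),
#   label = "disease2compound"
# )
# if (!is.null(dis2com)) head(extract(dis2com), 3)
```

Because R evaluates function arguments lazily, the query only runs inside tryCatch(), which is what lets the helper catch its error.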

Retrieving Gene Ontology data for a gene

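library("SPARQL")

# GO biological processes for gene 351 (NCBI gene identifier)
gene2bp <- gene2biologicalprocess(gene = "351")
head(gene2bp@qresult)

# GO cellular components
gene2cc <- gene2cellcomponent(gene = "351")
head(gene2cc@qresult)

# GO molecular functions
gene2mf <- gene2molecularfunction(gene = "351")
head(gene2mf@qresult)

# Not a GO query: indications linked to gene 1588
gene2indi <- gene2indication(gene = "1588")
head(gene2indi@qresult)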

Retrieving Gene Ontology data for a disease

disease2bp <- disease2biologicalprocess(
  disease = "C0036341",
  database = "CURATED",
  score = c(0.6, 1)
)
disease2bp
disease2bp <- extract(disease2bp)
head(disease2bp[c("gene_product_label", "go_label")])

Retrieving the molecular functions of the genes associated to a disease

disease2mf <- disease2molecularfunction(
  disease = "C0002395",
  database = "UNIPROT"
)

disease2mf
disease2mf <- extract(disease2mf)
head(disease2mf[c("gene_product_label", "go_label")])
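Once the result is a plain data frame with gene_product_label and go_label columns, a natural next step is to ask which GO terms are shared by the most disease genes. The sketch below does this in base R on a toy table. The gene and term names are placeholders invented for the example, and real output will have more columns.

```r
# Toy stand-in for the extracted table; names are placeholders, not real annotations.
toy <- data.frame(
  gene_product_label = c("GENE1", "GENE1", "GENE2", "GENE3", "GENE3", "GENE1"),
  go_label           = c("GO term A", "GO term B", "GO term A",
                         "GO term A", "GO term C", "GO term A"),
  stringsAsFactors = FALSE
)

# Drop duplicated gene/term pairs so each gene is counted once per term
pairs <- unique(toy[c("gene_product_label", "go_label")])

# Number of distinct genes per GO term, most frequent first
term_counts <- sort(table(pairs$go_label), decreasing = TRUE)

# Which genes carry each term
genes_by_term <- split(pairs$gene_product_label, pairs$go_label)
```

This is only a frequency count, not an enrichment test, but it is a quick way to see which functions dominate a disease gene set before moving on to a formal analysis.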

Final words

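disgenet2r has a broad range of uses. In particular, as shown in this post, it can run a disease enrichment on a gene set of interest in much the same way as a classic GO/KEGG enrichment. Beyond that, it can select the genes associated with a disease for downstream pathway enrichment, or list the diseases associated with a given gene. In practice this lets us filter genes according to what the analysis needs, which is very convenient.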

That's all for now. The disgenet2r tutorial series is complete with this post, and I hope it helps you bring disgenet2r into your own analyses. See you next time~