!!!计算方法是以前初师兄根本文献扒的,后续把文章doi附上
!!!KEGG基因list,利用R可以很快扒下来,我后面再更新,百度一下你就知道
准备好转录组表达矩阵,KEGG基因list
step1:构建代谢基因表达矩阵
library(dplyr)
#读入表达矩阵
tpm <- read.csv("tpm.csv")
#读入代谢基因list
all <- read.csv("allgenelist.csv")
#构建代谢基因list表达矩阵
all_tpm <- all %>% left_join(y=tpm,by="gene_name")
![](https://img.haomeiwen.com/i550630/b648d2d528a64c99.png)
step2:标准化代谢基因表达矩阵并加权
###标准化,即按行求均值,
ave <- rowMeans(all_tpm)
all_tpm_ave <- all_tpm/ave
write.csv(allgene_tpm_ave,"allgene_tpm_ave.csv")
##输出后,用excel打开,在gene_name前加一列countif,统计该基因在代谢list中出现的次数,即权重
#psR里这一步我还没不知道怎么计数,excel简单些,后续在改
![](https://img.haomeiwen.com/i550630/6e87c6e2b482cd41.png)
step3:通路活性计算
![](https://img.haomeiwen.com/i550630/18055d5430ebd95d.png)
#读入上述加权后表达矩阵
data<- read.csv("allgene_tpm_ave.csv")
#根据初师兄的表选取通路的行号2523:2540行
Pantothenate_and_CoA_biosynthesis <- data[2523:2540,]
###计算score
pathway_step1 <- colSums(Pantothenate_and_CoA_biosynthesis[,4:227]/Pantothenate_and_CoA_biosynthesis$countif)
pathway_step2 <- colSums(Pantothenate_and_CoA_biosynthesis[4:227])
SCORE <- as.data.frame(t(pathway_step1/pathway_step2))
write.csv(SCORE,"SCORE.csv")
![](https://img.haomeiwen.com/i550630/a37b305d289cfd4b.png)
计算结果即每一个样品在选取通路下的score值,后续作图个性化分析了,把生物学重复求均值后,整理为下述形式,自行作图
![](https://img.haomeiwen.com/i550630/4ed5630b7dbb30e9.png)
![](https://img.haomeiwen.com/i550630/cc8772d81339af97.png)
网友评论