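I needed normalized data for an analysis. A student told me that the earlier workflow was to run DESeq2 on the samples pair by pair and then do the downstream analysis, which seemed too much trouble to me. After some exploring, I found that DESeq2 can normalize many samples in a single run.
The code is below. In my case there are 24 samples across 8 conditions; adjust it to fit your own data: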
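1. Load the required packages
rm(list=ls())  # clear the workspace (optional)
# install DESeq2 from Bioconductor if it is missing, then load it
if(!requireNamespace("DESeq2", quietly = TRUE)) BiocManager::install("DESeq2")
library(DESeq2)
library(rio)    # import()
library(dplyr)  # select()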
2. Import the data
data_all = import("exprpac.csv", header = TRUE)
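3. Select the sample columns to normalize and build database
# drop rows with NA in the sample columns from data_all itself,
# so that it keeps the same rows as the DESeq2 output when the two tables are combined later
keep = complete.cases(data_all[, 5:28])
data_all = data_all[keep, ]
database = data_all[, 5:28]
head(database)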
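4. Build the condition factor. Every label must be spelled exactly as it appears in levels. When I added symbols such as "_" to the names, I got: Error in DESeqDataSet(se, design = design, ignoreRank) : variables in design formula cannot contain NA: condition. That message means condition contains NA, which is what factor() returns for any value that is missing from levels.
# 3 replicates per condition, in the same order as the columns of database
groups = c("56AA", "56CK", "col0AA", "col0CK", "30AA", "30CK", "30mycAA", "30mycCK")
condition = factor(rep(groups, each = 3), levels = groups)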
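The NA error quoted above can be reproduced in base R without DESeq2. This is only a minimal sketch: the two labels and the misspelled "col0_AA" are made up for illustration and are not the real sample sheet.

```r
# Hypothetical labels for illustration only.
groups <- c("col0AA", "col0CK")

# Correct: every value appears in `levels`, so no NA is produced.
ok <- factor(rep(groups, each = 3), levels = groups)

# Mismatch: the values are spelled "col0_AA" but `levels` only contains
# "col0AA", so those three entries silently become NA.
bad <- factor(c(rep("col0_AA", 3), rep("col0CK", 3)), levels = groups)

anyNA(ok)    # FALSE
anyNA(bad)   # TRUE
table(ok)    # 3 samples per condition

# A cheap guard to run before building the DESeqDataSet.
stopifnot(!anyNA(ok))
```

Running a check like `anyNA(condition)` right after building the factor catches the problem before DESeq2 does.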
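5. Normalize with DESeq2
countdata = round(as.matrix(database))  # DESeq2 expects integer counts
coldata = data.frame(row.names = colnames(countdata), condition)
dds = DESeqDataSetFromMatrix(countdata, colData = coldata, design = ~ condition)
dds = DESeq(dds)
sizeFactors(dds)  # one size factor per sample
head(dds)
# with no contrast given, results() compares the last level of condition with the first
res <- results(dds)
# attach the normalized counts to the results table
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds, normalized=TRUE)), by="row.names", sort=FALSE)
head(resdata)
# put the original table and the DESeq2 output side by side (both must have the same rows in the same order)
merge_list <- data.frame(data_all, resdata)
head(merge_list)
resdata <- merge_list
head(resdata)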
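For readers wondering what `sizeFactors(dds)` holds: DESeq2's default normalization is the median-of-ratios method. The sketch below re-implements the idea in base R on a toy matrix. The matrix `toy` and the helper `median_ratio_size_factors` are made up for illustration, and this is a simplification of the idea, not DESeq2's actual code.

```r
# Toy counts: 4 genes x 3 samples. s2 is sequenced twice as deep as s1,
# s3 half as deep. These numbers are invented for the example.
toy <- cbind(s1 = c(10, 20, 40, 80),
             s2 = c(20, 40, 80, 160),
             s3 = c(5, 10, 20, 40))

median_ratio_size_factors <- function(counts) {
  # log of the per-gene geometric mean across samples
  # (-Inf for genes with a zero count in any sample)
  log_geo_means <- rowMeans(log(counts))
  apply(counts, 2, function(col) {
    # use only genes with a finite reference value and a positive count
    use <- is.finite(log_geo_means) & col > 0
    # size factor = median ratio of this sample to the reference
    exp(median(log(col[use]) - log_geo_means[use]))
  })
}

sf <- median_ratio_size_factors(toy)
sf  # s1 = 1, s2 = 2, s3 = 0.5 (up to floating point)

# Normalized counts: divide each column by its size factor.
normalized <- sweep(toy, 2, sf, "/")
normalized  # all three columns now match s1
```

If only the normalized counts are needed, `estimateSizeFactors(dds)` followed by `counts(dds, normalized = TRUE)` should be enough; the full `DESeq()` fit is only required for the differential-expression statistics.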
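6. Save the data. Because I replaced the raw counts with the normalized ones, I rebuilt the table layout.
# in my table, columns 1:4 and 29:51 come from the original file and 59:82 are the 24 normalized-count columns
# that replace the raw counts in 5:28; adjust the indices to your own table
nor <- dplyr::select(resdata, c(1:4, 59:82), c(29:51))
write.csv(nor, file = "DEseq2_tot_normalized.csv", row.names = FALSE)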
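*Note: at my current level, it looks like the downstream differential analysis can only be done pair by pair rather than generated in one go. If any expert knows a way to do it, advice is welcome!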
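On the note above: as far as I know, a single `DESeq()` fit on all samples is enough, and each pairwise comparison can then be pulled out with `results(dds, contrast = c("condition", A, B))`. The sketch below assumes the `dds` object returned by `DESeq(dds)` earlier is still in the session; the `pairs` table and the output file names are my own choices, not part of the original workflow.

```r
# The 8 condition labels used when building the factor.
groups <- c("56AA", "56CK", "col0AA", "col0CK",
            "30AA", "30CK", "30mycAA", "30mycCK")

# All unordered pairs of conditions, one pair per row.
pairs <- t(combn(groups, 2))
nrow(pairs)  # 28

# Only runs when DESeq2 is installed and a fitted `dds` exists (assumption).
if (requireNamespace("DESeq2", quietly = TRUE) && exists("dds")) {
  for (i in seq_len(nrow(pairs))) {
    # log2 fold change is reported as pairs[i, 1] over pairs[i, 2]
    res_i <- DESeq2::results(
      dds, contrast = c("condition", pairs[i, 1], pairs[i, 2]))
    # hypothetical file naming scheme
    out <- paste0("DESeq2_", pairs[i, 1], "_vs_", pairs[i, 2], ".csv")
    write.csv(as.data.frame(res_i), file = out)
  }
}
```

In practice only the biologically meaningful pairs (for example each AA against its matching CK) would be kept, by replacing `pairs` with a hand-written two-column matrix.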