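scRNA-seq data typically contain a large number of zeros, both real zeros (the gene is genuinely not transcribed) and dropout zeros introduced by technical limitations. Existing R packages for differential expression (DE) analysis cannot tell where a zero comes from. The R package DEsingle was developed specifically to address this problem: it separates DE genes into distinct types and, according to its authors, also improves accuracy.
DEsingle is built on the zero-inflated negative binomial (ZINB) model and detects genes that are differentially expressed between two specified groups of cells in a single-cell raw read counts matrix.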
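To make the two sources of zeros in a ZINB model concrete, the sketch below writes out a ZINB probability mass function in base R. The helper `dzinb` is hypothetical and not part of DEsingle. The parameter names `theta`, `size`, and `prob` are assumed to play the same roles as the DEsingle output columns of the same name, with `size` and `prob` following the convention of R's `dnbinom`; the numeric values are arbitrary.

```r
# Minimal sketch of a zero-inflated negative binomial (ZINB) pmf.
# dzinb() is a hypothetical helper, not a DEsingle function.
# theta: probability mass placed on a constant zero
# size, prob: negative binomial parameters as in stats::dnbinom()
dzinb <- function(x, theta, size, prob) {
  nb <- dnbinom(x, size = size, prob = prob)
  # A zero can come from the point mass or from the NB component
  ifelse(x == 0, theta + (1 - theta) * nb, (1 - theta) * nb)
}

# Example values chosen only for illustration
theta <- 0.3
size <- 2
prob <- 0.2

# Probability of observing a zero: constant zeros plus NB zeros
p_zero <- dzinb(0, theta, size, prob)

# The pmf should sum to (almost) 1 over a wide enough range of counts
total <- sum(dzinb(0:2000, theta, size, prob))

# Mean of the NB component under the dnbinom() parameterization
mu <- size * (1 - prob) / prob
```

With `theta = 0` the function reduces to the ordinary negative binomial, which is one way to see that the extra parameter only models the excess of zeros.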
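Input data format: a non-negative integer matrix of raw read counts from single-cell sequencing, or a SingleCellExperiment object (an SCE object, for example one handled with the scater package).
The package provides two functions, DEsingle() and DEtype(), used to detect DE genes and to classify them, respectively.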
>DEsingle(counts, group, parallel = FALSE, BPPARAM = bpparam())
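counts: a non-negative integer matrix of raw read counts from single-cell sequencing, or a SingleCellExperiment object containing such a matrix. Rows are genes and columns are samples or cells.
group: a vector or factor giving the group of each column of the counts matrix; it defines the two groups to be compared.
parallel: whether to use parallel computation. The default is FALSE (no parallel computation); if TRUE, the computation is parallelized with BiocParallel, using the settings given in BPPARAM.
BPPARAM: an optional BiocParallel parameter object, used when parallel = TRUE. If it is not specified, the default bpparam() is used.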
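The shape of the `counts` and `group` arguments can be sketched with a small simulated example in base R. The counts below are random numbers with no biological meaning, and the gene and cell names are made up for illustration; only the layout (genes in rows, cells in columns, one group label per column) reflects the description above.

```r
# Minimal sketch: assemble inputs in the shape DEsingle() expects.
# The counts are simulated and carry no biological meaning.
set.seed(1)
n_genes <- 5
n_cells <- 8

# Non-negative integer matrix: rows are genes, columns are cells
counts <- matrix(
  rnbinom(n_genes * n_cells, size = 1, mu = 3),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)

# One group label per column of counts, in column order
group <- factor(c(rep(1, 4), rep(2, 4)))

# Basic sanity checks before passing the objects to DEsingle()
stopifnot(
  is.matrix(counts),
  all(counts >= 0),
  all(counts == round(counts)),
  length(group) == ncol(counts),
  nlevels(group) == 2
)
```

Running checks like these first tends to catch the most common input mistake, a group vector whose length or order does not match the columns of the matrix.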
>DEtype(results, threshold)
results: the output of DEsingle()
threshold: the significance cutoff, a number between 0 and 1, applied to the adjusted p-values (FDR)
Output
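DEsingle() returns a data frame with genes as row names and at least the following five groups of columns:
1. Estimated ZINB parameters for group 1 and group 2: theta_1, theta_2, mu_1, mu_2, size_1, size_2, prob_1, prob_2
2. Mean read counts of the two groups and their fold change: total_mean_1, total_mean_2, foldChange
3. Mean normalized read counts of the two groups and their fold change: norm_total_mean_1, norm_total_mean_2, norm_foldChange
4. Chi-square statistic for the test of the null hypothesis: chi2LR1; p-values: pvalue, pvalue_LR2, pvalue_LR3; adjusted p-values: pvalue.adj.FDR, FDR_LR2, FDR_LR3
5. Remark: record of abnormal program information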
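As a rough illustration of how the p-value and adjusted p-value columns are typically used, the sketch below filters a toy data frame by FDR. The values are made up, and the assumption that pvalue.adj.FDR behaves like a Benjamini-Hochberg adjustment (base R `p.adjust` with `method = "fdr"`) is mine rather than something stated in the text.

```r
# Illustrative data frame with made-up p-values (not real DEsingle output)
res <- data.frame(
  pvalue = c(0.001, 0.01, 0.03, 0.5),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

# Assumption: the FDR column corresponds to a Benjamini-Hochberg adjustment
res$pvalue.adj.FDR <- p.adjust(res$pvalue, method = "fdr")

# Keep genes below a chosen FDR threshold, most significant first
sig <- res[res$pvalue.adj.FDR < 0.05, , drop = FALSE]
sig <- sig[order(sig$pvalue.adj.FDR), , drop = FALSE]
```

Using `drop = FALSE` keeps the result a data frame even when only one column or one row survives the filter.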
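The output of DEtype() has two additional columns:
Type: the type of DE gene.
DEs represents different expression status:
the proportion of real zeros differs significantly between the two groups of cells.
DEa represents differential expression abundance:
the gene is differentially expressed between the two groups of cells, but the proportion of real zeros does not differ significantly.
DEg represents general differential expression, which combines the two cases above:
the gene is differentially expressed between the two groups, and the proportion of real zeros also differs significantly.
State: whether the DE gene is up-regulated (up) or down-regulated (down).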
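Once genes carry a Type and a State, a quick summary is often the first thing to look at. The sketch below uses a made-up data frame that only mimics these two columns; the gene names and labels are invented for illustration and do not come from a real DEtype() run.

```r
# Illustrative classified results with made-up labels
classified <- data.frame(
  Type  = c("DEs", "DEa", "DEg", "DEa", "DEs"),
  State = c("up", "down", "up", "up", "down"),
  row.names = paste0("gene", 1:5),
  stringsAsFactors = FALSE
)

# Cross-tabulate DE type against direction of change
type_by_state <- table(classified$Type, classified$State)

# Collect gene names per DE type
genes_by_type <- split(rownames(classified), classified$Type)
```

The same two calls should work on a real classified result as long as it has Type and State columns, though genes without a DE type may need to be dropped first.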
Practice with the built-in dataset TestData
# Load the package and the test data
library(DEsingle)
data(TestData)
# Specify the two groups to be compared
# The sample number in group 1 and group 2 is 50 and 100 respectively
group <- factor(c(rep(1,50), rep(2,100)))
# Detect the differentially expressed genes
results <- DEsingle(counts = counts, group = group)
# Divide the differentially expressed genes into 3 categories
results.classified <- DEtype(results = results, threshold = 0.05)
# Extract the DE genes at an FDR threshold of 0.05
results.sig <- results.classified[results.classified$pvalue.adj.FDR < 0.05, ]
# Extract the three types of DE genes separately
results.DEs <- results.sig[results.sig$Type == "DEs", ]
results.DEa <- results.sig[results.sig$Type == "DEa", ]
results.DEg <- results.sig[results.sig$Type == "DEg", ]
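For larger datasets, the same detection step can be run in parallel through the parallel and BPPARAM arguments. The following is a sketch only: the choice of `SnowParam` and of two workers is an assumption made for illustration, and the block does nothing unless both DEsingle and BiocParallel are installed.

```r
# Sketch of a parallel run; executes only if both packages are installed
if (requireNamespace("DEsingle", quietly = TRUE) &&
    requireNamespace("BiocParallel", quietly = TRUE)) {
  library(DEsingle)
  library(BiocParallel)

  # Load the built-in test data, which provides the `counts` matrix
  data(TestData)
  group <- factor(c(rep(1, 50), rep(2, 100)))

  # SnowParam is a socket-based back end; the worker count is an arbitrary choice
  param <- SnowParam(workers = 2)

  # Same call as the serial run, with parallel computation switched on
  results <- DEsingle(counts = counts, group = group,
                      parallel = TRUE, BPPARAM = param)
}
```

The parallel run should produce the same kind of result as the serial call, so DEtype() can be applied to it in the same way.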
Ref: Miao Z, Deng K, Wang X, Zhang X. (2018) DEsingle for detecting three types of differential expression in single-cell RNA-seq data. Bioinformatics, 34(18): 3223-3224.