「scRNAseq」Normalization2

Author: 一路向前_莫问前程_前程似锦 | Published: 2020-05-08 17:44

Three main packages

1. Seurat

2. scran

3. scater (visualization only)

Seurat normalization method:

In the default normalization method in Seurat ("LogNormalize"), the count of each gene in a cell is divided by the total counts for that cell and multiplied by the scale factor 10 000. The result is then natural-log transformed using log1p, that is log(1 + x).
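The arithmetic above can be sketched in base R on a toy matrix. This is only an illustration of the formula, not Seurat's own code; the count values and the names `counts`, `lib_size` and `log_norm` are made up for the example.

```r
# Toy count matrix: rows are genes, columns are cells (values are made up).
counts <- matrix(c(0, 3, 7,
                   2, 0, 1,
                   8, 7, 2),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3)))

scale_factor <- 10000            # the default scale factor described above
lib_size <- colSums(counts)      # total counts per cell

# Divide every column by its library size, rescale, then apply log(1 + x).
log_norm <- log1p(sweep(counts, 2, lib_size, "/") * scale_factor)

# Undoing the log shows that every cell was rescaled to the same total.
colSums(expm1(log_norm))
```

Because every cell ends up with the same total before the log step, differences in sequencing depth between cells are removed, while zero counts stay at zero.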

Here we use the filtered data from the counts slot of the SCE object to create a Seurat object. After normalization, we convert the result back into a SingleCellExperiment object for comparing plots.

scran normalization method:

The normalization procedure in scran is based on the deconvolution method. Counts from many cells are pooled to avoid the drop-out problem, and a size factor is estimated for each pool. The pool-based size factors are then "deconvolved" into cell-based factors for cell-specific normalization. Clustering cells prior to normalization is not always necessary, but it improves normalization accuracy by reducing the number of DE genes between cells in the same cluster.
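To make the pooling and deconvolution idea concrete, here is a heavily simplified base R sketch on simulated data. It is not scran's implementation: the simulated counts, the two window sizes and all variable names are assumptions chosen for the example, and real scran adds further steps such as ordering cells by library size and pre-clustering.

```r
set.seed(1)
n_genes <- 1000
n_cells <- 20

# Simulated data: all values below are made up for illustration.
true_sf <- runif(n_cells, 0.5, 2)               # per-cell scaling bias
mu <- rgamma(n_genes, shape = 2, rate = 0.5)    # per-gene mean expression
counts <- matrix(rpois(n_genes * n_cells, lambda = outer(mu, true_sf)),
                 nrow = n_genes)

# Reference pseudo-cell: the average profile across all cells.
ref <- rowMeans(counts)
keep <- ref > 0

# Pools are sliding windows on a ring of cells, using two window sizes.
pool_rows <- list()
pool_sf <- numeric(0)
for (size in c(3, 7)) {
  for (start in seq_len(n_cells)) {
    members <- ((start - 1 + seq_len(size) - 1) %% n_cells) + 1
    indicator <- numeric(n_cells)
    indicator[members] <- 1
    pool_rows[[length(pool_rows) + 1]] <- indicator
    # Pool size factor: median ratio of pooled counts to the reference.
    pooled <- rowSums(counts[keep, members])
    pool_sf <- c(pool_sf, median(pooled / ref[keep]))
  }
}
A <- do.call(rbind, pool_rows)

# Each pool factor is modelled as the sum of its cells' factors, so solving
# A %*% sf = pool_sf in the least-squares sense "deconvolves" the pools.
est_sf <- as.vector(qr.solve(A, pool_sf))
est_sf <- est_sf / mean(est_sf)                 # centre the factors at 1

cor(est_sf, true_sf)
```

Pooling means each median ratio is computed on summed counts with far fewer zeros than any single cell, which is the motivation given above. The linear system then recovers one factor per cell from the overlapping pools.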

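library(scran)   # quickCluster(), computeSumFactors()
library(scater)  # normalize(), plotRLE()
qclust <- quickCluster(pbmc.sce)                            # coarse pre-clustering
pbmc.sce <- computeSumFactors(pbmc.sce, clusters = qclust)  # pooled, then deconvolved size factors
summary(sizeFactors(pbmc.sce))
pbmc.sce <- normalize(pbmc.sce)  # newer scater releases use logNormCounts() instead

Examine the results and compare to the log-normalized result. Are they different?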

# logcounts are already log-transformed, so exprs_logged is set to TRUE
plotRLE(pbmc.sce[,1:50], exprs_values = "logcounts", exprs_logged = TRUE,
        style = "full")

Feature selection: Seurat

The default method in Seurat 3 is "vst", which is based on a variance-stabilizing transformation. A trend is fitted to predict the variance of each gene as a function of its mean. Each gene's values are then standardized using the observed mean and the variance predicted by the trend, and the variance of the standardized values across all cells is used to rank the features. By default, the top 2000 genes are returned.
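The ranking idea can be sketched in base R on simulated counts. This only loosely mirrors the "vst" procedure described above and is not Seurat's code: the simulated data, the loess span, the boosted gene indices in `hv` and all variable names are assumptions for the example.

```r
set.seed(42)
n_genes <- 300
n_cells <- 200

# Simulated counts (made up): negative binomial with gene-specific means.
mu <- exp(seq(log(1), log(100), length.out = n_genes))
counts <- matrix(rnbinom(n_genes * n_cells, mu = rep(mu, n_cells), size = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
# Make five genes extra variable by boosting them in half of the cells.
hv <- c(20, 60, 100, 140, 180)
counts[hv, 1:100] <- counts[hv, 1:100] * 10

gene_mean <- rowMeans(counts)
gene_var <- apply(counts, 1, var)
ok <- gene_var > 0                     # the log of a zero variance is undefined

# Trend: expected log-variance as a smooth function of log-mean.
d <- data.frame(lm = log10(gene_mean[ok]), lv = log10(gene_var[ok]))
fit <- loess(lv ~ lm, data = d, span = 0.3)
expected_sd <- sqrt(10^fitted(fit))

# Standardize with the observed mean and the expected sd, clip large values,
# then rank genes by the variance of their standardized values.
z <- (counts[ok, ] - gene_mean[ok]) / expected_sd
z <- pmin(z, sqrt(n_cells))
std_var <- apply(z, 1, var)
top_genes <- names(sort(std_var, decreasing = TRUE))[1:10]
top_genes
```

Genes that follow the trend end up with a standardized variance near 1, so genes whose variance is well above what their mean predicts rise to the top of the ranking.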

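library(Seurat)
# Rank genes with the "vst" method; the top 2000 are kept by default
pbmc.seu <- FindVariableFeatures(pbmc.seu, selection.method = "vst")
top10 <- head(VariableFeatures(pbmc.seu), 10)
# Plot mean against standardized variance and label the ten top-ranked genes
vplot <- VariableFeaturePlot(pbmc.seu)
LabelPoints(plot = vplot, points = top10, repel = TRUE)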

Feature selection: scran

In the scran method for finding highly variable genes (HVGs), a trend is first fitted to the technical variances. In the absence of spike-ins, the trend is fitted to the whole data set, on the assumption that the majority of genes are not variably expressed. The biological component of the variance for each endogenous gene is then computed by subtracting the fitted value of the trend from the total variance.
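The subtraction step can be illustrated in base R on simulated log-expression values. This is a sketch of the idea only, not what scran does internally: the simulated data, the loess trend and the names `total_var`, `tech_var` and `bio_var` are assumptions for the example.

```r
set.seed(7)
n_genes <- 400
n_cells <- 100

# Simulated log-expression (made up): technical noise shrinks as the mean grows.
gene_mean <- runif(n_genes, 1, 8)
tech_sd <- 1 / sqrt(gene_mean)
logexpr <- matrix(rnorm(n_genes * n_cells, mean = rep(gene_mean, n_cells),
                        sd = rep(tech_sd, n_cells)),
                  nrow = n_genes)
# Add genuine cell-to-cell (biological) variation to the first ten genes.
logexpr[1:10, ] <- logexpr[1:10, ] +
  matrix(rnorm(10 * n_cells, sd = 3), nrow = 10)

obs_mean <- rowMeans(logexpr)
total_var <- apply(logexpr, 1, var)

# Trend fitted to all genes, assuming most of them are not variably expressed,
# so the fitted curve approximates the technical variance at each mean.
d <- data.frame(m = obs_mean, v = total_var)
fit <- loess(v ~ m, data = d, span = 0.5)
tech_var <- fitted(fit)

# Biological component = total variance minus fitted technical variance.
bio_var <- total_var - tech_var
top10 <- head(order(bio_var, decreasing = TRUE), 10)
top10
```

For most genes the biological component scatters around zero, which is why a significance test is applied afterwards rather than simply keeping every gene with a positive value.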

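# trendVar() and decomposeVar() come from older scran releases;
# newer releases replace them with modelGeneVar().
fit <- trendVar(pbmc.sce, use.spikes = NA)   # fit the mean-variance trend without relying on spike-ins
decomp <- decomposeVar(pbmc.sce, fit)        # split total variance into technical and biological parts
top.hvgs <- order(decomp$bio, decreasing = TRUE)
head(decomp[top.hvgs,])
# Total variance against mean, with the fitted technical trend drawn in red
plot(decomp$mean, decomp$total, xlab = "Mean log-expression", ylab = "Variance")
o <- order(decomp$mean)
lines(decomp$mean[o], decomp$tech[o], col = "red", lwd = 2)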

We choose genes that have a biological component that is significantly greater than zero, using a false discovery rate (FDR) of 5%.

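# Keep genes with FDR <= 5%, then sort by biological component
hvg.out <- decomp[which(decomp$FDR <= 0.05),]
hvg.out <- hvg.out[order(hvg.out$bio, decreasing=TRUE),]
# Expression of the ten genes with the largest biological component
plotExpression(pbmc.sce, features = rownames(hvg.out)[1:10])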
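The 5% FDR cut-off can be illustrated with base R's p.adjust(). The sketch below assumes the FDR column is a Benjamini-Hochberg adjusted p-value; the gene names and p-values are made up for the example.

```r
# Hypothetical p-values for six genes (made up for illustration).
p <- c(geneA = 0.001, geneB = 0.008, geneC = 0.020,
       geneD = 0.041, geneE = 0.300, geneF = 0.900)

# Benjamini-Hochberg adjustment controls the false discovery rate.
fdr <- p.adjust(p, method = "BH")

# Genes passing the 5% FDR threshold.
selected <- names(fdr)[fdr <= 0.05]
selected
```

Here geneD has a raw p-value below 0.05 but is not selected, because its adjusted value is above the threshold once the number of tests is taken into account.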
