Compiled by 探序基因肿瘤研究院
In R, suppose you want to put the Seurat objects A1.Seurat and A2.Seurat into a single list. You can do it like this:
seurat_list <- list()
seurat_list[["A1"]] <- A1.Seurat
seurat_list[["A2"]] <- A2.Seurat
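The same named list can also be built in a single call. The sketch below uses plain lists as stand-ins for A1.Seurat and A2.Seurat so that it runs without Seurat installed; with real Seurat objects the list construction is identical.

```r
# Stand-in objects (assumption: in practice these are Seurat objects)
A1.Seurat <- list(sample = "A1")
A2.Seurat <- list(sample = "A2")

# Build the named list in one call
seurat_list <- list(A1 = A1.Seurat, A2 = A2.Seurat)

# Or attach names to an unnamed list afterwards
seurat_list2 <- setNames(list(A1.Seurat, A2.Seurat), c("A1", "A2"))

# Both approaches give the same result
print(names(seurat_list))
print(identical(seurat_list, seurat_list2))
```

Naming the elements is optional for integration, but it makes it easier to tell the samples apart when looping over the list.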
1. First, put each sample's Seurat object into a list,
for example: SeuratList <- list(P1T.Seurat,P2T.Seurat,P3T.Seurat)
AllBatch.anchors <- FindIntegrationAnchors(object.list = SeuratList, dims = 1:15, k.filter = 80)
MerSeurat <- IntegrateData(anchorset = AllBatch.anchors, dims = 1:15)
DefaultAssay(MerSeurat) <- "integrated"
# RunPCA reads scale.data, which IntegrateData leaves empty, so scale first
MerSeurat <- ScaleData(MerSeurat, verbose = FALSE)
MerSeurat <- RunPCA(object = MerSeurat, npcs = 30, verbose = FALSE)
MerSeurat <- RunUMAP(object = MerSeurat, reduction = "pca", dims = 1:15)
MerSeurat <- FindNeighbors(object = MerSeurat, reduction = "pca", dims = 1:15)
MerSeurat <- FindClusters(MerSeurat, resolution = 1)  # tune the resolution as needed
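One way to check whether the batches were actually mixed is to draw the UMAP colored by sample next to the UMAP colored by cluster. The helper below is a hypothetical sketch, not part of Seurat; it assumes the sample label is stored in the orig.ident metadata column, which may differ in your objects.

```r
library(Seurat)

# Hypothetical helper: return two UMAP plots of an integrated object,
# one colored by sample and one colored by cluster.
plot_integration_check <- function(obj, sample_col = "orig.ident") {
  list(
    by_sample  = DimPlot(obj, reduction = "umap", group.by = sample_col),
    by_cluster = DimPlot(obj, reduction = "umap", label = TRUE)
  )
}

# Usage (assumes MerSeurat was produced by the steps above):
# plots <- plot_integration_check(MerSeurat)
# plots$by_sample
# plots$by_cluster
```

If the samples still form separate islands in the by_sample plot, the integration parameters (for example dims or k.filter) may need adjusting.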
2. Assemble the samples into a list of gene expression matrices
For example, store each sample in a list named ifnb.list
library(Seurat)
load("ifnb.list.RData")  # expected to restore the variable ifnb.list
features <- SelectIntegrationFeatures(object.list = ifnb.list)
ifnb.list <- lapply(X = ifnb.list, FUN = function(x) {
    x <- ScaleData(x, features = features, verbose = FALSE)
    x <- RunPCA(x, features = features, verbose = FALSE)
})
immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list, anchor.features = features)
immune.combined <- IntegrateData(anchorset = immune.anchors)
DefaultAssay(immune.combined) <- "integrated"
immune.combined <- ScaleData(immune.combined, verbose = FALSE)
immune.combined <- RunPCA(immune.combined, npcs = 30, verbose = FALSE)
immune.combined <- RunUMAP(immune.combined, reduction = "pca", dims = 1:15)
immune.combined <- FindNeighbors(immune.combined, reduction = "pca", dims = 1:15)
immune.combined <- FindClusters(immune.combined, resolution = 0.5)
save(immune.combined,file="AftRemBatch.RData")
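Inspecting the immune.combined variable: immune.combined@assays contains RNA and integrated. In immune.combined@assays$integrated@data the matrix has 2000 genes, which presumably are the features chosen by SelectIntegrationFeatures above; the expression values are both positive and negative. scale.data also holds only those 2000 genes, again with positive and negative values. counts is empty.
When FeaturePlot() is run to view a gene's expression distribution, it uses the data matrix of the integrated assay by default, because integrated is the active assay.
The Seurat object has an active.assay slot whose value is integrated. The cluster labels stored in active.ident should be the clusters computed after batch correction.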
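The observations above can be checked with accessor functions instead of reading the slots by hand. This is a minimal sketch with two hypothetical helpers (neither is part of Seurat); it assumes the Seurat v3/v4 style GetAssayData(..., slot = ) interface, which newer versions replace with a layer argument. The second helper shows one way to plot a gene from the RNA assay rather than from the integrated values.

```r
library(Seurat)

# Hypothetical helper: for every assay, report the size of the data matrix
# and whether it contains negative values, then show the active assay and
# the number of cells per cluster.
summarise_assays <- function(obj) {
  for (assay in Assays(obj)) {
    d <- GetAssayData(obj, assay = assay, slot = "data")
    cat(assay, ":", nrow(d), "genes x", ncol(d), "cells;",
        "negative values present:", min(d) < 0, "\n")
  }
  cat("Default assay:", DefaultAssay(obj), "\n")
  print(table(Idents(obj)))  # cells per cluster (active.ident)
}

# Hypothetical helper: plot a gene using the RNA assay. The assignment
# modifies only the local copy, so the caller's default assay is unchanged.
plot_rna_expression <- function(obj, gene) {
  DefaultAssay(obj) <- "RNA"
  FeaturePlot(obj, features = gene)
}

# Usage (the gene name is only an illustration):
# summarise_assays(immune.combined)
# plot_rna_expression(immune.combined, "CD3D")
```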
3. A closer look at the IntegrateData function
IntegrateData(
anchorset,
new.assay.name = "integrated",
normalization.method = c("LogNormalize", "SCT"),
features = NULL,
features.to.integrate = NULL,
dims = 1:30,
k.weight = 100,
weight.reduction = NULL,
sd.weight = 1,
sample.tree = NULL,
preserve.order = FALSE,
eps = 0,
verbose = TRUE
)
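From the function signature, the batch-corrected result is stored in seurat@assays$integrated (the default value of new.assay.name). There, counts is empty, while data and scale.data are populated (scale.data once ScaleData has been run on the integrated assay). Judging from the results above, both contain positive and negative values, and the genes appear to be only the selected highly variable genes.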
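If corrected values are needed for more genes than the selected features, the features.to.integrate argument in the signature above can be used. The following is a sketch under assumptions: the helper name is hypothetical, and it takes the genes shared by all objects in the list as the set to integrate. Integrating many genes can require considerably more memory and time.

```r
library(Seurat)

# Hypothetical helper: integrate every gene present in all input objects,
# instead of only the features used for anchor finding.
integrate_shared_genes <- function(anchorset, object.list, dims = 1:30) {
  # rownames() on a Seurat object returns the genes of its default assay
  shared_genes <- Reduce(intersect, lapply(object.list, rownames))
  IntegrateData(anchorset = anchorset,
                features.to.integrate = shared_genes,
                dims = dims)
}

# Usage (assumes immune.anchors and ifnb.list from step 2):
# immune.all <- integrate_shared_genes(immune.anchors, ifnb.list)
```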