美文网首页
空间转录组---seurat

空间转录组---seurat

作者: jjjscuedu | 来源:发表于2023-03-10 17:04 被阅读0次

    空间转录组是2022生命科学十大创新产品名单,因此将来会在生物学领域有非常大的应用空间,目前植物类的相关文章较少,我也是在慢慢的学习中。我们测试数据选取兰花的空转数据:Spatiotemporal atlas of organogenesis in development of orchid flowers(这篇文章我前面也分享过),与单细胞的数据结构基本一致,多了spatial这个文件夹,主要包含的就是切片信息和spot定位信息。

    测试数据可以根据作者paper提供的下载链接获取:

    https://osf.io/mqsb4/?view_only=787b3d5a81e54c27b3f33d5841b59a51

    今天,我们先尝试利用seurat来分析数据,后面我们尝试利用文章提到的STEEL也去分析一下,看下结果的差异。

    ========数据载入========

    #加载所需要的包

    library(Seurat)

    library(ggplot2)

    library(cowplot)

    library(dplyr)

    library(magrittr)

    library(gtools)

    library(stringr)

    library(Matrix)

    library(tidyverse)

    library(patchwork)

    #加载数据

    orc1<-Load10X_Spatial("Slide_1",filename="filtered_feature_bc_matrix.h5",assay="spatial")

    =====质量控制=====

    我们以前利用seurat进行质量控制的时候,可以利用小提琴图查看UMI,count等信息。因为多了位置信息,所以我们可以将这些变量体现在切片上,查看那些组织可能除了问题。

    dir.create("QC")

    #下面查看总UMI的分布情况

    plot1 <- VlnPlot(orc1, features = "nCount_spatial", pt.size = 0.1) + NoLegend()

    plot2 <- SpatialFeaturePlot(orc1, features = "nCount_spatial") + theme(legend.position = "right")

    pearplot <- plot_grid(plot1,plot2)

    ggsave("QC/Slide1_UMI.pdf", plot = pearplot, width = 12, height = 5)

    #下面查看总每个cell检测出来gene个数的分布情况

    plot1 <- VlnPlot(orc1, features = "nFeature_spatial", pt.size = 0.1) + NoLegend()

    plot2 <- SpatialFeaturePlot(orc1, features = "nFeature_spatial") + theme(legend.position = "right")

    pearplot <- plot_grid(plot1,plot2)

    ggsave("QC/Slide1_Feature.pdf", plot = pearplot, width = 12, height = 5)

    可以明显的看到,不管是UMI还是Feature,在spot中差异主要是有与切片组织的细胞数目以及细胞类型导致的。比如,在花瓣中可以看到非常少的UMI以及Feature,在中间的分生组织中,UMI和Feature的数目非常多。因此,单细胞的标准方法(如LogNormalize函数)可能会有问题,在空间转录组分析中,一般使用其他方法进行标准化。

    ========标准化=======

    目前空间转录组数据推荐使用sctransform,集合了单细胞标准化的NormalizeData(),ScaleData(),FindVariableFeatures()。

    dir.create("Normalize")

    orc1 <- SCTransform(orc1, assay = "spatial", return.only.var.genes = FALSE, verbose = FALSE)

    ## 比较SCTransform与log Normalization的区别,所以计算不同标准化的结果

    orc1 <- NormalizeData(orc1, verbose = FALSE, assay = "spatial")

    ## 计算UMI与Feature的相关性,分为5组

    orc1 <- GroupCorrelation(orc1, group.assay = "spatial", assay = "spatial", slot = "data", do.plot = FALSE)

    orc1 <- GroupCorrelation(orc1, group.assay = "spatial", assay = "SCT", slot = "scale.data", do.plot = FALSE)

    p1 <- GroupCorrelationPlot(orc1, assay = "spatial", cor = "nCount_spatial_cor") + ggtitle("Log Normalization") +theme(plot.title = element_text(hjust = 0.5))

    p2 <- GroupCorrelationPlot(orc1, assay = "SCT", cor = "nCount_spatial_cor") + ggtitle("SCTransform Normalization") +theme(plot.title = element_text(hjust = 0.5))

    p3 <- plot_grid(p1, p2)

    ggsave("Normalize/Normalization_cor_Slide1.pdf", plot = p3, width = 12, height = 5) 

    对于上面的箱形图,计算每个特征(基因)与UMIs数量(这里的nCount_spatial变量)的相关性。然后,根据基因的平均表达将它们分组,并生成这些相关性的箱形图。

    同时这里需要注意,SCT标准化后有个单独的assay(SCT),里面包含三个矩阵,分别为count,data以及scale.data,log标准化也有一个assay(Spatial),里面也包含三个矩阵,但是scale.data的数据为0。

    标准化之后进行高变基因的筛选,这与单细胞是一致的。

    top10 <- head(VariableFeatures(orc1), 10)

    plot1 <- VariableFeaturePlot(orc1)

    plot2 <- LabelPoints(plot = plot1, points = top10, repel = TRUE, size=2.5)

    ggsave("Normalize/VariableFeature_Slide1.pdf", plot = plot2, width = 9, height = 6)

    ======数据降维和聚类========

    由于文章中使用了STEEL方法,所以这次的分析只是随便看看,下个帖子我们会按照STEEL方法进行详细的测试。

    dir.create("cluster")

    ## PCA降维并提取主成分

    orc1 <- RunPCA(orc1, features = VariableFeatures(orc1))

    plot1 <- DimPlot(orc1, reduction = "pca", group.by="orig.ident")

    plot2 <- ElbowPlot(orc1, ndims=40, reduction="pca")

    pearplot <- plot_grid(plot1,plot2)

    ggsave("cluster/PCA_Slide1.pdf", plot = pearplot, width = 13, height = 6)

    pdf("cluster/PC_heatmap_Slide1.pdf",height = 12,width = 12)

    DimHeatmap(orc1, dims = 1:9, nfeatures=10, cells = 200, balanced = TRUE)

    dev.off()

    orc1 <- FindNeighbors(orc1, dims = 1:20)

    orc1 <- FindClusters(orc1, resolution = 0.9)   #大家可以根据自己数据选取最优的分辨率

    metadata <- orc1@meta.data

    table(orc1@meta.data$seurat_clusters)

    cell_cluster <- data.frame(cell_ID=rownames(metadata), cluster_ID=metadata$seurat_clusters)

    write.csv(cell_cluster,'cluster/cell_cluster_Slide1.csv',row.names = F)

    ## 非线性降维

    ## tsne

    orc1 <- RunTSNE(orc1, dims = 1:20)

    embed_tsne <- Embeddings(orc1, 'tsne')

    write.csv(embed_tsne,'cluster/embed_tsne_Slide1.csv')

    p1 <- DimPlot(orc1, reduction = "tsne", label = TRUE)

    p2 <- SpatialDimPlot(orc1, label.size = 3, pt.size.factor = 1.3)

    pearplot <- plot_grid(p1,p2)

    ggsave("cluster/tsne_Slide1.pdf", plot = pearplot, width = 15, height = 6)

    ## UMAP

    orc1 <- RunUMAP(orc1,dims = 1:20)

    embed_umap <- Embeddings(orc1, 'umap')

    write.csv(embed_umap,'cluster/embed_umap_Slide1.csv')

    p1 <- DimPlot(orc1, reduction = "umap", label = TRUE)

    p2 <- SpatialDimPlot(orc1, label = TRUE, label.size = 3, pt.size.factor = 1.3)

    pearplot <- plot_grid(p1,p2)

    ggsave("cluster/umap_Slide1.pdf", plot = pearplot, width = 15, height = 6)

    ========单个基因的展示========

    #可以采取不同的数据集,进行基因表达的展示

    dir.create("featureplot")

    p1 <- SpatialFeaturePlot(orc1, features = "PAXXG070630",slot="counts",pt.size.factor = 1.6,min.cutoff = 0.1)

    p2 <- SpatialFeaturePlot(orc1, features = "PAXXG070630",slot="data",pt.size.factor = 1.6,min.cutoff = 0.1)

    p3 <- SpatialFeaturePlot(orc1, features = "PAXXG070630",slot="scale.data",pt.size.factor = 1.6,min.cutoff = 0.1)

    pearplot <- plot_grid(p1,p2,p3,align = "v",axis = "b",ncol = 3)

    ggsave("featureplot/SCTSlide1.pdf", plot = pearplot, width = 18, height = 7)

    相关文章

      网友评论

          本文标题:空间转录组---seurat

          本文链接:https://www.haomeiwen.com/subject/pkltrdtx.html