bulk RNA-Seq (7): Sample Correlation, Clustering, and PCA

Author: Bioinfor生信云 | Published: 2022-07-05 17:29

    Read the three tables: gene annotations, the expression matrix, and sample information

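    library(tidyverse)  # attaches readr, dplyr, and the %>% pipe
    # Gene annotations (an eggNOG-mapper style file, per its name); "#" lines are skipped.
    # read_delim() has no row.names argument, so the ID stays in column X1.
    gene_info <- read_delim("MM_6js01hvu.emapper.annotations.tsv",
                            delim = "\t", escape_double = FALSE,
                            col_names = FALSE, comment = "#", trim_ws = TRUE) %>%
      select(ID = X1,
             GO = X10,
             Ko = X12,
             pathway = X13,
             Gene_name = X9)
    # Expression matrix (TMM-normalized, per the file name): genes in rows, samples in columns
    gene_exp <- read.table('genes.TMM.EXPR.matrix', header = TRUE, row.names = 1)
    # Sample metadata: the first column (sample names) becomes the row names
    sample_info <- read.table(file = 'sample.txt', sep = "\t", header = TRUE, row.names = 1)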
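
    Later steps such as pca() from PCAtools expect the row names of the metadata to match the column names of the expression matrix. Below is a minimal sketch of that check on small made-up tables; toy_exp, toy_info, and the sample names are illustrative assumptions, not the real data.

```r
# Toy stand-ins for gene_exp and sample_info (hypothetical values)
toy_exp <- data.frame(
  S1 = c(10, 0, 5),
  S2 = c(12, 1, 4),
  S3 = c(3, 8, 9),
  row.names = c("geneA", "geneB", "geneC")
)
toy_info <- data.frame(
  group = c("treat", "ctrl", "ctrl"),
  row.names = c("S3", "S1", "S2")  # deliberately in a different order
)

# Samples present in both tables, in the order of the matrix columns
shared <- intersect(colnames(toy_exp), rownames(toy_info))

# Keep only shared samples and put both tables in the same order
toy_exp <- toy_exp[, shared, drop = FALSE]
toy_info <- toy_info[shared, , drop = FALSE]

# Stop early if the two tables still disagree
stopifnot(identical(colnames(toy_exp), rownames(toy_info)))
```

    With the real data, the same idea would apply to gene_exp and sample_info before calling pca().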
    

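    Sample correlation

    Correlation analysis
    R's cor() function computes correlation coefficients between variables. Applied to the expression matrix, it returns the pairwise correlation (Pearson by default) between the sample columns.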


    # Compute the sample-to-sample correlation matrix
    sample_cor <- cor(gene_exp)
    sample_cor1 <- round(sample_cor, digits = 2)
    # Plot the correlation heatmap
    library(pheatmap)
    pheatmap(sample_cor1, display_numbers = TRUE, fontsize = 10, angle_col = 45)
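
    Pearson correlation on raw expression values can be dominated by a few very highly expressed genes. As a hedged sketch of two common alternatives (log-transforming first, or using rank-based Spearman correlation), the example below uses a made-up matrix toy_exp, which is an assumption and not the article's data.

```r
# Hypothetical expression matrix: rows are genes, columns are samples.
# gene5 is an extreme value in every sample.
toy_exp <- matrix(
  c(1, 2, 3, 4, 1000,
    2, 4, 6, 8, 900,
    5, 3, 4, 1, 1100),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), c("S1", "S2", "S3"))
)

# Pearson correlation on the raw values
cor_raw <- cor(toy_exp)

# Pearson correlation after a log2(x + 1) transform
cor_log <- cor(log2(toy_exp + 1))

# Rank-based Spearman correlation
cor_spearman <- cor(toy_exp, method = "spearman")

print(round(cor_raw, 2))
print(round(cor_log, 2))
print(round(cor_spearman, 2))
```

    Comparing the three matrices shows how much a single extreme gene can inflate the raw Pearson values; which variant suits a given dataset is a judgment call.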
    

    Clustering dendrogram

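    # Transpose so that samples are rows; dist() defaults to Euclidean distance
    sample_dist <- dist(t(gene_exp))
    # Hierarchical clustering (complete linkage by default)
    sample_hc <- hclust(sample_dist)
    # Draw the dendrogram
    plot(sample_hc)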
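
    One possible variation, sketched here on simulated data, clusters on 1 - correlation instead of Euclidean distance and then cuts the tree into groups with cutree(). The matrix, the sample names, the average linkage, and k = 2 are all assumptions chosen for illustration.

```r
set.seed(1)

# Hypothetical matrix: 50 genes x 4 samples, two replicates per condition
base_a <- rnorm(50, mean = 5)
base_b <- rnorm(50, mean = 5)
toy_exp <- cbind(
  A1 = base_a + rnorm(50, sd = 0.1),
  A2 = base_a + rnorm(50, sd = 0.1),
  B1 = base_b + rnorm(50, sd = 0.1),
  B2 = base_b + rnorm(50, sd = 0.1)
)

# Turn the sample-to-sample correlation into a distance object
cor_dist <- as.dist(1 - cor(toy_exp))

# Hierarchical clustering with average linkage
toy_hc <- hclust(cor_dist, method = "average")

# Cut the tree into two clusters and show the assignment per sample
clusters <- cutree(toy_hc, k = 2)
print(clusters)
```

    In this toy setup the replicates of each condition should land in the same cluster; plot(toy_hc) would draw the corresponding dendrogram.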
    

    PCA

    
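    library(PCAtools)
    # Row names of sample_info must match the column names of gene_exp
    p <- pca(gene_exp, metadata = sample_info, removeVar = 0.1)  # drop the 10% least variable genes
    pca_loadings <- p$loadings  # contribution of each gene to PC1, PC2, PC3, PC4, ...
    pca_rotated <- p$rotated    # coordinates of each sample on each principal component
    screeplot(p)  # proportion of the variance explained by each principal component
    biplot(p,
           x = 'PC1',
           y = 'PC2',
           colby = 'group',   # column of sample_info used for color
           shape = 'shape',   # column of sample_info used for point shape
           legendPosition = 'right')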
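
    To make clearer what the scree plot and biplot summarize, here is a rough base-R sketch using prcomp() on simulated data. It is not a step-for-step reproduction of what PCAtools does internally; toy_exp, the sample names, and the simulated group effect are assumptions.

```r
set.seed(42)

# Hypothetical matrix: 100 genes x 6 samples (3 controls, 3 treated)
toy_exp <- matrix(
  rnorm(600, mean = 8),
  nrow = 100,
  dimnames = list(paste0("gene", 1:100),
                  c("ctrl1", "ctrl2", "ctrl3", "trt1", "trt2", "trt3"))
)

# Shift the first 20 genes in the treated samples to create a group effect
toy_exp[1:20, 4:6] <- toy_exp[1:20, 4:6] + 3

# prcomp() expects samples in rows, so transpose the matrix
toy_pca <- prcomp(t(toy_exp), center = TRUE, scale. = FALSE)

# Fraction of the total variance explained by each principal component
var_explained <- toy_pca$sdev^2 / sum(toy_pca$sdev^2)

# Sample coordinates on the first two principal components
scores <- toy_pca$x[, c("PC1", "PC2")]

print(round(var_explained, 3))
print(round(scores, 2))
```

    Here var_explained corresponds to what a scree plot displays, and scores to the points of a PC1 vs PC2 plot.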
    

    The data can be saved to a file in RData format (for example with save()) and loaded directly with load() next time.
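
    A minimal sketch of that save and reload round trip, using a temporary file and two made-up objects (toy_cor and toy_groups are hypothetical stand-ins for the real results):

```r
# Hypothetical objects standing in for the analysis results
toy_cor <- matrix(c(1, 0.9, 0.9, 1), nrow = 2,
                  dimnames = list(c("S1", "S2"), c("S1", "S2")))
toy_groups <- c(S1 = "ctrl", S2 = "treat")

# Save both objects to a temporary .RData file
rdata_file <- tempfile(fileext = ".RData")
save(toy_cor, toy_groups, file = rdata_file)

# Keep a copy for comparison, then remove the originals
saved_cor <- toy_cor
rm(toy_cor, toy_groups)

# load() restores the objects under their original names
# and returns those names as a character vector
restored <- load(rdata_file)
print(restored)
```

    In a real session the file path would be a fixed name in the project directory rather than a temporary file.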


        Article link: https://www.haomeiwen.com/subject/zkisbrtx.html