Reordering row labels in pheatmap

Author: 周运来就是我 | Published: 2020-07-28 12:00

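    Heatmaps appear constantly in scientific papers. In an earlier post, "Heatmaps in single-cell data analysis", we covered the general rules for heatmaps fairly systematically. In practice, though, some details still cause trouble, such as the order of the labels.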

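    A good heatmap should reveal the patterns in the data; intuitively, that means it shows distinct blocks of color. Where do those blocks come from? They depend on the order of the rows and columns. A good heatmap most likely looks like this: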


    But if we change the order, the same data can look like this:

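    What matters to us is obtaining this order and then passing it to the plotting function. Let's explore this with the familiar pheatmap. First, generate some example data: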

    library(pheatmap)
    # Create test matrix
    test = matrix(rnorm(200), 20, 10)
    test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
    test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
    test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
    colnames(test) = paste("Test", 1:10, sep = "")
    
    rownames(test) = c(paste("CGene", 6:10, sep = ""),
                       paste("AGene", 1:5, sep = ""),
                       paste("BGene", 11:15, sep = ""),
                       paste("DGene", 16:20, sep = ""))
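
Because `rnorm()` draws new random numbers on every run, the values printed in this post cannot be reproduced exactly. A minimal sketch of a repeatable version follows, assuming you are free to fix a seed. The helper name `make_test_matrix` and the seed value 42 are illustrative choices, not part of the original post.

```r
# Hypothetical helper: rebuilds the example matrix from a fixed seed
make_test_matrix <- function(seed) {
  set.seed(seed)  # fix the random number stream so rnorm() is repeatable
  m <- matrix(rnorm(200), 20, 10)
  m[1:10, seq(1, 10, 2)] <- m[1:10, seq(1, 10, 2)] + 3
  m[11:20, seq(2, 10, 2)] <- m[11:20, seq(2, 10, 2)] + 2
  m[15:20, seq(2, 10, 2)] <- m[15:20, seq(2, 10, 2)] + 4
  colnames(m) <- paste0("Test", 1:10)
  rownames(m) <- c(paste0("CGene", 6:10), paste0("AGene", 1:5),
                   paste0("BGene", 11:15), paste0("DGene", 16:20))
  m
}

a <- make_test_matrix(42)  # 42 is an arbitrary seed
b <- make_test_matrix(42)
print(identical(a, b))     # same seed, same matrix
```

With a fixed seed, the clustering order discussed later is also the same on every run.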
    
    

    Take a look at the data:

    test
    
                  Test1       Test2        Test3       Test4       Test5         Test6
    CGene6   3.32676462 -2.16507595  4.232450403 -0.73583213  3.94062305 -0.1935842619
    CGene7   3.30040713  0.08865765  3.721572091  0.33449053  2.73292952  0.4583932832
    CGene8   1.50450030  0.64337406  3.407904162 -1.24057682  3.14174263 -0.0007014311
    CGene9   2.89088634 -0.55950670  2.060130582 -1.75583323  1.07926694  2.2556162284
    CGene10  1.90369857  1.40255666  1.760750107 -0.76906325  2.64811141 -0.6957942691
    AGene1   3.42061019  1.14950064  4.268530703  0.05037557  1.84633305  0.5137683525
    AGene2   2.88919835 -1.02100837  2.957415715 -1.09980021  3.67011986  0.7510053428
    AGene3   5.24239748  0.02736920  4.045355782 -0.08883342  4.06748687 -0.9685845021
    AGene4   2.19006433  1.37861550  2.337982108 -0.94394769  3.83553785 -0.8334859349
    AGene5   4.48235967 -1.48192686  5.028429364  0.15901242  3.49067895  0.5836504001
    BGene11  0.93281128  0.60297065  0.877725891  2.68570163 -0.52096014  0.5303119758
    BGene12 -0.82352032  4.13015350 -0.007314182  2.56230292 -1.22882126  2.0095278472
    BGene13  1.07999506  2.00713092  1.185458666  1.13050138  0.15584559  2.3795046412
    BGene14  1.11955349  2.84165755  0.220021162  1.63569739  0.99095614  3.3335572441
    BGene15  1.77628153  6.37128696  1.004835310  4.90696601  0.75322787  5.3301565398
    DGene16  0.04400472  6.33588183 -1.293469424  5.43806241  0.53726670  6.2000870073
    DGene17 -2.63598249  7.79111111 -0.204355079  6.85814507 -0.87600545  6.8738334335
    DGene18 -0.48197063  6.21941112  0.841207756  6.19352280  0.12741642  6.0838277426
    DGene19  0.96229006  5.79064015  1.319576057  7.18360581 -0.05522554  5.6089813401
    DGene20 -1.42032585  4.29067156  0.589306112  5.99965957  0.43606552  5.7949180143
                   Test7      Test8      Test9     Test10
    CGene6   4.138778102  1.6304399  1.4972186  0.6664516
    CGene7   4.205202621  0.7133720  1.3688061  0.4749147
    CGene8   3.675146838 -0.8371708  2.4173558 -0.8573423
    CGene9   0.911284470  0.6367740  1.8973446  0.5885573
    CGene10  2.381027675  2.0743930  3.6874262  1.1493406
    AGene1   3.416045270 -0.0662255  2.1358439 -1.3471116
    AGene2   4.091088541  0.2684579  3.6841199 -1.7729912
    AGene3   2.746024503  0.3570507  2.2417769 -0.1226907
    AGene4   2.734958681 -0.7147136  1.8119604 -0.9273917
    AGene5   2.131046458  0.5774685  4.1504215 -1.0478849
    BGene11  0.367833875  1.5309153 -1.0897623  3.3879448
    BGene12  0.003437035  1.1982992 -1.1184832  1.2544010
    BGene13  0.124903765  2.0180698 -1.1180846  4.0343573
    BGene14 -1.623291426  2.4192553 -1.3206414  0.7060437
    BGene15  0.576155533  7.4567201  1.3057335  5.6594995
    DGene16  0.542256420  5.8187826 -1.6232905  7.1829024
    DGene17 -0.711543153  7.1164359 -0.8563482  7.9621794
    DGene18  0.632542083  5.9143762 -0.9905354  7.6225081
    DGene19 -0.659880146  5.0144296 -0.5088869  4.9703428
    DGene20 -0.445718763  4.8705198 -1.5070905  6.2237708
    

    With the default parameters:

    p1 <- pheatmap(test, main = "pheatmap")
    

    Here the rows are drawn in the order produced by the clustering.
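
To make the idea of a "cluster order" concrete, here is a small base R sketch. It assumes pheatmap's documented defaults (`clustering_distance_rows = "euclidean"`, `clustering_method = "complete"`) and uses a toy matrix that is not part of the original example.

```r
# Toy matrix: rows a and c are close to each other, and so are rows b and d
m <- rbind(a = c(0, 0, 0),
           b = c(10, 10, 10),
           c = c(0.5, 0, 0),
           d = c(10, 10.5, 10))

# Euclidean distance between rows, then complete-linkage clustering
hc <- hclust(dist(m, method = "euclidean"), method = "complete")

# hc$order holds the row indices in dendrogram (leaf) order
clustered_names <- rownames(m)[hc$order]
print(clustered_names)
```

Similar rows end up next to each other in `clustered_names`, which is what produces visible blocks of color when the matrix is drawn in that order.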

    p2 <- pheatmap(test, cluster_rows = FALSE, main = "cluster_rows = FALSE")
    

    Without clustering, the row order is simply the order of the rows in the input matrix.

    Now let's sort the rows alphabetically by row name.

    p3 <- pheatmap(test[order(rownames(test)), ], cluster_rows = FALSE, main = 'test[order(rownames(test)),]\ncluster_rows = FALSE')
    

    The rows are now in alphabetical order.
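
Note that `order(rownames(test))` compares the names as character strings, so `CGene10` is placed before `CGene6`. If a numeric order within each prefix is wanted instead, one possible base R sketch is the following. The variable names `prefix`, `number` and `natural` are illustrative, not part of pheatmap.

```r
# Row names as in the example matrix
genes <- c(paste0("CGene", 6:10), paste0("AGene", 1:5),
           paste0("BGene", 11:15), paste0("DGene", 16:20))

# Plain string ordering: "CGene10" sorts before "CGene6"
alphabetical <- genes[order(genes)]

# Split each name into a letter prefix and a numeric suffix,
# then order by prefix first and by number second
prefix <- sub("[0-9]+$", "", genes)
number <- as.integer(sub("^[^0-9]*", "", genes))
natural <- genes[order(prefix, number)]

print(alphabetical[11:15])
print(natural[11:15])
```

The resulting vector can be used to index the matrix, for example `pheatmap(test[natural, ], cluster_rows = FALSE)`.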

    Sometimes we want to keep the clustering result but not display the dendrogram. What then?

    
    nr=rownames(test)[p1$tree_row[["order"]]]
    nr  # this order can also be passed to DoHeatmap
    
     [1] "DGene17" "DGene16" "DGene18" "BGene15" "DGene19" "DGene20" "BGene11" "BGene13"
     [9] "BGene12" "BGene14" "CGene6"  "AGene3"  "AGene5"  "CGene8"  "AGene4"  "AGene2" 
    [17] "CGene7"  "AGene1"  "CGene9"  "CGene10"
    
    
    nc=colnames(test)[p1$tree_col[["order"]]]
    
    
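    # Rows are fixed in the saved cluster order; columns are still clustered by default
    # (add cluster_cols = FALSE to drop the column dendrogram as well)
    p4 <- pheatmap(test[nr, nc], main = "pheatmap\nremove cluster label", cluster_rows = FALSE)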
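
An alternative that avoids reordering the matrix by hand is to keep clustering switched on and set the dendrogram height to zero with pheatmap's `treeheight_row` and `treeheight_col` arguments. The sketch below uses its own small matrix and an arbitrary seed, so it is an illustration rather than a reproduction of the figure above.

```r
library(pheatmap)

set.seed(1)  # arbitrary seed, only so the sketch is repeatable
m <- matrix(rnorm(200), 20, 10,
            dimnames = list(paste0("Gene", 1:20), paste0("Test", 1:10)))
m[1:10, seq(1, 10, 2)] <- m[1:10, seq(1, 10, 2)] + 3

# Cluster as usual, but give both dendrograms zero height so they are not drawn
p_hidden <- pheatmap(m, treeheight_row = 0, treeheight_col = 0,
                     main = "clustered order, no dendrograms")

# The hclust objects are still returned, so the order remains available
row_order <- rownames(m)[p_hidden$tree_row$order]
col_order <- colnames(m)[p_hidden$tree_col$order]
print(row_order)
```

Extracting the order explicitly, as in the post, is still the approach to use when the order has to be handed to a different plotting function.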
    

    Finally, we combine the four plots into one figure so that readers can compare them side by side.

    library(ggplotify)
    # Convert each pheatmap object to a ggplot so cowplot can arrange them
    p1 = as.ggplot(p1)
    p2 = as.ggplot(p2)
    p3 = as.ggplot(p3)
    p4 = as.ggplot(p4)
    p12 <- cowplot::plot_grid(p1, p2, labels = c('A', 'B'), align = 'h',
                              rel_widths = c(1, 1.3))
    p34 <- cowplot::plot_grid(p3, p4, labels = c('C', 'D'), align = 'h',
                              rel_widths = c(1, 1.3))

    # Stack the two rows into a single figure and draw it
    comb <- cowplot::plot_grid(p12, p34, ncol = 1,
                               rel_heights = c(1, 1))
    comb
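
Since `cowplot::plot_grid()` returns a ggplot object, the combined figure can also be written to a file. A minimal sketch follows, assuming ggplot2 is installed. The matrix, the object names and the output path under `tempdir()` are all illustrative.

```r
library(pheatmap)
library(ggplotify)

set.seed(1)  # arbitrary seed, only so the sketch is repeatable
m <- matrix(rnorm(60), 10, 6,
            dimnames = list(paste0("g", 1:10), paste0("s", 1:6)))

# silent = TRUE builds the heatmap without drawing it on the current device
h1 <- as.ggplot(pheatmap(m, silent = TRUE))
h2 <- as.ggplot(pheatmap(m, cluster_rows = FALSE, silent = TRUE))

panel <- cowplot::plot_grid(h1, h2, labels = c("A", "B"))

# Hypothetical output location; replace with your own path
out_file <- file.path(tempdir(), "heatmap_panel.png")
ggplot2::ggsave(out_file, panel, width = 10, height = 5, dpi = 300)
```

The width and height are in inches by default, so they may need adjusting to keep the row labels legible.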
    

    DoHeatmap clustering specific genes and not top x genes #2261
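    More interesting things about pheatmap
    How to remove the dendrogram from a heatmap while keeping the clustering order?
    [r<-ggplot2] Arranging plots in a grid with cowplot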
    Arranging plots in a grid
    https://github.com/satijalab/seurat/issues/2222
