Sample correlation analysis and sample clustering analysis

Author: 嗒嘀嗒嗒嘀嗒嘀嘀 | Published 2020-07-23 23:56

Load tidyverse

library(tidyverse)

Import the data (three tables)

samples	strain	stage	indicator1	indicator2	indicator3
BLO_S1_LD1	BLO	S1	3.5	3.0	40
BLO_S1_LD2	BLO	S1	3.8	3.2	48
BLO_S1_LD3	BLO	S1	3.0	3.0	50
BLO_S2_LD1	BLO	S2	9.5	13.0	90
BLO_S2_LD2	BLO	S2	9.8	13.2	88
BLO_S2_LD3	BLO	S2	10.0	13.0	90
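
As a rough sketch of the import step, the table above can be read as tab-separated text with base R's read.delim(). The text is inlined so the snippet runs on its own. The name sample_info and the English names of the last three columns are assumptions made for illustration; with real data, a file path would replace the text argument.

```r
# Sample table from above, inlined as tab-separated text.
# Assumption: in practice this would be a file on disk, e.g. read.delim("some_file.txt").
sample_txt <- paste(
  "samples\tstrain\tstage\tindicator1\tindicator2\tindicator3",
  "BLO_S1_LD1\tBLO\tS1\t3.5\t3.0\t40",
  "BLO_S1_LD2\tBLO\tS1\t3.8\t3.2\t48",
  "BLO_S1_LD3\tBLO\tS1\t3.0\t3.0\t50",
  "BLO_S2_LD1\tBLO\tS2\t9.5\t13.0\t90",
  "BLO_S2_LD2\tBLO\tS2\t9.8\t13.2\t88",
  "BLO_S2_LD3\tBLO\tS2\t10.0\t13.0\t90",
  sep = "\n"
)

# read.delim() assumes a header row and tab separators by default
sample_info <- read.delim(text = sample_txt, stringsAsFactors = FALSE)

# Inspect column types: the first three columns are character, the rest numeric
str(sample_info)
```

readr::read_tsv() from tidyverse would do the same job and return a tibble; base R is used here only to keep the sketch dependency-free.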

cor(gene_exp) # compute correlations between the columns (samples) of gene_exp

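Types of correlation coefficient
- Pearson correlation coefficient (pearson)
  Measures linear correlation
- Spearman correlation coefficient (spearman)
  Rank correlation
- Kendall correlation coefficient (kendall)
  Also a rank correlation; suited to ordinal (ordered categorical) or discrete variables with many ties
Examples
- To compute the correlation between two genes, use the Pearson coefficient
- For genes associated with tumor stage, the stages are ordered ranks, so use the Spearman coefficient
- For genes associated with sex (coded as a two-level variable), use the Kendall coefficient
For correlations between samples, the Pearson coefficient is sufficient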
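
To make the difference between the three coefficients concrete, here is a minimal sketch on made-up numbers (x and y are illustrative, not from the data above). The relationship is perfectly monotonic but not linear, so the two rank-based coefficients reach 1 while Pearson does not.

```r
# Toy data (made up for illustration): y increases monotonically but not linearly with x
x <- 1:10
y <- x^3

pearson  <- cor(x, y, method = "pearson")   # sensitive to the curvature, so below 1
spearman <- cor(x, y, method = "spearman")  # depends only on ranks, so equal to 1
kendall  <- cor(x, y, method = "kendall")   # every pair is concordant, so equal to 1

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```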
Commands
sample_cor <- round(cor(gene_exp), digits = 2) # round() keeps two decimal places
sample_cor <- round(cor(gene_exp, method = 'spearman'), digits = 2) # alternatively, pick the coefficient with method
library(pheatmap)
pheatmap(sample_cor)
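
The commands above assume gene_exp has genes in rows and samples in columns, because cor() correlates columns. The sketch below checks this on a made-up matrix of random numbers (the 50 genes and their values are placeholders, only the sample names come from the table earlier). The heatmap is drawn only if pheatmap is installed.

```r
# Made-up expression matrix: 50 genes (rows) x 6 samples (columns)
set.seed(1)
samples <- c("BLO_S1_LD1", "BLO_S1_LD2", "BLO_S1_LD3",
             "BLO_S2_LD1", "BLO_S2_LD2", "BLO_S2_LD3")
gene_exp <- matrix(rnorm(50 * 6, mean = 10), nrow = 50,
                   dimnames = list(paste0("gene", 1:50), samples))

# cor() works column by column, so the result is samples x samples
sample_cor <- round(cor(gene_exp), digits = 2)
print(dim(sample_cor))

# Draw the heatmap only when the pheatmap package is available
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(sample_cor, display_numbers = TRUE)
}
```

If the matrix were stored with samples in rows, cor(t(gene_exp)) would be needed instead.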

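Step 1: compute the distance matrix
The distance between every pair of samples has to be computed
- sample_dist <- dist(t(gene_exp)) # dist() computes distances between rows, so the expression matrix must be transposed first; t() transposes
- "euclidean": Euclidean distance, the most commonly used (and the default)
- "maximum": the largest absolute difference along any single coordinate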
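
A small worked example may help here. It uses two made-up samples measured on two genes (toy_exp and its values are illustrative only) and shows both why the transpose matters and how the two distance methods differ.

```r
# Two made-up samples (columns) measured on two genes (rows).
# matrix() fills column by column: sampleA = (0, 0), sampleB = (3, 4)
toy_exp <- matrix(c(0, 0,
                    3, 4),
                  nrow = 2,
                  dimnames = list(c("gene1", "gene2"), c("sampleA", "sampleB")))

# Distances between samples: transpose so that samples become rows
d_euc <- dist(t(toy_exp))                      # sqrt(3^2 + 4^2) = 5
d_max <- dist(t(toy_exp), method = "maximum")  # max(|3|, |4|) = 4

# Without the transpose, dist() compares genes instead of samples
d_genes <- dist(toy_exp)                       # gene1 = (0, 3), gene2 = (0, 4), distance 1

print(d_euc)
print(d_max)
print(d_genes)
```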
Step 2: clustering
Hierarchical clustering
sample_hc <- hclust(sample_dist)
plot(sample_hc)
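
The two steps can be tried end to end on a made-up matrix, as sketched below. The data are random numbers with the S2 samples shifted upward so that the two stages separate; cutree() is not part of the original commands and is added here only as one possible way to read group labels off the tree.

```r
# Made-up data: 20 genes x 6 samples, with the S2 samples shifted by +5
set.seed(42)
samples <- c("BLO_S1_LD1", "BLO_S1_LD2", "BLO_S1_LD3",
             "BLO_S2_LD1", "BLO_S2_LD2", "BLO_S2_LD3")
gene_exp <- matrix(rnorm(20 * 6), nrow = 20,
                   dimnames = list(paste0("gene", 1:20), samples))
gene_exp[, 4:6] <- gene_exp[, 4:6] + 5

# Step 1: distances between samples (rows after transposing)
sample_dist <- dist(t(gene_exp))

# Step 2: hierarchical clustering with the default method ("complete")
sample_hc <- hclust(sample_dist)
plot(sample_hc)

# Cut the tree into two groups and look at which sample landed where
groups <- cutree(sample_hc, k = 2)
print(groups)
```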

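Overview of clustering methods (see ?hclust)
single: nearest-neighbor method (single linkage); complete: farthest-neighbor method (complete linkage, the default); average: average-distance method (= UPGMA), similar to the approach used to build phylogenetic trees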
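
To see how the linkage choice changes the tree, the sketch below clusters three made-up points on a line using the method names documented in ?hclust. The first merge is the same for all three methods; the height of the second merge depends on how the distance from the merged pair to the remaining point is defined.

```r
# Three made-up points on a line: distances are 1 (a-b), 2 (b-c) and 3 (a-c)
pts <- matrix(c(0, 1, 3), ncol = 1, dimnames = list(c("a", "b", "c"), NULL))
d <- dist(pts)

# a and b merge first at height 1; then {a, b} merges with c at:
h_single   <- hclust(d, method = "single")$height    # min(3, 2)  = 2
h_complete <- hclust(d, method = "complete")$height  # max(3, 2)  = 3
h_average  <- hclust(d, method = "average")$height   # mean(3, 2) = 2.5

print(rbind(single = h_single, complete = h_complete, average = h_average))
```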

Article link: https://www.haomeiwen.com/subject/iolklktx.html