关于 `ConsensusClusterPlus`

作者: Seurat_Satija | 来源:发表于2021-08-30 15:48 被阅读0次

关于 `ConsensusClusterPlus`
🎣®[R]ConsensusClusterPlus②二元属性（0
2019-01-16
如何通过一致性聚类实现对表达谱数据的亚型分类
2022-11-08 ConsensusClusterPlus
一致性聚类ConsensusClusterPlus
ConsensusClusterPlus: R中实现鉴定簇集数及
R语言：一致性聚类 ConsensusClusterPlus
[R包]ConsensusClusterPlus①一致性聚类Co
关于关于关于

无监督分析下鉴定簇集数及成员

Wilkerson, D. M, Hayes, Neil D (2010). “ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking.” Bioinformatics, 26(12), 1572-1573. http://bioinformatics.oxfordjournals.org/content/26/12/1572.abstract.

1. 关于 `ConsensusClusterPlus`

Consensus Clustering 是一种可用于鉴定数据集（比如 microarray 基因表达）中的簇集 (clusters) 成员及其数量的算法。ConsensusClusterPlus 则将 Consensus Clustering 在 R 中实现了。
Jimmy大神说这是他见过最简单的包┑(￣Д ￣)┍

library(ConsensusClusterPlus)
ls("package:ConsensusClusterPlus")
# [1] "calcICL"              "ConsensusClusterPlus"

ConsensusClusterPlus function for determing cluster number and class membership by stability evidence.

calcICL function for calculating cluster-consensus and item-consensus.

2. 好像真的很简单只是操作简单

使用 ConsensusClusterPlus 的主要三个步骤：

准备输入数据
跑程序
计算聚类一致性 (cluster-consensus) 和样品一致性 (item-consensus)

3. 准备输入数据

首先收集用于聚类分析的数据，比如 mRNA 表达微阵列或免疫组织化学染色强度的实验结果数据。输入数据的格式应为矩阵。下面以 ALL 基因表达数据为例进行操作。

library(ALL)
data(ALL)
dataset <- exprs(ALL)
dataset[1:5,1:5]
#              01005    01010    03002    04006    04007
# 1000_at   7.597323 7.479445 7.567593 7.384684 7.905312
# 1001_at   5.046194 4.932537 4.799294 4.922627 4.844565
# 1002_f_at 3.900466 4.208155 3.886169 4.206798 3.416923
# 1003_s_at 5.903856 6.169024 5.860459 6.116890 5.687997
# 1004_at   5.925260 5.912780 5.893209 6.170245 5.615210

取矩阵中 MAD 值 top 5000 的数据：

mads <- apply(dataset, 1, mad)
dataset <- dataset[rev(order(mads))[1:5000],]
dim(dataset)
# [1] 5000  128

4. 运行 `ConsensusClusterPlus`

先设定几个参数：

pItem (item resampling, proportion of items to sample) : 80%
pFeature (gene resampling, proportion of features to sample) : 80%
maxK (a maximum evalulated k, maximum cluster number to evaluate) : 6
reps (resamplings, number of subsamples) : 50
clusterAlg (agglomerative heirarchical clustering algorithm) : 'hc' (hclust)
distance : 'pearson' (1 - Pearson correlation)

# title <- tempdir() ## 虽说是“当前文件夹”，但似乎结果会输出到包的安装路径...
## 所以还是👇
title <- “YOUR PATH”
results <- ConsensusClusterPlus(dataset, maxK = 6,
                                reps = 50, pItem = 0.8,
                                pFeature = 0.8,  
                                clusterAlg = "hc", 
                                distance = "pearson",
                                title = title,
                                plot = "png")  
## 作者这里是pFeature = 1，和前文不符，于是我依然是按0.8输入计算的

这时工作路径的文件夹会出现9张图。

查看一下结果：

results[[2]][["consensusMatrix"]][1:5,1:5] 
#         [,1]      [,2]      [,3]    [,4]      [,5]
# [1,] 1.00000 0.9375000 1.0000000 0.90625 1.0000000
# [2,] 0.93750 1.0000000 0.9677419 1.00000 0.9393939
# [3,] 1.00000 0.9677419 1.0000000 0.93750 1.0000000
# [4,] 0.90625 1.0000000 0.9375000 1.00000 0.9062500
# [5,] 1.00000 0.9393939 1.0000000 0.90625 1.0000000
results[[2]][["consensusTree"]] 
# Call:
# hclust(d = as.dist(1 - fm), method = finalLinkage)
# 
# Cluster method   : average 
# Number of objects: 128 
results[[2]][["consensusClass"]][1:5] 
# 01005 01010 03002 04006 04007 
#     1     1     1     1     1

4.1 一致性矩阵

分别为图例、k = 2, 3, 4, 5 时的矩阵热图。

image

4.2 一致性累积分布函数图

image

This figure allows a user to determine at what number of clusters, k, the CDF

reaches an approximate maximum, thus consensus and cluster con dence is at

a maximum at this k.

4.3 Delta Area Plot

image

The delta area score (y-axis) indicates the relative increase in cluster stability.

4.4 Tracking Plot

image

This plot provides a view of item cluster membership across different k and enables a user to track the history of clusters relative to earlier clusters.

5. 计算聚类一致性 (cluster-consensus) 和样品一致性 (item-consensus)

icl <- calcICL(results, title = title,
               plot = "png")
## 返回了具有两个元素的list，然后分别查看一下
dim(icl[["clusterConsensus"]])
# [1] 20  3
icl[["clusterConsensus"]] 
#       k cluster clusterConsensus
#  [1,] 2       1        0.9402982
#  [2,] 2       2        0.9062500
#  [3,] 3       1        0.8504193
#  [4,] 3       2        0.9062500
#  [5,] 3       3        0.9869781
#  [6,] 4       1        0.9652282
#  [7,] 4       2        0.9045058
#  [8,] 4       3        0.9062500
#  [9,] 4       4        0.9728043
# [10,] 5       1        0.9216686
# [11,] 5       2        0.9145987
# [12,] 5       3        0.9062500
# [13,] 5       4        0.9874950
# [14,] 5       5              NaN
# [15,] 6       1        0.9307379
# [16,] 6       2        0.8897721
# [17,] 6       3        0.7474747
# [18,] 6       4        0.8750000
# [19,] 6       5        0.9885269
# [20,] 6       6        0.6333333
dim(icl[["itemConsensus"]])
# [1] 2560    4
icl[["itemConsensus"]][1:5,] 
#   k cluster  item itemConsensus
# 1 2       1 28032     0.9523526
# 2 2       1 28024     0.9366226
# 3 2       1 03002     0.9686272
# 4 2       1 01005     0.9573623
# 5 2       1 04007     0.9549235

5.1 Cluster-Consensus Plot

image

5.2 tem-Consensus Plot

image.png

关于 `ConsensusClusterPlus`
无监督分析下鉴定簇集数及成员 Wilkerson, D. M, Hayes, Neil D (2010). “Co...
🎣®[R]ConsensusClusterPlus②二元属性（0
“ConsensusClusterPlus①一致性聚类Consensus Clustering” 基础信息: 二元...
2019-01-16
根据表达谱用K均值对样本聚类---》做KM曲线 library(ConsensusClusterPlus)resu...
如何通过一致性聚类实现对表达谱数据的亚型分类
通过ConsensusClusterPlus包对基因表达谱执行一致性聚类（Consensus Clusterin...
2022-11-08 ConsensusClusterPlus
Consensus Clustering 是一种可用于鉴定数据集（比如 microarray 基因表达）中的簇集 ...
一致性聚类ConsensusClusterPlus
介绍一致性聚类是一种为确定数据集中可能的聚类的数量和成员提供定量证据的方法，例如作为微阵列基因表达。这种方法在癌...
ConsensusClusterPlus: R中实现鉴定簇集数及
无监督分析下鉴定簇集数及成员 Wilkerson, D. M, Hayes, Neil D (2010). “Co...
R语言：一致性聚类 ConsensusClusterPlus
什么情况下会用到一致性聚类？顺手策个文章套路：比较常用到聚类分析的就是肿瘤亚型识别的文章吧。这类文章一般会对...
[R包]ConsensusClusterPlus①一致性聚类Co
注：一致性聚类通常被用于确定最佳的聚类数目K 聚类分析传统方法的不足不能提供“客观的”分类数目的标准和分类边界,例...
关于关于关于
他们爱他们自己，不爱你他们爱你是他们的母亲妻子女儿姐妹他们不爱你直到你死的时候，爱才产生，与遗忘同时那也不...

关于 `ConsensusClusterPlus`

无监督分析下鉴定簇集数及成员

1. 关于 `ConsensusClusterPlus`

2. 好像真的很简单只是操作简单

3. 准备输入数据

4. 运行 `ConsensusClusterPlus`

4.1 一致性矩阵

4.2 一致性累积分布函数图

4.3 Delta Area Plot

4.4 Tracking Plot

5. 计算聚类一致性 (cluster-consensus) 和样品一致性 (item-consensus)

5.1 Cluster-Consensus Plot

5.2 tem-Consensus Plot

相关文章

关于 `ConsensusClusterPlus`

🎣®[R]ConsensusClusterPlus②二元属性（0

2019-01-16

如何通过一致性聚类实现对表达谱数据的亚型分类

2022-11-08 ConsensusClusterPlus

一致性聚类ConsensusClusterPlus

ConsensusClusterPlus: R中实现鉴定簇集数及

R语言：一致性聚类 ConsensusClusterPlus

[R包]ConsensusClusterPlus①一致性聚类Co

关于关于关于

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

单细胞测序

转录组

rna_seq

生信

R plot

关于 `ConsensusClusterPlus`

无监督分析下鉴定簇集数及成员

1. 关于 ConsensusClusterPlus

2. 好像真的很简单 只是操作简单

3. 准备输入数据

4. 运行 ConsensusClusterPlus

4.1 一致性矩阵

4.2 一致性累积分布函数图

4.3 Delta Area Plot

4.4 Tracking Plot

5. 计算聚类一致性 (cluster-consensus) 和样品一致性 (item-consensus)

5.1 Cluster-Consensus Plot

5.2 tem-Consensus Plot

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

1. 关于 `ConsensusClusterPlus`

2. 好像真的很简单只是操作简单

4. 运行 `ConsensusClusterPlus`