Dimensionality reduction: stats::prcomp()

Author: LET149 | Published: 2023-08-22 09:32

https://blog.csdn.net/weixin_54000907/article/details/119496869
https://zhuanlan.zhihu.com/p/433947355
https://www.jianshu.com/p/8994afcaa757

1. Basic usage

> help(prcomp)
> set.seed(1995)  
> kk=matrix(abs(round(rnorm(100, mean=1000, sd=500))), 10, 10)  
> colnames(kk)=paste("Gene", 1:10, sep="_")  
> rownames(kk)=paste("Cell", 1:10, sep="_")
> kk   # rows are cells, columns are genes
        Gene_1 Gene_2 Gene_3 Gene_4 Gene_5 Gene_6 Gene_7 Gene_8 Gene_9 Gene_10
Cell_1    1530    988    837   1031   1333   1376    950    436    699    1170
Cell_2     832    933   1147    653   1421   1495   1280   1846    642    1294
Cell_3    1073   1094    284    446   1242    412   1074   1438    805     372
Cell_4    1197   1499   1138   1371    384   1698   1421   1073   1600    1090
Cell_5    1832    340     66    702   1070   1456    707   1145    377     138
Cell_6     827    964    680    710    634    781   1701   1612    466    1871
Cell_7     997    152    485   1510   1266   1251   1153   1527    927     526
Cell_8     991    306   1103    700    258    893    489    242    269    1635
Cell_9     881    327   1190    971    688   1206    334    915    629    1066
Cell_10   1835    975   1086    324    999   1109   1590    354   1360    1448

kk_scaled = scale(kk, center=TRUE, scale=TRUE)   # z-score the data; scale() works column by column, so each gene is standardized across cells

pca_of_kk_scaled <- prcomp(kk_scaled, center=FALSE, scale.=FALSE)   # run PCA; rows (cells) are the observations, so each row becomes one point in the final plot
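
As a quick sanity check on the two comments above, the sketch below (an illustration, not part of the original notes) confirms that scale() standardizes each column, and that pre-scaling followed by prcomp() with centering and scaling switched off gives the same component standard deviations as letting prcomp() center and scale by itself. The names pca_manual and pca_builtin are made up for this example.

```r
# Rebuild the toy matrix from section 1 (10 cells x 10 genes).
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)

# scale() standardizes each column (gene) across the rows (cells).
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)
col_means <- colMeans(kk_scaled)        # should all be ~0
col_sds   <- apply(kk_scaled, 2, sd)    # should all be ~1

# Route A: scale first, then PCA without further centering/scaling.
pca_manual  <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)
# Route B: let prcomp() center and scale the raw matrix.
pca_builtin <- prcomp(kk, center = TRUE, scale. = TRUE)

# Largest difference between the two sets of standard deviations.
max_sdev_diff <- max(abs(pca_manual$sdev - pca_builtin$sdev))
```

Either route should be fine in practice; scaling explicitly just keeps the standardized matrix around for later checks.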

2. Interpreting the results

2.1 Coordinates of each cell (row) on each PC after dimensionality reduction

pca_of_kk_scaled$x

> pca_of_kk_scaled$x
                PC1        PC2         PC3        PC4          PC5         PC6         PC7         PC8          PC9          PC10
Cell_1   0.14982755  0.2742417  1.04247806 -0.7571106 -0.737164596  0.12918971  0.27143377  1.20441143  0.029898897  8.204268e-18
Cell_2   0.42740957  0.6611752 -0.94015083  1.1553481 -1.838595362  0.24869544 -0.49003793 -0.24018554 -0.071486251 -1.035020e-16
Cell_3  -1.45996405  1.1120487 -1.92700721 -0.7002753  0.980249506  1.10275980 -0.22467565  0.16568902 -0.022663924 -4.312561e-16
Cell_4   2.79714577  1.1542082  1.36496620  0.9338950  1.083346462 -0.00857649 -0.65009580  0.02814380 -0.027367433 -2.894943e-16
Cell_5  -2.73737161  0.7423112  1.20509596 -0.9632308  0.003714466 -0.90615390 -0.75148240 -0.24533036  0.008516788  3.377496e-16
Cell_6   0.78915476 -0.3460818 -2.40526033  0.5838742  0.288674242 -1.17093330  0.10198672  0.19201045  0.082146149 -4.293121e-16
Cell_7  -1.37070005  0.9838947  0.76345842  1.7986020  0.304183410 -0.10725315  1.22920911 -0.18919990 -0.040511507  3.743266e-16
Cell_8   0.01651802 -3.3238432  0.05478453 -0.4385290  0.343620389 -0.11802574  0.02780208  0.05289524 -0.132617669  8.675003e-17
Cell_9  -0.36305513 -1.8838957  0.76485869  0.8591002 -0.113643387  0.77908057 -0.16804521 -0.33009335  0.156163344  2.254719e-16
Cell_10  1.75103517  0.6259408  0.07677651 -2.4716737 -0.314385130  0.05121705  0.65390530 -0.63834079  0.017921606 -2.928080e-16
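
The PC10 column above is on the order of 1e-16, i.e. numerically zero: 10 centered rows span at most 9 dimensions. A minimal sketch of pulling the first two PCs out of $x for a per-cell scatter plot follows; the data frame name pc_coords is an assumption for illustration, and the plot call only runs in an interactive session.

```r
# Rebuild the toy matrix from section 1, with cell and gene names.
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
colnames(kk) <- paste("Gene", 1:10, sep = "_")
rownames(kk) <- paste("Cell", 1:10, sep = "_")

pca <- prcomp(scale(kk), center = FALSE, scale. = FALSE)

# One row per cell, with its coordinates on the first two PCs.
pc_coords <- data.frame(
  cell = rownames(pca$x),
  PC1  = pca$x[, "PC1"],
  PC2  = pca$x[, "PC2"]
)

# The last PC carries no information for this 10 x 10 example.
max_abs_pc10 <- max(abs(pca$x[, "PC10"]))

# Draw the cells in PC1/PC2 space when run interactively.
if (interactive()) {
  plot(pc_coords$PC1, pc_coords$PC2, xlab = "PC1", ylab = "PC2")
  text(pc_coords$PC1, pc_coords$PC2, labels = pc_coords$cell, pos = 3)
}
```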

2.2 Summary of the PCs

library(magrittr)   # provides the %>% pipe; summary(pca_of_kk_scaled) is equivalent
pca_of_kk_scaled %>% summary()

> pca_of_kk_scaled %>% summary()
Importance of components:
                          PC1    PC2    PC3    PC4     PC5    PC6    PC7     PC8     PC9      PC10
Standard deviation     1.6109 1.4798 1.3317 1.2830 0.84748 0.6767 0.6082 0.49313 0.08029 1.304e-16
Proportion of Variance 0.2595 0.2190 0.1774 0.1646 0.07182 0.0458 0.0370 0.02432 0.00064 0.000e+00
Cumulative Proportion  0.2595 0.4785 0.6558 0.8204 0.89224 0.9380 0.9750 0.99936 1.00000 1.000e+00

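Proportion of Variance: the share of the total variance contributed by each individual PC
Cumulative Proportion: the cumulative share of the total variance explained by the PCs up to and including this one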
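
The two rows can also be read programmatically. The sketch below assumes the $importance matrix returned by summary() on a prcomp object, and uses a hypothetical 80% threshold to pick how many PCs to keep; in the summary printed above, the first four PCs reach 0.8204.

```r
# Rebuild the toy matrix from section 1.
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
pca <- prcomp(scale(kk), center = FALSE, scale. = FALSE)

# summary() stores the printed table in $importance (values are rounded).
imp  <- summary(pca)$importance
prop <- imp["Proportion of Variance", ]
cum  <- imp["Cumulative Proportion", ]

# Cumulative Proportion is the running sum of Proportion of Variance.
max_cum_diff <- max(abs(cumsum(prop) - cum))

# Smallest number of PCs whose cumulative share reaches the threshold.
threshold <- 0.80   # illustrative cutoff, not from the original notes
n_pcs <- which(cum >= threshold)[1]
```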

2.3 Standard deviation of each principal component

pca_of_kk_scaled$sdev
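
For intuition about where these numbers come from, the squared standard deviations should match the eigenvalues of the covariance matrix of the scaled data. This is a sketch for illustration; the variable names are assumptions, and eigen() is used only as a cross-check, not as a description of how prcomp() computes the result internally.

```r
# Rebuild the toy matrix from section 1.
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)
pca <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)

# Eigenvalues of the covariance matrix, returned in decreasing order.
eig_vals <- eigen(cov(kk_scaled), symmetric = TRUE)$values

# Variance carried by each PC is the square of its standard deviation.
pc_vars <- pca$sdev^2

# The two should agree up to floating-point error.
max_eig_diff <- max(abs(pc_vars - eig_vals))
```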

2.4 Total variance of the data

The sum of the squared standard deviations of all principal components: sum((pca_of_kk_scaled$sdev)^2)
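
Because every gene was z-scored to variance 1, this total should equal the number of genes (10 here). A short check, written as a sketch with illustrative variable names:

```r
# Rebuild the toy matrix from section 1.
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)
pca <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)

# Total variance as seen by the PCA.
total_var_pca <- sum(pca$sdev^2)

# Total variance computed directly from the scaled columns.
total_var_cols <- sum(apply(kk_scaled, 2, var))

# Number of genes (columns), which both totals should match.
n_genes <- ncol(kk_scaled)
```

PCA redistributes the variance across components but does not change the total.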

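2.5 Share of the total variance explained by each principal component

PVE: Percent Variance Explained

The fraction of the total variance explained by each PC.

Squared standard deviation of a given PC / sum of the squared standard deviations of all principal components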
(pca_of_kk_scaled$sdev)^2 / sum((pca_of_kk_scaled$sdev)^2)
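
The formula above can be wrapped up and compared against the "Proportion of Variance" row that summary() reports. This is a minimal sketch; pve is an illustrative variable name, and the comparison allows for the rounding that summary() applies.

```r
# Rebuild the toy matrix from section 1.
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
pca <- prcomp(scale(kk), center = FALSE, scale. = FALSE)

# PVE: variance of each PC divided by the total variance.
pve <- pca$sdev^2 / sum(pca$sdev^2)
names(pve) <- colnames(pca$x)

# The shares add up to 1 by construction.
pve_total <- sum(pve)

# Same quantity as reported by summary(), which rounds its values.
reported <- summary(pca)$importance["Proportion of Variance", ]
max_pve_diff <- max(abs(pve - reported))
```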

2.6 Eigenvectors (loadings) used to compute the PCs

pca_of_kk_scaled$rotation
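
To connect this with section 2.1: each column of $rotation is a unit-length direction in gene space, and multiplying the scaled data by $rotation should reproduce $x. The sketch below checks both properties; the variable names are assumptions for illustration.

```r
# Rebuild the toy matrix from section 1.
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)
pca <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)

rot <- pca$rotation   # genes in rows, PCs in columns

# Columns are orthonormal: t(rot) %*% rot should be the identity matrix.
orth_err <- max(abs(crossprod(rot) - diag(ncol(rot))))

# Projecting the scaled data onto the eigenvectors gives the PC coordinates.
projected <- kk_scaled %*% rot
proj_err <- max(abs(projected - pca$x))
```

The same multiplication, applied to new data that has been centered and scaled with the original gene means and standard deviations, is how new cells would be placed in the existing PC space.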
