https://blog.csdn.net/weixin_54000907/article/details/119496869
https://zhuanlan.zhihu.com/p/433947355
https://www.jianshu.com/p/8994afcaa757
1. Basic usage
> help(prcomp)
> set.seed(1995)
> kk=matrix(abs(round(rnorm(100, mean=1000, sd=500))), 10, 10)
> colnames(kk)=paste("Gene", 1:10, sep="_")
> rownames(kk)=paste("Cell", 1:10, sep="_")
> kk # rows are cells, columns are genes
Gene_1 Gene_2 Gene_3 Gene_4 Gene_5 Gene_6 Gene_7 Gene_8 Gene_9 Gene_10
Cell_1 1530 988 837 1031 1333 1376 950 436 699 1170
Cell_2 832 933 1147 653 1421 1495 1280 1846 642 1294
Cell_3 1073 1094 284 446 1242 412 1074 1438 805 372
Cell_4 1197 1499 1138 1371 384 1698 1421 1073 1600 1090
Cell_5 1832 340 66 702 1070 1456 707 1145 377 138
Cell_6 827 964 680 710 634 781 1701 1612 466 1871
Cell_7 997 152 485 1510 1266 1251 1153 1527 927 526
Cell_8 991 306 1103 700 258 893 489 242 269 1635
Cell_9 881 327 1190 971 688 1206 334 915 629 1066
Cell_10 1835 975 1086 324 999 1109 1590 354 1360 1448
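kk_scaled = scale(kk, center=TRUE, scale=TRUE) # z-score the data; scale() works column by column, so each gene is standardized across the cells
pca_of_kk_scaled <- prcomp(kk_scaled, center=FALSE, scale.=FALSE) # run PCA; prcomp() treats rows as observations, so each row (cell) becomes one point in the final plot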
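As a quick sanity check on the two steps above, the sketch below rebuilds the same matrix and confirms that `scale()` standardizes each gene (column) to mean 0 and standard deviation 1. It also compares the manual two-step route with letting `prcomp()` center and scale internally. The names `col_means`, `col_sds`, `pca_manual`, `pca_builtin` and `same_scores` are illustrative and do not appear in the original note.

```r
# Rebuild the example matrix from section 1 (rows = cells, columns = genes)
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
colnames(kk) <- paste("Gene", 1:10, sep = "_")
rownames(kk) <- paste("Cell", 1:10, sep = "_")

# Column-wise z-score: every gene should end up with mean 0 and sd 1
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)
col_means <- colMeans(kk_scaled)
col_sds <- apply(kk_scaled, 2, sd)

# Route 1: scale first, then PCA without further centering or scaling
pca_manual <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)
# Route 2: let prcomp() center and scale the raw matrix itself
pca_builtin <- prcomp(kk, center = TRUE, scale. = TRUE)

# Compare absolute values so a possible sign flip of a PC does not matter
same_scores <- isTRUE(all.equal(abs(pca_manual$x), abs(pca_builtin$x)))
```

If `same_scores` is `TRUE`, the separate `scale()` call is optional and `prcomp(kk, center = TRUE, scale. = TRUE)` can be used directly.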
2. Interpreting the results
2.1 Coordinates of each cell (row) on each PC after dimensionality reduction
pca_of_kk_scaled$x
> pca_of_kk_scaled$x
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Cell_1 0.14982755 0.2742417 1.04247806 -0.7571106 -0.737164596 0.12918971 0.27143377 1.20441143 0.029898897 8.204268e-18
Cell_2 0.42740957 0.6611752 -0.94015083 1.1553481 -1.838595362 0.24869544 -0.49003793 -0.24018554 -0.071486251 -1.035020e-16
Cell_3 -1.45996405 1.1120487 -1.92700721 -0.7002753 0.980249506 1.10275980 -0.22467565 0.16568902 -0.022663924 -4.312561e-16
Cell_4 2.79714577 1.1542082 1.36496620 0.9338950 1.083346462 -0.00857649 -0.65009580 0.02814380 -0.027367433 -2.894943e-16
Cell_5 -2.73737161 0.7423112 1.20509596 -0.9632308 0.003714466 -0.90615390 -0.75148240 -0.24533036 0.008516788 3.377496e-16
Cell_6 0.78915476 -0.3460818 -2.40526033 0.5838742 0.288674242 -1.17093330 0.10198672 0.19201045 0.082146149 -4.293121e-16
Cell_7 -1.37070005 0.9838947 0.76345842 1.7986020 0.304183410 -0.10725315 1.22920911 -0.18919990 -0.040511507 3.743266e-16
Cell_8 0.01651802 -3.3238432 0.05478453 -0.4385290 0.343620389 -0.11802574 0.02780208 0.05289524 -0.132617669 8.675003e-17
Cell_9 -0.36305513 -1.8838957 0.76485869 0.8591002 -0.113643387 0.77908057 -0.16804521 -0.33009335 0.156163344 2.254719e-16
Cell_10 1.75103517 0.6259408 0.07677651 -2.4716737 -0.314385130 0.05121705 0.65390530 -0.63834079 0.017921606 -2.928080e-16
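The PC10 column in the output above is numerically zero (values around 1e-16). A likely explanation is that centering 10 cells removes one degree of freedom, so the scaled matrix has rank 9 at most. The sketch below checks this on the same data; `data_rank` and `last_pc_max` are illustrative names, not part of the original note.

```r
# Rebuild and scale the example matrix
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)

# Numerical rank of the centered, scaled matrix (n - 1 = 9 is expected)
data_rank <- qr(kk_scaled)$rank

# Largest absolute coordinate on the last PC; it should be floating-point noise
pca <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)
last_pc_max <- max(abs(pca$x[, 10]))
```

In practice this means that with n cells at most n - 1 PCs carry information, no matter how many genes are measured.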
2.2 Interpreting the PC summary
summary(pca_of_kk_scaled)
> summary(pca_of_kk_scaled)
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 1.6109 1.4798 1.3317 1.2830 0.84748 0.6767 0.6082 0.49313 0.08029 1.304e-16
Proportion of Variance 0.2595 0.2190 0.1774 0.1646 0.07182 0.0458 0.0370 0.02432 0.00064 0.000e+00
Cumulative Proportion 0.2595 0.4785 0.6558 0.8204 0.89224 0.9380 0.9750 0.99936 1.00000 1.000e+00
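Proportion of Variance: the share of the total variance in the data contributed by a single PC
Cumulative Proportion: the cumulative share contributed by all PCs up to and including that PC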
2.3 Standard deviation of each principal component
pca_of_kk_scaled$sdev
2.4 Total variance of the data
The sum of the squared standard deviations of all principal components:
sum((pca_of_kk_scaled$sdev)^2)
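Because every gene was z-scored to variance 1, this total should equal the number of genes (10 here). The sketch below checks that on the example data; `total_var` and `per_gene_var` are illustrative names.

```r
# Rebuild, scale, and run PCA on the example matrix
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)
pca <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)

# Total variance as seen by the PCs
total_var <- sum(pca$sdev^2)

# Total variance as seen by the original (scaled) genes
per_gene_var <- apply(kk_scaled, 2, var)
```

Both `total_var` and `sum(per_gene_var)` should come out as 10 up to floating-point error: PCA redistributes the variance across the PCs without changing the total.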
2.5 Proportion of the total variance explained by each principal component
PVE: Percent Variance Explained
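The proportion of the total variance explained by each PC:
squared standard deviation of that PC / sum of the squared standard deviations of all PCs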
(pca_of_kk_scaled$sdev)^2 / sum((pca_of_kk_scaled$sdev)^2)
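The PVE formula can be cross-checked against the table printed by `summary()`, which stores the same quantities rounded to five decimals in its `importance` element. A minimal sketch, with `vars`, `pve`, `cum_pve` and `imp` as illustrative names:

```r
# Rebuild, scale, and run PCA on the example matrix
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)
pca <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)

# Variance of each PC, its share of the total, and the running total
vars <- pca$sdev^2
pve <- vars / sum(vars)
cum_pve <- cumsum(pve)

# The same numbers as reported by summary(), rounded to 5 decimals
imp <- summary(pca)$importance
```

Here `round(pve, 5)` should match the "Proportion of Variance" row of `imp`, and `round(cum_pve, 5)` the "Cumulative Proportion" row.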
2.6 Eigenvectors used to compute the PCs
pca_of_kk_scaled$rotation
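To make the link between `$rotation` and `$x` concrete, the sketch below checks three properties on the example data: the columns of the rotation matrix are orthonormal, multiplying the scaled data by the rotation matrix reproduces the coordinates in `$x`, and the squared standard deviations match the eigenvalues of the covariance matrix. The names `rot`, `orthonormal`, `scores_rebuilt` and `eig` are illustrative.

```r
# Rebuild, scale, and run PCA on the example matrix
set.seed(1995)
kk <- matrix(abs(round(rnorm(100, mean = 1000, sd = 500))), 10, 10)
colnames(kk) <- paste("Gene", 1:10, sep = "_")
rownames(kk) <- paste("Cell", 1:10, sep = "_")
kk_scaled <- scale(kk, center = TRUE, scale = TRUE)
pca <- prcomp(kk_scaled, center = FALSE, scale. = FALSE)

# Each column of rot is one eigenvector (the gene loadings of one PC)
rot <- pca$rotation

# Eigenvectors are orthonormal: t(rot) %*% rot is the identity matrix
orthonormal <- isTRUE(all.equal(crossprod(rot), diag(10), check.attributes = FALSE))

# Projecting the scaled data onto the eigenvectors gives the PC coordinates
scores_rebuilt <- kk_scaled %*% rot

# Eigenvalues of the covariance matrix are the PC variances (sdev^2)
eig <- eigen(cov(kk_scaled), symmetric = TRUE)
```

The eigenvectors returned by `eigen()` may differ from the columns of `$rotation` by sign, which is why only the eigenvalues are compared here.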