[Grunt work] Immune infiltration analysis in practice

Author: yadandb | Published 2021-02-10 23:32


Happy Year of the Ox! A bioinformatics newbie here, gnawing my way toward a paper~

Website: https://cibersortx.stanford.edu
Reference 1: https://www.jianshu.com/p/a8759484359c
First, a look at the tutorial.

File upload requirements
OK, time to tidy up my mixture file.

Ready~ 22880 genes * 10 samples
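For reference, here is a minimal sketch of how a mixture file like this could be written from R. It assumes a tab-delimited layout with gene symbols in the first column and one column per sample; the helper name `write_mixture`, the toy gene and sample names, and the output path are all made up for illustration, so check the layout against the upload requirements before relying on it.

```r
# Hypothetical helper: write an expression matrix as a tab-delimited mixture file.
# Assumes genes are in rows (with unique rownames) and samples are in columns.
write_mixture <- function(expr, path) {
  stopifnot(is.matrix(expr),
            !is.null(rownames(expr)),
            !anyDuplicated(rownames(expr)))
  # Put the gene symbols into a regular first column named "Gene"
  out <- data.frame(Gene = rownames(expr), expr, check.names = FALSE)
  write.table(out, path, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(dim(expr))
}

# Toy 3 genes * 2 samples matrix (made-up values, not my real data)
expr <- matrix(c(5.1, 7.3, 2.2, 8.8, 1.5, 4.0), nrow = 3,
               dimnames = list(c("CD19", "CD3E", "CD8A"), c("GSM1", "GSM2")))
mixture_path <- file.path(tempdir(), "toy_mixture.txt")
write_mixture(expr, mixture_path)
```

Reading the file back with `read.delim(mixture_path, row.names = 1)` and checking `dim()` is a quick way to confirm the genes * samples count before uploading.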

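Computing cell fractions with CIBERSORTx

My data is microarray, so I left the last option (the one the arrow points to) unchecked.

Results

The data can be downloaded as txt/csv/..., but the figure is a screenshot. I couldn't find anywhere to download it.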

————————————————————————————————————
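As you can see, the web output is rather limited: no good-looking figures and no features such as comparing groups. So next I run the analysis in R instead.
Reference 2: https://cloud.tencent.com/developer/article/1622907
Download 2 files from the official site
(1) LM22.txt (reference marker gene expression for 22 immune cell types)
(2) CIBERSORT.R (the CIBERSORT source code, downloaded from the official site)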

https://cibersort.stanford.edu/download.php
(PS: downloading CIBERSORT.R from the official site requires requesting permission first... I got it from the site below instead ✅
https://rdrr.io/github/singha53/amritr/src/R/supportFunc_cibersort.R
Reference 3: https://www.jianshu.com/p/0baac4c52ac8)

source("CIBERSORT.R")

# Define LM22 file
LM22.file <- "LM22.txt"
exp.file <- "GSE14017boneMetmatrix_rowsymbol.txt" # change to the matrix you want to analyze

TME.results = CIBERSORT(LM22.file, exp.file, perm = 1000, QN = TRUE)

# output CIBERSORT results
write.table(TME.results, "TME.results.output.txt",
            sep = "\t", row.names = TRUE, col.names = TRUE, quote = FALSE)

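An error

Error: row names must not be duplicated

Fix: debugging showed that 4 of the rownames were not unique... but my matrix was generated in R, so its rownames really are unique. After I deleted the "Gene" label in the top-left corner of the header, this step stopped throwing the error.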
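A quick check like the following can help narrow down this kind of error before calling CIBERSORT. It is only a sketch: the helper name `find_duplicate_genes` and the toy file are hypothetical, and it assumes the gene symbols sit in the first tab-delimited column under a header label.

```r
# Hypothetical helper: list gene symbols that appear more than once
# in the first column of a tab-delimited expression file.
find_duplicate_genes <- function(path) {
  first_col <- read.delim(path, header = TRUE, check.names = FALSE,
                          stringsAsFactors = FALSE)[[1]]
  unique(first_col[duplicated(first_col)])
}

# Toy file with one repeated symbol (made-up values)
toy_path <- file.path(tempdir(), "toy_dup_mixture.txt")
writeLines(c("Gene\tGSM1\tGSM2",
             "CD19\t5.1\t8.8",
             "CD3E\t7.3\t1.5",
             "CD19\t2.2\t4.0"), toy_path)

dups <- find_duplicate_genes(toy_path)
print(dups)
```

If this returns an empty vector for the real file, the duplicates are probably being introduced by how the script reads the header rather than by the data itself.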

Error no. 2

Error: a package is missing

Fix: BiocManager::install("preprocessCore")
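To catch missing packages up front instead of one error at a time, something like this sketch can be run first. The helper name `missing_packages` is made up; the package list only contains the ones this post actually uses.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages used in this post: preprocessCore (Bioconductor), ggpubr and ggthemes (CRAN)
needed <- c("preprocessCore", "ggpubr", "ggthemes")
to_install <- missing_packages(needed)
print(to_install)

# Install whatever is reported by hand, for example:
#   BiocManager::install("preprocessCore")
#   install.packages(c("ggpubr", "ggthemes"))
```

The installs are left as comments on purpose, so running the check never changes the library by itself.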

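Results
The R output differs ever so slightly from the web output. A possible reason: I did not set the number of permutations (perm) on the web page, so it was 0, whereas the code uses perm = 1000.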
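To put a number on "ever so slightly", the two result tables can be compared cell by cell. The sketch below uses two made-up 2 * 2 matrices as stand-ins for the web and R outputs; the helper name `max_abs_diff` is hypothetical, and it assumes both tables have samples as rownames and cell types as colnames.

```r
# Hypothetical helper: largest absolute difference between two fraction tables,
# restricted to the samples (rows) and cell types (columns) they share.
max_abs_diff <- function(a, b) {
  shared_rows <- intersect(rownames(a), rownames(b))
  shared_cols <- intersect(colnames(a), colnames(b))
  max(abs(a[shared_rows, shared_cols, drop = FALSE] -
          b[shared_rows, shared_cols, drop = FALSE]))
}

# Made-up stand-ins for the web output and the R output
web_res <- matrix(c(0.10, 0.30, 0.90, 0.70), nrow = 2,
                  dimnames = list(c("s1", "s2"), c("TypeA", "TypeB")))
r_res   <- matrix(c(0.12, 0.30, 0.88, 0.70), nrow = 2,
                  dimnames = list(c("s1", "s2"), c("TypeA", "TypeB")))

print(max_abs_diff(web_res, r_res))
```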

Now for the plots

# boxplot
library(ggpubr)
library(ggthemes)

x <- TME.results[, 1:22]  # the 22 cell-type columns
# ...
# ... (intermediate steps omitted)
plot.info <- data.frame(Celltype, Composition)
colnames(plot.info) <- col
ggboxplot(
  plot.info,
  x = "CellType",
  y = "Composition",
  color = "black",
  fill = "CellType",
  xlab = "",
  ylab = "Cell composition",
  main = "TME Cell composition"
) +
  theme_base() +
  theme(axis.text.x = element_text(
    angle = 90,
    hjust = 1,
    vjust = 1
  ))

Box plot
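The plotting code above expects a long-format table with one row per sample and cell type. The code that builds it is not shown in this post, so the following is only one possible sketch, not the original steps: the helper name `to_long` and the toy matrix are hypothetical, and it assumes a results matrix with samples in rows and cell types in columns.

```r
# Hypothetical helper: reshape a samples * cell-types fraction matrix
# into a long data frame with columns Sample, CellType, Composition.
to_long <- function(fractions) {
  fractions <- as.matrix(fractions)
  data.frame(
    # as.vector() walks the matrix column by column, so sample names cycle
    # fastest and each cell-type name repeats once per sample
    Sample      = rep(rownames(fractions), times = ncol(fractions)),
    CellType    = rep(colnames(fractions), each = nrow(fractions)),
    Composition = as.vector(fractions),
    stringsAsFactors = FALSE
  )
}

# Toy stand-in for TME.results[, 1:22] (made-up values)
toy <- matrix(c(0.6, 0.1, 0.4, 0.9), nrow = 2,
              dimnames = list(c("bone_1", "brain_1"),
                              c("B cells memory", "T cells CD8")))
plot_long <- to_long(toy)
print(plot_long)
```

A grouping column such as SampleType would still need to be attached separately, for example by matching on Sample.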

# boxplot by different samples(bone/brain/lung)
ggboxplot(
  plot.info,
  x = "CellType",
  y = "Composition",
  color = "black",
  fill = "SampleType",
  xlab = "",
  ylab = "Cell composition",
  main = "TME Cell composition group by Sampletype"
) +
  stat_compare_means(
    label = "p.signif",
    method = "t.test",
    ref.group = ".all.",
    hide.ns = T
  ) +
  theme_base() +
  theme(axis.text.x = element_text(
    angle = 90,
    hjust = 1,
    vjust = 1
  ))

Box plot 2

A quick detour to read up on the stat_compare_means function.
Reference 4: https://blog.csdn.net/zhouhucheng00/article/details/106391872
Reference 5: https://www.jianshu.com/p/0d4f17dc4a58

## Revised
ggboxplot(
  plot.info2,
  x = "ShortST",
  y = "Composition",
  color = "black",
  fill = "SampleType",
  xlab = "",
  ylab = "Cell composition",
  main = "TME Cell composition group by Sampletype",
  facet.by = "CellType"
) +
  stat_compare_means(
    label = "p.signif",  # significance level, shown as a varying number of asterisks
    label.y=0.25,
    method = "t.test",
    ref.group = ".all.",
    hide.ns = T
  ) +
  theme_base() +
  theme(axis.text.x = element_text(
    angle = 90,
    hjust = 1,
    vjust = 1
  ))

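Box plot 2, revised (note that the statistics here are correct)

In box plot 2, the so-called p-value (significance shown as asterisks) comes from a t-test of each CellType against the mean of all groups. For example, "****" on B cells memory means that the B cells memory group differs from the mean of the whole dataset at the **** level. It does not mean that bone/brain/lung differ from each other within B cells memory.

What actually matters for the analysis is the statistical difference among bone/brain/lung within each cell type, which is why the plot needed revising.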
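To double-check the numbers outside the plot, the same question (do the sample types differ within one cell type?) can be asked with base R alone. This is a sketch on made-up data, not the test that ggpubr runs internally: the helper name `within_celltype_tests` is hypothetical, and it runs unadjusted pairwise Welch t-tests between sample types, separately for each cell type.

```r
# Made-up long-format data: 2 cell types * 3 sample types * 3 samples each
set.seed(1)
toy_info <- data.frame(
  CellType    = rep(c("B cells memory", "T cells CD8"), each = 9),
  SampleType  = rep(rep(c("bone", "brain", "lung"), each = 3), times = 2),
  Composition = runif(18, min = 0, max = 0.3)
)

# Hypothetical helper: for each cell type, pairwise t-tests between sample types.
# Returns a named list of p-value matrices (lower triangle filled, rest NA).
within_celltype_tests <- function(df) {
  lapply(split(df, df$CellType), function(d) {
    pairwise.t.test(d$Composition, d$SampleType,
                    p.adjust.method = "none", pool.sd = FALSE)$p.value
  })
}

pvals <- within_celltype_tests(toy_info)
print(pvals[["B cells memory"]])
```

With 22 cell types and three sample types there are many comparisons, so swapping `p.adjust.method = "none"` for something like `"BH"` may be worth considering.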

(That last figure still isn't pretty enough, sigh 😌)