最近搞了一下 Meta 分析,蛮有意思的,掌握了这些技能未来发文章不用愁,Meta 分析也是医学上很重要的方法之一,这期基于现有数据,分析几个经典的图形,包括森林图(forest)、漏洞图(funnel)、星状图(radial)、拉贝图(labbe)、以及Q-Q正态分位图(Q-Q normal)
01 Meta 分析的概念
1976年,Glass首次将这一概念命名为 Meta-analysis (荟萃分析)。并定义为一种对不同研究结果进行收集、合并及统计分析的方法。这种方法逐渐发展为一门新兴学科——“循证医学”的主要内容和研究手段。荟萃分析的主要目的是将以往的研究成果更为客观地综合反映出来。研究者并不进行原始的研究,而是将研究以获得的结果进行综合分析。
02 输入数据模式
单个变量
多分类变量
连续性变量
还可以检测模型系数并获得可信区间,以及对参数进行精准检验如置换检验(permutation)。
程序包安装,如下:
install.packages("metafor")
library("metafor")
读取软件包数据,来自13项检验卡介苗抗结核效果的研究的结果,meta分析的目的是检查卡介苗预防结核病的总体有效性,并检查可能影响效果大小的缓减剂。该数据集已经在一些出版物中被用来说明元分析方法,如下:
data("dat.bcg")
print(dat.bcg, row.names = FALSE)
trial author year tpos tneg cpos cneg ablat alloc
1 Aronson 1948 4 119 11 128 44 random
2 Ferguson & Simes 1949 6 300 29 274 55 random
3 Rosenthal et al 1960 3 228 11 209 42 random
4 Hart & Sutherland 1977 62 13536 248 12619 52 random
5 Frimodt-Moller et al 1973 33 5036 47 5761 13 alternate
6 Stein & Aronson 1953 180 1361 372 1079 44 alternate
7 Vandiviere et al 1973 8 2537 10 619 19 random
8 TPT Madras 1980 505 87886 499 87892 13 random
9 Coetzee & Berjak 1968 29 7470 45 7232 27 random
10 Rosenthal et al 1961 17 1699 65 1600 42 systematic
11 Comstock et al 1974 186 50448 141 27197 18 systematic
12 Comstock & Webster 1969 5 2493 3 2338 33 systematic
13 Comstock et al 1976 27 16886 29 17825 33 systematic
#######数据说明
?dat.bcg
Format
The data frame contains the following columns:
trial numeric trial number
author character author(s)
year numeric publication year
tpos numeric number of TB positive cases in the treated (vaccinated) group
tneg numeric number of TB negative cases in the treated (vaccinated) group
cpos numeric number of TB positive cases in the control (non-vaccinated) group
cneg numeric number of TB negative cases in the control (non-vaccinated) group
ablat numeric absolute latitude of the study location (in degrees)
alloc character method of treatment allocation (random, alternate, or systematic assignment)
这13项研究以下列表格提供数据细节,如下:
TB positiveTB negativevaccinated grouptpostnegcontrol groupcposcneg
03 实例解析
计算log相对风险和相应的抽样方差,如下:
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos,
di = cneg, data = dat.bcg, append = TRUE)
print(dat[,-c(4:7)], row.names = FALSE
trial author year ablat alloc yi vi
1 Aronson 1948 44 random -0.8893 0.3256
2 Ferguson & Simes 1949 55 random -1.5854 0.1946
3 Rosenthal et al 1960 42 random -1.3481 0.4154
4 Hart & Sutherland 1977 52 random -1.4416 0.0200
5 Frimodt-Moller et al 1973 13 alternate -0.2175 0.0512
6 Stein & Aronson 1953 44 alternate -0.7861 0.0069
7 Vandiviere et al 1973 19 random -1.6209 0.2230
8 TPT Madras 1980 13 random 0.0120 0.0040
9 Coetzee & Berjak 1968 27 random -0.4694 0.0564
10 Rosenthal et al 1961 42 systematic -1.3713 0.0730
11 Comstock et al 1974 18 systematic -0.3394 0.0124
12 Comstock & Webster 1969 33 systematic 0.4459 0.5325
13 Comstock et al 1976 33 systematic -0.0173 0.0714
演示公式界面,如下:
k <- length(dat.bcg$trial)
dat.fm <- data.frame(study = factor(rep(1:k, each = 4)))
dat.fm$grp <- factor(rep(c("T", "T", "C", "C"), k), levels = c("T", "C"))
dat.fm$out <- factor(rep(c("+", "-", "+", "-"), k), levels = c("+", "-"))
dat.fm$freq <- with(dat.bcg, c(rbind(tpos, tneg, cpos, cneg)))
dat.fm
study grp out freq
1 1 T + 4
2 1 T - 119
3 1 C + 11
4 1 C - 128
5 2 T + 6
6 2 T - 300
7 2 C + 29
8 2 C - 274
9 3 T + 3
10 3 T - 228
11 3 C + 11
12 3 C - 209
13 4 T + 62
14 4 T - 13536
15 4 C + 248
16 4 C - 12619
17 5 T + 33
18 5 T - 5036
19 5 C + 47
20 5 C - 5761
21 6 T + 180
22 6 T - 1361
23 6 C + 372
24 6 C - 1079
25 7 T + 8
26 7 T - 2537
27 7 C + 10
28 7 C - 619
29 8 T + 505
30 8 T - 87886
31 8 C + 499
32 8 C - 87892
33 9 T + 29
34 9 T - 7470
35 9 C + 45
36 9 C - 7232
37 10 T + 17
38 10 T - 1699
39 10 C + 65
40 10 C - 1600
41 11 T + 186
42 11 T - 50448
43 11 C + 141
44 11 C - 27197
45 12 T + 5
46 12 T - 2493
47 12 C + 3
48 12 C - 2338
49 13 T + 27
50 13 T - 16886
51 13 C + 29
52 13 C - 17825
随机效应模型,如下:
#默认模型为REML模型Random-effect model using Restricted maximum likelihood estimator
res <- rma(yi, vi, data = dat)
res
Random-Effects Model (k = 13; tau^2 estimator: REML)
tau^2 (estimated amount of total heterogeneity): 0.3132 (SE = 0.1664)
tau (square root of estimated tau^2 value): 0.5597
I^2 (total heterogeneity / total variability): 92.22%
H^2 (total variability / sampling variability): 12.86
Test for Heterogeneity:
Q(df = 12) = 152.2330, p-val < .0001
Model Results:
estimate se zval pval ci.lb ci.ub
-0.7145 0.1798 -3.9744 <.0001 -1.0669 -0.3622 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
res <- rma(ai = tpos, bi = tneg, ci = cpos,
di = cneg, data = dat, measure = "RR")
res
Random-Effects Model (k = 13; tau^2 estimator: REML)
tau^2 (estimated amount of total heterogeneity): 0.3132 (SE = 0.1664)
tau (square root of estimated tau^2 value): 0.5597
I^2 (total heterogeneity / total variability): 92.22%
H^2 (total variability / sampling variability): 12.86
Test for Heterogeneity:
Q(df = 12) = 152.2330, p-val < .0001
Model Results:
estimate se zval pval ci.lb ci.ub
-0.7145 0.1798 -3.9744 <.0001 -1.0669 -0.3622 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
异质性的置信区间可用metafor中confint()可以求异质性系数(tau^2,tau,I^2(%),H^2)的置信区间,如下:
### confidence intervals for tau^2, I^2, and H^2
confint(res)
estimate ci.lb ci.ub
tau^2 0.3132 0.1197 1.1115
tau 0.5597 0.3460 1.0543
I^2(%) 92.2214 81.9177 97.6781
H^2 12.8558 5.5303 43.0680
# 估算效应量的预测/可信度区间
predict(res, digits = 3)
pred se ci.lb ci.ub pi.lb pi.ub
-0.715 0.180 -1.067 -0.362 -1.867 0.438
forest 森林图
森林图是以统计指标和统计分析方法为基础,用数值运算结果绘制出的图型。它在平面直角坐标系中,以一条垂直的无效线(横坐标刻度为1或0)为中心,用平行于横轴的多条线段描述了每个被纳入研究的效应量和可信区间,用一个棱形(或其它图形)描述了多个研究合并的效应量及可信区间。它非常简单和直观地描述了Meta分析的统计结果,是Meta分析中最常用的结果表达形式。如下:
### forst plot
forest(res, slab = paste(dat$author, dat$year, sep = ", "),
xlim = c(-16, 6), at = log(c(.05, .25, 1, 4)), atransf = exp,
ilab = cbind(dat$tpos, dat$tneg, dat$cpos, dat$cneg),
ilab.xpos = c(-9.5, -8, -6, -4.5), cex = .75)
op <- par(cex = .75, font = 2)
text(c(-9.5, -8, -6, -4.5), 15, c("TB+", "TB-", "TB+", "TB-"))
text(c(-8.75, -5.25), 16, c("Vaccinated", "Control"))
text(-16, 15, "Author(s) and Year", pos = 4)
text(6, 15, "Relative Risk [95% CI]", pos = 2)
par(op)
混合效应模型,如下:
### mixed-effects model with absolute latitude and publication year as moderators
res <- rma(yi, vi, mods = cbind(ablat, year), data = dat)
res
Mixed-Effects Model (k = 13; tau^2 estimator: REML)
tau^2 (estimated amount of residual heterogeneity): 0.1108 (SE = 0.0845)
tau (square root of estimated tau^2 value): 0.3328
I^2 (residual heterogeneity / unaccounted variability): 71.98%
H^2 (unaccounted variability / sampling variability): 3.57
R^2 (amount of heterogeneity accounted for): 64.63%
Test for Residual Heterogeneity:
QE(df = 10) = 28.3251, p-val = 0.0016
Test of Moderators (coefficients 2:3):
QM(df = 2) = 12.2043, p-val = 0.0022
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -3.5455 29.0959 -0.1219 0.9030 -60.5724 53.4814
ablat -0.0280 0.0102 -2.7371 0.0062 -0.0481 -0.0080 **
year 0.0019 0.0147 0.1299 0.8966 -0.0269 0.0307
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
### predicted average relative risks for 10, 20, 30, 40, and 50 degrees
### absolute latitude holding publication year constant at 1970
predict(res, newmods = cbind(seq(from = 10, to = 60, by = 10), 1970),
transf = exp, addx = TRUE)
### plot of the predicted average relative risk as a function of absolute latitude
preds <- predict(res, newmods = cbind(0:60, 1970), transf = exp)
wi <- 1/sqrt(dat$vi)
size <- 0.5 + 3 * (wi - min(wi))/(max(wi) - min(wi))
plot(dat$ablat, exp(dat$yi), pch = 19, cex = size,
xlab = "Absolute Latitude", ylab = "Relative Risk",
las = 1, bty = "l", log = "y")
lines(0:60, preds$pred)
lines(0:60, preds$ci.lb, lty = "dashed")
lines(0:60, preds$ci.ub, lty = "dashed")
abline(h = 1, lty = "dotted")
funnel 漏斗图
漏斗图是由Light等在1984年提出,一般以单个研究的效应量为横坐标,样本含量为纵坐标做的散点图。效应量可以为RR、OR和死亡比或者其对数值等。理论上讲,被纳入Meta分析的各独立研究效应的点估计,在平面坐标系中的集合应为一个倒置的漏斗形,因此称为漏斗图。
样本量小,研究精度低,分布在漏斗图的底部,向周围分散;
样本量大,研究精度高,分布在漏斗图的顶部,向中间集中。
实际使用时,做Meta分析的研究个数较少时不宜做漏斗图,一般推荐Meta分析的研究个数在10个及以上才需做漏斗图。漏斗图主要用于观察某个系统评价或Meta分析结果是否存在偏倚,如发表偏倚或其他偏倚,如下图:
### funnel plots
res <- rma(yi, vi, data = dat)
funnel(res, main = "Random-Effects Model")
res <- rma(yi, vi, mods = cbind(ablat), data = dat)
funnel(res, main = "Mixed-Effects Model")
#######Begger's检验
ranktest(res)
Rank Correlation Test for Funnel Plot Asymmetry
Kendall's tau = 0.0256, p = 0.9524
##########Egger's检验
regtest(res)
Regression Test for Funnel Plot Asymmetry
Model: mixed-effects meta-regression model
Predictor: standard error
Test for Funnel Plot Asymmetry: z = -0.4623, p = 0.6439
radial 星状图
主要反映各研究的异质性,从而发现异质性的点。弧线对应的效应评估大小分布。如下:
### radial plots
res <- rma(yi, vi, data = dat, method = "REML")
radial(res, main = "Random-Effects Model")
res <- rma(yi, vi, data = dat, method = "ML")
radial(res, main = "Mixed-Effects Model")
Q-Q分位图
绘制预测结果,观察是否在置信区间内部,如下:
### qq-normal plots
res <- rma(yi, vi, data = dat)
qqnorm(res, main = "Random-Effects Model")
res <- rma(yi, vi, mods = cbind(ablat), data = dat)
qqnorm(res, main = "Mixed-Effects Model")
labbe 拉贝图
该函数计算两组的臂水平结果(例如,治疗和控制),并将它们相互比较。特别是功能块的原始比例两组互相在分析风险差异,比例在分析的日志(日志)风险比率,日志赔率在分析优势比(日志),平方根反正弦转换比例在分析平方根反正弦转换风险差异,分析发病率差异时用原始发病率,分析发病率比时用发病率的对数(log),分析发病率差异时用发病率的平方根转换,如下:
res <- rma(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
### default plot
labbe(res)
### funnel plot with risk values on the x- and y-axis and add grid
labbe(res, transf=exp, grid=TRUE)
# }
04 分析结果解读
可以看到I^2为71.98%,属于高度异质性,可采用随机效应模型。异质性低的时候可以采用固定效应模型和随机效应模型,结果差别不大,但高异质性只能选择随机效应模型,否则会使结果外推性受到约束。此处选择随机效应模型是出于保守情况考虑。
random-effect model是基于跨研究间存在异质性的假设,该合并模型承认研究间异质性的存在,但是不对异质性加以处理;如果纳入合并的研究间存在异质性,尽管未达到我们常规设定的I^2>50%,但是在用 fixed-effect model合并时,默认运算直接忽略这一部分异质性的存在,这样合并的结果会造成假阳性误差,而选用random-effect model合并时,尽管不处理异质性,但是其默认运算承认异质性的存在,合并结果更可信!
从漏斗图以及Begg's检验及Egger's 检验的结果可以看出,P值都是大于0.05的,也就意味着没有发表偏倚。
Meta 分析同样是发文章的一种选择,搞清这些图解以及内部的方法,至于SCI能发到多少分,还需要看我们研究的课题的新颖程度,而方法也就是这些,表现形式基本都有,只要您有好的数据,顺利发表个SCI高分还是指日可待的,下期在聊聊其他软件!
关注公众号,扫码进群,免费解答,后期会有免费直播教程,敬请期待!
Reference:
Viechtbauer, W. (2010). Conducting Meta-Analyses in R with the metafor Package. Journal of Statistical Software, 36(3), 1–48.
Colditz, G. A., Brewer, T. F., Berkey, C. S., Wilson, M. E., Burdick, E., Fineberg, H. V., & Mosteller, F. (1994). Efficacy of BCG vaccine in the prevention of tuberculosis: Meta-analysis of the published literature. Journal of the American Medical Association, 271(9), 698–702.
Berkey, C. S., Hoaglin, D. C., Mosteller, F., & Colditz, G. A. (1995). A random-effects regression model for meta-analysis. Statistics in Medicine, 14(4), 395–411.
van Houwelingen, H. C., Arends, L. R., & Stijnen, T. (2002). Advanced methods in meta-analysis: Multivariate approach and meta-regression. Statistics in Medicine, 21(4), 589–624.
Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48.
公众号:桓峰基因 生物信息分析,SCI文章撰写及生物信息基础知识学习:R语言学习,perl基础编程,linux系统命令,Python遇见更好的你 44篇原创内容 -->
本文使用 文章同步助手 同步
网友评论