美文网首页
lme4包的使用简介

lme4包的使用简介

作者: 小潤澤 | 来源:发表于2024-09-09 12:40 被阅读0次

语法

给个链接到之前的文档:如何定义两物种之间基因表达量的保守性(内有lme4包使用说明)

基本的表达式为:resp ~ FEexpr + (REexpr1 | factor1) + (REexpr2 | factor2) + ...
固定效应的截距和斜率不会由于随机效应项的变化而变化

这里介绍以下几种常用的方法:

fm1 <- lmer(Reaction ~ Days + ( Days | Subject), dat)
fm1 <- lmer(Reaction ~ Days + (1 | Subject), dat)
fm1 <- lmer(Reaction ~ Days + (Subject | Days), dat)
fm1 <- lmer(Reaction ~ Days + (1 | Days), dat)
fm1 <- lmer(Reaction ~ Subject + (Days | Subject), dat)
fm1 <- lmer(Reaction ~ Subject + (Subject | Days), dat)
fm1 <- lmer(Reaction ~ Subject + (1 | Subject), dat)
fm1 <- lmer(Reaction ~ Subject + (1 | Days), dat)

要点:

  1. 固定效应部分写 1 代表只考虑固定效应截距项而不考虑固定效应的斜率;若赋予变量则代表同时考虑固定效应截距项和固定效应的斜率
  2. 随机效应部分 (随机效应项 | 随机分组因子),随机效应项写 1 代表只考虑固定效应截距项而不考虑固定效应的斜率;若赋予变量则代表同时考虑随机效应截距项和随机效应的斜率;随机分组因子则是对随机效应项进行进一步分组,比方随机分组因子有A,B,C三个水平分组,那么线性模型将会分A,B,C作拟合
  3. fm1 <- lmer(Reaction ~ Subject + (Subject | Subject), dat)等价于fm1 <- lmer(Reaction ~ Subject + (1 | Subject), dat)fm1 <- lmer(Reaction ~ Days+ (Days | Days), dat)等价于fm1 <- lmer(Reaction ~ Days+ (1 | Days), dat)

例子

其中 Days为连续变量,Subject为因子变量
对应第一个表达式

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Days + (Days | Subject), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Subject
  (Intercept)      Days
A    249.2140  8.502512
B    252.3907 11.351088
C    252.6106 11.548257

attr(,"class")
[1] "coef.mer"

表达式fm1 <- lmer(Reaction ~ Days + ( Days | Subject), dat)的意义是以Days为固定效应(考虑固定效应的截距和斜率),还是以Days为随机效应,即考虑随机效应的截距和斜率,分组的随机因子为Subject;因此这种情况相当于没有随机效应项,只有分组,即分组的固定效应回归模型,没有随机效应的斜率和截距,全是固定效应的斜率和截距

  1. 当Days = 0,Subject = A 时,Reaction 值为 8.502512 × 0 + 249.2140
  2. 当Days = 1,Subject = A 时,Reaction 值为 8.502512 × 1 + 249.2140
  3. 当Days = 2,Subject = A 时,Reaction 值为 8.502512 × 2 + 249.2140
  4. 当Days = 0,Subject = B 时,Reaction 值为 11.351088 × 0 + 252.3907
  5. 当Days = 1,Subject = B 时,Reaction 值为 11.351088 × 1 + 252.3907
  6. 当Days = 2,Subject = B 时,Reaction 值为 11.351088 × 2 + 252.3907
  7. 当Days = 0,Subject = C 时,Reaction 值为 11.548257 × 0 + 252.6106
  8. 当Days = 1,Subject = C 时,Reaction 值为 11.548257 × 1 + 252.6106
  9. 当Days = 2,Subject = C 时,Reaction 值为 11.548257 × 2 + 252.6106

因此它的线性关系需要分组来看:


对应第二个表达式

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Days + (1 | Subject), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Subject
  (Intercept)     Days
A    241.1497 10.46729
B    255.8238 10.46729
C    257.2418 10.46729

attr(,"class")
[1] "coef.mer"

表达式fm1 <- lmer(Reaction ~ Days + ( 1 | Subject), dat)的意义是以Days为固定效应(考虑固定效应的截距和斜率),随机效应项写 1 代表不考虑随机效应的斜率项,考虑其截距项(即将随机因子的影响转换为截距加入到方程中),分组的随机因子为Subject,例如 A 中的截距 241.1497,B中的截距为255.8238 受不同Subject分组而不同,并且它们的截距为固定效应截距与随机效应截距之和,而斜率为固定效应斜率

因此它的线性关系需要分组来看:


对应第三个表达式

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Days + (Subject | Days), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Days
       SubjectB      SubjectC (Intercept)     Days
0  2.399840e-08  2.920663e-08    251.4051 10.46729
1  4.067253e-08  4.970416e-08    251.4051 10.46729
2 -8.451885e-09 -1.019267e-08    251.4051 10.46729
3  9.618661e-09  1.181750e-08    251.4051 10.46729
4 -9.571652e-09 -1.162255e-08    251.4051 10.46729
5  4.944233e-08  6.047095e-08    251.4051 10.46729
6 -6.091995e-09 -7.603941e-09    251.4051 10.46729
7  2.267801e-08  2.766016e-08    251.4051 10.46729
8  5.934868e-08  7.253005e-08    251.4051 10.46729
9  7.305345e-08  8.927133e-08    251.4051 10.46729

attr(,"class")
[1] "coef.mer"

表达式fm1 <- lmer(Reaction ~ Days + ( Subject | Days), dat)的意义是以Days为固定效应,随机效应项为因子变量Subject(即将随机效应项的斜率作为影响距加入到方程中,忽略随机效应截距),分组的随机因子为Days,按照Days进行分组计算;由于此时模型将Subject当作因子变量,所以默认SubjectA为参考物,SubjectB和SubjectC分别和SubjectA对比(此时因子变量忽略截距,即忽略随机效应截距);由于Subject为因子变量(此时因子变量忽略截距),截距=251.4051为固定效应的截距,斜率=10.46729为固定效应的斜率,此时随机效应的影响用斜率体现

因此它的线性关系需要分组来看:

  1. 当Days = 0,Subject = A 时,Reaction 值为 10.46729 × 0 + 251.4051
  2. 当Days = 0,Subject = B 时,Reaction 值为 10.46729 × 0 + 251.4051 + 2.399840e-08
  3. 当Days = 0,Subject = C 时,Reaction 值为 10.46729 × 0 + 251.4051 + 2.920663e-08
  4. 当Days = 1,Subject = A 时,Reaction 值为 10.46729 × 1 + 251.4051
  5. 当Days = 1,Subject = B 时,Reaction 值为 10.46729 × 1 + 251.4051 + 2.399840e-08
  6. 当Days = 1,Subject = C 时,Reaction 值为 10.46729 × 1 + 251.4051 + 2.920663e-08
  7. 当Days = 2,Subject = A 时,Reaction 值为 10.46729 × 2 + 251.4051
  8. 当Days = 2,Subject = B 时,Reaction 值为 10.46729 × 2 + 251.4051 + 2.399840e-08
  9. 当Days = 2,Subject = C 时,Reaction 值为 10.46729 × 2 + 251.4051 + 2.920663e-08

对应第四个表达式

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Days + (1 | Days), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Days
  (Intercept)     Days
0    251.4051 10.46729
1    251.4051 10.46729
2    251.4051 10.46729
3    251.4051 10.46729
4    251.4051 10.46729
5    251.4051 10.46729
6    251.4051 10.46729
7    251.4051 10.46729
8    251.4051 10.46729
9    251.4051 10.46729

attr(,"class")
[1] "coef.mer"

表达式fm1 <- lmer(Reaction ~ Days + ( 1 | Days), dat)的意义是以Days为固定效应,随机效应项写 1 代表不考虑随机效应的斜率,而考虑随机效应的截距;此时随机效应的截距等同固定效应的截距,因此此时只有固定效应的斜率和截距,并且随机分组因子为Days,相当于完全不考虑随机效应项,因此结果完全为固定效应的系数(斜率和截距)

  1. 当Days = 0,Reaction 值为 251.4051 + 10.46729 × 0
  2. 当Days = 1,Reaction 值为 251.4051 + 10.46729 × 1
  3. 当Days = 2,Reaction 值为 251.4051 + 10.46729 × 2
  4. 当Days = 3,Reaction 值为 251.4051 + 10.46729 × 3
  5. 当Days = 4,Reaction 值为 251.4051 + 10.46729 × 4
  6. 当Days = 5,Reaction 值为 251.4051 + 10.46729 × 5
  7. 当Days = 6,Reaction 值为 251.4051 + 10.46729 × 6
  8. 当Days = 7,Reaction 值为 251.4051 + 10.46729 × 7
  9. 当Days = 8,Reaction 值为 251.4051 + 10.46729 × 8
  10. 当Days = 9,Reaction 值为 251.4051 + 10.46729 × 9

对应第五个表达式

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Subject + (Days | Subject), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Subject
       Days (Intercept) SubjectB SubjectC
A  7.680847    250.1575 1.854289 4.273369
B 11.291513    251.7820 1.854289 4.273369
C 11.187903    251.7354 1.854289 4.273369

attr(,"class")
[1] "coef.mer"

表达式fm1 <- lmer(Reaction ~ Subject + (Days | Subject), dat)的意义是以Subject 为固定效应(由于Subject为因子变量,所以不考虑固定效应的截距),随机效应项为Days(即考虑随机效应的斜率和截距),分组的随机因子为Days,按照Days进行分组计算;由于此时模型将Subject当作因子变量,所以默认SubjectA为参考物,SubjectB和SubjectC分别和SubjectA对比(此时因子变量忽略截距,忽略随机效应截距);因此固定效应只有统一的斜率项:斜率B = 1.854289,斜率C = 4.273369

  1. 当Days = 0,Subject = A 时,Reaction 值为 7.680847 × 0 + 250.1575
  2. 当Days = 0,Subject = B 时,Reaction 值为 11.291513 × 0 + 251.7820 + 1.854289
  3. 当Days = 0,Subject = C 时,Reaction 值为 11.187903 × 0 + 251.7354 + 4.273369
  4. 当Days = 1,Subject = A 时,Reaction 值为 7.680847 × 1 + 250.1575
  5. 当Days = 1,Subject = B 时,Reaction 值为 11.291513 × 1 + 251.7820 + 1.854289
  6. 当Days = 1,Subject = C 时,Reaction 值为 11.187903 × 1 + 251.7354 + 4.273369
  7. 当Days = 2,Subject = A 时,Reaction 值为 7.680847 × 2 + 250.1575
  8. 当Days = 2,Subject = B 时,Reaction 值为 1.291513 × 2 + 251.7820 + 1.854289
  9. 当Days = 2,Subject = C 时,Reaction 值为 11.187903 × 2 + 251.7354 + 4.273369

对应第六个表达式

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Subject + (Subject | Days), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Days
  (Intercept)  SubjectB SubjectC
0    257.1725  5.629783  8.06793
1    263.1822  8.705003 11.02711
2    262.9781  8.600570 10.92662
3    273.9903 14.235679 16.34910
4    277.5391 16.051596 18.09649
5    291.3354 23.111346 24.88986
6    292.6354 23.776574 25.52998
7    298.5781 26.817544 28.45621
8    310.3491 32.840869 34.25225
9    319.4526 37.499269 38.73487

attr(,"class")
[1] "coef.mer"

表达式fm1 <- lmer(Reaction ~ Subject + (Subject | Days), dat)的意义是以Subject 为固定效应,还是以Subject 为随机效应,即考虑随机效应的截距和斜率,分组的随机因子为Days,按照Days分组计算;因此这种情况相当于没有随机效应项,只有分组,即分组的固定效应回归模型,没有随机效应的斜率和截距,全是分组后的固定效应的斜率和截距

  1. 当 Subject = A 时,Days = 0,Reaction = 257.1725
  2. 当 Subject = B 时,Days = 0,Reaction = 257.1725 + 5.629783
  3. 当 Subject = C 时,Days = 0,Reaction = 257.1725 + 8.06793
  4. 当 Subject = A 时,Days = 1,Reaction = 263.1822
  5. 当 Subject = B 时,Days = 1,Reaction = 263.1822 + 8.705003
  6. 当 Subject = C 时,Days = 1,Reaction = 263.1822 + 11.02711
  7. 当 Subject = A 时,Days = 2,Reaction = 262.9781
  8. 当 Subject = B 时,Days = 2,Reaction = 262.9781 + 8.600570
  9. 当 Subject = C 时,Days = 2,Reaction = 262.9781 + 10.92662

对应第七个表达式
当然如果将只考虑Subject的固定效应项

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Subject + (1 | Subject), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Subject
  (Intercept) SubjectB SubjectC
A    284.7213 19.72682 21.63304
B    284.7213 19.72682 21.63304
C    284.7213 19.72682 21.63304

attr(,"class")
[1] "coef.mer"

表达式fm1 <- lmer(Reaction ~ Subject + ( 1 | Subject), dat)的意义是以Subject 为固定效应,随机效应项写 1 代表不考虑随机效应的斜率,而考虑随机效应的截距;此时随机效应的截距等同固定效应的截距,因此此时只有固定效应的斜率和截距,并且随机分组因子为 Subject,相当于完全不考虑随机效应项,因此结果完全为固定效应的系数(斜率和截距);由于Subject为因子变量,所以默认 Subject A 为参考物,Subject B和Subject C分别与Subject A作比较

  1. 当 Subject = A 时,Reaction = 284.7213
  2. 当 Subject = B 时,Reaction = 284.7213 + 19.72682
  3. 当 Subject = C 时,Reaction = 284.7213 + 21.63304

对应第八个表达式

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Subject + (1 | Days), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Days
  (Intercept) SubjectB SubjectC
0    248.0516 19.72682 21.63304
1    254.9236 19.72682 21.63304
2    255.6824 19.72682 21.63304
3    271.1280 19.72682 21.63304
4    276.0844 19.72682 21.63304
5    293.4914 19.72682 21.63304
6    296.6977 19.72682 21.63304
7    302.4557 19.72682 21.63304
8    318.1192 19.72682 21.63304
9    330.5787 19.72682 21.63304

attr(,"class")
[1] "coef.mer"

表达式fm1 <- lmer(Reaction ~ Subject + ( 1 | Days), dat)的意义是以Subject 为固定效应,随机效应项写 1 代表不考虑随机效应的斜率,而考虑随机效应的截距(即将随机因子的影响转换为截距加入到方程中),并且随机分组因子为 Days,按Days进行分组计算;由于Subject为因子变量,所以默认 Subject A 为参考物,Subject B和Subject C分别与Subject A作比较;而Days作为连续变量带入对应数值运算;因此结果的截距为固定效应的截距和随机效应的截距之和,斜率为固定效应的斜率:斜率B=19.72682,斜率C=21.63304

  1. 当 Subject = A 时,Days =0,Reaction = 248.0516
  2. 当 Subject = A 时,Days =1,Reaction = 254.9236
  3. 当 Subject = A 时,Days =2,Reaction = 255.6824
  4. 当 Subject = B 时,Days =0,Reaction = 248.0516 + 19.72682
  5. 当 Subject = B 时,Days =1,Reaction = 254.9236 + 19.72682
  6. 当 Subject = B 时,Days =2,Reaction = 255.6824 + 19.72682
  7. 当 Subject = C 时,Days =0,Reaction = 248.0516 + 21.63304
  8. 当 Subject = C 时,Days =1,Reaction = 254.9236 + 21.63304
  9. 当 Subject = C 时,Days =2,Reaction = 255.6824 + 21.63304

复合情形

library(lme4)
data(sleepstudy)

dat = sleepstudy
dat$Subject = as.factor(c(rep('A',60),rep('B',60),rep('C',60)))
fm1 <- lmer(Reaction ~ Subject + (1 | Days) + (1 | Subject), dat)
summary(fm1)

coef(fm1)

> coef(fm1)
$Days
  (Intercept) SubjectB SubjectC
0    248.0518 19.72682 21.63304
1    254.9238 19.72682 21.63304
2    255.6826 19.72682 21.63304
3    271.1280 19.72682 21.63304
4    276.0844 19.72682 21.63304
5    293.4914 19.72682 21.63304
6    296.6977 19.72682 21.63304
7    302.4556 19.72682 21.63304
8    318.1190 19.72682 21.63304
9    330.5784 19.72682 21.63304

$Subject
  (Intercept) SubjectB SubjectC
A    284.7213 19.72682 21.63304
B    284.7213 19.72682 21.63304
C    284.7213 19.72682 21.63304

attr(,"class")
[1] "coef.mer"

这种随机效应相加的情形相当于将这两项随机效应分开来看,详见第七个表达式和第八个表达式

相关文章

网友评论

      本文标题:lme4包的使用简介

      本文链接:https://www.haomeiwen.com/subject/ryaeljtx.html