The t-Test, One of a Kind in the History of the Planet

Author: 小洁忘了怎么分身 | Published: 2019-10-03 16:31


date: "2019-09-19"

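I've written plenty of R posts without ever touching statistics, so here it comes! In my third year at university I skipped a whole semester of biostatistics lectures, then crammed for three days from the professor's slides and scored 93. Now I need to catch up, and I'd love to have those slides back, but none of my classmates kept them and the professor retired right after teaching our class. All I remember is that it was pretty simple! So, in my usual spirit of boiling complicated things down, I'm starting to grind through statistics. Bring it on.

0. Prepare the data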

x1 = iris$Sepal.Length[1:50]
x2 = iris$Petal.Length[51:100]

The t-test is nothing more than a function. Run ?t.test to read its help page; everybody has it.

t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)
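Because `t.test` returns a list of class "htest", the individual numbers can be pulled out by name instead of being read off the printout. A minimal sketch using the same `x1` as in step 0 (the name `res` is only an illustrative choice):

```r
# Same data as in step 0
x1 <- iris$Sepal.Length[1:50]

res <- t.test(x1)   # one-sample test against the default mu = 0

class(res)          # "htest"
names(res)          # components stored in the result
res$statistic       # t value
res$parameter       # degrees of freedom
res$p.value         # p-value
res$conf.int        # confidence interval (carries a "conf.level" attribute)
res$estimate        # sample mean
```

This is handy when the p-value or interval has to feed into later code rather than just be looked at.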

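1. The simplest t-tests

(1) t-test for a single population mean

This tests whether the mean equals a given value; the mu argument defaults to 0. The main thing to look at is the p-value. The output also reports the t statistic, the degrees of freedom, the alternative hypothesis, the 95% confidence interval and the sample mean.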

t.test(x1)
#> 
#>  One Sample t-test
#> 
#> data:  x1
#> t = 100.42, df = 49, p-value < 2.2e-16
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  4.905824 5.106176
#> sample estimates:
#> mean of x 
#>     5.006
t.test(x2,mu = 3)
#> 
#>  One Sample t-test
#> 
#> data:  x2
#> t = 18.96, df = 49, p-value < 2.2e-16
#> alternative hypothesis: true mean is not equal to 3
#> 95 percent confidence interval:
#>  4.126453 4.393547
#> sample estimates:
#> mean of x 
#>      4.26

With p < 0.05, we reject the hypothesis "the mean of x2 equals 3".
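To see where these numbers come from, the one-sample statistic can be recomputed by hand: the t value is the distance between the sample mean and the hypothesised mean, measured in standard errors. A small sketch, where `mu0`, `t_stat`, `dof` and `p_val` are illustrative names rather than anything from the post:

```r
x2  <- iris$Petal.Length[51:100]
mu0 <- 3                                    # hypothesised mean, as in t.test(x2, mu = 3)

n      <- length(x2)
t_stat <- (mean(x2) - mu0) / (sd(x2) / sqrt(n))  # (sample mean - mu0) / standard error
dof    <- n - 1                                  # degrees of freedom
p_val  <- 2 * pt(-abs(t_stat), df = dof)         # two-sided p-value from the t distribution

res <- t.test(x2, mu = mu0)
all.equal(t_stat, unname(res$statistic))    # manual t matches t.test
all.equal(p_val, res$p.value)               # manual p-value matches t.test
```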

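(2) t-test for two population means

This tests whether two population means are equal; again, look at the p-value.

Hypothesis: "x1 and x2 have equal means"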

t.test(x1,x2)
#> 
#>  Welch Two Sample t-test
#> 
#> data:  x1 and x2
#> t = 8.9799, df = 90.882, p-value = 3.514e-14
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  0.5809806 0.9110194
#> sample estimates:
#> mean of x mean of y 
#>     5.006     4.260

With p < 0.05, we reject the hypothesis "x1 and x2 have equal means".
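The fractional degrees of freedom (90.882) in the printout come from the Welch-Satterthwaite approximation that the Welch test uses when the variances are not assumed equal. A hand-rolled sketch, with `v1`, `v2`, `t_stat` and `dof` as illustrative names:

```r
x1 <- iris$Sepal.Length[1:50]
x2 <- iris$Petal.Length[51:100]

v1 <- var(x1) / length(x1)   # squared standard error of mean(x1)
v2 <- var(x2) / length(x2)   # squared standard error of mean(x2)

# Welch t statistic: difference in means over the combined standard error
t_stat <- (mean(x1) - mean(x2)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
dof <- (v1 + v2)^2 / (v1^2 / (length(x1) - 1) + v2^2 / (length(x2) - 1))

res <- t.test(x1, x2)
all.equal(t_stat, unname(res$statistic))   # manual t matches t.test
all.equal(dof, unname(res$parameter))      # manual df matches t.test
```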

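2. One-sided hypothesis tests

(1) One population

The alternative argument takes "two.sided", "less" or "greater". It gives the direction of the alternative hypothesis, not of the null. So the null hypothesis in the code below is that the mean of x1 is less than or equal to 5, not greater. Read it carefully!

Null hypothesis: "the mean of x1 is at most 5"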

t.test(x1,mu=5,alternative = "greater")
#> 
#>  One Sample t-test
#> 
#> data:  x1
#> t = 0.12036, df = 49, p-value = 0.4523
#> alternative hypothesis: true mean is greater than 5
#> 95 percent confidence interval:
#>  4.922425      Inf
#> sample estimates:
#> mean of x 
#>     5.006

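The p-value is greater than 0.05, so we cannot reject the null hypothesis: the data give no evidence that the mean of x1 exceeds 5. (Failing to reject is not the same as proving that the mean is at most 5.)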
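The three choices of `alternative` are tied together: for the same data and `mu`, the two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them. A quick check (variable names are illustrative):

```r
x1 <- iris$Sepal.Length[1:50]

p_greater <- t.test(x1, mu = 5, alternative = "greater")$p.value
p_less    <- t.test(x1, mu = 5, alternative = "less")$p.value
p_two     <- t.test(x1, mu = 5, alternative = "two.sided")$p.value

all.equal(p_greater + p_less, 1)               # one-sided p-values are complementary
all.equal(p_two, 2 * min(p_greater, p_less))   # two-sided is twice the smaller tail
```

Picking the wrong direction therefore turns a small p-value into one close to 1, which is why the direction has to be decided from the alternative hypothesis.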

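(2) Two populations

Null hypothesis: "the difference between the means of x1 and x2 is at least 1" (mu = 1 sets the hypothesised difference, and alternative = "less" makes the alternative "the difference is less than 1")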

t.test(x1,x2,mu = 1,alternative = "less")
#> 
#>  Welch Two Sample t-test
#> 
#> data:  x1 and x2
#> t = -3.0575, df = 90.882, p-value = 0.001466
#> alternative hypothesis: true difference in means is less than 1
#> 95 percent confidence interval:
#>      -Inf 0.884052
#> sample estimates:
#> mean of x mean of y 
#>     5.006     4.260

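The p-value is less than 0.05, so we reject the null hypothesis and conclude that the difference between the means of x1 and x2 is less than 1.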

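3. Slightly more complex two-population tests

The classical (Student's) t-test assumes that the data are normally distributed and that the two groups have equal variances. R's t.test defaults to Welch's version, which drops the equal-variance assumption.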
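One common way to check these assumptions before choosing `var.equal` is a Shapiro-Wilk test for normality and an F test for equal variances, both in base R's stats package. The decision rule below (and the name `equal_var`) is an assumption made for illustration, not something the post prescribes; many people simply keep the Welch default.

```r
x1 <- iris$Sepal.Length[1:50]
x2 <- iris$Petal.Length[51:100]

# Shapiro-Wilk test: H0 is that the sample comes from a normal distribution
shapiro.test(x1)$p.value
shapiro.test(x2)$p.value

# F test: H0 is that the two population variances are equal
vt <- var.test(x1, x2)
vt$p.value

# Hypothetical rule: use the pooled test only if the F test does not reject
equal_var <- vt$p.value >= 0.05
res <- t.test(x1, x2, var.equal = equal_var)
res$method   # shows which variant was actually run
```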

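(1) Two populations with equal variances

By default, t.test treats the two population variances as unequal! If they are equal, add the argument var.equal = TRUE.

Null hypothesis: the difference between the means of x1 and x2 is greater than or equal to 0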

t.test(x1,x2,alternative = "less",var.equal = TRUE)
#> 
#>  Two Sample t-test
#> 
#> data:  x1 and x2
#> t = 8.9799, df = 98, p-value = 1
#> alternative hypothesis: true difference in means is less than 0
#> 95 percent confidence interval:
#>       -Inf 0.8839488
#> sample estimates:
#> mean of x mean of y 
#>     5.006     4.260

The p-value is 1, so we cannot reject the null hypothesis.
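Note that t = 8.9799 here is the same value the Welch test printed earlier; only the degrees of freedom changed (98 instead of 90.882). That is expected when both groups have the same size, because the pooled and Welch standard errors then coincide. A short comparison (the names `welch` and `pooled` are illustrative):

```r
x1 <- iris$Sepal.Length[1:50]
x2 <- iris$Petal.Length[51:100]

welch  <- t.test(x1, x2)                    # default: var.equal = FALSE
pooled <- t.test(x1, x2, var.equal = TRUE)  # classical Student's t-test

# Same t statistic because n1 == n2 ...
c(welch = unname(welch$statistic), pooled = unname(pooled$statistic))
# ... but different degrees of freedom
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))
```

With unequal group sizes the two statistics would generally differ as well.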

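(2) Two paired populations

For example, one group of patients measured before and after taking a drug, or tumor tissue and adjacent normal tissue from the same patients.

Hypothesis: the means before and after treatment are equal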

df <- data.frame(bf = rnorm(10),af = runif(10))
df
#>             bf         af
#> 1   0.58010472 0.86960663
#> 2   0.25525231 0.52773048
#> 3  -0.77518791 0.36981479
#> 4  -1.10502267 0.92992487
#> 5  -0.86170377 0.55808954
#> 6  -0.42196653 0.12709410
#> 7  -2.35900186 0.26329736
#> 8   0.02647175 0.29790846
#> 9   0.90500441 0.04198544
#> 10  0.03287849 0.51154376
t.test(df$bf,df$af,paired = TRUE)
#> 
#>  Paired t-test
#> 
#> data:  df$bf and df$af
#> t = -2.5859, df = 9, p-value = 0.02941
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -1.5411122 -0.1029211
#> sample estimates:
#> mean of the differences 
#>              -0.8220167

p = 0.029 < 0.05, so we reject the null hypothesis that the means before and after are equal.
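A paired t-test is the same thing as a one-sample t-test on the within-pair differences. The sketch below shows this with freshly simulated data; `set.seed(1)` is an assumption added here for reproducibility (the post does not set a seed), so the numbers will not match the printout above.

```r
set.seed(1)        # assumption: seed added so the sketch is reproducible
bf <- rnorm(10)    # "before" measurements
af <- runif(10)    # "after" measurements

paired  <- t.test(bf, af, paired = TRUE)
one_smp <- t.test(bf - af)   # one-sample test on the differences, mu = 0

all.equal(unname(paired$statistic), unname(one_smp$statistic))  # same t
all.equal(paired$p.value, one_smp$p.value)                      # same p-value
```

This is also why the degrees of freedom are 9 (number of pairs minus one) rather than something close to 18.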

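(3) Testing whether the difference between two population means equals a specific value

Hypothesis: the difference between the means of x1 and x2 equals 3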

t.test(x1,x2,mu = 3)
#> 
#>  Welch Two Sample t-test
#> 
#> data:  x1 and x2
#> t = -27.132, df = 90.882, p-value < 2.2e-16
#> alternative hypothesis: true difference in means is not equal to 3
#> 95 percent confidence interval:
#>  0.5809806 0.9110194
#> sample estimates:
#> mean of x mean of y 
#>     5.006     4.260

p < 0.05, so we reject the null hypothesis.
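The confidence interval in this printout (0.581 to 0.911) is identical to the one from the plain `t.test(x1, x2)` call: `mu` moves the null hypothesis but not the interval for the difference. For a two-sided test, rejecting at the 0.05 level is the same as `mu` falling outside the 95% interval. A small sketch with illustrative names:

```r
x1 <- iris$Sepal.Length[1:50]
x2 <- iris$Petal.Length[51:100]

res0 <- t.test(x1, x2)           # H0: difference in means = 0
res3 <- t.test(x1, x2, mu = 3)   # H0: difference in means = 3

# The interval for the difference does not depend on mu
all.equal(as.numeric(res0$conf.int), as.numeric(res3$conf.int))

# Two-sided test: p < 0.05 exactly when mu lies outside the 95% interval
ci <- res3$conf.int
outside <- 3 < ci[1] || 3 > ci[2]
c(outside_ci = outside, p_below_alpha = res3$p.value < 0.05)
```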

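4. Specifying the significance level

Commonly used significance levels are 0.01, 0.05 and 0.1. In t.test the level is set through conf.level, which equals 1 minus the significance level and defaults to 0.95 (a significance level of 0.05). The argument only changes the reported confidence interval; the p-value stays the same.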

t.test(x1,mu = 4,conf.level = 0.99)
#> 
#>  One Sample t-test
#> 
#> data:  x1
#> t = 20.181, df = 49, p-value < 2.2e-16
#> alternative hypothesis: true mean is not equal to 4
#> 99 percent confidence interval:
#>  4.872406 5.139594
#> sample estimates:
#> mean of x 
#>     5.006
t.test(x1,mu = 4,conf.level = 0.9)
#> 
#>  One Sample t-test
#> 
#> data:  x1
#> t = 20.181, df = 49, p-value < 2.2e-16
#> alternative hypothesis: true mean is not equal to 4
#> 90 percent confidence interval:
#>  4.922425 5.089575
#> sample estimates:
#> mean of x 
#>     5.006
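Comparing the two printouts, only the interval changes with `conf.level`; the t statistic and p-value are untouched. To test at, say, the 0.01 level, the p-value still has to be compared with 0.01 by hand. A brief check (the names `r99`, `r90` and `alpha` are illustrative):

```r
x1 <- iris$Sepal.Length[1:50]

r99 <- t.test(x1, mu = 4, conf.level = 0.99)
r90 <- t.test(x1, mu = 4, conf.level = 0.90)

identical(r99$p.value, r90$p.value)       # p-value does not depend on conf.level
diff(r99$conf.int) > diff(r90$conf.int)   # higher confidence gives a wider interval
attr(r99$conf.int, "conf.level")          # the level is stored on the interval

alpha <- 0.01                             # chosen significance level
r99$p.value < alpha                       # decision at that level
```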