Running a "chi-squared test" with the R package ggstatsplot

Author: Whuer_deng | Published: 2019-07-23 07:59
library(ggstatsplot)
library(ggplot2)
library(dplyr)
data("diamonds")

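# Keep diamonds whose color is J, H or F and whose clarity is SI2, VS1 or IF; store the result as diamonds2.
# Note: the output printed below was generated with color == c('J', 'H', 'F'), which recycles the vector and keeps only part of the intended rows, so this %in% filter returns more rows than the n shown below.
diamonds2 <- diamonds %>% 
  filter(color %in% c('J', 'H', 'F'), clarity %in% c('SI2', 'VS1', 'IF'))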
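
A side note on selecting several levels at once: comparing a column to a vector with `==` recycles the vector and compares position by position, whereas `%in%` tests set membership. The sketch below uses a small made-up vector (not the diamonds data) to show the difference.

```r
# Toy data, made up for illustration (not taken from diamonds).
color  <- c("J", "H", "F", "F", "J", "H")
wanted <- c("J", "H", "F")

# `==` recycles `wanted` to J, H, F, J, H, F and compares position by position,
# so an element matches only if it lines up with the recycled value.
eq_match <- color == wanted

# `%in%` asks, for each element, whether it occurs anywhere in `wanted`.
in_match <- color %in% wanted

print(sum(eq_match))  # number of elements kept by ==
print(sum(in_match))  # number of elements kept by %in%
```

In this toy case `==` keeps 3 of the 6 elements while `%in%` keeps all 6, which is why `%in%` is the usual choice inside `filter()` when matching several levels.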

ggbarstats(diamonds2, color, clarity, palette = 'Set2')
# Output:
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N          F      H      J      `Chi-squared`    df `p-value` significance
  <ord>     <chr>      <chr>  <chr>  <chr>          <dbl> <dbl>     <dbl> <chr>       
1 SI2       (n = 1208) 45.20% 41.72% 13.08%         225.      2         0 ***         
2 VS1       (n = 966)  46.38% 38.20% 15.42%         149.      2         0 ***         
3 IF        (n = 251)  53.39% 39.44% 7.17%           84.6     2         0 ***   
[Figure: bar chart of color by clarity produced by ggbarstats()]
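As the figure shows, the chi-squared statistic is 15.01 with p = 0.005, below the significance level of 0.05, so independence is rejected: diamond color and clarity are associated. Within each clarity level, the counts of the three colors also differ significantly (each bar is marked with three stars, `***`). The chi-squared values of 225, 149 and 84.6 all exceed 5.99, the critical value of the chi-squared distribution with 2 degrees of freedom at alpha = 0.05, so p < 0.05 in every case; the p-values print as 0 only because of rounding.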
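
The numbers in this paragraph can be checked with base R. The sketch below rebuilds the color-by-clarity counts from the group sizes and percentages printed in the output above (an assumption: the counts are recovered as percentage times n, rounded to whole numbers, rather than read from the data), then runs `chisq.test()` for the overall test of independence and for each clarity level separately.

```r
# Counts reconstructed from the printed output: round(percentage * n).
# This is an assumption; the underlying data are not re-read here.
counts <- matrix(
  c(546, 448, 134,   # F
    504, 369,  99,   # H
    158, 149,  18),  # J
  nrow = 3, byrow = TRUE,
  dimnames = list(color = c("F", "H", "J"),
                  clarity = c("SI2", "VS1", "IF"))
)

# Overall test of independence between color and clarity,
# with df = (3 - 1) * (3 - 1) = 4.
overall <- chisq.test(counts)
print(overall)

# Goodness-of-fit test within each clarity level: the null hypothesis is
# that the three colors are equally frequent (df = 3 - 1 = 2).
per_clarity <- apply(counts, 2, function(x) unname(chisq.test(x)$statistic))
print(round(per_clarity, 1))

# Critical value of the chi-squared distribution for df = 2 at alpha = 0.05.
crit_df2 <- qchisq(0.95, df = 2)
print(crit_df2)
```

Under this reconstruction the overall statistic comes out at about 15.0 on 4 degrees of freedom and the per-level statistics at about 225, 149 and 84.6, in line with the output above. Note that 5.99 is the critical value for the 2-df within-level tests; the 4-df overall test has its own critical value, `qchisq(0.95, df = 4)`.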
ggpiestats(diamonds2, color, clarity, palette = 'Set3')
# Output:
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N          F      H      J      `Chi-squared`    df `p-value` significance
  <ord>     <chr>      <chr>  <chr>  <chr>          <dbl> <dbl>     <dbl> <chr>       
1 SI2       (n = 1208) 45.20% 41.72% 13.08%         225.      2         0 ***         
2 VS1       (n = 966)  46.38% 38.20% 15.42%         149.      2         0 ***         
3 IF        (n = 251)  53.39% 39.44% 7.17%           84.6     2         0 ***         
[Figure: pie charts of color by clarity produced by ggpiestats()]
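The statistics here are the same as for the bar chart above; only the bars have been replaced by pie charts.
Plots like these make it quick and convenient to visualize the statistics: besides the chi-squared statistic and p-value, they show the distribution within each group. For example, color J accounts for 13% of the diamonds with clarity SI2, where the most common color is F at 45%. F is also the most common color in the VS1 and IF groups, at 46% and 53% respectively.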
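
The within-group percentages quoted here are column proportions of the color-by-clarity table. A minimal base R sketch, assuming counts reconstructed from the printed percentages and group sizes rather than taken from the data:

```r
# Counts reconstructed as round(percentage * n) from the printed output
# (an assumption, not a re-read of the data).
counts <- matrix(
  c(546, 448, 134,   # F
    504, 369,  99,   # H
    158, 149,  18),  # J
  nrow = 3, byrow = TRUE,
  dimnames = list(color = c("F", "H", "J"),
                  clarity = c("SI2", "VS1", "IF"))
)

# margin = 2 normalises each clarity column so that it sums to 1.
pct <- round(100 * prop.table(counts, margin = 2), 2)
print(pct)

# Most frequent color within each clarity level.
top_color <- rownames(counts)[apply(counts, 2, which.max)]
names(top_color) <- colnames(counts)
print(top_color)
```

Each column of `pct` sums to 100 (up to rounding), so the values can be read directly as the slice or bar-segment sizes for that clarity level.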
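# diamonds2[diamonds2$cut != 'Very Good', ] drops the rows whose cut is Very Good; one pie chart is then drawn per remaining cut level.
# simulate.p.value = TRUE requests a simulated (Monte Carlo) p-value rather than an adjustment of the p-value. It is used because the Fair subset has empty cells: the IF row of the first table below is 100% F, with NA for H and J.
grouped_ggpiestats(diamonds2[diamonds2$cut != 'Very Good', ], color, clarity, grouping.var = cut, simulate.p.value = TRUE)
# Output: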
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
  <ord>     <chr> <chr> <chr> <chr>         <dbl> <dbl>     <dbl> <chr>       
1 SI2       (n =~ 47.7~ 41.7~ 10.4~          16.1     2     0     ***         
2 VS1       (n =~ 42.8~ 35.7~ 21.4~           2       2     0.368 ns          
3 IF        (n =~ 100.~ NA    NA              6       2     0.05  ns          
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
  <ord>     <chr> <chr> <chr> <chr>         <dbl> <dbl>     <dbl> <chr>       
1 SI2       (n =~ 49.6~ 35.7~ 14.6~         25.6      2     0     ***         
2 VS1       (n =~ 48.1~ 31.3~ 20.4~          9.71     2     0.008 **          
3 IF        (n =~ 69.2~ 15.3~ 15.3~          7.54     2     0.023 *           
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
  <ord>     <chr> <chr> <chr> <chr>         <dbl> <dbl>     <dbl> <chr>       
1 SI2       (n =~ 44.5~ 42.0~ 13.3~         71.7      2     0     ***         
2 VS1       (n =~ 41.5~ 41.5~ 16.8~         29.6      2     0     ***         
3 IF        (n =~ 40.0~ 48.0~ 12.0~          5.36     2     0.069 ns          
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
  <ord>     <chr> <chr> <chr> <chr>         <dbl> <dbl>     <dbl> <chr>       
1 SI2       (n =~ 45.4~ 44.6~ 9.91%          84.7     2         0 ***         
2 VS1       (n =~ 49.0~ 38.5~ 12.5~          84.7     2         0 ***         
3 IF        (n =~ 52.5~ 42.3~ 5.08%          66.3     2         0 ***  
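
On `simulate.p.value`: in base R's `chisq.test()` this argument replaces the asymptotic chi-squared p-value with one estimated by Monte Carlo simulation, which helps when expected counts are small. The sketch below uses a small made-up table, not the diamonds data. Whether `grouped_ggpiestats()` forwards the argument to `chisq.test()` in exactly this way is an assumption that depends on the installed ggstatsplot version.

```r
# Made-up sparse table (not from diamonds) with two empty cells.
sparse <- matrix(
  c(3, 0, 0,
    6, 5, 3,
    9, 2, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(clarity = c("IF", "VS1", "SI2"),
                  color = c("F", "H", "J"))
)

# Asymptotic test; R warns that the chi-squared approximation may be
# incorrect because some expected counts are small.
asym <- suppressWarnings(chisq.test(sparse))

# Monte Carlo p-value based on B simulated tables with the same margins.
set.seed(1)  # only to make the simulation reproducible
sim <- chisq.test(sparse, simulate.p.value = TRUE, B = 2000)

print(c(asymptotic = asym$p.value, simulated = sim$p.value))
```

The test statistic is the same in both calls; only the way the p-value is obtained changes, and the simulated result reports no degrees of freedom.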
[Figure: pie charts of color by clarity, one panel per cut level, produced by grouped_ggpiestats()]

Article link: https://www.haomeiwen.com/subject/imenlctx.html