Making a baseline table with the R package tableone

Author: 上校的猫 | Published: 2019-10-06 10:30

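    Overall this is fairly simple. First, pay attention to the variable type: is it a continuous variable or a categorical variable? Second, look at how each variable is distributed: whether the continuous variables are approximately normal, and whether the sample size is too small. These two points determine which test to use.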
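
One quick way to look at the distribution question is a Shapiro-Wilk test in base R. The sketch below uses simulated values, not the data from this post; the variable names `income_like` and `age_like` are made up for illustration, and a histogram or Q-Q plot is usually worth checking alongside the p-value.

```r
# Simulated data (hypothetical): one right-skewed and one roughly normal variable
set.seed(42)
income_like <- rlnorm(200, meanlog = 10, sdlog = 1)  # log-normal, strongly skewed
age_like    <- rnorm(200, mean = 50, sd = 10)        # drawn from a normal distribution

# Shapiro-Wilk test: a small p-value suggests departure from normality
sw_income <- shapiro.test(income_like)
sw_age    <- shapiro.test(age_like)

c(income = sw_income$p.value, age = sw_age$p.value)
```

A variable that looks clearly non-normal is a candidate for the nonnormal argument of print(), which reports median [IQR] instead of mean (SD).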

    The tableone vignette explains that when there are two or more groups, group comparison p-values are printed along with the table (leaving aside, for now, whether hypothesis testing is appropriate for Table 1 of an RCT). Very small p-values are shown with a less-than sign. The hypothesis test functions used by default are chisq.test() for categorical variables (with continuity correction) and oneway.test() for continuous variables (with the equal variance assumption, i.e., regular ANOVA). A two-group ANOVA is equivalent to a t-test.
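
The defaults described above can be reproduced with base R alone. This is a minimal sketch on simulated data (the variables `group`, `age`, and `smokes` are invented here), showing that a two-group oneway.test() with equal variances gives the same p-value as a pooled-variance t-test, and what the chi-squared call with continuity correction looks like.

```r
# Simulated two-group data (hypothetical values)
set.seed(1)
group  <- factor(rep(c("group1", "group2"), each = 50))
age    <- c(rnorm(50, mean = 50, sd = 10), rnorm(50, mean = 55, sd = 10))
smokes <- factor(c(rbinom(50, 1, 0.2), rbinom(50, 1, 0.4)), levels = c(0, 1))

# Continuous variable: one-way ANOVA with equal variances vs. pooled t-test
p_anova <- oneway.test(age ~ group, var.equal = TRUE)$p.value
p_ttest <- t.test(age ~ group, var.equal = TRUE)$p.value
isTRUE(all.equal(p_anova, p_ttest))  # the two p-values agree

# Categorical variable: chi-squared test with continuity correction
p_chisq <- chisq.test(table(smokes, group), correct = TRUE)$p.value
p_chisq
```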

    You may be worried about nonnormal variables and small cell counts (in the vignette's example data, the stage variable). In such a situation, you can use the nonnormal argument as well as the exact argument of the print() method. kruskal.test() is then used for the nonnormal continuous variables and fisher.test() for the categorical variables specified in the exact argument. kruskal.test() is equivalent to wilcox.test() in the two-group case. The column named test indicates which p-values were calculated using the non-default tests.
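
The non-default tests can also be tried directly in base R. The sketch below uses simulated data and made-up counts (`income`, `small_counts` are illustrative names). One caveat it makes explicit: the p-values of kruskal.test() and wilcox.test() coincide when wilcox.test() uses the normal approximation without continuity correction (exact = FALSE, correct = FALSE); with its default settings the numbers can differ slightly.

```r
# Simulated skewed variable in two groups (hypothetical values)
set.seed(2)
group  <- factor(rep(c("group1", "group2"), each = 30))
income <- c(rlnorm(30, meanlog = 10,   sdlog = 0.6),
            rlnorm(30, meanlog = 10.3, sdlog = 0.6))

# Kruskal-Wallis vs. Wilcoxon rank-sum (normal approximation, no correction)
p_kw     <- kruskal.test(income ~ group)$p.value
p_wilcox <- wilcox.test(income ~ group, exact = FALSE, correct = FALSE)$p.value
isTRUE(all.equal(p_kw, p_wilcox))  # the two p-values agree

# Fisher's exact test for a 2 x 2 table with small cell counts (made-up counts)
small_counts <- matrix(c(3, 1, 7, 9), nrow = 2,
                       dimnames = list(level = c("A", "B"),
                                       group = c("group1", "group2")))
p_fisher <- fisher.test(small_counts)$p.value
p_fisher
```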

    # generate data with package named "wakefield" ----------------------------
    # refer to https://github.com/trinker/wakefield
    library(wakefield)
    dat1 <- r_data_frame(100,
                         age(x=20:80),
                         sex(prob = c(0.8,0.2)),
                         smokes,
                         income,
                         animal,
                         likert(x=c("group1"),prob=c(1),name = "group")
                         )
    
    dat2 <- r_data_frame(100,
                         age(x=30:100),
                         sex(prob = c(0.5,0.5)),
                         smokes,
                         income,
                         animal,
                         likert(x=c("group2"),prob=c(1),name = "group")
                         )
    dat <- rbind(dat1,dat2)
    
    # make baseline table with package named "tableone"  ----------------------
    # https://cran.r-project.org/web/packages/tableone/vignettes/introduction.html
    library(tableone)
    summary(dat)
    dput(names(dat))
    a <- CreateTableOne(vars = c("Age", "Sex", "Smokes", "Income"), # variables to summarize
                        data = dat,
                        strata = "group",                 # one summary column per group
                        factorVars = c("Sex", "Smokes"))  # variables to treat as categorical
    ## Testing
    ?print.TableOne
    summary(a)
    print(a,showAllLevels = TRUE) #Showing all levels for categorical variables
    print(a, nonnormal = c("Income"),
          exact = c("Sex"),
          smd = TRUE)
    # The hypothesis test functions used by default are chisq.test() 
    # for categorical variables (with continuity correction) and 
    # oneway.test() for continuous variables (with equal variance 
    # assumption, i.e., regular ANOVA). A two-group ANOVA is equivalent 
    # to a t-test.
    ## For nonnormal variables and small cell counts
    # In such a situation, you can use the nonnormal argument as well 
    # as the exact argument of the print() method.
    # kruskal.test() is then used for the nonnormal continuous variables 
    # and fisher.test() is used for categorical variables specified in 
    # the exact argument. kruskal.test() is equivalent to wilcox.test() 
    # in the two-group case. The column named test indicates which
    # p-values were calculated using the non-default tests.
    
    ## Exporting
    a_csv <- print(a, nonnormal = c("Income"),
                   exact = c("Sex"),
                   smd = TRUE,
                   showAllLevels = TRUE,
                   quote = FALSE,
                   noSpaces = TRUE,
                   printToggle = FALSE)
    library("knitr")
    kable(a_csv,  
          align = 'c', 
          caption = 'Table 1: Comparison of unmatched samples')
    write.csv(a_csv, file = "myTable.csv")
    
                               level   group1                         group2                         p       test     SMD
    n                              100                            100
    Age (mean (SD))                47.54 (18.05)                  64.28 (19.30)                  <0.001           0.896
    Sex (%)                Male    76 (76.0)                      57 (57.0)                      0.007   exact    0.411
                           Female  24 (24.0)                      43 (43.0)
    Smokes (%)             FALSE   86 (86.0)                      79 (79.0)                      0.264            0.185
                           TRUE    14 (14.0)                      21 (21.0)
    Income (median [IQR])          33853.50 [23196.55, 53600.29]  31957.53 [16484.16, 53772.67]  0.320   nonnorm  0.046
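
The SMD column can be sanity-checked by hand from the summaries printed in the table. The sketch below assumes the usual pooled-SD definition of the standardized mean difference; `smd_continuous` and `smd_binary` are hypothetical helper names written for this check, not functions from tableone. The Income SMD cannot be recomputed this way because the table reports only its median and IQR.

```r
# Standardized mean difference for a continuous variable (pooled-SD definition, assumed)
smd_continuous <- function(m1, s1, m2, s2) {
  abs(m1 - m2) / sqrt((s1^2 + s2^2) / 2)
}

# Same idea for a binary variable, using the group proportions
smd_binary <- function(p1, p2) {
  abs(p1 - p2) / sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
}

# Values copied from the table above
smd_age    <- smd_continuous(47.54, 18.05, 64.28, 19.30)
smd_sex    <- smd_binary(0.76, 0.57)  # proportion Male in each group
smd_smokes <- smd_binary(0.14, 0.21)  # proportion TRUE in each group

round(c(Age = smd_age, Sex = smd_sex, Smokes = smd_smokes), 3)
```

Rounded to three decimals these come out as 0.896, 0.411, and 0.185, which agrees with the SMD column in the table.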
