R Language and Bioinformatics Applications 25 - R Syntax - tapply


Author: BioSi | Published 2019-05-12 20:58

    tapply

    tapply applies a function over subsets of a vector.

    > str(tapply)
    function (X, INDEX, FUN = NULL, ..., simplify = TRUE)
    
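    • X is a vector
    • INDEX is a factor or a list of factors (or objects that are coerced to factors)
    • FUN is the function to apply to each subset
    • ... contains other arguments to be passed to FUN
    • simplify indicates whether the result should be simplified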
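As a small illustration of the `...` argument, the sketch below forwards `na.rm = TRUE` to `mean`. The toy vector `x` and the grouping factor `g` are made up for this example and do not come from the article.

```r
# Hypothetical toy data: six values in two groups, one of them missing
x <- c(1, 2, NA, 4, 5, 6)
g <- factor(c("a", "a", "a", "b", "b", "b"))

# Without na.rm the NA propagates into the mean of group "a"
with_na <- tapply(x, g, mean)
with_na

# na.rm = TRUE is not an argument of tapply; it travels through `...` to mean()
res <- tapply(x, g, mean, na.rm = TRUE)
res
```

Any extra named argument that `FUN` understands can be supplied the same way.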

    Take group means.

    > x <- c(rnorm(10), runif(10), rnorm(10, 1))
    > f <- gl(3, 10)
    > f
     [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
    Levels: 1 2 3
    > tapply(x, f, mean)
            1         2         3 
    0.1045255 0.4867243 0.9131191 
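Because the data above are random, rerunning the example gives different numbers. The sketch below fixes a seed (the value 1 is an arbitrary choice, not taken from the article) and checks the tapply result against means computed by hand on each block of ten values.

```r
set.seed(1)  # arbitrary seed, used only to make the run repeatable

x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)  # three levels, each repeated ten times

group_means <- tapply(x, f, mean)

# The same means computed by indexing each block of ten values directly
manual <- c(mean(x[1:10]), mean(x[11:20]), mean(x[21:30]))

all.equal(as.vector(group_means), manual)
```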
    

    Take group means without simplification; the result is returned as a list.

    > tapply(x, f, mean, simplify = FALSE)
    $`1`
    [1] 0.1045255
    
    $`2`
    [1] 0.4867243
    
    $`3`
    [1] 0.9131191
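To make the effect of `simplify` concrete, the sketch below compares the type of the two results on a small made-up vector (the values are illustrative only).

```r
# Hypothetical toy data: two groups of three values
x <- c(1, 2, 3, 10, 20, 30)
f <- gl(2, 3)

simplified <- tapply(x, f, mean)                    # numeric result
as_list    <- tapply(x, f, mean, simplify = FALSE)  # list-mode result

is.numeric(simplified)
is.list(as_list)

# Flattening the list recovers the same values
flattened <- unname(unlist(as_list))
flattened
```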
    
    

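    Find the range of each group. Because range returns two values per group, the result cannot be simplified to a vector and is returned as a list.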

    > tapply(x, f, range)
    $`1`
    [1] -0.8040998  1.0022698
    
    $`2`
    [1] 0.04577595 0.95238798
    
    $`3`
    [1] -0.4422177  2.3863979
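When every group yields a vector of the same length, the list can be stacked into a matrix. The following is one possible sketch using `do.call(rbind, ...)` on made-up data; the column names `min` and `max` are labels chosen for this example.

```r
# Hypothetical toy data: two groups of three values
x <- c(3, 1, 2, 10, 30, 20)
f <- gl(2, 3)

r <- tapply(x, f, range)  # one length-2 vector (min, max) per group

# Stack the per-group ranges into a matrix with one row per group
m <- do.call(rbind, r)
colnames(m) <- c("min", "max")
m
```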
    
    

    split

    split takes a vector or other object and splits it into groups determined by a factor or list of factors.

    > str(split)
    function (x, f, drop = FALSE, ...)  
    
    • x is a vector (or list) or data frame
    • f is a factor (or coerced to one) or a list of factors
    • drop indicates whether empty factor levels should be dropped
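A minimal sketch of split on a small, made-up vector, together with `unsplit`, which reverses the operation when given the same factor:

```r
# Hypothetical toy data with alternating group labels
x <- c(10, 20, 30, 40, 50, 60)
f <- factor(c("a", "b", "a", "b", "a", "b"))

parts <- split(x, f)
parts$a  # 10 30 50
parts$b  # 20 40 60

# unsplit() puts the pieces back in their original positions
restored <- unsplit(parts, f)
restored
```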

    > split(x, f)
    $`1`
     [1]  0.06417511  0.77601085  1.66855356  1.38744423
     [5] -0.90908770  0.39727163 -2.13528805  0.29087121
     [9]  0.82936584  0.53773723
    
    $`2`
     [1] 0.6646064 0.4408925 0.3199122 0.2156969 0.8358507
     [6] 0.1408568 0.4088236 0.2258691 0.9606134 0.7945027
    
    $`3`
     [1]  0.65276220  2.46645556  2.72756544  1.77246304
     [5]  2.94941952  0.11977102 -0.04283368  2.36610370
     [9]  0.44573942  2.31295594
    
    

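    An example of using lapply together with split. The means differ from the earlier tapply output, presumably because x was regenerated from fresh random draws.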

    > lapply(split(x, f), mean)
    $`1`
    [1] 0.2907054
    
    $`2`
    [1] 0.5007624
    
    $`3`
    [1] 1.57704
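For scalar summaries such as the mean, split followed by `sapply` gives the same numbers as tapply. A short sketch on made-up data:

```r
# Hypothetical toy data: three groups of two values
x <- c(2, 4, 6, 8, 10, 12)
f <- gl(3, 2)

via_split  <- sapply(split(x, f), mean)  # named numeric vector
via_tapply <- tapply(x, f, mean)         # 1-d array with the same names

via_split
all.equal(as.vector(via_split), as.vector(via_tapply))
```

The split form is more general, since the pieces can be data frames or lists rather than just subsets of a vector.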
    
    

    Splitting a Data Frame

    > library(datasets)
    > head(airquality)
      Ozone Solar.R Wind Temp Month Day
    1    41     190  7.4   67     5   1
    2    36     118  8.0   72     5   2
    3    12     149 12.6   74     5   3
    4    18     313 11.5   62     5   4
    5    NA      NA 14.3   56     5   5
    6    28      NA 14.9   66     5   6
    
    > s <- split(airquality, airquality$Month)
    > lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))
    $`5`
       Ozone  Solar.R     Wind 
          NA       NA 11.62258 
    
    $`6`
        Ozone   Solar.R      Wind 
           NA 190.16667  10.26667 
    
    $`7`
         Ozone    Solar.R       Wind 
            NA 216.483871   8.941935 
    
    $`8`
       Ozone  Solar.R     Wind 
          NA       NA 8.793548 
    
    $`9`
       Ozone  Solar.R     Wind 
          NA 167.4333  10.1800 
    
    

    > sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")])) 
                   5         6          7        8        9
    Ozone         NA        NA         NA       NA       NA
    Solar.R       NA 190.16667 216.483871       NA 167.4333
    Wind    11.62258  10.26667   8.941935 8.793548  10.1800
    
    > sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")],
                                     na.rm = TRUE))
                      5            6             7            8           9 
    Ozone      23.61538     29.44444     59.115385    59.961538   31.44828 
    Solar.R   181.29630    190.16667    216.483871   171.857143  167.43333 
    Wind       11.62258     10.26667      8.941935     8.793548   10.18000
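As a cross-check, the same monthly means can be obtained with `aggregate`, a different base R function that the article itself does not use. The sketch below assumes the data-frame method with a `by` list, so that `na.rm = TRUE` is applied per column as in the `colMeans` call; the variable names `cols`, `by_split` and `by_aggregate` are introduced here for illustration.

```r
library(datasets)

cols <- c("Ozone", "Solar.R", "Wind")

# split + sapply, as in the article: one column of means per month
s <- split(airquality, airquality$Month)
by_split <- sapply(s, function(d) colMeans(d[, cols], na.rm = TRUE))

# aggregate(): one row per month; na.rm = TRUE is forwarded to mean()
by_aggregate <- aggregate(airquality[, cols],
                          by = list(Month = airquality$Month),
                          FUN = mean, na.rm = TRUE)

# Same numbers, transposed layout
all.equal(unname(t(by_split)), unname(as.matrix(by_aggregate[, cols])))
```

The formula interface of `aggregate` is avoided here on purpose: by default it drops every row containing any missing value, which would change the means.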
    

    Splitting on More than One Level

    > x <- rnorm(10)
    > f1 <- gl(2, 5)
    > f2 <- gl(5, 2)
    > f1
     [1] 1 1 1 1 1 2 2 2 2 2
    Levels: 1 2
    > f2
     [1] 1 1 2 2 3 3 4 4 5 5
    Levels: 1 2 3 4 5
    > interaction(f1, f2)
     [1] 1.1 1.1 1.2 1.2 1.3 2.3 2.4 2.4 2.5 2.5
    10 Levels: 1.1 2.1 1.2 2.2 1.3 2.3 1.4 ... 2.5
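The sketch below counts how many of the level combinations are actually observed, and checks that split with a list of factors names its groups after the levels of `interaction`. The vector `x` here is a simple stand-in for the random data.

```r
f1 <- gl(2, 5)
f2 <- gl(5, 2)
g  <- interaction(f1, f2)

nlevels(g)                   # 2 * 5 possible combinations
counts <- table(g)           # observations per combination
n_empty <- sum(counts == 0)  # combinations that never occur
n_empty

# split() with a list of factors uses the same combined levels as names
x <- seq_len(10)             # stand-in data, not the article's rnorm(10)
identical(names(split(x, list(f1, f2))), levels(g))
```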
    

    Interactions can create empty levels.

    > str(split(x, list(f1, f2)))
    List of 10
     $ 1.1: num [1:2] -0.378  0.445
     $ 2.1: num(0)
     $ 1.2: num [1:2] 1.4066 0.0166
     $ 2.2: num(0)
     $ 1.3: num -0.355
     $ 2.3: num 0.315
     $ 1.4: num(0)
     $ 2.4: num [1:2] -0.907  0.723
     $ 1.5: num(0)
     $ 2.5: num [1:2] 0.732 0.360
    

    Empty levels can be dropped.

    > str(split(x, list(f1, f2), drop = TRUE))
    List of 6
     $ 1.1: num [1:2] -0.378  0.445
     $ 1.2: num [1:2] 1.4066 0.0166
     $ 1.3: num -0.355
     $ 2.3: num 0.315
     $ 2.4: num [1:2] -0.907  0.723
     $ 2.5: num [1:2] 0.732 0.360
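Dropping empty levels with `drop = TRUE` should be equivalent to filtering out the zero-length elements afterwards. A sketch of that comparison, using a deterministic stand-in for `x`:

```r
x  <- 1:10      # stand-in data, not the article's rnorm(10)
f1 <- gl(2, 5)
f2 <- gl(5, 2)

all_groups <- split(x, list(f1, f2))               # includes empty groups
kept       <- split(x, list(f1, f2), drop = TRUE)  # empty groups dropped

# Manual alternative: keep only the elements that contain data
manual <- all_groups[lengths(all_groups) > 0]

length(all_groups)
length(kept)
all.equal(kept, manual)
```

Using `drop = TRUE` up front is usually tidier when the result is passed straight to `lapply` or `sapply`, since summary functions applied to empty groups tend to return `NaN` or warnings.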
    

        Article link: https://www.haomeiwen.com/subject/sxoonqtx.html