Some things I want to share now that dplyr 1.0.0 is out

Author: 热衷组培的二货潜 | Published: 2020-06-06 22:30

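On 2020-05-29 the long-awaited dplyr 1.0.0 was finally released (well, after slipping by half a month).

Shortly before the 1.0.0 release, Hadley announced on Twitter that dplyr would be delayed by half a month, to the 29th. The old yellowish logo was also finally replaced with a flashier one, and the new logo looks quite nice.

That said, I still prefer the chalk-drawing version. Enough chit-chat; what stuck with me most is that half-month delay.

A few of my notes on dplyr 1.0.0:

Now that dplyr 1.0.0 is out, it is time for me to recommend some related resources.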

Books I recommend that build on "R for Data Science"

Tidy evaluation (the "evolved" edition): https://tidyeval.tidyverse.org/
"Modern R with the tidyverse": https://b-rodrigues.github.io/modern_R/
"Statistical Inference via Data Science: A ModernDive into R and the Tidyverse": https://moderndive.netlify.com/index.html
"The tidyverse style guide": https://style.tidyverse.org/
"R 数据分析指南与速查手册" (a guide and quick reference for data analysis in R): https://bookdown.org/xiao/RAnalysisBook/
"数据科学与 R 语言" (Data Science and the R Language): https://bookdown.org/xiangyun/RGraphics/
Sichuan Normal University graduate elective "数据科学中的 R 语言" (R for Data Science): https://bookdown.org/wangminjie/R4DS/

dplyr blog posts and resources I recommend:

Tidyverse learning material: https://www.stat.cmu.edu/~ryantibs/statcomp/lectures/
Tidyverse Q&A community: https://community.rstudio.com/c/tidyverse
Tidyverse package release news: https://www.tidyverse.org/blog/
data.table and dplyr (side-by-side comparison): https://atrebas.github.io/post/2019-03-03-datatable-dplyr/
TidyTuesday (data wrangling and visualisation examples): https://github.com/rfordatascience/tidytuesday/blob/master/README.md
TidyTuesday Twitter online Shiny app: https://nsgrantham.shinyapps.io/tidytuesdayrocks/
50 dplyr examples (strongly recommended to work through): https://www.listendata.com/2016/08/dplyr-tutorial.html
Hot questions for Dplyr (strongly recommended): https://www.thetopsites.net/projects/dplyr/ , a collection of all sorts of questions about processing data with dplyr.
Zhang Jingxin's Zhihu series "玩转数据处理120题（R语言tidyverse版本）" (120 data-wrangling exercises, R tidyverse edition):
- Exercises P1-P20
- Exercises P21-P50
- Exercises P51-P80
- Exercises P81-P100
- Exercises P101-P120

References:

The official tidyverse release announcements: reading these is really enough; everything else is derived from them.
2020-04-14: Dplyr across: First look at a new Tidyverse function
2020-04-15: The Seven Key Things You Need To Know About dplyr 1.0.0
- Twitter link: https://twitter.com/dr_keithmcnulty/status/1250404270027026432
- 1. Built in tidyselect
- 2. relocate()
- 3. Superpowered summarise()
- 4. colwise using across()
- 5. new rowwise() grammar
- 6. easy modeling inside dataframes
- 7. nest_by()
2020-04-11: dplyr 1.0 code examples: you can skip this one; the official examples are enough.
The #dplyr hashtag on Twitter
Nick Merlino, 2020-05-27: My Favorite dplyr 1.0.0 Features
Tidyverse Case Study: Anscombe’s quartet
Zhang Jingxin on Zhihu: "【R语言】dplyr1.0.0新功能解读" (a walkthrough of the new features in dplyr 1.0.0)
2020-06-02: dplyr 1.0.0 (a 58-slide deck), practically a history of the dplyr package (strongly recommended).
- Twitter link: https://twitter.com/rdataberlin/status/1268266145909551106
- GitHub R Markdown source: https://github.com/courtiol/Rcourses/tree/master/dplyr_1_0_0

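dplyr 1.0.0 summary

So what does the dplyr 1.0.0 update add, and which operations does it make more convenient? Let me go through them one by one.

What are the core functions in dplyr?

select(): column operation, picks the columns you need
rename(): renames columns
mutate(): creates new columns
filter(): row operation, keeps the rows that satisfy a condition
summarise(): computes summaries
arrange(): sorts rows
*_join(): operations between multiple tables (data sets)
relocate(): a more convenient way to change column positions
slice(): similar to head() but more powerful; together with its slice_*() helpers it can return specific rows, the rows with the largest or smallest values, or a random number or fraction of rows
across(): used inside summarise(), mutate() and similar verbs to simplify data processing; it replaces the earlier *_if(), *_at() and *_all() variants and lets you apply several functions to several columns at once
rowwise(): enables row-by-row analysis in R, for example computing a statistic over the columns of interest for every row
c_across(): usually paired with rowwise(); the row-wise counterpart of across()
...

Let's go through them one at a time.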

select()

By position:
- df %>% select(1, 5, 10)
- df %>% select(1:4)
By name:
- df %>% select(a, e, j)
- df %>% select(c(a, e, j))
- df %>% select(a:d)
By selection helper:
- df %>% select(starts_with("x")): columns whose names start with x
- df %>% select(ends_with("s")): columns whose names end with s
- df %>% select(num_range("x", 1:3)): the columns named x1, x2, x3
- df %>% select(contains("ijk")): columns whose names contain "ijk"
- df %>% select(matches("(.)\\1")): columns matched by a regular expression (here, names containing a doubled character)
- helpers such as contains() and matches() can also be combined with functions like str_c()
By data type:
- df %>% select(where(is.numeric))
- df %>% select(where(is.factor))
- df %>% select(where(~ is.numeric(.x) && mean(.x, na.rm = TRUE) > 1))
Combining several conditions with Boolean operators:
- df %>% select(!where(is.factor))
- df %>% select(where(is.numeric) & starts_with("x"))
- df %>% select(starts_with("a") | ends_with("z"))
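To make the selection styles above concrete, here is a minimal sketch on a made-up tibble. The data and column names (id, x1, x2, grp, notes) are assumptions for illustration only; they do not come from the original post.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df <- tibble(
  id    = 1:3,
  x1    = c(0.5, 1.5, 2.5),
  x2    = c(10, 20, 30),
  grp   = factor(c("a", "b", "a")),
  notes = c("p", "q", "r")
)

by_position <- df %>% select(1, 3)                    # 1st and 3rd columns
by_helper   <- df %>% select(num_range("x", 1:2))     # x1 and x2
by_type     <- df %>% select(where(is.numeric))       # factors are not numeric
combined    <- df %>% select(where(is.numeric) & !starts_with("x"))

names(by_position)
names(by_helper)
names(by_type)
names(combined)
```

The last call shows that where() predicates and name-based helpers can be mixed freely with &, | and !.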

rename()

Rename directly:
- df1 %>% rename(b = 2): b is the new column name, 2 refers to the second column
Rename with a function:
- df2 %>% rename_with(toupper)
- df2 %>% rename_with(toupper, !col1)
- df2 %>% rename_with(toupper, starts_with("x"))
- df2 %>% rename_with(toupper, where(is.numeric))
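The sketch below runs these rename() and rename_with() calls on a small made-up tibble; the column names col1, x_a and x_b are hypothetical and chosen only so that each selection matches something different.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df2 <- tibble(col1 = 1:2, x_a = c("u", "v"), x_b = c(0.1, 0.2))

renamed_one <- df2 %>% rename(first = 1)                        # rename by position
upper_all   <- df2 %>% rename_with(toupper)                     # every column
upper_x     <- df2 %>% rename_with(toupper, starts_with("x"))   # by name helper
upper_num   <- df2 %>% rename_with(toupper, where(is.numeric))  # by type

names(renamed_one)
names(upper_all)
names(upper_x)
names(upper_num)
```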

mutate()

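mutate() makes it easy to add columns, and a new column can be used to build further columns as soon as it has been created.
- df %>% mutate(new_col = col1 + col2, new_col1 = new_col/2)
The .keep argument
- .keep = "all": keep everything, the same behaviour as before dplyr 1.0.0
- .keep = "used": keep only the columns that were used to compute the new columns (plus the new columns)
- .keep = "unused": keep only the columns that were not used to compute the new columns (plus the new columns)
- .keep = "none": keep only the new columns, equivalent to transmute()
The .before argument places the new columns before a given column
The .after argument places the new columns after a given column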
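A small sketch of how .keep and .before change the result, assuming a made-up tibble with two numeric columns and one unrelated column (all names are hypothetical):

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df <- tibble(col1 = c(2, 4), col2 = c(6, 8), other = c("a", "b"))

used   <- df %>% mutate(new_col = col1 + col2, .keep = "used")
unused <- df %>% mutate(new_col = col1 + col2, .keep = "unused")
none   <- df %>% mutate(new_col = col1 + col2, .keep = "none")
placed <- df %>% mutate(new_col = col1 + col2, .before = col1)

names(used)    # inputs of the computation plus the new column
names(unused)  # everything that was not an input, plus the new column
names(none)    # only the new column
names(placed)  # new column moved in front of col1
```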

filter

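Rows can be filtered with Boolean conditions:

df %>% filter(col > 1 & col2 == "A")
df %>% filter(col1 == 1 | col1 == 2)
df %>% filter(col %in% c("A", "B"))
The between() function is a shortcut for range conditions: between(x, left, right) is equivalent to x >= left & x <= right.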
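As a quick check of these conditions, here is a sketch on a made-up five-row tibble (the data are an assumption for illustration). Note that a condition such as col1 == 1 & col1 == 2 can never be true, which is why the "either" case uses |.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df <- tibble(col1 = c(1, 2, 3, 4, 5), col2 = c("A", "B", "C", "A", "B"))

both    <- df %>% filter(col1 > 1 & col2 == "A")     # both conditions hold
either  <- df %>% filter(col1 == 1 | col1 == 2)      # at least one holds
in_set  <- df %>% filter(col2 %in% c("A", "B"))      # membership test
ranged  <- df %>% filter(between(col1, 2, 4))        # 2 <= col1 <= 4
nothing <- df %>% filter(col1 == 1 & col1 == 2)      # impossible, zero rows

both$col1
either$col1
in_set$col1
ranged$col1
nrow(nothing)
```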

summarise()

The summary verb. It is usually combined with group_by(), across(), statistical functions, and user-defined functions.
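A minimal sketch of that combination, assuming a made-up tibble with one grouping column g and two numeric columns v and w (hypothetical names): count the rows and take the mean of both numeric columns per group.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df <- tibble(g = c("a", "a", "b"), v = c(1, 3, 10), w = c(2, 4, 6))

out <- df %>%
  group_by(g) %>%
  summarise(
    n = n(),                    # rows per group
    across(c(v, w), mean),      # same summary for several columns
    .groups = "drop"            # return an ungrouped tibble
  )

out
```

Naming the columns explicitly in across(c(v, w), ...) keeps the freshly created n out of the selection.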

arrange

df %>% arrange(col1, col2): ascending order by default
df %>% arrange(desc(col1)): desc() sorts in descending order
df %>% arrange(col1 - col2): sort by a computed expression

*_join()

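inner_join(): inner join, keeps only observations whose keys match in both tables; by specifies the key columns the two tables share
left_join(): left join, keeps all observations in x
full_join(): full join, keeps all observations in x and y
right_join(): right join, keeps all observations in y
semi_join(x, y): keeps the observations in x that have a match in y
anti_join(x, y): drops the observations in x that have a match in y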
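The row counts make the differences between the joins easy to see. The sketch below assumes two made-up tables that share keys 2 and 3; all names and values are hypothetical.

```r
library(dplyr)

# Hypothetical toy tables, invented for this illustration.
x <- tibble(key = c(1, 2, 3), val_x = c("x1", "x2", "x3"))
y <- tibble(key = c(2, 3, 4), val_y = c("y2", "y3", "y4"))

inner <- inner_join(x, y, by = "key")  # keys present in both: 2, 3
left  <- left_join(x, y, by = "key")   # all of x: 1, 2, 3
right <- right_join(x, y, by = "key")  # all of y: 2, 3, 4
full  <- full_join(x, y, by = "key")   # union of keys: 1, 2, 3, 4
semi  <- semi_join(x, y, by = "key")   # rows of x with a match, x columns only
anti  <- anti_join(x, y, by = "key")   # rows of x without a match

sapply(list(inner = inner, left = left, right = right,
            full = full, semi = semi, anti = anti), nrow)
```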

relocate()

df3 %>% relocate(y, z): move columns y and z to the front
df3 %>% relocate(where(is.character)): move all character columns to the front
df3 %>% relocate(w, .after = y): move column w to just after y
df3 %>% relocate(w, .before = y): move column w to just before y
df3 %>% relocate(w, .after = last_col()): move column w to the end
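Here is a sketch of the resulting column orders, assuming a made-up one-row tibble with columns w, x, y, z where x and z are character (the data are hypothetical):

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df3 <- tibble(w = 1, x = "a", y = 2, z = "b")

front    <- df3 %>% relocate(y, z)
chr_1st  <- df3 %>% relocate(where(is.character))
after_y  <- df3 %>% relocate(w, .after = y)
before_y <- df3 %>% relocate(w, .before = y)
to_end   <- df3 %>% relocate(w, .after = last_col())

names(front)
names(chr_1st)
names(after_y)
names(before_y)
names(to_end)
```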

slice()

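top_n(), sample_n() and sample_frac() have been superseded by the new slice_*() helpers.

slice_head(): by default returns only the first row, or the first row of each group if the data are grouped
- df %>% slice_head(prop = 0.1)
- df %>% slice_head(n = 10)
slice_tail(): by default returns only the last row; other arguments as for slice_head()
slice_sample(): by default returns one randomly chosen row
slice_min(): the rows with the smallest values of a variable
slice_max(): the rows with the largest values of a variable
slice(): selects rows by position

The slice_*() helpers such as slice_head() and slice_sample() take the arguments n = and prop =: n is the number of rows, prop the fraction of rows. Used this way, slice_sample() corresponds to sample_n() and sample_frac().

top_n() is superseded by slice_min() and slice_max().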
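A short sketch of the slice_*() family on a made-up ten-row tibble; the id and score columns are assumptions for illustration. Only the size of the random sample is checked, since the sampled rows depend on the seed.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df <- tibble(id = 1:10, score = c(5, 3, 9, 1, 7, 2, 8, 6, 10, 4))

head3  <- df %>% slice_head(n = 3)         # first three rows
tail50 <- df %>% slice_tail(prop = 0.5)    # last half of the rows
low2   <- df %>% slice_min(score, n = 2)   # two smallest scores
high2  <- df %>% slice_max(score, n = 2)   # two largest scores

set.seed(1)
samp <- df %>% slice_sample(n = 4)         # four random rows

head3$id
tail50$id
low2$id
high2$id
nrow(samp)
```

slice_min() and slice_max() return their rows ordered by the ranking variable, which is why high2 lists the row with score 10 before the row with score 9.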

across

across(.cols = everything(), .fns = NULL, ..., .names = NULL)

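The first argument, .cols, selects the columns to operate on (as in select()); columns can be chosen by position, name, or data type.
The second argument, .fns, is the function to apply to each column; it can be a purrr-style formula such as ~ .x/2.

Why we should use across() more

across() makes it easy to apply several operations to columns at once.
across() reduces the number of functions dplyr has to provide, which makes dplyr easier to use and easier to understand.
across() unifies the functionality of the earlier _if, _at and other suffixed variants, so columns can be chosen by position, name, or data type.
across() does not need vars(). The _at() functions were the only place in dplyr where variable names had to be quoted manually.

Note: across() cannot be used with select() or rename(), because those verbs already use tidy-select syntax. To change column names with a function, use rename_with().

This is the most important function in this release. All the *_if(), *_at() and *_all() variants have been superseded by across(), which makes applying the same operation to many columns much more convenient.

How to convert code based on the _at, _if and _all suffixes to across()

Drop the _at, _if or _all suffix from the verb
Wrap the column selection in across()
- for the _if variants, wrap the predicate in where()
- for the _at() variants, just remove the vars() call
- for the _all() variants, use everything()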
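The three conversion rules can be checked side by side. The sketch below assumes a made-up tibble with two numeric columns and one character column (hypothetical names) and compares each superseded call with its across() counterpart; the superseded verbs still exist in dplyr, so both versions run.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df <- tibble(a = c(1, 4), b = c(9, 16), label = c("u", "v"))

# _if  -> where()
old_if <- df %>% mutate_if(is.numeric, sqrt)
new_if <- df %>% mutate(across(where(is.numeric), sqrt))

# _at  -> drop vars()
old_at <- df %>% mutate_at(vars(a, b), sqrt)
new_at <- df %>% mutate(across(c(a, b), sqrt))

# _all -> everything()
num_only <- df %>% select(a, b)
old_all  <- num_only %>% mutate_all(sqrt)
new_all  <- num_only %>% mutate(across(everything(), sqrt))

# Helper for this sketch only: compare two results as plain data frames.
same <- function(x, y) isTRUE(all.equal(as.data.frame(x), as.data.frame(y)))

c(if_ = same(old_if, new_if), at = same(old_at, new_at), all = same(old_all, new_all))
```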

Using across() with other functions

across() with mutate()

df %>% mutate_if(is.numeric, log)
df %>% mutate(across(where(is.numeric), log))

# Rescale a numeric vector to the range [0, 1]
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

df <- tibble(x = 1:4, y = rnorm(4))

df %>%
  mutate(across(where(is.numeric), rescale01))
## # A tibble: 4 x 2
##       x     y
##   <dbl> <dbl>
## 1 0     0    
## 2 0.333 0.291
## 3 0.667 0.207
## 4 1     1

across(where()) with summarise()

# For every character column, count the number of distinct values
starwars %>%
  summarise(across(where(is.character), ~ length(unique(.x))))

# For homeworlds with more than one character, take the mean of every numeric column
starwars %>%
  group_by(homeworld) %>%
  filter(n() > 1) %>%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))

across(everything()) replaces mutate_all()

across() with count()

starwars %>%
  count(across(contains("color")), sort = TRUE)

across() with distinct()

starwars %>%
  distinct(across(contains("color")))

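across() with filter()

# Keep the rows that have no missing value (NA) in any column
# (dplyr 1.0.4 and later provide if_all() and if_any() for this purpose)
starwars %>%
  filter(across(everything(), ~ !is.na(.x)))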

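Applying several functions to each column with across()

# A named list of functions; the names are used in the output column names
min_max <- list(
  min = ~ min(.x, na.rm = TRUE),
  max = ~ max(.x, na.rm = TRUE)
)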

starwars %>%
  summarise(across(where(is.numeric), min_max))


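# How can we control the names of the output columns?
# The .names argument takes a glue specification:
# {fn} stands for the function name and {col} for the column being processed
# (newer dplyr versions document these as {.fn} and {.col})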
starwars %>%
  summarise(across(where(is.numeric), min_max, .names = "{fn}.{col}"))
## # A tibble: 1 x 6
##   min.height max.height min.mass max.mass min.birth_year max.birth_year
##        <int>      <int>    <dbl>    <dbl>          <dbl>          <dbl>
## 1         66        264       15     1358              8            896

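# To keep the results of the same function next to each other, the functions have to go into separate across() calls.
# The result looks odd, though: the second across(where(is.numeric)) also picks up the
# min.* columns that the first call has just created, hence the extra max.min.* columns.
starwars %>%
  summarise(across(where(is.numeric), ~ min(.x, na.rm = TRUE), .names = "min.{col}"),
            across(where(is.numeric), ~ max(.x, na.rm = TRUE), .names = "max.{col}"))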
## # A tibble: 1 x 9
##   min.height min.mass min.birth_year max.height max.mass max.birth_year
##        <int>    <dbl>          <dbl>      <int>    <dbl>          <dbl>
## 1         66       15              8        264     1358            896
## # ... with 3 more variables: max.min.height <int>, max.min.mass <dbl>,
## #   max.min.birth_year <dbl>
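One way around the extra max.min.* columns is to exclude the freshly created columns from the second selection. The sketch below is a hedged illustration on a made-up two-column tibble rather than starwars; it uses the {.col} placeholder spelling from current dplyr documentation, whereas the post uses the 1.0.0-era {col}.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration.
df <- tibble(height = c(150, 180), mass = c(50, 80))

# Second across() also sees min.height and min.mass, so it produces
# max.min.height and max.min.mass as well.
weird <- df %>%
  summarise(
    across(where(is.numeric), min, .names = "min.{.col}"),
    across(where(is.numeric), max, .names = "max.{.col}")
  )

# Excluding the min.* columns from the second selection avoids that.
fixed <- df %>%
  summarise(
    across(where(is.numeric), min, .names = "min.{.col}"),
    across(where(is.numeric) & !starts_with("min."), max, .names = "max.{.col}")
  )

ncol(weird)
names(fixed)
```

Naming the columns explicitly, for example across(c(height, mass), ...), in both calls is another option with the same effect.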

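In short, this is a very important function, but there are a few situations to watch out for:

When across() is used inside summarise(), it also sees the columns computed earlier in the same call, such as the result of n(), and will overwrite them.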

df <- data.frame(x = c(1, 2, 3), y = c(1, 4, 9))
df %>%
  summarise(n = n(), across(where(is.numeric), sd))
##    n x        y
## 1 NA 1 4.041452

# Here n comes out as NA: n is itself numeric, so the across() that follows computes its sd, and the sd of the single value 3 is NA. To avoid this, put the n() call after across()
df %>%
  summarise(across(where(is.numeric), sd), n = n())
##   x        y n
## 1 1 4.041452 3

# Another option is to exclude n from the selection inside across() with !n
df %>%
  summarise(n = n(), across(where(is.numeric) & !n, sd))
##   n x        y
## 1 3 1 4.041452

rowwise()

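dplyr usually operates on columns, and row-wise processing has been comparatively awkward. rowwise() lets you process data row by row and is often used together with c_across().

This section covers three common cases:

Row-wise computations (for example, the mean of x, y and z)
Calling the same function with different arguments
Working with list-columns

These problems can of course be solved with for loops and the like, but a pipeline makes them more convenient. There is a classic quote on this:

Of course, someone has to write loops. It doesn’t have to be you. — Jenny Bryan

rowwise() groups the data by row. Like group_by(), it does not change the contents of the data; it only sets up the grouping: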

df <- tibble(x = 1:2, y = 3:4, z = 5:6)
df %>% rowwise()
# Note the extra "Rowwise" marker in the printed output below
## # A tibble: 2 x 3
## # Rowwise: 
##       x     y     z
##   <int> <int> <int>
## 1     1     3     5
## 2     2     4     6

# Without rowwise(), this is the mean of all the values in x, y and z
df %>% mutate(m = mean(c(x, y, z)))
## # A tibble: 2 x 4
##       x     y     z     m
##   <int> <int> <int> <dbl>
## 1     1     3     5   3.5
## 2     2     4     6   3.5

# The mean of each column
df %>% mutate(across(everything(), ~ mean(.x, na.rm = TRUE)))
## # A tibble: 2 x 3
##       x     y     z
##   <dbl> <dbl> <dbl>
## 1   1.5   3.5   5.5
## 2   1.5   3.5   5.5

# With rowwise(), this is the mean of each row
df %>% rowwise() %>% mutate(m = mean(c(x, y, z)))
## # A tibble: 2 x 4
## # Rowwise: 
##       x     y     z     m
##   <int> <int> <int> <dbl>
## 1     1     3     5     3
## 2     2     4     6     4

rowwise() with summarise()

df <- tibble(name = c("Mara", "Hadley"), x = 1:2, y = 3:4, z = 5:6)

# The result contains only the summary values
df %>% 
  rowwise() %>% 
  summarise(m = mean(c(x, y, z)))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 2 x 1
##       m
##   <dbl>
## 1     3
## 2     4


# To keep an identifier for each row in the summarise() output, pass it to rowwise(): `rowwise(name)` preserves the `name` column
df %>% 
  rowwise(name) %>% 
  summarise(m = mean(c(x, y, z)))
## `summarise()` regrouping output by 'name' (override with `.groups` argument)
## # A tibble: 2 x 2
## # Groups:   name [2]
##   name       m
##   <chr>  <dbl>
## 1 Mara       3
## 2 Hadley     4


df <- tibble(id = 1:6, w = 10:15, x = 20:25, y = 30:35, z = 40:45)
df
## # A tibble: 6 x 5
##      id     w     x     y     z
##   <int> <int> <int> <int> <int>
## 1     1    10    20    30    40
## 2     2    11    21    31    41
## 3     3    12    22    32    42
## 4     4    13    23    33    43
## 5     5    14    24    34    44
## 6     6    15    25    35    45
# Use `rowwise()` to group the data by row, keeping id as an identifier
rf <- df %>% rowwise(id)

rf %>% mutate(total = sum(c(w, x, y, z)))
## # A tibble: 6 x 6
## # Rowwise:  id
##      id     w     x     y     z total
##   <int> <int> <int> <int> <int> <int>
## 1     1    10    20    30    40   100
## 2     2    11    21    31    41   104
## 3     3    12    22    32    42   108
## 4     4    13    23    33    43   112
## 5     5    14    24    34    44   116
## 6     6    15    25    35    45   120
rf %>% summarise(total = sum(c(w, x, y, z)))
## `summarise()` regrouping output by 'id' (override with `.groups` argument)
## # A tibble: 6 x 2
## # Groups:   id [6]
##      id total
##   <int> <int>
## 1     1   100
## 2     2   104
## 3     3   108
## 4     4   112
## 5     5   116
## 6     6   120

c_across

Usually paired with rowwise(); it is the row-wise counterpart of across().

rf <- tibble(id = 1:6, w = 10:15, x = 20:25, y = 30:35, z = 40:45) %>% rowwise(id)

rf %>% mutate(total = sum(c_across(w:z)))
## # A tibble: 6 x 6
## # Rowwise:  id
##      id     w     x     y     z total
##   <int> <int> <int> <int> <int> <int>
## 1     1    10    20    30    40   100
## 2     2    11    21    31    41   104
## 3     3    12    22    32    42   108
## 4     4    13    23    33    43   112
## 5     5    14    24    34    44   116
## 6     6    15    25    35    45   120
rf %>% mutate(total = sum(c_across(where(is.numeric))))
## # A tibble: 6 x 6
## # Rowwise:  id
##      id     w     x     y     z total
##   <int> <int> <int> <int> <int> <int>
## 1     1    10    20    30    40   100
## 2     2    11    21    31    41   104
## 3     3    12    22    32    42   108
## 4     4    13    23    33    43   112
## 5     5    14    24    34    44   116
## 6     6    15    25    35    45   120

Combining rowwise(), c_across() and across()

ungroup() removes the grouping; here it removes the row-wise grouping.

rf %>% 
  mutate(total = sum(c_across(w:z))) %>% 
  ungroup() %>% 
  mutate(across(w:z, ~ . / total))
## # A tibble: 6 x 6
##      id     w     x     y     z total
##   <int> <dbl> <dbl> <dbl> <dbl> <int>
## 1     1 0.1   0.2   0.3   0.4     100
## 2     2 0.106 0.202 0.298 0.394   104
## 3     3 0.111 0.204 0.296 0.389   108
## 4     4 0.116 0.205 0.295 0.384   112
## 5     5 0.121 0.207 0.293 0.379   116
## 6     6 0.125 0.208 0.292 0.375   120

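Row-wise summary functions: rowSums() and rowMeans()

The built-in row-wise functions are faster because they operate on the data frame as a whole; they do not split it into rows, compute on each row, and then join the results back together. Note that df here is not grouped with rowwise(id), so where(is.numeric) also selects the id column and the results below include it.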

df %>% mutate(total = rowSums(across(where(is.numeric))))
## # A tibble: 6 x 6
##      id     w     x     y     z total
##   <int> <int> <int> <int> <int> <dbl>
## 1     1    10    20    30    40   101
## 2     2    11    21    31    41   106
## 3     3    12    22    32    42   111
## 4     4    13    23    33    43   116
## 5     5    14    24    34    44   121
## 6     6    15    25    35    45   126

df %>% mutate(mean = rowMeans(across(where(is.numeric))))
## # A tibble: 6 x 6
##      id     w     x     y     z  mean
##   <int> <int> <int> <int> <int> <dbl>
## 1     1    10    20    30    40  20.2
## 2     2    11    21    31    41  21.2
## 3     3    12    22    32    42  22.2
## 4     4    13    23    33    43  23.2
## 5     5    14    24    34    44  24.2
## 6     6    15    25    35    45  25.2
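Since id is numeric, where(is.numeric) folds it into the row sums above (101 rather than 100 for the first row). A minimal sketch of the difference, assuming the same df as in this section and selecting w:z explicitly to leave id out:

```r
library(dplyr)

# Same data as in the section above.
df <- tibble(id = 1:6, w = 10:15, x = 20:25, y = 30:35, z = 40:45)

with_id    <- df %>% mutate(total = rowSums(across(where(is.numeric))))
without_id <- df %>% mutate(total = rowSums(across(w:z)))

with_id$total      # id is included in each sum
without_id$total   # only w, x, y and z are summed
```

The second version reproduces the totals obtained earlier with rowwise(id) and c_across(w:z).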

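Repeated function calls: passing arguments row by row

rowwise() is not limited to functions that return a vector of length 1; as long as the result is wrapped in a list, it works with any function. This means rowwise() and mutate() provide an elegant way to call a function many times with different arguments and to store the output next to the input.

Always wrap the call in list(), i.e. write list(runif(n, min, max)) rather than runif(n, min, max); otherwise mutate() fails for rows where the result is not of length 1.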

df <- tribble(
  ~ n, ~ min, ~ max,
    1,     0,     1,
    2,    10,   100,
    3,   100,  1000,
)

df %>% 
  rowwise() %>% 
  mutate(data = list(runif(n, min, max)))
## # A tibble: 3 x 4
## # Rowwise: 
##       n   min   max data     
##   <dbl> <dbl> <dbl> <list>   
## 1     1     0     1 <dbl [1]>
## 2     2    10   100 <dbl [2]>
## 3     3   100  1000 <dbl [3]>

All pairwise combinations: expand.grid() (or tidyr::expand_grid())

# This gives 3 * 3 = 9 combinations
df <- expand.grid(mean = c(-1, 0, 1), sd = c(1, 10, 100))

df %>% 
  rowwise() %>% 
  mutate(data = list(rnorm(10, mean, sd)))

Varying the function as well: combine with do.call()

df <- tribble(
   ~rng,     ~params,
   "runif",  list(n = 10), 
   "rnorm",  list(n = 20),
   "rpois",  list(n = 10, lambda = 5),
) %>%
  rowwise()

df %>% 
  mutate(data = list(do.call(rng, params)))
## # A tibble: 3 x 3
## # Rowwise: 
##   rng   params           data      
##   <chr> <list>           <list>    
## 1 runif <named list [1]> <dbl [10]>
## 2 rnorm <named list [1]> <dbl [20]>
## 3 rpois <named list [2]> <int [10]>

Most importantly, it can be used for modelling

nest_by() groups the data and stores each group's rows in a list-column

by_cyl <- mtcars %>% nest_by(cyl)
by_cyl
## # A tibble: 3 x 2
## # Rowwise:  cyl
##     cyl                data
##   <dbl> <list<tbl_df[,10]>>
## 1     4           [11 x 10]
## 2     6            [7 x 10]
## 3     8           [14 x 10]

Fitting one linear model per row

mods <- by_cyl %>% mutate(mod = list(lm(mpg ~ wt, data = data)))
mods
## # A tibble: 3 x 3
## # Rowwise:  cyl
##     cyl                data mod   
##   <dbl> <list<tbl_df[,10]>> <list>
## 1     4           [11 x 10] <lm>  
## 2     6            [7 x 10] <lm>  
## 3     8           [14 x 10] <lm>
mods <- mods %>% mutate(pred = list(predict(mod, data)))
mods
## # A tibble: 3 x 4
## # Rowwise:  cyl
##     cyl                data mod    pred      
##   <dbl> <list<tbl_df[,10]>> <list> <list>    
## 1     4           [11 x 10] <lm>   <dbl [11]>
## 2     6            [7 x 10] <lm>   <dbl [7]> 
## 3     8           [14 x 10] <lm>   <dbl [14]>
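Once the models sit in a list-column, per-group summaries can be pulled out with summarise(). The sketch below rebuilds the same mods object and extracts a few quantities; the output column names n_obs, slope and rsq are my own choices for illustration.

```r
library(dplyr)

# One row per cyl, with the group's data and a fitted model in list-columns.
mods <- mtcars %>%
  nest_by(cyl) %>%
  mutate(mod = list(lm(mpg ~ wt, data = data)))

# Because mods is row-wise, data and mod refer to a single group's
# tibble and model inside summarise().
fit_summary <- mods %>%
  summarise(
    n_obs = nrow(data),
    slope = coef(mod)[["wt"]],
    rsq   = summary(mod)$r.squared
  ) %>%
  ungroup()

fit_summary
```

Because cyl was passed to nest_by(), it is carried along as an identifier in the summarised output.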

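Introduction to dplyr

This release also comes with substantially updated reference documentation for dplyr, organised into the topics below so that it can be studied systematically (most of the examples in this post come from there).

The dplyr introduction is the best place to learn the package's main features, bar none. The documentation covers:

Base R operations and their dplyr equivalents

Column-wise operations
Compatibility
dplyr (the introduction itself)
Grouped data
Common programming patterns with dplyr
Row-wise operations
Two-table operations: the *_join() family (my renderings of these titles may be imperfect)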
