[R] Pipe operations with the magrittr package 《R for data

Author: 半为花间酒 | Published: 2020-04-28 09:56

Study notes on Chapter 18, "Pipes", of R for Data Science
Reference: R for Data Science

library(magrittr)

Piping alternatives

- Intermediate steps

R will share columns across data frames, where possible.

diamonds <- ggplot2::diamonds
diamonds2 <- diamonds %>% 
  dplyr::mutate(price_per_carat = price / carat)

pryr::object_size(diamonds)
#> Registered S3 method overwritten by 'pryr':
#>   method      from
#>   print.bytes Rcpp
#> 3.46 MB
pryr::object_size(diamonds2)
#> 3.89 MB
pryr::object_size(diamonds, diamonds2)
#> 3.89 MB

# Once one of the columns is modified, that column is no longer shared between the data frames
diamonds$carat[1] <- NA
pryr::object_size(diamonds)
#> 3.46 MB
pryr::object_size(diamonds2)
#> 3.89 MB
pryr::object_size(diamonds, diamonds2)
#> 4.32 MB

pryr::object_size() reports the memory occupied by the objects it is given, and it accepts several objects at once.
object.size() accepts only a single object.
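
The sharing described above can be checked on a smaller scale. The following is a minimal sketch, assuming the pryr package is installed; the names x, a, and b are made up for illustration. Two lists hold the same vector, so measuring them together should come out smaller than the sum of their separate sizes.

```r
# Assumption: pryr is installed. x, a, and b are illustrative names only.
x <- runif(1e5)

a <- list(first = x)                  # holds a reference to x
b <- list(first = x, second = x + 1)  # shares x, adds one new vector

size_a    <- as.numeric(pryr::object_size(a))
size_b    <- as.numeric(pryr::object_size(b))
size_both <- as.numeric(pryr::object_size(a, b))

# The shared vector is counted once when both objects are measured together
size_both < size_a + size_b
```

Calling object.size() on each list separately cannot reveal this, because it only ever sees one object at a time.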

- Function composition

bop(
  scoop(
    hop(foo_foo, through = forest),
    up = field_mice
  ), 
  on = head
)

The Dagwood sandwich problem:
The disadvantage is that you have to read from inside-out, from right-to-left, and that the arguments end up spread far apart.

- Use the pipe

foo_foo %>%
  hop(through = forest) %>%
  scoop(up = field_mice) %>%
  bop(on = head)

# Essentially, the pipe above is equivalent to the following
my_pipe <- function(.) {
  . <- hop(., through = forest)
  . <- scoop(., up = field_mice)
  bop(., on = head)
}
my_pipe(foo_foo)
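
The foo_foo functions above are placeholders, so that code cannot be run as is. As a small runnable sketch of the same idea, with sqrt() and sum() standing in for hop() and bop() (my substitution, not part of the original notes), the nested call, the pipe, and the hand-written wrapper all give the same result:

```r
library(magrittr)

# Function composition: read from the inside out
nested <- sum(sqrt(c(1, 4, 9)))

# The same steps with the pipe: read from left to right
piped <- c(1, 4, 9) %>%
  sqrt() %>%
  sum()

# A hand-written wrapper in the spirit of my_pipe() above
manual_pipe <- function(.) {
  . <- sqrt(.)
  sum(.)
}
manual <- manual_pipe(c(1, 4, 9))

identical(nested, piped) && identical(piped, manual)
#> [1] TRUE
```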

Two cases where the pipe does not work

(1) Functions that use the current environment, such as assign(), load(), and get()

assign("x", 10); x
# [1] 10

"x" %>% assign(100); x
# [1] 10

env <- environment()
"x" %>% assign(100, envir = env); x
# [1] 100

(2) Functions that rely on lazy evaluation, such as most functions that capture conditions:
tryCatch(), try(), suppressMessages(), suppressWarnings()

tryCatch(stop("!"), error = function(e) "An error")
#> [1] "An error"

stop("!") %>% 
  tryCatch(error = function(e) "An error")
#> Error in eval(lhs, parent, parent): !
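
The error above arises because the left-hand side is evaluated before tryCatch() gets a chance to set up its handler. Note that magrittr 2.0 and later evaluate the pipe lazily, so the exact outcome may depend on the installed version. One way to sidestep the issue, sketched here with two hypothetical helpers (risky and run_safely are my own names), is to pipe a function rather than its result, so that the failing call happens inside tryCatch():

```r
library(magrittr)

# Hypothetical helpers for illustration only
risky <- function() stop("!")

run_safely <- function(f) {
  # f is only called here, inside tryCatch(), so the handler is already active
  tryCatch(f(), error = function(e) "An error")
}

result <- risky %>% run_safely()
result
#> [1] "An error"
```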

When not to use the pipe

Knowing when not to use the pipe is just as important.

Pipes are most useful for rewriting a fairly short linear sequence of operations. Reach for another tool when:

- Your pipes are longer than (say) ten steps. In that case, create intermediate objects with meaningful names. That will make debugging easier, because you can more easily check the intermediate results, and it makes it easier to understand your code, because the variable names can help communicate intent.
- You have multiple inputs or outputs. If there isn’t one primary object being transformed, but two or more objects being combined together, don’t use the pipe.
- You are starting to think about a directed graph with a complex dependency structure. Pipes are fundamentally linear and expressing complex relationships with them will typically yield confusing code.
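
As a rough illustration of the "multiple inputs" case, the sketch below uses the built-in mtcars data; the split at wt > 3 and the object names are arbitrary choices of mine. Each short pipe produces a named intermediate object, and the step that combines the two is written as an ordinary call rather than forced into a pipe.

```r
library(magrittr)

# Two short, linear pipes, each stored under a meaningful name
heavy <- mtcars %>% subset(wt > 3)
light <- mtcars %>% subset(wt <= 3)

# Combining two objects: no single primary input, so no pipe here
comparison <- data.frame(
  group = c("heavy", "light"),
  mean_mpg = c(mean(heavy$mpg), mean(light$mpg))
)
comparison
```

Because heavy and light exist as objects, each can be inspected on its own when something looks wrong.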

Other tools from magrittr

When working with more complex pipes, it’s sometimes useful to call a function for its side-effects. Maybe you want to print out the current object, or plot it, or save it to disk. Many times, such functions don’t return anything, effectively terminating the pipe.

%T>%
%T>% works like %>% except that it returns the left-hand side instead of the right-hand side. It’s called “tee” because it’s like a literal T-shaped pipe.

library(magrittr)

rnorm(100) %>%
  matrix(ncol = 2) %>%
  plot() %>%
  str()
#  NULL

rnorm(100) %>%
  matrix(ncol = 2) %T>%
  plot() %>% 
  str()
# num [1:50, 1:2] -0.351 -1.751 0.666 0.516 -0.686 ...
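
The same pattern works for any side-effect step, not only plot(). A minimal sketch with print() as the side effect (the input vector is arbitrary): the tee operator prints the intermediate value and then passes the value from before print() on to sum().

```r
library(magrittr)

total <- c(1, 4, 9) %>%
  sqrt() %T>%
  print() %>%   # side effect only; its return value is discarded
  sum()
#> [1] 1 2 3

total
#> [1] 6
```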

%$%
It “explodes” out the variables in a data frame so that you can refer to them explicitly.
(This is convenient for functions that take individual vectors rather than a data frame.)

mtcars %$%
  cor(disp, mpg)
#> [1] -0.8475514

# with() gives the same explicit access to the variables
with(mtcars, cor(disp, mpg))
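
The left-hand side does not have to be a data frame. A small sketch with a plain named list (dims, width, and height are made-up names) shows the elements being exposed in the same way:

```r
library(magrittr)

# Illustrative list; the names are not from the original notes
dims <- list(width = 1:3, height = 4:6)

area_total <- dims %$% sum(width * height)
area_total
#> [1] 32
```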

%<>%
Assigns the result back to the left-hand side, so no explicit reassignment is needed.

mtcars <- mtcars %>% 
  transform(cyl = cyl * 2)

mtcars %<>% transform(cyl = cyl * 2)
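
A smaller sketch of the same operator on a plain vector (values is an arbitrary name) makes the in-place update easy to verify:

```r
library(magrittr)

values <- c(9, 16, 25)
values %<>% sqrt()   # same as: values <- values %>% sqrt()
values
#> [1] 3 4 5
```

The assignment is easy to miss when reading code, so the explicit form may still be the clearer choice in longer scripts.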

Article link: https://www.haomeiwen.com/subject/njsuwhtx.html