(1) Select dep_time, dep_delay, arr_time, and arr_delay from the flights dataset, brainstorming as many ways as possible.
library(dplyr)
library(nycflights13)
# Method 1: bare column names with select()
head(select(flights, dep_time, dep_delay, arr_time, arr_delay))
# Method 2: base R subsetting by column name
head(flights[c("dep_time", "dep_delay", "arr_time", "arr_delay")])
# Method 3: select() by column position
head(select(flights, 4, 6, 7, 9))
# Method 4: base R subsetting by column position
head(flights[c(4, 6, 7, 9)])
# Method 5: all_of() with a character vector (errors if any name is missing)
head(select(flights, all_of(c("dep_time", "dep_delay", "arr_time", "arr_delay"))))
# Method 6: any_of() with a character vector (silently skips missing names)
head(select(flights, any_of(c("dep_time", "dep_delay", "arr_time", "arr_delay"))))
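Beyond the methods above, the select helpers can also pick these four columns by pattern. The following is a minimal sketch, assuming dplyr and nycflights13 are installed; the names by_prefix and by_regex are illustrative only.

```r
library(dplyr)
library(nycflights13)

# Columns whose names start with "dep_" or "arr_".
# sched_dep_time and sched_arr_time start with "sched_", so they are not matched.
by_prefix <- select(flights, starts_with("dep_"), starts_with("arr_"))

# Columns whose names match a regular expression.
by_regex <- select(flights, matches("^(dep|arr)_(time|delay)$"))

names(by_prefix)
names(by_regex)
```

Pattern-based selection is shorter to type, but it depends on the naming scheme of the columns, so listing the names explicitly is the safer choice when the data may change.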
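(2) What happens if you include the name of a variable multiple times in a select() call?
Duplicates are ignored: the variable appears only once in the result, at the position of its first mention.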
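The behavior can be checked directly. This is a small sketch, assuming dplyr and nycflights13 are installed; the names repeated and reordered are illustrative only.

```r
library(dplyr)
library(nycflights13)

# year is mentioned three times but appears only once in the result.
repeated <- select(flights, year, month, day, year, year)
names(repeated)

# Because duplicates are ignored, everything() can be combined with an
# explicitly named column to move that column to the front.
reordered <- select(flights, arr_delay, everything())
names(reordered)[1:3]
```

The second call is the practical use of this rule: arr_delay is named first and is also matched by everything(), yet it is kept only once.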
(3) What does the one_of() function do? Why might it be helpful in conjunction with this vector?
vars <- c("years", "month", "day", "dep_delay", "arr_delay")
select(flights, one_of(vars))
# A tibble: 336,776 x 4
   month   day dep_delay arr_delay
   <int> <int>     <dbl>     <dbl>
 1     1     1         2        11
 2     1     1         4        20
 3     1     1         2        33
 4     1     1        -1       -18
 5     1     1        -6       -25
 6     1     1        -4        12
 7     1     1        -5        19
 8     1     1        -3       -14
 9     1     1        -3        -8
10     1     1        -2         8
# ... with 336,766 more rows
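Warning message:
Unknown columns: `years`
one_of() selects the columns named in a character vector, which is useful when the names are stored in a variable such as vars. A name that matches no column (here, years) produces only a warning, not an error.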
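In recent versions of dplyr, one_of() is superseded by any_of() and all_of(), which make the handling of missing names explicit. The following is a sketch of the difference, assuming dplyr and nycflights13 are installed; the names lenient and strict_failed are illustrative only.

```r
library(dplyr)
library(nycflights13)

vars <- c("years", "month", "day", "dep_delay", "arr_delay")

# any_of() silently skips names that do not exist ("years" here).
lenient <- select(flights, any_of(vars))
names(lenient)

# all_of() raises an error if any name is missing; capture it with tryCatch().
strict_failed <- inherits(
  tryCatch(select(flights, all_of(vars)), error = function(e) e),
  "error"
)
strict_failed
```

any_of() suits cases where a missing column is acceptable, while all_of() is the better choice when a misspelled name such as years should stop the script.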
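(4) Does the result of running the following code surprise you? How do the select helpers deal with case by default? How can you change that default?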
select(flights, contains("TIME"))
# A tibble: 336,776 x 6
   dep_time sched_dep_time arr_time sched_arr_time air_time time_hour
      <int>          <int>    <int>          <int>    <dbl> <dttm>
 1      517            515      830            819      227 2013-01-01 05:00:00
 2      533            529      850            830      227 2013-01-01 05:00:00
 3      542            540      923            850      160 2013-01-01 05:00:00
 4      544            545     1004           1022      183 2013-01-01 05:00:00
 5      554            600      812            837      116 2013-01-01 06:00:00
 6      554            558      740            728      150 2013-01-01 05:00:00
 7      555            600      913            854      158 2013-01-01 06:00:00
 8      557            600      709            723       53 2013-01-01 06:00:00
 9      557            600      838            846      140 2013-01-01 06:00:00
10      558            600      753            745      138 2013-01-01 06:00:00
# ... with 336,766 more rows
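By default, contains() ignores case, so "TIME" matches every column whose name contains "time". The other string-matching helpers (starts_with(), ends_with(), matches()) behave the same way. To make matching case-sensitive, set the ignore.case argument to FALSE.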
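The effect of ignore.case can be seen by comparing the two calls. This is a minimal sketch, assuming dplyr and nycflights13 are installed; the names insensitive and sensitive are illustrative only.

```r
library(dplyr)
library(nycflights13)

# Default: case is ignored, so "TIME" matches the lowercase column names.
insensitive <- select(flights, contains("TIME"))

# Case-sensitive matching: no column name contains uppercase "TIME".
sensitive <- select(flights, contains("TIME", ignore.case = FALSE))

ncol(insensitive)
ncol(sensitive)
```

With case-sensitive matching the result still has all rows of flights but no columns, since every column name in the dataset is lowercase.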
Exercise reference:
http://www.360doc.cn/mip/931355281.html