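Study notes from https://www.datanovia.com/en/lessons/select-data-frame-columns-in-r/.
The topic is extracting a subset of a data frame by column. The main functions are pull(), select(), and select_if(), together with helper functions that locate columns by name: starts_with(), ends_with(), contains(), matches(), and one_of(). Of these, select() is the one used most often.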
An earlier related note on data transformation:
reshape2 package: converting between long and wide data with melt and dcast
Required packages
library(tidyverse)
Demo dataset
my_data <- as_tibble(iris)
my_data
![](https://img.haomeiwen.com/i11316862/5df7071de9e813d8.png)
Extract column values as a vector
my_data %>% pull(Species)
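As a quick check of what pull() gives back, here is a minimal sketch that assumes only dplyr and the built-in iris data. The variable names `species` and `last_col` are mine, not from the original lesson.

```r
library(dplyr)

my_data <- as_tibble(iris)

# pull() returns a plain vector (here a factor), not a one-column tibble
species <- my_data %>% pull(Species)
is.factor(species)
length(species)

# pull() also accepts a position; a negative position counts from the right,
# so -1 refers to the last column, which is Species in iris
last_col <- my_data %>% pull(-1)
identical(species, last_col)
```

Use pull() when the next step needs a vector (for example table() or mean()), and select() when the result should stay a data frame.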
Extract columns as a data table
1. Select columns by position
my_data %>% select(1:3)
![](https://img.haomeiwen.com/i11316862/532a7d149e3d4b9f.jpg)
my_data %>% select(1,3)
![](https://img.haomeiwen.com/i11316862/627d30e0c3f1ebbe.jpg)
2. Select columns by name
my_data %>% select(Sepal.Length, Petal.Length)
![](https://img.haomeiwen.com/i11316862/69e4754144d1d0d2.jpg)
my_data %>% select(Sepal.Length:Petal.Length)
![](https://img.haomeiwen.com/i11316862/94c4b8fddae8b25d.jpg)
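Since the results above are only shown as screenshots, it can help to compare the selections directly through their column names. The following is a small sketch assuming dplyr and iris; the object names are illustrative.

```r
library(dplyr)

my_data <- as_tibble(iris)

# Selecting by position and by a name range gives the same columns here,
# because Sepal.Length and Petal.Length are columns 1 and 3
by_position <- my_data %>% select(1:3)
by_range <- my_data %>% select(Sepal.Length:Petal.Length)
names(by_position)
identical(names(by_position), names(by_range))

# The output follows the order of the arguments, not the order in the data
reordered <- my_data %>% select(Petal.Length, Sepal.Length)
names(reordered)
```

Selecting by name is usually the safer choice in scripts, because positions shift silently when columns are added or removed upstream.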
There are several special functions that can be used inside select(): starts_with(), ends_with(), contains(), matches(), one_of(), etc.
First, look at the column names of this dataset:
> colnames(my_data)
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
my_data %>% select(starts_with("Petal"))
![](https://img.haomeiwen.com/i11316862/311bfebf4869a681.jpg)
my_data %>% select(ends_with("Width"))
![](https://img.haomeiwen.com/i11316862/2959b833d915bd9e.jpg)
my_data %>% select(contains("etal"))
![](https://img.haomeiwen.com/i11316862/05ba9ea6df008797.jpg)
my_data %>% select(matches(".t."))  # "." matches any single character, so ".t." needs a "t" with one character on each side
![](https://img.haomeiwen.com/i11316862/3e6bcb8034d13d97.jpg)
my_data %>% select(one_of(c("Sepal.Length", "Petal.Length")))
![](https://img.haomeiwen.com/i11316862/de1df923e6594b53.jpg)
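In current versions of dplyr/tidyselect (1.0.0 and later, as far as I know), one_of() is marked as superseded in favour of all_of() and any_of(). The sketch below shows the difference between the two and checks what matches(".t.") actually selects; the vector `wanted` and the column name "Not.A.Column" are made up for illustration.

```r
library(dplyr)

my_data <- as_tibble(iris)

wanted <- c("Sepal.Length", "Petal.Length")

# all_of() is strict: it raises an error if any listed name is missing
strict <- my_data %>% select(all_of(wanted))
names(strict)

# any_of() is lenient: names that do not exist are skipped
# ("Not.A.Column" is a hypothetical name that is not in iris)
lenient <- my_data %>% select(any_of(c(wanted, "Not.A.Column")))
names(lenient)

# matches() takes a regular expression; ".t." hits every column name
# that has a "t" in the middle, which excludes only Species
regex_hit <- my_data %>% select(matches(".t."))
names(regex_hit)
```

all_of() is a good default when a missing column signals a bug, and any_of() when the set of columns is allowed to vary between inputs.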
Select columns based on a condition
my_data %>% select_if(is.numeric)
![](https://img.haomeiwen.com/i11316862/d5e2a6a300b60a44.jpg)
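select_if() still works, but in dplyr 1.0.0 and later the same result can be written with select(where(...)), which is the form the newer documentation recommends. A minimal sketch, assuming dplyr and iris; the threshold of 5 in the custom predicate is arbitrary and only there to show the idea.

```r
library(dplyr)

my_data <- as_tibble(iris)

# Both forms keep the four numeric columns and drop the factor Species
numeric_old <- my_data %>% select_if(is.numeric)
numeric_new <- my_data %>% select(where(is.numeric))
identical(names(numeric_old), names(numeric_new))

# where() accepts any function that returns a single TRUE or FALSE per column;
# this one keeps numeric columns whose maximum exceeds 5
large_max <- my_data %>%
  select(where(function(x) is.numeric(x) && max(x) > 5))
names(large_max)
```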
Remove columns
1. Drop columns by name
my_data %>% select(-Sepal.Length, -Petal.Length)
![](https://img.haomeiwen.com/i11316862/252b389e50a9c0ac.jpg)
my_data %>% select(-(Sepal.Length:Petal.Length))
my_data %>% select(-starts_with("Petal"))
![](https://img.haomeiwen.com/i11316862/5c6aa2ab618662e6.jpg)
2. Drop columns by position
> colnames(my_data)
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
my_data %>% select(-1)
![](https://img.haomeiwen.com/i11316862/388b34122be420ff.jpg)
my_data %>% select(-(1:3))
![](https://img.haomeiwen.com/i11316862/c05463683d7ebfaa.jpg)
my_data %>% select(-1, -3)
![](https://img.haomeiwen.com/i11316862/9fc05e86a3354d51.jpg)
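To confirm that dropping by name and dropping by position agree, the column names of the results can be compared directly. This is a small sketch assuming dplyr and iris; the object names are illustrative.

```r
library(dplyr)

my_data <- as_tibble(iris)

# Dropping a name range and dropping the matching positions leave the same columns
dropped_by_name <- my_data %>% select(-(Sepal.Length:Petal.Length))
dropped_by_position <- my_data %>% select(-(1:3))
names(dropped_by_position)
identical(names(dropped_by_name), names(dropped_by_position))

# Negating a helper drops every column the helper would have selected
no_petal <- my_data %>% select(-starts_with("Petal"))
names(no_petal)

# The parentheses in -(1:3) matter: R parses -1:3 as (-1):3,
# which is a different sequence from "minus the range 1 to 3"
```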
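Those are all the column-selection examples from the lesson. I am recording them here so that next time I can find the right function quickly by looking at the example output.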