Learning R: the merge function

Author: Xinli_5d16 | Published 2021-04-17 23:52

Before we start: 生信技能树
http://rstudio-pubs-static.s3.amazonaws.com/13602_96265a9b3bac4cb1b214340770aa18a1.html

The merge function

Merges two data frames by common columns or row names, or performs other versions of database join operations.
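As a minimal sketch of the row-name case: passing by = "row.names" asks merge to match on row names rather than on a column. The data frames a and b and their sample labels are made up for illustration.

```r
# Hypothetical data: two data frames keyed only by their row names.
a <- data.frame(heights = c(62, 65), row.names = c("s1", "s2"))
b <- data.frame(weights = c(113, 147), row.names = c("s2", "s3"))

# by = "row.names" matches on row names; the key is returned as a
# column named Row.names in the result.
m <- merge(a, b, by = "row.names")
print(m)
```

Only "s2" appears in both inputs, so the result should contain a single row.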

Usage

merge(x, y, ...)

The full signature for data frames is:

merge(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
      sort = TRUE, suffixes = c(".x",".y"), no.dups = TRUE,
      incomparables = NULL, ...)

Arguments (far more than I expected before looking them up):
x, y: data frames, or objects that can be coerced to one.

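by, by.x, by.y: the columns used for merging. by applies to both inputs; by.x and by.y name the key columns in x and y separately when they differ.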
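When the key column has a different name in each data frame, by.x and by.y can be given separately. The following is a small sketch with made-up data; the names people, wt and subject are assumptions for illustration only.

```r
# Hypothetical data: the key is "id" in one table and "subject" in the other.
people <- data.frame(id = c(2, 3, 7), heights = c(62, 65, 67))
wt     <- data.frame(subject = c(1, 2, 7), weights = c(147, 113, 135))

# by.x names the key column in x, by.y the key column in y.
# The key column in the result takes its name from x ("id").
joined <- merge(people, wt, by.x = "id", by.y = "subject")
print(joined)
```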

all: logical; all = L is shorthand for all.x = L and all.y = L, where L is either TRUE or FALSE.

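all.x: logical. If TRUE, extra rows are added to the output, one for each row in x that has no matching row in y; those rows have NA in the columns that would normally be filled from y. The default is FALSE, so only rows with data from both x and y appear in the output.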

all.y: logical; analogous to all.x, but for rows of y that have no match in x.

sort: logical; should the result be sorted on the by columns?
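A short sketch of the sort argument, using made-up data frames x and y whose keys are deliberately out of order. With sort = FALSE the row order is not something to rely on, so the example only prints the sorted case.

```r
# Hypothetical data with keys in scrambled order.
x <- data.frame(id = c(3, 1, 2), a = c("c", "a", "b"))
y <- data.frame(id = c(2, 3, 1), b = c("B", "C", "A"))

sorted   <- merge(x, y, by = "id")                # sort = TRUE is the default
unsorted <- merge(x, y, by = "id", sort = FALSE)  # same rows, order not guaranteed

print(sorted$id)
```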

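suffixes: a character vector of length 2. The suffixes are appended to non-key columns that have the same name in x and y, so that column names in the result stay unique.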
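To illustrate suffixes, here is a sketch in which both inputs carry a non-key column called value. The data frames before and after are invented for the example.

```r
# Hypothetical data: both tables have a non-key column named "value".
before <- data.frame(id = 1:2, value = c(10, 20))
after  <- data.frame(id = 1:2, value = c(12, 25))

# Default suffixes are ".x" and ".y".
default_names <- names(merge(before, after, by = "id"))

# Custom suffixes make the origin of each column easier to read.
custom_names <- names(merge(before, after, by = "id",
                            suffixes = c("_before", "_after")))

print(default_names)
print(custom_names)
```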

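no.dups: logical; when TRUE, suffixes are appended in more cases to avoid duplicated column names in the result. The default is TRUE (the behaviour was implicitly FALSE before R 3.5.0).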

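incomparables: values in the key column that must never be matched, as in the match function. It is supported only when merging on a single column.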
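One common use of incomparables is to stop NA keys from matching each other. The sketch below uses invented data frames x and y and is restricted to the default inner join on a single key column.

```r
# Hypothetical data: each table has one NA in the key column.
x <- data.frame(k = c(1, NA), a = c("x1", "x2"))
y <- data.frame(k = c(1, NA), b = c("y1", "y2"))

# By default NA in x is matched to NA in y, as match() would do.
loose <- merge(x, y, by = "k")

# Declaring NA incomparable means NA keys are never matched.
strict <- merge(x, y, by = "k", incomparables = NA)

print(nrow(loose))
print(nrow(strict))
```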

> id1 <- c(2, 3, 4, 5, 7)
> heights <- c(62, 65, 71, 71, 67)
> df1 <- data.frame(id = id1, heights)
> id2 <- c(1, 2, 6:10)
> weights <- c(147, 113, 168, 135, 142, 159, 160)
> df2 <- data.frame(id = id2, weights)
> View(df1)
> View(df2)
> df <- merge(df1,df2,by="id")
> View(df) # intersection of the ids (inner join)
> df
  id heights weights
1  2      62     113
2  7      67     135
> View(df1)
> df1
  id heights
1  2      62
2  3      65
3  4      71
4  5      71
5  7      67
> df2
  id weights
1  1     147
2  2     113
3  6     168
4  7     135
5  8     142
6  9     159
7 10     160
> df
  id heights weights
1  2      62     113
2  7      67     135
> df <- merge(df1,df2,by="id",all = T)  # union of the ids (full outer join); cells without a match are NA
> df
   id heights weights
1   1      NA     147
2   2      62     113
3   3      65      NA
4   4      71      NA
5   5      71      NA
6   6      NA     168
7   7      67     135
8   8      NA     142
9   9      NA     159
10 10      NA     160
> View(df)
> df <- merge(df1,df2,by="id",all = F,all.x = T)  # keep every row of x and fill in values from y; NA where y has no match (left join)
> df
  id heights weights
1  2      62     113
2  3      65      NA
3  4      71      NA
4  5      71      NA
5  7      67     135
> df <- merge(df1,df2,by="id",all = F,all.x = T,all.y = T) # union of the two: all.x and all.y override all, so this equals all = T
> df
   id heights weights
1   1      NA     147
2   2      62     113
3   3      65      NA
4   4      71      NA
5   5      71      NA
6   6      NA     168
7   7      67     135
8   8      NA     142
9   9      NA     159
10 10      NA     160
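
The session above covers the inner, full outer and left joins. For completeness, here is a sketch of the remaining case, all.y = TRUE on its own (a right join), rebuilt from the same df1 and df2 values so that it runs on its own.

```r
# Same data as in the session above.
df1 <- data.frame(id = c(2, 3, 4, 5, 7), heights = c(62, 65, 71, 71, 67))
df2 <- data.frame(id = c(1, 2, 6:10),
                  weights = c(147, 113, 168, 135, 142, 159, 160))

# Keep every row of y; heights is NA where x has no matching id.
right <- merge(df1, df2, by = "id", all.y = TRUE)
print(right)
```

Every id from df2 is kept, and only ids 2 and 7 have a height.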