[Rtips] group_by + across + wher

Author: 热衷组培的二货潜 | Published: 2020-12-07 23:26


This post comes from a question someone asked in a group chat:

data_ori<-"CB;IB;NCB;tag
combine=0;0;2;6;a
combine=1;3;3;5;b
combine=2;8;6;2;b"
data<- read.table(text = data_ori,header = T,sep=";",quote ="")

haha<-data.frame(c("a","b"))
colnames(haha)<-"tag"
for (i in c("CB","IB","NCB")){
  ll<- data  %>% select(i,tag) %>% group_by(tag) %>%  summarise(tmp = sum(data[,i]))
  colnames(ll)<-c("tag",i)
  haha<-haha %>% left_join(ll,by="tag")
}

I want the sum within each tag group, not the sum of the entire column. How should I change this?


The desired result: for the CB column, group a sums to 0 and group b sums to 11; for the IB column, group a sums to 2 and group b sums to 9.
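For context, the loop returns whole-column totals because `data[, i]` inside `summarise()` refers to the full, ungrouped data frame rather than to the rows of the current group. A minimal sketch of a fix that keeps the original loop is shown below; it assumes dplyr 1.0.0 or later and uses the `.data` pronoun to look up the column by name within each group. The explicit `.groups = "drop"` is an addition, not part of the original question.

```r
library(dplyr)

data_ori <- "CB;IB;NCB;tag
combine=0;0;2;6;a
combine=1;3;3;5;b
combine=2;8;6;2;b"
# The header has one fewer field than the rows, so read.table()
# uses the first field ("combine=0", ...) as row names.
data <- read.table(text = data_ori, header = TRUE, sep = ";", quote = "")

haha <- data.frame(tag = c("a", "b"))
for (i in c("CB", "IB", "NCB")) {
  ll <- data %>%
    group_by(tag) %>%
    # .data[[i]] is evaluated per group; data[, i] would always
    # return the whole column regardless of grouping.
    summarise(tmp = sum(.data[[i]]), .groups = "drop")
  colnames(ll) <- c("tag", i)
  haha <- haha %>% left_join(ll, by = "tag")
}
haha
```

This works, but the loop and the repeated joins are unnecessary, as the approaches that follow show.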

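As someone who lurks in all sorts of online groups year-round, I tend to answer questions whenever I am not too busy or am in a decent mood. But an answer that never gets written down reaches very few people, so I decided to write this post.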

There are two approaches:

Approach 1:

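If the code below is hard to follow, I recommend taking the time to study the across and where family of helpers introduced with dplyr 1.0.
The official tidyverse blog covers them:
https://www.tidyverse.org/blog/
dplyr 1.0.0: working across columns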

# Simplest and most direct

library(tidyverse)
data %>%
  group_by(tag) %>%
  summarise(across(where(is.numeric), sum))

`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 4
  tag      CB    IB   NCB
  <chr> <int> <int> <int>
1 a         0     2     6
2 b        11     9     7
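The `summarise()` line in the output above is an informational message, not an error. As a hedged variation, the sketch below passes `.groups = "drop"` explicitly, which makes the ungrouping intentional, and names the columns directly instead of selecting them with `where(is.numeric)`. It assumes dplyr 1.0.0 or later; the name `res` is only illustrative.

```r
library(dplyr)

data_ori <- "CB;IB;NCB;tag
combine=0;0;2;6;a
combine=1;3;3;5;b
combine=2;8;6;2;b"
data <- read.table(text = data_ori, header = TRUE, sep = ";", quote = "")

res <- data %>%
  group_by(tag) %>%
  # across() applies sum() to each listed column within each tag group;
  # .groups = "drop" returns an ungrouped tibble.
  summarise(across(c(CB, IB, NCB), sum), .groups = "drop")
res
```

Naming the columns is safer when the data contains other numeric columns that should not be summed; `where(is.numeric)` is more convenient when every numeric column should be included.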

Approach 2:


# Approach 2: more verbose

library(tidyverse)

data %>%
  pivot_longer(
    cols = -tag,
    names_to = "group",
    values_to = "value"
  ) %>%
  group_by(tag, group) %>%
  summarise(tmp = sum(value)) %>%
  ungroup() %>%
  pivot_wider(
    names_from = group,
    values_from = tmp
    )

`summarise()` regrouping output by 'tag' (override with `.groups` argument)
# A tibble: 2 x 4
  tag      CB    IB   NCB
  <chr> <int> <int> <int>
1 a         0     2     6
2 b        11     9     7
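Both approaches print the same table. A quick way to confirm this programmatically is sketched below, assuming dplyr 1.0.0 or later and tidyr 1.0.0 or later. The names `res_across` and `res_pivot` are illustrative, and `.groups = "drop"` stands in for the separate `ungroup()` call in the second approach.

```r
library(dplyr)
library(tidyr)

data_ori <- "CB;IB;NCB;tag
combine=0;0;2;6;a
combine=1;3;3;5;b
combine=2;8;6;2;b"
data <- read.table(text = data_ori, header = TRUE, sep = ";", quote = "")

# Approach 1: summarise every numeric column within each tag group.
res_across <- data %>%
  group_by(tag) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")

# Approach 2: reshape to long form, sum per (tag, column), reshape back.
res_pivot <- data %>%
  pivot_longer(cols = -tag, names_to = "group", values_to = "value") %>%
  group_by(tag, group) %>%
  summarise(tmp = sum(value), .groups = "drop") %>%
  pivot_wider(names_from = group, values_from = tmp)

# Compare the two results as plain data frames.
all.equal(as.data.frame(res_across), as.data.frame(res_pivot))
```

The long-form route is more verbose here, but it can be handy when the intermediate per-column sums are needed in long format for something else, such as plotting.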