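These are study notes based on the DataCamp course "Network Analysis in R" and the tutorial "Network visualization with R". Network data can now also be handled with the tidygraph package. Because tidygraph stores its data as tibbles (tbl), it works seamlessly with the tidyverse, so these notes cover that package as well.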
library(igraph)
library(tidygraph)
library(tidyverse)
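Basic elements of a network
Network data consists of two elements: vertices (also called nodes) and the edges that connect them. The input data is supplied in terms of these two elements as well: one table describing the nodes and one describing the edges.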
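Building network objects
Before working with network data, we first need to create a network object. igraph and tidygraph each provide their own functions for converting data into such an object.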
igraph
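igraph builds a network object with graph_from_data_frame(). The function takes the edge information (d) and the node information (vertices), and the directed argument controls whether the network is directed.
# Read the node table and the edge table
nodes <- read.csv("./Data/Dataset1-Media-Example-NODES.csv", header = TRUE, as.is = TRUE)
links <- read.csv("./Data/Dataset1-Media-Example-EDGES.csv", header = TRUE, as.is = TRUE)
# Build a directed igraph object from the two tables
net <- graph_from_data_frame(d = links, vertices = nodes, directed = TRUE)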
net
## IGRAPH c3731a0 DNW- 17 49 --
## + attr: name (v/c), media (v/c), media.type (v/n), type.label (v/c),
## | audience.size (v/n), type (e/c), weight (e/n)
## + edges from c3731a0 (vertex names):
## [1] s01->s02 s01->s03 s01->s04 s01->s15 s02->s01 s02->s03 s02->s09 s02->s10
## [9] s03->s01 s03->s04 s03->s05 s03->s08 s03->s10 s03->s11 s03->s12 s04->s03
## [17] s04->s06 s04->s11 s04->s12 s04->s17 s05->s01 s05->s02 s05->s09 s05->s15
## [25] s06->s06 s06->s16 s06->s17 s07->s03 s07->s08 s07->s10 s07->s14 s08->s03
## [33] s08->s07 s08->s09 s09->s10 s10->s03 s12->s06 s12->s13 s12->s14 s13->s12
## [41] s13->s17 s14->s11 s14->s13 s15->s01 s15->s04 s15->s06 s16->s06 s16->s17
## [49] s17->s04
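The effect of the directed argument is easiest to see on a small graph. The following is a minimal sketch that assumes a made-up four-node dataset (toy_nodes and toy_links are hypothetical names, not part of the media dataset above); it builds the same edge list once as a directed and once as an undirected graph.

```r
library(igraph)

# Hypothetical toy data, for illustration only
toy_nodes <- data.frame(
  id    = c("a", "b", "c", "d"),
  group = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)
toy_links <- data.frame(
  from   = c("a", "a", "b", "c"),
  to     = c("b", "c", "c", "d"),
  weight = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# The first column of `vertices` becomes the vertex name;
# the first two columns of `d` are read as the edge endpoints.
g_dir   <- graph_from_data_frame(d = toy_links, vertices = toy_nodes, directed = TRUE)
g_undir <- graph_from_data_frame(d = toy_links, vertices = toy_nodes, directed = FALSE)

is_directed(g_dir)    # TRUE
is_directed(g_undir)  # FALSE
vcount(g_dir)         # number of vertices
ecount(g_dir)         # number of edges
```

All remaining columns of the two data frames are carried over as vertex and edge attributes, which is why the printout above lists media, type.label, weight and so on.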
tidygraph
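The tidygraph package provides tbl_graph() as its basic constructor for network objects. In addition, data frames, matrices, and igraph network objects can be converted with as_tbl_graph().
### Build the network object directly
net1 <- tbl_graph(nodes = nodes, edges = links, directed = TRUE)
### Convert an igraph object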
net2 <- as_tbl_graph(net)
net2
## # A tbl_graph: 17 nodes and 49 edges
## #
## # A directed multigraph with 1 component
## #
## # Node Data: 17 x 5 (active)
## name media media.type type.label audience.size
## <chr> <chr> <int> <chr> <int>
## 1 s01 NY Times 1 Newspaper 20
## 2 s02 Washington Post 1 Newspaper 25
## 3 s03 Wall Street Journal 1 Newspaper 30
## 4 s04 USA Today 1 Newspaper 32
## 5 s05 LA Times 1 Newspaper 20
## 6 s06 New York Post 1 Newspaper 50
## # … with 11 more rows
## #
## # Edge Data: 49 x 4
## from to type weight
## <int> <int> <chr> <int>
## 1 1 2 hyperlink 22
## 2 1 3 hyperlink 22
## 3 1 4 hyperlink 21
## # … with 46 more rows
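The conversions mentioned above can be tried without the media dataset. This is a minimal sketch assuming three made-up inputs (an edge-list data frame, a square adjacency matrix, and an igraph ring); the object names are hypothetical.

```r
library(igraph)
library(tidygraph)

# 1. From an edge-list data frame (columns `from` and `to`)
toy_links <- data.frame(
  from = c("a", "a", "b"),
  to   = c("b", "c", "c"),
  stringsAsFactors = FALSE
)
g_from_df <- as_tbl_graph(toy_links, directed = TRUE)

# 2. From a square matrix with matching row and column names,
#    which is interpreted as an adjacency matrix
adj <- matrix(
  c(0, 1, 1,
    0, 0, 1,
    0, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("a", "b", "c"), c("a", "b", "c"))
)
g_from_mat <- as_tbl_graph(adj, directed = TRUE)

# 3. From an existing igraph object
g_from_igraph <- as_tbl_graph(make_ring(4))

g_from_df
```

Because a tbl_graph is still an igraph object underneath, igraph functions such as gorder() and gsize() can be applied to any of the three results.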
Inspecting network objects
Once the network object has been built, we can inspect the information it holds.
igraph
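In igraph, V() shows the node labels and vertex_attr() shows all attributes attached to the nodes. Likewise, E() shows the edge connections and edge_attr() shows all attributes attached to the edges.
## Inspect the node attributes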
vertex_attr(net)
## $name
## [1] "s01" "s02" "s03" "s04" "s05" "s06" "s07" "s08" "s09" "s10" "s11" "s12"
## [13] "s13" "s14" "s15" "s16" "s17"
##
## $media
## [1] "NY Times" "Washington Post" "Wall Street Journal"
## [4] "USA Today" "LA Times" "New York Post"
## [7] "CNN" "MSNBC" "FOX News"
## [10] "ABC" "BBC" "Yahoo News"
## [13] "Google News" "Reuters.com" "NYTimes.com"
## [16] "WashingtonPost.com" "AOL.com"
##
## $media.type
## [1] 1 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3
##
## $type.label
## [1] "Newspaper" "Newspaper" "Newspaper" "Newspaper" "Newspaper" "Newspaper"
## [7] "TV" "TV" "TV" "TV" "TV" "Online"
## [13] "Online" "Online" "Online" "Online" "Online"
##
## $audience.size
## [1] 20 25 30 32 20 50 56 34 60 23 34 33 23 12 24 28 33
## Inspect the node labels
V(net)
## + 17/17 vertices, named, from c3731a0:
## [1] s01 s02 s03 s04 s05 s06 s07 s08 s09 s10 s11 s12 s13 s14 s15 s16 s17
### Inspect the edge attributes
edge_attr(net)
## $type
## [1] "hyperlink" "hyperlink" "hyperlink" "mention" "hyperlink" "hyperlink"
## [7] "hyperlink" "hyperlink" "hyperlink" "hyperlink" "hyperlink" "hyperlink"
## [13] "mention" "hyperlink" "hyperlink" "hyperlink" "mention" "mention"
## [19] "hyperlink" "mention" "mention" "hyperlink" "hyperlink" "mention"
## [25] "hyperlink" "hyperlink" "mention" "mention" "mention" "hyperlink"
## [31] "mention" "hyperlink" "mention" "mention" "mention" "hyperlink"
## [37] "mention" "hyperlink" "mention" "hyperlink" "mention" "mention"
## [43] "mention" "hyperlink" "hyperlink" "hyperlink" "hyperlink" "mention"
## [49] "hyperlink"
##
## $weight
## [1] 22 22 21 20 23 21 1 5 21 22 1 4 2 1 1 23 1 22 3 2 1 21 2 21 1
## [26] 21 21 1 22 21 4 2 21 23 21 2 2 22 22 21 1 1 21 22 1 4 23 21 4
### Inspect the edge connections
E(net)
## + 49/49 edges from c3731a0 (vertex names):
## [1] s01->s02 s01->s03 s01->s04 s01->s15 s02->s01 s02->s03 s02->s09 s02->s10
## [9] s03->s01 s03->s04 s03->s05 s03->s08 s03->s10 s03->s11 s03->s12 s04->s03
## [17] s04->s06 s04->s11 s04->s12 s04->s17 s05->s01 s05->s02 s05->s09 s05->s15
## [25] s06->s06 s06->s16 s06->s17 s07->s03 s07->s08 s07->s10 s07->s14 s08->s03
## [33] s08->s07 s08->s09 s09->s10 s10->s03 s12->s06 s12->s13 s12->s14 s13->s12
## [41] s13->s17 s14->s11 s14->s13 s15->s01 s15->s04 s15->s06 s16->s06 s16->s17
## [49] s17->s04
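vertex_attr() and edge_attr() return every attribute at once, which gets long quickly. As a minimal sketch on a made-up four-node graph (the data and object names are hypothetical), a single attribute can be pulled out by name, and the whole edge table can be exported as a data frame:

```r
library(igraph)

# Hypothetical toy graph, for illustration only
toy_nodes <- data.frame(
  id    = c("a", "b", "c", "d"),
  group = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)
toy_links <- data.frame(
  from   = c("a", "a", "b", "c"),
  to     = c("b", "c", "c", "d"),
  weight = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(d = toy_links, vertices = toy_nodes, directed = TRUE)

# Which attributes exist?
vertex_attr_names(g)
edge_attr_names(g)

# One attribute at a time: two equivalent spellings
V(g)$group
vertex_attr(g, "group")
E(g)$weight

# Edge list plus edge attributes as a plain data frame
edges_df <- igraph::as_data_frame(g, what = "edges")
edges_df
```

The explicit igraph:: prefix on as_data_frame() is a precaution: other attached packages may provide a function of the same name.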
tidygraph
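A tidygraph object is fully compatible with igraph, so all of the igraph functions shown above also work on it. In addition, tidygraph provides activate(), which selects the part of the graph to work on; it accepts either nodes or edges. The active part can then be converted to a data frame for inspection with as_tibble() or as.data.frame().
### Inspect the node data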
net2 %>% activate(nodes) %>% as_tibble() %>% head()
## # A tibble: 6 x 5
## name media media.type type.label audience.size
## <chr> <chr> <int> <chr> <int>
## 1 s01 NY Times 1 Newspaper 20
## 2 s02 Washington Post 1 Newspaper 25
## 3 s03 Wall Street Journal 1 Newspaper 30
## 4 s04 USA Today 1 Newspaper 32
## 5 s05 LA Times 1 Newspaper 20
## 6 s06 New York Post 1 Newspaper 50
### Inspect the edge data
net2 %>% activate(edges) %>% as.data.frame() %>% head()
## from to type weight
## 1 1 2 hyperlink 22
## 2 1 3 hyperlink 22
## 3 1 4 hyperlink 21
## 4 1 15 mention 20
## 5 2 1 hyperlink 23
## 6 2 3 hyperlink 21
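Note that the from and to columns in the edge table are integers rather than node names: they are row positions in the node table. A minimal sketch on a made-up graph (hypothetical data and names) makes this visible:

```r
library(tidygraph)
library(dplyr)

# Hypothetical toy data, for illustration only
toy_nodes <- data.frame(
  name  = c("a", "b", "c", "d"),
  group = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)
toy_links <- data.frame(
  from   = c("a", "a", "b", "c"),
  to     = c("b", "c", "c", "d"),
  weight = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
g <- tbl_graph(nodes = toy_nodes, edges = toy_links, directed = TRUE)

# Extract each table as a tibble
node_tbl <- g %>% activate(nodes) %>% as_tibble()
edge_tbl <- g %>% activate(edges) %>% as_tibble()

# `from`/`to` are row indices into node_tbl; look the names up again
edge_named <- edge_tbl %>%
  mutate(from_name = node_tbl$name[from],
         to_name   = node_tbl$name[to])
edge_named
```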
Filtering network information
igraph
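igraph can filter the network and display the matching nodes or edges. The result, however, is only a vertex or edge sequence: to visualize the filtered data as a network, a new network object has to be built from it.
## Filter nodes by a node attribute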
V(net)[type.label == "TV"]
## + 5/17 vertices, named, from c3731a0:
## [1] s07 s08 s09 s10 s11
## Show the edges incident to a given node
E(net)[[inc("s01")]]
## + 8/49 edges from c3731a0 (vertex names):
## tail head tid hid type weight
## 1 s01 s02 1 2 hyperlink 22
## 2 s01 s03 1 3 hyperlink 22
## 3 s01 s04 1 4 hyperlink 21
## 4 s01 s15 1 15 mention 20
## 5 s02 s01 2 1 hyperlink 23
## 9 s03 s01 3 1 hyperlink 21
## 21 s05 s01 5 1 mention 1
## 44 s15 s01 15 1 hyperlink 22
## Filter edges by a condition
## (the value must match exactly: a misspelling such as "heyperlink" returns 0 of the 49 edges)
E(net)[[type == "hyperlink"]]
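To turn a filtered vertex or edge sequence back into a network object, one option is induced_subgraph() for nodes and delete_edges() for edges. The following is a minimal sketch on a made-up graph (hypothetical data and names), not necessarily the approach used in the course.

```r
library(igraph)

# Hypothetical toy graph, for illustration only
toy_nodes <- data.frame(
  id    = c("a", "b", "c", "d"),
  group = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)
toy_links <- data.frame(
  from   = c("a", "a", "b", "c"),
  to     = c("b", "c", "c", "d"),
  weight = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(d = toy_links, vertices = toy_nodes, directed = TRUE)

# Keep only the nodes in group "x" and the edges between them
keep_nodes <- V(g)[group == "x"]
g_nodes <- induced_subgraph(g, keep_nodes)

# Keep all nodes but drop the light edges (weight below 3)
g_edges <- delete_edges(g, E(g)[weight < 3])

g_nodes
g_edges
```

Both results are ordinary igraph objects, so they can be passed to plot() or any other igraph function.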
tidygraph
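After activate() selects the node or edge table, dplyr verbs can be used to filter, add, or modify its contents. The filtered result is still a network object, so it can go straight on to visualization.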
net2 %>% activate(nodes) %>% filter(type.label == "TV") %>%
activate(edges) %>% filter(type == "mention")
## # A tbl_graph: 5 nodes and 4 edges
## #
## # A directed simple graph with 2 components
## #
## # Edge Data: 4 x 4 (active)
## from to type weight
## <int> <int> <chr> <int>
## 1 1 2 mention 22
## 2 2 1 mention 21
## 3 2 3 mention 23
## 4 3 4 mention 21
## #
## # Node Data: 5 x 5
## name media media.type type.label audience.size
## <chr> <chr> <int> <chr> <int>
## 1 s07 CNN 2 TV 56
## 2 s08 MSNBC 2 TV 34
## 3 s09 FOX News 2 TV 60
## # … with 2 more rows
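One detail worth knowing is that filtering nodes also removes every edge attached to a removed node, while filtering edges leaves the node table untouched. This is a minimal sketch on a made-up graph (hypothetical data and names):

```r
library(tidygraph)
library(dplyr)

# Hypothetical toy data, for illustration only
toy_nodes <- data.frame(
  name  = c("a", "b", "c", "d"),
  group = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)
toy_links <- data.frame(
  from   = c("a", "a", "b", "c"),
  to     = c("b", "c", "c", "d"),
  weight = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
g <- tbl_graph(nodes = toy_nodes, edges = toy_links, directed = TRUE)

# Node filter: only a and b remain, so only the edge a -> b survives
g_nodes <- g %>%
  activate(nodes) %>%
  filter(group == "x")

# Edge filter: all four nodes remain, only the heavy edges are kept
g_edges <- g %>%
  activate(edges) %>%
  filter(weight > 2)

g_nodes
g_edges
```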
Adding and removing attributes
igraph
igraph exposes attributes as list-like elements, so an extra attribute can be added with $. Use V() to add a node attribute and E() to add an edge attribute.
## Add a color attribute
V(net)$color <- ifelse(V(net)$type.label == "TV", "red", "blue")
vertex_attr(net)
## $name
## [1] "s01" "s02" "s03" "s04" "s05" "s06" "s07" "s08" "s09" "s10" "s11" "s12"
## [13] "s13" "s14" "s15" "s16" "s17"
##
## $media
## [1] "NY Times" "Washington Post" "Wall Street Journal"
## [4] "USA Today" "LA Times" "New York Post"
## [7] "CNN" "MSNBC" "FOX News"
## [10] "ABC" "BBC" "Yahoo News"
## [13] "Google News" "Reuters.com" "NYTimes.com"
## [16] "WashingtonPost.com" "AOL.com"
##
## $media.type
## [1] 1 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3
##
## $type.label
## [1] "Newspaper" "Newspaper" "Newspaper" "Newspaper" "Newspaper" "Newspaper"
## [7] "TV" "TV" "TV" "TV" "TV" "Online"
## [13] "Online" "Online" "Online" "Online" "Online"
##
## $audience.size
## [1] 20 25 30 32 20 50 56 34 60 23 34 33 23 12 24 28 33
##
## $color
## [1] "blue" "blue" "blue" "blue" "blue" "blue" "red" "red" "red" "red"
## [11] "red" "blue" "blue" "blue" "blue" "blue" "blue"
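The same $ assignment works for edges, and attributes can be removed again with delete_vertex_attr() and delete_edge_attr(). This is a minimal sketch on a made-up graph; the attribute names color and width are only examples.

```r
library(igraph)

# Hypothetical toy graph, for illustration only
toy_nodes <- data.frame(
  id    = c("a", "b", "c", "d"),
  group = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)
toy_links <- data.frame(
  from   = c("a", "a", "b", "c"),
  to     = c("b", "c", "c", "d"),
  weight = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(d = toy_links, vertices = toy_nodes, directed = TRUE)

# Add a node attribute and an edge attribute
V(g)$color <- ifelse(V(g)$group == "x", "red", "blue")
E(g)$width <- E(g)$weight / 2

attrs_after_add <- list(vertex = vertex_attr_names(g),
                        edge   = edge_attr_names(g))

# Remove them again
g <- delete_vertex_attr(g, "color")
g <- delete_edge_attr(g, "width")

attrs_after_delete <- list(vertex = vertex_attr_names(g),
                           edge   = edge_attr_names(g))
attrs_after_add
attrs_after_delete
```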
tidygraph
After activate() selects the relevant table, mutate() can be used to add further attributes to it.
net2 %>% activate(nodes) %>%
mutate(color = if_else(type.label == "TV", "red", "blue")) %>%
pull(color)
## [1] "blue" "blue" "blue" "blue" "blue" "blue" "red" "red" "red" "red"
## [11] "red" "blue" "blue" "blue" "blue" "blue" "blue"
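The pipeline above ends in pull(), so it only prints the new column and net2 itself is unchanged. To keep the attribute, the result of mutate() has to be assigned; to drop it again, select() with a minus sign is one option. This is a minimal sketch on a made-up graph (hypothetical data and names):

```r
library(tidygraph)
library(dplyr)

# Hypothetical toy data, for illustration only
toy_nodes <- data.frame(
  name  = c("a", "b", "c", "d"),
  group = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)
toy_links <- data.frame(
  from = c("a", "a", "b", "c"),
  to   = c("b", "c", "c", "d"),
  stringsAsFactors = FALSE
)
g <- tbl_graph(nodes = toy_nodes, edges = toy_links, directed = TRUE)

# Add a node attribute and keep the modified graph
g_color <- g %>%
  activate(nodes) %>%
  mutate(color = if_else(group == "x", "red", "blue"))

# Remove the attribute again
g_plain <- g_color %>%
  activate(nodes) %>%
  select(-color)

cols_with    <- g_color %>% activate(nodes) %>% as_tibble() %>% names()
cols_without <- g_plain %>% activate(nodes) %>% as_tibble() %>% names()
cols_with
cols_without
```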