[包] ggraph

主要是整理官方文件,供自己理解
https://ggraph.data-imaginist.com/articles/Edges.html
提供给ggraph的数据有两个表,分别是nodes和edges, tidygraph中包含很多创造特征的函数. ggraph和create_layout也会创造一些特征. 这些都可以作为分组和颜色等等图形属性.

1. layouts

As the layout is a global specification of the spatial position of the nodes it spans all layers in the plot and should thus be defined outside of calls to geoms or stats. In ggraph it is often done as part of the plot initialization using ggraph() — a function equivalent in intent to ggplot().
In addition to specifying the layout during plot creation it can also happen separately using create_layout(). This function takes the same arguments as ggraph() but returns a layout_ggraph object that can later be used in place of a graph structure in ggraph call:

ggraph(graph, layout = 'kk', maxiter = 100) + 
  geom_edge_link(aes(colour = factor(year))) + 
  geom_node_point()

layout <- create_layout(graph, layout = 'eigen')
ggraph(layout) + 
  geom_edge_link(aes(colour = factor(year))) + 
  geom_node_point()
  1. ggraph supports tbl_graph objects from tidygraph natively. Any other type of object will be attempted to be coerced to a tbl_graph object automatically. Tidygraph provide conversions for most known graph structure in R so almost any data type is supported by ggraph by extension.

  2. There’s a lot of different layouts in ggraph — All layouts from the graphlayouts and igraph packages are available, an ggraph itself also provide some of the more specialised layouts itself. All in all ggraph provides well above 20 different layouts to choose from, far more than we can cover in this text.
    If ggraph lacks the needed layout it is always possible to supply your own layout function that takes a tbl_graph object and returns a data.frame of node positions, or supply the positions directly by passing a matrix or data.frame to the layout argument.

  3. Sometimes standard and circular representations of the same layout get used so often that they get different names. In ggraph they’ll have the same name and only differ in whether or not circular is set to TRUE.
    Not every layout has a meaningful circular representation in which cases the circular argument will be ignored.

# An arc diagram
ggraph(graph, layout = 'linear') + 
  geom_edge_arc(aes(colour = factor(year)))
fig 1
# A coord diagram
ggraph(graph, layout = 'linear', circular = TRUE) + 
  geom_edge_arc(aes(colour = factor(year))) + 
  coord_fixed()
fig 2
graph <- tbl_graph(flare$vertices, flare$edges)
# An icicle plot
ggraph(graph, 'partition') + 
  geom_node_tile(aes(fill = depth), size = 0.25)
fig 3
# A sunburst plot
ggraph(graph, 'partition', circular = TRUE) + 
  geom_node_arc_bar(aes(fill = depth), size = 0.25) + 
  coord_fixed()
fig 4
  1. There is no such thing as “the best layout algorithm” as algorithms have been optimized for different scenarios.
graph <- as_tbl_graph(highschool) %>% 
  mutate(degree = centrality_degree())
lapply(c('stress', 'fr', 'lgl', 'graphopt'), function(layout) {
  ggraph(graph, layout = layout) + 
    geom_edge_link(aes(colour = factor(year)), show.legend = FALSE) +
    geom_node_point() + 
    labs(caption = paste0('Layout: ', layout))
})

5. hive plot
A hive plot, while still technically a node-edge diagram, is a bit different from the rest as it uses information pertaining to the nodes, rather than the connection information in the graph.
They are less common though, so use will often require some additional explanation.

graph <- graph %>% 
  mutate(friends = ifelse(
    centrality_degree(mode = 'in') < 5, 'few',
    ifelse(centrality_degree(mode = 'in') >= 15, 'many', 'medium')
  ))
ggraph(graph, 'hive', axis = friends, sort.by = degree) + 
  geom_edge_hive(aes(colour = factor(year))) + 
  geom_axis_hive(aes(colour = friends), size = 2, label = FALSE) + 
  coord_fixed()
fig 5

6. Hierarchical layouts
Trees and hierarchies are an important subset of graph structures, and ggraph provides a range of layouts optimized for their visual representation. Some of these use enclosure and position rather than edges to communicate relations (e.g. treemaps and circle packing). Still, these layouts can just as well be used for drawing edges if you wish to:

graph <- tbl_graph(flare$vertices, flare$edges)
set.seed(1)
ggraph(graph, 'circlepack', weight = size) + 
  geom_node_circle(aes(fill = depth), size = 0.25, n = 50) + 
  coord_fixed()
fig 6
set.seed(1)
ggraph(graph, 'circlepack', weight = size) + 
  geom_edge_link() + 
  geom_node_point(aes(colour = depth)) +
  coord_fixed()
fig 7
ggraph(graph, 'treemap', weight = size) + 
  geom_node_tile(aes(fill = depth), size = 0.25)
fig 8
ggraph(graph, 'treemap', weight = size) + 
  geom_edge_link() + 
  geom_node_point(aes(colour = depth))
fig 9

The most recognized tree plot is probably dendrograms though. If nothing else is stated the height of each node is calculated based on the distance to its farthest sibling (the tree layout, on the other hand, puts all nodes at a certain depth at the same level):

ggraph(graph, 'tree') + 
  geom_edge_diagonal()
fig 10

The height of each branch point can be set to a variable — e.g. the height provided by hclust and dendrogram objects:

dendrogram <- hclust(dist(iris[, 1:4]))
ggraph(dendrogram, 'dendrogram', height = height) + 
  geom_edge_elbow()
fig 11

Dendrograms are one of the layouts that are amenable for circular transformations, which can be effective in giving more space at the leafs of the tree at the expense of the space given to the root:

ggraph(dendrogram, 'dendrogram', circular = TRUE) + 
  geom_edge_elbow() + 
  coord_fixed()
fig 12

A type of trees known especially in phylogeny is unrooted trees, where no node is considered the root. Often a dendrogram layout will not be faithful as it implicitly position a node at the root. To avoid that you can use the unrooted layout instead.

tree <- create_tree(100, 2, directed = FALSE) %>% 
  activate(edges) %>% 
  mutate(length = runif(n()))
ggraph(tree, 'unrooted', length = length) + 
  geom_edge_link()
fig 13
  1. Focal layouts/Matrix layouts/Fabric layouts

2. Nodes

geom_node_*()
All ggraph geoms get a filter aesthetic that allows you to quickly filter the input data. The use of this can be illustrated when plotting a tree:

ggraph(gr, layout = 'dendrogram', circular = TRUE) + 
  geom_edge_diagonal() + 
  geom_node_point(aes(filter = leaf)) + 
  coord_fixed()
fig 14

In the above plot only the terminal nodes are drawn by filtering on the logical leaf column provided by the dendrogram layout.

geom_node_tile() and geom_node_range() are the ggraph counterpart to ggplot2s geom_tile() and geom_linerange() while geom_node_circle() and geom_node_arc_bar() maps to ggforces geom_circle() and geom_arc_bar(). Collective for these is that the spatial dimensions of the geoms (e.g. radius, width, and height) are precalculated by their intended layouts and defaulted by the geoms

ggraph(gr, layout = 'treemap', weight = size) + 
  geom_node_tile(aes(fill = depth))
fig 15
l <- ggraph(gr, layout = 'partition', circular = TRUE)
l + geom_node_arc_bar(aes(fill = depth)) + 
  coord_fixed()
fig 16
l + geom_edge_diagonal() + 
  geom_node_point(aes(colour = depth)) + 
  coord_fixed()

fig 17

geom_node_text() and geom_node_label() apart from their ggplot2 counterparts: both have a repel argument that, when set to TRUE, will use the repel functionality provided by the ggrepel package to avoid overlapping text. There is also geom_node_voronoi() that plots nodes as cells from a voronoi tesselation. This is useful for e.g. showing dominance of certain node types in an area as overlapping is avoided

graph <- create_notable('meredith') %>% 
  mutate(group = sample(c('A', 'B'), n(), TRUE))

ggraph(graph, 'stress') + 
  geom_node_voronoi(aes(fill = group), max.radius = 1) + 
  geom_node_point() + 
  geom_edge_link() + 
  coord_fixed()
fig 18

3. Edges

看看这个数据格式

library(ggraph)
library(tidygraph)
library(purrr)
library(rlang)

set_graph_style(plot_margin = margin(1,1,1,1))
hierarchy <- as_tbl_graph(hclust(dist(iris[, 1:4]))) %>% 
  mutate(Class = map_bfs_back_chr(node_is_root(), .f = function(node, path, ...) {
    if (leaf[node]) {
      as.character(iris$Species[as.integer(label[node])])
    } else {
      species <- unique(unlist(path$result))
      if (length(species) == 1) {
        species
      } else {
        NA_character_
      }
    }
  }))

hairball <- as_tbl_graph(highschool) %>% 
  mutate(
    year_pop = map_local(mode = 'in', .f = function(neighborhood, ...) {
      neighborhood %E>% pull(year) %>% table() %>% sort(decreasing = TRUE)
    }),
    pop_devel = map_chr(year_pop, function(pop) {
      if (length(pop) == 0 || length(unique(pop)) == 1) return('unchanged')
      switch(names(pop)[which.max(pop)],
             '1957' = 'decreased',
             '1958' = 'increased')
    }),
    popularity = map_dbl(year_pop, ~ .[1]) %|% 0
  ) %>% 
  activate(edges) %>% 
  mutate(year = as.character(year))

link 直线
fan 重复的连接画曲线
parallel 重复的链接画平行线

ggraph(hairball, layout = 'stress') + 
  geom_edge_fan(aes(colour = year))

loop 环

# let's make some of the student love themselves
# bind_edges
loopy_hairball <- hairball %>% 
  bind_edges(tibble::tibble(from = 1:5, to = 1:5, year = rep('1957', 5)))
ggraph(loopy_hairball, layout = 'stress') + 
  geom_edge_link(aes(colour = year), alpha = 0.25) + 
  geom_edge_loop(aes(colour = year))

geom_edge_density() lets you add a shading to your plot based on the density of edges in a certain area

ggraph(hairball, layout = 'stress') + 
  geom_edge_density(aes(fill = year)) + 
  geom_edge_link(alpha = 0.25)

Arcs
it increases overplotting and decreases interpretability for virtually no gain (unless complexity is your thing). That doesn’t mean arcs have no use in graph visualizations. Linear and circular layouts can benefit greatly from them and geom_edge_arc() is provided precisely for this scenario:

ggraph(hairball, layout = 'linear') + 
  geom_edge_arc(aes(colour = year))

Arcs behave differently in circular layouts as they will always bend towards the center no matter the direction of the edge (the same thing can be achieved in a linear layout by setting fold = TRUE).

ggraph(hairball, layout = 'linear', circular = TRUE) + 
  geom_edge_arc(aes(colour = year)) + 
  coord_fixed()
fig 19

Elbow 直方
Diagonals 弯曲
Bends 介于以上两者之间

ggraph(hierarchy, layout = 'dendrogram', height = height) + 
  geom_edge_bend()
fig 20

Hive
Span
point and tile 不需要edge

three variants
base:

ggraph(hairball, layout = 'linear') + 
  geom_edge_arc(aes(colour = year, alpha = after_stat(index))) + 
  scale_edge_alpha('Edge direction', guide = 'edge_direction')

2-variant线可以使用点的特征

ggraph(hierarchy, layout = 'dendrogram', height = height) + 
  geom_edge_elbow2(aes(colour = node.Class))

0-variant

edge strength
Many of the edge geoms takes a strength argument that denotes their deviation from a straight line. Setting strength = 0 will always result in a straight line, while strength = 1 is the default look.

small_tree <- create_tree(5, 2)

ggraph(small_tree, 'dendrogram') + 
  geom_edge_elbow(strength = 0.75)
fig 21

Decorating edges

simple <- create_notable('bull') %>% 
  mutate(name = c('Thomas', 'Bob', 'Hadley', 'Winston', 'Baptiste')) %>% 
  activate(edges) %>% 
  mutate(type = sample(c('friend', 'foe'), 5, TRUE))

Arrow or label
There’s a solution to this in the form of the start_cap and end_cap aesthetics in the base and 2-variant edge geoms (sorry 0-variant). This can be used to start and stop the edge drawing at an absolute distance from the terminal nodes. Using the circle(), square(), ellipsis(), and rectangle() helpers it is possible to get a lot of control over how edges are capped at either end. This works for any edge, curved or not:

ggraph(simple, layout = 'linear', circular = TRUE) + 
  geom_edge_arc(arrow = arrow(length = unit(4, 'mm')), 
                start_cap = circle(3, 'mm'),
                end_cap = circle(3, 'mm')) + 
  geom_node_point(size = 5) + 
  coord_fixed()
fig 22

When plotting node labels you often want to avoid that incoming and outgoing edges overlaps with the labels. ggraph provides a helper that calculates the bounding rectangle of the labels and cap edges based on that:

ggraph(simple, layout = 'graphopt') + 
  geom_edge_link(aes(start_cap = label_rect(node1.name),
                     end_cap = label_rect(node2.name)), 
                 arrow = arrow(length = unit(4, 'mm'))) + 
  geom_node_text(aes(label = name))
fig 23

Usually you would like the labels to run along the edges, but providing a fixed angle will only work at a very specific aspect ratio. Instead ggraph offers to calculate the correct angle dynamically so the labels always runs along the edge. Furthermore it can offset the label by an absolute length:

ggraph(simple, layout = 'graphopt') + 
  geom_edge_link(aes(label = type), 
                 angle_calc = 'along',
                 label_dodge = unit(2.5, 'mm'),
                 arrow = arrow(length = unit(4, 'mm')), 
                 end_cap = circle(3, 'mm')) + 
  geom_node_point(size = 5)
fig 24

Coneections
The estranged cousin of edges are connections. While edges show the relational nature of the nodes in the graph structure, connections connect nodes that are not connected in the graph. This is done by finding the shortest path between the two nodes. Currently the only connection geom available is [geom_conn_bundle()](https://ggraph.data-imaginist.com/reference/geom_conn_bundle.html) that implements the hierarchical edge bundling technique:

flaregraph <- tbl_graph(flare$vertices, flare$edges)
from <- match(flare$imports$from, flare$vertices$name)
to <- match(flare$imports$to, flare$vertices$name)
ggraph(flaregraph, layout = 'dendrogram', circular = TRUE) + 
  geom_conn_bundle(data = get_con(from = from, to = to), alpha = 0.1) + 
  coord_fixed()
fig 25
最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 196,647评论 5 462
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 82,690评论 2 374
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 143,739评论 0 325
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 52,692评论 1 267
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 61,552评论 5 358
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 46,410评论 1 275
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 36,819评论 3 387
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 35,463评论 0 255
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 39,752评论 1 294
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 34,789评论 2 314
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 36,572评论 1 326
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 32,414评论 3 315
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 37,833评论 3 300
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 29,054评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 30,345评论 1 254
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 41,810评论 2 342
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 41,016评论 2 337

推荐阅读更多精彩内容