This RMarkdown document is part of the Generic Skills Component (GSK) of the Course of the Foundation Studies Programme at Srishti Manipal Institute of Art, Design, and Technology, Bangalore, India. The material is based on A Layered Grammar of Graphics by Hadley Wickham. The course is meant for First Year students pursuing a Degree in Art and Design.
The intent of this GSK part is to build Skill in coding in R, and also appreciate R as a way to metaphorically visualize information of various kinds, using predominantly geometric figures and structures.
All RMarkdown files combine code, text, web-images, and figures developed using code. Everything is text; code chunks are enclosed in fences (```)
At the end of this Lab session, we should:
- know the types and structures of network data and be able to work with them
- understand the basics of modern network packages in R
- be able to create network visualizations using tidygraph, ggraph (static visualizations) and visNetwork (interactive visualizations)
- see directions for how the network metaphor applies in a variety of domains (e.g. biology/ecology, ideas/influence, technology, transportation, to name a few)
The method followed will be based on PRIMM (Predict, Run, Investigate, Modify, Make). In particular:
- Investigate what the parameters of the code do and write comments to explain. What bells and whistles can you see?
- Modify the parameters of the code provided to understand the options available. Write comments to show what you have aimed for and achieved.

The setup code chunk below brings into our coding session R packages that provide specific computational abilities and also datasets which we can use.
To reiterate: packages and datasets are not the same thing!! Packages are (small) collections of programs. Datasets are just… information.
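The setup chunk itself is not reproduced on this page. A minimal sketch of what it might contain is given here; the package list is an assumption, inferred from the functions used later in this lab (`read_csv()`, `tbl_graph()`, `ggraph()`, `visNetwork()`), and is not the author's original chunk.

```r
# Hypothetical setup chunk: package list inferred from the code in this lab
library(tidyverse)   # read_csv(), dplyr verbs, tibble helpers, ggplot2
library(tidygraph)   # tbl_graph(), activate(), centrality_*(), group_*()
library(ggraph)      # ggraph(), geom_node_*(), geom_edge_*(), facet_*()
library(visNetwork)  # interactive network widgets

# igraph is not attached here: tidygraph and ggraph already use it internally
```

If any of these packages is missing, `install.packages()` with the package name installs it once; `library()` then loads it in every session.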
Network graphs are characterized by two key terms: nodes and edges
- Nodes: entities, also called vertices. Nodes have IDs.
- Edges: connections between nodes, also called links or ties.

In R, we create network representations using node and edge information. One way in which these can be organized is:
- Node list: a data frame with a single column listing the node IDs found in the edge list. You can also add attribute columns to the data frame, such as the names of the nodes or grouping variables. (Type? Class? Family? Country? Subject? Race?)

ID | Node Name | Attribute? Qualities? Categories? Family? Country? Planet? |
---|---|---|
1 | Ned | Nursery School Teacher |
2 | Jaguar Paw | Main Character, Apocalypto |
3 | John Snow | Epidemiologist |
- Edge list: a data frame containing two columns: the source node and the destination node of an edge. Source and destination hold node IDs.
- Weighted network graph: an edge list can also contain additional columns describing attributes of the edges, such as a magnitude for an edge. If the edges have a magnitude attribute, the graph is considered weighted.

From | To | Relationship | Weightage |
---|---|---|---|
1 | 3 | Financial Dealings | 6 |
2 | 1 | History Lessons | 2 |
2 | 3 | Vaccination | 15 |
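The two tables above can be written down directly as data frames. This is only a sketch: the column names `id`, `name`, `attribute`, `from`, `to`, `relationship` and `weightage` are chosen here for illustration and simply mirror the table headers.

```r
library(tibble)

# Node list: one row per node, with an ID and attribute columns
nodes <- tibble(
  id        = c(1, 2, 3),
  name      = c("Ned", "Jaguar Paw", "John Snow"),
  attribute = c("Nursery School Teacher",
                "Main Character, Apocalypto",
                "Epidemiologist")
)

# Edge list: source and destination node IDs, plus edge attributes
edges <- tibble(
  from         = c(1, 2, 2),
  to           = c(3, 1, 3),
  relationship = c("Financial Dealings", "History Lessons", "Vaccination"),
  weightage    = c(6, 2, 15)
)

# Sanity check: every ID used in the edge list appears in the node list
all(c(edges$from, edges$to) %in% nodes$id)

# The graph is weighted because the edges carry a magnitude column
"weightage" %in% names(edges)
```

The sanity check is worth running on any real dataset: an edge that points to a node missing from the node list usually means a typo in a name or ID.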
- Layout: a geometric arrangement of nodes and edges.
- Layout algorithms: methods that arrange nodes and edges with the aim of optimizing some metric. For example, nodes can be treated as masses and edges as springs; the layout algorithm then minimizes the stretching and compressing of all springs. (BTW, are the Spring Constants K the same for all springs?…)
- Directed and undirected network graph: if the distinction between source and target is meaningful, the network is directed. If the distinction is not meaningful, the network is undirected. Directed edges represent an ordering of nodes, like a relationship extending from one node to another, where switching the direction would change the structure of the network. Undirected edges are simply links between nodes where order does not matter.

Examples:
- The World Wide Web is an example of a directed network because hyperlinks connect one Web page to another, but not necessarily the other way around.
- Co-authorship networks are examples of undirected networks, where nodes are authors and they are connected by an edge if they have written a publication together.
- When people send e-mail to each other, the distinction between the sender (source) and the recipient (target) is clearly meaningful, therefore the network is directed.
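The directed/undirected distinction can be tried out on a tiny made-up e-mail network. The sketch below assumes the tidygraph package is installed; the people and the edge list are hypothetical, and `igraph::is_directed()` is used only to inspect the result.

```r
library(tidygraph)

# Hypothetical e-mail network: three people, three messages
people <- data.frame(name = c("Asha", "Ben", "Chitra"))
emails <- data.frame(from = c(1L, 2L, 3L),   # sender (row number in `people`)
                     to   = c(2L, 3L, 1L))   # recipient

# Same data, two interpretations
g_directed   <- tbl_graph(nodes = people, edges = emails, directed = TRUE)
g_undirected <- tbl_graph(nodes = people, edges = emails, directed = FALSE)

igraph::is_directed(g_directed)    # TRUE: sender and recipient are distinguished
igraph::is_directed(g_undirected)  # FALSE: only "these two are linked" is kept
```

The data are identical in both cases; what changes is whether the order of `from` and `to` is treated as meaningful.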
- Connected and disconnected graphs: if there is some path from any node to any other node, the network is said to be connected. Else, it is disconnected.

tidygraph and ggraph
tidygraph and ggraph are modern R packages for network data. Graph data setup and manipulation is done in tidygraph and graph visualization with ggraph.
- tidygraph: Data -> “Network Object” in R.
- ggraph: Network Object -> Plots using a chosen layout/algo.

Both leverage the power of igraph, which is the Big Daddy of all network packages. We will be using the Grey’s Anatomy dataset in our first foray into networks.
grey_nodes <- read_csv("data/grey_nodes.csv")
grey_edges <- read_csv("data/grey_edges.csv")
# grey_nodes <- read_delim("./Data/greys-nodes.csv", delim = ";")
# grey_edges <- read_delim("~/Downloads/grey-edges.csv",
# delim = ";", escape_double = FALSE, trim_ws = TRUE)
grey_nodes
## # A tibble: 54 × 7
## name sex race birthyear position season sign
## <chr> <chr> <chr> <dbl> <chr> <dbl> <chr>
## 1 Addison Montgomery F White 1967 Attending 1 Libra
## 2 Adele Webber F Black 1949 Non-Staff 2 Leo
## 3 Teddy Altman F White 1969 Attending 6 Pisces
## 4 Amelia Shepherd F White 1981 Attending 7 Libra
## 5 Arizona Robbins F White 1976 Attending 5 Leo
## 6 Rebecca Pope F White 1975 Non-Staff 3 Gemini
## 7 Jackson Avery M Black 1981 Resident 6 Leo
## 8 Miranda Bailey F Black 1969 Attending 1 Virgo
## 9 Ben Warren M Black 1972 Other 6 Aquarius
## 10 Henry Burton M White 1972 Non-Staff 7 Cancer
## # ℹ 44 more rows
grey_edges
## # A tibble: 57 × 4
## from to weight type
## <chr> <chr> <dbl> <chr>
## 1 Leah Murphy Arizona Robbins 2 friends
## 2 Leah Murphy Alex Karev 4 benefits
## 3 Lauren Boswell Arizona Robbins 1 friends
## 4 Arizona Robbins Callie Torres 1 friends
## 5 Callie Torres Erica Hahn 6 friends
## 6 Callie Torres Alex Karev 12 benefits
## 7 Callie Torres Mark Sloan 5 professional
## 8 Callie Torres George O'Malley 2 professional
## 9 George O'Malley Izzie Stevens 3 professional
## 10 George O'Malley Meredith Grey 4 friends
## # ℹ 47 more rows
Questions and Inferences #1:
Look at the console output thumbnail. What does, for example, name = col_character mean? What attributes (i.e. extra information) are seen for Nodes and Edges? Understand the data in both nodes and edges as shown in the second and third thumbnails. Write some comments and inferences here.
Key function:
- tbl_graph(): (aka “tibble graph”). Key arguments: nodes, edges and directed. Note that this is a very versatile command and can take many input forms, such as data structures that result from other packages. Type ?tbl_graph in the Console and see the Usage section.

ga <- tbl_graph(nodes = grey_nodes,
                edges = grey_edges,
                directed = FALSE)
ga
## # A tbl_graph: 54 nodes and 57 edges
## #
## # An undirected simple graph with 4 components
## #
## # A tibble: 54 × 7
## name sex race birthyear position season sign
## <chr> <chr> <chr> <dbl> <chr> <dbl> <chr>
## 1 Addison Montgomery F White 1967 Attending 1 Libra
## 2 Adele Webber F Black 1949 Non-Staff 2 Leo
## 3 Teddy Altman F White 1969 Attending 6 Pisces
## 4 Amelia Shepherd F White 1981 Attending 7 Libra
## 5 Arizona Robbins F White 1976 Attending 5 Leo
## 6 Rebecca Pope F White 1975 Non-Staff 3 Gemini
## # ℹ 48 more rows
## #
## # A tibble: 57 × 4
## from to weight type
## <int> <int> <dbl> <chr>
## 1 5 47 2 friends
## 2 21 47 4 benefits
## 3 5 46 1 friends
## # ℹ 54 more rows
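One way to inspect what a graph object holds is to pull its two tables back out. The sketch below uses a small made-up graph rather than `ga`, so that it runs on its own; the node and edge data are hypothetical.

```r
library(dplyr)
library(tidygraph)

# Hypothetical toy data: edges refer to nodes by name, as in the Grey's Anatomy files
nodes <- data.frame(name = c("A", "B", "C"), role = c("x", "y", "x"))
edges <- data.frame(from = c("A", "B"), to = c("B", "C"), weight = c(1, 5))

g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE)

# Activate one table at a time and convert it to an ordinary tibble
node_tbl <- g %>% activate(nodes) %>% as_tibble()
edge_tbl <- g %>% activate(edges) %>% as_tibble()

names(node_tbl)  # the node attribute columns
edge_tbl         # from/to are now row numbers of the node table, not names
```

This explains a detail in the printout of `ga` above: the `from` and `to` columns were names in the CSV file but appear as integers inside the graph object, because each one points to a row of the node table.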
Questions and Inferences #2:
Questions and Inferences: What information does the graph object contain? What attributes do the nodes have? What about the edges?
ggraph
3a. Quick Plot: autograph()
This is a quick check of whether the data has been imported properly, and helps decide whether to go on to more elaborate plotting.
autograph(ga)
Questions and Inferences #3:
Questions and Inferences: Describe this graph, in simple words here. Try to use some of the new domain words we have just acquired: nodes/edges, connected/disconnected, directed/undirected.
3b. More elaborate plot
Key functions:
- ggraph(layout = "......"): Create classic node-edge diagrams; i.e. sets up the graph. Rather like ggplot for networks!

Two kinds of geom: one set for nodes, and another for edges.
- geom_node_point(aes(.....)): Draws nodes as “points”. Alternatives are circle / arc_bar / tile / voronoi. Remember the geoms that we have seen before in Grammar of Graphics!
- geom_edge_link(aes(.....)): Draws edges as “links”. Alternatives are arc / bend / elbow / hive / loop / parallel / diagonal / point / span / tile.
- geom_node_text(aes(label = ......), repel = TRUE): Adds text labels (non-overlapping). Alternatives are label /...
- labs(title = "....", subtitle = "....", caption = "...."): Change main titles, axis labels and legend titles. We know this from our work with ggplot.
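To see how these pieces stack up, here is a minimal sketch on a made-up four-node network. It assumes tidygraph and ggraph are installed; the node names, colours and title are arbitrary choices for illustration.

```r
library(ggplot2)
library(tidygraph)
library(ggraph)

# Hypothetical toy network
nodes <- data.frame(name = c("A", "B", "C", "D"))
edges <- data.frame(from = c(1L, 1L, 2L, 3L),
                    to   = c(2L, 3L, 3L, 4L))
toy <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE)

p <- ggraph(toy, layout = "kk") +                    # set up the graph and its layout
  geom_edge_link(colour = "grey50") +                # edges as straight links
  geom_node_point(size = 6, colour = "steelblue") +  # nodes as points
  geom_node_text(aes(label = name), repel = TRUE) +  # non-overlapping labels
  labs(title = "A toy network")                      # titles, as in ggplot
p
```

Edges are drawn before nodes so that the points sit on top of the lines; swapping the two geoms changes the stacking order, just as with layers in ggplot.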
# Write Comments next to each line
# About what that line does for the overall graph
ggraph(graph = ga, layout = "kk") +
#
geom_edge_link(width = 2, color = "pink") +
#
geom_node_point(
shape = 21,
size = 8,
fill = "blue",
color = "green",
stroke = 2
) +
#
labs(title = "Whoo Hoo! My first silly Grey's Anatomy graph in R!",
subtitle = "Why did Ramesh put me in this course...",
caption = "Bro, they are doing **cool** things in the other
classes...")
## Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
Questions and Inferences #3:
Questions and Inferences: What parameters have been changed here, compared to the earlier graph? Where do you see these changes in the code above?
Let us Play with this graph and see if we can make some small changes. Colour? Fill? Width? Size? Stroke? Labs? Of course!
# Change the parameters in each of the commands here to new ones
# Use fixed values for colours or sizes...etc.
ggraph(graph = ga, layout = "kk") +
geom_edge_link(width = 2) +
geom_node_point(shape = 21, size = 8,
fill = "blue",
color = "green",
stroke = 2) +
labs(title = "Whoo Hoo! My next silly Grey's Anatomy graph in R!",
subtitle = "Why did Ramesh put me in this course...",
caption = "Bro, they are doing cool things in the other
classes...")
Questions and Inferences #4:
Questions and Inferences: What did the shape parameter achieve? What are the possibilities with shape? How about including alpha?
3c. Aesthetic Mapping from Node and Edge attribute columns
Up to now, we have assigned specific numbers to geometric aesthetics such as shape and size. Now we are ready (maybe?) to change the meaning and significance of the entire graph and each element within it, and use aesthetics / metaphoric mappings to achieve new meanings or insights. Let us try using aes() inside each geom to map a variable to a geometric aspect.
Don’t try to use more than 2 aesthetic mappings simultaneously!!
The node elements we can tweak are:
- the type of node geom: geom_node_****()
- the mappings inside it: geom_node_****(aes(...............))
  - aes(alpha = node-variable): opacity; a value between 0 and 1
  - aes(shape = node-variable): node shape
  - aes(colour = node-variable): node colour
  - aes(fill = node-variable): fill colour for node
  - aes(size = node-variable): size of node

The edge elements we can tweak are:
- the type of edge geom: geom_edge_****()
- the mappings inside it: geom_edge_****(aes(...............))
  - aes(colour = edge-variable): colour of the edge
  - aes(width = edge-variable): width of the edge
  - aes(label = some_variable): labels for the edge

Type ?geom_node_point and ?geom_edge_link in your Console for more information.
ggraph(graph = ga, layout = "fr") +
geom_edge_link0(aes(width = weight)) + # add mapping here
geom_node_point(aes(color = race), size = 6) + # add mapping here
# geom_node_label(aes(label = name), # modify this mapping
# repel = TRUE, max.overlaps = 20,
# alpha = 0.6,
# size = 3) +
labs(title = "Whoo Hoo! Yet another Grey's Anatomy graph in R!")
Questions and Inferences #5:
Questions and Inferences: Describe some of the changes here. What types of edges worked? Which variables were you able to use for nodes and edges and how? What did not work with either of the two?
# Arc diagram
ggraph(ga, layout = "linear") +
geom_edge_arc(aes(width = weight), alpha = 0.8) +
scale_edge_width(range = c(0.2, 2)) +
geom_node_point(size = 2, colour = "red") +
labs(edge_width = "Weight") +
theme_graph()+
theme(legend.position = "top")
## Warning in grid.Call(C_stringMetric, as.graphicsAnnot(x$label)): font family
## not found in Windows font database
Questions and Inferences #6:
Questions and Inferences: How does this graph look “metaphorically” different? Do you see a difference in the relationships between people here? Why?
# Chord diagram, circular
ggraph(ga, layout = "linear", circular = TRUE) +
geom_edge_arc(aes(width = weight), alpha = 0.8) +
scale_edge_width(range = c(0.2, 2)) +
geom_node_point(size = 4,colour = "red") +
geom_node_text(aes(label = name),repel = TRUE, size = 3,
max.overlaps = 20) +
labs(edge_width = "Weight") +
theme_graph()+
theme(legend.position = "right",
aspect.ratio = 1)
Questions and Inferences #7:
Questions and Inferences: How does this graph look “metaphorically” different? Do you see a difference in the relationships between people here? Why?
These provide for some alternative metaphorical views of networks. Note that not all layouts are possible for all datasets!!
# setting theme_graph
set_graph_style()
# This dataset contains the graph that describes the class
# hierarchy for the Flare visualization library.
# Type ?flare in your Console
head(flare$vertices)
## name size shortName
## 1 flare.analytics.cluster.AgglomerativeCluster 3938 AgglomerativeCluster
## 2 flare.analytics.cluster.CommunityStructure 3812 CommunityStructure
## 3 flare.analytics.cluster.HierarchicalCluster 6714 HierarchicalCluster
## 4 flare.analytics.cluster.MergeEdge 743 MergeEdge
## 5 flare.analytics.graph.BetweennessCentrality 3534 BetweennessCentrality
## 6 flare.analytics.graph.LinkDistance 5731 LinkDistance
head(flare$edges)
## from to
## 1 flare.analytics.cluster flare.analytics.cluster.AgglomerativeCluster
## 2 flare.analytics.cluster flare.analytics.cluster.CommunityStructure
## 3 flare.analytics.cluster flare.analytics.cluster.HierarchicalCluster
## 4 flare.analytics.cluster flare.analytics.cluster.MergeEdge
## 5 flare.analytics.graph flare.analytics.graph.BetweennessCentrality
## 6 flare.analytics.graph flare.analytics.graph.LinkDistance
# flare class hierarchy
graph = tbl_graph(edges = flare$edges, nodes = flare$vertices)
# dendrogram
ggraph(graph, layout = "dendrogram") +
geom_edge_diagonal() +
labs(title = "Dendrogram")
# circular dendrogram
ggraph(graph, layout = "dendrogram", circular = TRUE) +
geom_edge_diagonal() +
geom_node_point(aes(filter = leaf)) +
coord_fixed()+
labs(title = "Circular Dendrogram")
# rectangular tree map
ggraph(graph, layout = "treemap", weight = size) +
geom_node_tile(aes(fill = depth), size = 0.25) +
labs(title = "Rectangular Tree Map")
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
# circular tree map
ggraph(graph, layout = "circlepack", weight = size) +
geom_node_circle(aes(fill = depth), size = 0.25, n = 50) +
coord_fixed() +
labs(title = "Circular Tree Map")
# icicle
ggraph(graph, layout = "partition") +
geom_node_tile(aes(y = -y, fill = depth))
# sunburst (circular icicle)
ggraph(graph, layout = "partition", circular = TRUE) +
geom_node_arc_bar(aes(fill = depth)) +
coord_fixed() +
labs(title = "Circular Icicle")
Questions and Inferences #8:
Questions and Inferences: How do graphs look “metaphorically” different? Do they reveal different aspects of the group? How?
Faceting allows us to create sub-plots according to the values of a qualitative attribute on nodes or edges.
# setting theme_graph
set_graph_style()
# facet edges by type
ggraph(ga,layout = "linear", circular = TRUE) +
geom_edge_link(aes(color = type)) +
geom_node_point() +
facet_edges(~ type)
# facet nodes by race
ggraph(ga,layout = "linear", circular = TRUE) +
geom_edge_link() +
geom_node_point() +
facet_nodes(~race)
# facet both nodes and edges
ggraph(ga,layout = "linear", circular = TRUE) +
geom_edge_link(aes(color = type)) +
geom_node_point() +
facet_graph(type ~ race) +
th_foreground(border = TRUE)
Questions and Inferences #9:
Questions and Inferences: Does splitting up the main graph into subnetworks give you more insight? Describe some of these.
The data frame graph representation can be easily augmented with metrics or statistics computed on the graph. Remember how we computed counts with the penguin dataset in Grammar of Graphics.
Before computing a metric on nodes or edges, use the activate() function to activate either the node or the edge data frame. Use dplyr verbs (filter, arrange, mutate) to achieve your computation in the proper way.
Centrality is an “ill-defined” metric of node and edge importance in a network. It is therefore calculated in many ways. Type ?centrality in your Console.
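Before trying this on the Grey's Anatomy graph, the activate-then-mutate pattern can be checked on a network whose answer is obvious. The sketch below builds a hypothetical star-shaped network, in which one node (called "Hub" here) is linked to all the others, and computes degree centrality for each node.

```r
library(dplyr)
library(tidygraph)

# Hypothetical star network: "Hub" is linked to everyone else
nodes <- data.frame(name = c("Hub", "A", "B", "C", "D"))
edges <- data.frame(from = c(1L, 1L, 1L, 1L),
                    to   = c(2L, 3L, 4L, 5L))
star <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE)

star_deg <- star %>%
  activate(nodes) %>%                        # work on the node table
  mutate(degree = centrality_degree()) %>%   # number of edges touching each node
  arrange(desc(degree)) %>%                  # most connected node first
  as_tibble()

star_deg
```

"Hub" touches four edges and every other node touches one, so the computed column should match what the picture of a star suggests. Checking a metric on a toy case like this is a useful habit before trusting it on real data.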
Let’s add a few columns to the nodes and edges based on network centrality measures:
ga %>%
activate(nodes) %>%
# Most connections?
mutate(degree = centrality_degree(mode = c("in"))) %>%
filter(degree > 0) %>%
activate(edges) %>%
# "Busiest" edge?
mutate(betweenness = centrality_edge_betweenness())
## # A tbl_graph: 54 nodes and 57 edges
## #
## # An undirected simple graph with 4 components
## #
## # A tibble: 57 × 5
## from to weight type betweenness
## <int> <int> <dbl> <chr> <dbl>
## 1 5 47 2 friends 20.3
## 2 21 47 4 benefits 44.7
## 3 5 46 1 friends 39
## 4 5 41 1 friends 66.3
## 5 18 41 6 friends 39
## 6 21 41 12 benefits 91.5
## # ℹ 51 more rows
## #
## # A tibble: 54 × 8
## name sex race birthyear position season sign degree
## <chr> <chr> <chr> <dbl> <chr> <dbl> <chr> <dbl>
## 1 Addison Montgomery F White 1967 Attending 1 Libra 3
## 2 Adele Webber F Black 1949 Non-Staff 2 Leo 1
## 3 Teddy Altman F White 1969 Attending 6 Pisces 4
## # ℹ 51 more rows
Packages tidygraph and ggraph can be pipelined to perform analysis and visualization tasks in one go.
# setting theme_graph
set_graph_style()
ga %>%
activate(nodes) %>%
# Who has the most connections?
mutate(degree = centrality_degree()) %>%
activate(edges) %>%
# Who is the go-through person?
mutate(betweenness = centrality_edge_betweenness()) %>%
# Now to continue with plotting
ggraph(layout = "nicely") +
geom_edge_link(aes(alpha = betweenness)) +
geom_node_point(aes(size = degree, colour = degree)) +
# discrete colour legend
scale_color_gradient(guide = "legend")
# or even less typing
ggraph(ga,layout = "nicely") +
geom_edge_link(aes(alpha = centrality_edge_betweenness())) +
geom_node_point(aes(colour = centrality_degree(),
size = centrality_degree())) +
scale_color_gradient(guide = "legend",
low = "green",
high = "red")
Questions and Inferences #10:
Questions and Inferences: How do the Centrality Measures show up in the graph? Would you “agree” with the way we have done it? Try to modify the aesthetics by copy-pasting this chunk below and see how you can make an alternative representation.
Who is close to whom? Which are the groups you can see?
# setting theme_graph
set_graph_style()
# visualize communities of nodes
ga %>%
activate(nodes) %>%
mutate(community = as.factor(group_louvain())) %>%
ggraph(layout = "graphopt") +
geom_edge_link() +
geom_node_point(aes(color = community), size = 5)
Questions and Inferences #11:
Questions and Inferences: Is the Community depiction clear? How would you do it, with which aesthetic? Copy Paste this chunk below and try.
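To get a feel for what `group_louvain()` returns before changing the aesthetics, it can help to look at the community labels as plain data. The sketch below uses a hypothetical network of two triangles joined by a single bridge edge; it is an illustration only, not part of the Grey's Anatomy analysis.

```r
library(dplyr)
library(tidygraph)

# Hypothetical network: triangle A-B-C, triangle D-E-F, and one bridge C-D
nodes <- data.frame(name = c("A", "B", "C", "D", "E", "F"))
edges <- data.frame(from = c(1L, 1L, 2L, 4L, 4L, 5L, 3L),
                    to   = c(2L, 3L, 3L, 5L, 6L, 6L, 4L))
g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE)

comm <- g %>%
  activate(nodes) %>%
  mutate(community = as.factor(group_louvain())) %>%  # one label per node
  as_tibble()

comm
```

Each node receives a community label, and `as.factor()` turns that label into a category so that it can be mapped to a discrete aesthetic such as colour or shape. Do the labels follow the two triangles, as you would expect?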
visNetwork
Exploring the visNetwork package. Make graphs wiggle and shake using tidy commands! The package implements interactivity using the physical metaphor of weights and springs we discussed earlier.
The visNetwork() function uses a nodes list and an edges list to create an interactive graph. The nodes list must include an “id” column, and the edge list must have “from” and “to” columns. The function also plots the labels for the nodes, using the names from the “label” column in the node list.
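The column requirements are easiest to see on a tiny example. The sketch below reuses the three characters from the node table earlier in this lab; the `group` values are made up for illustration.

```r
library(visNetwork)

# Node list: "id" is required; "label" is drawn next to each node;
# "group" (optional) lets visGroups() style nodes by category
nodes <- data.frame(
  id    = 1:3,
  label = c("Ned", "Jaguar Paw", "John Snow"),
  group = c("fiction", "fiction", "history")   # hypothetical grouping
)

# Edge list: "from" and "to" must hold values found in nodes$id
edges <- data.frame(from = c(1, 2, 2),
                    to   = c(3, 1, 3))

net <- visNetwork(nodes = nodes, edges = edges)
net <- visInteraction(net, hover = TRUE)   # highlight nodes on mouse-over
net
```

Printing `net` in RStudio or in a knitted HTML document shows the interactive widget. This is why the Grey's Anatomy tables need to be reshaped before plotting: they have a `name` column but no `id` column, and their edges refer to people by name rather than by ID.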
library(visNetwork)
# Prepare the data for plotting by visNetwork
grey_nodes
## # A tibble: 54 × 7
## name sex race birthyear position season sign
## <chr> <chr> <chr> <dbl> <chr> <dbl> <chr>
## 1 Addison Montgomery F White 1967 Attending 1 Libra
## 2 Adele Webber F Black 1949 Non-Staff 2 Leo
## 3 Teddy Altman F White 1969 Attending 6 Pisces
## 4 Amelia Shepherd F White 1981 Attending 7 Libra
## 5 Arizona Robbins F White 1976 Attending 5 Leo
## 6 Rebecca Pope F White 1975 Non-Staff 3 Gemini
## 7 Jackson Avery M Black 1981 Resident 6 Leo
## 8 Miranda Bailey F Black 1969 Attending 1 Virgo
## 9 Ben Warren M Black 1972 Other 6 Aquarius
## 10 Henry Burton M White 1972 Non-Staff 7 Cancer
## # ℹ 44 more rows
grey_edges
## # A tibble: 57 × 4
## from to weight type
## <chr> <chr> <dbl> <chr>
## 1 Leah Murphy Arizona Robbins 2 friends
## 2 Leah Murphy Alex Karev 4 benefits
## 3 Lauren Boswell Arizona Robbins 1 friends
## 4 Arizona Robbins Callie Torres 1 friends
## 5 Callie Torres Erica Hahn 6 friends
## 6 Callie Torres Alex Karev 12 benefits
## 7 Callie Torres Mark Sloan 5 professional
## 8 Callie Torres George O'Malley 2 professional
## 9 George O'Malley Izzie Stevens 3 professional
## 10 George O'Malley Meredith Grey 4 friends
## # ℹ 47 more rows
# Relabel greys anatomy nodes and edges for VisNetwork
grey_nodes_vis <- grey_nodes %>%
rowid_to_column(var = "id") %>%
rename("label" = name) %>%
mutate(sex = case_when(sex == "F" ~ "Female",
sex == "M" ~ "Male")) %>%
replace_na(., list(sex = "Transgender?")) %>%
rename("group" = sex)
grey_nodes_vis
## # A tibble: 54 × 8
## id label group race birthyear position season sign
## <int> <chr> <chr> <chr> <dbl> <chr> <dbl> <chr>
## 1 1 Addison Montgomery Female White 1967 Attending 1 Libra
## 2 2 Adele Webber Female Black 1949 Non-Staff 2 Leo
## 3 3 Teddy Altman Female White 1969 Attending 6 Pisces
## 4 4 Amelia Shepherd Female White 1981 Attending 7 Libra
## 5 5 Arizona Robbins Female White 1976 Attending 5 Leo
## 6 6 Rebecca Pope Female White 1975 Non-Staff 3 Gemini
## 7 7 Jackson Avery Male Black 1981 Resident 6 Leo
## 8 8 Miranda Bailey Female Black 1969 Attending 1 Virgo
## 9 9 Ben Warren Male Black 1972 Other 6 Aquarius
## 10 10 Henry Burton Male White 1972 Non-Staff 7 Cancer
## # ℹ 44 more rows
grey_edges_vis <- grey_edges %>%
select(from, to) %>%
left_join(., grey_nodes_vis,
by = c("from" = "label")) %>%
left_join(., grey_nodes_vis,
by = c("to" = "label")) %>%
select("from"= id.x, "to" = id.y)
grey_edges_vis
## # A tibble: 57 × 2
## from to
## <int> <int>
## 1 47 5
## 2 47 21
## 3 46 5
## 4 5 41
## 5 41 18
## 6 41 21
## 7 41 37
## 8 41 31
## 9 31 20
## 10 31 17
## # ℹ 47 more rows
Using fontawesome icons
grey_nodes_vis %>%
visNetwork(nodes = ., edges = grey_edges_vis) %>%
visNodes(font = list(size = 40)) %>%
# Colour and icons for each of the gender-groups
visGroups(groupname = "Female", shape = "icon",
icon = list(code = "f182", size = 75, color = "tomato"),
shadow = list(enabled = TRUE)) %>%
visGroups(groupname = "Male", shape = "icon",
icon = list(code = "f183", size = 75, color = "slateblue"),
shadow = list(enabled = TRUE)) %>%
visGroups(groupname = "Transgender?", shape = "icon",
icon = list(code = "f22c", size = 75, color = "fuchsia"),
shadow = list(enabled = TRUE)) %>%
#visLegend() %>%
#Add the fontawesome icons!!
addFontAwesome() %>%
# Add Interaction Controls
visInteraction(navigationButtons = TRUE,
hover = TRUE,
selectConnectedEdges = TRUE,
hoverConnectedEdges = TRUE,
zoomView = TRUE)
There is another family of icons available in visNetwork, called `ionicons`. Let’s see how they look:
grey_nodes_vis %>%
visNetwork(nodes = ., edges = grey_edges_vis) %>%
visLayout(randomSeed = 12345) %>%
visNodes(font = list(size = 50)) %>%
visEdges(color = "green") %>%
visGroups(
groupname = "Female",
shape = "icon",
icon = list(
face = 'Ionicons',
code = "f25d",
color = "fuchsia",
size = 125
)
) %>%
visGroups(
groupname = "Male",
shape = "icon",
icon = list(
face = 'Ionicons',
code = "f202",
color = "green",
size = 125
)
) %>%
visGroups(
groupname = "Transgender?",
shape = "icon",
icon = list(
face = 'Ionicons',
code = "f233",
color = "dodgerblue",
size = 125
)
) %>%
visLegend() %>%
addIonicons() %>%
visInteraction(
navigationButtons = TRUE,
hover = TRUE,
selectConnectedEdges = TRUE,
hoverConnectedEdges = TRUE,
zoomView = TRUE
)
Some idea of interactivity and controls with `visNetwork`:
library(visNetwork)
# let's look again at the data
starwars_nodes <- read_csv("./Data/star-wars-network-nodes.csv")
## Rows: 22 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): name
## dbl (1): id
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
starwars_edges <- read_csv("./Data/star-wars-network-edges.csv")
## Rows: 60 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): source, target
## dbl (1): weight
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# We need to rename starwars nodes dataframe and edge dataframe columns for visNetwork
starwars_nodes_vis <-
starwars_nodes %>%
rename("label" = name)
# Convert from and to columns to **node ids**
starwars_edges_vis <-
starwars_edges %>%
# Matching Source <- Source Node id ("id.x")
left_join(., starwars_nodes_vis, by = c("source" = "label")) %>%
# Matching Target <- Target Node id ("id.y")
left_join(., starwars_nodes_vis, by = c("target" = "label")) %>%
# Select "id.x" and "id.y" ONLY
# Rename them as "from" and "to"
# keep "weight" column for aesthetics of edges
select("from" = id.x, "to" = id.y, "value" = weight)
# Check everything once
starwars_nodes_vis
## # A tibble: 22 × 2
## label id
## <chr> <dbl>
## 1 R2-D2 0
## 2 CHEWBACCA 1
## 3 C-3PO 2
## 4 LUKE 3
## 5 DARTH VADER 4
## 6 CAMIE 5
## 7 BIGGS 6
## 8 LEIA 7
## 9 BERU 8
## 10 OWEN 9
## # ℹ 12 more rows
starwars_edges_vis
## # A tibble: 60 × 3
## from to value
## <dbl> <dbl> <dbl>
## 1 2 0 17
## 2 3 0 13
## 3 10 0 6
## 4 7 0 5
## 5 13 0 5
## 6 1 0 3
## 7 16 0 1
## 8 1 10 7
## 9 2 1 5
## 10 1 3 16
## # ℹ 50 more rows
Ok, let’s make things move and shake!!
visNetwork(nodes = starwars_nodes_vis,
edges = starwars_edges_vis) %>%
visNodes(font = list(size = 30), shape = "icon",
icon = list(code = "f1e3", size = 75)) %>%
addFontAwesome() %>%
visEdges(color = "red")
visNetwork(nodes = starwars_nodes_vis,
edges = starwars_edges_vis) %>%
visNodes(font = list(size = 30)) %>%
visEdges(color = "red")
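For more interaction, `visNetwork` also offers `visOptions()` and `visPhysics()`. The sketch below is only an illustration on made-up data (`toy_nodes` and `toy_edges` are hypothetical): it adds a drop-down to select a node by id, a second drop-down to select by the `group` column, and highlights the neighbours of a clicked node.

```r
library(dplyr)       # for the pipe
library(visNetwork)

# Hypothetical toy data, for illustration only
toy_nodes <- data.frame(id = 1:4,
                        label = c("A", "B", "C", "D"),
                        group = c("x", "x", "y", "y"))
toy_edges <- data.frame(from = c(1, 1, 2, 3),
                        to = c(2, 3, 4, 4),
                        value = c(1, 3, 2, 5))   # "value" sets edge width

toy_net <- visNetwork(nodes = toy_nodes, edges = toy_edges) %>%
  # Highlight neighbours on click; add selection menus by id and by group
  visOptions(highlightNearest = TRUE,
             nodesIdSelection = TRUE,
             selectedBy = "group") %>%
  # Skip the initial stabilization so the nodes visibly settle into place
  visPhysics(stabilization = FALSE) %>%
  # Fix the random seed so the layout is repeatable
  visLayout(randomSeed = 12345)

toy_net
```

The same three calls can be piped onto the Star Wars network above once it draws correctly.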
Step 1. Fire up a new RMarkdown. Write your name, file_name and date.
Step 2. Take any one of the “Make1-Datasets” datasets described below.
Step 3. RMarkdown contents:
Step 4. Knit before you submit. Submit only your knittable `.Rmd` file.
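Airline data (after set up):
airline_nodes <- read_csv("./Data/AIRLINES-NODES.csv") %>% mutate(Id = Id + 1)
airline_edges <- read_csv("./Data/AIRLINES-EDGES.csv") %>%
mutate(Source = Source + 1, Target = Target + 1)
Karate club data:
data("karate", package = "igraphdata")
karate
Type `?karate` in the console to read about this dataset. `karate` is already a graph object, so there is no need to build one with `tbl_graph`; it can be passed to `ggraph`.
Game of Thrones data:
GoT <- read_rds("./data/GoT.RDS")
Pick one element with `GoT[[index]]`, where index = 1…7, and then plot directly.
Other datasets from `igraphdata` (type `?igraphdata` in the console).
This is in groups. Groups of 4. To be announced.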
You need to create a Network Graph for your favourite book, play, TV serial or show. (E.g. Friends, BBT, or LB or HIMYM…or Hamlet, Little Women, Pride and Prejudice, or LoTR)
Step 1. Go to: Literary Networks for instructions. (Instructions are also on Teams -> Files.)
Step 2. Make your data using the instructions.
In the nodes excel, use `id` and `names` as your columns. Put any other details in further columns to the right.
In your edges excel, use `from` and `to` as your first columns. Entries in these columns can be `names` or `id`s, but be consistent and don’t mix.
Step 3. Decide on 3 answers that you seek and plan to make graphs for.
Step 4. Create graph objects. Say 3 visualizations.
Step 5. Write comments/answers in the code and narrative text. Add pictures from the web using Markdown syntax.
Step 6. Write Reflection ( ok, a short one!) inside your RMarkdown. Make sure it knits!!
Step 7. Group Submission: Submit the knittable .Rmd file AND the data. RMarkdown with joint authorship. Each person submits on their Assignments. All get the same grade on this one.
Ask me for clarifications on what to do after you have read the Instructions in your group.
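The node and edge layout described in Step 2 can be sketched as follows. The characters and ties here are made up for illustration, and the data is typed in as tibbles rather than read from your excel files. The sketch assumes the edges use `names` (not `id`s), so `tbl_graph()` is told through `node_key` which node column to match them against.

```r
library(tibble)
library(tidygraph)

# Hypothetical nodes sheet: id and names first, other details to the right
book_nodes <- tibble(id = 1:3,
                     names = c("Hamlet", "Ophelia", "Horatio"),
                     role = c("Prince", "Noble", "Friend"))

# Hypothetical edges sheet: from and to first, both using names consistently
book_edges <- tibble(from = c("Hamlet", "Hamlet"),
                     to = c("Ophelia", "Horatio"))

# node_key names the node column that from/to should be matched against
book_graph <- tbl_graph(nodes = book_nodes,
                        edges = book_edges,
                        directed = FALSE,
                        node_key = "names")
book_graph
```

If the edges used integers instead, `tbl_graph()` would treat them as row positions in the nodes table, which is one reason not to mix the two styles.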
Thomas Lin Pedersen - 1 giraffe, 2 giraffe, GO!
Igraph: Network Analysis and Visualization. https://CRAN.R-project.org/package=igraph.
Pedersen, Thomas Lin. 2017a. Ggraph: An Implementation of Grammar of Graphics for Graphs and Networks. https://CRAN.R-project.org/package=ggraph.
———. 2017b. Tidygraph: A Tidy Api for Graph Manipulation. https://CRAN.R-project.org/package=tidygraph.
Tyner, Sam, François Briatte, and Heike Hofmann. 2017. “Network Visualization with ggplot2.” The R Journal 9 (1): 27–59. https://journal.r-project.org/archive/2017/RJ-2017-023/index.html.
Network Datasets https://icon.colorado.edu/#!/networks
Yunran Chen, Introduction to Network Analysis Using R