
Slides 04 - Metaphors with Graphics

From Code to Geometry

Arvind Venkatadri

Srishti Manipal Institute

(2022-08-09)

1

How does one read Shakespeare?

shakespeare

To code or not to code, that is the question...

2

What is a Grammar of Graphics?

Code looks and reads like English.

Has verbs, nouns, some adjectives...

  • Describes information, ideas, and concepts from any source domain.

  • GEOMETRY as the target domain: what comes out of R is predominantly "geometry".

5

How do we express visuals in words?

  • Data to be visualized
  • Geometric objects that appear on the plot
  • Aesthetic mappings from data to visual components
  • Statistics transform data on the way to visualization
  • Coordinates organize the location of geometric objects
  • Scales define the range of values for aesthetics
  • Facets split the data into subplots
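
One way to see all seven components side by side is a single plot that uses each of them once. This is only a sketch: it uses the diamonds data that ships with ggplot2, and the specific choices (log scale, x limits, facet variable) are assumptions made for illustration.

```r
library(ggplot2)

p <- ggplot(data = diamonds,                  # Data
            mapping = aes(x = carat,          # Aesthetic mappings
                          y = price)) +
  geom_point(alpha = 0.2) +                   # Geometric object
  stat_smooth(method = "lm") +                # Statistic (fitted line)
  scale_y_log10() +                           # Scale
  coord_cartesian(xlim = c(0, 3)) +           # Coordinates
  facet_wrap(~ cut)                           # Facets

p
```

Each `+` adds one grammatical element, so the plot can be read aloud almost like a sentence.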
12

The Essence of ggplot

all ggplot2

  • aes(x = , y = ) (aesthetics)
  • aes(x = , y = , color = ) (add color)
  • aes(x = , y = , size = ) (add size)
  • + facet_wrap(~ ) (facetting)
  • + scale_*() (add a scale)
13

gg is for Grammar of Graphics

Data

Aesthetics

Geoms

+ geom_*()

14

The Five Named Graphs

  • Scatterplot: geom_point()
  • Line graph: geom_line()
  • Histogram: geom_histogram()
  • Boxplot: geom_boxplot()
  • Bar graph: geom_bar() or geom_col() (see Lab 02)
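
The difference between geom_bar() and geom_col() can be sketched with a small made-up data frame (the names obs and counts are hypothetical): geom_bar() counts rows for us, while geom_col() expects the bar heights to be in the data already.

```r
library(ggplot2)

# Toy data (hypothetical): one row per observation
obs <- data.frame(species = c("A", "A", "A", "B", "B", "C"))

# geom_bar() counts the rows in each category
p_bar <- ggplot(obs, aes(x = species)) +
  geom_bar()

# geom_col() takes heights that have already been computed
counts <- data.frame(species = c("A", "B", "C"),
                     n = c(3, 2, 1))
p_col <- ggplot(counts, aes(x = species, y = n)) +
  geom_col()

# Both plots end up with the same bar heights
layer_data(p_bar)$y
layer_data(p_col)$y
```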
15

Chunk : penguins

library(tidyverse)       # ggplot2, dplyr, tidyr
library(palmerpenguins)  # provides the penguins dataset
head(penguins)
## # A tibble: 6 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
## <fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
## 2 Adelie Torgersen 39.5 17.4 186 3800 female 2007
## 3 Adelie Torgersen 40.3 18 195 3250 female 2007
## 4 Adelie Torgersen NA NA NA NA <NA> 2007
## 5 Adelie Torgersen 36.7 19.3 193 3450 female 2007
## 6 Adelie Torgersen 39.3 20.6 190 3650 male 2007

We see the first few rows of the penguins dataset. A few observations contain NA (missing) values. Let us remove them for now.

penguins <- penguins %>% drop_na()
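
What drop_na() does is easiest to see on a tiny made-up data frame (toy is a hypothetical name, not part of the slides). With no further arguments it drops every row that has an NA in any column; naming a column restricts the check to that column.

```r
library(tidyr)

# Toy data (hypothetical) with missing values in different columns
toy <- data.frame(a = c(1, NA, 3),
                  b = c("x", "y", NA))

all_complete <- drop_na(toy)     # rows with no NA in any column
a_complete   <- drop_na(toy, a)  # rows with no NA in column a only

nrow(all_complete)
nrow(a_complete)
```

Note that the slide overwrites penguins with the reduced data, so the dropped rows are gone for the rest of the session.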
16

Chunk: Mapping

ggplot(penguins)

17

Chunk: Mapping

ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = body_mass_g))

18

Chunk: Mapping

ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = body_mass_g)) +
  geom_point()

19

Chunk: Mapping

ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm")

20

Chunk: Geom_Point_Position_Colour

ggplot(data = penguins)

21

Chunk: Geom_Point_Position_Colour

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = body_mass_g,
           color = island))

We can leave out the argument name "mapping" and pass aes() directly.

Why is there no plot?

🤔 💭

Right! We have not added a geom yet!
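
A quick way to confirm this, sketched here with the diamonds data from ggplot2 rather than penguins: a plot object keeps its geoms in a list of layers, and that list stays empty until a geom is added. Reading p$layers directly is an assumption about the plot object's internals and may differ across ggplot2 versions.

```r
library(ggplot2)

# Data and aesthetics only: axes are drawn, but there is nothing to plot
p0 <- ggplot(diamonds, aes(x = carat, y = price))
n_before <- length(p0$layers)

# Adding a geom adds one layer
p1 <- p0 + geom_point()
n_after <- length(p1$layers)

c(before = n_before, after = n_after)
```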

22

Chunk: Geom_Point_Position_Colour

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = body_mass_g,
           color = island)) +
  geom_point() +
  ggtitle("A point geom with position, color aesthetics")

Note that the points are located by position coordinates on both the x and y axes, and coloured by the island variable.

23

Chunk: Geom_Point_Position_Colour

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = body_mass_g,
           color = island)) +
  geom_point(size = 4) +
  ggtitle("A point geom with position and color aesthetics, and a fixed size")

Note that the points are located by position coordinates on both the x and y axes, and coloured by the island variable.

And we've fixed size = 4!
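
The distinction between fixing a value and mapping a variable can be sketched on a small made-up data frame (df and its column w are hypothetical). A value given outside aes() is applied as-is to every point; a variable given inside aes() is passed through a scale and gets a legend.

```r
library(ggplot2)

# Toy data (hypothetical)
df <- data.frame(x = 1:3,
                 y = c(2, 4, 8),
                 w = c(1, 5, 10))

# Fixed: every point gets size 4, no legend
p_set <- ggplot(df, aes(x = x, y = y)) +
  geom_point(size = 4)

# Mapped: size varies with w, and a legend appears
p_map <- ggplot(df, aes(x = x, y = y, size = w)) +
  geom_point()

# Inspect the sizes that are actually drawn
layer_data(p_set)$size
layer_data(p_map)$size
```

The same rule applies to colour, alpha, and shape.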

24

Alpha

diamonds %>%
  # Sample 20% of the data
  slice_sample(prop = 0.2) %>%
  ggplot(.) +
  geom_point(aes(x = carat,
                 y = price))

Are the points all overlapping? Can we see them better?

25

Alpha

diamonds %>%
  # Sample 20% of the data
  slice_sample(prop = 0.2) %>%
  ggplot(.) +
  geom_point(aes(x = carat, y = price),
             # alpha outside the aes() !!!
             alpha = 0.2) +
  labs(title = "Points plotted with Alpha")

Are the points all overlapping? Can we see them better?

26

Chunk: Box Plot

ggplot(diamonds) +
  geom_boxplot(aes(x = cut, y = price)) +
  labs(title = "Box Plot")

27

Chunk: Box Plot

ggplot(diamonds) +
  geom_boxplot(aes(x = cut,
                   y = price,
                   fill = cut)) +
  labs(title = "Box Plot")

28

Chunk: Geom_Bar_1

ggplot(data = penguins)

29

Chunk: Geom_Bar_1

ggplot(data = penguins) +
  aes(x = species)

30

Chunk: Geom_Bar_1

ggplot(data = penguins) +
  aes(x = species) +
  geom_bar() +
  ggtitle("A bar geom with position and height aesthetics")

The bars are plotted with positions on the x-axis, defined by the species variable, and heights mapped to the y-axis.

How did the graph "know" the heights of the bars?

geom_bar() computes a count statistic internally. Many geoms have internal computations of this kind, and the computed values are accessible to programmers.
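
As a sketch of how the computed values can be reached: after_stat() lets an aesthetic refer to a variable that the statistic has just computed. The example uses a made-up data frame (obs is hypothetical) and turns the counts into proportions.

```r
library(ggplot2)

# Toy data (hypothetical): one row per observation
obs <- data.frame(species = c("A", "A", "A", "B", "B", "C"))

# Use the computed count to plot proportions instead of raw counts
p_prop <- ggplot(obs, aes(x = species,
                          y = after_stat(count / sum(count)))) +
  geom_bar() +
  ylab("proportion")

# The computed values can also be inspected directly
layer_data(p_prop)[, c("x", "count", "y")]
```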

31

Geom_Bar_Position_Stack_and_Dodge

When a bar chart uses a second variable (here island, mapped to fill), we have a few more position options:

ggplot(penguins,
       aes(x = species,
           fill = island))

32

Geom_Bar_Position_Stack_and_Dodge

When a bar chart uses a second variable (here island, mapped to fill), we have a few more position options:

ggplot(penguins,
       aes(x = species,
           fill = island)) +
  geom_bar() +
  ggtitle(label = "A stacked bar chart")

The bars are filled by the island variable and are stacked in position.

33

Geom_Bar_Position_Stack_and_Dodge

And here we use the dodge option:

ggplot(penguins,
       aes(x = species,
           fill = island)) +
  geom_bar(position = "dodge") +
  ggtitle(label = "A dodged bar chart")
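
Besides stacking and dodging, ggplot2 also offers position = "fill", which is not shown in these slides. It stacks the bars and rescales each one to a total height of 1, so the chart shows proportions rather than counts. A minimal sketch on made-up data (toy is hypothetical):

```r
library(ggplot2)

# Toy data (hypothetical): species observed on two islands
toy <- data.frame(
  species = c("A", "A", "A", "B", "B", "C"),
  island  = c("X", "X", "Y", "X", "Y", "Y")
)

p_fill <- ggplot(toy, aes(x = species, fill = island)) +
  geom_bar(position = "fill") +
  ylab("proportion")

# Each stack runs from 0 up to 1
range(c(layer_data(p_fill)$ymin, layer_data(p_fill)$ymax))
```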

34

Facetting

ggplot(penguins)

35

Facetting

ggplot(penguins) +
  aes(x = flipper_length_mm,
      y = body_mass_g)

36

Facetting

ggplot(penguins) +
  aes(x = flipper_length_mm,
      y = body_mass_g) +
  geom_point()

37

Facetting

ggplot(penguins) +
  aes(x = flipper_length_mm,
      y = body_mass_g) +
  geom_point() +
  facet_wrap(~island) +
  ggtitle("A point geom graph with facets")

The graph has split into small multiples, one panel per island.

38

Still more Facetting

ggplot(penguins) +
  aes(x = flipper_length_mm,
      y = body_mass_g) +
  geom_point()

What if we have even more "factor" variables? We have island and species...can we split further?

39

Still more Facetting

ggplot(penguins) +
  aes(x = flipper_length_mm,
      y = body_mass_g) +
  geom_point() +
  facet_grid(species~island) +
  ggtitle("A point geom graph with grid facets")

The graph has split into small multiples, with one row per species and one column per island.
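
facet_grid() draws a panel for every row and column combination, even when a combination has no data. facet_wrap() with two variables draws only the combinations that occur. The sketch below uses made-up data (toy is hypothetical) and counts panels through ggplot_build(); reading $layout$layout is an assumption about the built plot's internals and may differ across ggplot2 versions.

```r
library(ggplot2)

# Toy data (hypothetical): species B never occurs on island Y
toy <- data.frame(
  x = 1:3,
  y = c(2, 4, 8),
  species = c("A", "A", "B"),
  island  = c("X", "Y", "X")
)

base <- ggplot(toy, aes(x = x, y = y)) +
  geom_point()

p_grid <- base + facet_grid(species ~ island)    # full 2 x 2 grid
p_wrap <- base + facet_wrap(~ species + island)  # observed combinations only

n_grid <- nrow(ggplot_build(p_grid)$layout$layout)
n_wrap <- nrow(ggplot_build(p_wrap)$layout$layout)
c(grid = n_grid, wrap = n_wrap)
```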

40

And shall we look briefly at colour?

41

Finally...Colour !! ( Just a bit )

diamonds %>%
  slice_sample(prop = 0.2) %>%
  ggplot(.) +
  geom_point(aes(x = carat, y = price))

42

Finally...Colour !! ( Just a bit )

diamonds %>%
  slice_sample(prop = 0.2) %>%
  ggplot(.) +
  geom_point(aes(x = carat, y = price, colour = cut), size = 3) +
  scale_colour_brewer(palette = "Set3") +
  labs(title = "Brewer Colour Palette (Set3)")

We are using the RColorBrewer package here. Type RColorBrewer::display.brewer.all() in your Console and see what palettes are available.
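
The palettes can also be inspected as data rather than as a picture. A minimal sketch, assuming the RColorBrewer package is installed: brewer.pal.info lists each palette's size and type, and brewer.pal() returns the colours as hex codes.

```r
library(RColorBrewer)

# How many colours Set3 offers, and what kind of palette it is
set3_info <- brewer.pal.info["Set3", ]
set3_info

# Five colours from Set3, one for each level of cut in diamonds
set3_five <- brewer.pal(n = 5, name = "Set3")
set3_five
```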

43

Chunk: Colour !! ( Just a bit )

diamonds %>%
  slice_sample(prop = 0.2) %>%
  ggplot(.) +
  geom_point(aes(x = carat, y = price, colour = cut), size = 3) +
  scale_colour_viridis_d() +
  labs(title = "Viridis Palette",
       subtitle = "The default in ggplot2 for ordered factors such as cut")

44

Chunk: Colour !! ( Just a bit )

diamonds %>%
  slice_sample(prop = 0.2) %>%
  ggplot(.) +
  geom_point(aes(x = carat, y = price, colour = cut), size = 3) +
  scale_colour_viridis_d(option = "magma") +
  labs(title = "Viridis Palette, Option Magma")

45

Chunk: Colour !! ( Just a bit )

diamonds %>%
  slice_sample(prop = 0.2) %>%
  ggplot(.) +
  geom_point(aes(x = carat, y = price, colour = cut), size = 3) +
  scale_colour_viridis_d(option = "inferno") +
  labs(title = "Viridis Palette, Option Inferno")

46

Conclusion

  • ggplot() takes a data frame/tibble as the data argument
  • The aes() arguments can be x, y, colour, shape, alpha, for example
  • The geom_*() commands specify the kind of plot
  • Together, the ggplot2 package offers a Grammar of near-English commands that allow us to plot data in various ways.
47
