with ggplot2
University of Luxembourg
Friday, the 2nd of May, 2025
Learning objectives
ggplot2
data.frame/tibble
ggplot2
A | B | C | D |
---|---|---|---|
2 | 3 | 4 | a |
1 | 2 | 1 | a |
4 | 5 | 15 | b |
9 | 10 | 80 | b |
x = A
y = C
shape = D
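As a minimal sketch, the mapping listed above could be written as follows. The data frame name df and the point size are assumptions for illustration; the columns and values come from the table above.

```r
library(ggplot2)

# Toy table from above; the name `df` is hypothetical
df <- data.frame(
  A = c(2, 1, 4, 9),
  B = c(3, 2, 5, 10),
  C = c(4, 1, 15, 80),
  D = c("a", "a", "b", "b")
)

# Map columns to aesthetics: x = A, y = C, shape = D
p_mapping <- ggplot(df, aes(x = A, y = C, shape = D)) +
  geom_point(size = 3)
p_mapping
```

Column B is present in the data but unused: only mapped columns reach the plot.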
\(x = \frac{A - \min(A)}{\operatorname{range}(A)} \times \text{width}\)
\(y = \frac{C - \min(C)}{\operatorname{range}(C)} \times \text{height}\)
Wickham: A Layered Grammar of Graphics (2007)
x | y | shape |
---|---|---|
25 | 11 | circle |
0 | 0 | circle |
75 | 53 | square |
200 | 300 | square |
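The scaling step can be checked by hand with a short sketch. The width of 200 and height of 300 are assumptions inferred from the largest values in the scaled table, and the helper rescale_to() is hypothetical, not a ggplot2 function.

```r
# Toy table from above (only the mapped columns)
df <- data.frame(
  A = c(2, 1, 4, 9),
  C = c(4, 1, 15, 80),
  D = c("a", "a", "b", "b")
)

# Hypothetical helper: (v - min(v)) / range(v) * size
rescale_to <- function(v, size) {
  (v - min(v)) / diff(range(v)) * size
}

scaled <- data.frame(
  x = rescale_to(df$A, 200),          # assumed width of 200
  y = round(rescale_to(df$C, 300)),   # assumed height of 300, rounded
  shape = ifelse(df$D == "a", "circle", "square")
)
scaled
```

With these assumed dimensions, the result matches the scaled table row by row.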
What if we want to split into panels circles and squares?
The shape aesthetic is free for another variable.
Wickham: A Layered Grammar of Graphics (2007)
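A possible sketch of the panel split on the toy table, using facet_wrap(). The data frame name df is an assumption; since D now drives the panels, it is no longer mapped to shape.

```r
library(ggplot2)

# Toy table from above; the name `df` is hypothetical
df <- data.frame(
  A = c(2, 1, 4, 9),
  C = c(4, 1, 15, 80),
  D = c("a", "a", "b", "b")
)

# One panel per value of D instead of one shape per value of D
p_facets <- ggplot(df, aes(x = A, y = C)) +
  geom_point() +
  facet_wrap(vars(D))
p_facets
```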
geofacet, US states by Ryan Hafen
Warning
ggplot2 layers are combined with +, not %>% nor |>.
This introduces a break in the workflow (ggplot1 would have been fine).
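A short sketch of where the switch happens: data preparation is piped with |>, and from ggplot() onwards the layers are added with +. The filtering step with base subset() is only an illustrative assumption.

```r
library(ggplot2)
library(palmerpenguins)

p_pipe_plus <- penguins |>
  subset(!is.na(sex)) |>                        # data steps: pipe
  ggplot(aes(x = flipper_length_mm,
             y = body_mass_g)) +                # plot layers: plus
  geom_point()
p_pipe_plus
```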
ggplot
lines combined with +
Fertility Agriculture Examination Education Catholic
Courtelary 80.2 17.0 15 12 9.96
Delemont 83.1 45.1 6 9 84.84
Franches-Mnt 92.5 39.7 5 5 93.40
Moutier 85.8 36.5 12 7 33.77
Neuveville 76.9 43.5 17 15 5.16
Porrentruy 76.1 35.3 9 7 90.57
Broye 83.8 70.2 16 7 92.85
Glane 92.4 67.8 14 8 97.16
Gruyere 82.4 53.3 12 7 97.67
Sarine 82.9 45.2 16 13 91.38
Veveyse 87.1 64.5 14 6 98.61
Aigle 64.1 62.0 21 12 8.52
Aubonne 66.9 67.5 14 7 2.27
Avenches 68.9 60.7 19 12 4.43
Cossonay 61.7 69.3 22 5 2.82
Echallens 68.3 72.6 18 2 24.20
Grandson 71.7 34.0 17 8 3.30
Lausanne 55.7 19.4 26 28 12.11
La Vallee 54.3 15.2 31 20 2.15
Lavaux 65.1 73.0 19 9 2.84
Morges 65.5 59.8 22 10 5.23
Moudon 65.0 55.1 14 3 4.52
Nyone 56.6 50.9 22 12 15.14
Orbe 57.4 54.1 20 6 4.20
Oron 72.5 71.2 12 1 2.40
Payerne 74.2 58.1 14 8 5.23
Paysd'enhaut 72.0 63.5 6 3 2.56
Rolle 60.5 60.8 16 10 7.72
Vevey 58.3 26.8 25 19 18.46
Yverdon 65.4 49.5 15 8 6.10
Conthey 75.5 85.9 3 2 99.71
Entremont 69.3 84.9 7 6 99.68
Herens 77.3 89.7 5 2 100.00
Martigwy 70.5 78.2 12 6 98.96
Monthey 79.4 64.9 7 3 98.22
St Maurice 65.0 75.9 9 9 99.06
Sierre 92.2 84.6 3 3 99.46
Sion 79.3 63.1 13 13 96.83
Boudry 70.4 38.4 26 12 5.62
La Chauxdfnd 65.7 7.7 29 11 13.79
Le Locle 72.7 16.7 22 13 11.22
Neuchatel 64.4 17.6 35 32 16.92
Val de Ruz 77.6 37.6 15 7 4.97
ValdeTravers 67.6 18.7 25 7 8.65
V. De Geneve 35.0 1.2 37 53 42.34
Rive Droite 44.7 46.6 16 29 50.43
Rive Gauche 42.8 27.7 22 29 58.33
Infant.Mortality
Courtelary 22.2
Delemont 22.2
Franches-Mnt 20.2
Moutier 20.3
Neuveville 20.6
Porrentruy 26.6
Broye 23.6
Glane 24.9
Gruyere 21.0
Sarine 24.4
Veveyse 24.5
Aigle 16.5
Aubonne 19.1
Avenches 22.7
Cossonay 18.7
Echallens 21.2
Grandson 20.0
Lausanne 20.2
La Vallee 10.8
Lavaux 20.0
Morges 18.0
Moudon 22.4
Nyone 16.7
Orbe 15.3
Oron 21.0
Payerne 23.8
Paysd'enhaut 18.0
Rolle 16.3
Vevey 20.9
Yverdon 22.5
Conthey 15.1
Entremont 19.8
Herens 18.3
Martigwy 19.4
Monthey 20.2
St Maurice 17.8
Sierre 16.3
Sion 18.1
Boudry 20.3
La Chauxdfnd 20.5
Le Locle 18.9
Neuchatel 23.0
Val de Ruz 20.0
ValdeTravers 19.5
V. De Geneve 18.0
Rive Droite 18.2
Rive Gauche 19.3
Horst AM, Hill AP, Gorman KB (2020). palmerpenguins: Palmer Archipelago (Antarctica) penguin data. R package version 0.1.0.
# A tibble: 344 × 8
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
<fct> <fct> <dbl> <dbl> <int> <int>
1 Adelie Torgersen 39.1 18.7 181 3750
2 Adelie Torgersen 39.5 17.4 186 3800
3 Adelie Torgersen 40.3 18 195 3250
4 Adelie Torgersen NA NA NA NA
5 Adelie Torgersen 36.7 19.3 193 3450
6 Adelie Torgersen 39.3 20.6 190 3650
7 Adelie Torgersen 38.9 17.8 181 3625
8 Adelie Torgersen 39.2 19.6 195 4675
9 Adelie Torgersen 34.1 18.1 193 3475
10 Adelie Torgersen 42 20.2 190 4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <fct>, year <int>
library(ggplot2)
library(palmerpenguins)
penguins |>
  ggplot() +
  aes(x = flipper_length_mm,
      y = body_mass_g) +
  aes(color = sex) +
  geom_point() +
  # base_family requires the Roboto Condensed font to be installed
  theme_bw(base_family = "Roboto Condensed", base_size = 13) +
  # na.translate = FALSE hides the NA level from the legend
  scale_color_manual(values = c("darkorange", "cyan4"), na.translate = FALSE) +
  labs(title = "Penguin flipper and body mass",
       caption = "Horst AM, Hill AP, Gorman KB (2020)",
       subtitle = "Dimensions for male/female Adelie, Chinstrap and Gentoo Penguins at Palmer Station LTER") +
  theme(plot.subtitle = element_text(size = 13)) +
  labs(x = "Flipper length (mm)",
       y = "Body mass (g)",
       color = "Penguin sex") +
  theme(legend.position = "bottom",
        legend.background = element_rect(fill = "white", color = NA)) +
  theme(plot.caption = element_text(hjust = 0, face = "italic")) +
  theme(plot.caption.position = "plot") +
  facet_wrap(vars(species)) +
  # spread the x-axis labels over two rows to avoid overlaps
  scale_x_continuous(guide = guide_axis(n.dodge = 2)) +
  scale_y_continuous(labels = scales::label_comma())
geom_point()
geom_violin()
geom_line()
geom_histogram()
geom_bar()
geom_density()
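As an illustration of how geometries are swapped, the same data and mapping can be kept while only the geom changes. The object names below are hypothetical.

```r
library(ggplot2)
library(palmerpenguins)

# Shared data and mapping
p_base <- ggplot(penguins, aes(x = species, y = body_mass_g))

# Same mapping, two different geometries
p_points <- p_base + geom_point()
p_violin <- p_base + geom_violin()
p_violin
```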
Tip
Have a look at the cheatsheet or the documentation for more possibilities.
They are present; it works because they have sensible defaults:
theme: theme_grey
coordinates: cartesian
statistic: identity
facets: disabled
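To make these defaults visible, here is a sketch that spells them out for a scatter plot. It assumes geom_point() as the geometry; the object names are hypothetical.

```r
library(ggplot2)
library(palmerpenguins)

# Relying on the defaults
p_implicit <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()

# Writing the same defaults explicitly
p_explicit <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(stat = "identity", position = "identity") +
  coord_cartesian() +
  facet_null() +
  theme_grey()
p_explicit
```

Both objects should render the same picture, which is why the short form is usually preferred.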
Artwork by @allison_horst
flipbookr by Gina Reynolds, Quarto version by Kieran Healy
aes() maps columns/variables of the data to aesthetics.
Geometries (geom) have different expectations:
NB: mapping = and data = are often skipped.
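A small sketch of the two spellings, with and without the argument names. The choice of bill columns is an assumption for illustration.

```r
library(ggplot2)
library(palmerpenguins)

# Fully named arguments
p_named <- ggplot(data = penguins,
                  mapping = aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()

# Same plot, argument names skipped (matched by position)
p_short <- ggplot(penguins,
                  aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()
p_short
```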
geom_point() accepts additional arguments such as the colour
Important
Parameters defined outside the aesthetics aes() are applied to all data.
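The difference can be sketched as follows: a colour given outside aes() is a fixed parameter for every point, while a colour inside aes() is mapped from a column. The colour "blue" and the object names are arbitrary choices.

```r
library(ggplot2)
library(palmerpenguins)

# Fixed parameter: every point is blue
p_fixed <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(colour = "blue")

# Mapped aesthetic: one colour per species, with a legend
p_mapped <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(aes(colour = species))
p_mapped
```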
Mapping requires two conditions: being defined inside aes() and referring to a valid table column.
Error in FUN(X[[i]], ...): object 'country' not found
This is hardly useful, but we shall see an application later. Stick to the 2 mapping rules: inside aes() and refer to a valid table column.
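A sketch of what breaking the first rule looks like with the penguins data: a column name used outside aes() is looked up as a regular R object and is not found. The tryCatch() wrapper is only there so the example runs to the end.

```r
library(ggplot2)
library(palmerpenguins)

# `species` is a column of penguins, not an object in the session,
# so using it outside aes() fails
failed <- tryCatch({
  ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
    geom_point(colour = species)
  FALSE
}, error = function(e) {
  message("Caught: ", conditionMessage(e))
  TRUE
})
failed
```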
aes() can also refer to an expression on a data column.
What about body_mass_g > 4000, an expression that returns a boolean? Try it to find out.
Compare the two following (great example of a Simpson’s paradox):
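A possible version of these plots, as a sketch: first a boolean expression mapped to colour, then one linear model for all penguins against one per species. The choice of the bill columns and the numeric slope check at the end are assumptions added for illustration.

```r
library(ggplot2)
library(palmerpenguins)

# An expression inside aes() is evaluated on the data columns
p_heavy <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm,
                                colour = body_mass_g > 4000)) +
  geom_point()

# One linear model for all penguins
p_all <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# One linear model per species (colour is inherited by both layers)
p_species <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm,
                                  colour = species)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Numeric check of the trend reversal
slope_all <- coef(lm(bill_depth_mm ~ bill_length_mm,
                     data = penguins))[["bill_length_mm"]]
slope_species <- sapply(split(penguins, penguins$species), function(d) {
  coef(lm(bill_depth_mm ~ bill_length_mm, data = d))[["bill_length_mm"]]
})
slope_all
slope_species
```

The overall slope is negative while each per-species slope is positive, which is the reversal the paradox refers to.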
Important
aesthetics in ggplot() are passed on to all geometries.
aesthetics in geom_*() are specific (and can overwrite inherited).
Map the island variable to a shape.
Use the colour aesthetics for both dots and linear models.
Set the transparency (alpha = 0.7).
Suppose we want to connect dots by g
Should be the job of geom_line()
Source: kohske blog: geom_line() doesn’t draw lines
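A minimal sketch of the grouping issue on made-up data. The data frame toy and its columns are hypothetical: without a group aesthetic, geom_line() joins all points in x order as one line; with group it draws one line per group.

```r
library(ggplot2)

# Hypothetical toy data: two series identified by column g
toy <- data.frame(
  x = rep(1:3, times = 2),
  y = c(1, 2, 3, 3, 2, 1),
  g = rep(c("a", "b"), each = 3)
)

# All six points are treated as a single group
p_single <- ggplot(toy, aes(x = x, y = y)) +
  geom_line()

# One line per value of g
p_grouped <- ggplot(toy, aes(x = x, y = y, group = g)) +
  geom_line()
p_grouped
```

Mapping g to colour instead of group would have the same grouping effect, since discrete aesthetics define groups.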
ggplot(penguins,
       aes(x = bill_length_mm,
           y = bill_depth_mm,
           shape = island,
           colour = species)) +
  geom_point() +
  # one linear model per species and island (inherited aesthetics)
  geom_smooth(method = "lm",
              formula = 'y ~ x') +
  labs(title = "Bill ratios of Palmer penguins",
       caption = "Horst AM, Hill AP, Gorman KB (2020)",
       subtitle = "Split per species / island",
       shape = "Islands",
       x = "culmen length (mm)",
       y = "culmen depth (mm)")
Warning
geom
ggplot2 doing the stat for you
geom_col()
x scale in % using scales
geom_bar() requires x OR y
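A sketch contrasting the two bar geometries: geom_bar() takes a single aesthetic and counts the rows itself, whereas geom_col() expects the heights to be computed beforehand. The pre-counting with base table() is an assumption for illustration.

```r
library(ggplot2)
library(palmerpenguins)

# geom_bar(): only x is given, the count is computed by ggplot2
p_bar <- ggplot(penguins, aes(x = species)) +
  geom_bar()

# geom_col(): counts computed beforehand, both x and y are given
counts <- as.data.frame(table(species = penguins$species))
p_col <- ggplot(counts, aes(x = species, y = Freq)) +
  geom_col()
p_col
```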
Annoying to see those 3 bars in disorder
Using the function fct_infreq() (forcats)
library(forcats) # for fct_infreq()
penguins |>
  ggplot(aes(y = fct_infreq(species))) +
  geom_bar() +
  scale_x_continuous(expand = expansion(mult = c(0, .1))) +
  labs(title = "Palmer penguins species",
       y = NULL) +
  theme_minimal(14) +
  # nice trick from T. Pedersen
  theme(panel.ontop = TRUE,
        # better to hide the horizontal grid lines
        panel.grid.major.y = element_blank())
The default bins value is 30 and will be printed out as a message.
The default position is stack. Here we overlay with "identity" and use transparency.
colour and fill mapped to the same variable for cosmetic purposes.
penguins |>
  drop_na(sex) |> # from tidyr
  ggplot() +
  geom_bar(aes(y = species,
               fill = sex),
           position = "fill") +
  geom_vline(xintercept = 0.5,
             linetype = "dashed",
             colour = "grey30") +
  scale_x_continuous(labels = scales::label_percent(),
                     position = "top",
                     expand = c(0, 0)) +
  labs(x = NULL, y = NULL) +
  theme_classic(16) # larger font sizes
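For the histogram options mentioned earlier (number of bins, position and transparency), a possible sketch is given here. The choice of body_mass_g and the alpha value are assumptions; bins = 30 is set explicitly, which also silences the message about the default.

```r
library(ggplot2)
library(palmerpenguins)

p_hist <- ggplot(penguins, aes(x = body_mass_g,
                               fill = species,
                               colour = species)) +
  # overlay the three species instead of stacking them
  geom_histogram(bins = 30,
                 position = "identity",
                 alpha = 0.5)
p_hist
```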
y by a categorical x
geom_boxplot() is assessing that:
body_mass_g is continuous
species is categorical/discrete
Artwork by Allison Horst
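A minimal sketch of such a boxplot, with the continuous variable on y and the categorical one on x:

```r
library(ggplot2)
library(palmerpenguins)

# One box per species, summarising the distribution of body mass
p_box <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  geom_boxplot()
p_box
```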
Filter out NA to avoid this category
penguins |>
  filter(!is.na(sex)) |> # filter() from dplyr
  # define aes here for both geometries
  ggplot(aes(y = body_mass_g,
             x = species,
             fill = sex,
             # for violin contours and dots
             colour = sex)) +
  # very transparent filling
  geom_violin(alpha = 0.1, trim = FALSE) +
  geom_point(position = position_jitterdodge(dodge.width = 0.9),
             alpha = 0.5,
             # don't need dots in legend
             show.legend = FALSE)
GIF source: Linh Ngo @BioTuring
ggbeeswarm
Artwork by @allison_horst
(Hint: think about inherited aesthetics)
penguins |>
  ggplot() +
  geom_point(aes(x = bill_length_mm,
                 y = body_mass_g)) +
  geom_smooth(method = "lm")
Error in `geom_smooth()`:
! Problem while computing stat.
ℹ Error occurred in the 2nd layer.
Caused by error in `compute_layer()`:
! `stat_smooth()` requires the following missing aesthetics: x and y.
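One way to resolve this error, sketched under the hint above: the x and y aesthetics are moved to ggplot() so that both layers inherit them. The explicit formula argument is an addition to avoid the informational message.

```r
library(ggplot2)
library(palmerpenguins)

p_inherited <- penguins |>
  # aesthetics defined here are passed on to all geometries
  ggplot(aes(x = bill_length_mm,
             y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)
p_inherited
```

An alternative would be to repeat the same aes() inside geom_smooth(), at the cost of duplication.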
ggplot2 outputs dots as they appear in the input data
You learned to:
Further reading 📚
Acknowledgments 🙏 👏
Thank you for your attention!