Introduction to BASV53

with R

Aurélien Ginolhac, DHML

University of Luxembourg

Monday, the 23rd of February, 2026

Hello!

What you can do now:

  • Check for material at the main site

https://basv53.uni.lu

  • Install R, RStudio and packages

setup

Check your install

library(tidyverse)
read_csv("https://biostat2.uni.lu/practicals/data/swiss.csv",
         show_col_types = FALSE) |>
  filter_out(Fertility > 80) |>
  pivot_longer(cols = c(Fertility, Agriculture),
               names_to = "measurement", 
               values_to = "value") |>
  ggplot(aes(x = value, y = Education, colour = measurement)) +
  geom_point() +
  geom_smooth(method = "loess", formula = "y ~ x", alpha = 0.2) +
  theme_bw(14)

Overview

This course provides an introduction to R and the tidyverse, one of its dialects.

  • Brief computer history, programming and hardware notions
  • Biology track: data analysis is mandatory
  • Focusing on loading and cleaning data for exploratory visualizations

Lectures

  • Slides, formal lecture
  • Quick exercises inserted
  • Unprepared live demo

Practicals

  • Detailed exercises
  • Solutions hidden/revealed

Projects

  • Different projects, in teams of 2 or 3
  • Due date: June
  • 10 min defense
  • No slides to prepare: Quarto HTML

This course is composed of ~ 30 hours (2 ECTS)

1 ECTS

  • Written and practical exam, qmd file
  • 2 hours
  • All documents allowed
  • Internet allowed
  • Your own laptop allowed
  • Communication with others forbidden
  • AI allowed if tool specified and prompt included
  • Retake exam is an oral exam

1 ECTS

  • Homework project in a group
  • AI allowed if tool specified and prompt included
  • Wide range of subjects:
    • Gene expression in yeast cells
    • Spotify song characteristics
    • Temperatures from ice core records
    • Voyager Golden Record images
    • CO2 worldwide emissions
    • Photovoltaic installation: 1 year of measurements in 5 min steps
    • Human genome gene structures
    • Colorectal cancer microarray

Internet access allowed. Watch out for time!

Why allowed?

Hello my name is Joby, I have a PhD in Physics and I work for NASA and I just had to look up the equation for the volume of a sphere

— Joby Hollis 🏳️ 🌈🇪🇺 (@Jobium) 3 September 2018

Downside!

  • Time vanishes fast if you aren’t prepared
  • AI is NOT running code

Teacher

Aurélien Ginolhac

This website

Entirely built with Quarto, and hosted on the Uni/LCSB GitLab

Highlights, more data science than pure computing

  • Computer parts
  • Programming basics, exemplified in R
    • Data types
    • Data structures
    • Sub-setting
    • Control flow: for and if
  • Data wrangling
    • Import data
    • Manipulating
    • Visualizing
  • Literate programming
    • Quarto

Literate programming, separate content from formatting

HTML

PDF

Live Demo: French first names over 120+ years

Dataset: baby names in France over 120 years (data from the French statistics institute INSEE)

Dataset

prenoms
# A tibble: 648,370 × 5
    year sex   name          n      prop
   <dbl> <chr> <chr>     <int>     <dbl>
 1  1900 F     Abeline       3 0.0000127
 2  1900 F     Abelle        3 0.0000127
 3  1900 F     Ada           4 0.0000170
 4  1900 F     Adelaide    194 0.000822 
 5  1900 F     Adelheid      3 0.0000127
 6  1900 F     Adelia       12 0.0000509
 7  1900 F     Adelie        3 0.0000127
 8  1900 F     Adelina      50 0.000212 
 9  1900 F     Adeline     224 0.000949 
10  1900 F     Adelphine    19 0.0000805
# ℹ 648,360 more rows
# ℹ Use `print(n = ...)` to see more rows

Data analysis

Questions?

  • Number of babies born per year
  • Births per month across recent years
  • Ratio Male / Female
  • Evolution of your first names
  • Dynamics of novelty
  • Novelty versus Saints
  • Double gender first names
  • Most popular first names per decade
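A few of these questions can be sketched with dplyr. This is a hedged illustration, not the live demo itself: `toy` is a hypothetical stand-in for `prenoms` with the same `year`, `sex`, `name` and `n` columns, and every count in it is invented.

```r
library(dplyr)

# Toy stand-in for `prenoms`: invented counts, NOT the INSEE data
toy <- tribble(
  ~year, ~sex, ~name,     ~n,
   1900, "F",  "Marie",   50,
   1900, "F",  "Jeanne",  30,
   1900, "M",  "Jean",    40,
   1900, "M",  "Louis",   20,
   1901, "F",  "Marie",   60,
   1901, "F",  "Camille", 10,
   1901, "M",  "Jean",    45,
   1901, "M",  "Camille",  5
)

# Number of babies born per year
births_per_year <- toy |>
  group_by(year) |>
  summarise(total = sum(n))

# Ratio Male / Female per year
sex_ratio <- toy |>
  group_by(year) |>
  summarise(ratio_m_f = sum(n[sex == "M"]) / sum(n[sex == "F"]))

# Double gender first names: names recorded for both sexes
both_sexes <- toy |>
  distinct(name, sex) |>
  count(name, name = "n_sexes") |>
  filter(n_sexes == 2)

births_per_year
sex_ratio
both_sexes
```

Assuming the real tibble is loaded as `prenoms`, the same pipelines should apply after replacing `toy` with `prenoms`.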