?log, help(log)
with the tidyverse
University of Luxembourg
Tuesday, the 25th of February, 2025
You will learn to:
R is a shorthand for “GNU R”:
Learning to use R will make you more efficient and facilitate the use of advanced data analysis tools
shiny
Execution speed is easy to measure but what about development speed?
Source: Jozef Hajnala
Source: Touchon & McCoy. Ecosphere. 2016
Source: D. Robinson, StackOverflow blog
The bad news is that whenever you learn a new skill you’re going to suck. It’s going to be frustrating. The good news is that is typical and happens to everyone and it is only temporary. You can’t go from knowing nothing to becoming an expert without going through a period of great frustration and great suckiness.
— Hadley Wickham
?print
print.data.frame
colnames, names
read.csv, load, readRDS
df$x, df$"x", df[,"x"], df[[1]]
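The accessors listed above all reach the same column. A minimal sketch, assuming a hypothetical data frame `df` whose name and values are made up for illustration:

```r
# Hypothetical data frame, invented for this illustration
df <- data.frame(x = c(10, 20, 30), y = c("a", "b", "c"))

# Four base R spellings that all return column x as a plain vector
by_dollar   <- df$x
by_quoted   <- df$"x"
by_bracket  <- df[, "x"]
by_position <- df[[1]]   # matches only because x is the first column

# All comparisons return TRUE
identical(by_dollar, by_quoted)
identical(by_dollar, by_bracket)
identical(by_dollar, by_position)
```

Having several spellings for one operation is part of what makes base R feel inconsistent to newcomers.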
tidyverse curse
Navigating the balance between base R and the tidyverse is a challenge to learn
— Robert A. Muenchen
Source: Robert A. Muenchen’s blog
Two ways to open a manual page.
Sadly, manual pages are often unhelpful; vignettes or articles describe workflows better (below: the readxl website).
In RStudio, the help page can be viewed in the bottom-right pane.
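As a rough sketch of the two routes to documentation, the calls below use only functions from the utils package. The choice of `log` and of the `grid` package is for illustration only.

```r
# Manual page for log(); in an interactive session, ?log is the shorthand.
# The result is stored instead of printed so the script also runs in batch mode.
h <- help("log")
class(h)

# List the vignettes shipped with a package
# ("grid" is picked here only because it comes with every R installation)
v <- vignette(package = "grid")
class(v)
```

Printing `h` or `v` in an interactive session opens the corresponding help page or vignette index.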
The ambiguity [of the S language] is real and goes to a key objective: we wanted users to be able to begin in an interactive environment, where they did not consciously think of themselves as programming. Then as their needs became clearer and their sophistication increased, they should be able to slide gradually into programming, when the language and system aspects would become more important.
— John Chambers, “Stages in the Evolution of S”
Source: Teaching R to New Users: from tapply to Tidyverse, 2018, by Roger D. Peng
Hadley Wickham is Chief Scientist at Posit
We think the tidyverse is better, especially for beginners. It is:
2022: lubridate joined the core
Construct | Tidyverse | Base | Version |
---|---|---|---|
Strings read as factors | tibbles | Default | v4.0 |
c(factor("a"), factor("b")) | [1] a b | Was [1] 1 1 | v4.1 |
Pipe | %>% | \|> | v4.1 |
Lambda | ~ .x | \(x) | v4.1 |
Placeholder in pipe | . | _ | v4.2 |
Unnamed placeholder | list(a = 1) %>% .$a | list(a = 1) \|> _$a | v4.3 |
NULL assignment | rlang:: | | v4.4 |
Dataset | palmerpenguins::penguins | penguins | v4.5 |
=
equal
.
dot
,
comma
~
tilde
*
star (asterisk)
-
hyphen
_
underscore
"
double quotation marks
'
single quotation marks
`
backticks
#
hash
|
(vertical) bar
/
(forward) slash
\
backslash
()
parentheses
[]
(square) brackets
{}
(curly) braces
<>
chevrons
<-
assignment (left)
->
right assignment
|>
(base) pipe
library(): ensure the function’s origin
With only base R loaded:
Time Series:
Start = 1
End = 10
Frequency = 1
[1] NA 6 9 12 15 18 21 24 27 NA
Conflict: two packages export the same function.
The most recently loaded package wins.
Error in UseMethod("filter") :
no applicable method for 'filter' applied to an object of class "c('integer', 'numeric')"
Solution: prefix with :: to call functions from a specific package
Time Series:
Start = 1
End = 10
Frequency = 1
[1] NA 6 9 12 15 18 21 24 27 NA
Or use the conflicted package
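A minimal sketch of the `::` prefix, using the `filter()` clash shown in the outputs above. The dplyr part is optional and only runs if that package happens to be installed, which is an assumption about your setup.

```r
x <- 1:10

# Explicit namespace: moving sum over 3 neighbours,
# whatever other packages are currently attached
moving_sum <- stats::filter(x, rep(1, 3))
moving_sum

# dplyr::filter() keeps rows of a data frame instead
if (requireNamespace("dplyr", quietly = TRUE)) {
  kept <- dplyr::filter(data.frame(x = x), x > 8)
  print(kept)
}
```

Writing the package name in front of the function documents where it comes from and makes the call independent of the order of `library()` calls.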
Type | Example |
---|---|
character (strings) | "tidyverse!" |
boolean (logical) | TRUE / FALSE (T / F are not protected) |
numeric | integer (2L), double (2.34) |
date (also a double) | 2024-03-04 (Sys.Date()) |
datetime | 2024-03-04 09:12:24 CET (Sys.time()) |
complex | 2+0i |
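The types in the table can be checked with `typeof()` and `class()`. A short sketch using only base functions:

```r
typeof("tidyverse!")   # "character"
typeof(TRUE)           # "logical"
typeof(2L)             # "integer" (the L suffix is required)
typeof(2)              # "double"
typeof(2 + 0i)         # "complex"

# Dates and datetimes are doubles with a class attribute on top
today <- Sys.Date()
class(today)           # "Date"
typeof(today)          # "double"
now <- Sys.time()
typeof(now)            # "double"

# T and F are ordinary variables and can be overwritten, unlike TRUE and FALSE
T <- FALSE
t_overwritten <- T     # FALSE
rm(T)                  # remove the mask, T is TRUE again
```

This is why spelling out `TRUE` and `FALSE` is the safer habit in scripts.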
c() is the function to concatenate values into a vector
factor() converts strings to factors; the levels are the dictionary
data.frame: same as a list, but all elements must have the same length
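A small sketch tying the three structures together. The names `sizes`, `f`, and `df` and their values are invented for illustration.

```r
# c() concatenates values into a vector
sizes <- c("small", "large", "small", "medium")

# factor() stores integer codes plus a dictionary of levels
f <- factor(sizes)
levels(f)        # "large" "medium" "small" (alphabetical by default)
as.integer(f)    # 3 1 3 2

# A data.frame is a list of columns that all have the same length
df <- data.frame(size = f, count = c(4, 2, 7, 1))
is.list(df)      # TRUE
length(df)       # 2, the number of columns
nrow(df)         # 4
```

Because a data frame is a list underneath, `length()` counts its columns, not its rows.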
length(vector)
c() function
The assignment operator is <-: it associates a name with an object. The right-hand version -> is a valid alias.
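Both assignment directions can be sketched in a few lines. The variable names are arbitrary.

```r
# Left assignment: the name x now refers to the vector
x <- c(1, 5, 5)

# Right assignment: same effect, name on the right
c(2, 4) -> y

# A second name bound to the same values
z <- x

length(x)        # 3
identical(x, z)  # TRUE
```

Left assignment is by far the more common style. Right assignment is mostly seen at the end of long pipelines.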
In #rstats, it's surprisingly important to realise that names have objects; objects don't have names pic.twitter.com/bEMO1YVZX0
— Hadley Wickham (@hadleywickham) May 16, 2016
Source: H. Wickham - Adv R, licence CC
Source: H. Wickham - R for data science, licence CC
Important!
Unlike Python or Perl, R vectors use 1-based indexing!
The : operator generates integer sequences
Select elements from position 3 to 10:
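A minimal sketch of 1-based indexing combined with the `:` operator. The vector `x` is an arbitrary example, not the one used on the slide.

```r
x <- 11:20        # integers 11, 12, ..., 20

x[1]              # first element: 11 (there is no element 0)
x[3:10]           # positions 3 to 10: 13 14 15 16 17 18 19 20
typeof(3:10)      # "integer"
```

The sequence `3:10` is itself a vector, so subsetting with it is just indexing by a vector of positions.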
[1] 1 5 5
[1] NA
[1] 1 5 5 NA 6
R extends the vector and fills the gap with missing values (NA)!
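The outputs above can be reproduced with code along these lines. This is a plausible reconstruction under the assumption that the vector started as 1, 5, 5; the original slide code is not shown.

```r
x <- c(1, 5, 5)

# Reading past the end does not fail, it returns NA
beyond <- x[4]
beyond            # NA

# Assigning past the end grows the vector; position 4 is filled with NA
x[5] <- 6
x                 # 1 5 5 NA 6
length(x)         # 5
```

No error or warning is raised in either case, so an off-by-one index can silently introduce missing values.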
LETTERS is a built-in vector of the 26 upper-case letters.
Subset LETTERS to obtain A, B, C, D, E
Subset LETTERS to obtain B, D and F
Several solutions exist.
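Some possible solutions for the two subsets, as a sketch rather than the only answers:

```r
# A to E: a sequence of positions, or head()
first_five <- LETTERS[1:5]
first_five_alt <- head(LETTERS, 5)

# B, D and F: explicit positions, or a sequence with a step of 2
bdf <- LETTERS[c(2, 4, 6)]
bdf_alt <- LETTERS[seq(2, 6, by = 2)]

first_five   # "A" "B" "C" "D" "E"
bdf          # "B" "D" "F"
```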
Remove Z from LETTERS
length(x) returns the number of items in vector x
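Removing Z can be sketched in several equivalent ways. A negative index drops the element at that position.

```r
# Drop the last position, computed with length()
no_z <- LETTERS[-length(LETTERS)]

# Alternatives: compare by value, or drop one element from the end
no_z_by_value <- LETTERS[LETTERS != "Z"]
no_z_by_head <- head(LETTERS, -1)

length(no_z)   # 25
```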
Keep even letters (B, D, F … Z)
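One way to keep the letters at even positions is a logical test on the positions with the `%%` operator, sketched here:

```r
# Positions 1 to 26
positions <- seq_along(LETTERS)

# A position is even when the remainder of its division by 2 is 0
even_letters <- LETTERS[positions %% 2 == 0]

even_letters[1:3]     # "B" "D" "F"
length(even_letters)  # 13
```

The logical vector has one TRUE or FALSE per letter, and subsetting keeps only the TRUE positions.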
%% is the modulo operator: the integer remainder of a division
You learned to:
Acknowledgments 🙏 👏
Thank you for your attention!