Pipes and Lambdas

Aurélien Ginolhac, DLSM

University of Luxembourg

Wednesday, the 26th of March, 2025

Learning objectives

You will learn to:

  • Use anonymous functions (lambdas)
  • Use the native pipe, |>

Functions, syntax

Structure

Assign a function to a name using the keyword function

add_random <- function(x) {
  x + runif(1)
}

# Alternatively without curly braces

add_random2 <- function(x) x + runif(1)

# Default for the second argument
add_random3 <- function(x, n = 1) x + runif(n)

runif(n)

Generates n random numbers from the uniform distribution, by default between 0 and 1.
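As a small sketch of the description above: runif() also accepts min and max arguments to change the interval. The seed value and the variable names draws and scaled are arbitrary choices made for this illustration.

```r
set.seed(42)                            # fix the seed so the draws are reproducible

draws  <- runif(3)                      # three numbers between 0 and 1 (the default)
scaled <- runif(3, min = 10, max = 20)  # three numbers between 10 and 20

draws
scaled
```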

Calling functions by name

add_random(3)
[1] 3.236672
add_random(3)
[1] 3.987232
add_random2(2)
[1] 2.352066
add_random3(3)
[1] 3.499562
# Changing default argument (n = 1)
add_random3(3, n = 5)
[1] 3.807075 3.925572 3.559497 3.949845 3.716239
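Arguments can be matched by position or by name. The sketch below redefines add_random3() so that it is self-contained, and resets the seed (an arbitrary value) before each call so that both calls draw the same random numbers.

```r
add_random3 <- function(x, n = 1) x + runif(n)

set.seed(1)
by_position <- add_random3(3, 5)          # x = 3, n = 5, matched by position

set.seed(1)
by_name <- add_random3(n = 5, x = 3)      # same call, arguments matched by name

identical(by_position, by_name)           # both calls give the same 5 numbers
```

Naming the arguments makes the order irrelevant, which is handy when a function has several defaults.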

Functions: declaration or not

Declared

my_function <- function(my_argument) {
  my_argument + 1
}

In the Global Environment:

ls.str()
my_function : function (my_argument)  

Declared functions are reusable.

my_function(2)
[1] 3

Anonymous (so-called lambdas)

Not stored but used “on the fly”

(function(x) { x + 2 })(2) 
[1] 4

Anonymous functions do not alter the Global Environment

ls()
[1] "my_function"

Modern base R shorthand since v4.1: \(x)

(\(x) x + 2)(2)
[1] 4

x is a convention, but any valid name works

(\(nb) nb^3)(2)
[1] 8

Native lambda, functions without a name

  • When you don’t need/want to assign a function to a name
  • Functional programming lets you pass functions declared on the fly to other functions

Example (R >= 4.1)

\(x) x + 1 
function (x) 
x + 1
# Is a shorthand for the classic syntax (the only one available in R < 4.1)
function(x) x + 1
function (x) 
x + 1
# Usage in functional programming
lapply(3:5, \(x) x^2)
[[1]]
[1] 9

[[2]]
[1] 16

[[3]]
[1] 25

Remember vectorisation!

(3:5)^2
[1]  9 16 25
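A short sketch of when each approach fits (R >= 4.1): for element-wise arithmetic the vectorised form is enough, while a lambda becomes useful when each element is itself a vector, such as the items of a list. The list chunks is made up for this example.

```r
# Element-wise arithmetic: the lambda adds nothing over vectorisation
squared_lambda     <- sapply(3:5, \(x) x^2)
squared_vectorised <- (3:5)^2
all.equal(squared_lambda, squared_vectorised)

# One summary per list element: here a lambda is a natural fit
chunks <- list(1:3, 4:10)
chunk_means <- sapply(chunks, \(v) mean(v))
chunk_means
```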

Pipes

Pass the result of one function as the input of the next one

Base R, classic nested-parentheses syntax

set.seed(12) # to obtain identical numbers
round(mean(rnorm(5)), 2)
[1] -0.76

Pipes are built into base R since version 4.1 as |>

Native pipeline

set.seed(12)
rnorm(5) |>
  mean() |>
  round(2)
[1] -0.76
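The native pipe is a syntax transformation: the parser rewrites the pipeline into the nested call. A minimal sketch (R >= 4.1) that compares the two unevaluated expressions with quote(), then checks that both give the same number with the same seed. The names piped, nested, res_piped and res_nested are illustrative.

```r
# Compare the parsed expressions without running them
piped  <- quote(rnorm(5) |> mean() |> round(2))
nested <- quote(round(mean(rnorm(5)), 2))
identical(piped, nested)

# Same seed, same result
set.seed(12)
res_piped <- rnorm(5) |> mean() |> round(2)
set.seed(12)
res_nested <- round(mean(rnorm(5)), 2)
identical(res_piped, res_nested)
```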

Native pipe requirements

Parentheses are mandatory

c(1.2, 3.1) |> mean
Error in mean: The pipe operator requires a function call as RHS (<input>:1:16)
c(1.2, 3.1) |> mean()
[1] 2.15
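The same rule applies when piping into a lambda: the right-hand side must be a call, so the lambda is wrapped in parentheses and followed by (). A minimal sketch (R >= 4.1), with doubled as an illustrative name:

```r
# The lambda is wrapped in parentheses, then called with ()
doubled <- c(1.2, 3.1) |> (\(x) x * 2)()
doubled
```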

Placeholder (R >= 4.2!): when the input is not the first argument

The base function grepl() looks for a pattern (here either an A or a C) and returns a logical vector:

  • First argument is the pattern
  • Second argument (x) is the input vector

Solution: use a named argument with the placeholder _

c("A", "B") |> grepl("[AC]", x = _)
[1]  TRUE FALSE
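The same idea with another base function, sub(), whose input is the third argument (x). This sketch also shows a lambda as an alternative for R 4.1, where the placeholder is not available. The vector words and the result names are made up for the example.

```r
words <- c("apple", "banana")

# Placeholder (R >= 4.2): the piped value goes to the named argument x
with_placeholder <- words |> sub("a", "A", x = _)

# Lambda alternative (R >= 4.1): the piped value becomes the argument v
with_lambda <- words |> (\(v) sub("a", "A", v))()

identical(with_placeholder, with_lambda)
```

sub() replaces only the first match in each element, so both versions capitalise the first "a" of each word.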

Pipes recommendations

In the tidyverse, the input data is the first argument of functions.

Avoid using |> for a single operation. Replace:

swiss |> 
  mutate(Fertility + Education)

by:

mutate(swiss, Fertility + Education)

A classic mistake is to specify the data twice:

swiss |> 
  filter(swiss, Examination > 10) |> 
  mutate(Fertility + Education)

This fails: the pipe already passes swiss as the first argument, so filter() receives the data twice.

Correct:

swiss |> 
  filter(Examination > 10) |> 
  mutate(Fertility + Education)
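For readers without dplyr installed, a rough base R analogue of the pipeline above, using subset() and transform() instead of filter() and mutate(). This is a sketch and not the tidyverse approach of the slides; the column name fert_edu and the object swiss_sum are hypothetical names chosen here.

```r
# swiss ships with base R (package datasets)
swiss_sum <- swiss |>
  subset(Examination > 10) |>                      # keep rows, like filter()
  transform(fert_edu = Fertility + Education)      # add a column, like mutate()

head(swiss_sum)
```

Giving the new column an explicit name keeps it easy to refer to in later steps.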

Before we stop

You learned about:

  • Function reminders
  • Lambdas
  • Pipes
  • Vectorisation

Acknowledgments

  • Hadley Wickham
  • Roland Krause

Thank you for your attention!