Lecture 8
Duke University
STA 199 Summer 2026: Session I
May 26, 2026
An object’s type indicates how it is stored in memory
You’ll commonly encounter:
logical
integer
double
character
You’ll less commonly encounter:
list
NULL
complex
raw
Can you use different types of data together? Sometimes… but be careful!
Vectors are constructed using the c() function
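As a minimal sketch of the common types, the vectors below are built with c() and inspected with typeof(). The variable names are made up for illustration.

```r
# Build one vector of each common type with c()
lgl <- c(TRUE, FALSE, NA)
int <- c(1L, 2L, 3L)   # the L suffix asks for integers
dbl <- c(1, 2.5, 3)    # plain numbers are doubles by default
chr <- c("a", "b", "c")

# typeof() reports how each vector is stored
typeof(lgl)  # "logical"
typeof(int)  # "integer"
typeof(dbl)  # "double"
typeof(chr)  # "character"
```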
without intention…
R will happily convert between various types without complaint when different types of data are concatenated in a vector. This is NOT always a good thing.
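A small illustration of this silent conversion, using made-up values: when types are mixed inside c(), R picks the most flexible type present (logical, then integer, then double, then character) and converts everything to it.

```r
# Mixing types in c() triggers coercion with no warning
mixed_chr <- c(1, "two", TRUE)  # everything becomes character
mixed_dbl <- c(1.5, 2L, TRUE)   # integer and logical become double
mixed_int <- c(3L, FALSE)       # logical becomes integer

typeof(mixed_chr)  # "character"
typeof(mixed_dbl)  # "double"
typeof(mixed_int)  # "integer"

mixed_chr          # "1" "two" "TRUE"
```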
with intention…
Explicit coercion:
When you call a function like as.logical(), as.integer(), as.double(), or as.character().
Implicit coercion:
Happens when you use a vector in a specific context that expects a certain type of vector.
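The two kinds of coercion can be sketched side by side. The values here are illustrative only; the functions are all base R.

```r
# Explicit coercion: we request the conversion by name
as.integer("42")    # 42L
as.numeric(TRUE)    # 1
as.character(3.14)  # "3.14"
as.logical("yes")   # NA, because "yes" is not a recognized logical string

# Implicit coercion: sum() and mean() expect numbers,
# so TRUE is treated as 1 and FALSE as 0
x <- c(TRUE, FALSE, TRUE, TRUE)
n_true    <- sum(x)   # 3
prop_true <- mean(x)  # 0.75
```

Summing or averaging a logical vector is a common, intentional use of implicit coercion: it counts the TRUEs or gives their proportion.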
We can think of data frames like vectors of equal length glued together
With the pull() function, we extract a vector from the data frame; this is functionally the same as accessing a column with df$col_name
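A minimal sketch of this equivalence, assuming dplyr is installed. The tibble and its column names (name, score) are hypothetical.

```r
library(dplyr)

# A small data frame: two vectors of equal length glued together
df <- tibble(
  name  = c("a", "b", "c"),
  score = c(90, 85, 77)
)

via_pull   <- df |> pull(score)  # extract the column as a plain vector
via_dollar <- df$score           # same column, base R syntax

identical(via_pull, via_dollar)  # TRUE
typeof(via_pull)                 # "double"
typeof(df)                       # "list": a data frame is a list of columns
```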
We can think of dates like an integer (the number of days since the origin, 1 Jan 1970) and an integer (the origin, aka “the Unix epoch”) glued together
The lubridate package allows you to work with / access elements of dates seamlessly
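A short sketch of both ideas, assuming lubridate is installed: the number stored underneath a date, and lubridate's accessor functions. The date used is simply the lecture date.

```r
library(lubridate)

origin <- ymd("1970-01-01")
d      <- ymd("2026-05-26")

# Underneath, a date is a count of days since the origin
as.numeric(origin)      # 0
as.numeric(origin + 10) # 10
days_since_origin <- as.numeric(d)

# lubridate accessors pull out individual components
year(d)   # 2026
month(d)  # 5
day(d)    # 26
```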
R uses factors to handle categorical variables with a fixed and known set of possible values
factor(x = ...): “The default (ordering of levels) is the unique set of values taken by as.character(x), sorted into increasing (alphabetical) order of x”
[1] "character"
[1] June July June August June
Levels: August July June
[1] "August" "July" "June"
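The output above is consistent with code along the following lines. This is a reconstruction, not necessarily the original slide code, and the variable names x and x_fct are assumptions.

```r
# A character vector of month names
x <- c("June", "July", "June", "August", "June")
typeof(x)              # "character"

# factor() sorts the unique values alphabetically to get the levels
x_fct <- factor(x)
x_fct
levels(x_fct)          # "August" "July" "June"

# Underneath, a factor stores integer codes that index into the levels
typeof(x_fct)          # "integer"
as.integer(x_fct)      # 3 2 3 1 3
```

Alphabetical order is rarely what we want for months, which is why reordering levels matters.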
We can use the forcats package (in tidyverse) to work with factors!
Some commonly used functions are:
fct_relevel(): reorder factors by hand
fct_reorder(): reorder factors by another variable
fct_infreq(): reorder factors by frequency
fct_rev(): reorder factors by reversing
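A brief sketch of these four functions on a small made-up factor, assuming forcats is installed. The temp values are hypothetical and exist only to give fct_reorder() a second variable to sort by.

```r
library(forcats)

x    <- factor(c("June", "July", "June", "August", "June", "August"))
temp <- c(75, 70, 77, 78, 76, 79)  # hypothetical values, one per observation

levels(x)                       # "August" "July" "June" (alphabetical default)

# By hand: move "July" to the front, keep the rest in place
lv_relevel <- levels(fct_relevel(x, "July"))   # "July" "August" "June"

# By another variable: sort levels by the median of temp within each level
lv_reorder <- levels(fct_reorder(x, temp))     # "July" "June" "August"

# By frequency: most common level first
lv_infreq <- levels(fct_infreq(x))             # "June" "August" "July"

# Reverse the current level order
lv_rev <- levels(fct_rev(x))                   # "June" "July" "August"
```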
Referencing a column in a pipeline:
# A tibble: 5 × 3
x `2011` `my var`
<dbl> <dbl> <dbl>
1 -2 -2 -2
2 -0.5 -0.5 -1
3 0.5 0.5 0
4 1 1 1
5 2 2 2
"x" means the literal character string.
# A tibble: 3 × 3
x `2011` `my var`
<dbl> <dbl> <dbl>
1 0.5 0.5 0
2 1 1 1
3 2 2 2
x means the column name in df.
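The two results above can be reproduced with a sketch like the one below, assuming dplyr is installed. The data frame is rebuilt from the printed output, so its construction is an assumption rather than the original code.

```r
library(dplyr)

# Data frame reconstructed from the printed tibbles
df <- tibble(
  x        = c(-2, -0.5, 0.5, 1, 2),
  `2011`   = x,
  `my var` = c(-2, -1, 0, 1, 2)
)

# "x" is a string: the comparison "x" > 0 is a single TRUE,
# so every row is kept
quoted <- df |> filter("x" > 0)

# x is the column: the comparison is done row by row
unquoted <- df |> filter(x > 0)

nrow(quoted)    # 5
nrow(unquoted)  # 3
```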
# A tibble: 5 × 3
x `2011` `my var`
<dbl> <dbl> <dbl>
1 -2 -2 -2
2 -0.5 -0.5 -1
3 0.5 0.5 0
4 1 1 1
5 2 2 2
"2011" means the literal character string.
# A tibble: 5 × 3
x `2011` `my var`
<dbl> <dbl> <dbl>
1 -2 -2 -2
2 -0.5 -0.5 -1
3 0.5 0.5 0
4 1 1 1
5 2 2 2
2011 means the literal number.
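Neither "2011" nor 2011 reaches the column. One way to refer to a column whose name is not a valid R name is to wrap it in backticks, as sketched below with a data frame rebuilt from the printed output (an assumption, not the original code).

```r
library(dplyr)

df <- tibble(
  x        = c(-2, -0.5, 0.5, 1, 2),
  `2011`   = x,
  `my var` = c(-2, -1, 0, 1, 2)
)

as_string <- df |> filter("2011" > 0)  # literal string: keeps all rows
as_number <- df |> filter(2011 > 0)    # literal number: keeps all rows
as_column <- df |> filter(`2011` > 0)  # backticks: the column named 2011

nrow(as_string)  # 5
nrow(as_number)  # 5
nrow(as_column)  # 3
```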
# A tibble: 5 × 3
x `2011` `my var`
<dbl> <dbl> <dbl>
1 -2 -2 -2
2 -0.5 -0.5 -1
3 0.5 0.5 0
4 1 1 1
5 2 2 2
"my var" means the literal character string.
Error in parse(text = input): <text>:2:13: unexpected symbol
1: df |>
2: filter(my var
^
my var means nothing.
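A column name containing a space needs backticks to parse. The sketch below shows that, plus the .data pronoun from dplyr as an alternative when the name is available as a string. The data frame is again rebuilt from the printed output, which is an assumption.

```r
library(dplyr)

df <- tibble(
  x        = c(-2, -0.5, 0.5, 1, 2),
  `2011`   = x,
  `my var` = c(-2, -1, 0, 1, 2)
)

# df |> filter(my var > 0)  # does not parse: unexpected symbol

# Backticks make the space part of the name
with_backticks <- df |> filter(`my var` > 0)

# The .data pronoun looks a column up by its name as a string
with_pronoun <- df |> filter(.data[["my var"]] > 0)

nrow(with_backticks)                     # 2
identical(with_backticks, with_pronoun)  # TRUE
```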