STA 199: Introduction to Data Science & Statistical Thinking

This page contains an outline of the topics, content, and assignments for the semester. Entries for future dates are tentative; the timeline of topics, assignments, and due dates may change during the session at the instructor’s discretion. Any material changes will be communicated to students in a timely manner.

WEEK DATE TOPIC MATERIALS DUE PREPARE
1 Wed, May 13 Welcome 🖥️ slides 00
⌨️ ae 00
GitHub username - End of Lecture

Thu, May 14 Lab 0 💻 lab 0 Lab 0 - End of Lab

Thu, May 14 Meet the Toolkit 🖥️ slides 01
🗒️ notes 01
⌨️ ae 01

📗 r4ds - intro
📘 ims - chp 1
🎥 Meet the toolkit :: R and RStudio
🎥 Meet the toolkit :: Quarto

Fri, May 15 Grammar of Data Visualization 🖥️ slides 02
🗒️ notes 02
⌨️ ae 02

📗 r4ds - chp 1
📘 ims - chp 4
2 Mon, May 18 Grammar of Data Transformation 🖥️ slides 03
🗒️ notes 03
⌨️ ae 03
✅ ae 03

📗 r4ds - chp 2
📗 r4ds - chp 3.1-3.5
🎥 Grammar of data transformation
🎥 Code along :: Flights and pipes

Mon, May 18 Lab 1 💻 lab 1
📗 r4ds - chp 2
📘 ims - chp 5

Tue, May 19 Exploratory Data Analysis I 🖥️ slides 04
🗒️ notes 04
⌨️ ae 04
✅ ae 04

📗 r4ds - chp 3.6-3.7
🎥 Visualizing and summarizing categorical data
🎥 Visualizing and summarizing numerical data
🎥 Visualizing and summarizing relationships
🎥 Code along :: Star Wars characters

Wed, May 20 Exploratory Data Analysis II 🖥️ slides 05
🗒️ notes 05
⌨️ ae 05
✅ ae 05
Lab 1 - 11:59PM ET 📘 ims - chp 5
📘 ims - chp 6
🎥 Code along :: Diving deeper with Palmer Penguins

Thu, May 21 Tidying Data 🖥️ slides 06
🗒️ notes 06
⌨️ ae 06
✅ ae 06

📗 r4ds - chp 5
📘 ims - chp 6

Thu, May 21 Lab 2 💻 lab 2
📗 r4ds - chp 4
🎥 Tidy data
🎥 Tidying data
🎥 Code along :: Population over time

Fri, May 22 Joining Data 🖥️ slides 07
🗒️ notes 07
⌨️ ae 07
✅ ae 07

🎥 Joining data
🎥 Code along :: Continent populations
📗 r4ds - chp 19.1-19.3

Sun, May 24



3 Mon, May 25 NO CLASS - Memorial Day
Lab 2 - 11:59PM ET

Tue, May 26 Data Types and Classes 🖥️ slides 08
🗒️ notes 08
⌨️ ae 08
✅ ae 08

🎥 Data types
🎥 Data classes
🎥 Code along :: That’s my type
📗 r4ds - chp 16

Wed, May 27 Importing and Recoding Data 🖥️ slides 09
🗒️ notes 09
⌨️ ae 09
✅ ae 09

🎥 Importing data
🎥 Code along :: Halving CO2 emissions
🎥 Code along :: Student survey
📗 r4ds - chp 7
📗 r4ds - chp 17.1-17.3

Thu, May 28 More Practice + Midterm Review 🖥️ slides 10
🗒️ notes 10
⌨️ ae 10
✅ ae 10



Thu, May 28 Lab 3 💻 lab 3


Fri, May 29 Data Ethics + Midterm Review 🖥️ slides 11
🗒️ notes 11
⌨️ ae 11
✅ ae 11

📕 mdsr - chp 8
📝 How to make a racist AI in R without really trying
🎥 Alberto Cairo - How charts lie
🎥 Joy Buolamwini - How I’m fighting bias in algorithms

Sun, May 31

Lab 3 - 12:00PM ET
4 Mon, Jun 1 Midterm (lecture + lab sessions)



Tue, Jun 2 Project Milestone 1 - Collaboration
Project Milestone 1 - 11:00AM ET

Wed, Jun 3 The Language of Models 🖥️ slides 12
🗒️ notes 12
⌨️ ae 12
✅ ae 12

🎥 The language of models
📘 ims - chp 7.1
🎥 Fitting and interpreting models
🎥 Modeling nonlinear relationships

Thu, Jun 4 Simple Linear Regression 🖥️ slides 13
🗒️ notes 13
⌨️ ae 13
✅ ae 13

🎥 Linear regression with a categorical predictor
🎥 Outliers in linear regression
🎥 Code along :: Modeling fish
📘 ims - chp 7.2

Thu, Jun 4 Project Milestone 2 - Proposals



Fri, Jun 5 Multiple Linear Regression I 🖥️ slides 14
🗒️ notes 14

🎥 Linear regression with multiple predictors
🎥 Main and interaction effects
📘 ims - chp 8.1-8.2
📘 ims - chp 8.3-8.5

Sun, Jun 7

Project Milestone 2 - 11:59PM ET
5 Mon, Jun 8 Multiple Linear Regression II 🖥️ slides 15
🗒️ notes 15
⌨️ ae 14
✅ ae 14

🎥 Code along :: Modeling interest rates

Mon, Jun 8 Lab 4 💻 lab 4


Tue, Jun 9 Linear Regression Diagnostics 🖥️ slides 16
🗒️ notes 16

TBU

Wed, Jun 10 Logistic Regression I 🖥️ slides 17
🗒️ notes 17
⌨️ ae 15
✅ ae 15

🎥 Logistic regression
🎥 Code along :: Building a spam filter
📘 ims - chp 9

Thu, Jun 11 Logistic Regression II / Modeling Wrap-Up 🖥️ slides 18
🗒️ notes 18
⌨️ ae 16
✅ ae 16
Lab 4 - 11:59PM ET
🎥 Classification and decision errors
🎥 Overfitting and spending your data

Fri, Jun 12 Lab 5 💻 lab 5


Sun, Jun 14

Project Milestone 3 - 11:59PM ET
6 Mon, Jun 15 Interval Estimation 🖥️ slides 19
🗒️ notes 19
⌨️ ae 17
✅ ae 17

🎥 Quantifying uncertainty
🎥 Bootstrapping
🎥 Code along :: Bootstrapping Duke Forest houses
📘 ims - chp 11
📘 ims - chp 12

Mon, Jun 15 Project Milestone 4 - Peer Review
Project Milestone 4 - 12:30PM ET


Tue, Jun 16 Hypothesis Testing 🖥️ slides 20
🗒️ notes 20
⌨️ ae 18
✅ ae 18
Lab 5 - 11:59PM ET
🎥 Hypothesis testing
📘 ims - chp 11

Wed, Jun 17 More Inference 🖥️ slides 21
🗒️ notes 21
⌨️ ae 19
✅ ae 19



Thu, Jun 18 Exam Review & Projects 📊 Kahoot!!


Thu, Jun 18 Lab 6 💻 lab 6


Fri, Jun 19 NO CLASS - Juneteenth



Sun, Jun 21

Lab 6 - 11:59PM ET
7 Mon, Jun 22 Project Milestone 5 - Presentations + Final Review
Project - 9:30 AM

Mon, Jun 22 Final Review ⌨️ ae 20
✅ ae 20



Wed, Jun 24 FINAL EXAM (9am - 12pm)