STA 199: Introduction to Data Science & Statistical Thinking
This page contains an outline of the topics, content, and assignments for the semester. Entries for future dates are tentative; the timeline of topics, assignments, and due dates may change during the session at the instructor's discretion. All material changes will be communicated to students in a timely manner.
| WEEK | DATE | TOPIC | MATERIALS | DUE | PREPARE |
|---|---|---|---|---|---|
| 1 | Wed, May 13 | Welcome | 🖥️ slides 00 ⌨️ ae 00 | GitHub username - End of Lecture | |
| | Thu, May 14 | Lab 0 | 💻 lab 0 | Lab 0 - End of Lab | |
| | Thu, May 14 | Meet the Toolkit | 🖥️ slides 01 🗒️ notes 01 ⌨️ ae 01 | | 📖 r4ds - intro 📖 ims - chp 1 🎥 Meet the toolkit :: R and RStudio 🎥 Meet the toolkit :: Quarto |
| | Fri, May 15 | Grammar of Data Visualization | 🖥️ slides 02 🗒️ notes 02 ⌨️ ae 02 | | 📖 r4ds - chp 1 📖 ims - chp 4 |
| 2 | Mon, May 18 | Grammar of Data Transformation | 🖥️ slides 03 🗒️ notes 03 ⌨️ ae 03 ✅ ae 03 | | 📖 r4ds - chp 2 📖 r4ds - chp 3.1-3.5 🎥 Grammar of data transformation 🎥 Code along :: Flights and pipes |
| | Mon, May 18 | Lab 1 | 💻 lab 1 | | 📖 r4ds - chp 2 📖 ims - chp 5 |
| | Tue, May 19 | Exploratory Data Analysis I | 🖥️ slides 04 🗒️ notes 04 ⌨️ ae 04 ✅ ae 04 | | 📖 r4ds - chp 3.6-3.7 🎥 Visualizing and summarizing categorical data 🎥 Visualizing and summarizing numerical data 🎥 Visualizing and summarizing relationships 🎥 Code along :: Star Wars characters |
| | Wed, May 20 | Exploratory Data Analysis II | 🖥️ slides 05 🗒️ notes 05 ⌨️ ae 05 ✅ ae 05 | Lab 1 - 11:59PM ET | 📖 ims - chp 5 📖 ims - chp 6 🎥 Code along :: Diving deeper with Palmer Penguins |
| | Thu, May 21 | Tidying Data | 🖥️ slides 06 🗒️ notes 06 ⌨️ ae 06 ✅ ae 06 | | 📖 r4ds - chp 5 📖 ims - chp 6 |
| | Thu, May 21 | Lab 2 | 💻 lab 2 | | 📖 r4ds - chp 4 🎥 Tidy data 🎥 Tidying data 🎥 Code along :: Population over time |
| | Fri, May 22 | Joining Data | 🖥️ slides 07 🗒️ notes 07 ⌨️ ae 07 ✅ ae 07 | | 🎥 Joining data 🎥 Code along :: Continent populations 📖 r4ds - chp 19.1-19.3 |
| | Sun, May 24 | | | | |
| 3 | Mon, May 25 | NO CLASS - Memorial Day | | Lab 2 - 11:59PM ET | |
| | Tue, May 26 | Data Types and Classes | 🖥️ slides 08 🗒️ notes 08 ⌨️ ae 08 ✅ ae 08 | | 🎥 Data types 🎥 Data classes 🎥 Code along :: That's my type 📖 r4ds - chp 16 |
| | Wed, May 27 | Importing and Recoding Data | 🖥️ slides 09 🗒️ notes 09 ⌨️ ae 09 ✅ ae 09 | | 🎥 Importing data 🎥 Code along :: Halving CO2 emissions 🎥 Code along :: Student survey 📖 r4ds - chp 7 📖 r4ds - chp 17.1-17.3 |
| | Thu, May 28 | More Practice + Midterm Review | 🖥️ slides 10 🗒️ notes 10 ⌨️ ae 10 ✅ ae 10 | | |
| | Thu, May 28 | Lab 3 | 💻 lab 3 | | |
| | Fri, May 29 | Data Ethics + Midterm Review | 🖥️ slides 11 🗒️ notes 11 ⌨️ ae 11 ✅ ae 11 | | 📖 mdsr - chp 8 📖 How to make a racist AI in R without really trying 🎥 Alberto Cairo - How charts lie 🎥 Joy Buolamwini - How I'm fighting bias in algorithms |
| | Sun, May 31 | | | Lab 3 - 12:00PM ET | |
| 4 | Mon, Jun 1 | Midterm (lecture + lab sessions) | | | |
| | Tue, Jun 2 | Project Milestone 1 - Collaboration | | Project Milestone 1 - 11:00AM ET | |
| | Wed, Jun 3 | The Language of Models | 🖥️ slides 12 🗒️ notes 12 ⌨️ ae 12 ✅ ae 12 | | 🎥 The language of models 📖 ims - chp 7.1 🎥 Fitting and interpreting models 🎥 Modeling nonlinear relationships |
| | Thu, Jun 4 | Simple Linear Regression | 🖥️ slides 13 🗒️ notes 13 ⌨️ ae 13 ✅ ae 13 | | 🎥 Linear regression with a categorical predictor 🎥 Outliers in linear regression 🎥 Code along :: Modeling fish 📖 ims - chp 7.2 |
| | Thu, Jun 4 | Project Milestone 2 - Proposals | | | |
| | Fri, Jun 5 | Multiple Linear Regression I | 🖥️ slides 14 🗒️ notes 14 | | 🎥 Linear regression with multiple predictors 🎥 Main and interaction effects 📖 ims - chp 8.1-8.2 📖 ims - chp 8.3-8.5 |
| | Sun, Jun 7 | | | Project Milestone 2 - 11:59PM ET | |
| 5 | Mon, Jun 8 | Multiple Linear Regression II | 🖥️ slides 15 🗒️ notes 15 ⌨️ ae 14 ✅ ae 14 | | 🎥 Code along :: Modeling interest rates |
| | Mon, Jun 8 | Lab 4 | 💻 lab 4 | | |
| | Tue, Jun 9 | Linear Regression Diagnostics | 🖥️ slides 16 🗒️ notes 16 | | TBU |
| | Wed, Jun 10 | Logistic Regression I | 🖥️ slides 17 🗒️ notes 17 ⌨️ ae 15 ✅ ae 15 | | 🎥 Logistic regression 🎥 Code along :: Building a spam filter 📖 ims - chp 9 |
| | Thu, Jun 11 | Logistic Regression II / Modeling Wrap-Up | 🖥️ slides 18 🗒️ notes 18 ⌨️ ae 16 ✅ ae 16 | Lab 4 - 11:59PM ET | 🎥 Classification and decision errors 🎥 Overfitting and spending your data |
| | Fri, Jun 12 | Lab 5 | 💻 lab 5 | | |
| | Sun, Jun 14 | | | Project Milestone 3 - 11:59PM ET | |
| 6 | Mon, Jun 15 | Interval Estimation | 🖥️ slides 19 🗒️ notes 19 ⌨️ ae 17 ✅ ae 17 | | 🎥 Quantifying uncertainty 🎥 Bootstrapping 🎥 Code along :: Bootstrapping Duke Forest houses 📖 ims - chp 11 📖 ims - chp 12 |
| | Mon, Jun 15 | Project Milestone 4 - Peer Review | | Project Milestone 4 - 12:30PM ET | |
| | Tue, Jun 16 | Hypothesis Testing | 🖥️ slides 20 🗒️ notes 20 ⌨️ ae 18 ✅ ae 18 | Lab 5 - 11:59PM ET | 🎥 Hypothesis testing 📖 ims - chp 11 |
| | Wed, Jun 17 | More Inference | 🖥️ slides 21 🗒️ notes 21 ⌨️ ae 19 ✅ ae 19 | | |
| | Thu, Jun 18 | Exam Review & Projects | Kahoot!! | | |
| | Thu, Jun 18 | Lab 6 | 💻 lab 6 | | |
| | Fri, Jun 19 | NO CLASS - Juneteenth | | | |
| | Sun, Jun 21 | | | Lab 6 - 11:59PM ET | |
| 7 | Mon, Jun 22 | Project Milestone 5 - Presentations + Final Review | | Project - 9:30 AM | |
| | Mon, Jun 22 | Final Review | ⌨️ ae 20 ✅ ae 20 | | |
| | Wed, Jun 24 | FINAL EXAM (9am - 12pm) | | | |
