This Week
Exam Week
Final Exam on Tuesday, Jun 08
Discussion 8
Friday, June 11 at 1:00 PM via Zoom
Assignment is Extra Credit
Zoom Recording
Was due Saturday, Jun 12 at 11:59 PM
Week 10
Conclusion
Lecture 18 — Model Evaluation and Fairness
Sensitivity, Specificity, Precision, Recall
Model Evaluation and Individuals
Fairness and Bias in ML Models
Parity Measures
Parity Example: Loans and Age
Week 9
Model Selection
Lecture 17 — Examples
Grid Search Cross-Validation
Examples: Multicollinearity
Example: Text Data
Lecture 16 — Model Validation
Model Validation
Example: Polynomial Regression
Example: Decision Trees
Discussion 7
Friday, May 28 at 1:00 PM via Zoom
Assignment is Extra Credit
Zoom Recording
Was due Saturday, May 29 at 11:59 PM
Lab 9
Was due Tuesday, Jun 01 at 11:59 PM
Project 5
Was due Monday, Jun 07 at 11:59 PM
Week 8
Prediction and sklearn
Lecture 15 — Model Pipelines
Chapter 9.2 - Data Pipelines
sklearn Transformers and Models
Pipelines
Intro to Bias and Variance
Lecture 14 — Features
Chapter 9.1 - Feature Engineering
Features
Feature Engineering
Features and Models
Example: Predicting Tips
Example: Adding Features to Predict Tips
Lab 8
Was due Tuesday, May 25 at 11:59 PM
Project 4
Was due Thursday, May 27 at 11:59 PM
Week 7
Regex and NLP
Lecture 13 — Extracting Information from Text
Chapter 8.2 - Information Extraction from Text
Cleaning the SD City Salary Job Titles
Bag of Words
Tf-Idf
Lecture 12 — Text Data
Chapter 8.1 - Text Processing
Canonicalization
Intro to Regular Expressions
Regex Building Blocks
Extended Regex
Discussion 7
Friday, May 14 at 1:00 PM via Zoom
Assignment is Extra Credit
Zoom Recording
Was due Saturday, May 15 at 11:59 PM
Lab 7
Was due Tuesday, May 18 at 11:59 PM
Project 4 Checkpoint
Was due Thursday, May 20 at 11:59 PM
Week 6
HTTP and HTML
Midterm on Thursday, May 06
Lecture 11 — Parsing Nested Data
Chapter 7.3 - Parsing HTML
HTML
Parsing HTML
Example: Scraping Quotes
Nested vs. Flat Data Structures
API Requests
Discussion 6
Friday, May 7 at 1:00 PM via Zoom
Assignment is Extra Credit
Was due Saturday, May 08 at 11:59 PM
Lab 6
Was due Tuesday, May 11 at 11:59 PM
Week 5
Missingness
Lecture 10 — Collecting Data
Chapters 7.1 and 7.2 - Data Collection
Collecting Data
HTTP Requests and JSON
APIs vs Scraping
Lecture 9 — Imputation
Chapters 6.3, 6.4, 6.5 - Missing Data
Introduction to Data Imputation
What to do with missing data
Mean imputation
Probabilistic imputation
Discussion 5
Friday, April 30 at 1:00 PM via Zoom
Assignment is Extra Credit
Zoom Recording
Was due Saturday, May 01 at 11:59 PM
Lab 5
Was due Tuesday, May 04 at 11:59 PM
Project 3
Was due Thursday, May 13 at 11:59 PM
Week 4
Permutation Tests
Lecture 8 — Missingness
Chapters 6.1 and 6.2 - Missingness
Missingness Mechanisms
Formal Definitions of Missingness
Assessing Missingness Quantitatively
Kolmogorov-Smirnov Test Statistic
Summary and Examples
Lecture 7 — Permutation Tests
Chapter 5.5 - Permutation Tests
Permutation Tests
TVD as a Test Statistic
Speeding up Permutation Tests
Discussion 4
Friday, April 23 at 1:00 PM via Zoom
Assignment is Extra Credit
Zoom Recording
Was due Saturday, Apr 24 at 11:59 PM
Lab 4
Was due Tuesday, Apr 27 at 11:59 PM
Project 2
Was due Thursday, Apr 29 at 11:59 PM
Week 3
Data Granularity
Lecture 6 — Combining Data
Chapters 5.3 and 5.4 - Data Granularity
Concatenating Vertically
Working with Times
Concatenating Horizontally
Joining with pd.merge
Many-to-* Joins, and a Demo
Lecture 5 — Data Granularity
Chapters 5.1 and 5.2 - Data Granularity
Data Granularity and groupby
Pivot Tables
Simpson's Paradox
Discussion 3
Friday, April 16 at 1:00 PM via Zoom
Assignment is Extra Credit
Zoom Recording
Was due Saturday, Apr 17 at 11:59 PM
Lab 3
Was due Tuesday, Apr 20 at 11:59 PM
Project 2 Checkpoint
Was due Thursday, Apr 22 at 11:59 PM
Week 2
Messy Data and Testing
Lecture 4 — Hypothesis Testing
Chapter 4.4 - Hypothesis Testing
Hypothesis Testing and Coin Flips
Jury Panels and the TVD
P-Values
Lecture 3 — Messy Data
Chapter 4.2 - Cleaning Messy Data
Introduction to data cleaning
Cleaning up data types
Unfaithful data
Missing values
Discussion 2
Friday, April 9 at 1:00 PM via Zoom
Assignment is Extra Credit
Zoom Recording
Was due Saturday, Apr 10 at 11:59 PM
Lab 2
Was due Tuesday, Apr 13 at 11:59 PM
Project 1
Was due Thursday, Apr 15 at 11:59 PM
Week 1
Introduction
Welcome to DSC 80!
Here's how to get started:
read the syllabus
join our Campuswire and Gradescope with the email invitations you received earlier this week
watch the intro lecture linked below when it is posted on Tuesday
See you in lecture.
Lecture 2 — Tabular Data
Chapter 2 - The Basics of Tabular Data
Tabular Data in Pandas
Pandas and Numpy
Useful Pandas Methods
Lecture 1 — Introduction
Chapter 1 - Introduction to Data Science
Introduction
Course Structure
The Data Science Lifecycle
Discussion 1
Friday, April 2 at 1:00 PM via Zoom
Assignment is Extra Credit
Zoom Recording
Was due Saturday, Apr 03 at 11:59 PM
Lab 1
Was due Tuesday, Apr 06 at 11:59 PM
Project 1 Checkpoint
Was due Thursday, Apr 08 at 11:59 PM