date: w02d02
duration: 90
maintainer: ultimatist
order: 2
title: Linear Regression Code Intro

Sample Lesson Plan

Learning Objectives

By the end of this notebook, students should be able to:

  • Visualize data
  • Look for correlations and multicollinearity
  • Understand how linear regression models work
  • Interpret basic regression statistics like R^2
  • Do basic feature engineering and selection to improve models

Depends On

Linear Regression Theory Intro

  • Students should have a foundational understanding of linear regression prior to attempting this lab

Seaborn

Python Advanced

  • Pickling

Instructor Notes

The goal of this notebook is to guide students through implementing linear regression models. A prior grasp of the theory is key. The lab should take 90 minutes, and students should attempt all exercises in the car price predictor student section.

Details about Learning Objectives

With this notebook students will:

Be able to create linear regression in:

  • statsmodels: a package well suited to fitting regressions with traditional R formula syntax
  • scikit-learn: the main machine learning package we'll be using throughout the course. It has a multitude of machine learning algorithms and helpful machine learning pipeline tools. sklearn has a tremendous amount of functionality; to get the most out of this course, explore the documentation in depth on your own and watch your understanding of that functionality grow as the course progresses.
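As a rough illustration of the two interfaces named above, the sketch below fits the same ordinary least squares model with each package. The data is synthetic, and the column names (`mileage`, `age`, `price`) are hypothetical stand-ins rather than the lab's actual car price dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration, not the lab dataset):
# price falls linearly with mileage and age, plus Gaussian noise.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "mileage": rng.uniform(0, 150, n),
    "age": rng.uniform(0, 15, n),
})
df["price"] = 30.0 - 0.1 * df["mileage"] - 0.8 * df["age"] + rng.normal(0, 1.0, n)

# statsmodels: the model is described with an R-style formula string,
# and an intercept is included automatically.
sm_fit = smf.ols("price ~ mileage + age", data=df).fit()

# scikit-learn: the model takes a feature matrix X and a target y,
# and fits an intercept by default.
X = df[["mileage", "age"]]
y = df["price"]
sk_fit = LinearRegression().fit(X, y)

# Both packages solve the same least squares problem, so the slopes
# and R^2 should agree up to floating-point error.
sm_coefs = sm_fit.params[["mileage", "age"]].to_numpy()
sk_coefs = sk_fit.coef_
r2_sm = sm_fit.rsquared
r2_sk = sk_fit.score(X, y)
print(sm_coefs, sk_coefs, r2_sm, r2_sk)
```

Calling `sm_fit.summary()` prints a fuller table of regression statistics, while the scikit-learn estimator slots into that package's pipeline tools.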

Gain familiarity with the following:

  • R formulas: R formulas are a convenient way to encapsulate functional relationships for regressions
  • seaborn: We'll use seaborn for visualization as we go along
  • Variable Preprocessing and Polynomial Regression with scikit-learn: We'll "standardize" or "normalize" many of our variables so they are on comparable scales for modeling. We'll also show how "linear" models can be extended to fit a wide range of nonlinear relationships by using functions of the original fields (for example, polynomial terms) as inputs to the linear model.
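The preprocessing and polynomial regression idea can be sketched as follows. This is a minimal example on synthetic data with a known quadratic relationship (an assumption made for illustration); it compares a plain linear fit with one that adds squared terms through scikit-learn's `PolynomialFeatures`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data (assumption): y is a quadratic function of x plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(300, 1))
y = 1.0 + 2.0 * x[:, 0] - 0.5 * x[:, 0] ** 2 + rng.normal(0, 0.2, 300)

# Baseline: standardize the feature, then fit a straight line.
linear_only = make_pipeline(StandardScaler(), LinearRegression()).fit(x, y)

# Same linear model, but fed x and x^2 as inputs. The model is still
# linear in its coefficients; only the features are nonlinear in x.
quadratic = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(x, y)

r2_linear = linear_only.score(x, y)
r2_quadratic = quadratic.score(x, y)
print(f"R^2 linear: {r2_linear:.3f}, R^2 quadratic: {r2_quadratic:.3f}")
```

Because the scaler and feature expansion live inside the pipeline, the same transformations are applied automatically whenever the fitted model is used to predict on new data.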

Installations (if necessary)

conda install pandas numpy statsmodels seaborn scikit-learn

Additional Resources