When more than one feature affects the output
In the house price example, the size, number of bedrooms, and number of floors may all affect the price
Features are denoted x1, x2, ..., xn
With more than one feature, the hypothesis gains one term per feature: hθ(x) = θ0 + θ1x1 + θ2x2 + ... + θnxn
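A minimal sketch of the multi-feature hypothesis, assuming NumPy. The function name `predict`, the sample houses, and the parameter values are made up for illustration; they are not from the notes.

```python
import numpy as np

def predict(X, theta):
    """Evaluate h(x) = theta0 + theta1*x1 + ... + thetan*xn for every row of X."""
    # Prepend a column of ones so that theta[0] acts as the intercept term.
    X_with_bias = np.column_stack([np.ones(len(X)), X])
    return X_with_bias @ theta

# Hypothetical houses: size (sq ft), number of bedrooms, number of floors.
X = np.array([[2104.0, 3.0, 2.0],
              [1416.0, 2.0, 1.0]])
theta = np.array([50.0, 0.1, 20.0, 10.0])  # made-up parameters, intercept first
prices = predict(X, theta)
print(prices)
```

Adding the column of ones lets the whole hypothesis be written as a single matrix-vector product instead of a growing sum of terms.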
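Putting features on a similar scale so gradient descent converges faster
Mean normalization: replace xi with (xi - μi) / si, where μi is the mean of feature i and si is its range (max - min) or standard deviation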
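Mean normalization could be sketched as below, assuming NumPy and using the range (max - min) of each feature as the scale; the standard deviation is an equally common choice. The function name `mean_normalize` and the sample data are hypothetical.

```python
import numpy as np

def mean_normalize(X):
    """Return (X - mean) / range per column, plus the mean and range used."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    spread = X.max(axis=0) - X.min(axis=0)
    # A constant column has zero range; leave it unscaled to avoid dividing by zero.
    spread[spread == 0] = 1.0
    return (X - mu) / spread, mu, spread

# Hypothetical data: size (sq ft) and number of bedrooms.
X = np.array([[2104.0, 3.0],
              [1416.0, 2.0],
              [852.0, 1.0]])
X_norm, mu, spread = mean_normalize(X)
print(X_norm)
```

The function returns `mu` and `spread` so that the same shift and scale can be applied to new examples at prediction time.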
Tips to ensure the learning rate is working correctly
Use a smaller learning rate if gradient descent is not converging
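One way to check the learning rate is to record the cost at every iteration and confirm that it keeps going down. The sketch below assumes NumPy, batch gradient descent, and the squared-error cost; the names `gradient_descent` and `is_decreasing`, the toy data, and the two learning rates are illustrative assumptions.

```python
import numpy as np

def gradient_descent(X, y, alpha, num_iters):
    """Batch gradient descent for linear regression; returns theta and the cost history."""
    m = len(y)
    Xb = np.column_stack([np.ones(m), X])  # add the intercept column
    theta = np.zeros(Xb.shape[1])
    costs = []
    for _ in range(num_iters):
        error = Xb @ theta - y
        costs.append((error @ error) / (2 * m))  # cost before this update
        theta -= (alpha / m) * (Xb.T @ error)
    return theta, costs

def is_decreasing(costs):
    """True if the cost never goes up from one iteration to the next."""
    return all(later <= earlier for earlier, later in zip(costs, costs[1:]))

# Toy data generated from y = 1 + 2x.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

theta_good, costs_good = gradient_descent(X, y, alpha=0.1, num_iters=100)
theta_bad, costs_bad = gradient_descent(X, y, alpha=1.0, num_iters=20)

print(is_decreasing(costs_good))  # small learning rate
print(is_decreasing(costs_bad))   # learning rate too large for this data
```

If the cost rises or oscillates, as it does for the larger rate on this toy data, reducing the learning rate is the first thing to try.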
Normal equation: a method to solve for θ analytically instead of iterating
θ = (X^T X)^{-1} X^T y
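Slow if n is really large because you need to invert X^T X, an (n+1) x (n+1) matrix, which costs roughly O(n^3)
Won’t work if X^T X is not invertible, e.g. when features are redundant (linearly dependent) or there are more features than training examples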
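A minimal sketch of the analytic solution, assuming NumPy. It uses `np.linalg.pinv` (the pseudo-inverse) in place of a plain inverse, which is one common workaround when X^T X is not invertible; that substitution, the function name `normal_equation`, and the toy data are assumptions added for illustration.

```python
import numpy as np

def normal_equation(X, y):
    """Solve for theta in closed form: theta = pinv(X^T X) X^T y."""
    m = len(y)
    Xb = np.column_stack([np.ones(m), X])  # add the intercept column
    # pinv matches the ordinary inverse when X^T X is invertible and
    # still returns a usable answer when it is not.
    return np.linalg.pinv(Xb.T @ Xb) @ Xb.T @ y

# Toy data generated from y = 1 + 2x.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta = normal_equation(X, y)
print(theta)
```

No learning rate or iteration count is needed here, which is the main appeal of this method when the number of features is small.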