Machine Learning Cheat Sheet — Supervised Learning

Supervised Learning Models


Linear Regression

Y = WX + b
W: weights
X: features
b: bias

Works best when the data is linear. If the data is not linear, then we may need to transform the data, add features, or use another model.
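As a quick illustration, if the underlying relationship is logarithmic, transforming the feature makes it linear again. A minimal sketch with synthetic data (the coefficients and noise level are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(1, 100, 200)
y = 3.0 * np.log(x) + 2.0 + rng.normal(0, 0.1, x.size)  # non-linear in x

# Fitting y against log(x) turns the problem back into a straight line,
# so ordinary linear regression applies: y = w * log(x) + b.
w, b = np.polyfit(np.log(x), y, deg=1)
print(w, b)  # should come out close to 3.0 and 2.0
```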

Sensitive to outliers. Outliers contribute disproportionately to the error, so they can pull the fitted line away from the rest of the data. We may need to identify outliers and remove them if appropriate.

Error functions:
Mean Absolute Error (MAE): the average of |y − ŷ| over all points.
Mean Squared Error (MSE): the average of (y − ŷ)² over all points. Squaring penalizes large errors heavily, which is one reason outliers hurt so much.
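Both error functions are a few lines of NumPy; this sketch uses made-up predictions:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.7])

mae = np.mean(np.abs(y_true - y_pred))  # mean absolute error
mse = np.mean((y_true - y_pred) ** 2)   # mean squared error
print(mae, mse)
```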

Gradient Descent:
Repeatedly change the weights to move in the direction that decreases the error the most (the negative gradient).
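A minimal sketch of batch gradient descent for linear regression under the MSE error above (the learning rate and epoch count are arbitrary choices):

```python
import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000):
    """Fit y ~ X @ w + b by repeatedly stepping against the MSE gradient."""
    w = np.zeros(X.shape[1])
    b = 0.0
    m = len(y)
    for _ in range(epochs):
        error = X @ w + b - y              # prediction error at every point
        w -= lr * (2 / m) * (X.T @ error)  # dMSE/dW
        b -= lr * (2 / m) * error.sum()    # dMSE/db
    return w, b
```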

In linear regression, we can split the data into many small batches, each with roughly the same number of points, and use each batch in turn to update the weights. This is called mini-batch gradient descent.
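The mini-batch variant shuffles the data each epoch, slices it into roughly equal batches, and updates on each one. A sketch (the batch size is an assumption):

```python
import numpy as np

def minibatch_gradient_descent(X, y, lr=0.01, epochs=100, batch_size=32):
    w, b = np.zeros(X.shape[1]), 0.0
    n = len(y)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        order = rng.permutation(n)  # shuffle so batches differ each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]  # one small batch
            error = X[idx] @ w + b - y[idx]
            w -= lr * (2 / len(idx)) * (X[idx].T @ error)
            b -= lr * (2 / len(idx)) * error.sum()
    return w, b
```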

Gradient Descent vs Closed Form Solution:
We can solve for W directly by setting the derivatives of the error with respect to the weights to 0. This reduces to solving an n×n linear system (n is the number of features), giving W = (XᵀX)⁻¹XᵀY. But when n is large, solving that system requires a lot of computing power, so gradient descent is the better choice: it finds weights that are close enough while requiring less computation.
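The normal-equation route, sketched with NumPy; a column of 1s is appended so the bias b is solved for along with W:

```python
import numpy as np

def closed_form(X, y):
    """Solve the linear system X^T X W = X^T y directly instead of iterating."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # extra column of 1s for the bias
    # np.linalg.solve is cheaper and more stable than inverting X^T X explicitly
    w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
    return w[:-1], w[-1]  # weights, bias
```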

Polynomial Regression:
Fit the data to a higher-degree polynomial by adding powers of the features (x², x³, …) as new features, then fitting a linear model on them.
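A minimal sketch with scikit-learn (the cubic data and degree choice are made up): expand x into polynomial features, then fit an ordinary linear model on them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 100).reshape(-1, 1)
y = 0.5 * X[:, 0] ** 3 - X[:, 0] + rng.normal(0, 0.5, 100)

poly = PolynomialFeatures(degree=3)        # expand x into [1, x, x^2, x^3]
X_poly = poly.fit_transform(X)
model = LinearRegression().fit(X_poly, y)  # still just linear regression
print(model.coef_)  # weights on the expanded features [1, x, x^2, x^3]
```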
