Machine Learning Cheat Sheet — Data Processing Techniques
Outliers distort a distribution. A value far below the expected range stretches the left tail, making the distribution left-skewed (negatively skewed). A value far above the expected range stretches the right tail, making it right-skewed (positively skewed).
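The effect of a single outlier on skew direction can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the `skewness` helper and the sample values are made up for illustration and use the population form of the third standardized moment.

```python
import numpy as np

def skewness(values):
    """Third standardized moment (population form); hypothetical helper."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

# Illustrative data, symmetric around 6, so its skewness is zero.
base = np.array([4.0, 5.0, 5.0, 6.0, 6.0, 6.0, 7.0, 7.0, 8.0])

# One value far below the range stretches the left tail.
with_low = np.append(base, -40.0)

# One value far above the range stretches the right tail.
with_high = np.append(base, 50.0)

for name, sample in [("base", base), ("low outlier", with_low), ("high outlier", with_high)]:
    print(f"{name}: skewness = {skewness(sample):.3f}")
```

The sign of the result gives the direction: negative for the low outlier, positive for the high one.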
There are different ways to handle skewed data:
- Log transform with a +1 offset, log(x + 1), so that zero values stay defined
- Normalization
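As a rough illustration of the techniques listed above, the sketch below applies log(x + 1) to right-skewed data and then rescales the result. It assumes NumPy, non-negative input values, and min-max scaling as the form of normalization, which the notes do not specify; the sample data and the `skewness` helper are hypothetical.

```python
import numpy as np

def skewness(values):
    """Third standardized moment (population form); hypothetical helper."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

# Illustrative right-skewed data: mostly small values plus one large outlier.
data = np.array([0.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0, 100.0])

# log(x + 1): the +1 keeps zeros defined, since log(0) is undefined.
log_data = np.log1p(data)

# Min-max normalization (an assumed choice): rescale values to [0, 1].
normalized = (log_data - log_data.min()) / (log_data.max() - log_data.min())

print("skewness before:", skewness(data))
print("skewness after log1p:", skewness(log_data))
print("normalized range:", normalized.min(), normalized.max())
```

The log transform compresses large values more than small ones, which pulls in the right tail. Min-max scaling is linear, so it changes the range but not the shape of the distribution.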