Thursday, December 19, 2019

Correlation

What is Correlation?

Correlation shows us the direction and strength of the linear relationship between two quantitative variables.

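It's denoted by the equation

r = (1 / (n - 1)) * Σ [ ((Xi - X̄) / Sx) * ((Yi - Ȳ) / Sy) ]

where

r = correlation
n = number of data points in the data set
X̄, Ȳ = the means of the X and Y values
Sx = the standard deviation of the X values
Sy = the standard deviation of the Y values
Xi, Yi = the i-th data values of X and Y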
For more details on the mean and standard deviation, refer to the following blog post:
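
As a rough illustration, here is a minimal Python sketch of the calculation, assuming the usual sample form of the correlation: each X and Y value is standardized (mean subtracted, then divided by the sample standard deviation), the standardized pairs are multiplied, and the sum is divided by n - 1. The function names (mean, sample_std, correlation) are made up for this example and are not from any library.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(xs, ys):
    """Correlation r as the sum of products of standardized values over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = sample_std(xs), sample_std(ys)
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)


# Illustrative values only, not data from this post.
print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))  # points on a rising straight line
print(correlation([1, 2, 3], [3, 2, 1]))        # points on a falling straight line
```

The two guard clauses are a design choice for the sketch: r cannot be computed from a single point, or when either variable never changes, because a standard deviation of zero would mean dividing by zero.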

Direction is given by the slope of a line drawn through the data points.
If the line slopes upwards, the correlation is positive.
If the line slopes downwards, the correlation is negative.
Correlation values range from -1 to 1.
A value of 1 indicates perfect positive correlation and a value of -1 indicates perfect negative correlation.
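
To see the link between slope and sign, the sketch below fits a least-squares line and computes r from the same sums. It uses the form r = Sxy / sqrt(Sxx * Syy), which is algebraically the same as the standardized form divided by n - 1. The helper name slope_and_r and the sample numbers are invented for illustration.

```python
from math import sqrt


def slope_and_r(xs, ys):
    """Return (least-squares slope, correlation r) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared and cross deviations from the means.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r = sxy / sqrt(sxx * syy)
    return slope, r


# Made-up data: one upward trend, one downward trend.
up_slope, up_r = slope_and_r([1, 2, 3, 4], [1, 3, 2, 5])
down_slope, down_r = slope_and_r([1, 2, 3, 4], [5, 2, 3, 1])

print(up_slope, up_r)      # both positive
print(down_slope, down_r)  # both negative
```

Both the slope and r share the numerator Sxy, and their denominators are always positive, so the two always have the same sign.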

Correlation is positive


Correlation is negative

The linear relationship gets stronger as the correlation moves away from 0, either towards 1 or towards -1.
Refer to the pictures below.



r = 0


          
        
r = 0.3




r = 0.7

r = 1
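
The same progression can be sketched in code. The example below starts from points on a perfect straight line and adds a fixed, hand-picked wobble of growing size, so r drops from 1 towards 0. The wobble pattern, the scale values and the helper name pearson_r are assumptions made for this illustration; the resulting r values are not meant to match the pictures above exactly.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation r computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


xs = list(range(10))
# Fixed (non-random) disturbance, chosen by hand so the result is repeatable.
wobble = [1, -1, -1, 1, 1, -1, -1, 1, 0, 0]
scales = [0, 3, 9, 50]  # larger scale = points scattered further from the line

strengths = []
for scale in scales:
    ys = [x + scale * w for x, w in zip(xs, wobble)]
    strengths.append(pearson_r(xs, ys))

for scale, r in zip(scales, strengths):
    print(f"scale={scale:>2}  r={r:.3f}")
```

A fixed pattern is used instead of random noise so that every run prints the same numbers.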

Let's look at a calculation for a dataset of "No. of hours on a treadmill" vs. "Calories burnt".






We can see that the points fall close to a straight line, with a positive correlation of 0.969 (very close to a perfect positive correlation).
