What is meant by correlation?

Correlation is a measure of the strength of a linear relationship between two quantitative variables (e.g., height and weight). There are two main types of correlation, depending on how the variables move relative to each other:

  • Positive correlation – In a positive correlation, both variables move in the same direction: when the value of one goes up, the other tends to go up as well, and vice versa. For example, the more fuel you burn, the farther you can travel in an automobile.

  • Negative correlation – In a negative correlation, when one variable increases, the other tends to decrease, and vice versa.

Strong and Weak Correlation

In a strong correlation, the values of one variable can be predicted from the values of the other with reasonably high accuracy. In a weak correlation, the two variables are related on average, but there are plenty of exceptions.

The sample correlation coefficient "r" quantifies the strength of the relationship. Correlations are also often tested for statistical significance.
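To make the idea of "r" concrete, here is a minimal Python sketch that computes the sample (Pearson) correlation coefficient from its usual definition: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the data are assumptions made up for illustration; they do not come from any real study.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the two means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y_strong follows x closely, y_weak only loosely.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_strong = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 16.2]
y_weak = [5, 1, 7, 3, 6, 2, 9, 4]

r_strong = pearson_r(x, y_strong)
r_weak = pearson_r(x, y_weak)
print(f"strong: r = {r_strong:.3f}")
print(f"weak:   r = {r_weak:.3f}")
```

With data like these, the first value should come out close to +1 and the second much closer to zero, which matches the strong and weak cases described above.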

Limitations of correlation analysis

  • Correlation can’t show the presence or effect of other variables apart from the two variables being explored.

  • Correlation doesn’t establish cause and effect: two variables can move together without one causing the other.

  • Correlation also fails to describe curvilinear relationships, because the coefficient measures only linear association.
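The last limitation in the list above can be illustrated with a small Python sketch. Here y is fully determined by x (y = x squared), yet the correlation coefficient is zero because the relationship is symmetric rather than linear. The helper pearson_r and the data are illustrative assumptions, not part of the original article.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# x is symmetric around zero; y depends on x exactly, but not linearly.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [v ** 2 for v in x_curve]

r_curve = pearson_r(x_curve, y_curve)
print(f"r = {r_curve:.3f}")  # zero, up to floating-point rounding
```

A value of "r" near zero therefore means "no linear relationship", not "no relationship"; plotting the data is a simple way to catch cases like this one.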

Correlations describe data moving together

Correlations can be used to describe simple relationships in a data set. For example, for a data set of campsites in a mountain park, one may want to know whether there is a relationship between the elevation of a campsite and its average summer temperature.

Here, two measurements must be taken for each campsite: elevation and temperature. When you compare these two variables across the sample with a correlation, you would expect to find a linear relationship: as elevation goes up, temperature goes down. So, the two variables are negatively correlated.
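The campsite example can be sketched in Python as follows. Each campsite contributes one pair of measurements, and the correlation is computed across all pairs. The site names and numbers are invented for illustration only, and pearson_r is a hypothetical helper written out from the standard definition.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: (elevation in metres, mean summer temperature in °C)
# for each campsite. One pair of measurements per site.
campsites = {
    "Site A": (500, 24.5),
    "Site B": (800, 22.0),
    "Site C": (1100, 21.1),
    "Site D": (1400, 18.2),
    "Site E": (1700, 17.5),
    "Site F": (2000, 14.9),
    "Site G": (2300, 13.0),
}

# Split the pairs into two parallel lists, keeping the site order aligned.
elevations = [elev for elev, _ in campsites.values()]
temperatures = [temp for _, temp in campsites.values()]

r_camp = pearson_r(elevations, temperatures)
print(f"r = {r_camp:.3f}")  # negative: temperature falls as elevation rises
```

The important design point is that the two lists stay aligned: the i-th elevation and the i-th temperature must belong to the same campsite, otherwise the coefficient is meaningless.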

What do correlation numbers mean?

Correlations are measured with a unit-free quantity called the correlation coefficient, which ranges from -1 to +1 and is denoted by "r". Statistical significance is indicated by a p-value. Therefore, correlations are typically reported with two key numbers: "r =" and "p =".

  • The closer "r" is to zero, the weaker the linear relationship; the closer it is to -1 or +1, the stronger.

  • Positive "r" values show a positive correlation.

  • Negative "r" values indicate a negative correlation.

  • The p-value indicates how strong the evidence from the sample is that the population correlation coefficient differs from zero. A small p-value means the observed correlation would be unlikely if the true correlation were zero; it is evidence, not proof.

  • "Unit-free measure" means that correlations exist on their own scale. In the campsite example above, the value of "r" is not on the same scale as either elevation or temperature. This is different from many other statistics. For example, the mean of the elevation measurements is on the same scale as the variable itself.
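The points above about "r =", "p =", and the unit-free scale can be sketched in Python. The article does not say how the p-value is computed; in practice it usually comes from a t-test in a statistics package. The sketch below swaps in a simple permutation test instead, because it needs only the standard library. All names (pearson_r, permutation_p_value) and all data are assumptions made for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for the hypothesis of no association.

    Shuffling ys breaks any pairing with xs, so the shuffled correlations
    show what values of r arise by chance alone.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical campsite data (invented for illustration).
elevation_m = [500, 800, 1100, 1400, 1700, 2000, 2300]
temp_c = [24.5, 22.0, 21.1, 18.2, 17.5, 14.9, 13.0]

# The same measurements expressed in feet and degrees Fahrenheit.
elevation_ft = [m * 3.28084 for m in elevation_m]
temp_f = [c * 9 / 5 + 32 for c in temp_c]

r_metric = pearson_r(elevation_m, temp_c)
r_imperial = pearson_r(elevation_ft, temp_f)
p_value = permutation_p_value(elevation_m, temp_c)

print(f"metric:   r = {r_metric:.4f}")
print(f"imperial: r = {r_imperial:.4f}")
print(f"p = {p_value:.4f}")
```

Changing the units rescales both variables, but "r" stays the same (up to floating-point rounding), which is what "unit-free" means here. The permutation p-value is an approximation that depends on the number of shuffles and the seed, so it will not match a t-test result digit for digit.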

Updated on: 17-Sep-2021

