Causation and correlation

Data is an extremely powerful tool for making any kind of decision, whether it is washing hands before an operation to avoid infection or changing the shape of a bottle to make a new drink brand more attractive on the food market.

There has always been some confusion around correlation and causation, which some people believe are the same thing. They are not. This article will help you find out how correlation and causation differ, what they have in common, and where each one is better applied.


Correlation is a mutual relationship or connection between two or more things. In statistics, it is a technique that can show whether and how strongly pairs of variables are related. Take a very common example: weight and height. Correlation can tell you how much of the variation in people's weight is related to their height. Taller people tend to be heavier than shorter ones; however, the relationship is not perfect, and two people whose heights barely differ can still have quite different weights.

Some correlations are obvious and some are unexpected; careful correlation analysis can lead to a greater understanding of the data.

Determining correlation 

Correlation only works for quantifiable data, where the numbers are meaningful. It does not work for categorical data such as gender, color, or brand. The main result of a correlation analysis is the correlation coefficient, r. It ranges from -1.0 to +1.0. The closer r is to +1 or -1, the more closely the two variables are related. If r is close to 0, there is little or no linear relationship between the variables. If r is positive, then as one variable gets larger, the other gets larger as well. If r is negative, then as one gets larger, the other gets smaller; such a dependency is called inverse.

To determine the correlation coefficient, the Pearson technique is most often used. It is most appropriate for linear relationships and much less so for curvilinear ones. In case you have forgotten, a curvilinear relationship is one that does not follow a straight line. For a curvilinear relationship, a regression model that can represent the curve, for example multiple regression with a squared term, is usually a better option.
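To make the coefficient more concrete, here is a minimal sketch of the Pearson calculation in plain Python. The function name pearson_r and the three small data sets are made up for illustration; they are not taken from any real study. The sketch assumes two samples of equal length in which neither variable is constant.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (the numerator of r).
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable (used in the denominator).
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    if sq_dev_x == 0 or sq_dev_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)


# Illustrative (made-up) data: the same x values paired with three different y's.
x = [-3, -2, -1, 0, 1, 2, 3]
r_direct = pearson_r(x, [2 * v + 1 for v in x])    # straight line going up
r_inverse = pearson_r(x, [-2 * v + 1 for v in x])  # straight line going down
r_curved = pearson_r(x, [v ** 2 for v in x])       # parabola, not a straight line

print(r_direct, r_inverse, r_curved)
```

The two straight lines give r of +1 and -1, while the parabola gives r of 0 even though y is completely determined by x. That is the curvilinear case: a value of r near 0 rules out a linear relationship, not every relationship.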



Causation is the action of causing something; it is the relationship between cause and effect. Causation indicates that one event is the result of the occurrence of another event, i.e. there is a causal relationship between the two events. This is also referred to as cause and effect.

The classic example of causation versus correlation is smoking: smoking is correlated with alcoholism, but it does not cause alcoholism. Smoking does, however, cause an increase in the risk of developing lung cancer.

If you want more amazing examples of how tricky, funny, and sometimes dangerous a misunderstanding of correlation and causation can be, check out this TED talk video.
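One way to see how two things can be correlated without one causing the other is a small simulation. The sketch below uses purely simulated numbers, not real data about smoking or alcoholism. It assumes a hypothetical hidden factor z that influences two outcomes, a and b, while a never appears in the formula for b. All names and the seed are illustrative.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical hidden common cause.
z = [rng.gauss(0, 1) for _ in range(n)]

# "Observed" world: z pushes both a and b up or down; a has no effect on b.
a_observed = [zi + rng.gauss(0, 1) for zi in z]
b = [zi + rng.gauss(0, 1) for zi in z]

# "Experiment": a is assigned at random, ignoring z; b is produced as before.
a_assigned = [rng.gauss(0, 1) for _ in range(n)]

r_observed = pearson_r(a_observed, b)
r_assigned = pearson_r(a_assigned, b)

print(f"observed: r = {r_observed:.2f}")
print(f"assigned: r = {r_assigned:.2f}")
```

Under these assumptions the observed correlation comes out clearly positive (around 0.5 in theory), yet it disappears once a is set independently of z. If a really caused b, changing a by hand would still move b. This is only a toy model, but it shows why a correlation alone cannot tell you which of the two situations you are in.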
