The Great LIE - Correlation is Not Causation - FutureIQ

6,968 views Wait, is this logic right? • Sep 06, 2024
Slog Reference: Correlation is not causation - do windmills cause wind

Description

Everyone has heard that correlation is not causation. But this statement is not necessarily true.

Book 1 - Freakonomics: https://tapthe.link/FreakonomicsFIQd
Book 2 - Fooled by Randomness: https://tapthe.link/FooledbyRandomFIQd

The reality is more nuanced than this plain statement, and I have some real-life examples to help you understand the distinction between the two. Learn whether there is a correlation between the terms correlation and causation in the phrase "correlation is not causation", and how this phrase can fool you into believing things that are not true. Once you understand this difference, you will be well equipped to avoid the common fallacies surrounding it. Learn it all in this FutureIQ episode.

More videos for you:
The Randomness of Science: https://youtu.be/uNhT2hRtqUU
Fooled by randomness: https://youtu.be/QjDXyuBJ0UY
Tobacco politics: https://youtu.be/N8ewD023cNg
Pollution makes you dumb: https://youtu.be/ANvfe_pPg-Q

Hope you enjoyed FutureIQ by Navin Kabra and Shrikant Joshi. Do hit us up on Twitter:
@ngkabra http://twitter.com/ngkabra
@shrikant https://twitter.com/shrikant

Listen to it on the podcast provider of your choice: https://tapthe.link/FutureIQRSS
Watch other episodes of The FutureIQ podcast: https://www.youtube.com/playlist?list=PLAppTB0r5_TaYueZ0adD42Wiw5X-wTE4v

#futureiq #correlation #statistics

Transcript

Do windmills cause wind? I can show you strong scientific data that windmills cause wind, that quitting smoking increases your chances of dying of lung cancer, that body lice keep you healthy, that eating ice cream causes drowning, and that higher HDL, the good cholesterol, reduces heart disease. You know, I can prove that turning on the AC in my car causes the outside temperature to go up. What? These are examples of correlation. Okay, suppose I go to Satara, where there are a whole bunch of windmills, and I collect data about when each windmill was in operation or not in operation, and whether there was wind or no wind, and I draw a graph. You
will see that every time the windmill is running, the wind starts. No, it's the other way around! Well, yeah. Okay, you think this is funny. It is funny, but there are serious scientific situations in which people make this mistake. Let me take another example. Let's collect data on people dying of lung cancer and see what they are dying of. You collect a lot of data about them, and you see that among people who quit smoking, many more die of lung cancer than among people who just continued smoking. That can't be right. That is actually right; there is real data showing that. So the data is saying that if you quit smoking, you will die of
lung cancer more likely than if you continue smoking? Yes. What? So this is the problem, right? Now suddenly it doesn't sound funny anymore. No, this is absolutely not funny. Okay, so now at least you agree that this is a serious topic for us to discuss. Before we get into the details of how this happened, let's look at some more examples. These are called spurious correlations. For example, look at this graph: American cheese consumption is directly correlated with Google's stock price. So either people eating cheese causes them to Google more, or somehow Google is causing people to eat more cheese. But if that graph didn't look very convincing to you,
here's a much more convincing graph. It goes up, goes down, goes up again. Yeah, I like those kinds of graphs, where if two curves are doing the same thing, then obviously one of them is causing the other. So the number of skincare specialists in Kentucky correlates with Google searches for "how to hide a body". So what are these skincare specialists in Kentucky doing? Botching skincare jobs, I'm guessing. Look in the links: there is a guy called Tyler Vigen who has created a website of such spurious correlations that he finds in data. He has thousands of them; you can spend hours looking at them and laughing. Okay, I remember this example
from Freakonomics. When you were giving these spurious correlations, I remembered something about pirates and global warming. Was it that the lack of pirates is causing global warming, or something? Yes, we have a link; take a look at that. Spurious correlations, so funny, man. But the lung cancer one, is that also a spurious correlation? So what's the real explanation? The real explanation is that the cause and effect can be reversed without us realizing it. Our human brain likes to jump to conclusions. Ah, yeah, so with the windmill example, it is the wind that is causing the windmill to turn. But then what is happening with the lung cancer? Why
is quitting smoking increasing your chance of dying? Well, the simple thing is that when you get a diagnosis of lung cancer, you quit smoking. Oh, so most people who die of lung cancer have just recently quit smoking. Right, so that is an example of causality going in the reverse direction: lung cancer causes quitting. So I talked about lice giving good health. In medieval times, everybody had body lice, and what people noticed was that sick people do not have body lice; only healthy people have body lice. And they used to think that lice cause good health. I have to check whether they
were actually putting lice on sick people to cure them; I don't know if that was happening. They did put leeches, not lice. But the real explanation there is fairly simple and in the reverse direction: lice are very sensitive to body temperature, so as soon as you get sick and your temperature goes up, the lice run away. That's why sick people didn't have lice. So there was reverse causality. So in all of the examples that you gave earlier, about windmills and lung cancer and lice, you're basically saying that there is reverse causality in those correlations. But that's not always the case. Sometimes the cause and effect can be bidirectional. Okay, so: cycling causes low BMI. BMI
is body mass index, right? So if you have more muscle, then you have a low BMI, which is good. I thought that was a well-studied fact, though. If you look at cyclists, you will see they have low BMI, so you might jump to the conclusion that cycling causes your BMI to go down, right? You lose fat and you gain muscle. But they did a more detailed study and found that cycling doesn't cause your BMI to go down in many cases. What they realized was that in some cases, people with already low BMI are the ones who take up cycling. So sometimes it goes in
this direction, sometimes it goes in the reverse direction. Another example is the use of recreational drugs, meaning marijuana, cocaine, etc., and mental health problems. They are highly correlated, but we don't know which direction it goes. Is it that you take drugs and they cause you to get much worse mentally, or that because you have mental health problems, you cannot resist the drugs and you use drugs more? It can go in both directions. I actually read about a study related to ADHD and stimulants, or I'm not sure if it was a study or a post or an article or something, but they figured out that people with ADHD are more likely to
take what are called uppers, stimulating drugs, because that is what their brain craves because of the ADHD. So yeah, you are right, it's bidirectional. But then again, once you get addicted to those drugs, your mental health suffers all over again, so it's not just bidirectional but more of a cyclic relationship. It can be, right. So basically you have to be careful: you can't just assume that because there is a correlation, one of them causes the other. So what I'm seeing from this is that correlation is causation; it's just that you have to figure out the direction. Sometimes it can be bidirectional, or reverse. Well, no, that's not really
true. In fact, I haven't talked about the most important situation. Let's start with an example. You see a very high correlation between people sleeping with shoes on and getting a headache in the morning. So which way is the correlation going? Does sleeping with shoes on cause a headache, or because you know you're going to have a headache, you sleep with shoes on? That doesn't make sense, does it? The only time I have slept with shoes on is when I have come back from a really long party that went on late into the night. In fact, usually people get drunk, then they fall asleep on the bed without taking off their shoes, and they have a hangover the
next day. So in this case, of the two things we studied, sleeping with shoes on and headaches, neither causes the other. There's a third factor, being drunk, which caused both of them. This is the third factor, and this is usually the most important one; it is called a hidden factor. Oh, ice cream sales and drowning are correlated: the more ice cream sales, the greater the number of drowning cases. Yes. What, how? Third factor. Simple: summer is the common factor. In summer, people eat more ice cream, and they go to the beach and swim more, and therefore drown more. Yeah. If you study atmospheric CO2 levels, and
the rise of atmospheric CO2, especially across countries, you will notice that it is highly correlated with the increase in obesity. So obesity is not causing increased CO2 (it's causing increased methane), and increased CO2 is not causing obesity. So what's going on? What could be the third factor? I have wised up now: what could be the third factor? Simple: the richer a country gets, the fatter the people get, and the country also has much higher emissions. All of these seem like either obvious things or not very important things; it's not going to affect you, right? So let me give an example which actually affects you. About 5 years back, or I think maybe it was 10 years
back, people used to go on and on about how there is good cholesterol and bad cholesterol, and how having low levels of the good cholesterol is bad for you. What they saw was a high correlation: good cholesterol is HDL, high-density lipoprotein, and low levels of HDL were correlated with higher levels of heart disease, heart attacks and things like that. I remember this. And then people go around saying, oh well, let me eat things with HDL. But that was not true at all. There was a third factor. How did they discover this? They first noticed
that this correlation wasn't there for black people. Then they noticed that there are some people who have a genetic mutation in which higher HDL is there and higher heart disease is also there, the opposite of what everyone was talking about. After a while, the conclusion they have now reached is that there is a common factor, probably genes or diet or exercise, which affects both: the things that cause you to have lower HDL are the same things that cause you to have worse heart disease. But my point is that the entire scientific community and the experts were going around saying HDL is good for you, although there was
no causation there; it was only a correlation, and they just hadn't studied it enough. No, no, so this is a very interesting thing. Let's try to understand how science works. You collect data on a whole bunch of different things, and you see which things are independent variables that you can control, and which things are affected by them. You see that when this goes up, that goes up, so you know this causes that, and so on. So let's study my car. In my car, I always keep the AC at 22° at all times. Now I'm collecting a lot of data about what the outside
temperature is, what the inside temperature in the car is, and how much diesel is being used by the car. And what I notice is that there is a negative correlation between outside temperature and diesel usage: the higher the outside temperature, the lower the diesel usage. No, with a higher outside temperature, more diesel gets used, because of my AC. Oh, negative? Ah, okay. There is no correlation between inside temperature and diesel usage: the inside temperature is always 22, so it's not correlated with anything. Correct. There is no correlation between inside temperature and outside temperature. That is true: the inside temperature is always 22, so it is not correlated with outside temperature. So inside temperature is not correlated with anything, so we throw it
away. So one scientist says that outside temperature causes diesel usage to go up. Another scientist says that diesel usage is causing the outside temperature to go up. And these two are fighting, but everyone agrees on the following: outside temperature is completely unrelated to inside temperature, and diesel usage is completely unrelated to inside temperature. So it seems like the AC is also unrelated to inside temperature. So you see how, if you just look at correlations, you are going to end up with lots of ridiculous conclusions. So the data is correlated, but the causality is not what it seems? Yes, and that is the important
point I am trying to make, or more accurately, that Milton Friedman, the great economist, was trying to make. This is called Milton Friedman's thermostat, by the way. The point he was trying to make is that correlations are useless unless you talk about the cause and effect. You can't just say this is correlated with that, and that's why this causes that or that causes this. You have to give the sequence of cause and effect, a mechanism; you have to explain the mechanism. Yeah, and you and Milton Friedman both had to use an artificial example. No, let me give a more realistic example. We have talked about this in another episode, on Fooled by Randomness, but it's such a lovely
example that I'm going to explain it again. This is called the Illinois wellness program, where a whole bunch of companies give incentives to their employees: join our wellness program, and in that program we encourage running and going to the gym and so on, basically becoming healthy. When they studied it later on, they noticed that among the people who had joined the wellness program there was lower attrition, which means they were quitting the company less, and there was lower medical spending, as they were ending up in hospital less. Clearly this is a huge improvement, and so it seems very clear that joining the wellness program was beneficial to the
employees' health and also to the employer, because people were quitting the company less. Yeah. Later on they redid the study, but this time they used randomization. They didn't just tell people to come and join the wellness program; they asked people whether they were interested in joining, and of the people who said yes, half were allowed to join and half were not. Then they compared these three groups: people who were not interested in joining, people who wanted to join but were not allowed, and people who joined. This is called a randomized controlled trial, an RCT, because you randomized those two groups. They found that the correlations all
disappeared. Yeah, and what that proves is the following simple thing: the people who were already healthy, who were already going to spend less on medical expenses because they were already healthy, and the people who were already not going to quit because they were more conscientious, were the ones who wanted to join the wellness program. So even though they had done a study comparing two different populations, those in the wellness program and those who didn't join it, that study was still wrong, because the randomization was not there and the correlations were there because of a third factor. The only way to
get rid of the third factor is randomization. You know what you've done, right? You started off by saying correlation means some amount of causation in some direction, either reverse or bidirectional or whatever, and now we have come to a statement which is one of my hated statements: correlation is not always causation. I think that's such a weasel statement to make. My question to you is: which is it? Is correlation causation, or is correlation not causation? Yeah, so unfortunately it's a tricky problem. The first thing is that correlation might not be causation; we already proved that with so
many examples. So if you can do a randomized study, where you pick which people will be in part A and which people will be in part B, and the people don't pick for themselves, so that it is done through complete randomization, then if there is a third hidden factor, it will be equally divided between the two groups. Then no third factor can show a spurious correlation. Now if there is a correlation, you know that it is actually causal. So that is the gold standard: a randomized trial with two parts. Controlled. Correct. But sometimes you can't do a randomized controlled trial. There is a very famous and funny paper
that somebody published. Basically, they asked why parachutes don't work. Why? Because there is no randomized controlled trial of parachutes. Would you like to participate in a randomized controlled trial of parachutes? You will be randomly assigned: you jump from an airplane, and you get either a parachute or a placebo bag. Not going to happen. Absolutely not. Which means you can never study whether parachutes work or not. Similarly, if I wanted to study whether child abuse results in bad academic performance, this has to be observational and correlational; it cannot be a randomized controlled trial. So how do you fix it? And this does need fixing. Do you know why? Why? It is
because people misuse the phrase "correlation is not causation". They do, which is why I'm not a fan of that statement. Correct, exactly. So for example, this statement was made famous all over the world by, do you know who? The tobacco companies. In the 1940s and 50s, when it became clear that lung cancer is caused by smoking, they went around saying: no, no, that's just a correlation, and correlation is not causation. We'll do an entire episode on that. Well, let's not get into that, but I do agree with you about not liking the statement "correlation is not causation". Paul Graham, the famous investor, what he says
is that sure, "correlation is not causation" is a true statement, but whenever he's heard anybody use it in an argument, that person is an idiot. So sa amus puts it more clearly. He says: it is not clear to me whether saying "correlation is not causation" causes people to become idiots, but it is clearly highly correlated with it. So now you can twist your brain around that statement. But the thing to keep in mind is that correlation is not causation, but it might be causation. So you don't say, oh, correlation is not causation, and throw away the data. What you do is go into the
details: you try to look for what mechanism could cause this, you try to get rid of the spurious correlations, and you figure it out. Because correlation is not causation, but correlation usually points you in the correct direction. So now I'm wondering about the skincare experts in Kentucky and the searches for how to hide a body. What's the causation there, if any, or is it a purely spurious correlation? We don't know. If you want to look for it, please do, and let us know. But in general, keep in mind that our brain is very easily fooled by randomness. We have an episode on Fooled by Randomness; check it out, you will like that one also. We also have an episode on
randomized controlled trials; if it is available by this time, you'll find a link in the description or show notes. If not, subscribe and you'll know when it happens. Hit that bell thing so you get a notification when it happens. Shrikant, Navin, thank you. FutureIQ.
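
The wellness-program story from the transcript can be sketched as a small simulation. This is only an illustration under made-up assumptions, not the actual study: the function name `simulate`, the probabilities, and the spending figures are all hypothetical. A hidden third factor (already being healthy) drives both who chooses to join and how much they spend on medical care, while the program itself is assumed to have no effect at all. Comparing self-selected joiners with non-joiners should still show a large gap; assigning people by coin flip should make that gap shrink to roughly zero.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Return (observational_gap, randomized_gap) in average medical spending.

    Each gap is: mean spending of non-participants minus mean spending of
    participants. All numbers here are invented for illustration.
    """
    rng = random.Random(seed)
    self_selected = {True: [], False: []}  # people chose whether to join
    randomized = {True: [], False: []}     # a coin flip decided who joins

    for _ in range(n):
        # Hidden third factor: is this person already healthy?
        healthy = rng.random() < 0.5

        # Spending depends only on health. By assumption, the wellness
        # program has zero causal effect on spending.
        spending = rng.gauss(50 if healthy else 100, 10)

        # Observational setting: healthy people are more likely to sign up.
        joins = rng.random() < (0.8 if healthy else 0.2)
        self_selected[joins].append(spending)

        # Randomized setting: participation ignores health entirely.
        assigned = rng.random() < 0.5
        randomized[assigned].append(spending)

    def gap(groups):
        return statistics.mean(groups[False]) - statistics.mean(groups[True])

    return gap(self_selected), gap(randomized)


if __name__ == "__main__":
    observational_gap, randomized_gap = simulate()
    print(f"Self-selected groups, spending gap: {observational_gap:.1f}")
    print(f"Randomized groups, spending gap:    {randomized_gap:.1f}")
```

In the self-selected comparison the program looks beneficial only because healthy people are over-represented among the joiners. Randomization spreads the hidden factor evenly across both groups, which is the point made in the episode about why the correlations disappeared.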