Good Examples Where Causation Does Not Imply Correlation

In summary, the conversation discusses the idea that causation does not necessarily imply correlation, specifically in situations where the relationship between variables is not linear. Examples such as a quadratic variant of Hooke's law, where data pairs (x, kx²) do not fit a linear model, are given to illustrate this point. The conversation also touches on the idea of lurking variables and the importance of considering all potential factors when analyzing data. The ambiguity of the term "correlation" is also discussed, with some interpreting it as statistical association in general and others as specifically linear correlation. Overall, the conversation highlights the subtlety of causation and correlation in statistical analysis.
  • #36
WWGD said:
You have valid points, but my question is a narrower, more technical one, where I make reference to correlation in the sense I think it is commonly-used, and not a broader question on causality. For one, PF does not allow philosophical questions ( too many headaches and no experts on the matter to moderate). But it is worth exploring the connection between causality and dependence or causality and other factors.
Fair enough, but to avoid confusion, I would make sure to choose a specific measure of correlation if you are not addressing the broader question, because if you use a restricted measure of correlation, your examples will depend on that specific measure (what is true of Pearson's won't be true of Spearman's, and so forth). If you are simply uncomfortable because the word "correlation" does not appear in the term "mutual information", you could instead use "total correlation". It's even more general than mutual information and has the word correlation in it :) And it is a measure that comes from a branch of engineering, so you might not get in trouble for getting too philosophical ;).

https://en.wikipedia.org/wiki/Total_correlation
 
Last edited:
  • #37
Jarvis323 said:
Fair enough, but to avoid confusion, I would make sure to choose a specific measure of correlation if you are not addressing the broader question, because if you use a restricted measure of correlation, your examples will depend on that specific measure (what is true of Pearson's won't be true of Spearman's, and so forth). If you are simply uncomfortable because the word "correlation" does not appear in the term "mutual information", you could instead use "total correlation". It's even more general than mutual information and has the word correlation in it :) And it is a measure that comes from a branch of engineering, so you might not get in trouble for getting too philosophical ;).

https://en.wikipedia.org/wiki/Total_correlation
I brought up a much simpler case of two RVs. Does total correlation shed much light when we only consider two variables? Clearly, if there is causality there is a degree of dependence, but it seems overkill for this basic case.
 
  • #38
WWGD said:
I brought up a much simpler case of two RVs. Does total correlation shed much light when we only consider two variables? Clearly, if there is causality there is a degree of dependence, but it seems overkill for this basic case.
Total correlation works with two random variables; it's a generalization of mutual information, and for exactly two variables it reduces to the mutual information between them.

It's also nice because it works with discrete and categorical data. So you can reason about abstract events, which makes connecting it with causality easier.
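
For what it's worth, here is a minimal sketch (in Python, with a made-up joint distribution) of computing the mutual information between two discrete random variables, which is what total correlation reduces to in the two-variable case:

```python
import numpy as np

# Hypothetical joint distribution of two binary variables A and B
# (rows index A, columns index B); the numbers are made up so that
# A and B are clearly dependent.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])

p_a = joint.sum(axis=1)  # marginal distribution of A
p_b = joint.sum(axis=0)  # marginal distribution of B

# I(A;B) = sum over (a,b) of p(a,b) * log2( p(a,b) / (p(a) p(b)) )
mi = sum(joint[i, j] * np.log2(joint[i, j] / (p_a[i] * p_b[j]))
         for i in range(2) for j in range(2) if joint[i, j] > 0)

print(f"I(A;B) = {mi:.3f} bits")  # ~0.278 bits > 0: A and B are dependent
```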
 
Last edited:
  • #39
Jarvis323 said:
To the contrary, using the term "correlation" as short for a specific type of linear statistical relationship is an abuse of terminology
Don't be silly. It is never an abuse of terminology to use a standard term in the standard manner with the standard meaning. Never. And particularly not in the sub-forum specifically for that discipline where the standard term is defined.

However, since you did produce a reference that supports your basic point (not your specific point about mutual information but the general idea of using correlation in a more broad sense), please feel free to use your non-standard meaning in this thread as long as you explicitly identify when you are using the non-standard meaning.
 
  • Like
Likes WWGD
  • #40
It is a good reference in general, but overkill for the case of two variables. Like using a tank to kill a fly.
 
  • #41
Dale said:
Don't be silly. It is never an abuse of terminology to use a standard term in the standard manner with the standard meaning. Never. And particularly not in the sub-forum specifically for that discipline where the standard term is defined.
It's a term that is context-dependent.

Most definitions I find online say correlation is a measure of linear dependence. I guess this means that Spearman's correlation is not standard either? We are talking about strictly linear correlation, right?

Then it's pretty easy and unsurprising that there are cases where causality doesn't imply linear correlation, because there are lots of non-linear processes.

You can use examples like trends that go up and then go down, like the number of active viral infections vs. the number of people who have had the virus. Or things like the x position of a point on a wheel in Euclidean space vs. how much the wheel has turned, or sinusoids, etc.
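
A quick numerical sketch of the wheel example (my own toy setup: the horizontal position of a point on a unit wheel as a function of the rotation angle, sampled symmetrically):

```python
import numpy as np

# x = cos(theta) is completely determined by the rotation angle theta,
# yet the Pearson correlation is ~0 when theta is sampled symmetrically,
# because the relationship goes up and then down rather than being linear.
rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, 100_000)
x = np.cos(theta)

print(f"Pearson r = {np.corrcoef(theta, x)[0, 1]:.3f}")  # ~0.000
```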
 
Last edited:
  • #42
Jarvis323 said:
It's a term that is context-dependent.

Most definitions I find online say correlation is a measure of linear dependence. I guess this means that Spearman's correlation is not standard either? We are talking about strictly linear correlation, right?

Then it's pretty easy and unsurprising that there are cases where causality doesn't imply linear correlation, because there are lots of non-linear processes.

You can use examples like trends that go up and then go down, like the number of active viral infections vs. the number of people who have had the virus. Or things like the x position of a point on a wheel in Euclidean space vs. how much the wheel has turned, or sinusoids, etc.
That's why my question was narrower, restricted to either Pearson or Spearman. That way it has a somewhat clear answer, even if it is an oversimplification. Otherwise we would likely have an endless discussion, and in an area I am not too familiar with.
 
  • Like
Likes Dale
  • #43
Dale said:
Don't be silly. It is never an abuse of terminology to use a standard term in the standard manner with the standard meaning. Never. And particularly not in the sub-forum specifically for that discipline where the standard term is defined.

However, since you did produce a reference that supports your basic point (not your specific point about mutual information but the general idea of using correlation in a more broad sense), please feel free to use your non-standard meaning in this thread as long as you explicitly identify when you are using the non-standard meaning.
Fair enough. But the mutual information part is at least somewhat supported by the fact that another name for the mutual information between two random variables is their total correlation.

I am usually studying topics where linear correlation isn't very relevant compared with statistical association in general, and I often use the word correlation. I've learned from this thread that people are so used to meaning linear correlation when they say correlation that I should just stop using the word correlation altogether in those cases.

Just pretend I wasn't here.
 
Last edited:
  • #44
Jarvis323 said:
Fair enough. But the mutual information part is at least somewhat supported by the fact that another name for the mutual information between two random variables is their total correlation. I am usually studying topics where linear correlation isn't very relevant compared with statistical association in general, and I often use the word correlation. I've learned from this thread that people are so used to meaning linear correlation when they say correlation that I should just stop using the word correlation altogether in those cases.
But how much sense does it make to bring it up for the simple case of two variables, Y vs. X?
 
  • #45
WWGD said:
But how much sense does it make to bring it up for the simple case of two variables, Y vs. X?
Mutual information is probably the simplest measure of dependency of all for two random variables. Its calculation is simple, and its interpretation is simple (although maybe harder to visualize), at least in my opinion.

Of course, as I have been arguing, I think it makes the most sense to use mutual information in this context, but that's just my opinion.

That said, you're right: it would make the discussion a lot more complicated and maybe philosophical, and not produce the result you need for the class. So I concede it might be inappropriate.

Encryption is a decent example, though, because encrypted strings are meant to appear to have absolutely no statistical association with their plaintext counterparts, even though one is generated from the other plus a small missing ingredient (the key). But with standard linear correlation you can't use examples like this, because of both the type of data and the limitation on what kind of association it measures.
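
To make that concrete, here is a minimal one-time-pad-style sketch (my own toy construction, not any specific cipher): each ciphertext byte is computed from the corresponding plaintext byte, yet the two streams show essentially no correlation.

```python
import numpy as np

# One-time-pad style: c = p XOR k with a uniformly random key byte k.
# Each ciphertext byte is *caused* by its plaintext byte, but with k
# uniform and independent, the ciphertext is statistically independent
# of the plaintext, so any correlation measure comes out ~0.
rng = np.random.default_rng(0)
plaintext = rng.integers(0, 256, 100_000)
key = rng.integers(0, 256, 100_000)
ciphertext = plaintext ^ key

print(f"Pearson r = {np.corrcoef(plaintext, ciphertext)[0, 1]:.3f}")  # ~0.000
```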

There is also a concept of "correlation immunity" in cryptography that is relevant.
https://en.wikipedia.org/wiki/Correlation_immunity
 
Last edited:
  • Like
Likes WWGD
  • #46
Just about any time the output increases (or decreases) strictly monotonically with the input, causation will imply a nonzero correlation; in that case Spearman's rank correlation is exactly ±1, and Pearson's is nonzero with the same sign.
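
As a sketch of that claim (assuming a strictly increasing relationship; y = exp(x) is an arbitrary example of mine):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Strictly monotone causal relationship: y = exp(x).
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = np.exp(x)

print(f"Spearman rho = {spearmanr(x, y)[0]:.3f}")  # 1.000: ranks agree exactly
print(f"Pearson  r   = {pearsonr(x, y)[0]:.3f}")   # ~0.76: positive but < 1
```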
 
  • #47
A simple example of causation without correlation is between Y and XY, where X and Y are independent random variables: X takes the values -1 and +1 each with probability 1/2, and Y takes the values 0 and 1 each with probability 1/2. The random variables Y and XY have correlation 0, but the event {Y=0} forces the event {XY=0}.
This example should be kept in mind when considering whether causation implies a linear relationship, or whether zero correlation implies that one variable gives no indication of the value of another. Both of those statements are incorrect in the general case.
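
A quick simulation of this example (a sketch; the sample size is arbitrary):

```python
import numpy as np

# X is -1/+1 with probability 1/2 each; Y is 0/1 with probability 1/2
# each, independent of X. Then Corr(Y, XY) = 0, yet Y = 0 forces XY = 0.
rng = np.random.default_rng(0)
n = 100_000
x = rng.choice([-1, 1], size=n)
y = rng.choice([0, 1], size=n)

print(f"Corr(Y, XY) = {np.corrcoef(y, x * y)[0, 1]:.3f}")  # ~0.000
print(f"Y=0 forces XY=0: {np.all((x * y)[y == 0] == 0)}")  # True
```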
 
Last edited:
  • #48
WWGD said:
Ok, so if the causality relation between A, B is not linear, then it will go unnoticed by correlation, i.e., we may have A causing B but Corr(A, B) = 0.

It's insufficient proof to assert "then it will" and then give a reason using the phrase "we may have".

What you describe is not a well-defined mathematical situation. If we define a "causal relation" between A and B to mean that B is a function of A, the "correlation between A and B" is undefined until some procedure for sampling the values of A is specified.

For example, if B = f(A) then there may be intervals where f is increasing with respect to A and intervals where it is decreasing. The "correlation between A and B" is not defined as a specific number until we say how to sample the various intervals.

Furthermore, the term "causal relation" is ambiguous. For example, suppose that for each value B = b, the value of A is given by a probability distribution from a family of the form f(A, b), where b is a parameter of the distribution. Then A is not a function of B, but B still "has an effect" on the value of A.
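
Here is a minimal numerical sketch of that point (f(A) = A² is my stand-in for a function with both increasing and decreasing intervals): the same functional relation yields completely different correlations under different sampling schemes.

```python
import numpy as np

# Same relation B = f(A) = A**2 under two sampling schemes for A.
rng = np.random.default_rng(0)
a_sym = rng.uniform(-1, 1, 50_000)  # symmetric about 0: covers both intervals
a_pos = rng.uniform(0, 1, 50_000)   # only where f is increasing

print(f"symmetric sampling: r = {np.corrcoef(a_sym, a_sym**2)[0, 1]:.3f}")  # ~0.00
print(f"one-sided sampling: r = {np.corrcoef(a_pos, a_pos**2)[0, 1]:.3f}")  # ~0.97
```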

WWGD said:
I may be teaching a small online class that includes this general area and was looking for examples that are "natural".

I hope you don't teach using ambiguous jingles like "Causation Does Not Imply Correlation".
 
  • #49
Stephen Tashi said:
It's insufficient proof to assert "then it will" and then give a reason using the phrase "we may have".

What you describe is not a well-defined mathematical situation. If we define a "causal relation" between A and B to mean that B is a function of A, the "correlation between A and B" is undefined until some procedure for sampling the values of A is specified.

For example, if B = f(A) then there may be intervals where f is increasing with respect to A and intervals where it is decreasing. The "correlation between A and B" is not defined as a specific number until we say how to sample the various intervals.

Furthermore, the term "causal relation" is ambiguous. For example, suppose that for each value B = b, the value of A is given by a probability distribution from a family of the form f(A, b), where b is a parameter of the distribution. Then A is not a function of B, but B still "has an effect" on the value of A.
I hope you don't teach using ambiguous jingles like "Causation Does Not Imply Correlation".
Consider Hooke's law and the causal relation (causality is still somewhat of a philosophical term at this stage, so I am settling for accepted physical laws as describing/defining causality), with a shift, to ##y=k(x-1)^2##. Then samples taken on opposite sides of the vertex of this curve, as well as of curves described by other physical laws, may give rise to uncorrelated data sets. I cannot afford to enter or present serious background on causality in an intro-level class.
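
A numerical sketch of that example (k = 3 and the sampling window are arbitrary choices of mine):

```python
import numpy as np

# y = k(x-1)^2 with x sampled symmetrically about the vertex x = 1:
# y is entirely caused by x, yet the Pearson correlation is ~0.
rng = np.random.default_rng(0)
k = 3.0
x = 1 + rng.uniform(-1, 1, 50_000)  # symmetric about the vertex
y = k * (x - 1) ** 2

print(f"Pearson r = {np.corrcoef(x, y)[0, 1]:.3f}")  # ~0.000
```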
 
  • #50
If you are going to teach an introductory class, I think you should be careful about these terms. Saying that A and B are uncorrelated implies that A and B are random variables. Saying that A causes B implies that A and B are events. The two implications are conflicting. It would be better to talk about random variables X and Y being correlated and about the event ##X \in A## implying (not causing) the event ##Y \in B##. (You could talk about events A and B being independent, but not uncorrelated).
Also, you should be careful to indicate that "causation" is a logic problem that depends on subject knowledge, not a statistical problem.
 
Last edited:
  • #51
FactChecker said:
If you are going to teach an introductory class, I think you should be careful about these terms. Saying that A and B are uncorrelated implies that A and B are random variables. Saying that A causes B implies that A and B are events. The two implications are conflicting. It would be better to talk about random variables X and Y being correlated and about the event ##X \in A## implying (not causing) the event ##Y \in B##. (You could talk about events A and B being independent, but not uncorrelated).
Also, you should be careful to indicate that "causation" is a logic problem that depends on subject knowledge, not a statistical problem.
Well, this is part of the problem of trying to popularize not-so-simple topics. I have to do enough handwaving to start a tornado.
 
  • Haha
Likes Klystron
  • #52
This is just Stats 101 for incoming freshmen, with little to no background in math or philosophy. I was just asked to incorporate this topic into the existing curriculum. At any rate, there is still a lot of handwaving when introducing the CLT, hypothesis testing, etc. I will include the caveat of the necessary oversimplification and just direct the curious ones to more advanced sources.
 
  • #53
WWGD said:
Consider Hooke's law and the causal relation (causality is still somewhat of a philosophical term at this stage, so I am settling for accepted physical laws as describing/defining causality), with a shift, to ##y=k(x-1)^2##. Then samples taken on opposite sides of the vertex of this curve, as well as of curves described by other physical laws, may give rise to uncorrelated data sets. I cannot afford to enter or present serious background on causality in an intro-level class.

For the purposes of statistics, the main point is not the philosophical definition of causality, but rather the mathematical point that the correlation between A and B is (as @FactChecker says) only defined for random variables A and B. If A and B are physical measurements, they are only random variables if some scheme is specified for taking random samples of the measurements. So a "Hooke's Law" relation between A and B does not define A and B as random variables. To suggest to an introductory class that the names of two measurements (e.g. length, force or height, weight) imply the concept of a correlation or a lack of correlation between the measurements is incorrect. A fundamental problem in applications of statistics is how to design sampling methods. You cannot provide a coherent example of "measurement A is not correlated with measurement B" without including the sampling method.
 
  • #54
You do not need to dwell in class on the technicalities, but you can arm yourself with a few simple, intuitive examples. I think that is what you want from this thread.
If you randomly select a person and measure the lengths of their arms, those lengths are random variables. The event that the selected right arm is more than 2 feet long is an event. Right arm lengths are highly correlated with left arm lengths, but the right arm being over 2 feet long does not cause the left arm to be over 2 feet long -- it implies it, it does not cause it.
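
A toy simulation of this example (all numbers invented for illustration; overall body size plays the lurking variable):

```python
import numpy as np

# Left and right arm lengths are both driven by a common lurking
# variable (body size), so they are highly correlated even though
# neither arm's length causes the other's.
rng = np.random.default_rng(0)
size = rng.normal(0, 1, 10_000)                     # lurking variable
left = 25 + 2 * size + rng.normal(0, 0.3, 10_000)   # inches, made-up model
right = 25 + 2 * size + rng.normal(0, 0.3, 10_000)

print(f"Corr(left, right) = {np.corrcoef(left, right)[0, 1]:.3f}")  # ~0.98
```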
 
  • #55
Stephen Tashi said:
For the purposes of statistics, the main point is not the philosophical definition of causality, but rather the mathematical point that the correlation between A and B is (as @FactChecker says) only defined for random variables A and B. If A and B are physical measurements, they are only random variables if some scheme is specified for taking random samples of the measurements. So a "Hooke's Law" relation between A and B does not define A and B as random variables. To suggest to an introductory class that the names of two measurements (e.g. length, force or height, weight) imply the concept of a correlation or a lack of correlation between the measurements is incorrect. A fundamental problem in applications of statistics is how to design sampling methods. You cannot provide a coherent example of "measurement A is not correlated with measurement B" without including the sampling method.
My actual take on causation of B by A would be that in several independent experiments, variable A was experimentally controlled (instances of it were selected randomly), so that the major non-error variation in B is explained through variation in A. But there is little room to delve into this, or to specify how and where random variables or events come into play in this setting. And, yes, I was assuming a scheme for taking random samples from each has been defined. Students I have had have trouble understanding what a probability distribution is, so delving into events and random variables is overkill, as I am only allotted a single class for this topic.
 
  • #56
WWGD said:
But there is little room to delve into this, or to specify how and where random variables or events come into play in this setting.

I don't understand how an introductory course in statistics can have little room for discussing random variables!
 
  • #57
Stephen Tashi said:
I don't understand how an introductory course in statistics can have little room for discussing random variables!
Nursing and other non-STEM students, out of high school with hardly any or very poor math or science background, I guess. Not a comfortable position for me to be in, for sure.
 
Last edited:
  • #58
WWGD said:
Consider Hooke's law and the causal relation (causality is still somewhat of a philosophical term at this stage, so I am settling for accepted physical laws as describing/defining causality)

But even in Hooke's law, does the displacement cause the force, or does the force cause the displacement?

Yes!
 
  • #59
WWGD said:
Ok, so if the causality relation between A, B is not linear, then it will go unnoticed by correlation, i.e., we may have A causing B but Corr(A, B) = 0. I am trying to find good examples to illustrate this but am not coming up with much. I can think of Hooke's law, where data pairs (x, kx^2) would have zero correlation. Is this an "effective" way of illustrating the point that causation does not imply (nonzero) correlation? Any other examples?

Here's a nice figure with some examples illustrating your point:

https://janhove.github.io/teaching/2016/11/21/what-correlations-look-like
 
  • Like
Likes Klystron, Jarvis323, jim mcnamara and 1 other person