Q&A: On Fuzzy Distinctions, Causality, and Simpson's Paradox

Originally published: September 21, 2016

This is an English translation (via GPT-5.4). Read the original Hebrew version.

Question

An interesting post about binary differences versus statistical differences, and the implications for the male brain and the female brain.

I’d be happy to hear your opinion.

Answer

Indeed, there is a common mistake regarding fuzzy distinctions, such as the difference between Religious Zionist and Haredi, or between left and right. Interesting and typical. Notice the cat and dog pictures there, which really do look similar. I have a feeling that the difference there is not only statistical. That is, in my opinion, if we change the proportions of similarity and difference, we will not always end up with a dog and a cat, even if we preserve most of the characteristics of each. There is something essentialist here beyond the collection of characteristics, except that it is hard to capture the simple visual form and translate it into words. Sometimes (and perhaps always) talk about a collection of characteristics reflects only a limitation in our ability to define, not in the things themselves.

See another very interesting post of his.
Truth be told, I don’t agree with what he says there. If I understood correctly, those tables simply reflect careless sampling. They do not represent groups of people who were chosen randomly, but rather those who agreed to take the drug. If you look, the two tables do not add up to the general one (after all, there are 40 total, and in each sub-table there are also 40). The men are distributed differently from the women (10 took it and 30 didn’t, and vice versa). In short, the whole thing seems to me nothing more than misleading, and the distinction between causality and correlation remains intact. Moreover, even according to his own approach, when you bring the story into this, you are manually introducing causality. It still does not arise from the statistical data themselves.
The cartoon there is gorgeous. Seemingly there are two parallel planes on which the relation between correlation and causality is examined: in the concepts themselves, and in learning (whether or not it causally changed the position). But in my opinion, if you pay attention, there is a relation of causal entailment on three different planes, not just two: the entailment in the concept of causality itself (a cause entails an effect); the causal entailment between correlation and causality (correlation entails causality); and the entailment between learning statistics and the distinction between correlation and causality. Wonderful. About each of them a question arises. Is there causal entailment at all, or is everything only correlation (the dilemma of David Hume)? Does correlation express causality, or is that only a correlation between them (that is, a correlation between correlation and causality)? And did the learning causally bring about the change in perception, or is that only a correlation between them?
——————————————————————————————
Questioner (another one):
By the way, I looked at the overall table, and it has 80 total, and each sub-table has 40, so it seems to add up fine.
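
The bookkeeping discussed here can be checked directly. The sketch below uses illustrative counts, which are an assumption chosen to match the figures mentioned in this thread (80 people overall, 40 in each sub-table, 10 versus 30 takers in the two sexes); they are not copied from the linked post, and which sex has 30 takers is also an assumption. The names `men`, `women`, `combine`, `total`, and `recovery_rate` are hypothetical.

```python
from fractions import Fraction

# Hypothetical counts per treatment arm: (recovered, not_recovered).
# Assumption: 30 men took the drug and 10 did not; the reverse for women.
men = {"drug": (18, 12), "no_drug": (7, 3)}
women = {"drug": (2, 8), "no_drug": (9, 21)}


def combine(a, b):
    """Add two sub-tables cell by cell to get the pooled table."""
    return {arm: (a[arm][0] + b[arm][0], a[arm][1] + b[arm][1]) for arm in a}


def total(table):
    """Number of people in a table."""
    return sum(recovered + not_recovered for recovered, not_recovered in table.values())


def recovery_rate(table, arm):
    """Exact fraction of people in the given arm who recovered."""
    recovered, not_recovered = table[arm]
    return Fraction(recovered, recovered + not_recovered)


overall = combine(men, women)

for name, table in [("men", men), ("women", women), ("overall", overall)]:
    print(
        name,
        "n =", total(table),
        "drug:", recovery_rate(table, "drug"),
        "no drug:", recovery_rate(table, "no_drug"),
    )
```

With these counts the two sub-tables of 40 do sum to a pooled table of 80, the drug arm has the lower recovery rate within each sex, and the drug arm has the higher recovery rate in the pooled table.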
——————————————————————————————
Rabbi:
I probably made a mistake because I was going too fast. Thanks.
——————————————————————————————
Questioner:
1. I don’t think the issue there is the agreement to take drugs. It doesn’t play any significant role in the arguments he presents. For that matter, the same post could have been written if the sampling had been random.

2. His question, whether there are additional variables that would reverse the picture if we knew them, is of course an excellent one, and it has no answer. It is not that we have not found the answer; the answer does not exist (unless we have the complete equation of physics or something that grandiose).
My lecturer in the relevant statistics course, when teaching Simpson’s paradox, said explicitly that there is no mathematical or statistical way to neutralize Simpson’s paradox. We need to think carefully about all the variables we manage to control for, and that’s the best we have.

3. I have no idea what he is talking about regarding the “best statisticians” who struggled over Simpson’s paradox. There is no struggle here at all. When there is a case of Simpson’s paradox as described in the post, the answer is unequivocal: you never give the drug, even when you do not know the sex of the patient in front of you.
In my opinion the formal reasoning is as follows. Let the patient’s sex be denoted by G, where G = male or G = female. It is an unknown variable, but of course it is given and fixed relative to the experiment. Then we know that for every G,
P(E|C,G) < P(E|~C,G)
That is, his mistake in my opinion is that there is a hidden conditioning here on additional variables, which he simply did not write down. When you give the patient the drug and examine the results, that experiment is conditioned on the patient’s sex, and therefore the notation P(E|C) is sloppy (though accepted, and there is no way to avoid it).
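
The step from the per-sex inequality to the decision for a patient of unknown sex can be written out as follows. This is a sketch under one stated assumption: the patient’s sex G is fixed independently of our decision to give the drug, so both options are averaged with the same weights P(G = g). The observed pooled rates, by contrast, weight each sex by who actually took the drug.

```latex
% Observed pooled rates: each sex is weighted by its share within the arm.
P(E \mid C) = \sum_{g} P(E \mid C, G = g)\, P(G = g \mid C),
\qquad
P(E \mid \neg C) = \sum_{g} P(E \mid \neg C, G = g)\, P(G = g \mid \neg C).

% Decision for a new patient of unknown sex (assumption: G does not depend
% on our choice), so both sides use the same weights P(G = g):
\sum_{g} P(E \mid C, G = g)\, P(G = g)
\;<\;
\sum_{g} P(E \mid \neg C, G = g)\, P(G = g),

% which holds term by term because
% P(E \mid C, G = g) < P(E \mid \neg C, G = g) for every g.
```

Because the weights P(G = g | C) and P(G = g | ~C) differ, the pooled comparison can point the other way without contradicting the per-sex inequality.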
——————————————————————————————
Rabbi:
That is exactly what I wrote. He presents it as though there were some paradox here or a refutation of the distinction between correlation and causality. But the truth is that there are simply more hidden variables here (agreement to take drugs, or any other variable; it doesn’t matter which). Therefore the claim that correlation is not causality remains intact. And in general, I didn’t understand the nature of this paradox, and why it is a paradox. All it says is that there are more variables, or that the sample was not taken carefully (without claiming that it was possible to be more careful; usually we do not know what the relevant variables are, as you wrote). What is paradoxical about that?
——————————————————————————————
Questioner:
If I understood correctly, the paradox he presents is the question of whether to give a drug to a person whose sex is unknown to you. In my opinion there is no paradox here at all, because the answer is no—you don’t give the drug to anyone.

The point about causality is correct in general, in my opinion. If I understand correctly, Judea Pearl’s point is probably this: if we knew the story behind the formation of the correlation before us, we could avoid falling into Simpson’s paradox in the first place, because we would know that sex has an effect (for example).

More basically: before us is a pile of data that links a collection of sampled individuals X (for example, we have n people X1, X2, …, Xn) to some parameter Y (for example, if Yi = 0 then Xi is healthy, and if Yi = 1 then Xi is sick).
We want to divide X into two subsets A and B (where B is the complementary group of A within X) and test the hypothesis whether belonging to group A is connected to health (for example, A is the group of people who took some drug). Suppose we found such a connection, and we are happy to conclude that belonging to group A is indeed connected to health.
The problem facing us is whether the division into A and B is justified. Perhaps if we add another division according to another parameter (for example, divide A into men/women and also B into men/women), we will get different answers.
If so, says Judea Pearl, it is worthwhile for us to understand the story underlying the connection between A and health. If we understand the story that describes the causality, we can decide whether to attribute the health to belonging to group A, or whether this is a case of Simpson’s paradox and in fact we need to add the division into men/women in order to “really understand” why the members of A are healthier.

In fact, mathematically speaking, very often we can find some additional arbitrary division that will reverse the trend. Men/women is simply a division we are used to and that is probably sensible in various situations, but statistics is blind to prior common sense. An arbitrary subgroup that reverses the trend can usually be found. So apparently the story plays an important role, so that we can decide which sub-division it is justified to take into account.
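
The claim about arbitrary divisions can be illustrated with a brute-force search. The sketch below is not from the linked post: the function names `rate` and `find_reversing_split` and the pooled counts are hypothetical, and the search assumes the pooled rates of A and B differ. It looks for any split of the sample into two subgroups in which the A-versus-B comparison is opposite to the pooled comparison in both subgroups.

```python
from fractions import Fraction
from itertools import product


def rate(recovered, not_recovered):
    """Exact recovery rate, or None for an empty cell."""
    total = recovered + not_recovered
    return Fraction(recovered, total) if total else None


def find_reversing_split(a, b):
    """a, b are (recovered, not_recovered) for group A and its complement B.

    Returns two subgroups, each as ((A recovered, A not), (B recovered, B not)),
    in which the A-vs-B comparison is the opposite of the pooled one,
    or None if no such split exists.
    """
    a_better_pooled = rate(*a) > rate(*b)
    ranges = (range(a[0] + 1), range(a[1] + 1), range(b[0] + 1), range(b[1] + 1))
    for a1, a0, b1, b0 in product(*ranges):
        first = ((a1, a0), (b1, b0))
        second = ((a[0] - a1, a[1] - a0), (b[0] - b1, b[1] - b0))
        rates = [(rate(*ga), rate(*gb)) for ga, gb in (first, second)]
        if any(r is None for pair in rates for r in pair):
            continue  # every subgroup must contain members of both A and B
        if a_better_pooled:
            reversed_everywhere = all(ra < rb for ra, rb in rates)
        else:
            reversed_everywhere = all(ra > rb for ra, rb in rates)
        if reversed_everywhere:
            return first, second
    return None


# Hypothetical pooled counts: A recovers at 50%, B at 40%.
split = find_reversing_split((20, 20), (16, 24))
print(split)

# A table with no reversing split: everyone in A recovers, no one in B does.
print(find_reversing_split((5, 0), (0, 5)))
```

The subgroups found this way have no meaning beyond the arithmetic, which is the point made above: the data alone do not say which sub-division deserves to be taken into account. The second call shows that a reversing split is not guaranteed for every table.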
——————————————————————————————
Rabbi:
That is completely clear, but it seems trivial to me. If you know the story, then you know the causality, but that is exactly the problem: the correlations do not give you the story. So what is the novelty here?
As for the paradox too, I completely agree. Don’t give the drug, because there are additional factors that influence the matter and we have not identified them. Therefore our statistical picture is clearly incomplete and it is not right to rely on it. Again, I really do not understand what the novelty is here.
