Causality: B. The Problem of Formalization (Column 461)
With God’s help
Disclaimer: This post was translated from Hebrew using AI (ChatGPT 5 Thinking), so there may be inaccuracies or nuances lost. If something seems unclear, please refer to the Hebrew original or contact us for clarification.
In the previous column I surveyed the question of causality in broad strokes. Among other things, I argued there that the principle of causality has no empirical source, and that the causal relation has no mathematical-logical representation at all. In the first comment to that column, Roi Yozebitch linked to an interview he conducted with Prof. Judea Pearl, a computer scientist, in which they discuss an article Pearl published that deals precisely with this point. His main claim there is that probability—indeed even conditional probability and a full Bayesian picture—cannot represent a causal relation. He then proposes a formalism that can represent causality. In this column I will use his model to sharpen the claims I made last time.
Probabilistic measures for causality
Probability offers a quantitative estimate of the weight of different possibilities under conditions that include uncertainty. For example, when you throw a die there are six possible outcomes, and therefore there is uncertainty about the expected result. Probability theory gives us a quantitative estimate for each of the possibilities and, in the case of a fair die, the weight is 1/6 for every outcome. Probability as such, of course, has nothing to do with causality. If we wish to get closer to notions of causality and relations between events, we can speak about correlation.
Correlation between two events A and B measures the connection between them. If there is no connection between the events, then the occurrence of one does not affect the other, which means the probabilities are independent. The probability of a compound event, i.e., the probability that both A and B occurred, is denoted P(A∗B). When there is no dependence between the events, the following equality holds:
P(A∗B) = P(A)P(B).
When there is a connection between the events, the equality does not hold, and the strength of the connection is determined by the ratio between the two sides of this equation (the farther apart they are, the stronger the connection).
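As a rough illustration of this product rule, here is a minimal Python sketch. The two-dice setup and the event names are my own assumptions, chosen only to make the arithmetic concrete; they do not appear in the column.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair dice, every ordered pair equally likely.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def joint(e1, e2):
    """P(e1 * e2): probability that both events occur."""
    return prob(lambda o: e1(o) and e2(o))

first_is_six = lambda o: o[0] == 6
second_is_six = lambda o: o[1] == 6
sum_at_least_ten = lambda o: o[0] + o[1] >= 10

# Unconnected events: the two sides of the equality coincide (ratio 1).
independent_ratio = joint(first_is_six, second_is_six) / (
    prob(first_is_six) * prob(second_is_six)
)

# Connected events: the ratio between the two sides departs from 1.
dependent_ratio = joint(first_is_six, sum_at_least_ten) / (
    prob(first_is_six) * prob(sum_at_least_ten)
)

print(independent_ratio, dependent_ratio)  # 1 3
```

The ratio says only that the events are connected. It is symmetric in the two events, so it cannot say which one, if either, brings about the other.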
If we wish to approach a description of a causal relation more closely, it is customary to employ conditional probability and Bayes’ formula. In Column 402 I defined conditional probability and presented Bayes’ formula (see also Columns 144–145 and elsewhere). Briefly, conditional probability describes a change in probability when additional information is introduced. For example, the probability of obtaining a 5 on a fair die is P(A) = 1/6. But if we also know that the outcome was greater than 1 (call this information B), then the probability of having gotten a 5 increases, of course: P(A∣B) = 1/5. Translation: the probability of obtaining a 5 given that the outcome is greater than 1 is 1/5. If we know that the result was odd (call this C), then the conditional probability of getting a 5 in that case is P(A∣C) = 1/3. And yet it is clear that conditional probability is not necessarily causality. It is hard to say that the fact that the result is odd is the cause of getting a 5 (and not only because in this case the conditional probability is not 1).
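The die figures can be checked by direct counting. The following is a small sketch; the helper `prob` and the predicate names are illustrative choices of mine, not anything from the column.

```python
from fractions import Fraction

faces = range(1, 7)  # a fair six-sided die

def prob(event, given=lambda face: True):
    """P(event | given), computed by counting equally likely faces."""
    conditioned = [f for f in faces if given(f)]
    return Fraction(sum(1 for f in conditioned if event(f)), len(conditioned))

is_five = lambda f: f == 5           # event A
greater_than_one = lambda f: f > 1   # information B
is_odd = lambda f: f % 2 == 1        # information C

p_a = prob(is_five)                            # P(A)
p_a_given_b = prob(is_five, greater_than_one)  # P(A|B)
p_a_given_c = prob(is_five, is_odd)            # P(A|C)

print(p_a, p_a_given_b, p_a_given_c)  # 1/6 1/5 1/3
```

Conditioning here only shrinks the set of faces being counted. Nothing in the computation says that oddness produces the 5.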
You will sometimes hear or read the claim that “correlation is not causation” (as noted last time), but that conditional probability is tied to causation. Thus, for example, there is a causal connection between rain (A) and clouds (B), which means that the overall probability of rain P(A) is smaller than the probability of rain given that there were clouds, P(A∣B). True, even if there are clouds it doesn’t necessarily rain, and therefore the conditional probability is not 1. What then will you say about the reverse conditional probability, P(B∣A)? If there is rain then clearly there are clouds. Therefore P(B∣A) = 1. But no one would say that the rain causes the clouds (it’s merely an indication of them; cf. the fat-and-diet example in the previous column). Logically we would say that rain is a sufficient condition for clouds, but not their cause. This reflects another claim we encountered earlier: logic is indifferent to the time axis, and now we see that probability is as well. The probabilistic formula is indifferent to the question of what is cause and what is effect; it only tests their connection. In other words, even conditional probability is not a good measure of causality. In the previous example (with the die) we also saw that conditional probability does not necessarily express a causal connection.
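To make the asymmetry concrete, here is a toy sketch with an invented joint distribution over clouds and rain. The numbers are assumptions chosen for illustration only. The one constraint taken from the text is that rain never occurs without clouds.

```python
from fractions import Fraction

# Hypothetical joint distribution, keyed by (clouds, rain).
# The numbers are invented; only "no rain without clouds" comes from the text.
joint = {
    (True, True): Fraction(2, 10),
    (True, False): Fraction(3, 10),
    (False, True): Fraction(0),
    (False, False): Fraction(5, 10),
}

def marginal(index, value):
    """Sum the joint probabilities where coordinate `index` equals `value`."""
    return sum(p for key, p in joint.items() if key[index] == value)

p_clouds = marginal(0, True)  # P(B)
p_rain = marginal(1, True)    # P(A)

p_rain_given_clouds = joint[(True, True)] / p_clouds  # P(A|B)
p_clouds_given_rain = joint[(True, True)] / p_rain    # P(B|A)

print(p_rain, p_rain_given_clouds, p_clouds_given_rain)  # 1/5 2/5 1
```

Both conditional probabilities are read off the same table. The larger one, P(B∣A) = 1, points from rain to clouds, the direction in which nobody would claim causation, and nothing in the table marks either direction as the causal one.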
To be sure, a causal connection between events can be described in terms of conditional probability. For example, suppose that striking a match (A) necessarily (i.e., it is a sufficient condition) ignites a fire (B). Then the conditional probability P(B∣A) = 1. If it is not strictly necessary but still tends to ignite, then P(B∣A) will be greater than the unconditional P(B), but not equal to 1. Even in such a case there is a causal relation between the events, only it also depends on the circumstances.
Pearl’s proposal
In §4.1 of his article, Pearl discusses whether the probabilistic language plus the time axis suffices to describe a causal relation. He describes the prevailing view, according to which these two components do indeed suffice to describe causality. But, he argues—contrary to what many think—the answer is negative (exactly as I argued in the previous column).
The example he brings to sharpen the difference is this: suppose a switch X turns on two bulbs Y and Z. Assume the bulbs are at very different distances from the switch, and therefore bulb Z lights up a second before bulb Y. When we flip the switch X, after some time Z will light, and a second later Y will light. Now compare this to the following situation: the switch X turns on bulb Z, which in turn turns on bulb Y. From the logical and temporal points of view, the two situations are identical. But it is clear that from a causal point of view they are not. The first situation can be described as Y ← X → Z, whereas the second is X → Z → Y. The difference between the two situations is easily demonstrated by examining interventions: if I intervene artificially and turn bulb Z off, in the first case this will not affect Y, but in the second case it will also extinguish Y. In the first case there is no causal relation between Z and Y (only correlation); in the second there is.
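The two wirings can be sketched as tiny programs. This is my own illustrative encoding, not Pearl's notation: the function names `fork` and `chain` and the `do_z` parameter are assumptions, and the one-second delay is left out because it is the same in both cases.

```python
def fork(x, do_z=None):
    """Y <- X -> Z: the switch drives both bulbs directly."""
    z = x if do_z is None else do_z  # an intervention overrides Z's own wiring
    y = x                            # Y listens to the switch only
    return {"X": x, "Z": z, "Y": y}

def chain(x, do_z=None):
    """X -> Z -> Y: the switch drives Z, and Z drives Y."""
    z = x if do_z is None else do_z  # an intervention overrides Z's own wiring
    y = z                            # Y listens to Z only
    return {"X": x, "Z": z, "Y": y}

# Passive observation: the two wirings behave identically.
observations_match = all(fork(x) == chain(x) for x in (True, False))

# Intervention: switch on, but bulb Z is forced off.
y_in_fork = fork(True, do_z=False)["Y"]
y_in_chain = chain(True, do_z=False)["Y"]

print(observations_match, y_in_fork, y_in_chain)  # True True False
```

Only the forced override of Z separates the two models. An observer who merely watches the switch and the bulbs cannot tell them apart.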
In my terminology, his conclusion can be put this way: to describe the causal relation fully, we must add to logic (and probability) and to time also the physical element (the production of the effect). The arrows in the diagrams express this third component. Pearl uses this representation to propose a formalism that better describes the causal relation.
Without entering the mathematical details, in §3.2 Pearl proposes using a description similar to the arrow diagrams in place of standard logical and probabilistic descriptions. He defines causal functions that take account of environmental conditions (circumstances), and thus allow the degree of influence to be included as well. For example, clouds (B) do indeed produce rain (A), but as we saw one cannot mark this causally as B ⇒ A (i.e., “if B then necessarily A”), because it depends on additional circumstances (weather, the nature of the clouds, humidity, and so on). He therefore writes it as a function
A = f(B, w).
That is: there is a function f that receives as inputs the circumstances w and the clouds B, and determines whether there will be rain A. But do not be misled: this is not an ordinary mathematical function equality, because, as we saw last time, plain equality has no directionality. For Pearl this “equality” operates only one way, from right to left. This means that if we change A artificially (say, we make rain from an airplane), it does not follow that anything in the input values of the function has changed. In other words, this is not an equality but a causal entailment, the one we indicated above with an arrow.
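The one-way character of A = f(B, w) resembles assignment in a program more than equality in algebra. Here is a minimal sketch under invented assumptions: the humidity threshold inside `f` and the `do_rain` parameter are hypothetical and stand in for whatever the real circumstances would be.

```python
def f(clouds, w):
    """Hypothetical causal function: rain needs clouds and enough humidity."""
    return clouds and w["humidity"] > 0.8  # the 0.8 threshold is invented

def world(clouds, w, do_rain=None):
    """Compute rain from its inputs, unless rain is set by intervention."""
    rain = f(clouds, w) if do_rain is None else do_rain
    return {"clouds": clouds, "rain": rain}

humid = {"humidity": 0.9}

# Right to left: changing the input B changes the output A.
rain_with_clouds = world(True, humid)["rain"]
rain_without_clouds = world(False, humid)["rain"]

# Forcing A (rain made from an airplane) leaves the input B untouched.
seeded = world(False, humid, do_rain=True)

print(rain_with_clouds, rain_without_clouds, seeded)
# True False {'clouds': False, 'rain': True}
```

Information flows from the arguments to the result and never back, and that is all the arrow encodes in this sketch.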
Two problems
I see two main problems with Pearl’s proposal. The first is that it makes no progress toward a full mathematical treatment of causality. The second is that it does not truly capture the concept of causality in its entirety. I will spell them out now.
The question that always arises for me when I encounter a logical symbolization, or any formalization, of some phenomenon is: what have we gained from this notation? Up to now we said there is a causal relation between A and B. Pearl suggests that, instead of saying “there is a causal relation,” we symbolize it with an arrow or a one-way function. With ordinary mathematical function notation the gain is clear: in other domains we have accumulated a great deal of knowledge about what functions mean and how to use them, so there is obvious logic in describing natural phenomena using mathematical functions. But here we merely translated a Hebrew word into a semi-mathematical symbol, and I don’t see what we gained from the whole story. Pearl’s preliminary analysis does have value, of course, since it shows that standard mathematical and logical notation does not cope with causal phenomena. But his constructive proposal, in my view, brings little benefit to the discussion. We used to think that, instead of mathematics, we have to speak Hebrew; now we mark a Hebrew word with an arrow. This is the first problem I find in Pearl’s proposal.¹
The second problem that arises here concerns whether this formalism truly captures the concept of causality in full. If you look at the switch-and-two-bulbs example above, you will see that Pearl addresses the causal relation by way of its phenomena. He distinguishes correlation from causation through the operation of intervention (a do-operation, in his terminology). But as I noted earlier, David Hume posed a more severe philosophical problem concerning the causal relation: it has no empirical source, i.e., it cannot be identified via observed phenomena and occurrences in the world. As is known, Hume argued that even if one sees a kick cause a ball to fly or clouds (under given circumstances w) cause rain, we have no empirical way to ascertain that what we have here is a relation of production rather than mere correlation. This is a philosophical-scientific problem, not a problem of mathematical-logical symbolization. Even if Pearl’s notation solves the formal problem, it leaves the scientific problem in place. We have no way to express, in mathematical terms, the claim that there is between A and B a relation of production, that the one brings about the other. At most we can express the fact that a change in A will change B, i.e., the connection between the phenomena. But that still belongs to the phenomenological plane and not to the essential one. The fact of the connection is phenomenological (concerned with phenomena and their relations), but it does not touch causality in its essential philosophical sense.
In other words, the debate with Hume about the nature of the causal relation will not be settled by the formal tools Pearl proposes, and it is doubtful that it can be settled in such a way at all. Those tools express a difference between correlation and phenomenological causality, but they do not touch essential causality. There may be no formal way to treat essential causality. What science and mathematics can do is at most treat phenomena and occurrences in the world and the logical-temporal connections between them. The causality that produces those connections (in Kantian terms: the noumenon, as opposed to the phenomenon) belongs to metaphysics and is therefore not accessible to observation, science, or mathematical formalization.
To see this, consider the law of gravitation. This law describes the motion of a body of mass m under the influence of another body of mass M at a distance r from it. The law of gravitation describes the acceleration of the body m given the other mass M (which is the factor B) under the given circumstances (the distance r, and the absence of other masses and influences). But one cannot derive from it the existence of a gravitational force that produces that motion. The motion of m is correlated with the presence of mass M, and that is a description on the phenomenological plane. The claim that a force is the causal producer of the motion, and that the equation linking the acceleration with the data (M and r) even hints at its existence: none of this follows.² The claim that there is a force producing this acceleration is our interpretation of the law of gravitation, just as the claim that force causes acceleration and not the other way round is our interpretation of Newton’s second law (see the previous column).
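For reference, the standard Newtonian form of the law described here can be written as follows. The column does not write the formula out, so the symbol G (the gravitational constant) and the explicit equations are supplied by me as textbook background.

```latex
a = \frac{G M}{r^{2}}, \qquad F \equiv m a = \frac{G M m}{r^{2}}
```

The first relation links only the acceleration a to the data M and r. The force F enters in the second relation, where it is introduced as m times a, which is the step the text describes as interpretation rather than observation.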
My second remark is not necessarily a critique of Pearl’s proposal. I meant only to sharpen the difference between the planes (phenomenological and essential) and to highlight the additional element in the causal relation beyond logic and the time axis—the element I called above the physical (productive) component. My claim is that a mathematical and logical formalism, including Pearl’s, does not succeed in describing that physical dimension of causality.
Summary
I have shown here that there is a fundamental problem in any logical or mathematical formalization of the causal relation. This is not an argument against one or another concept of causality. On the contrary: my claim is that my concept is not to be dismissed merely because it cannot be formalized. It suffers from a lack of formalizability, whereas Pearl’s concept may be formalizable, but that does not refute mine. It merely shows that there is something about causality that does not belong to the scientific-mathematical stratum, but rather to the philosophical stratum. When I say that a force is the cause of acceleration, that is an interpretation and not a scientific statement. It does not arise from the equation but from our interpretation of it.
Science can manage without that interpretation. A scientist wants to know whether a force is acting or not, and according to Newton’s second law the existence of acceleration proves that a force is acting, even though the acceleration is not the cause of the force (on the prevailing interpretation it is the force that causes the acceleration). A scientist wants to predict an earthquake, and for this it suffices to find correlations and Pearl-style causality; that is, to find what is expected to appear before the quake and can predict its occurrence. Whether that thing is the cause of the quake or not is a question of interpretation and therefore not pure science but rather philosophy. Science gets along without it. Think of Pearl’s bulbs example: a scientist will want to predict whether bulb Y will light, and for this it suffices to know the antecedent conditions, even if these are not causes in the full physical-philosophical sense. But I do not see this, by itself, as an explanation for the bulb’s lighting. An explanation for some event requires pointing to the cause of its occurrence. Therefore, to speak of an explanation for the bulb’s lighting, we must add the causal interpretation to the empirically found correlations that we have formalized.
Links mentioned
Pearl, “Causality and…” (reprint): https://ftp.cs.ucla.edu/pub/stat_ser/r284-reprint.pdf
¹ It should be noted that Pearl is a computer scientist, and his explicit goal is to enable computers to handle causal relations. He is not proposing philosophical or scientific solutions to the kind of issue I raised here. In that sense, his proposal should be judged by outcomes. Perhaps such a notation is more accessible for logical encoding and formalization for a computer. From this angle, the problem I described may not constitute a critique of his proposal at all; for our purposes, however, this analysis is important for sharpening my own claims about the causal relation and the role of the third component (production—what I called in the previous column “the physical component”).
² In mechanics we do speak of gravitational force, but that is at most a definition. Instead of saying that a mass M causes a mass m to accelerate at a, we say that M exerts a force of magnitude ma on m. We have no way to know that a force is acting except via the fact that an acceleration develops. The force itself is not accessible to observation. I have often noted that discovery of gravitons might change the situation, as photons express the electromagnetic force (field). But the existence of gravitons would also be an interpretation of the equation, and even if it were confirmed empirically, it would not follow from Newton’s law of gravitation.
Why do you choose to use the causal interpretation as condition 3 and not the “intervention test” that you mentioned above? Isn't it more intuitive? In addition, doesn't the “causal interpretation” have the problem of “completing a condition without a condition”? The statement is that in order for something to be called a cause, there must be a “causing” connection, but this concept is no better defined (certainly not empirically or even in the form of a thought experiment) than the concept of cause. Thanks
The intervention test is phenomenological. David Hume does not accept it as an indication of causation. Therefore, I argue that it is a good test on the phenomenological level, but philosophically it does not contain the dimension of causation. I think that this concept is well understood by all of us, otherwise this discussion would not be taking place. If you are looking for a phenomenological definition for it, then you have thrown the baby out with the bathwater, because my entire argument is that it does not exist and yet it exists.
I buy the central argument that emerges from the column, according to which formalization has a price (it is “blind” to physical and metaphysical meanings or essences, for example, causality).
At the same time, I assume you will agree with me that it also has a benefit. The benefit is, of course, the ability to create a science that can be communicated among human beings like us. Imagine a possible world in which there are beings who do not share with us the ability to create formal systems (mathematical symbols, etc.), but who have the intuitive ability to directly grasp abstract essences such as causality.
Now the question is what kind of world and what kind of “history”, if any, will such beings have?
It is hard to imagine that they could even begin to create some kind of joint creation (material, spiritual or intellectual). It is possible that for them “everything” is common (since they directly perceive the same objects). In fact, it seems that even from a value and motivational point of view, they will have nothing that would push them to act together (and perhaps even to act “just”).
I understand that this question takes the subject in a new direction that you may not be interested in in this context. For your consideration.
Totally agree.
The discussion about the world those beings would have doesn't seem useful to me. The question is what other means do they have to communicate with each other instead of our formalization. Think about people who don't have formal capacity today, or people from four thousand years ago. Didn't they have a history or an understanding of causality? In short, a discussion that we don't have the tools to have.
“And the causal relationship has no mathematical-logical representation at all”
The derivative with respect to time represents development in time, and from the expression we can derive what we want to call cause and effect.
Everything else is a pure illusion.
Thanks for the interesting article and for its predecessor. From what the Rabbi commented on the lack of independence of logic in time, I became more and more interested in the phrase ‘only if’ for logical notation.
And when the Rabbi wrote that there is no benefit here from formalization, I immediately threw myself into the –
The author said: But their knowledge of it is like our knowledge. The philosopher defined it because it is the beginning and the cause in which the thing is situated and moved, which is in it by itself and not by chance.
The Khazari said: As if he were to say, because the thing that moves of itself and rests of itself has a cause, in it rests and moves, and that cause is – it is nature.
A. The author said: This is what she wanted to say with great precision and accuracy and to distinguish between what works by chance from what works by nature, and the things that amaze the hearers, but the benefit of knowing them in nature – this is it.