The Halakhic Significance of Multiplying Probabilities: C. Conditional Probability and an Undermined Majority (Column 614)
With God’s help
Disclaimer: This post was translated from Hebrew using AI (ChatGPT 5 Thinking), so there may be inaccuracies or nuances lost. If something seems unclear, please refer to the Hebrew original or contact us for clarification.
In the last two columns I discussed products of probabilities. In the previous column I highlighted the importance of dependence between events for calculating their joint probability. This topic has come up here more than once (search for “law of total probability” or “Bayes’ formula”). In most of those columns I dealt with the confusion between two conditional probabilities: P(A/B) and P(B/A). Here I would like to take the opportunity to examine a few examples in which people confuse the absolute probability P(A) with the conditional probability[1] P(A/B). In the end we shall see that the same assumption discussed in the last two columns enters here as well: in most cases the halakhic “majority” is negative and not positive. We will also see that the tools used here (conditional probability, products of probabilities, a possibility tree) are similar to those used in the previous columns, and we will present a visual representation of this phenomenon.
Reminder: conditional probability
The probability of some event determines the chance that it will occur. We can set such a probability when we have a sample space and a distribution. For example, in rolling a die, the sample space contains six events—the six possible outcomes of the roll. The distribution depends on whether the die is fair or not. With a fair die, the chance of each outcome is equal, and since the probabilities of all outcomes always sum to 1, the chance of each is 1/6. Hence, if I ask for the chance of getting a 5 when rolling a fair die, the answer is 1/6.
But one can ask: what is the chance of getting a 5 when we have additional information—for example, if it is known that the outcome is odd? Here the answer will be 1/3 (since there are three odd outcomes and they are equiprobable). If it is known that the outcome was even, the probability is of course 0 (since 5 is not even). This means that the event “5” depends on the event “odd outcome” or “even outcome.” They are not independent.
We denote these quantities as follows:
The chance of event A is P(A). The chance of event B is P(B). The chance of A given that B occurred is P(A/B). Suppose A is the outcome 5, and B is the event “odd outcome.” Then we have: P(A) = 1/6, P(B) = 1/2, and of course P(A/B) = 1/3. In this case, the absolute probability P(A) equals the product P(A/B)*P(B). But that is not always the situation. To understand why, let us define event C as “an even outcome.” Its probability is P(C) = 1/2, and it is the complement of B (namely ~B). Since 5 is not even, we also have P(A/~B) = 0, and this is what makes the simple product work here.
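The four quantities just defined can be checked by direct counting. The following is a minimal sketch, assuming a fair die; the helper name `prob` and the predicate names are illustrative and do not come from the column.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # fair die: six equally likely outcomes


def prob(event, given=lambda x: True):
    """P(event / given), computed by counting equally likely outcomes."""
    space = [x for x in OUTCOMES if given(x)]       # restricted sample space
    favorable = [x for x in space if event(x)]      # outcomes where the event holds
    return Fraction(len(favorable), len(space))


is_five = lambda x: x == 5
is_odd = lambda x: x % 2 == 1
is_even = lambda x: x % 2 == 0

p_a = prob(is_five)                              # P(A)
p_b = prob(is_odd)                               # P(B)
p_a_given_b = prob(is_five, given=is_odd)        # P(A/B)
p_a_given_not_b = prob(is_five, given=is_even)   # P(A/~B)

print(p_a, p_b, p_a_given_b, p_a_given_not_b)    # 1/6 1/2 1/3 0
```

Because P(A/~B) is 0 here, the product `p_a_given_b * p_b` alone already reproduces `p_a`.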
Conditional probability within a possibility tree
You can describe the chance of getting a 5 using a possibility tree, as we saw in the last two columns:
The chance of getting a 5 is composed of two possibilities (the leftmost branch and the second from the right):
- On the right branch, in order to get a 5 you must first land on an odd outcome (probability 1/2), and then the conditional probability of getting a 5 is 1/3. Therefore the chance of 5 is 1/6 (the product along the right path: 1/2*1/3=1/6).
- On the left branch, first an even outcome occurs (probability 1/2), and then the conditional probability of getting a 5 is 0. Hence the chance along that path is 0 (product along the left path: 1/2*0=0).
Along any such path, the first event appears at the top of the tree, and every later step must be taken as a conditional probability. We first ask what the chance of an odd (or even) outcome is, and then multiply it by the conditional chance of getting a 5 given that the outcome is odd (or even). The absolute probability of getting a 5 is the sum of these two possibilities. More generally, we can write here the law of total probability:
P(A) = P(A/B)*P(B) + P(A/~B)*P(~B)
This formula describes the absolute probability of getting a 5 as the sum of two possibilities. The product we saw above follows because in this case the second addend is 0. That is not always so.
Another way to describe the mechanism of statistical dependence is by restricting the sample space. An odd or even outcome restricts the relevant sample space (those outcomes that can occur), and therefore the probability of the event we are interested in changes. This can be depicted as follows:
The space is the entire ellipse, which contains six events. If we consider the whole space, the chance of getting a 5 is 1/6. But when I know that the result is odd, I am effectively restricting the space by the thin boundary line in the figure, so that we are in the portion of the ellipse to its left. Likewise, an even outcome restricts us to the right side of the ellipse. Once our space is restricted, it contains fewer events, and so the chance of the event we care about (5) changes. On the left it is 1/3 (larger than in the full space), and on the right it is 0 (smaller than in the full space). The chance of getting a 5 on the right or left side is a conditional probability. For example: given that we restrict ourselves to the left side, what is the chance of 5? 1/3. The law of total probability tells us that a 5 can arise in two ways: either we are on the right (probability 1/2), in which case the conditional probability of a 5 is 0, or we are on the left (probability 1/2), in which case the conditional probability is 1/3. Therefore the absolute probability 1/6 can be presented as the sum of two products, 1/2*0 + 1/2*1/3, each representing the option of landing on one side, as we saw above.
Of course, one could also partition the space in other ways—for example, when event B is “the outcome is ≥5,” or “the outcome is divisible by 4,” and so on. Each such partition yields different conditional probabilities on its two sides, but the absolute probability will always be the sum of the products of the different possibilities, and there we will always get 1/6.
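To see that every partition gives back the same absolute probability, here is a small sketch that applies the law of total probability to the three partitions mentioned (odd, at least 5, divisible by 4). It assumes a fair die; the function names `cond` and `total_probability` are illustrative, and the code assumes neither side of the partition is empty.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # fair die


def cond(event, condition):
    """P(event / condition) by counting; assumes the condition is not empty."""
    space = [x for x in OUTCOMES if condition(x)]
    return Fraction(sum(1 for x in space if event(x)), len(space))


def total_probability(event, b):
    """P(A) = P(A/B)*P(B) + P(A/~B)*P(~B) for the partition {B, ~B}."""
    not_b = lambda x: not b(x)
    p_b = cond(b, lambda x: True)
    return cond(event, b) * p_b + cond(event, not_b) * (1 - p_b)


is_five = lambda x: x == 5
partitions = {
    "odd": lambda x: x % 2 == 1,
    "at least 5": lambda x: x >= 5,
    "divisible by 4": lambda x: x % 4 == 0,
}

results = {name: total_probability(is_five, b) for name, b in partitions.items()}
for name, p in results.items():
    print(f"partition by '{name}': P(5) = {p}")
```

The conditional probabilities on each side differ from partition to partition (1/3 and 0, then 1/2 and 0, then 0 and 1/5), yet the weighted sum is 1/6 every time.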
Example: “The presumption that what is under a person’s control is his”
Halakha rules that if there is litigation between two parties, the burden of proof lies with the claimant: “He who seeks to extract (money) from another bears the burden of proof” (hamotzi mechaveiro alav ha-ra’ayah). The Talmud, Bava Kamma 46b, discusses the source of this rule:
R. Shmuel bar Naḥmani: From where do we know that he who seeks to extract bears the burden of proof? As it is stated, “Whoever has a case should approach them”—that is, he should present proof to them. Rav Ashi challenged: Why do I need a verse? It is logic: “He whose pain hurts goes to the physician.”
Initially a verse is brought, but ultimately it is set aside because it can be derived from logic: the one who hurts goes to the doctor. Simply put, this is a legal, not a probabilistic, claim: the plaintiff wants the court to act, i.e., to extract money from the defendant and give it to him. By contrast, the defendant does not want anything from the court (he wants the court to do nothing so the money remains with him). Therefore the burden of proof lies with the plaintiff: if you want the court to act, you must provide it a good reason to do so. In the absence of a reason, the court will not act. This is an advantage granted to the “possessor” (muchzak) over the claimant. In any case, whether based on a verse or on logic, it seems that this rule does not reflect a factual estimate (that it is really more likely that the defendant is right), but rather a legal rule.
It is sometimes claimed (see, for example, Encyclopedia Talmudit, s.v. “Ḥezkat Metaltelin”) that behind the logic and the verse lies a probabilistic principle based on the rule: “There is a presumption (ḥazakah) that whatever is under a person’s hand is his.” This phrasing does not appear in the Talmud itself but only in the Rishonim, yet many identify it with the rule “the extractor bears the burden,” i.e., with the concept of possession (see a survey here). But this identification is problematic: if it is truly a probabilistic presumption, why do we need a verse, and why didn’t the Gemara present this probabilistic reasoning instead of a legal rationale? Moreover, the legal rationale leaves the matter in doubt and does not determine who is right, to the extent that later authorities discuss whether one who acquired money by virtue of possession is truly considered its owner for all purposes. But if this principle is based on probabilistic evidence, namely on the presumption that what is under one’s hand is his, then there is positive evidence in favor of the defendant, and this doubt should not arise.
One can distinguish between a monetary claim and a claim to a specific object. When someone sues me for money I borrowed or for damages I owe and I deny it, there is no specific object about which we argue. Thus it is hard here to speak about an advantage to the claim of the one who currently holds the money. The borrower/damager does indeed possess the money, but it is hard to infer from that that the money is probably his (i.e., that he did not borrow or did not cause damage). This presumption suits objects. If the claim is to ownership of some specific object, there the presumption could arise and decide. In monetary disputes, then, it seems hard to apply this presumption. But regarding objects—do we truly not need the logic or the verse?
Explanation: conditional probability
The answer is that we do need them, even for objects. To understand why, we must return to the probabilistic question. If we survey all the objects in the world, we will discover that most are in the hands of their owners. That is, given some object before us, the probability that the one holding it is its owner is greater than the probability that someone else is its owner. Seemingly this is evidence in favor of the holder. However, this consideration is valid only if we choose an object in the world at random. In a court case between two parties concerning ownership of an object, we are dealing with an object that is under dispute. Within the subset of objects under legal dispute, I see no reason to assume that in most cases the defendant (the possessor) is right. Is there any basis to assume that plaintiffs lie more than defendants (who defend themselves)? On the contrary, there is a presumption that “a person does not sue unless he has something [to rely on]” (see Shevuot 40b), which gives weight to the assumption that the plaintiff is not lying.
The root of the fallacy again lies in conditional probability. Suppose we consider ownership of a particular object. The probability that it belongs to A, who holds it, is P(A), and for the sake of argument it equals the number of objects in the hands of their owners divided by the total number of objects in the world. If there are a million objects in the world, we assume the great majority, say 850,000, are with their owners. Therefore the chance that the object before us is with its owner is P(A) = 0.85. But the object in question was not chosen randomly from all objects in the world. It is part of a special subset of objects that are under dispute. In this subset the distribution between objects belonging to the plaintiff and to the defendant may be utterly different; that is, the chance that the holder is the owner is no longer necessarily 0.85. And as noted, it is very plausible that the distribution here is different (a priori the chance for each side is roughly equal).
In the terminology of conditional probabilities, the mere fact that the object is under dispute restricts us from the set of all objects in the world to the set of disputed objects. Simply, the picture is very similar to the ellipse shown above:
Out of 18 objects there are 4 in dispute (the shaded ones). If we did statistics on the entire ellipse, we would get P(A) = 0.85. If we did statistics on some portion of it, say on the left side of the boundary line marked there, we would get some other result. And if we did statistics on the four shaded objects, we would again get a different distribution. The problem is even worse, since the survey that showed that 85% of objects are with their owners was naturally carried out only on objects whose ownership is known and recognized, i.e., entirely on non-shaded objects, those outside the restricted domain of objects under dispute. The question is: what would the result be if we did statistics only on the four shaded objects? We have seen that there is no reason to think the result would again be 0.85; in fact, there is good reason to think it would not be. The presumptions of honesty for plaintiff and defendant are equal; therefore, a priori, the assumption that the plaintiff is lying is no more reasonable than the assumption that the defendant is lying. In principle, the probability we would get here would be something like 1/2.[2]
In conditional-probability terms, 0.85 is the probability that a randomly chosen object in the world is in its owner’s hands, P(A). But we seek the probability for an object chosen from among the disputed items (the shaded ones). Here we have a conditional probability P(A/B), where B is the fact that the object is under dispute (one of the shaded ones). The object under discussion was chosen from the subset of disputed items, and we saw that a priori this probability is probably around 1/2. It is no surprise that this differs from the overall result: there is no reason for P(A) and P(A/B) to be equal, and it is a mistake to assume that they are.
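The gap between P(A) and P(A/B) can be made concrete with a toy count. This is only a sketch: the total of one million objects and the 85% figure come from the text, but the size of the disputed subset (1,000) and the exact 50/50 split inside it are hypothetical numbers chosen for illustration.

```python
from fractions import Fraction

TOTAL = 1_000_000                # objects in the world (from the text)
HELD_BY_OWNER = 850_000          # objects in their owner's hands (from the text)
DISPUTED = 1_000                 # hypothetical size of the disputed subset
DISPUTED_HELD_BY_OWNER = 500     # hypothetical: about half, per the a priori argument

p_a = Fraction(HELD_BY_OWNER, TOTAL)                      # P(A): random object
p_b = Fraction(DISPUTED, TOTAL)                           # P(B): object is disputed
p_a_given_b = Fraction(DISPUTED_HELD_BY_OWNER, DISPUTED)  # P(A/B): disputed object
p_a_given_not_b = Fraction(HELD_BY_OWNER - DISPUTED_HELD_BY_OWNER,
                           TOTAL - DISPUTED)              # P(A/~B): undisputed object

# The law of total probability ties the three numbers together.
recombined = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

print(float(p_a), float(p_a_given_b), float(p_a_given_not_b))
```

Under these assumed counts the overall figure stays at 0.85 because disputed objects are a tiny fraction of the whole, while the figure relevant to a court case is the conditional one.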
The conclusion is that even if one can speak of a presumption that “what is under a person’s hand is his,” that can be true only in a legal sense. This presumption has no probabilistic meaning. Legally, we may adopt a presumption that if an object is in someone’s hand it is his until proven otherwise—because of reasons like “he whose pain hurts goes to the physician” or by virtue of a verse—but that is a legal presumption, not a probabilistic claim. Probabilistically, the distribution is equal between the sides.
Ignoring information: representative samples and scientific generalizations
The computation assumed by those who equate this presumption with possession ignores information we have (B), and instead computes absolute rather than conditional probability. But it is incorrect to rely on the overall result if we have more specific information. It is exactly like claiming that the chance of getting a 5 given that the outcome is odd is 1/6, since that is the absolute chance of getting a 5. In the die-rolling case, the conditional result increases (from 1/6 to 1/3), whereas in our case it decreases (from 0.85 to 0.5), but in any case this is an error. One must not ignore information and do statistics on the basis of partial data. We lose information and therefore our result is less accurate.
Seemingly science makes precisely the same error. We observe a few massive bodies falling toward Earth and infer a general law that all massive bodies fall toward Earth. Here, too, it looks like an inference from a partial group to the general group. But note that science at least tries to make the sample representative, that is, to ensure that the distribution within the partial group resembles that of the general group. This means we have not ignored information. If one takes a partial group that has no special characteristic beyond those of the general group, then even if we have reduced the group, we have not ignored information. The distribution in the partial group will resemble that in the full group. If we examined 5% of the world’s objects and found that 85% of them are with their owners, then it is quite reasonable to say this is also the general distribution, so long as the small group does not have some special characteristic different from those of the large group. That is a representative sample, since using it does not ignore information. But if we made the generalization of the law of gravity on the basis of a group of objects all made of wood, then the generalization that all massive objects fall to Earth would be highly problematic. We would be ignoring the information that all the objects in our sample are made of wood. Perhaps only wooden objects fall to Earth. This is a non-representative sample, since it has distinctive features (additional information that must not be ignored).
If I were to infer from the overall distribution in the entire ellipse above to the group on the left side of the boundary line drawn there, that would be legitimate, provided I have no special feature known for the objects in that subset. But the assumption that identifies possession with the presumption “what is under a person’s hand is his” makes an illegitimate leap. It uses the distribution of all objects in the ellipse to describe what is in that bounded segment. Here there is a special property to the subset we are discussing, for all the objects in this subset are under legal dispute. Such an inference ignores relevant information and therefore yields an incorrect statistical result.
Another example: The Monty Hall Problem
This is a well-known probability puzzle named after an American TV game show (hosted by Monty Hall). In the game, the contestant is shown three doors. Behind one is a valuable prize—a car—and behind each of the other two is a goat. The contestant must open a door and wins whatever is behind it. He obviously prefers the car, but he does not know which door hides it, so he must choose at random. Now there is a twist. After the contestant points to and chooses one door—say door no. 1—the host, who knows which door hides the car, opens one of the other two doors—say door no. 3—and reveals a goat (he always opens a door that he knows has a goat). The contestant is now allowed to decide whether to stick with his original choice or to change and pick the last unopened door—door no. 2. Should he switch, or does it make no difference?
Seemingly, the initial chance that the contestant chose the door with the car is 1/3 (one of three doors). In addition, the contestant knows that at least one of the two remaining doors has a goat behind it. The host opened the door with a goat because he knows what is behind each door; therefore, apparently he has not added any information for the contestant. There is always a door with a goat, and he can simply choose it. If so, the probability that the car is behind each remaining door remains 1/3. On the other hand, now the contestant confronts two doors and knows that behind one is a goat and behind the other is the car. Thus now the chance for each is 1/2. But note that this is also the chance for the door he initially chose; hence there still seems to be no reason to switch. The conclusion appears to be: no point in switching; it changes nothing.
But this is wrong. Statistical calculation shows that it is always advantageous to switch. The original choice had a 1/3 chance of winning. The host will always leave two unopened doors, with a car behind one, but he does this by opening a door that he knows in advance does not hide the car. Therefore he does add information. The door initially chosen retains its 1/3 chance, as before. Nothing has changed for it, since the host will always leave the contestant with that door and the initial chance was 1/3. But the chance for the other unopened door rises from 1/3 to 2/3, because it absorbs all the probability that the car is not behind the contestant’s original door. In this way the host has added information regarding that other door, and therefore it is best to switch. Note that the probability that the car is behind the second door is not 1/2 but 2/3: the car is behind one of the two closed doors, so their probabilities must sum to 1, and 1/3 + 1/2 does not. This is admittedly confusing.
The pitfall lies, of course, in conditional probability. At the start of the game, the probability for the remaining door is 1/3, just like for the other two. But after the host opens a door, he has given us additional information. Now the probability for the remaining door (A), given that it was left closed by the host, has risen to 2/3. Therefore we must not go by the absolute probability P(A), which, absent information about the doors, is 1/3 (just like the chance for the current door). We must use a conditional probability that takes into account the added information: P(A/Z) = 2/3, where A is the event “the car is behind the remaining door” and Z is the event “the host left that door closed.” This chance is twice that of the door already chosen. Therefore one should switch.
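For readers who find the 2/3 hard to accept, a quick Monte Carlo check may help. The sketch below simulates the game as described (the host always opens a goat door other than the contestant's pick); the function names, the number of trials, and the fixed seed are arbitrary choices for illustration, and when the host has two goat doors available it is assumed he picks between them at random.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of wins over many simulated rounds."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

The simulated frequencies should land close to 1/3 for staying and 2/3 for switching, which are the absolute and conditional probabilities discussed above.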
This is another example of confusion between absolute and conditional probability. Here, too, we can see that mixing up these probabilities amounts to ignoring information we have.
Possibility tree, sample space, and information
To analyze this case in the terms we saw above, we must describe the game using a possibility tree (as in the last two columns). We have three doors: {A, B, C}, where we call the door with the car A. Behind the other two there are goats. The chance that the contestant chooses door A is 1/3. The chance that he chooses each of the other two is also 1/3, together 2/3.
Assuming the contestant chose door A, the host will open a door with a goat and leave another door with a goat closed. In these cases it is best for the contestant to stay with his door. If the contestant chose a door with a goat (say B), then the host will open C (with probability 1) and leave A closed. The same if the contestant chooses C. In both those cases it is best for the contestant to switch.
Now we compute the probabilities. The chance for each initial choice is 1/3. If the contestant chose A, then there is a 1/2 chance for each of the two branches below it.[3] Altogether there is a 1/3*1/2*2 = 1/3 chance that he should stay with his door. If the contestant chose B or C, then altogether there is a 1/3*1 + 1/3*1 = 2/3 chance that he should switch. Therefore, in the bottom line, it is preferable to switch.
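The same computation can be carried out exactly by walking the possibility tree. This is a sketch under the assumptions stated in the text: the car is behind door A, each first choice has probability 1/3, and when the contestant has chosen A the host opens B or C with probability 1/2 each. The generator name `leaves` is illustrative.

```python
from fractions import Fraction

DOORS = ("A", "B", "C")
CAR = "A"  # as in the text, the door hiding the car is called A


def leaves():
    """Yield (path probability, contestant's pick, door the host opens)."""
    for pick in DOORS:
        p_pick = Fraction(1, 3)
        # The host may open any goat door other than the contestant's pick.
        options = [d for d in DOORS if d != pick and d != CAR]
        for opened in options:
            # Assumed: the host chooses uniformly among his options.
            yield p_pick * Fraction(1, len(options)), pick, opened


# Staying wins exactly on the paths where the first pick was the car.
p_stay_wins = sum(p for p, pick, _ in leaves() if pick == CAR)
# Switching wins exactly on the paths where the first pick was a goat.
p_switch_wins = sum(p for p, pick, _ in leaves() if pick != CAR)

print(p_stay_wins, p_switch_wins)  # 1/3 2/3
```

The tree has four leaves: two of weight 1/6 under the choice of A, and one of weight 1/3 under each of B and C.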
The information the contestant gains is that he is now at one of the second-level nodes and not at the root; that is, he is now after the host’s action. We can also describe this in terms of restricting the sample space. If we count the possibilities in which switching yields a win, we find that after the host’s action they make up a higher share of the remaining cases than before the host acted.
Conclusion
Probability is a toolbox for decision-making under conditions of missing information. Therefore the result of the calculation depends not only on the factual situation (fairness of the die, body mass, etc.) but also on the information lacking and the information in our possession. If we change the amount of information we have, the result of the calculation for the very same case can change. This happens when the information ignored or added is relevant—that is, when it is information on which the result statistically depends. Statistical dependence determines the relevance of the information to be considered.
In my article “On the Representativeness Fallacy in Halakha,” I brought additional examples of failures involving conditional probability—what halakhic language calls “a majority that has been undermined” (rov she’ittre’a).
A majority that has been undermined: levirate marriage
At the end of that article I cited two examples of a “majority that has been undermined.” This speaks of a situation in which there is a majority in favor of one possibility over another, and then additional information emerges and the majority is undermined. This is a halakhic expression of added information that changes the distribution.
The first example is in Mishnah Yevamot 35b:
One who is of doubtful status—perhaps a nine-month child to the first husband, perhaps a seven-month child to the last—she must leave [the levir], the child is kosher, and they are liable for an asham talui (a provisional guilt-offering).
This concerns one who hastened and performed levirate marriage with his yevama immediately after his brother’s death, and afterward a son was born at a time that raises doubt. The Gemara assumes that a fetus can be born either after nine months or after seven months, with most being born after nine. Here the doubt is whether the child was born after nine months of pregnancy and his father is the first, or whether he was born after seven months and his father is the second. For the sake of the example, let us assume he consummated the levirate marriage two months after the brother’s death.
The child is, of course, kosher in either case (if he is the child of the first, then the levirate act was invalid and they violated the prohibition of a brother’s wife not in the place of a mitzvah, but the child is certainly kosher; and if he is the second’s seven-month child then the levirate act is valid and he is the second’s kosher child). The question concerns the couple: they bring an asham talui because of the doubt of relations with a brother’s wife not in the place of a mitzvah (if she was pregnant there is no mitzvah of levirate marriage, and she is an ervah to the levir).
The Gemara there (37a) raises a difficulty:
Doubtful nine-month child, etc. Rava said to Rav Naḥman: Let us follow the majority of women, and most women give birth after nine months!
Why is the child considered doubtful? There is a majority of births at nine months, so we should decide that he is the first husband’s child. Consequently, the offering should be a definite sin-offering, not an asham talui.
In conclusion the Gemara explains:
…He said to him: This is what I meant: Most women give birth after nine months and a minority after seven; and any who gives birth after nine—her fetus is recognizable at the end of the first trimester. Since here her fetus was not recognizable at the end of the first trimester, the majority has been undermined.
At this stage the Gemara suggests that any woman who gives birth after nine months has a noticeable pregnancy. Here, however, the fetus was not noticeable (hence the levir decided to consummate the marriage, thinking she was not pregnant. If the pregnancy had been noticeable at the time, this doubt would not have arisen). Since those who give birth at nine months have noticeable pregnancies, the majority in favor of nine months is undermined.
The Gemara then asks:
If indeed every woman who gives birth after nine months has a noticeable pregnancy at the end of the first trimester, then since it was not noticeable here—certainly the fetus is a seven-month child to the latter!
If indeed all nine-month births have noticeable pregnancies, then the law should not be that they are liable for an asham talui but rather exempt—because it is certain that he was born after seven months.
Finally the Gemara corrects this to say that it is not a certainty but a majority:
Rather say: Most who give birth after nine months have a noticeable pregnancy at the end of the first trimester; and since here it was not noticeable—the majority is undermined.
Thus, in the conclusion, the Gemara explains that the majority of nine-month births is undermined because among those, most have noticeable pregnancies. Therefore the nine-month majority is undermined; it is a case of doubt; hence they bring an asham talui.
This answer is not straightforward. If the pregnancy here was not noticeable (and we saw that this is the case), then there is a majority in the other direction—that this fetus was not a nine-month fetus. Why, then, do we not decide definitively that the fetus is seven months and exempt them from a sacrifice altogether?
A majority that has been undermined: betulim (virginity)
A parallel move appears in Ketubbot 16a–b. The Gemara discusses whether the woman before us married as a virgin or not. The assumption is that no report (kol) reached us that she married as a virgin. On the other hand, there is a majority of brides who are virgins. The Gemara states there:
Ravina said: Because we can say—most women marry as virgins and a minority as widows; and any who marries as a virgin has a (public) report, and since this one has no report—the majority is undermined.
There is a majority that marry as virgins. That majority is undermined because any who marries as a virgin has a report.
The Gemara then asks that if the rule that a virgin marriage has a report is a certainty and not a majority, then not only is the majority undermined; there is a definite conclusion in the opposite direction:
If indeed every virgin marriage has a report, what of witnesses who arrive [afterwards]? Those witnesses must be false!
In the conclusion the Gemara explains this as a majority, not a certainty:
Rather, said Ravina: Most virgin marriages have a report, and since this one has no report—the majority is undermined.
As noted, the move is parallel to the case in Yevamot. And here, too, one can ask: if most virgin marriages have a report, then one who married without a report is presumably not a virgin. Why is this a doubt rather than a certainty in the opposite direction?
Explanation: conditional probability
The answer to both difficulties is fairly simple, and we will see it through the second example. Most women who marry do so as virgins. Therefore, in general, if asked whether the woman before us married as a non-virgin or a virgin—the answer would be “virgin.” On the other hand, if there was no report about her, then there is a counter-majority, for most who marry as virgins have a report. Suppose there are 1,000 women who married. Of these, 80% are virgins—i.e., 800 virgins—and 200 non-virgins. By contrast, among the virgins there is a majority of 80% with a report—i.e., 640 virgins with a report, and thus 160 virgins who married without a report. Now a woman who married without a report presents herself, and we wonder whether she is a virgin or not. To decide, we compare the number of non-virgin marriages (200) with the number of virgin marriages without a report (160). The decision is clear: she is a non-virgin. The second majority neutralized the first. This is the mechanism of a majority being undermined by an opposing majority.
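The counting in this example can be written out as a short calculation. The sketch below uses the numbers from the text (1,000 marriages, 80% virgins, 80% of virgin marriages with a report) and makes explicit one assumption the comparison relies on: that non-virgin marriages carry no report of virginity. The function name `counts` is illustrative.

```python
from fractions import Fraction


def counts(total, p_virgin, p_report_given_virgin):
    """Split the population into the three groups compared in the text."""
    virgins = total * p_virgin
    non_virgins = total - virgins
    virgins_no_report = virgins * (1 - p_report_given_virgin)
    return virgins, non_virgins, virgins_no_report


virgins, non_virgins, virgins_no_report = counts(
    1000, Fraction(4, 5), Fraction(4, 5)
)

# Assumed: no non-virgin marriage has a report, so every non-virgin is in the
# "no report" subset together with the virgins who married without a report.
no_report = virgins_no_report + non_virgins
p_virgin_given_no_report = virgins_no_report / no_report

print(virgins, non_virgins, virgins_no_report)   # 800 200 160
print(p_virgin_given_no_report)                  # 4/9
```

Within the restricted subset of marriages without a report, virgins are 160 out of 360, which is less than half even though they are 80% of the full population.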
In R. Shmuel Rozovsky’s lectures on Yevamot, §299, he, too, dwells on this difficulty and formulates it thus:
It seems simply that the second majority is in the same ratio as the first majority. For example, if the first majority—“most women give birth after nine months”—is in a 4:5 ratio (say, out of 100 births, 80 are after nine months), then the second majority—“most of those who give birth after nine months have a noticeable pregnancy”—is also in a 4:5 ratio (i.e., 64 out of the 80 nine-month births). If so, among 100 women there are 20 who give birth in seven months and 16 who give birth in nine months without a noticeable pregnancy. If so, it is difficult: this woman certainly belongs to one of these minorities and is not among the nine-month births with a noticeable pregnancy. If so, why do we doubt which group she is from? There is a majority that she belongs to the seven-month group, since the seven-month births are more than the nine-month-without-noticeable-pregnancy group.
It is forced to say that our case is precisely one where the second majority is not in the same ratio as the first; rather, among 80 nine-month births there are 20 without noticeable pregnancy—and the seven-month births are not more numerous than the nine-month-without-notice group. But simply this is not implied; rather, the ratio of minority to majority in the second majority is like that in the first—and this is difficult.
If the two majorities have the same strength, an opposing majority is created, as we saw above. If so, RSR questions why we treat it as a case of doubt (the majority is undermined) rather than as a case of certainty. Seemingly there is a determination in the opposite direction, not a doubt.
Clearly, the answer depends on the relation between the two majorities. If the majority of virgins who marry with a report were weaker, for example 60%, then the number of virgins who married without a report would be 320, compared to 200 non-virgin marriages. In that case, when a woman who married without a report comes before us, the decision would still be that she is a virgin. Here the majority is not undermined. At 75% of virgins marrying with a report, the two numbers would equalize (200 against 200), and we would remain with a doubt. Thus, to answer RSR’s challenge, halakha sets a categorical rule: when a majority is undermined, we do not rely on it. We do not have the possibility of conducting a statistical survey in every case that comes before the court to know the ratio of the majorities; therefore the assumption is that the situation is balanced and there is no way to decide based on majority. Legally this is a case of doubt.
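The dependence on the strength of the second majority can be sketched as a small function. The three scenarios (60%, 75%, 80%) are the ones discussed in the text; the function names, the labels returned, and the break-even formula (2p - 1)/p, which follows from setting p*(1 - q) equal to 1 - p, are illustrative additions and assume, as above, that members of the minority never carry the sign.

```python
from fractions import Fraction


def classify(p_first, p_second, total=1000):
    """Compare the minority with the part of the majority that lacks the sign."""
    majority = total * p_first
    minority = total - majority
    majority_without_sign = majority * (1 - p_second)
    if majority_without_sign > minority:
        return "first majority stands"
    if majority_without_sign < minority:
        return "opposing majority"
    return "balanced doubt"


def break_even(p_first):
    """Strength of the second majority at which the two groups are equal."""
    return (2 * p_first - 1) / p_first


p_first = Fraction(4, 5)  # 80% first majority, as in the text
verdicts = {
    q: classify(p_first, q)
    for q in (Fraction(3, 5), Fraction(3, 4), Fraction(4, 5))
}
for q, verdict in verdicts.items():
    print(f"second majority {float(q):.0%}: {verdict}")
print("break-even:", break_even(p_first))
```

Exact fractions are used so that the balanced case at 75% compares as truly equal rather than being lost to floating-point rounding.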
In the background it is important to recall what we saw in the last two columns. In almost all halakhic majority cases, we deal with a negative majority—that is, a majority whose size we cannot quantify. In such a case, the most appropriate approach is to set a general rule that when a majority is undermined, it is null and void and cannot be relied upon. If we had a positive majority (with numerical data), then indeed there would be no need to resort to the “undermined majority”; we could compute and follow the majority as it is. But with a negative majority there is no way to rely on the majority in such a situation; therefore, one must indeed dismiss a majority that has been undermined. Incidentally, the notion of an undermined majority (rov she’ittre’a) is the halakhic expression of a remark at the start of column 612, where I noted that sometimes there is a “majority of a majority” (ruba deruba) when the second majority goes against the first rather than with it. In such a situation an offset can be created and we lose the original majority. This is exactly the case of an undermined majority. An undermined majority is the mirror image of ruba deruba, with the same possibility tree, except that in ruba deruba the second majority reinforces the first, whereas in an undermined majority it weakens it.
We now arrive at the conditional-probability pitfall. The initial majority is an absolute probability that holds for the general population (most women marry as virgins). But regarding the particular woman before us, we have additional relevant information: there was no report of virginity. This information restricts the group of all women to a specific subset (those without a report of virginity). This relevant information, which narrows the relevant group, turns the absolute probability into a conditional one, and therefore the chance changes. In this special subset the distribution is different, and this is precisely what the Talmud calls “a majority that has been undermined.”
Connection to the previous columns
One can present an “undermined majority” using a possibility tree, as we did in the previous two columns and above. At the top node, the two branches relate to each other via the first majority (does she belong to the group of those who marry as virgins or not), and at the lower nodes the second majority governs (if she belongs to the virgin group, we then ask whether there was a report or not),[4] but as noted here the directions of the two majorities are opposite.
Such a situation can of course also be represented by an ellipse diagram as we did here. Take the set of all married women, represented by the ellipse. Among them, the majority married as virgins and a minority did not. Within the group of those who married as virgins, a majority had a report of virginity and a minority married as virgins without a report. The picture is as follows:
The left side of the ellipse is the minority who did not marry as virgins. On the right side is the majority who married as virgins. The majority of that right-hand subset (the dark area) has a report of virginity, and the minority of that subset married as virgins without a report (the white area above the dark area on the right side of the ellipse).
Now a woman comes before us without a report when she married. Assuming she belongs to those who married without a report, she belongs to a subset of the ellipse for which we have relevant information (no report). That subset is the entire white area in the ellipse. And yet we still wonder which sub-subset of the white area she belongs to: does she belong to those who married as non-virgins (the left side), and therefore had no report; or did she marry as a virgin but without a report (the upper right white area)? This is the internal distribution within the group of those without a report, and as you can see, within this subset (the white area) there is not necessarily a majority for the virgins. It depends on the sizes of those two sub-subsets. In the example above, the ratio is 7:4 in favor of her not having married as a virgin. Note that this occurs even though, in the full population, most women married as virgins. This illustrates restricting the space based on relevant information; in the restricted space the distribution differs from that of the whole. Once again we have conditional probability; one must not confuse it with absolute probability. Although in the aggregate most women marry as virgins, when it is known that there was no report, the distribution differs, and there may be no majority for the virgins. In such a situation we say that the original majority has been undermined and we do not use it—specifically when the majority in question is negative (we do not have numerical information). With a positive majority, when we have numbers, we simply check whether, in the bottom line, a majority remains or not.
“Something extra” in self-incrimination and medical diagnosis
In my aforementioned article I explained that representativeness fallacies really compare the whole with a non-representative sample, and as we saw here, that means that the distribution in the sample differs from the distribution in the general population; hence one must use conditional probability (given that you belong to this subset, i.e., to the sample, what is the chance of such-and-such?). There I showed implications for medical diagnosis of rare diseases and for legal evidence (the requirement of some “additional thing”—davar ma—alongside self-incrimination). Here I will briefly illustrate to show the connection (for more, see the article).
There are legal systems in which a confession is “the queen of evidence,” and a person who incriminates himself is convicted thereby. In Israeli law, however, self-incrimination is not enough (unless given in court), although such a confession is viewed as very strong evidence. To convict someone who incriminated himself, one needs an additional “something”—that is, another piece of evidence, even if quite weak on its own. Thus we find in the Attorney General’s Guidelines, Guideline 4.3012, from Nisan 5767 (April 2007), §1:[5]
It is an established rule in the Supreme Court’s jurisprudence, from time immemorial, that one may not convict a person on the basis of his confession alone given outside the courtroom, even when the confession was obtained without external pressure, unless some “additional thing” is found to bolster that confession (CrA 3/49 Andlerski v. Attorney General, PD 2 589; CrA 290/59, Ploni v. Attorney General, PD 14 1489).
Why is this extra “something” so important? It is quite weak evidence on its own, but, as we will now see, when added to self-incrimination it completely changes the picture.
In my article I proposed explaining this, too, in terms of conditional probability and the representativeness fallacy. Suppose, for the sake of discussion, we are dealing with murder. Suppose further that self-incrimination is 99% reliable as evidence. But murder is very rare; the proportion of murderers in the general population is far less than 1%. For the sake of discussion, say 0.01%. I showed there that in such a situation the practical reliability of the evidence is only about 1%: a person who confesses is actually the murderer in roughly one case in a hundred. Note: although we are dealing with the “queen of evidence,” still, when it aims to prove a rare phenomenon, it is not necessarily reliable. What matters is the relation between the rarity of the phenomenon and the quality of the evidence; in this case the base rate (0.01%) is a hundred times smaller than the error rate of the evidence (1%), and therefore the evidence is not good (see there for the calculation and explanation). The very same situation exists for diagnosing a rare disease with 0.01% prevalence using a test with 99% accuracy.
Now suppose we add “something,” for example, evidence that the defendant was in the vicinity of the murder. This by itself is far from sufficient to convict. Would anyone convict merely because he was seen near the scene? There are quite a few people who were there, say ten. Thus this is a “something” which, on its own, has little weight. But note what happens when there is self-incrimination. This weak evidence reduces the pool of potential murderers from ten million (all citizens) to ten (those who were present). Within this group, the a priori chance that the defendant is the killer is 10%, which by itself is far from enough for conviction. But in this subset, the chance that a person is the murderer is no longer so small, and in such a situation a 99%-quality piece of evidence like a confession becomes decisive. Now the chance of error is far smaller and one can convict. What does the “additional thing” do? It adds relevant information and thereby restricts the sample space to a subset of the general space (instead of ten million potential suspects there are ten). In that subset, the prevalence of the trait (being the murderer) is dramatically higher (10% instead of 0.01%). Without the “something,” the evidence attempts to answer an absolute question: what is the chance that this person is the murderer? After adding the “something,” our evidence evaluates a conditional probability: what is the chance he is the murderer given that he was in the vicinity? Here the evidence (the confession) yields a far better probability.
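The two situations can be compared with a short Bayes calculation. This is a sketch under a stated assumption: “99% quality” is read as meaning that the evidence appears for 99% of actual murderers and for 1% of innocent people. The function and parameter names are hypothetical.

```python
def posterior(prior, hit_rate=0.99, false_alarm_rate=0.01):
    """P(guilty | confession) by Bayes' formula.

    prior:            P(guilty) in the relevant group before the confession.
    hit_rate:         assumed P(confession | guilty).
    false_alarm_rate: assumed P(confession | innocent).
    """
    true_pos = prior * hit_rate
    false_pos = (1 - prior) * false_alarm_rate
    return true_pos / (true_pos + false_pos)

# General population: base rate of 0.01%.
print(round(posterior(0.0001), 4))  # 0.0098

# Restricted group of ten people who were near the scene: base rate of 10%.
print(round(posterior(0.1), 4))     # 0.9167
```

Under these assumptions the same confession is worth about 1% in the general population and about 92% within the group of ten. The exact figures depend on how the 99% is interpreted, but the jump produced by restricting the group does not.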
The same holds for medical diagnosis of a rare disease. If there is some symptom in addition to the accurate test mentioned, that symptom by itself is far from sufficient for diagnosis. But it changes the group under discussion from the general population to a small subset of those who exhibit the symptom. In that subset, the prevalence of the disease is already much higher, and the test will yield very good results (because now it measures a conditional probability and not an absolute one).
In both cases the mechanism is identical: a transition from an absolute probability, which is very small, to a conditional probability (within a restricted group for which we have additional information B), which is much larger. In the legal case we ask: what is the chance that someone who confessed is the murderer (A) given that he was in the vicinity (B)—denoted as the conditional probability P(A/B). This is clearly very different from the question: what is the chance that someone who confessed is the murderer, P(A). The first probability is much larger and suffices for conviction beyond a reasonable doubt. Likewise for the medical question: what is the chance that someone who tested positive is ill, P(A), versus what is the chance that someone who tested positive is ill given that he has a relevant symptom, P(A/B). The latter is much larger and suffices to yield a reliable diagnosis.
As I wrote there, in my estimation very few jurists and physicians are aware of, understand, and can explain this point, and I believe not a few have erred and continue to err in it (physicians issuing incorrect diagnoses or judges convicting on shaky grounds). Think of a doctor who sent you for a test with 99% accuracy for a rare disease (say, corona in its early days), and the result came back positive. Ask him: what is the chance that I am sick? Almost no doctor understands that the chance is very slim.[6] I assume most also do not understand why a small symptom that arouses a distant suspicion can completely change the picture. Fortunately, this is usually the situation: people are not sent for tests for rare diseases unless there is a symptom that arouses suspicion. That symptom is the “additional thing,” and therefore in such cases the test is indeed reliable. This is why, in practice, physicians usually do not make such mistakes. But in routine screenings that are not prompted by a specific suspicion about the patient examined (for example, broad population screening), a positive result for a given person is worth very little. Very few understand this. But fortunately the collective wisdom somehow works.
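The screening case is perhaps easiest to see in whole numbers of people rather than in probabilities. The following sketch assumes a screened population of one million (my own illustrative figure), a prevalence of 0.01%, and a test that is right 99% of the time for both the ill and the healthy. The function name is hypothetical.

```python
from fractions import Fraction

def screening_counts(population, prevalence, accuracy):
    """Return (true_positives, false_positives) for a blanket screening."""
    ill = population * prevalence
    healthy = population - ill
    true_positives = ill * accuracy             # ill people correctly flagged
    false_positives = healthy * (1 - accuracy)  # healthy people wrongly flagged
    return true_positives, false_positives

tp, fp = screening_counts(1_000_000, Fraction(1, 10_000), Fraction(99, 100))
print(int(tp), int(fp))  # 99 9999
print(tp / (tp + fp))    # 1/102, the share of positives who are really ill
```

Out of roughly ten thousand positive results, only 99 belong to people who are actually ill, which is why a positive result in a screening with no prior suspicion says so little about the individual.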
[1] I debated whether “conditional probability” or “conditioned probability” is the more accurate phrasing.
[2] The conclusion is even more paradoxical. If we remove the four shaded events from the space, what remains is the set on which we did the statistics (it has 14 objects). As noted, we got 0.85 (about 12 out of 14 objects are in their owners’ hands). Then the probability for all objects (including the shaded ones) is 14/18 = 0.777. The overall probability itself is not really correct, because it was computed only on the white objects. In fact, the error is even greater than the conjecture we would have made (had we been able) based on all objects in the world. Of course, the number of disputed objects is negligible relative to all objects, so in the real case this difference is negligible.
[3] I assume the host draws lots between the two. It does not really matter. I could have treated it as probability 1 for opening a goat door and leaving one goat door closed. I preferred presenting it this way to show the dependence.
[4] It is an interesting exercise to check whether this flips (see the previous column).
[5] See my book God Plays Dice, Yediot Books, Tel Aviv 2011, pp. 104–112. There I explained this somewhat differently: the expected-value criterion is ineffective for decision-making if the chance of attaining the expectation is low. One can view this, in another way, also as a type of representativeness fallacy, but this is not the place.
[6] Most will say the chance is 99%, which is, of course, completely wrong. Test accuracy is the opposite conditional probability: given that I am ill, what is the chance the test detects it? Diagnosis asks the reverse conditional: given a positive test, what is the chance I am ill? These are entirely different numbers, especially when the disease is rare.
[For now, I’ve only skimmed, and perhaps you addressed this in the articles you mentioned.] Even if, given that a legal dispute is under way, there is no majority that the holder is right: if the rule were that the holder is wrong (when there is no additional evidence either way), then certainly in most disputes the plaintiff would be a fraud and the holder would be right, because all the fraudsters would be suing all day long. Therefore, we are forced to rule that even when a dispute is under way, the holder is right. Do you take this calculation into account, and do you think this is only a legal presumption?
Yes. I think I even wrote that before. In the case at hand you have no evidence that the holder is right. What you wrote is just the legal rationale for establishing such a legal presumption.
[If this is indeed a choice of a decision rule that maximizes the probability of achieving justice compared with the alternatives, then it seems that the wording in the column is a bit too strong. There is no need for a verse or for the reasoning that one who is in pain goes to the doctor. At the heart of the matter is a probabilistic claim, even if at the moment of the dispute it does not hold, and we are simply forced by necessity to give the holder the power of possession that applies to objects over which there is no dispute.]
There is no such thing. It is what I wrote. Now you ask whether the rule that the burden of proof rests on the one who seeks to take from his fellow is the same as the presumption that whatever is in a person’s hand is his. The answer is no, although it was established because of your assumption. Why is it not based on that assumption? Because of the pain-and-doctor reasoning and conditional probabilities. That is exactly what I argued. You are only elaborating on the assumption that underlies the rule even when the presumption that whatever is in a person’s hand is his does not apply.