Halakhic Examples of Errors in the Use of Conditional Probability (Column 145)
With God’s help. Halakhic Examples of Error in Using Conditional Probability
In the previous column I analyzed the physico-theological argument on the basis of Bayes’s formula for total probability, and I pointed out errors that are made in the use of reversed conditional probabilities. In this column I will continue and show similar errors that are made in using this formula in two examples from Jewish law. My apologies to the bored readers; I hope that with this I will conclude the probabilistic time-out.
Majority in a rabbinical court: the question. Several months ago someone asked me the following question:
Sefer Ha-Chinukh (see column 67 and …) explains that we follow the majority in a rabbinical court because when there is a majority for a certain opinion, the probability that it is the correct one is higher. This means that the probability of a correct ruling by many judges is higher than the probability of a correct ruling by one judge, or by slightly fewer judges. If we assume that the probability of an expert judge issuing a correct ruling is p, which is of course a number between 0 and 1, then the probability of n judges issuing a correct ruling is p^n. When one multiplies the probabilities (assuming independence of correctness or error among different judges), one always gets a lower result. This contradicts the probabilistic explanation in the words of Sefer Ha-Chinukh.
But this is absurd on its face, for by the same token the probability of error by many judges is lower than the probability that slightly fewer of them err, or that one errs. The probability of error for one judge is 1-p, which is also a number between 0 and 1, and therefore the probability of error by n judges, which is (1-p)^n, is certainly lower as well. So what is the truth? Does agreement among several judges improve the chance of error, or the chance of a correct ruling?
A side note: we have implicitly assumed here independence among the judges’ opinions, since if the opinions depend on one another (if one influences the other in some way), then it is not justified to multiply each one’s probabilities by those of the others. For example, if we throw a die twice in exactly the same way (the same angle and the same initial velocity), what is the probability that it will fall twice on 6? 1/6, of course (and not 1/36). Only if the throws are independent should one multiply the probabilities in order to obtain the probability of the pair. It is fairly clear that the assumption of independence among judges is not literally correct (because in a rabbinical court they deliberate, argue, and exchange reasons among themselves). For the sake of simplicity (our concern here is to illustrate an idea and not to clarify the law of majority rule in a rabbinical court), let us nevertheless assume independence here, meaning that each judge on the court forms his position on his own. In such a case it is reasonable to assume that the probability of n judges erring or being correct is indeed the product of the probabilities (this too can be discussed at great length, but as noted, it is not important).
- Determining menstrual cycles: the question. That same person asked me another question, and as we shall see below the answer to it is similar. In Jewish law a woman’s cycle is determined according to sightings. If she sees blood at fixed intervals three times, then the conclusion is that her physiology is regular (that is, that for her there is a fixed interval between sightings), and then she must be concerned that she will also see blood at the time that comes after the next interval. Therefore the Sages forbade her to have sexual relations at that time (as a matter of law, the prohibition of cycle dates is rabbinic). For example, suppose a woman saw blood three times at equal intervals between one time and the next; then on the corresponding day after the third time she must be concerned that she will see blood and refrain from sexual relations. Now let us make three assumptions:
It is medically known that women’s intervals range between 28 days and 31 days, that is, there are four types of intervals.
Let us assume that nowadays a woman (once it was much less) has an active cycle for about forty years, that is, about five hundred months.
Let us further assume that the intervals are uniformly distributed (the probability of an interval of 28 days is the same as any other probability).
Now let us ask: what is the probability of getting, by chance, three equal intervals during this period? The answer is 1/16. Why? Suppose the first interval was 29 days. The probability that the one after it will also be 29 days is 1/4, and the probability that the one after that will also be 29 days is likewise 1/4. Therefore the probability of three equal intervals in succession is 1/16. Again, this is under an assumption of independence. If there is dependence, then of course the probability that all the intervals are equal increases greatly. If the dependence is complete, that is, if the woman really has a fixed cycle, then the probability is 1.
In any event, even if the woman does not have a fixed cycle, over the course of some 500 intervals throughout a lifetime it is clear that there will be quite a few such triplets of equal intervals (on average about thirty: 500/16). So why does Jewish law decide after three occurrences of the same interval that she has a fixed cycle? To the same extent, this could be a random result.
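The rough figure of about thirty can be checked with a short calculation. What follows is only a minimal Python sketch under the three assumptions above (four equally likely interval lengths, independence between intervals, about 500 intervals in a lifetime). The function name `expected_equal_triples` is hypothetical, and it counts overlapping windows of three consecutive intervals, so it yields 498/16 rather than the rounder 500/16.

```python
from fractions import Fraction

def expected_equal_triples(num_intervals, num_lengths=4):
    """Expected number of windows of three consecutive equal intervals.

    Assumes each interval length is drawn independently and uniformly
    from `num_lengths` possible values (an assumption of this sketch).
    """
    # A window of three consecutive intervals can start at any of the
    # first num_intervals - 2 positions.
    windows = max(num_intervals - 2, 0)
    # The first interval is free; each of the next two must match it.
    p_triple = Fraction(1, num_lengths) ** 2
    # Linearity of expectation: the windows overlap, but expectations add.
    return windows * p_triple

expected = expected_equal_triples(500)
print(expected, float(expected))  # 249/8, i.e. 31.125
```

Either way of counting gives roughly thirty chance triplets, which is what makes the question pressing.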
- Majority in a rabbinical court: the explanation. First, a second look shows that the picture described in the question cannot be correct. The probability that a collection of n judges errs plus the probability that this collection is correct must be 1. But it is not true that p^n + (1-p)^n = 1. In fact, for n > 1 this almost never holds (except in the cases where p = 0 or p = 1).
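A quick numerical check makes the gap visible. This is only an illustrative Python sketch, assuming independent judges who all share the same p; the helper name `unanimous_probabilities` and the value p = 7/10 are hypothetical choices for the example.

```python
from fractions import Fraction

def unanimous_probabilities(p, n):
    """Return (all n correct, all n mistaken, neither) for independent judges."""
    all_correct = p ** n
    all_mistaken = (1 - p) ** n
    # Whatever is left over is the probability of a non-unanimous court.
    remainder = 1 - all_correct - all_mistaken
    return all_correct, all_mistaken, remainder

p = Fraction(7, 10)  # illustrative value, not taken from any source
all_correct, all_mistaken, remainder = unanimous_probabilities(p, 3)
print(all_correct, all_mistaken, remainder)  # 343/1000 27/1000 63/100
```

With these numbers the two unanimous outcomes cover only 37% of the cases, so something else must account for the remaining 63%.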
How can it be that the sum of these two probabilities is not 1? Well, here of course there is the possibility that opinions in the rabbinical court will be split and not everyone will decide unanimously. To analyze the case, let us assume that this is a monetary dispute between Reuven and Shimon. Shimon sues Reuven, and before the court there arise two possibilities that it must decide between: either Reuven is liable, 1 for short, or Reuven is exempt, 0 for short.
We are given that the probability of a judge making a mistake is 1-p and of being correct is p. Let us assume for simplicity that the probability of error in the two directions (acquitting one who is liable or holding one liable who is exempt) is equal. It is important to understand that the probability of a judge making a mistake (and likewise of being correct) is actually a conditional probability: the probability that the judge will say that 1 occurred (or 0), given that 0 occurred (or 1), respectively.
Now let event A be the following: two judges say that Reuven is liable and one says that he is exempt. The question is: assuming that this is the judges’ ruling, what is more correct to decide, that Reuven is indeed liable (like the majority opinion), or that Reuven is exempt? As stated, Sefer Ha-Chinukh says that we follow the majority because they are usually correct. Most likely the two are correct and the one mistaken, and not the other way around. Why is this true? After all, the probability that two are correct is p^2, and that is smaller than the probability that one is correct, p. But as we have seen, the reverse is also true: the probability that two are mistaken is (1-p)^2, which is smaller than the probability that one is mistaken, 1-p. So what is the truth? As stated, these two probabilities do not add up to 1.
This brings us back to the previous column. The probabilities that appear here are unnormalized conditional probabilities, and it is no wonder that they do not add up to 1. That is precisely the clue to the solution of our puzzle. We are looking for the reverse conditional probability: if the judges decided something, what is the probability that it is true? As we saw in the previous column as well, the probability that it is true and the probability that it is not must add up to 1.
We know how to calculate the probability that A will occur on the assumption that Reuven is in fact liable and on the assumption that he is in fact exempt. As stated, for simplicity we assume independence among the judges’ errors (it seems to me that in reasonable cases the results regarding our question, whether the majority is indeed correct, do not really depend on this).
If the reality is that Reuven is indeed liable (1), the decision A means that two judges were correct and one was mistaken. What is the probability of that? Of course: P(A/1)=p^2(1-p). And if the reality is that Reuven is exempt (0), then two judges were mistaken and one was correct; the probability of that is P(A/0)=p(1-p)^2. (Strictly speaking, each of these should be multiplied by 3, for the three ways of choosing the dissenting judge, but the same factor appears in both and cancels in every comparison between them.)
Already here one can see the error in the question. The probabilities of error and of a correct ruling should be examined by comparing these two, and not by comparing a large number of judges with a small number, that is, p^2 versus p or (1-p)^2 versus 1-p.
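To make the right comparison concrete, here is a small Python sketch. It assumes independent judges with a common p, uses the hypothetical helper name `likelihoods_of_split`, and omits the common combinatorial factor (which judge dissents) since it is the same in both cases. The value p = 7/10 is chosen only for illustration.

```python
from fractions import Fraction

def likelihoods_of_split(p):
    """Return (P(A/1), P(A/0)) for A = two judges say liable, one says exempt.

    Assumes independent judges with the same probability p of being correct;
    the shared factor for the choice of dissenting judge is left out.
    """
    given_liable = p ** 2 * (1 - p)   # two correct, one mistaken
    given_exempt = p * (1 - p) ** 2   # two mistaken, one correct
    return given_liable, given_exempt

p = Fraction(7, 10)  # illustrative value only
given_liable, given_exempt = likelihoods_of_split(p)
print(given_liable, given_exempt)    # 147/1000 63/1000
print(given_liable / given_exempt)   # 7/3, which equals p / (1 - p)
```

The misleading comparison (p^2 = 49/100 against p = 7/10) says nothing about which state of affairs better explains the split vote; the ratio 7/3 does.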
But the solution to our problem is based on comparing the two reverse conditional probabilities: P(1/A), which is the probability that if the judges ruled this way then Reuven is indeed liable, versus P(0/A), the probability that if the judges ruled this way Reuven is exempt. That is, we need the reverse conditional probabilities. To calculate them we must use Bayes’s formula (which we encountered in the previous column); it reverses the direction of the conditioning:
P(1/A) = P(A/1)P(1) / {P(A/1)P(1) + P(A/0)P(0)}
P(0/A) = P(A/0)P(0) / {P(A/1)P(1) + P(A/0)P(0)}
To know a numerical result, we need to know the absolute prior probability that Reuven is in fact liable, P(1), or exempt, P(0), and of course we do not know that. Since we have only two cases (either Reuven is liable or he is exempt), we can denote P(1)=q, P(0)=1-q.
Now, from Bayes’s formula we obtain for the probability that the truth is 1 (Reuven is liable):
P(1/A) = P(A/1)q / {p^2(1-p)q + p(1-p)^2(1-q)} = pq / [pq + (1-p)(1-q)] = 1 / (1+α)
and for the probability that reality is 0 (Reuven is exempt) we get:
P(0/A) = P(A/0)(1-q) / {p^2(1-p)q + p(1-p)^2(1-q)} = (1-p)(1-q) / [pq + (1-p)(1-q)] = α / (1+α)
where we defined:
α = (1-p)(1-q) / (pq)
The denominators of these two conditional probabilities are equal, and therefore the ratio of the probabilities is the same as the ratio of the reverse conditional probabilities:
P(1/A) / P(0/A) = [P(A/1)q] / [P(A/0)(1-q)] = pq / [(1-p)(1-q)] = 1/α
For simplicity, let us now assume that the prior probability that Reuven is liable or not is equal (true, there is a presumption of innocence, but if he has already come to court then his presumption of innocence is no longer relevant. On the contrary, it is more likely that he is liable), that is q=(1-q)=1/2. Under this assumption all the prior probabilities cancel, and we are left with a simpler formula. What we get is:
P(1/A) = P(A/1) / {P(A/1) + P(A/0)}
P(0/A) = P(A/0) / {P(A/1) + P(A/0)}
The denominators are equal, and therefore if we divide one by the other, we obtain (using the probabilities above):
P(1/A) / P(0/A) = P(A/1) / P(A/0) = p/(1-p)
Since the sum of the two must be 1, we get (as would of course also follow from an explicit computation of the denominator and substitution):
P(1/A) = p
P(0/A) = 1-p
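The derivation can be checked mechanically. The following is a minimal Python sketch of the computation, assuming independent judges with a common p strictly between 0 and 1 and a prior q; the function name `posterior_liable` and the numerical values are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def posterior_liable(p, q):
    """P(1/A): probability that Reuven is liable, given a 2-to-1 vote for liable.

    p is each judge's probability of being correct (0 < p < 1 assumed),
    q is the prior probability that Reuven is liable.
    """
    like_liable = p ** 2 * (1 - p)    # P(A/1)
    like_exempt = p * (1 - p) ** 2    # P(A/0)
    # Total probability of observing the split vote A.
    evidence = like_liable * q + like_exempt * (1 - q)
    return like_liable * q / evidence

half = Fraction(1, 2)
print(posterior_liable(Fraction(7, 10), half))  # 7/10: equals p when q = 1/2
print(posterior_liable(Fraction(3, 10), half))  # 3/10: below 1/2 when p < 1/2
```

With equal priors the posterior is simply p, so the majority is the better bet exactly when p exceeds one-half.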
The view of laymen is the opposite of the Torah’s view. It is now clear that if the probability that a judge will be correct is greater than one-half (that is, he is a Torah scholar), then it makes sense to follow the majority. If the probability that a judge will be correct is less than one-half (ignorant judges), then one should follow the minority. This is what is said in the Sema, sec. 3, subsec. 13:
In Responsa Mahariv, sec. 146, he writes: ‘If you would heed my advice, do not sit with the community in any judgment, for you know that the rulings of laymen and the rulings of the learned are two opposites. And they said in the chapter Zeh Borer [Sanhedrin 23a]: Thus did the clear-minded people of Jerusalem act: they would not sit in judgment unless they knew who would sit with them, etc.; see there.’ And in the common language of the yeshivot: ‘The view of laymen is the opposite of the Torah’s view.’
That is exactly what follows from what we obtained here.
There is a simple intuitive explanation for what we obtained here. The probability that two are correct and one is mistaken is p^2(1-p). The probability that two are mistaken and one is correct is lower (assuming that p is above one-half): p(1-p)^2. This is the ratio between the probabilities. All that is needed is simply to turn them into actual probabilities, that is, to make sure that their sum is 1. For that one has to divide each of them by the sum of these two probabilities. That is exactly what Bayes’s formula gave us.
An interesting note that emerges from Bayes’s formula. From Bayes’s formula we see that if the prior probability that the defendant is liable is in fact small, the result can change, since then the denominator of the two cases remains similar, but the numerator must be multiplied by the prior probability of 1 or 0. This can completely change the picture.
This explains the need for the prosecution’s discretion in deciding whether to put someone on trial or not, before he stands trial in court. At first glance that seems pointless, since that is exactly the judge’s role. Usually this is explained as a need to save the court’s time. But in light of what we have seen here, there is a deeper and more fundamental explanation: the role of this stage is to increase the prior probability that the defendant is guilty (among those whom the prosecution regards as guilty there is a higher percentage of guilty people), and that allows us to follow with greater confidence the ruling of the judge or the majority of the judges.
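The effect of the prior can be illustrated numerically. This is a hedged Python sketch using the simplified form pq / [pq + (1-p)(1-q)] for a 2-to-1 vote; the names `posterior_liable` and `break_even_prior` are hypothetical, as are the sample values. Setting the posterior equal to 1/2 gives pq = (1-p)(1-q), that is q = 1-p, which is what `break_even_prior` returns.

```python
from fractions import Fraction

def posterior_liable(p, q):
    """P(1/A) for a 2-to-1 vote, in the simplified form pq / [pq + (1-p)(1-q)]."""
    return p * q / (p * q + (1 - p) * (1 - q))

def break_even_prior(p):
    """Prior q at which a 2-to-1 vote leaves the question exactly balanced."""
    return 1 - p

p = Fraction(7, 10)  # illustrative competence of each judge
for q in (Fraction(1, 10), Fraction(3, 10), Fraction(1, 2), Fraction(9, 10)):
    # Below q = 1 - p the majority's verdict is more likely wrong than right.
    print(q, posterior_liable(p, q))
```

In this example a prior below 3/10 means that even a majority of fairly reliable judges is more likely mistaken than correct, which is the sense in which a screening stage that raises q supports following the court.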
A different formulation of this calculation
- Determining menstrual cycles: the explanation. Again let us define the two factual states (we are trying to determine which of them is the true one): a state in which there is no fixed cycle (0) and a state of a fixed cycle (1).
The event of three equal intervals (this is the result before us) will be denoted A.
The probability that if the cycle is fixed we will get three equal intervals: P(A/1) = 1.
The probability that if the cycle is not fixed we will get three equal intervals: P(A/0) = 1/16.
Now let us consider a case in which we obtained three equal intervals, that is, A occurred. Our question is: what is the probability that the woman has a fixed cycle, as opposed to the probability that this is accidental?
Again let us assume for purposes of discussion that the probability of a fixed cycle is P(1)=q, and the probability of a random cycle is P(0)=1-q. According to Bayes’s formula for total probability:
P(1/A) = q / {q + (1/16)(1-q)}
P(0/A) = (1/16)(1-q) / {q + (1/16)(1-q)}
The ratio between the conditional probabilities is P(1/A) / P(0/A) = 16q / (1-q). Again, if q is very high (close to 1), there is a high ratio between the probabilities, that is, it is preferable to assume a fixed cycle. But if q is small, that is, if the percentage of women with a fixed cycle is small, there is no justification at all for assuming that the cycle is fixed on the basis of three equal intervals. The logic of this result is very simple: when fixed cycles are rare and we obtained three equal intervals, it is better to assume that this is accidental (whose probability is not bad, as we saw, 1/16) rather than assume that we happened upon one of those few women who have a fixed cycle. Here we have an important result: determining cycles assumes that a non-negligible proportion of women have a fixed cycle (if about 6% have a fixed cycle, that is q = 1/17, the probabilities are equal, and then the question whether the woman has a fixed cycle is an evenly balanced doubt).1 If this datum changes, all the laws of determining cycles need to change, and in such situations it seems that the presumption created by three occurrences will not suffice (the smaller the percentage of women with a fixed cycle, the more one must rely on a longer series of equal intervals).2
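The dependence on q, and the length of series needed when q is small, can be sketched in a few lines of Python. This is only an illustration under the assumptions of this column (four equally likely interval lengths, independence for a non-fixed cycle, and probability 1 of equal intervals for a fixed cycle); the names `posterior_fixed_cycle` and `intervals_needed` are hypothetical.

```python
from fractions import Fraction

def posterior_fixed_cycle(q, equal_intervals=3, num_lengths=4):
    """P(1/A): probability of a fixed cycle given a run of equal intervals."""
    # Chance of the run without a fixed cycle: first interval is free,
    # each later one matches with probability 1/num_lengths.
    p_chance = Fraction(1, num_lengths) ** (equal_intervals - 1)
    return q / (q + p_chance * (1 - q))

def intervals_needed(q, target=Fraction(1, 2), max_intervals=50):
    """Smallest run length whose posterior exceeds `target` (a sketch only)."""
    if not 0 < q < 1:
        raise ValueError("q must be strictly between 0 and 1")
    for k in range(2, max_intervals + 1):
        if posterior_fixed_cycle(q, k) > target:
            return k
    raise ValueError("target not reached within max_intervals")

print(posterior_fixed_cycle(Fraction(1, 17)))   # 1/2: the balanced case
print(posterior_fixed_cycle(Fraction(3, 100)))  # 48/145, roughly 0.33
print(intervals_needed(Fraction(1, 100)))       # 5 equal intervals
```

The balanced point falls at q = 1/17, a little under 6%, and with rarer fixed cycles a longer run of equal intervals is required before a fixed cycle becomes the more likely explanation.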
A note on presumptions of three occurrences. In Jewish law there is a principle that after something happens three times, the assumption is that it is not accidental. This is called a presumption of three occurrences (three times). The example of cycles with which we dealt here is one of those contexts. The commentators discuss these presumptions and their significance, and it appears that some of these presumptions create a reality and others reflect an existing reality (see, for example, Kehillot Ya’akov, Tohorot, sec. 66, and much else). At least with regard to some of these presumptions one can make a similar analysis (as we saw regarding cycles), and the conclusions we reached here can have implications for the question of where to apply them and how.
Summary. In both of these cases we saw that a simplistic treatment of probabilities leads to errors. In order to arrive at probabilities that add up to 1 and to compare them, one must make careful use of Bayes’s formula. Along the way, interesting legal implications arose for the use of presumptions of three occurrences (which depends on the prevalence in reality of the phenomena in question),3 and also for majority rule in a rabbinical court (which depends on the level of the judges and on the prior probability that the defendant or litigant is indeed guilty or liable).
1 One should discuss whether this is a rabbinic doubt, since cycle dates are rabbinic, or whether it is a biblical doubt, since we have a doubt whether she may see at that time, and then one must be stringent out of concern for the biblical prohibition against relations with one’s menstruant wife. It seems reasonable that one should be stringent in this doubt, for although cycle dates are rabbinic, that itself was the enactment regarding cycles: to be stringent about such a concern as though there were here a biblical doubt. Put differently, if we have an evenly balanced doubt whether the cycle is fixed, then at the anticipated time there is a 50% chance that she will see blood, and then intercourse will be forbidden by biblical law, and this is of course a biblical doubt. There is more to discuss here.
2 The question of how to determine the percentage of women with a fixed cycle is not simple, for those very tests themselves assume premises regarding the percentages of women. And if we require regularity over more than three intervals, we ignore a situation in which there is a temporary fixed cycle that changes over time (not every fixed cycle lasts a lifetime, and the fact that a cycle does not last does not mean that it was not fixed).
3 On this, see also my article in Assia, which is the basis for the entire discussion here.
Discussion
Because of the formulas.
In the paragraph “The Majority in a Court: The Question,” on the second line from the end: the probability of a judge making a mistake is 1-p. Why? Shouldn’t that be his probability of being correct?
Good point. I’ll correct it (just the reverse: P is the probability of being correct).
How did you arrive at 3% of the women? I got about 5%.
It’s possible I made a mistake. It was something I worked out quickly in my head.
Halakhah does not really take probabilistic considerations into account in cases of doubt; rather, it uses rules like a double doubt as opposed to a single doubt, whatever separated is presumed to have come from the majority, presumption, etc. Is this an area that should change in light of the way we think about doubts?
I think those rules apply where you don’t have a clear calculation. When there is a calculation, it is more reasonable to follow it. A state of doubt is an even state, 50-50, and when there is a calculation showing that this is not 50-50, one should follow the majority (except in places where we do not follow the majority).
Technical question: why are the latest columns in PDF format? In my opinion, it’s much less convenient.