Uncertainty and Statistics – Lecture 8
This transcript was produced automatically using artificial intelligence. There may be inaccuracies in the transcribed content and in speaker identification.
Table of Contents
- The source of the rule and an initial distinction between types of majority
- A majority not present before us as a scientific generalization, and a majority present before us as a complete observation
- The Sefer HaChinukh on a court majority and the condition of equal wisdom
- Rabbi Shimon Shkop’s question: is a court majority a majority not present before us?
- The author’s claim: even according to the Sefer HaChinukh, it remains a majority present before us because there is no empirically testable sample
- A new definition: statistics belong to a majority not present before us, and an a priori assumption belongs to a majority present before us
- The question of the basis for saying that most judges are right, and the probabilistic discussion
- The limits of formulas and the emphasis on the role of assumptions
- Concluding questions about a narrow majority and margins of error
Summary
General Overview
The text lays out the distinction between a majority present before us and a majority not present before us through the Talmudic passage in Chullin, and shows that the Talmud derives from “follow the majority” only a majority present before us, such as stores and a court, whereas a majority not present before us is not simply learned from there, even though in practice we do follow it. The author argues that a majority not present before us is a scientific generalization based on a representative sample, whereas a majority present before us rests on reasoning and a priori assumptions that are not easily open to empirical testing. From that he re-explains how even a court majority can still be understood as a majority present before us, even though the Sefer HaChinukh presents it as the greater likelihood that the majority hits the truth. He adds a probabilistic analysis that illustrates when a majority of judges is preferable and when it is not, and what the mathematical and statistical assumptions in the background really mean.
The source of the rule and an initial distinction between types of majority
The Talmud in Chullin derives the rule of following the majority from the verse “follow the majority,” and defines this as a majority present before us, such as the majority of meat in the stores when a piece of meat is found in the street, and likewise a majority in a court. The Talmud states that a majority not present before us cannot be learned from there, so the question of its source remains open, even though in practice it is obvious that we also follow a majority not present before us; Rashi offers two possibilities. From the plain meaning of the Talmud it seems that a majority present before us is stronger: even after accepting a majority present before us, it may still be that one does not follow a majority not present before us, whereas if a majority not present before us were the stronger kind, it would follow a fortiori once the weaker kind is accepted. Rabbi Shimon Shkop notes that Maimonides apparently understood the opposite, and a possible explanation is suggested through the “David Levy effect”: under a majority not present before us, even a 51% majority decides 100% of the cases, which gives it a certain power.
A majority not present before us as a scientific generalization, and a majority present before us as a complete observation
A majority not present before us is defined as a scientific generalization based on the principle of induction: one sees a number of cases, assumes they form a representative sample, and infers from them a general law about a group that is not before us. The example is “most women give birth at nine months and not at seven,” where from observation of births one infers a general distribution and uses it to decide a particular case. A majority present before us is defined as a case where the entire relevant group is before us, such as ten stores, nine kosher and one non-kosher, and therefore no sample is needed and there is no inference about the whole world, only use of a local accidental distribution. The distinction is sharpened by the fact that the distribution in a particular city is not a universal law of nature, so a majority present before us is not built on a sample but on direct knowledge of all the relevant particulars.
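The contrast between the two kinds of majority can be illustrated with a minimal Python sketch. The numbers are the lecture's own examples (fifty observed births, forty of them at nine months; ten stores, nine of them kosher). The function names are hypothetical and chosen only for this illustration, and the first function is meaningful only under the assumption that the sample is representative.

```python
from fractions import Fraction

def majority_from_sample(successes: int, sample_size: int) -> Fraction:
    """Majority not present before us: estimate the share in an unseen
    population from a sample. Assumes the sample is representative."""
    return Fraction(successes, sample_size)

def majority_from_full_group(labels: list[str], target: str) -> Fraction:
    """Majority present before us: the whole group is known, so the share
    is counted directly, with no sample and no generalization."""
    return Fraction(labels.count(target), len(labels))

# Lecture example: 50 observed births, 40 of them at nine months.
births_estimate = majority_from_sample(40, 50)

# Lecture example: ten stores in the city, nine kosher and one not.
stores = ["kosher"] * 9 + ["non-kosher"]
stores_share = majority_from_full_group(stores, "kosher")

print(births_estimate)  # 4/5, an inference about births that were never observed
print(stores_share)     # 9/10, a direct count of a local, accidental distribution
```

The arithmetic is the same in both functions; what differs is the status of the result. The first is a claim about a population that was never seen, while the second is a plain description of the group at hand.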
The Sefer HaChinukh on a court majority and the condition of equal wisdom
The Sefer HaChinukh explains the reason for following the majority in a court by saying that when the two disagreeing camps are equal in Torah wisdom or close to it, a greater number of opinions agrees with the truth more than the minority does. The Sefer HaChinukh says one cannot claim that a small group of sages should be overruled by a large group of ignoramuses, even if they were as numerous as those who left Egypt; therefore, when there is a large gap in Torah level, one follows the sages and not the numerical majority. The Sefer HaChinukh adds that even if to the listener the truth seems otherwise, the rule tends toward not departing from the path of the majority, and in principle the decision is determined by the majority. He notes that in the Sanhedrin it is different, though without developing that as the main focus of the discussion.
Rabbi Shimon Shkop’s question: is a court majority a majority not present before us?
Rabbi Shimon Shkop in Shaarei Yosher points out that according to the Sefer HaChinukh’s explanation, a court majority comes out looking like a majority not present before us, because the question is not identifying what each judge said but whether the court is correct, and the claim is that in most cases where there is a two-against-one dispute, the majority is more likely to hit the truth. According to this, “the majority and minority” are not within the specific panel, but within the set of historical panels in which the majority was right as against the cases in which the majority erred, and that resembles a general natural law. Rabbi Shimon argues that this contradicts the Talmud in Chullin, which calls a court majority a majority present before us, and he therefore rejects the Sefer HaChinukh and offers another explanation that compares it to pieces of meat, while the text proposes continuing in a different direction in order to understand the issue.
The author’s claim: even according to the Sefer HaChinukh, it remains a majority present before us because there is no empirically testable sample
The author argues that although the language of the Sefer HaChinukh sounds like general statistics, in practice there is no way to ground it in an empirical sample, because there is no independent way to check in each case whether the court was right or wrong; at most one can reopen the same discussion on the basis of the same evidence. He compares this to the case of the stores: there too one cannot really check a sample of lost pieces of meat to know which store they came from, because the very fact of loss is not a planned experiment and there is no practical way to track the source of every piece, except perhaps theoretically by imagining a systematic marking of pieces over many years. He concludes that a majority present before us is based on a priori reasoning, such as the assumption of equal probability of separation from each store in the absence of any other indication, and not on generalization from a sample. On that same principle he explains that the Sefer HaChinukh relies on an a priori logic according to which a majority of judges equal in wisdom is expected to hit the truth more often, and therefore this is not a majority not present before us in the scientific sense, but a majority present before us in the sense of a foundational assumption.
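The role of the a priori assumption in the case of the stores can be sketched as follows. This is only an illustration: the per-store "separation weights" are a hypothetical modeling device, not something measured, and the skewed weight of 20 is an arbitrary value chosen to show how much the answer depends on the assumption of equal probability of separation.

```python
def prob_kosher(separation_weight: dict[str, float], kosher: set[str]) -> float:
    """Chance that a found piece came from a kosher store, given the
    assumed (not measured) relative chance of separation per store."""
    total = sum(separation_weight.values())
    from_kosher = sum(w for store, w in separation_weight.items() if store in kosher)
    return from_kosher / total

stores = [f"store_{i}" for i in range(1, 11)]
kosher_stores = set(stores[:9])  # nine kosher, one non-kosher, as in the lecture

# A priori assumption: with no other indication, every store is equally
# likely to be the source of the piece.
uniform = {store: 1.0 for store in stores}

# Hypothetical alternative assumption: the non-kosher store is twenty
# times more likely to be the source than any single kosher store.
skewed = dict(uniform, store_10=20.0)

print(prob_kosher(uniform, kosher_stores))  # 0.9
print(prob_kosher(skewed, kosher_stores))   # 9/29, about 0.31
```

Nothing in the count of stores changes between the two calls; only the assumption about separation changes, and with it the conclusion.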
A new definition: statistics belong to a majority not present before us, and an a priori assumption belongs to a majority present before us
The author determines that a majority not present before us is clear statistics, like surveys, sampling, and generalization that can be checked after the fact against reality; whereas a majority present before us is not statistics at all, but common-sense assumptions that are not empirically tested in the same way. He argues that people tend to think the opposite, but in fact a majority not present before us belongs to the world of science and polling institutes, while a majority present before us rests on the assumption of uniform distribution in the absence of other information and on reasonings that are not measurable. He notes that with this formulation one can easily understand why a court majority is a majority present before us, whereas with other formulations, such as one attributed to Kovetz Shiurim by Rabbi Elchanan Wasserman, it is hard to explain.
The question of the basis for saying that most judges are right, and the probabilistic discussion
The author investigates how one arrives at the intuition that a majority of judges is more often right. He first presents the opposite claim, according to which apparently “the majority is wrong,” and then shows that this simple calculation is mistaken. He defines a judge’s quality as \(p\), the probability of being correct, and calculates that given a two-against-one ruling, the ratio between the probability that the majority was right and the probability that the majority was wrong is \(p/(1-p)\), so the better the judges are, the greater the advantage of the majority. He explains that for a cattle herdsman, where \(p\) is around one-half, the majority has no advantage over the minority, and that if \(p<0.5\) the minority is preferable, although in his view even the poorest judge behaves like a coin toss, with \(p\) around one-half and not below it. He shows that differences in quality between judges also justify the Sefer HaChinukh’s rule not to follow a majority of fools against a minority of sages, and adds that this is a heuristic rule and not always exact.
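The ratio \(p/(1-p)\) can be reproduced with a short Python sketch. It rests on assumptions that are not spelled out in the summary: a panel of three judges, a binary question (so that all mistaken judges give the same answer), a fixed probability of being correct for each judge, and independence between judges. Under those assumptions a two-against-one split with the majority right has probability \(3p^2(1-p)\), and with the majority wrong \(3p(1-p)^2\), which gives the stated ratio. The function names and the example qualities 0.6 and 0.9 are hypothetical.

```python
from itertools import product
from math import prod

def majority_odds(majority_p: list[float], minority_p: list[float]) -> float:
    """Odds (right : wrong) that the majority is correct, given which
    judges voted with the majority and each judge's assumed quality."""
    right = prod(majority_p) * prod(1 - p for p in minority_p)
    wrong = prod(1 - p for p in majority_p) * prod(minority_p)
    return right / wrong

def split_odds_by_enumeration(p: float, judges: int = 3) -> float:
    """Brute-force check for equal judges: enumerate who is correct in
    every possible outcome and keep only the non-unanimous panels."""
    right = wrong = 0.0
    for outcome in product([True, False], repeat=judges):
        n_correct = sum(outcome)
        if n_correct in (0, judges):
            continue  # unanimous panels are not a two-against-one dispute
        weight = prod(p if ok else 1 - p for ok in outcome)
        if n_correct > judges / 2:
            right += weight
        else:
            wrong += weight
    return right / wrong

# Equal judges with p = 0.8: odds of about 4 to 1, matching p / (1 - p).
equal_good = majority_odds([0.8, 0.8], [0.8])

# Coin-toss judges with p = 0.5: the majority has no advantage.
coin_toss = majority_odds([0.5, 0.5], [0.5])

# Two weak judges (0.6) against one strong judge (0.9): odds of about
# 1 to 4, so the lone judge is favored.
fools_vs_sage = majority_odds([0.6, 0.6], [0.9])

print(round(equal_good, 3), round(coin_toss, 3), round(fools_vs_sage, 3))
```

The last case is the numerical counterpart of the condition of equal wisdom: once the qualities differ enough, counting heads no longer tracks the more probable answer. The sketch breaks down at \(p=0\) or \(p=1\), where the denominator is zero.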
The limits of formulas and the emphasis on the role of assumptions
The author emphasizes that mathematical calculations rest on assumptions that are not open to simple empirical testing, such as the very existence of a quality parameter \(p\) for a judge and the assumption of independence among judges, and therefore here too we are not dealing with science but with a priori assumptions. He explains that mathematics can expose mistakes in interpretation or in assumptions, but it is not a guarantee of truth when the assumptions are false, and he gives examples of statistics in the media often being nonsense because of incorrect interpretation and assumptions. He adds that in legal cases immediate impression of credibility has weight, but that too rests on intuition rather than statistical justification, and therefore the whole structure of a majority present before us is tied to common sense and assumptions, not to measurement.
Concluding questions about a narrow majority and margins of error
In a concluding question, a comparison is drawn between a majority present before us of 51 versus 49 stores and a statistical generalization of 51% versus 49% in childbirth cases, and it is said that from the standpoint of modern statistics, if the result falls within the margin of error one should not treat it as a clear majority, but at most as a “doubtful majority.” The author adds that even in the case of stores, a tiny majority is very sensitive to small changes in assumptions, such as equal probabilities of separation from each store. He notes that from a halakhic standpoint, halakhic decisors may say that 51 against 49 is a majority if there is no other indication, but from the standpoint of statistical significance there is a problem in relying on such a majority. The text closes with the end of the lecture and a wish for a peaceful Sabbath.
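The point about margins of error can be made concrete with a small sketch. It applies only to the sampled case, a majority not present before us; for 51 stores against 49 there is no sampling error, since the whole group is counted, and the fragility lies in the assumptions instead. The sample sizes and the 95% level (z = 1.96, with the usual normal approximation) are assumptions added for illustration and do not come from the lecture.

```python
from math import sqrt

def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a
    sample proportion p_hat observed in a sample of size n."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

def is_clear_majority(p_hat: float, n: int, z: float = 1.96) -> bool:
    """True only if the whole interval lies above one-half."""
    return p_hat - margin_of_error(p_hat, n, z) > 0.5

# 51% observed in a hypothetical sample of 100: the margin is close to
# ten percentage points, so one-half is well inside the interval.
print(round(margin_of_error(0.51, 100), 3))
print(is_clear_majority(0.51, 100))     # False: at most a "doubtful majority"

# The same 51% in a hypothetical sample of 10,000 clears one-half.
print(is_clear_majority(0.51, 10_000))  # True

# 80% in a sample of 100, as in the births example, is a clear majority.
print(is_clear_majority(0.80, 100))     # True
```

Whether halakhah treats the first case as a majority is a separate question; the sketch only shows what "within the margin of error" means numerically.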
Full Transcript
[Rabbi Michael Abraham] Last time we looked at the passage in Chullin that deals with the source of this rule of following the majority. The Talmud brings it from “follow the majority,” and then says that from “follow the majority” we learn a majority present before us, which is either the majority of meat in the stores—for example, when you find a piece of meat in the street and there is a given distribution of stores selling meat in the city—or a majority in a court. Both of those are a majority present before us. But a majority not present before us, the Talmud says, that cannot be learned from there. So then the question is: where is it learned from? And I said that the conclusion of the Talmud is problematic, because it seems there is no source. So Rashi gives two possibilities; it doesn’t matter right now. Practically speaking, it’s obvious that we also follow a majority not present before us. But what you do see in the Talmud is that this is not learned simply from “follow the majority,” which means there is a difference between these two kinds of majority: a majority present before us and a majority not present before us. I spoke about the question of which one is stronger. On the face of it, from the plain meaning of the Talmud it seems that a majority present before us is stronger, because the fact is that even if we know that we follow a majority present before us, the Talmud still says it may be that we do not follow a majority not present before us. Meaning, if a majority not present before us were stronger than a majority present before us, then if the latter works, the former should certainly work. So from the simple reading of the Talmud it seems that a majority present before us is stronger. I mentioned that Rabbi Shimon brings that Maimonides apparently learned the opposite, and it doesn’t matter right now exactly how he gets that into the Talmud, but Maimonides understood the opposite. 
I even suggested some kind of possible explanation for that—if you remember, the David Levy effect—that there is something stronger about a majority not present before us, because in one hundred percent of the cases we will rule according to the majority even if the majority is only fifty-one percent. Fine, and that’s just a parenthetical remark. After that I started defining the difference between a majority present before us and a majority not present before us, and my claim basically was that a majority not present before us is really some kind of scientific generalization. In other words, science—a scientific law—is a majority not present before us. How do you arrive at a scientific law? We have some number of cases that we have seen, we assume that this set of cases is a representative sample, and if it’s a representative sample then I can assume that what happens in those cases probably happens in all cases, and therefore I infer that there is some general law that this is the situation. Yes? For example, most women give birth at nine months and not at seven. So let’s say I know, I don’t know, fifty births I heard about, and of them, I don’t know, forty were at nine months and ten were at seven. So since I infer that those fifty births are more or less a representative sample of all the births that happen in the world, then apparently eighty percent of births in the world are at nine months and twenty percent are at seven. That is an example of scientific generalization, the principle of induction, and that is what’s called a majority not present before us. A majority present before us is not the result of scientific generalization, because the whole group is before us; we don’t need to create a sample. Let’s say there are ten stores, nine kosher and one not kosher. So I don’t need a sample; I know all ten stores. 
And now a particular piece of meat has separated, and I ask myself which of the stores it came from, what kind of store it came from. Presumably it came from the majority of the stores. There is no act of generalization from a sample here; it has nothing to do with that at all. So in a majority not present before us, the group about which I determine the statistics is not before me. What is before me is a sample, and I assume that what happens in the sample is true of the entire group, even though that group is not before me. That’s why it is called a majority not present before us: about what is not before me, I infer a conclusion by means of generalization from a sample. By contrast, in a majority present before us, the entire group is before me. There isn’t some hidden group out there in the world that I haven’t seen and don’t know the nature of and want to learn about from some sample. No. All ten relevant stores are personally known to me; they are standing in front of me. And the question...
[Speaker B] Yes, greetings to everyone. I’m asking forgiveness in advance.
[Rabbi Michael Abraham] So all the stores relevant to our discussion are before me. I don’t need to infer any conclusion about what happens in all the stores in the world. That isn’t even true of all the stores in the world. Let’s say that in my city there are nine kosher stores and one non-kosher one. That does not mean that in the whole world ninety percent of stores are kosher. Not at all. That is a completely accidental distribution. In every city it will be different. Therefore, it is not a law of nature, it is not something universal, and therefore it is also not based on generalization. It is based on direct observation. It is not the sample that is before me, but the whole group that is before me. I don’t need to look at a sample and from it learn about the rest of the group. Okay? That, basically, is the distinction between a majority present before us and a majority not present before us. Now, I’m doing this briefly because we already did it last time. Now, we saw the Sefer HaChinukh, where the Sefer HaChinukh explains why we follow the majority in a court. I’m mentioning that again because it’s the starting point for our discussion today. “And the choice of the majority, according to what seems, applies when the two disputing camps are equal in wisdom of Torah.” Yes? When there is a majority in court, let’s say two against one, then we follow the majority. When? When the two groups, the group of the two and the group of the one, are more or less equal in Torah wisdom, meaning the judges are more or less on the same Torah level. “For one cannot say that a small group of sages should be overruled by a large group of fools, even if they are as numerous as those who left Egypt. But where there is equality of wisdom, or something close to it, the Torah informs us that a greater number of opinions will always agree with the truth more than the minority.” Meaning, if I have a few sages and a lot of idiots, then even if there are six hundred thousand idiots, yes, like those who left Egypt, what do I care that they are many? If there are a few who are wise, obviously they are right. 
There is no logic, says the Sefer HaChinukh, in following the majority unless all the judges are more or less on the same Torah level, approximately. Okay? If there is a big gap in Torah level, then obviously you follow the sages, not the majority of people. The law of following the majority was said only in a case where there is equality of wisdom, more or less, where everyone is more or less on the same Torah level. And why? What happens then? What happens then is that if there is equality of wisdom, then the majority will probably hit the truth; it has a higher chance of hitting the truth than the minority. That is what he says: “a greater number of opinions will always agree with the truth more than the minority.” “And whether they agree with the truth or not, according to the opinion of the listener”—what you think about who is right doesn’t matter—“the rule tends that we should not depart from the path of the majority.” What determines it is the majority. “And what I say is that choosing the majority always applies when the two disagreeing camps are equal in true wisdom, for so it is said everywhere except in the Sanhedrin, where we are not overly precise about who disagrees and which camp knows more; rather, we always do according to the words of the majority.” Meaning, he claims that in the Sanhedrin it is different, the Great Sanhedrin. But never mind. In principle, he says, we follow the greater wisdom—well, the majority where wisdom is equal. Now here, his claim basically is that we follow the majority because usually the majority is right. That is the reason we follow the majority. Now here we need to pay close attention, and this is the question we were left with at the end of the previous lecture: what kind of majority is the majority formed in a court? A majority present before us or a majority not present before us? So the Talmud says it is a majority present before us. Right? 
Like stores and a court, both are considered in the Talmud to be a majority present before us. But from the Sefer HaChinukh’s explanation it clearly emerges—and this is what Rabbi Shimon Shkop points out at the beginning of gate 3 in Shaarei Yosher—that according to the Sefer HaChinukh’s explanation, the majority in court really comes out as a majority not present before us. Why? Let’s remind ourselves for a moment what the difference is between a majority present before us and a majority not present before us. In a majority present before us, what is before us is not a representative sample; it is the entire group. In a majority not present before us, what is before us is only a representative sample, and we assume or generalize to all the other situations. Okay? Now, if I were to say the following—let’s say I found a judge lying unconscious in the street. I can’t ask him who he is. So now I ask myself: is he one of the two judges, one of the majority judges, or is he the lone judge? We know he is a judge from this court. That would be exactly like the piece of meat from the stores, right? That would be a majority present before us. Because the three judges are personally known to me; it’s not a matter of generalization. I know that two belong to the majority and one belongs to the minority. If I found a judge lying in the street, he probably belongs to the majority. It is more likely that he belongs to the majority. Exactly like the piece of meat from the stores. But the ruling according to the majority in court, in a verdict, is not like that. We do not find a verdict lying in the street and ask ourselves who said it. We know exactly what each judge said. We know that Reuven said so-and-so is liable, Shimon said so-and-so is liable, and Levi said so-and-so is exempt. Two against one. So I know exactly what each judge said. So what exactly is the question that is supposed to be decided by the majority? 
That is not at all similar to all these cases of stores or anything like that. What is the connection? In the case of stores I ask myself: from where did the piece of meat separate? Did it separate from store A, B, C, or J? So I have nine kosher possibilities and one non-kosher one, and the assumption is that it probably separated from the majority group. But here there is no process of separation. I spoke about the fact that the process of separation is essential to a majority present before us. In the case of a court there is no process of separation. What separated from what? What is the group and what is the individual that separated, about which I need to decide to which part of the group it belongs? This is not that kind of question at all. So what is the question? According to how the Sefer HaChinukh explains it, it works like this. I have before me a court, and the court ruled by majority vote that so-and-so is liable. Two judges against one, and I know exactly what each judge said. I have no doubts. Everything is perfectly clear. What question am I asking? The question I am asking is whether the court is right. That’s the question I’m asking. Right? Not which judge said what. My question is: the court said—and I know exactly what each judge said—in the end, the outcome was that so-and-so is liable. I ask myself: is it really true that so-and-so is liable? In other words, was the court correct? Says the Sefer HaChinukh: if you were to do statistics, you would discover that in most cases—take all the cases in history, everywhere in the world, it doesn’t matter—where there were disagreements in court of two against one, you would find that in most cases the two were right and not the one. Let’s leave aside for the moment how the Sefer HaChinukh got that idea, but there is some logic in it. He probably assumes it as a matter of reasoning. And therefore he says: so what? 
So the court that is now before me, in this specific case that I am dealing with, apparently belongs to the majority of courts in which the majority is right. And the majority and minority I am speaking about here are not at all the majority and minority of judges in the court before me. That is not the majority and minority I’m talking about. The majority and minority I’m talking about are this: from among all the panels that reached a decision in disagreements of two against one, among all those panels the majority group is the group of cases in which the majority of judges was right. Notice: it’s the majority of judges, but the majority group is the majority of cases, the majority of panels—not the majority of judges. And the minority group is those panels in which the majority ruled incorrectly. The majority in each such panel. Meaning, when I speak here about majority and minority, it is not at all the majority and minority of the judges in the court. It is the majority and minority of the cases in which there were disagreements among judges, where in most cases the majority was right and in a minority of cases the majority was wrong and the minority was right. That is the statistic the Sefer HaChinukh is talking about. Now if that is the case, then what you are really assuming here is some sort of law of nature. And this law of nature basically says that usually the majority of judges is right. And if so, then also in the case before us it is proper to assume that the majority is right and not the minority. That is really a majority not present before us. That is Rabbi Shimon Shkop’s question. You are really talking about the majority of panels that have existed in the world. That is not a defined group present before you. You are assuming something—think of it as a law of nature. Right? It is not something specific to the panel before me. I am making a claim that is some general law, a law of nature, that in divided panels usually the majority will be right. 
In the case of stores, the fact that most stores are kosher is not a law of nature. I just happen to know that in this city most stores are kosher. It does not come from any knowledge about the world that in the world generally a store selling meat is kosher. On the contrary, generally it’s the opposite.
[Speaker C] Meaning that someone who becomes a judge is statistically usually right? Is that what this basically means?
[Rabbi Michael Abraham] We assume that, yes. That’s in the background. I’ll get to that in a moment. But right now I’m saying: the assumption is that if there is a disagreement between majority and minority, the odds are that the majority is right and not the minority. That is basically the claim. Now according to this explanation of the Sefer HaChinukh, Rabbi Shimon Shkop asks, this is contradicted by the Talmud in Chullin. Because the Talmud in Chullin says that a majority in court is a majority present before us, whereas according to the Sefer HaChinukh’s explanation it comes out to be a majority not present before us. Now Rabbi Shimon Shkop there suggests some kind of formalistic explanation that I don’t find very convincing, but it is not offered as an explanation of the Sefer HaChinukh; with regard to the Sefer HaChinukh he leaves the difficulty unresolved. He says the Sefer HaChinukh is rejected, because it is contradicted by the Talmud: the Sefer HaChinukh’s explanation does make it a majority not present before us, and the Sefer HaChinukh is mistaken, because the Talmud says that a majority among judges is a majority present before us. He then offers a different explanation, one that compares it to pieces of meat, of why a majority in court is a majority present before us. That is not important right now. I’m going to suggest a different explanation.
[Speaker D] And this explanation follows from what we saw in the previous lecture, from the difference, from the definition of the difference between a majority present before us and a majority not present before us in the context of sampling.
[Rabbi Michael Abraham] Look, let’s say we ask why Rabbi Shimon Shkop claims that the majority in court is—according to the Sefer HaChinukh—a majority not present before us. Because he says: the majority and minority being discussed are not the judges who are before us, because there I have no doubt at all; I know what each judge said. There is no uncertainty about what some judge said. All the information is before me. There is no question here that needs to be decided by majority. Okay? So what is it then? The Sefer HaChinukh’s claim is that I have a panel that said something—two against one said that so-and-so is liable, okay?—and my question is not the one against the two, the minority against the majority within the court. Rather, is this court one that falls into the majority group or the minority group of divided panels? Right? As I explained before. Okay, so apparently that is a majority not present before us, because I’m speaking about panels all over the world, not about some defined group standing before me. But if you think carefully now according to the definitions I presented earlier, Rabbi Shimon Shkop is mistaken. Because let’s try to think how the Sefer HaChinukh arrived at the conclusion that in most cases the majority opinion is the correct one. Right? That is basically what he claims. If this were a majority not present before us, as Rabbi Shimon Shkop argues, then you would have to show me how I arrive at that by examining a sample and generalizing from the sample to create a general law of nature, right? That is what I’m supposed to do in a majority not present before us. So let’s think for a moment: how would you construct, how would you test, this rule on a sample? And after that we can ask whether the sample is representative or not. Try to think for half a minute: how do you do this, how do you test on a sample this rule proposed by the Sefer HaChinukh, that in most cases the majority is right? 
We would basically need to take, say, I don’t know, a hundred decisions—randomly choose them from, say, court protocols or something like that, from legal precedent databases—and randomly choose a hundred court decisions, but take only decisions where there was a disagreement of two against one, yes, majority and minority in the panel. And let’s say we found a hundred such cases. Suppose this is a representative sample. We would check in how many of them the majority was right. Let’s say it came out to eighty percent. So here, in this sample we saw that indeed in most cases the majority is right, and if the assumption is that this is a representative sample, then apparently this is a general law and in all cases the majority is right—or more likely, the majority is right, right? That is how it should work. Except what is the problem here? The problem is that there is no way to test the sample. Let’s say one of the hundred cases from the sample comes before you. Now I want to check: the majority said so-and-so is liable, and now I need to check whether in this case the majority was right or wrong. Yes? This is an empirical check. After I check it over a hundred cases, I’ll have some distribution: I’ll see in how many of those hundred cases the majority was right and in how many it was wrong. But how can this be checked? There is no way to check it. All you can do is go back and conduct the case yourself, the discussion yourself, and reach a legal decision. You have no independent way to check whether the court’s ruling is correct or not correct, right? If you knew the truth—whether Reuven murdered or didn’t murder, or stole or didn’t steal, or something like that—and now you ask yourself: the court didn’t know the truth, but ruled in light of the evidence it had. Then I have a way to check whether it was right or wrong, because I know the truth. But nobody knows the truth. From the defendant’s standpoint he is always innocent, right? 
Meaning, we have no independent way to know the truth. All you can do is examine the evidence the judges examined. You are no better than they are. They examined it and reached their conclusion. You have no independent way to check your sample. I’m not even talking about the generalization. Before I generalize, I need to see what happens in the sample. I took a hundred cases that, for the sake of argument, are a representative sample, and now I go over them one by one and have to check in each one whether the majority was right or whether the majority was wrong. I have no way to do that. How would I know whether in case number seventeen the majority was right or wrong? Or whether in case eighty-two the majority was right or wrong? I have no way to do it. So this cannot be a majority not present before us. In a majority not present before us—look, let’s compare it to an ordinary case of majority not present before us. I want to know how many women give birth at nine months and how many at seven. A woman comes before me and I need to know whether her child was born at seven or at nine months. Yes? Is he the child of her first husband—she remarried two months after the divorce, and if it was nine months it was from the first husband, if seven then from the second husband. That is the question. Is he the child of the first or the second? So I make statistics. I take a hundred cases blindly, randomly sample a hundred births. I check the records in hospitals or at doctors or whatever it may be. And I look: each of these births—was it at nine months or at seven? I reach the conclusion that out of a hundred representative cases, eighty were at nine and twenty were at seven. Now I return to the case before me and say: fine, if in most cases the birth was at nine months, then I assume that in this case too the birth was at nine months, because in this case I don’t know. I need to decide based on the statistics. Here I don’t have information. 
The statistics determine that in this case too, the birth was probably at nine months. Because I know there is a natural law like this, that eighty percent of women give birth at nine months. Okay, that is how a majority not present before us works. That is scientific generalization. But in the case of judges, even according to the Sefer HaChinukh’s explanation it cannot work that way. The way you arrived at the law of nature that says the odds are that the majority is right and only in a minority of cases is the majority wrong—that is not a generalization from a sample. Because you have no way to check the sample. The interesting question is how you check it.
[Speaker D] So then how do you really reach that conclusion?
[Rabbi Michael Abraham] Great question. But it is not a generalization from a sample. And if that’s so, then it is not a majority not present before us.
[Speaker D] The question is why it is a majority present before us. So I don’t know, it’s not this and not that; it’s something third.
[Rabbi Michael Abraham] So to understand why it is at least somewhat similar to a majority present before us, I’m basically making the following claim. Let me maybe take one more step. Let’s go back for a moment to the stores, okay? I found a piece of meat in the street, and in the city there are nine kosher stores and one non-kosher store. So I claim that there is a ninety percent chance that this meat is kosher because it came from a kosher store, and therefore it is permitted to eat it. Strictly speaking I say—Jewish law actually forbids it because meat that was out of sight creates a concern—but strictly speaking I could have followed the majority and eaten it. Now I ask myself what this is based on. I say this is a majority present before us, right? There is no generalization from a sample here. I know all the stores, and I know: store A is kosher, B is kosher, C is non-kosher, D is kosher, and so on. I know all the stores. In that sense this is like the judges. I know everything. Okay? It’s not a generalization from a sample, so it is not a majority not present before us. But then on what is it based?
[Speaker D] It is based on an assumption—an assumption that has no justification at all—that the chance that this piece of meat came from each one of the stores is equal.
[Rabbi Michael Abraham] Right? Only under that assumption can the distribution of the stores determine the status of the piece of meat. I’m basically assuming that I have ten stores, nine kosher and one non-kosher, and the assumption is that this piece of meat has an equal chance of coming from each of the stores. That is a distribution. In a distribution there are no probabilities by themselves, right? We talked about this in the introduction. You need to assume a distribution. The distribution is that the chance that this piece comes from each store is equal—one tenth. Okay? And in fact the later authorities discuss what happens if there is one very large store that sells much more meat than the others. The question is whether there too one follows the majority of stores or not; there are disputes about this, but I’m not getting into that now. But let’s say all the stores more or less sell the same amount of meat. Fine. Still, notice: there is some assumption here. Why? On what basis? It sounds very reasonable to us, right? But in fact it has no statistical basis. It’s just logic. I fully agree with that logic, but it is logic. Try to test me on this. I am now claiming a hypothesis: in most cases, the piece of meat you find comes from the kosher stores. I’m going to pull the same trick the Sefer HaChinukh did. Yes? I’m turning a majority present before us into a majority not present before us. What am I saying? Let’s suppose we have lots of cities and lots of pieces of meat that we find lying in the street. Each time in a different city, at a different time, and the distribution of stores is, say, nine kosher and one non-kosher. In all the cases I collected only those cases—only cities where the split is like that. Now I ask: in how many of those cases did the piece come from the kosher stores, and in how many did it come from the non-kosher store? So apparently there’s no problem: let’s choose cases, check where the piece came from, and do statistics. 
And we’ve turned it into a majority not present before us. There is a rule like this: if there are nine kosher stores and one non-kosher store, then in all such cases in the world, in ninety percent of the cases the piece comes from the kosher stores and not the non-kosher one. Do you see the similarity to the Sefer HaChinukh? What the Sefer HaChinukh did with judges, I’m doing with pieces of meat. Sometimes I feel there isn’t much difference between the two. But it’s the same move, right? Now I’m going to ask the same question here too. Let’s say I want to check my sample. I found a hundred cases where there is a piece of meat lying in the street and the city has ten stores, nine kosher and one non-kosher—or whatever, some number of stores where ninety percent are kosher and ten percent are non-kosher. I gathered all those cases, and for me that is a representative sample. Now what do I have to do? Check in each case where the piece of meat came from, and then see. And my assumption is that we will find that in ninety percent of the cases the piece came from the kosher stores. Okay? Then I can apply that to the case before us as well. What’s the problem? The problem is that there is no way to check it. How will you check where the piece of meat came from in all those cases? You found it lying in the street—how will you check where it came from? We have no independent way to check where it came from. So what do we do? We follow the majority present before us. If there are nine kosher stores, then we assume the piece came from the kosher stores, right? We assume that majority; we do not arrive at it by a statistical calculation. We have no way to check the sample. Do you see the similarity to courts? It is exactly the same thing. Stores and courts are exactly the same thing. And therefore both of them are a majority present before us. Because it is exactly the same thought process. What does that mean? 
It means you arrive at your majority not by generalization from a sample, but from a priori reasoning. You determine behavior in the world, but not on the basis of generalization from a sample—rather from reasoning. In courts you say, by reasoning, that if a majority of judges says one thing, then most likely the majority is right, and only a minority chance exists that the majority is wrong. That is from reasoning, not from generalization from a sample. I have no way to create such a sample. Not the generalization—the generalization I can make. The sample I cannot create. Okay? And in stores it’s the same thing. I want to claim that in most cases a piece of meat in such a case will have come from the majority stores. For that I need to check a hundred such cases—or whatever, a set of such cases—and see in each case where the piece came from. But I have no way to do that. How would I know where the piece came from? There is no way to conduct a planned experiment of losing a piece of meat and then asking myself—I know where it came from—and then asking the judge: come rule, where did it come from? If it is a planned experiment, then I’m not losing a piece of meat. Loss is, by definition, an unplanned process. You lose a piece of meat randomly, without noticing. You can’t do that intentionally. So there is no way to plan such an experiment. Okay? Maybe, maybe I could design such an experiment in the following way: if I marked the pieces of meat in all the stores and tracked, over I don’t know, ten years, pieces of meat that were lost and found in the street. And let’s say every store is marked with a number from one to ten, and all the pieces of meat are marked. In store number one, all the pieces of meat have “1” written on them; in store number two, all the pieces of meat have “2” written on them, and so on. Now I check pieces that people lost and that were found in the street, and I ask myself: how many of these pieces came from store number one, two, three, ten? 
That could be a test…
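The a priori step in the store case, and the hypothetical marked-meat experiment just described, can be sketched in a few lines. This is only an illustration: the function names, the per-store weights, and the simulated loss process are assumptions introduced here, not anything measured. With equal weights the kosher share is nine tenths by assumption; the simulation only shows what tallying marked pieces would look like if such data existed.

```python
import random
from fractions import Fraction

def kosher_probability(weights, kosher):
    """Chance that a found piece is kosher, given assumed per-store weights.

    weights[i] is the assumed relative chance that a lost piece came from
    store i; kosher[i] says whether store i is kosher. The weights are an
    a priori input, not something measured.
    """
    total = sum(weights)
    return sum(Fraction(w) for w, k in zip(weights, kosher) if k) / total

def simulate_marked_pieces(weights, n_pieces, seed=0):
    """Hypothetical marked-meat experiment: tally the store of origin of each lost piece."""
    rng = random.Random(seed)
    counts = [0] * len(weights)
    for store in rng.choices(range(len(weights)), weights=weights, k=n_pieces):
        counts[store] += 1
    return counts

kosher = [True] * 9 + [False]
uniform = [1] * 10                 # assumption: every store equally likely
big_nonkosher = [1] * 9 + [9]      # assumption: one very large non-kosher store

print(kosher_probability(uniform, kosher))        # 9/10
print(kosher_probability(big_nonkosher, kosher))  # 1/2
print(simulate_marked_pieces(uniform, 1000))      # tallies per store
```

Changing the weights changes the answer, which is the point: the nine-tenths figure comes from the uniform assumption, not from the count of stores alone.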
[Speaker E] Yes? What?
[Rabbi Michael Abraham] This could be a check. I understand. If someone is speaking, then please stop and go back on mute afterward, because it’s disruptive. The claim, really, is that if I were to do such a thing, I would have some way to test the law of nature. Namely, that if there are ten stores, nine kosher and one non-kosher, then the pieces that get lost really are distributed nine to one, kosher versus non-kosher. But you understand that nobody ever did this experiment, and there isn’t really any way to do it, because the pieces of meat aren’t marked. If we did do this experiment, it would become a majority not present before us. That’s what I’m claiming. It’s defined as a majority present before us because we didn’t do the experiment and there’s no real way to do the experiment. So what then? How did we actually decide that it is more likely that this piece came from the kosher stores? Logic says so. Why assume that one store has an advantage over the others? Assuming they’re all more or less equally close to the place where I found the piece of meat, all sell the same quantity of meat, I have no reason to assume there are differences. Say they all package in the same way. If there were one store whose packaging tears more easily, then I would assume more pieces of meat are lost by customers from that store, because they fell out of a torn bag. But let’s say they all sell in reasonable bags; then we have no reason to distinguish among them. So what does that mean? “We have no reason to distinguish” proves nothing. It only says I have no other indication. And in the absence of any other indication, I assume the distribution is uniform. Right? That’s what I assume. But that’s an a priori assumption. It’s not the result of generalizing from a sample. That is exactly what happens in courts. What does Sefer HaChinukh say?
Sefer HaChinukh says: look, I’m telling you a priori—I have no way to check this—that in most cases where there is disagreement between a majority and a minority, in most cases the majority will be right and not the minority. How do you know? we’d ask him. “I don’t know, my logic tells me.” Right? There’s no way to check this based on a sample. What would he answer? “My logic tells me.” In other words, it’s exactly like the stores—a majority present before us. A majority present before us is a majority based on logic. A majority not present before us is a majority based on empirical observation, on a sample and generalization from that sample to a general law. That’s the difference. Once you understand it this way, and not according to the standard definitions—Eliav is here, I don’t know, he asked me after last time about a formulation in Kovetz Shiurim—Rabbi Elchanan Wasserman in Kovetz Shiurim on Bava Batra suggests some distinction between a majority present before us and a majority not present before us, and he claims that it’s what I said, but that’s not correct. Here, this is the distinction according to the formulation—it doesn’t matter, I’m answering him already—but according to his formulation there in Kovetz Shiurim, you won’t be able to explain why the majority in court is a majority present before us. According to my formulation it’s self-evident. Therefore I think that once you understand that a majority not present before us is generalization from a sample, while a majority present before us is a majority based on an a priori assumption, on reasoning, then it’s clear that both in court and in the meat case it is a majority present before us. There’s no need to reach for formal solutions or anything like that. By the way, that’s really true—it’s all logic. It’s only logic; it’s not empirical observation or anything. 
People think that specifically the case of a majority present before us is statistics, and a majority not present before us is not statistics—exactly the opposite. A majority not present before us is statistics. A majority present before us is an a priori assumption; it is not statistics. It is an a priori assumption that says that if most of the stores are kosher, then I assume this piece came from the majority and not from the minority. How do you know? You can’t run an experiment, you can’t check a sample, there is no statistical space, there is nothing. You can’t do a statistical calculation. It has nothing to do with statistics; it’s an a priori assumption. A very logical assumption—I think each of us would agree that it’s a logical assumption in the absence of other information, yes? There could be information, like with a loaded die: if one store is larger, or another store’s bags tear more easily, then that really would break the whole thing. But if we have no other information, then the assumption is that it is uniformly distributed—in other words, a fair die. And if so, then I assume the outcome is probably distributed according to equal chances. That’s an a priori assumption, not statistics. And now that is basically my claim, and therefore the majority in court really is a majority present before us and not a majority not present before us. Now. An interesting question: how did we actually reach the conclusion that the majority of judges is right? Here I come to the earlier remark—what about a cattle herder? I think Ido, you asked that. What about a cattle herder? Okay, let’s try to think for a moment. How do we get to this assumption or this intuition that the majority of judges hits the truth with higher probability? Yes, ostensibly this is a priori logic, like the meat separated from the stores. We have no sample, no generalization from a sample, no way to check this empirically. So who said it’s true? 
Maybe it’s just implanted in us somehow, but has no justification. By the way, Rabbi Shimon Shkop wants to claim something like this. To my mind it’s absurd, but that’s what he wants to claim: that there is no logic at all in a majority present before us; it’s a scriptural decree. But of course that’s not right. Ask any non-Jew, without any scriptural decree or anything, if he finds a piece of meat in the street and there are nine kosher stores and one non-kosher one, everyone will tell you that it probably came from the majority of stores. You can’t say that this is a scriptural decree; it’s logic. The question, though, is what exactly is that logic based on? That’s not a simple question. Now regarding judges, I’ll present it this way. Once a man asked me: why is the majority of judges right? Ostensibly it’s exactly the opposite: the majority is wrong. Usually the majority is wrong. Why? Let’s say a judge is right in 70% of cases. Then the probability that a judge is right is 0.7. Okay? What’s the probability that two judges are right? Assuming they make independent decisions: multiply—0.49. Less than 50%. Meaning, the larger the number of judges, the greater your chance of error, not the smaller. Usually the majority is wrong; usually the minority is right. That’s how he asked me. This was a man who had even done an advanced degree in statistics. So I told him that he was making a bitter mistake. And anyway, let me show you this in a nutshell. Let’s now do the calculation for the chance that the judge is wrong, not that he is right. You calculated the probability that he is right. 0.7 he is right; two judges are right with 0.7 squared, 0.7 times 0.7, which is 0.49. Let’s look at the chance of error. The chance of error for one judge is 0.3, right? If he is right 0.7, he is wrong 0.3. What is the chance of error for two judges? Multiply, same logic—0.09. Well then, with two judges the chance of error is smaller, not greater. 
But how do those two figures fit together? Something here doesn’t make sense. Now look, just as an indication—I’m showing you how one can apply the introductions I gave earlier in statistics. It’s very simple, exactly what Ido writes here: the sum of the probabilities does not add up to one. Right—the probability that both judges are right is 0.49, the probability that both are wrong is 0.09, so together that comes to 0.58, right? Am I right? Yes, 0.58. So where did the other 0.42 disappear to? We have two cases, each has a probability, but the probabilities do not sum to one. What other possibility is there besides both judges being right or both judges being wrong? What else could there be? If they both say the same thing, yes, then it can’t be that one is right and the other wrong—they are saying the same thing. So either both are right or both are wrong; there are no other possibilities.
[Speaker F] One is right and one is wrong, no? What? One is right and one is wrong.
[Rabbi Michael Abraham] But if they’re saying the same thing, then one can’t be right… They said the same thing. Yes, let’s set up the situation. I’m checking the majority opinion. The majority opinion says so-and-so is liable, and the minority says so-and-so is exempt. Now I’m asking: what is the probability that the majority was right, and what is the probability that the majority was wrong?
[Speaker F] But it’s obvious why we won’t get to one; it’s obvious why we won’t get to one if we relate only to cases where they say the same thing.
[Rabbi Michael Abraham] Of course, of course—that’s the mistake. It doesn’t add up to one and it isn’t supposed to add up to one. What can add up to one, if you really want to sum it to one, is Newton’s binomial. Yes, do that—calculate the probability that one judge is right and two are wrong: p times 1 minus p squared. 1 minus p is the chance of error. Two judges wrong is 1 minus p squared, one judge right is p. That product gives you one judge right and two judges wrong. Sorry—one judge wrong and the other two right, so it’s 1 minus p times p squared. And then sum all those, and it really does come out to one. But to look at two judges… out of the three, and ask whether they are right or wrong—that is not a complete event space, so it won’t add up to one. Another way to see it is to return to what we saw in previous classes. What does it mean to say a judge has quality—that a judge is right seventy percent of the time? Like, remember the medical tests. Let’s call the quality of a judge p, 0.7. Assuming Reuven committed murder, the judge will detect it; the judge will convict him, right? So if in seventy percent of cases the judge rules correctly, then that’s a judge of quality 0.7. Okay? But here we are asking the opposite question. Assuming the judges say he is a murderer, they ruled that he is a murderer—what is the probability that he really is a murderer? Do you understand that we reversed the direction? The measurement of the judge’s quality goes in the direction: assuming he murdered, what is the probability that the judge will rule that way? The question we’re asking here is the reverse: assuming the judge ruled that way, what is the probability that he is a murderer? You understand that we now need Bayes’s formula, the one that flips p of A given B into p of B given A. That already depends on the question of what the objective probability is that he murdered—what is the probability that he murdered. 
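The missing 0.42 and the binomial sum mentioned here can be written out explicitly. This is a sketch under the lecture's own simplifying assumptions (independent judges who all share one quality p, with p = 0.7 in the numerical line); the underbrace labels are added only for readability.

```latex
% Two judges: the full event space has three kinds of outcome.
\[
\underbrace{p^{2}}_{\text{both right}}
+ \underbrace{2p(1-p)}_{\text{one right, one wrong}}
+ \underbrace{(1-p)^{2}}_{\text{both wrong}}
= \bigl(p + (1-p)\bigr)^{2} = 1,
\qquad 0.49 + 0.42 + 0.09 = 1 .
\]
% Three judges: the same expansion with four kinds of outcome.
\[
p^{3} + 3p^{2}(1-p) + 3p(1-p)^{2} + (1-p)^{3}
= \bigl(p + (1-p)\bigr)^{3} = 1 .
\]
```

The terms for "both right" and "both wrong" alone sum to 0.58 because the mixed outcome, which carries the remaining 0.42, was left out of the event space.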
All the problems we talked about—you remember?—with medical diagnosis and all the examples I brought there. I also mentioned there the issue of following legal evidence. Here is an example—not exactly about evidence, but about the quality of a judge. You have to do the whole Bayes formula calculation in order to get to this and arrive at results. What can be done more simply—and this is equivalent to Bayes’s formula, though I won’t prove that here, but it is equivalent to Bayes’s formula—let’s say two judges say, two judges say that so-and-so is liable, and one judge says he is exempt. Okay? And all the judges have quality p, yes? Remember? We’re talking about Sefer HaChinukh, that the judges’ quality is more or less at the same level. So they all have quality p, say 0.8 or 0.7, doesn’t matter. Then if the two judges are right and one is wrong, that is p squared times 1 minus p. 1 minus p is the probability that a judge is wrong, and p is the probability that he is right, so two right and one wrong is p squared times 1 minus p. The opposite probability—that the minority was right and the majority wrong—is p times 1 minus p squared. Right? 1 minus p is the probability that a judge is wrong, squared because two judges were wrong, times p, which is the probability that one judge was right. I see that… let’s do this for a moment, I’ll write it out for you. It’s not… it’s very simple, maybe it sounds not…
[Speaker D] Without writing it, it’s a little hard to follow.
[Rabbi Michael Abraham] It’s very simple. Look, the chance that a judge is right, that is, the quality of the judge, is p. Okay? And all three judges have that quality. Now, two judges ruled that so-and-so is liable, and one ruled that he is exempt. That’s the given data. Now I say there are two possibilities: either so-and-so is liable or so-and-so is…
[Speaker G] We can’t see anything, we can’t see anything on the screen.
[Rabbi Michael Abraham] Can’t hear?
[Speaker G] We can’t see anything on the screen.
[Rabbi Michael Abraham] You can’t see? No. How can that be? I shared it. One second. Can you see now? Now you can see. Oh, okay. So the chance that a judge is right, which is basically the measure of the judge’s quality, is p. Two judges ruled that so-and-so is liable and one judge ruled that so-and-so is exempt. That’s the given data. Now there are two possibilities: either the majority was right or the minority was right, correct? Those are the two possibilities we’re looking for now. If the majority was right, that means the two judges were right and one was wrong, which gives p squared times 1 minus p. Okay? Right? Because for two judges who are right, each one is p, so p times p, and times one judge who was wrong, whose probability is 1 minus p. So that’s one possibility, the possibility that the majority was right. Okay? What is the possibility that the majority was wrong? p times the square of 1 minus p. Understand why? 1 minus p is the chance that the judge is wrong. Two judges being wrong is 1 minus p times 1 minus p, that is, the square of 1 minus p. And one judge being right has probability p. I always multiply the probabilities of each of the judges. So that’s the probability that the majority was wrong. Now, we already talked about this. Let’s say p is 0.7, okay? These two things do not add up to one. Right? Because here it is 0.49 times 0.3, which is about 0.15, and the other one is even smaller, 0.7 times 0.09, about 0.06. So together they come to only about 0.2, certainly not to one. It doesn’t add up to one. But I do know that the probability is proportional to this and to this. If you want, we can write this divided by the sum of the two, and that divided by the sum of the two; that would be the result of Bayes’s formula. But more simply, this is case A, okay? And this is case B. Now I ask myself: which probability is greater, A or B? That’s really what we’re asking here, right? What is the probability of A, that is, that the majority was right, divided by the probability of B, that the majority was wrong; what is the ratio between them? Since they both have the same denominator, I can save myself the denominator. Let’s divide one by the other. What do we get? p divided by 1 minus p. Agreed? That’s the ratio between the two states. Right? p squared divided by p gives p, and 1 minus p divided by the square of 1 minus p leaves 1 minus p in the denominator. Now look: if p is 0.7, okay? Then this is 0.7 divided by 0.3, which is two and a third. The probability that the majority was right is two and a third times greater than the probability that the majority was wrong. That’s the correct calculation. And if it’s 0.9, then the probability that the majority was right divided by the probability that the majority was wrong is ninefold, not two and a third. And if it’s a remarkable, excellent judge, whose probability of being right is 0.99, then it’s a hundredfold. Right? That’s 0.99 divided by 0.01, so roughly a hundred. Ninety-nine if you want to be exact. It’s ninety-nine.
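The ratio derived here, and its normalized Bayes form, can be checked numerically. The sketch below is a minimal illustration, not part of the lecture: the function names and the `prior` parameter are assumptions introduced here, and it relies on the same simplifications as the spoken derivation (three independent judges of equal quality p, strictly between 0 and 1, split two against one). With an assumed 50/50 prior the posterior comes out equal to p itself, which matches the odds p / (1 - p).

```python
def majority_odds(p):
    """Ratio P(majority right) / P(majority wrong) for a 2-1 split, equal quality p."""
    right = p * p * (1 - p)        # two judges right, one wrong
    wrong = p * (1 - p) * (1 - p)  # two judges wrong, one right
    return right / wrong           # algebraically p / (1 - p)

def majority_posterior(p, prior=0.5):
    """Bayes posterior that the majority verdict is true, given a 2-1 split.

    prior is the assumed chance, before hearing the judges, that the side
    the majority favored is the true one (an a priori input).
    """
    right = prior * p * p * (1 - p)
    wrong = (1 - prior) * p * (1 - p) * (1 - p)
    return right / (right + wrong)

for p in (0.7, 0.9, 0.99):
    print(p, round(majority_odds(p), 2), round(majority_posterior(p), 2))
```

Passing a different `prior` shows how the answer shifts once the base rate of liability is not assumed to be one half, which is the dependence on the objective probability mentioned earlier.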
It’s ninety-nine times more. In other words, the higher the quality of the judge, the more p approaches one, and the advantage of the majority rises. Right? By how much is the probability that the majority is right greater than the probability that the majority is wrong? That depends on p. P here meaning the letter p, not the word for mouth, yes? So the quality of the judge improves our situation tremendously. If the judge has quality 0.9, adding two more such judges brings us to a very high level of certainty that the majority was right. But if the quality of the judge is, say, one-half, sometimes right, sometimes not right, roughly equally, what is the result here? One. Right? Half divided by half, one. There is no advantage at all to the majority over the minority. Right? What happens with judges whose probability is 0.4? Meaning that in most cases they are wrong. Do you understand that then it is more likely that the minority was right? There, we’ve reached the answer about the cattle herder. I told you I’d answer you about the cattle herder.
[Rabbi Michael Abraham] In the case of a cattle herder there is no advantage of the majority over the minority. Now it’s true that for a cattle herder the more correct model is one-half, not 0.4. Because a cattle herder—the worst possible judge—is one-half. A judge who is 0.4 would have to be a good judge who systematically reverses his own results. Because one-half means you’re just taking a shot in the dark, random choice. And with a random choice, half the cases you’ll be right, half you’ll be wrong, for the sake of discussion. It’s not exact, because it depends on the probability that the people coming before you are actually guilty or not. But let’s say it’s fifty-fifty. Okay? Then the worst judge in the world is one-half; there is no judge below one-half. Below one-half would have to be a judge who deliberately makes himself an idiot, and that’s worse than a blind random draw. Worse than a blind random draw, so—okay? Therefore… yes, in intelligence, in… what is it called? In machine learning, the assumption is always that one-half is the minimum; there is nothing below one-half. Meaning, the worst possible algorithm is a blind random draw. A blind random draw gives you correctness in half the cases. An algorithm that gives you correctness in less than half the cases—throw it in the trash, just flip a coin. That you can always do. Okay? So therefore it doesn’t matter; I’m saying on the conceptual level, a bad judge is a judge who hovers around one-half. Then there is no advantage to the majority over the minority. Now there is another interesting case. What happens if I have one judge who is 0.9 and I add to him two judges who are 0.6, almost cattle herders—let’s say sheep herders. Okay? What happens then? Then in some cases—in this particular case, for example—adding those two judges at 0.6 lowers the quality; it increases the probability of error. You actually made the situation worse by adding two judges at 0.6. 
Because with the 0.9 judge, the chance of error is, say, 0.1. If he has with him two more judges of 0.6, I think the chance of error comes out to 0.28 if I remember correctly. The chance of error increases when you add judges, even though notice: each of those judges is above one-half. Meaning, each is a judge who is right more often than wrong. But it doesn’t matter. Since I have one judge who is 0.9, take just him and you have an optimal judge. The moment you add two others, the chance of error is still below one-half, of course—that’s always true. Because the number of judges will always keep you above one-half. But it is significantly worse than the single judge. So stay with the single judge. And that is exactly what Sefer HaChinukh says: if they are not equal in wisdom, then there is no point following the majority—take the minority. Notice, what does it mean to take the minority—sorry, take the minority and don’t follow the majority. What does it mean to take the minority? It means to take what that one judge said and ignore what the two less intelligent judges said, right? The chance of error there is 1 minus p. Because de facto it is a court of one judge. So if he is 0.9, the chance of error is 0.1. If you followed the two judges who are 0.6, you would get 0.28. Much higher chance of error. Therefore Sefer HaChinukh says: clearly, if the majority of the judges are foolish and the minority wise—the minority is the wise minority—follow the minority and not the majority. When you follow the minority and not the majority, your chance of error—you do not need to factor in the minority. Your chance of error is as though de facto you have only a single judge. Don’t multiply it by the chance that both are wrong. You don’t need to. Treat it as though there is a single judge, because you follow only him. And a single judge is p. You ignore the two minority opinions—I mean the two dissenting opinions. There is no need to do all the multiplications that I did. 
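The effect of mixing judges of different quality can be checked by enumerating every combination of right and wrong judges. This is a hedged sketch: the function name is hypothetical, and it assumes independent judges and a simple majority vote over an odd-sized panel. Under those assumptions the panel of 0.9, 0.6 and 0.6 errs about 0.208 of the time, somewhat below the figure recalled from memory above but still roughly double the single strong judge's 0.1. A panel of 0.9, 0.8 and 0.8 is included for comparison.

```python
from itertools import product

def majority_error(qualities):
    """Chance that a simple majority vote is wrong, assuming independent judges.

    qualities[i] is the assumed probability that judge i rules correctly.
    Intended for an odd number of judges, so there are no ties.
    """
    n = len(qualities)
    error = 0.0
    for outcome in product((True, False), repeat=n):  # True = this judge is right
        prob = 1.0
        for right, q in zip(outcome, qualities):
            prob *= q if right else 1 - q
        if sum(outcome) <= n // 2:  # fewer than a majority are right
            error += prob
    return error

print(round(majority_error([0.9]), 3))            # single strong judge
print(round(majority_error([0.9, 0.6, 0.6]), 3))  # strong judge plus two weak ones
print(round(majority_error([0.9, 0.8, 0.8]), 3))  # strong judge plus two fairly good ones
```

The three results show both directions: weak colleagues raise the error above the single judge's, while sufficiently good colleagues lower it.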
Therefore Sefer HaChinukh says: obviously one should follow the wise judges even if they are in the minority. And not—now by the way this is not entirely simple. Suppose there are situations in which a judge of 0.9 together with two judges of, say, 0.8 can reach a better result than the single judge. So it’s not always true that when the two other judges are worse, that lowers the quality of the result. So Sefer HaChinukh’s rule is a heuristic rule. Meaning, it is not always correct. But heuristically he is right. In general, take the wise minority and go with them. That is better than considering everyone, even if the fools are as numerous as those who left Egypt—six hundred thousand fools. Okay? So that’s basically the claim. Now what is the significance of all the calculation I did here? I hope it wasn’t… these are simple things overall, it’s not… what is the significance of the calculation I did here? I showed you how Sefer HaChinukh basically arrived at this reasoning or this universal law that in most cases the majority opinion is the correct one, and only in a minority of cases is the majority wrong. I said this is not a generalization from a sample. So how do you get there? Through the calculation I just did. Sefer HaChinukh obviously didn’t do this, but this is what lay behind his mode of thought if you formalize it. This is basically the way to get there. Now what is the problem with this method? You have lots of assumptions. You have an assumption, for example, about what the quality of the judge is—p. You have no way to check the quality of a judge, as I said earlier. That’s why this is also not a majority not present before us, because how could you check the quality of a judge? Take all the cases he ruled on and check whether he was right or not—but you have no way to know if he was right or not. You cannot know better than he does what the real outcome is. So how can you check the quality of a judge? 
The assumption that there is such a thing as a judge’s quality p is itself an a priori assumption. I assume that if he is a Torah scholar and has sound judgment, he is probably right in most cases. Why? Because that seems logical to me. It is not the result of empirical testing; I have no way to do that. Now, if the calculation I just did were certainly correct, then it could not be worse than a majority not present before us. A scientific generalization that is one hundred percent correct cannot be something worse than a majority not present before us. Certainly I would give it the status of a majority not present before us. But that is not so, because this calculation is not certainly correct. This calculation rests on various a priori assumptions: that I have a judge of quality p, say 0.8 or 0.9, I don’t know what. I just assume it because I think he is a wise man, and I throw out a number estimating how wise he is. I say it’s 0.8. Based on what? Just because I decided. There is no way to know. Therefore a majority present before us is not statistics. Look, I did a probability calculation here, Bayes’s formula, I multiplied probabilities, I assumed independence among the judges, which is also not a correct assumption, but let’s say so, okay? But everything began from my assumption that there is a fixed quality p for a judge. And where did that come from? Because I decided. In other words, at the end of the day, after all the statistical or probabilistic calculation I did, everything begins and ends with an a priori assumption that there is such a thing as a judge’s quality and that you can attach some sort of number to it, 0.7, 0.8, whatever. There is no way to check any of these assumptions empirically.
Just as judges form an impression of a witness or litigant or complainant who appears before them—that he gives the impression of telling the truth. Impression has a place in a court ruling. More than that: it is accepted that when the Supreme Court hears an appeal from a lower instance, the Supreme Court does not revisit the evidence. There is no judicial review of the evidence accepted by the lower court. Why? Because the witnesses are not standing before the judges of the Supreme Court, and you cannot form an impression of their credibility. So there you assume that what the lower court determined is true. You need to review its judicial reasoning. But the evaluation of evidence is not reviewed by the Supreme Court, because there was direct impression. What is that impression worth? After all, you have no way of knowing whether the man is telling the truth, or whether he is a terrific actor who inspires trust in you. How do you know at all? Maybe he doesn’t even need to be an actor. Who said there are indicators of which person is trustworthy and which is not trustworthy? What, just because that’s how you feel? The basis—what? How do you know that? Just because I have some kind of intuition. Do you understand that this whole business—I am inclined to give it weight, but do you understand that it is ultimately just an a priori assumption with no empirical, statistical, or other justification? Therefore a majority present before us is not statistics. It has nothing to do with statistics. It is simply assumptions, a collection of assumptions of a priori logic. That’s all. And precisely a majority not present before us is statistics. You check the representative sample, and there you check empirically. There is no problem—you can check. Did she give birth at nine months or at seven? That is not an assumption; it is a fact. You checked it. 
The only assumption entering into a majority not present before us is that the sample you know is a representative sample, and that the woman before you is indeed distributed according to the general majority or the representative sample, and that she is not a pathological case. Fine. So there too there is some kind of assumption, but that’s true in every statistical calculation. At the basic level, a majority not present before us is clear-cut statistics. That’s what polling institutes do. Polling institutes check what the vote distribution in an election will be. They take a sample and assume that this will probably also be the distribution among all voters. By the way, they are usually amazingly accurate; we already talked about that. Contrary to the bad reputation they got, and really without justification. Pollsters reach amazing results, all of them, by the way, without exception. Today the technique is simply so good that you don’t need to be an especially good pollster; the difference between a good and a bad pollster is at the margins. Generally speaking, a very good method has developed, and you can reach very good generalizations from a sample. And you can check it: afterward they hold elections and see the result. You can check the pollsters, and they come out excellent on that test. By contrast, a majority present before us has no way to be checked. Here you have no indication that the sophisticated calculation you did is worth anything, even though it is mathematical, very convincing, very logical, seemingly absolute, an actual calculation. It is a nonsense calculation. Everything begins and ends with the assumption that there is such a thing as p, that there is independence among the judges, and that you can multiply the probabilities. You do p times (1 minus p) squared, assuming each judge made his decision independently of the other two, and that’s not true.
In capital cases they begin with the youngest, and even there there can be dependence. But in monetary cases they don’t even begin with the youngest; they can begin with the greatest. So the lesser ones are certainly drawn after him. There is no independence among the opinions. By the way, that does not necessarily worsen the chance of being right—on the contrary. If they follow the greatest judge and the lesser ones are drawn after him and do not rule independently, that can actually improve our chance of being right. Everything is the opposite of common-sense intuition in these calculations. People think: wait, each one should express his opinion independently, otherwise it’s biased. No, no, no. If they follow the greatest judge, they will usually be more correct. Because that is what it means that he is the greatest judge—that he is a better judge. Why should they follow their own weak opinion if they can follow a judge who is most likely right? If he persuaded them, that gives us an indication that he is even more likely to be right than in the ordinary case. In the ordinary case, he tells us that so-and-so is liable; I assume he is probably right because he is a good judge. But here it’s not only that—he convinced two more judges, and they too are not reeds in the swamp, they are not nobodies. They’re not 0.9, but they are 0.7, and he managed to persuade them. So all the better; then the probability is even greater than 0.9 that he is right. So the fact that they follow him contains nothing bad in the statistical sense. In other words, there are many, many assumptions here that we make which are not really the product of statistics. This whole business is very confusing, even though the calculations are very simple, at the level of tenth-grade high school. You don’t need any mathematics beyond tenth grade here, but you do have to do the math. We have some tendency to jump straight to the intuitive answer, and that is where we fall down. 
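To make the point about dependence concrete, here is a toy model, offered only as a sketch. It assumes a lead judge of accuracy 0.9 and two lesser judges of accuracy 0.7, and it assumes that each lesser judge simply adopts the lead judge's view with some probability `copy_prob` and otherwise rules independently. The parameter `copy_prob` and the function name are hypothetical and do not come from the lecture.

```python
def majority_accuracy_with_deference(p_lead, p_other, copy_prob):
    """Accuracy of a 3-judge majority when each of two lesser judges
    copies the lead judge with probability copy_prob and otherwise
    votes independently with accuracy p_other (toy model)."""
    # Chance that a lesser judge is right, given the lead judge is right / wrong.
    right_if_lead_right = copy_prob + (1 - copy_prob) * p_other
    right_if_lead_wrong = (1 - copy_prob) * p_other
    # Lead right: the majority is right unless both lesser judges are wrong.
    win_lead_right = 1 - (1 - right_if_lead_right) ** 2
    # Lead wrong: the majority is right only if both lesser judges are right.
    win_lead_wrong = right_if_lead_wrong ** 2
    return p_lead * win_lead_right + (1 - p_lead) * win_lead_wrong

for c in (0.0, 0.5, 1.0):
    print(c, round(majority_accuracy_with_deference(0.9, 0.7, c), 4))
```

In this model the majority's accuracy rises as the lesser judges defer more to the stronger judge, which matches the claim that dependence need not worsen the result. A different model of dependence, for example one in which the judges share a common bias, could behave differently.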
By the way, all these calculations of Daniel Kahneman—you know, I mentioned representativeness and all the biases—that’s all tenth-grade math. Everything I know from there anyway has not one drop of mathematics beyond tenth grade, and for that he got a Nobel Prize. Because the wisdom is not the mathematical calculation; the wisdom is using that tenth-grade calculation intelligently. Sometimes tenth-grade math is enough, but you have to apply it precisely, systematically, and intelligently. Then you discover all kinds of fallacies. What I pointed out now is a collection of fallacies—failures—using tenth-grade math. Meaning, the fact that it is tenth-grade math does not mean people won’t make mistakes. Why? And they know the tenth-grade math; it’s not that they don’t know it. Why do they make mistakes? Because that calculation is based on assumptions, and the mistake is in the assumptions. You assume the wrong assumptions. That’s the point. There is a certain magic to formal, mathematical arguments; they somehow seem absolute to us. We forget that when mathematics deals with the world, it usually expresses or is based on certain assumptions we have about the situation. And if those assumptions are not correct, no mathematical calculation will help. The mathematical calculation will lead us to a false result because the assumptions on which it is based are false. And this happens every day. The statistics you hear in the media day in and day out are almost always nonsense. Almost always nonsense. But it’s nonsense with data, with numbers—they show you, I don’t know, seventy percent, eighty percent. The question is where that seventy percent came from, what it means, what interpretation you give it, which parameters affected it and which didn’t—no calculation beyond tenth grade. Lots of mistakes by PhDs, even though the calculation is tenth-grade math, and PhDs make mistakes in it. This is very delicate, and one has to be careful not to be seduced by formulas. 
Formulas are no guarantee that you are right. Formulas help a great deal with thinking if you start from the correct assumptions. From there on, mathematics will take you very well to the conclusion, much better than without it. But it cannot replace correct assumptions. If your assumptions are not correct, no mathematics will help you; it will only complicate things further. Sometimes mathematics will show you that your assumptions are not correct; you will simply reach a contradiction, as I showed you earlier with the two judges. What is the probability they are right? 0.49. What is the probability they are wrong? 0.09. Well then, mathematics shows you that those two things do not add up to one. So mathematics shows you that your assumptions are not correct. That is an advantage of mathematics: even if you start from wrong assumptions, sometimes the results, if you pay attention to them, will show you that your assumptions are not correct, okay? Or that your interpretation is not correct. So that is basically the claim. Therefore the majority of judges is a majority present before us and not a majority not present before us.
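The two-judge check can be reproduced in a few lines. This sketch assumes two independent judges of accuracy 0.7 each, which is the value implied by the figures 0.49 and 0.09; the variable names are illustrative only.

```python
p = 0.7  # assumed accuracy of each judge: 0.49 = 0.7**2 and 0.09 = 0.3**2

both_right = p * p
both_wrong = (1 - p) * (1 - p)
split = 2 * p * (1 - p)  # the judges disagree with each other

print(round(both_right, 2), round(both_wrong, 2), round(split, 2))
print(round(both_right + both_wrong, 2))          # the two cases alone
print(round(both_right + both_wrong + split, 2))  # all three cases together
```

The two cases "both right" and "both wrong" fall short of one because they leave out the case in which the judges disagree. Treating them as the only possibilities is the faulty assumption that the arithmetic exposes.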
[Speaker F] Sorry, a question. To say about judges that it’s an a priori assumption because there is this p parameter, I understand that. It’s a little harder for me to understand saying the same thing about the nine stores, because here there isn’t the arbitrariness that there is in setting p.
[Rabbi Michael Abraham] No, so that’s why I made the analogy to the stores. Let me do it again. Look at the stores for a moment. You have nine kosher stores and one non-kosher one. Now I’ll do the exercise that Sefer HaChinukh did for judges. I say: I want to know the probability that this piece of meat came from the kosher stores. So I say: let’s think about the general law—what happens in a collection of cases where in each case there is a city with nine kosher stores and one non-kosher one, and a piece of meat was lost? Now I ask myself where the piece of meat came from. Now if my assumption is that this goes with 0.9, that is basically my p, and now I ask what is the probability that this happens in two cases? What is the probability that this happens in three cases? And now I infer a conclusion about the case before me. So the p—what is roughly parallel, yes, to p—is basically my assumption about what happens in one such individual case. Of course that is an assumption, but if I assume it, then now do the calculation for a hundred such cases, and you will discover, of course, that in a hundred such cases there is a distribution in favor of the majority of kosher stores. But you assumed that in each individual case, “separated” always means separated from the majority and not from the minority, and the statistics are nine against one. So it is very similar to judges. It’s not exactly the same thing, but it’s very similar. You always assume something about the individual case, and from there you can play with multiplication and statistics and everything. But it all begins with the individual case, where it is just an assumption. How do you really know that the piece of meat separates with equal probability from all the stores? Why do you assume that? How do you know that?
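The "hundred such cases" exercise can be written out as a short calculation. This is a sketch under the very assumption being questioned here, namely that in each case the piece is equally likely to have come from any of the ten stores, so that p = 0.9. The helper `prob_at_least` is introduced only for illustration.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p), i.e. n independent cases."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

p_kosher = 9 / 10  # assumed: every store is an equally likely source
n_cases = 100

expected_kosher = n_cases * p_kosher                   # average number of kosher cases
majority_kosher = prob_at_least(51, n_cases, p_kosher)  # more than half are kosher
print(expected_kosher, majority_kosher)
```

The calculation only propagates the per-case assumption: everything it outputs follows from the choice of `p_kosher`, which is not itself measured anywhere.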
[Speaker F] But that assumption about the stores is not under the kind of influence that exists with judges’ opinions, where p can really change according to the…
[Rabbi Michael Abraham] I’m not saying it’s the same thing. I told you—that’s why I said it’s similar. But you can look at this case as a kind of hypothesis about p. From there on, I ask myself what will happen in three such cases, what will happen in two such cases, and I’ll build some kind of natural law. So I can do the calculation on the basis of p and reach, through statistical calculation, the law of nature: in what percentage of cases the piece will come out from the nine stores and in what percentage of cases it will come from the one store. But it all begins with some assumption about the basic p. Now it’s not exactly the same as judges, because with judges I ask the question about this case by virtue of the majority I reached. Here the basic p is the very case I am dealing with. The generalization won’t give me anything regarding this case. Fine, but that doesn’t matter. Still, the assumption about this case assumes a p of 0.9, and that p comes out of one’s head, from a mere assumption. Logical, again—very logical, I don’t dispute it—but it is an assumption. It is not the result of any empirical measurement, generalization, anything. Nothing to do with science. It’s just common sense. Let me formulate it this way in this context. In the context of most women giving birth at nine months, there is an advantage to the scientist. Because he knows how to do statistics, he knows how to check whether a sample is representative or not, and so on. In the case of the stores, the scientist has no added value over a layman. The same common sense that says this piece probably came from the majority of the stores—that is what the scientist will say and that is what the ordinary person will say. There is no difference there. Right? He has no advantage. You don’t need any scientific skill. Rather, just basic common-sense assumptions, which the scientist also relies on. But the scientist has in addition mathematical and scientific skills, etc. 
Here you don’t need that added value. It’s simply just common sense.
[Speaker D] Therefore it’s a majority present before us. Okay? Good, we’ll stop here. Any questions or comments? Rav Mikhi?
[Speaker H] What? I’m saying, Rav Mikhi. A question about the previous class—I don’t feel like typing. Okay. Suppose there are a hundred stores: fifty-one kosher and forty-nine non-kosher. It’s still a majority present before us. Now let’s go to childbirth. I did statistics on X people, and I saw that 51% give birth at nine months and 49% give birth at seven. Okay. When I now make the generalization, I may get it wrong, because every sample has error. Right? Obviously.
[Rabbi Michael Abraham] Every generalization has a certain margin of error, like election polls.
[Speaker H] Right, but if I do a poll and arrive at 51% giving birth at nine months, would you say that because it’s a majority present before us, I apply that also to the current case?
[Rabbi Michael Abraham] I don’t think so. I think that today, with the tools we have today—let’s not talk about the halakhic decisors or the Amoraim—with the tools we have today, if my margin of error includes fifty-fifty, I would not declare that a majority.
[Speaker D] I need it to be outside the margin of error. If it came out 55%, with a margin of error of plus or minus 4%, that’s a majority. But if it’s 52% plus or minus 4%, I would not treat it as a majority. Call it perhaps a doubtful majority. Maybe there is room to give some weight to a doubtful majority, but that’s the maximum. It’s a doubtful majority.
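The rule of thumb just stated can be sketched as follows. The classification labels and both function names are illustrative, and the margin-of-error formula is the standard normal approximation for a simple random sample at roughly 95% confidence, which is an assumption added here and not something stated in the lecture.

```python
from math import sqrt

def margin_of_error(share, n, z=1.96):
    """Approximate margin of error for a sample share (normal approximation)."""
    return z * sqrt(share * (1 - share) / n)

def classify_majority(share, margin):
    """Classify an estimated share given a symmetric margin of error."""
    low, high = share - margin, share + margin
    if low > 0.5:
        return "majority"           # the whole interval lies above one-half
    if high < 0.5:
        return "minority"           # the whole interval lies below one-half
    return "doubtful"               # the interval includes fifty-fifty

print(classify_majority(0.55, 0.04))
print(classify_majority(0.52, 0.04))
print(round(margin_of_error(0.55, 600), 3))
```

With these definitions, 55% plus or minus 4% counts as a majority and 52% plus or minus 4% is doubtful. Under the stated approximation, a margin of about 4% corresponds to a sample of roughly 600.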
[Speaker H] And that’s what…
[Rabbi Michael Abraham] Also in the stores example there is room for some hesitation. Because in the stores, let’s say the majority is 51 against 49, and the piece of meat separates—after all, I am also assuming that the probability of separating from each store is the same.
[Speaker H] I have no other information.
[Rabbi Michael Abraham] After all, it’s not exactly the same. I assume it’s more or less the same. Once the distribution of stores is 51 against 49, a small error in the distribution can upset the whole story.
[Speaker H] Then very easily you suddenly get to sixty and seventy.
[Rabbi Michael Abraham] Right. And therefore I say that there too I would not completely rule out not following a majority of 51 against 49, even though I assume that if you asked halakhic decisors, they would tell you that if you have no clear contrary indication, then 51 against 49 is also a majority. That is a Jewish law rule. But at the level of statistical significance, clearly there is a problem in following such a majority.
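The sensitivity of a narrow majority to the equal-probability assumption can be illustrated with a small calculation. This is a sketch only: the per-store weights stand for a hypothetical relative tendency of a store to be the source of a stray piece, and the 10% figure is invented for the example.

```python
def prob_from_kosher(n_kosher, n_other, weight_kosher=1.0, weight_other=1.0):
    """Chance that a found piece came from a kosher store, when each store's
    chance of being the source is proportional to its (hypothetical) weight."""
    k = n_kosher * weight_kosher
    o = n_other * weight_other
    return k / (k + o)

equal = prob_from_kosher(51, 49)                    # all stores equally likely
skewed = prob_from_kosher(51, 49, 1.0, 1.1)         # non-kosher stores 10% more likely
wide_skewed = prob_from_kosher(9, 1, 1.0, 1.1)      # same skew, nine stores against one
print(round(equal, 3), round(skewed, 3), round(wide_skewed, 3))
```

With 51 against 49, a 10% difference in the weights is enough to push the kosher share below one-half, while with nine against one the same difference barely matters.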
[Speaker D] Any other questions? Comments? Okay. Sabbath peace, goodbye. Sabbath peace.