Simplism in Simple Statistical Forecasts (Column 473)
I’ve just finished reading Jim Holt’s book, which deals with scientific, philosophical, and mathematical issues that have triggered intellectual revolutions and shifts in worldviews. I usually don’t read popular science literature, because it’s quite hard to write interesting, high-quality popular science. Such books often focus on anecdotes and gossip about thinkers and scientists, and on very superficial, top-down descriptions of ideas which—without professional understanding—confuse more than they help. They often aspire to present the philosophical implications of scientific insights, but in practice it usually comes out rather silly (scientists also do plenty of silly things and embrace populist “philosophies” when interpreting scientific results and pointing to their implications for our lives and thought). This book, in part, does those things too, but here and there it also enters the arguments themselves (at least at the popular level), and I really didn’t find gross errors. That, too, is quite rare.
One of the points that got me thinking is a simple argument that appears at the beginning of the fourth chapter (and a variant returns later on). As Holt writes, these arguments are rather astonishing because they are extremely minimalist while their conclusions are very far-reaching. This is exactly the sort of argument I’m fond of. It reminded me of another example.
A Way to Increase State Revenues
I once saw a lecture by Bibi, where he explained why raising taxes does not necessarily increase state revenues, and that lowering taxes can actually do so. He drew axes on the board, with the Y-axis marking the amount of money in the state’s coffers (revenues only), and the X-axis marking tax rates. Now, suppose you set a 0% tax—what are the revenues? Of course, 0. Suppose you set a 100% tax—again, revenues are 0 (no one will work if there’s no profit in it). Thus, the graph of state revenues as a function of the tax rate should look roughly like this: somewhere between the origin and the point (1,0) there must be some maximum (assuming state revenues can’t be negative), and according to him, world experience shows it’s located around 30%.
What does that mean? If the current tax rate is about 50%, then the way to increase state revenues is to lower the tax rate. And if the current tax rate is about 20%, then the way to increase revenues is to raise the tax rate. This is, of course, a simplistic and imprecise argument, but I quite like it because very minimal, simple assumptions lead to an interesting result (perhaps somewhat counter-intuitive).
Back to our topic.
Estimating the Lifespan of Phenomena
At the beginning of chapter four Holt presents an argument first formulated by Princeton astrophysicist J. Richard Gott in a 1993 Nature paper. He assumes the Copernican principle: we are probably not special. From this it follows that if we are acquainted with some phenomenon, then we are likely neither among the first to experience it nor among the last. That is, this moment is neither among the earliest moments of its existence nor among its latest. Note the conclusion that emerges from this simple assumption.
Suppose there’s a Broadway show that has already been performed n times. I, watching it now, am probably not among the first 2.5% who saw it, nor among the last 2.5% (this is true for 95% of its viewers). I can therefore say, with 95% confidence, that I am somewhere within the middle 95%. Hence one may state with 95% confidence that the show will continue for at least another n/39 performances (otherwise I’d be in the last 2.5%), and for no more than another 39n performances (otherwise I’d be in the first 2.5%).
For example, if the show has had 100 performances so far, it will continue at least another two or three, and no more than about 3,900.
Likewise for humanity (from here on, for simplicity, I’ll use 40 instead of 39). Suppose that in its present form it has existed for about 10,000 years. Then, with 95% confidence, it will continue to exist for at least another 250 years and no more than another 400,000 years. He estimates, in this way, the lifespan of the Internet, the presence of numbers in our lives, and more. Of course, the older a phenomenon is, the longer its expected remaining lifespan. He even provides several examples that corroborate this result.
This really is a powerful argument. With minimal, reasonable assumptions and a single simple consideration, we arrive at dramatic and surprising results with implications across many fields.
Objections
A first problem with this formula is that we arbitrarily chose 95% as the yardstick of normality and 5% as the yardstick of specialness. We could just as well have chosen 99% versus 1%, and then the results would need to be divided and multiplied by a correspondingly larger factor. But that’s not really an objection, since the same question can have different estimates at different confidence levels. One may say, with 95% confidence, that our lifespan will be between X and Y, and with higher confidence that it will lie in a wider interval. Note that the confidence concerns the entire interval, not just its upper or lower bound. Otherwise we’d end up with 95% confidence that we’ll live between 250 and 400,000 years, and 99% confidence that we’ll live between 50 and two million years. Naturally, a more precise estimate gives us less confidence. The assertion that we are not in the special 1% is simpler and safer than the assertion that we are not in the special 5%, so it is no wonder that it yields a weaker result.
A second issue: as time goes on, the estimates themselves lengthen. If we wait another year and humanity is still on stage, then—surprise—the lifespan estimates will lengthen too. Yet this isn’t really a problem. It’s a dynamic estimate that constantly changes; as we have more information, our estimates can be updated. If we tossed a die without knowing the result, the chance to get a five is one-sixth; if we later learn the outcome is odd, the chance becomes one-third. The longer life goes on, the more information we have, and thus it’s reasonable to update our statistical estimates.
The real problem here is that I already know today, with high probability, that humanity will still be here in another thousand years; that is, I already possess the information that it will exist then. Why not use it now and update my estimate? You can see that this leads to estimates that diverge toward infinity. If we look at the other side of the time axis and ask what our ancestors’ estimates would have been worth (assuming they made them), we get rather poor results. 9,000 years ago humanity had existed for only a thousand years, so the estimate would have been between 25 and 40,000 years. Fine, that’s still reasonable. But what about 9,900 years ago? Back then humanity was only a hundred years old, so the estimate should have been between two and a half and four thousand years. Yet we already know today that this estimate is wrong. If so, why should we trust the estimates we construct today, if we already know that our descendants will toss them in the trash?
It can also be argued that a phenomenon like humanity cannot be clearly delimited. This isn’t a technical quibble but a problem of vagueness in the definition: at which point in the evolutionary process do I define the creature as “human”? If we define the creature that existed a million years ago as human, then the estimated remaining lifespan of the human species increases dramatically. At the same 95% confidence, we would now expect between 25,000 and 40,000,000 more years. Does this contradict the earlier estimate? Not necessarily. It yields an optimistic estimate, while the previous one is more pessimistic. But notice: if we take the minimal estimate from the earlier calculation, namely 250 years, that implies we are very special hominids from the standpoint of the optimistic estimate, which breaks the Copernican principle as assumed in that second estimate. Thus there is, in fact, a contradiction between the two estimates.
A similar problem arises regarding the end point. It’s not easy to define when humanity is considered extinct. After significant evolutionary changes, would we still be “human”? Are we ourselves truly the continuation of the caveman, or is “scientific man” already a new creature? If you like, there are names for this: Generation X, Y, Z, and so on.
At root lies this question: every entity is special in some respects and not in others. The question is whether the respect in which you apply the Copernican principle is indeed not a special one. Perhaps we mistakenly applied it over our particular axis, rendering the estimate not worth much. Hominids may not be very special, but humans could be more special. If you dig through all relevant data about any person, you’ll always find something special: for example, the gematria (numerical value) of his name equals exactly his age; or he lives exactly 10 km from his mother. The probability of such things is slim, but it clearly happens for some. Note that being exactly in the middle is also very special. I could propose, with the same confidence interval, that I am neither in the middle, nor in one-third nor two-thirds of the human period. Each such placement is very special, hence unlikely that I live in it. But if so, by definition I always live in a very special region—and thus the Copernican principle kills itself.
All of this hides an assumption that the process by which such phenomena go extinct is random and uniformly distributed over their lifetime. That’s a very strong assumption, and I doubt it’s generally correct. Humanity today might be easier to wipe out than five hundred years ago, because we possess weapons that could end the entire story in seconds. On the other hand, humanity is bigger and more distributed, hence harder to wipe out. Suppose we’re on the brink of nuclear war, with a 50% chance it breaks out. Would it be right to estimate our expected survival under such circumstances using Gott’s formula? Not really. What’s the chance that we are on the brink of such a war? What’s the chance it will break out? Is the distribution of such events uniform? We have no way to know. In the absence of other information one might assume a uniform distribution, but I wouldn’t base anything on that.
Judgment Day Draws Near
On p. 340 Holt describes what is called the “Doomsday Argument.” I don’t know why astrophysicists formulate such arguments, but facts are facts: this one too was raised by an astrophysicist—Brandon Carter of Australia—at a meeting of the Royal Society in London in 1983.
What are the chances that the people who come up with all these arguments are astrophysicists? I’m beginning to think that perhaps astrophysics is on the verge of extinction.
The argument goes roughly like this. Let’s assume an optimistic future for humanity: it will survive many more generations. The Earth’s population will stabilize at a reasonable number of about fifteen billion; then, as growth continues, we will settle other stars in our galaxy and manage to expand the food supply to match the needs of all humanity. Let’s say that humanity grows by a billion people every decade (a reasonable estimate). By the time the sun dies out, humanity across all generations will number about 10^15 people in total, of whom only about fifty billion have lived up to our time.
If that’s the case, then we belong to roughly 0.00001 percent of all humans. Wow, admit it, that’s very special. But according to the Copernican principle, that is unlikely. In contrast, if humanity will go extinct soon, then it’s very plausible that the likeliest moment for us to be alive is precisely now: it’s very plausible that we’re living in the last generation, because it is the largest generation.
Holt is very impressed by this argument, but for some reason he apparently did not notice that it contradicts the previous one head-on. There, from the fact that we are not special (the Copernican principle), the conclusion was that our existence will continue for up to about 40 times our current age; here the same principle leads to the conclusion that it will end within a few generations. The same mode of reasoning leads to completely different, and in many ways opposite, conclusions. It seems there’s something rotten in the state of Denmark… How can that be?
Numbers of People and the Time Axis: The Wonders of Exponential Processes
In the past I mentioned the uniqueness of exponential processes (in particular regarding COVID spread). Here’s a nice illustration. Think of a regular sheet of paper. You fold it in half, then again in half, and so on, forty times. What will the thickness be? For simplicity, assume the sheet’s original thickness is 1 mm. Each fold doubles the thickness: after one fold, 2 mm; after the next, 4 mm; then 8, 16, and so on. After 40 folds, the total thickness is 2^40 mm. Converting to kilometers (divide by a million, which is roughly 2^20), that’s about 2^20 km; and 2^10 is roughly a thousand, so here we have roughly a million kilometers, almost three times the distance between Earth and the Moon (!!). All from forty folds of a 1-mm sheet.
What’s the point? Humanity grows in an exponential process, doubling itself every few generations (of course there are disasters and extinctions, and growth rates vary by place and time; I’m just illustrating a theoretical point). In such a process, at each generational stage there are as many individuals as the total number who existed up to that point. Today there are about ten billion people on the globe, which is on the order of the total number of people who have lived until now (I’ve read estimates of about forty billion). If so, the last generation won’t be so special, since the number of people in it is comparable to all people who existed throughout history. This allows us to base the estimate on numbers of people rather than the time axis. Of course one can translate it back to the time axis, and thereby resolve the apparent contradiction between the two calculations above.
(There’s a well-known story about the inventor of chess. The Persian shah offered him any reward he asked for in gratitude for inventing the game, and he asked for a chessboard—8×8 squares—with one grain of wheat on the first square, two on the second, four on the third, and so on across the 64 squares. There wasn’t enough grain in the entire kingdom to pay his request. For illustrated explanations of this story and exponential processes, see: link.)
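The size of the inventor's request can be checked directly. The following is a minimal Python sketch of the doubling rule described in the story; the function name `grains_on_board` is hypothetical and introduced only for illustration.

```python
def grains_on_board(squares: int = 64) -> int:
    """Total grains when square k (counting from 0) holds 2**k grains."""
    return sum(2 ** k for k in range(squares))


if __name__ == "__main__":
    total = grains_on_board()
    # The geometric sum 1 + 2 + 4 + ... over 64 squares equals 2**64 - 1.
    print(total, total == 2 ** 64 - 1)
```

The total is one grain short of 2^64, which is why a request that sounds modest on the first few squares cannot be paid.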
Back to Simplistic Considerations
Here’s another example of the claim above that it’s hard to apply the Copernican principle, because every person is special along some axes and not along others. In our case, I can be very special on the time axis (e.g., living in the last generation), yet not special on the headcount axis (since my generation contains about a quarter of all the people who have ever lived).
Let me stress: I myself don’t accept Holt’s approach. But all this is said just to ask whether the claim that Gott’s approach is correct is itself correct. I think the explanations I presented above apply here as well. For example, the assumption that I am “not special” implies that my soul is drawn at random from some pool and tossed into the world at some stage, with the lottery being uniform (each stage equally likely). I see no real basis for that assumption. In such a context one could perhaps speak of plausibility (in my opinion, even that is dubious, since we have no information about the process), certainly not of probability. And besides: if I myself had been “drawn” in the 12th century BCE, would I still have been me? In what sense? It would be a different person altogether—born at a different time and environment. By that definition, the chance that I would be born exactly now is 1. Asking “what would Maimonides have said if he lived today?” is like asking whether he would still be Maimonides—it’s undefined. He would not have been Maimonides, but someone else.
Incidentally, Holt notes (p. 341) that Brandon Carter coined the term “anthropic principle” about a decade earlier (in the 1970s). To my surprise, the doomsday argument bears many similarities to those ideas. Despite the charm of their simplicity (I discussed this in my first book, God Plays Dice, and also in the third conversation in my book The First Being), I still recommend a healthy dose of skepticism toward elegant, simple arguments. Sometimes a person hits upon a simple and correct insight—evolutionary theory is like that; as the philosopher Malcolm put it, it’s an “eye-opening tautology.” But as a rule, far-reaching conclusions (alas) require heavier argumentative work; “according to the pain is the reward.” Grand conclusions (that burst with profligacy) generally demand more thought than minimalist arguments. So before you follow Holt and run with an argument like this, it’s definitely worth giving it another check.
Homework for readers: Try to raise objections to Bibi’s argument presented at the beginning of the column.
With God’s help
Oversimplification in Simple Statistical Forecasts
I am just now finishing Jim Holt’s book, When Einstein Walked with Gödel, which deals with scientific, philosophical, and mathematical questions that bring about intellectual revolutions and changes in worldviews. I generally do not read popular science, because it is quite difficult to write popular science that is both interesting and of a high standard. Usually such books focus on anecdotes and gossip about thinkers and scientists, and on very superficial top-down descriptions of ideas that, without understanding them professionally, are more confusing than helpful. They pretend to present the conceptual and philosophical implications of scientific insights, and this usually comes out rather silly (scientists too often do quite a lot of populist nonsense when they come to interpret scientific results and point to their implications for our lives and our thinking). This book in part does the same things as well, but here and there it also enters into the arguments themselves (at least at the popular level), and I truly did not find in it any gross errors. That too is fairly rare.
One of the points that set me thinking is a simple argument that appears at the beginning of the book’s fourth chapter, and another version of it recurs later on (p. 340). As Holt writes, these arguments are quite amazing because they are very minimalist and their result is very far-reaching. This is the sort of argument I am fond of. Here is an example that came to mind just now.
How to Increase State Revenue
I once saw a lecture by Bibi in which he explained why raising taxes does not necessarily increase state revenue, and why lowering taxes can actually do so. He sketched on the board a coordinate system in which the Y-axis measures the amount of money in the state treasury (revenue alone), and the X-axis the tax rate.
Suppose a 0% tax is imposed; what will the revenues be? 0, of course. Suppose now a 100% tax is imposed; what will the revenues be now? Again 0, of course (because nobody will go out to work if there is no gain in it for him). Hence the graph of state revenue as a function of the tax rate must look roughly like this:
Some maximum must lie between the origin and the point (1,0) (assuming state revenue cannot be negative), and according to him experience around the world shows that it is located roughly at 0.3. What does this mean? That if the current tax rate is 0.5, then the way to increase state revenue is precisely to lower the tax rate. And if the current tax rate is 0.2, then the way to increase revenue is to raise the tax rate. This is of course a simplistic and imprecise argument, but I like it very much because a minimum of highly understandable and simple assumptions leads to an interesting result (and perhaps one that somewhat runs against initial intuition).
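The shape of the argument can be illustrated with a toy curve. The sketch below is only an assumption-laden illustration: the functional form `rate * (1 - rate) ** K` is hypothetical (it was not part of the lecture) and was chosen simply because it is zero at both ends and can be made to peak at the quoted value of 0.3.

```python
PEAK = 0.3          # location of the maximum quoted in the lecture
K = 1 / PEAK - 1    # exponent that places the maximum of t * (1 - t)**K at PEAK


def revenue(rate: float) -> float:
    """Hypothetical state revenue (arbitrary units) at a tax rate in [0, 1]."""
    return rate * (1 - rate) ** K


if __name__ == "__main__":
    for rate in (0.0, 0.2, 0.3, 0.5, 1.0):
        print(f"tax rate {rate:.1f} -> revenue {revenue(rate):.4f}")
```

On any curve of this kind, moving from 0.5 toward 0.3 raises revenue, and so does moving from 0.2 toward 0.3. How large the effect is depends entirely on the assumed shape, which the argument itself does not supply.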
Now back to our subject.
Estimating the Lifespan of Phenomena
At the beginning of the fourth chapter of the book (p. 57), Holt presents an argument first formulated by the Princeton astrophysicist J. Richard Gott, in a 1993 article in Nature. He assumes the Copernican principle, according to which we are probably not special.[1] From this it follows that if we are acquainted with some phenomenon, then we are probably neither among the first nor among the last to encounter it. That is, this moment is neither among the first moments of its existence nor among the last. Notice the conclusion that follows from this simple assumption.
Suppose there is a Broadway show that has already been performed n times. I, who am watching it now, am not among the first 2.5% who saw it, nor among the last 2.5% who will see it (after all, this is true for 95% of its viewers). I may therefore say with good probability that I am located somewhere within the middle 95%. Hence one can determine, with a 95% confidence interval, that this show will continue for not less than another n/39 (otherwise I am special, since I belong to the last 2.5% who watched it) and not more than another 39n (otherwise I am special, because I belong to the first 2.5% who watched it).[2] Thus, for example, if the show has run 100 times so far, it will continue for at least another two or three performances, and for no more than another 3,900 performances.
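The factor 39 can be recovered in one step. The following is a sketch of the derivation. The symbols N (total number of performances), f (fraction of the run already elapsed), c (confidence level), and t_future (remaining performances) are notation introduced here for illustration, and f is assumed to be uniformly distributed between 0 and 1.

```latex
% n = performances so far, N = total performances, f = n / N
\[
  t_{\text{future}} = N - n = n \, \frac{1 - f}{f},
  \qquad
  \frac{1 - c}{2} \le f \le \frac{1 + c}{2}
  \;\Longrightarrow\;
  n \, \frac{1 - c}{1 + c} \;\le\; t_{\text{future}} \;\le\; n \, \frac{1 + c}{1 - c}.
\]
\[
  c = 0.95: \qquad \frac{n}{39} \;\le\; t_{\text{future}} \;\le\; 39\,n .
\]
```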
So too one can estimate the lifespan of humanity (from here on, for simplicity, I will use 40 instead of 39). Suppose that in its present form it has existed for about 10,000 years; then it will continue to exist for at least another 250 years, and at most 400,000 years. He estimates in this way the lifespan of the Internet (at least another year and no more than fifteen hundred years), the presence of numbers in our lives, and more and more. Of course, the older the phenomenon, the greater the life expectancy assigned to it. He even brings there several examples that confirm this result.
This really is a powerful argument. With a minimum of reasonable assumptions and one simple consideration, we arrive at many dramatic and highly surprising results, with implications in many fields.
Objections
A first problem that arises with respect to this formula is that we arbitrarily chose the number 95% as the measure of normality, and 5% as the measure of specialness. We could just as well have chosen 99% as against 1%, in which case the results would have to be divided and multiplied by about 200 (99.5/0.5 = 199) rather than by about 40. But this is not really an objection, since the confidence level of the estimate would then be different. Indeed, the same question can have different estimates at different confidence levels. One can say with 95% confidence that our lifespan will be between X and Y, and at a higher confidence level that the interval will be longer. Notice that the confidence concerns the entire interval, not its upper or lower bound. Otherwise we would end up with 95% confidence that we will live between 250 and 400,000 years, and with 99% confidence that we will live between 50 and two million years. Naturally, a more precise estimate gives us less confidence. The estimate that we do not belong to the exceptional 1% is simpler and more secure than the assumption that we are not among the exceptional 5%. No wonder the result it yields is weaker.
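The dependence on the chosen confidence level can be made explicit. Below is a minimal Python sketch, assuming (as the argument does) that the moment of observation is uniformly distributed over the phenomenon's lifetime and that equal tails are cut off on both sides; the function name `gott_interval` and its parameters are hypothetical.

```python
from typing import Tuple


def gott_interval(age: float, confidence: float = 0.95) -> Tuple[float, float]:
    """Return (lower, upper) bounds on the remaining duration.

    Assumes a uniform distribution of the observation time over the total
    lifetime, with equal tails excluded at the beginning and at the end.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    factor = (1 + confidence) / (1 - confidence)
    return age / factor, age * factor


if __name__ == "__main__":
    for level in (0.95, 0.99):
        low, high = gott_interval(10_000, level)
        print(f"{level:.0%}: between {low:,.0f} and {high:,.0f} more years")
```

Under these assumptions the multiplier is 39 at 95% confidence and 199 at 99%, so raising the confidence level widens the interval in both directions.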
A second problem that arises here is that as our lifespan lengthens, the estimates themselves lengthen. If we wait another 10,000 years and, surprisingly, humanity is still on the stage, then the estimates of its duration will grow enormously. This is a dynamic estimate that changes all the time. But this too is not really a problem, since whenever we have more information our estimates can always change. If we throw a die and do not know the result, the probability of getting 5 is one sixth. But if we know that the outcome is odd, then the probability rises to one third. As our lifespan continues, we have more information, and therefore it is possible and reasonable to update our statistical estimates.
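The die example can be spelled out by simple counting. This is a small Python sketch for a fair six-sided die; the helper `probability` is a hypothetical name used only here.

```python
from fractions import Fraction

FACES = range(1, 7)


def probability(event, given=lambda face: True) -> Fraction:
    """P(event | given) for a single throw of a fair six-sided die."""
    possible = [face for face in FACES if given(face)]
    favourable = [face for face in possible if event(face)]
    return Fraction(len(favourable), len(possible))


if __name__ == "__main__":
    print(probability(lambda face: face == 5))
    print(probability(lambda face: face == 5, given=lambda face: face % 2 == 1))
```

Learning that the outcome is odd removes three of the six faces, which is all that the update from one sixth to one third amounts to.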
The problem that does arise in this connection is that already today I know that, with high probability, humanity will still be here in another 1,000 years; that is, even today I possess the information that it will then still be among us. If so, why should I not use that information already now and update my estimate? You understand that this is a process of estimates that diverge more and more toward infinity. If we look toward the other side of the timeline, and ask ourselves what value our ancestors’ estimates would have had (assuming they made them), we would reach a not particularly impressive result. 9,000 years ago humanity had existed only a thousand years. Hence the estimates then would have been that the lifespan of humanity lay between 25 years and 40,000. Well, that is still reasonable. And what about 9,900 years ago? Then humanity had existed only a hundred years, and therefore the estimates should have been between two and a half years and 4,000. But that estimate is already known to us today to be false. So why should we trust the estimates we construct today, if already today we know that our descendants will throw them into the trash?
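The retrospective check can be carried out mechanically. The sketch below uses the rounded factor of 40 and the assumed present age of 10,000 years from the text; the function names are hypothetical.

```python
FACTOR = 40           # rounded multiplier used in the text (instead of 39)
AGE_TODAY = 10_000    # assumed present age of humanity in years, as in the text


def estimate_at(age: float) -> tuple:
    """Interval for the remaining lifespan, as it would be estimated at `age`."""
    return age / FACTOR, age * FACTOR


def already_refuted(age: float) -> bool:
    """True if humanity has already outlived the upper bound set at `age`."""
    _, upper = estimate_at(age)
    survived_since = AGE_TODAY - age
    return survived_since > upper


if __name__ == "__main__":
    for age in (100, 1_000):
        print(age, estimate_at(age), already_refuted(age))
```

The estimate made at age 100 has an upper bound of 4,000 years, which the 9,900 years that followed have already exceeded. The estimate made at age 1,000 has not yet been contradicted.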
It can also be argued that a phenomenon like humanity cannot be delimited clearly. This is not a technical question but a problem of vagueness in the definition. From what stage in the evolutionary process do I define the creature as human? If we define the being that existed a million years ago as human, then the estimate of the lifespan of the human species grows enormously (though now within that same confidence interval). At the same confidence level, we would now expect between 25,000 and 40,000,000 years. Does that contradict the previous estimate? Not necessarily. It gives us an optimistic estimate, while the previous one is more pessimistic. Of the two, we should take the minimal estimate. But notice that if we take the minimal estimate from above, namely 250 years, this means that we are very special hominids from the standpoint of the more optimistic estimate. That breaks the Copernican principle as we assumed it in the second estimate. So there really is a contradiction between these two estimates.
Incidentally, a similar problem arises with respect to the endpoint. It is not easy to define when things count as humanity having become extinct. If we undergo significant evolutionary changes, will we still be human beings? Are we ourselves really a continuation of prehistoric man? And perhaps scientific man is already a new creature? If you like, it already has names: Generation X, Y, and Z, and so on.
What lies at the root of the problem is that every creature is special in some respects and not special in others. The question is whether the respect with regard to which you apply the Copernican principle is really a respect in which we are not special, or whether we have mistakenly applied it to a respect in which we are special, in which case the estimate is not worth much. We may be hominids that are not very special, but human beings are more special. If you rummage through all the relevant data about some person, you will always find that he is special in some respect. For example, the numerical value of his name is exactly equal to his age. What are the odds of that happening? Slim. But clearly it will happen for some number of people. He lives exactly 10 kilometers from his mother. That too is very special, but it probably happens from time to time.
Notice that being exactly in the middle is also very special. I could have proposed an estimate, with the same confidence interval, that I do not live in the middle of the human period, nor at one-third or two-thirds, or at 78% of the period. Each of these is very special, and therefore it is unlikely that I live precisely there. But if so, then by definition I always live in a very special region of humanity, and thus the Copernican principle kills itself. The matter is similar to the question of what the probability is that a die thrown a hundred times will land on some specific sequence of outcomes. The probability is of course negligible, but that is true of every sequence of outcomes, and therefore we know in advance that the result obtained in the end will certainly be special. The Copernican assumption that the die’s outcome will not be special is false. On the contrary, it will always be very special; there are simply many outcomes, all of them special.
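The numbers behind the die comparison are easy to compute exactly. This is a minimal Python sketch for a fair six-sided die; the variable names are chosen here for illustration.

```python
from fractions import Fraction

THROWS = 100

# Probability of one specific sequence of 100 throws, named in advance.
p_specific = Fraction(1, 6) ** THROWS

# Number of distinct sequences; every one of them has that same probability.
n_sequences = 6 ** THROWS

# Taken together, the sequences exhaust all possibilities.
total = p_specific * n_sequences

if __name__ == "__main__":
    print(f"distinct sequences: about 10^{len(str(n_sequences)) - 1}")
    print(f"probability that some sequence occurs: {total}")
```

Each individual sequence is fantastically improbable, yet one of them is certain to occur, which is the point of the comparison.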
Beyond all this, hidden in this argument is the assumption that the process by which these phenomena become extinct is random, and uniformly distributed over the span of their existence. This is a very strong assumption, and I do not think it is generally correct. Today humanity can become extinct more easily than it could five hundred years ago, because we have weapons that can bring this whole sad story to an end in seconds. On the other hand, humanity today is larger and more dispersed, so it is harder to wipe it out. Suppose we are on the brink of nuclear war, and there is a 50% chance that it will break out. Would it be correct to estimate our expected duration in such a situation by Gott’s formula?! Not really. What is the probability that we are on the brink of such a war? What is the probability that it will break out? Is the distribution of such events uniform? We have no possibility of assuming anything about it, and therefore considerations that assume a uniform distribution are nonsense. I assume that in the absence of other information I too might assume a uniform distribution, but I would not build anything on it.
The Doomsday Battle
On page 340 Holt describes what is called the doomsday argument. I do not know why astrophysicists tend to formulate such arguments, but the fact is that this argument too was first raised at a meeting of the Royal Society in London in 1983 by an astrophysicist from Australia named Brandon Carter. What are the chances that the people who came up with all these arguments are astrophysicists? I am beginning to think that perhaps astrophysics is on the verge of extinction.
The argument goes roughly like this. Suppose the human species has an optimistic future. It will survive for many more generations. The size of humanity on planet Earth will stabilize at a reasonable quantity of about fifteen billion, and then, as growth continues, we will begin to settle other stars in our galaxy, and will succeed in increasing the amount of food in accordance with the needs of all humanity. Suppose humanity grows by a billion people every decade (a reasonable estimate). Until the sun burns out and the capacity of our galaxy to support annoying people like us comes to an end, humanity over all its generations will amount in total to about 10^15 human beings. Of these, only about fifty billion have lived until our time. If that is the case, then we belong to about 0.00001 percent of all humanity. Wow—admit it, we are very special. But as is known, according to the Copernican principle that is not likely. By contrast, if humanity will soon disappear from the face of the earth, then it is quite likely that our most probable moment is precisely now. It is very likely that we are living in the last generation, because it is the largest generation.
Holt is very impressed by this argument, but for some reason he apparently did not notice that this argument flatly contradicts his previous argument described above. This argument yields the conclusion that our not being special (the Copernican principle) leads precisely to the conclusion that our existence will end soon, not in another forty times the period we have already existed—somewhat different from the estimate obtained there. But beyond the contradiction in the doctrine of Holt, long may he live, there really is a problematic situation here. Seemingly, by means of the same form of reasoning we arrive at completely different, and to a large extent opposite, conclusions. How can that be? Something is rotten in the state of Denmark…
Number of People and the Time Axis: The Wonders of Exponential Processes
I have mentioned here in the past (especially with respect to models of the spread of COVID) the distinctive character of exponential processes, which can greatly surprise someone unaccustomed to them. A nice demonstration of this is folding a sheet of paper. Take a sheet of paper as large as you like, with the thickness of an ordinary sheet. Fold it in half, and then in half again, and so on 40 times. What will be the thickness of the folded sheet? If you do not know, hold on tight. For simplicity, let us assume that the original sheet is 1 mm thick. Each fold doubles the total thickness. After one fold we have 2 mm, after another fold 4 mm, then 8 mm, 16 mm, and so on. After 40 folds, the total thickness of the folded sheet is 2^40 millimeters. To illustrate what that means, let us convert it to kilometers: a kilometer is a million millimeters, and a million is roughly 2^20, so dividing leaves about 2^20 km. And again, to make this vivid, recall that 2^10 is about a thousand (10^3), so 2^20 is about a million. Hence the folded sheet is about a million kilometers thick, which turns out to be almost three times the distance between the earth and the moon (!!). Notice: this entire monster came from just 40 folds of a sheet a millimeter thick.[3]
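The folding arithmetic can be checked directly. The following is only a minimal sketch under the column's simplifying assumption of a 1 mm sheet; the average Earth-Moon distance of 384,400 km is an assumed round figure that does not appear in the column, and the function name is purely illustrative.

```python
# Paper-folding arithmetic, assuming (as in the text) a sheet 1 mm thick.
SHEET_THICKNESS_MM = 1      # simplifying assumption taken from the column
FOLDS = 40
MM_PER_KM = 1_000_000       # 1 km = 10^6 mm, which is close to 2^20
EARTH_MOON_KM = 384_400     # assumed average Earth-Moon distance (not from the column)


def folded_thickness_km(folds, sheet_mm=SHEET_THICKNESS_MM):
    """Return the thickness in kilometers after `folds` doublings."""
    return sheet_mm * 2 ** folds / MM_PER_KM


thickness_km = folded_thickness_km(FOLDS)
ratio_to_moon = thickness_km / EARTH_MOON_KM

print(f"After {FOLDS} folds: {thickness_km:,.0f} km")
print(f"That is about {ratio_to_moon:.1f} times the Earth-Moon distance")
```

The exact value comes out slightly above the round "million kilometers" because 2^20 is a little more than a million.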
What is the relevance of all this? Humanity grows in an exponential process, and therefore doubles itself every few generations (there are of course also disasters and extinctions, and the growth rate varies from place to place and generation to generation, but here I am only illustrating a theoretical point). In such a process, at each stage (a generation, or several generations) there exist as many individuals as the total number of individuals that had existed until that time. Today there are about ten billion people on the globe, and that is on the same order of magnitude as the total number of people who have lived until our day (I read that the estimate is that about forty billion people have lived up to now). If so, the last generation would not be so special, because the number of people in it is similar to the total number of people who have ever existed throughout history. This allows us to base an estimate on the number of people rather than on the time axis, as we did above, and then one can also translate it onto the time axis. That, of course, is the meaning of the contradiction between the two calculations.
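The claim that in a doubling process each stage holds about as many individuals as all earlier stages combined can be illustrated with a toy model. This is a sketch under an idealized assumption (each generation exactly twice the previous one, starting from a single individual); the function names are hypothetical and the numbers are not demographic data.

```python
# Toy model: generation k contains 2**k individuals (idealized doubling, not real data).
def generation_sizes(n_generations):
    """Sizes of successive generations under exact doubling."""
    return [2 ** k for k in range(n_generations)]


def share_of_latest(sizes):
    """Fraction of everyone who ever existed who belongs to the latest generation."""
    return sizes[-1] / sum(sizes)


sizes = generation_sizes(10)
latest = sizes[-1]                  # size of the latest generation
previous_total = sum(sizes[:-1])    # everyone who existed before it

print(f"latest generation: {latest}, all previous generations: {previous_total}")
print(f"share of the latest generation: {share_of_latest(sizes):.3f}")
```

In this idealized model the latest generation always exceeds the sum of all its predecessors by exactly one, so its share of everyone who ever existed stays near one half no matter how many generations have passed. Real populations do not double this cleanly, so the model shows only the order of magnitude.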
Back to Simplistic Considerations
One can see here another example of my claim above that it is hard to speak of the Copernican principle, because every person is special on certain axes and not on others. And in our case, I may be very special on the time axis (living in the last generation) but not at all special in terms of the number of people (because in my generation there lives about a quarter of all humanity).
But all of this is said only on Holt’s own terms. I explained above why I myself do not accept the first argument. We may now ask whether this claim, unlike Gott’s, is correct. I think it is not either, since most of the explanations I gave above apply here as well. Thus, for example, the assumption that I am not special means that my soul is randomly drawn from some reservoir and cast into the world at some stage. Another assumption is that the distribution in these drawings is uniform (= the probability of each person emerging at each stage is the same). I see no real basis for this assumption. In such a context one might perhaps speak of plausibility (in my opinion, not even that, since we have no information at all about the process), but certainly not of probability. And in general, if I myself had been drawn in the twelfth century BCE, would I have been I? In what sense? After all, that would be a completely different person. It may be that I am defined as one who was born precisely now, in my place and time and surroundings. On that basis, the probability that I would be born here is 1, by definition. This is like asking what Maimonides would say about something if he were alive today. The answer is that even if, for some reason, that question is important to someone, it is not well-defined. If he were alive today he probably would not be Maimonides but someone else.
Incidentally, Holt recounts (on p. 341) that Brandon Carter was the one who first formulated the anthropic principle about a decade earlier (in the 1970s). I was not especially surprised to hear this, since that argument too suffers from similar problems, despite its simplicity and initial charm (I discussed this in my book God Plays with Dice and in the third conversation of my book The First Existent).
The simplicity of an argument casts a spell over us, and rightly so. Occam’s razor is based on that. Therefore, from time to time we encounter very simple considerations that are nonetheless elegant and impressive, and the more far-reaching and wide-ranging the conclusions that arise from them, the greater their charm, of course. Holt describes Carter’s argument as follows (p. 341):
Even in comparison with the usual structure of transcendental a priori arguments, this argument is rather breathtaking. In its economy of initial assumptions, as against the conclusion bursting with extravagance, it is a worthy rival to the argument of Saint Anselm of the eleventh century…
Once he gets his breath back, I would nevertheless suggest that he pause for a moment and check this crushing argument again. To me it does not sound especially impressive. Incidentally, on p. 58 you will find very similar expressions of admiration for that earlier argument (though more laconic: ‘quite amazing’), even though, as noted, its conclusions contradict those of this one. The gates of mathematical and scientific hair-splitting have not been locked.
And more generally, I would not dismiss such elegant and simplistic arguments out of hand, because sometimes a person does hit upon a simple and correct insight (the theory of evolution, for example, is such a case; in the words of the philosopher Malcolm, it is an ‘illuminating tautology’). But I still strongly recommend a healthy measure of suspicion toward such arguments, because usually, sadly, the rule is ‘the reward is in proportion to the effort’ (Avot 5:23). Far-reaching conclusions (bursting with extravagance) generally require broader intellectual and argumentative work. When you encounter such an argument, it is certainly worth subjecting it to a further examination before you fall under its spell and make use of it.[4]
Footnotes
[1] I assume this is called the Copernican principle because Copernicus taught us that we revolve around the sun and not it around us. We are not special.
[2] 2.5% is 1/40.
[3] A better-known story, of course, is about the inventor of chess. The Persian shah offered him a prize of his choosing in appreciation for inventing the game, and he asked for a chessboard (8×8 squares) on whose first square there would be one grain of wheat, on the second two, on the third four, and so on over all the squares of the board (64 squares). There were not enough grains of wheat in all Persia to fill his request. See here for illustrated explanations of this story and of exponential processes.
[4] Homework for the readers: try to raise objections to Bibi’s argument presented at the beginning of the column.
Discussion
In the context of Bibi’s argument, the argument assumes there is a single maximum, whereas it is certainly possible (and even likely) that there are several maxima, and therefore also at least one minimum. Practically speaking, the argument is not very useful; what it says is that there is some optimal tax rate (from the standpoint of state revenues), which is a fairly trivial claim. The important question is what that optimal tax rate is, which of course can vary from one economy to another and with the macroeconomic situation.
In short, the less information the model contains (correct assumptions about reality), the less useful it is.
That is the weakest criticism. It’s not even entirely correct, because most likely there is only one maximum, and in any case it at least proves that an increase in taxes does not necessarily increase revenues. That is the main claim.
I also really don’t agree that less information is less helpful. Here too there is a more complex process that has its own optimum.
I haven’t yet looked into it, but one remark caught my eye. You wrote that in your view, when we have no information at all about the distribution process, one cannot even speak of probability. Apropos what you mentioned at the end, about parallels to discussions of God and creationism: regarding the proof from the uniqueness of the system of laws, I thought you did argue that one can claim uniqueness without any information at all about the distribution process. What’s the difference?
If the assumption is that we are not special, then it makes no difference at all whether what is happening to us is happening for the first time or the last time, with a probability of 50% or a probability of 1 in a trillion, according to statistical rules or contrary to them. None of these changes anything. After all, we are not special.
Therefore this whole discussion is unnecessary.
When the process is entirely unknown to us, but there is some process there, there is no point in assuming a uniform distribution. As I noted, that is at most a default, and not one I would build much on. But in the physico-theological argument there is an assumption that the formation of the world is pure chance out of absolute nothingness (otherwise the question would remain of what created what existed before). In such a situation, the assumption of a uniform distribution is the most reasonable and sensible one. A non-uniform distribution requires a reason. In the lottery of souls, if it is conducted by the Holy One, blessed be He, or by some other mechanism, there is a reason, and one would need to know that reason in order to say anything about it.
It’s complicated for me, but I’ll try to feel my way a bit further. I find it hard to see the distinction between a uniform distribution and a non-uniform one, but I’ll grant it for now (because it’s an idea that needs reflection) and ask differently – seemingly, a uniform distribution (which fits symmetry considerations) is actually much more special than some non-uniform distribution.
In addition, and I hope I’m not mistaken and muddling things up, seemingly regarding majority in matters of prohibition, where there too there are mechanisms pushing toward stringency, you also said that one basically assumes some kind of uniform distribution, and therefore the halfway line has significance.
Exactly. That is why, in the absence of other information, one assumes a uniform distribution. It is the simplest and most symmetric.
As for halakhah in matters of prohibition, each case is judged on its own merits. But there one does not follow only the statistical consideration, but also legal-halakhic rules (for example, there is an aspiration to simplicity. There are meta-legal principles that have an influence, etc.).
If it is the simplest and most symmetric, then it is the most special of all, and yet? Sustain me, etc.
We are not randomizing distributions. The distribution governs the randomization. The uniform distribution is the simplest, and therefore we assume it. Just as fitting points along a straight line is preferable to fitting them along a sine wave, even though you could say that the straight line is the simplest and therefore the most special.
Seemingly, from your straight-line example, on the contrary: since one sees that there is a simple and special line that approximately fits what is there, that is precisely why it is reasonable that it is not by chance. But we would not be able to assume from the outset that a certain phenomenon will fall on a straight line without any grounding. I understand that you are saying that considerations of simplicity are entirely a priori, but how does the line show that?
(I reflected before the previous comment about randomizing distributions, but couldn’t get anywhere, and I am still wondering.)
I don’t really understand what the discussion is about. Do you dispute that, in the absence of other information, it is reasonable to assume a uniform distribution? Why make a distinction between outcomes? If one knows of no differences among outcomes in the sample space, the most reasonable assumption is that they all have the same weight. I don’t know what more there is to add.
But you are the one who holds that even in the absence of information it is not reasonable to assume a uniform distribution regarding souls. And you explained that this is because there is an unknown process, and only in an emergence from absolute nothingness would systems of laws have been expected to emerge with a uniform distribution, and therefore from the uniqueness of the system there is a proof of creation.
I still do not have a settled opinion, and perhaps there is a difference between before the events occur (when, if one calculates an expectation, one probably has to assume a uniform distribution) and after it has happened (when it is very hard to piously assume that it was supposed to happen according to a uniform distribution). In any case, I asked within your framework, and if it has been exhausted, it has been exhausted.
Exactly. And I explained the distinction. In a random process the distribution is uniform. In a process of choice there is no reason to assume that in particular. I also added that perhaps that is what I would assume in the absence of information, but I would not build anything on it.
It seems to me we have exhausted the matter.
Could you just clarify for me whether I understood correctly that in an emergence from nothingness (if we assume that is possible, for the sake of the physico-theological proof independently of the cosmological one) you are making the positive claim that there would be a uniform distribution there (and this is a critical claim for the proof), and not merely a conjecture based on lack of knowledge.
Yes. If it is from nothingness, then it should be treated as a uniform distribution.
Let the interested reader consult https://mikyab.net/posts/63572#_ftn14.
https://he.wikipedia.org/wiki/%D7%A2%D7%A7%D7%95%D7%9E%D7%AA_%D7%9C%D7%90%D7%A4%D7%A8
The Laffer curve