
Simplism in Simple Statistical Forecasts (Column 473)

With God’s help

Disclaimer: This post was translated from Hebrew using AI (ChatGPT 5 Thinking), so there may be inaccuracies or nuances lost. If something seems unclear, please refer to the Hebrew original or contact us for clarification.

I've just finished reading Jim Holt's book, which deals with scientific, philosophical, and mathematical issues that have triggered intellectual revolutions and shifts in worldview. I usually don't read popular science, because it is quite hard to write interesting, high-quality popular science. Such books often focus on anecdotes and gossip about thinkers and scientists, and on superficial, bird's-eye descriptions of ideas that, without professional understanding, confuse more than they help. They often aspire to present the philosophical implications of scientific insights, but in practice the result is usually rather silly (scientists, too, do plenty of silly things and embrace populist "philosophies" when interpreting scientific results and pointing to their implications for our lives and thought). This book does some of those things too, but here and there it also enters into the arguments themselves (at least at a popular level), and I really didn't find gross errors. That, too, is quite rare.

One of the points that got me thinking is a simple argument that appears at the beginning of the fourth chapter (and a variant returns later on). As Holt writes, these arguments are rather astonishing because they are extremely minimalist while their conclusions are very far-reaching. This is exactly the sort of argument I’m fond of. It reminded me of another example.

A Way to Increase State Revenues

I once saw a lecture by Bibi in which he explained why raising taxes does not necessarily increase state revenues, and why lowering taxes can actually do so. He drew axes on the board: the Y-axis marks the amount of money entering the state's coffers (revenues only), and the X-axis marks the tax rate. Now, suppose you set a 0% tax. What are the revenues? Of course, 0. Suppose you set a 100% tax. Again, revenues are 0 (no one will work if there is no profit in it). The graph of state revenues as a function of the tax rate therefore rises from zero and falls back to zero: somewhere between the origin and the point (100%, 0) there must be a maximum (assuming state revenues can't be negative), and according to him, world experience shows it sits at around 30%.

What does that mean? If the current tax rate is about 50%, then the way to increase state revenues is to lower the tax rate. And if the current tax rate is about 20%, then the way to increase revenues is to raise the tax rate. This is, of course, a simplistic and imprecise argument, but I quite like it because very minimal, simple assumptions lead to an interesting result (perhaps somewhat counter-intuitive).
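To make the shape concrete, here is a minimal Python sketch. The functional form is entirely my own assumption for illustration (the taxable activity that survives a tax rate t falls off as (1 - t)^2); the argument itself needs only that revenue vanishes at both 0% and 100% and is positive in between. With this particular choice, the peak happens to land near the ~30% quoted above.

```python
# Toy Laffer-curve model (an illustration, not Bibi's actual figures).
# Assumption: the fraction of taxable activity that survives a tax rate t
# is (1 - t)**2, so state revenue is R(t) = t * (1 - t)**2.

def revenue(t: float) -> float:
    """State revenue as a function of the tax rate t in [0, 1]."""
    return t * (1 - t) ** 2

# Grid search for the revenue-maximizing rate.
rates = [i / 1000 for i in range(1001)]
best = max(rates, key=revenue)

print(f"revenue at   0%: {revenue(0.0):.3f}")  # 0: no tax, no revenue
print(f"revenue at 100%: {revenue(1.0):.3f}")  # 0: no one works
print(f"optimal rate: {best:.1%}")             # ~33.3%, near the quoted ~30%
```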

Back to our topic.

Estimating the Lifespan of Phenomena

At the beginning of chapter four Holt presents an argument first formulated by Princeton astrophysicist J. Richard Gott in a 1993 Nature paper. He assumes the Copernican principle: we are probably not special. From this it follows that if we are acquainted with some phenomenon, then we are likely neither among the first to experience it nor among the last. That is, this moment is neither among the earliest moments of its existence nor among its latest. Note the conclusion that emerges from this simple assumption.

Suppose there is a Broadway show that has already been performed n times. I, watching it now, am probably not among the first 2.5% of all its viewers, nor among the last 2.5% (this holds for 95% of viewers). I can therefore determine, with 95% confidence, that I am somewhere within the middle 95%. Hence one may state with 95% confidence that the show will run for at least another n/39 performances (otherwise I would be among the last 2.5%), and for no more than another 39n performances (otherwise I would be among the first 2.5%).
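A worked version of the arithmetic, in my notation (n performances seen so far, T the total number the show will ever have; the Copernican assumption is that my position n/T is uniform):

```latex
% With 95% confidence my position lies in the middle band:
\[
0.025 \le \frac{n}{T} \le 0.975
\quad\Longrightarrow\quad
\frac{n}{0.975} \le T \le \frac{n}{0.025} = 40n .
\]
% Subtracting the n performances already given, the remaining run T - n obeys
\[
\frac{n}{39} = \frac{n}{0.975} - n \;\le\; T - n \;\le\; 40n - n = 39n .
\]
```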

For example, if the show has had 100 performances so far, it will continue for at least another 100/39 ≈ 2.6 performances (call it two or three), and no more than about 3,900.

Likewise for humanity. (For simplicity, I’ll speak of “humanity” from here on.) Suppose that in its present form it has existed for ~10,000 years. Then, with 95% confidence, it will continue to exist at least another 250 years and no more than 400,000 years. He estimates, in this way, the lifespan of the Internet, the presence of numbers in our lives, and more. Of course, the older a phenomenon is, the longer its expected remaining lifespan. He even provides several examples that corroborate this result.
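A minimal sketch of the general rule in Python (my own implementation of the formula described above; the confidence parameter will be useful in the next section):

```python
def gott_interval(age: float, confidence: float = 0.95) -> tuple[float, float]:
    """Bounds on the remaining lifetime of a phenomenon of a given age,
    assuming our observation point is uniformly placed over its total
    lifetime (Gott's Copernican assumption)."""
    tail = (1 - confidence) / 2          # e.g. 0.025 at 95% confidence
    lower = age * tail / (1 - tail)      # any less and we'd be in the last tail
    upper = age * (1 - tail) / tail      # any more and we'd be in the first tail
    return lower, upper

print(gott_interval(100))     # the show: (~2.6, 3900.0)
print(gott_interval(10_000))  # humanity: (~256, 390000), the ~250/400,000 above
```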

This really is a powerful argument. With minimal, reasonable assumptions and a single simple consideration, we arrive at dramatic and surprising results with implications across many fields.

Objections

A first problem with this formula is that the choice of 95% as the "normal" yardstick and 5% as the "special" one is arbitrary. We could just as well have chosen 99% versus 1%, and then the bounds would have to be divided and multiplied accordingly. But that is not really an objection, since the same question can be posed at different confidence levels: one may say with 95% confidence that our remaining lifespan lies between X and Y, and with higher confidence that it lies in some wider range. Note that the confidence attaches to the interval as a whole, not to the upper or lower bound separately; otherwise we would slide into claims like "100% confidence that we will survive another million years," which is absurd. Naturally, the assertion that we are not in the special 1% is safer than the assertion that we are not in the special 5%, and precisely because it is safer it yields a weaker result: a wider interval.
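Using the gott_interval sketch above, the same question at 99% confidence (tails of 0.5% each):

```python
low, high = gott_interval(10_000, confidence=0.99)
print(round(low), round(high))  # ~50 and 1,990,000: a safer claim, a weaker (wider) interval
```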

A second issue: as time goes on, the estimates themselves lengthen. If we wait another year and humanity is still on stage, then—surprise—the lifespan estimates will lengthen too. Yet this isn’t really a problem. It’s a dynamic estimate that constantly changes; as we have more information, our estimates can be updated. If we tossed a die without knowing the result, the chance to get a five is one-sixth; if we later learn the outcome is odd, the chance becomes one-third. The longer life goes on, the more information we have, and thus it’s reasonable to update our statistical estimates.

The real problem is different. According to the current estimate, I already know today, with high probability, that humanity will survive another thousand years; that is, I already possess information about where we will be then. Why not use it now and update the estimate, and then update again, and so on? You can see that this leads to a divergence toward infinity. And if we look at the other side of the time axis and ask what estimates our ancestors would have made (had they performed them), we get rather poor results. When humanity was only a thousand years old, the estimate would have been between about 25 and 40,000 years. Fine, that is still reasonable. But what about 9,900 years ago? Back then humanity was only a hundred years old, so the estimate should have been between about two and a half years and 3,900 years. Yet we already know today that this estimate was wrong. If so, why trust our current estimates, if we are producing today estimates that our descendants will know to be wrong and will toss in the trash?
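The retrospective numbers, via the same sketch:

```python
print(gott_interval(1_000))  # ~(25.6, 39000): still consistent with the 9,000 years since
print(gott_interval(100))    # ~(2.6, 3900): already refuted, 9,900 years later
```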

This isn't a technical quibble but a problem inherent in phenomena like humanity. It is obviously hard to pin down such estimates because of ambiguity in the definitions: at which point in the evolutionary process do I define the creature as "human"? If we define the creature that existed a million years ago as human, then the estimated remaining lifespan of the human species increases dramatically: at the same 95% confidence, we would now expect between about 25,000 and 40,000,000 more years. Does this contradict the earlier estimate? Not necessarily. It yields an optimistic estimate, while the previous one is more pessimistic. But notice: if humanity in fact survives only the minimum of the pessimistic estimate, say another 250 years, then we are very special hominids, alive in the last sliver of a million-year history, which breaks the Copernican principle as assumed in the second estimate. So there is, in fact, a contradiction between the two estimates.

A similar problem arises regarding the end point. It’s not easy to define when humanity is considered extinct. After significant evolutionary changes, would we still be “human”? Are we ourselves truly the continuation of the caveman, or is “scientific man” already a new creature? If you like, there are names for this: Generation X, Y, Z, and so on.

At root lies this question: every entity is special in some respects and not in others. The question is whether the respect along which you apply the Copernican principle is indeed not a special one. Perhaps we mistakenly applied it along our particular axis, rendering the estimate not worth much. Hominids may not be very special, but humans could be more special. If you dig through all the relevant data about any person, you will always find something special: for example, the gematria (numerical value) of his name equals exactly his age, or he lives exactly 10 km from his mother. The probability of such things is slim, yet they clearly happen to some people. Note that being exactly in the middle is also very special. I could argue, with the same confidence, that I am not at the midpoint, nor at the one-third or two-thirds mark of the human period: each such placement is very special, hence it is unlikely that I live at it. But if so, then by definition I always live in some very special region, and thus the Copernican principle kills itself.

All of this hides an assumption that the process by which such phenomena go extinct is random and uniformly distributed over their lifetime. That’s a very strong assumption, and I doubt it’s generally correct. Humanity today might be easier to wipe out than five hundred years ago, because we possess weapons that could end the entire story in seconds. On the other hand, humanity is bigger and more distributed, hence harder to wipe out. Suppose we’re on the brink of nuclear war, with a 50% chance it breaks out. Would it be right to estimate our expected survival under such circumstances using Gott’s formula? Not really. What’s the chance that we are on the brink of such a war? What’s the chance it will break out? Is the distribution of such events uniform? We have no way to know. In the absence of other information one might assume a uniform distribution, but I wouldn’t base anything on that.
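A Monte Carlo sketch of this worry, under a hazard model I made up purely for illustration; it shows that once the uniform-position assumption fails, the 95% guarantee evaporates:

```python
import random

# Toy scenario: humanity is 10,000 years old; with probability 0.5 a war
# ends everything within 10 years, otherwise we survive a further span
# drawn uniformly from up to 1,000,000 years. (All numbers invented.)
random.seed(0)
lower, upper = gott_interval(10_000)  # (~256, 390,000), as computed above
trials, hits = 100_000, 0
for _ in range(trials):
    doomed_soon = random.random() < 0.5
    remaining = (random.uniform(0, 10) if doomed_soon
                 else random.uniform(0, 1_000_000))
    hits += lower <= remaining <= upper
print(f"coverage: {hits / trials:.1%}")  # ~19%, far below the promised 95%
```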

Judgment Day Draws Near

On p. 340 Holt describes what is called the “Doomsday Argument.” I don’t know why astrophysicists formulate such arguments, but facts are facts: this one too was raised by an astrophysicist—Brandon Carter of Australia—at a meeting of the Royal Society in London in 1983.

Shall we begin the thought experiment? The astrophysicist, as you can already guess, thinks that humanity is likely on the verge of extinction.

The argument goes roughly like this. Let's assume an optimistic future for humanity: it will survive many more generations. The Earth's population will stabilize at a reasonable level of about fifteen billion; then, as we grow, we will settle other stars in our galaxy and manage to expand the food supply to match the needs of all humanity. Say humanity grows by a billion people every decade, until the sun dies out (a reasonable horizon). And suppose that all the people who have lived up to now, across all generations, number only about fifty billion.

If that is the case then, by the Copernican principle, we are exceedingly special: among the first roughly 0.00001 of all the humans who will ever live. Wow, admit it, that's very special. By contrast, if humanity is about to go extinct, then it is very plausible that our current generation is precisely the largest one, and very plausible that we are living in the last generation, because in an exponentially growing population that is where a random human is most likely to fall.
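For the record, the total implied by these two figures (my arithmetic, filling in the step between the fifty billion and the 0.00001):

```latex
% ~5x10^10 people so far, forming a fraction ~10^-5 of all humans ever:
\[
\frac{5\times 10^{10}}{N_{\text{total}}} \approx 10^{-5}
\quad\Longrightarrow\quad
N_{\text{total}} \approx 5\times 10^{15}\ \text{people}.
\]
```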

Holt, who was very impressed by the previous argument, for some reason notes that this one contradicts it head-on. From the fact that we are not special (the Copernican principle), the earlier argument concluded that our existence could continue for up to about 40 times our current age, not that it would end within a few generations. Here, the same mode of reasoning leads to a completely different conclusion, in many ways the exact opposite. It seems there is something rotten in the state of Denmark... How can that be?

Numbers of People and the Time Axis: The Wonders of Exponential Processes

In the past I mentioned the uniqueness of exponential processes (in particular regarding the spread of COVID). Here's a nice illustration. Think of a regular sheet of paper. You fold it in half, then again in half, and so on, forty times. What will the thickness be? For simplicity, assume the sheet's original thickness is 1 mm. Each fold doubles the thickness: after one fold, 2 mm; after the next, 4 mm; then 8, 16, and so on. After 40 folds, the total thickness is 2^40 mm. Converting to kilometers (divide by a million), that's about 2^20 km; and since 2^10 is roughly a thousand, 2^20 is roughly a million, so we get about a million kilometers, almost three times the distance between the Earth and the Moon (!!). All from forty folds of a 1-mm sheet.
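A quick check of the folding arithmetic (with the 1 mm starting thickness assumed in the text):

```python
thickness_mm = 2 ** 40             # forty doublings of a 1 mm sheet
thickness_km = thickness_mm / 1e6  # mm -> km
print(f"{thickness_km:,.0f} km")   # ~1,099,512 km, about 2.9x the Earth-Moon distance
```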

What’s the point? Humanity grows in an exponential process, doubling itself every few generations (of course there are disasters and extinctions, and growth rates vary by place and time; I’m just illustrating a theoretical point). In such a process, at each generational stage there are as many individuals as the total number who existed up to that point. Today there are about ten billion people on the globe, which is on the order of the total number of people who have lived until now (I’ve read estimates of about forty billion). If so, the last generation won’t be so special, since the number of people in it is comparable to all people who existed throughout history. This allows us to base the estimate on numbers of people rather than the time axis. Of course one can translate it back to the time axis, and thereby resolve the apparent contradiction between the two calculations above.
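The geometric-series identity behind this claim, for an idealized population that doubles every generation:

```latex
% Generation n (with 2^n members) is about as large as all previous
% generations combined:
\[
\sum_{k=0}^{n-1} 2^{k} \;=\; 2^{n} - 1 \;\approx\; 2^{n}.
\]
```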

(There’s a well-known story about the inventor of chess. The Persian shah offered him any reward he asked for in gratitude for inventing the game, and he asked for a chessboard—8×8 squares—with one grain of wheat on the first square, two on the second, four on the third, and so on across the 64 squares. There wasn’t enough grain in the entire kingdom to pay his request. For illustrated explanations of this story and exponential processes, see: link.)
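And the chessboard total, computed directly:

```python
grains = sum(2**k for k in range(64))  # 1 + 2 + 4 + ... + 2**63 = 2**64 - 1
print(f"{grains:,}")                   # 18,446,744,073,709,551,615 grains
```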

Back to Simplistic Considerations

Here is another example of the claim above, that it is hard to apply the Copernican principle because every person is special along some axes and not along others. In our case, I can be very special on the time axis (e.g., living in the last generation) and yet not special on the headcount axis (since about a quarter of all the people who have ever lived are alive in my generation).

Let me stress: I myself do not accept Holt's approach. All of this was said only to question whether Gott's approach is indeed correct, and I think the objections I presented above apply here as well. For example, the assumption that I am "not special" implies that my soul was drawn at random from some pool and tossed into the world at some stage, with a uniform lottery (each stage equally likely). I see no real basis for that assumption. In such a context one could perhaps speak of plausibility (in my opinion even that is dubious, since we have no information about the process), certainly not of probability. And besides: had I been "drawn" in the 12th century BCE, would I still have been me? In what sense? It would have been a different person altogether, born at a different time and into a different environment. By that definition, the chance that I would be born exactly now is 1. Asking "what would Maimonides have said had he lived today?" is like asking whether he would still be Maimonides; the question is undefined. He would not have been Maimonides but someone else.

Incidentally, Holt notes (p. 341) that Brandon Carter coined the term "anthropic principle" about a decade earlier (in the 1970s). To my surprise, the doomsday argument bears many similarities to those ideas. Despite the charm of simple arguments (I discussed this in my first book, God Plays Dice, and also in the third conversation of my book The First Being), I still recommend a healthy dose of skepticism toward elegant, simple arguments. Sometimes a person hits upon a simple and correct insight; evolutionary theory is like that, which the philosopher Malcolm called an "eye-opening tautology." But as a rule, far-reaching conclusions require (alas) heavier argumentative work than minimalist arguments supply; "according to the effort is the reward." So before you follow Holt and run off with an argument like this, it is definitely worth giving it another check.

Homework for readers: Try to raise objections to Bibi’s argument presented at the beginning of the column.




17 Comments

  1. Laffer curve: https://he.wikipedia.org/wiki/%D7%A2%D7%A7%D7%95%D7%9E%D7%AA_%D7%9C%D7%90%D7%A4%D7%A8

  2. Regarding Bibi's argument: it assumes there is a single maximum, when it is certainly possible (and even likely) that there are several maxima, and therefore at least one minimum. From a practical point of view the argument is not very useful: all it says is that there exists an optimal tax rate (in terms of state revenue), which is a fairly trivial claim. The important question is what that optimal rate is, and that can obviously vary from one economy to another and with the macroeconomic situation.
    In short, the less information the model contains (correct assumptions about reality), the less useful it is.

    1. This is the weakest criticism, and not even entirely correct, because most likely there is only one maximum; and in any case the argument at least proves that a tax increase does not necessarily increase revenue, which is its main point.
      I also really disagree that less information means less useful. Here too there is a more complex relationship, one with an optimum.

  3. I haven't read it all yet, but one comment caught my eye. You wrote that in your opinion, when there is no information about the distribution process, one cannot even speak of probability. Apropos what you mentioned at the end about the parallels to discussions of God and creation: on the subject of proving the specialness of the system of laws, I thought you did claim that one can argue for specialness without any information about the distribution process. What is the difference?

    1. When the process is completely unknown to us but there is some process there, there is no point in assuming that the distribution is uniform. As I noted, that is at most a default I would not base much on. But the physico-theological argument assumes that the creation of the world is a complete accident out of absolute nothingness (otherwise the question remains what created whatever preceded it). In such a situation the assumption of a uniform distribution is the most reasonable and logical one: a non-uniform distribution needs a reason. In the lottery of souls, whether it is conducted by God or by some other mechanism, there is a reason, and you need to know that reason in order to say anything about it.

      1. This is complicated for me, but I'll try to probe a little further. It's hard for me to see the distinction between a uniform distribution and a non-uniform one, but I'll grant it (since it's an idea that needs pondering) and ask differently: apparently a uniform distribution (which fits symmetry considerations) is actually much more special than some arbitrary non-uniform distribution.
        In addition, and I hope I'm not mistaken and confusing things, apparently regarding following the majority in prohibitions there are also mechanisms of stringency; there you actually said that some kind of uniform distribution is assumed, and that is why the half-and-half case matters.

        1. Exactly. That's why we assume a uniform distribution in the absence of other information: it is the simplest and most symmetrical.
          Regarding the halakhah of prohibitions, each case is different. But there one does not follow purely statistical considerations; rather, legal-halakhic rules apply (for example, a striving for simplicity; there are meta-legal principles that exert influence, and so on).

          1. If it is the simplest and most symmetrical, then it is the most special of all, and yet we assume it?

            1. We don't draw distributions; the distribution governs the drawing. The uniform distribution is the simplest, so we assume it. Just as fitting points with a straight line is preferable to fitting them with a sine, even though one could say that the straight line is the simplest and therefore the most special.

              1. Apparently with the straight line it is the other way around: since you see that there is a simple, special line that approximately passes through the points, you conclude it is likely not a coincidence. But we cannot assume from the outset, without any anchoring, that a given phenomenon will fall on a straight line. I understand you to be saying that considerations of simplicity are entirely a priori, but how does the line example show this?
                (Before the previous response I pondered the lottery over distributions and could not work it out, and I still wonder.)

              2. I don't really understand what the discussion is about. Do you deny that in the absence of other information it is reasonable to assume a uniform distribution? Why differentiate between outcomes? If we know of no differences between the outcomes in the sample space, it is most plausible that they all carry the same weight. I don't know what to add.

              3. But you are of the opinion that even in the absence of information it is not reasonable to assume a uniform distribution over souls. And you explained that this is because some unknown process is at work there, whereas only in an emergence from nothingness would systems of laws be expected to arise with a uniform distribution, and therefore the specialness of the system is evidence of creation.
                I still do not have a firm opinion; perhaps there is a difference between before the event (where, if you compute an expectation, you should probably assume a uniform distribution) and after it has happened (where it is very hard to assume in good faith that it was supposed to happen with a uniform distribution). And on your approach, I asked whether the matter is exhausted.

              4. Exactly. And I explained the distinction: in a random process the distribution is uniform; in a selection process there is no reason to assume that. And I added that perhaps that is what I would assume absent other information, but I would not base anything on it.
                I think we have exhausted the discussion.

              5. Could you just clarify whether I understood correctly: in the emergence from nothingness (assuming it is possible, for the purposes of the proof, without cosmological dependence), are you positively claiming that there would be a uniform distribution (a claim critical to the proof), and not merely hypothesizing it out of lack of knowledge?

              6. See the discussion at https://mikyab.net/posts/63572#_ftn14.

  4. If the assumption is that we are not special, then it makes no difference whether what happens to us is happening for the first time or the last, with a probability of 50% or of 1 in a trillion, in accordance with statistical rules or contrary to them. None of this makes any difference at all; after all, we are not special.

    And so this whole discussion is unnecessary.
