
A Look at Occam’s Razor (Column 426)

With God’s help

Disclaimer: This post was translated from Hebrew using AI (ChatGPT 5 Thinking), so there may be inaccuracies or nuances lost. If something seems unclear, please refer to the Hebrew original or contact us for clarification.

In this column I want to touch on the principle of Occam’s razor. It’s a principle I use, of course alongside many others, in various contexts, and it has a few confusing aspects that are worth discussing.

History

This principle is ancient, with roots in Greek philosophy and later periods, but it is attributed to William of Ockham, a 14th-century English Franciscan friar who first formulated and articulated it. William of Ockham’s principle concerns the number of entities we posit, and it states that entities should not be multiplied beyond necessity. In other words, a theory that posits fewer entities is the better theory. Put differently: whoever proposes a theory with more entities bears the burden of proof.

Over the generations, notions such as “simplicity,” “elegance,” “parsimony,” and “aesthetics” have been attached to this idea. That is, we prefer an explanation or theory that is simpler, more economical, or more elegant/aesthetic. These formulations, of course, go beyond counting how many entities a theory contains and essentially expand the original principle. Still, the idea is similar: a small number of entities is one example of a criterion that expresses a theory’s simplicity and elegance.

In earlier eras, justifications for this principle were grounded in the nature of God—that is, assumptions that God prefers simplicity and thus created His world in that way. Even without resorting to theology, scientists tend to think the world is structured simply; this is a metaphysical justification. It isn’t clear where this belief comes from if not from theology, though some hold it is an inference from our accumulated experience.

Those earlier conceptions assume that the razor guides us to truth: we choose the simpler theory because it is more likely to be true (not necessarily certainly true, but with a higher chance of being true). This is an ontic-metaphysical view of the principle. Today, however, I think most scientists and philosophers tend to see it as merely a methodological principle (see the Wikipedia entry on Occam’s razor and, for example, here), i.e., it is not a claim about the world but a guideline for scientists and philosophers. According to this interpretation, when two theories explain the relevant body of facts, there is no reason to choose the more complex theory. On this view, we choose the simpler one not because it is more likely true, but because there is no reason to use a more complex theory when a simpler option exists, at least until the simpler theory is refuted. If that does happen, experience will force us to adopt a more complex theory.

I doubt you’ll be surprised if I add that this principle may stem from our intuition and that its status is akin to that of the principles of causality or induction, which are also part of the foundations of scientific (and general) thought. These, too, are principles without empirical (certainly not direct) grounding, and yet we assume them to be true—often without even noticing. Even on this view, Occam’s razor is not merely methodological but an ontic claim, i.e., a claim about the nature of the world. I’ll remind you that in my view intuition is a cognitive tool; the upshot of my suggestion is that we grasp these principles through (non-sensory) observation of the world itself.

The significance of these debates

At first glance these are merely theoretical disputes. We all agree the principle should be used; the question is only how to justify it. That’s a philosophical issue that, on the face of it, doesn’t matter to practitioners who use Occam’s razor. As I will show, however, that’s not the case. These debates have implications for the principle itself and for how it is used.

As noted, it is now common to think of the razor as merely methodological. Many argue that quantum theory or relativity are certainly not simple theories, so it’s hard to claim that the laws of nature are simple or that God created a simple world. For example, Newtonian mechanics and gravity are far simpler than quantum mechanics and relativity, and yet we now know the latter are the truer theories. This seems like a decisive argument against the validity of the razor, suggesting it is at most methodological rather than ontic-metaphysical.

This argument targets a very specific understanding of the principle. If the justification is theological, then indeed we’d expect the laws of nature not to be so complicated. The theological thesis is challenged by this argument. But the claim that the principle points to truth need not be based on theology. For example, the intuitive grounding I offered does not buckle under this argument. On my proposal, the razor does not say that the laws of nature or the world’s operation are simple, but that among two possible explanations of the body of facts, the simple explanation has a higher chance of being correct. Over the years since Newton formulated mechanics, it became clear that it cannot explain all the observed facts, and so it isn’t a candidate for a correct theory. Among those that do explain all known facts, we still choose the simplest. Quantum and relativistic theories, for all their complexity, are the simplest theories that fit the totality of facts known to us. As Sherlock Holmes said (The Sign of the Four): once you have eliminated the impossible, whatever remains, however improbable, must be the truth.

Thus, different justifications for the principle yield different formulations of it. This is not merely a theoretical dispute about justification; these disputes have consequences for the principle’s content and its use.

It’s important to add that the picture is now more nuanced. On the one hand, the justification I propose yields an ontic-metaphysical conception of the principle. On the other hand, it is not a direct claim about the world (I’m not asserting that it is simple and elegant). My formulation looks, on the face of it, very similar to the methodological conception, since it doesn’t say anything about the simplicity of the true theories but only guides us in choosing among candidates. Haven’t I circled back, through the back door, to the methodological view?

Not quite. Suppose we have two possible theories, A and B, both explaining the body of facts, with A being the simpler. According to the methodological view, choosing A says nothing about its truth; the chance that A is true equals the chance that B is true. If the chances are equal, there is no reason to choose the complex theory, hence we prefer the simple. By contrast, on the ontic-metaphysical view, the choice of A rests on the assumption that it is more likely true. When facing such competing theories, scientists usually try to devise an experiment to decide between them. They look for an experiment that could deliver one of two possible outcomes: outcome a would fit A, and outcome b would fit B. What, then, is the a priori probability of obtaining outcome a? Advocates of the methodological view should answer that, a priori, the two outcomes are equally likely. We have no information to favor a over b, since A’s simplicity doesn’t mean it is more likely true. By contrast, on my ontic view, the probability of getting a is indeed higher. As noted, my directive to choose A for its simplicity is based on the assessment that A is more likely true than B. Of course this doesn’t mean outcome b cannot occur. That would have been the expectation under the theological justification (which claims the world itself is simple). On my view, the b outcome is possible, but less likely than a.

So on my account, too, the razor is a claim about the world, not just a methodological guideline—even though I am not claiming the laws of nature are necessarily simple. I claim that simplicity and elegance are indicators of factual truth. They do not guarantee it, but a simple theory has a higher probability of being true.

Proof

Let’s compare the two interpretations. To sharpen the discussion, I’ll focus on a concrete scientific topic, say Newton’s second law of mechanics. It states that there is a linear relation between the force on a body (F) and the acceleration it develops (a), with the proportionality constant being its mass (m): F = m*a.

How can we arrive at this law empirically? Very simply: perform an experiment. Apply different forces to a body of given mass m and measure the acceleration in each case. Suppose we conducted such an experiment and obtained the results shown in the following graph:

The five empty circles are the results of the experiment. As the text in the graph notes, there are several ways to connect these points into a continuous curve. Two are shown (the solid straight line and the dashed line). Clearly the straight line appears simpler, and Occam’s razor tells us to choose it.
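The situation in the graph can be sketched in a few lines of Python. This is only an illustration: the five measurements, the mass of 2 kg, the constant `c`, and the function names are all assumptions made up here, not data from the original figure. The rival curve is built by adding to the straight line a term that vanishes at every measured force, so it passes through the same five points.

```python
# Hypothetical data: five forces (N) and accelerations (m/s^2) for a body
# of mass 2 kg. The numbers are invented for illustration only.
MASS = 2.0
forces = [1.0, 2.0, 3.0, 4.0, 5.0]
accels = [f / MASS for f in forces]


def straight_line(f):
    """The simple candidate: a = F / m."""
    return f / MASS


def wiggly_curve(f, c=0.01):
    """A rival candidate: the line plus a term that is zero at every
    measured force, so it passes through the same five points."""
    bump = 1.0
    for f_i in forces:
        bump *= (f - f_i)
    return f / MASS + c * bump


# Both candidates reproduce every measurement...
for f, a in zip(forces, accels):
    assert abs(straight_line(f) - a) < 1e-12
    assert abs(wiggly_curve(f) - a) < 1e-12

# ...but they disagree about experiments that have not been run yet.
for f in (6.0, 7.0):
    print(f, straight_line(f), wiggly_curve(f))
```

With these made-up numbers the two candidates agree on all five measurements, yet at a force of 6 the line predicts an acceleration of 3 while the rival predicts roughly 4.2. The data alone cannot choose between them; that is the gap the razor is asked to fill.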

Two ways to interpret that choice now arise:

  • The methodological option — We choose the solid line even though it is no truer than the dashed one, simply because it is the simplest.
  • The ontic option — We choose the solid line because the simplest is probably also the truer (it has the higher likelihood).

Before continuing, let me note that no scientist on earth would seriously claim that the true curve is not the straight line. That’s the simplest option, and it’s obvious to any reasonable person that it is probably the correct one. No scientist would say it’s necessary, but all would agree it is the most plausible. In other words, any reasonable scientist will tell you that the accelerations in cases 6 and 7 (two further forces, not yet measured) will be those predicted by the solid straight line. That is, scientists implicitly assume the ontic interpretation of Occam’s razor, not the methodological one.

If I ask such a scientist why they prefer the solid line, they’ll of course say it’s simpler. If I then ask why they choose the simpler line—are the laws of nature always simple?—they usually won’t have an answer. Pressed to the wall, they’ll fall back on the methodological option: the solid line isn’t any truer, but why use a complex curve when a simple one suffices? Yet if I then ask what result they’d bet on if we perform experiments at points 6 and 7, an honest answer must be that they expect the result predicted by the solid line. Their stance is thus ontic, not methodological.

We could stop here, but I want to push it one step further. Suppose I face a scientist who insists they have no expectation whatsoever for experiments 6 and 7: any outcome is equally likely, since the straight line is no more correct than the dashed; its chance of being right is the same as the dashed line’s, because simplicity is no criterion of truth. With such obstinacy, experiments won’t help: even if we get the straight-line outcomes, they’ll claim it’s a lucky coincidence. It could have been any other result as well.

In appendix B of my book God Plays Dice and in my article here, I offered an argument proving that such a stubborn scientist is mistaken. My claim is this: there are, in fact, infinitely many possible curves that “sew through” all the points in the graph. If, as the stubborn scientist holds, all of them are equally likely, then the probability of obtaining precisely the straight-line outcomes for cases 6 and 7 is exactly 0. If, when we run the experiment, we do get those outcomes, that is indeed confirmation of the ontic thesis. Moreover, suppose we collected all the cases in the history of science where results lined up on a straight line, and in each instance where a further experiment was conducted we asked whether it, too, fell on the same straight line. Adherents of the methodological stance would predict that the number of such cases is negligible; in their view, it almost never happened (recall: the probability is exactly 0). Proponents of the ontic stance would say the probability is not negligible (they cannot say exactly how large; their claim is merely that the simple generalization is not a shot in the dark).
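A rough numerical illustration of the "probability 0" step, under loudly stated assumptions: instead of all possible curves, the sketch looks at a single one-parameter family (the line plus `c` times a term that vanishes at the five measured forces), and "choosing blindly" is modeled as drawing `c` uniformly from [-1, 1]. The data, the family, the range, and the tolerance are all hypothetical choices made here for illustration.

```python
import random

MASS = 2.0
forces = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical measured forces


def curve(f, c):
    """Member of a one-parameter family: every c gives a curve through
    all five measured points; only c == 0 gives the straight line."""
    bump = 1.0
    for f_i in forces:
        bump *= (f - f_i)
    return f / MASS + c * bump


rng = random.Random(0)  # fixed seed so the run is repeatable
line_prediction = 6.0 / MASS  # what the straight line predicts at F = 6
trials = 100_000
tolerance = 0.01  # how close counts as "on the line" (an arbitrary choice)

exact_hits = 0
near_hits = 0
for _ in range(trials):
    c = rng.uniform(-1.0, 1.0)  # a curve picked "blindly" from the family
    prediction = curve(6.0, c)
    if prediction == line_prediction:
        exact_hits += 1
    if abs(prediction - line_prediction) < tolerance:
        near_hits += 1

print(exact_hits / trials, near_hits / trials)
```

Even in this narrow slice of curve space, a blindly chosen curve essentially never reproduces the straight-line prediction, and only a tiny fraction come close. Widening the family to all curves through the points only makes the straight line a smaller target.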

Let’s broaden the question further. Ask how many times in the history of science a tested scientific generalization yielded a prediction that was confirmed experimentally. This is essentially the same question, only now about scientific generalizations in general (each being the simplest theory under the circumstances), not just straight lines. Here, too, the methodological camp would have to say the number of such cases was negligible. The ontic camp, by contrast, will say there were quite a few.

At this point it should be clear that the ontic position is correct; the history of science corroborates it. If the methodological stance were right, no scientific generalization would ever be confirmed by any experiment, and we would have no general scientific laws. In other words, our scientific knowledge today would be akin to that of primordial humankind. If there has been scientific progress, it means generalizations work—not always, of course, but in a non-trivial number of cases. That suffices to refute the methodological stance. The number of confirmed generalizations is a measure of the quality of the intuitive cognition underlying Occam’s razor (the quality of our non-sensory “sight”).

In other words, the dispute between methodological and ontic camps is not merely philosophical-theoretical. It has practical significance and can therefore be decided empirically. The course of scientific history factually confirms the ontic thesis and refutes the methodological one.
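The shape of this empirical argument can be sketched as a likelihood comparison. Everything numeric below is hypothetical: the count of tested generalizations, the number confirmed, and the two success rates are placeholders, not historical estimates. The "blind" hypothesis stands for the methodological stance as characterized above (confirmation is a negligible-probability accident), and the "guided" hypothesis stands for the ontic stance (simplicity tracks truth with some non-trivial rate).

```python
import math


def log_likelihood(successes, trials, rate):
    """Log of the binomial likelihood, dropping the combinatorial factor
    (it is identical for both hypotheses and cancels in the comparison)."""
    failures = trials - successes
    return successes * math.log(rate) + failures * math.log(1.0 - rate)


# All numbers below are hypothetical, chosen only to show the shape of
# the argument; none of them is a historical estimate.
trials = 100        # tested generalizations
successes = 30      # of which the prediction was confirmed
rate_blind = 1e-6   # "shot in the dark": confirmation is negligible
rate_guided = 0.3   # simplicity tracks truth: confirmation is common

log_bayes_factor = (
    log_likelihood(successes, trials, rate_guided)
    - log_likelihood(successes, trials, rate_blind)
)
print(f"log Bayes factor in favor of the guided hypothesis: {log_bayes_factor:.1f}")
```

The point of the sketch is qualitative: as long as confirmations occur far more often than the blind rate allows, the evidence favors the guided hypothesis overwhelmingly, whatever the exact success rate turns out to be.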

Objections to the razor

Over history, and especially in recent years, several objections to Occam’s razor have been raised (see the Wikipedia entry mentioned above). With the preliminaries in place, we are ready to examine them one by one.

  1. Distorting scientific considerations

The first objection is that the razor introduces extraneous considerations into science. Science should seek truth, while the razor injects simplicity, parsimony, and elegance—alien considerations. This objection effectively targets the ontic reading but could accept the razor as methodological (why adopt a complex theory when a simpler one does the job?).

Even the ontic reading, however, isn’t genuinely touched by this claim. First, there are philosophies of science that don’t see science as a truth-seeking discipline but rather as one that seeks simple and elegant descriptions of the facts. On that view (which I have, following Ze’ev Bechler, called “actualism”), Occam’s razor is the very essence of science and perfectly at home in it. But I am, of course, not an actualist (the graph argument above was originally offered against actualism). Second, on the ontic reading, the razor is a criterion of truth, not merely a methodological rule; it is therefore only fitting to use it in the pursuit of scientific truth. One might argue that it is not an empirical tool but a metaphysical-philosophical one and thus has no place in scientific methodology. But by the same token one could say that about countless other scientific assumptions that don’t arise from observation—causality, induction, the assumption that nature is time-invariant, and many more. So even when we frame the dilemma—between the two interpretations of the razor—this objection falls in the gap between the horns; it fails to land a blow on either.

The main problem with this objection is that it misses the point of Occam’s razor and therefore fails against either interpretation. As explained, the razor is invoked to decide between two alternatives that both explain all the known facts. Once that’s the situation, we have no purely intra-scientific criterion with which to decide between them. If an empirical decision were available, we would not need the razor at all.

Above I offered an empirical demonstration that the razor works: using it yields better results than a random shot in the dark. That shows that there is no distortion of scientific considerations here, but rather an additional meta-scientific consideration—one among many—that helps us reach scientific truth.

  2. Time dependence

This objection says that using the razor leads to time-dependence in scientific theory. At any given time, what counts as simple can shift in light of the information accumulated up to then, and so the theory deemed simple may also change.

Again, this reflects a misunderstanding of science as well as of the razor. First, scientific theories do change over time. When new facts are discovered, a theory can change or at least be updated. How is that different from any other scientific matter? Clearly, as our knowledge advances, our theory may change—whether we take an actualist stance (theories are claims about us rather than the world) or an information-realist stance (they are claims about the world).

One might refine the objection and argue that our concept of simplicity can shift not because of newly accumulated facts but due to cultural and philosophical currents, and so the theory changes regardless of facts and observation. But that returns us to the previous objection, where I noted that every scientific theory rests on meta-empirical assumptions; the razor is no different. This brings us to the next objection.

  3. The vagueness of “simplicity” and “elegance”

A major objection to the razor is that notions like simplicity and elegance are relative and vague. They can vary between people, eras, or cultures. What’s simple and elegant to me may be crude and very far from simple to you.

That is indeed correct, but it needs qualification. First, there are cases in which simplicity is well-defined. For example, in the illustration above, the straight line is simpler than any other curve because it can be described with fewer parameters (a straight line uses only two parameters). This echoes the original formulation that spoke of the number of entities; there, too, the criterion seems well-defined.
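One way to make parameter counting concrete, as a sketch only: restrict attention to polynomials (an assumption made here, not a claim of the text) and ask for the lowest-degree polynomial that passes through all the data. Newton's divided differences give this directly, since the degree is the index of the highest coefficient that does not vanish. The function names and both data sets are hypothetical.

```python
def divided_differences(xs, ys):
    """Newton divided-difference coefficients of the polynomial that
    interpolates the points (xs[i], ys[i])."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Work from the end so coeffs[i - 1] still holds the previous order.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs


def parameter_count(xs, ys, tol=1e-9):
    """Number of parameters of the lowest-degree polynomial through all
    the points: one more than the highest non-negligible coefficient."""
    coeffs = divided_differences(xs, ys)
    degree = 0
    for k, c in enumerate(coeffs):
        if abs(c) > tol:
            degree = k
    return degree + 1


# Hypothetical measurements that happen to lie on a straight line.
forces = [1.0, 2.0, 3.0, 4.0, 5.0]
accels = [0.5, 1.0, 1.5, 2.0, 2.5]
print(parameter_count(forces, accels))

# Hypothetical measurements that lie on a parabola instead.
curved = [f * f for f in forces]
print(parameter_count(forces, curved))
```

For the first data set the count is two (slope and intercept), for the second it is three. In this restricted setting "simpler" has a definite meaning, which is the sense in which the straight line wins in the illustration.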

Many argue that counting entities is illusory. Is a train a single thing or many? It has seats, cars, a locomotive, restrooms, passengers, an engine, and so on. What about a car? Or a stone? A stone is made of atoms, each of which is itself complex in terms of elementary particles. Not to mention a person—or even a plant or an animal—organisms of staggering complexity that are hard to regard as simple. Is saying that one person caused phenomenon X simpler than saying it was caused by a collection of other things like stones, clouds, and wind? In my view, this argument is mistaken. When we say a person caused something, we mean the person as an organic whole. In that sense, there is a single cause—even if the person is internally complex. That differs from a non-organic combination of other causes which, even if individually far simpler than a human, still make for a more complex and less elegant explanation.

One could still argue that this very distinction is not scientific but cultural and conceptual. Here are several replies: (a) There are mathematically grounded measures for such distinctions. Entropy is a measure of complexity with concrete expressions in the laws of nature (especially thermodynamics), which makes it hard to see it as purely subjective or culture-bound. (b) Even if simplicity were a vague and relative notion, at least as a methodological tool it is very reasonable to use it. We have nothing better for adjudicating between two theories that both explain all the facts. (c) My claim about an intuitive basis says that parsimony and simplicity are not subjective: we draw them from a kind of (not necessarily sensory) observation of the world. I therefore reject the premise that these notions are merely subjective. True, there are disagreements about them; but in any such disagreement, one side is right and the other wrong—and in some cases science can even decide which.

The same goes for theories that use abstract (theoretical) notions such as energy, potential, wavefunction, or force, which might also be hard to count unequivocally. Here, too, I answer that if a notion functions holistically/organically, then for our purposes it is one notion.

As to this objection as well, I can only refer back to the demonstration I gave above: the razor works. For our purposes, the meaning of that demonstration is that our notions of simplicity and elegance are apparently not purely subjective—even if we lack crisp metrics and objective validation. The success of science is the best validation there is. The successes of science show that our simplicity notions are (not with certainty, of course) broadly correct rather than mere subjectivity.

  4. Conservatism

The objection from conservatism can be formulated in two ways: (a) the razor is false because it expresses conservatism; (b) the razor is dangerous because it leads to conservatism.

The first claim is that the razor is merely an expression of conservatism. Scientists facing two equivalent theories will choose the one that fits the current style of thought. That’s what they call simplicity and elegance.

But this claim, too, is wrong. First, a scientist always uses the knowledge accumulated so far, and there is nothing wrong with that. On the contrary, that is what it means to accumulate knowledge: that it serves as a basis for further thinking. This claim is merely another manifestation of what we already saw: our notions of simplicity and elegance are not subjective. They are the product of accumulated knowledge and intuitive cognition and, as such, are fully legitimate scientific tools. True, they are not infallible, but current knowledge and theory are our best starting point at any given moment; there is no reason to ignore them in the name of some illusory objectivity.

The second claim does not attack the razor itself but points to a danger in using it: it may justify conservatism and inertia. Over-reliance on the razor could lead scientists to dismiss revolutionary theses out of hand without due consideration. Einstein’s attitude to quantum theory (he refused to accept it to his dying day, despite being one of its fathers) is a good example.

This is an important warning and certainly deserves attention. But it should not prevent us from using the razor—only from overusing it. What counts as overuse? At first glance, if the theories are equivalent (each explains all the facts), the decision between them depends on simplicity and elegance. When is it wrong to use them? The description “a theory that explains all the facts” is naïve and simplistic. In the overwhelming majority of cases, that is not the situation: a theory explains many facts, but there are always open questions requiring further research. Contrary to Karl Popper’s description, we do not discard a scientific theory because a single experiment contradicts it. Sometimes we place such issues in the “needs investigation” box. A theory that explains many facts and strikes us as highly reasonable has strength and weight of its own. Abandoning it requires several very significant empirical falsifications. Many philosophers of science have noted that this is not mere conservatism; such conservatism is crucial to the scientific process. Without it, we would flit between theories whenever some experiment failed.[1]

Example: What is rational thinking?

I’ve described a case several times that nicely illustrates this fallacy (see, e.g., column 267). When I studied at the Gush Etzion yeshiva, there was a student two years my senior (now a well-known figure) who fell ill with severe jaundice and did not recover for several months. A mutual friend told me what he had seen when visiting him in the hospital. A certain “witch” was brought with pigeons; she placed them, one after the other, on his navel. Each in turn died after a few seconds, then another was placed and died, and so on. After a few days he recovered and returned to the yeshiva. This was a known phenomenon at the time; I believe that in later years its mechanism was understood and it became clear why the pigeons do not cure jaundice. But that’s not the point. When I returned home and told my parents this story, they clapped their hands in dismay at the intellectual darkness and irrationality the ossified yeshiva was instilling in me (note: this was Gush Etzion, not Toldot Avraham Yitzhak). I told them that their “rational” approach didn’t strike me as rational at all.

Rationality does not mean denying facts that seem implausible. If we have reliable testimony about them, then they are probably correct (until proven otherwise). A rational person doesn’t stop at accepting the facts but tries to seek an explanation for them—either in terms of existing knowledge or by expanding knowledge into new horizons. Denying facts is not rationality but sheer conservatism. If all the great scientists had behaved in that “rational” way, we would still have the science of primordial man (since every fact that contradicted it would have been summarily dismissed).

I should add that skepticism about such facts is healthy. Facts that contradict established scientific knowledge are rare, and the more comprehensive and well-founded our knowledge, the less we expect to encounter such facts. Therefore, when such facts reach us, it is indeed important to make sure no mistake has occurred and that the facts are reliable. But once we have checked and concluded they are, the rational path is to accept them and think further about their meaning.

Further cautions

We have seen that using the razor is both necessary and justified, but that it comes with risks we should beware of. Beyond conservatism and irrationality, one common risk is simplistic, catch-all explanations. Sometimes people think that if they have offered a general, simplistic explanation, then it is also simpler—and therefore more correct. For example, I have often heard the claim that God is the most economical explanation for the world’s existence and therefore also the most scientifically correct. The theistic explanation posits a single being; seemingly the most elegant explanation possible.

Such an explanation, however, suffers from two problems, both typical of misuses of the razor—and they are connected: first, it is not open to falsification; second, it is so general that it doesn’t say much.

An unfalsifiable explanation will always fit all the facts, by definition. It’s easy to invent and impossible to refute—but that is precisely its weakness. It’s no accident that the razor is typically used in scientific contexts, where we can test our decisions with future experiments. Consider, for example, Israel’s process of redemption: it has ups and downs. One explanation is that this is the way of the world; there are various forces working in different directions—sometimes more success, sometimes less. Another pins everything on divine providence: it raises and lowers, but always toward a predetermined goal. The second explanation is, superficially, simpler—yet its weakness is that it is unfalsifiable. What could possibly happen that would make us abandon it? Nothing. Beyond that, it is too general; anything that happens fits it, so it doesn’t really explain. What explains everything explains nothing.

I have often heard people claim to find in the Zohar, the Torah, the Maharal, or Rav Kook, evolution, quantum theory, relativity, and so on. In the rare cases where I checked the sources, I found a flimsy remark that, at best, bears some resemblance to the scientific theory in question, but doesn’t actually say anything concrete and certainly doesn’t yield predictions or quantitative principles. Even in cases where one could indeed find something of the idea in question (usually not the case), scattering broad generalities is not an explanation. The tell is that everything that happens will fit those “predictions.” I’m sure that if and when relativity or evolution (which is not falsifiable) are refuted, those same quoters will find brilliant exegesis showing that their source predicted that too. I’m sure that in the past, people found in Torah or Talmud the scientific theories of their time; the question of what they would say today, now that those theories have been shown incorrect, remains open. And again: a theory of everything is a theory of nothing.

I’m sure there will be comments about my own uses of the razor, especially in theological contexts that are not falsifiable. I only suggest thinking before asking, since in my assessment most such remarks will be mistaken. For the public good, I will clarify only this: I did not say that the razor should never be used outside science and in unfalsifiable contexts. I said that when there is a competition between two explanations, it is important to factor in, in deciding between them, that one of them is unfalsifiable and overly general.

Now, to the summaries.

The prosecution’s summation: “Wikipedia”

In the Wikipedia entry mentioned above, all the objections presented here are cited uncritically. For some reason there isn’t a single argument there in defense of the razor, nor any word about the problems with those refutations. You won’t be surprised to read the concluding paragraph from that entry:

One should not treat Occam’s razor as a rule or a law but only as a pragmatic recommendation. If we treat it as a rule, we will use it to choose a certain theory because it is simpler. According to Karl Popper, if one day we find an observation that does not match that theory, the theory is immediately refuted and rejected. But, according to the same Popper, that would immediately refute and reject Occam’s razor as well, by whose lights we chose the now-refuted theory in the first place.

A common mistake is to claim that Occam’s razor is supposed to provide a tool for choosing between true and false theories, and that is not so. The razor helps us choose, from among different true theories, the theory that is simpler in terms of explanation, parsimony, and content, and to use it to proceed with the scientific enterprise. But what about the other theories, those that were rejected? They too are (at that time) correct! Why reject correct theories merely because they are more complex and complicated to understand? Many argue that a scientist’s task is to reject false theories, not complicated ones.

Galileo Galilei mocked Occam’s razor by saying that we should discard all the science books and choose only the letters of the alphabet, since with them one can explain anything and they are simpler than the science books.

Still, one should not entirely reject Occam’s razor. It is useful in areas such as the didactic domain—it is easier to explain and teach theories and concepts in a simpler way than in a complicated way. Choosing a simpler theory can also reduce implementation costs compared to a more complicated one.

I’m pleased by the indulgent tone that allows us to make methodological use of this ancient, primitive principle. We discover that, in the learned author’s eyes, it is forgivable—sometimes even tolerable. I hope it now needs no further explanation why this is a collection of nonsense reflecting a deep misunderstanding of Occam’s razor and of science in general.

The defense’s summation: using the razor

We have seen that the use of the razor is justified and well-grounded in scientific practice. True, care is needed in using it, but it gains strong support from the history of science. We have also seen that all the objections rest on misunderstandings. The conclusion is that the razor is indeed useful—and more than that: it is a tool for grasping ontic truth, not merely a methodological device.

Originally, the razor served William of Ockham in philosophical contexts (many use it to argue for the existence of God). But it is generally used in scientific contexts, and not by accident I focused on those. I will add that the razor is the sole foundation of the non-deductive logic we developed, which underlies “soft” (non-deductive) inferences—the inferences of science and law—as opposed to those of mathematics and formal logic. This is not the place to go into it; it can be found in our two articles in BD”D (Part A and Part B), in the first volume of the Talmudic Logic series, and in a more popular and concise form in part six of my book Truth, Not Stability.

Beyond that, we use the razor at every turn in daily life. We constantly draw conclusions from what we experience and encounter, usually choosing one conclusion from among several possibilities. So there, too, we choose the simplest and most elegant conclusion under the circumstances. If the station is empty, we assume the bus has already passed; if it’s dark, we assume evening has fallen; if someone tells us the time, we assume he’s telling the truth; and so on. Skeptics can always raise alternative possibilities, and unseasoned defenders will tend to retreat to methodological language. If you ask someone why he believes the person who told him the time, or why he assumes the bus has passed, he won’t understand what you want. It’s self-evident to him. If you press and ask who told him he’s right—after all, there are alternative explanations—he will have to say that, methodologically speaking, this is the reasonable choice even if it isn’t necessarily correct. But that’s post-hoc rationalization, not a description of what he actually thinks. To him it’s obvious that it’s correct; lacking a rational account, he escapes to methodological justifications. That is exactly what happens to scientists and philosophers with Occam’s razor. Instead of admitting that we have an intuitive capacity to grasp the right generalizations with decent probability—something that sounds a bit mystical and not very scientific—they retreat to methodological justifications.

[1] In a certain sense, this claim is equivalent to what is called (in machine learning and elsewhere) overfitting (see here). Excessive fit to the facts is known to be a defect in a theory (usually signaling experimenter manipulation or a programmer’s misunderstanding in machine learning). See, from a different angle, chapter six of my book The Science of Freedom.


45 Comments

  1. I really liked it!
    But it seems to me that someone with a methodological stance would use your example as a justification for your method: the number of correct predictions in the history of science that stemmed from incorrect theories (like the example of Newton's mechanics) is enormous, which shows that a straight line may help to make predictions but not to find the more correct theory.
    If science is entirely a method of making predictions - there is indeed a “substantial” justification for using a razor, but if the goal of science is to reach the study of the truth - your evidence from the number of predictions is nothing more than justification for the fact that this is a successful method of making predictions, and therefore even though experience shows that there is a good chance that the theory will later be proven false, for the time being it makes a lot of sense to hold on to it.

  2. In other words, the razor helps find a theory that will be easy to confirm, but it does not help find the theory that will not be refuted. In most cases, it will actually be refuted in the future, and therefore will be proven to be a useful but incorrect tool.

    1. This is an absurd claim. First, you assume that erroneous theories make no fewer correct predictions than correct theories. This is ridiculous, of course, because on that assumption it is not clear why they were replaced. Furthermore, the theories that have been proposed and tested are indeed correct within certain limits, even in light of the more refined theory (such as Newton's mechanics in light of quantum theory and relativity). And this is exactly my claim: in this way we are advancing towards the truth. According to you, we should today have a collection of theories that have not been refuted and that have no connection to the truth. I would not get on a plane based on a theory about which all I can say is that it has not been refuted. There are millions of those.

      1. 1. Incorrect theories can give numerically no less correct predictions. The reason we replace a theory is not because we have a theory that gives more predictions, but because the first theory has been refuted. That is, even if one theory is confirmed a thousand times but is refuted convincingly, it will be replaced by a theory that has been confirmed less but has not been refuted.

        2. I get on a plane not because I am sure that the physics of the scientists who built it will not be refuted, but because I am convinced that it is at least a good approximation to the truth. As I wrote - a useful tool but incorrect. In the question of what works - the razor is a good tool, to choose the best approximation (the straight line is the shortest distance between the two observations). In the question of what will always work - (will not be refuted) there is no advantage to the simplicity of the theory. And as time goes by, scientific theories do not seem to become simpler, but more complex.

  3. Reflecting

    A. There seems to be some kind of assumption here. You take (as an illustration) all the theories that humanity has come up with to date and assume that we see that the simpler theory has always been right. And from this you deduce a meta-theory (i.e. a theory about scientific theories) that a simple theory has an ontic advantage. But in fact you have laid down the razor here, and found the simplest explanation for why simpler theories have been successful in the past and present. The razor hypothesis on all theories is like the straight line hypothesis on a single theory. Since there are an infinite number of other meta-theories that would produce exactly the successful theories of humanity on data collections, then we still have no justification for choosing the razor meta-theory over any other meta-theory.

    B. In your doctrine you hang the matter on recognition in the eyes of reason.
    B1. A scientist can only see the data and nothing from the experiment itself. For example, today the theorist probably holds a list of numerical results obtained by different experimenters in different experiments under different conditions and based on this he formulates a theory. He does not look at the world itself but at numerical records in human language written on a sheet of paper and encoded in a human way. So in your opinion, through this paper the theorist observes the general law? Ostensibly, recognition requires information that comes directly from the world, and not processed and encoded information.
    B2. If it is about recognition, then how did Newton, for example, arrive at his own incorrect theory. Even if the numerical results come out very close, it is clear that the knowledge of the theory and the entities does not concern the numerical prediction but the entities themselves. So Newton of course did not observe a law that does not exist at all (f=ma), and if so, the question arises as to how the razor principle helped him predict.

    1. A. You are right in principle, and I think I commented on this in Appendix B. In my opinion, if a theory works, then it is also true. The actualists believe that it is possible that it only works and is not true. So according to their method on the meta level, they must methodologically conclude that the theories are true. They also agree that it is right to rely on statistics, and statistics say that the theories are true. In other words, anyone who insists on this level also returns to the usual skepticism, and there is nothing to be done against skeptics.
      B1. The numbers raise the situation for him. Suppose a scientist sees a numerical report on the relationship between force and acceleration. He understands what force is and what acceleration is and understands that there is a relationship between them. The numbers only tell him exactly what the quantitative relationship (the coefficient of proportion is mass) and the formal relationship (a straight line) are.
      B2. Newton's theory is also fundamentally true. This is not just a lucky approximation (based on a theory that is ontically wrong). There is indeed a gravitational force that acts between two masses. Its quantitative description changes slightly in extreme circumstances. The description of the force as a curvature of space is just a different way of describing the matter.
      Even a single ontic reality can be described in several ways, and their accuracy may differ from one another.

      1. A. I understand that you are saying that proof from the entire history of science is stronger than proof from a single scientific theory in which a scientist predicted a simple theory and it turned out that he was actually quite close to the truth. But I still don't understand the point of difference. What can an actualist argue against proof from a single scientific theory that he cannot argue against proof from the entire history of science. Is the study of the entire history of science only intended to strengthen statistics (from zero for all practical needs to zero, very zero for all practical needs)?

        1. This is not a quantitative difference. The actualist claims that a theory that works is not true as a description of the world. But he accepts the methodological assumptions of science, according to which a theory should work and that statistical verification (such as generalizations) is acceptable, etc. But in his opinion, all this was said only on the methodological level.
          Now I ask a meta-theoretical question: are the theories true or just facts. Ostensibly, this is a purely philosophical question, and therefore the debate about it is ongoing and cannot be decided. But my argument, and in this its uniqueness, offers a statistical-empirical (scientific) answer to this question and not just a philosophical one. And after all, the actualist accepts statistics as a decisive tool at least on the scientific-methodological level. If so, at least methodologically he is supposed to adopt the view that the theories are true. It may sound absurd, but that's what comes out, probably because his very doctrine is absurd (because what works is probably true. It's unlikely that this is a collection of miracles in vain. It turns out that those who cling too closely to facts and reject speculation must adopt a miraculous view).

          1. You explain here that even one successful prediction is a refutation to the naive actualist who does not even methodologically accept that the theory is correct but only works. But I still do not understand what the view of all scientific theories (where the razor is empirically accepted as a meta-theory that explains the theories) added to this more than your view of a single theory (where a theory that used the razor is empirically accepted as a theory that explains the findings).

            B. Regarding recognition, I still do not understand. Are you saying that if a scientist is given a graph like the one you presented between two variables and is not told their meaning, then he will no longer recognize the simplest theory as an ontological description of the data and predicting the future? If he does propose a simple theory, it means that the razor is not a cognitive principle at all.

            1. A. I don't know what is not understood in what I wrote and repeated. The difference is not quantitative (one theory versus many) but in the nature of the question. When discussing one theory, we are talking about the question of whether it works. When talking about all theories or the observation that leads to the construction of theories, we are talking about the question of whether we have such a sense or not, that is, whether the theories describe reality or just facts. This is the test question that I tried to answer, and not a scientific question as is the case with one theory. Looking at all theories speaks about us, looking at one theory speaks about the world or, more precisely, about science.

              B. He will recognize that the straight line is the correct connection. But he will of course not be able to understand what theoretical concepts underlie the straight line. It is likely that the level of recognition of reality is directly proportional to the level of encounter and familiarity with empirical materials.

  4. Occam's razor is just nonsense.
    Occam's razor principle is stronger: keep yourself from getting involved, keep it simple, learn nothing.

    If there are two theories that explain the same phenomena, then both theories are possible.
    What determines whether a theory is better than another is not its simplicity but its ability to surprise us with its predictions. And that should be the guiding principle.

  5. I didn't understand the graph example.
    The reason we assume in advance that an experiment will support the straight line is not because of Occam's razor but because the straight line has a theory that explains it while the curved line has no explanation.
    If both lines had logical theories that explain them then we wouldn't be able to predict the results of the experiment in advance.

    1. Absolutely not true. Here too, there is no explanation, and yet a straight line is assumed, and in general, whenever there is such a connection, a straight line is preferred regardless of the explanations.

      1. As far as I know, the purpose of experiments is to choose between hypotheses.
        Regarding the straight line, we have a hypothesis, regarding a curved line, we do not, since it is equivalent to an infinity of other curved lines, so what good would an experiment be?

        1. There is no point in this whole discussion. Take a graph like the one above and show it to some scientist. Erase the axis labels, and don't tell him what is measured on the X-axis and on the Y-axis.
          I would love to hear if there is one scientist who wouldn't choose the straight line. I don't think you will find one.

          1. The main reason is that there are not enough points to reduce the uncertainty.
            Choosing a non-straight line adds more uncertainties to the other variables.

            If there were enough points to reduce the noise and uncertainty, no one would choose a straight line.

  6. The Razor of Rabbi Moses ben Maimon (Maimonides) – “The Guide for the Perplexed,” Part Two, Chapter 11:
    “Since the purpose of this science (astronomy) is to posit a formula (astronomical model) with which it is possible for this star to move
    ….And what is required of this movement will correspond to observation.
    However, he seeks to minimize movements and wheels as much as possible.
    If we are able to posit a formula according to which the observed movements will be explained according to three wheels (cyclical motion),
    and another formula according to which the same thing will be possible according to four wheels –
    it is appropriate for us to trust the formula in which the number of movements is smaller”

    There is no explanation why “it is appropriate for us to trust”.

    1. Ockham was born 83 years after Maimonides died.

      Another translation:
      “Know that these aforementioned matters of astronomy, if read and understood by a person of merely general learning, would lead him to think that they are a complete proof that the shape of the wheels and their number is so. This is not so, and this is not the purpose of the science of astronomy. There are things among them that are proven to be so, as it has been proven that the orbit of the sun is inclined to the equator, and this is beyond doubt; but whether it has an eccentric wheel or an epicyclic wheel has not been proven. And this is something the astronomer does not concern himself with, because the purpose of this science is to posit a configuration in which it is possible for the motion of a star to be circular, with neither acceleration nor slowing nor change, [and] the result of that motion will correspond to what is seen.

      And he will try *2 to reduce the movements and the number of wheels as much as possible, since if we could, for example, posit a configuration in which what is seen of the movement of this star is accounted for by three wheels, and another configuration in which the same thing is accounted for by four wheels, then it is better to rely on the configuration with the fewer movements. And therefore, for the sun, we chose the eccentric wheel over positing an epicyclic wheel, as Ptolemy mentions.”

      *2. Here too, R”S wrote that our Rabbi was correct in his translation “and he meant with this”. And he is correct, and the intention is that he should strive for this and his goal should be towards this.

      1. It seems that the logic behind this requirement appears later with the parable of the man who has excess money… (excessbetter)

        And it seems that the description of the rotations is better because supposedly God gave the creatures what they need, no more and no less. And if 3 is enough, then there is no reason to add additional intelligences.

        Then the principle is based on a kind of perfection with minimal energy waste. Or a principle of minimal motion (Hamilton's principle).

  7. 1. Is it possible to speak in terms of justification for a principle in a naturalistic world that has evolved purposelessly in an evolutionary way?
    At most, it is possible to explain why psychologically and methodologically it is acceptable to accept it. But is it possible to speak of ontic justification?
    2. Is the argument that Ockham's razor works and the chance of this is zero, evidence in favor of the principle?
    Because it seems to be able to confirm this only for those who do indeed presuppose that the principle is true, because even the test findings are analyzed within the mental framework of the principle that the simple explanation is also the correct one. So it is not a series of Foxes.
    3. You did not address the objection that it depends on time, not in the sense of the amount of knowledge that is before us, but that the principle is applied within the framework of a world that depends on time. (In the sense that the laws of nature are “fixed” in time).
    For example, the very claim that the hypothesis that the bus passed is preferable. Although there is no likelihood that we can think that within the framework of the world open to humans the simplest event is also the right one.

    1. 1. In a materialistic world, there is no justification for anything. I have written about this more than once. In such a world, there is no discretion and we are a mechanical machine. What comes out comes out, there is no question of justifications.
      2. See my answer to Tirgitz, who spoke about begging the question.
      3. I did not understand the question. I was referring to time dependence.

  8. Thank you very much, I'll take a look there.
    3. Indeed, that wasn't clear enough. I'll try another way.
    There are two essential parts to Ockham's razor.
    1. The laws of nature as a kind and realization of “pure laws”.
    2. The multifaceted human world in which we live.
    So while regarding the first part, one can quite reasonably accept the belief that the simple description and explanation is also the correct one, (it is enough to assume and have theological evidence that we are capable of understanding the world to justify this).
    It seems that in the second world, a world open to humans – with free will and discretion, a world that is made up of so many different factors/sub-factors and shades, and that every little thing has a great impact. It already seems that the assumption about the correctness of the simple description may even be arbitrary.
    Because even if we assume that the framework of natural laws is simple and deterministic, it really doesn't require that the human world will also be simple. On the contrary, it changes a lot and is subject to time.
    Another aspect from which this can be seen is our inability to predict the future, and from our point of view it may even be something of a chaotic nature.
    But then, how can Ockham's razor be justified also on the level of "our" world, which is open to construction and changes over time.

    There is no need to repeat the last paragraph in the column and show that Ockham's razor is constantly behind our inferences regarding this human level. Whether on the daily level of walking and encountering an empty bus stop, or in the legal field of strong evidence laws, etc., etc.

    1. I don't understand the question. Are you asking that simplicity isn't necessarily true? I explained that. There is evidence that it works.

  9. I meant to ask how one can assume that simplicity is real within the framework of human culture. Not within the framework of the laws of nature, where it is indeed understandable: if one assumes that God instills in us the ability to understand the world, there is no reason why we should not be able to deduce the laws of nature with some ingrained assumptions.

    But human reality is complex by its very nature, open to free will, to myriad independent factors, and is not deterministic at all.

      1. Let's try again this time with clearer assumptions 🙂 ,
        How can we trust the principle that it is forbidden to multiply beings in a world that is not deterministic (but with free choice).
        Because:
        The a priori probability of the hypothesis of an explanation of the event depends on the number of beings. (The simpler you assume it is, the more likely it is).
        But the number of beings depends directly on their will.
        And their will is not valid for the principle.

        If so, there is no reason to assume that the principle is valid for explanations in a world full of non-deterministic beings, except for the whole complexity of the world anyway.

        I suppose you will object to assumption 3, but this contradicts the concept of free choice.

  10. It seems that William of Ockham applied the simplifying ‘razor’ to the name of his birthplace, which is called Ockham, in the manner of place names in England ending in -ham, such as ‘Nottingham’ and ‘Birmingham’ and so on.

    However, in his philosophical writing, William gives his birthplace in a simplified and reduced form: Ockham became Occam, dropping the -h. In the spirit of the poet's saying: the landscape of his homeland is the pattern of the man 🙂

    With the blessing of the raids, Prastik the Pshitik

    1. If we take as an example the simple mechanics of Newton versus the complex mechanics of the theory of relativity – It is not correct to say that the simple theory is ‘false’, at small speeds Newton's mechanics is incredibly accurate, and is used in our everyday lives. Only when we reach orders of magnitude approaching the speed of light – then the complex mechanics of ‘the theory of relativity’ come into play.

      There are visible and simple layers in the world and in life, and there are deep and hidden layers. In the simple layers – Ockham's simplifying razor helps us, but beneath the surface, with scrutiny and investigation, deep and complex layers are revealed, deeper than ever. In the layer of simplification, the Razor helps us – But beyond it there are wonderful worlds of ‘raz’ and ’or’.

      Best regards, Nehorai Shraga Agami-Psisowitz

      1. Paragraph 1 Line 2
        … At low speeds the mechanics of…

        Ibid., Line 4
        .. Then the complex mechanics comes into play…

      2. In S.D. H. Bageslow P.B.

        It is interesting to examine whether there are parallels to the idea of ‘Occam's razor’ in Judaism. Perhaps the rule that ‘a verse never departs from its plain meaning’ goes partly in the same direction, in that the simpler explanation of the language of Scripture is true. However, unlike ‘Occam's razor’, which accepts only the simple explanation and rejects the complicated one, in Torah interpretation the ‘plain meaning of Scripture’ is not the only explanation but exists alongside the Midrash of the sages, and the plain and the expository readings are complementary aspects of understanding the Torah.

        The assumption that simplicity and ‘economy’ are preferable to length seemingly stands at the foundation of the approach of Rabbi Akiva, who holds that the Torah was formulated in the shortest and most economical way, and therefore holds that all the variations of words or letters, and even ‘like letter combinations’, were not made simply to beautify the language (as Rabbi Yishmael held, that ‘the Torah spoke in the language of human beings’), but rather every variation of language came to teach ‘great laws’.

        With blessings, Nasha”ef

        1. And perhaps the convert who asked Hillel, "Convert me on condition that you teach me the entire Torah on one foot," also came from a premise similar to Ockham's: that one must seek an explanation that accounts for all the "phenomena" of the Torah by one guiding principle. Hillel also accepts this premise, and proposes that the guiding principle of the entire Torah is awareness of others.

          With greetings, Nashaf

          1. Rabbi Michael places the equal side on Occam's razor, and this is probably the accepted understanding.

  11. As it turns out, the very scholarly discussion in principle contradicts the principle

  12. This question is dealt with a lot in machine learning in what is called “model selection” which is actually very much like the example here of the graph of whether to choose a description of a line or a polynomial of higher degree.
    There is a principle called “no free lunch” which simply means that if we do not make a priori assumptions (that do not depend on the information) about the function that is supposed to describe the information, then the information cannot teach us anything. If a function can be random (even one that cannot be described simply by a mapping from a side x) then any information we have received does not teach us anything about things we have not seen.
    Only when we assume something a priori about a limited family of functions that can describe the information can the information guide us from this family of functions which function best fits the information.
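The point above can be sketched with a toy example in Python. Everything concrete here is an assumption made up for illustration and does not come from the comment: the four-point domain, the three observed values, and the choice of "threshold functions" as the restricted family. With no restriction on the function, the observations leave the unseen point completely open; with the restricted family, the same observations determine it.

```python
from itertools import product

# Hypothetical setup: a function from {0, 1, 2, 3} to {0, 1}.
domain = [0, 1, 2, 3]
observed = {0: 0, 1: 0, 2: 1}   # made-up observations
unseen = 3                       # the point we want to predict


def consistent(f):
    """True if the candidate function f agrees with every observation."""
    return all(f[x] == y for x, y in observed.items())


# Family A: every possible 0/1 labelling of the domain (no a priori assumption).
all_functions = [
    dict(zip(domain, values))
    for values in product([0, 1], repeat=len(domain))
]

# Family B: threshold functions only, f(x) = 1 exactly when x >= t.
threshold_functions = [
    {x: int(x >= t) for x in domain}
    for t in range(len(domain) + 1)
]


def predictions(family):
    """Values at the unseen point given by the family members that fit the data."""
    return sorted(f[unseen] for f in family if consistent(f))


unrestricted = predictions(all_functions)
restricted = predictions(threshold_functions)
print(unrestricted, restricted)   # prints: [0, 1] [1]
```

In the unrestricted family, one surviving function says 0 and one says 1, so the data teach nothing about the unseen point. In the restricted family only one function survives, and it gives a definite prediction.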

    There is a measure called the “VC dimension” of the family of functions that defines the “size” of the family of functions. As I increase the family of functions from which I choose, I can be less confident in my conclusions because the fit of the theory I chose to the data can be a result of the fact that there are so many permissible theories that it is clear that one of them will fit the data. So if I choose a linear description then it is a small family of functions and therefore if the line really fits the information I can be quite confident in my conclusions. If I choose a high-degree polynomial (and I don't have a lot of data) then I can be less confident in my conclusions. So even though the high-degree polynomial also fits the information, the bound on my error on information that I have not seen will be high.
    There is also a Bayesian approach that asks what the probability of the family of functions is given the information and it depends on what the probability of the information is given the family of functions. That is, the probability that information will sit on one line when we assume that the function is a line is high. On the other hand, if I assume that the function is a high-degree polynomial, then the chance that I would only see information that sits on one line is low. Therefore, the line should be preferred.
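A minimal sketch of the line versus high-degree polynomial comparison, in plain Python. It does not compute a VC dimension or a Bayesian probability; it only illustrates the practical consequence described above, that a large family can fit the observed points perfectly and still do badly on points it has not seen. The underlying law `true_law`, the fixed ±0.1 "measurement noise", and the choice of ten points are assumptions invented for the illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a predictor function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolant(xs, ys):
    """Lagrange polynomial of degree len(xs) - 1 passing through every point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a predictor on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def true_law(x):
    # Hypothetical underlying law, chosen only for this illustration.
    return 2.0 * x + 1.0


# Ten observations with a small deterministic error of +0.1 / -0.1,
# so that the run is reproducible without a random seed.
train_x = [float(k) for k in range(10)]
train_y = [true_law(x) + 0.1 * (-1) ** k for k, x in enumerate(train_x)]

# Points that were not observed: the midpoints between the observations.
test_x = [k + 0.5 for k in range(9)]
test_y = [true_law(x) for x in test_x]

line = fit_line(train_x, train_y)
wiggly = fit_interpolant(train_x, train_y)   # degree-9 polynomial

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
wiggly_train, wiggly_test = mse(wiggly, train_x, train_y), mse(wiggly, test_x, test_y)

print("line:      train", line_train, "unseen", line_test)
print("degree 9:  train", wiggly_train, "unseen", wiggly_test)
```

The degree-9 polynomial matches the ten observations exactly, yet its error on the unobserved midpoints is far larger than that of the straight line, which is the sense in which a perfect fit from a large family gives less confidence.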

    1. Many thanks. The NFL assumption is exactly the basis for what I said here. (Of course, you also assume that the results you saw are uniformly distributed, meaning that it is a representative sample; otherwise, even in a family of exponential functions, the results in front of you may fall on one line.)
      Right now I'm in the middle of a course on Coursera (from Stanford) on machine learning. Interesting.

  13. Daniel – Thanks. You saved me a comment 🙂

    In general, I find it disappointing how little space the concepts from statistics and machine learning have in philosophical discourse, even though they are the closest thing to “engineering epistemology” that we have. I am glad that Prof. Michi chose to study this, and I hope that a lot of good philosophy will come out of it.

    1. Philosophy is the language of the wise naïve. Those who don't really understand that the world is more complex than the collection of words they mumble.

  14. It seems that according to Ockham, kashrut should not be privatized, since it is better to exclude those entities that provide kashrut 🙂

    Best regards, Ray Golator

    1. On the contrary, Occam's razor requires the kashrut reform,

      Since to this day the determining body for the certification of kashrut providers is the ‘Chief Rabbinate Council’, which has 15 members and is elected by an ‘electoral body’ of 150 members, while according to the reform the final arbiter will be the ‘designated’ Minister of Religious Affairs, there is only one entity by which all kashrut matters will be governed: the Minister of Religious Affairs, Shlita.

      Best regards, Jill Glachowski
