
On Abduction and Mathematics (Column 537)

With God’s help

Disclaimer: This post was translated from Hebrew using AI (ChatGPT 5 Thinking), so there may be inaccuracies or nuances lost. If something seems unclear, please refer to the Hebrew original or contact us for clarification.

In my classes on learning from experience, I discussed the relationship between induction and abduction. Following those classes, Yehuda sent me the following email:

I came across an article that demonstrates a failed use of induction, and how abduction can save the day. The article is attached (see Figure 2 inside). This is in the spirit of the current series I’m watching on YouTube.

I read the article and greatly enjoyed it, and I thought this would be an opportunity to touch on some issues of abduction and to explain why, in my opinion, this article is not a good example of it. Understanding the details of the article and of this column requires some mathematical knowledge, so I will write the column in a way that lets those who prefer to skip the mathematical details do so.

What is induction?

We are used to dividing the modes of inference in logic into three types: deduction, induction, and analogy. Deduction is inference from the general to the particular, meaning that from the generalization “All X are Y,” one may deduce that a, which is an instance within X, is also Y. For example, if X is the set of human beings and Y is the set of mortals, then a particular a who is a human (included in X) is necessarily also mortal (included in Y).

This raises the question: how did I arrive at the generalization “All X are Y”? After all, I could not have seen all human beings and verified that indeed all of them are mortal. Usually, one arrives at this by a process of (scientific) induction: I myself observed that several such instances were mortal, and from this I generalize that these were not random occurrences but a representative sample that displays to me the general law whereby all human beings are mortal.

It should be noted that we are not speaking here about mathematical induction. In mathematics, induction is a proof technique for a claim about infinitely many cases. If I prove it for the first case, and prove that if it is true for the k-th case then it is necessarily true for the (k+1)-th case, then this can be seen as a proof that the claim is true for all cases. From this you can see that mathematical induction is a kind of deduction, namely that its conclusion is certain and necessarily correct (we have a proof for it); otherwise, it would not belong to mathematics.[1] Induction in its ordinary sense, which is what we are dealing with here, appears in science and not in mathematics. Here it is not a proof but an inference that derives a general law from several particulars.
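As a standard textbook illustration (not from the original post) of how mathematical induction delivers a certain conclusion, here is the classic proof that the sum of the first n odd numbers is n²:

```latex
\textbf{Claim.}\quad 1 + 3 + 5 + \cdots + (2n-1) = n^2 \ \text{for every natural number } n.

\textbf{Base case} ($n = 1$):\quad $1 = 1^2$.

\textbf{Inductive step.}\ Assume the claim for $n = k$. Then
\[
\underbrace{1 + 3 + \cdots + (2k-1)}_{=\,k^2 \text{ by assumption}} + (2k+1)
  = k^2 + 2k + 1 = (k+1)^2 ,
\]
so the claim holds for $n = k+1$, and by induction it holds for all $n$.
```

Unlike scientific induction, nothing here rests on a sample: the two steps together cover every case.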

Let us return now to scientific induction. As David Hume already showed, such induction is a problematic procedure. Even if I saw that the phenomenon exists in a finite number of cases, how can I know that this phenomenon holds for all cases of this kind? Alternatively, from any collection of cases one can derive infinitely many different general laws. How can I know which of them is the correct general law?

The answer is that we have no way of knowing this with certainty. The cases we observed, by themselves, cannot select among the various generalizations, since these are generalizations each of which fits all the cases we saw. But intuition helps us see that a certain generalization is more reasonable (perhaps due to its simplicity, or other intuitive criteria). David Hume, who challenged induction, assumed that if we do not have certainty regarding the correct generalization, then we have no way to claim that it is more correct than its competitors. He argued that this is speculation stemming from the structure of thought embedded within us, but nothing may be inferred from this about the world itself. Yet as I have written more than once in the past, I think he erred here, since certainty is not a necessary condition for knowledge. There can be non-certain knowledge. In fact, all our knowledge is of this kind (except for tautological claims).

Thus, induction is an inference that takes a set of particular cases and produces from them a general law, such that they are particular instances of that law (they can be deduced from it).[2] The path from the cases to the general law is induction; the path from it back to them is deduction. From any set of cases, several deductive-nomological explanations can be derived, and our selection of one among them is made on grounds of plausibility, simplicity, and the like (this is no longer a purely empirical procedure).

What is abduction?[3]

At the basis of the induction we always perform stands some theory. Consider a question from a psychometric test that presents you with the following number sequence and asks you to write the next term:

3, 5, 7, ?

I assume most of you will write 9. But I propose a different continuation for the sequence: 11. Correct or not? It depends on the explanation you offer for this sequence: if it is the sequence of odd numbers, then the correct continuation is 9; but if it is the sequence of primes, then the correct continuation is 11 (since 9 is not prime). Who is right? There is no right or wrong here, since both answers are equally correct. Each is based on a different underlying theory, but both theories fit the cases we observed, and therefore there is no way to decide between them solely from the known numbers in the sequence. These are two possible interpretations of the sequence, even though they yield different predictions for the next number.
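To make the two competing "theories" concrete, here is a small sketch (illustrative code, not from the original post) in which both rules reproduce the observed data 3, 5, 7 yet disagree about the next term:

```python
def odd_rule(i):
    """Theory 1: the sequence of odd numbers starting at 3."""
    return 3 + 2 * i

def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def prime_rule(i):
    """Theory 2: the sequence of primes starting at 3."""
    found, m = -1, 2
    while found < i:
        m += 1
        if is_prime(m):
            found += 1
    return m

# Both theories fit all the data we saw...
observed = [3, 5, 7]
assert [odd_rule(i) for i in range(3)] == observed
assert [prime_rule(i) for i in range(3)] == observed

# ...but they predict different continuations:
print(odd_rule(3), prime_rule(3))  # 9 11
```

The observed terms cannot decide between the two rules; only the choice of underlying theory does.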

This is exactly what often happens in science. We have a collection of observations of cases we encountered, and we seek their generalization into some general law (that will tell us what will happen in another case, which we have not yet observed). This generalization is induction, but as we see here, there are several possibilities for such a generalization. What decides between the possibilities is the underlying theory (primes or odds). The theory is what determines the general law. The move from a collection of facts to a general law is induction; the move from that collection of facts to a theory is called abduction (following the American philosopher Charles Peirce).

When we make a generalization, in life or in science, we are not always aware of the tacit assumptions underlying it, but such assumptions are always there. Sometimes a certain generalization seems so natural and compelling that it is obvious to us that this is the general law, and we fail to notice that we have implicitly assumed some theory here—in other words, we have performed a tacit abduction. Thus a student who writes 9 as the answer to the above question will usually not stop to examine his reasoning and will not be aware that he is making an assumption at all. As a result, he may be blind to the possibility that other continuations exist (like 11, for example—and there are countless other possible continuations). If he becomes aware that at the basis of his induction lies an abduction, he will of course be more alert and sensitive to alternative possibilities.

Examples: Misinterpreting correlations

This can be illustrated by cases of correlations that lead to mistaken conclusions. For example, one could argue that it is not worth going on a diet, since everyone who goes on a diet is fat. Even if it is true that there is a correlation between dieting and obesity (these are the facts), there is an error in the theory that serves as an explanation for this correlation. It is not the diet that causes obesity but obesity that causes the diet. Therefore the conclusion that one should avoid dieting is incorrect. We proposed here a generalization about the relationship between dieting and obesity, but it was based on the wrong theory—that is, we erred in our abduction—and this led to an error in the conclusion (which corresponds to induction).

Another example is a letter written years ago by a professor from the Technion to a newspaper, in which he recommended that the State of Israel increase investment in higher education, since there is a correlation between investment in higher education and high GDP. Even if it is true that such a correlation exists, his conclusion is not necessary. It depends on the explanation of the correlation (the abduction), which could be at least one of two: 1) High investment in higher education increases GDP (the explanation he assumed). 2) High GDP leads to high investment in higher education (since a state that has money can invest it in luxuries). There is no way to decide between these two possibilities solely from the facts he described (to decide, one must run regressions). Thus, generalizations or conclusions we naturally infer from data or observations depend on theory. The generalization that is the product of induction is always based on a theory produced by abduction. A different abduction will yield a different induction.
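The point can be illustrated with a small simulation (the numbers and causal models here are my hypothetical illustration, not the professor's data): two opposite causal stories produce essentially the same positive correlation, so the correlation alone cannot choose between them:

```python
import random

random.seed(0)
N = 1000

# Story 1: investment in higher education drives GDP.
inv1 = [random.uniform(0, 10) for _ in range(N)]
gdp1 = [2 * x + random.gauss(0, 1) for x in inv1]

# Story 2: high GDP drives investment in higher education.
gdp2 = [random.uniform(0, 20) for _ in range(N)]
inv2 = [0.5 * y + random.gauss(0, 1) for y in gdp2]

def corr(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Both causal stories yield a strong positive correlation between
# investment and GDP; the correlation cannot decide which story is true.
print(round(corr(inv1, gdp1), 2), round(corr(inv2, gdp2), 2))
```

In both worlds the observed facts (a strong correlation) are identical; only the abduction, the explanatory story behind them, differs.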

Another example: An analysis of qal va-ḥomer

Another example of the meaning and importance of abduction can be seen in analyzing the qal va-ḥomer (a fortiori) inference.[4] Consider the following qal va-ḥomer (based on m. Bava Kamma 2:4):

Damager \ Domain | Public Domain | Victim’s Courtyard
Tooth and Foot   | 0             | 1
Horn             | 1             | ?

Table 1

In Table 1 we have three data points (seen in the Torah—these are the ‘empirical’ data): damages by Tooth and Foot (=T&F) are exempt in the public domain, but liable in the victim’s courtyard. Damages by Horn are liable in the public domain (for simplicity I ignore here that the liability is for half-damages). The question concerns a law not written in the Torah (and thus not open to observation): what is the law for damages by Horn in the victim’s courtyard? As we see in the mishnah there, two arguments are raised that prove from the table above that damages by Horn in the victim’s courtyard are liable:

  • The columns argument. From the two data in the Public Domain column we see that it is easier to impose liability for Horn than for T&F. Now we move to the Victim’s Courtyard column and conclude that if damages by T&F are liable in the victim’s courtyard, then all the more so Horn (which is more severe than they) will be liable there. QED. The assumption underlying this argument (the hierarchy rule) is that liabilities for Horn are more severe than liabilities for T&F.
  • The rows argument. From the two data in the top row we see that it is easier to impose liability in the victim’s courtyard than in the public domain. Now we move to the bottom row and conclude that if Horn is liable in the public domain, it will certainly be liable in the victim’s courtyard. The assumption underlying this argument is that liabilities in the victim’s courtyard are more severe (easier to impose) than liabilities in the public domain.

On the face of it, these are two different and independent arguments. Each is based on a different assumption, even though they arrive at the same conclusion. To see this, consider a refutation to the columns argument. Such a refutation presents a third domain in which T&F would be liable and Horn would not. Suppose, for example, that for damage caused on an island at sea, Horn is exempt and T&F are liable. We then obtain the data in Table 2:

Damager \ Domain | Public Domain | Victim’s Courtyard | Island
Tooth and Foot   | 0             | 1                  | 1
Horn             | 1             | ?                  | 0

Table 2

Why is this a refutation? Because from the Island column one can infer that the hierarchy we assumed between Horn and T&F is not necessary. But apparently, even after this refutation, the rows argument remains intact, since it does not assume any hierarchy between damagers but rather between domains, and this refutation does not attack that hierarchy.

By contrast, a refutation to the rows argument would present a third type of damager that is liable in the public domain and exempt in the victim’s courtyard. Suppose, for example, that damages by Ear are such, and we get Table 3:

Damager \ Domain | Public Domain | Victim’s Courtyard
Tooth and Foot   | 0             | 1
Horn             | 1             | ?
Ear              | 1             | 0

Table 3

The refutation row shows us that the hierarchy between the public domain and the victim’s courtyard is not necessary, thus attacking the rows argument. But the assumption of the columns argument apparently remains intact.

If so, we would expect that in Talmudic literature, when a refutation to a qal va-ḥomer is presented, the response would be to ‘rotate’ the qal va-ḥomer and leave the conclusion in place. But nowhere in the literature of Ḥazal or the commentaries do we find this. After a row or column refutation is presented, the Gemara concludes that the qal va-ḥomer has fallen.[5] Apparently there is an assumption here that these two arguments are two formulations of the same argument. The question is why. The reason can be seen in the theory underlying the qal va-ḥomer inference. The qal va-ḥomer inference is a kind of induction (or, actually, analogy), but underlying it sits an abduction—that is, a theory that explains it. Once this is understood and recognized, the difficulty disappears on its own.

To see this, let us return to Table 1 and first consider the columns argument. The right column shows us that Horn has some characteristic by virtue of which it should be more liable than T&F. Let us denote it by the letter α (alpha), and assume that Horn is more severe than T&F because it has this characteristic in greater intensity than T&F. Thus, if T&F have it with intensity α, Horn has it with intensity 2α. But this is not enough to account for the data in the table. To explain the data we must also assume that, in order to impose liability in the public domain, a damager must possess this characteristic with intensity 2α; therefore, T&F are not liable there, while Horn is. And what happens in the victim’s courtyard? There, any damager possessing intensity α alone will be liable (otherwise T&F would not be liable there). That is, to explain the qal va-ḥomer of the damagers (the columns), we are compelled to add an assumption regarding the domains (the rows). Not only that, the assumption we have added necessarily involves the same parameters as were involved with the damagers; otherwise this characteristic in the damagers would not be relevant to their liability in the different domains.

Note what we have obtained: to explain the data in the table, it is not enough to set a hierarchy among damagers; we must necessarily set the same hierarchy among the domains. Moreover, the hierarchy among them is also described in terms of the same parameter α. Remarkably, the assumption of the columns argument that sets a hierarchy among damagers implicitly forces us to assume that very same hierarchy among the domains. But if so, when we encounter a column refutation that undermines the hierarchy among damagers, it thereby also undermines the hierarchy among the domains (since after the refutation there is no reason to assume that either). This is why the Talmud assumes that a column refutation refutes the rows argument as well. And of course, the same will hold for the rows argument and a row refutation, which will refute the columns argument too.
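The α-argument can be written down as a tiny model (the numeric values and names below are hypothetical placeholders for the parameters in the text): each damager carries the characteristic with some intensity, each domain imposes a liability threshold in the same units, and liability holds when the intensity meets the threshold:

```python
ALPHA = 1.0  # the hypothetical characteristic discussed in the text

# Damagers carry the characteristic with different intensities...
severity = {"tooth_and_foot": ALPHA, "horn": 2 * ALPHA}
# ...and each domain demands a minimal intensity for liability.
threshold = {"public_domain": 2 * ALPHA, "victims_courtyard": ALPHA}

def liable(damager, domain):
    return severity[damager] >= threshold[domain]

# The model reproduces the three known cells of Table 1:
assert not liable("tooth_and_foot", "public_domain")  # exempt (0)
assert liable("tooth_and_foot", "victims_courtyard")  # liable (1)
assert liable("horn", "public_domain")                # liable (1)

# ...and fills in the missing fourth cell:
assert liable("horn", "victims_courtyard")            # the qal va-homer conclusion
```

Note how the single parameter couples the two hierarchies: dropping the assumption that Horn carries 2α (a column refutation) also removes the grounds for the 2α threshold of the public domain, which is exactly the point made in the text.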

The qal va-ḥomer inference works from the three data in the table and infers from them the fourth (missing) datum. This is inference at the level of facts, something parallel to induction (or analogy, in this case). But in the background, even if we did not notice, there lies a theory (the assumption regarding the α-characteristics of the damagers and of the domains), and it underlies both the data and the inference. Thus, we performed an induction that rests on a tacit abduction. The refutation, if one exists, shows us that the abduction we used is incorrect, and consequently the induction (the explanation of all the data in the table) and the conclusion derived within it collapse as well. Think of a refutation to the professor’s argument above that shows a state that does not invest in higher education yet has high GDP (or does invest and the GDP remains low). Such a refutation shows that the explanatory parameter (investment in higher education) is irrelevant to understanding the explanandum (GDP).

Abduction in science: Black-body radiation

So it goes in science as well. We have before us a collection of facts from various observations, and we seek a general law to explain them. The way to reach such a generalization is induction. But induction only gives us a closed-form description of what will happen in infinitely many other cases, without an explanation. To understand these phenomena—and thereby also to verify that the induction we performed is correct—we must perform abduction, namely seek the theory underlying these facts. The correct theory will, of course, yield the correct general law. The scientific generalization gives us a formula that describes what will happen in all situations and cases, but the theory is the conceptual framework that explains and justifies the formula.

In the history of science, we know of situations where induction was done before abduction, when one generalizes the observed facts and creates from them a general law, formula, or equation. The law gives us the results expected to be obtained in other cases we have not yet observed. We still have no explanation why this is the formula and how we arrived from the observations to this formula (and perhaps there are other alternatives). At this stage we possess a phenomenological theory, i.e., a general description of infinitely many facts which is a generalization of the facts known to us. When we find a theory that explains the general formula—if indeed we find one—it will better anchor the claim that this formula is the correct generalization. Such a theory includes principles and concepts and a web of relations among them, within which one can show that the general law we reached by induction is the correct law. If we do not find such a theory, suspicion arises that our induction (the equation) was not correct. We will illustrate this by finding the explanation for black-body radiation.

Already by the end of the 19th century it was known that if we heat a body, it emits electromagnetic radiation at various wavelengths. This radiation differs for each wavelength and also depends on the temperature of the body (but not on the spectrum of radiation it absorbs). Some wavelengths are emitted with great intensity and others with lower intensity, and, as noted, the entire spectrum depends on temperature.

The final formula for black-body radiation was presented by Max Planck in 1900. Before him there was the Rayleigh–Jeans formula, which described the spectrum for long wavelengths (denoted by the letter lambda, λ), and Wien’s formula for short wavelengths. Both were generalizations of measurements for certain wavelengths that were described by approximate mathematical formulas. They were obtained by induction, which took us from particular results for certain wavelengths to more general formulas for all wavelengths. Max Planck unified these two formulas and created Planck’s equation, which gives us the spectrum for all wavelengths. The graph depicting the empirical results of radiation at the different wavelengths (each color is a different temperature, in Kelvin) appears in the original post.

In 1900, Planck proposed a mathematical formula that describes these graphs, where I is the energy emitted from the body (at temperature T and wavelength λ):

I(λ, T) = (2hc² / λ⁵) · 1 / (exp(hc / (λ·k_B·T)) − 1)

h – Planck’s constant, T – temperature, k_B – Boltzmann’s constant, c – speed of light, λ – wavelength
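The two earlier formulas can be seen as limits of Planck’s law. The sketch below (standard physics formulas; the sample temperature and wavelengths are my illustrative choices) checks numerically that Rayleigh–Jeans agrees with Planck at long wavelengths and Wien at short ones:

```python
import math

h = 6.626e-34    # Planck's constant (J*s)
c = 2.998e8      # speed of light (m/s)
kB = 1.381e-23   # Boltzmann's constant (J/K)

def planck(lam, T):
    """Planck's law: spectral radiance at wavelength lam (m), temperature T (K)."""
    return (2 * h * c**2 / lam**5) / (math.exp(h * c / (lam * kB * T)) - 1)

def rayleigh_jeans(lam, T):
    """Long-wavelength (classical) limit."""
    return 2 * c * kB * T / lam**4

def wien(lam, T):
    """Short-wavelength limit."""
    return (2 * h * c**2 / lam**5) * math.exp(-h * c / (lam * kB * T))

T = 5000  # Kelvin
# Long wavelength (1 mm): Rayleigh-Jeans approximates Planck well.
assert abs(rayleigh_jeans(1e-3, T) / planck(1e-3, T) - 1) < 0.01
# Short wavelength (200 nm): Wien approximates Planck well.
assert abs(wien(2e-7, T) / planck(2e-7, T) - 1) < 1e-4
```

Each limiting formula was an inductive generalization for part of the spectrum; Planck’s equation unifies them across all wavelengths.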

Here the induction process ended: we moved from specific results (for certain wavelengths) to a general mathematical description (for all wavelengths). But this is only the beginning of the story, since we have no idea why this is the correct formula. It is merely a generalization of the empirical results and the creation of a general graph and formula. The process continues in 1905 (the annus mirabilis), when Einstein looks at this formula and understands that it reflects the fact that light comes in discrete quanta—photons. From this he explained the photoelectric effect, in a paper that laid the foundations of quantum theory. In 1918, Max Planck himself presented a full statistical explanation showing that his formula reflects the distribution of radiation of discrete quanta of light at different temperatures—that is, he showed that this formula can be explained by a quantum theory of light.

This is already a step of abduction. From the observations of radiation at certain wavelengths, we arrived at a general theory (quantum theory and quantum statistics), within which one can derive Planck’s formula (the product of induction). The scientific path here was from observations of certain particulars to induction (graph and formula), and from there to abduction (theory). Note that the direction of logical derivation is the reverse: from the theory to the formula (the formula gave us hints, and thus we arrived at the theory, which in turn explains the formula), and from it to the particular cases (a particular wavelength and temperature). The conclusion is that a scientific generalization is not mere induction. Induction gives us a general phenomenological law (a description of the behavior), but the scientific understanding and explanation of this behavior come only with the formation of the scientific theory—namely, by abduction. Induction gives us a (general) description; abduction gives us an explanation.[6]

Mathematical surprises

As noted above, Yehuda sent me a mathematical article that presents a very surprising puzzle. To spare those who are not familiar with it, I will first give a made-up example. Define the digital root of a number as the result of summing its digits; if the result is still not a single digit, we sum the digits again, until we get a single-digit number. Now suppose we check the digital roots of various numbers and discover that if a number is divisible by 9, its digital root is also divisible by 9 (this claim is, of course, true). For example, the digital root of 9 is, of course, 9. The digital root of 18 is 1+8=9. The digital root of 27 is 2+7=9. Likewise for the digital roots of 36, 45, and so on. Note that this is true also for larger numbers, like 1,089: its digit sum is 1+0+8+9=18, and the digital root of 18 gives us the final digital root, 9. Now suppose (this is the made-up part of the example) that this interesting property holds for all the multiples of 9 you checked one after another, but that with enough patience you suddenly discovered it does not hold for 676,026 (9×75,114; the 75,114th number in the sequence). Would this surprise you? I think it would. Suppose we assume that after this number the property holds again, but suddenly we discover yet another huge number for which it again fails. The conclusion would be that the general claim (that the digital root of a multiple of 9 is always 9) is not correct, and that there are counterexamples from time to time. A bit odd, but it happens, no?
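The digital-root property itself is easy to check by machine, and here, unlike in my made-up story, it really does hold for every multiple of 9 (the digital root of n is simply n mod 9, with 9 standing in for 0). A quick sketch:

```python
def digital_root(n):
    """Repeatedly sum the decimal digits of n until a single digit remains."""
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n

assert digital_root(18) == 9
assert digital_root(1089) == 9      # 1+0+8+9 = 18, then 1+8 = 9
assert digital_root(676026) == 9    # 9 * 75,114 -- no exception here
assert all(digital_root(9 * k) == 9 for k in range(1, 100_000))
```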

What would you say if this property stopped holding from 676,026 onward? I mean a situation in which every multiple of 9 from there on no longer satisfies the property. It seems to me that this would be even more surprising. A property that holds continuously for very many numbers I would expect to continue to hold always. Even if I discover that it does not, so long as the failures occur at numbers scattered randomly along the axis, that is still somewhat tolerable. But if the property ceases entirely from some point in the sequence onward, that is very surprising.

In both of these cases I assume that a reasonable person would perform induction and suppose that this property holds for all numbers in the sequence (all multiples of 9). But from time to time it may turn out that our inductions are wrong. Of the two surprises I described, it is hard for me to explain why the second surprises me more than the first. The fact that a claim holds for many numbers but not for all is not problematic in itself. And if it is not true, there is no reason to assume that its failure would appear immediately at the first number, or at the 58th number in the sequence. There is no principled bar to it appearing at the 75,114th number. For this reason, the first phenomenon seems less surprising to me. But if something holds for very many numbers, and from a certain number onward it ceases entirely to hold, that is far more surprising.[7]

Back to the example I received

The article mentioned describes a phenomenon very similar to what I described here (but there it is a real phenomenon). Consider the following sequence of Borwein integrals (named for the two mathematicians, father and son, who discovered it):

I_n = (1/π) ∫_{−∞}^{∞} sinc(x) · sinc(x/3) · sinc(x/5) ⋯ sinc(x/(2n−1)) dx,  where sinc(t) = sin(t)/t

It turns out that each of them up to the seventh (I1…I7) yields exactly the same delightfully neat result: 1.[8] But at the eighth integral (I8) a small deviation already appears (about ten digits after the decimal point), and the result is slightly less than 1. From there onward (I9 and beyond) the situation deteriorates further and further downward. If I understand correctly (in light of the explanation given there for the phenomenon), I think that in the end it actually approaches 0.

This phenomenon is truly surprising to me (and apparently to many others; otherwise the article would not have been written). What causes a sequence defined in a monotonic and gradual manner to be cut off abruptly like that? One would have expected the results to continue to be 1 all along (by induction from the initial cases). Note that this is a surprise of the second type and not the first—that is, the more surprising variety. Incidentally, as you will see immediately, in the same way one can arrange such sequences to be cut off starting from I75,114 or any other term you wish.
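Those who want can verify the beginning of the pattern numerically. The sketch below (my own illustration; the cutoff L and the step count are arbitrary accuracy knobs) approximates the first few Borwein integrals with Simpson's rule. Note that the deviation at I8 is about 10⁻¹¹, far below this crude method's accuracy, so only the early terms are checked here:

```python
import math

def sinc(t):
    return 1.0 if t == 0 else math.sin(t) / t

def borwein(n, L=1000.0, steps=200_000):
    """Approximate I_n = (1/pi) * integral over R of
    sinc(x)*sinc(x/3)*...*sinc(x/(2n-1)) dx,
    using Simpson's rule on [0, L] and the fact that the integrand is even."""
    h = L / steps
    def f(x):
        p = 1.0
        for k in range(n):
            p *= sinc(x / (2 * k + 1))
        return p
    s = f(0.0) + f(L)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(i * h)
    return 2 * (s * h / 3) / math.pi

# The first few integrals all come out (numerically) equal to 1.
for n in (1, 2, 3):
    assert abs(borwein(n) - 1) < 5e-3
```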

The brilliant part (this is only a survey—the explanation was presented by two other physicists in a 2019 paper) is the explanation proposed for this surprising phenomenon. Without going into mathematical details, their claim was that each such integral presents a quantity described by a random walk, as I will now describe. Suppose there is a collection of points distributed uniformly along the entire X-axis. Each one advances right or left along the axis, with the size of each step uniformly distributed over some given interval, say on the segment (−Δ, Δ). Clearly, after the first step nothing will change in the overall distribution (because there is perfect symmetry among all the points on the axis). So too after each subsequent step. If we define the outcome as the height of the distribution at the point X=0 after the n-th step, then if the initial height of the distribution is 1, the outcome will be 1 after any number of steps at every point on the axis, and in particular at X=0.

But what happens if the initial distribution is not spread over the entire axis but only over the interval (−1, 1)? Recall that we are looking at the height of the distribution at X=0 after the n-th step. Suppose that on the first step all the points step right or left with maximal step length Δ=1/3, and at the next step the maximal step length is Δ=1/5, and so on with the odd numbers in the denominator. The authors explained that the integral I1 describes the number of points at X=0 after the first step. The integral I2 describes the number of points after the second step, and so forth. Consider what should occur in such a situation. On the first step, the point X=0 ‘thinks’ it lives inside an infinite distribution. The walkers entering or leaving it are only those near it, since it can be affected only by the segment at most 1/3 away in each direction. Within this neighborhood everything appears uniformly distributed, and therefore the result obtained is exactly as in the problem that deals with an initial distribution spread over the entire axis. Hence the outcome—namely, the number of points at X=0—is 1, as in the infinite case. After the second step it can already begin to feel walkers located at a maximal distance of 1/3 + 1/5 from it in both directions. Since this distance is still less than 1, then from the perspective of X=0 the result is still as if the initial distribution were infinite. Therefore, in this case too the result remains 1. The same goes for the third and fourth steps. When will the point X=0 be able to feel that the initial distribution is not infinite? When influence begins to arrive from points that lie beyond the initial distribution segment (i.e., from points on the axis located to the right of X=1 or to the left of X=−1). When these begin to arrive, it suddenly “realizes” that the distribution around it is not infinite, and then a deviation from the constant result begins. 
Clearly the result will begin to decrease, since the distribution that was initially concentrated on a certain segment begins to spread over the entire axis. From the explanation I proposed you can understand that this happens exactly at the integral I8, since after the seventh step influences begin to reach the point X=0 from a distance that exceeds 1. The reason is that the series 1/3 + 1/5 + 1/7 + 1/9 + 1/11 + 1/13 + 1/15 exceeds 1 after seven terms (the sum of all seven is about 1.022, while the sum of the first six is still only about 0.955). Therefore the result begins to be slightly less than 1, because some of the walkers escape outward beyond the initial distribution (to X’s located to the right of 1 or to the left of −1), and therefore the height of the distribution begins to drop. After enough steps—that is, further integrals in this sequence—the results will approach 0 (since eventually the initial distribution tends to spread uniformly over the entire axis).

This is the explanation they proposed for the surprising phenomenon I described. This is why the outcomes of these integrals begin to change starting from the eighth term. Incidentally, from here it is very easy to construct sequences of integrals that will begin to deviate from the 75,114-th term, or any other term you wish. Simply construct a sequence of steps whose sum will exceed 1 after 75,114 terms. This is really not hard.
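Both claims here—the break after the seventh step, and the ease of delaying it to any term you like—reduce to a few lines of arithmetic (the equal-step construction below is just one arbitrary choice of mine):

```python
from itertools import count

def first_overflow(step_sizes):
    """1-based index of the first step at which the cumulative reach of the
    walk exceeds 1, i.e. where X=0 first 'feels' the edge of the initial
    distribution on (-1, 1)."""
    total = 0.0
    for i, s in enumerate(step_sizes, start=1):
        total += s
        if total > 1:
            return i

# Borwein steps 1/3, 1/5, 1/7, ...: the reach exceeds 1 at the 7th step,
# so the first deviation shows up at the 8th integral, I_8.
assert first_overflow(1 / (2 * k + 1) for k in count(1)) == 7

# To push the break out to the 75,114-th term, pick steps whose sum first
# exceeds 1 at step 75,113 -- e.g. equal steps slightly larger than 1/75113.
assert first_overflow(1.000001 / 75_113 for _ in count()) == 75_113
```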

Is this abduction? Between science and mathematics

As noted, Yehuda sent me this article as an example of induction carried out without an abduction in the background. Someone who looks at the sequence of the first seven integrals will be inclined, by induction, to think that all the subsequent integrals will also yield the result 1. But he would be mistaken. The explanation I presented here explains why it is not correct to continue the sequence by natural and simple induction. Seemingly, this is exactly like the abduction that explains why the continuation of the sequence above can be 11 and not 9, or why the result of the qal va-ḥomer is not 1 even if, because of a row refutation, one switches from the rows argument to the columns argument, and so on. When one understands the theoretical explanation, one can discover that one’s induction is not correct.

But as I answered Yehuda after reading the article, in my opinion this is not an example of abduction. The explanation given here is of a different type. Abduction proposes a general theory from which the formula can be derived, and within which all the particular cases are explained. Here, however, the proposed explanation does not present a general law but a specific phenomenon described by that formula. The formula can describe other phenomena as well, not only a random walk, and the result would still be correct. The random walk is a particular example and not a general law, and the example does not explain the formula in a deductive-nomological manner but illustrates it in a particular case.

In mathematics, this is expressed by saying that the random walk is a 'model' of the theory of these integrals: it is a case described by the mathematical formulas. Vector calculus, similarly, is a general mathematical theory, and a great many physical phenomena model it: forces, velocities, accelerations, moments are all quantities described by vector calculus, and therefore they model the theory of vector calculus. The random walk is a model of the theory described by these integrals (and there can certainly be many other models of it as well).
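The random-walk model can even be sketched numerically. Under the standard probabilistic reading of these integrals (an assumption of this illustration, not something derived in the article), the n-th normalized integral equals the probability that a sum of independent uniform steps of half-widths 1/3, 1/5, …, 1/(2n−1) stays within [−1, 1]:

```python
import random

def stays_inside(n_steps, trials=100_000, seed=0):
    """Estimate P(|U_2 + ... + U_n| <= 1), where U_k is uniform
    on [-1/(2k-1), 1/(2k-1)]. Under the random-walk reading, this
    probability is the value of the n-th normalized integral."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(rng.uniform(-1 / (2 * k - 1), 1 / (2 * k - 1))
                for k in range(2, n_steps + 1))
        if abs(s) <= 1:
            hits += 1
    return hits / trials

# Up to the 7th integral the step sizes sum to less than 1, so the walk
# can never leave [-1, 1] and the probability is exactly 1.
print(stays_inside(7))  # 1.0

# From the 8th integral on, the steps sum to slightly more than 1, so the
# probability dips below 1 -- but by roughly 1e-11, far too little for a
# Monte Carlo estimate to detect.
print(stays_inside(8))
```

The simulation makes the deviation's invisibility vivid: the eighth integral differs from the constant value by an astronomically small amount, which is exactly why naive induction over the first seven terms is so tempting.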

In general, it seems to me that abduction is a procedure that belongs essentially to science and not to mathematics. Mathematics does not deal with analogical and inductive inferences, but only with deduction—that is, with necessary inferences from the general to the particular. Science deals with analogy and induction, and therefore also with abductions (which are a kind of induction).

[1] The intuitionists indeed disagree, but that is merely a philosophical pilpul.

[2] In the words of the philosopher of science Carl Hempel, this is the deductive-nomological schema, in which the general law constitutes an explanation of the particular cases if they can be deduced from it.

[3] See also Columns 399 and 405.

[4] See my book Emet ve-lo Yatziv, chapter twenty.

[5] There are two exceptions, in Bava Kamma and Niddah, and in both the table is not symmetric. We will not go into this here.

[6] Which is always of the “reducing to the unknown” type (see Appendix B in my book Does God Play Dice? and here).

[7] My sense is that this is related to the distinction between underfitting and overfitting, described in Column 243.

[8] For convenience I divide each integral by π.



18 Comments

  1. [A reversal of a kal va-chomer can teach both about a new subject, like shen (the conclusion of the rows argument), and about a new place that can play exactly the same role (the conclusion of the columns argument). Therefore, regarding your general question about reversals of a kal va-chomer, even where the refutation comes from a known law, I remember the matter differently, and perhaps I am simply mistaken. If we say: “Shen, which is exempt in the public domain, is liable in the victim's premises; keren, which is liable in the public domain, is it not logical that it is liable in the victim's premises?”, then one can equally say: “The public domain, in which shen is exempt, keren is liable; the victim's premises, in which shen is liable, is it not logical that keren is liable?”, and a refutation of the one formulation need not touch the other. Isn't that right?
    Only if the refutation (pirka) is an explanation and not a known law: against the rows argument, for instance, a pirka may explain that shen is liable precisely because there is benefit from its damage (hana'ah le-hezeka), whereas keren, which has no such benefit, would be exempt. If one then switches to the columns argument and claims that the victim's premises are the stricter place, this explanatory pirka is perhaps no longer relevant (although even here it is not clear that the pirka always falls away). And wherever reversal is possible, the Tosafot at least are supposed to deal with it, and it seems that they indeed do; and if not, then the Atzmot Yosef on Kiddushin.]

    1. I mistakenly reversed the meanings of the terms “rows argument” and “columns argument” (in a table, a rows argument uses the rows to find a hierarchy between the columns; I mistakenly used “rows argument” for the argument that finds a hierarchy between the rows).

      1. I didn't mean evading a pirka (in the terms I know, “evading a pirka”; maybe I am not being precise). You opened with the question of why, after a pirka, the Gemara almost never reverses the kal va-chomer, and then you explained the matter wonderfully, as written in the first book. What is not clear to me (and as far as I remember it was not addressed explicitly) is this: even before the wonderful explanation, how can reversing the kal va-chomer evade the pirka at all?
        You wrote that if the kal va-chomer runs over the subjects: “Shen, which is exempt in the public domain, is liable in the victim's premises; keren, which is liable in the public domain, is it not logical that it is liable in the victim's premises?”, and then comes the pirka: “What of shen, which is liable on an island in the middle of the sea; would you say the same of keren, which is exempt on an island in the middle of the sea?”, then we can reverse it into a kal va-chomer of places: “The public domain, in which shen is exempt, keren is liable; the victim's premises, in which shen is liable, is it not logical that keren is liable?”. So much for your words at the stage of the question.
        Why can one not refute this with: “The island in the middle of the sea will prove it, for there shen is liable and keren is exempt”? [Of course, your explanation lies at the foundation of the whole matter, but why is it needed in order to solve this basic problem?]

        1. This in itself is not necessarily a refutation. The fact that the hierarchy does not hold relative to a third place (between the first and third columns) does not undermine the hierarchy itself (between the first and second columns). However, after looking at the analysis in terms of the parameters (alpha), it does indeed become clear that this is probably a refutation.

  2. In induction, the emphasis is on the conclusion; in abduction, on the general rule or premise.

    3, 5, 7, 9,

    What is the next number in the series? In simple induction, 11. However, if these are the numbers at distance 1 from a power of 2, the next number in the series is 15.

    3, 5, 7, 11,

    What is the next number in the series? In simple abduction, 13. However, if these are numbers with a gradually increasing difference, the next number in the series is 15.

    Now:

    3, 5, 7, ?, 15

    What is the missing number in the series?

    13, if the general series is 3, 5, 7, 13, 15, 17, 23, 25, 27…

    It is not for nothing that the ancient sages disagreed about whether the particular is nothing but what is in the whole, or whether the whole is nothing but a collection of particulars.

  3. “The random walk is a particular example and not a general law.” The argument is understandable and indeed there is a difference. But isn't it correct to say that all models of a formula are isomorphic to each other? And if such an isomorphism indeed exists between every two models, then what is the significance of the difference between a particular example and a general law? [By the way, not long ago this series happened to be shown on this magical channel https://did.li/borwein]

    1. They are indeed isomorphic, but the explanation is not the mere existence of the model, but the understanding that arises from looking at the model. This understanding is not a general explanation, but an understanding from within a single model.

  4. I think Occam's razor is the explanation for many of the examples you gave: for example, why the natural continuation of 3, 5, 7 is 9 and not 11, and also why it is surprising that some property that holds for many hundreds of numbers ceases to hold from a certain number onwards. By the way, every number has a property that ceases to hold precisely at it, namely being greater than all the numbers smaller than it. Why is this not surprising? Again, because the complexity of describing this rule equals the complexity of the number itself, which is not the case in the surprising example in the article here.

    1. I didn't understand what Occam's razor had to do with it.
      These trivial properties, of course, surprise no one and are therefore not interesting.

      1. The question is how to quantify the triviality of a property. There is a field in the foundations of algorithmic learning that formalizes Occam's razor and answers this: https://en.wikipedia.org/wiki/Minimum_description_length

        In the example of series completion, a finite sequence of elements can be described by a polynomial of degree bounded by the length of the series, which in turn determines the next elements. It is known that fixing the probability of an event fixes the length of the description required for an instance of it (by constructing an arithmetic code, for example). Here, for instance, it would be reasonable to posit a geometric distribution both for the degree of the polynomial and for its coefficients (assuming they are integers). Note that this is not an empirical estimate of the distribution as geometric, but a decision to use that distribution to build a code.
        Occam's razor underlies almost the entire column you wrote, but this case is a very direct example of the short-description principle above.

        As for your statement “These trivial properties, of course, surprise no one and are therefore not interesting”: they are trivial precisely because they do not encode anything more efficiently. The short-description principle quantifies objectively that they are trivial.
        In more complicated cases, this is a theoretically grounded principle for determining hyperparameters in unsupervised learning: https://scholar.google.co.uk/scholar_url?url=https://citeseerx.ist.psu.edu/document%3Frepid%3Drep1%26type%3Dpdf%26doi%3Dc6e7e4bdf941322b03e7abe871af66119bde3c0f&hl=en&sa=X&ei=lknKY-z7O4jQmAGhnojwCg&scisig=AAGBfm3BOJoWms-exURNRBOE3hdWRDNhcQ&oi=scholarr

        1. I think you mean what I commented on in note 7. But I repeat: you offer a description of why one thing is more surprising than another (overfitting versus underfitting), but I did not deal with that here (except in the part that compares the two types of surprise). I dealt with the fact that surprising things sometimes happen, regardless of the question of why they are surprising. I certainly agree that, by the criteria you mention, there is something very surprising in these series of integrals; that is what matters for my discussion. Why it is surprising, and to what extent, is a different question.

  5. The Wikipedia reference for Wien's formula is incorrect. It points to Wien's displacement law, which is also related to blackbody radiation but is something different from Wien's formula (also called “Wien's distribution law”), which is contained in Planck's formula. This is the correct reference: https://en.wikipedia.org/wiki/Wien_approximation

  6. The historical account of Planck's formula is also inaccurate. Planck himself, when deriving the formula, assumed that the radiation's energy is emitted in discrete portions. He did not know how to explain why, and assumed it was somehow connected to the fact that the radiating material of the black body is itself made of discrete molecules, i.e., that it is the molecules that emit the radiation. Einstein used this result to explain the photoelectric effect, realizing that it is the light itself that is discrete (made up of particles), and that this, rather than the molecular structure of the material, is why its energy is emitted from the black body in discrete portions.

    1. Here: https://he.wikipedia.org/wiki/%D7%9E%D7%A7%D7%A1_%D7%A4%D7%9C%D7%90%D7%A0%D7%A7

    2. As the reference here notes, he won the Nobel Prize for it only in 1918. Many disputed this, claiming that the main discovery was really Albert Einstein's. The relationship between them reminds me of that between Columbus (Planck), who reached America first and thought he had reached India, and Amerigo Vespucci (Einstein), who arrived after him, realized it was a new continent, and gave it the name America. And yet it was Columbus who “discovered” America.
