What Is Intuition? Cognition, Thinking, and Emotion (Column 653)
With God’s help
Disclaimer: This post was translated from Hebrew using AI (ChatGPT 5 Thinking), so there may be inaccuracies or nuances lost. If something seems unclear, please refer to the Hebrew original or contact us for clarification.
In the previous column I needed a general characterization of the faculty we call intuition. I searched the site and was surprised to find that although I’ve discussed it in many places, there isn’t a single orderly column that does so systematically. So I decided to fill that gap here.
Intuition Is Not Emotion
First, I wish to dismiss the common association between intuition and emotion. Emotions are not claims. When I say that I love someone and someone else says that he does not love that person, we are not in disagreement. There’s no disputing taste and smell. Note: that is not because the dispute cannot be resolved but because there is nothing to resolve (there is no dispute here). These are two reports about mental states—mine and his—not a claim about something in the world. The fact that I am in mental state A does not contradict the assertion that so-and-so is in mental state B. Each person has his own psychological makeup. Even if we look at just one assertion, say mine, it cannot be judged in terms of truth or falsehood. I am not asserting anything about the world but reporting a state within me. Hence I do not disagree with someone who feels differently. Likewise, when I say that I am afraid and someone else says that he is not—there is no disagreement. Again, these are reports about inner psychological states in different people. Of course, if someone claims that I am not afraid (that is, his claim is about me and not about himself), then we do have a dispute—but that is not my topic here.
Now consider Reuven, who has invested great effort over years in solving a very difficult scientific or mathematical problem. At long last he reaches an answer, X, and holds a thanksgiving party for friends in that scientific community. At the party Reuven presents the problem to his colleagues, and suddenly one of them, Shimon, stops him and says: “Oh, obviously, the solution is X.” Reuven is stunned, for he labored over this problem for years with his best talents and energy, and Shimon tosses out the answer offhand a moment after hearing the problem. Reuven, with trepidation, asks Shimon how he managed to solve it so quickly, and he replies: “When I saw the problem, I had a strong feeling that this is the solution.”
Is the term “feeling/emotion” appropriate here? I don’t think so. We saw above that an emotion denotes a report about a mental state, and therefore there is no point arguing about it. Is what Shimon proposed an emotion in that sense? His solution is a claim about the problem before us, and as such it can be true or false. Now think about someone else, say Levi, who says “the solution is Y” (because he has such a feeling about Y, just as Shimon has one about X). Clearly Levi disagrees with Reuven and Shimon (who hold that the solution is X). That is, the assertion that the solution to the problem is X or Y is a claim, and anyone who asserts otherwise is in genuine dispute with us. In that sense, we are not dealing with an emotion here.
Why, then, are we inclined to treat this as an emotion? Because the answer here is not argued for. Reuven’s assertion is not considered an emotion but the result of reasoning. Shimon, however, reached the same result without calculation but only on the basis of a “feeling,” and therefore in his case it seems to us like an emotion rather than reasoning. The same goes for Levi, who reached a different result. Yet the term “emotion” in this context is not apt. They are not presenting a calculation, but if we examine their solutions we can determine which of them was right and which was wrong. In other words, this is a claim and not a report about a mental state. In affective psychological states, emotion is merely a report about that state and nothing more. But in solutions to scientific or mathematical problems, the mental state produced an insight, and that insight is a claim subject to truth or falsehood.
Thus, none of our three friends is dealing with emotions. They are all making claims. The difference between Reuven and Shimon on the one hand and Levi on the other is a disagreement about a claim—indeed a factual dispute (whether X is correct or Y). The difference between Reuven and Shimon is not a disagreement at all, since both assert the same claim (X). Here the difference is only in the way they arrive at the claim. Reuven reached it through discursive, conscious, rational thinking, while Shimon reached it by intuition. Intuition is not an emotion but a cognitive–intellectual faculty. As noted, using the term “emotion” in such a context is very confusing. It is better to distinguish between them: affective states I call “emotions,” and direct cognitions (those not reached by step-by-step reasoning) I propose to call “intuitions.” In this view, we can say that Reuven and Shimon are running the race of Achilles and the tortoise. Reuven is the tortoise, who solves the problem step by step in a systematic way with explicit and detailed calculation. Shimon, by contrast, is Achilles, who covers the distance quickly and gets straight to the final result without passing through all the intermediate steps. It is entirely possible that in Shimon’s brain all of Reuven’s steps were performed, but rapidly and without awareness (below I will mention Daniel Kahneman’s System 1 in this context).
Another example can be found on my site, in column 471, where I discussed the Talmud’s permission for Zimri to kill Pinhas under the rubric of “pursuer” (rodef). That is highly counterintuitive, since Zimri was the offender and Pinhas sought to prevent the offense and even received God’s blessing for what he did. And yet, the Talmud says that from Zimri’s perspective Pinhas is a pursuer and he is entitled to kill him in self-defense. In a comment there Avi cited a distinction by the Rosh on the spot, who writes that this permission was given only to Zimri himself and to no one else:
But another person would be executed for killing him, for he [Pinhas] is not a full-fledged pursuer since he acts with authorization. And specifically Zimri is permitted to save himself at the cost of Pinhas’s life, but not any other person. For every other person is given license to kill Zimri; therefore he [Zimri] has no license to save himself by taking Pinhas’s life.
The rationale for the Rosh’s distinction is not so simple (see my brief response there), but my intuition clearly says that the distinction is correct. After that one can try to formulate a systematic, orderly rationale, but the intuition gave us the answer immediately. The fact that we later found a rationale only shows that this intuition is not merely a subjective emotion detached from truth and reasoning. It is unconscious reasoning that arrives at a “correct” conclusion—that is, at a claim and not an emotion—even though it skips the discursive procedures of our ordinary thinking.
The Relationship Between Emotions and Facts
The conclusion is that our intuition is a rational or cognitive faculty, not merely a given psychological structure. There is no reason to take seriously conclusions that arise from my psychological makeup. As Mark Twain said, “The world owes you nothing; it was here first.” The fact that you are built a certain way says nothing about the world, and therefore whatever follows from your mental makeup cannot be a factual claim about the world. A psychological structure produces emotions (like love or fear), but intuition is part of our intellect that apprehends the world.
True, when I love Rachel, this is of course related to various of her characteristics that arouse in me the emotion of love. But someone else will feel love toward Leah and not toward Rachel, because his psychological makeup is stirred by different features than those that stir me. So yes, there is a connection between emotions and facts, but the existence of the emotion as such yields no factual claim and says nothing about the world.
The example of fear is subtler and more complex. Fear can arise from facts. For example, when I encounter a predator, fear arises in me. That fear results from the information that such an encounter may end badly for me. But this is not a conclusion from the existence of the emotion of fear itself, but from the information that arouses the fear. If I conclude that the emotion of fear expresses a factual insight (acquired from experience or some other source), namely that a predator is something one should beware of, then fear can be seen as a source for factual claims. But again, the existence of fear as such does not say that. What matters is the information that the fear reflects. Consider a person with a damaged brain who does not feel fear; he can still understand that a lion is dangerous and draw the conclusions and run away. In him, that information does not manifest as fear, but what matters on the factual level is the information, not its affective form. As noted, what goes on inside me due to my mental makeup cannot serve as a basis for a factual claim about the world.
In other words, when I encounter a lion I have an intuition that there is a serious danger to my well-being. My psychological makeup is such that the information conveyed by the intuition generates within me the mental state of fear, and then I flee. But the fear is only a mediator and not the source of that information, just as a sensation of pain says nothing on its own. In a normal person, if he feels pain there is probably a physiological source, and therefore he would be wise to check and treat it. But the pain is only an indicator that mediates the physiological damage to my psyche. I do not go to be examined because of the pain but because of the information the pain expresses. Even a person who lacks pain sensation, upon discovering the same medical problem, ought to go to a doctor.
For my part, even when I respond to information conveyed to me through the mediation of emotions, I am not responding to the emotions and I am not basing myself on them. My response rests on the assumption that an emotion of this kind reflects some factual state, and that state—not the emotion—is my motive for action and decision. I once wrote that the feeling of love for a woman is not a reason to marry her. It is one datum to be weighed when forming a decision whether to marry or not. That decision should be made with the intellect, not with emotion.
What Is Intuition: Characteristics and Sources
In Wikipedia, s.v. “Intuition,” we find the following definition:
Intuition is rapid inference based on a paucity of data by means of past experience and past inferences.
The ability to receive and decipher intra-personal messages (to attain knowledge) without deliberative thinking. What characterizes intuitive knowledge is the fact that it is generated very quickly (in an instant) and appears suddenly (as inspiration or a flash). Nevertheless, the intuitive solution is not always created immediately.
A person who holds an intuitive insight cannot fully explain why he adopts that position. In this, intuition differs from opinion; to the contrary, often intuitive knowledge contradicts a person’s ordinary logic and challenges him to decipher it. In this respect it also differs from instinct, which is an innate inner tendency based on prior experiences and whose purpose is survival.
There are two points here: the definition of intuition and its sources. Regarding its definition, intuition operates unconsciously and not on the basis of arguments and systematic reasoning. That is exactly what we saw above in Shimon’s solution of the problem.
The Israeli psychologist Daniel Kahneman, Nobel laureate in economics, explained in his book Thinking, Fast and Slow that there are two systems of thought within us: System 1 is fast, instinctive, and emotional, and System 2 is conscious, rational, and slower. In most of our functions, System 1 works better and more efficiently, though in complex problems—and in particular when we have cognitive biases in the matter at hand—we will need System 2. Some people have a more developed System 1 and can use it even to solve scientific and mathematical problems (like Shimon above). In the domain of morality, too, this system is widely used. In many cases our moral stance toward a given situation is not the result of reasoned thinking but of a feeling—what we call “conscience.” Here, too, my claim is that this is not an emotion but an intuition. If it were an emotion, that ethical judgment would have no validity (as we saw regarding emotions in general). We can link Kahneman’s System 1 to intuition and System 2 to conscious reasoning. Below I will argue that this linkage is not exact and not complete.
So much for the nature of intuition. But, as noted, this Wikipedia passage makes another claim about the source of intuition. It contends that this ability is the product of experience and past inferences (or prior experiences). Later in the entry we find the view of psychoanalyst and psychiatrist Carl Jung, who said something similar:
Intuition is an immediate approach to the experience of the inner unconscious of each person. According to him, an action driven by intuition or a gut feeling is essentially the opening of one’s mind to inner feelings and thoughts, balanced against the information apprehended by the senses.
I think this thesis represents the common conception that intuition is nothing but accumulated experience stored in our subconscious over the years, which manages to break out from time to time and guide us in decisions and interpretations of various situations. It can be likened to training a neural network (an artificial brain), into which data are fed, each item re-organizing the network so that it internalizes it. After a while, the network can solve new problems in light of its accumulated “experience.” This means that intuition is nothing but the product of a collection of cognitions that act upon our brain and shape it in accordance with what emerges from them. These cognitions sometimes reach our conscious thought, but they also enter our subconscious and accumulate there. They also emerge from there in an uncontrolled way; in fact the products of this experience burst forth when needed and help us solve new problems based on that accumulated, unconscious experience. This is Kahneman’s System 1, and according to this claim its source is experience.
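The training picture described above can be illustrated with a deliberately tiny sketch: a “network” with a single weight, adjusted by gradient descent on example pairs, which afterwards handles an input it never saw. The data, learning rate, and iteration count are my own illustrative choices, not anything from the column.

```python
# A minimal sketch of "learning from experience": a one-parameter model
# trained by gradient descent on example pairs, then applied to a new input.
# Data, learning rate, and epoch count are illustrative assumptions.

# Training examples: inputs whose outputs happen to follow y = 3x.
examples = [(1.0, 3.0), (2.0, 6.0), (4.0, 12.0)]

w = 0.0              # the single "synaptic weight"; starts uninformed
learning_rate = 0.01

for _ in range(2000):                   # repeated exposure to the data
    for x, y in examples:
        error = w * x - y               # prediction error on this example
        w -= learning_rate * error * x  # nudge the weight to shrink the error

# After "accumulating experience," the model handles a case it never saw.
print(round(w, 3))       # → 3.0  (the internalized law)
print(round(w * 10, 2))  # → 30.0 (prediction for a new input)
```

The point of the sketch is only that nothing here is conscious deliberation: the weight is reshaped by each datum, and the “generalization” emerges from the accumulated adjustments, much as the text describes for System 1.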
Most people I have spoken with do not even consider the possibility of another way to understand intuition. For them it is clear that we are dealing with the accumulation of experience through our cognitive faculties. In their view, the only difference between it and ordinary reasoning is the absence of awareness. I now wish to argue that experience cannot be the sole source of intuition. There must be something more there.
Intuition Cannot Be Only Accumulated Experience
In column 426 I discussed Occam’s razor, which at least today serves as a generic name for preferring a simpler explanation over a less simple one. If there are two theories that both explain all the facts, we choose the simpler. But the fact that it is simpler does not mean it is the correct one—particularly since simplicity is debatable (what is simple to me may not be simple to you, and certainly not to a being structured entirely differently from humans). Hence many construe the principle as methodological–psychological, not as a tool for arriving at truth. If there are two theories that explain all the facts and one is simpler, why not use it? The choice is made on methodological grounds, not because the simpler theory is more likely true.
In that column I presented an argument showing that the razor is a tool for uncovering truth, not just a methodological principle. The argument was based on the fact that any set of experimental results can fit countless generalizations. Thus, in the graph below we see results of several measurements regarding the relation between the force acting on a body and the acceleration it develops (Newton’s second law of mechanics):
We obtain five results in experiments (the points marked with hollow circles). A scientist will try to draw a line through these results in order to obtain the general law (the law that holds for any force and any acceleration). Philosopher of science Carl Hempel called this process the “deductive–nomological schema.” He showed that explaining phenomena we measured by means of a theory is essentially the search for a general law of which the cases we measured are particular instances (derivable from it by deduction). In the case described there, the natural line is a straight line (the solid line drawn). But as you can see, there are many other ways to draw a line through the five measured points (one such is indicated by the dashed line), and therefore from those five measurements we can reach innumerable generalizations, that is, innumerable general laws. The straight line is indeed the simplest, but, as stated, that does not necessarily mean it is also the most correct. Similarly, in any formulation of a scientific theory from empirical findings, one can reach innumerable general theories of which the cases we measured would be particular instances.
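The underdetermination claim—that the same finite measurements admit innumerable generalizations—can be made concrete with a small sketch. The five points below are an illustrative stand-in for the measured data (I have placed them on the line y = x for simplicity); the “wiggly” alternative adds a product term that vanishes at every measured point but nowhere else, playing the role of the dashed line.

```python
# Two different "laws" that agree perfectly on the same five measurements.
# The points are an illustrative stand-in for the measured data in the graph.
points = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]

def straight(x):
    """The 'simple' generalization: the straight line y = x."""
    return x

def wiggly(x):
    """An alternative law: adds a term that is zero at every
    measured point but nonzero everywhere else."""
    extra = 1
    for xi, _ in points:
        extra *= (x - xi)
    return x + extra

# Both laws reproduce every measurement exactly...
assert all(straight(x) == y and wiggly(x) == y for x, y in points)

# ...but they disagree sharply about the *next* experiment:
print(straight(6), wiggly(6))  # → 6 126
```

Since one can insert such a vanishing term of any shape, there really are innumerable curves through the same five points, and the data alone cannot decide among them; that is exactly the actualist’s starting point.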
According to this approach (which I called, following Zeev Bechler, “actualism”), the conclusion that the general law is a straight line (F = m·a), or that the simplest theory is the “correct” one, is construed as a tentative conclusion. As long as we have not found a contradiction to this being the correct line (or correct theory), we will use it because it is the simplest—but we are not asserting that it is indeed the correct one. Note: the actualist is not merely claiming that we cannot have certainty about this line. No one disputes that. A scientific theory always subjects its predictions to tests of refutation. The actualist’s claim is that there is no reason at all to suppose that this is the correct line—or, say, that it is more correct than the dashed line. In his view, we choose this line because it is simpler. This is our decision for reasons of convenience, not a claim about the world. That’s all. On this view, our intuitive feeling that the straight line is the correct line (or more generally that the simpler theory is also more correct) is but a product of the way our subjective intellect works and has nothing to do with what happens in the world itself. It is an expression of the notion that intuition is an emotion. In this sense, one can say that the theory is merely a report about a mental state (in the scientist’s mind) and not a claim about the world.
But as I showed there, this approach does not withstand the test of facts and logic. We saw that for the actualist the theory is merely a convenient way of arranging known facts, not a claim about the world. On this view, the probability that in the next experiment (say, when we measure point 6 or 7 in the graph) we will get the result that lies on the straight line is zero (for each line yields a different result, and the actualist assumes that as far as the world is concerned all lines are of equal status). Is that really what happens in our scientific experiments? When we test a scientific theory formed by generalizing measured cases, do its predictions always fail in the next experiment? That is what would follow if the actualist were right. For him every theory is a shot in the dark, an arbitrary choice among countless possibilities, having nothing to do with the factual truth about the world itself.
Yet if the actualist were truly right, we would today be at the scientific knowledge level of Adam. Every experiment would fail and force us to replace the theory. We would be swapping theory for theory after each experiment, chasing our own tails throughout the history of scientific research. There would be no progress in our understanding of the world. But that is not what actually happens. In fact, quite a few experiments do succeed (of course some fail, but not all; for our purposes that suffices). This means that our intuition—our ability to “guess” the correct line, or what we regard as the “simplest”—is not merely an emotion but some cognitive instrument. This instrument leads us to results that make claims about the world, and therefore they are not subjective emotions (which, as we saw, make no claim about the world). It enables us to advance in the scientific understanding of the world—that is, it is a device for arriving (in many cases) at correct theories about the world itself.
The conclusion so far is that intuition is not an emotion but a tool that reveals factual truth. Moreover, it is clear that intuition is not pure reasoning. If it were pure reasoning, then the path from the data we measured to the theory could not lead us to the correct conclusion. The data themselves can be generalized in many ways, and reasoning without observation cannot sort out the correct generalization. Reasoning can tell us which generalization is the simplest (in the scientist’s eyes), but the decision that the simple is also correct is a mere baseless conjecture (essentially a subjective emotion). Therefore it is clear that the basis of scientific generalizations must also include an element of cognition beyond reasoning. From viewing the five results we measured, we “see” that the straight line is the correct law.
I elaborated on this in my series on Intellectual Optimism and Pessimism (494 – 496). There I explained that the route to synthetic-a priori claims must contain a cognitive component and not only processes of reasoning. Kant’s problem of the synthetic a priori arises only because of the tacit assumption of a sharp distinction between reasoning and cognition (and therefore Kant, who would not relinquish that assumption, could not solve it), and its solution necessarily involves relinquishing that dichotomy. Kant’s problem compels us to conclude that we have thinking cognition or cognizing thought (see also the previous column). The route from measurements to general laws (the generalization) is not a purely intellectual process. It includes an observational–cognitive element, even if not mediated by the senses. We have a sixth sense—namely, a department of the intellect that engages in intellectual “seeing” of the world (what Husserl called “eidetic vision,” Maimonides “the eyes of the intellect,” and Rav Ha-Nazir “auditory logic”; all these expressions combine an element of reasoning with an element of cognition, meaning that our intellect has a function that integrates both). Husserl describes it as though the specific data are “transparent,” and we “see” through them (with the eyes of the intellect, eidetic vision) the general law.
Thus, intuition is neither emotion nor (mere) reasoning but a cognitive instrument. We can still ask whether this instrument is only the product of accumulated experience, or whether there is something beyond that. If you think a moment longer you will realize that from the same argument I presented here (from the graph) we can also see that it is not merely accumulated experience. To understand this we must note that accumulated experience itself is nothing but a generalization based on specific cases we observed. We must understand that learning from experience always involves generalization. The next case I encounter will never be the very case I already observed. Even if it were the same situation, I would still need the assumption that it will behave the same way as before (that is, the assumption that there is a fixed law is our assumption, not something arising from observations). But as we saw, any set of cases can be construed—that is, generalized to a general law—in innumerable different ways. How, then, can one learn from experience at all? How can one choose the correct generalization on the basis of the facts observed?
The inevitable conclusion is that intuition cannot be only a process of reasoning. It contains a cognitive dimension. Intuition is also based on direct cognition. My way of realizing that the straight line is the correct general law is neither a subjective emotion nor merely the result of accumulated experience. The “simplicity” of the straight line expresses that I “see” (with the eyes of the intellect) that it is the more correct line. We have the ability to look at the measurements we made, to see “through” them the general law, and to understand that it is precisely the straight line and not any other line that strings all these observations together (like the dashed line in the drawing). This is the only way to explain how the choices we make (what seems to us the simplest way) are not arbitrary and in many cases their predictions match the future observations we will make.
Interim Summary: Four Different Conceptions of Intuition
We rejected the possibility that intuition is emotion. We then saw that it also cannot be pure reasoning (that is, intellectual processing of specific data), but must include an observational component. Yet even that is not the end of the road. We saw that intuition is not only the product of prior observations (accumulated experience), but contains a component of direct observation (with the “eyes of the intellect”). As if contemplation of the data transfers us directly to the general law that explains them (they are its particular instances).
Applications to Artificial Intelligence
Here a difficult question arises. If this is indeed based on an innate faculty (a kind of intellectual perception) and not only on accumulated experience, we would expect that an artificial intelligence system would not be able to learn from experience. But in fact such systems do learn from experience just as we do (and sometimes better). I noted above that the common way of “training” a neural network is to feed it data, which it internalizes and shapes itself accordingly, and after sufficient experience it can solve new problems based on that experience.
Do computers also possess the faculty of eidetic perception? That already sounds thoroughly mystical. Computers perform mechanical calculations, and if learning from experience—which is a collection of generalizations from the data we encountered—cannot be merely a process of reasoning, i.e., if it must contain an element of direct cognition of the general law through the data, that should not occur in a mechanical artificial system. It merely processes the data and produces a generalization from them. If an artificial system succeeds in learning from experience, it would seem to be evidence that we are dealing with data processing, i.e., a process of pure reasoning without an additional observational component (beyond collecting the data themselves).
We can sharpen the analogy to artificial reasoning by noting that the way an AI system makes decisions is very similar to what I described above regarding the graph. We are dealing with a collection of methods called regression, used to derive a relation between an independent variable and a dependent variable from specific data in hand (from measurement or another source). In a typical first lesson in a machine-learning course, students learn the (non-statistical) regression technique familiar to every scientist as the “least squares” method.
Here is an image from Wikipedia depicting measurements of the relation between age and height in children:
How do we draw the solid line that strings these points together? You can see that the situation is less clear than in the graph I brought above; it’s not certain that we’re dealing with a straight line, though that looks plausible. But even assuming the line is straight, its slope is not obvious (there is some wiggle room). In addition, if you want to guess how this continues at later ages, it would be very risky to use this graph for prediction. It depicts the growth period, but at some stage the child stops growing and the graph levels off (asymptotically to final height, and perhaps even declines at more advanced ages). Moreover, even in the growth period there is no reason to assume that the growth rate is constant (i.e., that the graph is linear).
What we do in such a case is assume, based on the overall impression, that it is a straight line—at least up to the relevant age—and then look for the line that best fits the results (i.e., the values of the two parameters of that straight line). The common criterion for “best fit” is “least squares,” that is, we seek the line that minimizes the sum of squared vertical distances of the points from it, and assume that this is the correct line. Of course, it may be that the correct line is not straight at all. It may be some wildly different curve that strings the points in a completely different way (like the dashed line in the earlier graph). But the researcher assumes some form of line (according to a visual impression of the results), and then tells the AI software to find the straight line that best fits these results—just as a scientist does in the Newton’s second law example.
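The least-squares procedure just described has a simple closed form for a straight line, which can be sketched as follows. The (age, height) pairs are made up for illustration; they are not the Wikipedia data mentioned in the text.

```python
# A sketch of simple least-squares fitting: find the straight line
# height = slope * age + intercept that minimizes the sum of squared
# vertical distances. The (age, height) pairs are invented for illustration.
data = [(2, 87), (4, 102), (6, 116), (8, 128), (10, 139)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Closed-form least-squares estimates for the two parameters of the line
slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
         / sum((x - mean_x) ** 2 for x, _ in data))
intercept = mean_y - slope * mean_x

print(round(slope, 2), round(intercept, 2))  # → 6.5 75.4
```

Note that the formula only answers “which straight line fits best?”; the decision that the curve is a straight line at all was supplied beforehand, which is precisely the point the text goes on to make.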
That is, the AI software assumes in advance the shape of the correct line, and only then can it search for the exact line that best fits the results. Even the criterion by which we determine the fit is not uniquely given; “least squares” is only one of several options. And even if we assume a different type of line (not straight), once we adopt some fitting criterion we can still find the best-fitting curve of that type. If several possibilities remain, we choose the simplest line (with the straight line, in our eyes, being the simplest, of course).
We can extend this to more complex datasets, where the shape of the correct curve is not at all clear, and there the researcher must decide on some curve shape and only then seek a fit (the parameters of the best-fitting curve). There are also points that deviate significantly from the general trend, which we may assume are errors and discard (not take into account). But all these are intuitive decisions by the researcher, and they are given as inputs to the AI software. I note that nowadays there are programs that can suggest the curve shape on their own, but they too are fed with principles dictated by the human programmer/researcher.
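The outlier-discarding step mentioned above can also be sketched: fit a line, flag points that deviate strongly from the trend, and refit without them. The data, and the residual threshold of 3 units, are my own illustrative assumptions; real pipelines use more principled criteria, but the structure is the same.

```python
# A sketch of discarding deviant points: fit, flag large residuals, refit.
# The data and the threshold of 3 units are illustrative assumptions.

def fit_line(data):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    slope = (sum((x - mx) * (y - my) for x, y in data)
             / sum((x - mx) ** 2 for x, _ in data))
    return slope, my - slope * mx

# Points near y = 2x, plus one measurement error at x = 3.
data = [(1, 2), (2, 4), (3, 20), (4, 8), (5, 10)]

a, b = fit_line(data)
# Keep only points whose residual from the first fit is small.
kept = [(x, y) for x, y in data if abs(y - (a * x + b)) < 3]

a2, b2 = fit_line(kept)
print(kept)                        # the outlier (3, 20) is gone
print(round(a2, 2), round(b2, 2))  # → 2.0 0.0
```

Here again the human choices—the line shape, the fitting criterion, the threshold for “error”—are inputs to the procedure, not outputs of it.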
In column 592 I discussed cognizing thought as a human faculty and wondered whether it exists in AI or not. As I understand it, it exists in the programmer, not in the software, and therefore it is a mistake to view the software as a substitute for human thought. A person built a tool based on his insights, and it is no wonder that the tool succeeds in human tasks. That does not mean that a being who is a blank slate relying only on experience would succeed. The computer is not a blank slate and it does not rely only on experience (i.e., only on the cases it is “trained” on). It is imbued with insights whose source is in us, just as they are imbued in us.
Intuition from the Scientific–Empirical Angle
The conclusion is that there is no such thing as pure learning from experience without presuppositions and a conceptual framework. Even what appears to us as unadulterated experience—what the typical empiricist would happily rely on—contains within it a priori principles drawn from our intuition, whether or not we are aware of them. So too with the empirical character of modern science. I have often pointed out that many people live under the illusion that science is based only on observations and does not assume anything beyond what is based on direct observation. That is a misunderstanding. At the base of science lie innumerable assumptions that have no empirical source (the principle of causality, induction, the rejection of action at a distance, and much more). That is, both the human learner and the “learning” AI are not nourished only by specific facts. They do not come to research as a blank slate, but laden with assumptions and modes of thought—without which it is impossible to learn and progress. That is also what we saw in the graph example above.
This of course raises the question: what justifies the trust we place in these assumptions? Note, the question is not whether we hold them to be certain. We do not. The question is: why take them seriously at all? Are they not merely assumptions grounded in the structure of our thinking, to which the world owes nothing? The possibility of trusting them derives from the assumption that we have some faculty capable of generating and granting us such insights, not from sensory observations. This faculty precedes observation and undergirds our ability to use it. But it is not an emotion and not merely a subjective matter, since the information based on it counts for us as reliable information, and we rely on it without hesitation. Thus, at least from our perspective it is clear that this is a cognitive faculty capable of saying things about the world even prior to observation, and indeed grounding our ability to learn from observations.
Intuition from the Philosophical Angle
At the base of our thinking and our accumulation of knowledge about the world lie two fundamental toolboxes: the scientific–empirical and the logical–philosophical. The former is grounded in observation; the latter, in reasoning. So far we have seen that at the base of our ability to learn from experience and observations lies an a priori faculty that I called intuition. We saw that it outlines the framework within which we learn from experience and interpret observational facts—i.e., it undergirds science.
But intuition also lies at the base of the second toolbox, the logical–philosophical. Every logical and philosophical argument is grounded in first principles. These premises themselves can perhaps be derived from still more fundamental premises, but at the end of the road there is a set of basic premises not derived from prior ones. These are our axioms or self-evident truths. Where do we get them? Experience cannot furnish us with all our first principles, for several reasons: first, we saw that experience itself is grounded in a priori assumptions. Second, experience yields particular facts (a particular object fell to earth; a particular person told me the correct time of day, etc.). From that one cannot logically derive any general insight about the world—and certainly not the laws of nature. To move from isolated observations to the laws of nature requires generalization, which is grounded in reasoning and a priori assumptions (and above we saw that these themselves cannot be merely the product of accumulated experience). Thus, at the base of the premises of our logical arguments lies intuition. Upon it rests our entire philosophical edifice, for if we have no trust in the first principles, what value is there in the structure we build upon them?! Intuition is a cognitive component that imports into us information about the world and about how to handle and reason about it. In my series on philosophy (155 – 160) I showed that the whole field called “philosophy” is nothing but the systematic collection of these insights/cognitions and an analysis of their import.
In addition, we saw above that sometimes intuition shortens the path of recursive reasoning (this is likely what happens in the example of Reuven and Shimon’s problem above). In such cases one could say that it contains no cognitive component but merely performs the computation/reasoning unconsciously. In principle it is possible to perform that computation consciously and slowly, step by step (like Reuven), but a gifted person (like Shimon) can do it quickly without awareness. It is the very same computation that Reuven performs but carried out inside Shimon.
Back to Kahneman
Kahneman’s System 1 deals with processes of reasoning that do not involve explicitly formulating the argument and the calculation, but arrive directly at the conclusion (as in the example of Reuven and Shimon). This itself can be construed in two ways: (1) Shimon looks at the equation or the problem and “sees” with the eyes of the intellect the solution. Here there is an element of (unconscious) cognition. If that is the process, there is no possibility of formulating a conscious, slow equivalent. (2) Shimon performs all the computational steps but unconsciously. In this account there is no cognitive component, yet in principle one can formulate a conscious, slow equivalent: simply describe the calculation step by step (this is exactly what Reuven did, and the assumption is that Shimon and he traversed the very same route—Shimon quickly and without awareness, Reuven consciously and slowly). In Kahneman’s terms we can say that Shimon has a highly developed System 1, whereas here Reuven uses his System 2.
When intuition yields our first principles or the methodological framework within which science operates, there it is clear we are not dealing with a shortcut of reasoning. Arriving at first principles or at the framework of scientific thought are processes that have no conscious counterpart (recursive or not). This is intuition as a process of cognition, not a shortcut of reasoning. Therefore I wrote above that identifying Kahneman’s System 1 with intuition is not precise. In his System 1, Kahneman treats intuition as unconscious computation—yet in principle it has a conscious counterpart. If we proceed slowly and consciously, like the tortoise, we can formulate the calculation and perform it step by step and reach Shimon’s result (as Reuven did). But the route to first principles is not unconscious reasoning; it is a non-sensory perception of the world. As I explained, there cannot be a conscious or sensory counterpart to that. One who lacks it will never get there. Here only Achilles (Shimon) will manage to solve the problem; the tortoise (Reuven) lacks the tools to do so.
Intuition and Faith
This brings me to the relation between faith and intuition. We saw that the base of our first principles and our conceptual framework is built by intuition. This is a non-sensory perception of the world that yields claims about it (without using the senses or philosophical proofs). If so, science (sensory observations and learning from them) and philosophy are built upon intuition, not vice versa. Therefore the doubt “what is the validity of intuition?” reflects a misunderstanding. It expects an explanation that would grant intuition validity by force of a logical argument or on the basis of an observation or scientific finding. But the logical order is the reverse: they are built on intuition, not it on them—“Your guarantor needs a guarantor.” On the face of it, relying on it seems like an irrational and illogical tool. But once you understand what I have described here, you see that the picture is exactly the opposite. This is the foundation of all rational and scientific thinking. They rely on it, not it on them. It is the most basic foundation of rationality, and therefore its own rationality is not in doubt.
In this sense one can say that “intuition” is a synonym for “faith.” Faith is a claim we accept without having an observational basis for it. And even if we have a logical–philosophical argument, that argument itself is grounded in first principles that themselves lack such a basis. Hence people commonly say that faith is an irrational claim, perhaps an emotion. The punctilious add that where philosophy ends, faith begins. We are dealing with insights we reach without observation and without foundation. But in light of what I have described so far, the truth seems the reverse. Faith is not an emotion; otherwise the believer would not differ from the atheist except in psychological makeup. They would have no substantive dispute. Yet faith is also not based on observation or philosophy; rather, it is their foundation. Seemingly it is unfounded and irrational—but in truth it is the essence of rationality, which is entirely grounded in intuition, i.e., in faith. A principled rejection of intuition is a renunciation of rationality altogether (and of science as well).
On this view, at the foundation of all scientific or other knowledge lie beliefs, that is, insights grounded in intuition. Faith is the foundation of rationality and logic, not a substitute for them. I am speaking here of faith as a faculty (that is, the tool I have called “intuition”), not of particular beliefs—whether in God or anything else. I have confidence in this tool and in the insights I derive by means of it—about the world, about morality, or any other field. Belief in God is just a special case of these beliefs. It is a product of our intuitions, and the logical arguments that lead to it only expose that it resides in the first principles on which those arguments are built (see my essay here).
This brings me to a question Yonatan raised in the Q&A a few days ago:
What do you think of the following statement (by Rav Kook):
“Faith is neither intellect nor emotion, but the most fundamental self-disclosure of the essence of the soul, which must guide it in its disposition.”
To which I replied with my characteristic delicacy:
If only I understood what that means. I suspect that no one else (including the speaker) truly understands it either. In the next column I will sharpen a point that may relate to this saying.
So here is the column, and here is my response to the shock Yonatan expressed there at my apparent disparagement of Rav Kook. I am not disparaging him, but in my judgment his writings are full of vague statements built on undefined concepts, and therefore also full of claims that sound profound only because we have not bothered to define the terms. Once we define them, we sometimes discover that we are dealing with imprecise statements, if they have any meaning at all. I have written more than once that there is nothing outside our thinking and intellect. These are the tools we have, and whatever lies beyond them is nonsense, or at least asserts nothing. Therefore faith, too, cannot be beyond the intellect (and, as noted, it is certainly not an emotion). That is why I wrote there that when such a thing is said, one must first define what "intellect" is and what "emotion" is, and only then can one say something about the relation of these two to faith. As we have seen here, faith is the product of the faculty called intuition, and that faculty undergirds all rational and scientific thought; therefore it cannot be something different from them.
For me, where intuition (faith) ends, philosophy and science begin—not the other way around. Whoever wishes to claim otherwise should kindly explain where our intuitive insights come from, which are neither emotion nor observation (not even accumulated past observations embedded unconsciously). Is there some faculty other than what is called faith? How do they differ? Why assume at all that they differ? They sound and look the same, so what is the basis for and benefit of the claim that we are dealing with two different faculties? My sense is that the motivation for such outlandish statements is the desire to aggrandize faith. But as Maimonides writes about those who think that giving no reason for the commandments magnifies God, it is exactly the reverse: to say that faith is above the intellect is to say that we are dealing with nonsense. That is, such a person testifies about himself that he does not truly believe in God and His existence, but has certain experiences. If it is not intellect then it is emotion, for there is no third option. Emotions—or what is “above the intellect”—have nothing to do with faith.
I do not intend to claim that Rav Kook or others who say such things are not believers. Clearly he was a great believer. My claim is that they sometimes use undefined and confused concepts. In fact they believe in God because their intellect (and their intuition) tells them that God exists, but they do not understand this (because we are dealing with unargued statements, and relying on them calls for justification), and therefore they describe it as something that is emotion or that is above the intellect, and the like. As noted, these are unnecessary and incorrect statements, and they do not magnify God or faith in Him in any way. The opposite is true.
So far I have only read the chapter “Applications to Artificial Intelligence”.
“The programs are fed by principles dictated to them by the human programmer/researcher”. What are these principles and where are they detailed?
A. It is known that a neural network is a universal approximator, meaning that for every continuous function there is a network that approximates it as closely as we wish (even if it is not true that a single architecture can approximate every continuous function). If a researcher takes observational data, splits it into a training set and a test set, picks some architecture at random, trains it, and manages to improve the error on the test set, where is the dictated principle here? That the function in nature is close to continuous? If specifying an architecture is much easier than specifying the continuous function itself (and as far as I know it is), then apparently the architecture plus training has power of its own, without any dictation.
B. Why is the error function essential here? The global minimum of all the different error functions is the same point, because they are all supposed to attain their minimum at the correct result (and only at it). For example, if we predict a numerical value, it does not matter whether we take the absolute value of the difference, the square of the difference, or the logarithm of the absolute value of the ratio, etc. It seems the differences between error functions concern only how best to distribute the errors (for example, is it better to have two errors of size 2 or one error of size 5). So if the measurements are completely accurate and an exact approximating function can be found, training with any of these error functions should in principle bring us to the same approximating function. If this is true, then apparently even the error function embeds no human insight about the essential content of the prediction, only insight about how to overcome measurement errors, and perhaps about which error function creates a smoother search space that is easier to converge in.
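The commenter's claim in (B) can be illustrated with a small sketch (my own, not from the discussion): on exact, noise-free data, several different error functions are all minimized at the same parameter value, so the choice among them encodes only how to weigh errors, not what the correct answer is.

```python
import numpy as np

# Exact, noise-free data generated by y = 3x.
x = np.linspace(1.0, 5.0, 9)
y = 3.0 * x

# Candidate slopes for the model y = a*x, and three different error functions.
slopes = np.linspace(0.0, 6.0, 601)

def total_error(a, loss):
    # Sum the chosen per-point loss over the whole data set.
    return sum(loss(abs(a * xi - yi)) for xi, yi in zip(x, y))

losses = {
    "absolute": lambda e: e,
    "squared":  lambda e: e ** 2,
    "log":      lambda e: float(np.log1p(e)),  # log(1 + |error|)
}

# On exact data, every one of these losses is minimized at the true slope a = 3.
best = {name: float(slopes[np.argmin([total_error(a, f) for a in slopes])])
        for name, f in losses.items()}
print(best)
```

With noisy measurements the three losses would start to disagree, which is exactly the commenter's point: the loss choice governs how errors are traded off, not where the error-free optimum lies.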
C. Even if human insights are indeed embedded, they are not specific to the particular problem but general. Suppose someone predicts the growth rate of fruit as a function of the sun's intensity (why as a function of the sun specifically: that is intuition). He can be a simple researcher who takes a lot of data, picks some basic architecture, and experiments with various hyperparameters (not too many) without ever seeing the sun. Even in such a naive attempt, training will improve the error on the test set to some significant extent. So the insights are only very general ones, like "find a simple function" or "find a differentiable function" and so on, nothing that involves observing the specific problem of fruit growth rates. Do you really think so: that intuition revealed to us the principle that in nature it is worth looking for the simplest functions, defined for us in general terms what simplicity is, and has been on vacation ever since?
It's hard for me to go into all the details (some of which I'm not familiar with). But as far as I understand, your words don't challenge my claims.
In general, my argument is that a neural network that could take all the data in the universe and build the laws of nature from them, that is, make a single, correct generalization based solely on the data without outside help, must be built in a very particular way. That structure is supplied by the programmer, and through it he in fact introduces additional principles (his own a priori information) into the computation. A random system learning from experience alone leads us nowhere, and certainly not to the laws of nature.
Of course, in supervised training you, the programmer, enter information and not just data (you tell the machine what is correct and what is not; a person learning from experience has no such feedback). But even in unsupervised learning, the algorithm belongs to the programmer, like introducing the assumption of a straight line in the simple learning example I described.
Even if some progress can be made from the bare data alone (and in my opinion it cannot), my point was that significant progress requires something beyond the bare data. Therefore any progress by a system empty of programmer-supplied information indicates nothing. This additional information is embedded in the structure of the learning system.
The error functions you speak of are themselves already a definition by the programmer. Any function that seems reasonable to him may indeed attain its minimum at the correct result (though in my opinion this is not necessarily so, if I understood you correctly). But I do not see how one can claim that every function does this. And does every algorithm lead to a correct result?
I will take an extreme example. Suppose I want to build a machine that will receive all the data in the universe and extract all the correct laws of nature. Do you think it is possible to simply produce a collection of screws that will do this? You understand that it has to be a very specific structure, or one of several very specific structures. Whoever builds such a structure is essentially introducing his own (a priori) information into the machine. The same goes for us. The structure of our brain contains information that is not the product of experience alone. Therefore, even if our brain can now make the right generalizations, it is still thanks to the a priori information that is within it (its structure).
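In the spirit of the "collection of screws" example, here is a tiny sketch (with invented numbers) of how the chosen structure itself carries a priori information: the same measurements, run through two different dictated structures, yield very different "laws" once we extrapolate beyond the observed range.

```python
import numpy as np

# Four made-up measurements, roughly linear.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.1, 4.9, 7.2])

# Two different structures dictated in advance by the researcher:
line  = np.polyfit(x, y, deg=1)  # "the law is a straight line"
cubic = np.polyfit(x, y, deg=3)  # "the law is a cubic" (fits the 4 points exactly)

# Both agree well with the observed data, but the dictated structure,
# not the data, decides what we predict outside the observed range:
pred_line  = float(np.polyval(line, 10.0))
pred_cubic = float(np.polyval(cubic, 10.0))
print(pred_line, pred_cubic)
```

The data alone cannot arbitrate between the two structures; whoever chose the degree of the polynomial already put his a priori information into the machine.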
I did not write anywhere that this information should be specific to the problem at hand. On the contrary, regarding humans I gave examples of general assumptions of science that we bring from home (the principle of causality, induction, negation of action at a distance, etc.). These are really not specific assumptions, but general ones. And yet this is information without which we cannot act and learn.
I'm just trying to identify which principles are embedded, and where. I see only one: that there is a differentiable function connecting the data to the prediction. If at bottom this is all we bring from home, apart from selecting the relevant data set, for every problem, then that is all intuition needed to know, and moreover it is not involved in specific scientific generalizations. Is that true?
I have no idea if that's true, but even if it is true, it doesn't matter. This function itself is the information I'm talking about. Another function wouldn't do the job. Beyond that, I'm talking about collecting all the data in the universe. When you select data relevant to the problem, that itself is information. And when you introduce the feedback (of supervised learning) that is also information.
I'll come back to that. The principle of causality is an important assumption in the accumulation of scientific knowledge and in determining our scientific laws. Where did we learn it from? It is part of the function built into us, but it is itself information, and it is what allows us to learn from experience. Our "programmer," or our ability to observe the world with the eyes of the mind, is responsible for this "information."
I did not understand the attack on Rav Kook. It is very likely that he meant exactly what you explained: faith is neither emotion nor reason in the conventional (logical) sense, but intuitive cognition. Now, must all objects and all realities be laid out and revealed to every person's intuition? Obviously not, and clearly not: in the intuitive realm too there are geniuses, ordinary wise men, and fools, and the latter can recognize (intuitively) only a few of the truths that the former can.
In fact, take your own example: Shimon grasped the solution intuitively; Reuven did not. Hence intuition is not the same in every person and at every time.
Now, assume (as faith maintains) that God created man and his faculties, including his intuition, and is responsible for their varying degrees of excellence, as well as for which truths will be known to a person and which will not. It is clear that the ability to recognize Him, which is the faith Rav Kook writes about here, was given to man very deliberately, for it is the basis of the religious existence of his life.
Now we need only one more assumption: that religious existence, that is, the service of God, is the most significant thing a person can produce from himself, and we have arrived at Rav Kook's words: "the most fundamental self-disclosure of the essence of the soul."
Intuition does not lead to direct recognition, but only to a tool or method for understanding. Therefore belief in God cannot itself be an intuition. Only after we have settled, through intuition, on a certain logic can God be proven with the help of reason. (Reuven and Shimon's intuition is built from experience, not from recognition.)
Why do you think intuition does not lead to direct recognition?
Just as there is intuition for basic premises (for example, that our perceptions indicate what really exists in the world; induction; and so on), so there can be intuition that recognizes the fact that there is a God, even prior to the logical proofs, which may of course also be correct.
Maybe this can contribute to the discussion:
I once said to the famous atheist Richard Dawkins during a radio talk show: "Richard, religion is music, and you lack a musical ear." He replied: "That's right, I lack a musical ear, but there is no music." (Rabbi Jonathan Sacks)
The entire article is here
https://www.hamikra.org/articles/hrb-lvrd-yvntn-zqs-zts-l-bmsr-lprsht-hzynv-tshp-g-mshh-hysh-bd-lnv-bl-shyrtv-vdnh-bpynv-mh-lynv-l-shvt-kdy-lhbtykh-shl-nbd-vth-sprv-lnv-zh-khshvb-lnv-ttsy-v-vnprsm-nkhnv-gm-rvtsym-lptvkh-mdvrym-nvspym-kn-bbqshh-ttsy-v
The rabbi wrote in the past about the Zohar that people have a spiritual intuition about it, and therefore this is a reason to accept it, even though it is not ancient and the connection to it is very vague.
But here the rabbi wrote that intuition is not relevant to matters of fact, and the Zohar establishes facts. Moreover, why does other people's intuition obligate me? Or, even short of that, why does it give me grounds to accept the Zohar?
I would appreciate a response, thank you!
I don't understand the question. Who said anyone is obligated to accept it, or that it has authority? My impression is that there are correct spiritual intuitions there, and therefore I have a sympathetic attitude toward it. Others who think differently will behave differently. Everything is fine.
I didn't understand how the example of Newton's second law, and scientific experiments in general, proves that intuition is not based only on experience. After all, it sounds very plausible that the more force we apply, the greater the acceleration; this is something one can feel and see in reality when throwing a ball, for example. So it is quite natural that the scientist would choose the straight line even without any cognitive tools beyond experience. Moreover, from experience he already knows that many laws and forces in science operate in such a direct relation, so it is likely that further laws will work this way too.
Thank you
I didn't understand whether you were arguing that science is also based on intuitions and not just on observation (experience), or that our faculty called intuition itself is not a product of accumulated experience.
I am addressing the claim that the faculty known as intuition is not a product of accumulated experience. I did not understand why this is true; perhaps it is a product of experience alone. As I wrote, I think it is likely that the scientist's tendency to draw a straight line between the points (in the Newton's-second-law example) comes from experience alone, in one of two ways. One possibility: the scientist knows from his own experience and feeling that exerting more force on an object increases its acceleration, quite plausibly in direct proportion, and he expects to see this in the law as well. A second possibility: the scientist sees in other natural laws already discovered that when experiments yield several points on a graph, the general law is likely to be the line running through them. So it is actually possible that intuition comes from experience alone, and that intuitions about matters where a person has no experience do not reflect reality, but only what the person feels to be true.
I think I've explained why. Experience cannot give us induction, which is the basis of learning from experience. David Hume already wrote this.
I understand, thank you.
I still don't really understand the justification for the reliability of spiritual intuitions, that is, those that are not empirically expressed.
Intuitions like those underlying science I understand, because they are consistent with reality time and again, and it is very plausible that we are really perceiving something real. By contrast, intuitions with no empirical expression, such as the intuition of free choice, are very likely imaginings, especially given how many beliefs and intuitions have circulated in the world that, once they could be tested, turned out to have nothing to do with reality.
You are wrong. These intuitions are not subject to empirical testing; the tests themselves are conducted within their framework.
There are many scientific theses that have been refuted, and many that have not. The same is true of intuitions. These are the tools we have, and they are the only means by which we think. One should, of course, apply as much critical control as possible.
I really don't understand the disdain. Rav Kook simply doesn't call what you call the "eyes of the mind," and hence their observations, "mind." I have never called it mind either. For me, mind is a machine that derives theorems from axioms (and further theorems from theorems, etc.), not the thing that generates the axioms themselves. I have also called mind the ability to distinguish between things that look the same but are actually different, that is, to separate similar things (as opposed to analogy, which compares different things), and, more generally, our critical sense. Criticism discriminates among the observations of the eyes of the mind. Rav Kook calls what you call the "eyes of the mind" faith, since these are in fact a fundamental observation of the soul (the personality). The mind, for him, is essentially a chisel used to cut or sculpt (driven by a hammer), working on the soul's observations (beliefs).
Faith in this context = faith in God, and perhaps also, say, faith in Judaism.
There is no problem with intuition that teaches a principle; there is a problem with intuition that teaches a conclusion.
Not clear. But it doesn't matter. Belief in God = seeing Him (in a sense), and likewise any belief in anything else.
There is a mix-up in the text: "It is certainly possible that all of Shimon's calculation steps were made in Reuven's mind." Reuven and Shimon should be swapped.
Thanks. Fixed.
A central claim in the column is that a neural network seeks a simple, good generalization, and that this preference for "simple" generalization is an insight humans acquired through (cognitive) intuition and built into the model. In my response above, I wondered where this principle is embedded in the network.
I came across an article that, to my understanding (I have only skimmed it so far), makes this exact claim. It is hard to put my finger on exactly what the article establishes, but I will briefly summarize the gist of what I understood as relevant (errors and omissions excepted).
https://arxiv.org/pdf/2503.02113
His claim is that the network seeks simple, but not too simple, generalizations (a soft inductive bias), as you always explain with Occam's razor. With that he explains three phenomena that are considered quite strange, and he also suggests where this preference for simplicity is embedded, as an intrinsic regularization.
This principle is implemented through the architecture and through the optimization algorithm (which is influenced by the surface of the error function).
(1) The architecture. A deep, narrow architecture facilitates hierarchical pattern recognition. A well-known empirical phenomenon is that early layers recognize simple components (a line or a circle in images, a noun in text) and later layers recognize more complex ones, such as a face or the subject of a sentence. Such a hierarchical approach leads to a search for components shared across examples, i.e., giving up some of each example's uniqueness, i.e., a preference for simplicity.
(2) The optimization algorithm. He claims that simpler solutions are also less sensitive to small changes in the model weights; that is, the region of a simple solution is characterized by a large, flat local-minimum neighborhood (looking only at the first-order derivative). Two things follow: such a solution occupies a relatively large volume in weight space, and once gradient training reaches it, it is hard to leave, so you stay there. In addition, components common to many examples affect the gradient more than unique ones, because they appear more often, so they are identified first and complexity/uniqueness is built up gradually. Hence another point: larger models contain more configurations of simple solutions, each occupying a relatively large flat neighborhood, so the proportion of simple solutions can grow.
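For comparison, the one place where a preference for simplicity is written in explicitly, rather than emerging from architecture or optimizer, is regularization. A minimal sketch (mine, not from the article): in an underdetermined problem infinitely many weight vectors fit the data exactly, and a small ridge (L2) penalty selects the minimum-norm, i.e. "simplest," one.

```python
import numpy as np

# Underdetermined problem: 2 observations, 3 unknown weights,
# so infinitely many weight vectors reproduce the data exactly.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 1.0])

# Ridge regression: minimize ||Xw - y||^2 + lam * ||w||^2.
# The penalty is an explicit, human-chosen preference for "small" (simple)
# solutions; the data alone does not supply it.
lam = 1e-6
w = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
print(np.round(w, 3))  # approaches the minimum-norm solution [1/3, 1/3, 2/3]
```

The article's claim, as summarized above, is that deep networks get a softer version of this same bias implicitly, from the architecture and the flat-minima geometry of the optimization.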
Many thanks.
I think one can see that we introduce this into the system a priori even without knowing the structure of such networks. In simple fitting we ourselves dictate the shape of the curve we seek (a straight line or a parabola), and then use least squares or another cost function to provide feedback. The curve's shape is our dictate, and it determines the solution. In an LLM, of course, it does not work that way. Still, it is clear that even in the vast collection of problems an LLM deals with, each problem has many possible solutions (infinitely many curves pass through any discrete set of points). So how does it arrive at the simple solution? This is not a mathematical result, since mathematically any solution is legitimate as long as it matches the data. We must be introducing it in some way.
Therefore, the conclusion is that we are the ones who introduce the pursuit of simplicity. It is not self-evident a priori. Now we need only ask how and when exactly we introduce it. I don't think the network's structure suffices for this: building a network with a certain architecture is not enough to determine what the solution will be. The fact that it divides the work between the layers is itself in need of explanation (if this is an expression of the pursuit of simplicity, why does it actually do so?).
I dealt with this in the series that has just ended (up to column 699). My argument is that training the network on human examples and texts absorbs our modes of thinking, that is, the pursuit of simplicity. In supervised training you give the system feedback, and this feedback expresses what you yourself see as the correct (i.e., simple) solution. In LLM training essentially the same thing happens, only through the texts themselves (written by humans).
You know the saying that whatever a veteran student will one day expound was already given to Moses at Sinai, and the dilemma of "like a blind man at a skylight" (hitting the truth by chance): "and does he not give a reason?" It is enjoyable to confirm a good a priori claim against reality (or against theory) as well.
You wrote: "Therefore, the conclusion is that we are the ones who introduce the pursuit of simplicity. It is not self-evident a priori." The word "not" seems out of place to me. But perhaps this is an edit (when I first read it, I think I read it without that word). So I did not understand.
How does the structure of the network encourage simplicity, that is, block some of the non-simple solutions and enlarge the space of simple ones? I understand that in a deep, narrow architecture (many layers, each relatively small), since each narrow layer's capacity is limited, it is harder for it to memorize the unique features of individual examples and easier to concentrate on what is common. I do remember that in an OpenAI article on scaling laws between GPT-2 and GPT-3, where they explained their considerations in choosing the new size of GPT-3 (increasing from 1.5 to 175 billion parameters), they claimed that the network's shape matters less than its overall size. Being so busy with the daily race, full of technical details (theoretical and practical), makes it mentally difficult to step out and take a top-down view.