Out of the Minds of Babes

Steven Pinker*

It is a cliché of neuroscience that the brain works differently from a digital computer. But the report by Marcus et al. in this issue on page 77 (1) demonstrating "rule learning by seven-month-old infants" suggests that one of the mechanisms that makes computers intelligent--manipulating symbols according to rules--may be a basic mechanism of the human brain as well.

Hundreds of years before anyone knew anything about brains or computers, two very different conceptions arose of how the mind works:

"When a man reasons, he does nothing else but conceive a sum total from addition of parcels, or conceive a remainder from subtraction of one sum from another; which, if it be done by words, is conceiving of the consequence of the names of all the parts to the name of the whole, or from the names of the whole and one part to the name of the other part. ... For REASON is nothing but reckoning."

In this passage from Leviathan, written in 1651 (2), Thomas Hobbes uses "reckoning" in the original sense of "calculating" or "computing." For example, if the definition of "man" is "rational animal," and we are told that something is "rational" and an "animal" (names of parts), we can deduce it is a "man" (name of whole). If these symbols are represented as patterns of activity in the brain, and if some patterns trigger other patterns because of the way the brain is organized, then we have a theory of intelligence. That theory became the basis of the rationalist philosophy of Descartes and Leibniz, and much later, information-processing models in cognitive psychology, Noam Chomsky's theory of generative grammar, and programs for language and reasoning in artificial intelligence.

But there is an alternative:

"There appear to be only three principles of connection among ideas, namely, resemblance, contiguity in time or place, and cause or effect. Experience teaches us that a number of uniform effects result from certain objects. When a new object, endowed with similar sensible qualities, is produced, we expect similar powers and forces and look for a like effect. From a body of like color and consistence with bread we expect like nourishment and support."


In this passage from his 1748 Enquiry Concerning Human Understanding, David Hume summarizes the theory of associationism. The mind connects things that are experienced together or that look alike, and generalizes to new objects according to their resemblance to known ones. Replace Hume's "ideas" or "sensible qualities" with "stimuli" and "responses," and you get the behaviorism of Ivan Pavlov, John Watson, and B. F. Skinner. Replace the ideas with "neurons" and the associations with "connections," and you get the neural network models of D. O. Hebb and the school of cognitive science called connectionism.

The theories would not have survived for centuries if they did not account for important phenomena. Associationism captures the tendency of animals to pick up statistical patterns among events and generalize them to similar events. Examples range from the gradient of bar-pressing rates in rats when the surrounding stimuli vary from training conditions to the widely reported demonstration in these pages last year that eight-month-old infants pick up the probabilities of transition between syllables in streams of artificial speech (3).
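To make "probabilities of transition between syllables" concrete, here is a minimal sketch in Python. The three made-up "words," their ordering, and the function name are assumptions chosen for illustration; they are not the stimuli or the analysis of the study cited in (3). The idea is simply that a syllable inside a word predicts its successor perfectly, whereas a syllable at a word boundary does not.

```python
from collections import Counter

# Hypothetical three-syllable "words" and an arbitrary word order;
# illustrative only, not the stimuli used in the cited experiment.
WORDS = [("bi", "da", "ku"), ("pa", "do", "ti"), ("go", "la", "bu")]
ORDER = [0, 1, 2, 0, 2, 1, 0, 1]

# Concatenate the words into one unbroken stream of syllables.
stream = [syl for i in ORDER for syl in WORDS[i]]

def transition_probabilities(syllables):
    """Estimate P(next | current) for every adjacent pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

tp = transition_probabilities(stream)
within_word = tp[("bi", "da")]   # "bi" is always followed by "da"
across_words = tp[("ku", "pa")]  # "ku" is followed by different word onsets

print(within_word, across_words)
```

A learner that tracks only these statistics could guess that word boundaries fall where the transition probability dips, without representing any rule.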

Moreover, it's easy to see how the laws of association might be implemented in neural hardware. If, as many neuroscientists believe, neurons that fire together wire together, we have an implementation of Hume's principle of contiguity in time. If neurons represent simple properties, and sets of active neurons represent concepts, then concepts that are similar will literally overlap in neural real estate, and anything associated with one concept will automatically be associated with similar concepts. The connectionists Geoffrey Hinton, David Rumelhart, and James McClelland, echoing Hume's remark about resemblance, wrote, "If ... you learn that chimpanzees like onions you will probably raise your estimate of the probability that gorillas like onions. In a network that uses distributed representations, this kind of generalization is automatic" (4).
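The following Python sketch illustrates how generalization by overlap could fall out of such a scheme. It is a toy under stated assumptions, not the model described in (4): the feature list, the concept definitions, and the single "likes onions" output unit are all hypothetical, and the learning step is a bare Hebbian increment.

```python
# Hypothetical feature inventory and concepts; illustrative only.
FEATURES = ["furry", "primate", "large", "climbs_trees", "has_fins", "swims"]

CONCEPTS = {
    "chimpanzee": {"furry", "primate", "climbs_trees"},
    "gorilla": {"furry", "primate", "large"},
    "shark": {"large", "has_fins", "swims"},
}

def encode(concept):
    """Return a 0/1 vector with one unit per feature (a distributed code)."""
    return [1.0 if f in CONCEPTS[concept] else 0.0 for f in FEATURES]

def hebbian_update(weights, inputs, output, rate=1.0):
    """Strengthen each connection in proportion to input-output co-activity."""
    return [w + rate * x * output for w, x in zip(weights, inputs)]

def activation(weights, inputs):
    """Summed input arriving at the 'likes onions' unit."""
    return sum(w * x for w, x in zip(weights, inputs))

# Learn one fact only: chimpanzees like onions.
weights = [0.0] * len(FEATURES)
weights = hebbian_update(weights, encode("chimpanzee"), output=1.0)

chimp = activation(weights, encode("chimpanzee"))  # 3.0
gorilla = activation(weights, encode("gorilla"))   # 2.0, via shared features
shark = activation(weights, encode("shark"))       # 0.0, no shared features

print(chimp, gorilla, shark)
```

Nothing about gorillas was ever taught; the estimate for gorillas rises only because gorillas share active units with chimpanzees, which is the sense in which the generalization is "automatic."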

The theory of symbol processing seems better suited to explaining the brain's ability to handle complex ideas and the aspects of language that communicate them. People are not slaves to similarity. We can be told that a whale is not a fish and that Tina Turner is a grandmother, overriding our statistical experience of what fish and grandmothers tend to look like. This suggests an ability to summarize an entire category by a mental variable or symbol, whose meaning comes from the rules it enters into: "a mammal is an animal that suckles," "a grandmother is the mother of a parent." These rules support generalizations that work more like deductions than similarity gradients. For example, we can infer that whales have livers or that Ms. Turner has had at least one baby (5).
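By way of contrast with similarity-based generalization, a rule over variables can be sketched in a few lines of Python. The names and family facts below are hypothetical, and the representation (sets of pairs) is just one convenient assumption; the point is that the rule never consults what anyone looks like.

```python
# Hypothetical family facts, stored as (parent, child) pairs.
MOTHER_OF = {("Ruth", "Dana"), ("Dana", "Lee")}
FATHER_OF = {("Omar", "Lee")}

def is_parent(x, y):
    """True if x is recorded as the mother or the father of y."""
    return (x, y) in MOTHER_OF or (x, y) in FATHER_OF

def is_mother(x):
    """True if x is recorded as the mother of at least one person."""
    return any(m == x for m, _ in MOTHER_OF)

def is_grandmother(x):
    """Rule: X is a grandmother if X is the mother of some Y who is a parent of some Z."""
    people = {p for pair in MOTHER_OF | FATHER_OF for p in pair}
    return any(
        (x, y) in MOTHER_OF and is_parent(y, z)
        for y in people
        for z in people
    )

print(is_grandmother("Ruth"), is_grandmother("Dana"), is_grandmother("Omar"))
```

Because X, Y, and Z are variables, the rule applies to any individual who satisfies it, however atypical, and it licenses the deduction that anyone who passes `is_grandmother` also passes `is_mother`.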

Language is the quintessential symbol-manipulating system. When we learn that the grammatical object comes after the verb from simple sentences like "Tex hugged the dog," we can generalize that regularity to grammatical objects that are very different in sound ("I like Joe Bftsplk"), in meaning ("Kant defined the categorical imperative"), or in length ("Sheila met a tall blonde man with one brown shoe"). The abstractness and open-ended expressive power of human language come from a system of recursive rules manipulating variables like "noun phrase" and "object" (6).

Although many cognitive scientists believe that the human mind is a hybrid system that uses both associations and rules (5), others want to retain associative networks as the fundamental stuff of cognition (4). They suggest that humans are not naturally good at the kind of reasoning subserved by rules. On this view, rule use emerges late in life as a result of formal schooling and socially articulated rules, or as a result of extensive training that makes an associative network approximate rule-like behavior.

Marcus et al. (1) have now shown that infants as young as seven months can abstract simple rules from language-like sounds, suggesting that rule formation is not a late add-on but there from the start. Children of that age are just beginning to segment words from ambient speech, although they are several months away from understanding or producing them (6). Marcus et al. used a common method in the study of infant cognition: present a stimulus repeatedly until the infants are bored, then present them either with stimuli of the same kind or of a different kind. "Same kind" and "different kind" are in the mind of the beholder, so if infants attend longer to the different kind, they must be telling them apart.

In these experiments, infants were habituated with "sentences" that follow one sequence, such as "ga ti ga" and "li na li" (an ABA pattern), and then were presented with sentences that contained different words and either the same sequence, such as "wo fe wo" (ABA), or a different sequence, such as "wo fe fe" (ABB). The babies listened longer to the "different" sequence, showing that they must have discriminated ABA from ABB; everything else about the test sentences, such as the actual syllables and their transition probabilities, was the same. Various controls ensured that the children did not simply like the sound of some sequences more than others, or memorize smaller chunks like BA. Marcus has also demonstrated that a kind of associative network frequently touted as a ruleless model of language learning, J. Elman's Simple Recurrent Network, does not discriminate the patterns in the way these infants do.
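The logic of the design can be sketched in Python using the example sentences quoted above. This is an illustration of what it means to abstract a pattern over variables, not a model of what infants compute and not the analysis used by Marcus et al.; the function names are hypothetical.

```python
def abstract_pattern(sentence):
    """Replace each distinct syllable with a variable (A, B, ...) in order of first use."""
    labels = {}
    pattern = []
    for syllable in sentence.split():
        if syllable not in labels:
            labels[syllable] = chr(ord("A") + len(labels))
        pattern.append(labels[syllable])
    return "".join(pattern)

def shared_syllables(a, b):
    """Count the syllables two sentences have in common (a crude similarity measure)."""
    return len(set(a.split()) & set(b.split()))

habituation = ["ga ti ga", "li na li"]
familiar_patterns = {abstract_pattern(s) for s in habituation}  # {"ABA"}

same_test = "wo fe wo"       # new syllables, same pattern
different_test = "wo fe fe"  # new syllables, different pattern

print(abstract_pattern(same_test), abstract_pattern(different_test))  # ABA ABB
print(abstract_pattern(same_test) in familiar_patterns)               # True
print(abstract_pattern(different_test) in familiar_patterns)          # False

# Neither test sentence shares any syllable with the habituation sentences.
overlap = [shared_syllables(t, h) for t in (same_test, different_test) for h in habituation]
print(overlap)  # [0, 0, 0, 0]
```

Both test sentences are equally novel at the level of syllables, so only a representation that abstracts away from the particular syllables separates them.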

Marcus et al. (1) are careful not to claim that infants lack an ability to form associations, that rule learning is uniquely human, or that the rule-learning mechanism at work in this experiment is the same one that babies use to acquire language later. But their demonstration suggests that the ability to recognize abstract patterns of stimuli that cut across their sensory content is a basic ability of the human mind. How it is carried out in the brain is still largely a mystery. Research in the neurobiology of learning and in neural network modeling (perhaps searching where the light is best) has tended to focus on simple associative learning mechanisms whose functions would have been recognizable to associationist philosophers writing centuries ago. Marcus et al.'s experiment is a reminder that humans also think in abstractions, rules, and variables, and is a challenge to figure out how we do so.

References and Notes

  1. G. F. Marcus, S. Vijayan, S. Bandi Rao, P. M. Vishton, Science 283, 77 (1999).
  2. T. Hobbes, Leviathan (1651; reprint, Oxford Univ. Press, New York, 1957).
  3. J. Saffran, R. Aslin, E. Newport, Science 274, 1926 (1996).
  4. G. E. Hinton, J. L. McClelland, D. E. Rumelhart, in Parallel Distributed Processing, D. E. Rumelhart and J. L. McClelland, Eds. (MIT Press, Cambridge, MA, 1986).
  5. S. L. Armstrong et al., Cognition 13, 263 (1983); S. Pinker, How the Mind Works (Norton, New York, 1997); G. F. Marcus, The Algebraic Mind (MIT Press, Cambridge, MA, 1999).
  6. S. Pinker, The Language Instinct (HarperCollins, New York, 1994).
  7. Grant support received from NIH HD 18381.