AI & Mind
Arguing with the Chinese Room
Michael DeBellis says Searle’s famous argument about computers not having understanding does not compute.
Many readers of this magazine will be familiar with John Searle’s classic ‘Chinese Room’ argument against ascribing consciousness to Artificial Intelligence. Due to my experience building AI systems for business applications, I have a different take on Searle’s argument than most others. But first let’s look at his argument.
The Chinese Room
Searle introduced the Chinese Room in a paper published in 1980, called ‘Minds, Brains, and Programs’ (Behavioral and Brain Sciences, vol.3, no.3). The paper begins with the following thought experiment:
Professor Searle is locked in a room. He can’t read Chinese or even distinguish Chinese characters from Japanese. He’s given four sets of paper. The people giving him them have labels for each set, although Searle is not aware of their labels. I’ll put the labels at the beginning of each numbered item, along with Searle’s description in quotes:
1. Script: “A large batch of Chinese writing”
2. Story: “A second batch of Chinese [text]”
3. Questions: “A third batch of Chinese symbols”
4. Program: “Instructions… in English, that enable me to correlate elements of [3] with [1] and [2]. These rules instruct me how to give back [5]”
5. Answers: “Certain Chinese symbols with certain sorts of shapes in response to certain sorts of shapes given me in [3]”
The idea is that the instructions [4] tell Searle how to respond to certain sets of Chinese symbols [3] by outputting other Chinese symbols in specific ways [5]. In this way Searle gives coherent Chinese answers to Chinese questions without understanding a word of Chinese. The final part of Searle’s thought experiment is to “Suppose [that] I get so good at following the instructions… and the programmers get so good at writing the programs that from… the point of view of somebody outside the room… my answers to the questions are absolutely indistinguishable from those of native Chinese speakers. Nobody just looking at my answers can tell that I don’t speak a word of Chinese.”
Searle points out that this system does the same thing as AI programs. His implication is clear: just because a computer program gives good answers to questions, that doesn’t mean it understands what is going on. Later in that paper he also equates this with passing the Turing Test, concerned with determining whether one’s interlocutor is conscious or not.
Since Searle has no understanding of Chinese even though he is able to process the questions by following an algorithm (the instructions), he asserts that in the same way there need be no understanding in AI systems, because what they are doing is equivalent to what he is doing.
John Searle by Simon Ellinas 2023
Illustration © Simon Ellinas 2023. Please visit www.simonillustrations.com
Problems & Agreements with Searle
Let me now start by describing where I agree with Searle, then mention some fairly minor problems, then go on to what I think is the key issue.
I agree with Searle that the way Roger Schank and other early AI researchers described their progress was over-optimistic. One of the most infamous examples is from Marvin Minsky, who in 1970 stated, “In from three to eight years we will have a machine with the general intelligence of an average human being.” Schank wasn’t quite as extreme, but some of the ways he discusses the consciousness of a computer program – one able to solve a very narrow set of linguistic tasks – were inflated. I think probably most AI researchers would now agree with that. However, there is a difference between deflating the significance of an idea, and claiming that all work that follows a similar methodology is completely vacuous.
Beginning with the less significant counterarguments: the scenario Searle describes would never actually work. Of course, the natural response is ‘It’s a thought experiment: it doesn’t have to be something that can actually be implemented’. While it’s true that certain details can be waved away for a thought experiment, there are other details that can’t simply be dismissed.
So why do I maintain that Searle’s system couldn’t work, and why does that matter? Because the Chinese Room could never approach the speed of a native Chinese speaker, and speed is an issue for passing the Turing Test.
The sort of mechanism Searle describes in his thought experiment is a model known as a Finite State Machine (FSM). Noam Chomsky defined a hierarchy of languages based on the complexity of the phrases they could generate, and the FSM family of languages is the simplest type. Here the input to the system is a set of symbols, and the system uses a set of rules to correlate the input symbols with another set of symbols, which are the output. A thermostat is a classic FSM. It regularly takes readings, and if the temperature is below a threshold it turns on the heat and leaves it on until the temperature is above another threshold. The crucial missing element in an FSM is memory. There is no mechanism where symbols can be stored so that they can be resolved later based on context.
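The thermostat just described can be written as a two-state machine. The following is only a minimal sketch: the state names, the function names `step` and `run`, and the two threshold values are assumptions chosen for illustration, not anything specified in the text.

```python
# A two-state thermostat. The only thing carried from one reading to the next
# is the current state ("ON" or "OFF"); no earlier readings are stored.
# The thresholds (18 and 21 degrees) are illustrative assumptions.

def step(state, temperature, low=18.0, high=21.0):
    """Return the next state given the current state and one temperature reading."""
    if state == "OFF" and temperature < low:
        return "ON"   # too cold: switch the heat on
    if state == "ON" and temperature > high:
        return "OFF"  # warm enough: switch the heat off
    return state      # otherwise stay in the same state


def run(readings, state="OFF"):
    """Feed a sequence of readings through the machine and record each state."""
    history = []
    for temperature in readings:
        state = step(state, temperature)
        history.append(state)
    return history


print(run([20, 17, 19, 22, 20]))
```

Each output depends only on the current state and the current reading, which is the sense in which such a machine has rules but nowhere to put a symbol aside for later.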
In Syntactic Structures (1957), Chomsky proved that an FSM is incapable of parsing natural languages. An intuitive argument for why FSMs can’t process natural language can be seen by considering a simple English sentence that Chomsky often uses: ‘I saw the man on the hill with a telescope’. Who has the telescope? Is it me, or the man on the hill? There is no way to determine the referent from this single sentence. This is known as the problem of anaphora in linguistics: sentences that use pronouns such as ‘I’ or noun phrases such as ‘the man on the hill’ often need the context of sentences that came before or after to disambiguate who the referent is.
To process anaphora (and many other features of natural languages), the system doing the processing needs memory as well as rules. Unidentified variables need to be stored somewhere so that they can be resolved by context that comes before or after. But an FSM such as Searle’s Room has no memory. It just takes symbols as input, and moves to different states as a result of applying rules to the input. It can’t interpret ambiguity.
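To make the role of memory concrete, here is a small sketch using a simpler stand-in than anaphora: checking that nested brackets are properly matched, a toy version of the nested dependencies found in natural language. The function name `is_properly_nested` and the bracket notation are assumptions made for illustration. The point is only that the checker has to store something (here, a running count of brackets still open) and that this count has no fixed upper bound, which a machine with a fixed number of states cannot provide.

```python
def is_properly_nested(symbols):
    """Check that every '(' is closed by a later ')' and nothing closes early."""
    depth = 0  # the memory: how many brackets are still waiting to be closed
    for symbol in symbols:
        if symbol == "(":
            depth += 1          # remember one more unresolved opening
        elif symbol == ")":
            if depth == 0:
                return False    # a closing bracket with nothing to resolve
            depth -= 1          # resolve the most recent opening
    return depth == 0           # everything opened must have been closed


print(is_properly_nested("(()())"))  # properly nested
print(is_properly_nested("(()"))     # one opening is never resolved
```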
However, once one begins to add memory, the rules become much more complex and the chances for error become exponentially greater. The Turing Test includes speed of response as part of the test. If it takes a system much longer than it would take a human to answer simple questions about (say) a short story, any reasonable judge would determine that it was a computer and not a person. For a program to pass the Turing Test, it would also need to be able to handle extended discourse, humor, metaphor, etc. To date no system that I’m aware of has even come close to passing the test. This gets back to Searle’s claim that AI researchers exaggerated the significance of their results.
Searle’s Definition of Strong AI is (Mostly) a Strawman
As a result of his argument Searle asserts that “the claims made by strong AI are false.” According to Searle the three claims made by proponents of strong (ie humanlike) AI are:
AI Claim 1: ‘‘that the programmed computer understands the stories.’’
AI Claim 2: ‘‘that the program in some sense explains human understanding.’’
AI Claim 3: Strong AI is about software not hardware (ie, it ignores the brain as a possibly unique site of consciousness).
However, these claims that Searle ascribes to strong AI are for the most part too strong, and not held by the vast majority of AI researchers then or now.
Claim 1, the idea that AI programs understand text, hinges on our definition of ‘understand’. I will discuss this idea at the end because I think it is the most important question.
Claim 2 can be supported from our perspective in the twenty-first century looking back on the impact of Schank’s research, and similar AI research of that time. Schank’s work was also relevant to early work in applied AI.
In the 1980s I was a member of the AI group that was a part of Accenture’s Technology Services Organization in Chicago. One of the first systems we developed was the Financial Statement Analyzer, a system that utilized a concept of Schank’s to analyze the yearly financial statements that corporations are required by the government to file. These statements were shared with the public, especially with shareholders, so corporations often spent significant effort on the presentation of the reports, with elaborate graphics. While the government required specific information in these reports, they left it open to each corporation to determine how to format the documents. Thus, a normal computer system that could parse tables fairly easily was not able to automatically process these statements. The Accenture AI group developed a system that could analyze the reports, find the relevant ‘frames’ (e.g., debt to equity ratio) and use rule-based heuristics to determine which reports would benefit from further analysis by an expert. (‘FSA: Applying AI Techniques to the Familiarization Phase of Financial Decision Making’, IEEE Expert, Chunka Mui and William McCarthy, Sept. 1987.)
In reading these reports, our system in a sense did some of the work that a human understanding the reports would have done. Not that Schank (or anyone to date) has provided a complete theory of human language. Rather, the work of Schank and others led to other productive work on language and other problems of cognitive science, that is, of ‘human understanding’.
Concerning Claim 3 – that strong AI is only about software not hardware – Searle distinguishes between machines and programs, and says that strong AI is only about programs, and that the nature of the machine running it (the computer, or brain) is irrelevant: it is only the program that matters. This is a strawman, in that Searle confuses a simplifying assumption – that the mind can be studied as a system independent of the physical brain – with the truth, that all the minds we know of are associated with brains.
Even in computer science it has only been fairly recently that software can be packaged so that it is (mostly) independent of the hardware platform. At the time of the Chinese Room argument – 1980 – AI software was tightly coupled to the specific programming language and operating system that the researchers were using. Only in recent decades, thanks to technologies such as the Java Virtual Machine and Docker containers, could software be packaged in a way that’s largely independent of hardware. This is the result of decades of engineering effort.
The brain, however, is not designed from scratch in the way environments such as the Java Virtual Machine are. The human brain is the result of one hack upon another, adding whatever small random mutations happen to increase reproductive success. It would be ridiculous for anyone who truly understands computers to think that this same level of engineering could be achieved by nature. We can see this by examining the brain architecture for functions such as vision, which we understand much better than language. In vision, information is processed in the primary visual cortex. There are modules going from low level visual processing (e.g., edge and surface detectors) to high level (e.g., face detectors in primates, or bug detectors in frogs). In a computer system, each level would have a small number of well-defined interfaces to the level above or below it (and few to more than one level away). In the brain, however, there are many significant collections of neurons that connect layers with other layers two or more levels away, as well as major connections to other areas of the brain. Clearly, then, no complete understanding of the visual system can be had without understanding the complex biology of the brain. At the same time, it is possible to study the visual system in the abstract; for instance, simply defining the various levels and the kind of information that is communicated between each level. This vision model, originally developed by David Marr, which abstracts away from its implementation in a brain or computer, led to great advances in both computer and human vision. Later research was able to (partially) map these abstract functions onto the topology of the brain.
While researchers in cognitive science often talk about mental functions without describing the specific areas of the brain in which they occur, this is only a simplifying assumption. It is not a criticism of researchers that they make such assumptions, since science would be impossible without them. A simple example from physics is the equations for computing the force of gravity. Computing the force on an object with mass X dropped from height Y or launched with force F is trivial. However, when we do this, we are never calculating the true force of gravity. That would require that we include the gravitational pull of the Moon, the other planets, even the stars. The math for calculating the gravitational force on three interacting bodies is significantly complex, and the complexity increases exponentially with each body added to the calculations. However, for most purposes we can get by with the simplifying assumption that just the mass of the Earth and the object matter.
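A rough calculation shows why the simplification is safe. The sketch below uses Newton's law of gravitation, F = G·m1·m2 / r², with rounded textbook values for the constants. The variable names and the choice of a 1 kg object at the Earth's surface are assumptions for illustration, and the Moon's pull is computed naively from the mean Earth–Moon distance.

```python
G = 6.674e-11                 # gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS = 5.972e24         # kg
EARTH_RADIUS = 6.371e6        # m, distance from Earth's centre to its surface
MOON_MASS = 7.342e22          # kg
EARTH_MOON_DISTANCE = 3.844e8 # m, mean distance (rounded)


def gravitational_force(m1, m2, distance):
    """Newton's law of gravitation: force in newtons between two point masses."""
    return G * m1 * m2 / distance ** 2


object_mass = 1.0  # kg, an illustrative object at the Earth's surface

earth_pull = gravitational_force(EARTH_MASS, object_mass, EARTH_RADIUS)
moon_pull = gravitational_force(MOON_MASS, object_mass, EARTH_MOON_DISTANCE)

print(f"Earth's pull: {earth_pull:.2f} N")
print(f"Moon's pull:  {moon_pull:.2e} N")
print(f"Moon / Earth: {moon_pull / earth_pull:.1e}")
```

With these values the Earth's pull comes out at a little under 10 newtons, while the Moon's is smaller by a factor of several hundred thousand, which is why leaving it out changes nothing for most practical purposes.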
Searle’s Argument is Based on a Logical Fallacy
Searle’s argument can be summarized as:
1. Strong AI maintains that a symbol processing system that passed the Turing Test understands human language
2. The Chinese Room argument demonstrates that a symbol processing system could pass the Turing Test and still not understand human language
3. Thus, no symbol processing system that passes the Turing Test understands human language
This is an invalid argument. All Searle has proven is that it is possible that a symbol processing system could pass the Turing Test and not understand language. This is not a proof that every symbol processing system that passed the Turing Test does not understand natural language.
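The gap between ‘some’ and ‘all’ can be checked mechanically with a counterexample. In the sketch below, the `System` record, the two example systems, and the function names are all hypothetical, invented purely to model the form of the argument. It builds a tiny imagined world in which step 2 is true and the conclusion in step 3 is false, which is all that is needed to show that 3 does not follow from 2.

```python
from typing import NamedTuple


class System(NamedTuple):
    name: str
    passes_turing_test: bool
    understands: bool


def premise_two(universe):
    """Some system passes the Turing Test without understanding."""
    return any(s.passes_turing_test and not s.understands for s in universe)


def conclusion(universe):
    """No system that passes the Turing Test understands."""
    return all(not s.understands for s in universe if s.passes_turing_test)


# A hypothetical world: one rule-follower that passes without understanding,
# and one system that passes and does understand.
universe = [
    System("rule-follower in a room", passes_turing_test=True, understands=False),
    System("some other system", passes_turing_test=True, understands=True),
]

print(premise_two(universe))  # the premise holds in this world
print(conclusion(universe))   # the conclusion does not
```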
Searle might respond by saying that what strong AI claims is that any system that can pass the Turing Test understands human language. However, I’m not aware of anyone in AI who makes this claim. They simply don’t bother to point out that not every system that can be imagined in a thought experiment that seems to understand language necessarily understands language.
To see this, consider another thought experiment: Professor Nietzsche has constructed a quantum computer with memory that exceeds conventional memory in both space and speed by several orders of magnitude. He programs his computer with a simple table consisting of zettabytes (10²¹ bytes) of information. The first column in the table contains short stories in Chinese; the second column, questions in Chinese about those stories; and the third column, the answers to those questions. The program then takes Chinese stories and questions as input, and looks up the pair in the first two columns of the table that best matches them (using simple pattern-matching algorithms), then returns the third value in that row of the table as the answer.
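A toy version of such a lookup table takes only a few lines. This is a sketch under several assumptions: the table has two rows rather than zettabytes, the entries are in English rather than Chinese for readability, the names `TABLE`, `similarity` and `answer` are invented here, and ‘simple pattern matching’ is stood in for by the string-similarity ratio from Python's standard `difflib` module.

```python
from difflib import SequenceMatcher

# Each row: (story, question about the story, canned answer).
TABLE = [
    ("The man went to a restaurant and ordered a hamburger.",
     "What did the man order?",
     "A hamburger."),
    ("The girl walked to the library and borrowed a book.",
     "What did the girl borrow?",
     "A book."),
]


def similarity(a, b):
    """Crude pattern match: a ratio between 0 and 1 of how alike two strings are."""
    return SequenceMatcher(None, a, b).ratio()


def answer(story, question):
    """Return the canned answer from the row that best matches the input pair."""
    best_row = max(
        TABLE,
        key=lambda row: similarity(story, row[0]) + similarity(question, row[1]),
    )
    return best_row[2]


print(answer("The man went to a restaurant and ordered a hamburger.",
             "What did he order?"))
# A story the table has never seen still gets one of the canned answers:
print(answer("The dog chased a ball across the park.",
             "What did the dog chase?"))
```

The second call shows the limitation: however large the table, the program can only ever hand back an answer that someone has already written into it.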
Such a system could perform much better than the Chinese Room ever could. Yet, no one in AI would consider this to be relevant to the myriad problems of natural language understanding, because such a system would still be restricted to a very narrow subset of natural language possibilities. Also, the idea of a system based on predefined questions and answers contradicts what Chomsky with good reason calls the creative aspect of language use.
The Definition of ‘Understanding’: Do Submarines Swim?
Returning to Claim 1, the final, and most important, idea is that AI systems in some sense understand natural language. This requires us to examine Turing’s original paper on his Test. The paper opens as follows:
“I propose to consider the question, ‘Can machines think?’ This should begin with definitions of the meaning of the terms ‘machine’ and ‘think.’ The definitions might be framed so as to reflect so far as possible the normal use of the words, but this attitude is dangerous. If the meaning of the words ‘machine’ and ‘think’ are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, ‘Can machines think?’ is to be sought in a statistical survey such as a Gallup poll. But this is absurd. Instead of attempting such a definition I shall replace the question by another, which is closely related to it and is expressed in relatively unambiguous words.”
(‘Computing Machinery and Intelligence’, Mind, Volume LIX, Issue 236, October 1950)
Even setting aside all the issues I’ve already raised, this is the essence of the problem with the Chinese Room argument: Turing is explicitly not trying to answer the question ‘Can machines think?’ by appealing to the definitions we use in everyday language, as Searle is. Turing is trying to provide a scientific definition of thinking that abstracts away from the natural assumptions most people bring to such discourse. Thus the question that the Chinese Room is really addressing is not the question that Turing posed, which is: ‘‘What is a rational definition of ‘understanding’ that could apply to both machines and people?’’ Rather, what Searle is arguing is that our commonsense notion of ‘understanding’ can’t be applied to computers. But as Turing said, the way people normally use words like ‘understanding’ and ‘thinking’ is not relevant to a scientific theory of cognition. Chomsky agrees with Turing, and says that asking if computers can think (in the commonsense, Searlean sense) is like asking ‘Can submarines swim?’ (Chomsky and His Critics, 2008, p.279). In English they don’t, but in Japanese they do. In English we don’t use the word ‘swim’ to describe what a submarine does; but Japanese does use the same word for the movement of humans and submarines through water. That doesn’t tell us anything about oceanography or ship design – just as thought experiments about ‘understanding’ in everyday language use don’t tell us anything useful about cognitive science.
© Michael DeBellis 2023