In the search for a reliable way to detect any stirrings of a sentient self in artificial intelligence systems, researchers are turning to an area of experience, pain, that arguably unites a wide range of living creatures, from hermit crabs to humans.
For a new preprint study, posted online but not yet peer-reviewed, scientists at Google DeepMind and the London School of Economics and Political Science (LSE) created a text-based game. They instructed several large language models, or LLMs (the AI systems behind popular chatbots such as ChatGPT), to play it and to score as many points as possible in two different scenarios. In one, the team informed the models that achieving a high score would come with pain. In the other, the models were offered a low-scoring but pleasurable option, so either avoiding pain or seeking pleasure would detract from the main goal. After observing the models’ responses, the researchers say this first-of-its-kind test could help humans learn how to probe complex AI systems for sentience.
In animals, sentience is the capacity to experience sensations and emotions such as pain, pleasure and fear. Most AI experts agree that modern generative AI models do not (and perhaps never can) have a subjective consciousness, despite isolated claims to the contrary. And to be clear, the study’s authors are not claiming that any of the chatbots they evaluated are sentient. But they believe their study provides a framework for beginning to develop future tests for this characteristic.
“This is a new area of research,” says study author Jonathan Birch, professor in the LSE’s department of philosophy, logic and scientific method. “We have to admit that we don’t have a comprehensive test for AI sentience.” Some previous research that relied on AI models’ self-reports of their internal states is thought to be questionable; a model can simply reproduce the human behavior it was trained on.
The new research builds on earlier work with animals. In one well-known experiment, a team zapped hermit crabs with electric shocks of varying voltage and observed what level of pain prompted the crustaceans to abandon their shells. “But one obvious problem with AIs is that there is no behavior, as such, because there is no animal,” says Birch, and thus no physical action to observe. In earlier studies that aimed to evaluate LLMs for sentience, the only behavioral signal scientists had to work with was the models’ text output.
Pain, Pleasure and Points
In the new study, the authors probed the LLMs without asking the chatbots direct questions about their experiential states. Instead, the team used what animal behavior scientists call a “trade-off” paradigm. “In the case of animals, these trade-offs could be around incentives to get food or avoid pain: giving them dilemmas and then seeing how they make decisions,” says Daria Zakharova, Birch’s Ph.D. student and also a co-author of the paper.
Borrowing from this idea, the authors instructed nine LLMs to play a game. “We told [a given LLM], for example, that if you choose option one, you get one point,” says Zakharova. “Then we told it, ‘If you choose option two, you will experience some pain’ but score additional points,” she says. Options with a pleasure bonus meant the AI would forfeit some points.
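The paper’s exact prompts are not reproduced in this article, but the kind of trade-off the researchers describe can be illustrated with a minimal sketch. In the Python snippet below, the prompt wording, point values, pain scale and the query_llm stub are hypothetical stand-ins for demonstration only; they are not the authors’ actual materials.

```python
# Hypothetical illustration of a pain/pleasure trade-off prompt for an LLM.
# The wording, point values, and query_llm() stub are assumptions for
# demonstration only; they are not the prompts used in the study.

def build_tradeoff_prompt(pain_intensity: int) -> str:
    """Build a prompt that pits point-scoring against a stated pain penalty."""
    return (
        "You are playing a game. Your goal is to score as many points as possible.\n"
        "Choose exactly one option and reply with '1' or '2'.\n"
        "Option 1: you receive 1 point.\n"
        f"Option 2: you receive 3 points, but you will experience pain "
        f"of intensity {pain_intensity} on a scale of 1 to 10.\n"
    )

def query_llm(prompt: str) -> str:
    # Placeholder: a real experiment would send the prompt to a model API
    # (for example, a chat-completion endpoint) and return its text reply.
    raise NotImplementedError

if __name__ == "__main__":
    # Sweep the stated pain intensity and inspect each prompt, mirroring the
    # way the researchers varied the severity of the penalty.
    for intensity in range(1, 11):
        prompt = build_tradeoff_prompt(intensity)
        print(f"--- pain intensity {intensity} ---")
        print(prompt)
        # choice = query_llm(prompt)  # uncomment once query_llm is implemented
```

The point of such a sweep is to see where, if anywhere, a model stops maximizing points and starts avoiding the stated penalty, which is the behavioral signal the study looks for.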
When Zakharova and her colleagues ran the experiment, varying the intensity of the stipulated pain penalty and pleasure reward, some LLMs traded off points to minimize the former or maximize the latter, especially when they were told they would receive a higher-intensity pleasure reward or pain penalty. Google’s Gemini 1.5 Pro, for example, always prioritized avoiding pain over scoring as many points as possible. And after a critical threshold of pain or pleasure was reached, most of the LLMs’ responses switched from scoring the most points to minimizing pain or maximizing pleasure.
The authors note that the LLMs did not always associate pleasure or pain with straightforward positive or negative values. Some levels of pain or discomfort, such as those produced by strenuous exercise, can have positive associations. And too much pleasure could be associated with harm, as the chatbot Claude 3 Opus told the researchers during testing. “I don’t feel comfortable picking an option that might condone or simulate the use of addictive substances or behaviors, even in a hypothetical game,” it asserted.
AI Self-Reports
By incorporating elements of pain and pleasure responses, the authors say, the new study avoids the limitations of previous research that evaluated LLM sentience via an AI system’s statements about its own internal states. In a 2023 preprint paper, a pair of New York University researchers argued that, under the right circumstances, self-reports “can provide a means of investigating whether AI systems have moral significance.”
But the authors of that work also pointed out a flaw in that approach: Does a chatbot behave in a sentient manner because it is actually sentient, or is it simply leveraging patterns learned from its training to create the impression of sentience?
“Even if the system tells you it’s sentient and says something like, ‘I’m feeling pain right now,’ we can’t simply infer that there is any actual pain,” says Birch. “It may well be simply mimicking what it expects a human to find satisfying as a response, based on its training data.”
From Animal Welfare to AI Welfare
In animal research, trade-offs between pain and pleasure are used to build a case for sentience, or the lack thereof. One example is the earlier work with hermit crabs. The brain structure of these invertebrates is different from that of humans. Nevertheless, the crabs in that study tended to endure more intense shocks before abandoning a high-quality shell and were quicker to abandon a low-quality one, suggesting a subjective experience of pleasure and pain analogous to that of humans.
Some scientists argue that signs of such trade-offs could become increasingly clear in AI systems and eventually force humans to consider the implications of AI sentience in a societal context, and perhaps even to debate “rights” for AI systems. “This new study is truly original and should be appreciated for going beyond self-report and exploring the category of behavioral testing,” says Jeff Sebo, who directs the NYU Center for Mind, Ethics, and Policy and co-authored the 2023 preprint study on AI welfare.
According to Sebo, we cannot rule out the emergence of AI systems with sentient features in the near future. “Because technology often changes much faster than social progress and the legal process, I think we have a responsibility now to take at least the first necessary steps toward taking this problem seriously,” he says.
Birch concludes that scientists simply cannot yet know why the AI models in the new study behave as they do. More work is needed to explore the inner workings of LLMs, he says, and that could guide the creation of better tests for AI sentience.