AI Gets a Lot Better at Debating When It Knows Who You Are, Study Finds

A new study shows that GPT-4 reliably wins debates against its human counterparts in one-on-one conversations—and the technology gets even more persuasive when it knows your age, job, and political leanings.
Researchers at EPFL in Switzerland, Princeton University, and the Fondazione Bruno Kessler in Italy paired 900 study participants with either a human debate partner or OpenAI’s GPT-4, a large language model (LLM) that generates text responses to human prompts. In some cases, participants (both human and machine) had access to their counterparts’ basic demographic information: gender, age, education, employment, ethnicity, and political affiliation.
The team’s research, published today in Nature Human Behaviour, found that when given that personal information, the AI was more persuasive than its human opponents 64.4% of the time; without the personal data, the AI’s performance was indistinguishable from that of the human debaters.
“In recent decades, the diffusion of social media and other online platforms has expanded the potential of mass persuasion by enabling personalization or ‘microtargeting’—the tailoring of messages to an individual or a group to enhance their persuasiveness,” the team wrote.
When GPT-4 was allowed to personalize its arguments, it became significantly more persuasive than any human, raising the odds of changing someone’s mind by 81.2% compared with human-to-human debates. Importantly, human debaters saw no comparable boost when given access to the same personal information.
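For readers wondering what an 81.2% boost in odds actually means for the underlying probability of a changed mind, here’s a minimal Python sketch of the odds-to-probability arithmetic; the 30% baseline rate below is purely illustrative, not a figure from the study.

```python
def shift_odds(p_baseline: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return
    the resulting probability."""
    odds = p_baseline / (1 - p_baseline)  # probability -> odds
    new_odds = odds * odds_ratio          # scale by the reported ratio
    return new_odds / (1 + new_odds)      # odds -> probability

# Hypothetical example: if 30% of participants shifted toward a human
# opponent's position, an 81.2% increase in odds (a ratio of 1.812)
# would push that figure to roughly 43.7%.
print(f"{shift_odds(0.30, 1.812):.1%}")  # -> 43.7%
```

The takeaway from the conversion: “81.2% higher odds” is not the same thing as “81.2% more likely,” which is worth keeping in mind when reading effect sizes like these.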
“In the context of persuasion, experts have widely expressed concerns about the risk of LLMs being used to manipulate online conversations and pollute the information ecosystem by spreading misinformation, exacerbating political polarization, reinforcing echo chambers and persuading individuals to adopt new beliefs,” the researchers added.
GPT-4 can argue with you, and given a set of facts about you, it may excel at convincing you to change your point of view, the researchers found. The team notes in the paper’s discussion that LLMs have previously been criticized for generating and diffusing hate speech, misinformation, and propaganda; at scale, LLMs with users’ personal information could be harnessed for malicious purposes.
The team’s research pairs nicely with a recent ChatGPT update that allows the model to remember more of users’ conversations (with their permission), meaning that the AI can have access to a catalogue of information about its users.
But there’s also good news, or bad news, depending on how you see it. GPT-4 was very effective at persuading its opponents on less controversial issues, but when participants held more entrenched positions (what the research measures as “opinion strength”), the bot had a harder time convincing them to change their minds. In other words, there’s no indication that GPT-4 would be any more successful than you are at the Thanksgiving debate table.
What’s more, the researchers found that GPT-4 tended to use more logical and analytical language, while human debaters relied more on personal pronouns and emotional appeals. Surprisingly, personalization didn’t dramatically change GPT-4’s tone or style; it just made its arguments more targeted.
In three out of four cases, human participants could correctly identify their opponent as an AI, which the researchers attribute to GPT-4’s distinct writing style. But participants had a difficult time identifying human opponents as human. Regardless, people were more likely to change their minds when they thought they were arguing with an AI than when they believed their opponent was human.
The team behind the study says this experiment should serve as a “proof of concept” for what could happen on platforms like Reddit, Facebook, or X, where debates and controversial topics are routine and bots are a well-established presence. The paper shows that it doesn’t take Cambridge Analytica-level profiling for an AI to change human minds; GPT-4 managed it with just six types of personal information.
As people increasingly rely on LLMs for help with rote tasks, homework, documentation, and even therapy, it’s critical that human users remain circumspect about the information they’re fed. There’s an irony here, too: chatbots, like the social media platforms once advertised as the connective tissue of the digital age, can fuel loneliness and isolation, as two studies found in March.
So even if you find yourself in a debate with an LLM, ask yourself: what exactly is the point of discussing such a complicated human issue with a machine? And what do we lose when we hand over the art of persuasion to algorithms? Debating isn’t just about winning an argument; it’s a quintessentially human thing to do. There’s a reason we seek out real conversations, especially one-on-one: to build personal connections and find common ground, something that machines, with all their powerful learning tools, are not capable of.