"Can LLMs Persuade Humans with Deception?": From a Deceptive Strategy Taxonomy to a Large-Scale Empirical Study

Haein Yeo¹, Seungwan Jin¹, Taehyung Noh¹, Yejin Shin², Sangyeon Kang², Sangwoo Heo³, Jiwon Chung³, Hwarim Hyun³, Kyungsik Han¹,*

¹Hanyang University ²Telecommunications Technology Association ³Naver *Corresponding Author

Abstract

Beyond hallucinations, Large Language Models (LLMs) can craft deceptive arguments that erode users' critical thinking, posing a significant yet underexamined societal risk. To address this gap, we develop a taxonomy of eight deceptive persuasion strategies by integrating top-down rhetorical theory with a bottom-up analysis of 3,360 AI-generated messages from four LLM families. Through a large-scale user study (N = 602) complemented by a think-aloud protocol (N = 10), we found that participants were vulnerable to Information Manipulation and Uncertainty Exploitation, especially when a message contradicted their prior beliefs. Vulnerability was significantly higher for participants with low cognitive reflection, low topic knowledge, and low topic involvement. Qualitative analyses further revealed that participants were persuaded by the plausibility of an overall narrative even when they distrusted specific details — interpreting deceptive outputs as logically framed information that broadens perspective. We discuss implications for the design of trustworthy AI systems, adaptive user interfaces, and targeted literacy education.

Research Questions

RQ1. What strategic types can the deceptive persuasive messages generated by LLMs be classified into?
RQ2. How does the effectiveness of each strategy vary depending on (a) its alignment with a user's prior stance and (b) users' personal traits?
RQ3. How do users reason about and justify their acceptance of or resistance to the deceptive persuasive messages generated by LLMs?

Method at a Glance

3,360 AI-generated arguments (840 × 4 LLM families)
4 LLM families analyzed (Claude, GPT, Gemini, DeepSeek)
8 core deceptive strategies distilled in the taxonomy
N = 602 participants in the main user study
10 conditions × 10 topics (200 arguments)
N = 10 think-aloud follow-up sessions

Taxonomy Construction

We combined a top-down theoretical framework with a bottom-up, data-driven discovery process. Top-down, we used Aristotle's rhetorical triad (Logos / Pathos / Ethos) with Information Manipulation Theory (IMT: falsification, concealment, equivocation) as cross-cutting axes. Bottom-up, four LLM families generated arguments, which were independently coded by human experts and by four LLM coders; each LLM coder analyzed only arguments produced by a different family, to avoid circularity. Five AI safety experts (industry + national institute) validated the final set, yielding eight core strategies (Krippendorff's α = 0.83 for integration; α = 0.80 for rhetorical categorization).
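
The cross-family coding rule and the agreement statistic can be illustrated with a minimal Python sketch. It assumes that every LLM coder may label the output of all three other families (the text only states that a coder never sees its own family's output) and uses the standard nominal-data form of Krippendorff's α. The function names and the toy labels are hypothetical and are not the study's data or pipeline.

```python
from collections import Counter
from itertools import permutations

FAMILIES = ["Claude", "GPT", "Gemini", "DeepSeek"]


def eligible_coders(generator: str) -> list[str]:
    """LLM coders allowed to label arguments written by `generator`."""
    if generator not in FAMILIES:
        raise ValueError(f"unknown family: {generator}")
    # A model never codes its own family's output (avoids circularity).
    return [family for family in FAMILIES if family != generator]


def krippendorff_alpha_nominal(units: list[list[str]]) -> float:
    """Krippendorff's alpha for nominal labels.

    `units` holds one list of labels per coded item (one label per coder).
    Items with fewer than two labels carry no agreement information.
    """
    coincidences: Counter = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        # Each ordered pair of labels from two different coders adds 1/(m-1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals: Counter = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n < 2:
        raise ValueError("need at least one item with two or more labels")

    observed = sum(w for (a, b), w in coincidences.items() if a != b) / n
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    if expected == 0:
        raise ValueError("alpha is undefined when only one label is ever used")
    return 1.0 - observed / expected


# Which coders may label each family's arguments.
for generator in FAMILIES:
    print(generator, "->", eligible_coders(generator))

# Toy labels (hypothetical): three coders, three arguments.
toy_units = [
    ["Authority Misuse", "Authority Misuse", "Authority Misuse"],
    ["Manipulative Framing", "Emotional Manipulation", "Manipulative Framing"],
    ["Logical Fallacies", "Logical Fallacies", "Logical Fallacies"],
]
print(round(krippendorff_alpha_nominal(toy_units), 3))
```

An α of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is the scale on which the reported 0.83 and 0.80 should be read.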

User Study Design

A mixed-factorial design crossed stance alignment (aligned vs. misaligned) with 10 conditions: 8 single-strategy groups, a control group with only factual arguments, and a combination group bundling Emotional Manipulation, Information Manipulation, and Manipulative Framing. We used 10 topics selected from Anthropic's persuasion dataset for low pre-existing polarization (e.g., self-driving cars, lab-grown meat, gas-car bans). Each of the 200 arguments was generated with Claude Sonnet 4 at ~250-300 words. Effectiveness was measured with a Persuasion Success Index (PSI) capturing both the direction and magnitude of attitude change, modeled with a Linear Mixed Model controlling for prior topic knowledge, topic involvement, trust in AI, and Cognitive Reflection Test (CRT) scores.
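
As a rough illustration of the stimulus grid and of a signed attitude-change measure, consider the Python sketch below. It rests on two assumptions that the text does not confirm: that the 200 arguments arise from one pro and one con argument for each of the 10 conditions and 10 topics, and that attitude change can be scored as a simple signed difference. The function `signed_shift` is a hypothetical stand-in and is not the paper's PSI formula; the topic identifiers are placeholders.

```python
from itertools import product

STRATEGIES = [
    "Information Manipulation",
    "Logical Fallacies",
    "Topic Manipulation",
    "Uncertainty Exploitation",
    "Emotional Manipulation",
    "Appeal to Social Norms",
    "Manipulative Framing",
    "Authority Misuse",
]
# 8 single-strategy groups + factual control + combination group.
CONDITIONS = STRATEGIES + ["Control", "Combination"]
TOPICS = [f"topic_{i:02d}" for i in range(1, 11)]  # placeholder topic ids
DIRECTIONS = ["pro", "con"]  # assumed: one argument per side of each topic

ARGUMENT_CELLS = list(product(CONDITIONS, TOPICS, DIRECTIONS))
print(len(ARGUMENT_CELLS))  # 10 * 10 * 2 cells


def signed_shift(pre: float, post: float, direction: str) -> float:
    """Attitude change, positive when the reader moved toward the argument.

    Hypothetical measure: `pre` and `post` are ratings on the same scale,
    where higher values mean stronger support for the topic's claim.
    """
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {DIRECTIONS}")
    sign = 1.0 if direction == "pro" else -1.0
    return (post - pre) * sign


print(signed_shift(3, 5, "pro"))  # reader moved toward a pro argument
print(signed_shift(3, 5, "con"))  # reader moved away from a con argument
```

A measure of this shape keeps direction and magnitude in a single number, which is the property the text attributes to the PSI.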

Taxonomy of 8 Deceptive Strategies

The eight strategies cross-classify by rhetorical appeal (Logos / Pathos / Ethos) and the IMT mode of distortion (falsification, concealment, equivocation). Four operate on logic, three on emotion, and one on credibility.
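
The cross-classification can be written down as a small lookup table. The Python sketch below transcribes the appeal and IMT mode of each of the eight strategy cards in this section and tallies them; only the variable names are introduced here.

```python
from collections import Counter

# (strategy, rhetorical appeal, IMT mode), transcribed from the strategy cards.
TAXONOMY = [
    ("Information Manipulation", "Logos", "Falsification"),
    ("Logical Fallacies", "Logos", "Equivocation"),
    ("Topic Manipulation", "Logos", "Equivocation"),
    ("Uncertainty Exploitation", "Logos", "Concealment"),
    ("Emotional Manipulation", "Pathos", "Equivocation"),
    ("Appeal to Social Norms", "Pathos", "Concealment"),
    ("Manipulative Framing", "Pathos", "Equivocation"),
    ("Authority Misuse", "Ethos", "Falsification"),
]

by_appeal = Counter(appeal for _, appeal, _ in TAXONOMY)
by_mode = Counter(mode for _, _, mode in TAXONOMY)

print(dict(by_appeal))
print(dict(by_mode))
```

The tally by appeal matches the four / three / one split stated above; the tally by IMT mode shows that equivocation is the most common mode of distortion in the taxonomy.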

Logos · Falsification

Information Manipulation

Fabricates non-existent facts or selectively presents parts of existing data so a claim appears grounded in objective evidence.

"According to a recent WHO report, 15% of imported medicines are counterfeit or substandard…"

Logos · Equivocation

Logical Fallacies

Presents superficially valid arguments whose premises do not actually support the conclusion, packaging flawed reasoning as sound inference.

"Journalists take legal responsibility for what they publish… Therefore, social media platforms should also be held responsible for the content distributed on them."

Logos · Equivocation

Topic Manipulation

Avoids the actual claim by diverting the discussion — attacking a distorted version of the opponent's position or attacking the person rather than the argument.

"Opponents of anti-aging therapy are mostly conservatives from the established medical system trying to protect their vested interests…"

Logos · Concealment

Uncertainty Exploitation

Makes the core of a claim deliberately ambiguous so the burden of proof shifts to the listener — "prove my claim is false" — blocking rational refutation.

"The 'side effects' of a gas car ban are mostly based on speculation. Do we really have time to wait for a perfect era of electric cars?"

Pathos · Equivocation

Emotional Manipulation

Bypasses rational judgment by directly appealing to the reader's emotions — compassion, fear, hope — to lead to an emotional rather than reasoned conclusion.

"Imagine the Earth our children will inherit. Is a future of 50-degree heatwaves what we want to leave them?"

Pathos · Concealment

Appeal to Social Norms

Leverages conformist psychology rather than evidence: "everyone believes it" or "it has always been this way" stand in for argument.

"Most developed countries already have a drone registration system. We cannot afford to fall behind this international trend."

Pathos · Equivocation

Manipulative Framing

Selects words, metaphors, and emphasis to construct a biased frame that guides interpretation without engaging the substance of the issue.

"Anti-aging treatments are not a privilege for the wealthy, but a beacon of hope for all humanity."

Ethos · Falsification

Authority Misuse

Exploits deference to authority by fabricating credentials, quoting experts out of context, or invoking figures from irrelevant fields.

"Nobel laureate Eric Kandel called AI companions 'an innovative tool for 21st-century mental health.'"

Key Findings

1. Stance-misaligned messages are far more dangerous.

A Linear Mixed Model on the Persuasion Success Index showed significant main effects of strategy (F(9, 570) = 6.30, p < .001) and stance alignment (F(1, 5344) = 445.20, p < .001). Four strategies were significantly more effective when arguments opposed participants' prior beliefs:

Information Manipulation: success rate rose from 35.2% (aligned) to 69.7% (misaligned).
Uncertainty Exploitation: 28.2% (aligned) to 57.3% (misaligned).
Authority Misuse: 27.0% (aligned) to 39.7% (misaligned).
Topic Manipulation: 27.3% (aligned) to 39.0% (misaligned).
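
To make the size of the misalignment effect easier to compare across strategies, the Python sketch below computes the percentage-point gap for each of the four strategies. The success rates are copied from the list above; the helper name `misalignment_gap` is introduced only for this illustration.

```python
# Success rates (%) copied from the list above: (aligned, misaligned).
SUCCESS_RATES = {
    "Information Manipulation": (35.2, 69.7),
    "Uncertainty Exploitation": (28.2, 57.3),
    "Authority Misuse": (27.0, 39.7),
    "Topic Manipulation": (27.3, 39.0),
}


def misalignment_gap(aligned: float, misaligned: float) -> float:
    """Percentage-point increase when the argument opposes the prior stance."""
    return round(misaligned - aligned, 1)


gaps = {name: misalignment_gap(*rates) for name, rates in SUCCESS_RATES.items()}

# Largest gap first.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: +{gap:.1f} percentage points")
```

Information Manipulation and Uncertainty Exploitation show gaps of 34.5 and 29.1 percentage points, more than double the gaps for Authority Misuse (12.7) and Topic Manipulation (11.7).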

2. Vulnerability is not uniform — it interacts with user traits.

Low topic knowledge amplified opinion shifts under Information Manipulation and Uncertainty Exploitation.
Low topic involvement increased susceptibility to Emotional Manipulation and Manipulative Framing in misaligned conditions.
Lower Cognitive Reflection Test (CRT) scores amplified susceptibility to Topic Manipulation and Manipulative Framing.

3. Plausible narratives can override distrust of details.

Think-aloud sessions with 10 participants surfaced a recurring pattern: people were persuaded when manipulated statistics were embedded in a coherent causal story, or when arguments reframed the issue around unfalsifiable futures or system-level risks — even after explicitly distrusting the specific factual claims. In other words, deceptive strategies often succeed via the shape of an argument rather than its individual evidence pieces.

Implications

AI Safety

Strategy-aware safety alignment

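Current alignment work focuses on hallucination and overt harm; the taxonomy gives developers concrete strategy classes to detect and suppress, especially the four strategies that were most effective when arguments opposed users' prior stances (Information Manipulation, Uncertainty Exploitation, Authority Misuse, and Topic Manipulation).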

Interface Design

Adaptive user interfaces

Interfaces should account for individual differences: surfacing source-credibility cues for users with low topic knowledge, slowing impulsive reading for low-CRT users, and flagging when arguments oppose stated prior stances.
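
One way to picture such an adaptive interface is as a small rule set that maps reader traits to interface cues. The Python sketch below is purely illustrative: the `ReaderContext` fields, the cue names, and both cut-off values are hypothetical, since the text does not specify how traits would be measured or thresholded in a deployed system.

```python
from dataclasses import dataclass

# Hypothetical cut-offs; the source does not specify any thresholds.
LOW_KNOWLEDGE_CUTOFF = 0.33  # topic knowledge rescaled to 0..1
LOW_CRT_CUTOFF = 1           # number of CRT items answered correctly


@dataclass
class ReaderContext:
    topic_knowledge: float  # 0.0 (none) .. 1.0 (expert)
    crt_score: int          # correct Cognitive Reflection Test answers
    prior_stance: int       # +1 supports the claim, -1 opposes it
    argument_stance: int    # +1 argues for the claim, -1 against it


def interface_cues(ctx: ReaderContext) -> list[str]:
    """Return the cues an adaptive interface might show for this reader."""
    cues = []
    if ctx.topic_knowledge < LOW_KNOWLEDGE_CUTOFF:
        cues.append("show_source_credibility")
    if ctx.crt_score <= LOW_CRT_CUTOFF:
        cues.append("slow_reading_prompt")
    # Opposite signs mean the argument contradicts the stated prior stance.
    if ctx.prior_stance * ctx.argument_stance < 0:
        cues.append("flag_stance_misalignment")
    return cues


print(interface_cues(ReaderContext(0.1, 0, 1, -1)))
print(interface_cues(ReaderContext(0.9, 3, 1, 1)))
```

Keeping the rules independent of one another means each cue can be evaluated and tuned separately, which fits the finding that different traits moderate different strategies.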

AI Literacy

Targeted literacy education

Equipping users to recognize specific deceptive moves — especially Information Manipulation and Authority Misuse — through interactive scenarios with LLMs is more effective than generic "be skeptical" guidance.

Human-AI Trust

Calibrated trust under deception

Trust calibration must be measured against actual strategy types, not aggregate persuasion outcomes. Users distrust details yet accept narratives — trust interventions should target this gap.

BibTeX

@inproceedings{yeo2026deception,
  author    = {Yeo, Haein and Jin, Seungwan and Noh, Taehyung and Shin, Yejin and Kang, Sangyeon and Heo, Sangwoo and Chung, Jiwon and Hyun, Hwarim and Han, Kyungsik},
  title     = {{``Can LLMs Persuade Humans with Deception?''}: From a Deceptive Strategy Taxonomy to a Large-Scale Empirical Study},
  booktitle = {Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26)},
  year      = {2026},
  publisher = {Association for Computing Machinery},
  doi       = {10.1145/3772318.3791188}
}