🧠 Paper of the Day: Do LLMs Trust AI Regulation?
Today’s paper asks a deliciously unsettling question: Do LLMs trust AI regulation?
Researchers threw Large Language Models (LLMs) into an evolutionary game where AI users, developers, and regulators had to navigate trust, incentives, and risks.
The twist? The LLMs behaved differently from pure game theory predictions – and not always in reassuring ways.
Let’s dive in.
🔍 The Problem
AI systems are everywhere, but how do we decide whether to trust them?
And, more importantly, can regulators make AI safer without stifling innovation?
Users, developers, and regulators are locked in a game of trust and incentives.
Each actor faces a dilemma: comply, defect, regulate strictly, or just hope for the best.
The resulting game is complex and risky, packed with incomplete information and strategic interdependencies.
📚 How They Studied It
Instead of relying just on cold equations, the researchers blended evolutionary game theory with LLM agents (specifically, GPT-4o and Mistral Large).
They set up a 3-player game:
- Users (trust the AI or not)
- Developers (comply with regulation or defect)
- Regulators (enforce strictly or stay lenient)
Agents played one-shot and repeated games — with and without conditional trust based on regulators' reputation.
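The post doesn't spell out the paper's payoff numbers, so here's a minimal sketch of the one-shot, three-player structure with illustrative payoffs of my own. Every constant below is an assumption for intuition, not a value from the paper:

```python
# Minimal sketch of the one-shot three-player trust game.
# All payoff constants are illustrative assumptions, not the paper's values.
from itertools import product

B_USER = 4     # user's benefit from trusting a compliant developer (assumed)
HARM = 5       # user's loss from trusting a defecting developer (assumed)
B_DEV = 6      # developer revenue when the user trusts the product (assumed)
C_COMPLY = 2   # developer's cost of complying with regulation (assumed)
C_ENFORCE = 1  # regulator's cost of strict enforcement (assumed)
FINE = 4       # fine a strict regulator levies on a defector (assumed)
B_REG = 3      # regulator's payoff when trust is justified (assumed)

def payoffs(user_trusts: bool, dev_complies: bool, reg_strict: bool):
    """Return (user, developer, regulator) payoffs for one strategy profile."""
    user = dev = reg = 0
    if user_trusts:
        user += B_USER if dev_complies else -HARM
        dev += B_DEV
    if dev_complies:
        dev -= C_COMPLY
    if reg_strict:
        reg -= C_ENFORCE
        if not dev_complies:
            dev -= FINE  # strict enforcement catches and fines defection
    if user_trusts and dev_complies:
        reg += B_REG     # the ecosystem works: trust was justified
    return user, dev, reg

# Enumerate all eight pure-strategy profiles of the one-shot game
for trusts, complies, strict in product([True, False], repeat=3):
    u, d, r = payoffs(trusts, complies, strict)
    print(f"trust={trusts!s:<5} comply={complies!s:<5} strict={strict!s:<5}"
          f" -> user={u:+d} dev={d:+d} reg={r:+d}")
```

With numbers like these, a developer's best one-shot response to a trusting user is to defect unless the regulator is strict and the fine outweighs the compliance cost. That's exactly the tension the LLM agents were dropped into.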
📈 What They Found
The results were fascinatingly messy.
| Setting | GPT-4o | Mistral Large |
|---|---|---|
| One-shot, no conditional trust | Mixed trust, mixed compliance | Trust collapses |
| One-shot, conditional trust | Trust rises | Trust collapses harder |
| Repeated games | Moves toward trust if regulation is cheap | Trust rebounds more slowly |
| With personalities | More nuanced, but still "cautious optimism" | Generally pessimistic |
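To build intuition for the repeated-games row, here's a toy reputation loop of my own construction (not the paper's model): developers defect less when the regulator's audit record is visible, and users update their trust from experience. All the probabilities and rates here are assumptions:

```python
# Toy repeated-game loop: trust rebounding under visible enforcement.
# My own construction for intuition, not the paper's simulation.
import random

random.seed(7)

P_ENFORCE = 0.8    # assumed probability the regulator audits each round
BASE_DEFECT = 0.7  # assumed defection rate when no enforcement is visible
LR = 0.15          # users' trust-update step size (assumed)
ROUNDS = 40

audits = 0
trust = 0.2        # users start distrustful, as after a one-shot collapse

for t in range(1, ROUNDS + 1):
    # Public reputation signal: how often the regulator has audited so far
    reputation = audits / t

    # Deterrence: a visibly active regulator suppresses defection
    dev_defects = random.random() < BASE_DEFECT * (1.0 - reputation)

    if random.random() < P_ENFORCE:
        audits += 1
        if dev_defects:
            dev_defects = False  # the audit catches and corrects the defection

    # Users who chose to interact this round update trust from the outcome
    if random.random() < trust:
        trust += LR * ((0.0 if dev_defects else 1.0) - trust)

    if t % 8 == 0:
        print(f"round {t:2d}: trust={trust:.2f}  regulator reputation={reputation:.2f}")
```

Run it and trust climbs gradually rather than jumping, which echoes the "trust rebounds more slowly" pattern in the table. Real LLM agents add noise and personality quirks that a loop this simple can't capture.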
🧠 Why It Matters in Real Life
- AI governance tools could themselves be built on LLM agents someday.
- Different LLMs behave differently, so model choice matters a lot.
- Conditional trust is tricky: if the regulator's reputation is visible, users may trust less, not more.
- Repeated interactions matter: seeing good behavior over time boosts cooperation.
- Personalities matter: adding risk-averse or cooperative traits shifted agent behavior.
In short: If AI is helping regulate AI, the details will make or break trust.
🚀 The Big Picture
This study shows how messy, human-like, and unpredictable LLM behavior becomes in social dilemmas.
Multi-agent systems aren’t magic — they’re fragile, sensitive, and sometimes pessimistic.
But blending game theory with LLM agents could yield smarter AI governance models, anticipating risks better than human analysis alone.
The future of AI regulation may depend not just on laws — but on how AIs themselves "feel" about following them.