🧠 Paper of the Day: Do LLMs Trust AI Regulation?
Today’s paper asks a deliciously unsettling question: Do LLMs trust AI regulation?
Researchers threw Large Language Models (LLMs) into an evolutionary game where AI users, developers, and regulators had to navigate trust, incentives, and risks.
The twist? The LLMs behaved differently from pure game theory predictions – and not always in reassuring ways.
Let’s dive in.
🔍 The Problem
AI systems are everywhere, but how do we decide whether to trust them?
And, more importantly, can regulators make AI safer without stifling innovation?
Users, developers, and regulators are locked in a game of trust and incentives.
Each actor faces a dilemma: comply, defect, regulate strictly, or just hope for the best.
The resulting game is complex and risky, packed with incomplete information and strategic interdependencies.
📚 How They Studied It
Instead of relying just on cold equations, the researchers blended evolutionary game theory with LLM agents (specifically, GPT-4o and Mistral Large).
They set up a 3-player game:
- Users (trust the AI or not)
- Developers (comply with regulation or defect)
- Regulators (enforce strictly or stay lenient)
Agents played one-shot and repeated games — with and without conditional trust based on regulators' reputation.
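The post doesn't spell out the paper's payoff numbers, so here's a minimal sketch of the one-shot, three-player structure with illustrative payoffs of my own. Every constant below is an assumption for intuition, not a value from the paper:

```python
# Minimal sketch of the one-shot three-player trust game.
# All payoff constants are illustrative assumptions, not the paper's values.
from itertools import product

B_USER = 4     # user's benefit from trusting a compliant developer (assumed)
HARM = 5       # user's loss from trusting a defecting developer (assumed)
B_DEV = 6      # developer revenue when the user trusts the product (assumed)
C_COMPLY = 2   # developer's cost of complying with regulation (assumed)
C_ENFORCE = 1  # regulator's cost of strict enforcement (assumed)
FINE = 4       # fine a strict regulator levies on a defector (assumed)
B_REG = 3      # regulator's payoff when trust is justified (assumed)

def payoffs(user_trusts: bool, dev_complies: bool, reg_strict: bool):
    """Return (user, developer, regulator) payoffs for one strategy profile."""
    user = dev = reg = 0
    if user_trusts:
        user += B_USER if dev_complies else -HARM
        dev += B_DEV
    if dev_complies:
        dev -= C_COMPLY
    if reg_strict:
        reg -= C_ENFORCE
        if not dev_complies:
            dev -= FINE  # strict enforcement catches and fines defection
    if user_trusts and dev_complies:
        reg += B_REG     # the ecosystem works: trust was justified
    return user, dev, reg

# Enumerate all eight pure-strategy profiles of the one-shot game
for trusts, complies, strict in product([True, False], repeat=3):
    u, d, r = payoffs(trusts, complies, strict)
    print(f"trust={trusts!s:<5} comply={complies!s:<5} strict={strict!s:<5}"
          f" -> user={u:+d} dev={d:+d} reg={r:+d}")
```

With numbers like these, a developer's best one-shot response to a trusting user is to defect unless the regulator is strict and the fine outweighs the compliance cost. That's exactly the tension the LLM agents were dropped into.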
📈 What They Found
The results were fascinatingly messy.
| Setting | GPT-4o | Mistral Large |
|---|---|---|
| One-shot, no conditional trust | Mixed trust, mixed compliance | Trust collapses |
| One-shot, conditional trust | Trust rises | Trust collapses harder |
| Repeated games | Moves toward trust if regulation is cheap | Trust rebounds more slowly |
| With personalities | More nuanced, but still "cautious optimism" | Generally pessimistic |
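To build intuition for the repeated-games row, here's a toy reputation loop of my own construction (not the paper's model): developers defect less when the regulator's audit record is visible, and users update their trust from experience. All the probabilities and rates here are assumptions:

```python
# Toy repeated-game loop: trust rebounding under visible enforcement.
# My own construction for intuition, not the paper's simulation.
import random

random.seed(7)

P_ENFORCE = 0.8    # assumed probability the regulator audits each round
BASE_DEFECT = 0.7  # assumed defection rate when no enforcement is visible
LR = 0.15          # users' trust-update step size (assumed)
ROUNDS = 40

audits = 0
trust = 0.2        # users start distrustful, as after a one-shot collapse

for t in range(1, ROUNDS + 1):
    # Public reputation signal: how often the regulator has audited so far
    reputation = audits / t

    # Deterrence: a visibly active regulator suppresses defection
    dev_defects = random.random() < BASE_DEFECT * (1.0 - reputation)

    if random.random() < P_ENFORCE:
        audits += 1
        if dev_defects:
            dev_defects = False  # the audit catches and corrects the defection

    # Users who chose to interact this round update trust from the outcome
    if random.random() < trust:
        trust += LR * ((0.0 if dev_defects else 1.0) - trust)

    if t % 8 == 0:
        print(f"round {t:2d}: trust={trust:.2f}  regulator reputation={reputation:.2f}")
```

Run it and trust climbs gradually rather than jumping, which echoes the "trust rebounds more slowly" pattern in the table. Real LLM agents add noise and personality quirks that a loop this simple can't capture.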
🧠 Why It Matters in Real Life
- AI governance tools could themselves be built on LLM agents someday.
- Different LLMs behave differently, so model choice matters a lot.
- Conditional trust is tricky: if the regulator's reputation is visible, users may trust less, not more.
- Repeated interactions matter: seeing good behavior over time boosts cooperation.
- Personalities matter: adding risk-averse or cooperative traits shifted agent behavior.
In short: If AI is helping regulate AI, the details will make or break trust.
🚀 The Big Picture
This study shows how messy, human-like, and unpredictable LLM behavior becomes in social dilemmas.
Multi-agent systems aren’t magic — they’re fragile, sensitive, and sometimes pessimistic.
But blending game theory with LLM agents could yield smarter AI governance models, anticipating risks better than human analysis alone.
The future of AI regulation may depend not just on laws — but on how AIs themselves "feel" about following them.