Multi-Agent AI Risk Topology Framework

Classifying interaction-emergent risks by source of behavioural correlation — Working Paper, February 2026

[Figure: axis "Source of Behavioural Correlation", running from exogenous (T1) to endogenous (T3); endogeneity increases along the axis.]

Scaling Laws of Harm — Scope Boundary

T0 Linear

2× users → 2× harm

f(n) = n

T2 Superlinear

N orchestrated > N × single

f(n) > n

T3 Emergent

Qualitatively new outcomes

f(n) ≠ g(1)·n

Only superlinear and emergent scaling fall within the scope of the multi-agent risk framework. Linear scaling (T0) is technology proliferation and is addressed by existing single-agent governance.
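The scope boundary can be illustrated with a crude numerical check. This is a sketch under stated assumptions, not part of the framework: the function name `classify_scaling`, the input format, and the tolerance are hypothetical. It compares measured aggregate harm f(n) with the linear baseline n·f(1). Emergent (T3) outcomes differ in kind rather than size, so the sketch cannot detect them and relies on an analyst-supplied flag.

```python
def classify_scaling(harm_by_n, tol=0.05, qualitatively_new=False):
    """Label a harm curve as T0 (linear), T2 (superlinear) or T3 (emergent).

    harm_by_n: dict mapping number of agents n -> measured aggregate harm f(n);
               must contain n = 1 with positive harm (hypothetical input format).
    tol: relative slack allowed above the linear baseline n * f(1).
    qualitatively_new: set by an analyst when outcomes differ in kind, not size.
    """
    if qualitatively_new:
        return "T3 emergent"
    base = harm_by_n[1]
    # Ratio of observed harm to the linear baseline for every n > 1.
    ratios = [harm_by_n[n] / (n * base) for n in sorted(harm_by_n) if n > 1]
    if any(r > 1 + tol for r in ratios):
        return "T2 superlinear"
    return "T0 linear or below (out of scope)"


print(classify_scaling({1: 1.0, 2: 2.0, 4: 4.0}))
print(classify_scaling({1: 1.0, 2: 3.0, 4: 10.0}))
```

The harm figures in the usage lines are invented for illustration only.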

Signal endogenisation is the mechanistic account of how systems move from T1 toward T3. An initially arbitrary coordination signal acquires meaning through local feedback dynamics, conditional on environmental structure supporting convergence.

Sunspot (near T1)

An exogenous, arbitrary signal selects among pre-existing equilibria. No interaction history is needed. The signal is semantically arbitrary: any random variable works. If agents’ memories were wiped and the signal reintroduced, the equilibrium would reconstitute immediately. This is a selection mechanism, not a construction mechanism.

Governance intervention: Remove or randomise the public signal
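A minimal sketch of the sunspot case, assuming a symmetric two-action coordination game with invented payoffs (the payoff table, `is_self_enforcing`, and the example conventions are all hypothetical). It checks that following a public signal is self-enforcing whatever the signal is called, and that the check needs no interaction history.

```python
# Symmetric 2x2 coordination game (illustrative payoffs, not from the paper).
# PAYOFF[(my_action, other_action)] -> my payoff
PAYOFF = {("A", "A"): 2, ("B", "B"): 1, ("A", "B"): 0, ("B", "A"): 0}
ACTIONS = ("A", "B")


def is_self_enforcing(convention):
    """True if no agent gains by deviating from `convention` at any signal value.

    convention: dict mapping each public signal value -> prescribed action.
    Assumes the other agent follows the convention. Uses no memory of past play.
    """
    for prescribed in convention.values():
        follow = PAYOFF[(prescribed, prescribed)]
        best_deviation = max(
            PAYOFF[(a, prescribed)] for a in ACTIONS if a != prescribed
        )
        if best_deviation > follow:
            return False
    return True


# Any labelling of the signal works: it carries no intrinsic meaning.
weather_rule = {"sunny": "A", "rainy": "B"}
flipped_rule = {"sunny": "B", "rainy": "A"}
coin_rule = {0: "A", 1: "B"}
print(is_self_enforcing(weather_rule), is_self_enforcing(flipped_rule),
      is_self_enforcing(coin_rule))
```

Because both (A, A) and (B, B) are already equilibria of the underlying game, the signal only selects between them, which is why removing or randomising it is enough to break the coordination.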

Cheap Talk (transition)

Signal starts arbitrary, acquires meaning through repeated interaction. Three-step feedback: signal → behavioural response → outcome → updated signal. Creates new strategic possibilities that didn’t exist without the signal channel. Requires ongoing behavioural maintenance. Wipe agents’ memories and the equilibrium collapses.

Governance intervention: Constrain communication channels (agents may route around the constraint)
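The feedback loop described above can be sketched with a two-state signalling game trained by simple urn reinforcement. This model is swapped in for illustration and is not taken from the paper; all names (`fresh_memory`, `train`, `success_rate`) and the round count are assumptions. The signals start with no meaning, acquire one through reinforcement on successful outcomes, and lose it again when memory is reset.

```python
import random

STATES = SIGNALS = ACTS = (0, 1)


def fresh_memory():
    # One urn per state (sender) and per signal (receiver), all weights equal:
    # the signal starts out meaningless.
    return {
        "sender": {s: {m: 1.0 for m in SIGNALS} for s in STATES},
        "receiver": {m: {a: 1.0 for a in ACTS} for m in SIGNALS},
    }


def draw(rng, urn):
    # Sample a key of `urn` with probability proportional to its weight.
    options = list(urn)
    return rng.choices(options, weights=[urn[o] for o in options])[0]


def train(memory, rounds, rng):
    for _ in range(rounds):
        state = rng.choice(STATES)
        signal = draw(rng, memory["sender"][state])    # signal
        act = draw(rng, memory["receiver"][signal])    # behavioural response
        if act == state:                               # outcome
            memory["sender"][state][signal] += 1.0     # updated signal use
            memory["receiver"][signal][act] += 1.0
    return memory


def success_rate(memory):
    """Exact probability that the receiver's act matches the state."""
    total = 0.0
    for state in STATES:
        s_urn = memory["sender"][state]
        for signal in SIGNALS:
            r_urn = memory["receiver"][signal]
            p_signal = s_urn[signal] / sum(s_urn.values())
            p_act = r_urn[state] / sum(r_urn.values())
            total += p_signal * p_act / len(STATES)
    return total


rng = random.Random(0)
trained = train(fresh_memory(), 20_000, rng)
print(success_rate(trained))         # well above chance after training
print(success_rate(fresh_memory()))  # 0.5: wiping memory restores chance level
```

The contrast with the sunspot case is in the last line: the coordination lives in the learned weights, so resetting them removes it.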

Implicit Coordination (T3)

There is no separate communication channel. Agents’ payoff-relevant actions are the signals: the price in algorithmic collusion, the military posture in an arms race. The coordination mechanism and the strategic interaction are one and the same, so there is nothing to “remove.”

Governance intervention: Restructure the environment itself (market microstructure, information rules, payoff timing)

Intervention difficulty increases along the continuum

Signal removal (trivial) → channel constraint (agents route around) → environmental restructuring (requires institutional redesign)

Convergence Condition

Endogenisation succeeds or fails depending on environmental structure — payoff structure (temptation to defect), temporal structure (shadow of the future), and information structure (signal observability). The agent-level mechanism is the same; the environment determines whether it converges to cooperation, collusion, or oscillation.
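One standard way to make the interplay of temptation and shadow of the future concrete is the repeated prisoner's dilemma with a grim-trigger strategy. This is a textbook model used here as an assumption, not the paper's own formalisation. With temptation T, mutual-cooperation reward R, and mutual-defection punishment P (T > R > P), cooperating forever is worth R/(1−δ), while defecting once is worth T + δP/(1−δ), so coordination is sustainable when δ ≥ (T − R)/(T − P).

```python
def critical_discount_factor(temptation, reward, punishment):
    """Smallest discount factor at which grim trigger sustains coordination."""
    if not temptation > reward > punishment:
        raise ValueError("expected temptation > reward > punishment")
    return (temptation - reward) / (temptation - punishment)


def coordination_sustainable(delta, temptation, reward, punishment):
    # delta is the weight agents place on the next round (shadow of the future).
    return delta >= critical_discount_factor(temptation, reward, punishment)


# Illustrative payoffs only.
print(critical_discount_factor(5, 3, 1))        # 0.5
print(coordination_sustainable(0.6, 5, 3, 1))   # True
print(coordination_sustainable(0.6, 9, 3, 1))   # False: higher temptation
```

The same inequality applies whether the coordinated outcome is socially desirable (cooperation) or harmful (collusion); the formula does not distinguish between them.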

The environment has two components: the objective structure (rules, constraints — does not change with agent behaviour) and the effective incentive landscape (objective structure filtered through all agents’ behaviour — changes constantly).

The feedback loop between these components drives the T1→T3 transition. Five dimensions determine whether interactions produce harmful or benign equilibria:

Information Structure

Public vs private signals; common coupling signals (e.g. market price); observability of others’ actions and states. Determines whether agents can coordinate, whether oversight can detect coordination, and whether information asymmetries create exploitation opportunities.

Payoff Structure

Zero/positive/mixed-sum; continuous vs binary outcomes; reversible vs irreversible. Binary payoffs (elections) convert small perturbations into large irreversible consequences. Same agent capability, vastly different harm depending on payoff structure.
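The amplification effect of binary payoffs can be shown with a small worked example. The vote shares and function names below are invented for illustration. The same perturbation of 0.2 percentage points produces a negligible change under a continuous payoff and a complete reversal under a winner-takes-all payoff.

```python
def continuous_payoff(vote_share):
    # Payoff proportional to share, e.g. seats under proportional allocation.
    return vote_share


def binary_payoff(vote_share, threshold=0.5):
    # Winner-takes-all: everything above the threshold, nothing below it.
    return 1.0 if vote_share > threshold else 0.0


def payoff_swing(payoff, before, after):
    """Change in payoff caused by moving the vote share from before to after."""
    return payoff(after) - payoff(before)


before, after = 0.499, 0.501  # hypothetical shares around the threshold
print(payoff_swing(continuous_payoff, before, after))  # about 0.002
print(payoff_swing(binary_payoff, before, after))      # 1.0
```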

Coupling Structure

Network topology; mediating institutions (markets, platforms, registries); direct vs environment-mediated interaction. Determines cascade paths, contagion dynamics, and whether interventions can be localised.
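How topology shapes cascade paths can be sketched with a simple threshold contagion model. The model is an assumption chosen for illustration: a node fails once at least half of its neighbours have failed, and the star network and node names are hypothetical. The hub stands in for a mediating institution.

```python
def cascade(neighbours, seeds, threshold=0.5):
    """Return the set of failed nodes once the cascade stops spreading.

    neighbours: dict mapping node -> list of adjacent nodes.
    seeds: nodes that fail initially.
    A node fails when the failed share of its neighbours reaches `threshold`.
    """
    failed = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbours.items():
            if node in failed or not nbrs:
                continue
            share = sum(n in failed for n in nbrs) / len(nbrs)
            if share >= threshold:
                failed.add(node)
                changed = True
    return failed


# Star network: four agents coupled only through one mediating hub.
star = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"], "b": ["hub"], "c": ["hub"], "d": ["hub"],
}
print(sorted(cascade(star, {"hub"})))  # ['a', 'b', 'c', 'd', 'hub']
print(sorted(cascade(star, {"a"})))    # ['a']
```

In this toy network a failure at the periphery stays local, while a failure at the hub reaches every agent, which is the sense in which coupling structure decides whether an intervention can be localised.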

Temporal Structure

Simultaneous vs sequential; one-shot vs repeated; commitment mechanisms; shadow of the future. Repeated interaction enables both cooperation and collusion. Temporal structure determines whether cheap talk can endogenise.

Regulatory Structure

Jurisdiction-bound governance vs global interaction scope; observation boundaries; enforcement mechanisms. The primary generator of the international cooperation threshold: when interaction scope exceeds governance scope, gaps emerge by construction.

Key Distinction

Homogeneity of agents (correlated behaviour from shared training) ≠ homogeneity of desired outcomes (convergent outcomes from heterogeneous agents due to payoff structure). The USSR and the West had different capabilities and knowledge but reached the same equilibrium, because the payoff structure of mutually assured destruction (MAD) dominated agent heterogeneity.

International Cooperation Threshold

The primary determinant is structural mismatch: when agents’ interaction scope exceeds any single regulator’s observation scope, governance gaps emerge by construction. Modulated by environmental amplification (binary payoffs), harm irreversibility, and capability democratisation. T3 dynamics almost always require international cooperation because emergent interaction dynamics are jurisdiction-agnostic.
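The structural-mismatch condition can be read as a set comparison. The sketch below is a deliberate simplification with hypothetical names (`needs_cooperation`, the regulator labels, the jurisdictions): it flags a gap when no single regulator's observation scope covers every jurisdiction the agents interact across. It ignores the modulating factors listed above.

```python
def needs_cooperation(interaction_scope, observation_scopes):
    """True if no single regulator observes the whole interaction scope.

    interaction_scope: iterable of jurisdictions the agents interact across.
    observation_scopes: dict mapping regulator -> iterable of jurisdictions
                        that regulator can observe.
    """
    scope = set(interaction_scope)
    return not any(scope <= set(obs) for obs in observation_scopes.values())


# Hypothetical regulators, each bound to one jurisdiction.
regulators = {"regulator_x": {"X"}, "regulator_y": {"Y"}}
print(needs_cooperation({"X"}, regulators))        # False
print(needs_cooperation({"X", "Y"}, regulators))   # True
```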
