Research Source
Towards a Science of Scaling Agent Systems (arXiv:2512.08296)
180 controlled experiments across 3 LLM families and 4 agentic benchmarks • Predictive accuracy: 87%
One brain, zero coordination overhead
A single reasoning locus: perception, reasoning, and action all happen in one sequential loop.
Topology
Sequential loop with unified memory stream
Topology Diagram
┌─────────────────┐
│  Single Agent   │
│  ┌───────────┐  │
│  │  Perceive │  │
│  │     ↓     │  │
│  │   Reason  │  │
│  │     ↓     │  │
│  │    Act    │  │
│  └───────────┘  │
└─────────────────┘
Best For
- Sequential tasks requiring full context integration
- Low-latency requirements (<100ms)
- Simple to moderate complexity tasks
- Tasks with limited tool usage (≤4 tools)
- Budget-constrained projects
Limitations
- Limited capacity for task decomposition
- Single point of failure
- May struggle with highly parallelizable tasks
When to Choose
Start here unless you have clear evidence that task decomposition will help. The research shows a single-agent system (SAS) often matches or beats a multi-agent system (MAS) on many tasks.
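A minimal sketch of this loop, assuming a caller-supplied `llm` callable (any text-in/text-out model) and a toy tool registry; the prompt format and helper names are illustrative, not from the paper:

```python
from typing import Callable

# Illustrative tool registry; the single agent keeps every tool in one loop.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"(stub) results for {q!r}",
    "echo": lambda text: text,
}

def single_agent(task: str, llm: Callable[[str], str], max_steps: int = 10) -> str:
    """Perceive -> reason -> act in one sequential loop over a unified memory stream."""
    memory = [f"TASK: {task}"]
    for _ in range(max_steps):
        # Reason: the entire memory stream stays in context at every step.
        decision = llm("\n".join(memory) + "\nReply 'tool_name: arg' or 'FINAL: answer'.")
        if decision.startswith("FINAL:"):
            return decision[len("FINAL:"):].strip()
        # Act: run the chosen tool, then perceive its result on the next iteration.
        name, _, arg = decision.partition(":")
        tool = TOOLS.get(name.strip(), lambda a: f"unknown tool: {name}")
        memory.append(f"{decision} -> {tool(arg.strip())}")
    return memory[-1]  # step budget exhausted; return the last observation
```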
Work alone, combine at the end
Agents work in isolation with results aggregated. No peer communication.
Topology
Agent-to-aggregator only (no peer communication)
Topology Diagram
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Agent A │ │ Agent B │ │ Agent C │
└────┬────┘ └────┬────┘ └────┬────┘
     │           │           │
     └───────────┼───────────┘
                 ↓
          ┌──────────────┐
          │  Aggregator  │
          └──────────────┘
Best For
- Embarrassingly parallel tasks
- Tasks where diversity of attempts is valuable
- Simple aggregation scenarios
Limitations
- Highest error amplification (17.2x)
- No error correction between agents
- Duplicates errors without correction opportunities
- Underperforms SAS across the board (down to -70% on some tasks)
When to Choose
Avoid this architecture. Research consistently shows it performs worse than alternatives due to error amplification and wasted parallel effort without verification.
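For reference only, a sketch of the fan-out/aggregate shape this pattern describes, with a caller-supplied `llm` callable; it is included mainly to make the missing verification step visible, and all names and prompts are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def parallel_independent(task: str, llm: Callable[[str], str], n_agents: int = 3) -> str:
    """Fan out to isolated agents, then merge with a single aggregation call.
    No peer communication, so an error in any branch is aggregated, never corrected."""
    prompts = [f"[agent {i}] Solve this independently: {task}" for i in range(n_agents)]
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        drafts = list(pool.map(llm, prompts))
    merged = "\n\n".join(f"Attempt {i + 1}:\n{d}" for i, d in enumerate(drafts))
    return llm(f"Aggregate these independent attempts into one answer:\n\n{merged}")
```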
One boss coordinates specialist workers
Central orchestrator coordinates specialized agents.
Topology
Orchestrator-to-agents communication only
Topology Diagram
         ┌──────────────────┐
         │   Orchestrator   │
         └────────┬─────────┘
     ┌────────────┼────────────┐
     ↓            ↓            ↓
┌──────────┐ ┌──────────┐ ┌──────────┐
│Specialist│ │Specialist│ │Specialist│
│    A     │ │    B     │ │    C     │
└──────────┘ └──────────┘ └──────────┘
Best For
- Naturally decomposable tasks (revenue, cost, market analysis)
- Tasks requiring specialized domain expertise
- Information synthesis from multiple sources
- Financial analysis (+80.8% improvement observed)
Limitations
- High coordination overhead (285%)
- Counterproductive for sequential tasks
- Orchestrator becomes a bottleneck
- Artificial subtask decomposition wastes tokens
When to Choose
Choose when your task naturally splits into independent subtasks that can be worked on in parallel by specialists, and an orchestrator can reliably synthesize outputs.
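A sketch of the dispatch/synthesis flow for a task that splits along the lines above; the `llm` callable, specialist roles, and prompts are illustrative assumptions, not the paper's implementation:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Illustrative specialist roles for a naturally decomposable analysis task.
SPECIALISTS = {
    "revenue": "You are a revenue analyst.",
    "cost": "You are a cost analyst.",
    "market": "You are a market analyst.",
}

def orchestrator_worker(task: str, llm: Callable[[str], str]) -> str:
    """Orchestrator decomposes, specialists work in parallel, orchestrator synthesizes.
    All messages flow through the orchestrator; workers never talk to each other."""
    # 1. Decompose: one subtask per specialist (only worth it when the split is natural).
    subtasks = {
        name: llm(f"As orchestrator, state the {name}-related subtask of: {task}")
        for name in SPECIALISTS
    }
    # 2. Dispatch: each specialist works only on its own subtask.
    prompts = [f"{SPECIALISTS[name]}\n{sub}" for name, sub in subtasks.items()]
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        results = dict(zip(subtasks, pool.map(llm, prompts)))
    # 3. Synthesize: the orchestrator is the single integration point (and the bottleneck).
    report = "\n\n".join(f"{name.upper()}:\n{text}" for name, text in results.items())
    return llm(f"Synthesize one answer to '{task}' from these specialist reports:\n\n{report}")
```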
All agents communicate with all others
All agents communicate with all other agents (all-to-all topology).
Topology
All-to-all peer communication
Topology Diagram
┌─────────┐ ←───→ ┌─────────┐
│ Agent A │       │ Agent B │
└────┬────┘       └────┬────┘
     │  ╲           ╱  │
     │      ╲   ╱      │
     │      ╱   ╲      │
     │  ╱           ╲  │
┌────┴────┐       ┌────┴────┐
│ Agent C │ ←───→ │ Agent D │
└─────────┘       └─────────┘
Best For
- Tasks benefiting from parallel exploration
- Consensus-building scenarios
- Distributed information gathering
- Tasks where redundancy provides error correction
Limitations
- High coordination overhead (263%)
- Communication complexity grows quadratically
- Higher error amplification (7.8x)
When to Choose
Choose when parallel exploration and cross-checking genuinely help (e.g. multi-perspective or consensus-building problems) and latency is not critical; communication cost grows O(n²) with the number of agents.
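A sketch of one synchronous all-to-all round, again with a caller-supplied `llm` callable; stopping criteria, message pruning, and the final consensus step are left out, and all names and prompts are illustrative:

```python
from typing import Callable

def mesh_debate(task: str, llm: Callable[[str], str],
                n_agents: int = 4, rounds: int = 2) -> list[str]:
    """Every agent reads every other agent's latest message each round (all-to-all),
    so messages per round grow as n * (n - 1), i.e. O(n^2)."""
    views = [llm(f"[agent {i}] Give your initial answer to: {task}") for i in range(n_agents)]
    for _ in range(rounds):
        snapshot = list(views)  # synchronous round: everyone reads the same snapshot
        for i in range(n_agents):
            peers = "\n".join(f"agent {j}: {v}" for j, v in enumerate(snapshot) if j != i)
            views[i] = llm(
                f"[agent {i}] Task: {task}\nPeer answers:\n{peers}\n"
                "Revise your answer, correcting any peer errors you can spot."
            )
    return views  # a separate step (vote, judge, or merge) would pick the final answer
```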
Hierarchical control + peer collaboration
Combines centralized orchestration with limited peer-to-peer communication.
Topology
Orchestrator plus limited peer-to-peer
Topology Diagram
        ┌──────────────────┐
        │   Orchestrator   │
        └────────┬─────────┘
     ┌───────────┼───────────┐
     ↓           ↓           ↓
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Team A  │ │ Team B  │ │ Team C  │
│ ┌─┐ ┌─┐ │ │ ┌─┐ ┌─┐ │ │ ┌─┐ ┌─┐ │
│ │A│↔│B│ │ │ │C│↔│D│ │ │ │E│↔│F│ │
│ └─┘ └─┘ │ │ └─┘ └─┘ │ │ └─┘ └─┘ │
└─────────┘ └─────────┘ └─────────┘
Best For
- Complex tasks requiring both coordination and collaboration
- Scenarios needing hierarchical control with peer verification
- Tasks with natural sub-group structures
Limitations
- Highest overhead (515%)
- Lowest efficiency (0.074)
- Collapses on tool-heavy benchmarks
- Most complex to implement and debug
When to Choose
Only for genuinely complex tasks where simpler architectures have failed and you can afford very high coordination overhead. Protocol complexity increases failure modes.
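A sketch of that shape, with one orchestrator, three two-agent teams, and a single peer-revision pass inside each team; the team names, prompts, and `llm` callable are illustrative assumptions:

```python
from typing import Callable, Dict, List, Optional

def hybrid(task: str, llm: Callable[[str], str],
           teams: Optional[Dict[str, List[str]]] = None) -> str:
    """Orchestrator -> teams -> peer exchange inside each team -> orchestrator synthesis.
    Every added layer adds coordination overhead, so keep teams and rounds small."""
    teams = teams or {"Team A": ["A", "B"], "Team B": ["C", "D"], "Team C": ["E", "F"]}
    team_reports = {}
    for team, members in teams.items():
        subtask = llm(f"As orchestrator, assign {team}'s share of: {task}")
        # Peer collaboration: each member drafts, then revises against its teammates' drafts.
        drafts = {m: llm(f"[{m}] Work on: {subtask}") for m in members}
        revised = {
            m: llm(
                f"[{m}] Your draft:\n{drafts[m]}\nTeammate drafts:\n"
                + "\n".join(d for peer, d in drafts.items() if peer != m)
                + "\nRevise your draft, resolving any disagreements."
            )
            for m in members
        }
        team_reports[team] = llm(f"Merge {team}'s revised drafts:\n\n" + "\n\n".join(revised.values()))
    combined = "\n\n".join(f"{t}:\n{r}" for t, r in team_reports.items())
    return llm(f"As orchestrator, synthesize the final answer to '{task}':\n\n{combined}")
```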
Key findings from the research on when multi-agent systems help vs hurt
Tool Coordination Tradeoff
Tool-heavy tasks (T>4) suffer disproportionately from multi-agent coordination overhead.
Threshold: 4
Capability Saturation
Coordination yields diminishing returns beyond a ~45% single-agent baseline.
Threshold: 0.45
Critical Complexity Threshold
A domain complexity threshold at D≈0.40 determines MAS viability.
Threshold: 0.4
Overhead Threshold
For T=16 tools, the overhead threshold is ~150%; beyond it, coordination cost exceeds the benefit.
Decomposability Requirement
Coordination benefits depend on task decomposability rather than team size.
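One way to encode these findings as a pre-flight check before building a multi-agent system. The threshold values are the ones reported above; the gate directions (in particular, treating D≈0.40 as a floor for MAS viability) and how the inputs are estimated are this sketch's assumptions:

```python
def mas_worth_trying(num_tools: int,
                     single_agent_baseline: float,
                     domain_complexity: float,
                     decomposable: bool) -> bool:
    """Gate a multi-agent design behind the reported thresholds; if any gate fails, stay single-agent."""
    if num_tools > 4:                 # tool-coordination tradeoff: T > 4 suffers from overhead
        return False
    if single_agent_baseline > 0.45:  # capability saturation: coordination gains diminish past ~45%
        return False
    if domain_complexity < 0.40:      # complexity threshold: below D ≈ 0.40, MAS viability drops
        return False
    return decomposable               # benefits track decomposability, not team size

# Example: 3 tools, a weak 30% single-agent baseline, a complex but decomposable task.
print(mas_worth_trying(num_tools=3, single_agent_baseline=0.30,
                       domain_complexity=0.55, decomposable=True))  # True
```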
