Research Source
Towards a Science of Scaling Agent Systems (arXiv:2512.08296)
180 controlled experiments across 3 LLM families and 4 agentic benchmarks · Predictive accuracy: 87%
One brain, zero coordination overhead
One reasoning locus with all perception, reasoning, and action in a single sequential loop.
Topology
Sequential loop with unified memory stream
Single-Agent System
One brain handles everything sequentially
Best For
- Sequential tasks requiring full context integration
- Low-latency requirements (<100ms)
- Simple to moderate complexity tasks
- Tasks with limited tool usage (≤4 tools)
- Budget-constrained projects
Limitations
- Limited capacity for task decomposition
- Single point of failure
- May struggle with highly parallelizable tasks
When to Choose
Start here unless you have clear evidence that task decomposition will help. The research shows a single-agent system (SAS) often matches or beats multi-agent systems (MAS).
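The single-agent pattern above can be sketched as one sequential loop over a unified memory stream; `reason` here is a hypothetical stand-in for an LLM call and the `FINAL:` convention is illustrative, not from the paper.

```python
# Minimal single-agent loop: one reasoning locus, one unified memory
# stream, strictly sequential reason -> act steps.
# `reason` stands in for an LLM call; tools are plain functions.

def run_single_agent(task, tools, reason, max_steps=8):
    memory = [f"task: {task}"]               # unified memory stream
    for _ in range(max_steps):
        thought = reason(memory)             # reason over the full context
        if thought.startswith("FINAL:"):     # model decides it is done
            return thought[len("FINAL:"):].strip()
        name, _, arg = thought.partition(" ")
        result = tools[name](arg)            # act: one tool call at a time
        memory.append(f"{thought} -> {result}")  # observation feeds back in
    return None                              # step budget exhausted
```

Because one locus sees the whole history, there is zero coordination overhead, at the cost of no parallelism.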
Work alone, combine at the end
Agents work in isolation with results aggregated. No peer communication.
Topology
Agent-to-aggregator only (no peer communication)
Independent Multi-Agent
Work independently, combine results at end
Best For
- Embarrassingly parallel tasks
- Tasks where diversity of attempts is valuable
- Simple aggregation scenarios
Limitations
- Highest error amplification (17.2x)
- No error correction between agents
- Duplicates errors without correction opportunities
- Universal underperformance vs. SAS (-70% on some tasks)
When to Choose
Avoid this architecture. Research consistently shows it performs worse than alternatives due to error amplification and wasted parallel effort without verification.
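Although the guidance is to avoid this architecture, a sketch makes the failure mode concrete: with no peer communication, a majority-vote aggregator ratifies whatever error most agents share. The majority-vote merge is an illustrative choice, not the paper's aggregator.

```python
# Independent multi-agent: N isolated attempts, combined only at the end.
# With no peer communication there is no error correction: if most agents
# make the same mistake, the aggregator amplifies it.
from collections import Counter

def independent_mas(task, agents):
    answers = [agent(task) for agent in agents]       # fully isolated runs
    winner, _ = Counter(answers).most_common(1)[0]    # majority-vote merge
    return winner
```

If two of three agents share a systematic error, the wrong answer wins the vote, which is the amplification effect described above.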
One boss coordinates specialist workers
Central orchestrator coordinates specialized agents.
Topology
Orchestrator-to-agents communication only
Centralized Multi-Agent
Boss assigns tasks to specialists
Best For
- Naturally decomposable tasks (revenue, cost, market analysis)
- Tasks requiring specialized domain expertise
- Information synthesis from multiple sources
- Financial analysis (+80.8% improvement observed)
Limitations
- High coordination overhead (285%)
- Counterproductive for sequential tasks
- Orchestrator becomes bottleneck
- Artificial subtask decomposition wastes tokens
When to Choose
Choose when your task naturally splits into independent subtasks that can be worked on in parallel by specialists, and an orchestrator can reliably synthesize outputs.
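The orchestrator-to-agents topology can be sketched as decompose, fan out to specialists, synthesize; `decompose`, the role names, and `synthesize` are illustrative stand-ins, not the paper's implementation.

```python
# Centralized multi-agent: an orchestrator decomposes the task, routes
# subtasks to specialists, and synthesizes the results. Specialists
# never talk to each other (orchestrator-to-agent communication only).
from concurrent.futures import ThreadPoolExecutor

def centralized_mas(task, decompose, specialists, synthesize):
    subtasks = decompose(task)           # e.g. revenue / cost / market analysis
    with ThreadPoolExecutor() as pool:   # specialists run in parallel
        results = list(pool.map(
            lambda s: specialists[s["role"]](s["query"]), subtasks))
    return synthesize(results)           # single merge point (the bottleneck)
```

Note that both decomposition and synthesis pass through one node, which is where the coordination overhead and bottleneck cited above accumulate.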
All agents communicate with all others
All agents communicate with all other agents (all-to-all topology).
Topology
All-to-all peer communication
Decentralized Multi-Agent
Everyone communicates with everyone
Best For
- Tasks benefiting from parallel exploration
- Consensus-building scenarios
- Distributed information gathering
- Tasks where redundancy provides error correction
Limitations
- High coordination overhead (263%)
- Communication complexity grows quadratically
- Higher error amplification (7.8x)
When to Choose
Choose when parallel exploration and cross-checking genuinely help (e.g., tool-heavy or multi-perspective problems) and latency is not critical. Communication cost grows as O(n²).
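The all-to-all exchange can be sketched as repeated rounds in which each agent re-answers with every peer's previous answer in context; the two-argument agent signature and round count are assumptions for illustration.

```python
# Decentralized multi-agent: in every round after the first, each agent
# sees all peers' previous answers (all-to-all), so message volume per
# round grows O(n^2) in the number of agents.

def decentralized_mas(task, agents, rounds=2):
    answers = [agent(task, []) for agent in agents]    # round 1: independent
    for _ in range(rounds - 1):                        # consensus rounds
        answers = [agent(task, answers[:i] + answers[i + 1:])
                   for i, agent in enumerate(agents)]  # peers' answers as input
    return answers
```

Each extra round multiplies the quadratic communication cost, which is why latency-critical tasks are a poor fit.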
Hierarchical control + peer collaboration
Combines centralized orchestration with limited peer-to-peer communication.
Topology
Orchestrator plus limited peer-to-peer
Hybrid Multi-Agent
Hierarchical + peer-to-peer communication
Best For
- Complex tasks requiring both coordination and collaboration
- Scenarios needing hierarchical control with peer verification
- Tasks with natural sub-group structures
Limitations
- Highest overhead (515%)
- Lowest efficiency (0.074)
- Collapses on tool-heavy benchmarks
- Most complex to implement and debug
When to Choose
Only for genuinely complex tasks where simpler architectures have failed and you can afford very high coordination overhead. Protocol complexity increases failure modes.
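One way to sketch the hybrid pattern: the orchestrator assigns one subtask per sub-group, peers inside a group cross-check each other's drafts before reporting up, and the orchestrator merges the group reports. The in-group majority vote is an illustrative stand-in for peer verification.

```python
# Hybrid multi-agent: hierarchical assignment plus limited peer-to-peer
# verification inside each sub-group. Two coordination layers is why this
# topology carries the highest overhead of the five architectures.
from collections import Counter

def hybrid_mas(task, decompose, groups, synthesize):
    reports = []
    for subtask, group in zip(decompose(task), groups):
        drafts = [agent(subtask) for agent in group]     # parallel peer drafts
        agreed, _ = Counter(drafts).most_common(1)[0]    # in-group verification
        reports.append(agreed)                           # one report per group
    return synthesize(reports)                           # hierarchical merge
```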
Key findings on when multi-agent systems help vs hurt
Tool Coordination Tradeoff
Tool-heavy tasks (T>4) suffer disproportionately from multi-agent coordination overhead.
Threshold: 4
Capability Saturation
Coordination yields diminishing returns once the single-agent baseline exceeds ~45%.
Threshold: 0.45
Critical Complexity Threshold
A domain complexity threshold at D ≈ 0.40 determines MAS viability.
Threshold: 0.4
Overhead Threshold
At T=16 tools, the overhead threshold is ~150%; beyond it, coordination cost exceeds the benefit.
Decomposability Requirement
Coordination benefits depend on task decomposability rather than team size.
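The thresholds above can be folded into a single hedged decision rule; the function name, parameter names, and the all-must-pass logic are assumptions for illustration, not the paper's predictive model.

```python
# Illustrative gate built from the reported thresholds. Each check maps
# to one finding; real tasks sit on a continuum, so treat this as a
# first-pass filter, not a verdict.

def mas_looks_viable(num_tools, sa_baseline, domain_complexity,
                     decomposable, overhead_pct):
    """True only when every reported threshold favors a multi-agent system."""
    return (decomposable                     # decomposability, not team size
            and num_tools <= 4               # tool-heavy tasks (T > 4) suffer
            and sa_baseline <= 0.45          # past this, coordination saturates
            and domain_complexity >= 0.40    # critical complexity D ~ 0.40
            and overhead_pct <= 150)         # overhead ceiling (reported at T=16)
```

A task that is decomposable, tool-light, below the capability-saturation baseline, and above the complexity threshold passes; failing any one check argues for staying with a single agent.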
