Chapter 18: Advanced — Escalation and Architect
Learning Objectives
Master the "stuck auto-escalation" pattern, so that a multi-agent system escalates to a stronger agent when it faces a difficult problem instead of getting stuck in an infinite loop.
Core Contradiction
If there is only one developer (sonnet) plus an infinite-loop threshold (e.g., 3 failures):
sonnet attempt 1 → fail
sonnet attempt 2 → fail (same error)
sonnet attempt 3 → fail (still same error)
→ threshold triggered, stop
→ human intervention
Problem: Letting sonnet hit the same error three times and then stopping is wasteful. Once the threshold is reached, a stronger agent should be swapped in before a human is called.
Three-Tier Escalation Chain
flowchart TB
Start["Group N Start"] --> Dev["developer (sonnet)"]
Dev -->|Success| Done["✓ APPROVED"]
Dev -->|3 Failed Rounds| DeepDev["developer-deep (opus)"]
DeepDev --> PathA["Path A: Flag Spec Defect<br/>(No Code Written)"]
DeepDev --> PathB["Path B: Differentiated Redo<br/>(Using Structurally Different Approach)"]
PathA --> Arch["architect (opus)"]
PathB -->|Success| Done
PathB -->|Still Fails| Arch
Arch --> Stuck["STUCK.md<br/>(Diagnosis + 3 Options)"]
Stuck --> Human["⚠️ Human Decision"]
style Dev fill:#bbdefb
style DeepDev fill:#ffccbc
style Arch fill:#e1bee7
style Done fill:#c8e6c9
style Human fill:#fff9c4
The Essence of the Deep Agent: Not More Computing Power, But Questioning Assumptions
developer (sonnet) default assumption:
"The spec is correct, the design is correct, I just need to implement them."
→ Repeated attempts with a faulty spec = perpetual failure
developer-deep (opus) should assume:
"The first three rounds failed, indicating potential defects in the spec/design.
I should pause to question before attempting implementation."
→ Find the root cause, either fix the spec or change the approach.
This is the true value of deep — not "smarter," but "bolder in questioning."
Path A: Flagging Defects
If the deep agent finds that the spec is contradictory or that a Scenario cannot be implemented, it writes no code and records the concern instead:
[Append to end of review/N.md]
## Developer Concern (escalated round)
**Suspected Defect:** spec
**Specific Item:** Requirement: Synchronization Strategy / Scenario: Manual Step Timeline
**Conflict:** Scenario requires "no command injection," but D5 requires "typing in the terminal."
What to type if a manual step has no commands? These two points contradict each other.
**Suggested Resolution:**
- option A: Fix Scenario by adding "manual step skips typing phase"
- option B: Fix D5 by changing "typing" to "optional step"
→ It then prints DEVELOPER-DEEP: PATH=A to stdout, which tells the main Claude to dispatch the architect.
Path B: Differentiated Redo
If the spec is fine, but the previous approach was wrong:
Previous attempt 1: Use subprocess.Popen to control terminal → fail (typing not smooth)
Previous attempt 2: Same subprocess + sleep adjustment → fail (race condition)
Previous attempt 3: Add select() polling → fail (inconsistent macOS behavior)
deep sees these are all "patching subprocess" → change approach:
PATH=B: Switch to libtmux's send_keys + capture_pane
→ Structurally different approach
→ Won't hit the same pitfalls
→ It prints DEVELOPER-DEEP: PATH=B (commit abc123) to stdout, and the work then proceeds to the tester as usual.
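The routing on the PATH marker can be sketched in a few lines of Python. This is only an illustration: the marker strings come from the examples above, while the function names `parse_deep_marker` and `next_agent`, and the `"unknown"` fallback for output without a marker, are assumptions made for this sketch.

```python
import re
from typing import Optional, Tuple

# Marker format as shown in the text:
#   DEVELOPER-DEEP: PATH=A
#   DEVELOPER-DEEP: PATH=B (commit abc123)
MARKER_RE = re.compile(
    r"^DEVELOPER-DEEP: PATH=([AB])(?: \(commit ([0-9a-f]+)\))?$"
)


def parse_deep_marker(stdout: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (path, commit) from the last marker line, or None if absent."""
    for line in reversed(stdout.splitlines()):
        match = MARKER_RE.match(line.strip())
        if match:
            return match.group(1), match.group(2)
    return None


def next_agent(stdout: str) -> str:
    """Hypothetical routing: PATH=A goes to the architect, PATH=B to the tester."""
    parsed = parse_deep_marker(stdout)
    if parsed is None:
        # Assumption: output without a marker is left for the caller to handle.
        return "unknown"
    path, _commit = parsed
    return "architect" if path == "A" else "tester"
```

Anchoring the pattern to the whole line keeps ordinary log output that merely mentions "PATH=A" from being mistaken for the marker.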
Architect: Fallback Diagnosis
When deep also gets stuck, the architect steps in. It does not write code. It only reads the entire history and provides a diagnosis plus 3 actionable options:
# Stuck: Group 7 — Orchestrator (Synchronization Strategy)
**Diagnosed At:** 2026-05-08 14:32
**Failed Agents:** developer (3 rounds) → developer-deep (PATH=A) → tester-deep (PATH=A)
**Root Cause Category:** Spec defect
## Evidence Chain
- Round 1~3 (sonnet): developer repeatedly failed on D5 synchronization strategy
- Round 4 (deep, PATH A): flagged "Scenario manual step contradicts D5"
- Round 5 (tester-deep, PATH A): also flagged Scenario as untestable
## Root Cause Analysis
spec.md line 47 Scenario "Manual step timeline" requires "no command injection,"
but design.md D5 line 12 stipulates "terminal must type." The two directly contradict.
A manual step is a pure explanatory step without terminal commands and should not involve typing.
## Recommended Decisions
### Option A: Fix Scenario
Change spec.md line 47 Scenario to "no command injection, skip typing phase"
Cost: Minimal, only 1 line change
Risk: None
### Option B: Fix D5
Change design.md D5 to "terminal typing only occurs when commands are present"
Cost: Minor text adjustment to D5
Risk: Need to confirm that all code dependent on D5 does not assume "typing is mandatory"
### Option C: Deconstruct Manual Step
Completely remove manual step from spec, move to independent capability
Cost: Significant refactoring
Risk: Affects other groups
## Recommendation
Option A — minimal change, does not affect other decisions.
→ The user can make a decision within 30 seconds after reading this. This is the architect's entire value — saving you 30 minutes of reading failure logs.
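Because the 30-second decision depends on the report having a predictable shape, it can help to check a STUCK.md draft mechanically before showing it to the user. The sketch below is one possible check, not part of the workflow described here: the function name `check_stuck_report` and the specific rules (exactly three `### Option X:` headings, a root cause category line, and a recommendation line that starts with one of the listed options) are assumptions derived from the sample report.

```python
import re
from typing import List

OPTION_RE = re.compile(r"^### Option ([A-Z]):", re.MULTILINE)
RECOMMENDED_RE = re.compile(r"^Option ([A-Z])\b", re.MULTILINE)


def check_stuck_report(text: str) -> List[str]:
    """Return the problems found in a STUCK.md draft (empty list means OK)."""
    problems = []

    # The architect is expected to offer exactly three options.
    options = OPTION_RE.findall(text)
    if len(options) != 3:
        problems.append(f"expected 3 options, found {len(options)}")

    if "**Root Cause Category:**" not in text:
        problems.append("missing root cause category")

    # The recommendation line should name one of the listed options.
    recommended = RECOMMENDED_RE.search(text)
    if recommended is None or recommended.group(1) not in options:
        problems.append("recommendation does not name a listed option")

    return problems
```

A report that fails this check can be sent back to the architect for another pass rather than forwarded to the human.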
Infinite Loop Thresholds (Where to Set the Numbers)
testFailRound[N] >= 3 → escalate to developer-deep
reviewerRound[N] >= 3 → escalate to developer-deep
testFailRound[N] >= 5 → escalate to architect
reviewerRound[N] >= 3 + deep already used → escalate to architect
deep PATH=A → directly escalate to architect
E2E failure >= 3 → escalate to architect
The thresholds should be neither too large (human intervention arrives late) nor too small (frequent interruptions). The values 3 and 5 are empirical.
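The rules above can be collected into a single decision function. The following is a minimal sketch under stated assumptions: the `GroupState` dataclass, its field names, and `escalation_target` are hypothetical names chosen to mirror the counters in the rules, and checking the architect rules first is a design choice of this sketch so that the stronger escalation wins when several rules match.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Counters for one group; field names mirror the rules in the text."""
    test_fail_round: int = 0
    reviewer_round: int = 0
    e2e_failures: int = 0
    deep_used: bool = False
    deep_path: str = ""  # "A", "B", or "" if deep has not reported yet


def escalation_target(state: GroupState) -> str:
    """Return "architect", "developer-deep", or "developer"."""
    # Architect rules are checked first so the stronger escalation wins.
    if state.deep_path == "A":
        return "architect"
    if state.test_fail_round >= 5:
        return "architect"
    if state.reviewer_round >= 3 and state.deep_used:
        return "architect"
    if state.e2e_failures >= 3:
        return "architect"
    # Otherwise, three failed rounds of either kind bring in the deep agent.
    if state.test_fail_round >= 3 or state.reviewer_round >= 3:
        return "developer-deep"
    return "developer"
```

For example, `escalation_target(GroupState(test_fail_round=3))` returns `"developer-deep"`, while the same state with `test_fail_round=5` returns `"architect"`.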
Notification Integration (Marker Mode)
Each escalation outputs a line with a ⚠️ marker:
⚠️ Group 7: escalating to developer-deep (reason: testFailRound=3)
⚠️ Group 7: STUCK — architect diagnosed root cause = Spec defect
A Stop hook listens for lines starting with ⚠️ and pushes them to Telegram. This notification design is detailed in Ch 26.
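The filtering half of such a hook can be sketched as follows. This is an illustration only: the function names are hypothetical, and the actual delivery is passed in as a `send` callback so that the sketch makes no claim about how the Telegram push is implemented. The check matches on the base warning sign (U+26A0) so that it works whether or not the emoji variation selector is present.

```python
from typing import Callable, List

WARNING_SIGN = "\u26a0"  # base code point of the warning emoji


def escalation_lines(output: str) -> List[str]:
    """Return the lines of `output` that start with the warning marker."""
    return [line for line in output.splitlines() if line.startswith(WARNING_SIGN)]


def forward_escalations(output: str, send: Callable[[str], None]) -> int:
    """Pass each marked line to `send` and return how many were forwarded."""
    lines = escalation_lines(output)
    for line in lines:
        send(line)
    return len(lines)
```

In a test, `send` can simply be `some_list.append`, and in a real hook it would be whatever function delivers the message.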
Anti-Patterns
❌ deep only changes model, not strategy: using opus like sonnet, wasteful
❌ deep prompt says "use deeper reasoning": model won't change behavior just for this, need clearly different workflows
❌ architect also writes code: out of scope, compromises diagnostic independence
❌ threshold too high (>10): human intervention too late
❌ threshold too low (=2): frequent interruptions, autonomy becomes nominal
What You Can Do Now
- Design your project's escalation chain
- When writing a deep agent, make it "question assumptions" instead of "try again"
- Design the architect's STUCK.md template
- Choose infinite loop thresholds