Chapter 17 | Model Selection Strategy

6 MIN READ | UPDATED: 2026-05-15

Learning Objectives

Assign models by role to avoid "burning cash with all Opus" or "getting stuck with all Sonnet."

Three-Tier Comparison (Data as of May 2026)

| Model | Input | Output | Context | Strengths | Weaknesses |
|---|---|---|---|---|---|
| Haiku 4.5 | $1/MTok | $5/MTok | 200k | Extremely fast, cheap | Weak at complex reasoning |
| Sonnet 4.6 | $3/MTok | $15/MTok | 1M | King of cost-effectiveness | Limited for top-tier architectural thinking |
| Opus 4.7 | $5/MTok | $25/MTok | 1M | Deep reasoning, finding inconsistencies | Most expensive tier (about 1.7x Sonnet, 5x Haiku per token) |

→ At these list prices, Opus costs about 1.7x as much per token as Sonnet ($5 vs. $3 input, $25 vs. $15 output). If Sonnet can solve it, don't use Opus.
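
Price ratios between tiers can be computed directly from the table's list prices. The sketch below is only an illustration: `PRICES` and `price_ratio` are hypothetical names introduced here, not part of any SDK, and the numbers are copied from the table.

```python
# List prices in USD per million tokens (MTok), copied from the table above.
PRICES = {
    "haiku": {"input": 1.0, "output": 5.0},
    "sonnet": {"input": 3.0, "output": 15.0},
    "opus": {"input": 5.0, "output": 25.0},
}


def price_ratio(expensive: str, cheap: str, kind: str = "output") -> float:
    """Return how many times more `expensive` costs than `cheap` per token."""
    return PRICES[expensive][kind] / PRICES[cheap][kind]


# Compare each pair of tiers on output-token price.
for pair in [("opus", "sonnet"), ("sonnet", "haiku"), ("opus", "haiku")]:
    print(f"{pair[0]} vs {pair[1]}: {price_ratio(*pair):.2f}x")
```

This is a per-token comparison only. What a task actually costs also depends on how many tokens each model consumes to finish it.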

Role Matching Formula

flowchart TD
    Q["What is the primary function of this role?"]
    Q -->|Mass execution<br/>Repetitive operations| Sonnet["sonnet<br/>(Cost-effective)"]
    Q -->|Deep reasoning<br/>Find inconsistencies| Opus["opus<br/>(Deep thinking)"]
    Q -->|Simple classification<br/>Batch processing| Haiku["haiku<br/>(Cheap)"]
    Sonnet --> SonnetEx["developer / tester /<br/>e2e-tester"]
    Opus --> OpusEx["reviewer / architect /<br/>developer-deep"]
    Haiku --> HaikuEx["Log classification / Simple formatting<br/>(Not used in this project)"]
    style Sonnet fill:#bbdefb
    style Opus fill:#ffccbc
    style Haiku fill:#c5e1a5

Final Configuration for Our doc2video Project

| Agent | Model | Reason |
|---|---|---|
| developer | sonnet | Repeatedly implements tasks, high volume |
| developer-deep | opus | Stuck and escalated, needs to question spec |
| tester | sonnet | Translating scenarios is a mechanical task |
| tester-deep | opus | Judging if a scenario is testable requires insight |
| e2e-tester | sonnet | Black-box execution of commands, checking output |
| reviewer | opus | Finding code inconsistencies is Opus's strength |
| architect | opus | Cross-team diagnosis requires a global perspective |
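
One way to keep these assignments in a single place is a plain lookup table in code. The sketch below is an assumption about how you might store them, not the project's actual configuration format. `AGENT_MODELS`, `model_for`, and `agents_by_model` are hypothetical names, and the values are tier names rather than real model identifiers.

```python
# Agent-to-model assignments from the table above (tier names only;
# real model identifiers depend on your tooling and are not shown here).
AGENT_MODELS = {
    "developer": "sonnet",
    "developer-deep": "opus",
    "tester": "sonnet",
    "tester-deep": "opus",
    "e2e-tester": "sonnet",
    "reviewer": "opus",
    "architect": "opus",
}


def model_for(agent: str) -> str:
    """Look up the model tier for an agent, failing loudly on unknown names."""
    try:
        return AGENT_MODELS[agent]
    except KeyError:
        raise ValueError(f"no model assigned to agent {agent!r}") from None


def agents_by_model() -> dict[str, list[str]]:
    """Group agent names by tier to review the split at a glance."""
    groups: dict[str, list[str]] = {}
    for agent, model in AGENT_MODELS.items():
        groups.setdefault(model, []).append(agent)
    return groups


print(agents_by_model())
```

Failing on an unknown agent name, rather than falling back to a default tier, makes a typo in a role name visible immediately instead of silently running the wrong model.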

Cost Estimation (Based on Actual Project Runs)

A medium-sized team (5 tasks) completes a full cycle:
  developer (sonnet) 1-2 rounds     → ~$0.30
  tester (sonnet) 1-2 rounds        → ~$0.20
  reviewer (opus) 1-2 rounds        → ~$0.50
  ─────────────────────────────────
  Subtotal                          ~$1.00

One escalation (3 failed rounds → developer-deep):
  + developer-deep (opus) 1 round  → ~$0.40
  ─────────────────────────────────
  Total including escalation        ~$1.40

→ For our doc2video project, with 13 teams, 61 tasks, and several escalations, we estimate $15 to $30 to complete the entire project. Developing the same scope by hand would take 1-2 weeks, or at least $5,000 in labor, so AI collaboration comes out more than 100x cheaper.
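
The arithmetic behind this kind of estimate can be sketched in a few lines. The per-role costs and the 13-team figure come from the text above. The escalation counts in the loop are hypothetical, and the sketch assumes exactly one full cycle per team, which is a simplification.

```python
# Approximate per-cycle costs in USD, taken from the estimate above.
CYCLE_COSTS = {"developer": 0.30, "tester": 0.20, "reviewer": 0.50}
ESCALATION_COST = 0.40  # one developer-deep (opus) round


def project_estimate(teams: int, escalations: int) -> float:
    """Sum one full cycle per team plus a flat cost per escalation."""
    per_cycle = sum(CYCLE_COSTS.values())
    return round(teams * per_cycle + escalations * ESCALATION_COST, 2)


# 13 teams is from the text; the escalation counts are hypothetical.
for escalations in (0, 5, 10):
    print(escalations, project_estimate(13, escalations))
```

Because it counts only one cycle per team, this gives a rough lower bound. Teams that need additional cycles would push the total higher.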

Escalate to Opus, Don't Start with It

flowchart LR
    Bad["All Opus<br/>(top-tier price on every task)"] --> BadResult["Every team uses top-tier<br/>Simple tasks also burn cash"]
    Good["Default Sonnet<br/>Escalate to Opus when stuck"] --> GoodResult["80% of tasks run cheaply<br/>Only complex problems burn cash"]
    style Bad fill:#ffcdd2
    style Good fill:#c8e6c9

→ This is the escalation mechanism we'll discuss in Chapter 18.
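
As a rough preview, the "default Sonnet, escalate when stuck" idea can be sketched as a retry loop. This is not the mechanism Chapter 18 describes, only a minimal illustration under stated assumptions: the three-round threshold is borrowed from the cost estimate in this chapter, and `run_with_escalation`, `attempt`, and `hard_task` are hypothetical names.

```python
from typing import Callable

MAX_FAILED_ROUNDS = 3  # threshold borrowed from this chapter's cost estimate


def run_with_escalation(
    attempt: Callable[[str], bool],
    default_agent: str = "developer",
    deep_agent: str = "developer-deep",
) -> tuple[str, int]:
    """Try the default agent up to MAX_FAILED_ROUNDS times, then escalate once.

    `attempt` is a hypothetical callback that runs one round with the given
    agent and returns True on success. Returns (agent_that_succeeded, rounds).
    """
    for round_no in range(1, MAX_FAILED_ROUNDS + 1):
        if attempt(default_agent):
            return default_agent, round_no
    if attempt(deep_agent):
        return deep_agent, MAX_FAILED_ROUNDS + 1
    raise RuntimeError("task failed even after escalation")


# Stand-in for a real agent run: only the deep agent can solve this task.
def hard_task(agent: str) -> bool:
    return agent == "developer-deep"


print(run_with_escalation(hard_task))
```

The expensive model is reached only after the cheap one has failed the full number of rounds, so easy tasks never pay for it.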

When to Downgrade to Haiku

If you have these types of light tasks:

✅ Classify logs into ERROR/WARN/INFO
✅ Translate variable names to camelCase
✅ Simple schema validation

Consider using Haiku for micro-agents. Our doc2video project doesn't have these types of tasks — so we didn't use Haiku.

Anti-Patterns

❌ All Opus: "To ensure quality with the strongest model"
   → Pays Opus prices on the roughly 80% of tasks where Sonnet is perfectly sufficient

❌ All Sonnet: "To save money"
   → Gets stuck on complex problems and ends up burning more tokens on repeated retries

❌ Using Sonnet for the reviewer
   → Reviews easily become LGTM, fails to find subtle inconsistencies

❌ Using Opus for the developer
   → High-volume, repetitive runs waste an expensive model
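
These four anti-patterns are mechanical enough to check automatically. The sketch below assumes assignments are stored as a mapping from agent name to tier name. `find_anti_patterns` is a hypothetical helper, and the role names `reviewer` and `developer` follow this chapter's table.

```python
def find_anti_patterns(assignments: dict[str, str]) -> list[str]:
    """Return a warning for each anti-pattern listed above that applies."""
    warnings = []
    models = set(assignments.values())
    if models == {"opus"}:
        warnings.append("all agents use opus")
    if models == {"sonnet"}:
        warnings.append("all agents use sonnet")
    if assignments.get("reviewer") == "sonnet":
        warnings.append("reviewer uses sonnet")
    if assignments.get("developer") == "opus":
        warnings.append("developer uses opus")
    return warnings


# An all-Opus setup trips two checks; a mixed setup trips none.
print(find_anti_patterns({"developer": "opus", "tester": "opus", "reviewer": "opus"}))
print(find_anti_patterns({"developer": "sonnet", "reviewer": "opus"}))
```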

What You Can Do Now

  • Assign models to each role in your own project
  • Estimate total project costs
  • Understand why you should escalate to the top tier rather than start with it

The next chapter will clarify the "escalation" mechanism — an advanced form of multi-agent autonomy.