Chapter 17 | Model Selection Strategy

6 MIN READ | UPDATED: 2026-05-15

Learning Objectives

Assign models by role to avoid "burning cash with all Opus" or "getting stuck with all Sonnet."

Three-Tier Comparison (Data as of May 2026)

Model	Input	Output	Context	Strengths	Weaknesses
Haiku 4.5	$1/MTok	$5/MTok	200k	Extremely fast, cheap	Weak at complex reasoning
Sonnet 4.6	$3/MTok	$15/MTok	1M	King of cost-effectiveness	Limited for top-tier architectural thinking
Opus 4.7	$5/MTok	$25/MTok	1M	Deep reasoning, finding inconsistencies	Most expensive tier (about 1.7x Sonnet, 5x Haiku)

→ At these prices, Sonnet costs about 40% less than Opus per token ($3 vs. $5 input, $15 vs. $25 output). If Sonnet can solve it, don't use Opus.
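
To make the price gap concrete, here is a minimal sketch that applies the table's per-MTok prices to a single run. The workload (200k input tokens, 20k output tokens) and the names `PRICES` and `run_cost` are assumptions chosen for illustration, not measurements from the project.

```python
# Prices in USD per million tokens (MTok), copied from the comparison table.
PRICES = {
    "haiku": {"input": 1.0, "output": 5.0},
    "sonnet": {"input": 3.0, "output": 15.0},
    "opus": {"input": 5.0, "output": 25.0},
}


def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one run for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 200k input tokens and 20k output tokens per run.
for model in PRICES:
    print(f"{model}: ${run_cost(model, 200_000, 20_000):.2f}")
# haiku: $0.30
# sonnet: $0.90
# opus: $1.50
```

Swapping in your own token counts per role gives a quick first estimate before you commit a role to a tier.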

Role Matching Formula

flowchart TD
    Q["What is the primary function of this role?"]
    Q -->|"Mass execution<br/>Repetitive operations"| Sonnet["sonnet<br/>(Cost-effective)"]
    Q -->|"Deep reasoning<br/>Find inconsistencies"| Opus["opus<br/>(Deep thinking)"]
    Q -->|"Simple classification<br/>Batch processing"| Haiku["haiku<br/>(Cheap)"]

    Sonnet --> SonnetEx["developer / tester /<br/>e2e-tester"]
    Opus --> OpusEx["reviewer / architect /<br/>developer-deep"]
    Haiku --> HaikuEx["Log classification / Simple formatting<br/>(Not used in this project)"]

    style Sonnet fill:#bbdefb
    style Opus fill:#ffccbc
    style Haiku fill:#c5e1a5
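
The decision in the flowchart can be sketched as a simple lookup. The `RoleFunction` enum and `pick_model` function are hypothetical names introduced here; the three branches and their target models come straight from the diagram.

```python
from enum import Enum


class RoleFunction(Enum):
    """The three primary functions distinguished by the flowchart."""
    MASS_EXECUTION = "mass execution"
    DEEP_REASONING = "deep reasoning"
    SIMPLE_CLASSIFICATION = "simple classification"


# One model tier per primary function, as in the flowchart.
MODEL_BY_FUNCTION = {
    RoleFunction.MASS_EXECUTION: "sonnet",
    RoleFunction.DEEP_REASONING: "opus",
    RoleFunction.SIMPLE_CLASSIFICATION: "haiku",
}


def pick_model(function: RoleFunction) -> str:
    """Return the model tier suggested for a role's primary function."""
    return MODEL_BY_FUNCTION[function]


print(pick_model(RoleFunction.MASS_EXECUTION))  # sonnet
print(pick_model(RoleFunction.DEEP_REASONING))  # opus
```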

Final Configuration for Our doc2video Project

Agent	Model	Reason
developer	sonnet	Repeatedly implements tasks, high volume
developer-deep	opus	Stuck and escalated, needs to question spec
tester	sonnet	Translating scenarios is a mechanical task
tester-deep	opus	Judging if a scenario is testable requires insight
e2e-tester	sonnet	Black-box execution of commands, checking output
reviewer	opus	Finding code inconsistencies is Opus's strength
architect	opus	Cross-team diagnosis requires a global perspective

Cost Estimation (Based on Actual Project Runs)

A medium-sized team (5 tasks) completes a full cycle:
  developer (sonnet) 1-2 rounds     → ~$0.30
  tester (sonnet) 1-2 rounds        → ~$0.20
  reviewer (opus) 1-2 rounds        → ~$0.50
  ─────────────────────────────────
  Subtotal                          ~$1.00

One escalation (3 failed rounds → developer-deep):
  + developer-deep (opus) 1 round  → ~$0.40
  ─────────────────────────────────
  Total including escalation        ~$1.40

→ For our doc2video project (13 teams, 61 tasks, and several escalations), we estimate $15 to $30 to complete the entire project. Developing the same scope by hand would take 1-2 weeks, or at least $5,000 in labor, so AI collaboration is more than 100x cheaper.
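
The arithmetic behind this estimate can be sketched as follows. The per-role costs and the escalation cost are the rough figures from the breakdown above; the escalation count of 5 is an assumption standing in for "several", and `project_cost` is a hypothetical helper.

```python
# Approximate USD cost of one full cycle for a medium-sized team,
# taken from the per-role breakdown in this section.
BASE_CYCLE = {"developer": 0.30, "tester": 0.20, "reviewer": 0.50}

# Approximate USD cost of one escalation to developer-deep (opus).
ESCALATION_COST = 0.40


def project_cost(teams: int, escalations: int) -> float:
    """Estimate total USD cost: one base cycle per team plus escalations."""
    return teams * sum(BASE_CYCLE.values()) + escalations * ESCALATION_COST


# 13 teams as in doc2video; 5 escalations is an assumed value.
print(f"${project_cost(13, 5):.2f}")  # $15.00
```

This lands at the low end of the $15 to $30 range; extra review rounds or more escalations push the total toward the upper end.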

Escalate to Opus, Don't Start with It

flowchart LR
    Bad["All Opus<br/>(about 1.7x Sonnet's cost)"] --> BadResult["Every team uses top-tier<br/>Simple tasks also burn cash"]
    Good["Default Sonnet<br/>Escalate to Opus when stuck"] --> GoodResult["80% of tasks run cheaply<br/>Only complex problems burn cash"]

    style Bad fill:#ffcdd2
    style Good fill:#c8e6c9

→ This is the escalation mechanism we'll discuss in Chapter 18.
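
As a preview, the "default Sonnet, escalate when stuck" policy might look like the sketch below. It assumes the threshold of 3 failed rounds from the cost estimate; `run_with_escalation` and `fake_attempt` are hypothetical names, and `fake_attempt` is only a stand-in for a real agent run.

```python
from typing import Callable

# Assumed threshold: the cost estimate escalates after 3 failed rounds.
MAX_FAILED_ROUNDS = 3


def run_with_escalation(task: str, attempt: Callable[[str, str], bool]) -> str:
    """Try a task on sonnet first, then escalate once to opus.

    `attempt(model, task)` returns True when the run succeeds.
    Returns the model that succeeded, or "unresolved" if none did.
    """
    for _ in range(MAX_FAILED_ROUNDS):
        if attempt("sonnet", task):
            return "sonnet"
    if attempt("opus", task):
        return "opus"
    return "unresolved"


def fake_attempt(model: str, task: str) -> bool:
    """Stand-in for a real run: only opus can solve tasks other than 'easy'."""
    return task == "easy" or model == "opus"


print(run_with_escalation("easy", fake_attempt))  # sonnet
print(run_with_escalation("hard", fake_attempt))  # opus
```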

When to Downgrade to Haiku

If you have these types of light tasks:

✅ Classify logs into ERROR/WARN/INFO
✅ Convert variable names to camelCase
✅ Simple schema validation

Consider using Haiku for micro-agents. Our doc2video project has no tasks of this kind, so we didn't use Haiku.

Anti-Patterns

❌ All Opus: "To ensure quality with the strongest model"
   → Overpays on the roughly 80% of tasks where Sonnet is perfectly sufficient

❌ All Sonnet: "To save money"
   → Gets stuck on complex problems, ends up burning more tokens trying repeatedly

❌ Using Sonnet for the reviewer
   → Reviews easily become LGTM, fails to find subtle inconsistencies

❌ Using Opus for the developer
   → High volume, repetitive runs, wastes expensive model
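
These four anti-patterns are easy to check mechanically. The sketch below is one possible lint over an agent-to-model mapping; `find_anti_patterns` is a hypothetical helper, and the `DOC2VIDEO` mapping mirrors the configuration table from this chapter.

```python
def find_anti_patterns(agent_models: dict[str, str]) -> list[str]:
    """Return a warning for each anti-pattern found in the mapping."""
    warnings = []
    models = set(agent_models.values())
    if models == {"opus"}:
        warnings.append("all agents use opus")
    if models == {"sonnet"}:
        warnings.append("all agents use sonnet")
    if agent_models.get("reviewer") == "sonnet":
        warnings.append("reviewer runs on sonnet")
    if agent_models.get("developer") == "opus":
        warnings.append("developer runs on opus")
    return warnings


# Agent-to-model assignment from the doc2video configuration table.
DOC2VIDEO = {
    "developer": "sonnet",
    "developer-deep": "opus",
    "tester": "sonnet",
    "tester-deep": "opus",
    "e2e-tester": "sonnet",
    "reviewer": "opus",
    "architect": "opus",
}

print(find_anti_patterns(DOC2VIDEO))  # []
print(find_anti_patterns({"developer": "opus", "reviewer": "opus"}))
# ['all agents use opus', 'developer runs on opus']
```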

What You Can Do Now

Assign models to each role in your own project
Estimate total project costs
Explain why you should escalate to the top tier rather than start with it

The next chapter explains the escalation mechanism, an advanced form of multi-agent autonomy.
