Claude Code: Route Tasks to Cheaper Models with Subagents

Claude Code: Route Tasks to Cheaper Models with Subagents

Today I learned you can dramatically cut Claude Code costs by routing read-only tasks to Haiku via custom subagents β€” and the real trick is running many of them in parallel.

The Idea

Not every task needs Opus. Codebase exploration, file searching, and pattern matching are perfect for Haiku β€” it’s roughly 1/5 the cost. By restricting the subagent to read-only tools (Read, Grep, Glob), even hallucinations can’t accidentally edit your files.

How to Create a Haiku Subagent

Create a Markdown file with YAML frontmatter in .claude/agents/ (project-level) or ~/.claude/agents/ (global):

<!-- .claude/agents/fast-reader.md -->
---
name: fast-reader
description: Fast, low-cost agent for reading and analyzing code. Use for codebase exploration and file discovery.
model: haiku
tools: Read, Grep, Glob
---

You are a fast code analysis agent optimized for cost efficiency.

When searching:
1. Use Glob to find files by name/extension
2. Use Grep for pattern matching
3. Use Read only for targeted analysis
4. Provide concise summaries with file references

Key fields:

  • model: haiku β€” routes to the cheap, fast model
  • tools: Read, Grep, Glob β€” allowlist restricts to read-only operations

Using It

Once the file exists, Claude auto-delegates matching tasks. Or you can ask explicitly:

Use the fast-reader to find all files that import 'express' and list the routes they define.

The Real Cost Saver: Parallel Batch Processing

The biggest savings come from running many Haiku searches in parallel and merging the results, instead of one expensive Opus pass over the whole codebase.

Example prompt:

Search these 5 modules in parallel using subagents:
- src/auth/ β€” find all permission checks
- src/api/ β€” list all endpoint definitions
- src/db/ β€” find all raw SQL queries
- src/utils/ β€” identify exported helper functions
- src/middleware/ β€” list all middleware chains

Claude spawns multiple Haiku subagents that run concurrently. Each returns a focused summary. The main agent merges the results β€” faster and cheaper than a single Opus scan.

Practical Setup: Split Research and Implementation

A common pattern is to pair a cheap reader with a capable writer:

<!-- .claude/agents/researcher.md -->
---
name: researcher
description: Fast research agent for codebase exploration
model: haiku
tools: Read, Grep, Glob
---
Answer codebase questions efficiently. Be concise.
<!-- .claude/agents/implementer.md -->
---
name: implementer
description: Implementation specialist for writing and modifying code
model: sonnet
tools: Read, Write, Edit, Bash, Grep, Glob
---
Expert code implementer for changes and feature development.

Workflow:

  1. Research phase (Haiku) β€” explore the codebase, understand patterns
  2. Implementation phase (Sonnet/Opus) β€” make the actual changes

This cuts the research cost to nearly nothing while keeping full capability for the writing phase.

Cost Comparison

Approach Estimated Cost
Single Opus scanning 50 files $$$
10 parallel Haiku subagents ~$ (1/5 per agent, faster total time)
Haiku research + Sonnet implementation ~50% savings overall

Where Subagent Files Live

Location Scope
.claude/agents/ Project β€” commit to share with team
~/.claude/agents/ Personal β€” available across all projects

Monitor your spending with /cost inside any session.

How LearnAI Team Could Use This

  • Use cheap read-only subagents for wiki audits, source checks, link checks, and cross-file documentation searches.
  • Reserve stronger models for final edits and judgment-heavy rewrites.
  • Split large codebase exploration from implementation to reduce Claude Code costs.

Real-World Use Cases

  • Run parallel read-only audits across multiple wiki categories.
  • Search source files and related entries before updating documentation.
  • Use Haiku subagents for repetitive grep/glob tasks that don’t need reasoning.