Today I learned you can dramatically cut Claude Code costs by routing read-only tasks to Haiku via custom subagents, and the real trick is running many of them in parallel.
The Idea
Not every task needs Opus. Codebase exploration, file searching, and pattern matching are perfect for Haiku, which is roughly 1/5 the cost. Because the subagent is restricted to read-only tools (Read, Grep, Glob), even a hallucinating agent can't accidentally edit your files.
How to Create a Haiku Subagent
Create a Markdown file with YAML frontmatter in .claude/agents/ (project-level) or ~/.claude/agents/ (global):
<!-- .claude/agents/fast-reader.md -->
---
name: fast-reader
description: Fast, low-cost agent for reading and analyzing code. Use for codebase exploration and file discovery.
model: haiku
tools: Read, Grep, Glob
---
You are a fast code analysis agent optimized for cost efficiency.
When searching:
1. Use Glob to find files by name/extension
2. Use Grep for pattern matching
3. Use Read only for targeted analysis
4. Provide concise summaries with file references
Key fields:
- model: haiku routes the subagent to the cheap, fast model
- tools: Read, Grep, Glob is an allowlist that restricts the subagent to read-only operations
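To make the allowlist idea concrete, here is a minimal Python sketch that checks whether an agent file declares only read-only tools. It is not part of Claude Code: the helper names (`parse_frontmatter`, `is_read_only`) are hypothetical, and the parser only handles the flat `key: value` frontmatter shown in this post, not full YAML.

```python
# Hypothetical lint for subagent files; not an official Claude Code tool.
READ_ONLY_TOOLS = {"Read", "Grep", "Glob"}


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between the first two `---` lines."""
    lines = [line.strip() for line in text.splitlines()]
    try:
        start = lines.index("---")
        end = lines.index("---", start + 1)
    except ValueError:
        raise ValueError("no frontmatter block found") from None
    fields = {}
    for line in lines[start + 1:end]:
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_read_only(fields: dict[str, str]) -> bool:
    """True only if a tools allowlist exists and every entry is read-only."""
    tools = {t.strip() for t in fields.get("tools", "").split(",") if t.strip()}
    # A missing or empty `tools` field is treated as NOT read-only, on the
    # assumption that an omitted allowlist does not restrict the agent.
    return bool(tools) and tools <= READ_ONLY_TOOLS


SAMPLE = """---
name: fast-reader
model: haiku
tools: Read, Grep, Glob
---
You are a fast code analysis agent optimized for cost efficiency.
"""

if __name__ == "__main__":
    fields = parse_frontmatter(SAMPLE)
    print(fields["model"], is_read_only(fields))
```

Running a check like this over .claude/agents/ before committing is one way to catch a reader agent that quietly gained Write or Bash.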
Using It
Once the file exists, Claude auto-delegates matching tasks. Or you can ask explicitly:
Use the fast-reader to find all files that import 'express' and list the routes they define.
The Real Cost Saver: Parallel Batch Processing
The biggest savings come from running many Haiku searches in parallel and merging the results, instead of one expensive Opus pass over the whole codebase.
Example prompt:
Search these 5 modules in parallel using subagents:
- src/auth/: find all permission checks
- src/api/: list all endpoint definitions
- src/db/: find all raw SQL queries
- src/utils/: identify exported helper functions
- src/middleware/: list all middleware chains
Claude spawns multiple Haiku subagents that run concurrently. Each returns a focused summary. The main agent merges the results, which is faster and cheaper than a single Opus scan.
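The fan-out and merge shape described here can be sketched in plain Python. This is only an illustration of the pattern, not how Claude Code implements delegation: `run_subagent` is a hypothetical stand-in that returns a canned string instead of calling a model, and `fan_out` and `merge` are names invented for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor

# Directory -> task, taken from the example prompt above.
TASKS = {
    "src/auth/": "find all permission checks",
    "src/api/": "list all endpoint definitions",
    "src/db/": "find all raw SQL queries",
    "src/utils/": "identify exported helper functions",
    "src/middleware/": "list all middleware chains",
}


def run_subagent(path: str, task: str) -> str:
    """Hypothetical stand-in for one read-only subagent run."""
    return f"[{path}] summary for: {task}"


def fan_out(tasks: dict[str, str], worker=run_subagent, max_workers: int = 5) -> dict[str, str]:
    """Run one worker per directory concurrently and collect the summaries."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {path: pool.submit(worker, path, task) for path, task in tasks.items()}
        return {path: future.result() for path, future in futures.items()}


def merge(summaries: dict[str, str]) -> str:
    """Combine the per-directory summaries in a stable (sorted) order."""
    return "\n".join(summaries[path] for path in sorted(summaries))


if __name__ == "__main__":
    print(merge(fan_out(TASKS)))
```

The point of the shape is that each worker only sees its own slice and returns a short summary, so the expensive step (the merge) handles five summaries rather than the whole codebase.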
Practical Setup: Split Research and Implementation
A common pattern is to pair a cheap reader with a capable writer:
<!-- .claude/agents/researcher.md -->
---
name: researcher
description: Fast research agent for codebase exploration
model: haiku
tools: Read, Grep, Glob
---
Answer codebase questions efficiently. Be concise.
<!-- .claude/agents/implementer.md -->
---
name: implementer
description: Implementation specialist for writing and modifying code
model: sonnet
tools: Read, Write, Edit, Bash, Grep, Glob
---
Expert code implementer for changes and feature development.
Workflow:
- Research phase (Haiku): explore the codebase, understand patterns
- Implementation phase (Sonnet/Opus): make the actual changes
This cuts the research cost to roughly a fifth while keeping full capability for the writing phase.
Cost Comparison
| Approach | Estimated Cost |
|---|---|
| Single Opus scanning 50 files | $$$ |
| 10 parallel Haiku subagents | ~$ (roughly 1/5 the price per token, shorter total time) |
| Haiku research + Sonnet implementation | ~50% savings overall |
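For a rough feel of the first two rows, here is a back-of-the-envelope calculation in Python. The only number taken from this post is the "roughly 1/5" price ratio. The token counts are made-up assumptions, and the sketch ignores the main agent's own tokens for delegating and merging, so treat the result as an upper bound on savings.

```python
def relative_cost(tokens: int, price_ratio: float) -> float:
    """Cost in 'Opus-token units': tokens scaled by price relative to Opus."""
    return tokens * price_ratio


HAIKU_RATIO = 1 / 5       # the post's rough figure, not an exact price
TOTAL_TOKENS = 500_000    # assumption: tokens read while scanning 50 files
N_AGENTS = 10             # matches the table row
OVERHEAD_TOKENS = 2_000   # assumption: extra prompt + summary tokens per subagent

opus_cost = relative_cost(TOTAL_TOKENS, 1.0)
haiku_cost = relative_cost(TOTAL_TOKENS + N_AGENTS * OVERHEAD_TOKENS, HAIKU_RATIO)
savings = 1 - haiku_cost / opus_cost

if __name__ == "__main__":
    print(f"savings: {savings:.0%}")
```

With these assumed inputs the savings come out at about 79%. The per-subagent overhead is why the result lands slightly below the 80% that the price ratio alone would suggest.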
Where Subagent Files Live
| Location | Scope |
|---|---|
| .claude/agents/ | Project: commit to share with team |
| ~/.claude/agents/ | Personal: available across all projects |
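To see which agent files exist in both locations, a small Python helper can list them. This is a sketch under two labeled assumptions: it identifies each agent by its file name (the real identifier is the `name` field in the frontmatter), and it lets a project-level file shadow a personal one with the same name. `find_agents` is a hypothetical helper, not a Claude Code command.

```python
from pathlib import Path


def find_agents(project_root: Path, home: Path) -> dict[str, Path]:
    """Map agent name (file stem) to its file, scanning both locations."""
    agents: dict[str, Path] = {}
    # Personal directory first, project second, so a project file overwrites
    # a same-named personal one (assumed precedence; verify for your setup).
    for base in (home / ".claude" / "agents", project_root / ".claude" / "agents"):
        if base.is_dir():
            for path in sorted(base.glob("*.md")):
                agents[path.stem] = path
    return agents


if __name__ == "__main__":
    for name, path in find_agents(Path.cwd(), Path.home()).items():
        print(f"{name}: {path}")
```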
Monitor your spending with /cost inside any session.
How LearnAI Team Could Use This
- Use cheap read-only subagents for wiki audits, source checks, link checks, and cross-file documentation searches.
- Reserve stronger models for final edits and judgment-heavy rewrites.
- Split large codebase exploration from implementation to reduce Claude Code costs.
Real-World Use Cases
- Run parallel read-only audits across multiple wiki categories.
- Search source files and related entries before updating documentation.
- Use Haiku subagents for repetitive grep/glob tasks that don't need reasoning.