codegraph by colbymchenry (~3.3k stars, MIT) is a pre-indexed code knowledge graph, served over MCP, that sits between Claude Code's Explore agent and your codebase. Instead of letting Claude spend most of a task's time running grep, glob, and Read to discover code structure, codegraph pre-parses everything with tree-sitter into a local SQLite graph (nodes = symbols, edges = calls / imports / inheritance / framework URL routes), then exposes a small set of MCP query tools that answer relationship questions from the index instead of rescanning files. The README's top tagline advertises 94% fewer tool calls and 77% faster exploration; the benchmark table averaged across 6 real codebases reports 92% fewer tool calls and 71% faster. The two are directionally consistent.
| *Source: github.com/colbymchenry/codegraph (MIT, ~3.3k stars at time of writing) | Surfaced via Weibo (爱可可-爱生活), May 2026; link not preserved* |
The problem it actually solves
Run any non-trivial Claude Code task on a 1k-file project and watch the trace. A large fraction of the agent's runtime, often more than half, is just discovery: grep for a symbol, glob for matching files, Read to scan one. Each tool call burns tokens, time, and the context budget. Worse, when an agent gives up on discovery because the context is filling, it may proceed with an incomplete picture and produce a confidently wrong answer.
codegraph's bet: discovery work is repetitive and fundamentally cacheable. Parse the code once, store the structural facts in a small local SQLite DB, expose them via MCP, and the agent stops thrashing.
Architecture in one diagram
┌──────────────────────┐
│   Your source code   │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ tree-sitter parsers  │  (19 languages,
│ → ASTs               │   with custom extraction
│                      │   for Svelte / Vue / Liquid)
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Node + edge extract  │
│ • functions/classes  │
│ • call / import      │
│ • inherits           │
│ • framework routes   │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ .codegraph/          │
│   codegraph.db       │
│ (SQLite + FTS5)      │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ MCP server           │
│ 9 query tools (see   │
│ table below)         │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Claude Code agent    │
│ (uses graph instead  │
│ of grep/glob/Read)   │
└──────────────────────┘
File watcher (FSEvents / inotify / ReadDirectoryChangesW): auto-debounced re-index on save, started by the MCP/watch server after `codegraph init`
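To make the "symbols as nodes, relationships as edges" layer concrete, here is a minimal sketch of such a store using Python's built-in `sqlite3`. The table names, column names, and demo symbols are all assumptions made for illustration; codegraph's actual schema lives in its source and may look quite different.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not codegraph's real layout.
SCHEMA = """
CREATE TABLE nodes (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL,   -- e.g. 'function', 'class', 'route'
    file TEXT NOT NULL
);
CREATE TABLE edges (
    src  INTEGER NOT NULL REFERENCES nodes(id),
    dst  INTEGER NOT NULL REFERENCES nodes(id),
    kind TEXT NOT NULL    -- e.g. 'calls', 'imports', 'inherits'
);
CREATE INDEX edges_by_src ON edges(src, kind);
CREATE INDEX edges_by_dst ON edges(dst, kind);
"""


def build_demo_graph() -> sqlite3.Connection:
    """Create an in-memory graph with four made-up symbols and three call edges."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO nodes (id, name, kind, file) VALUES (?, ?, ?, ?)",
        [
            (1, "handle_login", "function", "api/auth.py"),
            (2, "check_password", "function", "core/security.py"),
            (3, "load_user", "function", "core/users.py"),
            (4, "reset_password", "function", "api/auth.py"),
        ],
    )
    conn.executemany(
        "INSERT INTO edges (src, dst, kind) VALUES (?, ?, ?)",
        [(1, 2, "calls"), (1, 3, "calls"), (4, 3, "calls")],
    )
    return conn


def callers(conn: sqlite3.Connection, name: str) -> list[str]:
    """Names of symbols that call `name` (one hop, 'calls' edges only)."""
    rows = conn.execute(
        """SELECT caller.name FROM edges
           JOIN nodes AS caller ON caller.id = edges.src
           JOIN nodes AS callee ON callee.id = edges.dst
           WHERE callee.name = ? AND edges.kind = 'calls'
           ORDER BY caller.name""",
        (name,),
    )
    return [row[0] for row in rows]


def callees(conn: sqlite3.Connection, name: str) -> list[str]:
    """Names of symbols that `name` calls (one hop, 'calls' edges only)."""
    rows = conn.execute(
        """SELECT callee.name FROM edges
           JOIN nodes AS caller ON caller.id = edges.src
           JOIN nodes AS callee ON callee.id = edges.dst
           WHERE caller.name = ? AND edges.kind = 'calls'
           ORDER BY callee.name""",
        (name,),
    )
    return [row[0] for row in rows]
```

The point of the sketch is that "who calls `load_user`?" becomes one indexed join rather than a grep over every file, which is the caching bet described earlier.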
The 9 MCP tools codegraph exposes
Per the source (src/mcp/tools.ts), codegraph exposes nine query tools, not one. Most users won't invoke them by name; the agent picks whichever fits the question. They split into two tiers:
| Tier | Tool | What it returns |
|---|---|---|
| Heavy | `explore` | A deep multi-hop exploration around a symbol or path. The "spend the budget here" call. |
| Light | `search` | Full-text + symbol search via SQLite FTS5 |
| Light | `context` | Surrounding symbols + call-site context for a target |
| Light | `callers` | Who calls this symbol |
| Light | `callees` | What this symbol calls |
| Light | `impact` | Forward and backward blast radius for a symbol or file |
| Light | `node` | Detailed metadata for a specific graph node |
| Light | `status` | Index status / freshness |
| Light | `files` | List indexed files (with filters) |
The agent's typical pattern is: one or two search / node / callers calls to triangulate, then one explore if it needs depth. The 92%/71% numbers come from this composition replacing tens of grep + Read cycles.
What's distinctive about codegraph
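The difference between a one-hop query such as callers and a multi-hop query such as impact can be sketched as a bounded breadth-first search over call edges. This is an illustrative toy, not codegraph's algorithm: the edge list, the `max_hops` parameter, and the result keys are all hypothetical.

```python
from collections import deque


def reachable(start: str, adjacency: dict[str, list[str]], max_hops: int) -> set[str]:
    """Breadth-first search that stops expanding after `max_hops` edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # depth budget spent on this branch
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    seen.discard(start)
    return seen


def impact(symbol: str, call_edges: list[tuple[str, str]], max_hops: int = 3) -> dict[str, list[str]]:
    """Forward (what the symbol depends on) and backward (what depends on it) radius."""
    forward: dict[str, list[str]] = {}
    backward: dict[str, list[str]] = {}
    for caller, callee in call_edges:
        forward.setdefault(caller, []).append(callee)
        backward.setdefault(callee, []).append(caller)
    return {
        "depends_on": sorted(reachable(symbol, forward, max_hops)),
        "affected": sorted(reachable(symbol, backward, max_hops)),
    }


# Made-up (caller, callee) pairs for demonstration.
EDGES = [
    ("route_login", "handle_login"),
    ("handle_login", "load_user"),
    ("load_user", "db_query"),
    ("reset_password", "load_user"),
]
```

With `max_hops=1` the backward result is just the direct callers; raising the limit walks further up the call chain, which is roughly why a deep traversal costs more than a light lookup.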
| Feature | Why it matters |
|---|---|
| Tiered MCP tool design | One heavy explore tool plus eight lightweight query tools. The agent answers most factual questions with light tools (cheap, fast); only escalates to explore for genuine deep dives |
| Framework-aware URL routing | Recognizes web frameworks (per src/resolution/frameworks/): Django, FastAPI, Flask, Express, Laravel, Rails, Spring, Gin (incl. chi / gorilla / mux), Axum, actix, Rocket, ASP.NET, Vapor, React (+ React Router), Svelte (+ SvelteKit), SwiftUI/UIKit, and more. URL patterns link to their handler functions/classes. Huge win for "what handles /api/users/:id?" queries |
| File watcher auto-sync | Native OS events (FSEvents / inotify / ReadDirectoryChangesW). After codegraph init does the initial index, the MCP/watch server keeps the graph fresh on subsequent edits with debouncing to avoid thrash |
| 100% local, no API key | All processing on your machine, SQLite DB only. Nothing leaves your environment |
| npx one-liner install | Installer auto-wires the MCP server in ~/.claude.json and adds auto-allow permissions for the light query tools (search, context, callers, callees, impact, node, status) so the agent doesn't ask permission for every read. The heavier explore is not auto-allowed by default |
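The URL-to-handler idea in the table above can be illustrated with a small route matcher. This is a sketch under stated assumptions: the `:param` pattern syntax, the route table, and the handler names are invented for the example, and codegraph's per-framework resolvers extract routes from source code rather than from a hand-written list like this one.

```python
import re

# Hypothetical route table: (HTTP method, URL pattern, handler symbol).
ROUTES = [
    ("GET", "/api/users", "UsersController.index"),
    ("GET", "/api/users/:id", "UsersController.show"),
    ("POST", "/api/users/:id/auth", "AuthController.create"),
]


def compile_route(pattern: str) -> re.Pattern:
    """Turn '/api/users/:id' into a regex with one named group per ':param'."""
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment.startswith(":"):
            parts.append(f"(?P<{segment[1:]}>[^/]+)")
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "/?$")


def find_handler(method: str, path: str):
    """Return (handler, params) for the first matching route, or None."""
    for route_method, pattern, handler in ROUTES:
        if route_method != method:
            continue
        match = compile_route(pattern).match(path)
        if match:
            return handler, match.groupdict()
    return None
```

Once routes are stored as edges from a URL pattern to a handler symbol, a question about an endpoint becomes a lookup instead of a search through framework boilerplate.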
Performance: what the README claims (6 real codebases)
| Codebase | Tool calls (with vs. without) | Time (with vs. without) |
|---|---|---|
| VS Code (TypeScript) | 3 vs. 52 (94% fewer) | 17s vs. 1m 37s (82% faster) |
| Excalidraw | (numbers in README table) | (see README) |
| Claude Code | (numbers in README table) | (see README) |
| Alamofire (Swift) | (numbers in README table) | (see README) |
| Swift Compiler (largest test: 25.8k files, 272.9k nodes) | 84% fewer | 73% faster; initial index in <4 min |
| Average across the table | 92% fewer tool calls | 71% faster |
| README top tagline | 94% fewer tool calls | 77% faster exploration |
The two sets of numbers don't conflict: 94%/77% is the README's marketing line; 92%/71% is the cross-codebase average from the benchmark table. The README's own observation: agents using codegraph "never fell back to raw file reads" for code structure questions on the test set.
Benchmark methodology caveat (Codex flagged this on review): the benchmark is 6 codebases, ad-hoc tasks, no statistical reporting. The numbers are directionally trustworthy but should not be quoted as a formal benchmark. Measure on your own codebase before adopting team-wide.
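For measuring on your own codebase, the "N% fewer" and "N% faster" figures are plain relative reductions. The helper below is a hypothetical convenience function, checked against the VS Code row of the table above (3 vs. 52 tool calls, 17s vs. 1m 37s).

```python
def percent_reduction(with_tool: float, without_tool: float) -> int:
    """Relative reduction, rounded to a whole percent."""
    return round(100 * (without_tool - with_tool) / without_tool)


# VS Code row from the table above.
vscode_calls = percent_reduction(3, 52)        # tool calls with vs. without
vscode_time = percent_reduction(17, 60 + 37)   # seconds with vs. without
```

Running the same calculation over a handful of your own before/after traces gives a number you can compare directly with the README's claims.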
Install in one command
npx @colbymchenry/codegraph
The interactive installer:
- Detects your Claude Code config (`~/.claude.json`)
- Adds the codegraph MCP server entry
- Auto-allows permissions for the light query tools (per `src/installer/config-writer.ts`): `search`, `context`, `callers`, `callees`, `impact`, `node`, `status`
- Offers a global install: `npm install -g @colbymchenry/codegraph`
Per-project bootstrap:
cd your-project
codegraph init -i
codegraph init builds the initial index synchronously. The file watcher / auto-sync kicks in when the MCP/watch server starts (i.e., once Claude Code opens an MCP session against it). Large projects (~25k files) finish the initial indexing in <4 min on a modern laptop.
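Debouncing, as used by the watcher, means waiting for a burst of save events to go quiet before re-indexing once. The sketch below shows the idea with explicit timestamps so it stays deterministic; the class, its method names, and the 0.5 second window are assumptions for illustration, not codegraph's implementation or its configured delay.

```python
class Debouncer:
    """Collects changed paths and releases them once the burst has gone quiet."""

    def __init__(self, quiet_seconds: float):
        self.quiet_seconds = quiet_seconds
        self.pending: set[str] = set()
        self.last_event_at: float | None = None

    def record(self, path: str, now: float) -> None:
        """Note a change; repeated saves of the same file collapse into one entry."""
        self.pending.add(path)
        self.last_event_at = now

    def flush_if_quiet(self, now: float) -> list[str]:
        """Return the batch to re-index, or [] if events are still arriving."""
        if not self.pending or now - self.last_event_at < self.quiet_seconds:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        return batch


# Three rapid events on two files, then a pause longer than the quiet window.
debouncer = Debouncer(quiet_seconds=0.5)
debouncer.record("src/app.ts", now=0.0)
debouncer.record("src/app.ts", now=0.1)
debouncer.record("src/db.ts", now=0.2)
too_early = debouncer.flush_if_quiet(now=0.4)
batch = debouncer.flush_if_quiet(now=0.8)
```

In a real watcher the timestamps would come from a monotonic clock and the flush would trigger an incremental parse of just the batched files.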
How it differs from the sibling tool
If you've seen Code Review Graph by tirth8205 (~16.7k stars), you've seen a cousin: same underlying idea, different design choices, and a substantially broader tool surface:
| Dimension | codegraph (this entry) | code-review-graph (sibling) |
|---|---|---|
| Primary headline | 92-94% fewer tool calls, 71-77% faster (per README) | 8.2x token reduction, up to 49x on monorepos |
| Optimization target | Eliminate discovery thrash during any Claude Code task | Originally framed around code review; now also advertises daily coding, architecture/debug/onboarding prompts, wiki generation, refactoring, and multi-repo search |
| MCP surface | 9 tools (1 heavy explore + 8 light queries) | 28 tools (search, refactor, dead code, wiki gen, communities…) |
| Framework awareness | URL → handler routing built in across many web frameworks | Framework-aware features and resolvers; richer overall toolkit |
| Update strategy | OS file-watcher + debounce | Incremental SHA-256 hash diff |
| Install | `npx @colbymchenry/codegraph` | `pip install code-review-graph` |
| License | MIT | MIT |
| Stars (at time of writing) | ~3.3k | ~16.7k |
Which to pick? code-review-graph is broader (28 tools, more output modes including auto-wiki generation and community detection); codegraph is narrower (9 tools, framework routing, file watcher). For projects where you mostly want Claude to stop spelunking, codegraph's smaller footprint is appealing. For monorepo refactoring + auto-doc workflows, code-review-graph's richer toolkit wins. They can coexist as MCP servers if you want both.
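The "Incremental SHA-256 hash diff" update strategy in the comparison table can be sketched as follows: hash every file, compare against the previous snapshot, and re-index only what changed. This is a generic illustration using `hashlib`, with hypothetical function names and file contents; it is not taken from code-review-graph's source.

```python
import hashlib


def digest(content: bytes) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def diff_snapshot(previous: dict[str, str], files: dict[str, bytes]):
    """Compare current file contents against the previous {path: digest} snapshot.

    Returns (new_snapshot, changed_paths, removed_paths); new files count as changed.
    """
    current = {path: digest(data) for path, data in files.items()}
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    removed = sorted(p for p in previous if p not in current)
    return current, changed, removed


# First run: everything is new. Second run: b.py deleted, c.py added, a.py untouched.
snap1, changed1, removed1 = diff_snapshot({}, {"a.py": b"x = 1", "b.py": b"y = 2"})
snap2, changed2, removed2 = diff_snapshot(snap1, {"a.py": b"x = 1", "c.py": b"z = 3"})
```

The trade-off against a file watcher is that hashing needs no long-running process but must rescan file contents on each run, whereas a watcher reacts to OS events as they happen.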
Limitations and honest caveats
- WASM SQLite fallback is meaningfully slower (README-stated). If `better-sqlite3` (the native module) fails to install on your platform, codegraph falls back to SQLite-WASM, which is 5-10× slower and can cause `database is locked` errors when the MCP server queries concurrently with indexing. The README has platform-specific C compiler install steps to avoid this.
- Files >1MB are skipped by default (README-stated; configurable). Large generated files / minified bundles won't be in the graph.
- Initial indexing time scales with the codebase (README-stated). Most projects finish in seconds; the Swift Compiler test (25k+ files) took <4 min. WASM backend is meaningfully slower.
- Static models don't capture dynamic dispatch / reflection / `eval()` / string-built imports. (My inference, not directly stated in the README.) The agent will sometimes still need to fall back to `Read` for these patterns.
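The last limitation can be demonstrated with a toy static extractor. The sketch uses Python's standard `ast` module as a stand-in for tree-sitter (an assumption made to keep the example self-contained) and records a call edge only when the callee is a plain name, which is roughly what any purely syntactic pass can see.

```python
import ast

# Sample program: `direct` calls save() by name; `dynamic` picks its target at runtime.
SOURCE = '''
def save(): ...
def delete(): ...

def direct():
    save()

def dynamic(action):
    globals()[action]()
'''


def static_call_edges(source: str) -> set[tuple[str, str]]:
    """(caller, callee) pairs for calls whose target is a plain identifier."""
    edges = set()
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return edges


edges = static_call_edges(SOURCE)
```

The extractor finds `direct -> save` and the literal `globals()` call, but no edge from `dynamic` to `save` or `delete`, because that target only exists as a string at runtime. A graph built this way would under-report callers for such code, which is when falling back to reading the file is still warranted.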
How LearnAI Team Could Use This
- Default-install candidate for LearnAI's Claude Code setup: pilot it on one team project first, measure your own tool-call/time numbers, then roll out if the win holds on your codebases. The privacy story is solid ("nothing leaves the machine").
- Pair with CS-310 (Advanced OO Programming & Design): codegraph's graph view is a pedagogically useful way to show students what an LLM is actually looking for when it reads code. The URL → handler routing is a tangible "framework awareness" lesson.
- CS-336 (Program Analysis for Security): codegraph is a working tree-sitter + AST + graph pipeline. Students can extend it (e.g., add a taint-flow edge type, or an information-flow analysis) as a 2-3 week project; the codebase is the right size for that scope.
- Research workflow: when you're skimming an unfamiliar repo for paper-reproduction work, codegraph lets Claude answer "where does this paper's algorithm actually live in the code?" much faster than `grep`-driven discovery. Run a quick before/after measurement on your own repo to confirm the numbers translate.
Real-World Use Cases
| Scenario | How to use |
|---|---|
| "Where is X used?" questions in a large repo | Use callers on the symbol; get the list directly |
| "What does X depend on?" | Use callees for the outgoing call graph |
| Refactor impact preview | Use impact for the forward/backward blast radius before renaming or removing |
| API endpoint debugging | Ask Claude "what code handles POST /users/:id/auth?"; codegraph's framework-aware routing finds the handler |
| Onboarding a new contributor | Run codegraph; ask Claude "summarize the architecture" with the graph available; result is structurally informed instead of file-list-shallow |
Important things to know
- codegraph is one MCP server in a growing space. Knowledge-graph approaches to agent codebase exploration are a clear pattern now (Code Review Graph is the most-starred sibling). Treat codegraph as one good implementation, easy to swap if a better one appears for your stack.
- The benchmark methodology is non-rigorous (Codex flagged on review). Six codebases, ad-hoc tasks, no statistical reporting. Headline numbers are directionally trustworthy but not a formal benchmark; measure your own before adopting team-wide.
- The framework registry is updated frequently in source. Check `src/resolution/frameworks/` for the live list; the framework names in this entry are a snapshot, not exhaustive.
- No data leaves your machine. Genuinely a local tool. No API keys, no telemetry mentioned in the README.
- The "auto-allow on install" applies only to the light query tools. The heavier `explore` tool still prompts for permission by default, which is correct: you want to know when the agent is about to spend the deeper-exploration budget.
- Companion deep-dives in this wiki:
  - Code Review Graph: 8.2x Token Reduction (sibling tool, broader scope)
  - Claude Code Context Management & CLAUDE.md (the complementary "what lives in conversation context")
  - Claude Code · context: fork (isolation pattern for heavy explorations)
  - Harness Engineering: The Real Bottleneck Isn't the Model (why this kind of tooling is high-leverage)
  - GBrain: Garry Tan's Persistent Agent Memory System (the broader pattern of graph-shaped memory for agents)