AI Agent Primer β€” The Vocabulary Ladder and 18-Step Workflow

A short, beginner-facing teaching map for the AI-agent vocabulary that newcomers run into the moment they start reading about Claude Code, Codex, AutoGPT, or β€œagents” in general. Built from two complementary popular explainers that surfaced in Chinese AI media in 2026: the 7-term automation ladder (Token β†’ Harness) and the 18-step swimlane workflow (User β†’ Agent β†’ LLM β†’ Skill β†’ MCP β†’ Tools). Neither is a canonical taxonomy; they are two angles on the same beginner question, β€œwhat are all these words and how do they fit together?” This entry serves as the on-ramp; the deep dives live in the wiki entries linked from each layer.

*Sources:*

  • 7-term ladder: Douyin post by ηŸ₯希AI, β€œAI δΈƒδΈͺ词, δ½ ηœŸηš„ζžζ‡‚δΊ†ε‡ δΈͺ?” (β€œSeven AI Terms: How Many Do You Really Understand?”), April 21, 2026
  • 18-step swimlane: Douyin post by η¨‹εΊε‘˜ζ₯Όε“₯, β€œδΈ€εΌ ε›Ύηœ‹ζ‡‚ AI Agent 全桁程” (β€œUnderstand the Full AI Agent Workflow in One Diagram”), January 13, 2026; pages 1–4 of a 15-page deck were consulted, the remaining 11 pages were not

Read this entry as a teaching map, not a standard. The two frameworks below are popular explainer artifacts, not formal definitions. Where each term has a precise meaning, the relevant deep-dive wiki entry is linked.

View 1 β€” The 7-Term Automation Ladder

The first explainer (ηŸ₯希AI) frames seven core terms as a ladder of increasing automation. Each rung is a layer you can adopt; the higher you go, the more the system runs without you watching it.

  1. Token: the unit a model reads and writes; every interaction burns tokens. Deep dives: Caveman Token Compression; Claude Code Token Costs (RTK).
  2. Prompt (提瀺词): the instructions you give the model for one task; the clearer, the better. Deep dives: Prompt Master; Shortest Prompt Lines That Work; Anti-Sycophancy Prompt.
  3. Context (δΈŠδΈ‹ζ–‡): the current conversation’s workspace; close the window and it’s gone. Deep dives: Claude Code Context Management & CLAUDE.md; Claude Code Context Fork.
  4. Skill: a reusable, file-based operation manual the model can re-load on demand. Deep dives: How Anthropic Uses Skills β€” Thariq’s 9-Category Framework; Claude Code Skills: Resources & Repos; Addy Osmani; Matt Pocock; Karpathy.
  5. MCP: an open protocol that standardizes connections between models and external tools/data, popularly shorthanded as β€œplugins for the AI.” Deep dives: Anthropic Managed Agents β€” Decoupling the Brain from the Hands; alphaXiv MCP β€” ArXiv Search Inside Claude Code.
  6. Agent: a model + planner + memory + tool-use loop that can execute multi-step tasks autonomously. Deep dives: Seven Agent Architectures; Agentic AI Engineer Roadmap 2026.
  7. Harness: the surrounding scaffold/runtime that runs an agent reliably, covering context limits, tool gating, verification loops, and error recovery. Deep dives: Harness Engineering β€” The Real Bottleneck Isn’t the Model; Agents Need Control Flow.

Two Pre-empts Before You Quote This Ladder

  • β€œMCP = plugins” is a useful shorthand, not the definition. MCP is an open protocol that standardizes how models, tools, and data sources talk to each other β€” closer to β€œUSB-C for AI” than to β€œbrowser plugins.” The plugin-like behavior is one consequence of the protocol.
  • β€œHarness” is not a universal formal layer in every framework β€” it’s the engineering discipline of building the scaffold around the model. Different stacks (LangChain, Claude Code, Codex, OpenHarness) implement the harness differently.
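To make the harness rung concrete, here is a minimal sketch of what a harness adds around a single agent step: bounded retries, error capture, and a verification gate before a result is accepted. Every name in it is hypothetical and illustrative, not any real framework’s API; stacks like Claude Code or LangChain implement these loops their own way.

```python
# Hypothetical sketch of harness discipline around one agent step:
# bounded retries, error recovery, and a verification gate.
# All names here are illustrative, not a real framework's API.

def run_with_harness(step, verify, max_attempts=3):
    """Run one step under harness rules: retry until verified or give up."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = step()
        except Exception as exc:          # error recovery: capture, don't crash
            last_error = exc
            continue
        if verify(result):                # verification loop: check before accepting
            return result
        last_error = ValueError(f"verification failed on attempt {attempt}")
    raise RuntimeError(f"harness gave up after {max_attempts} attempts") from last_error

# Example: a flaky step that only produces a good answer on the second call.
calls = {"n": 0}
def flaky_step():
    calls["n"] += 1
    return "ok" if calls["n"] >= 2 else "garbage"

print(run_with_harness(flaky_step, verify=lambda r: r == "ok"))  # prints "ok"
```

The point of the sketch is the division of labor: the model produces candidate results, while the harness decides whether to accept, retry, or abort, which is exactly the reliability work the ladder’s seventh rung names.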

View 2 β€” The 18-Step Swimlane Workflow

The second explainer (η¨‹εΊε‘˜ζ₯Όε“₯) takes the same vocabulary and shows it moving: an 18-step swimlane flowchart of how a single user query traverses six lanes before a result comes back.

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  User   β”‚  β”‚  Agent   β”‚  β”‚   LLM    β”‚  β”‚ Skill  β”‚  β”‚ MCP  β”‚  β”‚  Tools  β”‚
β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”¬β”€β”€β”€β”€β”˜  β””β”€β”€β”¬β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜
     β”‚            β”‚              β”‚             β”‚           β”‚           β”‚
   query  ─────▢  receive  ──▢  intent       β”‚           β”‚           β”‚
     β”‚            β”‚              β”‚             β”‚           β”‚           β”‚
     β”‚       load context+skill  β”‚             β”‚           β”‚           β”‚
     β”‚            β”‚              β”‚             β”‚           β”‚           β”‚
     β”‚            β”‚  ────────▢  plan          β”‚           β”‚           β”‚
     β”‚            β”‚              β”‚             β”‚           β”‚           β”‚
     β”‚       dispatch  ────────────────────▢  invoke ───────────▢  run
     β”‚            β”‚              β”‚             β”‚           β”‚           β”‚
     β”‚            β”‚              β”‚             β”‚           β”‚       result
     β”‚            β”‚  ◀────────  observe ◀───────────────────────────────
     β”‚            β”‚              β”‚             β”‚           β”‚           β”‚
     β”‚            β”‚  ────────▢  next step / verify ──▢ ...
     β”‚            β”‚              β”‚             β”‚           β”‚           β”‚
     β”‚  ◀───── final answer

The 18 steps elaborate this with explicit observation, error retry, and verification loops between the LLM and the tool layer. The takeaway: a single β€œagent run” is not one model call β€” it’s a dozen-plus orchestrated calls across six distinct lanes.
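The lanes above can be sketched as a toy loop. The LLM and the tools here are stand-in stubs, and every name is hypothetical; the shape of the loop is the point: plan once, then dispatch, observe, and verify per step until a final answer goes back to the user.

```python
# Toy version of the plan -> dispatch -> observe -> verify loop from the
# swimlane above. The "LLM" and "tools" are stubs; a real system makes a
# model call at each decision point. All names are illustrative.

def llm_plan(query, context):
    """Stub LLM lane: turn a query into an ordered list of tool steps."""
    return [("search", query), ("summarize", query)]

TOOLS = {                                            # Tools lane
    "search": lambda q: f"3 results for {q!r}",
    "summarize": lambda q: f"summary of {q!r}",
}

def agent_run(query, max_steps=10):
    context = {"query": query, "observations": []}   # Agent lane: memory
    plan = llm_plan(query, context)                  # LLM lane: plan
    for step_name, arg in plan[:max_steps]:
        result = TOOLS[step_name](arg)               # dispatch -> invoke -> run
        context["observations"].append(result)       # observe
        if not result:                               # verify (trivial check here)
            raise RuntimeError(f"step {step_name} returned nothing")
    return context["observations"][-1]               # final answer to the user

print(agent_run("agent vocabulary"))  # prints: summary of 'agent vocabulary'
```

Even this toy makes the takeaway visible: one user query triggers a plan call plus one tool call per step, so a single β€œagent run” is always several orchestrated calls, never one.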

Per-Lane Role, in One Line

The six swimlanes:

  • User: expresses intent. What it isn’t: responsible for picking tools or steps.
  • Agent: plans, dispatches, remembers, verifies. What it isn’t: the brain; it borrows the LLM for reasoning.
  • LLM: reasons, writes, classifies, decides. What it isn’t: a planner or executor; just a stateless engine.
  • Skill: holds reusable procedures the agent can re-load. What it isn’t: a one-shot prompt; it’s a file/folder the agent reads on demand.
  • MCP: standardizes tool-agent connectivity. What it isn’t: a tool itself; it’s the protocol that lets tools plug in.
  • Tools: take concrete action (web search, code run, DB query, API call). What they aren’t: autonomous; the agent picks and calls them.

(Prompt isn’t a lane in this view β€” it’s the artifact passing between User and Agent. See the ladder above for where Prompt sits.)

How the Two Views Complement

The two explainers are not competing taxonomies β€” they answer different questions:

  • β€œWhat level of automation should I aim for?” β†’ Ladder: Token β†’ Harness shows you the next rung.
  • β€œHow does a single agent run actually execute?” β†’ Flowchart: shows the roles interacting in real time.
  • β€œWhat does each word mean in isolation?” β†’ Either view’s glossary works as an entry point.
  • β€œWhere do I go to learn this concept deeply?” β†’ Click through to the linked wiki entry.

Both views collapse onto the same underlying truth: modern agent systems are layered, and each layer has a job that the others can’t do well. Choose the right layer for the right problem, and the system runs; over-engineer the wrong layer, and it doesn’t.

What the Ladder Tells You About Your Own Setup

A diagnostic question for each rung:

  1. Token: Do you know roughly what tokens your common workflows burn? If not, start there.
  2. Prompt: Do you have a default system prompt that you trust on important work?
  3. Context: When sessions end, do you lose state you wanted to keep?
  4. Skill: Is there work you do the same way β‰₯3 times? It should be a skill.
  5. MCP: Are you copying data in/out of the agent manually? Look for an MCP integration.
  6. Agent: Is the agent making plans, or are you?
  7. Harness: When the agent fails silently, do you find out?

The further down the list a β€œno” appears, the more leverage you’ll get from investing there next.
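For rung 1, a back-of-envelope estimator is enough to get started. The sketch below uses the common rule of thumb of roughly 4 characters per token for English text; real tokenizers, and especially non-English text, diverge from this, so treat the numbers as order-of-magnitude estimates, not a bill.

```python
# Back-of-envelope token estimate for diagnostic #1. Uses the rough
# ~4-characters-per-token heuristic for English; real tokenizers differ,
# so this is an order-of-magnitude sketch, not an accurate count.

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate the token count of one message (minimum 1)."""
    return max(1, round(len(text) / chars_per_token))

def estimate_workflow(messages) -> int:
    """Sum estimated tokens over every message in a session transcript."""
    return sum(estimate_tokens(m) for m in messages)

# Hypothetical session transcript: prompt, pasted document, follow-up.
session = [
    "Summarize this design doc in five bullet points.",
    "Here is the doc: " + "lorem ipsum " * 200,
    "Now turn the bullets into an email to the team.",
]
print(f"~{estimate_workflow(session)} tokens for this session")
```

Running something like this over a week of transcripts answers the Token diagnostic: you learn which workflows are cheap and which ones quietly dominate your usage.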

How LearnAI Team Could Use This

  • Onboarding gateway β€” assign this entry as the first reading in any AI module before going deep. The 7-term ladder is short enough that even non-engineering students can absorb it in 10 minutes.
  • Glossary check-in β€” quiz students after the first week: define each term, name the deep-dive wiki entry that owns it. Misses identify exactly where the curriculum needs reinforcement.
  • Diagnostic for project advising β€” when a student says β€œmy agent isn’t working,” walk down the ladder and ask: which rung is the failure on? Token cost? Prompt vagueness? Missing skill? No verification?
  • Cross-curriculum integration β€” non-CS students (bio, finance, design) benefit most from the 6-lane flowchart, which shows that β€œAI” is not one thing but a system of cooperating parts.
  • Companion exercise to harness engineering module β€” read this entry first, then dive into Harness Engineering and Agents Need Control Flow.

Real-World Use Cases

  • Stakeholder briefings β€” when a PM or executive asks β€œwhat is an agent?”, the ladder + flowchart together cover 80% of the answer in a single page.
  • Engineering interviews β€” candidates who can place themselves on the ladder (and explain why their preferred layer matters) tend to ship reliable agent systems.
  • Tool evaluation β€” when a new AI tool is announced, map it onto the ladder: which rung does it improve? If it’s not obvious, the tool probably isn’t differentiated.
  • Cross-team alignment β€” when product / engineering / data each mean a different thing by β€œagent,” this glossary normalizes the vocabulary before debate.
  • Self-audit β€” solo engineers diagnose which rung of the ladder their workflow is stuck on and what investment unlocks the next rung.

Sources

  • Explainer 1 (7-term ladder): Douyin post by ηŸ₯希AI (April 21, 2026)
  • Explainer 2 (18-step swimlane): Douyin post by η¨‹εΊε‘˜ζ₯Όε“₯ (January 13, 2026; pages 1–4 of a 15-page deck were consulted)
  • Deeper reading by layer: see the per-term deep-dive links in the ladder above