When the agent is wrong: a debugging protocol
AI agents fail in specific, recognizable patterns. Identifying the failure mode lets you respond efficiently — pure retry vs. context fix vs. scope tightening vs. human takeover. Here's a structured debugging protocol that catches drift fast.
The five failure modes
After hundreds of agent sessions, the failures reduce to five patterns:
1. Hallucinated APIs / methods
The agent calls a method that doesn't exist, references a function that was renamed, or imports a package that isn't in package.json.
Diagnostic: error mentions "is not a function" or "module not found."
Fix: tell the agent the actual API. "There's no setupUser method; the right method is createUser from users-service.ts."
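A minimal sketch of what this failure looks like in code — the names mirror the hypothetical users-service.ts example used throughout this post, not a real module:

```ts
// Hypothetical users-service.ts API (names are illustrative).
interface CreateUserInput { email: string; name: string; }
interface User extends CreateUserInput { id: string; }

export function createUser(input: CreateUserInput): User {
  return { id: crypto.randomUUID(), ...input };
}

// What the agent wrote — setupUser doesn't exist, so this fails at
// compile time in TS ("Cannot find name 'setupUser'") or at runtime
// in JS ("setupUser is not a function"):
// const user = setupUser({ email: "a@example.com", name: "Ada" });

// The corrective prompt points it at the real export:
const user = createUser({ email: "a@example.com", name: "Ada" });
```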
2. Wrong pattern matched
The agent looked at the codebase and pattern-matched to something similar but not quite right. The new code is plausible but doesn't fit this case.
Diagnostic: code looks fine, but it doesn't match the actual convention.
Fix: cite the right pattern explicitly. "We use Repository pattern, not direct ORM calls. Look at users-repo.ts for the convention."
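For concreteness, here's a minimal sketch of the service-calls-repo convention that corrective prompt describes (class and method names are illustrative, not from a real repo):

```ts
// Repository pattern sketch: the repo owns all DB access,
// the service owns business logic and only talks to the repo.
interface User { id: string; email: string; }

class UsersRepo {
  private db = new Map<string, User>(); // stand-in for the real database

  async findByEmail(email: string): Promise<User | null> {
    for (const user of this.db.values()) {
      if (user.email === email) return user;
    }
    return null;
  }
}

class UsersService {
  constructor(private repo: UsersRepo) {}

  // No ORM or SQL in this layer — the "wrong pattern" failure is the
  // agent putting direct DB calls here instead of in the repo.
  async getByEmail(email: string): Promise<User> {
    const user = await this.repo.findByEmail(email);
    if (!user) throw new Error(`No user with email ${email}`);
    return user;
  }
}
```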
3. Scope creep
The agent did what was asked plus "helpful" adjacent changes.
Diagnostic: PR has more files than expected.
Fix: revert the extra changes; re-prompt with sharper scope. See the runaway agent post.
4. Confidently wrong explanation
The agent claims something works, claims it understands the bug, claims the fix is right — but it isn't.
Diagnostic: agent's confidence doesn't match observed reality (tests still fail; behavior unchanged).
Fix: ask for verification. "You said you fixed it. Run the tests and confirm."
5. Drift over long sessions
After several hours, the agent slowly loses track of CLAUDE.md scope, original goal, or earlier decisions.
Diagnostic: agent's later actions contradict earlier ones, or it asks questions it should know answers to.
Fix: restart the session with a fresh context summary. Don't try to course-correct in-place.
The debugging protocol
When the agent's output is wrong, run the protocol top to bottom:
Step 1 (1 min): identify the failure mode
Read the agent's last action and result. Match to one of the five modes above.
If you can't classify, treat as "drift" by default.
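If you want the triage to be mechanical, Step 1 can be expressed as a small lookup. A sketch — the symptom patterns are illustrative, chosen to match the diagnostics listed above:

```ts
// Hypothetical triage helper mirroring Step 1. The symptom string is
// whatever you observed: an error message, a diff summary, a claim.
type FailureMode =
  | "hallucinated-api"
  | "wrong-pattern"
  | "scope-creep"
  | "confidently-wrong"
  | "drift";

function classify(symptom: string): FailureMode {
  if (/is not a function|module not found/i.test(symptom)) return "hallucinated-api";
  if (/doesn't match convention|wrong pattern/i.test(symptom)) return "wrong-pattern";
  if (/unexpected files|out of scope/i.test(symptom)) return "scope-creep";
  if (/claims fixed but tests fail/i.test(symptom)) return "confidently-wrong";
  return "drift"; // can't classify → treat as drift, per Step 1
}
```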
Step 2 (3 min): take corrective action based on mode
| Mode | Response |
|---|---|
| Hallucinated API | Cite real API; re-prompt |
| Wrong pattern | Cite real pattern with example file path |
| Scope creep | Revert; re-prompt with an explicit "not in scope" list |
| Confidently wrong | Ask for verification ("run the test and confirm") |
| Drift | Restart session with fresh summary |
Step 3 (variable): retry with the correction
Re-prompt. The corrective context should specifically address what went wrong.
Step 4 (2 min): verify
Run tests, read the diff, confirm the fix is real. Don't trust the agent's claim of success.
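One way to make Step 4 non-optional is to script it — trust exit codes, not claims. A sketch, assuming a Node project with an `npm test` script:

```ts
// Run the suite yourself; believe the exit code, not the agent.
import { execSync } from "node:child_process";

function testsActuallyPass(): boolean {
  try {
    execSync("npm test -- --bail", { stdio: "inherit" });
    return true; // exit code 0: the claim checks out
  } catch {
    return false; // non-zero exit: "fixed" was wrong
  }
}
```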
Step 5: if it still fails after 2-3 retries → take over
If the agent can't get it after 2-3 attempts, one of these is usually true:
- Context is missing, and you can't tell which context (pause and audit before retrying).
- The codebase has unusual patterns the agent's training didn't cover.
- The bug is genuinely subtle (a fourth attempt won't help).
In all three cases, do it yourself. It's faster than another retry.
Specific corrective prompts
For hallucinated APIs
You said `setupUser()` exists in users-service.ts. It doesn't.
The actual methods in users-service.ts are:
- createUser(input: CreateUserInput): User
- updateUser(id: string, patch: UpdateUserInput): User
- deleteUser(id: string): void
Use the right one. Re-read the file if you're unsure.
The specific contradiction + the actual API + the instruction to re-read.
For wrong pattern
You wrote a service that calls the database directly. Our convention is
Repository pattern — services call repos, repos do DB work.
Example: src/users/users-service.ts → users-repo.ts.
Move the DB calls into auth-repo.ts. Service should only call the repo.
Reference the right pattern + cite an existing example.
For scope creep
The change to src/users/UserCard.tsx is out of scope for this session.
Revert it. Then re-do the auth changes only.
In scope: src/auth/* + tests/auth/*.
Out of scope: everything else.
Specify the revert + re-state scope.
For confidently wrong
You said the test passes. Run `npm test -- --bail` and paste the output.
Force evidence. Don't accept the claim without verification.
For drift
Stop. We've drifted. Restart this conversation.
Context: [paste a 200-word summary of what was decided]
Continue from there.
Reset with a clean summary. Trying to course-correct in-place rarely works on a long-drifted session.
Distinguishing failure modes
Sometimes the symptom is ambiguous. Two diagnostics:
Did the agent verify?
If the agent never ran tests / never read the file it's modifying / never checked its work — that's confidence-without-verification (mode 4).
If the agent did verify but still got the wrong answer — that's hallucination or wrong pattern (mode 1 or 2).
Is the symptom local or systemic?
Local symptom (one wrong API) → mode 1. Systemic symptom (many wrong APIs, many wrong patterns) → context is broken; restart.
Anti-patterns
"Try again"
If you say only "try again," the agent has no new information. It'll likely produce a similar wrong answer.
Always provide specific corrective context.
Apologetic prompting
The agent will sometimes apologize and then produce another wrong answer. Don't accept the apology as a fix; verify the new output.
Marathon recovery
If you've spent 30+ minutes trying to correct one issue, you're in a sunk-cost trap. Step back, do it yourself, save the agent for the next task.
Trusting the explanation
The agent's narrative ("I think the bug is here because...") is a hypothesis, not a fact. Verify the hypothesis with code/tests.
Re-prompting the same context that failed
If the failure was caused by missing context (mode 5), re-prompting in the same session = same failure. Restart.
When the agent is consistently wrong
If the same agent + same task type fails repeatedly:
Audit the CLAUDE.md
Is the scope right? Are conventions clearly stated? Are common pitfalls listed?
Often: tightening CLAUDE.md fixes 70% of recurring failures.
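As a reference point, here's a hypothetical CLAUDE.md fragment that states scope, conventions, and pitfalls explicitly (contents are illustrative, reusing the examples from this post):

```markdown
## Scope
Work only in src/auth/* and tests/auth/*. Everything else is out of scope.

## Conventions
- Repository pattern: services call repos; repos do all DB work.
- Follow src/users/users-service.ts → users-repo.ts as the reference example.

## Known pitfalls
- users-service.ts exports createUser/updateUser/deleteUser — there is no setupUser.
- Run `npm test -- --bail` before claiming a fix works.
```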
Audit the prompt
Is the task clearly specified? Are file paths and pattern references included?
Often: writing a more specific prompt fixes the next 20%.
Try a different model
If you're using a smaller/cheaper model and quality is consistently below bar, try the frontier model for this task type.
Often: this fixes the remaining 10% — but check if the workflow can be redesigned to fit smaller models first.
Question the workflow itself
If a task is consistently hard for the agent, is it a good fit for AI assistance? Some tasks are genuinely better done by hand.
File-manager setup for debugging
mq-dir's panes for agent debugging:
- Pane 1: the file the agent is changing (so you see edits in real-time).
- Pane 2: the test file (so you can verify).
- Pane 3: cmux session running the agent.
- Pane 4: notes / corrective prompt drafts.
You see the change, the test, the conversation, and your draft response — all simultaneously.
What this protocol gives you
After a few weeks of consistent application:
- Faster recovery from agent failures (3 minutes vs 30).
- Better instinct for when to take over (no sunk-cost spirals).
- Improved prompts and CLAUDE.md (each failure teaches something).
- More accurate trust calibration (you know when to verify).
The compounding effect is that your overall AI workflow quality improves, not just individual session debugging.
Verdict
AI agents fail in patterns. Recognizing the pattern saves time:
- Hallucinated API → cite real API.
- Wrong pattern → cite real pattern.
- Scope creep → revert + re-scope.
- Confidently wrong → demand verification.
- Drift → restart with summary.
Verify everything the agent claims. Don't trust apologies. Take over after 2-3 failed iterations.
The protocol is small but the daily impact is real. Most heavy AI users develop something like it implicitly; documenting it makes the discipline transferable.
mq-dir's quad-pane is well-suited to debugging — diff, tests, agent terminal, notes all visible. Free, MIT.