Custom Claude skills: when to write one (and when not to)
Claude skills are reusable agent capabilities. They're powerful — but writing one for the wrong workflow is wasted effort. Here's the practical guide.
Custom Claude skills became prominent in 2025 as a way to make recurring agent workflows reusable. They're powerful but easy to misuse — writing a skill for a one-off task is over-engineering, while writing prompts for a recurring complex workflow is under-engineering.
This post is the practical decision guide.
What a skill actually is
A skill is a structured directory containing:
- SKILL.md — instructions the agent reads when the skill is invoked.
- Supporting files — templates, scripts, reference data.
- Trigger conditions — when the LLM should consider using this skill.
When you describe a task, the LLM checks if any skill matches; if one does, it's invoked, and the SKILL.md instructions guide the agent's behavior.
This is more structured than a prompt. Prompts are conversational; skills are reusable, named capabilities.
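To make the structure concrete, here is a minimal sketch of splitting a SKILL.md into its header fields and instruction body. The `Skill` dataclass and `parse_skill` function are hypothetical names for illustration, and the header is assumed to be simple `key: value` lines between two `---` markers, as in the examples later in this post. It is not how Claude itself loads skills.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    when_to_use: str
    instructions: str


def parse_skill(text: str) -> Skill:
    """Split a SKILL.md string into header fields and the instruction body."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("expected SKILL.md to open with a '---' header block")
    # The header ends at the next '---' line after the opening one.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("header block is never closed with '---'")
    header = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")  # split on the first colon only
        header[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return Skill(header.get("name", ""), header.get("when_to_use", ""), body)


EXAMPLE = """\
---
name: Code review (strict)
when_to_use: When user asks for a strict code review of a diff or file.
---
You are reviewing code for defects.
"""

skill = parse_skill(EXAMPLE)
print(skill.name)
print(skill.when_to_use)
```

The split mirrors the three parts listed above: a name, a trigger, and the instructions the agent follows once the skill is invoked.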
When to write a skill
Three signals that a workflow merits a skill:
1. You've used the same prompt 10+ times
If you've copy-pasted the same prompt template 10 or more times across sessions, it has earned a skill. The cost of writing the skill is amortized across future uses.
2. The workflow is multi-step
If the task is "do X, then check Y, then do Z," a skill that codifies the sequence is more reliable than re-explaining the steps each time.
3. The workflow needs specific files or tools
If the task always involves reading specific reference files or invoking specific tools, a skill that captures the setup is faster than re-specifying each time.
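The first signal is easy to check if you keep any record of the prompts you send. The sketch below assumes you can export that record as a plain list of strings; `normalize` and `skill_candidates` are hypothetical helpers, and the threshold of 10 simply follows the rule of thumb above.

```python
import re
from collections import Counter


def normalize(prompt: str) -> str:
    """Collapse whitespace and case so trivially different pastes count as one."""
    return re.sub(r"\s+", " ", prompt).strip().lower()


def skill_candidates(prompts: list[str], threshold: int = 10) -> list[tuple[str, int]]:
    """Return (prompt, count) pairs for prompts used at least `threshold` times."""
    counts = Counter(normalize(p) for p in prompts)
    return [(p, n) for p, n in counts.most_common() if n >= threshold]


# Hypothetical history: one prompt pasted 12 times, another used twice.
history = ["Review this diff strictly."] * 12 + ["Rename  this variable"] * 2
print(skill_candidates(history))
```

Exact-match counting only catches literal copy-paste; prompts you retype with small variations would need fuzzier matching.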
When NOT to write a skill
Most workflows shouldn't be skills.
One-off tasks
A task you'll do once or twice this quarter doesn't warrant a skill. A regular prompt is faster to write and adequate.
Highly variable tasks
If each instance of "the same task" actually needs different instructions, a skill won't help — you'll customize each time anyway.
Tasks that already have great built-in tools
Don't write a skill for "read this file" — Claude Code's built-in Read is already the right answer.
Tasks where the agent already does well
If you ask Claude "write me a TypeScript type for this object" and it works 95% of the time, don't add a skill. Skills are for the workflows where unstructured prompting keeps failing, not the ones it already handles.
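The signals and counter-signals above can be folded into one small decision function. This is a sketch under stated assumptions, not a rule from Anthropic: the `Workflow` fields are hypothetical, any single positive signal is treated as sufficient, and any counter-signal vetoes the skill.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    times_prompt_reused: int           # how often the same prompt was pasted
    expected_uses_this_quarter: int    # rough forecast of future use
    is_multi_step: bool = False        # "do X, then check Y, then do Z"
    needs_fixed_files_or_tools: bool = False
    instructions_vary_each_time: bool = False
    built_in_tool_covers_it: bool = False
    plain_prompt_already_works: bool = False


def should_write_skill(w: Workflow) -> bool:
    """Apply the counter-signals first, then look for any positive signal."""
    if w.expected_uses_this_quarter <= 2:   # one-off task
        return False
    if w.instructions_vary_each_time:       # highly variable task
        return False
    if w.built_in_tool_covers_it or w.plain_prompt_already_works:
        return False
    return (
        w.times_prompt_reused >= 10
        or w.is_multi_step
        or w.needs_fixed_files_or_tools
    )


strict_review = Workflow(times_prompt_reused=15, expected_uses_this_quarter=12, is_multi_step=True)
one_off = Workflow(times_prompt_reused=0, expected_uses_this_quarter=1, is_multi_step=True)
ts_type = Workflow(times_prompt_reused=20, expected_uses_this_quarter=30, plain_prompt_already_works=True)

print(should_write_skill(strict_review), should_write_skill(one_off), should_write_skill(ts_type))
```

Checking the vetoes first reflects the point that most workflows shouldn't be skills: a multi-step task you run once is still a one-off.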
What good skills look like
Skill: "code-review-strict"
---
name: Code review (strict)
when_to_use: When user asks for a strict / thorough / critical code review of a diff or file.
---
You are reviewing code for defects. Produce a severity-rated list (P0/P1/P2):
- P0: bugs, security issues, broken behavior
- P1: significant code quality issues, design problems
- P2: minor stylistic issues
For each finding:
1. Quote the line(s) in a fenced block
2. State the defect in one sentence
3. Suggest the smallest fix
4. Prefix [STYLE] if it's a stylistic preference, not a defect
Don't list things that are correct. Don't summarize the PR.
Why this works as a skill: the workflow is recurring (you review code regularly), multi-step (severity rating + structured output), and the prompt is specific enough to not need customization per instance.
Skill: "blog-post-from-notes"
---
name: Blog post from notes
when_to_use: When user asks to convert notes/outline into a publishable blog post in our voice.
---
Convert the provided notes into a blog post following these rules:
1. Read `~/dev/_shared/references/brand-voice.md` for tone.
2. Length: 1500-2500 words.
3. Structure: lead → context → 3-5 main sections (H2s) → conclusion.
4. Include:
- First paragraph that states the problem and the take.
- Concrete examples in each main section.
- One "what we'd skip" honest critique.
5. Don't use:
- Hype language ("game-changer", "revolutionize").
- Bullet-point-heavy style (prefer paragraphs).
- Buried lede.
Why this works: highly recurring (you publish weekly), specific (length, structure, voice rules), and it references a brand-voice file that the agent reads each time the skill runs.
Skill: "swift-codable-migration"
---
name: Swift Codable migration
when_to_use: When adding a field to a persisted Codable struct in mqdirCore.
---
When adding a field to a Codable struct in mqdirCore:
1. Add the field with a default value.
2. Hand-roll `init(from decoder: Decoder)` if not already present.
3. Use `try container.decodeIfPresent(<Type>.self, forKey: .<key>) ?? <default>` for the new field.
4. Add a test in `<Module>MigrationTests.swift`:
```swift
func testMigration_vN_to_vNplus1_preservesAllFields() throws {
    // Fixture: JSON exactly as version N persisted it (without the new field).
    let vNJSON = """
    { ... }
    """
    let decoded = try JSONDecoder().decode(<Type>.self, from: Data(vNJSON.utf8))
    XCTAssertEqual(decoded.<existing field>, ...) // old fields preserved
    XCTAssertEqual(decoded.<new field>, <default>) // new field defaulted
}
```
5. Bump the version number in `<Type>.currentVersion`.
6. Run `swift test --filter <Module>MigrationTests` to verify.
Why this works: project-specific recurring task with strict pattern, where mistakes cost user data. Codifying as a skill ensures every instance follows the right shape.
What bad skills look like
Skill: "be helpful"
```md
---
name: Be helpful
when_to_use: Always.
---
Be helpful. Answer the user's question.
```
Adds no information. Wastes the skill mechanism.
Skill: "write tests"
---
name: Write tests
when_to_use: When user asks for tests.
---
Write tests for the function provided.
Too vague. The agent already knows how to write tests; this skill doesn't add specifics.
Skill: "do everything"
A 500-line SKILL.md that tries to encompass an entire workflow domain. Skills should be focused — one workflow per skill.
If your skill has more than ~150 lines, you've probably bundled too much. Split it into multiple focused skills.
Skill structure recommendations
Keep SKILL.md short
Aim for <150 lines. Concise skills are reliable; sprawling skills get only partially followed.
Define when_to_use precisely
The LLM uses this field to decide when to invoke the skill. Vague triggers ("when relevant") cause misfires; precise triggers ("when reviewing TypeScript code in our auth module") are reliable.
Reference external files for long content
Don't embed a 500-word brand voice guide in SKILL.md. Reference an external file:
- Read `~/dev/_shared/references/brand-voice.md` for tone.
Skills should be instructions; the data they reference can be elsewhere.
Include negative examples
Don't:
- Use the phrase "moving forward" — corporate-speak.
- Start posts with "In today's fast-paced world" — tired opener.
Negative examples anchor the skill's standards.
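The first two recommendations (length and trigger precision) are mechanical enough to lint. The sketch below is one way to do it, under assumptions of my own: `lint_skill`, the `VAGUE_TRIGGERS` set, and the "fewer than four words" heuristic are all hypothetical, and the 150-line limit is the rule of thumb from this post rather than a hard constraint of the format.

```python
VAGUE_TRIGGERS = {"always", "when relevant", "when needed", "as appropriate"}


def find_trigger(lines: list[str]) -> str:
    """Return the text after 'when_to_use:', or '' if the line is missing."""
    for line in lines:
        if line.startswith("when_to_use:"):
            return line.split(":", 1)[1].strip()
    return ""


def lint_skill(text: str, max_lines: int = 150) -> list[str]:
    """Return human-readable warnings for an over-long or vaguely triggered skill."""
    warnings = []
    lines = text.splitlines()
    if len(lines) > max_lines:
        warnings.append(f"too long: {len(lines)} lines (limit {max_lines}); consider splitting")
    trigger = find_trigger(lines)
    if not trigger:
        warnings.append("no when_to_use line found")
    elif trigger.lower().rstrip(". ") in VAGUE_TRIGGERS or len(trigger.split()) < 4:
        warnings.append(f"vague trigger: {trigger!r}")
    return warnings


BAD = "---\nname: Be helpful\nwhen_to_use: Always.\n---\nBe helpful. Answer the user's question.\n"
GOOD = (
    "---\nname: Code review (strict)\n"
    "when_to_use: When user asks for a strict / thorough / critical code review of a diff or file.\n"
    "---\nYou are reviewing code for defects.\n"
)

print(lint_skill(BAD))
print(lint_skill(GOOD))
```

A check like this only catches the obvious cases. Whether a trigger is precise enough still shows up mainly as misfires in real sessions.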
Where skills live
Anthropic's skill format places them in a structured directory. Common pattern:
~/.claude/skills/
├── code-review-strict/
│ └── SKILL.md
├── blog-post-from-notes/
│ ├── SKILL.md
│ └── voice-reference.md
└── swift-codable-migration/
└── SKILL.md
Each skill is a directory. SKILL.md is the entry point. Supporting files live alongside.
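Creating that layout by hand is quick, but a small scaffold keeps every new skill in the same shape. The sketch below is illustrative: `scaffold_skill` is a hypothetical helper, the header fields follow the examples in this post, and the demo writes to a temporary directory rather than your real `~/.claude/skills/`.

```python
import tempfile
from pathlib import Path


def scaffold_skill(root: Path, slug: str, name: str, when_to_use: str) -> Path:
    """Create <root>/<slug>/SKILL.md with a header stub and return its path."""
    skill_dir = root / slug
    skill_dir.mkdir(parents=True, exist_ok=False)  # refuse to overwrite an existing skill
    skill_md = skill_dir / "SKILL.md"
    skill_md.write_text(
        f"---\nname: {name}\nwhen_to_use: {when_to_use}\n---\n\n",
        encoding="utf-8",
    )
    return skill_md


# Demo against a throwaway directory; pass Path.home() / ".claude" / "skills" for real use.
with tempfile.TemporaryDirectory() as tmp:
    path = scaffold_skill(
        Path(tmp),
        "code-review-strict",
        "Code review (strict)",
        "When user asks for a strict code review of a diff or file.",
    )
    created = path.relative_to(tmp).as_posix()
    content = path.read_text(encoding="utf-8")

print(created)  # prints code-review-strict/SKILL.md
```

Refusing to overwrite (`exist_ok=False`) is a deliberate choice here: an existing skill has usually been tuned by hand, and a scaffold should never clobber it.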
For team-shared skills, commit the directory tree to a shared repo and point each team member's Claude Code config at it.
Maintenance
Skills aren't write-once. Two practices:
When the agent doesn't follow a skill instruction, fix the instruction.
If you wrote `Don't use bullet-heavy style` and the agent still produces bullets, the instruction wasn't specific enough. Update:
Don't use bullet-heavy style. If you find yourself producing 3+ bullets in
a row, convert to a paragraph instead. Bullets are for genuine lists (>5
items, or hierarchical), not paragraph chunking.
The clarification anchors what "bullet-heavy" means.
When you find yourself overriding a skill instruction, update the skill.
If you regularly say "ignore the part about X" when invoking the skill, X shouldn't be in the skill. Remove it.
Common skill patterns
Pattern 1: review skills
code-review-strict, code-review-style, pr-description-writer. One per review type.
Pattern 2: generation skills
generate-react-component, generate-jest-test, generate-changelog. One per output type.
Pattern 3: transformation skills
markdown-to-blog, feature-spec-to-tickets, api-error-to-user-message. One per transformation.
Pattern 4: project-specific patterns
swift-codable-migration, react-form-with-zod, nextjs-route-with-tests. One per recurring pattern in your project.
File-manager setup for skill development
mq-dir's pane setup helps when authoring skills:
- Pane 1: skills directory `~/.claude/skills/`.
- Pane 2: a SKILL.md you're editing.
- Pane 3: a reference file the skill points to.
- Pane 4: cmux session where you test the skill.
You see the skill list, the current edit, the referenced data, and the test session at once.
Verdict
Skills are a powerful Claude Code feature for workflows that:
- Recur weekly+.
- Have multi-step structure.
- Have specific output shapes.
- Reference specific files or tools.
They're misused when written for:
- One-off tasks.
- Highly variable tasks.
- Tasks the agent already handles well.
Aim for ~10-30 well-crafted skills, each focused, each <150 lines. Don't sprawl.
The investment is real (30 min - 2 hours per skill) but the daily compounding return is significant for recurring workflows.
mq-dir is the navigation layer for the artifacts your skill-driven workflows produce. Free, MIT.