# Custom Claude skills: when to write one (and when not to)
Claude skills are reusable agent capabilities. They're powerful — but writing one for the wrong workflow is wasted effort. Here's the practical guide.
Custom Claude skills became prominent in 2025 as a way to make recurring agent workflows reusable. They're powerful but easy to misuse — writing a skill for a one-off task is over-engineering, while writing prompts for a recurring complex workflow is under-engineering.
This post is the practical decision guide.
## What a skill actually is
A skill is a structured directory containing:
- SKILL.md — instructions the agent reads when the skill is invoked.
- Supporting files — templates, scripts, reference data.
- Trigger conditions — when the LLM should consider using this skill.
When you describe a task, the LLM checks if any skill matches; if one does, it's invoked, and the SKILL.md instructions guide the agent's behavior.
This is more structured than a prompt. Prompts are conversational; skills are reusable, named capabilities.
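To make that concrete, here's a minimal hypothetical SKILL.md — the name, trigger, and instructions are illustrative, not a real skill:

```md
---
name: Changelog entry
when_to_use: When user asks to record a merged change in the changelog.
---
Append an entry to CHANGELOG.md under the "Unreleased" heading:
1. One line, imperative mood, referencing the PR.
2. Group it under Added / Changed / Fixed as appropriate.
```

The frontmatter tells the LLM when to invoke the skill; the body tells it what to do once invoked.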
## When to write a skill
Three signals that a workflow merits a skill:
### 1. You've used the same prompt 10+ times
If you've copy-pasted the same prompt template more than 10 times across sessions, it has earned promotion to a skill. The cost of writing the skill is amortized across all future uses.
### 2. The workflow is multi-step
If the task is "do X, then check Y, then do Z," a skill that codifies the sequence is more reliable than re-explaining the steps each time.
### 3. The workflow needs specific files or tools
If the task always involves reading specific reference files or invoking specific tools, a skill that captures the setup is faster than re-specifying each time.
## When NOT to write a skill
Most workflows shouldn't be skills.
### One-off tasks
A task you'll do once or twice this quarter doesn't warrant a skill. A regular prompt is faster to write and adequate.
### Highly variable tasks
If each instance of "the same task" actually needs different instructions, a skill won't help — you'll customize each time anyway.
### Tasks that already have great built-in tools
Don't write a skill for "read this file" — Claude Code's built-in Read is already the right answer.
### Tasks the agent already does well
If you ask Claude "write me a TypeScript type for this object" and it works 95% of the time, don't add a skill. Skills are for the 5% workflows where unstructured prompting fails.
## What good skills look like
### Skill: "code-review-strict"

```md
---
name: Code review (strict)
when_to_use: When user asks for a strict / thorough / critical code review of a diff or file.
---
You are reviewing code for defects. Produce a severity-rated list (P0/P1/P2):
- P0: bugs, security issues, broken behavior
- P1: significant code quality issues, design problems
- P2: minor stylistic issues

For each finding:
1. Quote the line(s) in a fenced block
2. State the defect in one sentence
3. Suggest the smallest fix
4. Prefix [STYLE] if it's a stylistic preference, not a defect

Don't list things that are correct. Don't summarize the PR.
```
Why this works as a skill: the workflow is recurring (you review code regularly), multi-step (severity rating + structured output), and the prompt is specific enough to not need customization per instance.
### Skill: "blog-post-from-notes"

```md
---
name: Blog post from notes
when_to_use: When user asks to convert notes/outline into a publishable blog post in our voice.
---
Convert the provided notes into a blog post following these rules:
1. Read `~/dev/_shared/references/brand-voice.md` for tone.
2. Length: 1500-2500 words.
3. Structure: lead → context → 3-5 main sections (H2s) → conclusion.
4. Include:
   - First paragraph that states the problem and the take.
   - Concrete examples in each main section.
   - One "what we'd skip" honest critique.
5. Don't use:
   - Hype language ("game-changer", "revolutionize").
   - Bullet-point-heavy style (prefer paragraphs).
   - A buried lede.
```
Why this works: highly recurring (you publish weekly), specific (length, structure, voice rules), and references a brand-voice file that the skill auto-loads.
### Skill: "swift-codable-migration"

````md
---
name: Swift Codable migration
when_to_use: When adding a field to a persisted Codable struct in mqdirCore.
---
When adding a field to a Codable struct in mqdirCore:
1. Add the field with a default value.
2. Hand-roll `init(from decoder: Decoder)` if not already present.
3. Use `try container.decodeIfPresent(<Type>.self, forKey: .<key>) ?? <default>` for the new field.
4. Add a test in `<Module>MigrationTests.swift`:
   ```swift
   func testMigration_vN_to_vNPlus1_preservesAllFields() throws {
       let vNJSON = """
       { ... }
       """
       let decoded = try JSONDecoder().decode(<Type>.self, from: ...)
       XCTAssertEqual(decoded.<existing field>, ...)   // old fields preserved
       XCTAssertEqual(decoded.<new field>, <default>)  // new field defaulted
   }
   ```
5. Bump the version number in `<Type>.currentVersion`.
6. Run `swift test --filter <Module>MigrationTests` to verify.
````
Why this works: project-specific recurring task with strict pattern, where mistakes cost user data. Codifying as a skill ensures every instance follows the right shape.
## What bad skills look like
### Skill: "be helpful"
```md
---
name: Be helpful
when_to_use: Always.
---
Be helpful. Answer the user's question.
```

Adds no information. Wastes the skill mechanism.
### Skill: "write tests"

```md
---
name: Write tests
when_to_use: When user asks for tests.
---
Write tests for the function provided.
```

Too vague. The agent already knows how to write tests; this skill doesn't add specifics.
### Skill: "do everything"
A 500-line SKILL.md that tries to encompass an entire workflow domain. Skills should be focused — one workflow per skill.
If your skill has more than ~150 lines, you've probably bundled too much. Split it into multiple focused skills.
## Skill structure recommendations
### Keep SKILL.md short
Aim for <150 lines. Concise skills are reliable; sprawling skills get partially-followed.
### Define `when_to_use` precisely
The LLM uses this to decide when to invoke. Vague triggers ("when relevant") cause skill misfires; precise triggers ("when reviewing TypeScript code in our auth module") are reliable.
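As a hypothetical illustration of the difference:

```md
# Too vague — fires unpredictably, or not at all:
when_to_use: When relevant.

# Precise — fires on exactly the requests you intended:
when_to_use: When reviewing TypeScript code in our auth module.
```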
### Reference external files for long content
Don't embed a 500-word brand voice guide in SKILL.md. Reference an external file:
- Read `~/dev/_shared/references/brand-voice.md` for tone.
Skills should be instructions; the data they reference can be elsewhere.
### Include negative examples
Don't:
- Use the phrase "moving forward" — corporate-speak.
- Start posts with "In today's fast-paced world" — tired opener.
Negative examples anchor the skill's standards.
## Where skills live
Anthropic's skill format places them in a structured directory. Common pattern:
```
~/.claude/skills/
├── code-review-strict/
│   └── SKILL.md
├── blog-post-from-notes/
│   ├── SKILL.md
│   └── voice-reference.md
└── swift-codable-migration/
    └── SKILL.md
```
Each skill is a directory. SKILL.md is the entry point. Supporting files live alongside.
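Scaffolding a new skill is just creating that directory shape. A minimal shell sketch — the skills path, skill name, and body here are illustrative, not required values:

```shell
# Create a skill directory with its SKILL.md entry point.
SKILLS_DIR="${SKILLS_DIR:-$HOME/.claude/skills}"
SKILL="changelog-entry"   # illustrative skill name

mkdir -p "$SKILLS_DIR/$SKILL"
cat > "$SKILLS_DIR/$SKILL/SKILL.md" <<'EOF'
---
name: Changelog entry
when_to_use: When user asks to record a merged change in the changelog.
---
Append a one-line entry to CHANGELOG.md under "Unreleased".
EOF
```

Supporting files (templates, reference data) go in the same directory, alongside SKILL.md.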
For team-shared skills, commit the directory tree to a shared repo and point each team member's Claude Code config at it.
## Maintenance
Skills aren't write-once. Two practices:
### When the agent doesn't follow a skill instruction, fix the instruction
If you wrote "Don't use bullet-heavy style" and the agent still produces bullets, the instruction wasn't strong enough. Update it:

> Don't use bullet-heavy style. If you find yourself producing 3+ bullets in a row, convert to a paragraph instead. Bullets are for genuine lists (>5 items, or hierarchical), not paragraph chunking.
The clarification anchors what "bullet-heavy" means.
### When you find yourself overriding a skill instruction, update the skill
If you regularly say "ignore the part about X" when invoking the skill, X shouldn't be in the skill. Remove it.
## Common skill patterns
### Pattern 1: review skills
code-review-strict, code-review-style, pr-description-writer. One per review type.
### Pattern 2: generation skills
generate-react-component, generate-jest-test, generate-changelog. One per output type.
### Pattern 3: transformation skills
markdown-to-blog, feature-spec-to-tickets, api-error-to-user-message. One per transformation.
### Pattern 4: project-specific patterns
swift-codable-migration, react-form-with-zod, nextjs-route-with-tests. One per recurring pattern in your project.
## File-manager setup for skill development
mq-dir's pane setup helps when authoring skills:
- Pane 1: the skills directory, `~/.claude/skills/`.
- Pane 2: the SKILL.md you're editing.
- Pane 3: a reference file the skill points to.
- Pane 4: cmux session where you test the skill.
You see the skill list, the current edit, the referenced data, and the test session at once.
## Verdict
Skills are a powerful Claude Code feature for workflows that:
- Recur weekly+.
- Have multi-step structure.
- Have specific output shapes.
- Reference specific files or tools.
They're misused when written for:
- One-off tasks.
- Highly variable tasks.
- Tasks the agent already handles well.
Aim for ~10-30 well-crafted skills, each focused, each <150 lines. Don't sprawl.
The investment is real (30 min - 2 hours per skill) but the daily compounding return is significant for recurring workflows.
mq-dir is the navigation layer for the artifacts your skill-driven workflows produce. Free, MIT.