What is use-skill-create?

This Claude skill introduces a rigorous Test-Driven Development (TDD) methodology for creating and maintaining AI agent skills. It solves the problem of unreliable agent behavior by enforcing a 'Red-Green-Refactor' cycle, ensuring that every instruction is backed by a proven failure case and a verified solution. By focusing on 'The Iron Law'—no skill without a failing test—it enables developers to build bulletproof process documentation that agents actually follow, eliminating common rationalizations and loopholes.

When should I use use-skill-create?

use-skill-create is useful in the following scenarios:

• Developing high-reliability agentic workflows: Use the Red-Green-Refactor cycle to ensure new skills effectively guide agent behavior in complex, high-pressure scenarios.
• Refining existing AI instructions: Identify specific 'loopholes' or excuses an agent uses to bypass instructions and refactor the skill with explicit counters and red flags.
• Optimizing AI knowledge bases: Create compressed reference guides that follow Claude Search Optimization (CSO), improving token efficiency while maintaining strict adherence to proven techniques.
• Verifying agent compliance before deployment: Run pressure tests with subagents to document baseline failures and verify that the skill successfully corrects those specific behaviors.

name	use-skill-create
description	Use when creating new skills, editing existing skills, or verifying skills work before deployment

Writing skills IS Test-Driven Development applied to process documentation.

You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).

Core principle: If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.

REQUIRED BACKGROUND: You MUST understand autonome:use-tdd before using this skill.

What is a Skill?

A skill is a reference guide for proven techniques, patterns, or tools.

Skills are: Reusable techniques, patterns, tools, reference guides

Skills are NOT: Narratives about how you solved a problem once

When to Create a Skill

Create when:

Technique wasn't intuitively obvious to you
You'd reference this again across projects
Pattern applies broadly (not project-specific)
Others would benefit

Don't create for:

One-off solutions
Standard practices well-documented elsewhere
Project-specific conventions (put in CLAUDE.md)
Mechanical constraints (automate with regex/validation)

The Iron Law (Same as TDD)

NO SKILL WITHOUT A FAILING TEST FIRST

This applies to NEW skills AND EDITS to existing skills.

Write skill before testing? Delete it. Start over.

RED-GREEN-REFACTOR for Skills

RED: Write Failing Test (Baseline)

Run pressure scenario with subagent WITHOUT the skill. Document exact behavior:

What choices did they make?
What rationalizations did they use (verbatim)?
Which pressures triggered violations?

GREEN: Write Minimal Skill

Write skill that addresses those specific rationalizations.

Run same scenarios WITH skill. Agent should now comply.

REFACTOR: Close Loopholes

Agent found new rationalization? Add explicit counter. Re-test until bulletproof.
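
The cycle above can be sketched as a small loop. This is an illustration under assumptions, not part of the skill: `run_scenario` and `write_skill` are hypothetical callables standing in for dispatching a subagent and drafting the document, and the stub agent, its excuse, and the scenario text are invented so that the example runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    complied: bool
    rationalization: Optional[str] = None  # verbatim excuse, if the agent gave one


def red_green_refactor(scenarios, run_scenario, write_skill, max_rounds=5):
    # RED: run every scenario WITHOUT the skill and record the failures.
    baseline = [run_scenario(s, None) for s in scenarios]
    excuses = [o.rationalization for o in baseline if not o.complied]
    if not excuses:
        raise ValueError("no failing baseline, so there is nothing for a skill to fix")

    # GREEN: write a minimal skill that addresses those excuses, then re-run.
    skill = write_skill(excuses)
    for _ in range(max_rounds):
        results = [run_scenario(s, skill) for s in scenarios]
        new_excuses = [o.rationalization for o in results if not o.complied]
        if not new_excuses:
            return skill, excuses
        # REFACTOR: add an explicit counter for each new rationalization.
        excuses.extend(e for e in new_excuses if e not in excuses)
        skill = write_skill(excuses)
    raise RuntimeError("skill still has loopholes after max_rounds")


# Hypothetical stand-ins so the sketch is runnable.
EXCUSE = "I'll write the tests afterwards"


def stub_agent(scenario, skill):
    # Invented behaviour: complies only if the skill names its excuse.
    if skill is not None and EXCUSE in skill:
        return Observation(complied=True)
    return Observation(complied=False, rationalization=EXCUSE)


def stub_writer(excuses):
    return "\n".join(f"Red flag: {e}" for e in excuses)


skill, excuses = red_green_refactor(
    ["deadline + sunk cost + exhaustion"], stub_agent, stub_writer
)
```

The `ValueError` mirrors the Iron Law: if no baseline run fails, the loop refuses to produce a skill at all.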

SKILL.md Structure

Frontmatter (YAML):

Only two fields: name and description
Max 1024 characters total
name: Letters, numbers, hyphens only
description: Third-person, "Use when..." format
- CRITICAL: Describe ONLY triggering conditions
- NEVER summarize the skill's process or workflow
- Keep under 500 characters

Why no workflow in description: Testing revealed that when descriptions summarize workflow, Claude may follow the description instead of reading the full skill content. Descriptions should only answer "Should I read this skill right now?"
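
The frontmatter rules above are mechanical enough to check in code. The sketch below is one possible checker, not an official validator: the function name `check_frontmatter` is hypothetical, it takes already-parsed fields as a dict, and it assumes the 1024-character limit is measured over both values combined. It cannot tell whether a description summarizes workflow; that still needs a human or a test run.

```python
import re

NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def check_frontmatter(fields):
    """Return a list of rule violations; an empty list means every check passed."""
    problems = []
    if set(fields) != {"name", "description"}:
        problems.append("frontmatter must contain exactly 'name' and 'description'")
    name = str(fields.get("name", ""))
    description = str(fields.get("description", ""))
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name may contain only letters, numbers, and hyphens")
    if not description.startswith("Use when"):
        problems.append("description should start with 'Use when'")
    if len(description) >= 500:
        problems.append("description should stay under 500 characters")
    # Assumption: "1024 characters total" covers both values combined.
    if len(name) + len(description) > 1024:
        problems.append("frontmatter exceeds 1024 characters in total")
    return problems


ok = check_frontmatter({
    "name": "use-skill-create",
    "description": "Use when creating new skills, editing existing skills, "
                   "or verifying skills work before deployment",
})
bad = check_frontmatter({"name": "bad name!", "description": "Runs the TDD cycle"})
```

Here `ok` is an empty list, while `bad` reports the invalid name and the description that does not start with "Use when".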

Body sections:

# Skill Name

Core principle in 1-2 sentences.

# When to Use

Bullet list with SYMPTOMS and use cases
When NOT to use

# The Process / Pattern

Clear, numbered steps or comparison

# Red Flags

Signs that you're about to violate the skill

# Common Rationalizations

Table of excuses vs. reality

Key Principles

Token Efficiency:

Target <200 words for frequently-loaded skills
Move details to tool help
Use cross-references (don't repeat)
Compress examples
Eliminate redundancy
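
The word target is easy to check before committing a skill. A minimal sketch, assuming whitespace-separated words are an acceptable proxy for length (token counts will differ); the helper names are hypothetical.

```python
def word_count(skill_body):
    """Count whitespace-separated words in the skill body."""
    return len(skill_body.split())


def within_budget(skill_body, limit=200):
    """True when the body is strictly under the word limit."""
    return word_count(skill_body) < limit
```

For example, `within_budget(open("SKILL.md").read())` would check a draft on disk, assuming the file exists at that path.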

Claude Search Optimization (CSO):

Keywords: Error messages, symptoms, synonyms, tools
Descriptive naming: Verb-first, active voice
No workflow summaries in description

Bulletproofing Against Rationalization:

Close every loophole explicitly
Address "Spirit vs Letter" arguments
Build rationalization table from testing
Create red flags list
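
One way to keep the rationalization table in sync with testing is to store the excuses recorded during baseline runs as data and render the table from them. The sketch below assumes a Markdown table layout; the function name and the two sample rows are placeholders for illustration, not findings from real test runs.

```python
def _cell(text):
    # Escape pipes so an excuse containing "|" does not break the table.
    return text.replace("|", "\\|").strip()


def rationalization_table(rows):
    """Render (excuse, reality) pairs as a Markdown table."""
    lines = ["| Excuse | Reality |", "| --- | --- |"]
    for excuse, reality in rows:
        lines.append(f"| {_cell(excuse)} | {_cell(reality)} |")
    return "\n".join(lines)


# Placeholder rows; real entries should be verbatim excuses from baseline runs.
table = rationalization_table([
    ("I'm following the spirit of the rule", "Violating the letter is violating the rule"),
    ("It's too simple to test", "Untested skills have untested loopholes"),
])
```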

Testing Skill Types

Discipline-Enforcing Skills (TDD, verification):

Test with pressure scenarios
Multiple combined pressures (time + sunk cost + exhaustion)
Identify rationalizations, add explicit counters

Technique Skills (how-to guides):

Application scenarios
Variation scenarios
Missing information tests

Pattern Skills (mental models):

Recognition scenarios
Application scenarios
Counter-examples

Skill Creation Checklist

RED Phase:

Create pressure scenarios (3+ combined pressures)
Run WITHOUT skill - document baseline verbatim
Identify patterns in rationalizations
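
To get scenarios with three or more combined pressures, it can help to enumerate the combinations up front. A minimal sketch: "time", "sunk cost", and "exhaustion" come from this document, while "authority" is an illustrative addition and the helper name is hypothetical.

```python
from itertools import combinations


def pressure_combinations(pressures, minimum=3):
    """List every combination that stacks at least `minimum` pressures."""
    combos = []
    for size in range(minimum, len(pressures) + 1):
        combos.extend(combinations(pressures, size))
    return combos


# "authority" is an invented example pressure, not taken from the skill.
pressures = ["time", "sunk cost", "exhaustion", "authority"]
combos = pressure_combinations(pressures)
```

With four pressures this yields five combinations: four triples and the full set. Each one still has to be turned into a concrete scenario by hand.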

GREEN Phase:

Name uses only letters, numbers, hyphens
Description starts with "Use when..." (no workflow)
Keywords throughout for search
Address specific baseline failures
Run scenarios WITH skill - verify compliance

REFACTOR Phase:

Identify NEW rationalizations
Add explicit counters
Build rationalization table
Create red flags list
Re-test until bulletproof

Deployment:

Commit skill to git

The Bottom Line

Creating skills IS TDD for process documentation.

Same Iron Law: No skill without failing test first. Same cycle: RED (baseline) → GREEN (write skill) → REFACTOR (close loopholes).

If you follow TDD for code, follow it for skills.
