# smith-prompts

Prompt engineering standards for AI interactions with cache optimization. Use when writing AI prompts, optimizing context usage, or structuring AGENTS.md files. Covers prompt caching, token efficiency, and progressive disclosure patterns.

## When & Why to Use This Skill

This Claude skill establishes advanced prompt engineering standards focused on cache optimization and token efficiency. It provides a technical framework for structuring AGENTS.md files and AI prompts to maximize cache hits, potentially reducing costs by 90% and latency by 85% through strategic content ordering and progressive disclosure patterns.

### Use Cases

- Optimizing high-frequency AI agent interactions to maximize prompt cache hits and significantly reduce operational overhead.
- Structuring complex AGENTS.md files using a cache-friendly architecture that separates static instructions from dynamic project data.
- Implementing progressive disclosure and sparse attention patterns to handle large-scale context without exceeding token limits or degrading performance.
- Designing robust structured outputs and tool schemas to ensure consistent AI behavior across different LLM providers like Anthropic, OpenAI, and Gemini.

## Prompt Engineering Standards

- Load if: Writing AI prompts, optimizing context usage
- Prerequisites: @smith-principles/SKILL.md

### CRITICAL: Prompt Caching (Primacy Zone)

Caching reduces costs by up to 90% and latency by up to 85%.

Structure for caching:

- Static content first (methodology, rules)
- Tool definitions in consistent order
- Project context (AGENTS.md, docs)
- Dynamic content last (recent changes)

Cache breakpoints fall roughly every 1024 tokens, and the prefix must be identical, token for token, for a cache hit. A request-shaping sketch follows the list below.

Avoid (each of these breaks the cached prefix):

- Reordering tools between calls
- Injecting dynamic content into static sections
- Modifying cached prefix unnecessarily
- Using Markdown tables (see @smith-skills/SKILL.md; use bullet lists instead)
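
Here is a minimal sketch of this ordering with the Anthropic Messages API, which marks a cache breakpoint with `cache_control`; the model alias and the `STATIC_RULES` source are placeholders, not fixed choices:

```python
from anthropic import Anthropic

client = Anthropic()

# Static methodology and rules: identical on every call, so they can cache.
STATIC_RULES = open("AGENTS.md").read()

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder; any caching-capable model
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": STATIC_RULES,
            # Cache breakpoint: everything up to and including this block
            # is reused across calls with an identical prefix.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    # Dynamic content goes last, after the cached prefix.
    messages=[{"role": "user", "content": "Summarize the latest diff."}],
)
```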

### AGENTS.md Cache-Friendly Structure

```xml
<!-- STATIC - cached -->
<metadata>
Scope, Load if, Prerequisites
</metadata>

<required>
Critical NEVER/ALWAYS rules
</required>

<forbidden>
Anti-patterns
</forbidden>

<!-- CACHE BREAKPOINT (~1024 tokens) -->

<!-- DYNAMIC - not cached -->
<examples>
Code examples that evolve
</examples>
```

### Token Efficiency

#### Progressive Disclosure

Three-level loading:

1. Metadata only (~50 tokens)
2. Core concepts when triggered (~200 tokens)
3. Full details when accessed (1000+ tokens)
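
A minimal sketch of such a loader; the per-level file names (`METADATA.md`, `CORE.md`, `SKILL.md`) and directory layout are hypothetical:

```python
from pathlib import Path

def load_skill(skill_dir: Path, level: int = 1) -> str:
    """Progressively disclose a skill: start cheap, expand only on demand."""
    parts = ["METADATA.md"]        # level 1: metadata only (~50 tokens)
    if level >= 2:
        parts.append("CORE.md")    # level 2: core concepts (~200 tokens)
    if level >= 3:
        parts.append("SKILL.md")   # level 3: full details (1000+ tokens)
    return "\n\n".join((skill_dir / name).read_text() for name in parts)
```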

#### Sparse Attention

Efficient file reading:

- Grep to find location
- Read with offset/limit for large files
- Read only necessary context (±20 lines)

Avoid:

- Loading full files when targeted reads suffice
- Reading documentation when metadata answers the question
- Repeating user's question in responses
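
A sketch of the grep-then-read pattern in plain Python; `read_context` is a hypothetical helper, not a standard tool:

```python
import re
from pathlib import Path

def read_context(path: Path, pattern: str, radius: int = 20) -> str:
    """Locate a pattern, then return only the ±radius lines around the first hit."""
    lines = path.read_text().splitlines()
    for i, line in enumerate(lines):
        if re.search(pattern, line):
            lo = max(0, i - radius)
            hi = min(len(lines), i + radius + 1)
            # Emit file:line references so the snippet stays addressable.
            return "\n".join(
                f"{path}:{n + 1}: {text}"
                for n, text in enumerate(lines[lo:hi], start=lo)
            )
    return ""

# Usage: read_context(Path("src/app.py"), r"def handle_request")
```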

### Structured Output

Platform mechanisms:

- OpenAI: JSON Schema with `strict: true` (100% schema compliance)
- Anthropic: Tool use with flexible schemas
- Gemini: `responseSchema` with retry

Schema design:
- Match existing project patterns
- Include descriptions for complex fields
- Define required vs optional fields
- Keep nesting ≤3 levels
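
A sketch combining OpenAI strict mode with the schema-design rules above; the model name and the schema itself are illustrative:

```python
from openai import OpenAI

client = OpenAI()

# Flat schema: descriptions on non-obvious fields, explicit required list,
# nesting well under three levels.
BUILD_REPORT = {
    "type": "object",
    "properties": {
        "status": {"type": "string", "enum": ["pass", "fail"]},
        "errors": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Human-readable error messages; empty when status is pass",
        },
    },
    "required": ["status", "errors"],  # strict mode lists every property
    "additionalProperties": False,     # also required by strict mode
}

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; any model with structured-output support
    messages=[{"role": "user", "content": "Summarize this build log: ..."}],
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "build_report", "strict": True, "schema": BUILD_REPORT},
    },
)
```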

Related skills:

- @smith-ctx/SKILL.md - Progressive disclosure, reference-based communication
- @smith-xml/SKILL.md - Approved XML tags

### ACTION (Recency Zone)

For caching:

- Place static content before dynamic
- Maintain consistent tool order
- Target a >80% cache hit rate (see the measurement sketch below)

For efficiency:

- Use Grep before Read
- Read incrementally (narrow → expand)
- Use file:line references
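
To check the hit-rate target, here is a minimal sketch that reads the cache counters reported in the Messages API usage block; the field names follow Anthropic's prompt-caching docs, but treat them as an assumption if your SDK version differs:

```python
def cache_hit_rate(usage) -> float:
    """Fraction of input tokens served from cache, given a response.usage object."""
    cached = getattr(usage, "cache_read_input_tokens", 0) or 0
    written = getattr(usage, "cache_creation_input_tokens", 0) or 0
    fresh = getattr(usage, "input_tokens", 0) or 0
    total = cached + written + fresh
    return cached / total if total else 0.0

# Usage: cache_hit_rate(response.usage) -> e.g. 0.87; target > 0.80
```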