llm-prompt-optimizer
Optimize prompts for better LLM outputs through systematic analysis and refinement
When & Why to Use This Skill
The LLM Prompt Optimizer is a specialized tool designed to enhance the quality, accuracy, and reliability of AI responses through systematic prompt engineering. It solves the common problem of inconsistent or low-quality LLM outputs by applying evidence-based techniques such as chain-of-thought prompting, constraint calibration, and model-specific adjustments. By diagnosing failure patterns like hallucinations and ambiguity, it transforms vague instructions into high-performance prompts, making it essential for developers and researchers aiming for production-grade AI results.
Use Cases
- Refining complex business logic prompts to ensure consistent structured data output (e.g., JSON) and eliminate formatting errors.
- Reducing hallucinations and factual inaccuracies in RAG (Retrieval-Augmented Generation) systems by tightening context constraints and adding negative examples.
- Migrating and adapting existing prompts between different models (e.g., GPT-4 to Claude) to maintain performance while leveraging model-specific strengths.
- Debugging underperforming prompts by identifying hidden ambiguities and missing context that lead to irrelevant or off-target AI responses.
- Optimizing token usage for long-context tasks by removing redundant instructions and compressing examples without losing instruction clarity.
| Field | Value |
|---|---|
| name | LLM Prompt Optimizer |
| slug | llm-prompt-optimizer |
| description | Optimize prompts for better LLM outputs through systematic analysis and refinement |
| category | ai-ml |
| complexity | intermediate |
| version | 1.0.0 |
| author | ID8Labs |
LLM Prompt Optimizer
The LLM Prompt Optimizer skill systematically analyzes and refines prompts to maximize the quality, accuracy, and relevance of large language model outputs. It applies evidence-based optimization techniques including structural improvements, context enrichment, constraint calibration, and output format specification.
This skill goes beyond basic prompt writing by leveraging understanding of how different LLMs process instructions, their attention patterns, and their response tendencies. It helps you transform underperforming prompts into high-yield instructions that consistently produce the results you need.
Whether you are building production AI systems, conducting research, or simply want better ChatGPT responses, this skill ensures your prompts are optimized for your specific model and use case.
Core Workflows
Workflow 1: Analyze and Diagnose Prompt Issues
- Receive the current prompt and sample outputs
- Identify failure patterns:
- Hallucination triggers
- Ambiguity sources
- Missing context gaps
- Conflicting instructions
- Over/under-constrained parameters
- Map issues to specific prompt segments
- Prioritize fixes by impact
- Explain root causes to the user (see the diagnostic sketch below)
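As a rough illustration of this diagnostic pass, the Python sketch below tallies simple failure signals across sample outputs. The checks shown (JSON formatting, refusal phrases, length) are hypothetical placeholders; in practice you would swap in heuristics matched to the failure patterns you actually observe.

```python
from collections import Counter

# Hypothetical failure checks; replace with heuristics for your own use case.
FAILURE_CHECKS = {
    "not_json": lambda out: not out.strip().startswith("{"),
    "refusal_or_hedging": lambda out: any(
        phrase in out.lower() for phrase in ("i cannot", "as an ai")
    ),
    "over_length": lambda out: len(out.split()) > 300,
}

def diagnose(sample_outputs: list[str]) -> Counter:
    """Count how often each failure pattern appears across sample outputs."""
    tally = Counter()
    for out in sample_outputs:
        for name, check in FAILURE_CHECKS.items():
            if check(out):
                tally[name] += 1
    return tally

# Most frequent issues first, to prioritize fixes by impact:
# diagnose(sample_outputs).most_common()
```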
Workflow 2: Apply Optimization Techniques
- Select appropriate techniques based on diagnosis:
- Chain-of-thought insertion
- Few-shot example addition
- Role/persona specification
- Output schema definition
- Constraint tightening/loosening
- Restructure prompt for clarity
- Add missing context or examples
- Remove conflicting or redundant instructions
- Test optimized version
- Iterate based on results (a before/after example follows)
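As one illustration of how these techniques combine, the hedged before/after below layers a role, step-by-step reasoning, an output schema, and a single few-shot example onto a vague original. The wording and field names are invented for this example, not a prescribed template.

```python
ORIGINAL_PROMPT = "Summarize this customer ticket and tell me if it's urgent."

# Optimized version layering a role, explicit reasoning steps, an output
# schema, and one few-shot example. Substitute the ticket text where marked.
OPTIMIZED_PROMPT = """You are a support triage assistant.

Think through the ticket step by step before answering:
1. Identify the customer's core problem.
2. Note any deadline, outage, or financial impact.
3. Decide urgency based on those factors.

Return JSON with keys: summary, urgency ("low" | "medium" | "high"), reason.

Example:
Ticket: "Our checkout page has been down for 2 hours."
Output: {"summary": "Checkout page outage for 2 hours",
         "urgency": "high",
         "reason": "Active outage with direct revenue impact"}

Ticket: [TICKET TEXT HERE]
Output:"""
```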
Workflow 3: Model-Specific Optimization
- Identify target LLM (GPT-4, Claude, Llama, etc.)
- Apply model-specific best practices:
- Token budget optimization
- System prompt vs user prompt split
- Temperature/sampling guidance
- Context window utilization
- Adjust for model quirks and strengths
- Document model-specific recommendations (illustrated in the sketch below)
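As a hedged sketch of the system/user split, the example below uses the common role/content chat message convention; adapt the exact structure and parameter names to the client library and model you are actually targeting.

```python
# Stable, reusable instructions go in the system message; per-request data
# goes in the user message. The role/content dict format follows the common
# chat-completion convention; your client library may differ.
messages = [
    {
        "role": "system",
        "content": (
            "You are a contract-review assistant. Cite the clause number for "
            "every observation. If a clause is missing, say so explicitly "
            "rather than guessing."
        ),
    },
    {
        "role": "user",
        "content": "Review this NDA for unusual termination terms:\n\n[CONTRACT TEXT HERE]",
    },
]

# Illustrative model-specific notes you might document with the prompt:
# - Lower temperature (roughly 0-0.3) for extraction/review tasks; higher for
#   open-ended drafting.
# - Keep the most important constraints near the start and end of long prompts
#   to work with typical attention patterns.
```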
Quick Reference
| Action | Command/Trigger |
|---|---|
| Diagnose prompt issues | "Why isn't this prompt working: [prompt]" |
| Optimize for accuracy | "Optimize for accuracy: [prompt]" |
| Reduce hallucinations | "Reduce hallucinations in: [prompt]" |
| Add structure | "Add better structure to: [prompt]" |
| Model-specific optimization | "Optimize this for [model]: [prompt]" |
| A/B test variants | "Create prompt variants for testing: [prompt]" |
Best Practices
Start with Clear Intent: Define exactly what success looks like before optimizing
- Bad: "Make it work better"
- Good: "Reduce factual errors while maintaining conversational tone"
Use Explicit Output Formats: LLMs follow a specified structure more reliably than they interpret vague requests
- Specify JSON schemas, markdown formats, or template structures
- Example: "Return as JSON with keys: analysis, recommendations, confidence"
Calibrate Constraints: Too many constraints cause conflicts; too few cause drift
- Test constraint combinations systematically
- Remove constraints that don't improve output quality
Leverage Positive Instructions: Tell the model what TO do, not just what NOT to do
- Bad: "Don't be verbose"
- Good: "Respond in 2-3 concise sentences"
Position Critical Instructions Strategically: Models tend to attend most to the beginning and end of a prompt
- Put key constraints at the start
- Repeat critical requirements at the end
Use Delimiters for Multi-Part Inputs: Clear separation prevents confusion
- Triple quotes, XML tags, or markdown headers
- Example:
"""User Query: {query}""" """Context: {context}"""
Advanced Techniques
Recursive Refinement Loop
For complex prompts, use iterative optimization:
1. Generate baseline outputs (n=5)
2. Score outputs against criteria
3. Identify lowest-scoring dimension
4. Adjust prompt targeting that dimension
5. Repeat until all dimensions score acceptably (see the loop sketch below)
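A minimal sketch of this loop, assuming you supply your own model call, scoring rubric, and prompt-editing step as callables (all three are hypothetical stand-ins, not part of any specific library):

```python
from typing import Callable

def refine(prompt: str,
           criteria: list[str],
           generate: Callable[[str], str],            # your model call
           score: Callable[[str, str], float],        # (output, criterion) -> score in [0, 1]
           revise_prompt: Callable[[str, str], str],  # (prompt, weak criterion) -> new prompt
           threshold: float = 0.8,
           n_samples: int = 5,
           max_rounds: int = 10) -> str:
    """Iteratively patch the weakest-scoring dimension until all pass."""
    for _ in range(max_rounds):
        outputs = [generate(prompt) for _ in range(n_samples)]
        # Average score per criterion across the baseline outputs.
        scores = {c: sum(score(o, c) for o in outputs) / n_samples for c in criteria}
        worst = min(scores, key=scores.get)
        if scores[worst] >= threshold:
            break  # every dimension scores acceptably
        prompt = revise_prompt(prompt, worst)
    return prompt
```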
Prompt Decomposition
Break complex tasks into simpler sub-prompts and chain the results (see the sketch after the example):
- Complex: "Analyze this code, find bugs, suggest fixes, and refactor"
- Decomposed:
  - Step 1: "List all potential bugs in this code"
  - Step 2: "For each bug, explain the fix"
  - Step 3: "Refactor the fixed code for clarity"
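One hedged way to run the decomposed steps is a simple chain that feeds each step's output into the next; `call_llm` here is a hypothetical stand-in for whatever client function you use:

```python
from typing import Callable

def review_code(code: str, call_llm: Callable[[str], str]) -> str:
    """Chain the three sub-prompts, passing each result into the next step."""
    bugs = call_llm(f"List all potential bugs in this code:\n\n{code}")
    fixes = call_llm(
        f"For each bug below, explain the fix.\n\nBugs:\n{bugs}\n\nCode:\n{code}"
    )
    refactored = call_llm(
        f"Refactor the code for clarity, applying these fixes:\n\n{fixes}\n\nCode:\n{code}"
    )
    return refactored
```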
Negative Example Injection
Show what NOT to do alongside positive examples:
- Good output: [example]
- Bad output (avoid this): [anti-example]
- Key difference: [explanation]
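A concrete (invented) instance of the pattern above for a summarization task, to show how the three slots get filled:

```python
# Invented illustration of negative example injection for a summarization task.
NEGATIVE_EXAMPLE_BLOCK = """Good output: "Q3 revenue rose 12% on higher subscription renewals."
Bad output (avoid this): "The document discusses various financial topics in detail."
Key difference: the good output states the specific finding; the bad one
describes the document instead of summarizing it."""
```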
Token Budget Optimization
When context is limited:
1. Remove redundant phrases
2. Use abbreviations consistently
3. Compress examples to minimal effective size
4. Prioritize recent/relevant context
5. Consider summarizing long contexts
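A rough sketch of steps 3-4, assuming relevance scores come from your retriever; word count is used here as a crude stand-in for a real tokenizer:

```python
def pack_context(chunks: list[tuple[float, str]], budget_words: int) -> str:
    """Keep the highest-relevance chunks that fit within a word budget.

    chunks: (relevance_score, text) pairs, e.g. from a RAG retriever.
    Word count is a crude proxy; use your model's tokenizer for real budgets.
    """
    packed: list[str] = []
    used = 0
    for _, text in sorted(chunks, key=lambda pair: pair[0], reverse=True):
        cost = len(text.split())
        if used + cost > budget_words:
            continue  # skip chunks that would blow the budget
        packed.append(text)
        used += cost
    return "\n\n".join(packed)
```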
Common Pitfalls to Avoid
- Over-engineering simple prompts with unnecessary complexity
- Copying prompts between models without adaptation
- Ignoring the relationship between temperature and prompt specificity
- Adding examples that introduce unwanted patterns
- Using vague terms like "good," "proper," or "appropriate" without definition
- Giving conflicting instructions that force the model to choose arbitrarily between them
- Forgetting to specify handling of edge cases and errors