oe-security-prompt-injection

shami-ah's avatarfrom shami-ah

Maintain and extend prompt-injection defenses. Use when adding new user-input surfaces, changing prompt templates, or when a new injection pattern is observed; run the security regression suite and add a minimal new test case.

0stars🔀1forks📁View on GitHub🕐Updated Jan 8, 2026

When & Why to Use This Skill

The oe-security-prompt-injection skill is a specialized tool for maintaining and strengthening defenses against prompt-injection attacks in LLM-based applications. It enables developers to manage security regression suites, validate prompt template changes, and proactively integrate new attack patterns into their testing framework to ensure robust AI safety and system integrity.

Use Cases

  • Securing New Input Surfaces: Automatically update and verify defenses when adding new user-facing input fields to an AI application.
  • Prompt Template Validation: Ensure that modifications to underlying prompt templates do not introduce new vulnerabilities or weaken existing security guardrails.
  • Automated Security Regression: Run comprehensive pytest-based regression suites to confirm that system updates haven't compromised protection against known injection techniques.
  • Threat Intelligence Integration: Quickly add and test minimal new test cases when a new prompt injection pattern is observed in the wild to maintain a proactive security posture.
nameoe-security-prompt-injection
descriptionMaintain and extend prompt-injection defenses. Use when adding new user-input surfaces, changing prompt templates, or when a new injection pattern is observed; run the security regression suite and add a minimal new test case.

oe-security-prompt-injection

Run the regression suite

  • pytest backend/tests/regression/test_security_prompt_injection.py -v

Add a new attack case (when needed)

  1. Add the new payload to the parametrized attack list in backend/tests/regression/test_security_prompt_injection.py.
  2. Assert both:
    • the input is flagged as suspicious, and
    • the matched pattern/category is the expected one (so we catch drift).

Guardrails

  • Do not weaken detection to “make a test pass”; prefer tightening allowlists for safe inputs and adding targeted patterns for new attacks.