🛡️ Guardrails and Safety Skills
Browse skills in the Guardrails and Safety category.
ai-ethics-advisor
Comprehensive AI ethics and responsible AI development specialist. Use PROACTIVELY for bias assessment, fairness evaluation, ethical AI implementation, community impact analysis, and regulatory compliance. Trigger keywords include bias, fairness, discrimination, disparate impact, ethical AI, responsible AI, AI safety, alignment, algorithmic justice, AI regulation, model audit, AI governance. Use for high-risk AI systems (employment, lending, healthcare, criminal justice, education), systems affecting vulnerable populations, large-scale deployments (more than 10,000 people), automated decision-making, facial recognition, biometric systems, and predictive analytics on people.
ai-agent-guidelines
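Execution guidelines for AI agents. Defines rules for the TDD cycle, quality assurance, context management, and completion reporting. Use when executing any development task.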
smith-clarity
Cognitive trap detection and logic fallacy identification. Use when making decisions, evaluating approaches, risk assessment, or detecting faulty reasoning in arguments.
fpf-skillverification-verify-behavior
Verifies that an execution trace complies with the FPF Behavioral Specification.
smith-guidance
Core agent steering with HHH framework (Helpful, Honest, Harmless), exploration-before-implementation workflow, and anti-sycophancy rules. Use when guiding AI agent behavior, handling disagreements, or establishing interaction patterns. Always active for all agent interactions.
hook-development
This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.
ai-output-validation
Ensures all AI-generated output fields have proper validation. Auto-activates on "AI 輸出", "LLM", "Gemini", "GPT", "truncate", "截斷" keywords. Lesson learned: 2026-01-08 Quick Feedback, Report, Deep Analyze truncation bugs.
following-plans
Algorithmic decision tree for when to follow a plan exactly vs. when to report STOPPED. Prevents scope creep and unauthorized deviations.
approval-workflow
Manages Human-in-the-Loop (HITL) approval workflows for sensitive actions. Use when creating approval requests, processing approved items, or implementing safety controls for autonomous actions.
defense-in-depth
This skill should be used when implementing "multi-layer validation", "comprehensive error handling", "input sanitization", "security testing", "data validation layers", "fault tolerance", or when building robust systems with multiple validation checkpoints.
code-only-env
Explains the code-only execution environment to Claude. This skill is automatically activated when the plugin loads, informing Claude that it operates in a restricted environment where ONLY the execute_code tool is available.
constitution
Load and confirm core principles, guardrails, and project context for MacroFlow sessions.
verify-gr-math
GR/warp verification workflow for CasimirBot: enforce WARP_AGENTS constraints, math-stage reporting, adapter verification, certificate integrity, and training-trace export. Use when editing GR/warp modules, constraint policies, warp viability, math stage registry, or any change that requires the Casimir verification gate.
ethical-framing-consent
Listener agency preservation and non-coercive suggestion framework
audio-layering-somatic-cue
Nervous system engineering for safe, embodied trance experiences
create-hook
Create Claude Code hooks with proper patterns, security best practices, and configuration. Use this skill when building PreToolUse, PostToolUse, SessionStart, or other hook types for plugins.
approval-gate
Defines a common format and patterns for approval gates that require explicit user approval before important workflow phase transitions.
git-safety-guard
Installs a Git safety guard hook for Claude Code to prevent destructive Git and filesystem commands. Blocks accidental data loss from commands like 'git checkout --', 'git reset --hard', 'git clean -f', 'git push --force', and 'rm -rf'. Use this skill to set up safety rails in a new or existing repository, or globally for the agent.
symbolic-archetypal-mapping
Meaning engine that translates intention into safe, effective symbolic experience
christian-discernment-spiritual-boundary
Theological governance ensuring alignment with Christian faith
constitution-enforcer
Validates compliance with 9 Constitutional Articles and Phase -1 Gates before implementation. Trigger terms: constitution, governance, compliance, validation, constitutional compliance, Phase -1 Gates, simplicity gate, anti-abstraction gate, test-first, library-first, EARS compliance, governance validation, constitutional audit, compliance check, gate validation. Enforces all 9 Constitutional Articles with automated validation:
- Article I: Library-First Principle
- Article II: CLI Interface Mandate
- Article III: Test-First Imperative
- Article IV: EARS Requirements Format
- Article V: Traceability Mandate
- Article VI: Project Memory
- Article VII: Simplicity Gate
- Article VIII: Anti-Abstraction Gate
- Article IX: Integration-First Testing
Runs Phase -1 Gates before any implementation begins. Use when: validating project governance, checking constitutional compliance, or enforcing quality gates before implementation.
psychological-stability-monitoring
Continuous monitoring for psychological integration and stability
readathon-database-safety
Warn and protect against dangerous database operations on production database in the readathon project
sketch-security-guardrails
Security and privacy guardrails for Sketch Magic. Use when handling API keys, logs, uploads, telemetry, or when debugging errors to avoid leaking secrets or user images.
sandbox-agent
Run agent CLIs (codex/copilot/opencode) inside a Podman container with full internet access but filesystem exposure limited to the repo root + explicit bind mounts.
amp-permissions
Configure Amp's permissions -- allowing, rejecting, or asking for tool invocations in Amp. Activates with phrases like "reject using this tool", "I want to modify the tool permissions", or "change Amp's permissions".
guardrails-safety
Protecting AI applications - input/output guards, toxicity detection, PII protection, injection defense, constitutional AI. Use when securing AI systems, preventing misuse, or ensuring compliance.
skill-resiliency
This skill should be used when the user asks to "add resiliency to a skill", "make this skill more robust", "improve error handling", "add validation mechanisms", "create self-correcting behavior", or discusses determinism, robustness, error correction, or homeostatic patterns in Agent Skills. Applies biological resiliency principles from Michael Levin's work to Agent Skill design.
workflow-enforcement
Protocol-based workflow enforcement with validation dependencies and anti-bypass protection
worktree-policy
Enforce mandatory git worktree usage for multi-agent file modifications
oe-security-prompt-injection
Maintain and extend prompt-injection defenses. Use when adding new user-input surfaces, changing prompt templates, or when a new injection pattern is observed; run the security regression suite and add a minimal new test case.
writing-rules
Use when creating or updating rules in CLAUDE.md, settings, or rule files. Covers confidence thresholds and false positive prevention.
mova-evidence-proof-v0
Use for deterministic changes, evidence-first delivery, or assembling proof kits with clear safety justification.
codex-container-sandbox
Run Codex CLI inside a Podman container with full internet access but filesystem exposure limited to the repo root + explicit bind mounts; use when you want yolo/web-search without giving the agent access to your whole host filesystem.
hooks-manager
Branch skill for building and improving hooks. Use when creating new hooks, adapting marketplace hooks, validating hook structure, writing hook scripts, or improving existing hooks. Triggers: 'create hook', 'improve hook', 'validate hook', 'fix hook', 'PreToolUse', 'PostToolUse', 'Stop hook', 'hook script', 'adapt hook', 'prompt hook', 'command hook'.
hackathon-rules
Enforces Hackathon II rules, phases, and evaluation constraints.
git-safety-guard
Blocks destructive git and filesystem commands before execution. Prevents accidental loss of uncommitted work from git checkout --, git reset --hard, rm -rf, and similar destructive operations. Works as a Claude Code PreToolUse hook with fail-open semantics.
rules
Strict file creation rules. Loaded FIRST by orchestrator and all agents before any action. Prevents pollution with .md, .json, scripts. Only allows code files and .build/ docs.
add-perspective
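Guide to adding retrospective perspectives. Learns from user feedback so that similar issues can be detected in the future. Use when adding a perspective ("観点") or a check.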
research-gate
Gates expensive external research (perplexity deep_research) while allowing quick lookups. Enforces knowledge priority - local documentation and project patterns first, external research only when necessary. Permissive mode allows context7, local-rag, and quick perplexity searches. (project)
guardrails-safety
Instrument safety checks, content filters, and guardrails for agent outputs
global-truth-safety
Practice radical candor by delivering only verified, tested code with data-backed decisions, immediate problem flagging, and honest status communication. Use this skill when making claims about code behavior, reporting system status, identifying risks, documenting missing coverage, or challenging assumptions. Applies to all development activities requiring factual accuracy, evidence-based assertions, gap ownership, and closing the loop on delivered work through validation and independent review.
safety-verification
Your approach to handling safety verification. Use this skill when working on files where safety verification comes into play.
wolf-verification
Three-layer verification architecture (CoVe, HSP, RAG) for self-verification, fact-checking, and hallucination prevention
sandbox-configurator
Configure Claude Code sandbox security with file system and network isolation boundaries
security-patterns
Security patterns for input validation, PII protection, and cryptographic operations
validate-requirements
Validate that input meets prerequisites based on the user's saved standards for the project type. Use at the start of any quality pipeline to ensure the user has provided sufficient requirements.
single-source-validator
ENFORCEMENT tool that detects when Skills automation is duplicated in agent definitions, lessons learned, or process docs. Prevents "single source of truth nightmare" by finding bash commands, step-by-step procedures, or process descriptions that replicate Skills. BLOCKING AUTHORITY - workflow cannot complete with violations.
worktree-path-policy
Ensures all file operations occur in the correct worktree directory to prevent accidental changes to the wrong codebase. Use when implementing, reviewing, testing, or documenting code in worktree-based development workflows.