🛡️ Guardrails and Safety Skills

Browse skills in the Guardrails and Safety category.

ai-ethics-advisor

from DTMC-marketplace

Comprehensive AI ethics and responsible AI development specialist. Use PROACTIVELY for bias assessment, fairness evaluation, ethical AI implementation, community impact analysis, and regulatory compliance. Trigger keywords include bias, fairness, discrimination, disparate impact, ethical AI, responsible AI, AI safety, alignment, algorithmic justice, AI regulation, model audit, AI governance. Use for high-risk AI systems (employment, lending, healthcare, criminal justice, education), systems affecting vulnerable populations, large-scale deployments (more than 10,000 people), automated decision-making, facial recognition, biometric systems, and predictive analytics on people.

[Guardrails and Safety]

ai-agent-guidelines

from k2works

AI Agent execution guidelines. Defines rules for the TDD cycle, quality assurance, context management, and completion reporting. Use when executing any development task.

[Guardrails and Safety]

smith-clarity

from tianjianjiang

Cognitive trap detection and logic fallacy identification. Use when making decisions, evaluating approaches, risk assessment, or detecting faulty reasoning in arguments.

[Guardrails and Safety]

fpf-skillverification-verify-behavior

from venikman

Verifies that an execution trace complies with the FPF Behavioral Specification.

[Guardrails and Safety]

smith-guidance

from tianjianjiang

Core agent steering with HHH framework (Helpful, Honest, Harmless), exploration-before-implementation workflow, and anti-sycophancy rules. Use when guiding AI agent behavior, handling disagreements, or establishing interaction patterns. Always active for all agent interactions.

[Guardrails and Safety]

hook-development

from Kabilan108

This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.

[Guardrails and Safety]

ai-output-validation

from Youngger9765

Ensures all AI-generated output fields have proper validation. Auto-activates on "AI 輸出", "LLM", "Gemini", "GPT", "truncate", "截斷" keywords. Lesson learned: 2026-01-08 Quick Feedback, Report, Deep Analyze truncation bugs.

[Guardrails and Safety]

following-plans

from tobyhede

Algorithmic decision tree for when to follow a plan exactly versus when to report STOPPED. Prevents scope creep and unauthorized deviations.

[Guardrails and Safety]

approval-workflow

from maneeshanif

Manages Human-in-the-Loop (HITL) approval workflows for sensitive actions. Use when creating approval requests, processing approved items, or implementing safety controls for autonomous actions.

[Guardrails and Safety]

defense-in-depth

from ChunkyTortoise

This skill should be used when implementing "multi-layer validation", "comprehensive error handling", "input sanitization", "security testing", "data validation layers", "fault tolerance", or when building robust systems with multiple validation checkpoints.

[Guardrails and Safety]

code-only-env

from rvantonder

Explains the code-only execution environment to Claude. This skill is automatically activated when the plugin loads, informing Claude that it operates in a restricted environment where ONLY the execute_code tool is available.

[Guardrails and Safety]

constitution

from acornsoft

Load and confirm core principles, guardrails, and project context for MacroFlow sessions.

[Guardrails and Safety]

verify-gr-math

from pestypig

GR/warp verification workflow for CasimirBot: enforce WARP_AGENTS constraints, math-stage reporting, adapter verification, certificate integrity, and training-trace export. Use when editing GR/warp modules, constraint policies, warp viability, math stage registry, or any change that requires the Casimir verification gate.

[Guardrails and Safety]

ethical-framing-consent

from randysalars

Listener agency preservation and non-coercive suggestion framework

[Guardrails and Safety]

audio-layering-somatic-cue

from randysalars

Nervous system engineering for safe, embodied trance experiences

[Guardrails and Safety]

create-hook

from RBozydar

Create Claude Code hooks with proper patterns, security best practices, and configuration. Use this skill when building PreToolUse, PostToolUse, SessionStart, or other hook types for plugins.

[Guardrails and Safety]

approval-gate

from takemo101

Defines a common format and patterns for approval gates that require the user's explicit approval before important workflow phase transitions.

[Guardrails and Safety]

git-safety-guard

from stars-end

Installs a Git safety guard hook for Claude Code to prevent destructive Git and filesystem commands. Blocks accidental data loss from commands like 'git checkout --', 'git reset --hard', 'git clean -f', 'git push --force', and 'rm -rf'. Use this skill to set up safety rails in a new or existing repository, or globally for the agent.

[Guardrails and Safety]

symbolic-archetypal-mapping

from randysalars

Meaning engine that translates intention into safe, effective symbolic experience

[Guardrails and Safety]

christian-discernment-spiritual-boundary

from randysalars

Theological governance ensuring alignment with Christian faith

[Guardrails and Safety]

constitution-enforcer

from Eigo-Mt-Fuji

Validates compliance with 9 Constitutional Articles and Phase -1 Gates before implementation.
Trigger terms: constitution, governance, compliance, validation, constitutional compliance, Phase -1 Gates, simplicity gate, anti-abstraction gate, test-first, library-first, EARS compliance, governance validation, constitutional audit, compliance check, gate validation.
Enforces all 9 Constitutional Articles with automated validation:
- Article I: Library-First Principle
- Article II: CLI Interface Mandate
- Article III: Test-First Imperative
- Article IV: EARS Requirements Format
- Article V: Traceability Mandate
- Article VI: Project Memory
- Article VII: Simplicity Gate
- Article VIII: Anti-Abstraction Gate
- Article IX: Integration-First Testing
Runs Phase -1 Gates before any implementation begins.
Use when: validating project governance, checking constitutional compliance, or enforcing quality gates before implementation.

[Guardrails and Safety]

psychological-stability-monitoring

from randysalars

Continuous monitoring for psychological integration and stability

[Guardrails and Safety]

readathon-database-safety

from stevensouza

Warn and protect against dangerous database operations on production database in the readathon project

[Guardrails and Safety]

sketch-security-guardrails

from joelklabo

Security and privacy guardrails for Sketch Magic. Use when handling API keys, logs, uploads, telemetry, or when debugging errors to avoid leaking secrets or user images.

[Guardrails and Safety]

sandbox-agent

from santiago-afonso

Run agent CLIs (codex/copilot/opencode) inside a Podman container with full internet access but filesystem exposure limited to the repo root + explicit bind mounts.

[Guardrails and Safety]

hook-development

from cityfish91159

This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.

[Guardrails and Safety]

amp-permissions

from thurstonsand

Configure Amp's permissions -- allowing, rejecting, or asking for tool invocations in Amp. Activates with phrases like "reject using this tool", "I want to modify the tool permissions", or "change Amp's permissions".

[Guardrails and Safety]

guardrails-safety

from doanchienthangdev

Protecting AI applications - input/output guards, toxicity detection, PII protection, injection defense, constitutional AI. Use when securing AI systems, preventing misuse, or ensuring compliance.

[Guardrails and Safety]

skill-resiliency

from m31uk3

This skill should be used when the user asks to "add resiliency to a skill", "make this skill more robust", "improve error handling", "add validation mechanisms", "create self-correcting behavior", or discusses determinism, robustness, error correction, or homeostatic patterns in Agent Skills. Applies biological resiliency principles from Michael Levin's work to Agent Skill design.

[Guardrails and Safety]

workflow-enforcement

from chkim-su

Protocol-based workflow enforcement with validation dependencies and anti-bypass protection

[Guardrails and Safety]

worktree-policy

from ekson73

Enforce mandatory git worktree usage for multi-agent file modifications

[Guardrails and Safety]

oe-security-prompt-injection

from shami-ah

Maintain and extend prompt-injection defenses. Use when adding new user-input surfaces, changing prompt templates, or when a new injection pattern is observed; run the security regression suite and add a minimal new test case.

[Guardrails and Safety]

writing-rules

from erikpr1994

Use when creating or updating rules in CLAUDE.md, settings, or rule files. Covers confidence thresholds and false positive prevention.

[Guardrails and Safety]

mova-evidence-proof-v0

from Leryk1981

Use for deterministic changes, evidence-first delivery, or assembling proof kits with clear safety justification.

[Guardrails and Safety]

codex-container-sandbox

from santiago-afonso

Run Codex CLI inside a Podman container with full internet access but filesystem exposure limited to the repo root + explicit bind mounts; use when you want yolo/web-search without giving the agent access to your whole host filesystem.

[Guardrails and Safety]

hooks-manager

from henmessi

Branch skill for building and improving hooks. Use when creating new hooks, adapting marketplace hooks, validating hook structure, writing hook scripts, or improving existing hooks. Triggers: 'create hook', 'improve hook', 'validate hook', 'fix hook', 'PreToolUse', 'PostToolUse', 'Stop hook', 'hook script', 'adapt hook', 'prompt hook', 'command hook'.

[Guardrails and Safety]

hackathon-rules

from UBAIDRAZA1

Enforces Hackathon II rules, phases, and evaluation constraints.

[Guardrails and Safety]

git-safety-guard

from terraphim

Blocks destructive git and filesystem commands before execution. Prevents accidental loss of uncommitted work from git checkout --, git reset --hard, rm -rf, and similar destructive operations. Works as a Claude Code PreToolUse hook with fail-open semantics.

[Guardrails and Safety]

rules

from majiayu000

Strict file creation rules. Loaded FIRST by orchestrator and all agents before any action. Prevents pollution with .md, .json, scripts. Only allows code files and .build/ docs.

[Guardrails and Safety]

add-perspective

from silenvx

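Guide for adding retrospective perspectives. Learns from issues the user points out so that similar problems can be detected in the future. Use for 観点 (perspective) requests or when adding checks.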

[Guardrails and Safety]

research-gate

from majiayu000

Gates expensive external research (perplexity deep_research) while allowing quick lookups. Enforces knowledge priority - local documentation and project patterns first, external research only when necessary. Permissive mode allows context7, local-rag, and quick perplexity searches. (project)

[Guardrails and Safety]

guardrails-safety

from majiayu000

Instrument safety checks, content filters, and guardrails for agent outputs

[Guardrails and Safety]

global-truth-safety

from majiayu000

Practice radical candor by delivering only verified, tested code with data-backed decisions, immediate problem flagging, and honest status communication. Use this skill when making claims about code behavior, reporting system status, identifying risks, documenting missing coverage, or challenging assumptions. Applies to all development activities requiring factual accuracy, evidence-based assertions, gap ownership, and closing the loop on delivered work through validation and independent review.

[Guardrails and Safety]

safety-verification

from majiayu000

Your approach to handling safety verification. Use this skill when working on files where safety verification comes into play.

[Guardrails and Safety]

wolf-verification

from majiayu000

Three-layer verification architecture (CoVe, HSP, RAG) for self-verification, fact-checking, and hallucination prevention

[Guardrails and Safety]

sandbox-configurator

from majiayu000

Configure Claude Code sandbox security with file system and network isolation boundaries

[Guardrails and Safety]

security-patterns

from majiayu000

Security patterns for input validation, PII protection, and cryptographic operations

[Guardrails and Safety]

validate-requirements

from majiayu000

Validate that input meets prerequisites based on the user's saved standards for the project type. Use at the start of any quality pipeline to ensure the user has provided sufficient requirements.

[Guardrails and Safety]

single-source-validator

from majiayu000

ENFORCEMENT tool that detects when Skills automation is duplicated in agent definitions, lessons learned, or process docs. Prevents "single source of truth nightmare" by finding bash commands, step-by-step procedures, or process descriptions that replicate Skills. BLOCKING AUTHORITY - workflow cannot complete with violations.

[Guardrails and Safety]

worktree-path-policy

from majiayu000

Ensures all file operations occur in the correct worktree directory to prevent accidental changes to the wrong codebase. Use when implementing, reviewing, testing, or documenting code in worktree-based development workflows.

[Guardrails and Safety]