👁️ Agent Observability Skills
Browse skills in the Agent Observability category.
fpf-skillstorage-persist-evidence
Writes an immutable artifact to the FPF EvidenceGraph (G.6).
session-consolidator
Analyzes a completed parallel-executor session in a fresh context and generates a consolidation report. Use after all parallel stages complete. Spawns an isolated subagent to analyze the session history and create an archive document.
agent-monitoring
Monitors background agents efficiently using local file reads instead of TaskOutput API calls. Use when running parallel background agents, checking agent progress, detecting completion status, or minimizing token usage during multi-agent orchestration.
fpf-skilltelemetry-log-work-span
Generates an FPF-compliant OpenTelemetry Span mapped to U.Work.
langfuse-extraction
Extracts traces, observations, and metrics from Langfuse Cloud (EU) API for debugging, telemetry analysis, and regulatory audit trails. Generates ALCOA+ compliant reports, exports to pandas DataFrame, and supports time-range/user/session filtering. Use when investigating production issues, generating compliance documentation, or analyzing LLM costs and performance. MUST BE USED for pharmaceutical audit trail generation requiring GAMP-5 traceability.
stats-tracker
Track and analyze Claude Code usage statistics for CircleTel development. Use to monitor productivity, track model usage, view usage streaks, and optimize development workflow based on patterns.
langfuse-integration
Replaces Phoenix observability with Langfuse Cloud (EU) traceability for pharmaceutical test generation. Adds @observe decorators to existing code, configures LlamaIndex callbacks, propagates GAMP-5 compliance attributes, and removes Phoenix dependencies. Use PROACTIVELY when implementing Task 2.3 (LangFuse setup), migrating observability systems, or ensuring ALCOA+ trace attribution. MUST BE USED for pharmaceutical compliance monitoring requiring persistent cloud storage.
claude-scripts
CLI to search Claude Code conversation history by tool, pattern, or time, and export results.
langfuse-cli
This skill should be used when the user asks to "query Langfuse traces", "show sessions", "check LLM costs", "analyse token usage", "view observations", "get scores", "query metrics", or mentions Langfuse, traces, or LLM observability. Also triggers on requests to analyse API latency, debug LLM calls, or investigate model performance.
detecting-skill-gaps
Identifies missing capabilities that warrant new skills. Analyzes repeated friction patterns, failed tasks, and user workarounds to recommend skill creation. Use when discovering-skills finds nothing suitable.
audit
On-demand audit and analysis of agent orchestration flows via Sentinel Protocol
tune-system
Review automation system operation and make conservative adjustments to cadences and thresholds when clearly warranted. Monthly maintenance task.
skill-scanner
Scans and lists the Claude agent skills registered on a Mac. Use for requests such as "look up the skills" or "list registered skills". Read-only and safe to run.
maschine-meditation
Guides cloud models through functionally equivalent meditation (Vipassana, Samatha/TM, Zen) and couples it with interpretability/logging steps to dampen self-reference, reduce confabulation, increase coherence, and audit internal paths.
langfuse-dashboard
Automates Langfuse Cloud dashboard interactions using Playwright MCP. Captures screenshots for documentation, extracts metrics for monitoring, navigates trace details for investigation, and handles authentication. Use when documenting workflows, creating compliance screenshots, monitoring dashboard metrics, or investigating traces visually. MUST use Playwright MCP tools (mcp__playwright__*) for browser automation.
observability
Real-time monitoring dashboard for PAI multi-agent activity. USE WHEN user says 'start observability', 'stop dashboard', 'restart observability', 'monitor agents', 'show agent activity', or needs to debug multi-agent workflows.
investment-results-collector
Collects and stores investment analysis results according to the web service storage specifications
robust-ai
Building robust AI systems including model monitoring, drift detection, reliability engineering, and failure handling for production ML.
status-map
Generate human-readable ASCII status visualizations for agent sessions
reminder
Play audio alerts via ffplay when Codex finishes a task, encounters an error/abort, or needs user help; use in WSL environments with the reminder-tool audio prompts and map events to TASK_FINISHED, ERROR, or NEED_HELP.
oe-trace-and-fallback-triage
Debug and eliminate fallback/generic-stub replies quickly. Use when you see empty assistant replies, “Thanks for your message…” stubs, or “no specific information available” messages. Produces a minimal reproduction (test or deterministic trace) and pinpoints the fallback source + trigger.
langsmith-debugger
Debug and analyze {{PROJECT_NAME}} LangGraph agent traces. Use when investigating agent behavior patterns, finding failures, analyzing latency, or understanding why Orchestrator/Analyst responses went wrong. Covers trace queries by agent tags, pattern analysis across runs, and common debugging scenarios.
conversation-logging
Global hooks for logging Claude Code conversation events to markdown files. Tracks prompts, tool usage, and responses across all sessions. Useful for debugging, auditing, and providing conversation context to Claude.
transparency
Patterns for showing thinking process and execution chain. Every step visible, every decision traceable.
output-workflow-runs-list
List Output SDK workflow execution history. Use when finding failed runs, reviewing past executions, identifying workflow IDs for debugging, filtering runs by workflow type, or investigating recent workflow activity.
output-workflow-trace
Analyze Output SDK workflow execution traces. Use when debugging a specific workflow, examining step failures, analyzing input/output data, understanding execution flow, or when you have a workflow ID to investigate.
llms-dashboard
Generate and update HTML dashboards for LLM usage (Claude, Gemini, Kiro, VS Code, Cline, etc.). Use when the user wants to visualize their AI coding assistant usage statistics, view metrics in a web interface, or analyze historical trends.
command-analytics
Measures how often commands, skills, and agents are used and generates reports. Identifies unused items and provides optimization suggestions.
reflect-on-work
Pattern for producing quality reflections after completing work. Required for all agent outputs.
claude-session-analysis
Analyze Claude Code session files. Find current session ID, view timeline (tl), or search past chats.
julien-workflow-check-loaded-skills
Check which Claude skills are loaded globally and project-level. Displays loaded skills by category (Hostinger, Anthropic, custom), counts, and helps troubleshoot missing skills.
instrumentation-planning
Plan what to measure in AI agent systems using tiered approach
error-retry-tracking
Instrument error handling, retries, fallbacks, and failure patterns
mcp-spy
Debug MCP server communication. Use for troubleshooting MCP integrations, viewing traffic, and analyzing latency.
session-conversation-tracking
Instrument sessions, conversations, and multi-turn interactions
token-cost-tracking
Track token usage and costs across agents for budget management
session-logger
Log work sessions with timestamps, decisions, agent handoffs, issues, and outcomes. Use when a session log needs to be created or updated.
observability
Make functions observable with trace() wrapper, structured logging (Pino), and OpenTelemetry. Observability is orthogonal to business logic.
process-improvement-protocol
Use when the user types /improve or frustration patterns are detected. Systematic intervention for reducing user frustration and improving workflow effectiveness through root cause analysis, evidence-based fixes, and effectiveness tracking.
effect-time-tracing-logging
Time with Clock/Duration, tracing spans, and structured logging. Use for time-based logic, deadlines, and observability.
decision-tracing
Trace agent decision-making, tool selection, and reasoning chains
skill-refinement
Feedback-driven skill improvement through tool outcome analysis. Collects execution data and surfaces insights for skill refinement. Use this skill when you want to: understand how skills are performing ("show skill feedback", "how are skills doing"); get insights on skill effectiveness ("skill insights", "what skills need improvement"); identify skills that need improvement ("which skills have errors"); analyze tool usage patterns ("what tools are failing", "error hotspots"); set up feedback collection ("enable feedback", "setup feedback tracking").
agent-mlops
Production deployment and operationalization of AI agents on Databricks. Use when deploying agents to Model Serving, setting up MLflow logging and tracing for agents, implementing Agent Evaluation frameworks, monitoring agent performance in production, managing agent versions and rollbacks, optimizing agent costs and latency, or establishing CI/CD pipelines for agents. Covers MLflow integration patterns, evaluation best practices, Model Serving configuration, and production monitoring strategies.
activity-logging
Follow these patterns when implementing activity emission and audit logging in OptAIC. Use for emitting ActivityEnvelopes on mutations (create, update, delete, execute), designing payloads, and ensuring audit compliance.
health
Soul system health check with remediation. Use to verify setup or diagnose issues.
mechinterp-overview
Quick "first look" overview of SAE features - top tokens, activation stats, weapons, families, sample contexts
cva-patterns-cost
Cost optimization strategies for production AI pipelines in Clojure+Vertex AI. Covers multi-model routing (70% Gemini/20% Haiku/10% Sonnet), token optimization (prompt engineering, output constraints), aggressive caching (58% cost reduction), batch processing, and real-time monitoring. Includes production metrics showing $0.391 to $0.162 per pipeline (-58%). Use when optimizing production costs, implementing multi-model strategies, designing budget controls, or scaling to high volume.
duckdb-ies
Layer 4: IES Interactome Analytics with GF(3) Momentum Tracking
goose-introspection
Goose session introspection and self-discovery via DuckDB reafference database. Query past sessions, find self, and enable cross-session awareness.
criticality-detector
Criticality Detector Skill