org-verification-pipeline
Produces verified datasets, verified evaluation results, and a deployable contract bundle for a workflow. Use when you need provable correctness at data and evaluation boundaries. Trigger with 'verify workflow', 'validate contract', or 'run verification pipeline'.
When & Why to Use This Skill
This Claude skill automates a 5-phase gated verification pipeline for data integrity and evaluation accuracy. Using deterministic scripts, it produces verified datasets, metric reports, and deployable contract bundles, so that a workflow moves from raw data toward production only after each verification gate passes.
Use Cases
- Validating training or evaluation datasets against canonical schemas to ensure high-quality inputs and data integrity.
- Running reproducible backtests and performance evaluations to generate reliable metrics for AI models and automated workflows.
- Generating and auditing deployable contract bundles to ensure API consistency and compliance before production deployment.
- Auditing data quality deterministically to identify and fix frequency or integrity issues in large datasets.
| name | org-verification-pipeline |
|---|---|
| description | "Produces verified datasets, verified evaluation results, and a deployable contract bundle for a workflow. Use when you need provable correctness at data and evaluation boundaries. Trigger with 'verify workflow', 'validate contract', or 'run verification pipeline'." |
| allowed-tools | "Read,Write,Glob,Grep,Bash(python:*),Bash(jq:*)" |
| version | "0.1.0" |
| author | "Your Team <team@example.com>" |
| license | MIT |
Verification Pipeline (5-Phase)
Produce verified artifacts (data, metrics, contracts) using a 5-phase gated workflow with deterministic scripts.
Overview
This skill runs a strict pipeline:
- Phase 1 maps raw inputs to a canonical contract
- Phase 2 audits data quality deterministically
- Phase 3 produces a runnable evaluation plan
- Phase 4 runs a reproducible evaluation and computes metrics
- Phase 5 produces a deployable contract bundle and validates examples
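The gating idea behind the five phases can be sketched in a few lines. This is a minimal illustration, not the skill's actual runner: `run_gated`, the phase labels, and the lambda gates are all hypothetical stand-ins for the real agents and verification scripts.

```python
from typing import Callable, List, Tuple

# Each phase is a (label, gate) pair. The labels paraphrase the overview;
# the gate callables are hypothetical placeholders for real phase logic.
Phase = Tuple[str, Callable[[], bool]]


def run_gated(phases: List[Phase]) -> List[str]:
    """Run phases in order and stop at the first gate that fails.

    Returns the labels of the phases that passed.
    """
    passed: List[str] = []
    for name, gate in phases:
        if not gate():
            break  # later phases never run once a gate fails
        passed.append(name)
    return passed


phases = [
    ("1-map-to-contract", lambda: True),
    ("2-audit-quality", lambda: True),
    ("3-evaluation-plan", lambda: False),  # simulated gate failure
    ("4-run-evaluation", lambda: True),
    ("5-contract-bundle", lambda: True),
]
print(run_gated(phases))
```

In this sketch a failure in Phase 3 means Phases 4 and 5 are never attempted, which is the property a strict pipeline relies on.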
Prerequisites
- Python 3.11+
- Repo dependencies installed
- Input dataset available (CSV/Parquet/JSON) and a target contract defined
Instructions
- Create a run directory under `reports/<project>/<timestamp>/`.
- Run phases in order using agents in `agents/` and procedures in `references/`.
- After each phase:
  - validate the returned JSON contract
  - verify `report_path` exists
- Run verification scripts (must pass) before continuing:
  - Phase 1: `{baseDir}/scripts/verify_schema.py`
  - Phase 2: `{baseDir}/scripts/verify_integrity.py` and `{baseDir}/scripts/verify_frequency.py`
  - Phase 4: `{baseDir}/scripts/run_backtest.py`
  - Phase 5: `{baseDir}/scripts/verify_api_examples.sh`
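The first steps (creating the run directory and checking a phase's returned JSON) might look roughly like the sketch below. The helper names `make_run_dir` and `check_phase_result` are hypothetical, the timestamp format is inferred from the example path `2025-01-01T00-00-00Z`, and `report_path` is the only key the instructions name; any further required keys would have to come from the procedures in `references/`.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_run_dir(root: Path, project: str, now: datetime) -> Path:
    """Create reports/<project>/<timestamp>/ under root (format is an assumption)."""
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")
    run_dir = root / "reports" / project / stamp
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


def check_phase_result(raw: str, extra_keys: tuple = ()) -> dict:
    """Parse a phase's JSON output and confirm that report_path points to a file."""
    result = json.loads(raw)  # raises ValueError (JSONDecodeError) on invalid JSON
    if not isinstance(result, dict):
        raise ValueError("phase result must be a JSON object")
    missing = [k for k in ("report_path", *extra_keys) if k not in result]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if not Path(result["report_path"]).is_file():
        raise FileNotFoundError(result["report_path"])
    return result


# Demo in a temporary directory; "01-schema.md" is a made-up report name.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = make_run_dir(Path(tmp), "example", datetime(2025, 1, 1, tzinfo=timezone.utc))
    report = run_dir / "01-schema.md"
    report.write_text("# Phase 1 evidence\n")
    result = check_phase_result(json.dumps({"report_path": str(report)}))
    run_dir_name = run_dir.name
```

Raising on the first problem keeps the check strict: a phase whose output cannot be parsed or whose report is missing does not proceed to its verification script.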
Output
- `reports/<project>/<timestamp>/0X-*.md`: evidence reports
- `reports/<project>/<timestamp>/quality.json`: quality summary
- `reports/<project>/<timestamp>/metrics.json`: evaluation metrics
- `reports/<project>/<timestamp>/contract/**`: deployable contract bundle
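A quick way to confirm that a run produced the artifacts listed above is to glob for each one. The sketch below is illustrative only: `missing_outputs` is a hypothetical helper, and reading `0X` as the phase numbers 01 through 05 is an assumption rather than something the skill states.

```python
import tempfile
from pathlib import Path

# Label -> glob pattern relative to the run directory.
# "0[1-5]-*.md" assumes 0X stands for phases 01..05.
EXPECTED = {
    "evidence reports": "0[1-5]-*.md",
    "quality summary": "quality.json",
    "evaluation metrics": "metrics.json",
    "contract bundle": "contract/**/*",
}


def missing_outputs(run_dir: Path) -> list[str]:
    """Return the labels of expected artifacts with no match in run_dir."""
    return [
        label
        for label, pattern in EXPECTED.items()
        if not any(run_dir.glob(pattern))
    ]


# Demo: a run directory that only got as far as the quality audit.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "01-schema.md").write_text("evidence\n")
    (run_dir / "quality.json").write_text("{}\n")
    missing = missing_outputs(run_dir)
    print(missing)
```

An empty list means every expected artifact has at least one matching file; it says nothing about whether the contents are valid, which remains the job of the verification scripts.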
Error Handling
Error: Phase JSON is invalid / missing keys
Solution: Re-run the phase and ensure strict JSON output contract is met.
Error: Verification script fails
Solution: Treat script output as ground truth; update mapping, data, or plan until scripts pass.
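Treating script output as ground truth can be approximated by running each check as a subprocess and keeping its output for diagnosis. The sketch below assumes the verification scripts signal failure with a non-zero exit code, which the skill does not state explicitly; `run_check` is a hypothetical helper, and the inline `-c` commands stand in for the real scripts.

```python
import subprocess
import sys


def run_check(cmd: list[str]) -> tuple[bool, str]:
    """Run a verification command; pass/fail is decided by its exit code.

    Returns (ok, combined stdout and stderr) so the output can guide the fix.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


# Stand-ins for real scripts: one that passes and one that fails.
ok, _ = run_check([sys.executable, "-c", "raise SystemExit(0)"])
failed_ok, output = run_check(
    [sys.executable, "-c", "import sys; print('schema mismatch'); sys.exit(1)"]
)
print(ok, failed_ok, output.strip())
```

On failure, the captured output is what should drive the update to the mapping, data, or plan before the check is run again.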
Examples
# Example: run schema verification (Phase 1)
python {baseDir}/scripts/verify_schema.py \
--input data/raw.csv \
--schema references/00-canonical-schema.json \
--out reports/example/2025-01-01T00-00-00Z/quality.json
Resources
- Procedures: `{baseDir}/references/`
- Agents: `{baseDir}/agents/`
- Scripts: `{baseDir}/scripts/`