code-verification

from benjaminshoemaker

Multi-agent code verification workflow using a main agent and sub-agent loop. Use when verifying code against requirements, acceptance criteria, or quality standards. Triggers on requests to verify, validate, or check code against specifications, checklists, or instructions.

When & Why to Use This Skill

This Claude skill provides a multi-agent workflow for automated code verification and quality assurance. Using a main-agent/sub-agent loop, it systematically validates code against requirements, acceptance criteria, and quality standards. The skill includes automated fix attempts, regression testing, and browser-based UI verification via Playwright, helping ensure that implementations are reliable, accessible, and compliant with their technical specifications.

Use Cases

  • Automated Requirements Validation: Verify that new feature implementations strictly adhere to complex acceptance criteria and functional specifications through structured, itemized checklists.
  • End-to-End Web UI Testing: Use Playwright integration to perform deep inspections of DOM elements, visual appearance, accessibility (ARIA) standards, and network performance on web applications.
  • Iterative Bug Fixing and Self-Healing: Automatically detect code failures or linting errors and execute iterative fix attempts with built-in regression checks to ensure stability without manual intervention.
  • Code Quality and Compliance Audits: Scan codebases to confirm that all functions have docstrings, no unused imports remain, and test coverage stays high across files.
  • Pre-deployment Verification: Generate comprehensive verification reports that document passed tests, failed attempts, and audit trails before merging code into production.

name: code-verification
description: Multi-agent code verification workflow using a main agent and sub-agent loop. Use when verifying code against requirements, acceptance criteria, or quality standards. Triggers on requests to verify, validate, or check code against specifications, checklists, or instructions.

Code Verification Skill

Verify code against requirements using a main agent / sub-agent loop with structured feedback and automatic retry.

Workflow Overview

1. Parse verification instructions into testable items
2. For each instruction:
   a. Pre-flight: Confirm instruction is testable
   b. Sub-agent: Verify if instruction is met
   c. If failed: Main agent attempts fix
   d. Repeat b-c up to 5 times or until success
   e. Update checklist with result
3. Generate verification report
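
A minimal sketch of this loop in Python, with verify and fix as stand-ins for the sub-agent check (step 2b) and the main-agent fix (step 2c); the richer exit conditions of Step 5 are reduced here to pass-or-attempts-exhausted:

# Skeleton of the verification loop; `verify` and `fix` are placeholders
# for the sub-agent and main-agent actions detailed in the steps below.
def run_verification(item_ids, verify, fix, max_attempts=5):
    results = {}
    for item_id in item_ids:
        finding = verify(item_id)                 # sub-agent check
        attempts = 0
        while finding["status"] == "FAIL" and attempts < max_attempts:
            fix(item_id, finding)                 # main agent attempts a fix
            finding = verify(item_id)             # sub-agent re-checks
            attempts += 1
        results[item_id] = finding["status"]
    return results

# Toy run: one instruction that passes on the first check.
results = run_verification(
    ["V-001"],
    verify=lambda item_id: {"status": "PASS"},
    fix=lambda item_id, finding: None,
)
assert results == {"V-001": "PASS"}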

Step 1: Parse Verification Instructions

Extract each verification instruction into a discrete, testable item (a data-structure sketch follows this list):

  • ID: Unique identifier (e.g., V-001)
  • Instruction: The requirement text
  • Test approach: How to verify (file inspection, run tests, lint, type check, etc.)
  • Files involved: Which files to examine
  • Requires Browser: Whether the instruction needs Playwright MCP verification
    • Auto-detect from keywords: UI, render, display, visible, hidden, show, hide, click, hover, focus, blur, scroll, DOM, element, component, layout, responsive, style, CSS, color, font, screenshot, visual, appearance, console, error, warning, log, network, request, response, accessibility, a11y, ARIA, animation, transition, loading, performance
    • Mark as: browser: true or browser: false
  • Browser Verification Type (if browser: true):
    • DOM_INSPECTION - Element presence, visibility, content via accessibility tree snapshots
    • SCREENSHOT - Visual appearance, layout verification
    • CONSOLE - Browser console errors, warnings, logs
    • NETWORK - API requests, responses, status codes (via network interception)
    • PERFORMANCE - Load times, Core Web Vitals (via tracing)
    • ACCESSIBILITY - ARIA attributes, semantic HTML, accessibility tree analysis
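
As an illustration, a parsed item could be represented like this. A sketch only: the field names mirror the list above, the keyword set is abbreviated, and the example instruction, file path, and selector semantics are hypothetical.

from dataclasses import dataclass, field

# Abbreviated subset of the auto-detect keywords listed above.
BROWSER_KEYWORDS = {
    "ui", "render", "display", "visible", "click", "hover", "dom", "element",
    "layout", "css", "screenshot", "console", "network", "aria",
    "accessibility", "animation", "performance",
}

@dataclass
class VerificationItem:
    id: str                          # unique identifier, e.g. "V-001"
    instruction: str                 # the requirement text
    test_approach: str               # how to verify
    files: list[str] = field(default_factory=list)
    browser: bool = False            # requires Playwright MCP verification
    browser_type: str | None = None  # DOM_INSPECTION, SCREENSHOT, ...

def needs_browser(instruction: str) -> bool:
    """Keyword-based auto-detection of browser verification."""
    return any(word.strip(".,;:") in BROWSER_KEYWORDS
               for word in instruction.lower().split())

item = VerificationItem(
    id="V-001",
    instruction="The error banner is visible after a failed login",
    test_approach="DOM inspection via accessibility tree snapshot",
    files=["src/components/LoginForm.tsx"],
)
item.browser = needs_browser(item.instruction)   # True: "visible" matches
item.browser_type = "DOM_INSPECTION" if item.browser else None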

Step 2: Pre-flight Validation

Before the verification loop, confirm each instruction is testable:

  • Instruction is specific and unambiguous
  • Success criteria are clear
  • Required files/resources exist

Flag untestable instructions immediately rather than attempting verification.

Browser-Specific Pre-Flight

For instructions with browser: true:

  1. Check Playwright MCP availability

    • If unavailable, mark instruction as BLOCKED with reason: "Playwright MCP not available"
    • Suggest: "Ensure Playwright MCP server is running (npx @playwright/mcp@latest)"
  2. Verify dev server is running (see the sketch after this list)

    • Check if configured dev server URL responds (e.g., http://localhost:3000)
    • If not running, attempt to start using configured command (e.g., npm run dev)
    • Wait for configured startup time before proceeding
    • If unable to start, mark as BLOCKED: "Dev server not accessible at {URL}"
  3. Confirm target route exists

    • Navigate to the page specified in the instruction using browser_navigate
    • If 404 or error, mark as BLOCKED: "Target route not found: {route}"
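
A minimal sketch of the dev-server portion of this pre-flight, assuming a configured URL, start command, and startup wait (the values below are hypothetical defaults):

import subprocess
import time
import urllib.request
from urllib.error import HTTPError, URLError

DEV_SERVER_URL = "http://localhost:3000"   # hypothetical configured URL
START_COMMAND = ["npm", "run", "dev"]      # hypothetical configured command
STARTUP_WAIT_SECONDS = 15                  # hypothetical configured startup time

def server_responds(url: str) -> bool:
    """True if the dev server answers at all, even with an error status."""
    try:
        urllib.request.urlopen(url, timeout=5)
        return True
    except HTTPError:
        return True                        # got an HTTP response: server is up
    except URLError:
        return False                       # connection refused / unreachable

def ensure_dev_server() -> str | None:
    """Return None when reachable, else the BLOCKED reason."""
    if server_responds(DEV_SERVER_URL):
        return None
    subprocess.Popen(START_COMMAND)        # attempt to start the server
    time.sleep(STARTUP_WAIT_SECONDS)       # wait for configured startup time
    if server_responds(DEV_SERVER_URL):
        return None
    return f"Dev server not accessible at {DEV_SERVER_URL}"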

Step 3: Sub-Agent Verification Protocol

Spawn a sub-agent to verify each instruction. The sub-agent MUST return structured output:

VERIFICATION RESULT
-------------------
Instruction ID: [ID]
Status: PASS | FAIL | BLOCKED
Location: [file:line or "N/A"]
Severity: BLOCKING | MINOR
Finding: [What was found]
Expected: [What was expected]
Suggested Fix: [Specific fix recommendation]

Sub-agent rules:

  • Check ONLY the specific instruction assigned
  • Do not attempt fixes—report findings only
  • Be precise about location (file, line number, function name)
  • Distinguish between blocking failures and minor issues
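
Because the block is line-oriented key/value text, the main agent can parse it mechanically. A sketch, assuming the sub-agent returns exactly the fields shown above:

def parse_result(block: str) -> dict[str, str]:
    """Parse the key/value lines of a VERIFICATION RESULT block."""
    fields = {}
    for line in block.splitlines():
        if ":" not in line:
            continue                       # skip the header and rule lines
        key, _, value = line.partition(":")
        fields[key.strip().lower().replace(" ", "_")] = value.strip()
    return fields

raw = """VERIFICATION RESULT
-------------------
Instruction ID: V-001
Status: FAIL
Location: utils.py:45
Severity: BLOCKING
Finding: load_config has no docstring
Expected: Every public function has a docstring
Suggested Fix: Add a docstring describing the return value"""

parsed = parse_result(raw)
assert parsed["status"] == "FAIL" and parsed["location"] == "utils.py:45"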

Browser-Enhanced Verification Output

For instructions with browser: true, the sub-agent MUST use Playwright MCP and return:

BROWSER VERIFICATION RESULT
---------------------------
Instruction ID: [ID]
Status: PASS | FAIL | BLOCKED
Type: DOM | VISUAL | CONSOLE | NETWORK | PERFORMANCE | ACCESSIBILITY
URL: [URL] | Viewport: [width]x[height]

Finding: [What was observed]
Expected: [What was expected]

Details: [Type-specific information]
  - DOM: selector, found, visible, content
  - Visual: screenshot path, description
  - Console: errors, warnings, logs
  - Network: endpoint, method, status, response summary
  - Performance: load time, LCP, FID, CLS
  - Accessibility: ARIA, semantic HTML, contrast, keyboard nav

Suggested Fix: [Specific fix recommendation]

Browser Sub-Agent Rules

In addition to standard sub-agent rules, browser verification sub-agents MUST (see the sketch after this list):

  • Start with an accessibility tree snapshot (browser_snapshot) of the initial state
  • Use stable selectors (prefer data-testid over complex CSS paths, or use accessibility tree element refs)
  • Wait for dynamic content to load before inspecting (browser_wait_for_text or browser_wait)
  • Capture console output before and after actions
  • Take screenshots (browser_screenshot) when verifying visual appearance
  • Test at default viewport unless criterion specifies responsive/mobile (use browser_resize to change)
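
The skill performs these checks through Playwright MCP tools (browser_navigate, browser_snapshot, browser_screenshot). As a rough equivalent for illustration, here is the same sequence using the standalone Playwright Python library; the URL, route, and data-testid selector are hypothetical.

# Illustration only: equivalent check sequence via the Playwright library.
from playwright.sync_api import sync_playwright, TimeoutError as PlaywrightTimeout

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()

    console_messages = []                                 # capture console output
    page.on("console", lambda msg: console_messages.append(msg))

    page.goto("http://localhost:3000/login")              # navigate to target route
    initial_tree = page.accessibility.snapshot()          # initial accessibility tree

    banner = page.locator("[data-testid=error-banner]")   # stable selector
    try:
        banner.wait_for(state="visible", timeout=5000)    # wait for dynamic content
        visible = True
    except PlaywrightTimeout:
        visible = False

    page.screenshot(path="v001-error-banner.png")         # visual evidence
    errors = [m for m in console_messages if m.type == "error"]
    browser.close()

print("PASS" if visible and not errors else "FAIL")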

Step 4: Main Agent Fix Protocol

When sub-agent reports FAIL:

  1. Review the finding - Understand what failed and why
  2. Check fix history - Do not repeat a previously attempted fix
  3. Apply targeted fix - Make the minimum change to address the issue
  4. Log the attempt - Record what was changed

Fix attempt tracking

Maintain a fix log per instruction:

FIX LOG: [Instruction ID]
--------------------------
Attempt 1: [Description of change] → [Result]
Attempt 2: [Description of change] → [Result]
...

Strategy escalation

  • Attempts 1-2: Direct fix based on sub-agent suggestion
  • Attempt 3: Try alternative approach
  • Attempts 4-5: Broaden scope, consider architectural changes

If the same failure pattern repeats twice, explicitly try a different strategy.
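
A sketch of the bookkeeping that makes both rules enforceable, assuming fix descriptions and findings are recorded as short strings:

fix_log: dict[str, list[tuple[str, str]]] = {}   # instruction ID -> [(change, result)]

def record_attempt(item_id: str, change: str, result: str) -> None:
    """Append an attempt to the per-instruction fix log."""
    fix_log.setdefault(item_id, []).append((change, result))

def already_tried(item_id: str, change: str) -> bool:
    """Never repeat a previously attempted fix."""
    return any(change == prior for prior, _ in fix_log.get(item_id, []))

def strategy_for(attempt: int) -> str:
    """Escalate as attempts accumulate (attempt numbers are 1-based)."""
    if attempt <= 2:
        return "direct fix from sub-agent suggestion"
    if attempt == 3:
        return "alternative approach"
    return "broaden scope, consider architectural changes"

def failure_repeats(item_id: str) -> bool:
    """True when the last two attempts produced the same result,
    signaling that a different strategy is required."""
    results = [result for _, result in fix_log.get(item_id, [])]
    return len(results) >= 2 and results[-1] == results[-2]

record_attempt("V-002", "added null check in parser", "FAIL: same TypeError")
record_attempt("V-002", "wrapped call in try/except", "FAIL: same TypeError")
assert failure_repeats("V-002")
assert not already_tried("V-002", "narrow the except clause")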

Browser-Specific Fix Strategies

Failure Type → Common Fixes
DOM/Visibility → Conditional rendering, CSS display/visibility, z-index, prop passing
Console errors → JS exceptions, missing mocks, env vars, CORS
Network → Endpoint URLs, auth headers, payload format, CORS config
Visual → CSS cascade, responsive breakpoints, font loading
Performance → Bundle size, image optimization, lazy loading, render-blocking
Accessibility → ARIA attributes, color contrast, heading hierarchy, keyboard handlers

Step 5: Exit Conditions

Exit the verification loop when ANY condition is met (a guard-function sketch follows the table):

Condition → Action
Sub-agent reports PASS → ✅ Check off instruction
5 attempts exhausted → ❌ Mark failed with notes
Same failure 3+ times → ⚠️ Exit early, flag for review
Fix introduces regression → ⚠️ Revert, flag for review
Issue is MINOR severity → ⚠️ Note and continue
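
Expressed as a guard function (a sketch; the precedence among simultaneously-true conditions is an assumption):

def should_exit(status: str, severity: str, attempts: int,
                failure_streak: int, regression: bool) -> str | None:
    """Return the action that ends the loop, or None to keep iterating."""
    if status == "PASS":
        return "check off instruction"
    if severity == "MINOR":
        return "note and continue"
    if regression:
        return "revert fix, flag for review"
    if failure_streak >= 3:
        return "exit early, flag for review"
    if attempts >= 5:
        return "mark failed with notes"
    return None

assert should_exit("PASS", "BLOCKING", 1, 0, False) == "check off instruction"
assert should_exit("FAIL", "BLOCKING", 2, 1, False) is None   # keep trying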

Step 6: Regression Check

After each fix attempt, verify:

  • The targeted instruction (primary check)
  • Any previously-passing related instructions (regression check)

If a fix breaks something else, revert and note the conflict.
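
A sketch of that re-check, where verify stands in for dispatching the sub-agent and returns True on PASS:

def regression_check(fixed_id: str, passing: dict[str, bool], verify) -> list[str]:
    """Re-run previously passing checks after a fix; return IDs that regressed."""
    return [item_id for item_id, was_passing in passing.items()
            if item_id != fixed_id and was_passing and not verify(item_id)]

# Toy usage: V-001 had passed; the fix applied for V-002 broke it.
regressed = regression_check("V-002", {"V-001": True, "V-003": False},
                             verify=lambda item_id: False)
assert regressed == ["V-001"]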

Browser Regression Checks

After each browser-related fix, verify there are no regressions in console errors, visual appearance, performance metrics, or accessibility. If a regression is detected, capture before/after state and log it in the fix history.

Step 7: Generate Verification Report

After all instructions are processed:

VERIFICATION REPORT
===================
Total Instructions: [N]
Passed: [N] ✅
Failed: [N] ❌
Needs Review: [N] ⚠️

DETAILS
-------
[V-001] ✅ [Instruction summary]
[V-002] ❌ [Instruction summary]
  - Failed after 5 attempts
  - Last error: [description]
  - Attempts: [brief log]
[V-003] ⚠️ [Instruction summary]
  - Flagged: Repeated same failure pattern
  - Recommendation: [suggestion]

AUDIT TRAIL
-----------
[Timestamp] V-001: Verified PASS on first check
[Timestamp] V-002: Attempt 1 - Changed X → FAIL
[Timestamp] V-002: Attempt 2 - Changed Y → FAIL
...

BROWSER VERIFICATION (if applicable)
------------------------------------
Browser Checks: [passed]/[total] | Blocked: [N]
Playwright: Available | Unavailable
Dev Server: [URL] | Not Running

Issues Found:
- [V-XXX] {type}: {description}

Screenshots: [list of captured files]
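
The summary header can be rendered mechanically from per-instruction statuses. A sketch (the REVIEW status name is an assumption):

STATUS_MARKS = {"PASS": "✅", "FAIL": "❌", "REVIEW": "⚠️"}

def render_summary(results: dict[str, str]) -> str:
    """Render the report header from {instruction ID: status}."""
    counts = {s: sum(1 for v in results.values() if v == s) for s in STATUS_MARKS}
    return "\n".join([
        "VERIFICATION REPORT",
        "===================",
        f"Total Instructions: {len(results)}",
        f"Passed: {counts['PASS']} ✅",
        f"Failed: {counts['FAIL']} ❌",
        f"Needs Review: {counts['REVIEW']} ⚠️",
    ])

print(render_summary({"V-001": "PASS", "V-002": "FAIL", "V-003": "REVIEW"}))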

Example

Given a checklist:

[ ] All functions have docstrings
[ ] No unused imports
[ ] Tests pass with >80% coverage

Workflow execution:

  1. Parse into V-001, V-002, V-003
  2. Pre-flight confirms all are testable
  3. Sub-agent checks V-001 → FAIL (missing docstring in utils.py:45)
  4. Main agent adds docstring
  5. Sub-agent re-checks → PASS
  6. Continue to V-002...
  7. Final report shows 3/3 passed

Key Principles

  • Structured feedback: Sub-agent always returns actionable, located findings
  • No repeated fixes: Track what was tried to avoid loops
  • Early exit: Don't burn attempts on unfixable issues
  • Regression awareness: Fixes shouldn't break other things
  • Audit everything: The journey matters for debugging