code-verification
Multi-agent code verification workflow using a main agent and sub-agent loop. Use when verifying code against requirements, acceptance criteria, or quality standards. Triggers on requests to verify, validate, or check code against specifications, checklists, or instructions.
When & Why to Use This Skill
This Claude skill provides a multi-agent workflow for automated code verification and quality assurance. Using a main agent / sub-agent loop, it systematically validates code against requirements, acceptance criteria, and quality standards. The skill features automated fix attempts, regression testing, and browser-based UI verification via Playwright, ensuring that implementations are reliable, accessible, and compliant with their technical specifications.
Use Cases
- Automated Requirements Validation: Verify that new feature implementations strictly adhere to complex acceptance criteria and functional specifications through structured, itemized checklists.
- End-to-End Web UI Testing: Use Playwright integration to perform deep inspections of DOM elements, visual appearance, accessibility (ARIA) standards, and network performance on web applications.
- Iterative Bug Fixing and Self-Healing: Automatically detect code failures or linting errors and execute iterative fix attempts with built-in regression checks to ensure stability without manual intervention.
- Code Quality and Compliance Audits: Scan codebases to confirm that all functions have docstrings, no unused imports remain, and test coverage stays high across multiple files.
- Pre-deployment Verification: Generate comprehensive verification reports that document passed tests, failed attempts, and audit trails before merging code into production.
| name | code-verification |
|---|---|
| description | Multi-agent code verification workflow using a main agent and sub-agent loop. Use when verifying code against requirements, acceptance criteria, or quality standards. Triggers on requests to verify, validate, or check code against specifications, checklists, or instructions. |
Code Verification Skill
Verify code against requirements using a main agent / sub-agent loop with structured feedback and automatic retry.
Workflow Overview
1. Parse verification instructions into testable items
2. For each instruction:
a. Pre-flight: Confirm instruction is testable
b. Sub-agent: Verify if instruction is met
c. If failed: Main agent attempts fix
d. Repeat b-c up to 5 times or until success
e. Update checklist with result
3. Generate verification report
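As a rough sketch of how this orchestration could look (illustrative, not a prescribed implementation): `preflight`, `verify`, `fix`, and `regression_ok` stand in for hooks into the agent runtime, and each item is a parsed instruction from Step 1.

```python
from dataclasses import dataclass
from typing import Callable

MAX_ATTEMPTS = 5

@dataclass
class Result:
    instruction_id: str
    status: str   # PASS | FAIL | BLOCKED | NEEDS_REVIEW
    attempts: int

def verify_all(items, preflight: Callable, verify: Callable,
               fix: Callable, regression_ok: Callable) -> list[Result]:
    """Main agent / sub-agent loop over the parsed instructions."""
    results = []
    for item in items:
        if not preflight(item):            # 2a: flag untestable items up front
            results.append(Result(item.id, "BLOCKED", 0))
            continue
        status, attempt = "FAIL", 0
        while attempt < MAX_ATTEMPTS:      # 2d: up to 5 rounds
            attempt += 1
            if verify(item):               # 2b: sub-agent reports, never fixes
                status = "PASS"
                break
            fix(item, attempt)             # 2c: main agent's targeted fix
            if not regression_ok(item):    # step 6: revert happens in the hook
                status = "NEEDS_REVIEW"
                break
        results.append(Result(item.id, status, attempt))  # 2e: update checklist
    return results                         # feeds the step-7 report
```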
Step 1: Parse Verification Instructions
Extract each verification instruction into a discrete, testable item:
- ID: Unique identifier (e.g., `V-001`)
- Instruction: The requirement text
- Test approach: How to verify (file inspection, run tests, lint, type check, etc.)
- Files involved: Which files to examine
- Requires Browser: Whether the instruction needs Playwright MCP verification
- Auto-detect from keywords: UI, render, display, visible, hidden, show, hide, click, hover, focus, blur, scroll, DOM, element, component, layout, responsive, style, CSS, color, font, screenshot, visual, appearance, console, error, warning, log, network, request, response, accessibility, a11y, ARIA, animation, transition, loading, performance
- Mark as: `browser: true` or `browser: false`
- Browser Verification Type (if `browser: true`):
  - `DOM_INSPECTION` - Element presence, visibility, content via accessibility tree snapshots
  - `SCREENSHOT` - Visual appearance, layout verification
  - `CONSOLE` - Browser console errors, warnings, logs
  - `NETWORK` - API requests, responses, status codes (via network interception)
  - `PERFORMANCE` - Load times, Core Web Vitals (via tracing)
  - `ACCESSIBILITY` - ARIA attributes, semantic HTML, accessibility tree analysis
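One possible in-memory representation of a parsed item (a sketch; the field names are illustrative, and the keyword set mirrors the auto-detect list above):

```python
from dataclasses import dataclass, field

# Keywords that flag an instruction as needing Playwright MCP verification
BROWSER_KEYWORDS = {
    "ui", "render", "display", "visible", "hidden", "show", "hide", "click",
    "hover", "focus", "blur", "scroll", "dom", "element", "component",
    "layout", "responsive", "style", "css", "color", "font", "screenshot",
    "visual", "appearance", "console", "error", "warning", "log", "network",
    "request", "response", "accessibility", "a11y", "aria", "animation",
    "transition", "loading", "performance",
}

@dataclass
class VerificationItem:
    id: str                          # e.g. "V-001"
    instruction: str                 # the requirement text
    test_approach: str               # file inspection, run tests, lint, ...
    files: list[str] = field(default_factory=list)
    browser: bool = False
    browser_type: str | None = None  # DOM_INSPECTION, SCREENSHOT, ...

def needs_browser(instruction: str) -> bool:
    """Auto-detect browser verification from the keyword list."""
    words = instruction.lower().split()
    return any(w.strip(".,:;()") in BROWSER_KEYWORDS for w in words)
```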
Step 2: Pre-flight Validation
Before the verification loop, confirm each instruction is testable:
- Instruction is specific and unambiguous
- Success criteria are clear
- Required files/resources exist
Flag untestable instructions immediately rather than attempting verification.
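The mechanical part of pre-flight, confirming that referenced files exist, can be checked directly; instruction specificity and success criteria still need the main agent's judgment. A minimal sketch, assuming items carry the `files` list from Step 1:

```python
from pathlib import Path

def preflight_check(item) -> tuple[bool, str]:
    """Return (testable, reason) before spawning any sub-agent."""
    if not item.instruction.strip():
        return False, "Instruction is empty"
    missing = [f for f in item.files if not Path(f).exists()]
    if missing:
        return False, "Missing files: " + ", ".join(missing)
    return True, "OK"
```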
Browser-Specific Pre-Flight
For instructions with `browser: true`:
Check Playwright MCP availability
- If unavailable, mark instruction as BLOCKED with reason: "Playwright MCP not available"
- Suggest: "Ensure Playwright MCP server is running (`npx @playwright/mcp@latest`)"
Verify dev server is running
- Check if configured dev server URL responds (e.g., `http://localhost:3000`)
- If not running, attempt to start using configured command (e.g., `npm run dev`)
- Wait for configured startup time before proceeding
- If unable to start, mark as BLOCKED: "Dev server not accessible at {URL}"
Confirm target route exists
- Navigate to the page specified in the instruction using `browser_navigate`
- If 404 or error, mark as BLOCKED: "Target route not found: {route}"
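A sketch of the dev-server check, assuming a configured URL, start command, and startup timeout (the defaults below are illustrative):

```python
import subprocess
import time
import urllib.request

def ensure_dev_server(url: str = "http://localhost:3000",
                      start_cmd: tuple = ("npm", "run", "dev"),
                      startup_timeout: int = 30) -> bool:
    """Return True once the dev server responds, starting it if needed."""
    def responds() -> bool:
        try:
            urllib.request.urlopen(url, timeout=2)
            return True
        except OSError:
            return False

    if responds():
        return True
    proc = subprocess.Popen(start_cmd)   # leave running for the verification loop
    deadline = time.time() + startup_timeout
    while time.time() < deadline:        # wait for configured startup time
        if responds():
            return True
        time.sleep(1)
    proc.terminate()
    return False                         # caller marks the instruction BLOCKED
```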
Step 3: Sub-Agent Verification Protocol
Spawn a sub-agent to verify each instruction. The sub-agent MUST return structured output:
VERIFICATION RESULT
-------------------
Instruction ID: [ID]
Status: PASS | FAIL | BLOCKED
Location: [file:line or "N/A"]
Severity: BLOCKING | MINOR
Finding: [What was found]
Expected: [What was expected]
Suggested Fix: [Specific fix recommendation]
Sub-agent rules:
- Check ONLY the specific instruction assigned
- Do not attempt fixes—report findings only
- Be precise about location (file, line number, function name)
- Distinguish between blocking failures and minor issues
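Because the result format is fixed, the main agent can parse it mechanically rather than re-reading free text. A sketch keyed to the field labels above:

```python
import re

FIELDS = ("Instruction ID", "Status", "Location", "Severity",
          "Finding", "Expected", "Suggested Fix")

def parse_result(text: str) -> dict:
    """Extract the labeled fields from a sub-agent's VERIFICATION RESULT."""
    result = {}
    for label in FIELDS:
        match = re.search(rf"^{re.escape(label)}:\s*(.+)$", text, re.MULTILINE)
        result[label] = match.group(1).strip() if match else None
    return result
```

For example, `parse_result(report)["Status"] == "FAIL"` is what gates the fix protocol in Step 4.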
Browser-Enhanced Verification Output
For instructions with `browser: true`, the sub-agent MUST use Playwright MCP and return:
BROWSER VERIFICATION RESULT
---------------------------
Instruction ID: [ID]
Status: PASS | FAIL | BLOCKED
Type: DOM | VISUAL | CONSOLE | NETWORK | PERFORMANCE | ACCESSIBILITY
URL: [URL] | Viewport: [width]x[height]
Finding: [What was observed]
Expected: [What was expected]
Details: [Type-specific information]
- DOM: selector, found, visible, content
- Visual: screenshot path, description
- Console: errors, warnings, logs
- Network: endpoint, method, status, response summary
- Performance: load time, LCP, FID, CLS
- Accessibility: ARIA, semantic HTML, contrast, keyboard nav
Suggested Fix: [Specific fix recommendation]
Browser Sub-Agent Rules
In addition to standard sub-agent rules, browser verification sub-agents MUST:
- Start with an accessibility tree snapshot (`browser_snapshot`) of the initial state
- Use stable selectors (prefer `data-testid` over complex CSS paths, or use accessibility tree element refs)
- Wait for dynamic content to load before inspecting (`browser_wait_for_text` or `browser_wait`)
- Capture console output before and after actions
- Take screenshots (`browser_screenshot`) when verifying visual appearance
- Test at default viewport unless criterion specifies responsive/mobile (use `browser_resize` to change)
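The sub-agent drives these checks through MCP tool calls, but the same DOM_INSPECTION-style check written directly against the Playwright Python library gives a feel for what it verifies (an equivalent sketch, not the MCP interface; the test id is hypothetical):

```python
from playwright.sync_api import sync_playwright

def check_element_visible(url: str, test_id: str) -> dict:
    """Presence, visibility, content, and console errors for one element."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        console_errors = []
        page.on("console", lambda msg: console_errors.append(msg.text)
                if msg.type == "error" else None)
        page.goto(url, wait_until="networkidle")   # wait for dynamic content
        locator = page.get_by_test_id(test_id)     # stable selector, per the rules
        result = {
            "found": locator.count() > 0,
            "visible": locator.count() > 0 and locator.is_visible(),
            "content": locator.text_content() if locator.count() else None,
            "console_errors": console_errors,
        }
        browser.close()
        return result
```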
Step 4: Main Agent Fix Protocol
When sub-agent reports FAIL:
- Review the finding - Understand what failed and why
- Check fix history - Do not repeat a previously attempted fix
- Apply targeted fix - Make the minimum change to address the issue
- Log the attempt - Record what was changed
Fix attempt tracking
Maintain a fix log per instruction:
FIX LOG: [Instruction ID]
--------------------------
Attempt 1: [Description of change] → [Result]
Attempt 2: [Description of change] → [Result]
...
Strategy escalation
- Attempts 1-2: Direct fix based on sub-agent suggestion
- Attempt 3: Try alternative approach
- Attempts 4-5: Broaden scope, consider architectural changes
If the same failure pattern repeats twice, explicitly try a different strategy.
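One way to track the fix log and repeat detection (a sketch; `signature` is a hypothetical normalization of a failure, such as error type plus location):

```python
from collections import Counter

class FixLog:
    """Per-instruction history of attempted fixes and observed failures."""
    def __init__(self):
        self.attempts: list[tuple[str, str]] = []   # (change, result)
        self.failures = Counter()                    # failure signature counts

    def record(self, change: str, result: str, signature: str):
        self.attempts.append((change, result))
        if result == "FAIL":
            self.failures[signature] += 1

    def already_tried(self, change: str) -> bool:
        """Never repeat a previously attempted fix."""
        return any(c == change for c, _ in self.attempts)

    def should_escalate(self, signature: str) -> bool:
        """Same failure pattern twice: explicitly switch strategy."""
        return self.failures[signature] >= 2
```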
Browser-Specific Fix Strategies
| Failure Type | Common Fixes |
|---|---|
| DOM/Visibility | Conditional rendering, CSS display/visibility, z-index, prop passing |
| Console errors | JS exceptions, missing mocks, env vars, CORS |
| Network | Endpoint URLs, auth headers, payload format, CORS config |
| Visual | CSS cascade, responsive breakpoints, font loading |
| Performance | Bundle size, image optimization, lazy loading, render-blocking |
| Accessibility | ARIA attributes, color contrast, heading hierarchy, keyboard handlers |
Step 5: Exit Conditions
Exit the verification loop when ANY condition is met:
| Condition | Action |
|---|---|
| Sub-agent reports PASS | ✅ Check off instruction |
| 5 attempts exhausted | ❌ Mark failed with notes |
| Same failure 3+ times | ⚠️ Exit early, flag for review |
| Fix introduces regression | ⚠️ Revert, flag for review |
| Issue is MINOR severity | ⚠️ Note and continue |
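Evaluated in order after every attempt, these conditions reduce to a small function. A sketch built on the `FixLog` above:

```python
def exit_reason(status: str, severity: str, attempt: int,
                log: FixLog, signature: str, regressed: bool) -> str | None:
    """Return an exit reason, or None to continue the loop."""
    if status == "PASS":
        return "PASSED"                    # check off instruction
    if regressed:
        return "REVERT_AND_REVIEW"         # fix broke something else
    if log.failures[signature] >= 3:
        return "REPEATED_FAILURE_REVIEW"   # exit early, flag for review
    if severity == "MINOR":
        return "NOTED_MINOR"               # note and continue the checklist
    if attempt >= 5:
        return "FAILED_EXHAUSTED"          # mark failed with notes
    return None
```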
Step 6: Regression Check
After each fix attempt, verify:
- The targeted instruction (primary check)
- Any previously-passing related instructions (regression check)
If a fix breaks something else, revert and note the conflict.
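A sketch of the combined check, where `related` holds previously passing instructions that touch the same files and `verify`/`revert` are the runtime hooks from the loop sketch:

```python
def regression_check(item, related, verify, revert) -> bool:
    """Re-run the target check plus previously passing neighbors;
    revert the fix and flag the conflict if anything broke."""
    if not verify(item):                  # primary check
        return False
    broken = [r for r in related if not verify(r)]
    if broken:
        revert(item)                      # undo the offending fix
        ids = ", ".join(r.id for r in broken)
        print(f"Regression: fix for {item.id} broke {ids}")
        return False
    return True
```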
Browser Regression Checks
After each browser-related fix, verify no regressions in: console errors, visual appearance, performance metrics, accessibility. If regression detected, capture before/after state and log in fix history.
Step 7: Generate Verification Report
After all instructions are processed:
VERIFICATION REPORT
===================
Total Instructions: [N]
Passed: [N] ✅
Failed: [N] ❌
Needs Review: [N] ⚠️
DETAILS
-------
[V-001] ✅ [Instruction summary]
[V-002] ❌ [Instruction summary]
- Failed after 5 attempts
- Last error: [description]
- Attempts: [brief log]
[V-003] ⚠️ [Instruction summary]
- Flagged: Repeated same failure pattern
- Recommendation: [suggestion]
AUDIT TRAIL
-----------
[Timestamp] V-001: Verified PASS on first check
[Timestamp] V-002: Attempt 1 - Changed X → FAIL
[Timestamp] V-002: Attempt 2 - Changed Y → FAIL
...
BROWSER VERIFICATION (if applicable)
------------------------------------
Browser Checks: [passed]/[total] | Blocked: [N]
Playwright: Available | Unavailable
Dev Server: [URL] | Not Running
Issues Found:
- [V-XXX] {type}: {description}
Screenshots: [list of captured files]
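Assembling the summary header from the loop's `Result` records is mechanical. A sketch:

```python
def render_report(results: list[Result]) -> str:
    """Build the report summary from the loop's Result records."""
    passed = sum(r.status == "PASS" for r in results)
    failed = sum(r.status == "FAIL" for r in results)
    review = len(results) - passed - failed   # BLOCKED / NEEDS_REVIEW
    lines = [
        "VERIFICATION REPORT",
        "===================",
        f"Total Instructions: {len(results)}",
        f"Passed: {passed} ✅",
        f"Failed: {failed} ❌",
        f"Needs Review: {review} ⚠️",
    ]
    icon = {"PASS": "✅", "FAIL": "❌"}
    for r in results:
        lines.append(f"[{r.instruction_id}] {icon.get(r.status, '⚠️')}")
    return "\n".join(lines)
```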
Example
Given a checklist:
[ ] All functions have docstrings
[ ] No unused imports
[ ] Tests pass with >80% coverage
Workflow execution:
- Parse into V-001, V-002, V-003
- Pre-flight confirms all are testable
- Sub-agent checks V-001 → FAIL (missing docstring in `utils.py:45`)
- Main agent adds docstring
- Sub-agent re-checks → PASS
- Continue to V-002...
- Final report shows 3/3 passed
Key Principles
- Structured feedback: Sub-agent always returns actionable, located findings
- No repeated fixes: Track what was tried to avoid loops
- Early exit: Don't burn attempts on unfixable issues
- Regression awareness: Fixes shouldn't break other things
- Audit everything: The journey matters for debugging