visual-snapshot-skill

from nhantran889

Adds visual capabilities to the agent using Playwright MCP with Script Execution support.

Updated Jan 4, 2026

When & Why to Use This Skill

The Visual Snapshot Skill gives Claude agents browser automation and visual inspection capabilities via Playwright MCP. Agents can 'see' web interfaces, capture high-resolution screenshots, and execute JavaScript directly within the browser context, which supports UI verification and automated web workflows.

Use Cases

  • Visual Regression Testing: Capture 'before' and 'after' snapshots of web components to verify CSS or layout changes during the development process.
  • Automated UI Debugging: Navigate to local or remote URLs to identify rendering issues and use accessibility audit tools to ensure compliance with web standards.
  • Complex Workflow Automation: Use the 'Code-as-Params' pattern to execute multi-step interactions—like form filling, cart management, and data extraction—in a single execution turn.
  • Dynamic Content Verification: Programmatically interact with JavaScript-heavy elements (e.g., Shopify themes) to verify that dynamic UI logic functions correctly under different states.
name: Visual Snapshot Skill
version: 1.1.0
description: Adds visual capabilities to the agent using Playwright MCP with Script Execution support.

Capabilities

This skill allows you to "see" the Shopify theme and interact with it programmatically. You can navigate, click, type, and execute complex JavaScript logic to verify your work.

Dependencies

  • MCP Server: @modelcontextprotocol/server-playwright
  • Shopify CLI: Must be running shopify theme dev to serve the site (default: http://localhost:9292).

Tools Usage Guide

1. playwright_navigate

Use this to open the local development URL.

  • Param url: usually http://localhost:9292.

2. playwright_screenshot

Use this to capture the current state.

  • Param name: filename prefix.
  • Param selector: capture specific element.
  • Param width/height: viewport dimensions.
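As a rough illustration of the two tools above, the arguments can be pictured as plain objects. The parameter names come from this guide; the values (the "header-before" prefix, the "header" selector, and the 1280x720 viewport) are made-up examples, not defaults of the MCP server.

```javascript
// Arguments for playwright_navigate: the local Shopify dev server URL.
const navigateArgs = { url: "http://localhost:9292" };

// Arguments for playwright_screenshot.
// All values below are hypothetical examples chosen for illustration.
const screenshotArgs = {
  name: "header-before", // filename prefix
  selector: "header",    // capture only this element
  width: 1280,           // viewport width in pixels
  height: 720,           // viewport height in pixels
};

console.log(JSON.stringify({ navigateArgs, screenshotArgs }, null, 2));
```

Omitting selector would presumably capture the whole page rather than a single element, but check the server's own tool description before relying on that.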

3. playwright_evaluate (Advanced: Remote Code Execution)

This is the most powerful tool. It lets you run arbitrary JavaScript in the browser page context in a SINGLE turn.

  • Param script: The JavaScript code to run. Code runs inside the browser page context.
  • Usage: Use this for complex interactions (loops, conditionals) or to batch multiple actions (Filling forms -> Clicking -> Waiting -> Returning data).
  • Example:
    // Agent sends this string to the 'script' param
    const items = document.querySelectorAll(".product-card");
    const results = [];
    for (const item of items) {
      if (item.innerText.includes("Sale")) {
        const button = item.querySelector("button");
        if (button) { // Skip cards that have no button
          button.click(); // Add to cart
          results.push(item.id);
        }
      }
    }
    return results; // Returns the clicked card ids to the Agent
    

Workflow: Visual Verification Loop

When asked to fix a UI bug or implement a design:

  1. Analyze: Understand the request.
  2. Snapshot (Before):
    • Use playwright_navigate.
    • Use playwright_screenshot.
    • Tip: Use playwright_evaluate to set up complex state (e.g., login, add item to cart) before taking the screenshot.
    • CRITICAL: If the image is not shown automatically, use the appropriate MCP tool to view the image file.
  3. Code: Apply fixes to Liquid/CSS.
  4. Snapshot (After): Verify changes.
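The loop above can be sketched as an ordered sequence of tool calls. This is only an illustration of the ordering: callTool is a hypothetical stand-in that records calls instead of talking to an MCP server, and the ".product-card" selector and filename prefixes are assumptions.

```javascript
// Hypothetical stand-in for issuing an MCP tool call; it only records the call.
const calls = [];
function callTool(name, args) {
  calls.push({ name, args });
}

// One snapshot = navigate to the dev server, then capture the element under test.
function snapshot(label) {
  callTool("playwright_navigate", { url: "http://localhost:9292" });
  callTool("playwright_screenshot", {
    name: `product-card-${label}`, // assumed filename prefix
    selector: ".product-card",     // assumed element under test
  });
}

snapshot("before"); // Step 2: Snapshot (Before)
// Step 3: apply the Liquid/CSS fix here
snapshot("after");  // Step 4: Snapshot (After)

console.log(calls.map((call) => call.name).join(" -> "));
```

Using the same selector and viewport for both snapshots keeps the before and after images directly comparable.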

Advanced Pattern: "Code-as-Params"

Instead of calling click -> wait -> click (3 turns), send the whole sequence as one script. For a single action the call looks like this: playwright_evaluate(script="document.getElementById('menu-toggle').click();")
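A minimal sketch of a full click -> wait -> click sequence packed into one script string follows. It makes several assumptions: the tool wraps the script in a function body (as the top-level return in the earlier example suggests) and awaits a returned promise, the '#menu a' selector is hypothetical, and the 300 ms delay is an arbitrary stand-in for a menu animation.

```javascript
// Script string for the 'script' param of playwright_evaluate.
// Assumed: the tool runs it as a function body and awaits the returned promise.
const script = `
return (async () => {
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  document.getElementById('menu-toggle').click(); // 1. open the menu
  await sleep(300);                               // 2. wait (arbitrary delay)
  const link = document.querySelector('#menu a'); // hypothetical selector
  if (link) link.click();                         // 3. click the first menu link
  return link !== null;                           // tell the agent whether a link was found
})();
`;

// The whole interaction travels as a single tool argument.
const evaluateArgs = { script };
console.log(evaluateArgs.script.trim().split("\n").length, "lines in one turn");
```

A fixed sleep is the simplest form of waiting; polling for the target element inside the script would be more robust when animation timing varies.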