vibium-browser-automation
Comprehensive guide for using Vibium browser automation tool via MCP. Use when (1) Automating web interactions in Cursor IDE, (2) Navigating web pages, (3) Taking screenshots, (4) Clicking buttons or links, (5) Filling forms or input fields, (6) Finding elements on web pages, (7) Web scraping tasks, (8) Testing websites, or any task requiring programmatic browser control within Cursor IDE
When & Why to Use This Skill
This Claude skill provides a comprehensive framework for browser automation using the Vibium MCP tool within Cursor IDE. It enables users to programmatically navigate web pages, interact with UI elements like buttons and forms, capture screenshots for debugging, and perform complex web scraping or automated testing tasks directly from their development environment. By bridging the gap between the IDE and the browser, it streamlines workflows that require real-time web interaction and data extraction.
Use Cases
- Automated Web Scraping: Extracting structured data from dynamic websites by navigating through pages and identifying specific CSS elements.
- End-to-End UI Testing: Verifying website functionality and user flows by simulating actions like clicking, typing, and form submission within the Cursor IDE.
- Visual Regression and Documentation: Automatically capturing viewport screenshots across multiple URLs to monitor UI consistency or generate visual documentation.
- Data Entry Automation: Streamlining repetitive tasks such as filling out multi-page web forms or migrating data between web-based platforms.
- Web Research and Analysis: Combining browser navigation with OCR tools to extract and analyze text from complex web layouts or images.
| name | vibium-browser-automation |
|---|---|
| short-description | Browser automation with Vibium MCP |
Vibium Browser Automation
Guide to using the Vibium browser automation tool through MCP in Cursor IDE.
Available Tools
| Tool | Description | Notes |
|---|---|---|
| browser_launch | Start browser (visible by default) | Use headless: false for debugging |
| browser_navigate | Go to URL | Waits for page load |
| browser_find | Find element by CSS selector | Returns element info (tag, text, box) |
| browser_click | Click an element | Waits for element to be clickable |
| browser_type | Type text into an element | Waits for element to be editable |
| browser_screenshot | Capture viewport | Saves to Pictures/Vibium/ by default |
| browser_quit | Close browser | Clean up when done |
Core Workflows
Basic Navigation and Screenshot
# Launch browser
browser_launch(headless=False)
# Navigate to page
browser_navigate(url="https://example.com")
# Take screenshot
browser_screenshot(filename="example.png")
# Close browser
browser_quit()
Finding and Interacting with Elements
# Find element
element = browser_find(selector="h1")
# Returns: tag=h1, text="Example", box={x:100, y:200, w:300, h:50}
# Click element
browser_click(selector="a")
# Type into input
browser_type(selector="input[name='q']", text="search query")
Form Filling
# Navigate to form page
browser_navigate(url="https://example.com/form")
# Fill multiple fields
browser_type(selector="input[name='name']", text="John Doe")
browser_type(selector="input[name='email']", text="john@example.com")
browser_type(selector="input[name='phone']", text="123-456-7890")
# Take screenshot of filled form
browser_screenshot(filename="form-filled.png")
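The repeated browser_type calls above share one pattern: a field name mapped to an input[name='...'] selector. As a rough sketch, those selector/text pairs could be generated from a dictionary. The helper name form_steps is hypothetical and not part of Vibium; it only prepares arguments, and the browser_type calls themselves still go through MCP.

```python
def form_steps(fields):
    """Turn {field_name: value} into (selector, text) pairs for browser_type."""
    steps = []
    for name, value in fields.items():
        # Escape backslashes and single quotes so the attribute selector stays valid.
        safe_name = name.replace("\\", "\\\\").replace("'", "\\'")
        steps.append((f"input[name='{safe_name}']", value))
    return steps


steps = form_steps({
    "name": "John Doe",
    "email": "john@example.com",
    "phone": "123-456-7890",
})
for selector, text in steps:
    # Print the call that would be issued for each field.
    print(f"browser_type(selector={selector!r}, text={text!r})")
```

Keeping the field data in one dictionary makes it easier to reuse the same form flow with different test data.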
Best Practices
CSS Selectors
Use simple, reliable selectors:
- ✅ input[name="q"] - Attribute selectors
- ✅ textarea[name="q"] - Element + attribute
- ✅ h1, a, button - Simple element selectors
- ✅ .class-name - Class selectors
- ❌ :has-text() - Not supported
- ❌ Complex pseudo-selectors - May fail
Multiple selectors for flexibility:
# Try multiple selectors if unsure
browser_find(selector="textarea[name='q'], input[name='q']")
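A comma-separated selector list does not say which variant matched. Where that matters, one alternative is to try the candidates one at a time, as in the sketch below. The function first_matching_selector and its find parameter are hypothetical: find stands in for browser_find, and the sketch assumes a miss surfaces as an exception.

```python
def first_matching_selector(selectors, find):
    """Return (selector, result) for the first selector that `find` resolves.

    `find` is a stand-in for browser_find: it takes a selector string and
    either returns element info or raises when nothing matches (assumption).
    """
    errors = {}
    for selector in selectors:
        try:
            return selector, find(selector)
        except Exception as exc:
            errors[selector] = str(exc)
    raise LookupError(f"no selector matched: {errors}")


# Stand-in finder for demonstration only; it knows a single selector.
def fake_find(selector):
    page = {"textarea[name='q']": {"tag": "textarea", "text": ""}}
    if selector not in page:
        raise RuntimeError("element not found")
    return page[selector]


selector, info = first_matching_selector(
    ["input[name='q']", "textarea[name='q']"], fake_find
)
print(selector, info["tag"])
```

The returned selector can then be passed unchanged to browser_click or browser_type.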
Error Handling
Connection issues:
- If browser connection fails, relaunch: browser_launch()
- Check if browser is still running before operations
Element not found:
- Wait for page to load before finding elements
- Use more generic selectors if specific ones fail
- Take screenshot to inspect page state
Timeouts:
- Some operations may time out after 30s
- Break complex tasks into smaller steps
- Take screenshots at key points for debugging
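One way to see which step is approaching the 30s limit is to run a task as a list of small, labelled steps and time each one. The sketch below is illustrative only: run_steps is a hypothetical helper, and the lambda actions stand in for individual MCP tool calls.

```python
import time


def run_steps(steps, warn_after=30.0, clock=time.monotonic):
    """Run (label, callable) steps in order and record how long each took."""
    report = []
    for label, action in steps:
        started = clock()
        action()
        elapsed = clock() - started
        report.append(
            {"step": label, "seconds": elapsed, "slow": elapsed >= warn_after}
        )
    return report


# Stand-in actions; in practice each would wrap one tool call.
report = run_steps([
    ("navigate", lambda: None),
    ("type query", lambda: None),
])
for entry in report:
    print(entry["step"], "slow" if entry["slow"] else "ok")
```

A step flagged as slow is a natural place to split the task further or to take a screenshot.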
Screenshot Management
- Screenshots save to C:\Users\<user>\Pictures\Vibium\ on Windows
- Use descriptive filenames: google-search-typed.png
- Take screenshots before/after important actions for debugging
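Descriptive filenames are easier to keep consistent when they are generated. The following sketch builds names in the site-action style shown above; screenshot_name is a hypothetical helper, and the optional numeric prefix is an added convention for keeping before/after shots in order.

```python
import re


def screenshot_name(site, action, index=None):
    """Build a lowercase, hyphen-separated PNG filename."""
    parts = [site, action] if index is None else [f"{index:02d}", site, action]
    slug = "-".join(parts).lower()
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", slug).strip("-")
    return f"{slug}.png"


print(screenshot_name("Google", "search typed"))
print(screenshot_name("example.com", "Form Filled", index=3))
```

The result can be passed as the filename argument of browser_screenshot.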
Limitations and Workarounds
For detailed explanation of limitations and comprehensive workarounds, see limitations.md.
No Snapshot Log Support
Problem: Vibium doesn't support saving accessibility snapshots (YAML format).
Workaround: Combine with cursor-ide-browser:
- Use vibium for browser operations
- Use cursor-ide-browser's browser_snapshot to get accessibility snapshot
- Save snapshot to YAML file
- Use snapshot-query tools to analyze snapshot
No Keyboard Actions
Problem: Can't press Enter or other keys directly.
Workaround:
- For form submission, find and click submit button instead
- For search, find search button and click it
- Or navigate directly to result URL if possible
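Navigating directly to a result URL means encoding the query yourself. A minimal sketch using Python's standard urllib.parse is shown below. The helper name build_result_url is hypothetical, and the approach only applies to forms that submit with GET, where the fields end up in the query string.

```python
from urllib.parse import urlencode


def build_result_url(action_url, params):
    """Append URL-encoded query parameters to a form's action URL."""
    separator = "&" if "?" in action_url else "?"
    return f"{action_url}{separator}{urlencode(params)}"


url = build_result_url("https://www.google.com/search", {"q": "your query"})
print(url)  # https://www.google.com/search?q=your+query
```

The resulting string can be passed to browser_navigate in place of typing the query and pressing Enter.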
Limited Selector Support
Problem: Some advanced CSS selectors don't work.
Workaround:
- Use basic selectors (element, class, attribute)
- Try multiple selector variations
- Use browser_find to test selectors before clicking/typing
Text Extraction Limitations
Problem: Cannot directly extract full page text content.
Recommended Solution: EasyOCR + Screenshot + Scroll + Grep
This is the most reliable one-time solution:
- Take screenshot with browser_screenshot()
- Use EasyOCR to extract all text from image
- Use grep/regex to search in OCR text
- Scroll page and repeat for full content
See limitations.md for complete implementation.
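The grep step can be sketched independently of the OCR engine. The example below assumes the OCR step has already produced one list of text lines per screenshot; the sample data is invented for illustration, and merge_ocr_pages and grep_lines are hypothetical helper names. Consecutive screenshots of a scrolled page usually overlap, so repeated lines at the seam are dropped before searching.

```python
import re


def merge_ocr_pages(pages):
    """Concatenate OCR lines from successive screenshots, skipping overlap.

    The longest run of lines that both ends the merged text and starts the
    next page is dropped from that page.
    """
    merged = []
    for lines in pages:
        overlap = 0
        for size in range(min(len(merged), len(lines)), 0, -1):
            if merged[-size:] == lines[:size]:
                overlap = size
                break
        merged.extend(lines[overlap:])
    return merged


def grep_lines(lines, pattern):
    """Return the lines that match a case-insensitive regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [line for line in lines if regex.search(line)]


# Hypothetical OCR output for two overlapping screenshots.
pages = [
    ["Pricing", "Basic plan: $5", "Pro plan: $12"],
    ["Pro plan: $12", "Team plan: $30", "Contact sales"],
]
text = merge_ocr_pages(pages)
print(grep_lines(text, r"plan: \$\d+"))
```

Exact line matching is a simplification: OCR noise can make the same line read slightly differently in two screenshots, in which case the duplicate is kept rather than removed.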
Other Workarounds:
- Combine with cursor-ide-browser snapshot for full text extraction
- Use web_search tool for structured information
- Extract multiple small elements instead of large ones
Manual Information Organization
Problem: Need to manually combine and organize data from multiple browser_find results.
Workaround:
- Use web_search for pre-structured data
- Combine tools (vibium + cursor-ide-browser + snapshot-query)
- Create extraction scripts for automated organization
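As an illustration of what such an extraction script might do, the sketch below sorts collected element info into reading order using the box coordinates. It assumes each browser_find result has been copied into a dictionary with tag, text, and box (x, y, w, h) keys, mirroring the fields listed earlier; the exact result format and the helper names are assumptions.

```python
def reading_order(results, row_tolerance=10):
    """Sort element info top-to-bottom, then left-to-right within a row.

    Elements within `row_tolerance` pixels of the first element in a row
    are treated as part of that row.
    """
    ordered = sorted(results, key=lambda r: (r["box"]["y"], r["box"]["x"]))
    rows = []
    for item in ordered:
        if rows and abs(item["box"]["y"] - rows[-1][0]["box"]["y"]) <= row_tolerance:
            rows[-1].append(item)
        else:
            rows.append([item])
    return [i for row in rows for i in sorted(row, key=lambda r: r["box"]["x"])]


def to_table(results):
    """Flatten results into (tag, text) rows, dropping empty text."""
    return [(r["tag"], r["text"].strip()) for r in results if r["text"].strip()]


# Hypothetical results gathered from several browser_find calls.
found = [
    {"tag": "a", "text": "About", "box": {"x": 300, "y": 198, "w": 60, "h": 20}},
    {"tag": "p", "text": "Intro text", "box": {"x": 100, "y": 300, "w": 400, "h": 40}},
    {"tag": "h1", "text": "Example", "box": {"x": 100, "y": 200, "w": 300, "h": 50}},
    {"tag": "a", "text": " ", "box": {"x": 100, "y": 400, "w": 10, "h": 10}},
]
print(to_table(reading_order(found)))
```

The row tolerance keeps elements that sit a few pixels apart vertically on the same line, so the heading at x=100 is listed before the link at x=300 even though the link starts slightly higher.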
Common Patterns
Web Search Automation
# Navigate to search engine
browser_navigate(url="https://www.google.com")
# Find search box
browser_find(selector="textarea[name='q'], input[name='q']")
# Type search query
browser_type(selector="textarea[name='q']", text="your query")
# Take screenshot
browser_screenshot(filename="search-typed.png")
# Note: Can't press Enter, need to find and click search button
# Or navigate directly to search results URL
Information Extraction
# Navigate to page
browser_navigate(url="https://example.com")
# Find elements
heading = browser_find(selector="h1")
links = browser_find(selector="a")
# Extract information from results
# Results include: tag, text, box (position/size)
Multi-Page Workflow
# Page 1
browser_navigate(url="https://example.com/page1")
browser_screenshot(filename="page1.png")
# Page 2
browser_navigate(url="https://example.com/page2")
browser_screenshot(filename="page2.png")
# Page 3
browser_navigate(url="https://example.com/page3")
element = browser_find(selector=".target")
browser_click(selector=".target")
browser_screenshot(filename="page3-clicked.png")
Troubleshooting
Browser Connection Lost
Symptoms: "failed to get browsing context" errors
Solution:
# Relaunch browser
browser_launch(headless=False)
# Then retry operations
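The relaunch-then-retry pattern can be wrapped in a small helper. The sketch below is illustrative: with_relaunch is a hypothetical name, operation and relaunch are stand-ins for tool calls, and it assumes a lost connection surfaces as an exception whose message contains the error text quoted above.

```python
def with_relaunch(operation, relaunch, retries=1):
    """Run `operation`; on a lost-connection error, relaunch and try again."""
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception as exc:
            lost = "failed to get browsing context" in str(exc)
            # Re-raise unrelated errors, and give up once retries are used.
            if not lost or attempt == retries:
                raise
            relaunch()


# Stand-ins for demonstration: the first call fails, the second succeeds.
calls = {"operation": 0, "relaunch": 0}


def flaky_navigate():
    calls["operation"] += 1
    if calls["operation"] == 1:
        raise RuntimeError("failed to get browsing context")
    return "loaded"


def fake_relaunch():
    calls["relaunch"] += 1


print(with_relaunch(flaky_navigate, fake_relaunch))
```

A relaunched browser may not be on the page the failed step expected, so the retried operation should usually begin with browser_navigate.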
Element Not Found
Symptoms: "element not found" errors
Solutions:
- Wait for page to fully load
- Use more generic selectors
- Take screenshot to verify page state
- Try alternative selectors
Timeout Errors
Symptoms: Operations time out after 30s
Solutions:
- Break task into smaller steps
- Take screenshots between steps
- Verify page loaded correctly before operations
Search/Form Submission
Problem: Can't press Enter to submit
Solutions:
- Find submit button and click it
- Navigate directly to result URL
- Use form action URL with parameters
Integration with Other Tools
Combining with cursor-ide-browser
For tasks requiring accessibility snapshots:
- Use vibium for browser operations
- Use cursor-ide-browser for snapshot capture
- Save snapshot to YAML
- Use snapshot-query for analysis
Combining with web_search
For research tasks:
- Use web_search for initial information gathering
- Use vibium to visit specific URLs found
- Extract detailed information with browser_find
- Take screenshots for documentation
Example Use Cases
1. Website Screenshot Collection
urls = ["https://example.com", "https://example.org", "https://example.net"]
for i, url in enumerate(urls):
    browser_navigate(url=url)
    browser_screenshot(filename=f"site-{i+1}.png")
2. Form Data Entry
browser_navigate(url="https://example.com/form")
browser_type(selector="input[name='name']", text="Test User")
browser_type(selector="input[name='email']", text="test@example.com")
browser_type(selector="input[name='phone']", text="123-456-7890")
browser_screenshot(filename="form-filled.png")
3. Information Research
# Search for information
browser_navigate(url="https://www.google.com/search?q=your+query")
browser_screenshot(filename="search-results.png")
# Find and click first result
first_result = browser_find(selector=".g h3 a")
browser_click(selector=".g h3 a")
browser_screenshot(filename="result-page.png")
Key Takeaways
- EasyOCR for text extraction - Use EasyOCR + Screenshot + Scroll + Grep for reliable full-page text extraction (see easyocr-workflow.md)
- Keep selectors simple - Use basic CSS selectors for reliability
- Take screenshots often - Helps with debugging and documentation
- Handle errors gracefully - Relaunch browser if connection lost
- Combine tools - Use vibium with other browser tools for advanced features
- Test selectors first - Use browser_find to verify selectors before clicking/typing
- No keyboard actions - Use clicks instead of key presses
- No snapshot logs - Use cursor-ide-browser for accessibility snapshots