What is test-mcp-connector?

The MCP Connector Testing skill evaluates how "agent-friendly" an MCP-based connector is. It runs simulated conversations through the Claude Agent SDK to check whether an LLM can interpret the connector's action descriptions and make the correct tool calls from natural-language requests, and it feeds any failures back into fixes to the connector definition.

When should I use test-mcp-connector?

test-mcp-connector is useful in the following scenarios:

• Agent-Friendliness Audit: Automatically verify that your API action descriptions are clear enough for an LLM to select the correct tool and populate parameters accurately without manual intervention.
• Multi-turn Workflow Simulation: Test complex, stateful sequences where an agent must retrieve data from one action (e.g., searching for a user ID) and use it in a subsequent step (e.g., updating that user's permissions).
• Connector Regression Testing: Run a comprehensive suite of natural language prompts against a connector after YAML modifications to ensure that updates haven't introduced ambiguities or broken existing agent integrations.
• Optimization of Tool Schemas: Identify specific failure patterns where an agent confuses similar tools, allowing developers to refine naming conventions and parameter descriptions for better model performance.

name	test-mcp-connector
description	\|
**Required triggers (ALL must mention "MCP" explicitly)	**
**DO NOT trigger for	**

MCP Connector Testing

Setup (REQUIRED - Collect ALL upfront)

Required	Description
Account ID	StackOne account ID (linked provider account)
StackOne API Key	StackOne API key (`credentials:read` scope)
Anthropic API Key	For the test agent (check `ANTHROPIC_API_KEY` env, ask if not set)
Profile	CLI profile for `stackone push`
Provider	Provider name (e.g., `datadog`, `intercom`)

⛔ Account ID = Complete Authentication

If the user provides an Account ID, you have EVERYTHING needed for authentication.

The Account ID is a linked account where provider credentials (API keys, tokens, subdomains, secrets) are ALREADY stored in StackOne.

When Account ID is provided, DO NOT ask for:

❌ Provider API keys
❌ Provider tokens or secrets
❌ Any provider-specific credentials

Authentication = Account ID + StackOne API Key. Nothing else needed.
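
As a minimal sketch of what "Account ID + StackOne API Key" means in practice, the helper below builds the two request headers used by the agent code later in this document. The function name `buildStackOneHeaders` is hypothetical and only for illustration; the header names and the Basic-auth encoding (API key as username, empty password) are taken from that agent code.

```typescript
// Hypothetical helper (not part of any StackOne SDK): builds the two headers
// used for the StackOne MCP endpoint in this document's agent code.
function buildStackOneHeaders(
  apiKey: string | undefined,
  accountId: string | undefined
): Record<string, string> {
  // Fail fast so a missing value is reported before any agent run starts.
  if (!apiKey) throw new Error("STACKONE_API_KEY is not set");
  if (!accountId) throw new Error("ACCOUNT_ID is not set");

  // Basic auth: the API key is the username and the password is empty.
  const token = Buffer.from(`${apiKey}:`).toString("base64");
  return {
    Authorization: `Basic ${token}`,
    "x-account-id": accountId,
  };
}
```

No provider credentials appear anywhere in this helper, which is the point of the section above: the linked account already holds them.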

Phase 1: Quick Connectivity Check (Optional)

# Test single action
stackone run src/configs/<provider>/<provider>.connector.s1.yaml \
  --account-id <ACCOUNT_ID> --action-id <key>_<action> --profile=<profile>

# Push after changes
stackone push src/configs/<provider>/<provider>.connector.s1.yaml --profile=<profile>

Phase 2: Agent Simulation (PRIMARY)

Core Principle

Test via agent conversations, not direct tool calls. The goal: can an agent understand action descriptions and use them correctly?

// ❌ WRONG - bypasses agent understanding
await callMCPTool("provider_list_items", { path: {}, query: {} });

// ✅ CORRECT - agent decides what to call
await agent.chat("Show me all items in the system");

Test Agent Setup

All test files: test-agent/<provider>/ (gitignored)

mkdir -p test-agent/<provider> && cd test-agent/<provider>
npm init -y && npm install @anthropic-ai/claude-agent-sdk tsx

Agent code (save as test.ts):

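import { query } from "@anthropic-ai/claude-agent-sdk";

// Credentials come from the environment; fail fast if either is missing.
// The SDK itself reads ANTHROPIC_API_KEY from the environment.
const STACKONE_API_KEY = process.env.STACKONE_API_KEY;
const ACCOUNT_ID = process.env.ACCOUNT_ID;
if (!STACKONE_API_KEY || !ACCOUNT_ID) {
  throw new Error("Set STACKONE_API_KEY and ACCOUNT_ID before running");
}

// Sends one natural-language prompt and records which tools the agent called.
async function testPrompt(prompt: string) {
  const toolCalls: any[] = [];
  let result: any;

  for await (const message of query({
    prompt,
    options: {
      model: "claude-haiku-4-5",
      mcpServers: {
        "stackone": {
          type: "http",
          url: "https://api.stackone.com/mcp",
          headers: {
            // Basic auth: API key as username, empty password
            "Authorization": `Basic ${Buffer.from(`${STACKONE_API_KEY}:`).toString("base64")}`,
            "x-account-id": ACCOUNT_ID
          }
        }
      },
      // Only the StackOne MCP tools are available to the agent
      allowedTools: ["mcp__stackone__*"]
    }
  })) {
    // Collect every tool_use block from the assistant's turns
    if (message.type === "assistant" && message.message?.content) {
      toolCalls.push(...message.message.content.filter((b: any) => b.type === "tool_use"));
    }
    // Keep the final text answer when the run succeeds
    if (message.type === "result" && message.subtype === "success") {
      result = message.result;
    }
  }
  return { prompt, toolCalls, result };
}

// Example run; replace the prompt with one per action under test.
testPrompt("Show me all items in the system")
  .then((r) => console.log(JSON.stringify(r, null, 2)))
  .catch((err) => { console.error(err); process.exit(1); });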

Run: cd test-agent/<provider> && npx tsx test.ts

Testing Process

For EACH action in the connector:

Generate realistic prompt (natural language, no action names)
Send to agent, capture: tool called, params, interpretation
Evaluate: ✅ Right tool + params = PASS | ❌ Wrong = FAIL → fix
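
The evaluate step above can be sketched as a small pure function over the captured tool calls. The `Expectation` shape and the `evaluate` name are hypothetical. The sketch assumes MCP tools appear to the agent as `mcp__<server>__<tool>`, so it compares on the trailing tool name, and it treats each expected parameter as a top-level key of the tool input.

```typescript
// Shape of a tool_use block as captured by the test agent.
interface ToolCall {
  name: string;
  input: Record<string, unknown>;
}

// Hypothetical expectation format for one test prompt.
interface Expectation {
  tool: string;                      // bare tool name, e.g. "provider_list_items"
  params?: Record<string, unknown>;  // top-level input keys that must match
}

function evaluate(
  toolCalls: ToolCall[],
  expected: Expectation
): { pass: boolean; reason: string } {
  // Accept either the bare name or the MCP-prefixed name.
  const match = toolCalls.find(
    (c) => c.name === expected.tool || c.name.endsWith(`__${expected.tool}`)
  );
  if (!match) {
    const called = toolCalls.map((c) => c.name).join(", ") || "none";
    return { pass: false, reason: `wrong tool (called: ${called})` };
  }
  // Structural comparison via JSON; note this is sensitive to key order.
  for (const [key, value] of Object.entries(expected.params ?? {})) {
    if (JSON.stringify(match.input[key]) !== JSON.stringify(value)) {
      return { pass: false, reason: `param mismatch on "${key}"` };
    }
  }
  return { pass: true, reason: "right tool and params" };
}
```

The `reason` string is what feeds the diagnosis step: a "wrong tool" failure usually points at the action description, while a "param mismatch" points at the parameter schema.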

Include multi-turn workflows:

User: "Show me items in category X"     → list_items
User: "Get details on the first one"    → get_item (uses ID from previous)
User: "Update its status to active"     → update_item (same ID)
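
A multi-turn test like the one above needs each turn to run in the same conversation as the previous one. The sketch below keeps that concern separate from the SDK: `runWorkflow` threads a session ID through an injected `sendTurn` function. All names here are hypothetical. In a real run, `sendTurn` might wrap `query()` and pass the previous session ID through the SDK's `resume` option; that wiring is an assumption and is left out.

```typescript
// Result of one conversational turn (hypothetical shape).
interface TurnResult {
  sessionId: string;    // conversation identifier to reuse on the next turn
  toolNames: string[];  // names of the tools the agent called in this turn
}

// Injected transport: sends a prompt, optionally continuing a session.
type SendTurn = (prompt: string, sessionId?: string) => Promise<TurnResult>;

interface Turn {
  prompt: string;
  expectTool: string;   // bare tool name expected for this turn
}

// Runs the turns in order, carrying the session forward, and reports
// per turn whether the expected tool was called.
async function runWorkflow(turns: Turn[], sendTurn: SendTurn): Promise<boolean[]> {
  const outcomes: boolean[] = [];
  let sessionId: string | undefined;
  for (const turn of turns) {
    const res = await sendTurn(turn.prompt, sessionId);
    sessionId = res.sessionId; // later turns rely on earlier context (e.g. an item ID)
    outcomes.push(
      res.toolNames.some(
        (n) => n === turn.expectTool || n.endsWith(`__${turn.expectTool}`)
      )
    );
  }
  return outcomes;
}
```

Injecting `sendTurn` also makes the workflow logic checkable with a fake transport before spending API calls on a real agent.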

Coverage Requirements

Test ALL actions - if connector has 50 actions, test 50 actions
Track progress: [15/50] action_name... ✅ PASS
Include single-turn AND multi-turn tests
Don't skip write operations
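
One way to keep coverage honest is to print a progress line per action and to compute which actions have not been tested yet. This is a sketch only; `progressLine` and `untested` are hypothetical helper names, and the line format follows the tracking example above.

```typescript
// Formats one progress line, e.g. "[15/50] action_name... ✅ PASS".
function progressLine(index: number, total: number, action: string, pass: boolean): string {
  return `[${index}/${total}] ${action}... ${pass ? "✅ PASS" : "❌ FAIL"}`;
}

// Returns the actions that still have no test result.
function untested(allActions: string[], tested: Set<string>): string[] {
  return allActions.filter((action) => !tested.has(action));
}
```

A run is complete only when `untested(...)` returns an empty list, write operations included.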

Fix Loop

Diagnose failure (description unclear? missing param? wrong tool?)
Fix connector YAML
Push: stackone push <path> --profile=<profile>
Retry same prompt → verify fix
Continue to next action

Output

## Test Results: <provider> connector

### Summary
- Total actions: 50 | Tested: 50 (100%)
- Initial pass rate: 84% (42/50)
- Final pass rate: 100% (50/50)
- Fixes applied: 8

### Changes Made
| Action | Issue | Fix |
|--------|-------|-----|
| list_items | Agent confused with search | Added "primary listing endpoint" to description |
| create_item | Missing required field | Added `category_id` to required params |

### Issue Patterns
- Description clarity: 5 fixes
- Missing params: 2 fixes
- Naming conflicts: 1 fix

### Files Modified
- src/configs/<provider>/<provider>.connector.s1.yaml

Required in report: Total vs tested (must equal), before/after pass rates, every fix, files modified.
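
The summary lines can be derived from per-action results so that the counts and percentages stay consistent with each other. The sketch below is one way to do it; the `ActionResult` shape and the `summarize` name are hypothetical, and it assumes one fix per action that went from failing to passing.

```typescript
// Per-action outcome before and after the fix loop (hypothetical shape).
interface ActionResult {
  action: string;
  initialPass: boolean;
  finalPass: boolean;
}

// Builds the Summary lines; throws if coverage is incomplete.
function summarize(totalActions: number, results: ActionResult[]): string[] {
  const tested = results.length;
  // Total vs tested must be equal for the report to be valid.
  if (tested !== totalActions) {
    throw new Error(`Coverage gap: tested ${tested} of ${totalActions} actions`);
  }
  const initial = results.filter((r) => r.initialPass).length;
  const final = results.filter((r) => r.finalPass).length;
  // Assumption: each action that went from FAIL to PASS counts as one fix.
  const fixes = results.filter((r) => !r.initialPass && r.finalPass).length;
  const pct = (n: number) => (tested === 0 ? 0 : Math.round((n / tested) * 100));
  return [
    `- Total actions: ${totalActions} | Tested: ${tested} (100%)`,
    `- Initial pass rate: ${pct(initial)}% (${initial}/${tested})`,
    `- Final pass rate: ${pct(final)}% (${final}/${tested})`,
    `- Fixes applied: ${fixes}`,
  ];
}
```

Computing the percentages from the raw counts avoids a summary whose pass rate and fraction disagree.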
