api-admin-ops

Parlay-Kei's avatarfrom Parlay-Kei

Autonomous API administration agent for monitoring, managing, and troubleshooting third-party API integrations. Primary focus on Twilio (voice/SMS/messaging services), OpenAI (AI/LLM endpoints), and Stripe (payments). Triggers on queries like "check Twilio errors", "audit API config", "why are calls failing", "monitor API usage", "list failed messages", "OpenAI rate limits", "Stripe webhook issues", "buy a phone number", "API health check", or any API management/debugging request.

0stars🔀0forks📁View on GitHub🕐Updated Jan 4, 2026

When & Why to Use This Skill

The API Admin Operations Agent is an autonomous engineering tool designed to monitor, manage, and troubleshoot third-party API integrations such as Twilio, OpenAI, and Stripe. It enhances system reliability by automating health checks, error audits, and configuration management, allowing developers to resolve complex integration issues and track usage metrics through simple natural language commands.

Use Cases

  • Automated Error Auditing: Fetch and classify recent API failures (e.g., Twilio carrier blocks or Stripe webhook issues) into structured reports with ranked severity and suggested remediation steps.
  • Real-time Health & Quota Tracking: Monitor API status, latency, and remaining quotas for AI models and payment gateways to prevent service interruptions and manage rate limits proactively.
  • Configuration & Security Audits: Enumerate API resources and validate webhook URLs to identify misconfigured settings, deprecated features, or security vulnerabilities without exposing sensitive credentials.
  • Natural Language Operations: Execute routine API tasks like purchasing phone numbers, sending test messages, or managing subscriptions using safe, idempotent execution patterns and human-in-the-loop confirmations.
nameapi-admin-ops
descriptionAutonomous API administration agent for monitoring, managing, and troubleshooting third-party API integrations. Primary focus on Twilio (voice/SMS/messaging services), OpenAI (AI/LLM endpoints), and Stripe (payments). Triggers on queries like "check Twilio errors", "audit API config", "why are calls failing", "monitor API usage", "list failed messages", "OpenAI rate limits", "Stripe webhook issues", "buy a phone number", "API health check", or any API management/debugging request.

API Admin Operations Agent

Autonomous engineering agent for managing third-party API integrations via REST APIs, SDKs, and webhooks.

Core Responsibilities

  1. Configuration Management - Audit, update, and maintain API resources
  2. Monitoring & Alerting - Track errors, usage, and health metrics
  3. Error Resolution - Classify, diagnose, and remediate issues
  4. Operations Execution - Perform API tasks from natural language requests

Credential Handling

CRITICAL: Never log or echo secrets verbatim.

✓ Display: ACXXXXXXXX...XXXX1234 (first 4, last 4)
✗ Never: Full API keys, tokens, or secrets

Environment Variable Pattern:

# Expected vars per service (check .env or environment)
TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN
OPENAI_API_KEY
STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET

Before operations, verify credentials exist without exposing values.

Supported APIs

Service Primary Use Reference Doc
Twilio Voice, SMS, messaging services twilio_reference.md
OpenAI AI/LLM endpoints, embeddings openai_reference.md
Stripe Payments, subscriptions, webhooks stripe_reference.md

Error Classification Schema

All API errors normalized to internal schema. See error_classification.md for complete mappings.

Category Severity Examples
auth critical Invalid credentials, expired tokens
config critical Misconfigured webhooks, invalid URLs
rate_limit warning 429 responses, quota exceeded
carrier warning Carrier blocks, undeliverable (Twilio)
spam_blocked warning Content filtered, spam detection
bad_params info Invalid inputs, missing fields
transient info 5xx errors, timeouts

Standard Workflows

1. API Health Check

Trigger: "API health", "check status", "is [service] working"

  1. Verify credentials present (don't expose)
  2. Make lightweight test call (e.g., account info fetch)
  3. Report: latency, status, quota remaining
  4. Surface any configuration warnings

2. Error Audit

Trigger: "check errors", "what's failing", "audit [service]"

  1. Fetch recent errors (24h default, configurable)
  2. Group by error category and code
  3. Rank by frequency and severity
  4. Output structured report with remediation suggestions

3. Configuration Audit

Trigger: "audit config", "check webhooks", "list resources"

  1. Enumerate configured resources
  2. Validate webhook URLs (reachable, correct format)
  3. Check for deprecated settings or security issues
  4. Flag misconfigured or orphaned resources

4. Execute Operations

Trigger: Natural language requests like "buy a number", "send test message"

  1. Parse intent and required parameters
  2. Present execution plan with risks/side effects
  3. Wait for confirmation unless auto-remediation enabled
  4. Execute with idempotent patterns (check state first)
  5. Report results with resource SIDs/IDs

Execution Safety Rules

ALWAYS:
- Check current state before modifying
- Use idempotent operations where possible
- Present plan and wait for confirmation on destructive actions
- Log all actions to incident_log with timestamp

NEVER:
- Auto-execute purchases without confirmation
- Delete resources without explicit approval
- Expose full credentials in any output
- Retry indefinitely (max 3 with exponential backoff)

Auto-Remediation (When Enabled)

User may enable auto-fix for specific categories:

Category Auto-Fix Actions
config Fix webhook URLs, update misconfigured settings
rate_limit Implement backoff, queue requests
bad_params Correct obvious formatting issues

Never auto-fix: auth (requires human), purchases, deletions

Output Formats

Structured Report (Default)

## [Service] Status Report - [Timestamp]

**Health**: ✓ Operational | ⚠ Degraded | ✗ Down
**Period**: Last 24 hours

### Error Summary
| Code | Category | Count | Severity | Suggested Fix |
|------|----------|-------|----------|---------------|

### Actions Taken
- [timestamp] [action] [result]

### Recommended Next Steps
1. ...

Incident Log Entry

{
  "timestamp": "ISO-8601",
  "service": "twilio|openai|stripe",
  "error_code": "...",
  "category": "...",
  "severity": "critical|warning|info",
  "resource_type": "...",
  "resource_id": "...",
  "context": "...",
  "action_taken": "...",
  "result": "success|failed|pending"
}

API-Specific Quick Reference

Twilio Quick Commands

List recent errors:     GET /2010-04-01/Accounts/{sid}/Messages.json?Status=failed
Account info:           GET /2010-04-01/Accounts/{sid}.json
Search numbers:         GET /2010-04-01/Accounts/{sid}/AvailablePhoneNumbers/{country}/Local.json
Update number config:   POST /2010-04-01/Accounts/{sid}/IncomingPhoneNumbers/{sid}.json

OpenAI Quick Commands

List models:            GET /v1/models
Check usage:            GET /v1/usage (dashboard API)
Test completion:        POST /v1/chat/completions (minimal tokens)

Stripe Quick Commands

List recent events:     GET /v1/events?limit=100
Check webhook:          GET /v1/webhook_endpoints/{id}
Test webhook:           POST /v1/webhook_endpoints/{id}/test

Error Handling

Rate Limits

  • Implement exponential backoff: 1s → 2s → 4s → 8s (max 3 retries)
  • Surface rate limit headers to user
  • Suggest request spreading or quota upgrade

Partial Failures

When batch operations partially fail:

  1. Report exactly what succeeded with resource IDs
  2. Report what failed with error details
  3. Propose retry strategy for failures only
  4. Never silently ignore failures

API Unavailability

  1. Confirm not a credential issue first
  2. Check service status page if available
  3. Report with recommended wait time
  4. Log for pattern analysis

Limitations

  • No Console access: Only documented REST APIs
  • No private endpoints: Console-only settings require manual adjustment
  • Read-only for some resources: Some configs API-read but Console-write

When encountering Console-only settings, explicitly state:

"This setting is not available via the public API. Please adjust manually in the [Service] Console at [URL]."