# Integration Examples

Real-world examples of integrating eval-framework with other plugins and workflows.

## Example 1: Brand Voice Copywriter

A copywriting plugin that uses eval-framework for quality assurance.

### Workflow

1. User requests marketing copy
2. Copywriter agent generates content
3. Copywriter invokes judge with brand-voice rubric
4. If score < threshold, copywriter revises and re-evaluates
5. Final content returned with evaluation score

### Implementation

In the copywriter agent:

```markdown
After generating copy, evaluate against the brand-voice rubric:

1. Check if `.claude/evals/brand-voice.yaml` exists
2. If it exists, invoke the eval-framework judge agent:
   - Rubric: brand-voice
   - Content: [generated copy]
3. If score >= 75%, present copy to user
4. If score < 75%, revise based on violations and re-evaluate
5. Include evaluation score in response
```
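
The evaluate-and-revise loop in steps 3 to 5 can be sketched in Python. This is a minimal sketch, not the plugin's implementation: `judge` and `revise` are hypothetical callables standing in for the judge agent and the copywriter's revision step, `Evaluation` is an assumed result shape, and the `max_rounds` cap is an added assumption so the loop always terminates.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Evaluation:
    """Hypothetical judge result: a 0-100 score plus violation messages."""
    score: float
    violations: list[str] = field(default_factory=list)


def generate_with_review(
    draft: str,
    judge: Callable[[str], Evaluation],
    revise: Callable[[str, list[str]], str],
    threshold: float = 75.0,
    max_rounds: int = 3,  # assumption: the source does not cap revisions
) -> tuple[str, Evaluation]:
    """Evaluate the draft, revising until it meets the threshold or rounds run out."""
    evaluation = judge(draft)
    for _ in range(max_rounds):
        if evaluation.score >= threshold:
            break
        # Feed the violations back so the revision targets what failed.
        draft = revise(draft, evaluation.violations)
        evaluation = judge(draft)
    return draft, evaluation
```

Returning the last `Evaluation` alongside the copy lets the caller include the score in the response, as step 5 requires, even when the threshold was never reached.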

## Example 2: Code Review Plugin

A code review plugin that combines multiple rubrics.

### Workflow

1. User requests code review
2. Plugin identifies relevant rubrics based on file types
3. Runs each rubric and aggregates results
4. Presents unified review with all findings

### Rubric Selection

```markdown
Based on files being reviewed:
- `*.tsx` → Run: code-security, react-patterns
- `**/api/**` → Run: code-security, api-design
- `*.test.*` → Run: test-quality
- `*.md` → Run: docs-quality

Aggregate all results into unified review.
```
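
One way to turn the mapping above into code is a small lookup table matched with the standard library's `fnmatch`. This is a sketch under assumptions: `select_rubrics` is a hypothetical helper, and the plugin may resolve globs differently.

```python
from fnmatch import fnmatchcase

# Pattern-to-rubric table copied from the mapping above.
RUBRICS_BY_PATTERN = {
    "*.tsx": ["code-security", "react-patterns"],
    "**/api/**": ["code-security", "api-design"],
    "*.test.*": ["test-quality"],
    "*.md": ["docs-quality"],
}


def select_rubrics(paths: list[str]) -> list[str]:
    """Return the rubrics to run, deduplicated, in first-match order."""
    selected: list[str] = []
    for path in paths:
        for pattern, rubrics in RUBRICS_BY_PATTERN.items():
            if fnmatchcase(path, pattern):
                for rubric in rubrics:
                    if rubric not in selected:
                        selected.append(rubric)
    return selected
```

Deduplicating means `code-security` runs once even when both a `.tsx` file and an API route are under review. Note that `fnmatch` lets `*` match across `/`, so `**` behaves like `*` here and a top-level `api/` directory (with no leading path segment) would not match `**/api/**`; a full glob library may behave differently.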

## Example 3: Documentation Generator

A docs plugin that validates generated documentation.

### Quality Gates

```yaml
# docs-quality.yaml
criteria:
  completeness:
    weight: 30
    checks:
      - type: presence
        pattern: "^## (Overview|Usage|API|Examples)"
        message: "Include standard sections"

  accuracy:
    weight: 40
    checks:
      - type: custom
        prompt: "Do the code examples match the actual API signatures?"
        message: "Update examples to match current API"

  clarity:
    weight: 30
    checks:
      - type: custom
        prompt: "Is this documentation clear enough for a new developer?"
        message: "Simplify explanations for newcomers"
```
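
To see how the `presence` pattern behaves, the sketch below applies it with Python's `re` module. It assumes the judge evaluates `^` per line (the equivalent of `re.MULTILINE`), which the source does not confirm; `has_standard_section` and `missing_sections` are hypothetical helpers.

```python
import re

# Same pattern as the completeness check; MULTILINE lets ^ match at each line start.
PRESENCE = re.compile(r"^## (Overview|Usage|API|Examples)", re.MULTILINE)


def has_standard_section(doc: str) -> bool:
    """The presence check as written: satisfied by any one matching heading."""
    return PRESENCE.search(doc) is not None


def missing_sections(
    doc: str, required: tuple[str, ...] = ("Overview", "Usage", "API", "Examples")
) -> list[str]:
    """Stricter variant: list every standard section the document lacks."""
    found = {match.group(1) for match in PRESENCE.finditer(doc)}
    return [name for name in required if name not in found]
```

Because the pattern is an alternation, a document with only a `## Usage` heading already satisfies the check. If every standard section should be required, one option is a separate `presence` check per heading.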

## Example 4: Commit Message Validator

Validate commit messages against team standards.

### Rubric

```yaml
# commit-message.yaml
name: commit-message
scope:
  type: content

criteria:
  format:
    weight: 40
    threshold: 90
    checks:
      - type: pattern
        pattern: "^(feat|fix|docs|style|refactor|test|chore)(\\(.+\\))?!?: .{1,50}"
        message: "Use conventional commit format: type(scope): subject"
      - type: absence
        pattern: "^.{51,}"
        message: "Subject line should be 50 chars or less"

  body:
    weight: 30
    threshold: 70
    checks:
      - type: custom
        prompt: "Does this commit message explain WHY the change was made, not just WHAT?"
        message: "Explain the motivation for this change"

  references:
    weight: 30
    threshold: 60
    checks:
      - type: custom
        prompt: "Does this commit reference relevant issues or tickets?"
        message: "Reference related issues (#123)"
```
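
The two regex checks in the `format` criterion can be tried locally before committing the rubric. The sketch below is an illustration only: it assumes both patterns are tested against the subject line (the first line of the message), and `check_subject` is a hypothetical helper, not part of eval-framework.

```python
import re

# Patterns copied from the format criterion (YAML escaping removed).
FORMAT = re.compile(r"^(feat|fix|docs|style|refactor|test|chore)(\(.+\))?!?: .{1,50}")
TOO_LONG = re.compile(r"^.{51,}")


def check_subject(message: str) -> list[str]:
    """Return the violation messages triggered by the commit subject line."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    violations = []
    if not FORMAT.search(subject):  # type: pattern must match
        violations.append("Use conventional commit format: type(scope): subject")
    if TOO_LONG.search(subject):  # type: absence must not match
        violations.append("Subject line should be 50 chars or less")
    return violations
```

For example, `check_subject("feat(auth): add token refresh")` returns an empty list, while `check_subject("added stuff")` returns only the conventional-format message. The first pattern has no end anchor, so the length limit is enforced by the `absence` check alone, and that limit counts the `type(scope):` prefix as part of the 50 characters.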

## Example 5: API Design Review

Validate API endpoints against REST conventions.

### Rubric

```yaml
# api-design.yaml
name: api-design
scope:
  type: both
  file_patterns:
    - "app/routes/api/**/*.ts"

criteria:
  rest-conventions:
    weight: 35
    checks:
      - type: custom
        prompt: "Does this API follow REST conventions? GET for reads, POST for creates, PUT/PATCH for updates, DELETE for deletes?"
        message: "Use appropriate HTTP methods for the operation"
      - type: absence
        pattern: "router\\.(get|post).*\\/(create|delete|update)"
        message: "Don't use verbs in URLs - use HTTP methods instead"

  error-responses:
    weight: 35
    checks:
      - type: custom
        prompt: "Does this API return appropriate error status codes (400 for bad input, 401 for auth, 404 for not found, 500 for server errors)?"
        message: "Use correct HTTP status codes"
      - type: presence
        pattern: "status:\\s*(4|5)\\d{2}"
        message: "Include error status codes in error responses"

  response-format:
    weight: 30
    checks:
      - type: custom
        prompt: "Are API responses consistent with a standard envelope format?"
        message: "Use consistent response format across endpoints"
```
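
For actionable feedback, an `absence` violation is most useful when it points at the offending line. The sketch below scans route source line by line with the verb-in-URL pattern from the rubric; `find_verb_urls` is a hypothetical helper, and the source does not say whether the judge reports line numbers.

```python
import re

# Pattern copied from the rest-conventions criterion (YAML escaping removed).
VERB_IN_URL = re.compile(r"router\.(get|post).*\/(create|delete|update)")


def find_verb_urls(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each route with a verb in its URL."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if VERB_IN_URL.search(line)
    ]
```

The pattern only inspects `router.get` and `router.post` registrations, so a route such as `router.delete("/users/:id", ...)` is not flagged.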

## Invoking Judge Programmatically

Plugins built on eval-framework can invoke the judge from an agent or from a command.

### From an Agent

```markdown
To evaluate content:

1. Read the rubric from `.claude/evals/[name].yaml`
2. Parse criteria, weights, and thresholds
3. For each criterion:
   - Apply pattern-based checks (regex matching)
   - Apply custom checks (LLM evaluation)
   - Calculate criterion score (0-100)
4. Calculate weighted overall score
5. Determine pass/fail based on min_score and required_criteria
6. Return structured results

Use the judge agent's evaluation process and output format.
```
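
Steps 4 and 5 reduce to a weighted mean followed by a gate. The Python sketch below shows one plausible reading. `CriterionResult` is a hypothetical shape for the output of step 3, and the interpretation of `required_criteria` (each listed criterion must meet its own threshold) is an assumption, not something the source specifies.

```python
from dataclasses import dataclass


@dataclass
class CriterionResult:
    """Hypothetical per-criterion outcome produced by step 3."""
    name: str
    weight: float
    score: float            # 0-100
    threshold: float = 0.0  # per-criterion minimum; 0 when the rubric sets none


def overall_score(results: list[CriterionResult]) -> float:
    """Step 4: weighted mean of the criterion scores."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        raise ValueError("rubric has no weighted criteria")
    return sum(r.weight * r.score for r in results) / total_weight


def passes(
    results: list[CriterionResult],
    min_score: float,
    required_criteria: tuple[str, ...] = (),
) -> bool:
    """Step 5: overall score reaches min_score and required criteria meet their thresholds."""
    by_name = {r.name: r for r in results}
    required_ok = all(
        by_name[name].score >= by_name[name].threshold for name in required_criteria
    )
    return overall_score(results) >= min_score and required_ok
```

Dividing by the total weight keeps the overall score on the 0-100 scale even if a rubric's weights do not sum to 100.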

### From a Command

```markdown
Invoke the judge in one of three ways:
1. Use the Task tool to spawn the judge agent
2. Run the /eval-run command
3. Read the rubric and implement the evaluation logic inline
```

## Multi-Rubric Evaluation

When content is evaluated against multiple rubrics, run each rubric independently and then pick an aggregation strategy:

```markdown
## Aggregate Evaluation

Run each rubric independently:
- Rubric A: 85/100 PASS
- Rubric B: 72/100 FAIL
- Rubric C: 90/100 PASS

Aggregate options:
1. **All must pass**: Overall FAIL (B failed)
2. **Average score**: (85 + 72 + 90) / 3 = 82.3 (equal weights; for a weighted average, scale each score by its rubric's weight)
3. **Minimum score**: 72 (worst score)

Report each rubric result separately for actionable feedback.
```
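
The aggregation options above can be computed together so the caller picks the policy. This is a minimal sketch with a hypothetical `aggregate` helper; it uses an equal-weight mean, matching the arithmetic in the example.

```python
from statistics import fmean


def aggregate(results: dict[str, tuple[float, bool]]) -> dict[str, object]:
    """Summarize per-rubric (score, passed) pairs under each aggregation option."""
    scores = [score for score, _ in results.values()]
    return {
        "all_passed": all(passed for _, passed in results.values()),
        "mean_score": round(fmean(scores), 1),  # equal-weight average
        "min_score": min(scores),
        # Keep failing rubric names so each can be reported separately.
        "failed": [name for name, (_, passed) in results.items() if not passed],
    }


summary = aggregate({
    "Rubric A": (85, True),
    "Rubric B": (72, False),
    "Rubric C": (90, True),
})
```

With the three results from the example, `summary["all_passed"]` is `False`, `summary["mean_score"]` is `82.3`, `summary["min_score"]` is `72`, and `summary["failed"]` is `["Rubric B"]`.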
