langfuse-extraction

from Danik911

Extracts traces, observations, and metrics from Langfuse Cloud (EU) API for debugging, telemetry analysis, and regulatory audit trails. Generates ALCOA+ compliant reports, exports to pandas DataFrame, and supports time-range/user/session filtering. Use when investigating production issues, generating compliance documentation, or analyzing LLM costs and performance. MUST BE USED for pharmaceutical audit trail generation requiring GAMP-5 traceability.

Updated Dec 19, 2025

When & Why to Use This Skill

The Langfuse Extraction skill retrieves traces, observations, and performance metrics from the Langfuse Cloud (EU) API. It supports LLM debugging, telemetry analysis, and the generation of audit trails for regulatory review. With ALCOA+ reporting and GAMP-5 traceability, it is aimed at developers and compliance officers who need to monitor LLM workflows, analyze token costs, and keep rigorous documentation for production environments.

Use Cases

  • Production Issue Investigation: Quickly isolate and debug workflow failures or performance bottlenecks by extracting detailed trace spans and latency data.
  • Regulatory Audit Reporting: Generate ALCOA+ compliant audit trails and GAMP-5 validation reports, specifically tailored for pharmaceutical and healthcare industry requirements.
  • LLM Cost & Usage Monitoring: Analyze token consumption and calculated costs across specific users or sessions to optimize resource allocation and budget management.
  • Data Analysis & Research: Export observability data directly into pandas DataFrames for custom statistical analysis, performance benchmarking, and long-term trend visualization.
name: langfuse-extraction
description: Extracts traces, observations, and metrics from Langfuse Cloud (EU) API for debugging, telemetry analysis, and regulatory audit trails. Generates ALCOA+ compliant reports, exports to pandas DataFrame, and supports time-range/user/session filtering. Use when investigating production issues, generating compliance documentation, or analyzing LLM costs and performance. MUST BE USED for pharmaceutical audit trail generation requiring GAMP-5 traceability.
allowed-tools: ["Bash", "Read", "Write", "Grep"]

Langfuse Extraction Skill

Purpose: Extract observability data from Langfuse Cloud API for analysis, debugging, and compliance reporting.


When to Use This Skill

Use when:

  • Investigating production workflow failures or performance issues
  • Generating ALCOA+ compliant audit trails for regulatory review
  • Analyzing LLM token usage and costs across sessions
  • Exporting trace data to pandas for statistical analysis
  • Creating compliance reports for GAMP-5 validation
  • Debugging specific user sessions or workflows

Do NOT use when:

  • Adding instrumentation to code (use langfuse-integration skill)
  • Interacting with Langfuse dashboard UI (use langfuse-dashboard skill)

Prerequisites

  1. Langfuse API Keys configured in environment
  2. langfuse Python package installed
  3. Traces already exist in Langfuse Cloud from instrumented workflows
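
A quick preflight check can confirm the first prerequisite before any extraction runs. The sketch below assumes the SDK's conventional environment variable names, `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY`; the helper name `check_langfuse_env` is hypothetical and not part of this skill's scripts. It raises on a missing value rather than substituting a default.

```python
import os

# Variable names the Langfuse SDK conventionally reads; adjust if your setup differs.
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")


def check_langfuse_env(env=None):
    """Return the required settings, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Langfuse settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `check_langfuse_env()` with no argument inspects the real process environment; passing a dictionary makes the check easy to exercise in isolation.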

Workflow Phases

Phase 1: Extract Recent Traces (Time-Range Query)

Use case: Get the last 24 hours of traces for monitoring or debugging.

# scripts/extract_traces.py --hours 24 --output recent_traces.json

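from langfuse import Langfuse
from datetime import datetime, timedelta, timezone
import json

# Credentials are read from the environment (see Prerequisites)
langfuse = Langfuse()

# Timezone-aware UTC start time, passed as a datetime rather than a string
from_time = datetime.now(timezone.utc) - timedelta(hours=24)
traces = langfuse.api.trace.list(
    from_timestamp=from_time,
    tags=["pharmaceutical", "gamp5"],
    limit=100
)

# Export to JSON; default=str converts datetime values, which json cannot encode
with open("recent_traces.json", "w") as f:
    json.dump([{
        "trace_id": t.id,
        "timestamp": t.timestamp,
        "user_id": t.user_id,
        "session_id": t.session_id,
        # Assumes the SDK reports trace latency in seconds
        "duration_ms": t.latency * 1000 if t.latency is not None else None,
        # NOTE: `status` is an assumed attribute; verify it against api-reference.md
        "status": t.status
    } for t in traces.data], f, indent=2, default=str)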
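
A single call returns at most one page of results, so a busy 24-hour window may need paging. The following is a minimal sketch, assuming the list endpoint accepts a 1-based `page` argument and returns `data` alongside `meta.total_pages` (confirm against api-reference.md). `iter_all_traces` and `fetch_page` are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Iterator


def iter_all_traces(fetch_page: Callable[[int], Any]) -> Iterator[Any]:
    """Yield every item across pages; fetch_page(page) returns one page of results."""
    page = 1
    while True:
        result = fetch_page(page)
        yield from result.data
        # Stop once the last page reported by the API has been consumed
        if page >= result.meta.total_pages:
            break
        page += 1
```

Under those assumptions, usage would look like `iter_all_traces(lambda p: langfuse.api.trace.list(from_timestamp=from_time, page=p, limit=100))`. Taking the page fetcher as a callable keeps the loop independent of the client, so it can be checked with a stub.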

Phase 2: Extract Detailed Observations (Span Analysis)

Use case: Investigate a specific trace with all of its span details.

# scripts/extract_traces.py --trace-id <id> --detailed

# Fetching a single trace returns its observations in full
trace = langfuse.api.trace.get("trace_id_here")

observations = []
for obs in trace.observations:
    observations.append({
        "id": obs.id,
        "type": obs.type,  # "SPAN", "GENERATION", "EVENT"
        "name": obs.name,
        # Assumes the SDK reports latency in seconds; converted to milliseconds here
        "latency_ms": obs.latency * 1000 if obs.latency is not None else None,
        "input_tokens": obs.usage.input if obs.usage else 0,
        "output_tokens": obs.usage.output if obs.usage else 0,
        "cost": obs.calculated_total_cost or 0.0,
        "metadata": obs.metadata
    })
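
Once the observation dictionaries are collected, a short summary often answers the debugging question directly. The sketch below is illustrative only: `summarize_observations` is a hypothetical helper that works on the dictionary keys built in Phase 2 and does not call the Langfuse API.

```python
def summarize_observations(observations):
    """Aggregate token, cost, and latency figures from the Phase 2 dictionaries."""
    if not observations:
        raise ValueError("no observations to summarize")
    # Ignore observations without a recorded latency when looking for the slowest one
    timed = [o for o in observations if o["latency_ms"] is not None]
    slowest = max(timed, key=lambda o: o["latency_ms"]) if timed else None
    return {
        "count": len(observations),
        "input_tokens": sum(o["input_tokens"] for o in observations),
        "output_tokens": sum(o["output_tokens"] for o in observations),
        "total_cost": sum(o["cost"] for o in observations),
        "slowest": slowest["name"] if slowest else None,
    }
```

The `slowest` entry names the observation with the highest latency, which is usually the first place to look when investigating a performance bottleneck.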

Phase 3: Generate ALCOA+ Audit Trail

Use case: Regulatory compliance reporting.

# scripts/generate_audit_trail.py --user-id <clerk_id> --session-id <job_id>

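from typing import Optional

def generate_audit_trail(user_id: str, session_id: Optional[str] = None):
    traces = langfuse.api.trace.list(
        user_id=user_id,
        session_id=session_id
    )

    audit_trail = []
    for summary in traces.data:
        # List results carry observation IDs only; fetch the full trace for span details
        trace = langfuse.api.trace.get(summary.id)
        metadata = trace.metadata or {}
        audit_entry = {
            "timestamp": trace.timestamp,
            "user_id": trace.user_id,
            "session_id": trace.session_id,
            "trace_id": trace.id,
            "compliance": {
                "attributable": bool(trace.user_id),
                # Hard-coded: assumes traces are recorded at the time of the operation
                "contemporaneous": True,
                # NOTE: `status` is an assumed attribute; verify it against api-reference.md
                "complete": trace.status == "COMPLETED",
                "gamp5_category": metadata.get("compliance.gamp5.category")
            },
            "operations": [
                # Assumes the SDK reports latency in seconds
                {"name": obs.name,
                 "duration_ms": obs.latency * 1000 if obs.latency is not None else None}
                for obs in trace.observations
            ]
        }
        audit_trail.append(audit_entry)

    return audit_trail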
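
Before an audit trail goes to a reviewer, it helps to list the entries that fail any of the boolean compliance checks. This is a sketch under the assumption that entries have the shape produced by `generate_audit_trail` above; `find_noncompliant` is a hypothetical helper, and passing it does not by itself establish ALCOA+ compliance.

```python
def find_noncompliant(audit_trail):
    """Return (trace_id, failed_checks) pairs for entries that fail a boolean check."""
    checks = ("attributable", "contemporaneous", "complete")
    failures = []
    for entry in audit_trail:
        failed = [name for name in checks if not entry["compliance"][name]]
        if failed:
            failures.append((entry["trace_id"], failed))
    return failures
```

An empty result means every entry passed the three checks; a non-empty one identifies which traces need follow-up and why.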

Phase 4: Export to Pandas DataFrame

Use case: Statistical analysis, cost tracking, performance metrics.

# scripts/export_to_dataframe.py --output traces.csv

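import pandas as pd

traces = langfuse.api.trace.list(limit=1000)

records = []
for trace in traces.data:
    metadata = trace.metadata or {}  # metadata may be absent on some traces
    records.append({
        "trace_id": trace.id,
        "timestamp": trace.timestamp,
        # Assumes the SDK reports trace latency in seconds
        "duration_ms": trace.latency * 1000 if trace.latency is not None else None,
        "user_id": trace.user_id,
        "session_id": trace.session_id,
        "total_cost": trace.total_cost or 0.0,
        # NOTE: trace-level `usage` and `status` are assumed attributes; verify them
        # against api-reference.md (token counts may need summing from observations)
        "input_tokens": trace.usage.input if trace.usage else 0,
        "output_tokens": trace.usage.output if trace.usage else 0,
        "status": trace.status,
        "gamp5_category": metadata.get("compliance.gamp5.category")
    })

# One row per trace, written without the pandas index column
df = pd.DataFrame(records)
df.to_csv("traces.csv", index=False)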
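
With the traces in a DataFrame, per-user cost tracking reduces to a group-and-sum. A minimal sketch, assuming the column names from the export above; `cost_by_user` is a hypothetical helper. Note that pandas drops rows with a missing `user_id` from the grouping by default.

```python
import pandas as pd


def cost_by_user(df: pd.DataFrame) -> pd.DataFrame:
    """Sum cost and token counts per user, most expensive user first."""
    columns = ["total_cost", "input_tokens", "output_tokens"]
    return (
        df.groupby("user_id", as_index=False)[columns]
        .sum()
        .sort_values("total_cost", ascending=False)
        .reset_index(drop=True)
    )
```

Grouping by `session_id` instead gives the same breakdown per session.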

Success Criteria

  • ✅ API keys configured and tested
  • ✅ Traces extracted with all required fields
  • ✅ ALCOA+ audit trail includes user/session attribution
  • ✅ DataFrame export includes token usage and costs
  • ✅ No FALLBACK LOGIC (errors propagate with diagnostics)
  • ✅ Compliance metadata preserved in exports
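
The no-fallback criterion can be sketched as a thin wrapper that adds context to a failure and re-raises it instead of returning an empty or default result. `ExtractionError` and `run_extraction` are hypothetical names for illustration; they are not part of this skill's scripts or of the Langfuse SDK.

```python
class ExtractionError(RuntimeError):
    """Raised when an extraction step fails; the message carries the query context."""


def run_extraction(step_name, fetch, **params):
    """Call fetch(**params); on failure, re-raise with diagnostics rather than a default."""
    try:
        return fetch(**params)
    except Exception as exc:
        raise ExtractionError(
            f"{step_name} failed with params {params!r}: {exc}"
        ) from exc
```

Chaining with `from exc` keeps the original traceback attached, so the underlying API error stays visible in the diagnostics.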

Reference Materials

  • api-reference.md: Complete Langfuse API documentation
  • audit-trail-formats.md: ALCOA+/GAMP-5 compliant output formats
  • query_templates.json: Common API query patterns

Skill Version: 1.0.0
Last Updated: 2025-01-17
API Version: Langfuse REST API v1
EU Data Residency: cloud.langfuse.com