health-checks

from majiayu000

Implement liveness, readiness, and dependency health checks

Updated Jan 5, 2026

When & Why to Use This Skill

This Claude skill provides a standardized framework for implementing robust health checks in cloud-native applications. It helps developers distinguish between liveness, readiness, and startup probes to ensure high availability, prevent cascading failures, and optimize service discovery within Kubernetes environments by correctly validating process health and external dependencies.

Use Cases

  • Case 1: Configuring Kubernetes liveness probes to automatically detect and restart unresponsive application processes.
  • Case 2: Implementing readiness checks that validate critical dependencies like databases and caches before allowing traffic to reach a service instance.
  • Case 3: Designing startup probes for legacy or resource-heavy applications to manage long initialization periods without triggering premature restarts.
  • Case 4: Standardizing health check API response formats (JSON) across microservices to improve observability and integration with SRE monitoring tools.
  • Case 5: Preventing cascading restarts and 'thundering herd' load by setting timeouts on checks, caching check results, and keeping dependency checks out of liveness probes.
name: health-checks
description: "Implement liveness, readiness, and dependency health checks"
priority: 1

Health Checks

Different checks serve different purposes. Don't conflate them.

Check Types

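| Endpoint | Purpose | On Failure | Should Check |
|---|---|---|---|
| /health/live | Process alive? | K8s restarts the container | Only process responsiveness |
| /health/ready | Can handle traffic? | K8s removes the pod from Service endpoints (LB) | DB, cache, critical deps |
| /health/startup | Init complete? | K8s holds off liveness/readiness probes; restarts the container once failureThreshold is exceeded | Initialization status |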

Liveness (Simple)

Return 200 OK immediately. Never check dependencies.

Checking DB in liveness = pod restarts when DB is down = cascading failure.
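A minimal sketch of such a liveness endpoint, using only Python's standard library. The names `liveness_response`, `HealthHandler`, and `serve` are illustrative assumptions, not part of the skill; the point is only that the handler does no I/O and never touches a dependency.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def liveness_response() -> tuple[int, bytes]:
    """Liveness does no I/O: if this code runs at all, the process is alive."""
    return 200, b'{"status": "healthy"}'


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves only the liveness path."""

    def do_GET(self) -> None:
        if self.path == "/health/live":
            status, body = liveness_response()
        else:
            status, body = 404, b'{"error": "not found"}'
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Blocking helper for local experiments (port is an assumption)."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` and requesting `/health/live` should return 200 regardless of the state of any database or cache.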

Readiness (Dependency Checks)

For each dependency:
  → Check with timeout (1-2s)
  → Record healthy/unhealthy status
Return 503 if any critical dependency is unhealthy, otherwise 200
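The steps above could be sketched with `asyncio` roughly as follows. This is one possible shape, not the skill's prescribed implementation: the `Check` type, `check_dependency`, and `readiness` are hypothetical names, each check is assumed to be an async callable that raises on failure, and every check passed in is treated as critical.

```python
import asyncio
from typing import Awaitable, Callable

# Assumption: a check is an async callable that raises on failure.
Check = Callable[[], Awaitable[None]]


async def check_dependency(check: Check, timeout_s: float = 2.0) -> str:
    """Run one check under a timeout and record healthy/unhealthy."""
    try:
        await asyncio.wait_for(check(), timeout=timeout_s)
        return "healthy"
    except asyncio.TimeoutError:
        return f"unhealthy: timed out after {timeout_s}s"
    except Exception as exc:  # any failure marks the dependency unhealthy
        return f"unhealthy: {exc}"


async def readiness(
    checks: dict[str, Check], timeout_s: float = 2.0
) -> tuple[int, dict[str, str]]:
    """Run all checks concurrently; 503 if any one is unhealthy, else 200."""
    names = list(checks)
    results = await asyncio.gather(
        *(check_dependency(checks[name], timeout_s) for name in names)
    )
    statuses = dict(zip(names, results))
    code = 200 if all(s == "healthy" for s in statuses.values()) else 503
    return code, statuses
```

Running the checks concurrently keeps the total response time close to the single-check timeout instead of the sum of all timeouts.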

Response Format

{
  "status": "healthy|unhealthy",
  "version": "1.2.3",
  "dependencies": {
    "database": "healthy",
    "cache": "unhealthy: connection refused"
  }
}
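As a rough illustration of how the payload above might be assembled, the sketch below derives the top-level `status` (one of the two values, never the literal `healthy|unhealthy`) and the HTTP code from the per-dependency results. `build_health_response` is a hypothetical helper, and it assumes every listed dependency is critical.

```python
import json


def build_health_response(
    version: str, dependencies: dict[str, str]
) -> tuple[int, str]:
    """Return (http_status, json_body) in the response format shown above."""
    # Assumption: any value other than the exact string "healthy" is a failure.
    healthy = all(state == "healthy" for state in dependencies.values())
    body = {
        "status": "healthy" if healthy else "unhealthy",
        "version": version,
        "dependencies": dependencies,
    }
    return (200 if healthy else 503), json.dumps(body, indent=2)
```

With the example dependencies from the format above (database healthy, cache refusing connections), this yields a 503 with `"status": "unhealthy"`.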

Kubernetes Config

livenessProbe:
  httpGet: { path: /health/live, port: 8080 }
  periodSeconds: 10
  failureThreshold: 3

readinessProbe:
  httpGet: { path: /health/ready, port: 8080 }
  periodSeconds: 5
  timeoutSeconds: 3  # default is 1s; must exceed the 1-2s dependency check timeout
  failureThreshold: 3

startupProbe:
  httpGet: { path: /health/startup, port: 8080 }
  periodSeconds: 5
  failureThreshold: 30  # 150s max startup
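The `150s max startup` comment comes from multiplying the probe period by the failure threshold. A small sketch of that arithmetic for all three probes follows; `failure_window_s` is a hypothetical helper, and the figures are approximations that ignore `initialDelaySeconds` and per-probe timeouts.

```python
def failure_window_s(period_seconds: int, failure_threshold: int) -> int:
    """Approximate seconds of consecutive failures before Kubernetes acts."""
    return period_seconds * failure_threshold


# Values taken from the probe config above.
liveness_window = failure_window_s(10, 3)   # 30s until the container restarts
readiness_window = failure_window_s(5, 3)   # 15s until traffic is withdrawn
startup_window = failure_window_s(5, 30)    # 150s allowed for initialization
```

If initialization can take longer than the startup window, raising `failureThreshold` on the startup probe extends it without loosening the liveness probe.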

Anti-Patterns

  • Liveness checks dependencies → Cascading restarts
  • No timeout on checks → Health endpoint hangs
  • No caching of check results → Every probe hits the DB and cache (thundering herd)
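One way to address the caching anti-pattern is a short-lived cache around each dependency check, so that frequent probes reuse a recent result instead of hitting the dependency every time. The sketch below is an assumption-laden illustration: `CachedCheck` is a hypothetical class, the 5-second TTL is an arbitrary choice, and the injectable `clock` exists only to make the behaviour easy to test.

```python
import threading
import time
from typing import Callable


class CachedCheck:
    """Wrap a check so it runs at most once per TTL window."""

    def __init__(
        self,
        check: Callable[[], str],
        ttl_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check = check
        self._ttl_s = ttl_s
        self._clock = clock
        self._lock = threading.Lock()
        self._expires_at = float("-inf")  # forces a real check on first call
        self._value = "unknown"

    def __call__(self) -> str:
        # The lock keeps concurrent probes from all refreshing at once.
        with self._lock:
            now = self._clock()
            if now >= self._expires_at:
                self._value = self._check()
                self._expires_at = now + self._ttl_s
            return self._value
```

A TTL at or slightly below the probe period keeps results fresh while capping the load each replica places on its dependencies.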

References

  • references/platforms/{platform}/health-checks.md