API

basic

API Health Check Endpoints

Dedicated endpoints that report the operational status of an API and its dependencies for use by load balancers, orchestrators, and monitoring systems.

Also known as: Health Endpoint, Readiness Probe, Liveness Probe

Description

Health check endpoints provide a standardized way for infrastructure components (load balancers, container orchestrators, monitoring systems) to determine whether an API instance is operational and ready to serve traffic. Kubernetes distinguishes several probe types, two of which map directly onto health endpoints: liveness probes (is the process alive and not deadlocked?) and readiness probes (is the service ready to accept traffic, including its dependencies?). A failed liveness probe triggers a container restart; a failed readiness probe removes the instance from the service's endpoint pool until the probe passes again.
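The split between the two probes can be sketched with Python's standard library alone. This is a minimal illustration, not a production server: the paths /health and /health/ready, the READY flag, and the names probe_response, ProbeHandler, and serve are assumptions chosen for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple

# Hypothetical readiness flag; a real service would derive it from dependency checks.
READY = {"value": False}


def probe_response(path: str, ready: bool) -> Tuple[int, dict]:
    """Map a probe path to an (HTTP status, JSON body) pair."""
    if path == "/health":
        # Liveness: the process can answer at all; no dependencies are consulted.
        return 200, {"status": "ok"}
    if path == "/health/ready":
        # Readiness: a 503 signals that traffic should not be routed here yet.
        if ready:
            return 200, {"status": "healthy"}
        return 503, {"status": "unhealthy"}
    return 404, {"error": "not found"}


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = probe_response(self.path, READY["value"])
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args) -> None:
        # Suppress per-request log lines for probe traffic.
        pass


def serve(port: int = 8080) -> None:
    """Start the probe server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Calling serve() starts the listener. Until READY["value"] is set to True, /health answers 200 while /health/ready answers 503, which is the state an orchestrator would see during startup.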

A basic health endpoint (GET /health or GET /healthz) returns 200 with a simple status response. A more sophisticated implementation, a deep health check, also checks dependencies: database connectivity, Redis availability, external API reachability, disk space, and memory usage. A deep health check should return per-dependency status: { status: 'degraded', checks: { database: { status: 'healthy', latency_ms: 5 }, redis: { status: 'degraded', latency_ms: 150 }, disk: { status: 'healthy', free_gb: 42.5 } } }. The overall status is 'healthy' only if all checks pass, 'degraded' if only non-critical checks fail, and 'unhealthy' if any critical check fails; in this example, the slow Redis check is what pulls the overall status down to 'degraded'.
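The aggregation rule can be written as a small pure function. The sketch below makes assumptions the text does not spell out: each check reports one of 'healthy', 'degraded', or 'unhealthy'; a check counts as failed only when it is 'unhealthy'; and any other non-healthy result lowers the overall status to 'degraded'. The names overall_status and http_status_for are hypothetical.

```python
from typing import Dict, Set


def overall_status(checks: Dict[str, str], critical: Set[str]) -> str:
    """Combine per-dependency statuses into one overall status.

    `checks` maps a dependency name to 'healthy', 'degraded', or 'unhealthy'.
    `critical` names the dependencies whose failure makes the instance unusable.
    """
    # Assumption: only an 'unhealthy' critical dependency makes the whole instance unhealthy.
    if any(checks.get(name) == "unhealthy" for name in critical):
        return "unhealthy"
    # Anything else that is not fully healthy lowers the result to 'degraded'.
    if any(status != "healthy" for status in checks.values()):
        return "degraded"
    return "healthy"


def http_status_for(status: str) -> int:
    """Return 503 only for 'unhealthy', so a degraded instance keeps receiving traffic."""
    return 503 if status == "unhealthy" else 200


example = {"database": "healthy", "redis": "degraded", "disk": "healthy"}
print(overall_status(example, critical={"database"}))  # degraded
```

Keeping the rule in a pure function makes it easy to unit-test separately from the network calls that produce the per-check statuses.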

Health check endpoints should not require authentication (they're called by infrastructure components that don't have API credentials), should be lightweight (not triggering expensive operations), and should have their own rate limiting to prevent abuse. The deep health check should include timeouts for each dependency check (e.g., 2 seconds) to avoid the health check itself hanging when a dependency is unresponsive. Caching health check results for a few seconds prevents thundering herd problems when many monitors poll simultaneously.
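Per-check timeouts and short-lived caching can be combined in a few lines of asyncio. This is a sketch under stated assumptions: each check is a hypothetical async callable that raises on failure, every check is treated as critical for simplicity, and the names run_check and CachedHealth are invented for the example. The defaults of 2 seconds and 5 seconds follow the values mentioned in this entry.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Optional, Tuple

# A check is an async callable that returns normally on success and raises on failure.
Check = Callable[[], Awaitable[None]]


async def run_check(check: Check, timeout_s: float) -> dict:
    """Run one dependency check, bounded by a timeout, and record its latency."""
    started = time.monotonic()
    try:
        await asyncio.wait_for(check(), timeout=timeout_s)
        status = "healthy"
    except Exception:
        # A timeout and an error raised by the check both count as a failure.
        status = "unhealthy"
    latency_ms = round((time.monotonic() - started) * 1000)
    return {"status": status, "latency_ms": latency_ms}


class CachedHealth:
    """Runs all checks concurrently and reuses the result for `ttl_s` seconds."""

    def __init__(self, checks: Dict[str, Check], timeout_s: float = 2.0, ttl_s: float = 5.0):
        self._checks = checks
        self._timeout_s = timeout_s
        self._ttl_s = ttl_s
        self._cached: Optional[Tuple[float, dict]] = None
        self._lock = asyncio.Lock()

    async def report(self) -> dict:
        # The lock lets one caller refresh while concurrent callers wait for its result.
        async with self._lock:
            now = time.monotonic()
            if self._cached is not None and now - self._cached[0] < self._ttl_s:
                return self._cached[1]
            names = list(self._checks)
            results = await asyncio.gather(
                *(run_check(self._checks[name], self._timeout_s) for name in names)
            )
            all_ok = all(result["status"] == "healthy" for result in results)
            report = {
                "status": "healthy" if all_ok else "unhealthy",
                "checks": dict(zip(names, results)),
            }
            self._cached = (time.monotonic(), report)
            return report
```

Because the checks run concurrently, a full report takes roughly as long as the slowest check, capped at the timeout, rather than the sum of all checks. Creating the CachedHealth instance inside the running event loop avoids lock and loop mismatches on older Python versions.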

Prompt Snippet

Implement GET /health (shallow, for Kubernetes liveness probe) returning 200 { status: 'ok' } with no dependency checks, and GET /health/ready (deep, for readiness probe) checking PostgreSQL (SELECT 1), Redis (PING), and any critical external services with per-check 2-second timeouts. Return 200 { status: 'healthy', checks: { postgres: { status, latency_ms }, redis: { status, latency_ms } } } when all critical checks pass, 503 with the same structure when any critical check fails. Cache deep health results for 5 seconds. Exclude both endpoints from authentication middleware and request logging to reduce noise.

Related Terms

API

advanced

Circuit Breaker Pattern

A resilience pattern that prevents cascading failures by temporarily stopping requests to a failing downstream service after a threshold of errors is reached.

API

advanced

API Gateway Pattern

A single entry point that sits in front of backend services to handle cross-cutting concerns like authentication, rate limiting, routing, and request transformation.

API

basic

Request/Response Logging

Structured logging of API request metadata, response details, and timing information for debugging, monitoring, and audit compliance.