Java Microservices - Health Check API



Introduction

In a microservices architecture, we have to make sure each service instance can handle requests. An instance may be up (healthy), or it may be down for reasons that are not immediately known. Without detection, an unhealthy instance can keep receiving traffic, degrade performance, or fail unpredictably. This is where the Health Check API pattern comes in: a dedicated HTTP endpoint (e.g., GET /health) that actively verifies service viability. Infrastructure (load balancers, orchestrators) and monitoring tools use it to identify healthy instances and to take action when an instance is not healthy.

Why You Need a Health Check API

Traffic Control

Load balancers and service registries rely on health status to stop routing to unhealthy instances.

Automated Monitoring & Alerts

Monitoring tools and orchestrators poll health-check endpoints to trigger alerts or start replacement containers when services fail.

Deployment Safety

Health-checks guard against premature traffic to newly deployed instances that haven't fully initialized.

Anatomy of a Health Check API

Endpoint URL

Common patterns−

  • /health − general status

  • /health/live or /healthz − liveness (is the process alive?)

  • /health/ready − readiness (can it serve requests?)

  • /health/started − startup (fully initialized) (tutorialspoint.com, openliberty.io)

HTTP Method & Status Codes

  • Use GET

  • 200 OK if healthy; 503 Service Unavailable (or 500) if unhealthy

  • Avoid caching− include headers like Cache-Control: no-cache
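
The rules above can be sketched with the HttpServer that ships with the JDK (com.sun.net.httpserver). This is only an illustration: the class name HealthEndpoint and the injected BooleanSupplier are assumptions, and a real service would more likely rely on its framework's health support.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.BooleanSupplier;

// Hypothetical helper class, not part of any framework.
class HealthEndpoint {

    // 200 OK when healthy, 503 Service Unavailable otherwise.
    static int statusCodeFor(boolean healthy) {
        return healthy ? 200 : 503;
    }

    // Minimal JSON body that mirrors the status code.
    static String bodyFor(boolean healthy) {
        return "{\"status\":\"" + (healthy ? "UP" : "DOWN") + "\"}";
    }

    // Starts a server that answers on /health; passing port 0 picks a free port.
    static HttpServer start(int port, BooleanSupplier healthy) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/health", exchange -> handle(exchange, healthy.getAsBoolean()));
        server.start();
        return server;
    }

    private static void handle(HttpExchange exchange, boolean healthy) throws IOException {
        try {
            if (!"GET".equals(exchange.getRequestMethod())) {
                exchange.getResponseHeaders().set("Allow", "GET");
                exchange.sendResponseHeaders(405, -1); // -1 means no response body
                return;
            }
            byte[] body = bodyFor(healthy).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.getResponseHeaders().set("Cache-Control", "no-cache");
            exchange.sendResponseHeaders(statusCodeFor(healthy), body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        } finally {
            exchange.close();
        }
    }
}
```

For example, after HealthEndpoint.start(8080, () -> true), a GET request to http://localhost:8080/health should return 200 with the body {"status":"UP"}.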

Payload Structure

A lightweight JSON response listing each check and its result

Example

{
   "status": "UP",
   "checks": [
      { "name": "db", "status": "UP", "responseTimeMs": 34 },
      { "name": "cache", "status": "DOWN", "error": "ConnectionTimeout" }
   ]
}
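
A payload of this shape could be produced as sketched below. The HealthPayload class and its Check record are hypothetical names, and the JSON is assembled by hand only to keep the example dependency-free. The sketch assumes that any failing check marks the whole service DOWN, which is stricter than the sample above (it reports UP overall while the cache check is DOWN); a real service might treat some checks as non-critical.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that renders check results as a JSON health payload.
class HealthPayload {

    // One entry of the "checks" array; responseTimeMs and error may be null.
    record Check(String name, boolean up, Long responseTimeMs, String error) {
        String toJson() {
            StringBuilder sb = new StringBuilder();
            sb.append("{\"name\":\"").append(escape(name)).append("\"");
            sb.append(",\"status\":\"").append(up ? "UP" : "DOWN").append("\"");
            if (responseTimeMs != null) {
                sb.append(",\"responseTimeMs\":").append(responseTimeMs);
            }
            if (error != null) {
                sb.append(",\"error\":\"").append(escape(error)).append("\"");
            }
            return sb.append("}").toString();
        }
    }

    // Assumption: a single failing check makes the overall status DOWN.
    static String overallStatus(List<Check> checks) {
        return checks.stream().allMatch(Check::up) ? "UP" : "DOWN";
    }

    static String toJson(List<Check> checks) {
        String items = checks.stream().map(Check::toJson).collect(Collectors.joining(","));
        return "{\"status\":\"" + overallStatus(checks) + "\",\"checks\":[" + items + "]}";
    }

    // Escapes only backslashes and double quotes; enough for simple names and messages.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

In production code a JSON library would normally replace the manual string building.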

What to Check

Divide checks into −

Process Health

  • Is the service running?

  • Is the event loop or thread pool responsive?

Resource Health

  • Disk space, CPU, memory, thread availability.

Dependencies

  • Databases, caches, messaging systems, external APIs.

  • Ping downstream services or open DB connections.

Application Logic

  • Basic app-level operations, e.g., can a user log in, is the configuration valid?

Best practice− Keep individual checks fast and non-blocking.
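
One way to keep a check from blocking the health endpoint is to run it asynchronously and give up after a deadline. The sketch below assumes a hypothetical TimedCheck helper and treats both a timeout and an exception as a failed check.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper: bounds how long a single health check may take.
class TimedCheck {

    // Runs the check on the common pool and waits at most `timeout` for it.
    // A check that overruns the timeout or throws is reported as failed.
    static boolean run(BooleanSupplier check, Duration timeout) {
        return CompletableFuture.supplyAsync(check::getAsBoolean)
                .completeOnTimeout(false, timeout.toMillis(), TimeUnit.MILLISECONDS)
                .exceptionally(ex -> false)
                .join();
    }
}
```

Note that a timed-out check is not cancelled here; it keeps running in the background, so the check itself should also use timeouts on its own I/O.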

Types of Health Checks

Liveness

  • Simple− is the service process alive?

  • Used by Kubernetes to restart containers that are still running but no longer responsive (for example, deadlocked).

Readiness

  • Can the service respond to traffic?

  • Checks dependency availability, connection pools, and app readiness.

  • Prevents routing to incompletely initialized services.

Startup

  • Determines when the service is fully initialized.

  • Holds off liveness and readiness checks during boot, so a slow-starting service is not restarted prematurely.

Composite

  • Aggregate liveness and readiness for simplified monitoring.
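
The four types can be modelled in a few lines. In this sketch the ProbeState class and the injected dependenciesUp check are hypothetical; the point is only how the answers differ from each other.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical holder for the state behind the different health check types.
class ProbeState {
    private final AtomicBoolean started = new AtomicBoolean(false);
    private final BooleanSupplier dependenciesUp; // e.g. database or cache checks

    ProbeState(BooleanSupplier dependenciesUp) {
        this.dependenciesUp = dependenciesUp;
    }

    // Call once initialization (config loading, cache warm-up, ...) has finished.
    void markStarted() {
        started.set(true);
    }

    // Startup: has initialization completed?
    boolean startup() {
        return started.get();
    }

    // Liveness: deliberately trivial; if this method can answer, the process is alive.
    boolean liveness() {
        return true;
    }

    // Readiness: initialized and dependencies reachable.
    boolean readiness() {
        return started.get() && dependenciesUp.getAsBoolean();
    }

    // Composite: one aggregated answer for simplified monitoring.
    boolean composite() {
        return liveness() && readiness();
    }
}
```

Keeping liveness independent of dependencies means a database outage makes the instance not ready, but does not cause it to be restarted.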

Implementation Strategies

Frameworks & Tooling

  • Spring Boot Actuator (/actuator/health)

  • MicroProfile Health for Java− /health, /health/live, /health/ready

  • Open Liberty built-in health support

Custom Implementation

  • Set up REST endpoints; run each check with a timeout and return aggregated JSON with a matching HTTP status code

  • Use circuit breakers or caching for expensive dependency checks.
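
Caching an expensive dependency check might look like the sketch below. CachedCheck is a hypothetical wrapper; the time-to-live should stay short (a few seconds) so the reported state remains close to real time, and the clock is injectable only to make the class easy to test.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Hypothetical wrapper: re-runs an expensive check at most once per TTL.
class CachedCheck implements BooleanSupplier {
    private final BooleanSupplier delegate;
    private final long ttlNanos;
    private final LongSupplier nanoClock;
    private boolean hasValue;
    private boolean lastResult;
    private long lastRunNanos;

    CachedCheck(BooleanSupplier delegate, Duration ttl) {
        this(delegate, ttl, System::nanoTime);
    }

    CachedCheck(BooleanSupplier delegate, Duration ttl, LongSupplier nanoClock) {
        this.delegate = delegate;
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    @Override
    public synchronized boolean getAsBoolean() {
        long now = nanoClock.getAsLong();
        // Run the real check on first use or once the cached result has expired.
        if (!hasValue || now - lastRunNanos >= ttlNanos) {
            lastResult = delegate.getAsBoolean();
            lastRunNanos = now;
            hasValue = true;
        }
        return lastResult;
    }
}
```

With this in place, frequent probes from several load balancers translate into at most one real dependency call per TTL window.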

Integration with Infrastructure

  • Register the startup, liveness, and readiness URLs with Kubernetes, AWS ALB, Consul, or Istio

  • Configure polling intervals and thresholds
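
Thresholds usually mean that a single failed probe does not flip the state; only a run of consecutive results does. The sketch below shows that logic in isolation. ThresholdTracker is a hypothetical class, and its two parameters are named after the failureThreshold and successThreshold settings of Kubernetes probes.

```java
// Hypothetical tracker: flips state only after N consecutive contradicting probe results.
class ThresholdTracker {
    private final int failureThreshold;
    private final int successThreshold;
    private boolean healthy = true; // assumption: instances start out healthy
    private int streak = 0;         // consecutive results that contradict the current state

    ThresholdTracker(int failureThreshold, int successThreshold) {
        if (failureThreshold < 1 || successThreshold < 1) {
            throw new IllegalArgumentException("thresholds must be at least 1");
        }
        this.failureThreshold = failureThreshold;
        this.successThreshold = successThreshold;
    }

    // Records one probe result and returns the (possibly updated) state.
    boolean record(boolean probeSucceeded) {
        if (probeSucceeded == healthy) {
            streak = 0; // result agrees with the current state
        } else {
            streak++;
            int needed = healthy ? failureThreshold : successThreshold;
            if (streak >= needed) {
                healthy = !healthy;
                streak = 0;
            }
        }
        return healthy;
    }
}
```

Together with the polling interval, the failure threshold bounds detection time: with a 10-second interval and a threshold of 3, an outage is acted on after roughly 30 seconds.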

Best Practices

Keep It Lean

  • Avoid overly broad, slow checks

  • Load balancers need quick binary decisions.

Automate & Monitor

  • Poll health endpoints frequently (e.g. every 30 seconds)

  • Set alerts on app status or check failure
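
Periodic polling can be sketched with a ScheduledExecutorService. HealthPoller is a hypothetical class; the probe is passed in as a BooleanSupplier so the example stays independent of any HTTP client, and a probe that throws is counted as a failure.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical poller: runs a probe at a fixed interval and reports each result.
class HealthPoller implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "health-poller");
                t.setDaemon(true); // do not keep the JVM alive just for polling
                return t;
            });

    // The interval must be at least one millisecond.
    void start(BooleanSupplier probe, Duration interval, Consumer<Boolean> onResult) {
        scheduler.scheduleAtFixedRate(() -> {
            boolean healthy;
            try {
                healthy = probe.getAsBoolean();
            } catch (RuntimeException e) {
                healthy = false; // a probe that throws counts as a failed check
            }
            onResult.accept(healthy);
        }, 0, interval.toMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A call such as poller.start(probe, Duration.ofSeconds(30), handler) matches the interval suggested above. If the handler itself throws, scheduleAtFixedRate stops further runs, so the handler should catch its own exceptions.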

Pitfalls to Avoid

  • Confusing It with Ping − A simple ping says nothing about deeper dependencies.

  • Heavy Checks in Liveness − Slow liveness checks delay failure detection and can trigger unnecessary restarts when they time out.

  • Caching Responses − Responses cached by clients or proxies go stale; health endpoints must reflect current state.

  • Missing Timeouts − A health endpoint shouldn't hang on slow resources; bound every check with a timeout.

  • Unprotected Endpoints − Health responses can expose system details, so restrict access to them.

  • Unnamed Checks− Use descriptive names and timestamps in responses.

  • Polling Too Infrequently− Hourly checks may miss rapid failures.

Code Samples

Spring Boot + Actuator

In your Spring Boot application, add the following dependency to the pom.xml file −

<dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-actuator</artifactId>
   <version>3.5.3</version>
</dependency>

In your application.yml, add the following snippet −

management:
  endpoints:
    web:
      exposure:
        include: health,info
  health:
    db:
      enabled: true

After running the application, go to http://localhost:8080/actuator/health to see the health status of the application. The page at http://localhost:8080/actuator lists the exposed endpoints.

Infrastructure Integration

Kubernetes

  • livenessProbe on /health/live restarts unresponsive containers

  • readinessProbe on /health/ready gates traffic until the instance is healthy

Cloud Load Balancers & Service Meshes

  • Use health endpoints for routing decisions

API Gateways (e.g. APISIX)

  • Perform active and passive health checks.

Monitoring & Alerting

  • Tools like Prometheus can probe health endpoints (for example, through the Blackbox Exporter)

  • Send alerts on status changes
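
Alerting on status changes, rather than on every poll, can be sketched as follows. StatusChangeNotifier is a hypothetical class; the callback stands in for whatever alerting channel is actually used.

```java
import java.util.function.BiConsumer;

// Hypothetical notifier: invokes a callback only when the observed status changes.
class StatusChangeNotifier {
    private final BiConsumer<String, String> onChange; // receives (previous, current)
    private String last; // null until the first observation

    StatusChangeNotifier(BiConsumer<String, String> onChange) {
        this.onChange = onChange;
    }

    // Feed every polled status in; repeated identical statuses are ignored.
    synchronized void observe(String status) {
        String previous = last;
        last = status;
        if (previous != null && !previous.equals(status)) {
            onChange.accept(previous, status);
        }
    }
}
```

Observing UP, UP, DOWN, DOWN, UP in sequence triggers the callback twice: once for UP to DOWN and once for DOWN to UP.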

Real World Patterns

Banking Scenario

The login, transfer, and billing microservices each expose health checks. If the transfer service fails, traffic is routed away from it, alerts fire, and automatic recovery kicks in.

Container Ecosystem

Two-tier health-check strategy−

  • Liveness probe = fast ping

  • Readiness probe = full dependency checks.

Health Check in Observability

The Health Check API is part of a broader observability stack−

  • Logs

  • Distributed tracing

  • Metrics

  • Exception tracking

Ideally, health endpoints feed into dashboards, triggers, and alert systems to detect anomalies early.

When Health Check Isn't Enough

If your system relies on caching, message queues, bulk operations, or multi-step transactions, deeper observability is needed, such as distributed tracing, APM, and golden-path tests. Health checks still remain a crucial first line of defense.

Summary

  • Health Check API provides real-time insight into service availability.

  • Supports traffic routing, orchestration, and alerting.

  • Separate liveness/readiness/startup endpoints.

  • Ensure lightweight, fast, secure, and well-logged checks.

  • Avoid caching, overloading, and slow feedback.

  • Combine with broader observability tools for maximum resilience.

The Health Check API may appear simple, but it's foundational. Load balancers, orchestrators, and alerting platforms all depend on it to keep a microservice ecosystem resilient with little manual intervention. When done right, it significantly improves reliability and maintainability.
