package health

Version: v0.9.0 (latest) | Published: Feb 2, 2026 | License: Apache-2.0 | Imports: 8 | Imported by: 0

Repository

github.com/stacklok/toolhive

Documentation ¶

Overview ¶

Package health provides health monitoring for vMCP backend MCP servers.

This package implements the HealthChecker interface and provides periodic health monitoring with configurable intervals and failure thresholds.

Index ¶

func IsHealthCheck(ctx context.Context) bool
func NewHealthChecker(client vmcp.BackendClient, timeout time.Duration, ...) vmcp.HealthChecker
func WithHealthCheckMarker(ctx context.Context) context.Context
type Monitor
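- func NewMonitor(client vmcp.BackendClient, backends []vmcp.Backend, config MonitorConfig) (*Monitor, error)
- func (m *Monitor) BuildStatus() *vmcp.Status
- func (m *Monitor) GetAllBackendStates() map[string]*State
- func (m *Monitor) GetBackendState(backendID string) (*State, error)
- func (m *Monitor) GetBackendStatus(backendID string) (vmcp.BackendHealthStatus, error)
- func (m *Monitor) GetHealthSummary() Summary
- func (m *Monitor) IsBackendHealthy(backendID string) bool
- func (m *Monitor) Start(ctx context.Context) error
- func (m *Monitor) Stop() error
- func (m *Monitor) UpdateBackends(newBackends []vmcp.Backend)
- func (m *Monitor) WaitForInitialHealthChecks()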
type MonitorConfig
- func DefaultConfig() MonitorConfig
type State
type Summary
- func (s Summary) String() string

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func IsHealthCheck ¶

func IsHealthCheck(ctx context.Context) bool

IsHealthCheck returns true if the context is marked as a health check. Authentication strategies use this to bypass authentication for health checks, since health checks verify backend availability and should not require user credentials. Returns false for nil contexts.

func NewHealthChecker ¶

func NewHealthChecker(
	client vmcp.BackendClient,
	timeout time.Duration,
	degradedThreshold time.Duration,
) vmcp.HealthChecker

NewHealthChecker creates a new health checker that uses BackendClient.ListCapabilities as the health check mechanism. This validates the full MCP communication stack: network connectivity, MCP protocol compliance, authentication, and responsiveness.

Parameters:

client: BackendClient for communicating with backend MCP servers
timeout: Maximum duration for health check operations (0 = no timeout)
degradedThreshold: Response time threshold for marking backend as degraded (0 = disabled)

Returns a new HealthChecker implementation.

func WithHealthCheckMarker ¶

func WithHealthCheckMarker(ctx context.Context) context.Context

WithHealthCheckMarker marks a context as a health check request. Authentication layers can use IsHealthCheck to identify and skip authentication for health check requests.

Types ¶

type Monitor ¶

type Monitor struct {
	// contains filtered or unexported fields
}

Monitor performs periodic health checks on backend MCP servers. It runs background goroutines for each backend, tracking their health status and consecutive failure counts. The monitor supports graceful shutdown and provides thread-safe access to backend health information.

func NewMonitor ¶

func NewMonitor(
	client vmcp.BackendClient,
	backends []vmcp.Backend,
	config MonitorConfig,
) (*Monitor, error)

NewMonitor creates a new health monitor for the given backends.

Parameters:

client: BackendClient for communicating with backend MCP servers
backends: List of backends to monitor
config: Configuration for health monitoring

Returns (monitor, error). Error is returned if configuration is invalid.

func (*Monitor) BuildStatus ¶ added in v0.8.3

func (m *Monitor) BuildStatus() *vmcp.Status

BuildStatus builds a vmcp.Status from the current health monitor state. This converts backend health information into the format needed for status reporting to the Kubernetes API or CLI output.

Phase determination:

- Ready: All backends healthy, or no backends configured (cold start)
- Pending: Backends configured but no health check data yet (waiting for first check)
- Degraded: Some backends healthy, some degraded/unhealthy
- Failed: No healthy backends (and at least one backend exists)

Returns a Status instance with current health information and discovered backends.

Takes a single snapshot of backend states to ensure internal consistency under concurrent updates.
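The phase rules documented for BuildStatus can be read as a small decision function. The sketch below is one possible interpretation, not the package's code: `phase`, `determinePhase`, and its three counters are hypothetical, and the order in which the rules are evaluated (including how partially checked backends are treated) is an assumption.

```go
package main

import "fmt"

// phase is a hypothetical stand-in for the phase reported in vmcp.Status.
type phase string

const (
	phaseReady    phase = "Ready"
	phasePending  phase = "Pending"
	phaseDegraded phase = "Degraded"
	phaseFailed   phase = "Failed"
)

// determinePhase applies the documented rules in an assumed order.
// total is the number of configured backends, checked is how many have
// health check data, and healthy is how many are currently healthy.
func determinePhase(total, checked, healthy int) phase {
	switch {
	case total == 0:
		return phaseReady // no backends configured (cold start)
	case checked == 0:
		return phasePending // waiting for the first check
	case healthy == total:
		return phaseReady
	case healthy == 0:
		return phaseFailed
	default:
		return phaseDegraded
	}
}

func main() {
	fmt.Println(determinePhase(0, 0, 0)) // Ready
	fmt.Println(determinePhase(3, 0, 0)) // Pending
	fmt.Println(determinePhase(3, 3, 3)) // Ready
	fmt.Println(determinePhase(3, 3, 1)) // Degraded
	fmt.Println(determinePhase(3, 3, 0)) // Failed
}
```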

func (*Monitor) GetAllBackendStates ¶

func (m *Monitor) GetAllBackendStates() map[string]*State

GetAllBackendStates returns health states for all monitored backends. Returns a map of backend ID to State.

func (*Monitor) GetBackendState ¶

func (m *Monitor) GetBackendState(backendID string) (*State, error)

GetBackendState returns the full health state for a backend. Returns (state, error). Error is returned if the backend is not being monitored.

func (*Monitor) GetBackendStatus ¶

func (m *Monitor) GetBackendStatus(backendID string) (vmcp.BackendHealthStatus, error)

GetBackendStatus returns the current health status for a backend. Returns (status, error). Error is returned if the backend is not being monitored.

func (*Monitor) GetHealthSummary ¶

func (m *Monitor) GetHealthSummary() Summary

GetHealthSummary returns a summary of backend health for logging/monitoring. The returned Summary holds the total number of backends along with per-status counts: healthy, degraded, unhealthy, unknown, and unauthenticated.

func (*Monitor) IsBackendHealthy ¶

func (m *Monitor) IsBackendHealthy(backendID string) bool

IsBackendHealthy returns true if the backend is currently healthy. Returns false if the backend is not being monitored or is unhealthy.

func (*Monitor) Start ¶

func (m *Monitor) Start(ctx context.Context) error

Start begins health monitoring for all backends. This spawns a background goroutine for each backend that performs periodic health checks. Returns an error if the monitor is already started, has been stopped, or if the parent context is invalid.

The monitor respects the parent context for cancellation. When the parent context is cancelled, all health check goroutines will stop gracefully.

Note: A monitor cannot be restarted after it has been stopped. Create a new monitor instead.

func (*Monitor) Stop ¶

func (m *Monitor) Stop() error

Stop gracefully stops health monitoring. This cancels all health check goroutines and waits for them to complete. Returns an error if the monitor was not started.

After stopping, the monitor cannot be restarted. Create a new monitor if needed.

func (*Monitor) UpdateBackends ¶ added in v0.8.3

func (m *Monitor) UpdateBackends(newBackends []vmcp.Backend)

UpdateBackends updates the list of backends being monitored. Starts monitoring new backends and stops monitoring removed backends. This method is safe to call while the monitor is running.
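Reconciling a running monitor with a new backend list comes down to a set difference: which backends to start monitoring and which to stop. The sketch below shows that computation on plain string IDs; `diffBackends` is a hypothetical helper and says nothing about how UpdateBackends is implemented internally.

```go
package main

import (
	"fmt"
	"sort"
)

// diffBackends compares the currently monitored IDs with the desired IDs
// and returns the IDs to start monitoring and the IDs to stop monitoring.
func diffBackends(current, next []string) (added, removed []string) {
	cur := make(map[string]struct{}, len(current))
	for _, id := range current {
		cur[id] = struct{}{}
	}
	nxt := make(map[string]struct{}, len(next))
	for _, id := range next {
		if _, seen := nxt[id]; seen {
			continue // ignore duplicate IDs in the new list
		}
		nxt[id] = struct{}{}
		if _, ok := cur[id]; !ok {
			added = append(added, id)
		}
	}
	for id := range cur {
		if _, ok := nxt[id]; !ok {
			removed = append(removed, id)
		}
	}
	// Sort so the result does not depend on map iteration order.
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	added, removed := diffBackends([]string{"a", "b", "c"}, []string{"b", "c", "d"})
	fmt.Println(added, removed) // [d] [a]
}
```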

func (*Monitor) WaitForInitialHealthChecks ¶ added in v0.8.3

func (m *Monitor) WaitForInitialHealthChecks()

WaitForInitialHealthChecks blocks until all backends have completed their initial health check. This is useful for ensuring that health status is accurate before relying on it (e.g., before reporting initial status to an external system).

If the monitor was not started, this returns immediately (no initial checks to wait for). This method is safe to call multiple times and from multiple goroutines.
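One way to obtain this "block until every backend has been checked once" behavior is a WaitGroup combined with a per-backend sync.Once. The sketch below illustrates that technique only; `initialGate` and its methods are hypothetical and are not taken from the package.

```go
package main

import (
	"fmt"
	"sync"
)

// initialGate is a hypothetical helper that lets callers block until every
// backend has reported its first completed health check.
type initialGate struct {
	wg   sync.WaitGroup
	once map[string]*sync.Once // read-only after construction
}

func newInitialGate(ids []string) *initialGate {
	g := &initialGate{once: make(map[string]*sync.Once, len(ids))}
	for _, id := range ids {
		g.once[id] = new(sync.Once)
		g.wg.Add(1)
	}
	return g
}

// markChecked records the first completed check for id. Later calls for
// the same id, and calls for unknown ids, are no-ops.
func (g *initialGate) markChecked(id string) {
	if o, ok := g.once[id]; ok {
		o.Do(g.wg.Done)
	}
}

// wait blocks until all backends are marked. With zero backends it
// returns immediately. It may be called repeatedly and concurrently.
func (g *initialGate) wait() { g.wg.Wait() }

func main() {
	ids := []string{"a", "b"}
	g := newInitialGate(ids)
	for _, id := range ids {
		go func(id string) {
			g.markChecked(id)
			g.markChecked(id) // repeat checks do not double-count
		}(id)
	}
	g.wait()
	g.wait() // calling again is safe
	fmt.Println("initial checks done")
}
```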

type MonitorConfig ¶

type MonitorConfig struct {
	// CheckInterval is how often to perform health checks.
	// Must be > 0. Recommended: 30s.
	CheckInterval time.Duration

	// UnhealthyThreshold is the number of consecutive failures before marking unhealthy.
	// Must be >= 1. Recommended: 3 failures.
	UnhealthyThreshold int

	// Timeout is the maximum duration for a single health check operation.
	// Zero means no timeout (not recommended).
	Timeout time.Duration

	// DegradedThreshold is the response time threshold for marking a backend as degraded.
	// If a health check succeeds but takes longer than this duration, the backend is marked degraded.
	// Zero means disabled (backends will never be marked degraded based on response time alone).
	// Recommended: 5s.
	DegradedThreshold time.Duration
}

MonitorConfig contains configuration for the health monitor.

func DefaultConfig ¶

func DefaultConfig() MonitorConfig

DefaultConfig returns sensible default configuration values.

type State ¶

type State struct {
	// Status is the current health status.
	Status vmcp.BackendHealthStatus

	// ConsecutiveFailures is the number of consecutive failed health checks.
	ConsecutiveFailures int

	// LastCheckTime is when the last health check was performed.
	LastCheckTime time.Time

	// LastError is the last error encountered (if any).
	LastError error

	// LastTransitionTime is when the status last changed.
	LastTransitionTime time.Time
}

State is an immutable snapshot of a backend's health state. This is returned by GetBackendState and GetAllBackendStates to provide thread-safe access to health information without holding locks.

type Summary ¶

type Summary struct {
	Total           int
	Healthy         int
	Degraded        int
	Unhealthy       int
	Unknown         int
	Unauthenticated int
}

Summary provides aggregate health statistics for all backends.

func (Summary) String ¶

func (s Summary) String() string

String returns a human-readable summary.
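Building such an aggregate amounts to tallying backends by status and formatting the counts. The sketch below uses a local copy of the Summary fields; `summary`, `summarize`, the string status names, and the output format of `String` are all assumptions, since the documentation does not specify them.

```go
package main

import "fmt"

// summary is a local copy of the documented Summary fields.
type summary struct {
	Total           int
	Healthy         int
	Degraded        int
	Unhealthy       int
	Unknown         int
	Unauthenticated int
}

// summarize tallies statuses; any unrecognized status counts as unknown.
func summarize(statuses []string) summary {
	s := summary{Total: len(statuses)}
	for _, st := range statuses {
		switch st {
		case "healthy":
			s.Healthy++
		case "degraded":
			s.Degraded++
		case "unhealthy":
			s.Unhealthy++
		case "unauthenticated":
			s.Unauthenticated++
		default:
			s.Unknown++
		}
	}
	return s
}

// String uses an invented key=value format for illustration.
func (s summary) String() string {
	return fmt.Sprintf("total=%d healthy=%d degraded=%d unhealthy=%d unknown=%d unauthenticated=%d",
		s.Total, s.Healthy, s.Degraded, s.Unhealthy, s.Unknown, s.Unauthenticated)
}

func main() {
	fmt.Println(summarize([]string{"healthy", "healthy", "degraded", "unknown"}))
	// total=4 healthy=2 degraded=1 unhealthy=0 unknown=1 unauthenticated=0
}
```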
