metrics

package
v0.1.0 Latest
Published: Mar 9, 2026 License: Apache-2.0 Imports: 10 Imported by: 0

README

pkg/metrics - Prometheus Metrics Infrastructure

Generic, reusable utilities for Prometheus metrics with instance-based registry support.

Overview

This package provides infrastructure for exposing Prometheus metrics via HTTP endpoints. It follows instance-based design patterns to support application reinitialization without global state pollution.

Key Design Principle: All metrics use instance-based prometheus.Registry instead of the global default registry. This allows metrics to be garbage collected when components are reinitialized.

Components

Server

HTTP server for exposing Prometheus metrics via /metrics endpoint.

import (
    "context"
    "log"

    "github.com/prometheus/client_golang/prometheus"
    "haptic/pkg/metrics"
)

// Create instance-based registry
registry := prometheus.NewRegistry()

// Register your metrics with the registry
counter := prometheus.NewCounter(prometheus.CounterOpts{
    Name: "my_counter",
    Help: "Example counter",
})
registry.MustRegister(counter)

// Create and start the metrics server.
// Start blocks until ctx is cancelled, so run it in a goroutine
// if the caller has other work to do.
ctx := context.Background()
server := metrics.NewServer(":9090", registry)
if err := server.Start(ctx); err != nil {
    log.Fatal(err)
}

Features:

  • Instance-based (not global)
  • Graceful shutdown support
  • OpenMetrics format support
  • Security hardening (read timeout)
  • Helpful root handler with links
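
The source does not show how the server wires up its routes. As a rough, stdlib-only sketch of what a root handler with a link and a read timeout could look like (newMux, newHTTPServer, stubHandler, get, the page contents, and the 5-second value are all assumptions for illustration, not this package's code):

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// indexPage is a hypothetical landing page that links to the metrics endpoint.
const indexPage = `<html><body><a href="/metrics">/metrics</a></body></html>`

// newMux serves metricsHandler at /metrics and a small index page at /.
// Hypothetical helper: the real package's routing is not shown in the source.
func newMux(metricsHandler http.Handler) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/metrics", metricsHandler)
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// "/" matches every unmatched path, so reject anything but the root.
		if r.URL.Path != "/" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		fmt.Fprint(w, indexPage)
	})
	return mux
}

// newHTTPServer sets a header read timeout so slow clients cannot hold
// connections open indefinitely. The 5s value is an assumption.
func newHTTPServer(addr string, h http.Handler) *http.Server {
	return &http.Server{Addr: addr, Handler: h, ReadHeaderTimeout: 5 * time.Second}
}

// stubHandler stands in for a real Prometheus metrics handler.
func stubHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# metrics would be written here")
	})
}

// get issues an in-memory GET against h and returns the status and body.
func get(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	srv := newHTTPServer(":9090", newMux(stubHandler()))
	code, body := get(srv.Handler, "/")
	fmt.Println(code, body)
}
```

In a real setup the stub would be replaced by a handler built from the instance-based registry.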

Helpers

Convenience functions for creating Prometheus metrics with consistent naming.

import (
    "github.com/prometheus/client_golang/prometheus"
    "haptic/pkg/metrics"
)

registry := prometheus.NewRegistry()

// Create metrics with helpers
counter := metrics.NewCounter(registry, "requests_total", "Total requests")
histogram := metrics.NewHistogram(registry, "request_duration_seconds", "Request duration")
gauge := metrics.NewGauge(registry, "active_connections", "Active connections")

// Create metrics with labels
statusCounter := metrics.NewCounterVec(
    registry,
    "http_requests_total",
    "HTTP requests by status",
    []string{"status", "method"},
)

// Use custom buckets for histograms
latencyHistogram := metrics.NewHistogramWithBuckets(
    registry,
    "api_latency_seconds",
    "API latency",
    metrics.DurationBuckets(),
)

Usage Patterns

Instance-Based Registry

Always use instance-based registries to support reinitialization:

// Good - instance-based
registry := prometheus.NewRegistry()
counter := metrics.NewCounter(registry, "operations_total", "Operations")

// Bad - uses global registry
counter := prometheus.NewCounter(prometheus.CounterOpts{
    Name: "operations_total",
    Help: "Operations",
})
prometheus.MustRegister(counter)  // Global registration

Lifecycle Management

Metrics are tied to application lifecycle:

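func runIteration(ctx context.Context) error {
    // Create a fresh registry for this iteration
    registry := prometheus.NewRegistry()

    // Create metrics and record work as it happens
    events := metrics.NewCounter(registry, "events_total", "Events processed")
    events.Inc()

    // Start the server; Start blocks, so run it in a goroutine
    server := metrics.NewServer(":9090", registry)
    errCh := make(chan error, 1)
    go func() { errCh <- server.Start(ctx) }()

    // When the context is cancelled the server stops, and the registry
    // and its metrics become eligible for garbage collection
    select {
    case err := <-errCh:
        return err // the server stopped on its own, e.g. the port was taken
    case <-ctx.Done():
        return nil
    }
}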

Standard Metric Naming

Follow Prometheus naming conventions:

// Counters - suffix with _total
requests := metrics.NewCounter(registry, "requests_total", "Total requests")

// Histograms - suffix with unit
duration := metrics.NewHistogram(registry, "duration_seconds", "Request duration")

// Gauges - no suffix, current state
connections := metrics.NewGauge(registry, "active_connections", "Active connections")

// Name the subsystem in the metric name for clarity
deploymentCounter := metrics.NewCounter(
    registry,
    "deployment_total",
    "Total deployments",
)
// Exposed metric name: "haptic_deployment_total" (the helper prepends "haptic_")
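
The naming rules above can be checked mechanically. The following is a minimal stdlib-only sketch, not part of this package: fullName and checkCounterName are hypothetical helpers, and the regular expression is the classic Prometheus metric-name character set (newer Prometheus versions may accept more).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// validName is the classic character set Prometheus accepts for metric names.
var validName = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// fullName joins a namespace and a metric name with an underscore,
// mirroring the "haptic_" prefix seen in the README output.
// Hypothetical helper, not this package's API.
func fullName(namespace, name string) string {
	if namespace == "" {
		return name
	}
	return namespace + "_" + name
}

// checkCounterName reports an error when name is not a valid metric name
// or does not follow the _total convention for counters.
func checkCounterName(name string) error {
	if !validName.MatchString(name) {
		return fmt.Errorf("invalid metric name %q", name)
	}
	if !strings.HasSuffix(name, "_total") {
		return fmt.Errorf("counter %q should end in _total", name)
	}
	return nil
}

func main() {
	name := fullName("haptic", "deployment_total")
	fmt.Println(name, checkCounterName(name))
	fmt.Println(checkCounterName("haptic_deployments"))
}
```

A check like this could run in a unit test so that naming drift is caught before a metric ships.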

Custom Bucket Sizes

Use appropriate bucket sizes for your metrics:

// Duration metrics - use DurationBuckets()
latency := metrics.NewHistogramWithBuckets(
    registry,
    "request_duration_seconds",
    "Request duration",
    metrics.DurationBuckets(),  // 10ms to 10s
)

// Size metrics - use custom buckets
size := metrics.NewHistogramWithBuckets(
    registry,
    "response_size_bytes",
    "Response size",
    []float64{100, 1000, 10000, 100000, 1000000},
)
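
The size buckets above grow by a constant factor of 10, so they can be generated rather than typed out. This sketch uses a hypothetical local exponentialBuckets function to stay dependency-free; client_golang ships a similarly shaped prometheus.ExponentialBuckets(start, factor, count) that could be used instead.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds, starting at start and
// multiplying by factor each step. Hypothetical stand-in written for
// illustration; it is not part of pkg/metrics.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, 0, count)
	for i := 0; i < count; i++ {
		buckets = append(buckets, start)
		start *= factor
	}
	return buckets
}

func main() {
	// Same bounds as the response_size_bytes example: 100 B up to 1 MB.
	fmt.Println(exponentialBuckets(100, 10, 5))
}
```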

Server Endpoints

GET /metrics

Exposes metrics in Prometheus/OpenMetrics format.

Response:

# HELP haptic_requests_total Total requests
# TYPE haptic_requests_total counter
haptic_requests_total 42

# HELP haptic_duration_seconds Request duration
# TYPE haptic_duration_seconds histogram
haptic_duration_seconds_bucket{le="0.01"} 10
haptic_duration_seconds_bucket{le="0.05"} 25
haptic_duration_seconds_bucket{le="+Inf"} 42
haptic_duration_seconds_sum 1.5
haptic_duration_seconds_count 42

GET /

Returns helpful information about available endpoints.

Testing

Unit Tests

Test metric creation and registration:

func TestMetricCreation(t *testing.T) {
    registry := prometheus.NewRegistry()

    counter := metrics.NewCounter(registry, "test_total", "Test counter")
    counter.Inc()

    // Verify value
    assert.Equal(t, 1.0, testutil.ToFloat64(counter))
}

Integration Tests

Test server functionality:

func TestMetricsServer(t *testing.T) {
    registry := prometheus.NewRegistry()
    counter := metrics.NewCounter(registry, "test_total", "Test")
    counter.Add(42)

    // Reserve a free port up front: Addr() reports the configured address,
    // so ":0" would not reveal the port that was actually bound.
    ln, err := net.Listen("tcp", "127.0.0.1:0")
    require.NoError(t, err)
    addr := ln.Addr().String()
    require.NoError(t, ln.Close())

    server := metrics.NewServer(addr, registry)
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    go server.Start(ctx)
    time.Sleep(100 * time.Millisecond) // give the listener time to come up

    // Fetch metrics
    resp, err := http.Get("http://" + addr + "/metrics")
    require.NoError(t, err)
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    require.NoError(t, err)
    assert.Contains(t, string(body), "haptic_test_total 42")
}
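
A fixed sleep before the first request can be flaky on slow machines. One alternative is to poll the endpoint until it answers. The sketch below is stdlib-only; waitReady and demo are hypothetical names, and an httptest server stands in for the metrics server.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// waitReady polls url every 10ms until it answers 200 OK or ctx expires.
func waitReady(ctx context.Context, url string) error {
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return err
		}
		if resp, err := http.DefaultClient.Do(req); err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// demo runs waitReady against a throwaway httptest server.
func demo() error {
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer ts.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	return waitReady(ctx, ts.URL+"/metrics")
}

func main() {
	fmt.Println("ready:", demo() == nil)
}
```

In the integration test, a call such as waitReady(ctx, "http://"+addr+"/metrics") would take the place of time.Sleep.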

Best Practices

DO
  • ✅ Use instance-based registries
  • ✅ Follow Prometheus naming conventions
  • ✅ Use appropriate metric types (counter, gauge, histogram)
  • ✅ Choose meaningful bucket sizes for histograms
  • ✅ Add labels for dimensions, but keep cardinality low
  • ✅ Document what each metric measures

DON'T
  • ❌ Use global prometheus.DefaultRegisterer
  • ❌ Create metrics with high cardinality labels (e.g., user IDs)
  • ❌ Re-register metrics on the same registry
  • ❌ Mix metric types (don't use counter for gauge-like values)
  • ❌ Use generic metric names (be specific about what's measured)
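
One way to keep label cardinality low is to collapse raw values into a small fixed set before using them as label values. The sketch below is illustrative only; statusClass is a hypothetical helper that maps HTTP status codes to at most six label values.

```go
package main

import "fmt"

// statusClass collapses an HTTP status code into a class such as "2xx",
// so a "status" label can take only a handful of values.
func statusClass(code int) string {
	if code >= 100 && code < 600 {
		return fmt.Sprintf("%dxx", code/100)
	}
	return "unknown"
}

func main() {
	for _, code := range []int{200, 201, 404, 503, 999} {
		fmt.Println(code, statusClass(code))
	}
}
```

With a counter vector, the result would be passed to WithLabelValues in place of the raw status code.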

Prometheus Queries

Counter Metrics

Rate of events per second:

rate(haptic_requests_total[5m])

Total events in time range:

increase(haptic_requests_total[1h])
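
To build intuition for what these two queries compute, here is a simplified Go sketch. It treats any drop in the counter as a reset to zero, which is how rate() and increase() cope with process restarts, but it ignores the extrapolation to window boundaries that real PromQL performs. The function names and sample values are made up for illustration.

```go
package main

import "fmt"

// increase sums the growth of a counter across samples, treating any drop
// as a reset to zero (for example after a process restart).
func increase(samples []float64) float64 {
	total := 0.0
	for i := 1; i < len(samples); i++ {
		delta := samples[i] - samples[i-1]
		if delta < 0 {
			// Counter reset: the new value is all growth since the restart.
			delta = samples[i]
		}
		total += delta
	}
	return total
}

// ratePerSecond divides the increase by the window length in seconds.
func ratePerSecond(samples []float64, windowSeconds float64) float64 {
	return increase(samples) / windowSeconds
}

func main() {
	samples := []float64{10, 25, 40, 5, 20} // reset between 40 and 5
	fmt.Println(increase(samples), ratePerSecond(samples, 300))
}
```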

Histogram Metrics

Average latency:

rate(haptic_duration_seconds_sum[5m]) /
rate(haptic_duration_seconds_count[5m])

95th percentile latency:

histogram_quantile(0.95, rate(haptic_duration_seconds_bucket[5m]))
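
The percentile is an estimate derived from bucket counts, not an exact value. The sketch below approximates the linear interpolation that histogram_quantile applies to classic histograms, using the bucket counts from the sample /metrics response. It assumes the buckets are sorted and that the last one is +Inf. The bucket type and quantile function are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// bucket is one cumulative histogram bucket: count of observations <= upper.
type bucket struct {
	upper float64 // the "le" label; math.Inf(1) for the +Inf bucket
	count float64 // cumulative count
}

// sample holds the bucket counts from the README's example response.
var sample = []bucket{{0.01, 10}, {0.05, 25}, {math.Inf(1), 42}}

// quantile estimates the q-quantile (0 <= q <= 1) from cumulative buckets,
// interpolating linearly inside the bucket that contains the target rank.
func quantile(q float64, buckets []bucket) float64 {
	if len(buckets) < 2 {
		return math.NaN()
	}
	total := buckets[len(buckets)-1].count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total

	// Find the first bucket whose cumulative count reaches the rank.
	i := 0
	for i < len(buckets)-1 && buckets[i].count < rank {
		i++
	}
	if i == len(buckets)-1 {
		// The rank falls in the +Inf bucket: report the highest finite bound.
		return buckets[len(buckets)-2].upper
	}

	lower, below := 0.0, 0.0
	if i > 0 {
		lower, below = buckets[i-1].upper, buckets[i-1].count
	}
	if buckets[i].count == below {
		return buckets[i].upper
	}
	return lower + (buckets[i].upper-lower)*(rank-below)/(buckets[i].count-below)
}

func main() {
	fmt.Println("p50:", quantile(0.5, sample))
	fmt.Println("p95:", quantile(0.95, sample))
	fmt.Println("avg:", 1.5/42) // _sum divided by _count
}
```

With these counts the 95th percentile lands in the +Inf bucket, so the estimate is capped at the highest finite bound. This is one reason bucket ranges should cover the expected tail.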

Gauge Metrics

Current value:

haptic_active_connections

Maximum value over time:

max_over_time(haptic_active_connections[5m])

Architecture

This package provides generic infrastructure. Domain-specific metrics should be implemented in their respective packages:

  • pkg/metrics - Generic utilities (this package)
  • pkg/controller/metrics - Controller domain metrics
  • pkg/dataplane/metrics - HAProxy integration metrics (future)

Documentation

Functions

func DeploymentDurationBuckets

func DeploymentDurationBuckets() []float64

DeploymentDurationBuckets returns histogram buckets for HAProxy deployment duration.

Deployments involve network calls to the dataplane API and may wait for HAProxy to reload its configuration. Reloads can take several seconds on busy servers, so buckets extend to 60s to capture the full tail distribution without capping.

Buckets: [0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 15.0, 30.0, 60.0]

func DurationBuckets

func DurationBuckets() []float64

DurationBuckets returns histogram buckets suitable for duration metrics in seconds.

The buckets cover a range from 10ms to 10s, which is appropriate for most API and processing durations.

Buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]

Example:

registry := prometheus.NewRegistry()
latency := metrics.NewHistogramWithBuckets(
    registry,
    "operation_duration_seconds",
    "Operation duration in seconds",
    metrics.DurationBuckets(),
)
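
To see why DeploymentDurationBuckets extends past the range of DurationBuckets, it helps to check where a slow observation lands in each set. The sketch below copies both bucket lists from the documentation. bucketFor is a hypothetical helper and the 12-second reload is an invented example value.

```go
package main

import (
	"fmt"
	"sort"
)

// Bucket lists copied from the function documentation.
var (
	durationBuckets   = []float64{0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0}
	deploymentBuckets = []float64{0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 15.0, 30.0, 60.0}
)

// bucketFor returns the "le" label of the first bucket whose upper bound is
// >= seconds, or "+Inf" when the value exceeds every finite bound.
func bucketFor(buckets []float64, seconds float64) string {
	i := sort.SearchFloat64s(buckets, seconds)
	if i == len(buckets) {
		return "+Inf"
	}
	return fmt.Sprint(buckets[i])
}

func main() {
	reload := 12.0 // a hypothetical 12s HAProxy reload
	fmt.Println("DurationBuckets:          ", bucketFor(durationBuckets, reload))
	fmt.Println("DeploymentDurationBuckets:", bucketFor(deploymentBuckets, reload))
}
```

An observation that only fits the +Inf bucket carries no information about how far past the last bound it was, so quantile estimates for it are capped at that bound.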

func NewCounter

func NewCounter(registry prometheus.Registerer, name, help string) prometheus.Counter

NewCounter creates and registers a counter metric.

A counter is a cumulative metric that represents a single monotonically increasing value. Use counters for values that only increase, such as the number of requests served, tasks completed, or errors.

Parameters:

  • registry: The Prometheus registry to register with (use prometheus.NewRegistry())
  • name: Metric name (e.g., "http_requests_total")
  • help: Human-readable description of the metric

Example:

registry := prometheus.NewRegistry()
requestsTotal := metrics.NewCounter(registry, "http_requests_total", "Total HTTP requests")
requestsTotal.Inc()

func NewCounterVec

func NewCounterVec(registry prometheus.Registerer, name, help string, labels []string) *prometheus.CounterVec

NewCounterVec creates and registers a counter vector with labels.

A counter vector is a collection of counters with the same name but different label dimensions. Use counter vectors when you need to track the same counter across different categories.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name
  • help: Human-readable description
  • labels: Label names (e.g., []string{"method", "status"})

Example:

registry := prometheus.NewRegistry()
httpRequests := metrics.NewCounterVec(
    registry,
    "http_requests_total",
    "Total HTTP requests",
    []string{"method", "status"},
)
httpRequests.WithLabelValues("GET", "200").Inc()
httpRequests.WithLabelValues("POST", "201").Inc()

func NewGauge

func NewGauge(registry prometheus.Registerer, name, help string) prometheus.Gauge

NewGauge creates and registers a gauge metric.

A gauge is a metric that represents a single numerical value that can arbitrarily go up and down. Use gauges for values that can increase or decrease, such as temperature, memory usage, or number of concurrent requests.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name (e.g., "concurrent_requests")
  • help: Human-readable description of the metric

Example:

registry := prometheus.NewRegistry()
activeConnections := metrics.NewGauge(registry, "active_connections", "Number of active connections")
activeConnections.Set(42)

func NewGaugeVec

func NewGaugeVec(registry prometheus.Registerer, name, help string, labels []string) *prometheus.GaugeVec

NewGaugeVec creates and registers a gauge vector with labels.

A gauge vector is a collection of gauges with the same name but different label dimensions. Use gauge vectors when you need to track the same metric across different categories.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name
  • help: Human-readable description
  • labels: Label names (e.g., []string{"queue_type"})

Example:

registry := prometheus.NewRegistry()
queueSize := metrics.NewGaugeVec(
    registry,
    "queue_size",
    "Size of queue by type",
    []string{"queue_type"},
)
queueSize.WithLabelValues("high_priority").Set(10)
queueSize.WithLabelValues("low_priority").Set(50)

func NewHistogram

func NewHistogram(registry prometheus.Registerer, name, help string) prometheus.Histogram

NewHistogram creates and registers a histogram metric with default buckets.

A histogram samples observations (e.g., request durations or response sizes) and counts them in configurable buckets. Use histograms for measuring distributions of values.

Default buckets: [.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10]

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name (e.g., "http_request_duration_seconds")
  • help: Human-readable description of the metric

Example:

registry := prometheus.NewRegistry()
duration := metrics.NewHistogram(registry, "request_duration_seconds", "Request duration")
duration.Observe(0.5)

func NewHistogramVec

func NewHistogramVec(registry prometheus.Registerer, name, help string, labels []string, buckets []float64) *prometheus.HistogramVec

NewHistogramVec creates and registers a histogram vector with labels and custom buckets.

A histogram vector is a collection of histograms with the same name but different label dimensions. Use histogram vectors when you need to measure distributions across different categories.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name
  • help: Human-readable description
  • labels: Label names (e.g., []string{"method", "endpoint"})
  • buckets: Bucket boundaries (e.g., DurationBuckets())

Example:

registry := prometheus.NewRegistry()
latencyByPhase := metrics.NewHistogramVec(
    registry,
    "operation_duration_seconds",
    "Operation duration by phase",
    []string{"phase"},
    metrics.DurationBuckets(),
)
latencyByPhase.WithLabelValues("init").Observe(0.5)
latencyByPhase.WithLabelValues("processing").Observe(1.2)

func NewHistogramWithBuckets

func NewHistogramWithBuckets(registry prometheus.Registerer, name, help string, buckets []float64) prometheus.Histogram

NewHistogramWithBuckets creates and registers a histogram with custom buckets.

Use this when default buckets don't match your use case. For duration metrics, consider using DurationBuckets() as a starting point.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name
  • help: Human-readable description
  • buckets: Bucket boundaries (e.g., []float64{0.1, 0.5, 1.0, 5.0})

Example:

registry := prometheus.NewRegistry()
duration := metrics.NewHistogramWithBuckets(
    registry,
    "api_latency_seconds",
    "API latency distribution",
    metrics.DurationBuckets(),
)
duration.Observe(0.25)

Types

type Server

type Server struct {
	// contains filtered or unexported fields
}

Server serves Prometheus metrics over HTTP.

IMPORTANT: Server is instance-based (not global). Create one per application lifecycle to ensure metrics are garbage collected when the server stops.

The server provides a /metrics endpoint for Prometheus scraping and gracefully shuts down when the context is cancelled.

func NewServer

func NewServer(addr string, registry prometheus.Gatherer) *Server

NewServer creates a new metrics server.

IMPORTANT: Pass an instance-based registry (prometheus.NewRegistry()), NOT the global prometheus.DefaultGatherer. This ensures metrics are garbage collected when the server stops, which is critical for applications that reinitialize on configuration changes.

Parameters:

  • addr: TCP address to listen on (e.g., ":9090" or "localhost:9090")
  • registry: The Prometheus registry to serve (use prometheus.NewRegistry())

Example:

registry := prometheus.NewRegistry()  // Instance-based, not global!
server := metrics.NewServer(":9090", registry)
go server.Start(ctx)

func (*Server) Addr

func (s *Server) Addr() string

Addr returns the address the server is configured to listen on.

func (*Server) SetRegistry

func (s *Server) SetRegistry(registry prometheus.Gatherer)

SetRegistry replaces the Prometheus registry used to serve metrics. This allows reusing the same server across controller iterations while swapping the registry (which contains iteration-specific metrics).
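
How SetRegistry is implemented is not shown here. One common way to swap what a running server serves is to put a lock-protected handler in front of it, as in this stdlib-only sketch. The swappable type and the text and scrape helpers are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// swappable serves whichever handler was most recently set.
type swappable struct {
	mu sync.RWMutex
	h  http.Handler
}

// Set replaces the handler; requests already in flight keep the old one.
func (s *swappable) Set(h http.Handler) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.h = h
}

// ServeHTTP delegates to the current handler, or answers 503 if none is set.
func (s *swappable) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	s.mu.RLock()
	h := s.h
	s.mu.RUnlock()
	if h == nil {
		http.Error(w, "no registry set", http.StatusServiceUnavailable)
		return
	}
	h.ServeHTTP(w, r)
}

// text returns a handler that writes a fixed body, standing in for a
// per-iteration metrics handler.
func text(body string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, body)
	})
}

// scrape performs an in-memory GET /metrics and returns the body.
func scrape(h http.Handler) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Body.String()
}

func main() {
	s := &swappable{}
	s.Set(text("iteration 1\n"))
	fmt.Print(scrape(s))
	s.Set(text("iteration 2\n"))
	fmt.Print(scrape(s))
}
```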

func (*Server) Start

func (s *Server) Start(ctx context.Context) error

Start starts the HTTP server and blocks until the context is cancelled.

This method should typically be run in a goroutine:

go server.Start(ctx)

The server performs graceful shutdown when the context is cancelled, waiting for active connections to complete (up to a 10-second timeout).
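
The shutdown behaviour described above can be approximated with net/http alone. This is a sketch under stated assumptions, not the package's implementation: serve and demo are hypothetical names, and only the 10-second timeout comes from the documentation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serve runs srv on ln until ctx is cancelled, then shuts it down gracefully,
// giving in-flight requests up to timeout to finish.
func serve(ctx context.Context, srv *http.Server, ln net.Listener, timeout time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return err // the server failed before ctx was cancelled
	case <-ctx.Done():
	}

	// Use a fresh context: ctx is already cancelled at this point.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err
	}
	// Serve returns http.ErrServerClosed after a clean shutdown.
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demo starts a server on a free loopback port and stops it after 50ms.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return serve(ctx, &http.Server{Handler: http.NotFoundHandler()}, ln, 10*time.Second)
}

func main() {
	fmt.Println("clean shutdown:", demo() == nil)
}
```

Treating http.ErrServerClosed as success lets callers tell a requested stop apart from a genuine failure.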
