metrics

package
v0.1.0-alpha.2 Latest
Warning

This package is not in the latest version of its module.
Published: Dec 28, 2025 License: Apache-2.0 Imports: 8 Imported by: 0

README

pkg/metrics - Prometheus Metrics Infrastructure

Generic, reusable utilities for Prometheus metrics with instance-based registry support.

Overview

This package provides infrastructure for exposing Prometheus metrics via HTTP endpoints. It follows instance-based design patterns to support application reinitialization without global state pollution.

Key Design Principle: All metrics are registered with an instance-based prometheus.Registry instead of the global default registry, so a component's metrics can be garbage collected when that component is reinitialized.

Components

Server

HTTP server for exposing Prometheus metrics via /metrics endpoint.

import (
    "context"
    "log"

    "github.com/prometheus/client_golang/prometheus"
    "haptic/pkg/metrics"
)

// Create instance-based registry
registry := prometheus.NewRegistry()

// Register your metrics with the registry
counter := prometheus.NewCounter(prometheus.CounterOpts{
    Name: "my_counter",
    Help: "Example counter",
})
registry.MustRegister(counter)

// Create and start metrics server.
// Start blocks until ctx is cancelled, so run it in a goroutine
// if the caller has other work to do.
ctx := context.Background()
server := metrics.NewServer(":9090", registry)
if err := server.Start(ctx); err != nil {
    log.Fatal(err)
}

Features:

  • Instance-based (not global)
  • Graceful shutdown support
  • OpenMetrics format support
  • Basic hardening (read timeout)
  • Helpful root handler with links

Helpers

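Convenience functions for creating and registering Prometheus metrics with consistent naming. As the exposition examples in this README show, names passed to the helpers are exposed with a haptic_ prefix (for example, requests_total becomes haptic_requests_total).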

import (
    "github.com/prometheus/client_golang/prometheus"
    "haptic/pkg/metrics"
)

registry := prometheus.NewRegistry()

// Create metrics with helpers
counter := metrics.NewCounter(registry, "requests_total", "Total requests")
histogram := metrics.NewHistogram(registry, "request_duration_seconds", "Request duration")
gauge := metrics.NewGauge(registry, "active_connections", "Active connections")

// Create metrics with labels
statusCounter := metrics.NewCounterVec(
    registry,
    "http_requests_total",
    "HTTP requests by status",
    []string{"status", "method"},
)

// Use custom buckets for histograms
latencyHistogram := metrics.NewHistogramWithBuckets(
    registry,
    "api_latency_seconds",
    "API latency",
    metrics.DurationBuckets(),
)

Usage Patterns

Instance-Based Registry

Always use instance-based registries to support reinitialization:

// Good - instance-based
registry := prometheus.NewRegistry()
counter := metrics.NewCounter(registry, "operations", "Operations")

// Bad - uses global registry
counter := prometheus.NewCounter(prometheus.CounterOpts{
    Name: "operations",
})
prometheus.MustRegister(counter)  // Global registration

Lifecycle Management

Metrics are tied to application lifecycle:

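func runIteration(ctx context.Context) error {
    // Create fresh registry for this iteration
    registry := prometheus.NewRegistry()

    // Create metrics
    counter := metrics.NewCounter(registry, "events", "Events processed")
    counter.Inc() // record an event so the counter is actually used

    // Start server in a goroutine because Start blocks until ctx is cancelled
    server := metrics.NewServer(":9090", registry)
    errCh := make(chan error, 1)
    go func() { errCh <- server.Start(ctx) }()

    // When context is cancelled, server stops.
    // Registry and metrics then become eligible for garbage collection.
    select {
    case err := <-errCh:
        return err // server exited on its own, e.g. the port was in use
    case <-ctx.Done():
        return nil
    }
}

Standard Metric Naming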

Follow Prometheus naming conventions:

// Counters - suffix with _total
requests := metrics.NewCounter(registry, "requests_total", "Total requests")

// Histograms - suffix with unit
duration := metrics.NewHistogram(registry, "duration_seconds", "Request duration")

// Gauges - no suffix, current state
connections := metrics.NewGauge(registry, "active_connections", "Active connections")

// Name the subsystem first for clarity (here: deployment)
deploymentCounter := metrics.NewCounter(
    registry,
    "deployment_total",
    "Total deployments",
)
// Actual metric name: "haptic_deployment_total"
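
The package source is not shown here, so how the haptic_ prefix gets attached is an assumption. The sketch below shows one plausible way to build such a name: join the non-empty parts with underscores. fqName is a hypothetical helper, and the "controller" subsystem and "reconciliations_total" name are made up for illustration. The client library offers prometheus.BuildFQName for a similar purpose.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName is a hypothetical helper (not part of pkg/metrics) that joins
// the non-empty name parts with "_".
func fqName(namespace, subsystem, name string) string {
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// Namespace only, as in the README example.
	fmt.Println(fqName("haptic", "", "deployment_total"))
	// Namespace plus an illustrative subsystem.
	fmt.Println(fqName("haptic", "controller", "reconciliations_total"))
}
```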

Custom Bucket Sizes

Use appropriate bucket sizes for your metrics:

// Duration metrics - use DurationBuckets()
latency := metrics.NewHistogramWithBuckets(
    registry,
    "request_duration_seconds",
    "Request duration",
    metrics.DurationBuckets(),  // 10ms to 10s
)

// Size metrics - use custom buckets
size := metrics.NewHistogramWithBuckets(
    registry,
    "response_size_bytes",
    "Response size",
    []float64{100, 1000, 10000, 100000, 1000000},
)

Server Endpoints

GET /metrics

Exposes metrics in Prometheus/OpenMetrics format.

Response:

# HELP haptic_requests_total Total requests
# TYPE haptic_requests_total counter
haptic_requests_total 42

# HELP haptic_duration_seconds Request duration
# TYPE haptic_duration_seconds histogram
haptic_duration_seconds_bucket{le="0.01"} 10
haptic_duration_seconds_bucket{le="0.05"} 25
haptic_duration_seconds_bucket{le="+Inf"} 42
haptic_duration_seconds_sum 1.5
haptic_duration_seconds_count 42

GET /

Returns helpful information about available endpoints.

Testing

Unit Tests

Test metric creation and registration:

func TestMetricCreation(t *testing.T) {
    registry := prometheus.NewRegistry()

    counter := metrics.NewCounter(registry, "test_total", "Test counter")
    counter.Inc()

    // Verify value
    assert.Equal(t, 1.0, testutil.ToFloat64(counter))
}

Integration Tests

Test server functionality:

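func TestMetricsServer(t *testing.T) {
    registry := prometheus.NewRegistry()
    counter := metrics.NewCounter(registry, "test_total", "Test")
    counter.Add(42)

    // Reserve a free port, then release it for the server.
    // Server only documents Addr(), which returns the configured address,
    // so listening on ":0" would leave the real port unknown.
    ln, err := net.Listen("tcp", "127.0.0.1:0")
    require.NoError(t, err)
    addr := ln.Addr().String()
    require.NoError(t, ln.Close())

    server := metrics.NewServer(addr, registry)
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    go server.Start(ctx)
    time.Sleep(100 * time.Millisecond)

    // Fetch metrics
    resp, err := http.Get("http://" + server.Addr() + "/metrics")
    require.NoError(t, err)
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    require.NoError(t, err)
    assert.Contains(t, string(body), "haptic_test_total 42")
}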

Best Practices

DO
  • ✅ Use instance-based registries
  • ✅ Follow Prometheus naming conventions
  • ✅ Use appropriate metric types (counter, gauge, histogram)
  • ✅ Choose meaningful bucket sizes for histograms
  • ✅ Add labels for dimensions, but keep cardinality low
  • ✅ Document what each metric measures

DON'T
  • ❌ Use global prometheus.DefaultRegisterer
  • ❌ Create metrics with high cardinality labels (e.g., user IDs)
  • ❌ Re-register metrics on the same registry
  • ❌ Mix metric types (don't use counter for gauge-like values)
  • ❌ Use generic metric names (be specific about what's measured)

Prometheus Queries

Counter Metrics

Rate of events per second:

rate(haptic_requests_total[5m])

Total events in time range:

increase(haptic_requests_total[1h])

Histogram Metrics

Average latency:

rate(haptic_duration_seconds_sum[5m]) /
rate(haptic_duration_seconds_count[5m])

95th percentile latency:

histogram_quantile(0.95, rate(haptic_duration_seconds_bucket[5m]))

Gauge Metrics

Current value:

haptic_active_connections

Maximum value over time:

max_over_time(haptic_active_connections[5m])

Architecture

This package provides generic infrastructure. Domain-specific metrics should be implemented in their respective packages:

  • pkg/metrics - Generic utilities (this package)
  • pkg/controller/metrics - Controller domain metrics
  • pkg/dataplane/metrics - HAProxy integration metrics (future)

Documentation

Functions

func DurationBuckets

func DurationBuckets() []float64

DurationBuckets returns histogram buckets suitable for duration metrics in seconds.

The buckets cover a range from 10ms to 10s, which is appropriate for most API and processing durations.

Buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]

Example:

registry := prometheus.NewRegistry()
latency := metrics.NewHistogramWithBuckets(
    registry,
    "operation_duration_seconds",
    "Operation duration in seconds",
    metrics.DurationBuckets(),
)

func NewCounter

func NewCounter(registry prometheus.Registerer, name, help string) prometheus.Counter

NewCounter creates and registers a counter metric.

A counter is a cumulative metric that represents a single monotonically increasing value. Use counters for values that only increase, such as the number of requests served, tasks completed, or errors.

Parameters:

  • registry: The Prometheus registry to register with (use prometheus.NewRegistry())
  • name: Metric name (e.g., "http_requests_total")
  • help: Human-readable description of the metric

Example:

registry := prometheus.NewRegistry()
requestsTotal := metrics.NewCounter(registry, "http_requests_total", "Total HTTP requests")
requestsTotal.Inc()

func NewCounterVec

func NewCounterVec(registry prometheus.Registerer, name, help string, labels []string) *prometheus.CounterVec

NewCounterVec creates and registers a counter vector with labels.

A counter vector is a collection of counters with the same name but different label dimensions. Use counter vectors when you need to track the same counter across different categories.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name
  • help: Human-readable description
  • labels: Label names (e.g., []string{"method", "status"})

Example:

registry := prometheus.NewRegistry()
httpRequests := metrics.NewCounterVec(
    registry,
    "http_requests_total",
    "Total HTTP requests",
    []string{"method", "status"},
)
httpRequests.WithLabelValues("GET", "200").Inc()
httpRequests.WithLabelValues("POST", "201").Inc()

func NewGauge

func NewGauge(registry prometheus.Registerer, name, help string) prometheus.Gauge

NewGauge creates and registers a gauge metric.

A gauge is a metric that represents a single numerical value that can arbitrarily go up and down. Use gauges for values that can increase or decrease, such as temperature, memory usage, or number of concurrent requests.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name (e.g., "concurrent_requests")
  • help: Human-readable description of the metric

Example:

registry := prometheus.NewRegistry()
activeConnections := metrics.NewGauge(registry, "active_connections", "Number of active connections")
activeConnections.Set(42)

func NewGaugeVec

func NewGaugeVec(registry prometheus.Registerer, name, help string, labels []string) *prometheus.GaugeVec

NewGaugeVec creates and registers a gauge vector with labels.

A gauge vector is a collection of gauges with the same name but different label dimensions. Use gauge vectors when you need to track the same metric across different categories.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name
  • help: Human-readable description
  • labels: Label names (e.g., []string{"method", "status"})

Example:

registry := prometheus.NewRegistry()
queueSize := metrics.NewGaugeVec(
    registry,
    "queue_size",
    "Size of queue by type",
    []string{"queue_type"},
)
queueSize.WithLabelValues("high_priority").Set(10)
queueSize.WithLabelValues("low_priority").Set(50)

func NewHistogram

func NewHistogram(registry prometheus.Registerer, name, help string) prometheus.Histogram

NewHistogram creates and registers a histogram metric with default buckets.

A histogram samples observations (e.g., request durations or response sizes) and counts them in configurable buckets. Use histograms for measuring distributions of values.

Default buckets: [.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10]

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name (e.g., "http_request_duration_seconds")
  • help: Human-readable description of the metric

Example:

registry := prometheus.NewRegistry()
duration := metrics.NewHistogram(registry, "request_duration_seconds", "Request duration")
duration.Observe(0.5)

func NewHistogramWithBuckets

func NewHistogramWithBuckets(registry prometheus.Registerer, name, help string, buckets []float64) prometheus.Histogram

NewHistogramWithBuckets creates and registers a histogram with custom buckets.

Use this when default buckets don't match your use case. For duration metrics, consider using DurationBuckets() as a starting point.

Parameters:

  • registry: The Prometheus registry to register with
  • name: Metric name
  • help: Human-readable description
  • buckets: Bucket boundaries (e.g., []float64{0.1, 0.5, 1.0, 5.0})

Example:

registry := prometheus.NewRegistry()
duration := metrics.NewHistogramWithBuckets(
    registry,
    "api_latency_seconds",
    "API latency distribution",
    metrics.DurationBuckets(),
)
duration.Observe(0.25)

Types

type Server

type Server struct {
	// contains filtered or unexported fields
}

Server serves Prometheus metrics over HTTP.

IMPORTANT: Server is instance-based (not global). Create one per application lifecycle to ensure metrics are garbage collected when the server stops.

The server provides a /metrics endpoint for Prometheus scraping and gracefully shuts down when the context is cancelled.

func NewServer

func NewServer(addr string, registry prometheus.Gatherer) *Server

NewServer creates a new metrics server.

IMPORTANT: Pass an instance-based registry (prometheus.NewRegistry()), NOT the global prometheus.DefaultGatherer. This lets metrics be garbage collected once the server stops, which is critical for applications that reinitialize on configuration changes.

Parameters:

  • addr: TCP address to listen on (e.g., ":9090" or "localhost:9090")
  • registry: The Prometheus registry to serve (use prometheus.NewRegistry())

Example:

registry := prometheus.NewRegistry()  // Instance-based, not global!
server := metrics.NewServer(":9090", registry)
go server.Start(ctx)

func (*Server) Addr

func (s *Server) Addr() string

Addr returns the address the server is configured to listen on.

func (*Server) Start

func (s *Server) Start(ctx context.Context) error

Start starts the HTTP server and blocks until the context is cancelled.

This method should typically be run in a goroutine:

go server.Start(ctx)

The server performs graceful shutdown when the context is cancelled, waiting for active connections to complete (up to a 10-second timeout).
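
The Server source is not reproduced on this page, so the following is only a sketch of how such behavior is commonly built on net/http, not the package's actual implementation. serveUntilCancelled and fetchThenStop are hypothetical names, and the 5-second header timeout is an assumed value. Only the 10-second shutdown timeout comes from the description above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// serveUntilCancelled (hypothetical) serves handler on ln and blocks until
// ctx is cancelled, then shuts down gracefully, waiting up to 10 seconds
// for active requests to finish.
func serveUntilCancelled(ctx context.Context, ln net.Listener, handler http.Handler) error {
	srv := &http.Server{
		Handler:           handler,
		ReadHeaderTimeout: 5 * time.Second, // assumed value
	}

	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return err // Serve failed before cancellation
	case <-ctx.Done():
	}

	shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err
	}
	// Serve returns http.ErrServerClosed after a clean Shutdown.
	if err := <-errCh; err != nil && !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// fetchThenStop starts the server on a random local port, performs one GET,
// cancels the context, and returns the body once the server has stopped.
func fetchThenStop() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "up 1\n")
	})

	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan error, 1)
	go func() { done <- serveUntilCancelled(ctx, ln, mux) }()

	resp, err := http.Get("http://" + ln.Addr().String() + "/metrics")
	if err != nil {
		cancel()
		<-done
		return "", err
	}
	body, err := io.ReadAll(resp.Body)
	resp.Body.Close()
	cancel()
	if serveErr := <-done; serveErr != nil {
		return "", serveErr
	}
	return string(body), err
}

func main() {
	body, err := fetchThenStop()
	fmt.Printf("%q %v\n", body, err)
}
```

Passing in the listener, instead of an address string, lets the caller bind to port 0 and still learn the real port from ln.Addr(). That is a design choice of this sketch and differs from NewServer, which takes an address.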
