observability

package
v0.5.24 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 21, 2026 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation

Overview

Package observability provides structured event logging, metrics, and tracing for integrating switchAILocal with external monitoring systems.

Constants

This section is empty.

Variables

var (

	// InFlightRequests tracks the number of requests currently being processed.
	InFlightRequests = promauto.NewGauge(
		prometheus.GaugeOpts{
			Name: "switchailocal_requests_in_flight",
			Help: "Number of HTTP requests currently being processed",
		},
	)

	// ProviderHealthScore tracks real-time provider health scores.
	ProviderHealthScore = promauto.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "switchailocal_provider_health_score",
			Help: "Provider health score (0-1) based on success rate, latency, and quota",
		},
		[]string{"provider", "status"},
	)

	// RateLimitedTotal tracks the total number of rate-limited requests.
	RateLimitedTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "switchailocal_rate_limited_total",
			Help: "Total number of requests rejected by rate limiter",
		},
		[]string{"scope"},
	)

	// LoadSheddedTotal tracks the total number of load-shed requests.
	LoadSheddedTotal = promauto.NewCounter(
		prometheus.CounterOpts{
			Name: "switchailocal_load_shed_total",
			Help: "Total number of requests rejected by load shedding",
		},
	)
)

Functions

func GinMiddleware

func GinMiddleware(emitter *EventEmitter) gin.HandlerFunc

GinMiddleware creates a Gin middleware that emits a RequestEvent after every request completes.
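The "emit after the request completes" pattern can be illustrated without Gin. The following is a minimal sketch using only net/http, not this package's implementation: the observation type, statusRecorder, withObservation, and observeOnce are hypothetical names introduced here. The wrapper records the status code written by the inner handler and reports it, together with the elapsed time, once the handler has returned.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// observation is a hypothetical, reduced stand-in for a per-request record.
type observation struct {
	Path    string
	Status  int
	Latency time.Duration
}

// statusRecorder wraps a ResponseWriter to remember the status code.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

// WriteHeader stores the code before passing it to the real writer.
func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// withObservation calls sink after next has finished handling the request.
func withObservation(next http.Handler, sink func(observation)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		start := time.Now()
		// Default to 200: handlers that never call WriteHeader imply OK.
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, req)
		sink(observation{
			Path:    req.URL.Path,
			Status:  rec.status,
			Latency: time.Since(start),
		})
	})
}

// observeOnce runs one in-memory request whose handler replies with status.
func observeOnce(status int) observation {
	var got observation
	inner := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	})
	h := withObservation(inner, func(o observation) { got = o })
	req := httptest.NewRequest(http.MethodGet, "/v1/demo", nil)
	h.ServeHTTP(httptest.NewRecorder(), req)
	return got
}

func main() {
	o := observeOnce(http.StatusBadGateway)
	fmt.Println(o.Path, o.Status)
}
```

In a Gin middleware the equivalent point is the code that runs after the remaining handlers in the chain have returned.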

func MetricsMiddleware

func MetricsMiddleware(enabled bool) gin.HandlerFunc

MetricsMiddleware creates a Gin middleware that records Prometheus telemetry.
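The idea behind an in-flight gauge such as InFlightRequests can be sketched with the standard library alone. This is an illustration of the pattern, not the package's code: it swaps the Prometheus gauge for a sync/atomic counter (Go 1.19 or later) and Gin for net/http, and the names inFlight, trackInFlight, and sampleDuringRequest are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// inFlight counts requests that have started but not yet finished.
var inFlight atomic.Int64

// trackInFlight increments the counter on entry and decrements it on exit,
// including when the wrapped handler panics, because of the defer.
func trackInFlight(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		inFlight.Add(1)
		defer inFlight.Add(-1)
		next.ServeHTTP(w, r)
	})
}

// sampleDuringRequest reads the counter inside a handler and again after
// the request has completed.
func sampleDuringRequest() (during, after int64) {
	h := trackInFlight(http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		during = inFlight.Load()
	}))
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	h.ServeHTTP(httptest.NewRecorder(), req)
	return during, inFlight.Load()
}

func main() {
	during, after := sampleDuringRequest()
	fmt.Println("during:", during, "after:", after)
}
```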

func RegisterMetricsRoute

func RegisterMetricsRoute(router *gin.Engine, enabled bool, path string)

RegisterMetricsRoute attaches the Prometheus metrics endpoint (conventionally /metrics) to the provided router at the given path, if enabled in config.

func StartPprofServer

func StartPprofServer(port int)

StartPprofServer starts a pprof HTTP server on the given port. The server runs in a background goroutine that does not exit under normal operation, and it binds to localhost only for security.

Endpoints available:

  • /debug/pprof/ — index page
  • /debug/pprof/profile — CPU profile (30s default)
  • /debug/pprof/heap — heap memory profile
  • /debug/pprof/goroutine — goroutine dump
  • /debug/pprof/block — blocking profile
  • /debug/pprof/mutex — mutex contention profile
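A loopback-only pprof server exposing the endpoints listed above could be assembled from net/http/pprof roughly as follows. This is a sketch under assumptions, not the body of StartPprofServer: newPprofMux, loopbackAddr, and routePattern are hypothetical helpers, and the use of a dedicated mux (rather than http.DefaultServeMux) is a choice made for this example. The heap, goroutine, block, and mutex profiles are all served by pprof.Index under the /debug/pprof/ prefix.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/pprof"
	"strconv"
)

// newPprofMux registers the standard pprof handlers on a dedicated mux so
// they are not exposed on the application's main router.
func newPprofMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index) // index and named profiles
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile) // CPU profile
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// loopbackAddr builds a listen address reachable only from the local host.
func loopbackAddr(port int) string {
	return net.JoinHostPort("127.0.0.1", strconv.Itoa(port))
}

// routePattern reports which registered pattern would serve path.
func routePattern(mux *http.ServeMux, path string) string {
	req, err := http.NewRequest(http.MethodGet, path, nil)
	if err != nil {
		return ""
	}
	_, pattern := mux.Handler(req)
	return pattern
}

func main() {
	// Port 0 asks the OS for any free port; a real deployment would pass
	// a fixed port instead.
	ln, err := net.Listen("tcp", loopbackAddr(0))
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()

	// Serve in the background, as the documentation above describes.
	go func() { _ = http.Serve(ln, newPprofMux()) }()

	fmt.Println("pprof listening on", ln.Addr())
	fmt.Println("heap served by pattern", routePattern(newPprofMux(), "/debug/pprof/heap"))
}
```

With a server like this running on a fixed port, a CPU profile can typically be collected with `go tool pprof http://127.0.0.1:PORT/debug/pprof/profile`.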

Types

type EventEmitter

type EventEmitter struct {
	// contains filtered or unexported fields
}

EventEmitter writes structured RequestEvent records to a configured output.

func NewEventEmitter

func NewEventEmitter(enabled bool, output, filePath string) (*EventEmitter, error)

NewEventEmitter creates a new emitter based on the provided configuration.

  • output: "stdout" or "file"
  • filePath: path to the NDJSON file (only used when output is "file")

func (*EventEmitter) Close

func (e *EventEmitter) Close() error

Close shuts down the emitter and flushes any buffered output.

func (*EventEmitter) Emit

func (e *EventEmitter) Emit(event *RequestEvent)

Emit writes a single RequestEvent as a JSON line to the configured output.
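Writing one JSON object per line (NDJSON) is straightforward with encoding/json, since json.Encoder appends a newline after each value. The sketch below shows the general technique only and makes no claim about how EventEmitter is implemented: event is a hypothetical, trimmed stand-in for RequestEvent, and lineEmitter, newLineEmitter, and emitToString are names invented for this example. The mutex is an assumption that concurrent handlers may emit at the same time.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"sync"
)

// event is a hypothetical, trimmed stand-in for RequestEvent.
type event struct {
	RequestID string `json:"request_id"`
	Provider  string `json:"provider"`
	LatencyMs int64  `json:"latency_ms"`
	Success   bool   `json:"success"`
	Error     string `json:"error,omitempty"`
}

// lineEmitter serializes events to w, one JSON object per line.
type lineEmitter struct {
	mu  sync.Mutex
	enc *json.Encoder
}

func newLineEmitter(w io.Writer) *lineEmitter {
	return &lineEmitter{enc: json.NewEncoder(w)}
}

// emit writes ev as a single line. The lock keeps concurrent writers from
// interleaving partial lines.
func (e *lineEmitter) emit(ev *event) error {
	if ev == nil {
		return nil
	}
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.enc.Encode(ev) // Encode terminates each value with '\n'
}

// emitToString is a small helper that collects the output in memory.
func emitToString(evs ...*event) string {
	var buf bytes.Buffer
	em := newLineEmitter(&buf)
	for _, ev := range evs {
		_ = em.emit(ev)
	}
	return buf.String()
}

func main() {
	fmt.Print(emitToString(&event{
		RequestID: "req-1",
		Provider:  "example",
		LatencyMs: 120,
		Success:   true,
	}))
}
```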

func (*EventEmitter) IsEnabled

func (e *EventEmitter) IsEnabled() bool

IsEnabled returns whether the emitter is active.

type RequestEvent

type RequestEvent struct {
	// Timestamp of the event in ISO 8601 format.
	Timestamp string `json:"timestamp"`

	// RequestID is the unique identifier for this request.
	RequestID string `json:"request_id"`

	// RequestedModel is the model originally requested by the client.
	RequestedModel string `json:"requested_model"`

	// SelectedModel is the model actually used after auto-routing.
	SelectedModel string `json:"selected_model,omitempty"`

	// Provider is the provider that served the request.
	Provider string `json:"provider"`

	// Intent is the classified intent (e.g., "coding", "reasoning", "fast").
	Intent string `json:"intent,omitempty"`

	// Complexity is the estimated query complexity (0.0 to 1.0).
	Complexity float64 `json:"complexity,omitempty"`

	// LatencyMs is the total request duration in milliseconds.
	LatencyMs int64 `json:"latency_ms"`

	// HTTPStatus is the upstream HTTP response status code.
	HTTPStatus int `json:"http_status"`

	// Success indicates whether the request completed successfully.
	Success bool `json:"success"`

	// InputTokens is the number of tokens in the request.
	InputTokens int `json:"input_tokens,omitempty"`

	// OutputTokens is the number of tokens in the response.
	OutputTokens int `json:"output_tokens,omitempty"`

	// RQS is the Routing Quality Score for this request (0.0 to 1.0).
	RQS float64 `json:"rqs,omitempty"`

	// AutoRouted indicates whether auto-routing was used.
	AutoRouted bool `json:"auto_routed"`

	// FallbacksAttempted is the number of fallback providers tried before success.
	FallbacksAttempted int `json:"fallbacks_attempted,omitempty"`

	// Streaming indicates whether this was a streaming response.
	Streaming bool `json:"streaming,omitempty"`

	// Error contains the error message if the request failed.
	Error string `json:"error,omitempty"`
}

RequestEvent is a structured per-request telemetry record emitted as NDJSON. It captures the full lifecycle of a proxied API request, including routing decisions, provider selection, latency, token counts, and quality scores.
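Because each record is a self-contained JSON line, the emitted file can be consumed with a line scanner. The following sketch aggregates request counts, failures, and total latency per provider. It relies only on the JSON field names shown in the struct above (provider, latency_ms, success); the record and stats types and the summarize and summarizeString functions are hypothetical names for this example.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// record decodes only the fields this summary needs; other fields in the
// line are ignored by encoding/json.
type record struct {
	Provider  string `json:"provider"`
	LatencyMs int64  `json:"latency_ms"`
	Success   bool   `json:"success"`
}

// stats accumulates per-provider totals.
type stats struct {
	Requests int
	Failures int
	TotalMs  int64
}

// summarize reads NDJSON from r and groups totals by provider.
// Note: bufio.Scanner rejects lines longer than 64 KiB by default.
func summarize(r io.Reader) (map[string]stats, error) {
	out := make(map[string]stats)
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue // tolerate blank lines
		}
		var rec record
		if err := json.Unmarshal([]byte(line), &rec); err != nil {
			return nil, fmt.Errorf("decode line: %w", err)
		}
		s := out[rec.Provider]
		s.Requests++
		if !rec.Success {
			s.Failures++
		}
		s.TotalMs += rec.LatencyMs
		out[rec.Provider] = s
	}
	return out, sc.Err()
}

// summarizeString is a convenience wrapper for in-memory input.
func summarizeString(s string) (map[string]stats, error) {
	return summarize(strings.NewReader(s))
}

func main() {
	input := `{"provider":"a","latency_ms":100,"success":true}
{"provider":"a","latency_ms":300,"success":false}
{"provider":"b","latency_ms":50,"success":true}
`
	m, err := summarizeString(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, name := range []string{"a", "b"} {
		s := m[name]
		fmt.Printf("%s: requests=%d failures=%d total_ms=%d\n",
			name, s.Requests, s.Failures, s.TotalMs)
	}
}
```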
