Documentation
Overview ¶
Package observability provides structured event logging, metrics, and tracing for integrating switchAILocal with external monitoring systems.
Constants ¶
This section is empty.
Variables ¶
var (
	// InFlightRequests tracks the number of requests currently being processed.
	InFlightRequests = promauto.NewGauge(
		prometheus.GaugeOpts{
			Name: "switchailocal_requests_in_flight",
			Help: "Number of HTTP requests currently being processed",
		},
	)

	// ProviderHealthScore tracks real-time provider health scores.
	ProviderHealthScore = promauto.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "switchailocal_provider_health_score",
			Help: "Provider health score (0-1) based on success rate, latency, and quota",
		},
		[]string{"provider", "status"},
	)

	// RateLimitedTotal tracks the total number of rate-limited requests.
	RateLimitedTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "switchailocal_rate_limited_total",
			Help: "Total number of requests rejected by rate limiter",
		},
		[]string{"scope"},
	)

	// LoadSheddedTotal tracks the total number of load-shed requests.
	LoadSheddedTotal = promauto.NewCounter(
		prometheus.CounterOpts{
			Name: "switchailocal_load_shed_total",
			Help: "Total number of requests rejected by load shedding",
		},
	)
)
Functions ¶
func GinMiddleware ¶
func GinMiddleware(emitter *EventEmitter) gin.HandlerFunc
GinMiddleware creates a Gin middleware that emits a RequestEvent after every request completes.
func MetricsMiddleware ¶
func MetricsMiddleware(enabled bool) gin.HandlerFunc
MetricsMiddleware creates a Gin middleware that records Prometheus telemetry.
func RegisterMetricsRoute ¶
RegisterMetricsRoute attaches the /metrics endpoint to the provided router if enabled in config.
func StartPprofServer ¶
func StartPprofServer(port int)
StartPprofServer starts a pprof HTTP server on the given port. The server runs in a background goroutine that does not exit under normal operation. It is bound to localhost only, for security.
Endpoints available:
- /debug/pprof/ — index page
- /debug/pprof/profile — CPU profile (30s default)
- /debug/pprof/heap — heap memory profile
- /debug/pprof/goroutine — goroutine dump
- /debug/pprof/block — blocking profile
- /debug/pprof/mutex — mutex contention profile
Types ¶
type EventEmitter ¶
type EventEmitter struct {
// contains filtered or unexported fields
}
EventEmitter writes structured RequestEvent records to a configured output.
func NewEventEmitter ¶
func NewEventEmitter(enabled bool, output, filePath string) (*EventEmitter, error)
NewEventEmitter creates a new emitter based on the provided configuration.
- output: "stdout" or "file"
- filePath: path to the NDJSON file (only used when output is "file")
func (*EventEmitter) Close ¶
func (e *EventEmitter) Close() error
Close shuts down the emitter and flushes any buffered output.
func (*EventEmitter) Emit ¶
func (e *EventEmitter) Emit(event *RequestEvent)
Emit writes a single RequestEvent as a JSON line to the configured output.
func (*EventEmitter) IsEnabled ¶
func (e *EventEmitter) IsEnabled() bool
IsEnabled returns whether the emitter is active.
type RequestEvent ¶
type RequestEvent struct {
// Timestamp of the event in ISO 8601 format.
Timestamp string `json:"timestamp"`
// RequestID is the unique identifier for this request.
RequestID string `json:"request_id"`
// RequestedModel is the model originally requested by the client.
RequestedModel string `json:"requested_model"`
// SelectedModel is the model actually used after auto-routing.
SelectedModel string `json:"selected_model,omitempty"`
// Provider is the provider that served the request.
Provider string `json:"provider"`
// Intent is the classified intent (e.g., "coding", "reasoning", "fast").
Intent string `json:"intent,omitempty"`
// Complexity is the estimated query complexity (0.0 to 1.0).
Complexity float64 `json:"complexity,omitempty"`
// LatencyMs is the total request duration in milliseconds.
LatencyMs int64 `json:"latency_ms"`
// HTTPStatus is the upstream HTTP response status code.
HTTPStatus int `json:"http_status"`
// Success indicates whether the request completed successfully.
Success bool `json:"success"`
// InputTokens is the number of tokens in the request.
InputTokens int `json:"input_tokens,omitempty"`
// OutputTokens is the number of tokens in the response.
OutputTokens int `json:"output_tokens,omitempty"`
// RQS is the Routing Quality Score for this request (0.0 to 1.0).
RQS float64 `json:"rqs,omitempty"`
// AutoRouted indicates whether auto-routing was used.
AutoRouted bool `json:"auto_routed"`
// FallbacksAttempted is the number of fallback providers tried before success.
FallbacksAttempted int `json:"fallbacks_attempted,omitempty"`
// Streaming indicates whether this was a streaming response.
Streaming bool `json:"streaming,omitempty"`
// Error contains the error message if the request failed.
Error string `json:"error,omitempty"`
}
RequestEvent is a structured per-request telemetry record emitted as NDJSON. It captures the full lifecycle of a proxied API request, including routing decisions, provider selection, latency, token counts, and quality scores.