metrics

package
v0.2.0 Latest
Warning

This package is not in the latest version of its module.

Published: May 16, 2026 License: Apache-2.0 Imports: 5 Imported by: 0

README

Metrics

Package metrics provides Prometheus metrics for HTTP, WebRTC, STT, LLM, TTS, and recording. All collectors are registered on the shared Registry; the server exposes /metrics for scraping.

Purpose

  • Registry: A single prometheus.Registry holds every collector; the server's HTTP layer serves it for scraping at /metrics.
  • Label strategy: Common labels are session_id, stage, direction, status, and model. session_id is often hashed or sampled via SampledSessionID to keep cardinality under control.
  • Categories: HTTP request/connection metrics; WebRTC peer connection and bytes; STT/LLM/TTS errors, fallbacks, and latencies; recording queue and job counts.

Metric categories

graph TD
    Registry["Registry"] --> HTTP["HTTP"]
    Registry --> WebRTC["WebRTC"]
    Registry --> STT["STT"]
    Registry --> LLM["LLM"]
    Registry --> TTS["TTS"]
    Registry --> Recording["Recording"]
    HTTP --> HTTPReq["http_requests_total\nhttp_request_duration_seconds"]
    HTTP --> HTTPConn["http_active_connections"]
    HTTP --> HTTPErr["http_errors_total\nhttp_timeout_total"]
    WebRTC --> WebRTCConn["webrtc_peer_connections_*\nwebrtc_connection_failures_total"]
    WebRTC --> WebRTCBytes["webrtc_bytes_sent_total\nwebrtc_bytes_received_total"]
    STT --> STTErr["stt_errors_total\nstt_fallback_total"]
    STT --> STTLat["stt_time_to_first_token_seconds\nstt_transcription_latency_seconds"]
    LLM --> LLMErr["llm_errors_total\nllm_retries_total\nllm_fallback_total"]
    LLM --> LLMLat["llm_time_to_first_token_seconds\nllm_generation_latency_seconds"]
    TTS --> TTSErr["tts_errors_total\ntts_fallback_total"]
    TTS --> TTSLat["tts_time_to_first_audio_chunk_seconds\ntts_synthesis_latency_seconds"]
    Recording --> RecJobs["recording_jobs_*_total\nrecording_queue_depth"]

Exported symbols

| Symbol | Type | Description |
| --- | --- | --- |
| Registry | *prometheus.Registry | Shared registry; all metrics registered in init() |
| LabelSessionID, LabelStage, LabelDirection, LabelStatus, LabelModel | const | Common label keys |
| HTTPRequestsTotal, HTTPRequestDurationSeconds, HTTPActiveConnections, HTTPErrorsTotal, HTTPTimeoutTotal | *CounterVec / *HistogramVec / *GaugeVec | HTTP metrics |
| WebRTCPeerConnectionsTotal, WebRTCPeerConnectionsActive, WebRTCBytesSentTotal, WebRTCBytesReceivedTotal, WebRTCConnectionFailuresTotal, WebRTCReconnectionAttemptsTotal | *CounterVec / *GaugeVec | WebRTC metrics |
| STTErrorsTotal, STTFallbackTotal, STTTimeToFirstTokenSeconds, STTTranscriptionLatencySeconds, STTStreamingLagSeconds | *CounterVec / *HistogramVec | STT metrics |
| LLMErrorsTotal, LLMRetriesTotal, LLMFallbackTotal, LLMTimeToFirstTokenSeconds, LLMGenerationLatencySeconds, LLMInterTokenLatencySeconds | *CounterVec / *HistogramVec | LLM metrics |
| TTSErrorsTotal, TTSFallbackTotal, TTSTimeToFirstAudioChunkSeconds, TTSSynthesisLatencySeconds, TTSStreamingLagSeconds | *CounterVec / *HistogramVec | TTS metrics |
| RecordingJobsEnqueuedTotal, RecordingJobsSuccessTotal, RecordingJobsFailedTotal, RecordingQueueDepth | prometheus.Counter / prometheus.Gauge | Recording metrics (unlabeled; Counter and Gauge are interfaces, not pointer types) |
| SampledSessionID(raw, sampleRate) | func | Returns hashed session ID or "sampled_out" for low cardinality |

Concurrency

  • Prometheus collectors are safe for concurrent use (Observe, Inc, Set, etc.).
  • SampledSessionID uses rand (seeded in init); use from a single goroutine or with external synchronization if consistency matters.

Files

| File | Description |
| --- | --- |
| prom.go | Registry, label constants, all metric vars, init registration, SampledSessionID |
| prom_test.go | Tests |


Documentation


Constants

const (
	LabelSessionID = "session_id"
	LabelStage     = "stage"
	LabelDirection = "direction"
	LabelStatus    = "status"
	LabelModel     = "model"
)

Common label keys.

Variables

var (
	HTTPRequestsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_requests_total",
			Help: "Total number of HTTP requests.",
		},
		[]string{"method", "route", "status_code", LabelSessionID, LabelStage, LabelDirection, LabelStatus, LabelModel},
	)
	HTTPRequestDurationSeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "http_request_duration_seconds",
			Help:    "HTTP request duration in seconds.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{"method", "route", "status_code", LabelSessionID, LabelStage, LabelDirection, LabelStatus, LabelModel},
	)
	HTTPActiveConnections = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "http_active_connections",
			Help: "Number of active HTTP connections.",
		},
		[]string{"route", LabelStage, LabelDirection, LabelSessionID, LabelModel},
	)
	HTTPErrorsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_errors_total",
			Help: "Total number of HTTP errors.",
		},
		[]string{"method", "route", "error_type"},
	)
	HTTPTimeoutTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_timeout_total",
			Help: "Total number of HTTP timeouts.",
		},
		[]string{"method", "route"},
	)
)

HTTP metrics.

var (
	WebRTCPeerConnectionsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "webrtc_peer_connections_total",
			Help: "Total number of WebRTC peer connections by state.",
		},
		[]string{"state", LabelSessionID, LabelStage},
	)
	WebRTCPeerConnectionsActive = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "webrtc_peer_connections_active",
			Help: "Current number of active WebRTC peer connections.",
		},
		[]string{LabelStage, LabelSessionID},
	)
	WebRTCBytesSentTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "webrtc_bytes_sent_total",
			Help: "Total bytes sent over WebRTC.",
		},
		[]string{LabelDirection, LabelSessionID, LabelModel},
	)
	WebRTCBytesReceivedTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "webrtc_bytes_received_total",
			Help: "Total bytes received over WebRTC.",
		},
		[]string{LabelDirection, LabelSessionID, LabelModel},
	)
	WebRTCConnectionFailuresTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "webrtc_connection_failures_total",
			Help: "Total WebRTC connection failures by reason.",
		},
		[]string{"reason", LabelStage},
	)
	WebRTCReconnectionAttemptsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "webrtc_reconnection_attempts_total",
			Help: "Total number of WebRTC reconnection attempts.",
		},
		[]string{LabelSessionID, LabelStage},
	)
)

WebRTC metrics.

var (
	STTErrorsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "stt_errors_total",
			Help: "Total STT errors by type.",
		},
		[]string{"error_type", LabelSessionID, LabelStage, LabelModel},
	)
	STTFallbackTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "stt_fallback_total",
			Help: "Total STT fallback invocations.",
		},
		[]string{LabelSessionID, LabelStage, LabelModel},
	)
	STTTimeToFirstTokenSeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "stt_time_to_first_token_seconds",
			Help:    "Time from audio start to first STT token.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelSessionID, LabelStage, LabelStatus, LabelModel},
	)
	STTTranscriptionLatencySeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "stt_transcription_latency_seconds",
			Help:    "End-to-end STT transcription latency per utterance.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelSessionID, LabelStage, LabelStatus, LabelModel},
	)
	STTStreamingLagSeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "stt_streaming_lag_seconds",
			Help:    "Lag between audio arrival and STT transcript emission.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelDirection, LabelSessionID, LabelModel},
	)
)

STT metrics.

var (
	LLMErrorsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "llm_errors_total",
			Help: "Total LLM errors by type.",
		},
		[]string{"error_type", LabelSessionID, LabelStage, LabelModel},
	)
	LLMRetriesTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "llm_retries_total",
			Help: "Total LLM retries.",
		},
		[]string{LabelSessionID, LabelStage, LabelModel},
	)
	LLMFallbackTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "llm_fallback_total",
			Help: "Total LLM fallback invocations.",
		},
		[]string{LabelSessionID, LabelStage, LabelModel},
	)
	LLMTimeToFirstTokenSeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "llm_time_to_first_token_seconds",
			Help:    "Time from request to first LLM token.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelSessionID, LabelStage, LabelStatus, LabelModel},
	)
	LLMGenerationLatencySeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "llm_generation_latency_seconds",
			Help:    "End-to-end LLM generation latency.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelSessionID, LabelStage, LabelStatus, LabelModel},
	)
	LLMInterTokenLatencySeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "llm_inter_token_latency_seconds",
			Help:    "Latency between streamed LLM tokens.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelSessionID, LabelStage, LabelModel},
	)
)

LLM metrics.

var (
	TTSErrorsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "tts_errors_total",
			Help: "Total TTS errors by type.",
		},
		[]string{"error_type", LabelSessionID, LabelStage, LabelModel},
	)
	TTSFallbackTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "tts_fallback_total",
			Help: "Total TTS fallback invocations.",
		},
		[]string{LabelSessionID, LabelStage, LabelModel},
	)
	TTSTimeToFirstAudioChunkSeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "tts_time_to_first_audio_chunk_seconds",
			Help:    "Time from text-in to first TTS audio chunk.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelSessionID, LabelStage, LabelStatus, LabelModel},
	)
	TTSSynthesisLatencySeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "tts_synthesis_latency_seconds",
			Help:    "Full TTS synthesis latency.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelSessionID, LabelStage, LabelStatus, LabelModel},
	)
	TTSStreamingLagSeconds = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "tts_streaming_lag_seconds",
			Help:    "Lag between text-in and audio-out for TTS.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{LabelDirection, LabelSessionID, LabelModel},
	)
)

TTS metrics.

var (
	RecordingJobsEnqueuedTotal = prometheus.NewCounter(
		prometheus.CounterOpts{
			Name: "recording_jobs_enqueued_total",
			Help: "Total number of recording upload jobs enqueued.",
		},
	)
	RecordingJobsSuccessTotal = prometheus.NewCounter(
		prometheus.CounterOpts{
			Name: "recording_jobs_success_total",
			Help: "Total number of recording upload jobs that succeeded.",
		},
	)
	RecordingJobsFailedTotal = prometheus.NewCounter(
		prometheus.CounterOpts{
			Name: "recording_jobs_failed_total",
			Help: "Total number of recording upload jobs that failed.",
		},
	)
	RecordingQueueDepth = prometheus.NewGauge(
		prometheus.GaugeOpts{
			Name: "recording_queue_depth",
			Help: "Current number of pending recording upload jobs in the queue.",
		},
	)
)

Recording metrics.

Registry is the shared Prometheus registry for Voxray.

Functions

func SampledSessionID

func SampledSessionID(raw string, sampleRate int) string

SampledSessionID returns a stable, low-cardinality session ID label. It either hashes the raw ID to a short hex string or, when sampled out, returns the constant "sampled_out".
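The package's implementation is not shown on this page, so the following is only a sketch of the hash-or-sample-out idea under stated assumptions. It is a deterministic variant: the hypothetical sampledSessionIDSketch treats sampleRate as "keep roughly 1 in N sessions", decides from a SHA-256 hash of the raw ID rather than from rand, and treats a non-positive rate as "sample everything out". None of these choices is confirmed by the source.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// sampledOut is the constant label value named in the documentation.
const sampledOut = "sampled_out"

// sampledSessionIDSketch returns a short hex label for sessions that are kept
// and sampledOut for the rest. The same raw ID always maps to the same result.
func sampledSessionIDSketch(raw string, sampleRate int) string {
	if sampleRate <= 0 {
		return sampledOut // assumption: non-positive rates disable session labels
	}
	sum := sha256.Sum256([]byte(raw))

	// Bytes 8..16 drive the keep/drop decision; bytes 0..4 become the label,
	// so the decision and the label come from different parts of the hash.
	if binary.BigEndian.Uint64(sum[8:16])%uint64(sampleRate) != 0 {
		return sampledOut
	}
	return hex.EncodeToString(sum[:4])
}

func main() {
	fmt.Println(sampledSessionIDSketch("session-123", 1))
	fmt.Println(sampledSessionIDSketch("session-123", 0))
}
```

Hashing keeps raw session identifiers out of the metrics backend, and sampling bounds the number of distinct session_id values that reach each vector.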

Types

This section is empty.
