metrics

package
v1.3.1 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 8, 2026 License: MIT Imports: 11 Imported by: 0

Documentation

Overview

Package metrics is a tiny, dependency-free Prometheus exposition backend tailored to VoiceServer's needs. It provides:

  • counters (monotonic, label-keyed)
  • gauges (up/down, label-keyed)
  • summary-style histograms with P50/P90/P95/P99 quantiles

We deliberately avoid pulling in prometheus/client_golang — it adds ~100 transitive deps for features (proto, gRPC, exemplars, …) we don't use. The text exposition format is small and stable.

Concurrency: every method on Registry is safe for concurrent use. Each observation costs one atomic add (counters and gauges) or one RWMutex lock plus an append (histograms). Both take under 200 ns on modern x86, which is negligible next to the network latencies we measure.

Constants

const (
	// Calls.
	MetricActiveCalls = "voiceserver_active_calls"
	MetricCallsTotal  = "voiceserver_calls_total"

	// Recognizer / synthesizer errors.
	MetricASRErrors = "voiceserver_asr_errors_total"
	MetricTTSErrors = "voiceserver_tts_errors_total"

	// User-interrupts-AI events.
	MetricBargeInTotal = "voiceserver_barge_in_total"

	// Latencies (milliseconds).
	MetricE2EFirstByteMs = "voiceserver_e2e_first_byte_ms"
	MetricTTSFirstByteMs = "voiceserver_tts_first_byte_ms"
	MetricLLMFirstByteMs = "voiceserver_llm_first_byte_ms"

	// Dialog plane.
	MetricDialogReconnectTotal = "voiceserver_dialog_reconnect_total"
)

Metric name constants. Kept in one place so dashboards can grep for a single source of truth. Names follow Prometheus convention: `<namespace>_<subsystem>_<name>_<unit>`.

const MetricObserveDroppedTotal = "voiceserver_metrics_observe_dropped_total"

MetricObserveDroppedTotal counts samples lost because the async Observe buffer was full. If this is non-zero in production the drain goroutine isn't keeping up — usually a downstream stall rather than a real load issue.

const MetricUnknownLabelTotal = "voiceserver_metrics_unknown_label_total"

MetricUnknownLabelTotal counts soft-whitelist violations. Visible via /metrics so on-call can spot "someone is shipping a metric the declared whitelist doesn't cover" without grepping logs.

Variables

var (
	LabelsTransportSIP    = map[string]string{"transport": "sip"}
	LabelsTransportWebRTC = map[string]string{"transport": "webrtc"}
)

LabelsTransportSIP / LabelsTransportWebRTC are the two transports we use today. The whitelist for any metric labelled by transport should be: RegisterLabels(metric, "transport").

var Default = NewRegistry()

Default is the process-wide registry. Use this for application-level metrics so a single /metrics handler serves everything.

Functions

func ASRError

func ASRError(transport string)

ASRError bumps the ASR error counter. Called from the recognizer error callback in the gateway client.

func AsyncDroppedCount

func AsyncDroppedCount() uint64

AsyncDroppedCount returns the total number of samples dropped since process start. Exposed for tests and self-observability tooling.

func BargeIn

func BargeIn(transport string)

BargeIn counts how often the VAD interrupted the AI's TTS because the user started talking. Good predictor of conversation health — a high rate usually means the AI is too verbose or VAD is too twitchy.

func CallEnded

func CallEnded(transport, status string)

CallEnded mirrors CallStarted. status is a short classification like "ok", "dialog-hangup", "ice-failed", "pipeline-error" — use the same vocabulary you use in call_events.kind so dashboards line up.

func CallStarted

func CallStarted(transport string)

CallStarted increments the active-calls gauge and the calls_total counter for the given transport. Call at the moment the session becomes "live" (ASR/TTS wired + dialog plane connected).

func DialogReconnect

func DialogReconnect(transport, outcome string)

DialogReconnect counts reconnect attempts to the dialog plane, labelled by transport and outcome. A growing total across all outcomes means the dialog app is flaky; compare the per-outcome series (ok vs. fail) to get the success rate.

func Handler

func Handler() http.Handler

Handler returns an http.Handler that writes the Default registry in Prometheus text exposition format. Mount at /metrics — no auth by default; add middleware if the listener is internet-exposed.
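
The advice about adding middleware can be sketched as follows. This is a minimal illustration, not the package's own code: metricsHandler is a stand-in for Handler(), and requireBearer, newMux, probe, and the token value are hypothetical names chosen for the example.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// metricsHandler is a stand-in for the package's Handler(); it writes a
// single made-up sample so the example has no external dependencies.
func metricsHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
		fmt.Fprintln(w, "demo_up 1")
	})
}

// requireBearer rejects requests whose Authorization header is not
// exactly "Bearer <token>". The comparison is constant-time.
func requireBearer(token string, next http.Handler) http.Handler {
	want := []byte("Bearer " + token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := []byte(r.Header.Get("Authorization"))
		if subtle.ConstantTimeCompare(got, want) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newMux mounts the protected handler at /metrics.
func newMux(token string) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/metrics", requireBearer(token, metricsHandler()))
	return mux
}

// probe issues an in-memory GET /metrics and returns the status code.
func probe(h http.Handler, authHeader string) int {
	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newMux("s3cret")
	fmt.Println(probe(mux, ""), probe(mux, "Bearer s3cret"))
}
```

In a real service the stand-in would be replaced by the package's Handler(), and the token would come from configuration rather than a literal.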

func LabelsCall

func LabelsCall(transport, status string) map[string]string

LabelsCall composes a 2-key label set for the common (transport, status) shape used by voiceserver_calls_total. We pre-build the known combinations rather than allocating per-call. Add more statuses here if dashboards need to slice on them.

Return type is map[string]string to fit the existing API; pointer identity is preserved across calls so map-key dedupe inside the registry stays cheap.
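
One way to get the "pre-built, same map every time" behaviour described above is a package-level lookup table. The sketch below is an assumption about the shape, not the package's source: prebuilt, labelsCall, sameMap, and the listed (transport, status) pairs are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// prebuilt holds one shared label map per known (transport, status) pair.
// The pairs listed here are illustrative, not the package's actual table.
var prebuilt = map[[2]string]map[string]string{
	{"sip", "ok"}:    {"transport": "sip", "status": "ok"},
	{"webrtc", "ok"}: {"transport": "webrtc", "status": "ok"},
}

// labelsCall returns the shared map for a known pair and falls back to
// allocating a fresh map for anything else.
func labelsCall(transport, status string) map[string]string {
	if m, ok := prebuilt[[2]string{transport, status}]; ok {
		return m
	}
	return map[string]string{"transport": transport, "status": status}
}

// sameMap reports whether a and b refer to the same underlying map.
func sameMap(a, b map[string]string) bool {
	return reflect.ValueOf(a).Pointer() == reflect.ValueOf(b).Pointer()
}

func main() {
	fmt.Println(sameMap(labelsCall("sip", "ok"), labelsCall("sip", "ok")))
	fmt.Println(sameMap(labelsCall("sip", "busy"), labelsCall("sip", "busy")))
}
```

Because the known combinations share one map, callers must treat the returned value as read-only; mutating it would change the labels for every other call site.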

func LabelsDialogOutcome

func LabelsDialogOutcome(transport, outcome string) map[string]string

LabelsDialogOutcome is used by DialogReconnect — bounded set of outcomes per the original API contract.

func ObserveAsync

func ObserveAsync(name, help string, v float64)

ObserveAsync queues a histogram sample on the global async drain. Hot-path safe: non-blocking, zero allocation, drops on full (incrementing the dropped-samples counter).

This is the recommended call for any observation that fires more than ~10x/sec per process. For one-off latencies (per turn, per call) the synchronous Default.Observe is fine and slightly more accurate (no buffering reorder concerns).
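
The "non-blocking, drops on full" behaviour can be illustrated with a buffered channel and a select that has a default case. This is a sketch of the general technique only; the types and names (sample, asyncQueue, offer, droppedCount) are hypothetical and the package's internals may differ.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// sample is one queued histogram observation (hypothetical type).
type sample struct {
	name  string
	value float64
}

// asyncQueue is a bounded buffer that drops samples instead of blocking.
type asyncQueue struct {
	ch      chan sample
	dropped atomic.Uint64
}

func newAsyncQueue(capacity int) *asyncQueue {
	return &asyncQueue{ch: make(chan sample, capacity)}
}

// offer enqueues s if there is room and reports whether it was accepted.
// It never blocks: a full buffer bumps the drop counter instead.
func (q *asyncQueue) offer(s sample) bool {
	select {
	case q.ch <- s:
		return true
	default:
		q.dropped.Add(1)
		return false
	}
}

// droppedCount returns the number of samples rejected so far.
func (q *asyncQueue) droppedCount() uint64 {
	return q.dropped.Load()
}

func main() {
	q := newAsyncQueue(2)
	for i := 0; i < 5; i++ {
		q.offer(sample{name: "demo_latency_ms", value: float64(i)})
	}
	fmt.Println("queued:", len(q.ch), "dropped:", q.droppedCount())
}
```

A drain goroutine (not shown) would read from the channel and apply each sample synchronously; if it stalls, the drop counter is the signal that samples are being lost.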

func ObserveE2EFirstByte

func ObserveE2EFirstByte(ms int)

ObserveE2EFirstByte records the user-perceived latency from ASR final to first audible AI byte. Only meaningful values (>0) should be passed — 0 means "no ASR final preceded this turn" which shouldn't skew the distribution.

func ObserveLLMFirstByte

func ObserveLLMFirstByte(ms int)

ObserveLLMFirstByte records the dialog app's reported time to first LLM token (ms). Comes from CommandMeta.LLMFirstMs on tts.speak.

func ObserveTTSFirstByte

func ObserveTTSFirstByte(ms int)

ObserveTTSFirstByte records Speak -> first PCM frame latency (ms). Measures the TTS engine's cold-start / TTFB across all turns.

func RegisterLabels

func RegisterLabels(metric string, keys ...string)

RegisterLabels declares the allowed label keys for a metric. Subsequent updates with extra keys will have those keys dropped (soft defense). Calling RegisterLabels twice for the same metric REPLACES the whitelist (last write wins) — intended for tests.

Safe to call from init().
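
The whitelist semantics (unknown keys dropped, last registration wins) can be sketched with a small self-contained type. This is an illustration under assumptions, not the package's code: whitelist, register, and filter are hypothetical names, and passing labels through unchanged for unregistered metrics is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// whitelist maps a metric name to its allowed label keys.
type whitelist struct {
	mu      sync.RWMutex
	allowed map[string]map[string]struct{}
}

func newWhitelist() *whitelist {
	return &whitelist{allowed: make(map[string]map[string]struct{})}
}

// register replaces the allowed keys for metric (last write wins).
func (w *whitelist) register(metric string, keys ...string) {
	set := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		set[k] = struct{}{}
	}
	w.mu.Lock()
	w.allowed[metric] = set
	w.mu.Unlock()
}

// filter returns the labels that survive the whitelist and the number of
// keys that were dropped. Metrics without a registered whitelist pass
// through unchanged (an assumption of this sketch).
func (w *whitelist) filter(metric string, labels map[string]string) (map[string]string, int) {
	w.mu.RLock()
	set, ok := w.allowed[metric]
	w.mu.RUnlock()
	if !ok {
		return labels, 0
	}
	kept := make(map[string]string, len(labels))
	dropped := 0
	for k, v := range labels {
		if _, allowed := set[k]; allowed {
			kept[k] = v
		} else {
			dropped++
		}
	}
	return kept, dropped
}

func main() {
	w := newWhitelist()
	w.register("demo_calls_total", "transport")
	kept, dropped := w.filter("demo_calls_total",
		map[string]string{"transport": "sip", "caller_id": "123"})
	fmt.Println(kept, dropped)
}
```

The dropped count is what a counter such as MetricUnknownLabelTotal would be incremented by.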

func TTSError

func TTSError(transport string)

TTSError bumps the TTS error counter. Called when Speak returns an error or is interrupted / drained before producing any audio.

Types

type Registry

type Registry struct {
	// contains filtered or unexported fields
}

Registry is the single source of truth for VoiceServer process-level metrics. A call-site imports the package, mutates the Default registry via helpers like IncCounter(), and a single HTTP handler serialises the registry to Prometheus text format on /metrics scrape.

func NewRegistry

func NewRegistry() *Registry

NewRegistry returns an empty, ready-to-use registry.

func (*Registry) AddCounter

func (r *Registry) AddCounter(name, help string, labels map[string]string, n uint64)

AddCounter adds `n` to the counter. n is unsigned, so a call can never decrease the value and the series stays monotonic, as Prometheus counters require.

Labels are filtered through the cardinality whitelist registered via RegisterLabels (see labels.go). Unknown keys are dropped and reported via MetricUnknownLabelTotal (voiceserver_metrics_unknown_label_total).

func (*Registry) AddGauge

func (r *Registry) AddGauge(name, help string, labels map[string]string, v float64)

AddGauge increments (v > 0) / decrements (v < 0) a gauge atomically. Labels run through the cardinality whitelist (see labels.go).
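
Go's sync/atomic has no float64 add, so an atomic increment of a float gauge is commonly built from a compare-and-swap loop over the value's bit pattern. The sketch below shows that technique; whether the package stores gauges this way is not stated in the documentation, and the gauge type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"sync/atomic"
)

// gauge stores a float64 as its IEEE 754 bit pattern so it can be
// updated with integer atomics.
type gauge struct {
	bits atomic.Uint64
}

// add applies delta atomically, retrying if another goroutine changed
// the value between the load and the swap.
func (g *gauge) add(delta float64) {
	for {
		old := g.bits.Load()
		next := math.Float64bits(math.Float64frombits(old) + delta)
		if g.bits.CompareAndSwap(old, next) {
			return
		}
	}
}

// set overwrites the current value.
func (g *gauge) set(v float64) { g.bits.Store(math.Float64bits(v)) }

// value returns the current value.
func (g *gauge) value() float64 { return math.Float64frombits(g.bits.Load()) }

func main() {
	var g gauge
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.add(1)
		}()
	}
	wg.Wait()
	fmt.Println(g.value())
}
```

The zero value of gauge reads as 0 because the all-zero bit pattern is +0.0, so no constructor is needed.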

func (*Registry) IncCounter

func (r *Registry) IncCounter(name, help string, labels map[string]string)

IncCounter bumps a counter by 1. Safe to call from hot paths.

func (*Registry) Observe

func (r *Registry) Observe(name, help string, v float64)

Observe records one sample into a histogram. The registry keeps at most the `maxSamples` most recent observations to bound memory; older values are dropped in FIFO order. Quantiles are computed at scrape time from the live buffer, so a /metrics request is O(n log n) in buffer size, which is fine for n up to a few thousand.
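
The bounded buffer and scrape-time quantiles described above can be sketched as follows. This is a minimal illustration, not the package's implementation: window, observe, and quantile are hypothetical names, and the nearest-rank quantile definition is an assumption, since the documentation does not say which method is used.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"sync"
)

// window keeps the most recent samples, up to limit.
type window struct {
	mu      sync.RWMutex
	samples []float64
	limit   int
}

func newWindow(limit int) *window { return &window{limit: limit} }

// observe appends v and drops the oldest samples once the cap is exceeded.
func (w *window) observe(v float64) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.samples = append(w.samples, v)
	if len(w.samples) > w.limit {
		// Re-slice past the oldest entries; append reallocates and
		// copies only the live tail when capacity runs out.
		w.samples = w.samples[len(w.samples)-w.limit:]
	}
}

// quantile copies the buffer under the read lock, sorts the copy, and
// returns the nearest-rank q-quantile. It returns NaN when empty.
func (w *window) quantile(q float64) float64 {
	w.mu.RLock()
	sorted := append([]float64(nil), w.samples...)
	w.mu.RUnlock()
	if len(sorted) == 0 {
		return math.NaN()
	}
	sort.Float64s(sorted)
	rank := int(math.Ceil(q * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	w := newWindow(5)
	for i := 1; i <= 10; i++ {
		w.observe(float64(i))
	}
	fmt.Println(w.quantile(0.5), w.quantile(0.99))
}
```

Sorting a copy keeps the write path cheap: observers only ever pay for the lock and the append, and the O(n log n) cost lands on the scraper.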

func (*Registry) ObserveN

func (r *Registry) ObserveN(name, help string, v float64, maxSamples int)

ObserveN is Observe with a custom buffer cap. Use when you want finer control over memory vs resolution (e.g. 8192 for a hot call latency signal you scrape every 10s).

func (*Registry) SetGauge

func (r *Registry) SetGauge(name, help string, labels map[string]string, v float64)

SetGauge sets a gauge to v, replacing any previous value. Labels run through the cardinality whitelist (see labels.go).

func (*Registry) WritePromText

func (r *Registry) WritePromText(w io.Writer)

WritePromText serialises the registry in Prometheus text exposition format (v0.0.4). Safe to call concurrently with metric updates; snapshot is point-in-time per metric.
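
For readers unfamiliar with the text exposition format, the sketch below renders a single counter: a HELP line, a TYPE line, then one sample line with escaped label values. It illustrates the format only and is not the package's serialiser; formatLabels, writeCounter, renderCounter, and the demo metric name are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"sort"
	"strings"
)

// Label values escape backslash, double quote and line feed;
// HELP text escapes backslash and line feed.
var (
	labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)
	helpEscaper  = strings.NewReplacer(`\`, `\\`, "\n", `\n`)
)

// formatLabels renders labels as {k="v",...} with keys sorted so the
// output is deterministic. It returns "" for an empty label set.
func formatLabels(labels map[string]string) string {
	if len(labels) == 0 {
		return ""
	}
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+`="`+labelEscaper.Replace(labels[k])+`"`)
	}
	return "{" + strings.Join(parts, ",") + "}"
}

// writeCounter writes the HELP, TYPE and sample lines for one counter.
func writeCounter(w io.Writer, name, help string, labels map[string]string, v uint64) error {
	_, err := fmt.Fprintf(w, "# HELP %s %s\n# TYPE %s counter\n%s%s %d\n",
		name, helpEscaper.Replace(help), name, name, formatLabels(labels), v)
	return err
}

// renderCounter is a convenience wrapper that returns the text as a string.
func renderCounter(name, help string, labels map[string]string, v uint64) string {
	var sb strings.Builder
	_ = writeCounter(&sb, name, help, labels, v) // strings.Builder never fails
	return sb.String()
}

func main() {
	fmt.Print(renderCounter("demo_calls_total", "Total calls.",
		map[string]string{"transport": "sip", "status": "ok"}, 3))
}
```

Sorting the label keys is a choice made here for stable output; the format itself does not require any particular label order.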
