measure

package
v0.5.0
Published: May 31, 2026 License: MIT Imports: 6 Imported by: 0

Documentation

Overview

Package measure aggregates per-request measurements in memory so shim can answer the question "what are you actually doing to my traffic?" loudly, in a single JSON payload exposed at /v1/metrics.

Stage 1 measurements (Fork 2-a, 3-a in todos/shim-stage1-plan.md):

  • Per-endpoint latency reservoir (fixed-size, percentile-at-read).
  • Per-endpoint token-delta totals (shim cl100k_base count vs upstream usage.*_tokens).
  • Rewrite-event counts (model rewrites, stop_sequences truncations, ...).

All state is in-memory; restart resets to zero. Persistence is a Stage 2 decision.

Constants

const (
	RewriteModel         = "model"
	RewriteStopSequences = "stop_sequences"
)

Rewrite event kinds. To add a new kind, add a constant here, wire the call site, and extend the rewrites table in the README.

Variables

This section is empty.

Functions

This section is empty.

Types

type Collector

type Collector struct {
	// contains filtered or unexported fields
}

Collector aggregates measurements across goroutines. All public methods are safe to call concurrently; percentile computation is deferred to Snapshot.

func New

func New() *Collector

New returns an empty Collector. The reservoir RNG is seeded from time.Now() at construction. Tests in the same package that need determinism can build their own RNG and assign it to the unexported rng field.

func (*Collector) RecordLatency

func (c *Collector) RecordLatency(endpoint string, d time.Duration)

RecordLatency adds an observation for endpoint. Duration is stored as floating-point milliseconds.

func (*Collector) RecordRequestSeen

func (c *Collector) RecordRequestSeen(endpoint string)

RecordRequestSeen increments the per-endpoint request counter. It is called at handler entry, before any parsing or validation. The resulting count is the denominator for any per-endpoint ratio (errors/seen, rewrites/seen) that operators want to compute from /v1/metrics.
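As an illustration of using the request count as a denominator, the sketch below computes an error ratio from two counter maps. The helper ratio, the endpoint name, and the numbers are all hypothetical; only the idea of dividing by requests seen comes from the text above.

```go
package main

import "fmt"

// ratio returns num/den as a float, or 0 when nothing has been seen yet,
// so an idle endpoint does not produce NaN or +Inf.
func ratio(num, den int) float64 {
	if den == 0 {
		return 0
	}
	return float64(num) / float64(den)
}

func main() {
	// Hypothetical values, as an operator might read them from /v1/metrics.
	ep := "/v1/example"
	requestsSeen := map[string]int{ep: 200}
	upstreamErrors := map[string]int{ep: 5}

	fmt.Printf("%s error rate: %.3f\n", ep, ratio(upstreamErrors[ep], requestsSeen[ep]))
}
```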

func (*Collector) RecordRewriteEvent

func (c *Collector) RecordRewriteEvent(kind string)

RecordRewriteEvent increments the counter for kind. Use the Rewrite* constants; arbitrary strings are accepted but break grep-ability.

func (*Collector) RecordTokenDelta

func (c *Collector) RecordTokenDelta(endpoint string, shimCount, upstreamPrompt, upstreamCompletion int)

RecordTokenDelta adds one (shim-count, upstream-claimed) pair for endpoint. shimCount is the cl100k_base BPE count from tokens.Count; upstreamPrompt and upstreamCompletion are the upstream usage fields. Totals + count surface the delta at Snapshot time.

RecordTokenDelta is a no-op when both upstreamPrompt and upstreamCompletion are zero: that means the upstream omitted the usage block, and recording zeros would dilute the shim/upstream comparison without adding signal. The README discloses this.
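The skip rule can be sketched like this. The type tokenTotals and its record method are hypothetical stand-ins for the collector's unexported state; only the both-zero condition is taken from the documentation above.

```go
package main

import "fmt"

// tokenTotals is a hypothetical stand-in for the per-endpoint token state.
type tokenTotals struct {
	shimTotal               int
	upstreamPromptTotal     int
	upstreamCompletionTotal int
	n                       int
}

// record adds one observation, skipping it when the upstream reported no
// usage at all (both fields zero).
func (t *tokenTotals) record(shimCount, upstreamPrompt, upstreamCompletion int) {
	if upstreamPrompt == 0 && upstreamCompletion == 0 {
		return // usage block omitted: nothing to compare against
	}
	t.shimTotal += shimCount
	t.upstreamPromptTotal += upstreamPrompt
	t.upstreamCompletionTotal += upstreamCompletion
	t.n++
}

func main() {
	var t tokenTotals
	t.record(120, 0, 0)    // skipped: both upstream fields are zero
	t.record(118, 121, 40) // recorded
	t.record(9, 0, 3)      // recorded: only one upstream field is zero
	fmt.Printf("%+v\n", t)
}
```

Because skipped calls do not increment n, the observation count only covers requests where a comparison was possible.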

func (*Collector) RecordUpstreamError

func (c *Collector) RecordUpstreamError(endpoint string, status int)

RecordUpstreamError records a non-2xx response from the upstream. status is the upstream's HTTP code (e.g., 400, 502, 429). Aggregated by class (4xx/5xx) and per-status for drill-down. Operators read both to answer "what's failing" without reading raw logs.

func (*Collector) Snapshot

func (c *Collector) Snapshot() Snapshot

Snapshot returns a point-in-time view of all aggregates with percentiles computed from current reservoir state. Returned maps are copies; callers can mutate freely.

type LatencyStats

type LatencyStats struct {
	P50 float64 `json:"p50"`
	P95 float64 `json:"p95"`
	P99 float64 `json:"p99"`
	N   int     `json:"n"`
}

LatencyStats reports percentiles in milliseconds. N is the total number of observations recorded for this endpoint. It is not capped at the reservoir size: once N exceeds reservoirCap, new observations randomly replace existing samples.
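A fixed-size reservoir with percentile-at-read can be sketched with classic reservoir sampling (Algorithm R) and a nearest-rank percentile. This is an assumption about the general technique, not a copy of the package's implementation: the type reservoir, its methods, the explicit seed, and the nearest-rank rule are all hypothetical choices.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// reservoir is a hypothetical fixed-size sample of latency observations.
type reservoir struct {
	size    int
	n       int       // total observations seen, never capped
	samples []float64 // at most size entries
	rng     *rand.Rand
}

func newReservoir(size int, seed int64) *reservoir {
	return &reservoir{size: size, rng: rand.New(rand.NewSource(seed))}
}

// add follows Algorithm R: the first size observations fill the buffer;
// after that, observation number n replaces a random slot with
// probability size/n, so every observation is equally likely to be kept.
func (r *reservoir) add(ms float64) {
	r.n++
	if len(r.samples) < r.size {
		r.samples = append(r.samples, ms)
		return
	}
	if j := r.rng.Intn(r.n); j < r.size {
		r.samples[j] = ms
	}
}

// percentile sorts a copy of the samples at read time and returns the
// nearest-rank value for p in (0, 100]. It returns 0 for an empty reservoir.
func (r *reservoir) percentile(p float64) float64 {
	if len(r.samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), r.samples...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

func main() {
	r := newReservoir(100, 1)
	for i := 1; i <= 1000; i++ {
		r.add(float64(i))
	}
	// n keeps counting past the reservoir size; the sample stays at 100.
	fmt.Println("n:", r.n, "samples:", len(r.samples))
	fmt.Println("p50:", r.percentile(50), "p95:", r.percentile(95), "p99:", r.percentile(99))
}
```

Sorting only at read time keeps each add call cheap, at the cost of an O(k log k) sort per snapshot for a reservoir of k samples.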

type Snapshot

type Snapshot struct {
	Latency        map[string]LatencyStats       `json:"latency"`
	Tokens         map[string]TokenStats         `json:"token_delta"`
	Rewrites       map[string]int                `json:"rewrites"`
	UpstreamErrors map[string]UpstreamErrorStats `json:"upstream_errors"`
	RequestsSeen   map[string]int                `json:"requests_seen"`
}

Snapshot is the JSON-marshallable view returned by Collector.Snapshot. The wire shape is committed in todos/shim-stage1-plan.md §1 step 6; breaking changes need a CHANGELOG entry.

type TokenStats

type TokenStats struct {
	ShimTotal               int `json:"shim_total"`
	UpstreamPromptTotal     int `json:"upstream_prompt_total"`
	UpstreamCompletionTotal int `json:"upstream_completion_total"`
	N                       int `json:"n"`
}

TokenStats reports cumulative sums. The delta is shim_total vs. upstream_prompt_total; a wide gap signals that the cl100k count diverges meaningfully from the upstream's actual tokenizer for the traffic shape.

type UpstreamErrorStats

type UpstreamErrorStats struct {
	Total    int            `json:"total"`
	Class4xx int            `json:"class_4xx"`
	Class5xx int            `json:"class_5xx"`
	ByStatus map[string]int `json:"by_status"`
}

UpstreamErrorStats reports counts of upstream non-2xx responses per endpoint. ByStatus keys are stringified codes ("400", "502", ...) so the JSON shape is canonical. Total equals Class4xx + Class5xx when every recorded status falls in the standard 4xx or 5xx error ranges; 3xx responses and other oddities are counted only in Total and ByStatus.
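The counting rules described here can be sketched as follows. The struct is a local copy of the documented type; the record function is hypothetical and only illustrates one way to satisfy those rules.

```go
package main

import (
	"fmt"
	"strconv"
)

// UpstreamErrorStats is a local copy of the documented type.
type UpstreamErrorStats struct {
	Total    int            `json:"total"`
	Class4xx int            `json:"class_4xx"`
	Class5xx int            `json:"class_5xx"`
	ByStatus map[string]int `json:"by_status"`
}

// record is a hypothetical helper. Every status counts toward Total and
// ByStatus; only 4xx and 5xx codes also count toward a class bucket.
func record(s *UpstreamErrorStats, status int) {
	s.Total++
	switch {
	case status >= 400 && status <= 499:
		s.Class4xx++
	case status >= 500 && status <= 599:
		s.Class5xx++
	}
	if s.ByStatus == nil {
		s.ByStatus = make(map[string]int)
	}
	s.ByStatus[strconv.Itoa(status)]++
}

func main() {
	var s UpstreamErrorStats
	for _, code := range []int{400, 429, 429, 502, 304} {
		record(&s, code)
	}
	// Total is 5 while Class4xx + Class5xx is 4: the 304 has no class bucket.
	fmt.Printf("%+v\n", s)
}
```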
