metrics

package
v0.1.0-beta.9
Published: May 9, 2026 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Overview

Package metrics provides token usage metrics collection and aggregation.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Accumulator

type Accumulator struct {
	// contains filtered or unexported fields
}

Accumulator collects token usage metrics with thread-safe operations. Session totals use atomic counters. Historical data is stored in a ring buffer of pre-aggregated 1-minute time buckets.
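To make the storage layout described above more concrete, here is a minimal sketch of a ring buffer of one-minute buckets. The names `bucket`, `minuteRing`, and `add` are hypothetical and are not part of this package; the real unexported fields may be organized differently. The sketch guards the ring with a mutex and assumes observations arrive in chronological order.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket is a hypothetical pre-aggregated one-minute slot.
type bucket struct {
	minute time.Time // start of the minute this slot covers
	input  int64
	output int64
}

// minuteRing is a hypothetical fixed-capacity ring of minute buckets.
type minuteRing struct {
	mu    sync.Mutex
	slots []bucket
	head  int // index of the most recent bucket
	used  int // number of populated slots
}

func newMinuteRing(capacity int) *minuteRing {
	return &minuteRing{slots: make([]bucket, capacity)}
}

// add folds an observation into the bucket for ts's minute. When a new
// minute starts, the ring advances and overwrites the oldest slot.
func (r *minuteRing) add(ts time.Time, input, output int64) {
	key := ts.Truncate(time.Minute)
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.used == 0 || !r.slots[r.head].minute.Equal(key) {
		if r.used > 0 {
			r.head = (r.head + 1) % len(r.slots)
		}
		if r.used < len(r.slots) {
			r.used++
		}
		r.slots[r.head] = bucket{minute: key}
	}
	r.slots[r.head].input += input
	r.slots[r.head].output += output
}

// demo records three observations spanning two minutes.
func demo() (used int, first, second int64) {
	r := newMinuteRing(2)
	base := time.Date(2026, 5, 9, 12, 0, 0, 0, time.UTC)
	r.add(base, 10, 5)
	r.add(base.Add(30*time.Second), 1, 1) // same minute, same bucket
	r.add(base.Add(time.Minute), 7, 3)    // next minute, new bucket
	return r.used, r.slots[0].input, r.slots[1].input
}

func main() {
	fmt.Println(demo()) // 2 11 7
}
```

Pre-aggregating per minute keeps memory bounded by the ring capacity regardless of call volume.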

func NewAccumulator

func NewAccumulator(maxDataPoints int) *Accumulator

NewAccumulator creates a metrics accumulator with the given ring buffer capacity. Each slot holds one minute of aggregated data, so 10000 slots hold roughly 7 days of history (10000 minutes ≈ 6.9 days).
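The capacity arithmetic can be checked with a short sketch. The helpers `retention` and `slotsFor` are hypothetical and only illustrate the one-slot-per-minute relationship stated above.

```go
package main

import (
	"fmt"
	"time"
)

// retention returns how much history a ring of one-minute slots can hold.
func retention(slots int) time.Duration {
	return time.Duration(slots) * time.Minute
}

// slotsFor returns the number of one-minute slots needed to cover d,
// rounding up so that a partial minute still gets a slot.
func slotsFor(d time.Duration) int {
	return int((d + time.Minute - 1) / time.Minute)
}

func main() {
	fmt.Println(retention(10000))             // 166h40m0s
	fmt.Println(slotsFor(7 * 24 * time.Hour)) // 10080
}
```

A caller that needs a full seven days of history would therefore pass 10080 rather than 10000.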

func (*Accumulator) Clear

func (a *Accumulator) Clear()

Clear resets all metrics — session totals, per-server totals, and history.

func (*Accumulator) ClearCost

func (a *Accumulator) ClearCost()

ClearCost resets cost counters and cost ring-buffer values without touching token counters or format-savings state. Used by the `DELETE /api/metrics/cost` endpoint so operators can wipe cost data without losing token history.

func (*Accumulator) CostMicroSnapshot

func (a *Accumulator) CostMicroSnapshot() map[string]CostMicroUSDCounts

CostMicroSnapshot returns per-server cumulative cost in the int64 micro-USD shape used by the persistence layer. Skipping the float USD round-trip avoids any precision loss between the in-memory atomics and the on-disk schema. Used by telemetry.MetricsFlusher.flushOnce to compute a cost diff against prevCost in the same units that get written to metrics.jsonl, and consumed symmetrically by SeedFromFile via RestoreCost. Session totals are not returned — they are derivable as the sum across servers, which RestoreCost re-derives on rehydrate.

func (*Accumulator) CostSnapshot

func (a *Accumulator) CostSnapshot() CostUsage

CostSnapshot returns the current cost usage summary in USD. The shape mirrors Snapshot()'s TokenUsage so API responses can carry both side by side. Cache fields are non-zero only when RecordCost recorded cache usage; otherwise they are omitted from JSON via omitempty.

func (*Accumulator) Query

func (a *Accumulator) Query(duration time.Duration) TimeSeriesResponse

Query returns historical time-series data for the given duration. For ranges > 6h, data points are downsampled to hourly buckets.
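The downsampling rule can be sketched as follows. The type `point` and the functions `downsampleHourly` and `series` are hypothetical, and summing the minute values within each hour is an assumption of this sketch; the documentation above does not state how points are combined.

```go
package main

import (
	"fmt"
	"time"
)

// point is a hypothetical stand-in for a per-minute data point.
type point struct {
	ts     time.Time
	tokens int64
}

// downsampleHourly sums chronologically ordered minute points into one
// point per hour. Summation is an assumption of this sketch.
func downsampleHourly(in []point) []point {
	var out []point
	for _, p := range in {
		hour := p.ts.Truncate(time.Hour)
		if n := len(out); n > 0 && out[n-1].ts.Equal(hour) {
			out[n-1].tokens += p.tokens
			continue
		}
		out = append(out, point{ts: hour, tokens: p.tokens})
	}
	return out
}

// series applies the "> 6h" rule: short ranges keep minute resolution,
// longer ranges are reduced to hourly buckets.
func series(in []point, d time.Duration) []point {
	if d > 6*time.Hour {
		return downsampleHourly(in)
	}
	return in
}

// demo builds 180 one-minute points (three hours) of one token each and
// returns the point counts for a short and a long query range.
func demo() (short, long int) {
	base := time.Date(2026, 5, 9, 0, 0, 0, 0, time.UTC)
	var pts []point
	for i := 0; i < 180; i++ {
		pts = append(pts, point{ts: base.Add(time.Duration(i) * time.Minute), tokens: 1})
	}
	return len(series(pts, time.Hour)), len(series(pts, 24*time.Hour))
}

func main() {
	fmt.Println(demo()) // 180 3
}
```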

func (*Accumulator) QueryCost

func (a *Accumulator) QueryCost(duration time.Duration) CostTimeSeriesResponse

QueryCost returns historical cost-over-time data for the given duration. For ranges > 6h, data points are downsampled to hourly buckets, matching the Query (token) behavior so charts can share the same time-range selector. The PerClient map on the response is left nil; call QueryCostByClient when the caller asks for per-client grouping.

func (*Accumulator) QueryCostByClient

func (a *Accumulator) QueryCostByClient(duration time.Duration) CostTimeSeriesResponse

QueryCostByClient is QueryCost with per-client grouping enabled. The returned response has its PerClient field populated alongside PerServer so consumers can render either dimension off a single response.

func (*Accumulator) Record

func (a *Accumulator) Record(serverName string, inputTokens, outputTokens int)

Record adds a token usage observation from a tool call. Equivalent to RecordReplica with replicaID=-1 (i.e. do not attribute to a replica).

func (*Accumulator) RecordCost

func (a *Accumulator) RecordCost(serverName string, replicaID int, cost CostBreakdown)

RecordCost adds a per-call USD cost observation alongside the token observation that RecordReplica records. Pass replicaID < 0 to skip the per-replica update, mirroring RecordReplica.

Cost MUST be computed at observation time, not derived from stored token totals at read time: a model change mid-window would otherwise mis-price earlier calls. Cache-read and cache-write components arrive as separate fields on CostBreakdown so the Snapshot shape can preserve the split.
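A small worked example shows why read-time pricing goes wrong. The `call` type and the rates below are hypothetical; costs are expressed as integer micro-USD per token purely to keep the arithmetic exact.

```go
package main

import "fmt"

// call is a hypothetical record of one priced tool call.
type call struct {
	tokens        int64
	microPerToken int64 // rate in effect when the call happened
}

// pricedAtObservation sums each call's cost using the rate it was observed with.
func pricedAtObservation(calls []call) int64 {
	var total int64
	for _, c := range calls {
		total += c.tokens * c.microPerToken
	}
	return total
}

// pricedAtRead derives cost from the stored token total and the current rate,
// which is the approach the text above warns against.
func pricedAtRead(calls []call, currentRate int64) int64 {
	var tokens int64
	for _, c := range calls {
		tokens += c.tokens
	}
	return tokens * currentRate
}

func main() {
	// The model (and therefore the rate) changes between the two calls.
	calls := []call{{tokens: 1000, microPerToken: 3}, {tokens: 1000, microPerToken: 15}}
	fmt.Println(pricedAtObservation(calls)) // 18000
	fmt.Println(pricedAtRead(calls, 15))    // 30000
}
```

Deriving cost at read time charges the first call at the later, higher rate and overstates spend by 12000 micro-USD in this example.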

func (*Accumulator) RecordCostWithClient

func (a *Accumulator) RecordCostWithClient(serverName string, replicaID int, clientID string, cost CostBreakdown)

RecordCostWithClient is the client-aware variant of RecordCost. The cost is added to the per-client cost aggregates and per-client cost ring buffer in addition to the session, per-server, and per-replica aggregates. An empty clientID skips the per-client update.

func (*Accumulator) RecordFormatSavings

func (a *Accumulator) RecordFormatSavings(serverName string, originalTokens, formattedTokens int)

RecordFormatSavings records token counts before and after format conversion. Normal token usage tracking is handled separately by the ToolCallObserver; this method only tracks the format savings delta.

func (*Accumulator) RecordReplica

func (a *Accumulator) RecordReplica(serverName string, replicaID, inputTokens, outputTokens int)

RecordReplica adds a token usage observation attributed to a specific replica. Per-server aggregates are updated in all cases. Pass replicaID < 0 to skip the per-replica update (used for servers that are not part of a replica set).

func (*Accumulator) RecordReplicaWithClient

func (a *Accumulator) RecordReplicaWithClient(serverName string, replicaID int, clientID string, inputTokens, outputTokens int)

RecordReplicaWithClient is the client-aware variant of RecordReplica. It updates the per-client token counters in addition to session, per-server, and per-replica aggregates. An empty clientID skips the per-client update, matching the replicaID < 0 convention so callers without attribution can continue to use the same code path.
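The attribution conventions shared by the Record variants can be summarized in a sketch. The `counters` type and its `record` method are hypothetical, use plain maps, and are not safe for concurrent use, unlike the Accumulator described above.

```go
package main

import "fmt"

// counters is a hypothetical, non-thread-safe stand-in for the aggregates.
type counters struct {
	session    int64
	perServer  map[string]int64
	perReplica map[string]map[int]int64
	perClient  map[string]int64
}

func newCounters() *counters {
	return &counters{
		perServer:  map[string]int64{},
		perReplica: map[string]map[int]int64{},
		perClient:  map[string]int64{},
	}
}

// record always updates session and per-server totals. A negative replicaID
// skips the per-replica update and an empty clientID skips the per-client one.
func (c *counters) record(server string, replicaID int, clientID string, tokens int64) {
	c.session += tokens
	c.perServer[server] += tokens
	if replicaID >= 0 {
		if c.perReplica[server] == nil {
			c.perReplica[server] = map[int]int64{}
		}
		c.perReplica[server][replicaID] += tokens
	}
	if clientID != "" {
		c.perClient[clientID] += tokens
	}
}

// demo records one unattributed call and one fully attributed call.
func demo() (session, server, replica int64, clients int) {
	c := newCounters()
	c.record("fs", -1, "", 10)      // no replica, no client
	c.record("fs", 0, "cursor", 5)  // replica 0, client "cursor"
	return c.session, c.perServer["fs"], c.perReplica["fs"][0], len(c.perClient)
}

func main() {
	fmt.Println(demo()) // 15 15 5 1
}
```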

func (*Accumulator) RecordToolCall

func (a *Accumulator) RecordToolCall(serverName, toolName string)

RecordToolCall increments per-(server, tool) call counters and stamps the last-called timestamp. Used by pkg/optimize's unused_tool heuristic.

An empty serverName or toolName is a no-op so callers without per-tool attribution (legacy ToolCallObserver path) can invoke unconditionally.

func (*Accumulator) ReplaySnapshot

func (a *Accumulator) ReplaySnapshot(serverName string, ts time.Time, inputTokens, outputTokens, costMicro int64)

ReplaySnapshot adds a historical observation to the time-series ring buffers (aggregate + per-server) without touching cumulative counters. Used by telemetry.MetricsFlusher.SeedFromFile to rehydrate per-minute bucket history from each persisted Diff line — the chart shows pre-restart activity continuously alongside live data instead of resetting to a single post-restart point.

costMicro is the rolled-up total cost for the minute (sum of the four CostBreakdown components) in int64 micro-USD, matching the live RecordCost path which also calls addCostToBucket(now, totalMicro). Pass 0 for token-only replays (legacy persistence files predate the cost field). Cost-only replays — non-zero costMicro with zero token counts — are supported so a minute that recorded a priced fixture without token attribution still hydrates its cost bucket on seed.

Cumulative counters are restored separately via Restore + RestoreCost. Calling both with the same source data reproduces the on-disk state.

ts is bucketed to the minute via the same key the live Record path uses, so chronological replay produces one bucket per flush minute and live observations after replay continue advancing the same ring naturally.

func (*Accumulator) Restore

func (a *Accumulator) Restore(perServer map[string]TokenCounts)

Restore replaces per-server token totals with the supplied map and recomputes session totals as the sum across all servers (matching the invariant Record/RecordReplica maintains). Used on daemon startup to repopulate cumulative counters from a persisted metrics.jsonl file.

Existing per-server counters are overwritten for any server present in the map; servers absent from the map retain their current state. Replicas and format-savings counters are not restored — those carry no on-disk equivalent in the snapshot format. Time-series ring buckets are populated separately via ReplaySnapshot.
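The overwrite-and-recompute behavior described above can be sketched with plain maps. The `store` and `tokenCounts` types are hypothetical simplifications; the real Accumulator uses atomic counters.

```go
package main

import "fmt"

// tokenCounts is a hypothetical, simplified mirror of TokenCounts.
type tokenCounts struct {
	input, output int64
}

// store is a hypothetical holder of per-server and session totals.
type store struct {
	perServer map[string]tokenCounts
	session   tokenCounts
}

// restore overwrites counters for servers present in the map, leaves absent
// servers untouched, then recomputes the session total as the sum of all servers.
func (s *store) restore(in map[string]tokenCounts) {
	for name, tc := range in {
		s.perServer[name] = tc
	}
	var sum tokenCounts
	for _, tc := range s.perServer {
		sum.input += tc.input
		sum.output += tc.output
	}
	s.session = sum
}

func main() {
	s := &store{perServer: map[string]tokenCounts{"a": {1, 1}, "b": {2, 2}}}
	s.restore(map[string]tokenCounts{"a": {10, 20}, "c": {5, 5}})
	fmt.Println(s.perServer["a"], s.perServer["b"], s.perServer["c"]) // {10 20} {2 2} {5 5}
	fmt.Println(s.session)                                           // {17 27}
}
```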

func (*Accumulator) RestoreCost

func (a *Accumulator) RestoreCost(perServer map[string]CostMicroUSDCounts)

RestoreCost is the cost analogue of Restore: it overwrites per-server cost component atomics with the supplied map and recomputes session cost totals as the sum across all servers (matching the invariant RecordCost maintains). Used on daemon startup by telemetry.MetricsFlusher.SeedFromFile to repopulate cumulative cost counters from a persisted metrics.jsonl file so the Cost KPI card reflects pre-restart spend the moment the UI loads.

Per-component splitting (input / output / cache-read / cache-write) is preserved on the cumulative atomics so CostSnapshot.Session can render the breakdown without recomputing — same trade-off live RecordCost makes. The time-series ring buffers are populated separately via ReplaySnapshot, which carries only the rolled-up total per bucket.

Servers absent from the map retain their current cost state. Replicas, format-savings, and per-client cost have no on-disk equivalent in the snapshot format and are not restored.

func (*Accumulator) Snapshot

func (a *Accumulator) Snapshot() TokenUsage

Snapshot returns the current token usage summary.

func (*Accumulator) StartedAt

func (a *Accumulator) StartedAt() time.Time

StartedAt returns the wall-clock time the accumulator was created. Clear and ClearCost do not reset this value — the start-of-observation window stays anchored to the gateway lifetime, which is what pkg/optimize uses to gate "<24h of data" findings.
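A consumer could use the start time to gate findings on the amount of data collected, roughly as below. The function `hasEnoughData` is hypothetical and does not reflect how pkg/optimize is actually written; it only illustrates the comparison against a minimum window.

```go
package main

import (
	"fmt"
	"time"
)

// hasEnoughData reports whether the observation window that began at
// startedAt is at least minWindow long as of now.
func hasEnoughData(startedAt, now time.Time, minWindow time.Duration) bool {
	return now.Sub(startedAt) >= minWindow
}

// demo evaluates the gate 2 hours and 36 hours after start.
func demo() (early, late bool) {
	start := time.Date(2026, 5, 9, 0, 0, 0, 0, time.UTC)
	early = hasEnoughData(start, start.Add(2*time.Hour), 24*time.Hour)
	late = hasEnoughData(start, start.Add(36*time.Hour), 24*time.Hour)
	return early, late
}

func main() {
	fmt.Println(demo()) // false true
}
```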

func (*Accumulator) ToolUsageSnapshot

func (a *Accumulator) ToolUsageSnapshot() map[string]map[string]ToolStat

ToolUsageSnapshot returns a deep copy of the per-(server, tool) call counters. Empty when no per-tool calls have been recorded (typical for gateways still on the legacy ToolCallObserver path).

type CostBreakdown

type CostBreakdown struct {
	Input      float64
	Output     float64
	CacheRead  float64
	CacheWrite float64
}

CostBreakdown is the per-call USD cost split passed to RecordCost. Cache fields are priced separately from input tokens to match LiteLLM's cache rate fields — conflating them mis-prices providers like Anthropic by roughly an order of magnitude.

func (CostBreakdown) IsValid

func (c CostBreakdown) IsValid() bool

IsValid reports whether every component is finite and non-negative. A misconfigured Source could in theory return NaN/Inf rates or a negative Calculate result; recording those into atomic counters would permanently corrupt the snapshot. RecordCost drops invalid breakdowns.
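A check of this kind might look like the following sketch. The lowercase `costBreakdown` type and `isValid` method are local stand-ins written for illustration, not the package's implementation.

```go
package main

import (
	"fmt"
	"math"
)

// costBreakdown is a local mirror of the documented CostBreakdown fields.
type costBreakdown struct {
	Input, Output, CacheRead, CacheWrite float64
}

// finiteNonNegative reports whether v is a real number that is >= 0.
func finiteNonNegative(v float64) bool {
	return !math.IsNaN(v) && !math.IsInf(v, 0) && v >= 0
}

// isValid reports whether every component is finite and non-negative.
func (c costBreakdown) isValid() bool {
	return finiteNonNegative(c.Input) &&
		finiteNonNegative(c.Output) &&
		finiteNonNegative(c.CacheRead) &&
		finiteNonNegative(c.CacheWrite)
}

func main() {
	fmt.Println(costBreakdown{Input: 0.01}.isValid())            // true
	fmt.Println(costBreakdown{Output: math.NaN()}.isValid())     // false
	fmt.Println(costBreakdown{CacheRead: math.Inf(1)}.isValid()) // false
	fmt.Println(costBreakdown{CacheWrite: -0.5}.isValid())       // false
}
```

Rejecting NaN matters in particular because NaN propagates through every later addition, so a single bad observation would poison a cumulative total.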

func (CostBreakdown) IsZero

func (c CostBreakdown) IsZero() bool

IsZero reports whether all components are zero. Used by RecordCost to short-circuit accumulator updates when a tool call has no priceable usage (unknown model, all-zero token counts).

type CostCounts

type CostCounts struct {
	InputUSD      float64 `json:"input_usd"`
	OutputUSD     float64 `json:"output_usd"`
	CacheReadUSD  float64 `json:"cache_read_usd,omitempty"`
	CacheWriteUSD float64 `json:"cache_write_usd,omitempty"`
	TotalUSD      float64 `json:"total_usd"`
}

CostCounts is the snapshot shape for a single dimension (session, per-server, per-replica) of cost accumulation. All values are USD. Cache fields are omitempty so consumers that only care about input/output costs are not forced to render zeroes.
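The effect of omitempty on the cache fields can be seen by marshaling the struct with the standard encoding/json package. The struct below is copied from the definition above; the `render` helper and the sample values are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CostCounts is copied from the documented definition.
type CostCounts struct {
	InputUSD      float64 `json:"input_usd"`
	OutputUSD     float64 `json:"output_usd"`
	CacheReadUSD  float64 `json:"cache_read_usd,omitempty"`
	CacheWriteUSD float64 `json:"cache_write_usd,omitempty"`
	TotalUSD      float64 `json:"total_usd"`
}

// render marshals c to JSON, returning an empty string on error.
func render(c CostCounts) string {
	b, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Zero cache fields are dropped from the output.
	fmt.Println(render(CostCounts{InputUSD: 0.5, OutputUSD: 0.25, TotalUSD: 0.75}))
	// {"input_usd":0.5,"output_usd":0.25,"total_usd":0.75}

	// A non-zero cache field is emitted.
	fmt.Println(render(CostCounts{InputUSD: 0.5, OutputUSD: 0.25, CacheReadUSD: 0.125, TotalUSD: 0.875}))
	// {"input_usd":0.5,"output_usd":0.25,"cache_read_usd":0.125,"total_usd":0.875}
}
```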

type CostDataPoint

type CostDataPoint struct {
	Timestamp time.Time `json:"timestamp"`
	USD       float64   `json:"usd"`
}

CostDataPoint is the time-series shape for cost-over-time queries.

type CostMicroUSDCounts

type CostMicroUSDCounts struct {
	InputMicroUSD      int64 `json:"input_micro_usd,omitempty"`
	OutputMicroUSD     int64 `json:"output_micro_usd,omitempty"`
	CacheReadMicroUSD  int64 `json:"cache_read_micro_usd,omitempty"`
	CacheWriteMicroUSD int64 `json:"cache_write_micro_usd,omitempty"`
}

CostMicroUSDCounts is the int64 micro-USD shape used by the persistence layer to round-trip the four cost components without float precision loss. Mirrors the in-memory atomic representation on serverCounters.
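The motivation for integer micro-USD can be illustrated with a short sketch. The helpers `toMicro` and `toUSD` are hypothetical, and rounding to the nearest micro-USD is an assumption; the package's actual conversion is not documented here.

```go
package main

import (
	"fmt"
	"math"
)

// toMicro converts USD to whole micro-USD, rounding to the nearest unit.
// The rounding mode is an assumption of this sketch.
func toMicro(usd float64) int64 {
	return int64(math.Round(usd * 1e6))
}

// toUSD converts micro-USD back to a float USD value for display.
func toUSD(micro int64) float64 {
	return float64(micro) / 1e6
}

// sums adds ten observations of $0.10 as floats and as micro-USD integers.
func sums() (floatSum float64, microSum int64) {
	for i := 0; i < 10; i++ {
		floatSum += 0.1
		microSum += toMicro(0.1)
	}
	return floatSum, microSum
}

func main() {
	f, m := sums()
	fmt.Println(f)        // float accumulation may drift away from exactly 1
	fmt.Println(m)        // 1000000
	fmt.Println(toUSD(m)) // 1
}
```

Integer addition is exact, so a value written to disk and read back compares equal to the in-memory counter, which is what makes diffing against a previous snapshot reliable.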

func (CostMicroUSDCounts) IsZero

func (c CostMicroUSDCounts) IsZero() bool

IsZero reports whether all four cost components are zero.

func (CostMicroUSDCounts) TotalMicroUSD

func (c CostMicroUSDCounts) TotalMicroUSD() int64

TotalMicroUSD returns the rolled-up sum of the four components — the shape ReplaySnapshot stores per bucket, matching addCostToBucket's live behavior of writing a single total per minute.

type CostTimeSeriesResponse

type CostTimeSeriesResponse struct {
	Range     string                     `json:"range"`
	Interval  string                     `json:"interval"`
	Points    []CostDataPoint            `json:"data_points"`
	PerServer map[string][]CostDataPoint `json:"per_server"`
	// PerClient groups cost over time by originating MCP client. Populated
	// only when the API caller requests per-client grouping (the
	// `per_client=true` query parameter on /api/metrics/cost) so the JSON
	// stays compact for the common per-server view.
	PerClient map[string][]CostDataPoint `json:"per_client,omitempty"`
}

CostTimeSeriesResponse is the cost analogue of TimeSeriesResponse. The `Range` and `Interval` strings reuse the same vocabulary as the token time-series so charts can share a time-range selector.

type CostUsage

type CostUsage struct {
	Session    CostCounts                    `json:"session"`
	PerServer  map[string]CostCounts         `json:"per_server"`
	PerReplica map[string]map[int]CostCounts `json:"per_replica,omitempty"`
	// PerClient groups USD cost by the originating MCP client. omitempty so
	// pre-attribution consumers keep their existing JSON shape.
	PerClient map[string]CostCounts `json:"per_client,omitempty"`
}

CostUsage is the top-level cost snapshot. The shape mirrors TokenUsage so API consumers can render cost charts beside token charts.

type DataPoint

type DataPoint struct {
	Timestamp    time.Time `json:"timestamp"`
	InputTokens  int64     `json:"input_tokens"`
	OutputTokens int64     `json:"output_tokens"`
	TotalTokens  int64     `json:"total_tokens"`
}

DataPoint is a single time-series data point with token counts.

type FormatSavings

type FormatSavings struct {
	OriginalTokens  int64   `json:"original_tokens"`
	FormattedTokens int64   `json:"formatted_tokens"`
	SavedTokens     int64   `json:"saved_tokens"`
	SavingsPercent  float64 `json:"savings_percent"`
}

FormatSavings tracks token savings from output formatting.
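One plausible way the derived fields relate to the raw counts is sketched below. The formulas are an assumption (saved = original minus formatted, percent = saved / original × 100) and are not confirmed by the documentation above; the `savings` function is hypothetical.

```go
package main

import "fmt"

// savings derives the saved token count and savings percentage from the
// token counts before and after formatting. The formulas are assumed.
func savings(original, formatted int64) (saved int64, percent float64) {
	saved = original - formatted
	if original > 0 { // avoid dividing by zero when nothing was recorded
		percent = float64(saved) / float64(original) * 100
	}
	return saved, percent
}

func main() {
	fmt.Println(savings(1000, 750)) // 250 25
	fmt.Println(savings(0, 0))      // 0 0
}
```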

type ModelResolver

type ModelResolver func(serverName string) string

ModelResolver returns the configured model ID for a server, or "" when the server has no model attribution. Used by the Observer when a tool result does not carry a model in its CallUsage metadata. Resolvers must be safe for concurrent calls.
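A resolver backed by a fixed mapping is one simple way to meet the concurrency requirement. In the sketch below the `staticResolver` constructor, the server name, and the model ID are all hypothetical; the type definition is copied from above so the example stands alone.

```go
package main

import "fmt"

// ModelResolver is copied from the documented definition.
type ModelResolver func(serverName string) string

// staticResolver returns a resolver over a private copy of models. The copy
// is never written after construction, so concurrent lookups are safe and
// later changes to the caller's map have no effect.
func staticResolver(models map[string]string) ModelResolver {
	frozen := make(map[string]string, len(models))
	for server, model := range models {
		frozen[server] = model
	}
	return func(serverName string) string {
		return frozen[serverName] // "" for servers without attribution
	}
}

func main() {
	resolve := staticResolver(map[string]string{"search": "example-model"})
	fmt.Printf("%q %q\n", resolve("search"), resolve("unknown")) // "example-model" ""
}
```

A resolver whose mapping changes at runtime would need its own synchronization, for example a sync.RWMutex around the map.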

type Observer

type Observer struct {
	// contains filtered or unexported fields
}

Observer implements mcp.ToolCallObserver and mcp.ClientObserver by counting tokens, pricing the call against the active pricing.Source, and recording both into an Accumulator.

func NewObserver

func NewObserver(counter token.Counter, accumulator *Accumulator) *Observer

NewObserver creates a ToolCallObserver that counts tokens and records metrics. The cost path is wired but stays inert for calls without model attribution: RecordCost is called only when a model is available, either from CallUsage on the tool result or from a resolver installed via SetModelResolver, and that model is known to the active pricing.Source.

func (*Observer) ObserveToolCall

func (o *Observer) ObserveToolCall(serverName string, replicaID int, arguments map[string]any, result *mcp.ToolCallResult)

ObserveToolCall counts input/output tokens and records them, then prices the call against the active pricing.Source and records the per-component USD breakdown alongside the tokens.

The cost path is best-effort: a call against an unknown model records tokens normally and skips RecordCost. Cache-read and cache-write tokens reported in result._meta (CallUsage) are priced via the provider's cache rates rather than rolled into the input rate.

func (*Observer) ObserveToolCallWithClient

func (o *Observer) ObserveToolCallWithClient(_ context.Context, obs mcp.ToolCallObservation) mcp.ToolCallSummary

ObserveToolCallWithClient is the ClientObserver entry point. It records the same tokens + cost as ObserveToolCall, additionally attributes them to the supplied client, and returns a summary the gateway uses to populate OTel GenAI semantic span attributes without re-counting tokens.

func (*Observer) SetModelResolver

func (o *Observer) SetModelResolver(r ModelResolver)

SetModelResolver installs the server -> model resolver used as a fallback when a tool result does not carry a model in its CallUsage. Passing nil clears the resolver, after which only call-level model attribution is honored.

type TimeSeriesResponse

type TimeSeriesResponse struct {
	Range     string                 `json:"range"`
	Interval  string                 `json:"interval"`
	Points    []DataPoint            `json:"data_points"`
	PerServer map[string][]DataPoint `json:"per_server"`
}

TimeSeriesResponse is returned by the historical metrics endpoint.

type TokenCounts

type TokenCounts struct {
	InputTokens  int64 `json:"input_tokens"`
	OutputTokens int64 `json:"output_tokens"`
	TotalTokens  int64 `json:"total_tokens"`
}

TokenCounts holds input/output/total token counts.

type TokenUsage

type TokenUsage struct {
	Session    TokenCounts                    `json:"session"`
	PerServer  map[string]TokenCounts         `json:"per_server"`
	PerReplica map[string]map[int]TokenCounts `json:"per_replica,omitempty"`
	// PerClient groups token usage by the originating MCP client (for example
	// "claude-code", "cursor"). The field is omitempty so consumers built
	// before per-client attribution shipped continue to see the same JSON
	// shape. Future per-user / per-team dimensions land as sibling fields
	// (per_user, per_team) under this same shape rather than reshaping
	// per_client.
	PerClient     map[string]TokenCounts `json:"per_client,omitempty"`
	FormatSavings FormatSavings          `json:"format_savings"`
}

TokenUsage is the top-level token usage snapshot returned by the API.

type ToolStat

type ToolStat struct {
	Calls        int64     `json:"calls"`
	LastCalledAt time.Time `json:"last_called_at,omitempty"`
}

ToolStat is the snapshot shape for per-(server, tool) call tracking. Used by pkg/optimize to detect tools that have not seen any calls inside a freshness window. Calls is the cumulative count since the accumulator was created or last cleared; LastCalledAt is the wall-clock time the most recent call was recorded, or the zero value when no calls have been recorded.
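A freshness check over these fields might look like the sketch below. The function `unusedSince` is hypothetical and is not the heuristic pkg/optimize actually implements; the struct is a copy of the documented fields without JSON tags.

```go
package main

import (
	"fmt"
	"time"
)

// ToolStat mirrors the documented fields (JSON tags omitted for brevity).
type ToolStat struct {
	Calls        int64
	LastCalledAt time.Time
}

// unusedSince reports whether the tool has no recorded call within the
// window ending at now. A zero LastCalledAt means it was never called.
func unusedSince(s ToolStat, now time.Time, window time.Duration) bool {
	if s.Calls == 0 || s.LastCalledAt.IsZero() {
		return true
	}
	return now.Sub(s.LastCalledAt) > window
}

// demo checks a never-called tool, a stale tool, and a recently used tool
// against a 24-hour window.
func demo() (never, stale, fresh bool) {
	now := time.Date(2026, 5, 9, 12, 0, 0, 0, time.UTC)
	window := 24 * time.Hour
	never = unusedSince(ToolStat{}, now, window)
	stale = unusedSince(ToolStat{Calls: 3, LastCalledAt: now.Add(-48 * time.Hour)}, now, window)
	fresh = unusedSince(ToolStat{Calls: 1, LastCalledAt: now.Add(-time.Hour)}, now, window)
	return never, stale, fresh
}

func main() {
	fmt.Println(demo()) // true true false
}
```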
