Documentation

Overview
Package pricing computes USD cost for MCP tool calls using model rate tables sourced from LiteLLM's `model_prices_and_context_window.json`.
The package is intentionally small: it owns model-rate lookup, model-ID normalization, and cost computation. It has no dependencies on other gridctl packages so any layer (gateway, CLI, web API) can call it.
The default Source is backed by an embedded snapshot of LiteLLM's pricing JSON. To swap in an alternate source (a deterministic fixture in tests, a future Anthropic/OpenAI native source, or a community-maintained JSON file), call SetSource. The package-level Lookup and Calculate functions read through the active Source via an atomic pointer, so swaps are safe under concurrent readers.
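A minimal sketch of the swap mechanism described above, assuming a simplified Source interface; the fixtureSource type and the exact field set are invented for illustration:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Rates and Source are simplified stand-ins for the package's real types.
type Rates struct{ InputPerToken float64 }

type Source interface {
	Name() string
	Lookup(model string) (Rates, bool)
}

// fixtureSource is the kind of deterministic source a test might inject.
type fixtureSource struct{ rates map[string]Rates }

func (s fixtureSource) Name() string { return "fixture" }
func (s fixtureSource) Lookup(m string) (Rates, bool) {
	r, ok := s.rates[m]
	return r, ok
}

// active holds the current Source. atomic.Pointer makes SetSource safe to
// call while other goroutines are concurrently reading through Lookup.
var active atomic.Pointer[Source]

func SetSource(s Source) { active.Store(&s) }

func Lookup(model string) (Rates, bool) {
	return (*active.Load()).Lookup(model)
}

func main() {
	SetSource(fixtureSource{rates: map[string]Rates{"test-model": {InputPerToken: 1e-6}}})
	r, ok := Lookup("test-model")
	fmt.Println(ok, r.InputPerToken)
}
```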
Cache-read and cache-write tokens are priced separately from input tokens using the LiteLLM cache_read_input_token_cost and cache_creation_input_token_cost fields. Conflating them with input tokens mis-prices providers like Anthropic by roughly an order of magnitude, because cache-read rates are roughly 10% and cache-write rates roughly 125% of the input rate.
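To see why the conflation matters, here is the arithmetic using Anthropic's published Claude 3.5 Sonnet rates as an illustrative assumption (these numbers are not shipped in this package):

```go
package main

import "fmt"

func main() {
	// Illustrative per-token USD rates, per Anthropic's published
	// Claude 3.5 Sonnet pricing.
	input := 3.00 / 1e6      // $3.00 per million input tokens
	cacheRead := 0.30 / 1e6  // ~10% of the input rate
	cacheWrite := 3.75 / 1e6 // ~125% of the input rate
	_ = cacheWrite

	readTokens := 100000.0

	naive := readTokens * input       // cache reads billed as plain input
	correct := readTokens * cacheRead // cache reads billed at the cache rate
	fmt.Printf("naive=$%.4f correct=$%.4f (%.0fx overcharge)\n",
		naive, correct, naive/correct)
}
```

On a cache-heavy Anthropic session, most tokens are cache reads, so billing them at the input rate inflates the total by close to that 10x factor.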
Index

Constants
This section is empty.

Variables
This section is empty.

Functions

Types
type Cost
Cost is the per-component USD breakdown for a single call. Total returns the sum across all components.
func CalculateBreakdown
CalculateBreakdown returns the per-component USD cost. Cache-read and cache-write tokens are priced separately from input tokens; callers that only want a session total should use Calculate.
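The documentation above does not show CalculateBreakdown's signature; the sketch below assumes it takes Rates plus per-class token counts, which matches the described behavior but is not the package's actual API:

```go
package main

import "fmt"

// Cost mirrors the documented per-component breakdown; the field names
// are assumptions for illustration.
type Cost struct {
	Input, Output, CacheRead, CacheWrite float64
}

// Total returns the sum across all components.
func (c Cost) Total() float64 {
	return c.Input + c.Output + c.CacheRead + c.CacheWrite
}

type Rates struct {
	InputPerToken, OutputPerToken, CacheReadPerToken, CacheWritePerToken float64
}

// CalculateBreakdown prices each token class at its own rate, so cache
// reads and writes never get billed at the plain input rate.
func CalculateBreakdown(r Rates, in, out, cacheRead, cacheWrite int64) Cost {
	return Cost{
		Input:      float64(in) * r.InputPerToken,
		Output:     float64(out) * r.OutputPerToken,
		CacheRead:  float64(cacheRead) * r.CacheReadPerToken,
		CacheWrite: float64(cacheWrite) * r.CacheWritePerToken,
	}
}

func main() {
	r := Rates{InputPerToken: 3e-6, OutputPerToken: 15e-6,
		CacheReadPerToken: 0.3e-6, CacheWritePerToken: 3.75e-6}
	c := CalculateBreakdown(r, 1000, 500, 10000, 2000)
	fmt.Printf("total=$%.6f\n", c.Total())
}
```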
type LiteLLMSource

type LiteLLMSource struct {
	// contains filtered or unexported fields
}
LiteLLMSource is a Source backed by an embedded snapshot of LiteLLM's model_prices_and_context_window.json. The full file is parsed once at construction and held in memory; lookups are constant-time map reads.
func NewLiteLLMSource
func NewLiteLLMSource() *LiteLLMSource
NewLiteLLMSource parses the embedded LiteLLM pricing data and returns a ready-to-use Source. Parsing failures fall back to an empty rate table (every Lookup returns false). Failures are logged at WARN; callers do not receive an error so the gateway can still start when pricing data is malformed.
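The fallback behavior can be sketched as follows; newSourceFromJSON is a hypothetical stand-in for the embedded-data parse step, and the JSON field names are taken from LiteLLM's published schema:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log/slog"
)

type Rates struct {
	InputPerToken  float64 `json:"input_cost_per_token"`
	OutputPerToken float64 `json:"output_cost_per_token"`
}

// newSourceFromJSON parses a LiteLLM-shaped rate table. On malformed
// input it logs at WARN and returns an empty table instead of an error,
// so the caller can always construct a working (if useless) Source.
func newSourceFromJSON(raw []byte) map[string]Rates {
	var table map[string]Rates
	if err := json.Unmarshal(raw, &table); err != nil {
		slog.Warn("pricing data malformed; using empty rate table", "err", err)
		return map[string]Rates{} // every Lookup will return false
	}
	return table
}

func main() {
	good := []byte(`{"some-model": {"input_cost_per_token": 3e-06}}`)
	bad := []byte(`{not json`)
	fmt.Println(len(newSourceFromJSON(good)), len(newSourceFromJSON(bad)))
}
```

Swallowing the error is a deliberate availability trade-off: a gateway with no pricing is preferable to a gateway that refuses to start.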
func (*LiteLLMSource) Lookup
func (s *LiteLLMSource) Lookup(model string) (Rates, bool)
Lookup returns the per-token rates for the given model ID. It probes the rate table in order from most specific to most general:
- The exact normalized form (provider stripped, lower-cased).
- The same form with any trailing -YYYYMMDD date suffix removed.
- A small alias table for IDs that diverge by more than the prefix/date heuristics handle.
The lookup path is allocation-free for already-canonical IDs (e.g. "claude-opus-4-7") so the cost path can run on the gateway hot path without per-call GC pressure.
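The three-probe order can be sketched with a toy rate table; the helper names (normalize, stripDate) and the table contents are invented for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// Toy fixtures standing in for the real rate and alias tables.
var table = map[string]bool{"claude-3-opus": true}
var alias = map[string]string{"claude-3-opus-latest": "claude-3-opus"}

// normalize strips an optional "provider/" prefix and lower-cases the rest.
func normalize(id string) string {
	if i := strings.LastIndexByte(id, '/'); i >= 0 {
		id = id[i+1:]
	}
	return strings.ToLower(id)
}

// stripDate removes a trailing -YYYYMMDD suffix, if one is present.
func stripDate(id string) string {
	if len(id) > 9 && id[len(id)-9] == '-' {
		for _, c := range id[len(id)-8:] {
			if c < '0' || c > '9' {
				return id
			}
		}
		return id[:len(id)-9]
	}
	return id
}

func lookup(model string) (string, bool) {
	id := normalize(model)
	if table[id] { // 1. exact normalized form
		return id, true
	}
	if d := stripDate(id); table[d] { // 2. date suffix removed
		return d, true
	}
	if a, ok := alias[id]; ok && table[a] { // 3. alias table
		return a, true
	}
	return "", false
}

func main() {
	fmt.Println(lookup("anthropic/claude-3-opus-20240229"))
	fmt.Println(lookup("claude-3-opus-latest"))
}
```

Note that for an ID that is already canonical and lower-case, normalize and stripDate return their input unchanged, which is what keeps the hot path allocation-free.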
type Rates

type Rates struct {
	InputPerToken      float64
	OutputPerToken     float64
	CacheReadPerToken  float64
	CacheWritePerToken float64
}
Rates are the per-token USD prices for a single model. Cache fields are zero when the provider does not publish cache rates; in that case any reported cache tokens are treated as free (LiteLLM omits the fields rather than zero-pricing them, but the caller cannot distinguish "free" from "absent" — pricing is best-effort, not a billing source of truth).
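For example, a provider that publishes no cache rates decodes to zero-valued cache fields, so any reported cache tokens contribute nothing to the total (a small sketch of the documented zero-value behavior):

```go
package main

import "fmt"

type Rates struct {
	InputPerToken, OutputPerToken, CacheReadPerToken, CacheWritePerToken float64
}

func main() {
	// No published cache rates: the cache fields stay at their zero value.
	r := Rates{InputPerToken: 2e-6, OutputPerToken: 10e-6}
	cacheRead, cacheWrite := 50000.0, 10000.0
	cost := cacheRead*r.CacheReadPerToken + cacheWrite*r.CacheWritePerToken
	fmt.Printf("cache cost: $%.2f\n", cost) // cache tokens are treated as free
}
```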
type Source
Source is the abstraction for a per-model rate table. Implementations are expected to be safe for concurrent Lookup. Name is a short identifier (e.g. "litellm") used in logs and diagnostic output.
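Reconstructed from that description, the interface might look like the following; the exact method set is an assumption, and staticSource is an invented fixture:

```go
package main

import "fmt"

type Rates struct {
	InputPerToken, OutputPerToken, CacheReadPerToken, CacheWritePerToken float64
}

// Source pairs a concurrent-safe rate lookup with a short identifier
// for logs and diagnostics.
type Source interface {
	Lookup(model string) (Rates, bool)
	Name() string
}

// staticSource is a deterministic fixture of the kind a test might pass
// to SetSource. A plain map is safe for concurrent reads as long as it
// is never mutated after construction.
type staticSource map[string]Rates

func (s staticSource) Lookup(m string) (Rates, bool) { r, ok := s[m]; return r, ok }
func (s staticSource) Name() string                  { return "static" }

func main() {
	var src Source = staticSource{"m": {InputPerToken: 1e-6}}
	r, ok := src.Lookup("m")
	fmt.Println(src.Name(), ok, r.InputPerToken)
}
```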
func CurrentSource
func CurrentSource() Source
CurrentSource returns the active Source. Useful for diagnostics — most callers should use Lookup or Calculate.