pricing

package
v0.1.0-beta.9 Latest
Warning: This package is not in the latest version of its module.
Published: May 9, 2026 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Overview

Package pricing computes USD cost for MCP tool calls using model rate tables sourced from LiteLLM's `model_prices_and_context_window.json`.

The package is intentionally small: it owns model-rate lookup, model-ID normalization, and cost computation. It has no dependencies on other gridctl packages, so any layer (gateway, CLI, web API) can call it.

The default Source is backed by an embedded snapshot of LiteLLM's pricing JSON. To swap in an alternate source (a deterministic fixture in tests, a future Anthropic/OpenAI native source, a community-maintained JSON), call SetSource. The package-level Lookup and Calculate functions read through the active Source via an atomic pointer so swaps are safe under concurrent readers.
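The atomic-pointer read-through can be sketched roughly as below. This is a hypothetical mirror of the mechanism, not the package's actual internals: the `fixtureSource` type, the `active` variable, and the trimmed-down `Rates` struct are illustrative stand-ins.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Trimmed-down stand-ins for the package's types (illustrative only).
type Rates struct{ InputPerToken float64 }

type Source interface {
	Lookup(model string) (Rates, bool)
	Name() string
}

// fixtureSource is a deterministic Source of the kind tests would inject.
type fixtureSource struct{ rates map[string]Rates }

func (f fixtureSource) Lookup(m string) (Rates, bool) { r, ok := f.rates[m]; return r, ok }
func (f fixtureSource) Name() string                  { return "fixture" }

// The active Source lives behind an atomic.Pointer: SetSource stores a new
// pointer, concurrent Lookup callers load it without a mutex.
var active atomic.Pointer[Source]

func SetSource(s Source) { active.Store(&s) }

func Lookup(model string) (Rates, bool) { return (*active.Load()).Lookup(model) }

func main() {
	SetSource(fixtureSource{rates: map[string]Rates{"test-model": {InputPerToken: 1e-6}}})
	r, ok := Lookup("test-model")
	fmt.Println(ok, r.InputPerToken)
}
```

A swap is a single pointer store, so readers never observe a half-updated rate table; they see either the old Source or the new one in its entirety.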

Cache-read and cache-write tokens are priced separately from input tokens using the LiteLLM cache_read_input_token_cost and cache_creation_input_token_cost fields. Conflating them with input tokens mis-prices providers like Anthropic by roughly an order of magnitude because cache rates are ~10% / ~125% of input rates.
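A worked example makes the mispricing concrete. The rates below are hypothetical Anthropic-like numbers chosen to match the ~10% / ~125% ratios, not values from the embedded table:

```go
package main

import "fmt"

func main() {
	const (
		inPerTok         = 3e-6    // $3 per million input tokens (hypothetical)
		outPerTok        = 15e-6   // $15 per million output tokens
		cacheReadPerTok  = 0.3e-6  // 10% of the input rate
		cacheWritePerTok = 3.75e-6 // 125% of the input rate
	)
	in, out, cr, cw := 1000.0, 500.0, 10000.0, 2000.0

	// Cache tokens priced at their own rates.
	correct := in*inPerTok + out*outPerTok + cr*cacheReadPerTok + cw*cacheWritePerTok
	// Cache tokens conflated with input tokens.
	conflated := (in+cr+cw)*inPerTok + out*outPerTok

	fmt.Printf("correct:   $%.4f\n", correct)   // $0.0210
	fmt.Printf("conflated: $%.4f\n", conflated) // $0.0465
}
```

With a cache-read-heavy call like this one, conflation overprices the cache-read component by 10x and more than doubles the total.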

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Calculate

func Calculate(model string, usage Usage) (float64, bool)

Calculate returns the total USD cost for a tool call. When the model has no pricing data the second return is false and the cost is zero — callers should treat that as "pricing unavailable" rather than "free."

func SetSource

func SetSource(s Source)

SetSource swaps the package-level Source used by Lookup and Calculate. Safe to call from any goroutine; subsequent Lookup/Calculate calls observe the new Source after the store completes.

Types

type Cost

type Cost struct {
	Input      float64
	Output     float64
	CacheRead  float64
	CacheWrite float64
}

Cost is the per-component USD breakdown for a single call. Total returns the sum across all components.

func CalculateBreakdown

func CalculateBreakdown(model string, usage Usage) (Cost, bool)

CalculateBreakdown returns the per-component USD cost. Cache-read and cache-write tokens are priced separately from input tokens; callers that only want a session total should use Calculate.

func (Cost) Total

func (c Cost) Total() float64

Total returns the sum of all component costs.
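A minimal sketch of the Cost shape and its Total method, using the component values from a plausible call (the struct mirrors the declaration above; the numbers are illustrative):

```go
package main

import "fmt"

// Cost mirrors the documented per-component USD breakdown.
type Cost struct {
	Input      float64
	Output     float64
	CacheRead  float64
	CacheWrite float64
}

// Total sums all four components.
func (c Cost) Total() float64 { return c.Input + c.Output + c.CacheRead + c.CacheWrite }

func main() {
	c := Cost{Input: 0.003, Output: 0.0075, CacheRead: 0.003, CacheWrite: 0.0075}
	fmt.Printf("$%.4f\n", c.Total()) // $0.0210
}
```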

type LiteLLMSource

type LiteLLMSource struct {
	// contains filtered or unexported fields
}

LiteLLMSource is a Source backed by an embedded snapshot of LiteLLM's model_prices_and_context_window.json. The full file is parsed once at construction and held in memory; lookups are constant-time map reads.

func NewLiteLLMSource

func NewLiteLLMSource() *LiteLLMSource

NewLiteLLMSource parses the embedded LiteLLM pricing data and returns a ready-to-use Source. Parsing failures fall back to an empty rate table (every Lookup returns false). Failures are logged at WARN; callers do not receive an error so the gateway can still start when pricing data is malformed.

func (*LiteLLMSource) Lookup

func (s *LiteLLMSource) Lookup(model string) (Rates, bool)

Lookup returns the per-token rates for the given model ID. It probes the rate table in order from most specific to most general:

  1. The exact normalized form (provider stripped, lower-cased).
  2. The same form with any trailing -YYYYMMDD date suffix removed.
  3. A small alias table for IDs that diverge by more than the prefix/date heuristics handle.

The lookup path is allocation-free for already-canonical IDs (e.g. "claude-opus-4-7") so the cost path can run on the gateway hot path without per-call GC pressure.
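The three probes can be sketched as follows. The `normalize` and `lookup` helpers and the alias entry are hypothetical illustrations of the documented order, not the package's actual code:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Trailing -YYYYMMDD date suffix, e.g. "-20240620".
var dateSuffix = regexp.MustCompile(`-\d{8}$`)

// Alias table for IDs the prefix/date heuristics can't bridge (illustrative entry).
var aliases = map[string]string{
	"claude-3.5-sonnet": "claude-3-5-sonnet",
}

// normalize strips a "provider/" prefix and lower-cases the ID.
func normalize(model string) string {
	if i := strings.LastIndex(model, "/"); i >= 0 {
		model = model[i+1:]
	}
	return strings.ToLower(model)
}

func lookup(table map[string]float64, model string) (float64, bool) {
	id := normalize(model)
	if r, ok := table[id]; ok { // 1. exact normalized form
		return r, true
	}
	if bare := dateSuffix.ReplaceAllString(id, ""); bare != id { // 2. date suffix removed
		if r, ok := table[bare]; ok {
			return r, true
		}
	}
	if alias, ok := aliases[id]; ok { // 3. alias table
		if r, ok := table[alias]; ok {
			return r, true
		}
	}
	return 0, false
}

func main() {
	table := map[string]float64{"claude-3-5-sonnet": 3e-6}
	r, ok := lookup(table, "anthropic/claude-3-5-sonnet-20240620")
	fmt.Println(ok, r) // true 3e-06
}
```

Note that an already-canonical ID hits probe 1 without any string rewriting, which is what keeps that path allocation-free.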

func (*LiteLLMSource) Name

func (s *LiteLLMSource) Name() string

Name returns "litellm".

type Rates

type Rates struct {
	InputPerToken      float64
	OutputPerToken     float64
	CacheReadPerToken  float64
	CacheWritePerToken float64
}

Rates are the per-token USD prices for a single model. Cache fields are zero when the provider does not publish cache rates; in that case any reported cache tokens are treated as free (LiteLLM omits the fields rather than zero-pricing them, but the caller cannot distinguish "free" from "absent" — pricing is best-effort, not a billing source of truth).

func Lookup

func Lookup(model string) (Rates, bool)

Lookup returns the per-token rates for a model, normalizing the model ID against the active Source's known keys. Returns (zero Rates, false) when the model is unknown.

type Source

type Source interface {
	Lookup(model string) (Rates, bool)
	Name() string
}

Source is the abstraction for a per-model rate table. Implementations are expected to be safe for concurrent Lookup. Name is a short identifier (e.g. "litellm") used in logs and diagnostic output.

func CurrentSource

func CurrentSource() Source

CurrentSource returns the active Source. Useful for diagnostics — most callers should use Lookup or Calculate.

type Usage

type Usage struct {
	InputTokens      int
	OutputTokens     int
	CacheReadTokens  int
	CacheWriteTokens int
}

Usage is the per-call token breakdown supplied to Calculate. Cache fields default to zero and are priced separately from InputTokens.
