Documentation
Overview ¶
Package pricing owns the pricing/model-parameters catalog (compartments A + B + E): canonical pricing rows fetched from the upstream datasheet, per-provider datasheet-derived model views, supported request types and parameters, and scoped pricing overrides. It also computes per-response cost.
The package performs no list-models I/O; that is the live store's domain. The hourly sync ticker lives on the composer (ModelCatalog), not here; the composer calls SyncFromURL, LoadFromDB, and the model-parameter methods (SyncModelParamsFromURL, LoadModelParamsFromDB). Reads are hot-path and lock-free where possible.
Index ¶
- Constants
- type Config
- type Entry
- type LookupScopes
- type MatchType
- type Options
- type Override
- type ScopeKind
- type Store
- func (s *Store) BaseModelName(model string) string
- func (s *Store) CalculateCost(result *schemas.BifrostResponse, scopes *LookupScopes) float64
- func (s *Store) CalculateCostForUsage(usage *schemas.BifrostLLMUsage, provider schemas.ModelProvider, model string, ...) float64
- func (s *Store) DatasheetModelsForProvider(provider schemas.ModelProvider) []string
- func (s *Store) DatasheetProviders() []schemas.ModelProvider
- func (s *Store) DeleteOverride(id string)
- func (s *Store) DistinctBaseModelNames() []string
- func (s *Store) Get(model string, provider schemas.ModelProvider, requestType schemas.RequestType) *configstoreTables.TableModelPricing
- func (s *Store) GetCapabilityEntry(model string, provider schemas.ModelProvider) *Entry
- func (s *Store) GetModelParametersByModel(ctx context.Context, model string) (*configstoreTables.TableModelParameters, error)
- func (s *Store) GetPricingEntryForModel(model string, provider schemas.ModelProvider) *Entry
- func (s *Store) GetSupportedParameters(model string) []string
- func (s *Store) IsRequestTypeSupported(model string, requestType schemas.RequestType) bool
- func (s *Store) IsSameModel(model1, model2 string) bool
- func (s *Store) IsTextCompletionSupported(model string, provider schemas.ModelProvider) bool
- func (s *Store) LastSyncedAt() time.Time
- func (s *Store) LoadFromDB(ctx context.Context) error
- func (s *Store) LoadFromURLIntoMemory(ctx context.Context) error
- func (s *Store) LoadModelParamsFromDB(ctx context.Context) (int, error)
- func (s *Store) LoadModelParamsFromURLIntoMemory(ctx context.Context) error
- func (s *Store) LoadOverridesFromStore(ctx context.Context) error
- func (s *Store) MarkSynced(t time.Time)
- func (s *Store) ModelParametersURL() string
- func (s *Store) SetOverrides(rows []configstoreTables.TablePricingOverride) error
- func (s *Store) SyncFromURL(ctx context.Context) error
- func (s *Store) SyncInterval() time.Duration
- func (s *Store) SyncModelParamsFromURL(ctx context.Context) error
- func (s *Store) URL() string
- func (s *Store) UpdateSyncConfig(cfg Config)
- func (s *Store) UpsertModelPricingAttributes(ctx context.Context, model string, provider schemas.ModelProvider, ...) (int64, error)
- func (s *Store) UpsertOverrides(rows ...*configstoreTables.TablePricingOverride) error
Constants ¶
const (
DefaultURL = "https://getbifrost.ai/datasheet"
DefaultModelParametersURL = "https://getbifrost.ai/datasheet/model-parameters"
DefaultSyncInterval = 24 * time.Hour
DefaultPricingTimeout = 45 * time.Second
DefaultModelParametersTimeout = 45 * time.Second
)
Defaults for sync configuration and timeouts. Exposed so the composer can fall back to these when the framework Config leaves fields nil.
const (
TokenTierAbove272K = 272000
TokenTierAbove200K = 200000
TokenTierAbove128K = 128000
)
Tier boundaries for tiered token pricing. Matches the upstream datasheet keys (input_cost_per_token_above_<N>k_tokens).
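To make the tier constants concrete, here is a minimal sketch of how a per-token input rate could be chosen from them. The `tieredRates` struct and `inputRate` function are hypothetical stand-ins, and the selection rule (the highest exceeded tier with a rate set wins, otherwise the base rate) is an assumption rather than the package's documented behavior.

```go
package main

import "fmt"

// Tier boundaries copied from the package constants.
const (
	TokenTierAbove272K = 272000
	TokenTierAbove200K = 200000
	TokenTierAbove128K = 128000
)

// tieredRates is a hypothetical, trimmed-down stand-in for Options.
// A nil pointer means the datasheet did not publish that tier.
type tieredRates struct {
	Base      *float64
	Above128k *float64
	Above200k *float64
	Above272k *float64
}

// inputRate picks the per-token rate for a prompt of the given size.
// Assumption: the highest exceeded tier that has a rate wins; tiers
// without a rate are skipped and the base rate is the final fallback.
func inputRate(r tieredRates, promptTokens int) float64 {
	switch {
	case promptTokens > TokenTierAbove272K && r.Above272k != nil:
		return *r.Above272k
	case promptTokens > TokenTierAbove200K && r.Above200k != nil:
		return *r.Above200k
	case promptTokens > TokenTierAbove128K && r.Above128k != nil:
		return *r.Above128k
	case r.Base != nil:
		return *r.Base
	}
	return 0
}

// ptr is a small helper for building optional rates.
func ptr(v float64) *float64 { return &v }

func main() {
	r := tieredRates{Base: ptr(1e-6), Above200k: ptr(2e-6)}
	fmt.Println(inputRate(r, 1000))   // below every tier: base rate
	fmt.Println(inputRate(r, 250000)) // above 200k: tier rate
	fmt.Println(inputRate(r, 300000)) // above 272k but no 272k rate: 200k rate
}
```

Pointer fields mirror the `*float64` shape of Options, so "tier not published" stays distinguishable from "tier priced at zero".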
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
Config groups the values the composer hands to New / UpdateSyncConfig. Zero values fall back to the Default* constants.
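The zero-value fallback can be sketched as follows. The real Config fields are not shown on this page, so the `syncConfig` struct, its field names, and `withDefaults` are hypothetical; only the two constants are taken from the package.

```go
package main

import (
	"fmt"
	"time"
)

// Defaults copied from the package constants.
const (
	DefaultURL          = "https://getbifrost.ai/datasheet"
	DefaultSyncInterval = 24 * time.Hour
)

// syncConfig is a hypothetical stand-in for Config; field names are assumed.
type syncConfig struct {
	URL          string
	SyncInterval time.Duration
}

// withDefaults replaces zero values with the package defaults and leaves
// explicitly set values untouched.
func withDefaults(c syncConfig) syncConfig {
	if c.URL == "" {
		c.URL = DefaultURL
	}
	if c.SyncInterval <= 0 {
		c.SyncInterval = DefaultSyncInterval
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", withDefaults(syncConfig{}))
	fmt.Printf("%+v\n", withDefaults(syncConfig{SyncInterval: time.Hour}))
}
```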
type Entry ¶
type Entry struct {
BaseModel string `json:"base_model,omitempty"`
Provider string `json:"provider"`
Mode string `json:"mode"`
ContextLength *int `json:"context_length,omitempty"`
MaxInputTokens *int `json:"max_input_tokens,omitempty"`
MaxOutputTokens *int `json:"max_output_tokens,omitempty"`
Architecture *schemas.Architecture `json:"architecture,omitempty"`
// AdditionalAttributes carries editorial metadata stored on the pricing
// row (e.g. description). Populated from the DB read path only; the
// json:"-" tag prevents URL datasheet payloads from ever feeding into
// this field via json.Unmarshal.
AdditionalAttributes map[string]string `json:"-"`
Options
}
Entry represents a single model's pricing information. Field names and JSON tags match the datasheet schema exactly. AdditionalAttributes carries editorial metadata stored on the pricing row — never populated from the URL datasheet, only from DB reads via the management API.
func (*Entry) UnmarshalJSON ¶
UnmarshalJSON handles the special case where search_context_cost_per_query may arrive as either a plain float64 or a tiered object {"search_context_size_low":…, "search_context_size_medium":…, "search_context_size_high":…}.
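A custom unmarshaller that accepts both shapes might look like the sketch below. The `searchCost` wrapper, the `entry` struct, and `parseSearchCost` are hypothetical, and the rule for collapsing the tiered object into one number (prefer medium, then low, then high) is purely an assumption; this page does not say which tier the real method keeps.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchCost is a hypothetical wrapper around the single stored value.
type searchCost struct {
	Value *float64
}

// tieredSearchCost matches the tiered object keys named in the docs.
type tieredSearchCost struct {
	Low    *float64 `json:"search_context_size_low"`
	Medium *float64 `json:"search_context_size_medium"`
	High   *float64 `json:"search_context_size_high"`
}

// UnmarshalJSON accepts either a plain number or a tiered object.
func (c *searchCost) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // leave Value unset
	}
	var flat float64
	if err := json.Unmarshal(data, &flat); err == nil {
		c.Value = &flat
		return nil
	}
	var tiers tieredSearchCost
	if err := json.Unmarshal(data, &tiers); err != nil {
		return err
	}
	// Assumption: medium is preferred, then low, then high.
	switch {
	case tiers.Medium != nil:
		c.Value = tiers.Medium
	case tiers.Low != nil:
		c.Value = tiers.Low
	case tiers.High != nil:
		c.Value = tiers.High
	}
	return nil
}

// entry is a hypothetical one-field stand-in for Entry.
type entry struct {
	SearchContextCostPerQuery searchCost `json:"search_context_cost_per_query"`
}

// parseSearchCost decodes a datasheet-like payload and returns the value.
func parseSearchCost(raw string) (*float64, error) {
	var e entry
	if err := json.Unmarshal([]byte(raw), &e); err != nil {
		return nil, err
	}
	return e.SearchContextCostPerQuery.Value, nil
}

func main() {
	payloads := []string{
		`{"search_context_cost_per_query": 0.01}`,
		`{"search_context_cost_per_query": {"search_context_size_low": 0.01, "search_context_size_medium": 0.02, "search_context_size_high": 0.03}}`,
	}
	for _, raw := range payloads {
		v, err := parseSearchCost(raw)
		if err != nil || v == nil {
			fmt.Println("no value:", err)
			continue
		}
		fmt.Println(*v)
	}
}
```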
type LookupScopes ¶
LookupScopes carries the runtime identifiers used to resolve scoped pricing overrides during cost calculation.
func LookupScopesFromContext ¶
func LookupScopesFromContext(ctx *schemas.BifrostContext, provider string) *LookupScopes
LookupScopesFromContext builds a LookupScopes from a BifrostContext. Reads the governance virtual key ID (not the raw VK token) and the selected key ID. provider should be the provider name string (e.g. "openai"); pass "" if unavailable. Returns nil only when ctx is nil. An empty scopes value is still returned when all fields are empty so global-scope overrides remain evaluable.
NOT SAFE in a goroutine — reads from ctx which is cancelled when the request ends. Call synchronously in PostHooks and pass the result by value to anything that may outlive the request.
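The calling pattern this warning asks for can be illustrated with a self-contained sketch. Everything here is a hypothetical stand-in: `requestCtx` models a request-scoped context whose values stop being readable once the request ends, `scopes` mimics LookupScopes with assumed field names, and `billAsync` represents any work that outlives the request.

```go
package main

import (
	"fmt"
	"sync"
)

// requestCtx is a hypothetical request-scoped context.
type requestCtx struct {
	mu     sync.Mutex
	values map[string]string
	ended  bool
}

// value returns "" once the request has ended.
func (c *requestCtx) value(key string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.ended {
		return ""
	}
	return c.values[key]
}

// end marks the request as finished.
func (c *requestCtx) end() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.ended = true
}

// scopes mimics LookupScopes; the field names are assumptions.
type scopes struct {
	VirtualKeyID  string
	SelectedKeyID string
	Provider      string
}

// scopesFromCtx returns nil only when the context is nil.
func scopesFromCtx(c *requestCtx, provider string) *scopes {
	if c == nil {
		return nil
	}
	return &scopes{
		VirtualKeyID:  c.value("virtual_key_id"),
		SelectedKeyID: c.value("selected_key_id"),
		Provider:      provider,
	}
}

// billAsync reads the scopes synchronously, then hands a copy (by value)
// to the goroutine, which never touches the context itself.
func billAsync(c *requestCtx, provider string) <-chan scopes {
	var snapshot scopes
	if s := scopesFromCtx(c, provider); s != nil {
		snapshot = *s
	}
	out := make(chan scopes, 1)
	go func() { out <- snapshot }()
	return out
}

// newRequestCtx builds a context carrying the two key identifiers.
func newRequestCtx(virtualKeyID, selectedKeyID string) *requestCtx {
	return &requestCtx{values: map[string]string{
		"virtual_key_id":  virtualKeyID,
		"selected_key_id": selectedKeyID,
	}}
}

func main() {
	ctx := newRequestCtx("vk-1", "key-1")
	ch := billAsync(ctx, "openai")
	ctx.end() // the request finishes before the goroutine's result is read
	fmt.Printf("%+v\n", <-ch)
}
```

Because the snapshot is taken before `billAsync` returns, ending the request afterwards cannot blank out the identifiers the background work depends on.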
type MatchType ¶
type MatchType string
MatchType controls how an override pattern is matched against model names.
type Options ¶
type Options struct {
// Costs - Text
InputCostPerToken *float64 `json:"input_cost_per_token,omitempty"`
OutputCostPerToken *float64 `json:"output_cost_per_token,omitempty"`
InputCostPerTokenBatches *float64 `json:"input_cost_per_token_batches,omitempty"`
OutputCostPerTokenBatches *float64 `json:"output_cost_per_token_batches,omitempty"`
InputCostPerTokenPriority *float64 `json:"input_cost_per_token_priority,omitempty"`
OutputCostPerTokenPriority *float64 `json:"output_cost_per_token_priority,omitempty"`
InputCostPerTokenFlex *float64 `json:"input_cost_per_token_flex,omitempty"`
OutputCostPerTokenFlex *float64 `json:"output_cost_per_token_flex,omitempty"`
// Fast mode (Anthropic research preview, speed:"fast" on Opus 4.6/4.7/4.8).
// Flat rate across the full context window — no 128k/200k/272k tiering.
InputCostPerTokenFast *float64 `json:"input_cost_per_token_fast,omitempty"`
OutputCostPerTokenFast *float64 `json:"output_cost_per_token_fast,omitempty"`
InputCostPerCharacter *float64 `json:"input_cost_per_character,omitempty"`
// Costs - 128k Tier
InputCostPerTokenAbove128kTokens *float64 `json:"input_cost_per_token_above_128k_tokens,omitempty"`
InputCostPerImageAbove128kTokens *float64 `json:"input_cost_per_image_above_128k_tokens,omitempty"`
InputCostPerVideoPerSecondAbove128kTokens *float64 `json:"input_cost_per_video_per_second_above_128k_tokens,omitempty"`
InputCostPerAudioPerSecondAbove128kTokens *float64 `json:"input_cost_per_audio_per_second_above_128k_tokens,omitempty"`
OutputCostPerTokenAbove128kTokens *float64 `json:"output_cost_per_token_above_128k_tokens,omitempty"`
// Costs - 200k Tier
InputCostPerTokenAbove200kTokens *float64 `json:"input_cost_per_token_above_200k_tokens,omitempty"`
InputCostPerTokenAbove200kTokensPriority *float64 `json:"input_cost_per_token_above_200k_tokens_priority,omitempty"`
OutputCostPerTokenAbove200kTokens *float64 `json:"output_cost_per_token_above_200k_tokens,omitempty"`
OutputCostPerTokenAbove200kTokensPriority *float64 `json:"output_cost_per_token_above_200k_tokens_priority,omitempty"`
// Costs - 272k Tier
InputCostPerTokenAbove272kTokens *float64 `json:"input_cost_per_token_above_272k_tokens,omitempty"`
InputCostPerTokenAbove272kTokensPriority *float64 `json:"input_cost_per_token_above_272k_tokens_priority,omitempty"`
OutputCostPerTokenAbove272kTokens *float64 `json:"output_cost_per_token_above_272k_tokens,omitempty"`
OutputCostPerTokenAbove272kTokensPriority *float64 `json:"output_cost_per_token_above_272k_tokens_priority,omitempty"`
// Costs - Cache
CacheCreationInputTokenCost *float64 `json:"cache_creation_input_token_cost,omitempty"`
CacheReadInputTokenCost *float64 `json:"cache_read_input_token_cost,omitempty"`
CacheCreationInputTokenCostAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_200k_tokens,omitempty"`
CacheReadInputTokenCostAbove200kTokens *float64 `json:"cache_read_input_token_cost_above_200k_tokens,omitempty"`
CacheReadInputTokenCostAbove200kTokensPriority *float64 `json:"cache_read_input_token_cost_above_200k_tokens_priority,omitempty"`
CacheCreationInputTokenCostAbove1hr *float64 `json:"cache_creation_input_token_cost_above_1hr,omitempty"`
CacheCreationInputTokenCostAbove1hrAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_1hr_above_200k_tokens,omitempty"`
CacheCreationInputAudioTokenCost *float64 `json:"cache_creation_input_audio_token_cost,omitempty"`
CacheReadInputTokenCostPriority *float64 `json:"cache_read_input_token_cost_priority,omitempty"`
CacheReadInputTokenCostFlex *float64 `json:"cache_read_input_token_cost_flex,omitempty"`
CacheReadInputImageTokenCost *float64 `json:"cache_read_input_image_token_cost,omitempty"`
CacheReadInputTokenCostAbove272kTokens *float64 `json:"cache_read_input_token_cost_above_272k_tokens,omitempty"`
CacheReadInputTokenCostAbove272kTokensPriority *float64 `json:"cache_read_input_token_cost_above_272k_tokens_priority,omitempty"`
// Costs - Image
InputCostPerImage *float64 `json:"input_cost_per_image,omitempty"`
InputCostPerPixel *float64 `json:"input_cost_per_pixel,omitempty"`
OutputCostPerImage *float64 `json:"output_cost_per_image,omitempty"`
OutputCostPerPixel *float64 `json:"output_cost_per_pixel,omitempty"`
OutputCostPerImagePremiumImage *float64 `json:"output_cost_per_image_premium_image,omitempty"`
OutputCostPerImageAbove512x512Pixels *float64 `json:"output_cost_per_image_above_512_and_512_pixels,omitempty"`
OutputCostPerImageAbove512x512PixelsPremium *float64 `json:"output_cost_per_image_above_512_and_512_pixels_and_premium_image,omitempty"`
OutputCostPerImageAbove1024x1024Pixels *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels,omitempty"`
OutputCostPerImageAbove1024x1024PixelsPremium *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels_and_premium_image,omitempty"`
OutputCostPerImageAbove2048x2048Pixels *float64 `json:"output_cost_per_image_above_2048_and_2048_pixels,omitempty"`
OutputCostPerImageAbove4096x4096Pixels *float64 `json:"output_cost_per_image_above_4096_and_4096_pixels,omitempty"`
OutputCostPerImageLowQuality *float64 `json:"output_cost_per_image_low_quality,omitempty"`
OutputCostPerImageMediumQuality *float64 `json:"output_cost_per_image_medium_quality,omitempty"`
OutputCostPerImageHighQuality *float64 `json:"output_cost_per_image_high_quality,omitempty"`
OutputCostPerImageAutoQuality *float64 `json:"output_cost_per_image_auto_quality,omitempty"`
InputCostPerImageToken *float64 `json:"input_cost_per_image_token,omitempty"`
OutputCostPerImageToken *float64 `json:"output_cost_per_image_token,omitempty"`
// Costs - Audio/Video
InputCostPerAudioToken *float64 `json:"input_cost_per_audio_token,omitempty"`
InputCostPerAudioPerSecond *float64 `json:"input_cost_per_audio_per_second,omitempty"`
InputCostPerSecond *float64 `json:"input_cost_per_second,omitempty"`
InputCostPerVideoPerSecond *float64 `json:"input_cost_per_video_per_second,omitempty"`
OutputCostPerAudioToken *float64 `json:"output_cost_per_audio_token,omitempty"`
OutputCostPerVideoPerSecond *float64 `json:"output_cost_per_video_per_second,omitempty"`
OutputCostPerSecond *float64 `json:"output_cost_per_second,omitempty"`
// Costs - Other.
//
// SearchContextCostPerQuery is stored as a single float64, but the upstream datasheet
// may represent it as either a plain number or a tiered object. See Entry.UnmarshalJSON.
SearchContextCostPerQuery *float64 `json:"search_context_cost_per_query,omitempty"`
CodeInterpreterCostPerSession *float64 `json:"code_interpreter_cost_per_session,omitempty"`
// Costs - OCR
OCRCostPerPage *float64 `json:"ocr_cost_per_page,omitempty"`
AnnotationCostPerPage *float64 `json:"annotation_cost_per_page,omitempty"`
}
Options holds every individual cost field. Embedded into Entry and reused as the patch shape for Override.
type Override ¶
type Override struct {
ID string `json:"id"`
Name string `json:"name"`
ScopeKind ScopeKind `json:"scope_kind"`
VirtualKeyID *string `json:"virtual_key_id,omitempty"`
ProviderID *string `json:"provider_id,omitempty"`
ProviderKeyID *string `json:"provider_key_id,omitempty"`
MatchType MatchType `json:"match_type"`
Pattern string `json:"pattern"`
RequestTypes []schemas.RequestType `json:"request_types,omitempty"`
Options Options `json:"options"`
}
Override describes a scoped pricing override shared across config storage, model catalog compilation, and governance APIs.
type ScopeKind ¶
type ScopeKind string
ScopeKind identifies which governance scope an override applies to.
const (
ScopeKindGlobal ScopeKind = "global"
ScopeKindProvider ScopeKind = "provider"
ScopeKindProviderKey ScopeKind = "provider_key"
ScopeKindVirtualKey ScopeKind = "virtual_key"
ScopeKindVirtualKeyProvider ScopeKind = "virtual_key_provider"
ScopeKindVirtualKeyProviderKey ScopeKind = "virtual_key_provider_key"
)
type Store ¶
type Store struct {
// contains filtered or unexported fields
}
Store owns the pricing catalog (canonical rows, base-model index, derived datasheet view, supported request types/parameters) and pricing overrides.
All I/O is driven by the composer — Store does not own a ticker or the distributed lock; it exposes SyncFromURL / LoadFromDB / UpdateSyncConfig as the surface the composer calls.
func New ¶
func New(configStore configstore.ConfigStore, logger schemas.Logger, cfg Config) *Store
New constructs a Store with the given config. The store is empty; callers (composer) drive bootstrap via LoadFromDB or LoadFromURLIntoMemory.
func NewTestStore ¶
NewTestStore constructs a minimal Store for unit tests without I/O. Optionally seed baseModelIndex so BaseModelName lookups resolve. A no-op logger is wired so cost / pricing paths (which assume Store.logger is non-nil) don't panic from external test code.
func (*Store) BaseModelName ¶
BaseModelName returns the canonical base model name. Uses the pre-computed base_model from the pricing catalog when present, falling back to algorithmic date/version stripping for unknown models.
"gpt-4o" → "gpt-4o"
"openai/gpt-4o" → "gpt-4o"
"gpt-4o-2024-08-06" → "gpt-4o" (algorithmic fallback)
func (*Store) CalculateCost ¶
func (s *Store) CalculateCost(result *schemas.BifrostResponse, scopes *LookupScopes) float64
CalculateCost calculates the cost of a Bifrost response. It handles all request types, cache debug billing, and tiered pricing. If scopes is nil, an empty LookupScopes is used; global and provider-scoped overrides may still apply since the provider is derived from the response.
func (*Store) CalculateCostForUsage ¶ added in v1.3.22
func (s *Store) CalculateCostForUsage(usage *schemas.BifrostLLMUsage, provider schemas.ModelProvider, model string, requestType schemas.RequestType, scopes *LookupScopes) float64
CalculateCostForUsage computes the dollar cost from a bare usage object plus provider / model / request type, for cases where no full BifrostResponse exists. The primary use is billing partial usage carried on a failed or cancelled request via BifrostError.ExtraFields.BilledUsage: the provider consumed tokens, so we must charge for them even though there is no success response to read. It mirrors CalculateCost's compute path so success and failure billing use identical rates. Returns 0 when usage is nil.
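One way to keep success and failure billing on identical rates is to route both through a single usage-based function, as in this sketch. The `usage`, `rates`, `response`, and `requestError` types are simplified hypothetical stand-ins, and the flat two-rate formula ignores tiers, caching, and overrides.

```go
package main

import "fmt"

// usage is a hypothetical, trimmed-down stand-in for BifrostLLMUsage.
type usage struct {
	PromptTokens     int
	CompletionTokens int
}

// rates holds flat per-token prices.
type rates struct {
	InputPerToken  float64
	OutputPerToken float64
}

// costForUsage is the single compute path; nil usage costs nothing.
func costForUsage(u *usage, r rates) float64 {
	if u == nil {
		return 0
	}
	return float64(u.PromptTokens)*r.InputPerToken +
		float64(u.CompletionTokens)*r.OutputPerToken
}

// response and requestError are hypothetical carriers of usage data.
type response struct{ Usage *usage }
type requestError struct{ BilledUsage *usage }

// costOfResponse bills a successful response.
func costOfResponse(resp *response, r rates) float64 {
	if resp == nil {
		return 0
	}
	return costForUsage(resp.Usage, r)
}

// costOfError bills partial usage carried on a failed or cancelled request.
func costOfError(e *requestError, r rates) float64 {
	if e == nil {
		return 0
	}
	return costForUsage(e.BilledUsage, r)
}

func main() {
	r := rates{InputPerToken: 2.5e-6, OutputPerToken: 1e-5}
	u := &usage{PromptTokens: 1200, CompletionTokens: 300}
	fmt.Println(costOfResponse(&response{Usage: u}, r) == costOfError(&requestError{BilledUsage: u}, r))
}
```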
func (*Store) DatasheetModelsForProvider ¶
func (s *Store) DatasheetModelsForProvider(provider schemas.ModelProvider) []string
DatasheetModelsForProvider returns the per-provider model slice derived from pricing data on the last load/sync. Composer unions this with live.ModelsForProvider on read.
func (*Store) DatasheetProviders ¶
func (s *Store) DatasheetProviders() []schemas.ModelProvider
DatasheetProviders returns every provider that has at least one pricing row in the datasheet view. Composer unions this with live + keyconfig to enumerate "all known providers" for GetProvidersForModel.
func (*Store) DeleteOverride ¶
DeleteOverride removes an override by ID.
func (*Store) DistinctBaseModelNames ¶
DistinctBaseModelNames returns every unique base name from the catalog. Used by governance for cross-provider model selection.
func (*Store) Get ¶
func (s *Store) Get(model string, provider schemas.ModelProvider, requestType schemas.RequestType) *configstoreTables.TableModelPricing
Get returns the raw pricing row for (model, provider, requestType) or nil. Useful for callers that need exact pricing without override resolution.
func (*Store) GetCapabilityEntry ¶
func (s *Store) GetCapabilityEntry(model string, provider schemas.ModelProvider) *Entry
GetCapabilityEntry returns capability metadata (context length, supported modes, etc.) for a (model, provider) pair. Prefers chat → responses → text-completion entries; falls back to the lexicographically first mode if none of the preferred modes match. Tries the exact model first, then the canonical base model.
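The lookup order described above can be sketched with plain maps. The `capability` and `catalog` types are hypothetical, and the mode strings "chat" and "responses" are assumed spellings (only "text_completion" appears elsewhere on this page).

```go
package main

import (
	"fmt"
	"sort"
)

// preferredModes lists the modes tried first, in order. Spellings are assumed.
var preferredModes = []string{"chat", "responses", "text_completion"}

// capability is a hypothetical, trimmed-down stand-in for Entry.
type capability struct {
	Mode          string
	ContextLength int
}

// catalog maps model name -> mode -> capability row.
type catalog map[string]map[string]capability

// pickEntry prefers chat, then responses, then text completion, and
// otherwise falls back to the lexicographically first mode.
func pickEntry(byMode map[string]capability) (capability, bool) {
	for _, mode := range preferredModes {
		if e, ok := byMode[mode]; ok {
			return e, true
		}
	}
	if len(byMode) == 0 {
		return capability{}, false
	}
	modes := make([]string, 0, len(byMode))
	for mode := range byMode {
		modes = append(modes, mode)
	}
	sort.Strings(modes)
	return byMode[modes[0]], true
}

// capabilityEntry tries the exact model first, then the base model.
func capabilityEntry(c catalog, model, baseModel string) (capability, bool) {
	if e, ok := pickEntry(c[model]); ok {
		return e, true
	}
	return pickEntry(c[baseModel])
}

func main() {
	c := catalog{
		"gpt-4o": {
			"responses": {Mode: "responses", ContextLength: 128000},
			"chat":      {Mode: "chat", ContextLength: 128000},
		},
		"embed-x": {
			"embedding": {Mode: "embedding", ContextLength: 8192},
			"audio":     {Mode: "audio", ContextLength: 0},
		},
	}
	fmt.Println(capabilityEntry(c, "gpt-4o-2024-08-06", "gpt-4o"))
	fmt.Println(capabilityEntry(c, "embed-x", "embed-x"))
}
```

Sorting the remaining modes makes the fallback deterministic, which matters because Go map iteration order is randomized.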
func (*Store) GetModelParametersByModel ¶
func (s *Store) GetModelParametersByModel(ctx context.Context, model string) (*configstoreTables.TableModelParameters, error)
GetModelParametersByModel reads a single model-parameter row from the DB. Used by the composer's cache-miss handler — installed via providerUtils.SetCacheMissHandler in the composer's Init.
func (*Store) GetPricingEntryForModel ¶
func (s *Store) GetPricingEntryForModel(model string, provider schemas.ModelProvider) *Entry
GetPricingEntryForModel returns the first pricing entry found across known modes. Preserved for callers (inference handler) that want any pricing row for the model without specifying a request type.
func (*Store) GetSupportedParameters ¶
GetSupportedParameters returns the list of OpenAI-compatible parameter names a model accepts (e.g. temperature, top_p, tools). Returns nil for unknown models.
func (*Store) IsRequestTypeSupported ¶
func (s *Store) IsRequestTypeSupported(model string, requestType schemas.RequestType) bool
IsRequestTypeSupported checks whether a model declares support for the given request type via the model-parameters datasheet.
func (*Store) IsSameModel ¶
IsSameModel reports whether two model strings refer to the same underlying model after normalization.
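As a rough illustration of such normalization, the sketch below compares two model strings by their base names, reusing the catalog-then-fallback idea described for BaseModelName. The `baseModelName` and `isSameModel` helpers are hypothetical, the fallback strips only a provider prefix and a trailing YYYY-MM-DD date, and treating "same model" as "same base name" is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dateSuffix matches a trailing -YYYY-MM-DD version stamp.
var dateSuffix = regexp.MustCompile(`-\d{4}-\d{2}-\d{2}$`)

// baseModelName consults a pre-computed index first, then falls back to
// stripping the provider prefix and a trailing date.
func baseModelName(index map[string]string, model string) string {
	if base, ok := index[model]; ok {
		return base
	}
	// Drop a "provider/" prefix such as "openai/".
	if i := strings.LastIndex(model, "/"); i >= 0 {
		model = model[i+1:]
	}
	if base, ok := index[model]; ok {
		return base
	}
	return dateSuffix.ReplaceAllString(model, "")
}

// isSameModel compares two model strings after normalization.
func isSameModel(index map[string]string, a, b string) bool {
	return baseModelName(index, a) == baseModelName(index, b)
}

func main() {
	index := map[string]string{"gpt-4o": "gpt-4o"}
	fmt.Println(baseModelName(index, "openai/gpt-4o"))                  // gpt-4o
	fmt.Println(baseModelName(index, "gpt-4o-2024-08-06"))              // gpt-4o
	fmt.Println(isSameModel(index, "openai/gpt-4o", "gpt-4o-2024-08-06")) // true
	fmt.Println(isSameModel(index, "gpt-4o", "gpt-4o-mini"))            // false
}
```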
func (*Store) IsTextCompletionSupported ¶
func (s *Store) IsTextCompletionSupported(model string, provider schemas.ModelProvider) bool
IsTextCompletionSupported checks whether a model has a text_completion pricing entry — used by litellmcompat to decide whether to convert text completion requests into chat completion requests.
func (*Store) LastSyncedAt ¶
LastSyncedAt returns the last successful URL→DB sync timestamp; zero before any sync has completed.
func (*Store) LoadFromDB ¶
LoadFromDB reloads the in-memory pricing cache + datasheet view from the config store. Used by the composer at bootstrap and as the gossip ReloadFromDB handler on non-leader pods.
func (*Store) LoadFromURLIntoMemory ¶
LoadFromURLIntoMemory loads pricing from the URL directly into memory (no DB). Used when the composer was built without a config store.
func (*Store) LoadModelParamsFromDB ¶
LoadModelParamsFromDB bulk-loads model parameters from the DB into the provider-utils cache and the in-memory supportedResponseTypes / supportedParams indexes. Returns the row count so the composer can decide whether to background-sync from URL afterwards.
The provider-utils cache-miss handler in the composer still loads one row at a time when an unknown model is queried; both paths use the same JSON shape stored in the table.
func (*Store) LoadModelParamsFromURLIntoMemory ¶
LoadModelParamsFromURLIntoMemory fetches model parameters from the URL and applies them in-memory only. Used when there's no config store.
func (*Store) LoadOverridesFromStore ¶
LoadOverridesFromStore reloads all overrides from the config store. Called at bootstrap and after force-reload paths.
func (*Store) MarkSynced ¶
MarkSynced records the timestamp of a successful sync — called by the composer's ticker after a successful tick.
func (*Store) ModelParametersURL ¶
ModelParametersURL returns a snapshot of the model-parameters URL.
func (*Store) SetOverrides ¶
func (s *Store) SetOverrides(rows []configstoreTables.TablePricingOverride) error
SetOverrides replaces the full in-memory override set. Duplicate IDs in the input keep the last-seen entry (matching the current behavior).
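The last-seen-wins rule for duplicate IDs falls out naturally from indexing rows into a map in input order, as this small sketch shows. The `override` struct and `indexOverrides` function are hypothetical simplifications of the real types.

```go
package main

import "fmt"

// override is a hypothetical, trimmed-down stand-in for an override row.
type override struct {
	ID      string
	Pattern string
}

// indexOverrides builds a fresh ID-keyed set; a later row with the same
// ID overwrites the earlier one.
func indexOverrides(rows []override) map[string]override {
	byID := make(map[string]override, len(rows))
	for _, r := range rows {
		byID[r.ID] = r
	}
	return byID
}

func main() {
	rows := []override{
		{ID: "a", Pattern: "gpt-4o*"},
		{ID: "b", Pattern: "claude-*"},
		{ID: "a", Pattern: "gpt-4o-mini"}, // duplicate ID: this one is kept
	}
	idx := indexOverrides(rows)
	fmt.Println(len(idx), idx["a"].Pattern) // 2 gpt-4o-mini
}
```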
func (*Store) SyncFromURL ¶
SyncFromURL fetches the upstream pricing datasheet, persists it to the DB (when configStore != nil), and refreshes the in-memory cache + derived datasheet view. On URL failure it falls back to existing DB records when any exist, otherwise propagates the error.
The composer owns the distributed lock, the ticker, and the after-sync gossip hook — none of that lives here. SyncFromURL is the pure "URL → DB → memory" step.
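The "URL → DB → memory" step and its failure fallback can be sketched with in-memory stand-ins. The `syncer` type, its fields, and the fetch helpers are all hypothetical; in particular, reloading memory from the DB on fallback is an assumption about what "falls back to existing DB records" entails.

```go
package main

import (
	"errors"
	"fmt"
)

// row is a hypothetical pricing row.
type row struct {
	Model     string
	InputCost float64
}

var errFetch = errors.New("datasheet fetch failed")

// syncer is a hypothetical stand-in for the store's sync state.
type syncer struct {
	fetch  func() ([]row, error) // stands in for the URL fetch
	hasDB  bool                  // false models a nil config store
	db     []row
	memory []row
}

// syncFromURL persists fetched rows when a DB exists and refreshes memory.
// On fetch failure it serves existing DB rows if any, else returns the error.
func (s *syncer) syncFromURL() error {
	rows, err := s.fetch()
	if err != nil {
		if s.hasDB && len(s.db) > 0 {
			s.memory = append([]row(nil), s.db...)
			return nil
		}
		return err
	}
	if s.hasDB {
		s.db = append([]row(nil), rows...)
	}
	s.memory = append([]row(nil), rows...)
	return nil
}

func okFetch() ([]row, error) {
	return []row{{Model: "gpt-4o", InputCost: 2.5e-6}}, nil
}

func failingFetch() ([]row, error) { return nil, errFetch }

func main() {
	s := &syncer{fetch: okFetch, hasDB: true}
	fmt.Println(s.syncFromURL(), len(s.db), len(s.memory))

	s.fetch, s.memory = failingFetch, nil
	fmt.Println(s.syncFromURL(), len(s.memory)) // served from DB rows

	empty := &syncer{fetch: failingFetch, hasDB: true}
	fmt.Println(empty.syncFromURL()) // no DB rows: error propagates
}
```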
func (*Store) SyncInterval ¶
SyncInterval returns the minimum elapsed time between background syncs.
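One plausible way for a composer-side ticker to use this value is to tick more often than the interval and skip ticks until enough time has elapsed since the last successful sync. The `shouldSync` helper below is hypothetical; it only illustrates the "minimum elapsed time" reading and is not taken from the composer's code.

```go
package main

import (
	"fmt"
	"time"
)

// shouldSync reports whether a tick at `now` should trigger a sync.
// A zero lastSynced means no sync has completed yet, so sync immediately.
func shouldSync(lastSynced, now time.Time, interval time.Duration) bool {
	return lastSynced.IsZero() || now.Sub(lastSynced) >= interval
}

// Fixed reference values so the example is deterministic.
var (
	never time.Time
	start = time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)
)

// hours converts a whole number of hours to a Duration.
func hours(n int) time.Duration { return time.Duration(n) * time.Hour }

func main() {
	interval := hours(24)
	fmt.Println(shouldSync(never, start, interval))                  // true
	fmt.Println(shouldSync(start, start.Add(hours(1)), interval))  // false
	fmt.Println(shouldSync(start, start.Add(hours(24)), interval)) // true
}
```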
func (*Store) SyncModelParamsFromURL ¶
SyncModelParamsFromURL fetches model parameters from the configured URL, persists to DB (when configStore != nil), and refreshes the in-memory indexes. On URL failure it falls back to DB records when any exist.
func (*Store) UpdateSyncConfig ¶
UpdateSyncConfig replaces URL / params URL / interval atomically. The composer is responsible for triggering a fresh sync after this returns.
func (*Store) UpsertModelPricingAttributes ¶
func (s *Store) UpsertModelPricingAttributes(ctx context.Context, model string, provider schemas.ModelProvider, attrs map[string]string) (int64, error)
UpsertModelPricingAttributes writes the additional_attributes column for every pricing row that matches (model, provider), then reloads the pricing cache so the new values are immediately visible to list-models. Returns the number of rows updated (0 = no such pricing row, which callers must surface as a validation error). An empty/nil attrs map clears the column.
func (*Store) UpsertOverrides ¶
func (s *Store) UpsertOverrides(rows ...*configstoreTables.TablePricingOverride) error
UpsertOverrides inserts or replaces one or more overrides, rebuilding the lookup map exactly once at the end.