datasheet

package
v1.3.20
Warning

This package is not in the latest version of its module.

Published: Jun 16, 2026 License: Apache-2.0 Imports: 21 Imported by: 0

Documentation

Overview

Package pricing owns the pricing/model-parameters catalog (compartments A + B + E): canonical pricing rows fetched from the upstream datasheet, per-provider datasheet-derived model views, supported request types and parameters, and scoped pricing overrides. It also computes per-response cost.

The package performs no list-models I/O; that is the live store's domain. The hourly sync ticker lives on the composer (ModelCatalog), not here; the composer calls SyncFromURL, LoadFromDB, and the model-parameter counterparts (SyncModelParamsFromURL, LoadModelParamsFromDB). Reads are hot-path and lock-free where possible.

Constants

const (
	DefaultURL                    = "https://getbifrost.ai/datasheet"
	DefaultModelParametersURL     = "https://getbifrost.ai/datasheet/model-parameters"
	DefaultSyncInterval           = 24 * time.Hour
	DefaultPricingTimeout         = 45 * time.Second
	DefaultModelParametersTimeout = 45 * time.Second
)

Defaults for sync configuration and timeouts. Exposed so the composer can fall back to these when the framework Config leaves fields nil.

const (
	TokenTierAbove272K = 272000
	TokenTierAbove200K = 200000
	TokenTierAbove128K = 128000
)

Token-count boundaries for tiered token pricing. They match the upstream datasheet keys (input_cost_per_token_above_<N>k_tokens).
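As a rough illustration of how these boundaries might be used, the sketch below picks an input rate by checking the highest tier first. The tierRates struct and inputRate helper are hypothetical, and the strict greater-than comparison plus the fall-through to a lower tier when a rate is missing are assumptions, not behavior confirmed by the package.

```go
package main

import "fmt"

// Tier boundaries copied from the constants above.
const (
	TokenTierAbove272K = 272000
	TokenTierAbove200K = 200000
	TokenTierAbove128K = 128000
)

// tierRates is a hypothetical, trimmed-down stand-in for the tiered
// input-cost fields of Options.
type tierRates struct {
	Base      float64
	Above128K *float64
	Above200K *float64
	Above272K *float64
}

// inputRate returns the rate of the highest tier that the prompt exceeds
// and that has a rate configured; otherwise it returns the base rate.
// Assumption: a tier applies only when the count is strictly above it.
func inputRate(r tierRates, promptTokens int) float64 {
	switch {
	case promptTokens > TokenTierAbove272K && r.Above272K != nil:
		return *r.Above272K
	case promptTokens > TokenTierAbove200K && r.Above200K != nil:
		return *r.Above200K
	case promptTokens > TokenTierAbove128K && r.Above128K != nil:
		return *r.Above128K
	}
	return r.Base
}

func main() {
	above200 := 6e-6
	r := tierRates{Base: 3e-6, Above200K: &above200}
	fmt.Println(inputRate(r, 150000)) // below the 200k tier: base rate
	fmt.Println(inputRate(r, 250000)) // above the 200k tier: tiered rate
}
```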

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	URL                string
	ModelParametersURL string
	SyncInterval       time.Duration
}

Config groups the values the composer hands to New / UpdateSyncConfig. Zero values fall back to the Default* constants.
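The zero-value fallback can be pictured with a small helper. This is a minimal sketch: withDefaults is a hypothetical name, and the real package may apply the defaults elsewhere (for example inside New).

```go
package main

import (
	"fmt"
	"time"
)

// Defaults copied from the constants section.
const (
	DefaultURL                = "https://getbifrost.ai/datasheet"
	DefaultModelParametersURL = "https://getbifrost.ai/datasheet/model-parameters"
	DefaultSyncInterval       = 24 * time.Hour
)

// Config mirrors the documented struct.
type Config struct {
	URL                string
	ModelParametersURL string
	SyncInterval       time.Duration
}

// withDefaults is a hypothetical helper that replaces zero values with
// the Default* constants and leaves explicitly set fields untouched.
func withDefaults(cfg Config) Config {
	if cfg.URL == "" {
		cfg.URL = DefaultURL
	}
	if cfg.ModelParametersURL == "" {
		cfg.ModelParametersURL = DefaultModelParametersURL
	}
	if cfg.SyncInterval == 0 {
		cfg.SyncInterval = DefaultSyncInterval
	}
	return cfg
}

func main() {
	cfg := withDefaults(Config{URL: "https://example.test/datasheet"})
	fmt.Println(cfg.URL, cfg.ModelParametersURL, cfg.SyncInterval)
}
```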

type Entry

type Entry struct {
	BaseModel string `json:"base_model,omitempty"`
	Provider  string `json:"provider"`
	Mode      string `json:"mode"`

	ContextLength   *int                  `json:"context_length,omitempty"`
	MaxInputTokens  *int                  `json:"max_input_tokens,omitempty"`
	MaxOutputTokens *int                  `json:"max_output_tokens,omitempty"`
	Architecture    *schemas.Architecture `json:"architecture,omitempty"`

	// AdditionalAttributes carries editorial metadata stored on the pricing
	// row (e.g. description). Populated from the DB read path only; the
	// json:"-" tag prevents URL datasheet payloads from ever feeding into
	// this field via json.Unmarshal.
	AdditionalAttributes map[string]string `json:"-"`

	Options
}

Entry represents a single model's pricing information. Field names and JSON tags match the datasheet schema exactly. AdditionalAttributes carries editorial metadata stored on the pricing row — never populated from the URL datasheet, only from DB reads via the management API.

func (*Entry) UnmarshalJSON

func (p *Entry) UnmarshalJSON(data []byte) error

UnmarshalJSON handles the special case where search_context_cost_per_query may arrive as either a plain float64 or a tiered object {"search_context_size_low":…, "search_context_size_medium":…, "search_context_size_high":…}.

type LookupScopes

type LookupScopes struct {
	VirtualKeyID  string
	SelectedKeyID string
	Provider      string
}

LookupScopes carries the runtime identifiers used to resolve scoped pricing overrides during cost calculation.

func LookupScopesFromContext

func LookupScopesFromContext(ctx *schemas.BifrostContext, provider string) *LookupScopes

LookupScopesFromContext builds a LookupScopes from a BifrostContext. Reads the governance virtual key ID (not the raw VK token) and the selected key ID. provider should be the provider name string (e.g. "openai"); pass "" if unavailable. Returns nil only when ctx is nil. An empty scopes value is still returned when all fields are empty so global-scope overrides remain evaluable.

NOT SAFE in a goroutine — reads from ctx which is cancelled when the request ends. Call synchronously in PostHooks and pass the result by value to anything that may outlive the request.
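The calling pattern described above might look like the following sketch. It uses stand-in types rather than the real schemas.BifrostContext: requestCtx, scopesFromCtx, asyncScopes, and the "vk_id"/"key_id" keys are all hypothetical. The point is only the ordering: resolve the scopes synchronously, copy them by value, then start the goroutine.

```go
package main

import "fmt"

// LookupScopes mirrors the documented struct.
type LookupScopes struct {
	VirtualKeyID  string
	SelectedKeyID string
	Provider      string
}

// requestCtx is a hypothetical stand-in for *schemas.BifrostContext.
type requestCtx struct {
	values map[string]string
}

// scopesFromCtx is a stand-in for LookupScopesFromContext: it returns nil
// only when ctx is nil.
func scopesFromCtx(ctx *requestCtx, provider string) *LookupScopes {
	if ctx == nil {
		return nil
	}
	return &LookupScopes{
		VirtualKeyID:  ctx.values["vk_id"],
		SelectedKeyID: ctx.values["key_id"],
		Provider:      provider,
	}
}

// asyncScopes reads the context on the caller's goroutine, copies the
// result by value, and only then hands the copy to background work.
func asyncScopes(ctx *requestCtx, provider string) <-chan LookupScopes {
	var scopes LookupScopes
	if p := scopesFromCtx(ctx, provider); p != nil {
		scopes = *p // value copy; no reference to ctx survives
	}
	out := make(chan LookupScopes, 1)
	go func(s LookupScopes) {
		// Background work (for example cost logging) sees only the copy.
		out <- s
	}(scopes)
	return out
}

func main() {
	ctx := &requestCtx{values: map[string]string{"vk_id": "vk-1", "key_id": "key-1"}}
	ch := asyncScopes(ctx, "openai")
	ctx.values = nil // simulate the request ending
	fmt.Printf("%+v\n", <-ch)
}
```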

type MatchType

type MatchType string

MatchType controls how an override pattern is matched against model names.

const (
	MatchTypeExact    MatchType = "exact"
	MatchTypeWildcard MatchType = "wildcard"
)
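The documentation does not spell out the wildcard syntax. The sketch below assumes the simplest plausible form, a single trailing "*" that turns the pattern into a prefix match; the matches helper is hypothetical and the real matcher may support richer patterns.

```go
package main

import (
	"fmt"
	"strings"
)

// MatchType and its constants mirror the documented declarations.
type MatchType string

const (
	MatchTypeExact    MatchType = "exact"
	MatchTypeWildcard MatchType = "wildcard"
)

// matches is a hypothetical matcher.
// Assumption: a wildcard pattern ends in "*" and matches by prefix; a
// wildcard pattern without "*" behaves like an exact match.
func matches(mt MatchType, pattern, model string) bool {
	switch mt {
	case MatchTypeExact:
		return pattern == model
	case MatchTypeWildcard:
		if strings.HasSuffix(pattern, "*") {
			return strings.HasPrefix(model, strings.TrimSuffix(pattern, "*"))
		}
		return pattern == model
	default:
		return false
	}
}

func main() {
	fmt.Println(matches(MatchTypeExact, "gpt-4o", "gpt-4o-mini"))
	fmt.Println(matches(MatchTypeWildcard, "gpt-4o*", "gpt-4o-mini"))
}
```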

type Options

type Options struct {
	// Costs - Text
	InputCostPerToken          *float64 `json:"input_cost_per_token,omitempty"`
	OutputCostPerToken         *float64 `json:"output_cost_per_token,omitempty"`
	InputCostPerTokenBatches   *float64 `json:"input_cost_per_token_batches,omitempty"`
	OutputCostPerTokenBatches  *float64 `json:"output_cost_per_token_batches,omitempty"`
	InputCostPerTokenPriority  *float64 `json:"input_cost_per_token_priority,omitempty"`
	OutputCostPerTokenPriority *float64 `json:"output_cost_per_token_priority,omitempty"`
	InputCostPerTokenFlex      *float64 `json:"input_cost_per_token_flex,omitempty"`
	OutputCostPerTokenFlex     *float64 `json:"output_cost_per_token_flex,omitempty"`
	// Fast mode (Anthropic research preview, speed:"fast" on Opus 4.6/4.7/4.8).
	// Flat rate across the full context window — no 128k/200k/272k tiering.
	InputCostPerTokenFast  *float64 `json:"input_cost_per_token_fast,omitempty"`
	OutputCostPerTokenFast *float64 `json:"output_cost_per_token_fast,omitempty"`
	InputCostPerCharacter  *float64 `json:"input_cost_per_character,omitempty"`
	// Costs - 128k Tier
	InputCostPerTokenAbove128kTokens          *float64 `json:"input_cost_per_token_above_128k_tokens,omitempty"`
	InputCostPerImageAbove128kTokens          *float64 `json:"input_cost_per_image_above_128k_tokens,omitempty"`
	InputCostPerVideoPerSecondAbove128kTokens *float64 `json:"input_cost_per_video_per_second_above_128k_tokens,omitempty"`
	InputCostPerAudioPerSecondAbove128kTokens *float64 `json:"input_cost_per_audio_per_second_above_128k_tokens,omitempty"`
	OutputCostPerTokenAbove128kTokens         *float64 `json:"output_cost_per_token_above_128k_tokens,omitempty"`
	// Costs - 200k Tier
	InputCostPerTokenAbove200kTokens          *float64 `json:"input_cost_per_token_above_200k_tokens,omitempty"`
	InputCostPerTokenAbove200kTokensPriority  *float64 `json:"input_cost_per_token_above_200k_tokens_priority,omitempty"`
	OutputCostPerTokenAbove200kTokens         *float64 `json:"output_cost_per_token_above_200k_tokens,omitempty"`
	OutputCostPerTokenAbove200kTokensPriority *float64 `json:"output_cost_per_token_above_200k_tokens_priority,omitempty"`
	// Costs - 272k Tier
	InputCostPerTokenAbove272kTokens          *float64 `json:"input_cost_per_token_above_272k_tokens,omitempty"`
	InputCostPerTokenAbove272kTokensPriority  *float64 `json:"input_cost_per_token_above_272k_tokens_priority,omitempty"`
	OutputCostPerTokenAbove272kTokens         *float64 `json:"output_cost_per_token_above_272k_tokens,omitempty"`
	OutputCostPerTokenAbove272kTokensPriority *float64 `json:"output_cost_per_token_above_272k_tokens_priority,omitempty"`

	// Costs - Cache
	CacheCreationInputTokenCost                        *float64 `json:"cache_creation_input_token_cost,omitempty"`
	CacheReadInputTokenCost                            *float64 `json:"cache_read_input_token_cost,omitempty"`
	CacheCreationInputTokenCostAbove200kTokens         *float64 `json:"cache_creation_input_token_cost_above_200k_tokens,omitempty"`
	CacheReadInputTokenCostAbove200kTokens             *float64 `json:"cache_read_input_token_cost_above_200k_tokens,omitempty"`
	CacheReadInputTokenCostAbove200kTokensPriority     *float64 `json:"cache_read_input_token_cost_above_200k_tokens_priority,omitempty"`
	CacheCreationInputTokenCostAbove1hr                *float64 `json:"cache_creation_input_token_cost_above_1hr,omitempty"`
	CacheCreationInputTokenCostAbove1hrAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_1hr_above_200k_tokens,omitempty"`
	CacheCreationInputAudioTokenCost                   *float64 `json:"cache_creation_input_audio_token_cost,omitempty"`
	CacheReadInputTokenCostPriority                    *float64 `json:"cache_read_input_token_cost_priority,omitempty"`
	CacheReadInputTokenCostFlex                        *float64 `json:"cache_read_input_token_cost_flex,omitempty"`
	CacheReadInputImageTokenCost                       *float64 `json:"cache_read_input_image_token_cost,omitempty"`
	CacheReadInputTokenCostAbove272kTokens             *float64 `json:"cache_read_input_token_cost_above_272k_tokens,omitempty"`
	CacheReadInputTokenCostAbove272kTokensPriority     *float64 `json:"cache_read_input_token_cost_above_272k_tokens_priority,omitempty"`

	// Costs - Image
	InputCostPerImage                             *float64 `json:"input_cost_per_image,omitempty"`
	InputCostPerPixel                             *float64 `json:"input_cost_per_pixel,omitempty"`
	OutputCostPerImage                            *float64 `json:"output_cost_per_image,omitempty"`
	OutputCostPerPixel                            *float64 `json:"output_cost_per_pixel,omitempty"`
	OutputCostPerImagePremiumImage                *float64 `json:"output_cost_per_image_premium_image,omitempty"`
	OutputCostPerImageAbove512x512Pixels          *float64 `json:"output_cost_per_image_above_512_and_512_pixels,omitempty"`
	OutputCostPerImageAbove512x512PixelsPremium   *float64 `json:"output_cost_per_image_above_512_and_512_pixels_and_premium_image,omitempty"`
	OutputCostPerImageAbove1024x1024Pixels        *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels,omitempty"`
	OutputCostPerImageAbove1024x1024PixelsPremium *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels_and_premium_image,omitempty"`
	OutputCostPerImageAbove2048x2048Pixels        *float64 `json:"output_cost_per_image_above_2048_and_2048_pixels,omitempty"`
	OutputCostPerImageAbove4096x4096Pixels        *float64 `json:"output_cost_per_image_above_4096_and_4096_pixels,omitempty"`
	OutputCostPerImageLowQuality                  *float64 `json:"output_cost_per_image_low_quality,omitempty"`
	OutputCostPerImageMediumQuality               *float64 `json:"output_cost_per_image_medium_quality,omitempty"`
	OutputCostPerImageHighQuality                 *float64 `json:"output_cost_per_image_high_quality,omitempty"`
	OutputCostPerImageAutoQuality                 *float64 `json:"output_cost_per_image_auto_quality,omitempty"`
	InputCostPerImageToken                        *float64 `json:"input_cost_per_image_token,omitempty"`
	OutputCostPerImageToken                       *float64 `json:"output_cost_per_image_token,omitempty"`

	// Costs - Audio/Video
	InputCostPerAudioToken      *float64 `json:"input_cost_per_audio_token,omitempty"`
	InputCostPerAudioPerSecond  *float64 `json:"input_cost_per_audio_per_second,omitempty"`
	InputCostPerSecond          *float64 `json:"input_cost_per_second,omitempty"`
	InputCostPerVideoPerSecond  *float64 `json:"input_cost_per_video_per_second,omitempty"`
	OutputCostPerAudioToken     *float64 `json:"output_cost_per_audio_token,omitempty"`
	OutputCostPerVideoPerSecond *float64 `json:"output_cost_per_video_per_second,omitempty"`
	OutputCostPerSecond         *float64 `json:"output_cost_per_second,omitempty"`

	// Costs - Other.
	//
	// SearchContextCostPerQuery is stored as a single float64, but the upstream datasheet
	// may represent it as a tiered object. See Entry.UnmarshalJSON.
	SearchContextCostPerQuery     *float64 `json:"search_context_cost_per_query,omitempty"`
	CodeInterpreterCostPerSession *float64 `json:"code_interpreter_cost_per_session,omitempty"`

	// Costs - OCR
	OCRCostPerPage        *float64 `json:"ocr_cost_per_page,omitempty"`
	AnnotationCostPerPage *float64 `json:"annotation_cost_per_page,omitempty"`
}

Options holds every individual cost field. Embedded into Entry and reused as the patch shape for Override.
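Because every field is a pointer, Options can serve as a sparse patch: nil means "leave the base value alone". The sketch below shows that idea on a two-field subset. The applyPatch helper and the merge rule (non-nil patch fields overwrite the base) are assumptions about how an Override's Options might be applied, not the package's confirmed logic.

```go
package main

import "fmt"

// Options here is a two-field subset of the documented struct.
type Options struct {
	InputCostPerToken  *float64
	OutputCostPerToken *float64
}

// ptr is a small helper for building pointer-valued fields.
func ptr(v float64) *float64 { return &v }

// applyPatch is a hypothetical merge: non-nil fields in patch replace the
// corresponding fields in base; nil fields leave base untouched.
func applyPatch(base, patch Options) Options {
	out := base
	if patch.InputCostPerToken != nil {
		out.InputCostPerToken = patch.InputCostPerToken
	}
	if patch.OutputCostPerToken != nil {
		out.OutputCostPerToken = patch.OutputCostPerToken
	}
	return out
}

func main() {
	base := Options{InputCostPerToken: ptr(2.5e-6), OutputCostPerToken: ptr(1e-5)}
	patch := Options{InputCostPerToken: ptr(2e-6)} // override input only
	merged := applyPatch(base, patch)
	fmt.Println(*merged.InputCostPerToken, *merged.OutputCostPerToken)
}
```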

type Override

type Override struct {
	ID            string                `json:"id"`
	Name          string                `json:"name"`
	ScopeKind     ScopeKind             `json:"scope_kind"`
	VirtualKeyID  *string               `json:"virtual_key_id,omitempty"`
	ProviderID    *string               `json:"provider_id,omitempty"`
	ProviderKeyID *string               `json:"provider_key_id,omitempty"`
	MatchType     MatchType             `json:"match_type"`
	Pattern       string                `json:"pattern"`
	RequestTypes  []schemas.RequestType `json:"request_types,omitempty"`
	Options       Options               `json:"options"`
}

Override describes a scoped pricing override shared across config storage, model catalog compilation, and governance APIs.

func (*Override) IsValid

func (override *Override) IsValid() error

IsValid validates the shared override contract before persistence or runtime use.

type ScopeKind

type ScopeKind string

ScopeKind identifies which governance scope an override applies to.

const (
	ScopeKindGlobal                ScopeKind = "global"
	ScopeKindProvider              ScopeKind = "provider"
	ScopeKindProviderKey           ScopeKind = "provider_key"
	ScopeKindVirtualKey            ScopeKind = "virtual_key"
	ScopeKindVirtualKeyProvider    ScopeKind = "virtual_key_provider"
	ScopeKindVirtualKeyProviderKey ScopeKind = "virtual_key_provider_key"
)
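The scope names suggest which Override ID fields each kind relies on. The sketch below encodes that reading as a validation helper. It is an inference from the names only: validateScope is hypothetical, and the actual rules enforced by Override.IsValid are not documented here and may differ (for instance, virtual_key_provider_key may also require a provider ID).

```go
package main

import "fmt"

// ScopeKind and its constants mirror the documented declarations.
type ScopeKind string

const (
	ScopeKindGlobal                ScopeKind = "global"
	ScopeKindProvider              ScopeKind = "provider"
	ScopeKindProviderKey           ScopeKind = "provider_key"
	ScopeKindVirtualKey            ScopeKind = "virtual_key"
	ScopeKindVirtualKeyProvider    ScopeKind = "virtual_key_provider"
	ScopeKindVirtualKeyProviderKey ScopeKind = "virtual_key_provider_key"
)

// strPtr builds an optional string field.
func strPtr(s string) *string { return &s }

// validateScope is a hypothetical check that each scope kind carries the
// IDs its name implies. Assumption: the mapping below is inferred from the
// constant names, not taken from the package source.
func validateScope(kind ScopeKind, virtualKeyID, providerID, providerKeyID *string) error {
	need := func(name string, v *string) error {
		if v == nil || *v == "" {
			return fmt.Errorf("scope %q requires %s", kind, name)
		}
		return nil
	}
	switch kind {
	case ScopeKindGlobal:
		return nil
	case ScopeKindProvider:
		return need("provider_id", providerID)
	case ScopeKindProviderKey:
		return need("provider_key_id", providerKeyID)
	case ScopeKindVirtualKey:
		return need("virtual_key_id", virtualKeyID)
	case ScopeKindVirtualKeyProvider:
		if err := need("virtual_key_id", virtualKeyID); err != nil {
			return err
		}
		return need("provider_id", providerID)
	case ScopeKindVirtualKeyProviderKey:
		if err := need("virtual_key_id", virtualKeyID); err != nil {
			return err
		}
		return need("provider_key_id", providerKeyID)
	}
	return fmt.Errorf("unknown scope kind %q", kind)
}

func main() {
	fmt.Println(validateScope(ScopeKindGlobal, nil, nil, nil))
	fmt.Println(validateScope(ScopeKindVirtualKeyProvider, strPtr("vk-1"), nil, nil))
}
```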

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store owns the pricing catalog (canonical rows, base-model index, derived datasheet view, supported request types/parameters) and pricing overrides.

All I/O is driven by the composer — Store does not own a ticker or the distributed lock; it exposes SyncFromURL / LoadFromDB / UpdateSyncConfig as the surface the composer calls.

func New

func New(configStore configstore.ConfigStore, logger schemas.Logger, cfg Config) *Store

New constructs a Store with the given config. The store is empty; callers (composer) drive bootstrap via LoadFromDB or LoadFromURLIntoMemory.

func NewTestStore

func NewTestStore(baseModelIndex map[string]string) *Store

NewTestStore constructs a minimal Store for unit tests without I/O. Optionally seed baseModelIndex so BaseModelName lookups resolve. A no-op logger is wired so cost / pricing paths (which assume Store.logger is non-nil) don't panic from external test code.

func (*Store) BaseModelName

func (s *Store) BaseModelName(model string) string

BaseModelName returns the canonical base model name. Uses the pre-computed base_model from the pricing catalog when present, falling back to algorithmic date/version stripping for unknown models.

"gpt-4o"               → "gpt-4o"
"openai/gpt-4o"        → "gpt-4o"
"gpt-4o-2024-08-06"    → "gpt-4o"  (algorithmic fallback)

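The three mappings above can be reproduced with a catalog lookup followed by a simple fallback. This is a minimal sketch: baseModelName is a hypothetical free function, and the fallback here strips only a provider prefix and a trailing -YYYY-MM-DD date, whereas the real algorithm is described as handling version suffixes as well.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dateSuffix matches a trailing -YYYY-MM-DD segment.
var dateSuffix = regexp.MustCompile(`-\d{4}-\d{2}-\d{2}$`)

// baseModelName is a hypothetical sketch: it consults a precomputed index
// first, then strips a provider prefix and a trailing date.
func baseModelName(index map[string]string, model string) string {
	if base, ok := index[model]; ok {
		return base
	}
	// Drop a "provider/" prefix if present.
	if i := strings.LastIndex(model, "/"); i >= 0 {
		model = model[i+1:]
	}
	if base, ok := index[model]; ok {
		return base
	}
	// Algorithmic fallback for models the catalog does not know.
	return dateSuffix.ReplaceAllString(model, "")
}

func main() {
	for _, m := range []string{"gpt-4o", "openai/gpt-4o", "gpt-4o-2024-08-06"} {
		fmt.Printf("%q -> %q\n", m, baseModelName(nil, m))
	}
}
```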
func (*Store) CalculateCost

func (s *Store) CalculateCost(result *schemas.BifrostResponse, scopes *LookupScopes) float64

CalculateCost calculates the cost of a Bifrost response. It handles all request types, cache debug billing, and tiered pricing. If scopes is nil, an empty LookupScopes is used; global and provider-scoped overrides may still apply since the provider is derived from the response.
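At its simplest, per-response cost is token counts multiplied by per-token rates. The sketch below shows only that core arithmetic for text tokens; usage, rates, and tokenCost are hypothetical names, and the real CalculateCost additionally covers tiers, caching, overrides, and non-text request types.

```go
package main

import "fmt"

// usage is a hypothetical, minimal view of a response's token counts.
type usage struct {
	PromptTokens     int
	CompletionTokens int
}

// rates is a two-field subset of the documented Options struct.
type rates struct {
	InputCostPerToken  *float64
	OutputCostPerToken *float64
}

// ptr builds a pointer-valued rate.
func ptr(v float64) *float64 { return &v }

// rateOrZero treats a missing rate as free.
func rateOrZero(p *float64) float64 {
	if p == nil {
		return 0
	}
	return *p
}

// tokenCost multiplies each token count by its per-token rate and sums.
func tokenCost(u usage, r rates) float64 {
	return float64(u.PromptTokens)*rateOrZero(r.InputCostPerToken) +
		float64(u.CompletionTokens)*rateOrZero(r.OutputCostPerToken)
}

func main() {
	u := usage{PromptTokens: 1000, CompletionTokens: 500}
	r := rates{InputCostPerToken: ptr(2.5e-6), OutputCostPerToken: ptr(1e-5)}
	// 1000 * 2.5e-6 + 500 * 1e-5, i.e. roughly 0.0075
	fmt.Printf("%.6f\n", tokenCost(u, r))
}
```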

func (*Store) DatasheetModelsForProvider

func (s *Store) DatasheetModelsForProvider(provider schemas.ModelProvider) []string

DatasheetModelsForProvider returns the per-provider model slice derived from pricing data on the last load/sync. Composer unions this with live.ModelsForProvider on read.

func (*Store) DatasheetProviders

func (s *Store) DatasheetProviders() []schemas.ModelProvider

DatasheetProviders returns every provider that has at least one pricing row in the datasheet view. Composer unions this with live + keyconfig to enumerate "all known providers" for GetProvidersForModel.

func (*Store) DeleteOverride

func (s *Store) DeleteOverride(id string)

DeleteOverride removes an override by ID.

func (*Store) DistinctBaseModelNames

func (s *Store) DistinctBaseModelNames() []string

DistinctBaseModelNames returns every unique base name from the catalog. Used by governance for cross-provider model selection.

func (*Store) Get

Get returns the raw pricing row for (model, provider, requestType) or nil. Useful for callers that need exact pricing without override resolution.

func (*Store) GetCapabilityEntry

func (s *Store) GetCapabilityEntry(model string, provider schemas.ModelProvider) *Entry

GetCapabilityEntry returns capability metadata (context length, supported modes, etc.) for a (model, provider) pair. Prefers chat → responses → text-completion entries; falls back to the lexicographically first mode if none of the preferred modes match. Tries the exact model first, then the canonical base model.
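The mode preference can be sketched as a two-pass selection. The pickMode helper is hypothetical, and the mode strings "chat" and "responses" are assumed spellings ("text_completion" appears elsewhere in this documentation).

```go
package main

import (
	"fmt"
	"sort"
)

// pickMode is a hypothetical helper: it returns the first preferred mode
// that is available, or the lexicographically first mode otherwise.
// Assumption: the preferred mode strings are spelled as listed here.
func pickMode(available []string) (string, bool) {
	for _, want := range []string{"chat", "responses", "text_completion"} {
		for _, m := range available {
			if m == want {
				return m, true
			}
		}
	}
	if len(available) == 0 {
		return "", false
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]string(nil), available...)
	sort.Strings(sorted)
	return sorted[0], true
}

func main() {
	fmt.Println(pickMode([]string{"embedding", "responses"}))
	fmt.Println(pickMode([]string{"image_generation", "embedding"}))
}
```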

func (*Store) GetModelParametersByModel

func (s *Store) GetModelParametersByModel(ctx context.Context, model string) (*configstoreTables.TableModelParameters, error)

GetModelParametersByModel reads a single model-parameter row from the DB. Used by the composer's cache-miss handler — installed via providerUtils.SetCacheMissHandler in the composer's Init.

func (*Store) GetPricingEntryForModel

func (s *Store) GetPricingEntryForModel(model string, provider schemas.ModelProvider) *Entry

GetPricingEntryForModel returns the first pricing entry found across known modes. Preserved for callers (inference handler) that want any pricing row for the model without specifying a request type.

func (*Store) GetSupportedParameters

func (s *Store) GetSupportedParameters(model string) []string

GetSupportedParameters returns the list of OpenAI-compatible parameter names a model accepts (e.g. temperature, top_p, tools). Returns nil for unknown models.

func (*Store) IsRequestTypeSupported

func (s *Store) IsRequestTypeSupported(model string, requestType schemas.RequestType) bool

IsRequestTypeSupported checks whether a model declares support for the given request type via the model-parameters datasheet.

func (*Store) IsSameModel

func (s *Store) IsSameModel(model1, model2 string) bool

IsSameModel reports whether two model strings refer to the same underlying model after normalization.

func (*Store) IsTextCompletionSupported

func (s *Store) IsTextCompletionSupported(model string, provider schemas.ModelProvider) bool

IsTextCompletionSupported checks whether a model has a text_completion pricing entry — used by litellmcompat to decide whether to convert text completion requests into chat completion requests.

func (*Store) LastSyncedAt

func (s *Store) LastSyncedAt() time.Time

LastSyncedAt returns the last successful URL→DB sync timestamp; zero before any sync has completed.

func (*Store) LoadFromDB

func (s *Store) LoadFromDB(ctx context.Context) error

LoadFromDB reloads the in-memory pricing cache + datasheet view from the config store. Used by the composer at bootstrap and as the gossip ReloadFromDB handler on non-leader pods.

func (*Store) LoadFromURLIntoMemory

func (s *Store) LoadFromURLIntoMemory(ctx context.Context) error

LoadFromURLIntoMemory loads pricing from the URL directly into memory (no DB). Used when the composer was built without a config store.

func (*Store) LoadModelParamsFromDB

func (s *Store) LoadModelParamsFromDB(ctx context.Context) (int, error)

LoadModelParamsFromDB bulk-loads model parameters from the DB into the provider-utils cache and the in-memory supportedResponseTypes / supportedParams indexes. Returns the row count so the composer can decide whether to background-sync from URL afterwards.

The provider-utils cache-miss handler in the composer still loads one row at a time when an unknown model is queried; both paths use the same JSON shape stored in the table.

func (*Store) LoadModelParamsFromURLIntoMemory

func (s *Store) LoadModelParamsFromURLIntoMemory(ctx context.Context) error

LoadModelParamsFromURLIntoMemory fetches model parameters from the URL and applies them in-memory only. Used when there's no config store.

func (*Store) LoadOverridesFromStore

func (s *Store) LoadOverridesFromStore(ctx context.Context) error

LoadOverridesFromStore reloads all overrides from the config store. Called at bootstrap and after force-reload paths.

func (*Store) MarkSynced

func (s *Store) MarkSynced(t time.Time)

MarkSynced records the timestamp of a successful sync — called by the composer's ticker after a successful tick.

func (*Store) ModelParametersURL

func (s *Store) ModelParametersURL() string

ModelParametersURL returns a snapshot of the model-parameters URL.

func (*Store) SetOverrides

func (s *Store) SetOverrides(rows []configstoreTables.TablePricingOverride) error

SetOverrides replaces the full in-memory override set. Duplicate IDs in the input keep the last-seen entry (matching today's behavior).
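The "last-seen entry wins" rule falls out naturally when the set is rebuilt into a map keyed by ID. A minimal sketch, using a hypothetical overrideRow type in place of configstoreTables.TablePricingOverride:

```go
package main

import "fmt"

// overrideRow is a hypothetical stand-in for a pricing override row.
type overrideRow struct {
	ID      string
	Pattern string
}

// buildIndex rebuilds the full ID index from scratch. Later rows with the
// same ID overwrite earlier ones, so the last-seen entry is kept.
func buildIndex(rows []overrideRow) map[string]overrideRow {
	idx := make(map[string]overrideRow, len(rows))
	for _, r := range rows {
		idx[r.ID] = r
	}
	return idx
}

func main() {
	idx := buildIndex([]overrideRow{
		{ID: "a", Pattern: "gpt-4o"},
		{ID: "b", Pattern: "claude*"},
		{ID: "a", Pattern: "gpt-4o-mini"}, // duplicate ID: replaces the first row
	})
	fmt.Println(len(idx), idx["a"].Pattern)
}
```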

func (*Store) SyncFromURL

func (s *Store) SyncFromURL(ctx context.Context) error

SyncFromURL fetches the upstream pricing datasheet, persists it to the DB (when configStore != nil), and refreshes the in-memory cache + derived datasheet view. On URL failure it falls back to existing DB records when any exist, otherwise propagates the error.

The composer owns the distributed lock, the ticker, and the after-sync gossip hook — none of that lives here. SyncFromURL is the pure "URL → DB → memory" step.
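The "URL → DB → memory" step with its fallback can be sketched with in-memory stand-ins. Everything here is hypothetical (syncer, rows, okFetch, failingFetch), and returning nil after a successful fallback to DB records is an assumption about the error contract.

```go
package main

import (
	"errors"
	"fmt"
)

// rows is a hypothetical stand-in for the pricing datasheet payload.
type rows map[string]float64

// syncer models the three layers: upstream fetch, optional DB, memory.
type syncer struct {
	fetch  func() (rows, error)
	hasDB  bool // false models a nil configStore
	db     rows
	memory rows
}

func clone(r rows) rows {
	out := make(rows, len(r))
	for k, v := range r {
		out[k] = v
	}
	return out
}

// syncFromURL fetches upstream, persists when a DB exists, and refreshes
// memory. On fetch failure it serves existing DB rows if there are any.
func (s *syncer) syncFromURL() error {
	fresh, err := s.fetch()
	if err != nil {
		if s.hasDB && len(s.db) > 0 {
			s.memory = clone(s.db) // fall back to what the DB already has
			return nil
		}
		return fmt.Errorf("fetch datasheet: %w", err)
	}
	if s.hasDB {
		s.db = clone(fresh)
	}
	s.memory = clone(fresh)
	return nil
}

var errUpstream = errors.New("upstream unavailable")

func okFetch() (rows, error)      { return rows{"gpt-4o": 2.5e-6}, nil }
func failingFetch() (rows, error) { return nil, errUpstream }

func main() {
	s := &syncer{fetch: okFetch, hasDB: true}
	fmt.Println(s.syncFromURL(), len(s.memory))

	s.fetch = failingFetch
	fmt.Println(s.syncFromURL(), len(s.memory)) // served from DB rows

	empty := &syncer{fetch: failingFetch}
	fmt.Println(empty.syncFromURL()) // nothing to fall back to
}
```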

func (*Store) SyncInterval

func (s *Store) SyncInterval() time.Duration

SyncInterval returns the minimum elapsed time between background syncs.
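A composer-side ticker could combine LastSyncedAt and SyncInterval into a "due" check along these lines. The dueForSync helper is hypothetical, as are the epoch and never values used for the demonstration; treating a zero timestamp as "always due" is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// DefaultSyncInterval is copied from the constants section.
const DefaultSyncInterval = 24 * time.Hour

// dueForSync reports whether at least interval has elapsed since the last
// successful sync. Assumption: a zero lastSynced means "never synced",
// which is treated as due.
func dueForSync(lastSynced, now time.Time, interval time.Duration) bool {
	if lastSynced.IsZero() {
		return true
	}
	return now.Sub(lastSynced) >= interval
}

// Demonstration values (hypothetical).
var (
	never time.Time // zero value: no sync has completed
	epoch = time.Date(2026, time.January, 1, 0, 0, 0, 0, time.UTC)
)

// hoursAfter returns base shifted forward by h hours.
func hoursAfter(base time.Time, h int) time.Time {
	return base.Add(time.Duration(h) * time.Hour)
}

func main() {
	fmt.Println(dueForSync(never, epoch, DefaultSyncInterval))
	fmt.Println(dueForSync(epoch, hoursAfter(epoch, 23), DefaultSyncInterval))
	fmt.Println(dueForSync(epoch, hoursAfter(epoch, 24), DefaultSyncInterval))
}
```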

func (*Store) SyncModelParamsFromURL

func (s *Store) SyncModelParamsFromURL(ctx context.Context) error

SyncModelParamsFromURL fetches model parameters from the configured URL, persists to DB (when configStore != nil), and refreshes the in-memory indexes. On URL failure it falls back to DB records when any exist.

func (*Store) URL

func (s *Store) URL() string

URL returns a snapshot of the pricing URL.

func (*Store) UpdateSyncConfig

func (s *Store) UpdateSyncConfig(cfg Config)

UpdateSyncConfig replaces URL / params URL / interval atomically. The composer is responsible for triggering a fresh sync after this returns.
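One way to get an all-or-nothing swap with snapshot-style getters is to keep the whole Config behind an atomic pointer. This is a sketch of that technique only; the store type below is hypothetical, it needs Go 1.19 or later for atomic.Pointer, and the real Store may well use a mutex instead.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Config mirrors the documented struct.
type Config struct {
	URL                string
	ModelParametersURL string
	SyncInterval       time.Duration
}

// store is a hypothetical holder that swaps the whole Config at once, so
// readers never observe a mix of old and new fields.
type store struct {
	cfg atomic.Pointer[Config]
}

func newStore(cfg Config) *store {
	s := &store{}
	s.UpdateSyncConfig(cfg)
	return s
}

// UpdateSyncConfig publishes a private copy of cfg in a single atomic store.
func (s *store) UpdateSyncConfig(cfg Config) {
	s.cfg.Store(&cfg)
}

// URL and SyncInterval return values from the current snapshot.
func (s *store) URL() string                 { return s.cfg.Load().URL }
func (s *store) SyncInterval() time.Duration { return s.cfg.Load().SyncInterval }

func main() {
	s := newStore(Config{URL: "https://a.example/datasheet", SyncInterval: 24 * time.Hour})
	s.UpdateSyncConfig(Config{URL: "https://b.example/datasheet", SyncInterval: time.Hour})
	fmt.Println(s.URL(), s.SyncInterval())
}
```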

func (*Store) UpsertModelPricingAttributes

func (s *Store) UpsertModelPricingAttributes(ctx context.Context, model string, provider schemas.ModelProvider, attrs map[string]string) (int64, error)

UpsertModelPricingAttributes writes the additional_attributes column for every pricing row that matches (model, provider), then reloads the pricing cache so the new values are immediately visible to list-models. Returns the number of rows updated (0 = no such pricing row, which callers must surface as a validation error). An empty/nil attrs map clears the column.

func (*Store) UpsertOverrides

func (s *Store) UpsertOverrides(rows ...*configstoreTables.TablePricingOverride) error

UpsertOverrides inserts or replaces one or more overrides, rebuilding the lookup map exactly once at the end.
