datasheet

package

Version: v1.3.22 (latest) | Published: Jun 21, 2026 | License: Apache-2.0 | Imports: 21 | Imported by: 0

Repository

github.com/maximhq/bifrost

Documentation ¶

Overview ¶

Package pricing owns the pricing/model-parameters catalog (compartments A + B + E): canonical pricing rows fetched from the upstream datasheet, per-provider datasheet-derived model views, supported request types and parameters, and scoped pricing overrides. It also computes per-response cost.

The package performs no list-models I/O — that's the live store's domain. The hourly sync ticker lives on the composer (ModelCatalog), not here; the composer calls SyncFromURL / LoadFromDB / Sync*ModelParams*. Reads are hot-path and lock-free where possible.

Index ¶

Constants
type Config
type Entry
- func (p *Entry) UnmarshalJSON(data []byte) error
type LookupScopes
- func LookupScopesFromContext(ctx *schemas.BifrostContext, provider string) *LookupScopes
type MatchType
type Options
type Override
- func (override *Override) IsValid() error
type ScopeKind
type Store
- func New(configStore configstore.ConfigStore, logger schemas.Logger, cfg Config) *Store
- func NewTestStore(baseModelIndex map[string]string) *Store

Constants ¶

const (
	DefaultURL                    = "https://getbifrost.ai/datasheet"
	DefaultModelParametersURL     = "https://getbifrost.ai/datasheet/model-parameters"
	DefaultSyncInterval           = 24 * time.Hour
	DefaultPricingTimeout         = 45 * time.Second
	DefaultModelParametersTimeout = 45 * time.Second
)

Defaults for sync configuration and timeouts. Exposed so the composer can fall back to these when the framework Config leaves fields nil.

const (
	TokenTierAbove272K = 272000
	TokenTierAbove200K = 200000
	TokenTierAbove128K = 128000
)

Tier boundaries for tiered token pricing. They match the upstream datasheet keys (input_cost_per_token_above_<N>k_tokens).
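
As an illustration of how these boundaries might be used, the sketch below picks the highest tier that a token count exceeds. The helper tierFor is hypothetical, and both the strictly-greater comparison and the highest-tier-wins rule are assumptions; the package's real selection logic is not shown in this documentation.

```go
package main

import "fmt"

// Local copies of the documented tier boundaries.
const (
	TokenTierAbove272K = 272000
	TokenTierAbove200K = 200000
	TokenTierAbove128K = 128000
)

// tierFor is a hypothetical helper: it returns the highest boundary that
// tokens strictly exceeds, or 0 when no tier applies.
// ASSUMPTION: "above" means strictly greater than the boundary.
func tierFor(tokens int) int {
	for _, b := range []int{TokenTierAbove272K, TokenTierAbove200K, TokenTierAbove128K} {
		if tokens > b {
			return b
		}
	}
	return 0
}

func main() {
	for _, n := range []int{1000, 150000, 200000, 300000} {
		fmt.Printf("%d tokens -> tier %d\n", n, tierFor(n))
	}
}
```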

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Config ¶

type Config struct {
	URL                string
	ModelParametersURL string
	SyncInterval       time.Duration
}

Config groups the values the composer hands to New / UpdateSyncConfig. Zero values fall back to the Default* constants.
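
The zero-value fallback can be pictured with a small sketch. The Config type and constants below are local copies of the documented ones; withDefaults is a hypothetical helper name, and the real package may apply the defaults elsewhere (for example inside New).

```go
package main

import (
	"fmt"
	"time"
)

// Local copies of the documented defaults.
const (
	DefaultURL                = "https://getbifrost.ai/datasheet"
	DefaultModelParametersURL = "https://getbifrost.ai/datasheet/model-parameters"
	DefaultSyncInterval       = 24 * time.Hour
)

// Config mirrors the documented struct.
type Config struct {
	URL                string
	ModelParametersURL string
	SyncInterval       time.Duration
}

// withDefaults is a hypothetical helper that replaces zero values with the
// Default* constants and leaves explicitly set fields untouched.
func withDefaults(cfg Config) Config {
	if cfg.URL == "" {
		cfg.URL = DefaultURL
	}
	if cfg.ModelParametersURL == "" {
		cfg.ModelParametersURL = DefaultModelParametersURL
	}
	if cfg.SyncInterval == 0 {
		cfg.SyncInterval = DefaultSyncInterval
	}
	return cfg
}

func main() {
	cfg := withDefaults(Config{SyncInterval: 6 * time.Hour})
	fmt.Println(cfg.URL, cfg.ModelParametersURL, cfg.SyncInterval)
}
```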

type Entry ¶

type Entry struct {
	BaseModel string `json:"base_model,omitempty"`
	Provider  string `json:"provider"`
	Mode      string `json:"mode"`

	ContextLength   *int                  `json:"context_length,omitempty"`
	MaxInputTokens  *int                  `json:"max_input_tokens,omitempty"`
	MaxOutputTokens *int                  `json:"max_output_tokens,omitempty"`
	Architecture    *schemas.Architecture `json:"architecture,omitempty"`

	// AdditionalAttributes carries editorial metadata stored on the pricing
	// row (e.g. description). Populated from the DB read path only; the
	// json:"-" tag prevents URL datasheet payloads from ever feeding into
	// this field via json.Unmarshal.
	AdditionalAttributes map[string]string `json:"-"`

	Options
}

Entry represents a single model's pricing information. Field names and JSON tags match the datasheet schema exactly. AdditionalAttributes carries editorial metadata stored on the pricing row — never populated from the URL datasheet, only from DB reads via the management API.

func (*Entry) UnmarshalJSON ¶

func (p *Entry) UnmarshalJSON(data []byte) error

UnmarshalJSON handles the special case where search_context_cost_per_query may arrive as either a plain float64 or a tiered object {"search_context_size_low":…, "search_context_size_medium":…, "search_context_size_high":…}.
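
A minimal sketch of this kind of dual-shape decoding follows. The entry type is a trimmed, hypothetical stand-in for Entry, and decode is a helper added for the example. How the real method collapses the tiered object into one float64 is not documented here; taking the medium tier is purely an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// entry is a trimmed, hypothetical stand-in for the package's Entry type.
type entry struct {
	Provider                  string
	SearchContextCostPerQuery *float64
}

// searchTiers mirrors the tiered object shape named in the documentation.
type searchTiers struct {
	Low    *float64 `json:"search_context_size_low"`
	Medium *float64 `json:"search_context_size_medium"`
	High   *float64 `json:"search_context_size_high"`
}

// UnmarshalJSON accepts search_context_cost_per_query as a number or an object.
func (e *entry) UnmarshalJSON(data []byte) error {
	var raw struct {
		Provider string          `json:"provider"`
		Search   json.RawMessage `json:"search_context_cost_per_query"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	e.Provider = raw.Provider
	e.SearchContextCostPerQuery = nil
	if len(raw.Search) == 0 || string(raw.Search) == "null" {
		return nil // field absent or null
	}
	var flat float64
	if err := json.Unmarshal(raw.Search, &flat); err == nil {
		e.SearchContextCostPerQuery = &flat // plain float64 form
		return nil
	}
	var tiers searchTiers
	if err := json.Unmarshal(raw.Search, &tiers); err != nil {
		return err // neither a number nor a tiered object
	}
	// ASSUMPTION: collapse the tiers by taking the medium value.
	e.SearchContextCostPerQuery = tiers.Medium
	return nil
}

// decode is a small helper for the example.
func decode(s string) (*entry, error) {
	var e entry
	if err := json.Unmarshal([]byte(s), &e); err != nil {
		return nil, err
	}
	return &e, nil
}

func main() {
	inputs := []string{
		`{"provider":"openai","search_context_cost_per_query":0.03}`,
		`{"provider":"openai","search_context_cost_per_query":{"search_context_size_low":0.01,"search_context_size_medium":0.02,"search_context_size_high":0.03}}`,
	}
	for _, s := range inputs {
		e, err := decode(s)
		if err != nil || e.SearchContextCostPerQuery == nil {
			fmt.Println("no search cost:", err)
			continue
		}
		fmt.Println(e.Provider, *e.SearchContextCostPerQuery)
	}
}
```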

type LookupScopes ¶

type LookupScopes struct {
	VirtualKeyID  string
	SelectedKeyID string
	Provider      string
}

LookupScopes carries the runtime identifiers used to resolve scoped pricing overrides during cost calculation.

func LookupScopesFromContext ¶

func LookupScopesFromContext(ctx *schemas.BifrostContext, provider string) *LookupScopes

LookupScopesFromContext builds a LookupScopes from a BifrostContext. Reads the governance virtual key ID (not the raw VK token) and the selected key ID. provider should be the provider name string (e.g. "openai"); pass "" if unavailable. Returns nil only when ctx is nil. An empty scopes value is still returned when all fields are empty so global-scope overrides remain evaluable.

Not safe to call from a goroutine: it reads from ctx, which is cancelled when the request ends. Call it synchronously in PostHooks and pass the result by value to anything that may outlive the request.
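
The calling pattern described here can be sketched as follows. requestCtx and scopesFromContext are hypothetical stand-ins for *schemas.BifrostContext and LookupScopesFromContext, and postHook is an illustrative name; only the shape of the pattern (resolve synchronously, hand a copy to the goroutine) is the point.

```go
package main

import "fmt"

// requestCtx is a hypothetical stand-in for *schemas.BifrostContext.
type requestCtx struct {
	virtualKeyID  string
	selectedKeyID string
}

// LookupScopes mirrors the documented struct.
type LookupScopes struct {
	VirtualKeyID  string
	SelectedKeyID string
	Provider      string
}

// scopesFromContext stands in for LookupScopesFromContext: nil only for a nil ctx.
func scopesFromContext(ctx *requestCtx, provider string) *LookupScopes {
	if ctx == nil {
		return nil
	}
	return &LookupScopes{
		VirtualKeyID:  ctx.virtualKeyID,
		SelectedKeyID: ctx.selectedKeyID,
		Provider:      provider,
	}
}

// postHook resolves scopes while the request is still live, then passes a
// copy to background work so the goroutine never touches ctx.
func postHook(ctx *requestCtx, provider string) <-chan LookupScopes {
	out := make(chan LookupScopes, 1)
	scopes := scopesFromContext(ctx, provider) // synchronous read of ctx
	if scopes == nil {
		close(out)
		return out
	}
	go func(s LookupScopes) { // s is a value copy
		out <- s
		close(out)
	}(*scopes)
	return out
}

func main() {
	got := <-postHook(&requestCtx{virtualKeyID: "vk-1", selectedKeyID: "key-1"}, "openai")
	fmt.Printf("%+v\n", got)
}
```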

type MatchType ¶

type MatchType string

MatchType controls how an override pattern is matched against model names.

const (
	MatchTypeExact    MatchType = "exact"
	MatchTypeWildcard MatchType = "wildcard"
)
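
To make the two match types concrete, here is a hedged sketch. The matches helper is hypothetical, and the wildcard semantics (a trailing * meaning "prefix match") are an assumption; the documentation does not specify the pattern syntax the package actually accepts.

```go
package main

import (
	"fmt"
	"strings"
)

// MatchType and its values mirror the documented declarations.
type MatchType string

const (
	MatchTypeExact    MatchType = "exact"
	MatchTypeWildcard MatchType = "wildcard"
)

// matches is a hypothetical helper.
// ASSUMPTION: a wildcard pattern ends in "*" and matches by prefix.
func matches(mt MatchType, pattern, model string) bool {
	switch mt {
	case MatchTypeExact:
		return pattern == model
	case MatchTypeWildcard:
		if !strings.HasSuffix(pattern, "*") {
			return pattern == model
		}
		return strings.HasPrefix(model, strings.TrimSuffix(pattern, "*"))
	default:
		return false
	}
}

func main() {
	fmt.Println(matches(MatchTypeExact, "gpt-4o", "gpt-4o"))
	fmt.Println(matches(MatchTypeWildcard, "gpt-4*", "gpt-4o-mini"))
	fmt.Println(matches(MatchTypeWildcard, "gpt-4*", "claude-x"))
}
```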

type Options ¶

type Options struct {
	// Costs - Text
	InputCostPerToken          *float64 `json:"input_cost_per_token,omitempty"`
	OutputCostPerToken         *float64 `json:"output_cost_per_token,omitempty"`
	InputCostPerTokenBatches   *float64 `json:"input_cost_per_token_batches,omitempty"`
	OutputCostPerTokenBatches  *float64 `json:"output_cost_per_token_batches,omitempty"`
	InputCostPerTokenPriority  *float64 `json:"input_cost_per_token_priority,omitempty"`
	OutputCostPerTokenPriority *float64 `json:"output_cost_per_token_priority,omitempty"`
	InputCostPerTokenFlex      *float64 `json:"input_cost_per_token_flex,omitempty"`
	OutputCostPerTokenFlex     *float64 `json:"output_cost_per_token_flex,omitempty"`
	// Fast mode (Anthropic research preview, speed:"fast" on Opus 4.6/4.7/4.8).
	// Flat rate across the full context window — no 128k/200k/272k tiering.
	InputCostPerTokenFast  *float64 `json:"input_cost_per_token_fast,omitempty"`
	OutputCostPerTokenFast *float64 `json:"output_cost_per_token_fast,omitempty"`
	InputCostPerCharacter  *float64 `json:"input_cost_per_character,omitempty"`
	// Costs - 128k Tier
	InputCostPerTokenAbove128kTokens          *float64 `json:"input_cost_per_token_above_128k_tokens,omitempty"`
	InputCostPerImageAbove128kTokens          *float64 `json:"input_cost_per_image_above_128k_tokens,omitempty"`
	InputCostPerVideoPerSecondAbove128kTokens *float64 `json:"input_cost_per_video_per_second_above_128k_tokens,omitempty"`
	InputCostPerAudioPerSecondAbove128kTokens *float64 `json:"input_cost_per_audio_per_second_above_128k_tokens,omitempty"`
	OutputCostPerTokenAbove128kTokens         *float64 `json:"output_cost_per_token_above_128k_tokens,omitempty"`
	// Costs - 200k Tier
	InputCostPerTokenAbove200kTokens          *float64 `json:"input_cost_per_token_above_200k_tokens,omitempty"`
	InputCostPerTokenAbove200kTokensPriority  *float64 `json:"input_cost_per_token_above_200k_tokens_priority,omitempty"`
	OutputCostPerTokenAbove200kTokens         *float64 `json:"output_cost_per_token_above_200k_tokens,omitempty"`
	OutputCostPerTokenAbove200kTokensPriority *float64 `json:"output_cost_per_token_above_200k_tokens_priority,omitempty"`
	// Costs - 272k Tier
	InputCostPerTokenAbove272kTokens          *float64 `json:"input_cost_per_token_above_272k_tokens,omitempty"`
	InputCostPerTokenAbove272kTokensPriority  *float64 `json:"input_cost_per_token_above_272k_tokens_priority,omitempty"`
	OutputCostPerTokenAbove272kTokens         *float64 `json:"output_cost_per_token_above_272k_tokens,omitempty"`
	OutputCostPerTokenAbove272kTokensPriority *float64 `json:"output_cost_per_token_above_272k_tokens_priority,omitempty"`

	// Costs - Cache
	CacheCreationInputTokenCost                        *float64 `json:"cache_creation_input_token_cost,omitempty"`
	CacheReadInputTokenCost                            *float64 `json:"cache_read_input_token_cost,omitempty"`
	CacheCreationInputTokenCostAbove200kTokens         *float64 `json:"cache_creation_input_token_cost_above_200k_tokens,omitempty"`
	CacheReadInputTokenCostAbove200kTokens             *float64 `json:"cache_read_input_token_cost_above_200k_tokens,omitempty"`
	CacheReadInputTokenCostAbove200kTokensPriority     *float64 `json:"cache_read_input_token_cost_above_200k_tokens_priority,omitempty"`
	CacheCreationInputTokenCostAbove1hr                *float64 `json:"cache_creation_input_token_cost_above_1hr,omitempty"`
	CacheCreationInputTokenCostAbove1hrAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_1hr_above_200k_tokens,omitempty"`
	CacheCreationInputAudioTokenCost                   *float64 `json:"cache_creation_input_audio_token_cost,omitempty"`
	CacheReadInputTokenCostPriority                    *float64 `json:"cache_read_input_token_cost_priority,omitempty"`
	CacheReadInputTokenCostFlex                        *float64 `json:"cache_read_input_token_cost_flex,omitempty"`
	CacheReadInputImageTokenCost                       *float64 `json:"cache_read_input_image_token_cost,omitempty"`
	CacheReadInputTokenCostAbove272kTokens             *float64 `json:"cache_read_input_token_cost_above_272k_tokens,omitempty"`
	CacheReadInputTokenCostAbove272kTokensPriority     *float64 `json:"cache_read_input_token_cost_above_272k_tokens_priority,omitempty"`

	// Costs - Image
	InputCostPerImage                             *float64 `json:"input_cost_per_image,omitempty"`
	InputCostPerPixel                             *float64 `json:"input_cost_per_pixel,omitempty"`
	OutputCostPerImage                            *float64 `json:"output_cost_per_image,omitempty"`
	OutputCostPerPixel                            *float64 `json:"output_cost_per_pixel,omitempty"`
	OutputCostPerImagePremiumImage                *float64 `json:"output_cost_per_image_premium_image,omitempty"`
	OutputCostPerImageAbove512x512Pixels          *float64 `json:"output_cost_per_image_above_512_and_512_pixels,omitempty"`
	OutputCostPerImageAbove512x512PixelsPremium   *float64 `json:"output_cost_per_image_above_512_and_512_pixels_and_premium_image,omitempty"`
	OutputCostPerImageAbove1024x1024Pixels        *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels,omitempty"`
	OutputCostPerImageAbove1024x1024PixelsPremium *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels_and_premium_image,omitempty"`
	OutputCostPerImageAbove2048x2048Pixels        *float64 `json:"output_cost_per_image_above_2048_and_2048_pixels,omitempty"`
	OutputCostPerImageAbove4096x4096Pixels        *float64 `json:"output_cost_per_image_above_4096_and_4096_pixels,omitempty"`
	OutputCostPerImageLowQuality                  *float64 `json:"output_cost_per_image_low_quality,omitempty"`
	OutputCostPerImageMediumQuality               *float64 `json:"output_cost_per_image_medium_quality,omitempty"`
	OutputCostPerImageHighQuality                 *float64 `json:"output_cost_per_image_high_quality,omitempty"`
	OutputCostPerImageAutoQuality                 *float64 `json:"output_cost_per_image_auto_quality,omitempty"`
	InputCostPerImageToken                        *float64 `json:"input_cost_per_image_token,omitempty"`
	OutputCostPerImageToken                       *float64 `json:"output_cost_per_image_token,omitempty"`

	// Costs - Audio/Video
	InputCostPerAudioToken      *float64 `json:"input_cost_per_audio_token,omitempty"`
	InputCostPerAudioPerSecond  *float64 `json:"input_cost_per_audio_per_second,omitempty"`
	InputCostPerSecond          *float64 `json:"input_cost_per_second,omitempty"`
	InputCostPerVideoPerSecond  *float64 `json:"input_cost_per_video_per_second,omitempty"`
	OutputCostPerAudioToken     *float64 `json:"output_cost_per_audio_token,omitempty"`
	OutputCostPerVideoPerSecond *float64 `json:"output_cost_per_video_per_second,omitempty"`
	OutputCostPerSecond         *float64 `json:"output_cost_per_second,omitempty"`

	// Costs - Other.
	//
	// SearchContextCostPerQuery is stored as a single float64, but the upstream datasheet
	// represents it as a tiered object. See Entry.UnmarshalJSON.
	SearchContextCostPerQuery     *float64 `json:"search_context_cost_per_query,omitempty"`
	CodeInterpreterCostPerSession *float64 `json:"code_interpreter_cost_per_session,omitempty"`

	// Costs - OCR
	OCRCostPerPage        *float64 `json:"ocr_cost_per_page,omitempty"`
	AnnotationCostPerPage *float64 `json:"annotation_cost_per_page,omitempty"`
}

Options holds every individual cost field. Embedded into Entry and reused as the patch shape for Override.
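
Because every field is a pointer, an Options value can act as a sparse patch. The sketch below shows one plausible reading with a two-field stand-in: non-nil patch fields replace the base value and nil fields leave it alone. applyPatch and f64 are hypothetical helpers, and the real merge rules are an assumption.

```go
package main

import "fmt"

// Options is a two-field stand-in for the documented struct.
type Options struct {
	InputCostPerToken  *float64
	OutputCostPerToken *float64
}

// f64 returns a pointer to v, which keeps literals readable.
func f64(v float64) *float64 { return &v }

// applyPatch is a hypothetical helper: non-nil patch fields win.
// ASSUMPTION: a nil field in the patch means "keep the base value".
func applyPatch(base, patch Options) Options {
	if patch.InputCostPerToken != nil {
		base.InputCostPerToken = patch.InputCostPerToken
	}
	if patch.OutputCostPerToken != nil {
		base.OutputCostPerToken = patch.OutputCostPerToken
	}
	return base
}

func main() {
	base := Options{InputCostPerToken: f64(1), OutputCostPerToken: f64(2)}
	out := applyPatch(base, Options{OutputCostPerToken: f64(5)})
	fmt.Println(*out.InputCostPerToken, *out.OutputCostPerToken)
}
```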

type Override ¶

type Override struct {
	ID            string                `json:"id"`
	Name          string                `json:"name"`
	ScopeKind     ScopeKind             `json:"scope_kind"`
	VirtualKeyID  *string               `json:"virtual_key_id,omitempty"`
	ProviderID    *string               `json:"provider_id,omitempty"`
	ProviderKeyID *string               `json:"provider_key_id,omitempty"`
	MatchType     MatchType             `json:"match_type"`
	Pattern       string                `json:"pattern"`
	RequestTypes  []schemas.RequestType `json:"request_types,omitempty"`
	Options       Options               `json:"options"`
}

Override describes a scoped pricing override shared across config storage, model catalog compilation, and governance APIs.

func (*Override) IsValid ¶

func (override *Override) IsValid() error

IsValid validates the shared override contract before persistence or runtime use.

type ScopeKind ¶

type ScopeKind string

ScopeKind identifies which governance scope an override applies to.

const (
	ScopeKindGlobal                ScopeKind = "global"
	ScopeKindProvider              ScopeKind = "provider"
	ScopeKindProviderKey           ScopeKind = "provider_key"
	ScopeKindVirtualKey            ScopeKind = "virtual_key"
	ScopeKindVirtualKeyProvider    ScopeKind = "virtual_key_provider"
	ScopeKindVirtualKeyProviderKey ScopeKind = "virtual_key_provider_key"
)

type Store ¶

type Store struct {
	// contains filtered or unexported fields
}

Store owns the pricing catalog (canonical rows, base-model index, derived datasheet view, supported request types/parameters) and pricing overrides.

All I/O is driven by the composer — Store does not own a ticker or the distributed lock; it exposes SyncFromURL / LoadFromDB / UpdateSyncConfig as the surface the composer calls.

func New ¶

func New(configStore configstore.ConfigStore, logger schemas.Logger, cfg Config) *Store

New constructs a Store with the given config. The store is empty; callers (composer) drive bootstrap via LoadFromDB or LoadFromURLIntoMemory.

func NewTestStore ¶

func NewTestStore(baseModelIndex map[string]string) *Store

NewTestStore constructs a minimal Store for unit tests without I/O. Optionally seed baseModelIndex so BaseModelName lookups resolve. A no-op logger is wired so cost / pricing paths (which assume Store.logger is non-nil) do not panic when called from external test code.

func (*Store) BaseModelName ¶

func (s *Store) BaseModelName(model string) string

BaseModelName returns the canonical base model name. Uses the pre-computed base_model from the pricing catalog when present, falling back to algorithmic date/version stripping for unknown models.

"gpt-4o"               → "gpt-4o"
"openai/gpt-4o"        → "gpt-4o"
"gpt-4o-2024-08-06"    → "gpt-4o"  (algorithmic fallback)

func (*Store) CalculateCost ¶

func (s *Store) CalculateCost(result *schemas.BifrostResponse, scopes *LookupScopes) float64

CalculateCost calculates the cost of a Bifrost response. It handles all request types, cache debug billing, and tiered pricing. If scopes is nil, an empty LookupScopes is used; global and provider-scoped overrides may still apply since the provider is derived from the response.

func (*Store) CalculateCostForUsage ¶ added in v1.3.22

func (s *Store) CalculateCostForUsage(usage *schemas.BifrostLLMUsage, provider schemas.ModelProvider, model string, requestType schemas.RequestType, scopes *LookupScopes) float64

CalculateCostForUsage computes the dollar cost from a bare usage object plus provider / model / request type, for cases where no full BifrostResponse exists. The primary use is billing partial usage carried on a failed or cancelled request via BifrostError.ExtraFields.BilledUsage: the provider consumed tokens, so we must charge for them even though there is no success response to read. It mirrors CalculateCost's compute path so success and failure billing use identical rates. Returns 0 when usage is nil.
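
As a rough picture of usage-based billing, the sketch below multiplies token counts by flat per-token rates and returns 0 for nil usage. The usage and rates types and costForUsage are hypothetical; the sketch deliberately ignores tiers, cache fields, request types, and overrides, all of which the real method is documented to handle.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for schemas.BifrostLLMUsage.
type usage struct {
	PromptTokens     int
	CompletionTokens int
}

// rates holds flat per-token prices (ASSUMPTION: no tiering or caching).
type rates struct {
	Input  float64
	Output float64
}

// costForUsage is a hypothetical helper that returns 0 when usage is nil,
// matching the documented nil behavior.
func costForUsage(u *usage, r rates) float64 {
	if u == nil {
		return 0
	}
	return float64(u.PromptTokens)*r.Input + float64(u.CompletionTokens)*r.Output
}

func main() {
	// Partial usage from a failed request is billed at the same rates.
	partial := &usage{PromptTokens: 1000, CompletionTokens: 500}
	fmt.Printf("%.6f\n", costForUsage(partial, rates{Input: 0.000002, Output: 0.000008}))
	fmt.Println(costForUsage(nil, rates{}))
}
```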

func (*Store) DatasheetModelsForProvider ¶

func (s *Store) DatasheetModelsForProvider(provider schemas.ModelProvider) []string

DatasheetModelsForProvider returns the per-provider model slice derived from pricing data on the last load/sync. Composer unions this with live.ModelsForProvider on read.

func (*Store) DatasheetProviders ¶

func (s *Store) DatasheetProviders() []schemas.ModelProvider

DatasheetProviders returns every provider that has at least one pricing row in the datasheet view. Composer unions this with live + keyconfig to enumerate "all known providers" for GetProvidersForModel.

func (*Store) DeleteOverride ¶

func (s *Store) DeleteOverride(id string)

DeleteOverride removes an override by ID.

func (*Store) DistinctBaseModelNames ¶

func (s *Store) DistinctBaseModelNames() []string

DistinctBaseModelNames returns every unique base name from the catalog. Used by governance for cross-provider model selection.

func (*Store) Get ¶

func (s *Store) Get(model string, provider schemas.ModelProvider, requestType schemas.RequestType) *configstoreTables.TableModelPricing

Get returns the raw pricing row for (model, provider, requestType) or nil. Useful for callers that need exact pricing without override resolution.

func (*Store) GetCapabilityEntry ¶

func (s *Store) GetCapabilityEntry(model string, provider schemas.ModelProvider) *Entry

GetCapabilityEntry returns capability metadata (context length, supported modes, etc.) for a (model, provider) pair. Prefers chat → responses → text-completion entries; falls back to the lexicographically first mode if none of the preferred modes match. Tries the exact model first, then the canonical base model.
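
The preference order can be sketched as a small selection function. pickMode is a hypothetical helper, and the mode strings "chat", "responses", and "text_completion" are assumptions about how modes are spelled in the catalog.

```go
package main

import (
	"fmt"
	"sort"
)

// preferred lists modes in the documented priority order.
// ASSUMPTION: these are the exact mode strings used in pricing rows.
var preferred = []string{"chat", "responses", "text_completion"}

// pickMode is a hypothetical helper: it returns the first preferred mode
// present, else the lexicographically first mode, else false.
func pickMode(modes []string) (string, bool) {
	if len(modes) == 0 {
		return "", false
	}
	available := make(map[string]bool, len(modes))
	for _, m := range modes {
		available[m] = true
	}
	for _, p := range preferred {
		if available[p] {
			return p, true
		}
	}
	sorted := append([]string(nil), modes...) // copy so the input is not reordered
	sort.Strings(sorted)
	return sorted[0], true
}

func main() {
	fmt.Println(pickMode([]string{"embedding", "responses"}))
	fmt.Println(pickMode([]string{"rerank", "embedding"}))
}
```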

func (*Store) GetModelParametersByModel ¶

func (s *Store) GetModelParametersByModel(ctx context.Context, model string) (*configstoreTables.TableModelParameters, error)

GetModelParametersByModel reads a single model-parameter row from the DB. Used by the composer's cache-miss handler — installed via providerUtils.SetCacheMissHandler in the composer's Init.

func (*Store) GetPricingEntryForModel ¶

func (s *Store) GetPricingEntryForModel(model string, provider schemas.ModelProvider) *Entry

GetPricingEntryForModel returns the first pricing entry found across known modes. Preserved for callers (inference handler) that want any pricing row for the model without specifying a request type.

func (*Store) GetSupportedParameters ¶

func (s *Store) GetSupportedParameters(model string) []string

GetSupportedParameters returns the list of OpenAI-compatible parameter names a model accepts (e.g. temperature, top_p, tools). Returns nil for unknown models.

func (*Store) IsRequestTypeSupported ¶

func (s *Store) IsRequestTypeSupported(model string, requestType schemas.RequestType) bool

IsRequestTypeSupported checks whether a model declares support for the given request type via the model-parameters datasheet.

func (*Store) IsSameModel ¶

func (s *Store) IsSameModel(model1, model2 string) bool

IsSameModel reports whether two model strings refer to the same underlying model after normalization.
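
A minimal sketch of this normalization, reusing the examples given for BaseModelName, might look as follows. baseModelName and isSameModel are hypothetical stand-ins; the rules used here (strip a provider prefix, then strip a trailing YYYY-MM-DD date) are assumptions inferred from those examples, not the package's actual algorithm.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dateSuffix matches a trailing -YYYY-MM-DD version stamp.
var dateSuffix = regexp.MustCompile(`-\d{4}-\d{2}-\d{2}$`)

// baseModelName is a hypothetical stand-in for Store.BaseModelName: it
// prefers the pre-computed index and falls back to string stripping.
func baseModelName(index map[string]string, model string) string {
	if base, ok := index[model]; ok {
		return base
	}
	if i := strings.LastIndex(model, "/"); i >= 0 {
		model = model[i+1:] // drop a provider prefix such as "openai/"
	}
	return dateSuffix.ReplaceAllString(model, "")
}

// isSameModel compares two names after normalization.
func isSameModel(index map[string]string, a, b string) bool {
	return baseModelName(index, a) == baseModelName(index, b)
}

func main() {
	index := map[string]string{"my-alias": "gpt-4o"} // illustrative seed data
	for _, m := range []string{"gpt-4o", "openai/gpt-4o", "gpt-4o-2024-08-06", "my-alias"} {
		fmt.Printf("%q -> %q\n", m, baseModelName(index, m))
	}
	fmt.Println(isSameModel(index, "openai/gpt-4o", "gpt-4o-2024-08-06"))
}
```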

func (*Store) IsTextCompletionSupported ¶

func (s *Store) IsTextCompletionSupported(model string, provider schemas.ModelProvider) bool

IsTextCompletionSupported checks whether a model has a text_completion pricing entry — used by litellmcompat to decide whether to convert text completion requests into chat completion requests.

func (*Store) LastSyncedAt ¶

func (s *Store) LastSyncedAt() time.Time

LastSyncedAt returns the last successful URL→DB sync timestamp; zero before any sync has completed.

func (*Store) LoadFromDB ¶

func (s *Store) LoadFromDB(ctx context.Context) error

LoadFromDB reloads the in-memory pricing cache + datasheet view from the config store. Used by the composer at bootstrap and as the gossip ReloadFromDB handler on non-leader pods.

func (*Store) LoadFromURLIntoMemory ¶

func (s *Store) LoadFromURLIntoMemory(ctx context.Context) error

LoadFromURLIntoMemory loads pricing from the URL directly into memory (no DB). Used when the composer was built without a config store.

func (*Store) LoadModelParamsFromDB ¶

func (s *Store) LoadModelParamsFromDB(ctx context.Context) (int, error)

LoadModelParamsFromDB bulk-loads model parameters from the DB into the provider-utils cache and the in-memory supportedResponseTypes / supportedParams indexes. Returns the row count so the composer can decide whether to background-sync from URL afterwards.

The provider-utils cache-miss handler in the composer still loads one row at a time when an unknown model is queried; both paths use the same JSON shape stored in the table.

func (*Store) LoadModelParamsFromURLIntoMemory ¶

func (s *Store) LoadModelParamsFromURLIntoMemory(ctx context.Context) error

LoadModelParamsFromURLIntoMemory fetches model parameters from the URL and applies them in-memory only. Used when there's no config store.

func (*Store) LoadOverridesFromStore ¶

func (s *Store) LoadOverridesFromStore(ctx context.Context) error

LoadOverridesFromStore reloads all overrides from the config store. Called at bootstrap and after force-reload paths.

func (*Store) MarkSynced ¶

func (s *Store) MarkSynced(t time.Time)

MarkSynced records the timestamp of a successful sync — called by the composer's ticker after a successful tick.

func (*Store) ModelParametersURL ¶

func (s *Store) ModelParametersURL() string

ModelParametersURL returns a snapshot of the model-parameters URL.

func (*Store) SetOverrides ¶

func (s *Store) SetOverrides(rows []configstoreTables.TablePricingOverride) error

SetOverrides replaces the full in-memory override set. Duplicate IDs in the input keep the last-seen entry (matching today's behavior).
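
The last-seen-wins rule for duplicate IDs falls out naturally from building a map in input order, as this sketch shows. The row type is a hypothetical, trimmed stand-in for configstoreTables.TablePricingOverride, and buildOverrideIndex is an illustrative name.

```go
package main

import "fmt"

// row is a hypothetical, trimmed stand-in for TablePricingOverride.
type row struct {
	ID      string
	Pattern string
}

// buildOverrideIndex builds a fresh index from rows. Later rows overwrite
// earlier ones with the same ID, so the last-seen entry is kept.
func buildOverrideIndex(rows []row) map[string]row {
	idx := make(map[string]row, len(rows))
	for _, r := range rows {
		idx[r.ID] = r
	}
	return idx
}

func main() {
	idx := buildOverrideIndex([]row{
		{ID: "a", Pattern: "gpt-4o"},
		{ID: "b", Pattern: "claude-*"},
		{ID: "a", Pattern: "gpt-4o*"}, // duplicate ID: this one wins
	})
	fmt.Println(len(idx), idx["a"].Pattern)
}
```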

func (*Store) SyncFromURL ¶

func (s *Store) SyncFromURL(ctx context.Context) error

SyncFromURL fetches the upstream pricing datasheet, persists it to the DB (when configStore != nil), and refreshes the in-memory cache + derived datasheet view. On URL failure it falls back to existing DB records when any exist, otherwise propagates the error.

The composer owns the distributed lock, the ticker, and the after-sync gossip hook — none of that lives here. SyncFromURL is the pure "URL → DB → memory" step.
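
The "URL → DB → memory" step and its fallback can be sketched with stubs. fakeDB, okFetch, failFetch, and syncFromURL are all hypothetical; a nil *fakeDB stands in for "configStore == nil", and string slices stand in for pricing rows.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeDB is a hypothetical in-memory stand-in for the config store.
type fakeDB struct{ rows []string }

var errFetch = errors.New("fetch failed")

func okFetch() ([]string, error)   { return []string{"fresh"}, nil }
func failFetch() ([]string, error) { return nil, errFetch }

// syncFromURL returns the rows that end up in memory.
// On fetch failure it falls back to existing DB rows when any exist,
// otherwise it propagates the error.
func syncFromURL(fetch func() ([]string, error), db *fakeDB) ([]string, error) {
	rows, err := fetch()
	if err != nil {
		if db != nil && len(db.rows) > 0 {
			return db.rows, nil // fallback: serve what the DB already has
		}
		return nil, err
	}
	if db != nil {
		db.rows = rows // persist only when a store is configured
	}
	return rows, nil
}

func main() {
	db := &fakeDB{rows: []string{"cached"}}
	fmt.Println(syncFromURL(failFetch, db))
	fmt.Println(syncFromURL(okFetch, db))
	fmt.Println(syncFromURL(failFetch, nil))
}
```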

func (*Store) SyncInterval ¶

func (s *Store) SyncInterval() time.Duration

SyncInterval returns the minimum elapsed time between background syncs.
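
Since the interval is a minimum elapsed time rather than a timer period, a caller's tick handler plausibly compares it against LastSyncedAt. The sketch below shows that check; shouldSync and dueAfter are hypothetical helpers, and treating a zero timestamp as "sync now" is an assumption about the composer's behavior.

```go
package main

import (
	"fmt"
	"time"
)

// shouldSync is a hypothetical helper: a sync is due when none has
// completed yet, or when at least interval has elapsed since the last one.
func shouldSync(now, lastSynced time.Time, interval time.Duration) bool {
	if lastSynced.IsZero() {
		return true // ASSUMPTION: never synced means sync immediately
	}
	return now.Sub(lastSynced) >= interval
}

// epoch is an arbitrary fixed "last synced" time for the example.
var epoch = time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)

// dueAfter reports whether a sync is due the given number of hours after
// epoch, using a 24-hour interval.
func dueAfter(elapsedHours int) bool {
	now := epoch.Add(time.Duration(elapsedHours) * time.Hour)
	return shouldSync(now, epoch, 24*time.Hour)
}

func main() {
	for _, h := range []int{1, 23, 24, 30} {
		fmt.Printf("after %dh: due=%v\n", h, dueAfter(h))
	}
	fmt.Println("never synced:", shouldSync(epoch, time.Time{}, 24*time.Hour))
}
```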

func (*Store) SyncModelParamsFromURL ¶

func (s *Store) SyncModelParamsFromURL(ctx context.Context) error

SyncModelParamsFromURL fetches model parameters from the configured URL, persists to DB (when configStore != nil), and refreshes the in-memory indexes. On URL failure it falls back to DB records when any exist.

func (*Store) URL ¶

func (s *Store) URL() string

URL returns a snapshot of the pricing URL.

func (*Store) UpdateSyncConfig ¶

func (s *Store) UpdateSyncConfig(cfg Config)

UpdateSyncConfig replaces URL / params URL / interval atomically. The composer is responsible for triggering a fresh sync after this returns.
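
One way to get an atomic replace with snapshot reads is to swap a pointer to an immutable Config, sketched below with sync/atomic (Go 1.19 or later). The store type and newStore are hypothetical, and this is only one possible technique; the real Store may well use a mutex instead.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Config mirrors the documented struct.
type Config struct {
	URL                string
	ModelParametersURL string
	SyncInterval       time.Duration
}

// store is a hypothetical stand-in holding only the sync config.
type store struct {
	cfg atomic.Pointer[Config]
}

// newStore seeds the pointer so readers never observe nil.
func newStore(initial Config) *store {
	s := &store{}
	s.cfg.Store(&initial)
	return s
}

// UpdateSyncConfig swaps all three values in one atomic pointer store, so
// readers see either the old config or the new one, never a mix.
func (s *store) UpdateSyncConfig(c Config) { s.cfg.Store(&c) }

// URL and SyncInterval return snapshots from the current config.
func (s *store) URL() string                 { return s.cfg.Load().URL }
func (s *store) SyncInterval() time.Duration { return s.cfg.Load().SyncInterval }

func main() {
	s := newStore(Config{URL: "https://example.test/a", SyncInterval: time.Hour})
	s.UpdateSyncConfig(Config{URL: "https://example.test/b", SyncInterval: 2 * time.Hour})
	fmt.Println(s.URL(), s.SyncInterval())
}
```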

func (*Store) UpsertModelPricingAttributes ¶

func (s *Store) UpsertModelPricingAttributes(ctx context.Context, model string, provider schemas.ModelProvider, attrs map[string]string) (int64, error)

UpsertModelPricingAttributes writes the additional_attributes column for every pricing row that matches (model, provider), then reloads the pricing cache so the new values are immediately visible to list-models. Returns the number of rows updated (0 = no such pricing row, which callers must surface as a validation error). An empty/nil attrs map clears the column.

func (*Store) UpsertOverrides ¶

func (s *Store) UpsertOverrides(rows ...*configstoreTables.TablePricingOverride) error

UpsertOverrides inserts or replaces one or more overrides, rebuilding the lookup map exactly once at the end.
