modelcatalog

package
v1.2.35

This package is not in the latest version of its module.

Published: Apr 5, 2026 License: Apache-2.0 Imports: 19 Imported by: 0

Documentation

Overview

Package modelcatalog provides a pricing manager for the framework.

Constants

const (
	DefaultPricingSyncInterval        = 24 * time.Hour
	ConfigLastPricingSyncKey          = "LastModelPricingSync"
	ConfigLastParamsSyncKey           = "LastModelParametersSync"
	ConfigProviderModelHealthStateKey = "ProviderModelHealthStateV1"
	DefaultPricingURL                 = "https://getbifrost.ai/datasheet"
	DefaultModelParametersURL         = "https://getbifrost.ai/datasheet/model-parameters"
	DefaultPricingTimeout             = 45 * time.Second
	DefaultModelParametersTimeout     = 45 * time.Second
)
const (
	TokenTierAbove200K = 200000
	TokenTierAbove128K = 128000
)

Default sync interval and config key

const DefaultProviderModelHealthPersistDebounce = 500 * time.Millisecond
const DefaultProviderModelSnapshotStaleAfter = 24 * time.Hour

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	PricingURL                         *string        `json:"pricing_url,omitempty"`
	PricingSyncInterval                *time.Duration `json:"pricing_sync_interval,omitempty"`
	ProviderModelHealthPersistDebounce *time.Duration `json:"provider_model_health_persist_debounce_ms,omitempty"`
}

Config is the model pricing configuration.
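All three Config fields are optional pointers, so unset fields are dropped from the JSON form. The sketch below mirrors the struct locally (so it compiles without importing the package) and encodes it with the standard library. The ptr, exampleConfig, and encodeConfig helpers are hypothetical names introduced for illustration. Note that encoding/json writes a time.Duration as an integer count of nanoseconds by default; whether the package applies custom handling for the field tagged `provider_model_health_persist_debounce_ms` is not shown in this documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Config mirrors the documented struct so this sketch compiles on its own.
type Config struct {
	PricingURL                         *string        `json:"pricing_url,omitempty"`
	PricingSyncInterval                *time.Duration `json:"pricing_sync_interval,omitempty"`
	ProviderModelHealthPersistDebounce *time.Duration `json:"provider_model_health_persist_debounce_ms,omitempty"`
}

// ptr is a hypothetical helper that returns a pointer to any value.
func ptr[T any](v T) *T { return &v }

// exampleConfig sets two of the three optional fields and leaves the third nil.
func exampleConfig() Config {
	return Config{
		PricingURL:          ptr("https://getbifrost.ai/datasheet"),
		PricingSyncInterval: ptr(time.Hour),
	}
}

// encodeConfig marshals a Config with the standard library encoder.
func encodeConfig(c Config) (string, error) {
	b, err := json.Marshal(c)
	return string(b), err
}

func main() {
	out, err := encodeConfig(exampleConfig())
	if err != nil {
		panic(err)
	}
	// The nil debounce field is omitted; the interval appears in nanoseconds.
	fmt.Println(out)
}
```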

type ModelCatalog

type ModelCatalog struct {
	// contains filtered or unexported fields
}

func Init

func Init(ctx context.Context, config *Config, configStore configstore.ConfigStore, shouldSyncPricingFunc ShouldSyncPricingFunc, logger schemas.Logger) (*ModelCatalog, error)

Init initializes the model catalog

func NewTestCatalog

func NewTestCatalog(baseModelIndex map[string]string) *ModelCatalog

NewTestCatalog creates a minimal ModelCatalog for testing purposes. It does not start background sync workers or connect to external services.

func (*ModelCatalog) CalculateCost

func (mc *ModelCatalog) CalculateCost(result *schemas.BifrostResponse) float64

CalculateCost calculates the cost of a Bifrost response. It handles all request types, cache debug billing, and tiered pricing.
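As a rough illustration of tiered pricing, the sketch below picks a per-token rate from the TokenTierAbove200K and TokenTierAbove128K thresholds. The tieredRate helper is hypothetical, and both the strict "greater than" comparison and the fallback order are assumptions; the actual selection logic inside CalculateCost is not shown in this documentation.

```go
package main

import "fmt"

// Thresholds copied from the package constants.
const (
	TokenTierAbove200K = 200000
	TokenTierAbove128K = 128000
)

// tieredRate is a hypothetical helper: it returns the highest applicable tier
// rate that is configured, falling back to the base rate otherwise.
func tieredRate(totalTokens int, base float64, above128k, above200k *float64) float64 {
	if totalTokens > TokenTierAbove200K && above200k != nil {
		return *above200k
	}
	if totalTokens > TokenTierAbove128K && above128k != nil {
		return *above128k
	}
	return base
}

func main() {
	above200k := 6e-6 // illustrative rate, not real pricing data
	rate := tieredRate(250000, 3e-6, nil, &above200k)
	fmt.Printf("rate per token: %g\n", rate)
}
```

A nil tier pointer means the model has no special rate for that tier, which matches how the optional `*float64` fields of PricingEntry are declared.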

func (*ModelCatalog) Cleanup

func (mc *ModelCatalog) Cleanup() error

Cleanup cleans up the model catalog

func (*ModelCatalog) DeleteModelDataForProvider

func (mc *ModelCatalog) DeleteModelDataForProvider(provider schemas.ModelProvider)

DeleteModelDataForProvider deletes all model data from the pool for a given provider

func (*ModelCatalog) DeleteProviderPricingOverrides

func (mc *ModelCatalog) DeleteProviderPricingOverrides(provider schemas.ModelProvider)

func (*ModelCatalog) ForceReloadPricing

func (mc *ModelCatalog) ForceReloadPricing(ctx context.Context) error

func (*ModelCatalog) GetBaseModelName

func (mc *ModelCatalog) GetBaseModelName(model string) string

GetBaseModelName returns the canonical base model name for a given model string. It uses the pre-computed base_model from the pricing catalog when available, falling back to algorithmic date/version stripping for models not in the catalog.

Examples:

mc.GetBaseModelName("gpt-4o")                    // Returns: "gpt-4o"
mc.GetBaseModelName("openai/gpt-4o")             // Returns: "gpt-4o"
mc.GetBaseModelName("gpt-4o-2024-08-06")         // Returns: "gpt-4o" (algorithmic fallback)
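The lookup-then-fallback behavior can be sketched as follows. This is a minimal illustration, not the package's algorithm: the baseModelName function, the index map, and the date-suffix regular expression are all assumptions, and the real fallback may strip other version formats as well.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dateSuffix matches a trailing "-YYYY-MM-DD" (an assumed suffix format).
var dateSuffix = regexp.MustCompile(`-\d{4}-\d{2}-\d{2}$`)

// baseModelName is a hypothetical stand-in for GetBaseModelName: it prefers a
// pre-computed index entry and otherwise strips the provider prefix and a
// trailing date suffix.
func baseModelName(index map[string]string, model string) string {
	if base, ok := index[model]; ok {
		return base
	}
	if i := strings.LastIndex(model, "/"); i >= 0 {
		model = model[i+1:]
	}
	return dateSuffix.ReplaceAllString(model, "")
}

func main() {
	index := map[string]string{"openai/gpt-4o": "gpt-4o"}
	for _, m := range []string{"gpt-4o", "openai/gpt-4o", "gpt-4o-2024-08-06"} {
		fmt.Printf("%s -> %s\n", m, baseModelName(index, m))
	}
}
```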

func (*ModelCatalog) GetDistinctBaseModelNames

func (mc *ModelCatalog) GetDistinctBaseModelNames() []string

GetDistinctBaseModelNames returns all unique base model names from the catalog (thread-safe). This is used for governance model selection when no specific provider is chosen.

func (*ModelCatalog) GetModelCapabilityEntryForModel

func (mc *ModelCatalog) GetModelCapabilityEntryForModel(model string, provider schemas.ModelProvider) *PricingEntry

GetModelCapabilityEntryForModel returns capability metadata for a model/provider pair. It prefers chat, then responses, then text-completion entries; if none exist, it falls back to the lexicographically first available mode for deterministic behavior.
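The preference order described above can be sketched like this. The pickMode helper is hypothetical, and the mode strings "chat" and "responses" are assumptions ("text_completion" is the only mode string quoted in this documentation).

```go
package main

import (
	"fmt"
	"sort"
)

// pickMode is a hypothetical helper: it returns the preferred mode among the
// available ones, or the lexicographically first mode if none is preferred.
func pickMode(available []string) (string, bool) {
	if len(available) == 0 {
		return "", false
	}
	have := make(map[string]bool, len(available))
	for _, m := range available {
		have[m] = true
	}
	// Assumed mode identifiers, in the documented preference order.
	for _, preferred := range []string{"chat", "responses", "text_completion"} {
		if have[preferred] {
			return preferred, true
		}
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]string(nil), available...)
	sort.Strings(sorted)
	return sorted[0], true
}

func main() {
	mode, _ := pickMode([]string{"image_generation", "embedding"})
	fmt.Println(mode)
}
```

Sorting before taking the first element is what makes the fallback deterministic regardless of map or slice ordering.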

func (*ModelCatalog) GetModelsForProvider

func (mc *ModelCatalog) GetModelsForProvider(provider schemas.ModelProvider) []string

GetModelsForProvider returns all available models for a given provider (thread-safe)

func (*ModelCatalog) GetPricingEntryForModel

func (mc *ModelCatalog) GetPricingEntryForModel(model string, provider schemas.ModelProvider) *PricingEntry

GetPricingEntryForModel returns the pricing entry for the given model and provider pair.


func (*ModelCatalog) GetProviderModelSnapshotHealthReport

func (mc *ModelCatalog) GetProviderModelSnapshotHealthReport() ProviderModelSnapshotHealthReport

func (*ModelCatalog) GetProvidersForModel

func (mc *ModelCatalog) GetProvidersForModel(model string) []schemas.ModelProvider

GetProvidersForModel returns all providers for a given model (thread-safe)

func (*ModelCatalog) GetUnfilteredModelsForProvider

func (mc *ModelCatalog) GetUnfilteredModelsForProvider(provider schemas.ModelProvider) []string

GetUnfilteredModelsForProvider returns all models in the unfiltered model list for a given provider (thread-safe)

func (*ModelCatalog) IsModelAllowedForProvider

func (mc *ModelCatalog) IsModelAllowedForProvider(provider schemas.ModelProvider, model string, allowedModels []string) bool

IsModelAllowedForProvider checks if a model is allowed for a specific provider based on the allowed models list and catalog data. It handles all cross-provider logic including provider-prefixed models and special routing rules.

Parameters:

  • provider: The provider to check against
  • model: The model name (without provider prefix, e.g., "gpt-4o" or "claude-3-5-sonnet")
  • allowedModels: List of allowed model names (can be empty, can include provider prefixes)

Behavior:

  • If allowedModels is empty: Uses model catalog to check if provider supports the model (delegates to GetProvidersForModel which handles all cross-provider logic)
  • If allowedModels is not empty: Checks if model matches any entry in the list

Provider-specific validation:

  • Direct matches: "gpt-4o" in allowedModels for any provider
  • Prefixed matches: Only if the prefixed model exists in provider's catalog (e.g., "openai/gpt-4o" in allowedModels only matches if openrouter's catalog contains "openai/gpt-4o" AND the model part matches the request)

Returns:

  • bool: true if the model is allowed for the provider, false otherwise

Examples:

// Empty allowedModels - uses catalog
mc.IsModelAllowedForProvider("openrouter", "claude-3-5-sonnet", []string{})
// Returns: true (catalog knows openrouter has "anthropic/claude-3-5-sonnet")

// Explicit allowedModels with prefix - validates against catalog
mc.IsModelAllowedForProvider("openrouter", "gpt-4o", []string{"openai/gpt-4o"})
// Returns: true (openrouter's catalog contains "openai/gpt-4o" AND model part is "gpt-4o")

// Explicit allowedModels with prefix - wrong model
mc.IsModelAllowedForProvider("openrouter", "claude-3-5-sonnet", []string{"openai/gpt-4o"})
// Returns: false (model part "gpt-4o" doesn't match request "claude-3-5-sonnet")

// Explicit allowedModels without prefix
mc.IsModelAllowedForProvider("openai", "gpt-4o", []string{"gpt-4o"})
// Returns: true (direct match)
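The matching rules above can be approximated with a small standalone function. This is a simplified sketch that only reproduces the four documented examples: isAllowed, modelPart, and the slice-based provider catalog are hypothetical, and the special routing rules mentioned in the description are not modeled.

```go
package main

import (
	"fmt"
	"strings"
)

// modelPart returns the text after the first "/" or the name itself if unprefixed.
func modelPart(name string) string {
	if i := strings.Index(name, "/"); i >= 0 {
		return name[i+1:]
	}
	return name
}

// isAllowed is a hypothetical stand-in for IsModelAllowedForProvider, taking
// the provider's catalog as a plain list of model names.
func isAllowed(providerCatalog []string, model string, allowed []string) bool {
	inCatalog := func(name string) bool {
		for _, entry := range providerCatalog {
			if entry == name {
				return true
			}
		}
		return false
	}
	if len(allowed) == 0 {
		// Empty list: fall back to what the catalog knows about the provider.
		for _, entry := range providerCatalog {
			if entry == model || modelPart(entry) == model {
				return true
			}
		}
		return false
	}
	for _, a := range allowed {
		if a == model {
			return true // direct match
		}
		if strings.Contains(a, "/") && modelPart(a) == model && inCatalog(a) {
			return true // prefixed match validated against the catalog
		}
	}
	return false
}

func main() {
	openrouter := []string{"anthropic/claude-3-5-sonnet", "openai/gpt-4o"}
	fmt.Println(isAllowed(openrouter, "gpt-4o", []string{"openai/gpt-4o"}))
}
```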

func (*ModelCatalog) IsSameModel

func (mc *ModelCatalog) IsSameModel(model1, model2 string) bool

IsSameModel checks if two model strings refer to the same underlying model. It compares the canonical base model names derived from the pricing catalog (or algorithmic fallback for models not in the catalog).

Examples:

mc.IsSameModel("gpt-4o", "gpt-4o")                            // true (direct match)
mc.IsSameModel("openai/gpt-4o", "gpt-4o")                     // true (same base model)
mc.IsSameModel("gpt-4o", "claude-3-5-sonnet")                  // false (different models)
mc.IsSameModel("openai/gpt-4o", "anthropic/claude-3-5-sonnet") // false

func (*ModelCatalog) IsTextCompletionSupported

func (mc *ModelCatalog) IsTextCompletionSupported(model string, provider schemas.ModelProvider) bool

IsTextCompletionSupported checks if a model supports text completion for the given provider. Returns true if the model has pricing data for text completion ("text_completion"), false otherwise. This is used by the litellmcompat plugin to determine whether to convert text completion requests to chat completion requests.

func (*ModelCatalog) RecordProviderModelDiscoveryResult

func (mc *ModelCatalog) RecordProviderModelDiscoveryResult(
	provider schemas.ModelProvider,
	unfiltered bool,
	modelData *schemas.BifrostListModelsResponse,
	discoveryErr *schemas.BifrostError,
)

RecordProviderModelDiscoveryResult records one provider model listing attempt (filtered or unfiltered).

func (*ModelCatalog) RefineModelForProvider

func (mc *ModelCatalog) RefineModelForProvider(provider schemas.ModelProvider, model string) (string, error)

RefineModelForProvider refines the model for a given provider by performing a lookup in mc.modelPool and using schemas.ParseModelString to extract the provider and model parts. For example, "gpt-oss-120b" for the groq provider becomes "openai/gpt-oss-120b".

Behavior:

  • When the provider's catalog (mc.modelPool) yields multiple matching models, returns an error
  • When exactly one match is found, returns the fully-qualified model (provider/model format)
  • When the provider is not handled or no refinement is needed, returns the original model unchanged
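The three outcomes described above (ambiguous, unique, unchanged) can be sketched with a standalone function. The refine helper and its slice-based pool are hypothetical; the real method consults mc.modelPool and schemas.ParseModelString, neither of which is reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// refine is a hypothetical stand-in for RefineModelForProvider: it searches a
// provider's pool for prefixed entries whose model part equals the request.
func refine(pool []string, model string) (string, error) {
	var matches []string
	for _, entry := range pool {
		if i := strings.Index(entry, "/"); i >= 0 && entry[i+1:] == model {
			matches = append(matches, entry)
		}
	}
	switch len(matches) {
	case 0:
		return model, nil // nothing to refine
	case 1:
		return matches[0], nil // unique fully-qualified name
	default:
		return "", fmt.Errorf("ambiguous model %q: %d matches", model, len(matches))
	}
}

func main() {
	groqPool := []string{"openai/gpt-oss-120b"} // illustrative pool contents
	refined, err := refine(groqPool, "gpt-oss-120b")
	fmt.Println(refined, err)
}
```

Returning an error for multiple matches, rather than picking one, keeps the caller from silently routing to the wrong upstream model.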

func (*ModelCatalog) ReloadPricing

func (mc *ModelCatalog) ReloadPricing(ctx context.Context, config *Config) error

ReloadPricing reloads the model catalog from config

func (*ModelCatalog) SetProviderPricingOverrides

func (mc *ModelCatalog) SetProviderPricingOverrides(provider schemas.ModelProvider, overrides []schemas.ProviderPricingOverride) error

func (*ModelCatalog) UpsertModelDataForProvider

func (mc *ModelCatalog) UpsertModelDataForProvider(provider schemas.ModelProvider, modelData *schemas.BifrostListModelsResponse, allowedModels []schemas.Model, deniedModels []schemas.Model)

UpsertModelDataForProvider upserts model data for a given provider

func (*ModelCatalog) UpsertUnfilteredModelDataForProvider

func (mc *ModelCatalog) UpsertUnfilteredModelDataForProvider(provider schemas.ModelProvider, modelData *schemas.BifrostListModelsResponse)

UpsertUnfilteredModelDataForProvider upserts unfiltered model data for a given provider

type PricingEntry

type PricingEntry struct {
	BaseModel string `json:"base_model,omitempty"`
	Provider  string `json:"provider"`
	Mode      string `json:"mode"`

	ContextLength   *int                  `json:"context_length,omitempty"`
	MaxInputTokens  *int                  `json:"max_input_tokens,omitempty"`
	MaxOutputTokens *int                  `json:"max_output_tokens,omitempty"`
	Architecture    *schemas.Architecture `json:"architecture,omitempty"`

	// Costs - Text
	InputCostPerToken          float64  `json:"input_cost_per_token"`
	OutputCostPerToken         float64  `json:"output_cost_per_token"`
	InputCostPerTokenBatches   *float64 `json:"input_cost_per_token_batches,omitempty"`
	OutputCostPerTokenBatches  *float64 `json:"output_cost_per_token_batches,omitempty"`
	InputCostPerTokenPriority  *float64 `json:"input_cost_per_token_priority,omitempty"`
	OutputCostPerTokenPriority *float64 `json:"output_cost_per_token_priority,omitempty"`
	InputCostPerCharacter      *float64 `json:"input_cost_per_character,omitempty"`
	// Costs - 128k Tier
	InputCostPerTokenAbove128kTokens          *float64 `json:"input_cost_per_token_above_128k_tokens,omitempty"`
	InputCostPerImageAbove128kTokens          *float64 `json:"input_cost_per_image_above_128k_tokens,omitempty"`
	InputCostPerVideoPerSecondAbove128kTokens *float64 `json:"input_cost_per_video_per_second_above_128k_tokens,omitempty"`
	InputCostPerAudioPerSecondAbove128kTokens *float64 `json:"input_cost_per_audio_per_second_above_128k_tokens,omitempty"`
	OutputCostPerTokenAbove128kTokens         *float64 `json:"output_cost_per_token_above_128k_tokens,omitempty"`
	// Costs - 200k Tier
	InputCostPerTokenAbove200kTokens  *float64 `json:"input_cost_per_token_above_200k_tokens,omitempty"`
	OutputCostPerTokenAbove200kTokens *float64 `json:"output_cost_per_token_above_200k_tokens,omitempty"`

	// Costs - Cache
	CacheCreationInputTokenCost                        *float64 `json:"cache_creation_input_token_cost,omitempty"`
	CacheReadInputTokenCost                            *float64 `json:"cache_read_input_token_cost,omitempty"`
	CacheCreationInputTokenCostAbove200kTokens         *float64 `json:"cache_creation_input_token_cost_above_200k_tokens,omitempty"`
	CacheReadInputTokenCostAbove200kTokens             *float64 `json:"cache_read_input_token_cost_above_200k_tokens,omitempty"`
	CacheCreationInputTokenCostAbove1hr                *float64 `json:"cache_creation_input_token_cost_above_1hr,omitempty"`
	CacheCreationInputTokenCostAbove1hrAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_1hr_above_200k_tokens,omitempty"`
	CacheCreationInputAudioTokenCost                   *float64 `json:"cache_creation_input_audio_token_cost,omitempty"`
	CacheReadInputTokenCostPriority                    *float64 `json:"cache_read_input_token_cost_priority,omitempty"`
	CacheReadInputImageTokenCost                       *float64 `json:"cache_read_input_image_token_cost,omitempty"`

	// Costs - Image
	InputCostPerImage                             *float64 `json:"input_cost_per_image,omitempty"`
	InputCostPerPixel                             *float64 `json:"input_cost_per_pixel,omitempty"`
	OutputCostPerImage                            *float64 `json:"output_cost_per_image,omitempty"`
	OutputCostPerPixel                            *float64 `json:"output_cost_per_pixel,omitempty"`
	OutputCostPerImagePremiumImage                *float64 `json:"output_cost_per_image_premium_image,omitempty"`
	OutputCostPerImageAbove512x512Pixels          *float64 `json:"output_cost_per_image_above_512_and_512_pixels,omitempty"`
	OutputCostPerImageAbove512x512PixelsPremium   *float64 `json:"output_cost_per_image_above_512_and_512_pixels_and_premium_image,omitempty"`
	OutputCostPerImageAbove1024x1024Pixels        *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels,omitempty"`
	OutputCostPerImageAbove1024x1024PixelsPremium *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels_and_premium_image,omitempty"`
	OutputCostPerImageAbove2048x2048Pixels        *float64 `json:"output_cost_per_image_above_2048_and_2048_pixels,omitempty"`
	OutputCostPerImageAbove4096x4096Pixels        *float64 `json:"output_cost_per_image_above_4096_and_4096_pixels,omitempty"`
	OutputCostPerImageLowQuality                  *float64 `json:"output_cost_per_image_low_quality,omitempty"`
	OutputCostPerImageMediumQuality               *float64 `json:"output_cost_per_image_medium_quality,omitempty"`
	OutputCostPerImageHighQuality                 *float64 `json:"output_cost_per_image_high_quality,omitempty"`
	OutputCostPerImageAutoQuality                 *float64 `json:"output_cost_per_image_auto_quality,omitempty"`
	InputCostPerImageToken                        *float64 `json:"input_cost_per_image_token,omitempty"`
	OutputCostPerImageToken                       *float64 `json:"output_cost_per_image_token,omitempty"`

	// Costs - Audio/Video
	InputCostPerAudioToken      *float64 `json:"input_cost_per_audio_token,omitempty"`
	InputCostPerAudioPerSecond  *float64 `json:"input_cost_per_audio_per_second,omitempty"`
	InputCostPerSecond          *float64 `json:"input_cost_per_second,omitempty"`
	InputCostPerVideoPerSecond  *float64 `json:"input_cost_per_video_per_second,omitempty"`
	OutputCostPerAudioToken     *float64 `json:"output_cost_per_audio_token,omitempty"`
	OutputCostPerVideoPerSecond *float64 `json:"output_cost_per_video_per_second,omitempty"`
	OutputCostPerSecond         *float64 `json:"output_cost_per_second,omitempty"`

	// Costs - Other
	//
	// SearchContextCostPerQuery is stored as a single float64, but the pricing datasheet
	// represents it as a tiered object with three keys: search_context_size_low,
	// search_context_size_medium, and search_context_size_high.  For every provider except
	// Perplexity the three tier values are identical, so we collapse the object to its
	// medium tier value (falling back to low then high).  Perplexity always returns a
	// pre-computed total_cost in its usage response, so the per-query rate is never
	// consumed for that provider; the collapsed value is therefore correct in all cases.
	// See UnmarshalJSON below for the custom decoding logic.
	SearchContextCostPerQuery     *float64 `json:"search_context_cost_per_query,omitempty"`
	CodeInterpreterCostPerSession *float64 `json:"code_interpreter_cost_per_session,omitempty"`
}

PricingEntry represents a single model's pricing information. Field names and JSON tags match the datasheet schema exactly.

func (*PricingEntry) UnmarshalJSON

func (p *PricingEntry) UnmarshalJSON(data []byte) error

UnmarshalJSON implements json.Unmarshaler for PricingEntry. It handles the special case where search_context_cost_per_query may arrive as either a plain float64 or a tiered object {"search_context_size_low":…, "search_context_size_medium":…, "search_context_size_high":…}.
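The "plain float or tiered object" decoding can be sketched in isolation. The decodeSearchCost function below is a hypothetical illustration of the technique, using the medium, then low, then high fallback order stated in the PricingEntry field comment; it is not the package's UnmarshalJSON.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// decodeSearchCost is a hypothetical helper that accepts either a JSON number
// or a tiered object and collapses it to a single value.
func decodeSearchCost(raw []byte) (*float64, error) {
	if string(bytes.TrimSpace(raw)) == "null" {
		return nil, nil
	}
	// First try the plain float64 form.
	var plain float64
	if err := json.Unmarshal(raw, &plain); err == nil {
		return &plain, nil
	}
	// Otherwise expect the tiered object form.
	var tiers struct {
		Low    *float64 `json:"search_context_size_low"`
		Medium *float64 `json:"search_context_size_medium"`
		High   *float64 `json:"search_context_size_high"`
	}
	if err := json.Unmarshal(raw, &tiers); err != nil {
		return nil, err
	}
	// Prefer medium, then low, then high.
	for _, v := range []*float64{tiers.Medium, tiers.Low, tiers.High} {
		if v != nil {
			return v, nil
		}
	}
	return nil, nil
}

func main() {
	v, err := decodeSearchCost([]byte(`{"search_context_size_low":0.03,"search_context_size_medium":0.035}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(*v)
}
```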

type ProviderModelDiscoveryHealth

type ProviderModelDiscoveryHealth struct {
	Status          ProviderModelHealthStatus `json:"status"`
	LastAttemptAt   *time.Time                `json:"last_attempt_at,omitempty"`
	LastSuccessAt   *time.Time                `json:"last_success_at,omitempty"`
	LastErrorAt     *time.Time                `json:"last_error_at,omitempty"`
	LastError       string                    `json:"last_error,omitempty"`
	LastModelsCount int                       `json:"last_models_count"`
}

type ProviderModelHealthStatus

type ProviderModelHealthStatus string
const (
	ProviderModelHealthUnknown  ProviderModelHealthStatus = "unknown"
	ProviderModelHealthHealthy  ProviderModelHealthStatus = "healthy"
	ProviderModelHealthStale    ProviderModelHealthStatus = "stale"
	ProviderModelHealthError    ProviderModelHealthStatus = "error"
	ProviderModelHealthDegraded ProviderModelHealthStatus = "degraded"
)

type ProviderModelSnapshotHealth

type ProviderModelSnapshotHealth struct {
	Provider             schemas.ModelProvider        `json:"provider"`
	Status               ProviderModelHealthStatus    `json:"status"`
	SnapshotModelCount   int                          `json:"snapshot_model_count"`
	FilteredModelCount   int                          `json:"filtered_model_count"`
	UnfilteredModelCount int                          `json:"unfiltered_model_count"`
	FilteredSource       ProviderModelSource          `json:"filtered_source"`
	UnfilteredSource     ProviderModelSource          `json:"unfiltered_source"`
	LastSnapshotUpdated  *time.Time                   `json:"last_snapshot_updated,omitempty"`
	FilteredDiscovery    ProviderModelDiscoveryHealth `json:"filtered_discovery"`
	UnfilteredDiscovery  ProviderModelDiscoveryHealth `json:"unfiltered_discovery"`
}

type ProviderModelSnapshotHealthReport

type ProviderModelSnapshotHealthReport struct {
	Status            ProviderModelHealthStatus          `json:"status"`
	GeneratedAt       time.Time                          `json:"generated_at"`
	StaleAfterSeconds int64                              `json:"stale_after_seconds"`
	Summary           ProviderModelSnapshotHealthSummary `json:"summary"`
	Providers         []ProviderModelSnapshotHealth      `json:"providers"`
}

type ProviderModelSnapshotHealthSummary

type ProviderModelSnapshotHealthSummary struct {
	TotalProviders    int `json:"total_providers"`
	HealthyProviders  int `json:"healthy_providers"`
	StaleProviders    int `json:"stale_providers"`
	ErrorProviders    int `json:"error_providers"`
	DegradedProviders int `json:"degraded_providers"`
	UnknownProviders  int `json:"unknown_providers"`
}

type ProviderModelSource

type ProviderModelSource string
const (
	ProviderModelSourceUnknown           ProviderModelSource = "unknown"
	ProviderModelSourcePricingCatalog    ProviderModelSource = "pricing_catalog"
	ProviderModelSourcePersistedSnapshot ProviderModelSource = "persisted_snapshot"
	ProviderModelSourceDefaultSeed       ProviderModelSource = "default_seed"
	ProviderModelSourceAllowedModels     ProviderModelSource = "allowed_models"
	ProviderModelSourceLiveDiscovery     ProviderModelSource = "live_discovery"
)

type ShouldSyncPricingFunc

type ShouldSyncPricingFunc func(ctx context.Context) bool

ShouldSyncPricingFunc is a function that determines whether pricing data should be synced. It returns a boolean indicating whether syncing is needed. It is completely optional and can be nil if not needed. The syncPricing function is called if this function returns true.
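A minimal sketch of how an optional gate of this type might be consulted. The syncIfNeeded, always, never, and ranWith names are hypothetical, and treating a nil gate as "proceed with the sync" is an assumption; the documentation only states that the function may be nil and that syncing happens when it returns true.

```go
package main

import (
	"context"
	"fmt"
)

// ShouldSyncPricingFunc mirrors the documented type.
type ShouldSyncPricingFunc func(ctx context.Context) bool

// syncIfNeeded is a hypothetical helper: it runs sync unless a non-nil gate
// returns false, and reports whether sync was invoked.
func syncIfNeeded(ctx context.Context, gate ShouldSyncPricingFunc, sync func(context.Context) error) (bool, error) {
	if gate != nil && !gate(ctx) {
		return false, nil
	}
	return true, sync(ctx)
}

// always and never are trivial gates for demonstration.
func always(context.Context) bool { return true }
func never(context.Context) bool  { return false }

// ranWith reports whether a no-op sync would run under the given gate.
func ranWith(gate ShouldSyncPricingFunc) bool {
	ran, _ := syncIfNeeded(context.Background(), gate, func(context.Context) error { return nil })
	return ran
}

func main() {
	fmt.Println(ranWith(nil), ranWith(always), ranWith(never)) // prints "true true false"
}
```

A gate like this is typically useful when several instances share one config store and only one of them should perform the periodic sync.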
