Documentation ¶
Overview ¶
Package registry provides centralized model management for all AI service providers. It has two parts. The first is a set of static model definitions that clients can use when registering their supported models. The second is a dynamic model registry that uses reference counting to track active clients and automatically hides a model when no client can serve it or when its quota is exceeded.
Index ¶
- func GetAntigravityModelConfig() map[string]*AntigravityModelConfig
- type AntigravityModelConfig
- type ModelCapabilities
- func InferCapabilities(id string) *ModelCapabilities
- type ModelInfo
- func GetAIStudioModels() []*ModelInfo
- func GetClaudeModels() []*ModelInfo
- func GetGeminiCLIModels() []*ModelInfo
- func GetGeminiModels() []*ModelInfo
- func GetGeminiVertexModels() []*ModelInfo
- func GetIFlowModels() []*ModelInfo
- func GetOpenAIModels() []*ModelInfo
- func GetOpenCodeModels() []*ModelInfo
- func GetQwenModels() []*ModelInfo
- type ModelModalities
- type ModelRegistration
- type ModelRegistry
- func GetGlobalRegistry() *ModelRegistry
- func NewModelRegistry() *ModelRegistry
- func (r *ModelRegistry) CleanupExpiredQuotas()
- func (r *ModelRegistry) ClearModelQuotaExceeded(clientID, modelID string)
- func (r *ModelRegistry) ClientSupportsModel(clientID, modelID string) bool
- func (r *ModelRegistry) GetAllProviders() []ProviderInfo
- func (r *ModelRegistry) GetAvailableModels(handlerType string) []map[string]any
- func (r *ModelRegistry) GetFirstAvailableModel(handlerType string, priorityList []string) (string, error)
- func (r *ModelRegistry) GetModelCount(modelID string) int
- func (r *ModelRegistry) GetModelInfo(modelID string) *ModelInfo
- func (r *ModelRegistry) GetModelProviders(modelID string) []string
- func (r *ModelRegistry) GetModelsForClient(clientID string) []*ModelInfo
- func (mr *ModelRegistry) GetModelsWithMinContext(minContext int) []*ModelInfo
- func (r *ModelRegistry) RegisterClient(clientID, clientProvider string, models []*ModelInfo)
- func (r *ModelRegistry) ResumeClientModel(clientID, modelID string)
- func (r *ModelRegistry) SetModelQuotaExceeded(clientID, modelID string)
- func (r *ModelRegistry) SuspendClientModel(clientID, modelID, reason string)
- func (r *ModelRegistry) UnregisterClient(clientID string)
- type NativeTool
- type ProviderInfo
- type ThinkingSupport
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GetAntigravityModelConfig ¶
func GetAntigravityModelConfig() map[string]*AntigravityModelConfig
GetAntigravityModelConfig returns static configuration for antigravity models. Keys use the ALIASED model names (after modelName2Alias conversion) for direct lookup.
Types ¶
type AntigravityModelConfig ¶
type AntigravityModelConfig struct {
Thinking *ThinkingSupport
MaxCompletionTokens int
Name string
}
AntigravityModelConfig captures static antigravity model overrides, including Thinking budget limits and provider max completion tokens.
type ModelCapabilities ¶ added in v0.4.0
type ModelCapabilities struct {
// Attachment reports whether the model accepts file attachments
// (typically image/pdf passed as content-block image_url with a data:
// URI). Required for Vercel AI SDK to forward image bytes.
Attachment bool `json:"attachment"`
// ToolCall reports whether the model can emit OpenAI-style function
// calls (tools[] with type=function, returning tool_calls).
ToolCall bool `json:"tool_call"`
// Reasoning reports whether the model emits reasoning/thinking chunks
// distinct from final-answer content (DeepSeek-R1-style or Gemini
// thinking budget). Clients that render a chain-of-thought UI key off
// this flag.
Reasoning bool `json:"reasoning"`
// Modalities enumerates supported input + output media types. Each
// entry is one of: text, image, audio, video, pdf. Output is usually
// just ["text"] but TTS / image-gen / music-gen models output audio
// or image. A model that omits an entry is assumed not to support it.
Modalities ModelModalities `json:"modalities"`
}
ModelCapabilities describes the feature surface of a model in a shape compatible with what Vercel AI SDK and adjacent ecosystems look for in a /v1/models response. Populated from explicit config when present, or inferred from the model ID when nil (see InferCapabilities).
The shape mirrors OpenCode's per-model schema so clients that read it directly can auto-enable attachment / tool-calling without users having to declare it in their local config — which was the bug that motivated this field (OpenCode silently strips images when attachment is false).
func InferCapabilities ¶ added in v0.4.0
func InferCapabilities(id string) *ModelCapabilities
InferCapabilities returns a best-effort capability set derived from a model identifier. Recognises common families across the providers we route to. Returns nil only for IDs we have no signal about — caller should treat nil as "unknown, don't surface in /v1/models".
Maintenance: when a new vision/audio model lands, add a case here. The alternative — declaring caps in config — works for operators who want to override, but the inference path is what makes the gateway "just work" for clients on day one.
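The inference contract described above (a best-effort result for recognised families, nil for unknown IDs) can be sketched as follows. This is only an illustration: `caps` is a trimmed local stand-in for ModelCapabilities, `inferCaps` is a hypothetical name, and the matched substrings are assumptions rather than the package's real rules.

```go
package main

import (
	"fmt"
	"strings"
)

// caps is a trimmed, local stand-in for ModelCapabilities.
type caps struct {
	Attachment bool
	ToolCall   bool
	Reasoning  bool
}

// inferCaps is a hypothetical illustration of ID-based inference.
// The substrings matched below are assumptions, not the package's rules.
func inferCaps(id string) *caps {
	lower := strings.ToLower(id)
	switch {
	case strings.Contains(lower, "vision"):
		return &caps{Attachment: true, ToolCall: true}
	case strings.Contains(lower, "-r1"):
		return &caps{Reasoning: true}
	default:
		// No signal: nil means "unknown", as in the documented contract.
		return nil
	}
}

func main() {
	for _, id := range []string{"example-vision-1", "example-r1", "mystery"} {
		if c := inferCaps(id); c != nil {
			fmt.Printf("%s: %+v\n", id, *c)
		} else {
			fmt.Printf("%s: unknown, capabilities omitted\n", id)
		}
	}
}
```

Callers should check for nil before dereferencing the result, since nil is the documented signal for an unrecognised ID.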
type ModelInfo ¶
type ModelInfo struct {
// ID is the unique identifier for the model
ID string `json:"id"`
// Object type for the model (typically "model")
Object string `json:"object"`
// Created timestamp when the model was created
Created int64 `json:"created"`
// OwnedBy indicates the organization that owns the model
OwnedBy string `json:"owned_by"`
// Type indicates the model type (e.g., "claude", "gemini", "openai")
Type string `json:"type"`
// DisplayName is the human-readable name for the model
DisplayName string `json:"display_name,omitempty"`
// Name is used for Gemini-style model names
Name string `json:"name,omitempty"`
// Version is the model version
Version string `json:"version,omitempty"`
// Description provides detailed information about the model
Description string `json:"description,omitempty"`
// Visibility controls normal public catalog exposure. "private" models
// remain routable internally but are omitted from GetAvailableModels.
Visibility string `json:"visibility,omitempty"`
// InputTokenLimit is the maximum input token limit
InputTokenLimit int `json:"inputTokenLimit,omitempty"`
// OutputTokenLimit is the maximum output token limit
OutputTokenLimit int `json:"outputTokenLimit,omitempty"`
// SupportedGenerationMethods lists supported generation methods
SupportedGenerationMethods []string `json:"supportedGenerationMethods,omitempty"`
// ContextLength is the context window size
ContextLength int `json:"context_length,omitempty"`
// MaxCompletionTokens is the maximum completion tokens
MaxCompletionTokens int `json:"max_completion_tokens,omitempty"`
// SupportedParameters lists supported parameters
SupportedParameters []string `json:"supported_parameters,omitempty"`
// Thinking holds provider-specific reasoning/thinking budget capabilities.
// This is optional and currently used for Gemini thinking budget normalization.
Thinking *ThinkingSupport `json:"thinking,omitempty"`
// Capabilities describes what the model supports — vision, audio, tool
// calling, reasoning, etc. Surfaced in /v1/models so OpenAI-compatible
// clients (Vercel AI SDK / OpenCode / Cursor / Continue.dev) can
// auto-detect feature support without per-model client-side config. If
// nil, capabilities are inferred from the model ID at marshal time.
Capabilities *ModelCapabilities `json:"capabilities,omitempty"`
// NativeTools declares provider-native tools the upstream model
// supports without caller-side implementation (e.g. MiniMax M2.7's
// autonomous `{"type":"web_search"}`). Agent runtimes that discover
// `/v1/models` can merge these entries into their caller-declared
// `tools` array so the model picks them on recent-events / browsing
// queries without relying on server-side autoinject hacks. Empty /
// absent means the model exposes no provider-native tools.
NativeTools []NativeTool `json:"native_tools,omitempty"`
}
ModelInfo represents information about an available model
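To show how the JSON tags above affect the serialized form, here is a small sketch that marshals a trimmed local copy of ModelInfo. The type `modelInfo`, the helper `encode`, and the sample values are hypothetical. Only a handful of fields are reproduced, with the same tags as the documented struct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelInfo is a trimmed local copy of a few ModelInfo fields,
// using the same JSON tags as the documented struct.
type modelInfo struct {
	ID            string `json:"id"`
	Object        string `json:"object"`
	Created       int64  `json:"created"`
	OwnedBy       string `json:"owned_by"`
	Type          string `json:"type"`
	DisplayName   string `json:"display_name,omitempty"`
	ContextLength int    `json:"context_length,omitempty"`
}

// encode marshals m to JSON and returns an empty string on error.
func encode(m modelInfo) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	m := modelInfo{ID: "example-model", Object: "model", OwnedBy: "example-org", Type: "openai"}
	// DisplayName and ContextLength are zero-valued, so omitempty drops them;
	// Created has no omitempty and is emitted as 0.
	fmt.Println(encode(m))
}
```

Fields tagged with omitempty disappear from the output when they hold their zero value, while untagged ones such as created are always present.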
func GetAIStudioModels ¶
func GetAIStudioModels() []*ModelInfo
GetAIStudioModels returns the Gemini model definitions for AI Studio integrations
func GetClaudeModels ¶
func GetClaudeModels() []*ModelInfo
GetClaudeModels returns the standard Claude model definitions
func GetGeminiCLIModels ¶
func GetGeminiCLIModels() []*ModelInfo
GetGeminiCLIModels returns the Gemini model definitions for Gemini CLI integrations
func GetGeminiModels ¶
func GetGeminiModels() []*ModelInfo
GetGeminiModels returns the standard Gemini model definitions
func GetGeminiVertexModels ¶
func GetGeminiVertexModels() []*ModelInfo
func GetIFlowModels ¶
func GetIFlowModels() []*ModelInfo
GetIFlowModels returns supported models for iFlow OAuth accounts.
func GetOpenAIModels ¶
func GetOpenAIModels() []*ModelInfo
GetOpenAIModels returns the standard OpenAI model definitions
func GetOpenCodeModels ¶
func GetOpenCodeModels() []*ModelInfo
GetOpenCodeModels returns the standard OpenCode model definitions
func GetQwenModels ¶
func GetQwenModels() []*ModelInfo
GetQwenModels returns the standard Qwen model definitions
type ModelModalities ¶ added in v0.4.0
ModelModalities lists the media types a model accepts on input and produces on output. It uses the same vocabulary as the OpenAI Realtime and Anthropic models endpoints (text/image/audio/video/pdf).
type ModelRegistration ¶
type ModelRegistration struct {
// Info contains the model metadata
Info *ModelInfo
// Count is the number of active clients that can provide this model
Count int
// LastUpdated tracks when this registration was last modified
LastUpdated time.Time
// QuotaExceededClients tracks which clients have exceeded quota for this model
QuotaExceededClients map[string]*time.Time
// Providers tracks available clients grouped by provider identifier
Providers map[string]int
// SuspendedClients tracks temporarily disabled clients keyed by client ID
SuspendedClients map[string]string
}
ModelRegistration tracks a model's availability
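One plausible way to derive effective availability from these fields is sketched below. The documentation does not state how the registry combines them, so treat the formula (Count minus the distinct clients that are quota-exceeded or suspended) as an assumption. `registration` and `effectiveClients` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// registration is a local stand-in mirroring three ModelRegistration fields.
type registration struct {
	Count                int
	QuotaExceededClients map[string]*time.Time
	SuspendedClients     map[string]string
}

// effectiveClients is a hypothetical helper. It assumes a client that is
// quota-exceeded, suspended, or both is subtracted from Count exactly once.
func effectiveClients(r registration) int {
	blocked := make(map[string]struct{})
	for id := range r.QuotaExceededClients {
		blocked[id] = struct{}{}
	}
	for id := range r.SuspendedClients {
		blocked[id] = struct{}{}
	}
	n := r.Count - len(blocked)
	if n < 0 {
		n = 0
	}
	return n
}

func main() {
	now := time.Now()
	r := registration{
		Count:                3,
		QuotaExceededClients: map[string]*time.Time{"client-a": &now},
		SuspendedClients:     map[string]string{"client-a": "manual", "client-b": "auth error"},
	}
	// client-a appears in both maps but is only subtracted once.
	fmt.Println(effectiveClients(r)) // prints 1
}
```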
type ModelRegistry ¶
type ModelRegistry struct {
// contains filtered or unexported fields
}
ModelRegistry manages the global registry of available models
func GetGlobalRegistry ¶
func GetGlobalRegistry() *ModelRegistry
GetGlobalRegistry returns the global model registry instance
func NewModelRegistry ¶
func NewModelRegistry() *ModelRegistry
NewModelRegistry creates a new, empty model registry.
func (*ModelRegistry) CleanupExpiredQuotas ¶
func (r *ModelRegistry) CleanupExpiredQuotas()
CleanupExpiredQuotas removes expired quota tracking entries
func (*ModelRegistry) ClearModelQuotaExceeded ¶
func (r *ModelRegistry) ClearModelQuotaExceeded(clientID, modelID string)
ClearModelQuotaExceeded removes quota exceeded status for a model and client.
Parameters:
- clientID: The client to clear quota status for
- modelID: The model to clear quota status for
func (*ModelRegistry) ClientSupportsModel ¶
func (r *ModelRegistry) ClientSupportsModel(clientID, modelID string) bool
ClientSupportsModel reports whether the client registered support for modelID. It checks both the registered model ID (alias) and the DisplayName (upstream model name).
func (*ModelRegistry) GetAllProviders ¶
func (r *ModelRegistry) GetAllProviders() []ProviderInfo
GetAllProviders returns information about all registered providers
func (*ModelRegistry) GetAvailableModels ¶
func (r *ModelRegistry) GetAvailableModels(handlerType string) []map[string]any
GetAvailableModels returns all models that have at least one available client.
Parameters:
- handlerType: The handler type to filter models for (e.g., "openai", "claude", "gemini")
Returns:
- []map[string]any: List of available models in the requested format
func (*ModelRegistry) GetFirstAvailableModel ¶
func (r *ModelRegistry) GetFirstAvailableModel(handlerType string, priorityList []string) (string, error)
GetFirstAvailableModel returns the first available model for the given handler type. It first checks the provided priorityList. If no model from the list is available, it prioritizes remaining models by their creation timestamp (newest first).
Parameters:
- handlerType: The API handler type (e.g., "openai", "claude", "gemini")
- priorityList: Optional list of model IDs to check first
Returns:
- string: The model ID of the first available model, or empty string if none available
- error: An error if no models are available
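The selection order described above (priority list first, then newest by creation timestamp) can be sketched as below. This is a standalone illustration rather than the registry's code: `model` and `firstAvailable` are hypothetical, and handler-type filtering is left out.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// model is a local stand-in carrying only the fields the sketch needs.
type model struct {
	ID      string
	Created int64
}

// firstAvailable returns the first priority ID that is available, otherwise
// the available model with the newest Created timestamp.
func firstAvailable(available []model, priority []string) (string, error) {
	byID := make(map[string]struct{}, len(available))
	for _, m := range available {
		byID[m.ID] = struct{}{}
	}
	for _, id := range priority {
		if _, ok := byID[id]; ok {
			return id, nil
		}
	}
	if len(available) == 0 {
		return "", errors.New("no models available")
	}
	// Copy before sorting so the caller's slice order is untouched.
	rest := append([]model(nil), available...)
	sort.SliceStable(rest, func(i, j int) bool { return rest[i].Created > rest[j].Created })
	return rest[0].ID, nil
}

func main() {
	available := []model{{ID: "older", Created: 100}, {ID: "newer", Created: 200}}
	id, err := firstAvailable(available, []string{"not-registered"})
	fmt.Println(id, err)
}
```

Passing a nil or empty priority list falls straight through to the newest-first fallback.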
func (*ModelRegistry) GetModelCount ¶
func (r *ModelRegistry) GetModelCount(modelID string) int
GetModelCount returns the number of available clients for a specific model.
Parameters:
- modelID: The model ID to check
Returns:
- int: Number of available clients for the model
func (*ModelRegistry) GetModelInfo ¶
func (r *ModelRegistry) GetModelInfo(modelID string) *ModelInfo
GetModelInfo returns the registered ModelInfo for the given model ID, if present. Returns nil if the model is unknown to the registry.
func (*ModelRegistry) GetModelProviders ¶
func (r *ModelRegistry) GetModelProviders(modelID string) []string
GetModelProviders returns provider identifiers that currently supply the given model.
Parameters:
- modelID: The model ID to check
Returns:
- []string: Provider identifiers ordered by availability count (descending)
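The descending-by-count ordering can be illustrated with a map shaped like ModelRegistration.Providers. This is a sketch under assumptions: `providersByCount` is a hypothetical helper, and both the alphabetical tie-break and the skipping of zero counts are choices made here, not documented behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// providersByCount returns provider IDs sorted by client count, descending.
// Ties are broken alphabetically and zero counts are skipped; both choices
// are assumptions made for this sketch.
func providersByCount(counts map[string]int) []string {
	ids := make([]string, 0, len(counts))
	for id, n := range counts {
		if n > 0 {
			ids = append(ids, id)
		}
	}
	sort.Slice(ids, func(i, j int) bool {
		if counts[ids[i]] != counts[ids[j]] {
			return counts[ids[i]] > counts[ids[j]]
		}
		return ids[i] < ids[j]
	})
	return ids
}

func main() {
	counts := map[string]int{"gemini": 1, "claude": 3, "openai": 1}
	fmt.Println(providersByCount(counts)) // prints [claude gemini openai]
}
```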
func (*ModelRegistry) GetModelsForClient ¶
func (r *ModelRegistry) GetModelsForClient(clientID string) []*ModelInfo
GetModelsForClient returns the models registered for a specific client.
Parameters:
- clientID: The client identifier (typically auth file name or auth ID)
Returns:
- []*ModelInfo: List of models registered for this client, nil if client not found
func (*ModelRegistry) GetModelsWithMinContext ¶
func (mr *ModelRegistry) GetModelsWithMinContext(minContext int) []*ModelInfo
GetModelsWithMinContext returns all active models that support at least the given context length.
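A minimal sketch of such a filter follows. It assumes the comparison is made against ContextLength, which this documentation does not confirm (InputTokenLimit would be another candidate). `modelInfo` and `withMinContext` are hypothetical stand-ins.

```go
package main

import "fmt"

// modelInfo is a local stand-in with only the fields the filter needs.
type modelInfo struct {
	ID            string
	ContextLength int
}

// withMinContext keeps models whose ContextLength is at least minContext.
// Comparing against ContextLength is an assumption of this sketch.
func withMinContext(models []*modelInfo, minContext int) []*modelInfo {
	var out []*modelInfo
	for _, m := range models {
		if m != nil && m.ContextLength >= minContext {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	models := []*modelInfo{
		{ID: "small", ContextLength: 8192},
		{ID: "large", ContextLength: 200000},
	}
	for _, m := range withMinContext(models, 100000) {
		fmt.Println(m.ID) // prints large
	}
}
```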
func (*ModelRegistry) RegisterClient ¶
func (r *ModelRegistry) RegisterClient(clientID, clientProvider string, models []*ModelInfo)
RegisterClient registers a client and its supported models.
Parameters:
- clientID: Unique identifier for the client
- clientProvider: Provider name (e.g., "gemini", "claude", "openai")
- models: List of models that this client can provide
func (*ModelRegistry) ResumeClientModel ¶
func (r *ModelRegistry) ResumeClientModel(clientID, modelID string)
ResumeClientModel clears a previous suspension so the client counts toward availability again.
Parameters:
- clientID: The client to resume
- modelID: The model being resumed
func (*ModelRegistry) SetModelQuotaExceeded ¶
func (r *ModelRegistry) SetModelQuotaExceeded(clientID, modelID string)
SetModelQuotaExceeded marks a model as quota exceeded for a specific client.
Parameters:
- clientID: The client that exceeded quota
- modelID: The model that exceeded quota
func (*ModelRegistry) SuspendClientModel ¶
func (r *ModelRegistry) SuspendClientModel(clientID, modelID, reason string)
SuspendClientModel marks a client's model as temporarily unavailable until explicitly resumed.
Parameters:
- clientID: The client to suspend
- modelID: The model affected by the suspension
- reason: Optional description for observability
func (*ModelRegistry) UnregisterClient ¶
func (r *ModelRegistry) UnregisterClient(clientID string)
UnregisterClient removes a client and decrements counts for its models.
Parameters:
- clientID: Unique identifier for the client to remove
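The reference-counting idea behind RegisterClient and UnregisterClient can be sketched with a toy registry. Everything here (`refRegistry` and its methods) is hypothetical and deliberately simplified: it has no locking, no provider tracking, and no quota state, and the re-registration behaviour is an assumption.

```go
package main

import "fmt"

// refRegistry is a toy reference-counting registry, not the package's type.
type refRegistry struct {
	counts       map[string]int      // model ID -> number of clients
	clientModels map[string][]string // client ID -> registered model IDs
}

func newRefRegistry() *refRegistry {
	return &refRegistry{
		counts:       make(map[string]int),
		clientModels: make(map[string][]string),
	}
}

// register records the client's models and increments each model's count.
// Re-registering replaces the earlier set (an assumption of this sketch).
func (r *refRegistry) register(clientID string, models []string) {
	r.unregister(clientID)
	r.clientModels[clientID] = append([]string(nil), models...)
	for _, m := range models {
		r.counts[m]++
	}
}

// unregister decrements counts and drops models no client provides anymore.
func (r *refRegistry) unregister(clientID string) {
	for _, m := range r.clientModels[clientID] {
		r.counts[m]--
		if r.counts[m] <= 0 {
			delete(r.counts, m)
		}
	}
	delete(r.clientModels, clientID)
}

// count reports how many clients currently provide modelID.
func (r *refRegistry) count(modelID string) int { return r.counts[modelID] }

func main() {
	r := newRefRegistry()
	r.register("client-a", []string{"model-1", "model-2"})
	r.register("client-b", []string{"model-1"})
	r.unregister("client-a")
	fmt.Println(r.count("model-1"), r.count("model-2")) // prints 1 0
}
```

A registry shared across goroutines would also need synchronisation, which the sketch omits for brevity.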
type NativeTool ¶ added in v0.5.7
type NativeTool struct {
// Type is the tool type the upstream model recognises (e.g.
// "web_search"). Matches the "type" key of an OpenAI tools[] entry.
Type string `json:"type" yaml:"type"`
// Description is a short human-readable sentence for the operator /
// client UI. Not forwarded to the model.
Description string `json:"description,omitempty" yaml:"description,omitempty"`
// Params documents the per-tool knobs (e.g. force_search,
// max_keyword). Shape is intentionally loose (map[string]any) since
// each provider defines its own parameter surface. Callers that want
// to set a param append it alongside "type" on their tools[] entry.
Params map[string]any `json:"params,omitempty" yaml:"params,omitempty"`
}
NativeTool describes a provider-native tool a model supports out of the box. The shape deliberately mirrors the OpenAI tools[] entry a caller would declare (`{"type": "...", ...}`) so consumers can splice it directly into their own tools array at chat-completion time without translation. Params documents the tool's knobs for the operator; they are optional hints, not enforced by switchAILocal.
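As an illustration of the splice described above, the sketch below merges native tool entries into a caller's tools array. The helper `mergeTools` and the local `nativeTool` type are hypothetical, and skipping a native tool whose type the caller already declared is an assumption of this example. Only the type key is copied, since Description is documented as not forwarded to the model.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nativeTool is a local stand-in mirroring the documented NativeTool fields.
type nativeTool struct {
	Type        string
	Description string
	Params      map[string]any
}

// mergeTools appends one {"type": ...} entry per native tool to the caller's
// tools, skipping types the caller already declared (an assumption here).
func mergeTools(callerTools []map[string]any, native []nativeTool) []map[string]any {
	seen := make(map[string]bool)
	merged := make([]map[string]any, 0, len(callerTools)+len(native))
	for _, t := range callerTools {
		if typ, ok := t["type"].(string); ok {
			seen[typ] = true
		}
		merged = append(merged, t)
	}
	for _, n := range native {
		if seen[n.Type] {
			continue
		}
		seen[n.Type] = true
		merged = append(merged, map[string]any{"type": n.Type})
	}
	return merged
}

func main() {
	caller := []map[string]any{
		{"type": "function", "function": map[string]any{"name": "get_weather"}},
	}
	native := []nativeTool{{Type: "web_search", Description: "Provider-side web search"}}

	out, err := json.Marshal(mergeTools(caller, native))
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```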
type ProviderInfo ¶
type ProviderInfo struct {
// ID is the unique identifier for the provider (e.g., "gemini", "claude", "ollama")
ID string `json:"id"`
// Name is the human-readable name
Name string `json:"name"`
// Type indicates the provider type ("api" or "cli")
Type string `json:"type"`
// Mode indicates the operational mode ("local" or "online")
Mode string `json:"mode"`
// Status indicates provider availability ("active", "degraded", "unavailable")
Status string `json:"status"`
// ModelCount is the number of models available from this provider
ModelCount int `json:"model_count"`
// Models lists the model IDs available from this provider
Models []string `json:"models,omitempty"`
}
ProviderInfo represents information about an AI provider
type ThinkingSupport ¶
type ThinkingSupport struct {
// Min is the minimum allowed thinking budget (inclusive).
Min int `json:"min,omitempty"`
// Max is the maximum allowed thinking budget (inclusive).
Max int `json:"max,omitempty"`
// ZeroAllowed indicates whether 0 is a valid value (to disable thinking).
ZeroAllowed bool `json:"zero_allowed,omitempty"`
// DynamicAllowed indicates whether -1 is a valid value (dynamic thinking budget).
DynamicAllowed bool `json:"dynamic_allowed,omitempty"`
// Levels defines discrete reasoning effort levels (e.g., "low", "medium", "high").
// When set, the model uses level-based reasoning instead of token budgets.
Levels []string `json:"levels,omitempty"`
}
ThinkingSupport describes a model family's supported internal reasoning budget range. Values are interpreted in provider-native token units.
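A budget normaliser built on these fields might look like the sketch below. The package's actual normalisation logic is not shown in this documentation, so `normalizeBudget`, its clamping rules, and the sample limits are assumptions. The sketch ignores Levels entirely.

```go
package main

import "fmt"

// thinkingSupport is a local stand-in for the numeric ThinkingSupport fields.
type thinkingSupport struct {
	Min            int
	Max            int
	ZeroAllowed    bool
	DynamicAllowed bool
}

// normalizeBudget is a hypothetical helper. It passes -1 and 0 through when
// the model allows them and otherwise clamps the budget into [Min, Max].
func normalizeBudget(s thinkingSupport, budget int) int {
	switch {
	case budget == -1 && s.DynamicAllowed:
		return -1 // dynamic budget requested and supported
	case budget == 0 && s.ZeroAllowed:
		return 0 // thinking disabled and the model permits it
	case budget < s.Min:
		return s.Min
	case budget > s.Max:
		return s.Max
	default:
		return budget
	}
}

func main() {
	// Sample limits chosen for illustration only.
	s := thinkingSupport{Min: 128, Max: 32768, DynamicAllowed: true}
	for _, b := range []int{-1, 0, 1024, 50000} {
		fmt.Println(b, "->", normalizeBudget(s, b))
	}
}
```

With these sample limits, a request for 0 is raised to Min because ZeroAllowed is false, while -1 passes through unchanged.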