model

package
v0.9.0 Latest
Published: May 6, 2026 License: Apache-2.0 Imports: 23 Imported by: 0

Documentation

Constants

const (

	// Provider name constants used in model routing and configuration.
	ProviderOllama    = "ollama"
	ProviderAnthropic = "anthropic"
	ProviderOpenAI    = "openai"
)
const DiscoverDisabledEnv = "OBOL_DISABLE_LOCAL_MODEL_DISCOVERY"

DiscoverDisabledEnv is the env var that disables local-server discovery. Set to "1" or "true" to skip the scan entirely (useful when an unrelated service binds one of the well-known inference ports).
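
For illustration, a sketch of the guard a caller might run before scanning; the helper name is hypothetical, and only the variable and its accepted values are documented:

// discoveryDisabled reports whether the operator opted out of the
// local-server scan. Accepted values per the doc above: "1" or "true".
func discoveryDisabled() bool {
	v := os.Getenv(DiscoverDisabledEnv) // import "os"
	return v == "1" || v == "true"
}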

const PreferredDefaultOllamaModel = "qwen3.5:4b"

PreferredDefaultOllamaModel is the model we *recommend* operators pull when they're starting from an empty Ollama inventory or have only cloud-aliased entries. Picked as a reasonable balance between capability and CPU footprint on developer machines without a discrete GPU.

Note: we do NOT bump this to the front of an existing `/api/tags` ordering. On hosts that already have local chat models, the ordering Ollama returns (modified-time) is treated as the operator's preference signal — overriding it would silently demote a model the user just pulled and intends to use. The stack-up auto-config only suggests this name when Ollama has nothing usable; once any local chat model is configured, `obol model prefer ...` is the explicit reorder path.

Variables

var WellKnownModels = map[string][]string{
	ProviderAnthropic: {
		"claude-opus-4-6",
		"claude-sonnet-4-6",
		"claude-haiku-4-5-20251001",
		"claude-sonnet-4-5-20250929",
	},
	ProviderOpenAI: {
		"gpt-5.4",
		"gpt-4.1",
		"gpt-4.1-mini",
		"o4-mini",
		"o3",
	},
}

WellKnownModels maps provider names to their commonly used model IDs. Used to populate OpenClaw's model allowlist when a wildcard is configured and the LiteLLM pod is not reachable for a live /v1/models query.
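
A sketch of that fallback, with a hypothetical expandWildcard helper standing in for the real lookup path:

// expandWildcard resolves a "provider/*" entry to concrete model IDs,
// preferring the live /v1/models query and falling back to the baked-in
// list when the pod is unreachable.
func expandWildcard(provider string, liveQuery func() ([]string, error)) []string {
	if ids, err := liveQuery(); err == nil {
		return ids
	}
	return WellKnownModels[provider] // e.g. ProviderAnthropic
}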

Functions

func AddCustomEndpoint added in v0.8.0

func AddCustomEndpoint(cfg *config.Config, u *ui.UI, name, endpoint, modelName, apiKey string) error

AddCustomEndpoint adds a custom OpenAI-compatible endpoint to LiteLLM after validating it works.

LiteLLM `model_name` contract — the canonical identifier is the bare `modelName`. Same convention every other code path in this stack uses: Ollama writes `qwen3.5:9b`, Anthropic writes `claude-opus-4-7`, OpenAI writes `gpt-5.4`. The agent (Hermes / OpenClaw) reads `model_name` straight back as the `model` field on chat-completion calls — any provider-prefix namespacing (`custom/<name>/<model>`) on this side breaks that round-trip because the agent then strips it and calls LiteLLM with a key that doesn't match.

The `name` arg is informational only. It is surfaced via `obol model status` / `list` for human reference but does NOT participate in the LiteLLM route key. Two custom endpoints that publish the same `modelName` will overwrite each other in the LiteLLM ConfigMap; that is the natural "repoint my model" behavior an operator running `obol model setup custom` wants when they re-run the command.
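
For example, registering a local mlx-lm style server (all values illustrative; cfg and u come from the surrounding command):

err := AddCustomEndpoint(cfg, u,
	"workstation-mlx",      // name: informational label only
	"http://10.0.0.5:8080", // endpoint: no trailing /v1
	"my-local-model",       // modelName: the LiteLLM route key
	"",                     // apiKey: empty for an unauthenticated server
)
if err != nil {
	return err
}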

func AutoConfigOllamaModelNames added in v0.9.0

func AutoConfigOllamaModelNames(models []OllamaModel) []string

AutoConfigOllamaModelNames converts the raw /api/tags inventory into the ordered model-name list we auto-write into LiteLLM and agent configs.

Policy:

  • strip the cosmetic `:latest` tag suffix
  • ignore empty names
  • keep local chat-capable models ahead of Ollama cloud aliases that would require extra credentials to work (mitigates the rc8 regression where a `:cloud` alias landed at index 0 and became Hermes' unusable default)
  • keep embedding-only models last so they never become the default chat model
  • within each tier, preserve Ollama's own ordering — that's the operator's pull-history preference signal, and overriding it would silently demote a model the user just pulled

This only affects auto-generated defaults. Operators can still reorder the resulting LiteLLM model_list later with `obol model prefer ...`.
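
A worked example of the policy (names are illustrative; nomic-embed-text is assumed to be classified embedding-only):

inventory := []OllamaModel{
	{Name: "deepseek-v4-pro:cloud"}, // cloud alias: needs credentials
	{Name: "mymodel:latest"},        // local chat model, cosmetic tag stripped
	{Name: "nomic-embed-text"},      // assumed embedding-only here
}
names := AutoConfigOllamaModelNames(inventory)
// names == ["mymodel", "deepseek-v4-pro:cloud", "nomic-embed-text"]
// The local chat model jumps ahead of the cloud alias; embedding stays last.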

func ConfigureLiteLLM added in v0.8.0

func ConfigureLiteLLM(cfg *config.Config, u *ui.UI, provider, apiKey string, models []string) error

ConfigureLiteLLM adds a provider to the LiteLLM gateway. For cloud providers, it patches the Secret with the API key and adds the model to config.yaml. For Ollama, it discovers local models and adds them.

When only models change (no API key), models are hot-added via the /model/new API — no restart required. When API keys change, a rolling restart is triggered so the new Secret values are picked up.
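
For example, enabling Anthropic with an explicit model list (values illustrative; cfg, u, and apiKey come from the surrounding command):

err := ConfigureLiteLLM(cfg, u, ProviderAnthropic, apiKey, []string{
	"claude-opus-4-6",
	"claude-sonnet-4-6",
})
if err != nil {
	return err
}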

func FormatBytes added in v0.8.0

func FormatBytes(b int64) string

FormatBytes formats a byte count as a human-readable string.

func GetConfiguredModels added in v0.8.0

func GetConfiguredModels(cfg *config.Config) ([]string, error)

GetConfiguredModels returns the model names available in LiteLLM. Wildcard entries (e.g. anthropic/*) are expanded: first by querying the running LiteLLM pod's /v1/models endpoint, falling back to the baked-in WellKnownModels list if the cluster is unreachable.

func GetMasterKey added in v0.8.0

func GetMasterKey(cfg *config.Config) (string, error)

GetMasterKey reads the LiteLLM master key from the cluster Secret.

func GetProviderStatus

func GetProviderStatus(cfg *config.Config) (map[string]ProviderStatus, error)

GetProviderStatus reads LiteLLM config and returns provider status.

func HasConfiguredModels added in v0.8.0

func HasConfiguredModels(cfg *config.Config) bool

HasConfiguredModels returns true if LiteLLM has at least one non-catch-all model configured (i.e., something other than the "paid/*" route).

func HasProviderConfigured added in v0.8.0

func HasProviderConfigured(cfg *config.Config, provider string) bool

HasProviderConfigured returns true if LiteLLM already has at least one model entry for the given provider (e.g., "anthropic", "openai").

func IsCredentialRequiringOllamaModel added in v0.9.0

func IsCredentialRequiringOllamaModel(name string) bool

IsCredentialRequiringOllamaModel reports whether an Ollama model name is one of the cloud-aliased entries that needs an API key to actually serve requests (e.g. `deepseek-v4-pro:cloud`). Exported so stack-up can warn when the auto-picked primary would land on one of these.

func LoadDotEnv added in v0.8.0

func LoadDotEnv(path string) map[string]string

LoadDotEnv reads KEY=value pairs from a .env file. Returns an empty map if the file doesn't exist or is unreadable. Skips comments (#) and blank lines. Does not call os.Setenv.
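
Typical usage reads the map directly rather than the process environment. A sketch:

env := LoadDotEnv(".env") // empty map if the file is missing or unreadable
if key, ok := env["ANTHROPIC_API_KEY"]; ok {
	_ = key // use without mutating the process environment
}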

func PatchLiteLLMEntries added in v0.9.0

func PatchLiteLLMEntries(cfg *config.Config, u *ui.UI, entries []ModelEntry) error

PatchLiteLLMEntries merges precomputed ModelEntry values into the LiteLLM ConfigMap without touching Secrets and without restarting. The caller is responsible for restarting LiteLLM once, after batching all patches, if an upstream Secret/ConfigMap value actually changed.
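
A sketch of the intended call pattern; batches, secretChanged, and provider here are hypothetical caller-side bookkeeping:

// PatchLiteLLMEntries itself never restarts anything; restart once at the end.
for _, batch := range batches {
	if err := PatchLiteLLMEntries(cfg, u, batch); err != nil {
		return err
	}
}
if secretChanged {
	if err := RestartLiteLLM(cfg, u, provider); err != nil {
		return err
	}
}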

func PatchLiteLLMProvider added in v0.8.0

func PatchLiteLLMProvider(cfg *config.Config, u *ui.UI, provider, apiKey string, models []string) error

PatchLiteLLMProvider patches the LiteLLM Secret (API key) and ConfigMap (model_list) for a provider without restarting the deployment. Call RestartLiteLLM afterwards (once, after batching multiple providers).

func PreferModels added in v0.9.0

func PreferModels(cfg *config.Config, u *ui.UI, names []string) error

PreferModels reorders LiteLLM's model_list so the named entries appear at the head, in the order given. Remaining entries keep their original relative order. This is the operator-facing primitive that lets model.Rank's "first chat-capable wins" rule pick a specific primary without a remove/re-add cycle.

Returns an error if any of the requested names is not present in the current model_list — typos should be loud, not silent no-ops.

LiteLLM has no model_list reorder API, so after the ConfigMap patch this rolls the LiteLLM Deployment so the new order takes effect (the /v1/models listing follows model_list order, and Hermes/OpenClaw read the ConfigMap directly via GetConfiguredModels to pick the agent primary).
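
For example, promoting a specific primary without a remove/re-add cycle:

// Put the Ollama model first so Rank picks it as primary.
if err := PreferModels(cfg, u, []string{"qwen3.5:9b"}); err != nil {
	return err // loud failure on typos, per the contract above
}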

func ProviderEnvVar added in v0.8.0

func ProviderEnvVar(provider string) string

ProviderEnvVar returns the env var name for a provider's API key.

func ProviderFromModelName added in v0.8.0

func ProviderFromModelName(name string) string

ProviderFromModelName infers the provider from a model name string.

func PullOllamaModel added in v0.8.0

func PullOllamaModel(name string) error

PullOllamaModel pulls a model from the Ollama registry. It streams progress to stdout, matching the UX of `ollama pull`.

func Rank added in v0.9.0

func Rank(models []string) (primary string, fallbacks []string)

Rank selects the primary model from the configured model list and demotes the rest to fallbacks.

Model ordering is configuration, not hidden product policy. LiteLLM's model_list order is the source of truth, so Rank preserves that order instead of guessing quality from provider names, parameter-count tags, or model-family aliases. The only exception is known embedding-only entries: they are kept in the fallback list but moved behind chat-capable models so an embedding model does not become the default chat model when another option exists.

The returned strings are the original inputs. Do not strip provider prefixes or normalize names here; Hermes/OpenClaw round-trip the returned primary back to LiteLLM as the chat-completions model field.
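
A worked example of the contract, assuming nomic-embed-text is on the known embedding-only list:

primary, fallbacks := Rank([]string{
	"nomic-embed-text", // embedding-only: demoted behind chat models
	"claude-opus-4-6",
	"gpt-5.4",
})
// primary   == "claude-opus-4-6"  (first chat-capable entry)
// fallbacks == ["gpt-5.4", "nomic-embed-text"]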

func RemoveModel added in v0.8.0

func RemoveModel(cfg *config.Config, u *ui.UI, modelName string) error

RemoveModel removes a model entry from the LiteLLM ConfigMap (persistence) and hot-deletes it from the running router via the API (immediate effect). No pod restart is required.

func ResolveAPIKey added in v0.8.0

func ResolveAPIKey(provider string) (key, envVarUsed string)

ResolveAPIKey checks the primary env var and each AltEnvVar in order for the given provider. Returns the key value and the env var it was found in. Both are empty if no key is available.
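
A sketch of a typical call site (error wording is illustrative):

key, envVar := ResolveAPIKey(ProviderAnthropic)
if key == "" {
	// neither the primary env var nor any AltEnvVar was set
	return fmt.Errorf("set %s to enable anthropic", ProviderEnvVar(ProviderAnthropic))
}
_ = envVar // records which variable supplied the key, e.g. CLAUDE_CODE_OAUTH_TOKEN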

func RestartLiteLLM added in v0.8.0

func RestartLiteLLM(cfg *config.Config, u *ui.UI, provider string) error

RestartLiteLLM restarts the LiteLLM deployment and waits for rollout.

func ValidateCustomEndpoint added in v0.8.0

func ValidateCustomEndpoint(endpoint, modelName, apiKey string) error

ValidateCustomEndpoint validates that a custom OpenAI-compatible endpoint works. It runs a 2-step validation: reachability check, then inference probe. The inference probe is the definitive test — some servers (e.g., mlx-lm) don't list the loaded model in /models but accept it for inference.

func WarnAndStripV1Suffix added in v0.8.0

func WarnAndStripV1Suffix(endpoint string) string

WarnAndStripV1Suffix checks if an endpoint URL has a trailing /v1 suffix, warns the user, and returns the stripped URL. For OpenAI-compatible providers, LiteLLM auto-appends /v1, causing double /v1/v1 if the user includes it.
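
For example (warning output elided; only the return value is shown):

endpoint := WarnAndStripV1Suffix("http://10.0.0.5:8080/v1")
// endpoint == "http://10.0.0.5:8080"; LiteLLM appends /v1 itself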

Types

type CacheControlInjection added in v0.9.0

type CacheControlInjection struct {
	Location string `yaml:"location"`
	Role     string `yaml:"role,omitempty"`
	Index    *int   `yaml:"index,omitempty"`
}

CacheControlInjection is one entry in LiteLLM's cache_control_injection_points list. Either Role or Index narrows which message in the request gets the cache_control marker.

type DiscoveredProvider added in v0.9.0

type DiscoveredProvider struct {
	Label           string
	ServerType      string
	HostEndpoint    string
	ClusterEndpoint string
	Entries         []ModelEntry
}

DiscoveredProvider is one local OpenAI-compatible inference server that auto-config can register with LiteLLM.

func DiscoverLocalProviders added in v0.9.0

func DiscoverLocalProviders(ctx context.Context) ([]DiscoveredProvider, error)

DiscoverLocalProviders probes well-known local inference ports and returns one DiscoveredProvider per host endpoint that exposes at least one model. Returns an empty slice (not an error) when discovery is disabled or nothing is reachable. Honors OBOL_DISABLE_LOCAL_MODEL_DISCOVERY.
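
A sketch of a call site honoring the empty-slice contract (the timeout and the follow-on PatchLiteLLMEntries wiring are illustrative):

ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
providers, err := DiscoverLocalProviders(ctx)
if err != nil {
	return err
}
for _, p := range providers { // empty slice when disabled or nothing reachable
	if err := PatchLiteLLMEntries(cfg, u, p.Entries); err != nil {
		return err
	}
}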

type LiteLLMConfig added in v0.8.0

type LiteLLMConfig struct {
	ModelList       []ModelEntry   `yaml:"model_list"`
	GeneralSettings map[string]any `yaml:"general_settings,omitempty"`
	LiteLLMSettings map[string]any `yaml:"litellm_settings,omitempty"`
}

LiteLLMConfig represents the LiteLLM proxy config.yaml structure.

type LiteLLMParams added in v0.8.0

type LiteLLMParams struct {
	Model   string `yaml:"model"`
	APIBase string `yaml:"api_base,omitempty"`
	APIKey  string `yaml:"api_key,omitempty"`
	// CacheControlInjectionPoints is a LiteLLM directive that tells the proxy
	// to attach Anthropic-style `cache_control: {type: ephemeral}` markers to
	// specific messages on every request to this model. We pin the system
	// message for Anthropic entries so prompt caching is on by default.
	CacheControlInjectionPoints []CacheControlInjection `yaml:"cache_control_injection_points,omitempty"`
}

LiteLLMParams holds the routing parameters for a model.

type ModelEntry added in v0.8.0

type ModelEntry struct {
	ModelName     string        `yaml:"model_name"`
	LiteLLMParams LiteLLMParams `yaml:"litellm_params"`
}

ModelEntry is a single entry in model_list.
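
Putting the types together: a sketch of one model_list entry with the prompt-caching injection described under LiteLLMParams. The Location/Role values shown are an assumption about LiteLLM's expected shape, not confirmed by this package:

entry := ModelEntry{
	ModelName: "claude-opus-4-6",
	LiteLLMParams: LiteLLMParams{
		Model: "anthropic/claude-opus-4-6",
		CacheControlInjectionPoints: []CacheControlInjection{
			// Assumed shape: pin the system message so prompt
			// caching is on by default, as described above.
			{Location: "message", Role: "system"},
		},
	},
}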

type OllamaModel added in v0.8.0

type OllamaModel struct {
	Name       string `json:"name"`
	Size       int64  `json:"size"`
	ModifiedAt string `json:"modified_at"`
}

OllamaModel describes a model pulled in the local Ollama instance.

func ListOllamaModels added in v0.8.0

func ListOllamaModels() ([]OllamaModel, error)

ListOllamaModels queries the local Ollama server for pulled models. Returns nil and an error if Ollama is not reachable.

type ProviderInfo

type ProviderInfo struct {
	ID         string   // provider id (e.g. "anthropic", "openai", "ollama")
	Name       string   // display name
	EnvVar     string   // primary env var for API key (empty for Ollama)
	AltEnvVars []string // fallback env vars checked in order (e.g. CLAUDE_CODE_OAUTH_TOKEN)
}

ProviderInfo describes an LLM provider.

func GetAvailableProviders

func GetAvailableProviders(_ *config.Config) ([]ProviderInfo, error)

GetAvailableProviders returns the known provider list (static, no pod query needed).

type ProviderStatus

type ProviderStatus struct {
	Enabled   bool
	HasAPIKey bool
	EnvVar    string // environment variable name (e.g. ANTHROPIC_API_KEY)
	Models    []string
}

ProviderStatus captures effective global LiteLLM provider state.
