Documentation ¶
Index ¶
- Constants
- Variables
- func AddCustomEndpoint(cfg *config.Config, u *ui.UI, endpoint, modelName, apiKey string) error
- func AddCustomEndpointWithOptions(cfg *config.Config, u *ui.UI, endpoint, modelName, apiKey string, ...) error
- func AutoConfigOllamaModelNames(models []OllamaModel) []string
- func ConfigureLiteLLM(cfg *config.Config, u *ui.UI, provider, apiKey string, models []string) error
- func FormatBytes(b int64) string
- func GetConfiguredModels(cfg *config.Config) ([]string, error)
- func GetMasterKey(cfg *config.Config) (string, error)
- func GetProviderStatus(cfg *config.Config) (map[string]ProviderStatus, error)
- func HasConfiguredModels(cfg *config.Config) bool
- func HasProviderConfigured(cfg *config.Config, provider string) bool
- func IsCredentialRequiringOllamaModel(name string) bool
- func ListChatCapableModels(cfg *config.Config) ([]string, error)
- func LoadDotEnv(path string) map[string]string
- func PatchLiteLLMEntries(cfg *config.Config, u *ui.UI, entries []ModelEntry) error
- func PatchLiteLLMProvider(cfg *config.Config, u *ui.UI, provider, apiKey string, models []string) error
- func PreferModels(cfg *config.Config, u *ui.UI, names []string) error
- func ProviderEnvVar(provider string) string
- func ProviderFromModelName(name string) string
- func PullOllamaModel(name string) error
- func Rank(models []string) (primary string, fallbacks []string)
- func ReconcileRecorded(cfg *config.Config, u *ui.UI)
- func RecordState(cfg *config.Config, u *ui.UI)
- func RemoveModel(cfg *config.Config, u *ui.UI, modelName string) error
- func ResolveAPIKey(provider string) (key, envVarUsed string)
- func RestartLiteLLM(cfg *config.Config, u *ui.UI, provider string) error
- func ValidateCustomEndpoint(endpoint, modelName, apiKey string) error
- func ValidateCustomEndpointWithOptions(endpoint, modelName, apiKey string, options CustomEndpointOptions) error
- func WarnAndStripV1Suffix(endpoint string) string
- type CacheControlInjection
- type CustomEndpointOptions
- type DiscoveredProvider
- type LiteLLMConfig
- type LiteLLMParams
- type ModelEntry
- type OllamaModel
- type ProviderInfo
- type ProviderStatus
- type RecordedModelState
Constants ¶
const (
	// Provider name constants used in model routing and configuration.
	ProviderOllama    = "ollama"
	ProviderAnthropic = "anthropic"
	ProviderOpenAI    = "openai"
)
const DiscoverDisabledEnv = "OBOL_DISABLE_LOCAL_MODEL_DISCOVERY"
DiscoverDisabledEnv is the env var that disables local-server discovery. Set to "1" or "true" to skip the scan entirely (useful when an unrelated service binds one of the well-known inference ports).
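A minimal sketch of how a caller might interpret this variable. The helper `isDisabledValue` is hypothetical, and trimming whitespace and ignoring case are assumptions; only the "1" and "true" values come from the description above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// DiscoverDisabledEnv mirrors the constant documented above.
const DiscoverDisabledEnv = "OBOL_DISABLE_LOCAL_MODEL_DISCOVERY"

// isDisabledValue is a hypothetical helper: it reports whether a raw
// environment value should switch discovery off. Accepting surrounding
// whitespace and mixed case is an assumption, not documented behavior.
func isDisabledValue(raw string) bool {
	v := strings.ToLower(strings.TrimSpace(raw))
	return v == "1" || v == "true"
}

func main() {
	if isDisabledValue(os.Getenv(DiscoverDisabledEnv)) {
		fmt.Println("local model discovery is disabled; skipping scan")
		return
	}
	fmt.Println("local model discovery is enabled")
}
```

Keeping the check in a pure function that takes the raw value makes it easy to test without mutating the process environment.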
const PreferredDefaultOllamaModel = "qwen3.5:4b"
PreferredDefaultOllamaModel is the model we *recommend* operators pull when they're starting from an empty Ollama inventory or have only cloud-aliased entries. Picked as a reasonable balance between capability and CPU footprint on developer machines without a discrete GPU.
Note: we do NOT bump this to the front of an existing `/api/tags` ordering. On hosts that already have local chat models, the ordering Ollama returns (modified-time) is treated as the operator's preference signal — overriding it would silently demote a model the user just pulled and intends to use. The stack-up auto-config only suggests this name when Ollama has nothing usable; once any local chat model is configured, `obol model prefer ...` is the explicit reorder path.
Variables ¶
var WellKnownModels = map[string][]string{
	ProviderAnthropic: {
		"claude-opus-4-6",
		"claude-sonnet-4-6",
		"claude-haiku-4-5-20251001",
		"claude-sonnet-4-5-20250929",
	},
	ProviderOpenAI: {
		"gpt-5.4",
		"gpt-4.1",
		"gpt-4.1-mini",
		"o4-mini",
		"o3",
	},
}
WellKnownModels maps provider names to their commonly-used model IDs. Used to populate OpenClaw's model allowlist when a wildcard is configured and the LiteLLM pod is not reachable for a live /v1/models query.
Functions ¶
func AddCustomEndpoint ¶ added in v0.6.0
AddCustomEndpoint adds a custom OpenAI-compatible endpoint to LiteLLM after validating it works.
LiteLLM `model_name` contract: the canonical identifier is the bare `modelName`. Every other code path in this stack follows the same convention: Ollama writes `qwen3.5:9b`, Anthropic writes `claude-opus-4-7`, OpenAI writes `gpt-5.4`. The agent (Hermes / OpenClaw) reads `model_name` straight back as the `model` field on chat-completion calls. Any provider-prefix namespacing (`custom/<name>/<model>`) on this side breaks that round-trip, because the agent strips the prefix and then calls LiteLLM with a key that doesn't match.
Two custom endpoints that publish the same `modelName` will overwrite each other in the LiteLLM ConfigMap; that is the natural "repoint my model" behavior an operator running `obol model setup custom` wants when they re-run the command.
func AddCustomEndpointWithOptions ¶ added in v0.10.0
func AutoConfigOllamaModelNames ¶ added in v0.9.0
func AutoConfigOllamaModelNames(models []OllamaModel) []string
AutoConfigOllamaModelNames converts the raw /api/tags inventory into the ordered model-name list we auto-write into LiteLLM and agent configs.
Policy:
- strip the cosmetic `:latest` tag suffix
- ignore empty names
- keep local chat-capable models ahead of Ollama cloud aliases that would require extra credentials to work (mitigates the rc8 regression where a `:cloud` alias landed at index 0 and became Hermes' unusable default)
- keep embedding-only models last so they never become the default chat model
- within each tier, preserve Ollama's own ordering — that's the operator's pull-history preference signal, and overriding it would silently demote a model the user just pulled
This only affects auto-generated defaults. Operators can still reorder the resulting LiteLLM model_list later with `obol model prefer ...`.
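The policy above can be sketched as a stable three-tier partition. This is an illustration under stated assumptions, not the package's implementation: the function names are hypothetical, it works on plain name strings rather than `OllamaModel` values, and detecting cloud aliases by a `:cloud` suffix and embedding models by an "embed" substring are assumed heuristics.

```go
package main

import (
	"fmt"
	"strings"
)

// tier classifies a model name into one of three ordering tiers.
// Both heuristics here are assumptions made for illustration.
func tier(name string) int {
	switch {
	case strings.Contains(name, "embed"):
		return 2 // embedding-only: always last
	case strings.HasSuffix(name, ":cloud"):
		return 1 // cloud alias: needs extra credentials
	default:
		return 0 // local chat-capable model
	}
}

// orderModelNames applies the documented policy to names supplied in
// Ollama's own order. Order within each tier is preserved.
func orderModelNames(names []string) []string {
	tiers := make([][]string, 3)
	for _, n := range names {
		n = strings.TrimSuffix(strings.TrimSpace(n), ":latest")
		if n == "" {
			continue // ignore empty names
		}
		t := tier(n)
		tiers[t] = append(tiers[t], n)
	}
	out := make([]string, 0, len(names))
	for _, t := range tiers {
		out = append(out, t...)
	}
	return out
}

func main() {
	in := []string{"deepseek-v4-pro:cloud", "nomic-embed-text:latest", "", "qwen3.5:4b", "llama3:latest"}
	// prints [qwen3.5:4b llama3 deepseek-v4-pro:cloud nomic-embed-text]
	fmt.Println(orderModelNames(in))
}
```

Bucketing into tiers and concatenating, rather than sorting with a comparator, keeps the within-tier order identical to the input without relying on sort stability.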
func ConfigureLiteLLM ¶ added in v0.6.0
ConfigureLiteLLM adds a provider to the LiteLLM gateway. For cloud providers, it patches the Secret with the API key and adds the models to config.yaml. For Ollama, it discovers local models and adds them.
When only models change (no API key), models are hot-added via the /model/new API — no restart required. When API keys change, a rolling restart is triggered so the new Secret values are picked up.
func FormatBytes ¶ added in v0.5.0
FormatBytes formats a byte count as a human-readable string.
func GetConfiguredModels ¶ added in v0.6.0
GetConfiguredModels returns the model names available in LiteLLM. Wildcard entries (e.g. anthropic/*) are expanded by querying the running LiteLLM pod's /v1/models endpoint; if the cluster is unreachable, expansion falls back to the baked-in WellKnownModels list.
func GetMasterKey ¶ added in v0.6.0
GetMasterKey reads the LiteLLM master key from the cluster Secret.
func GetProviderStatus ¶
func GetProviderStatus(cfg *config.Config) (map[string]ProviderStatus, error)
GetProviderStatus reads LiteLLM config and returns provider status.
func HasConfiguredModels ¶ added in v0.7.0
HasConfiguredModels returns true if LiteLLM has at least one non-catch-all model configured (i.e., something other than the "paid/*" route).
func HasProviderConfigured ¶ added in v0.7.0
HasProviderConfigured returns true if LiteLLM already has at least one model entry for the given provider (e.g., "anthropic", "openai").
func IsCredentialRequiringOllamaModel ¶ added in v0.9.0
IsCredentialRequiringOllamaModel reports whether an Ollama model name is one of the cloud-aliased entries that needs an API key to actually serve requests (e.g. `deepseek-v4-pro:cloud`). Exported so stack-up can warn when the auto-picked primary would land on one of these.
func ListChatCapableModels ¶ added in v0.10.0
ListChatCapableModels reads the current LiteLLM ConfigMap and returns the model names that are chat-capable (not wildcards, not embedding-only).
It intentionally does NOT expand wildcard entries to live models — the goal is to detect whether at least one concrete, directly-routable chat model is present, not to enumerate every model the facilitator might expose. If the ConfigMap is absent or unreadable (cluster not yet up, first install) the function returns (nil, err) — callers should treat that as "no chat models" and emit the warning.
func LoadDotEnv ¶ added in v0.7.0
LoadDotEnv reads KEY=value pairs from a .env file. Returns an empty map if the file doesn't exist or is unreadable. Skips comments (#) and blank lines. Does not call os.Setenv.
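A minimal sketch of the parsing rules described above, operating on file contents rather than a path so it stays self-contained. The name `parseDotEnv` is hypothetical, and details such as trimming whitespace around keys and values and leaving quotes untouched are assumptions; the real function may differ.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseDotEnv is a hypothetical stand-in for the parsing step: it collects
// KEY=value pairs, skipping blank lines and lines starting with '#'.
// It never touches the process environment.
func parseDotEnv(content string) map[string]string {
	out := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=value pair
		}
		key = strings.TrimSpace(key)
		if key == "" {
			continue
		}
		// Assumption: values are kept verbatim apart from outer whitespace.
		out[key] = strings.TrimSpace(value)
	}
	return out
}

func main() {
	// Sample data only; the key name and value are placeholders.
	env := parseDotEnv("# provider keys\n\nOPENAI_API_KEY=sk-test\n")
	fmt.Println(len(env), env["OPENAI_API_KEY"])
}
```

Returning a map instead of calling os.Setenv lets the caller decide which keys, if any, to promote into the environment.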
func PatchLiteLLMEntries ¶ added in v0.9.0
PatchLiteLLMEntries merges precomputed ModelEntry values into the LiteLLM ConfigMap without touching Secrets and without restarting. Caller is responsible for restarting LiteLLM once after batching all patches when an upstream Secret/ConfigMap value actually changed.
func PatchLiteLLMProvider ¶ added in v0.7.0
func PatchLiteLLMProvider(cfg *config.Config, u *ui.UI, provider, apiKey string, models []string) error
PatchLiteLLMProvider patches the LiteLLM Secret (API key) and ConfigMap (model_list) for a provider without restarting the deployment. Call RestartLiteLLM afterwards (once, after batching multiple providers).
func PreferModels ¶ added in v0.9.0
PreferModels reorders LiteLLM's model_list so the named entries appear at the head, in the order given. Remaining entries keep their original relative order. This is the operator-facing primitive that lets model.Rank's "first chat-capable wins" rule pick a specific primary without a remove/re-add cycle.
Returns an error if any of the requested names is not present in the current model_list — typos should be loud, not silent no-ops.
LiteLLM has no model_list reorder API, so after patching the ConfigMap this function rolls the LiteLLM Deployment for the new order to take effect. The /v1/models listing follows model_list order, and hermes/openclaw read the ConfigMap directly via GetConfiguredModels to pick the agent primary.
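The reordering rule itself can be sketched on plain name slices. This is an illustration only: `preferOrder` is a hypothetical name, it assumes model names in the list are unique, and it leaves out the ConfigMap patch and the Deployment rollout entirely.

```go
package main

import "fmt"

// preferOrder moves the preferred names to the head of current, in the
// order given, and keeps the remaining entries in their original relative
// order. An unknown name is an error rather than a silent no-op.
func preferOrder(current, preferred []string) ([]string, error) {
	present := make(map[string]bool, len(current))
	for _, n := range current {
		present[n] = true
	}
	picked := make(map[string]bool, len(preferred))
	out := make([]string, 0, len(current))
	for _, n := range preferred {
		if !present[n] {
			return nil, fmt.Errorf("model %q is not in model_list", n)
		}
		if picked[n] {
			continue // tolerate a name requested twice (an assumption)
		}
		picked[n] = true
		out = append(out, n)
	}
	for _, n := range current {
		if !picked[n] {
			out = append(out, n)
		}
	}
	return out, nil
}

func main() {
	order, err := preferOrder(
		[]string{"qwen3.5:9b", "gpt-5.4", "claude-opus-4-7"},
		[]string{"claude-opus-4-7"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// prints [claude-opus-4-7 qwen3.5:9b gpt-5.4]
	fmt.Println(order)
}
```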
func ProviderEnvVar ¶ added in v0.7.0
ProviderEnvVar returns the env var name for a provider's API key.
func ProviderFromModelName ¶ added in v0.7.0
ProviderFromModelName infers the provider from a model name string.
func PullOllamaModel ¶ added in v0.5.0
PullOllamaModel pulls a model from the Ollama registry. It streams progress to stdout, matching the UX of `ollama pull`.
func Rank ¶ added in v0.9.0
Rank selects the primary model from the configured model list and demotes the rest to fallbacks.
Model ordering is configuration, not hidden product policy. LiteLLM's model_list order is the source of truth, so Rank preserves that order instead of guessing quality from provider names, parameter-count tags, or model-family aliases. The only exception is known embedding-only entries: they are kept in the fallback list but moved behind chat-capable models so an embedding model does not become the default chat model when another option exists.
The returned strings are the original inputs. Do not strip provider prefixes or normalize names here; Hermes/OpenClaw round-trip the returned primary back to LiteLLM as the chat-completions model field.
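A sketch of the selection rule described above, under stated assumptions: the lowercase `rank` and `isEmbeddingOnly` helpers are hypothetical, the "embed" substring check merely stands in for whatever list of known embedding-only entries the package uses, and returning an empty primary for empty input is a guess.

```go
package main

import (
	"fmt"
	"strings"
)

// isEmbeddingOnly is an assumed heuristic standing in for the package's
// own knowledge of embedding-only entries.
func isEmbeddingOnly(name string) bool {
	return strings.Contains(strings.ToLower(name), "embed")
}

// rank keeps the configured order, moves embedding-only entries behind
// chat-capable ones, and returns the first entry as the primary. The
// returned strings are the inputs, unmodified.
func rank(models []string) (primary string, fallbacks []string) {
	var chat, embed []string
	for _, m := range models {
		if isEmbeddingOnly(m) {
			embed = append(embed, m)
		} else {
			chat = append(chat, m)
		}
	}
	ordered := append(chat, embed...)
	if len(ordered) == 0 {
		return "", nil
	}
	return ordered[0], ordered[1:]
}

func main() {
	primary, fallbacks := rank([]string{"nomic-embed-text", "qwen3.5:9b", "gpt-5.4"})
	fmt.Println("primary:", primary)     // primary: qwen3.5:9b
	fmt.Println("fallbacks:", fallbacks) // fallbacks: [gpt-5.4 nomic-embed-text]
}
```

If every configured entry is embedding-only, the first one still becomes the primary in this sketch, which matches the "when another option exists" wording.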
func ReconcileRecorded ¶ added in v0.10.0
ReconcileRecorded re-applies the recorded model state to the cluster. Called from `obol stack up` after auto-configuration so operator intent (entries + order) wins over auto-detected defaults. No record on disk means nothing to do.
func RecordState ¶ added in v0.10.0
RecordState snapshots the live model_list (minus paid/* entries, which are controller/purchase-derived and must not survive recreation) plus the provider API keys it references into the host-side record. Best-effort: failures warn but never fail the command that just succeeded.
func RemoveModel ¶ added in v0.7.0
RemoveModel removes a model entry from the LiteLLM ConfigMap (persistence) and hot-deletes it from the running router via the API (immediate effect). No pod restart is required.
func ResolveAPIKey ¶ added in v0.7.0
ResolveAPIKey checks the primary env var and each AltEnvVar in order for the given provider. Returns the key value and the env var it was found in. Both are empty if no key is available.
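The lookup order can be sketched as follows. `resolveKey` is a hypothetical helper that takes the env var names and a lookup function explicitly, so the example runs against a fake environment; in practice `os.Getenv` could be passed as the lookup. How the real function maps a provider to its env var names is not shown here.

```go
package main

import "fmt"

// resolveKey checks the primary env var and then each alternate, in order,
// returning the first non-empty value and the name it came from. Both
// results are empty when nothing is set.
func resolveKey(primary string, alts []string, getenv func(string) string) (key, envVarUsed string) {
	candidates := append([]string{primary}, alts...)
	for _, name := range candidates {
		if name == "" {
			continue // e.g. a provider with no primary env var
		}
		if v := getenv(name); v != "" {
			return v, name
		}
	}
	return "", ""
}

func main() {
	// Fake environment with a placeholder token value.
	env := map[string]string{"CLAUDE_CODE_OAUTH_TOKEN": "token-123"}
	lookup := func(k string) string { return env[k] }

	key, from := resolveKey("ANTHROPIC_API_KEY", []string{"CLAUDE_CODE_OAUTH_TOKEN"}, lookup)
	fmt.Println(key, from) // token-123 CLAUDE_CODE_OAUTH_TOKEN
}
```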
func RestartLiteLLM ¶ added in v0.7.0
RestartLiteLLM restarts the LiteLLM deployment and waits for rollout.
func ValidateCustomEndpoint ¶ added in v0.6.0
ValidateCustomEndpoint validates that a custom OpenAI-compatible endpoint works. It runs a 2-step validation: reachability check, then inference probe. The inference probe is the definitive test — some servers (e.g., mlx-lm) don't list the loaded model in /models but accept it for inference.
func ValidateCustomEndpointWithOptions ¶ added in v0.10.0
func ValidateCustomEndpointWithOptions(endpoint, modelName, apiKey string, options CustomEndpointOptions) error
ValidateCustomEndpointWithOptions validates that a custom OpenAI-compatible endpoint works. It runs a 2-step validation: reachability check, then inference probe. The inference probe is the definitive test — some servers (e.g., mlx-lm) don't list the loaded model in /models but accept it for inference.
func WarnAndStripV1Suffix ¶ added in v0.7.0
WarnAndStripV1Suffix checks if an endpoint URL has a trailing /v1 suffix, warns the user, and returns the stripped URL. For OpenAI-compatible providers, LiteLLM auto-appends /v1, causing double /v1/v1 if the user includes it.
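A minimal sketch of the suffix check. `stripV1Suffix` is a hypothetical helper that returns a flag instead of printing the warning, and tolerating trailing slashes before the check is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// stripV1Suffix removes a trailing "/v1" (optionally followed by slashes)
// from an endpoint URL and reports whether anything was removed. Endpoints
// without the suffix are returned unchanged.
func stripV1Suffix(endpoint string) (stripped string, changed bool) {
	trimmed := strings.TrimRight(endpoint, "/")
	if strings.HasSuffix(trimmed, "/v1") {
		return strings.TrimSuffix(trimmed, "/v1"), true
	}
	return endpoint, false
}

func main() {
	endpoint, changed := stripV1Suffix("http://localhost:8000/v1")
	if changed {
		fmt.Println("warning: trailing /v1 removed; LiteLLM appends it automatically")
	}
	fmt.Println(endpoint) // http://localhost:8000
}
```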
Types ¶
type CacheControlInjection ¶ added in v0.9.0
type CacheControlInjection struct {
Location string `yaml:"location"`
Role string `yaml:"role,omitempty"`
Index *int `yaml:"index,omitempty"`
}
CacheControlInjection is one entry in LiteLLM's cache_control_injection_points list. Either Role or Index narrows which message in the request gets the cache_control marker.
type CustomEndpointOptions ¶ added in v0.10.0
type CustomEndpointOptions struct {
DisableThinking bool
}
CustomEndpointOptions controls optional per-request behavior for custom OpenAI-compatible endpoints.
type DiscoveredProvider ¶ added in v0.9.0
type DiscoveredProvider struct {
Label string
ServerType string
HostEndpoint string
ClusterEndpoint string
Entries []ModelEntry
}
DiscoveredProvider is one local OpenAI-compatible inference server that auto-config can register with LiteLLM.
func DiscoverLocalProviders ¶ added in v0.9.0
func DiscoverLocalProviders(ctx context.Context) ([]DiscoveredProvider, error)
DiscoverLocalProviders probes well-known local inference ports and returns one DiscoveredProvider per host endpoint that exposes at least one model. Returns an empty slice (not an error) when discovery is disabled or nothing is reachable. Honors OBOL_DISABLE_LOCAL_MODEL_DISCOVERY.
type LiteLLMConfig ¶ added in v0.6.0
type LiteLLMConfig struct {
ModelList []ModelEntry `yaml:"model_list"`
GeneralSettings map[string]any `yaml:"general_settings,omitempty"`
LiteLLMSettings map[string]any `yaml:"litellm_settings,omitempty"`
}
LiteLLMConfig represents the LiteLLM proxy config.yaml structure.
type LiteLLMParams ¶ added in v0.6.0
type LiteLLMParams struct {
Model string `yaml:"model"`
APIBase string `yaml:"api_base,omitempty"`
APIKey string `yaml:"api_key,omitempty"`
// ExtraBody is merged by LiteLLM into every upstream request for this
// model. It is intentionally opt-in because many OpenAI-compatible servers
// reject unknown provider-specific fields.
ExtraBody map[string]any `yaml:"extra_body,omitempty"`
// CacheControlInjectionPoints is a LiteLLM directive that tells the proxy
// to attach Anthropic-style `cache_control: {type: ephemeral}` markers to
// specific messages on every request to this model. We pin the system
// message for Anthropic entries so prompt caching is on by default.
CacheControlInjectionPoints []CacheControlInjection `yaml:"cache_control_injection_points,omitempty"`
}
LiteLLMParams holds the routing parameters for a model.
type ModelEntry ¶ added in v0.6.0
type ModelEntry struct {
ModelName string `yaml:"model_name"`
LiteLLMParams LiteLLMParams `yaml:"litellm_params"`
}
ModelEntry is a single entry in model_list.
func MergeRecordedModelList ¶ added in v0.10.0
func MergeRecordedModelList(recorded, current []ModelEntry) []ModelEntry
MergeRecordedModelList builds the reconciled model_list: recorded entries first (their order is operator intent — head = default model), then any current entries not named in the record (chart catch-alls, auto-detected models) in their existing relative order.
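The merge rule can be sketched with a simplified entry type. This is an illustration, not the package's code: `modelEntry` is a trimmed-down, hypothetical stand-in for ModelEntry, and matching entries by model name alone is an assumption drawn from the phrase "not named in the record".

```go
package main

import "fmt"

// modelEntry is a simplified stand-in for ModelEntry.
type modelEntry struct {
	ModelName string
	Model     string
}

// mergeRecorded returns the recorded entries first, in recorded order,
// followed by current entries whose names do not appear in the record,
// in their existing relative order.
func mergeRecorded(recorded, current []modelEntry) []modelEntry {
	named := make(map[string]bool, len(recorded))
	out := make([]modelEntry, 0, len(recorded)+len(current))
	for _, e := range recorded {
		named[e.ModelName] = true
		out = append(out, e)
	}
	for _, e := range current {
		if !named[e.ModelName] {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	recorded := []modelEntry{{ModelName: "claude-opus-4-7", Model: "anthropic/claude-opus-4-7"}}
	current := []modelEntry{
		{ModelName: "paid/*", Model: "openai/*"},
		{ModelName: "claude-opus-4-7", Model: "anthropic/stale"},
		{ModelName: "qwen3.5:9b", Model: "ollama/qwen3.5:9b"},
	}
	for _, e := range mergeRecorded(recorded, current) {
		fmt.Println(e.ModelName, "->", e.Model)
	}
}
```

In this sketch the recorded version of an entry wins when both lists name the same model, so the record also decides its routing parameters, not just its position.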
type OllamaModel ¶ added in v0.5.0
type OllamaModel struct {
Name string `json:"name"`
Size int64 `json:"size"`
ModifiedAt string `json:"modified_at"`
}
OllamaModel describes a model pulled in the local Ollama instance.
func ListOllamaModels ¶ added in v0.5.0
func ListOllamaModels() ([]OllamaModel, error)
ListOllamaModels queries the local Ollama server for pulled models. Returns nil and an error if Ollama is not reachable.
type ProviderInfo ¶ added in v0.3.1
type ProviderInfo struct {
ID string // provider id (e.g. "anthropic", "openai", "ollama")
Name string // display name
EnvVar string // primary env var for API key (empty for Ollama)
AltEnvVars []string // fallback env vars checked in order (e.g. CLAUDE_CODE_OAUTH_TOKEN)
}
ProviderInfo describes an LLM provider.
func GetAvailableProviders ¶ added in v0.3.1
func GetAvailableProviders(_ *config.Config) ([]ProviderInfo, error)
GetAvailableProviders returns the known provider list (static, no pod query needed).
type ProviderStatus ¶
type ProviderStatus struct {
Enabled bool
HasAPIKey bool
EnvVar string // environment variable name (e.g. ANTHROPIC_API_KEY)
Models []string
}
ProviderStatus captures effective global LiteLLM provider state.
type RecordedModelState ¶ added in v0.10.0
type RecordedModelState struct {
Version int `yaml:"version"`
ModelList []ModelEntry `yaml:"model_list"`
Secrets map[string]string `yaml:"secrets,omitempty"`
}
RecordedModelState is the host-side record of operator-applied LiteLLM configuration. The Secrets field holds provider API keys in plaintext, matching the existing convention for values-remote-signer.yaml (0600 in ConfigDir).