Documentation
Overview ¶
Package router handles intelligent model selection.
Index ¶
- Constants
- func DelegateHintKey(hint string) string
- func OverlayDelegateHints(base map[string]string, inherited map[string]string) (explicitModel string, merged map[string]string)
- func ValidateTopicFilter(filter string) error
- type Complexity
- type Config
- type Decision
- type DeploymentStats
- type LoopProfile
- type Model
- type Priority
- type Request
- type RequestOptions
- type ResourceHealth
- type Router
- func (r *Router) ContextWindowForModel(name string) int
- func (r *Router) DefaultModel() string
- func (r *Router) ExperienceSnapshot() map[string]DeploymentStats
- func (r *Router) ExperienceVersion() int64
- func (r *Router) Explain(requestID string) *Decision
- func (r *Router) ExplainRequest(req Request) *Decision
- func (r *Router) GetAuditLog(limit int) []Decision
- func (r *Router) GetModels() []Model
- func (r *Router) GetStats() Stats
- func (r *Router) MaxQuality() int
- func (r *Router) RecordFailure(requestID string, latencyMs int64, tokensUsed int, resourceTimeout bool)
- func (r *Router) RecordOutcome(requestID string, latencyMs int64, tokensUsed int, success bool)
- func (r *Router) ReplaceExperience(experience map[string]DeploymentStats)
- func (r *Router) Route(ctx context.Context, req Request) (string, *Decision)
- func (r *Router) UpdateConfig(cfg Config)
- type Stats
- type VirtualModel
- type VirtualModelRuntime
- type VirtualModelSelection
Constants ¶
const (
// HintChannel identifies the request source: "ollama", "homeassistant", "voice", "api"
HintChannel = "channel"
// HintQualityFloor is the minimum quality rating (1-10) the caller requires.
HintQualityFloor = "quality_floor"
// HintModelPreference suggests a specific model (soft preference, not override).
HintModelPreference = "model_preference"
// HintMission describes the task context: "conversation", "device_control", "background", "automation", "metacognitive"
HintMission = "mission"
// HintLocalOnly restricts routing to free/local models when set to "true".
HintLocalOnly = "local_only"
// HintDelegationGating controls whether delegation-first tool gating is
// active. Set to "disabled" to give the model direct access to all tools
// on every iteration (used by thane:ops).
HintDelegationGating = "delegation_gating"
// HintPreferSpeed indicates the caller benefits from faster response
// times over higher quality. When "true", any model with Speed >= 7
// receives a scoring bonus regardless of cost tier or provider. Can be
// decisive among similarly priced options. Use for background/delegation
// tasks where latency and resource efficiency matter more than maximum
// output quality.
HintPreferSpeed = "prefer_speed"
)
Hint keys for routing decisions. Callers set these to influence model selection.
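As an illustration of how a caller might populate these keys, the sketch below builds a hint map for a background task and applies the Speed >= 7 rule described for HintPreferSpeed. The constants are copied from the block above; `backgroundHints`, `speedBonus`, and the bonus value of 10 are hypothetical names and numbers chosen for the example, since the router's real scoring weights are not documented here.

```go
package main

import "fmt"

// Hint keys copied from the package constants.
const (
	HintChannel      = "channel"
	HintQualityFloor = "quality_floor"
	HintMission      = "mission"
	HintLocalOnly    = "local_only"
	HintPreferSpeed  = "prefer_speed"
)

// assumedSpeedBonus is a made-up value; the real bonus is not documented.
const assumedSpeedBonus = 10

// backgroundHints builds a hint map for a latency-sensitive background task.
func backgroundHints() map[string]string {
	return map[string]string{
		HintChannel:      "api",
		HintMission:      "background",
		HintQualityFloor: "5",
		HintLocalOnly:    "true",
		HintPreferSpeed:  "true",
	}
}

// speedBonus returns the assumed bonus when the caller asked for speed and
// the model's Speed rating is at least 7, and zero otherwise.
func speedBonus(hints map[string]string, speed int) int {
	if hints[HintPreferSpeed] == "true" && speed >= 7 {
		return assumedSpeedBonus
	}
	return 0
}

func main() {
	hints := backgroundHints()
	fmt.Println(speedBonus(hints, 8), speedBonus(hints, 6))
}
```

Hint values are plain strings, so booleans and numbers are passed as "true" or "5" rather than typed values.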
const (
// HintVirtualModel records the canonical virtual model / execution policy
// selected by a higher-level entrypoint such as the Ollama-compatible API.
HintVirtualModel = "virtual_model"
// HintDelegateModel is an explicit delegate model override carried through
// request hints. When set, delegate loops bypass router selection.
HintDelegateModel = "delegate_model"
)
Variables ¶
This section is empty.
Functions ¶
func DelegateHintKey ¶
func DelegateHintKey(hint string) string
DelegateHintKey returns the request-hint key used to carry a delegate-time override for the supplied routing hint.
func OverlayDelegateHints ¶
func OverlayDelegateHints(base map[string]string, inherited map[string]string) (explicitModel string, merged map[string]string)
OverlayDelegateHints merges delegate-policy overrides carried in a parent request-hint map onto a delegate profile's base router hints.
The returned explicitModel, when non-empty, instructs the delegate to bypass router selection and use that exact model.
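The merge described above might look roughly like the following sketch. It is not the package's implementation: the "delegate_" key prefix is an assumption made for illustration (the actual format returned by DelegateHintKey is not shown on this page), and `delegateHintKey` and `overlayDelegateHints` are local stand-ins for the exported functions.

```go
package main

import (
	"fmt"
	"strings"
)

// hintDelegateModel matches the documented HintDelegateModel value.
const hintDelegateModel = "delegate_model"

// assumedPrefix is hypothetical; the real key format is not documented here.
const assumedPrefix = "delegate_"

// delegateHintKey is a stand-in for DelegateHintKey under the assumed prefix.
func delegateHintKey(hint string) string {
	return assumedPrefix + hint
}

// overlayDelegateHints copies base, then applies delegate-time overrides
// found in inherited. An explicit delegate model is returned separately.
func overlayDelegateHints(base, inherited map[string]string) (string, map[string]string) {
	merged := make(map[string]string, len(base))
	for k, v := range base {
		merged[k] = v
	}
	explicitModel := inherited[hintDelegateModel]
	for k, v := range inherited {
		if k == hintDelegateModel || v == "" || !strings.HasPrefix(k, assumedPrefix) {
			continue // not a delegate override
		}
		merged[strings.TrimPrefix(k, assumedPrefix)] = v
	}
	return explicitModel, merged
}

func main() {
	base := map[string]string{"mission": "background", "quality_floor": "4"}
	inherited := map[string]string{
		delegateHintKey("quality_floor"): "8",
		"channel":                        "voice",
	}
	model, merged := overlayDelegateHints(base, inherited)
	fmt.Printf("%q %v\n", model, merged)
}
```

In this sketch, ordinary parent hints such as "channel" are not inherited; only keys carrying the assumed delegate prefix reach the merged map.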
func ValidateTopicFilter ¶
func ValidateTopicFilter(filter string) error
ValidateTopicFilter checks that an MQTT topic filter is syntactically valid per the MQTT v5 specification:
- Must not be empty
- '#' wildcard must be the last segment (and alone in its level)
- '+' wildcard must occupy an entire level
- No null characters (U+0000)
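The four rules listed above can be sketched as a small standalone checker. This is an illustrative reimplementation under those rules only, not the package's source; `validateTopicFilter` and its error messages are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateTopicFilter applies the four documented rules to an MQTT filter.
func validateTopicFilter(filter string) error {
	if filter == "" {
		return errors.New("topic filter must not be empty")
	}
	if strings.ContainsRune(filter, 0) {
		return errors.New("topic filter must not contain U+0000")
	}
	levels := strings.Split(filter, "/")
	for i, level := range levels {
		if strings.Contains(level, "#") {
			if level != "#" {
				return fmt.Errorf("level %q: '#' must occupy the whole level", level)
			}
			if i != len(levels)-1 {
				return errors.New("'#' must be the last level")
			}
		}
		if strings.Contains(level, "+") && level != "+" {
			return fmt.Errorf("level %q: '+' must occupy the whole level", level)
		}
	}
	return nil
}

func main() {
	for _, f := range []string{"home/+/temp", "home/#", "home/#/temp", "home/te+mp"} {
		fmt.Printf("%-12s valid=%v\n", f, validateTopicFilter(f) == nil)
	}
}
```

Splitting on "/" first keeps each wildcard check local to a single level, which is how the rules are phrased.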
Types ¶
type Complexity ¶
type Complexity int
Complexity categorizes query difficulty.
const (
ComplexitySimple Complexity = iota // Direct command, single action
ComplexityModerate // Multi-step or needs context
ComplexityComplex // Reasoning, analysis, explanation
)
func (Complexity) String ¶
func (c Complexity) String() string
String returns the human-readable name of a complexity level.
type Config ¶
type Config struct {
Models []Model // Available models
DefaultModel string // Fallback if no rules match
LocalFirst bool // Prefer local models when possible
MaxAuditLog int // How many decisions to keep in memory
}
Config holds router configuration.
type Decision ¶
type Decision struct {
RequestID string `json:"request_id"`
Timestamp time.Time `json:"timestamp"`
// Input analysis
QueryLength int `json:"query_length"`
ContextSize int `json:"context_size"`
NeedsTools bool `json:"needs_tools"`
NeedsStreaming bool `json:"needs_streaming,omitempty"`
NeedsImages bool `json:"needs_images,omitempty"`
Priority string `json:"priority"`
DetectedIntent string `json:"detected_intent,omitempty"`
Complexity Complexity `json:"complexity"`
// Decision process
RulesEvaluated []string `json:"rules_evaluated"`
RulesMatched []string `json:"rules_matched"`
RejectedModels map[string][]string `json:"rejected_models,omitempty"`
Scores map[string]int `json:"scores,omitempty"`
NoEligible bool `json:"no_eligible,omitempty"`
// Outcome
ModelSelected string `json:"model_selected"`
UpstreamModelSelected string `json:"upstream_model_selected,omitempty"`
ProviderSelected string `json:"provider_selected,omitempty"`
ResourceSelected string `json:"resource_selected,omitempty"`
Reasoning string `json:"reasoning"`
// Post-execution (filled in later)
LatencyMs int64 `json:"latency_ms,omitempty"`
TokensUsed int `json:"tokens_used,omitempty"`
Success *bool `json:"success,omitempty"`
}
Decision records why a model was selected.
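Because Success is a *bool tagged omitempty, a decision that has not yet been completed serializes differently from one that failed. The sketch below shows this with a trimmed-down copy of the struct; `decisionOutcome` and `encode` are hypothetical names, and only a few of the real fields are mirrored.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decisionOutcome mirrors a small subset of Decision's fields and tags.
type decisionOutcome struct {
	RequestID     string `json:"request_id"`
	ModelSelected string `json:"model_selected"`
	LatencyMs     int64  `json:"latency_ms,omitempty"`
	Success       *bool  `json:"success,omitempty"`
}

// encode marshals the value and returns an empty string on error.
func encode(d decisionOutcome) string {
	b, err := json.Marshal(d)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	d := decisionOutcome{RequestID: "req-1", ModelSelected: "qwen3:4b"}
	// Outcome not recorded yet: "success" is omitted entirely.
	fmt.Println(encode(d)) // {"request_id":"req-1","model_selected":"qwen3:4b"}

	failed := false
	d.Success = &failed
	d.LatencyMs = 1200
	// A recorded failure is emitted as false rather than dropped.
	fmt.Println(encode(d)) // {"request_id":"req-1","model_selected":"qwen3:4b","latency_ms":1200,"success":false}
}
```

A plain bool with omitempty would drop false, making a failure indistinguishable from a pending decision.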
type DeploymentStats ¶
type DeploymentStats struct {
Provider string `json:"provider"`
Resource string `json:"resource,omitempty"`
UpstreamModel string `json:"upstream_model,omitempty"`
Requests int64 `json:"requests"`
Successes int64 `json:"successes"`
Failures int64 `json:"failures"`
AvgLatencyMs int64 `json:"avg_latency_ms,omitempty"`
AvgTokensUsed int64 `json:"avg_tokens_used,omitempty"`
}
DeploymentStats tracks routing and outcome state for one concrete deployment/route target.
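One way a consumer might use these counters is to derive a success rate per deployment. The helper below is a hypothetical sketch over a local copy of the counter fields; it divides by completed outcomes (Successes + Failures) because the page does not say whether Requests also counts in-flight requests.

```go
package main

import "fmt"

// deploymentStats mirrors the counter fields of DeploymentStats.
type deploymentStats struct {
	Requests  int64
	Successes int64
	Failures  int64
}

// successRate returns the fraction of completed requests that succeeded.
// ok is false when no outcome has been recorded yet.
func successRate(s deploymentStats) (rate float64, ok bool) {
	completed := s.Successes + s.Failures
	if completed == 0 {
		return 0, false
	}
	return float64(s.Successes) / float64(completed), true
}

func main() {
	rate, ok := successRate(deploymentStats{Requests: 4, Successes: 3, Failures: 1})
	fmt.Println(rate, ok) // 0.75 true
}
```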
type LoopProfile ¶
type LoopProfile struct {
// Model sets an explicit model, bypassing the router. When empty,
// the router selects based on the other hint fields.
Model string `yaml:"model,omitempty" json:"model,omitempty"`
// QualityFloor is the minimum model quality rating (1–10). Maps
// to [HintQualityFloor].
QualityFloor string `yaml:"quality_floor,omitempty" json:"quality_floor,omitempty"`
// Mission describes the task context for routing. Maps to
// [HintMission]. Common values: "conversation", "automation",
// "device_control", "background".
Mission string `yaml:"mission,omitempty" json:"mission,omitempty"`
// LocalOnly restricts routing to free/local models when "true".
// Maps to [HintLocalOnly].
LocalOnly string `yaml:"local_only,omitempty" json:"local_only,omitempty"`
// DelegationGating controls delegation-first tool gating. Set to
// "disabled" for direct tool access. Maps to [HintDelegationGating].
DelegationGating string `yaml:"delegation_gating,omitempty" json:"delegation_gating,omitempty"`
// PreferSpeed favours faster models when "true". Maps to
// [HintPreferSpeed].
PreferSpeed string `yaml:"prefer_speed,omitempty" json:"prefer_speed,omitempty"`
// ExcludeTools lists tool names to filter out of the agent run.
ExcludeTools []string `yaml:"exclude_tools,omitempty" json:"exclude_tools,omitempty"`
// ExtraHints carries arbitrary key-value routing hints that are
// merged last, allowing callers to override or extend the typed
// fields above.
ExtraHints map[string]string `yaml:"extra_hints,omitempty" json:"extra_hints,omitempty"`
// Instructions is extra text injected into the user message to
// guide the agent's behaviour for this wake context.
Instructions string `yaml:"instructions,omitempty" json:"instructions,omitempty"`
}
LoopProfile captures the common routing and configuration parameters shared by all agent wake sites. It is serializable via both YAML (for config file embedding) and JSON (for API and tool payloads).
Each field maps to a well-known routing hint or agent.Request property. Zero-value fields are omitted during serialization and ignored by LoopProfile.Hints.
func (*LoopProfile) Hints ¶
func (s *LoopProfile) Hints() map[string]string
Hints builds a routing hints map from the profile's typed fields. Only non-empty fields are included. ExtraHints are merged last and can override typed fields.
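The behaviour described for Hints can be sketched as follows. This is a local approximation, not the package source: `loopProfile` copies only the fields documented as mapping to a hint key, the key strings come from the constants section, and how the real method treats the Model field is not covered.

```go
package main

import "fmt"

// loopProfile mirrors the LoopProfile fields that map to documented hints.
type loopProfile struct {
	QualityFloor     string
	Mission          string
	LocalOnly        string
	DelegationGating string
	PreferSpeed      string
	ExtraHints       map[string]string
}

// hints includes only non-empty typed fields, then merges ExtraHints last
// so that they can override or extend the typed values.
func (p *loopProfile) hints() map[string]string {
	out := make(map[string]string)
	set := func(key, value string) {
		if value != "" {
			out[key] = value
		}
	}
	set("quality_floor", p.QualityFloor)
	set("mission", p.Mission)
	set("local_only", p.LocalOnly)
	set("delegation_gating", p.DelegationGating)
	set("prefer_speed", p.PreferSpeed)
	for k, v := range p.ExtraHints {
		out[k] = v
	}
	return out
}

func main() {
	p := loopProfile{
		Mission:     "background",
		PreferSpeed: "true",
		ExtraHints:  map[string]string{"mission": "automation", "channel": "api"},
	}
	fmt.Println(p.hints())
}
```

Here the "mission" entry in ExtraHints replaces the typed Mission value, and "channel" is added as a new key.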
func (*LoopProfile) RequestOptions ¶
func (s *LoopProfile) RequestOptions() RequestOptions
RequestOptions returns the request-ready fields implied by the profile. Slices are copied so callers can mutate the result without affecting the underlying LoopProfile.
func (*LoopProfile) Validate ¶
func (s *LoopProfile) Validate() error
Validate checks that the profile's typed fields contain semantically valid values. It does not require any field to be set — an empty LoopProfile is valid. Returns nil on success.
type Model ¶
type Model struct {
Name string // Route/deployment identifier (e.g., "qwen3:4b" or "spark/qwen3:32b")
UpstreamModel string // Provider-native model name (e.g., "qwen3:32b")
Provider string // "ollama" or "anthropic" etc
ResourceID string // Provider resource identity (e.g., server name)
Server string // Configured server name when applicable
SupportsTools bool // Deployment is configured for tool calling
ProviderSupportsTools bool // Underlying provider supports tool calling
SupportsStreaming bool // Deployment/provider can stream
SupportsImages bool // Deployment/provider accepts image input
ContextWindow int // Max tokens
Speed int // Relative speed (1-10, 10=fastest)
Quality int // Relative quality (1-10, 10=best)
CostTier int // 0=free/local, 1=cheap, 2=moderate, 3=expensive
MinComplexity Complexity // Don't use for simpler than this
}
Model represents an available model with its capabilities.
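To show how these capability fields could be matched against a request, here is a hedged sketch of an eligibility check that collects rejection reasons per model, in the same shape as Decision.RejectedModels. The rules and reason strings are assumptions for illustration; the router's actual filtering logic is not described on this page, and the context window value is made up.

```go
package main

import "fmt"

// model mirrors a subset of Model's capability fields.
type model struct {
	Name              string
	SupportsTools     bool
	SupportsStreaming bool
	SupportsImages    bool
	ContextWindow     int
	CostTier          int // 0 = free/local
}

// request mirrors a subset of Request plus a parsed local_only hint.
type request struct {
	ContextSize    int
	NeedsTools     bool
	NeedsStreaming bool
	NeedsImages    bool
	LocalOnly      bool
}

// rejectionReasons returns why m cannot serve r; empty means eligible.
func rejectionReasons(m model, r request) []string {
	var reasons []string
	if r.NeedsTools && !m.SupportsTools {
		reasons = append(reasons, "no tool support")
	}
	if r.NeedsStreaming && !m.SupportsStreaming {
		reasons = append(reasons, "no streaming support")
	}
	if r.NeedsImages && !m.SupportsImages {
		reasons = append(reasons, "no image support")
	}
	if r.ContextSize > m.ContextWindow {
		reasons = append(reasons, "context window too small")
	}
	if r.LocalOnly && m.CostTier != 0 {
		reasons = append(reasons, "not a free/local model")
	}
	return reasons
}

func main() {
	small := model{Name: "qwen3:4b", SupportsStreaming: true, ContextWindow: 8192}
	need := request{ContextSize: 16000, NeedsTools: true}
	rejected := map[string][]string{small.Name: rejectionReasons(small, need)}
	fmt.Println(rejected)
}
```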
type Request ¶
type Request struct {
Query string // The user's input
ContextSize int // Estimated tokens of context (talents, history)
NeedsTools bool // Whether tool calling is required
NeedsStreaming bool // Whether a streaming response is required
NeedsImages bool // Whether image/multimodal input is required
ToolCount int // Number of tools available
Priority Priority // Latency requirements
Hints map[string]string // Caller-supplied routing hints (see HintXxx constants)
}
Request contains the information needed for routing decisions.
type RequestOptions ¶
RequestOptions contains the agent request fields derived from a LoopProfile. Callers can merge additional channel- or trigger-specific hints on top of these shared routing defaults.
type ResourceHealth ¶
type ResourceHealth struct {
CooldownUntil time.Time `json:"cooldown_until,omitempty"`
CooldownReason string `json:"cooldown_reason,omitempty"`
}
ResourceHealth exposes request-plane routing health for one resource.
type Router ¶
type Router struct {
// contains filtered or unexported fields
}
Router selects models based on request characteristics.
func (*Router) ContextWindowForModel ¶
func (r *Router) ContextWindowForModel(name string) int
ContextWindowForModel returns the context window size for the named model. If the model is not found in the router's configuration, it returns 0.
func (*Router) DefaultModel ¶
func (r *Router) DefaultModel() string
DefaultModel returns the router's current fallback/default model.
func (*Router) ExperienceSnapshot ¶
func (r *Router) ExperienceSnapshot() map[string]DeploymentStats
ExperienceSnapshot returns a copy of the deployment-scoped learned experience that is safe to persist and later rehydrate. It excludes transient resource cooldown state.
func (*Router) ExperienceVersion ¶
func (r *Router) ExperienceVersion() int64
ExperienceVersion returns a monotonic counter that changes whenever deployment-scoped learned experience is updated.
func (*Router) ExplainRequest ¶
func (r *Router) ExplainRequest(req Request) *Decision
ExplainRequest computes a routing decision for the supplied request using the router's current config, learned experience, and transient resource health, but does not mutate audit history or stats.
func (*Router) GetAuditLog ¶
func (r *Router) GetAuditLog(limit int) []Decision
GetAuditLog returns recent routing decisions.
func (*Router) GetModels ¶
func (r *Router) GetModels() []Model
GetModels returns a copy of the configured model list. The returned slice is safe to mutate without affecting the router.
func (*Router) MaxQuality ¶
func (r *Router) MaxQuality() int
MaxQuality returns the highest quality rating among configured models. If no models are configured it returns 10 as a safe default that selects the best available model at runtime.
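The documented behaviour reduces to a maximum over the configured models with a fallback of 10 for an empty list. A minimal sketch, using a local `model` type and a hypothetical free function `maxQuality` in place of the method (the quality ratings are made-up values):

```go
package main

import "fmt"

// model mirrors the two Model fields this example needs.
type model struct {
	Name    string
	Quality int // relative quality, 1-10
}

// maxQuality returns the highest Quality among models, or 10 when the
// list is empty, matching the documented default.
func maxQuality(models []model) int {
	if len(models) == 0 {
		return 10
	}
	best := 0
	for _, m := range models {
		if m.Quality > best {
			best = m.Quality
		}
	}
	return best
}

func main() {
	models := []model{{"qwen3:4b", 4}, {"spark/qwen3:32b", 7}}
	fmt.Println(maxQuality(models), maxQuality(nil)) // 7 10
}
```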
func (*Router) RecordFailure ¶
func (r *Router) RecordFailure(requestID string, latencyMs int64, tokensUsed int, resourceTimeout bool)
RecordFailure updates a failed routing outcome and optionally applies a temporary resource cooldown so automatic routing can avoid a runner that is timing out on real chat traffic.
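The cooldown idea can be illustrated with a small tracker keyed by resource. Everything here is an assumption for the sake of the example: the two-minute duration, the `healthTracker` type, the "timeout" reason string, and the fixed `demoStart` timestamp are hypothetical, and only the ResourceHealth field names come from this page.

```go
package main

import (
	"fmt"
	"time"
)

// resourceHealth mirrors the documented ResourceHealth fields.
type resourceHealth struct {
	CooldownUntil  time.Time
	CooldownReason string
}

// assumedCooldown is a made-up duration; the real value is not documented.
const assumedCooldown = 2 * time.Minute

// demoStart is an arbitrary fixed time used to keep the example deterministic.
var demoStart = time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

// healthTracker holds transient cooldown state per resource.
type healthTracker struct {
	byResource map[string]resourceHealth
}

// recordFailure starts a cooldown only when the failure was a resource timeout.
func (t *healthTracker) recordFailure(resource string, resourceTimeout bool, now time.Time) {
	if !resourceTimeout {
		return
	}
	if t.byResource == nil {
		t.byResource = make(map[string]resourceHealth)
	}
	t.byResource[resource] = resourceHealth{
		CooldownUntil:  now.Add(assumedCooldown),
		CooldownReason: "timeout",
	}
}

// available reports whether automatic routing may use the resource at now.
func (t *healthTracker) available(resource string, now time.Time) bool {
	h, ok := t.byResource[resource]
	return !ok || !now.Before(h.CooldownUntil)
}

func main() {
	tracker := &healthTracker{}
	tracker.recordFailure("spark", true, demoStart)
	fmt.Println(tracker.available("spark", demoStart))
	fmt.Println(tracker.available("spark", demoStart.Add(assumedCooldown)))
}
```

Passing the current time in as a parameter keeps the cooldown logic testable without sleeping.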
func (*Router) RecordOutcome ¶
func (r *Router) RecordOutcome(requestID string, latencyMs int64, tokensUsed int, success bool)
RecordOutcome updates a decision with execution results.
func (*Router) ReplaceExperience ¶
func (r *Router) ReplaceExperience(experience map[string]DeploymentStats)
ReplaceExperience replaces the router's persisted deployment-scoped learned experience and rebuilds aggregate stats from it. Transient resource cooldown state is intentionally left untouched.
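ExperienceSnapshot and ReplaceExperience together suggest a persist-and-rehydrate cycle. The sketch below shows only the serialization half with a local copy of a few DeploymentStats fields; `saveExperience` and `loadExperience` are hypothetical helpers, and where the bytes are stored (file, database) is left to the caller.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deploymentStats mirrors a subset of DeploymentStats and its JSON tags.
type deploymentStats struct {
	Provider  string `json:"provider"`
	Requests  int64  `json:"requests"`
	Successes int64  `json:"successes"`
	Failures  int64  `json:"failures"`
}

// saveExperience encodes a snapshot, as returned by ExperienceSnapshot.
func saveExperience(snapshot map[string]deploymentStats) ([]byte, error) {
	return json.Marshal(snapshot)
}

// loadExperience decodes a snapshot suitable for ReplaceExperience.
func loadExperience(data []byte) (map[string]deploymentStats, error) {
	var snapshot map[string]deploymentStats
	if err := json.Unmarshal(data, &snapshot); err != nil {
		return nil, err
	}
	return snapshot, nil
}

func main() {
	snapshot := map[string]deploymentStats{
		"spark/qwen3:32b": {Provider: "ollama", Requests: 5, Successes: 4, Failures: 1},
	}
	data, err := saveExperience(snapshot)
	if err != nil {
		fmt.Println("save failed:", err)
		return
	}
	restored, err := loadExperience(data)
	fmt.Println(restored, err)
}
```

A caller could compare ExperienceVersion before and after a period of traffic to decide whether a new snapshot needs to be written at all.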
func (*Router) UpdateConfig ¶
func (r *Router) UpdateConfig(cfg Config)
UpdateConfig swaps the router's live model configuration while preserving accumulated audit history and stats.
type Stats ¶
type Stats struct {
TotalRequests int64 `json:"total_requests"`
ModelCounts map[string]int64 `json:"model_counts"`
AvgLatencyMs map[string]int64 `json:"avg_latency_ms"`
ComplexityCounts map[string]int64 `json:"complexity_counts"`
SuccessCount int64 `json:"success_count"`
FailureCount int64 `json:"failure_count"`
ProviderCounts map[string]int64 `json:"provider_counts,omitempty"`
ResourceCounts map[string]int64 `json:"resource_counts,omitempty"`
ResourceHealth map[string]ResourceHealth `json:"resource_health,omitempty"`
DeploymentStats map[string]DeploymentStats `json:"deployment_stats,omitempty"`
}
Stats tracks routing statistics.
type VirtualModel ¶
type VirtualModel struct {
Name string
Description string
Exposed bool
Aliases []string
TopLevel LoopProfile
Delegate LoopProfile
}
VirtualModel describes an end-to-end execution policy exposed through virtual model names such as "thane:premium".
TopLevel controls the initial orchestrator loop. Delegate controls child loops launched via thane_now or thane_assign. Future revisions may also derive these policies dynamically from the live registry without changing callers.
func ExposedVirtualModels ¶
func ExposedVirtualModels(runtime VirtualModelRuntime) []VirtualModel
ExposedVirtualModels returns the runtime-expanded virtual models intended for user-facing discovery such as Ollama's /api/tags.
type VirtualModelRuntime ¶
type VirtualModelRuntime struct {
PremiumQualityFloor string
}
VirtualModelRuntime contains runtime-derived values that influence virtual model expansion. The shape is intentionally small today but is designed to grow as the live model registry contributes more dynamic overlay state.
type VirtualModelSelection ¶
type VirtualModelSelection struct {
RequestedName string
CanonicalName string
Description string
Known bool
Model string
Hints map[string]string
}
VirtualModelSelection is the resolved effect of a caller-supplied model string after virtual model expansion.
func ResolveVirtualModelSelection ¶
func ResolveVirtualModelSelection(rawModel string, baseHints map[string]string, runtime VirtualModelRuntime, logger *slog.Logger) VirtualModelSelection
ResolveVirtualModelSelection expands a caller-supplied model string into a canonical virtual model policy or preserves the explicit deployment name when no virtual model matched.