router

package
v0.9.2 Latest
Warning

This package is not in the latest version of its module.

Published: May 1, 2026 | License: Apache-2.0 | Imports: 9 | Imported by: 0

Documentation

Overview

Package router selects a model for each request using request characteristics, caller-supplied hints, and learned deployment experience.

Index

Constants

const (
	// HintChannel identifies the request source: "ollama", "homeassistant", "voice", "api"
	HintChannel = "channel"
	// HintQualityFloor is the minimum quality rating (1-10) the caller requires.
	HintQualityFloor = "quality_floor"
	// HintModelPreference suggests a specific model (soft preference, not override).
	HintModelPreference = "model_preference"
	// HintMission describes the task context: "conversation", "device_control", "background", "automation", "metacognitive"
	HintMission = "mission"
	// HintLocalOnly restricts routing to free/local models when set to "true".
	HintLocalOnly = "local_only"
	// HintDelegationGating controls whether delegation-first tool gating is
	// active. Set to "disabled" to give the model direct access to all tools
	// on every iteration (used by thane:ops).
	HintDelegationGating = "delegation_gating"
	// HintPreferSpeed indicates the caller benefits from faster response
	// times over higher quality. When "true", any model with Speed >= 7
	// receives a scoring bonus regardless of cost tier or provider. Can be
	// decisive among similarly priced options. Use for background/delegation
	// tasks where latency and resource efficiency matter more than maximum
	// output quality.
	HintPreferSpeed = "prefer_speed"
)

Hint keys for routing decisions. Callers set these to influence model selection.
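As an illustration of how a caller might populate these hints, the sketch below builds a hint map for a latency-sensitive background task. The constants are local copies of the documented values so the example compiles on its own, and backgroundHints is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// Local copies of the documented hint keys, so this sketch compiles
// without importing the package.
const (
	HintChannel     = "channel"
	HintMission     = "mission"
	HintLocalOnly   = "local_only"
	HintPreferSpeed = "prefer_speed"
)

// backgroundHints is a hypothetical helper (not part of the package) that
// builds hints for a background task restricted to local models and
// favouring speed over maximum quality.
func backgroundHints(channel string) map[string]string {
	return map[string]string{
		HintChannel:     channel,
		HintMission:     "background",
		HintLocalOnly:   "true",
		HintPreferSpeed: "true",
	}
}

func main() {
	hints := backgroundHints("api")
	fmt.Println(hints[HintMission], hints[HintPreferSpeed])
}
```

Hint values are strings even for boolean-like settings, which is why "true" is quoted.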

const (
	// HintVirtualModel records the canonical virtual model / execution policy
	// selected by a higher-level entrypoint such as the Ollama-compatible API.
	HintVirtualModel = "virtual_model"

	// HintDelegateModel is an explicit delegate model override carried through
	// request hints. When set, delegate loops bypass router selection.
	HintDelegateModel = "delegate_model"
)

Variables

This section is empty.

Functions

func DelegateHintKey

func DelegateHintKey(hint string) string

DelegateHintKey returns the request-hint key used to carry a delegate-time override for the supplied routing hint.

func OverlayDelegateHints

func OverlayDelegateHints(base map[string]string, inherited map[string]string) (explicitModel string, merged map[string]string)

OverlayDelegateHints merges delegate-policy overrides carried in a parent request-hint map onto a delegate profile's base router hints.

The returned explicitModel, when non-empty, instructs the delegate to bypass router selection and use that exact model.
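The merge described above can be sketched roughly as follows. This is not the package's implementation: the "delegate." key prefix and the lower-case helper names are assumptions made for illustration only, since the real key format returned by DelegateHintKey is not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// hintDelegateModel mirrors the documented value of HintDelegateModel.
const hintDelegateModel = "delegate_model"

// delegatePrefix is an ASSUMPTION for this sketch; the real key format
// used by DelegateHintKey may differ.
const delegatePrefix = "delegate."

// delegateHintKey is a hypothetical stand-in for DelegateHintKey.
func delegateHintKey(hint string) string {
	return delegatePrefix + hint
}

// overlayDelegateHints copies base, then applies any delegate-time
// overrides found in inherited. Other inherited keys are ignored.
func overlayDelegateHints(base, inherited map[string]string) (explicitModel string, merged map[string]string) {
	merged = make(map[string]string, len(base))
	for k, v := range base {
		merged[k] = v
	}
	for k, v := range inherited {
		if strings.HasPrefix(k, delegatePrefix) && v != "" {
			merged[strings.TrimPrefix(k, delegatePrefix)] = v
		}
	}
	return inherited[hintDelegateModel], merged
}

func main() {
	base := map[string]string{"quality_floor": "5", "mission": "background"}
	parent := map[string]string{"channel": "api"}
	parent[delegateHintKey("quality_floor")] = "8"

	model, hints := overlayDelegateHints(base, parent)
	fmt.Printf("explicit=%q hints=%v\n", model, hints)
}
```

In this sketch the base map is never mutated, so a delegate profile can be reused across requests.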

func ValidateTopicFilter

func ValidateTopicFilter(filter string) error

ValidateTopicFilter checks that an MQTT topic filter is syntactically valid per the MQTT v5 specification:

  • Must not be empty
  • '#' wildcard must be the last segment (and alone in its level)
  • '+' wildcard must occupy an entire level
  • No null characters (U+0000)
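The four rules listed above can be checked with a short routine like the one below. It is a minimal sketch of those rules only, written as a lower-case validateTopicFilter to make clear it is not the package's own function; the error messages are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateTopicFilter applies the four documented rules to an MQTT
// topic filter. It is an illustrative sketch, not the package's code.
func validateTopicFilter(filter string) error {
	if filter == "" {
		return errors.New("topic filter must not be empty")
	}
	if strings.ContainsRune(filter, 0) {
		return errors.New("topic filter must not contain U+0000")
	}
	levels := strings.Split(filter, "/")
	for i, level := range levels {
		switch {
		case strings.Contains(level, "#"):
			// '#' must be alone in its level and in the final level.
			if level != "#" {
				return fmt.Errorf("level %q: '#' must occupy the entire level", level)
			}
			if i != len(levels)-1 {
				return errors.New("'#' must be the last level")
			}
		case strings.Contains(level, "+") && level != "+":
			return fmt.Errorf("level %q: '+' must occupy the entire level", level)
		}
	}
	return nil
}

func main() {
	for _, f := range []string{"sensors/+/temp", "sensors/#", "sensors/#/temp", "sensors/te+mp"} {
		fmt.Printf("%-16q valid=%v\n", f, validateTopicFilter(f) == nil)
	}
}
```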

Types

type Complexity

type Complexity int

Complexity categorizes query difficulty.

const (
	ComplexitySimple   Complexity = iota // Direct command, single action
	ComplexityModerate                   // Multi-step or needs context
	ComplexityComplex                    // Reasoning, analysis, explanation
)

func (Complexity) String

func (c Complexity) String() string

String returns the human-readable name of a complexity level.
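A String method on an iota-based type usually takes the shape below. The type and constants are redeclared locally so the sketch compiles on its own, and the returned names ("simple", "moderate", "complex") are assumptions; the package may use different spellings.

```go
package main

import "fmt"

// Complexity is a local copy of the documented type.
type Complexity int

const (
	ComplexitySimple Complexity = iota
	ComplexityModerate
	ComplexityComplex
)

// String returns an assumed human-readable name for each level and a
// diagnostic fallback for out-of-range values.
func (c Complexity) String() string {
	switch c {
	case ComplexitySimple:
		return "simple"
	case ComplexityModerate:
		return "moderate"
	case ComplexityComplex:
		return "complex"
	default:
		return fmt.Sprintf("Complexity(%d)", int(c))
	}
}

func main() {
	// fmt uses the String method automatically via fmt.Stringer.
	fmt.Println(ComplexityModerate)
}
```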

type Config

type Config struct {
	Models       []Model // Available models
	DefaultModel string  // Fallback if no rules match
	LocalFirst   bool    // Prefer local models when possible
	MaxAuditLog  int     // How many decisions to keep in memory
}

Config holds router configuration.

type Decision

type Decision struct {
	RequestID string    `json:"request_id"`
	Timestamp time.Time `json:"timestamp"`

	// Input analysis
	QueryLength    int        `json:"query_length"`
	ContextSize    int        `json:"context_size"`
	NeedsTools     bool       `json:"needs_tools"`
	NeedsStreaming bool       `json:"needs_streaming,omitempty"`
	NeedsImages    bool       `json:"needs_images,omitempty"`
	Priority       string     `json:"priority"`
	DetectedIntent string     `json:"detected_intent,omitempty"`
	Complexity     Complexity `json:"complexity"`

	// Decision process
	RulesEvaluated []string            `json:"rules_evaluated"`
	RulesMatched   []string            `json:"rules_matched"`
	RejectedModels map[string][]string `json:"rejected_models,omitempty"`
	Scores         map[string]int      `json:"scores,omitempty"`
	NoEligible     bool                `json:"no_eligible,omitempty"`

	// Outcome
	ModelSelected         string `json:"model_selected"`
	UpstreamModelSelected string `json:"upstream_model_selected,omitempty"`
	ProviderSelected      string `json:"provider_selected,omitempty"`
	ResourceSelected      string `json:"resource_selected,omitempty"`
	Reasoning             string `json:"reasoning"`

	// Post-execution (filled in later)
	LatencyMs  int64 `json:"latency_ms,omitempty"`
	TokensUsed int   `json:"tokens_used,omitempty"`
	Success    *bool `json:"success,omitempty"`
}

Decision records why a model was selected.
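The Success field is a *bool with omitempty, which lets JSON consumers distinguish "outcome not recorded yet" from "recorded as failed". The sketch below demonstrates this with a trimmed-down local copy of the struct; only a handful of the documented fields are reproduced, and encode is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decision is a trimmed local copy of Decision with the same JSON tags.
type decision struct {
	RequestID     string `json:"request_id"`
	ModelSelected string `json:"model_selected"`
	LatencyMs     int64  `json:"latency_ms,omitempty"`
	Success       *bool  `json:"success,omitempty"`
}

// encode is a hypothetical helper that marshals a decision to a string.
func encode(d decision) string {
	b, err := json.Marshal(d)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	d := decision{RequestID: "req-1", ModelSelected: "qwen3:4b"}
	fmt.Println(encode(d)) // no "success" key: outcome not recorded yet

	failed := false
	d.Success = &failed
	d.LatencyMs = 1200
	fmt.Println(encode(d)) // "success":false is emitted because the pointer is non-nil
}
```

A plain bool with omitempty would drop a false value, so a failed request would look identical to one that has not finished.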

type DeploymentStats

type DeploymentStats struct {
	Provider      string `json:"provider"`
	Resource      string `json:"resource,omitempty"`
	UpstreamModel string `json:"upstream_model,omitempty"`
	Requests      int64  `json:"requests"`
	Successes     int64  `json:"successes"`
	Failures      int64  `json:"failures"`
	AvgLatencyMs  int64  `json:"avg_latency_ms,omitempty"`
	AvgTokensUsed int64  `json:"avg_tokens_used,omitempty"`
}

DeploymentStats tracks routing and outcome state for one concrete deployment/route target.

type LoopProfile

type LoopProfile struct {
	// Model sets an explicit model, bypassing the router. When empty,
	// the router selects based on the other hint fields.
	Model string `yaml:"model,omitempty" json:"model,omitempty"`

	// QualityFloor is the minimum model quality rating (1–10). Maps
	// to [HintQualityFloor].
	QualityFloor string `yaml:"quality_floor,omitempty" json:"quality_floor,omitempty"`

	// Mission describes the task context for routing. Maps to
	// [HintMission]. Common values: "conversation", "automation",
	// "device_control", "background".
	Mission string `yaml:"mission,omitempty" json:"mission,omitempty"`

	// LocalOnly restricts routing to free/local models when "true".
	// Maps to [HintLocalOnly].
	LocalOnly string `yaml:"local_only,omitempty" json:"local_only,omitempty"`

	// DelegationGating controls delegation-first tool gating. Set to
	// "disabled" for direct tool access. Maps to [HintDelegationGating].
	DelegationGating string `yaml:"delegation_gating,omitempty" json:"delegation_gating,omitempty"`

	// PreferSpeed favours faster models when "true". Maps to
	// [HintPreferSpeed].
	PreferSpeed string `yaml:"prefer_speed,omitempty" json:"prefer_speed,omitempty"`

	// ExcludeTools lists tool names to filter out of the agent run.
	ExcludeTools []string `yaml:"exclude_tools,omitempty" json:"exclude_tools,omitempty"`

	// ExtraHints carries arbitrary key-value routing hints that are
	// merged last, allowing callers to override or extend the typed
	// fields above.
	ExtraHints map[string]string `yaml:"extra_hints,omitempty" json:"extra_hints,omitempty"`

	// Instructions is extra text injected into the user message to
	// guide the agent's behaviour for this wake context.
	Instructions string `yaml:"instructions,omitempty" json:"instructions,omitempty"`
}

LoopProfile captures the common routing and configuration parameters shared by all agent wake sites. It is serializable via both YAML (for config file embedding) and JSON (for API and tool payloads).

Each field maps to a well-known routing hint or agent.Request property. Zero-value fields are omitted during serialization and ignored by LoopProfile.Hints.

func (*LoopProfile) Hints

func (s *LoopProfile) Hints() map[string]string

Hints builds a routing hints map from the profile's typed fields. Only non-empty fields are included. ExtraHints are merged last and can override typed fields.
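The documented behaviour (skip empty fields, merge ExtraHints last) can be sketched as follows. The loopProfile type is a reduced local copy and hints is a lower-case stand-in; neither is the package's actual code.

```go
package main

import "fmt"

// loopProfile is a reduced local copy of LoopProfile.
type loopProfile struct {
	QualityFloor     string
	Mission          string
	LocalOnly        string
	DelegationGating string
	PreferSpeed      string
	ExtraHints       map[string]string
}

// hints includes only non-empty typed fields, then merges ExtraHints so
// that they can override or extend the typed values.
func (p *loopProfile) hints() map[string]string {
	out := make(map[string]string)
	set := func(key, value string) {
		if value != "" {
			out[key] = value
		}
	}
	set("quality_floor", p.QualityFloor)
	set("mission", p.Mission)
	set("local_only", p.LocalOnly)
	set("delegation_gating", p.DelegationGating)
	set("prefer_speed", p.PreferSpeed)
	for k, v := range p.ExtraHints {
		out[k] = v
	}
	return out
}

func main() {
	p := &loopProfile{
		Mission:     "automation",
		PreferSpeed: "true",
		ExtraHints:  map[string]string{"mission": "background", "channel": "api"},
	}
	fmt.Println(p.hints())
}
```

In the usage above, the ExtraHints entry for "mission" replaces the typed Mission value, and "channel" is added as a new key.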

func (*LoopProfile) RequestOptions

func (s *LoopProfile) RequestOptions() RequestOptions

RequestOptions returns the request-ready fields implied by the profile. Slices are copied so callers can mutate the result without affecting the underlying LoopProfile.
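The copy guarantee matters because a Go slice shares its backing array with any slice derived from it. A minimal sketch of the defensive copy, using reduced local types and a lower-case requestOptions method as stand-ins, might look like this:

```go
package main

import "fmt"

// requestOptions and loopProfile are reduced local copies of the
// documented types.
type requestOptions struct {
	Model        string
	ExcludeTools []string
}

type loopProfile struct {
	Model        string
	ExcludeTools []string
}

// requestOptions copies ExcludeTools so the caller can append to or
// edit the result without touching the profile.
func (p *loopProfile) requestOptions() requestOptions {
	return requestOptions{
		Model:        p.Model,
		ExcludeTools: append([]string(nil), p.ExcludeTools...),
	}
}

func main() {
	p := &loopProfile{ExcludeTools: []string{"shell", "browser"}}
	opts := p.requestOptions()
	opts.ExcludeTools[0] = "changed"
	fmt.Println(p.ExcludeTools[0]) // the profile still holds its original value
}
```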

func (*LoopProfile) Validate

func (s *LoopProfile) Validate() error

Validate checks that the profile's typed fields contain semantically valid values. It does not require any field to be set; an empty LoopProfile is valid. Returns nil on success.

type Model

type Model struct {
	Name                  string     // Route/deployment identifier (e.g., "qwen3:4b" or "spark/qwen3:32b")
	UpstreamModel         string     // Provider-native model name (e.g., "qwen3:32b")
	Provider              string     // "ollama" or "anthropic" etc
	ResourceID            string     // Provider resource identity (e.g., server name)
	Server                string     // Configured server name when applicable
	SupportsTools         bool       // Deployment is configured for tool calling
	ProviderSupportsTools bool       // Underlying provider supports tool calling
	SupportsStreaming     bool       // Deployment/provider can stream
	SupportsImages        bool       // Deployment/provider accepts image input
	ContextWindow         int        // Max tokens
	Speed                 int        // Relative speed (1-10, 10=fastest)
	Quality               int        // Relative quality (1-10, 10=best)
	CostTier              int        // 0=free/local, 1=cheap, 2=moderate, 3=expensive
	MinComplexity         Complexity // Don't use for simpler than this
}

Model represents an available model with its capabilities.

type Priority

type Priority int

Priority indicates latency requirements.

const (
	PriorityInteractive Priority = iota // User waiting, needs fast response
	PriorityBackground                  // Can take longer for better quality
)

type Request

type Request struct {
	Query          string            // The user's input
	ContextSize    int               // Estimated tokens of context (talents, history)
	NeedsTools     bool              // Whether tool calling is required
	NeedsStreaming bool              // Whether a streaming response is required
	NeedsImages    bool              // Whether image/multimodal input is required
	ToolCount      int               // Number of tools available
	Priority       Priority          // Latency requirements
	Hints          map[string]string // Caller-supplied routing hints (see HintXxx constants)
}

Request contains the information needed for routing decisions.

type RequestOptions

type RequestOptions struct {
	Model        string
	Hints        map[string]string
	ExcludeTools []string
}

RequestOptions contains the agent request fields derived from a LoopProfile. Callers can merge additional channel- or trigger-specific hints on top of these shared routing defaults.

type ResourceHealth

type ResourceHealth struct {
	CooldownUntil  time.Time `json:"cooldown_until,omitempty"`
	CooldownReason string    `json:"cooldown_reason,omitempty"`
}

ResourceHealth exposes request-plane routing health for one resource.

type Router

type Router struct {
	// contains filtered or unexported fields
}

Router selects models based on request characteristics.

func NewRouter

func NewRouter(logger *slog.Logger, config Config) *Router

NewRouter creates a router with the given configuration.

func (*Router) ContextWindowForModel

func (r *Router) ContextWindowForModel(name string) int

ContextWindowForModel returns the context window size for the named model. If the model is not found in the router's configuration, it returns 0.

func (*Router) DefaultModel

func (r *Router) DefaultModel() string

DefaultModel returns the router's current fallback/default model.

func (*Router) ExperienceSnapshot

func (r *Router) ExperienceSnapshot() map[string]DeploymentStats

ExperienceSnapshot returns a copy of the deployment-scoped learned experience that is safe to persist and later rehydrate. It excludes transient resource cooldown state.

func (*Router) ExperienceVersion

func (r *Router) ExperienceVersion() int64

ExperienceVersion returns a monotonic counter that changes whenever deployment-scoped learned experience is updated.

func (*Router) Explain

func (r *Router) Explain(requestID string) *Decision

Explain returns details about why a specific decision was made.

func (*Router) ExplainRequest

func (r *Router) ExplainRequest(req Request) *Decision

ExplainRequest computes a routing decision for the supplied request using the router's current config, learned experience, and transient resource health, but does not mutate audit history or stats.

func (*Router) GetAuditLog

func (r *Router) GetAuditLog(limit int) []Decision

GetAuditLog returns recent routing decisions.

func (*Router) GetModels

func (r *Router) GetModels() []Model

GetModels returns a copy of the configured model list. The returned slice is safe to mutate without affecting the router.

func (*Router) GetStats

func (r *Router) GetStats() Stats

GetStats returns routing statistics.

func (*Router) MaxQuality

func (r *Router) MaxQuality() int

MaxQuality returns the highest quality rating among configured models. If no models are configured it returns 10 as a safe default that selects the best available model at runtime.
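The fallback rule described above is easy to express directly. The sketch below uses a reduced local model type and a free function, maxQuality, rather than the Router method, so it is only an approximation of the documented behaviour.

```go
package main

import "fmt"

// model is a reduced local copy of Model.
type model struct {
	Name    string
	Quality int // relative quality, 1-10
}

// maxQuality returns the highest Quality among models, or 10 when the
// list is empty, matching the documented default.
func maxQuality(models []model) int {
	if len(models) == 0 {
		return 10
	}
	best := 0
	for _, m := range models {
		if m.Quality > best {
			best = m.Quality
		}
	}
	return best
}

func main() {
	models := []model{{"qwen3:4b", 5}, {"spark/qwen3:32b", 8}}
	fmt.Println(maxQuality(models), maxQuality(nil))
}
```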

func (*Router) RecordFailure

func (r *Router) RecordFailure(requestID string, latencyMs int64, tokensUsed int, resourceTimeout bool)

RecordFailure updates a failed routing outcome and optionally applies a temporary resource cooldown so automatic routing can avoid a runner that is timing out on real chat traffic.

func (*Router) RecordOutcome

func (r *Router) RecordOutcome(requestID string, latencyMs int64, tokensUsed int, success bool)

RecordOutcome updates a decision with execution results.

func (*Router) ReplaceExperience

func (r *Router) ReplaceExperience(experience map[string]DeploymentStats)

ReplaceExperience replaces the router's persisted deployment-scoped learned experience and rebuilds aggregate stats from it. Transient resource cooldown state is intentionally left untouched.

func (*Router) Route

func (r *Router) Route(ctx context.Context, req Request) (string, *Decision)

Route selects a model for the given request.
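The page does not spell out the selection algorithm, so the following is only a toy filter-and-score sketch under stated assumptions. It uses reduced local types, treats Quality as the base score, and adds an invented bonus of 3 for models with Speed >= 7 when prefer_speed is "true" (the threshold comes from the HintPreferSpeed comment; the bonus size does not).

```go
package main

import "fmt"

// model and request are reduced local copies of Model and Request.
type model struct {
	Name          string
	SupportsTools bool
	ContextWindow int
	Speed         int
	Quality       int
	CostTier      int // 0 = free/local
}

type request struct {
	ContextSize int
	NeedsTools  bool
	Hints       map[string]string
}

// speedBonus is an assumed value; the real bonus is not documented.
const speedBonus = 3

// route drops ineligible models, scores the rest, and returns the best
// name together with the per-model scores. It returns fallback when no
// model is eligible.
func route(models []model, req request, fallback string) (string, map[string]int) {
	scores := make(map[string]int)
	best, bestScore := fallback, -1
	for _, m := range models {
		if req.NeedsTools && !m.SupportsTools {
			continue
		}
		if m.ContextWindow < req.ContextSize {
			continue
		}
		if req.Hints["local_only"] == "true" && m.CostTier != 0 {
			continue
		}
		score := m.Quality
		if req.Hints["prefer_speed"] == "true" && m.Speed >= 7 {
			score += speedBonus
		}
		scores[m.Name] = score
		if score > bestScore {
			best, bestScore = m.Name, score
		}
	}
	return best, scores
}

func main() {
	models := []model{
		{Name: "qwen3:4b", SupportsTools: true, ContextWindow: 8192, Speed: 9, Quality: 5},
		{Name: "cloud-large", SupportsTools: true, ContextWindow: 200000, Speed: 3, Quality: 7, CostTier: 3},
	}
	name, scores := route(models, request{Hints: map[string]string{"prefer_speed": "true"}}, "qwen3:4b")
	fmt.Println(name, scores)
}
```

The model name "cloud-large" is a placeholder. The returned score map plays the role of Decision.Scores, giving callers something to inspect when a choice looks surprising.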

func (*Router) UpdateConfig

func (r *Router) UpdateConfig(cfg Config)

UpdateConfig swaps the router's live model configuration while preserving accumulated audit history and stats.

type Stats

type Stats struct {
	TotalRequests    int64                      `json:"total_requests"`
	ModelCounts      map[string]int64           `json:"model_counts"`
	AvgLatencyMs     map[string]int64           `json:"avg_latency_ms"`
	ComplexityCounts map[string]int64           `json:"complexity_counts"`
	SuccessCount     int64                      `json:"success_count"`
	FailureCount     int64                      `json:"failure_count"`
	ProviderCounts   map[string]int64           `json:"provider_counts,omitempty"`
	ResourceCounts   map[string]int64           `json:"resource_counts,omitempty"`
	ResourceHealth   map[string]ResourceHealth  `json:"resource_health,omitempty"`
	DeploymentStats  map[string]DeploymentStats `json:"deployment_stats,omitempty"`
}

Stats tracks routing statistics.

type VirtualModel

type VirtualModel struct {
	Name        string
	Description string
	Exposed     bool
	Aliases     []string
	TopLevel    LoopProfile
	Delegate    LoopProfile
}

VirtualModel describes an end-to-end execution policy exposed through virtual model names such as "thane:premium".

TopLevel controls the initial orchestrator loop. Delegate controls child loops launched via thane_now or thane_assign. Future revisions may also derive these policies dynamically from the live registry without changing callers.

func ExposedVirtualModels

func ExposedVirtualModels(runtime VirtualModelRuntime) []VirtualModel

ExposedVirtualModels returns the runtime-expanded virtual models intended for user-facing discovery such as Ollama's /api/tags.

type VirtualModelRuntime

type VirtualModelRuntime struct {
	PremiumQualityFloor string
}

VirtualModelRuntime contains runtime-derived values that influence virtual model expansion. The shape is intentionally small today but is designed to grow as the live model registry contributes more dynamic overlay state.

type VirtualModelSelection

type VirtualModelSelection struct {
	RequestedName string
	CanonicalName string
	Description   string
	Known         bool
	Model         string
	Hints         map[string]string
}

VirtualModelSelection is the resolved effect of a caller-supplied model string after virtual model expansion.

func ResolveVirtualModelSelection

func ResolveVirtualModelSelection(rawModel string, baseHints map[string]string, runtime VirtualModelRuntime, logger *slog.Logger) VirtualModelSelection

ResolveVirtualModelSelection expands a caller-supplied model string into a canonical virtual model policy or preserves the explicit deployment name when no virtual model matched.
