Documentation ¶
Overview ¶
Package semanticrouter provides fast, embedding-based intent routing, jailbreak detection, and response semantic caching for Genie.
By sitting in front of the main orchestration flow, the semantic router acts as a lightweight gatekeeper. It quickly embeds user requests using a pre-configured vector embedding model (e.g. OpenAI, Gemini) and matches the resulting vector against a set of predefined routes (e.g. "jailbreak", "salutation").
If a route match is found (i.e. above the configured Threshold), the router immediately returns a ClassificationResult (e.g. REFUSE or SALUTATION) without invoking the more expensive, higher-latency front-desk LLM.
It also manages semantic caching, which lets repeated or highly similar queries bypass generation entirely by fetching previously generated results from the vector store.
Index ¶
- Constants
- func GetClassifyPrompt() string
- type Category
- type ClassificationResult
- type Config
- type IRouter
- type Route
- type Router
- func (r *Router) CheckCache(ctx context.Context, query string) (string, bool)
- func (r *Router) Classify(ctx context.Context, question, resume string) (ClassificationResult, error)
- func (r *Router) Close()
- func (r *Router) PruneStaleCacheEntries(ctx context.Context) (int, error)
- func (r *Router) SetCache(ctx context.Context, query string, response string) error
Constants ¶
const (
	RouteJailbreak  = mw.RouteJailbreak
	RouteSalutation = mw.RouteSalutation
	RouteFollowUp   = mw.RouteFollowUp
)
Route types for L1 vector matching (re-exported from semanticmiddleware).
const AgentNamePlaceholder = "{{AGENT_NAME}}"
Variables ¶
This section is empty.
Functions ¶
func GetClassifyPrompt ¶
func GetClassifyPrompt() string
GetClassifyPrompt returns the global classify.txt template value.
Types ¶
type ClassificationResult ¶
type ClassificationResult struct {
Category Category
Reason string // non-empty only for OUT_OF_SCOPE
BypassedLLM bool // true if semantic router (L1) bypassed the LLM completely
}
ClassificationResult carries the category together with an optional reason.
type Config ¶
type Config struct {
// Disabled determines whether semantic routing features are active.
Disabled bool `yaml:"disabled,omitempty" toml:"disabled,omitempty"`
// Threshold for semantic similarity matches (0.0 to 1.0).
// Default is 0.85.
Threshold float64 `yaml:"threshold,omitempty" toml:"threshold,omitempty"`
// EnableCaching enables semantic caching for user LLM responses.
EnableCaching bool `yaml:"enable_caching,omitempty" toml:"enable_caching,omitempty"`
// CacheTTL controls how long cached responses remain valid.
// Expired entries are ignored on read. Default is 5 minutes.
CacheTTL time.Duration `yaml:"cache_ttl,omitempty" toml:"cache_ttl,omitempty"`
// PruneInterval controls how often the background goroutine prunes stale
// cache entries. Default is 1 hour. Set to 0 to disable background pruning.
PruneInterval time.Duration `yaml:"prune_interval,omitempty" toml:"prune_interval,omitempty"`
// VectorStore defines the embedding and storage backend used for
// the semantic routing and caching. If empty, uses dummy embedder.
VectorStore vector.Config `yaml:"vector_store,omitempty" toml:"vector_store,omitempty"`
// Routes allows injecting custom semantic routes or extending builtin ones.
Routes []Route `yaml:"routes,omitempty" toml:"routes,omitempty"`
// L0Regex configures the L0 regex pre-filter middleware that catches
// conversational follow-ups and corrections before any embedding or LLM call.
L0Regex mw.L0RegexConfig `yaml:"l0_regex,omitempty" toml:"l0_regex,omitempty"`
// FollowUpBypass configures the follow-up bypass middleware that ensures
// messages flagged as follow-ups by L0 skip the expensive L2 LLM call.
FollowUpBypass mw.FollowUpBypassConfig `yaml:"follow_up_bypass,omitempty" toml:"follow_up_bypass,omitempty"`
}
Config configures the semantic routing engine.
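A hedged YAML sketch of this struct follows. The keys come from the yaml tags above; the values, the `pricing` route, and the duration-string form for `cache_ttl`/`prune_interval` are illustrative assumptions (duration parsing depends on how the host application unmarshals `time.Duration`), and `vector_store` is omitted because its fields belong to `vector.Config`:

```yaml
# Illustrative values only; vector_store (vector.Config) omitted.
disabled: false
threshold: 0.85      # similarity cutoff for L1 matches (default 0.85)
enable_caching: true
cache_ttl: 5m        # expired entries are ignored on read (default 5m)
prune_interval: 1h   # 0 disables background pruning (default 1h)
routes:
  - name: pricing    # hypothetical custom route
    utterances:
      - "how much does this cost"
      - "what is the price"
```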
type IRouter ¶
type IRouter interface {
Classify(ctx context.Context, question, resume string) (ClassificationResult, error)
CheckCache(ctx context.Context, query string) (string, bool)
SetCache(ctx context.Context, query string, response string) error
PruneStaleCacheEntries(ctx context.Context) (int, error)
}
IRouter defines the interface for the semantic router, enabling mocking and testing.
type Route ¶
type Route struct {
Name string `yaml:"name" toml:"name"`
Utterances []string `yaml:"utterances" toml:"utterances"`
}
Route defines a semantic category alongside example utterances.
type Router ¶
type Router struct {
// contains filtered or unexported fields
}
Router provides semantic routing (intent classification), semantic caching, and safety checks, using a vector store for fast, embedding-based comparisons. It acts as the gatekeeper, applying the L0 Regex → L1 Semantic → L2 LLM middleware chain.
func New ¶
func New(ctx context.Context, cfg Config, provider modelprovider.ModelProvider) (*Router, error)
New creates a new Semantic Router. It initializes isolated vector stores for caching and routing to prevent collision, and builds the classify middleware chain: L0 (regex) → L1 (vector) → follow-up bypass → L2 (LLM).
func (*Router) CheckCache ¶
func (r *Router) CheckCache(ctx context.Context, query string) (string, bool)
CheckCache looks up the input query in the semantic cache. Cache entries are subject to TTL: entries older than Config.CacheTTL are ignored.
func (*Router) Classify ¶
func (r *Router) Classify(ctx context.Context, question, resume string) (ClassificationResult, error)
Classify acts as the unified gatekeeper using a middleware chain:
- L0: regex patterns for common follow-ups (free, <1ms).
- L1: semantic vector distance; bypasses the LLM if the intent matches.
- L2: proxies to the LLM-based frontDeskExpert if no earlier layer decides.
Each middleware enriches a shared ClassifyContext so downstream layers can make better-informed decisions (e.g. L1's near-miss route score informs L2).
The method creates an OTel span ("semanticrouter.classify") that appears as a child of the caller's active span (typically "codeowner.chat"). This ensures classification always shows up in the Langfuse trace hierarchy.
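A caller typically switches on the returned category. The sketch below re-declares the result type locally; REFUSE, SALUTATION, and OUT_OF_SCOPE mirror categories named in this documentation, while CategoryInScope and the reply strings are hypothetical:

```go
package main

import "fmt"

// Local re-declarations for illustration.
type Category string

const (
	CategoryRefuse     Category = "REFUSE"
	CategorySalutation Category = "SALUTATION"
	CategoryOutOfScope Category = "OUT_OF_SCOPE"
	CategoryInScope    Category = "IN_SCOPE" // hypothetical pass-through value
)

type ClassificationResult struct {
	Category    Category
	Reason      string // non-empty only for OUT_OF_SCOPE
	BypassedLLM bool
}

// dispatch maps a classification to the caller's next action.
func dispatch(res ClassificationResult) string {
	switch res.Category {
	case CategoryRefuse:
		return "request refused"
	case CategorySalutation:
		return "hello!"
	case CategoryOutOfScope:
		return "out of scope: " + res.Reason
	default:
		return "continue to orchestration"
	}
}

func main() {
	res := ClassificationResult{Category: CategoryOutOfScope, Reason: "asks for legal advice"}
	fmt.Println(dispatch(res)) // out of scope: asks for legal advice
}
```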
func (*Router) Close ¶
func (r *Router) Close()
Close stops the background prune goroutine. It is safe to call multiple times.
func (*Router) PruneStaleCacheEntries ¶
func (r *Router) PruneStaleCacheEntries(ctx context.Context) (int, error)
PruneStaleCacheEntries removes expired cache entries from the vector store. It searches for a broad set of cache entries and deletes any whose cached_at timestamp is older than the configured CacheTTL. This should be called periodically (e.g. via a background goroutine) to prevent unbounded cache growth.
Directories ¶
| Path | Synopsis |
|---|---|
| | Package semanticmiddleware provides composable classification middleware for the semantic router. |
| | Code generated by counterfeiter. |