semanticrouter

package
v0.1.8-rc.21
Published: Mar 28, 2026 License: Apache-2.0 Imports: 21 Imported by: 0

Documentation

Overview

Package semanticrouter provides fast, embedding-based intent routing, jailbreak detection, and response semantic caching for Genie.

By sitting in front of the main orchestration flow, the semantic router acts as a lightweight gatekeeper. It quickly embeds user requests using a pre-configured vector embedding model (e.g. OpenAI, Gemini) and matches the resulting vector against a set of predefined routes (e.g. "jailbreak", "salutation").

If a route match is found (i.e. above the configured Threshold), the router immediately returns a ClassificationResult (e.g. REFUSE or SALUTATION) without invoking the more expensive, higher-latency front-desk LLM.

It also manages semantic caching, which allows repeated or highly similar queries to bypass generation entirely by fetching previously generated results from the vector store.
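The L1 gate described above can be sketched in miniature: embed the query, score it against example-utterance embeddings per route, and only short-circuit when the best score clears the threshold. This is a minimal, self-contained illustration of the idea (`cosine`, `bestRoute`, and the toy 3-dimensional vectors are ours, not the package's internals):

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// bestRoute returns the route whose utterance embedding is most similar
// to the query embedding, or "" if no score clears the threshold.
func bestRoute(query []float64, routes map[string][][]float64, threshold float64) string {
	best, bestScore := "", threshold
	for name, embs := range routes {
		for _, e := range embs {
			if s := cosine(query, e); s >= bestScore {
				best, bestScore = name, s
			}
		}
	}
	return best
}

func main() {
	routes := map[string][][]float64{
		"salutation": {{1, 0, 0}},
		"jailbreak":  {{0, 1, 0}},
	}
	// A query embedding close to the "salutation" utterance clears 0.85.
	fmt.Println(bestRoute([]float64{0.9, 0.1, 0}, routes, 0.85))
	// An ambiguous embedding clears no route and falls through to the LLM.
	fmt.Println(bestRoute([]float64{0.5, 0.5, 0.5}, routes, 0.85) == "")
}
```

A real embedder produces high-dimensional vectors, but the threshold comparison works the same way.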

Index

Constants

const (
	RouteJailbreak  = mw.RouteJailbreak
	RouteSalutation = mw.RouteSalutation
	RouteFollowUp   = mw.RouteFollowUp
)

Route types for L1 vector matching (re-exported from semanticmiddleware).

const AgentNamePlaceholder = "{{AGENT_NAME}}"

Variables

This section is empty.

Functions

func GetClassifyPrompt

func GetClassifyPrompt() string

GetClassifyPrompt returns the global classify.txt template value.

Types

type Category

type Category string

Category represents the classification result.

const (
	CategoryRefuse     Category = "REFUSE"
	CategorySalutation Category = "SALUTATION"
	CategoryOutOfScope Category = "OUT_OF_SCOPE"
	CategoryComplex    Category = "COMPLEX"
)

type ClassificationResult

type ClassificationResult struct {
	Category    Category
	Reason      string // non-empty only for OUT_OF_SCOPE
	BypassedLLM bool   // true if semantic router (L1) bypassed the LLM completely
}

ClassificationResult carries the category together with an optional reason.

type Config

type Config struct {
	// Disabled determines whether semantic routing features are active.
	Disabled bool `yaml:"disabled,omitempty" toml:"disabled,omitempty"`

	// Threshold for semantic similarity matches (0.0 to 1.0).
	// Default is 0.85.
	Threshold float64 `yaml:"threshold,omitempty" toml:"threshold,omitempty"`

	// EnableCaching enables semantic caching for user LLM responses.
	EnableCaching bool `yaml:"enable_caching,omitempty" toml:"enable_caching,omitempty"`

	// CacheTTL controls how long cached responses remain valid.
	// Expired entries are ignored on read. Default is 5 minutes.
	CacheTTL time.Duration `yaml:"cache_ttl,omitempty" toml:"cache_ttl,omitempty"`

	// PruneInterval controls how often the background goroutine prunes stale
	// cache entries. Default is 1 hour. Set to 0 to disable background pruning.
	PruneInterval time.Duration `yaml:"prune_interval,omitempty" toml:"prune_interval,omitempty"`

	// VectorStore defines the embedding and storage backend used for
	// the semantic routing and caching. If empty, uses dummy embedder.
	VectorStore vector.Config `yaml:"vector_store,omitempty" toml:"vector_store,omitempty"`

	// Routes allows injecting custom semantic routes or extending builtin ones.
	Routes []Route `yaml:"routes,omitempty" toml:"routes,omitempty"`

	// L0Regex configures the L0 regex pre-filter middleware that catches
	// conversational follow-ups and corrections before any embedding or LLM call.
	L0Regex mw.L0RegexConfig `yaml:"l0_regex,omitempty" toml:"l0_regex,omitempty"`

	// FollowUpBypass configures the follow-up bypass middleware that ensures
	// messages flagged as follow-ups by L0 skip the expensive L2 LLM call.
	FollowUpBypass mw.FollowUpBypassConfig `yaml:"follow_up_bypass,omitempty" toml:"follow_up_bypass,omitempty"`
}

Config configures the semantic routing engine.

func DefaultConfig

func DefaultConfig() Config

DefaultConfig provides sensible defaults: a Threshold of 0.85, a CacheTTL of 5 minutes, and a PruneInterval of 1 hour.
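A configuration file for this engine might look as follows. The keys come from the yaml tags on Config and Route above; the values and the custom route are purely illustrative, and the duration string syntax assumes the unmarshaler accepts Go-style durations:

```yaml
disabled: false
threshold: 0.85        # default; similarity needed for an L1 match
enable_caching: true
cache_ttl: 5m          # default; expired entries are ignored on read
prune_interval: 1h     # default; 0 disables background pruning
routes:
  - name: pricing      # hypothetical custom route
    utterances:
      - "how much does it cost"
      - "what is the price"
```

Routes supplied here extend the builtin ones (RouteJailbreak, RouteSalutation, RouteFollowUp).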

type IRouter

type IRouter interface {
	Classify(ctx context.Context, question, resume string) (ClassificationResult, error)
	CheckCache(ctx context.Context, query string) (string, bool)
	SetCache(ctx context.Context, query string, response string) error
	PruneStaleCacheEntries(ctx context.Context) (int, error)
}

IRouter defines the interface for the semantic router, enabling mocking and testing.

type Route

type Route struct {
	Name       string   `yaml:"name" toml:"name"`
	Utterances []string `yaml:"utterances" toml:"utterances"`
}

Route defines a semantic category alongside example utterances.

type Router

type Router struct {
	// contains filtered or unexported fields
}

Router provides semantic routing (intent classification), semantic caching, and safety checks, using a vector store for fast, embedding-based comparisons. It acts as the gatekeeper, applying the L0 Regex → L1 Semantic → L2 LLM middleware chain.

func New

func New(ctx context.Context, cfg Config, provider modelprovider.ModelProvider) (*Router, error)

New creates a new Semantic Router. It initializes isolated vector stores for caching and routing to prevent collision, and builds the classify middleware chain: L0 (regex) → L1 (vector) → follow-up bypass → L2 (LLM).

func (*Router) CheckCache

func (r *Router) CheckCache(ctx context.Context, query string) (string, bool)

CheckCache looks up the input query in the semantic cache. Cache entries are subject to TTL: entries older than Config.CacheTTL are ignored.

func (*Router) Classify

func (r *Router) Classify(ctx context.Context, question, resume string) (ClassificationResult, error)

Classify acts as the unified gatekeeper using a middleware chain. The L0 check matches regex patterns for common follow-ups (free, <1ms); the L1 check measures semantic vector distance and bypasses the LLM if an intent matches; the L2 check proxies to the LLM-based frontDeskExpert if no earlier layer decides.

Each middleware enriches a shared ClassifyContext so downstream layers can make better-informed decisions (e.g. L1's near-miss route score informs L2).
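The layering pattern can be sketched as a chain of handlers enriching a shared context until one decides (all names here are illustrative stand-ins, not the package's ClassifyContext or middleware types, and the L1 layer fakes its similarity check):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// classifyCtx is a shared scratchpad each layer can enrich, mirroring the
// ClassifyContext idea described above.
type classifyCtx struct {
	question string
	category string
	decided  bool
	nearMiss float64 // best L1 score even when below threshold, for L2's use
}

type layer func(*classifyCtx)

// chain runs layers in order until one of them decides.
func chain(layers ...layer) func(q string) string {
	return func(q string) string {
		c := &classifyCtx{question: q}
		for _, l := range layers {
			l(c)
			if c.decided {
				break
			}
		}
		return c.category
	}
}

var followUp = regexp.MustCompile(`(?i)^(yes|no|thanks|and then\??)$`)

// l0Regex: a free, <1ms pattern check for conversational follow-ups.
func l0Regex(c *classifyCtx) {
	if followUp.MatchString(strings.TrimSpace(c.question)) {
		c.category, c.decided = "FOLLOW_UP", true
	}
}

// l1Vector: a stand-in for embedding similarity; a real L1 compares vectors.
func l1Vector(c *classifyCtx) {
	if strings.HasPrefix(strings.ToLower(c.question), "hello") {
		c.category, c.decided = "SALUTATION", true
		return
	}
	c.nearMiss = 0.42 // recorded so a downstream L2 could use it
}

// l2LLM: the expensive fallback that always decides when reached.
func l2LLM(c *classifyCtx) {
	c.category, c.decided = "COMPLEX", true
}

func main() {
	classify := chain(l0Regex, l1Vector, l2LLM)
	fmt.Println(classify("thanks"))       // decided at L0
	fmt.Println(classify("hello there"))  // decided at L1
	fmt.Println(classify("explain raft")) // falls through to L2
}
```

Each layer either short-circuits or annotates the context for cheaper, better-informed downstream decisions.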

The method creates an OTel span ("semanticrouter.classify") that appears as a child of the caller's active span (typically "codeowner.chat"). This ensures classification always shows up in the Langfuse trace hierarchy.

func (*Router) Close

func (r *Router) Close()

Close stops the background prune goroutine. It is safe to call multiple times.

func (*Router) PruneStaleCacheEntries

func (r *Router) PruneStaleCacheEntries(ctx context.Context) (int, error)

PruneStaleCacheEntries removes expired cache entries from the vector store. It searches for a broad set of cache entries and deletes any whose cached_at timestamp is older than the configured CacheTTL. This should be called periodically (e.g. via a background goroutine) to prevent unbounded cache growth.

func (*Router) SetCache

func (r *Router) SetCache(ctx context.Context, query string, response string) error

SetCache stores the query and response pair for future semantic hits. A timestamp is stored alongside the response for TTL enforcement.

Directories

Path Synopsis
Package semanticmiddleware provides composable classification middleware for the semantic router.
Code generated by counterfeiter.
