Documentation ¶
Overview ¶
Package semanticrouter provides fast, embedding-based intent routing, jailbreak detection, and response semantic caching for Genie.
By sitting in front of the main orchestration flow, the semantic router acts as a lightweight gatekeeper. It quickly embeds user requests using a pre-configured vector embedding model (e.g. OpenAI, Gemini) and matches the resulting vector against a set of predefined routes (e.g. "jailbreak", "salutation").
If a route match is found (i.e. above the configured Threshold), the router immediately returns a ClassificationResult (e.g. REFUSE or SALUTATION) without invoking the more expensive, higher-latency front-desk LLM.
It also manages semantic caching, which lets repeated or highly similar queries bypass generation entirely by fetching previously generated results from the vector store.
Index ¶
- Constants
- func GetClassifyPrompt() string
- type Category
- type ClassificationResult
- type Config
- type IRouter
- type Route
- type Router
- func (r *Router) CheckCache(ctx context.Context, query string) (string, bool)
- func (r *Router) Classify(ctx context.Context, question, resume string) (ClassificationResult, error)
- func (r *Router) Close()
- func (r *Router) PruneStaleCacheEntries(ctx context.Context) (int, error)
- func (r *Router) SetCache(ctx context.Context, query string, response string) error
Constants ¶
const (
	RouteJailbreak  = mw.RouteJailbreak
	RouteSalutation = mw.RouteSalutation
	RouteFollowUp   = mw.RouteFollowUp
)
Route types for L1 vector matching (re-exported from semanticmiddleware).
const AgentNamePlaceholder = "{{AGENT_NAME}}"
Variables ¶
This section is empty.
Functions ¶
func GetClassifyPrompt ¶
func GetClassifyPrompt() string
GetClassifyPrompt returns the global classify.txt template value.
Types ¶
type ClassificationResult ¶
type ClassificationResult struct {
Category Category
Reason string // non-empty only for OUT_OF_SCOPE
BypassedLLM bool // true if semantic router (L1) bypassed the LLM completely
}
ClassificationResult carries the category together with an optional reason.
type Config ¶
type Config struct {
// Disabled determines whether semantic routing features are active.
Disabled bool `yaml:"disabled,omitempty" toml:"disabled,omitempty"`
// Threshold for semantic similarity matches (0.0 to 1.0).
// Default is 0.85.
Threshold float64 `yaml:"threshold,omitempty" toml:"threshold,omitempty"`
// EnableCaching enables semantic caching for user LLM responses.
EnableCaching bool `yaml:"enable_caching,omitempty" toml:"enable_caching,omitempty"`
// CacheTTL controls how long cached responses remain valid.
// Expired entries are ignored on read. Default is 5 minutes.
CacheTTL time.Duration `yaml:"cache_ttl,omitempty" toml:"cache_ttl,omitempty"`
// PruneInterval controls how often the background goroutine prunes stale
// cache entries. Default is 1 hour. Set to 0 to disable background pruning.
PruneInterval time.Duration `yaml:"prune_interval,omitempty" toml:"prune_interval,omitempty"`
// VectorStore defines the embedding and storage backend used for
// the semantic routing and caching. If empty, uses dummy embedder.
VectorStore vector.Config `yaml:"vector_store,omitempty" toml:"vector_store,omitempty"`
// Routes allows injecting custom semantic routes or extending builtin ones.
Routes []Route `yaml:"routes,omitempty" toml:"routes,omitempty"`
// L0Regex configures the L0 regex pre-filter middleware that catches
// conversational follow-ups and corrections before any embedding or LLM call.
L0Regex mw.L0RegexConfig `yaml:"l0_regex,omitempty" toml:"l0_regex,omitempty"`
// FollowUpBypass configures the follow-up bypass middleware that ensures
// messages flagged as follow-ups by L0 skip the expensive L2 LLM call.
FollowUpBypass mw.FollowUpBypassConfig `yaml:"follow_up_bypass,omitempty" toml:"follow_up_bypass,omitempty"`
}
Config configures the semantic routing engine.
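A hedged YAML sketch of this struct follows. The keys come from the yaml tags above; the values, the `pricing` route, and the duration-string form for `cache_ttl`/`prune_interval` are illustrative assumptions (duration parsing depends on how the host application unmarshals `time.Duration`), and `vector_store` is omitted because its fields belong to `vector.Config`:

```yaml
# Illustrative values only; vector_store (vector.Config) omitted.
disabled: false
threshold: 0.85      # similarity cutoff for L1 matches (default 0.85)
enable_caching: true
cache_ttl: 5m        # expired entries are ignored on read (default 5m)
prune_interval: 1h   # 0 disables background pruning (default 1h)
routes:
  - name: pricing    # hypothetical custom route
    utterances:
      - "how much does this cost"
      - "what is the price"
```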
type IRouter ¶
type IRouter interface {
Classify(ctx context.Context, question, resume string) (ClassificationResult, error)
CheckCache(ctx context.Context, query string) (string, bool)
SetCache(ctx context.Context, query string, response string) error
PruneStaleCacheEntries(ctx context.Context) (int, error)
}
IRouter defines the interface for the semantic router, enabling mocking and testing.
type Route ¶
type Route struct {
Name string `yaml:"name" toml:"name"`
Utterances []string `yaml:"utterances" toml:"utterances"`
}
Route defines a semantic category alongside example utterances.
type Router ¶
type Router struct {
// contains filtered or unexported fields
}
Router provides semantic routing (intent classification), semantic caching, and safety checks, using a vector store for fast, embedding-based comparisons. It acts as the gatekeeper, applying the L0 Regex → L1 Semantic → L2 LLM middleware chain.
func New ¶
func New(ctx context.Context, cfg Config, provider modelprovider.ModelProvider) (*Router, error)
New creates a new Semantic Router. It initializes isolated vector stores for caching and routing to prevent collision, and builds the classify middleware chain: L0 (regex) → L1 (vector) → follow-up bypass → L2 (LLM).
func (*Router) CheckCache ¶
func (r *Router) CheckCache(ctx context.Context, query string) (string, bool)
CheckCache looks up the input query in the semantic cache. Cache entries are subject to TTL: entries older than Config.CacheTTL are ignored.
func (*Router) Classify ¶
func (r *Router) Classify(ctx context.Context, question, resume string) (ClassificationResult, error)
Classify acts as the unified gatekeeper using a middleware chain:
- L0: regex patterns for common follow-ups (free, <1ms).
- L1: semantic vector distance; bypasses the LLM if the intent matches.
- L2: proxies to the LLM-based frontDeskExpert if no earlier layer decides.
Each middleware enriches a shared ClassifyContext so downstream layers can make better-informed decisions (e.g. L1's near-miss route score informs L2).
The method creates an OTel span ("semanticrouter.classify") that appears as a child of the caller's active span (typically "codeowner.chat"). This ensures classification always shows up in the Langfuse trace hierarchy.
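A caller typically switches on the returned category. The sketch below re-declares the result type locally; REFUSE, SALUTATION, and OUT_OF_SCOPE mirror categories named in this documentation, while CategoryInScope and the reply strings are hypothetical:

```go
package main

import "fmt"

// Local re-declarations for illustration.
type Category string

const (
	CategoryRefuse     Category = "REFUSE"
	CategorySalutation Category = "SALUTATION"
	CategoryOutOfScope Category = "OUT_OF_SCOPE"
	CategoryInScope    Category = "IN_SCOPE" // hypothetical pass-through value
)

type ClassificationResult struct {
	Category    Category
	Reason      string // non-empty only for OUT_OF_SCOPE
	BypassedLLM bool
}

// dispatch maps a classification to the caller's next action.
func dispatch(res ClassificationResult) string {
	switch res.Category {
	case CategoryRefuse:
		return "request refused"
	case CategorySalutation:
		return "hello!"
	case CategoryOutOfScope:
		return "out of scope: " + res.Reason
	default:
		return "continue to orchestration"
	}
}

func main() {
	res := ClassificationResult{Category: CategoryOutOfScope, Reason: "asks for legal advice"}
	fmt.Println(dispatch(res)) // out of scope: asks for legal advice
}
```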
func (*Router) Close ¶
func (r *Router) Close()
Close stops the background prune goroutine. It is safe to call multiple times.
func (*Router) PruneStaleCacheEntries ¶
func (r *Router) PruneStaleCacheEntries(ctx context.Context) (int, error)
PruneStaleCacheEntries removes expired cache entries from the vector store. It searches for a broad set of cache entries and deletes any whose cached_at timestamp is older than the configured CacheTTL. This should be called periodically (e.g. via a background goroutine) to prevent unbounded cache growth.
Directories ¶
| Path | Synopsis |
|---|---|
| | Package semanticmiddleware provides composable classification middleware for the semantic router. |
| | Code generated by counterfeiter. |