cache

package

v1.4.6 · Published: Apr 14, 2026 · License: Apache-2.0 · Imports: 17 · Imported by: 0

Repository

github.com/dativo-io/talon

README ¶

Governed Semantic Cache

This package implements Talon's semantic cache for LLM completions. It reduces cost and latency by serving similar prompts from the cache instead of calling the LLM. The cache is GDPR Article 17 compliant, PII-safe, tenant-isolated, and auditable.

Cache vs memory (clarification)

| Aspect | Semantic cache (this package) | Agent memory (`internal/memory`) |
| --- | --- | --- |
| Layer | Gateway / proxy (LLM request path) | Agent (per-agent learning) |
| Purpose | Cost and latency: reuse similar prompts | Safety and compliance: what the agent may remember |
| Duration | Minutes to days (TTL, eviction) | Weeks to indefinitely |
| Governance | Cache TTL, data tier, PII scrubbing, GDPR erasure | Categories, PII policy, constitutional AI |
| Config | `talon.config.yaml` under `cache` | `agent.talon.yaml` under `memory` |
| When used | Before every LLM call | When building context for later runs |

The semantic cache sits at the proxy/gateway layer. Memory governance sits at the agent layer. Both may use similar techniques (e.g. embeddings, similarity) for different goals.

Embedding strategy (Option C — BM25, v0.2.0 default)

We use BM25-style term-vector similarity in pure Go (embedder.go):

No external model or CGO — single binary, no extra dependencies.
Deterministic — same text always yields the same blob for lookup.
Good for exact and near-exact match — repeated or slightly reworded prompts hit the cache.
Does not match paraphrases (e.g. "What is GDPR?" vs "Explain GDPR to me"); that is an acceptable MVP tradeoff. Most cache hits in practice come from repeated or near-identical queries.
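
The tradeoff above can be illustrated with a minimal sketch. It is not the code in embedder.go: plain term counts stand in for BM25 weights, and the helper names `termVector` and `cosine` are hypothetical. It only shows why a repeated prompt scores close to 1 while a paraphrase that shares a single term scores low.

```go
package main

import (
	"fmt"
	"math"
	"strings"
	"unicode"
)

// termVector counts lowercase alphanumeric tokens of at least minLen runes.
// Assumption: raw term counts replace the BM25 weighting used by the package.
func termVector(text string, minLen int) map[string]float64 {
	vec := make(map[string]float64)
	tokens := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, tok := range tokens {
		if len([]rune(tok)) >= minLen {
			vec[tok]++
		}
	}
	return vec
}

// cosine returns the cosine similarity of two term vectors,
// or 0 when either vector is empty.
func cosine(a, b map[string]float64) float64 {
	var dot, normA, normB float64
	for term, wa := range a {
		dot += wa * b[term] // missing terms contribute 0
		normA += wa * wa
	}
	for _, wb := range b {
		normB += wb * wb
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	base := termVector("What is GDPR?", 2)
	fmt.Printf("repeat:     %.2f\n", cosine(base, termVector("what is GDPR", 2)))
	fmt.Printf("paraphrase: %.2f\n", cosine(base, termVector("Explain GDPR to me", 2)))
}
```

With any reasonably high threshold the repeated prompt is a hit and the paraphrase is a miss, which matches the MVP behavior described above.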

Alternatives deferred to later:

Option A (v0.3): Local embedding model (e.g. ONNX MiniLM) for true semantic matching.
Option B (not recommended): LLM provider embedding API — adds latency and cost to every lookup.

What is stored

Stored: Prompt embedding (serialized term vector, no raw prompt text), PII-scrubbed response text, metadata (tenant_id, model, TTL, hit_count), HMAC signature.
Not stored: Raw prompt text, raw response text, user identifiers (except optional user_id for GDPR user-level erasure).

Components

store.go — SQLite schema, CRUD, lookup with similarity function, eviction, HMAC, GDPR erasure.
embedder.go — BM25 tokenization and cosine similarity of term vectors.
pii_scrubber.go — Wraps classifier.Redact for response text before storage.
policy.go — OPA cache eligibility (data tier, PII, request type, cache_enabled); see rego/cache.rego.

Documentation ¶

Overview ¶

Package cache provides a governed semantic cache for LLM completions.

The cache is a gateway-level cost optimization: it stores prompt embeddings (not raw prompts) and PII-scrubbed responses, with strict tenant isolation, configurable TTL, and GDPR Article 17 erasure. See internal/cache/README.md for the cache-vs-memory clarification.

The package documentation is assembled from several files:

Embedder: provides BM25-style text similarity for the semantic cache (Option C). No raw prompt text is stored, only a serialized term vector for similarity lookup.
Key derivation: derives cache entry keys (deterministic, not password hashing).
PII scrubber: wraps the classifier so LLM responses are PII-scrubbed before being stored in the semantic cache.
Policy: evaluates OPA cache eligibility (lookup and store).

Index ¶

func DeriveEntryKey(tenantID, model, prompt string) string
func TenantIDForCacheKey(tenantID string) string
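type BM25
- func NewBM25() *BM25
- func (b *BM25) Embed(text string) ([]byte, error)
- func (b *BM25) Similarity(queryBlob, candidateBlob []byte) (float64, error)
- func (b *BM25) SimilarityFunc() SimilarityFunc
type Entry
type Evaluator
- func NewEvaluator(ctx context.Context) (*Evaluator, error)
- func (e *Evaluator) Evaluate(ctx context.Context, input *PolicyInput) (*PolicyResult, error)
type LookupResult
type PIIScrubber
- func NewPIIScrubber(scanner *classifier.Scanner) *PIIScrubber
- func (p *PIIScrubber) Scrub(ctx context.Context, text string) string
type PolicyInput
type PolicyResult
type SimilarityFunc
type Store
- func NewStore(dbPath string, signingKey string) (*Store, error)
- func (s *Store) Close() error
- func (s *Store) CountByTenant(ctx context.Context, tenantID string) (int, error)
- func (s *Store) DeleteExpired(ctx context.Context) (int64, error)
- func (s *Store) EraseTenant(ctx context.Context, tenantID string) (int64, error)
- func (s *Store) EraseTenantUser(ctx context.Context, tenantID, userID string) (int64, error)
- func (s *Store) GetByID(ctx context.Context, id string) (*Entry, error)
- func (s *Store) IncrementHitCount(ctx context.Context, id string) error
- func (s *Store) Insert(ctx context.Context, e *Entry) error
- func (s *Store) ListTenants(ctx context.Context) ([]string, error)
- func (s *Store) Lookup(ctx context.Context, tenantID string, queryEmbedding []byte, threshold float64, maxCandidates int, sim SimilarityFunc) (*LookupResult, error)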

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func DeriveEntryKey ¶

func DeriveEntryKey(tenantID, model, prompt string) string

DeriveEntryKey returns a deterministic cache entry key from tenant, model, and prompt. Uses SHA-2 (SHA-256) for cache key derivation only: same inputs produce the same key for lookup/insert. No password or secret is hashed; inputs are tenant id, model name, and prompt text. SHA-2 (or SHA-3) is appropriate for non-password uses like cache keys.

Precondition: callers must pass only non-secret values: tenant identifier (e.g. from caller config, used as cache scope), model name, and prompt text. Do not pass API keys, passwords, or other secrets. The tenant ID is typically from config lookup (by API key), not the API key itself; it is an identifier like "acme-corp", not sensitive data.

func TenantIDForCacheKey ¶

func TenantIDForCacheKey(tenantID string) string

TenantIDForCacheKey documents that the given string is a tenant identifier for cache scoping (e.g. from caller config), not an API key or secret. Use at call sites to make the non-secret use explicit for static analysis.
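
As an illustration of the key derivation that DeriveEntryKey describes, here is a minimal sketch using SHA-256. The function name `deriveKey`, the length-prefixed field encoding, and the hex output are assumptions made for this example; the package's actual encoding is not shown in this documentation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// deriveKey is a hypothetical stand-in for DeriveEntryKey: it hashes the
// tenant identifier, model name, and prompt with SHA-256. Only non-secret
// values should be passed in.
func deriveKey(tenantID, model, prompt string) string {
	h := sha256.New()
	for _, part := range []string{tenantID, model, prompt} {
		// Length-prefix each field so ("ab", "c") and ("a", "bc")
		// cannot produce the same byte stream.
		fmt.Fprintf(h, "%d:%s|", len(part), part)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	k1 := deriveKey("acme-corp", "model-a", "What is GDPR?")
	k2 := deriveKey("acme-corp", "model-a", "What is GDPR?")
	k3 := deriveKey("other-co", "model-a", "What is GDPR?")
	fmt.Println(k1 == k2) // same inputs, same key
	fmt.Println(k1 == k3) // different tenant, different key
}
```

Because the tenant identifier is part of the hashed input, two tenants sending the same prompt to the same model get different keys.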

Types ¶

type BM25 ¶

type BM25 struct {
	// MinTermLen ignores tokens shorter than this (default 2).
	MinTermLen int
}

BM25 is a pure-Go BM25-style embedder: tokenize text and produce a term vector blob. No external model or CGO; deterministic and suitable for exact and near-exact match caching.

func NewBM25 ¶

func NewBM25() *BM25

NewBM25 returns a new BM25 embedder with default settings.

func (*BM25) Embed ¶

func (b *BM25) Embed(text string) ([]byte, error)

Embed tokenizes text and returns a serialized term vector (blob) for storage in the cache. The blob does not contain raw text; it is used only for similarity comparison via Similarity.
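
To make the "same text, same blob" property concrete, the sketch below serializes a bag of hashed tokens in sorted order. This is an assumption-laden illustration, not the format used by embedder.go: the function name `embed`, the FNV-1a token hashing, and the 8-byte (ID, count) layout are all hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"sort"
	"strings"
)

// embed hashes each lowercase, whitespace-separated token to a 32-bit ID and
// serializes (ID, count) pairs sorted by ID. Sorting makes the output
// deterministic even though Go map iteration order is not.
func embed(text string) []byte {
	counts := make(map[uint32]uint32)
	for _, tok := range strings.Fields(strings.ToLower(text)) {
		h := fnv.New32a()
		h.Write([]byte(tok))
		counts[h.Sum32()]++
	}

	ids := make([]uint32, 0, len(counts))
	for id := range counts {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })

	blob := make([]byte, 0, 8*len(ids))
	for _, id := range ids {
		blob = binary.BigEndian.AppendUint32(blob, id)
		blob = binary.BigEndian.AppendUint32(blob, counts[id])
	}
	return blob
}

func main() {
	a := embed("What is GDPR")
	b := embed("gdpr is what")
	// A bag of words ignores order and case, so both blobs are identical.
	fmt.Println(bytes.Equal(a, b))
}
```

The blob holds token IDs and counts rather than the prompt string itself, which is the general idea behind storing an embedding instead of raw text.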

func (*BM25) Similarity ¶

func (b *BM25) Similarity(queryBlob, candidateBlob []byte) (float64, error)

Similarity computes cosine similarity between two term-vector blobs (from Embed). Returns a value in [0, 1]; 1 means identical vectors. Safe to use as cache.SimilarityFunc.

func (*BM25) SimilarityFunc ¶

func (b *BM25) SimilarityFunc() SimilarityFunc

SimilarityFunc returns a cache.SimilarityFunc that uses this BM25 embedder.

type Entry ¶

type Entry struct {
	ID            string
	TenantID      string
	UserID        string // Optional; for user-level GDPR erasure
	CacheKey      string
	EmbeddingData []byte
	ResponseText  string
	Model         string
	DataTier      string
	PIIScrubbed   bool
	HitCount      int64
	CreatedAt     time.Time
	ExpiresAt     time.Time
	LastAccessed  *time.Time
	HMACSignature string
}

Entry is a single semantic cache record.

type Evaluator ¶

type Evaluator struct {
	// contains filtered or unexported fields
}

Evaluator evaluates cache eligibility policy.

func NewEvaluator ¶

func NewEvaluator(ctx context.Context) (*Evaluator, error)

NewEvaluator compiles the embedded cache.rego and returns an evaluator.

func (*Evaluator) Evaluate ¶

func (e *Evaluator) Evaluate(ctx context.Context, input *PolicyInput) (*PolicyResult, error)

Evaluate returns whether cache lookup and store are allowed for the input.

type LookupResult ¶

type LookupResult struct {
	Entry      *Entry
	Similarity float64
}

LookupResult is the return type of Store.Lookup. It includes the matching entry and the actual similarity score (in [0, 1]) so callers can record accurate audit data instead of the configured threshold.

type PIIScrubber ¶

type PIIScrubber struct {
	// contains filtered or unexported fields
}

PIIScrubber wraps the PII classifier's Redact to produce cache-safe response text. Responses are scrubbed (PII replaced with placeholders like [EMAIL]) before storage.

func NewPIIScrubber ¶

func NewPIIScrubber(scanner *classifier.Scanner) *PIIScrubber

NewPIIScrubber returns a scrubber that uses the given classifier scanner.

func (*PIIScrubber) Scrub ¶

func (p *PIIScrubber) Scrub(ctx context.Context, text string) string

Scrub returns text with PII replaced by type-based placeholders (e.g. [EMAIL], [IBAN]). Use this for LLM response text before storing in the cache.
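
The placeholder substitution can be pictured with a small regex-based sketch. This is not how classifier.Redact works internally (that code is not shown here); `emailPattern` and `scrubEmails` are hypothetical names, and the pattern is deliberately simple and covers email addresses only.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailPattern is a simplified, illustrative matcher. A production PII
// classifier would handle many more types (IBAN, phone numbers, and so on).
var emailPattern = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// scrubEmails replaces every email address with the placeholder [EMAIL].
func scrubEmails(text string) string {
	return emailPattern.ReplaceAllString(text, "[EMAIL]")
}

func main() {
	fmt.Println(scrubEmails("Contact jane.doe@example.com for details."))
	// prints: Contact [EMAIL] for details.
}
```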

type PolicyInput ¶

type PolicyInput struct {
	TenantID     string `json:"tenant_id"`
	DataTier     string `json:"data_tier"`    // public | internal | confidential | restricted
	PIIDetected  bool   `json:"pii_detected"` // from classifier pre-scan
	PIISeverity  string `json:"pii_severity"` // none | low | high
	Model        string `json:"model"`
	RequestType  string `json:"request_type"`  // completion | embedding | tool_call
	CacheEnabled bool   `json:"cache_enabled"` // from tenant/config
}

PolicyInput is the input to the cache eligibility Rego policy.

type PolicyResult ¶

type PolicyResult struct {
	AllowLookup bool `json:"allow_lookup"`
	AllowStore  bool `json:"allow_store"`
}

PolicyResult is the result of cache policy evaluation.
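
One plausible way for a caller to act on the two flags is sketched below. It does not reproduce the rules in cache.rego, and every name in it (`policyResult`, `complete`, the map-backed cache) is hypothetical. The map keyed by prompt is only a stand-in for Store; the real cache stores an embedding, not raw prompt text.

```go
package main

import "fmt"

// policyResult mirrors the two booleans of PolicyResult for this sketch.
type policyResult struct {
	allowLookup bool
	allowStore  bool
}

// complete consults the cache only when lookup is allowed and writes to it
// only when store is allowed. The second return value reports a cache hit.
func complete(prompt string, pol policyResult, cache map[string]string, callLLM func(string) string) (string, bool) {
	if pol.allowLookup {
		if resp, ok := cache[prompt]; ok {
			return resp, true
		}
	}
	resp := callLLM(prompt)
	if pol.allowStore {
		cache[prompt] = resp
	}
	return resp, false
}

func main() {
	cache := map[string]string{}
	llm := func(p string) string { return "answer to: " + p }
	pol := policyResult{allowLookup: true, allowStore: true}

	_, hit := complete("What is GDPR?", pol, cache, llm)
	fmt.Println(hit) // false: first call goes to the LLM
	_, hit = complete("What is GDPR?", pol, cache, llm)
	fmt.Println(hit) // true: second call is served from the cache
}
```

Keeping lookup and store as separate decisions lets a policy allow reads of existing entries while refusing to persist a response, or the reverse.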

type SimilarityFunc ¶

type SimilarityFunc func(queryBlob, candidateBlob []byte) (float64, error)

SimilarityFunc compares query embedding blob to a candidate's embedding blob and returns a similarity score in [0, 1]. Used by Lookup to find the best match.

type Store ¶

type Store struct {
	// contains filtered or unexported fields
}

Store persists semantic cache entries in SQLite with HMAC integrity.

func NewStore ¶

func NewStore(dbPath string, signingKey string) (*Store, error)

NewStore opens or creates the cache SQLite DB and applies the schema.

func (*Store) Close ¶

func (s *Store) Close() error

Close closes the database connection.

func (*Store) CountByTenant ¶

func (s *Store) CountByTenant(ctx context.Context, tenantID string) (int, error)

CountByTenant returns the number of cache entries for the tenant (for max_entries_per_tenant enforcement).

func (*Store) DeleteExpired ¶

func (s *Store) DeleteExpired(ctx context.Context) (int64, error)

DeleteExpired removes entries where expires_at < now. Returns the number of rows deleted.

func (*Store) EraseTenant ¶

func (s *Store) EraseTenant(ctx context.Context, tenantID string) (int64, error)

EraseTenant deletes all cache entries for the tenant (GDPR Article 17). Returns count deleted.

func (*Store) EraseTenantUser ¶

func (s *Store) EraseTenantUser(ctx context.Context, tenantID, userID string) (int64, error)

EraseTenantUser deletes all cache entries for the tenant and user (GDPR Article 17). Returns count deleted. Only entries with the given user_id are removed; entries with NULL user_id are not deleted by this call.
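
The user-level erasure semantics can be sketched in memory as follows. The real method runs against SQLite; the `entry` type and `eraseTenantUser` function here are hypothetical, and an empty `userID` field stands in for a NULL user_id column.

```go
package main

import "fmt"

// entry is a reduced stand-in for a cache record.
type entry struct {
	id, tenantID, userID string // userID is empty when the entry has no user attribution
}

// eraseTenantUser returns the surviving entries and the number removed.
// Entries without a user ID are never matched, mirroring the NULL behavior
// described above.
func eraseTenantUser(entries []entry, tenantID, userID string) ([]entry, int64) {
	kept := make([]entry, 0, len(entries))
	var deleted int64
	for _, e := range entries {
		if userID != "" && e.tenantID == tenantID && e.userID == userID {
			deleted++
			continue
		}
		kept = append(kept, e)
	}
	return kept, deleted
}

func main() {
	entries := []entry{
		{"1", "acme-corp", "alice"},
		{"2", "acme-corp", ""},
		{"3", "acme-corp", "bob"},
		{"4", "other-co", "alice"},
	}
	kept, n := eraseTenantUser(entries, "acme-corp", "alice")
	fmt.Println(n, len(kept)) // prints: 1 3
}
```

Entries stored without a user ID survive a user-level erasure, so removing them requires a tenant-level call such as EraseTenant.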

func (*Store) GetByID ¶

func (s *Store) GetByID(ctx context.Context, id string) (*Entry, error)

GetByID returns the cache entry by ID, or nil if not found.

func (*Store) IncrementHitCount ¶

func (s *Store) IncrementHitCount(ctx context.Context, id string) error

IncrementHitCount increments hit_count and sets last_accessed for the entry.

func (*Store) Insert ¶

func (s *Store) Insert(ctx context.Context, e *Entry) error

Insert stores a new cache entry and signs it. ID is set if empty.
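
The documentation does not say which fields are signed or which HMAC variant is used, so the following is only a sketch of the general technique. It assumes HMAC-SHA256 over the tenant ID, cache key, and response text; the names `sign` and `verify` are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns a hex-encoded HMAC-SHA256 over the given fields.
// Assumption: the choice of fields and the encoding are illustrative only.
func sign(key []byte, tenantID, cacheKey, responseText string) string {
	mac := hmac.New(sha256.New, key)
	for _, f := range []string{tenantID, cacheKey, responseText} {
		// Length-prefix each field so field boundaries are unambiguous.
		fmt.Fprintf(mac, "%d:%s|", len(f), f)
	}
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the signature and compares it in constant time.
func verify(key []byte, tenantID, cacheKey, responseText, sig string) bool {
	expected := sign(key, tenantID, cacheKey, responseText)
	return hmac.Equal([]byte(expected), []byte(sig))
}

func main() {
	key := []byte("example-signing-key")
	sig := sign(key, "acme-corp", "k1", "cached answer")
	fmt.Println(verify(key, "acme-corp", "k1", "cached answer", sig))   // true
	fmt.Println(verify(key, "acme-corp", "k1", "tampered answer", sig)) // false
}
```

Verifying the signature on read lets the cache detect entries that were modified outside the store.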

func (*Store) ListTenants ¶

func (s *Store) ListTenants(ctx context.Context) ([]string, error)

ListTenants returns distinct tenant IDs that have cache entries (for CLI/stats).

func (*Store) Lookup ¶

func (s *Store) Lookup(ctx context.Context, tenantID string, queryEmbedding []byte, threshold float64, maxCandidates int, sim SimilarityFunc) (*LookupResult, error)

Lookup finds the best-matching cache entry for the tenant and query embedding using the provided similarity function. Returns nil if no candidate exceeds the threshold. maxCandidates limits how many entries are loaded for comparison (e.g. 1000). The returned LookupResult includes the actual similarity score so callers can record it in evidence (audit trail) instead of the configured threshold.
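
The selection logic described for Lookup can be sketched over an in-memory slice. This is not the SQLite-backed implementation: the types and the `lookup` function are hypothetical, and treating a score equal to the threshold as a hit is an assumption (the text above says "exceeds").

```go
package main

import "fmt"

type entry struct {
	tenantID  string
	embedding []byte
	response  string
}

type similarityFunc func(query, candidate []byte) (float64, error)

type lookupResult struct {
	entry      *entry
	similarity float64
}

// lookup scores at most maxCandidates entries of the given tenant and returns
// the best one at or above threshold, or nil when none qualifies.
func lookup(entries []entry, tenantID string, query []byte, threshold float64, maxCandidates int, sim similarityFunc) (*lookupResult, error) {
	var best *lookupResult
	seen := 0
	for i := range entries {
		if entries[i].tenantID != tenantID {
			continue // tenant isolation: never score another tenant's entries
		}
		if seen >= maxCandidates {
			break
		}
		seen++
		score, err := sim(query, entries[i].embedding)
		if err != nil {
			return nil, err
		}
		if score >= threshold && (best == nil || score > best.similarity) {
			best = &lookupResult{entry: &entries[i], similarity: score}
		}
	}
	return best, nil
}

func main() {
	entries := []entry{
		{tenantID: "acme-corp", embedding: []byte("aaaa"), response: "cached answer"},
		{tenantID: "other-co", embedding: []byte("aaab"), response: "other tenant"},
	}
	// Toy similarity: fraction of positions with equal bytes.
	sim := func(q, c []byte) (float64, error) {
		if len(q) != len(c) || len(q) == 0 {
			return 0, nil
		}
		same := 0
		for i := range q {
			if q[i] == c[i] {
				same++
			}
		}
		return float64(same) / float64(len(q)), nil
	}
	res, err := lookup(entries, "acme-corp", []byte("aaab"), 0.7, 1000, sim)
	if err != nil {
		panic(err)
	}
	if res != nil {
		fmt.Printf("hit %q at similarity %.2f\n", res.entry.response, res.similarity)
	}
}
```

The result carries the measured score (0.75 in this run) rather than the configured threshold (0.7), which is the value an audit record should contain.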
