Documentation ¶
Overview ¶
Package cache's embedder provides BM25-style text similarity for the semantic cache (Option C). No raw prompt text is stored; only a serialized term vector is kept for similarity lookup.
Package cache derives cache entry keys deterministically. This is cache key derivation, not password hashing.
Package cache's PII scrubber wraps the classifier so that LLM responses are PII-scrubbed before they are stored in the semantic cache.
Package cache's policy evaluator evaluates OPA cache eligibility for both lookup and store.
Package cache provides a governed semantic cache for LLM completions.
The cache is a gateway-level cost optimization: it stores prompt embeddings (not raw prompts) and PII-scrubbed responses, with strict tenant isolation, configurable TTL, and GDPR Article 17 erasure. See internal/cache/README.md for the distinction between cache and memory.
Index ¶
- func DeriveEntryKey(tenantID, model, prompt string) string
- func TenantIDForCacheKey(tenantID string) string
- type BM25
- type Entry
- type Evaluator
- type LookupResult
- type PIIScrubber
- type PolicyInput
- type PolicyResult
- type SimilarityFunc
- type Store
- func (s *Store) Close() error
- func (s *Store) CountByTenant(ctx context.Context, tenantID string) (int, error)
- func (s *Store) DeleteExpired(ctx context.Context) (int64, error)
- func (s *Store) EraseTenant(ctx context.Context, tenantID string) (int64, error)
- func (s *Store) EraseTenantUser(ctx context.Context, tenantID, userID string) (int64, error)
- func (s *Store) GetByID(ctx context.Context, id string) (*Entry, error)
- func (s *Store) IncrementHitCount(ctx context.Context, id string) error
- func (s *Store) Insert(ctx context.Context, e *Entry) error
- func (s *Store) ListTenants(ctx context.Context) ([]string, error)
- func (s *Store) Lookup(ctx context.Context, tenantID string, queryEmbedding []byte, threshold float64, ...) (*LookupResult, error)
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func DeriveEntryKey ¶
func DeriveEntryKey(tenantID, model, prompt string) string
DeriveEntryKey returns a deterministic cache entry key from tenant, model, and prompt. Uses SHA-2 (SHA-256) for cache key derivation only: same inputs produce the same key for lookup/insert. No password or secret is hashed; inputs are tenant id, model name, and prompt text. SHA-2 (or SHA-3) is appropriate for non-password uses like cache keys.
Precondition: callers must pass only non-secret values: tenant identifier (e.g. from caller config, used as cache scope), model name, and prompt text. Do not pass API keys, passwords, or other secrets. The tenant ID is typically from config lookup (by API key), not the API key itself; it is an identifier like "acme-corp", not sensitive data.
func TenantIDForCacheKey ¶
func TenantIDForCacheKey(tenantID string) string
TenantIDForCacheKey documents that the given string is a tenant identifier for cache scoping (e.g. from caller config), not an API key or secret. Use at call sites to make the non-secret use explicit for static analysis.
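The deterministic key derivation described for DeriveEntryKey can be illustrated with a minimal sketch. The function name deriveKeySketch, the length-prefixed field encoding, and the hex output are assumptions made for illustration; the documentation only states that SHA-256 is applied to tenant, model, and prompt, so the package's actual byte layout may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// deriveKeySketch is a hypothetical stand-in for DeriveEntryKey. It hashes the
// three inputs with SHA-256 and writes each field's length before its bytes so
// that ("ab", "c") and ("a", "bc") cannot produce the same input stream.
// The real encoding used by the package is not documented and may differ.
func deriveKeySketch(tenantID, model, prompt string) string {
	h := sha256.New()
	for _, field := range []string{tenantID, model, prompt} {
		fmt.Fprintf(h, "%d:", len(field)) // length prefix (assumed layout)
		h.Write([]byte(field))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := deriveKeySketch("acme-corp", "example-model", "What is BM25?")
	b := deriveKeySketch("acme-corp", "example-model", "What is BM25?")
	c := deriveKeySketch("other-corp", "example-model", "What is BM25?")
	fmt.Println(a == b) // same inputs give the same key
	fmt.Println(a == c) // a different tenant gives a different key
	fmt.Println(len(a)) // length of a hex-encoded SHA-256 digest
}
```

Including the tenant identifier in the hashed input is what scopes a key to one tenant: identical prompts from two tenants map to different keys.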
Types ¶
type BM25 ¶
type BM25 struct {
// MinTermLen ignores tokens shorter than this (default 2).
MinTermLen int
}
BM25 is a pure-Go BM25-style embedder: tokenize text and produce a term vector blob. No external model or CGO; deterministic and suitable for exact and near-exact match caching.
func (*BM25) Embed ¶
Embed tokenizes text and returns a serialized term vector (blob) for storage in the cache. The blob does not contain raw text; it is used only for similarity comparison via Similarity.
func (*BM25) Similarity ¶
Similarity computes cosine similarity between two term-vector blobs (from Embed). Returns a value in [0, 1]; 1 means identical vectors. Safe to use as cache.SimilarityFunc.
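As a rough illustration of the Embed and Similarity steps, the sketch below builds a term vector and compares two vectors with cosine similarity. It is a simplification under stated assumptions: the helper names termVector and cosine are hypothetical, raw term counts stand in for whatever BM25-style weighting the package applies, and an in-memory map stands in for the serialized blob, whose format is not documented.

```go
package main

import (
	"fmt"
	"math"
	"strings"
	"unicode"
)

// termVector lowercases text, splits it on anything that is not a letter or
// digit, drops tokens shorter than minTermLen, and counts the remaining terms.
// Raw counts are an assumption; the package's actual weighting is not shown.
func termVector(text string, minTermLen int) map[string]float64 {
	tokens := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	vec := make(map[string]float64)
	for _, tok := range tokens {
		if len([]rune(tok)) >= minTermLen {
			vec[tok]++
		}
	}
	return vec
}

// cosine returns the cosine similarity of two term vectors. With non-negative
// weights the result lies in [0, 1]; empty vectors score 0. The final clamp
// guards against floating-point rounding slightly above 1.
func cosine(a, b map[string]float64) float64 {
	var dot, normA, normB float64
	for term, wa := range a {
		dot += wa * b[term]
		normA += wa * wa
	}
	for _, wb := range b {
		normB += wb * wb
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return math.Min(1, dot/(math.Sqrt(normA)*math.Sqrt(normB)))
}

func main() {
	q := termVector("How do I reset my password?", 2)
	near := termVector("how do i reset my password", 2)
	far := termVector("Summarize the quarterly revenue report", 2)
	fmt.Printf("near=%.2f far=%.2f\n", cosine(q, near), cosine(q, far))
}
```

Because the vector holds only terms and counts, word order and punctuation are lost, which is why prompts that differ only in case or punctuation score as near-identical.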
func (*BM25) SimilarityFunc ¶
func (b *BM25) SimilarityFunc() SimilarityFunc
SimilarityFunc returns a cache.SimilarityFunc that uses this BM25 embedder.
type Entry ¶
type Entry struct {
ID string
TenantID string
UserID string // Optional; for user-level GDPR erasure
CacheKey string
EmbeddingData []byte
ResponseText string
Model string
DataTier string
PIIScrubbed bool
HitCount int64
CreatedAt time.Time
ExpiresAt time.Time
LastAccessed *time.Time
HMACSignature string
}
Entry is a single semantic cache record.
type Evaluator ¶
type Evaluator struct {
// contains filtered or unexported fields
}
Evaluator evaluates cache eligibility policy.
func NewEvaluator ¶
NewEvaluator compiles the embedded cache.rego and returns an evaluator.
func (*Evaluator) Evaluate ¶
func (e *Evaluator) Evaluate(ctx context.Context, input *PolicyInput) (*PolicyResult, error)
Evaluate returns whether cache lookup and store are allowed for the input.
type LookupResult ¶
LookupResult is the return type of Store.Lookup. It includes the matching entry and the actual similarity score (in [0, 1]) so callers can record accurate audit data instead of the configured threshold.
type PIIScrubber ¶
type PIIScrubber struct {
// contains filtered or unexported fields
}
PIIScrubber wraps the PII classifier's Redact to produce cache-safe response text. Responses are scrubbed (PII replaced with placeholders like [EMAIL]) before storage.
func NewPIIScrubber ¶
func NewPIIScrubber(scanner *classifier.Scanner) *PIIScrubber
NewPIIScrubber returns a scrubber that uses the given classifier scanner.
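The placeholder-style redaction can be pictured with a small stand-in. This sketch does not use the classifier package; scrubEmails and its regular expression are hypothetical and cover only email addresses, whereas the real scanner presumably detects more PII categories.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailPattern is a deliberately simple email matcher used only for this
// illustration; it is not the classifier's detection logic.
var emailPattern = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// scrubEmails replaces each email address with the [EMAIL] placeholder and
// reports whether anything was replaced.
func scrubEmails(response string) (string, bool) {
	scrubbed := emailPattern.ReplaceAllString(response, "[EMAIL]")
	return scrubbed, scrubbed != response
}

func main() {
	out, changed := scrubEmails("Contact jane@example.com today")
	fmt.Println(out, changed)
}
```

A caller storing a response could record the boolean in a field such as Entry.PIIScrubbed, though how the package sets that field is not stated in this documentation.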
type PolicyInput ¶
type PolicyInput struct {
TenantID string `json:"tenant_id"`
DataTier string `json:"data_tier"` // public | internal | confidential | restricted
PIIDetected bool `json:"pii_detected"` // from classifier pre-scan
PIISeverity string `json:"pii_severity"` // none | low | high
Model string `json:"model"`
RequestType string `json:"request_type"` // completion | embedding | tool_call
CacheEnabled bool `json:"cache_enabled"` // from tenant/config
}
PolicyInput is the input to the cache eligibility Rego policy.
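To show the shape of the document a Rego policy would see, the sketch below redeclares PolicyInput locally (so the example compiles on its own) and marshals it using the documented JSON tags. The helper policyInputJSON and the field values, including the model name "example-model", are placeholders chosen for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PolicyInput mirrors the documented struct and its JSON tags.
type PolicyInput struct {
	TenantID     string `json:"tenant_id"`
	DataTier     string `json:"data_tier"`
	PIIDetected  bool   `json:"pii_detected"`
	PIISeverity  string `json:"pii_severity"`
	Model        string `json:"model"`
	RequestType  string `json:"request_type"`
	CacheEnabled bool   `json:"cache_enabled"`
}

// policyInputJSON is a hypothetical helper that renders the input as JSON.
func policyInputJSON(in PolicyInput) (string, error) {
	b, err := json.Marshal(in)
	return string(b), err
}

func main() {
	doc, err := policyInputJSON(PolicyInput{
		TenantID:     "acme-corp",
		DataTier:     "internal",
		PIISeverity:  "none",
		Model:        "example-model", // placeholder model name
		RequestType:  "completion",
		CacheEnabled: true,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(doc)
}
```

The snake_case keys are the names a policy rule would reference, for example input.data_tier or input.pii_detected.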
type PolicyResult ¶
type PolicyResult struct {
AllowLookup bool `json:"allow_lookup"`
AllowStore bool `json:"allow_store"`
}
PolicyResult is the result of cache policy evaluation.
type SimilarityFunc ¶
SimilarityFunc compares a query embedding blob to a candidate's embedding blob and returns a similarity score in [0, 1]. Used by Lookup to find the best match.
type Store ¶
type Store struct {
// contains filtered or unexported fields
}
Store persists semantic cache entries in SQLite with HMAC integrity.
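One way to picture HMAC integrity on a stored entry is sketched below. Everything specific here is an assumption: the documentation does not say which hash the HMAC uses, which Entry fields are covered, or how the signature is encoded, so HMAC-SHA256 over three length-prefixed fields with hex output is chosen only for illustration, and signEntry and verifyEntry are hypothetical names.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// entryMAC computes HMAC-SHA256 over length-prefixed fields (assumed layout).
func entryMAC(key []byte, fields ...string) []byte {
	mac := hmac.New(sha256.New, key)
	for _, f := range fields {
		fmt.Fprintf(mac, "%d:", len(f))
		mac.Write([]byte(f))
	}
	return mac.Sum(nil)
}

// signEntry returns a hex signature suitable for a field like HMACSignature.
func signEntry(key []byte, tenantID, cacheKey, responseText string) string {
	return hex.EncodeToString(entryMAC(key, tenantID, cacheKey, responseText))
}

// verifyEntry recomputes the MAC and compares it in constant time.
func verifyEntry(key []byte, tenantID, cacheKey, responseText, signature string) bool {
	want, err := hex.DecodeString(signature)
	if err != nil {
		return false
	}
	return hmac.Equal(entryMAC(key, tenantID, cacheKey, responseText), want)
}

func main() {
	key := []byte("example-signing-key") // placeholder key for the demo
	sig := signEntry(key, "acme-corp", "k1", "cached response")
	fmt.Println(verifyEntry(key, "acme-corp", "k1", "cached response", sig))
	fmt.Println(verifyEntry(key, "acme-corp", "k1", "tampered response", sig))
}
```

Verifying on read means a row modified directly in the database file fails the check instead of being served as a cache hit.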
func (*Store) CountByTenant ¶
CountByTenant returns the number of cache entries for the tenant (for max_entries_per_tenant enforcement).
func (*Store) DeleteExpired ¶
DeleteExpired removes entries where expires_at < now. Returns the number of rows deleted.
func (*Store) EraseTenant ¶
EraseTenant deletes all cache entries for the tenant (GDPR Article 17). Returns the number of entries deleted.
func (*Store) EraseTenantUser ¶
EraseTenantUser deletes all cache entries for the tenant and user (GDPR Article 17). Returns count deleted. Only entries with the given user_id are removed; entries with NULL user_id are not deleted by this call.
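The user-level erasure rule can be modelled in memory to make the NULL user_id behaviour concrete. This is a sketch, not the SQLite implementation: the entry type and eraseTenantUser function are hypothetical, and an empty UserID string stands in for a NULL column.

```go
package main

import "fmt"

// entry is a reduced, hypothetical view of a cache record.
// An empty UserID stands in for a NULL user_id column.
type entry struct {
	ID       string
	TenantID string
	UserID   string
}

// eraseTenantUser removes entries that belong to both the tenant and the user.
// Entries without a user ID are kept, matching the documented behaviour.
func eraseTenantUser(entries []entry, tenantID, userID string) ([]entry, int64) {
	kept := make([]entry, 0, len(entries))
	var deleted int64
	for _, e := range entries {
		if e.TenantID == tenantID && e.UserID != "" && e.UserID == userID {
			deleted++
			continue
		}
		kept = append(kept, e)
	}
	return kept, deleted
}

func main() {
	entries := []entry{
		{ID: "1", TenantID: "acme-corp", UserID: "alice"},
		{ID: "2", TenantID: "acme-corp", UserID: ""},
		{ID: "3", TenantID: "acme-corp", UserID: "bob"},
		{ID: "4", TenantID: "other-corp", UserID: "alice"},
	}
	kept, deleted := eraseTenantUser(entries, "acme-corp", "alice")
	fmt.Println(deleted, len(kept))
}
```

Entries stored without a user ID survive a user-level erasure, so removing them requires a tenant-wide EraseTenant call.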
func (*Store) IncrementHitCount ¶
IncrementHitCount increments hit_count and sets last_accessed for the entry.
func (*Store) ListTenants ¶
ListTenants returns distinct tenant IDs that have cache entries (for CLI/stats).
func (*Store) Lookup ¶
func (s *Store) Lookup(ctx context.Context, tenantID string, queryEmbedding []byte, threshold float64, maxCandidates int, sim SimilarityFunc) (*LookupResult, error)
Lookup finds the best-matching cache entry for the tenant and query embedding using the provided similarity function. Returns nil if no candidate exceeds the threshold. maxCandidates limits how many entries are loaded for comparison (e.g. 1000). The returned LookupResult includes the actual similarity score so callers can record it in evidence (audit trail) instead of the configured threshold.
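The selection logic described for Lookup can be sketched over an in-memory slice. The types candidate and match, the function bestMatch, and the toy byteOverlap similarity are all hypothetical. Two details are assumptions: a score equal to the threshold is treated as a hit here (the text above says "exceeds", which may mean strictly greater), and maxCandidates is applied by truncating the slice rather than by a database query limit.

```go
package main

import "fmt"

// candidate is a reduced, hypothetical view of a stored entry.
type candidate struct {
	ID        string
	Embedding []byte
}

// match carries the winning entry ID and its actual similarity score.
type match struct {
	ID    string
	Score float64
}

// bestMatch scores at most maxCandidates entries and returns the highest
// scoring one at or above threshold, or nil when none qualifies.
func bestMatch(cands []candidate, query []byte, threshold float64,
	maxCandidates int, sim func(q, c []byte) float64) *match {
	if maxCandidates > 0 && len(cands) > maxCandidates {
		cands = cands[:maxCandidates]
	}
	var best *match
	for _, c := range cands {
		score := sim(query, c.Embedding)
		if score < threshold {
			continue // below threshold (equality counts as a hit: assumption)
		}
		if best == nil || score > best.Score {
			best = &match{ID: c.ID, Score: score}
		}
	}
	return best
}

// byteOverlap is a toy similarity: the fraction of positions with equal bytes.
func byteOverlap(a, b []byte) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	same := 0
	for i := range a {
		if a[i] == b[i] {
			same++
		}
	}
	return float64(same) / float64(len(a))
}

func main() {
	cands := []candidate{
		{ID: "e1", Embedding: []byte("abcd")},
		{ID: "e2", Embedding: []byte("abcx")},
		{ID: "e3", Embedding: []byte("wxyz")},
	}
	if m := bestMatch(cands, []byte("abcx"), 0.7, 1000, byteOverlap); m != nil {
		fmt.Printf("hit %s score=%.2f\n", m.ID, m.Score)
	}
	fmt.Println(bestMatch(cands, []byte("qqqq"), 0.7, 1000, byteOverlap) == nil)
}
```

Returning the measured score alongside the entry lets the caller log the real similarity of a hit, which is more useful in an audit trail than the threshold that was merely cleared.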