Documentation ¶
Overview ¶
Package kg implements a bitemporal knowledge graph backed by SQLite.
Data Model ¶
The KG stores SPO (Subject-Predicate-Object) triples about entities extracted from conversation memories. Each triple carries two time dimensions:
- Valid time (valid_from/valid_until): when the fact is true in the real world
- Transaction time (txn_from/txn_until): when the row exists in the database
A functional predicate allows at most one currently valid triple per (subject, predicate) pair: inserting a new value automatically invalidates the old one.
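The invalidation rule can be illustrated with a small in-memory sketch. It is not the package's implementation: the fact and store types and the addFunctional and current helpers are hypothetical, only valid time is modeled (transaction time is omitted), and an empty ValidUntil is assumed to mean "still valid".

```go
package main

import "fmt"

// fact is a hypothetical in-memory stand-in for one KG row.
// Timestamps are strings, as in the documented Triple type.
type fact struct {
	Subject, Predicate, Object string
	ValidFrom                  string
	ValidUntil                 string // empty means "still valid" (assumption)
}

// store is a hypothetical in-memory stand-in for the triples table.
type store struct {
	facts []fact
}

// addFunctional closes every currently valid fact for the same
// (subject, predicate) pair, then appends the new value.
func (s *store) addFunctional(subject, predicate, object, now string) {
	for i := range s.facts {
		f := &s.facts[i]
		if f.Subject == subject && f.Predicate == predicate && f.ValidUntil == "" {
			f.ValidUntil = now
		}
	}
	s.facts = append(s.facts, fact{
		Subject: subject, Predicate: predicate, Object: object, ValidFrom: now,
	})
}

// current returns the facts about subject that are still valid.
func (s *store) current(subject string) []fact {
	var out []fact
	for _, f := range s.facts {
		if f.Subject == subject && f.ValidUntil == "" {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	s := &store{}
	s.addFunctional("Maria", "lives_in", "Lisbon", "2024-01-01")
	s.addFunctional("Maria", "lives_in", "Porto", "2024-06-01")
	for _, f := range s.current("Maria") {
		fmt.Println(f.Subject, f.Predicate, f.Object, "since", f.ValidFrom)
	}
}
```

The older row is kept with its ValidUntil set rather than deleted, which is what lets a timeline query reconstruct past values.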
Entity aliases enable case- and accent-insensitive matching: "Maria", "maria", and "María" all resolve to the same entity_id.
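One way to get this behavior is to normalize every name before it is looked up or stored. The sketch below is an assumption about the approach, not the package's code: normalizeAlias and the accentFold table are hypothetical, the table covers only a few Latin letters, and input is assumed to use precomposed characters (NFC).

```go
package main

import (
	"fmt"
	"strings"
)

// accentFold maps a few accented Latin letters to their base letter.
// It is a deliberately small, hypothetical table for illustration.
var accentFold = map[rune]rune{
	'á': 'a', 'à': 'a', 'â': 'a', 'ã': 'a', 'ä': 'a',
	'é': 'e', 'è': 'e', 'ê': 'e', 'ë': 'e',
	'í': 'i', 'ì': 'i', 'î': 'i', 'ï': 'i',
	'ó': 'o', 'ò': 'o', 'ô': 'o', 'õ': 'o', 'ö': 'o',
	'ú': 'u', 'ù': 'u', 'û': 'u', 'ü': 'u',
	'ç': 'c', 'ñ': 'n',
}

// normalizeAlias trims and lowercases name, then replaces each rune
// found in accentFold with its unaccented base letter.
func normalizeAlias(name string) string {
	lowered := strings.ToLower(strings.TrimSpace(name))
	return strings.Map(func(r rune) rune {
		if base, ok := accentFold[r]; ok {
			return base
		}
		return r
	}, lowered)
}

func main() {
	for _, n := range []string{"Maria", "maria", "María"} {
		fmt.Printf("%q -> %q\n", n, normalizeAlias(n))
	}
}
```

A production version would more likely use Unicode normalization to strip combining marks instead of a hand-written table.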
ADR References ¶
- ADR-003: KG shares the same SQLite file as the memory store (no separate DB)
- ADR-009: Parameterized SQL required for all user-facing queries
Wing Inheritance ¶
Triples inherit the wing of their source memory. A triple whose wing IS NULL is neutral: it is never filtered, never penalized, and never downgraded.
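The neutrality rule can be pictured as a filter that always lets wing-less triples through. This is a sketch under assumptions: the triple type and visibleIn are hypothetical, a nil pointer stands in for SQL NULL, and the exact-match rule for non-NULL wings is assumed rather than documented.

```go
package main

import "fmt"

// triple is a hypothetical, trimmed-down row.
// Wing is nil when the column is NULL.
type triple struct {
	Object string
	Wing   *string
}

// visibleIn keeps triples whose wing equals wing, plus every
// neutral (nil-wing) triple.
func visibleIn(triples []triple, wing string) []triple {
	var out []triple
	for _, t := range triples {
		if t.Wing == nil || *t.Wing == wing {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	work, home := "work", "home"
	all := []triple{{"Acme", &work}, {"Porto", &home}, {"Maria", nil}}
	for _, t := range visibleIn(all, "work") {
		fmt.Println(t.Object)
	}
}
```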
Index ¶
- Constants
- Variables
- func DefaultPatternSets() []*patterns.PatternSet
- func IsAutoExtractEnabled(mode string) bool
- func IsPIIPredicate(predicate string) bool
- type Direction
- type ExtractionConfig
- type ExtractionResponse
- type ExtractionResult
- type Extractor
- type KG
- func (k *KG) AddTriple(ctx context.Context, subjectName, predicateName, objectText string, ...) (int64, error)
- func (k *KG) AutoRegisterAlias(ctx context.Context, entityID int64, canonicalName string) error
- func (k *KG) CurrentFacts(ctx context.Context, subjectName string) ([]Triple, error)
- func (k *KG) DB() *sql.DB
- func (k *KG) DeleteEntity(ctx context.Context, entityName string) error
- func (k *KG) EnsureEntity(ctx context.Context, name string) (int64, error)
- func (k *KG) EnsurePredicate(ctx context.Context, name string, isFunctional bool, description string) (int64, error)
- func (k *KG) InvalidateTriple(ctx context.Context, tripleID int64) error
- func (k *KG) QueryEntity(ctx context.Context, entityName string, dir Direction) ([]Triple, error)
- func (k *KG) RegisterAlias(ctx context.Context, entityID int64, alias string) error
- func (k *KG) ResolveAlias(ctx context.Context, name string) (int64, error)
- func (k *KG) Timeline(ctx context.Context, opts TimelineOpts) ([]Triple, error)
- type LLMExtractor
- type LLMProvider
- type TimelineOpts
- type Triple
- type TripleJSON
- type TripleOpts
Constants ¶
const KgDropSchema = `` /* 608-byte string literal not displayed */
KgDropSchema reverses the KG migration by dropping the KG tables and indexes. It is safe to run on any database because every statement uses IF EXISTS.
const KgSchema = `` /* 3464-byte string literal not displayed */
KgSchema contains all CREATE TABLE and CREATE INDEX statements for the Knowledge Graph bitemporal store. All statements are idempotent (IF NOT EXISTS).
Variables ¶
var PII_PREDICATE_BLACKLIST = map[string]bool{
	"ssn":          true,
	"credit_card":  true,
	"password":     true,
	"address":      true,
	"cpf":          true,
	"rg":           true,
	"phone":        true,
	"email":        true,
	"bank_account": true,
}
PII_PREDICATE_BLACKLIST lists predicates that are ALWAYS dropped before insertion, regardless of source. This is a data-type classification, not a language keyword list.
Functions ¶
func DefaultPatternSets ¶
func DefaultPatternSets() []*patterns.PatternSet
DefaultPatternSets returns the built-in pattern sets (pt-br + en). Parsed once and cached via sync.Once. Returns nil on parse error.
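The parse-once-and-cache behavior described above follows a common sync.Once pattern. The sketch below shows that pattern only: parseBuiltins, defaultSets, and the string slice standing in for []*patterns.PatternSet are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	once       sync.Once
	cached     []string // stand-in for the parsed pattern sets
	parseCalls int      // counts how often the parser actually runs
)

// parseBuiltins is a hypothetical parser for the built-in sets.
func parseBuiltins() ([]string, error) {
	parseCalls++
	return []string{"pt-br", "en"}, nil
}

// defaultSets parses on the first call only and returns the cached
// result afterwards. On a parse error the cache stays nil.
func defaultSets() []string {
	once.Do(func() {
		sets, err := parseBuiltins()
		if err != nil {
			return
		}
		cached = sets
	})
	return cached
}

func main() {
	fmt.Println(defaultSets())
	fmt.Println(defaultSets())
	fmt.Println("parse calls:", parseCalls)
}
```

Because sync.Once never retries, a failed parse in this pattern yields nil on every later call as well, so callers need a nil check.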
func IsAutoExtractEnabled ¶
func IsAutoExtractEnabled(mode string) bool
IsAutoExtractEnabled returns true if any extraction mode is enabled.
func IsPIIPredicate ¶
func IsPIIPredicate(predicate string) bool
IsPIIPredicate returns true if the predicate is listed in PII_PREDICATE_BLACKLIST.
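A blacklist check like this is typically applied as a filter before insertion. The following sketch assumes an exact, case-sensitive map lookup; piiBlacklist copies only a subset of the documented values, and candidate and dropPII are hypothetical names.

```go
package main

import "fmt"

// piiBlacklist mirrors a subset of the documented blacklist values.
var piiBlacklist = map[string]bool{
	"ssn":      true,
	"password": true,
	"email":    true,
}

// candidate is a hypothetical extracted triple awaiting insertion.
type candidate struct {
	Subject, Predicate, Object string
}

// dropPII returns the candidates whose predicate is not blacklisted.
// The lookup is exact; any case normalization is left to the caller.
func dropPII(in []candidate) []candidate {
	out := make([]candidate, 0, len(in))
	for _, c := range in {
		if piiBlacklist[c.Predicate] {
			continue
		}
		out = append(out, c)
	}
	return out
}

func main() {
	kept := dropPII([]candidate{
		{"Maria", "email", "maria@example.org"},
		{"Maria", "lives_in", "Porto"},
	})
	for _, c := range kept {
		fmt.Println(c.Subject, c.Predicate, c.Object)
	}
}
```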
Types ¶
type ExtractionConfig ¶
type ExtractionConfig struct {
	// AutoExtract controls which extraction modes are active.
	// Values: "off" (default), "pattern", "llm", "both".
	AutoExtract string

	// LLMBudgetPerCycle is the max memories to process per dream cycle.
	// Default: 20.
	LLMBudgetPerCycle int

	// LLMConsentACK must be true when AutoExtract contains "llm".
	// This is a safety gate: the operator must explicitly acknowledge
	// that memory contents will be sent to an external LLM API.
	LLMConsentACK bool
}
ExtractionConfig holds LLM extractor configuration.
type ExtractionResponse ¶
type ExtractionResponse struct {
	Triples []TripleJSON `json:"triples"`
}
ExtractionResponse is the structured output format expected from the LLM.
type ExtractionResult ¶
type ExtractionResult struct {
	Subject    string
	Predicate  string
	Object     string
	Confidence float64
	RawText    string
}
ExtractionResult is a single (subject, predicate, object) triple found in text.
type Extractor ¶
type Extractor struct {
	// contains filtered or unexported fields
}
Extractor scans free text for SPO triples using configurable regex patterns.
func NewExtractor ¶
NewExtractor compiles all templates from the given pattern sets into regexps. Templates are accent-stripped so that normalized input text matches regardless of diacritics. Clause separators from each PatternSet are collected for sentence splitting before pattern matching.
func (*Extractor) Extract ¶
func (e *Extractor) Extract(text string) []ExtractionResult
Extract scans text and returns all SPO triples matched by the compiled patterns. The text is first split into clauses on the configured clause separators so that multi-predicate sentences are handled correctly. Subsequent clauses inherit the subject from the preceding clause when the clause begins with a verb phrase.
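Clause splitting with subject inheritance can be sketched as follows. Everything here is illustrative: the two regexp rules, the verbPrefixes list, the separators, and the result type are hypothetical, and the real pattern sets are not shown in this documentation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

type result struct{ Subject, Predicate, Object string }

// rules is a hypothetical pattern list; each regexp captures
// (subject, object) for one predicate.
var rules = []struct {
	predicate string
	re        *regexp.Regexp
}{
	{"works_at", regexp.MustCompile(`^(\w+) works at (.+)$`)},
	{"lives_in", regexp.MustCompile(`^(\w+) lives in (.+)$`)},
}

// verbPrefixes mark clauses that begin with a verb phrase and
// therefore carry no subject of their own.
var verbPrefixes = []string{"works at ", "lives in "}

// splitClauses cuts text on each separator in turn.
func splitClauses(text string, seps []string) []string {
	clauses := []string{text}
	for _, sep := range seps {
		var next []string
		for _, c := range clauses {
			next = append(next, strings.Split(c, sep)...)
		}
		clauses = next
	}
	return clauses
}

// extract matches each clause against rules, prepending the previous
// subject when a clause starts with a verb phrase.
func extract(text string) []result {
	var out []result
	lastSubject := ""
	for _, clause := range splitClauses(text, []string{" and ", "; "}) {
		clause = strings.TrimSpace(clause)
		for _, p := range verbPrefixes {
			if lastSubject != "" && strings.HasPrefix(clause, p) {
				clause = lastSubject + " " + clause
				break
			}
		}
		for _, r := range rules {
			if m := r.re.FindStringSubmatch(clause); m != nil {
				out = append(out, result{m[1], r.predicate, m[2]})
				lastSubject = m[1]
				break
			}
		}
	}
	return out
}

func main() {
	for _, r := range extract("Maria works at Acme and lives in Porto") {
		fmt.Println(r.Subject, r.Predicate, r.Object)
	}
}
```

Without the split, the greedy object capture of the first rule would swallow "Acme and lives in Porto" as a single object, which is why clause separation happens before pattern matching.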
func (*Extractor) ExtractAndStore ¶
func (e *Extractor) ExtractAndStore(ctx context.Context, kg *KG, text string, wing string, sourceMemoryID string) (int, error)
ExtractAndStore extracts triples from text and persists them to the knowledge graph. Returns the count of successfully inserted triples. Individual insertion errors are logged but do not abort the batch.
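The "log and continue" batch behavior can be sketched like this. The candidate type, storeAll, and the failOn test double are hypothetical; the real method inserts through the KG rather than through a callback.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// candidate is a hypothetical extracted triple awaiting insertion.
type candidate struct {
	Subject, Predicate, Object string
}

// storeAll calls insert for each candidate, logs failures, and
// returns the number of successful inserts.
func storeAll(cands []candidate, insert func(candidate) error) int {
	stored := 0
	for _, c := range cands {
		if err := insert(c); err != nil {
			log.Printf("kg: skipping %s/%s: %v", c.Subject, c.Predicate, err)
			continue
		}
		stored++
	}
	return stored
}

// failOn returns an insert function that rejects one predicate,
// standing in for a database error.
func failOn(predicate string) func(candidate) error {
	return func(c candidate) error {
		if c.Predicate == predicate {
			return errors.New("simulated insert failure")
		}
		return nil
	}
}

func main() {
	cands := []candidate{
		{"Maria", "works_at", "Acme"},
		{"Maria", "broken", "x"},
		{"Maria", "lives_in", "Porto"},
	}
	fmt.Println("stored:", storeAll(cands, failOn("broken")))
}
```

The returned count is the only signal of partial failure in this design, so a caller that needs per-triple errors has to read the log.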
type KG ¶
type KG struct {
	// contains filtered or unexported fields
}
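func (*KG) AutoRegisterAlias ¶
func (k *KG) AutoRegisterAlias(ctx context.Context, entityID int64, canonicalName string) error
func (*KG) CurrentFacts ¶
func (k *KG) CurrentFacts(ctx context.Context, subjectName string) ([]Triple, error)
func (*KG) EnsureEntity ¶
func (k *KG) EnsureEntity(ctx context.Context, name string) (int64, error)
func (*KG) EnsurePredicate ¶
func (k *KG) EnsurePredicate(ctx context.Context, name string, isFunctional bool, description string) (int64, error)
func (*KG) InvalidateTriple ¶
func (k *KG) InvalidateTriple(ctx context.Context, tripleID int64) error
func (*KG) QueryEntity ¶
func (k *KG) QueryEntity(ctx context.Context, entityName string, dir Direction) ([]Triple, error)
func (*KG) RegisterAlias ¶
func (k *KG) RegisterAlias(ctx context.Context, entityID int64, alias string) error
func (*KG) ResolveAlias ¶
func (k *KG) ResolveAlias(ctx context.Context, name string) (int64, error)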
type LLMExtractor ¶
type LLMExtractor struct {
	// contains filtered or unexported fields
}
LLMExtractor extracts triples using an LLM provider.
It is off by default (AutoExtract="off") and requires explicit consent (LLMConsentACK=true) before any LLM call is made. A circuit breaker prevents cascading failures: after 3 consecutive errors the extractor pauses for 30 minutes.
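A circuit breaker with these thresholds can be sketched in a few lines. The breaker type and its methods are hypothetical and single-goroutine only; the 3-error and 30-minute values come from the description above, while resetting the counter on success is an assumption implied by "consecutive".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	maxConsecutiveErrors = 3
	pauseDuration        = 30 * time.Minute
)

// breaker tracks consecutive failures and the time until which
// calls are paused.
type breaker struct {
	failures  int
	openUntil time.Time
}

// allow reports whether a call may proceed at time now.
func (b *breaker) allow(now time.Time) bool {
	return !now.Before(b.openUntil)
}

// record updates the breaker with the outcome of a call made at now.
// A success resets the failure count; the third consecutive failure
// pauses calls for pauseDuration.
func (b *breaker) record(now time.Time, err error) {
	if err == nil {
		b.failures = 0
		return
	}
	b.failures++
	if b.failures >= maxConsecutiveErrors {
		b.openUntil = now.Add(pauseDuration)
		b.failures = 0
	}
}

// errBoom and at are demo helpers: a fixed error and a fake clock
// expressed in minutes after an arbitrary start time.
var errBoom = errors.New("provider unavailable")

func at(minutes int) time.Time {
	start := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	return start.Add(time.Duration(minutes) * time.Minute)
}

func main() {
	b := &breaker{}
	for i := 0; i < 3; i++ {
		b.record(at(i), errBoom)
	}
	fmt.Println("allowed shortly after third error:", b.allow(at(3)))
	fmt.Println("allowed after the pause:", b.allow(at(32)))
}
```

Passing the clock in as a parameter keeps the pause testable without sleeping.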
func NewLLMExtractor ¶
func NewLLMExtractor(kg *KG, provider LLMProvider, config ExtractionConfig, logger *slog.Logger) (*LLMExtractor, error)
NewLLMExtractor creates a new LLM-backed triple extractor.
Validation rules:
- kg must not be nil
- provider must not be nil
- If config.AutoExtract contains "llm" then config.LLMConsentACK must be true
- The prompt template at prompts/extractor.txt is loaded from the embedded FS; if it cannot be loaded, a default is used
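The first three rules above amount to a short precondition check. The sketch below is an assumption about how such a check could look: config, usesLLM, and validate are hypothetical, booleans stand in for the nil checks, and treating "both" as a mode that requires consent is a guess, since the rule only mentions "llm".

```go
package main

import (
	"errors"
	"fmt"
)

// config copies the two documented fields the consent rule depends on.
type config struct {
	AutoExtract   string
	LLMConsentACK bool
}

// usesLLM reports whether mode sends text to the LLM.
// Treating "both" like "llm" is an assumption of this sketch.
func usesLLM(mode string) bool {
	return mode == "llm" || mode == "both"
}

// validate applies the nil checks and the consent gate and returns
// the first violation it finds.
func validate(hasKG, hasProvider bool, cfg config) error {
	switch {
	case !hasKG:
		return errors.New("kg must not be nil")
	case !hasProvider:
		return errors.New("provider must not be nil")
	case usesLLM(cfg.AutoExtract) && !cfg.LLMConsentACK:
		return errors.New("AutoExtract uses the LLM but LLMConsentACK is false")
	}
	return nil
}

func main() {
	fmt.Println(validate(true, true, config{AutoExtract: "llm"}))
	fmt.Println(validate(true, true, config{AutoExtract: "llm", LLMConsentACK: true}))
}
```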
func (*LLMExtractor) Config ¶
func (e *LLMExtractor) Config() ExtractionConfig
Config returns a copy of the extraction config.
func (*LLMExtractor) ExtractAndStore ¶
func (e *LLMExtractor) ExtractAndStore(ctx context.Context, text, wing, sourceMemoryID string, knownEntities []string) (int, error)
ExtractAndStore runs the full pipeline: extract triples from text via LLM, filter out PII predicates, and store valid triples in the knowledge graph.
Returns the count of successfully inserted triples. Individual insertion errors are logged but do not abort the batch. PII-blacklisted predicates are silently dropped.
func (*LLMExtractor) ExtractFromText ¶
func (e *LLMExtractor) ExtractFromText(ctx context.Context, text string, knownEntities []string) ([]TripleJSON, error)
ExtractFromText calls the LLM provider with the configured prompt template and parses the JSON response into structured triples.
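Decoding the provider's reply into the documented ExtractionResponse shape could look like the sketch below. The two types are copied from this page; parseResponse is a hypothetical helper, and dropping triples with an empty field is an assumption, not documented behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// TripleJSON and ExtractionResponse mirror the documented types.
type TripleJSON struct {
	Subject   string `json:"subject"`
	Predicate string `json:"predicate"`
	Object    string `json:"object"`
}

type ExtractionResponse struct {
	Triples []TripleJSON `json:"triples"`
}

// parseResponse decodes raw LLM output and skips triples that have
// an empty subject, predicate, or object.
func parseResponse(raw string) ([]TripleJSON, error) {
	var resp ExtractionResponse
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		return nil, fmt.Errorf("decode extraction response: %w", err)
	}
	valid := make([]TripleJSON, 0, len(resp.Triples))
	for _, t := range resp.Triples {
		if strings.TrimSpace(t.Subject) == "" ||
			strings.TrimSpace(t.Predicate) == "" ||
			strings.TrimSpace(t.Object) == "" {
			continue
		}
		valid = append(valid, t)
	}
	return valid, nil
}

func main() {
	raw := `{"triples":[{"subject":"Maria","predicate":"works_at","object":"Acme"}]}`
	triples, err := parseResponse(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range triples {
		fmt.Println(t.Subject, t.Predicate, t.Object)
	}
}
```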
The caller is responsible for budget enforcement; this method does not track per-cycle counts.
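Caller-side budget enforcement might look like the loop below. runCycle is a hypothetical helper, the callback stands in for a call to ExtractFromText, and counting failed attempts against the budget is an assumption.

```go
package main

import "fmt"

// runCycle calls extract on at most budget memories and returns how
// many it attempted. Failed attempts still consume budget.
func runCycle(memories []string, budget int, extract func(memory string) error) int {
	attempted := 0
	for _, m := range memories {
		if attempted >= budget {
			break
		}
		attempted++
		if err := extract(m); err != nil {
			fmt.Printf("extract failed for %q: %v\n", m, err)
		}
	}
	return attempted
}

func main() {
	memories := []string{"m1", "m2", "m3"}
	n := runCycle(memories, 2, func(m string) error {
		fmt.Println("extracting", m)
		return nil
	})
	fmt.Println("attempted:", n)
}
```

In practice the budget would come from ExtractionConfig.LLMBudgetPerCycle.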
type LLMProvider ¶
LLMProvider is the interface for LLM calls (mockable for tests).
type TimelineOpts ¶
type Triple ¶
type Triple struct {
	TripleID       int64
	SubjectID      int64
	SubjectName    string
	PredicateID    int64
	PredicateName  string
	ObjectText     string
	ObjectEntityID *int64
	ObjectName     string
	ValidFrom      string
	ValidUntil     string
	TxnFrom        string
	TxnUntil       string
	Confidence     float64
	SourceMemoryID string
	Wing           string
	RawText        string
}
type TripleJSON ¶
type TripleJSON struct {
	Subject   string `json:"subject"`
	Predicate string `json:"predicate"`
	Object    string `json:"object"`
}
TripleJSON is a single (subject, predicate, object) triple from the LLM response.