README
This project is 100% AI-slop. Use at your own risk for life and limb.
gognee - A Go Knowledge Graph Memory System
gognee is an importable Go library that provides persistent knowledge graph memory for AI assistants. It enables applications to extract, store, and retrieve information relationships using a combination of vector search and graph traversal.
Features
- Knowledge Graph Storage: Persistent storage of entities and relationships in SQLite
- Hybrid Search: Combine vector similarity and graph traversal for semantic retrieval
- LLM-Powered Extraction: Automatic entity and relationship extraction using OpenAI's APIs
- Chunking: Intelligent text splitting with token awareness and overlap handling
- Deterministic Deduplication: Same entities across documents resolve to the same node
- Persistent Memory: Knowledge persists across application restarts
Importing
gognee is a library package intended to be imported into your Go application. Import the package entrypoint:
import "github.com/dan-solli/gognee/pkg/gognee"
Minimal import-and-use example:
package main

import (
    "context"
    "fmt"
    "os"

    "github.com/dan-solli/gognee/pkg/gognee"
)

func main() {
    ctx := context.Background()

    g, err := gognee.New(gognee.Config{
        DBPath:    "./memory.db",
        OpenAIKey: os.Getenv("OPENAI_API_KEY"),
    })
    if err != nil {
        panic(err)
    }
    defer g.Close()

    // Add text and process
    _ = g.Add(ctx, "Gognee is a Go knowledge graph memory library.", gognee.AddOptions{})
    _, _ = g.Cognify(ctx, gognee.CognifyOptions{})

    // Search
    results, _ := g.Search(ctx, "What do I know about gognee?", gognee.SearchOptions{})
    fmt.Printf("Found %d results\n", len(results))
}
Types and convenience values are re-exported from the package (for example SearchOptions, SearchResult, SearchTypeHybrid, Node).
Quick Start
Prerequisites
CGO Requirement: gognee v1.2.0+ requires CGO for sqlite-vec vector indexing:
export CGO_ENABLED=1
Platform-specific notes:
- Linux: Requires GCC or Clang
- macOS: Requires Xcode Command Line Tools (xcode-select --install)
- Windows: Requires MinGW-w64 or MSVC
Installation
CGO_ENABLED=1 go get github.com/dan-solli/gognee
Basic Usage
package main

import (
    "context"
    "fmt"
    "log"
    "os"

    "github.com/dan-solli/gognee/pkg/gognee"
)

func main() {
    ctx := context.Background()

    // Initialize Gognee with OpenAI API key
    g, err := gognee.New(gognee.Config{
        OpenAIKey: os.Getenv("OPENAI_API_KEY"),
        DBPath:    "./memory.db", // Persistent SQLite database
    })
    if err != nil {
        log.Fatal(err)
    }
    defer g.Close()

    // Add documents to the knowledge base
    documents := []string{
        "React is a JavaScript library for building user interfaces using components.",
        "We use TypeScript to add static typing to our React applications.",
        "PostgreSQL is our primary database for storing application data.",
        "The frontend uses React with TypeScript, and the backend uses PostgreSQL.",
    }
    for _, doc := range documents {
        if err := g.Add(ctx, doc, gognee.AddOptions{}); err != nil {
            log.Fatal(err)
        }
    }
    fmt.Printf("Buffered %d documents for processing\n", g.BufferedCount())

    // Process buffered documents through the extraction pipeline
    result, err := g.Cognify(ctx, gognee.CognifyOptions{})
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Processed %d documents: %d nodes, %d edges\n",
        result.DocumentsProcessed,
        result.NodesCreated,
        result.EdgesCreated,
    )

    // Query the knowledge graph
    results, err := g.Search(ctx, "What technologies does the project use?", gognee.SearchOptions{
        Type:       gognee.SearchTypeHybrid,
        TopK:       5,
        GraphDepth: 1,
    })
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("\nSearch Results:")
    for i, result := range results {
        fmt.Printf("%d. %s (%s) - Score: %.4f\n", i+1, result.Node.Name, result.Node.Type, result.Score)
    }

    // Check statistics
    stats, err := g.Stats()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("\nKnowledge Graph Stats: %d nodes, %d edges\n", stats.NodeCount, stats.EdgeCount)
}
API Reference
Core Methods
New(cfg Config) (*Gognee, error)
Initializes a new Gognee instance.
Config fields:
- OpenAIKey (required): OpenAI API key for embeddings and LLM
- DBPath (optional): Path to SQLite database file. Defaults to :memory: if empty
- EmbeddingModel (optional): Embedding model to use. Default: text-embedding-3-small
- LLMModel (optional): LLM model for extraction. Default: gpt-4o-mini
- ChunkSize (optional): Token size for text chunks. Default: 512
- ChunkOverlap (optional): Token overlap between chunks. Default: 50
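For reference, a fully specified Config might look like this (field names as documented above; the values shown are just the documented defaults made explicit):

cfg := gognee.Config{
    OpenAIKey:      os.Getenv("OPENAI_API_KEY"), // required
    DBPath:         "./memory.db",               // empty means :memory:
    EmbeddingModel: "text-embedding-3-small",    // default, shown explicitly
    LLMModel:       "gpt-4o-mini",               // default, shown explicitly
    ChunkSize:      512,                         // default
    ChunkOverlap:   50,                          // default
}
g, err := gognee.New(cfg)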
Add(ctx context.Context, text string, opts AddOptions) error
Buffers text for later processing.
- Parameters:
  - text: Document text to add (non-empty)
  - opts.Source (optional): Source identifier for the document
- Returns: Error if text is empty
- Note: Text is buffered but NOT processed until Cognify() is called
Cognify(ctx context.Context, opts CognifyOptions) (*CognifyResult, error)
Processes all buffered documents through the full extraction pipeline:
- Chunks text into segments
- Extracts entities (Person, Concept, System, etc.) via LLM
- Extracts relationships between entities via LLM
- Creates nodes and edges in the knowledge graph
- Generates embeddings for semantic search
- Clears the buffer
CognifyResult fields:
- DocumentsProcessed: Count of documents in the buffer
- ChunksProcessed: Total chunks created
- ChunksFailed: Chunks that failed extraction
- NodesCreated: Entities added to graph
- EdgesCreated: Relationships added to graph
- Errors: Individual errors encountered (processing continues best-effort)
Note: The buffer is always cleared after Cognify, even if errors occur. Return error is only for catastrophic failures (context canceled, DB connection lost).
Search(ctx context.Context, query string, opts SearchOptions) ([]SearchResult, error)
Searches the knowledge graph.
SearchOptions fields:
- Type (optional): Search type - SearchTypeVector, SearchTypeGraph, or SearchTypeHybrid. Default: SearchTypeHybrid
- TopK (optional): Maximum results to return. Default: 10
- GraphDepth (optional): Max depth for graph traversal. Default: 1
- SeedNodeIDs (optional): Starting nodes for graph search
SearchResult fields:
- Node: Full node data
- Score: Relevance score (0-1)
- Source: How the node was found ("vector" or "graph")
- GraphDepth: Distance from search origin
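For instance, a graph-only search seeded from known nodes might look like the snippet below (the seed IDs are placeholders; SeedNodeIDs is assumed to take node ID strings, matching the string node IDs used elsewhere in this README):

// Traverse the graph outward from two known nodes (placeholder IDs)
results, err := g.Search(ctx, "related systems", gognee.SearchOptions{
    Type:        gognee.SearchTypeGraph,
    GraphDepth:  2,
    SeedNodeIDs: []string{"node-id-1", "node-id-2"},
    TopK:        10,
})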
Close() error
Releases all resources (database connections, buffered data).
Stats() (Stats, error)
Returns knowledge graph statistics:
- NodeCount: Total entities in graph
- EdgeCount: Total relationships in graph
- BufferedDocs: Documents waiting for Cognify
- LastCognified: Timestamp of last successful Cognify
Advanced Access
For custom pipelines, these components are accessible:
- GetChunker(): Text chunking
- GetEmbeddings(): Embedding client
- GetLLM(): LLM client
- GetGraphStore(): Graph storage
- GetVectorStore(): Vector storage
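A minimal sketch of pulling these components out for a custom pipeline; the getters are the ones listed above, and the methods available on each returned interface are defined in gognee's subpackages (see Directories below):

// Pull out the building blocks gognee itself uses. The concrete
// interfaces live in the subpackages; consult their docs for methods.
chunker := g.GetChunker()         // text chunking
embedder := g.GetEmbeddings()     // embedding client
llmClient := g.GetLLM()           // LLM client
graphStore := g.GetGraphStore()   // graph storage
vectorStore := g.GetVectorStore() // vector storage
_, _, _, _, _ = chunker, embedder, llmClient, graphStore, vectorStore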
Type Re-exports
Common types are re-exported from the top-level package for convenience:
- SearchResult, SearchOptions, SearchType
- Node, Edge
- SearchTypeVector, SearchTypeGraph, SearchTypeHybrid (constants)
Default Behavior
gognee uses SQLite for both graph storage and vector embeddings. Choose the storage mode with DBPath:
- Persistent Storage (recommended): DBPath: "./memory.db" - data persists across restarts
- In-Memory Storage (testing/dev): DBPath: ":memory:" - data is cleared when the process exits
Persistence
When using a file-based DBPath, both the knowledge graph (nodes and edges) and vector embeddings persist across application restarts. This means:
- No need to re-run Cognify() after restart
- Instant search availability on startup
- Zero-downtime deployment support
Example workflow:
// Session 1: Build knowledge graph
g1, _ := gognee.New(gognee.Config{
    DBPath:    "./memory.db",
    OpenAIKey: apiKey,
})
g1.Add(ctx, "Document 1...", gognee.AddOptions{})
g1.Cognify(ctx, gognee.CognifyOptions{})
g1.Close()

// Session 2: Reopen and immediately search (no Cognify needed)
g2, _ := gognee.New(gognee.Config{
    DBPath:    "./memory.db", // Same database
    OpenAIKey: apiKey,
})
results, _ := g2.Search(ctx, "query", gognee.SearchOptions{})
// Results immediately available - embeddings were persisted
Migration from v0.6.0 and Earlier
In v0.6.0 and earlier, vector embeddings were stored in memory and lost on restart. If you're upgrading:
- Existing databases will work without migration - simply run Cognify() once after upgrading to v0.7.0 to populate the persistent embeddings
- New databases get persistent embeddings automatically
- In-memory mode (:memory:) behavior is unchanged
MVP Limitations
This is the MVP (Minimum Viable Product). Known limitations:
Performance
- Vector search: Optimized with sqlite-vec indexed ANN search (O(log n) complexity)
- Graph traversal: In-memory BFS implementation (acceptable for <100K nodes)
- Concurrent writes: Serializable transactions may cause contention under heavy load
Persistent Storage
Provide a file path to DBPath to enable persistent storage:
cfg := gognee.Config{
    OpenAIKey: "sk-...",
    DBPath:    "./knowledge.db",
}
g, _ := gognee.New(cfg)
The database file is created automatically if it doesn't exist.
Incremental Cognify
By default, gognee tracks processed documents to avoid redundant processing. This reduces costs and processing time when re-adding documents.
How It Works
When you call Cognify(), gognee:
- Computes a SHA-256 hash of each document's text (see the sketch after this list)
- Checks if that hash has been processed before
- Skips documents that have already been processed (incremental mode)
- Processes new/changed documents normally
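Conceptually, the identity hash is just a SHA-256 over the raw text, along these lines (an illustration of the idea, not necessarily gognee's exact internals):

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// docHash mirrors the idea behind content-based identity: hash the exact
// text bytes, so any change - including whitespace - yields a new hash.
func docHash(text string) string {
    sum := sha256.Sum256([]byte(text))
    return hex.EncodeToString(sum[:])
}

func main() {
    fmt.Println(docHash("React is a UI library"))
    fmt.Println(docHash("React is a UI library!")) // punctuation change -> different hash
}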
Benefits:
- Near-instant processing for duplicate documents (~0ms vs 5-10s)
- Zero LLM API costs for cached documents
- Enables continuous updates without full reprocessing
Default Behavior
Incremental mode is ON by default:
// Second Cognify() call skips already-processed documents
g.Add(ctx, "React is a UI library", gognee.AddOptions{})
g.Cognify(ctx, gognee.CognifyOptions{}) // Processes document (hash: abc123)
g.Add(ctx, "React is a UI library", gognee.AddOptions{}) // Same text
g.Cognify(ctx, gognee.CognifyOptions{}) // Skips (hash: abc123 already processed)
// DocumentsProcessed=0, DocumentsSkipped=1
Controlling Incremental Behavior
Use CognifyOptions to control incremental processing:
// Disable incremental mode (always reprocess)
skipProcessed := false
g.Cognify(ctx, gognee.CognifyOptions{
    SkipProcessed: &skipProcessed,
})

// Force reprocessing even with incremental mode enabled
g.Cognify(ctx, gognee.CognifyOptions{
    Force: true, // Overrides SkipProcessed
})
When to use Force: true:
- After changing ChunkSize or ChunkOverlap settings
- To rebuild the knowledge graph from scratch
- After updating extraction prompts or LLM models
Document Identity
Documents are identified by exact text content (SHA-256 hash). Any change to the text (including whitespace) creates a new document:
g.Add(ctx, "React is great", gognee.AddOptions{Source: "file-a"})
g.Cognify(ctx, gognee.CognifyOptions{}) // Processes
g.Add(ctx, "React is great", gognee.AddOptions{Source: "file-b"}) // Different source
g.Cognify(ctx, gognee.CognifyOptions{}) // Skips (same text)
g.Add(ctx, "React is great!", gognee.AddOptions{}) // Punctuation changed
g.Cognify(ctx, gognee.CognifyOptions{}) // Processes (different hash)
Note: The Source field is metadata only and does NOT affect document identity. Identity is purely content-based.
Persistence
Document tracking persists in the SQLite database (table: processed_documents). Tracking survives application restarts when using file-based DBPath.
For :memory: mode, tracking is lost on restart (incremental behavior only applies within a single session).
Resetting Tracking
To clear processed document history without deleting the knowledge graph:
// Access the DocumentTracker interface on the graph store.
// (store refers to gognee's store package; the import path is assumed
// to be github.com/dan-solli/gognee/pkg/store based on the repo layout.)
tracker := g.GetGraphStore().(store.DocumentTracker)
tracker.ClearProcessedDocuments(ctx)

// Now all documents will be reprocessed
g.Cognify(ctx, gognee.CognifyOptions{})
Memory Decay and Forgetting
gognee supports time-based memory decay to keep the knowledge graph relevant and bounded. Older or rarely-accessed nodes receive lower scores in search results, and can be explicitly pruned.
Configuration
Enable decay by setting decay-related fields in Config:
cfg := gognee.Config{
    OpenAIKey:         "sk-...",
    DBPath:            "./knowledge.db",
    DecayEnabled:      true,     // Enable time-based decay (default: false)
    DecayHalfLifeDays: 30,       // Nodes' scores halve after 30 days (default: 30)
    DecayBasis:        "access", // Decay based on last access ("access" or "creation", default: "access")
}
Decay Options:
- DecayEnabled: When true, search results are scored with decay multipliers. Off by default for backward compatibility
- DecayHalfLifeDays: Number of days after which a node's score is multiplied by 0.5. Shorter values mean faster decay
- DecayBasis:
  - "access": Decay based on last_accessed_at timestamp (nodes accessed recently resist decay)
  - "creation": Decay based on created_at timestamp (age since creation)
  - If a node has never been accessed, falls back to created_at
Access Reinforcement
When decay is enabled, nodes returned in search results have their last_accessed_at timestamp updated automatically. This means frequently searched nodes resist decay (mimicking human memory reinforcement).
Pruning Nodes
Use Prune() to permanently delete nodes that are too old or have decayed below a threshold:
// Preview what would be pruned (dry run)
result, err := g.Prune(ctx, gognee.PruneOptions{
    MaxAgeDays:    60,   // Prune nodes older than 60 days
    MinDecayScore: 0.1,  // Prune nodes with decay score < 0.1
    DryRun:        true, // Don't actually delete
})
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Would prune %d nodes and %d edges\n", result.NodesPruned, result.EdgesPruned)

// Actually prune
result, err = g.Prune(ctx, gognee.PruneOptions{
    MaxAgeDays: 60,
    DryRun:     false,
})
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Pruned %d nodes\n", result.NodesPruned)
PruneOptions:
- MaxAgeDays: Remove nodes older than this many days (based on DecayBasis). If 0, this criterion is not used
- MinDecayScore: Remove nodes with decay score below this value. If 0, this criterion is not used. Requires DecayEnabled=true
- DryRun: If true, reports what would be pruned without actually deleting
PruneResult:
- NodesEvaluated: Total number of nodes checked
- NodesPruned: Number of nodes deleted
- EdgesPruned: Number of edges deleted (cascade deletion when endpoints are removed)
- NodeIDs: List of pruned node IDs (for verification)
Important: Pruning is permanent. Use DryRun=true first to preview the impact.
Decay Math
Decay uses an exponential formula:
score_multiplier = 0.5 ^ (age_days / half_life_days)
Examples with 30-day half-life:
- 0 days old: multiplier = 1.0 (no decay)
- 30 days old: multiplier = 0.5 (half score)
- 60 days old: multiplier = 0.25 (quarter score)
- 90 days old: multiplier = 0.125
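The multiplier is easy to compute directly; here is a small Go sketch of the formula above (illustrative, not gognee's internal code):

package main

import (
    "fmt"
    "math"
)

// decayMultiplier implements score_multiplier = 0.5 ^ (age_days / half_life_days).
func decayMultiplier(ageDays, halfLifeDays float64) float64 {
    return math.Pow(0.5, ageDays/halfLifeDays)
}

func main() {
    for _, age := range []float64{0, 30, 60, 90} {
        fmt.Printf("%3.0f days: multiplier = %.3f\n", age, decayMultiplier(age, 30))
    }
    // Prints 1.000, 0.500, 0.250, 0.125 - matching the examples above
}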
Best Practices
- Start with decay disabled to build your knowledge graph, then enable it once populated
- Use access-based decay (DecayBasis="access") to preserve frequently queried nodes
- Run dry-run prunes periodically to understand decay behavior before committing
- Adjust half-life based on your domain:
- News/events: 7-14 days
- Product documentation: 90-180 days
- Reference knowledge: 365+ days
Observability and Logging (v1.6.0+)
gognee v1.6.0 adds structured logging for the memory decay subsystem using Go's standard log/slog package. Logging is completely optional and has zero overhead when not enabled.
Configuration
Enable logging by injecting a *slog.Logger via WithLogger():
import "log/slog"
// Create a logger (JSON or text format)
logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
Level: slog.LevelInfo, // or LevelDebug for verbose logging
}))
// Create Gognee instance
g, err := gognee.New(cfg)
if err != nil {
log.Fatal(err)
}
// Inject logger - triggers configuration logging
g.WithLogger(logger)
What Gets Logged
Startup (INFO level):
- Decay configuration when WithLogger() is called
- Config values: decay_enabled, half_life_days, decay_basis, access_frequency_enabled, reference_access_count

Prune Operations:
- Start (INFO): options (dry_run, max_age_days, min_decay_score, etc.)
- Per-memory evaluation (DEBUG): memory_id, status, retention_policy, pinned, decision
- Per-node evaluation (DEBUG): node_id, age_days, decay_score, decision
- Complete (INFO): summary with counts (memories_evaluated, memories_pruned, nodes_pruned, duration_ms)
Search Decay (DEBUG level):
- Per-node decay score calculation
- Retention policy overrides
- Filtered nodes (score < threshold)
Log Levels
- INFO: Operational events (config, prune start/complete)
- DEBUG: Detailed evaluation (per-memory, per-node decisions)
- WARN: Recoverable errors (node fetch failures)
Recommendation: Use LevelInfo for production, LevelDebug for troubleshooting decay behavior.
Security and Privacy
Logging follows strict data classification to prevent sensitive content exposure:
NEVER logged:
- Memory content fields: Topic, Context, Decisions, Rationale
- Node content fields: Name, Description
- Credentials: OpenAIKey, API tokens
- User-supplied metadata values

Safe to log:
- IDs: memory_id, node_id, edge_id (UUIDs are opaque)
- Timestamps: created_at, updated_at, last_accessed_at
- Counts and scores: access_count, decay_score, nodes_pruned
- Enums: status (Active/Superseded), retention_policy (permanent/standard/ephemeral)
- Config values: decay settings, thresholds
Example Log Output
{"time":"2026-02-18T10:00:00Z","level":"INFO","msg":"decay config initialized","decay_enabled":true,"half_life_days":30,"decay_basis":"access","access_frequency_enabled":true,"reference_access_count":10}
{"time":"2026-02-18T10:05:00Z","level":"INFO","msg":"prune started","dry_run":false,"max_age_days":90,"min_decay_score":0.1}
{"time":"2026-02-18T10:05:00Z","level":"DEBUG","msg":"node evaluated","node_id":"abc123","age_days":120,"decay_score":0.03,"decision":"prune"}
{"time":"2026-02-18T10:05:01Z","level":"INFO","msg":"prune complete","memories_evaluated":245,"memories_pruned":12,"nodes_evaluated":1523,"nodes_pruned":89,"edges_pruned":142,"duration_ms":1234}
Disabling Logging
Simply don't call WithLogger(). When the logger is nil (default), all logging code paths are no-ops with zero allocations.
Memory Management (v1.0.0+)
gognee v1.0.0 introduces first-class memory CRUD - a higher-level abstraction for managing discrete units of knowledge with full lifecycle management and provenance tracking.
Overview
Instead of using Add() + Cognify() to process raw text, you can now use Memory APIs for structured knowledge management:
- AddMemory: Create a memory with topic, context, decisions, rationale
- GetMemory: Retrieve a specific memory by ID
- ListMemories: List all memories with pagination
- UpdateMemory: Modify an existing memory (re-cognifies automatically)
- DeleteMemory: Remove a memory and run garbage collection
- Search: Now includes a MemoryIDs field showing which memories contributed to each result
Key Benefits:
- Structured Storage: Memories have explicit fields (topic, context, decisions, rationale)
- Provenance Tracking: Know which knowledge artifacts came from which memory
- Garbage Collection: Deleting/updating a memory cleans up orphaned nodes/edges automatically
- Deduplication: Identical memories (same content) are not re-processed
- Re-Cognify on Update: Updating a memory automatically re-extracts entities and relationships
API Overview
MemoryInput
type MemoryInput struct {
    Topic     string                 // Required: 3-7 word title
    Context   string                 // Required: 300-1500 char summary
    Decisions []string               // Optional: list of decisions made
    Rationale []string               // Optional: explanations for decisions
    Metadata  map[string]interface{} // Optional: arbitrary metadata
    Source    string                 // Optional: source identifier
}
MemoryResult
type MemoryResult struct {
    Memory       store.MemoryRecord // Full memory record
    NodeIDs      []string           // IDs of extracted nodes
    EdgeIDs      []string           // IDs of extracted edges
    NodesCreated int                // Count of new nodes
    EdgesCreated int                // Count of new edges
}
Adding a Memory
memory := gognee.MemoryInput{
    Topic:   "Phase 4 Storage Layer Implementation",
    Context: "Implemented SQLite-backed graph store with nodes, edges, and vector storage. Added provenance tracking for memory CRUD operations. Foreign keys enabled for CASCADE deletes.",
    Decisions: []string{
        "Use SQLite for both graph and vector storage",
        "Enable PRAGMA foreign_keys=ON for automatic cascade deletes",
        "Implement two-phase transaction model for memory updates",
    },
    Rationale: []string{
        "SQLite provides ACID guarantees and simplifies deployment",
        "Foreign keys ensure provenance integrity without manual cleanup",
        "Two-phase model prevents long transactions during LLM calls",
    },
    Source: "implementation-doc-004",
}

result, err := g.AddMemory(ctx, memory)
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Created memory %s: %d nodes, %d edges\n",
    result.Memory.ID,
    result.NodesCreated,
    result.EdgesCreated,
)
Deduplication: If a memory with identical content already exists, AddMemory returns the existing memory without reprocessing.
Retrieving Memories
// Get a specific memory by ID
memory, err := g.GetMemory(ctx, memoryID)
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Memory: %s\n", memory.Topic)
fmt.Printf("Decisions: %d\n", len(memory.Decisions))

// List all memories with pagination
memories, err := g.ListMemories(ctx, gognee.ListMemoriesOptions{
    Limit:  10,
    Offset: 0,
})
if err != nil {
    log.Fatal(err)
}
for _, summary := range memories {
    fmt.Printf("- %s (%s)\n", summary.Topic, summary.ID)
}
Updating a Memory
Updating a memory triggers automatic re-cognify:
- Unlinks old provenance (nodes/edges from previous version)
- Runs garbage collection on orphaned artifacts
- Re-cognifies with new content
- Links new provenance
updates := gognee.MemoryUpdate{
    Context: stringPtr("Updated context with new findings..."),
    Decisions: &[]string{
        "Decision 1 (updated)",
        "Decision 2 (new)",
    },
}

result, err := g.UpdateMemory(ctx, memoryID, updates)
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Updated: %d old nodes, %d new nodes\n",
    len(result.OldNodeIDs),
    result.NodesCreated,
)
Important: Only provide fields you want to update. Omitted fields are preserved from the original memory.
Deleting a Memory
Deleting a memory removes it and runs garbage collection:
err := g.DeleteMemory(ctx, memoryID)
if err != nil {
    log.Fatal(err)
}
Garbage Collection Behavior:
- Deletes nodes/edges that only belong to this memory
- Preserves shared nodes/edges (used by other memories or legacy Add/Cognify)
- Legacy data (from Add/Cognify) is never deleted by GC
Search Integration
Search results now include MemoryIDs showing which memories contributed to each node:
results, err := g.Search(ctx, "storage implementation", gognee.SearchOptions{
    TopK:             5,
    IncludeMemoryIDs: boolPtr(true), // Default: true
})
if err != nil {
    log.Fatal(err)
}
for _, result := range results {
    fmt.Printf("Node: %s\n", result.Node.Name)
    fmt.Printf("From memories: %v\n", result.MemoryIDs)
}
Memory IDs are sorted by updated_at DESC, showing the most recent memory first.
Migration from Legacy Add/Cognify
The legacy Add() + Cognify() workflow continues to work:
// Legacy workflow (still supported)
g.Add(ctx, "Some text...", gognee.AddOptions{})
g.Cognify(ctx, gognee.CognifyOptions{})
Differences:
| Feature | Legacy Add/Cognify | Memory CRUD |
|---|---|---|
| Structured Storage | No (raw text only) | Yes (topic, decisions, rationale) |
| Provenance Tracking | No | Yes |
| Garbage Collection | No | Yes |
| Update Support | No (must delete + re-add) | Yes (UpdateMemory re-cognifies) |
| Deduplication | Document-level (doc_hash) | Content-level (doc_hash) |
| Search Integration | Results only | Results + MemoryIDs |
When to use each:
- Memory CRUD: Structured knowledge management, agent memory, decision logs, planning artifacts
- Legacy Add/Cognify: Bulk document ingestion, unstructured text processing
Interoperability: Both systems share the same graph store. Nodes/edges created by legacy Add/Cognify are visible in Search, and vice versa. However, legacy artifacts are not tracked by provenance and won't be affected by garbage collection.
Two-Phase Transaction Model
Memory operations use a two-phase model - two short transactions bracketing non-transactional LLM work - to avoid holding a transaction open during slow LLM calls (sketched after the list below):
Phase 1 (Transaction):
- Persist memory record with status="pending"
- Compute doc_hash for deduplication
Phase 2 (No Transaction):
- Call LLM APIs for entity/relationship extraction
- Generate embeddings
Phase 3 (Transaction):
- Update graph with nodes/edges
- Link provenance
- Set status="complete"
This design:
- Prevents database locks during slow LLM calls
- Allows crash recovery (pending memories can be retried)
- Maintains transactional integrity for metadata
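To make the flow concrete, here is a compact sketch of the pattern under stated assumptions: this is not gognee's actual code, the table and column names are invented for the example, and extractEntities is a stub standing in for the real LLM call.

package sketch

import (
    "context"
    "database/sql"

    _ "github.com/mattn/go-sqlite3" // assumed driver; any database/sql SQLite driver works
)

// extractEntities is a stub standing in for the real LLM extraction call.
func extractEntities(text string) []string {
    return []string{"example-entity"}
}

// addMemoryTwoPhase illustrates the pattern: a short transaction, slow work
// outside any transaction, then a second short transaction.
func addMemoryTwoPhase(ctx context.Context, db *sql.DB, id, hash, text string) error {
    // Phase 1 (transaction): persist a pending record so a crash is recoverable.
    tx, err := db.BeginTx(ctx, nil)
    if err != nil {
        return err
    }
    if _, err := tx.ExecContext(ctx,
        `INSERT INTO memories (id, doc_hash, status) VALUES (?, ?, 'pending')`,
        id, hash,
    ); err != nil {
        tx.Rollback()
        return err
    }
    if err := tx.Commit(); err != nil {
        return err
    }

    // Phase 2 (no transaction): slow LLM/embedding work runs with no
    // transaction open, so the database is never locked while waiting.
    nodes := extractEntities(text)

    // Phase 3 (transaction): write graph artifacts and mark the memory complete.
    tx, err = db.BeginTx(ctx, nil)
    if err != nil {
        return err
    }
    for _, name := range nodes {
        if _, err := tx.ExecContext(ctx,
            `INSERT INTO nodes (memory_id, name) VALUES (?, ?)`, id, name,
        ); err != nil {
            tx.Rollback()
            return err
        }
    }
    if _, err := tx.ExecContext(ctx,
        `UPDATE memories SET status = 'complete' WHERE id = ?`, id,
    ); err != nil {
        tx.Rollback()
        return err
    }
    return tx.Commit()
}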
Garbage Collection Details
Garbage collection uses reference counting via the memory_nodes and memory_edges junction tables:
-- Check if a node is orphaned
SELECT COUNT(*) FROM memory_nodes WHERE node_id = ?
-- If count = 0, node is safe to delete
Preserved Artifacts:
- Nodes/edges with COUNT(*) > 0 (shared across memories)
- Nodes/edges without any provenance records (legacy data)

Deleted Artifacts:
- Nodes/edges with COUNT(*) = 0 after unlinking
Foreign Key Cascade: Deleting a memory automatically deletes its provenance records via ON DELETE CASCADE.
Helper Functions
// Helper for optional string fields
func stringPtr(s string) *string {
    return &s
}

// Helper for optional bool fields
func boolPtr(b bool) *bool {
    return &b
}
Intelligent Memory Lifecycle (v1.1.0)
gognee v1.1.0 introduces intelligent memory lifecycle management to keep knowledge graphs relevant and bounded as they grow over time. Instead of simple time-based aging, memories are managed based on usage patterns, explicit supersession, and retention policies.
Access Frequency Scoring
Memories that are frequently accessed resist decay, regardless of age:
cfg := gognee.Config{
    DecayEnabled:           true,
    DecayHalfLifeDays:      30,
    AccessFrequencyEnabled: true, // Enable frequency-based decay modification
    ReferenceAccessCount:   10,   // Memories with 10+ accesses get full protection
}
- Zero-access memories: Decay to 50% of time-based score (still visible, but de-prioritized)
- High-access memories: Maintain full score despite age (important knowledge stays relevant)
- Access tracking: Automatic - GetMemory() and Search() increment access counters
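The exact scoring function isn't specified here, but one plausible reading of the bullets above is a linear blend between half score and full score as the access count approaches ReferenceAccessCount. The sketch below is a labeled assumption, not gognee's actual formula:

package sketch

import "math"

// frequencyAdjustedScore is a HYPOTHETICAL reading of the behavior described
// above: a zero-access memory keeps 50% of its time-based decay score, and
// protection ramps linearly up to 100% at referenceCount accesses.
func frequencyAdjustedScore(timeDecay float64, accessCount, referenceCount int) float64 {
    protection := math.Min(float64(accessCount)/float64(referenceCount), 1)
    return timeDecay * (0.5 + 0.5*protection)
}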
Explicit Supersession
Mark when one memory replaces another to maintain provenance chains:
// Create new memory that supersedes old ones
result, err := g.AddMemory(ctx, gognee.MemoryInput{
    Topic:              "Updated API Design",
    Context:            "We've switched from REST to GraphQL...",
    Supersedes:         []string{"old-memory-id"},
    SupersessionReason: "REST approach had scaling issues",
})

// Query supersession chain
chain, _ := g.GetSupersessionChain(ctx, "old-memory-id")
// Returns: [old -> intermediate -> current]
Superseded memories:
- Automatically marked as status="Superseded"
- Remain searchable during grace period (default: 30 days)
- Eligible for pruning after grace period expires
Retention Policies
Different memory types have different lifespans:
| Policy | Half-Life | Prunable | Use Case |
|---|---|---|---|
| permanent | ∞ (no decay) | Never | Core facts, system knowledge |
| decision | 365 days | Only when superseded | Planning decisions, architectural choices |
| standard | 90 days | Yes | General knowledge (default) |
| ephemeral | 7 days | Yes | Temporary notes, scratch work |
| session | 1 day | Yes | Conversation context |
result, err := g.AddMemory(ctx, gognee.MemoryInput{
    Topic:           "Core System Architecture",
    Context:         "...",
    RetentionPolicy: "permanent", // Never decays or gets pruned
})
User Pinning
Exempt critical memories from automatic lifecycle management:
// Pin a memory
err := g.PinMemory(ctx, memoryID, "Critical customer requirement")
// Pinned memories:
// - Never decay (score stays at 1.0)
// - Never pruned (even if old or unused)
// - Marked as status="Pinned"
// Unpin when no longer critical
err = g.UnpinMemory(ctx, memoryID)
Prune with Retention Awareness
Prune operations respect retention policies:
result, err := g.Prune(ctx, gognee.PruneOptions{
    PruneSuperseded:   true,
    SupersededAgeDays: 30,   // Grace period before pruning superseded memories
    DryRun:            true, // Preview what would be deleted
})
fmt.Printf("Would prune %d superseded memories\n", result.SupersededMemoriesPruned)
Prune guarantees:
- Permanent memories never pruned
- Pinned memories never pruned
- Decision memories only pruned when superseded + grace period passed
- retention_until override: explicit expiration timestamp (if set and past, memory pruned regardless of policy)
Enhanced ListMemories
Filter and sort memories for management UIs:
pinnedOnly := true
activeStatus := "Active"

memories, err := g.ListMemories(ctx, store.ListMemoriesOptions{
    Status:    &activeStatus,
    Pinned:    &pinnedOnly,
    OrderBy:   "access_count",
    OrderDesc: true, // Most accessed first
    Limit:     50,
})
for _, mem := range memories {
    fmt.Printf("%s - %d accesses, %s retention\n",
        mem.Topic, mem.AccessCount, mem.RetentionPolicy)
}
Available filters:
- Status: Filter by status (Active, Superseded, Pinned, etc.)
- RetentionPolicy: Filter by retention policy
- Pinned: Show only pinned memories
- OrderBy: Sort by created_at, updated_at, access_count, or last_accessed_at
- OrderDesc: Sort direction (true = descending, false = ascending)
Lifecycle Best Practices
- Use retention policies intentionally:
  - permanent for facts that should never change
  - decision for rationale you want to preserve even after supersession
  - ephemeral for debugging notes or temporary context
- Supersede instead of delete:
  - Maintains provenance chain
  - Allows rollback if new approach fails
  - Better for auditing and learning
- Pin sparingly:
  - Overuse defeats automatic lifecycle management
  - Reserve for truly critical knowledge
  - Review pinned memories periodically
- Configure decay for your use case:
  - Short-lived assistants: aggressive decay (7-day half-life)
  - Long-term knowledge bases: conservative decay (90-day half-life)
  - Reference systems: disable decay entirely
Example: Agent Memory Loop
// Agent stores a decision
memory, _ := g.AddMemory(ctx, gognee.MemoryInput{
    Topic:     "API Design Decision",
    Context:   "Chose REST over GraphQL for simplicity...",
    Decisions: []string{"Use REST API"},
    Rationale: []string{"Team familiarity", "Lower complexity"},
})

// Later: Search recalls the decision
results, _ := g.Search(ctx, "API design approach", gognee.SearchOptions{})
for _, r := range results {
    fmt.Printf("Found in memories: %v\n", r.MemoryIDs)
    // Retrieve full memory for context (assumes at least one contributing memory)
    mem, _ := g.GetMemory(ctx, r.MemoryIDs[0])
    fmt.Printf("Decision: %s\n", mem.Decisions[0])
}

// Update the decision with new findings
g.UpdateMemory(ctx, memory.Memory.ID, gognee.MemoryUpdate{
    Rationale: &[]string{
        "Team familiarity",
        "Lower complexity",
        "Better caching support (discovered)",
    },
})
Additional Limitations
- No Parallelization: Document processing is sequential. Large batches may take time
- Single LLM Provider: Only OpenAI is supported
- Basic Chunking: Token-based chunking without semantic awareness

Future Enhancements
- Multiple LLM providers (Anthropic, Ollama, local models)
- Parallel processing of documents
- Graph visualization
Error Handling
gognee uses a best-effort model for batch processing:
- Per-Chunk Errors: If a chunk fails extraction, that chunk is skipped; processing continues with remaining chunks
- Buffer Clearing: The buffer is always cleared after Cognify(), regardless of errors. Inspect CognifyResult.Errors to see what failed
- Return Error: Only returned for catastrophic failures (DB connection lost, context canceled)
result, err := g.Cognify(ctx, gognee.CognifyOptions{})
if err != nil {
    log.Fatal(err) // Catastrophic error
}
if len(result.Errors) > 0 {
    log.Printf("Processing had %d errors:\n", len(result.Errors))
    for _, perr := range result.Errors {
        log.Printf(" - %v\n", perr)
    }
}
Integration Testing
Integration tests with real OpenAI API are gated behind a build tag:
# Run only unit tests (no API calls)
go test ./...
# Run unit + integration tests (requires OPENAI_API_KEY)
OPENAI_API_KEY=sk-... go test -tags=integration ./...
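The gating mechanism is a Go build constraint at the top of each integration test file, along these lines (illustrative file header; the exact package and file names in this repo may differ):

//go:build integration

package gognee_test

// This file compiles only when `go test -tags=integration` is used,
// keeping real-API tests out of the default `go test ./...` run.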
Testing
The library includes:
- Unit Tests: Fast, offline tests with mocked dependencies (~80% coverage)
- Integration Tests: End-to-end tests with real OpenAI API (gated)
- SQLite Tests: Storage layer tests including concurrent access
Run tests:
go test ./... -v
Development
This library follows:
- TDD: Tests are written before implementation
- Interfaces: Core components (LLM, Embeddings, Storage) use interfaces for easy mocking and extension
- Minimal Dependencies: Only SQLite beyond the standard library
License
MIT License - See LICENSE file for details
Acknowledgments
Inspired by Cognee - a Python knowledge graph library.
Directories

| Path | Synopsis |
|---|---|
| pkg | |
| pkg/embeddings | Package embeddings provides Ollama embedding client implementation |
| pkg/extraction | Package extraction provides entity and relationship extraction from text |
| pkg/llm | Package llm provides interfaces and implementations for LLM completion clients |
| pkg/search | Package search provides search implementations for gognee's knowledge graph |
| pkg/store | Package store provides storage implementations for gognee's knowledge graph |