cortexdb

package module
v2.18.0
Published: Apr 12, 2026 License: MIT Imports: 0 Imported by: 0

README

CortexDB


CortexDB is a pure-Go, single-file AI memory and knowledge graph library. It uses SQLite as the storage kernel and exposes vector search, lexical search, RAG knowledge storage, agent memory workflows, RDF/SPARQL/RDFS/SHACL knowledge graph features, corpus-to-graph workflows, and MCP-aligned tool APIs.

It is designed for local-first AI agents that need durable memory without running a separate vector database, graph database, or MCP service stack.

Architecture

pkg/cortexdb
  Core public DB facade: vectors, text search, knowledge, memory, KnowledgeMemory, KG, tools, MCP.

pkg/memoryflow
  Agent memory workflow: transcript ingest, recall, wake-up context, diary, promotion.

pkg/graphflow
  Corpus-to-graph workflow: extraction schema, build, analyze, report, export, HTML.

pkg/graph
  Low-level graph engine: property graph, RDF triples/quads, SPARQL, RDFS, SHACL.

pkg/core
  SQLite storage, embeddings, FTS5, vector indexes, chat/session primitives.

Use pkg/cortexdb first. Reach for pkg/memoryflow when building agent memory UX, pkg/graphflow when building graph extraction/report pipelines, and pkg/graph only when you need low-level RDF or property graph control.

Install

go get github.com/liliang-cn/cortexdb/v2

Quick Start

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/liliang-cn/cortexdb/v2/pkg/cortexdb"
)

func main() {
	db, err := cortexdb.Open(cortexdb.DefaultConfig("KnowledgeMemory.db"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	ctx := context.Background()
	quick := db.Quick()

	_, _ = quick.Add(ctx, []float32{0.1, 0.2, 0.9}, "SQLite is a single-file database.")
	results, _ := quick.Search(ctx, []float32{0.1, 0.2, 0.8}, 1)

	if len(results) > 0 {
		fmt.Println(results[0].Content)
	}
}

Choose the Right Layer

Need vectors / collections / FTS5?      -> pkg/cortexdb / pkg/core
Need RAG knowledge storage/search?      -> pkg/cortexdb SaveKnowledge/SearchKnowledge
Need chat/session memory workflow?      -> pkg/memoryflow
Need RDF/SPARQL/RDFS/SHACL?             -> pkg/cortexdb knowledge graph APIs
Need corpus-to-graph/report/export?     -> pkg/graphflow
Need agent tools or MCP server?         -> db.GraphRAGTools() / db.NewMCPServer()
Need low-level graph control?           -> pkg/graph

High-Level Knowledge and Memory

_, _ = db.SaveKnowledge(ctx, cortexdb.KnowledgeSaveRequest{
	KnowledgeID: "apollo-plan",
	Title:       "Apollo launch plan",
	Content:     "Alice owns Apollo. Apollo ships on Friday.",
	ChunkSize:   24,
	Entities: []cortexdb.ToolEntityInput{
		{Name: "Alice", Type: "person", ChunkIDs: []string{"chunk:apollo-plan:000"}},
		{Name: "Apollo", Type: "project", ChunkIDs: []string{"chunk:apollo-plan:000"}},
	},
	Relations: []cortexdb.ToolRelationInput{
		{From: "Alice", To: "Apollo", Type: "owns"},
	},
})

resp, _ := db.SearchKnowledge(ctx, cortexdb.KnowledgeSearchRequest{
	Query:         "Who owns Apollo?",
	Keywords:      []string{"Apollo", "Alice", "owns"},
	RetrievalMode: cortexdb.RetrievalModeLexical,
	TopK:          3,
})
_ = resp.Context

Without an embedder, CortexDB uses lexical retrieval and planner-provided keywords. With an embedder, the same high-level APIs can use semantic or hybrid retrieval.

RAG benchmark coverage is available in pkg/cortexdb:

go test ./pkg/cortexdb -run '^$' -bench 'BenchmarkRAG' -benchmem

Reference run on Apple M2 Pro, -benchtime=3x:

Benchmark                    Fixture                                         Time/op    Approx Throughput  Alloc/op
SaveKnowledge                1 document, 3 entities, 2 relations             ~3.26 ms   ~306 ops/s         ~75 KB
SearchKnowledge lexical      500 docs, keyword plan, graph off               ~4.43 ms   ~226 QPS           ~234 KB
SearchKnowledge graph-light  500 docs, entity plan, bounded graph expansion  ~8.40 ms   ~119 QPS           ~1.7 MB
BuildContext                 chunk pack with graph-light enrichment          ~0.41 ms   ~2,463 ops/s       ~94 KB

MemoryFlow

pkg/memoryflow is the agent memory workflow layer. It stores raw transcript exchanges, recalls relevant context, assembles wake-up layers, appends diary entries, reconstructs transcripts, and optionally promotes durable facts to knowledge.

flow, _ := memoryflow.New(db, planner, extractor)

_, _ = flow.IngestTranscript(ctx, memoryflow.IngestTranscriptRequest{
	Transcript: memoryflow.Transcript{
		SessionID: "session-1",
		UserID:    "user-1",
		Source:    "chat",
		Turns: []memoryflow.TranscriptTurn{
			{Role: "user", Content: "Apollo ships on Friday."},
			{Role: "assistant", Content: "Captured."},
		},
	},
	Scope:     cortexdb.MemoryScopeSession,
	Namespace: "assistant",
})

layers, _ := flow.WakeUpLayers(ctx, memoryflow.WakeUpLayersRequest{
	Identity: "You are the Apollo project assistant.",
	Recall: memoryflow.RecallRequest{
		Query:     "startup context",
		SessionID: "session-1",
		Scope:     cortexdb.MemoryScopeSession,
		Namespace: "assistant",
	},
})
_ = layers

LLM-dependent behavior is interface-based:

type QueryPlanner interface {
	Plan(ctx context.Context, query string, state memoryflow.SessionState) (*cortexdb.RetrievalPlan, error)
}

type SessionExtractor interface {
	Extract(ctx context.Context, transcript memoryflow.Transcript, state memoryflow.SessionState) ([]memoryflow.PromotionCandidate, error)
}

MemoryFlow can also be wrapped with optional recall strategies. pkg/hindsight now provides a compatibility strategy plugin that enriches recall with bank/entity/keyword cues while leaving MemoryFlow as the default workflow:

flow, _ := memoryflow.New(
	db,
	planner,
	extractor,
	memoryflow.WithRecallStrategy(hindsight.NewStrategy(db, hindsight.StrategyOptions{
		BankID:      "apollo-agent",
		EntityNames: []string{"Apollo"},
		Keywords:    []string{"deadline"},
		UseKG:       true,
	})),
)

Knowledge Graph

CortexDB has an embedded RDF/KG layer on top of the same SQLite file:

  • RDF terms, triples, and quads
  • namespaces
  • N-Triples / N-Quads / Turtle / TriG import and export
  • practical SPARQL subset
  • RDFS-lite materialized inference with provenance
  • incremental RDFS inference refresh
  • SHACL-lite validation

_, _ = db.UpsertKnowledgeGraph(ctx, cortexdb.KnowledgeGraphUpsertRequest{
	Triples: []cortexdb.KnowledgeGraphTriple{
		{
			Subject:   graph.NewIRI("https://example.com/alice"),
			Predicate: graph.NewIRI(graph.RDFType),
			Object:    graph.NewIRI("https://example.com/Person"),
		},
		{
			Subject:   graph.NewIRI("https://example.com/alice"),
			Predicate: graph.NewIRI("https://schema.org/name"),
			Object:    graph.NewLiteral("Alice"),
		},
	},
})

result, _ := db.QueryKnowledgeGraph(ctx, cortexdb.KnowledgeGraphQueryRequest{
	Query: `
PREFIX schema: <https://schema.org/>
SELECT ?name WHERE {
	<https://example.com/alice> schema:name ?name .
}
`,
})
_ = result

SPARQL support is a practical embedded subset. It includes SELECT, ASK, CONSTRUCT, DESCRIBE, INSERT DATA, INSERT ... WHERE, DELETE DATA, DELETE WHERE, DELETE ... INSERT ... WHERE, WITH, USING, GRAPH, OPTIONAL, UNION, MINUS, VALUES, BIND, FILTER, EXISTS, NOT EXISTS, REGEX, LANG, DATATYPE, COALESCE, IF, arithmetic, GROUP BY, HAVING, COUNT, SUM, AVG, MIN, MAX, SAMPLE, GROUP_CONCAT, ORDER BY, LIMIT, OFFSET, subqueries, and a constrained property path subset: ^pred, p|q, p+, p*.

RDFS-lite:

refresh, _ := db.RefreshKnowledgeGraphInference(ctx, cortexdb.KnowledgeGraphInferenceRefreshRequest{
	Mode: cortexdb.KnowledgeGraphInferenceRefreshModeIncremental,
	Triples: []cortexdb.KnowledgeGraphTriple{
		{
			Subject:   graph.NewIRI("https://example.com/Employee"),
			Predicate: graph.NewIRI("http://www.w3.org/2000/01/rdf-schema#subClassOf"),
			Object:    graph.NewIRI("https://example.com/Person"),
		},
	},
})
_ = refresh

SHACL-lite:

report, _ := db.ValidateKnowledgeGraphSHACL(ctx, cortexdb.KnowledgeGraphSHACLValidateRequest{
	Shapes: []cortexdb.KnowledgeGraphTriple{
		{Subject: graph.NewIRI("https://example.com/PersonShape"), Predicate: graph.NewIRI(graph.RDFType), Object: graph.NewIRI(graph.SHACLNodeShape)},
		{Subject: graph.NewIRI("https://example.com/PersonShape"), Predicate: graph.NewIRI(graph.SHACLTargetClass), Object: graph.NewIRI("https://example.com/Person")},
	},
})
_ = report

Knowledge graph benchmark coverage is available in pkg/graph:

go test ./pkg/graph -run '^$' -bench 'BenchmarkKnowledgeGraph' -benchmem

Reference run on Apple M2 Pro, -benchtime=3x:

Benchmark                 Fixture                                 Time/op     Approx Throughput  Alloc/op
RDF upsert                unique person/name triple               ~0.97 ms    ~1,028 ops/s       ~37 KB
RDF find by predicate     1,000 name triples, limit 20            ~0.45 ms    ~2,242 QPS         ~49 KB
SPARQL select             direct lookup over 1,000 people         ~0.56 ms    ~1,802 QPS         ~26 KB
SPARQL property path      ex:knows+ over 500-node chain           ~2.21 ms    ~453 QPS           ~2.5 MB
SPARQL subquery           grouped friend counts over 500 people   ~74.45 ms   ~13 QPS            ~185 MB
RDFS full refresh         25 class/type closure fixture           ~805.94 ms  ~1.2 ops/s         ~40 MB
RDFS incremental refresh  changed subclass triple fixture         ~859.85 ms  ~1.2 ops/s         ~46 MB
SHACL-lite validation     500 people age constraints              ~139.24 ms  ~7.2 ops/s         ~6.6 MB

These numbers are a local reference point, not a portability guarantee. The RDFS and SPARQL subquery benchmarks are intentionally stress-heavy and useful for tracking optimizer work.

GraphFlow

pkg/graphflow is the corpus-to-graph workflow layer:

  • canonical extraction schema: ExtractionResult, ExtractionNode, ExtractionEdge
  • deterministic HeuristicExtractor
  • LLM-backed extraction through JSONGenerator
  • Build, Analyze, RenderReport
  • Export to JSON/Markdown and ExportHTML

The library keeps model integration as an interface:

type JSONGenerator interface {
	GenerateJSON(ctx context.Context, systemPrompt string, userPrompt string) ([]byte, error)
}

The example examples/05_graphflow demonstrates openai-go/v3 with JSON Schema structured output. Configure it with .env:

OPENAI_API_KEY=...
OPENAI_BASE_URL=http://43.167.167.6:8080/v1
OPENAI_MODEL=gpt-5.4

Then run:

go run ./examples/05_graphflow

Tools and MCP

For in-process tool calling:

tools := db.GraphRAGTools()
defs := tools.Definitions()
resp, err := tools.Call(ctx, "knowledge_graph_query", payload)
_ = defs
_ = resp
_ = err

For MCP:

server := db.NewMCPServer(cortexdb.MCPServerOptions{})
_ = server

Tool groups include:

  • GraphRAG: ingest_document, search_text, expand_graph, build_context
  • Knowledge/memory: knowledge_save, knowledge_search, memory_save, memory_search
  • Knowledge graph: knowledge_graph_upsert, knowledge_graph_query, knowledge_graph_shacl_validate, knowledge_graph_infer_refresh
  • KnowledgeMemory: knowledge_memory_recall, knowledge_memory_build_context_pack, knowledge_memory_reflect, knowledge_memory_consolidate
  • Ontology/inference: ontology_save, apply_inference

memoryflow and graphflow also expose their own toolbox/MCP surfaces:

  • memoryflow: memoryflow_ingest_transcript, memoryflow_recall, memoryflow_wake_up_layers, memoryflow_prepare_reply
  • graphflow: graphflow_build, graphflow_analyze, graphflow_report, graphflow_export, graphflow_run

Optional Semantic Router

pkg/semantic-router remains available as an optional utility for routing user input to handlers or CortexDB tools before retrieval. It is not required by the main CortexDB, MemoryFlow, or GraphFlow paths.

For no-embedder setups, use the lexical router:

router, _ := semanticrouter.NewLexicalRouter(semanticrouter.WithSparseThreshold(0.1))
_ = router.Add(&semanticrouter.SparseRoute{
	Name:       "memory_save",
	Utterances: []string{"remember this", "save to memory"},
})
route, _ := router.Route(ctx, "please remember this preference")
_ = route.RouteName

Examples

The examples are intentionally small and architecture-oriented:

go run ./examples/01_core
go run ./examples/02_rag
go run ./examples/03_memoryflow
go run ./examples/04_knowledge_graph
go run ./examples/05_graphflow
go run ./examples/06_tools_mcp

See examples/README.md for the selection guide.

Status

CortexDB is an embedded AI memory/KG library, not a drop-in replacement for full graph database products such as Fuseki, GraphDB, or Stardog. The goal is practical local-first storage and reasoning for agents: one file, Go APIs, tool/MCP surfaces, and enough RDF/SPARQL/RDFS/SHACL functionality to build useful memory and knowledge workflows.

Documentation

Overview

Package cortexdb is the public documentation entrypoint for CortexDB.

CortexDB is a pure-Go, single-file AI memory and knowledge graph library. It uses SQLite as the storage kernel and exposes vector search, lexical search, RAG knowledge storage, scoped agent memory, RDF/SPARQL/RDFS/SHACL knowledge graph features, corpus-to-graph workflows, and MCP-aligned tool APIs.

Architecture

The main packages are:

  • pkg/cortexdb: primary DB facade for vectors, text search, knowledge, memory, KnowledgeMemory recall, knowledge graph APIs, GraphRAG tools, and MCP.
  • pkg/memoryflow: agent memory workflow for transcript ingest, recall, wake-up layers, diary, transcript reconstruction, and promotion.
  • pkg/graphflow: corpus-to-graph workflow for extraction schema, build, analysis, report, JSON/Markdown/HTML export, and optional LLM extraction.
  • pkg/graph: low-level graph engine for property graph operations, RDF triples/quads, SPARQL, RDFS-lite inference, and SHACL-lite validation.
  • pkg/core: low-level SQLite storage, embeddings, FTS5, indexes, and chat/session primitives.

Use pkg/cortexdb first unless you need a workflow layer or low-level graph control.

Quick Start

import (
	"context"
	"github.com/liliang-cn/cortexdb/v2/pkg/cortexdb"
)

func main() {
	db, _ := cortexdb.Open(cortexdb.DefaultConfig("KnowledgeMemory.db"))
	defer db.Close()

	ctx := context.Background()
	quick := db.Quick()

	_, _ = quick.Add(ctx, []float32{0.1, 0.2, 0.9}, "SQLite is a single-file database.")
	_, _ = quick.Search(ctx, []float32{0.1, 0.2, 0.8}, 3)
}

Knowledge and Memory

Durable knowledge and scoped memory are available directly on DB:

_, _ = db.SaveKnowledge(ctx, cortexdb.KnowledgeSaveRequest{
	KnowledgeID: "apollo-plan",
	Title:       "Apollo launch plan",
	Content:     "Alice owns Apollo. Apollo ships on Friday.",
	Keywords:    nil,
})

_, _ = db.SearchKnowledge(ctx, cortexdb.KnowledgeSearchRequest{
	Query:         "Who owns Apollo?",
	Keywords:      []string{"Apollo", "Alice", "owns"},
	RetrievalMode: cortexdb.RetrievalModeLexical,
	TopK:          3,
})

_, _ = db.SaveMemory(ctx, cortexdb.MemorySaveRequest{
	MemoryID:  "style",
	UserID:    "user-1",
	Scope:     cortexdb.MemoryScopeUser,
	Namespace: "assistant",
	Content:   "User prefers concise status updates.",
})

Knowledge Graph

The high-level knowledge graph API supports RDF triples/quads, import/export, SPARQL, RDFS-lite inference, and SHACL-lite validation:

_, _ = db.UpsertKnowledgeGraph(ctx, cortexdb.KnowledgeGraphUpsertRequest{
	Triples: []cortexdb.KnowledgeGraphTriple{
		{
			Subject:   graph.NewIRI("https://example.com/alice"),
			Predicate: graph.NewIRI(graph.RDFType),
			Object:    graph.NewIRI("https://example.com/Person"),
		},
	},
})

_, _ = db.QueryKnowledgeGraph(ctx, cortexdb.KnowledgeGraphQueryRequest{
	Query: `SELECT ?o WHERE { <https://example.com/alice> ?p ?o . }`,
})

_, _ = db.RefreshKnowledgeGraphInference(ctx, cortexdb.KnowledgeGraphInferenceRefreshRequest{
	Mode: cortexdb.KnowledgeGraphInferenceRefreshModeIncremental,
})

SPARQL support is a practical embedded subset. It includes SELECT, ASK, CONSTRUCT, DESCRIBE, update forms, OPTIONAL, UNION, MINUS, VALUES, BIND, FILTER, EXISTS, NOT EXISTS, aggregates, subqueries, and constrained property paths such as ^pred, p|q, p+, and p*.

MemoryFlow

Use pkg/memoryflow for chat/session memory workflows:

flow, _ := memoryflow.New(db, planner, extractor)
_, _ = flow.IngestTranscript(ctx, memoryflow.IngestTranscriptRequest{...})
_, _ = flow.WakeUpLayers(ctx, memoryflow.WakeUpLayersRequest{...})

Hindsight can be used as an optional recall strategy plugin without replacing memoryflow:

flow, _ := memoryflow.New(
	db,
	planner,
	extractor,
	memoryflow.WithRecallStrategy(hindsight.NewStrategy(db, hindsight.StrategyOptions{
		BankID:      "apollo-agent",
		EntityNames: []string{"Apollo"},
		Keywords:    []string{"deadline"},
		UseKG:       true,
	})),
)

LLM-dependent parts are interfaces: QueryPlanner, SessionExtractor, and PromotionPolicy.

GraphFlow

Use pkg/graphflow for corpus-to-graph workflows:

// ExtractionResult can come from deterministic code or an LLM.
_, _ = graphflow.Build(ctx, db, []graphflow.ExtractionResult{extraction}, graphflow.BuildOptions{})
report, _ := graphflow.Analyze(ctx, db, graphflow.AnalyzeRequest{TopN: 10})
_, _ = graphflow.Export(ctx, db, graphflow.ExportRequest{OutputDir: "graphflow-out", Analysis: report})

LLM extraction depends only on graphflow.JSONGenerator. The example examples/05_graphflow demonstrates github.com/openai/openai-go/v3 with JSON Schema structured output.

Tools and MCP

In-process tool calling:

tools := db.GraphRAGTools()
defs := tools.Definitions()
resp, err := tools.Call(ctx, "knowledge_graph_query", payload)
_, _, _ = defs, resp, err

MCP:

server := db.NewMCPServer(cortexdb.MCPServerOptions{})
_ = server

Examples

The examples directory is organized by architecture:

go run ./examples/01_core
go run ./examples/02_rag
go run ./examples/03_memoryflow
go run ./examples/04_knowledge_graph
go run ./examples/05_graphflow
go run ./examples/06_tools_mcp

Index

Constants

const Version = "2.18.0"

Version represents the current version of the cortexdb library.

Variables

This section is empty.

Functions

This section is empty.

Types

This section is empty.

Directories

Path Synopsis
cmd
examples
01_core command
02_rag command
03_memoryflow command
05_graphflow command
06_tools_mcp command
internal
pkg
core
  Package core provides advanced search capabilities
cortexdb
  Package cortexdb provides a lightweight SQLite-based vector database for Go AI projects
geo
  Package geo provides geo-spatial indexing and search capabilities for cortexdb
graphflow
  Package graphflow provides a library-first graph extraction/build/report/export pipeline over CortexDB's graph and RDF storage.
hindsight
  Package hindsight: chat.go provides a thin wrapper around the cortexdb session/message API that optionally auto-triggers fact extraction.
index
  Package index provides vector indexing implementations
memoryflow
  Package memoryflow provides a higher-level workflow facade on top of CortexDB's memory, knowledge, and KnowledgeMemory primitives.
quantization
  Package quantization provides vector compression techniques
semantic-router
  Package semantic-router provides a semantic routing layer for LLM applications.
