knowledge

package
v0.2.3
Published: Apr 29, 2026 | License: MIT | Imports: 24 | Imported by: 0

Documentation

Overview

Package knowledge — v0.2.x compatibility layer (removed in v0.3.0).

This file is the single home for every symbol that the v0.3.0 architecture supersedes. Each symbol carries a // Deprecated: tag so staticcheck (SA1019) flags new callers; the index below is the canonical "what replaces what" map.

=== Replacement index ===

Storage / orchestration
  Store                    -> *Service                      (sdk/knowledge)
  FSStore                  -> factory.NewLocal              (sdk/knowledge/factory)
  RetrievalStore           -> factory.NewRetrieval          (sdk/knowledge/factory)
  CachedStore              -> (none — fold caching into the repo)

Data models
  Document                 -> SourceDocument + DerivedLayer
  SearchResult             -> Hit
  SearchOptions            -> Query (with Scope/Mode/Layer)
  Chunk                    -> DerivedChunk
  ContextLayer             -> Layer        (alias kept for transition)
  SearchMode               -> Mode         (alias kept for transition)
  ModeSemantic             -> ModeVector

Graph node
  KnowledgeConfig          -> KnowledgeNodeConfig
  KnowledgeNode            -> KnowledgeServiceNode
  NewKnowledgeNode         -> NewKnowledgeServiceNode
  KnowledgeConfigFromMap   -> KnowledgeNodeConfigFromMap
  RegisterNode             -> RegisterServiceNode
  KnowledgeNodeSchema      -> KnowledgeServiceNodeSchema

LLM tools
  NewSearchTool            -> NewSearchServiceTool
  NewAddTool               -> NewPutServiceTool

Reload pipeline
  ChangeNotifier           -> EventNotifier  (typed ChangeEvent stream)
  Reloader                 -> EventReloader  (scope-aware, serialised)
  NewReloader              -> NewEventReloader

Helpers
  ChunkDocument            -> ChunkText      (returns DerivedChunk)
  RankResults              -> RRFRanker (the SearchEngine.Ranker)
  RRFMerge                 -> RRFRanker
  ScoreChunk               -> textsearch.BM25 directly
  parseFrontmatter         -> Service handles frontmatter internally

=== Behaviour bridges that survive v0.3.0 ===

ResolveMode("")           -> ModeBM25
ResolveMode("semantic")   -> ModeVector
KnowledgeNodeConfigFromMap reads "max_layer" as "layer" when
  "layer" is absent.

=== Things that are NOT deprecated ===

GenerateDocumentContext / GenerateDatasetContext — the L0/L1
  derivation helpers remain external to Service so callers control
  scheduling, retry and persistence policy.
DatasetQuery — shared by knowledgenode.Config and the legacy
  KnowledgeConfig (lives in this file because both consumers reach it
  through the knowledge package).
Tokenizer / textsearch.Tokenizer — backend-neutral utility.
CosineSimilarity — used by backend implementations.

Package knowledge implements the v0.3.0 layered knowledge base: document storage, chunking, tokenization, and BM25 / vector / hybrid retrieval over three context layers (L0 abstract, L1 overview, L2 chunk detail).

Architecture

  • Service (this package): orchestrates DocumentRepo / ChunkRepo / LayerRepo, normalises Query and stamps DerivedSig so callers see a single coherent contract.
  • factory (sdk/knowledge/factory): wires Service against either filesystem-backed (factory.NewLocal) or retrieval.Index-backed (factory.NewRetrieval) repositories.
  • SearchEngine (this package): runs Retrievers in parallel, fuses with a Ranker (RRF by default).
  • EventReloader (this package): debounces ChangeEvents and triggers Service.Rebuild with the smallest possible scope.

L0/L1 derivation (GenerateDocumentContext / GenerateDatasetContext) is kept external to Service so callers own scheduling, retry and persistence policy.

Migration: every v0.2.x symbol survives in deprecated.go (tagged // Deprecated:) until v0.3.0; consult deprecated.go for the full new-name index.

Constants

const (
	// AbstractPrompt produces a single-sentence L0 summary (~100 tokens).
	AbstractPrompt = `` /* 155-byte string literal not displayed */

	// OverviewPrompt produces a structured L1 overview (~1000 tokens).
	OverviewPrompt = `` /* 187-byte string literal not displayed */

	// DatasetOverviewPrompt produces a dataset-level L1 from per-document
	// abstracts ("- name: abstract" lines).
	DatasetOverviewPrompt = `` /* 196-byte string literal not displayed */

)

Prompt templates used to derive layered context from raw documents. Exported so callers can override or compose their own pipelines while reusing the SDK's defaults for the common case.

const DefaultPromptInputLimit = 8000

DefaultPromptInputLimit is the maximum number of characters of document content fed into a prompt. Content beyond this is truncated to keep prompts within typical context windows.

const DefaultRRFK = 60

DefaultRRFK is the conventional fusion constant for reciprocal-rank fusion. Smaller K weights the head of each ranked list more heavily.
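
To illustrate the effect of K, the sketch below evaluates the standard reciprocal-rank term 1/(K + rank) at ranks 1 and 10. The helper name rrfContribution is hypothetical and the 1-based rank convention is an assumption; the package's own implementation is not shown on this page.

```go
package main

import "fmt"

// rrfContribution is a hypothetical helper: the standard RRF term one ranked
// list contributes for an item at the given 1-based rank (an assumption, not
// the package's code).
func rrfContribution(k, rank int) float64 {
	return 1.0 / float64(k+rank)
}

func main() {
	for _, k := range []int{1, 60} {
		head := rrfContribution(k, 1)
		tail := rrfContribution(k, 10)
		// A larger head/tail ratio means the top of each list dominates.
		fmt.Printf("K=%d head/tail ratio = %.2f\n", k, head/tail)
	}
}
```

With K=1 the rank-1 item contributes 5.5 times as much as the rank-10 item; with K=60 the ratio falls to about 1.15, so the fused ordering depends less on exact head positions.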

const DefaultThreshold = 0.1

DefaultThreshold is the BM25-scale relevance floor used by legacy search paths. Service-driven Search uses Query.Threshold instead.

Variables

var (
	DetectTokenizer = textsearch.DetectTokenizer
	NewCorpusStats  = textsearch.NewCorpusStats
	ExtractKeywords = textsearch.ExtractKeywords
	ScoreText       = textsearch.ScoreText
)

Functions

func ChunkConfigSig added in v0.2.1

func ChunkConfigSig(c ChunkConfig) string

ChunkConfigSig returns a stable signature for a chunker configuration. It is the ChunkerSig embedded in DerivedSig so freshness checks can detect a configuration change.

func CosineSimilarity added in v0.2.1

func CosineSimilarity(a, b []float32) float64

CosineSimilarity returns the cosine similarity of two equal-length vectors. Returns 0 when lengths differ, either vector is empty, or either vector has zero norm. Exported so backends can share one canonical implementation.
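
The documented contract can be sketched as follows. This is an illustrative re-implementation under the stated edge cases, not the package's source; the lower-case name cosineSimilarity is a local stand-in.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity is an illustrative re-implementation of the documented
// contract; it is not the package's source.
func cosineSimilarity(a, b []float32) float64 {
	// Empty input or mismatched lengths yield 0.
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	// A zero-norm vector has no direction, so similarity is defined as 0.
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosineSimilarity([]float32{1, 0}, []float32{1, 0})) // identical direction
	fmt.Println(cosineSimilarity([]float32{1, 0}, []float32{0, 1})) // orthogonal
	fmt.Println(cosineSimilarity([]float32{1, 2}, []float32{1}))    // length mismatch
}
```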

func IsValidLayer added in v0.2.1

func IsValidLayer(l Layer) bool

IsValidLayer reports whether l is a recognised layer.

(Methods on a type alias must be declared on the underlying type; IsValidLayer is provided as a free function to avoid colliding with future ContextLayer methods.)

func IsValidMode added in v0.2.1

func IsValidMode(m Mode) bool

IsValidMode reports whether m is a recognised mode.

The empty string is accepted for backwards compatibility (legacy callers used "" to mean BM25); ResolveMode normalises it to ModeBM25.

func NewAddTool deprecated

func NewAddTool(ks Store) tool.Tool

NewAddTool returns the legacy "knowledge_add" LLM tool.

Deprecated: use NewPutServiceTool(*Service). Removed in v0.3.0.

func NewPutServiceTool added in v0.2.1

func NewPutServiceTool(svc *Service) tool.Tool

NewPutServiceTool exposes Service.PutDocument to LLMs (v0.3.0).

Tool name: "knowledge_put"

Input JSON:

{
  "dataset_id": string,   // default "default"
  "name":       string,   // required
  "content":    string    // required
}

Output:

{"status": "ok", "dataset_id": ..., "name": ..., "version": uint}

The tool returns the new SourceDocument.Version so callers can chain derivation work (layer generation, vector backfill) keyed off the freshness signal.

func NewSearchServiceTool added in v0.2.1

func NewSearchServiceTool(svc *Service) tool.Tool

NewSearchServiceTool exposes Service.Search to LLMs (v0.3.0).

Tool name: "knowledge_search"

Input JSON:

{
  "query":      string,                       // required
  "scope":      "single"|"all",               // default "all"
  "dataset_id": string,                       // required when scope=single
  "mode":       "bm25"|"vector"|"hybrid",     // default "bm25"
  "layer":      "L0"|"L1"|"L2",               // default "L2"
  "top_k":      integer,                      // default 5
  "threshold":  number                        // default 0
}

Output: JSON-encoded []Hit.

Backwards compatibility: when callers send only the legacy {query, top_k} fields, the tool defaults to scope=all/mode=bm25/layer=L2 — the same behaviour as the deprecated NewSearchTool(Store) helper.
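
The defaulting behaviour described above can be sketched with plain encoding/json. The searchInput struct and decodeSearchInput helper are hypothetical stand-ins; the tool's internal input type and error messages are not shown on this page.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// searchInput mirrors the documented input JSON. The struct itself is a
// hypothetical stand-in for whatever the tool uses internally.
type searchInput struct {
	Query     string  `json:"query"`
	Scope     string  `json:"scope"`
	DatasetID string  `json:"dataset_id"`
	Mode      string  `json:"mode"`
	Layer     string  `json:"layer"`
	TopK      int     `json:"top_k"`
	Threshold float64 `json:"threshold"`
}

// decodeSearchInput pre-fills the documented defaults, then lets the JSON
// payload overwrite whichever fields it actually carries.
func decodeSearchInput(raw []byte) (searchInput, error) {
	in := searchInput{Scope: "all", Mode: "bm25", Layer: "L2", TopK: 5}
	if err := json.Unmarshal(raw, &in); err != nil {
		return searchInput{}, err
	}
	if in.Query == "" {
		return searchInput{}, errors.New("query is required")
	}
	if in.Scope == "single" && in.DatasetID == "" {
		return searchInput{}, errors.New("dataset_id is required when scope=single")
	}
	return in, nil
}

func main() {
	// A legacy payload that carries only query and top_k.
	in, err := decodeSearchInput([]byte(`{"query": "retry policy", "top_k": 3}`))
	fmt.Printf("%+v %v\n", in, err)
}
```

Pre-filling the struct before json.Unmarshal means absent fields keep their defaults, which is one simple way to get the legacy {query, top_k} behaviour.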

func NewSearchTool deprecated

func NewSearchTool(ks Store) tool.Tool

NewSearchTool returns the legacy "knowledge_search" LLM tool.

Deprecated: use NewSearchServiceTool(*Service). Removed in v0.3.0.

func ScoreChunk deprecated

func ScoreChunk(chunk *Chunk, keywords []string, corpus *CorpusStats, tokenizer Tokenizer) float64

ScoreChunk computes the BM25 score for a chunk against query keywords.

Deprecated: use textsearch.BM25 directly with DerivedChunk content. Removed in v0.3.0.

Types

type BM25Retriever added in v0.2.1

type BM25Retriever struct {
	Chunks ChunkRepo
}

BM25Retriever queries ChunkRepo with Mode=ModeBM25. It is layered: queries whose Layer is not LayerDetail short-circuit to nil.

func NewBM25Retriever added in v0.2.1

func NewBM25Retriever(c ChunkRepo) *BM25Retriever

NewBM25Retriever constructs a BM25Retriever bound to a ChunkRepo.

func (*BM25Retriever) Name added in v0.2.1

func (r *BM25Retriever) Name() string

Name implements Retriever.

func (*BM25Retriever) Recall added in v0.2.1

func (r *BM25Retriever) Recall(ctx context.Context, q Query) ([]Candidate, error)

Recall implements Retriever. ModeVector queries are skipped because BM25 cannot satisfy them; ModeHybrid issues both lanes and the Ranker fuses the result.

type CJKTokenizer

type CJKTokenizer = textsearch.CJKTokenizer

Type and function aliases re-exported from sdk/textsearch for backward compatibility. Internal code and tests can use these without changing import paths.

type CacheOption deprecated

type CacheOption func(*CachedStore)

CacheOption configures a CachedStore.

Deprecated: see CachedStore.

func WithMaxItems deprecated

func WithMaxItems(n int) CacheOption

WithMaxItems sets the maximum number of cached items.

Deprecated: see CachedStore.

func WithTTL deprecated

func WithTTL(d time.Duration) CacheOption

WithTTL sets the cache time-to-live.

Deprecated: see CachedStore.

type CachedStore deprecated

type CachedStore struct {
	// contains filtered or unexported fields
}

CachedStore wraps a Store with TTL + LRU caching for read operations.

Deprecated: caching now lives inside Service / repository implementations where appropriate; the indirection no longer earns its keep at the orchestration layer. Removed in v0.3.0.

func NewCachedStore deprecated

func NewCachedStore(inner Store, opts ...CacheOption) *CachedStore

NewCachedStore wraps inner with caching.

Deprecated: see CachedStore. Removed in v0.3.0.

func (*CachedStore) Abstract

func (s *CachedStore) Abstract(ctx context.Context, datasetID, name string) (string, error)

func (*CachedStore) AddDocument

func (s *CachedStore) AddDocument(ctx context.Context, datasetID, name, content string) error

func (*CachedStore) AddDocuments

func (s *CachedStore) AddDocuments(ctx context.Context, datasetID string, docs []DocInput) error

func (*CachedStore) DatasetAbstract

func (s *CachedStore) DatasetAbstract(ctx context.Context, datasetID string) (string, error)

func (*CachedStore) DatasetOverview

func (s *CachedStore) DatasetOverview(ctx context.Context, datasetID string) (string, error)

func (*CachedStore) DeleteDocument

func (s *CachedStore) DeleteDocument(ctx context.Context, datasetID, name string) error

func (*CachedStore) EvictDataset

func (s *CachedStore) EvictDataset(datasetID string)

EvictDataset removes all cached entries for a dataset.

func (*CachedStore) GetDocument

func (s *CachedStore) GetDocument(ctx context.Context, datasetID, name string) (*Document, error)

func (*CachedStore) ListDocuments

func (s *CachedStore) ListDocuments(ctx context.Context, datasetID string) ([]Document, error)

func (*CachedStore) Overview

func (s *CachedStore) Overview(ctx context.Context, datasetID, name string) (string, error)

func (*CachedStore) Search

func (s *CachedStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)

type Candidate added in v0.2.1

type Candidate struct {
	Hit    Hit
	Source string
}

Candidate is the per-item recall result returned by ChunkRepo / LayerRepo. Source identifies the producing retriever ("bm25" / "vector" / "layer") and is consumed by the Ranker for fusion.

type ChangeEvent added in v0.2.1

type ChangeEvent struct {
	DatasetID string
	DocName   string
	Kind      EventKind
}

ChangeEvent carries enough granularity for targeted rebuilds. DocName == "" denotes a dataset-level event.

NOTE (v0.2.x): The deprecated ChangeNotifier in deprecated.go still emits opaque struct{} events; sdkx/knowledge/watcher remains its only in-tree producer. The ChangeEvent shape declared here is what EventNotifier implementations will emit once watcher migrates in v0.3.0.

type ChangeNotifier deprecated

type ChangeNotifier interface {
	Events() <-chan struct{}
	Close() error
}

ChangeNotifier emits an opaque event whenever the underlying source changes.

Deprecated: use EventNotifier (typed ChangeEvent stream) with EventReloader. Removed in v0.3.0.

type Chunk deprecated

type Chunk struct {
	DocName string `json:"doc_name"`
	Index   int    `json:"index"`
	Content string `json:"content"`
	Offset  int    `json:"offset"`
}

Chunk represents a segment of a document.

Deprecated: use DerivedChunk. Chunk is removed in v0.3.0.

func ChunkDocument deprecated

func ChunkDocument(docName, content string, cfg ChunkConfig) []Chunk

ChunkDocument splits content into overlapping chunks, preferring to break at paragraph or sentence boundaries.

Deprecated: use ChunkText (returns []DerivedChunk). Removed in v0.3.0.

type ChunkConfig

type ChunkConfig struct {
	ChunkSize    int `json:"chunk_size,omitempty"`
	ChunkOverlap int `json:"chunk_overlap,omitempty"`
}

ChunkConfig controls document chunking.

func DefaultChunkConfig

func DefaultChunkConfig() ChunkConfig

DefaultChunkConfig returns the default chunking configuration.

type ChunkQuery added in v0.2.1

type ChunkQuery struct {
	DatasetIDs []string
	Text       string
	Vector     []float32
	Mode       Mode
	TopK       int
}

ChunkQuery is the recall input passed by Retrievers to ChunkRepo.Search.

Empty DatasetIDs means "every dataset" (cross-dataset search). When Mode is ModeVector or ModeHybrid, Vector should be supplied; backends that cannot satisfy a mode return an empty result without error.

type ChunkRepo added in v0.2.1

type ChunkRepo interface {
	Replace(ctx context.Context, datasetID, docName string, chunks []DerivedChunk) error
	DeleteByDoc(ctx context.Context, datasetID, docName string) error
	DeleteByDataset(ctx context.Context, datasetID string) error
	Search(ctx context.Context, q ChunkQuery) ([]Candidate, error)
}

ChunkRepo persists DerivedChunks and supports recall.

Replace MUST be atomic: callers rely on it to eliminate stale chunks when a SourceDocument is updated (contract guarantee #5).

type ChunkSpec added in v0.2.1

type ChunkSpec struct {
	Index   int
	Offset  int
	Content string
}

ChunkSpec is the chunker output: a positionally-tagged content slice without dataset/doc identity (the Service fills those in).

type Chunker added in v0.2.1

type Chunker interface {
	Split(content string) []ChunkSpec
	Sig() string
}

Chunker turns raw content into ordered ChunkSpecs.

Implementations MUST be deterministic (same input -> same output) and MUST return a stable Sig() so derived data freshness can be checked.

func NewDefaultChunker added in v0.2.1

func NewDefaultChunker(cfg ChunkConfig) Chunker

NewDefaultChunker returns the built-in paragraph/sentence-boundary chunker. Its output is UTF-8 safe; chunks never split inside a multi-byte rune.

Sizes are measured in runes (not bytes), so multi-byte UTF-8 content (CJK, emoji, etc.) is sliced safely on rune boundaries.

type ContextLayer

type ContextLayer string

ContextLayer indicates the granularity of a search result.

v0.3.0 will rename ContextLayer -> Layer; the alias declared in model.go (type Layer = ContextLayer) lets new code adopt the final name today without breaking existing callers.

const (
	LayerAbstract ContextLayer = "L0" // ~100 token one-sentence summary
	LayerOverview ContextLayer = "L1" // ~1k token structured overview
	LayerDetail   ContextLayer = "L2" // full chunk content
)

type CorpusStats

type CorpusStats = textsearch.CorpusStats

Type and function aliases re-exported from sdk/textsearch for backward compatibility. Internal code and tests can use these without changing import paths.

type DatasetContext

type DatasetContext struct {
	Abstract string // dataset-level L0
	Overview string // dataset-level L1
}

DatasetContext groups the layered context for an entire dataset.

func GenerateDatasetContext

func GenerateDatasetContext(ctx context.Context, l llm.LLM, summaries []DocumentSummary) (DatasetContext, error)

GenerateDatasetContext derives dataset-level L0 + L1 from per-document abstracts. The L1 overview is generated first, then distilled into L0. Returns an empty context with no error when summaries is empty.

type DatasetQuery

type DatasetQuery struct {
	DatasetID string `json:"dataset_id"`
	StateKey  string `json:"state_key"`
	TopK      int    `json:"top_k"`
}

DatasetQuery describes a single dataset search within a Knowledge node. Re-used by both the v0.3.0 knowledgenode.Config and the deprecated KnowledgeConfig (kept stable across versions).

type DerivedChunk added in v0.2.1

type DerivedChunk struct {
	DatasetID string
	DocName   string
	Index     int
	Offset    int
	Content   string
	Vector    []float32
	Sig       DerivedSig
}

DerivedChunk is one retrieval unit derived from a SourceDocument.

func ChunkText added in v0.2.1

func ChunkText(docName, content string, cfg ChunkConfig) []DerivedChunk

ChunkText splits content into UTF-8-safe overlapping chunks.

Behavior:

  • Slicing is performed on rune indices, never byte indices.
  • End offsets are reported in bytes for compatibility with retrieval filters that key on byte position.
  • Boundary preference: paragraph (\n\n) > sentence (". ") > line (\n).
  • The result is deterministic for a given (content, cfg) pair.
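
A minimal sketch of the rune-based slicing described above. It deliberately omits the boundary preference and simply cuts fixed-size rune windows. The piece type and chunkRunes helper are hypothetical, and reporting the byte offset of each chunk's start is an assumption about how offsets are keyed.

```go
package main

import "fmt"

// piece is a hypothetical stand-in for DerivedChunk, reduced to three fields.
type piece struct {
	Index   int
	Offset  int // byte offset of the chunk start in the original content
	Content string
}

// chunkRunes cuts content into windows of size runes that overlap by overlap
// runes. It does not look for paragraph, sentence or line boundaries.
func chunkRunes(content string, size, overlap int) []piece {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	runes := []rune(content)
	step := size - overlap
	var out []piece
	byteOff := 0 // byte offset corresponding to runes[start]
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		out = append(out, piece{Index: len(out), Offset: byteOff, Content: string(runes[start:end])})
		if end == len(runes) {
			break
		}
		// Advance the byte offset by the encoded size of the runes stepped over.
		byteOff += len(string(runes[start : start+step]))
	}
	return out
}

func main() {
	// Every rune in this string is three bytes long in UTF-8.
	for _, p := range chunkRunes("日本語のテキスト", 4, 1) {
		fmt.Printf("%d @%d %q\n", p.Index, p.Offset, p.Content)
	}
}
```

Because the windows are cut on the []rune slice, no chunk can end inside a multi-byte character, while Offset still lets byte-oriented consumers locate the chunk in the original string.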

type DerivedLayer added in v0.2.1

type DerivedLayer struct {
	DatasetID string
	DocName   string
	Layer     Layer
	Content   string
	Vector    []float32
	Sig       DerivedSig
}

DerivedLayer is an LLM-produced summary of a document or dataset. DocName == "" denotes a dataset-level layer.

type DerivedSig added in v0.2.1

type DerivedSig struct {
	SourceVer  uint64
	ChunkerSig string
	PromptSig  string
	EmbedSig   string
}

DerivedSig binds a derived artifact to the source revision and to the configuration that produced it. Required on every derived object.

  • SourceVer is the SourceDocument.Version that produced this artifact.
  • ChunkerSig is non-empty for chunk artifacts and empty for layers.
  • PromptSig is non-empty for layer artifacts and empty for chunks.
  • EmbedSig identifies the embedder, "" when no vector is attached.

func (DerivedSig) IsStale added in v0.2.1

func (sig DerivedSig) IsStale(want DerivedSig) bool

IsStale returns true when sig was produced for an earlier source version or with a different chunker / prompt / embed configuration than want.

A zero EmbedSig in want is treated as "don't care".
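
One reading of the staleness rule, written as a self-contained sketch. The derivedSig type copies the fields listed above; the comparison logic is an interpretation of the prose, not the package's source.

```go
package main

import "fmt"

// derivedSig copies the DerivedSig fields shown above so the sketch compiles
// on its own.
type derivedSig struct {
	SourceVer  uint64
	ChunkerSig string
	PromptSig  string
	EmbedSig   string
}

// isStale is one reading of the documented rule, not the package's source.
func (sig derivedSig) isStale(want derivedSig) bool {
	if sig.SourceVer < want.SourceVer {
		return true // produced from an earlier source version
	}
	if sig.ChunkerSig != want.ChunkerSig || sig.PromptSig != want.PromptSig {
		return true // chunker or prompt configuration changed
	}
	// An empty EmbedSig in want means the caller does not care about embeddings.
	return want.EmbedSig != "" && sig.EmbedSig != want.EmbedSig
}

func main() {
	stored := derivedSig{SourceVer: 3, ChunkerSig: "c1", EmbedSig: "e1"}
	fmt.Println(stored.isStale(derivedSig{SourceVer: 4, ChunkerSig: "c1"}))                 // newer source wanted
	fmt.Println(stored.isStale(derivedSig{SourceVer: 3, ChunkerSig: "c1"}))                 // EmbedSig ignored
	fmt.Println(stored.isStale(derivedSig{SourceVer: 3, ChunkerSig: "c1", EmbedSig: "e2"})) // embedder changed
}
```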

type DocInput deprecated

type DocInput struct {
	Name    string
	Content string
}

DocInput is a name+content pair for batch document ingestion.

Deprecated: use Service.PutDocument (one call per document). Removed in v0.3.0.

type Document deprecated

type Document struct {
	Name     string            `json:"name"`
	Content  string            `json:"content"`
	Abstract string            `json:"abstract,omitempty"` // L0
	Overview string            `json:"overview,omitempty"` // L1
	Metadata map[string]string `json:"metadata,omitempty"`
}

Document represents a knowledge base document.

Deprecated: use SourceDocument (raw content + Version) and DerivedLayer (L0/L1) separately. Document conflates the two and is removed in v0.3.0.

type DocumentContext

type DocumentContext struct {
	Abstract string // L0
	Overview string // L1
}

DocumentContext groups the layered context for a single document.

func GenerateDocumentContext

func GenerateDocumentContext(ctx context.Context, l llm.LLM, content string) (DocumentContext, error)

GenerateDocumentContext synthesizes L0 (abstract) and L1 (overview) for a document by issuing two LLM calls. Beyond those two calls it has no side effects: no storage I/O, no caching, no retries; callers own scheduling and persistence.

Returns a partial result on error: if abstract generation fails the zero-value context is returned with the error; if overview fails the already-generated abstract is preserved so callers can choose to persist it.

type DocumentRepo added in v0.2.1

type DocumentRepo interface {
	Put(ctx context.Context, doc SourceDocument) error
	Get(ctx context.Context, datasetID, name string) (*SourceDocument, error)
	Delete(ctx context.Context, datasetID, name string) error
	List(ctx context.Context, datasetID string) ([]SourceDocument, error)
	ListDatasets(ctx context.Context) ([]string, error)
}

DocumentRepo persists SourceDocuments. Implementations MUST guarantee:

  • Put atomically increments SourceDocument.Version.
  • Get returns the most recent Put with Content losslessly preserved (contract guarantee #4).
  • Delete is idempotent.

Implementations live in sdk/knowledge/backend/*.

type DocumentSummary

type DocumentSummary struct {
	Name     string
	Abstract string
}

DocumentSummary pairs a document name with its L0 abstract, used as input to GenerateDatasetContext.

type Embedder

type Embedder = embedding.Embedder

Embedder is an alias for the SDK embedding.Embedder interface. It supports both single-text and batch embeddings.

type EventKind added in v0.2.1

type EventKind int

EventKind classifies a ChangeEvent emitted by an EventNotifier implementation. Used by EventReloader to decide whether to perform a targeted or dataset-wide rebuild.

const (
	// EventPut signals that a single document was created or updated.
	EventPut EventKind = iota
	// EventDelete signals that a single document was removed.
	EventDelete
	// EventBulk signals a dataset-level mass change (e.g. snapshot replaced).
	EventBulk
)

type EventNotifier added in v0.2.1

type EventNotifier interface {
	Events() <-chan ChangeEvent
	Close() error
}

EventNotifier is the v0.3.0 producer side of the reload pipeline. It supersedes the deprecated ChangeNotifier (defined in deprecated.go): events carry dataset/doc granularity so the consumer can issue targeted Rebuilds instead of a global one.

Implementations live in adapter packages (e.g. sdkx/knowledge/watcher, once it migrates) so the sdk core stays dependency-free. Implementations MUST close the Events channel when Close() is called.

type EventReloader added in v0.2.1

type EventReloader struct {
	// contains filtered or unexported fields
}

EventReloader debounces ChangeEvents and triggers Rebuild on the trailing edge. Rebuilds are serialised: a new rebuild waits for the previous one to finish.

Targeted vs global rebuilds: when the debounce window contains events for a single (dataset, doc) pair, the rebuild is scoped to that pair; when it touches multiple documents in one dataset or contains any EventBulk event, a dataset-wide rebuild is issued instead. Events for more than one dataset in the same window collapse to a global RebuildScope{} (every dataset).
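
The scope selection can be sketched as a pure function over one debounce window. This is an interpretation, assuming that several documents in one dataset (or any bulk event) widen the scope to that dataset and that events from different datasets widen it to everything. The lower-case types mirror the exported ones on this page, and collapse is a hypothetical helper name.

```go
package main

import "fmt"

type eventKind int

const (
	eventPut eventKind = iota
	eventDelete
	eventBulk
)

type changeEvent struct {
	DatasetID string
	DocName   string
	Kind      eventKind
}

type rebuildScope struct {
	DatasetID string // "" means all datasets
	DocName   string // "" means all documents in the dataset
}

// collapse reduces one debounce window of events to a single scope.
// Callers are assumed to skip empty windows; an empty slice maps to the
// global scope here only to keep the function total.
func collapse(events []changeEvent) rebuildScope {
	if len(events) == 0 {
		return rebuildScope{}
	}
	scope := rebuildScope{DatasetID: events[0].DatasetID, DocName: events[0].DocName}
	for _, ev := range events {
		if ev.DatasetID != scope.DatasetID {
			return rebuildScope{} // mixed datasets: rebuild everything
		}
		if ev.Kind == eventBulk || ev.DocName != scope.DocName {
			scope.DocName = "" // widen to the whole dataset
		}
	}
	return scope
}

func main() {
	fmt.Printf("%+v\n", collapse([]changeEvent{{"docs", "a.md", eventPut}, {"docs", "a.md", eventPut}}))
	fmt.Printf("%+v\n", collapse([]changeEvent{{"docs", "a.md", eventPut}, {"docs", "b.md", eventDelete}}))
	fmt.Printf("%+v\n", collapse([]changeEvent{{"docs", "a.md", eventPut}, {"faq", "q.md", eventPut}}))
}
```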

EventReloader is the v0.3.0 successor to Reloader. The legacy Reloader (with its struct{}-channel ChangeNotifier, both in deprecated.go) remains exported during the deprecation window and will be removed in v0.3.0.

func NewEventReloader added in v0.2.1

func NewEventReloader(target Rebuilder, notifier EventNotifier, opts ReloaderOptions) *EventReloader

NewEventReloader wires a Rebuilder to an EventNotifier.

opts.Debounce defaults to 500ms; opts.RebuildTimeout defaults to 30s.

When target or notifier is nil, Run becomes a no-op (returns immediately) and Close is also a no-op for the background loop, so callers can wire up unconditionally and the call order does not matter.

In the normal (non-nil) configuration the contract is:

  • Run MUST be invoked at most once before Close;
  • Close blocks until Run has actually exited AND the notifier has been closed, so callers observing a successful Close are guaranteed no further Rebuild / timer callback can fire.

To uphold the second guarantee even when Close races with the goroutine that is about to call Run, wg.Add(1) is performed here (synchronously, before the constructor returns) rather than inside Run. This way wg.Add strictly happens-before any wg.Wait in Close, which is what sync.WaitGroup requires when its counter starts at zero.
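
The WaitGroup ordering described above can be shown in miniature. The loop type below is a hypothetical reduction of the lifecycle (no notifier, no debounce timer); it only demonstrates why registering with the WaitGroup in the constructor makes Close safe against a Run goroutine that has not started executing yet.

```go
package main

import (
	"fmt"
	"sync"
)

// loop is a hypothetical miniature of the lifecycle described above.
type loop struct {
	wg   sync.WaitGroup
	done chan struct{}
	once sync.Once
}

func newLoop() *loop {
	l := &loop{done: make(chan struct{})}
	l.wg.Add(1) // registered here so it happens-before any Wait in Close
	return l
}

// Run blocks until Close is called. It must be called exactly once.
func (l *loop) Run() {
	defer l.wg.Done()
	<-l.done
}

// Close signals Run to stop and waits for it to return.
func (l *loop) Close() {
	l.once.Do(func() { close(l.done) })
	l.wg.Wait()
}

func main() {
	l := newLoop()
	go l.Run()
	l.Close() // safe even if the goroutine has not reached Run yet
	fmt.Println("Run has exited")
}
```

The same structure also shows the documented hazard: if Run is never started, nothing calls wg.Done and Close blocks forever.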

func (*EventReloader) Close added in v0.2.1

func (r *EventReloader) Close() error

Close stops Run and the underlying notifier.

In the normal (non-nil target & notifier) configuration Close blocks until Run has actually returned, so on success the caller is guaranteed that no further Rebuild call or timer-driven flush can happen. Calling Close before Run was started would deadlock, so callers MUST start Run first; this matches the EventReloader lifecycle documented on Run / NewEventReloader.

When target or notifier is nil, Close is a no-op and may be called at any time.

func (*EventReloader) Run added in v0.2.1

func (r *EventReloader) Run(ctx context.Context) error

Run blocks until ctx is cancelled or Close is called. Run MUST be called at most once. When target or notifier is nil, Run returns immediately as a no-op.

type FSStore deprecated

type FSStore struct {
	// contains filtered or unexported fields
}

FSStore implements Store using a Workspace-backed file tree.

Deprecated: use factory.NewLocal(ws, opts...). Removed in v0.3.0.

func NewFSStore deprecated

func NewFSStore(ws workspace.Workspace, opts ...FSStoreOption) *FSStore

NewFSStore creates a knowledge store rooted at the given prefix.

Deprecated: use factory.NewLocal(ws, opts...). Removed in v0.3.0.

func (*FSStore) Abstract

func (s *FSStore) Abstract(ctx context.Context, datasetID, name string) (string, error)

func (*FSStore) AddDocument

func (s *FSStore) AddDocument(ctx context.Context, datasetID, name, content string) error

func (*FSStore) AddDocuments

func (s *FSStore) AddDocuments(ctx context.Context, datasetID string, docs []DocInput) error

func (*FSStore) BuildIndex

func (s *FSStore) BuildIndex(ctx context.Context) error

BuildIndex scans all datasets and builds the in-memory search index.

func (*FSStore) DatasetAbstract

func (s *FSStore) DatasetAbstract(_ context.Context, datasetID string) (string, error)

func (*FSStore) DatasetOverview

func (s *FSStore) DatasetOverview(_ context.Context, datasetID string) (string, error)

func (*FSStore) DeleteDocument

func (s *FSStore) DeleteDocument(ctx context.Context, datasetID, name string) error

func (*FSStore) GetDocument

func (s *FSStore) GetDocument(ctx context.Context, datasetID, name string) (*Document, error)

func (*FSStore) ListDocuments

func (s *FSStore) ListDocuments(ctx context.Context, datasetID string) ([]Document, error)

func (*FSStore) Overview

func (s *FSStore) Overview(ctx context.Context, datasetID, name string) (string, error)

func (*FSStore) Prefix

func (s *FSStore) Prefix() string

Prefix returns the FSStore directory prefix beneath WorkspaceRoot.

func (*FSStore) ReindexVectors

func (s *FSStore) ReindexVectors(ctx context.Context) error

ReindexVectors regenerates vector embeddings for all indexed chunks.

func (*FSStore) Search

func (s *FSStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)

Search performs a two-level search over the in-memory index.

func (*FSStore) SetDatasetAbstract

func (s *FSStore) SetDatasetAbstract(datasetID, abstract string)

func (*FSStore) SetDatasetOverview

func (s *FSStore) SetDatasetOverview(datasetID, overview string)

func (*FSStore) SetDocAbstract

func (s *FSStore) SetDocAbstract(datasetID, name, abstract string)

func (*FSStore) SetDocOverview

func (s *FSStore) SetDocOverview(datasetID, name, overview string)

func (*FSStore) WorkspaceRoot

func (s *FSStore) WorkspaceRoot() string

WorkspaceRoot exposes the underlying workspace root when available.

Returns "" if the workspace does not implement Root().

func (*FSStore) WriteDatasetFile

func (s *FSStore) WriteDatasetFile(ctx context.Context, datasetID, filename, content string) error

WriteDatasetFile writes a dataset-level file (e.g. .abstract.md, .overview.md).

func (*FSStore) WriteSidecar

func (s *FSStore) WriteSidecar(ctx context.Context, datasetID, name, ext, content string) error

WriteSidecar writes a per-document sidecar file (e.g. ".abstract", ".overview") used to persist layered context derived externally.

type FSStoreOption deprecated

type FSStoreOption func(*FSStore)

FSStoreOption configures an FSStore.

Deprecated: see FSStore.

func WithChunkConfig deprecated

func WithChunkConfig(cfg ChunkConfig) FSStoreOption

WithChunkConfig sets the chunking configuration.

Deprecated: configure ChunkConfig on factory.NewLocal's chunker option.

func WithEmbedder deprecated

func WithEmbedder(e Embedder) FSStoreOption

WithEmbedder sets the embedder for semantic/hybrid search.

Deprecated: pass embedder through factory.WithLocalEmbedder.

func WithTokenizer deprecated

func WithTokenizer(t Tokenizer) FSStoreOption

WithTokenizer sets the tokenizer for search and indexing.

Deprecated: pass tokenizer through factory.WithLocalTokenizer.

type Hit added in v0.2.1

type Hit struct {
	DatasetID  string
	DocName    string
	Layer      Layer
	Content    string
	Score      float64
	ChunkIndex int            // -1 for layer hits
	Sig        DerivedSig     // freshness traceability
	Metadata   map[string]any // backend-passthrough metadata
}

Hit is one ranked search result.

func FuseHits added in v0.2.1

func FuseHits(perRetriever [][]Hit, k int) []Hit

FuseHits is a free-function form of RRF fusion, useful for tooling that already has per-retriever Hit lists. It is the building block used by RRFRanker.Rank.

type Layer added in v0.2.1

type Layer = ContextLayer

Layer is the v0.3.0 name for ContextLayer. It is declared as a type alias so values flow seamlessly between old and new APIs during the deprecation window. The constant set lives in types.go.

type LayerQuery added in v0.2.1

type LayerQuery struct {
	DatasetIDs []string
	Layer      Layer
	Text       string
	Vector     []float32
	Mode       Mode
	TopK       int
}

LayerQuery is the recall input for layer-tier searches.

type LayerRepo added in v0.2.1

type LayerRepo interface {
	Put(ctx context.Context, layer DerivedLayer) error
	Get(ctx context.Context, datasetID, docName string, layer Layer) (*DerivedLayer, error)
	DeleteByDoc(ctx context.Context, datasetID, docName string) error
	DeleteByDataset(ctx context.Context, datasetID string) error
	Search(ctx context.Context, q LayerQuery) ([]Candidate, error)
}

LayerRepo persists DerivedLayers and supports layer-scoped recall.

type LayerRetriever added in v0.2.1

type LayerRetriever struct {
	Layers   LayerRepo
	Embedder Embedder
}

LayerRetriever queries LayerRepo for L0/L1 hits. It activates only when Query.Layer is LayerAbstract or LayerOverview; LayerDetail queries are routed to chunk-tier retrievers instead.

Embedder is consulted only for vector lanes (ModeVector / ModeHybrid).

func NewLayerRetriever added in v0.2.1

func NewLayerRetriever(l LayerRepo, e Embedder) *LayerRetriever

NewLayerRetriever constructs a LayerRetriever; pass a nil embedder to disable the vector lane.

func (*LayerRetriever) Name added in v0.2.1

func (r *LayerRetriever) Name() string

Name implements Retriever.

func (*LayerRetriever) Recall added in v0.2.1

func (r *LayerRetriever) Recall(ctx context.Context, q Query) ([]Candidate, error)

Recall implements Retriever. Vector lane is skipped when Embedder is nil; ModeHybrid degrades gracefully to BM25 in that case.

type Mode added in v0.2.1

type Mode = SearchMode

Mode is the v0.3.0 name for SearchMode. It is declared as a type alias so values flow seamlessly between old and new APIs during the deprecation window. The constant set lives in types.go.

func ResolveMode added in v0.2.1

func ResolveMode(m Mode) Mode

ResolveMode normalises legacy and zero values to a canonical Mode.

  • "" -> ModeBM25 (legacy default)
  • ModeSemantic -> ModeVector (Deprecated alias, removed in v0.3.0)

Any other recognised mode is returned unchanged.
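
The two rules above can be sketched as a small switch. This is a minimal illustration, not the package's source: the local Mode type and the resolveMode function are stand-ins, the constant values are copied from the SearchMode declaration, and passing unrecognised values through unchanged is an assumption of the sketch.

```go
package main

import "fmt"

// Mode mirrors the string-backed SearchMode documented in this package.
type Mode string

const (
	ModeBM25     Mode = "bm25"
	ModeVector   Mode = "vector"
	ModeSemantic Mode = "semantic" // deprecated alias for ModeVector
	ModeHybrid   Mode = "hybrid"
)

// resolveMode is a hypothetical re-implementation of the documented rules.
// Returning unrecognised values unchanged is an assumption of this sketch.
func resolveMode(m Mode) Mode {
	switch m {
	case "":
		return ModeBM25 // legacy default
	case ModeSemantic:
		return ModeVector // deprecated alias
	default:
		return m
	}
}

func main() {
	for _, m := range []Mode{"", ModeSemantic, ModeHybrid} {
		fmt.Printf("%q -> %q\n", m, resolveMode(m))
	}
}
```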

type Query added in v0.2.1

type Query struct {
	DatasetID string
	Scope     Scope
	Text      string
	Mode      Mode
	Layer     Layer
	TopK      int
	Threshold float64
	// contains filtered or unexported fields
}

Query is the canonical search input.

Validation rules (enforced by Service.Search):

  • Layer must be valid; defaults to LayerDetail when zero.
  • Mode must be valid; defaults to ModeBM25 when zero.
  • Scope=ScopeSingleDataset requires DatasetID to be non-empty.
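
A sketch of how these rules might be applied before a search runs. Everything here is a local stand-in: the validate function is hypothetical, the Layer type and its string values are placeholders (the real constant set lives in types.go), and the error messages are invented for illustration. Note that ScopeSingleDataset is the zero value of Scope, so a zero Query fails validation unless DatasetID is set.

```go
package main

import (
	"errors"
	"fmt"
)

type Scope int

const (
	ScopeSingleDataset Scope = iota
	ScopeAllDatasets
)

// Layer and its values are placeholders for this sketch.
type Layer string

const (
	LayerAbstract Layer = "abstract"
	LayerOverview Layer = "overview"
	LayerDetail   Layer = "detail"
)

type Mode string

const (
	ModeBM25     Mode = "bm25"
	ModeVector   Mode = "vector"
	ModeSemantic Mode = "semantic"
	ModeHybrid   Mode = "hybrid"
)

// Query keeps only the fields the validation rules mention.
type Query struct {
	DatasetID string
	Scope     Scope
	Mode      Mode
	Layer     Layer
}

// validate applies defaults, then rejects anything that is still invalid.
func validate(q Query) (Query, error) {
	switch q.Layer {
	case "":
		q.Layer = LayerDetail
	case LayerAbstract, LayerOverview, LayerDetail:
	default:
		return q, fmt.Errorf("invalid layer %q", q.Layer)
	}
	switch q.Mode {
	case "":
		q.Mode = ModeBM25
	case ModeSemantic:
		q.Mode = ModeVector
	case ModeBM25, ModeVector, ModeHybrid:
	default:
		return q, fmt.Errorf("invalid mode %q", q.Mode)
	}
	if q.Scope == ScopeSingleDataset && q.DatasetID == "" {
		return q, errors.New("ScopeSingleDataset requires a non-empty DatasetID")
	}
	return q, nil
}

func main() {
	q, err := validate(Query{DatasetID: "docs"})
	fmt.Println(q.Layer, q.Mode, err)
}
```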

type RRFRanker added in v0.2.1

type RRFRanker struct {
	K int
}

RRFRanker performs reciprocal-rank fusion across candidates grouped by Candidate.Source. K defaults to DefaultRRFK when zero.

func NewRRFRanker added in v0.2.1

func NewRRFRanker() *RRFRanker

NewRRFRanker constructs an RRFRanker with sensible defaults.

func (*RRFRanker) Rank added in v0.2.1

func (r *RRFRanker) Rank(candidates []Candidate, q Query) []Hit

Rank fuses candidates and truncates to q.TopK (zero means "no limit").

Threshold filtering is applied AFTER fusion using the fused score: candidates whose fused score is below q.Threshold are dropped. When q.Threshold is zero, no filtering is applied.
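
The order of operations (fuse, then threshold, then truncate) can be sketched as below. This is an illustration under stated assumptions, not the package's implementation: candidate, hit and fuse are hypothetical stand-ins, the fallback K of 60 is a common choice for reciprocal-rank fusion and not necessarily the value of DefaultRRFK, and the sketch orders each source by its local score before assigning 1-based ranks. Each candidate contributes 1/(K + rank) per source in which it appears.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical stand-in for knowledge.Candidate.
type candidate struct {
	ID     string  // identifies the chunk or layer
	Source string  // retriever name, e.g. "bm25" or "vector"
	Score  float64 // retriever-local score, used only to order within a source
}

// hit is a hypothetical stand-in for knowledge.Hit.
type hit struct {
	ID    string
	Score float64
}

// fuse groups candidates by source, sums 1/(k+rank) per ID, drops fused
// scores below threshold (when non-zero), and truncates to topK (when non-zero).
func fuse(cands []candidate, k, topK int, threshold float64) []hit {
	if k <= 0 {
		k = 60 // assumed default for this sketch
	}
	bySource := map[string][]candidate{}
	for _, c := range cands {
		bySource[c.Source] = append(bySource[c.Source], c)
	}
	fused := map[string]float64{}
	for _, list := range bySource {
		sort.SliceStable(list, func(i, j int) bool { return list[i].Score > list[j].Score })
		for rank, c := range list {
			fused[c.ID] += 1.0 / float64(k+rank+1) // rank is 0-based here
		}
	}
	hits := make([]hit, 0, len(fused))
	for id, s := range fused {
		if threshold > 0 && s < threshold {
			continue
		}
		hits = append(hits, hit{ID: id, Score: s})
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Score != hits[j].Score {
			return hits[i].Score > hits[j].Score
		}
		return hits[i].ID < hits[j].ID // deterministic tie-break
	})
	if topK > 0 && len(hits) > topK {
		hits = hits[:topK]
	}
	return hits
}

func main() {
	cands := []candidate{
		{ID: "a", Source: "bm25", Score: 3},
		{ID: "b", Source: "bm25", Score: 2},
		{ID: "b", Source: "vector", Score: 0.9},
		{ID: "c", Source: "vector", Score: 0.8},
	}
	for _, h := range fuse(cands, 0, 2, 0) {
		fmt.Printf("%s %.4f\n", h.ID, h.Score)
	}
}
```

Because "b" appears in both sources it accumulates two reciprocal-rank terms and outranks "a", even though "a" is first in the BM25 list.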

type Ranker added in v0.2.1

type Ranker interface {
	Rank(candidates []Candidate, q Query) []Hit
}

Ranker fuses candidates from multiple Retrievers into ordered Hits.

type RebuildScope added in v0.2.1

type RebuildScope struct {
	DatasetID string // "" means all datasets
	DocName   string // "" means all documents in the dataset
}

RebuildScope narrows what Rebuilder.Rebuild touches. Zero value means "everything".
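
As an illustration of the empty-string convention, the following sketch checks whether a document falls inside a scope. The covers helper is hypothetical, and treating an empty DatasetID with a non-empty DocName as "that name in every dataset" is an assumption of this sketch.

```go
package main

import "fmt"

type RebuildScope struct {
	DatasetID string // "" means all datasets
	DocName   string // "" means all documents in the dataset
}

// covers is a hypothetical helper: an empty field acts as a wildcard.
func covers(s RebuildScope, datasetID, docName string) bool {
	if s.DatasetID != "" && s.DatasetID != datasetID {
		return false
	}
	if s.DocName != "" && s.DocName != docName {
		return false
	}
	return true
}

func main() {
	fmt.Println(covers(RebuildScope{}, "docs", "intro.md"))
	fmt.Println(covers(RebuildScope{DatasetID: "docs"}, "wiki", "intro.md"))
	fmt.Println(covers(RebuildScope{DatasetID: "docs", DocName: "intro.md"}, "docs", "faq.md"))
}
```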

type Rebuilder added in v0.2.1

type Rebuilder interface {
	Rebuild(ctx context.Context, scope RebuildScope) error
}

Rebuilder is the consumer side of the change-driven reload pipeline. Service satisfies this interface; EventReloader invokes Rebuild on the trailing edge of a debounce window over an EventNotifier stream.
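
A trailing-edge debounce in front of a Rebuilder might look like the sketch below. It is not EventReloader: the debounce function, the struct{} event type, the countingRebuilder test double and the runBurst helper are all hypothetical, the scope passed to Rebuild is always the zero value, and flushing a pending rebuild when the event channel closes is a choice made here to keep the example deterministic.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type RebuildScope struct {
	DatasetID string
	DocName   string
}

type Rebuilder interface {
	Rebuild(ctx context.Context, scope RebuildScope) error
}

// countingRebuilder records how many times Rebuild ran.
type countingRebuilder struct{ calls int }

func (c *countingRebuilder) Rebuild(context.Context, RebuildScope) error {
	c.calls++
	return nil
}

// debounce restarts the quiet window on every event and calls Rebuild once
// the window elapses without a new event. Rebuild runs inside the loop, so
// calls are serialised.
func debounce(ctx context.Context, events <-chan struct{}, window time.Duration, target Rebuilder) error {
	var fire <-chan time.Time // nil until an event arrives; a nil channel never fires
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case _, ok := <-events:
			if !ok {
				if fire != nil { // flush a pending rebuild (sketch-only behaviour)
					return target.Rebuild(ctx, RebuildScope{})
				}
				return nil
			}
			fire = time.After(window) // restart the quiet window
		case <-fire:
			fire = nil
			if err := target.Rebuild(ctx, RebuildScope{}); err != nil {
				return err
			}
		}
	}
}

// runBurst sends n back-to-back events, closes the stream, and reports how
// many rebuilds the burst produced.
func runBurst(n int) (int, error) {
	events := make(chan struct{}, n)
	for i := 0; i < n; i++ {
		events <- struct{}{}
	}
	close(events)
	target := &countingRebuilder{}
	err := debounce(context.Background(), events, time.Hour, target)
	return target.calls, err
}

func main() {
	calls, err := runBurst(3)
	fmt.Println(calls, err)
}
```

A burst of three events collapses into a single Rebuild call, which is the point of debouncing on the trailing edge.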

type Reloader deprecated

type Reloader struct {
	// contains filtered or unexported fields
}

Reloader debounces ChangeNotifier events and triggers Rebuild on a stable trailing edge.

Deprecated: use EventReloader (typed events + scope-aware Service.Rebuild + serialised execution). Removed in v0.3.0.

func NewReloader deprecated

func NewReloader(store *FSStore, notifier ChangeNotifier, opts ReloaderOptions) *Reloader

NewReloader wires a ChangeNotifier to a rebuild callback.

Deprecated: use NewEventReloader(target Rebuilder, notifier EventNotifier, opts). Removed in v0.3.0.

func (*Reloader) Close

func (r *Reloader) Close() error

Close stops Run and the underlying ChangeNotifier.

func (*Reloader) Run

func (r *Reloader) Run(ctx context.Context) error

Run blocks until Close is called or ctx is cancelled.

type ReloaderOptions

type ReloaderOptions struct {
	Debounce       time.Duration
	RebuildTimeout time.Duration
	Rebuild        func(context.Context) error
}

ReloaderOptions configures EventReloader (and the deprecated Reloader).

Field set is the union of both consumers:

  • Debounce controls the trailing-edge window (both consumers).
  • RebuildTimeout caps each rebuild call (EventReloader only; the legacy Reloader hard-codes 30s).
  • Rebuild is the legacy hook used by the deprecated Reloader to swap the rebuild callback. EventReloader ignores it and always calls Rebuilder.Rebuild on the supplied target.

type Result added in v0.2.1

type Result struct {
	Hits []Hit
}

Result wraps the ordered hit list returned by Service.Search.

type RetrievalStore deprecated

type RetrievalStore struct {
	// contains filtered or unexported fields
}

RetrievalStore is a Store implementation backed by a retrieval.Index.

Deprecated: use factory.NewRetrieval(docs, idx, opts...) which returns a *Service backed by backend/retrieval. Removed in v0.3.0.

func NewRetrievalStore deprecated

func NewRetrievalStore(idx retrieval.Index, opts ...RetrievalStoreOption) *RetrievalStore

NewRetrievalStore wires a Store to a retrieval.Index.

Deprecated: use factory.NewRetrieval(docs, idx, opts...). Removed in v0.3.0.

func (*RetrievalStore) Abstract

func (s *RetrievalStore) Abstract(ctx context.Context, datasetID, name string) (string, error)

func (*RetrievalStore) AddDocument

func (s *RetrievalStore) AddDocument(ctx context.Context, datasetID, name, content string) error

func (*RetrievalStore) AddDocuments

func (s *RetrievalStore) AddDocuments(ctx context.Context, datasetID string, docs []DocInput) error

func (*RetrievalStore) DatasetAbstract

func (s *RetrievalStore) DatasetAbstract(ctx context.Context, datasetID string) (string, error)

func (*RetrievalStore) DatasetOverview

func (s *RetrievalStore) DatasetOverview(ctx context.Context, datasetID string) (string, error)

func (*RetrievalStore) DeleteDocument

func (s *RetrievalStore) DeleteDocument(ctx context.Context, datasetID, name string) error

func (*RetrievalStore) GetDocument

func (s *RetrievalStore) GetDocument(ctx context.Context, datasetID, name string) (*Document, error)

func (*RetrievalStore) Index

func (s *RetrievalStore) Index() retrieval.Index

Index exposes the underlying retrieval.Index.

func (*RetrievalStore) ListDocuments

func (s *RetrievalStore) ListDocuments(ctx context.Context, datasetID string) ([]Document, error)

func (*RetrievalStore) Overview

func (s *RetrievalStore) Overview(ctx context.Context, datasetID, name string) (string, error)

func (*RetrievalStore) Search

func (s *RetrievalStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)

func (*RetrievalStore) SetAbstract

func (s *RetrievalStore) SetAbstract(ctx context.Context, datasetID, name, abstract string) error

func (*RetrievalStore) SetDatasetAbstract

func (s *RetrievalStore) SetDatasetAbstract(ctx context.Context, datasetID, abstract string) error

func (*RetrievalStore) SetDatasetOverview

func (s *RetrievalStore) SetDatasetOverview(ctx context.Context, datasetID, overview string) error

func (*RetrievalStore) SetOverview

func (s *RetrievalStore) SetOverview(ctx context.Context, datasetID, name, overview string) error

type RetrievalStoreOption deprecated

type RetrievalStoreOption func(*RetrievalStore)

RetrievalStoreOption configures a RetrievalStore.

Deprecated: see RetrievalStore.

func WithRetrievalChunkConfig deprecated

func WithRetrievalChunkConfig(c ChunkConfig) RetrievalStoreOption

WithRetrievalChunkConfig overrides the default chunk config.

Deprecated: configure ChunkConfig via factory.WithRetrievalChunker.

func WithRetrievalEmbedder deprecated

func WithRetrievalEmbedder(e embedding.Embedder) RetrievalStoreOption

WithRetrievalEmbedder sets the embedder used to vectorize chunks at write time.

Deprecated: pass embedder through factory.WithRetrievalEmbedder.

func WithRetrievalPipeline deprecated

func WithRetrievalPipeline(p *pipeline.Pipeline) RetrievalStoreOption

WithRetrievalPipeline overrides the default pipeline.Knowledge(emb, nil).

Deprecated: factory.NewRetrieval owns the pipeline now.

func WithRetrievalTokenizer deprecated

func WithRetrievalTokenizer(t Tokenizer) RetrievalStoreOption

WithRetrievalTokenizer overrides the BM25 tokenizer.

Deprecated: factory wires the tokenizer through textsearch.

type Retriever added in v0.2.1

type Retriever interface {
	// Name returns a stable identifier used by Rankers (e.g. "bm25").
	Name() string
	// Recall fetches up to q.TopK candidates for q.
	Recall(ctx context.Context, q Query) ([]Candidate, error)
}

Retriever produces a candidate set for a Query. Implementations are stateless with respect to the Query; all state lives in the underlying repos.

type Scope added in v0.2.1

type Scope int

Scope expresses the dataset breadth of a query.

const (
	// ScopeSingleDataset restricts the search to Query.DatasetID.
	ScopeSingleDataset Scope = iota
	// ScopeAllDatasets searches across every known dataset.
	ScopeAllDatasets
)

type SearchEngine added in v0.2.1

type SearchEngine struct {
	Chunk []Retriever // chunk-tier retrievers (LayerDetail)
	Layer []Retriever // layer-tier retrievers (LayerAbstract / LayerOverview)
	Rank  Ranker
}

SearchEngine routes a Query through Retrievers, then a Ranker.

Behaviour:

  • LayerDetail queries fan out to chunk-tier Retrievers (BM25/Vector).
  • Layer{Abstract,Overview} queries fan out to layer-tier Retrievers.
  • ModeHybrid runs both BM25 and Vector recall paths and lets the Ranker fuse them (RRF by default).
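
The routing described above can be sketched with toy retrievers. The types here are simplified stand-ins (no context, no Candidate type, string results), the Layer values are placeholders, and the idea that each retriever decides for itself whether to answer a given Mode is an assumption modelled on VectorRetriever skipping ModeBM25 queries.

```go
package main

import "fmt"

type Layer string

const (
	LayerAbstract Layer = "abstract"
	LayerOverview Layer = "overview"
	LayerDetail   Layer = "detail"
)

type Mode string

const (
	ModeBM25   Mode = "bm25"
	ModeVector Mode = "vector"
	ModeHybrid Mode = "hybrid"
)

type Query struct {
	Layer Layer
	Mode  Mode
}

// lane is a toy retriever that answers only for the modes it serves.
type lane struct {
	name   string
	serves []Mode
}

func (l lane) recall(q Query) []string {
	for _, m := range l.serves {
		if m == q.Mode {
			return []string{l.name + ":" + string(q.Layer)}
		}
	}
	return nil
}

// engine holds one retriever list per tier, like SearchEngine.Chunk and
// SearchEngine.Layer.
type engine struct {
	chunk []lane
	layer []lane
}

// recall picks the tier from q.Layer and fans out to every retriever in it.
func (e engine) recall(q Query) []string {
	tier := e.chunk
	if q.Layer == LayerAbstract || q.Layer == LayerOverview {
		tier = e.layer
	}
	var out []string
	for _, l := range tier {
		out = append(out, l.recall(q)...)
	}
	return out
}

func demoEngine() engine {
	all := []Mode{ModeBM25, ModeVector, ModeHybrid}
	return engine{
		chunk: []lane{
			{name: "bm25", serves: []Mode{ModeBM25, ModeHybrid}},
			{name: "vector", serves: []Mode{ModeVector, ModeHybrid}},
		},
		layer: []lane{{name: "layer", serves: all}},
	}
}

func main() {
	e := demoEngine()
	fmt.Println(e.recall(Query{Layer: LayerDetail, Mode: ModeHybrid}))
	fmt.Println(e.recall(Query{Layer: LayerOverview, Mode: ModeBM25}))
}
```

In the real engine the combined candidate list would then be handed to the Ranker for fusion.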

func NewSearchEngine added in v0.2.1

func NewSearchEngine(chunk, layer []Retriever, ranker Ranker) *SearchEngine

NewSearchEngine assembles a SearchEngine. ranker may be nil; the Service will substitute a default RRFRanker.

func (*SearchEngine) Search added in v0.2.1

func (e *SearchEngine) Search(ctx context.Context, q Query) (*Result, error)

Search runs the engine for one Query. Validation is the caller's job (Service.Search performs it before delegating).

type SearchMode

type SearchMode string

SearchMode chooses the retrieval algorithm.

v0.3.0 final values are explicit strings; legacy callers that pass the empty string are normalised to ModeBM25 by ResolveMode().

The empty string is accepted only for backwards compatibility and is not itself a Mode value.

Deprecated names:

  • ModeSemantic remains for backwards compatibility; new code should use ModeVector. The two are treated as equivalent at the Service boundary starting in v0.2.x, and ModeSemantic is removed in v0.3.0.

const (
	ModeBM25     SearchMode = "bm25"
	ModeVector   SearchMode = "vector"
	ModeSemantic SearchMode = "semantic" // Deprecated: use ModeVector.
	ModeHybrid   SearchMode = "hybrid"
)

type SearchOptions deprecated

type SearchOptions struct {
	TopK      int          `json:"top_k,omitempty"`
	MaxLayer  ContextLayer `json:"max_layer,omitempty"`
	Threshold float64      `json:"threshold,omitempty"`
	Mode      SearchMode   `json:"mode,omitempty"`
}

SearchOptions configures a knowledge search query.

Deprecated: use Query. The MaxLayer→Layer rename and ScopeAllDatasets fan-out live on Query. SearchOptions is removed in v0.3.0.

type SearchResult deprecated

type SearchResult struct {
	Content    string         `json:"content"`
	Score      float64        `json:"score"`
	DocName    string         `json:"doc_name,omitempty"`
	ChunkIndex int            `json:"chunk_index,omitempty"`
	Layer      ContextLayer   `json:"layer"`
	Metadata   map[string]any `json:"metadata,omitempty"`
}

SearchResult represents a single search hit with its relevance score.

Deprecated: use Hit. SearchResult is removed in v0.3.0.

func RRFMerge deprecated

func RRFMerge(bm25Results, semanticResults []SearchResult, k int) []SearchResult

RRFMerge fuses two ranked SearchResult lists with reciprocal-rank fusion.

Deprecated: use RRFRanker. Removed in v0.3.0.

func RankResults deprecated

func RankResults(results []SearchResult, topK int) []SearchResult

RankResults sorts by score descending and limits to topK.

Deprecated: use the SearchEngine's Ranker (RRFRanker by default). Removed in v0.3.0.

type Service added in v0.2.1

type Service struct {
	// contains filtered or unexported fields
}

Service orchestrates document lifecycle, derived-data persistence and search. All public Knowledge entry points (graph node, tools, deprecated stores) route through Service so contract guarantees (#1..#7 in doc.go) live in one place.

func NewService added in v0.2.1

func NewService(
	docs DocumentRepo,
	chunks ChunkRepo,
	layers LayerRepo,
	engine *SearchEngine,
	opts ServiceOptions,
) *Service

NewService constructs a Service from explicit repositories.

Most callers should use factory.NewLocal / factory.NewRetrieval instead; this entry point exists so out-of-tree backends can be wired the same way the built-ins are.

func (*Service) DatasetLayer added in v0.2.1

func (s *Service) DatasetLayer(ctx context.Context, datasetID string, layer Layer) (string, error)

DatasetLayer reads a dataset-level layer; returns "" without error when missing.

func (*Service) DeleteDocument added in v0.2.1

func (s *Service) DeleteDocument(ctx context.Context, datasetID, name string) error

DeleteDocument removes the document and all its derived data (chunks + layers). Errors from chunk/layer cleanup are returned after a best-effort attempt so a single failure does not leave the document orphaned in DocumentRepo.
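
The best-effort behaviour can be sketched as "run every cleanup step, then report all failures together". This is an illustration only: deleteDocument here is a hypothetical free function taking three callbacks, the step order is an assumption, and errors.Join (Go 1.20 or later) is one way to combine the failures, not necessarily what the package uses.

```go
package main

import (
	"errors"
	"fmt"
)

// errChunks is a sample failure used by the demo below.
var errChunks = errors.New("chunk cleanup failed")

// deleteDocument runs every step even if an earlier one fails, then returns
// the collected errors. errors.Join returns nil when nothing failed.
func deleteDocument(deleteChunks, deleteLayers, deleteDoc func() error) error {
	var errs []error
	for _, step := range []func() error{deleteChunks, deleteLayers, deleteDoc} {
		if err := step(); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

func main() {
	ok := func() error { return nil }
	bad := func() error { return errChunks }

	err := deleteDocument(bad, ok, ok)
	fmt.Println(errors.Is(err, errChunks))
	fmt.Println(deleteDocument(ok, ok, ok) == nil)
}
```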

func (*Service) GetDocument added in v0.2.1

func (s *Service) GetDocument(ctx context.Context, datasetID, name string) (*SourceDocument, error)

GetDocument returns the lossless SourceDocument (contract #4).

func (*Service) Layer added in v0.2.1

func (s *Service) Layer(ctx context.Context, datasetID, name string, layer Layer) (string, error)

Layer reads a document-level layer; returns "" without error when missing (contract: callers should treat absence as "not yet generated").

func (*Service) ListDatasets added in v0.2.1

func (s *Service) ListDatasets(ctx context.Context) ([]string, error)

ListDatasets enumerates every known dataset id.

func (*Service) ListDocuments added in v0.2.1

func (s *Service) ListDocuments(ctx context.Context, datasetID string) ([]SourceDocument, error)

ListDocuments returns SourceDocuments in the dataset. Implementations MAY omit Content for performance; FSDocumentRepo currently returns it.

func (*Service) PutDatasetLayer added in v0.2.1

func (s *Service) PutDatasetLayer(ctx context.Context, datasetID string, layer Layer, content string) error

PutDatasetLayer persists a dataset-level layer (DocName == "").

func (*Service) PutDocument added in v0.2.1

func (s *Service) PutDocument(ctx context.Context, datasetID, name, raw string) error

PutDocument writes raw content under (datasetID, name).

Behaviour:

  • SourceDocument.Version is incremented atomically by DocumentRepo.
  • DerivedChunks are recomputed and ChunkRepo.Replace is called.
  • DerivedLayers are NOT touched (layer generation is explicit; see PutDocumentLayer / PutDatasetLayer).

Atomicity model: chunk replacement happens AFTER the document write succeeds. A failure between the two leaves the document persisted but chunks stale; the next PutDocument or Rebuild restores consistency. Backends with native transactions can override this by composing repos that share a transaction.
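
The write-then-replace ordering and its failure window can be illustrated with an in-memory stand-in. The memStore type, its fields, the failChunks switch and the word-splitting "chunker" are all hypothetical; the sketch only shows that a failure after the document write leaves chunks stale and that the next successful put repairs them.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// memStore is a hypothetical in-memory stand-in for DocumentRepo plus ChunkRepo.
type memStore struct {
	content    map[string]string
	version    map[string]uint64
	chunks     map[string][]string
	chunkVer   map[string]uint64 // source version the chunks were derived from
	failChunks bool              // simulates a chunk replacement failure
}

func newMemStore() *memStore {
	return &memStore{
		content:  map[string]string{},
		version:  map[string]uint64{},
		chunks:   map[string][]string{},
		chunkVer: map[string]uint64{},
	}
}

// putDocument writes the document first, then replaces its chunks.
func (s *memStore) putDocument(name, raw string) error {
	// Step 1: document write; the version bump happens here.
	s.content[name] = raw
	s.version[name]++

	// Step 2: chunk replacement, attempted only after step 1 succeeded.
	if s.failChunks {
		return errors.New("chunk replace failed")
	}
	s.chunks[name] = strings.Fields(raw) // toy chunker: one chunk per word
	s.chunkVer[name] = s.version[name]
	return nil
}

// stale reports whether the chunks lag behind the stored document.
func (s *memStore) stale(name string) bool {
	return s.chunkVer[name] != s.version[name]
}

func main() {
	s := newMemStore()
	fmt.Println(s.putDocument("a", "one two"), s.stale("a"))

	s.failChunks = true
	fmt.Println(s.putDocument("a", "one two three"), s.stale("a"))

	s.failChunks = false
	fmt.Println(s.putDocument("a", "one two three"), s.stale("a"))
}
```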

func (*Service) PutDocumentLayer added in v0.2.1

func (s *Service) PutDocumentLayer(ctx context.Context, datasetID, name string, layer Layer, content string) error

PutDocumentLayer persists an LLM-derived layer for one document. Caller is expected to have produced content via GenerateDocumentContext.

func (*Service) Rebuild added in v0.2.1

func (s *Service) Rebuild(ctx context.Context, scope RebuildScope) error

Rebuild re-derives chunks for the requested scope, comparing DerivedSig against the current (SourceVer, ChunkerSig). Stale chunks are recomputed; up-to-date chunks are left alone.

Layers are not regenerated automatically; callers drive layer rebuilds explicitly via PutDocumentLayer / PutDatasetLayer.
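
The freshness check can be sketched as a comparison of signatures. The DerivedSig field names below are inferred from the description (SourceVer, ChunkerSig) and are assumptions, as are the doc type and the staleDocs helper; the real signature may carry more fields, such as the EmbedSig mentioned under ServiceOptions.

```go
package main

import "fmt"

// DerivedSig is a simplified stand-in for the signature stamped on derived data.
type DerivedSig struct {
	SourceVer  uint64 // SourceDocument.Version the chunks were derived from
	ChunkerSig string // identifies the chunker configuration
}

// doc pairs a document's current version with the signature of its chunks.
type doc struct {
	Name    string
	Version uint64
	Derived DerivedSig
}

// staleDocs returns the names whose chunks no longer match the current
// (version, chunker) pair and would therefore be recomputed.
func staleDocs(docs []doc, chunkerSig string) []string {
	var out []string
	for _, d := range docs {
		want := DerivedSig{SourceVer: d.Version, ChunkerSig: chunkerSig}
		if d.Derived != want {
			out = append(out, d.Name)
		}
	}
	return out
}

func main() {
	docs := []doc{
		{Name: "fresh", Version: 2, Derived: DerivedSig{SourceVer: 2, ChunkerSig: "v1"}},
		{Name: "edited", Version: 3, Derived: DerivedSig{SourceVer: 2, ChunkerSig: "v1"}},
		{Name: "rechunk", Version: 1, Derived: DerivedSig{SourceVer: 1, ChunkerSig: "v0"}},
	}
	fmt.Println(staleDocs(docs, "v1"))
}
```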

func (*Service) Search added in v0.2.1

func (s *Service) Search(ctx context.Context, q Query) (*Result, error)

Search executes the query through the configured SearchEngine.

Validation (the only place these checks live, contract #2):

  • q.Layer defaults to LayerDetail when zero; any other value that is not a valid Layer is rejected.
  • q.Mode defaults to ModeBM25 when zero; ModeSemantic is normalised to ModeVector for backwards compatibility.
  • q.Scope=ScopeSingleDataset requires q.DatasetID to be non-empty.

For ScopeAllDatasets the dataset list is resolved once via DocumentRepo.ListDatasets and pushed down to retrievers via the unexported resolvedDatasets field.

type ServiceOptions added in v0.2.1

type ServiceOptions struct {
	// Chunker overrides the default chunker; nil means "use
	// NewDefaultChunker(DefaultChunkConfig())".
	Chunker Chunker
	// Embedder enables vector indexing and semantic search; nil
	// disables vector lanes.
	Embedder Embedder
	// EmbedSig is stamped onto every DerivedSig produced while this
	// Service runs. Embedder doesn't expose a model identifier, so
	// callers (typically the Factory) supply one. Empty string means
	// "use the embedder's Go type name", which is good enough for
	// freshness checks within a single binary but not stable across
	// processes — production wiring should set it explicitly.
	EmbedSig string
	// Now overrides the clock; nil means time.Now (unit-test hook).
	Now func() time.Time
}

ServiceOptions configures a Service. Nil-friendly: every field is optional and falls back to a sensible default.
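
The nil-friendly fallbacks for Now and EmbedSig might be applied along these lines. This is a sketch under assumptions: the options struct keeps only three of the fields, the Embedder interface shown is a hypothetical placeholder (its real method set is not shown in this section), and using the %T verb to obtain the Go type name is one plausible reading of "use the embedder's Go type name".

```go
package main

import (
	"fmt"
	"time"
)

// Embedder is a placeholder; the real interface's methods are not shown here.
type Embedder interface {
	Dim() int
}

// options keeps only the fields this sketch needs.
type options struct {
	Embedder Embedder
	EmbedSig string
	Now      func() time.Time
}

// withDefaults fills unset fields without touching explicit values.
func withDefaults(o options) options {
	if o.Now == nil {
		o.Now = time.Now
	}
	if o.EmbedSig == "" && o.Embedder != nil {
		// The type name is stable within one binary only, which is why
		// production wiring should set EmbedSig explicitly.
		o.EmbedSig = fmt.Sprintf("%T", o.Embedder)
	}
	return o
}

// fakeEmbedder is a do-nothing Embedder for the demo.
type fakeEmbedder struct{}

func (fakeEmbedder) Dim() int { return 0 }

func main() {
	o := withDefaults(options{Embedder: fakeEmbedder{}})
	fmt.Println(o.EmbedSig, o.Now != nil)
}
```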

type SimpleTokenizer

type SimpleTokenizer = textsearch.SimpleTokenizer

Type and function aliases re-exported from sdk/textsearch for backward compatibility. Internal code and tests can use these without changing import paths.

type SourceDocument added in v0.2.1

type SourceDocument struct {
	DatasetID string
	Name      string
	Content   string
	Metadata  map[string]string

	// Version is monotonically incremented on every successful Put.
	// Derived data uses it as a freshness key.
	Version   uint64
	UpdatedAt time.Time
}

SourceDocument is the canonical, lossless representation of user input.

It is the single source of truth for a document; every DerivedChunk and DerivedLayer carries a DerivedSig that points back to a particular SourceDocument.Version, so derived data can be detected as stale and recomputed deterministically.

type Store deprecated

type Store interface {
	AddDocument(ctx context.Context, datasetID, name, content string) error
	AddDocuments(ctx context.Context, datasetID string, docs []DocInput) error
	GetDocument(ctx context.Context, datasetID, name string) (*Document, error)
	DeleteDocument(ctx context.Context, datasetID, name string) error
	ListDocuments(ctx context.Context, datasetID string) ([]Document, error)
	Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)

	Abstract(ctx context.Context, datasetID, name string) (string, error)
	Overview(ctx context.Context, datasetID, name string) (string, error)

	DatasetAbstract(ctx context.Context, datasetID string) (string, error)
	DatasetOverview(ctx context.Context, datasetID string) (string, error)
}

Store abstracts knowledge base storage. Documents are organized by dataset.

Deprecated: use *Service in sdk/knowledge instead. Service unifies document, chunk and layer storage behind a single contract and is the only orchestrator going forward; Store will be removed in v0.3.0.

Migration:

  • Replace AddDocument / AddDocuments with Service.PutDocument.
  • Replace Search with Service.Search.
  • Replace Abstract / Overview with Service.Layer.
  • Replace DatasetAbstract / DatasetOverview with Service.DatasetLayer.

type Tokenizer

type Tokenizer = textsearch.Tokenizer

Type and function aliases re-exported from sdk/textsearch for backward compatibility. Internal code and tests can use these without changing import paths.

type VectorRetriever added in v0.2.1

type VectorRetriever struct {
	Chunks   ChunkRepo
	Embedder Embedder
}

VectorRetriever queries ChunkRepo with Mode=ModeVector. Embedder is invoked lazily; nil disables the retriever (Recall returns nil).

func NewVectorRetriever added in v0.2.1

func NewVectorRetriever(c ChunkRepo, e Embedder) *VectorRetriever

NewVectorRetriever constructs a VectorRetriever; pass a nil embedder to disable the lane (Recall short-circuits).

func (*VectorRetriever) Name added in v0.2.1

func (r *VectorRetriever) Name() string

Name implements Retriever.

func (*VectorRetriever) Recall added in v0.2.1

func (r *VectorRetriever) Recall(ctx context.Context, q Query) ([]Candidate, error)

Recall implements Retriever. ModeBM25 queries are skipped; ModeHybrid produces both lanes and the Ranker fuses them.

Directories

Path                  Synopsis
backend/fs            Package fs implements knowledge.DocumentRepo, knowledge.ChunkRepo and knowledge.LayerRepo on top of any workspace.Workspace.
backend/retrieval     Package retrieval implements knowledge.ChunkRepo and knowledge.LayerRepo on top of any retrieval.Index.
e2e/internal/testenv  Package testenv provides shared environment loading for the knowledge e2e integration tests.
factory               Package factory wires the canonical knowledge.Service stacks.
