Documentation ¶
Overview ¶
Package knowledge — v0.2.x compatibility layer (removed in v0.3.0).
This file is the single home for every symbol that the v0.3.0 architecture supersedes. Each symbol carries a // Deprecated: tag so staticcheck (SA1019) flags new callers; the index below is the canonical "what replaces what" map.
=== Replacement index ===
Storage / orchestration
- Store -> *Service (sdk/knowledge)
- FSStore -> factory.NewLocal (sdk/knowledge/factory)
- RetrievalStore -> factory.NewRetrieval (sdk/knowledge/factory)
- CachedStore -> (none; fold caching into the repo)
Data models
- Document -> SourceDocument + DerivedLayer
- SearchResult -> Hit
- SearchOptions -> Query (with Scope/Mode/Layer)
- Chunk -> DerivedChunk
- ContextLayer -> Layer (alias kept for transition)
- SearchMode -> Mode (alias kept for transition)
- ModeSemantic -> ModeVector
Graph node
- KnowledgeConfig -> KnowledgeNodeConfig
- KnowledgeNode -> KnowledgeServiceNode
- NewKnowledgeNode -> NewKnowledgeServiceNode
- KnowledgeConfigFromMap -> KnowledgeNodeConfigFromMap
- RegisterNode -> RegisterServiceNode
- KnowledgeNodeSchema -> KnowledgeServiceNodeSchema
LLM tools
- NewSearchTool -> NewSearchServiceTool
- NewAddTool -> NewPutServiceTool
Reload pipeline
- ChangeNotifier -> EventNotifier (typed ChangeEvent stream)
- Reloader -> EventReloader (scope-aware, serialised)
- NewReloader -> NewEventReloader
Helpers
- ChunkDocument -> ChunkText (returns DerivedChunk)
- RankResults -> RRFRanker (the SearchEngine.Ranker)
- RRFMerge -> RRFRanker
- ScoreChunk -> textsearch.BM25 directly
- parseFrontmatter -> Service handles frontmatter internally
=== Behaviour bridges that survive v0.3.0 ===
- ResolveMode("") -> ModeBM25
- ResolveMode("semantic") -> ModeVector
- KnowledgeNodeConfigFromMap reads "max_layer" as "layer" when "layer" is absent.
=== Things that are NOT deprecated ===
- GenerateDocumentContext / GenerateDatasetContext: the L0/L1 derivation helpers remain external to Service so callers control scheduling, retry and persistence policy.
- DatasetQuery: shared by knowledgenode.Config and the legacy KnowledgeConfig (lives in this file because both consumers reach it through the knowledge package).
- Tokenizer / textsearch.Tokenizer: backend-neutral utility.
- CosineSimilarity: used by backend implementations.
Package knowledge implements the v0.3.0 layered knowledge base: document storage, chunking, tokenization, and BM25 / vector / hybrid retrieval over three context layers (L0 abstract, L1 overview, L2 chunk detail).
Architecture
- Service (this package): orchestrates DocumentRepo / ChunkRepo / LayerRepo, normalises Query and stamps DerivedSig so callers see a single coherent contract.
- factory (sdk/knowledge/factory): wires Service against either filesystem-backed (factory.NewLocal) or retrieval.Index-backed (factory.NewRetrieval) repositories.
- SearchEngine (this package): runs Retrievers in parallel, fuses with a Ranker (RRF by default).
- EventReloader (this package): debounces ChangeEvents and triggers Service.Rebuild with the smallest possible scope.
L0/L1 derivation (GenerateDocumentContext / GenerateDatasetContext) is kept external to Service so callers own scheduling, retry and persistence policy.
Migration: every v0.2.x symbol survives in deprecated.go (tagged // Deprecated:) until v0.3.0; consult deprecated.go for the full new-name index.
Index ¶
- Constants
- Variables
- func ChunkConfigSig(c ChunkConfig) string
- func CosineSimilarity(a, b []float32) float64
- func IsValidLayer(l Layer) bool
- func IsValidMode(m Mode) bool
- func NewAddTool(ks Store) tool.Tool (deprecated)
- func NewPutServiceTool(svc *Service) tool.Tool (deprecated)
- func NewSearchServiceTool(svc *Service) tool.Tool (deprecated)
- func NewSearchTool(ks Store) tool.Tool (deprecated)
- func ScoreChunk(chunk *Chunk, keywords []string, corpus *CorpusStats, tokenizer Tokenizer) float64 (deprecated)
- type BM25Retriever
- type CJKTokenizer
- type CacheOption (deprecated)
- func WithMaxItems(n int) CacheOption (deprecated)
- func WithTTL(d time.Duration) CacheOption (deprecated)
- type CachedStore (deprecated)
- func (s *CachedStore) Abstract(ctx context.Context, datasetID, name string) (string, error)
- func (s *CachedStore) AddDocument(ctx context.Context, datasetID, name, content string) error
- func (s *CachedStore) AddDocuments(ctx context.Context, datasetID string, docs []DocInput) error
- func (s *CachedStore) DatasetAbstract(ctx context.Context, datasetID string) (string, error)
- func (s *CachedStore) DatasetOverview(ctx context.Context, datasetID string) (string, error)
- func (s *CachedStore) DeleteDocument(ctx context.Context, datasetID, name string) error
- func (s *CachedStore) EvictDataset(datasetID string)
- func (s *CachedStore) GetDocument(ctx context.Context, datasetID, name string) (*Document, error)
- func (s *CachedStore) ListDocuments(ctx context.Context, datasetID string) ([]Document, error)
- func (s *CachedStore) Overview(ctx context.Context, datasetID, name string) (string, error)
- func (s *CachedStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)
- type Candidate
- type ChangeEvent
- type ChangeNotifier (deprecated)
- type Chunk (deprecated)
- type ChunkConfig
- type ChunkQuery
- type ChunkRepo
- type ChunkSpec
- type Chunker
- type ContextLayer
- type CorpusStats
- type DatasetContext
- type DatasetQuery
- type DerivedChunk
- type DerivedLayer
- type DerivedSig
- type DocInput (deprecated)
- type Document (deprecated)
- type DocumentContext
- type DocumentRepo
- type DocumentSummary
- type Embedder
- type EventKind
- type EventNotifier
- type EventReloader
- type FSStore (deprecated)
- func (s *FSStore) Abstract(ctx context.Context, datasetID, name string) (string, error)
- func (s *FSStore) AddDocument(ctx context.Context, datasetID, name, content string) error
- func (s *FSStore) AddDocuments(ctx context.Context, datasetID string, docs []DocInput) error
- func (s *FSStore) BuildIndex(ctx context.Context) error
- func (s *FSStore) DatasetAbstract(_ context.Context, datasetID string) (string, error)
- func (s *FSStore) DatasetOverview(_ context.Context, datasetID string) (string, error)
- func (s *FSStore) DeleteDocument(ctx context.Context, datasetID, name string) error
- func (s *FSStore) GetDocument(ctx context.Context, datasetID, name string) (*Document, error)
- func (s *FSStore) ListDocuments(ctx context.Context, datasetID string) ([]Document, error)
- func (s *FSStore) Overview(ctx context.Context, datasetID, name string) (string, error)
- func (s *FSStore) Prefix() string
- func (s *FSStore) ReindexVectors(ctx context.Context) error
- func (s *FSStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)
- func (s *FSStore) SetDatasetAbstract(datasetID, abstract string)
- func (s *FSStore) SetDatasetOverview(datasetID, overview string)
- func (s *FSStore) SetDocAbstract(datasetID, name, abstract string)
- func (s *FSStore) SetDocOverview(datasetID, name, overview string)
- func (s *FSStore) WorkspaceRoot() string
- func (s *FSStore) WriteDatasetFile(ctx context.Context, datasetID, filename, content string) error
- func (s *FSStore) WriteSidecar(ctx context.Context, datasetID, name, ext, content string) error
- type FSStoreOption (deprecated)
- type Hit
- type Layer
- type LayerQuery
- type LayerRepo
- type LayerRetriever
- type Mode
- type Query
- type RRFRanker
- type Ranker
- type RebuildScope
- type Rebuilder
- type Reloader (deprecated)
- type ReloaderOptions
- type Result
- type RetrievalStore (deprecated)
- func (s *RetrievalStore) Abstract(ctx context.Context, datasetID, name string) (string, error)
- func (s *RetrievalStore) AddDocument(ctx context.Context, datasetID, name, content string) error
- func (s *RetrievalStore) AddDocuments(ctx context.Context, datasetID string, docs []DocInput) error
- func (s *RetrievalStore) DatasetAbstract(ctx context.Context, datasetID string) (string, error)
- func (s *RetrievalStore) DatasetOverview(ctx context.Context, datasetID string) (string, error)
- func (s *RetrievalStore) DeleteDocument(ctx context.Context, datasetID, name string) error
- func (s *RetrievalStore) GetDocument(ctx context.Context, datasetID, name string) (*Document, error)
- func (s *RetrievalStore) Index() retrieval.Index
- func (s *RetrievalStore) ListDocuments(ctx context.Context, datasetID string) ([]Document, error)
- func (s *RetrievalStore) Overview(ctx context.Context, datasetID, name string) (string, error)
- func (s *RetrievalStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)
- func (s *RetrievalStore) SetAbstract(ctx context.Context, datasetID, name, abstract string) error
- func (s *RetrievalStore) SetDatasetAbstract(ctx context.Context, datasetID, abstract string) error
- func (s *RetrievalStore) SetDatasetOverview(ctx context.Context, datasetID, overview string) error
- func (s *RetrievalStore) SetOverview(ctx context.Context, datasetID, name, overview string) error
- type RetrievalStoreOption (deprecated)
- func WithRetrievalChunkConfig(c ChunkConfig) RetrievalStoreOption (deprecated)
- func WithRetrievalEmbedder(e embedding.Embedder) RetrievalStoreOption (deprecated)
- func WithRetrievalPipeline(p *pipeline.Pipeline) RetrievalStoreOption (deprecated)
- func WithRetrievalTokenizer(t Tokenizer) RetrievalStoreOption (deprecated)
- type Retriever
- type Scope
- type SearchEngine
- type SearchMode
- type SearchOptions (deprecated)
- type SearchResult (deprecated)
- type Service
- func (s *Service) DatasetLayer(ctx context.Context, datasetID string, layer Layer) (string, error)
- func (s *Service) DeleteDocument(ctx context.Context, datasetID, name string) error
- func (s *Service) GetDocument(ctx context.Context, datasetID, name string) (*SourceDocument, error)
- func (s *Service) Layer(ctx context.Context, datasetID, name string, layer Layer) (string, error)
- func (s *Service) ListDatasets(ctx context.Context) ([]string, error)
- func (s *Service) ListDocuments(ctx context.Context, datasetID string) ([]SourceDocument, error)
- func (s *Service) PutDatasetLayer(ctx context.Context, datasetID string, layer Layer, content string) error
- func (s *Service) PutDocument(ctx context.Context, datasetID, name, raw string) error
- func (s *Service) PutDocumentLayer(ctx context.Context, datasetID, name string, layer Layer, content string) error
- func (s *Service) Rebuild(ctx context.Context, scope RebuildScope) error
- func (s *Service) Search(ctx context.Context, q Query) (*Result, error)
- type ServiceOptions
- type SimpleTokenizer
- type SourceDocument
- type Store (deprecated)
- type Tokenizer
- type VectorRetriever
Constants ¶
const (
	// AbstractPrompt produces a single-sentence L0 summary (~100 tokens).
	AbstractPrompt = `` /* 155-byte string literal not displayed */
	// OverviewPrompt produces a structured L1 overview (~1000 tokens).
	OverviewPrompt = `` /* 187-byte string literal not displayed */
	// DatasetOverviewPrompt produces a dataset-level L1 from per-document
	// abstracts ("- name: abstract" lines).
	DatasetOverviewPrompt = `` /* 196-byte string literal not displayed */
)
Prompt templates used to derive layered context from raw documents. Exported so callers can override or compose their own pipelines while reusing the SDK's defaults for the common case.
const DefaultPromptInputLimit = 8000
DefaultPromptInputLimit is the maximum number of characters of document content fed into a prompt. Content beyond this is truncated to keep prompts within typical context windows.
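The truncation described above can be sketched as follows. This is a minimal illustration, not the SDK's code: truncateRunes is a hypothetical helper, and counting "characters" as runes (rather than bytes) is an assumption the documentation does not confirm.

```go
package main

import "fmt"

// defaultPromptInputLimit mirrors the documented DefaultPromptInputLimit.
const defaultPromptInputLimit = 8000

// truncateRunes is a hypothetical helper that keeps at most limit runes of s.
// It cuts on a rune boundary, so multi-byte characters are never split.
func truncateRunes(s string, limit int) string {
	if limit <= 0 {
		return ""
	}
	n := 0
	for i := range s { // i is the byte index where each rune starts
		if n == limit {
			return s[:i]
		}
		n++
	}
	return s // s already fits within the limit
}

func main() {
	fmt.Println(truncateRunes("héllo wörld", 4))
	fmt.Println(len(truncateRunes("short document", defaultPromptInputLimit)))
}
```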
const DefaultRRFK = 60
DefaultRRFK is the conventional fusion constant for reciprocal-rank fusion. Smaller K weights the head of each ranked list more heavily.
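To make the role of K concrete, here is a minimal sketch of reciprocal-rank fusion in its standard form, where each list contributes 1/(K + rank) with ranks starting at 1. The helper fuseRRF and its ID-based tie-break are illustrative assumptions; the package's RRFRanker may differ in detail.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultRRFK mirrors the documented DefaultRRFK value.
const defaultRRFK = 60

// fuseRRF is a hypothetical helper: it merges ranked lists of IDs by
// summing 1/(k + rank) for every list an ID appears in (rank starts at 1).
func fuseRRF(k int, lists ...[]string) []string {
	scores := make(map[string]float64)
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / float64(k+i+1)
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest fused score first; ties broken by ID for deterministic output.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	bm25 := []string{"a", "b", "c"}
	vector := []string{"b", "d", "a"}
	fmt.Println(fuseRRF(defaultRRFK, bm25, vector)) // prints [b a d c]
}
```

With K = 60 the gap between rank 1 and rank 2 is small (1/61 vs 1/62), so items that appear in several lists tend to win; a smaller K widens that gap in favour of each list's top entries.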
const DefaultThreshold = 0.1
DefaultThreshold is the BM25-scale relevance floor used by legacy search paths. Service-driven Search uses Query.Threshold instead.
Variables ¶
var (
	DetectTokenizer = textsearch.DetectTokenizer
	NewCorpusStats  = textsearch.NewCorpusStats
	ExtractKeywords = textsearch.ExtractKeywords
	ScoreText       = textsearch.ScoreText
)
Functions ¶
func ChunkConfigSig ¶ added in v0.2.1
func ChunkConfigSig(c ChunkConfig) string
ChunkConfigSig returns a stable signature for a chunker configuration. It is the ChunkerSig embedded in DerivedSig so freshness checks can detect a configuration change.
func CosineSimilarity ¶ added in v0.2.1
CosineSimilarity returns the cosine similarity of two equal-length vectors. Returns 0 when lengths differ, either vector is empty, or either vector has zero norm. Exported so backends can share one canonical implementation.
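The documented edge cases can be illustrated with a small stand-alone sketch. The function below is a local re-implementation written for illustration only; it follows the behaviour described above but is not the package's source.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity is an illustrative re-implementation of the documented
// contract: 0 for mismatched lengths, empty input, or a zero-norm vector.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosineSimilarity([]float32{1, 0}, []float32{1, 0})) // identical direction
	fmt.Println(cosineSimilarity([]float32{1, 0}, []float32{0, 1})) // orthogonal
	fmt.Println(cosineSimilarity([]float32{1, 0}, []float32{1}))    // length mismatch
}
```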
func IsValidLayer ¶ added in v0.2.1
IsValidLayer reports whether l is a recognised layer.
(Method set on a type alias must live on the underlying type; provided as a free function to avoid colliding with future ContextLayer methods.)
func IsValidMode ¶ added in v0.2.1
IsValidMode reports whether m is a recognised mode.
The empty string is accepted for backwards compatibility (legacy callers used "" to mean BM25); ResolveMode normalises it to ModeBM25.
func NewAddTool (deprecated)
func NewPutServiceTool (deprecated, added in v0.2.1)
NewPutServiceTool exposes Service.PutDocument to LLMs (v0.3.0).
Tool name: "knowledge_put"
Input JSON:
{
  "dataset_id": string, // default "default"
  "name": string, // required
  "content": string // required
}
Output:
{"status": "ok", "dataset_id": ..., "name": ..., "version": uint}
The tool returns the new SourceDocument.Version so callers can chain derivation work (layer generation, vector backfill) keyed off the freshness signal.
Deprecated: this LLM tool implementation will be moved to sdkx/tool/knowledge in v0.3.0 alongside NewSearchServiceTool. The function signature is preserved across the move; only the import path changes. See docs/migrations/v0.3.0.md.
func NewSearchServiceTool (deprecated, added in v0.2.1)
NewSearchServiceTool exposes Service.Search to LLMs (v0.3.0).
Tool name: "knowledge_search"
Input JSON:
{
  "query": string, // required
  "scope": "single"|"all", // default "all"
  "dataset_id": string, // required when scope=single
  "mode": "bm25"|"vector"|"hybrid", // default "bm25"
  "layer": "L0"|"L1"|"L2", // default "L2"
  "top_k": integer, // default 5
  "threshold": number // default 0
}
Output: JSON-encoded []Hit.
Backwards compatibility: when callers send only the legacy {query, top_k} fields, the tool defaults to scope=all/mode=bm25/layer=L2 — the same behaviour as the deprecated NewSearchTool(Store) helper.
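The defaulting rules above can be sketched as a small decoder. The struct searchInput and the function parseSearchInput are hypothetical names used only for illustration; the tool's real input handling is not shown in this documentation, and treating a non-positive top_k as "unset" is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// searchInput mirrors the documented JSON fields of "knowledge_search".
type searchInput struct {
	Query     string  `json:"query"`
	Scope     string  `json:"scope"`
	DatasetID string  `json:"dataset_id"`
	Mode      string  `json:"mode"`
	Layer     string  `json:"layer"`
	TopK      int     `json:"top_k"`
	Threshold float64 `json:"threshold"`
}

// parseSearchInput decodes raw JSON and fills in the documented defaults.
func parseSearchInput(raw []byte) (searchInput, error) {
	var in searchInput
	if err := json.Unmarshal(raw, &in); err != nil {
		return in, err
	}
	if in.Query == "" {
		return in, errors.New("query is required")
	}
	if in.Scope == "" {
		in.Scope = "all"
	}
	if in.Mode == "" {
		in.Mode = "bm25"
	}
	if in.Layer == "" {
		in.Layer = "L2"
	}
	if in.TopK <= 0 { // assumption: zero or negative means "not set"
		in.TopK = 5
	}
	if in.Scope == "single" && in.DatasetID == "" {
		return in, errors.New("dataset_id is required when scope=single")
	}
	return in, nil
}

func main() {
	// A legacy-style payload carrying only query and top_k.
	in, err := parseSearchInput([]byte(`{"query": "rate limits", "top_k": 3}`))
	fmt.Println(in.Scope, in.Mode, in.Layer, in.TopK, err)
}
```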
Deprecated: this LLM tool implementation will be moved to sdkx/tool/knowledge in v0.3.0. The new home matches the existing "sdk = interface, sdkx = concrete adapter" layering rule that sdk/llm and sdk/embedding already follow. The function signature is unchanged across the move; only the import path differs. New code SHOULD pin sdkx/tool/knowledge directly. See docs/migrations/v0.3.0.md.
func NewSearchTool (deprecated)
func ScoreChunk (deprecated)
func ScoreChunk(chunk *Chunk, keywords []string, corpus *CorpusStats, tokenizer Tokenizer) float64
ScoreChunk computes the BM25 score for a chunk against query keywords.
Deprecated: use textsearch.BM25 directly with DerivedChunk content. Removed in v0.3.0.
Types ¶
type BM25Retriever ¶ added in v0.2.1
type BM25Retriever struct {
Chunks ChunkRepo
}
BM25Retriever queries ChunkRepo with Mode=ModeBM25. It is layered: queries whose Layer is not LayerDetail short-circuit to nil.
func NewBM25Retriever ¶ added in v0.2.1
func NewBM25Retriever(c ChunkRepo) *BM25Retriever
NewBM25Retriever constructs a BM25Retriever bound to a ChunkRepo.
func (*BM25Retriever) Name ¶ added in v0.2.1
func (r *BM25Retriever) Name() string
Name implements Retriever.
type CJKTokenizer ¶
type CJKTokenizer = textsearch.CJKTokenizer
Type and function aliases re-exported from sdk/textsearch for backward compatibility. Internal code and tests can use these without changing import paths.
type CacheOption (deprecated)
type CacheOption func(*CachedStore)
CacheOption configures a CachedStore.
Deprecated: see CachedStore.
func WithMaxItems (deprecated)
func WithMaxItems(n int) CacheOption
WithMaxItems sets the maximum number of cached items.
Deprecated: see CachedStore.
func WithTTL (deprecated)
func WithTTL(d time.Duration) CacheOption
WithTTL sets the cache time-to-live.
Deprecated: see CachedStore.
type CachedStore (deprecated)
type CachedStore struct {
// contains filtered or unexported fields
}
CachedStore wraps a Store with TTL + LRU caching for read operations.
Deprecated: caching now lives inside Service / repository implementations where appropriate; the indirection no longer earns its keep at the orchestration layer. Removed in v0.3.0.
func NewCachedStore (deprecated)
func NewCachedStore(inner Store, opts ...CacheOption) *CachedStore
NewCachedStore wraps inner with caching.
Deprecated: see CachedStore. Removed in v0.3.0.
func (*CachedStore) AddDocument ¶
func (s *CachedStore) AddDocument(ctx context.Context, datasetID, name, content string) error
func (*CachedStore) AddDocuments ¶
func (*CachedStore) DatasetAbstract ¶
func (*CachedStore) DatasetOverview ¶
func (*CachedStore) DeleteDocument ¶
func (s *CachedStore) DeleteDocument(ctx context.Context, datasetID, name string) error
func (*CachedStore) EvictDataset ¶
func (s *CachedStore) EvictDataset(datasetID string)
EvictDataset removes all cached entries for a dataset.
func (*CachedStore) GetDocument ¶
func (*CachedStore) ListDocuments ¶
func (*CachedStore) Search ¶
func (s *CachedStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)
type Candidate ¶ added in v0.2.1
Candidate is the per-item recall result returned by ChunkRepo / LayerRepo. Source identifies the producing retriever ("bm25" / "vector" / "layer") and is consumed by the Ranker for fusion.
type ChangeEvent ¶ added in v0.2.1
ChangeEvent carries enough granularity for targeted rebuilds. DocName == "" denotes a dataset-level event.
NOTE (v0.2.x): The deprecated ChangeNotifier in deprecated.go still emits opaque struct{} events; sdkx/knowledge/watcher remains its only in-tree producer. The ChangeEvent shape declared here is what EventNotifier implementations will emit once watcher migrates in v0.3.0.
type ChangeNotifier (deprecated)
type ChangeNotifier interface {
Events() <-chan struct{}
Close() error
}
ChangeNotifier emits an opaque event whenever the underlying source changes.
Deprecated: use EventNotifier (typed ChangeEvent stream) with EventReloader. Removed in v0.3.0.
type Chunk (deprecated)
type Chunk struct {
DocName string `json:"doc_name"`
Index int `json:"index"`
Content string `json:"content"`
Offset int `json:"offset"`
}
Chunk represents a segment of a document.
Deprecated: use DerivedChunk. Chunk is removed in v0.3.0.
func ChunkDocument (deprecated)
func ChunkDocument(docName, content string, cfg ChunkConfig) []Chunk
ChunkDocument splits content into overlapping chunks, preferring to break at paragraph or sentence boundaries.
Deprecated: use ChunkText (returns []DerivedChunk). Removed in v0.3.0.
type ChunkConfig ¶
type ChunkConfig struct {
ChunkSize int `json:"chunk_size,omitempty"`
ChunkOverlap int `json:"chunk_overlap,omitempty"`
}
ChunkConfig controls document chunking.
func DefaultChunkConfig ¶
func DefaultChunkConfig() ChunkConfig
DefaultChunkConfig returns the default chunking configuration.
type ChunkQuery ¶ added in v0.2.1
ChunkQuery is the recall input passed by Retrievers to ChunkRepo.Search.
Empty DatasetIDs means "every dataset" (cross-dataset search). When Mode is ModeVector or ModeHybrid, Vector should be supplied; backends that cannot satisfy a mode return an empty result without error.
type ChunkRepo ¶ added in v0.2.1
type ChunkRepo interface {
Replace(ctx context.Context, datasetID, docName string, chunks []DerivedChunk) error
DeleteByDoc(ctx context.Context, datasetID, docName string) error
DeleteByDataset(ctx context.Context, datasetID string) error
Search(ctx context.Context, q ChunkQuery) ([]Candidate, error)
}
ChunkRepo persists DerivedChunks and supports recall.
Replace MUST be atomic: callers rely on it to eliminate stale chunks when a SourceDocument is updated (contract guarantee #5).
type ChunkSpec ¶ added in v0.2.1
ChunkSpec is the chunker output: a positionally-tagged content slice without dataset/doc identity (the Service fills those in).
type Chunker ¶ added in v0.2.1
Chunker turns raw content into ordered ChunkSpecs.
Implementations MUST be deterministic (same input -> same output) and MUST return a stable Sig() so derived data freshness can be checked.
func NewDefaultChunker ¶ added in v0.2.1
func NewDefaultChunker(cfg ChunkConfig) Chunker
NewDefaultChunker returns the built-in paragraph/sentence-boundary chunker. Its output is UTF-8 safe; chunks never split inside a multi-byte rune.
Sizes are measured in runes (not bytes), so multi-byte UTF-8 content (CJK, emoji, etc.) is sliced safely on rune boundaries.
type ContextLayer ¶
type ContextLayer string
ContextLayer indicates the granularity of a search result.
v0.3.0 will rename ContextLayer -> Layer; the alias declared in model.go (type Layer = ContextLayer) lets new code adopt the final name today without breaking existing callers.
const (
	LayerAbstract ContextLayer = "L0" // ~100 token one-sentence summary
	LayerOverview ContextLayer = "L1" // ~1k token structured overview
	LayerDetail   ContextLayer = "L2" // full chunk content
)
type CorpusStats ¶
type CorpusStats = textsearch.CorpusStats
Type and function aliases re-exported from sdk/textsearch for backward compatibility. Internal code and tests can use these without changing import paths.
type DatasetContext ¶
type DatasetContext struct {
Abstract string // dataset-level L0
Overview string // dataset-level L1
}
DatasetContext groups the layered context for an entire dataset.
func GenerateDatasetContext ¶
func GenerateDatasetContext(ctx context.Context, l llm.LLM, summaries []DocumentSummary) (DatasetContext, error)
GenerateDatasetContext derives dataset-level L0 + L1 from per-document abstracts. The L1 overview is generated first, then distilled into L0. Returns an empty context with no error when summaries is empty.
type DatasetQuery ¶
type DatasetQuery struct {
DatasetID string `json:"dataset_id"`
StateKey string `json:"state_key"`
TopK int `json:"top_k"`
}
DatasetQuery describes a single dataset search within a Knowledge node. Re-used by both the v0.3.0 knowledgenode.Config and the deprecated KnowledgeConfig (kept stable across versions).
type DerivedChunk ¶ added in v0.2.1
type DerivedChunk struct {
DatasetID string
DocName string
Index int
Offset int
Content string
Vector []float32
Sig DerivedSig
}
DerivedChunk is one retrieval unit derived from a SourceDocument.
func ChunkText ¶ added in v0.2.1
func ChunkText(docName, content string, cfg ChunkConfig) []DerivedChunk
ChunkText splits content into UTF-8-safe overlapping chunks.
Behavior:
- Slicing is performed on rune indices, never byte indices.
- End offsets are reported in bytes for compatibility with retrieval filters that key on byte position.
- Boundary preference: paragraph (\n\n) > sentence (". ") > line (\n).
- The result is deterministic for a given (content, cfg) pair.
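The rune-versus-byte distinction in the list above can be sketched as follows. This is a simplified illustration, not ChunkText itself: chunkRunes and the chunk struct are hypothetical, boundary preference is omitted, and reporting the byte offset of each chunk's start is an assumption about how offsets are recorded.

```go
package main

import "fmt"

// chunk is a hypothetical, trimmed-down stand-in for DerivedChunk.
type chunk struct {
	Index   int
	Offset  int // byte offset of the chunk start in the original content
	Content string
}

// chunkRunes slices content into windows of size runes that overlap by
// overlap runes. Slicing happens on rune indices; offsets are in bytes.
func chunkRunes(content string, size, overlap int) []chunk {
	if size <= 0 {
		return nil
	}
	if overlap < 0 || overlap >= size {
		overlap = 0 // keep the loop strictly advancing
	}
	runes := []rune(content)
	var out []chunk
	byteOff := 0
	for start := 0; start < len(runes); {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		out = append(out, chunk{Index: len(out), Offset: byteOff, Content: string(runes[start:end])})
		if end == len(runes) {
			break
		}
		next := end - overlap
		byteOff += len(string(runes[start:next])) // bytes consumed before the next window
		start = next
	}
	return out
}

func main() {
	for _, c := range chunkRunes("héllo wörld", 5, 2) {
		fmt.Printf("%d @%d %q\n", c.Index, c.Offset, c.Content)
	}
}
```

Because the same (content, size, overlap) triple always walks the same rune indices, the output is deterministic, which is the property freshness checks rely on.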
type DerivedLayer ¶ added in v0.2.1
type DerivedLayer struct {
DatasetID string
DocName string
Layer Layer
Content string
Vector []float32
Sig DerivedSig
}
DerivedLayer is an LLM-produced summary of a document or dataset. DocName == "" denotes a dataset-level layer.
type DerivedSig ¶ added in v0.2.1
DerivedSig binds a derived artifact to the source revision and to the configuration that produced it. Required on every derived object.
- SourceVer is the SourceDocument.Version that produced this artifact.
- ChunkerSig is non-empty for chunk artifacts and empty for layers.
- PromptSig is non-empty for layer artifacts and empty for chunks.
- EmbedSig identifies the embedder, "" when no vector is attached.
func (DerivedSig) IsStale ¶ added in v0.2.1
func (sig DerivedSig) IsStale(want DerivedSig) bool
IsStale returns true when sig was produced for an earlier source version or with a different chunker / prompt / embed configuration than want.
A zero EmbedSig in want is treated as "don't care".
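A minimal sketch of the staleness rule, under stated assumptions: the field names come from the list above, but their Go types (uint64 and string) and the lower-case names derivedSig / isStale are illustrative choices, not the package's declarations.

```go
package main

import "fmt"

// derivedSig is an illustrative stand-in for DerivedSig; field types are assumed.
type derivedSig struct {
	SourceVer  uint64
	ChunkerSig string
	PromptSig  string
	EmbedSig   string
}

// isStale reports whether sig is older than want or was produced with a
// different chunker / prompt / embed configuration.
func (sig derivedSig) isStale(want derivedSig) bool {
	if sig.SourceVer < want.SourceVer {
		return true
	}
	if sig.ChunkerSig != want.ChunkerSig || sig.PromptSig != want.PromptSig {
		return true
	}
	// An empty EmbedSig in want means "don't care".
	if want.EmbedSig != "" && sig.EmbedSig != want.EmbedSig {
		return true
	}
	return false
}

func main() {
	have := derivedSig{SourceVer: 3, ChunkerSig: "c1", EmbedSig: "e1"}
	fmt.Println(have.isStale(derivedSig{SourceVer: 3, ChunkerSig: "c1"}))                 // embed ignored
	fmt.Println(have.isStale(derivedSig{SourceVer: 4, ChunkerSig: "c1"}))                 // newer source
	fmt.Println(have.isStale(derivedSig{SourceVer: 3, ChunkerSig: "c1", EmbedSig: "e2"})) // embedder changed
}
```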
type Document (deprecated)
type Document struct {
Name string `json:"name"`
Content string `json:"content"`
Abstract string `json:"abstract,omitempty"` // L0
Overview string `json:"overview,omitempty"` // L1
Metadata map[string]string `json:"metadata,omitempty"`
}
Document represents a knowledge base document.
Deprecated: use SourceDocument (raw content + Version) and DerivedLayer (L0/L1) separately. Document conflates the two and is removed in v0.3.0.
type DocumentContext ¶
DocumentContext groups the layered context for a single document.
func GenerateDocumentContext ¶
func GenerateDocumentContext(ctx context.Context, l llm.LLM, content string) (DocumentContext, error)
GenerateDocumentContext synthesizes L0 (abstract) and L1 (overview) for a document by issuing two LLM calls. It keeps no state of its own: no storage I/O, no caching, no retries; callers own scheduling and persistence.
Returns a partial result on error: if abstract generation fails the zero-value context is returned with the error; if overview fails the already-generated abstract is preserved so callers can choose to persist it.
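The partial-result contract can be sketched with a stubbed model call. Everything here is hypothetical scaffolding: documentContext assumes Abstract and Overview fields (by analogy with DatasetContext), completeFunc stands in for llm.LLM, and the prompt strings are placeholders rather than the package's prompt templates.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// documentContext assumes the same two fields as DatasetContext.
type documentContext struct {
	Abstract string
	Overview string
}

// completeFunc is a hypothetical stand-in for an LLM completion call.
type completeFunc func(ctx context.Context, prompt string) (string, error)

// generateDocumentContext makes two calls: abstract first, then overview.
func generateDocumentContext(ctx context.Context, complete completeFunc, content string) (documentContext, error) {
	abstract, err := complete(ctx, "abstract: "+content)
	if err != nil {
		return documentContext{}, err // nothing generated yet: zero value
	}
	out := documentContext{Abstract: abstract}
	overview, err := complete(ctx, "overview: "+content)
	if err != nil {
		return out, err // keep the abstract so the caller may persist it
	}
	out.Overview = overview
	return out, nil
}

// runDemo drives the sketch with a stub whose second call fails.
func runDemo() (documentContext, error) {
	calls := 0
	stub := func(_ context.Context, _ string) (string, error) {
		calls++
		if calls == 2 {
			return "", errors.New("overview backend unavailable")
		}
		return "one-sentence summary", nil
	}
	return generateDocumentContext(context.Background(), stub, "raw document text")
}

func main() {
	dc, err := runDemo()
	if err != nil {
		// The abstract is still usable even though the overview failed.
		fmt.Printf("partial result: abstract=%q err=%v\n", dc.Abstract, err)
	}
}
```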
type DocumentRepo ¶ added in v0.2.1
type DocumentRepo interface {
Put(ctx context.Context, doc SourceDocument) error
Get(ctx context.Context, datasetID, name string) (*SourceDocument, error)
Delete(ctx context.Context, datasetID, name string) error
List(ctx context.Context, datasetID string) ([]SourceDocument, error)
ListDatasets(ctx context.Context) ([]string, error)
}
DocumentRepo persists SourceDocuments. Implementations MUST guarantee:
- Put atomically increments SourceDocument.Version.
- Get returns the most recent Put with Content losslessly preserved (contract guarantee #4).
- Delete is idempotent.
Implementations live in sdk/knowledge/backend/*.
type DocumentSummary ¶
DocumentSummary pairs a document name with its L0 abstract, used as input to GenerateDatasetContext.
type Embedder ¶
Embedder is an alias for the SDK embedding.Embedder interface. It supports both single-text and batch embeddings.
type EventKind ¶ added in v0.2.1
type EventKind int
EventKind classifies a ChangeEvent emitted by an EventNotifier implementation. Used by EventReloader to decide whether to perform a targeted or dataset-wide rebuild.
type EventNotifier ¶ added in v0.2.1
type EventNotifier interface {
Events() <-chan ChangeEvent
Close() error
}
EventNotifier is the v0.3.0 producer side of the reload pipeline. It supersedes the deprecated ChangeNotifier (defined in deprecated.go): events carry dataset/doc granularity so the consumer can issue targeted Rebuilds instead of a global one.
Implementations live in adapter packages (e.g. sdkx/knowledge/watcher, once it migrates) so the sdk core stays dependency-free. Implementations MUST close the Events channel when Close() is called.
type EventReloader ¶ added in v0.2.1
type EventReloader struct {
// contains filtered or unexported fields
}
EventReloader debounces ChangeEvents and triggers Rebuild on the trailing edge. Rebuilds are serialised: a new rebuild waits for the previous one to finish.
Targeted vs global rebuilds: when the debounce window contains events for a single (dataset, doc) pair, the rebuild is scoped to that pair; when it touches multiple documents in one dataset or contains any EventBulk event, a dataset-wide rebuild is issued instead. Events for more than one dataset in the same window collapse to a global RebuildScope{} (every dataset).
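One way to read the scoping rules is sketched below. The types event and scope and the function collapse are hypothetical; the real ChangeEvent and RebuildScope fields are not shown in this documentation. The sketch interprets the rules as: one (dataset, doc) pair gives a targeted rebuild, several documents or a bulk event within one dataset gives a dataset-wide rebuild, and events from more than one dataset give a global rebuild.

```go
package main

import "fmt"

// event is a hypothetical stand-in for ChangeEvent.
type event struct {
	DatasetID string
	DocName   string // "" denotes a dataset-level event
	Bulk      bool   // stands in for an EventBulk kind
}

// scope is a hypothetical stand-in for RebuildScope; the zero value means
// "every dataset".
type scope struct {
	DatasetID string
	DocName   string
}

// collapse reduces one debounce window to the smallest covering scope.
// The boolean is false when the window is empty and no rebuild is needed.
func collapse(events []event) (scope, bool) {
	if len(events) == 0 {
		return scope{}, false
	}
	first := events[0]
	sameDoc := !first.Bulk && first.DocName != ""
	for _, ev := range events[1:] {
		if ev.DatasetID != first.DatasetID {
			return scope{}, true // mixed datasets: global rebuild
		}
		if ev.Bulk || ev.DocName != first.DocName {
			sameDoc = false
		}
	}
	if sameDoc {
		return scope{DatasetID: first.DatasetID, DocName: first.DocName}, true
	}
	return scope{DatasetID: first.DatasetID}, true
}

func main() {
	s, _ := collapse([]event{{DatasetID: "kb", DocName: "a.md"}, {DatasetID: "kb", DocName: "a.md"}})
	fmt.Printf("%+v\n", s)
	s, _ = collapse([]event{{DatasetID: "kb", DocName: "a.md"}, {DatasetID: "kb", DocName: "b.md"}})
	fmt.Printf("%+v\n", s)
	s, _ = collapse([]event{{DatasetID: "kb", DocName: "a.md"}, {DatasetID: "faq", DocName: "x.md"}})
	fmt.Printf("%+v\n", s)
}
```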
EventReloader is the v0.3.0 successor to Reloader. The legacy Reloader (with its struct{}-channel ChangeNotifier, both in deprecated.go) remains exported during the deprecation window and will be removed in v0.3.0.
func NewEventReloader ¶ added in v0.2.1
func NewEventReloader(target Rebuilder, notifier EventNotifier, opts ReloaderOptions) *EventReloader
NewEventReloader wires a Rebuilder to an EventNotifier.
opts.Debounce defaults to 500ms; opts.RebuildTimeout defaults to 30s.
When target or notifier is nil, Run becomes a no-op (returns immediately) and Close is also a no-op for the background loop, so callers can wire up unconditionally and the call order does not matter.
In the normal (non-nil) configuration the contract is:
- Run MUST be invoked at most once before Close;
- Close blocks until Run has actually exited AND the notifier has been closed, so callers observing a successful Close are guaranteed no further Rebuild / timer callback can fire.
To uphold the second guarantee even when Close races with the goroutine that is about to call Run, wg.Add(1) is performed here (synchronously, before the constructor returns) rather than inside Run. This way wg.Add strictly happens-before any wg.Wait in Close, which is what sync.WaitGroup requires when its counter starts at zero.
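The constructor-side wg.Add pattern can be sketched in isolation. The reloader type below is a hypothetical, heavily reduced model (no debounce, no Rebuilder, plain string events); it only demonstrates why registering the goroutine in the constructor lets Close wait safely even if Run has not been scheduled yet.

```go
package main

import (
	"fmt"
	"sync"
)

// reloader is a reduced, hypothetical model of the lifecycle described above.
type reloader struct {
	events <-chan string
	stop   chan struct{}
	once   sync.Once
	wg     sync.WaitGroup
	seen   []string // written only by Run; read only after Close returns
}

// newReloader registers the future Run goroutine before returning, so
// wg.Add happens-before any wg.Wait performed by Close.
func newReloader(events <-chan string) *reloader {
	r := &reloader{events: events, stop: make(chan struct{})}
	r.wg.Add(1)
	return r
}

// Run consumes events until Close is called or the channel is closed.
func (r *reloader) Run() {
	defer r.wg.Done()
	for {
		select {
		case <-r.stop:
			return
		case ev, ok := <-r.events:
			if !ok {
				return
			}
			r.seen = append(r.seen, ev)
		}
	}
}

// Close signals Run to stop and blocks until it has exited.
func (r *reloader) Close() {
	r.once.Do(func() { close(r.stop) })
	r.wg.Wait()
}

func main() {
	events := make(chan string)
	r := newReloader(events)
	go r.Run()
	events <- "docs/a.md" // unbuffered: returns once Run has received it
	r.Close()
	fmt.Println(len(r.seen), "event(s) handled before Close returned")
}
```

As in the documented contract, calling Close in this sketch without ever starting Run would block forever, because the counter registered in the constructor is only released by Run.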
func (*EventReloader) Close ¶ added in v0.2.1
func (r *EventReloader) Close() error
Close stops Run and the underlying notifier.
In the normal (non-nil target & notifier) configuration Close blocks until Run has actually returned, so on success the caller is guaranteed that no further Rebuild call or timer-driven flush can happen. Calling Close before Run was started would deadlock, so callers MUST start Run first; this matches the EventReloader lifecycle documented on Run / NewEventReloader.
When target or notifier is nil, Close is a no-op and may be called at any time.
type FSStore (deprecated)
type FSStore struct {
// contains filtered or unexported fields
}
FSStore implements Store using a Workspace-backed file tree.
Deprecated: use factory.NewLocal(ws, opts...). Removed in v0.3.0.
func NewFSStore (deprecated)
func NewFSStore(ws workspace.Workspace, opts ...FSStoreOption) *FSStore
NewFSStore creates a knowledge store rooted at the given prefix.
Deprecated: use factory.NewLocal(ws, opts...). Removed in v0.3.0.
func (*FSStore) AddDocument ¶
func (*FSStore) AddDocuments ¶
func (*FSStore) BuildIndex ¶
BuildIndex scans all datasets and builds the in-memory search index.
func (*FSStore) DatasetAbstract ¶
func (*FSStore) DatasetOverview ¶
func (*FSStore) DeleteDocument ¶
func (*FSStore) GetDocument ¶
func (*FSStore) ListDocuments ¶
func (*FSStore) ReindexVectors ¶
ReindexVectors regenerates vector embeddings for all indexed chunks.
func (*FSStore) Search ¶
func (s *FSStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)
Search performs a two-level search over the in-memory index.
func (*FSStore) SetDatasetAbstract ¶
func (*FSStore) SetDatasetOverview ¶
func (*FSStore) SetDocAbstract ¶
func (*FSStore) SetDocOverview ¶
func (*FSStore) WorkspaceRoot ¶
WorkspaceRoot exposes the underlying workspace root when available.
Returns "" if the workspace does not implement Root().
func (*FSStore) WriteDatasetFile ¶
WriteDatasetFile writes a dataset-level file (e.g. .abstract.md, .overview.md).
type FSStoreOption (deprecated)
type FSStoreOption func(*FSStore)
FSStoreOption configures an FSStore.
Deprecated: see FSStore.
func WithChunkConfig (deprecated)
func WithChunkConfig(cfg ChunkConfig) FSStoreOption
WithChunkConfig sets the chunking configuration.
Deprecated: configure ChunkConfig on factory.NewLocal's chunker option.
func WithEmbedder (deprecated)
func WithEmbedder(e Embedder) FSStoreOption
WithEmbedder sets the embedder for semantic/hybrid search.
Deprecated: pass embedder through factory.WithLocalEmbedder.
func WithTokenizer (deprecated)
func WithTokenizer(t Tokenizer) FSStoreOption
WithTokenizer sets the tokenizer for search and indexing.
Deprecated: pass tokenizer through factory.WithLocalTokenizer.
type Hit ¶ added in v0.2.1
type Hit struct {
DatasetID string
DocName string
Layer Layer
Content string
Score float64
ChunkIndex int // -1 for layer hits
Sig DerivedSig // freshness traceability
Metadata map[string]any // backend-passthrough metadata
}
Hit is one ranked search result.
type Layer ¶ added in v0.2.1
type Layer = ContextLayer
Layer is the v0.3.0 name for ContextLayer. It is declared as a type alias so values flow seamlessly between old and new APIs during the deprecation window. The constant set lives in types.go.
type LayerQuery ¶ added in v0.2.1
type LayerQuery struct {
DatasetIDs []string
Layer Layer
Text string
Vector []float32
Mode Mode
TopK int
}
LayerQuery is the recall input for layer-tier searches.
type LayerRepo ¶ added in v0.2.1
type LayerRepo interface {
Put(ctx context.Context, layer DerivedLayer) error
Get(ctx context.Context, datasetID, docName string, layer Layer) (*DerivedLayer, error)
DeleteByDoc(ctx context.Context, datasetID, docName string) error
DeleteByDataset(ctx context.Context, datasetID string) error
Search(ctx context.Context, q LayerQuery) ([]Candidate, error)
}
LayerRepo persists DerivedLayers and supports layer-scoped recall.
type LayerRetriever ¶ added in v0.2.1
LayerRetriever queries LayerRepo for L0/L1 hits. It activates only when Query.Layer is LayerAbstract or LayerOverview; LayerDetail queries are routed to chunk-tier retrievers instead.
Embedder is consulted only for vector lanes (ModeVector / ModeHybrid).
func NewLayerRetriever ¶ added in v0.2.1
func NewLayerRetriever(l LayerRepo, e Embedder) *LayerRetriever
NewLayerRetriever constructs a LayerRetriever; pass a nil embedder to disable the vector lane.
func (*LayerRetriever) Name ¶ added in v0.2.1
func (r *LayerRetriever) Name() string
Name implements Retriever.
type Mode ¶ added in v0.2.1
type Mode = SearchMode
Mode is the v0.3.0 name for SearchMode. It is declared as a type alias so values flow seamlessly between old and new APIs during the deprecation window. The constant set lives in types.go.
func ResolveMode ¶ added in v0.2.1
ResolveMode normalises legacy and zero values to a canonical Mode.
- "" -> ModeBM25 (legacy default)
- ModeSemantic -> ModeVector (Deprecated alias, removed in v0.3.0)
Any other recognised mode is returned unchanged.
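The two normalisation rules can be sketched as below. This is only an illustration under stated assumptions: the real ResolveMode signature is not shown on this page, so resolveMode is a hypothetical lower-case stand-in, and the mode types are redeclared locally so the snippet compiles on its own.

```go
package main

import "fmt"

// Local copies of the package's mode types so the sketch is self-contained.
type SearchMode string
type Mode = SearchMode

const (
	ModeBM25     SearchMode = "bm25"
	ModeVector   SearchMode = "vector"
	ModeSemantic SearchMode = "semantic" // deprecated alias of ModeVector
	ModeHybrid   SearchMode = "hybrid"
)

// resolveMode is a hypothetical stand-in for ResolveMode.
func resolveMode(m Mode) Mode {
	switch m {
	case "":
		return ModeBM25 // legacy default
	case ModeSemantic:
		return ModeVector // deprecated name maps to its replacement
	default:
		return m // every other mode passes through unchanged
	}
}

func main() {
	for _, m := range []Mode{"", ModeSemantic, ModeHybrid} {
		fmt.Printf("%q -> %q\n", m, resolveMode(m))
	}
}
```

Because Mode is a type alias rather than a new type, the same function accepts values declared as either SearchMode or Mode without conversion.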
type Query ¶ added in v0.2.1
type Query struct {
DatasetID string
Scope Scope
Text string
Mode Mode
Layer Layer
TopK int
Threshold float64
// contains filtered or unexported fields
}
Query is the canonical search input.
Validation rules (enforced by Service.Search):
- Layer must be valid; defaults to LayerDetail when zero.
- Mode must be valid; defaults to ModeBM25 when zero.
- Scope=ScopeSingleDataset requires DatasetID to be non-empty.
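The three validation rules above might look roughly like the following. This is a sketch, not the code inside Service.Search: the underlying types and string values of Layer, Mode and Scope are assumptions, and normalize is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-in types; the real underlying types and constant values are assumed.
type (
	Layer string
	Mode  string
	Scope string
)

const (
	LayerAbstract Layer = "abstract"
	LayerOverview Layer = "overview"
	LayerDetail   Layer = "detail"

	ModeBM25   Mode = "bm25"
	ModeVector Mode = "vector"
	ModeHybrid Mode = "hybrid"

	ScopeSingleDataset Scope = "single"
	ScopeAllDatasets   Scope = "all"
)

type Query struct {
	DatasetID string
	Scope     Scope
	Text      string
	Mode      Mode
	Layer     Layer
	TopK      int
}

// normalize applies the documented defaults and rejects invalid input.
func normalize(q Query) (Query, error) {
	switch q.Layer {
	case "":
		q.Layer = LayerDetail // zero value falls back to the detail layer
	case LayerAbstract, LayerOverview, LayerDetail:
	default:
		return q, fmt.Errorf("invalid layer %q", q.Layer)
	}
	switch q.Mode {
	case "":
		q.Mode = ModeBM25 // zero value falls back to BM25
	case ModeBM25, ModeVector, ModeHybrid:
	default:
		return q, fmt.Errorf("invalid mode %q", q.Mode)
	}
	if q.Scope == ScopeSingleDataset && q.DatasetID == "" {
		return q, errors.New("ScopeSingleDataset requires a DatasetID")
	}
	return q, nil
}

func main() {
	q, err := normalize(Query{Scope: ScopeSingleDataset, DatasetID: "docs", Text: "hello"})
	fmt.Println(q.Layer, q.Mode, err)
	_, err = normalize(Query{Scope: ScopeSingleDataset})
	fmt.Println(err)
}
```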
type RRFRanker ¶ added in v0.2.1
type RRFRanker struct {
K int
}
RRFRanker performs reciprocal-rank fusion across candidates grouped by Candidate.Source. K defaults to DefaultRRFK when zero.
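For readers unfamiliar with reciprocal-rank fusion, here is a minimal sketch of the technique: each candidate earns 1/(K + rank) from every source that returned it, and the per-source contributions are summed. The candidate and fused types are hypothetical simplifications of Candidate, and the fallback of 60 is a commonly used RRF constant, not a confirmed value of DefaultRRFK.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical, trimmed-down stand-in for Candidate.
type candidate struct {
	ID     string // identifies the chunk or layer being ranked
	Source string // retriever name, e.g. "bm25" or "vector"
}

// assumedRRFK stands in for DefaultRRFK; the real value is not shown here.
const assumedRRFK = 60

type fused struct {
	ID    string
	Score float64
}

// fuse expects each source's candidates to appear in rank order, best first.
func fuse(cands []candidate, k int) []fused {
	if k == 0 {
		k = assumedRRFK
	}
	rank := map[string]int{}      // last 1-based rank handed out per source
	score := map[string]float64{} // accumulated RRF score per candidate ID
	for _, c := range cands {
		rank[c.Source]++
		score[c.ID] += 1.0 / float64(k+rank[c.Source])
	}
	out := make([]fused, 0, len(score))
	for id, s := range score {
		out = append(out, fused{ID: id, Score: s})
	}
	// Highest fused score first; ties are broken by ID for determinism.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].ID < out[j].ID
	})
	return out
}

func main() {
	cands := []candidate{
		{"a", "bm25"}, {"b", "bm25"}, {"c", "bm25"},
		{"b", "vector"}, {"d", "vector"},
	}
	for _, f := range fuse(cands, 0) {
		fmt.Printf("%s %.5f\n", f.ID, f.Score)
	}
}
```

In this example "b" ranks first because it is the only candidate returned by both sources, even though it is not the top BM25 result.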
func NewRRFRanker ¶ added in v0.2.1
func NewRRFRanker() *RRFRanker
NewRRFRanker constructs an RRFRanker with sensible defaults.
type RebuildScope ¶ added in v0.2.1
type RebuildScope struct {
DatasetID string // "" means all datasets
DocName string // "" means all documents in the dataset
}
RebuildScope narrows what Rebuilder.Rebuild touches. Zero value means "everything".
type Rebuilder ¶ added in v0.2.1
type Rebuilder interface {
Rebuild(ctx context.Context, scope RebuildScope) error
}
Rebuilder is the consumer side of the change-driven reload pipeline. Service satisfies this interface; EventReloader invokes Rebuild on the trailing edge of a debounce window over an EventNotifier stream.
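A trailing-edge debounce in front of a Rebuilder could be sketched as follows. This is not EventReloader itself: the event channel carries empty structs instead of typed ChangeEvents, the rebuild always uses the zero RebuildScope, and debounce, burst and countingRebuilder are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

type RebuildScope struct {
	DatasetID string // "" means all datasets
	DocName   string // "" means all documents in the dataset
}

type Rebuilder interface {
	Rebuild(ctx context.Context, scope RebuildScope) error
}

// countingRebuilder is a test double that only counts Rebuild calls.
type countingRebuilder struct{ calls atomic.Int32 }

func (c *countingRebuilder) Rebuild(context.Context, RebuildScope) error {
	c.calls.Add(1)
	return nil
}

// debounce calls target.Rebuild once the event stream has been quiet for
// window. Every new event restarts the wait. The events channel must stay
// open for the lifetime of the loop; cancel ctx to stop it.
func debounce(ctx context.Context, events <-chan struct{}, window time.Duration, target Rebuilder) {
	var timer *time.Timer
	var fire <-chan time.Time // nil until the first event arrives
	for {
		select {
		case <-ctx.Done():
			if timer != nil {
				timer.Stop()
			}
			return
		case <-events:
			if timer != nil {
				timer.Stop()
			}
			timer = time.NewTimer(window) // fresh timer, so no stale tick is read
			fire = timer.C
		case <-fire:
			fire = nil
			_ = target.Rebuild(ctx, RebuildScope{}) // zero scope: rebuild everything
		}
	}
}

// burst sends n back-to-back events and reports how many rebuilds ran.
func burst(n int) int32 {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	events := make(chan struct{})
	target := &countingRebuilder{}
	go debounce(ctx, events, 20*time.Millisecond, target)
	for i := 0; i < n; i++ {
		events <- struct{}{}
	}
	time.Sleep(200 * time.Millisecond)
	return target.calls.Load()
}

func main() {
	fmt.Println("rebuilds after 3 events:", burst(3))
}
```

Running Rebuild inside the select loop also serialises rebuilds: a second rebuild cannot start while the first is still executing.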
type Reloader
deprecated
type Reloader struct {
// contains filtered or unexported fields
}
Reloader debounces ChangeNotifier events and triggers Rebuild on a stable trailing edge.
Deprecated: use EventReloader (typed events + scope-aware Service.Rebuild + serialised execution). Removed in v0.3.0.
func NewReloader
deprecated
func NewReloader(store *FSStore, notifier ChangeNotifier, opts ReloaderOptions) *Reloader
NewReloader wires a ChangeNotifier to a rebuild callback.
Deprecated: use NewEventReloader(target Rebuilder, notifier EventNotifier, opts). Removed in v0.3.0.
type ReloaderOptions ¶
type ReloaderOptions struct {
Debounce time.Duration
RebuildTimeout time.Duration
Rebuild func(context.Context) error
}
ReloaderOptions configures EventReloader (and the deprecated Reloader).
Field set is the union of both consumers:
- Debounce controls the trailing-edge window (both consumers).
- RebuildTimeout caps each rebuild call (EventReloader only; the legacy Reloader hard-codes 30s).
- Rebuild is the legacy hook used by the deprecated Reloader to swap the rebuild callback. EventReloader ignores it and always calls Rebuilder.Rebuild on the supplied target.
type Result ¶ added in v0.2.1
type Result struct {
Hits []Hit
}
Result wraps the ordered hit list returned by Service.Search.
type RetrievalStore
deprecated
type RetrievalStore struct {
// contains filtered or unexported fields
}
RetrievalStore is a Store implementation backed by a retrieval.Index.
Deprecated: use factory.NewRetrieval(docs, idx, opts...) which returns a *Service backed by backend/retrieval. Removed in v0.3.0.
func NewRetrievalStore
deprecated
func NewRetrievalStore(idx retrieval.Index, opts ...RetrievalStoreOption) *RetrievalStore
NewRetrievalStore wires a Store to a retrieval.Index.
Deprecated: use factory.NewRetrieval(docs, idx, opts...). Removed in v0.3.0.
func (*RetrievalStore) AddDocument ¶
func (s *RetrievalStore) AddDocument(ctx context.Context, datasetID, name, content string) error
func (*RetrievalStore) AddDocuments ¶
func (*RetrievalStore) DatasetAbstract ¶
func (*RetrievalStore) DatasetOverview ¶
func (*RetrievalStore) DeleteDocument ¶
func (s *RetrievalStore) DeleteDocument(ctx context.Context, datasetID, name string) error
func (*RetrievalStore) GetDocument ¶
func (*RetrievalStore) Index ¶
func (s *RetrievalStore) Index() retrieval.Index
Index exposes the underlying retrieval.Index.
func (*RetrievalStore) ListDocuments ¶
func (*RetrievalStore) Search ¶
func (s *RetrievalStore) Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)
func (*RetrievalStore) SetAbstract ¶
func (s *RetrievalStore) SetAbstract(ctx context.Context, datasetID, name, abstract string) error
func (*RetrievalStore) SetDatasetAbstract ¶
func (s *RetrievalStore) SetDatasetAbstract(ctx context.Context, datasetID, abstract string) error
func (*RetrievalStore) SetDatasetOverview ¶
func (s *RetrievalStore) SetDatasetOverview(ctx context.Context, datasetID, overview string) error
func (*RetrievalStore) SetOverview ¶
func (s *RetrievalStore) SetOverview(ctx context.Context, datasetID, name, overview string) error
type RetrievalStoreOption
deprecated
type RetrievalStoreOption func(*RetrievalStore)
RetrievalStoreOption configures a RetrievalStore.
Deprecated: see RetrievalStore.
func WithRetrievalChunkConfig
deprecated
func WithRetrievalChunkConfig(c ChunkConfig) RetrievalStoreOption
WithRetrievalChunkConfig overrides the default chunk config.
Deprecated: configure ChunkConfig via factory.WithRetrievalChunker.
func WithRetrievalEmbedder
deprecated
func WithRetrievalEmbedder(e embedding.Embedder) RetrievalStoreOption
WithRetrievalEmbedder sets the embedder used to vectorize chunks at write time.
Deprecated: pass embedder through factory.WithRetrievalEmbedder.
func WithRetrievalPipeline
deprecated
func WithRetrievalPipeline(p *pipeline.Pipeline) RetrievalStoreOption
WithRetrievalPipeline overrides the default pipeline.Knowledge(emb, nil).
Deprecated: factory.NewRetrieval owns the pipeline now.
func WithRetrievalTokenizer
deprecated
func WithRetrievalTokenizer(t Tokenizer) RetrievalStoreOption
WithRetrievalTokenizer overrides the BM25 tokenizer.
Deprecated: factory wires the tokenizer through textsearch.
type Retriever ¶ added in v0.2.1
type Retriever interface {
// Name returns a stable identifier used by Rankers (e.g. "bm25").
Name() string
// Recall fetches up to q.TopK candidates for q.
Recall(ctx context.Context, q Query) ([]Candidate, error)
}
Retriever produces a candidate set for a Query. Implementations are stateless with respect to the Query; all state lives in the underlying repos.
type SearchEngine ¶ added in v0.2.1
type SearchEngine struct {
Chunk []Retriever // chunk-tier retrievers (LayerDetail)
Layer []Retriever // layer-tier retrievers (LayerAbstract / LayerOverview)
Rank Ranker
}
SearchEngine routes a Query through Retrievers, then a Ranker.
Behaviour:
- LayerDetail queries fan out to chunk-tier Retrievers (BM25/Vector).
- Layer{Abstract,Overview} queries fan out to layer-tier Retrievers.
- ModeHybrid runs both BM25 and Vector recall paths and lets the Ranker fuse them (RRF by default).
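The tier routing described above can be illustrated with a small sketch. It covers only the choice between chunk-tier and layer-tier retrievers; recall, mode handling and ranking are left out. The route method, the named helper type, the retriever names "vector" and "layer", and the Layer string values are assumptions made for the example.

```go
package main

import "fmt"

type Layer string

// Constant values are assumed; only the names appear in the documentation.
const (
	LayerAbstract Layer = "abstract"
	LayerOverview Layer = "overview"
	LayerDetail   Layer = "detail"
)

type Query struct {
	Text  string
	Layer Layer
}

// Retriever is reduced to Name here; Recall is omitted to keep the sketch short.
type Retriever interface{ Name() string }

// named is a hypothetical retriever that only knows its own name.
type named string

func (n named) Name() string { return string(n) }

type SearchEngine struct {
	Chunk []Retriever // chunk-tier retrievers (LayerDetail)
	Layer []Retriever // layer-tier retrievers (LayerAbstract / LayerOverview)
}

// route picks the retriever tier for q. A zero Layer is treated like
// LayerDetail, matching the documented default.
func (e *SearchEngine) route(q Query) []Retriever {
	switch q.Layer {
	case LayerAbstract, LayerOverview:
		return e.Layer
	default:
		return e.Chunk
	}
}

// names collects retriever names for printing.
func names(rs []Retriever) []string {
	out := make([]string, 0, len(rs))
	for _, r := range rs {
		out = append(out, r.Name())
	}
	return out
}

func main() {
	e := &SearchEngine{
		Chunk: []Retriever{named("bm25"), named("vector")},
		Layer: []Retriever{named("layer")},
	}
	fmt.Println(names(e.route(Query{Layer: LayerDetail})))
	fmt.Println(names(e.route(Query{Layer: LayerAbstract})))
}
```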
func NewSearchEngine ¶ added in v0.2.1
func NewSearchEngine(chunk, layer []Retriever, ranker Ranker) *SearchEngine
NewSearchEngine assembles a SearchEngine. ranker may be nil; the Service will substitute a default RRFRanker.
type SearchMode ¶
type SearchMode string
SearchMode chooses the retrieval algorithm.
v0.3.0 final values are explicit strings; legacy callers that pass the empty string are normalised to ModeBM25 by ResolveMode().
Deprecated names:
- ModeSemantic remains for backwards compatibility; new code should use ModeVector. The two are recognised as equivalent at the Service boundary starting in v0.2.x, and ModeSemantic is removed in v0.3.0.
const (
ModeBM25 SearchMode = "bm25"
ModeVector SearchMode = "vector"
ModeSemantic SearchMode = "semantic" // Deprecated: use ModeVector.
ModeHybrid SearchMode = "hybrid"
)
type SearchOptions
deprecated
type SearchOptions struct {
TopK int `json:"top_k,omitempty"`
MaxLayer ContextLayer `json:"max_layer,omitempty"`
Threshold float64 `json:"threshold,omitempty"`
Mode SearchMode `json:"mode,omitempty"`
}
SearchOptions configures a knowledge search query.
Deprecated: use Query. The MaxLayer→Layer rename and ScopeAllDatasets fan-out live on Query. SearchOptions is removed in v0.3.0.
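When migrating a call site, the field mapping from SearchOptions to Query might be sketched as below. The helper queryFromOptions is hypothetical and not part of the package. The underlying types of ContextLayer and Scope and the value of ScopeSingleDataset are assumptions, and the choice of ScopeSingleDataset rests on the legacy Search signature taking exactly one datasetID.

```go
package main

import "fmt"

// Stand-in declarations; underlying types and the Scope value are assumed.
type (
	ContextLayer string
	SearchMode   string
	Scope        string
	Layer        = ContextLayer
	Mode         = SearchMode
)

const ScopeSingleDataset Scope = "single"

type SearchOptions struct {
	TopK      int
	MaxLayer  ContextLayer
	Threshold float64
	Mode      SearchMode
}

type Query struct {
	DatasetID string
	Scope     Scope
	Text      string
	Mode      Mode
	Layer     Layer
	TopK      int
	Threshold float64
}

// queryFromOptions is a hypothetical migration helper.
func queryFromOptions(datasetID, text string, opts SearchOptions) Query {
	return Query{
		DatasetID: datasetID,
		Scope:     ScopeSingleDataset, // legacy Search took a single datasetID
		Text:      text,
		Mode:      opts.Mode,
		Layer:     opts.MaxLayer, // the MaxLayer -> Layer rename
		TopK:      opts.TopK,
		Threshold: opts.Threshold,
	}
}

func main() {
	q := queryFromOptions("docs", "install steps", SearchOptions{TopK: 5, MaxLayer: "overview"})
	fmt.Printf("%+v\n", q)
}
```

An empty Mode is copied through as-is here, on the assumption that the Service boundary applies the documented ModeBM25 default.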
type SearchResult
deprecated
type SearchResult struct {
Content string `json:"content"`
Score float64 `json:"score"`
DocName string `json:"doc_name,omitempty"`
ChunkIndex int `json:"chunk_index,omitempty"`
Layer ContextLayer `json:"layer"`
Metadata map[string]any `json:"metadata,omitempty"`
}
SearchResult represents a single search hit with its relevance score.
Deprecated: use Hit. SearchResult is removed in v0.3.0.
func RRFMerge
deprecated
func RRFMerge(bm25Results, semanticResults []SearchResult, k int) []SearchResult
RRFMerge fuses two ranked SearchResult lists with reciprocal-rank fusion.
Deprecated: use RRFRanker. Removed in v0.3.0.
func RankResults
deprecated
func RankResults(results []SearchResult, topK int) []SearchResult
RankResults sorts by score descending and limits to topK.
Deprecated: use the SearchEngine's Ranker (RRFRanker by default). Removed in v0.3.0.
type Service ¶ added in v0.2.1
type Service struct {
// contains filtered or unexported fields
}
Service orchestrates document lifecycle, derived-data persistence and search. All public Knowledge entry points (graph node, tools, deprecated stores) route through Service so contract guarantees (#1..#7 in doc.go) live in one place.
func NewService ¶ added in v0.2.1
func NewService(docs DocumentRepo, chunks ChunkRepo, layers LayerRepo, engine *SearchEngine, opts ServiceOptions) *Service
NewService constructs a Service from explicit repositories.
Most callers should use NewLocalService / NewRetrievalService instead; this entry point exists so out-of-tree backends can be wired the same way the built-ins are.
func (*Service) DatasetLayer ¶ added in v0.2.1
DatasetLayer reads a dataset-level layer; returns "" without error when missing.
func (*Service) DeleteDocument ¶ added in v0.2.1
DeleteDocument removes the document and all its derived data (chunks + layers). Errors from chunk/layer cleanup are returned after a best-effort attempt so a single failure does not leave the document orphaned in DocumentRepo.
func (*Service) GetDocument ¶ added in v0.2.1
GetDocument returns the lossless SourceDocument (contract #4).
func (*Service) Layer ¶ added in v0.2.1
Layer reads a document-level layer; returns "" without error when missing (contract: callers should treat absence as "not yet generated").
func (*Service) ListDatasets ¶ added in v0.2.1
ListDatasets enumerates every known dataset id.
func (*Service) ListDocuments ¶ added in v0.2.1
ListDocuments returns SourceDocuments in the dataset. Implementations MAY omit Content for performance; FSDocumentRepo currently returns it.
func (*Service) PutDatasetLayer ¶ added in v0.2.1
func (s *Service) PutDatasetLayer(ctx context.Context, datasetID string, layer Layer, content string) error
PutDatasetLayer persists a dataset-level layer (DocName == "").
func (*Service) PutDocument ¶ added in v0.2.1
PutDocument writes raw content under (datasetID, name).
Behaviour:
- SourceDocument.Version is incremented atomically by DocumentRepo.
- DerivedChunks are recomputed and ChunkRepo.Replace is called.
- DerivedLayers are NOT touched (layer generation is explicit; see PutDocumentLayer / PutDatasetLayer).
Atomicity model: chunk replacement happens AFTER the document write succeeds. A failure between the two leaves the document persisted but chunks stale; the next PutDocument or Rebuild restores consistency. Backends with native transactions can override this by composing repos that share a transaction.
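The atomicity model can be made concrete with a toy in-memory store. Everything here is hypothetical scaffolding: the store type, the failChunk switch that simulates a ChunkRepo.Replace failure, and strings.Fields as a stand-in chunker. It only demonstrates the ordering described above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// store is a toy stand-in for DocumentRepo plus ChunkRepo.
type store struct {
	version   map[string]uint64   // document name -> current source version
	chunks    map[string][]string // document name -> derived chunks
	chunkVer  map[string]uint64   // source version the chunks were derived from
	failChunk bool                // simulates a chunk replacement failure
}

func newStore() *store {
	return &store{
		version:  map[string]uint64{},
		chunks:   map[string][]string{},
		chunkVer: map[string]uint64{},
	}
}

// put writes the document first and replaces its chunks second.
func (s *store) put(name, content string) error {
	s.version[name]++ // step 1: the document write succeeds
	if s.failChunk {
		// Step 2 failed: the document stays persisted, chunks stay stale.
		return errors.New("chunk replace failed")
	}
	s.chunks[name] = strings.Fields(content) // toy chunker: one chunk per word
	s.chunkVer[name] = s.version[name]
	return nil
}

// stale reports whether the chunks lag behind the document version.
func (s *store) stale(name string) bool {
	return s.chunkVer[name] != s.version[name]
}

func main() {
	s := newStore()
	fmt.Println(s.put("a.md", "one two"), s.stale("a.md"))
	s.failChunk = true
	fmt.Println(s.put("a.md", "three"), s.stale("a.md"))
	s.failChunk = false
	fmt.Println(s.put("a.md", "four five"), s.stale("a.md"))
}
```

The second put leaves version 2 persisted with chunks from version 1; the third put brings the two back in line, which is the recovery path the documentation describes.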
func (*Service) PutDocumentLayer ¶ added in v0.2.1
func (s *Service) PutDocumentLayer(ctx context.Context, datasetID, name string, layer Layer, content string) error
PutDocumentLayer persists an LLM-derived layer for one document. Caller is expected to have produced content via GenerateDocumentContext.
func (*Service) Rebuild ¶ added in v0.2.1
func (s *Service) Rebuild(ctx context.Context, scope RebuildScope) error
Rebuild re-derives chunks for the requested scope, comparing DerivedSig against the current (SourceVer, ChunkerSig). Stale chunks are recomputed; up-to-date chunks are left alone.
Layers are not regenerated automatically; callers drive layer rebuilds explicitly via PutDocumentLayer / PutDatasetLayer.
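The freshness check that drives Rebuild might be sketched like this. DerivedSig is reduced to the two fields named above (the real struct may carry more, such as an embedder signature), and doc, inScope and staleDocs are hypothetical helpers that only decide which documents would be re-chunked.

```go
package main

import "fmt"

type RebuildScope struct {
	DatasetID string // "" means all datasets
	DocName   string // "" means all documents in the dataset
}

// DerivedSig is a hypothetical two-field version of the real signature.
type DerivedSig struct {
	SourceVer  uint64
	ChunkerSig string
}

// doc pairs a source document with the signature stored on its chunks.
type doc struct {
	DatasetID, Name string
	Version         uint64
	ChunkSig        DerivedSig
}

// inScope reports whether d is covered by the scope; empty fields match all.
func inScope(s RebuildScope, d doc) bool {
	if s.DatasetID != "" && s.DatasetID != d.DatasetID {
		return false
	}
	return s.DocName == "" || s.DocName == d.Name
}

// staleDocs lists in-scope documents whose chunks no longer match the
// current (source version, chunker signature) pair.
func staleDocs(docs []doc, scope RebuildScope, chunkerSig string) []string {
	var out []string
	for _, d := range docs {
		want := DerivedSig{SourceVer: d.Version, ChunkerSig: chunkerSig}
		if inScope(scope, d) && d.ChunkSig != want {
			out = append(out, d.DatasetID+"/"+d.Name)
		}
	}
	return out
}

func main() {
	docs := []doc{
		{"kb", "a.md", 2, DerivedSig{2, "v1"}},  // up to date
		{"kb", "b.md", 3, DerivedSig{2, "v1"}},  // source moved on
		{"faq", "c.md", 1, DerivedSig{1, "v0"}}, // chunker changed
	}
	fmt.Println(staleDocs(docs, RebuildScope{}, "v1"))
	fmt.Println(staleDocs(docs, RebuildScope{DatasetID: "kb"}, "v1"))
}
```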
func (*Service) Search ¶ added in v0.2.1
Search executes the query through the configured SearchEngine.
Validation (the only place these checks live, contract #2):
- q.Layer defaults to LayerDetail when zero; any other value that is not a valid Layer is rejected.
- q.Mode defaults to ModeBM25 when zero; ModeSemantic is normalised to ModeVector for backwards compatibility.
- q.Scope=ScopeSingleDataset requires q.DatasetID to be non-empty.
For ScopeAllDatasets the dataset list is resolved once via DocumentRepo.ListDatasets and pushed down to retrievers via the unexported resolvedDatasets field.
type ServiceOptions ¶ added in v0.2.1
type ServiceOptions struct {
// Chunker overrides the default chunker; nil means "use
// NewDefaultChunker(DefaultChunkConfig())".
Chunker Chunker
// Embedder enables vector indexing and semantic search; nil
// disables vector lanes.
Embedder Embedder
// EmbedSig is stamped onto every DerivedSig produced while this
// Service runs. Embedder doesn't expose a model identifier, so
// callers (typically the Factory) supply one. Empty string means
// "use the embedder's Go type name", which is good enough for
// freshness checks within a single binary but not stable across
// processes — production wiring should set it explicitly.
EmbedSig string
// Now overrides the clock; nil means time.Now (unit-test hook).
Now func() time.Time
}
ServiceOptions configures a Service. Nil-friendly: every field is optional and falls back to a sensible default.
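The nil-friendly defaulting described in the field comments could be sketched as follows. The Chunker field is left out for brevity, the Embedder method set is invented for the example, and deriving the fallback signature with the %T verb is an assumption about what "the embedder's Go type name" means in practice.

```go
package main

import (
	"fmt"
	"time"
)

// Embedder is a stand-in; the real interface's methods are not shown here.
type Embedder interface {
	Embed(text string) []float32 // hypothetical method
}

// ServiceOptions is trimmed to the fields this sketch defaults.
type ServiceOptions struct {
	Embedder Embedder
	EmbedSig string
	Now      func() time.Time
}

// withDefaults is a hypothetical helper that fills unset fields.
func withDefaults(o ServiceOptions) ServiceOptions {
	if o.EmbedSig == "" && o.Embedder != nil {
		// The Go type name is stable within one binary, not across builds.
		o.EmbedSig = fmt.Sprintf("%T", o.Embedder)
	}
	if o.Now == nil {
		o.Now = time.Now
	}
	return o
}

// fakeEmbedder is a no-op embedder used only to show the type-name fallback.
type fakeEmbedder struct{}

func (fakeEmbedder) Embed(string) []float32 { return nil }

func main() {
	implicit := withDefaults(ServiceOptions{Embedder: fakeEmbedder{}})
	explicit := withDefaults(ServiceOptions{Embedder: fakeEmbedder{}, EmbedSig: "text-embed-v1"})
	fmt.Println(implicit.EmbedSig)
	fmt.Println(explicit.EmbedSig)
}
```

The explicit form is the one the field comment recommends for production wiring, since a type name changes whenever the embedder type is renamed or moved.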
type SimpleTokenizer ¶
type SimpleTokenizer = textsearch.SimpleTokenizer
Type and function aliases re-exported from sdk/textsearch for backward compatibility. Internal code and tests can use these without changing import paths.
type SourceDocument ¶ added in v0.2.1
type SourceDocument struct {
DatasetID string
Name string
Content string
Metadata map[string]string
// Version is monotonically incremented on every successful Put.
// Derived data uses it as a freshness key.
Version uint64
UpdatedAt time.Time
}
SourceDocument is the canonical, lossless representation of user input.
It is the single source of truth for a document; every DerivedChunk and DerivedLayer carries a DerivedSig that points back to a particular SourceDocument.Version, so derived data can be detected as stale and recomputed deterministically.
type Store
deprecated
type Store interface {
AddDocument(ctx context.Context, datasetID, name, content string) error
AddDocuments(ctx context.Context, datasetID string, docs []DocInput) error
GetDocument(ctx context.Context, datasetID, name string) (*Document, error)
DeleteDocument(ctx context.Context, datasetID, name string) error
ListDocuments(ctx context.Context, datasetID string) ([]Document, error)
Search(ctx context.Context, datasetID, query string, opts SearchOptions) ([]SearchResult, error)
Abstract(ctx context.Context, datasetID, name string) (string, error)
Overview(ctx context.Context, datasetID, name string) (string, error)
DatasetAbstract(ctx context.Context, datasetID string) (string, error)
DatasetOverview(ctx context.Context, datasetID string) (string, error)
}
Store abstracts knowledge base storage. Documents are organized by dataset.
Deprecated: use *Service in sdk/knowledge instead. Service unifies document, chunk and layer storage behind a single contract and is the only orchestrator going forward; Store will be removed in v0.3.0.
Migration:
- Replace AddDocument / AddDocuments with Service.PutDocument.
- Replace Search with Service.Search.
- Replace Abstract / Overview with Service.Layer.
- Replace DatasetAbstract / DatasetOverview with Service.DatasetLayer.
type Tokenizer ¶
type Tokenizer = textsearch.Tokenizer
Type and function aliases re-exported from sdk/textsearch for backward compatibility. Internal code and tests can use these without changing import paths.
type VectorRetriever ¶ added in v0.2.1
VectorRetriever queries ChunkRepo with Mode=ModeVector. Embedder is invoked lazily; nil disables the retriever (Recall returns nil).
func NewVectorRetriever ¶ added in v0.2.1
func NewVectorRetriever(c ChunkRepo, e Embedder) *VectorRetriever
NewVectorRetriever constructs a VectorRetriever; pass a nil embedder to disable the lane (Recall short-circuits).
func (*VectorRetriever) Name ¶ added in v0.2.1
func (r *VectorRetriever) Name() string
Name implements Retriever.
Directories
¶
| Path | Synopsis |
|---|---|
| backend | |
| backend/fs | Package fs implements knowledge.DocumentRepo, knowledge.ChunkRepo and knowledge.LayerRepo on top of any workspace.Workspace. |
| backend/retrieval | Package retrieval implements knowledge.ChunkRepo and knowledge.LayerRepo on top of any retrieval.Index. |
| factory | Package factory wires the canonical knowledge.Service stacks. |