Documentation ¶
Overview ¶
Package index orchestrates atomic Stroma index rebuilds and searches.
Index ¶
- Constants
- Variables
- type ArmEvidence
- type BuildOptions
- type BuildResult
- type ChunkContextualizer
- type ContextOptions
- type FusionStrategy
- type HitProvenance
- type RRFFusion
- type RecordQuery
- type RecordSource
- type RecordSourceFunc
- type Reranker
- type RetrievalArm
- type ReuseStatus
- type SearchHit
- type SearchParams
- type SearchQuery
- type Section
- type SectionQuery
- type Snapshot
- func (s *Snapshot) Close() error
- func (s *Snapshot) ExpandContext(ctx context.Context, chunkID int64, opts ContextOptions) ([]Section, error)
- func (s *Snapshot) Path() string
- func (s *Snapshot) Records(ctx context.Context, query RecordQuery) ([]corpus.Record, error)
- func (s *Snapshot) Search(ctx context.Context, query SnapshotSearchQuery) ([]SearchHit, error)
- func (s *Snapshot) SearchVector(ctx context.Context, query VectorSearchQuery) ([]SearchHit, error)
- func (s *Snapshot) Sections(ctx context.Context, query SectionQuery) ([]Section, error)
- func (s *Snapshot) Stats(ctx context.Context) (*Stats, error)
- type SnapshotSearchQuery
- type Stats
- type UpdateOptions
- type UpdateResult
- type VectorSearchQuery
Constants ¶
const (
	ArmVector = "vector"
	ArmFTS    = "fts"
)
Arm name constants used by the default Snapshot.Search pipeline. Custom FusionStrategy implementations may introduce additional arm names.
const DefaultMaxChunkSections = 10_000
DefaultMaxChunkSections caps the number of heading-aware sections a single record can contribute to the index when the caller hasn't overridden it. 10,000 is generous for legitimate technical documents (few real specs exceed a few hundred headings) while still preventing a pathological or hostile body from expanding into millions of embedder calls + rows.
const DefaultSearchLimit = 10
DefaultSearchLimit is the hit cap applied to Snapshot.Search and Snapshot.SearchVector when SearchParams.Limit / VectorSearchQuery.Limit is zero or negative. The choice is conservative; pick an explicit Limit if throughput matters or if the caller needs a stable shortlist size across snapshots.
const MaxSearchLimit = 250
MaxSearchLimit is the largest accepted SearchParams.Limit or VectorSearchQuery.Limit. Search uses bounded in-memory shortlists for vector/FTS fusion and reranking; callers needing more than this should page or shard at a higher layer rather than relying on an unbounded single-query scan.
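The limit rules above can be sketched as a small helper. The `resolveLimit` function below is illustrative, not part of the package; it only restates the documented behavior (zero or negative selects DefaultSearchLimit, values above MaxSearchLimit are rejected rather than capped).

```go
package main

import (
	"errors"
	"fmt"
)

const (
	DefaultSearchLimit = 10
	MaxSearchLimit     = 250
)

// resolveLimit applies the documented rules: zero or negative selects
// DefaultSearchLimit; values above MaxSearchLimit are rejected, not
// silently capped.
func resolveLimit(limit int) (int, error) {
	if limit <= 0 {
		return DefaultSearchLimit, nil
	}
	if limit > MaxSearchLimit {
		return 0, errors.New("limit exceeds MaxSearchLimit")
	}
	return limit, nil
}

func main() {
	n, _ := resolveLimit(0)
	fmt.Println(n) // 10
	_, err := resolveLimit(1000)
	fmt.Println(err != nil) // true
}
```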
Variables ¶
var ErrStaleUpdatePlan = errors.New("index changed while planning update")
ErrStaleUpdatePlan signals that Update planned added records against one committed snapshot, but the snapshot content changed before the write transaction applied those plans. Callers can retry the Update so chunk reuse and embeddings are recomputed against the new base snapshot.
var ErrUnsupportedSchemaVersion = errors.New("unsupported snapshot schema version")
ErrUnsupportedSchemaVersion is returned when an operation encounters a snapshot whose schema_version is neither the current schema nor one the library knows how to migrate from. It is surfaced by OpenSnapshot and wrapped via fmt.Errorf with %w so callers can use errors.Is to detect it.
var ErrUpdateCommittedIntegrityCheckFailed = errors.New("update committed but post-commit integrity check failed")
ErrUpdateCommittedIntegrityCheckFailed signals that Update's transaction committed successfully — the record, chunk, and metadata changes are durable on disk — but the post-commit PRAGMA integrity_check / foreign_key_check reported corruption. The enclosing error wraps this sentinel via fmt.Errorf with %w so callers can use errors.Is to detect it. This case is non-retriable: re-running Update will not unroll the already-durable changes, and the underlying file likely needs operator inspection (see index/ARCHITECTURE.md). Contrast with plain errors returned by Update, which come from pre-commit failures and leave the file byte-identical to its pre-call state.
var ErrUpdatePlanTooLarge = errors.New("update plan exceeds MaxPlannedRecords")
ErrUpdatePlanTooLarge signals that UpdateOptions.MaxPlannedRecords rejected the added-record set before Update opened the write transaction. Callers can split added records into smaller Update calls and retry.
Functions ¶
This section is empty.
Types ¶
type ArmEvidence ¶
type ArmEvidence struct {
// Rank is the zero-based position of the hit within the arm.
Rank int
// Score is the arm-native score at the time the arm returned the hit
// (cosine derivative for vector, negative bm25 for FTS).
Score float64
}
ArmEvidence is one arm's contribution to a fused hit.
type BuildOptions ¶
type BuildOptions struct {
// Path is the OS-native filesystem path where the built snapshot
// is written. On Windows both forward and back slashes are
// accepted — the store package normalizes drive prefixes on open.
Path string
// ReuseFromPath points at an existing Stroma snapshot whose embeddings
// should be reused at the section level: a new section reuses its
// stored embedding whenever its title, heading, and body match a
// section already present in the prior snapshot. Records that are
// fully unchanged are the maximal case, but sections carried over
// from an edited record still reuse their embeddings. The snapshot is
// opened read-only and queried per-record during the rebuild, so
// resident memory scales with a single record's chunks rather than
// with the whole corpus. Leave empty to disable reuse.
ReuseFromPath string
Embedder embed.Embedder
// Contextualizer optionally produces a per-chunk prefix string that
// gets prepended before the embedding text and the FTS5 content. When
// set, the prefix persists on the chunk and participates in reuse
// keying so a changed contextualizer invalidates stale reuse without
// corrupting the stored representation. Nil disables contextualization
// and leaves the build identical to the non-contextual path.
Contextualizer ChunkContextualizer
// MaxChunkTokens sets the approximate maximum number of tokens (words)
// per chunk. Sections that exceed this limit are split into smaller
// sub-sections. Zero disables token-budget splitting.
MaxChunkTokens int
// ChunkOverlapTokens sets the approximate number of overlapping tokens
// between adjacent sub-sections when a section is split. Zero disables
// overlap.
ChunkOverlapTokens int
// MaxChunkSections caps how many sections any single record is allowed
// to produce. A pathological Markdown body (e.g., 10^6 heading lines)
// would otherwise translate into 10^6 embedder calls and 10^6
// chunk/vector rows — a DoS vector for shared embedders. Zero means
// DefaultMaxChunkSections; a negative value disables the cap for
// callers who have their own upstream validation. When the cap is
// exceeded, Rebuild returns an error wrapping chunk.ErrTooManySections
// instead of silently admitting the record.
MaxChunkSections int
// Quantization controls the vector storage format. See the
// store.Quantization* constants for the accept-listed values:
// store.QuantizationFloat32 (default), store.QuantizationInt8 (4x
// smaller, minor precision loss), and store.QuantizationBinary
// (32x smaller via 1-bit sign packing, full-precision rescore on a
// companion table preserves ranking).
Quantization string
// ChunkPolicy selects the chunking strategy. Nil defaults to
// chunk.MarkdownPolicy{Options: chunk.Options{
// MaxTokens: MaxChunkTokens,
// OverlapTokens: ChunkOverlapTokens,
// MaxSections: <resolved>,
// }}, which reproduces the pre-1.0 chunking pipeline exactly.
// Setting a non-nil policy overrides the per-Build chunking shape:
// MaxChunkTokens/ChunkOverlapTokens/MaxChunkSections are read by
// the default MarkdownPolicy but ignored when ChunkPolicy is set
// (the policy carries its own configuration). Hierarchical
// policies like chunk.LateChunkPolicy emit parent + leaf chunks
// linked via parent_chunk_id; ExpandContext can surface the
// parent on demand. See docs/superpowers/specs for the design.
ChunkPolicy chunk.Policy
}
BuildOptions controls how a Stroma index is rebuilt.
type BuildResult ¶
type BuildResult struct {
Path string
RecordCount int
ChunkCount int
ReusedRecordCount int
ReusedChunkCount int
EmbeddedChunkCount int
ReuseStatus ReuseStatus
ReuseDisabledReason string
EmbedderDimension int
EmbedderFingerprint string
ContentFingerprint string
}
BuildResult summarizes a completed rebuild.
func Rebuild ¶
func Rebuild(ctx context.Context, records []corpus.Record, options BuildOptions) (*BuildResult, error)
Rebuild atomically recreates the index at the requested path.
func RebuildFromSource ¶ added in v2.3.0
func RebuildFromSource(ctx context.Context, source RecordSource, options BuildOptions) (*BuildResult, error)
RebuildFromSource atomically recreates the index at the requested path from a streaming record source.
Unlike Rebuild, this API does not require callers to materialize a []corpus.Record with every BodyText resident at once. Records are consumed one at a time in source order, normalized, chunked, embedded, and flushed in bounded internal batches. Duplicate refs are rejected by the staging snapshot's primary key. Source order determines snapshot-local chunk IDs; callers that need repeatable chunk IDs across streaming rebuilds should emit records in a stable order.
type ChunkContextualizer ¶
type ChunkContextualizer interface {
ContextualizeChunks(ctx context.Context, record corpus.Record, sections []chunk.Section) ([]string, error)
}
ChunkContextualizer produces a short explanatory prefix for each section of a record. The returned slice must be the same length as sections and aligned with it index-for-index. An empty prefix is allowed and disables contextual retrieval for that section. The returned prefix is prepended to the embedding text and to the FTS5 content column; it is persisted so reuse keying can detect when a changed contextualizer needs to invalidate the stored embedding.
type ContextOptions ¶
type ContextOptions struct {
// IncludeParent walks the requested chunk's parent_chunk_id one level
// up and includes the parent row in the returned slice when the chunk
// has a parent. Multi-level ancestry walks are explicit recursion by
// the caller.
//
// Against snapshots built before schema v5 (#16), there is no
// parent_chunk_id column to walk; IncludeParent is a no-op.
IncludeParent bool
// NeighborWindow includes up to N sibling chunks on each side of the
// requested chunk, ordered by chunk_index. Two chunks are siblings
// when they share the same parent_chunk_id (NULL counts as a single
// sibling group), so for a leaf the neighborhood stays inside the
// same parent span and for a flat or parent chunk the neighborhood
// is other top-level chunks under the same record. Zero means no
// neighbors are included; the requested chunk is still returned by
// itself.
//
// Against snapshots built before schema v5 (#16), the parent grouping
// is unavailable, so neighbors degrade to "other chunks in the same
// record_ref with chunk_index in the requested window."
NeighborWindow int
}
ContextOptions controls how Snapshot.ExpandContext widens a single chunk hit into a local-context payload.
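The sibling rule above can be made concrete with a small standalone model. The `chunkRow` type and `neighbors` helper below are illustrative, not the package's implementation; they only restate the documented grouping (siblings share parent_chunk_id, window measured by chunk_index).

```go
package main

import "fmt"

// chunkRow stands in for a stored chunk row; Parent of -1 would model
// a NULL parent_chunk_id, which counts as a single sibling group.
type chunkRow struct {
	ID, Parent, Index int64
}

// neighbors returns the IDs of siblings within window positions of
// target on either side: same Parent, chunk_index within the window,
// target itself excluded. rows are assumed ordered by chunk_index.
func neighbors(rows []chunkRow, target chunkRow, window int64) []int64 {
	var out []int64
	for _, r := range rows {
		if r.ID == target.ID || r.Parent != target.Parent {
			continue
		}
		d := r.Index - target.Index
		if d >= -window && d <= window {
			out = append(out, r.ID)
		}
	}
	return out
}

func main() {
	rows := []chunkRow{
		{ID: 1, Parent: 10, Index: 0},
		{ID: 2, Parent: 10, Index: 1}, // the requested chunk
		{ID: 3, Parent: 10, Index: 2},
		{ID: 4, Parent: 11, Index: 3}, // different parent: not a sibling
	}
	fmt.Println(neighbors(rows, rows[1], 1)) // [1 3]
}
```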
type FusionStrategy ¶
type FusionStrategy interface {
Fuse(arms []RetrievalArm, limit int) ([]SearchHit, error)
}
FusionStrategy combines one or more RetrievalArms into a single ranked list, truncated to limit. Implementations must be deterministic and must attach HitProvenance to every returned hit covering each arm that contributed.
Fuse returns an error when inputs are malformed (for example Available=true with a non-nil Err, or an arm with an empty Name) or when the strategy fails closed on an upstream arm error. Callers treat errors the same way as any other retrieval failure. Strategies that want to tolerate partial-arm failures do so internally and return a nil error.
Aliasing contract: implementations must treat each input arm's Hits slice and every SearchHit it contains as read-only. They must not mutate Hit fields, must not mutate a Hit's Metadata map (which may alias storage shared across arms when the same ChunkID matched on more than one retrieval path), and must return a freshly allocated []SearchHit rather than repurposing an input arm's slice.
func DefaultFusion ¶
func DefaultFusion() FusionStrategy
DefaultFusion returns the FusionStrategy used when SearchQuery.Fusion is nil. Ordering is identical to pre-#17 Snapshot.Search on every path, and SearchHit.Score is identical on every path except one: when the vector arm returns zero hits and the FTS arm is non-empty, DefaultFusion preserves the bm25-derived arm-native Score instead of the pre-#17 RRF-rewritten score. Callers who read Score on that specific path can recover both the arm-native and pre-#17-style scores via the HitProvenance attached to each hit.
type HitProvenance ¶
type HitProvenance struct {
Arms map[string]ArmEvidence
}
HitProvenance records which arms found a fused hit. The map is keyed by arm name; arms that did not return the hit are absent from the map.
type RRFFusion ¶
RRFFusion is the default FusionStrategy. K controls the RRF constant; K<=0 is treated as K=60 for backward compatibility with the pre-#17 mergeRRF helper.
PreserveSingleArmScore controls the single-arm degenerate case. When true (the default used by DefaultFusion) and exactly one arm is available-and-non-empty, Fuse returns that arm's hits in arm order with arm-native Score preserved. When false, Fuse rewrites Score to the RRF-derived 1/(K+rank+1) on every path. Callers that want numerically uniform fused scores across single-arm and multi-arm paths opt in by setting this to false.
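The per-arm term is the documented 1/(K+rank+1) with K falling back to 60. The sketch below computes that term standalone; summing terms across arms is how standard Reciprocal Rank Fusion combines multi-arm hits, which this sketch assumes rather than confirms from the package.

```go
package main

import "fmt"

// rrfScore is the documented RRF term for a hit at zero-based rank
// within one arm: 1/(K+rank+1). K<=0 falls back to 60 for backward
// compatibility.
func rrfScore(k, rank int) float64 {
	if k <= 0 {
		k = 60
	}
	return 1.0 / float64(k+rank+1)
}

func main() {
	// A chunk ranked 0 by the vector arm and 2 by the FTS arm fuses,
	// under standard RRF, to the sum of its per-arm terms.
	fused := rrfScore(0, 0) + rrfScore(0, 2)
	fmt.Printf("%.6f\n", fused)
}
```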
type RecordQuery ¶
RecordQuery filters records from an opened snapshot.
type RecordSource ¶ added in v2.3.0
RecordSource streams records into RebuildFromSource.
Next returns the next record and true while input remains. Returning false ends the stream and ignores the returned record. Implementations should return any loading or decoding failure directly. RebuildFromSource calls Next serially with a non-nil context, propagates source errors, and leaves the destination snapshot unchanged.
type RecordSourceFunc ¶ added in v2.3.0
RecordSourceFunc adapts a function to RecordSource.
type Reranker ¶
type Reranker interface {
Rerank(ctx context.Context, query string, candidates []SearchHit) ([]SearchHit, error)
}
Reranker optionally refines one search candidate shortlist before the final limit truncation.
Aliasing contract: implementations must treat the input candidates slice and every SearchHit it contains as read-only. They must not mutate Hit fields, must not mutate a Hit's Metadata map (which may alias storage shared with other hits), and must not return the input slice — return a freshly allocated []SearchHit instead. Snapshot.Search defensively shallow-copies the candidates slice before handing it to the reranker, but that copy is shallow so maps and sub-slices inside each SearchHit remain shared. Reorderings and truncations are fine; mutations are not.
type RetrievalArm ¶
RetrievalArm is one candidate list from one retrieval path, ordered by the arm's own ranking. Hits[i].Score is the arm-native score (cosine distance derivative for vector, negative bm25-equivalent for FTS).
Available and Err distinguish three otherwise identical-looking states:
- Available=true, Err=nil, len(Hits)==0: arm ran, zero matches.
- Available=false, Err=nil: arm unavailable on this snapshot (for example a legacy snapshot without fts_chunks). Hits must be empty.
- Available=false, Err!=nil: arm failed. Hits must be empty.
Available=true with a non-nil Err is invalid; FusionStrategy implementations should return an error when they observe it.
type ReuseStatus ¶ added in v2.3.0
type ReuseStatus string
ReuseStatus reports whether BuildOptions.ReuseFromPath was usable during Rebuild. Reuse setup remains non-fatal by default; callers can inspect BuildResult.ReuseStatus and BuildResult.ReuseDisabledReason to distinguish "nothing reusable" from "reuse could not start".
const (
	// ReuseStatusDisabled means BuildOptions.ReuseFromPath was empty.
	ReuseStatusDisabled ReuseStatus = "disabled"
	// ReuseStatusActive means the prior snapshot opened and passed
	// compatibility checks, so section-level reuse was attempted.
	ReuseStatusActive ReuseStatus = "active"
	// ReuseStatusUnavailable means ReuseFromPath did not point at a
	// readable snapshot file, for example because it was missing or a
	// directory.
	ReuseStatusUnavailable ReuseStatus = "unavailable"
	// ReuseStatusIncompatible means the snapshot exists but cannot seed
	// this build because schema, embedder, dimension, or quantization
	// metadata does not match.
	ReuseStatusIncompatible ReuseStatus = "incompatible"
	// ReuseStatusError means setup hit an operational error while
	// checking the configured snapshot.
	ReuseStatusError ReuseStatus = "error"
)
type SearchHit ¶
type SearchHit struct {
ChunkID int64
Ref string
Kind string
Title string
SourceRef string
Heading string
Content string
Metadata map[string]string
Score float64
// Provenance records which retrieval arms contributed to this hit.
// It is populated by FusionStrategy implementations; non-fusion paths
// (SearchVector, direct searchFTS callers) leave it nil.
Provenance *HitProvenance
}
SearchHit is one retrieved section.
type SearchParams ¶
type SearchParams struct {
// Text is the free-form query text. Empty rejects with a
// "search text is required" error — this field has no default.
Text string
// Limit caps the number of SearchHits returned. Zero or negative
// selects DefaultSearchLimit (10). Values above MaxSearchLimit
// reject with an error instead of being silently capped.
Limit int
// Kinds filters candidate records to the supplied kind list. Nil
// or empty means "no filter, all kinds".
Kinds []string
// Embedder produces the query vector(s) used by the dense arm.
// Nil rejects with a "search embedder is required" error — this
// field has no default.
Embedder embed.Embedder
// Fusion optionally overrides the hybrid fusion strategy. Nil
// uses DefaultFusion().
Fusion FusionStrategy
// Reranker optionally refines the candidate shortlist after
// fusion. Nil skips reranking.
Reranker Reranker
// SearchDimension optionally runs a truncated-prefix vector prefilter
// at this dimension, then rescores the shortlist with full-dim cosine.
// Zero (default) uses the full stored dimension throughout. Positive
// values must be <= the stored embedder dimension. Only valid when the
// stored quantization is float32; returns an error against int8 indexes.
// This is the shape Matryoshka Representation Learning (MRL) embeddings
// rely on — callers who use non-MRL embeddings should leave it zero.
//
// The truncated path is a brute-force scan over chunks_vec, not a
// vec0 kNN MATCH, so it is not asymptotically cheaper than the default
// path: its win is constant-factor (fewer floats per cosine) and only
// pays off when the truncated prefix preserves ranking. Treat this as
// a tuning knob for MRL snapshots rather than a blanket speedup.
SearchDimension int
}
SearchParams are the retrieval parameters shared by SearchQuery (the top-level one-shot API against an index path) and SnapshotSearchQuery (the long-lived API against an open Snapshot). Extracting the shared shape lets downstream adapters thread one value through both surfaces and lets the top-level Search forward its params verbatim instead of hand-copying six fields.
type SearchQuery ¶
type SearchQuery struct {
// Path is the OS-native filesystem path to the snapshot. On
// Windows both forward and back slashes are accepted — the store
// package normalizes drive prefixes on open.
Path string
SearchParams
}
SearchQuery defines one semantic search against an index path. Retrieval parameters live on the embedded SearchParams so the same shape flows through Search, Snapshot.Search, and any downstream adapter wrapper.
type Section ¶
type Section struct {
ChunkID int64
Ref string
Kind string
Title string
SourceRef string
Heading string
Content string
ContextPrefix string
Metadata map[string]string
Embedding []float64
}
Section is one stored section from a Stroma snapshot.
type SectionQuery ¶
type SectionQuery struct {
Refs []string
Kinds []string
// IncludeEmbeddings asks Sections() to populate Section.Embedding
// from the stored vector column. Snapshots produced by hierarchical
// policies (e.g., chunk.LateChunkPolicy) hold parent rows that are
// storage-only context with no vector — those rows are filtered
// out of an IncludeEmbeddings = true query because the underlying
// chunks → chunks_vec join is inner. Set IncludeEmbeddings = false
// to receive every chunk row (parents + leaves) without embeddings.
IncludeEmbeddings bool
}
SectionQuery filters sections from an opened snapshot.
type Snapshot ¶
type Snapshot struct {
// contains filtered or unexported fields
}
Snapshot is one opened Stroma index snapshot.
Safe for concurrent use by multiple goroutines once returned from OpenSnapshot: *sql.DB is goroutine-safe per the database/sql contract, and all Snapshot read methods (Stats, Records, Sections, Search, SearchVector, ExpandContext) invoke it through that contract. Cached metadata fields (quantization, storedDimension, hasFTS, …) are populated at open time and read-only thereafter, so no additional synchronization is required around Snapshot itself.
func OpenSnapshot ¶
OpenSnapshot opens a read-only Stroma snapshot at path. The path is OS-native; on Windows both forward and back slashes are accepted (the store package normalizes drive prefixes on open). The snapshot's schema_version metadata must be one of the accept-listed versions — schemaVersion (current), prevSchemaVersion, legacySchemaVersionV3, or legacySchemaVersionV2 — all of which read paths can decode directly without forcing an Update. Anything else returns ErrUnsupportedSchemaVersion wrapped with the observed version, so callers can surface a clear upgrade/downgrade message instead of silently misdecoding data against a future schema.
The returned *Snapshot is safe for concurrent use by multiple goroutines once constructed: *sql.DB is goroutine-safe per the database/sql contract, and Snapshot's cached metadata fields are populated at open time and read-only thereafter.
func (*Snapshot) ExpandContext ¶
func (s *Snapshot) ExpandContext(ctx context.Context, chunkID int64, opts ContextOptions) ([]Section, error)
ExpandContext returns the chunk identified by chunkID together with the caller-requested local context, in document order:
[parent (if IncludeParent and the chunk has one), neighbors before, the chunk itself, neighbors after]
The chunk itself is always included, so callers do not have to reconcile the original SearchHit with the expansion. Embeddings are never populated by ExpandContext — the API is for context retrieval, not for re-ranking against fresh vectors. Callers that need embeddings should use Sections() with IncludeEmbeddings = true.
Returns an empty slice + nil error when chunkID does not exist; the substrate treats "no such chunk" as an empty result rather than an error, matching the section-read APIs.
Against snapshots built before schema v5 (#16), the v5 lineage column is absent: IncludeParent becomes a no-op and NeighborWindow scopes by record_ref alone (no parent grouping). ExpandContext stays useful on legacy files; it just cannot surface lineage that was never recorded.
Internally ExpandContext issues a small bounded number of parameterized reads: at most one to locate the requested chunk, one to fetch the parent (when IncludeParent + parent_chunk_id present), and one range scan over the sibling window. There is no per-result parameter expansion (no `WHERE id IN (?, ?, ?, ...)`), so the query never approaches SQLite's parameter cap regardless of NeighborWindow.
func (*Snapshot) Search ¶
Search runs a hybrid text search (vector + FTS5) against the opened snapshot.
func (*Snapshot) SearchVector ¶
SearchVector runs a vector search against the opened snapshot.
type SnapshotSearchQuery ¶
type SnapshotSearchQuery struct {
SearchParams
}
SnapshotSearchQuery defines one text search against an opened snapshot. Retrieval parameters live on the embedded SearchParams so the same value can be forwarded verbatim from SearchQuery.SearchParams without hand-copying fields.
type Stats ¶
type Stats struct {
Path string
RecordCount int
ChunkCount int
KindCounts map[string]int
SchemaVersion string
EmbedderDimension int
EmbedderFingerprint string
ContentFingerprint string
CreatedAt string
}
Stats describes a built Stroma index.
type UpdateOptions ¶
type UpdateOptions struct {
// Path is the OS-native filesystem path to the existing snapshot
// to update in place. On Windows both forward and back slashes are
// accepted — the store package normalizes drive prefixes on open.
Path string
Embedder embed.Embedder
// Contextualizer optionally produces a per-chunk prefix string. See
// BuildOptions.Contextualizer for the contract. Leaving it nil
// preserves the non-contextual path and produces chunks with an
// empty persisted prefix.
Contextualizer ChunkContextualizer
// MaxChunkTokens sets the approximate maximum number of tokens (words)
// per chunk. It should match the chunking policy used to build the current
// index if callers want incremental updates to remain section-compatible.
MaxChunkTokens int
// ChunkOverlapTokens sets the approximate number of overlapping tokens
// between adjacent sub-sections when a section is split. It should match
// the chunking policy used to build the current index.
ChunkOverlapTokens int
// MaxChunkSections mirrors BuildOptions.MaxChunkSections for the
// incremental-update path. Zero → DefaultMaxChunkSections; negative
// → no cap.
MaxChunkSections int
// MaxPlannedRecords caps how many added/replaced records Update will
// chunk, reuse-plan, and embed before opening its write transaction.
// This bounds resident pre-transaction plan memory for callers that
// split large ingests into repeated Update calls. Zero keeps the
// historical unbounded behavior; negative values reject. The cap
// applies only to added/replaced records, not removals.
MaxPlannedRecords int
// Quantization, when provided, must match the existing index — see
// the store.Quantization* constants (float32, int8, binary) for the
// accept-listed values. Leaving it empty reuses the stored
// quantization metadata.
Quantization string
// ChunkPolicy mirrors BuildOptions.ChunkPolicy for the incremental
// update path. Nil defaults to chunk.MarkdownPolicy with the
// MaxChunkTokens / ChunkOverlapTokens / MaxChunkSections knobs
// resolved here. The substrate does not enforce that the policy
// matches the one used to build the snapshot — callers who switch
// policies between Build and Update should expect reuse cache
// misses on the affected sections (the leaves still re-embed
// correctly; the snapshot just won't share embeddings across
// rebuilds).
ChunkPolicy chunk.Policy
}
UpdateOptions controls how an existing Stroma index is updated in place.
type UpdateResult ¶
type UpdateResult struct {
Path string
UpsertedCount int
RemovedCount int
RecordCount int
ChunkCount int
ReusedRecordCount int
ReusedChunkCount int
EmbeddedChunkCount int
EmbedderDimension int
EmbedderFingerprint string
ContentFingerprint string
}
UpdateResult summarizes one incremental update.
func Update ¶
func Update(ctx context.Context, added []corpus.Record, removed []string, options UpdateOptions) (*UpdateResult, error)
Update applies add, replace, and remove operations to an existing Stroma index without rebuilding it from scratch.
type VectorSearchQuery ¶
type VectorSearchQuery struct {
// Embedding is the precomputed query vector. Empty rejects with
// a "search embedding is required" error — this field has no
// default.
Embedding []float64
// Limit caps the number of SearchHits returned. Zero or negative
// selects DefaultSearchLimit (10). Values above MaxSearchLimit
// reject with an error instead of being silently capped.
Limit int
// Kinds filters candidate records to the supplied kind list. Nil
// or empty means "no filter, all kinds".
Kinds []string
}
VectorSearchQuery defines one vector search against an opened snapshot.