Documentation ¶
Overview ¶
Package vecindex implements IVF (Inverted File Index) vector similarity search using local segment files, overlay journals, and shared metric/kmeans primitives.
Index ¶
- Constants
- Variables
- func DecodeCentroidBlob(data []byte) (*kmeans.CentroidSet, error)
- func DefaultSegmentBlockRows(encoding int64) int
- func EncodeCentroidBlob(cs *kmeans.CentroidSet) ([]byte, error)
- func EncodeSegmentCurrent(current *SegmentCurrent) ([]byte, error)
- func EncodeSegmentManifest(manifest *SegmentManifest) ([]byte, error)
- func EncodeStableMemberCodecBlob(codec *StableMemberCodec) ([]byte, error)
- func Float32ToBytes(v []float32) []byte
- func IsVecLocalTable(tableName string) bool
- func OverlayJournalPath(dir string) string
- func ScoreEncodedMember(spec IVFSpec, cs *kmeans.CentroidSet, query []float32, queryNorm2 float32, ...) (float32, error)
- func SegmentBlockMetaPath(dir string, generation uint64) string
- func SegmentBlockPath(dir string, generation uint64) string
- func SegmentCurrentPath(dir string) string
- func SegmentDataPath(dir string, generation uint64) string
- func SegmentManifestPath(dir string, generation uint64) string
- func SegmentRowMapPath(dir string, generation uint64) string
- func SegmentStoreDir(dbPath, indexName string) string
- func SegmentStoreV1Compat() uint32
- func SegmentStoreV2Compat() uint32
- func SegmentStoreV3Compat() uint32
- func StableMemberEncodingSpec(spec IVFSpec) (int64, int)
- func ValidateIndexName(name string) error
- type BulkEntry
- type Engine
- func (e *Engine) AssignNearest(indexName string, vec []byte) (int64, error)
- func (e *Engine) Detach(indexName string) (*IndexState, bool)
- func (e *Engine) Lookup(indexName string) (*IndexState, bool)
- func (e *Engine) LookupRef(indexName string) (*IndexState, func(), bool)
- func (e *Engine) Register(indexName string, state *IndexState)
- func (e *Engine) RegisterWithCentroidSet(indexName string, spec IVFSpec, cs *kmeans.CentroidSet) *IndexState
- func (e *Engine) TopNprobeClusters(indexName string, vec []byte, n int) ([]int64, error)
- func (e *Engine) TopNprobeClustersWithEpoch(indexName string, vec []byte, n int) ([]int64, uint64, error)
- func (e *Engine) Unregister(indexName string)
- type ForcePlan
- type IVFSpec
- type IndexState
- func (s *IndexState) Acquire() bool
- func (s *IndexState) AddRetireCallback(fn func())
- func (s *IndexState) AssignNearest(vecBytes []byte) (int64, error)
- func (s *IndexState) AssignNearestPrepared(vecBytes []byte) (int64, error)
- func (s *IndexState) ClearOverlay()
- func (s *IndexState) ClearSegmentStore()
- func (s *IndexState) HotClusterScores(limit int) map[int64]uint64
- func (s *IndexState) HotClusters(limit int) []int64
- func (s *IndexState) LoadMaintenanceState() *MaintenanceState
- func (s *IndexState) LoadOverlay() *JournaledOverlay
- func (s *IndexState) LoadSegmentStore() *SegmentGeneration
- func (s *IndexState) ProbeState() *kmeans.CentroidSet
- func (s *IndexState) ProbeVersion() uint64
- func (s *IndexState) RecordClusterHits(clusterIDs []int64)
- func (s *IndexState) RecordClusterMutation(oldCluster int64, oldVec []float32, newCluster int64, newVec []float32)
- func (s *IndexState) RecordRowsModified(delta uint64)
- func (s *IndexState) Release()
- func (s *IndexState) Retire()
- func (s *IndexState) Spec() IVFSpec
- func (s *IndexState) StoreMaintenanceState(ms *MaintenanceState)
- func (s *IndexState) StoreOverlay(overlay *JournaledOverlay)
- func (s *IndexState) StoreSegmentStore(generation *SegmentGeneration)
- func (s *IndexState) SwapProbeState(cs *kmeans.CentroidSet) *kmeans.CentroidSet
- func (s *IndexState) TopNprobeClusters(vecBytes []byte, n int) ([]int64, error)
- func (s *IndexState) TopNprobeClustersWithEpoch(vecBytes []byte, n int) ([]int64, uint64, error)
- type JournaledOverlay
- func (o *JournaledOverlay) ApplyCommittedBatch(mutations []OverlayMutation) error
- func (o *JournaledOverlay) Close() error
- func (o *JournaledOverlay) CompactAfter(minSequence uint64) error
- func (o *JournaledOverlay) Reset(epoch uint64) error
- func (o *JournaledOverlay) Rewrite(epoch uint64, minSequence uint64, mutations []OverlayMutation) error
- func (o *JournaledOverlay) Snapshot() *OverlaySnapshot
- type MaintenanceState
- func (m *MaintenanceState) Clone() *MaintenanceState
- func (m *MaintenanceState) LiveCentroids(base [][]float32) [][]float32
- func (m *MaintenanceState) LiveClusterRowCounts() []uint64
- func (m *MaintenanceState) LiveClusterVectorSums() [][]float32
- func (m *MaintenanceState) RecordClusterMutation(oldCluster int64, oldVec []float32, newCluster int64, newVec []float32)
- func (m *MaintenanceState) RecordRowsModified(delta uint64)
- func (m *MaintenanceState) ResetPending()
- func (m *MaintenanceState) RowCount() uint64
- func (m *MaintenanceState) SetConsecutiveSkewCycles(v uint32)
- func (m *MaintenanceState) Stats() (rowsModified uint64, lastRebuild uint64, skewCycles uint32)
- type Metric
- type OverlayBuffer
- type OverlayJournal
- func (j *OverlayJournal) AppendBatch(mutations []OverlayMutation) ([]overlayVecRef, error)
- func (j *OverlayJournal) Close() error
- func (j *OverlayJournal) Replay() (*OverlaySnapshot, error)
- func (j *OverlayJournal) Reset(epoch uint64) error
- func (j *OverlayJournal) Rewrite(epoch uint64, lastSequence uint64, mutations []OverlayMutation) ([]overlayVecRef, error)
- type OverlayMutation
- type OverlayMutationKind
- type OverlaySnapshot
- func (s *OverlaySnapshot) BacklogStats(minSequence uint64) (rows int, bytes int64, oldestUnixNano int64)
- func (s *OverlaySnapshot) Epoch() uint64
- func (s *OverlaySnapshot) HasTombstone(rowID int64) bool
- func (s *OverlaySnapshot) HasTombstoneAfter(rowID int64, minSequence uint64) bool
- func (s *OverlaySnapshot) LastSequence() uint64
- func (s *OverlaySnapshot) Len() int
- func (s *OverlaySnapshot) MutationsAfter(minSequence uint64) []OverlayMutation
- func (s *OverlaySnapshot) NewestUnixNanoAfter(minSequence uint64) int64
- func (s *OverlaySnapshot) ReadVec(rowID int64) ([]byte, error)
- func (s *OverlaySnapshot) RowCluster(rowID int64) (int64, bool)
- func (s *OverlaySnapshot) RowClusterAfter(rowID int64, minSequence uint64) (int64, bool)
- func (s *OverlaySnapshot) VisitAllAfter(minSequence uint64, visit func(clusterID, rowID int64, vec []byte) bool)
- func (s *OverlaySnapshot) VisitAllEncodedAfter(minSequence uint64, ...)
- func (s *OverlaySnapshot) VisitCluster(clusterID int64, visit func(rowID int64, vec []byte) bool)
- func (s *OverlaySnapshot) VisitClusterAfter(clusterID int64, minSequence uint64, visit func(rowID int64, vec []byte) bool)
- func (s *OverlaySnapshot) VisitClusterEncodedAfter(clusterID int64, minSequence uint64, ...)
- func (s *OverlaySnapshot) VisitClusterRange(clusterID int64, minSequence uint64, maxSequence uint64, ...)
- func (s *OverlaySnapshot) VisitMutationHeadersAfter(minSequence uint64, visit func(OverlayMutation) bool)
- func (s *OverlaySnapshot) VisitMutationHeadersAfterUnordered(minSequence uint64, visit func(OverlayMutation) bool)
- func (s *OverlaySnapshot) VisitMutationsAfter(minSequence uint64, visit func(OverlayMutation) bool)
- func (s *OverlaySnapshot) VisitTombstones(visit func(rowID int64) bool)
- func (s *OverlaySnapshot) VisitTombstonesAfter(minSequence uint64, visit func(rowID int64) bool)
- type OverlayVecEncoding
- type SegmentBlockMetaStore
- func (s *SegmentBlockMetaStore) BlockRows() int
- func (s *SegmentBlockMetaStore) Close() error
- func (s *SegmentBlockMetaStore) Dim() int
- func (s *SegmentBlockMetaStore) Encoding() int64
- func (s *SegmentBlockMetaStore) Epoch() uint64
- func (s *SegmentBlockMetaStore) Generation() uint64
- func (s *SegmentBlockMetaStore) InternalDim() int
- func (s *SegmentBlockMetaStore) MaxCluster() int
- func (s *SegmentBlockMetaStore) Metric() Metric
- func (s *SegmentBlockMetaStore) Path() string
- func (s *SegmentBlockMetaStore) ReadClusterBlocks(clusterIDs []int64) ([]SegmentBlockRecord, error)
- func (s *SegmentBlockMetaStore) RecordCount() uint64
- func (s *SegmentBlockMetaStore) RecordQueryStats(considered, wouldSkip, skipped, scored, rowsScored uint64)
- func (s *SegmentBlockMetaStore) ResetScanStats()
- func (s *SegmentBlockMetaStore) SnapshotScanStats() SegmentBlockScanStats
- func (s *SegmentBlockMetaStore) ValidateCoverage(data *SegmentDataStore) error
- type SegmentBlockMetaWriter
- func (w *SegmentBlockMetaWriter) Abort()
- func (w *SegmentBlockMetaWriter) Append(clusterID, rowID int64, dataOffset uint64, dataBytes int, encoded []byte) error
- func (w *SegmentBlockMetaWriter) AppendRawBlocks(clusterID int64, blocks []SegmentBlockRecord, offsetDelta int64) error
- func (w *SegmentBlockMetaWriter) Close() (*SegmentBlockMetaStore, error)
- type SegmentBlockRecord
- type SegmentBlockScanStats
- type SegmentCurrent
- type SegmentDataStore
- func (s *SegmentDataStore) Close() error
- func (s *SegmentDataStore) ClusterCount(clusterID int64) uint64
- func (s *SegmentDataStore) ClusterRowCounts() []uint64
- func (s *SegmentDataStore) ClusterSpan(clusterID int64) (offset int64, bytes int64, count uint64, ok bool)
- func (s *SegmentDataStore) CopyClusterTo(clusterID int64, w io.Writer) (uint64, error)
- func (s *SegmentDataStore) Dim() int
- func (s *SegmentDataStore) Encoding() int64
- func (s *SegmentDataStore) Epoch() uint64
- func (s *SegmentDataStore) FileOrderedClusters() []int64
- func (s *SegmentDataStore) FileSize() int64
- func (s *SegmentDataStore) Generation() uint64
- func (s *SegmentDataStore) InternalDim() int
- func (s *SegmentDataStore) MaxCluster() int
- func (s *SegmentDataStore) Metric() Metric
- func (s *SegmentDataStore) Path() string
- func (s *SegmentDataStore) ReadEntryAt(offset uint64) (int64, []byte, error)
- func (s *SegmentDataStore) ResetScanStats()
- func (s *SegmentDataStore) RowCount() uint64
- func (s *SegmentDataStore) ScanBlockRecordsFileOrder(blocks []SegmentBlockRecord, ...) error
- func (s *SegmentDataStore) ScanCluster(clusterID int64, yield func(rowid int64, vecBytes []byte) bool) error
- func (s *SegmentDataStore) ScanClusters(clusterIDs []int64, yield func(clusterID, rowid int64, vecBytes []byte) bool) error
- func (s *SegmentDataStore) ScanClustersFileOrder(clusterIDs []int64, yield func(clusterID, rowid int64, vecBytes []byte) bool) error
- func (s *SegmentDataStore) ScanClustersFileOrderSpans(clusterIDs []int64, ...) error
- func (s *SegmentDataStore) SnapshotScanStats() SegmentScanStats
- func (s *SegmentDataStore) VecBytes() int
- func (s *SegmentDataStore) WarmClusters(clusterIDs []int64, maxBytes int64) error
- type SegmentDataWriter
- func (w *SegmentDataWriter) Abort()
- func (w *SegmentDataWriter) Append(clusterID, rowid int64, vec []byte) error
- func (w *SegmentDataWriter) AppendRawCluster(clusterID int64, count uint64, rows io.Reader) error
- func (w *SegmentDataWriter) Close() (*SegmentDataStore, error)
- func (w *SegmentDataWriter) EntrySize() int
- func (w *SegmentDataWriter) NextOffset() uint64
- type SegmentGeneration
- type SegmentManifest
- type SegmentRowLocation
- type SegmentRowMap
- func (m *SegmentRowMap) Close() error
- func (m *SegmentRowMap) EntryCount() uint64
- func (m *SegmentRowMap) Epoch() uint64
- func (m *SegmentRowMap) FileSize() int64
- func (m *SegmentRowMap) Generation() uint64
- func (m *SegmentRowMap) Lookup(rowID int64) (SegmentRowLocation, bool, error)
- func (m *SegmentRowMap) Path() string
- func (m *SegmentRowMap) Scan(visit func(SegmentRowLocation) bool) error
- type SegmentRowMapWriter
- type SegmentScanStats
- type StableCodecTrainingVector
- type StableMemberCodec
- func BuildStableMemberCodec(spec IVFSpec, cs *kmeans.CentroidSet, training []StableCodecTrainingVector, ...) (*StableMemberCodec, error)
- func DecodeStableMemberCodecBlob(spec IVFSpec, cs *kmeans.CentroidSet, enc int64, blob []byte) (*StableMemberCodec, error)
- func NewStableMemberCodec(spec IVFSpec, cs *kmeans.CentroidSet, enc int64, pq *quantize.PQ8Codec) (*StableMemberCodec, error)
- func (c *StableMemberCodec) DecodePrepared(clusterID int64, vecBytes []byte) ([]float32, error)
- func (c *StableMemberCodec) Encode(clusterID int64, prepared []byte) (int64, []byte, error)
- func (c *StableMemberCodec) EncodedSize() int
- func (c *StableMemberCodec) Encoding() int64
- func (c *StableMemberCodec) Validate() error
- func (c *StableMemberCodec) WithCentroids(cs *kmeans.CentroidSet) (*StableMemberCodec, error)
- type StableMemberQueryScorer
- type StableMemberScorer
- type VecSessionVars
- type VectorUDFProvider
Constants ¶
const (
MemberEncodingRawPreparedF32 int64 = iota
MemberEncodingResidualInt8
MemberEncodingResidualPQ8
)
const (
// MetricL2 is squared Euclidean distance.
MetricL2 = metric.MetricL2
// MetricDot is negative inner product (higher dot = closer).
MetricDot = metric.MetricDot
// MetricCosine is cosine distance (1 - cosine similarity).
MetricCosine = metric.MetricCosine
)
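The three metric definitions above can be illustrated with a small standalone sketch. The function names below are hypothetical and are not part of this package or of the metric package; the code only mirrors the definitions given in the constant comments and assumes both inputs have the same length.

```go
package main

import (
	"fmt"
	"math"
)

// l2Squared returns the squared Euclidean distance between a and b.
func l2Squared(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// negDot returns the negative inner product, so a larger dot product
// produces a smaller (closer) distance.
func negDot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return -sum
}

// cosineDistance returns 1 - cosine similarity. Treating a zero vector as
// distance 1 (as if orthogonal) is an assumption of this sketch.
func cosineDistance(a, b []float32) float32 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 1
	}
	return float32(1 - dot/(math.Sqrt(na)*math.Sqrt(nb)))
}

func main() {
	a := []float32{1, 0}
	b := []float32{0, 1}
	fmt.Println(l2Squared(a, b), negDot(a, b), cosineDistance(a, b))
}
```

In all three cases a smaller value means a closer vector, which is why the inner product is negated.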
const MaxIndexNameLen = 48
MaxIndexNameLen is the maximum length of a user-supplied index name. It keeps the derived object names (table, index, trigger) well under SQLite's practical identifier limits.
const MaxNlist = 16384
MaxNlist is the maximum number of IVF clusters allowed.
const MemberResidualBlockSize = quantize.DefaultResidualBlockSize
const (
SegmentStoreVersion = 4
)
const StablePQMinInternalDim = 512
Variables ¶
var ErrNoCentroidsLoaded = errors.New("vecindex: no centroids loaded")
Functions ¶
func DecodeCentroidBlob ¶
func DecodeCentroidBlob(data []byte) (*kmeans.CentroidSet, error)
DecodeCentroidBlob deserialises a CentroidSet from a zstd-compressed msgpack blob.
func DefaultSegmentBlockRows ¶
func DefaultSegmentBlockRows(encoding int64) int
func EncodeCentroidBlob ¶
func EncodeCentroidBlob(cs *kmeans.CentroidSet) ([]byte, error)
EncodeCentroidBlob serialises a CentroidSet to a zstd-compressed msgpack blob. The active centroid blob is embedded in the local segment manifest.
func EncodeSegmentCurrent ¶
func EncodeSegmentCurrent(current *SegmentCurrent) ([]byte, error)
func EncodeSegmentManifest ¶
func EncodeSegmentManifest(manifest *SegmentManifest) ([]byte, error)
func EncodeStableMemberCodecBlob ¶
func EncodeStableMemberCodecBlob(codec *StableMemberCodec) ([]byte, error)
func Float32ToBytes ¶
func Float32ToBytes(v []float32) []byte
Float32ToBytes encodes a []float32 as little-endian bytes. Used during sampling for k-means to round-trip through SQL BLOB columns.
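A minimal sketch of this kind of encoding, using only the standard library, is shown below. The lower-case function name is hypothetical, and the package's real implementation may differ, for example in how it allocates the output.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// float32ToBytes writes each value as its IEEE 754 bit pattern in
// little-endian order, four bytes per element.
func float32ToBytes(v []float32) []byte {
	out := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(out[4*i:], math.Float32bits(f))
	}
	return out
}

func main() {
	// 1.0 is 0x3F800000 and -2.0 is 0xC0000000, stored least significant byte first.
	fmt.Printf("% x\n", float32ToBytes([]float32{1, -2}))
}
```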
func IsVecLocalTable ¶
func IsVecLocalTable(tableName string) bool
IsVecLocalTable reports whether tableName is a legacy local-only vector object. Used by the CDC applier to tolerate DROP races against pre-cutover SQLite artifacts without treating them as replicated user tables.
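func OverlayJournalPath ¶
func OverlayJournalPath(dir string) string
func ScoreEncodedMember ¶
func SegmentBlockMetaPath ¶
func SegmentBlockMetaPath(dir string, generation uint64) string
func SegmentBlockPath ¶
func SegmentBlockPath(dir string, generation uint64) string
func SegmentCurrentPath ¶
func SegmentCurrentPath(dir string) string
func SegmentDataPath ¶
func SegmentDataPath(dir string, generation uint64) string
func SegmentManifestPath ¶
func SegmentManifestPath(dir string, generation uint64) string
func SegmentRowMapPath ¶
func SegmentRowMapPath(dir string, generation uint64) string
func SegmentStoreDir ¶
func SegmentStoreDir(dbPath, indexName string) string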
func SegmentStoreV1Compat ¶
func SegmentStoreV1Compat() uint32
func SegmentStoreV2Compat ¶
func SegmentStoreV2Compat() uint32
func SegmentStoreV3Compat ¶
func SegmentStoreV3Compat() uint32
func ValidateIndexName ¶
func ValidateIndexName(name string) error
ValidateIndexName rejects names that would produce unsafe or ambiguous SQLite identifiers. Rules:
- non-empty, length <= MaxIndexNameLen
- first character is an ASCII letter
- remaining characters are ASCII letters, digits, or underscore
- must not begin with the `marmot` prefix, which is reserved for generated names such as `_marmot_vec_*` and `__marmot_vec_*`.
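The rules above can be sketched as a standalone validator. Everything here is illustrative: the function and constant names are hypothetical, the error messages are invented, and matching the reserved prefix case-insensitively is an assumption (chosen because SQLite identifiers are case-insensitive), not documented behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxIndexNameLen mirrors the documented MaxIndexNameLen value.
const maxIndexNameLen = 48

func isASCIILetter(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// validateIndexName applies the four listed rules in order.
func validateIndexName(name string) error {
	if name == "" {
		return errors.New("index name is empty")
	}
	if len(name) > maxIndexNameLen {
		return fmt.Errorf("index name is longer than %d bytes", maxIndexNameLen)
	}
	if !isASCIILetter(name[0]) {
		return errors.New("index name must start with an ASCII letter")
	}
	for i := 1; i < len(name); i++ {
		c := name[i]
		if !isASCIILetter(c) && !(c >= '0' && c <= '9') && c != '_' {
			return fmt.Errorf("invalid character %q at offset %d", c, i)
		}
	}
	// Assumption: the reserved prefix is matched without regard to case.
	if strings.HasPrefix(strings.ToLower(name), "marmot") {
		return errors.New("index name uses the reserved marmot prefix")
	}
	return nil
}

func main() {
	for _, n := range []string{"docs_v1", "1abc", "marmot_idx", "a-b", ""} {
		fmt.Printf("%q -> %v\n", n, validateIndexName(n))
	}
}
```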
Types ¶
type BulkEntry ¶
type BulkEntry struct {
// ExternalID is the caller-supplied identifier for the vector.
ExternalID []byte
// Vector holds the raw float32 values.
Vector []float32
}
BulkEntry is a single vector supplied during index creation.
type Engine ¶
type Engine struct {
// contains filtered or unexported fields
}
Engine manages in-memory state for all active vector indexes. It implements VectorUDFProvider so it can be installed as the global provider via db.SetVectorUDFProvider(engine).
All methods are safe for concurrent use.
func NewEngine ¶
func NewEngine() *Engine
NewEngine creates a new Engine with no registered indexes.
func (*Engine) AssignNearest ¶
func (e *Engine) AssignNearest(indexName string, vec []byte) (int64, error)
AssignNearest implements VectorUDFProvider. Returns the 1-based cluster ID for the nearest centroid (0 is reserved for delta rows). Returns MARMOT-VEC-013 if the index is unknown.
func (*Engine) Detach ¶
func (e *Engine) Detach(indexName string) (*IndexState, bool)
Detach removes an index from the lookup map without retiring its resources. Callers that do not restore the state must eventually call Retire.
func (*Engine) Lookup ¶
func (e *Engine) Lookup(indexName string) (*IndexState, bool)
Lookup returns the IndexState for indexName and whether it was found.
func (*Engine) LookupRef ¶
func (e *Engine) LookupRef(indexName string) (*IndexState, func(), bool)
LookupRef returns a serving reference to the current state. The caller must invoke the returned release function when it is done reading from file-backed segment or overlay resources.
func (*Engine) Register ¶
func (e *Engine) Register(indexName string, state *IndexState)
Register stores the IndexState for indexName in the engine, replacing any existing state for the same name.
func (*Engine) RegisterWithCentroidSet ¶
func (e *Engine) RegisterWithCentroidSet(indexName string, spec IVFSpec, cs *kmeans.CentroidSet) *IndexState
RegisterWithCentroidSet is a convenience wrapper that creates a new IndexState from spec+cs and registers it. Returns the created state.
func (*Engine) TopNprobeClusters ¶
func (e *Engine) TopNprobeClusters(indexName string, vec []byte, n int) ([]int64, error)
TopNprobeClusters implements VectorUDFProvider. Thin wrapper over TopNprobeClustersWithEpoch that discards the epoch.
func (*Engine) TopNprobeClustersWithEpoch ¶
func (e *Engine) TopNprobeClustersWithEpoch(indexName string, vec []byte, n int) ([]int64, uint64, error)
TopNprobeClustersWithEpoch returns the same cluster IDs as TopNprobeClusters, plus the probe-state epoch they were computed against. Used by the coordinator's cache path to detect when cluster IDs produced under an older probe would be indexed into a post-reindex cache at a different epoch.
func (*Engine) Unregister ¶
func (e *Engine) Unregister(indexName string)
Unregister removes the IndexState for indexName from the engine. It is a no-op if indexName is not registered.
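The Register, Lookup, Detach, and Unregister methods describe a name-to-state registry that is safe for concurrent use. The sketch below shows one common way to build such a registry with a read-write mutex. The types are hypothetical stand-ins, and the assumption that Unregister retires the removed state (while Detach does not) is inferred from the Detach description rather than stated by the package.

```go
package main

import (
	"fmt"
	"sync"
)

// indexState is a stand-in for the package's IndexState.
type indexState struct {
	retired bool
}

// registry maps index names to their state under a read-write lock.
type registry struct {
	mu      sync.RWMutex
	indexes map[string]*indexState
}

func newRegistry() *registry {
	return &registry{indexes: map[string]*indexState{}}
}

// Register stores st, replacing any existing state for name.
func (r *registry) Register(name string, st *indexState) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.indexes[name] = st
}

// Lookup returns the state for name and whether it was found.
func (r *registry) Lookup(name string) (*indexState, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	st, ok := r.indexes[name]
	return st, ok
}

// Detach removes name from the map but leaves the state untouched, so the
// caller can restore it later or retire it explicitly.
func (r *registry) Detach(name string) (*indexState, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	st, ok := r.indexes[name]
	delete(r.indexes, name)
	return st, ok
}

// Unregister removes name and retires the state (assumed behaviour).
// It is a no-op when name is not registered.
func (r *registry) Unregister(name string) {
	if st, ok := r.Detach(name); ok {
		st.retired = true
	}
}

func main() {
	r := newRegistry()
	st := &indexState{}
	r.Register("docs", st)
	detached, _ := r.Detach("docs")
	fmt.Println(detached == st, detached.retired)
	r.Register("docs", detached)
	r.Unregister("docs")
	fmt.Println(st.retired)
}
```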
type IVFSpec ¶
type IVFSpec struct {
// ID is the unique index identifier.
ID string
// Dim is the vector dimensionality of caller-supplied vectors.
Dim int
// Metric is the distance function to use.
Metric Metric
// Nlist is the number of IVF centroids (clusters).
Nlist int
// Nprobe is the number of clusters searched at query time.
Nprobe int
// Seed is the RNG seed used for k-means initialisation.
Seed uint64
// Epoch tracks the centroid generation; incremented on retrain.
Epoch uint64
// MaxNorm is the fixed upper bound on vector L2 norms used for MIPS→L2
// augmentation when Metric == MetricDot.
MaxNorm float32
}
IVFSpec describes the configuration for a single IVF vector index.
func (IVFSpec) InternalDim ¶
InternalDim returns the dimensionality of vectors as stored in the index. For MetricDot this is Dim+1 (one augmented dimension for MIPS→L2 reduction). For all other metrics it equals Dim.
func (IVFSpec) InternalMetric ¶
InternalMetric returns the distance metric used for centroid assignment. For MetricDot the MIPS→L2 reduction stores augmented vectors, so centroid assignment uses MetricL2. For all other metrics this equals Metric.
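The documentation states only that MetricDot adds one augmented dimension and uses MaxNorm as a fixed norm bound. A widely used reduction that fits this description appends sqrt(MaxNorm² - ‖x‖²) to each stored vector and 0 to each query, after which squared L2 distance equals ‖q‖² + MaxNorm² - 2·(q·x), so ranking by L2 matches ranking by inner product. The sketch below assumes that construction; the function names are hypothetical and the package's exact formula may differ.

```go
package main

import (
	"fmt"
	"math"
)

// augmentData appends sqrt(maxNorm^2 - ||x||^2) so that every stored vector
// has L2 norm exactly maxNorm. Vectors longer than maxNorm are rejected.
func augmentData(x []float32, maxNorm float32) ([]float32, error) {
	var norm2 float64
	for _, v := range x {
		norm2 += float64(v) * float64(v)
	}
	rem := float64(maxNorm)*float64(maxNorm) - norm2
	if rem < 0 {
		return nil, fmt.Errorf("vector norm exceeds maxNorm %v", maxNorm)
	}
	out := make([]float32, len(x)+1)
	copy(out, x)
	out[len(x)] = float32(math.Sqrt(rem))
	return out, nil
}

// augmentQuery appends a zero so the query has the same dimensionality
// as augmented data vectors.
func augmentQuery(q []float32) []float32 {
	out := make([]float32, len(q)+1)
	copy(out, q)
	return out
}

// l2Squared returns the squared Euclidean distance between a and b.
func l2Squared(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

func main() {
	q := augmentQuery([]float32{1, 0})
	high, _ := augmentData([]float32{3, 0}, 5) // inner product with q is 3
	low, _ := augmentData([]float32{0, 4}, 5)  // inner product with q is 0
	// The vector with the larger inner product ends up closer in L2.
	fmt.Println(l2Squared(q, high) < l2Squared(q, low))
}
```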
type IndexState ¶
type IndexState struct {
// contains filtered or unexported fields
}
IndexState holds the in-memory state for a single vector index.
probeState is the active routing centroid set used by query transpile and assignment. All pointers are stored as atomic.Pointer so readers are lock-free.
func NewIndexState ¶
func NewIndexState(spec IVFSpec, cs *kmeans.CentroidSet) *IndexState
NewIndexState creates an IndexState rooted at cs.
func (*IndexState) Acquire ¶
func (s *IndexState) Acquire() bool
Acquire pins the state for a serving reader. It returns false when the state has already been retired and its file-backed resources may be closing.
func (*IndexState) AddRetireCallback ¶
func (s *IndexState) AddRetireCallback(fn func())
AddRetireCallback registers fn to run after all readers have released and file-backed serving resources have been closed.
func (*IndexState) AssignNearest ¶
func (s *IndexState) AssignNearest(vecBytes []byte) (int64, error)
AssignNearest returns the 1-based cluster ID for the nearest centroid.
vecBytes must be a little-endian float32 BLOB of exactly spec.Dim*4 bytes for every metric. For Dot, the MIPS→L2 augmentation is applied here, so callers pass the unaugmented vector. Returns an error on dimension mismatch or missing centroids.
func (*IndexState) AssignNearestPrepared ¶
func (s *IndexState) AssignNearestPrepared(vecBytes []byte) (int64, error)
AssignNearestPrepared returns the 1-based cluster ID for the nearest centroid for a vector that is already in the internal search space.
vecBytes must be a little-endian float32 BLOB of exactly spec.InternalDim()*4 bytes. Unlike AssignNearest, dot-metric vectors are not augmented here.
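Both assignment methods require a BLOB whose length is exactly four bytes per dimension. A minimal sketch of that length check and decode step follows. The helper name is hypothetical; the dim argument would be spec.Dim for AssignNearest and spec.InternalDim() for AssignNearestPrepared.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// decodeVec checks that vecBytes holds exactly dim little-endian float32
// values and decodes them.
func decodeVec(vecBytes []byte, dim int) ([]float32, error) {
	if len(vecBytes) != dim*4 {
		return nil, fmt.Errorf("vector blob is %d bytes, want %d", len(vecBytes), dim*4)
	}
	out := make([]float32, dim)
	for i := range out {
		bits := binary.LittleEndian.Uint32(vecBytes[4*i:])
		out[i] = math.Float32frombits(bits)
	}
	return out, nil
}

func main() {
	blob := []byte{0x00, 0x00, 0x80, 0x3f, 0x00, 0x00, 0x00, 0xc0} // 1.0, -2.0
	vec, err := decodeVec(blob, 2)
	fmt.Println(vec, err)
	_, err = decodeVec(blob, 3) // wrong dimensionality
	fmt.Println(err)
}
```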
func (*IndexState) ClearOverlay ¶
func (s *IndexState) ClearOverlay()
func (*IndexState) ClearSegmentStore ¶
func (s *IndexState) ClearSegmentStore()
ClearSegmentStore drops the active stable on-disk snapshot.
func (*IndexState) HotClusterScores ¶
func (s *IndexState) HotClusterScores(limit int) map[int64]uint64
func (*IndexState) HotClusters ¶
func (s *IndexState) HotClusters(limit int) []int64
func (*IndexState) LoadMaintenanceState ¶
func (s *IndexState) LoadMaintenanceState() *MaintenanceState
func (*IndexState) LoadOverlay ¶
func (s *IndexState) LoadOverlay() *JournaledOverlay
func (*IndexState) LoadSegmentStore ¶
func (s *IndexState) LoadSegmentStore() *SegmentGeneration
LoadSegmentStore returns the active stable on-disk generation snapshot.
func (*IndexState) ProbeState ¶
func (s *IndexState) ProbeState() *kmeans.CentroidSet
ProbeState returns the current probe centroid set. Nil when no centroids are loaded (empty-table bootstrap). The returned CentroidSet is immutable.
func (*IndexState) ProbeVersion ¶
func (s *IndexState) ProbeVersion() uint64
ProbeVersion returns the epoch of the active probe centroid set. Returns 0 when no centroid set is loaded.
func (*IndexState) RecordClusterHits ¶
func (s *IndexState) RecordClusterHits(clusterIDs []int64)
func (*IndexState) RecordClusterMutation ¶
func (s *IndexState) RecordClusterMutation(oldCluster int64, oldVec []float32, newCluster int64, newVec []float32)
func (*IndexState) RecordRowsModified ¶
func (s *IndexState) RecordRowsModified(delta uint64)
func (*IndexState) Release ¶
func (s *IndexState) Release()
Release drops a serving reader pin acquired by Acquire.
func (*IndexState) Retire ¶
func (s *IndexState) Retire()
Retire prevents new serving readers from pinning this state and closes file-backed resources as soon as all existing readers have left.
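Acquire, Release, Retire, and AddRetireCallback together describe a reader-pinning lifecycle. The sketch below shows one way such a lifecycle can be built with a mutex and a reader count. It is not the package's implementation: the type is hypothetical, the actual resource closing is reduced to a flag, and running a callback immediately when it is registered after closing is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// pinnedState tracks serving readers and defers cleanup until the last
// reader leaves after retirement.
type pinnedState struct {
	mu        sync.Mutex
	readers   int
	retired   bool
	closed    bool
	callbacks []func()
}

// Acquire pins the state; it fails once the state has been retired.
func (s *pinnedState) Acquire() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.retired {
		return false
	}
	s.readers++
	return true
}

// Release drops one pin and finishes cleanup if it was the last one.
func (s *pinnedState) Release() {
	s.mu.Lock()
	s.readers--
	run := s.finishLocked()
	s.mu.Unlock()
	for _, fn := range run {
		fn()
	}
}

// Retire blocks new pins and finishes cleanup if no reader remains.
func (s *pinnedState) Retire() {
	s.mu.Lock()
	s.retired = true
	run := s.finishLocked()
	s.mu.Unlock()
	for _, fn := range run {
		fn()
	}
}

// AddRetireCallback queues fn to run after cleanup.
func (s *pinnedState) AddRetireCallback(fn func()) {
	s.mu.Lock()
	if s.closed {
		s.mu.Unlock()
		fn() // assumption: late registrations run immediately
		return
	}
	s.callbacks = append(s.callbacks, fn)
	s.mu.Unlock()
}

// finishLocked marks resources closed once the state is retired and
// unpinned, and returns the callbacks to run. The caller must hold mu.
func (s *pinnedState) finishLocked() []func() {
	if !s.retired || s.readers > 0 || s.closed {
		return nil
	}
	s.closed = true // a real implementation would close files here
	run := s.callbacks
	s.callbacks = nil
	return run
}

func main() {
	s := &pinnedState{}
	s.AddRetireCallback(func() { fmt.Println("resources closed") })
	fmt.Println("pinned:", s.Acquire())
	s.Retire()
	fmt.Println("pinned after retire:", s.Acquire())
	s.Release() // last reader leaves; the callback runs now
}
```

Callbacks run outside the lock so that a callback can safely call back into the state.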
func (*IndexState) Spec ¶
func (s *IndexState) Spec() IVFSpec
Spec returns the immutable IVF configuration for this index.
func (*IndexState) StoreMaintenanceState ¶
func (s *IndexState) StoreMaintenanceState(ms *MaintenanceState)
func (*IndexState) StoreOverlay ¶
func (s *IndexState) StoreOverlay(overlay *JournaledOverlay)
func (*IndexState) StoreSegmentStore ¶
func (s *IndexState) StoreSegmentStore(generation *SegmentGeneration)
StoreSegmentStore installs generation as the active stable on-disk snapshot.
func (*IndexState) SwapProbeState ¶
func (s *IndexState) SwapProbeState(cs *kmeans.CentroidSet) *kmeans.CentroidSet
SwapProbeState atomically replaces the probe centroid set and returns the previous one. Called at REINDEX commit (design §8.3 step 7 in-txn swap).
func (*IndexState) TopNprobeClusters ¶
func (s *IndexState) TopNprobeClusters(vecBytes []byte, n int) ([]int64, error)
TopNprobeClusters returns the top-n 1-based cluster IDs ordered by ascending distance to the query vector (closest first). Thin wrapper over TopNprobeClustersWithEpoch that discards the epoch.
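As an illustration of the ordering contract described above, the sketch below ranks centroids by squared L2 distance and returns the n closest as 1-based IDs. The function name is hypothetical, the brute-force full sort is a simplification, and the real method works on a byte-encoded query and the index's configured metric.

```go
package main

import (
	"fmt"
	"sort"
)

// topNClusters returns up to n 1-based cluster IDs, closest centroid first.
// It assumes every centroid has the same length as query.
func topNClusters(centroids [][]float32, query []float32, n int) []int64 {
	type scored struct {
		id   int64
		dist float32
	}
	all := make([]scored, len(centroids))
	for i, c := range centroids {
		var d float32
		for j := range c {
			diff := c[j] - query[j]
			d += diff * diff
		}
		all[i] = scored{id: int64(i + 1), dist: d} // IDs are 1-based
	}
	// A stable sort keeps the lower ID first when distances tie.
	sort.SliceStable(all, func(a, b int) bool { return all[a].dist < all[b].dist })
	if n > len(all) {
		n = len(all)
	}
	if n < 0 {
		n = 0
	}
	ids := make([]int64, n)
	for i := range ids {
		ids[i] = all[i].id
	}
	return ids
}

func main() {
	centroids := [][]float32{{0, 0}, {10, 10}, {1, 1}}
	fmt.Println(topNClusters(centroids, []float32{0.9, 0.9}, 2))
}
```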
func (*IndexState) TopNprobeClustersWithEpoch ¶
func (s *IndexState) TopNprobeClustersWithEpoch(vecBytes []byte, n int) ([]int64, uint64, error)
TopNprobeClustersWithEpoch returns the same cluster IDs as TopNprobeClusters, plus the probe-state epoch they were computed against. The epoch is read from the SAME probeState pointer used for the top-N computation, so callers can later gate cache reads on epoch equality and avoid indexing IDs computed under the old probe into a freshly-rebuilt cache (task #16 coherence).
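The coherence property described here depends on loading the probe pointer once and deriving both the IDs and the epoch from that single snapshot. The sketch below shows the pattern with sync/atomic. All types and names are hypothetical, and the search is reduced to a single nearest centroid to keep the example short.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// probeSet is a stand-in for an immutable centroid set with its epoch.
type probeSet struct {
	epoch     uint64
	centroids [][]float32
}

type indexState struct {
	probe atomic.Pointer[probeSet]
}

// nearestWithEpoch loads the probe pointer exactly once, so the returned
// ID and epoch always come from the same centroid set even if another
// goroutine swaps the pointer concurrently.
func (s *indexState) nearestWithEpoch(query []float32) (id int64, epoch uint64, ok bool) {
	p := s.probe.Load()
	if p == nil || len(p.centroids) == 0 {
		return 0, 0, false
	}
	best := 0
	var bestDist float32
	for i, c := range p.centroids {
		var d float32
		for j := range c {
			diff := c[j] - query[j]
			d += diff * diff
		}
		if i == 0 || d < bestDist {
			best, bestDist = i, d
		}
	}
	return int64(best + 1), p.epoch, true
}

// cacheUsable gates a cache read on epoch equality.
func cacheUsable(cacheEpoch, idsEpoch uint64) bool {
	return cacheEpoch == idsEpoch
}

func main() {
	s := &indexState{}
	s.probe.Store(&probeSet{epoch: 1, centroids: [][]float32{{0}, {10}}})
	id, epoch, _ := s.nearestWithEpoch([]float32{9})

	// A reindex swaps in a new centroid set; the old one is returned.
	old := s.probe.Swap(&probeSet{epoch: 2, centroids: [][]float32{{5}, {20}}})

	// IDs computed at epoch 1 must not be used against an epoch-2 cache.
	fmt.Println(id, epoch, old.epoch, cacheUsable(2, epoch))
}
```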
type JournaledOverlay ¶
type JournaledOverlay struct {
// contains filtered or unexported fields
}
JournaledOverlay couples a crash-safe local journal with an in-memory overlay snapshot. Callers append committed batches here and readers consume immutable snapshots.
func OpenJournaledOverlay ¶
func OpenJournaledOverlay(path string) (*JournaledOverlay, error)
OpenJournaledOverlay opens path, replays the journal, and returns the live in-memory overlay view.
func OpenJournaledOverlayForRewrite ¶
func OpenJournaledOverlayForRewrite(path string) (*JournaledOverlay, error)
OpenJournaledOverlayForRewrite opens the journal file without replaying the current overlay state. It is intended for callers that already have the mutations they want to persist and will immediately call Rewrite.
func (*JournaledOverlay) ApplyCommittedBatch ¶
func (o *JournaledOverlay) ApplyCommittedBatch(mutations []OverlayMutation) error
ApplyCommittedBatch validates, journals, fsyncs, and publishes a committed overlay batch.
func (*JournaledOverlay) Close ¶
func (o *JournaledOverlay) Close() error
Close closes the underlying journal handle.
func (*JournaledOverlay) CompactAfter ¶
func (o *JournaledOverlay) CompactAfter(minSequence uint64) error
func (*JournaledOverlay) Reset ¶
func (o *JournaledOverlay) Reset(epoch uint64) error
Reset clears both the journal file and the in-memory overlay view.
func (*JournaledOverlay) Rewrite ¶
func (o *JournaledOverlay) Rewrite(epoch uint64, minSequence uint64, mutations []OverlayMutation) error
func (*JournaledOverlay) Snapshot ¶
func (o *JournaledOverlay) Snapshot() *OverlaySnapshot
Snapshot returns the current immutable in-memory overlay view.
type MaintenanceState ¶
type MaintenanceState struct {
ClusterRowCounts []uint64
ClusterVectorSums [][]float32
PendingClusterRowDelta []int64
PendingClusterVectorSums [][]float32
RowsModifiedSinceRebuild uint64
LastRebuildRowCount uint64
ConsecutiveSkewCycles uint32
// contains filtered or unexported fields
}
func (*MaintenanceState) Clone ¶
func (m *MaintenanceState) Clone() *MaintenanceState
func (*MaintenanceState) LiveCentroids ¶
func (m *MaintenanceState) LiveCentroids(base [][]float32) [][]float32
func (*MaintenanceState) LiveClusterRowCounts ¶
func (m *MaintenanceState) LiveClusterRowCounts() []uint64
func (*MaintenanceState) LiveClusterVectorSums ¶
func (m *MaintenanceState) LiveClusterVectorSums() [][]float32
func (*MaintenanceState) RecordClusterMutation ¶
func (m *MaintenanceState) RecordClusterMutation(oldCluster int64, oldVec []float32, newCluster int64, newVec []float32)
func (*MaintenanceState) RecordRowsModified ¶
func (m *MaintenanceState) RecordRowsModified(delta uint64)
func (*MaintenanceState) ResetPending ¶
func (m *MaintenanceState) ResetPending()
func (*MaintenanceState) RowCount ¶
func (m *MaintenanceState) RowCount() uint64
func (*MaintenanceState) SetConsecutiveSkewCycles ¶
func (m *MaintenanceState) SetConsecutiveSkewCycles(v uint32)
type Metric ¶
Metric identifies the distance function used for vector comparisons. It is an alias for metric.Metric so callers can use either import path.
type OverlayBuffer ¶
type OverlayBuffer struct {
// contains filtered or unexported fields
}
OverlayBuffer stores immutable overlay snapshots behind an atomic pointer so readers can take stable snapshots without locking while writers publish whole-snapshot updates.
func NewOverlayBuffer ¶
func NewOverlayBuffer() *OverlayBuffer
NewOverlayBuffer returns an empty overlay buffer.
func (*OverlayBuffer) ApplyBatch ¶
func (b *OverlayBuffer) ApplyBatch(mutations []OverlayMutation) error
ApplyBatch applies mutations copy-on-write and publishes the resulting snapshot atomically.
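The copy-on-write publishing described for OverlayBuffer can be sketched with an atomic pointer to an immutable snapshot. The types below are hypothetical and model only upserts; the writer mutex is an assumption about how concurrent writers would be serialised.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// snapshot is never mutated after it has been published.
type snapshot struct {
	epoch uint64
	rows  map[int64][]byte // rowID -> vector bytes
}

type buffer struct {
	writeMu sync.Mutex // serialises writers; readers never take it
	cur     atomic.Pointer[snapshot]
}

func newBuffer() *buffer {
	b := &buffer{}
	b.cur.Store(&snapshot{rows: map[int64][]byte{}})
	return b
}

// Snapshot returns the current immutable view without locking.
func (b *buffer) Snapshot() *snapshot {
	return b.cur.Load()
}

// ApplyBatch copies the current rows, applies the upserts to the copy, and
// publishes the copy with a single atomic store.
func (b *buffer) ApplyBatch(upserts map[int64][]byte) {
	b.writeMu.Lock()
	defer b.writeMu.Unlock()
	old := b.cur.Load()
	next := &snapshot{
		epoch: old.epoch,
		rows:  make(map[int64][]byte, len(old.rows)+len(upserts)),
	}
	for id, v := range old.rows {
		next.rows[id] = v
	}
	for id, v := range upserts {
		next.rows[id] = append([]byte(nil), v...) // defensive copy
	}
	b.cur.Store(next)
}

func main() {
	b := newBuffer()
	before := b.Snapshot()
	b.ApplyBatch(map[int64][]byte{7: {1, 2, 3, 4}})
	// The earlier snapshot is unaffected by the later batch.
	fmt.Println(len(before.rows), len(b.Snapshot().rows))
}
```

Copying the whole map on every batch costs time proportional to the overlay size, which is the usual trade for lock-free reads.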
func (*OverlayBuffer) Reset ¶
func (b *OverlayBuffer) Reset(epoch uint64)
Reset clears the overlay and publishes an empty snapshot tied to epoch.
func (*OverlayBuffer) Snapshot ¶
func (b *OverlayBuffer) Snapshot() *OverlaySnapshot
Snapshot returns the current immutable overlay view.
func (*OverlayBuffer) StoreSnapshot ¶
func (b *OverlayBuffer) StoreSnapshot(snapshot *OverlaySnapshot)
StoreSnapshot replaces the current overlay view.
type OverlayJournal ¶
type OverlayJournal struct {
// contains filtered or unexported fields
}
func OpenOverlayJournal ¶
func OpenOverlayJournal(path string) (*OverlayJournal, error)
OpenOverlayJournal opens or creates a crash-safe append-only overlay journal. A truncated tail left by a torn write is discarded automatically.
func OpenOverlayJournalForRewrite ¶
func OpenOverlayJournalForRewrite(path string) (*OverlayJournal, error)
OpenOverlayJournalForRewrite opens or creates the journal file without replaying its contents. Use this only for callers that will immediately replace the on-disk contents via Rewrite/Reset.
func (*OverlayJournal) AppendBatch ¶
func (j *OverlayJournal) AppendBatch(mutations []OverlayMutation) ([]overlayVecRef, error)
AppendBatch appends committed overlay mutations atomically and fsyncs once per batch so replay after a crash sees either the full batch or none of the torn tail.
func (*OverlayJournal) Close ¶
func (j *OverlayJournal) Close() error
Close closes the underlying file handle.
func (*OverlayJournal) Replay ¶
func (j *OverlayJournal) Replay() (*OverlaySnapshot, error)
Replay rebuilds the overlay snapshot from the journal contents.
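The journal descriptions above mention discarding a torn tail and replaying either a whole batch or none of it. One common way to get that property is to frame each batch with a length and checksum and to stop replay at the first incomplete or corrupt frame. The sketch below demonstrates the idea on an in-memory byte slice. The frame layout is an assumption for illustration only; it is not the package's on-disk format, and a real journal would also write to a file and fsync.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// appendRecord frames payload as [uint32 length][uint32 CRC32][payload],
// all little-endian, and appends it to journal.
func appendRecord(journal, payload []byte) []byte {
	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(hdr[4:8], crc32.ChecksumIEEE(payload))
	journal = append(journal, hdr[:]...)
	return append(journal, payload...)
}

// replay returns every intact record and the length of the valid prefix.
// It stops at the first truncated or corrupt frame, so a torn tail is
// ignored and could be truncated away at validLen.
func replay(journal []byte) (records [][]byte, validLen int) {
	off := 0
	for len(journal)-off >= 8 {
		n := binary.LittleEndian.Uint32(journal[off : off+4])
		sum := binary.LittleEndian.Uint32(journal[off+4 : off+8])
		remaining := len(journal) - off - 8
		if uint64(n) > uint64(remaining) {
			break // payload was cut short by a torn write
		}
		payload := journal[off+8 : off+8+int(n)]
		if crc32.ChecksumIEEE(payload) != sum {
			break // payload bytes do not match the recorded checksum
		}
		records = append(records, payload)
		off += 8 + int(n)
	}
	return records, off
}

func main() {
	j := appendRecord(nil, []byte("batch-1"))
	j = appendRecord(j, []byte("batch-2"))
	torn := j[:len(j)-3] // simulate a crash in the middle of the second write
	recs, valid := replay(torn)
	fmt.Println(len(recs), string(recs[0]), valid)
}
```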
func (*OverlayJournal) Reset ¶
func (j *OverlayJournal) Reset(epoch uint64) error
Reset compacts the journal to an empty state for epoch.
func (*OverlayJournal) Rewrite ¶
func (j *OverlayJournal) Rewrite(epoch uint64, lastSequence uint64, mutations []OverlayMutation) ([]overlayVecRef, error)
type OverlayMutation ¶
type OverlayMutation struct {
Kind OverlayMutationKind
Epoch uint64
Sequence uint64
ClusterID int64
RowID int64
AppliedAtUnixNano int64
CommitTxnID uint64
CommitSeqNum uint64
VecEncoding OverlayVecEncoding
Vec []byte
}
OverlayMutation is one committed local overlay change.
Sequence is a caller-provided monotonically increasing local watermark. Epoch tracks the centroid generation the assignment was computed against.
type OverlayMutationKind ¶
type OverlayMutationKind uint8
const (
OverlayMutationUpsert OverlayMutationKind = 1
OverlayMutationReplace OverlayMutationKind = 2
OverlayMutationDelete OverlayMutationKind = 3
)
type OverlaySnapshot ¶
type OverlaySnapshot struct {
// contains filtered or unexported fields
}
OverlaySnapshot is an immutable in-memory view of the local overlay state. It mirrors the serving overlay semantics used during the vector-index cutover: replacements and deletes tombstone the stable view, while upserts only contribute overlay rows.
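The tombstone semantics can be sketched as a fold over mutations. The types below are hypothetical and track only the cluster assignment per row. Two details are assumptions rather than documented behaviour: that a replace also contributes the new overlay row, and that a delete removes any existing overlay row for the same rowID.

```go
package main

import "fmt"

type kind uint8

// Values mirror the documented OverlayMutationKind constants.
const (
	upsert  kind = 1
	replace kind = 2
	del     kind = 3
)

type mutation struct {
	kind      kind
	rowID     int64
	clusterID int64
}

// view holds overlay rows plus the set of rowIDs whose stable copy is masked.
type view struct {
	rows       map[int64]int64 // rowID -> overlay cluster
	tombstones map[int64]bool
}

// apply folds mutations, in order, into a fresh view.
func apply(muts []mutation) view {
	v := view{rows: map[int64]int64{}, tombstones: map[int64]bool{}}
	for _, m := range muts {
		switch m.kind {
		case upsert:
			// Upserts only contribute an overlay row.
			v.rows[m.rowID] = m.clusterID
		case replace:
			// Replacements mask the stable copy and supply the new row (assumed).
			v.tombstones[m.rowID] = true
			v.rows[m.rowID] = m.clusterID
		case del:
			// Deletes mask the stable copy and drop any overlay row (assumed).
			v.tombstones[m.rowID] = true
			delete(v.rows, m.rowID)
		}
	}
	return v
}

func main() {
	v := apply([]mutation{
		{kind: upsert, rowID: 1, clusterID: 2},
		{kind: replace, rowID: 2, clusterID: 3},
		{kind: del, rowID: 3},
	})
	fmt.Println(len(v.rows), v.tombstones[1], v.tombstones[2], v.tombstones[3])
}
```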
func (*OverlaySnapshot) BacklogStats ¶
func (s *OverlaySnapshot) BacklogStats(minSequence uint64) (rows int, bytes int64, oldestUnixNano int64)
func (*OverlaySnapshot) Epoch ¶
func (s *OverlaySnapshot) Epoch() uint64
Epoch returns the centroid generation the snapshot is tied to.
func (*OverlaySnapshot) HasTombstone ¶
func (s *OverlaySnapshot) HasTombstone(rowID int64) bool
HasTombstone reports whether rowID masks a stable-row copy.
func (*OverlaySnapshot) HasTombstoneAfter ¶
func (s *OverlaySnapshot) HasTombstoneAfter(rowID int64, minSequence uint64) bool
func (*OverlaySnapshot) LastSequence ¶
func (s *OverlaySnapshot) LastSequence() uint64
LastSequence returns the highest applied local journal sequence.
func (*OverlaySnapshot) Len ¶
func (s *OverlaySnapshot) Len() int
Len returns the number of overlay rows currently present.
func (*OverlaySnapshot) MutationsAfter ¶
func (s *OverlaySnapshot) MutationsAfter(minSequence uint64) []OverlayMutation
func (*OverlaySnapshot) NewestUnixNanoAfter ¶
func (s *OverlaySnapshot) NewestUnixNanoAfter(minSequence uint64) int64
func (*OverlaySnapshot) ReadVec ¶
func (s *OverlaySnapshot) ReadVec(rowID int64) ([]byte, error)
ReadVec returns the current overlay vector bytes for rowID.
func (*OverlaySnapshot) RowCluster ¶
func (s *OverlaySnapshot) RowCluster(rowID int64) (int64, bool)
RowCluster returns the overlay cluster for rowID, if present.
func (*OverlaySnapshot) RowClusterAfter ¶
func (s *OverlaySnapshot) RowClusterAfter(rowID int64, minSequence uint64) (int64, bool)
func (*OverlaySnapshot) VisitAllAfter ¶
func (s *OverlaySnapshot) VisitAllAfter(minSequence uint64, visit func(clusterID, rowID int64, vec []byte) bool)
func (*OverlaySnapshot) VisitAllEncodedAfter ¶
func (s *OverlaySnapshot) VisitAllEncodedAfter(minSequence uint64, visit func(clusterID, rowID int64, encoding OverlayVecEncoding, vec []byte) bool)
func (*OverlaySnapshot) VisitCluster ¶
func (s *OverlaySnapshot) VisitCluster(clusterID int64, visit func(rowID int64, vec []byte) bool)
VisitCluster visits every overlay row in clusterID.
func (*OverlaySnapshot) VisitClusterAfter ¶
func (*OverlaySnapshot) VisitClusterEncodedAfter ¶
func (s *OverlaySnapshot) VisitClusterEncodedAfter(clusterID int64, minSequence uint64, visit func(rowID int64, encoding OverlayVecEncoding, vec []byte) bool)
func (*OverlaySnapshot) VisitClusterRange ¶
func (*OverlaySnapshot) VisitMutationHeadersAfter ¶
func (s *OverlaySnapshot) VisitMutationHeadersAfter(minSequence uint64, visit func(OverlayMutation) bool)
func (*OverlaySnapshot) VisitMutationHeadersAfterUnordered ¶
func (s *OverlaySnapshot) VisitMutationHeadersAfterUnordered(minSequence uint64, visit func(OverlayMutation) bool)
VisitMutationHeadersAfterUnordered visits mutation headers without building a sorted intermediate slice. Use it for maintenance accounting where sequence order is not required.
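The trade-off between ordered and unordered visiting can be sketched with a stand-in store. Everything here is hypothetical (`header`, `visitUnordered`, `visitOrdered`, `backlog`), and treating minSequence as an exclusive lower bound is an assumption; the source does not state whether the bound is inclusive.

```go
package main

import (
	"fmt"
	"sort"
)

// header is a hypothetical stand-in for the header part of a mutation.
type header struct {
	Sequence uint64
	RowID    int64
	VecLen   int
}

// visitUnordered walks the map directly, so no intermediate slice is built
// and the visit order is unspecified.
func visitUnordered(m map[int64]header, minSequence uint64, visit func(header) bool) {
	for _, h := range m {
		if h.Sequence <= minSequence { // ASSUMPTION: exclusive bound
			continue
		}
		if !visit(h) {
			return
		}
	}
}

// visitOrdered collects matching headers, sorts them by sequence, then visits.
func visitOrdered(m map[int64]header, minSequence uint64, visit func(header) bool) {
	hs := make([]header, 0, len(m))
	for _, h := range m {
		if h.Sequence > minSequence {
			hs = append(hs, h)
		}
	}
	sort.Slice(hs, func(i, j int) bool { return hs[i].Sequence < hs[j].Sequence })
	for _, h := range hs {
		if !visit(h) {
			return
		}
	}
}

// backlog is order-independent accounting, so the unordered walk suffices.
func backlog(m map[int64]header, minSequence uint64) (rows int, size int64) {
	visitUnordered(m, minSequence, func(h header) bool {
		rows++
		size += int64(h.VecLen)
		return true
	})
	return rows, size
}

func main() {
	m := map[int64]header{
		10: {Sequence: 1, RowID: 10, VecLen: 16},
		11: {Sequence: 3, RowID: 11, VecLen: 16},
		12: {Sequence: 2, RowID: 12, VecLen: 16},
	}
	rows, size := backlog(m, 1)
	fmt.Println(rows, size)
	visitOrdered(m, 1, func(h header) bool {
		fmt.Println(h.Sequence, h.RowID)
		return true
	})
}
```

Totals such as row counts and byte sums do not depend on visit order, which is why the unordered walk is enough for that kind of accounting; replaying mutations needs the sorted form.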
func (*OverlaySnapshot) VisitMutationsAfter ¶
func (s *OverlaySnapshot) VisitMutationsAfter(minSequence uint64, visit func(OverlayMutation) bool)
func (*OverlaySnapshot) VisitTombstones ¶
func (s *OverlaySnapshot) VisitTombstones(visit func(rowID int64) bool)
VisitTombstones visits every tombstoned rowID.
func (*OverlaySnapshot) VisitTombstonesAfter ¶
func (s *OverlaySnapshot) VisitTombstonesAfter(minSequence uint64, visit func(rowID int64) bool)
type OverlayVecEncoding ¶
type OverlayVecEncoding uint8
const (
	OverlayPreparedF32  OverlayVecEncoding = 0
	OverlayResidualInt8 OverlayVecEncoding = 1
)
type SegmentBlockMetaStore ¶
type SegmentBlockMetaStore struct {
// contains filtered or unexported fields
}
func OpenSegmentBlockMetaStore ¶
func OpenSegmentBlockMetaStore(path string) (*SegmentBlockMetaStore, error)
func (*SegmentBlockMetaStore) BlockRows ¶
func (s *SegmentBlockMetaStore) BlockRows() int
func (*SegmentBlockMetaStore) Close ¶
func (s *SegmentBlockMetaStore) Close() error
func (*SegmentBlockMetaStore) Dim ¶
func (s *SegmentBlockMetaStore) Dim() int
func (*SegmentBlockMetaStore) Encoding ¶
func (s *SegmentBlockMetaStore) Encoding() int64
func (*SegmentBlockMetaStore) Epoch ¶
func (s *SegmentBlockMetaStore) Epoch() uint64
func (*SegmentBlockMetaStore) Generation ¶
func (s *SegmentBlockMetaStore) Generation() uint64
func (*SegmentBlockMetaStore) InternalDim ¶
func (s *SegmentBlockMetaStore) InternalDim() int
func (*SegmentBlockMetaStore) MaxCluster ¶
func (s *SegmentBlockMetaStore) MaxCluster() int
func (*SegmentBlockMetaStore) Metric ¶
func (s *SegmentBlockMetaStore) Metric() Metric
func (*SegmentBlockMetaStore) Path ¶
func (s *SegmentBlockMetaStore) Path() string
func (*SegmentBlockMetaStore) ReadClusterBlocks ¶
func (s *SegmentBlockMetaStore) ReadClusterBlocks(clusterIDs []int64) ([]SegmentBlockRecord, error)
func (*SegmentBlockMetaStore) RecordCount ¶
func (s *SegmentBlockMetaStore) RecordCount() uint64
func (*SegmentBlockMetaStore) RecordQueryStats ¶
func (s *SegmentBlockMetaStore) RecordQueryStats(considered, wouldSkip, skipped, scored, rowsScored uint64)
func (*SegmentBlockMetaStore) ResetScanStats ¶
func (s *SegmentBlockMetaStore) ResetScanStats()
func (*SegmentBlockMetaStore) SnapshotScanStats ¶
func (s *SegmentBlockMetaStore) SnapshotScanStats() SegmentBlockScanStats
func (*SegmentBlockMetaStore) ValidateCoverage ¶
func (s *SegmentBlockMetaStore) ValidateCoverage(data *SegmentDataStore) error
type SegmentBlockMetaWriter ¶
type SegmentBlockMetaWriter struct {
// contains filtered or unexported fields
}
func CreateSegmentBlockMetaWriter ¶
func CreateSegmentBlockMetaWriter(path string, spec IVFSpec, codec *StableMemberCodec, blockRows, maxCluster int, epoch, generation uint64) (*SegmentBlockMetaWriter, error)
func (*SegmentBlockMetaWriter) Abort ¶
func (w *SegmentBlockMetaWriter) Abort()
func (*SegmentBlockMetaWriter) AppendRawBlocks ¶
func (w *SegmentBlockMetaWriter) AppendRawBlocks(clusterID int64, blocks []SegmentBlockRecord, offsetDelta int64) error
func (*SegmentBlockMetaWriter) Close ¶
func (w *SegmentBlockMetaWriter) Close() (*SegmentBlockMetaStore, error)
type SegmentBlockRecord ¶
type SegmentBlockScanStats ¶
type SegmentCurrent ¶
type SegmentCurrent struct {
Version uint32 `msgpack:"version"`
Generation uint64 `msgpack:"generation"`
ManifestFile string `msgpack:"manifest_file"`
}
func DecodeSegmentCurrent ¶
func DecodeSegmentCurrent(data []byte) (*SegmentCurrent, error)
type SegmentDataStore ¶
type SegmentDataStore struct {
// contains filtered or unexported fields
}
SegmentDataStore is a read-only stable-vector generation file opened with explicit ReadAt-based scans. Stable rows are laid out as [rowid uint64][vec bytes] in cluster_id,rowid order.
vecBytes yielded during ScanCluster/ScanClusters is only valid for the callback duration; callers that retain it must copy it.
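The entry layout and the copy-before-retain rule can be sketched as follows. This is an illustration over an in-memory buffer, not the package's scan code: `encodeEntry`, `decodeEntry`, `scanEntries`, and `collect` are hypothetical helpers, and little-endian byte order for the rowid is an assumption, since the source does not state it.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeEntry builds one [rowid uint64][vec bytes] entry.
// ASSUMPTION: the rowid is little-endian.
func encodeEntry(rowID int64, vec []byte) []byte {
	out := make([]byte, 8+len(vec))
	binary.LittleEndian.PutUint64(out[:8], uint64(rowID))
	copy(out[8:], vec)
	return out
}

// decodeEntry splits one entry into its rowid and vector bytes.
// The returned vec aliases the input slice.
func decodeEntry(entry []byte) (int64, []byte, error) {
	if len(entry) < 8 {
		return 0, nil, fmt.Errorf("entry too short: %d bytes", len(entry))
	}
	return int64(binary.LittleEndian.Uint64(entry[:8])), entry[8:], nil
}

// scanEntries walks back-to-back fixed-size entries through one reused
// scratch buffer, so vec is only valid while visit is running.
func scanEntries(rows []byte, entrySize int, visit func(rowID int64, vec []byte) bool) error {
	if entrySize <= 8 || len(rows)%entrySize != 0 {
		return fmt.Errorf("bad entry size %d for %d bytes", entrySize, len(rows))
	}
	scratch := make([]byte, entrySize)
	for off := 0; off < len(rows); off += entrySize {
		copy(scratch, rows[off:off+entrySize])
		rowID, vec, err := decodeEntry(scratch)
		if err != nil {
			return err
		}
		if !visit(rowID, vec) {
			return nil
		}
	}
	return nil
}

// collect retains every vector, so it copies each one inside the callback.
func collect(rows []byte, entrySize int) (map[int64][]byte, error) {
	kept := map[int64][]byte{}
	err := scanEntries(rows, entrySize, func(rowID int64, vec []byte) bool {
		kept[rowID] = append([]byte(nil), vec...) // copy before retaining
		return true
	})
	return kept, err
}

func main() {
	rows := append(encodeEntry(7, []byte{1, 2, 3, 4}), encodeEntry(9, []byte{5, 6, 7, 8})...)
	kept, err := collect(rows, 12)
	fmt.Println(kept[7], kept[9], err)
}
```

Storing `vec` directly in `kept` instead of a copy would leave every map value pointing at the same scratch buffer, which holds only the last entry scanned.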
func OpenSegmentDataStore ¶
func OpenSegmentDataStore(path string) (*SegmentDataStore, error)
func (*SegmentDataStore) Close ¶
func (s *SegmentDataStore) Close() error
func (*SegmentDataStore) ClusterCount ¶
func (s *SegmentDataStore) ClusterCount(clusterID int64) uint64
func (*SegmentDataStore) ClusterRowCounts ¶
func (s *SegmentDataStore) ClusterRowCounts() []uint64
func (*SegmentDataStore) ClusterSpan ¶
func (*SegmentDataStore) CopyClusterTo ¶
func (*SegmentDataStore) Dim ¶
func (s *SegmentDataStore) Dim() int
func (*SegmentDataStore) Encoding ¶
func (s *SegmentDataStore) Encoding() int64
func (*SegmentDataStore) Epoch ¶
func (s *SegmentDataStore) Epoch() uint64
func (*SegmentDataStore) FileOrderedClusters ¶
func (s *SegmentDataStore) FileOrderedClusters() []int64
func (*SegmentDataStore) FileSize ¶
func (s *SegmentDataStore) FileSize() int64
func (*SegmentDataStore) Generation ¶
func (s *SegmentDataStore) Generation() uint64
func (*SegmentDataStore) InternalDim ¶
func (s *SegmentDataStore) InternalDim() int
func (*SegmentDataStore) MaxCluster ¶
func (s *SegmentDataStore) MaxCluster() int
func (*SegmentDataStore) Metric ¶
func (s *SegmentDataStore) Metric() Metric
func (*SegmentDataStore) Path ¶
func (s *SegmentDataStore) Path() string
func (*SegmentDataStore) ReadEntryAt ¶
func (s *SegmentDataStore) ReadEntryAt(offset uint64) (int64, []byte, error)
func (*SegmentDataStore) ResetScanStats ¶
func (s *SegmentDataStore) ResetScanStats()
func (*SegmentDataStore) RowCount ¶
func (s *SegmentDataStore) RowCount() uint64
func (*SegmentDataStore) ScanBlockRecordsFileOrder ¶
func (s *SegmentDataStore) ScanBlockRecordsFileOrder(blocks []SegmentBlockRecord, yield func(clusterID int64, rows []byte, count uint64, entrySize int) bool) error
func (*SegmentDataStore) ScanCluster ¶
func (*SegmentDataStore) ScanClusters ¶
func (*SegmentDataStore) ScanClustersFileOrder ¶
func (*SegmentDataStore) ScanClustersFileOrderSpans ¶
func (*SegmentDataStore) SnapshotScanStats ¶
func (s *SegmentDataStore) SnapshotScanStats() SegmentScanStats
func (*SegmentDataStore) VecBytes ¶
func (s *SegmentDataStore) VecBytes() int
func (*SegmentDataStore) WarmClusters ¶
func (s *SegmentDataStore) WarmClusters(clusterIDs []int64, maxBytes int64) error
type SegmentDataWriter ¶
type SegmentDataWriter struct {
// contains filtered or unexported fields
}
func CreateSegmentDataWriter ¶
func (*SegmentDataWriter) Abort ¶
func (w *SegmentDataWriter) Abort()
func (*SegmentDataWriter) Append ¶
func (w *SegmentDataWriter) Append(clusterID, rowid int64, vec []byte) error
func (*SegmentDataWriter) AppendRawCluster ¶
func (*SegmentDataWriter) Close ¶
func (w *SegmentDataWriter) Close() (*SegmentDataStore, error)
func (*SegmentDataWriter) EntrySize ¶
func (w *SegmentDataWriter) EntrySize() int
func (*SegmentDataWriter) NextOffset ¶
func (w *SegmentDataWriter) NextOffset() uint64
type SegmentGeneration ¶
type SegmentGeneration struct {
Data *SegmentDataStore
RowMap *SegmentRowMap
Blocks *SegmentBlockMetaStore
ProbeCentroids *kmeans.CentroidSet
StableCentroids *kmeans.CentroidSet
StableCodec *StableMemberCodec
AppliedOverlaySeq uint64
ClusterRowCounts []uint64
ClusterVectorSums [][]float32
RowsModifiedSinceRebuild uint64
LastRebuildRowCount uint64
ConsecutiveSkewCycles uint32
LayoutHotClusters []int64
}
SegmentGeneration is the active stable on-disk serving snapshot for one vector index epoch.
func (*SegmentGeneration) Close ¶
func (g *SegmentGeneration) Close() error
type SegmentManifest ¶
type SegmentManifest struct {
Version uint32 `msgpack:"version"`
Database string `msgpack:"database"`
IndexName string `msgpack:"index_name"`
IndexCreatedAt int64 `msgpack:"index_created_at"`
Metric string `msgpack:"metric"`
Dim uint32 `msgpack:"dim"`
InternalDim uint32 `msgpack:"internal_dim"`
ProbeCentroidEpoch uint64 `msgpack:"probe_centroid_epoch,omitempty"`
ProbeCentroidBlob []byte `msgpack:"probe_centroid_blob,omitempty"`
StableCentroidEpoch uint64 `msgpack:"stable_centroid_epoch,omitempty"`
StableCentroidBlob []byte `msgpack:"stable_centroid_blob,omitempty"`
StableMemberCodecBlob []byte `msgpack:"stable_member_codec_blob,omitempty"`
CentroidEpoch uint64 `msgpack:"centroid_epoch,omitempty"`
CentroidBlob []byte `msgpack:"centroid_blob,omitempty"`
AppliedOverlaySeq uint64 `msgpack:"applied_overlay_seq"`
Generation uint64 `msgpack:"generation"`
DataFile string `msgpack:"data_file"`
DataFileSize uint64 `msgpack:"data_file_size"`
DataFileSHA256 string `msgpack:"data_file_sha256"`
RowMapFile string `msgpack:"rowmap_file"`
RowMapFileSize uint64 `msgpack:"rowmap_file_size"`
RowMapFileSHA256 string `msgpack:"rowmap_file_sha256"`
BlockMetaFile string `msgpack:"block_meta_file,omitempty"`
BlockMetaFileSize uint64 `msgpack:"block_meta_file_size,omitempty"`
BlockMetaFileSHA256 string `msgpack:"block_meta_file_sha256,omitempty"`
BlockRows uint32 `msgpack:"block_rows,omitempty"`
MaxCluster uint32 `msgpack:"max_cluster"`
RowCount uint64 `msgpack:"row_count"`
ClusterRowCounts []uint64 `msgpack:"cluster_row_counts,omitempty"`
ClusterVectorSums [][]float32 `msgpack:"cluster_vector_sums,omitempty"`
RowsModifiedSinceRebuild uint64 `msgpack:"rows_modified_since_rebuild,omitempty"`
LastRebuildRowCount uint64 `msgpack:"last_rebuild_row_count,omitempty"`
ConsecutiveSkewCycles uint32 `msgpack:"consecutive_skew_cycles,omitempty"`
LayoutHotClusters []uint32 `msgpack:"layout_hot_clusters,omitempty"`
CreatedAtUnixNano int64 `msgpack:"created_at_unix_nano"`
}
func DecodeSegmentManifest ¶
func DecodeSegmentManifest(data []byte) (*SegmentManifest, error)
func (*SegmentManifest) NormalizeCentroidFields ¶
func (m *SegmentManifest) NormalizeCentroidFields()
func (*SegmentManifest) ProbeBlobValue ¶
func (m *SegmentManifest) ProbeBlobValue() []byte
func (*SegmentManifest) ProbeEpochValue ¶
func (m *SegmentManifest) ProbeEpochValue() uint64
func (*SegmentManifest) StableBlobValue ¶
func (m *SegmentManifest) StableBlobValue() []byte
func (*SegmentManifest) StableEpochValue ¶
func (m *SegmentManifest) StableEpochValue() uint64
type SegmentRowLocation ¶
type SegmentRowMap ¶
type SegmentRowMap struct {
// contains filtered or unexported fields
}
func OpenSegmentRowMap ¶
func OpenSegmentRowMap(path string) (*SegmentRowMap, error)
func (*SegmentRowMap) Close ¶
func (m *SegmentRowMap) Close() error
func (*SegmentRowMap) EntryCount ¶
func (m *SegmentRowMap) EntryCount() uint64
func (*SegmentRowMap) Epoch ¶
func (m *SegmentRowMap) Epoch() uint64
func (*SegmentRowMap) FileSize ¶
func (m *SegmentRowMap) FileSize() int64
func (*SegmentRowMap) Generation ¶
func (m *SegmentRowMap) Generation() uint64
func (*SegmentRowMap) Lookup ¶
func (m *SegmentRowMap) Lookup(rowID int64) (SegmentRowLocation, bool, error)
func (*SegmentRowMap) Path ¶
func (m *SegmentRowMap) Path() string
func (*SegmentRowMap) Scan ¶
func (m *SegmentRowMap) Scan(visit func(SegmentRowLocation) bool) error
type SegmentRowMapWriter ¶
type SegmentRowMapWriter struct {
// contains filtered or unexported fields
}
func CreateSegmentRowMapWriter ¶
func CreateSegmentRowMapWriter(path string, epoch, generation uint64) (*SegmentRowMapWriter, error)
func (*SegmentRowMapWriter) Abort ¶
func (w *SegmentRowMapWriter) Abort()
func (*SegmentRowMapWriter) Append ¶
func (w *SegmentRowMapWriter) Append(rowID, clusterID int64, offset uint64) error
func (*SegmentRowMapWriter) Close ¶
func (w *SegmentRowMapWriter) Close() (*SegmentRowMap, error)
type SegmentScanStats ¶
type StableMemberCodec ¶
type StableMemberCodec struct {
// contains filtered or unexported fields
}
func BuildStableMemberCodec ¶
func BuildStableMemberCodec(spec IVFSpec, cs *kmeans.CentroidSet, training []StableCodecTrainingVector, seed uint64) (*StableMemberCodec, error)
func DecodeStableMemberCodecBlob ¶
func DecodeStableMemberCodecBlob(spec IVFSpec, cs *kmeans.CentroidSet, enc int64, blob []byte) (*StableMemberCodec, error)
func NewStableMemberCodec ¶
func NewStableMemberCodec(spec IVFSpec, cs *kmeans.CentroidSet, enc int64, pq *quantize.PQ8Codec) (*StableMemberCodec, error)
func (*StableMemberCodec) DecodePrepared ¶
func (c *StableMemberCodec) DecodePrepared(clusterID int64, vecBytes []byte) ([]float32, error)
func (*StableMemberCodec) EncodedSize ¶
func (c *StableMemberCodec) EncodedSize() int
func (*StableMemberCodec) Encoding ¶
func (c *StableMemberCodec) Encoding() int64
func (*StableMemberCodec) Validate ¶
func (c *StableMemberCodec) Validate() error
func (*StableMemberCodec) WithCentroids ¶
func (c *StableMemberCodec) WithCentroids(cs *kmeans.CentroidSet) (*StableMemberCodec, error)
type StableMemberQueryScorer ¶
type StableMemberQueryScorer struct {
// contains filtered or unexported fields
}
func NewStableMemberQueryScorerWithCodec ¶
func NewStableMemberQueryScorerWithCodec(codec *StableMemberCodec, query []float32, queryNorm2 float32) (*StableMemberQueryScorer, error)
func (*StableMemberQueryScorer) BlockLowerBound ¶
func (q *StableMemberQueryScorer) BlockLowerBound(clusterID int64, block SegmentBlockRecord) (float32, bool)
func (*StableMemberQueryScorer) ClusterScorer ¶
func (q *StableMemberQueryScorer) ClusterScorer(clusterID int64) (*StableMemberScorer, error)
type StableMemberScorer ¶
type StableMemberScorer struct {
// contains filtered or unexported fields
}
func NewStableMemberScorer ¶
func NewStableMemberScorer(spec IVFSpec, cs *kmeans.CentroidSet, query []float32, queryNorm2 float32, clusterID int64, enc int64) (*StableMemberScorer, error)
func NewStableMemberScorerWithCodec ¶
func NewStableMemberScorerWithCodec(codec *StableMemberCodec, query []float32, queryNorm2 float32, clusterID int64) (*StableMemberScorer, error)
type VecSessionVars ¶
type VecSessionVars struct {
// Nprobe is the number of IVF clusters probed per query (0 = index default).
Nprobe int
// ForcePlan overrides the cost-based plan selection.
ForcePlan ForcePlan
// PrefilterCap is the maximum pre-filter candidate set size.
PrefilterCap int
// Fallback enables short-result fallback (§7.5).
Fallback bool
// UseGoRank enables the Go-side ranking path that bypasses per-row UDF
// dispatch on post-filter plans (§7.6).
UseGoRank bool
// Auto-retrain (§8.7)
RetrainEnabled bool
RetrainCheckInterval int // seconds between retrain checks
RetrainGrowthRatio float64 // per-cluster growth ratio threshold (>= 1.0)
RetrainDeltaRatio float64 // delta/total ratio threshold (0 < x <= 1.0)
ReindexChunkRows int // rows per chunked-populate txn (§8.3)
}
VecSessionVars holds per-connection @@marmot_vec_* session variables (§6.3, §6.4). The zero value is not valid; use DefaultVecSessionVars().
All fields carry the current session value. Zero Nprobe means "use index default".
func DefaultVecSessionVars ¶
func DefaultVecSessionVars() VecSessionVars
DefaultVecSessionVars returns session vars initialised to their documented defaults.
func (*VecSessionVars) Apply ¶
func (v *VecSessionVars) Apply(name, value string) error
Apply sets a single @@marmot_vec_* variable, identified by its name without the leading @@ (e.g., "marmot_vec_nprobe"), from a string value extracted from the Vitess AST literal.
It returns MARMOT-VEC-012 for an invalid enum value, a typed parse error for out-of-range numerics, and a generic error for unknown variable names.
type VectorUDFProvider ¶
type VectorUDFProvider interface {
// AssignNearest returns the cluster id the given vector should be
// assigned to for the named index. Implementations should error with
// MARMOT-VEC-013 if the index is unknown.
AssignNearest(indexName string, vec []byte) (int64, error)
// TopNprobeClusters returns the nearest n 1-based cluster IDs for the query
// vector against the named index's probeState. Error MARMOT-VEC-013 if the
// index is unknown.
TopNprobeClusters(indexName string, vec []byte, n int) ([]int64, error)
}
VectorUDFProvider is the minimal surface a vector index engine must expose so that SQLite UDFs registered on a per-connection basis can reach back into the engine. Implementations land in P1-C.
Methods must be safe for concurrent invocation from arbitrary SQLite connection goroutines.
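A minimal in-memory type with the same two methods might look like the sketch below. It is an illustration only, not the engine's implementation: `memProvider`, `encodeF32`, and `decodeF32` are hypothetical, the vector bytes are assumed to be little-endian float32 values, squared L2 distance stands in for whatever metric the index uses, and the error is a plain string that merely carries the MARMOT-VEC-013 code.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
	"sort"
	"sync"
)

// provider repeats the two methods of VectorUDFProvider shown above.
type provider interface {
	AssignNearest(indexName string, vec []byte) (int64, error)
	TopNprobeClusters(indexName string, vec []byte, n int) ([]int64, error)
}

// memProvider is a hypothetical implementation. Centroid slices are treated
// as immutable once stored, so readers only hold the lock for the map lookup.
type memProvider struct {
	mu        sync.RWMutex
	centroids map[string][][]float32
}

var _ provider = (*memProvider)(nil)

// ASSUMPTION: vectors are little-endian float32 values.
func encodeF32(v []float32) []byte {
	out := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(out[4*i:], math.Float32bits(f))
	}
	return out
}

func decodeF32(b []byte) ([]float32, error) {
	if len(b)%4 != 0 {
		return nil, fmt.Errorf("vector length %d is not a multiple of 4", len(b))
	}
	out := make([]float32, len(b)/4)
	for i := range out {
		out[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
	}
	return out, nil
}

func (p *memProvider) TopNprobeClusters(indexName string, vec []byte, n int) ([]int64, error) {
	q, err := decodeF32(vec)
	if err != nil {
		return nil, err
	}
	p.mu.RLock()
	cs, ok := p.centroids[indexName]
	p.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("MARMOT-VEC-013: unknown index %q", indexName)
	}
	type scored struct {
		id   int64
		dist float32
	}
	ranked := make([]scored, 0, len(cs))
	for i, c := range cs {
		if len(c) != len(q) {
			return nil, fmt.Errorf("dimension mismatch: %d vs %d", len(c), len(q))
		}
		var d float32
		for j := range c {
			diff := c[j] - q[j]
			d += diff * diff
		}
		ranked = append(ranked, scored{id: int64(i + 1), dist: d}) // 1-based IDs
	}
	sort.SliceStable(ranked, func(a, b int) bool { return ranked[a].dist < ranked[b].dist })
	if n < 0 {
		n = 0
	}
	if n > len(ranked) {
		n = len(ranked)
	}
	ids := make([]int64, n)
	for i := range ids {
		ids[i] = ranked[i].id
	}
	return ids, nil
}

func (p *memProvider) AssignNearest(indexName string, vec []byte) (int64, error) {
	ids, err := p.TopNprobeClusters(indexName, vec, 1)
	if err != nil {
		return 0, err
	}
	if len(ids) == 0 {
		return 0, fmt.Errorf("index %q has no centroids", indexName)
	}
	return ids[0], nil
}

func main() {
	p := &memProvider{centroids: map[string][][]float32{
		"idx": {{0, 0}, {10, 10}, {1, 1}},
	}}
	q := encodeF32([]float32{0.9, 0.9})
	fmt.Println(p.AssignNearest("idx", q))
	fmt.Println(p.TopNprobeClusters("idx", q, 2))
	fmt.Println(p.TopNprobeClusters("missing", q, 1))
}
```

The read lock covers only the map lookup, which keeps concurrent callers from serializing on the distance computation; this relies on centroid slices being replaced wholesale rather than mutated in place.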
Source Files ¶
Directories ¶
| Path | Synopsis |
|---|---|
| pkg | |
| kmeans | Package kmeans provides k-means++ clustering and centroid management for IVF indexes. |
| metric | Package metric provides SIMD-accelerated vector distance functions. |