hdindex

v0.0.0-...-b936464 Latest
Published: Apr 14, 2026 License: MIT Imports: 25 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ApplyToken

type ApplyToken struct {
	TxnID uint64
	SeqID uint64
}

ApplyToken identifies a mutation for idempotency tracking.

type DeleteMutation

type DeleteMutation struct {
	TxnID      uint64
	SeqID      uint64
	ExternalID []byte
}

DeleteMutation represents a vector delete operation.

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

Engine manages HD-Index lifecycle (create, open, drop, list, close).

func NewEngine

func NewEngine(rootDir string, config EngineConfig) (*Engine, error)

NewEngine creates an engine that stores indexes under rootDir.

func (*Engine) Close

func (e *Engine) Close() error

Close closes all open indexes and the engine.

func (*Engine) CreateIndex

func (e *Engine) CreateIndex(ctx context.Context, spec HDIndexSpec, vectors []VectorEntry) (*Index, error)

CreateIndex builds a new HD-Index from scratch using the provided vectors.

func (*Engine) DropIndex

func (e *Engine) DropIndex(ctx context.Context, id string) error

DropIndex closes and deletes an index.

func (*Engine) OpenIndex

func (e *Engine) OpenIndex(ctx context.Context, id string) (*Index, error)

OpenIndex opens an existing index by ID from disk.

func (*Engine) RestoreIndex

func (e *Engine) RestoreIndex(ctx context.Context, id string, r io.Reader) error

RestoreIndex restores an index from a compressed tar archive. It reads a tar.gz stream from the provided reader and extracts it into the engine's root directory. The archive must contain a top-level directory named after the index ID (as produced by SnapshotIndex). After a successful restore, the index can be opened with OpenIndex.

func (*Engine) SnapshotIndex

func (e *Engine) SnapshotIndex(ctx context.Context, id string, w io.Writer) error

SnapshotIndex creates a compressed tar archive of an index's Pebble data and writes it as a tar.gz stream to the provided writer. The index remains open and readable during the snapshot (the underlying Pebble checkpoint is crash-consistent).

type EngineConfig

type EngineConfig struct {
	PebbleCacheMB int // Block cache size in MB (0 = Pebble default)
}

EngineConfig holds options for creating an HD-Index engine.

type HDIndexSpec

type HDIndexSpec struct {
	ID          string    `msgpack:"id"`
	Dim         int       `msgpack:"dim"`
	Metric      Metric    `msgpack:"metric"`
	InternalDim int       `msgpack:"internal_dim"`
	Tau         int       `msgpack:"tau"`
	Omega       int       `msgpack:"omega"`
	Eta         int       `msgpack:"eta"`
	RefCount    int       `msgpack:"ref_count"`
	Alpha       int       `msgpack:"alpha"`
	Gamma       int       `msgpack:"gamma"`
	Seed        int64     `msgpack:"seed"`
	NormMax     float64   `msgpack:"norm_max"`
	DomainMin   []float32 `msgpack:"domain_min"`
	DomainMax   []float32 `msgpack:"domain_max"`
}

HDIndexSpec defines the configuration for an HD-Index instance.

func DefaultSpec

func DefaultSpec(id string, dim int, metric Metric) HDIndexSpec

DefaultSpec returns an HDIndexSpec with defaults matching the HD-Index paper (Arora et al., VLDB 2018, Sections 5.2.1–5.2.6):

  • m=10 reference objects (§5.2.3: "quality saturates at m=10")
  • τ ≈ sqrt(dim) (§5.2.4: Enron 1369-dim uses τ=37, Glove 100-dim uses τ=10)
  • α=4096, γ=1024, α/γ=4 (§5.2.6: recommended values)
  • Triangle inequality only, no Ptolemaic (§5.2.5: "more prudent")
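
The τ ≈ sqrt(dim) rule and the α/γ ratio above can be checked with a few lines of arithmetic. This is a sketch only: tauFor is a hypothetical helper, and rounding to the nearest integer is an assumption (DefaultSpec may round differently).

```go
package main

import (
	"fmt"
	"math"
)

// tauFor approximates τ ≈ sqrt(dim), rounded to the nearest integer.
// The rounding mode is an assumption, not taken from the package.
func tauFor(dim int) int {
	return int(math.Round(math.Sqrt(float64(dim))))
}

func main() {
	// The two datasets cited in the list above: Glove (100-dim) and Enron (1369-dim).
	for _, dim := range []int{100, 1369} {
		fmt.Printf("dim=%d tau=%d\n", dim, tauFor(dim))
	}
	const alpha, gamma = 4096, 1024
	fmt.Println("alpha/gamma =", alpha/gamma)
}
```

Both cited dimensions are perfect squares, so the sketch reproduces τ=10 and τ=37 exactly.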

type Index

type Index struct {
	// contains filtered or unexported fields
}

Index represents an open HD-Index instance.

func (*Index) Checkpoint

func (idx *Index) Checkpoint(destDir string) error

Checkpoint creates a consistent point-in-time snapshot of the index's Pebble database at destDir. The checkpoint uses hard links when possible, which avoids copying data and keeps it fast. The caller should tar/compress destDir for transport and remove it when done.

func (*Index) Close

func (idx *Index) Close() error

Close flushes and closes the index.

func (*Index) Delete

func (idx *Index) Delete(ctx context.Context, mut DeleteMutation) error

Delete removes a vector by external ID.

func (*Index) Search

func (idx *Index) Search(ctx context.Context, req SearchRequest) (*SearchResult, error)

Search performs a kNN query using the HD-Index algorithm (Algorithm 2 in the HD-Index paper). Partition scans run in parallel across available cores.
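
The triangle-inequality filtering that SearchStats reports on (CandidatesAfterTriangle) rests on a standard lower bound: for any reference object r, |d(q, r) - d(o, r)| ≤ d(q, o). The sketch below shows that bound used to shortlist candidates before exact scoring. It is not the package's code: candidate, lowerBound, and filterByTriangle are hypothetical names, and keeping the candidates with the smallest bounds is an assumed selection rule.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// candidate holds precomputed distances from a stored vector to each
// reference object (hypothetical type for this sketch).
type candidate struct {
	id       string
	refDists []float64
}

// lowerBound returns max_i |d(q, r_i) - d(o, r_i)|. By the triangle
// inequality this never exceeds the true distance d(q, o).
func lowerBound(queryRefDists, objRefDists []float64) float64 {
	lb := 0.0
	for i := range queryRefDists {
		if d := math.Abs(queryRefDists[i] - objRefDists[i]); d > lb {
			lb = d
		}
	}
	return lb
}

// filterByTriangle returns the keep candidates with the smallest lower
// bounds, leaving the input slice untouched.
func filterByTriangle(queryRefDists []float64, cands []candidate, keep int) []candidate {
	out := append([]candidate(nil), cands...)
	sort.SliceStable(out, func(i, j int) bool {
		return lowerBound(queryRefDists, out[i].refDists) <
			lowerBound(queryRefDists, out[j].refDists)
	})
	if keep < len(out) {
		out = out[:keep]
	}
	return out
}

func main() {
	// Distances from the query to two reference objects.
	query := []float64{1.0, 2.0}
	cands := []candidate{
		{"a", []float64{1.1, 2.0}},
		{"b", []float64{5.0, 2.0}},
		{"c", []float64{1.0, 2.5}},
	}
	for _, c := range filterByTriangle(query, cands, 2) {
		fmt.Println(c.id)
	}
}
```

Only the survivors of this step would need their full vectors fetched and scored exactly, which is where the savings come from.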

func (*Index) Spec

func (idx *Index) Spec() HDIndexSpec

Spec returns the index specification.

func (*Index) Stats

func (idx *Index) Stats() IndexStats

Stats returns index statistics.

func (*Index) Upsert

func (idx *Index) Upsert(ctx context.Context, mut Mutation) error

Upsert inserts or updates a single vector. The operation is idempotent: each mutation is identified by its (TxnID, SeqID) pair.
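
How the package decides that a mutation has already been applied is not documented here. One common scheme, suggested by the WatermarkTxnID and WatermarkSeqID fields of IndexStats, is a watermark compared in (TxnID, SeqID) order. The sketch below assumes that scheme; token and tracker are hypothetical types and are not part of the package.

```go
package main

import "fmt"

// token mirrors the fields of the documented ApplyToken.
type token struct{ TxnID, SeqID uint64 }

// after reports whether t comes strictly after w when tokens are ordered
// by TxnID first and SeqID second.
func (t token) after(w token) bool {
	if t.TxnID != w.TxnID {
		return t.TxnID > w.TxnID
	}
	return t.SeqID > w.SeqID
}

// tracker remembers the highest token applied so far.
type tracker struct {
	watermark token
	applied   int
}

// apply performs the mutation only if its token is past the watermark and
// reports whether it did. Replays at or below the watermark are skipped.
func (tr *tracker) apply(t token) bool {
	if !t.after(tr.watermark) {
		return false
	}
	tr.watermark = t
	tr.applied++
	return true
}

func main() {
	tr := &tracker{}
	for _, t := range []token{{1, 1}, {1, 1}, {1, 2}, {0, 9}} {
		fmt.Println(t, tr.apply(t))
	}
	fmt.Println("applied:", tr.applied)
}
```

With a zero-valued starting watermark, the token (0, 0) is never applied in this sketch, so callers would number mutations from 1.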

type IndexStats

type IndexStats struct {
	VectorCount    uint64
	WatermarkTxnID uint64
	WatermarkSeqID uint64
}

IndexStats provides statistics about the index.

type Metric

type Metric int

Metric defines the distance metric for the index.

const (
	MetricEuclidean Metric = iota
	MetricCosine
	MetricDot
)

func ParseMetric

func ParseMetric(s string) (Metric, bool)

func (Metric) String

func (m Metric) String() string

type Mutation

type Mutation struct {
	TxnID      uint64
	SeqID      uint64
	ExternalID []byte
	VectorFP32 []float32
}

Mutation represents a vector upsert operation.

type SearchHit

type SearchHit struct {
	ExternalID []byte
	Distance   float32
	Score      float32
}

SearchHit represents a single search result.

type SearchRequest

type SearchRequest struct {
	VectorFP32 []float32
	TopK       int
	Alpha      int // override per-query, 0 = use index default
	Gamma      int // override per-query, 0 = use index default
}

SearchRequest defines parameters for a kNN search.
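
The "0 = use index default" convention for Alpha and Gamma amounts to a simple fallback. A minimal sketch follows, with a hypothetical helper named effective. Treating negative values like 0 is an assumption, since the package may reject them instead.

```go
package main

import "fmt"

// effective returns the per-query override when it is positive and the
// index default otherwise.
func effective(override, indexDefault int) int {
	if override > 0 {
		return override
	}
	return indexDefault
}

func main() {
	// Index defaults taken from the DefaultSpec description: α=4096, γ=1024.
	fmt.Println(effective(0, 4096))   // no override: index default applies
	fmt.Println(effective(512, 1024)) // override applies
}
```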

type SearchResult

type SearchResult struct {
	Hits  []SearchHit
	Stats SearchStats
}

SearchResult holds the output of a search operation.

type SearchStats

type SearchStats struct {
	CandidatesScanned       int
	CandidatesAfterTriangle int
	CandidatesExactScored   int
	PartitionsSearched      int
}

SearchStats provides diagnostics about a search.

type VectorEntry

type VectorEntry struct {
	ExternalID []byte
	Vector     []float32
}

VectorEntry represents a single vector with its external ID for bulk loading.

Directories

Path Synopsis
pkg
rdb
