Documentation ¶
Overview ¶
Package lsh provides a Locality-Sensitive Hashing index for fast approximate nearest-neighbor retrieval of MinHash signatures.
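LSH splits each MinHash signature into bands of consecutive hash values and hashes every band to a bucket. Two signatures become candidates for each other when they fall into the same bucket for at least one band, so similar signatures are likely to be grouped together. Indexing N signatures takes O(N) time, and a query inspects only the buckets its own bands hash to, which is typically sublinear in N. This replaces the O(N^2) cost of comparing every pair.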
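The banding step described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper names bandKeys and sharesBucket, the []uint64 signature type, and the use of FNV-1a as the band hash are all assumptions made for the example.

```go
package main

import (
	"encoding/binary"
	"hash/fnv"
)

// bandKeys splits sig into numBands bands of numRows consecutive values and
// hashes each band to one 64-bit bucket key.
// Hypothetical helper: it assumes len(sig) == numBands*numRows and panics
// on a shorter slice. FNV-1a is an arbitrary choice of band hash.
func bandKeys(sig []uint64, numBands, numRows int) []uint64 {
	keys := make([]uint64, numBands)
	var buf [8]byte
	for b := 0; b < numBands; b++ {
		h := fnv.New64a()
		for _, v := range sig[b*numRows : (b+1)*numRows] {
			binary.LittleEndian.PutUint64(buf[:], v)
			h.Write(buf[:])
		}
		keys[b] = h.Sum64()
	}
	return keys
}

// sharesBucket reports whether two signatures get the same bucket key in at
// least one band, which is the condition for being candidates of each other.
func sharesBucket(a, b []uint64, numBands, numRows int) bool {
	ka := bandKeys(a, numBands, numRows)
	kb := bandKeys(b, numBands, numRows)
	for i := range ka {
		if ka[i] == kb[i] {
			return true
		}
	}
	return false
}
```

In this sketch, two signatures that agree on every value of one band always collide in that band, while signatures that differ in every band collide only if the band hash itself collides.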
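The index is parameterized by numBands and numRows, where numBands * numRows = numHashes. For a fixed numHashes, increasing numBands (and so decreasing numRows) lowers the similarity threshold at which signatures become candidates: more pairs are retrieved, at the cost of more false positives.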
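The effect of numBands and numRows on the threshold can be made concrete with the standard banding analysis. Assuming each hash value of two signatures agrees independently with probability s (their Jaccard similarity), the pair becomes a candidate with probability 1 - (1 - s^r)^b for b bands of r rows, and the curve rises most steeply near s ≈ (1/b)^(1/r). The function names below are hypothetical and not part of this package.

```go
package main

import "math"

// candidateProbability returns 1 - (1 - s^r)^b: the probability that two
// signatures with similarity s agree on all rows of at least one band.
// Hypothetical helper, assuming independent per-hash agreement.
func candidateProbability(s float64, numBands, numRows int) float64 {
	allRowsMatch := math.Pow(s, float64(numRows))
	return 1 - math.Pow(1-allRowsMatch, float64(numBands))
}

// approxThreshold returns (1/b)^(1/r), a common approximation of the
// similarity at which candidateProbability rises most steeply.
func approxThreshold(numBands, numRows int) float64 {
	return math.Pow(1/float64(numBands), 1/float64(numRows))
}
```

With 128 hashes, approxThreshold(32, 4) is about 0.42 while approxThreshold(16, 8) is about 0.71, which illustrates how more bands lower the threshold.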
Index ¶
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrInvalidParams is returned when numBands or numRows is not positive.
	ErrInvalidParams = errors.New("lsh: numBands and numRows must be positive")

	// ErrNilSignature is returned when a nil signature is provided.
	ErrNilSignature = errors.New("lsh: signature must not be nil")

	// ErrSizeMismatch is returned when signature size does not match numBands * numRows.
	ErrSizeMismatch = errors.New("lsh: signature size must equal numBands * numRows")
)
Functions ¶
This section is empty.
Types ¶
type Index ¶
type Index struct {
// contains filtered or unexported fields
}
Index is a thread-safe LSH index for approximate nearest-neighbor retrieval.
func New ¶
New creates a new LSH index with the given number of bands and rows per band. The total number of hash functions expected from signatures is numBands * numRows.
func (*Index) Insert ¶
Insert adds a signature to the index under the given identifier. It returns ErrNilSignature if sig is nil and ErrSizeMismatch if its size does not equal numBands * numRows.
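The signatures of New and Insert are not shown on this page, so the following is only a sketch of how an index with the documented behavior could be structured. Every name in it (sketchIndex, newSketchIndex, insert, candidates, hashBand), the string identifier, the []uint64 signature type, and the FNV-1a band hash are assumptions for illustration, not this package's API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/fnv"
	"sync"
)

// Local stand-ins for the package's exported errors.
var (
	errInvalidParams = errors.New("sketch: numBands and numRows must be positive")
	errNilSignature  = errors.New("sketch: signature must not be nil")
	errSizeMismatch  = errors.New("sketch: signature size must equal numBands * numRows")
)

// sketchIndex is a hypothetical index: one bucket table per band, guarded
// by a read-write mutex so it can be used from multiple goroutines.
type sketchIndex struct {
	mu       sync.RWMutex
	numBands int
	numRows  int
	buckets  []map[uint64][]string
}

func newSketchIndex(numBands, numRows int) (*sketchIndex, error) {
	if numBands <= 0 || numRows <= 0 {
		return nil, errInvalidParams
	}
	buckets := make([]map[uint64][]string, numBands)
	for i := range buckets {
		buckets[i] = make(map[uint64][]string)
	}
	return &sketchIndex{numBands: numBands, numRows: numRows, buckets: buckets}, nil
}

// check validates a signature against the configured dimensions.
func (x *sketchIndex) check(sig []uint64) error {
	if sig == nil {
		return errNilSignature
	}
	if len(sig) != x.numBands*x.numRows {
		return errSizeMismatch
	}
	return nil
}

// hashBand hashes one band to a bucket key (FNV-1a is an arbitrary choice).
func hashBand(band []uint64) uint64 {
	h := fnv.New64a()
	var buf [8]byte
	for _, v := range band {
		binary.LittleEndian.PutUint64(buf[:], v)
		h.Write(buf[:])
	}
	return h.Sum64()
}

// insert records id in the bucket selected by each band of sig.
func (x *sketchIndex) insert(id string, sig []uint64) error {
	if err := x.check(sig); err != nil {
		return err
	}
	x.mu.Lock()
	defer x.mu.Unlock()
	for b := 0; b < x.numBands; b++ {
		key := hashBand(sig[b*x.numRows : (b+1)*x.numRows])
		x.buckets[b][key] = append(x.buckets[b][key], id)
	}
	return nil
}

// candidates returns the distinct ids sharing at least one bucket with sig.
func (x *sketchIndex) candidates(sig []uint64) ([]string, error) {
	if err := x.check(sig); err != nil {
		return nil, err
	}
	x.mu.RLock()
	defer x.mu.RUnlock()
	seen := make(map[string]struct{})
	var out []string
	for b := 0; b < x.numBands; b++ {
		key := hashBand(sig[b*x.numRows : (b+1)*x.numRows])
		for _, id := range x.buckets[b][key] {
			if _, ok := seen[id]; !ok {
				seen[id] = struct{}{}
				out = append(out, id)
			}
		}
	}
	return out, nil
}
```

Keeping a separate bucket table per band means equal keys from different bands never mix, and the read-write mutex lets concurrent lookups proceed while inserts are serialized.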