embedding

package
v0.4.2
Published: May 20, 2026 | License: MIT | Imports: 11 | Imported by: 0

Documentation

Overview

Package embedding provides semantic vector embeddings for symbols. It uses MiniLM-L6-v2 via hugot (pure-Go ONNX runtime) to generate embeddings offline. The model (~30MB) is downloaded automatically from Hugging Face on first use. Vectors are stored in an HNSW index (coder/hnsw) for nearest-neighbor search.

Constants

const (

	// Dims is the dimensionality of the embedding vectors produced by the model.
	Dims = 384
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Embedder

type Embedder struct {
	// contains filtered or unexported fields
}

Embedder generates embedding vectors and provides nearest-neighbor search.

func New

func New() (*Embedder, error)

New creates an Embedder, downloading the model if needed. The model is cached at ~/.cache/knowing/models/.

func (*Embedder) AddVector

func (e *Embedder) AddVector(id string, vec []float32)

AddVector indexes a symbol ID with its embedding vector for nearest-neighbor search.

func (*Embedder) Close

func (e *Embedder) Close() error

Close releases resources.

func (*Embedder) Count

func (e *Embedder) Count() int

Count returns the number of indexed vectors.

func (*Embedder) Embed

func (e *Embedder) Embed(ctx context.Context, text string) ([]float32, error)

Embed returns the embedding vector for a single text string.

func (*Embedder) EmbedBatch

func (e *Embedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

EmbedBatch returns embedding vectors for multiple texts.

func (*Embedder) Search

func (e *Embedder) Search(query []float32, k int) []string

Search returns the k nearest neighbor symbol IDs to the query vector.

type Searcher

type Searcher struct {
	// contains filtered or unexported fields
}

Searcher wraps an Embedder and provides the VectorSearcher interface expected by the context engine. It resolves HNSW string keys (hex-encoded node hashes) back to types.Hash values.
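The key resolution described above can be sketched with encoding/hex. The Hash type below is a hypothetical stand-in for types.Hash, and its 32-byte size is an assumption; keyFromHash and hashFromKey are illustrative names, not package functions.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// Hash is a hypothetical stand-in for types.Hash.
// The 32-byte length is an assumption made for this sketch.
type Hash [32]byte

// keyFromHash hex-encodes a hash into the string form used as an index key.
func keyFromHash(h Hash) string {
	return hex.EncodeToString(h[:])
}

// hashFromKey decodes a hex key back into a Hash and rejects keys that are
// not valid hex or do not decode to the expected length.
func hashFromKey(key string) (Hash, error) {
	var h Hash
	raw, err := hex.DecodeString(key)
	if err != nil {
		return h, fmt.Errorf("decode key: %w", err)
	}
	if len(raw) != len(h) {
		return h, fmt.Errorf("key decodes to %d bytes, want %d", len(raw), len(h))
	}
	copy(h[:], raw)
	return h, nil
}

func main() {
	var h Hash
	h[0], h[31] = 0xab, 0xcd
	key := keyFromHash(h)
	back, err := hashFromKey(key)
	fmt.Println(key, back == h, err)
}
```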

func NewSearcher

func NewSearcher(e *Embedder) *Searcher

NewSearcher creates a Searcher from an initialized Embedder.

func (*Searcher) Close

func (s *Searcher) Close() error

Close releases the underlying embedder resources.

func (*Searcher) Count

func (s *Searcher) Count() int

Count returns the number of indexed vectors.

func (*Searcher) EmbedAndSearch

func (s *Searcher) EmbedAndSearch(ctx context.Context, query string, k int) ([]types.Hash, error)

EmbedAndSearch embeds the query text and returns the k nearest symbol hashes.

func (*Searcher) IndexBatch

func (s *Searcher) IndexBatch(ctx context.Context, nodes []types.Node, filePaths []string) error

IndexBatch embeds multiple nodes and adds them to the HNSW index. It is more efficient than calling IndexNode once per node because the embedding step is batched.

func (*Searcher) IndexNode

func (s *Searcher) IndexNode(ctx context.Context, node types.Node, filePath string) error

IndexNode embeds a node's text representation and adds it to the HNSW index. The text format is "kind name signature filepath".
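As an illustration of that text format, the sketch below joins the four parts with single spaces. The Node struct is a hypothetical stand-in for types.Node, its field names are assumptions, and nodeText is not a package function; the real type may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical stand-in for types.Node.
// The field names are assumptions made for this sketch.
type Node struct {
	Kind      string
	Name      string
	Signature string
}

// nodeText builds the documented "kind name signature filepath" string
// by joining the four parts with single spaces.
func nodeText(n Node, filePath string) string {
	parts := []string{n.Kind, n.Name, n.Signature, filePath}
	return strings.Join(parts, " ")
}

func main() {
	n := Node{
		Kind:      "func",
		Name:      "New",
		Signature: "func New() (*Embedder, error)",
	}
	fmt.Println(nodeText(n, "embedding/embedder.go"))
}
```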
