Documentation ¶
Overview ¶
Package embed provides embedding generation for converting text into float32 vectors for similarity search. The Embedder interface abstracts over multiple backends (builtin ONNX, Ollama, OpenAI).
The package also includes a fail-silent HTTP client for Ollama-compatible or OpenAI-compatible embedding endpoints. A nil *Client is safe — all methods return (nil, nil), so callers can use the zero value without checking for configuration.
Supported endpoint formats (auto-detected from URL):
- OpenAI (/v1/embeddings) — batch-capable, request {"model":...,"input":...}
- Ollama (/api/embeddings) — serial-only, request {"model":...,"prompt":...}
Index ¶
- type BuiltinEmbedder
- func (b *BuiltinEmbedder) Close() error
- func (b *BuiltinEmbedder) Embed(ctx context.Context, text string) ([]float32, error)
- func (b *BuiltinEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)
- func (b *BuiltinEmbedder) Model() string
- func (b *BuiltinEmbedder) StatusDetail() string
- func (b *BuiltinEmbedder) WarmUp(ctx context.Context) error
- type Client
- type Embedder
- type HTTPDoer
- type OllamaEmbedder
- type Option
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BuiltinEmbedder ¶
type BuiltinEmbedder struct {
// P8-11: optional callback for model download lifecycle events.
// eventType is "download_start" or "download_complete".
OnModelEvent func(eventType string)
// contains filtered or unexported fields
}
BuiltinEmbedder uses the pure-Go hugot library to run nomic-embed-text-v1.5 inference locally without any external dependencies. The quantized ONNX model (~137MB) is auto-downloaded from HuggingFace on first use and cached in the models directory. Output is Matryoshka-truncated to 384 dims.
Concurrent: a pool of pipeline instances allows bounded parallel inference. The pool size defaults to 3 — up to 3 Embed calls run simultaneously; additional callers block (respecting context) until a slot is returned.
Init retry: if model download fails (e.g., no internet), subsequent Embed() calls will retry initialization — not permanently broken.
func NewBuiltinEmbedder ¶
func NewBuiltinEmbedder(modelsDir string) *BuiltinEmbedder
NewBuiltinEmbedder creates a BuiltinEmbedder that stores its model in modelsDir (typically ~/.synapses/models). The nomic-embed-text-v1.5 model is lazily downloaded on the first Embed() call. Uses a pool of 3 pipeline instances for concurrent inference. Output is 384-dim (Matryoshka truncated).
func NewBuiltinEmbedderWithPoolSize ¶
func NewBuiltinEmbedderWithPoolSize(modelsDir string, poolSize int) *BuiltinEmbedder
NewBuiltinEmbedderWithPoolSize creates a BuiltinEmbedder with a custom pool size. poolSize is clamped to [1, 8].
func (*BuiltinEmbedder) Close ¶
func (b *BuiltinEmbedder) Close() error
Close releases all hugot session resources in the pool. Blocks until all in-flight Embed calls complete and return their slots, then destroys all pipeline instances. Safe to call concurrently; second call is a no-op.
func (*BuiltinEmbedder) Embed ¶
func (b *BuiltinEmbedder) Embed(ctx context.Context, text string) ([]float32, error)
Embed generates a 384-dimensional embedding for text using the builtin nomic-embed-text-v1.5 model (768 dims → Matryoshka truncation to 384). Concurrent: up to poolSize calls run in parallel; additional callers block until a pipeline slot is available (respecting context cancellation and shutdown).
If model initialization fails (e.g., no internet for download), subsequent calls will retry — the embedder is never permanently broken.
func (*BuiltinEmbedder) EmbedBatch ¶
func (b *BuiltinEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)
EmbedBatch embeds multiple texts in a single ONNX forward pass, which is significantly faster than calling Embed() N times. Returns one vector per input text in the same order. Uses one pool slot for the entire batch.
func (*BuiltinEmbedder) Model ¶
func (b *BuiltinEmbedder) Model() string
Model returns the builtin model identifier.
func (*BuiltinEmbedder) StatusDetail ¶
func (b *BuiltinEmbedder) StatusDetail() string
StatusDetail returns a human-readable string describing the current initialization state. Thread-safe. Four possible values:
- "ready" — pipeline pool initialized, embeddings working
- "model cached" — model on disk but pool not yet started; will initialize automatically on the first recall() call (e.g. after daemon restart)
- "model not yet downloaded" — Embed() has never been called and no model found on disk; model will be downloaded on first recall()
- "unavailable" — init was attempted but failed (download error, pipeline error, or air-gapped environment); Embed() will retry automatically
func (*BuiltinEmbedder) WarmUp ¶
func (b *BuiltinEmbedder) WarmUp(ctx context.Context) error
WarmUp pre-initializes the embedder by downloading the model if not already cached and setting up the inference pipeline pool. Call at daemon startup in a background goroutine so the first Embed() call doesn't block on download. Safe to call concurrently — uses singleflight internally.
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client calls an embedding endpoint to convert text into float32 vectors. Supports two formats, auto-detected from the endpoint URL:
- OpenAI (/v1/embeddings) — OpenAI-compatible, supports batch
- Ollama (/api/embeddings) — Ollama native, serial-only
A nil Client is safe to use — Embed returns (nil, nil).
func NewClient ¶
NewClient creates a Client for the given endpoint. Returns nil if endpoint is empty (embedding disabled). model defaults to "nomic-embed-text" when empty, which produces 768-dimensional vectors and is available in Ollama without any extra setup.
func (*Client) Embed ¶
Embed returns a vector embedding for text. The format is auto-detected from the endpoint URL:
- "/v1/embeddings" → OpenAI format
- otherwise → Ollama format
Returns (nil, nil) if the client is nil.
func (*Client) EmbedBatch ¶
EmbedBatch returns vector embeddings for a batch of texts in one HTTP round-trip. Supports OpenAI batch format ({"model":...,"input":[...]}). Ollama does not support batch — falls back to serial Embed() calls. Returns (nil, nil) if the client is nil or texts is empty.
type Embedder ¶
type Embedder interface {
// Embed returns a vector embedding for text.
// Returns (nil, nil) only when the embedder is intentionally disabled.
Embed(ctx context.Context, text string) ([]float32, error)
// WarmUp pre-initializes the embedder (e.g. downloads the model file).
// Call at daemon startup in a background goroutine so the first Embed()
// call doesn't block on model download. Implementations that need no
// warmup should return nil immediately.
WarmUp(ctx context.Context) error
// Model returns the model name used for embedding generation.
// Used as the model key in UpsertMemoryEmbedding for cache invalidation.
Model() string
// Close releases any resources held by the embedder (e.g. ONNX sessions).
// Implementations where Close is a no-op should return nil.
Close() error
}
Embedder generates vector embeddings from text. Implementations must be safe for concurrent use. A nil Embedder interface value is NOT safe — callers must check for nil before calling methods.
type HTTPDoer ¶
HTTPDoer is the interface for making HTTP requests. *http.Client satisfies this interface. Expose it so callers can inject custom transports (retry, tracing, rate-limiting) or test doubles without needing a real network. This is the same pattern used by AWS SDK v2, google-cloud-go, and stripe-go.
type OllamaEmbedder ¶
type OllamaEmbedder struct {
// contains filtered or unexported fields
}
OllamaEmbedder wraps an embed.Client to satisfy the Embedder interface. Use this when embeddings mode is "ollama" — delegates to a local Ollama instance via the existing HTTP client.
func NewOllamaEmbedder ¶
func NewOllamaEmbedder(endpoint, model string, opts ...Option) *OllamaEmbedder
NewOllamaEmbedder creates an Embedder that delegates to a local Ollama instance. endpoint is the Ollama API URL (e.g. "http://localhost:11434/api/embeddings"). model defaults to "nomic-embed-text" when empty. Returns nil if endpoint is empty.
func (*OllamaEmbedder) Close ¶
func (o *OllamaEmbedder) Close() error
Close is a no-op for the Ollama embedder (HTTP client has no resources to release).
func (*OllamaEmbedder) Model ¶
func (o *OllamaEmbedder) Model() string
Model returns the configured embedding model name.
func (*OllamaEmbedder) WarmUp ¶
func (o *OllamaEmbedder) WarmUp(ctx context.Context) error
WarmUp validates that the Ollama server has the embedding model available by performing a single test embed. Returns an error if the model is not pulled or the server is unreachable — callers can fall back to builtin ONNX.
type Option ¶
type Option func(*Client)
Option is a functional option for Client construction.
func WithHTTPDoer ¶
WithHTTPDoer replaces the default *http.Client with a custom HTTPDoer. Use this to inject retry wrappers, custom transports, or test doubles.
// Production: add tracing
embed.NewClient(url, model, embed.WithHTTPDoer(tracedClient))

// Tests: inject a mock without needing a real HTTP server
embed.NewClient(url, model, embed.WithHTTPDoer(myMock))