embed

package
v0.8.0
Published: Mar 29, 2026 License: MIT Imports: 22 Imported by: 0

Documentation

Overview

Package embed provides a fail-silent client for generating vector embeddings from an Ollama-compatible or OpenAI-compatible HTTP endpoint. A nil *Client is safe — all methods return (nil, nil) so callers can use the zero value without checking for configuration.

Supported endpoint formats (auto-detected from URL):

  • OpenAI (/v1/embeddings) — batch-capable, request {"model":...,"input":...}
  • Ollama (/api/embeddings) — serial-only, request {"model":...,"prompt":...}

The Embedder interface abstracts over the supported backends (builtin ONNX, Ollama, OpenAI), each converting text into float32 vectors for similarity search.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type BuiltinEmbedder

type BuiltinEmbedder struct {

	// P8-11: optional callback for model download lifecycle events.
	// eventType is "download_start" or "download_complete".
	OnModelEvent func(eventType string)
	// contains filtered or unexported fields
}

BuiltinEmbedder uses the pure-Go hugot library to run nomic-embed-text-v1.5 inference locally without any external dependencies. The quantized ONNX model (~137MB) is auto-downloaded from HuggingFace on first use and cached in the models directory. Output is Matryoshka-truncated to 384 dims.

Concurrent: a pool of pipeline instances allows bounded parallel inference. The pool size defaults to 3 — up to 3 Embed calls run simultaneously; additional callers block (respecting context) until a slot is returned.

Init retry: if the model download fails (e.g., no internet), subsequent Embed() calls retry initialization, so the embedder is never permanently broken.

func NewBuiltinEmbedder

func NewBuiltinEmbedder(modelsDir string) *BuiltinEmbedder

NewBuiltinEmbedder creates a BuiltinEmbedder that stores its model in modelsDir (typically ~/.synapses/models). The nomic-embed-text-v1.5 model is lazily downloaded on the first Embed() call. Uses a pool of 3 pipeline instances for concurrent inference. Output is 384-dim (Matryoshka truncated).

func NewBuiltinEmbedderWithPoolSize

func NewBuiltinEmbedderWithPoolSize(modelsDir string, poolSize int) *BuiltinEmbedder

NewBuiltinEmbedderWithPoolSize creates a BuiltinEmbedder with a custom pool size. poolSize is clamped to [1, 8].

func (*BuiltinEmbedder) Close

func (b *BuiltinEmbedder) Close() error

Close releases all hugot session resources in the pool. Blocks until all in-flight Embed calls complete and return their slots, then destroys all pipeline instances. Safe to call concurrently; second call is a no-op.

func (*BuiltinEmbedder) Embed

func (b *BuiltinEmbedder) Embed(ctx context.Context, text string) ([]float32, error)

Embed generates a 384-dimensional embedding for text using the builtin nomic-embed-text-v1.5 model (768 dims → Matryoshka truncation to 384). Concurrent: up to poolSize calls run in parallel; additional callers block until a pipeline slot is available (respecting context cancellation and shutdown).

If model initialization fails (e.g., no internet for download), subsequent calls will retry — the embedder is never permanently broken.

func (*BuiltinEmbedder) EmbedBatch

func (b *BuiltinEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

EmbedBatch embeds multiple texts in a single ONNX forward pass, which is significantly faster than calling Embed() N times. Returns one vector per input text in the same order. Uses one pool slot for the entire batch.

func (*BuiltinEmbedder) Model

func (b *BuiltinEmbedder) Model() string

Model returns the builtin model identifier.

func (*BuiltinEmbedder) StatusDetail

func (b *BuiltinEmbedder) StatusDetail() string

StatusDetail returns a human-readable string describing the current initialization state. Thread-safe. Four possible values:

  • "ready" — pipeline pool initialized, embeddings working
  • "model cached" — model on disk but pool not yet started; will initialize automatically on the first recall() call (e.g. after daemon restart)
  • "model not yet downloaded" — Embed() has never been called and no model found on disk; model will be downloaded on first recall()
  • "unavailable" — init was attempted but failed (download error, pipeline error, or air-gapped environment); Embed() will retry automatically

func (*BuiltinEmbedder) WarmUp

func (b *BuiltinEmbedder) WarmUp(ctx context.Context) error

WarmUp pre-initializes the embedder by downloading the model if not already cached and setting up the inference pipeline pool. Call at daemon startup in a background goroutine so the first Embed() call doesn't block on download. Safe to call concurrently — uses singleflight internally.

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client calls an embedding endpoint to convert text into float32 vectors. Supports two formats, auto-detected from the endpoint URL:

  • OpenAI (/v1/embeddings) — OpenAI-compatible, supports batch
  • Ollama (/api/embeddings) — Ollama native, serial-only

A nil Client is safe to use — Embed returns (nil, nil).

func NewClient

func NewClient(endpoint, model string, opts ...Option) *Client

NewClient creates a Client for the given endpoint. Returns nil if endpoint is empty (embedding disabled). model defaults to "nomic-embed-text" when empty, which produces 768-dimensional vectors and is available in Ollama without any extra setup.

func (*Client) Embed

func (c *Client) Embed(ctx context.Context, text string) ([]float32, error)

Embed returns a vector embedding for text. The format is auto-detected from the endpoint URL:

  • "/v1/embeddings" → OpenAI format
  • otherwise → Ollama format

Returns (nil, nil) if the client is nil.

func (*Client) EmbedBatch

func (c *Client) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

EmbedBatch returns vector embeddings for a batch of texts in one HTTP round-trip using the OpenAI batch format ({"model":...,"input":[...]}). Ollama does not support batching, so the client falls back to serial Embed() calls. Returns (nil, nil) if the client is nil or texts is empty.

func (*Client) Model

func (c *Client) Model() string

Model returns the configured embedding model name.

func (*Client) WarmUp

func (c *Client) WarmUp(_ context.Context) error

WarmUp is a no-op for the HTTP client embedder (model is managed by the remote server).

type Embedder

type Embedder interface {
	// Embed returns a vector embedding for text.
	// Returns (nil, nil) only when the embedder is intentionally disabled.
	Embed(ctx context.Context, text string) ([]float32, error)

	// WarmUp pre-initializes the embedder (e.g. downloads the model file).
	// Call at daemon startup in a background goroutine so the first Embed()
	// call doesn't block on model download. Implementations that need no
	// warmup should return nil immediately.
	WarmUp(ctx context.Context) error

	// Model returns the model name used for embedding generation.
	// Used as the model key in UpsertMemoryEmbedding for cache invalidation.
	Model() string

	// Close releases any resources held by the embedder (e.g. ONNX sessions).
	// Implementations where Close is a no-op should return nil.
	Close() error
}

Embedder generates vector embeddings from text. Implementations must be safe for concurrent use. A nil Embedder is NOT safe — callers must check.

type HTTPDoer

type HTTPDoer interface {
	Do(*http.Request) (*http.Response, error)
}

HTTPDoer is the interface for making HTTP requests; *http.Client satisfies it. It is exposed so callers can inject custom transports (retry, tracing, rate-limiting) or test doubles without needing a real network. This is the same pattern used by AWS SDK v2, google-cloud-go, and stripe-go.

type OllamaEmbedder

type OllamaEmbedder struct {
	// contains filtered or unexported fields
}

OllamaEmbedder wraps an embed.Client to satisfy the Embedder interface. Use this when embeddings mode is "ollama" — delegates to a local Ollama instance via the existing HTTP client.

func NewOllamaEmbedder

func NewOllamaEmbedder(endpoint, model string, opts ...Option) *OllamaEmbedder

NewOllamaEmbedder creates an Embedder that delegates to a local Ollama instance. endpoint is the Ollama API URL (e.g. "http://localhost:11434/api/embeddings"). model defaults to "nomic-embed-text" when empty. Returns nil if endpoint is empty.

func (*OllamaEmbedder) Close

func (o *OllamaEmbedder) Close() error

Close is a no-op for the Ollama embedder (HTTP client has no resources to release).

func (*OllamaEmbedder) Embed

func (o *OllamaEmbedder) Embed(ctx context.Context, text string) ([]float32, error)

Embed generates an embedding via the Ollama HTTP API.

func (*OllamaEmbedder) Model

func (o *OllamaEmbedder) Model() string

Model returns the configured embedding model name.

func (*OllamaEmbedder) WarmUp

func (o *OllamaEmbedder) WarmUp(ctx context.Context) error

WarmUp validates that the Ollama server has the embedding model available by performing a single test embed. Returns an error if the model is not pulled or the server is unreachable — callers can fall back to builtin ONNX.

type Option

type Option func(*Client)

Option is a functional option for Client construction.

func WithHTTPDoer

func WithHTTPDoer(d HTTPDoer) Option

WithHTTPDoer replaces the default *http.Client with a custom HTTPDoer. Use this to inject retry wrappers, custom transports, or test doubles.

// Production: add tracing
embed.NewClient(url, model, embed.WithHTTPDoer(tracedClient))

// Tests: inject a mock without needing a real HTTP server
embed.NewClient(url, model, embed.WithHTTPDoer(myMock))
