embed

package
v0.3.0-alpha.3
Warning

This package is not in the latest version of its module.

Published: May 10, 2026 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Constants

const DefaultContextWindow = 512

DefaultContextWindow is the fallback context window (in tokens) when auto-detection is unavailable and no config is set.
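The fallback order can be sketched as follows. This is a minimal illustration, not the package's code: resolveContextWindow is a hypothetical helper, and treating non-positive values as "unavailable" is an assumption made for the example.

```go
package main

import "fmt"

// DefaultContextWindow mirrors the package constant documented above.
const DefaultContextWindow = 512

// resolveContextWindow is a hypothetical helper (not part of the package).
// It prefers an auto-detected value, then a configured value, and finally
// DefaultContextWindow. Non-positive inputs are treated as "not available".
func resolveContextWindow(detected, configured int) int {
	if detected > 0 {
		return detected
	}
	if configured > 0 {
		return configured
	}
	return DefaultContextWindow
}

func main() {
	fmt.Println(resolveContextWindow(8192, 2048)) // detection wins
	fmt.Println(resolveContextWindow(0, 2048))    // config is used
	fmt.Println(resolveContextWindow(0, 0))       // falls back to the default
}
```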

Variables

This section is empty.

Functions

func EmbedForQuery

func EmbedForQuery(ctx context.Context, p Provider, text string) ([]float32, error)

EmbedForQuery is the canonical helper for embedding a search query. It uses the provider's EmbedQuery method when the provider implements QueryEmbedder; otherwise it calls Embed with a single-element slice. It returns a single vector, or nil and an error.

Types

type Provider

type Provider interface {
	// Embed generates embeddings for one or more texts. Returns one
	// vector per input text, in the same order. Implementations should
	// support batching where the underlying provider allows it.
	Embed(ctx context.Context, texts []string) ([][]float32, error)

	// ModelID returns the identifier of the model being used, for
	// tracking in the embedding_model property on nodes.
	ModelID() string

	// ContextWindow returns the model's context window in tokens.
	// Used to size chunks before embedding. Implementations should
	// auto-detect where possible (e.g., Ollama /api/show) and fall
	// back to config or DefaultContextWindow.
	ContextWindow() int
}

Provider generates vector embeddings from text. It is a shared interface with multiple implementations, which is the usual Go justification for defining an interface on the provider side (D29).

func New

New creates an embedding provider from the config. Returns nil if no provider is configured (embedding is optional).

type QueryEmbedder

type QueryEmbedder interface {
	EmbedQuery(ctx context.Context, text string) ([]float32, error)
}

QueryEmbedder is implemented by providers that distinguish between document-time and query-time embeddings at the underlying API level. Cohere on Bedrock accepts an `input_type` field that's "search_document" when embedding indexed content and "search_query" when embedding a retrieval query; using the document type for queries degrades cosine similarity measurably.

Search-time code should type-assert its embedder for QueryEmbedder and prefer EmbedQuery when available; non-implementing providers (OpenAI, Ollama, Titan) treat queries and documents identically and can fall back to Embed.
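To illustrate the document/query distinction, the sketch below builds the two request bodies a Cohere-style provider might send. Only the `input_type` field and its two values come from the text above; the embedRequest struct, the `texts` field name, and the helper functions are assumptions for illustration and are not taken from this package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embedRequest is an assumed request shape for illustration only.
type embedRequest struct {
	Texts     []string `json:"texts"`
	InputType string   `json:"input_type"`
}

const (
	inputTypeDocument = "search_document" // used when embedding indexed content
	inputTypeQuery    = "search_query"    // used when embedding a retrieval query
)

// documentRequest builds the body an Embed implementation might send.
func documentRequest(texts []string) ([]byte, error) {
	return json.Marshal(embedRequest{Texts: texts, InputType: inputTypeDocument})
}

// queryRequest builds the body an EmbedQuery implementation might send.
func queryRequest(text string) ([]byte, error) {
	return json.Marshal(embedRequest{Texts: []string{text}, InputType: inputTypeQuery})
}

func main() {
	doc, _ := documentRequest([]string{"indexed chunk"})
	q, _ := queryRequest("what is indexed?")
	fmt.Println(string(doc))
	fmt.Println(string(q))
}
```

A provider that implements both Embed and EmbedQuery this way keeps the choice of input type out of search-time code, which only needs the type assertion.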

type SetupResult

type SetupResult struct {
	Configured bool
	Provider   string
	Model      string
	Messages   []string
}

SetupResult describes what the setup process found and configured.

func SetupEmbedding

func SetupEmbedding(ctx context.Context, cfg *config.Config) SetupResult

SetupEmbedding detects and configures an embedding provider. Modifies cfg in place. Returns a result with status messages for display.

Priority: if provider is already set (e.g., from config), use it. Otherwise try built-in BERT first (always available, just needs model download), then fall back to Ollama detection.
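The priority order can be sketched as a simple first-match chain. Everything named here is hypothetical: setupConfig stands in for *config.Config (whose fields are not shown on this page), detector and chooseProvider are illustrative, and the provider names "bert" and "ollama" are assumed strings. The example simulates a failed BERT model download purely to exercise the fallback.

```go
package main

import "fmt"

// setupConfig is a hypothetical stand-in for *config.Config.
type setupConfig struct {
	Provider string
}

// detector pairs an assumed provider name with an availability probe.
type detector struct {
	name   string
	detect func() bool
}

// chooseProvider mirrors the documented priority: a provider already set in
// cfg wins; otherwise detectors are tried in order and the first available
// one is written back into cfg. It returns the chosen name ("" if none)
// and status messages for display.
func chooseProvider(cfg *setupConfig, detectors []detector) (string, []string) {
	var msgs []string
	if cfg.Provider != "" {
		msgs = append(msgs, "using configured provider "+cfg.Provider)
		return cfg.Provider, msgs
	}
	for _, d := range detectors {
		if d.detect() {
			cfg.Provider = d.name // modifies cfg in place, as documented
			msgs = append(msgs, "detected provider "+d.name)
			return d.name, msgs
		}
		msgs = append(msgs, d.name+" not available")
	}
	return "", msgs
}

func main() {
	detectors := []detector{
		{name: "bert", detect: func() bool { return false }}, // simulated failure
		{name: "ollama", detect: func() bool { return true }},
	}
	cfg := &setupConfig{}
	name, msgs := chooseProvider(cfg, detectors)
	fmt.Println(name, msgs)
}
```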
