embedder

package
v0.4.0 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 25, 2026 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package embedder provides embedding generation for RAG.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CohereEmbedder

type CohereEmbedder struct {
	// contains filtered or unexported fields
}

CohereEmbedder uses Cohere's embedding API.

func NewCohereEmbedder

func NewCohereEmbedder(opts ...CohereOption) *CohereEmbedder

NewCohereEmbedder creates a new Cohere embedder. It reads the API key from the COHERE_API_KEY environment variable. Default model: embed-english-v3.0 (1024 dimensions).

func (*CohereEmbedder) Dim

func (e *CohereEmbedder) Dim() int

Dim returns the embedding dimension.

func (*CohereEmbedder) Embed

func (e *CohereEmbedder) Embed(ctx context.Context, text string) ([]float32, error)

Embed generates an embedding for a single text.

func (*CohereEmbedder) EmbedBatch

func (e *CohereEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

EmbedBatch generates embeddings for multiple texts.

type CohereOption

type CohereOption func(*CohereEmbedder)

CohereOption configures the Cohere embedder.

func WithCohereInputType

func WithCohereInputType(inputType string) CohereOption

WithCohereInputType sets the input type for embeddings. Use "search_document" for corpus documents, "search_query" for queries.

func WithCohereModel

func WithCohereModel(model string) CohereOption

WithCohereModel sets the Cohere embedding model.

func WithCohereURL added in v0.2.0

func WithCohereURL(url string) CohereOption

WithCohereURL sets a custom API URL (for testing or proxies).
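The CohereOption functions above have the shape of Go's functional-options pattern: each option is a function that mutates the value being constructed. The following is a minimal, self-contained sketch of that pattern, not the package's actual implementation. The `settings` type, its fields, and the `WithModel`, `WithInputType`, and `WithURL` helpers are hypothetical stand-ins for the embedder's unexported state; only the default model name comes from the documentation above.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for the embedder's unexported fields.
type settings struct {
	model     string
	inputType string
	url       string
}

// Option has the same shape as CohereOption: a function that mutates its target.
type Option func(*settings)

// WithModel, WithInputType, and WithURL are illustrative names only.
func WithModel(model string) Option  { return func(s *settings) { s.model = model } }
func WithInputType(t string) Option  { return func(s *settings) { s.inputType = t } }
func WithURL(url string) Option      { return func(s *settings) { s.url = url } }

// newSettings applies defaults first, then each option in order,
// so a later option overrides an earlier one.
func newSettings(opts ...Option) *settings {
	s := &settings{model: "embed-english-v3.0"} // default model named in the docs above
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	s := newSettings(WithInputType("search_query"), WithURL("http://localhost:9999"))
	fmt.Println(s.model, s.inputType, s.url)
}
```

Under this pattern, a caller only names the settings that differ from the defaults, and new options can be added later without changing the constructor's signature.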

type Embedder

type Embedder interface {
	// Embed generates an embedding for a single text.
	Embed(ctx context.Context, text string) ([]float32, error)

	// EmbedBatch generates embeddings for multiple texts.
	EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

	// Dim returns the embedding dimension.
	Dim() int
}

Embedder generates vector embeddings from text.

type OllamaEmbedder

type OllamaEmbedder struct {
	// contains filtered or unexported fields
}

OllamaEmbedder uses Ollama's local embedding API.

func NewOllamaEmbedder

func NewOllamaEmbedder(opts ...OllamaOption) *OllamaEmbedder

NewOllamaEmbedder creates a new Ollama embedder.

Default model: nomic-embed-text (768 dimensions)
Default URL: http://localhost:11434
Default concurrency: 8 (for parallel batch embedding)

func (*OllamaEmbedder) Dim

func (e *OllamaEmbedder) Dim() int

Dim returns the embedding dimension.

func (*OllamaEmbedder) Embed

func (e *OllamaEmbedder) Embed(ctx context.Context, text string) ([]float32, error)

Embed generates an embedding for a single text.

func (*OllamaEmbedder) EmbedBatch

func (e *OllamaEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

EmbedBatch generates embeddings for multiple texts concurrently. Ollama doesn't support native batching, so we parallelize with a worker pool.

type OllamaOption

type OllamaOption func(*OllamaEmbedder)

OllamaOption configures the Ollama embedder.

func WithOllamaConcurrency

func WithOllamaConcurrency(n int) OllamaOption

WithOllamaConcurrency sets the number of concurrent embedding requests. The default is 8. Higher values speed up large batches but increase memory and CPU usage. For datasets of 100k+ items, 8-16 is recommended.

func WithOllamaModel

func WithOllamaModel(model string) OllamaOption

WithOllamaModel sets the embedding model.

func WithOllamaURL

func WithOllamaURL(url string) OllamaOption

WithOllamaURL sets the Ollama server URL.

type OpenAIEmbedder

type OpenAIEmbedder struct {
	// contains filtered or unexported fields
}

OpenAIEmbedder uses OpenAI's embedding API.

func NewOpenAIEmbedder

func NewOpenAIEmbedder(opts ...OpenAIOption) *OpenAIEmbedder

NewOpenAIEmbedder creates a new OpenAI embedder. It reads the API key from the OPENAI_API_KEY environment variable.

func (*OpenAIEmbedder) Dim

func (e *OpenAIEmbedder) Dim() int

Dim returns the embedding dimension.

func (*OpenAIEmbedder) Embed

func (e *OpenAIEmbedder) Embed(ctx context.Context, text string) ([]float32, error)

Embed generates an embedding for a single text.

func (*OpenAIEmbedder) EmbedBatch

func (e *OpenAIEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

EmbedBatch generates embeddings for multiple texts.

type OpenAIOption added in v0.2.0

type OpenAIOption func(*OpenAIEmbedder)

OpenAIOption configures the OpenAI embedder.

func WithOpenAIURL added in v0.2.0

func WithOpenAIURL(url string) OpenAIOption

WithOpenAIURL sets a custom API URL (for testing or proxies).

type TEIEmbedder

type TEIEmbedder struct {
	// contains filtered or unexported fields
}

TEIEmbedder uses a Hugging Face Text Embeddings Inference (TEI) server. TEI provides native batching and is optimized for high-throughput embedding. Run with:

	docker run -p 8080:80 ghcr.io/huggingface/text-embeddings-inference:latest --model-id BAAI/bge-base-en-v1.5

func NewTEIEmbedder

func NewTEIEmbedder(opts ...TEIOption) *TEIEmbedder

NewTEIEmbedder creates a new Hugging Face TEI embedder.

Default URL: http://localhost:8080
Default model: BAAI/bge-base-en-v1.5 (768 dimensions)

func (*TEIEmbedder) Dim

func (e *TEIEmbedder) Dim() int

Dim returns the embedding dimension.

func (*TEIEmbedder) Embed

func (e *TEIEmbedder) Embed(ctx context.Context, text string) ([]float32, error)

Embed generates an embedding for a single text.

func (*TEIEmbedder) EmbedBatch

func (e *TEIEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

EmbedBatch generates embeddings for multiple texts. TEI supports native batching for high throughput.

type TEIOption

type TEIOption func(*TEIEmbedder)

TEIOption configures the TEI embedder.

func WithTEIDim

func WithTEIDim(dim int) TEIOption

WithTEIDim explicitly sets the embedding dimension. Use this if your model isn't in the auto-detect list. Explicit dimension takes precedence over model-based auto-detection.
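The precedence rule described above (explicit dimension first, model-based lookup second) can be sketched as a small resolver. This is a hypothetical illustration: `knownDims` and `resolveDim` are invented names, and the table contains only the one model and dimension stated on this page, since the package's real auto-detect list is not shown here.

```go
package main

import "fmt"

// knownDims is a hypothetical auto-detect table. Only this one entry is
// taken from the documentation above; the real list is not shown there.
var knownDims = map[string]int{
	"BAAI/bge-base-en-v1.5": 768,
}

// resolveDim returns the explicit dimension when one is set (> 0);
// otherwise it falls back to the model lookup. The boolean reports
// whether a dimension could be determined at all.
func resolveDim(explicit int, model string) (int, bool) {
	if explicit > 0 {
		return explicit, true
	}
	d, ok := knownDims[model]
	return d, ok
}

func main() {
	fmt.Println(resolveDim(384, "BAAI/bge-base-en-v1.5")) // explicit value wins
	fmt.Println(resolveDim(0, "BAAI/bge-base-en-v1.5"))   // falls back to the table
	fmt.Println(resolveDim(0, "some/unlisted-model"))     // unknown: caller must set it
}
```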

func WithTEIModel

func WithTEIModel(model string) TEIOption

WithTEIModel sets the model name for dimension inference. Note: The actual model is configured when starting the TEI server.

func WithTEIURL

func WithTEIURL(url string) TEIOption

WithTEIURL sets the TEI server URL.

type VoyageEmbedder

type VoyageEmbedder struct {
	// contains filtered or unexported fields
}

VoyageEmbedder uses Voyage AI's embedding API.

func NewVoyageEmbedder

func NewVoyageEmbedder(opts ...VoyageOption) *VoyageEmbedder

NewVoyageEmbedder creates a new Voyage AI embedder. It reads the API key from the VOYAGE_API_KEY environment variable. Default model: voyage-2 (1024 dimensions).

func (*VoyageEmbedder) Dim

func (e *VoyageEmbedder) Dim() int

Dim returns the embedding dimension.

func (*VoyageEmbedder) Embed

func (e *VoyageEmbedder) Embed(ctx context.Context, text string) ([]float32, error)

Embed generates an embedding for a single text.

func (*VoyageEmbedder) EmbedBatch

func (e *VoyageEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

EmbedBatch generates embeddings for multiple texts.

type VoyageOption

type VoyageOption func(*VoyageEmbedder)

VoyageOption configures the Voyage embedder.

func WithVoyageInputType

func WithVoyageInputType(inputType string) VoyageOption

WithVoyageInputType sets the input type for embeddings. Use "document" for corpus documents, "query" for queries.

func WithVoyageModel

func WithVoyageModel(model string) VoyageOption

WithVoyageModel sets the Voyage embedding model.

func WithVoyageURL added in v0.2.0

func WithVoyageURL(url string) VoyageOption

WithVoyageURL sets a custom API URL (for testing or proxies).
