Overview ¶
Package embedder provides embedding generation for RAG.
Types ¶
type CohereEmbedder ¶
type CohereEmbedder struct {
// contains filtered or unexported fields
}
CohereEmbedder uses Cohere's embedding API.
func NewCohereEmbedder ¶
func NewCohereEmbedder(opts ...CohereOption) *CohereEmbedder
NewCohereEmbedder creates a new Cohere embedder. It reads the API key from the COHERE_API_KEY environment variable.
Default model: embed-english-v3.0 (1024 dimensions)
func (*CohereEmbedder) Dim ¶
func (e *CohereEmbedder) Dim() int
Dim returns the embedding dimension.
func (*CohereEmbedder) EmbedBatch ¶
EmbedBatch generates embeddings for multiple texts.
type CohereOption ¶
type CohereOption func(*CohereEmbedder)
CohereOption configures the Cohere embedder.
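The option types in this package follow Go's functional options pattern. Because the struct fields are unexported, the sketch below uses a hypothetical stand-in (cohereConfig, option, withModel, withInputType, newConfig are all illustrative names, and the "search_document" default is an assumption) to show how defaults and options might combine.

```go
package main

import "fmt"

// cohereConfig is a hypothetical stand-in for CohereEmbedder's unexported fields.
type cohereConfig struct {
	model     string
	inputType string
}

// option mirrors the shape of CohereOption: a function that mutates the target.
type option func(*cohereConfig)

// withModel and withInputType mimic WithCohereModel and WithCohereInputType.
func withModel(model string) option {
	return func(c *cohereConfig) { c.model = model }
}

func withInputType(inputType string) option {
	return func(c *cohereConfig) { c.inputType = inputType }
}

// newConfig starts from defaults and applies options in order,
// so a later option overrides an earlier one.
func newConfig(opts ...option) *cohereConfig {
	c := &cohereConfig{
		model:     "embed-english-v3.0", // documented default model
		inputType: "search_document",    // assumed default, not stated in the docs
	}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	c := newConfig(withInputType("search_query"))
	fmt.Println(c.model, c.inputType)
}
```

With the real package, the equivalent call would presumably look like NewCohereEmbedder(WithCohereInputType("search_query")).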
func WithCohereInputType ¶
func WithCohereInputType(inputType string) CohereOption
WithCohereInputType sets the input type for embeddings. Use "search_document" for corpus documents, "search_query" for queries.
func WithCohereModel ¶
func WithCohereModel(model string) CohereOption
WithCohereModel sets the Cohere embedding model.
func WithCohereURL ¶ added in v0.2.0
func WithCohereURL(url string) CohereOption
WithCohereURL sets a custom API URL (for testing or proxies).
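For the testing use case, a local stub server from net/http/httptest can supply the URL string. The sketch below only shows how such a URL is obtained and exercised; the stubResponse shape and the demo helper are hypothetical and do not reflect Cohere's real response schema or the request path the embedder uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// stubResponse is a hypothetical payload, not a real provider schema.
type stubResponse struct {
	Embedding []float32 `json:"embedding"`
}

// demo starts a stub server, calls it once, and returns the decoded vector.
func demo() ([]float32, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(stubResponse{Embedding: []float32{0.1, 0.2}})
	}))
	defer srv.Close()

	// srv.URL is the string a test would pass to an option such as WithCohereURL.
	resp, err := http.Get(srv.URL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var out stubResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return out.Embedding, nil
}

func main() {
	vec, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(vec))
}
```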
type Embedder ¶
type Embedder interface {
// Embed generates an embedding for a single text.
Embed(ctx context.Context, text string) ([]float32, error)
// EmbedBatch generates embeddings for multiple texts.
EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)
// Dim returns the embedding dimension.
Dim() int
}
Embedder generates vector embeddings from text.
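Any type with these three methods satisfies the interface. As an illustration, here is a minimal sketch of a deterministic test double; fakeEmbedder and demo are hypothetical names that are not part of the package, and the hash-based vectors carry no semantic meaning.

```go
package main

import (
	"context"
	"fmt"
	"hash/fnv"
)

// Embedder is copied from the interface definition above.
type Embedder interface {
	Embed(ctx context.Context, text string) ([]float32, error)
	EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)
	Dim() int
}

// fakeEmbedder is a hypothetical test double, not part of the package.
type fakeEmbedder struct{ dim int }

// Embed derives a deterministic pseudo-vector from an FNV-1a hash of the text.
func (f fakeEmbedder) Embed(ctx context.Context, text string) ([]float32, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	h := fnv.New32a()
	h.Write([]byte(text))
	seed := h.Sum32()
	vec := make([]float32, f.dim)
	for i := range vec {
		seed = seed*1664525 + 1013904223     // linear congruential step
		vec[i] = float32(seed%2000)/1000 - 1 // value in [-1, 1)
	}
	return vec, nil
}

// EmbedBatch embeds each text in turn and keeps input order.
func (f fakeEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error) {
	out := make([][]float32, len(texts))
	for i, t := range texts {
		v, err := f.Embed(ctx, t)
		if err != nil {
			return nil, err
		}
		out[i] = v
	}
	return out, nil
}

func (f fakeEmbedder) Dim() int { return f.dim }

// Compile-time check that fakeEmbedder satisfies Embedder.
var _ Embedder = fakeEmbedder{}

// demo embeds the same text twice plus one other text.
func demo() ([][]float32, error) {
	var e Embedder = fakeEmbedder{dim: 4}
	return e.EmbedBatch(context.Background(), []string{"hello", "hello", "world"})
}

func main() {
	vecs, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(vecs), len(vecs[0]))
}
```

Code that accepts the Embedder interface rather than a concrete type can then be tested without network access or API keys.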
type OllamaEmbedder ¶
type OllamaEmbedder struct {
// contains filtered or unexported fields
}
OllamaEmbedder uses Ollama's local embedding API.
func NewOllamaEmbedder ¶
func NewOllamaEmbedder(opts ...OllamaOption) *OllamaEmbedder
NewOllamaEmbedder creates a new Ollama embedder.
Default model: nomic-embed-text (768 dimensions)
Default URL: http://localhost:11434
Default concurrency: 8 (for parallel batch embedding)
func (*OllamaEmbedder) Dim ¶
func (e *OllamaEmbedder) Dim() int
Dim returns the embedding dimension.
func (*OllamaEmbedder) EmbedBatch ¶
EmbedBatch generates embeddings for multiple texts concurrently. Ollama doesn't support native batching, so we parallelize with a worker pool.
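The worker-pool approach described above can be sketched roughly as follows. This is not the package's actual implementation: embedFunc, embedConcurrently, and demo are hypothetical names, and the error handling (return the first error, stop feeding new work) is one possible choice among several.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// embedFunc stands in for a single-text embedding call (hypothetical).
type embedFunc func(ctx context.Context, text string) ([]float32, error)

// embedConcurrently runs embed over texts with at most `workers` goroutines
// and returns the results in input order.
func embedConcurrently(ctx context.Context, texts []string, workers int, embed embedFunc) ([][]float32, error) {
	if workers < 1 {
		workers = 1
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	out := make([][]float32, len(texts))
	jobs := make(chan int)
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				vec, err := embed(ctx, texts[i])
				if err != nil {
					// Record only the first error and stop the feeder.
					once.Do(func() { firstErr = err; cancel() })
					continue
				}
				out[i] = vec // each index is written by exactly one worker
			}
		}()
	}

feed:
	for i := range texts {
		select {
		case jobs <- i:
		case <-ctx.Done():
			break feed
		}
	}
	close(jobs)
	wg.Wait()

	if firstErr != nil {
		return nil, firstErr
	}
	if err := ctx.Err(); err != nil {
		return nil, err // the caller's context was cancelled
	}
	return out, nil
}

// demo uses a toy embed function: a one-element vector holding the text length.
func demo() ([][]float32, error) {
	embed := func(_ context.Context, text string) ([]float32, error) {
		return []float32{float32(len(text))}, nil
	}
	return embedConcurrently(context.Background(), []string{"a", "bb", "ccc"}, 8, embed)
}

func main() {
	vecs, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(vecs) // [[1] [2] [3]]
}
```

Writing each result to its own slice index keeps the output aligned with the input without extra locking.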
type OllamaOption ¶
type OllamaOption func(*OllamaEmbedder)
OllamaOption configures the Ollama embedder.
func WithOllamaConcurrency ¶
func WithOllamaConcurrency(n int) OllamaOption
WithOllamaConcurrency sets the number of concurrent embedding requests. Default is 8. Higher values speed up large batches but increase memory and CPU usage. For datasets of 100k+ texts, a value of 8 to 16 is recommended.
func WithOllamaModel ¶
func WithOllamaModel(model string) OllamaOption
WithOllamaModel sets the embedding model.
func WithOllamaURL ¶
func WithOllamaURL(url string) OllamaOption
WithOllamaURL sets the Ollama server URL.
type OpenAIEmbedder ¶
type OpenAIEmbedder struct {
// contains filtered or unexported fields
}
OpenAIEmbedder uses OpenAI's embedding API.
func NewOpenAIEmbedder ¶
func NewOpenAIEmbedder(opts ...OpenAIOption) *OpenAIEmbedder
NewOpenAIEmbedder creates a new OpenAI embedder. It reads the API key from the OPENAI_API_KEY environment variable.
func (*OpenAIEmbedder) Dim ¶
func (e *OpenAIEmbedder) Dim() int
Dim returns the embedding dimension.
func (*OpenAIEmbedder) EmbedBatch ¶
EmbedBatch generates embeddings for multiple texts.
type OpenAIOption ¶ added in v0.2.0
type OpenAIOption func(*OpenAIEmbedder)
OpenAIOption configures the OpenAI embedder.
func WithOpenAIURL ¶ added in v0.2.0
func WithOpenAIURL(url string) OpenAIOption
WithOpenAIURL sets a custom API URL (for testing or proxies).
type TEIEmbedder ¶
type TEIEmbedder struct {
// contains filtered or unexported fields
}
TEIEmbedder uses a Hugging Face Text Embeddings Inference (TEI) server. TEI provides native batching and is optimized for high-throughput embedding. Run with:
docker run -p 8080:80 ghcr.io/huggingface/text-embeddings-inference:latest --model-id BAAI/bge-base-en-v1.5
func NewTEIEmbedder ¶
func NewTEIEmbedder(opts ...TEIOption) *TEIEmbedder
NewTEIEmbedder creates a new Hugging Face TEI embedder.
Default URL: http://localhost:8080
Default model: BAAI/bge-base-en-v1.5 (768 dimensions)
func (*TEIEmbedder) EmbedBatch ¶
EmbedBatch generates embeddings for multiple texts. TEI supports native batching for high throughput.
type TEIOption ¶
type TEIOption func(*TEIEmbedder)
TEIOption configures the TEI embedder.
func WithTEIDim ¶
WithTEIDim explicitly sets the embedding dimension. Use this if your model isn't in the auto-detect list. Explicit dimension takes precedence over model-based auto-detection.
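The precedence rule can be pictured with a small sketch. The knownDims table and resolveDim function are hypothetical; the package's real auto-detect list and its behavior for unknown models are not shown in this documentation, so the fallback argument is an assumption.

```go
package main

import "fmt"

// knownDims is a hypothetical auto-detect table. Its single entry is the
// documented default model and dimension.
var knownDims = map[string]int{
	"BAAI/bge-base-en-v1.5": 768,
}

// resolveDim returns the explicit dimension when set (> 0), otherwise the
// table entry for the model, otherwise the fallback.
func resolveDim(explicit int, model string, fallback int) int {
	if explicit > 0 {
		return explicit
	}
	if d, ok := knownDims[model]; ok {
		return d
	}
	return fallback
}

func main() {
	fmt.Println(resolveDim(0, "BAAI/bge-base-en-v1.5", 768))    // 768
	fmt.Println(resolveDim(1024, "BAAI/bge-base-en-v1.5", 768)) // 1024
	fmt.Println(resolveDim(384, "my-org/custom-model", 768))    // 384
}
```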
func WithTEIModel ¶
WithTEIModel sets the model name for dimension inference. Note: The actual model is configured when starting the TEI server.
type VoyageEmbedder ¶
type VoyageEmbedder struct {
// contains filtered or unexported fields
}
VoyageEmbedder uses Voyage AI's embedding API.
func NewVoyageEmbedder ¶
func NewVoyageEmbedder(opts ...VoyageOption) *VoyageEmbedder
NewVoyageEmbedder creates a new Voyage AI embedder. It reads the API key from the VOYAGE_API_KEY environment variable.
Default model: voyage-2 (1024 dimensions)
func (*VoyageEmbedder) Dim ¶
func (e *VoyageEmbedder) Dim() int
Dim returns the embedding dimension.
func (*VoyageEmbedder) EmbedBatch ¶
EmbedBatch generates embeddings for multiple texts.
type VoyageOption ¶
type VoyageOption func(*VoyageEmbedder)
VoyageOption configures the Voyage embedder.
func WithVoyageInputType ¶
func WithVoyageInputType(inputType string) VoyageOption
WithVoyageInputType sets the input type for embeddings. Use "document" for corpus documents, "query" for queries.
func WithVoyageModel ¶
func WithVoyageModel(model string) VoyageOption
WithVoyageModel sets the Voyage embedding model.
func WithVoyageURL ¶ added in v0.2.0
func WithVoyageURL(url string) VoyageOption
WithVoyageURL sets a custom API URL (for testing or proxies).