Documentation ¶
Overview ¶
Package embedding defines the Embedder component interface for converting text into vector representations.
An Embedder converts a batch of strings into dense float vectors. Semantically similar texts produce vectors that are close in the vector space, making embeddings the backbone of semantic search, RAG pipelines, and clustering.
Concrete implementations (OpenAI, Ark, Ollama, …) live in eino-ext:
github.com/cloudwego/eino-ext/components/embedding/
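The "closeness" mentioned in the overview is commonly measured with cosine similarity. The following is a minimal sketch under that assumption; the package itself does not prescribe a metric, and the CosineSimilarity helper and the toy vectors are hypothetical, not part of the embedding API.

```go
package main

import (
	"fmt"
	"math"
)

// CosineSimilarity returns the cosine of the angle between a and b.
// It returns 0 when the lengths differ, the vectors are empty,
// or either vector has zero norm.
func CosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	// Hand-written toy vectors; real embeddings come from an Embedder.
	cat := []float64{0.9, 0.1, 0.0}
	kitten := []float64{0.8, 0.2, 0.1}
	invoice := []float64{0.0, 0.1, 0.9}

	fmt.Printf("cat vs kitten:  %.3f\n", CosineSimilarity(cat, kitten))
	fmt.Printf("cat vs invoice: %.3f\n", CosineSimilarity(cat, invoice))
}
```

A vector store normally performs this comparison internally; the helper is only meant to make the geometric intuition concrete.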
Output Format ¶
[Embedder.EmbedStrings] returns `[][]float64` where:
- outer index corresponds to the input text at the same position
- inner slice is the embedding vector; its length (dimensions) is fixed by the model and is the same for every text
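The two properties listed above can be checked mechanically. This is a minimal sketch of such a check; ValidateEmbeddings is a hypothetical helper, not something the package provides, and the sample values are made up.

```go
package main

import (
	"errors"
	"fmt"
)

// ValidateEmbeddings checks the documented shape: one vector per input text,
// and the same length for every vector. It returns the shared dimension.
func ValidateEmbeddings(texts []string, embeddings [][]float64) (int, error) {
	if len(embeddings) != len(texts) {
		return 0, fmt.Errorf("got %d vectors for %d texts", len(embeddings), len(texts))
	}
	if len(embeddings) == 0 {
		return 0, errors.New("no embeddings to validate")
	}
	dim := len(embeddings[0])
	for i, v := range embeddings {
		if len(v) != dim {
			return 0, fmt.Errorf("vector %d has length %d, want %d", i, len(v), dim)
		}
	}
	return dim, nil
}

func main() {
	texts := []string{"hello", "world"}
	// Stand-in for the result of EmbedStrings; the values are made up.
	embeddings := [][]float64{{0.1, 0.2, 0.3}, {0.4, 0.5, 0.6}}

	dim, err := ValidateEmbeddings(texts, embeddings)
	if err != nil {
		fmt.Println("invalid output:", err)
		return
	}
	fmt.Println("dimension:", dim)
}
```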
Consistency Requirement ¶
The same model must be used for both indexing and retrieval. Mixing models produces vectors in different spaces — similarity scores become meaningless and semantic search breaks silently.
See https://www.cloudwego.io/docs/eino/core_modules/components/embedding_guide/
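One way to turn a silent model mismatch into a loud one is to record the model name and dimension when the index is built and to compare them at query time. The sketch below assumes that approach; IndexMeta and CheckQuery are hypothetical names, and nothing in the package enforces this check for you.

```go
package main

import "fmt"

// IndexMeta is a hypothetical record stored next to a vector index.
type IndexMeta struct {
	Model string // model name used when the documents were embedded
	Dim   int    // length of every stored vector
}

// CheckQuery returns an error when a query vector cannot belong to the same
// vector space as the index: a different model name or a different dimension.
func (m IndexMeta) CheckQuery(model string, query []float64) error {
	if model != m.Model {
		return fmt.Errorf("index built with model %q, query uses %q", m.Model, model)
	}
	if len(query) != m.Dim {
		return fmt.Errorf("index dimension is %d, query vector has %d", m.Dim, len(query))
	}
	return nil
}

func main() {
	meta := IndexMeta{Model: "model-a", Dim: 3}
	query := []float64{0.1, 0.2, 0.3}

	fmt.Println(meta.CheckQuery("model-a", query))
	fmt.Println(meta.CheckQuery("model-b", query))
}
```

Comparing the model name matters because two different models can share a dimension, so a dimension check alone would not catch every mismatch.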
Functions ¶
func GetImplSpecificOptions ¶ added in v0.3.6
GetImplSpecificOptions extracts implementation-specific options from opts and applies them on top of base. Call it alongside GetCommonOptions inside EmbedStrings:
func (e *MyEmbedder) EmbedStrings(ctx context.Context, texts []string, opts ...embedding.Option) ([][]float64, error) {
	common := embedding.GetCommonOptions(nil, opts...)
	mine := embedding.GetImplSpecificOptions(&MyOptions{}, opts...)
	// use common.Model, mine.MyParam, etc.
}
Types ¶
type CallbackInput ¶
type CallbackInput struct {
	// Texts is the texts to be embedded.
	Texts []string
	// Config is the config for the embedding.
	Config *Config
	// Extra is the extra information for the callback.
	Extra map[string]any
}
CallbackInput is the input for the embedding callback.
func ConvCallbackInput ¶
func ConvCallbackInput(src callbacks.CallbackInput) *CallbackInput
ConvCallbackInput converts the callback input to the embedding callback input.
type CallbackOutput ¶
type CallbackOutput struct {
	// Embeddings is the embeddings.
	Embeddings [][]float64
	// Config is the config for creating the embedding.
	Config *Config
	// TokenUsage is the token usage for the embedding.
	TokenUsage *TokenUsage
	// Extra is the extra information for the callback.
	Extra map[string]any
}
CallbackOutput is the output for the embedding callback.
func ConvCallbackOutput ¶
func ConvCallbackOutput(src callbacks.CallbackOutput) *CallbackOutput
ConvCallbackOutput converts the callback output to the embedding callback output.
type ComponentExtra ¶
type ComponentExtra struct {
	// Config is the config for the embedding.
	Config *Config
	// TokenUsage is the token usage for the embedding.
	TokenUsage *TokenUsage
}
ComponentExtra is the extra information for the embedding.
type Config ¶
type Config struct {
	// Model is the model name.
	Model string
	// EncodingFormat is the encoding format.
	EncodingFormat string
}
Config is the config for the embedding.
type Embedder ¶
type Embedder interface {
	EmbedStrings(ctx context.Context, texts []string, opts ...Option) ([][]float64, error) // invoke
}
Embedder converts a batch of strings into dense vector representations.
EmbedStrings returns one vector per input text, in the same order. The vector length (dimensions) is fixed by the underlying model and identical for every text in the batch.
The returned [][]float64 maps as:
	embeddings[i]      → vector for texts[i]
	len(embeddings[i]) → model's embedding dimension (e.g. 1536 for ada-002)
Both [Indexer] and [Retriever] use an Embedder to convert documents and queries into vectors. They must share the exact same model — mismatched dimensions or model families break semantic similarity.
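For unit tests it can be handy to have an Embedder that needs no network access. The sketch below assumes a hash-based test double; HashEmbedder and embedSample are hypothetical, and Option and Embedder are redeclared locally (mirroring the shapes shown above) only so the example compiles without importing the real package. The vectors it produces are deterministic but carry no semantic meaning.

```go
package main

import (
	"context"
	"fmt"
	"hash/fnv"
)

// Option and Embedder are local stand-ins that mirror the documented shapes.
type Option struct{}

type Embedder interface {
	EmbedStrings(ctx context.Context, texts []string, opts ...Option) ([][]float64, error)
}

// HashEmbedder is a hypothetical test double. Equal texts always map to equal
// vectors, but similar texts do NOT map to nearby vectors.
type HashEmbedder struct {
	Dim int // length of every returned vector
}

func (e HashEmbedder) EmbedStrings(ctx context.Context, texts []string, opts ...Option) ([][]float64, error) {
	if e.Dim <= 0 {
		return nil, fmt.Errorf("invalid dimension %d", e.Dim)
	}
	out := make([][]float64, len(texts))
	for i, text := range texts {
		// Stop early if the caller cancelled the context.
		if err := ctx.Err(); err != nil {
			return nil, err
		}
		vec := make([]float64, e.Dim)
		for j := range vec {
			h := fnv.New32a()
			// Hash the component index together with the text.
			fmt.Fprintf(h, "%d:%s", j, text)
			// Scale the 32-bit hash into the range [0, 1].
			vec[j] = float64(h.Sum32()) / float64(^uint32(0))
		}
		out[i] = vec // out[i] corresponds to texts[i]
	}
	return out, nil
}

// Compile-time check that HashEmbedder satisfies the interface.
var _ Embedder = HashEmbedder{}

// embedSample embeds the given texts with a background context.
func embedSample(dim int, texts ...string) ([][]float64, error) {
	return HashEmbedder{Dim: dim}.EmbedStrings(context.Background(), texts)
}

func main() {
	vectors, err := embedSample(4, "alpha", "beta")
	if err != nil {
		fmt.Println("embed failed:", err)
		return
	}
	fmt.Println(len(vectors), len(vectors[0]))
}
```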
type Option ¶
type Option struct {
	// contains filtered or unexported fields
}
Option is a call-time option for an Embedder.
func WrapImplSpecificOptFn ¶ added in v0.3.6
WrapImplSpecificOptFn wraps an implementation-specific option function so it can be passed alongside standard options. For use by Embedder implementors:
func WithMyParam(v string) embedding.Option {
	return embedding.WrapImplSpecificOptFn(func(o *MyOptions) {
		o.MyParam = v
	})
}
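To show how wrapping and extraction fit together, here is a simplified, self-contained model of the pattern. It is a sketch, not the package's source: the local Option type, the lower-case wrapImplSpecificOptFn and getImplSpecificOptions functions, and the choice to store the typed function as an `any` field are all assumptions made for illustration. MyOptions, OtherOptions, and WithRetries are hypothetical.

```go
package main

import "fmt"

// Option is a local stand-in for the package's opaque Option type.
// Storing the typed function as `any` is an assumption of this sketch.
type Option struct {
	implSpecificOptFn any
}

// wrapImplSpecificOptFn hides a typed option function behind Option.
func wrapImplSpecificOptFn[T any](fn func(*T)) Option {
	return Option{implSpecificOptFn: fn}
}

// getImplSpecificOptions applies every option whose function targets *T
// onto base and silently skips options meant for other types.
func getImplSpecificOptions[T any](base *T, opts ...Option) *T {
	if base == nil {
		base = new(T)
	}
	for _, opt := range opts {
		if fn, ok := opt.implSpecificOptFn.(func(*T)); ok {
			fn(base)
		}
	}
	return base
}

// MyOptions and OtherOptions are hypothetical implementation-specific structs.
type MyOptions struct{ MyParam string }
type OtherOptions struct{ Retries int }

func WithMyParam(v string) Option {
	return wrapImplSpecificOptFn(func(o *MyOptions) { o.MyParam = v })
}

func WithRetries(n int) Option {
	return wrapImplSpecificOptFn(func(o *OtherOptions) { o.Retries = n })
}

func main() {
	opts := []Option{WithMyParam("fast"), WithRetries(3)}

	// base carries the default; WithMyParam overrides it, WithRetries is skipped.
	mine := getImplSpecificOptions(&MyOptions{MyParam: "default"}, opts...)
	fmt.Println(mine.MyParam)
}
```

The point of the design is that one option list can carry settings for several implementations, and each implementation picks out only the ones whose type it recognises.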
type Options ¶
type Options struct {
	// Model is the model name for the embedding.
	Model *string
}
Options is the options for the embedding.
func GetCommonOptions ¶
GetCommonOptions extracts the common embedding Options from a list of Option values. An optional base Options supplies default values, e.g.:
defaultModelName := "default_model"
embeddingOption := &embedding.Options{
	Model: &defaultModelName,
}
// Plain assignment: embeddingOption is already declared above.
embeddingOption = embedding.GetCommonOptions(embeddingOption, opts...)
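Because Options.Model is a pointer, it can be nil when neither the base nor the caller set a model, so dereferencing it directly risks a panic. The following sketch shows one defensive way to read it; ResolveModel is a hypothetical helper, and Options is redeclared locally so the example runs without importing the real package.

```go
package main

import "fmt"

// Options mirrors the struct shown above.
type Options struct {
	Model *string
}

// ResolveModel returns the model name carried by o. It falls back to the
// given default when o is nil, Model is nil, or Model points to "".
func ResolveModel(o *Options, fallback string) string {
	if o == nil || o.Model == nil || *o.Model == "" {
		return fallback
	}
	return *o.Model
}

func main() {
	name := "custom-model"
	fmt.Println(ResolveModel(&Options{Model: &name}, "default_model"))
	fmt.Println(ResolveModel(&Options{}, "default_model"))
}
```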
type TokenUsage ¶
type TokenUsage struct {
	// PromptTokens is the number of prompt tokens.
	PromptTokens int
	// CompletionTokens is the number of completion tokens.
	CompletionTokens int
	// TotalTokens is the total number of tokens.
	TotalTokens int
}
TokenUsage is the token usage for the embedding.
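When texts are embedded in several batches, per-call usage figures can be summed into a running total. This is a minimal sketch of that bookkeeping; AddUsage is a hypothetical helper, TokenUsage is redeclared locally to keep the example self-contained, and the per-batch numbers are made up. Treating a nil pointer as "no usage reported" is an assumption, chosen because the callback structs above expose TokenUsage as a pointer.

```go
package main

import "fmt"

// TokenUsage mirrors the struct shown above.
type TokenUsage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
}

// AddUsage returns total plus u. A nil u leaves total unchanged.
func AddUsage(total TokenUsage, u *TokenUsage) TokenUsage {
	if u == nil {
		return total
	}
	total.PromptTokens += u.PromptTokens
	total.CompletionTokens += u.CompletionTokens
	total.TotalTokens += u.TotalTokens
	return total
}

func main() {
	// Made-up per-batch figures standing in for values read from callbacks.
	batches := []*TokenUsage{
		{PromptTokens: 12, TotalTokens: 12},
		nil, // a batch whose backend reported no usage
		{PromptTokens: 30, TotalTokens: 30},
	}

	var total TokenUsage
	for _, u := range batches {
		total = AddUsage(total, u)
	}
	fmt.Printf("%+v\n", total)
}
```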