embedding

package
v0.9.0-alpha.3
Published: Mar 12, 2026 License: Apache-2.0 Imports: 2 Imported by: 142

Documentation

Overview

Package embedding defines the Embedder component interface for converting text into vector representations.

An Embedder converts a batch of strings into dense float vectors. Semantically similar texts produce vectors that are close in the vector space, making embeddings the backbone of semantic search, RAG pipelines, and clustering.

Concrete implementations (OpenAI, Ark, Ollama, …) live in eino-ext:

github.com/cloudwego/eino-ext/components/embedding/

Output Format

[Embedder.EmbedStrings] returns `[][]float64` where:

  • outer index corresponds to the input text at the same position
  • inner slice is the embedding vector; its length (dimensions) is fixed by the model and is the same for every text
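
The two properties above can be checked mechanically before the vectors are stored. The following is a minimal sketch, assuming a hypothetical helper named checkShape that is not part of this package; the sample vectors are made up and far shorter than real embeddings.

```go
package main

import (
	"errors"
	"fmt"
)

// checkShape verifies the output format described above: one vector per
// input text, and the same dimension for every vector.
// It is a hypothetical helper, not part of the embedding package.
func checkShape(texts []string, embeddings [][]float64) (dim int, err error) {
	if len(embeddings) != len(texts) {
		return 0, fmt.Errorf("got %d vectors for %d texts", len(embeddings), len(texts))
	}
	if len(embeddings) == 0 {
		return 0, nil
	}
	dim = len(embeddings[0])
	if dim == 0 {
		return 0, errors.New("empty embedding vector at index 0")
	}
	for i, v := range embeddings {
		if len(v) != dim {
			return 0, fmt.Errorf("vector %d has dimension %d, want %d", i, len(v), dim)
		}
	}
	return dim, nil
}

func main() {
	texts := []string{"hello", "world"}
	// Stand-in for the result of EmbedStrings; real vectors are much longer.
	embeddings := [][]float64{{0.1, 0.2, 0.3}, {0.4, 0.5, 0.6}}

	dim, err := checkShape(texts, embeddings)
	if err != nil {
		fmt.Println("invalid output:", err)
		return
	}
	fmt.Printf("%d vectors of dimension %d\n", len(embeddings), dim)
}
```

A check like this is cheap and catches truncated or misaligned responses before they reach a vector store.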

Consistency Requirement

The same model must be used for both indexing and retrieval. Mixing models produces vectors in different spaces — similarity scores become meaningless and semantic search breaks silently.
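
To illustrate why mixing models is a problem, here is a small sketch of cosine similarity, a common way to compare embedding vectors. The cosine function is a hypothetical helper, not part of this package. It can only reject vectors whose dimensions differ; two different models that happen to share a dimension would pass this check and still yield meaningless scores, so recording the model name next to the index is a reasonable extra safeguard.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosine returns the cosine similarity of a and b.
// It rejects vectors of different length, which is the only kind of model
// mismatch that can be detected from the vectors alone.
func cosine(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("dimension mismatch: %d vs %d", len(a), len(b))
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0, errors.New("cannot compare a zero-norm vector")
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB)), nil
}

func main() {
	stored := []float64{0.6, 0.8, 0.0} // produced at indexing time
	query := []float64{0.6, 0.8, 0.0}  // same model, same dimension
	foreign := []float64{0.6, 0.8}     // a model with a different dimension

	if s, err := cosine(stored, query); err == nil {
		fmt.Printf("similarity: %.3f\n", s)
	}
	if _, err := cosine(stored, foreign); err != nil {
		fmt.Println("rejected:", err)
	}
}
```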

See https://www.cloudwego.io/docs/eino/core_modules/components/embedding_guide/

Functions

func GetImplSpecificOptions added in v0.3.6

func GetImplSpecificOptions[T any](base *T, opts ...Option) *T

GetImplSpecificOptions extracts implementation-specific options from opts, merging them onto base. Call it alongside GetCommonOptions inside EmbedStrings:

func (e *MyEmbedder) EmbedStrings(ctx context.Context, texts []string, opts ...embedding.Option) ([][]float64, error) {
    common := embedding.GetCommonOptions(nil, opts...)
    mine := embedding.GetImplSpecificOptions(&MyOptions{}, opts...)
    // Use common.Model, mine.MyParam, etc. to call the backend.
    _, _ = common, mine
    return nil, nil // placeholder: return one vector per input text
}

Types

type CallbackInput

type CallbackInput struct {
	// Texts is the texts to be embedded.
	Texts []string
	// Config is the config for the embedding.
	Config *Config
	// Extra is the extra information for the callback.
	Extra map[string]any
}

CallbackInput is the input for the embedding callback.

func ConvCallbackInput

func ConvCallbackInput(src callbacks.CallbackInput) *CallbackInput

ConvCallbackInput converts the callback input to the embedding callback input.

type CallbackOutput

type CallbackOutput struct {
	// Embeddings is the embeddings.
	Embeddings [][]float64
	// Config is the config for creating the embedding.
	Config *Config
	// TokenUsage is the token usage for the embedding.
	TokenUsage *TokenUsage
	// Extra is the extra information for the callback.
	Extra map[string]any
}

CallbackOutput is the output for the embedding callback.

func ConvCallbackOutput

func ConvCallbackOutput(src callbacks.CallbackOutput) *CallbackOutput

ConvCallbackOutput converts the callback output to the embedding callback output.

type ComponentExtra

type ComponentExtra struct {
	// Config is the config for the embedding.
	Config *Config
	// TokenUsage is the token usage for the embedding.
	TokenUsage *TokenUsage
}

ComponentExtra is the extra information for the embedding.

type Config

type Config struct {
	// Model is the model name.
	Model string
	// EncodingFormat is the encoding format.
	EncodingFormat string
}

Config is the config for the embedding.

type Embedder

type Embedder interface {
	EmbedStrings(ctx context.Context, texts []string, opts ...Option) ([][]float64, error) // invoke
}

Embedder converts a batch of strings into dense vector representations.

EmbedStrings returns one vector per input text, in the same order. The vector length (dimensions) is fixed by the underlying model and identical for every text in the batch.

The returned [][]float64 maps as:

embeddings[i]  →  vector for texts[i]
len(embeddings[i])  →  model's embedding dimension (e.g. 1536 for ada-002)

Both [Indexer] and [Retriever] use an Embedder to convert documents and queries into vectors. They must share the exact same model — mismatched dimensions or model families break semantic similarity.
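
As a rough illustration of the contract described above, the sketch below implements a toy embedder and calls it. Everything in it is an assumption made for the example: the local Embedder interface mirrors the real one but omits the variadic options so the code compiles without the eino module, and hashEmbedder and embedOnce are hypothetical names. The vectors it produces are derived from a hash and carry no semantic meaning.

```go
package main

import (
	"context"
	"fmt"
	"hash/fnv"
)

// Embedder mirrors the shape of embedding.Embedder, minus the variadic
// options, so that this sketch compiles without the eino module.
type Embedder interface {
	EmbedStrings(ctx context.Context, texts []string) ([][]float64, error)
}

// hashEmbedder is a toy implementation: it derives a fixed-size vector from
// a hash of the text. Equal texts always map to equal vectors.
type hashEmbedder struct{ dim int }

func (h hashEmbedder) EmbedStrings(ctx context.Context, texts []string) ([][]float64, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	out := make([][]float64, len(texts))
	for i, t := range texts {
		vec := make([]float64, h.dim)
		for j := range vec {
			hasher := fnv.New32a()
			fmt.Fprintf(hasher, "%d:%s", j, t)
			// Scale the 32-bit hash into the range [0, 1].
			vec[j] = float64(hasher.Sum32()) / float64(^uint32(0))
		}
		out[i] = vec // out[i] is the vector for texts[i]
	}
	return out, nil
}

// embedOnce builds a toy embedder and embeds texts with a background context.
func embedOnce(dim int, texts []string) ([][]float64, error) {
	var e Embedder = hashEmbedder{dim: dim}
	return e.EmbedStrings(context.Background(), texts)
}

func main() {
	texts := []string{"alpha", "beta", "alpha"}
	embeddings, err := embedOnce(4, texts)
	if err != nil {
		fmt.Println("embed failed:", err)
		return
	}
	for i, v := range embeddings {
		fmt.Printf("texts[%d]=%q -> %d dims\n", i, texts[i], len(v))
	}
}
```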

type Option

type Option struct {
	// contains filtered or unexported fields
}

Option is a call-time option for an Embedder.

func WithModel

func WithModel(model string) Option

WithModel returns an Option that sets the model name used for the embedding.

func WrapImplSpecificOptFn added in v0.3.6

func WrapImplSpecificOptFn[T any](optFn func(*T)) Option

WrapImplSpecificOptFn wraps an implementation-specific option function so it can be passed alongside standard options. For use by Embedder implementors:

func WithMyParam(v string) embedding.Option {
    return embedding.WrapImplSpecificOptFn(func(o *MyOptions) {
        o.MyParam = v
    })
}
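
To show how the common and implementation-specific option helpers can fit together, here is a self-contained sketch of the pattern. It is a simplified model written for this example, not the package's source: the Option fields (apply, implFn) are assumptions, and MyOptions and WithMyParam are the hypothetical names from the example above. The key idea is that an implementation-specific option is stored as an untyped function and only applied when its target type matches.

```go
package main

import "fmt"

// Options and Option are local stand-ins for embedding.Options and
// embedding.Option; the fields inside Option are assumptions.
type Options struct {
	Model *string
}

type Option struct {
	apply  func(*Options) // sets a common option
	implFn any            // holds a func(*T) for some implementation type T
}

func WithModel(model string) Option {
	return Option{apply: func(o *Options) { o.Model = &model }}
}

func WrapImplSpecificOptFn[T any](optFn func(*T)) Option {
	return Option{implFn: optFn}
}

func GetCommonOptions(base *Options, opts ...Option) *Options {
	if base == nil {
		base = &Options{}
	}
	for _, opt := range opts {
		if opt.apply != nil {
			opt.apply(base)
		}
	}
	return base
}

func GetImplSpecificOptions[T any](base *T, opts ...Option) *T {
	if base == nil {
		base = new(T)
	}
	for _, opt := range opts {
		// Options wrapped for a different T fail the assertion and are skipped.
		if fn, ok := opt.implFn.(func(*T)); ok {
			fn(base)
		}
	}
	return base
}

// MyOptions and WithMyParam are hypothetical implementation-specific pieces.
type MyOptions struct{ MyParam string }

func WithMyParam(v string) Option {
	return WrapImplSpecificOptFn(func(o *MyOptions) { o.MyParam = v })
}

func main() {
	opts := []Option{WithModel("model-a"), WithMyParam("x")}
	common := GetCommonOptions(nil, opts...)
	mine := GetImplSpecificOptions(&MyOptions{MyParam: "default"}, opts...)
	fmt.Println(*common.Model, mine.MyParam) // prints: model-a x
}
```

With this design a caller can pass one mixed option list to any Embedder; each implementation picks out the options it understands and ignores the rest.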

type Options

type Options struct {
	// Model is the model name for the embedding.
	Model *string
}

Options holds the common call-time options for an embedding request.

func GetCommonOptions

func GetCommonOptions(base *Options, opts ...Option) *Options

GetCommonOptions extracts the common embedding Options from a list of Option values. An optional base supplies default values; pass nil to start from an empty Options. For example:

defaultModelName := "default_model"
embeddingOption := &embedding.Options{
	Model: &defaultModelName,
}
// Reassign with "=": embeddingOption is already declared above.
embeddingOption = embedding.GetCommonOptions(embeddingOption, opts...)

type TokenUsage

type TokenUsage struct {
	// PromptTokens is the number of prompt tokens.
	PromptTokens int
	// CompletionTokens is the number of completion tokens.
	CompletionTokens int
	// TotalTokens is the total number of tokens.
	TotalTokens int
}

TokenUsage is the token usage for the embedding.
