Documentation ¶
Overview ¶
Package indexer defines the Indexer component interface for storing documents and their vector representations in a backend store.
An Indexer is the write path of a RAG pipeline. It takes schema.Document values, optionally generates vector embeddings, and persists them in a backend (vector DB, search engine, etc.) for later retrieval.
Concrete implementations (VikingDB, Milvus, Elasticsearch, …) live in eino-ext:
github.com/cloudwego/eino-ext/components/indexer/
Vector Dimension Consistency ¶
When using the [Options.Embedding] option, the embedding model must be identical to the one used by the paired [retriever.Retriever]. Mismatched models produce vectors in different spaces, so queries will not match stored documents.
See https://www.cloudwego.io/docs/eino/core_modules/components/indexer_guide/
Functions ¶
func GetImplSpecificOptions ¶ added in v0.3.6
GetImplSpecificOptions extracts implementation-specific options from opts, merging onto base. Call alongside GetCommonOptions inside Store.
Types ¶
type CallbackInput ¶
type CallbackInput struct {
// Docs is the documents to be indexed.
Docs []*schema.Document
// Extra is the extra information for the callback.
Extra map[string]any
}
CallbackInput is the input for the indexer callback.
func ConvCallbackInput ¶
func ConvCallbackInput(src callbacks.CallbackInput) *CallbackInput
ConvCallbackInput converts the callback input to the indexer callback input.
type CallbackOutput ¶
type CallbackOutput struct {
// IDs is the ids of the indexed documents returned by the indexer.
IDs []string
// Extra is the extra information for the callback.
Extra map[string]any
}
CallbackOutput is the output for the indexer callback.
func ConvCallbackOutput ¶
func ConvCallbackOutput(src callbacks.CallbackOutput) *CallbackOutput
ConvCallbackOutput converts the callback output to the indexer callback output.
type Indexer ¶
type Indexer interface {
// Store stores the documents and returns their assigned IDs.
Store(ctx context.Context, docs []*schema.Document, opts ...Option) (ids []string, err error) // invoke
}
Indexer stores documents (and optionally their vector embeddings) in a backend for later retrieval.
Store accepts a batch of schema.Document values and returns the IDs assigned to them by the backend. When [Options.Embedding] is provided, the implementation generates vectors before storing — the same embedder must be used by the paired [retriever.Retriever].
Use [Options.SubIndexes] to write documents into logical sub-partitions within the same store.
type Option ¶
type Option struct {
// contains filtered or unexported fields
}
Option is a call-time option for an Indexer.
func WithEmbedding ¶
WithEmbedding sets the embedder that the indexer uses to convert documents to embeddings.
func WithSubIndexes ¶
WithSubIndexes sets the sub-indexes (logical sub-partitions within the same store) that documents are written to.
func WrapImplSpecificOptFn ¶ added in v0.3.6
WrapImplSpecificOptFn wraps an implementation-specific option function so it can be passed alongside standard options. For use by Indexer implementors.
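The wrap-and-extract pattern behind WrapImplSpecificOptFn and GetImplSpecificOptions can be sketched roughly as follows. This is not the library's source: Option, wrapImplSpecific and getImplSpecific are local stand-ins written for this example, and backendOpts, otherOpts, WithCollection and WithBatchSize are hypothetical names. The idea shown is that each option carries a typed function, and extraction applies only the functions whose target type matches.

```go
package main

import "fmt"

// Option is a local stand-in for indexer.Option
// (the real type's fields are unexported).
type Option struct {
	implSpecific any
}

// wrapImplSpecific plays the role of WrapImplSpecificOptFn in this sketch.
func wrapImplSpecific[T any](fn func(*T)) Option {
	return Option{implSpecific: fn}
}

// getImplSpecific plays the role of GetImplSpecificOptions: it applies
// only the options that target *T, merging them onto base.
func getImplSpecific[T any](base *T, opts ...Option) *T {
	if base == nil {
		base = new(T)
	}
	for _, o := range opts {
		if fn, ok := o.implSpecific.(func(*T)); ok {
			fn(base)
		}
	}
	return base
}

// backendOpts is a hypothetical backend-specific option struct.
type backendOpts struct {
	Collection string
	BatchSize  int
}

// WithCollection and WithBatchSize are hypothetical backend options.
func WithCollection(name string) Option {
	return wrapImplSpecific(func(o *backendOpts) { o.Collection = name })
}

func WithBatchSize(n int) Option {
	return wrapImplSpecific(func(o *backendOpts) { o.BatchSize = n })
}

// otherOpts belongs to a different, unrelated implementation.
type otherOpts struct{ Flag bool }

func main() {
	opts := []Option{
		WithCollection("docs"),
		// Targets another implementation, so it is skipped below.
		wrapImplSpecific(func(o *otherOpts) { o.Flag = true }),
	}
	// Defaults live in base; matching options override them.
	got := getImplSpecific(&backendOpts{BatchSize: 100}, opts...)
	fmt.Printf("%+v\n", *got)
}
```

Keeping defaults in base means callers only pass the options they want to change, and options intended for a different backend are ignored rather than causing an error.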
type Options ¶
type Options struct {
// SubIndexes is the sub indexes to be indexed.
SubIndexes []string
// Embedding is the embedding component.
Embedding embedding.Embedder
}
Options holds the common call-time options for an Indexer.
func GetCommonOptions ¶
GetCommonOptions extracts standard Options from opts, merging onto base. Implementors must call this inside Store:
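func (idx *MyIndexer) Store(ctx context.Context, docs []*schema.Document, opts ...indexer.Option) ([]string, error) {
	options := indexer.GetCommonOptions(nil, opts...)
	if options.Embedding != nil {
		// use options.Embedding to generate vectors before storage
	}
	// persist docs in the backend, then return the IDs it assigned
	return nil, nil // placeholder result for this skeleton
}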