document

v0.9.0-alpha.3
Published: Mar 12, 2026 License: Apache-2.0 Imports: 4 Imported by: 47

Documentation

Overview

Package document defines the Loader and Transformer component interfaces for ingesting and processing documents in an eino pipeline.

Components

  • Loader: reads raw content from an external source (file, URL, S3, …) and returns schema.Document values. Parsing is typically delegated to a parser.Parser configured on the loader.
  • Transformer: takes a slice of schema.Document values and transforms them — splitting, filtering, merging, re-ranking, etc.

Concrete implementations live in eino-ext:

github.com/cloudwego/eino-ext/components/document/

Document Metadata

schema.Document.MetaData is the primary mechanism for carrying contextual information (source URI, scores, chunk indices, embeddings) through the pipeline. Transformers should preserve existing metadata and merge rather than replace when adding their own keys.
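
The merge-rather-than-replace guidance above can be sketched with a small helper. This is only an illustration: Document is a local stand-in for schema.Document, mergeMetaData is a hypothetical helper name, and letting existing keys win on conflict is one possible policy, not something the package prescribes.

```go
package main

import "fmt"

// Document is a local stand-in for schema.Document, reduced to the fields
// this sketch needs; it is not the real eino type.
type Document struct {
	Content  string
	MetaData map[string]any
}

// mergeMetaData (hypothetical helper) builds a new map holding the extra keys
// and then the existing ones. Existing keys are written last, so values set by
// an earlier pipeline stage are never overwritten.
func mergeMetaData(existing, extra map[string]any) map[string]any {
	merged := make(map[string]any, len(existing)+len(extra))
	for k, v := range extra {
		merged[k] = v
	}
	for k, v := range existing {
		merged[k] = v
	}
	return merged
}

func main() {
	doc := &Document{
		Content:  "hello",
		MetaData: map[string]any{"source": "file:///tmp/a.txt"},
	}
	// "chunk_index" is an assumed key name used only for illustration.
	doc.MetaData = mergeMetaData(doc.MetaData, map[string]any{"chunk_index": 0})
	fmt.Println(len(doc.MetaData), doc.MetaData["source"])
}
```

Building a fresh map instead of mutating the input keeps the sketch safe when several documents share one metadata map.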

See https://www.cloudwego.io/docs/eino/core_modules/components/document_loader_guide/
See https://www.cloudwego.io/docs/eino/core_modules/components/document_transformer_guide/

Functions

func GetLoaderImplSpecificOptions

func GetLoaderImplSpecificOptions[T any](base *T, opts ...LoaderOption) *T

GetLoaderImplSpecificOptions lets a Loader author extract implementation-specific options from the unified LoaderOption type. T is the type of the implementation-specific options struct. Call this function inside the Loader implementation's Load method. Passing a base T as the first argument is recommended, so that the Loader author can supply default values for the implementation-specific options. For example:

myOption := &MyOption{
	Field1: "default_value",
}
// Plain assignment: myOption is already declared above.
myOption = document.GetLoaderImplSpecificOptions(myOption, opts...)

func GetTransformerImplSpecificOptions

func GetTransformerImplSpecificOptions[T any](base *T, opts ...TransformerOption) *T

GetTransformerImplSpecificOptions lets a Transformer author extract implementation-specific options from the unified TransformerOption type. T is the type of the implementation-specific options struct. Call this function inside the Transformer implementation's Transform method. Passing a base T as the first argument is recommended, so that the Transformer author can supply default values for the implementation-specific options. For example:

myOption := &MyOption{
	Field1: "default_value",
}
// Plain assignment: myOption is already declared above.
myOption = document.GetTransformerImplSpecificOptions(myOption, opts...)

Types

type Loader

type Loader interface {
	Load(ctx context.Context, src Source, opts ...LoaderOption) ([]*schema.Document, error)
}

Loader reads raw content from an external source and returns it as a slice of schema.Document values.

The Source.URI may be a local file path or a remote URL. The loader is responsible for fetching the raw bytes; actual format parsing is typically delegated to a parser.Parser configured on the loader via WithParserOptions.

Document metadata (schema.Document.MetaData) should be populated with at least the source URI so that downstream nodes can trace document provenance.
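
A minimal sketch of a Loader that follows the description above, assuming Source.URI is a local file path. The types are local stand-ins mirroring the signatures on this page rather than the real eino packages, fileLoader and loadTempFile are hypothetical names, and the "source" metadata key is an assumption. No parser is involved; the raw bytes become the document content as-is.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
)

// Source, Document, LoaderOption and Loader are local stand-ins that mirror
// the shapes documented on this page; they are not the real eino types.
type Source struct{ URI string }

type Document struct {
	Content  string
	MetaData map[string]any
}

type LoaderOption struct{}

type Loader interface {
	Load(ctx context.Context, src Source, opts ...LoaderOption) ([]*Document, error)
}

// fileLoader (hypothetical) treats Source.URI as a local file path.
type fileLoader struct{}

func (fileLoader) Load(ctx context.Context, src Source, _ ...LoaderOption) ([]*Document, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	raw, err := os.ReadFile(src.URI)
	if err != nil {
		return nil, fmt.Errorf("load %q: %w", src.URI, err)
	}
	doc := &Document{
		Content: string(raw),
		// Record the source URI so downstream nodes can trace provenance.
		MetaData: map[string]any{"source": src.URI},
	}
	return []*Document{doc}, nil
}

// loadTempFile writes content to a temporary file, loads it back through the
// Loader interface, and returns the documents together with the path used.
func loadTempFile(content string) ([]*Document, string, error) {
	dir, err := os.MkdirTemp("", "loader-sketch")
	if err != nil {
		return nil, "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "note.txt")
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
		return nil, "", err
	}
	var l Loader = fileLoader{}
	docs, err := l.Load(context.Background(), Source{URI: path})
	return docs, path, err
}

func main() {
	docs, path, err := loadTempFile("hello from disk")
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Println(docs[0].Content, docs[0].MetaData["source"] == path)
}
```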

type LoaderCallbackInput

type LoaderCallbackInput struct {
	// Source is the source of the documents.
	Source Source

	// Extra is the extra information for the callback.
	Extra map[string]any
}

LoaderCallbackInput is the input for the loader callback.

func ConvLoaderCallbackInput

func ConvLoaderCallbackInput(src callbacks.CallbackInput) *LoaderCallbackInput

ConvLoaderCallbackInput converts the callback input to the loader callback input.
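
As a rough illustration of how a callback handler might use such a conversion, the sketch below defines stand-in types and a stand-in converter. It assumes that the generic callback input is an interface value and that a non-matching input yields nil; this page does not state either point, so treat both as assumptions. The names convLoaderCallbackInput and describeStart are hypothetical.

```go
package main

import "fmt"

// CallbackInput is a stand-in for callbacks.CallbackInput, assumed here to be
// an interface that can hold any value.
type CallbackInput any

// Source and LoaderCallbackInput mirror the shapes documented on this page.
type Source struct{ URI string }

type LoaderCallbackInput struct {
	Source Source
	Extra  map[string]any
}

// convLoaderCallbackInput is a stand-in converter. Returning nil for inputs of
// another type is an assumption made for this sketch.
func convLoaderCallbackInput(src CallbackInput) *LoaderCallbackInput {
	switch t := src.(type) {
	case *LoaderCallbackInput:
		return t
	default:
		return nil
	}
}

// describeStart (hypothetical) is the body of a generic "on start" callback
// that only reacts to loader inputs.
func describeStart(input CallbackInput) string {
	in := convLoaderCallbackInput(input)
	if in == nil {
		return "not a loader input"
	}
	return "loading " + in.Source.URI
}

func main() {
	fmt.Println(describeStart(&LoaderCallbackInput{Source: Source{URI: "a.txt"}}))
	fmt.Println(describeStart("something else"))
}
```

Checking for nil before dereferencing keeps a shared callback handler safe when it also receives inputs from other component types.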

type LoaderCallbackOutput

type LoaderCallbackOutput struct {
	// Source is the source of the documents.
	Source Source

	// Docs is the documents that were loaded.
	Docs []*schema.Document

	// Extra is the extra information for the callback.
	Extra map[string]any
}

LoaderCallbackOutput is the output for the loader callback.

func ConvLoaderCallbackOutput

func ConvLoaderCallbackOutput(src callbacks.CallbackOutput) *LoaderCallbackOutput

ConvLoaderCallbackOutput converts the callback output to the loader callback output.

type LoaderOption

type LoaderOption struct {
	// contains filtered or unexported fields
}

LoaderOption is the call option type for the Loader component and is part of the component interface signature. Each Loader implementation can define its own options struct and option functions within its own package, then wrap those implementation-specific option functions into this type before passing them to Load.

func WithParserOptions added in v0.3.54

func WithParserOptions(opts ...parser.Option) LoaderOption

WithParserOptions attaches parser options to a loader request.

func WrapLoaderImplSpecificOptFn

func WrapLoaderImplSpecificOptFn[T any](optFn func(*T)) LoaderOption

WrapLoaderImplSpecificOptFn wraps an implementation-specific option function into the LoaderOption type. T is the type of the implementation-specific options struct. Loader implementations must use this function to convert their own option functions into the unified LoaderOption type. For example, if the Loader implementation defines its own options struct:

type customOptions struct {
    conf string
}

The implementation then provides an option function like this:

func WithConf(conf string) LoaderOption {
	return WrapLoaderImplSpecificOptFn(func(o *customOptions) {
		o.conf = conf
	})
}
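
The wrap-and-extract pattern can be sketched end to end with stand-in definitions. The real LoaderOption has unexported fields, so storing the wrapped function as an `any` and recovering it with a type assertion is an assumption about one way such a mechanism could work, not a description of the library's internals. otherOptions, WithN and resolveConf are hypothetical names.

```go
package main

import "fmt"

// LoaderOption is a stand-in for document.LoaderOption. Holding the wrapped
// function as `any` is an assumption made for this sketch.
type LoaderOption struct {
	implSpecificOptFn any
}

// WrapLoaderImplSpecificOptFn stores a typed option function in the unified type.
func WrapLoaderImplSpecificOptFn[T any](optFn func(*T)) LoaderOption {
	return LoaderOption{implSpecificOptFn: optFn}
}

// GetLoaderImplSpecificOptions applies every option whose function targets *T
// and silently skips options meant for other implementations.
func GetLoaderImplSpecificOptions[T any](base *T, opts ...LoaderOption) *T {
	if base == nil {
		base = new(T)
	}
	for _, opt := range opts {
		if fn, ok := opt.implSpecificOptFn.(func(*T)); ok {
			fn(base)
		}
	}
	return base
}

// customOptions belongs to one loader implementation.
type customOptions struct{ conf string }

func WithConf(conf string) LoaderOption {
	return WrapLoaderImplSpecificOptFn(func(o *customOptions) { o.conf = conf })
}

// otherOptions belongs to a different, unrelated implementation.
type otherOptions struct{ n int }

func WithN(n int) LoaderOption {
	return WrapLoaderImplSpecificOptFn(func(o *otherOptions) { o.n = n })
}

// resolveConf shows what a Load method would do first: start from defaults,
// then let the caller's options override them.
func resolveConf(opts ...LoaderOption) string {
	o := GetLoaderImplSpecificOptions(&customOptions{conf: "default_value"}, opts...)
	return o.conf
}

func main() {
	fmt.Println(resolveConf())                 // default_value
	fmt.Println(resolveConf(WithConf("mine"))) // mine
	fmt.Println(resolveConf(WithN(3)))         // default_value
}
```

Because options for other implementations are skipped rather than rejected, one option list can be passed through a pipeline that contains several different loaders.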

type LoaderOptions added in v0.3.54

type LoaderOptions struct {
	ParserOptions []parser.Option
}

LoaderOptions configures document loaders, including parser options.

func GetLoaderCommonOptions added in v0.3.54

func GetLoaderCommonOptions(base *LoaderOptions, opts ...LoaderOption) *LoaderOptions

GetLoaderCommonOptions extracts the common LoaderOptions from a list of LoaderOption values. An optional base LoaderOptions can be supplied to provide default values.
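
The flow from WithParserOptions on the caller side to GetLoaderCommonOptions inside a loader can be sketched as follows. Everything here is a stand-in: ParserOption is a placeholder for parser.Option (whose real shape is not shown on this page), the apply field is an assumed internal representation, and parserLabels is a hypothetical helper.

```go
package main

import "fmt"

// ParserOption is a placeholder for parser.Option; the label field exists
// only so the sketch has something observable.
type ParserOption struct{ label string }

// LoaderOptions mirrors the documented struct.
type LoaderOptions struct {
	ParserOptions []ParserOption
}

// LoaderOption is a stand-in; carrying an apply function is an assumption.
type LoaderOption struct {
	apply func(*LoaderOptions)
}

// WithParserOptions attaches parser options to a single Load call.
func WithParserOptions(opts ...ParserOption) LoaderOption {
	return LoaderOption{apply: func(o *LoaderOptions) { o.ParserOptions = opts }}
}

// GetLoaderCommonOptions applies the common options on top of base.
func GetLoaderCommonOptions(base *LoaderOptions, opts ...LoaderOption) *LoaderOptions {
	if base == nil {
		base = &LoaderOptions{}
	}
	for _, opt := range opts {
		if opt.apply != nil {
			opt.apply(base)
		}
	}
	return base
}

// parserLabels plays the role of a Load method: it resolves the common
// options and reports which parser options it would forward to its parser.
func parserLabels(opts ...LoaderOption) []string {
	common := GetLoaderCommonOptions(&LoaderOptions{}, opts...)
	labels := make([]string, 0, len(common.ParserOptions))
	for _, p := range common.ParserOptions {
		labels = append(labels, p.label)
	}
	return labels
}

func main() {
	labels := parserLabels(WithParserOptions(ParserOption{label: "uri"}, ParserOption{label: "meta"}))
	fmt.Println(len(labels), labels)
}
```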

type Source

type Source struct {
	URI string
}

Source identifies the external location of a document. URI can be a local file path or a remote URL reachable by the loader.

type Transformer

type Transformer interface {
	Transform(ctx context.Context, src []*schema.Document, opts ...TransformerOption) ([]*schema.Document, error)
}

Transformer converts a slice of schema.Document values into another slice, applying operations such as splitting, filtering, merging, or re-ranking.

Implementations should preserve existing MetaData keys and merge rather than replace when adding their own metadata. Downstream nodes (e.g. Indexer, Retriever) may depend on metadata set by earlier pipeline stages.
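
A minimal sketch of a splitting Transformer that keeps upstream metadata, using local stand-in types instead of the real eino packages. The paragraph-based splitting rule, the paragraphSplitter and splitParagraphs names, and the "chunk_index" key are all assumptions chosen for illustration.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Document, TransformerOption and Transformer are local stand-ins that mirror
// the shapes documented on this page; they are not the real eino types.
type Document struct {
	Content  string
	MetaData map[string]any
}

type TransformerOption struct{}

type Transformer interface {
	Transform(ctx context.Context, src []*Document, opts ...TransformerOption) ([]*Document, error)
}

// splitParagraphs cuts one document at blank lines. Each chunk receives its
// own copy of the parent's metadata plus a "chunk_index" key.
func splitParagraphs(doc *Document) []*Document {
	var chunks []*Document
	for _, part := range strings.Split(doc.Content, "\n\n") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		meta := make(map[string]any, len(doc.MetaData)+1)
		for k, v := range doc.MetaData {
			meta[k] = v // carry over keys set by earlier stages
		}
		meta["chunk_index"] = len(chunks)
		chunks = append(chunks, &Document{Content: part, MetaData: meta})
	}
	return chunks
}

// paragraphSplitter (hypothetical) implements the Transformer stand-in.
type paragraphSplitter struct{}

func (paragraphSplitter) Transform(ctx context.Context, src []*Document, _ ...TransformerOption) ([]*Document, error) {
	var out []*Document
	for _, doc := range src {
		if err := ctx.Err(); err != nil {
			return nil, err
		}
		out = append(out, splitParagraphs(doc)...)
	}
	return out, nil
}

func main() {
	var t Transformer = paragraphSplitter{}
	in := []*Document{{
		Content:  "first paragraph\n\nsecond paragraph",
		MetaData: map[string]any{"source": "notes.txt"},
	}}
	out, err := t.Transform(context.Background(), in)
	if err != nil {
		fmt.Println("transform failed:", err)
		return
	}
	for _, d := range out {
		fmt.Println(d.MetaData["chunk_index"], d.MetaData["source"], d.Content)
	}
}
```

Copying the map per chunk, rather than sharing the parent's map, prevents a later stage that annotates one chunk from changing its siblings.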

type TransformerCallbackInput

type TransformerCallbackInput struct {
	// Input is the input documents.
	Input []*schema.Document

	// Extra is the extra information for the callback.
	Extra map[string]any
}

TransformerCallbackInput is the input for the transformer callback.

func ConvTransformerCallbackInput

func ConvTransformerCallbackInput(src callbacks.CallbackInput) *TransformerCallbackInput

ConvTransformerCallbackInput converts the callback input to the transformer callback input.

type TransformerCallbackOutput

type TransformerCallbackOutput struct {
	// Output is the output documents.
	Output []*schema.Document

	// Extra is the extra information for the callback.
	Extra map[string]any
}

TransformerCallbackOutput is the output for the transformer callback.

func ConvTransformerCallbackOutput

func ConvTransformerCallbackOutput(src callbacks.CallbackOutput) *TransformerCallbackOutput

ConvTransformerCallbackOutput converts the callback output to the transformer callback output.

type TransformerOption

type TransformerOption struct {
	// contains filtered or unexported fields
}

TransformerOption is the call option type for the Transformer component and is part of the component interface signature. Each Transformer implementation can define its own options struct and option functions within its own package, then wrap those implementation-specific option functions into this type before passing them to Transform.

func WrapTransformerImplSpecificOptFn

func WrapTransformerImplSpecificOptFn[T any](optFn func(*T)) TransformerOption

WrapTransformerImplSpecificOptFn wraps an implementation-specific option function into the TransformerOption type. T is the type of the implementation-specific options struct. Transformer implementations must use this function to convert their own option functions into the unified TransformerOption type. For example, if the Transformer implementation defines its own options struct:

type customOptions struct {
    conf string
}

The implementation then provides an option function like this:

func WithConf(conf string) TransformerOption {
	return WrapTransformerImplSpecificOptFn(func(o *customOptions) {
		o.conf = conf
	})
}


Directories

Path     Synopsis
parser   Package parser defines the Parser interface for converting raw byte streams into schema.Document values.
