parent

package
Version: v0.3.21
Published: Apr 10, 2025 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewIndexer

func NewIndexer(ctx context.Context, config *Config) (indexer.Indexer, error)

NewIndexer creates a new parent indexer that handles document splitting and sub-document management.

Parameters:

  • ctx: context for the operation
  • config: configuration for the parent indexer

Example usage:

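// milvusIndexer and textSplitter are assumed to be constructed elsewhere.
// The result is named parentIndexer so it does not shadow the indexer package.
parentIndexer, err := NewIndexer(ctx, &Config{
    Indexer:     milvusIndexer,
    Transformer: textSplitter,
    ParentIDKey: "source_doc_id",
    // Generates IDs such as "doc_1_chunk_1", "doc_1_chunk_2", ...
    SubIDGenerator: func(ctx context.Context, parentID string, num int) ([]string, error) {
        ids := make([]string, num)
        for i := 0; i < num; i++ {
            ids[i] = fmt.Sprintf("%s_chunk_%d", parentID, i+1)
        }
        return ids, nil
    },
})
if err != nil {
    // Handle the construction error before using parentIndexer.
}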

Returns:

  • indexer.Indexer: the created parent indexer
  • error: any error encountered during creation

Types

type Config

type Config struct {
	// Indexer is the underlying indexer implementation that handles the actual document indexing.
	// For example: a vector database indexer like Milvus, or a full-text search indexer like Elasticsearch.
	Indexer indexer.Indexer

	// Transformer processes documents before indexing, typically splitting them into smaller chunks.
	// Each sub-document generated by the transformer must retain its parent document's ID.
	// For example: if a document with ID "doc_1" is split into 3 chunks, all chunks will initially
	// have ID "doc_1". These IDs will later be modified by the SubIDGenerator.
	//
	// Example transformations:
	// - A text splitter that breaks down large documents into paragraphs
	// - A code splitter that separates code files into functions
	Transformer document.Transformer

	// ParentIDKey specifies the metadata key used to store the original document's ID in each sub-document.
	// For example: if ParentIDKey is "parent_id", each sub-document will have metadata like:
	// {"parent_id": "original_doc_123"}
	ParentIDKey string

	// SubIDGenerator generates unique IDs for sub-documents based on their parent document ID.
	// For example: if parent ID is "doc_1" and we need 3 sub-document IDs, it might generate:
	// ["doc_1_chunk_1", "doc_1_chunk_2", "doc_1_chunk_3"]
	//
	// Parameters:
	//   - ctx: context for the operation
	//   - parentID: the ID of the parent document
	//   - num: number of sub-document IDs needed
	// Returns:
	//   - []string: slice of generated sub-document IDs
	//   - error: any error encountered during ID generation
	SubIDGenerator func(ctx context.Context, parentID string, num int) ([]string, error)
}
