Documentation ¶
Overview ¶
Package indexer provides high-level indexers for building RAG pipelines.
This package offers pre-configured indexer presets and lower-level constructors:
- DefaultNativeIndexer: Lightweight, local-first indexer for prototyping
- DefaultAdvancedIndexer: High-performance indexer for production use
- DefaultGraphIndexer: Knowledge graph-enhanced indexer
- NewVectorIndexer: Custom vector-based indexer
- NewMultimodalGraphIndexer: Multimodal and graph-capable indexer
Index ¶
- type Config
- type Indexer
- func DefaultAdvancedIndexer(opts ...IndexerOption) (Indexer, error)
- func DefaultGraphIndexer(opts ...IndexerOption) (Indexer, error)
- func DefaultIndexer(opts ...IndexerOption) (Indexer, error)
- func DefaultNativeIndexer(opts ...IndexerOption) (Indexer, error)
- func NewMultimodalGraphIndexer(parsers []core.Parser, chunker core.SemanticChunker, ...) (Indexer, error)
- func NewVectorIndexer(parsers []core.Parser, chunker core.SemanticChunker, ...) Indexer
- type IndexerOption
- func ClearParsers() IndexerOption
- func WithAllParsers() IndexerOption
- func WithBGE(modelPath string) IndexerOption
- func WithBert(modelPath string) IndexerOption
- func WithBoltDoc(path string) IndexerOption
- func WithCharacterChunker(size, overlap int) IndexerOption
- func WithChunker(chunker core.SemanticChunker) IndexerOption
- func WithClip(modelPath string) IndexerOption
- func WithConcurrency(enabled bool) IndexerOption
- func WithConsoleLogger() IndexerOption
- func WithDefaultGoVector() IndexerOption
- func WithDefaultSQLiteDoc() IndexerOption
- func WithDefaultSemanticChunker() IndexerOption
- func WithDocStore(s store.DocStore) IndexerOption
- func WithEmbedding(embedder embedding.Provider) IndexerOption
- func WithExtractor(extractor core.EntityExtractor) IndexerOption
- func WithFileLogger(path string) IndexerOption
- func WithGoVector(collection string, path string, dimension int) IndexerOption
- func WithGraph(graphStore store.GraphStore) IndexerOption
- func WithLogger(logger logging.Logger) IndexerOption
- func WithMetrics(metrics core.Metrics) IndexerOption
- func WithMilvus(collection string, addr string, dimension int) IndexerOption
- func WithName(name string) IndexerOption
- func WithNeoGraph(uri, username, password, dbName string) IndexerOption
- func WithOpenAI(apiKey string, model string) IndexerOption
- func WithOpenTelemetryTracer(ctx context.Context, endpoint string, serviceName string) IndexerOption
- func WithParsers(parsers ...core.Parser) IndexerOption
- func WithPinecone(indexName string, apiKey string, dimension int) IndexerOption
- func WithPrometheusMetrics(addr string) IndexerOption
- func WithQdrant(collection string, host string, port int, dimension int) IndexerOption
- func WithSQLDoc(path string) IndexerOption
- func WithStore(vectorStore core.VectorStore, docStore store.DocStore) IndexerOption
- func WithTokenChunker(size, overlap int, model string) IndexerOption
- func WithVectorStore(s core.VectorStore) IndexerOption
- func WithWatchDir(dirs ...string) IndexerOption
- func WithWeaviate(collection string, addr string, apiKey string, dimension int) IndexerOption
- func WithWorkers(workers int) IndexerOption
- func WithZapLogger(path string, maxSizeMB, maxDays, maxBackups int, console bool) IndexerOption
Types ¶
type Config ¶ added in v1.1.2
Config defines the configuration for the indexer. It controls concurrency and worker pool settings for parallel document processing.
type Indexer ¶
Indexer is the unified interface for document indexing. It provides methods for processing files and directories into vector/graph stores.
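The method set of Indexer is not listed on this page. The sketch below assumes two hypothetical methods, IndexFile and IndexDirectory, on a stand-in interface named fileIndexer, and shows how directory processing could reduce to per-file processing with filepath.WalkDir. None of these names are taken from the package.

```go
package main

import (
	"context"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// fileIndexer is a hypothetical stand-in for the Indexer interface.
// Both method names are assumptions made for this sketch.
type fileIndexer interface {
	IndexFile(ctx context.Context, path string) error
	IndexDirectory(ctx context.Context, dir string) error
}

// recordingIndexer records the paths it receives instead of writing
// to a vector or graph store.
type recordingIndexer struct {
	indexed []string
}

func (r *recordingIndexer) IndexFile(ctx context.Context, path string) error {
	if err := ctx.Err(); err != nil {
		return err // stop early if the caller cancelled
	}
	r.indexed = append(r.indexed, path)
	return nil
}

// IndexDirectory walks dir recursively and indexes every non-directory entry.
func (r *recordingIndexer) IndexDirectory(ctx context.Context, dir string) error {
	return filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		return r.IndexFile(ctx, path)
	})
}

// indexTempDir builds a two-file directory tree, indexes it, and
// returns how many files were seen.
func indexTempDir() (int, error) {
	dir, err := os.MkdirTemp("", "indexer-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	if err := os.Mkdir(filepath.Join(dir, "sub"), 0o755); err != nil {
		return 0, err
	}
	for _, name := range []string{"a.txt", filepath.Join("sub", "b.md")} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("hello"), 0o644); err != nil {
			return 0, err
		}
	}

	rec := &recordingIndexer{}
	var idx fileIndexer = rec
	if err := idx.IndexDirectory(context.Background(), dir); err != nil {
		return 0, err
	}
	return len(rec.indexed), nil
}

func main() {
	n, err := indexTempDir()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("files indexed:", n)
}
```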
func DefaultAdvancedIndexer ¶ added in v1.1.3
func DefaultAdvancedIndexer(opts ...IndexerOption) (Indexer, error)
DefaultAdvancedIndexer creates a high-performance Indexer preset for production use. It features increased worker concurrency and optimized defaults for enterprise workloads.
func DefaultGraphIndexer ¶ added in v1.1.3
func DefaultGraphIndexer(opts ...IndexerOption) (Indexer, error)
DefaultGraphIndexer creates a knowledge-graph-enabled Indexer preset. It integrates graph-based entity relationship extraction for complex query understanding.
func DefaultIndexer ¶ added in v1.1.2
func DefaultIndexer(opts ...IndexerOption) (Indexer, error)
DefaultIndexer is an alias for DefaultNativeIndexer, provided for backward compatibility.
func DefaultNativeIndexer ¶ added in v1.1.3
func DefaultNativeIndexer(opts ...IndexerOption) (Indexer, error)
DefaultNativeIndexer creates a lightweight, local-first Indexer. It uses the default TokenChunker with local SQLite and GoVector stores, which makes it suitable for quick prototyping and testing.
func NewMultimodalGraphIndexer ¶
func NewMultimodalGraphIndexer( parsers []core.Parser, chunker core.SemanticChunker, embedder embedding.MultimodalProvider, entityExtractor core.EntityExtractor, vectorStore core.VectorStore, docStore store.DocStore, graphStore store.GraphStore, logger logging.Logger, metrics core.Metrics, opts ...IndexerOption, ) (Indexer, error)
NewMultimodalGraphIndexer creates an advanced multimodal and graph pipeline. It supports both text and image inputs, with knowledge graph extraction capabilities.
Parameters:
- parsers: list of document parsers
- chunker: semantic chunker for splitting documents
- embedder: multimodal embedding provider
- entityExtractor: entity extractor for graph construction
- vectorStore: vector storage backend
- docStore: document metadata storage
- graphStore: knowledge graph storage
- logger: logging service
- metrics: observability metrics service
func NewVectorIndexer ¶
func NewVectorIndexer( parsers []core.Parser, chunker core.SemanticChunker, embedder embedding.Provider, vectorStore core.VectorStore, docStore store.DocStore, logger logging.Logger, metrics core.Metrics, opts ...IndexerOption, ) Indexer
NewVectorIndexer creates a simple text-vector pipeline for basic RAG setups.
Parameters:
- parsers: list of document parsers
- chunker: semantic chunker for splitting documents
- embedder: embedding provider for vectorization
- vectorStore: vector storage backend
- docStore: document metadata storage
- logger: logging service
- metrics: observability metrics service
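To make the data flow behind these parameters concrete, here is a minimal sketch of a text-vector pipeline: parse, chunk, embed, then write to a vector store and a document store. Every type below is a hypothetical stand-in (plain function types and maps), not the package's core.Parser, core.SemanticChunker, embedding.Provider, core.VectorStore, or store.DocStore, and the "embedding" is a toy.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the parser, chunker and embedder.
type (
	parseFunc func(raw string) string
	chunkFunc func(text string) []string
	embedFunc func(chunk string) []float32
)

type pipeline struct {
	parse   parseFunc
	chunk   chunkFunc
	embed   embedFunc
	vectors map[string][]float32 // vector store: chunk ID -> embedding
	docs    map[string]string    // doc store: chunk ID -> chunk text
}

// index runs parse -> chunk -> embed -> store for one document and
// returns the number of chunks written.
func (p *pipeline) index(docID, raw string) int {
	chunks := p.chunk(p.parse(raw))
	for i, c := range chunks {
		id := fmt.Sprintf("%s#%d", docID, i)
		p.vectors[id] = p.embed(c)
		p.docs[id] = c
	}
	return len(chunks)
}

// newToyPipeline wires deliberately trivial components together.
func newToyPipeline() *pipeline {
	return &pipeline{
		parse: strings.TrimSpace,
		chunk: func(text string) []string { return strings.Split(text, "\n\n") },
		// Not a real embedding: byte length and word count of the chunk.
		embed: func(c string) []float32 {
			return []float32{float32(len(c)), float32(len(strings.Fields(c)))}
		},
		vectors: map[string][]float32{},
		docs:    map[string]string{},
	}
}

func main() {
	p := newToyPipeline()
	n := p.index("doc", "  first paragraph\n\nsecond one  ")
	fmt.Println("chunks:", n)
	fmt.Println("second chunk:", p.docs["doc#1"])
}
```

Keeping each stage behind its own type is what lets a constructor like this accept any combination of parser, chunker, embedder, and stores.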
type IndexerOption ¶ added in v1.1.2
type IndexerOption func(*defaultIndexer)
IndexerOption defines a function to configure the indexer.
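The signature func(*defaultIndexer) is the functional options pattern. Because defaultIndexer is unexported, the sketch below uses a hypothetical indexerConfig struct and lower-case option constructors to show the mechanics; it assumes options are applied in argument order, so a later option overrides an earlier one.

```go
package main

import "fmt"

// indexerConfig is a hypothetical stand-in for the unexported defaultIndexer.
type indexerConfig struct {
	name       string
	workers    int
	concurrent bool
}

// option mirrors the shape of IndexerOption: a function that mutates the target.
type option func(*indexerConfig)

func withName(name string) option {
	return func(c *indexerConfig) { c.name = name }
}

func withWorkers(n int) option {
	return func(c *indexerConfig) { c.workers = n }
}

func withConcurrency(enabled bool) option {
	return func(c *indexerConfig) { c.concurrent = enabled }
}

// newConfig starts from defaults and applies each option in order.
func newConfig(opts ...option) *indexerConfig {
	c := &indexerConfig{name: "default", workers: 1}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	c := newConfig(withName("docs"), withConcurrency(true), withWorkers(4), withWorkers(8))
	fmt.Printf("%+v\n", *c)
}
```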
func ClearParsers ¶ added in v1.1.3
func ClearParsers() IndexerOption
ClearParsers clears the current parser registry.
func WithAllParsers ¶ added in v1.1.2
func WithAllParsers() IndexerOption
WithAllParsers enables all available builtin parsers using the global factory registry.
func WithBGE ¶ added in v1.1.3
func WithBGE(modelPath string) IndexerOption
func WithBert ¶ added in v1.1.3
func WithBert(modelPath string) IndexerOption
func WithBoltDoc ¶ added in v1.1.3
func WithBoltDoc(path string) IndexerOption
func WithCharacterChunker ¶ added in v1.1.3
func WithCharacterChunker(size, overlap int) IndexerOption
WithCharacterChunker sets a simple character-based chunker.
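The size and overlap parameters suggest a sliding window over the text. The following sketch is one plausible reading, not the package's implementation: it counts in runes, advances by size minus overlap, and rejects overlaps that would stall the window. The function name chunkByCharacters is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkByCharacters splits text into windows of at most size runes,
// where each window repeats the last overlap runes of the previous one.
func chunkByCharacters(text string, size, overlap int) ([]string, error) {
	if size <= 0 {
		return nil, errors.New("size must be positive")
	}
	if overlap < 0 || overlap >= size {
		return nil, errors.New("overlap must be in [0, size)")
	}
	runes := []rune(text) // index by rune so multi-byte characters stay intact
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the final window reached the end of the text
		}
	}
	return chunks, nil
}

func main() {
	chunks, err := chunkByCharacters("abcdefghij", 4, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", chunks)
}
```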
func WithChunker ¶ added in v1.1.2
func WithChunker(chunker core.SemanticChunker) IndexerOption
WithChunker sets the semantic chunker.
func WithClip ¶ added in v1.1.3
func WithClip(modelPath string) IndexerOption
func WithConcurrency ¶ added in v1.1.2
func WithConcurrency(enabled bool) IndexerOption
WithConcurrency enables or disables concurrent indexing.
func WithConsoleLogger ¶ added in v1.1.3
func WithConsoleLogger() IndexerOption
WithConsoleLogger configures the indexer to output logs to standard output.
func WithDefaultGoVector ¶ added in v1.1.3
func WithDefaultGoVector() IndexerOption
WithDefaultGoVector configures the indexer with an out-of-the-box local GoVector store.
func WithDefaultSQLiteDoc ¶ added in v1.1.3
func WithDefaultSQLiteDoc() IndexerOption
WithDefaultSQLiteDoc configures the indexer with an out-of-the-box local SQLite doc store.
func WithDefaultSemanticChunker ¶ added in v1.1.3
func WithDefaultSemanticChunker() IndexerOption
WithDefaultSemanticChunker configures the indexer with an out-of-the-box semantic chunker.
func WithDocStore ¶ added in v1.1.3
func WithDocStore(s store.DocStore) IndexerOption
WithDocStore sets a custom document store.
func WithEmbedding ¶ added in v1.1.2
func WithEmbedding(embedder embedding.Provider) IndexerOption
WithEmbedding sets the embedding provider explicitly.
func WithExtractor ¶ added in v1.1.2
func WithExtractor(extractor core.EntityExtractor) IndexerOption
WithExtractor sets the entity extractor.
func WithFileLogger ¶ added in v1.1.3
func WithFileLogger(path string) IndexerOption
WithFileLogger configures the indexer to output logs to a specific file.
func WithGoVector ¶ added in v1.1.3
func WithGoVector(collection string, path string, dimension int) IndexerOption
func WithGraph ¶ added in v1.1.2
func WithGraph(graphStore store.GraphStore) IndexerOption
WithGraph sets the graph store explicitly.
func WithLogger ¶ added in v1.1.2
func WithLogger(logger logging.Logger) IndexerOption
WithLogger sets the logger.
func WithMetrics ¶ added in v1.1.2
func WithMetrics(metrics core.Metrics) IndexerOption
WithMetrics sets the metrics recorder.
func WithMilvus ¶ added in v1.1.3
func WithMilvus(collection string, addr string, dimension int) IndexerOption
func WithName ¶ added in v1.1.4
func WithName(name string) IndexerOption
WithName sets a unique name for the indexer instance, used for resource isolation.
func WithNeoGraph ¶ added in v1.1.3
func WithNeoGraph(uri, username, password, dbName string) IndexerOption
func WithOpenAI ¶ added in v1.1.3
func WithOpenAI(apiKey string, model string) IndexerOption
func WithOpenTelemetryTracer ¶ added in v1.1.3
func WithOpenTelemetryTracer(ctx context.Context, endpoint string, serviceName string) IndexerOption
WithOpenTelemetryTracer configures the indexer to send distributed traces to an OTel exporter. endpoint is the gRPC endpoint of the collector (e.g., "localhost:4317").
func WithParsers ¶ added in v1.1.2
func WithParsers(parsers ...core.Parser) IndexerOption
WithParsers adds custom parsers to the registry.
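How the registry selects a parser is not described here. As an illustration only, the sketch below assumes parsers are keyed by file extension and shows why clearing before adding leaves only the custom parsers in place. The parser and registry types are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// parser is a hypothetical stand-in for core.Parser.
type parser struct {
	name string
	exts []string // file extensions this parser claims, e.g. ".md"
}

// registry maps a lower-cased extension to the parser that handles it.
type registry struct {
	byExt map[string]parser
}

func newRegistry() *registry {
	return &registry{byExt: map[string]parser{}}
}

// add registers each parser; a later parser replaces an earlier one
// that claimed the same extension.
func (r *registry) add(ps ...parser) {
	for _, p := range ps {
		for _, ext := range p.exts {
			r.byExt[strings.ToLower(ext)] = p
		}
	}
}

// clear drops every registered parser.
func (r *registry) clear() {
	r.byExt = map[string]parser{}
}

// lookup returns the parser for path's extension, if one is registered.
func (r *registry) lookup(path string) (parser, bool) {
	p, ok := r.byExt[strings.ToLower(filepath.Ext(path))]
	return p, ok
}

func main() {
	text := parser{name: "text", exts: []string{".txt"}}
	markdown := parser{name: "markdown", exts: []string{".md"}}

	r := newRegistry()
	r.add(text, markdown) // built-in set
	r.clear()             // analogous to ClearParsers
	r.add(markdown)       // analogous to WithParsers

	_, hasText := r.lookup("notes.txt")
	p, hasMarkdown := r.lookup("README.MD")
	fmt.Println(hasText, hasMarkdown, p.name)
}
```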
func WithPinecone ¶ added in v1.1.3
func WithPinecone(indexName string, apiKey string, dimension int) IndexerOption
func WithPrometheusMetrics ¶ added in v1.1.3
func WithPrometheusMetrics(addr string) IndexerOption
WithPrometheusMetrics configures the indexer to collect and expose metrics via Prometheus. It will start an HTTP server on the given address (e.g., ":8080") to serve the /metrics endpoint.
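To show what serving a /metrics endpoint involves, here is a standard-library-only sketch that writes one counter in the Prometheus text exposition format. It deliberately avoids the Prometheus client library, which the package may well use instead, and the metric name indexer_documents_total is hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// metrics holds a single counter. The metric name used below is hypothetical.
type metrics struct {
	docs atomic.Int64
}

// handler serves the counter in the Prometheus text exposition format.
func (m *metrics) handler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprintln(w, "# HELP indexer_documents_total Documents indexed so far.")
		fmt.Fprintln(w, "# TYPE indexer_documents_total counter")
		fmt.Fprintf(w, "indexer_documents_total %d\n", m.docs.Load())
	})
}

// newMux mounts the handler at /metrics.
func newMux(m *metrics) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/metrics", m.handler())
	return mux
}

// scrape performs an in-memory GET /metrics and returns status and body.
func scrape(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	m := &metrics{}
	m.docs.Add(3)
	status, body := scrape(newMux(m))
	fmt.Println(status)
	fmt.Print(body)
	// A long-running process would instead call
	// http.ListenAndServe(":8080", newMux(m)).
}
```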
func WithQdrant ¶ added in v1.1.3
func WithQdrant(collection string, host string, port int, dimension int) IndexerOption
func WithSQLDoc ¶ added in v1.1.3
func WithSQLDoc(path string) IndexerOption
func WithStore ¶ added in v1.1.2
func WithStore(vectorStore core.VectorStore, docStore store.DocStore) IndexerOption
WithStore sets the vector and document stores explicitly.
func WithTokenChunker ¶ added in v1.1.3
func WithTokenChunker(size, overlap int, model string) IndexerOption
WithTokenChunker sets a token-based chunker configured with the given size, overlap, and model.
func WithVectorStore ¶ added in v1.1.3
func WithVectorStore(s core.VectorStore) IndexerOption
WithVectorStore sets a custom vector store.
func WithWatchDir ¶ added in v1.1.2
func WithWatchDir(dirs ...string) IndexerOption
WithWatchDir adds directories to watch for changes.
func WithWeaviate ¶ added in v1.1.3
func WithWeaviate(collection string, addr string, apiKey string, dimension int) IndexerOption
func WithWorkers ¶ added in v1.1.2
func WithWorkers(workers int) IndexerOption
WithWorkers sets the number of workers for concurrent indexing.
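A worker count usually maps to a fixed pool of goroutines that pull work from a shared channel. The sketch below shows that pattern under stated assumptions: processAll and failOn are hypothetical helpers, and nothing here reflects how the package schedules work internally.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errBad = errors.New("cannot index file")

// processAll runs process over paths using a fixed number of workers.
// The result holds one entry per input path, in input order.
func processAll(paths []string, workers int, process func(string) error) []error {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan int)
	errs := make([]error, len(paths))

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is delivered to exactly one worker,
				// so writes to errs never overlap.
				errs[i] = process(paths[i])
			}
		}()
	}
	for i := range paths {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return errs
}

// failOn returns a process function that fails for one specific path.
func failOn(bad string) func(string) error {
	return func(path string) error {
		if path == bad {
			return fmt.Errorf("%s: %w", path, errBad)
		}
		return nil
	}
}

func main() {
	errs := processAll([]string{"a.txt", "b.txt", "c.txt"}, 2, failOn("b.txt"))
	for i, err := range errs {
		fmt.Println(i, err)
	}
}
```

Collecting errors per path, rather than stopping at the first failure, lets one unreadable file fail without aborting the rest of a batch.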
func WithZapLogger ¶ added in v1.1.3
func WithZapLogger(path string, maxSizeMB, maxDays, maxBackups int, console bool) IndexerOption
WithZapLogger configures the indexer to use a production-grade Zap logger with log rotation.