base

package
v1.1.1
Warning

This package is not in the latest version of its module.

Published: Mar 20, 2026 License: MIT Imports: 8 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ContentExtractor

type ContentExtractor func(ctx context.Context, r io.Reader) (string, error)

ContentExtractor is a function type that reads an io.Reader to the end and returns the extracted text as a single string. It is intended for legacy parsers and for file types that cannot easily be streamed, such as DOCX or PDF.

type GenericStreamWrapper

type GenericStreamWrapper struct {
	// contains filtered or unexported fields
}

GenericStreamWrapper adapts a full-read ContentExtractor to the streaming core.Parser interface. It serves as a bridge for legacy parsers.

func NewGenericStreamWrapper

func NewGenericStreamWrapper(name string, types []string, extractor ContentExtractor) *GenericStreamWrapper

func (*GenericStreamWrapper) GetSupportedTypes

func (w *GenericStreamWrapper) GetSupportedTypes() []string

func (*GenericStreamWrapper) ParseStream

func (w *GenericStreamWrapper) ParseStream(ctx context.Context, r io.Reader, metadata map[string]any) (<-chan *core.Document, error)

type Parser

type Parser interface {
	// ParseStream reads from an io.Reader and streams parsed Document objects.
	// Reading incrementally lets an implementation keep memory use bounded,
	// independent of input size, even for massive files (e.g., 2 GB logs).
	ParseStream(ctx context.Context, r io.Reader, metadata map[string]any) (<-chan *core.Document, error)

	// GetSupportedTypes returns the file extensions or MIME types this parser supports.
	GetSupportedTypes() []string
}

Parser defines the streaming document parser interface for Next-Gen RAG.

type Triple

type Triple struct {
	Subject     string `json:"subject"`
	Predicate   string `json:"predicate"`
	Object      string `json:"object"`
	SubjectType string `json:"subject_type"`
	ObjectType  string `json:"object_type"`
}

Triple represents a relationship between two entities.

type TriplesExtractor

type TriplesExtractor struct {
	// contains filtered or unexported fields
}

TriplesExtractor uses an LLM to extract knowledge triples from text.

func NewTriplesExtractor

func NewTriplesExtractor(llm chat.Client) *TriplesExtractor

func (*TriplesExtractor) Extract

func (e *TriplesExtractor) Extract(ctx context.Context, text string) ([]Triple, error)
