package base

Version: v1.1.2 (latest) | Published: Mar 20, 2026 | License: MIT | Imports: 8 | Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/DotNetAge/gorag

Documentation ¶

Index ¶

type ContentExtractor
type GenericStreamWrapper
- func NewGenericStreamWrapper(name string, types []string, extractor ContentExtractor) *GenericStreamWrapper
- func (w *GenericStreamWrapper) GetSupportedTypes() []string
- func (w *GenericStreamWrapper) ParseStream(ctx context.Context, r io.Reader, metadata map[string]any) (<-chan *core.Document, error)
type Parser
type Triple
type TriplesExtractor
- func NewTriplesExtractor(llm chat.Client) *TriplesExtractor
- func (e *TriplesExtractor) Extract(ctx context.Context, text string) ([]Triple, error)

Types ¶

type ContentExtractor ¶

type ContentExtractor func(ctx context.Context, r io.Reader) (string, error)

ContentExtractor is a function type that takes an io.Reader and returns the full text in one call. It is intended for legacy parsers and for file types that cannot easily be streamed (such as DOCX or PDF).

type GenericStreamWrapper ¶

type GenericStreamWrapper struct {
	// contains filtered or unexported fields
}

GenericStreamWrapper adapts a full-read ContentExtractor to the streaming core.Parser interface. It serves as a bridge for legacy parsers.

func NewGenericStreamWrapper ¶

func NewGenericStreamWrapper(name string, types []string, extractor ContentExtractor) *GenericStreamWrapper

func (*GenericStreamWrapper) GetSupportedTypes ¶

func (w *GenericStreamWrapper) GetSupportedTypes() []string

func (*GenericStreamWrapper) ParseStream ¶

func (w *GenericStreamWrapper) ParseStream(ctx context.Context, r io.Reader, metadata map[string]any) (<-chan *core.Document, error)

type Parser ¶

type Parser interface {
	// ParseStream reads from an io.Reader and streams parsed Document objects.
	// This ensures O(1) memory complexity for handling massive files (e.g., 2GB logs).
	ParseStream(ctx context.Context, r io.Reader, metadata map[string]any) (<-chan *core.Document, error)

	// GetSupportedTypes returns the file extensions or MIME types this parser supports.
	GetSupportedTypes() []string
}

Parser defines the streaming document parser for Next-Gen RAG.

type Triple ¶

type Triple struct {
	Subject     string `json:"subject"`
	Predicate   string `json:"predicate"`
	Object      string `json:"object"`
	SubjectType string `json:"subject_type"`
	ObjectType  string `json:"object_type"`
}

Triple represents a relationship (Predicate) between two entities (Subject and Object), each with a type label (SubjectType and ObjectType).

type TriplesExtractor ¶

type TriplesExtractor struct {
	// contains filtered or unexported fields
}

TriplesExtractor uses an LLM to extract knowledge triples from text.

func NewTriplesExtractor ¶

func NewTriplesExtractor(llm chat.Client) *TriplesExtractor

func (*TriplesExtractor) Extract ¶

func (e *TriplesExtractor) Extract(ctx context.Context, text string) ([]Triple, error)
