loader

package
v1.3.0 Latest
Published: Feb 26, 2026 License: MIT Imports: 12 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ArxivSourceAdapter

type ArxivSourceAdapter struct {
	// contains filtered or unexported fields
}

ArxivSourceAdapter adapts sources.ArxivSource to the DocumentLoader interface. It searches arXiv papers by query and converts each result into a rag.Document.

func NewArxivSourceAdapter

func NewArxivSourceAdapter(source *sources.ArxivSource, maxResults int) *ArxivSourceAdapter

NewArxivSourceAdapter creates an adapter around an existing ArxivSource.

func (*ArxivSourceAdapter) Load

func (a *ArxivSourceAdapter) Load(ctx context.Context, source string) ([]rag.Document, error)

Load interprets source as a search query and returns matching papers as Documents.

func (*ArxivSourceAdapter) SupportedTypes

func (a *ArxivSourceAdapter) SupportedTypes() []string

SupportedTypes returns an empty slice; this adapter is query-based, not file-based.
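The adapter pattern described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the API of sources.ArxivSource is not shown on this page, so `paper`, `paperSearcher`, `fakeSearcher`, and `searchAdapter` are hypothetical names, and `Document` is a local stand-in for rag.Document with assumed fields.

```go
package main

import (
	"context"
	"fmt"
)

// Document is a local stand-in for rag.Document (fields are assumed).
type Document struct {
	ID       string
	Content  string
	Metadata map[string]string
}

// paper and paperSearcher are hypothetical; they only illustrate the shape
// of a query-based source that an adapter could wrap.
type paper struct {
	ID, Title, Summary string
}

type paperSearcher interface {
	Search(ctx context.Context, query string, maxResults int) ([]paper, error)
}

// searchAdapter exposes a query-based source through Load and SupportedTypes.
type searchAdapter struct {
	source     paperSearcher
	maxResults int
}

// Load treats source as a search query and maps each hit to a Document.
func (a *searchAdapter) Load(ctx context.Context, source string) ([]Document, error) {
	papers, err := a.source.Search(ctx, source, a.maxResults)
	if err != nil {
		return nil, fmt.Errorf("search %q: %w", source, err)
	}
	docs := make([]Document, 0, len(papers))
	for _, p := range papers {
		docs = append(docs, Document{
			ID:       p.ID,
			Content:  p.Title + "\n\n" + p.Summary,
			Metadata: map[string]string{"title": p.Title},
		})
	}
	return docs, nil
}

// SupportedTypes returns an empty slice because no file extension applies.
func (a *searchAdapter) SupportedTypes() []string { return []string{} }

// fakeSearcher returns canned results so the sketch runs without a network.
type fakeSearcher struct{}

func (fakeSearcher) Search(_ context.Context, query string, maxResults int) ([]paper, error) {
	all := []paper{
		{ID: "fake-1", Title: "First result for " + query, Summary: "canned summary one"},
		{ID: "fake-2", Title: "Second result for " + query, Summary: "canned summary two"},
		{ID: "fake-3", Title: "Third result for " + query, Summary: "canned summary three"},
	}
	if maxResults >= 0 && maxResults < len(all) {
		all = all[:maxResults]
	}
	return all, nil
}

// searchDemo wires the fake source into the adapter and runs one query.
func searchDemo(query string, maxResults int) ([]Document, error) {
	a := &searchAdapter{source: fakeSearcher{}, maxResults: maxResults}
	return a.Load(context.Background(), query)
}

func main() {
	docs, err := searchDemo("graph neural networks", 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, d := range docs {
		fmt.Println(d.ID, "|", d.Metadata["title"])
	}
}
```

Because such an adapter reports no extensions, it would presumably be called directly rather than reached through extension-based routing.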

type CSVLoader

type CSVLoader struct {
	// contains filtered or unexported fields
}

CSVLoader loads CSV files. Each row (or group of rows) becomes a Document. The first row is treated as a header.

func NewCSVLoader

func NewCSVLoader(config CSVLoaderConfig) *CSVLoader

NewCSVLoader creates a CSVLoader with the given config.

func (*CSVLoader) Load

func (l *CSVLoader) Load(ctx context.Context, source string) ([]rag.Document, error)

Load reads a CSV file and returns Documents.

func (*CSVLoader) SupportedTypes

func (l *CSVLoader) SupportedTypes() []string

SupportedTypes returns the extensions handled by CSVLoader.

type CSVLoaderConfig

type CSVLoaderConfig struct {
	// Delimiter is the field separator. Defaults to ','.
	Delimiter rune
	// RowsPerDocument controls how many rows are grouped into a single Document.
	// 0 or 1 means each row becomes its own Document.
	RowsPerDocument int
	// ContentColumns lists column names (from the header) to include in Document.Content.
	// If empty, all columns are concatenated.
	ContentColumns []string
}

CSVLoaderConfig configures the CSV loader.
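To make the three fields concrete, here is a minimal sketch of how they could drive row grouping with the standard encoding/csv package. It is not the loader's actual implementation: `groupCSV` and `sampleCSV` are hypothetical helpers, the struct is redeclared locally so the example is self-contained, and the "column: value" content layout is an assumption.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// CSVLoaderConfig is redeclared here with the fields documented above.
type CSVLoaderConfig struct {
	Delimiter       rune
	RowsPerDocument int
	ContentColumns  []string
}

// groupCSV returns one content string per would-be Document.
func groupCSV(r io.Reader, cfg CSVLoaderConfig) ([]string, error) {
	reader := csv.NewReader(r) // csv.NewReader already defaults Comma to ','
	if cfg.Delimiter != 0 {
		reader.Comma = cfg.Delimiter
	}
	records, err := reader.ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	header, rows := records[0], records[1:] // first row is the header

	// Pick the column indexes that contribute to the content.
	var cols []int
	if len(cfg.ContentColumns) == 0 {
		for i := range header {
			cols = append(cols, i)
		}
	} else {
		for _, want := range cfg.ContentColumns {
			for i, name := range header {
				if name == want {
					cols = append(cols, i)
				}
			}
		}
	}

	per := cfg.RowsPerDocument
	if per < 1 {
		per = 1 // 0 or 1 means one row per document
	}

	var out []string
	for start := 0; start < len(rows); start += per {
		end := start + per
		if end > len(rows) {
			end = len(rows)
		}
		var lines []string
		for _, row := range rows[start:end] {
			var fields []string
			for _, c := range cols {
				fields = append(fields, header[c]+": "+row[c])
			}
			lines = append(lines, strings.Join(fields, ", "))
		}
		out = append(out, strings.Join(lines, "\n"))
	}
	return out, nil
}

// sampleCSV groups three semicolon-separated rows two at a time.
func sampleCSV() ([]string, error) {
	input := "name;lang;stars\na;go;1\nb;rust;2\nc;zig;3\n"
	cfg := CSVLoaderConfig{
		Delimiter:       ';',
		RowsPerDocument: 2,
		ContentColumns:  []string{"name", "lang"},
	}
	return groupCSV(strings.NewReader(input), cfg)
}

func main() {
	contents, err := sampleCSV()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, c := range contents {
		fmt.Printf("document %d:\n%s\n", i, c)
	}
}
```

With three data rows and RowsPerDocument set to 2, the sketch yields two groups: the first holds two rows and the second holds the remaining one.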

type DocumentLoader

type DocumentLoader interface {
	// Load reads the source and returns documents.
	// source is typically a file path, but loaders may interpret it as a URL or query.
	Load(ctx context.Context, source string) ([]rag.Document, error)

	// SupportedTypes returns the file extensions this loader handles (e.g. ".txt", ".md").
	SupportedTypes() []string
}

DocumentLoader is the unified interface for loading documents from any source.
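A custom loader only needs the two methods above. The sketch below shows one way to satisfy the interface; it is illustrative only. `lineLoader` and `loadSample` are hypothetical names, the ".log" extension is an arbitrary choice, and `Document` is a local stand-in for rag.Document whose fields are assumed from the descriptions on this page.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Document is a local stand-in for rag.Document (fields are assumed).
type Document struct {
	ID       string
	Content  string
	Metadata map[string]string
}

// DocumentLoader mirrors the interface documented above.
type DocumentLoader interface {
	Load(ctx context.Context, source string) ([]Document, error)
	SupportedTypes() []string
}

// lineLoader is a hypothetical loader: one Document per non-empty line.
type lineLoader struct{}

func (lineLoader) Load(ctx context.Context, source string) ([]Document, error) {
	// Honor cancellation before doing any I/O.
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	data, err := os.ReadFile(source)
	if err != nil {
		return nil, fmt.Errorf("read %s: %w", source, err)
	}
	var docs []Document
	for i, line := range strings.Split(string(data), "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		docs = append(docs, Document{
			ID:       fmt.Sprintf("%s#L%d", filepath.Base(source), i+1),
			Content:  line,
			Metadata: map[string]string{"source": source},
		})
	}
	return docs, nil
}

func (lineLoader) SupportedTypes() []string { return []string{".log"} }

// Compile-time check that lineLoader satisfies the interface.
var _ DocumentLoader = lineLoader{}

// loadSample writes a small temporary file and loads it back.
func loadSample() ([]Document, error) {
	dir, err := os.MkdirTemp("", "loader-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "app.log")
	if err := os.WriteFile(path, []byte("first\n\nsecond\n"), 0o600); err != nil {
		return nil, err
	}
	return lineLoader{}.Load(context.Background(), path)
}

func main() {
	docs, err := loadSample()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, d := range docs {
		fmt.Println(d.ID, d.Content)
	}
}
```

A loader written this way could then be passed to LoaderRegistry.Register under its extension, provided it returns rag.Document values rather than the stand-in type used here.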

type GitHubSourceAdapter

type GitHubSourceAdapter struct {
	// contains filtered or unexported fields
}

GitHubSourceAdapter adapts sources.GitHubSource to the DocumentLoader interface. It searches GitHub repos by query and converts each result into a rag.Document.

func NewGitHubSourceAdapter

func NewGitHubSourceAdapter(source *sources.GitHubSource, maxResults int) *GitHubSourceAdapter

NewGitHubSourceAdapter creates an adapter around an existing GitHubSource.

func (*GitHubSourceAdapter) Load

func (a *GitHubSourceAdapter) Load(ctx context.Context, source string) ([]rag.Document, error)

Load interprets source as a search query and returns matching repos as Documents.

func (*GitHubSourceAdapter) SupportedTypes

func (a *GitHubSourceAdapter) SupportedTypes() []string

SupportedTypes returns an empty slice; this adapter is query-based, not file-based.

type JSONLoader

type JSONLoader struct {
	// contains filtered or unexported fields
}

JSONLoader loads JSON (single object or array) and JSONL files.

func NewJSONLoader

func NewJSONLoader(config JSONLoaderConfig) *JSONLoader

NewJSONLoader creates a JSONLoader with the given config.

func (*JSONLoader) Load

func (l *JSONLoader) Load(ctx context.Context, source string) ([]rag.Document, error)

Load reads a JSON or JSONL file and returns Documents.

func (*JSONLoader) SupportedTypes

func (l *JSONLoader) SupportedTypes() []string

SupportedTypes returns the extensions handled by JSONLoader.

type JSONLoaderConfig

type JSONLoaderConfig struct {
	// ContentField is the JSON field name to use as Document.Content.
	// If empty, the entire JSON object is serialized as content.
	ContentField string
	// IDField is the JSON field name to use as Document.ID.
	// If empty, a path-based ID is generated.
	IDField string
}

JSONLoaderConfig configures the JSON/JSONL loader.
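The effect of the two fields can be sketched for a single JSON object as follows. This is an assumption-laden illustration, not the loader's code: `toDocument` is a hypothetical helper, `Document` is a local stand-in for rag.Document, the fallback ID format is invented for the example, and falling back when a configured field is missing is a guess about behavior this page does not specify.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Document is a local stand-in for rag.Document (fields are assumed).
type Document struct {
	ID      string
	Content string
}

// JSONLoaderConfig is redeclared here with the fields documented above.
type JSONLoaderConfig struct {
	ContentField string
	IDField      string
}

// toDocument converts one JSON object into a Document.
// fallbackID plays the role of the generated path-based ID.
func toDocument(raw []byte, cfg JSONLoaderConfig, fallbackID string) (Document, error) {
	var obj map[string]any
	if err := json.Unmarshal(raw, &obj); err != nil {
		return Document{}, fmt.Errorf("decode object: %w", err)
	}

	// Defaults: the whole object as content, the fallback as ID.
	doc := Document{ID: fallbackID, Content: string(raw)}

	if cfg.ContentField != "" {
		if v, ok := obj[cfg.ContentField]; ok {
			doc.Content = fmt.Sprint(v)
		}
	}
	if cfg.IDField != "" {
		if v, ok := obj[cfg.IDField]; ok {
			doc.ID = fmt.Sprint(v)
		}
	}
	return doc, nil
}

func main() {
	raw := []byte(`{"id":"p1","text":"hello"}`)

	// Both fields configured: content and ID come from the object.
	withFields, err := toDocument(raw, JSONLoaderConfig{ContentField: "text", IDField: "id"}, "data.jsonl#0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", withFields)

	// Empty config: whole object as content, fallback ID.
	plain, err := toDocument(raw, JSONLoaderConfig{}, "data.jsonl#0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", plain)
}
```

For a JSONL file, the same conversion would presumably run once per line; for a JSON array, once per element.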

type LoaderRegistry

type LoaderRegistry struct {
	// contains filtered or unexported fields
}

LoaderRegistry routes Load calls to the appropriate DocumentLoader based on file extension.

func NewLoaderRegistry

func NewLoaderRegistry() *LoaderRegistry

NewLoaderRegistry creates a registry pre-populated with the built-in loaders.

func (*LoaderRegistry) Load

func (r *LoaderRegistry) Load(ctx context.Context, source string) ([]rag.Document, error)

Load determines the loader from the source's file extension and delegates to it.

func (*LoaderRegistry) Register

func (r *LoaderRegistry) Register(ext string, loader DocumentLoader)

Register adds or replaces a loader for the given file extension. ext should include the leading dot (e.g. ".pdf").

func (*LoaderRegistry) SupportedTypes

func (r *LoaderRegistry) SupportedTypes() []string

SupportedTypes returns all registered extensions, sorted.
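Extension-based routing of this kind can be sketched with a map keyed by extension. The code below is a simplified stand-in, not LoaderRegistry itself: `registry`, `echoLoader`, `newDemoRegistry`, and `route` are hypothetical names, `Document` is a local stand-in for rag.Document, and the error returned for an unregistered extension is an assumption. The sketch matches extensions exactly as written; whether the real registry normalizes case is not stated on this page.

```go
package main

import (
	"context"
	"fmt"
	"path/filepath"
	"sort"
)

// Document is a local stand-in for rag.Document (fields are assumed).
type Document struct {
	ID      string
	Content string
}

// DocumentLoader mirrors the interface documented on this page.
type DocumentLoader interface {
	Load(ctx context.Context, source string) ([]Document, error)
	SupportedTypes() []string
}

// registry maps a file extension (with leading dot) to a loader.
type registry struct {
	loaders map[string]DocumentLoader
}

func newRegistry() *registry {
	return &registry{loaders: map[string]DocumentLoader{}}
}

// Register adds or replaces the loader for ext.
func (r *registry) Register(ext string, loader DocumentLoader) {
	r.loaders[ext] = loader
}

// Load picks the loader from the source's extension and delegates to it.
func (r *registry) Load(ctx context.Context, source string) ([]Document, error) {
	ext := filepath.Ext(source)
	loader, ok := r.loaders[ext]
	if !ok {
		return nil, fmt.Errorf("no loader registered for extension %q", ext)
	}
	return loader.Load(ctx, source)
}

// SupportedTypes returns all registered extensions, sorted.
func (r *registry) SupportedTypes() []string {
	exts := make([]string, 0, len(r.loaders))
	for ext := range r.loaders {
		exts = append(exts, ext)
	}
	sort.Strings(exts)
	return exts
}

// echoLoader is a hypothetical loader that reports which path it was given.
type echoLoader struct{ ext string }

func (e echoLoader) Load(_ context.Context, source string) ([]Document, error) {
	return []Document{{ID: source, Content: e.ext + " loader handled " + source}}, nil
}

func (e echoLoader) SupportedTypes() []string { return []string{e.ext} }

// newDemoRegistry registers two echo loaders.
func newDemoRegistry() *registry {
	r := newRegistry()
	r.Register(".txt", echoLoader{ext: ".txt"})
	r.Register(".md", echoLoader{ext: ".md"})
	return r
}

// route loads path through the demo registry and returns the first content.
func route(path string) (string, error) {
	docs, err := newDemoRegistry().Load(context.Background(), path)
	if err != nil {
		return "", err
	}
	return docs[0].Content, nil
}

func main() {
	fmt.Println(newDemoRegistry().SupportedTypes())
	fmt.Println(route("notes/readme.md"))
	fmt.Println(route("paper.pdf"))
}
```

Registering a second loader under an existing extension simply overwrites the map entry, which matches the "adds or replaces" wording above.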

type MarkdownLoader

type MarkdownLoader struct{}

MarkdownLoader loads Markdown files, splitting by top-level headings. Each heading section becomes a separate Document with the heading preserved in metadata. If the file has no headings, the entire content is returned as a single Document.
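The splitting behavior described above can be approximated as follows. This is a sketch under assumptions rather than the loader's implementation: `splitByHeading` is a hypothetical helper, `Document` is a local stand-in for rag.Document, a top-level heading is taken to be a line starting with "# ", the "heading" metadata key is invented, and text before the first heading is kept as its own section. The sketch does not special-case "# " lines inside fenced code blocks.

```go
package main

import (
	"fmt"
	"strings"
)

// Document is a local stand-in for rag.Document (fields are assumed).
type Document struct {
	Content  string
	Metadata map[string]string
}

// splitByHeading returns one Document per top-level ("# ") section.
func splitByHeading(md string) []Document {
	var docs []Document
	var heading string
	var body []string

	// flush emits the section collected so far, skipping empty preambles.
	flush := func() {
		content := strings.TrimSpace(strings.Join(body, "\n"))
		if heading == "" && content == "" {
			return
		}
		docs = append(docs, Document{
			Content:  content,
			Metadata: map[string]string{"heading": heading},
		})
	}

	for _, line := range strings.Split(md, "\n") {
		if strings.HasPrefix(line, "# ") {
			flush() // close the previous section
			heading = strings.TrimSpace(strings.TrimPrefix(line, "# "))
			body = nil
			continue
		}
		// Lower-level headings ("## ...") stay inside the current section.
		body = append(body, line)
	}
	flush()
	return docs
}

func main() {
	md := "intro\n# A\nalpha\n## A.1\nmore\n# B\nbeta\n"
	for _, d := range splitByHeading(md) {
		fmt.Printf("heading=%q content=%q\n", d.Metadata["heading"], d.Content)
	}
}
```

Input without any "# " line falls through the loop untouched and comes back as a single Document with an empty heading, consistent with the no-headings case described above.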

func NewMarkdownLoader

func NewMarkdownLoader() *MarkdownLoader

NewMarkdownLoader creates a MarkdownLoader.

func (*MarkdownLoader) Load

func (l *MarkdownLoader) Load(ctx context.Context, source string) ([]rag.Document, error)

Load reads a Markdown file and splits it into Documents by heading.

func (*MarkdownLoader) SupportedTypes

func (l *MarkdownLoader) SupportedTypes() []string

SupportedTypes returns the extensions handled by MarkdownLoader.

type TextLoader

type TextLoader struct{}

TextLoader loads plain text files as a single Document.

func NewTextLoader

func NewTextLoader() *TextLoader

NewTextLoader creates a TextLoader.

func (*TextLoader) Load

func (l *TextLoader) Load(ctx context.Context, source string) ([]rag.Document, error)

Load reads a text file and returns it as a single Document.

func (*TextLoader) SupportedTypes

func (l *TextLoader) SupportedTypes() []string

SupportedTypes returns the extensions handled by TextLoader.
