chunk

package
v1.9.28 (Latest)
Warning

This package is not in the latest version of its module.

Published: Nov 28, 2025 | License: Apache-2.0 | Imports: 8 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func CollectFiles added in v1.9.28

func CollectFiles(paths []string) ([]string, error)

CollectFiles recursively collects all files from the given paths. Paths that do not exist are skipped rather than reported as errors.

func FileHash added in v1.9.28

func FileHash(path string) (string, error)

FileHash calculates the SHA-256 hash of the file at path.

func Matches added in v1.9.28

func Matches(path string, patterns []string) (bool, error)

Matches reports whether the given path matches any of the configured document paths or glob patterns. It is intended for file watchers that need to decide whether a new or changed file falls under the configured patterns.
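The path-or-glob check can be approximated as follows. This is a sketch under assumptions: `matches` is a hypothetical stand-in, and it uses `filepath.Match`, whose `*` does not cross path separators. The glob syntax that Matches actually supports is not stated above.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matches is a hypothetical stand-in for Matches. A pattern counts as a hit
// if it names the path exactly (after cleaning) or matches it as a glob.
func matches(path string, patterns []string) (bool, error) {
	for _, pattern := range patterns {
		if filepath.Clean(pattern) == filepath.Clean(path) {
			return true, nil
		}
		ok, err := filepath.Match(pattern, path)
		if err != nil {
			return false, err // malformed pattern
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	patterns := []string{"docs/*.md", "./README.md"}
	for _, p := range []string{"docs/a.md", "docs/sub/a.md", "README.md"} {
		ok, err := matches(p, patterns)
		fmt.Println(p, ok, err)
	}
}
```

In a file watcher, a check of this kind would run on each create or write event so that only files under the configured patterns are re-processed.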

Types

type Chunk

type Chunk struct {
	Index    int
	Content  string
	Metadata map[string]string
}

Chunk represents a piece of text from a document.

func ProcessFile added in v1.9.28

func ProcessFile(dp DocumentProcessor, path string) ([]Chunk, error)

ProcessFile reads the file at path and processes its content with the given document processor.

type DocumentProcessor added in v1.9.28

type DocumentProcessor interface {
	Process(path string, content []byte) ([]Chunk, error)
}

DocumentProcessor takes file content and returns chunks. Configuration (size, overlap, etc.) is set at construction time.

type TextDocumentProcessor added in v1.9.28

type TextDocumentProcessor struct {
	// contains filtered or unexported fields
}

TextDocumentProcessor is the default text-based chunker.

func NewTextDocumentProcessor added in v1.9.28

func NewTextDocumentProcessor(size, overlap int, respectWordBoundaries bool) *TextDocumentProcessor

NewTextDocumentProcessor creates a text-based document processor with the given chunk size, overlap, and word-boundary setting.

func (*TextDocumentProcessor) Process added in v1.9.28

func (t *TextDocumentProcessor) Process(_ string, content []byte) ([]Chunk, error)

Process implements DocumentProcessor for text-based chunking. The path argument is ignored.
