chunk

package
v1.9.27
Published: Nov 27, 2025 | License: Apache-2.0 | Imports: 8 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Chunk

type Chunk struct {
	Index    int
	Content  string
	Metadata map[string]string
}

Chunk represents a piece of text from a document.

type Processor

type Processor struct{}

Processor handles document processing: collecting files, hashing them, and splitting their text into chunks.

func New

func New() *Processor

New creates a new document Processor.

func (*Processor) ChunkText

func (p *Processor) ChunkText(text string, size, overlap int, respectWordBoundaries bool) []Chunk

ChunkText splits text into overlapping chunks.
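
The package source is not shown on this page, so the following is only a minimal sketch of what overlapping chunking can look like. The function chunkText is a hypothetical stand-in, not the package's implementation. It assumes that size and overlap are counted in runes, that an out-of-range overlap falls back to zero, and it leaves out respectWordBoundaries entirely.

```go
package main

import "fmt"

// Chunk mirrors the struct documented above.
type Chunk struct {
	Index    int
	Content  string
	Metadata map[string]string
}

// chunkText is a hypothetical illustration of overlapping chunking.
// Assumption: size and overlap are measured in runes, not bytes.
func chunkText(text string, size, overlap int) []Chunk {
	runes := []rune(text)
	if size <= 0 || len(runes) == 0 {
		return nil
	}
	if overlap < 0 || overlap >= size {
		overlap = 0 // assumption: an invalid overlap means no overlap
	}
	step := size - overlap // how far each new chunk advances

	var chunks []Chunk
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, Chunk{
			Index:   len(chunks),
			Content: string(runes[start:end]),
		})
		if end == len(runes) {
			break // the last chunk reached the end of the text
		}
	}
	return chunks
}

func main() {
	for _, c := range chunkText("abcdefghij", 4, 2) {
		fmt.Printf("%d: %q\n", c.Index, c.Content)
	}
}
```

With size 4 and overlap 2, each chunk in this sketch starts two runes after the previous one, so neighbouring chunks share two runes.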

func (*Processor) CollectFiles

func (p *Processor) CollectFiles(paths []string) ([]string, error)

CollectFiles recursively collects all files from the given paths. It skips paths that don't exist instead of returning an error.
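
As an illustration of the documented behaviour (recursive collection, missing paths skipped), here is a sketch built on the standard library's filepath.WalkDir. The function collectFiles is hypothetical and may differ from the package's own code, for example in ordering, symlink handling, or how other errors are reported.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// collectFiles is a hypothetical sketch: it walks every root recursively,
// returns regular entries that are not directories, and silently skips
// roots that do not exist.
func collectFiles(paths []string) ([]string, error) {
	var files []string
	for _, root := range paths {
		if _, err := os.Stat(root); err != nil {
			if errors.Is(err, fs.ErrNotExist) {
				continue // missing path: skip instead of failing
			}
			return nil, err
		}
		err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
			if err != nil {
				return err
			}
			if !d.IsDir() {
				files = append(files, path)
			}
			return nil
		})
		if err != nil {
			return nil, err
		}
	}
	return files, nil
}

func main() {
	dir, err := os.MkdirTemp("", "collect-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	nested := filepath.Join(dir, "sub")
	if err := os.MkdirAll(nested, 0o755); err != nil {
		panic(err)
	}
	if err := os.WriteFile(filepath.Join(nested, "a.txt"), []byte("hi"), 0o644); err != nil {
		panic(err)
	}

	// The second path does not exist and is skipped.
	files, err := collectFiles([]string{dir, filepath.Join(dir, "missing")})
	fmt.Println(len(files), err)
}
```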

func (*Processor) FileHash

func (p *Processor) FileHash(path string) (string, error)

FileHash calculates the SHA-256 hash of a file.
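
A file's SHA-256 digest can be computed by streaming it through crypto/sha256, which avoids loading the whole file into memory. The sketch below assumes the digest is returned as a lowercase hex string; the page does not state the encoding FileHash uses. Both fileHash and writeTemp are hypothetical helpers for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// fileHash is a hypothetical sketch: it streams the file through SHA-256.
// Assumption: the result is hex encoded.
func fileHash(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// writeTemp creates a temporary file with the given content (demo helper).
func writeTemp(content string) (string, error) {
	f, err := os.CreateTemp("", "hash-demo-*.txt")
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.WriteString(content); err != nil {
		return "", err
	}
	return f.Name(), nil
}

func main() {
	path, err := writeTemp("hello")
	if err != nil {
		panic(err)
	}
	defer os.Remove(path)

	sum, err := fileHash(path)
	if err != nil {
		panic(err)
	}
	fmt.Println(sum)
}
```

A content hash like this is commonly used to detect whether a file has changed since it was last processed.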

func (*Processor) Matches added in v1.9.26

func (p *Processor) Matches(path string, patterns []string) (bool, error)

Matches reports whether the given path matches any configured document path or glob pattern. It is intended for file watchers, which can use it to decide whether a new or changed file matches the glob patterns.
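
One way to picture this check is with the standard library's filepath.Match. The function matches below is a hypothetical sketch, not the package's logic: it assumes each pattern is compared against the whole cleaned path, so a pattern without metacharacters only matches an identical path, and `*` does not cross path separators. Directory-prefix matching and `**` patterns are not covered.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matches is a hypothetical sketch: it reports whether path matches at
// least one pattern, and returns an error for a malformed pattern.
func matches(path string, patterns []string) (bool, error) {
	path = filepath.Clean(path)
	for _, pattern := range patterns {
		ok, err := filepath.Match(pattern, path)
		if err != nil {
			return false, err // filepath.ErrBadPattern
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	patterns := []string{"*.txt", "docs/*.md"}

	// A file watcher would call this for each created or modified file.
	for _, p := range []string{"docs/readme.md", "docs/sub/readme.md", "notes.txt"} {
		ok, err := matches(p, patterns)
		fmt.Println(p, ok, err)
	}
}
```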

func (*Processor) ProcessFile

func (p *Processor) ProcessFile(path string, chunkSize, overlap int, respectWordBoundaries bool) ([]Chunk, error)

ProcessFile reads a file and splits it into chunks.
