chunk

package
v1.9.18
Published: Nov 20, 2025 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Types

type Chunk

type Chunk struct {
	Index   int
	Content string
}

Chunk represents a piece of text from a document.

type Processor

type Processor struct{}

Processor handles document processing.

func New

func New() *Processor

New creates a new document processor.

func (*Processor) ChunkText

func (p *Processor) ChunkText(text string, size, overlap int, respectWordBoundaries bool) []Chunk

ChunkText splits text into overlapping chunks.
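
The page does not say how size and overlap are measured or how word boundaries are honored, so the following is only a minimal sketch of one plausible reading, not the package's implementation. It assumes size and overlap count runes, that each chunk begins overlap runes before the previous chunk's end, and that a chunk end falling mid-word is pulled back to the most recent whitespace. The name chunkText and the local Chunk type are stand-ins for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// Chunk mirrors the documented struct: a position and the text it holds.
type Chunk struct {
	Index   int
	Content string
}

// chunkText is a hypothetical stand-in, not the package's code.
// Assumptions: size and overlap count runes, and consecutive chunks
// share `overlap` runes.
func chunkText(text string, size, overlap int, respectWordBoundaries bool) []Chunk {
	runes := []rune(text)
	if size <= 0 || len(runes) == 0 {
		return nil
	}
	if overlap < 0 || overlap >= size {
		overlap = 0 // guard: the window must always advance
	}
	var chunks []Chunk
	for start := 0; start < len(runes); {
		end := start + size
		if end >= len(runes) {
			end = len(runes)
		} else if respectWordBoundaries && !unicode.IsSpace(runes[end]) {
			// The cut lands inside a word: back up to just after the
			// last whitespace in the window, if there is one.
			for i := end; i > start+overlap; i-- {
				if unicode.IsSpace(runes[i-1]) {
					end = i
					break
				}
			}
		}
		chunks = append(chunks, Chunk{Index: len(chunks), Content: string(runes[start:end])})
		if end == len(runes) {
			break
		}
		start = end - overlap
	}
	return chunks
}

func main() {
	for _, c := range chunkText("abcdefghij", 4, 1, false) {
		fmt.Printf("%d %q\n", c.Index, c.Content)
	}
	// Prints:
	// 0 "abcd"
	// 1 "defg"
	// 2 "ghij"
}
```

Working in runes rather than bytes keeps multi-byte UTF-8 characters intact at chunk edges. The real ChunkText may measure sizes differently.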

func (*Processor) CollectFiles

func (p *Processor) CollectFiles(paths []string) ([]string, error)

CollectFiles recursively collects all files from the given paths. It skips paths that do not exist instead of returning an error.
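
The described behavior (recurse into directories, silently skip missing paths) could be approximated with the standard library as sketched below. This is an illustration under assumptions, not the package's code. The name collectFiles is hypothetical, and treating every non-directory entry as a file is a choice made here for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// collectFiles is a hypothetical stand-in for the documented method.
// Missing roots are skipped; any other error aborts the walk.
func collectFiles(paths []string) ([]string, error) {
	var files []string
	for _, root := range paths {
		if _, err := os.Stat(root); errors.Is(err, fs.ErrNotExist) {
			continue // missing path: skip rather than fail
		}
		err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
			if err != nil {
				return err
			}
			if !d.IsDir() {
				files = append(files, path) // assumption: keep every non-directory entry
			}
			return nil
		})
		if err != nil {
			return nil, err
		}
	}
	return files, nil
}

func main() {
	// Pass roots on the command line, e.g.: go run . ./docs ./missing
	files, err := collectFiles(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("collected %d file(s)\n", len(files))
}
```

Because filepath.WalkDir also accepts a regular file as its root, a path that names a single file is returned as-is.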

func (*Processor) FileHash

func (p *Processor) FileHash(path string) (string, error)

FileHash calculates the SHA-256 hash of a file.
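
A streaming SHA-256 over a file might look like the sketch below. The page only gives the return type as a string, so hex encoding of the digest is an assumption. The names fileSHA256 and writeTempFile are hypothetical helpers for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// fileSHA256 is a hypothetical stand-in for the documented method.
// Assumption: the digest is returned as a lowercase hex string.
func fileSHA256(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	// Stream the file through the hash so large files are never
	// loaded into memory all at once.
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// writeTempFile creates a throwaway file so the demo runs anywhere.
func writeTempFile(content string) (string, error) {
	f, err := os.CreateTemp("", "filehash-demo-*")
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.WriteString(content); err != nil {
		return "", err
	}
	return f.Name(), nil
}

func main() {
	path, err := writeTempFile("hello")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer os.Remove(path)

	sum, err := fileSHA256(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println(sum)
}
```

A content hash like this is commonly used to detect whether a file changed since it was last processed. Whether the package uses FileHash that way is not stated on this page.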

func (*Processor) ProcessFile

func (p *Processor) ProcessFile(path string, chunkSize, overlap int, respectWordBoundaries bool) ([]Chunk, error)

ProcessFile reads a file and splits it into chunks.
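
One way to picture ProcessFile is as "read the whole file, then hand the text to a splitter". The sketch below assumes exactly that and is not the package's code. To stay short it takes the splitter as a function argument instead of chunkSize and overlap. The names processFile and byParagraph are hypothetical, and byParagraph is a toy splitter rather than the documented overlapping scheme.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Chunk mirrors the documented struct.
type Chunk struct {
	Index   int
	Content string
}

// processFile is a hypothetical stand-in: read the file, then split it.
// A read failure is returned to the caller with the path attached.
func processFile(path string, split func(text string) []Chunk) ([]Chunk, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read %s: %w", path, err)
	}
	return split(string(data)), nil
}

// byParagraph is a toy splitter for the demo: one chunk per
// blank-line-separated paragraph, skipping empty ones.
func byParagraph(text string) []Chunk {
	var chunks []Chunk
	for _, p := range strings.Split(text, "\n\n") {
		if p = strings.TrimSpace(p); p != "" {
			chunks = append(chunks, Chunk{Index: len(chunks), Content: p})
		}
	}
	return chunks
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: go run . <file>")
		os.Exit(2)
	}
	chunks, err := processFile(os.Args[1], byParagraph)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, c := range chunks {
		fmt.Printf("%d: %d bytes\n", c.Index, len(c.Content))
	}
}
```

Keeping file reading separate from splitting lets the splitter be tested on plain strings, without touching the filesystem.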
