chunk

package
v1.16.0 Latest
Warning

This package is not in the latest version of its module.

Published: Dec 30, 2025 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func FileHash added in v1.9.28

func FileHash(path string) (string, error)

FileHash calculates the SHA-256 hash of the file at path.
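The following is a minimal sketch of how a file can be hashed with SHA-256 using only the standard library. It is not the package's source: fileHash and writeTemp are hypothetical names, and returning the digest as lowercase hex is an assumption, since this page does not state the encoding of the returned string.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// fileHash is a hypothetical stand-in for FileHash. It streams the file
// through SHA-256 and returns the digest as lowercase hex (an assumption).
func fileHash(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", fmt.Errorf("open %s: %w", path, err)
	}
	defer f.Close()

	h := sha256.New()
	// io.Copy feeds the file to the hash in blocks, so large files are
	// never held in memory all at once.
	if _, err := io.Copy(h, f); err != nil {
		return "", fmt.Errorf("read %s: %w", path, err)
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// writeTemp creates a temporary file holding content and returns its path
// and a cleanup function. It exists only to make the demo runnable.
func writeTemp(content string) (string, func(), error) {
	f, err := os.CreateTemp("", "filehash-*.txt")
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { os.Remove(f.Name()) }
	if _, err := f.WriteString(content); err != nil {
		f.Close()
		cleanup()
		return "", nil, err
	}
	if err := f.Close(); err != nil {
		cleanup()
		return "", nil, err
	}
	return f.Name(), cleanup, nil
}

// run hashes a small temporary file and prints the digest.
func run() error {
	path, cleanup, err := writeTemp("hello, chunk\n")
	if err != nil {
		return err
	}
	defer cleanup()

	sum, err := fileHash(path)
	if err != nil {
		return err
	}
	fmt.Println(sum)
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

A content hash like this is commonly used to detect whether a file has changed since it was last chunked, though this page does not say how the module itself uses it.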

Types

type Chunk

type Chunk struct {
	Index    int
	Content  string
	Metadata map[string]string
}

Chunk represents a piece of text from a document.

func ProcessFile added in v1.9.28

func ProcessFile(dp DocumentProcessor, path string) ([]Chunk, error)

ProcessFile reads the file at path and processes its content with the given DocumentProcessor.
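As a rough illustration of the read-then-delegate flow described above, the sketch below redeclares Chunk and DocumentProcessor locally so that it compiles on its own. processFile, wholeFileProcessor, runDemo, and the "source" metadata key are hypothetical; the real ProcessFile may handle errors or metadata differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Chunk and DocumentProcessor mirror the declarations documented on this page.
type Chunk struct {
	Index    int
	Content  string
	Metadata map[string]string
}

type DocumentProcessor interface {
	Process(path string, content []byte) ([]Chunk, error)
}

// wholeFileProcessor is a hypothetical processor that returns the entire
// file as a single chunk and records the source path as metadata.
type wholeFileProcessor struct{}

func (wholeFileProcessor) Process(path string, content []byte) ([]Chunk, error) {
	return []Chunk{{
		Index:    0,
		Content:  string(content),
		Metadata: map[string]string{"source": path},
	}}, nil
}

// processFile is an assumed equivalent of ProcessFile: read the file,
// then hand its bytes to the processor.
func processFile(dp DocumentProcessor, path string) ([]Chunk, error) {
	content, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read %s: %w", path, err)
	}
	return dp.Process(path, content)
}

// runDemo writes a small file to a temporary directory and processes it.
func runDemo() ([]Chunk, error) {
	dir, err := os.MkdirTemp("", "chunk-demo-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "note.txt")
	if err := os.WriteFile(path, []byte("hello from disk"), 0o600); err != nil {
		return nil, err
	}
	return processFile(wholeFileProcessor{}, path)
}

func main() {
	chunks, err := runDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, c := range chunks {
		fmt.Printf("chunk %d from %s: %q\n", c.Index, c.Metadata["source"], c.Content)
	}
}
```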

type DocumentProcessor added in v1.9.28

type DocumentProcessor interface {
	Process(path string, content []byte) ([]Chunk, error)
}

DocumentProcessor takes file content and returns chunks. Configuration (size, overlap, etc.) is set at construction time.
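To show what "configuration set at construction time" can look like, here is a sketch of a custom implementation of the interface. lineProcessor, newLineProcessor, and the "source" metadata key are hypothetical and do not exist in the package. Chunk and DocumentProcessor are redeclared locally so the example is self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Chunk and DocumentProcessor mirror the declarations documented on this page.
type Chunk struct {
	Index    int
	Content  string
	Metadata map[string]string
}

type DocumentProcessor interface {
	Process(path string, content []byte) ([]Chunk, error)
}

// lineProcessor is a hypothetical processor that groups a fixed number of
// lines into each chunk. The group size is fixed when the value is built.
type lineProcessor struct {
	linesPerChunk int
}

// newLineProcessor validates the configuration once, up front, so that
// Process does not need to re-check it on every call.
func newLineProcessor(linesPerChunk int) (*lineProcessor, error) {
	if linesPerChunk <= 0 {
		return nil, errors.New("linesPerChunk must be positive")
	}
	return &lineProcessor{linesPerChunk: linesPerChunk}, nil
}

// Process splits content into lines and emits one chunk per group of lines.
func (p *lineProcessor) Process(path string, content []byte) ([]Chunk, error) {
	text := strings.TrimRight(string(content), "\n")
	if text == "" {
		return nil, nil
	}
	lines := strings.Split(text, "\n")

	var chunks []Chunk
	for start := 0; start < len(lines); start += p.linesPerChunk {
		end := start + p.linesPerChunk
		if end > len(lines) {
			end = len(lines)
		}
		chunks = append(chunks, Chunk{
			Index:    len(chunks),
			Content:  strings.Join(lines[start:end], "\n"),
			Metadata: map[string]string{"source": path},
		})
	}
	return chunks, nil
}

// Compile-time check that lineProcessor satisfies the interface.
var _ DocumentProcessor = (*lineProcessor)(nil)

func main() {
	p, err := newLineProcessor(2)
	if err != nil {
		panic(err)
	}
	chunks, err := p.Process("notes.txt", []byte("a\nb\nc\nd\ne\n"))
	if err != nil {
		panic(err)
	}
	for _, c := range chunks {
		fmt.Printf("%d: %q\n", c.Index, c.Content)
	}
}
```

Because Process receives the content as bytes rather than opening the file itself, an implementation like this can be exercised in tests without touching the filesystem.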

type TextDocumentProcessor added in v1.9.28

type TextDocumentProcessor struct {
	// contains filtered or unexported fields
}

TextDocumentProcessor is the default text-based chunker.

func NewTextDocumentProcessor added in v1.9.28

func NewTextDocumentProcessor(size, overlap int, respectWordBoundaries bool) *TextDocumentProcessor

NewTextDocumentProcessor creates a text-based document processor configured with the given chunk size, overlap, and word-boundary setting.
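This page does not define the units of size and overlap or the exact effect of respectWordBoundaries. The sketch below shows one plausible reading, not the package's algorithm: a sliding window measured in runes, where each window starts overlap runes before the previous one ended, and where the word-boundary option moves a cut back to the nearest preceding whitespace. splitWithOverlap is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitWithOverlap is a hypothetical sliding-window splitter. It assumes
// size and overlap are counted in runes and requires 0 <= overlap < size.
// It returns nil for an invalid configuration.
func splitWithOverlap(text string, size, overlap int, respectWordBoundaries bool) []string {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	runes := []rune(text)
	var out []string

	// emit appends one window, trimming edge whitespace in word mode.
	emit := func(window []rune) {
		s := string(window)
		if respectWordBoundaries {
			s = strings.TrimSpace(s)
		}
		if s != "" {
			out = append(out, s)
		}
	}

	for start := 0; start < len(runes); {
		end := start + size
		if end >= len(runes) {
			emit(runes[start:]) // final window takes whatever is left
			break
		}
		if respectWordBoundaries {
			// Move the cut back to the nearest whitespace so no word is
			// split. The search stops at start+overlap so that the next
			// window always begins after the current one.
			for cut := end; cut > start+overlap; cut-- {
				if unicode.IsSpace(runes[cut]) {
					end = cut
					break
				}
			}
		}
		emit(runes[start:end])
		start = end - overlap
	}
	return out
}

func main() {
	fmt.Printf("%q\n", splitWithOverlap("abcdefghij", 4, 1, false))
	fmt.Printf("%q\n", splitWithOverlap("the quick brown fox", 12, 0, false))
	fmt.Printf("%q\n", splitWithOverlap("the quick brown fox", 12, 0, true))
}
```

In this sketch the overlap is still counted in runes when word boundaries are respected, so an overlapping window may begin in the middle of a word; a production chunker might align the overlap to word boundaries as well.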

func (*TextDocumentProcessor) Process added in v1.9.28

func (t *TextDocumentProcessor) Process(_ string, content []byte) ([]Chunk, error)

Process implements DocumentProcessor for text-based chunking. The path argument is ignored.
