chunker

Version: v0.2.0
Published: Apr 30, 2026 License: MIT Imports: 2 Imported by: 0

Documentation

Overview

Package chunker splits a message into sentence-sliding-window chunks. See DESIGN.md §4.1.

By default the window is 3 sentences wide and advances by a stride of 2, so consecutive windows overlap by one sentence. For 10 sentences, numbered from 1, the windows are [1,2,3] [3,4,5] [5,6,7] [7,8,9] [9,10]; the trailing window is shorter because the sentences run out.

Constants

const (
	DefaultWindowSize = 3
	DefaultStride     = 2
)

Default window parameters.

Variables

This section is empty.

Functions

func SplitSentences

func SplitSentences(text string) []string

SplitSentences breaks text into sentences using a rule-based splitter: terminators (.?!) end a sentence unless preceded by a known abbreviation or by a single capital letter (an initial like "J. R. R. Tolkien").

Types

type Chunk

type Chunk struct {
	// SentenceSpan is [start, end) over the input sentence array.
	SentenceSpan [2]int
	// Text is the concatenated sentence text for this window.
	Text string
}

Chunk is one sliding-window result.

func Chunkify

func Chunkify(text string, windowSize, stride int) []Chunk

Chunkify splits text into windows of windowSize sentences advanced by stride. windowSize must be >= 1; stride must be >= 1 and <= windowSize.
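The parameter constraints follow from the overlap between consecutive windows, which is windowSize - stride: a stride larger than the window would skip sentences entirely. A caller-side check might look like the sketch below. checkParams is a hypothetical helper; the documentation does not say how Chunkify itself reacts to invalid values.

```go
package main

import (
	"errors"
	"fmt"
)

// checkParams validates the documented constraints and returns the number
// of sentences shared by consecutive windows. Hypothetical helper, not part
// of the package.
func checkParams(windowSize, stride int) (overlap int, err error) {
	if windowSize < 1 {
		return 0, errors.New("windowSize must be >= 1")
	}
	if stride < 1 || stride > windowSize {
		return 0, fmt.Errorf("stride must be in [1, %d], got %d", windowSize, stride)
	}
	return windowSize - stride, nil
}

func main() {
	fmt.Println(checkParams(3, 2)) // defaults: one-sentence overlap
	fmt.Println(checkParams(3, 3)) // no overlap, but no gaps either
	fmt.Println(checkParams(3, 4)) // rejected: sentences would be skipped
}
```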

func Default

func Default(text string) []Chunk

Default splits text into sliding-window chunks using the default parameters (DefaultWindowSize and DefaultStride).
