Documentation ¶
Overview ¶
Package chunker splits a message into sentence-sliding-window chunks. See DESIGN.md §4.1.
The default window is size=3, stride=2, so consecutive windows overlap by one sentence. For 10 sentences (numbered from 1) the windows are [1,2,3] [3,4,5] [5,6,7] [7,8,9] [9,10]; the trailing window is shorter when sentences run out.
Constants ¶
const (
	DefaultWindowSize = 3
	DefaultStride     = 2
)
Default window parameters.
Variables ¶
This section is empty.
Functions ¶
func SplitSentences ¶
SplitSentences breaks text into sentences using a rule-based splitter: terminators (.?!) end a sentence unless preceded by a known abbreviation or by a single capital letter (an initial like "J. R. R. Tolkien").
Types ¶
type Chunk ¶
type Chunk struct {
	// SentenceSpan is [start, end) over the input sentence array.
	SentenceSpan [2]int
	// Text is the concatenated sentence text for this window.
	Text string
}
Chunk is one sliding-window result.