Documentation
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func SplitPlaintextOnSentences ¶
SplitPlaintextOnSentences splits a string into chunks of at most the given size. It prefers to split on sentence boundaries, treating ., !, and ? as sentence endings. Because every chunk is guaranteed to be no larger than the given size, a split may fall in the middle of a sentence. A chunk may be shorter than the given size so that it ends on a sentence boundary, but no shorter than 3/4 of the given size. Therefore, in the worst case, the number of chunks may be 25% greater than expected.
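The behavior described above can be illustrated with a minimal sketch. The signature of SplitPlaintextOnSentences is not shown in this documentation, so the function below, splitOnSentences, and its helper isSentenceEnd are hypothetical stand-ins, not the package's implementation. The sketch assumes sizes are counted in characters (runes) and that a sentence ending is only accepted when it leaves the chunk at least 3/4 of the requested size; otherwise it falls back to a hard cut.

```go
package main

import "fmt"

// isSentenceEnd reports whether r is one of the assumed sentence terminators.
func isSentenceEnd(r rune) bool {
	return r == '.' || r == '!' || r == '?'
}

// splitOnSentences is a hypothetical illustration, not the package's code.
// It returns chunks of at most size runes. Each chunk ends right after the
// last '.', '!' or '?' found in the final quarter of its window; if there is
// none, the chunk is cut at exactly size runes, possibly mid-sentence.
func splitOnSentences(text string, size int) []string {
	if size <= 0 {
		return nil
	}
	runes := []rune(text)
	minLen := (size*3 + 3) / 4 // smallest accepted chunk: ceil(3/4 * size)
	var chunks []string
	for len(runes) > size {
		cut := size // default: hard cut at the size limit
		for i := size; i >= minLen && i > 0; i-- {
			if isSentenceEnd(runes[i-1]) {
				cut = i // end the chunk just after the terminator
				break
			}
		}
		chunks = append(chunks, string(runes[:cut]))
		runes = runes[cut:]
	}
	if len(runes) > 0 {
		chunks = append(chunks, string(runes)) // the remainder may be shorter
	}
	return chunks
}

func main() {
	for _, c := range splitOnSentences("Hello world. Foo bar baz.", 14) {
		fmt.Printf("%q\n", c)
	}
}
```

In this sketch only the final chunk can fall below the 3/4 threshold, and concatenating the chunks reproduces the input exactly. Whether the real function trims whitespace between chunks is not stated here, so the sketch leaves it in place.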
Types ¶
type Options ¶
type Options struct {
	ChunkSize        int    `json:"chunkSize"`        // Maximum size of each chunk in characters
	ChunkOverlap     int    `json:"chunkOverlap"`     // Number of characters to overlap between chunks
	ChunkingStrategy string `json:"chunkingStrategy"` // Strategy: sentences, paragraphs, or fixed
}
Options defines options for chunking documents.
func DefaultOptions ¶
func DefaultOptions() Options
DefaultOptions returns the default chunking options.
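Because Options carries JSON struct tags, it can be read from or written to a configuration file. The sketch below redeclares the struct locally so that it compiles on its own. The helpers exampleOptions, encodeOptions, and decodeOptions are hypothetical, and the values used are illustrative only: the actual defaults returned by DefaultOptions are not listed in this documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Options mirrors the documented struct so this example is self-contained.
type Options struct {
	ChunkSize        int    `json:"chunkSize"`
	ChunkOverlap     int    `json:"chunkOverlap"`
	ChunkingStrategy string `json:"chunkingStrategy"`
}

// exampleOptions is a hypothetical helper. Its values are illustrative and
// are NOT the defaults returned by DefaultOptions.
func exampleOptions() Options {
	return Options{ChunkSize: 1000, ChunkOverlap: 100, ChunkingStrategy: "sentences"}
}

// encodeOptions serializes Options using the JSON field names in its tags.
func encodeOptions(o Options) (string, error) {
	b, err := json.Marshal(o)
	return string(b), err
}

// decodeOptions parses JSON into Options; missing fields keep their zero value.
func decodeOptions(s string) (Options, error) {
	var o Options
	err := json.Unmarshal([]byte(s), &o)
	return o, err
}

func main() {
	s, err := encodeOptions(exampleOptions())
	if err != nil {
		panic(err)
	}
	fmt.Println(s)

	o, err := decodeOptions(`{"chunkSize":256,"chunkingStrategy":"fixed"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", o)
}
```

Fields omitted from the JSON input decode to their zero values (0 or the empty string). A caller that wants the package defaults for missing fields would presumably start from DefaultOptions() and unmarshal on top of that value.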