Documentation
Constants
This section is empty.
Variables
This section is empty.
Functions
This section is empty.
Types
type CharacterSplitter
CharacterSplitter splits text recursively using a list of separators, similar to RecursiveCharacterTextSplitter in LangChain.
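The recursive idea can be illustrated with a minimal, self-contained sketch. It is not this package's implementation: splitRecursive, the separator list, and the choice to measure size in runes are all assumptions made for illustration, and the sketch omits the merging of small pieces and the overlap handling that a full splitter would normally perform.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRecursive is a hypothetical helper, not part of this package's API.
// It splits text on the first separator; any piece still longer than size
// (counted in runes) is split again with the remaining separators. When no
// separators remain, the piece is cut into fixed-size rune windows.
func splitRecursive(text string, seps []string, size int) []string {
	if size <= 0 || text == "" {
		return nil
	}
	runes := []rune(text)
	if len(runes) <= size {
		return []string{text}
	}
	if len(seps) == 0 {
		// No separators left: fall back to a hard cut on rune boundaries.
		var out []string
		for start := 0; start < len(runes); start += size {
			end := start + size
			if end > len(runes) {
				end = len(runes)
			}
			out = append(out, string(runes[start:end]))
		}
		return out
	}
	var out []string
	for _, piece := range strings.Split(text, seps[0]) {
		out = append(out, splitRecursive(piece, seps[1:], size)...)
	}
	return out
}

func main() {
	// Try paragraph breaks first, then spaces.
	parts := splitRecursive("alpha beta\n\ngamma", []string{"\n\n", " "}, 6)
	fmt.Printf("%q\n", parts) // prints ["alpha" "beta" "gamma"]
}
```

Ordering the separators from coarse (paragraph breaks) to fine (spaces) keeps larger logical units intact whenever they already fit within the size limit.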
func NewCharacterSplitter
func NewCharacterSplitter(size, overlap int) *CharacterSplitter
NewCharacterSplitter creates a chunker that splits text by runes and at logical breaks.
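How size and overlap might interact can be sketched with a plain sliding window. This assumes both parameters are counted in runes and that overlap must be smaller than size; the documentation above does not confirm either point, and windowRunes is a hypothetical helper rather than part of this package.

```go
package main

import (
	"errors"
	"fmt"
)

// windowRunes is a hypothetical helper, not part of this package's API.
// It cuts text into windows of at most size runes, where each window
// repeats the last overlap runes of the previous one.
func windowRunes(text string, size, overlap int) ([]string, error) {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil, errors.New("need size > 0 and 0 <= overlap < size")
	}
	runes := []rune(text) // index by rune so multi-byte characters stay whole
	step := size - overlap
	var out []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		out = append(out, string(runes[start:end]))
		if end == len(runes) {
			break // the last window already reaches the end of the text
		}
	}
	return out, nil
}

func main() {
	chunks, err := windowRunes("abcdefghij", 4, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", chunks) // prints ["abcd" "defg" "ghij"]
}
```

Rejecting overlap >= size up front avoids a zero or negative step, which would otherwise loop forever.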
func (*CharacterSplitter) SplitDocument
func (c *CharacterSplitter) SplitDocument(ctx context.Context, doc *core.Document) ([]*core.Chunk, error)
SplitDocument converts a document into interconnected core.Chunk units (similar to LlamaIndex Nodes).
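One way to picture "interconnected" chunks is a list in which each chunk records its source document and its neighbours. The sketch below is illustrative only: the Document and Chunk structs, their fields, the ID scheme, and linkChunks are hypothetical stand-ins, since the fields of core.Document and core.Chunk are not shown on this page.

```go
package main

import "fmt"

// Document and Chunk are hypothetical stand-ins for core.Document and
// core.Chunk; the real types may carry different fields.
type Document struct {
	ID   string
	Text string
}

type Chunk struct {
	ID     string
	DocID  string
	Text   string
	PrevID string // empty for the first chunk
	NextID string // empty for the last chunk
}

// linkChunks is a hypothetical helper that wraps already-split text pieces
// in Chunk values and links each one to its predecessor and successor.
func linkChunks(doc Document, parts []string) []*Chunk {
	chunks := make([]*Chunk, len(parts))
	for i, p := range parts {
		chunks[i] = &Chunk{
			ID:    fmt.Sprintf("%s-%d", doc.ID, i),
			DocID: doc.ID,
			Text:  p,
		}
	}
	for i := range chunks {
		if i > 0 {
			chunks[i].PrevID = chunks[i-1].ID
		}
		if i < len(chunks)-1 {
			chunks[i].NextID = chunks[i+1].ID
		}
	}
	return chunks
}

func main() {
	doc := Document{ID: "doc1", Text: "first second"}
	for _, c := range linkChunks(doc, []string{"first", "second"}) {
		fmt.Printf("%s prev=%q next=%q text=%q\n", c.ID, c.PrevID, c.NextID, c.Text)
	}
}
```

Storing neighbour IDs rather than pointers keeps each chunk serializable on its own, which tends to matter when chunks are written to a vector store.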
type TextSplitter
type TextSplitter interface {
	// SplitText turns a raw string into meaningful chunk strings.
	SplitText(text string) ([]string, error)
	// SplitDocument extends the raw string logic with ID mapping and metadata (Chunk = Node in LlamaIndex).
	SplitDocument(ctx context.Context, doc *core.Document) ([]*core.Chunk, error)
}
TextSplitter is the generalized counterpart of LlamaIndex's "NodeParser" and LangChain's "TextSplitter".
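A custom splitter would satisfy this interface by providing both methods. The following is a sketch under stated assumptions: Document and Chunk are local stand-ins for core.Document and core.Chunk (whose fields are not shown here), the interface is redeclared against those stand-ins so the example compiles on its own, and ParagraphSplitter is a hypothetical implementation.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Document and Chunk are hypothetical stand-ins for core.Document and core.Chunk.
type Document struct{ ID, Text string }
type Chunk struct{ DocID, Text string }

// TextSplitter mirrors the documented interface, using the stand-in types.
type TextSplitter interface {
	SplitText(text string) ([]string, error)
	SplitDocument(ctx context.Context, doc *Document) ([]*Chunk, error)
}

// ParagraphSplitter is a hypothetical splitter that breaks text on blank lines.
type ParagraphSplitter struct{}

// SplitText returns the non-empty, whitespace-trimmed paragraphs of text.
func (ParagraphSplitter) SplitText(text string) ([]string, error) {
	var out []string
	for _, p := range strings.Split(text, "\n\n") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out, nil
}

// SplitDocument reuses SplitText and tags each chunk with its document ID.
func (s ParagraphSplitter) SplitDocument(ctx context.Context, doc *Document) ([]*Chunk, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // respect cancellation before doing any work
	}
	parts, err := s.SplitText(doc.Text)
	if err != nil {
		return nil, err
	}
	chunks := make([]*Chunk, 0, len(parts))
	for _, p := range parts {
		chunks = append(chunks, &Chunk{DocID: doc.ID, Text: p})
	}
	return chunks, nil
}

// Compile-time check that ParagraphSplitter satisfies TextSplitter.
var _ TextSplitter = ParagraphSplitter{}

func main() {
	var s TextSplitter = ParagraphSplitter{}
	chunks, err := s.SplitDocument(context.Background(), &Document{ID: "doc1", Text: "one\n\ntwo"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range chunks {
		fmt.Println(c.DocID, c.Text)
	}
}
```

Building SplitDocument on top of SplitText keeps the string logic in one place, so the two methods cannot drift apart.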