Documentation
Constants ¶
const (
	ChunkerNone      = "none"
	ChunkerSentence  = "sentence"
	ChunkerParagraph = "paragraph"
	ChunkerChunk     = "chunk"
	ChunkerTag       = "tag"
)
Chunking strategies
Functions ¶
func NewChunker ¶
func NewChunker(config ChunkConfig) iosystem.Processor
NewChunker creates a processor that splits documents into chunks. Returns multiple documents from a single input, one per chunk.
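The one-input, many-outputs behavior can be illustrated with a minimal, standard-library-only sketch of a paragraph split. This is not the package's implementation, which relies on the scanner library. The Document type, the splitParagraphs helper, and the blank-line delimiter are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Document is a hypothetical stand-in for iosystem.Document.
type Document struct {
	Path    string
	Content string
}

// splitParagraphs returns one Document per non-empty paragraph.
// A blank line is assumed to be the paragraph delimiter.
func splitParagraphs(doc Document) []Document {
	var out []Document
	for _, part := range strings.Split(doc.Content, "\n\n") {
		text := strings.TrimSpace(part)
		if text == "" {
			continue // skip empty fragments left by runs of blank lines
		}
		out = append(out, Document{Path: doc.Path, Content: text})
	}
	return out
}

func main() {
	doc := Document{
		Path:    "notes.txt",
		Content: "First paragraph.\n\nSecond paragraph.\n\n\n\nThird.",
	}
	for i, chunk := range splitParagraphs(doc) {
		fmt.Printf("%d: %q\n", i, chunk.Content)
	}
}
```

In this sketch every chunk keeps the path of its source document; how the real chunker names its output documents is not stated here.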
func NewCollector ¶ added in v0.2.0
NewCollector creates a processor that collects documents into an array.
func NewIdentity ¶
NewIdentity creates a processor that passes documents through unchanged.
Types ¶
type Agent ¶
type Agent struct {
// contains filtered or unexported fields
}
Agent wraps a blueprint to use as a pipeline processor. It processes documents through the agent's Prompt() method and returns the agent's response as a new document.
The processor:
- Reads document content as input
- Passes it to agent.Prompt()
- Returns agent response as new document
- Preserves document path with optional suffix
- Supports JSON format output from agents
Example:
agent := getAgentFromBlueprint()
proc := NewAgent(agent, &AgentConfig{
	Suffix: ".processed",
})
docs, err := proc.Process(ctx, []*iosystem.Document{inputDoc})
func NewAgent ¶
func NewAgent(w Worker, config *AgentConfig) *Agent
NewAgent creates a processor that wraps a blueprint Agent.
func (*Agent) Close ¶
Close releases resources. For Agent, this is a no-op because the agent lifecycle is managed externally.
func (*Agent) Process ¶
func (p *Agent) Process(ctx context.Context, docs []*iosystem.Document) ([]*iosystem.Document, error)
Process transforms a document by passing its content through the agent.
Input document content is read and passed to agent.Prompt() as:
- string if content is valid UTF-8
- []byte otherwise
- map[string]any if document has metadata that should be templated
The agent's response becomes the content of the output document. Output format depends on agent's configuration (text or JSON).
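The string-versus-bytes rule above can be sketched with the standard unicode/utf8 package. The promptInput helper is hypothetical and only illustrates the first two cases; the metadata-driven map[string]any case is left out because its trigger conditions are not described here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// promptInput is a hypothetical helper that picks the value handed to the
// agent: a string when the content is valid UTF-8, the raw bytes otherwise.
func promptInput(content []byte) any {
	if utf8.Valid(content) {
		return string(content)
	}
	return content
}

func main() {
	text := promptInput([]byte("hello"))
	binary := promptInput([]byte{0xff, 0xfe})

	_, isString := text.(string)
	_, isBytes := binary.([]byte)
	fmt.Println(isString, isBytes)
}
```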
type AgentConfig ¶
type AgentConfig struct {
Suffix string // Suffix to add to output document path (default: empty)
Options []chatter.Opt // Chatter options to pass to agent.Prompt() (temperature, max_tokens, etc.)
}
AgentConfig configures the agent processor.
type ChunkConfig ¶
type ChunkConfig struct {
Strategy string // "none", "sentence", "paragraph", "chunk", "tag"
ChunkSize int // Size for chunk strategy (default: 1024)
DelimiterChars string // Delimiter characters (defaults vary by strategy)
Buffer int
}
ChunkConfig configures the chunker processor.
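As a rough illustration of how ChunkSize and its default of 1024 might apply to the "chunk" strategy, the sketch below splits text into fixed-size pieces. The chunkBySize helper is hypothetical, and measuring size in runes rather than bytes is an assumption; the package may count differently.

```go
package main

import "fmt"

// defaultChunkSize mirrors the documented default for ChunkSize.
const defaultChunkSize = 1024

// chunkBySize is a hypothetical helper that splits text into pieces of at
// most size runes. A non-positive size falls back to defaultChunkSize.
func chunkBySize(text string, size int) []string {
	if size <= 0 {
		size = defaultChunkSize
	}
	runes := []rune(text) // split on runes so multi-byte characters stay intact
	var out []string
	for start := 0; start < len(runes); start += size {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		out = append(out, string(runes[start:end]))
	}
	return out
}

func main() {
	for _, c := range chunkBySize("abcdefghij", 4) {
		fmt.Println(c)
	}
}
```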
type Chunker ¶
type Chunker struct {
// contains filtered or unexported fields
}
Chunker splits documents into chunks based on a strategy. Integrates with the existing github.com/fogfish/scanner library.
type Collector ¶ added in v0.2.0
type Collector struct {
// contains filtered or unexported fields
}
Collector collects all input documents and emits them as an array on EOF. This processor enables batch processing mode via the --array CLI flag.
Behavior:
- Normal documents: collected in memory, returns empty slice
- EOF document: emits all collected documents as an array and returns them
- After EOF: collection resets for potential reuse
Memory consideration:
- Buffers ALL documents in memory until EOF
- Not suitable for very large document streams
- Use only with explicit --array flag
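The buffering behavior described above can be sketched as follows. This is a simplified model, not the package's code: the Document type and its EOF field are assumed stand-ins for however iosystem marks the end-of-stream signal, and the context argument is omitted for brevity.

```go
package main

import "fmt"

// Document is a hypothetical stand-in for iosystem.Document.
// The EOF field is an assumed way to mark the end-of-stream signal.
type Document struct {
	Path string
	EOF  bool
}

// collector buffers documents in memory until an EOF document arrives.
type collector struct {
	buffer []*Document
}

// Process buffers normal documents and returns an empty slice.
// When it meets an EOF document it returns everything collected so far
// and resets the buffer so the collector can be reused.
func (c *collector) Process(docs []*Document) []*Document {
	for _, d := range docs {
		if d.EOF {
			out := c.buffer
			c.buffer = nil
			return out
		}
		c.buffer = append(c.buffer, d)
	}
	return []*Document{}
}

func main() {
	c := &collector{}
	fmt.Println(len(c.Process([]*Document{{Path: "a"}})))
	fmt.Println(len(c.Process([]*Document{{Path: "b"}})))
	fmt.Println(len(c.Process([]*Document{{EOF: true}})))
}
```

The buffer grows with every input document, which is why the memory caveats above apply.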
func (*Collector) Process ¶ added in v0.2.0
func (p *Collector) Process(ctx context.Context, docs []*iosystem.Document) ([]*iosystem.Document, error)
Process collects documents or emits array on EOF signal.
- Normal documents: collected; returns an empty slice, which stops propagation until EOF
- EOF document: emits the collected []*Document array (monadic: passes documents directly)
type Identity ¶
type Identity struct{}
Identity is a pass-through processor that returns documents unchanged. Useful for testing and as a base for more complex processors.
type Prompter ¶
type Prompter struct {
// contains filtered or unexported fields
}
Prompter wraps an LLM as a pipeline processor.
func NewPrompter ¶
NewPrompter creates a processor that wraps a blueprint Prompter.
func (*Prompter) Close ¶
Close releases resources. For Prompter, this is a no-op because the lifecycle of the wrapped prompter is managed externally.
func (*Prompter) Process ¶
func (p *Prompter) Process(ctx context.Context, docs []*iosystem.Document) ([]*iosystem.Document, error)
Process transforms a document by passing its content through the agent.
Input document content is read and passed to agent.Prompt() as:
- string if content is valid UTF-8
- []byte otherwise
- map[string]any if document has metadata that should be templated
The agent's response becomes the content of the output document. Output format depends on agent's configuration (text or JSON).