Documentation ¶
Overview ¶
Package compacter provides middleware that automatically compacts conversation history when a token limit error is detected. It uses an LLM to summarize older messages.
Constants ¶
This section is empty.
Variables ¶
var DefaultSummaryPrompt = `` /* 272-byte string literal not displayed */
DefaultSummaryPrompt is the default prompt used for summarizing conversation history
Functions ¶
func NewContentBlockMiddleware ¶
func NewContentBlockMiddleware(llmClient gollem.LLMClient, options ...Option) gollem.ContentBlockMiddleware
NewContentBlockMiddleware creates a middleware that automatically compacts history using an LLM when ErrTagTokenExceeded is detected.
func NewContentStreamMiddleware ¶
func NewContentStreamMiddleware(llmClient gollem.LLMClient, options ...Option) gollem.ContentStreamMiddleware
NewContentStreamMiddleware creates a streaming middleware that automatically compacts history using an LLM when ErrTagTokenExceeded is detected.
Types ¶
type CompactionEvent ¶
type CompactionEvent struct {
	OriginalDataSize  int    // Total character count before compaction
	CompactedDataSize int    // Total character count after compaction (summary + remaining)
	InputTokens       int    // LLM input tokens used for summarization
	OutputTokens      int    // LLM output tokens generated for summary
	Summary           string // The generated summary text
	Attempt           int    // Retry attempt number (1-based)
}
CompactionEvent contains information about a compaction event.
The compaction process selects messages from the beginning of the conversation history based on the configured compact ratio (default 70%). These messages are summarized using an LLM and replaced with a single summary message.
Data sizes represent character counts:
- OriginalDataSize: Total character count of all original messages
- CompactedDataSize: Total character count after compaction (summary + remaining messages)
Token usage represents the actual LLM token consumption during summarization:
- InputTokens: Number of tokens sent to LLM for summarization
- OutputTokens: Number of tokens generated by LLM for the summary
The compact ratio determines what share of the original data size is compacted. For example, with a 70% ratio and 1000 total characters, the messages at the beginning of the history that account for the first 700 characters are summarized into a single message.
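The ratio arithmetic described above can be sketched as follows. This is a minimal illustration, not the package's actual selection logic: `splitByRatio` is a hypothetical helper, messages are modeled as plain strings, and whole messages are taken from the start until their combined character count reaches the target.

```go
package main

import (
	"fmt"
	"strings"
)

// splitByRatio returns how many leading messages would be summarized so that
// their combined character count reaches ratio * (total character count).
// Hypothetical helper for illustration only.
func splitByRatio(messages []string, ratio float64) int {
	total := 0
	for _, m := range messages {
		total += len([]rune(m))
	}
	target := int(float64(total) * ratio)

	count, sum := 0, 0
	for _, m := range messages {
		if sum >= target {
			break
		}
		sum += len([]rune(m))
		count++
	}
	return count
}

func main() {
	// Ten messages of 100 characters each: 1000 characters in total.
	messages := make([]string, 10)
	for i := range messages {
		messages[i] = strings.Repeat("x", 100)
	}
	n := splitByRatio(messages, 0.7)
	fmt.Printf("summarize the first %d of %d messages\n", n, len(messages))
}
```

Because only whole messages are selected in this sketch, the summarized portion can overshoot the target when a message straddles the boundary.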
type CompactionHook ¶
type CompactionHook func(ctx context.Context, event *CompactionEvent)
CompactionHook is a function called when compaction occurs
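As a sketch of what a hook might do with the event fields, the example below logs how much the history shrank. To stay self-contained it redeclares `CompactionEvent` and `CompactionHook` locally as mirrors of the documented types; `reductionPercent` is a hypothetical helper and not part of the package.

```go
package main

import (
	"context"
	"fmt"
)

// CompactionEvent mirrors the documented struct so this sketch compiles on
// its own; real code would use the package's type instead.
type CompactionEvent struct {
	OriginalDataSize  int
	CompactedDataSize int
	InputTokens       int
	OutputTokens      int
	Summary           string
	Attempt           int
}

// CompactionHook mirrors the documented hook signature.
type CompactionHook func(ctx context.Context, event *CompactionEvent)

// reductionPercent is a hypothetical helper that reports how much smaller
// the history became, as a percentage of the original character count.
func reductionPercent(e *CompactionEvent) float64 {
	if e == nil || e.OriginalDataSize <= 0 {
		return 0
	}
	saved := e.OriginalDataSize - e.CompactedDataSize
	return float64(saved) * 100 / float64(e.OriginalDataSize)
}

func main() {
	var hook CompactionHook = func(ctx context.Context, e *CompactionEvent) {
		fmt.Printf("attempt %d: %d -> %d chars (%.1f%% smaller), tokens in/out: %d/%d\n",
			e.Attempt, e.OriginalDataSize, e.CompactedDataSize,
			reductionPercent(e), e.InputTokens, e.OutputTokens)
	}

	// Invoke the hook directly with made-up numbers to show its output shape.
	hook(context.Background(), &CompactionEvent{
		OriginalDataSize:  1000,
		CompactedDataSize: 400,
		InputTokens:       250,
		OutputTokens:      40,
		Summary:           "example summary",
		Attempt:           1,
	})
}
```

With the real package, a function of this shape would be passed to WithCompactionHook.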
type Option ¶
type Option func(*config)
Option is a configuration option for the compacter middleware
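The `func(*config)` shape is the functional options pattern. The sketch below shows how such options are typically applied over defaults. The `config` fields, the lowercase `withCompactRatio` and `withMaxRetries` constructors, and `newConfig` are all assumptions made for illustration; the package's unexported `config` struct is not shown in this documentation. Only the default values (0.7 and 3) come from the text.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the package's unexported struct.
type config struct {
	compactRatio float64
	maxRetries   int
}

// Option has the same shape as the documented type.
type Option func(*config)

// withCompactRatio is a hypothetical option constructor.
func withCompactRatio(ratio float64) Option {
	return func(c *config) { c.compactRatio = ratio }
}

// withMaxRetries is a hypothetical option constructor.
func withMaxRetries(n int) Option {
	return func(c *config) { c.maxRetries = n }
}

// newConfig starts from the documented defaults and applies each option in
// the order given, so later options override earlier ones.
func newConfig(options ...Option) *config {
	cfg := &config{compactRatio: 0.7, maxRetries: 3}
	for _, opt := range options {
		opt(cfg)
	}
	return cfg
}

func main() {
	cfg := newConfig(withCompactRatio(0.5), withMaxRetries(5))
	fmt.Printf("ratio=%.1f retries=%d\n", cfg.compactRatio, cfg.maxRetries)
}
```

The pattern lets callers pass only the settings they want to change, which is why both constructor functions accept a variadic `options ...Option` parameter.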
func WithCompactRatio ¶
WithCompactRatio sets the ratio of history to compact (default: 0.7). The ratio should be between 0.0 and 1.0.
func WithCompactionHook ¶
func WithCompactionHook(hook CompactionHook) Option
WithCompactionHook sets a callback function that is called when compaction occurs
func WithLogger ¶
WithLogger sets the logger for compaction events
func WithMaxRetries ¶
WithMaxRetries sets the maximum number of retry attempts (default: 3)
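One way to picture the retry limit is a compact-and-retry loop. The sketch below is an assumption about the general shape, not the middleware's implementation: `errTokenExceeded` is a hypothetical sentinel standing in for the ErrTagTokenExceeded condition, and `callWithCompaction`, `send`, and `compact` are illustrative names.

```go
package main

import (
	"errors"
	"fmt"
)

// errTokenExceeded is a hypothetical sentinel standing in for the
// token-limit condition that the real package detects.
var errTokenExceeded = errors.New("token limit exceeded")

// callWithCompaction calls send; while send reports a token-limit error, it
// compacts the history and tries again, at most maxRetries times. It returns
// the number of compaction attempts made (1-based, 0 if none were needed).
func callWithCompaction(
	history []string,
	maxRetries int,
	send func([]string) error,
	compact func(history []string, attempt int) []string,
) (int, error) {
	err := send(history)
	attempt := 0
	for errors.Is(err, errTokenExceeded) && attempt < maxRetries {
		attempt++
		history = compact(history, attempt)
		err = send(history)
	}
	return attempt, err
}

func main() {
	history := []string{"m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"}

	// Pretend the model rejects any history longer than two messages.
	send := func(h []string) error {
		if len(h) > 2 {
			return errTokenExceeded
		}
		return nil
	}

	// Replace the older half of the history with a single summary message.
	compact := func(h []string, attempt int) []string {
		keep := len(h) / 2
		fmt.Printf("attempt %d: keeping last %d of %d messages\n", attempt, keep, len(h))
		return append([]string{"summary"}, h[len(h)-keep:]...)
	}

	attempts, err := callWithCompaction(history, 3, send, compact)
	fmt.Println("compactions:", attempts, "error:", err)
}
```

If the limit is reached and the call still fails, this sketch returns the last error to the caller rather than retrying indefinitely.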
func WithSummaryPrompt ¶
WithSummaryPrompt sets a custom prompt for summarization. Use a %s placeholder where the conversation history text should be inserted.
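A custom prompt with the %s placeholder might look like the sketch below. It assumes the template is expanded in the style of `fmt.Sprintf`, which this documentation does not confirm; `customPrompt` and `buildPrompt` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// customPrompt is a hypothetical template; %s marks where the conversation
// history text is inserted.
const customPrompt = "Summarize the conversation below in under 200 words, " +
	"keeping decisions and open questions:\n\n%s"

// buildPrompt expands the template with the history text. This assumes
// fmt.Sprintf-style substitution, which is a guess about the package.
func buildPrompt(template, history string) string {
	return fmt.Sprintf(template, history)
}

func main() {
	history := "user: What is the deploy schedule?\nassistant: Every Tuesday."
	fmt.Println(buildPrompt(customPrompt, history))
}
```

If substitution does follow `fmt` rules, a literal percent sign in a custom prompt would need to be written as `%%`.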