Documentation
¶
Overview ¶
Package compaction provides conversation compaction for the agent loop. When the conversation history approaches the model's context window limit, compaction summarizes older messages and replaces them with a structured summary.
Index ¶
- Constants
- func EffectiveTokenCount(usage agent.TokenUsage) int
- func EstimateConversationTokens(messages []agent.Message) int
- func EstimateMessageTokens(msg agent.Message) int
- func EstimateTokens(s string) int
- func FindCutPoint(messages []agent.Message, keepRecentTokens int) int
- func InjectSummary(summary string) agent.Message
- func IsCompactionSummary(msg agent.Message) bool
- func NewCompactor(cfg Config) ...
- func SerializeConversation(messages []agent.Message, maxToolResultChars int) string
- func ShouldCompact(estimatedTokens, contextWindow, effectivePercent, reserveTokens int) bool
- func TruncateToolResult(s string, maxChars int) string
- type CompactionResult
- type Config
- type FileOps
Constants ¶
const DefaultContextWindow = 131072
DefaultContextWindow is used when no context window is configured. Most modern models support at least 128k; 8k caused premature compaction and stuck loops on models where LM Studio doesn't report context length.
const DefaultEffectivePercent = 95
DefaultEffectivePercent is the percentage of the context window actually used; the remaining 5% is a safety margin.
const DefaultKeepRecentTokens = 8192
DefaultKeepRecentTokens is how many tokens of recent messages to keep verbatim.
const DefaultMaxToolResultChars = 2000
DefaultMaxToolResultChars is the max chars per tool result in summarization input.
const DefaultReserveTokens = 8192
DefaultReserveTokens is the token budget reserved for the model response.
const DefaultStuckThreshold = 5
DefaultStuckThreshold is the number of consecutive compaction attempts that return without producing a compacted history before the compactor reports ErrCompactionStuck. Only attempts where ShouldCompact returns true count; normal no-ops (below threshold, already summarized) do not.
const DefaultUserMessageTailTokens = 20000
DefaultUserMessageTailTokens is the default token budget for re-including recent real user messages alongside the compaction summary.
const InitialSummarizationPrompt = `` /* 723-byte string literal not displayed */
InitialSummarizationPrompt is the user-side prompt for first-time compaction.
const NoSummaryFallback = "(no summary available)"
NoSummaryFallback is used when the LLM returns an empty summary.
const SummarizationSystemPrompt = `` /* 317-byte string literal not displayed */
SummarizationSystemPrompt is the system prompt for the summarization call; it prevents the model from continuing the conversation rather than summarizing it.
const SummaryInjectionPrefix = "The conversation history before this point was compacted into the following summary:\n\n<summary>\n"
SummaryInjectionPrefix is prepended to the summary when injecting into conversation.
const SummaryInjectionSuffix = "\n</summary>"
SummaryInjectionSuffix closes the summary injection.
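InjectSummary and IsCompactionSummary presumably build on these two markers. A minimal sketch of that composition, using lower-case stand-ins that operate on plain strings (the real functions work on agent.Message, whose role handling is omitted here):

```go
package main

import (
	"fmt"
	"strings"
)

const (
	summaryInjectionPrefix = "The conversation history before this point was compacted into the following summary:\n\n<summary>\n"
	summaryInjectionSuffix = "\n</summary>"
)

// injectSummary mirrors what InjectSummary is documented to do:
// wrap the summary text in the injection markers.
func injectSummary(summary string) string {
	return summaryInjectionPrefix + summary + summaryInjectionSuffix
}

// isCompactionSummary mirrors IsCompactionSummary under the assumption
// that detection is a prefix check on the message content.
func isCompactionSummary(content string) bool {
	return strings.HasPrefix(content, summaryInjectionPrefix)
}

func main() {
	msg := injectSummary("User asked to refactor the parser; tests now pass.")
	fmt.Println(isCompactionSummary(msg)) // true
}
```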
const UpdateSummarizationPrompt = `` /* 454-byte string literal not displayed */
UpdateSummarizationPrompt is the user-side prompt when merging with a previous summary.
Variables ¶
This section is empty.
Functions ¶
func EffectiveTokenCount ¶
func EffectiveTokenCount(usage agent.TokenUsage) int
EffectiveTokenCount computes the effective context consumption from provider-reported token usage. Includes all four components since they all contribute to context window consumption.
func EstimateConversationTokens ¶
func EstimateConversationTokens(messages []agent.Message) int
EstimateConversationTokens estimates the total tokens for a slice of messages.
func EstimateMessageTokens ¶
func EstimateMessageTokens(msg agent.Message) int
EstimateMessageTokens estimates the token count for a single message, including role, content, tool calls, and tool call arguments.
func EstimateTokens ¶
func EstimateTokens(s string) int
EstimateTokens estimates the token count for a string using chars/4.
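A runnable sketch of the chars/4 heuristic; the round-up behavior for short strings is an assumption, the package may round differently:

```go
package main

import "fmt"

// estimateTokens approximates token count as len(s)/4, rounding up so
// a short non-empty string still counts as at least one token (assumed).
func estimateTokens(s string) int {
	return (len(s) + 3) / 4
}

func main() {
	fmt.Println(estimateTokens("hello world")) // 11 chars -> 3 tokens
}
```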
func FindCutPoint ¶
func FindCutPoint(messages []agent.Message, keepRecentTokens int) int
FindCutPoint walks backwards from the end of messages, accumulating token estimates, and returns the index of the first message to keep. Messages before this index will be summarized. The cut always falls at a valid turn boundary, never between a tool call and its result.
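A sketch of that backward walk, under an assumed minimal message shape; the real boundary logic inspects tool-call structure on agent.Message rather than a bare role string:

```go
package main

import "fmt"

// message is a stand-in for agent.Message; only the fields the cut
// logic needs are modeled here (assumed shape).
type message struct {
	Role   string // "user", "assistant", or "tool"
	Tokens int    // precomputed token estimate for this message
}

// findCutPoint walks backwards accumulating token estimates and returns
// the index of the first message to keep. If the cut would land on a
// tool-result message, it is moved back to include the assistant message
// that issued the call, so a call is never separated from its result.
func findCutPoint(msgs []message, keepRecentTokens int) int {
	total := 0
	cut := len(msgs)
	for i := len(msgs) - 1; i >= 0; i-- {
		total += msgs[i].Tokens
		if total > keepRecentTokens {
			break
		}
		cut = i
	}
	// Back up over tool results so the kept tail starts at a turn boundary.
	for cut > 0 && msgs[cut].Role == "tool" {
		cut--
	}
	return cut
}

func main() {
	msgs := []message{
		{"user", 100}, {"assistant", 200}, {"tool", 50}, {"assistant", 80}, {"user", 40},
	}
	fmt.Println(findCutPoint(msgs, 150)) // 3: keeps the last two messages
}
```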
func InjectSummary ¶
func InjectSummary(summary string) agent.Message
InjectSummary creates a user message containing the compaction summary.
func IsCompactionSummary ¶
func IsCompactionSummary(msg agent.Message) bool
IsCompactionSummary checks if a message is a compaction summary injection.
func NewCompactor ¶
func NewCompactor(cfg Config) func(ctx context.Context, messages []agent.Message, provider agent.Provider, toolCalls []agent.ToolCallLog) ([]agent.Message, *agent.CompactionResult, error)
NewCompactor creates a Compactor function suitable for agent.Request.Compactor. It uses the provided config to determine when and how to compact.
func SerializeConversation ¶
func SerializeConversation(messages []agent.Message, maxToolResultChars int) string
SerializeConversation renders a message history as compact text suitable for the summarization LLM. Tool calls are rendered inline as [Assistant → toolName(args)]: result. Tool results are truncated to maxChars.
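A sketch of that rendering, under assumed stand-in types; the role labels and the truncation marker here are assumptions, only the [Assistant → toolName(args)]: result shape is documented:

```go
package main

import (
	"fmt"
	"strings"
)

// toolCall and message are minimal stand-ins for the agent types.
type toolCall struct {
	Name, Args, Result string
}

type message struct {
	Role    string
	Content string
	Calls   []toolCall
}

// serializeConversation renders each message on its own line and each
// tool call inline as "[Assistant → name(args)]: result", truncating
// results to maxChars.
func serializeConversation(msgs []message, maxChars int) string {
	var b strings.Builder
	for _, m := range msgs {
		if m.Content != "" {
			fmt.Fprintf(&b, "[%s]: %s\n", m.Role, m.Content)
		}
		for _, c := range m.Calls {
			result := c.Result
			if len(result) > maxChars {
				result = result[:maxChars] + "..."
			}
			fmt.Fprintf(&b, "[Assistant → %s(%s)]: %s\n", c.Name, c.Args, result)
		}
	}
	return b.String()
}

func main() {
	msgs := []message{
		{Role: "user", Content: "read the config file"},
		{Role: "assistant", Calls: []toolCall{{"read_file", "path=config.go", "package main\n\nfunc ..."}}},
	}
	fmt.Print(serializeConversation(msgs, 12))
}
```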
func ShouldCompact ¶
func ShouldCompact(estimatedTokens, contextWindow, effectivePercent, reserveTokens int) bool
ShouldCompact returns true if the conversation should be compacted. effectiveWindow = contextWindow * effectivePercent / 100.
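A sketch of the decision with the package defaults. Only the effectiveWindow formula is documented; the exact comparison (estimated usage plus the response reserve against the effective window) is an assumption:

```go
package main

import "fmt"

// shouldCompact compacts once estimated usage plus the response reserve
// no longer fits in the effective window (assumed comparison).
func shouldCompact(estimatedTokens, contextWindow, effectivePercent, reserveTokens int) bool {
	effectiveWindow := contextWindow * effectivePercent / 100
	return estimatedTokens+reserveTokens > effectiveWindow
}

func main() {
	// With the package defaults: window 131072, 95%, reserve 8192,
	// the effective window is 124518 tokens.
	fmt.Println(shouldCompact(100000, 131072, 95, 8192)) // false: 108192 <= 124518
	fmt.Println(shouldCompact(120000, 131072, 95, 8192)) // true: 128192 > 124518
}
```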
func TruncateToolResult ¶
func TruncateToolResult(s string, maxChars int) string
TruncateToolResult truncates a tool result string to maxChars, appending a truncation marker if shortened.
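A sketch of the truncation; the marker text here is an assumption, not the package's actual marker:

```go
package main

import "fmt"

// truncateToolResult keeps the first maxChars characters and appends a
// marker (assumed wording) when the input was shortened.
func truncateToolResult(s string, maxChars int) string {
	if len(s) <= maxChars {
		return s
	}
	return s[:maxChars] + "\n[...truncated]"
}

func main() {
	fmt.Println(truncateToolResult("short result", 2000)) // returned unchanged
}
```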
Types ¶
type CompactionResult ¶
type CompactionResult struct {
Summary string // the generated summary text
FileOps *FileOps // accumulated file operations
TokensBefore int // estimated tokens before compaction
TokensAfter int // estimated tokens after compaction
CutIndex int // index of first kept message
MessagesKept int // number of messages preserved verbatim
Warning string // degradation warning, if any
}
CompactionResult holds the output of a compaction pass.
func CompactMessages ¶
func CompactMessages(
	ctx context.Context,
	provider agent.Provider,
	messages []agent.Message,
	toolCalls []agent.ToolCallLog,
	previousSummary string,
	previousFileOps *FileOps,
	cfg Config,
) ([]agent.Message, *CompactionResult, error)
CompactMessages performs a full compaction pass on a message history. Returns the new message list with older messages replaced by a summary.
type Config ¶
type Config struct {
// Enabled controls whether automatic compaction runs. Default: true.
Enabled bool
// ContextWindow is the model's context window in tokens.
ContextWindow int
// ReserveTokens is the budget reserved for the model response.
ReserveTokens int
// KeepRecentTokens is how many tokens of recent messages to keep verbatim.
KeepRecentTokens int
// MaxToolResultChars is the max chars per tool result in summarization input.
MaxToolResultChars int
// EffectivePercent is the percentage of ContextWindow to actually use (0-100).
EffectivePercent int
// SummarizationModel overrides the model used for summarization.
SummarizationModel string
// SummarizationProvider overrides the provider for summarization.
SummarizationProvider agent.Provider
// SummarizationFocus is optional text appended to the summarization prompt.
SummarizationFocus string
// UserMessageTailTokens is the token budget for re-including recent real
// user messages after compaction alongside the summary. This gives the model
// actual request context rather than only the summary.
// Default: 20000. Zero means disabled.
UserMessageTailTokens int
// StuckThreshold is the number of consecutive compaction attempts (where
// ShouldCompact returns true) that fail to produce a compacted history
// before the compactor returns agent.ErrCompactionStuck. Zero uses
// DefaultStuckThreshold.
StuckThreshold int
}
Config configures automatic conversation compaction.
func DefaultConfig ¶
func DefaultConfig() Config
DefaultConfig returns a Config with sensible defaults.
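A sketch of what DefaultConfig presumably returns, assembled from the documented constants and the documented Enabled default; the lower-case config type here stands in for the exported Config:

```go
package main

import "fmt"

// Default values as documented in the package constants.
const (
	defaultContextWindow         = 131072
	defaultEffectivePercent      = 95
	defaultKeepRecentTokens      = 8192
	defaultMaxToolResultChars    = 2000
	defaultReserveTokens         = 8192
	defaultUserMessageTailTokens = 20000
	defaultStuckThreshold        = 5
)

// config mirrors the exported Config fields that have documented defaults.
type config struct {
	Enabled               bool
	ContextWindow         int
	ReserveTokens         int
	KeepRecentTokens      int
	MaxToolResultChars    int
	EffectivePercent      int
	UserMessageTailTokens int
	StuckThreshold        int
}

// defaultConfig assembles the documented defaults.
func defaultConfig() config {
	return config{
		Enabled:               true,
		ContextWindow:         defaultContextWindow,
		ReserveTokens:         defaultReserveTokens,
		KeepRecentTokens:      defaultKeepRecentTokens,
		MaxToolResultChars:    defaultMaxToolResultChars,
		EffectivePercent:      defaultEffectivePercent,
		UserMessageTailTokens: defaultUserMessageTailTokens,
		StuckThreshold:        defaultStuckThreshold,
	}
}

func main() {
	cfg := defaultConfig()
	fmt.Println(cfg.ContextWindow, cfg.EffectivePercent) // 131072 95
}
```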
type FileOps ¶
FileOps tracks files read and modified during a conversation.
func ExtractFileOps ¶
func ExtractFileOps(toolCalls []agent.ToolCallLog) *FileOps
ExtractFileOps scans tool call logs for file operations.
func Summarize ¶
func Summarize(
	ctx context.Context,
	provider agent.Provider,
	messages []agent.Message,
	toolCalls []agent.ToolCallLog,
	previousSummary string,
	cfg Config,
	maxTokens int,
) (string, *FileOps, error)
Summarize runs the summarization LLM call and returns the summary text. It serializes the messages, builds the prompt, calls the provider, and appends file tracking XML.