Documentation ¶
Overview ¶
Package compaction provides conversation compaction (summarization) for chat sessions that approach their model's context window limit.
It is designed as a standalone component that can be used independently of the runtime loop. The package exposes:
- BuildPrompt: prepares a conversation for summarization by cloning it, zeroing per-message costs, and appending the compaction prompt.
- ShouldCompact: decides whether a session needs compaction based on token usage and context window limits.
- EstimateMessageTokens: a fast heuristic for estimating the token count of a single chat message.
- HasConversationMessages: checks whether a message list contains any non-system messages worth summarizing.
Constants ¶
This section is empty.
Variables ¶
var (
	//go:embed prompts/compaction-system.txt
	SystemPrompt string
)
Functions ¶
func BuildPrompt ¶
BuildPrompt prepares the messages for a summarization request. It clones the conversation (zeroing per-message costs so they aren't double-counted), then appends a user message containing the compaction prompt. If additionalPrompt is non-empty it is included as extra instructions.
Callers should first check HasConversationMessages to avoid sending an empty conversation to the model.
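The clone-then-append behavior described above can be sketched roughly as follows. This is an illustration only: the Message type with its Role, Content, and Cost fields, the lowercase buildPrompt helper, and the placeholder prompt text are all assumptions made for this example, not the package's actual types, signature, or prompt.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for the package's chat message type.
type Message struct {
	Role    string
	Content string
	Cost    float64
}

// compactionPrompt is placeholder text, not the package's real prompt.
const compactionPrompt = "Summarize the conversation so far."

// buildPrompt copies msgs, zeroes the cost on each copy, and appends a user
// message carrying the compaction prompt plus any extra instructions.
func buildPrompt(msgs []Message, additionalPrompt string) []Message {
	out := make([]Message, 0, len(msgs)+1)
	for _, m := range msgs {
		m.Cost = 0 // m is a copy, so the caller's slice is left untouched
		out = append(out, m)
	}
	prompt := compactionPrompt
	if additionalPrompt != "" {
		prompt += "\n\nAdditional instructions: " + additionalPrompt
	}
	return append(out, Message{Role: "user", Content: prompt})
}

func main() {
	history := []Message{
		{Role: "system", Content: "You are helpful."},
		{Role: "user", Content: "Hi", Cost: 0.002},
	}
	req := buildPrompt(history, "Keep file paths.")
	fmt.Println(len(req), req[1].Cost, history[1].Cost)
}
```

Working on copies keeps the original session's cost accounting intact while the summarization request itself carries no historical costs.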
func EstimateMessageTokens ¶
EstimateMessageTokens returns a rough token-count estimate for a single chat message based on its text length. The estimate is intentionally conservative (it overestimates) so that proactive compaction fires before the session reaches the context limit.
The estimate accounts for message content, multi-content text parts, reasoning content, tool call arguments, and a small per-message overhead for role/metadata tokens.
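A length-based heuristic of this kind might look like the sketch below. The source does not state the package's actual constants or field names, so the characters-per-token ratio, the per-message overhead, and the Message and ToolCall types here are illustrative assumptions.

```go
package main

import "fmt"

// ToolCall and Message are hypothetical stand-ins for the package's types.
type ToolCall struct {
	Arguments string
}

type Message struct {
	Content          string
	TextParts        []string // multi-content text parts
	ReasoningContent string
	ToolCalls        []ToolCall
}

const (
	charsPerToken      = 4 // assumed ratio, not taken from the package
	perMessageOverhead = 5 // assumed allowance for role/metadata tokens
)

// estimateMessageTokens sums the text length of every part of the message
// and converts it to a token estimate, rounding up.
func estimateMessageTokens(m Message) int64 {
	chars := len(m.Content) + len(m.ReasoningContent)
	for _, p := range m.TextParts {
		chars += len(p)
	}
	for _, tc := range m.ToolCalls {
		chars += len(tc.Arguments)
	}
	// Ceiling division: a partial token counts as a whole one.
	tokens := (chars + charsPerToken - 1) / charsPerToken
	return int64(tokens + perMessageOverhead)
}

func main() {
	m := Message{
		Content:   "What is in main.go?",
		ToolCalls: []ToolCall{{Arguments: `{"path":"main.go"}`}},
	}
	fmt.Println(estimateMessageTokens(m))
}
```

Rounding up and adding a fixed overhead both push the estimate high, which matches the stated goal of erring toward early compaction.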
func HasConversationMessages ¶
HasConversationMessages reports whether messages contains at least one non-system message. A session with only system prompts has no conversation to summarize.
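As a minimal sketch of this check, assuming a hypothetical Message type whose Role field uses the string "system" for system prompts (the package's real message type is not shown in the source):

```go
package main

import "fmt"

// Message is a hypothetical stand-in for the package's chat message type.
type Message struct {
	Role    string
	Content string
}

// hasConversationMessages reports whether msgs contains at least one
// message whose role is not "system".
func hasConversationMessages(msgs []Message) bool {
	for _, m := range msgs {
		if m.Role != "system" {
			return true
		}
	}
	return false
}

func main() {
	onlySystem := []Message{{Role: "system", Content: "You are helpful."}}
	withUser := append(onlySystem, Message{Role: "user", Content: "Hi"})
	fmt.Println(hasConversationMessages(onlySystem)) // false
	fmt.Println(hasConversationMessages(withUser))   // true
}
```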
func ShouldCompact ¶
ShouldCompact reports whether a session's context usage has crossed the compaction threshold. It returns true when the estimated total token count (input + output + addedTokens) exceeds contextThreshold (90%) of contextLimit. A non-positive contextLimit is treated as unlimited and always returns false.
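The threshold arithmetic can be illustrated with a small sketch. The parameter list below is an assumption, since the source does not show the function's signature; only the 90% threshold, the summed token counts, and the non-positive-limit rule come from the description above.

```go
package main

import "fmt"

// shouldCompact is a hypothetical re-implementation of the rule described
// above; the real function's parameters may differ.
func shouldCompact(inputTokens, outputTokens, addedTokens, contextLimit int64) bool {
	if contextLimit <= 0 {
		return false // a non-positive limit is treated as unlimited
	}
	total := inputTokens + outputTokens + addedTokens
	// Equivalent to total > 0.9*contextLimit, kept in integer arithmetic.
	return total*10 > contextLimit*9
}

func main() {
	// With a 200,000-token window the threshold is 180,000 tokens.
	fmt.Println(shouldCompact(170_000, 9_000, 0, 200_000))     // false (179,000)
	fmt.Println(shouldCompact(170_000, 9_000, 2_000, 200_000)) // true  (181,000)
	fmt.Println(shouldCompact(1_000_000, 0, 0, 0))             // false (unlimited)
}
```

Comparing scaled integers avoids floating-point rounding at the boundary; a total of exactly 90% does not trigger compaction because the description says "exceeds".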
Types ¶
type Result ¶
type Result struct {
// Summary is the generated summary text.
Summary string
// InputTokens is the token count reported by the summarization model,
// used as an approximation of the new context size after compaction.
InputTokens int64
// Cost is the cost of the summarization request in dollars.
Cost float64
}
Result holds the outcome of a compaction operation.
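One way a caller might consume a Result is sketched below. The Session type, its fields, and the applyResult helper are hypothetical and exist only for this example; the source says only that InputTokens approximates the new context size and that Cost is in dollars.

```go
package main

import "fmt"

// Result mirrors the struct documented above.
type Result struct {
	Summary     string
	InputTokens int64
	Cost        float64
}

// Session is a hypothetical caller-side type, not part of the package.
type Session struct {
	Messages      []string
	ContextTokens int64
	TotalCost     float64
}

// applyResult replaces the session history with the summary, resets the
// tracked context size to the reported token count, and accumulates cost.
func applyResult(s *Session, r Result) {
	s.Messages = []string{r.Summary}
	s.ContextTokens = r.InputTokens
	s.TotalCost += r.Cost
}

func main() {
	s := &Session{
		Messages:      []string{"msg 1", "msg 2", "msg 3"},
		ContextTokens: 185_000,
		TotalCost:     0.5,
	}
	applyResult(s, Result{Summary: "Summary of 3 messages.", InputTokens: 1_200, Cost: 0.25})
	fmt.Println(len(s.Messages), s.ContextTokens, s.TotalCost)
}
```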