Documentation ¶
Overview ¶
Package context provides context window management for LLM conversations. It includes token counting, model limits, and automatic context compaction.
Index ¶
- Variables
- func EstimateConversationTokens(messages []provider.Message) int
- func EstimateMessageTokens(msg provider.Message) int
- func EstimateTokens(text string) int
- func GetModelLimit(model string) int
- func SummarizeOldMessages(messages []provider.Message, keepRecent int) string
- type CompactionResult
- type Manager
- func (m *Manager) GetStatus(messages []provider.Message) Status
- func (m *Manager) ShouldCompact(messages []provider.Message) bool
- func (m *Manager) SimpleCompact(messages []provider.Message, keepCount int) ([]provider.Message, CompactionResult)
- func (m *Manager) WithCompactThreshold(threshold float64) *Manager
- func (m *Manager) WithWarningThreshold(threshold float64) *Manager
- type Status
- type TokenBudget
Constants ¶
This section is empty.
Variables ¶
var ModelLimits = map[string]int{
	"mistral-large-latest": 128000,
	"mistral-large": 128000,
	"mistral-medium-latest": 32000,
	"mistral-medium": 32000,
	"mistral-small-latest": 32000,
	"mistral-small": 32000,
	"codestral-latest": 32000,
	"codestral": 32000,
	"open-mistral-7b": 32000,
	"open-mixtral-8x7b": 32000,
	"open-mixtral-8x22b": 64000,
	"llama3.2": 128000,
	"llama3.1": 128000,
	"llama3": 8000,
	"llama2": 4096,
	"codellama": 16000,
	"mistral": 32000,
	"mixtral": 32000,
	"deepseek-coder": 16000,
	"qwen2.5-coder": 32000,
	"gpt-4o": 128000,
	"gpt-4o-mini": 128000,
	"gpt-4-turbo": 128000,
	"gpt-4": 8192,
	"gpt-3.5-turbo": 16385,
	"gpt-3.5-turbo-16k": 16385,
	"claude-3-opus": 200000,
	"claude-3-sonnet": 200000,
	"claude-3-haiku": 200000,
	"claude-2.1": 200000,
	"claude-2": 100000,
	"default": 32000,
}
ModelLimits maps model names to their context window limits, in tokens.
Functions ¶
func EstimateConversationTokens ¶
func EstimateConversationTokens(messages []provider.Message) int
EstimateConversationTokens estimates total tokens in a conversation.
func EstimateMessageTokens ¶
func EstimateMessageTokens(msg provider.Message) int
EstimateMessageTokens estimates tokens for a single message. Includes overhead for message structure (~4 tokens per message).
func EstimateTokens ¶
func EstimateTokens(text string) int
EstimateTokens provides a rough estimate of the token count for text. It uses a simple heuristic of about 4 characters per token on average. The result is approximate; the actual token count varies by model and content.
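The three estimators above can be sketched as follows. This is a minimal illustration, not the package's implementation: the message type is a hypothetical stand-in for provider.Message (whose fields are not shown here), and counting runes and rounding up are assumptions about how the "about 4 characters per token" rule might be applied.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// message is a hypothetical stand-in for provider.Message.
type message struct {
	Role    string
	Content string
}

// perMessageOverhead approximates the structural cost of one message (~4 tokens).
const perMessageOverhead = 4

// estimateTokens applies the ~4 characters per token heuristic.
// Rounding up (an assumption) makes any non-empty text cost at least one token.
func estimateTokens(text string) int {
	return (utf8.RuneCountInString(text) + 3) / 4
}

// estimateMessageTokens adds the fixed per-message overhead to the text estimate.
func estimateMessageTokens(m message) int {
	return estimateTokens(m.Content) + perMessageOverhead
}

// estimateConversationTokens sums the per-message estimates.
func estimateConversationTokens(msgs []message) int {
	total := 0
	for _, m := range msgs {
		total += estimateMessageTokens(m)
	}
	return total
}

func main() {
	msgs := []message{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "Hello!"},
	}
	fmt.Println("text estimate:", estimateTokens("Hello, world!"))
	fmt.Println("conversation estimate:", estimateConversationTokens(msgs))
}
```

Because the overhead is charged per message, a conversation made of many short messages is estimated as more expensive than the same text sent in one message.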
func GetModelLimit ¶
func GetModelLimit(model string) int
GetModelLimit returns the context window limit for a model. It returns the default limit if the model is not found.
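A lookup with a fallback like the one described above might look like this. The sketch uses a small excerpt of the ModelLimits table; exact-match lookup and reading the fallback from the "default" entry are assumptions, since the documentation does not say how names are matched.

```go
package main

import "fmt"

// limits is a short excerpt of the ModelLimits table.
var limits = map[string]int{
	"gpt-4o":        128000,
	"claude-3-opus": 200000,
	"llama2":        4096,
	"default":       32000,
}

// modelLimit returns the limit for model, falling back to the
// "default" entry when the name is not in the table (assumed behavior).
func modelLimit(model string) int {
	if limit, ok := limits[model]; ok {
		return limit
	}
	return limits["default"]
}

func main() {
	fmt.Println(modelLimit("llama2"))
	fmt.Println(modelLimit("some-unlisted-model"))
}
```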
Types ¶
type CompactionResult ¶
type CompactionResult struct {
	OriginalTokens int
	CompactedTokens int
	MessagesRemoved int
	Summary string
}
CompactionResult contains the result of a compaction operation.
type Manager ¶
type Manager struct {
	// contains filtered or unexported fields
}
Manager handles context window management.
func NewManager ¶
NewManager creates a new context manager with the specified limits.
func NewManagerForModel ¶
NewManagerForModel creates a manager with limits based on model name.
func (*Manager) ShouldCompact ¶
func (m *Manager) ShouldCompact(messages []provider.Message) bool
ShouldCompact returns true if the conversation should be compacted.
func (*Manager) SimpleCompact ¶
func (m *Manager) SimpleCompact(messages []provider.Message, keepCount int) ([]provider.Message, CompactionResult)
SimpleCompact compacts a conversation by dropping older messages. It preserves the system prompt and keeps the keepCount most recent messages.
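The strategy described above can be sketched as below. The message type is a hypothetical stand-in for provider.Message, and treating a leading message with the role "system" as the system prompt is an assumption. The sketch returns only the number of removed messages rather than a full CompactionResult.

```go
package main

import "fmt"

// message is a hypothetical stand-in for provider.Message.
type message struct {
	Role    string
	Content string
}

// simpleCompact keeps a leading system message (if present) plus the
// last keepCount messages, and reports how many messages were dropped.
func simpleCompact(msgs []message, keepCount int) ([]message, int) {
	if keepCount < 0 {
		keepCount = 0
	}
	var head []message
	rest := msgs
	if len(msgs) > 0 && msgs[0].Role == "system" {
		head, rest = msgs[:1], msgs[1:]
	}
	if len(rest) <= keepCount {
		return msgs, 0 // nothing to drop
	}
	removed := len(rest) - keepCount
	out := make([]message, 0, len(head)+keepCount)
	out = append(out, head...)
	out = append(out, rest[removed:]...)
	return out, removed
}

func main() {
	msgs := []message{
		{Role: "system", Content: "Be concise."},
		{Role: "user", Content: "first question"},
		{Role: "assistant", Content: "first answer"},
		{Role: "user", Content: "second question"},
		{Role: "assistant", Content: "second answer"},
	}
	kept, removed := simpleCompact(msgs, 2)
	fmt.Println("kept:", len(kept), "removed:", removed)
}
```

Copying into a fresh slice keeps the caller's original conversation unmodified, which makes it possible to compare token counts before and after compaction.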
func (*Manager) WithCompactThreshold ¶
func (m *Manager) WithCompactThreshold(threshold float64) *Manager
WithCompactThreshold sets the threshold for triggering compaction.
func (*Manager) WithWarningThreshold ¶
func (m *Manager) WithWarningThreshold(threshold float64) *Manager
WithWarningThreshold sets the threshold for warning about context usage.
type Status ¶
type Status struct {
	EstimatedTokens int
	MaxTokens int
	UsagePercent float64
	NeedsCompaction bool
	WarningThreshold bool
}
Status represents the current context window status.
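One plausible way to populate these fields from a token estimate is sketched below. The helper computeStatus is hypothetical, and treating thresholds as fractions of the maximum (for example 0.9) is an assumption; the package may express them differently.

```go
package main

import "fmt"

// status mirrors the fields of the Status type.
type status struct {
	EstimatedTokens  int
	MaxTokens        int
	UsagePercent     float64
	NeedsCompaction  bool
	WarningThreshold bool
}

// computeStatus is a hypothetical helper. warnAt and compactAt are
// fractions of maxTokens (an assumption about the threshold units).
func computeStatus(estimated, maxTokens int, warnAt, compactAt float64) status {
	s := status{EstimatedTokens: estimated, MaxTokens: maxTokens}
	if maxTokens <= 0 {
		return s // avoid dividing by zero
	}
	usage := float64(estimated) / float64(maxTokens)
	s.UsagePercent = usage * 100
	s.WarningThreshold = usage >= warnAt
	s.NeedsCompaction = usage >= compactAt
	return s
}

func main() {
	s := computeStatus(27200, 32000, 0.7, 0.9)
	fmt.Printf("%.1f%% warning=%v compact=%v\n",
		s.UsagePercent, s.WarningThreshold, s.NeedsCompaction)
}
```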
type TokenBudget ¶
type TokenBudget struct {
	Total int
	System int
	History int
	CurrentTurn int
	Reserved int // For response generation
}
TokenBudget helps allocate tokens across different parts of the context.
func NewTokenBudget ¶
func NewTokenBudget(maxTokens int) TokenBudget
NewTokenBudget creates a token budget based on max context size.
func (TokenBudget) RemainingHistory ¶
func (b TokenBudget) RemainingHistory(used int) int
RemainingHistory returns how many tokens are left for history.
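The budgeting idea can be illustrated with the sketch below. The split used here (10% system, 20% current turn, 20% reserved, remainder for history) is purely hypothetical, since the documentation does not state the actual shares, and clamping the remaining history at zero is likewise an assumption.

```go
package main

import "fmt"

// tokenBudget mirrors the fields of the TokenBudget type.
type tokenBudget struct {
	Total       int
	System      int
	History     int
	CurrentTurn int
	Reserved    int // for response generation
}

// newTokenBudget splits maxTokens using illustrative, hypothetical shares.
func newTokenBudget(maxTokens int) tokenBudget {
	b := tokenBudget{
		Total:       maxTokens,
		System:      maxTokens / 10,
		CurrentTurn: maxTokens / 5,
		Reserved:    maxTokens / 5,
	}
	// History receives whatever the other parts do not claim.
	b.History = maxTokens - b.System - b.CurrentTurn - b.Reserved
	return b
}

// remainingHistory reports how much of the history share is unused,
// never going below zero (assumed behavior).
func (b tokenBudget) remainingHistory(used int) int {
	if left := b.History - used; left > 0 {
		return left
	}
	return 0
}

func main() {
	b := newTokenBudget(32000)
	fmt.Println("history share:", b.History)
	fmt.Println("remaining after 10000 used:", b.remainingHistory(10000))
}
```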