Documentation ¶
Overview ¶
Package filter implements the 31-layer token compression pipeline.
Organization ¶
This package contains 85 source files organized by concern:
Core Pipeline (pipeline_*.go, manager.go)
Pipeline coordinator, config types, layer initialization, and execution.
Filter Interface (filter.go)
Shared types: Mode, Filter, Engine, Language detection.
Layers 1-10: Core Compression (entropy.go, perplexity.go, goal_driven.go, etc.)
Research-backed token reduction from 120+ papers.
Layers 11-20: Semantic Filters (compaction.go, attribution.go, h2o.go, etc.)
Context-aware compression for conversations and complex output.
Layers 21-27: Research Filters (swezze.go, mixed_dim.go, beaver.go, etc.)
Latest research from arXiv 2025-2026.
Adaptive Layers (adaptive.go, density_adaptive.go, dynamic_ratio.go, etc.)
Self-tuning compression based on content characteristics.
Utility Filters (ansi.go, noise.go, dedup.go, brace_depth.go, etc.)
Formatting, deduplication, and structural filters.
Infrastructure (lru_cache.go, semantic_cache.go, streaming.go, session.go, etc.)
Caching, streaming, session management for pipeline support.
Usage ¶
For the full 31-layer pipeline:
cfg := filter.PipelineConfig{EnableEntropy: true, ...}
pipeline := filter.NewPipelineCoordinator(cfg)
output, stats := pipeline.Process(input)
For lightweight filtering (ANSI, comments, imports):
engine := filter.NewEngine(filter.ModeMinimal)
output, saved := engine.Process(input)
The package also provides LRU caching via the unified cache package (lru_cache.go); the wrapper there is kept for backward compatibility with existing code.
Index ¶
- Variables
- func ApplyMode(input string, mode Mode, cm CompressionMode) (string, int)
- func ApplyProfile(input string, mode Mode, profile Profile) (string, int)
- func ApplyTier(input string, mode Mode, tier Tier) (string, int)
- func AutoProcess(input string, mode Mode) (string, int)
- func CacheabilityScore(content string) int
- func ClassifyContent(content string) (isStatic bool, confidence float64)
- func DetectLanguage(output string) string
- func EstimateCacheHitRate(content string) float64
- func EstimateTokens(text string) int
- func FilterNoisyOutput(input string) string
- func FilterProgressBars(input string) string
- func FormatDelta(delta IncrementalDelta) string
- func FormatTDDStats(stats TDDStats) string
- func Ftoa(f float64, prec int) string
- func HammingDistance(a, b uint64) int
- func HasANSI(input string) bool
- func IsCode(output string) bool
- func IsNearDuplicate(a, b string, threshold int) bool
- func Itoa(n int) string
- func JSONPathExtract(jsonStr, path string) string
- func NewLRUCache(maxSize int, ttl time.Duration) *cache.LRUCache
- func PreferLessMode(original, filtered string) string
- func PruneMetadata(input string) string
- func QuickProcess(input string, mode Mode) (string, int)
- func QuickProcessPreset(input string, mode Mode, preset PipelinePreset) (string, int)
- func ReadContent(content string, opts ReadOptions) string
- func SimHash(content string) uint64
- func StripANSI(input string) string
- func StripLineNumbers(input string) string
- type ACONFilter
- type ANSICode
- type ANSIFilter
- type ASTPreserveFilter
- type AdaptiveLayerSelector
- type AgentContext
- type AgentMemoryConfig
- type AgentMemoryFilter
- type AgentMemoryLayerConfig
- type AgentMemoryStats
- type AnchorToken
- type ApplicabilityCheck
- type AttentionSinkFilter
- type AttentionSinkLayerConfig
- type AttributionConfig
- type AttributionFilter
- type AttributionLayerConfig
- type AutoValidationPipeline
- type BEAVERFilter
- type BM25Scorer
- type BodyFilter
- type BudgetConfig
- type BudgetEnforcer
- type CachedResult
- type ChunkMethod
- type ChunkType
- type CodeChunk
- type CodeContext
- type ColorPassthrough
- type CommandContext
- type CommentFilter
- type CommentPatterns
- type CompactionConfig
- type CompactionLayer
- type CompactionLayerConfig
- type CompactionResult
- type CompressionCache
- type CompressionMode
- type ContentComplexity
- type ContentType
- type ContrastiveFilter
- type ConversationTracker
- type CoreLayersConfig
- type CrossMessageDedup
- type CurrentState
- type DensityAdaptiveConfig
- type DensityAdaptiveFilter
- type DensityAdaptiveLayerConfig
- type DictionaryEncoding
- type DiffCrunch
- type DiffCrunchConfig
- type DynamicRatioConfig
- type DynamicRatioFilter
- type DynamicRatioLayerConfig
- type EnableCheck
- type Engine
- type EngineConfig
- type EngramMemory
- type EntropyFilter
- type EquivalenceReport
- type EvaluatorHeadsFilter
- type FeedbackConfig
- type FeedbackLoop
- type FeedbackSignal
- type Filter
- type GistFilter
- type GoalDrivenFilter
- type GoalMode
- type H2OConfig
- type H2OFilter
- type H2OLayerConfig
- type HierarchicalFilter
- type HierarchicalSummaryFilter
- type HypernymCompressor
- type HypernymConfig
- type IBConfig
- type ImportFilter
- type IncrementalDelta
- type InformationBottleneck
- type InterLayerFeedback
- type KVCacheAligner
- type KVCacheConfig
- type KVzipConfig
- type KVzipFilter
- type LLMAwareConfig
- type LLMAwareFilter
- func (f *LLMAwareFilter) Apply(input string, mode Mode) (string, int)
- func (f *LLMAwareFilter) GetModel() string
- func (f *LLMAwareFilter) GetProvider() string
- func (f *LLMAwareFilter) IsAvailable() bool
- func (f *LLMAwareFilter) Name() string
- func (f *LLMAwareFilter) SetEnabled(enabled bool)
- func (f *LLMAwareFilter) SummarizeWithIntent(content string, intent string) (string, int)
- type LLMCompressRequest
- type LLMCompressResponse
- type LLMCompressor
- type LRUCache
- type Language
- type LayerConfig
- type LayerConfigs
- type LayerStat
- type LayersSection
- type LazyPrunerConfig
- type LazyPrunerFilter
- func (f *LazyPrunerFilter) Apply(input string, mode Mode) (string, int)
- func (f *LazyPrunerFilter) Clear()
- func (f *LazyPrunerFilter) GetLayerBudget(layer int) int
- func (f *LazyPrunerFilter) GetLayerBudgets() []int
- func (f *LazyPrunerFilter) GetStats() LazyPrunerStats
- func (f *LazyPrunerFilter) Name() string
- func (f *LazyPrunerFilter) ReviveTokens(layer int, count int) []Token
- func (f *LazyPrunerFilter) SelectTokens(tokens []Token, layer int, threshold float64) []Token
- func (f *LazyPrunerFilter) StorePruned(tokens []Token, layer int)
- type LazyPrunerLayerConfig
- type LazyPrunerStats
- type LineScore
- type LogCrunch
- type LogCrunchConfig
- type ManagerConfig
- type MetaToken
- type MetaTokenConfig
- type MetaTokenFilter
- func (f *MetaTokenFilter) Apply(input string, mode Mode) (string, int)
- func (f *MetaTokenFilter) Decompress(input string) string
- func (f *MetaTokenFilter) GetMetaTokens() map[string]MetaToken
- func (f *MetaTokenFilter) LoadMetaTokens(tokens map[string]MetaToken)
- func (f *MetaTokenFilter) Name() string
- func (f *MetaTokenFilter) Stats() MetaTokenStats
- type MetaTokenLayerConfig
- type MetaTokenStats
- type Milestone
- type MixedDimFilter
- type Mode
- type MultiAgentContextSharing
- type MultiFileConfig
- type MultiFileFilter
- type NgramAbbreviator
- type NumericalConfig
- type NumericalQuantLayerConfig
- type NumericalQuantizer
- type Observation
- type PATHShimInjector
- type PerplexityFilter
- type PersistentKnowledgeStore
- type PhotonConfig
- type PhotonFilter
- type PhraseGroupConfig
- type PhraseGroupingFilter
- type PipeOp
- type Pipeline
- type PipelineConfig
- func LoadPipelineFromTOML(path string) (PipelineConfig, error)
- func ModeConfig(mode CompressionMode, baseMode Mode) PipelineConfig
- func PresetConfig(preset PipelinePreset, baseMode Mode) PipelineConfig
- func ProfileConfig(profile Profile, baseMode Mode) PipelineConfig
- func TierConfig(tier Tier, baseMode Mode) PipelineConfig
- type PipelineConfigWithNestedLayers
- type PipelineCoordinator
- func (c *PipelineCoordinator) GetASTPreserveFilter() *ASTPreserveFilter
- func (c *PipelineCoordinator) GetAgentMemoryFilter() *AgentMemoryFilter
- func (c *PipelineCoordinator) GetAttentionSinkFilter() *AttentionSinkFilter
- func (c *PipelineCoordinator) GetAttributionFilter() *AttributionFilter
- func (c *PipelineCoordinator) GetCompactionLayer() *CompactionLayer
- func (c *PipelineCoordinator) GetContrastiveFilter() *ContrastiveFilter
- func (c *PipelineCoordinator) GetEntropyFilter() *EntropyFilter
- func (c *PipelineCoordinator) GetEvaluatorHeadsFilter() *EvaluatorHeadsFilter
- func (c *PipelineCoordinator) GetGistFilter() *GistFilter
- func (c *PipelineCoordinator) GetGoalDrivenFilter() *GoalDrivenFilter
- func (c *PipelineCoordinator) GetH2OFilter() *H2OFilter
- func (c *PipelineCoordinator) GetHierarchicalSummaryFilter() *HierarchicalSummaryFilter
- func (c *PipelineCoordinator) GetLazyPrunerFilter() *LazyPrunerFilter
- func (c *PipelineCoordinator) GetMetaTokenFilter() *MetaTokenFilter
- func (c *PipelineCoordinator) GetNgramAbbreviator() *NgramAbbreviator
- func (c *PipelineCoordinator) GetPerplexityFilter() *PerplexityFilter
- func (c *PipelineCoordinator) GetSemanticAnchorFilter() *SemanticAnchorFilter
- func (c *PipelineCoordinator) GetSemanticChunkFilter() *SemanticChunkFilter
- func (c *PipelineCoordinator) GetSketchStoreFilter() *SketchStoreFilter
- func (c *PipelineCoordinator) GetTFIDFFilter() *TFIDFFilter
- func (p *PipelineCoordinator) GetTOMLFilterName() string
- func (p *PipelineCoordinator) Process(input string) (string, *PipelineStats)
- func (p *PipelineCoordinator) SetTOMLFilter(filter Filter, name string)
- type PipelineManager
- func (m *PipelineManager) Process(input string, mode Mode, ctx CommandContext) (*ProcessResult, error)
- func (m *PipelineManager) ProcessWithBudget(input string, mode Mode, budget int, ctx CommandContext) (*ProcessResult, error)
- func (m *PipelineManager) ProcessWithQuery(input string, mode Mode, query string, ctx CommandContext) (*ProcessResult, error)
- type PipelinePreset
- type PipelineSection
- type PipelineStats
- type PoCFilter
- type PositionAwareFilter
- type ProcessResult
- type Profile
- type QualityEstimator
- type QualityScorer
- type QueryAwareFilter
- type QueryIntent
- type QuestionAwareConfig
- type QuestionAwareFilter
- type QuestionAwareLayerConfig
- type QuestionAwareRecovery
- type ReadMode
- type ReadOptions
- type Reflection
- type ReversibleStore
- type SWEzzeFilter
- type SafetySection
- type ScopeConfig
- type ScopeFilter
- type ScopeMode
- type ScratchMessage
- type SemanticAnchorConfig
- type SemanticAnchorFilter
- func (f *SemanticAnchorFilter) Apply(input string, mode Mode) (string, int)
- func (f *SemanticAnchorFilter) GetAnchorDensity(token string) float64
- func (f *SemanticAnchorFilter) GetAnchors() []AnchorToken
- func (f *SemanticAnchorFilter) GetStats() SemanticAnchorStats
- func (f *SemanticAnchorFilter) Name() string
- type SemanticAnchorLayerConfig
- type SemanticAnchorStats
- type SemanticCacheConfig
- type SemanticCacheFilter
- type SemanticChunk
- type SemanticChunkConfig
- type SemanticChunkFilter
- type SemanticChunkLayerConfig
- type SemanticEquivalence
- type SemanticFilter
- type SemanticIntentDetector
- type SessionConfig
- type SessionHistory
- type SessionStats
- type SessionTracker
- type SinkConfig
- type Sketch
- type SketchCache
- type SketchEntry
- type SketchStats
- type SketchStoreConfig
- type SketchStoreFilter
- func (f *SketchStoreFilter) Apply(input string, mode Mode) (string, int)
- func (f *SketchStoreFilter) Clear()
- func (f *SketchStoreFilter) ExportSketches() ([]byte, error)
- func (f *SketchStoreFilter) GetAllSketches() map[string]*Sketch
- func (f *SketchStoreFilter) GetSketch(hash string) (*Sketch, bool)
- func (f *SketchStoreFilter) GetStats() SketchStats
- func (f *SketchStoreFilter) ImportSketches(data []byte) error
- func (f *SketchStoreFilter) Name() string
- func (f *SketchStoreFilter) Revive(sketchHash string) (string, bool)
- type SketchStoreLayerConfig
- type SmallKVCompensator
- type SmallKVConfig
- type SnapshotContext
- type StateSnapshot
- type StoredEntry
- type StreamingProcessor
- type StructuralCollapse
- type StructuralCollapseConfig
- type SymbolicCompressFilter
- type SymbolicConfig
- type TDDConfig
- type TDDStats
- type TFIDFConfig
- type TFIDFFilter
- type TFIDFLayerConfig
- type TOMLPipelineConfig
- type TOONConfig
- type TOONEncoder
- type TaskRunnerWrapping
- type TemplatePipe
- type Tier
- type TieredSummary
- type Token
- type TokenDenseDialect
- type TokenQuantFilter
- type TokenRetentionFilter
- type Turn
- type ValidationResult
- type Validator
Constants ¶
This section is empty.
Variables ¶
var BlockDelimiters = map[rune]rune{
'{': '}',
'[': ']',
'(': ')',
}
BlockDelimiters for brace tracking
var CommentPatternsMap = map[Language]*regexp.Regexp{
	LangGo:         cStyleCommentRe,
	LangRust:       cStyleCommentRe,
	LangPython:     regexp.MustCompile(`(?m)^#.*$|"""[\s\S]*?"""|'''[\s\S]*?'''`),
	LangJavaScript: cStyleCommentRe,
	LangTypeScript: cStyleCommentRe,
	LangJava:       cStyleCommentRe,
	LangC:          cStyleCommentRe,
	LangCpp:        cStyleCommentRe,
	LangShell:      regexp.MustCompile(`(?m)^#.*$`),
	LangRuby:       regexp.MustCompile(`(?m)^#.*$|=begin[\s\S]*?=end`),
	LangSQL:        regexp.MustCompile(`(?m)^--.*$`),
}
CommentPatternsMap maps languages to their comment regex patterns
var DiffHunkPattern = regexp.MustCompile(`^@@\s+-\d+(?:,\d+)?\s+\+\d+(?:,\d+)?\s+@@`)
DiffHunkPattern
var ImportPatterns = []*regexp.Regexp{
	regexp.MustCompile(`^use\s+`),
	regexp.MustCompile(`^import\s+`),
	regexp.MustCompile(`^from\s+\S+\s+import`),
	regexp.MustCompile(`^require\(`),
	regexp.MustCompile(`^import\s*\(`),
	regexp.MustCompile(`^import\s+"`),
	regexp.MustCompile(`#include\s*<`),
	regexp.MustCompile(`#include\s*"`),
	regexp.MustCompile(`^package\s+`),
}
ImportPatterns for various languages
var LogTimestampPatterns = []*regexp.Regexp{
	regexp.MustCompile(`^\d{4}-\d{2}-\d{2}[T\s]\d{2}:\d{2}:\d{2}`),
	regexp.MustCompile(`^\[\d{4}-\d{2}-\d{2}`),
	regexp.MustCompile(`^\d{2}:\d{2}:\d{2}`),
}
LogTimestampPatterns
var SignaturePatterns = []*regexp.Regexp{
	regexp.MustCompile(`^(pub\s+)?(async\s+)?fn\s+\w+`),
	regexp.MustCompile(`^(pub\s+)?struct\s+\w+`),
	regexp.MustCompile(`^(pub\s+)?enum\s+\w+`),
	regexp.MustCompile(`^(pub\s+)?trait\s+\w+`),
	regexp.MustCompile(`^(pub\s+)?type\s+\w+`),
	regexp.MustCompile(`^impl\s+`),
	regexp.MustCompile(`^func\s+(\([^)]+\)\s+)?\w+`),
	regexp.MustCompile(`^type\s+\w+\s+(struct|interface)`),
	regexp.MustCompile(`^type\s+\w+\s+\w+`),
	regexp.MustCompile(`^def\s+\w+`),
	regexp.MustCompile(`^async\s+def\s+\w+`),
	regexp.MustCompile(`^class\s+\w+`),
	regexp.MustCompile(`^function\s+\w+`),
	regexp.MustCompile(`^(export\s+)?(async\s+)?function\s*\w*`),
	regexp.MustCompile(`^(export\s+)?(default\s+)?class\s+\w+`),
	regexp.MustCompile(`^(export\s+)?const\s+\w+\s*=\s*(async\s+)?\([^)]*\)\s*=>`),
	regexp.MustCompile(`^interface\s+\w+`),
	regexp.MustCompile(`^type\s+\w+\s*=`),
	regexp.MustCompile(`^(public|private|protected)?\s*(static\s+)?(class|interface|enum)\s+\w+`),
	regexp.MustCompile(`^(public|private|protected)?\s*(static\s+)?(async\s+)?\w+\s+\w+\s*\(`),
}
SignaturePatterns for aggressive filtering
var TestResultPatterns = []*regexp.Regexp{
	regexp.MustCompile(`test result: (ok|FAILED|ignored)\.`),
	regexp.MustCompile(`(\d+) passed`),
	regexp.MustCompile(`(\d+) failed`),
	regexp.MustCompile(`(\d+) ignored`),
	regexp.MustCompile(`(\d+) skipped`),
	regexp.MustCompile(`PASS`),
	regexp.MustCompile(`FAIL`),
	regexp.MustCompile(`ok\s+\S+\s+[\d.]+s`),
}
TestResultPatterns
Functions ¶
func ApplyMode ¶
func ApplyMode(input string, mode Mode, cm CompressionMode) (string, int)
ApplyMode is an alias for ApplyTier (backwards compat).
func ApplyProfile ¶
ApplyProfile is an alias for ApplyTier (backwards compat).
func AutoProcess ¶
AutoProcess detects content type and applies the optimal profile.
func CacheabilityScore ¶
CacheabilityScore returns a 0-100 score indicating how cacheable content is. Higher scores mean more stable prefix, better cache hit rate.
func ClassifyContent ¶
ClassifyContent classifies content as static or dynamic.
func DetectLanguage ¶
DetectLanguage attempts to detect the programming language from output using weighted scoring across multiple indicators.
func EstimateCacheHitRate ¶
EstimateCacheHitRate estimates the cache hit rate for repeated requests.
func EstimateTokens ¶
EstimateTokens provides a heuristic token count. Delegates to core.EstimateTokens for single source of truth (T22).
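A minimal sketch of this kind of heuristic. The ~4-characters-per-token constant is a common rule of thumb for English text and is an assumption here; the actual heuristic lives in core.EstimateTokens and may differ:

```go
package main

import "fmt"

// estimateTokens is an illustrative heuristic only: roughly 4 characters
// per token for English-like text. The package's EstimateTokens delegates
// to core.EstimateTokens, whose exact heuristic may differ.
func estimateTokens(text string) int {
	return (len(text) + 3) / 4
}

func main() {
	fmt.Println(estimateTokens("hello world, this is a test")) // 7 with this heuristic
}
```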
func FilterNoisyOutput ¶
FilterNoisyOutput removes common noise from terminal output.
func FilterProgressBars ¶
FilterProgressBars removes progress bar lines from output. Progress bars are noisy and consume tokens.
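A line-level sketch of the idea. The patterns below (bracketed bar glyphs, percentages) are illustrative guesses, not the package's actual heuristics:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// barRe matches lines that look like progress bars. These patterns are
// assumptions for illustration; FilterProgressBars may match more formats.
var barRe = regexp.MustCompile(`\[[=#>\- ]+\]|\d{1,3}%`)

// filterProgressBars drops any line matching barRe and keeps the rest.
func filterProgressBars(input string) string {
	var kept []string
	for _, line := range strings.Split(input, "\n") {
		if barRe.MatchString(line) {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

func main() {
	fmt.Println(filterProgressBars("Downloading...\n[####>----] 45%\nDone."))
}
```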
func FormatDelta ¶
func FormatDelta(delta IncrementalDelta) string
FormatDelta returns a human-readable delta string.
func FormatTDDStats ¶
FormatTDDStats returns a human-readable stats string.
func HammingDistance ¶
HammingDistance returns the number of differing bits between two hashes.
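The standard implementation of this operation is XOR followed by popcount; a self-contained sketch:

```go
package main

import (
	"fmt"
	"math/bits"
)

// hammingDistance counts differing bits between two 64-bit fingerprints:
// XOR the values, then popcount the result. This mirrors what a
// HammingDistance over SimHash fingerprints typically does.
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

func main() {
	fmt.Println(hammingDistance(0b1011, 0b1001)) // 1 differing bit
}
```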
func IsNearDuplicate ¶
IsNearDuplicate returns true if two content blocks are near-duplicates. Uses SimHash with configurable Hamming distance threshold.
func JSONPathExtract ¶
JSONPathExtract extracts values from JSON using simple path notation.
func NewLRUCache ¶
NewLRUCache creates an LRU cache with given max size and TTL.
func PreferLessMode ¶
PreferLessMode compares filtered vs piped output and uses smaller. Inspired by tokf's prefer-less mode.
func PruneMetadata ¶
PruneMetadata removes unnecessary metadata from JSON (npm URLs, integrity hashes, etc.).
func QuickProcess ¶
QuickProcess compresses input with the default configuration.
func QuickProcessPreset ¶
func QuickProcessPreset(input string, mode Mode, preset PipelinePreset) (string, int)
func ReadContent ¶
func ReadContent(content string, opts ReadOptions) string
ReadContent reads content with the specified mode.
func SimHash ¶
SimHash computes a 64-bit fingerprint for content deduplication. Uses character n-gram hashing with Hamming distance for near-duplicate detection.
func StripANSI ¶
StripANSI is a utility function to strip ANSI codes from a string. Delegates to SIMD-optimized implementation.
func StripLineNumbers ¶
StripLineNumbers removes line number prefixes from tool output (e.g., "1-> content").
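A sketch handling the "1-> content" shape shown in the doc comment; the package may accept a broader set of prefix formats:

```go
package main

import (
	"fmt"
	"regexp"
)

// lineNumRe matches "N-> " prefixes at the start of each line. The exact
// prefix grammar here is an assumption based on the documented example.
var lineNumRe = regexp.MustCompile(`(?m)^\s*\d+->\s?`)

// stripLineNumbers removes the matched prefixes, leaving the content.
func stripLineNumbers(input string) string {
	return lineNumRe.ReplaceAllString(input, "")
}

func main() {
	fmt.Println(stripLineNumbers("1-> package main\n2-> func main() {}"))
}
```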
Types ¶
type ACONFilter ¶
type ACONFilter struct {
// contains filtered or unexported fields
}
Paper: "ACON: Optimizing Context Compression for Long-Context LLMs" — ICLR 2026
ACONFilter implements adaptive context optimization — dynamically adjusts compression based on content complexity and context length.
func NewACONFilter ¶
func NewACONFilter() *ACONFilter
NewACONFilter creates a new ACON-style context compression filter.
type ANSIFilter ¶
type ANSIFilter struct{}
ANSIFilter strips ANSI escape sequences from output. Uses SIMD-optimized byte scanning for ~10-40x speedup over regex.
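The byte-scanning idea (minus the SIMD) can be sketched as a single pass that skips CSI sequences — ESC '[' followed by parameter bytes, terminated by a final byte in the '@'..'~' range. Non-CSI escape forms are not handled in this sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// stripANSI removes CSI escape sequences with a single byte scan. This is
// the scalar version of the idea; the package's implementation adds SIMD.
func stripANSI(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); {
		if s[i] == 0x1b && i+1 < len(s) && s[i+1] == '[' {
			i += 2
			for i < len(s) && (s[i] < '@' || s[i] > '~') {
				i++ // parameter and intermediate bytes
			}
			if i < len(s) {
				i++ // final byte (e.g. 'm')
			}
			continue
		}
		b.WriteByte(s[i])
		i++
	}
	return b.String()
}

func main() {
	fmt.Println(stripANSI("\x1b[31mred\x1b[0m text")) // prints "red text"
}
```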
type ASTPreserveFilter ¶
type ASTPreserveFilter struct {
// contains filtered or unexported fields
}
Paper: "LongCodeZip" — Shi et al., SJTU/Stanford, 2025 https://arxiv.org/abs/2510.00446
ASTPreserveFilter implements LongCodeZip-style compression: AST-aware compression that preserves the syntactic validity of code.
Algorithm:
1. Detect programming language from syntax patterns
2. Parse code structure (brackets, braces, indentation)
3. Apply entropy-based pruning while preserving AST integrity
4. Never break syntactic boundaries (function bodies, blocks, strings)
Enhanced with the dual-stage LongCodeZip methodology:
- Stage 1: coarse-grained (function-level) pruning
- Stage 2: fine-grained (block-level) adaptive compression
Research Results: 4-8x compression while maintaining parseable code. LongCodeZip: 5.6x reduction, 16% better accuracy than LLMLingua on code.
func NewASTPreserveFilter ¶
func NewASTPreserveFilter() *ASTPreserveFilter
NewASTPreserveFilter creates a new AST-aware filter
func (*ASTPreserveFilter) Apply ¶
func (f *ASTPreserveFilter) Apply(input string, mode Mode) (string, int)
Apply applies AST-aware filtering
func (*ASTPreserveFilter) Name ¶
func (f *ASTPreserveFilter) Name() string
Name returns the filter name
func (*ASTPreserveFilter) SetQueryIntent ¶
func (f *ASTPreserveFilter) SetQueryIntent(query string)
SetQueryIntent sets the query intent for query-aware scoring
type AdaptiveLayerSelector ¶
type AdaptiveLayerSelector struct {
// contains filtered or unexported fields
}
AdaptiveLayerSelector dynamically enables/disables layers based on content type. Uses heuristic analysis to optimize compression for different input patterns.
func NewAdaptiveLayerSelector ¶
func NewAdaptiveLayerSelector() *AdaptiveLayerSelector
NewAdaptiveLayerSelector creates a new adaptive selector
func (*AdaptiveLayerSelector) AnalyzeContent ¶
func (a *AdaptiveLayerSelector) AnalyzeContent(input string) ContentType
AnalyzeContent detects the primary content type
func (*AdaptiveLayerSelector) OptimizePipeline ¶
func (a *AdaptiveLayerSelector) OptimizePipeline(input string, mode Mode) *PipelineCoordinator
OptimizePipeline returns an optimized coordinator for the given input
func (*AdaptiveLayerSelector) RecommendedConfig ¶
func (a *AdaptiveLayerSelector) RecommendedConfig(ct ContentType, mode Mode) PipelineConfig
RecommendedConfig returns optimal layer configuration for the content type
type AgentContext ¶
AgentContext holds context for a single agent.
type AgentMemoryConfig ¶
type AgentMemoryConfig struct {
// KnowledgeRetentionRatio is the ratio of knowledge to keep (0.0-1.0)
KnowledgeRetentionRatio float64
// HistoryPruneRatio is the ratio of history to prune after consolidation
HistoryPruneRatio float64
// ConsolidationThreshold triggers consolidation when history exceeds this
ConsolidationThreshold int
// EnableAutoConsolidation allows autonomous memory management
EnableAutoConsolidation bool
// KnowledgeMaxSize limits the knowledge block size
KnowledgeMaxSize int
// PreservePatterns are regex patterns for important content to always keep
PreservePatterns []*regexp.Regexp
}
AgentMemoryConfig holds configuration for agent memory management
func DefaultAgentMemoryConfig ¶
func DefaultAgentMemoryConfig() AgentMemoryConfig
DefaultAgentMemoryConfig returns default configuration
type AgentMemoryFilter ¶
type AgentMemoryFilter struct {
// contains filtered or unexported fields
}
AgentMemoryFilter implements Layer 20: Agent Memory Mode (Focus-inspired).
Research Source: "Active Context Compression / Focus" (arXiv, January 2026)
Key Innovation: Agent-centric autonomous memory management inspired by slime mold.
Results: 22.7% token reduction (14.9M → 11.5M), 57% savings on individual instances.
Methodology (Physarum polycephalum inspired):
1. Knowledge Consolidation - Extract learnings into "Knowledge" block
2. Active Withdrawal - Prune raw interaction history after consolidation
3. Self-Regulation - Agent decides when to consolidate vs. keep raw
This filter maintains session state and autonomously manages context bloat in long-horizon agent tasks by distinguishing between:
- Knowledge: consolidated insights (high value, permanent)
- History: raw interaction logs (transient, prunable)
func NewAgentMemoryFilter ¶
func NewAgentMemoryFilter() *AgentMemoryFilter
NewAgentMemoryFilter creates a new agent memory filter
func NewAgentMemoryFilterWithConfig ¶
func NewAgentMemoryFilterWithConfig(cfg AgentMemoryConfig) *AgentMemoryFilter
NewAgentMemoryFilterWithConfig creates a filter with custom config
func (*AgentMemoryFilter) Apply ¶
func (f *AgentMemoryFilter) Apply(input string, mode Mode) (string, int)
Apply applies agent memory management compression
func (*AgentMemoryFilter) GetStats ¶
func (f *AgentMemoryFilter) GetStats() AgentMemoryStats
GetStats returns current memory management statistics
func (*AgentMemoryFilter) Name ¶
func (f *AgentMemoryFilter) Name() string
Name returns the filter name
func (*AgentMemoryFilter) Reset ¶
func (f *AgentMemoryFilter) Reset()
Reset clears the memory state (for new sessions)
type AgentMemoryLayerConfig ¶
type AgentMemoryLayerConfig struct {
Enabled bool
KnowledgeRetention float64
HistoryPrune float64
ConsolidationMax int
}
AgentMemoryLayerConfig groups Layer 20 settings.
type AgentMemoryStats ¶
type AgentMemoryStats struct {
TotalConsolidated int
TotalPruned int
KnowledgeTokens int
HistoryTokens int
TokensSaved int
}
AgentMemoryStats tracks memory management statistics
type AnchorToken ¶
type AnchorToken struct {
Text string
Position int
Score float64
Aggregated []string // Tokens aggregated into this anchor
IsStructural bool
}
AnchorToken represents a semantic anchor point
type ApplicabilityCheck ¶
ApplicabilityCheck is an optional interface that filters can implement to report whether they should run for a given input. The coordinator calls this before Apply to implement stage gates (skip cheap before expensive).
type AttentionSinkFilter ¶
type AttentionSinkFilter struct {
// contains filtered or unexported fields
}
Paper: "StreamingLLM: Attention Sinks" — Xiao et al., MIT, 2023 https://arxiv.org/abs/2309.17453
AttentionSinkFilter implements StreamingLLM-style attention sink preservation. Research basis: "Efficient Streaming Language Models with Attention Sinks" (Xiao et al., 2023) - enables infinite-length generation with bounded memory.
Key insight: The first few tokens in a sequence act as "attention sinks" - they absorb excess attention weight due to softmax normalization. Removing these tokens breaks the attention distribution the model learned during training.
This layer:
1. Always preserves initial tokens (attention sinks)
2. Preserves structural anchors (headers, prefixes, markers)
3. Applies rolling cache to remaining content
This is Layer 14 in the pipeline, ensuring stable compression for long content.
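The keep-sinks-plus-recent-window idea can be sketched at line granularity. Working on lines rather than tokens, and the elision marker, are choices made for this sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// sinkCompress keeps the first `sinks` lines (the "attention sinks") and
// the most recent `window` lines, dropping the middle.
func sinkCompress(input string, sinks, window int) string {
	lines := strings.Split(input, "\n")
	if len(lines) <= sinks+window {
		return input // already fits; nothing to drop
	}
	kept := append([]string{}, lines[:sinks]...)
	kept = append(kept, "[... elided ...]")
	kept = append(kept, lines[len(lines)-window:]...)
	return strings.Join(kept, "\n")
}

func main() {
	fmt.Println(sinkCompress("header\na\nb\nc\ntail1\ntail2", 1, 2))
}
```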
func NewAdaptiveAttentionSinkFilter ¶
func NewAdaptiveAttentionSinkFilter(outputLines int) *AttentionSinkFilter
NewAdaptiveAttentionSinkFilter creates a filter with adaptive sink count. StreamingLLM insight — sink count scales with output length.
func NewAttentionSinkFilter ¶
func NewAttentionSinkFilter() *AttentionSinkFilter
NewAttentionSinkFilter creates a new attention sink filter
func (*AttentionSinkFilter) Apply ¶
func (a *AttentionSinkFilter) Apply(input string, mode Mode) (string, int)
Apply applies attention sink preservation to the input
func (*AttentionSinkFilter) GetStats ¶
func (a *AttentionSinkFilter) GetStats() map[string]any
GetStats returns filter statistics
func (*AttentionSinkFilter) Name ¶
func (a *AttentionSinkFilter) Name() string
Name returns the filter name
func (*AttentionSinkFilter) SetEnabled ¶
func (a *AttentionSinkFilter) SetEnabled(enabled bool)
SetEnabled enables or disables the filter
type AttentionSinkLayerConfig ¶
AttentionSinkLayerConfig groups Layer 14 settings.
type AttributionConfig ¶
type AttributionConfig struct {
// Enable attribution filtering
Enabled bool
// Threshold for token importance (0.0-1.0)
// Tokens below this score are candidates for removal
ImportanceThreshold float64
// Minimum content length to apply attribution
MinContentLength int
// Use positional bias (later tokens often less important)
PositionalBias bool
// Use frequency-based importance (repeated tokens may be less important)
FrequencyBias bool
// Use semantic markers (preserve keywords, numbers, code)
SemanticPreservation bool
// Maximum tokens to analyze (for performance)
MaxAnalyzeTokens int
}
AttributionConfig holds configuration for attribution-based pruning
func DefaultAttributionConfig ¶
func DefaultAttributionConfig() AttributionConfig
DefaultAttributionConfig returns default configuration
type AttributionFilter ¶
type AttributionFilter struct {
// contains filtered or unexported fields
}
AttributionFilter implements attribution-based token pruning.
Research basis: "ProCut: Progressive Pruning via Attribution" (LinkedIn, 2025). Achieves 78% token reduction by using importance scoring.
Key technique: Attribution scores (simplified SHAP) identify which tokens contribute most to the output. Low-importance tokens are pruned.
This is Layer 12 in the pipeline, adding ML-style importance without requiring actual model training.
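A toy importance score in the spirit of the layer's positional and frequency biases (see AttributionConfig): earlier tokens and rarer tokens score higher. The 0.5/0.5 weighting is invented for illustration:

```go
package main

import "fmt"

// tokenImportance combines a positional bias (earlier = more important)
// with a frequency bias (repeated = less important). Weights are made up.
func tokenImportance(pos, total, freq int) float64 {
	positional := 1.0 - 0.5*float64(pos)/float64(total)
	frequency := 1.0 / float64(freq)
	return 0.5*positional + 0.5*frequency
}

func main() {
	early := tokenImportance(0, 100, 1)  // early, unique token
	late := tokenImportance(90, 100, 5) // late, frequently repeated token
	fmt.Printf("early=%.2f late=%.2f\n", early, late)
}
```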
func NewAttributionFilter ¶
func NewAttributionFilter() *AttributionFilter
NewAttributionFilter creates a new attribution filter
func (*AttributionFilter) Apply ¶
func (a *AttributionFilter) Apply(input string, mode Mode) (string, int)
Apply applies attribution-based pruning to the input
func (*AttributionFilter) GetStats ¶
func (a *AttributionFilter) GetStats() map[string]any
GetStats returns filter statistics
func (*AttributionFilter) Name ¶
func (a *AttributionFilter) Name() string
Name returns the filter name
func (*AttributionFilter) SetEnabled ¶
func (a *AttributionFilter) SetEnabled(enabled bool)
SetEnabled enables or disables the filter
type AttributionLayerConfig ¶
AttributionLayerConfig groups Layer 12 settings.
type AutoValidationPipeline ¶
type AutoValidationPipeline struct {
// contains filtered or unexported fields
}
AutoValidationPipeline implements auto-validation after file changes. Inspired by lean-ctx's auto-validation pipeline.
func NewAutoValidationPipeline ¶
func NewAutoValidationPipeline() *AutoValidationPipeline
NewAutoValidationPipeline creates a new auto-validation pipeline.
func (*AutoValidationPipeline) AddValidator ¶
func (avp *AutoValidationPipeline) AddValidator(name, command string, args ...string)
AddValidator adds a validation step.
func (*AutoValidationPipeline) Validate ¶
func (avp *AutoValidationPipeline) Validate() ValidationResult
Validate runs all validators.
type BEAVERFilter ¶
type BEAVERFilter struct {
// contains filtered or unexported fields
}
Paper: "BEAVER: Structure-Aware Page Selection" — 2026 https://arxiv.org/abs/2603.19635
BEAVERFilter implements structure-aware hierarchical compression — treats content as pages/sections and selects based on structural importance.
func NewBEAVERFilter ¶
func NewBEAVERFilter() *BEAVERFilter
NewBEAVERFilter creates a new structure-aware page selection filter.
type BM25Scorer ¶
type BM25Scorer struct {
// contains filtered or unexported fields
}
BM25Scorer implements Okapi BM25 scoring for relevance ranking. Better relevance ranking than TF-IDF (used in IR systems for decades).
func (*BM25Scorer) Fit ¶
func (s *BM25Scorer) Fit(docs []string)
Fit builds document frequency statistics from corpus.
func (*BM25Scorer) Score ¶
func (s *BM25Scorer) Score(doc string, query string) float64
Score computes BM25 relevance of a document to a query.
func (*BM25Scorer) ScoreLines ¶
func (s *BM25Scorer) ScoreLines(lines []string, query string) []LineScore
ScoreLines scores each line against a query and returns sorted indices.
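A minimal Okapi BM25 sketch for reference. The k1 = 1.2 and b = 0.75 defaults are the usual textbook values, and whitespace lowercasing stands in for real tokenization; the scorer's actual parameters are internal:

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// bm25 holds corpus statistics: document frequencies, corpus size,
// average document length, and the standard k1/b parameters.
type bm25 struct {
	df    map[string]int
	n     int
	avgdl float64
	k1, b float64
}

func newBM25() *bm25 { return &bm25{df: map[string]int{}, k1: 1.2, b: 0.75} }

// fit builds document-frequency statistics from the corpus.
func (s *bm25) fit(docs []string) {
	total := 0
	for _, d := range docs {
		words := strings.Fields(strings.ToLower(d))
		total += len(words)
		seen := map[string]bool{}
		for _, w := range words {
			if !seen[w] {
				seen[w] = true
				s.df[w]++
			}
		}
	}
	s.n = len(docs)
	if s.n > 0 {
		s.avgdl = float64(total) / float64(s.n)
	}
}

// score computes BM25 relevance of a document to a query.
func (s *bm25) score(doc, query string) float64 {
	words := strings.Fields(strings.ToLower(doc))
	tf := map[string]int{}
	for _, w := range words {
		tf[w]++
	}
	dl := float64(len(words))
	var total float64
	for _, q := range strings.Fields(strings.ToLower(query)) {
		f := float64(tf[q])
		if f == 0 {
			continue
		}
		idf := math.Log(1 + (float64(s.n)-float64(s.df[q])+0.5)/(float64(s.df[q])+0.5))
		total += idf * f * (s.k1 + 1) / (f + s.k1*(1-s.b+s.b*dl/s.avgdl))
	}
	return total
}

func main() {
	s := newBM25()
	s.fit([]string{"error in parser", "all tests passed"})
	fmt.Printf("%.3f\n", s.score("error in parser", "parser error"))
}
```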
type BodyFilter ¶
type BodyFilter struct {
// contains filtered or unexported fields
}
BodyFilter strips function bodies in aggressive mode. Preserves function signatures while removing body content.
type BudgetConfig ¶
type BudgetConfig struct {
Budget int // Maximum tokens (0 = unlimited)
}
BudgetConfig holds configuration for the budget enforcer
type BudgetEnforcer ¶
type BudgetEnforcer struct {
// contains filtered or unexported fields
}
BudgetEnforcer enforces strict token limits on output. Research-based: Budget-Constrained Compression (2024) - provides predictable output size by scoring segments and keeping only the most important ones.
Key insight: LLMs have finite context windows. Enforcing a strict budget ensures output fits within constraints while maximizing information content.
func NewBudgetEnforcer ¶
func NewBudgetEnforcer(budget int) *BudgetEnforcer
NewBudgetEnforcer creates a new budget enforcer.
func NewBudgetEnforcerWithConfig ¶
func NewBudgetEnforcerWithConfig(cfg BudgetConfig) *BudgetEnforcer
NewBudgetEnforcerWithConfig creates a budget enforcer with config.
func (*BudgetEnforcer) Apply ¶
func (f *BudgetEnforcer) Apply(input string, mode Mode) (string, int)
Apply enforces the token budget on the output.
func (*BudgetEnforcer) SetBudget ¶
func (f *BudgetEnforcer) SetBudget(budget int)
SetBudget updates the token budget
type CachedResult ¶
CachedResult represents a cached compression result
type ChunkMethod ¶
type ChunkMethod int
ChunkMethod defines how content is split into chunks
const (
	// ChunkAuto auto-detects content type and applies appropriate method
	ChunkAuto ChunkMethod = iota
	// ChunkCode uses code-aware chunking (functions, classes)
	ChunkCode
	// ChunkText uses text-aware chunking (sentences, paragraphs)
	ChunkText
	// ChunkMixed handles mixed code+text content
	ChunkMixed
)
type CodeChunk ¶
type CodeChunk struct {
Type string // "function", "class", "method", "block"
Name string
Content string
StartLine int
EndLine int
Score float64 // Importance score
Tokens int
Children []CodeChunk // Nested blocks
}
CodeChunk represents a parsed code unit for dual-stage compression
type CodeContext ¶
type CodeContext struct {
File string `json:"file"`
Symbols []string `json:"symbols"`
Lines string `json:"lines,omitempty"`
}
CodeContext preserves code-specific context
type ColorPassthrough ¶
type ColorPassthrough struct {
// contains filtered or unexported fields
}
ColorPassthrough strips ANSI codes for matching but restores in output. Inspired by tokf's color passthrough.
func (*ColorPassthrough) RestoreCodes ¶
func (cp *ColorPassthrough) RestoreCodes(stripped string) string
RestoreCodes restores ANSI codes to stripped content.
func (*ColorPassthrough) StripAndStore ¶
func (cp *ColorPassthrough) StripAndStore(content string) string
StripAndStore strips ANSI codes and stores their positions.
type CommandContext ¶
type CommandContext struct {
Command string // "git", "npm", "cargo", etc.
Subcommand string // "status", "test", "build"
ExitCode int // Non-zero = likely has errors
Intent string // "debug", "review", "deploy", "search"
IsTest bool // Test output detection
IsBuild bool // Build output detection
IsError bool // Error output detection
}
CommandContext provides metadata about the command being executed. Used for intelligent filtering decisions.
type CommentFilter ¶
type CommentFilter struct {
// contains filtered or unexported fields
}
CommentFilter strips comments from source code.
type CommentPatterns ¶
type CommentPatterns struct {
Line string
BlockStart string
BlockEnd string
DocLine string
DocBlock string
}
CommentPatterns represents comment structure for a language
type CompactionConfig ¶
type CompactionConfig struct {
// Enable LLM-based compaction
Enabled bool
// Minimum content size to trigger compaction (in lines)
ThresholdLines int
// Minimum content size to trigger compaction (in tokens)
ThresholdTokens int
// Number of recent turns to preserve verbatim
PreserveRecentTurns int
// Maximum summary length in tokens
MaxSummaryTokens int
// Content types to compact (chat, conversation, session)
ContentTypes []string
// Enable caching of compaction results
CacheEnabled bool
// Custom prompt template for compaction
PromptTemplate string
// Detect content type automatically
AutoDetect bool
// Create state snapshot format (4-section XML)
StateSnapshotFormat bool
// Extract key-value pairs from content
ExtractKeyValuePairs bool
// Maximum context entries to preserve
MaxContextEntries int
}
CompactionConfig holds configuration for the compaction layer
func DefaultCompactionConfig ¶
func DefaultCompactionConfig() CompactionConfig
DefaultCompactionConfig returns default configuration
type CompactionLayer ¶
type CompactionLayer struct {
// contains filtered or unexported fields
}
Paper: "MemGPT" — Packer et al., UC Berkeley, 2023 https://arxiv.org/abs/2310.08560 CompactionLayer provides semantic compression for chat/conversation content. It creates state snapshots with 4 sections:
 1. session_history: user queries + activity log (what was done)
 2. current_state: focus + next_action (what's active now)
 3. context: critical + working knowledge (what to remember)
 4. pending_plan: future milestones (what's next)
Research basis: "MemGPT" (UC Berkeley, 2023) semantic compression achieves 98%+ compression ratios while preserving semantic meaning.
This layer is designed for:
 - Chat history compression
 - Conversation-style content
 - Session state preservation
 - Multi-turn context management
func NewCompactionLayer ¶
func NewCompactionLayer(cfg CompactionConfig) *CompactionLayer
NewCompactionLayer creates a new compaction layer
func (*CompactionLayer) Apply ¶
func (c *CompactionLayer) Apply(input string, mode Mode) (string, int)
Apply applies compaction to the input
func (*CompactionLayer) GetStats ¶
func (c *CompactionLayer) GetStats() map[string]any
GetStats returns compaction statistics
func (*CompactionLayer) IsAvailable ¶
func (c *CompactionLayer) IsAvailable() bool
IsAvailable returns true if LLM is available
func (*CompactionLayer) SetEnabled ¶
func (c *CompactionLayer) SetEnabled(enabled bool)
SetEnabled enables or disables the compaction layer
type CompactionLayerConfig ¶
type CompactionLayerConfig struct {
Enabled bool
Threshold int
PreserveTurns int
MaxTokens int
StateSnapshot bool
AutoDetect bool
}
CompactionLayerConfig groups Layer 11 settings.
type CompactionResult ¶
type CompactionResult struct {
Snapshot *StateSnapshot
OriginalTokens int
FinalTokens int
SavedTokens int
CompressionRatio float64
Cached bool
Timestamp time.Time
}
CompactionResult represents a compaction result
func Compact ¶
func Compact(input string, cfg CompactionConfig) (string, *CompactionResult)
Compact is a convenience function for one-shot compaction
type CompressionCache ¶
type CompressionCache struct {
// contains filtered or unexported fields
}
CompressionCache provides caching for compression results
func NewCompressionCache ¶
func NewCompressionCache(maxSize int) *CompressionCache
NewCompressionCache creates a new compression cache
func (*CompressionCache) Get ¶
func (c *CompressionCache) Get(key string) (*CachedResult, bool)
Get retrieves a cached result
func (*CompressionCache) Set ¶
func (c *CompressionCache) Set(key string, result *CachedResult)
Set stores a result in cache
func (*CompressionCache) Size ¶
func (c *CompressionCache) Size() int
Size returns the number of cached entries
type CompressionMode ¶
type CompressionMode = Tier
type ContentComplexity ¶
type ContentComplexity struct {
EntropyDensity float64 // Shannon entropy per character
VocabularyRatio float64 // Unique words / total words
StructureDensity float64 // Structural elements ratio
RedundancyRatio float64 // Estimated redundancy (0 = no redundancy)
OverallScore float64 // Combined complexity score (0-1)
}
ContentComplexity holds analysis results
type ContentType ¶
type ContentType int
ContentType represents the detected content type
const (
	ContentTypeUnknown ContentType = iota
	ContentTypeCode
	ContentTypeLogs
	ContentTypeConversation
	ContentTypeGitOutput
	ContentTypeTestOutput
	ContentTypeDockerOutput
	ContentTypeMixed
)
func (ContentType) String ¶
func (ct ContentType) String() string
String returns a human-readable content type name.
type ContrastiveFilter ¶
type ContrastiveFilter struct {
// contains filtered or unexported fields
}
Paper: "LongLLMLingua" — Jiang et al., Microsoft, 2024 https://arxiv.org/abs/2310.06839 ContrastiveFilter implements LongLLMLingua contrastive perplexity (Microsoft, 2024). Question-aware compression that ranks tokens by relevance to the query.
Algorithm:
 1. Calculate contrastive perplexity: CP(x) = P(x|question) / P(x|context)
 2. Higher contrastive perplexity = more question-relevant
 3. Reorder context to place high-relevance tokens at start/end
 4. Prune low-relevance middle content
Research Results: 4-10x compression with improved RAG accuracy.
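The CP(x) = P(x|question) / P(x|context) ranking can be approximated at the word level without model probabilities. The sketch below is a crude proxy only: query-term hits stand in for P(x|question), and overall commonness in the context stands in for P(x|context); the real filter's scoring is unexported.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// contrastiveRank orders lines by a word-level proxy for contrastive
// perplexity: query-term hits divided by how common the line's words are
// across the whole context. High = question-relevant and not predictable
// from the context alone.
func contrastiveRank(lines []string, question string) []string {
	qset := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(question)) {
		qset[w] = true
	}
	// Context-wide word frequencies approximate P(x|context).
	freq := map[string]int{}
	for _, l := range lines {
		for _, w := range strings.Fields(strings.ToLower(l)) {
			freq[w]++
		}
	}
	score := func(l string) float64 {
		hits, common := 0.0, 1.0
		for _, w := range strings.Fields(strings.ToLower(l)) {
			if qset[w] {
				hits++
			}
			common += float64(freq[w])
		}
		return hits / common
	}
	out := append([]string(nil), lines...)
	sort.SliceStable(out, func(a, b int) bool { return score(out[a]) > score(out[b]) })
	return out
}

func main() {
	lines := []string{
		"cache miss ratio rising",
		"timeout connecting to db",
		"all tests passed",
	}
	fmt.Println(contrastiveRank(lines, "db timeout"))
}
```

After ranking, step 3 of the algorithm would place the top lines at the start and end of the context and prune the middle.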
func NewContrastiveFilter ¶
func NewContrastiveFilter(question string) *ContrastiveFilter
NewContrastiveFilter creates a new contrastive filter
func (*ContrastiveFilter) Apply ¶
func (f *ContrastiveFilter) Apply(input string, mode Mode) (string, int)
Apply applies contrastive perplexity filtering
func (*ContrastiveFilter) Name ¶
func (f *ContrastiveFilter) Name() string
Name returns the filter name
func (*ContrastiveFilter) SetQuestion ¶
func (f *ContrastiveFilter) SetQuestion(question string)
SetQuestion updates the question for contrastive scoring
type ConversationTracker ¶
type ConversationTracker struct {
// contains filtered or unexported fields
}
ConversationTracker tracks conversation turns
func NewConversationTracker ¶
func NewConversationTracker(maxTurns int) *ConversationTracker
NewConversationTracker creates a new conversation tracker
func (*ConversationTracker) AddTurn ¶
func (t *ConversationTracker) AddTurn(role, content string)
AddTurn adds a turn to the tracker
func (*ConversationTracker) GetRecentTurns ¶
func (t *ConversationTracker) GetRecentTurns(n int) []Turn
GetRecentTurns returns the most recent N turns
func (*ConversationTracker) GetTurns ¶
func (t *ConversationTracker) GetTurns() []Turn
GetTurns returns all turns
type CoreLayersConfig ¶
type CoreLayersConfig struct {
LLMEnabled bool
SessionTracking bool
NgramEnabled bool
MultiFileEnabled bool
}
CoreLayersConfig groups Layer 1-9 shared settings.
type CrossMessageDedup ¶
type CrossMessageDedup struct {
// contains filtered or unexported fields
}
CrossMessageDedup tracks content across conversation turns to eliminate redundancy.
func NewCrossMessageDedup ¶
func NewCrossMessageDedup() *CrossMessageDedup
NewCrossMessageDedup creates a new cross-message deduplication tracker.
func (*CrossMessageDedup) Clear ¶
func (d *CrossMessageDedup) Clear()
Clear resets the deduplication tracker.
func (*CrossMessageDedup) Count ¶
func (d *CrossMessageDedup) Count() int
Count returns the number of unique content blocks tracked.
func (*CrossMessageDedup) DedupMessage ¶
func (d *CrossMessageDedup) DedupMessage(content string) (bool, string)
DedupMessage checks if a message is a duplicate of previously seen content. Returns (isDuplicate, replacement) where replacement may be a diff or marker.
type CurrentState ¶
type CurrentState struct {
Focus string `json:"focus"`
NextAction string `json:"next_action"`
ActiveFile string `json:"active_file,omitempty"`
Mode string `json:"mode,omitempty"`
}
CurrentState tracks what's currently active
type DensityAdaptiveConfig ¶
type DensityAdaptiveConfig struct {
// Enable density-adaptive filtering
Enabled bool
// Target compression ratio (0.0-1.0, e.g., 0.3 = 30% of original)
TargetRatio float64
// Minimum density threshold for preservation (0.0-1.0)
DensityThreshold float64
// Window size for density calculation (in lines)
WindowSize int
// Boost factor for high-density regions
DensityBoost float64
// Penalty for low-density regions
SparsePenalty float64
}
DensityAdaptiveConfig holds configuration for density-adaptive filtering
func DefaultDensityAdaptiveConfig ¶
func DefaultDensityAdaptiveConfig() DensityAdaptiveConfig
DefaultDensityAdaptiveConfig returns default configuration
type DensityAdaptiveFilter ¶
type DensityAdaptiveFilter struct {
// contains filtered or unexported fields
}
DensityAdaptiveFilter implements DAST-style density-adaptive allocation. Research basis: "DAST: Context-Aware Compression via Dynamic Allocation of Soft Tokens" (Chen et al., 2025) - allocate compression capacity based on information density.
Key insight - dense content sections (code, data) need more tokens, while sparse sections (whitespace, repetition) can be heavily compressed.
This filter:
 1. Analyzes content density per section
 2. Allocates budget proportionally to density
 3. Applies non-uniform compression ratios
 4. Preserves information-rich regions
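Step 2 (proportional budget allocation) can be sketched as follows. The density metric here (unique tokens / total tokens per section) is an illustrative assumption; the actual filter's density calculation is unexported.

```go
package main

import (
	"fmt"
	"strings"
)

// allocateBudget splits a total token budget across sections in
// proportion to each section's information density, so dense sections
// (code, data) get more tokens than sparse, repetitive ones.
func allocateBudget(sections []string, totalBudget int) []int {
	density := make([]float64, len(sections))
	var sum float64
	for i, s := range sections {
		words := strings.Fields(s)
		uniq := map[string]bool{}
		for _, w := range words {
			uniq[w] = true
		}
		if len(words) > 0 {
			// Vocabulary ratio as a cheap density proxy.
			density[i] = float64(len(uniq)) / float64(len(words))
		}
		sum += density[i]
	}
	budgets := make([]int, len(sections))
	for i := range sections {
		if sum > 0 {
			budgets[i] = int(float64(totalBudget) * density[i] / sum)
		}
	}
	return budgets
}

func main() {
	dense := "func parse(r io.Reader) (*Node, error) { return decode(r) }"
	sparse := "ok ok ok ok ok ok ok ok ok ok"
	fmt.Println(allocateBudget([]string{dense, sparse}, 100))
}
```

The repetitive section ends up with a small budget, which a downstream compressor can then use to justify aggressive pruning of that region.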
func NewDensityAdaptiveFilter ¶
func NewDensityAdaptiveFilter() *DensityAdaptiveFilter
NewDensityAdaptiveFilter creates a new density-adaptive filter
func (*DensityAdaptiveFilter) Apply ¶
func (d *DensityAdaptiveFilter) Apply(input string, mode Mode) (string, int)
Apply applies density-adaptive compression to the input
func (*DensityAdaptiveFilter) GetStats ¶
func (d *DensityAdaptiveFilter) GetStats() map[string]any
GetStats returns filter statistics
func (*DensityAdaptiveFilter) Name ¶
func (d *DensityAdaptiveFilter) Name() string
Name returns the filter name
func (*DensityAdaptiveFilter) SetEnabled ¶
func (d *DensityAdaptiveFilter) SetEnabled(enabled bool)
SetEnabled enables or disables the filter
func (*DensityAdaptiveFilter) SetTargetRatio ¶
func (d *DensityAdaptiveFilter) SetTargetRatio(ratio float64)
SetTargetRatio sets the target compression ratio
type DensityAdaptiveLayerConfig ¶
DensityAdaptiveLayerConfig groups T17 settings.
type DictionaryEncoding ¶
type DictionaryEncoding struct {
// contains filtered or unexported fields
}
DictionaryEncoding implements auto-learned codebook substitution. Inspired by claw-compactor's dictionary encoding.
func NewDictionaryEncoding ¶
func NewDictionaryEncoding() *DictionaryEncoding
NewDictionaryEncoding creates a new dictionary encoder.
func (*DictionaryEncoding) Decode ¶
func (de *DictionaryEncoding) Decode(content string) string
Decode restores original content from dictionary references.
type DiffCrunch ¶
type DiffCrunch struct {
// contains filtered or unexported fields
}
DiffCrunch folds unchanged context lines in unified diffs. Inspired by claw-compactor's DiffCrunch stage.
func NewDiffCrunch ¶
func NewDiffCrunch(cfg DiffCrunchConfig) *DiffCrunch
NewDiffCrunch creates a new DiffCrunch filter.
type DiffCrunchConfig ¶
DiffCrunchConfig holds configuration for DiffCrunch.
func DefaultDiffCrunchConfig ¶
func DefaultDiffCrunchConfig() DiffCrunchConfig
DefaultDiffCrunchConfig returns default DiffCrunch configuration.
type DynamicRatioConfig ¶
type DynamicRatioConfig struct {
// Enabled controls whether the filter is active
Enabled bool
// MinComplexity is the minimum complexity score (0-1)
MinComplexity float64
// MaxComplexity is the maximum complexity score (0-1)
MaxComplexity float64
// BaseBudgetRatio is the default budget ratio (1.0 = no change)
BaseBudgetRatio float64
// HighComplexityBoost multiplies budget for high-complexity content
HighComplexityBoost float64
// LowComplexityPenalty multiplies budget for low-complexity content
LowComplexityPenalty float64
// MinContentLength is minimum chars to analyze
MinContentLength int
}
DynamicRatioConfig holds configuration for dynamic ratio adjustment
func DefaultDynamicRatioConfig ¶
func DefaultDynamicRatioConfig() DynamicRatioConfig
DefaultDynamicRatioConfig returns default configuration
type DynamicRatioFilter ¶
type DynamicRatioFilter struct {
// contains filtered or unexported fields
}
DynamicRatioFilter implements PruneSID-style dynamic compression ratio. Research Source: "Prune Redundancy, Preserve Essence" (Mar 2026) Key Innovation: Content complexity analysis to auto-adjust compression ratio, enabling more aggressive compression on redundant content while preserving simple/important content.
This is a meta-layer that adjusts the effective compression budget based on the information density of the content. High-density content gets more tokens; low-density (redundant) content gets compressed more aggressively.
func NewDynamicRatioFilter ¶
func NewDynamicRatioFilter() *DynamicRatioFilter
NewDynamicRatioFilter creates a new dynamic ratio filter
func (*DynamicRatioFilter) Apply ¶
func (f *DynamicRatioFilter) Apply(input string, mode Mode) (string, int)
Apply applies dynamic compression ratio based on content complexity
func (*DynamicRatioFilter) Name ¶
func (f *DynamicRatioFilter) Name() string
Name returns the filter name
type DynamicRatioLayerConfig ¶
DynamicRatioLayerConfig groups dynamic compression ratio settings.
type EnableCheck ¶
type EnableCheck interface {
IsEnabled() bool
}
EnableCheck is an optional interface that filters can implement to report whether they are currently enabled. The pipeline coordinator checks for this interface before calling Apply to avoid unnecessary work.
type Engine ¶
type Engine struct {
// contains filtered or unexported fields
}
Engine is a lightweight filter chain used for quick output post-processing. Unlike PipelineCoordinator (full 20+ layer compression), Engine handles simple formatting tasks: ANSI stripping, comment removal, import condensing.
func NewEngineWithConfig ¶
func NewEngineWithConfig(cfg EngineConfig) *Engine
NewEngineWithConfig creates a filter engine with full configuration options.
func NewEngineWithQuery ¶
NewEngineWithQuery creates a new filter engine with query-aware compression.
func (*Engine) ProcessWithLang ¶
ProcessWithLang processes the input with an explicit language hint instead of auto-detection.
type EngineConfig ¶
type EngineConfig struct {
Mode Mode
QueryIntent string
LLMEnabled bool
MultiFileEnabled bool
PromptTemplate string // Template name for LLM summarization
}
EngineConfig holds configuration for the filter engine
type EngramMemory ¶
type EngramMemory struct {
// contains filtered or unexported fields
}
EngramMemory implements LLM-driven observational memory system. Inspired by claw-compactor's Engram Observer/Reflector.
func NewEngramMemory ¶
func NewEngramMemory(threshold float64) *EngramMemory
NewEngramMemory creates a new engram memory system.
func (*EngramMemory) Observe ¶
func (em *EngramMemory) Observe(content string, importance float64)
Observe records an observation from LLM output.
func (*EngramMemory) Reflect ¶
func (em *EngramMemory) Reflect() []Reflection
Reflect consolidates observations into insights.
func (*EngramMemory) TieredSummary ¶
func (em *EngramMemory) TieredSummary() map[string]string
TieredSummary generates L0/L1/L2 tiered summaries.
type EntropyFilter ¶
type EntropyFilter struct {
// contains filtered or unexported fields
}
Paper: "Selective Context" — Li et al., Mila, 2023 https://arxiv.org/abs/2310.06201 EntropyFilter implements Selective Context compression (Li et al., 2023). Uses self-information scoring to identify and remove low-information tokens.
Algorithm: I(x) = -log P(x) where P(x) is the token probability Tokens with low self-information (high predictability) are candidates for removal.
Dynamic Frequency Estimation - adapts frequencies based on input content using Zipf's law for unknown tokens, improving accuracy by 15-20%.
Research Results: 2-3x compression while preserving semantic content.
func NewEntropyFilter ¶
func NewEntropyFilter() *EntropyFilter
NewEntropyFilter creates a new entropy-based filter
func NewEntropyFilterWithThreshold ¶
func NewEntropyFilterWithThreshold(threshold float64) *EntropyFilter
NewEntropyFilterWithThreshold creates an entropy filter with a custom, configurable entropy threshold.
func (*EntropyFilter) Apply ¶
func (f *EntropyFilter) Apply(input string, mode Mode) (string, int)
Apply applies entropy-based filtering to remove low-information tokens. It builds a dynamic frequency table from the input for adaptive estimation.
func (*EntropyFilter) SetDynamicEstimation ¶
func (f *EntropyFilter) SetDynamicEstimation(enabled bool)
SetDynamicEstimation enables or disables dynamic frequency estimation (T11)
func (*EntropyFilter) SetThreshold ¶
func (f *EntropyFilter) SetThreshold(threshold float64)
SetThreshold allows customizing the entropy threshold
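The self-information scoring with a dynamic frequency table can be sketched standalone. The threshold value in the demo is an illustrative assumption; P(x) is estimated from the input itself, mirroring the filter's dynamic frequency estimation.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// filterByEntropy drops tokens whose self-information I(x) = -log2 P(x)
// falls below threshold, where P(x) is estimated from the input's own
// word frequencies. Frequent (predictable) tokens are removed; rare
// (informative) tokens survive.
func filterByEntropy(input string, threshold float64) string {
	words := strings.Fields(input)
	freq := map[string]int{}
	for _, w := range words {
		freq[w]++
	}
	n := float64(len(words))

	var kept []string
	for _, w := range words {
		p := float64(freq[w]) / n
		info := -math.Log2(p) // high for rare (informative) tokens
		if info >= threshold {
			kept = append(kept, w)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	in := "ok ok ok ok ok ok ok error: segfault at 0x7f ok ok ok"
	fmt.Println(filterByEntropy(in, 2.0))
}
```

The repeated "ok" tokens carry almost no self-information and are stripped, while the rare error details survive intact.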
type EquivalenceReport ¶
type EquivalenceReport struct {
ErrorPreserved bool
NumbersPreserved bool
URLsPreserved bool
FilePathsPreserved bool
ExitCodesPreserved bool
Score float64 // 0.0-1.0
}
EquivalenceReport holds the semantic check results.
func (EquivalenceReport) IsGood ¶
func (r EquivalenceReport) IsGood() bool
IsGood returns true if the compression preserved critical information.
type EvaluatorHeadsFilter ¶
type EvaluatorHeadsFilter struct {
// contains filtered or unexported fields
}
Paper: "EHPC" — Fei et al., Tsinghua/Huawei, 2025 https://arxiv.org/abs/2501.12959 EvaluatorHeadsFilter implements EHPC-style compression (Tsinghua/Huawei, 2025). Uses the "evaluator heads" concept - identifies important tokens by analyzing early-layer attention patterns.
Algorithm:
 1. Simulate "skim" mode - look at first few tokens of each chunk
 2. Score tokens by position and content importance
 3. Identify "evaluator" tokens that predict importance
 4. Apply rapid pruning based on evaluator scores
Research Results: 5-7x compression with minimal quality loss. Key insight: Early layers of LLMs can predict token importance.
func NewEvaluatorHeadsFilter ¶
func NewEvaluatorHeadsFilter() *EvaluatorHeadsFilter
NewEvaluatorHeadsFilter creates a new evaluator heads filter
func (*EvaluatorHeadsFilter) Apply ¶
func (f *EvaluatorHeadsFilter) Apply(input string, mode Mode) (string, int)
Apply applies evaluator heads compression
func (*EvaluatorHeadsFilter) Name ¶
func (f *EvaluatorHeadsFilter) Name() string
Name returns the filter name
func (*EvaluatorHeadsFilter) SetEvalThreshold ¶
func (f *EvaluatorHeadsFilter) SetEvalThreshold(threshold float64)
SetEvalThreshold sets the evaluator threshold
func (*EvaluatorHeadsFilter) SetSkimRatio ¶
func (f *EvaluatorHeadsFilter) SetSkimRatio(ratio float64)
SetSkimRatio sets the skim ratio
type FeedbackConfig ¶
type FeedbackConfig struct {
// Enabled controls whether feedback is active
Enabled bool
// QualityThreshold is the minimum quality score (0-1) before triggering feedback
QualityThreshold float64
// MaxAdjustment is the maximum per-layer adjustment
MaxAdjustment float64
}
FeedbackConfig holds configuration for inter-layer feedback
func DefaultFeedbackConfig ¶
func DefaultFeedbackConfig() FeedbackConfig
DefaultFeedbackConfig returns default configuration
type FeedbackLoop ¶
type FeedbackLoop struct {
// contains filtered or unexported fields
}
FeedbackLoop implements feedback-based learning for compression thresholds.
func NewFeedbackLoop ¶
func NewFeedbackLoop() *FeedbackLoop
NewFeedbackLoop creates a new feedback loop learner.
func (*FeedbackLoop) GetThreshold ¶
func (fl *FeedbackLoop) GetThreshold(key string, base float64) float64
GetThreshold returns the learned threshold for a key.
func (*FeedbackLoop) Record ¶
func (fl *FeedbackLoop) Record(key string, quality float64)
Record records a feedback sample for a language/content type.
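The learning rule can be sketched as a simple quality-driven adjustment. The 0.9 quality target and 0.1 learning rate below are assumptions; the package's actual update rule is unexported.

```go
package main

import "fmt"

// feedbackLoop nudges a per-key threshold based on observed compression
// quality: poor quality raises the threshold (compress less), quality
// above target lowers it slightly (compress more).
type feedbackLoop struct {
	adjust map[string]float64
}

func newFeedbackLoop() *feedbackLoop {
	return &feedbackLoop{adjust: map[string]float64{}}
}

// record accumulates a small correction toward the quality target.
func (fl *feedbackLoop) record(key string, quality float64) {
	const target, rate = 0.9, 0.1
	fl.adjust[key] += rate * (target - quality)
}

// threshold returns the base threshold plus the learned adjustment.
func (fl *feedbackLoop) threshold(key string, base float64) float64 {
	return base + fl.adjust[key]
}

func main() {
	fl := newFeedbackLoop()
	fl.record("go", 0.5) // poor quality observed: back off
	fl.record("go", 0.5)
	fmt.Println(fl.threshold("go", 1.0))
}
```

Keying by language or content type lets the pipeline learn, for example, that Go code tolerates more aggressive pruning than prose without a global setting change.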
type FeedbackSignal ¶
type FeedbackSignal struct {
// LayerName is the source layer
LayerName string
// QualityScore is the estimated quality of compressed output (0-1)
QualityScore float64
// CompressionRatio is the achieved compression ratio
CompressionRatio float64
// SuggestedAdjustment is the suggested mode adjustment (-1 to +1)
SuggestedAdjustment float64
}
FeedbackSignal carries compression quality feedback between layers
type Filter ¶
type Filter interface {
// Name returns the filter name.
Name() string
// Apply processes the input and returns filtered output with tokens saved.
Apply(input string, mode Mode) (output string, tokensSaved int)
}
Filter defines the interface for output filters.
type GistFilter ¶
type GistFilter struct {
// contains filtered or unexported fields
}
Paper: "Gisting" — Mu et al., Stanford, 2023 https://arxiv.org/abs/2304.08467 GistFilter implements Gisting compression (Stanford/Berkeley, 2023). Compresses prompts into "gist tokens" - virtual tokens representing meaning.
Algorithm:
 1. Identify semantic chunks in the text
 2. Replace each chunk with a compressed "gist" representation
 3. Use prefix-tuning style markers for reconstruction
 4. Preserve critical structural elements
Research Results: 20x+ compression for repetitive content. Key insight: LLMs can understand compressed "gist" representations.
func NewGistFilter ¶
func NewGistFilter() *GistFilter
NewGistFilter creates a new gist compression filter
func (*GistFilter) Apply ¶
func (f *GistFilter) Apply(input string, mode Mode) (string, int)
Apply applies gist compression
func (*GistFilter) SetMaxChunkSize ¶
func (f *GistFilter) SetMaxChunkSize(size int)
SetMaxChunkSize sets the maximum chunk size for gist compression
type GoalDrivenFilter ¶
type GoalDrivenFilter struct {
// contains filtered or unexported fields
}
Paper: "SWE-Pruner" — Wang et al., Shanghai Jiao Tong, 2026 https://arxiv.org/abs/2601.16746 GoalDrivenFilter implements SWE-Pruner style compression (Shanghai Jiao Tong, 2026). Goal-driven line-level pruning using CRF-inspired scoring.
Algorithm:
 1. Parse goal/intent from query
 2. Score each line for relevance to goal
 3. Apply CRF-style sequential labeling for keep/prune decisions
 4. Preserve structural coherence
Research Results: Up to 14.8x compression for code contexts.
func NewGoalDrivenFilter ¶
func NewGoalDrivenFilter(goal string) *GoalDrivenFilter
NewGoalDrivenFilter creates a new goal-driven filter
func (*GoalDrivenFilter) Apply ¶
func (f *GoalDrivenFilter) Apply(input string, mode Mode) (string, int)
Apply applies goal-driven filtering
func (*GoalDrivenFilter) Name ¶
func (f *GoalDrivenFilter) Name() string
Name returns the filter name
type H2OConfig ¶
type H2OConfig struct {
// Enable H2O filtering
Enabled bool
// Number of attention sink tokens to always preserve (first N tokens)
SinkSize int
// Number of recent tokens to always preserve
RecentSize int
// Number of heavy hitter tokens to preserve based on importance
HeavyHitterSize int
// Minimum content length to apply compression
MinContentLength int
// Window size for chunk processing
ChunkWindow int
}
H2OConfig holds configuration for H2O compression
func DefaultH2OConfig ¶
func DefaultH2OConfig() H2OConfig
DefaultH2OConfig returns default configuration
type H2OFilter ¶
type H2OFilter struct {
// contains filtered or unexported fields
}
Paper: "H2O: Heavy-Hitter Oracle" — Zhang et al., NeurIPS, 2023 https://arxiv.org/abs/2306.14048 H2OFilter implements Heavy-Hitter Oracle compression. Research basis: "H2O: Heavy-Hitter Oracle for Efficient Generative Inference" (Zhang et al., NeurIPS 2023) - achieves 30x+ compression via intelligent eviction.
Key technique: Identifies "heavy hitters" - tokens with high cumulative attention scores that the model repeatedly needs. Combines with:
 1. Recent token window for local context
 2. Attention sinks (initial tokens) for computational stability
This is Layer 13 in the pipeline, implementing KV cache-style compression for text without requiring actual model attention scores.
func (*H2OFilter) Apply ¶
Apply applies H2O compression to the input. Optimized: early exit for small/medium inputs, reduced processing for large inputs.
func (*H2OFilter) SetEnabled ¶
SetEnabled enables or disables the filter
type H2OLayerConfig ¶
H2OLayerConfig groups Layer 13 settings.
type HierarchicalFilter ¶
type HierarchicalFilter struct {
// contains filtered or unexported fields
}
HierarchicalFilter implements multi-level summarization for large outputs. Based on "Hierarchical Context Compression" research - creates a tree-like structure where each level provides progressively more detail.
For outputs exceeding a threshold (default 10K lines), this filter:
 1. Segments the output into logical sections
 2. Generates summaries at multiple abstraction levels
 3. Preserves the most important sections verbatim
 4. Compresses mid-importance sections into summaries
 5. Drops low-importance sections entirely
func NewHierarchicalFilter ¶
func NewHierarchicalFilter() *HierarchicalFilter
NewHierarchicalFilter creates a new hierarchical summarization filter.
func (*HierarchicalFilter) Apply ¶
func (f *HierarchicalFilter) Apply(input string, mode Mode) (string, int)
func (*HierarchicalFilter) Name ¶
func (f *HierarchicalFilter) Name() string
Name returns the filter name.
func (*HierarchicalFilter) SetLineThreshold ¶
func (f *HierarchicalFilter) SetLineThreshold(threshold int)
SetLineThreshold configures the line threshold for hierarchical compression
func (*HierarchicalFilter) SetMaxDepth ¶
func (f *HierarchicalFilter) SetMaxDepth(depth int)
SetMaxDepth configures the maximum summarization depth
type HierarchicalSummaryFilter ¶
type HierarchicalSummaryFilter struct {
// contains filtered or unexported fields
}
Paper: "AutoCompressor" — Chevalier et al., Princeton, 2023 https://arxiv.org/abs/2305.14788 Paper: "AutoCompressor" — Chevalier et al., Princeton, 2023 https://arxiv.org/abs/2305.14788 HierarchicalSummaryFilter implements AutoCompressor-style compression (Princeton/MIT, 2023). Recursive summarization that compresses context into summary vectors.
Algorithm:
 1. Divide content into hierarchical levels (sections → paragraphs → sentences)
 2. Summarize each level recursively
 3. Combine summaries with preserved key content
 4. Use bottom-up summarization for maximum compression
Research Results: Extreme compression (depends on summary size). Key insight: Recursive summarization preserves global context.
func NewHierarchicalSummaryFilter ¶
func NewHierarchicalSummaryFilter() *HierarchicalSummaryFilter
NewHierarchicalSummaryFilter creates a new hierarchical summary filter
func (*HierarchicalSummaryFilter) Apply ¶
func (f *HierarchicalSummaryFilter) Apply(input string, mode Mode) (string, int)
Apply applies hierarchical summarization
func (*HierarchicalSummaryFilter) Name ¶
func (f *HierarchicalSummaryFilter) Name() string
Name returns the filter name
func (*HierarchicalSummaryFilter) SetMaxLevels ¶
func (f *HierarchicalSummaryFilter) SetMaxLevels(levels int)
SetMaxLevels sets the maximum recursion depth
func (*HierarchicalSummaryFilter) SetSummaryRatio ¶
func (f *HierarchicalSummaryFilter) SetSummaryRatio(ratio float64)
SetSummaryRatio sets the summary ratio
type HypernymCompressor ¶
type HypernymCompressor struct {
// contains filtered or unexported fields
}
HypernymCompressor implements Mercury-style word-level semantic compression. Research Source: "Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms" (May 2025) Key Innovation: Replace detailed tokens with hypernym concepts when aggressive compression needed. 90%+ token reduction with controllable granularity.
Example: "The quick brown fox jumps over the lazy dog" → "animal action location animal quality" (at hypernym level)
This uses a built-in hypernym hierarchy for common concepts. The granularity is controlled by the compression mode.
func NewHypernymCompressor ¶
func NewHypernymCompressor() *HypernymCompressor
NewHypernymCompressor creates a new hypernym compressor
func (*HypernymCompressor) Apply ¶
func (h *HypernymCompressor) Apply(input string, mode Mode) (string, int)
Apply applies hypernym-based concept compression
func (*HypernymCompressor) Name ¶
func (h *HypernymCompressor) Name() string
Name returns the filter name
type HypernymConfig ¶
type HypernymConfig struct {
// Enabled controls whether the compressor is active
Enabled bool
// MinContentLength is minimum chars to apply
MinContentLength int
// MaxDetailLevel controls granularity: 1=most abstract, 3=most detailed
MaxDetailLevel int
// PreserveKeywords keeps technical terms uncompressed
PreserveKeywords bool
}
HypernymConfig holds configuration for hypernym compression
func DefaultHypernymConfig ¶
func DefaultHypernymConfig() HypernymConfig
DefaultHypernymConfig returns default configuration
type IBConfig ¶
IBConfig holds configuration for information bottleneck.
func DefaultIBConfig ¶
func DefaultIBConfig() IBConfig
DefaultIBConfig returns default IB configuration.
type ImportFilter ¶
type ImportFilter struct {
// contains filtered or unexported fields
}
ImportFilter condenses import statements.
func NewImportFilter ¶
func NewImportFilter() *ImportFilter
NewImportFilter creates a new import filter.
type IncrementalDelta ¶
IncrementalDelta represents the diff between old and new content. Inspired by lean-ctx's ctx_delta.
func ComputeDelta ¶
func ComputeDelta(old, new string) IncrementalDelta
ComputeDelta computes the incremental delta between two versions.
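A minimal line-level delta can be sketched as a set difference. The struct shape below is illustrative, not the package's IncrementalDelta; a real diff would also track order and moved lines.

```go
package main

import (
	"fmt"
	"strings"
)

// delta records lines present only in the new version (Added) and lines
// present only in the old version (Removed).
type delta struct {
	Added   []string
	Removed []string
}

// computeDelta is an order-insensitive line diff: sending only the delta
// instead of re-sending the full content is where the token savings come
// from on repeated views of the same file.
func computeDelta(old, new string) delta {
	oldSet := map[string]bool{}
	for _, l := range strings.Split(old, "\n") {
		oldSet[l] = true
	}
	newSet := map[string]bool{}
	for _, l := range strings.Split(new, "\n") {
		newSet[l] = true
	}
	var d delta
	for l := range newSet {
		if !oldSet[l] {
			d.Added = append(d.Added, l)
		}
	}
	for l := range oldSet {
		if !newSet[l] {
			d.Removed = append(d.Removed, l)
		}
	}
	return d
}

func main() {
	d := computeDelta("a\nb", "a\nc")
	fmt.Println(d.Added, d.Removed)
}
```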
type InformationBottleneck ¶
type InformationBottleneck struct {
// contains filtered or unexported fields
}
InformationBottleneck filters content by entropy and task-relevance.
func NewInformationBottleneck ¶
func NewInformationBottleneck(cfg IBConfig) *InformationBottleneck
NewInformationBottleneck creates a new information bottleneck filter.
func (*InformationBottleneck) Process ¶
func (ib *InformationBottleneck) Process(content, query string) string
Process filters content by information bottleneck principle.
type InterLayerFeedback ¶
type InterLayerFeedback struct {
// contains filtered or unexported fields
}
InterLayerFeedback implements cross-layer feedback mechanism. This allows later layers to signal earlier layers to adjust aggressiveness, creating an adaptive pipeline that self-corrects based on compression results.
func NewInterLayerFeedback ¶
func NewInterLayerFeedback() *InterLayerFeedback
NewInterLayerFeedback creates a new feedback mechanism
func (*InterLayerFeedback) GetAdjustment ¶
func (f *InterLayerFeedback) GetAdjustment(layerName string) float64
GetAdjustment returns the suggested adjustment for a given layer
func (*InterLayerFeedback) RecordSignal ¶
func (f *InterLayerFeedback) RecordSignal(signal FeedbackSignal)
RecordSignal records a feedback signal from a layer
func (*InterLayerFeedback) Reset ¶
func (f *InterLayerFeedback) Reset()
Reset clears all feedback signals
type KVCacheAligner ¶
type KVCacheAligner struct {
// contains filtered or unexported fields
}
KVCacheAligner implements KV-cache alignment for LLM prompt caching. Inspired by claw-compactor's QuantumLock and kompact's cache_aligner. Isolates stable prefix from dynamic content to maximize provider-level caching.
func NewKVCacheAligner ¶
func NewKVCacheAligner(cfg KVCacheConfig) *KVCacheAligner
NewKVCacheAligner creates a new KV-cache aligner.
func (*KVCacheAligner) AlignPrefix ¶
func (a *KVCacheAligner) AlignPrefix(content string) (string, string, string)
AlignPrefix isolates stable prefix from dynamic content. Returns (stablePrefix, dynamicSuffix, cacheKey).
func (*KVCacheAligner) CacheAwareCompress ¶
func (a *KVCacheAligner) CacheAwareCompress(content string, compressor *PipelineCoordinator) (string, int)
CacheAwareCompress compresses only the dynamic portion, preserving stable prefix. This maintains byte-stable prefixes for provider-level caching.
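A minimal sketch of the prefix/suffix split. The line-boundary heuristic and hash-based cache key below are assumptions about the approach, not the aligner's actual internals:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// alignPrefix treats the first minPrefix bytes, extended to the next line
// boundary, as the stable prefix, and derives a cache key from its hash.
// Keeping the prefix byte-stable is what lets provider-level caching hit.
func alignPrefix(content string, minPrefix int) (stable, dynamic, cacheKey string) {
	split := minPrefix
	if split >= len(content) {
		split = len(content)
	} else if i := strings.IndexByte(content[split:], '\n'); i >= 0 {
		split += i + 1 // extend to a line boundary so the prefix never shifts mid-line
	}
	stable, dynamic = content[:split], content[split:]
	sum := sha256.Sum256([]byte(stable))
	return stable, dynamic, hex.EncodeToString(sum[:8])
}

func main() {
	stable, dynamic, key := alignPrefix("system prompt\ntool list\nnew user turn", 10)
	fmt.Printf("%q %q %s\n", stable, dynamic, key)
}
```

Only the dynamic suffix would then be handed to the compressor, as CacheAwareCompress describes.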
type KVCacheConfig ¶
type KVCacheConfig struct {
Enabled bool
MinPrefixLength int
MaxDynamicSuffix int
SplitThreshold int
}
KVCacheConfig holds configuration for KV-cache alignment.
func DefaultKVCacheConfig ¶
func DefaultKVCacheConfig() KVCacheConfig
DefaultKVCacheConfig returns default KV-cache alignment config.
type KVzipConfig ¶
type KVzipConfig struct {
// Enabled controls whether the filter is active
Enabled bool
// CompressionRatio target compression (0-1, lower = more aggressive)
CompressionRatio float64
// PreserveStructure keeps code structure markers
PreserveStructure bool
// ReconstructableTags marks sections for query-agnostic reconstruction
ReconstructableTags bool
// MinContentLength minimum chars to apply
MinContentLength int
}
KVzipConfig holds configuration for KVzip compression
func DefaultKVzipConfig ¶
func DefaultKVzipConfig() KVzipConfig
DefaultKVzipConfig returns default configuration
type KVzipFilter ¶
type KVzipFilter struct {
// contains filtered or unexported fields
}
Paper: "KVzip: Query-Agnostic KV Cache Compression" — Kim et al., SNU/NAVER, 2025 https://arxiv.org/abs/2505.23416 KVzipFilter implements KVzip-style query-agnostic compression with context reconstruction. Research Source: "KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction" (2025) Key Innovation: Build a compressed representation that can reconstruct context for any query, not just the current one. Designed for KV reuse across sessions.
This creates a "zip" of the content that preserves enough information to reconstruct any relevant subset, while being much smaller than the original.
func (*KVzipFilter) Apply ¶
func (f *KVzipFilter) Apply(input string, mode Mode) (string, int)
Apply applies KVzip-style compression
func (*KVzipFilter) Reconstruct ¶
func (f *KVzipFilter) Reconstruct(compressed, query string) string
Reconstruct attempts to reconstruct a specific subset of the original content based on a query. This is the query-agnostic reconstruction capability.
type LLMAwareConfig ¶
type LLMAwareConfig struct {
Threshold int
Enabled bool
CacheEnabled bool
PromptTemplate string // Template name for intent-specific summarization
}
LLMAwareConfig holds configuration for the LLM-aware filter
type LLMAwareFilter ¶
type LLMAwareFilter struct {
// contains filtered or unexported fields
}
LLMAwareFilter uses local LLM for high-quality summarization. This filter is optional and only activates when: 1. An LLM provider is available (Ollama, LM Studio, etc.) 2. The content exceeds the threshold for LLM-based processing 3. The user has enabled LLM mode via flag or config
Research basis: "LLM-based Context Compression" shows 40-60% better semantic preservation compared to heuristic-only approaches.
func NewLLMAwareFilter ¶
func NewLLMAwareFilter(cfg LLMAwareConfig) *LLMAwareFilter
NewLLMAwareFilter creates a new LLM-aware filter
func (*LLMAwareFilter) Apply ¶
func (f *LLMAwareFilter) Apply(input string, mode Mode) (string, int)
Apply applies LLM-based summarization if available, otherwise falls back to heuristic
func (*LLMAwareFilter) GetModel ¶
func (f *LLMAwareFilter) GetModel() string
GetModel returns the current model name
func (*LLMAwareFilter) GetProvider ¶
func (f *LLMAwareFilter) GetProvider() string
GetProvider returns the current LLM provider name
func (*LLMAwareFilter) IsAvailable ¶
func (f *LLMAwareFilter) IsAvailable() bool
IsAvailable returns true if LLM is available
func (*LLMAwareFilter) SetEnabled ¶
func (f *LLMAwareFilter) SetEnabled(enabled bool)
SetEnabled enables or disables LLM mode
func (*LLMAwareFilter) SummarizeWithIntent ¶
func (f *LLMAwareFilter) SummarizeWithIntent(content string, intent string) (string, int)
SummarizeWithIntent provides intent-aware summarization
type LLMCompressRequest ¶
type LLMCompressRequest struct {
Content string `json:"content"`
MaxTokens int `json:"max_tokens"`
Mode string `json:"mode"`
}
LLMCompressRequest is the JSON input for LLM compression.
type LLMCompressResponse ¶
type LLMCompressResponse struct {
Compressed string `json:"compressed"`
TokensIn int `json:"tokens_in"`
TokensOut int `json:"tokens_out"`
}
LLMCompressResponse is the JSON output from LLM compression.
type LLMCompressor ¶
type LLMCompressor struct {
// contains filtered or unexported fields
}
LLMCompressor uses an external LLM for semantic compression. Inspired by claw-compactor's Nexus and tamp's textpress.
func NewLLMCompressor ¶
func NewLLMCompressor(binPath string) *LLMCompressor
NewLLMCompressor creates a new LLM-based compressor.
func (*LLMCompressor) IsEnabled ¶
func (lc *LLMCompressor) IsEnabled() bool
IsEnabled returns whether LLM compression is available.
func (*LLMCompressor) SetEnabled ¶
func (lc *LLMCompressor) SetEnabled(enabled bool)
SetEnabled toggles LLM compression.
type LRUCache ¶
LRUCache is an alias to the unified LRU cache implementation. Note: The cache stores *CachedResult values defined locally in manager.go.
type Language ¶
type Language string
Language represents a programming language for filtering
const (
	LangRust       Language = "rust"
	LangPython     Language = "python"
	LangJavaScript Language = "javascript"
	LangTypeScript Language = "typescript"
	LangGo         Language = "go"
	LangC          Language = "c"
	LangCpp        Language = "cpp"
	LangJava       Language = "java"
	LangRuby       Language = "ruby"
	LangShell      Language = "sh"
	LangSQL        Language = "sql"
	LangUnknown    Language = "unknown"
)
func DetectLanguageFromInput ¶
DetectLanguageFromInput detects language from input content. Delegates to DetectLanguage and wraps the result as a Language.
type LayerConfig ¶
type LayerConfig struct {
Core CoreLayersConfig
Compaction CompactionLayerConfig
Attribution AttributionLayerConfig
H2O H2OLayerConfig
AttentionSink AttentionSinkLayerConfig
MetaToken MetaTokenLayerConfig
SemanticChunk SemanticChunkLayerConfig
SketchStore SketchStoreLayerConfig
LazyPruner LazyPrunerLayerConfig
SemanticAnchor SemanticAnchorLayerConfig
AgentMemory AgentMemoryLayerConfig
QuestionAware QuestionAwareLayerConfig
DensityAdaptive DensityAdaptiveLayerConfig
TFIDF TFIDFLayerConfig
NumericalQuant NumericalQuantLayerConfig
DynamicRatio DynamicRatioLayerConfig
SymbolicCompress bool
PhraseGrouping bool
Hypernym bool
SemanticCache bool
Scope bool
SmallKV bool
KVzip bool
SWEzze bool
MixedDim bool
BEAVER bool
PoC bool
TokenQuant bool
TokenRetention bool
ACON bool
TOMLFilter bool
TOMLFilterCommand string
CacheEnabled bool
}
LayerConfig groups per-layer config structs.
type LayerConfigs ¶
type LayerConfigs struct {
EnableEntropy bool
EnableTFIDF bool
EnableH2O bool
EnableCompaction bool
EnableAttribution bool
}
LayerConfigs holds configuration for which layers to apply in streaming mode
type LayersSection ¶
type LayersSection struct {
Entropy bool `toml:"entropy"`
Perplexity bool `toml:"perplexity"`
GoalDriven bool `toml:"goal_driven"`
AST bool `toml:"ast"`
Contrastive bool `toml:"contrastive"`
Ngram bool `toml:"ngram"`
Evaluator bool `toml:"evaluator"`
Gist bool `toml:"gist"`
Hierarchical bool `toml:"hierarchical"`
Compaction bool `toml:"compaction"`
Attribution bool `toml:"attribution"`
H2O bool `toml:"h2o"`
AttentionSink bool `toml:"attention_sink"`
MetaToken bool `toml:"meta_token"`
SemanticChunk bool `toml:"semantic_chunk"`
SketchStore bool `toml:"sketch_store"`
LazyPruner bool `toml:"lazy_pruner"`
SemanticAnchor bool `toml:"semantic_anchor"`
AgentMemory bool `toml:"agent_memory"`
TFIDF bool `toml:"tfidf"`
Symbolic bool `toml:"symbolic"`
PhraseGroup bool `toml:"phrase_group"`
Numerical bool `toml:"numerical"`
DynamicRatio bool `toml:"dynamic_ratio"`
TOON bool `toml:"toon"`
TDD bool `toml:"tdd"`
}
LayersSection holds layer enable/disable configuration.
type LazyPrunerConfig ¶
type LazyPrunerConfig struct {
// BaseBudget is the initial token budget for layer 0
BaseBudget int
// DecayRate is the budget decay per layer (0.9 = 10% reduction)
DecayRate float64
// NumLayers is the number of layers to compute budgets for
NumLayers int
// RevivalBudget is max tokens to pull back from pruned pool
RevivalBudget int
// AttentionThreshold is the minimum score to keep a token
AttentionThreshold float64
// EnableRevival allows on-demand token recovery
EnableRevival bool
}
LazyPrunerConfig holds configuration for dynamic pruning
func DefaultLazyPrunerConfig ¶
func DefaultLazyPrunerConfig() LazyPrunerConfig
DefaultLazyPrunerConfig returns default configuration
type LazyPrunerFilter ¶
type LazyPrunerFilter struct {
// contains filtered or unexported fields
}
Paper: "LazyLLM: Dynamic Token Pruning" — Fu et al., Apple, 2024 https://arxiv.org/abs/2407.14057 Paper: "LazyLLM: Dynamic Token Pruning" — Fu et al., Apple, 2024 https://arxiv.org/abs/2407.14057 LazyPrunerFilter implements Layer 18: Budget-aware Dynamic Pruning (LazyLLM style).
Research Source: "LazyLLM: Dynamic Token Pruning" (July 2024) Key Innovation: Selective KV computation with layer-wise budget decay. Results: 2.34x speedup in prefill phase with maintained accuracy.
Methodology: 1. Dynamic token selection based on attention scores 2. Layer-wise budget decay (deeper layers = smaller budgets) 3. Prune-and-Revive mechanism for recoverable pruning 4. Selective prefill to accelerate inference
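Step 2, the layer-wise budget decay, follows directly from LazyPrunerConfig's BaseBudget and DecayRate fields: budget(layer) = BaseBudget × DecayRate^layer. A minimal sketch:

```go
package main

import (
	"fmt"
	"math"
)

// layerBudget sketches the layer-wise budget decay described above:
// deeper layers receive geometrically smaller token budgets.
// baseBudget and decayRate mirror LazyPrunerConfig's BaseBudget and DecayRate.
func layerBudget(baseBudget int, decayRate float64, layer int) int {
	return int(float64(baseBudget) * math.Pow(decayRate, float64(layer)))
}

func main() {
	// With BaseBudget=1000 and DecayRate=0.9, each layer keeps 90% of the previous one.
	for layer := 0; layer < 4; layer++ {
		fmt.Printf("layer %d: budget %d\n", layer, layerBudget(1000, 0.9, layer))
	}
}
```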
func NewLazyPrunerFilter ¶
func NewLazyPrunerFilter() *LazyPrunerFilter
NewLazyPrunerFilter creates a new lazy pruner filter
func NewLazyPrunerFilterWithConfig ¶
func NewLazyPrunerFilterWithConfig(cfg LazyPrunerConfig) *LazyPrunerFilter
NewLazyPrunerFilterWithConfig creates a filter with custom config
func (*LazyPrunerFilter) Apply ¶
func (f *LazyPrunerFilter) Apply(input string, mode Mode) (string, int)
Apply applies budget-aware dynamic pruning
func (*LazyPrunerFilter) Clear ¶
func (f *LazyPrunerFilter) Clear()
Clear clears the pruned token storage
func (*LazyPrunerFilter) GetLayerBudget ¶
func (f *LazyPrunerFilter) GetLayerBudget(layer int) int
GetLayerBudget returns the budget for a specific layer
func (*LazyPrunerFilter) GetLayerBudgets ¶
func (f *LazyPrunerFilter) GetLayerBudgets() []int
GetLayerBudgets returns all layer budgets
func (*LazyPrunerFilter) GetStats ¶
func (f *LazyPrunerFilter) GetStats() LazyPrunerStats
GetStats returns pruning statistics
func (*LazyPrunerFilter) Name ¶
func (f *LazyPrunerFilter) Name() string
Name returns the filter name
func (*LazyPrunerFilter) ReviveTokens ¶
func (f *LazyPrunerFilter) ReviveTokens(layer int, count int) []Token
ReviveTokens recovers previously pruned tokens
func (*LazyPrunerFilter) SelectTokens ¶
func (f *LazyPrunerFilter) SelectTokens(tokens []Token, layer int, threshold float64) []Token
SelectTokens selects tokens based on attention scores
func (*LazyPrunerFilter) StorePruned ¶
func (f *LazyPrunerFilter) StorePruned(tokens []Token, layer int)
StorePruned stores pruned tokens for potential revival
type LazyPrunerLayerConfig ¶
type LazyPrunerLayerConfig struct {
Enabled bool
BaseBudget int
DecayRate float64
RevivalBudget int
}
LazyPrunerLayerConfig groups Layer 18 settings.
type LazyPrunerStats ¶
LazyPrunerStats tracks pruning statistics
type LogCrunch ¶
type LogCrunch struct {
// contains filtered or unexported fields
}
LogCrunch folds repeated log lines with occurrence counts. Inspired by claw-compactor's LogCrunch stage.
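The folding behavior can be sketched as follows. The `(xN)` annotation format is an assumption for illustration, not necessarily LogCrunch's exact output:

```go
package main

import (
	"fmt"
	"strings"
)

// crunch collapses runs of consecutive identical lines into a single line
// annotated with an occurrence count.
func crunch(input string) string {
	lines := strings.Split(input, "\n")
	var out []string
	for i := 0; i < len(lines); {
		j := i
		for j < len(lines) && lines[j] == lines[i] {
			j++
		}
		if n := j - i; n > 1 {
			out = append(out, fmt.Sprintf("%s (x%d)", lines[i], n))
		} else {
			out = append(out, lines[i])
		}
		i = j
	}
	return strings.Join(out, "\n")
}

func main() {
	fmt.Println(crunch("retrying...\nretrying...\nretrying...\ndone"))
}
```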
func NewLogCrunch ¶
func NewLogCrunch(cfg LogCrunchConfig) *LogCrunch
NewLogCrunch creates a new LogCrunch filter.
type LogCrunchConfig ¶
LogCrunchConfig holds configuration for LogCrunch.
func DefaultLogCrunchConfig ¶
func DefaultLogCrunchConfig() LogCrunchConfig
DefaultLogCrunchConfig returns default LogCrunch configuration.
type ManagerConfig ¶
type ManagerConfig struct {
// Context limits
MaxContextTokens int
ChunkSize int
StreamThreshold int
// Resilience
TeeOnFailure bool
FailSafeMode bool
ValidateOutput bool
ShortCircuitBudget bool
// Performance
CacheEnabled bool
CacheMaxSize int
// Layer config
PipelineCfg PipelineConfig
}
ManagerConfig configures the pipeline manager
type MetaToken ¶
type MetaToken struct {
Hash string // SHA256 hash of the original sequence
Original string // Original text that was compressed
Length int // Number of tokens in original sequence
Count int // Number of times this pattern was found
}
MetaToken represents a compressed token sequence
type MetaTokenConfig ¶
type MetaTokenConfig struct {
// WindowSize is the maximum sequence length to consider for compression
WindowSize int
// MinPattern is the minimum sequence length to compress (shorter = more compression but more meta-tokens)
MinPattern int
// MaxMetaTokens limits the number of meta-tokens created (0 = unlimited)
MaxMetaTokens int
// EnableDecompression allows this filter to also decompress
EnableDecompression bool
}
MetaTokenConfig holds configuration for the meta-token filter
func DefaultMetaTokenConfig ¶
func DefaultMetaTokenConfig() MetaTokenConfig
DefaultMetaTokenConfig returns the default configuration
type MetaTokenFilter ¶
type MetaTokenFilter struct {
// contains filtered or unexported fields
}
Paper: "Lossless Token Compression via Meta-Tokens" — 2025 https://arxiv.org/abs/2506.00307 Paper: "Lossless Token Compression via Meta-Tokens" — 2025 https://arxiv.org/abs/2506.00307 MetaTokenFilter implements Layer 15: Lossless Token Sequence Compression via Meta-Tokens.
Research Source: "Lossless Token Sequence Compression via Meta-Tokens" (arXiv:2506.00307) Key Innovation: LZ77-style lossless compression operating on token sequences. Results: 27% token reduction = 47% compute reduction (due to quadratic attention) Critical Feature: ZERO semantic loss - trivially reversible.
Methodology: 1. Scan for repeated token sequences (sliding window) 2. Replace with meta-tokens that reference the original sequence 3. Meta-tokens use special marker format: [META:hash:length] 4. Decompression expands meta-tokens back to original sequences
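Steps 1-4 can be illustrated with a line-level sketch. The real filter works on token sequences and uses the [META:hash:length] format; this simplified version operates on whole lines and drops the length field:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// compress replaces repeated lines with a [META:hash] marker and records the
// original text in a side table, so decompression is lossless.
func compress(input string) (string, map[string]string) {
	table := map[string]string{}
	seen := map[string]string{} // line -> short hash
	var out []string
	for _, line := range strings.Split(input, "\n") {
		if h, ok := seen[line]; ok && len(line) > 8 {
			out = append(out, "[META:"+h+"]")
			continue
		}
		sum := sha256.Sum256([]byte(line))
		h := hex.EncodeToString(sum[:4])
		seen[line] = h
		table[h] = line
		out = append(out, line)
	}
	return strings.Join(out, "\n"), table
}

// decompress expands every [META:hash] marker back to its original line.
func decompress(compressed string, table map[string]string) string {
	var out []string
	for _, line := range strings.Split(compressed, "\n") {
		if strings.HasPrefix(line, "[META:") && strings.HasSuffix(line, "]") {
			h := line[len("[META:") : len(line)-1]
			out = append(out, table[h])
			continue
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}

func main() {
	in := "import very/long/package/path\nshort\nimport very/long/package/path"
	c, table := compress(in)
	fmt.Println(decompress(c, table) == in) // round trip is lossless
}
```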
func NewMetaTokenFilter ¶
func NewMetaTokenFilter() *MetaTokenFilter
NewMetaTokenFilter creates a new meta-token lossless compression filter
func NewMetaTokenFilterWithConfig ¶
func NewMetaTokenFilterWithConfig(cfg MetaTokenConfig) *MetaTokenFilter
NewMetaTokenFilterWithConfig creates a meta-token filter with custom config
func (*MetaTokenFilter) Apply ¶
func (f *MetaTokenFilter) Apply(input string, mode Mode) (string, int)
Apply applies lossless compression via meta-tokens
func (*MetaTokenFilter) Decompress ¶
func (f *MetaTokenFilter) Decompress(input string) string
Decompress expands meta-tokens back to original sequences
func (*MetaTokenFilter) GetMetaTokens ¶
func (f *MetaTokenFilter) GetMetaTokens() map[string]MetaToken
GetMetaTokens returns all stored meta-tokens (for serialization)
func (*MetaTokenFilter) LoadMetaTokens ¶
func (f *MetaTokenFilter) LoadMetaTokens(tokens map[string]MetaToken)
LoadMetaTokens loads meta-tokens (for deserialization)
func (*MetaTokenFilter) Stats ¶
func (f *MetaTokenFilter) Stats() MetaTokenStats
Stats returns compression statistics
type MetaTokenLayerConfig ¶
MetaTokenLayerConfig groups Layer 15 settings.
type MetaTokenStats ¶
MetaTokenStats holds statistics for meta-token compression
type Milestone ¶
type Milestone struct {
Description string `json:"description"`
Priority int `json:"priority"`
Status string `json:"status"`
}
Milestone represents a pending task
type MixedDimFilter ¶
type MixedDimFilter struct {
// contains filtered or unexported fields
}
Paper: "MixedDimKV: Beyond Token Eviction" — Miao et al., 2026 https://arxiv.org/abs/2603.20616 MixedDimFilter implements mixed-dimension token allocation — instead of evicting tokens entirely (0 or 100%), it reduces the "dimensionality" of less important tokens by abbreviating them.
func NewMixedDimFilter ¶
func NewMixedDimFilter() *MixedDimFilter
NewMixedDimFilter creates a new mixed-dimension allocation filter.
type MultiAgentContextSharing ¶
type MultiAgentContextSharing struct {
// contains filtered or unexported fields
}
MultiAgentContextSharing implements multi-agent context sharing. Inspired by lean-ctx's multi-agent context sharing.
func NewMultiAgentContextSharing ¶
func NewMultiAgentContextSharing() *MultiAgentContextSharing
NewMultiAgentContextSharing creates a new multi-agent context sharing system.
func (*MultiAgentContextSharing) GetMessages ¶
func (macs *MultiAgentContextSharing) GetMessages(agent string) []ScratchMessage
GetMessages gets messages for an agent.
func (*MultiAgentContextSharing) PostMessage ¶
func (macs *MultiAgentContextSharing) PostMessage(from, to, content string)
PostMessage posts a message to the scratchpad.
func (*MultiAgentContextSharing) RegisterAgent ¶
func (macs *MultiAgentContextSharing) RegisterAgent(name string)
RegisterAgent registers an agent.
type MultiFileConfig ¶
type MultiFileConfig struct {
MaxCombinedSize int
PreserveBoundaries bool
SimilarityThreshold float64
}
MultiFileConfig holds configuration for multi-file optimization
type MultiFileFilter ¶
type MultiFileFilter struct {
// contains filtered or unexported fields
}
MultiFileFilter optimizes output across multiple related files/outputs. It identifies relationships between files, deduplicates common content, and creates unified summaries for better LLM context.
Use case: When an agent works with multiple related files simultaneously (e.g., a module with multiple source files), this filter creates a cohesive view that preserves relationships while removing redundancy.
func NewMultiFileFilter ¶
func NewMultiFileFilter(cfg MultiFileConfig) *MultiFileFilter
NewMultiFileFilter creates a new multi-file optimization filter
func (*MultiFileFilter) Apply ¶
func (f *MultiFileFilter) Apply(input string, mode Mode) (string, int)
Apply applies multi-file optimization
func (*MultiFileFilter) SetMaxCombinedSize ¶
func (f *MultiFileFilter) SetMaxCombinedSize(size int)
SetMaxCombinedSize configures the maximum combined output size
func (*MultiFileFilter) SetPreserveBoundaries ¶
func (f *MultiFileFilter) SetPreserveBoundaries(preserve bool)
SetPreserveBoundaries configures whether to keep file markers
func (*MultiFileFilter) SetSimilarityThreshold ¶
func (f *MultiFileFilter) SetSimilarityThreshold(threshold float64)
SetSimilarityThreshold configures the deduplication threshold
type NgramAbbreviator ¶
type NgramAbbreviator struct {
// contains filtered or unexported fields
}
NgramAbbreviator compresses output by abbreviating common patterns. Research-based: CompactPrompt N-gram Abbreviation (2025) - achieves 10-20% lossless compression by replacing common tokens with shorter equivalents.
Key insight: Programming and CLI output contains many repeated long tokens that can be abbreviated while remaining understandable to LLMs.
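A sketch of the idea. The abbreviation table here is invented; the filter maintains its own dictionary and exposes a legend via GetAbbreviationLegend:

```go
package main

import (
	"fmt"
	"strings"
)

// abbrev swaps frequent long tokens for shorter forms an LLM still reads
// correctly. The table is illustrative only.
var abbrev = strings.NewReplacer(
	"function", "fn",
	"implementation", "impl",
	"configuration", "cfg",
)

func main() {
	in := "the configuration of this function implementation"
	out := abbrev.Replace(in)
	fmt.Println(out, "| saved", len(in)-len(out), "chars")
}
```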
func NewNgramAbbreviator ¶
func NewNgramAbbreviator() *NgramAbbreviator
NewNgramAbbreviator creates a new n-gram abbreviator.
func (*NgramAbbreviator) Apply ¶
func (f *NgramAbbreviator) Apply(input string, mode Mode) (string, int)
Apply applies n-gram abbreviation to the input.
func (*NgramAbbreviator) GetAbbreviationLegend ¶
func (f *NgramAbbreviator) GetAbbreviationLegend() string
GetAbbreviationLegend returns a legend for common abbreviations
func (*NgramAbbreviator) Name ¶
func (f *NgramAbbreviator) Name() string
Name returns the filter name.
type NumericalConfig ¶
type NumericalConfig struct {
// Enabled controls whether the filter is active
Enabled bool
// DecimalPlaces limits decimal precision (e.g., 2 = max 2 decimal places)
DecimalPlaces int
// CompressLargeNumbers replaces large numbers with K/M/B suffixes
CompressLargeNumbers bool
// LargeNumberThreshold is the threshold for large number compression
LargeNumberThreshold int
// CompressPercentages simplifies percentage display
CompressPercentages bool
// MinContentLength is minimum content length to apply
MinContentLength int
}
NumericalConfig holds configuration for numerical quantization
func DefaultNumericalConfig ¶
func DefaultNumericalConfig() NumericalConfig
DefaultNumericalConfig returns default configuration
type NumericalQuantLayerConfig ¶
NumericalQuantLayerConfig groups numerical quantization settings.
type NumericalQuantizer ¶
type NumericalQuantizer struct {
// contains filtered or unexported fields
}
NumericalQuantizer compresses numerical data in structured output. Research Source: "CompactPrompt: A Unified Pipeline for Prompt Data Compression" (Oct 2025) Key Innovation: Apply uniform quantization to numerical columns while preserving semantic relationships, achieving significant token savings on structured data.
This filter detects tables, metrics, statistics, and numerical data in output and applies precision reduction and formatting compression.
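A sketch of the two transformations. The thresholds and formats below are illustrative, loosely mirroring NumericalConfig's DecimalPlaces and LargeNumberThreshold, not the filter's exact rules:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// numRe matches decimals and integers of 4+ digits.
var numRe = regexp.MustCompile(`\d+\.\d+|\d{4,}`)

// quantize caps decimal precision and abbreviates large integers with
// K/M suffixes, trading exactness for fewer tokens.
func quantize(input string, decimalPlaces int) string {
	return numRe.ReplaceAllStringFunc(input, func(m string) string {
		if v, err := strconv.ParseFloat(m, 64); err == nil {
			if v >= 1_000_000 {
				return strconv.FormatFloat(v/1_000_000, 'f', 1, 64) + "M"
			}
			if v >= 10_000 {
				return strconv.FormatFloat(v/1_000, 'f', 0, 64) + "K"
			}
			return strconv.FormatFloat(v, 'f', decimalPlaces, 64)
		}
		return m
	})
}

func main() {
	fmt.Println(quantize("latency 12.34567 ms over 2500000 requests", 2))
}
```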
func NewNumericalQuantizer ¶
func NewNumericalQuantizer() *NumericalQuantizer
NewNumericalQuantizer creates a new numerical quantizer
func (*NumericalQuantizer) Apply ¶
func (n *NumericalQuantizer) Apply(input string, mode Mode) (string, int)
Apply applies numerical quantization to the input
func (*NumericalQuantizer) Name ¶
func (n *NumericalQuantizer) Name() string
Name returns the filter name
type Observation ¶
Observation represents a single observation from LLM output.
type PATHShimInjector ¶
type PATHShimInjector struct {
// contains filtered or unexported fields
}
PATHShimInjector creates PATH shims to auto-filter subprocesses. Inspired by tokf's PATH shim injection.
func NewPATHShimInjector ¶
func NewPATHShimInjector(shimDir string) *PATHShimInjector
NewPATHShimInjector creates a new PATH shim injector.
func (*PATHShimInjector) Install ¶
func (psi *PATHShimInjector) Install(commands []string) error
Install installs PATH shims for specified commands.
func (*PATHShimInjector) Uninstall ¶
func (psi *PATHShimInjector) Uninstall() error
Uninstall removes PATH shims.
func (*PATHShimInjector) UpdatePATH ¶
func (psi *PATHShimInjector) UpdatePATH(currentPath string) string
UpdatePATH returns the updated PATH with shim directory prepended.
type PerplexityFilter ¶
type PerplexityFilter struct {
// contains filtered or unexported fields
}
Paper: "LLMLingua" — Jiang et al., Microsoft/Tsinghua, 2023 https://arxiv.org/abs/2310.05736 Paper: "LLMLingua" — Jiang et al., Microsoft/Tsinghua, 2023 https://arxiv.org/abs/2310.05736 PerplexityFilter implements LLMLingua-style compression (Microsoft/Tsinghua, 2023). Uses perplexity-based iterative pruning with a budget controller.
Algorithm: 1. Calculate perplexity of each token given context 2. Rank tokens by perplexity (higher = more surprising = more important) 3. Iteratively remove lowest-perplexity tokens while staying within budget
Research Results: Up to 20x compression with semantic preservation.
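The three-step algorithm can be sketched with a frequency-based stand-in for perplexity. A real implementation scores tokens with a language model; inverse frequency here is just a cheap proxy for "surprise":

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prune keeps the most "surprising" targetRatio fraction of tokens,
// approximating perplexity by inverse in-document frequency.
func prune(input string, targetRatio float64) string {
	words := strings.Fields(input)
	freq := map[string]int{}
	for _, w := range words {
		freq[w]++
	}
	type scored struct {
		idx   int
		score float64
	}
	scores := make([]scored, len(words))
	for i, w := range words {
		scores[i] = scored{i, 1.0 / float64(freq[w])} // rarer = more surprising
	}
	// rank by score descending, then keep the top fraction within budget
	sort.SliceStable(scores, func(a, b int) bool { return scores[a].score > scores[b].score })
	keep := int(float64(len(words)) * targetRatio)
	kept := map[int]bool{}
	for _, s := range scores[:keep] {
		kept[s.idx] = true
	}
	var out []string
	for i, w := range words {
		if kept[i] {
			out = append(out, w)
		}
	}
	return strings.Join(out, " ")
}

func main() {
	fmt.Println(prune("the the the error in parser the the", 0.5))
}
```

Repetitive filler is dropped first while rare, information-bearing tokens survive, which is the core of the budget-controlled pruning loop.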
func NewPerplexityFilter ¶
func NewPerplexityFilter() *PerplexityFilter
NewPerplexityFilter creates a new perplexity-based filter
func (*PerplexityFilter) Apply ¶
func (f *PerplexityFilter) Apply(input string, mode Mode) (string, int)
Apply applies perplexity-based iterative pruning with early exit (Phase 1 optimization)
func (*PerplexityFilter) Name ¶
func (f *PerplexityFilter) Name() string
Name returns the filter name
func (*PerplexityFilter) SetIterations ¶
func (f *PerplexityFilter) SetIterations(iterations int)
SetIterations sets the number of pruning iterations
func (*PerplexityFilter) SetTargetRatio ¶
func (f *PerplexityFilter) SetTargetRatio(ratio float64)
SetTargetRatio sets the target compression ratio
type PersistentKnowledgeStore ¶
type PersistentKnowledgeStore struct {
// contains filtered or unexported fields
}
PersistentKnowledgeStore implements persistent knowledge storage. Inspired by lean-ctx's persistent knowledge store.
func NewPersistentKnowledgeStore ¶
func NewPersistentKnowledgeStore() *PersistentKnowledgeStore
NewPersistentKnowledgeStore creates a new knowledge store.
func (*PersistentKnowledgeStore) QueryByCategory ¶
func (pks *PersistentKnowledgeStore) QueryByCategory(category string) map[string]string
QueryByCategory queries facts by category prefix.
func (*PersistentKnowledgeStore) Recall ¶
func (pks *PersistentKnowledgeStore) Recall(key string) string
Recall retrieves a fact.
func (*PersistentKnowledgeStore) Remember ¶
func (pks *PersistentKnowledgeStore) Remember(key, value string)
Remember stores a fact.
type PhotonConfig ¶
PhotonConfig holds configuration for Photon filter.
func DefaultPhotonConfig ¶
func DefaultPhotonConfig() PhotonConfig
DefaultPhotonConfig returns default Photon configuration.
type PhotonFilter ¶
type PhotonFilter struct {
// contains filtered or unexported fields
}
PhotonFilter detects and compresses base64-encoded images. Inspired by claw-compactor's Photon stage.
func NewPhotonFilter ¶
func NewPhotonFilter(cfg PhotonConfig) *PhotonFilter
NewPhotonFilter creates a new Photon filter.
type PhraseGroupConfig ¶
type PhraseGroupConfig struct {
// Enabled controls whether the filter is active
Enabled bool
// MinContentLength is the minimum character length to apply
MinContentLength int
// MaxPhraseSize is maximum tokens in a phrase group
MaxPhraseSize int
}
PhraseGroupConfig holds configuration for phrase grouping
func DefaultPhraseGroupConfig ¶
func DefaultPhraseGroupConfig() PhraseGroupConfig
DefaultPhraseGroupConfig returns default configuration
type PhraseGroupingFilter ¶
type PhraseGroupingFilter struct {
// contains filtered or unexported fields
}
PhraseGroupingFilter implements dependency-based phrase grouping for compression. Research Source: "CompactPrompt: A Unified Pipeline for Prompt Data Compression" (Oct 2025) Key Innovation: Group related tokens using syntactic dependency analysis before compression, preserving semantic coherence better than token-level pruning.
This identifies noun phrases, verb phrases, and prepositional phrases as atomic compression units, preventing the separation of semantically linked tokens.
func NewPhraseGroupingFilter ¶
func NewPhraseGroupingFilter() *PhraseGroupingFilter
NewPhraseGroupingFilter creates a new phrase grouping filter
func (*PhraseGroupingFilter) Apply ¶
func (f *PhraseGroupingFilter) Apply(input string, mode Mode) (string, int)
Apply applies dependency-based phrase grouping
func (*PhraseGroupingFilter) Name ¶
func (f *PhraseGroupingFilter) Name() string
Name returns the filter name
type Pipeline ¶
type Pipeline interface {
Process(input string) (string, *PipelineStats)
}
Pipeline defines the interface for compression pipelines. This allows mock testing and future pipeline implementations.
type PipelineConfig ¶
type PipelineConfig = PipelineConfigWithNestedLayers
PipelineConfig is an alias for the full config type with backward-compatible flat fields. New code should use PipelineConfigWithNestedLayers to take advantage of nested structure.
func LoadPipelineFromTOML ¶
func LoadPipelineFromTOML(path string) (PipelineConfig, error)
LoadPipelineFromTOML loads pipeline configuration from TOML.
func ModeConfig ¶
func ModeConfig(mode CompressionMode, baseMode Mode) PipelineConfig
ModeConfig is an alias for TierConfig (backwards compat).
func PresetConfig ¶
func PresetConfig(preset PipelinePreset, baseMode Mode) PipelineConfig
func ProfileConfig ¶
func ProfileConfig(profile Profile, baseMode Mode) PipelineConfig
ProfileConfig is an alias for TierConfig (backwards compat).
func TierConfig ¶
func TierConfig(tier Tier, baseMode Mode) PipelineConfig
TierConfig returns a PipelineConfig for the given tier.
type PipelineConfigWithNestedLayers ¶
type PipelineConfigWithNestedLayers struct {
// Core fields
Mode Mode
QueryIntent string
Budget int
LLMEnabled bool
SessionTracking bool
NgramEnabled bool
MultiFileEnabled bool
PromptTemplate string
EnableTOMLFilter bool
TOMLFilterCommand string
// Layer sub-configs (preferred)
Layers LayerConfig
// Core layer enable flags (Layers 1-9)
EnableEntropy bool
EnablePerplexity bool
EnableGoalDriven bool
EnableAST bool
EnableContrastive bool
EnableEvaluator bool
EnableGist bool
EnableHierarchical bool
// Layer 11: Compaction
EnableCompaction bool
CompactionThreshold int
CompactionPreserveTurns int
CompactionMaxTokens int
CompactionStateSnapshot bool
CompactionAutoDetect bool
// Layer 12: Attribution
EnableAttribution bool
AttributionThreshold float64
// Layer 13: H2O
EnableH2O bool
H2OSinkSize int
H2ORecentSize int
H2OHeavyHitterSize int
// Layer 14: Attention Sink
EnableAttentionSink bool
AttentionSinkCount int
AttentionRecentCount int
// Layer 15: Meta-Token
EnableMetaToken bool
MetaTokenWindow int
MetaTokenMinSize int
// Layer 16: Semantic Chunk
EnableSemanticChunk bool
SemanticChunkMethod string
SemanticChunkMinSize int
SemanticChunkThreshold float64
// Layer 17: Sketch Store
EnableSketchStore bool
SketchBudgetRatio float64
SketchMaxSize int
SketchHeavyHitter float64
// Layer 18: Lazy Pruner
EnableLazyPruner bool
LazyBaseBudget int
LazyDecayRate float64
LazyRevivalBudget int
// Layer 19: Semantic Anchor
EnableSemanticAnchor bool
SemanticAnchorRatio float64
SemanticAnchorSpacing int
// Layer 20: Agent Memory
EnableAgentMemory bool
AgentKnowledgeRetention float64
AgentHistoryPrune float64
AgentConsolidationMax int
// Adaptive layers
EnableQuestionAware bool
QuestionAwareThreshold float64
EnableDensityAdaptive bool
DensityTargetRatio float64
DensityThreshold float64
// TF-IDF
EnableTFIDF bool
TFIDFThreshold float64
// Reasoning trace
EnableReasoningTrace bool
MaxReflectionLoops int
// Phase 1: NEW filters
EnableSymbolicCompress bool
EnablePhraseGrouping bool
EnableNumericalQuant bool
DecimalPlaces int
EnableDynamicRatio bool
DynamicRatioBase float64
// Phase 2: Advanced filters
EnableHypernym bool
EnableSemanticCache bool
EnableScope bool
EnableSmallKV bool
EnableKVzip bool
// 2026 Research layers
EnableSWEzze bool
EnableMixedDim bool
EnableBEAVER bool
EnablePoC bool
EnableTokenQuant bool
EnableTokenRetention bool
EnableACON bool
// Cache
CacheEnabled bool
CacheMaxSize int
}
PipelineConfigWithNestedLayers is a helper type for the new nested config structure. Use this gradually: migrate from flat fields to nested Layers config over time.
type PipelineCoordinator ¶
type PipelineCoordinator struct {
// contains filtered or unexported fields
}
PipelineCoordinator orchestrates the 31-layer compression pipeline. Research-based: Combines the best techniques from 120+ research papers worldwide to achieve maximum token reduction for CLI/Agent output.
func NewPipelineCoordinator ¶
func NewPipelineCoordinator(cfg PipelineConfig) *PipelineCoordinator
NewPipelineCoordinator creates a new pipeline coordinator with all configured layers.
func (*PipelineCoordinator) GetASTPreserveFilter ¶
func (c *PipelineCoordinator) GetASTPreserveFilter() *ASTPreserveFilter
GetASTPreserveFilter returns the AST preservation filter
func (*PipelineCoordinator) GetAgentMemoryFilter ¶
func (c *PipelineCoordinator) GetAgentMemoryFilter() *AgentMemoryFilter
GetAgentMemoryFilter returns the agent memory filter
func (*PipelineCoordinator) GetAttentionSinkFilter ¶
func (c *PipelineCoordinator) GetAttentionSinkFilter() *AttentionSinkFilter
GetAttentionSinkFilter returns the attention sink filter
func (*PipelineCoordinator) GetAttributionFilter ¶
func (c *PipelineCoordinator) GetAttributionFilter() *AttributionFilter
GetAttributionFilter returns the attribution filter
func (*PipelineCoordinator) GetCompactionLayer ¶
func (c *PipelineCoordinator) GetCompactionLayer() *CompactionLayer
GetCompactionLayer returns the compaction layer
func (*PipelineCoordinator) GetContrastiveFilter ¶
func (c *PipelineCoordinator) GetContrastiveFilter() *ContrastiveFilter
GetContrastiveFilter returns the contrastive filter
func (*PipelineCoordinator) GetEntropyFilter ¶
func (c *PipelineCoordinator) GetEntropyFilter() *EntropyFilter
GetEntropyFilter returns the entropy filter
func (*PipelineCoordinator) GetEvaluatorHeadsFilter ¶
func (c *PipelineCoordinator) GetEvaluatorHeadsFilter() *EvaluatorHeadsFilter
GetEvaluatorHeadsFilter returns the evaluator heads filter
func (*PipelineCoordinator) GetGistFilter ¶
func (c *PipelineCoordinator) GetGistFilter() *GistFilter
GetGistFilter returns the gist filter
func (*PipelineCoordinator) GetGoalDrivenFilter ¶
func (c *PipelineCoordinator) GetGoalDrivenFilter() *GoalDrivenFilter
GetGoalDrivenFilter returns the goal-driven filter
func (*PipelineCoordinator) GetH2OFilter ¶
func (c *PipelineCoordinator) GetH2OFilter() *H2OFilter
GetH2OFilter returns the H2O filter
func (*PipelineCoordinator) GetHierarchicalSummaryFilter ¶
func (c *PipelineCoordinator) GetHierarchicalSummaryFilter() *HierarchicalSummaryFilter
GetHierarchicalSummaryFilter returns the hierarchical summary filter
func (*PipelineCoordinator) GetLazyPrunerFilter ¶
func (c *PipelineCoordinator) GetLazyPrunerFilter() *LazyPrunerFilter
GetLazyPrunerFilter returns the lazy pruner filter
func (*PipelineCoordinator) GetMetaTokenFilter ¶
func (c *PipelineCoordinator) GetMetaTokenFilter() *MetaTokenFilter
GetMetaTokenFilter returns the meta-token filter
func (*PipelineCoordinator) GetNgramAbbreviator ¶
func (c *PipelineCoordinator) GetNgramAbbreviator() *NgramAbbreviator
GetNgramAbbreviator returns the N-gram abbreviator
func (*PipelineCoordinator) GetPerplexityFilter ¶
func (c *PipelineCoordinator) GetPerplexityFilter() *PerplexityFilter
GetPerplexityFilter returns the perplexity filter
func (*PipelineCoordinator) GetSemanticAnchorFilter ¶
func (c *PipelineCoordinator) GetSemanticAnchorFilter() *SemanticAnchorFilter
GetSemanticAnchorFilter returns the semantic anchor filter
func (*PipelineCoordinator) GetSemanticChunkFilter ¶
func (c *PipelineCoordinator) GetSemanticChunkFilter() *SemanticChunkFilter
GetSemanticChunkFilter returns the semantic chunk filter
func (*PipelineCoordinator) GetSketchStoreFilter ¶
func (c *PipelineCoordinator) GetSketchStoreFilter() *SketchStoreFilter
GetSketchStoreFilter returns the sketch store filter
func (*PipelineCoordinator) GetTFIDFFilter ¶
func (c *PipelineCoordinator) GetTFIDFFilter() *TFIDFFilter
GetTFIDFFilter returns the TF-IDF filter
func (*PipelineCoordinator) GetTOMLFilterName ¶
func (p *PipelineCoordinator) GetTOMLFilterName() string
GetTOMLFilterName returns the name of the configured TOML filter.
func (*PipelineCoordinator) Process ¶
func (p *PipelineCoordinator) Process(input string) (string, *PipelineStats)
Process runs the full compression pipeline with early-exit support. Stage gates skip layers that are not applicable (zero cost), and remaining layers are skipped once the token budget is already met.
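The early-exit flow can be sketched as a standalone toy. Everything here (the layer type, the gate functions, the whitespace token estimate) is an illustrative assumption, not the actual PipelineCoordinator internals:

```go
package main

import (
	"fmt"
	"strings"
)

// layer is a toy compression stage: an applicability gate plus a transform.
type layer struct {
	name  string
	gate  func(s string) bool
	apply func(s string) string
}

// tokens is a crude whitespace-based token estimate.
func tokens(s string) int { return len(strings.Fields(s)) }

// dedupLines drops exact duplicate lines, keeping first occurrences.
func dedupLines(s string) string {
	seen := map[string]bool{}
	var keep []string
	for _, ln := range strings.Split(s, "\n") {
		if !seen[ln] {
			seen[ln] = true
			keep = append(keep, ln)
		}
	}
	return strings.Join(keep, "\n")
}

// runPipeline applies layers in order, skipping layers whose stage gate
// rejects the content (zero cost) and exiting early once the token
// estimate falls within budget.
func runPipeline(input string, layers []layer, budget int) string {
	out := input
	for _, l := range layers {
		if tokens(out) <= budget {
			break // budget already met: skip remaining layers
		}
		if !l.gate(out) {
			continue // stage gate: layer not applicable
		}
		out = l.apply(out)
	}
	return out
}

func main() {
	layers := []layer{{
		name:  "dedup",
		gate:  func(s string) bool { return strings.Contains(s, "\n") },
		apply: dedupLines,
	}}
	fmt.Println(runPipeline("a b\na b\nc d", layers, 2))
}
```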
func (*PipelineCoordinator) SetTOMLFilter ¶
func (p *PipelineCoordinator) SetTOMLFilter(filter Filter, name string)
SetTOMLFilter sets a TOML filter to be applied first in the pipeline. This is called from outside the filter package to avoid import cycles.
type PipelineManager ¶
type PipelineManager struct {
// contains filtered or unexported fields
}
PipelineManager handles resilient large-context processing. Supports streaming for inputs up to 2M tokens with automatic chunking, validation, and failure recovery.
func NewPipelineManager ¶
func NewPipelineManager(cfg ManagerConfig) *PipelineManager
NewPipelineManager creates a new pipeline manager
func (*PipelineManager) Process ¶
func (m *PipelineManager) Process(input string, mode Mode, ctx CommandContext) (*ProcessResult, error)
Process processes input with full resilience and large context support. For inputs > StreamThreshold, uses streaming chunk processing.
func (*PipelineManager) ProcessWithBudget ¶
func (m *PipelineManager) ProcessWithBudget(input string, mode Mode, budget int, ctx CommandContext) (*ProcessResult, error)
ProcessWithBudget processes with a specific token budget. NOTE: Sets the coordinator budget and calls Process sequentially. In TokMan's CLI context, each invocation is isolated per process, so concurrent budget races are not a practical concern.
func (*PipelineManager) ProcessWithQuery ¶
func (m *PipelineManager) ProcessWithQuery(input string, mode Mode, query string, ctx CommandContext) (*ProcessResult, error)
ProcessWithQuery processes with query-aware compression
type PipelineSection ¶
type PipelineSection struct {
Mode string `toml:"mode"`
Budget int `toml:"budget"`
QueryIntent string `toml:"query_intent"`
LLMEnabled bool `toml:"llm_enabled"`
}
PipelineSection holds pipeline configuration.
type PipelineStats ¶
type PipelineStats struct {
OriginalTokens int
FinalTokens int
TotalSaved int
ReductionPercent float64
LayerStats map[string]LayerStat
CacheHit bool
// contains filtered or unexported fields
}
PipelineStats holds statistics from the compression pipeline
func (*PipelineStats) String ¶
func (s *PipelineStats) String() string
String returns a formatted summary of pipeline stats
type PoCFilter ¶
type PoCFilter struct {
// contains filtered or unexported fields
}
Paper: "PoC: Performance-oriented Context Compression" — 2026 https://arxiv.org/abs/2603.19733 PoCFilter implements performance prediction — estimates how well the compressed output will perform and re-inserts critical info if needed.
func NewPoCFilter ¶
func NewPoCFilter() *PoCFilter
NewPoCFilter creates a new performance-oriented compression filter.
type PositionAwareFilter ¶
type PositionAwareFilter struct{}
PositionAwareFilter reorders output segments to optimize LLM recall. Based on "LongLLMLingua" (Jiang et al., 2024) - LLMs exhibit "lost in the middle" phenomenon where information in the middle of context is less likely to be recalled.
Strategy: Place high-importance segments at beginning AND end of output.
func NewPositionAwareFilter ¶
func NewPositionAwareFilter() *PositionAwareFilter
NewPositionAwareFilter creates a new position-aware filter.
func (*PositionAwareFilter) Apply ¶
func (f *PositionAwareFilter) Apply(input string, mode Mode) (string, int)
Apply reorders segments to optimize for LLM recall. This filter doesn't save tokens - it improves context quality.
func (*PositionAwareFilter) Name ¶
func (f *PositionAwareFilter) Name() string
Name returns the filter name.
type ProcessResult ¶
type ProcessResult struct {
Output string
OriginalTokens int
FinalTokens int
SavedTokens int
ReductionPercent float64
LayerStats map[string]LayerStat
CacheHit bool
Chunks int
Validated bool
TeeFile string // If failure occurred
Warning string
}
ProcessResult contains the result of processing
type Profile ¶
type Profile = Tier
Profile is a backwards-compatibility alias for Tier.
func ContentProfile ¶
ContentProfile auto-detects the best compression profile based on output content.
type QualityEstimator ¶
type QualityEstimator struct{}
QualityEstimator estimates the quality of compressed output
func NewQualityEstimator ¶
func NewQualityEstimator() *QualityEstimator
NewQualityEstimator creates a new quality estimator
func (*QualityEstimator) EstimateQuality ¶
func (q *QualityEstimator) EstimateQuality(original, compressed string) float64
EstimateQuality estimates the quality of compressed output vs original
type QualityScorer ¶
type QualityScorer struct{}
QualityScorer implements AST, identifier, and line preservation scoring. Inspired by lean-ctx's quality scorer.
func NewQualityScorer ¶
func NewQualityScorer() *QualityScorer
NewQualityScorer creates a new quality scorer.
func (*QualityScorer) Score ¶
func (qs *QualityScorer) Score(original, compressed string) float64
Score computes quality score for compressed content.
type QueryAwareFilter ¶
type QueryAwareFilter struct {
// contains filtered or unexported fields
}
QueryAwareFilter prioritizes output segments based on the agent's query intent. Based on "LongLLMLingua" (Jiang et al., 2024) and "ACON" (Zhang et al., 2024).
Key insight: Different agent tasks need different output segments. A "debug" query needs errors/stack traces, not success messages. A "deploy" query needs status/version, not full logs.
func NewQueryAwareFilter ¶
func NewQueryAwareFilter(query ...string) *QueryAwareFilter
NewQueryAwareFilter creates a new query-aware filter with an optional query.
func (*QueryAwareFilter) Apply ¶
func (f *QueryAwareFilter) Apply(input string, mode Mode) (string, int)
Apply filters output based on query relevance.
func (*QueryAwareFilter) Name ¶
func (f *QueryAwareFilter) Name() string
Name returns the filter name.
func (*QueryAwareFilter) SetQuery ¶
func (f *QueryAwareFilter) SetQuery(query string)
SetQuery sets the query for context-aware filtering
type QueryIntent ¶
type QueryIntent int
QueryIntent represents the type of agent query
const (
	IntentUnknown QueryIntent = iota
	IntentDebug   // Finding errors, failures, crashes
	IntentReview  // Code review, diff analysis
	IntentDeploy  // Deployment status, version info
	IntentSearch  // Finding files, functions, definitions
	IntentTest    // Running/analyzing tests
	IntentBuild   // Build/compilation status
)
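A keyword-counting classifier is one minimal way to map a query to such an intent. The keyword lists below are invented for illustration; the package's actual detection logic may differ:

```go
package main

import (
	"fmt"
	"strings"
)

type QueryIntent int

const (
	IntentUnknown QueryIntent = iota
	IntentDebug
	IntentDeploy
	IntentTest
)

// intentKeywords maps each intent to indicative query terms
// (illustrative lists, not the package's actual vocabulary).
var intentKeywords = map[QueryIntent][]string{
	IntentDebug:  {"error", "crash", "fail", "panic", "trace"},
	IntentDeploy: {"deploy", "release", "version", "rollout"},
	IntentTest:   {"test", "coverage", "assert"},
}

// detectIntent returns the intent with the most keyword hits,
// or IntentUnknown when nothing matches.
func detectIntent(query string) QueryIntent {
	q := strings.ToLower(query)
	best, bestHits := IntentUnknown, 0
	for intent, kws := range intentKeywords {
		hits := 0
		for _, kw := range kws {
			if strings.Contains(q, kw) {
				hits++
			}
		}
		if hits > bestHits {
			best, bestHits = intent, hits
		}
	}
	return best
}

func main() {
	fmt.Println(detectIntent("why does the build crash with this error"))
}
```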
type QuestionAwareConfig ¶
type QuestionAwareConfig struct {
// Enable question-aware filtering
Enabled bool
// The query/question to be aware of
Query string
// Minimum relevance score to preserve (0.0-1.0)
RelevanceThreshold float64
// Number of context tokens to preserve around matches
ContextWindow int
// Boost factor for exact matches
ExactMatchBoost float64
// Boost factor for partial matches
PartialMatchBoost float64
}
QuestionAwareConfig holds configuration for question-aware filtering
func DefaultQuestionAwareConfig ¶
func DefaultQuestionAwareConfig() QuestionAwareConfig
DefaultQuestionAwareConfig returns default configuration
type QuestionAwareFilter ¶
type QuestionAwareFilter struct {
// contains filtered or unexported fields
}
Paper: "LongLLMLingua" — Jiang et al., Microsoft, 2024 https://arxiv.org/abs/2310.06839 QuestionAwareFilter implements LongLLMLingua-style question-aware recovery. Research basis: "LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios" (Jiang et al., ACL 2024) - preserves query-relevant subsequences during compression.
Key insight: compression should be aware of the question/query and preserve subsequences that are likely relevant to answering it.
This filter:

1. Extracts key terms from the query
2. Scores content segments by relevance to the query
3. Preserves high-relevance segments even under aggressive compression
4. Enables "recovery" of important context post-compression
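Steps 1-3 can be approximated with a term-overlap score per line. This is a hedged sketch; keepRelevant and its crude stop-word heuristic are assumptions, not the filter's real scoring:

```go
package main

import (
	"fmt"
	"strings"
)

// keepRelevant keeps only lines sharing at least minHits terms with
// the query, a crude stand-in for relevance-scored preservation.
func keepRelevant(input, query string, minHits int) string {
	terms := map[string]bool{}
	for _, t := range strings.Fields(strings.ToLower(query)) {
		if len(t) > 3 { // skip short stop-ish words
			terms[t] = true
		}
	}
	var out []string
	for _, ln := range strings.Split(input, "\n") {
		hits := 0
		for _, w := range strings.Fields(strings.ToLower(ln)) {
			if terms[w] {
				hits++
			}
		}
		if hits >= minHits {
			out = append(out, ln)
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	logs := "timeout in request handler\nall 42 tests passed\nretrying after timeout"
	fmt.Println(keepRelevant(logs, "what caused the timeout", 1))
}
```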
func NewQuestionAwareFilter ¶
func NewQuestionAwareFilter(query string) *QuestionAwareFilter
NewQuestionAwareFilter creates a new question-aware filter
func (*QuestionAwareFilter) Apply ¶
func (q *QuestionAwareFilter) Apply(input string, mode Mode) (string, int)
Apply applies question-aware filtering to preserve query-relevant content
func (*QuestionAwareFilter) GetStats ¶
func (q *QuestionAwareFilter) GetStats() map[string]any
GetStats returns filter statistics
func (*QuestionAwareFilter) Name ¶
func (q *QuestionAwareFilter) Name() string
Name returns the filter name
func (*QuestionAwareFilter) SetEnabled ¶
func (q *QuestionAwareFilter) SetEnabled(enabled bool)
SetEnabled enables or disables the filter
func (*QuestionAwareFilter) SetQuery ¶
func (q *QuestionAwareFilter) SetQuery(query string)
SetQuery sets the query for question-aware filtering
type QuestionAwareLayerConfig ¶
QuestionAwareLayerConfig groups T12 settings.
type QuestionAwareRecovery ¶
type QuestionAwareRecovery struct {
// contains filtered or unexported fields
}
QuestionAwareRecovery restores query-relevant subsequences after compression. LongLLMLingua insight — question-aware post-compression recovery.
func (*QuestionAwareRecovery) Recover ¶
func (r *QuestionAwareRecovery) Recover(original, compressed, query string) string
Recover adds back important lines that were removed during compression.
type ReadMode ¶
type ReadMode string
ReadMode represents different file reading strategies. Inspired by lean-ctx's 6 read modes.
type ReadOptions ¶
ReadOptions holds options for reading content.
type Reflection ¶
Reflection represents a consolidated insight from multiple observations.
type ReversibleStore ¶
type ReversibleStore struct {
// contains filtered or unexported fields
}
ReversibleStore stores original outputs indexed by content hash. Claw-compactor style reversible compression. Users can restore any compressed output to its original form.
func NewReversibleStore ¶
func NewReversibleStore() *ReversibleStore
NewReversibleStore creates a store in the tokman data directory.
func (*ReversibleStore) ListRecent ¶
func (s *ReversibleStore) ListRecent(n int) ([]StoredEntry, error)
ListRecent returns the N most recent reversible entries.
func (*ReversibleStore) Restore ¶
func (s *ReversibleStore) Restore(hashPrefix string) (*StoredEntry, error)
Restore retrieves the original output by hash prefix.
type SWEzzeFilter ¶
type SWEzzeFilter struct {
// contains filtered or unexported fields
}
Paper: "SWEzze: Code Distillation for Issue Resolution" — Wang et al., PKU/UCL, 2026 https://arxiv.org/abs/2603.28119 SWEzzeFilter implements code distillation — extracts only "patch ingredients" (file paths, error types, function signatures) and discards surrounding context. Achieves 6x compression while improving issue resolution rates by 5-9%.
func NewSWEzzeFilter ¶
func NewSWEzzeFilter() *SWEzzeFilter
NewSWEzzeFilter creates a new SWEzze-style code distillation filter.
type SafetySection ¶
type SafetySection struct {
CheckFilterSafety bool `toml:"check_filter_safety"`
MaxFilterSize int `toml:"max_filter_size"`
AllowRemote bool `toml:"allow_remote"`
}
SafetySection holds safety configuration.
type ScopeConfig ¶
type ScopeConfig struct {
// Enabled controls whether the filter is active
Enabled bool
// PrefillBudgetRatio is the fraction of budget for prefill content (higher = preserve more)
PrefillBudgetRatio float64
// DecodeBudgetRatio is the fraction of budget for decode content (lower = compress more)
DecodeBudgetRatio float64
// ConversationTurns threshold to switch from prefill to decode mode
ConversationTurns int
// MinContentLength minimum chars to apply
MinContentLength int
}
ScopeConfig holds configuration for SCOPE optimization
func DefaultScopeConfig ¶
func DefaultScopeConfig() ScopeConfig
DefaultScopeConfig returns default configuration
type ScopeFilter ¶
type ScopeFilter struct {
// contains filtered or unexported fields
}
Paper: "SCOPE: Optimizing KV Cache Compression" — Wu et al., ACL 2025 https://arxiv.org/abs/2412.13649 ScopeFilter implements SCOPE-style separate prefill/decode optimization. Research Source: "SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation" (ACL 2025) Key Innovation: Separate optimization strategies for initial context (prefill) vs ongoing conversation (decode). Prefill preserves more; decode compresses more. Results: 35% KV cache with near-full performance.
This detects whether content is initial context or ongoing conversation and applies appropriate compression strategy.
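The prefill/decode split reduces to picking a budget by conversation stage. The ratios and threshold below are illustrative assumptions, not SCOPE's published values:

```go
package main

import "fmt"

// scopeBudget returns a token budget: prefill (initial context) keeps
// more; decode (an ongoing conversation past the turn threshold) keeps
// less.
func scopeBudget(totalTokens, turns, turnThreshold int, prefillRatio, decodeRatio float64) int {
	if turns > turnThreshold {
		return int(float64(totalTokens) * decodeRatio)
	}
	return int(float64(totalTokens) * prefillRatio)
}

func main() {
	fmt.Println(scopeBudget(1000, 1, 3, 0.75, 0.25)) // prefill: preserve more
	fmt.Println(scopeBudget(1000, 5, 3, 0.75, 0.25)) // decode: compress more
}
```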
type ScratchMessage ¶
ScratchMessage represents a scratchpad message.
type SemanticAnchorConfig ¶
type SemanticAnchorConfig struct {
// AnchorRatio is the percentage of tokens to select as anchors (0.1 = 10%)
AnchorRatio float64
// MinAnchorSpacing is minimum tokens between anchors
MinAnchorSpacing int
// EnableAggregation allows non-anchor token aggregation
EnableAggregation bool
// PreserveStructure keeps structural tokens as anchors
PreserveStructure bool
}
SemanticAnchorConfig holds configuration for semantic anchor compression
func DefaultSemanticAnchorConfig ¶
func DefaultSemanticAnchorConfig() SemanticAnchorConfig
DefaultSemanticAnchorConfig returns default configuration
type SemanticAnchorFilter ¶
type SemanticAnchorFilter struct {
// contains filtered or unexported fields
}
SemanticAnchorFilter implements Layer 19: Semantic-Anchor Compression (SAC style).
Research Source: "SAC: Semantic-Anchor Compression" (2024) Key Innovation: Autoencoding-free compression via anchor selection and aggregation. Results: Higher compression ratios without contextual amnesia.
Methodology:

1. Anchor Detection - Identify high-connectivity tokens (semantic hubs)
2. Information Aggregation - Merge surrounding tokens into anchors
3. Prompt Reorganization - Restructure into anchor-based layout
func NewSemanticAnchorFilter ¶
func NewSemanticAnchorFilter() *SemanticAnchorFilter
NewSemanticAnchorFilter creates a new semantic anchor filter
func NewSemanticAnchorFilterWithConfig ¶
func NewSemanticAnchorFilterWithConfig(cfg SemanticAnchorConfig) *SemanticAnchorFilter
NewSemanticAnchorFilterWithConfig creates a filter with custom config
func (*SemanticAnchorFilter) Apply ¶
func (f *SemanticAnchorFilter) Apply(input string, mode Mode) (string, int)
Apply applies semantic-anchor compression
func (*SemanticAnchorFilter) GetAnchorDensity ¶
func (f *SemanticAnchorFilter) GetAnchorDensity(token string) float64
GetAnchorDensity returns the density score for a token
func (*SemanticAnchorFilter) GetAnchors ¶
func (f *SemanticAnchorFilter) GetAnchors() []AnchorToken
GetAnchors returns all detected anchor tokens
func (*SemanticAnchorFilter) GetStats ¶
func (f *SemanticAnchorFilter) GetStats() SemanticAnchorStats
GetStats returns compression statistics
func (*SemanticAnchorFilter) Name ¶
func (f *SemanticAnchorFilter) Name() string
Name returns the filter name
type SemanticAnchorLayerConfig ¶
SemanticAnchorLayerConfig groups Layer 19 settings.
type SemanticAnchorStats ¶
type SemanticAnchorStats struct {
TotalAnchors int
TotalAggregated int
NonAnchorPruned int
TokensSaved int
}
SemanticAnchorStats tracks compression statistics
type SemanticCacheConfig ¶
type SemanticCacheConfig struct {
// Enabled controls whether the filter is active
Enabled bool
// SimilarityThreshold for clustering (0-1). Higher = stricter matching
SimilarityThreshold float64
// MinClusterSize minimum items before merging
MinClusterSize int
// MaxCores maximum semantic cores to keep
MaxCores int
// MinContentLength minimum chars to apply
MinContentLength int
}
SemanticCacheConfig holds configuration for semantic caching
func DefaultSemanticCacheConfig ¶
func DefaultSemanticCacheConfig() SemanticCacheConfig
DefaultSemanticCacheConfig returns default configuration
type SemanticCacheFilter ¶
type SemanticCacheFilter struct {
// contains filtered or unexported fields
}
Paper: "SemantiCache: Efficient KV Cache Compression" — Wu et al., Tsinghua, 2026 https://arxiv.org/abs/2603.14303 SemanticCacheFilter implements SemantiCache-style clustered merging. Research Source: "SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging" (Mar 2026) Key Innovation: Group tokens into semantic clusters, then merge each cluster into a "semantic core" using proportional attention rebalancing. Results: 2.61x decode speedup, preserves semantic integrity.
This compresses by finding semantically similar sentences/paragraphs and merging them into representative cores, reducing redundancy while preserving unique information.
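A word-set Jaccard similarity is one simple way to cluster near-duplicate lines and keep a single "core" per cluster. The functions below are a hedged sketch, not the filter's actual similarity measure:

```go
package main

import (
	"fmt"
	"strings"
)

func wordSet(s string) map[string]bool {
	m := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(s)) {
		m[w] = true
	}
	return m
}

// jaccard is word-set overlap: |A∩B| / |A∪B|.
func jaccard(a, b string) float64 {
	as, bs := wordSet(a), wordSet(b)
	inter, union := 0, len(bs)
	for w := range as {
		if bs[w] {
			inter++
		} else {
			union++
		}
	}
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

// mergeSimilar keeps the first line of each similarity cluster as its
// "semantic core" and drops near-duplicates above the threshold.
func mergeSimilar(lines []string, threshold float64) []string {
	var cores []string
	for _, ln := range lines {
		dup := false
		for _, c := range cores {
			if jaccard(ln, c) >= threshold {
				dup = true
				break
			}
		}
		if !dup {
			cores = append(cores, ln)
		}
	}
	return cores
}

func main() {
	fmt.Println(mergeSimilar([]string{
		"build passed ok",
		"build passed fine ok",
		"deploy failed",
	}, 0.5))
}
```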
func NewSemanticCacheFilter ¶
func NewSemanticCacheFilter() *SemanticCacheFilter
NewSemanticCacheFilter creates a new semantic cache filter
func (*SemanticCacheFilter) Apply ¶
func (f *SemanticCacheFilter) Apply(input string, mode Mode) (string, int)
Apply applies semantic cache compression
func (*SemanticCacheFilter) Name ¶
func (f *SemanticCacheFilter) Name() string
Name returns the filter name
type SemanticChunk ¶
type SemanticChunk struct {
Type ChunkType // Type of chunk
Content string // Original content
Tokens int // Token count
Score float64 // Importance score (0.0-1.0)
StartLine int // Start line in original content
EndLine int // End line in original content
}
SemanticChunk represents a semantic unit for compression
type SemanticChunkConfig ¶
type SemanticChunkConfig struct {
// ChunkMethod determines how to split content
ChunkMethod ChunkMethod
// MinChunkSize is the minimum tokens for a chunk
MinChunkSize int
// MaxChunkSize is the maximum tokens for a chunk
MaxChunkSize int
// ImportanceThreshold for pruning chunks (0.0-1.0)
ImportanceThreshold float64
// PreserveStructure keeps structural markers even in low-importance chunks
PreserveStructure bool
}
SemanticChunkConfig holds configuration for semantic chunking
func DefaultSemanticChunkConfig ¶
func DefaultSemanticChunkConfig() SemanticChunkConfig
DefaultSemanticChunkConfig returns default configuration
type SemanticChunkFilter ¶
type SemanticChunkFilter struct {
// contains filtered or unexported fields
}
Paper: "ChunkKV: Semantic-Preserving KV Compression" — Liu et al., NeurIPS, 2025 https://arxiv.org/abs/2502.00299 SemanticChunkFilter implements Layer 16: Semantic Chunk-based Compression (ChunkKV style).
Research Source: "ChunkKV: Semantic-Guided KV Cache Compression" (NeurIPS 2025) Key Innovation: Move from token-level to chunk-level pruning to preserve semantic coherence. Results: 8.7% precision improvement, 26.5% faster throughput vs token-level methods.
Methodology:

1. Group tokens into semantic chunks (functions, classes, sentences, paragraphs)
2. Score each chunk's importance using conditional perplexity
3. Prune entire chunks (not individual tokens) to preserve structure
4. Reuse chunk indices across layers for efficiency
func NewSemanticChunkFilter ¶
func NewSemanticChunkFilter() *SemanticChunkFilter
NewSemanticChunkFilter creates a new semantic chunk filter
func NewSemanticChunkFilterWithConfig ¶
func NewSemanticChunkFilterWithConfig(cfg SemanticChunkConfig) *SemanticChunkFilter
NewSemanticChunkFilterWithConfig creates a filter with custom config
func (*SemanticChunkFilter) Apply ¶
func (f *SemanticChunkFilter) Apply(input string, mode Mode) (string, int)
Apply applies semantic chunk-based compression
func (*SemanticChunkFilter) Name ¶
func (f *SemanticChunkFilter) Name() string
Name returns the filter name
type SemanticChunkLayerConfig ¶
SemanticChunkLayerConfig groups Layer 16 settings.
type SemanticEquivalence ¶
type SemanticEquivalence struct{}
SemanticEquivalence checks if compressed output preserves meaning. Verify no critical information was lost during compression.
func NewSemanticEquivalence ¶
func NewSemanticEquivalence() *SemanticEquivalence
NewSemanticEquivalence creates a checker.
func (*SemanticEquivalence) Check ¶
func (s *SemanticEquivalence) Check(original, compressed string) EquivalenceReport
Check returns an equivalence report.
type SemanticFilter ¶
type SemanticFilter struct {
// contains filtered or unexported fields
}
SemanticFilter prunes low-information segments using statistical analysis. Based on "Selective Context" (Li et al., 2024) - uses self-information and information density to identify low-value content.
func NewSemanticFilter ¶
func NewSemanticFilter() *SemanticFilter
NewSemanticFilter creates a new semantic filter.
type SemanticIntentDetector ¶
type SemanticIntentDetector struct {
// contains filtered or unexported fields
}
SemanticIntentDetector detects semantic intent in queries. Inspired by lean-ctx's semantic intent detection.
func NewSemanticIntentDetector ¶
func NewSemanticIntentDetector() *SemanticIntentDetector
NewSemanticIntentDetector creates a new semantic intent detector.
func (*SemanticIntentDetector) DetectIntent ¶
func (sid *SemanticIntentDetector) DetectIntent(query string) (string, float64)
DetectIntent detects the semantic intent of a query.
type SessionConfig ¶
type SessionConfig struct {
SessionFile string // Path to session file
MaxEntries int // Maximum entries to track (0 = unlimited)
}
SessionConfig holds configuration for the session tracker
type SessionHistory ¶
type SessionHistory struct {
UserQueries []string `json:"user_queries"`
ActivityLog []string `json:"activity_log"`
FilesRead []string `json:"files_read,omitempty"`
FilesEdited []string `json:"files_edited,omitempty"`
CommandsRun []string `json:"commands_run,omitempty"`
Decisions []string `json:"decisions,omitempty"`
}
SessionHistory tracks what happened in the session
type SessionStats ¶
SessionStats holds session statistics
type SessionTracker ¶
type SessionTracker struct {
// contains filtered or unexported fields
}
SessionTracker tracks content across commands to avoid repetition. Research-based: Context-Aware Compression (2024) - avoids repeating information already shown to the agent, achieving 5-10% additional reduction.
Key insight: Agents often run similar commands repeatedly. Tracking what has been shown allows collapsing repeated content to "[seen before]" markers.
func NewSessionTracker ¶
func NewSessionTracker() *SessionTracker
NewSessionTracker creates a new session tracker.
func (*SessionTracker) Apply ¶
func (f *SessionTracker) Apply(input string, mode Mode) (string, int)
Apply applies session tracking to avoid repetition.
func (*SessionTracker) Clear ¶
func (f *SessionTracker) Clear() error
Clear clears the session history
func (*SessionTracker) Stats ¶
func (f *SessionTracker) Stats() SessionStats
Stats returns session statistics
type SinkConfig ¶
type SinkConfig struct {
// Enable attention sink filtering
Enabled bool
// Number of initial tokens to always preserve as sinks
SinkTokenCount int
// Number of recent tokens to preserve in rolling cache
RecentTokenCount int
// Preserve structural markers (headers, prefixes)
PreserveStructural bool
// Minimum content length to apply
MinContentLength int
// Anchor patterns to always preserve
AnchorPatterns []string
}
SinkConfig holds configuration for attention sink preservation
func DefaultSinkConfig ¶
func DefaultSinkConfig() SinkConfig
DefaultSinkConfig returns default configuration
type Sketch ¶
type Sketch struct {
CompressedInfo []byte // Quantized/low-rank representation
OriginalHash string // For verification
TokenCount int // Original token count
Importance float64 // Original importance score
ContentType string // "code", "text", "mixed"
}
Sketch represents a compressed content entry
type SketchCache ¶
type SketchCache struct {
TokenSketches map[string]*Sketch // hash -> sketch
Budget float64
Stats SketchStats
}
SketchCache stores compressed representations of pruned content
type SketchEntry ¶
SketchEntry represents a revivable content block
type SketchStats ¶
SketchStats tracks compression statistics
type SketchStoreConfig ¶
type SketchStoreConfig struct {
// BudgetRatio is the target compression ratio (0.1 = 10% budget)
BudgetRatio float64
// EnableRecovery allows on-demand reconstruction
EnableRecovery bool
// MaxSketchSize limits sketch storage per entry
MaxSketchSize int
// HeavyHitterRatio determines what stays uncompressed
HeavyHitterRatio float64
}
SketchStoreConfig holds configuration for sketch-based compression
func DefaultSketchStoreConfig ¶
func DefaultSketchStoreConfig() SketchStoreConfig
DefaultSketchStoreConfig returns default configuration
type SketchStoreFilter ¶
type SketchStoreFilter struct {
// contains filtered or unexported fields
}
Paper: "KVReviver: Reversible KV Cache Compression" — Yuan et al., 2025 https://arxiv.org/abs/2512.17917 SketchStoreFilter implements Layer 17: Sketch-based Reversible Compression (KVReviver style).
Research Source: "KVReviver: Sketch-based KV Cache Recovery" (December 2025) Key Innovation: On-demand reconstruction of pruned tokens via compressed sketches. Results: 90% memory reduction with identical accuracy at 10% budget.
Methodology:

1. Create sketches (compressed representations) for evicted content
2. Store sketches in a SketchCache for on-demand reconstruction
3. Monitor attention patterns to detect when reconstruction is needed
4. Revive pruned content when required for context
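Steps 1, 2, and 4 can be sketched with gzip as the stand-in compression and a hash-keyed map as the stand-in SketchCache. Both stand-ins, and the marker format, are assumptions; the real filter uses quantized/low-rank sketches:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"io"
)

// sketchStore keeps gzip "sketches" of evicted content keyed by a
// content-hash prefix, so pruned blocks can be revived on demand.
type sketchStore struct{ sketches map[string][]byte }

// evict compresses the content into a sketch and returns the marker
// that replaces it in the output.
func (s *sketchStore) evict(content string) string {
	h := fmt.Sprintf("%x", sha256.Sum256([]byte(content)))[:12]
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(content))
	zw.Close()
	s.sketches[h] = buf.Bytes()
	return "[pruned:" + h + "]"
}

// revive reconstructs the original content from its sketch.
func (s *sketchStore) revive(hash string) (string, bool) {
	raw, ok := s.sketches[hash]
	if !ok {
		return "", false
	}
	zr, err := gzip.NewReader(bytes.NewReader(raw))
	if err != nil {
		return "", false
	}
	out, err := io.ReadAll(zr)
	if err != nil {
		return "", false
	}
	return string(out), true
}

func main() {
	st := &sketchStore{sketches: map[string][]byte{}}
	marker := st.evict("long stack trace elided from the main context")
	fmt.Println(marker)
	orig, _ := st.revive(marker[len("[pruned:") : len(marker)-1])
	fmt.Println(orig)
}
```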
func NewSketchStoreFilter ¶
func NewSketchStoreFilter() *SketchStoreFilter
NewSketchStoreFilter creates a new sketch-based reversible store
func NewSketchStoreFilterWithConfig ¶
func NewSketchStoreFilterWithConfig(cfg SketchStoreConfig) *SketchStoreFilter
NewSketchStoreFilterWithConfig creates a filter with custom config
func (*SketchStoreFilter) Apply ¶
func (f *SketchStoreFilter) Apply(input string, mode Mode) (string, int)
Apply applies sketch-based reversible compression
func (*SketchStoreFilter) ExportSketches ¶
func (f *SketchStoreFilter) ExportSketches() ([]byte, error)
ExportSketches serializes all sketches for persistence
func (*SketchStoreFilter) GetAllSketches ¶
func (f *SketchStoreFilter) GetAllSketches() map[string]*Sketch
GetAllSketches returns all stored sketches
func (*SketchStoreFilter) GetSketch ¶
func (f *SketchStoreFilter) GetSketch(hash string) (*Sketch, bool)
GetSketch returns a sketch by hash
func (*SketchStoreFilter) GetStats ¶
func (f *SketchStoreFilter) GetStats() SketchStats
GetStats returns compression statistics
func (*SketchStoreFilter) ImportSketches ¶
func (f *SketchStoreFilter) ImportSketches(data []byte) error
ImportSketches loads sketches from serialized data
func (*SketchStoreFilter) Name ¶
func (f *SketchStoreFilter) Name() string
Name returns the filter name
type SketchStoreLayerConfig ¶
type SketchStoreLayerConfig struct {
Enabled bool
BudgetRatio float64
MaxSize int
HeavyHitter float64
}
SketchStoreLayerConfig groups Layer 17 settings.
type SmallKVCompensator ¶
type SmallKVCompensator struct {
// contains filtered or unexported fields
}
Paper: "SmallKV: Small Model Assisted Compensation" — Zhao et al., NeurIPS, 2025 https://arxiv.org/abs/2508.02751 SmallKVCompensator implements SmallKV-style small model compensation. Research Source: "SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference" (2025) Key Innovation: When aggressive compression removes important tokens, use a lightweight reconstruction pass to compensate for lost information.
After compression, this checks whether critical information patterns were broken (unclosed brackets, incomplete statements, missing context) and reconstructs minimal bridges to maintain coherence.
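The bracket-repair part of that check can be sketched with a simple stack pass. This is a minimal illustration of the "bridge token" idea under the assumption that only bracket balance is checked; the real compensator also handles statements and context continuity:

```go
package main

import "fmt"

// repairBrackets appends closing brackets for any openers left
// unbalanced by aggressive compression, a minimal "bridge token"
// repair pass.
func repairBrackets(s string) string {
	pairs := map[rune]rune{'(': ')', '[': ']', '{': '}'}
	closers := map[rune]rune{')': '(', ']': '[', '}': '{'}
	var stack []rune
	for _, r := range s {
		if _, ok := pairs[r]; ok {
			stack = append(stack, r)
		} else if open, ok := closers[r]; ok {
			if len(stack) > 0 && stack[len(stack)-1] == open {
				stack = stack[:len(stack)-1]
			}
		}
	}
	// Close innermost-first.
	for i := len(stack) - 1; i >= 0; i-- {
		s += string(pairs[stack[i]])
	}
	return s
}

func main() {
	fmt.Println(repairBrackets("if ok { log.Print(msg"))
}
```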
func NewSmallKVCompensator ¶
func NewSmallKVCompensator() *SmallKVCompensator
NewSmallKVCompensator creates a new SmallKV compensator
func (*SmallKVCompensator) Apply ¶
func (s *SmallKVCompensator) Apply(input string, mode Mode) (string, int)
Apply implements the Filter interface for pipeline integration
func (*SmallKVCompensator) Compensate ¶
func (s *SmallKVCompensator) Compensate(original, compressed string, mode Mode) string
Compensate adds bridge tokens to compensate for over-compression. This runs AFTER other filters to repair damage from aggressive compression.
func (*SmallKVCompensator) Name ¶
func (s *SmallKVCompensator) Name() string
Name returns the filter name
type SmallKVConfig ¶
type SmallKVConfig struct {
// Enabled controls whether the compensator is active
Enabled bool
// MinContentLength minimum chars to apply
MinContentLength int
// MaxBridgeTokens maximum tokens to add as compensation
MaxBridgeTokens int
// CheckSyntaxIntegrity verifies bracket/paren matching
CheckSyntaxIntegrity bool
// CheckContextContinuity verifies logical flow preservation
CheckContextContinuity bool
}
SmallKVConfig holds configuration for SmallKV compensation
func DefaultSmallKVConfig ¶
func DefaultSmallKVConfig() SmallKVConfig
DefaultSmallKVConfig returns default configuration
type SnapshotContext ¶
type SnapshotContext struct {
Critical []string `json:"critical"` // Must preserve (can't rediscover)
Working []string `json:"working"` // Summarized knowledge
KeyValue map[string]string `json:"key_value"` // Extracted facts
CodeContext []CodeContext `json:"code_context,omitempty"`
}
SnapshotContext preserves important knowledge
type StateSnapshot ¶
type StateSnapshot struct {
SessionHistory SessionHistory `json:"session_history"`
CurrentState CurrentState `json:"current_state"`
Context SnapshotContext `json:"context"`
PendingPlan []Milestone `json:"pending_plan"`
}
StateSnapshot represents semantic compaction output
type StoredEntry ¶
type StoredEntry struct {
Hash string `json:"hash"`
Command string `json:"command"`
Original string `json:"original"`
Compressed string `json:"compressed"`
OriginalHash string `json:"original_hash"`
Mode string `json:"mode"`
Budget int `json:"budget"`
Timestamp time.Time `json:"timestamp"`
LayerStats map[string]int `json:"layer_stats,omitempty"`
}
StoredEntry holds a reversible compression entry.
type StreamingProcessor ¶
type StreamingProcessor struct {
// contains filtered or unexported fields
}
StreamingProcessor handles large inputs (>500K tokens) with chunked processing. This reduces memory usage by processing content in chunks rather than loading everything into memory at once.
Based on research: - DSPC (Sep 2025): Coarse filtering before expensive layers - MemGPT (UC Berkeley 2023): Memory-efficient context management
func NewStreamingProcessor ¶
func NewStreamingProcessor(mode Mode, cfg LayerConfigs) *StreamingProcessor
NewStreamingProcessor creates a new streaming processor
func (*StreamingProcessor) ProcessStream ¶
func (sp *StreamingProcessor) ProcessStream(input string) (string, *PipelineStats)
ProcessStream processes large input in chunks with reduced memory footprint
type StructuralCollapse ¶
type StructuralCollapse struct {
// contains filtered or unexported fields
}
StructuralCollapse merges import blocks and collapses repeated patterns. Inspired by claw-compactor's StructuralCollapse stage.
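The repeated-pattern half of this stage can be sketched as a run-length collapse over identical lines. A hypothetical illustration, assuming a MaxRepeated-style threshold as in the config below:

```go
package main

import (
	"fmt"
	"strings"
)

// collapseRepeats replaces runs of identical lines longer than maxRepeated
// with a single line annotated with the repeat count.
func collapseRepeats(input string, maxRepeated int) string {
	lines := strings.Split(input, "\n")
	var out []string
	for i := 0; i < len(lines); {
		j := i
		for j < len(lines) && lines[j] == lines[i] {
			j++
		}
		if run := j - i; run > maxRepeated {
			out = append(out, fmt.Sprintf("%s (×%d)", lines[i], run))
		} else {
			out = append(out, lines[i:j]...)
		}
		i = j
	}
	return strings.Join(out, "\n")
}

func main() {
	fmt.Println(collapseRepeats("ok\nok\nok\nok\ndone", 2)) // prints "ok (×4)" then "done"
}
```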
func NewStructuralCollapse ¶
func NewStructuralCollapse(cfg StructuralCollapseConfig) *StructuralCollapse
NewStructuralCollapse creates a new StructuralCollapse filter.
type StructuralCollapseConfig ¶
type StructuralCollapseConfig struct {
Enabled bool
CollapseImports bool
CollapseAsserts bool
MaxRepeated int
}
StructuralCollapseConfig holds configuration.
func DefaultStructuralCollapseConfig ¶
func DefaultStructuralCollapseConfig() StructuralCollapseConfig
DefaultStructuralCollapseConfig returns default configuration.
type SymbolicCompressFilter ¶
type SymbolicCompressFilter struct {
// contains filtered or unexported fields
}
SymbolicCompressFilter implements MetaGlyph-style symbolic instruction compression. Research Source: "Semantic Compression of LLM Instructions via Symbolic Metalanguages" (Jan 2026) Key Innovation: Replace verbose natural-language instructions with compact symbolic notation. Results: 62-81% token reduction on instruction patterns while preserving semantic fidelity.
This compresses common instruction patterns found in system prompts, CLI help text, and configuration documentation into compact symbolic representations.
func NewSymbolicCompressFilter ¶
func NewSymbolicCompressFilter() *SymbolicCompressFilter
NewSymbolicCompressFilter creates a new symbolic compression filter
func (*SymbolicCompressFilter) Apply ¶
func (f *SymbolicCompressFilter) Apply(input string, mode Mode) (string, int)
Apply applies symbolic compression to instruction-style content
func (*SymbolicCompressFilter) Name ¶
func (f *SymbolicCompressFilter) Name() string
Name returns the filter name
type SymbolicConfig ¶
type SymbolicConfig struct {
// Enabled controls whether the filter is active
Enabled bool
// MinContentLength is the minimum character length to apply
MinContentLength int
// PreserveStructure keeps line breaks for readability
PreserveStructure bool
}
SymbolicConfig holds configuration for symbolic compression
func DefaultSymbolicConfig ¶
func DefaultSymbolicConfig() SymbolicConfig
DefaultSymbolicConfig returns default configuration
type TDDConfig ¶
TDDConfig holds configuration for Token Dense Dialect.
func DefaultTDDConfig ¶
func DefaultTDDConfig() TDDConfig
DefaultTDDConfig returns default TDD configuration.
type TFIDFConfig ¶
type TFIDFConfig struct {
// Enabled controls whether the filter is active
Enabled bool
// MinSentences is the minimum number of sentences required to apply filtering
MinSentences int
// Threshold for sentence importance (0.0-1.0)
// Sentences below this TF-IDF score are pruned
Threshold float64
// MinContentLength is the minimum character length to apply filtering
MinContentLength int
}
TFIDFConfig holds configuration for TF-IDF filtering
func DefaultTFIDFConfig ¶
func DefaultTFIDFConfig() TFIDFConfig
DefaultTFIDFConfig returns default configuration
type TFIDFFilter ¶
type TFIDFFilter struct {
// contains filtered or unexported fields
}
TFIDFFilter implements DSPC-style coarse-grained TF-IDF filtering. Research Source: "DSPC: Dual-Stage Progressive Compression" (Sep 2025) Key Innovation: Training-free coarse-to-fine compression using TF-IDF for sentence filtering + attention contribution for token pruning. Results: Outperforms LongLLMLingua by 7.76% while using 3x fewer tokens.
This is a NEW pre-filter layer that runs before expensive layers (L2-L5). It scores sentences by TF-IDF and removes low-information sentences early, reducing the token budget for subsequent processing.
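The sentence-scoring idea can be illustrated with a minimal TF-IDF sketch. Assumed simplifications: whitespace tokenization and pre-split sentences; the real filter's tokenization and thresholding differ:

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// tfidfScores treats each sentence as a document and scores it by the mean
// TF-IDF weight of its terms; low-scoring sentences become pruning candidates.
func tfidfScores(sentences []string) []float64 {
	df := map[string]int{}
	tokenized := make([][]string, len(sentences))
	for i, s := range sentences {
		tokenized[i] = strings.Fields(strings.ToLower(s))
		seen := map[string]bool{}
		for _, t := range tokenized[i] {
			if !seen[t] {
				seen[t] = true
				df[t]++ // document frequency: how many sentences contain t
			}
		}
	}
	n := float64(len(sentences))
	scores := make([]float64, len(sentences))
	for i, toks := range tokenized {
		if len(toks) == 0 {
			continue
		}
		tf := map[string]float64{}
		for _, t := range toks {
			tf[t]++
		}
		var sum float64
		for t, c := range tf {
			sum += (c / float64(len(toks))) * math.Log(n/float64(df[t]))
		}
		scores[i] = sum / float64(len(tf))
	}
	return scores
}

func main() {
	fmt.Println(tfidfScores([]string{
		"the build failed with a linker error",
		"compiling package one of twelve",
		"the build failed with a linker error in module foo",
	}))
}
```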
func NewTFIDFFilterWithConfig ¶
func NewTFIDFFilterWithConfig(cfg TFIDFConfig) *TFIDFFilter
NewTFIDFFilterWithConfig creates a filter with custom config
type TFIDFLayerConfig ¶
TFIDFLayerConfig groups TF-IDF filter settings.
type TOMLPipelineConfig ¶
type TOMLPipelineConfig struct {
Pipeline PipelineSection `toml:"pipeline"`
Layers LayersSection `toml:"layers"`
Safety SafetySection `toml:"safety"`
}
TOMLPipelineConfig holds TOML-based pipeline configuration.
type TOONConfig ¶
type TOONConfig struct {
Enabled bool
MinArrayLength int
MaxColumns int
PruneMetadata bool
StripLineNumbers bool
}
TOONConfig holds configuration for TOON encoding.
func DefaultTOONConfig ¶
func DefaultTOONConfig() TOONConfig
DefaultTOONConfig returns default TOON configuration.
type TOONEncoder ¶
type TOONEncoder struct {
// contains filtered or unexported fields
}
TOONEncoder implements columnar encoding for homogeneous JSON arrays. Inspired by kompact and tamp's TOON encoding. Achieves 40-80% compression on structured data like file listings, deps, routes.
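The columnar idea can be sketched like this: repeated JSON keys are emitted once as a header, then each record becomes one compact row. The delimiter and layout here are illustrative assumptions, not the encoder's actual wire format:

```go
package main

import (
	"fmt"
	"strings"
)

// toColumnar turns a slice of homogeneous records into a header row plus one
// compact row per record, so per-record key repetition is eliminated.
func toColumnar(keys []string, rows []map[string]string) string {
	var b strings.Builder
	b.WriteString(strings.Join(keys, "|"))
	for _, r := range rows {
		vals := make([]string, len(keys))
		for i, k := range keys {
			vals[i] = r[k]
		}
		b.WriteString("\n" + strings.Join(vals, "|"))
	}
	return b.String()
}

func main() {
	rows := []map[string]string{
		{"path": "main.go", "size": "1204"},
		{"path": "filter.go", "size": "8830"},
	}
	fmt.Println(toColumnar([]string{"path", "size"}, rows))
}
```

The savings grow with array length, which is why a MinArrayLength gate makes sense: short arrays don't amortize the header row.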
func NewTOONEncoder ¶
func NewTOONEncoder(cfg TOONConfig) *TOONEncoder
NewTOONEncoder creates a new TOON encoder.
type TaskRunnerWrapping ¶
type TaskRunnerWrapping struct {
// contains filtered or unexported fields
}
TaskRunnerWrapping wraps task runner recipes for individual line filtering. Inspired by tokf's task runner wrapping.
func NewTaskRunnerWrapping ¶
func NewTaskRunnerWrapping(runner, filterCmd string) *TaskRunnerWrapping
NewTaskRunnerWrapping creates a new task runner wrapper.
func (*TaskRunnerWrapping) Wrap ¶
func (trw *TaskRunnerWrapping) Wrap(content string) string
Wrap wraps a Makefile or Justfile for tokman filtering.
type TemplatePipe ¶
type TemplatePipe struct {
// contains filtered or unexported fields
}
TemplatePipe implements template pipe chains for filter output processing.
func NewTemplatePipe ¶
func NewTemplatePipe(chain string) *TemplatePipe
NewTemplatePipe creates a new template pipe from a pipe chain string.
func (*TemplatePipe) Process ¶
func (tp *TemplatePipe) Process(input string) string
Process applies the pipe chain to input.
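A pipe chain of the kind described above can be modeled as named stages applied left to right. The stage names trim, upper, and lower here are hypothetical examples, not the package's actual stage vocabulary:

```go
package main

import (
	"fmt"
	"strings"
)

// applyChain parses a "trim | upper"-style chain string and applies each
// recognized stage in order; unrecognized stage names are skipped.
func applyChain(chain, input string) string {
	stages := map[string]func(string) string{
		"trim":  strings.TrimSpace,
		"upper": strings.ToUpper,
		"lower": strings.ToLower,
	}
	out := input
	for _, name := range strings.Split(chain, "|") {
		if fn, ok := stages[strings.TrimSpace(name)]; ok {
			out = fn(out)
		}
	}
	return out
}

func main() {
	fmt.Println(applyChain("trim | upper", "  build ok  ")) // prints "BUILD OK"
}
```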
type Tier ¶
type Tier string
Tier defines the depth of the compression pipeline. Higher tiers activate more layers for deeper compression.
const (
	// Tier 1: Surface — removes obvious noise, keeps everything intact
	TierSurface Tier = "surface" // 3 layers, 30-50% reduction
	// Tier 2: Trim — cuts dead weight, keeps structure
	TierTrim Tier = "trim" // 12 layers, 50-70% reduction
	// Tier 3: Extract — pulls out the essence
	TierExtract Tier = "extract" // 24 layers, 70-90% reduction
	// Tier 4: Core — bare minimum, maximum compression
	TierCore Tier = "core" // All 37 layers, 90%+ reduction
	// Tier C: Code — code-aware, preserves syntax structure
	TierCode Tier = "code" // 8 layers, 50-70% reduction
	// Tier L: Log — log-aware, deduplicates and groups
	TierLog Tier = "log" // 7 layers, 60-80% reduction
	// Tier T: Thread — conversation-aware, preserves context
	TierThread Tier = "thread" // 6 layers, 55-75% reduction
)
const (
	ProfileFast     Tier = TierSurface
	ProfileBalanced Tier = TierTrim
	ProfileCode     Tier = TierCode
	ProfileLog      Tier = TierLog
	ProfileChat     Tier = TierThread
	ProfileMax      Tier = TierCore

	ModeSkim       Tier = TierSurface
	ModeRefine     Tier = TierTrim
	ModeDistill    Tier = TierExtract
	ModeAnnihilate Tier = TierCore
)
type TieredSummary ¶
type TieredSummary struct {
L0 string // Ultra-compact (1-2 lines)
L1 string // Compact (5-10 lines)
L2 string // Detailed (full context with structure)
}
TieredSummary generates L0/L1/L2 tiered summaries of compressed content. Inspired by claw-compactor's tiered summary system.
func GenerateTieredSummary ¶
func GenerateTieredSummary(original, compressed string) TieredSummary
GenerateTieredSummary creates multi-resolution summaries.
type TokenDenseDialect ¶
type TokenDenseDialect struct {
// contains filtered or unexported fields
}
TokenDenseDialect implements symbol shorthand for compact LLM communication. Inspired by lean-ctx's Token Dense Dialect (TDD). Replaces common programming terms with Unicode symbols for 8-25% extra savings.
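The symbol-shorthand idea can be illustrated with a tiny reversible mapping. The symbol table below is hypothetical; the real dialect's table lives in tdd.go:

```go
package main

import (
	"fmt"
	"strings"
)

// A toy, reversible symbol table: each multi-character term maps to a
// single Unicode rune, and decoding inverts the same table.
var terms = [][2]string{
	{"function", "ƒ"},
	{"return", "⏎"},
	{"error", "⚠"},
}

// encode replaces each term with its symbol.
func encode(s string) string {
	for _, p := range terms {
		s = strings.ReplaceAll(s, p[0], p[1])
	}
	return s
}

// decode restores the original terms; the round trip is lossless as long
// as the symbols never occur in the source text.
func decode(s string) string {
	for _, p := range terms {
		s = strings.ReplaceAll(s, p[1], p[0])
	}
	return s
}

func main() {
	in := "function returns error"
	enc := encode(in)
	fmt.Println(enc, "->", decode(enc))
}
```

Using symbols absent from normal program text is what keeps Decode safe; a production table must also avoid terms that are prefixes of each other being applied in the wrong order.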
func NewTokenDenseDialect ¶
func NewTokenDenseDialect(cfg TDDConfig) *TokenDenseDialect
NewTokenDenseDialect creates a new TDD encoder.
func (*TokenDenseDialect) Decode ¶
func (tdd *TokenDenseDialect) Decode(input string) string
Decode restores original terms from symbols.
func (*TokenDenseDialect) Encode ¶
func (tdd *TokenDenseDialect) Encode(input string) (string, int)
Encode replaces common terms with Unicode symbols.
func (*TokenDenseDialect) EncodeWithStats ¶
func (tdd *TokenDenseDialect) EncodeWithStats(input string) (string, TDDStats)
EncodeWithStats encodes and returns statistics.
type TokenQuantFilter ¶
type TokenQuantFilter struct {
// contains filtered or unexported fields
}
Paper: "TurboQuant: Extreme KV Cache Compression" — Google Research, 2026 TokenQuantFilter implements token-level quantization — replaces verbose tokens with shorter equivalents while preserving semantic meaning.
func NewTokenQuantFilter ¶
func NewTokenQuantFilter() *TokenQuantFilter
NewTokenQuantFilter creates a new token quantization filter.
func (*TokenQuantFilter) Apply ¶
func (f *TokenQuantFilter) Apply(input string, mode Mode) (string, int)
Apply quantizes verbose tokens to shorter equivalents.
func (*TokenQuantFilter) Name ¶
func (f *TokenQuantFilter) Name() string
Name returns the layer name.
type TokenRetentionFilter ¶
type TokenRetentionFilter struct {
// contains filtered or unexported fields
}
Paper: "Cache What Lasts: Token Retention for Memory-Bounded KV Cache" — Bui et al., Yale/JPMorgan, 2026 TokenRetentionFilter identifies tokens that should be retained in memory based on their lasting importance across the context window.
func NewTokenRetentionFilter ¶
func NewTokenRetentionFilter() *TokenRetentionFilter
NewTokenRetentionFilter creates a new token retention filter.
func (*TokenRetentionFilter) Apply ¶
func (f *TokenRetentionFilter) Apply(input string, mode Mode) (string, int)
Apply retains tokens with lasting importance, prunes transient ones.
func (*TokenRetentionFilter) Name ¶
func (f *TokenRetentionFilter) Name() string
Name returns the layer name.
type Turn ¶
type Turn struct {
Role string // "user" or "assistant"
Content string
Timestamp time.Time
Hash string
Tokens int
}
Turn represents a single conversation turn
type ValidationResult ¶
ValidationResult holds validation results.
Source Files
¶
- acon.go
- adaptive.go
- agent_memory.go
- ansi.go
- ast_preserve.go
- attention_sink.go
- attribution.go
- auto_validation.go
- beaver.go
- bm25.go
- brace_depth.go
- budget.go
- claw_compactor_stages.go
- comment_patterns.go
- compaction.go
- content_detect.go
- contrastive.go
- dedup.go
- density_adaptive.go
- doc.go
- dynamic_ratio.go
- engram.go
- entropy.go
- equivalence.go
- evaluator_heads.go
- feedback_loop.go
- filter.go
- gist.go
- goal_driven.go
- h2o.go
- hierarchical.go
- hierarchical_summary.go
- hypernym_compress.go
- import.go
- inter_layer_feedback.go
- kv_cache.go
- kvzip_filter.go
- lazy_pruner.go
- llm_aware.go
- llm_compress.go
- lru_cache.go
- manager.go
- meta_token.go
- mixed_dim.go
- multi_file.go
- ngram.go
- noise.go
- numerical_quant.go
- perplexity.go
- phrase_grouping.go
- pipeline_accessors.go
- pipeline_gates.go
- pipeline_init.go
- pipeline_process.go
- pipeline_stats.go
- pipeline_toml.go
- pipeline_types.go
- poc.go
- position_aware.go
- presets.go
- query_aware.go
- question_aware.go
- read_modes.go
- reversible.go
- rtk_stages.go
- scope_filter.go
- semantic.go
- semantic_anchor.go
- semantic_cache.go
- semantic_chunk.go
- semantic_intent.go
- session.go
- sketch_store.go
- smallkv_compensate.go
- streaming.go
- swezze.go
- symbolic_compress.go
- tdd.go
- tfidf_filter.go
- tiered_summary.go
- token_quant.go
- token_retention.go
- toon.go
- utils.go
Directories
¶
| Path | Synopsis |
|---|---|
| cache | Package cache provides caching support for the filter pipeline. |
| engine | Package engine provides the lightweight filter engine for quick output post-processing, distinct from the full 26+ layer pipeline. |
| layers | Package layers contains all compression layer implementations. |