compression

package
v0.1.44
Published: Feb 13, 2026
License: AGPL-3.0, AGPL-3.0-or-later
Imports: 14
Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func EstimateTokens

func EstimateTokens(content []byte) int

EstimateTokens is a fast fallback for when no tiktoken encoding is available. It approximates the count at roughly 4 characters per token.
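As a rough illustration of the 4-characters-per-token heuristic, the sketch below uses a hypothetical local function, estimateTokens, that divides the byte length by 4 and rounds up. The rounding direction and the use of bytes rather than runes are assumptions; the package's own implementation may differ.

```go
package main

import "fmt"

// estimateTokens is a hypothetical stand-in for EstimateTokens.
// Assumption: one token per 4 bytes, rounded up.
func estimateTokens(content []byte) int {
	const bytesPerToken = 4
	return (len(content) + bytesPerToken - 1) / bytesPerToken
}

func main() {
	fmt.Println(estimateTokens([]byte("hello, world"))) // 12 bytes
	fmt.Println(estimateTokens(nil))                    // empty input
}
```

A heuristic like this is cheap enough to run on every request, which is presumably why it serves as the fallback when no encoding can be loaded.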

Types

type CompressedPointer

type CompressedPointer struct {
	OriginalTokens   int    `json:"original_tokens"`
	CompressedTokens int    `json:"compressed_tokens"`
	Strategy         string `json:"strategy"`
	CreatedAt        int64  `json:"created_at"`
	Size             int    `json:"size"`
}

CompressedPointer is a small metadata record stored in Redis that maps a (workspace, queryPath, resultID, strategy) tuple to token counts.
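The JSON tags suggest the pointer is serialized as JSON before it is written to Redis. The sketch below mirrors the documented fields in a local type (compressedPointer) and shows the resulting encoding. The helper encodePointer and the field values are illustrative assumptions, not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// compressedPointer mirrors the documented fields and JSON tags of
// CompressedPointer. It is a local copy for illustration only.
type compressedPointer struct {
	OriginalTokens   int    `json:"original_tokens"`
	CompressedTokens int    `json:"compressed_tokens"`
	Strategy         string `json:"strategy"`
	CreatedAt        int64  `json:"created_at"`
	Size             int    `json:"size"`
}

// encodePointer is a hypothetical helper that returns the JSON form.
func encodePointer(p compressedPointer) (string, error) {
	b, err := json.Marshal(p)
	return string(b), err
}

func main() {
	s, err := encodePointer(compressedPointer{
		OriginalTokens:   1200, // illustrative values
		CompressedTokens: 300,
		Strategy:         "strip",
		CreatedAt:        1700000000,
		Size:             1024,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```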

type CompressedStore

type CompressedStore struct {
	// contains filtered or unexported fields
}

CompressedStore handles read/write of compressed content pointers and cached content in Redis, with a per-workspace byte budget.

func NewCompressedStore

func NewCompressedStore(rdb RedisClient, cfg Config) *CompressedStore

NewCompressedStore creates a store backed by Redis.

func (*CompressedStore) FlushWorkspace added in v0.1.44

func (s *CompressedStore) FlushWorkspace(ctx context.Context, workspaceId uint) (int, error)

FlushWorkspace deletes all compressed pointers, content, and usage keys for a workspace. The RedisClient must implement ScanClient (e.g., *common.RedisClient) for pattern-based key discovery.

func (*CompressedStore) GetContent

func (s *CompressedStore) GetContent(ctx context.Context, workspaceId uint, queryPath, resultID, strategy string) []byte

GetContent reads cached compressed content. Returns nil on miss.

func (*CompressedStore) GetPointer

func (s *CompressedStore) GetPointer(ctx context.Context, workspaceId uint, queryPath, resultID, strategy string) *CompressedPointer

GetPointer reads the compression pointer for a given result and strategy. Returns nil if not found.

func (*CompressedStore) SetContent

func (s *CompressedStore) SetContent(ctx context.Context, workspaceId uint, queryPath, resultID, strategy string, content []byte) error

SetContent caches compressed content subject to the per-workspace byte budget. If adding this entry would exceed the budget, the write is silently skipped.
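The skip-on-overflow behaviour can be pictured with an in-memory stand-in. The sketch below is not the Redis-backed implementation: budgetCache, newBudgetCache, and set are hypothetical names, and the accounting (a per-workspace running byte total checked before each write) is an assumption about how such a budget might be enforced.

```go
package main

import (
	"fmt"
	"sync"
)

// budgetCache is a hypothetical in-memory stand-in for the content cache.
type budgetCache struct {
	mu       sync.Mutex
	maxBytes int64            // per-workspace budget
	used     map[uint]int64   // bytes currently stored per workspace
	items    map[string][]byte
}

func newBudgetCache(maxBytes int64) *budgetCache {
	return &budgetCache{
		maxBytes: maxBytes,
		used:     make(map[uint]int64),
		items:    make(map[string][]byte),
	}
}

// set stores content unless that would push the workspace over budget.
// It reports whether the entry was stored; a skipped write is not an error.
func (c *budgetCache) set(workspaceID uint, key string, content []byte) bool {
	c.mu.Lock()
	defer c.mu.Unlock()

	k := fmt.Sprintf("%d:%s", workspaceID, key)
	// Overwriting an entry only costs the size difference.
	delta := int64(len(content)) - int64(len(c.items[k]))
	if c.used[workspaceID]+delta > c.maxBytes {
		return false
	}
	c.used[workspaceID] += delta
	c.items[k] = append([]byte(nil), content...)
	return true
}

func main() {
	c := newBudgetCache(10)
	fmt.Println(c.set(1, "a", []byte("123456"))) // fits in workspace 1
	fmt.Println(c.set(1, "b", []byte("123456"))) // would exceed 10 bytes, skipped
	fmt.Println(c.set(2, "b", []byte("123456"))) // workspace 2 has its own budget
}
```

Because a skipped write is silent, callers should treat the cache as best-effort and be prepared for GetContent to miss even right after a SetContent call.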

func (*CompressedStore) SetPointer

func (s *CompressedStore) SetPointer(ctx context.Context, workspaceId uint, queryPath, resultID, strategy string, ptr *CompressedPointer) error

SetPointer writes the compression pointer with the same TTL as the cached content, so the two expire together. Pointers are small (~200 bytes).

type CompressionResult

type CompressionResult struct {
	Data             []byte
	OriginalTokens   int
	CompressedTokens int
	Strategy         Strategy
	Outcome          Outcome
	DurationMs       int64
}

CompressionResult is returned by every Compress call.

type Config

type Config struct {
	Strategy             Strategy      `yaml:"strategy"`
	CacheEnabled         bool          `yaml:"cacheEnabled"`
	TokenThreshold       int           `yaml:"tokenThreshold"`
	MaxContentBytes      int           `yaml:"maxContentBytes"`
	TokenEncoding        string        `yaml:"tokenEncoding"`
	Timeout              time.Duration `yaml:"timeout"`
	ContentCacheMaxBytes int64         `yaml:"contentCacheMaxBytes"` // cache only
	ContentCacheTTL      time.Duration `yaml:"contentCacheTTL"`      // cache only
}

Config holds all compression settings.

func DefaultConfig

func DefaultConfig() Config

func (Config) DefaultTimeout

func (c Config) DefaultTimeout() time.Duration

type ContentMeta

type ContentMeta struct {
	Integration string // Source type: types.SourceGmail, types.SourceGitHub, etc.
	QueryPath   string
	ResultID    string
	Filename    string
	MimeHint    string
}

ContentMeta provides context about the content being compressed.

type ContextCompressor

type ContextCompressor interface {
	Name() Strategy
	Compress(ctx context.Context, content []byte, meta ContentMeta) (*CompressionResult, error)
}

ContextCompressor transforms raw content into a smaller representation. Implementations must be safe for concurrent use.

func NewCompressor

func NewCompressor(strategy Strategy, cfg Config) (ContextCompressor, error)

type Outcome

type Outcome string

const (
	OutcomeCompressed  Outcome = "compressed"
	OutcomeCacheHit    Outcome = "cache_hit"
	OutcomePassthrough Outcome = "passthrough"
	OutcomeTimeout     Outcome = "timeout"
	OutcomeError       Outcome = "error"
	OutcomeSkipped     Outcome = "skipped"
)

type RedisClient

type RedisClient interface {
	Get(ctx context.Context, key string) *redis.StringCmd
	Set(ctx context.Context, key string, value interface{}, expiration time.Duration) *redis.StatusCmd
	IncrBy(ctx context.Context, key string, value int64) *redis.IntCmd
	Expire(ctx context.Context, key string, expiration time.Duration) *redis.BoolCmd
	Del(ctx context.Context, keys ...string) *redis.IntCmd
	Pipeline() redis.Pipeliner
}

RedisClient is the minimal Redis interface used by CompressedStore. Both *redis.Client and *common.RedisClient satisfy this.

type ScanClient added in v0.1.44

type ScanClient interface {
	Scan(ctx context.Context, pattern string) ([]string, error)
}

ScanClient is an optional extension for RedisClient that supports key scanning. *common.RedisClient satisfies this.

type Strategy

type Strategy string

const (
	StrategyStrip       Strategy = "strip"
	StrategyPassthrough Strategy = "passthrough"
)

func ParseStrategy

func ParseStrategy(s string) (Strategy, error)

func (Strategy) String

func (s Strategy) String() string

func (Strategy) Valid

func (s Strategy) Valid() bool
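The page does not describe how ParseStrategy, String, and Valid behave, so the following is only a sketch of the common pattern for string-backed enums in Go. The types are redeclared locally, parseStrategy is a hypothetical function, and the trimming and lower-casing of input are assumptions that the real ParseStrategy may not share.

```go
package main

import (
	"fmt"
	"strings"
)

// Local mirror of the documented Strategy type and constants.
type Strategy string

const (
	StrategyStrip       Strategy = "strip"
	StrategyPassthrough Strategy = "passthrough"
)

// String returns the underlying string value.
func (s Strategy) String() string { return string(s) }

// Valid reports whether s is one of the known strategies.
func (s Strategy) Valid() bool {
	switch s {
	case StrategyStrip, StrategyPassthrough:
		return true
	}
	return false
}

// parseStrategy is hypothetical. Assumption: input is normalized by
// trimming whitespace and lower-casing before validation.
func parseStrategy(raw string) (Strategy, error) {
	s := Strategy(strings.ToLower(strings.TrimSpace(raw)))
	if !s.Valid() {
		return "", fmt.Errorf("unknown compression strategy %q", raw)
	}
	return s, nil
}

func main() {
	s, err := parseStrategy(" Strip ")
	fmt.Println(s, err)

	_, err = parseStrategy("gzip")
	fmt.Println(err)
}
```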

type StripCompressor

type StripCompressor struct {
	// contains filtered or unexported fields
}

func NewStripCompressor

func NewStripCompressor(cfg Config) *StripCompressor

func (*StripCompressor) Compress

func (s *StripCompressor) Compress(ctx context.Context, content []byte, meta ContentMeta) (*CompressionResult, error)

func (*StripCompressor) Name

func (s *StripCompressor) Name() Strategy

type TokenCounter

type TokenCounter struct {
	// contains filtered or unexported fields
}

TokenCounter counts tokens using a tiktoken encoding. A nil *TokenCounter is safe to use; it falls back to EstimateTokens.

func DefaultTokenCounter

func DefaultTokenCounter() *TokenCounter

DefaultTokenCounter returns a shared counter using cl100k_base. Lazily initialized on first call.

func NewTokenCounter

func NewTokenCounter(encoding string) (*TokenCounter, error)

NewTokenCounter creates a counter for the given encoding name. Common encodings: "cl100k_base" (GPT-4), "o200k_base" (GPT-4o).

func (*TokenCounter) Count

func (tc *TokenCounter) Count(content []byte) int

Count returns the number of tokens in content. Safe to call on a nil receiver (falls back to EstimateTokens).
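Go allows methods to be called on a nil pointer receiver as long as the method checks for nil before touching any fields. The sketch below illustrates that pattern with a hypothetical tokenCounter whose encode field stands in for a real tiktoken encoder. The estimate function (4 bytes per token, rounded up) is likewise an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenCounter is a hypothetical stand-in for TokenCounter. The encode
// field takes the place of a real tokenizer.
type tokenCounter struct {
	encode func([]byte) int
}

// estimate is the assumed fallback: ~4 bytes per token, rounded up.
func estimate(content []byte) int {
	return (len(content) + 3) / 4
}

// Count is safe on a nil receiver because it checks tc before using it.
func (tc *tokenCounter) Count(content []byte) int {
	if tc == nil || tc.encode == nil {
		return estimate(content)
	}
	return tc.encode(content)
}

func main() {
	var none *tokenCounter // nil counter, uses the estimate
	words := &tokenCounter{encode: func(b []byte) int {
		return len(strings.Fields(string(b))) // toy encoder: one token per word
	}}

	text := []byte("compress this text please")
	fmt.Println(none.Count(text), words.Count(text))
}
```

This design lets callers pass a counter through unconditionally, without branching on whether an encoding was successfully loaded.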
