compression

package
v0.1.147
Published: Apr 7, 2026 License: AGPL-3.0, AGPL-3.0-or-later Imports: 14 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func EstimateTokens

func EstimateTokens(content []byte) int

EstimateTokens is a fast fallback when no tiktoken encoding is available. Approximation: ~4 characters per token.
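
A minimal sketch of the ~4-characters-per-token heuristic described above. It assumes the estimate is taken over bytes and rounded up; the package's actual rounding and its handling of multi-byte characters are not documented here. The name estimateTokens is a hypothetical stand-in, not the package function.

```go
package main

import "fmt"

// estimateTokens is a hypothetical stand-in for EstimateTokens.
// Assumption: one token per 4 bytes, rounded up, so any non-empty
// input counts as at least one token.
func estimateTokens(content []byte) int {
	return (len(content) + 3) / 4
}

func main() {
	fmt.Println(estimateTokens([]byte("hello, world"))) // 12 bytes
	fmt.Println(estimateTokens(nil))                    // empty input
}
```

An estimate like this is cheap enough for a hot path, but it can drift noticeably from a real tokenizer on code or non-English text.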

func SetCompressionStats added in v0.1.45

func SetCompressionStats(ctx context.Context, originalBytes int, result *CompressionResult, strategy string)

SetCompressionStats fills the CompressionStats stored on the context (if present). Safe to call with a nil result.

Types

type CompressedPointer

type CompressedPointer struct {
	OriginalTokens   int    `json:"original_tokens"`
	CompressedTokens int    `json:"compressed_tokens"`
	Strategy         string `json:"strategy"`
	CreatedAt        int64  `json:"created_at"`
	Size             int    `json:"size"`
}

CompressedPointer is small metadata stored in Redis that maps a (workspace, queryPath, resultID, strategy) tuple to token counts.
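
The struct tags suggest the pointer is serialized as JSON before it is written to Redis. The sketch below mirrors the struct and shows the resulting encoding. That JSON is the stored format is an assumption, as are the units in the sample values (Unix seconds for CreatedAt, bytes for Size); encodePointer is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CompressedPointer mirrors the struct documented above.
type CompressedPointer struct {
	OriginalTokens   int    `json:"original_tokens"`
	CompressedTokens int    `json:"compressed_tokens"`
	Strategy         string `json:"strategy"`
	CreatedAt        int64  `json:"created_at"`
	Size             int    `json:"size"`
}

// encodePointer is a hypothetical helper that serializes a pointer to JSON.
// It returns an empty string if marshaling fails.
func encodePointer(p CompressedPointer) string {
	b, err := json.Marshal(p)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	p := CompressedPointer{
		OriginalTokens:   1200,
		CompressedTokens: 300,
		Strategy:         "strip",
		CreatedAt:        1700000000, // assumed to be Unix seconds
		Size:             1024,       // assumed to be bytes
	}
	fmt.Println(encodePointer(p))
}
```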

type CompressedStore

type CompressedStore struct {
	// contains filtered or unexported fields
}

CompressedStore handles read/write of compressed content pointers and cached content in Redis, with a per-workspace byte budget.

func NewCompressedStore

func NewCompressedStore(rdb RedisClient, cfg Config) *CompressedStore

NewCompressedStore creates a store backed by Redis.

func (*CompressedStore) FlushWorkspace added in v0.1.44

func (s *CompressedStore) FlushWorkspace(ctx context.Context, workspaceId uint) (int, error)

FlushWorkspace deletes all compressed pointers, content, and usage keys for a workspace. The RedisClient must implement ScanClient (e.g., *common.RedisClient) for pattern-based key discovery.

func (*CompressedStore) GetContent

func (s *CompressedStore) GetContent(ctx context.Context, workspaceId uint, queryPath, resultID, strategy string) []byte

GetContent reads cached compressed content. Returns nil on miss.

func (*CompressedStore) GetPointer

func (s *CompressedStore) GetPointer(ctx context.Context, workspaceId uint, queryPath, resultID, strategy string) *CompressedPointer

GetPointer reads the compression pointer for a result+strategy. Returns nil if not found.

func (*CompressedStore) SetContent

func (s *CompressedStore) SetContent(ctx context.Context, workspaceId uint, queryPath, resultID, strategy string, content []byte) error

SetContent caches compressed content subject to the per-workspace byte budget. If adding this entry would exceed the budget, the write is silently skipped.
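
The budget rule can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions, not the Redis-backed implementation: budgetCache, its key format, and its boolean return value are all hypothetical, and the real store tracks usage in Redis rather than in a map.

```go
package main

import (
	"fmt"
	"sync"
)

// budgetCache is a hypothetical in-memory stand-in for CompressedStore.
type budgetCache struct {
	mu       sync.Mutex
	maxBytes int64             // per-workspace byte budget
	used     map[uint]int64    // bytes currently used per workspace
	content  map[string][]byte // cached content by key
}

func newBudgetCache(maxBytes int64) *budgetCache {
	return &budgetCache{
		maxBytes: maxBytes,
		used:     make(map[uint]int64),
		content:  make(map[string][]byte),
	}
}

// set stores content unless that would push the workspace over budget.
// An over-budget write is skipped without an error; the bool only
// exists so the demo can observe the skip.
func (c *budgetCache) set(workspaceID uint, key string, content []byte) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	k := fmt.Sprintf("%d/%s", workspaceID, key) // hypothetical key format
	delta := int64(len(content)) - int64(len(c.content[k]))
	if c.used[workspaceID]+delta > c.maxBytes {
		return false
	}
	c.content[k] = append([]byte(nil), content...)
	c.used[workspaceID] += delta
	return true
}

// get returns the cached content, or nil on a miss.
func (c *budgetCache) get(workspaceID uint, key string) []byte {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.content[fmt.Sprintf("%d/%s", workspaceID, key)]
}

func main() {
	cache := newBudgetCache(10)
	fmt.Println(cache.set(1, "a", []byte("123456"))) // fits in the budget
	fmt.Println(cache.set(1, "b", []byte("123456"))) // would exceed it: skipped
	fmt.Println(cache.get(1, "b") == nil)            // skipped write reads as a miss
	fmt.Println(cache.set(2, "b", []byte("123456"))) // other workspace, own budget
}
```

Because a skipped write is not an error, callers should treat every GetContent miss as normal and be ready to recompress.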

func (*CompressedStore) SetPointer

func (s *CompressedStore) SetPointer(ctx context.Context, workspaceId uint, queryPath, resultID, strategy string, ptr *CompressedPointer) error

SetPointer writes the compression pointer with the same TTL as content. Pointers are small (~200 bytes); they expire alongside the cached content.

type CompressionResult

type CompressionResult struct {
	Data             []byte
	OriginalTokens   int
	CompressedTokens int
	Strategy         CompressionStrategy
	Outcome          Outcome
	DurationMs       int64
}

CompressionResult is returned by every Compress call.

type CompressionStats added in v0.1.45

type CompressionStats struct {
	OriginalBytes    int
	CompressedBytes  int
	OriginalTokens   int
	CompressedTokens int
	Strategy         string
}

CompressionStats is populated by readWithCompression and passed back to the HTTP handler via context so that real token counts can be emitted as response headers. The caller pre-allocates the struct and puts a pointer on the context; the compression layer fills it in.

func GetCompressionStats added in v0.1.45

func GetCompressionStats(ctx context.Context) *CompressionStats

GetCompressionStats returns the stats pointer stored on the context, or nil.

func WithCompressionStats added in v0.1.45

func WithCompressionStats(ctx context.Context) (context.Context, *CompressionStats)

WithCompressionStats returns a derived context carrying a pointer to an empty CompressionStats. The caller retains the pointer and reads it after the service call completes.
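
The pattern described above (the caller allocates, the compression layer fills, the caller reads) might look like the following. This is a self-contained sketch with lowercase stand-ins for the package functions; the context key type and the sample values are assumptions.

```go
package main

import (
	"context"
	"fmt"
)

// CompressionStats mirrors the struct documented above.
type CompressionStats struct {
	OriginalBytes    int
	CompressedBytes  int
	OriginalTokens   int
	CompressedTokens int
	Strategy         string
}

// statsKey is a hypothetical unexported context key type.
type statsKey struct{}

// withStats is a stand-in for WithCompressionStats.
func withStats(ctx context.Context) (context.Context, *CompressionStats) {
	stats := &CompressionStats{}
	return context.WithValue(ctx, statsKey{}, stats), stats
}

// getStats is a stand-in for GetCompressionStats; it returns nil if absent.
func getStats(ctx context.Context) *CompressionStats {
	stats, _ := ctx.Value(statsKey{}).(*CompressionStats)
	return stats
}

// serviceCall simulates the compression layer filling in the stats.
func serviceCall(ctx context.Context) {
	if stats := getStats(ctx); stats != nil {
		stats.OriginalBytes = 4096
		stats.CompressedBytes = 1024
		stats.OriginalTokens = 1024
		stats.CompressedTokens = 256
		stats.Strategy = "strip"
	}
}

// handle simulates the HTTP handler: it keeps the pointer, makes the
// call, then reads the stats once the call has returned.
func handle() CompressionStats {
	ctx, stats := withStats(context.Background())
	serviceCall(ctx)
	return *stats
}

func main() {
	s := handle()
	fmt.Printf("%d -> %d tokens (%s)\n", s.OriginalTokens, s.CompressedTokens, s.Strategy)
}
```

Storing a pointer in the context lets a callee report results back up the stack without changing any function signatures in between.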

type CompressionStrategy added in v0.1.46

type CompressionStrategy string
const (
	CompressionStrategyStrip       CompressionStrategy = "strip"
	CompressionStrategyPassthrough CompressionStrategy = "passthrough"
)

func ParseStrategy

func ParseStrategy(s string) (CompressionStrategy, error)

func (CompressionStrategy) String added in v0.1.46

func (s CompressionStrategy) String() string

func (CompressionStrategy) Valid added in v0.1.46

func (s CompressionStrategy) Valid() bool
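
ParseStrategy, String, and Valid are undocumented here. The sketch below shows one plausible way they could fit together, assuming Valid accepts only the two declared constants and that parsing rejects anything else. The function parseStrategy and its error text are hypothetical.

```go
package main

import "fmt"

// CompressionStrategy mirrors the type documented above.
type CompressionStrategy string

const (
	CompressionStrategyStrip       CompressionStrategy = "strip"
	CompressionStrategyPassthrough CompressionStrategy = "passthrough"
)

// String returns the underlying string value.
func (s CompressionStrategy) String() string { return string(s) }

// Valid reports whether s is one of the declared strategies.
// Assumption: no other values are accepted.
func (s CompressionStrategy) Valid() bool {
	switch s {
	case CompressionStrategyStrip, CompressionStrategyPassthrough:
		return true
	}
	return false
}

// parseStrategy is a hypothetical stand-in for ParseStrategy.
func parseStrategy(raw string) (CompressionStrategy, error) {
	s := CompressionStrategy(raw)
	if !s.Valid() {
		return "", fmt.Errorf("unknown compression strategy %q", raw)
	}
	return s, nil
}

func main() {
	for _, raw := range []string{"strip", "passthrough", "gzip"} {
		s, err := parseStrategy(raw)
		fmt.Println(s.String(), err)
	}
}
```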

type Config

type Config struct {
	Strategy             CompressionStrategy `yaml:"strategy"`
	CacheEnabled         bool                `yaml:"cacheEnabled"`
	TokenThreshold       int                 `yaml:"tokenThreshold"`
	MaxContentBytes      int                 `yaml:"maxContentBytes"`
	TokenEncoding        string              `yaml:"tokenEncoding"`
	Timeout              time.Duration       `yaml:"timeout"`
	ContentCacheMaxBytes int64               `yaml:"contentCacheMaxBytes"` // cache only
	ContentCacheTTL      time.Duration       `yaml:"contentCacheTTL"`      // cache only
}

Config holds all compression settings.

func DefaultConfig

func DefaultConfig() Config

func (Config) DefaultTimeout

func (c Config) DefaultTimeout() time.Duration

type ContentMeta

type ContentMeta struct {
	Integration string // Source type: types.SourceGmail, types.SourceGitHub, etc.
	QueryPath   string
	ResultID    string
	Filename    string
	MimeHint    string
}

ContentMeta provides context about the content being compressed.

type ContextCompressor

type ContextCompressor interface {
	Name() CompressionStrategy
	Compress(ctx context.Context, content []byte, meta ContentMeta) (*CompressionResult, error)
}

ContextCompressor transforms raw content into a smaller representation. Implementations must be safe for concurrent use.
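
To show what satisfying the interface involves, here is a sketch of a trivial implementation that returns its input unchanged. The passthroughCompressor type is hypothetical and is not the package's own passthrough strategy; token counts use the rough 4-bytes-per-token estimate. It holds no mutable state, which is the simplest way to meet the concurrency requirement.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type CompressionStrategy string
type Outcome string

// ContentMeta and CompressionResult mirror the documented structs.
type ContentMeta struct {
	Integration, QueryPath, ResultID, Filename, MimeHint string
}

type CompressionResult struct {
	Data             []byte
	OriginalTokens   int
	CompressedTokens int
	Strategy         CompressionStrategy
	Outcome          Outcome
	DurationMs       int64
}

type ContextCompressor interface {
	Name() CompressionStrategy
	Compress(ctx context.Context, content []byte, meta ContentMeta) (*CompressionResult, error)
}

// passthroughCompressor is a hypothetical, stateless implementation.
type passthroughCompressor struct{}

// Compile-time check that the type satisfies the interface.
var _ ContextCompressor = passthroughCompressor{}

func (passthroughCompressor) Name() CompressionStrategy { return "passthrough" }

func (p passthroughCompressor) Compress(ctx context.Context, content []byte, _ ContentMeta) (*CompressionResult, error) {
	start := time.Now()
	if err := ctx.Err(); err != nil {
		return nil, err // honor cancellation and deadlines
	}
	tokens := (len(content) + 3) / 4 // rough estimate, not a real tokenizer
	return &CompressionResult{
		Data:             content,
		OriginalTokens:   tokens,
		CompressedTokens: tokens,
		Strategy:         p.Name(),
		Outcome:          "passthrough",
		DurationMs:       time.Since(start).Milliseconds(),
	}, nil
}

// runPassthrough is a small driver used by main.
func runPassthrough(input string) (string, Outcome) {
	var c ContextCompressor = passthroughCompressor{}
	res, err := c.Compress(context.Background(), []byte(input), ContentMeta{Filename: "note.txt"})
	if err != nil {
		return "", "error"
	}
	return string(res.Data), res.Outcome
}

func main() {
	data, outcome := runPassthrough("hello")
	fmt.Println(data, outcome)
}
```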

func NewCompressor

func NewCompressor(strategy CompressionStrategy, cfg Config) (ContextCompressor, error)

type Outcome

type Outcome string
const (
	OutcomeCompressed  Outcome = "compressed"
	OutcomeCacheHit    Outcome = "cache_hit"
	OutcomePassthrough Outcome = "passthrough"
	OutcomeTimeout     Outcome = "timeout"
	OutcomeError       Outcome = "error"
	OutcomeSkipped     Outcome = "skipped"
)

type RedisClient

type RedisClient interface {
	Get(ctx context.Context, key string) *redis.StringCmd
	Set(ctx context.Context, key string, value interface{}, expiration time.Duration) *redis.StatusCmd
	IncrBy(ctx context.Context, key string, value int64) *redis.IntCmd
	Expire(ctx context.Context, key string, expiration time.Duration) *redis.BoolCmd
	SAdd(ctx context.Context, key string, members ...interface{}) *redis.IntCmd
	Del(ctx context.Context, keys ...string) *redis.IntCmd
	Pipeline() redis.Pipeliner
}

RedisClient is the minimal Redis interface used by CompressedStore. Both *redis.Client and *common.RedisClient satisfy this.

type ScanClient added in v0.1.44

type ScanClient interface {
	Scan(ctx context.Context, pattern string) ([]string, error)
}

ScanClient is an optional extension for RedisClient that supports key scanning. *common.RedisClient satisfies this.
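
An optional interface like this is usually detected with a type assertion at the point of use. The sketch below shows that idiom with two hypothetical client types. Returning an error when the assertion fails is an assumption about how FlushWorkspace behaves, and path.Match only approximates Redis glob matching.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"path"
)

// ScanClient mirrors the interface documented above.
type ScanClient interface {
	Scan(ctx context.Context, pattern string) ([]string, error)
}

// basicClient is a hypothetical client without key scanning.
type basicClient struct{}

// scanningClient is a hypothetical client that scans an in-memory key list.
type scanningClient struct{ keys []string }

func (c scanningClient) Scan(_ context.Context, pattern string) ([]string, error) {
	var out []string
	for _, k := range c.keys {
		ok, err := path.Match(pattern, k)
		if err != nil {
			return nil, err
		}
		if ok {
			out = append(out, k)
		}
	}
	return out, nil
}

// discoverKeys uses Scan if the client supports it, else reports an error.
func discoverKeys(ctx context.Context, client interface{}, pattern string) ([]string, error) {
	sc, ok := client.(ScanClient)
	if !ok {
		return nil, errors.New("client does not implement ScanClient")
	}
	return sc.Scan(ctx, pattern)
}

// flushDemo counts the keys that a flush of workspace 1 would discover.
func flushDemo(client interface{}) (int, error) {
	keys, err := discoverKeys(context.Background(), client, "ws:1:*") // hypothetical key pattern
	return len(keys), err
}

func main() {
	fmt.Println(flushDemo(scanningClient{keys: []string{"ws:1:a", "ws:1:b", "ws:2:a"}}))
	fmt.Println(flushDemo(basicClient{}))
}
```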

type StripCompressor

type StripCompressor struct {
	// contains filtered or unexported fields
}

func NewStripCompressor

func NewStripCompressor(cfg Config) *StripCompressor

func (*StripCompressor) Compress

func (s *StripCompressor) Compress(ctx context.Context, content []byte, meta ContentMeta) (*CompressionResult, error)

func (*StripCompressor) Name

func (s *StripCompressor) Name() CompressionStrategy

type TokenCounter

type TokenCounter struct {
	// contains filtered or unexported fields
}

TokenCounter counts tokens using a tiktoken encoding. A nil *TokenCounter is safe to use; it falls back to EstimateTokens.

func DefaultTokenCounter

func DefaultTokenCounter() *TokenCounter

DefaultTokenCounter returns a shared counter using cl100k_base. Lazily initialized on first call.
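
Lazy initialization of a shared value is commonly done with sync.Once. The following is a sketch of that idiom, not the package's code: tokenCounter is a hypothetical stand-in that records only the encoding name, and initCalls exists solely to show that the initializer runs once.

```go
package main

import (
	"fmt"
	"sync"
)

// tokenCounter is a hypothetical stand-in for TokenCounter.
type tokenCounter struct{ encoding string }

var (
	defaultOnce    sync.Once
	defaultCounter *tokenCounter
	initCalls      int // counts initializer runs, for demonstration only
)

// defaultTokenCounter builds the shared counter on first use and
// returns the same instance on every later call.
func defaultTokenCounter() *tokenCounter {
	defaultOnce.Do(func() {
		initCalls++
		defaultCounter = &tokenCounter{encoding: "cl100k_base"}
	})
	return defaultCounter
}

func main() {
	a := defaultTokenCounter()
	b := defaultTokenCounter()
	fmt.Println(a == b, a.encoding, initCalls)
}
```

sync.Once also makes the first call safe when several goroutines race to reach it.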

func NewTokenCounter

func NewTokenCounter(encoding string) (*TokenCounter, error)

NewTokenCounter creates a counter for the given encoding name. Common encodings: "cl100k_base" (GPT-4), "o200k_base" (GPT-4o).

func (*TokenCounter) Count

func (tc *TokenCounter) Count(content []byte) int

Count returns the number of tokens in content. Safe to call on a nil receiver (falls back to EstimateTokens).
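
Go allows calling a pointer-receiver method on a nil pointer, provided the method checks for nil before dereferencing. The sketch below shows how a nil-safe Count could be structured. The tokenCounter type, its encode field, and the word-based encoder are hypothetical stand-ins for a real tiktoken encoder.

```go
package main

import (
	"bytes"
	"fmt"
)

// tokenCounter is a hypothetical stand-in for TokenCounter.
type tokenCounter struct {
	encode func([]byte) int // stand-in for a real tokenizer
}

// Count is safe on a nil receiver: with no encoder available it falls
// back to a rough 4-bytes-per-token estimate.
func (tc *tokenCounter) Count(content []byte) int {
	if tc == nil || tc.encode == nil {
		return (len(content) + 3) / 4
	}
	return tc.encode(content)
}

// newWordCounter returns a counter whose fake encoder counts
// whitespace-separated words. It is for demonstration only.
func newWordCounter() *tokenCounter {
	return &tokenCounter{encode: func(b []byte) int { return len(bytes.Fields(b)) }}
}

func main() {
	var none *tokenCounter
	fmt.Println(none.Count([]byte("12345678")))         // nil receiver: estimate
	fmt.Println(newWordCounter().Count([]byte("a b c"))) // configured encoder
}
```

This lets callers hold a possibly-nil counter and call Count unconditionally, which keeps a failed encoder load from becoming a hard error.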
