compression

package
v0.3.4 Latest
Published: Jan 14, 2026 License: GPL-3.0 Imports: 22 Imported by: 0

Documentation

Overview

Package compression provides text compression algorithms for context optimization.

This package implements extractive, abstractive, and hybrid compression techniques to reduce token usage while preserving semantic meaning.

Index

Examples

Constants

This section is empty.

Variables

var (
	ErrInvalidExperimentID  = errors.New("experiment ID cannot be empty")
	ErrInsufficientVariants = errors.New("experiment must have at least 2 variants")
	ErrInvalidSessionID     = errors.New("session ID cannot be empty")
	ErrAlgorithmNotInExp    = errors.New("algorithm not in experiment variants")
	ErrExperimentNotFound   = errors.New("experiment not found")
)

Common errors for A/B testing

Functions

func Preview

func Preview(w io.Writer, original string, result *Result, opts PreviewOptions) error

Preview generates a side-by-side comparison of original vs compressed content with diff highlighting and optional quality metrics display.

The preview shows:

  • Quality metrics (compression ratio, quality score, processing time)
  • Side-by-side comparison with removed lines highlighted
  • Diff indicators showing what was removed or kept

Parameters:

  • w: Output writer (e.g., os.Stdout, bytes.Buffer)
  • original: Original uncompressed content
  • result: Compression result containing compressed content and metadata
  • opts: Display options (width, colors, metrics)

Returns error if validation fails or writing fails.

Example

ExamplePreview demonstrates the preview functionality

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/fyrsmithlabs/contextd/internal/compression"
)

func main() {
	// Sample content to compress
	original := `The Go programming language is a statically typed, compiled language.
It was designed at Google by Robert Griesemer, Rob Pike, and Ken Thompson.
Go is syntactically similar to C, but with memory safety and garbage collection.
The language was announced in November 2009 and version 1.0 was released in March 2012.
Go is widely used for building web servers, data pipelines, and cloud-native applications.
It has a rich standard library and excellent concurrency support via goroutines.`

	// Create compression service
	config := compression.Config{
		DefaultAlgorithm: compression.AlgorithmExtractive,
		TargetRatio:      2.0,
	}

	service, err := compression.NewService(config)
	if err != nil {
		panic(err)
	}

	// Compress the content
	ctx := context.Background()
	result, err := service.Compress(ctx, original, compression.AlgorithmExtractive, 2.0)
	if err != nil {
		panic(err)
	}

	// Preview the compression results
	opts := compression.PreviewOptions{
		Width:       80,
		ShowMetrics: true,
		ColorOutput: false, // Disable colors for example output
	}

	err = compression.Preview(os.Stdout, original, result, opts)
	if err != nil {
		panic(err)
	}

	fmt.Println("\n✓ Compression preview generated successfully")
}

Types

type ABTestManager

type ABTestManager struct {
	// contains filtered or unexported fields
}

ABTestManager manages multiple A/B test experiments

func NewABTestManager

func NewABTestManager() *ABTestManager

NewABTestManager creates a new A/B test manager

func (*ABTestManager) CreateExperiment

func (m *ABTestManager) CreateExperiment(ctx context.Context, id string, algorithms []Algorithm) (*Experiment, error)

CreateExperiment creates a new experiment

func (*ABTestManager) ExportMetrics

func (m *ABTestManager) ExportMetrics(ctx context.Context, experimentID string) map[Algorithm]VariantMetrics

ExportMetrics exports metrics for an experiment (for analytics integration)

func (*ABTestManager) GetExperiment

func (m *ABTestManager) GetExperiment(ctx context.Context, id string) (*Experiment, error)

GetExperiment retrieves an experiment by ID

func (*ABTestManager) ListExperiments

func (m *ABTestManager) ListExperiments(ctx context.Context) []*Experiment

ListExperiments returns all experiments
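
The following sketch (not one of the package's own examples) shows a minimal manager workflow using only the constructors and methods documented on this page; the experiment ID and algorithm choices are illustrative, and errors are handled with panic for brevity.

package main

import (
	"context"
	"fmt"

	"github.com/fyrsmithlabs/contextd/internal/compression"
)

func main() {
	ctx := context.Background()
	manager := compression.NewABTestManager()

	// Create an experiment comparing extractive and hybrid compression.
	exp, err := manager.CreateExperiment(ctx, "extractive-vs-hybrid", []compression.Algorithm{
		compression.AlgorithmExtractive,
		compression.AlgorithmHybrid,
	})
	if err != nil {
		panic(err)
	}

	// Look the experiment up again later by its ID.
	same, err := manager.GetExperiment(ctx, exp.ID)
	if err != nil {
		panic(err)
	}
	fmt.Println("experiment:", same.ID, "variants:", len(same.Variants))
}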

type AbstractiveCompressor

type AbstractiveCompressor struct {
	// contains filtered or unexported fields
}

AbstractiveCompressor implements abstractive summarization using Claude API

func NewAbstractiveCompressor

func NewAbstractiveCompressor(config Config) *AbstractiveCompressor

NewAbstractiveCompressor creates a new abstractive compressor

func (*AbstractiveCompressor) Compress

func (c *AbstractiveCompressor) Compress(ctx context.Context, content string, algorithm Algorithm, targetRatio float64) (*Result, error)

Compress implements the Compressor interface using abstractive summarization via Claude API

func (*AbstractiveCompressor) GetCapabilities

func (c *AbstractiveCompressor) GetCapabilities(ctx context.Context) Capabilities

GetCapabilities returns the capabilities of this compressor

type Algorithm

type Algorithm string

Algorithm represents a compression algorithm

const (
	// AlgorithmExtractive uses extractive summarization (sentence selection)
	AlgorithmExtractive Algorithm = "extractive"
	// AlgorithmAbstractive uses abstractive summarization (content generation)
	AlgorithmAbstractive Algorithm = "abstractive"
	// AlgorithmHybrid combines extractive and abstractive approaches
	AlgorithmHybrid Algorithm = "hybrid"
)

type Capabilities

type Capabilities struct {
	// Supported algorithms
	SupportedAlgorithms []Algorithm

	// Maximum content length supported
	MaxContentLength int

	// Whether it supports target compression ratios
	SupportsTargetRatio bool

	// Quality score range
	QualityScoreRange struct {
		Min float64
		Max float64
	}
}

Capabilities describes what a compressor can do

type ClaudeClient

type ClaudeClient interface {
	// Summarize generates an abstractive summary using Claude API
	Summarize(ctx context.Context, content string, targetRatio float64) (string, error)
}

ClaudeClient defines the interface for Claude API interactions. This enables testing with mocks.

type ClaudeError

type ClaudeError struct {
	Type  string `json:"type"`
	Error struct {
		Type    string `json:"type"`
		Message string `json:"message"`
	} `json:"error"`
}

ClaudeError represents an error response from Claude API

type ClaudeMessage

type ClaudeMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

ClaudeMessage represents a message in the conversation

type ClaudeRequest

type ClaudeRequest struct {
	Model       string          `json:"model"`
	MaxTokens   int             `json:"max_tokens"`
	Messages    []ClaudeMessage `json:"messages"`
	System      string          `json:"system,omitempty"`
	Temperature float64         `json:"temperature"`
}

ClaudeRequest represents the request format for Claude API

type ClaudeResponse

type ClaudeResponse struct {
	ID      string `json:"id"`
	Type    string `json:"type"`
	Role    string `json:"role"`
	Content []struct {
		Type string `json:"type"`
		Text string `json:"text"`
	} `json:"content"`
	Model      string `json:"model"`
	StopReason string `json:"stop_reason"`
	Usage      struct {
		InputTokens  int `json:"input_tokens"`
		OutputTokens int `json:"output_tokens"`
	} `json:"usage"`
}

ClaudeResponse represents the response from Claude API

type ComparisonReport

type ComparisonReport struct {
	ExperimentID   string                       // Experiment identifier
	GeneratedAt    time.Time                    // When report was generated
	TotalSessions  int                          // Total unique sessions
	VariantMetrics map[Algorithm]VariantMetrics // Metrics per variant
	Winner         *Algorithm                   // Best performing algorithm (if conclusive)
	WinnerReason   string                       // Why this variant won
	Recommendation string                       // Recommendation for production use
}

ComparisonReport contains the comparison analysis of all variants

type CompressionOutcome

type CompressionOutcome struct {
	SessionID        string    // Unique session identifier
	Algorithm        Algorithm // Algorithm used
	CompressionRatio float64   // Actual compression ratio achieved
	QualityScore     float64   // Quality score (0.0 to 1.0)
	ProcessingTimeMs float64   // Processing time in milliseconds
	Success          bool      // Whether compression succeeded
	UserAccepted     bool      // Whether user accepted the compressed result
	ErrorMessage     string    // Error message if failed
	Timestamp        time.Time // When the compression occurred
}

CompressionOutcome represents the result of a single compression operation

func (*CompressionOutcome) Validate

func (o *CompressionOutcome) Validate() error

Validate checks if the outcome is valid

type Compressor

type Compressor interface {
	// Compress compresses the given content using the specified algorithm
	Compress(ctx context.Context, content string, algorithm Algorithm, targetRatio float64) (*Result, error)

	// GetCapabilities returns the capabilities of this compressor
	GetCapabilities(ctx context.Context) Capabilities
}

Compressor defines the interface for content compression
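
Any type with these two methods satisfies Compressor. The sketch below is a hypothetical no-op implementation, not part of this package, that can serve as a test double; the package name, type name, and capability values are invented for illustration.

package example

import (
	"context"

	"github.com/fyrsmithlabs/contextd/internal/compression"
)

// passthroughCompressor is a hypothetical Compressor that leaves content unchanged.
type passthroughCompressor struct{}

func (passthroughCompressor) Compress(ctx context.Context, content string, algorithm compression.Algorithm, targetRatio float64) (*compression.Result, error) {
	return &compression.Result{
		Content:      content, // no actual reduction
		QualityScore: 1.0,
	}, nil
}

func (passthroughCompressor) GetCapabilities(ctx context.Context) compression.Capabilities {
	caps := compression.Capabilities{
		SupportedAlgorithms: []compression.Algorithm{compression.AlgorithmExtractive},
		MaxContentLength:    1 << 20, // arbitrary illustrative limit
		SupportsTargetRatio: false,
	}
	caps.QualityScoreRange.Min = 0
	caps.QualityScoreRange.Max = 1
	return caps
}

// Compile-time check that the interface is satisfied.
var _ compression.Compressor = passthroughCompressor{}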

type Config

type Config struct {
	// Default algorithm to use
	DefaultAlgorithm Algorithm

	// Target compression ratio (original/compressed)
	TargetRatio float64

	// Quality threshold (minimum acceptable quality score)
	QualityThreshold float64

	// Maximum processing time per compression
	MaxProcessingTime time.Duration

	// Anthropic API key for abstractive compression
	AnthropicAPIKey string
}

Config holds configuration for compression operations
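
A configuration sketch with illustrative values; the ratio, threshold, and timeout are assumptions to tune for your workload, and AnthropicAPIKey is read from the environment here because the field is documented as the key for abstractive compression.

package main

import (
	"fmt"
	"os"
	"time"

	"github.com/fyrsmithlabs/contextd/internal/compression"
)

func main() {
	cfg := compression.Config{
		DefaultAlgorithm:  compression.AlgorithmHybrid,
		TargetRatio:       2.0,             // aim to roughly halve the content
		QualityThreshold:  0.7,             // illustrative minimum quality score
		MaxProcessingTime: 5 * time.Second, // illustrative per-compression cap
		AnthropicAPIKey:   os.Getenv("ANTHROPIC_API_KEY"),
	}

	if _, err := compression.NewService(cfg); err != nil {
		panic(err)
	}
	fmt.Println("service configured")
}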

type ContentSection

type ContentSection struct {
	Content string
	IsCode  bool
}

ContentSection represents a section of content with metadata

type ContentType

type ContentType string

ContentType represents the type of content being compressed

const (
	// ContentTypeCode represents code content (Go, Python, JS, etc.)
	ContentTypeCode ContentType = "code"
	// ContentTypeMarkdown represents markdown documentation
	ContentTypeMarkdown ContentType = "markdown"
	// ContentTypeConversation represents dialog/conversation
	ContentTypeConversation ContentType = "conversation"
	// ContentTypeMixed represents mixed content types
	ContentTypeMixed ContentType = "mixed"
	// ContentTypePlain represents plain text
	ContentTypePlain ContentType = "plain"
)

type Experiment

type Experiment struct {
	ID        string              // Unique experiment identifier
	Variants  []ExperimentVariant // Algorithm variants to test
	StartTime time.Time           // When experiment started
	EndTime   *time.Time          // When experiment ended (nil if ongoing)
	// contains filtered or unexported fields
}

Experiment represents an A/B test comparing multiple compression algorithms

func NewExperiment

func NewExperiment(id string, algorithms []Algorithm) (*Experiment, error)

NewExperiment creates a new A/B test experiment

func (*Experiment) AssignVariant

func (e *Experiment) AssignVariant(sessionID string) (Algorithm, error)

AssignVariant assigns an algorithm variant to a session. Consistent hashing ensures the same session always gets the same variant.

func (*Experiment) GenerateComparisonReport

func (e *Experiment) GenerateComparisonReport() ComparisonReport

GenerateComparisonReport generates a comprehensive comparison report

func (*Experiment) GetMetrics

func (e *Experiment) GetMetrics() map[Algorithm]VariantMetrics

GetMetrics computes aggregated metrics for all variants

func (*Experiment) RecordOutcome

func (e *Experiment) RecordOutcome(outcome CompressionOutcome) error

RecordOutcome records the outcome of a compression operation
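
The sketch below walks a single session through an experiment: assign a variant, record an invented outcome, and generate a report. The session ID and outcome numbers are illustrative only.

package main

import (
	"fmt"
	"time"

	"github.com/fyrsmithlabs/contextd/internal/compression"
)

func main() {
	exp, err := compression.NewExperiment("ratio-tuning", []compression.Algorithm{
		compression.AlgorithmExtractive,
		compression.AlgorithmAbstractive,
	})
	if err != nil {
		panic(err)
	}

	// Consistent hashing: the same session ID always maps to the same variant.
	algo, err := exp.AssignVariant("session-42")
	if err != nil {
		panic(err)
	}

	// Record an illustrative outcome for the assigned variant.
	err = exp.RecordOutcome(compression.CompressionOutcome{
		SessionID:        "session-42",
		Algorithm:        algo,
		CompressionRatio: 2.1,
		QualityScore:     0.83,
		ProcessingTimeMs: 12.5,
		Success:          true,
		UserAccepted:     true,
		Timestamp:        time.Now(),
	})
	if err != nil {
		panic(err)
	}

	report := exp.GenerateComparisonReport()
	fmt.Println("sessions:", report.TotalSessions, "recommendation:", report.Recommendation)
}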

type ExperimentVariant

type ExperimentVariant struct {
	Algorithm Algorithm // Compression algorithm
	Weight    float64   // Assignment weight (for weighted distribution)
}

ExperimentVariant represents a single variant in an A/B test

type ExtractiveCompressor

type ExtractiveCompressor struct {
	// contains filtered or unexported fields
}

ExtractiveCompressor implements extractive summarization using sentence scoring

func NewExtractiveCompressor

func NewExtractiveCompressor(config Config) *ExtractiveCompressor

NewExtractiveCompressor creates a new extractive compressor

func (*ExtractiveCompressor) Compress

func (c *ExtractiveCompressor) Compress(ctx context.Context, content string, algorithm Algorithm, targetRatio float64) (*Result, error)

Compress implements the Compressor interface using extractive summarization

func (*ExtractiveCompressor) GetCapabilities

func (c *ExtractiveCompressor) GetCapabilities(ctx context.Context) Capabilities

GetCapabilities returns the capabilities of this compressor

type HTTPClaudeClient

type HTTPClaudeClient struct {
	// contains filtered or unexported fields
}

HTTPClaudeClient implements ClaudeClient using the Anthropic API

func NewClaudeClient

func NewClaudeClient(apiKey, baseURL, model string) (*HTTPClaudeClient, error)

NewClaudeClient creates a new Claude API client

func (*HTTPClaudeClient) Summarize

func (c *HTTPClaudeClient) Summarize(ctx context.Context, content string, targetRatio float64) (string, error)

Summarize generates an abstractive summary using Claude API

type HybridCompressor

type HybridCompressor struct {
	// contains filtered or unexported fields
}

HybridCompressor combines extractive and abstractive approaches with intelligent routing based on content type

func NewHybridCompressor

func NewHybridCompressor(config Config) *HybridCompressor

NewHybridCompressor creates a new hybrid compressor

func NewHybridCompressorWithAbstractive

func NewHybridCompressorWithAbstractive(config Config, abstractive Compressor) *HybridCompressor

NewHybridCompressorWithAbstractive creates a hybrid compressor with an injected abstractive compressor. This allows testing with mock implementations.

func (*HybridCompressor) Compress

func (c *HybridCompressor) Compress(ctx context.Context, content string, algorithm Algorithm, targetRatio float64) (*Result, error)

Compress implements the Compressor interface using a hybrid approach with content-aware routing

func (*HybridCompressor) GetCapabilities

func (c *HybridCompressor) GetCapabilities(ctx context.Context) Capabilities

GetCapabilities returns the capabilities of this compressor
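
A sketch that combines NewHybridCompressorWithAbstractive with the mock abstractive compressor documented below, so the hybrid path can be exercised without an Anthropic API key; the sample content and target ratio are illustrative.

package main

import (
	"context"
	"fmt"

	"github.com/fyrsmithlabs/contextd/internal/compression"
)

func main() {
	cfg := compression.Config{
		DefaultAlgorithm: compression.AlgorithmHybrid,
		TargetRatio:      2.0,
	}

	// Inject the mock abstractive compressor so no API key is required.
	hybrid := compression.NewHybridCompressorWithAbstractive(cfg, compression.NewMockAbstractiveCompressor(cfg))

	content := `Package documentation often mixes prose with code samples.
The hybrid compressor routes each section to the approach that suits it.
Code blocks tend to be kept largely intact, while prose is summarized.`

	result, err := hybrid.Compress(context.Background(), content, compression.AlgorithmHybrid, 2.0)
	if err != nil {
		panic(err)
	}
	fmt.Printf("target ratio 2.0, quality %.2f\n", result.QualityScore)
}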

type MockAbstractiveCompressor

type MockAbstractiveCompressor struct {
	// contains filtered or unexported fields
}

MockAbstractiveCompressor implements a mock abstractive compressor for testing. It simulates abstractive compression by applying simple text reduction rules without requiring an actual Anthropic API key.

func NewMockAbstractiveCompressor

func NewMockAbstractiveCompressor(config Config) *MockAbstractiveCompressor

NewMockAbstractiveCompressor creates a new mock abstractive compressor

func (*MockAbstractiveCompressor) Compress

func (m *MockAbstractiveCompressor) Compress(ctx context.Context, content string, algorithm Algorithm, targetRatio float64) (*Result, error)

Compress implements the Compressor interface with mock abstractive compression. It simulates API-based compression by applying deterministic reduction rules.

func (*MockAbstractiveCompressor) GetCapabilities

func (m *MockAbstractiveCompressor) GetCapabilities(ctx context.Context) Capabilities

GetCapabilities returns the capabilities of this mock compressor

type PreviewOptions

type PreviewOptions struct {
	// Width of the output (terminal columns)
	Width int

	// ShowMetrics displays compression metrics at the top
	ShowMetrics bool

	// ColorOutput enables ANSI color codes for diff highlighting
	ColorOutput bool
}

PreviewOptions configures the preview output

type QualityGate

type QualityGate struct {
	Thresholds QualityThresholds
}

QualityGate enforces quality thresholds

func NewQualityGate

func NewQualityGate(thresholds QualityThresholds) *QualityGate

NewQualityGate creates a new quality gate with specified thresholds

func (*QualityGate) Evaluate

func (g *QualityGate) Evaluate(metrics *QualityMetrics, original, compressed string) *QualityGateResult

Evaluate checks if quality metrics meet all thresholds
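
A sketch of gating a compression result; the sample strings and threshold values are illustrative, not recommended defaults.

package main

import (
	"fmt"

	"github.com/fyrsmithlabs/contextd/internal/compression"
)

func main() {
	original := "The quick brown fox jumps over the lazy dog. It does so every single day."
	compressed := "The quick brown fox jumps over the lazy dog."

	// Metrics are built from the sizes and the configured target ratio.
	metrics := compression.NewQualityMetrics(len(original), len(compressed), 1.5)

	gate := compression.NewQualityGate(compression.QualityThresholds{
		MinCompressionRatio:     1.2,
		MinInformationRetention: 0.5,
		MinSemanticSimilarity:   0.5,
		MinReadability:          0.3,
		MinCompositeScore:       0.5,
	})

	result := gate.Evaluate(metrics, original, compressed)
	if result.Pass {
		fmt.Printf("passed (composite %.2f)\n", result.CompositeScore)
	} else {
		fmt.Println("failed:", result.FailureReason)
	}
}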

type QualityGateResult

type QualityGateResult struct {
	Pass                      bool
	FailureReason             string
	CompressionRatioScore     float64
	InformationRetentionScore float64
	SemanticSimilarityScore   float64
	ReadabilityScore          float64
	CompositeScore            float64
}

QualityGateResult contains the result of quality gate evaluation

type QualityMetrics

type QualityMetrics struct {
	OriginalSize   int
	CompressedSize int
	TargetRatio    float64
}

QualityMetrics holds metrics for evaluating compression quality

func NewQualityMetrics

func NewQualityMetrics(originalSize, compressedSize int, targetRatio float64) *QualityMetrics

NewQualityMetrics creates a new quality metrics calculator

func (*QualityMetrics) CompositeScore

func (m *QualityMetrics) CompositeScore(original, compressed string) float64

CompositeScore calculates weighted average of all quality metrics

func (*QualityMetrics) CompressionRatioScore

func (m *QualityMetrics) CompressionRatioScore() float64

CompressionRatioScore calculates a score based on the achieved compression ratio versus the target. It returns 1.0 if the target is met or exceeded and penalizes results below the target.

func (*QualityMetrics) InformationRetentionScore

func (m *QualityMetrics) InformationRetentionScore(original, compressed string) float64

InformationRetentionScore measures how well keywords/concepts are preserved

func (*QualityMetrics) KeywordRetentionRate

func (m *QualityMetrics) KeywordRetentionRate(original, compressed string) float64

KeywordRetentionRate calculates the percentage of important keywords retained

func (*QualityMetrics) ReadabilityScore

func (m *QualityMetrics) ReadabilityScore(text string) float64

ReadabilityScore measures the readability of the compressed text

func (*QualityMetrics) SemanticSimilarityScore

func (m *QualityMetrics) SemanticSimilarityScore(original, compressed string) float64

SemanticSimilarityScore measures meaning preservation using word overlap

type QualityThresholds

type QualityThresholds struct {
	MinCompressionRatio     float64
	MinInformationRetention float64
	MinSemanticSimilarity   float64
	MinReadability          float64
	MinCompositeScore       float64
}

QualityThresholds defines minimum acceptable quality scores

type Result

type Result struct {
	// Compressed content
	Content string

	// Compression metadata
	Metadata vectorstore.CompressionMetadata

	// Processing time
	ProcessingTime time.Duration

	// Quality score (0.0 to 1.0, higher is better)
	QualityScore float64
}

Result represents the result of a compression operation

type RoutingStrategy

type RoutingStrategy string

RoutingStrategy defines how content should be compressed

const (
	// RoutingStrategyExtractive routes to extractive compression
	RoutingStrategyExtractive RoutingStrategy = "extractive"
	// RoutingStrategyAbstractive routes to abstractive compression
	RoutingStrategyAbstractive RoutingStrategy = "abstractive"
	// RoutingStrategyMixed uses both approaches for mixed content
	RoutingStrategyMixed RoutingStrategy = "mixed"
)

type Service

type Service struct {
	// contains filtered or unexported fields
}

Service orchestrates content compression operations

func NewService

func NewService(config Config) (*Service, error)

NewService creates a new compression service

func (*Service) Compress

func (s *Service) Compress(ctx context.Context, content string, algorithm Algorithm, targetRatio float64) (*Result, error)

Compress compresses content using the specified algorithm

func (*Service) GetCapabilities

func (s *Service) GetCapabilities(ctx context.Context) map[Algorithm]Capabilities

GetCapabilities returns the capabilities of all supported algorithms

func (*Service) Stats added in v0.3.0

func (s *Service) Stats() Stats

Stats returns current compression statistics for statusline display.
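
A short sketch, assuming service is a *Service obtained from NewService as in the package example above:

stats := service.Stats()
fmt.Printf("compressions=%d last ratio=%.2f last quality=%.2f\n",
	stats.OperationsTotal, stats.LastRatio, stats.LastQuality)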

type Stats added in v0.3.0

type Stats struct {
	LastRatio       float64
	LastQuality     float64
	OperationsTotal int64
}

Stats contains compression statistics for statusline display.

type VariantMetrics

type VariantMetrics struct {
	Algorithm           Algorithm // Algorithm variant
	TotalAttempts       int       // Total compression attempts
	SuccessCount        int       // Successful compressions
	SuccessRate         float64   // Success rate (0.0 to 1.0)
	AvgCompressionRatio float64   // Average compression ratio
	AvgQualityScore     float64   // Average quality score
	AvgProcessingTimeMs float64   // Average processing time
	UserAcceptanceRate  float64   // Rate of user acceptance (0.0 to 1.0)
	UserAcceptanceCount int       // Number of times user accepted
	UserRejectionCount  int       // Number of times user rejected
	P50CompressionRatio float64   // Median compression ratio
	P95ProcessingTimeMs float64   // 95th percentile processing time
}

VariantMetrics aggregates metrics for a single algorithm variant
