cascade

package
v0.5.24
Published: Jun 21, 2026 | License: Apache-2.0 | Imports: 5 | Imported by: 0

Documentation

Overview

Package cascade implements model cascading for request routing. It detects quality signals in a model response and escalates to a higher tier when the response quality is insufficient.

Constants

This section is empty.

Variables

This section is empty.

Functions

func CalculateOverallQuality

func CalculateOverallQuality(signals []QualitySignal) float64

CalculateOverallQuality computes an overall quality score from the given signals. It returns a score between 0.0 (poor) and 1.0 (excellent).
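The documentation does not say how individual severities are combined into one score. Purely as an illustration, the sketch below assumes the score is one minus the mean severity, clamped to the 0.0-1.0 range. The rule, the name sketchOverallQuality, and the trimmed-down QualitySignal stand-in are assumptions for this example, not the package's implementation.

```go
package main

import "fmt"

// QualitySignal is a local stand-in that mirrors only the documented
// fields this sketch needs.
type QualitySignal struct {
	Type     string
	Severity float64 // documented range: 0.0-1.0
}

// sketchOverallQuality is a hypothetical scoring rule: one minus the mean
// severity, clamped to [0.0, 1.0]. With no signals it returns 1.0.
func sketchOverallQuality(signals []QualitySignal) float64 {
	if len(signals) == 0 {
		return 1.0
	}
	var sum float64
	for _, s := range signals {
		sum += s.Severity
	}
	score := 1.0 - sum/float64(len(signals))
	if score < 0 {
		return 0
	}
	if score > 1 {
		return 1
	}
	return score
}

func main() {
	signals := []QualitySignal{
		{Type: "truncated", Severity: 0.5},
		{Type: "repetitive", Severity: 0.25},
	}
	// Mean severity is 0.375, so this prints 0.625.
	fmt.Printf("%.3f\n", sketchOverallQuality(signals))
}
```

A mean treats every signal equally; a rule based on the maximum severity would let a single serious issue dominate. Which of these the real function resembles is not stated in the documentation.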

func HasCriticalSignals

func HasCriticalSignals(signals []QualitySignal) bool

HasCriticalSignals reports whether any of the signals requires immediate escalation.

func TierToCapability

func TierToCapability(tier Tier) string

TierToCapability maps a cascade tier to a capability slot name.

Types

type CascadeDecision

type CascadeDecision struct {
	// ShouldCascade indicates whether to retry with a higher tier
	ShouldCascade bool `json:"should_cascade"`
	// CurrentTier is the tier that produced the response
	CurrentTier Tier `json:"current_tier"`
	// NextTier is the recommended tier to try (if ShouldCascade is true)
	NextTier Tier `json:"next_tier,omitempty"`
	// Signals contains the detected quality issues
	Signals []QualitySignal `json:"signals,omitempty"`
	// QualityScore is the overall quality score (0.0-1.0)
	QualityScore float64 `json:"quality_score"`
	// Reason explains the decision
	Reason string `json:"reason"`
}

CascadeDecision represents the outcome of a cascade evaluation.
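The struct tags determine the JSON shape of a decision. The sketch below copies the documented type definitions into a standalone program (so it runs without importing the package) and shows that next_tier and signals are dropped when empty because of omitempty. The helper name encodeDecision and the field values are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, so the example is self-contained.
type Tier string

const TierStandard Tier = "standard"

type SignalType string

type QualitySignal struct {
	Type        SignalType `json:"type"`
	Severity    float64    `json:"severity"`
	Description string     `json:"description"`
	Position    int        `json:"position,omitempty"`
}

type CascadeDecision struct {
	ShouldCascade bool            `json:"should_cascade"`
	CurrentTier   Tier            `json:"current_tier"`
	NextTier      Tier            `json:"next_tier,omitempty"`
	Signals       []QualitySignal `json:"signals,omitempty"`
	QualityScore  float64         `json:"quality_score"`
	Reason        string          `json:"reason"`
}

// encodeDecision is a hypothetical helper that serializes a decision.
func encodeDecision(d CascadeDecision) (string, error) {
	b, err := json.Marshal(d)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeDecision(CascadeDecision{
		CurrentTier:  TierStandard,
		QualityScore: 0.9,
		Reason:       "quality above threshold",
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// Prints:
	// {"should_cascade":false,"current_tier":"standard","quality_score":0.9,"reason":"quality above threshold"}
	fmt.Println(out)
}
```

Because should_cascade has no omitempty, consumers always see the field, even when it is false.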

type CascadeResult

type CascadeResult struct {
	// OriginalTier is the tier that was initially used
	OriginalTier Tier `json:"original_tier"`
	// FinalTier is the tier that produced the accepted response
	FinalTier Tier `json:"final_tier"`
	// CascadeCount is the number of cascade attempts
	CascadeCount int `json:"cascade_count"`
	// TotalLatencyMs is the total time spent across all attempts
	TotalLatencyMs int64 `json:"total_latency_ms"`
	// Success indicates whether a satisfactory response was obtained
	Success bool `json:"success"`
}

CascadeResult tracks the outcome of a cascade operation.

type CascadeTracker

type CascadeTracker struct {
	// contains filtered or unexported fields
}

CascadeTracker tracks an ongoing cascade operation.

func NewCascadeTracker

func NewCascadeTracker(startTier Tier, maxAttempts int) *CascadeTracker

NewCascadeTracker creates a tracker for a cascade operation.

func (*CascadeTracker) CanContinue

func (t *CascadeTracker) CanContinue() bool

CanContinue returns whether more cascade attempts are allowed.

func (*CascadeTracker) GetCurrentTier

func (t *CascadeTracker) GetCurrentTier() Tier

GetCurrentTier returns the current tier being used.

func (*CascadeTracker) GetResult

func (t *CascadeTracker) GetResult(success bool) *CascadeResult

GetResult returns the final cascade result.

func (*CascadeTracker) RecordAttempt

func (t *CascadeTracker) RecordAttempt(decision *CascadeDecision)

RecordAttempt records a cascade attempt and its decision.
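The four tracker methods suggest a retry loop: evaluate the current tier, stop on an acceptable response, otherwise record the attempt and move up while CanContinue allows it. The sketch below is one possible reading of that flow. toyTracker is a hypothetical stand-in with the same method names (its internal behavior is assumed, since the real fields are unexported), and the evaluate callback stands in for calling a model and judging its response.

```go
package main

import "fmt"

type Tier string

const (
	TierFast      Tier = "fast"
	TierStandard  Tier = "standard"
	TierReasoning Tier = "reasoning"
)

// Trimmed local copies of the documented types.
type CascadeDecision struct {
	ShouldCascade bool
	CurrentTier   Tier
	NextTier      Tier
}

type CascadeResult struct {
	OriginalTier Tier
	FinalTier    Tier
	CascadeCount int
	Success      bool
}

// toyTracker is a hypothetical stand-in for *CascadeTracker.
type toyTracker struct {
	start, current Tier
	attempts, max  int
}

func newToyTracker(start Tier, maxAttempts int) *toyTracker {
	return &toyTracker{start: start, current: start, max: maxAttempts}
}

func (t *toyTracker) CanContinue() bool    { return t.attempts < t.max }
func (t *toyTracker) GetCurrentTier() Tier { return t.current }

// RecordAttempt counts the attempt and, assuming the decision names a
// next tier, switches to it.
func (t *toyTracker) RecordAttempt(d *CascadeDecision) {
	t.attempts++
	if d.ShouldCascade && d.NextTier != "" {
		t.current = d.NextTier
	}
}

func (t *toyTracker) GetResult(success bool) *CascadeResult {
	return &CascadeResult{
		OriginalTier: t.start,
		FinalTier:    t.current,
		CascadeCount: t.attempts,
		Success:      success,
	}
}

// runCascade evaluates tiers until one is accepted or attempts run out.
func runCascade(t *toyTracker, evaluate func(Tier) *CascadeDecision) *CascadeResult {
	for {
		d := evaluate(t.GetCurrentTier())
		if !d.ShouldCascade {
			return t.GetResult(true)
		}
		if !t.CanContinue() {
			return t.GetResult(false)
		}
		t.RecordAttempt(d)
	}
}

func main() {
	// Hypothetical evaluator: only the fast tier is judged inadequate.
	evaluate := func(tier Tier) *CascadeDecision {
		if tier == TierFast {
			return &CascadeDecision{ShouldCascade: true, CurrentTier: tier, NextTier: TierStandard}
		}
		return &CascadeDecision{CurrentTier: tier}
	}
	res := runCascade(newToyTracker(TierFast, 2), evaluate)
	// Prints: fast -> standard, cascades=1, success=true
	fmt.Printf("%s -> %s, cascades=%d, success=%t\n",
		res.OriginalTier, res.FinalTier, res.CascadeCount, res.Success)
}
```

Whether the real RecordAttempt also expects to be called for the accepted attempt is not documented; this sketch records only attempts that led to an escalation.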

type Config

type Config struct {
	// Enabled toggles cascade functionality
	Enabled bool `yaml:"enabled" json:"enabled"`
	// QualityThreshold is the minimum quality score to accept (0.0-1.0)
	QualityThreshold float64 `yaml:"quality-threshold" json:"quality_threshold"`
	// MaxCascades is the maximum number of cascade attempts
	MaxCascades int `yaml:"max-cascades" json:"max_cascades"`
}

Config holds configuration for the Manager.
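Note that the YAML keys use hyphens (quality-threshold, max-cascades) while the JSON keys use underscores (quality_threshold, max_cascades). The sketch below loads the JSON form with the standard library, using a local copy of the documented struct. The parseConfig helper, its range checks, and the sample values are assumptions for illustration; the documentation does not say whether the package validates these fields itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a local copy of the documented struct.
type Config struct {
	Enabled          bool    `yaml:"enabled" json:"enabled"`
	QualityThreshold float64 `yaml:"quality-threshold" json:"quality_threshold"`
	MaxCascades      int     `yaml:"max-cascades" json:"max_cascades"`
}

// parseConfig is a hypothetical helper: it decodes JSON and rejects values
// outside the documented 0.0-1.0 threshold range or a negative attempt cap.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("decode config: %w", err)
	}
	if cfg.QualityThreshold < 0 || cfg.QualityThreshold > 1 {
		return Config{}, fmt.Errorf("quality_threshold %v is outside 0.0-1.0", cfg.QualityThreshold)
	}
	if cfg.MaxCascades < 0 {
		return Config{}, fmt.Errorf("max_cascades %d is negative", cfg.MaxCascades)
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{"enabled": true, "quality_threshold": 0.7, "max_cascades": 2}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	// Prints: {Enabled:true QualityThreshold:0.7 MaxCascades:2}
	fmt.Printf("%+v\n", cfg)
}
```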

type Manager

type Manager struct {
	// contains filtered or unexported fields
}

Manager orchestrates model cascading based on response quality.

func NewManager

func NewManager(cfg Config) *Manager

NewManager creates a new Manager with the given configuration.

func (*Manager) DetectQualitySignals

func (m *Manager) DetectQualitySignals(response string) []QualitySignal

DetectQualitySignals analyzes a response and returns detected quality issues.

func (*Manager) EvaluateResponse

func (m *Manager) EvaluateResponse(response string, currentTier Tier) *CascadeDecision

EvaluateResponse determines if a response should trigger a cascade.
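The decision logic itself is not documented. One plausible reading, combining Config.QualityThreshold with the tier comments (fast, then standard, then reasoning), is sketched below: cascade only when the score is under the threshold and a higher tier exists. The functions sketchEvaluate and nextTier, the escalation order, and the reason strings are all assumptions for this example.

```go
package main

import "fmt"

type Tier string

const (
	TierFast      Tier = "fast"
	TierStandard  Tier = "standard"
	TierReasoning Tier = "reasoning"
)

// CascadeDecision is a trimmed local copy of the documented struct.
type CascadeDecision struct {
	ShouldCascade bool
	CurrentTier   Tier
	NextTier      Tier
	QualityScore  float64
	Reason        string
}

// nextTier assumes the escalation order fast -> standard -> reasoning.
// It returns false when there is nothing higher to try.
func nextTier(t Tier) (Tier, bool) {
	switch t {
	case TierFast:
		return TierStandard, true
	case TierStandard:
		return TierReasoning, true
	default:
		return "", false
	}
}

// sketchEvaluate is a hypothetical decision rule: escalate when the score
// is below the threshold and a higher tier is available.
func sketchEvaluate(score, threshold float64, current Tier) *CascadeDecision {
	d := &CascadeDecision{CurrentTier: current, QualityScore: score}
	if score >= threshold {
		d.Reason = "quality meets threshold"
		return d
	}
	next, ok := nextTier(current)
	if !ok {
		d.Reason = "quality below threshold, but no higher tier is available"
		return d
	}
	d.ShouldCascade = true
	d.NextTier = next
	d.Reason = fmt.Sprintf("quality %.2f below threshold %.2f", score, threshold)
	return d
}

func main() {
	d := sketchEvaluate(0.4, 0.7, TierFast)
	// Prints: true standard quality 0.40 below threshold 0.70
	fmt.Println(d.ShouldCascade, d.NextTier, d.Reason)
}
```

The real method may also weigh critical signals (see HasCriticalSignals) independently of the score; this sketch leaves that out.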

func (*Manager) GetMetrics

func (m *Manager) GetMetrics() map[string]interface{}

GetMetrics returns cascade performance metrics.

func (*Manager) GetMetricsAsMap

func (m *Manager) GetMetricsAsMap() map[string]interface{}

GetMetricsAsMap returns metrics in a format suitable for Lua.

func (*Manager) IsEnabled

func (m *Manager) IsEnabled() bool

IsEnabled returns whether cascade functionality is active.

type QualitySignal

type QualitySignal struct {
	// Type identifies the kind of quality issue
	Type SignalType `json:"type"`
	// Severity indicates how serious the issue is (0.0-1.0)
	Severity float64 `json:"severity"`
	// Description provides human-readable details
	Description string `json:"description"`
	// Position indicates where in the response the issue was detected (if applicable)
	Position int `json:"position,omitempty"`
}

QualitySignal represents a detected quality issue in a response.

type QualitySignalDetector

type QualitySignalDetector struct {
	// contains filtered or unexported fields
}

QualitySignalDetector detects quality issues in LLM responses.

func NewQualitySignalDetector

func NewQualitySignalDetector() *QualitySignalDetector

NewQualitySignalDetector creates a new detector with default patterns.

func (*QualitySignalDetector) DetectSignals

func (d *QualitySignalDetector) DetectSignals(response string) []QualitySignal

DetectSignals analyzes a response and returns all detected quality signals.
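The default detection patterns are not listed in the documentation. To illustrate what a single check might look like, the sketch below flags an unclosed Markdown code fence, which matches the description of SignalIncompleteCode ("code blocks are not properly closed"). The function detectUnclosedFence, the odd-fence-count heuristic, and the severity value 0.8 are assumptions, not the package's actual patterns.

```go
package main

import (
	"fmt"
	"strings"
)

type SignalType string

const SignalIncompleteCode SignalType = "incomplete_code"

// QualitySignal is a local copy of the documented struct, without tags.
type QualitySignal struct {
	Type        SignalType
	Severity    float64
	Description string
	Position    int
}

// detectUnclosedFence is a hypothetical check: an odd number of
// triple-backtick fences means the last code block was never closed.
func detectUnclosedFence(response string) []QualitySignal {
	fence := strings.Repeat("`", 3)
	if strings.Count(response, fence)%2 == 0 {
		return nil
	}
	return []QualitySignal{{
		Type:        SignalIncompleteCode,
		Severity:    0.8, // assumed value, chosen for illustration
		Description: "code block opened but never closed",
		Position:    strings.LastIndex(response, fence), // byte offset of the open fence
	}}
}

func main() {
	fence := strings.Repeat("`", 3)
	response := "Here:\n" + fence + "go\nfmt.Println(1)\n"
	for _, s := range detectUnclosedFence(response) {
		// Prints: incomplete_code at byte 6: code block opened but never closed
		fmt.Printf("%s at byte %d: %s\n", s.Type, s.Position, s.Description)
	}
}
```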

type SignalType

type SignalType string

SignalType categorizes different quality issues.

const (
	// SignalAbruptEnding indicates the response ended unexpectedly
	SignalAbruptEnding SignalType = "abrupt_ending"
	// SignalMissingSections indicates expected content sections are missing
	SignalMissingSections SignalType = "missing_sections"
	// SignalIncompleteCode indicates code blocks are not properly closed
	SignalIncompleteCode SignalType = "incomplete_code"
	// SignalTruncated indicates the response appears to be cut off
	SignalTruncated SignalType = "truncated"
	// SignalRepetitive indicates excessive repetition in the response
	SignalRepetitive SignalType = "repetitive"
	// SignalIncoherent indicates the response lacks logical flow
	SignalIncoherent SignalType = "incoherent"
	// SignalRefusal indicates the model refused to answer
	SignalRefusal SignalType = "refusal"
	// SignalLowQuality indicates a generally low-quality response
	SignalLowQuality SignalType = "low_quality"
)

type Tier

type Tier string

Tier represents a model capability tier for cascading.

const (
	// TierFast is the cheapest, fastest tier (e.g., small models)
	TierFast Tier = "fast"
	// TierStandard is the balanced tier (e.g., medium models)
	TierStandard Tier = "standard"
	// TierReasoning is the most capable tier (e.g., large reasoning models)
	TierReasoning Tier = "reasoning"
)

func CapabilityToTier

func CapabilityToTier(capability string) Tier

CapabilityToTier maps a capability slot name to a cascade tier.
