Documentation
Overview ¶
Package cascade provides model cascading for intelligent routing. It detects quality signals in a response and escalates to a higher tier when the response falls short.
Index ¶
- func CalculateOverallQuality(signals []QualitySignal) float64
- func HasCriticalSignals(signals []QualitySignal) bool
- func TierToCapability(tier Tier) string
- type CascadeDecision
- type CascadeResult
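- type CascadeTracker
- func NewCascadeTracker(startTier Tier, maxAttempts int) *CascadeTracker
- func (t *CascadeTracker) CanContinue() bool
- func (t *CascadeTracker) GetCurrentTier() Tier
- func (t *CascadeTracker) GetResult(success bool) *CascadeResult
- func (t *CascadeTracker) RecordAttempt(decision *CascadeDecision)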
- type Config
- type Manager
- func (m *Manager) DetectQualitySignals(response string) []QualitySignal
- func (m *Manager) EvaluateResponse(response string, currentTier Tier) *CascadeDecision
- func (m *Manager) GetMetrics() map[string]interface{}
- func (m *Manager) GetMetricsAsMap() map[string]interface{}
- func (m *Manager) IsEnabled() bool
- type QualitySignal
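- type QualitySignalDetector
- func NewQualitySignalDetector() *QualitySignalDetector
- func (d *QualitySignalDetector) DetectSignals(response string) []QualitySignal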
- type SignalType
- type Tier
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CalculateOverallQuality ¶
func CalculateOverallQuality(signals []QualitySignal) float64
CalculateOverallQuality computes an overall quality score from signals. Returns a score between 0.0 (poor) and 1.0 (excellent).
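The page does not say how individual severities are combined into one score. As a minimal sketch, assuming the score is 1 minus the mean severity and that an empty signal list counts as excellent, the idea could look like this. `overallQuality` and the trimmed-down `QualitySignal` are hypothetical stand-ins, not the package's code.

```go
package main

import "fmt"

// QualitySignal mirrors only the documented fields this sketch needs.
type QualitySignal struct {
	Type     string
	Severity float64 // 0.0-1.0, as documented
}

// overallQuality is a hypothetical stand-in for CalculateOverallQuality.
// Assumption: score = 1 - mean(severity), clamped to [0, 1].
func overallQuality(signals []QualitySignal) float64 {
	if len(signals) == 0 {
		return 1.0 // assumption: no detected issues means excellent
	}
	var sum float64
	for _, s := range signals {
		sum += s.Severity
	}
	score := 1.0 - sum/float64(len(signals))
	if score < 0 {
		return 0
	}
	if score > 1 {
		return 1
	}
	return score
}

func main() {
	signals := []QualitySignal{
		{Type: "truncated", Severity: 0.5},
		{Type: "repetitive", Severity: 0.25},
	}
	fmt.Printf("%.3f\n", overallQuality(signals)) // prints 0.625
}
```

Other aggregations, such as taking the worst severity, would satisfy the documented 0.0 to 1.0 range equally well.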
func HasCriticalSignals ¶
func HasCriticalSignals(signals []QualitySignal) bool
HasCriticalSignals checks if any signals require immediate escalation.
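What makes a signal critical is not stated here. One plausible reading is a severity cutoff; the sketch below assumes a cutoff of 0.8, which is an illustrative value and not taken from the package. `hasCritical` is a hypothetical stand-in for HasCriticalSignals.

```go
package main

import "fmt"

// QualitySignal mirrors only the documented fields this sketch needs.
type QualitySignal struct {
	Type     string
	Severity float64 // 0.0-1.0, as documented
}

// criticalSeverity is an assumed cutoff, chosen only for illustration.
const criticalSeverity = 0.8

// hasCritical reports whether any signal reaches the assumed cutoff.
func hasCritical(signals []QualitySignal) bool {
	for _, s := range signals {
		if s.Severity >= criticalSeverity {
			return true
		}
	}
	return false
}

func main() {
	mild := []QualitySignal{{Type: "repetitive", Severity: 0.3}}
	severe := append(mild, QualitySignal{Type: "refusal", Severity: 0.9})
	fmt.Println(hasCritical(mild), hasCritical(severe)) // prints "false true"
}
```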
func TierToCapability ¶
func TierToCapability(tier Tier) string
TierToCapability maps a cascade tier to a capability slot name.
Types ¶
type CascadeDecision ¶
type CascadeDecision struct {
	// ShouldCascade indicates whether to retry with a higher tier
	ShouldCascade bool `json:"should_cascade"`
	// CurrentTier is the tier that produced the response
	CurrentTier Tier `json:"current_tier"`
	// NextTier is the recommended tier to try (if ShouldCascade is true)
	NextTier Tier `json:"next_tier,omitempty"`
	// Signals contains the detected quality issues
	Signals []QualitySignal `json:"signals,omitempty"`
	// QualityScore is the overall quality score (0.0-1.0)
	QualityScore float64 `json:"quality_score"`
	// Reason explains the decision
	Reason string `json:"reason"`
}
CascadeDecision represents the outcome of a cascade evaluation.
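The struct tags above determine how a decision is serialized. The sketch below copies the documented fields and tags into local types (the package's import path is not shown on this page, so nothing is imported from it) and encodes one decision with `encoding/json`. The tier name "fast" and the helper `encode` are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, so the sketch compiles on its own.
type Tier string
type SignalType string

type QualitySignal struct {
	Type        SignalType `json:"type"`
	Severity    float64    `json:"severity"`
	Description string     `json:"description"`
	Position    int        `json:"position,omitempty"`
}

type CascadeDecision struct {
	ShouldCascade bool            `json:"should_cascade"`
	CurrentTier   Tier            `json:"current_tier"`
	NextTier      Tier            `json:"next_tier,omitempty"`
	Signals       []QualitySignal `json:"signals,omitempty"`
	QualityScore  float64         `json:"quality_score"`
	Reason        string          `json:"reason"`
}

// encode is a hypothetical helper that returns the JSON form of a decision.
func encode(d CascadeDecision) string {
	b, err := json.Marshal(d)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// "fast" is a made-up tier name; the page does not list tier values.
	accepted := CascadeDecision{
		CurrentTier:  "fast",
		QualityScore: 0.9,
		Reason:       "quality above threshold",
	}
	fmt.Println(encode(accepted))
}
```

Because of `omitempty`, an accepted decision with no NextTier and no Signals is encoded without the `next_tier` and `signals` keys, while `should_cascade` is always present, even when false.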
type CascadeResult ¶
type CascadeResult struct {
	// OriginalTier is the tier that was initially used
	OriginalTier Tier `json:"original_tier"`
	// FinalTier is the tier that produced the accepted response
	FinalTier Tier `json:"final_tier"`
	// CascadeCount is the number of cascade attempts
	CascadeCount int `json:"cascade_count"`
	// TotalLatencyMs is the total time spent across all attempts
	TotalLatencyMs int64 `json:"total_latency_ms"`
	// Success indicates whether a satisfactory response was obtained
	Success bool `json:"success"`
}
CascadeResult tracks the outcome of a cascade operation.
type CascadeTracker ¶
type CascadeTracker struct {
// contains filtered or unexported fields
}
CascadeTracker tracks an ongoing cascade operation.
func NewCascadeTracker ¶
func NewCascadeTracker(startTier Tier, maxAttempts int) *CascadeTracker
NewCascadeTracker creates a tracker for a cascade operation.
func (*CascadeTracker) CanContinue ¶
func (t *CascadeTracker) CanContinue() bool
CanContinue returns whether more cascade attempts are allowed.
func (*CascadeTracker) GetCurrentTier ¶
func (t *CascadeTracker) GetCurrentTier() Tier
GetCurrentTier returns the current tier being used.
func (*CascadeTracker) GetResult ¶
func (t *CascadeTracker) GetResult(success bool) *CascadeResult
GetResult returns the final cascade result.
func (*CascadeTracker) RecordAttempt ¶
func (t *CascadeTracker) RecordAttempt(decision *CascadeDecision)
RecordAttempt records a cascade attempt and its decision.
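The four tracker methods suggest a retry loop: evaluate at the current tier, stop on an acceptable response, otherwise record the attempt and move up while attempts remain. The sketch below shows that loop against a hypothetical `tracker` type with the same method names. Its internals (counting only cascading decisions, advancing to `NextTier`) and the tier names are assumptions, since CascadeTracker's fields are unexported.

```go
package main

import (
	"fmt"
	"time"
)

type Tier string

type CascadeDecision struct {
	ShouldCascade bool
	CurrentTier   Tier
	NextTier      Tier
}

type CascadeResult struct {
	OriginalTier   Tier
	FinalTier      Tier
	CascadeCount   int
	TotalLatencyMs int64
	Success        bool
}

// tracker is a hypothetical stand-in for CascadeTracker.
type tracker struct {
	start, current Tier
	cascades, max  int
	began          time.Time
}

func newTracker(startTier Tier, maxAttempts int) *tracker {
	return &tracker{start: startTier, current: startTier, max: maxAttempts, began: time.Now()}
}

func (t *tracker) CanContinue() bool { return t.cascades < t.max }

func (t *tracker) GetCurrentTier() Tier { return t.current }

// RecordAttempt counts a cascade and advances the tier (assumed behavior).
func (t *tracker) RecordAttempt(d *CascadeDecision) {
	if d.ShouldCascade && t.CanContinue() {
		t.cascades++
		t.current = d.NextTier
	}
}

func (t *tracker) GetResult(success bool) *CascadeResult {
	return &CascadeResult{
		OriginalTier:   t.start,
		FinalTier:      t.current,
		CascadeCount:   t.cascades,
		TotalLatencyMs: time.Since(t.began).Milliseconds(),
		Success:        success,
	}
}

// run drives the tracker until a response is accepted or attempts run out.
func run(start Tier, maxAttempts int, evaluate func(Tier) *CascadeDecision) *CascadeResult {
	t := newTracker(start, maxAttempts)
	for {
		d := evaluate(t.GetCurrentTier())
		if !d.ShouldCascade {
			return t.GetResult(true)
		}
		if !t.CanContinue() {
			return t.GetResult(false)
		}
		t.RecordAttempt(d)
	}
}

// fakeEvaluate accepts only the made-up "deep" tier.
func fakeEvaluate(tier Tier) *CascadeDecision {
	next := map[Tier]Tier{"fast": "standard", "standard": "deep"}
	if tier == "deep" {
		return &CascadeDecision{CurrentTier: tier}
	}
	return &CascadeDecision{ShouldCascade: true, CurrentTier: tier, NextTier: next[tier]}
}

func main() {
	r := run("fast", 2, fakeEvaluate)
	fmt.Println(r.FinalTier, r.CascadeCount, r.Success) // prints "deep 2 true"
}
```

With a limit of one cascade, the same loop stops at "standard" and reports failure, which is the case `CanContinue` exists to catch.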
type Config ¶
type Config struct {
	// Enabled toggles cascade functionality
	Enabled bool `yaml:"enabled" json:"enabled"`
	// QualityThreshold is the minimum quality score to accept (0.0-1.0)
	QualityThreshold float64 `yaml:"quality-threshold" json:"quality_threshold"`
	// MaxCascades is the maximum number of cascade attempts
	MaxCascades int `yaml:"max-cascades" json:"max_cascades"`
}
Config holds configuration for the cascade Manager.
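Note that the YAML keys use hyphens (`quality-threshold`, `max-cascades`) while the JSON keys use underscores. As a minimal sketch, the snippet below decodes the JSON form with the standard library and applies range checks. The `loadConfig` helper and its validation rules are assumptions for illustration; the page does not say whether the package validates these values.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Config is a local copy of the documented struct, tags included.
type Config struct {
	Enabled          bool    `yaml:"enabled" json:"enabled"`
	QualityThreshold float64 `yaml:"quality-threshold" json:"quality_threshold"`
	MaxCascades      int     `yaml:"max-cascades" json:"max_cascades"`
}

// loadConfig is a hypothetical helper: it decodes JSON and rejects values
// outside the ranges the field comments describe.
func loadConfig(raw []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return Config{}, err
	}
	if cfg.QualityThreshold < 0 || cfg.QualityThreshold > 1 {
		return Config{}, fmt.Errorf("quality_threshold %v is outside 0.0-1.0", cfg.QualityThreshold)
	}
	if cfg.MaxCascades < 0 {
		return Config{}, errors.New("max_cascades must not be negative")
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{"enabled": true, "quality_threshold": 0.7, "max_cascades": 2}`)
	cfg, err := loadConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```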
type Manager ¶
type Manager struct {
// contains filtered or unexported fields
}
Manager orchestrates model cascading based on response quality.
func NewManager ¶
NewManager creates a new Manager with the given configuration.
func (*Manager) DetectQualitySignals ¶
func (m *Manager) DetectQualitySignals(response string) []QualitySignal
DetectQualitySignals analyzes a response and returns detected quality issues.
func (*Manager) EvaluateResponse ¶
func (m *Manager) EvaluateResponse(response string, currentTier Tier) *CascadeDecision
EvaluateResponse determines if a response should trigger a cascade.
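Taken together with Config.QualityThreshold and the CascadeDecision fields, the evaluation presumably compares a quality score against the threshold and, if it falls short, recommends the next tier. The sketch below illustrates that reading only. The `decide` function, the three-step `ladder`, and the reason strings are hypothetical, and the real method works from the response text rather than a precomputed score.

```go
package main

import "fmt"

type Tier string

// CascadeDecision keeps only the fields this sketch fills in.
type CascadeDecision struct {
	ShouldCascade bool
	CurrentTier   Tier
	NextTier      Tier
	QualityScore  float64
	Reason        string
}

// ladder is a made-up tier ordering, lowest capability first.
var ladder = []Tier{"fast", "standard", "deep"}

// nextTier returns the tier above current, if there is one.
func nextTier(current Tier) (Tier, bool) {
	for i, t := range ladder {
		if t == current && i+1 < len(ladder) {
			return ladder[i+1], true
		}
	}
	return "", false
}

// decide sketches the threshold check (assumed logic, not the package's).
func decide(score float64, current Tier, threshold float64) CascadeDecision {
	d := CascadeDecision{CurrentTier: current, QualityScore: score}
	if score >= threshold {
		d.Reason = "quality score meets threshold"
		return d
	}
	next, ok := nextTier(current)
	if !ok {
		d.Reason = "quality below threshold, but no higher tier is available"
		return d
	}
	d.ShouldCascade = true
	d.NextTier = next
	d.Reason = "quality score below threshold"
	return d
}

func main() {
	d := decide(0.4, "fast", 0.7)
	fmt.Println(d.ShouldCascade, d.NextTier) // prints "true standard"
}
```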
func (*Manager) GetMetrics ¶
func (m *Manager) GetMetrics() map[string]interface{}
GetMetrics returns cascade performance metrics.
func (*Manager) GetMetricsAsMap ¶
func (m *Manager) GetMetricsAsMap() map[string]interface{}
GetMetricsAsMap returns metrics in a format suitable for Lua.
type QualitySignal ¶
type QualitySignal struct {
	// Type identifies the kind of quality issue
	Type SignalType `json:"type"`
	// Severity indicates how serious the issue is (0.0-1.0)
	Severity float64 `json:"severity"`
	// Description provides human-readable details
	Description string `json:"description"`
	// Position indicates where in the response the issue was detected (if applicable)
	Position int `json:"position,omitempty"`
}
QualitySignal represents a detected quality issue in a response.
type QualitySignalDetector ¶
type QualitySignalDetector struct {
// contains filtered or unexported fields
}
QualitySignalDetector detects quality issues in LLM responses.
func NewQualitySignalDetector ¶
func NewQualitySignalDetector() *QualitySignalDetector
NewQualitySignalDetector creates a new detector with default patterns.
func (*QualitySignalDetector) DetectSignals ¶
func (d *QualitySignalDetector) DetectSignals(response string) []QualitySignal
DetectSignals analyzes a response and returns all detected quality signals.
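The default patterns are not listed on this page. To make the idea concrete, the sketch below checks for two of the documented signal types using simple heuristics: an odd number of code fences for `incomplete_code`, and a few refusal phrases for `refusal`. The `detect` function, the phrases, and the severity values are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

type SignalType string

const (
	SignalIncompleteCode SignalType = "incomplete_code"
	SignalRefusal        SignalType = "refusal"
)

type QualitySignal struct {
	Type        SignalType
	Severity    float64
	Description string
	Position    int
}

// fence is the Markdown code fence marker (three backticks).
var fence = strings.Repeat("`", 3)

// refusalPhrases is a made-up list; the real patterns are not documented.
var refusalPhrases = []string{"i cannot help", "i can't help", "i am unable to"}

// detect is a hypothetical stand-in for DetectSignals.
func detect(response string) []QualitySignal {
	var signals []QualitySignal

	// An odd number of fences means one block was opened but never closed.
	if strings.Count(response, fence)%2 == 1 {
		signals = append(signals, QualitySignal{
			Type:        SignalIncompleteCode,
			Severity:    0.7, // assumed severity
			Description: "code fence opened but never closed",
			Position:    strings.LastIndex(response, fence),
		})
	}

	// Report the first refusal phrase found; Position is its byte offset
	// in the lowercased text.
	lower := strings.ToLower(response)
	for _, phrase := range refusalPhrases {
		if i := strings.Index(lower, phrase); i >= 0 {
			signals = append(signals, QualitySignal{
				Type:        SignalRefusal,
				Severity:    0.9, // assumed severity
				Description: "response contains a refusal phrase",
				Position:    i,
			})
			break
		}
	}
	return signals
}

func main() {
	resp := "Here is the code:\n" + fence + "go\nfunc main() {"
	for _, s := range detect(resp) {
		fmt.Printf("%s at %d: %s\n", s.Type, s.Position, s.Description)
	}
}
```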
type SignalType ¶
type SignalType string
SignalType categorizes different quality issues.
const (
	// SignalAbruptEnding indicates the response ended unexpectedly
	SignalAbruptEnding SignalType = "abrupt_ending"
	// SignalMissingSections indicates expected content sections are missing
	SignalMissingSections SignalType = "missing_sections"
	// SignalIncompleteCode indicates code blocks are not properly closed
	SignalIncompleteCode SignalType = "incomplete_code"
	// SignalTruncated indicates the response appears to be cut off
	SignalTruncated SignalType = "truncated"
	// SignalRepetitive indicates excessive repetition in the response
	SignalRepetitive SignalType = "repetitive"
	// SignalIncoherent indicates the response lacks logical flow
	SignalIncoherent SignalType = "incoherent"
	// SignalRefusal indicates the model refused to answer
	SignalRefusal SignalType = "refusal"
	// SignalLowQuality indicates general low quality response
	SignalLowQuality SignalType = "low_quality"
)
type Tier ¶
type Tier string
Tier represents a model capability tier for cascading.
func CapabilityToTier ¶
CapabilityToTier maps a capability slot name to a cascade tier.
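CapabilityToTier and TierToCapability read as inverse lookups between tiers and capability slot names. This page lists neither the tier values nor the slot names, so the sketch below uses made-up ones purely to show the round trip. Returning an empty value for an unknown input is also an assumption.

```go
package main

import "fmt"

type Tier string

// Hypothetical tier values and slot names; the real ones are not listed here.
var tierToSlot = map[Tier]string{
	"fast":     "small-model",
	"standard": "medium-model",
	"deep":     "large-model",
}

// tierToCapability stands in for TierToCapability.
func tierToCapability(tier Tier) string {
	return tierToSlot[tier] // "" for an unknown tier (assumed behavior)
}

// capabilityToTier stands in for CapabilityToTier by inverting the map.
func capabilityToTier(slot string) Tier {
	for tier, s := range tierToSlot {
		if s == slot {
			return tier
		}
	}
	return "" // assumed behavior for an unknown slot
}

func main() {
	slot := tierToCapability("standard")
	fmt.Println(slot, capabilityToTier(slot)) // prints "medium-model standard"
}
```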