Documentation ¶
Overview ¶
Package metrics provides observability infrastructure for the Superbrain system. It tracks healing attempts, diagnoses, fallbacks, and other autonomous actions to enable monitoring and performance analysis of the self-healing capabilities.
Index ¶
- type LatencyStats
- type Metrics
- func (m *Metrics) DecrementActiveMonitoring()
- func (m *Metrics) DecrementQueuedActions()
- func (m *Metrics) IncrementActiveMonitoring()
- func (m *Metrics) IncrementQueuedActions()
- func (m *Metrics) RecordContextOptimization()
- func (m *Metrics) RecordDiagnosis()
- func (m *Metrics) RecordFailureByType(failureType string)
- func (m *Metrics) RecordFallback()
- func (m *Metrics) RecordHealingAttempt()
- func (m *Metrics) RecordHealingByType(healingType string)
- func (m *Metrics) RecordHealingFailure()
- func (m *Metrics) RecordHealingSuccess(latencyMs int64)
- func (m *Metrics) RecordRestart()
- func (m *Metrics) RecordSilenceDetection()
- func (m *Metrics) RecordStdinInjection()
- func (m *Metrics) Reset()
- func (m *Metrics) Snapshot() *Snapshot
- type Snapshot
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type LatencyStats ¶
type LatencyStats struct {
AverageMs int64 `json:"average_ms"`
MinMs int64 `json:"min_ms"`
MaxMs int64 `json:"max_ms"`
Samples int64 `json:"samples"`
}
LatencyStats contains statistical information about healing latencies.
type Metrics ¶
type Metrics struct {
// contains filtered or unexported fields
}
Metrics tracks all Superbrain operations for observability. Comprehensive metrics are a Day-0 requirement for autonomous systems: they are needed to monitor healing effectiveness and system behavior.
func Global ¶
func Global() *Metrics
Global returns the global Metrics instance, initializing it if necessary.
func New ¶
New creates a new Metrics instance with the specified maximum latency samples. The maxSamples parameter controls how many latency measurements are kept for calculating percentiles and averages.
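The package's sample storage is unexported, so the following is only a minimal sketch of how a bounded latency window could behave, assuming the oldest sample is overwritten once the limit is reached. The names `latencyWindow`, `newLatencyWindow`, `add`, and `stats` are hypothetical and are not part of this package.

```go
package main

import "fmt"

// latencyWindow is a hypothetical stand-in for the unexported sample store.
// It retains at most capacity samples and overwrites the oldest one when full.
type latencyWindow struct {
	samples  []int64
	next     int // index of the oldest sample once the window is full
	capacity int
}

func newLatencyWindow(capacity int) *latencyWindow {
	if capacity < 1 {
		capacity = 1
	}
	return &latencyWindow{capacity: capacity}
}

// add stores one latency measurement in milliseconds.
func (w *latencyWindow) add(ms int64) {
	if len(w.samples) < w.capacity {
		w.samples = append(w.samples, ms)
		return
	}
	w.samples[w.next] = ms
	w.next = (w.next + 1) % w.capacity
}

// stats returns the average, minimum, maximum, and count of retained samples.
// All values are zero when no samples have been recorded.
func (w *latencyWindow) stats() (avg, lo, hi, n int64) {
	if len(w.samples) == 0 {
		return 0, 0, 0, 0
	}
	lo, hi = w.samples[0], w.samples[0]
	var sum int64
	for _, s := range w.samples {
		sum += s
		if s < lo {
			lo = s
		}
		if s > hi {
			hi = s
		}
	}
	n = int64(len(w.samples))
	return sum / n, lo, hi, n
}

func main() {
	w := newLatencyWindow(3)
	for _, ms := range []int64{100, 200, 300, 400} {
		w.add(ms) // the fourth sample evicts the first
	}
	fmt.Println(w.stats()) // prints: 300 200 400 3
}
```

A smaller window tracks recent behavior more closely, while a larger one smooths out short spikes at the cost of memory.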
func (*Metrics) DecrementActiveMonitoring ¶
func (m *Metrics) DecrementActiveMonitoring()
DecrementActiveMonitoring decrements the gauge for active monitoring contexts. This should be called when an Overwatch monitoring context completes.
func (*Metrics) DecrementQueuedActions ¶
func (m *Metrics) DecrementQueuedActions()
DecrementQueuedActions decrements the gauge for queued healing actions.
func (*Metrics) IncrementActiveMonitoring ¶
func (m *Metrics) IncrementActiveMonitoring()
IncrementActiveMonitoring increments the gauge for active monitoring contexts. This should be called when a new Overwatch monitoring context is created.
func (*Metrics) IncrementQueuedActions ¶
func (m *Metrics) IncrementQueuedActions()
IncrementQueuedActions increments the gauge for queued healing actions. This is used in human-in-the-loop mode when actions await approval.
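Gauges stay accurate only if every increment is paired with a decrement. The sketch below illustrates that pairing with `defer`; the `gauge` type and the `monitor` and `runMonitors` helpers are hypothetical stand-ins, not this package's implementation. With the real type, the same pattern would presumably pair `IncrementActiveMonitoring` with a deferred `DecrementActiveMonitoring`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// gauge is a hypothetical stand-in for the package's unexported gauge state.
type gauge struct{ v int64 }

func (g *gauge) inc()         { atomic.AddInt64(&g.v, 1) }
func (g *gauge) dec()         { atomic.AddInt64(&g.v, -1) }
func (g *gauge) value() int64 { return atomic.LoadInt64(&g.v) }

// monitor simulates one monitoring context. The gauge goes up on entry and
// the deferred call brings it back down even if work panics or returns early.
func monitor(g *gauge, work func()) {
	g.inc()
	defer g.dec()
	work()
}

// runMonitors starts n concurrent monitoring contexts, waits for all of them,
// and returns the final gauge value.
func runMonitors(g *gauge, n int) int64 {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			monitor(g, func() {})
		}()
	}
	wg.Wait()
	return g.value()
}

func main() {
	var g gauge
	fmt.Println(runMonitors(&g, 8)) // prints: 0
}
```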
func (*Metrics) RecordContextOptimization ¶
func (m *Metrics) RecordContextOptimization()
RecordContextOptimization increments the counter for context sculpting operations.
func (*Metrics) RecordDiagnosis ¶
func (m *Metrics) RecordDiagnosis()
RecordDiagnosis increments the counter for diagnoses performed by the Internal Doctor.
func (*Metrics) RecordFailureByType ¶
func (m *Metrics) RecordFailureByType(failureType string)
RecordFailureByType increments the counter for a specific failure type. Common types include "permission_prompt", "auth_error", "context_exceeded", and "rate_limit".
func (*Metrics) RecordFallback ¶
func (m *Metrics) RecordFallback()
RecordFallback increments the counter for fallback routing events.
func (*Metrics) RecordHealingAttempt ¶
func (m *Metrics) RecordHealingAttempt()
RecordHealingAttempt increments the total healing attempts counter.
func (*Metrics) RecordHealingByType ¶
func (m *Metrics) RecordHealingByType(healingType string)
RecordHealingByType increments the counter for a specific healing action type. Common types include "stdin_injection", "restart_with_flags", "fallback_routing", and "context_optimization".
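As a rough illustration of how per-type counters like these can be kept safe under concurrent use, here is a minimal sketch built on a mutex-guarded map. The `typedCounter` type and its methods are hypothetical; the package's actual internals are unexported and may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// typedCounter is a hypothetical per-type counter keyed by a free-form string.
type typedCounter struct {
	mu     sync.Mutex
	counts map[string]int64
}

// record increments the counter for kind, creating the map on first use.
func (c *typedCounter) record(kind string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.counts == nil {
		c.counts = make(map[string]int64)
	}
	c.counts[kind]++
}

// snapshot returns a copy, so callers cannot mutate the live counters.
func (c *typedCounter) snapshot() map[string]int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := make(map[string]int64, len(c.counts))
	for k, v := range c.counts {
		out[k] = v
	}
	return out
}

func main() {
	var healing typedCounter
	healing.record("stdin_injection")
	healing.record("stdin_injection")
	healing.record("fallback_routing")
	snap := healing.snapshot()
	fmt.Println(snap["stdin_injection"], snap["fallback_routing"]) // prints: 2 1
}
```

Because the keys are free-form strings, defining them once as constants at the call site helps avoid splitting one category across several spellings.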
func (*Metrics) RecordHealingFailure ¶
func (m *Metrics) RecordHealingFailure()
RecordHealingFailure increments the failed healings counter.
func (*Metrics) RecordHealingSuccess ¶
func (m *Metrics) RecordHealingSuccess(latencyMs int64)
RecordHealingSuccess increments the successful healings counter and records latency. The latencyMs parameter is the time, in milliseconds, that the healing action took to complete.
func (*Metrics) RecordRestart ¶
func (m *Metrics) RecordRestart()
RecordRestart increments the counter for process restart attempts.
func (*Metrics) RecordSilenceDetection ¶
func (m *Metrics) RecordSilenceDetection()
RecordSilenceDetection increments the counter for silence threshold detections.
func (*Metrics) RecordStdinInjection ¶
func (m *Metrics) RecordStdinInjection()
RecordStdinInjection increments the counter for stdin injection attempts.
type Snapshot ¶
type Snapshot struct {
// Counters
HealingAttempts int64 `json:"healing_attempts"`
SuccessfulHealings int64 `json:"successful_healings"`
FailedHealings int64 `json:"failed_healings"`
SilenceDetections int64 `json:"silence_detections"`
DiagnosesPerformed int64 `json:"diagnoses_performed"`
FallbacksTriggered int64 `json:"fallbacks_triggered"`
StdinInjectionsTotal int64 `json:"stdin_injections_total"`
RestartsTotal int64 `json:"restarts_total"`
ContextOptimizations int64 `json:"context_optimizations"`
// By-type breakdowns
HealingByType map[string]int64 `json:"healing_by_type"`
FailureByType map[string]int64 `json:"failure_by_type"`
// Latency statistics
LatencyStats LatencyStats `json:"latency_stats"`
// Gauges
ActiveMonitoringContexts int64 `json:"active_monitoring_contexts"`
QueuedHealingActions int64 `json:"queued_healing_actions"`
// Metadata
UptimeSeconds int64 `json:"uptime_seconds"`
Timestamp time.Time `json:"timestamp"`
}
Snapshot represents a point-in-time view of all Superbrain metrics. This structure is safe to serialize and expose via API endpoints.
func (*Snapshot) FailureRate ¶
FailureRate calculates the healing failure rate as a percentage (0-100). Returns 0 if no healing attempts have been made.
func (*Snapshot) SuccessRate ¶
SuccessRate calculates the healing success rate as a percentage (0-100). Returns 0 if no healing attempts have been made.
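The arithmetic behind both rates can be sketched as follows, including the zero-attempts guard described above. The `rate` helper is hypothetical, and the `float64` return type is an assumption, since the method signatures are not shown on this page.

```go
package main

import "fmt"

// rate returns part as a percentage (0-100) of total, or 0 when total is 0.
// Hypothetical helper mirroring the documented behavior of SuccessRate and
// FailureRate.
func rate(part, total int64) float64 {
	if total == 0 {
		return 0
	}
	return float64(part) / float64(total) * 100
}

func main() {
	const attempts, successes, failures = 4, 3, 1
	fmt.Println(rate(successes, attempts)) // prints: 75
	fmt.Println(rate(failures, attempts))  // prints: 25
	fmt.Println(rate(0, 0))                // prints: 0
}
```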