metrics

package
v0.5.21 Latest
Published: Jun 2, 2026 License: Apache-2.0 Imports: 3 Imported by: 0

Documentation

Overview

Package metrics provides observability infrastructure for the Superbrain system. It tracks healing attempts, diagnoses, fallbacks, and other autonomous actions to enable monitoring and performance analysis of the self-healing capabilities.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type LatencyStats

type LatencyStats struct {
	AverageMs int64 `json:"average_ms"`
	MinMs     int64 `json:"min_ms"`
	MaxMs     int64 `json:"max_ms"`
	Samples   int64 `json:"samples"`
}

LatencyStats contains statistical information about healing latencies.
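
As an illustration of how the four fields relate to raw measurements, here is a minimal sketch. The latencyStats type and the summarize helper are local stand-ins invented for this example, not part of the package, and truncating integer division for the average is an assumption.

```go
package main

import "fmt"

// latencyStats mirrors the documented LatencyStats fields.
// It is a local stand-in, not the package's own type.
type latencyStats struct {
	AverageMs int64
	MinMs     int64
	MaxMs     int64
	Samples   int64
}

// summarize is a hypothetical helper that derives the four fields
// from a slice of latency samples in milliseconds.
func summarize(samplesMs []int64) latencyStats {
	if len(samplesMs) == 0 {
		return latencyStats{} // no samples: all fields stay zero
	}
	stats := latencyStats{
		MinMs:   samplesMs[0],
		MaxMs:   samplesMs[0],
		Samples: int64(len(samplesMs)),
	}
	var sum int64
	for _, v := range samplesMs {
		sum += v
		if v < stats.MinMs {
			stats.MinMs = v
		}
		if v > stats.MaxMs {
			stats.MaxMs = v
		}
	}
	stats.AverageMs = sum / stats.Samples // integer division truncates (assumed)
	return stats
}

func main() {
	fmt.Printf("%+v\n", summarize([]int64{120, 80, 200}))
	// prints {AverageMs:133 MinMs:80 MaxMs:200 Samples:3}
}
```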

type Metrics

type Metrics struct {
	// contains filtered or unexported fields
}

Metrics tracks all Superbrain operations for observability. Autonomous systems need this from day zero: comprehensive metrics are required to monitor healing effectiveness and system behavior.

func Global

func Global() *Metrics

Global returns the global Metrics instance, initializing it if necessary.

func New

func New(maxSamples int) *Metrics

New creates a new Metrics instance with the specified maximum number of latency samples. The maxSamples parameter caps how many latency measurements are retained for computing the latency statistics reported in LatencyStats.
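
The documentation does not say what happens once maxSamples measurements have been stored. One common approach is a ring buffer that overwrites the oldest sample. The sketch below shows that approach under that assumption. sampleWindow and its methods are hypothetical names, not the package's implementation.

```go
package main

import "fmt"

// sampleWindow is a hypothetical fixed-capacity buffer that keeps
// only the most recent maxSamples values (oldest-first eviction is assumed).
type sampleWindow struct {
	buf  []int64
	next int  // index the next sample will be written to
	full bool // true once the buffer has wrapped at least once
}

func newSampleWindow(maxSamples int) *sampleWindow {
	if maxSamples < 1 {
		maxSamples = 1 // guard against a zero-length buffer
	}
	return &sampleWindow{buf: make([]int64, maxSamples)}
}

// add stores v, overwriting the oldest sample when the window is full.
func (w *sampleWindow) add(v int64) {
	w.buf[w.next] = v
	w.next = (w.next + 1) % len(w.buf)
	if w.next == 0 {
		w.full = true
	}
}

// values returns a copy of the retained samples, oldest first.
func (w *sampleWindow) values() []int64 {
	if !w.full {
		return append([]int64(nil), w.buf[:w.next]...)
	}
	out := append([]int64(nil), w.buf[w.next:]...)
	return append(out, w.buf[:w.next]...)
}

func main() {
	w := newSampleWindow(3)
	for _, v := range []int64{10, 20, 30, 40} {
		w.add(v)
	}
	fmt.Println(w.values()) // prints [20 30 40]
}
```

A bounded window keeps memory use constant regardless of how long the process runs, at the cost of statistics that reflect only recent activity.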

func (*Metrics) DecrementActiveMonitoring

func (m *Metrics) DecrementActiveMonitoring()

DecrementActiveMonitoring decrements the gauge for active monitoring contexts. This should be called when an Overwatch monitoring context completes.

func (*Metrics) DecrementQueuedActions

func (m *Metrics) DecrementQueuedActions()

DecrementQueuedActions decrements the gauge for queued healing actions.

func (*Metrics) IncrementActiveMonitoring

func (m *Metrics) IncrementActiveMonitoring()

IncrementActiveMonitoring increments the gauge for active monitoring contexts. This should be called when a new Overwatch monitoring context is created.

func (*Metrics) IncrementQueuedActions

func (m *Metrics) IncrementQueuedActions()

IncrementQueuedActions increments the gauge for queued healing actions. This is used in human-in-the-loop mode when actions await approval.
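
Because gauges move in both directions, each increment needs a matching decrement on every exit path. The following sketch pairs the two with defer. The gauge type and the monitor function are stand-ins written for this example, and the use of sync/atomic is an assumption about how such a gauge could be made safe for concurrent use.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// gauge is a hypothetical stand-in for a value that can rise and fall.
type gauge struct{ n int64 }

func (g *gauge) inc()         { atomic.AddInt64(&g.n, 1) }
func (g *gauge) dec()         { atomic.AddInt64(&g.n, -1) }
func (g *gauge) value() int64 { return atomic.LoadInt64(&g.n) }

// monitor simulates one monitoring context: the gauge is raised on
// entry and lowered on exit, even if work panics or returns early.
func monitor(g *gauge, work func()) {
	g.inc()
	defer g.dec()
	work()
}

func main() {
	var g gauge
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			monitor(&g, func() {})
		}()
	}
	wg.Wait()
	fmt.Println(g.value()) // prints 0: every increment was matched
}
```

With the real type, the same shape would presumably call IncrementActiveMonitoring followed by a deferred DecrementActiveMonitoring.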

func (*Metrics) RecordContextOptimization

func (m *Metrics) RecordContextOptimization()

RecordContextOptimization increments the counter for context sculpting operations.

func (*Metrics) RecordDiagnosis

func (m *Metrics) RecordDiagnosis()

RecordDiagnosis increments the counter for diagnoses performed by the Internal Doctor.

func (*Metrics) RecordFailureByType

func (m *Metrics) RecordFailureByType(failureType string)

RecordFailureByType increments the counter for a specific failure type. Common types include "permission_prompt", "auth_error", "context_exceeded", and "rate_limit".

func (*Metrics) RecordFallback

func (m *Metrics) RecordFallback()

RecordFallback increments the counter for fallback routing events.

func (*Metrics) RecordHealingAttempt

func (m *Metrics) RecordHealingAttempt()

RecordHealingAttempt increments the total healing attempts counter.

func (*Metrics) RecordHealingByType

func (m *Metrics) RecordHealingByType(healingType string)

RecordHealingByType increments the counter for a specific healing action type. Common types include: "stdin_injection", "restart_with_flags", "fallback_routing", "context_optimization".

func (*Metrics) RecordHealingFailure

func (m *Metrics) RecordHealingFailure()

RecordHealingFailure increments the failed healings counter.

func (*Metrics) RecordHealingSuccess

func (m *Metrics) RecordHealingSuccess(latencyMs int64)

RecordHealingSuccess increments the successful healings counter and records latency. The latencyMs parameter is the time taken for the healing action to complete.

func (*Metrics) RecordRestart

func (m *Metrics) RecordRestart()

RecordRestart increments the counter for process restart attempts.

func (*Metrics) RecordSilenceDetection

func (m *Metrics) RecordSilenceDetection()

RecordSilenceDetection increments the counter for silence threshold detections.

func (*Metrics) RecordStdinInjection

func (m *Metrics) RecordStdinInjection()

RecordStdinInjection increments the counter for stdin injection attempts.

func (*Metrics) Reset

func (m *Metrics) Reset()

Reset clears all metrics. This is primarily useful for testing.

func (*Metrics) Snapshot

func (m *Metrics) Snapshot() *Snapshot

Snapshot returns a point-in-time view of all metrics. This is safe to call concurrently and returns a copy of the current state.
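
Returning a copy matters most for the map-valued fields, since a shared map could change while a reader iterates over it. The sketch below shows one way to produce such a copy under a read lock. typeCounters is a hypothetical type, and the locking strategy is an assumption, not a description of the package's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// typeCounters is a hypothetical by-type counter set guarded by a mutex.
type typeCounters struct {
	mu     sync.RWMutex
	byType map[string]int64
}

// record increments the counter for one type under the write lock.
func (c *typeCounters) record(kind string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byType[kind]++
}

// snapshot copies the map under the read lock so the caller's view
// cannot change after it is returned.
func (c *typeCounters) snapshot() map[string]int64 {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make(map[string]int64, len(c.byType))
	for k, v := range c.byType {
		out[k] = v
	}
	return out
}

func main() {
	c := &typeCounters{byType: map[string]int64{}}
	c.record("rate_limit")
	snap := c.snapshot()
	c.record("rate_limit") // later updates do not affect snap
	fmt.Println(snap["rate_limit"], c.snapshot()["rate_limit"]) // prints 1 2
}
```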

type Snapshot

type Snapshot struct {
	// Counters
	HealingAttempts      int64 `json:"healing_attempts"`
	SuccessfulHealings   int64 `json:"successful_healings"`
	FailedHealings       int64 `json:"failed_healings"`
	SilenceDetections    int64 `json:"silence_detections"`
	DiagnosesPerformed   int64 `json:"diagnoses_performed"`
	FallbacksTriggered   int64 `json:"fallbacks_triggered"`
	StdinInjectionsTotal int64 `json:"stdin_injections_total"`
	RestartsTotal        int64 `json:"restarts_total"`
	ContextOptimizations int64 `json:"context_optimizations"`

	// By-type breakdowns
	HealingByType map[string]int64 `json:"healing_by_type"`
	FailureByType map[string]int64 `json:"failure_by_type"`

	// Latency statistics
	LatencyStats LatencyStats `json:"latency_stats"`

	// Gauges
	ActiveMonitoringContexts int64 `json:"active_monitoring_contexts"`
	QueuedHealingActions     int64 `json:"queued_healing_actions"`

	// Metadata
	UptimeSeconds int64     `json:"uptime_seconds"`
	Timestamp     time.Time `json:"timestamp"`
}

Snapshot represents a point-in-time view of all Superbrain metrics. This structure is safe to serialize and expose via API endpoints.
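
To show what the serialized form looks like, the sketch below marshals a struct that copies four of the documented fields and their JSON tags. snapshotView and encode are local stand-ins. The real Snapshot has more fields, so its output would be longer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// snapshotView copies a few documented Snapshot fields and JSON tags.
// It is a reduced stand-in for illustration only.
type snapshotView struct {
	HealingAttempts    int64            `json:"healing_attempts"`
	SuccessfulHealings int64            `json:"successful_healings"`
	FailedHealings     int64            `json:"failed_healings"`
	FailureByType      map[string]int64 `json:"failure_by_type"`
}

// encode is a hypothetical helper that returns the JSON form as a string.
func encode(v snapshotView) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	out, err := encode(snapshotView{
		HealingAttempts:    3,
		SuccessfulHealings: 2,
		FailedHealings:     1,
		FailureByType:      map[string]int64{"rate_limit": 1},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints {"healing_attempts":3,"successful_healings":2,"failed_healings":1,"failure_by_type":{"rate_limit":1}}
}
```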

func (*Snapshot) FailureRate

func (s *Snapshot) FailureRate() float64

FailureRate calculates the healing failure rate as a percentage (0-100). Returns 0 if no healing attempts have been made.

func (*Snapshot) SuccessRate

func (s *Snapshot) SuccessRate() float64

SuccessRate calculates the healing success rate as a percentage (0-100). Returns 0 if no healing attempts have been made.
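
The documented behavior of both rate methods, a percentage with a zero result when nothing has been attempted, can be sketched as follows. successRate and failureRate are free-standing helpers written for this example. Using HealingAttempts as the denominator is an assumption, since the documentation does not name it.

```go
package main

import "fmt"

// successRate returns successes as a percentage of attempts,
// or 0 when there have been no attempts (avoids division by zero).
func successRate(attempts, successes int64) float64 {
	if attempts == 0 {
		return 0
	}
	return float64(successes) / float64(attempts) * 100
}

// failureRate is the same calculation applied to failures.
func failureRate(attempts, failures int64) float64 {
	if attempts == 0 {
		return 0
	}
	return float64(failures) / float64(attempts) * 100
}

func main() {
	fmt.Println(successRate(4, 3), failureRate(4, 1)) // prints 75 25
	fmt.Println(successRate(0, 0))                    // prints 0
}
```

The two rates sum to 100 only if every attempt has been recorded as either a success or a failure; attempts still in progress would lower both.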
