Documentation ¶
Overview ¶
Package controller provides the Kubernetes controller for EffectivenessAssessment (EA) custom resources. The controller watches EA resources created by the Remediation Orchestrator and runs four effectiveness assessment checks: health, alert, metrics, and hash.
Architecture: ADR-EM-001 (Effectiveness Monitor Service Integration)
Controller Pattern: controller-runtime reconciler with dependency injection
Business Requirements:
- BR-EM-001 through BR-EM-004: Component assessment checks
- BR-EM-005: Phase state transitions
- BR-EM-006: Stabilization window
- BR-EM-007: Validity window
- BR-AUDIT-006: SOC 2 audit trail
Functions ¶
func ComputePodHealthStats ¶
func ComputePodHealthStats(pods []*corev1.Pod, remediationStartedAt *metav1.Time) health.TargetStatus
ComputePodHealthStats aggregates health indicators from a set of active pods. remediationStartedAt controls restart counting: pods created before that time have their cumulative RestartCount excluded (they predate the remediation). Exported for unit testing (#246).
Types ¶
type Reconciler ¶
type Reconciler struct {
    client.Client
    Scheme   *runtime.Scheme
    Recorder record.EventRecorder

    // Dependencies (injected via NewReconciler)
    Metrics            *emmetrics.Metrics
    PrometheusClient   emclient.PrometheusQuerier
    AlertManagerClient emclient.AlertManagerClient
    AuditManager       *emaudit.Manager
    DSQuerier          emclient.DataStorageQuerier

    // Configuration
    Config ReconcilerConfig
    // contains filtered or unexported fields
}
Reconciler reconciles EffectivenessAssessment objects. It performs the 4 assessment checks and emits audit events.
func NewReconciler ¶
func NewReconciler(
    c client.Client,
    apiReader client.Reader,
    s *runtime.Scheme,
    recorder record.EventRecorder,
    m *emmetrics.Metrics,
    promClient emclient.PrometheusQuerier,
    amClient emclient.AlertManagerClient,
    auditMgr *emaudit.Manager,
    dsQuerier emclient.DataStorageQuerier,
    cfg ReconcilerConfig,
) *Reconciler
NewReconciler creates a new Reconciler with all dependencies injected. Per DD-METRICS-001: Metrics wired via dependency injection. Per DD-AUDIT-003: AuditManager wired via dependency injection (Pattern 2).
func (*Reconciler) Reconcile ¶
Reconcile handles a single reconciliation of an EffectivenessAssessment. This is the main entry point called by controller-runtime.
Reconciliation flow:
- Fetch EA from API server
- Check if EA is in terminal state -> skip
- Check validity window (expired takes priority)
- If expired -> complete with partial data
- If stabilizing -> requeue after stabilization
- Transition Pending -> Assessing
- Run component checks (skip already-completed components)
- Update EA status with component results
- If all components done -> complete assessment
- Otherwise -> requeue for remaining components
func (*Reconciler) SetRESTMapper ¶
func (r *Reconciler) SetRESTMapper(rm meta.RESTMapper)
SetRESTMapper sets the REST mapper used to resolve Kind -> GVR for unstructured fetches. Called after NewReconciler and before SetupWithManager so the controller has access to the manager's discovery-backed mapper.
func (*Reconciler) SetupWithManager ¶
func (r *Reconciler) SetupWithManager(mgr ctrl.Manager, maxConcurrentReconciles ...int) error
SetupWithManager registers the controller with the manager. Creates a field index on spec.correlationID for O(1) lookups and kubectl field-selector support (e.g., kubectl get ea --field-selector spec.correlationID=rr-xxx).
type ReconcilerConfig ¶
type ReconcilerConfig struct {
    // PrometheusEnabled indicates whether metric comparison is active.
    PrometheusEnabled bool

    // AlertManagerEnabled indicates whether alert resolution checking is active.
    AlertManagerEnabled bool

    // ValidityWindow is the maximum duration for assessment completion.
    // The EM computes ValidityDeadline = EA.creationTimestamp + ValidityWindow
    // on first reconciliation and stores it in EA.Status.ValidityDeadline.
    // Default: 30m (from effectivenessmonitor.Config.Assessment.ValidityWindow).
    ValidityWindow time.Duration

    // PrometheusLookback is the duration before EA creation to query Prometheus.
    // Default: 10 minutes. Shorter values improve E2E test speed.
    PrometheusLookback time.Duration

    // RequeueGenericError is the delay before retrying on transient errors.
    // Default: 5s (from emconfig.RequeueGenericError).
    RequeueGenericError time.Duration

    // RequeueAssessmentInProgress is the delay before retrying while waiting
    // for external data (e.g., Prometheus scrape).
    // Default: 15s (from emconfig.RequeueAssessmentInProgress).
    RequeueAssessmentInProgress time.Duration
}
ReconcilerConfig holds runtime configuration for the reconciler.
func DefaultReconcilerConfig ¶
func DefaultReconcilerConfig() ReconcilerConfig
DefaultReconcilerConfig returns a ReconcilerConfig with production defaults.