Documentation ¶
Index ¶
- func GetKubernetesClient() (kubernetes.Interface, error)
- func ProbeConnection(ctx context.Context, client kubernetes.Interface) error
- type Decision
- type HealthChecker
- func NewChecker(eksClient *eks.Client, k8sClient kubernetes.Interface, cwClient *cloudwatch.Client, asgClient *autoscaling.Client) *HealthChecker
- func (hc *HealthChecker) CheckClusterCapacity(ctx context.Context, clusterName string) HealthResult
- func (hc *HealthChecker) CheckCriticalWorkloads(ctx context.Context) HealthResult
- func (hc *HealthChecker) CheckNodeHealth(ctx context.Context, clusterName string) HealthResult
- func (hc *HealthChecker) CheckPodDisruptionBudgets(ctx context.Context) HealthResult
- func (hc *HealthChecker) CheckResourceBalance(ctx context.Context, clusterName string) HealthResult
- func (hc *HealthChecker) ListPodDisruptionBudgets(ctx context.Context) ([]PDBInfo, error)
- func (hc *HealthChecker) RunAllChecks(ctx context.Context, clusterName string) HealthSummary
- type HealthResult
- type HealthStatus
- type HealthSummary
- type KubeDiag
- func BuildKubeClient(kubeconfigPath string) (kubernetes.Interface, KubeDiag, error)
- type NodeMetrics
- type PDBInfo
- type ResourceAnalysis
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GetKubernetesClient ¶
func GetKubernetesClient() (kubernetes.Interface, error)
GetKubernetesClient creates a Kubernetes client using default resolution ($KUBECONFIG → ~/.kube/config → in-cluster). For an explicit path and diagnostics, use BuildKubeClient.
func ProbeConnection ¶ added in v0.7.0
func ProbeConnection(ctx context.Context, client kubernetes.Interface) error
ProbeConnection verifies the Kubernetes API is actually reachable (and basic list RBAC is present) with a bounded timeout, so callers can report an unreachable cluster up-front rather than silently degrading every check.
Types ¶
type Decision ¶
type Decision string
Decision represents the overall decision on whether to proceed with an update.
type HealthChecker ¶
type HealthChecker struct {
// contains filtered or unexported fields
}
HealthChecker performs various health checks on the EKS cluster
func NewChecker ¶
func NewChecker(eksClient *eks.Client, k8sClient kubernetes.Interface, cwClient *cloudwatch.Client, asgClient *autoscaling.Client) *HealthChecker
NewChecker creates a new health checker instance
func (*HealthChecker) CheckClusterCapacity ¶
func (hc *HealthChecker) CheckClusterCapacity(ctx context.Context, clusterName string) HealthResult
CheckClusterCapacity validates that the cluster has sufficient capacity for rolling updates
func (*HealthChecker) CheckCriticalWorkloads ¶
func (hc *HealthChecker) CheckCriticalWorkloads(ctx context.Context) HealthResult
CheckCriticalWorkloads validates that critical system workloads are running
func (*HealthChecker) CheckNodeHealth ¶
func (hc *HealthChecker) CheckNodeHealth(ctx context.Context, clusterName string) HealthResult
CheckNodeHealth validates that all nodes in the cluster are ready
func (*HealthChecker) CheckPodDisruptionBudgets ¶
func (hc *HealthChecker) CheckPodDisruptionBudgets(ctx context.Context) HealthResult
CheckPodDisruptionBudgets validates PDB configuration for user workloads
func (*HealthChecker) CheckResourceBalance ¶
func (hc *HealthChecker) CheckResourceBalance(ctx context.Context, clusterName string) HealthResult
CheckResourceBalance validates resource distribution and utilization patterns
func (*HealthChecker) ListPodDisruptionBudgets ¶ added in v0.7.0
func (hc *HealthChecker) ListPodDisruptionBudgets(ctx context.Context) ([]PDBInfo, error)
ListPodDisruptionBudgets returns a structured snapshot of every PDB in user namespaces with its current disruption status. Returns (nil, nil) when no Kubernetes client is configured so callers can degrade gracefully. (REF-4)
func (*HealthChecker) RunAllChecks ¶
func (hc *HealthChecker) RunAllChecks(ctx context.Context, clusterName string) HealthSummary
RunAllChecks executes all health checks and returns a summary. The checks are independent, so they run concurrently; capacity and balance share one instance-discovery + CloudWatch fetch via a lazy snapshot.
type HealthResult ¶
type HealthResult struct {
Name string `json:"name"`
Status HealthStatus `json:"status"`
Score int `json:"score"` // 0-100
Message string `json:"message"`
Details []string `json:"details,omitempty"`
IsBlocking bool `json:"isBlocking"`
// Skipped marks a check that could not be evaluated (e.g. no Kubernetes
// client) rather than measured. Skipped checks are excluded from the
// OverallScore so a missing prerequisite doesn't silently drag the score.
Skipped bool `json:"skipped,omitempty"`
}
HealthResult represents the result of a single health check
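To illustrate why Skipped matters, here is a minimal sketch of an aggregate score that ignores skipped checks. The documentation says only that skipped checks are excluded from OverallScore; the plain average used here, the result type, and the overallScore function are assumptions made for this example.

```go
package main

import "fmt"

// result mirrors only the HealthResult fields this sketch needs.
type result struct {
	Name    string
	Score   int // 0-100
	Skipped bool
}

// overallScore averages the scores of evaluated checks and ignores skipped
// ones. Averaging is an assumption made for illustration; the documentation
// states only that skipped checks are excluded.
func overallScore(results []result) int {
	sum, n := 0, 0
	for _, r := range results {
		if r.Skipped {
			continue
		}
		sum += r.Score
		n++
	}
	if n == 0 {
		return 0 // nothing was measured; returning 0 is an arbitrary choice here
	}
	return sum / n
}

func main() {
	results := []result{
		{Name: "node-health", Score: 90},
		{Name: "capacity", Score: 70},
		{Name: "pdb", Skipped: true}, // e.g. no Kubernetes client available
	}
	fmt.Println(overallScore(results)) // prints 80
}
```

Under the same averaging assumption, counting the skipped check as a zero would give 53 instead of 80, which is the silent drag the Skipped field is meant to prevent.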
type HealthStatus ¶
type HealthStatus string
HealthStatus represents the status of a health check
const (
StatusPass HealthStatus = "PASS"
StatusWarn HealthStatus = "WARN"
StatusFail HealthStatus = "FAIL"
)
type HealthSummary ¶
type HealthSummary struct {
Results []HealthResult `json:"results"`
OverallScore int `json:"overallScore"`
Decision Decision `json:"decision"`
Warnings []string `json:"warnings,omitempty"`
Errors []string `json:"errors,omitempty"`
}
HealthSummary represents the overall health check results
type KubeDiag ¶ added in v0.7.0
type KubeDiag struct {
Source string // "--kubeconfig", "KUBECONFIG", "default", "in-cluster", "none"
Path string
Context string
}
KubeDiag describes how the Kubernetes client was (or would be) resolved, so callers can emit an actionable message when the API can't be reached.
func BuildKubeClient ¶ added in v0.7.0
func BuildKubeClient(kubeconfigPath string) (kubernetes.Interface, KubeDiag, error)
BuildKubeClient builds a Kubernetes client, preferring an explicit kubeconfig path, then $KUBECONFIG, then ~/.kube/config, then in-cluster config. It returns a KubeDiag describing what was tried (for diagnostics) alongside the client. An explicit --kubeconfig path that doesn't exist is a hard error.
type NodeMetrics ¶
NodeMetrics represents resource metrics for a single node
type PDBInfo ¶ added in v0.7.0
type PDBInfo struct {
Namespace string `json:"namespace" yaml:"namespace"`
Name string `json:"name" yaml:"name"`
DisruptionsAllowed int32 `json:"disruptionsAllowed" yaml:"disruptionsAllowed"`
CurrentHealthy int32 `json:"currentHealthy" yaml:"currentHealthy"`
DesiredHealthy int32 `json:"desiredHealthy" yaml:"desiredHealthy"`
ExpectedPods int32 `json:"expectedPods" yaml:"expectedPods"`
}
PDBInfo is a structured snapshot of one PodDisruptionBudget's disruption status, used by `nodegroup scale --dry-run` to show which PDBs would constrain a scale-down. (REF-4)
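As a rough illustration of how such a snapshot could be consumed, the sketch below picks out the PDBs that currently allow no disruptions. Kubernetes refuses evictions for pods covered by a PDB whose allowed disruptions are zero, but whether the dry-run output uses exactly this rule is an assumption; pdbInfo and blocking are hypothetical names.

```go
package main

import "fmt"

// pdbInfo mirrors the PDBInfo fields this sketch reads.
type pdbInfo struct {
	Namespace          string
	Name               string
	DisruptionsAllowed int32
}

// blocking returns the PDBs that currently allow no voluntary disruptions.
// A nil input (for example, when no snapshot could be taken) yields nil.
func blocking(pdbs []pdbInfo) []pdbInfo {
	var out []pdbInfo
	for _, p := range pdbs {
		if p.DisruptionsAllowed <= 0 {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pdbs := []pdbInfo{
		{Namespace: "shop", Name: "checkout", DisruptionsAllowed: 0},
		{Namespace: "shop", Name: "catalog", DisruptionsAllowed: 2},
	}
	for _, p := range blocking(pdbs) {
		fmt.Printf("%s/%s allows 0 disruptions\n", p.Namespace, p.Name)
	}
}
```

Because ranging over a nil slice is a no-op in Go, the same function handles the (nil, nil) result of ListPodDisruptionBudgets without a special case.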
type ResourceAnalysis ¶
type ResourceAnalysis struct {
CPUStdDev float64
MemoryStdDev float64
MaxCPU float64
MaxMemory float64
MinCPU float64
MinMemory float64
}
ResourceAnalysis contains analysis of resource distribution. CPUStdDev/MemoryStdDev are the population standard deviation of per-node utilization, in percentage points (a spread measure, not statistical variance).