Documentation
Overview
Package podautoscaler provides controllers for managing PodAutoscaler resources. The controller supports three scaling strategies:

- HPA: Creates and manages Kubernetes HorizontalPodAutoscaler resources (KEDA-like wrapper)
- KPA: Knative-style Pod Autoscaling with panic/stable windows
- APA: Application-specific Pod Autoscaling with custom metrics
Architecture:

- Stateless autoscaler management: AutoScalers are created on-demand for each reconciliation
- HPA wrapper: For the HPA strategy, we create and manage actual K8s HPA resources
- Custom scaling: For KPA/APA, we directly compute and apply scaling decisions
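The strategy split above can be sketched as a simple dispatch. This is an illustrative, self-contained sketch, not the controller's actual code; the strategy names and return strings are assumptions made for the example.

```go
package main

import "fmt"

// ScalingStrategy mirrors the three strategies described above.
// The values are illustrative; the real constants live in the aibrix API types.
type ScalingStrategy string

const (
	HPA ScalingStrategy = "HPA" // wrap a native HorizontalPodAutoscaler
	KPA ScalingStrategy = "KPA" // Knative-style panic/stable windows
	APA ScalingStrategy = "APA" // application-specific custom metrics
)

// reconcileByStrategy sketches the dispatch the controller performs:
// HPA delegates to a managed K8s HPA object, while KPA/APA compute
// and apply scaling decisions directly.
func reconcileByStrategy(s ScalingStrategy) string {
	switch s {
	case HPA:
		return "ensure managed HorizontalPodAutoscaler exists"
	case KPA, APA:
		return "compute desired replicas and apply directly"
	default:
		return "invalid scaling strategy"
	}
}

func main() {
	fmt.Println(reconcileByStrategy(HPA))
	fmt.Println(reconcileByStrategy(KPA))
}
```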
Index
- Constants
- Variables
- func Add(mgr manager.Manager, runtimeConfig config.RuntimeConfig) error
- func GetReadyPodsCount(ctx context.Context, client client.Client, namespace string, ...) (int, error)
- type AutoScaler
- type DefaultAutoScaler
- type PodAutoscalerReconciler
- type ReplicaComputeRequest
- type ReplicaComputeResult
- type ScaleDecision
- type ScalingTargetKey
- type ValidationResult
- type WorkloadScale
Constants
const (
	RayClusterFleet = "RayClusterFleet"
	StormService    = "StormService"
)
const (
	ConditionReady         = "Ready"
	ConditionValidSpec     = "ValidSpec"
	ConditionConflict      = "MultiPodAutoscalerConflict"
	ConditionScalingActive = "ScalingActive"
	ConditionAbleToScale   = "AbleToScale"

	ReasonAsExpected             = "AsExpected"
	ReasonReconcilingScaleDiff   = "ReconcilingScaleDiff"
	ReasonStable                 = "Stable"
	ReasonInvalidScalingStrategy = "InvalidScalingStrategy"
	ReasonInvalidBounds          = "InvalidBounds"
	ReasonMissingTargetRef       = "MissingScaleTargetRef"
	ReasonMetricsConfigError     = "MetricsConfigError"
	ReasonInvalidSpec            = "InvalidSpec"
	ReasonConfigured             = "Configured"
)
const AutoscalingStormServiceModeAnnotationKey = "autoscaling.aibrix.ai/storm-service-mode"
Variables

Functions

Types

type AutoScaler
type AutoScaler interface {
// ComputeDesiredReplicas performs metric-based scaling calculation.
// This is the primary method for scaling decisions and returns only the recommendation.
// It does NOT perform any actual scaling operations.
// All per-PA configuration is extracted from the PodAutoscaler spec on each call.
ComputeDesiredReplicas(ctx context.Context, request ReplicaComputeRequest) (*ReplicaComputeResult, error)
}
AutoScaler provides scaling decision capabilities based on metrics and algorithms. This interface focuses purely on scaling logic without actual resource manipulation. All implementations are stateless and thread-safe, supporting concurrent reconciliation.
type DefaultAutoScaler
type DefaultAutoScaler struct {
// contains filtered or unexported fields
}
DefaultAutoScaler implements the complete scaling pipeline. All components are stateless or thread-safe, allowing concurrent reconciliation.
func NewDefaultAutoScaler
func NewDefaultAutoScaler(
	factory metrics.MetricFetcherFactory,
	client client.Client,
) *DefaultAutoScaler
NewDefaultAutoScaler creates a new default autoscaler.
func (*DefaultAutoScaler) ComputeDesiredReplicas
func (a *DefaultAutoScaler) ComputeDesiredReplicas(ctx context.Context, request ReplicaComputeRequest) (*ReplicaComputeResult, error)
ComputeDesiredReplicas computes desired replicas based on all metrics in MetricsSources. It returns the maximum recommended replicas across all valid metrics.
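The "maximum across all valid metrics" aggregation can be isolated as a small pure function. This is an illustrative sketch with a simplified local result type, not the package's actual implementation.

```go
package main

import "fmt"

// result is a simplified stand-in for ReplicaComputeResult, keeping
// only the fields needed to show the aggregation rule.
type result struct {
	DesiredReplicas int32
	Valid           bool
}

// maxValidReplicas mirrors the documented behavior: return the maximum
// recommendation across all valid metric results. Invalid results are
// skipped; ok is false when no metric produced a valid recommendation.
func maxValidReplicas(results []result) (max int32, ok bool) {
	for _, r := range results {
		if !r.Valid {
			continue
		}
		if !ok || r.DesiredReplicas > max {
			max = r.DesiredReplicas
			ok = true
		}
	}
	return max, ok
}

func main() {
	got, ok := maxValidReplicas([]result{
		{DesiredReplicas: 3, Valid: true},
		{DesiredReplicas: 7, Valid: true},
		{DesiredReplicas: 9, Valid: false}, // invalid metrics are ignored
	})
	fmt.Println(got, ok)
}
```

Taking the maximum is the conservative choice for scale-out: if any one metric indicates more capacity is needed, the workload gets it.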
type PodAutoscalerReconciler
type PodAutoscalerReconciler struct {
client.Client
Scheme *runtime.Scheme
EventRecorder record.EventRecorder
Mapper apimeta.RESTMapper
RuntimeConfig config.RuntimeConfig
// contains filtered or unexported fields
}
PodAutoscalerReconciler reconciles a PodAutoscaler object. It uses stateless autoscaler management where AutoScalers are created on-demand for each reconciliation cycle, avoiding memory leaks and stale state issues.
type ReplicaComputeRequest
type ReplicaComputeRequest struct {
PodAutoscaler autoscalingv1alpha1.PodAutoscaler
ScalingContext scalingctx.ScalingContext // Single source of truth for PA-level configuration
CurrentReplicas int32
Pods []corev1.Pod
Timestamp time.Time
}
ReplicaComputeRequest represents a request for replica calculation. This type is used both for the public interface and internal pipeline processing.
type ReplicaComputeResult
type ReplicaComputeResult struct {
DesiredReplicas int32
Algorithm string
Reason string
Valid bool
}
ReplicaComputeResult represents the result of replica calculation.
type ScaleDecision
ScaleDecision represents a scaling decision made by the autoscaler.
type ScalingTargetKey

type ValidationResult

type WorkloadScale
type WorkloadScale interface {
// Validate checks if the target is valid and scalable
Validate(ctx context.Context, pa *autoscalingv1alpha1.PodAutoscaler) error
// GetCurrentReplicasFromScale extracts the current replica count from a scale object
GetCurrentReplicasFromScale(ctx context.Context, pa *autoscalingv1alpha1.PodAutoscaler, scale *unstructured.Unstructured) (int32, error)
// SetDesiredReplicas updates the replica count
SetDesiredReplicas(ctx context.Context, pa *autoscalingv1alpha1.PodAutoscaler, replicas int32) error
// GetPodSelectorFromScale extracts the label selector from an existing scale object
// For role-level scaling, it adds the role label requirement
// This avoids re-fetching the scale object when the controller already has it
GetPodSelectorFromScale(ctx context.Context, pa *autoscalingv1alpha1.PodAutoscaler, scale *unstructured.Unstructured) (labels.Selector, error)
}
WorkloadScale provides scaling operations for different workload types. It provides the mechanism to get/set replica counts on workload resources, while AutoScaler provides the intelligence to compute desired replica counts. The interface is stateless - all methods take PodAutoscaler as a parameter.
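The compute/apply split between AutoScaler (intelligence) and WorkloadScale (mechanism) can be shown with two tiny interfaces. These local interfaces and the `reconcileOnce` helper are simplified stand-ins invented for the example; the real interfaces take a context and the PodAutoscaler object.

```go
package main

import "fmt"

// computer stands in for the AutoScaler side: decide how many replicas.
type computer interface{ Compute(current int32) int32 }

// scaler stands in for the WorkloadScale side: apply the decision to
// the workload's scale subresource.
type scaler interface{ SetDesiredReplicas(n int32) error }

// doubler is a toy scaling algorithm used only for illustration.
type doubler struct{}

func (doubler) Compute(current int32) int32 { return current * 2 }

// fakeScale records the applied value instead of patching a real
// workload, standing in for a WorkloadScale implementation.
type fakeScale struct{ applied int32 }

func (f *fakeScale) SetDesiredReplicas(n int32) error {
	f.applied = n
	return nil
}

// reconcileOnce shows the division of labor: the AutoScaler-like
// component decides, the WorkloadScale-like component applies.
func reconcileOnce(c computer, s scaler, current int32) error {
	return s.SetDesiredReplicas(c.Compute(current))
}

func main() {
	fs := &fakeScale{}
	_ = reconcileOnce(doubler{}, fs, 4)
	fmt.Println(fs.applied)
}
```

Keeping both sides stateless, as the package documentation describes, means each reconciliation cycle can construct them fresh without caching or cleanup.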
func NewWorkloadScale
func NewWorkloadScale(
	client client.Client,
	restMapper meta.RESTMapper,
) WorkloadScale
NewWorkloadScale creates a stateless WorkloadScale implementation.