podautoscaler

package
v0.5.0-rc.2
Published: Nov 1, 2025 License: Apache-2.0 Imports: 47 Imported by: 0

Documentation

Overview

Package podautoscaler provides controllers for managing PodAutoscaler resources. The controller supports three scaling strategies:

- HPA: Creates and manages Kubernetes HorizontalPodAutoscaler resources (KEDA-like wrapper)
- KPA: Knative-style Pod Autoscaling with panic/stable windows
- APA: Application-specific Pod Autoscaling with custom metrics

Architecture:

- Stateless autoscaler management: AutoScalers are created on-demand for each reconciliation
- HPA wrapper: for the HPA strategy, the controller creates and manages actual K8s HPA resources
- Custom scaling: for KPA/APA, the controller directly computes and applies scaling decisions
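
The split can be pictured as a dispatch on the PodAutoscaler's scaling strategy: the HPA path delegates to a managed HorizontalPodAutoscaler, while the KPA/APA path combines an AutoScaler recommendation with a WorkloadScale update (both types are documented below). The following is a conceptual sketch only; it assumes it sits inside this package and treats the strategy as a plain string, which may not match the actual spec field.

// Conceptual sketch only: how the three strategies map onto AutoScaler and
// WorkloadScale. The plain-string strategy parameter is an assumption.
func applyStrategy(ctx context.Context, strategy string, scaler AutoScaler,
	workload WorkloadScale, req ReplicaComputeRequest) error {
	switch strategy {
	case "HPA":
		// The controller creates/updates a real HorizontalPodAutoscaler;
		// Kubernetes then performs the scaling itself (not shown here).
		return nil
	case "KPA", "APA":
		// Compute a recommendation and apply it directly to the target.
		res, err := scaler.ComputeDesiredReplicas(ctx, req)
		if err != nil || !res.Valid {
			return err
		}
		return workload.SetDesiredReplicas(ctx, &req.PodAutoscaler, res.DesiredReplicas)
	default:
		return fmt.Errorf("invalid scaling strategy %q", strategy)
	}
}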

Index

Constants

const (
	RayClusterFleet = "RayClusterFleet"
	StormService    = "StormService"
)
const (
	ConditionReady         = "Ready"
	ConditionValidSpec     = "ValidSpec"
	ConditionConflict      = "MultiPodAutoscalerConflict"
	ConditionScalingActive = "ScalingActive"
	ConditionAbleToScale   = "AbleToScale"

	ReasonAsExpected             = "AsExpected"
	ReasonReconcilingScaleDiff   = "ReconcilingScaleDiff"
	ReasonStable                 = "Stable"
	ReasonInvalidScalingStrategy = "InvalidScalingStrategy"
	ReasonInvalidBounds          = "InvalidBounds"
	ReasonMissingTargetRef       = "MissingScaleTargetRef"
	ReasonMetricsConfigError     = "MetricsConfigError"
	ReasonInvalidSpec            = "InvalidSpec"
	ReasonConfigured             = "Configured"
)
const AutoscalingStormServiceModeAnnotationKey = "autoscaling.aibrix.ai/storm-service-mode"

Variables

var (
	DefaultResyncInterval           = 10 * time.Second
	DefaultRequeueDuration          = 10 * time.Second
	DefaultReconcileTimeoutDuration = 10 * time.Second
)

Functions

func Add

func Add(mgr manager.Manager, runtimeConfig config.RuntimeConfig) error

Add creates a new PodAutoscaler Controller and adds it to the Manager with default RBAC. The Manager will set fields on the Controller and Start it when the Manager is Started.
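
A minimal wiring sketch, assuming a standard controller-runtime manager (imported under the conventional ctrl alias); the zero-value RuntimeConfig and the error handling are placeholders, not recommendations:

// Sketch: registering the controller with a controller-runtime manager.
// config.RuntimeConfig{} is a placeholder; populate it as your deployment requires.
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
if err != nil {
	panic(err)
}
if err := podautoscaler.Add(mgr, config.RuntimeConfig{}); err != nil {
	panic(err)
}
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
	panic(err)
}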

func GetReadyPodsCount

func GetReadyPodsCount(ctx context.Context, client client.Client, namespace string, selector labels.Selector) (int, error)

GetReadyPodsCount counts the number of ready pods matching the given selector.
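
A hedged usage sketch; the namespace, label set, and client variable are placeholders:

// Count ready pods for an example selector.
selector := labels.SelectorFromSet(labels.Set{"app": "example"})
ready, err := podautoscaler.GetReadyPodsCount(ctx, k8sClient, "default", selector)
if err != nil {
	return err
}
fmt.Printf("ready pods: %d\n", ready)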

Types

type AutoScaler

type AutoScaler interface {
	// ComputeDesiredReplicas performs metric-based scaling calculation.
	// This is the primary method for scaling decisions and returns only the recommendation.
	// It does NOT perform any actual scaling operations.
	// All per-PA configuration is extracted from the PodAutoscaler spec on each call.
	ComputeDesiredReplicas(ctx context.Context, request ReplicaComputeRequest) (*ReplicaComputeResult, error)
}

AutoScaler provides scaling decision capabilities based on metrics and algorithms. This interface focuses purely on scaling logic without actual resource manipulation. All implementations are stateless and thread-safe, supporting concurrent reconciliation.
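
Because the interface has a single method, a custom implementation only needs to supply its own recommendation logic. A minimal stub for illustration (the constant recommendation is not a real strategy):

// fixedScaler is an illustrative AutoScaler that always recommends the same count.
type fixedScaler struct{ replicas int32 }

func (f fixedScaler) ComputeDesiredReplicas(ctx context.Context, req ReplicaComputeRequest) (*ReplicaComputeResult, error) {
	return &ReplicaComputeResult{
		DesiredReplicas: f.replicas,
		Algorithm:       "fixed",
		Reason:          "constant recommendation for testing",
		Valid:           true,
	}, nil
}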

type DefaultAutoScaler

type DefaultAutoScaler struct {
	// contains filtered or unexported fields
}

DefaultAutoScaler implements the complete scaling pipeline. All components are stateless or thread-safe, allowing concurrent reconciliation.

func NewDefaultAutoScaler

func NewDefaultAutoScaler(
	factory metrics.MetricFetcherFactory,
	client client.Client,
) *DefaultAutoScaler

NewDefaultAutoScaler creates a new default autoscaler.

func (*DefaultAutoScaler) ComputeDesiredReplicas

func (a *DefaultAutoScaler) ComputeDesiredReplicas(ctx context.Context, request ReplicaComputeRequest) (*ReplicaComputeResult, error)

ComputeDesiredReplicas computes desired replicas based on all metrics in MetricsSources. It returns the maximum recommended replicas across all valid metrics.

type PodAutoscalerReconciler

type PodAutoscalerReconciler struct {
	client.Client
	Scheme        *runtime.Scheme
	EventRecorder record.EventRecorder
	Mapper        apimeta.RESTMapper

	RuntimeConfig config.RuntimeConfig
	// contains filtered or unexported fields
}

PodAutoscalerReconciler reconciles a PodAutoscaler object. It uses stateless autoscaler management where AutoScalers are created on-demand for each reconciliation cycle, avoiding memory leaks and stale state issues.

func (*PodAutoscalerReconciler) Reconcile

func (r *PodAutoscalerReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error)

func (*PodAutoscalerReconciler) Run

func (r *PodAutoscalerReconciler) Run(ctx context.Context, errChan chan<- error)

type ReplicaComputeRequest

type ReplicaComputeRequest struct {
	PodAutoscaler   autoscalingv1alpha1.PodAutoscaler
	ScalingContext  scalingctx.ScalingContext // Single source of truth for PA-level configuration
	CurrentReplicas int32
	Pods            []corev1.Pod
	Timestamp       time.Time
}

ReplicaComputeRequest represents a request for replica calculation. This type is used both for the public interface and internal pipeline processing.
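
A hedged end-to-end sketch: build a request from observed state and ask a DefaultAutoScaler for a recommendation. How the metric fetcher factory, scaling context, and pod list are obtained is not shown, and applyScale is a hypothetical helper, not part of this package:

// Sketch: computing a recommendation for one reconciliation pass.
scaler := NewDefaultAutoScaler(metricFetcherFactory, k8sClient)
req := ReplicaComputeRequest{
	PodAutoscaler:   pa,
	ScalingContext:  scalingContext, // PA-level configuration, built elsewhere
	CurrentReplicas: currentReplicas,
	Pods:            readyPods,
	Timestamp:       time.Now(),
}
res, err := scaler.ComputeDesiredReplicas(ctx, req)
if err != nil {
	return err
}
if res.Valid {
	// DesiredReplicas is the maximum recommendation across all configured metrics.
	return applyScale(ctx, res.DesiredReplicas) // hypothetical helper
}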

type ReplicaComputeResult

type ReplicaComputeResult struct {
	DesiredReplicas int32
	Algorithm       string
	Reason          string
	Valid           bool
}

ReplicaComputeResult represents the result of a replica calculation.

type ScaleDecision

type ScaleDecision struct {
	DesiredReplicas int32
	ShouldScale     bool
	Reason          string
	Algorithm       string
}

ScaleDecision represents a scaling decision made by the autoscaler.

type ScalingTargetKey

type ScalingTargetKey struct {
	Namespace     string
	APIVersion    string
	Kind          string
	Name          string
	SubTargetRole string // from SubTargetSelector.RoleName
}

type ValidationResult

type ValidationResult struct {
	Valid   bool
	Reason  string
	Message string
}

type WorkloadScale

type WorkloadScale interface {
	// Validate checks if the target is valid and scalable
	Validate(ctx context.Context, pa *autoscalingv1alpha1.PodAutoscaler) error

	// GetCurrentReplicasFromScale extracts the current replica count from a scale object
	GetCurrentReplicasFromScale(ctx context.Context, pa *autoscalingv1alpha1.PodAutoscaler, scale *unstructured.Unstructured) (int32, error)

	// SetDesiredReplicas updates the replica count
	SetDesiredReplicas(ctx context.Context, pa *autoscalingv1alpha1.PodAutoscaler, replicas int32) error

	// GetPodSelectorFromScale extracts the label selector from an existing scale object
	// For role-level scaling, it adds the role label requirement
	// This avoids re-fetching the scale object when the controller already has it
	GetPodSelectorFromScale(ctx context.Context, pa *autoscalingv1alpha1.PodAutoscaler, scale *unstructured.Unstructured) (labels.Selector, error)
}

WorkloadScale provides scaling operations for different workload types. It provides the mechanism to get/set replica counts on workload resources, while AutoScaler provides the intelligence to compute desired replica counts. The interface is stateless - all methods take PodAutoscaler as a parameter.

func NewWorkloadScale

func NewWorkloadScale(
	client client.Client,
	restMapper meta.RESTMapper,
) WorkloadScale

NewWorkloadScale creates a stateless WorkloadScale implementation.
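
A hedged usage sketch combining the constructor with the interface methods above; fetching the scale object and choosing the desired count are not shown:

// Sketch: validate the target, read its current replicas, and apply a new count.
ws := NewWorkloadScale(k8sClient, restMapper)
if err := ws.Validate(ctx, &pa); err != nil {
	return err
}
current, err := ws.GetCurrentReplicasFromScale(ctx, &pa, scaleObj) // scaleObj fetched elsewhere
if err != nil {
	return err
}
if current != desired {
	return ws.SetDesiredReplicas(ctx, &pa, desired)
}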

Directories

Path Synopsis
algorithm	Package algorithm provides scaling algorithms for different autoscaling strategies.
