Documentation
¶
Overview ¶
Package schedule implements the per-pool streaming claim scheduler.
Streaming model (vs. the retired batch flight):
A persistent ready-queue holds idle pods observed via informer cache. notifyIdle() triggers an incremental refresh that appends newly-idle pods in arrival order; no global sort is performed on the hot path.
ClaimRequest arrivals pop one pod and spawn a goroutine that executes inplaceupdate.TriggerUpdateWithOptions. The scheduler goroutine never blocks on the apiserver round-trip, so new requests are handled as fast as they arrive.
A TTL-bounded reservation map (see reservations.go) prevents a pod from being re-selected while its phase transition is still propagating from apiserver → informer cache. This is the main defence against the ErrUnexpectedPodPhase conflict observed under high QPS in the batch implementation.
See /home/ylli/.claude/plans/spicy-pondering-kahn.md for the design note.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type ClaimOptions ¶
type ClaimOptions struct {
ContainerImages map[string]string
Labels map[string]string
Annotations map[string]string
// TargetPodPhase defaults to SandboxPhaseRunning when empty.
TargetPodPhase string
}
ClaimOptions carries the per-request options applied to the target pod when the CAS succeeds. Fields map 1:1 onto inplaceupdate.UpdateOptions.
type ClaimRequest ¶
type ClaimRequest struct {
Ctx context.Context
Opts ClaimOptions
Deadline time.Time
ResultCh chan<- ClaimResult
// EnqueuedAt is set by Enqueue() and read by doCAS to record the
// end-to-end dispatch latency via ScheduleDispatchLatencySeconds. Callers
// may leave it zero; Enqueue will populate it if so.
EnqueuedAt time.Time
}
ClaimRequest is one pending claim awaiting an idle pod.
type ClaimResult ¶
ClaimResult is the outcome of a single Enqueue/claim attempt.
type PoolScheduler ¶
type PoolScheduler struct {
// contains filtered or unexported fields
}
PoolScheduler is a per-pool background goroutine that streams claim requests to idle pods as they become available.
func NewPoolScheduler ¶
func NewPoolScheduler(ns, name, team, user, env string, k8sClient client.Client) *PoolScheduler
NewPoolScheduler allocates a scheduler without starting its goroutine. team/user/env identify the owning pool for metrics labelling — env is the owning SandboxEnv name read from the Pool's LabelEnv (empty string for legacy Pools that pre-date Env adoption). k8sClient may be nil; in that case the background status writer that mirrors the in-memory queue length onto SandboxPool.Status.PendingRequests is skipped (useful for unit tests that exercise only the in-process dispatch path).
func (*PoolScheduler) Enqueue ¶
func (s *PoolScheduler) Enqueue(req *ClaimRequest) bool
Enqueue hands req to the scheduler. Returns false (and does NOT write to ResultCh) when the internal channel is full; the caller must translate that into backpressure.
func (*PoolScheduler) NotifyIdle ¶
func (s *PoolScheduler) NotifyIdle()
NotifyIdle is called when a pod transitions Stopping → Idle for this pool. It schedules a delayed wake of the scheduler so the informer cache has time to reflect the apiserver write before refreshReady() runs its List call. The delay (idleNotifyDelay ≈ 200 ms) covers P99 informer propagation latency; without it refreshReady() fires against a stale cache that still reports the pod as Stopping and misses it, causing the scheduler to fall back to the 10-second pollTimer. Safe to call from any goroutine.
func (*PoolScheduler) Run ¶
func (s *PoolScheduler) Run(ctx context.Context)
Run is the scheduler goroutine; call it exactly once per scheduler.
func (*PoolScheduler) Shutdown ¶
func (s *PoolScheduler) Shutdown()
Shutdown stops the scheduler goroutine and blocks until it has drained pending requests.
func (*PoolScheduler) Snapshot ¶ added in v0.0.5
func (s *PoolScheduler) Snapshot() Snapshot
Snapshot returns a point-in-time view of the scheduler's internal counters. See the Snapshot doc comment for field semantics. Safe to call from any goroutine; cost is O(1) atomic / channel-len reads.
type Snapshot ¶ added in v0.0.5
type Snapshot struct {
// IdleReady is the number of idle Pods currently admitted to the ready
// queue and not reserved — what would be popped if a request arrived now.
IdleReady int
// QueueLen is the total number of ClaimRequests this scheduler is
// holding that have not yet reached a terminal result. Counts both
// requests still in the producer→consumer channel AND requests
// parked inside the scheduler goroutine waiting for an idle pod.
// Sourced from an atomic counter so the value stays stable while a
// request transiently moves between buffers (the autoscaler relies
// on this for its reactive demand signal).
QueueLen int
// ReservedCount is the number of pods currently reserved (handed off to
// a CAS goroutine but not yet observed back as Idle/Starting through the
// informer cache).
ReservedCount int
// InflightCAS is the number of TriggerUpdateWithOptions goroutines
// currently running. Useful for diagnosing apiserver back-pressure.
InflightCAS int
// LastDispatchAt is the wall-clock time of the most recent successful
// dispatch (doCAS OK branch). Zero when no dispatch has ever succeeded.
LastDispatchAt time.Time
}
Snapshot is a point-in-time view of a PoolScheduler's internal counters. All fields are read with atomic / channel-len semantics; calling Snapshot is safe from any goroutine and does not interact with the apiserver.
EnvScheduler reads Snapshots on the request hot path to rank candidate member pools, so the cost must stay in the tens-of-nanoseconds range.