nodehealth

package
v0.7.0 Latest
Published: Apr 14, 2026 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Constants

const DefaultGracePeriod = 5 * time.Minute

Variables

This section is empty.

Functions

This section is empty.

Types

type Controller

type Controller struct {
	// contains filtered or unexported fields
}

Controller watches Node entities and marks sandboxes DEAD when their runner has been non-READY for longer than the grace period. This handles the case where a runner dies permanently (node failure, crash without recovery) and its sandboxes need to be cleaned up so the pool controller can create replacements.

The grace period exists because containers survive miren process restarts. A runner that's just restarting (upgrade, deploy) will come back and re-adopt its containers via reconcileSandboxesOnBoot(). We only want to intervene when the runner is truly gone.

Nodes with DISABLED status are intentionally excluded. DISABLED is set by Drain() during graceful shutdown, which handles its own sandbox cleanup. We don't want to race with that.

Implements controller.ReconcileControllerI[*compute_v1alpha.Node]

Implements controller.DeletingReconcileController

func (*Controller) Delete

func (c *Controller) Delete(ctx context.Context, id entity.Id) error

Delete cleans up tracking state when a node entity is removed (e.g., via "miren runner remove").

func (*Controller) Init

func (c *Controller) Init(ctx context.Context) error

func (*Controller) Reconcile

func (c *Controller) Reconcile(ctx context.Context, node *compute_v1alpha.Node, meta *entity.Meta) error
