runner

package
v0.5.0 Latest
Warning

This package is not in the latest version of its module.

Published: May 14, 2026 License: Apache-2.0 Imports: 32 Imported by: 0

Documentation


Constants

const ContainerIDMinimumParts = 2

ContainerIDMinimumParts is the minimum number of parts needed in a container ID to extract the network name. Container ID format: realm-space-cell-container. At least the realm and space parts are needed to form the network name: realm-space.
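The parsing rule described above can be sketched as follows. This is only an illustration: networkNameFromContainerID is a hypothetical helper, not part of the package, and it assumes that realm and space names contain no hyphens themselves.

```go
package main

import (
	"fmt"
	"strings"
)

// containerIDMinimumParts mirrors the documented constant.
const containerIDMinimumParts = 2

// networkNameFromContainerID is a hypothetical helper. For an ID shaped like
// "realm-space-cell-container" it returns "realm-space".
// Assumption: realm and space names do not contain "-".
func networkNameFromContainerID(id string) (string, error) {
	parts := strings.Split(id, "-")
	if len(parts) < containerIDMinimumParts || parts[0] == "" || parts[1] == "" {
		return "", fmt.Errorf("cannot derive network name from container ID %q", id)
	}
	return parts[0] + "-" + parts[1], nil
}

func main() {
	name, err := networkNameFromContainerID("default-web-api-nginx")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name) // prints "default-web"
}
```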

Variables

This section is empty.

Functions

This section is empty.

Types

type Exec

type Exec struct {
	// contains filtered or unexported fields
}

func (*Exec) BootstrapCNI

func (r *Exec) BootstrapCNI(cfgDir, cacheDir, binDir string) (cni.BootstrapReport, error)

func (*Exec) Close

func (r *Exec) Close() error

func (*Exec) CreateCell

func (r *Exec) CreateCell(cell intmodel.Cell) (intmodel.Cell, error)

func (*Exec) CreateContainer

func (r *Exec) CreateContainer(cell intmodel.Cell, container intmodel.ContainerSpec) (intmodel.Cell, error)

CreateContainer creates a container in an existing cell by merging the container spec into the cell's containers list. The cell must already exist.

Inheritance of spec.defaults.container from the parent Space happens in ensureCellContainers, which is the single choke point every creation and update path traverses — so the post-merge (effective) configuration is what gets persisted and what `kuke get container -o yaml` displays.

func (*Exec) CreateRealm

func (r *Exec) CreateRealm(realm intmodel.Realm) (intmodel.Realm, error)

func (*Exec) CreateSpace

func (r *Exec) CreateSpace(space intmodel.Space) (intmodel.Space, error)

func (*Exec) CreateStack

func (r *Exec) CreateStack(stack intmodel.Stack) (intmodel.Stack, error)

func (*Exec) DeleteCell

func (r *Exec) DeleteCell(cell intmodel.Cell) error

func (*Exec) DeleteContainer

func (r *Exec) DeleteContainer(cell intmodel.Cell, containerID string) error

DeleteContainer stops and deletes a specific container in a cell from containerd.

func (*Exec) DeleteImage added in v0.3.0

func (r *Exec) DeleteImage(namespace, ref string) error

DeleteImage removes the named image ref from the given containerd namespace. errdefs.ErrImageNotFound is propagated unchanged so callers can use errors.Is for not-found detection.
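A caller that wants an idempotent delete might use errors.Is as sketched below. The sketch is self-contained, so errImageNotFound is a local stand-in for errdefs.ErrImageNotFound, and fakeRunner and deleteIfPresent are hypothetical names that do not exist in the package.

```go
package main

import (
	"errors"
	"fmt"
)

// errImageNotFound is a local stand-in for errdefs.ErrImageNotFound.
var errImageNotFound = errors.New("image not found")

// imageDeleter is the one Runner method this sketch needs.
type imageDeleter interface {
	DeleteImage(namespace, ref string) error
}

// fakeRunner is a hypothetical test double holding a fixed set of refs.
type fakeRunner struct{ images map[string]bool }

func (f *fakeRunner) DeleteImage(namespace, ref string) error {
	if !f.images[ref] {
		return errImageNotFound // propagated unchanged, as documented
	}
	delete(f.images, ref)
	return nil
}

// deleteIfPresent treats "not found" as success and reports whether an
// image was actually removed.
func deleteIfPresent(r imageDeleter, namespace, ref string) (bool, error) {
	err := r.DeleteImage(namespace, ref)
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, errImageNotFound):
		return false, nil
	default:
		return false, err
	}
}

func main() {
	r := &fakeRunner{images: map[string]bool{"busybox:latest": true}}
	for i := 0; i < 2; i++ {
		removed, err := deleteIfPresent(r, "example.kukeon.io", "busybox:latest")
		fmt.Println(removed, err)
	}
}
```

The same errors.Is check applies to GetImage, which documents the identical not-found behavior.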

func (*Exec) DeleteRealm

func (r *Exec) DeleteRealm(realm intmodel.Realm) error

func (*Exec) DeleteSpace

func (r *Exec) DeleteSpace(space intmodel.Space) error

func (*Exec) DeleteStack

func (r *Exec) DeleteStack(stack intmodel.Stack) error

func (*Exec) EnsureCell

func (r *Exec) EnsureCell(cell intmodel.Cell) (intmodel.Cell, error)

EnsureCell ensures that all required resources for a cell exist. It ensures the cgroup exists, ensures cell containers exist, and updates metadata.

func (*Exec) EnsureContainer

func (r *Exec) EnsureContainer(cell intmodel.Cell, container intmodel.ContainerSpec) (intmodel.Cell, error)

EnsureContainer ensures that a container spec is merged into an existing cell. It merges the container into the cell's Spec.Containers list (avoiding duplicates by ID), ensures containers exist, and updates metadata.
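The merge step can be pictured with the sketch below. containerSpec is a simplified stand-in for intmodel.ContainerSpec, and mergeContainer is a hypothetical helper. Replacing an existing entry that has the same ID is an assumption; the real code may keep the existing entry instead.

```go
package main

import "fmt"

// containerSpec is a simplified stand-in for intmodel.ContainerSpec.
type containerSpec struct {
	ID    string
	Image string
}

// mergeContainer returns a new list with c merged in. An entry with the same
// ID is replaced in place (an assumption); otherwise c is appended.
func mergeContainer(containers []containerSpec, c containerSpec) []containerSpec {
	out := make([]containerSpec, 0, len(containers)+1)
	replaced := false
	for _, existing := range containers {
		if existing.ID == c.ID {
			out = append(out, c)
			replaced = true
			continue
		}
		out = append(out, existing)
	}
	if !replaced {
		out = append(out, c)
	}
	return out
}

func main() {
	list := []containerSpec{{ID: "web", Image: "nginx:1.26"}}
	list = mergeContainer(list, containerSpec{ID: "web", Image: "nginx:1.27"})
	list = mergeContainer(list, containerSpec{ID: "cache", Image: "redis:7"})
	fmt.Println(len(list), list[0].Image, list[1].ID)
}
```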

func (*Exec) EnsureKukeonRootCgroup

func (r *Exec) EnsureKukeonRootCgroup() (bool, bool, error)

EnsureKukeonRootCgroup ensures the kukeon root cgroup (/kukeon) exists at the cgroup hierarchy root. This is the base under which all realms are created. It bypasses buildCgroupPath (which would otherwise nest /kukeon under itself) and creates the cgroup directly at the discovered mountpoint. Returns (existsPre, created, err).

func (*Exec) EnsureRealm

func (r *Exec) EnsureRealm(realm intmodel.Realm) (intmodel.Realm, error)

EnsureRealm ensures that all required resources for a realm exist and reconciles its state. It ensures the containerd namespace and cgroup exist, and transitions the realm from "Creating" to "Ready" state if all resources are present.

func (*Exec) EnsureSpace

func (r *Exec) EnsureSpace(space intmodel.Space) (intmodel.Space, error)

EnsureSpace ensures that all required resources for a space exist. It ensures the CNI config and cgroup exist.

func (*Exec) EnsureStack

func (r *Exec) EnsureStack(stack intmodel.Stack) (intmodel.Stack, error)

EnsureStack ensures that all required resources for a stack exist. It ensures the cgroup exists.

func (*Exec) ExistsCellRootContainer

func (r *Exec) ExistsCellRootContainer(cell intmodel.Cell) (bool, error)

func (*Exec) ExistsCgroup

func (r *Exec) ExistsCgroup(doc any) (bool, error)

func (*Exec) ExistsContainer

func (r *Exec) ExistsContainer(namespace, containerdID string) (bool, error)

ExistsContainer checks if a container exists in containerd by its containerd ID. It ensures the client is connected before making the call.

func (*Exec) ExistsRealmContainerdNamespace

func (r *Exec) ExistsRealmContainerdNamespace(namespace string) (bool, error)

func (*Exec) ExistsSpaceCNIConfig

func (r *Exec) ExistsSpaceCNIConfig(space intmodel.Space) (bool, error)

ExistsSpaceCNIConfig reports whether the CNI config for a space exists. It returns true if the config exists and false otherwise. An error is returned if the space name or realm name is missing, the CNI config does not exist, or the CNI config creation fails.

func (*Exec) ExtractContainersFromCells

func (r *Exec) ExtractContainersFromCells(cells []intmodel.Cell) []intmodel.ContainerSpec

ExtractContainersFromCells extracts all containers from a list of cells. It returns both root containers and regular containers as internal ContainerSpec types.

func (*Exec) GetCell

func (r *Exec) GetCell(cell intmodel.Cell) (intmodel.Cell, error)

func (*Exec) GetContainerState

func (r *Exec) GetContainerState(cell intmodel.Cell, containerID string) (intmodel.ContainerState, error)

GetContainerState queries containerd for the actual task status of a container and converts it to the internal ContainerState.

func (*Exec) GetImage added in v0.3.0

func (r *Exec) GetImage(namespace, ref string) (ctr.ImageInfo, error)

GetImage returns metadata for the named image ref in the given containerd namespace. errdefs.ErrImageNotFound is propagated unchanged so callers can use errors.Is for not-found detection.

func (*Exec) GetRealm

func (r *Exec) GetRealm(realm intmodel.Realm) (intmodel.Realm, error)

func (*Exec) GetSpace

func (r *Exec) GetSpace(space intmodel.Space) (intmodel.Space, error)

func (*Exec) GetStack

func (r *Exec) GetStack(stack intmodel.Stack) (intmodel.Stack, error)

func (*Exec) KillCell

func (r *Exec) KillCell(cell intmodel.Cell) (intmodel.Cell, error)

KillCell immediately force-kills all containers in a cell (workload containers first, then root container). It detaches the root container from the CNI network before killing it.

func (*Exec) KillContainer

func (r *Exec) KillContainer(cell intmodel.Cell, containerID string) error

KillContainer immediately force-kills a specific container in a cell.

func (*Exec) ListCells

func (r *Exec) ListCells(realmName, spaceName, stackName string) ([]intmodel.Cell, error)

func (*Exec) ListContainerdNamespaces added in v0.3.0

func (r *Exec) ListContainerdNamespaces() ([]string, error)

ListContainerdNamespaces returns every containerd namespace visible to the runner's client. A missing containerd socket returns an empty list rather than an error so the uninstall path can degrade gracefully on hosts where containerd was never running (or has already been torn down).
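The degrade-gracefully behavior, together with the `.kukeon.io` suffix filter that the Runner interface comment mentions for the uninstall path, might look roughly like the sketch below. listNamespaces and kukeonNamespaces are hypothetical helpers, and detecting a missing socket with os.Stat is an assumption about how the check could be done, not a description of the real implementation.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"strings"
)

// listNamespaces is a hypothetical wrapper. When the socket path does not
// exist it returns an empty list and no error; otherwise it defers to query.
func listNamespaces(socket string, query func() ([]string, error)) ([]string, error) {
	if _, err := os.Stat(socket); errors.Is(err, fs.ErrNotExist) {
		return []string{}, nil
	}
	return query()
}

// kukeonNamespaces keeps only the names that end in ".kukeon.io".
func kukeonNamespaces(all []string) []string {
	var out []string
	for _, ns := range all {
		if strings.HasSuffix(ns, ".kukeon.io") {
			out = append(out, ns)
		}
	}
	return out
}

func main() {
	query := func() ([]string, error) {
		return []string{"default", "main.kukeon.io", "moby"}, nil
	}
	// A socket path that does not exist yields an empty list, not an error.
	ns, err := listNamespaces("/nonexistent/containerd.sock", query)
	fmt.Println(len(ns), err)

	// os.TempDir() merely stands in for a path that exists.
	ns, err = listNamespaces(os.TempDir(), query)
	fmt.Println(kukeonNamespaces(ns), err)
}
```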

func (*Exec) ListContainers

func (r *Exec) ListContainers(realmName, spaceName, stackName, cellName string) ([]intmodel.ContainerSpec, error)

func (*Exec) ListImages added in v0.3.0

func (r *Exec) ListImages(namespace string) ([]ctr.ImageInfo, error)

ListImages enumerates images in the given containerd namespace. The caller (controller) is responsible for resolving the realm to a namespace and ensuring the realm exists; this method only routes the call onto a connected containerd client.

func (*Exec) ListRealms

func (r *Exec) ListRealms() ([]intmodel.Realm, error)

func (*Exec) ListSpaces

func (r *Exec) ListSpaces(realmName string) ([]intmodel.Space, error)

func (*Exec) ListStacks

func (r *Exec) ListStacks(realmName, spaceName string) ([]intmodel.Stack, error)

func (*Exec) LoadImage added in v0.3.0

func (r *Exec) LoadImage(namespace string, reader io.Reader) ([]string, error)

LoadImage imports an OCI/docker image tarball into the given containerd namespace and returns the names of the imported images. The caller is responsible for ensuring the namespace exists; the controller layer validates the realm before invoking this method.

func (*Exec) PopulateAndPersistCellContainerStatuses

func (r *Exec) PopulateAndPersistCellContainerStatuses(cell *intmodel.Cell) error

PopulateAndPersistCellContainerStatuses populates container statuses from containerd and persists them by updating cell metadata. This should be used when the cell status changes need to be persisted to disk.

func (*Exec) PurgeCell

func (r *Exec) PurgeCell(cell intmodel.Cell) error

PurgeCell performs comprehensive cleanup of a cell, including CNI resources and orphaned containers.

func (*Exec) PurgeContainer

func (r *Exec) PurgeContainer(realm intmodel.Realm, containerID string) error

PurgeContainer performs comprehensive cleanup of a container, including CNI resources.

func (*Exec) PurgeRealm

func (r *Exec) PurgeRealm(realm intmodel.Realm) (bool, error)

PurgeRealm performs comprehensive cleanup of a realm, including all child resources, CNI resources, and orphaned containers.

Returns (namespaceRemoved, err). namespaceRemoved is true iff the containerd namespace was actually removed (or was already gone). err is non-nil only for fatal precondition failures (missing name, GetRealm error, containerd connect error) or when DeleteNamespace itself failed — it is the load-bearing piece of "purge". Best-effort cleanups (cgroup removal, CNI teardown, orphaned-container drain) log warnings and do not surface as err so a fully-cleaned namespace is never reported as a failed purge.

func (*Exec) PurgeSpace

func (r *Exec) PurgeSpace(space intmodel.Space) error

PurgeSpace performs comprehensive cleanup of a space, including CNI resources and orphaned containers.

func (*Exec) PurgeStack

func (r *Exec) PurgeStack(stack intmodel.Stack) error

PurgeStack performs comprehensive cleanup of a stack, including CNI resources and orphaned containers.

func (*Exec) ReconcileCell added in v0.4.0

func (r *Exec) ReconcileCell(cell intmodel.Cell) (intmodel.Cell, ReconcileOutcome, error)

ReconcileCell is the daemon-side counterpart to RefreshCell. Where RefreshCell derives cell.Status.State from cgroup existence alone (so `kuke refresh` keeps its current shape), ReconcileCell additionally considers container task state — which is what flips a Ready cell to Stopped when an operator runs `ctr task kill` outside of kukeon, or when the cell's non-root workloads have all exited.

State derivation (see deriveCellState):

  • cgroup check error or cgroup absent → Unknown
  • cgroup present, no non-root container in spec (kukeond-style or cgroup-only cell) → derived from the root container's task state (the legacy behavior; root *is* the workload here)
  • cgroup present, at least one non-root container in spec → derived from the union of non-root container task states (Ready if any non-root is active; Stopped if every non-root is Stopped/Failed/non-existent; Unknown if any non-root reads back Unknown, the defensive read-side check that pairs with #301)

Container statuses are populated up front so the derivation reads them straight from the snapshot instead of re-querying containerd (and so `kuke get cell` sees the same per-container view the reconciler decided on).
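The derivation rules listed above can be sketched as a pure function over a status snapshot. This is not the package's deriveCellState. The types are simplified stand-ins, and three points are assumptions: Unknown takes precedence over an active sibling, a root-only cell maps anything other than Ready or Unknown to Stopped, and the sketch never produces Failed.

```go
package main

import "fmt"

type state string

const (
	stateReady   state = "Ready"
	stateStopped state = "Stopped"
	stateFailed  state = "Failed"
	stateUnknown state = "Unknown"
	stateMissing state = "NonExistent" // container not present in containerd
)

// snapshot is a hypothetical, simplified view of what the reconciler reads.
type snapshot struct {
	CgroupExists bool
	CgroupErr    error
	Root         state   // root container task state
	NonRoot      []state // task states of the non-root containers in spec
}

func deriveCellState(s snapshot) state {
	if s.CgroupErr != nil || !s.CgroupExists {
		return stateUnknown
	}
	if len(s.NonRoot) == 0 {
		// Root-only cell: the root container is the workload.
		switch s.Root {
		case stateReady, stateUnknown:
			return s.Root
		default:
			return stateStopped // assumption: everything else collapses to Stopped
		}
	}
	active := false
	for _, st := range s.NonRoot {
		switch st {
		case stateUnknown:
			return stateUnknown // assumption: Unknown wins over an active sibling
		case stateReady:
			active = true
		}
	}
	if active {
		return stateReady
	}
	return stateStopped // every non-root is Stopped, Failed or non-existent
}

func main() {
	fmt.Println(deriveCellState(snapshot{CgroupExists: false}))
	fmt.Println(deriveCellState(snapshot{CgroupExists: true, Root: stateReady}))
	fmt.Println(deriveCellState(snapshot{
		CgroupExists: true,
		Root:         stateReady,
		NonRoot:      []state{stateStopped, stateMissing},
	}))
}
```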

Wind-down side-effects: when the derivation flips to Stopped/Failed and the cell has at least one non-root container in spec and the root container task is still running, the reconciler kills the cell so the root container shell stops too. Two flavors:

  • AutoDelete=true: autoDeleteCell runs (KillCell + DeleteCell) and the cell metadata is removed. Subsumes the per-cell `kuke run --rm` watcher.
  • AutoDelete=false: windDownCell runs (KillCell only). The cell metadata is preserved in Stopped state for the operator to `kuke delete` explicitly — a long-lived `sleep infinity` root does not get to hold the cell open after every workload has exited.

Both flavors are gated by the ReadyObserved latch so an in-flight CreateCell (cgroup created, non-root containers not yet registered → derivation reads Stopped from "container does not exist") is not reaped before it ever reached Ready.
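The gating described above reduces to a small decision function. The sketch below is a hypothetical restatement of the documented conditions, not the reconciler's code: cellView and windDownAction are invented names, and the fields are simplified stand-ins for values the reconciler reads from the cell.

```go
package main

import "fmt"

type action string

const (
	actionNone       action = "none"
	actionWindDown   action = "wind-down"   // KillCell only
	actionAutoDelete action = "auto-delete" // KillCell + DeleteCell
)

// cellView is a hypothetical summary of the inputs to the decision.
type cellView struct {
	Derived       string // state produced by the derivation step
	HasNonRoot    bool   // at least one non-root container in spec
	RootRunning   bool   // root container task still running
	ReadyObserved bool   // latch: the cell reached Ready at least once
	AutoDelete    bool
}

func windDownAction(c cellView) action {
	stopped := c.Derived == "Stopped" || c.Derived == "Failed"
	if !stopped || !c.HasNonRoot || !c.RootRunning {
		return actionNone
	}
	if !c.ReadyObserved {
		// An in-flight CreateCell can read as Stopped before its
		// containers are registered; it must not be reaped.
		return actionNone
	}
	if c.AutoDelete {
		return actionAutoDelete
	}
	return actionWindDown
}

func main() {
	c := cellView{Derived: "Stopped", HasNonRoot: true, RootRunning: true}
	fmt.Println(windDownAction(c)) // latch not set yet

	c.ReadyObserved = true
	fmt.Println(windDownAction(c))

	c.AutoDelete = true
	fmt.Println(windDownAction(c))
}
```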

Errors during kill/delete are returned to the caller so the loop's per-pass `Errors` slice records them and the cell is preserved for retry on the next tick (best-effort, like the watcher it replaces).

func (*Exec) RecreateCell

func (r *Exec) RecreateCell(desired intmodel.Cell) (intmodel.Cell, error)

RecreateCell stops all containers in the cell, deletes them, and recreates the cell with the new root container spec. This is used when the root container spec changes (image, command, or args).

func (*Exec) RefreshCell

func (r *Exec) RefreshCell(cell intmodel.Cell) (intmodel.Cell, int, error)

RefreshCell refreshes the status of a cell and its containers. Returns the updated cell, number of containers updated, and any error.

func (*Exec) RefreshRealm

func (r *Exec) RefreshRealm(realm intmodel.Realm) (intmodel.Realm, bool, error)

RefreshRealm refreshes the status of a realm by checking cgroup and containerd namespace. Returns the updated realm, whether it was updated, and any error.

func (*Exec) RefreshSpace

func (r *Exec) RefreshSpace(space intmodel.Space) (intmodel.Space, bool, error)

RefreshSpace refreshes the status of a space by checking CNI config. Returns the updated space, whether it was updated, and any error.

func (*Exec) RefreshStack

func (r *Exec) RefreshStack(stack intmodel.Stack) (intmodel.Stack, bool, error)

RefreshStack refreshes the status of a stack by checking cgroup. Returns the updated stack, whether it was updated, and any error.

func (*Exec) StartCell

func (r *Exec) StartCell(cell intmodel.Cell) (_ intmodel.Cell, retErr error)

StartCell starts the root container and all containers defined in the CellDoc. The root container is started first, then all containers in doc.Spec.Containers are started.

If a containerd-touching step fails (CreateContainer, StartContainer, CNI attach, attachable chown), the cell is transitioned to CellStateFailed and any containers that did start are killed — issue #407. Errors raised before the provisioning phase (input validation, realm lookup, idempotent-skip path) leave the cell's persisted state alone.

func (*Exec) StartContainer

func (r *Exec) StartContainer(cell intmodel.Cell, containerID string) (_ intmodel.Cell, retErr error)

StartContainer starts a specific container in a cell. Mirrors StartCell's issue-#407 behavior: a containerd-touching failure (CreateContainer, StartContainer, attachable chown) marks the cell as CellStateFailed and kills any siblings the cell still holds, so a single non-root container that cannot be brought up no longer leaves the cell wedged in Unknown.

func (*Exec) StopCell

func (r *Exec) StopCell(cell intmodel.Cell) (intmodel.Cell, error)

StopCell stops all containers in the cell (workload containers first, then root container). It detaches the root container from the CNI network before stopping it, ensuring the network namespace is still valid. If detachment fails or the container is already stopped, fallback cleanup removes IPAM allocations directly.

func (*Exec) StopContainer

func (r *Exec) StopContainer(cell intmodel.Cell, containerID string) error

StopContainer stops a specific container in a cell.

func (*Exec) UpdateCell

func (r *Exec) UpdateCell(desired intmodel.Cell) (intmodel.Cell, error)

UpdateCell updates an existing cell with new metadata and container changes. It handles:

  • Metadata updates (labels)
  • Container additions (containers in desired but not in actual)
  • Container updates (containers in both, with spec changes)
  • Container removals (orphans: containers in actual but not in desired)

Breaking changes (root container spec changes, parent associations) should be rejected before calling this method.
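The container-level classification (additions, updates, removals) can be sketched as a diff keyed by container ID. containerSpec is a simplified stand-in for intmodel.ContainerSpec, and diffContainers is a hypothetical helper. The sketch ignores metadata updates and does not decide whether a change is breaking.

```go
package main

import "fmt"

// containerSpec is a simplified stand-in for intmodel.ContainerSpec.
type containerSpec struct {
	ID    string
	Image string
}

// containerDiff groups the three kinds of container change.
type containerDiff struct {
	Add    []containerSpec // in desired, not in actual
	Update []containerSpec // in both, spec differs
	Remove []string        // orphans: in actual, not in desired
}

func diffContainers(actual, desired []containerSpec) containerDiff {
	have := make(map[string]containerSpec, len(actual))
	for _, c := range actual {
		have[c.ID] = c
	}
	want := make(map[string]bool, len(desired))

	var d containerDiff
	for _, c := range desired {
		want[c.ID] = true
		old, ok := have[c.ID]
		switch {
		case !ok:
			d.Add = append(d.Add, c)
		case old != c:
			d.Update = append(d.Update, c)
		}
	}
	for _, c := range actual {
		if !want[c.ID] {
			d.Remove = append(d.Remove, c.ID)
		}
	}
	return d
}

func main() {
	actual := []containerSpec{{"web", "nginx:1.26"}, {"old", "busybox"}}
	desired := []containerSpec{{"web", "nginx:1.27"}, {"cache", "redis:7"}}
	d := diffContainers(actual, desired)
	fmt.Println(len(d.Add), len(d.Update), d.Remove)
}
```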

func (*Exec) UpdateCellMetadata

func (r *Exec) UpdateCellMetadata(cell intmodel.Cell) error

func (*Exec) UpdateContainer

func (r *Exec) UpdateContainer(cell intmodel.Cell, desiredContainer intmodel.ContainerSpec) (intmodel.Cell, error)

UpdateContainer updates an existing container within a cell. If the container spec has breaking changes (image, command, args), it will stop, delete, and recreate the container. Otherwise, it updates the container spec in metadata.
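The breaking-change test named above (image, command, args) can be sketched as a field comparison. containerSpec is again a simplified stand-in: the Command type is a guess, and Env is a hypothetical field included only to show a change that is not breaking.

```go
package main

import (
	"fmt"
	"slices"
)

// containerSpec is a simplified stand-in for intmodel.ContainerSpec.
type containerSpec struct {
	ID      string
	Image   string
	Command string
	Args    []string
	Env     []string // hypothetical non-breaking field
}

// hasBreakingChange reports whether moving from actual to desired requires a
// stop, delete and recreate rather than a metadata-only update.
func hasBreakingChange(actual, desired containerSpec) bool {
	return actual.Image != desired.Image ||
		actual.Command != desired.Command ||
		!slices.Equal(actual.Args, desired.Args)
}

func main() {
	current := containerSpec{ID: "web", Image: "nginx:1.26", Args: []string{"-g", "daemon off;"}}

	envOnly := current
	envOnly.Env = []string{"TZ=UTC"}
	fmt.Println(hasBreakingChange(current, envOnly)) // prints "false"

	newImage := current
	newImage.Image = "nginx:1.27"
	fmt.Println(hasBreakingChange(current, newImage)) // prints "true"
}
```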

func (*Exec) UpdateRealm

func (r *Exec) UpdateRealm(desired intmodel.Realm) (intmodel.Realm, error)

UpdateRealm updates an existing realm with new metadata and compatible spec fields. It only updates fields that are backward-compatible (labels, registry credentials). Breaking changes (name, namespace) should be rejected before calling this method.

func (*Exec) UpdateRealmMetadata

func (r *Exec) UpdateRealmMetadata(realm intmodel.Realm) error

func (*Exec) UpdateSpace

func (r *Exec) UpdateSpace(desired intmodel.Space) (intmodel.Space, error)

UpdateSpace updates an existing space with new metadata and compatible spec fields. It only updates fields that are backward-compatible (labels). Breaking changes (name, realm association, CNI config path) should be rejected before calling this method.

func (*Exec) UpdateSpaceMetadata

func (r *Exec) UpdateSpaceMetadata(space intmodel.Space) error

func (*Exec) UpdateStack

func (r *Exec) UpdateStack(desired intmodel.Stack) (intmodel.Stack, error)

UpdateStack updates an existing stack with new metadata and compatible spec fields. It only updates fields that are backward-compatible (labels, ID). Breaking changes (name, realm/space association) should be rejected before calling this method.

func (*Exec) UpdateStackMetadata

func (r *Exec) UpdateStackMetadata(stack intmodel.Stack) error

type Options

type Options struct {
	ContainerdSocket string
	RunPath          string
	CniConf          cni.Conf
	// ForceRegenerateCNI forces ensureSpaceCNIConfig to rewrite an existing conflist
	// even when one is present and its bridge name matches SafeBridgeName. Set by
	// `kuke init --force-regenerate-cni` as an operator escape hatch.
	ForceRegenerateCNI bool
	// KukeonGroupGID, when non-zero, is the numeric GID of the kukeon system
	// group. The runner uses it to chown the host-side per-container tty
	// directory created for Attachable containers, so that members of the
	// kukeon group can reach the per-container sbsh socket via the same
	// group-traversal path that `kuke init` sets up on /opt/kukeon.
	KukeonGroupGID int
}

type ReconcileOutcome added in v0.4.0

type ReconcileOutcome struct {
	// Updated is true if the cell's status (state or container statuses)
	// was written back to disk during the pass.
	Updated bool
	// Deleted is true if the reconciler killed and removed the cell as
	// part of honoring Spec.AutoDelete. When true the input cell no
	// longer exists in metadata.
	Deleted bool
}

ReconcileOutcome describes the per-cell effect of a single reconcile pass. At most one of Updated / Deleted is true: an AutoDelete cell whose root task has exited skips the persisted state-update path and runs the kill+delete sequence instead, so callers see Deleted, not Updated.

type Runner

type Runner interface {
	BootstrapCNI(cfgDir, cacheDir, binDir string) (cni.BootstrapReport, error)
	EnsureKukeonRootCgroup() (existsPre bool, created bool, err error)

	GetRealm(realm intmodel.Realm) (intmodel.Realm, error)
	ListRealms() ([]intmodel.Realm, error)
	CreateRealm(realm intmodel.Realm) (intmodel.Realm, error)
	EnsureRealm(realm intmodel.Realm) (intmodel.Realm, error)
	UpdateRealm(realm intmodel.Realm) (intmodel.Realm, error)
	ExistsRealmContainerdNamespace(namespace string) (bool, error)
	// ListContainerdNamespaces returns every containerd namespace the
	// runner's client can see. Surfaced for the uninstall path so it can
	// enumerate kukeon namespaces by `.kukeon.io` suffix and clean up
	// user-created realms whose on-disk metadata was already wiped.
	ListContainerdNamespaces() ([]string, error)
	DeleteRealm(realm intmodel.Realm) error

	GetSpace(space intmodel.Space) (intmodel.Space, error)
	ListSpaces(realmName string) ([]intmodel.Space, error)
	CreateSpace(space intmodel.Space) (intmodel.Space, error)
	EnsureSpace(space intmodel.Space) (intmodel.Space, error)
	UpdateSpace(space intmodel.Space) (intmodel.Space, error)
	ExistsSpaceCNIConfig(space intmodel.Space) (bool, error)
	DeleteSpace(space intmodel.Space) error

	GetCell(cell intmodel.Cell) (intmodel.Cell, error)
	ListCells(realmName, spaceName, stackName string) ([]intmodel.Cell, error)
	ListContainers(realmName, spaceName, stackName, cellName string) ([]intmodel.ContainerSpec, error)
	CreateCell(cell intmodel.Cell) (intmodel.Cell, error)
	EnsureCell(cell intmodel.Cell) (intmodel.Cell, error)
	StartCell(cell intmodel.Cell) (intmodel.Cell, error)
	StopCell(cell intmodel.Cell) (intmodel.Cell, error)
	StartContainer(cell intmodel.Cell, containerID string) (intmodel.Cell, error)
	StopContainer(cell intmodel.Cell, containerID string) error
	KillCell(cell intmodel.Cell) (intmodel.Cell, error)
	KillContainer(cell intmodel.Cell, containerID string) error
	DeleteContainer(cell intmodel.Cell, containerID string) error
	CreateContainer(cell intmodel.Cell, container intmodel.ContainerSpec) (intmodel.Cell, error)
	EnsureContainer(cell intmodel.Cell, container intmodel.ContainerSpec) (intmodel.Cell, error)
	UpdateCell(cell intmodel.Cell) (intmodel.Cell, error)
	RecreateCell(cell intmodel.Cell) (intmodel.Cell, error)
	UpdateContainer(cell intmodel.Cell, container intmodel.ContainerSpec) (intmodel.Cell, error)
	UpdateCellMetadata(cell intmodel.Cell) error
	ExistsCellRootContainer(cell intmodel.Cell) (bool, error)
	DeleteCell(cell intmodel.Cell) error

	GetStack(stack intmodel.Stack) (intmodel.Stack, error)
	ListStacks(realmName, spaceName string) ([]intmodel.Stack, error)
	CreateStack(stack intmodel.Stack) (intmodel.Stack, error)
	EnsureStack(stack intmodel.Stack) (intmodel.Stack, error)
	UpdateStack(stack intmodel.Stack) (intmodel.Stack, error)
	DeleteStack(stack intmodel.Stack) error

	ExistsCgroup(doc any) (bool, error)

	PurgeRealm(realm intmodel.Realm) (namespaceRemoved bool, err error)
	PurgeSpace(space intmodel.Space) error
	PurgeStack(stack intmodel.Stack) error
	PurgeCell(cell intmodel.Cell) error
	PurgeContainer(realm intmodel.Realm, containerID string) error

	RefreshRealm(realm intmodel.Realm) (intmodel.Realm, bool, error)
	RefreshSpace(space intmodel.Space) (intmodel.Space, bool, error)
	RefreshStack(stack intmodel.Stack) (intmodel.Stack, bool, error)
	RefreshCell(cell intmodel.Cell) (intmodel.Cell, int, error)
	// ReconcileCell is the daemon-side counterpart to RefreshCell: it
	// re-derives the cell's status from cgroup + root-container task state
	// (so a Ready cell whose root task exits externally flips to Stopped),
	// and — when Spec.AutoDelete is set on a cell that has reached
	// Stopped/Failed — best-effort kills and deletes the cell instead of
	// persisting the new status. Subsumes the per-cell `kuke run --rm`
	// watcher: the trigger (cell stopped) is exactly what the loop already
	// computes, and a single-instance, restart-resilient ticker means a
	// cell whose Spec.AutoDelete=true survives a daemon restart without
	// needing the daemon to re-install per-cell goroutines on startup.
	ReconcileCell(cell intmodel.Cell) (intmodel.Cell, ReconcileOutcome, error)

	GetContainerState(cell intmodel.Cell, containerID string) (intmodel.ContainerState, error)

	// LoadImage imports an OCI/docker image tarball into the given
	// containerd namespace and returns the names of the imported images.
	LoadImage(namespace string, reader io.Reader) ([]string, error)

	// ListImages enumerates images in the given containerd namespace.
	ListImages(namespace string) ([]ctr.ImageInfo, error)

	// GetImage returns metadata for the named image ref in the given
	// containerd namespace. Returns errdefs.ErrImageNotFound when the
	// ref is absent.
	GetImage(namespace, ref string) (ctr.ImageInfo, error)

	// DeleteImage removes the named image ref from the given containerd
	// namespace. Returns errdefs.ErrImageNotFound when the ref is absent.
	DeleteImage(namespace, ref string) error

	Close() error
}

func NewRunner

func NewRunner(ctx context.Context, logger *slog.Logger, opts Options) Runner
