ctr

package
v0.5.0 Latest
Published: May 14, 2026 License: Apache-2.0 Imports: 37 Imported by: 0

Documentation

Index

Constants

const (
	// AttachableBinaryPath is where the kuketty binary is bind-mounted
	// read-only inside the container. The host source is staged from
	// kukeond's own /bin/kuketty by the runner (kuketty travels inside the
	// kukeond image — see Dockerfile).
	AttachableBinaryPath = "/.kukeon/bin/kuketty"

	// AttachableTTYDir is where the per-container tty directory is
	// bind-mounted inside the container. kuketty creates and owns the
	// socket (and, in later phases, capture / log) files inside this
	// directory; because it is a directory bind mount (not a file mount),
	// kuketty's unlink-and-recreate of the socket inode stays host-visible.
	AttachableTTYDir = "/run/kukeon/tty"

	// AttachableSocketPath is the in-container path of the kuketty attach
	// control socket. kuketty listens here; the host peer is the bind-
	// mount source directory's `socket` entry, which `kuke attach`
	// connects to.
	AttachableSocketPath = AttachableTTYDir + "/socket"

	// AttachableCapturePath is the declared in-container path for the
	// kuketty capture transcript. Honored starting in phase 2 (#288);
	// kuketty does not yet write the transcript in phase 1b.
	AttachableCapturePath = AttachableTTYDir + "/capture"

	// AttachableLogfilePath is the declared in-container path for the
	// kuketty log file. Honored starting in phase 3 (#289).
	AttachableLogfilePath = AttachableTTYDir + "/log"

	// AttachableMetadataDir is the in-container directory where the
	// kukeond-rendered terminal metadata file lives. Kept under
	// /.kukeon/kuketty/ to mirror the /.kukeon/bin/ layout of the
	// binary mount and keep both kuketty-owned paths in one subtree
	// the workload's rootfs cannot collide with.
	AttachableMetadataDir = "/.kukeon/kuketty"

	// AttachableMetadataPath is the fixed in-container path kuketty reads
	// its runtime configuration from. The daemon bind-mounts the
	// per-container metadata file over this path at OCI spec build time.
	// Kept in sync with cmd/kuketty/main.go's defaultConfigPath constant.
	AttachableMetadataPath = AttachableMetadataDir + "/metadata.json"

	// AttachableMetadataFile is the basename of the host-side per-container
	// metadata file rendered into the directory above HostTTYDir before
	// the workload container starts.
	AttachableMetadataFile = "kuketty-metadata.json"

	// AttachableSocketMode is the octal mode applied to the per-container
	// attach control socket when SocketGID is configured. 0660 = rw for
	// owner (the container's runtime uid) + rw for group (the kukeon
	// group), no world. Combined with the metadata file's socket GID this
	// lets a non-root member of the kukeon group on the host `connect()`
	// to the per-container kuketty control socket. Linux requires write
	// permission on a socket inode to connect — group-readable alone is
	// not enough.
	AttachableSocketMode = "0660"

	// AttachableCaptureMode and AttachableLogFileMode are the octal modes
	// that will apply to the capture transcript and log file once phases
	// 2/3 land kuketty's writers. 0640 = rw for owner (the container uid)
	// + r for group (the kukeon group), no world.
	AttachableCaptureMode = "0640"
	AttachableLogFileMode = "0640"
)

Reserved in-container paths the kuketty wrapper claims. Documented as such in pkg/api/model/v1beta1/container.go.

kuketty (issue #165) replaces sbsh on the OCI injection path. Phase 1b (#410) lands the attach-socket RPC server: kuketty consumes the kukeond-rendered api.TerminalDoc directly via sbsh's pkg/terminal/server facade. The wrapper is invoked with no CLI flags — every runtime input flows through the bind-mounted metadata file.

const (
	// DefaultRootContainerImage is the image used when none is provided.
	DefaultRootContainerImage = "docker.io/library/busybox:latest"
)
const DefaultSecretsStagingDir = "/run/kukeon/secrets"

DefaultSecretsStagingDir is the host directory the daemon uses to stage file-mounted secrets before bind-mounting them into containers. The directory lives under /run so contents are ephemeral across reboots on typical deployments.

Variables

var KukeonKnownSnapshotters = []string{
	"overlayfs",
	"native",
	"btrfs",
	"zfs",
	"devmapper",
	"blockfile",
}

KukeonKnownSnapshotters is the list of containerd snapshotters that CleanupNamespaceResources walks when no snapshotter is specified. It stays in sync with the set of snapshotters supported by the kukeond image (overlayfs in production; native is always present as containerd's fallback). The snapshotters are listed in the order they are drained: overlayfs comes first because it is the only one populated on a real install, but the others are still tried so that a host that experimented with btrfs, zfs, or another snapshotter gets a clean uninstall instead of a "namespace not empty" error surfacing later.

Listed snapshotters that are not registered in the daemon return errors from snapshotService.Walk; cleanupSnapshotsFor handles those at WARN and keeps walking the rest.

Functions

func ConvertContainerdStatusToContainerState

func ConvertContainerdStatusToContainerState(status containerd.Status) intmodel.ContainerState

ConvertContainerdStatusToContainerState converts a containerd task status to the internal ContainerState.

func DefaultRootContainerSpec

func DefaultRootContainerSpec(
	containerdID,
	cellID,
	realmID,
	spaceID,
	stackID,
	cniConfigPath string,
) intmodel.ContainerSpec

DefaultRootContainerSpec returns a minimal ContainerSpec suitable for keeping the root container alive while other workload containers are managed. containerdID is the hierarchical ID used for containerd operations. The ID field is set to the base name "root".

func NormalizeImageReference

func NormalizeImageReference(image string) string

NormalizeImageReference normalizes an image reference to a fully qualified format. Examples:

  • "debian:latest" -> "docker.io/library/debian:latest"
  • "alpine" -> "docker.io/library/alpine:latest"
  • "user/image:tag" -> "docker.io/user/image:tag"
  • "docker.io/library/debian:latest" -> "docker.io/library/debian:latest" (unchanged)
  • "registry.example.com/image:tag" -> "registry.example.com/image:tag" (unchanged)
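The examples above are consistent with the usual Docker Hub defaulting rules. The sketch below re-implements those rules for illustration; `normalize` is a hypothetical function written to match the listed examples, not the package's actual code, and it makes no attempt to validate the reference.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize qualifies an image reference with a registry, the "library"
// namespace, and a default tag, following the documented examples.
func normalize(image string) string {
	name := image
	// The first path component counts as a registry host when it contains
	// a dot or a port separator, or is exactly "localhost".
	first, _, hasSlash := strings.Cut(name, "/")
	isRegistry := hasSlash && (strings.ContainsAny(first, ".:") || first == "localhost")
	switch {
	case isRegistry:
		// Already qualified with a registry; leave the path alone.
	case hasSlash:
		name = "docker.io/" + name
	default:
		name = "docker.io/library/" + name
	}
	// Append the default tag when the last component carries neither a tag
	// nor a digest.
	last := name[strings.LastIndex(name, "/")+1:]
	if !strings.ContainsAny(last, ":@") {
		name += ":latest"
	}
	return name
}

func main() {
	for _, ref := range []string{"debian:latest", "alpine", "user/image:tag"} {
		fmt.Printf("%s -> %s\n", ref, normalize(ref))
	}
}
```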

Types

type AttachableInjection added in v0.2.0

type AttachableInjection struct {
	// KukettyBinaryPath is the host path of the kuketty binary that will
	// be bind-mounted RO at AttachableBinaryPath inside the container.
	// The daemon stages this from its own /bin/kuketty at provision time
	// (kukeond image ships kuketty alongside the daemon binary).
	KukettyBinaryPath string

	// HostTTYDir is the host path of the per-container tty directory that
	// will be bind-mounted at AttachableTTYDir inside the container. The
	// host-visible socket that `kuke attach` connects to is the `socket`
	// entry inside this directory.
	HostTTYDir string

	// HostMetadataPath is the host path of the per-container kuketty
	// metadata file. The runner renders the metadata to this path before
	// the container starts; the OCI bind mount maps it to
	// AttachableMetadataPath inside the container.
	HostMetadataPath string

	// RenderMetadata, when non-nil, is invoked from inside the OCI
	// args-wrap spec opt with the resolved workload argv — i.e. the merge
	// of the image's ENTRYPOINT + CMD and any user override that
	// containerd's WithImageConfig and our WithProcessArgs have already
	// applied to s.Process.Args by the time the wrap runs. The callback
	// is expected to render the api.TerminalDoc with the workload argv
	// baked into Spec.Command / Spec.CommandArgs and write it atomically
	// to HostMetadataPath. nil disables metadata rendering — used by
	// unit tests that exercise only the args-wrap shape.
	RenderMetadata func(workloadArgv []string) error
}

AttachableInjection carries the host-side paths needed to wrap a container's OCI spec so it runs under kuketty. The caller (the daemon) computes the paths from the cell/container identity and the configured run path. Empty fields disable the corresponding bind mount or metadata-file entry.

type BuildOption added in v0.2.0

type BuildOption func(*buildOpts)

BuildOption customizes BuildContainerSpec without changing its return type. Used for caller-provided values that don't live on the model spec — today just the host-side paths required when ContainerSpec.Attachable is true.

func WithAttachableInjection added in v0.2.0

func WithAttachableInjection(inj AttachableInjection) BuildOption

WithAttachableInjection configures the host-side paths used when wrapping an Attachable container. Has no effect on a spec where Attachable is false; in that case the option is silently ignored so callers can pass it unconditionally.
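BuildOption follows the functional-options pattern. The sketch below shows how such an option can be ignored when Attachable is false and how empty fields can disable individual mounts. All types here are simplified local mirrors, and `plannedMounts` is a hypothetical helper; the real BuildContainerSpec produces OCI spec options instead of strings.

```go
package main

import "fmt"

// attachableInjection is a trimmed local mirror of AttachableInjection.
type attachableInjection struct {
	KukettyBinaryPath string
	HostTTYDir        string
}

// buildOpts collects the values set by options.
type buildOpts struct {
	injection *attachableInjection
}

type buildOption func(*buildOpts)

// withAttachableInjection records the host-side paths on the options struct.
func withAttachableInjection(inj attachableInjection) buildOption {
	return func(o *buildOpts) { o.injection = &inj }
}

// plannedMounts returns "source:destination" bind mounts. The injection is
// ignored when attachable is false, and empty fields skip their mount.
func plannedMounts(attachable bool, options ...buildOption) []string {
	var o buildOpts
	for _, opt := range options {
		opt(&o)
	}
	if !attachable || o.injection == nil {
		return nil
	}
	var mounts []string
	if p := o.injection.KukettyBinaryPath; p != "" {
		mounts = append(mounts, p+":/.kukeon/bin/kuketty:ro")
	}
	if d := o.injection.HostTTYDir; d != "" {
		mounts = append(mounts, d+":/run/kukeon/tty")
	}
	return mounts
}

func main() {
	// Host paths are made-up examples.
	inj := attachableInjection{KukettyBinaryPath: "/opt/kukeon/bin/kuketty", HostTTYDir: "/opt/kukeon/run/tty/demo"}
	fmt.Println(plannedMounts(true, withAttachableInjection(inj)))
	fmt.Println(plannedMounts(false, withAttachableInjection(inj)))
}
```

Passing the option unconditionally keeps call sites free of Attachable checks, which is the convenience the documentation describes.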

type CPUResources

type CPUResources struct {
	Weight *uint64
	Quota  *int64
	Period *uint64
	Cpus   string
	Mems   string
}

CPUResources maps to cpu*, cpuset* controllers.

type CgroupResources

type CgroupResources struct {
	CPU    *CPUResources
	Memory *MemoryResources
	IO     *IOResources
}

CgroupResources represents the subset of controllers we expose.

type CgroupSpec

type CgroupSpec struct {
	// Group is the target cgroup path, e.g. /kukeon/workloads/runner.
	Group string
	// Mountpoint overrides the default cgroup mount (/sys/fs/cgroup) when non-empty.
	Mountpoint string
	// Resources defines the controller knobs that should be configured for the cgroup.
	Resources CgroupResources
}

CgroupSpec describes how to create a new cgroup.

func DefaultCellSpec

func DefaultCellSpec(cell intmodel.Cell) CgroupSpec

func DefaultRealmSpec

func DefaultRealmSpec(realm intmodel.Realm) CgroupSpec

func DefaultSpaceSpec

func DefaultSpaceSpec(space intmodel.Space) CgroupSpec

func DefaultStackSpec

func DefaultStackSpec(stack intmodel.Stack) CgroupSpec

type Client

type Client interface {
	Connect() error
	Close() error

	CreateNamespace(namespace string) error
	DeleteNamespace(namespace string) error
	ListNamespaces() ([]string, error)
	GetNamespace(namespace string) (string, error)
	ExistsNamespace(namespace string) (bool, error)
	CleanupNamespaceResources(namespace, snapshotter string) error

	GetCgroupMountpoint() string
	GetCurrentCgroupPath() (string, error)
	CgroupPath(group, mountpoint string) (string, error)
	NewCgroup(spec CgroupSpec) (*cgroup2.Manager, error)
	LoadCgroup(group string, mountpoint string) (*cgroup2.Manager, error)
	DeleteCgroup(group, mountpoint string) error
	// EnsureSubtreeControllers writes "+<ctrl>" to the named group's own
	// cgroup.subtree_control AND to every ancestor up to the unified cgroup
	// mount, so the group's children inherit the controllers. The level-
	// agnostic primitive used by every realm/space/stack ensure path (issue
	// #327) and by the cell wrappers below (issues #312, #314). Filters the
	// requested set against what the host root advertises and returns the
	// effective set actually written. Idempotent — re-running on an already-
	// delegated subtree is a no-op.
	EnsureSubtreeControllers(group, mountpoint string, controllers []string) ([]string, error)
	// EnableCellSubtreeControllers enables the named cgroup-v2 controllers in
	// the cell cgroup's own subtree_control AND in every ancestor's
	// subtree_control up to the unified cgroup mount, so child cgroups (the
	// per-container task cgroups runc creates under Linux.CgroupsPath)
	// inherit the controllers and cell-level resource accounting / limits
	// become effective. Returns the effective controller set actually
	// written (the requested set filtered against the host root's
	// cgroup.controllers) so the runner can persist it on
	// CellStatus.SubtreeControllers (issue #328). Thin wrapper around
	// EnsureSubtreeControllers kept for the cell call sites' readability.
	// Issue #312.
	EnableCellSubtreeControllers(group, mountpoint string, controllers []string) ([]string, error)
	// EnableCellAllSubtreeControllers is the cell/profile=NestedCgroupRuntime
	// counterpart: it delegates the full host-available cgroup-v2 controller
	// set on the cell's subtree_control (and every ancestor's), rather than
	// the kukeon resource subset. Returns the effective controller set
	// actually written so the runner can persist it on
	// CellStatus.SubtreeControllers (issue #328). Used by cells that host
	// an inner cgroup runtime (an embedded containerd or systemd) which
	// needs to in turn delegate arbitrary controllers to its own children.
	// Issue #314.
	EnableCellAllSubtreeControllers(group, mountpoint string) ([]string, error)
	// RelocateProcessesToLeaf drains every PID currently in <group>/cgroup.procs
	// into a freshly-mkdir'd leaf cgroup at <group>/<leaf>. Used to satisfy
	// cgroup-v2's no-internal-process rule (issue #336): subtree_control
	// widening for non-thread-aware controllers (memory, io, ...) is rejected
	// by the kernel when the target cgroup hosts processes directly. The
	// leaf inherits the parent's controllers via the parent's subtree_control,
	// so resource accounting at <group> still applies — the PIDs just live
	// one level deeper. Idempotent: re-running on an already-drained group
	// is a no-op.
	RelocateProcessesToLeaf(group, mountpoint, leaf string) error
	CreateContainerFromSpec(namespace string, spec intmodel.ContainerSpec, creds []RegistryCredentials, opts ...BuildOption) (containerd.Container, error)

	CreateContainer(namespace string, spec ContainerSpec, creds []RegistryCredentials) (containerd.Container, error)
	GetContainer(namespace, id string) (containerd.Container, error)
	ListContainers(namespace string, filters ...string) ([]containerd.Container, error)
	ExistsContainer(namespace, id string) (bool, error)
	DeleteContainer(namespace, id string, opts ContainerDeleteOptions) error
	StartContainer(namespace string, spec ContainerSpec, taskSpec TaskSpec) (containerd.Task, error)
	StopContainer(namespace, id string, opts StopContainerOptions) (*containerd.ExitStatus, error)

	TaskStatus(namespace, id string) (containerd.Status, error)
	TaskMetrics(namespace, id string) (*apitypes.Metric, error)

	// ContainerProcessUID returns the resolved process.User.UID from the
	// given container's OCI runtime spec. Used after CreateContainerFromSpec
	// to chown the host-side per-container Attachable tty directory to the
	// runtime uid the container will execute as — which can be non-root
	// when the image carries a USER directive (or the cell profile sets
	// container.user). Without this, kuketty inside the container fails to
	// create its socket/log/capture files in the bind-mounted dir.
	ContainerProcessUID(namespace string, container containerd.Container) (uint32, error)

	// LoadImage imports an OCI/docker image tarball into the specified
	// containerd namespace and returns the names of the imported images.
	LoadImage(namespace string, reader io.Reader) ([]string, error)

	// ListImages enumerates images in the specified containerd namespace.
	ListImages(namespace string) ([]ImageInfo, error)

	// GetImage returns metadata for the named image ref in the specified
	// containerd namespace. Returns errdefs.ErrImageNotFound if
	// the ref is absent.
	GetImage(namespace, ref string) (ImageInfo, error)

	// DeleteImage removes the named image ref from the specified
	// containerd namespace. Returns errdefs.ErrImageNotFound if the ref
	// is absent so callers can distinguish missing from operational
	// failures.
	DeleteImage(namespace, ref string) error
}
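The EnsureSubtreeControllers comment describes three steps: filter the requested controllers against what the host advertises, render a "+<ctrl>" payload, and write it to cgroup.subtree_control at every level. The sketch below computes those pieces without touching the filesystem. The helper names are hypothetical, and including the mount root's own file in the list is an assumption about what "up to the unified cgroup mount" covers.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// effectiveControllers keeps the requested controllers that the host root
// advertises, preserving the request order.
func effectiveControllers(requested, available []string) []string {
	have := make(map[string]bool, len(available))
	for _, c := range available {
		have[c] = true
	}
	var out []string
	for _, c := range requested {
		if have[c] {
			out = append(out, c)
		}
	}
	return out
}

// subtreeControlPayload renders the string written to cgroup.subtree_control,
// for example "+cpu +memory".
func subtreeControlPayload(controllers []string) string {
	parts := make([]string, len(controllers))
	for i, c := range controllers {
		parts[i] = "+" + c
	}
	return strings.Join(parts, " ")
}

// subtreeControlFiles lists the cgroup.subtree_control files from the mount
// root down to the group. Top-down order matters: a controller must be
// enabled in a parent before a child can delegate it further.
func subtreeControlFiles(mountpoint, group string) []string {
	files := []string{filepath.Join(mountpoint, "cgroup.subtree_control")}
	dir := mountpoint
	for _, seg := range strings.Split(strings.Trim(group, "/"), "/") {
		if seg == "" {
			continue
		}
		dir = filepath.Join(dir, seg)
		files = append(files, filepath.Join(dir, "cgroup.subtree_control"))
	}
	return files
}

func main() {
	eff := effectiveControllers([]string{"cpu", "memory", "hugetlb"}, []string{"cpu", "io", "memory"})
	fmt.Println(subtreeControlPayload(eff))
	// The group path is a made-up example.
	for _, f := range subtreeControlFiles("/sys/fs/cgroup", "/kukeon/realm-a") {
		fmt.Println(f)
	}
}
```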

func NewClient

func NewClient(ctx context.Context, logger *slog.Logger, socket string) Client

type ContainerDeleteOptions

type ContainerDeleteOptions struct {
	// SnapshotCleanup indicates whether to clean up snapshots.
	SnapshotCleanup bool
}

ContainerDeleteOptions describes options for deleting a container.

type ContainerRuntime

type ContainerRuntime struct {
	// Name is the runtime name (e.g., "io.containerd.runc.v2").
	Name string
	// Options are runtime-specific options.
	Options interface{}
}

ContainerRuntime describes the runtime configuration.

type ContainerSpec

type ContainerSpec struct {
	// ID is the unique identifier for the container.
	ID string
	// Image is the image reference to use for the container.
	Image string
	// SnapshotKey is the key for the snapshot. If empty, defaults to ID.
	SnapshotKey string
	// Snapshotter is the snapshotter to use. If empty, uses default.
	Snapshotter string
	// Runtime is the runtime configuration.
	Runtime *ContainerRuntime
	// SpecOpts are OCI spec options to apply.
	SpecOpts []oci.SpecOpts
	// Labels are key-value pairs to attach to the container.
	Labels map[string]string
	// CNIConfigPath is the path to the CNI configuration to use for this container.
	CNIConfigPath string
}

ContainerSpec describes how to create a new container.

func BuildContainerSpec

func BuildContainerSpec(
	containerSpec intmodel.ContainerSpec,
	options ...BuildOption,
) ContainerSpec

BuildContainerSpec converts an internal ContainerSpec to ctr.ContainerSpec with the expected defaults applied. Uses ContainerdID if available, otherwise falls back to ID.

func BuildRootContainerSpec

func BuildRootContainerSpec(
	rootSpec intmodel.ContainerSpec,
	labels map[string]string,
) ContainerSpec

BuildRootContainerSpec converts the internal root container spec into a ctr.ContainerSpec with the expected defaults applied and the given labels attached. Uses ContainerdID if available, otherwise falls back to ID.

func JoinContainerNamespaces

func JoinContainerNamespaces(spec ContainerSpec, ns NamespacePaths) ContainerSpec

JoinContainerNamespaces returns a copy of spec with namespace spec options applied.

type IOResources

type IOResources struct {
	Weight   uint16
	Throttle []IOThrottleEntry
}

IOResources exposes IO weight + throttling.

type IOThrottleEntry

type IOThrottleEntry struct {
	Type  IOThrottleType
	Major int64
	Minor int64
	Rate  uint64
}

IOThrottleEntry represents a single io.max entry.

type IOThrottleType

type IOThrottleType string

IOThrottleType identifies the throttle file to target.
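For orientation, a cgroup-v2 io.max line has the form "MAJ:MIN key=value". The sketch below renders one IOThrottleEntry into that form. The constant values are an assumption: this page does not list the package's IOThrottleType values, so the standard io.max keys (rbps, wbps, riops, wiops) are used as stand-ins, and the types are local mirrors.

```go
package main

import "fmt"

type ioThrottleType string

// Assumed values: the standard cgroup-v2 io.max keys.
const (
	readBPS   ioThrottleType = "rbps"
	writeBPS  ioThrottleType = "wbps"
	readIOPS  ioThrottleType = "riops"
	writeIOPS ioThrottleType = "wiops"
)

// ioThrottleEntry is a local mirror of IOThrottleEntry.
type ioThrottleEntry struct {
	Type  ioThrottleType
	Major int64
	Minor int64
	Rate  uint64
}

// ioMaxLine renders the entry as a single io.max line for one device.
func ioMaxLine(e ioThrottleEntry) string {
	return fmt.Sprintf("%d:%d %s=%d", e.Major, e.Minor, e.Type, e.Rate)
}

func main() {
	// Limit writes on device 8:0 to 1 MiB/s and reads to 500 IOPS.
	entries := []ioThrottleEntry{
		{Type: writeBPS, Major: 8, Minor: 0, Rate: 1 << 20},
		{Type: readIOPS, Major: 8, Minor: 0, Rate: 500},
	}
	for _, e := range entries {
		fmt.Println(ioMaxLine(e))
	}
}
```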

type ImageInfo added in v0.3.0

type ImageInfo struct {
	Name      string
	Size      int64
	CreatedAt time.Time
	Digest    string
	MediaType string
	Labels    map[string]string
}

ImageInfo is the ctr-layer view of a containerd image. The fields are the common subset surfaced to operators by `kuke image get`; downstream layers re-encode this onto their own wire types so the ctr package does not leak into pkg/api.

type MemoryResources

type MemoryResources struct {
	Min  *int64
	Max  *int64
	Low  *int64
	High *int64
	Swap *int64
}

MemoryResources maps to memory controller knobs.
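The pointer fields let a caller distinguish "not set" from an explicit zero. As a rough sketch of how such a struct could translate to cgroup-v2 interface files, the code below maps each non-nil field to its file name. The mapping to memory.min, memory.low, memory.high, memory.max, and memory.swap.max is an assumption based on the field names, and `memoryFiles` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// memoryResources is a local mirror of MemoryResources.
type memoryResources struct {
	Min  *int64
	Max  *int64
	Low  *int64
	High *int64
	Swap *int64
}

// memoryFiles returns the cgroup-v2 file contents for every field that is
// set. Nil fields are skipped so the kernel default stays in effect.
func memoryFiles(m memoryResources) map[string]string {
	out := map[string]string{}
	set := func(file string, v *int64) {
		if v != nil {
			out[file] = strconv.FormatInt(*v, 10)
		}
	}
	set("memory.min", m.Min)
	set("memory.low", m.Low)
	set("memory.high", m.High)
	set("memory.max", m.Max)
	set("memory.swap.max", m.Swap)
	return out
}

func main() {
	high, limit := int64(384<<20), int64(512<<20)
	files := memoryFiles(memoryResources{High: &high, Max: &limit})
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Println(name, files[name])
	}
}
```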

type NamespacePaths

type NamespacePaths struct {
	Net string
	IPC string
	UTS string
	PID string
}

NamespacePaths describes the namespace file paths a container should join.

type RegistryCredentials

type RegistryCredentials struct {
	// Username is the registry username.
	Username string
	// Password is the registry password or token.
	Password string
	// ServerAddress is the registry server address (e.g., "docker.io", "registry.example.com").
	// If empty, credentials apply to the registry extracted from the image reference.
	ServerAddress string
}

RegistryCredentials contains authentication information for a container registry. This type matches the modelhub RegistryCredentials structure for use in the ctr package.

func ConvertRealmCredentials

func ConvertRealmCredentials(creds []intmodel.RegistryCredentials) []RegistryCredentials

ConvertRealmCredentials converts a slice of modelhub RegistryCredentials into a slice of ctr RegistryCredentials.

type StopContainerOptions

type StopContainerOptions struct {
	// Signal is the signal to send (defaults to SIGTERM).
	Signal string
	// Timeout is the timeout for graceful shutdown.
	Timeout *time.Duration
	// Force indicates whether to force kill if timeout is exceeded.
	Force bool
}

StopContainerOptions describes options for stopping a container.

type TaskIO

type TaskIO struct {
	// Stdin is the path to stdin (if any).
	Stdin string
	// Stdout is the path to stdout (if any).
	Stdout string
	// Stderr is the path to stderr (if any).
	Stderr string
	// Terminal indicates if the task should have a TTY.
	Terminal bool
	// LogFilePath, when set, makes the runtime shim write the task's
	// stdout and stderr to this host path via cio.LogFile. The shim
	// opens the file in append mode; if no file exists yet it is
	// created. Mutually exclusive with Terminal — log files do not
	// pair with a TTY.
	LogFilePath string
}

TaskIO describes the IO configuration for a task.

type TaskSpec

type TaskSpec struct {
	// IO is the IO configuration for the task.
	IO *TaskIO
	// Options are task creation options.
	Options []containerd.NewTaskOpts
}

TaskSpec describes how to create a new task.
