README
¶
metalman
Join bare metal nodes to a Kubernetes cluster using PXE and (optionally) Redfish.
Metalman commands are available through the kubectl unbounded plugin. A
dedicated metalman binary is also shipped as a container image for running
the PXE server inside a cluster.
Run metalman version to print the binary version.
Usage
# Create a Machine
kubectl apply -f - <<EOF
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
name: node-01
spec:
pxe:
image: ghcr.io/azure/images/host-ubuntu2404:v1
dhcpLeases:
- mac: "aa:bb:cc:dd:ee:01"
ipv4: "10.0.0.11"
subnetMask: "255.255.255.0"
gateway: "10.0.0.1"
dns: ["10.0.0.1"]
redfish:
url: https://10.0.10.11
username: admin
passwordRef:
name: bmc-node-01-pass
namespace: default
key: password
EOF
# Store the BMC's Redfish password
kubectl create secret generic bmc-node-01-pass --from-literal=password=example-password
# Repave the node via PXE
kubectl unbounded machine repave node-01
Concepts
Controller
kubectl unbounded site serve-pxe runs a single long-lived process that provides
everything needed to PXE-boot and manage bare metal hosts:
| Service | Default Port | Protocol | Purpose |
|---|---|---|---|
| DHCP | 67/udp | DHCPv4 | Static leases derived from Machine NIC specs |
| TFTP | 69/udp | TFTP | Initial bootloader delivery (e.g. shimx64.efi) |
| HTTP | 8880/tcp | HTTP | Artifact serving, templated configs, attestation endpoints |
| Health | 8081/tcp | HTTP | Liveness/readiness probes |
The controller also runs reconcilers for OCI image pulling (downloading and caching netboot images from container registries) and Machine resources with Redfish BMC specs (power management, boot order configuration).
When deployed inside a cluster, the container entrypoint is metalman and the
site deploy-pxe command passes serve-pxe as an argument:
metalman serve-pxe --site=<site> [flags]
Deploying with site deploy-pxe
kubectl unbounded site deploy-pxe is a convenience command that creates (or
updates) a Kubernetes Deployment running metalman serve-pxe for a given
site. The Deployment is server-side applied into the unbounded-kube
namespace.
# Deploy the PXE server for a site called "rack-a"
kubectl unbounded site deploy-pxe --site=rack-a
The resulting Deployment (metalman-controller-<site>) runs with host networking for DHCP.
It exposes ports 8880/tcp (HTTP), 8081/tcp (health), 67/udp (DHCP), and 69/udp (TFTP).
site deploy-pxe flags:
--site— Site name (required; scopes the PXE instance to machines labeledunbounded-cloud.io/site=<site>).--image— Container image for the PXE deployment (default: build-time value ormetalman:latest).--kubeconfig— Path to kubeconfig file.
The generated Deployment uses host networking, a CriticalAddonsOnly
toleration, DNS policy ClusterFirstWithHostNet, and a node selector
unbounded-cloud.io/site=<site>. Resource requests are 100m CPU / 128Mi
memory with limits of 500m CPU / 256Mi memory.
DHCP Modes
The DHCP server operates in one of two modes depending on whether
--dhcp-interface is set:
-
Interface mode (
--dhcp-interface=eth0): Binds to a network interface and listens for broadcast DHCP traffic. Use this when the controller is directly attached to the provisioning network. -
Auto-interface mode (
--dhcp-auto-interface): Automatically detects the network interface from the server bind address. Mutually exclusive with--dhcp-interface. -
Relay mode (no
--dhcp-interfaceor--dhcp-auto-interface): Listens on a UDP port for unicast packets only. Use this when a DHCP relay agent forwards requests from a remote subnet.
Leader election is always enabled regardless of DHCP mode. Each site gets
its own leader-election lease (metalman-<site>).
Security Model
A mostly-trusted network between the controller and the bare metal hosts is assumed. Bootstrap tokens (Kubernetes ServiceAccount tokens) are issued to nodes based on source IP — the controller looks up the Machine whose NIC matches the requesting IP and issues a short-lived token for that node.
Bootstrap tokens are delivered using the standard TPM 2.0 credential encryption workflow.
The client's endorsement key (EK) public key is stored in status.tpm.ekPublicKey when first seen.
So it's possible to prove that the bootstrap token was delivered only to trusted hosts.
Sites
The --site flag scopes a site serve-pxe instance to a subset of Machines. The
value is matched against the unbounded-cloud.io/site label on Machine
resources:
# Manage only Machines labeled site=rack-a
kubectl unbounded site serve-pxe --site=rack-a --dhcp-interface=eth0
# Manage only unlabeled Machines (the default)
kubectl unbounded site serve-pxe --dhcp-interface=eth0
Each site gets its own leader-election lease (metalman-<site>), so
multiple sites can coexist on one cluster with independent HA. A site serve-pxe
instance with no --site manages Machines that do not have the site label
at all.
Images
Netboot images are standard OCI container images built FROM scratch that
contain all files needed for PXE booting a machine under /disk/. This
follows the kubevirt containerDisk convention. Files
with a .tmpl suffix are Go templates rendered per-machine at serve time;
other files are served verbatim. A metadata.yaml file provides image-level
configuration (e.g. dhcpBootImageName).
Images are built, tagged, and pushed using standard container tooling:
docker build -t ghcr.io/azure/images/host-ubuntu2404:v1 .
docker push ghcr.io/azure/images/host-ubuntu2404:v1
The OCI image layout is the one described above: boot artifacts live under
/disk/, .tmpl files are rendered per-machine at serve time, and
metadata.yaml carries image-level configuration.
Machine
A Machine is a cluster-scoped custom resource representing a single bare metal host. At minimum it needs a NIC (MAC + static IP) and a PXE image reference:
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
name: node-01
spec:
pxe:
image: ghcr.io/azure/images/host-ubuntu2404:v1
dhcpLeases:
- mac: "aa:bb:cc:dd:ee:01"
ipv4: "10.0.0.11"
subnetMask: "255.255.255.0"
gateway: "10.0.0.1"
This is enough for the DHCP server to issue a lease and for TFTP/HTTP to serve boot artifacts. The node must be manually PXE-booted (or have PXE as its default boot option).
BMC
Adding a redfish block enables remote power management. The controller will
manage boot order and execute reboot cycles without physical access:
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
name: node-01
spec:
pxe:
image: ghcr.io/azure/images/host-ubuntu2404:v1
dhcpLeases:
- mac: "aa:bb:cc:dd:ee:01"
ipv4: "10.0.0.11"
subnetMask: "255.255.255.0"
gateway: "10.0.0.1"
redfish:
url: https://bmc-node-01.example.com
username: admin
passwordRef:
name: bmc-node-01-pass
namespace: default
key: password
The BMC password is read from a Secret in the same namespace (key: password).
On first connection, the controller captures the BMC's TLS certificate
fingerprint and pins it in status.redfish.certFingerprint for subsequent
requests.
To repave a node with BMC access:
kubectl unbounded machine repave node-01
This increments spec.operations.repaveCounter and spec.operations.rebootCounter. The
controller handles the rest — it configures the boot order to PXE, executes a
ForceOff/On power cycle, and clears the condition once the node is back up.
Documentation
¶
There is no documentation for this package.