devices

package
v0.0.5 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 16, 2026 License: MIT Imports: 18 Imported by: 0

README

Device Passthrough

This package provides GPU, vGPU, and PCI device passthrough for virtual machines.

Overview

hypeman supports two GPU modes:

Mode            Description                           Use Case
vGPU (SR-IOV)   Virtual GPUs via mdev on SR-IOV VFs   Multi-tenant, shared GPU resources
Passthrough     Whole GPU VFIO passthrough            Dedicated GPU per instance

For GPU-specific documentation, see GPU.md.

Package Structure

lib/devices/
├── types.go         # Device, AvailableDevice, GPUProfile, MdevDevice types
├── errors.go        # Error definitions
├── discovery.go     # PCI device discovery from sysfs
├── vfio.go          # VFIO bind/unbind operations
├── gpu_mode.go      # GPU mode detection (vGPU vs passthrough)
├── mdev.go          # mdev lifecycle (create, destroy, list, reconcile)
├── manager.go       # Manager interface and implementation
├── manager_test.go  # Unit tests
├── gpu_e2e_test.go  # End-to-end GPU passthrough test
├── GPU.md           # GPU and vGPU documentation
└── scripts/
    └── gpu-reset.sh # GPU recovery script

Quick Start

vGPU Mode

# Check available profiles
curl localhost:8080/resources | jq .gpu

# Create instance with vGPU
curl -X POST localhost:8080/instances \
  -H "Content-Type: application/json" \
  -d '{
    "name": "ml-training",
    "image": "nvidia/cuda:12.4-runtime-ubuntu22.04",
    "gpu": {"profile": "L40S-1Q"}
  }'

# Inside VM: verify GPU
nvidia-smi

Passthrough Mode (Dedicated GPU)

# Discover available devices
curl localhost:8080/devices/available

# Register the GPU
curl -X POST localhost:8080/devices \
  -H "Content-Type: application/json" \
  -d '{"name": "l4-gpu", "pci_address": "0000:a2:00.0"}'

# Create instance with GPU
curl -X POST localhost:8080/instances \
  -H "Content-Type: application/json" \
  -d '{"name": "ml-training", "image": "nvidia/cuda:12.0-base", "devices": ["l4-gpu"]}'

# Inside VM: verify GPU
nvidia-smi

# Delete instance (auto-unbinds from VFIO)
curl -X DELETE localhost:8080/instances/{id}

Device Lifecycle

Registration (Passthrough Mode)

Register a device for whole-GPU passthrough:

POST /devices
{
  "name": "l4-gpu",
  "pci_address": "0000:a2:00.0"
}
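
The registration call above can also be issued from Go. The following is a minimal sketch, not the package's own client: the helper name newRegisterRequest and the struct createDeviceRequest are hypothetical, the base URL is taken from the Quick Start examples, and the Content-Type header is an assumption about what the server expects.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// createDeviceRequest mirrors the JSON body shown above (hypothetical local type).
type createDeviceRequest struct {
	Name       string `json:"name,omitempty"`
	PCIAddress string `json:"pci_address"`
}

// newRegisterRequest builds, but does not send, a POST /devices request.
func newRegisterRequest(baseURL, name, pciAddress string) (*http.Request, error) {
	body, err := json.Marshal(createDeviceRequest{Name: name, PCIAddress: pciAddress})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/devices", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	// Assumption: the API expects a JSON content type.
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newRegisterRequest("http://localhost:8080", "l4-gpu", "0000:a2:00.0")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to http.DefaultClient.Do would send it to a running server.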

Instance Creation

vGPU Mode:

{
  "name": "gpu-workload",
  "image": "nvidia/cuda:12.4-runtime",
  "gpu": {"profile": "L40S-1Q"}
}

Passthrough Mode:

{
  "name": "gpu-workload",
  "image": "nvidia/cuda:12.4-runtime",
  "devices": ["l4-gpu"]
}

Automatic Cleanup

  • vGPU: mdev destroyed when instance is deleted
  • Passthrough: Device unbound from VFIO when instance is deleted
  • Orphaned mdevs: Cleaned up on server startup

Hypervisor Integration

Both Cloud Hypervisor and QEMU receive device paths:

VFIO passthrough:

/sys/bus/pci/devices/0000:a2:00.0/

mdev (vGPU):

/sys/bus/mdev/devices/<uuid>/
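
As a small illustration of the two path shapes above, the sketch below builds them from a PCI address or an mdev UUID. The helper names vfioDevicePath and mdevDevicePath are hypothetical and are not part of this package (the package exposes GetDeviceSysfsPath, whose exact output is not shown here).

```go
package main

import (
	"fmt"
	"path"
)

const (
	pciDevicesDir  = "/sys/bus/pci/devices"
	mdevDevicesDir = "/sys/bus/mdev/devices"
)

// vfioDevicePath returns the sysfs directory of a PCI device (trailing slash kept
// to match the form shown above).
func vfioDevicePath(pciAddress string) string {
	return path.Join(pciDevicesDir, pciAddress) + "/"
}

// mdevDevicePath returns the sysfs directory of a mediated device.
func mdevDevicePath(uuid string) string {
	return path.Join(mdevDevicesDir, uuid) + "/"
}

func main() {
	fmt.Println(vfioDevicePath("0000:a2:00.0"))
	fmt.Println(mdevDevicePath("aa618089-8b16-4d01-a136-25a0f3c73123"))
}
```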

Guest Driver Requirements

Important: Guest images must include pre-installed NVIDIA drivers.

FROM nvidia/cuda:12.4.1-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y nvidia-utils-550

hypeman does NOT inject drivers into guests.

Constraints and Limitations

IOMMU Requirements
  • IOMMU must be enabled in BIOS and kernel (intel_iommu=on or amd_iommu=on)
  • All devices in an IOMMU group must be passed through together

VFIO Module Requirements

modprobe vfio_pci
modprobe vfio_iommu_type1

Single Attachment

A device (or vGPU profile slot) can only be attached to one instance at a time.
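
One way to picture the single-attachment rule is an in-memory table guarded by a mutex. This is a hedged sketch only: attachmentTable and its methods are hypothetical, and the package's real bookkeeping (MarkAttached, MarkDetached) may work differently. The sentinel error text matches ErrInUse from this package.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errInUse mirrors the package's ErrInUse sentinel.
var errInUse = errors.New("device is in use")

// attachmentTable maps a device ID to the instance that holds it.
type attachmentTable struct {
	mu       sync.Mutex
	attached map[string]string
}

func newAttachmentTable() *attachmentTable {
	return &attachmentTable{attached: make(map[string]string)}
}

// attach records the attachment or fails if the device already has an owner.
func (a *attachmentTable) attach(deviceID, instanceID string) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if owner, ok := a.attached[deviceID]; ok {
		return fmt.Errorf("%w: %q is attached to instance %q", errInUse, deviceID, owner)
	}
	a.attached[deviceID] = instanceID
	return nil
}

// detach frees the device so another instance can use it.
func (a *attachmentTable) detach(deviceID string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.attached, deviceID)
}

// isInUse reports whether err wraps errInUse.
func isInUse(err error) bool { return errors.Is(err, errInUse) }

func main() {
	table := newAttachmentTable()
	fmt.Println(table.attach("l4-gpu", "instance-a"))
	fmt.Println(table.attach("l4-gpu", "instance-b"))
	table.detach("l4-gpu")
	fmt.Println(table.attach("l4-gpu", "instance-b"))
}
```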

No Hot-Plug

Devices must be specified at instance creation time.

Troubleshooting

GPU Reset Script

If GPU passthrough tests fail or hang:

# Reset all NVIDIA GPUs
sudo ./lib/devices/scripts/gpu-reset.sh

# Reset specific GPU
sudo ./lib/devices/scripts/gpu-reset.sh 0000:a2:00.0

Common Issues

VFIO Bind Hangs

Solution: the package stops nvidia-persistenced automatically before binding. For manual testing, stop it yourself:

sudo systemctl stop nvidia-persistenced

GPU Not Restored After Test

sudo sh -c 'echo 0000:a2:00.0 > /sys/bus/pci/drivers_probe'
sudo systemctl start nvidia-persistenced

vGPU Profile Not Available

Check available slots:

curl localhost:8080/resources | jq '.gpu.profiles'
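
Running the E2E Test

# Prerequisites (-a loads every listed module; without it, extra names are
# treated as parameters of the first module)
sudo modprobe -a vfio_pci vfio_iommu_type1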

# Run test (auto-skips if no GPU)
sudo env PATH=$PATH:/sbin:/usr/sbin \
  go test -v -run TestGPUPassthrough -timeout 5m ./lib/devices/...

Why root is required: sysfs driver operations require writing to files owned by root with mode 0200.

Documentation


Constants

This section is empty.

Variables

var (
	// ErrNotFound is returned when a device is not found
	ErrNotFound = errors.New("device not found")

	// ErrInUse is returned when a device is currently attached to an instance
	ErrInUse = errors.New("device is in use")

	// ErrNotBound is returned when a VFIO operation requires the device to be bound
	ErrNotBound = errors.New("device is not bound to VFIO")

	// ErrAlreadyBound is returned when trying to bind a device that's already bound to VFIO
	ErrAlreadyBound = errors.New("device is already bound to VFIO")

	// ErrAlreadyExists is returned when trying to register a device that already exists
	ErrAlreadyExists = errors.New("device already exists")

	// ErrInvalidName is returned when the device name doesn't match the required pattern
	ErrInvalidName = errors.New("device name must match pattern ^[a-zA-Z0-9][a-zA-Z0-9_.-]+$")

	// ErrNameExists is returned when a device with the same name already exists
	ErrNameExists = errors.New("device name already exists")

	// ErrInvalidPCIAddress is returned when the PCI address format is invalid
	ErrInvalidPCIAddress = errors.New("invalid PCI address format")

	// ErrDeviceNotFound is returned when the PCI device doesn't exist on the host
	ErrDeviceNotFound = errors.New("PCI device not found on host")

	// ErrVFIONotAvailable is returned when VFIO modules are not loaded
	ErrVFIONotAvailable = errors.New("VFIO is not available (modules not loaded)")

	// ErrIOMMUGroupConflict is returned when not all devices in IOMMU group can be passed through
	ErrIOMMUGroupConflict = errors.New("IOMMU group contains other devices that must also be passed through")
)
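
Because these are sentinel errors, callers can match them with errors.Is even after wrapping. The sketch below shows one possible use: mapping errors to HTTP status codes. The mapping itself, the statusFor function, and the lookup helper are assumptions for illustration and are not taken from this package; the local sentinels mirror a few of the messages above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Local mirrors of some of the package's sentinel errors.
var (
	errNotFound          = errors.New("device not found")
	errInUse             = errors.New("device is in use")
	errAlreadyExists     = errors.New("device already exists")
	errInvalidPCIAddress = errors.New("invalid PCI address format")
)

// statusFor picks an HTTP status for an error (hypothetical mapping).
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errNotFound):
		return http.StatusNotFound
	case errors.Is(err, errInUse), errors.Is(err, errAlreadyExists):
		return http.StatusConflict
	case errors.Is(err, errInvalidPCIAddress):
		return http.StatusBadRequest
	default:
		return http.StatusInternalServerError
	}
}

// lookup simulates a failed lookup that wraps the sentinel with context.
func lookup(name string) error {
	return fmt.Errorf("lookup %q: %w", name, errNotFound)
}

func main() {
	err := lookup("l4-gpu")
	fmt.Println(err, "->", statusFor(err))
}
```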
var DeviceNamePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9_.-]+$`)

DeviceNamePattern is the regex pattern for valid device names. A name must start with an alphanumeric character, followed by one or more alphanumeric characters, underscores, dots, or dashes, so the minimum length is two characters.
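
To see what the pattern accepts, the sketch below compiles the same expression locally and tries a few names. The validName helper is hypothetical; it presumably behaves like ValidateDeviceName, but that function's body is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same expression as DeviceNamePattern above.
var deviceNamePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9_.-]+$`)

// validName reports whether the name matches the pattern.
func validName(name string) bool {
	return deviceNamePattern.MatchString(name)
}

func main() {
	for _, name := range []string{"l4-gpu", "gpu.0_a", "a", "-gpu", "my gpu"} {
		fmt.Printf("%-8q valid=%v\n", name, validName(name))
	}
}
```

A single-character name such as "a" is rejected because the trailing `+` requires at least one more character after the first.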

Functions

func DestroyMdev added in v0.0.5

func DestroyMdev(ctx context.Context, mdevUUID string) error

DestroyMdev removes an mdev device.

func GetDeviceSysfsPath

func GetDeviceSysfsPath(pciAddress string) string

GetDeviceSysfsPath returns the sysfs path for a PCI device (used by cloud-hypervisor)

func GetIOMMUGroupDevices

func GetIOMMUGroupDevices(iommuGroup int) ([]string, error)

GetIOMMUGroupDevices returns all PCI devices in the same IOMMU group

func IsMdevInUse added in v0.0.5

func IsMdevInUse(mdevUUID string) bool

IsMdevInUse checks if an mdev device is currently bound to a driver (in use by a VM). An mdev with a driver symlink is actively attached to a hypervisor/VFIO.

func ReconcileMdevs added in v0.0.5

func ReconcileMdevs(ctx context.Context, instanceInfos []MdevReconcileInfo) error

ReconcileMdevs destroys orphaned mdevs that belong to hypeman but are no longer in use. This is called on server startup to clean up stale mdevs from previous runs.

Safety guarantees:

  • Only destroys mdevs that are tracked by hypeman instances (via hypemanMdevs map)
  • Never destroys mdevs created by other processes on the host
  • Skips mdevs that are currently bound to a driver (in use by a VM)
  • Skips mdevs for instances in Running or Unknown state
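
The safety guarantees above amount to a simple filter. The sketch below expresses that filter as a pure function; the mdevInfo type and orphanedMdevs function are hypothetical and only restate the four rules, they are not the package's implementation.

```go
package main

import "fmt"

// mdevInfo is a hypothetical snapshot of what is known about one mdev.
type mdevInfo struct {
	UUID            string
	Tracked         bool // referenced by a hypeman instance
	BoundToDriver   bool // has a driver symlink, i.e. in use by a VM
	InstanceRunning bool // owning instance is Running or Unknown
}

// orphanedMdevs returns the UUIDs that the rules above would allow destroying.
func orphanedMdevs(all []mdevInfo) []string {
	var out []string
	for _, m := range all {
		if !m.Tracked || m.BoundToDriver || m.InstanceRunning {
			continue // foreign, in use, or possibly still needed
		}
		out = append(out, m.UUID)
	}
	return out
}

func main() {
	mdevs := []mdevInfo{
		{UUID: "stale", Tracked: true},
		{UUID: "foreign"},
		{UUID: "bound", Tracked: true, BoundToDriver: true},
		{UUID: "running", Tracked: true, InstanceRunning: true},
	}
	fmt.Println(orphanedMdevs(mdevs))
}
```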

func ValidateDeviceName

func ValidateDeviceName(name string) bool

ValidateDeviceName validates that a device name matches the required pattern

func ValidatePCIAddress

func ValidatePCIAddress(addr string) bool

ValidatePCIAddress validates that a string is a valid PCI address format
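
The exact rule used by ValidatePCIAddress is not shown here. As a rough sketch, a full address in the domain:bus:device.function form used throughout this page could be checked as follows. The expression and the looksLikePCIAddress helper are assumptions; note that this simple version does not enforce the 5-bit limit on the device number.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical pattern: 4 hex digits, 2 hex digits, 2 hex digits, function 0-7.
var pciAddressPattern = regexp.MustCompile(`^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$`)

// looksLikePCIAddress reports whether addr has the DDDD:BB:DD.F shape.
func looksLikePCIAddress(addr string) bool {
	return pciAddressPattern.MatchString(addr)
}

func main() {
	for _, addr := range []string{"0000:a2:00.0", "a2:00.0", "0000:a2:00.8", "0000:a2:00"} {
		fmt.Printf("%-14s %v\n", addr, looksLikePCIAddress(addr))
	}
}
```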

Types

type AvailableDevice

type AvailableDevice struct {
	PCIAddress    string  `json:"pci_address"`
	VendorID      string  `json:"vendor_id"`
	DeviceID      string  `json:"device_id"`
	VendorName    string  `json:"vendor_name"`
	DeviceName    string  `json:"device_name"`
	IOMMUGroup    int     `json:"iommu_group"`
	CurrentDriver *string `json:"current_driver"` // nil if no driver bound
}

AvailableDevice represents a PCI device discovered on the host

func DiscoverAvailableDevices

func DiscoverAvailableDevices() ([]AvailableDevice, error)

DiscoverAvailableDevices scans sysfs for PCI devices that can be used for passthrough. It filters for devices that are likely candidates (GPUs, network cards, etc.).
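
How the package decides which devices are "likely candidates" is not documented here. One plausible approach, sketched below under that assumption, is to read each device's sysfs class file (for example "0x030200") and keep selected PCI base classes. The function name and the chosen classes (network, display, processing accelerator) are illustrative only.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isPassthroughCandidate inspects the contents of a sysfs "class" file.
// The top byte of the 24-bit class code is the PCI base class.
func isPassthroughCandidate(classFile string) bool {
	s := strings.TrimPrefix(strings.TrimSpace(classFile), "0x")
	code, err := strconv.ParseUint(s, 16, 32)
	if err != nil {
		return false
	}
	switch code >> 16 {
	case 0x02, // network controller
		0x03, // display controller (VGA, 3D)
		0x12: // processing accelerator
		return true
	}
	return false
}

func main() {
	for _, class := range []string{"0x030200\n", "0x020000", "0x060400", "garbage"} {
		fmt.Printf("%-10q candidate=%v\n", class, isPassthroughCandidate(class))
	}
}
```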

func GetDeviceInfo

func GetDeviceInfo(pciAddress string) (*AvailableDevice, error)

GetDeviceInfo reads information about a specific PCI device

type CreateDeviceRequest

type CreateDeviceRequest struct {
	Name       string `json:"name,omitempty"` // optional: globally unique name (auto-generated if not provided)
	PCIAddress string `json:"pci_address"`    // required: PCI address (e.g., "0000:a2:00.0")
}

CreateDeviceRequest is the request to register a new device

type Device

type Device struct {
	Id          string     `json:"id"`            // cuid2 identifier
	Name        string     `json:"name"`          // user-provided globally unique name
	Type        DeviceType `json:"type"`          // gpu or pci
	PCIAddress  string     `json:"pci_address"`   // e.g., "0000:a2:00.0"
	VendorID    string     `json:"vendor_id"`     // e.g., "10de"
	DeviceID    string     `json:"device_id"`     // e.g., "27b8"
	IOMMUGroup  int        `json:"iommu_group"`   // IOMMU group number
	BoundToVFIO bool       `json:"bound_to_vfio"` // whether device is bound to vfio-pci
	AttachedTo  *string    `json:"attached_to"`   // instance ID if attached, nil otherwise
	CreatedAt   time.Time  `json:"created_at"`
}

Device represents a registered PCI device for passthrough

type DeviceType

type DeviceType string

DeviceType represents the type of PCI device

const (
	DeviceTypeGPU     DeviceType = "gpu"
	DeviceTypeGeneric DeviceType = "pci"
)

func DetermineDeviceType

func DetermineDeviceType(device *AvailableDevice) DeviceType

DetermineDeviceType determines the DeviceType based on device properties

type GPUMode added in v0.0.5

type GPUMode string

GPUMode represents the host's GPU configuration mode

const (
	// GPUModePassthrough indicates whole GPU VFIO passthrough
	GPUModePassthrough GPUMode = "passthrough"
	// GPUModeVGPU indicates SR-IOV + mdev based vGPU
	GPUModeVGPU GPUMode = "vgpu"
	// GPUModeNone indicates no GPU available
	GPUModeNone GPUMode = "none"
)

func DetectHostGPUMode added in v0.0.5

func DetectHostGPUMode() GPUMode

DetectHostGPUMode determines the host's GPU configuration mode.

Returns:

  • GPUModeVGPU if /sys/class/mdev_bus has entries (SR-IOV VFs present)
  • GPUModePassthrough if NVIDIA GPUs are available for VFIO passthrough
  • GPUModeNone if no GPUs are available

Note: A host is configured for either vGPU or passthrough, not both, because the host driver determines which mode is available.
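
The precedence described above (vGPU first, then passthrough, then none) can be written as a small decision function. This sketch takes counts as inputs rather than reading sysfs; detectMode and its parameters are hypothetical, and the real DetectHostGPUMode takes no arguments.

```go
package main

import "fmt"

type gpuMode string

const (
	modePassthrough gpuMode = "passthrough"
	modeVGPU        gpuMode = "vgpu"
	modeNone        gpuMode = "none"
)

// detectMode applies the documented order: mdev_bus entries win over
// passthrough-capable GPUs, and an empty host yields "none".
func detectMode(mdevBusEntries, nvidiaGPUs int) gpuMode {
	switch {
	case mdevBusEntries > 0:
		return modeVGPU
	case nvidiaGPUs > 0:
		return modePassthrough
	default:
		return modeNone
	}
}

func main() {
	fmt.Println(detectMode(16, 1)) // SR-IOV VFs present
	fmt.Println(detectMode(0, 2))  // GPUs but no mdev_bus entries
	fmt.Println(detectMode(0, 0))  // no GPU
}
```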

type GPUProfile added in v0.0.5

type GPUProfile struct {
	Name          string `json:"name"`           // user-facing name, e.g., "L40S-1Q"
	FramebufferMB int    `json:"framebuffer_mb"` // frame buffer size in MB
	Available     int    `json:"available"`      // number of VFs that can create this profile
}

GPUProfile describes an available vGPU profile type

func ListGPUProfiles added in v0.0.5

func ListGPUProfiles() ([]GPUProfile, error)

ListGPUProfiles returns available vGPU profiles with availability counts. Profiles are discovered from the first VF's mdev_supported_types directory.

func ListGPUProfilesWithVFs added in v0.0.5

func ListGPUProfilesWithVFs(vfs []VirtualFunction) ([]GPUProfile, error)

ListGPUProfilesWithVFs returns available vGPU profiles using pre-discovered VFs. This avoids redundant VF discovery when the caller already has the list. Uses parallel sysfs reads for fast availability counting.
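
The parallel counting mentioned above might look roughly like the sketch below: one goroutine per VF, each writing to its own slot, then a final tally. The countAvailable function and the injected read callback are hypothetical; in practice the callback would presumably read a profile's available_instances file under the VF's mdev_supported_types directory.

```go
package main

import (
	"fmt"
	"sync"
)

// countAvailable returns how many VFs report at least one available instance.
// read is called concurrently, once per VF.
func countAvailable(vfs []string, read func(vf string) (int, error)) int {
	results := make([]bool, len(vfs)) // one slot per goroutine, so no locking needed
	var wg sync.WaitGroup
	for i, vf := range vfs {
		wg.Add(1)
		go func(i int, vf string) {
			defer wg.Done()
			n, err := read(vf)
			results[i] = err == nil && n > 0
		}(i, vf)
	}
	wg.Wait()

	count := 0
	for _, ok := range results {
		if ok {
			count++
		}
	}
	return count
}

func main() {
	// Stand-in for sysfs: VF address -> available instances.
	fake := map[string]int{"0000:82:00.4": 1, "0000:82:00.5": 0, "0000:82:00.6": 1}
	read := func(vf string) (int, error) {
		n, ok := fake[vf]
		if !ok {
			return 0, fmt.Errorf("no such VF %s", vf)
		}
		return n, nil
	}
	vfs := []string{"0000:82:00.4", "0000:82:00.5", "0000:82:00.6", "0000:82:00.7"}
	fmt.Println(countAvailable(vfs, read))
}
```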

type InstanceLivenessChecker

type InstanceLivenessChecker interface {
	// IsInstanceRunning returns true if the instance exists and is in a running state
	// (i.e., has an active VMM process). Returns false if the instance doesn't exist
	// or is stopped/standby/unknown.
	IsInstanceRunning(ctx context.Context, instanceID string) bool

	// GetInstanceDevices returns the list of device IDs attached to an instance.
	// Returns nil if the instance doesn't exist.
	GetInstanceDevices(ctx context.Context, instanceID string) []string

	// ListAllInstanceDevices returns a map of instanceID -> []deviceIDs for all instances.
	ListAllInstanceDevices(ctx context.Context) map[string][]string

	// DetectSuspiciousVMMProcesses finds cloud-hypervisor processes that don't match
	// known instances and logs warnings. Returns the count of suspicious processes found.
	DetectSuspiciousVMMProcesses(ctx context.Context) int
}

InstanceLivenessChecker provides a way to check if an instance is running. This interface allows devices to query instance state without a circular dependency.

type Manager

type Manager interface {
	// ListDevices returns all registered devices
	ListDevices(ctx context.Context) ([]Device, error)

	// ListAvailableDevices discovers passthrough-capable devices on the host
	ListAvailableDevices(ctx context.Context) ([]AvailableDevice, error)

	// CreateDevice registers a new device for passthrough
	CreateDevice(ctx context.Context, req CreateDeviceRequest) (*Device, error)

	// GetDevice returns a device by ID or name
	GetDevice(ctx context.Context, idOrName string) (*Device, error)

	// DeleteDevice unregisters a device
	DeleteDevice(ctx context.Context, id string) error

	// BindToVFIO binds a device to vfio-pci driver
	BindToVFIO(ctx context.Context, id string) error

	// UnbindFromVFIO unbinds a device from vfio-pci driver
	UnbindFromVFIO(ctx context.Context, id string) error

	// MarkAttached marks a device as attached to an instance
	MarkAttached(ctx context.Context, deviceID, instanceID string) error

	// MarkDetached marks a device as detached from an instance
	MarkDetached(ctx context.Context, deviceID string) error

	// ReconcileDevices cleans up stale device state on startup.
	// It detects devices with AttachedTo referencing non-existent instances
	// and clears the orphaned attachment state.
	ReconcileDevices(ctx context.Context) error

	// SetLivenessChecker sets the instance liveness checker after construction.
	// This allows breaking the circular dependency between device and instance managers.
	SetLivenessChecker(checker InstanceLivenessChecker)
}

Manager provides device management operations

func NewManager

func NewManager(p *paths.Paths) Manager

NewManager creates a new device manager. Use SetLivenessChecker after construction to enable accurate orphan detection.
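
The setter exists because the device manager and the instance manager each need the other. A minimal sketch of that wiring follows. All types here are hypothetical stand-ins, context parameters are omitted for brevity, and the nil-checker behaviour (treat nothing as orphaned) is an assumption rather than documented behaviour.

```go
package main

import "fmt"

// livenessChecker is a trimmed-down stand-in for InstanceLivenessChecker.
type livenessChecker interface {
	IsInstanceRunning(instanceID string) bool
}

// deviceManager is built first, without any knowledge of instances.
type deviceManager struct {
	checker livenessChecker
}

func (d *deviceManager) SetLivenessChecker(c livenessChecker) { d.checker = c }

// looksOrphaned reports whether an attachment points at a non-running instance.
// Without a checker it cannot tell, so it conservatively answers false.
func (d *deviceManager) looksOrphaned(instanceID string) bool {
	if d.checker == nil {
		return false
	}
	return !d.checker.IsInstanceRunning(instanceID)
}

// instanceManager depends on the device manager and also implements livenessChecker.
type instanceManager struct {
	devices *deviceManager
	running map[string]bool
}

func (im *instanceManager) IsInstanceRunning(id string) bool { return im.running[id] }

func main() {
	devices := &deviceManager{}
	instances := &instanceManager{devices: devices, running: map[string]bool{"vm-1": true}}
	devices.SetLivenessChecker(instances) // close the loop after both exist

	fmt.Println(devices.looksOrphaned("vm-1"), devices.looksOrphaned("vm-gone"))
}
```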

type MdevDevice added in v0.0.5

type MdevDevice struct {
	UUID        string `json:"uuid"`         // e.g., "aa618089-8b16-4d01-a136-25a0f3c73123"
	VFAddress   string `json:"vf_address"`   // VF this mdev resides on
	ProfileType string `json:"profile_type"` // internal type name, e.g., "nvidia-556"
	ProfileName string `json:"profile_name"` // user-facing name, e.g., "L40S-1Q"
	SysfsPath   string `json:"sysfs_path"`   // path for VMM device attachment
	InstanceID  string `json:"instance_id"`  // instance this mdev is attached to
}

MdevDevice represents an active mediated device (vGPU instance)

func CreateMdev added in v0.0.5

func CreateMdev(ctx context.Context, profileName, instanceID string) (*MdevDevice, error)

CreateMdev creates an mdev device for the given profile and instance. It finds an available VF and creates the mdev, returning the device info. This function is thread-safe and uses a mutex to prevent race conditions when multiple instances request vGPUs concurrently.
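
To illustrate why a mutex matters here, the sketch below models VF selection as a locked "find first free slot" step and then calls it from several goroutines at once. The vfAllocator type and the allocateConcurrently helper are hypothetical; the real function also writes to sysfs, which this sketch leaves out.

```go
package main

import (
	"fmt"
	"sync"
)

// vfAllocator hands out each VF to at most one instance.
type vfAllocator struct {
	mu    sync.Mutex
	vfs   []string
	owner map[string]string // VF address -> instance ID
}

func newVFAllocator(vfs []string) *vfAllocator {
	return &vfAllocator{vfs: vfs, owner: make(map[string]string)}
}

// allocate claims the first free VF; the lock makes check-and-claim atomic.
func (a *vfAllocator) allocate(instanceID string) (string, bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for _, vf := range a.vfs {
		if _, taken := a.owner[vf]; !taken {
			a.owner[vf] = instanceID
			return vf, true
		}
	}
	return "", false
}

// allocateConcurrently issues n simultaneous requests and collects the outcome.
func allocateConcurrently(a *vfAllocator, n int) (granted []string, failures int) {
	var wg sync.WaitGroup
	var mu sync.Mutex // protects granted and failures
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			vf, ok := a.allocate(fmt.Sprintf("instance-%d", i))
			mu.Lock()
			defer mu.Unlock()
			if ok {
				granted = append(granted, vf)
			} else {
				failures++
			}
		}(i)
	}
	wg.Wait()
	return granted, failures
}

func main() {
	a := newVFAllocator([]string{"0000:82:00.4", "0000:82:00.5"})
	granted, failures := allocateConcurrently(a, 5)
	fmt.Println(len(granted), "granted,", failures, "rejected")
}
```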

func ListMdevDevices added in v0.0.5

func ListMdevDevices() ([]MdevDevice, error)

ListMdevDevices returns all active mdev devices on the host. Scans sysfs directly for fast, consistent results without external process overhead.

type MdevReconcileInfo added in v0.0.5

type MdevReconcileInfo struct {
	InstanceID string
	MdevUUID   string
	IsRunning  bool // true if instance's VMM is running or state is unknown
}

MdevReconcileInfo contains information needed to reconcile mdevs for an instance

type PassthroughDevice added in v0.0.5

type PassthroughDevice struct {
	Name      string `json:"name"`      // GPU name, e.g., "NVIDIA L40S"
	Available bool   `json:"available"` // true if not attached to an instance
}

PassthroughDevice describes a physical GPU available for passthrough

type VFIOBinder

type VFIOBinder struct{}

VFIOBinder handles binding and unbinding devices to/from VFIO

func NewVFIOBinder

func NewVFIOBinder() *VFIOBinder

NewVFIOBinder creates a new VFIOBinder

func (*VFIOBinder) BindToVFIO

func (v *VFIOBinder) BindToVFIO(pciAddress string) error

BindToVFIO binds a PCI device to the vfio-pci driver. This requires:

  1. Stopping any processes using the device (e.g., nvidia-persistenced for NVIDIA GPUs)
  2. Unbinding the device from its current driver (if any)
  3. Binding it to vfio-pci
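
On Linux, rebinding a PCI device is usually done through a few sysfs writes. The sketch below lists one common sequence (unbind, set driver_override, trigger drivers_probe) as data instead of performing it. The bindPlan function and sysfsWrite type are hypothetical, and whether this package uses exactly this sequence is an assumption. Stopping nvidia-persistenced is not modelled.

```go
package main

import (
	"fmt"
	"path"
)

// sysfsWrite describes "write Value to Path".
type sysfsWrite struct {
	Path  string
	Value string
}

// bindPlan returns the ordered writes that would move a device to vfio-pci.
func bindPlan(pciAddress string, hasDriver bool) []sysfsWrite {
	dev := path.Join("/sys/bus/pci/devices", pciAddress)
	var plan []sysfsWrite
	if hasDriver {
		// Detach from the current driver first.
		plan = append(plan, sysfsWrite{path.Join(dev, "driver/unbind"), pciAddress})
	}
	plan = append(plan,
		// Ask the kernel to prefer vfio-pci for this device...
		sysfsWrite{path.Join(dev, "driver_override"), "vfio-pci"},
		// ...then have it re-probe drivers for the device.
		sysfsWrite{"/sys/bus/pci/drivers_probe", pciAddress},
	)
	return plan
}

func main() {
	for _, w := range bindPlan("0000:a2:00.0", true) {
		fmt.Printf("echo %s > %s\n", w.Value, w.Path)
	}
}
```

Applying such a plan requires root, since these sysfs files are writable only by their owner.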

func (*VFIOBinder) CheckIOMMUGroupSafe

func (v *VFIOBinder) CheckIOMMUGroupSafe(pciAddress string, allowedDevices []string) error

CheckIOMMUGroupSafe checks whether all devices in the IOMMU group are safe to pass through. It returns an error if the group contains other devices that are not being passed through.
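
The core of such a check is a set difference between the group's members and the allowed list. The sketch below shows that idea with hypothetical helpers (groupConflicts, checkGroupSafe); the real method may apply extra rules, for example for PCI bridges, that are not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errIOMMUGroupConflict mirrors the package's ErrIOMMUGroupConflict sentinel.
var errIOMMUGroupConflict = errors.New("IOMMU group contains other devices that must also be passed through")

// groupConflicts returns group members that are not in the allowed list.
func groupConflicts(groupDevices, allowed []string) []string {
	ok := make(map[string]bool, len(allowed))
	for _, addr := range allowed {
		ok[addr] = true
	}
	var conflicts []string
	for _, addr := range groupDevices {
		if !ok[addr] {
			conflicts = append(conflicts, addr)
		}
	}
	return conflicts
}

// checkGroupSafe wraps the sentinel error with the offending addresses.
func checkGroupSafe(groupDevices, allowed []string) error {
	if c := groupConflicts(groupDevices, allowed); len(c) > 0 {
		return fmt.Errorf("%w: %s", errIOMMUGroupConflict, strings.Join(c, ", "))
	}
	return nil
}

func main() {
	group := []string{"0000:a2:00.0", "0000:a2:00.1"} // e.g. a GPU and its audio function
	fmt.Println(checkGroupSafe(group, []string{"0000:a2:00.0"}))
	fmt.Println(checkGroupSafe(group, group))
}
```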

func (*VFIOBinder) GetVFIOGroupPath

func (v *VFIOBinder) GetVFIOGroupPath(pciAddress string) (string, error)

GetVFIOGroupPath returns the path to the VFIO group device for a PCI device
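
On Linux, a device's iommu_group entry in sysfs is a symlink whose last path element is the group number, and the matching VFIO group node lives at /dev/vfio/<number>. The sketch below derives one from the other, taking the symlink target as a string so it runs without hardware. The vfioGroupPath helper is hypothetical and may differ from the method above.

```go
package main

import (
	"fmt"
	"path"
	"strconv"
)

// vfioGroupPath maps an iommu_group symlink target to its /dev/vfio node.
func vfioGroupPath(iommuGroupLink string) (string, error) {
	group := path.Base(iommuGroupLink)
	if _, err := strconv.Atoi(group); err != nil {
		return "", fmt.Errorf("unexpected iommu_group target %q", iommuGroupLink)
	}
	return "/dev/vfio/" + group, nil
}

func main() {
	// On a real host the target would come from os.Readlink on
	// /sys/bus/pci/devices/<address>/iommu_group.
	p, err := vfioGroupPath("../../../../kernel/iommu_groups/42")
	fmt.Println(p, err)
}
```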

func (*VFIOBinder) IsDeviceBoundToVFIO

func (v *VFIOBinder) IsDeviceBoundToVFIO(pciAddress string) bool

IsDeviceBoundToVFIO checks if a device is currently bound to vfio-pci

func (*VFIOBinder) IsVFIOAvailable

func (v *VFIOBinder) IsVFIOAvailable() bool

IsVFIOAvailable checks if VFIO is available on the system

func (*VFIOBinder) UnbindFromVFIO

func (v *VFIOBinder) UnbindFromVFIO(pciAddress string) error

UnbindFromVFIO unbinds a device from vfio-pci and restores the original driver

type VirtualFunction added in v0.0.5

type VirtualFunction struct {
	PCIAddress string `json:"pci_address"` // e.g., "0000:82:00.4"
	ParentGPU  string `json:"parent_gpu"`  // e.g., "0000:82:00.0"
	HasMdev    bool   `json:"has_mdev"`    // true if an mdev is created on this VF
}

VirtualFunction represents an SR-IOV Virtual Function for vGPU

func DiscoverVFs added in v0.0.5

func DiscoverVFs() ([]VirtualFunction, error)

DiscoverVFs returns all SR-IOV Virtual Functions available for vGPU. These are discovered by scanning /sys/class/mdev_bus/ which contains VFs that can host mdev devices.
