resources

package
v0.0.6 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 16, 2026 License: MIT Imports: 14 Imported by: 0

README

Resource Management

Host resource discovery, capacity tracking, and oversubscription-aware allocation management for CPU, memory, disk, and network.

Features

  • Resource Discovery: Automatically detects host capacity from /proc/cpuinfo, /proc/meminfo, filesystem stats, and network interface speed
  • Oversubscription: Configurable ratios per resource type (e.g., 2x CPU oversubscription)
  • Allocation Tracking: Tracks resource usage across all running instances
  • Bidirectional Network Rate Limiting: Separate download/upload limits with fair sharing
  • API Endpoint: GET /resources returns capacity, allocations, and per-instance breakdown

Configuration

Environment Variable       Default  Description
OVERSUB_CPU                4.0      CPU oversubscription ratio
OVERSUB_MEMORY             1.0      Memory oversubscription ratio
OVERSUB_DISK               1.0      Disk oversubscription ratio
OVERSUB_NETWORK            2.0      Network oversubscription ratio
OVERSUB_DISK_IO            2.0      Disk I/O oversubscription ratio
DISK_LIMIT                 auto     Hard disk limit (e.g., 500GB), auto-detects from filesystem
NETWORK_LIMIT              auto     Hard network limit (e.g., 10Gbps), auto-detects from uplink speed
DISK_IO_LIMIT              1GB/s    Hard disk I/O limit (e.g., 500MB/s, 2GB/s)
MAX_IMAGE_STORAGE          0.2      Max image storage as fraction of disk (OCI cache + rootfs)
UPLOAD_BURST_MULTIPLIER    4        Multiplier for upload burst ceiling (HTB ceil = rate × multiplier)
DOWNLOAD_BURST_MULTIPLIER  4        Multiplier for download burst bucket (TBF bucket size)
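
As a rough illustration of how an oversubscription ratio could be read from the environment, here is a minimal sketch. The helper name parseOversub is hypothetical and not part of this package. Falling back to the documented default on an empty or unparsable value is an assumption; treating values <= 0 as 1.0 mirrors the behavior documented for GetOversubRatio.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseOversub converts the raw value of an OVERSUB_* variable into a ratio.
// Hypothetical helper: an empty or unparsable value falls back to def
// (assumption), and a value <= 0 is treated as 1.0 (no oversubscription).
func parseOversub(raw string, def float64) float64 {
	if raw == "" {
		return def
	}
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return def
	}
	if v <= 0 {
		return 1.0
	}
	return v
}

func main() {
	// Defaults are taken from the configuration table above.
	cpu := parseOversub(os.Getenv("OVERSUB_CPU"), 4.0)
	mem := parseOversub(os.Getenv("OVERSUB_MEMORY"), 1.0)
	fmt.Printf("cpu=%.1f memory=%.1f\n", cpu, mem)
}
```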

Resource Types

CPU
  • Discovered from /proc/cpuinfo (threads × cores × sockets)
  • Allocated = sum of vcpus from active instances
Memory
  • Discovered from /proc/meminfo (MemTotal)
  • Allocated = sum of size + hotplug_size from active instances
Disk
  • Discovered via statfs() on DataDir, or configured via DISK_LIMIT
  • Allocated = images (rootfs) + OCI cache + volumes + overlays (rootfs + volume)
  • Image pulls blocked when <5GB available or image storage exceeds MAX_IMAGE_STORAGE
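
The image pull rule above can be sketched as a single check. This is an illustration only: checkImagePull and both error values are hypothetical names, "5GB" is assumed to mean 5 × 10^9 bytes, and the limit comparison uses >= as described for HasSufficientImageStorage.

```go
package main

import (
	"errors"
	"fmt"
)

// minFreeBytes assumes "5GB" means 5 × 10^9 bytes; the package may use GiB.
const minFreeBytes int64 = 5_000_000_000

var (
	errLowDisk      = errors.New("less than 5GB of disk available")
	errImageStorage = errors.New("image storage limit reached")
)

// checkImagePull returns nil when a pull would be allowed.
// maxFraction corresponds to MAX_IMAGE_STORAGE (0.2 by default).
func checkImagePull(availableBytes, imageBytes, diskCapacity int64, maxFraction float64) error {
	if availableBytes < minFreeBytes {
		return errLowDisk
	}
	maxImageBytes := int64(float64(diskCapacity) * maxFraction)
	if imageBytes >= maxImageBytes {
		return errImageStorage
	}
	return nil
}

func main() {
	const gb = int64(1_000_000_000)
	fmt.Println(checkImagePull(100*gb, 50*gb, 500*gb, 0.2))  // well under both limits
	fmt.Println(checkImagePull(100*gb, 150*gb, 500*gb, 0.2)) // image storage over 20% of disk
	fmt.Println(checkImagePull(4*gb, 10*gb, 500*gb, 0.2))    // under 5GB free
}
```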
Network

Bidirectional rate limiting with separate download and upload controls:

Downloads (external → VM):

  • TBF (Token Bucket Filter) shaping on each TAP device egress
  • Simple per-VM caps, independent of other VMs
  • Smooth traffic shaping (queues packets, doesn't drop)

Uploads (VM → external):

  • HTB (Hierarchical Token Bucket) on bridge egress
  • Per-VM classes with guaranteed rate and burst ceiling
  • Fair sharing when VMs contend for bandwidth
  • fq_codel leaf qdisc for low latency under load

Default limits:

  • Proportional to CPU: (vcpus / cpu_capacity) * network_capacity * OVERSUB_NETWORK (2.0 by default)
  • Symmetric download/upload by default
  • Upload ceiling = 4x guaranteed rate by default (configurable via UPLOAD_BURST_MULTIPLIER)
  • Download burst bucket = 4x rate by default (configurable via DOWNLOAD_BURST_MULTIPLIER)

Capacity tracking:

  • Uses max(download, upload) per instance since they share physical link
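
A minimal sketch of that capacity-tracking rule, assuming only instances in the running, paused, or created state are counted (as noted on InstanceAllocation.State). The alloc type is a trimmed, hypothetical stand-in for InstanceAllocation, and the exact state strings are assumptions.

```go
package main

import "fmt"

// alloc is a trimmed, hypothetical stand-in for InstanceAllocation.
type alloc struct {
	State       string
	DownloadBps int64
	UploadBps   int64
}

// isActive reports whether an instance counts toward allocations.
// The literal state names are assumed, not taken from the package.
func isActive(state string) bool {
	switch state {
	case "running", "paused", "created":
		return true
	}
	return false
}

// networkAllocated sums max(download, upload) over active instances,
// since both directions share the same physical link.
func networkAllocated(allocs []alloc) int64 {
	var total int64
	for _, a := range allocs {
		if !isActive(a.State) {
			continue
		}
		if a.DownloadBps > a.UploadBps {
			total += a.DownloadBps
		} else {
			total += a.UploadBps
		}
	}
	return total
}

func main() {
	allocs := []alloc{
		{State: "running", DownloadBps: 125_000_000, UploadBps: 62_500_000},
		{State: "stopped", DownloadBps: 500_000_000, UploadBps: 500_000_000},
		{State: "paused", DownloadBps: 10_000_000, UploadBps: 40_000_000},
	}
	fmt.Println(networkAllocated(allocs))
}
```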
Disk I/O

Per-VM disk I/O rate limiting with burst support:

  • Cloud Hypervisor: Uses native RateLimiterConfig with token bucket
  • QEMU: Uses drive throttling.bps-total options
  • Default: Proportional to CPU: (vcpus / cpu_capacity) * disk_io_capacity * OVERSUB_DISK_IO (2.0 by default)
  • Burst: 4x sustained rate (allows fast cold starts)

Example: Default Limits

Host: 16-core server with 10Gbps NIC (default disk I/O = 1GB/s)

VM: 2 vCPUs (12.5% of host)

Resource              Calculation            Default Limit
Network (down/up)     10Gbps × 2.0 × 12.5%   2.5 Gbps (312.5 MB/s)
Disk I/O (sustained)  1GB/s × 2.0 × 12.5%    250 MB/s
Disk I/O (burst)      250 MB/s × 4           1 GB/s
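
The figures in this example can be reproduced with a few lines of arithmetic. The sketch below is not the package's implementation; proportionalShare is a hypothetical helper that applies the documented formula (vcpus / cpu_capacity) * capacity * oversub_ratio, with capacities expressed in bytes per second.

```go
package main

import "fmt"

// proportionalShare applies (vcpus / cpuCapacity) * capacity * oversub.
// Hypothetical helper; capacity is in bytes per second.
func proportionalShare(vcpus int, cpuCapacity, capacity int64, oversub float64) int64 {
	if cpuCapacity <= 0 {
		return 0
	}
	return int64(float64(vcpus) / float64(cpuCapacity) * float64(capacity) * oversub)
}

func main() {
	const (
		hostCPUs          = 16
		netBytesPerSec    = 1_250_000_000 // 10 Gbps expressed in bytes/sec
		diskIOBytesPerSec = 1_000_000_000 // default DISK_IO_LIMIT of 1GB/s
		burstMultiplier   = 4             // disk I/O burst is 4x sustained
	)
	netBps := proportionalShare(2, hostCPUs, netBytesPerSec, 2.0)
	ioBps := proportionalShare(2, hostCPUs, diskIOBytesPerSec, 2.0)
	fmt.Println("network bytes/sec:", netBps)
	fmt.Println("disk I/O sustained bytes/sec:", ioBps)
	fmt.Println("disk I/O burst bytes/sec:", ioBps*burstMultiplier)
}
```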

Effective Limits

The effective allocatable capacity is:

effective_limit = capacity × oversub_ratio
available = effective_limit - allocated

For example, with 64 CPUs and OVERSUB_CPU=2.0, up to 128 vCPUs can be allocated across instances.
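
The same arithmetic as a small Go sketch. The function names are hypothetical, and the admission rule in canAllocate (a request fits when it does not exceed the available amount) is an assumption about what a check like CanAllocate might do.

```go
package main

import "fmt"

// effectiveLimit is capacity × oversub_ratio.
func effectiveLimit(capacity int64, oversub float64) int64 {
	return int64(float64(capacity) * oversub)
}

// available is effective_limit - allocated.
func available(capacity int64, oversub float64, allocated int64) int64 {
	return effectiveLimit(capacity, oversub) - allocated
}

// canAllocate assumes a request fits when amount <= available.
func canAllocate(capacity int64, oversub float64, allocated, amount int64) bool {
	return amount <= available(capacity, oversub, allocated)
}

func main() {
	// 64 CPUs with OVERSUB_CPU=2.0 and 48 vCPUs already allocated.
	fmt.Println(effectiveLimit(64, 2.0))
	fmt.Println(available(64, 2.0, 48))
	fmt.Println(canAllocate(64, 2.0, 48, 16))
}
```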

API Response

{
  "cpu": {
    "capacity": 64,
    "effective_limit": 128,
    "allocated": 48,
    "available": 80,
    "oversub_ratio": 2.0
  },
  "memory": { ... },
  "disk": { ... },
  "network": { ... },
  "disk_breakdown": {
    "images_bytes": 214748364800,
    "oci_cache_bytes": 53687091200,
    "volumes_bytes": 107374182400,
    "overlays_bytes": 227633306624
  },
  "allocations": [
    {
      "instance_id": "abc123",
      "instance_name": "my-vm",
      "cpu": 4,
      "memory_bytes": 8589934592,
      "disk_bytes": 10737418240,
      "network_download_bps": 125000000,
      "network_upload_bps": 125000000
    }
  ]
}
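
A client might decode this response along the following lines. This is a sketch: the struct types are trimmed local copies that reuse the JSON field names shown above, decodeStatus is a hypothetical helper, and the sample payload is a shortened, valid-JSON version of the response rather than real server output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resourceStatus mirrors the per-resource fields in the response.
type resourceStatus struct {
	Capacity       int64   `json:"capacity"`
	EffectiveLimit int64   `json:"effective_limit"`
	Allocated      int64   `json:"allocated"`
	Available      int64   `json:"available"`
	OversubRatio   float64 `json:"oversub_ratio"`
}

// allocation mirrors a subset of the per-instance fields.
type allocation struct {
	InstanceID  string `json:"instance_id"`
	CPU         int    `json:"cpu"`
	MemoryBytes int64  `json:"memory_bytes"`
}

// fullStatus keeps only the parts of the response used in this sketch.
type fullStatus struct {
	CPU         resourceStatus `json:"cpu"`
	Allocations []allocation   `json:"allocations"`
}

// sample is a shortened version of the response above.
const sample = `{
  "cpu": {"capacity": 64, "effective_limit": 128, "allocated": 48, "available": 80, "oversub_ratio": 2.0},
  "allocations": [{"instance_id": "abc123", "cpu": 4, "memory_bytes": 8589934592}]
}`

// decodeStatus parses a GET /resources response body.
func decodeStatus(data []byte) (fullStatus, error) {
	var s fullStatus
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	s, err := decodeStatus([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("cpu: %d of %d vCPUs available\n", s.CPU.Available, s.CPU.EffectiveLimit)
	for _, a := range s.Allocations {
		fmt.Printf("%s: %d vCPUs, %d bytes memory\n", a.InstanceID, a.CPU, a.MemoryBytes)
	}
}
```

Unknown fields such as disk_breakdown are ignored by encoding/json, so the trimmed structs also decode the full response.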

Documentation

Overview

Package resources provides host resource discovery, capacity tracking, and oversubscription-aware allocation management for CPU, memory, disk, and network.

Constants

This section is empty.

Variables

This section is empty.

Functions

func ParseBandwidth

func ParseBandwidth(limit string) (int64, error)

ParseBandwidth parses a bandwidth string like "10Gbps", "1GB/s", "125MB/s". Handles both bit-based (bps) and byte-based (/s) formats. Returns bytes per second.
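
To make the two formats concrete, here is an illustrative re-implementation. It is not the package's code: parseBandwidthSketch is a hypothetical name, decimal (SI) multipliers are assumed, and any string ending in "bps" is treated as bits per second.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// prefixes maps unit prefixes to decimal multipliers (assumption: SI, not binary).
var prefixes = map[string]float64{"": 1, "k": 1e3, "m": 1e6, "g": 1e9, "t": 1e12}

// parseBandwidthSketch converts strings such as "10Gbps" or "125MB/s"
// to bytes per second. Sketch only; the real ParseBandwidth may differ.
func parseBandwidthSketch(limit string) (int64, error) {
	s := strings.ToLower(strings.TrimSpace(limit))
	divisor := 1.0
	switch {
	case strings.HasSuffix(s, "bps"): // bit-based: divide by 8
		s = strings.TrimSuffix(s, "bps")
		divisor = 8
	case strings.HasSuffix(s, "b/s"): // byte-based
		s = strings.TrimSuffix(s, "b/s")
	default:
		return 0, fmt.Errorf("unrecognized bandwidth unit in %q", limit)
	}
	// Split the numeric part from the optional k/m/g/t prefix.
	num, prefix := s, ""
	if i := strings.IndexFunc(s, func(r rune) bool {
		return (r < '0' || r > '9') && r != '.'
	}); i >= 0 {
		num, prefix = s[:i], strings.TrimSpace(s[i:])
	}
	mult, ok := prefixes[prefix]
	if !ok {
		return 0, fmt.Errorf("unrecognized unit prefix in %q", limit)
	}
	v, err := strconv.ParseFloat(num, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid number in %q: %w", limit, err)
	}
	return int64(v * mult / divisor), nil
}

func main() {
	for _, in := range []string{"10Gbps", "1GB/s", "125MB/s"} {
		v, err := parseBandwidthSketch(in)
		fmt.Println(in, v, err)
	}
}
```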

Types

type AllocationBreakdown

type AllocationBreakdown struct {
	InstanceID         string `json:"instance_id"`
	InstanceName       string `json:"instance_name"`
	CPU                int    `json:"cpu"`
	MemoryBytes        int64  `json:"memory_bytes"`
	DiskBytes          int64  `json:"disk_bytes"`
	NetworkDownloadBps int64  `json:"network_download_bps"` // External→VM
	NetworkUploadBps   int64  `json:"network_upload_bps"`   // VM→External
	DiskIOBps          int64  `json:"disk_io_bps"`          // Disk I/O bandwidth
}

AllocationBreakdown shows per-instance resource allocations.

type CPUResource

type CPUResource struct {
	// contains filtered or unexported fields
}

CPUResource implements Resource for CPU discovery and tracking.

func NewCPUResource

func NewCPUResource() (*CPUResource, error)

NewCPUResource discovers host CPU capacity.

func (*CPUResource) Allocated

func (c *CPUResource) Allocated(ctx context.Context) (int64, error)

Allocated returns the total vCPUs allocated to running instances.

func (*CPUResource) Capacity

func (c *CPUResource) Capacity() int64

Capacity returns the total number of vCPUs available on the host.

func (*CPUResource) SetInstanceLister

func (c *CPUResource) SetInstanceLister(lister InstanceLister)

SetInstanceLister sets the instance lister for allocation calculations.

func (*CPUResource) Type

func (c *CPUResource) Type() ResourceType

Type returns the resource type.

type DiskBreakdown

type DiskBreakdown struct {
	Images   int64 `json:"images_bytes"`    // Exported rootfs disk files
	OCICache int64 `json:"oci_cache_bytes"` // OCI layer cache (shared blobs)
	Volumes  int64 `json:"volumes_bytes"`
	Overlays int64 `json:"overlays_bytes"` // Rootfs overlays + volume overlays
}

DiskBreakdown shows disk usage by category.

type DiskIOResource added in v0.0.6

type DiskIOResource struct {
	// contains filtered or unexported fields
}

DiskIOResource implements Resource for disk I/O bandwidth tracking.

func NewDiskIOResource added in v0.0.6

func NewDiskIOResource(capacity int64, instLister InstanceLister) *DiskIOResource

NewDiskIOResource creates a disk I/O resource with the given capacity.

func (*DiskIOResource) Allocated added in v0.0.6

func (d *DiskIOResource) Allocated(ctx context.Context) (int64, error)

Allocated returns total disk I/O allocated across all active instances.

func (*DiskIOResource) Capacity added in v0.0.6

func (d *DiskIOResource) Capacity() int64

Capacity returns the total disk I/O capacity in bytes per second.

func (*DiskIOResource) Type added in v0.0.6

func (d *DiskIOResource) Type() ResourceType

Type returns the resource type.

type DiskResource

type DiskResource struct {
	// contains filtered or unexported fields
}

DiskResource implements Resource for disk space discovery and tracking.

func NewDiskResource

func NewDiskResource(cfg *config.Config, p *paths.Paths, instLister InstanceLister, imgLister ImageLister, volLister VolumeLister) (*DiskResource, error)

NewDiskResource discovers disk capacity for the data directory. If cfg.Capacity.Disk is set, uses that as capacity; otherwise auto-detects via statfs.

func (*DiskResource) Allocated

func (d *DiskResource) Allocated(ctx context.Context) (int64, error)

Allocated returns currently allocated disk space.

func (*DiskResource) Capacity

func (d *DiskResource) Capacity() int64

Capacity returns the disk capacity in bytes.

func (*DiskResource) GetBreakdown

func (d *DiskResource) GetBreakdown(ctx context.Context) (*DiskBreakdown, error)

GetBreakdown returns disk usage broken down by category.

func (*DiskResource) Type

func (d *DiskResource) Type() ResourceType

Type returns the resource type.

type FullResourceStatus

type FullResourceStatus struct {
	CPU         ResourceStatus        `json:"cpu"`
	Memory      ResourceStatus        `json:"memory"`
	Disk        ResourceStatus        `json:"disk"`
	Network     ResourceStatus        `json:"network"`
	DiskIO      ResourceStatus        `json:"disk_io"`
	DiskDetail  *DiskBreakdown        `json:"disk_breakdown,omitempty"`
	GPU         *GPUResourceStatus    `json:"gpu,omitempty"` // nil if no GPU available
	Allocations []AllocationBreakdown `json:"allocations"`
}

FullResourceStatus is the complete resource status for the API response.

type GPUResourceStatus added in v0.0.5

type GPUResourceStatus struct {
	Mode       string                      `json:"mode"`               // "vgpu" or "passthrough"
	TotalSlots int                         `json:"total_slots"`        // VFs for vGPU, physical GPUs for passthrough
	UsedSlots  int                         `json:"used_slots"`         // Slots currently in use
	Profiles   []devices.GPUProfile        `json:"profiles,omitempty"` // vGPU mode only
	Devices    []devices.PassthroughDevice `json:"devices,omitempty"`  // passthrough mode only
}

GPUResourceStatus represents the GPU resource status for the API response. It is nil if no GPU is available on the host.

func GetGPUStatus added in v0.0.5

func GetGPUStatus() *GPUResourceStatus

GetGPUStatus returns the current GPU resource status. Returns nil if no GPU is available or the mode is "none".

type ImageLister

type ImageLister interface {
	// TotalImageBytes returns the total size of all images on disk.
	TotalImageBytes(ctx context.Context) (int64, error)
	// TotalOCICacheBytes returns the total size of the OCI layer cache.
	TotalOCICacheBytes(ctx context.Context) (int64, error)
}

ImageLister provides access to image sizes for disk calculations.

type InstanceAllocation

type InstanceAllocation struct {
	ID                 string
	Name               string
	Vcpus              int
	MemoryBytes        int64  // Size + HotplugSize
	OverlayBytes       int64  // Rootfs overlay size
	VolumeOverlayBytes int64  // Sum of volume overlay sizes
	NetworkDownloadBps int64  // Download rate limit (external→VM)
	NetworkUploadBps   int64  // Upload rate limit (VM→external)
	DiskIOBps          int64  // Disk I/O rate limit (bytes/sec)
	State              string // Only count running/paused/created instances
	VolumeBytes        int64  // Sum of attached volume base sizes (for per-instance reporting)
}

InstanceAllocation represents the resources allocated to a single instance.

type InstanceLister

type InstanceLister interface {
	// ListInstanceAllocations returns resource allocations for all instances.
	ListInstanceAllocations(ctx context.Context) ([]InstanceAllocation, error)
}

InstanceLister provides access to instance data for allocation calculations.
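
For tests or experiments, the interface can be satisfied by a fixed in-memory list. The sketch below uses trimmed local copies of InstanceAllocation and InstanceLister so that it compiles on its own; staticLister and allocatedVcpus are hypothetical names, and the state strings are assumptions based on the State field comment.

```go
package main

import (
	"context"
	"fmt"
)

// InstanceAllocation is a trimmed local copy of the package type.
type InstanceAllocation struct {
	ID    string
	Vcpus int
	State string
}

// InstanceLister is a local copy of the package interface.
type InstanceLister interface {
	ListInstanceAllocations(ctx context.Context) ([]InstanceAllocation, error)
}

// staticLister is a hypothetical lister backed by a fixed slice.
type staticLister []InstanceAllocation

func (s staticLister) ListInstanceAllocations(ctx context.Context) ([]InstanceAllocation, error) {
	return s, nil
}

// allocatedVcpus sums vCPUs over running, paused, and created instances.
func allocatedVcpus(ctx context.Context, l InstanceLister) (int64, error) {
	allocs, err := l.ListInstanceAllocations(ctx)
	if err != nil {
		return 0, err
	}
	var total int64
	for _, a := range allocs {
		switch a.State {
		case "running", "paused", "created":
			total += int64(a.Vcpus)
		}
	}
	return total, nil
}

func main() {
	lister := staticLister{
		{ID: "a", Vcpus: 4, State: "running"},
		{ID: "b", Vcpus: 2, State: "stopped"},
		{ID: "c", Vcpus: 8, State: "created"},
	}
	total, err := allocatedVcpus(context.Background(), lister)
	fmt.Println(total, err)
}
```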

type InstanceUtilizationInfo added in v0.0.6

type InstanceUtilizationInfo struct {
	ID            string
	Name          string
	HypervisorPID *int   // PID of the hypervisor process
	TAPDevice     string // Name of the TAP device (e.g., "hype-01234567")

	// Allocated resources (for computing utilization ratios)
	AllocatedVcpus       int   // Number of allocated vCPUs
	AllocatedMemoryBytes int64 // Allocated memory in bytes (Size + HotplugSize)
}

InstanceUtilizationInfo contains the minimal info needed to collect VM utilization metrics. Used by the vm_metrics package via an adapter.

type Manager

type Manager struct {
	// contains filtered or unexported fields
}

Manager coordinates resource discovery and allocation tracking.

func NewManager

func NewManager(cfg *config.Config, p *paths.Paths) *Manager

NewManager creates a new resource manager.

func (*Manager) CPUCapacity

func (m *Manager) CPUCapacity() int64

CPUCapacity returns the raw CPU capacity (number of vCPUs).

func (*Manager) CanAllocate

func (m *Manager) CanAllocate(ctx context.Context, rt ResourceType, amount int64) (bool, error)

CanAllocate checks if the requested amount can be allocated for a resource type.

func (*Manager) CurrentImageStorageBytes

func (m *Manager) CurrentImageStorageBytes(ctx context.Context) (int64, error)

CurrentImageStorageBytes returns the current image storage usage (OCI cache + rootfs).

func (*Manager) DefaultDiskIOBandwidth

func (m *Manager) DefaultDiskIOBandwidth(vcpus int) (ioBps, burstBps int64)

DefaultDiskIOBandwidth calculates the default disk I/O bandwidth for an instance in proportion to its share of host CPU capacity. Formula: (instanceVcpus / hostCpuCapacity) * diskIOCapacity * oversubRatio. Returns the sustained rate and the burst rate (4x sustained).

func (*Manager) DefaultNetworkBandwidth

func (m *Manager) DefaultNetworkBandwidth(vcpus int) (downloadBps, uploadBps int64)

DefaultNetworkBandwidth calculates the default network bandwidth for an instance in proportion to its share of host CPU capacity. Formula: (instanceVcpus / hostCpuCapacity) * networkCapacity * oversubRatio. Returns symmetric download/upload limits.

func (*Manager) DiskIOCapacity

func (m *Manager) DiskIOCapacity() int64

DiskIOCapacity returns the disk I/O capacity in bytes/sec. Uses configured DISK_IO_LIMIT if set, otherwise defaults to 1 GB/s.

func (*Manager) GetFullStatus

func (m *Manager) GetFullStatus(ctx context.Context) (*FullResourceStatus, error)

GetFullStatus returns the complete resource status for all resource types.

func (*Manager) GetOversubRatio

func (m *Manager) GetOversubRatio(rt ResourceType) float64

GetOversubRatio returns the oversubscription ratio for a resource type. Returns 1.0 (no oversubscription) if the config value is not set or <= 0.

func (*Manager) GetStatus

func (m *Manager) GetStatus(ctx context.Context, rt ResourceType) (*ResourceStatus, error)

GetStatus returns the current status of a specific resource type.

func (*Manager) HasSufficientDiskForPull

func (m *Manager) HasSufficientDiskForPull(ctx context.Context) error

HasSufficientDiskForPull checks if there's enough disk space for an image pull. Returns an error if available disk is below the minimum threshold (5GB).

func (*Manager) HasSufficientImageStorage

func (m *Manager) HasSufficientImageStorage(ctx context.Context) error

HasSufficientImageStorage checks if pulling another image would exceed the image storage limit. Returns an error if current image storage >= max allowed.

func (*Manager) Initialize

func (m *Manager) Initialize(ctx context.Context) error

Initialize discovers host resources and registers them. Must be called after setting listers and before using the manager.

func (*Manager) MaxImageStorageBytes

func (m *Manager) MaxImageStorageBytes() int64

MaxImageStorageBytes returns the maximum allowed image storage (OCI cache + rootfs). Based on MaxImageStorage fraction (default 20%) of disk capacity.

func (*Manager) NetworkCapacity

func (m *Manager) NetworkCapacity() int64

NetworkCapacity returns the raw network capacity in bytes/sec.

func (*Manager) SetImageLister

func (m *Manager) SetImageLister(lister ImageLister)

SetImageLister sets the image lister for disk calculations.

func (*Manager) SetInstanceLister

func (m *Manager) SetInstanceLister(lister InstanceLister)

SetInstanceLister sets the instance lister for allocation calculations.

func (*Manager) SetVolumeLister

func (m *Manager) SetVolumeLister(lister VolumeLister)

SetVolumeLister sets the volume lister for disk calculations.

func (*Manager) ValidateAllocation added in v0.0.6

func (m *Manager) ValidateAllocation(ctx context.Context, vcpus int, memoryBytes int64, networkDownloadBps int64, networkUploadBps int64, diskIOBps int64, needsGPU bool) error

ValidateAllocation checks if the requested resources can be allocated. Returns nil if allocation is allowed, or a detailed error describing which resource is insufficient and the current capacity/usage. Parameters match instances.AllocationRequest to implement instances.ResourceValidator.

type MemoryResource

type MemoryResource struct {
	// contains filtered or unexported fields
}

MemoryResource implements Resource for memory discovery and tracking.

func NewMemoryResource

func NewMemoryResource() (*MemoryResource, error)

NewMemoryResource discovers host memory capacity.

func (*MemoryResource) Allocated

func (m *MemoryResource) Allocated(ctx context.Context) (int64, error)

Allocated returns the total memory allocated to running instances.

func (*MemoryResource) Capacity

func (m *MemoryResource) Capacity() int64

Capacity returns the total memory in bytes available on the host.

func (*MemoryResource) SetInstanceLister

func (m *MemoryResource) SetInstanceLister(lister InstanceLister)

SetInstanceLister sets the instance lister for allocation calculations.

func (*MemoryResource) Type

func (m *MemoryResource) Type() ResourceType

Type returns the resource type.

type NetworkResource

type NetworkResource struct {
	// contains filtered or unexported fields
}

NetworkResource implements Resource for network bandwidth discovery and tracking.

func NewNetworkResource

func NewNetworkResource(ctx context.Context, cfg *config.Config, instLister InstanceLister) (*NetworkResource, error)

NewNetworkResource discovers network capacity. If cfg.Capacity.Network is set, uses that; otherwise auto-detects from uplink interface.

func (*NetworkResource) Allocated

func (n *NetworkResource) Allocated(ctx context.Context) (int64, error)

Allocated returns total network bandwidth allocated to running instances. Uses the max of download/upload per instance since they share the physical link.

func (*NetworkResource) Capacity

func (n *NetworkResource) Capacity() int64

Capacity returns the network capacity in bytes per second.

func (*NetworkResource) Type

func (n *NetworkResource) Type() ResourceType

Type returns the resource type.

type Resource

type Resource interface {
	// Type returns the resource type identifier.
	Type() ResourceType

	// Capacity returns the raw host capacity (before oversubscription).
	Capacity() int64

	// Allocated returns current total allocation across all instances.
	Allocated(ctx context.Context) (int64, error)
}

Resource represents a discoverable and allocatable host resource.
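
As an illustration of the interface, the sketch below implements it with constant values and derives a one-line summary from the documented effective-limit arithmetic. ResourceType and Resource are local copies so the example compiles on its own; fixedResource and summarize are hypothetical and not part of this package.

```go
package main

import (
	"context"
	"fmt"
)

// ResourceType and Resource are local copies of the package definitions.
type ResourceType string

type Resource interface {
	Type() ResourceType
	Capacity() int64
	Allocated(ctx context.Context) (int64, error)
}

// fixedResource is a hypothetical Resource with constant values.
type fixedResource struct {
	kind      ResourceType
	capacity  int64
	allocated int64
}

func (f fixedResource) Type() ResourceType { return f.kind }
func (f fixedResource) Capacity() int64    { return f.capacity }
func (f fixedResource) Allocated(ctx context.Context) (int64, error) {
	return f.allocated, nil
}

// summarize applies effective_limit = capacity × oversub_ratio and
// available = effective_limit - allocated to any Resource.
func summarize(ctx context.Context, r Resource, oversub float64) (string, error) {
	allocated, err := r.Allocated(ctx)
	if err != nil {
		return "", err
	}
	limit := int64(float64(r.Capacity()) * oversub)
	return fmt.Sprintf("%s: %d/%d allocated, %d available",
		r.Type(), allocated, limit, limit-allocated), nil
}

func main() {
	cpu := fixedResource{kind: "cpu", capacity: 64, allocated: 48}
	line, err := summarize(context.Background(), cpu, 2.0)
	fmt.Println(line, err)
}
```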

type ResourceStatus

type ResourceStatus struct {
	Type           ResourceType `json:"type"`
	Capacity       int64        `json:"capacity"`         // Raw host capacity
	EffectiveLimit int64        `json:"effective_limit"`  // Capacity * oversubscription ratio
	Allocated      int64        `json:"allocated"`        // Currently allocated
	Available      int64        `json:"available"`        // EffectiveLimit - Allocated
	OversubRatio   float64      `json:"oversub_ratio"`    // Oversubscription ratio applied
	Source         SourceType   `json:"source,omitempty"` // How capacity was determined
}

ResourceStatus represents the current state of a resource type.

type ResourceType

type ResourceType string

ResourceType identifies a type of host resource.

const (
	ResourceCPU     ResourceType = "cpu"
	ResourceMemory  ResourceType = "memory"
	ResourceDisk    ResourceType = "disk"
	ResourceNetwork ResourceType = "network"
	ResourceDiskIO  ResourceType = "disk_io"
)

type SourceType

type SourceType string

SourceType identifies how a resource capacity was determined.

const (
	SourceDetected   SourceType = "detected"   // Auto-detected from host hardware
	SourceConfigured SourceType = "configured" // Explicitly configured by operator
)

type VolumeLister

type VolumeLister interface {
	// TotalVolumeBytes returns the total size of all volumes.
	TotalVolumeBytes(ctx context.Context) (int64, error)
}

VolumeLister provides access to volume sizes for disk calculations.
