gpuallocator

package
v1.34.0
Warning: This package is not in the latest version of its module.
Published: Jun 11, 2025 License: Apache-2.0 Imports: 20 Imported by: 0

Documentation

Overview

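Package gpuallocator handles GPU allocation: it selects GPUs from a pool for a workload according to a placement strategy, and releases them when the workload no longer needs them.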

Constants

This section is empty.

Variables

This section is empty.

Functions

func RefreshGPUNodeCapacity added in v1.33.4

func RefreshGPUNodeCapacity(ctx context.Context, k8sClient client.Client, node *tfv1.GPUNode, pool *tfv1.GPUPool) ([]string, error)

Types

type AllocRequest added in v1.34.0

type AllocRequest struct {
	// Name of the GPU pool to allocate from
	PoolName string
	// Name and namespace of the workload
	WorkloadNameNamespace tfv1.NameNamespace
	// Resource requirements for the allocation
	Request tfv1.Resource
	// Number of GPUs to allocate
	Count uint
	// Specific GPU model to allocate, empty string means any model
	GPUModel string
}

AllocRequest encapsulates all parameters needed for a GPU allocation.

type CompactFirst

type CompactFirst struct{}

CompactFirst selects the GPUs with the least available resources (the most utilized ones) in order to pack workloads tightly and maximize GPU utilization.

func (CompactFirst) SelectGPUs

func (c CompactFirst) SelectGPUs(gpus []tfv1.GPU, count uint) ([]*tfv1.GPU, error)

SelectGPUs selects count GPUs from the same node, preferring those with the least available resources (most packed).
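The compact-first idea can be sketched on simplified types. This is only an illustration, not the package's implementation: the `gpu` struct is a hypothetical stand-in for `tfv1.GPU`, `selectCompact` is an invented helper, and ranking nodes by the summed availability of their picked GPUs is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// gpu is a hypothetical stand-in for tfv1.GPU, reduced to what this sketch needs.
type gpu struct {
	Name  string
	Node  string
	Avail int64 // available capacity in an illustrative unit
}

// selectCompact returns count GPUs that share a node, preferring the GPUs
// with the least available capacity. Nodes are compared by the summed
// availability of their picked GPUs (an assumption of this sketch).
func selectCompact(gpus []gpu, count int) ([]gpu, error) {
	if count <= 0 {
		return nil, fmt.Errorf("count must be positive, got %d", count)
	}
	// Group GPUs by the node they belong to.
	byNode := make(map[string][]gpu)
	for _, g := range gpus {
		byNode[g.Node] = append(byNode[g.Node], g)
	}
	// Visit nodes in name order so that ties resolve deterministically.
	nodes := make([]string, 0, len(byNode))
	for n := range byNode {
		nodes = append(nodes, n)
	}
	sort.Strings(nodes)

	var best []gpu
	var bestSum int64
	for _, n := range nodes {
		group := byNode[n]
		if len(group) < count {
			continue // this node cannot satisfy the request on its own
		}
		// Least available capacity first.
		sort.SliceStable(group, func(i, j int) bool { return group[i].Avail < group[j].Avail })
		var sum int64
		for _, g := range group[:count] {
			sum += g.Avail
		}
		if best == nil || sum < bestSum {
			best, bestSum = group[:count], sum
		}
	}
	if best == nil {
		return nil, fmt.Errorf("no single node has %d GPUs", count)
	}
	return best, nil
}

func main() {
	gpus := []gpu{
		{"a0", "node-a", 10}, {"a1", "node-a", 40},
		{"b0", "node-b", 5}, {"b1", "node-b", 30}, {"b2", "node-b", 8},
	}
	picked, err := selectCompact(gpus, 2)
	if err != nil {
		panic(err)
	}
	for _, g := range picked {
		fmt.Println(g.Node, g.Name, g.Avail)
	}
}
```

With this input the sketch picks `b0` and `b2` on `node-b`, since they are the two most heavily used GPUs that share a node.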

type GpuAllocator

type GpuAllocator struct {
	client.Client
	// contains filtered or unexported fields
}

func NewGpuAllocator

func NewGpuAllocator(ctx context.Context, client client.Client, syncInterval time.Duration) *GpuAllocator

func (*GpuAllocator) Alloc

func (s *GpuAllocator) Alloc(ctx context.Context, req AllocRequest) ([]*tfv1.GPU, error)

Alloc allocates the request to one or more GPUs on the same node.

func (*GpuAllocator) Dealloc

func (s *GpuAllocator) Dealloc(ctx context.Context, workloadNameNamespace tfv1.NameNamespace, request tfv1.Resource, gpus []types.NamespacedName) error

Dealloc removes a request from the given GPUs, releasing the resources it held so they become available again.
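Alloc and Dealloc are symmetric: one subtracts a request from each GPU's available resources, the other adds it back. A minimal sketch of that bookkeeping follows. The `resource` and `ledger` types are hypothetical stand-ins (the real code works on `tfv1.Resource` and `tfv1.GPU` objects through a Kubernetes client), and the all-or-nothing check is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// resource is a hypothetical stand-in for tfv1.Resource.
type resource struct {
	Tflops  int64
	VRAMMiB int64
}

// ledger tracks the available capacity of each GPU by name (hypothetical).
type ledger struct {
	mu    sync.Mutex
	avail map[string]resource
}

// alloc subtracts req from every named GPU, or changes nothing on error.
func (l *ledger) alloc(names []string, req resource) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	// Validate all GPUs first so that the update is all-or-nothing.
	for _, n := range names {
		a, ok := l.avail[n]
		if !ok {
			return fmt.Errorf("unknown GPU %q", n)
		}
		if a.Tflops < req.Tflops || a.VRAMMiB < req.VRAMMiB {
			return fmt.Errorf("GPU %q has insufficient capacity", n)
		}
	}
	for _, n := range names {
		a := l.avail[n]
		a.Tflops -= req.Tflops
		a.VRAMMiB -= req.VRAMMiB
		l.avail[n] = a
	}
	return nil
}

// dealloc adds req back to every named GPU.
func (l *ledger) dealloc(names []string, req resource) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	for _, n := range names {
		if _, ok := l.avail[n]; !ok {
			return fmt.Errorf("unknown GPU %q", n)
		}
	}
	for _, n := range names {
		a := l.avail[n]
		a.Tflops += req.Tflops
		a.VRAMMiB += req.VRAMMiB
		l.avail[n] = a
	}
	return nil
}

func main() {
	l := &ledger{avail: map[string]resource{"gpu-0": {Tflops: 100, VRAMMiB: 24576}}}
	req := resource{Tflops: 30, VRAMMiB: 8192}
	if err := l.alloc([]string{"gpu-0"}, req); err != nil {
		panic(err)
	}
	fmt.Printf("after alloc: %+v\n", l.avail["gpu-0"])
	if err := l.dealloc([]string{"gpu-0"}, req); err != nil {
		panic(err)
	}
	fmt.Printf("after dealloc: %+v\n", l.avail["gpu-0"])
}
```

The caller passes the same request to both calls, which matches the Dealloc signature above taking the original `request` alongside the GPU names.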

func (*GpuAllocator) SetupWithManager

func (s *GpuAllocator) SetupWithManager(ctx context.Context, mgr manager.Manager) (<-chan struct{}, error)

SetupWithManager sets up the GpuAllocator with the Manager.

func (*GpuAllocator) Stop

func (s *GpuAllocator) Stop()

Stop stops all background goroutines.
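The documentation does not say which goroutines the allocator runs. Since NewGpuAllocator takes a `syncInterval`, a periodic sync loop is a plausible guess. The sketch below shows one common way to start and stop such a loop. The `syncer` type and everything in it are hypothetical, not the package's code.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// syncer is a hypothetical component that runs a periodic background loop.
type syncer struct {
	cancel context.CancelFunc
	wg     sync.WaitGroup
	runs   atomic.Int64 // number of completed sync rounds
}

// newSyncer starts the loop; it ticks every interval until stop is called
// or the parent context is cancelled.
func newSyncer(parent context.Context, interval time.Duration) *syncer {
	ctx, cancel := context.WithCancel(parent)
	s := &syncer{cancel: cancel}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				s.runs.Add(1) // placeholder for the real sync work
			}
		}
	}()
	return s
}

// stop cancels the loop and blocks until the goroutine has exited.
// It is safe to call more than once.
func (s *syncer) stop() {
	s.cancel()
	s.wg.Wait()
}

// runDemo reports the run counter right after stop and again a little
// later, to show that no work happens once stop has returned.
func runDemo() (atStop, later int64) {
	s := newSyncer(context.Background(), 5*time.Millisecond)
	time.Sleep(60 * time.Millisecond)
	s.stop()
	atStop = s.runs.Load()
	time.Sleep(30 * time.Millisecond)
	later = s.runs.Load()
	return atStop, later
}

func main() {
	atStop, later := runDemo()
	fmt.Println("runs at stop:", atStop, "runs later:", later)
}
```

Waiting on the `sync.WaitGroup` inside `stop` is what guarantees that the goroutine has finished, rather than merely been asked to finish, by the time the call returns.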

type LowLoadFirst

type LowLoadFirst struct{}

LowLoadFirst selects the GPUs with the most available resources (the least utilized ones) in order to distribute workloads more evenly across GPUs.

func (LowLoadFirst) SelectGPUs

func (l LowLoadFirst) SelectGPUs(gpus []tfv1.GPU, count uint) ([]*tfv1.GPU, error)

SelectGPUs selects count GPUs from the same node, preferring those with the most available resources (least loaded).

type Strategy

type Strategy interface {
	SelectGPUs(gpus []tfv1.GPU, count uint) ([]*tfv1.GPU, error)
}

func NewStrategy

func NewStrategy(placementMode tfv1.PlacementMode) Strategy

NewStrategy creates a Strategy for the given placement mode.
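A factory of this shape can be sketched as follows, using simplified stand-ins for the package's types. The mode values, the `candidate` type, the `order` method and the fallback to compact-first for unrecognized modes are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// placementMode is a hypothetical stand-in for tfv1.PlacementMode.
type placementMode string

const (
	modeCompactFirst placementMode = "CompactFirst" // assumed value
	modeLowLoadFirst placementMode = "LowLoadFirst" // assumed value
)

// candidate is a hypothetical, simplified GPU record.
type candidate struct {
	Name  string
	Avail int64 // available capacity in an illustrative unit
}

// strategy is a simplified analogue of the Strategy interface: it only
// orders candidates by preference instead of selecting across nodes.
type strategy interface {
	order(c []candidate) []candidate
}

type compactFirst struct{}
type lowLoadFirst struct{}

// order returns a copy sorted by ascending availability (most packed first).
func (compactFirst) order(c []candidate) []candidate {
	out := append([]candidate(nil), c...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Avail < out[j].Avail })
	return out
}

// order returns a copy sorted by descending availability (least loaded first).
func (lowLoadFirst) order(c []candidate) []candidate {
	out := append([]candidate(nil), c...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Avail > out[j].Avail })
	return out
}

// newStrategy maps a placement mode to a strategy. Falling back to
// compactFirst for unknown modes is an assumption of this sketch.
func newStrategy(m placementMode) strategy {
	switch m {
	case modeLowLoadFirst:
		return lowLoadFirst{}
	default:
		return compactFirst{}
	}
}

func main() {
	gpus := []candidate{{"g0", 20}, {"g1", 5}, {"g2", 60}}
	for _, m := range []placementMode{modeCompactFirst, modeLowLoadFirst} {
		fmt.Println(m, newStrategy(m).order(gpus))
	}
}
```

Because callers depend only on the interface, a further placement mode could be supported by adding one type and one `case` without touching the allocation code.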
