gpu-operator

Version: v0.0.0-...-69ba9c3
Published: Feb 20, 2026 License: Apache-2.0

NVIDIA GPU Operator

Kubernetes provides access to special hardware resources such as NVIDIA GPUs, NICs, InfiniBand adapters, and other devices through the device plugin framework. However, configuring and managing nodes with these resources requires setting up multiple software components, such as drivers, container runtimes, and other libraries, which is difficult and error-prone. The NVIDIA GPU Operator uses the operator framework within Kubernetes to automate the management of all NVIDIA software components needed to provision GPUs. These components include the NVIDIA drivers (to enable CUDA), the Kubernetes device plugin for GPUs, the NVIDIA Container Runtime, automatic node labelling, DCGM-based monitoring, and others.

Audience and Use-Cases

The GPU Operator allows administrators of Kubernetes clusters to manage GPU nodes just like CPU nodes in the cluster. Instead of provisioning a special OS image for GPU nodes, administrators can use a standard OS image for both CPU and GPU nodes and let the GPU Operator provision the software components that GPUs require.

The GPU Operator is especially useful when a Kubernetes cluster needs to scale quickly, for example when provisioning additional GPU nodes in the cloud or on-prem and managing the lifecycle of the underlying software components. Because the GPU Operator runs everything, including the NVIDIA drivers, as containers, administrators can swap components simply by starting or stopping containers.

Quick Start

This section provides a quick guide for deploying the GPU Operator with the data center driver.

Make sure your Kubernetes cluster meets the prerequisites and that your platform is listed on the platform support page.

Step 1: Add the NVIDIA Helm repository

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
    && helm repo update

Step 2: Deploy GPU Operator

helm install --wait --generate-name \
    -n gpu-operator --create-namespace \
    nvidia/gpu-operator

After installation, the GPU Operator and its operands should be up and running.
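One way to check that claim is to list the pods in the gpu-operator namespace created in Step 2 and flag any that have not settled. The following is a minimal sketch, not part of the official instructions: the helper name not_ready_pods is hypothetical, and it assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...) and that one-shot pods may legitimately end in the Completed state.

```shell
# not_ready_pods (hypothetical helper): reads `kubectl get pods --no-headers`
# output on stdin and prints the name of every pod whose STATUS column
# (third field) is neither Running nor Completed.
not_ready_pods() {
    awk 'NF >= 3 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage against a live cluster (namespace taken from the install command):
#   kubectl get pods -n gpu-operator --no-headers | not_ready_pods
# Empty output suggests that every pod is either running or has completed.
```

Filtering on the STATUS column keeps the check independent of the generated release name, which varies because the install command uses --generate-name.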

Note: To deploy the GPU Operator on OpenShift, follow the instructions in the official documentation.

Product Documentation

For information on platform support and getting started, visit the official documentation repository.

Roadmap

  • Support the latest NVIDIA Data Center GPUs, systems, and drivers.
  • Support RHEL 10.
  • Support KubeVirt with Ubuntu 24.04.
  • Promote the NVIDIADriver CRD to General Availability (GA).
  • Integrate NVIDIA’s DRA Driver for GPUs as a managed component of the GPU Operator.

Webinar

How to easily use GPUs on Kubernetes

Contributions

Read the document on contributions. You can contribute by opening a pull request.

Support and Getting Help

Please open an issue on the GitHub project for any questions. Your feedback is appreciated.

Directories

Path: Synopsis

api
    nvidia/v1: Package v1 contains API Schema definitions for the clusterpolicy v1 API group +kubebuilder:object:generate=true +groupName=nvidia.com
    nvidia/v1alpha1: Package v1alpha1 contains API Schema definitions for the nvidia v1alpha1 API group +kubebuilder:object:generate=true +groupName=nvidia.com
    versioned/fake: This package has the automatically generated fake clientset.
    versioned/scheme: This package contains the scheme of the automatically generated clientset.
    versioned/typed/nvidia/v1: This package has the automatically generated typed clients.
    versioned/typed/nvidia/v1/fake: Package fake has the automatically generated clients.
    versioned/typed/nvidia/v1alpha1: This package has the automatically generated typed clients.
    versioned/typed/nvidia/v1alpha1/fake: Package fake has the automatically generated clients.
cmd
    gpu-operator (command)
    gpuop-cfg (command)
    manage-crds (command)
internal
tests
    e2e
    e2e/framework: Package framework contains provider-independent helper code for building and running E2E tests with Ginkgo.
