nomad-driver-virt

command module
v0.0.2-beta.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 19, 2026 License: MPL-2.0 Imports: 3 Imported by: 0

README

Nomad Virt Driver

The virt driver task plugin expands the types of workloads Nomad can run to add virtual machines. Currently leveraging the power of libvirt, the virt driver allows users to define virtual machine tasks using the Nomad job spec.

IMPORTANT: This plugin is in tech preview, still under active development, there might be breaking changes in future releases

Features

  • Use the job's task.config to define the virtual machine (VM).
  • Start/stop virtual machines.
  • Nomad runtime environment is populated.
  • Use Nomad alloc data in the virtual machine.
  • Publish ports.
  • Monitor the memory consumption.
  • Monitor CPU usage.
  • Task config cpu value is used to populate virtual machine CpuShares.
  • The tasks task, alloc, and secrets directories are mounted within the VM at the filesystem root. These are currently mounted read-only to prevent excessive amounts of data being written to the host filesystem. Please see the filesystem concepts page for more detail about an allocations working directory.

Ubuntu Example job

Here is a simple Python server on Ubuntu example:

job "python-server" {

  group "virt-group" {
    count = 1

    network {
      mode = "host"
      port "http" {
        to = 8000
      }
    }

    task "virt-task" {

      driver = "virt"

      artifact {
        source      = "http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img"
        destination = "local/focal-server-cloudimg-amd64.img"
        mode        = "file"
      }

      config {
        default_user_password = "password"
        cmds                  = ["python3 -m http.server 8000"]

        disk {
          size = "10GiB"
          source {
            image = "local/focal-server-cloudimg-amd64.img"
          }
        }

        network_interface {
          bridge {
            name  = "virbr0"
            ports = ["http"]
          }
        }
      }

      resources {
        cores  = 2
        memory = 4000
      }
    }
  }
}
$ nomad job run examples/python.nomad.hcl

==> 2024-09-10T13:01:22+02:00: Monitoring evaluation "c0424142"
    2024-09-10T13:01:22+02:00: Evaluation triggered by job "python-server"
    2024-09-10T13:01:23+02:00: Evaluation within deployment: "d546f16e"
    2024-09-10T13:01:23+02:00: Allocation "db146826" created: node "c20ee15a", group "virt-group"
    2024-09-10T13:01:23+02:00: Evaluation status changed: "pending" -> "complete"
==> 2024-09-10T13:01:23+02:00: Evaluation "c0424142" finished with status "complete"

$ virsh list

 Id   Name                 State
------------------------------------
 4    virt-task-5a6e215e   running

Building The Driver from source

In order to build the plugin binary some development libraries are required:

  • libvirt
  • librbd

For Debian/Ubuntu based systems:

apt install libvirt-dev librbd-dev

For RHEL based systems:

dnf install libvirt-devel librbd-devel

To build the plugin:

git clone git@github.com:hashicorp/nomad-driver-virt
cd nomad-driver-virt
make dev

The compiled binary will be located at ./build/nomad-driver-virt.

Runtime dependencies

Make sure the node where the client will run supports virtualization, in Linux you can do it in a couple of ways:

  1. Reading the CPU flags:
egrep -o '(vmx|svm)' /proc/cpuinfo
  1. Reading the kernel modules and looking for the virtualization ones:
lsmod | grep -E '(kvm_intel|kvm_amd)'

If the result is empty for either call, the machine does not support virtualization and the nomad client wont be able to run any virtualization workload.

  1. Verify permissions: Nomad runs as root, add the user root and the group root to the QEMU configuration to allow it to execute the workloads. Remember to start the libvirtd daemon if not started yet or to restart it after adding the qemu user/group configuration:
systemctl start libvirtd

or

systemctl restart libvirtd

Ensure that Nomad can find the plugin, see plugin_dir

Driver Configuration

  • image_paths - Host paths containing image files allowed to be used by tasks.
  • provider - Named block containing provider configuration. Defaults to libvirt.
  • storage_pools - Block containing storage pool configuration.
Provider - libvirt
  • password - The libvirt password to use for authentication.
  • uri - The libvirt driver to use. Defaults to qemu:///system.
  • user - The libvirt user to use for authentication.
Storage pools

Storage pools contain volumes which are created for, and attached to, task VMs. Two types of storage pools are supported by the driver: directory and Ceph. Directory storage pools are host local storage pools with volumes stored at a specified path. Ceph storage pools are RBD based volumes stored in Ceph.

A default storage pool must be assigned. If the configuration only defines a single storage pool, that storage pool is automatically the default.

  • ceph - Named block containing Ceph based storage pool configuration.
  • default - Name of the default storage pool. If only one storage pool is defined, it is automatically the default.
  • directory - Named block containing directory based storage pool configuration.
Storage pool - directory
  • path - Host path to contain the pool volumes.
Storage pool - ceph
  • authentication - Block containing authentication configuration.
    • username - Ceph client name .
    • secret - Ceph client key (base64 encoded).
  • hosts - List of Ceph monitors.
  • pool - Name of the Ceph pool.
Examples

Minimal configuration defining a directory storage pool on the host and defining a directory for image files which tasks may reference:

plugin "nomad-driver-virt" {
  config {
    image_paths = ["/var/lib/virt/images"]
    storage_pools {
      directory "local" {
        path = "/var/lib/virt/storage"
      }
    }
  }
}

This full libvirt configuration example has a username and password and allows multiple host directories for image files. It defines two storage pools, one directory and one Ceph, and the directory storage pool is marked as the default:

plugin "nomad-driver-virt" {
  config {
    provider "libvirt" {
      uri      = "qemu:///system"
      user     = "libvirt-username"
      password = "libvirt-pass"
    }

    image_paths = [
      "/var/lib/virt/images",
      "/opt/custom-images",
    ]

    storage_pools {
      default = "local"
      directory "local" {
        path = "/var/lib/virt/storage"
      }

      ceph "remote-storage" {
        pool  = "nomad-pool"
        hosts = [
          "10.0.0.2:3300",
          "10.0.0.12:3300",
          "10.0.0.99:3300",
        ]
        authentication {
          username = "nomad"
          secret   = "AQCzNMxpb6aWIxAA7YrNMSg8z5TxEvB0jsuibQ=="
        }
      }
    }
  }
}

Task Configuration

  • cmds - List of commands to execute on the VM once it is running.
  • default_user_authorized_ssh_key - SSH public key added to the SSH configuration for the default user of the cloud image distribution.
  • default_user_password - Initial password configured for the default user of the cloud image distribution.
  • disk - A list of disk configurations for volumes to be attached to the VM.
  • hostname - Hostname assigned. Must be a valid DNS label according to RFC 1123. Defaults to a name based on the task name.
  • network_interface A list of network interfaces to be attached to the VM. Currently only a single entry is supported.
  • os - Configuration for specific machine and architecture to emulate. Default to match host machine.
  • user_data - Path to a cloud-init compliant user data file to be used as the user-data for the cloud-init configuration.

Note: The driver currently has support for cpuSets or cores and memory. Every core will be treated as a vcpu. Do not use resources.cpus, they will be ignored.

Disk

A disk describes a volume to be attached to the task VM. Multiple disks can be defined within a task's configuration, with one disk required to be identified as the primary disk. A disk can provide a volume that is an empty block device, a clone of an existing volume within the storage pool, or formatted with a supplied image.

  • bus_type - Bus type for the disk. Defaults to virtio.
  • chained - Disk is an overlay on the source.
  • devname - Device name used within the VM. Auto-generated by default.
  • driver - Driver to use for the disk. Usage and default value is provider specific.
  • format - Format of the disk. Default is provider specific.
  • kind - Kind of disk defined. Defaults to disk.
  • pool - Storage pool to place volume created from this definition. Defaults to the default storage pool.
  • primary - Disk is the primary to boot the VM.
  • read_only - Disk is read only.
  • size - Size of the disk as bytes, or string (example: 20GB or 15GiB)
  • sparse - Disk should be sparsely populated.
  • source - Block containing disk source configuration.
    • format - Format of the image. Auto-detected if unset.
    • image - Image to write to the disk. Overwrites any existing information on disk.
    • volume - Volume in storage pool to clone.
  • volume - Nomad volume to back the disk.
Example

The example below shows the task and disk configuration to define a primary disk in the directory storage pool and a secondary empty disk in the Ceph storage pool:

job "python-server" {
  group "virt-group" {
    task "virt-task" {
      artifact {
        source      = "http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img"
        destination = "local/focal-server-cloudimg-amd64.img"
        mode        = "file"
      }

      driver = "virt"

      config {
        disk {
          size    = "10GiB"
          pool    = "local"
          primary = true
          source {
            image = "local/focal-server-cloudimg-amd64.img"
          }
        }

        disk {
          size = "20GB"
          pool = "ceph-storage"
        }
      }
    }
  }
}

If the storage pool already contains a volume with the focal server image, it can be cloned to remove the need of downloading and applying the image. Once the volume is cloned, it will be automatically resized to the requested size:

job "python-server" {
  group "virt-group" {
    task "virt-task" {
      driver = "virt"

      config {
        disk {
          size   = "10GiB"
          pool   = "local"
          source {
            volume = "focal-server-cloudimg-amd64.img"
          }
        }
      }
    }
  }
}

Instead of making a full clone of the source volume, a chained copy may be created which overlays the new volume on the source volume creating a copy-on-write volume:

job "python-server" {
  group "virt-group" {
    task "virt-task" {
      driver = "virt"

      config {
        disk {
          size    = "10GiB"
          pool    = "local"
          chained = true
          source {
            volume = "focal-server-cloudimg-amd64.img"
          }
        }
      }
    }
  }
}

Chained copies may also be used when providing a source image. A new volume will be created for the image and any tasks that define a chained disk with that source image will be chained to that volume:

job "python-server" {
  group "virt-group" {
    task "virt-task" {
      artifact {
        source      = "http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img"
        destination = "local/focal-server-cloudimg-amd64.img"
        mode        = "file"
      }

      driver = "virt"

      config {
        disk {
          size    = "10GiB"
          pool    = "local"
          chained = true
          source {
            image = "local/focal-server-cloudimg-amd64.img"
          }
        }
      }
    }
  }
}

Comprehensive Examples

Comprehensive examples of storage pool and disk usage can be found in the ./examples/storage directory. The examples currently include:

Network Configuration

The following configuration options are available within the task's driver configuration block:

  • bridge - Block configuration for connecting to a bridged network.
    • name - Name of the bridge interface to use. The default libvirt network, virbr0, is a bridged network.
    • ports - A list of port labels exposed on the host via mapping to the network interface. Labels must exist within the job specification network block.
  • macvtap - Block configuration for configuring a macvtap device.
    • device - Name of the host device to use for creating the macvtap device.
    • mode - Operating mode of the macvtap interface. Supported modes: bridge, private, vepa, or passthrough. Defaults to bridge.
Example (bridge)

The example below shows the network configuration and task configuration required to expose and map ports 22 and 80:

group "virt-group" {

  network {
    mode = "host"
    port "ssh" {
      to = 22
    }
    port "http" {
      to = 80
    }
  }

  task "virt-task" {
    driver = "virt"
    config {
      network_interface {
        bridge {
          name  = "virbr0"
          ports = ["ssh", "http"]
        }
      }
    }
  }
}

Exposed ports and services can make use of the existing service block, so that registrations can be performed using the specified backend provider.

Example (macvtap)

The example below shows task configuration required for configuring a macvtap device:

group "virt-group" {
  task "virt-task" {
    driver = "virt"
    config {
      network_interface {
        macvtap {
          device = "eth0"
          mode   = "bridge"
        }
      }
    }
  }
}

Local Development

Make sure the node supports virtualization.

# Build the task driver plugin
make dev

# Copy the build nomad-driver-plugin executable to the plugin dir
cp ./build/nomad-driver-virt - /opt/nomad/plugins

# Start Nomad
nomad agent -config=examples/server.nomad.hcl 2>&1 > server.log &

# Run the client as sudo
sudo nomad agent -config=examples/client.nomad.hcl 2>&1 > client.log &

# Run a job
nomad job run examples/job.nomad.hcl

# Verify
nomad job status virt-example

virsh list

Debugging a VM

Before starting

If running a job for the first time, you run into errors, remember to verify the runtime Runtime dependencies.

It is important to know that to protect the host machine from guests overusing the disk, managed vm don't have write access to the Nomad filesystem.

If Nomad is not running as root, the permissions for the directories used by both Nomad and the virt driver need to be adjusted.

Once the vm is running things still don't go as plan and extra tools are necessary to find the problem. Here are some strategies to debug a failing VM:

Connecting to a VM

By default, cloud images are password protected, by adding a default_user_password a new password is assigned to the default user of the used distribution (for example, ubuntu for ubuntu fedora for fedora, or root for alpine) By running virsh console [vm-name], a terminal is started inside the VM that will allow an internal inspection of the VM.

$ virsh list
 Id   Name                 State
------------------------------------
 1    virt-task-8bc0a63f   running

$ virsh console virt-task-8bc0a63f
Connected to domain 'virt-task-8bc0a63f'
Escape character is ^] (Ctrl + ])

nomad-virt-task-8bc0a63f login: ubuntu
Password:

If no login prompt shows up, it can mean the virtual machine is not booting and adding some extra space to the disk may solve the problem. Remember the disk has to fit the root image plus any other process running in the VM.

The virt driver heavily relies on cloud-init to execute the virtual machine's configuration. Once you have managed to connect to the terminal, the results of cloud init can be found in two different places:

  • /var/log/cloud-init.log
  • /var/log/cloud-init-output.log

Looking into these files can give a better understanding of any possible execution errors.

If connecting to the terminal is not an option, it is possible to stop the job and mount the VM's disk to inspect it. If the use_thin_copy option is used, the driver will create the disk image in the directory ${plugin_config.data_dir}/virt/vm-name.img:

# Find the virtual machine disk image
$ ls /var/lib/virt
virt-task-8bc0a63f.img

# Enable Network Block Devices on the Host
modprobe nbd max_part=8

# Connect the disk as network block device
qemu-nbd --connect=/dev/nbd0 '/var/lib/virt/virt-task-dc8187e3.img'

# Find The Virtual Machine Partitions
fdisk /dev/nbd0 -l

# Mount the partition from the VM
mount /dev/nbd0p1 /mnt/somepoint/

Important Don't forget to unmount the disk after finishing:

umount /mnt/somepoint/
qemu-nbd --disconnect /dev/nbd0
rmmod nbd
Networking

For networking, the plugin leverages on the libvirt default network default:

$ virsh net-list
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes

Under the hood, libvirt uses dnsmasq to lease IP addresses to the virtual machines, there are mutiple ways to find the IP assigned to the nomad task. Using virsh to find the leased IP:

$ virsh net-dhcp-leases default
 Expiry Time           MAC address         Protocol   IP address           Hostname                   Client ID or DUID
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 2024-10-07 18:48:09   52:54:00:b5:0b:d4   ipv4       192.168.122.211/24   nomad-virt-task-dc8187e3   ff:08:24:45:0e:00:02:00:00:ab:11:63:3c:26:5b:b7:fe:b3:13

or using the mac address to find the IP via ARP:

$ virsh dumpxml virt-task-8473ccfb  | grep "mac address" | awk -F\' '{ print $2}'
52:54:00:b5:0b:d4
$ arp -an | grep 52:54:00:b5:0b:d4
? (192.168.122.211) at 52:54:00:b5:0b:d4 [ether] on virbr0

Documentation

The Go Gopher

There is no documentation for this package.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL