Documentation ¶
Overview ¶
Package config provides basic infrastructure to set configuration settings for runsc. The configuration is set through command-line flags. The settings can also be propagated to a different process by passing it the same flags.
Index ¶
- Constants
- Variables
- func RegisterFlags(flagSet *flag.FlagSet)
- type Bundle
- type BundleName
- type Config
- func (c *Config) ApplyBundles(flagSet *flag.FlagSet, bundleNames ...BundleName) error
- func (c *Config) GetHostUDS() HostUDS
- func (c *Config) GetOverlay2() Overlay2
- func (c *Config) Log()
- func (c *Config) MetricMetadata() map[string]string
- func (c *Config) Override(flagSet *flag.FlagSet, name string, value string, force bool) error
- func (c *Config) ToContainerdConfigTOML(opts ContainerdConfigOptions) (string, error)
- func (c *Config) ToFlags() []string
- type ContainerdConfigOptions
- type FileAccessType
- type HostFifo
- type HostUDS
- type KeyVal
- type NetworkType
- type Overlay2
- type OverlayMedium
- type QueueingDiscipline
- type XDP
- type XDPMode
Constants ¶
const (
	// NoOverlay indicates that no overlay will be applied.
	NoOverlay = OverlayMedium("")

	// MemoryOverlay indicates that the overlay is backed by app memory.
	MemoryOverlay = OverlayMedium("memory")

	// SelfOverlay indicates that the overlaid mount is backed by itself.
	SelfOverlay = OverlayMedium("self")

	// AnonOverlayPrefix is the prefix that users should specify in the
	// config for the anonymous overlay.
	AnonOverlayPrefix = "dir="
)
Variables ¶
var Bundles = map[BundleName]Bundle{
"experimental-high-performance": {
"directfs": "true",
"overlay2": "root:self",
"platform": "systrap",
},
}
Bundles is the set of all available bundles. Each bundle is a named set of flag names and flag values. Bundles may be turned on using pod annotations. Bundles have lower precedence than flag pod annotations and command-line flags. Two bundles are mutually exclusive if and only if they set the same flag to different values.
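The mutual-exclusion rule can be sketched as a small check. This is an illustration under assumptions: `bundle` is a local stand-in that treats a bundle as a map from flag name to flag value (as the literal above suggests), `conflicts` is a hypothetical helper, and the `"kvm"` and `"debug"` entries are made-up inputs rather than real bundles.

```go
package main

import "fmt"

// bundle is a local stand-in for Bundle: flag name -> flag value.
type bundle map[string]string

// conflicts reports whether a and b set at least one common flag to
// different values. Bundles that share no flags, or that agree on every
// shared flag, do not conflict.
func conflicts(a, b bundle) bool {
	for name, av := range a {
		if bv, ok := b[name]; ok && bv != av {
			return true
		}
	}
	return false
}

func main() {
	highPerf := bundle{
		"directfs": "true",
		"overlay2": "root:self",
		"platform": "systrap",
	}
	// Hypothetical bundles, used only to exercise the check.
	otherPlatform := bundle{"platform": "kvm"}
	compatible := bundle{"directfs": "true", "debug": "true"}

	fmt.Println(conflicts(highPerf, otherPlatform)) // overlapping flag, different value
	fmt.Println(conflicts(highPerf, compatible))    // overlapping flag, same value
}
```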
var MetricMetadataKeys = []string{
"version",
"platform",
"network",
"numcores",
"coretags",
"overlay",
"fsmode",
"cpuarch",
"go",
"experiment",
}
MetricMetadataKeys is the set of keys of metric metadata labels as returned by `Config.MetricMetadata`.
Functions ¶
func RegisterFlags ¶
RegisterFlags registers flags used to populate Config.
Types ¶
type BundleName ¶
type BundleName string
BundleName is a human-friendly name for a Bundle. It is used as part of an annotation to specify that the user wants to apply a Bundle.
type Config ¶
type Config struct {
// RootDir is the runtime root directory.
RootDir string `flag:"root"`
// Traceback changes the Go runtime's traceback level.
Traceback string `flag:"traceback"`
// Debug indicates that debug logging should be enabled.
Debug bool `flag:"debug"`
// LogFilename is the filename to log to, if not empty.
LogFilename string `flag:"log"`
// LogFormat is the log format.
LogFormat string `flag:"log-format"`
// DebugLog is the path to log debug information to, if not empty.
// If specified together with `DebugToUserLog`, debug logs are emitted
// to both.
DebugLog string `flag:"debug-log"`
// DebugToUserLog indicates that Sentry debug logs should be emitted
// to user-visible logs.
// If specified together with `DebugLog`, debug logs are emitted
// to both.
DebugToUserLog bool `flag:"debug-to-user-log"`
// DebugCommand is a comma-separated list of commands to be debugged if
// --debug-log is also set. Empty means debug all. "!" negates the expression.
// E.g. "create,start" or "!boot,events".
DebugCommand string `flag:"debug-command"`
// PanicLog is the path to log Go's runtime messages, if not empty.
PanicLog string `flag:"panic-log"`
// CoverageReport is the path to write Go coverage information, if not empty.
CoverageReport string `flag:"coverage-report"`
// DebugLogFormat is the log format for debug.
DebugLogFormat string `flag:"debug-log-format"`
// FileAccess indicates how the root filesystem is accessed.
FileAccess FileAccessType `flag:"file-access"`
// FileAccessMounts indicates how non-root volumes are accessed.
FileAccessMounts FileAccessType `flag:"file-access-mounts"`
// Overlay is whether to wrap all mounts in an overlay. The upper tmpfs layer
// will be backed by application memory.
Overlay bool `flag:"overlay"`
// Overlay2 holds configuration about wrapping mounts in overlayfs.
// DO NOT call it directly, use GetOverlay2() instead.
Overlay2 Overlay2 `flag:"overlay2"`
// FSGoferHostUDS is deprecated: use host-uds=all.
FSGoferHostUDS bool `flag:"fsgofer-host-uds"`
// HostUDS controls permission to access host Unix-domain sockets.
// DO NOT call it directly, use GetHostUDS() instead.
HostUDS HostUDS `flag:"host-uds"`
// HostFifo controls permission to access host FIFO (or named pipes).
HostFifo HostFifo `flag:"host-fifo"`
// Network indicates what type of network to use.
Network NetworkType `flag:"network"`
// EnableRaw indicates whether raw sockets should be enabled. Raw
// sockets are disabled by stripping CAP_NET_RAW from the list of
// capabilities.
EnableRaw bool `flag:"net-raw"`
// AllowPacketEndpointWrite enables write operations on packet endpoints.
AllowPacketEndpointWrite bool `flag:"TESTONLY-allow-packet-endpoint-write"`
// HostGSO indicates that host segmentation offload is enabled.
HostGSO bool `flag:"gso"`
// GVisorGSO indicates that gVisor segmentation offload is enabled. The flag
// retains its old name of "software" GSO for API consistency.
GVisorGSO bool `flag:"software-gso"`
// GVisorGRO enables gVisor's generic receive offload.
GVisorGRO bool `flag:"gvisor-gro"`
// TXChecksumOffload indicates that TX Checksum Offload is enabled.
TXChecksumOffload bool `flag:"tx-checksum-offload"`
// RXChecksumOffload indicates that RX Checksum Offload is enabled.
RXChecksumOffload bool `flag:"rx-checksum-offload"`
// QDisc indicates the type of queueing discipline to use by default
// for non-loopback interfaces.
QDisc QueueingDiscipline `flag:"qdisc"`
// LogPackets indicates that all network packets should be logged.
LogPackets bool `flag:"log-packets"`
// PCAP is a file to which network packets should be logged in PCAP format.
PCAP string `flag:"pcap-log"`
// Platform is the platform to run on.
Platform string `flag:"platform"`
// PlatformDevicePath is the path to the device file used by the platform.
// e.g. "/dev/kvm" for the KVM platform.
// If unset, a sane platform-specific default will be used.
PlatformDevicePath string `flag:"platform_device_path"`
// MetricServer, if set, indicates that metrics should be exported on this address.
// This may either be 1) "addr:port" to export metrics on a specific network interface address,
// 2) ":port" for exporting metrics on all addresses, or 3) an absolute path to a Unix Domain
// Socket.
// The substring "%ID%" will be replaced by the container ID, and "%RUNTIME_ROOT%" by the root.
// This flag must be specified *both* as part of the `runsc metric-server` arguments (so that the
// metric server knows which address to bind to), and as part of the `runsc create` arguments (as
// an indication that the container being created wishes that its metrics should be exported).
// The value of this flag must also match across the two command lines.
MetricServer string `flag:"metric-server"`
// ProfilingMetrics is a comma separated list of metric names which are
// going to be written to the ProfilingMetricsLog file from within the
// sentry in CSV format. ProfilingMetrics will be snapshotted at a rate
// specified by ProfilingMetricsRate. Requires ProfilingMetricsLog to be
// set.
ProfilingMetrics string `flag:"profiling-metrics"`
// ProfilingMetricsLog is the file name to use for ProfilingMetrics
// output.
ProfilingMetricsLog string `flag:"profiling-metrics-log"`
// ProfilingMetricsRate is the target rate (in microseconds) at which
// profiling metrics will be snapshotted.
ProfilingMetricsRate int `flag:"profiling-metrics-rate-us"`
// Strace indicates that strace should be enabled.
Strace bool `flag:"strace"`
// StraceSyscalls is the set of syscalls to trace (comma-separated values).
// If Strace is true and this string is empty, then all syscalls will
// be traced.
StraceSyscalls string `flag:"strace-syscalls"`
// StraceLogSize is the max size of data blobs to display.
StraceLogSize uint `flag:"strace-log-size"`
// StraceEvent indicates sending strace to events if true. Strace is
// sent to log if false.
StraceEvent bool `flag:"strace-event"`
// DisableSeccomp indicates whether seccomp syscall filters should be
// disabled. Pardon the double negation, but default to enabled is important.
DisableSeccomp bool
// EnableCoreTags indicates whether the Sentry process and children will be
// run in a core tagged process. This isolates the sentry from sharing
// physical cores with other core tagged processes. This is useful as a
// mitigation for hyperthreading side channel based attacks. Requires host
// linux kernel >= 5.14.
EnableCoreTags bool `flag:"enable-core-tags"`
// WatchdogAction sets what action the watchdog takes when triggered.
WatchdogAction watchdog.Action `flag:"watchdog-action"`
// PanicSignal registers signal handling that panics. Usually set to
// SIGUSR2(12) to troubleshoot hangs. -1 disables it.
PanicSignal int `flag:"panic-signal"`
// ProfileEnable is set to prepare the sandbox to be profiled.
ProfileEnable bool `flag:"profile"`
// ProfileBlock collects a block profile to the passed file for the
// duration of the container execution. Requires ProfileEnable.
ProfileBlock string `flag:"profile-block"`
// ProfileCPU collects a CPU profile to the passed file for the
// duration of the container execution. Requires ProfileEnable.
ProfileCPU string `flag:"profile-cpu"`
// ProfileHeap collects a heap profile to the passed file for the
// duration of the container execution. Requires ProfileEnable.
ProfileHeap string `flag:"profile-heap"`
// ProfileMutex collects a mutex profile to the passed file for the
// duration of the container execution. Requires ProfileEnable.
ProfileMutex string `flag:"profile-mutex"`
// TraceFile collects a Go runtime execution trace to the passed file
// for the duration of the container execution.
TraceFile string `flag:"trace"`
// NumNetworkChannels controls the number of AF_PACKET sockets that map
// to the same underlying network device. This allows netstack to better
// scale for high throughput use cases.
NumNetworkChannels int `flag:"num-network-channels"`
// NetworkProcessorsPerChannel controls the number of goroutines used to
// handle packets on a single network channel. A higher number can help handle
// many simultaneous connections. If this is 0, runsc will divide GOMAXPROCS
// evenly among each network channel.
NetworkProcessorsPerChannel int `flag:"network-processors-per-channel"`
// Rootless allows the sandbox to be started with a user that is not root.
// Defense in depth measures are weaker in rootless mode. Specifically, the
// sandbox and Gofer process run as root inside a user namespace with root
// mapped to the caller's user. When using rootless, the container root path
// should not have a symlink.
Rootless bool `flag:"rootless"`
// AlsoLogToStderr allows log messages to also be sent to stderr.
AlsoLogToStderr bool `flag:"alsologtostderr"`
// ReferenceLeak sets the reference leak check mode.
ReferenceLeak refs.LeakMode `flag:"ref-leak-mode"`
// CPUNumFromQuota sets CPU number count to available CPU quota, using
// least integer value greater than or equal to quota.
//
// E.g. 0.2 CPU quota will result in 1, and 1.9 in 2.
CPUNumFromQuota bool `flag:"cpu-num-from-quota"`
// Allows overriding of flags in OCI annotations.
AllowFlagOverride bool `flag:"allow-flag-override"`
// Enables seccomp inside the sandbox.
OCISeccomp bool `flag:"oci-seccomp"`
// Don't configure cgroups.
IgnoreCgroups bool `flag:"ignore-cgroups"`
// Use systemd to configure cgroups.
SystemdCgroup bool `flag:"systemd-cgroup"`
// PodInitConfig is the path to configuration file with additional steps to
// take during pod creation.
PodInitConfig string `flag:"pod-init-config"`
// Use pools to manage buffer memory instead of heap.
BufferPooling bool `flag:"buffer-pooling"`
// XDP controls whether and how to use XDP.
XDP XDP `flag:"EXPERIMENTAL-xdp"`
// AFXDPUseNeedWakeup determines whether XDP_USE_NEED_WAKEUP is set
// when using AF_XDP sockets.
AFXDPUseNeedWakeup bool `flag:"EXPERIMENTAL-xdp-need-wakeup"`
// FDLimit specifies a limit on the number of host file descriptors that can
// be open simultaneously by the sentry and gofer. It applies separately to
// each.
FDLimit int `flag:"fdlimit"`
// DCache sets the global dirent cache size. If negative, per-mount caches are
// used.
DCache int `flag:"dcache"`
// IOUring enables support for the IO_URING API calls to perform
// asynchronous I/O operations.
IOUring bool `flag:"iouring"`
// DirectFS sets up the sandbox to directly access/mutate the filesystem from
// the sentry. Sentry runs with escalated privileges. Gofer process still
// exists, but is mostly idle. Not supported in rootless mode.
DirectFS bool `flag:"directfs"`
// NVProxy enables support for Nvidia GPUs.
NVProxy bool `flag:"nvproxy"`
// NVProxyDocker is deprecated. Please use nvidia-container-runtime or
// `docker run --gpus` directly. For backward compatibility, this has the
// effect of injecting nvidia-container-runtime-hook as a prestart hook.
NVProxyDocker bool `flag:"nvproxy-docker"`
// NVProxyDriverVersion is the version of the NVIDIA driver ABI to use.
// If empty, it is autodetected from the installed NVIDIA driver.
// It can also be set to the special value "latest" to force the use of
// the latest supported NVIDIA driver ABI.
NVProxyDriverVersion string `flag:"nvproxy-driver-version"`
// TPUProxy enables support for TPUs.
TPUProxy bool `flag:"tpuproxy"`
// TestOnlyAllowRunAsCurrentUserWithoutChroot should only be used in
// tests. It allows runsc to start the sandbox process as the current
// user, and without chrooting the sandbox process. This can be
// necessary in test environments that have limited capabilities. When
// disabling chroot, the container root path should not have a symlink.
TestOnlyAllowRunAsCurrentUserWithoutChroot bool `flag:"TESTONLY-unsafe-nonroot"`
// TestOnlyTestNameEnv should only be used in tests. It looks up the
// test name in the container environment variables and adds it to the debug
// log file name. This is done to help identify the log with the test when
// multiple tests are run in parallel, since there is no way to pass
// parameters to the runtime from docker.
TestOnlyTestNameEnv string `flag:"TESTONLY-test-name-env"`
// TestOnlyAFSSyscallPanic should only be used in tests. It enables the
// alternate behaviour for afs_syscall to trigger a Go-runtime panic upon being
// called. This is useful for tests exercising gVisor panic-reporting.
TestOnlyAFSSyscallPanic bool `flag:"TESTONLY-afs-syscall-panic"`
// ReproduceNAT, when true, tells runsc to scrape the host network
// namespace's NAT iptables and reproduce it inside the sandbox.
ReproduceNAT bool `flag:"reproduce-nat"`
// ReproduceNftables attempts to scrape nftables routing rules if
// present, and reproduce them in the sandbox.
ReproduceNftables bool `flag:"reproduce-nftables"`
// TestOnlyAutosaveImagePath if not empty enables auto save for syscall tests
// and stores the directory path to the saved state file.
TestOnlyAutosaveImagePath string `flag:"TESTONLY-autosave-image-path"`
// TestOnlyAutosaveResume indicates save resume for syscall tests.
TestOnlyAutosaveResume bool `flag:"TESTONLY-autosave-resume"`
// contains filtered or unexported fields
}
Config holds configuration that is not part of the runtime spec.
Follow these steps to add a new flag:
- Create a new field in Config.
- Add a field tag with the flag name
- Register a new flag in flags.go with the same name, and add a description
- Add any necessary validation into validate()
- If adding an enum, follow the same pattern as FileAccessType
- Evaluate if the flag can be changed with OCI annotations. See overrideAllowlist for more details
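The enum step above can be sketched as a type that implements `flag.Value`. This is a generic illustration of the pattern, not code from the package: `color` and its values are hypothetical, and the real `FileAccessType` accepts different strings.

```go
package main

import (
	"flag"
	"fmt"
)

// color is a hypothetical enum used only for this example.
type color int

const (
	colorRed color = iota
	colorBlue
)

// Set implements flag.Value by parsing the textual form of the enum.
func (c *color) Set(v string) error {
	switch v {
	case "red":
		*c = colorRed
	case "blue":
		*c = colorBlue
	default:
		return fmt.Errorf("invalid color %q", v)
	}
	return nil
}

// String implements flag.Value. It returns exactly the strings that Set
// accepts, so Set(String()) leaves the value unchanged.
func (c color) String() string {
	switch c {
	case colorRed:
		return "red"
	case colorBlue:
		return "blue"
	}
	return fmt.Sprintf("unknown(%d)", int(c))
}

func main() {
	c := colorRed
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.Var(&c, "color", "color to use: red or blue")
	if err := fs.Parse([]string{"--color=blue"}); err != nil {
		panic(err)
	}
	// Round-tripping through String and Set must not change the value.
	if err := c.Set(c.String()); err != nil {
		panic(err)
	}
	fmt.Println(c)
}
```

Keeping `Set` and `String` as exact inverses is what makes it safe to render a config back into flags and parse it again in another process.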
func NewFromBundle ¶
NewFromBundle makes a new config from a Bundle.
func NewFromFlags ¶
NewFromFlags creates a new Config with values coming from command line flags.
func (*Config) ApplyBundles ¶
func (c *Config) ApplyBundles(flagSet *flag.FlagSet, bundleNames ...BundleName) error
ApplyBundles applies the given bundles by name. It returns an error if a bundle doesn't exist, or if the given bundles have conflicting flag values. Config values which are already specified prior to calling ApplyBundles are overridden.
func (*Config) GetHostUDS ¶
GetHostUDS returns the FS gofer communication that is allowed, taking into consideration all flags that affect the result.
func (*Config) GetOverlay2 ¶
GetOverlay2 returns the overlay configuration, taking into consideration all flags that affect the result.
func (*Config) Log ¶
func (c *Config) Log()
Log logs important aspects of the configuration.
func (*Config) MetricMetadata ¶
MetricMetadata returns key-value pairs that are useful to include in metrics exported about the sandbox this config represents. It must return the same set of labels as listed in `MetricMetadataKeys`.
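The "same set of labels" requirement can be checked with a small helper, for example in a test. This is a sketch under assumptions: `sameKeys` is a hypothetical function, the key list is assumed to contain no duplicates, and the label values below are invented.

```go
package main

import "fmt"

// sameKeys reports whether labels has exactly the keys listed in keys.
// It assumes keys contains no duplicates.
func sameKeys(keys []string, labels map[string]string) bool {
	if len(keys) != len(labels) {
		return false
	}
	for _, k := range keys {
		if _, ok := labels[k]; !ok {
			return false
		}
	}
	return true
}

func main() {
	// A shortened key list and made-up values, for illustration only.
	keys := []string{"version", "platform", "network"}
	labels := map[string]string{
		"version":  "example-version",
		"platform": "systrap",
		"network":  "sandbox",
	}
	fmt.Println(sameKeys(keys, labels))

	// Dropping a label breaks the contract.
	delete(labels, "network")
	fmt.Println(sameKeys(keys, labels))
}
```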
func (*Config) ToContainerdConfigTOML ¶
func (c *Config) ToContainerdConfigTOML(opts ContainerdConfigOptions) (string, error)
ToContainerdConfigTOML turns a given config into a format for a k8s containerd config.toml file. See: https://gvisor.dev/docs/user_guide/containerd/quick_start/
type ContainerdConfigOptions ¶
type ContainerdConfigOptions struct {
BinaryPath string
RootPath string
Options map[string]string
RunscFlags []KeyVal
}
ContainerdConfigOptions contains arguments for ToContainerdConfigTOML.
type FileAccessType ¶
type FileAccessType int
FileAccessType tells how the filesystem is accessed.
const (
	// FileAccessExclusive gives the sandbox exclusive access over files and
	// directories in the filesystem. External modifications are not permitted
	// and can lead to undefined behavior.
	//
	// Exclusive filesystem access enables more aggressive caching and offers
	// significantly better performance. This is the default mode for the root
	// volume.
	FileAccessExclusive FileAccessType = iota

	// FileAccessShared requires revalidation on every filesystem access to
	// detect external changes, and reduces the amount of caching that can be
	// done. This is the default mode for non-root volumes.
	FileAccessShared
)
func (*FileAccessType) Set ¶
func (f *FileAccessType) Set(v string) error
Set implements flag.Value. Set(String()) should be idempotent.
func (FileAccessType) String ¶
func (f FileAccessType) String() string
String implements flag.Value.
type HostFifo ¶
type HostFifo int
HostFifo tells how much of the host FIFO (or named pipes) the file system has access to.
type HostUDS ¶
type HostUDS int
HostUDS tells how much of the host UDS the file system has access to.
const (
	// HostUDSNone doesn't allow UDS from the host to be manipulated.
	HostUDSNone HostUDS = 0x0

	// HostUDSOpen allows UDS from the host to be opened, e.g. connect(2).
	HostUDSOpen HostUDS = 0x1

	// HostUDSCreate allows UDS from the host to be created, e.g. bind(2).
	HostUDSCreate HostUDS = 0x2

	// HostUDSAll allows all forms of communication with the host through UDS.
	HostUDSAll = HostUDSOpen | HostUDSCreate
)
func (HostUDS) AllowCreate ¶
AllowCreate returns true if it can create UDS in the host.
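The HostUDS values are bit flags, so a permission check is a bitwise AND. The sketch below uses local stand-in names and mirrors the constant values listed for the type; `allowOpen` is a hypothetical counterpart to AllowCreate, and the method bodies are an assumption about how such a check could be written, not the package's code.

```go
package main

import "fmt"

// hostUDS is a local stand-in for HostUDS with the same bit values.
type hostUDS int

const (
	hostUDSNone   hostUDS = 0x0
	hostUDSOpen   hostUDS = 0x1
	hostUDSCreate hostUDS = 0x2
	hostUDSAll            = hostUDSOpen | hostUDSCreate
)

// allowOpen reports whether the open bit is set.
func (h hostUDS) allowOpen() bool { return h&hostUDSOpen != 0 }

// allowCreate reports whether the create bit is set.
func (h hostUDS) allowCreate() bool { return h&hostUDSCreate != 0 }

func main() {
	for _, h := range []hostUDS{hostUDSNone, hostUDSOpen, hostUDSCreate, hostUDSAll} {
		fmt.Printf("value=%#x open=%t create=%t\n", int(h), h.allowOpen(), h.allowCreate())
	}
}
```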
type KeyVal ¶
KeyVal is a key value pair. It is used so ToContainerdConfigTOML returns predictable ordering for runsc flags.
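The ordering point can be illustrated as follows: Go does not guarantee map iteration order, while a slice of pairs is always walked in the order it was built. In this sketch `keyVal` is a local stand-in whose field names are assumptions, and the `key = "value"` layout is illustrative rather than the actual output of ToContainerdConfigTOML.

```go
package main

import (
	"fmt"
	"strings"
)

// keyVal is a local stand-in for KeyVal; the field names are assumptions.
type keyVal struct {
	Key string
	Val string
}

// render writes one `key = "value"` line per pair, in slice order, so the
// output is identical on every run.
func render(flags []keyVal) string {
	var b strings.Builder
	for _, kv := range flags {
		fmt.Fprintf(&b, "%s = %q\n", kv.Key, kv.Val)
	}
	return b.String()
}

func main() {
	flags := []keyVal{
		{Key: "platform", Val: "systrap"},
		{Key: "debug", Val: "true"},
	}
	fmt.Print(render(flags))
}
```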
type NetworkType ¶
type NetworkType int
NetworkType tells which network stack to use.
const (
	// NetworkSandbox uses internal network stack, isolated from the host.
	NetworkSandbox NetworkType = iota

	// NetworkHost redirects network related syscalls to the host network.
	NetworkHost

	// NetworkNone sets up just loopback using netstack.
	NetworkNone
)
func (*NetworkType) Set ¶
func (n *NetworkType) Set(v string) error
Set implements flag.Value. Set(String()) should be idempotent.
type Overlay2 ¶
type Overlay2 struct {
// contains filtered or unexported fields
}
Overlay2 holds the configuration for setting up overlay filesystems for the container.
func (*Overlay2) RootOverlayMedium ¶
func (o *Overlay2) RootOverlayMedium() OverlayMedium
RootOverlayMedium returns the overlay medium config of the root mount.
func (*Overlay2) SubMountOverlayMedium ¶
func (o *Overlay2) SubMountOverlayMedium() OverlayMedium
SubMountOverlayMedium returns the overlay medium config of submounts.
type OverlayMedium ¶
type OverlayMedium string
OverlayMedium describes how overlay medium is configured.
func (OverlayMedium) HostFileDir ¶
func (m OverlayMedium) HostFileDir() string
HostFileDir indicates the directory in which the overlay-backing host file should be created.
Precondition: m.IsBackedByAnon().
func (OverlayMedium) IsBackedByAnon ¶
func (m OverlayMedium) IsBackedByAnon() bool
IsBackedByAnon indicates whether the overlaid mount is backed by a host file in an anonymous directory.
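One plausible reading of how AnonOverlayPrefix, IsBackedByAnon, and HostFileDir fit together is sketched below with local stand-in types. The method bodies are assumptions made for illustration (a prefix test and a prefix strip), and `/tmp/overlay` is a made-up path.

```go
package main

import (
	"fmt"
	"strings"
)

// overlayMedium is a local stand-in for OverlayMedium.
type overlayMedium string

// anonOverlayPrefix mirrors AnonOverlayPrefix from the constants.
const anonOverlayPrefix = "dir="

// isBackedByAnon reports whether the medium names a host directory.
func (m overlayMedium) isBackedByAnon() bool {
	return strings.HasPrefix(string(m), anonOverlayPrefix)
}

// hostFileDir returns the directory that follows the prefix.
// Precondition: m.isBackedByAnon().
func (m overlayMedium) hostFileDir() string {
	if !m.isBackedByAnon() {
		panic(fmt.Sprintf("overlay medium %q is not backed by a host directory", string(m)))
	}
	return strings.TrimPrefix(string(m), anonOverlayPrefix)
}

func main() {
	for _, m := range []overlayMedium{"memory", "self", "dir=/tmp/overlay"} {
		if m.isBackedByAnon() {
			fmt.Printf("%q is backed by a host file in %s\n", string(m), m.hostFileDir())
		} else {
			fmt.Printf("%q is not backed by a host directory\n", string(m))
		}
	}
}
```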
func (*OverlayMedium) Set ¶
func (m *OverlayMedium) Set(v string) error
Set sets the value. Set(String()) should be idempotent.
func (OverlayMedium) String ¶
func (m OverlayMedium) String() string
String returns a human-readable string representing the overlay medium config.
type QueueingDiscipline ¶
type QueueingDiscipline int
QueueingDiscipline is used to specify the kind of queueing discipline to apply for a given FDBasedLink.
const (
	// QDiscNone disables any queueing for the underlying FD.
	QDiscNone QueueingDiscipline = iota

	// QDiscFIFO applies a simple fifo based queue to the underlying FD.
	QDiscFIFO
)
func (*QueueingDiscipline) Set ¶
func (q *QueueingDiscipline) Set(v string) error
Set implements flag.Value. Set(String()) should be idempotent.
func (QueueingDiscipline) String ¶
func (q QueueingDiscipline) String() string
String implements flag.Value.
type XDPMode ¶
type XDPMode int
XDPMode specifies a particular use of XDP.
const (
	// XDPModeOff doesn't use XDP.
	XDPModeOff XDPMode = iota

	// XDPModeNS uses an AF_XDP socket to read from the VETH device inside
	// the container's network namespace.
	XDPModeNS

	// XDPModeRedirect uses an AF_XDP socket on the host NIC to bypass the
	// Linux network stack.
	XDPModeRedirect

	// XDPModeTunnel uses XDP_REDIRECT to redirect packets directly from the
	// host NIC to the VETH device inside the container's network
	// namespace. Packets are read from the VETH via AF_XDP, as in
	// XDPModeNS.
	XDPModeTunnel
)