Documentation
¶
Overview ¶
Package probe implements readiness probes for cluster services.
Index ¶
- func ContainerRunning(ctx context.Context, client *docker.Client, name docker.ContainerName) error
- func MungeReady(ctx context.Context, client *docker.Client, name docker.ContainerName) error
- func SSHDReady(ctx context.Context, client *docker.Client, name docker.ContainerName) error
- func SlurmctldReady(ctx context.Context, client *docker.Client, name docker.ContainerName) error
- func SlurmdReady(ctx context.Context, client *docker.Client, name docker.ContainerName) error
- func SystemdReady(ctx context.Context, client *docker.Client, name docker.ContainerName) error
- func UntilReady(ctx context.Context, client *docker.Client, name docker.ContainerName, ...) error
- type Func
- type Probe
- type TerminalError
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ContainerRunning ¶
ContainerRunning verifies that the container is in the "running" state. Returns a TerminalError for states that cannot recover (exited, dead).
func MungeReady ¶
MungeReady verifies that the munge authentication service is active.
func SSHDReady ¶
SSHDReady verifies that sshd is accepting connections and responding with the SSH protocol banner on port 22.
func SlurmctldReady ¶
SlurmctldReady verifies that slurmctld is responding to RPC requests.
func SlurmdReady ¶
SlurmdReady verifies that the slurmd service is active.
func SystemdReady ¶
SystemdReady verifies that systemd has finished booting. Both "running" and "degraded" are considered ready, since degraded means systemd completed startup but some units failed (which is expected when Slurm daemons haven't been configured yet).
func UntilReady ¶
func UntilReady(ctx context.Context, client *docker.Client, name docker.ContainerName, probes []Probe, interval time.Duration) error
UntilReady polls the given probes until they all pass or the context expires. The caller controls the deadline via the context. The interval controls the delay between polling attempts. On timeout, the error includes the name and message of the last failing probe.
Types ¶
type Probe ¶
Probe is a named readiness check.
func ForService ¶ added in v0.7.0
ForService returns the readiness probe for a Slurm service.
func NodeProbes ¶
NodeProbes returns the probes applicable to a node with the given role.
type TerminalError ¶ added in v0.7.1
type TerminalError struct {
Msg string
}
TerminalError indicates a probe failure that cannot be recovered by retrying. For example, a container in "exited" or "dead" state will never become "running" on its own.
func (*TerminalError) Error ¶ added in v0.7.1
func (e *TerminalError) Error() string