Documentation ¶
Index ¶
- func BuildDMCommand(profile *nnfv1alpha11.NnfDataMovementProfile, hostfile string, ...) ([]string, error)
- func CreateMpiHostfile(profile *nnfv1alpha11.NnfDataMovementProfile, hosts []string, ...) (string, error)
- func ExtractIndexMountDir(path, namespace string) (string, error)
- func GetCopyOffloadWorkerHostnames(clnt client.Client, ctx context.Context, nodes []string, ...) ([]string, error)
- func GetDMProfile(clnt client.Client, ctx context.Context, dm *nnfv1alpha11.NnfDataMovement) (*nnfv1alpha11.NnfDataMovementProfile, error)
- func GetDestinationDir(profile *nnfv1alpha11.NnfDataMovementProfile, dm *nnfv1alpha11.NnfDataMovement, ...) (string, error)
- func GetStorageNodeNames(clnt client.Client, ctx context.Context, dm *nnfv1alpha11.NnfDataMovement) ([]string, error)
- func GetWorkerHostnames(clnt client.Client, ctx context.Context, nodes []string) ([]string, error)
- func HandleIndexMountDir(profile *nnfv1alpha11.NnfDataMovementProfile, dm *nnfv1alpha11.NnfDataMovement, ...) (string, error)
- func InjectOrteTmpdirBase(cmd, hostfile string) string
- func ParseDcpProgress(line string, cmdStatus *nnfv1alpha11.NnfDataMovementCommandStatus) error
- func ParseDcpStats(line string, cmdStatus *nnfv1alpha11.NnfDataMovementCommandStatus) error
- func PeekMpiHostfile(hostfile string) string
- func PrepareDestination(clnt client.Client, ctx context.Context, ...) error
- func ProgressCollectionEnabled(collectInterval time.Duration) bool
- func TrimDcpProgressFromOutput(output string) string
- func WriteMpiHostfile(dmName string, hosts []string, slots, maxSlots int) (string, error)
- type DataMovementContext
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func BuildDMCommand ¶
func BuildDMCommand(profile *nnfv1alpha11.NnfDataMovementProfile, hostfile string, usePermissions bool, dm *nnfv1alpha11.NnfDataMovement, log logr.Logger) ([]string, error)
func CreateMpiHostfile ¶
func CreateMpiHostfile(profile *nnfv1alpha11.NnfDataMovementProfile, hosts []string, dm *nnfv1alpha11.NnfDataMovement) (string, error)
Create an MPI hostfile from the settings in the profile and the user configuration in the dm.
func ExtractIndexMountDir ¶
func ExtractIndexMountDir(path, namespace string) (string, error)
Extract the index mount directory from the path, for the file systems that require one.
func GetCopyOffloadWorkerHostnames ¶ added in v0.1.15
func GetCopyOffloadWorkerHostnames(clnt client.Client, ctx context.Context, nodes []string, workflow, namespace string, dm *nnfv1alpha11.NnfDataMovement) ([]string, error)
Copy Offload version of GetWorkerHostnames. For GFS2, only one hostname is needed: that of the worker pod running on the local rabbit node. The rabbit node can be determined from the namespace of the storage reference. For Lustre, all the rabbits for the workflow are needed. mpi-operator builds a hostfile, and that list of FQDNs can be used directly.
func GetDMProfile ¶
func GetDMProfile(clnt client.Client, ctx context.Context, dm *nnfv1alpha11.NnfDataMovement) (*nnfv1alpha11.NnfDataMovementProfile, error)
func GetDestinationDir ¶
func GetDestinationDir(profile *nnfv1alpha11.NnfDataMovementProfile, dm *nnfv1alpha11.NnfDataMovement, mpiHostfile string, log logr.Logger) (string, error)
Determine the directory path to create based on the source and destination. Returns the directory to pass to mkdir, and an error.
func GetStorageNodeNames ¶
func GetStorageNodeNames(clnt client.Client, ctx context.Context, dm *nnfv1alpha11.NnfDataMovement) ([]string, error)
Retrieve the NNF nodes that are the target of the data movement operation.
func GetWorkerHostnames ¶
func GetWorkerHostnames(clnt client.Client, ctx context.Context, nodes []string) ([]string, error)
Get the hostnames for the workers that are running on the rabbit nodes. For node-local data movement (i.e., XFS or GFS2), which only uses the local rabbit node, localhost can be used. For non-local data movement (i.e., Lustre), the Pods associated with the MPI workers on each individual rabbit must be looked up, mapping each node name to a worker IP address.
func HandleIndexMountDir ¶
func HandleIndexMountDir(profile *nnfv1alpha11.NnfDataMovementProfile, dm *nnfv1alpha11.NnfDataMovement, destDir, indexMount, mpiHostfile string, log logr.Logger) (string, error)
Given a destination directory and an index mount directory, adjust the destination directory and the DM's destination path to account for index mount directories.
func InjectOrteTmpdirBase ¶ added in v0.1.25
func InjectOrteTmpdirBase(cmd, hostfile string) string
InjectOrteTmpdirBase adds --mca orte_tmpdir_base <dir> to the mpirun command to isolate each mpirun invocation's ORTE session directory. This prevents a race condition in which concurrent mpirun processes on the same pod share /tmp/ompi.<hostname>.<uid>/, causing "No such file or directory" errors when one process cleans up the parent directory while another is still initializing.
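The injection described above can be sketched roughly as follows. This is an illustration, not the package's implementation: the helper name injectTmpdirBase is hypothetical, and the choice of the hostfile's parent directory as the per-invocation <dir> is an assumption made only because the real function receives the hostfile path.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// injectTmpdirBase is a hypothetical stand-in for InjectOrteTmpdirBase.
// Assumption: the per-invocation directory is the one holding the hostfile.
func injectTmpdirBase(cmd, hostfile string) string {
	const launcher = "mpirun"
	// Leave non-mpirun commands, and commands that already set the
	// parameter, untouched.
	if !strings.HasPrefix(cmd, launcher+" ") || strings.Contains(cmd, "orte_tmpdir_base") {
		return cmd
	}
	dir := filepath.Dir(hostfile)
	// Splice the MCA parameter in directly after the launcher name.
	return launcher + " --mca orte_tmpdir_base " + dir + cmd[len(launcher):]
}

func main() {
	cmd := "mpirun --hostfile /tmp/dm-1/hostfile dcp /src /dst"
	fmt.Println(injectTmpdirBase(cmd, "/tmp/dm-1/hostfile"))
}
```

Because each invocation gets its own base directory, one mpirun cleaning up its session directory can no longer remove a parent that another invocation is still using.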
func ParseDcpProgress ¶
func ParseDcpProgress(line string, cmdStatus *nnfv1alpha11.NnfDataMovementCommandStatus) error
func ParseDcpStats ¶
func ParseDcpStats(line string, cmdStatus *nnfv1alpha11.NnfDataMovementCommandStatus) error
Go through the list of dcp stat regexes, match each one against the line, and store any parsed value in the appropriate field of cmdStatus.
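A table of regexes paired with setter functions is one way to structure this kind of parsing. The sketch below is hypothetical throughout: the stats struct stands in for NnfDataMovementCommandStatus, and the two patterns ("Seconds:" and "Items:") are assumed examples of dcp summary lines rather than the list the package actually uses.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// stats is a hypothetical stand-in for the command status type.
type stats struct {
	Seconds string
	Items   int
}

// statMatcher pairs a pattern with the code that stores its captured value.
type statMatcher struct {
	re    *regexp.Regexp
	apply func(s *stats, value string) error
}

// Assumed patterns, for illustration only.
var matchers = []statMatcher{
	{regexp.MustCompile(`Seconds: ([0-9.]+)`), func(s *stats, v string) error {
		s.Seconds = v
		return nil
	}},
	{regexp.MustCompile(`Items: (\d+)`), func(s *stats, v string) error {
		n, err := strconv.Atoi(v)
		if err != nil {
			return err
		}
		s.Items = n
		return nil
	}},
}

// parseStat applies the first matching pattern to the line.
// Lines that match nothing are ignored.
func parseStat(line string, s *stats) error {
	for _, m := range matchers {
		if match := m.re.FindStringSubmatch(line); match != nil {
			return m.apply(s, match[1])
		}
	}
	return nil
}

func main() {
	var s stats
	for _, line := range []string{"Seconds: 1.500", "Items: 42", "unrelated output"} {
		if err := parseStat(line, &s); err != nil {
			fmt.Println("parse error:", err)
		}
	}
	fmt.Printf("%+v\n", s)
}
```

Keeping the patterns in a single table means supporting a new statistic only requires adding one entry.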
func PeekMpiHostfile ¶
func PeekMpiHostfile(hostfile string) string
Get the first line of the hostfile for verification.
func PrepareDestination ¶
func PrepareDestination(clnt client.Client, ctx context.Context, profile *nnfv1alpha11.NnfDataMovementProfile, dm *nnfv1alpha11.NnfDataMovement, mpiHostfile string, log logr.Logger) error
func TrimDcpProgressFromOutput ¶ added in v0.1.14
func TrimDcpProgressFromOutput(output string) string
Walk through the output of the dcp command and remove all of the progress lines except for the first and last occurrences. Insert the marker <snipped dcp progress output> to indicate that the output has been trimmed.
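The trimming behavior can be illustrated with a minimal sketch. The helper name trimProgress, the progress-line pattern, and the sample lines are all assumptions made for illustration; the real function may recognize progress lines differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// progressRe is an assumed pattern for dcp progress lines.
var progressRe = regexp.MustCompile(`Copied .+ in .+ secs`)

const snipMarker = "<snipped dcp progress output>"

// trimProgress keeps the first and last progress lines, drops the ones in
// between, and inserts a single marker where lines were removed.
func trimProgress(output string) string {
	lines := strings.Split(output, "\n")

	// Locate the first and last progress lines.
	first, last := -1, -1
	for i, line := range lines {
		if progressRe.MatchString(line) {
			if first < 0 {
				first = i
			}
			last = i
		}
	}

	var out []string
	snipped := false
	for i, line := range lines {
		if progressRe.MatchString(line) && i != first && i != last {
			if !snipped {
				out = append(out, snipMarker)
				snipped = true
			}
			continue
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}

func main() {
	// Invented sample output, for illustration only.
	output := strings.Join([]string{
		"Walking /src",
		"Copied 1 GiB (25%) in 1 secs",
		"Copied 2 GiB (50%) in 2 secs",
		"Copied 3 GiB (75%) in 3 secs",
		"Copied 4 GiB (100%) in 4 secs",
		"Done",
	}, "\n")
	fmt.Println(trimProgress(output))
}
```

Output with zero, one, or two progress lines passes through unchanged, since there is nothing between the first and last occurrence to remove.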
func WriteMpiHostfile ¶
func WriteMpiHostfile(dmName string, hosts []string, slots, maxSlots int) (string, error)
Create the MPI hostfile given a list of hosts, slots, and maxSlots. A temporary directory is created based on the DM name, and the hostfile is created inside that directory. If slots or maxSlots is 0, that setting is omitted from the hostfile.
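As a rough sketch of the hostfile handling described for WriteMpiHostfile and PeekMpiHostfile, the example below writes an Open MPI style hostfile and reads back its first line. The helper names, the file name "hostfile", and the use of the DM name as a temporary directory prefix are assumptions, not details confirmed by this page.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// hostfileContents renders one line per host. A value of 0 for slots or
// maxSlots omits that setting from the line.
func hostfileContents(hosts []string, slots, maxSlots int) string {
	var b strings.Builder
	for _, host := range hosts {
		b.WriteString(host)
		if slots > 0 {
			fmt.Fprintf(&b, " slots=%d", slots)
		}
		if maxSlots > 0 {
			fmt.Fprintf(&b, " max_slots=%d", maxSlots)
		}
		b.WriteString("\n")
	}
	return b.String()
}

// writeHostfile is a hypothetical stand-in for WriteMpiHostfile.
// Assumption: the temporary directory uses the DM name as its prefix.
func writeHostfile(dmName string, hosts []string, slots, maxSlots int) (string, error) {
	dir, err := os.MkdirTemp("", dmName+"-")
	if err != nil {
		return "", err
	}
	path := filepath.Join(dir, "hostfile")
	data := []byte(hostfileContents(hosts, slots, maxSlots))
	if err := os.WriteFile(path, data, 0o644); err != nil {
		return "", err
	}
	return path, nil
}

// peekFirstLine is a hypothetical stand-in for PeekMpiHostfile.
// It returns "" if the file cannot be read or is empty.
func peekFirstLine(path string) string {
	f, err := os.Open(path)
	if err != nil {
		return ""
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	if scanner.Scan() {
		return scanner.Text()
	}
	return ""
}

func main() {
	path, err := writeHostfile("dm-example", []string{"rabbit-node-1", "rabbit-node-2"}, 4, 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(filepath.Dir(path))
	fmt.Println(peekFirstLine(path)) // rabbit-node-1 slots=4
}
```

Separating the rendering step from the file I/O keeps the zero-value rule easy to check without touching the file system.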
Types ¶
type DataMovementContext ¶ added in v0.1.27
type DataMovementContext struct {
Ctx context.Context
Cancel context.CancelFunc
Result *nnfv1alpha11.NnfDataMovementStatus
}
DataMovementContext tracks an in-progress data movement operation. It holds the context/cancel for cancellation, and Result is set by the goroutine when the command finishes so the reconciler can write the final status using standard controller-runtime retry semantics.
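The lifecycle described above might look roughly like the following sketch. Everything here is illustrative: movementStatus is a hypothetical stand-in for nnfv1alpha11.NnfDataMovementStatus, the state strings are invented, and the done channel is a demo-only addition that lets the caller read Result safely after the goroutine has written it.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// movementStatus is a hypothetical stand-in for the real status type.
type movementStatus struct {
	State string
}

// dmContext mirrors the three fields of DataMovementContext.
type dmContext struct {
	Ctx    context.Context
	Cancel context.CancelFunc
	Result *movementStatus
}

// startDemo launches a fake command that takes workMillis to finish.
// Closing done after Result is written gives the reader a happens-before
// guarantee, so Result can be read without a data race.
func startDemo(workMillis int) (*dmContext, <-chan struct{}) {
	ctx, cancel := context.WithCancel(context.Background())
	c := &dmContext{Ctx: ctx, Cancel: cancel}
	done := make(chan struct{})

	go func() {
		defer close(done)
		select {
		case <-time.After(time.Duration(workMillis) * time.Millisecond):
			c.Result = &movementStatus{State: "Finished"}
		case <-ctx.Done():
			c.Result = &movementStatus{State: "Cancelled"}
		}
	}()
	return c, done
}

func main() {
	c, done := startDemo(5000)
	c.Cancel() // request cancellation, as a reconciler might
	<-done     // wait until the goroutine has recorded its result
	fmt.Println(c.Result.State) // Cancelled
}
```

Having the goroutine only record the outcome, rather than write it to the API server itself, leaves the status update to the reconciler, which is where retries are handled.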