kubedl

v0.3.0
Published: Apr 12, 2021 License: Apache-2.0 Imports: 14 Imported by: 0

README

KubeDL


KubeDL is short for Kubernetes-Deep-Learning. It is a unified operator that supports running multiple types of distributed deep learning/machine learning workloads on Kubernetes. Check the website: https://kubedl.io


Currently, KubeDL supports the following ML/DL jobs:

Features

  • Support running prevalent deep learning workloads in a single operator.
  • Support running jobs with custom artifacts from a remote repository such as GitHub, saving users from manually baking the artifacts into the image.
  • Instrumented with unified Prometheus metrics for different types of DL jobs, such as job launch delay and the number of pending/running jobs.
  • Support job metadata persistence with a pluggable storage backend such as MySQL.
  • Show more granular job status information on the kubectl command line.
  • Support advanced scheduling features such as gang scheduling with pluggable backend schedulers.
  • A modular architecture that can be easily extended for more types of DL/ML workloads with shared libraries; see how to add a custom job workload.
  • Run jobs with host network.

Build right away
    make manager

Run the tests
    make test

Generate manifests, such as CRD and RBAC YAML files
    make manifests

Build the Docker image
    export IMG=<your_image_name> && make docker-build

Push the image
    docker push <your_image_name>

Check the Makefile in the root directory for more details.
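
The build steps above can be chained so that a failure in one step stops the rest. The following is only a sketch: the function names `run_step` and `kubedl_build_pipeline` and the `DRY_RUN` switch are hypothetical helpers introduced here, not part of the KubeDL Makefile. Only the `make` targets, the `IMG` variable, and `docker push` come from the steps listed above.

```shell
#!/usr/bin/env bash
# Sketch: run the README's build steps in order, stopping at the first failure.
# Hypothetical helpers (not part of KubeDL): run_step, kubedl_build_pipeline, DRY_RUN.

# Print the command when DRY_RUN=1, otherwise execute it.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

kubedl_build_pipeline() {
  # IMG names the image that `make docker-build` builds and `docker push` uploads.
  if [ -z "${IMG:-}" ]; then
    echo "IMG must be set to your image name" >&2
    return 1
  fi
  export IMG

  # Each step runs only if the previous one succeeded.
  run_step make manager &&
    run_step make test &&
    run_step make manifests &&
    run_step make docker-build &&
    run_step docker push "$IMG"
}
```

After sourcing the file from the repository root, `DRY_RUN=1 IMG=<your_image_name> kubedl_build_pipeline` prints the five commands without running them; dropping `DRY_RUN=1` executes them.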

Documentation

There is no documentation for this package.

Directories

Path Synopsis
training/v1alpha1
    Package v1alpha1 contains API Schema definitions for the training v1alpha1 API group. Markers: +k8s:defaulter-gen=TypeMeta, +kubebuilder:object:generate=true, +groupName=training.kubedl.io
client
clientset/versioned
    This package has the automatically generated clientset.
clientset/versioned/fake
    This package has the automatically generated fake clientset.
clientset/versioned/scheme
    This package contains the scheme of the automatically generated clientset.
clientset/versioned/typed/training/v1alpha1
    This package has the automatically generated typed clients.
clientset/versioned/typed/training/v1alpha1/fake
    Package fake has the automatically generated clients.
cmd
mpi
tensorflow
    Package tensorflow provides a Kubernetes controller for a TFJob resource.
xdl
pkg
job_controller/api/v1
    Package v1 is the v1 version of the API.
test_job/v1
    Package v1 is the v1 version of the API.
util
    Package util provides various helper routines.
util/train
    Package that provides various helper routines for training.
