manta

v0.0.4 (latest) | Published: Dec 5, 2024 | License: Apache-2.0

Repository

github.com/inftyai/manta

README

A lightweight P2P-based cache system for model distributions on Kubernetes.

Name Story: the name Manta is inspired by the Dota 2 item Manta Style, which creates two images of your hero, just like peers in a P2P network.

Architecture

[Figure: Manta architecture]

Note: llmaz is just one possible integration; Manta can be deployed and used independently.

Features Overview

Model Hub Support: Models can be downloaded directly from model hubs (Hugging Face, etc.) or object storage with no extra effort.
Model Preheat: Models can be preloaded to the whole cluster or to specified nodes to accelerate model serving.
Model Cache: Models are cached as chunks after downloading for faster model loading.
Model Lifecycle Management: The model lifecycle is managed automatically with different strategies, such as Retain or Delete.
Plugin Framework: Filter and Score plugins can be extended to pick the best candidates.
Memory Management (WIP): Manages the memory reserved for caching, with an LRU algorithm for GC.

You Should Know Before

Manta is not an all-in-one solution for model management. Instead, it offers a lightweight way to use idle bandwidth and cost-effective disk, helping you save money.
It requires no additional components like databases or storage systems, simplifying setup and reducing effort.
All models are stored under the host path /mnt/models/.
After all, it's just a cache system.

Quick Start

Installation

See Installation for guidance.

Preheat Model

A sample that preloads the Qwen/Qwen2.5-0.5B-Instruct model. Once preheated, the model is loaded from the cache instead of being fetched from a cold start.

apiVersion: manta.io/v1alpha1
kind: Torrent
metadata:
  name: torrent-sample
spec:
  hub:
    name: Huggingface
    repoID: Qwen/Qwen2.5-0.5B-Instruct

To preload the model to specific nodes, use nodeSelector:

apiVersion: manta.io/v1alpha1
kind: Torrent
metadata:
  name: torrent-sample
spec:
  hub:
    name: Huggingface
    repoID: Qwen/Qwen2.5-0.5B-Instruct
  nodeSelector:
    foo: bar

Use Model

Once you have a Torrent, you can access the model directly from the host path /mnt/models/. All you need to do is set a Pod label like:

metadata:
  labels:
    manta.io/torrent-name: "torrent-sample"
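
For context, here is how that label might sit in a complete Pod manifest. This is only a minimal sketch, assuming the Pod mounts the host path itself through a hostPath volume. The Pod name, container name, image, and in-container mount path are placeholders and are not taken from the Manta documentation:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: model-consumer                      # hypothetical Pod name
  labels:
    manta.io/torrent-name: "torrent-sample" # label from this README
spec:
  containers:
    - name: server                          # hypothetical container name
      image: example.com/inference-server:latest  # placeholder image
      volumeMounts:
        - name: models
          mountPath: /models                # assumed in-container path
          readOnly: true
  volumes:
    - name: models
      hostPath:
        path: /mnt/models/                  # host path named in this README
        type: Directory
```

How the model files are laid out under /mnt/models/ is not described here, so check the actual directory structure on a node before hard-coding a model path.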

Note: you can make the Torrent Standby by setting preheat to false (true by default). Preheating then happens at runtime, which slows down model loading.

apiVersion: manta.io/v1alpha1
kind: Torrent
metadata:
  name: torrent-sample
spec:
  preheat: false

Delete Model

If you want to remove the model weights once the Torrent is deleted, set reclaimPolicy to Delete (the default is Retain):

apiVersion: manta.io/v1alpha1
kind: Torrent
metadata:
  name: torrent-sample
spec:
  hub:
    name: Huggingface
    repoID: Qwen/Qwen2.5-0.5B-Instruct
  reclaimPolicy: Delete
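
The fields shown in this Quick Start can presumably be combined in a single Torrent. This is a sketch under that assumption: each field appears on its own in the samples above, but the combination is not confirmed here, and the resource name is hypothetical:

```yaml
apiVersion: manta.io/v1alpha1
kind: Torrent
metadata:
  name: torrent-combined        # hypothetical name
spec:
  hub:
    name: Huggingface
    repoID: Qwen/Qwen2.5-0.5B-Instruct
  preheat: true                 # the stated default, written out explicitly
  nodeSelector:
    foo: bar                    # same sample node label as above
  reclaimPolicy: Delete         # remove weights when the Torrent is deleted
```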

For more details, refer to the APIs.

Roadmap

In the long term, we hope to make Manta a unified cache system within MLOps.

Preloading datasets from model hubs
RDMA support for faster model loading
More integrations with MLOps systems, including training and serving

Community

Join us for more discussions:

Slack Channel: #manta

Contributions

All kinds of contributions are welcome! Please follow CONTRIBUTING.md.

Directories

Path	Synopsis
agent
cmd command
pkg/controller
pkg/handler
pkg/server
pkg/task
pkg/util
api
v1alpha1	Package v1alpha1 contains API Schema definitions for the v1alpha1 API group (manta.io).
cmd
hack
internal
pkg
cert
controller
dispatcher
dispatcher/cache
dispatcher/framework
dispatcher/plugins/diskaware
dispatcher/plugins/nodeselector
util
webhook
test
util
util/validation
util/wrapper
