
TensorFusion.AI
Next-Generation GPU Virtualization and Pooling for Enterprises
Fewer GPUs, More AI Apps.
Explore the docs »
View Demo | Report Bug | Request Feature
Tensor Fusion

Tensor Fusion is a state-of-the-art GPU virtualization and pooling solution designed to maximize GPU cluster utilization.
English | 简体中文
Highlights
- Fractional GPU with Single TFLOPS/MiB Precision
- Battle-tested GPU-over-IP Remote GPU Sharing
- GPU-first Scheduling and Auto-scaling
- Computing Oversubscription and GPU VRAM Expansion
- GPU Live Migration
Demo
WIP
Quick Start
Onboard Your Own AI Infra
Try it out
# Step 1: Install TensorFusion in Kubernetes
# (the release and chart names below are assumed; check the Helm repo index for the actual chart name)
helm install tensor-fusion tensor-fusion \
  --repo https://nexusgpu.github.io/tensor-fusion/ \
  --namespace tensor-fusion --create-namespace
# Step 2: Onboard GPU nodes into the TensorFusion cluster
kubectl apply -f https://raw.githubusercontent.com/NexusGPU/tensor-fusion/main/manifests/gpu-node.yaml
# Step 3: Check that the cluster and pool are ready
kubectl get gpupools -o wide && kubectl get gpunodes -o wide
# Step 4: Create an inference app using virtual, remote GPU resources in the TensorFusion cluster
kubectl apply -f https://raw.githubusercontent.com/NexusGPU/tensor-fusion/main/manifests/inference-app.yaml
# Then forward the port to test inference, or exec a shell into the pod
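As a quick smoke test, you can forward the service port and probe it locally. This is a minimal sketch; the service name, deployment name, and port below are assumptions, so adjust them to match the manifest you applied:

```shell
# Forward the inference service to localhost
# ("inference-app" and port 8080 are assumed names, not taken from the manifest)
kubectl port-forward svc/inference-app 8080:8080 &

# Send a test request to the forwarded port
curl http://localhost:8080/

# Or open an interactive shell inside the running pod
kubectl exec -it deploy/inference-app -- /bin/sh
```

Run `kubectl get svc,deploy` after Step 4 to see the real names to substitute.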
(TODO: Asciinema)
Discussion
Features & Roadmap
Core GPU Virtualization Features
- Fractional GPU and flexible oversubscription
- GPU-over-IP, remote GPU sharing with less than 4% performance loss
- GPU VRAM expansion or swap to host RAM
- Non-NVIDIA GPU/NPU vendor support
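To make fractional allocation concrete, here is a hypothetical pod spec requesting a slice of a GPU. The annotation keys and values below are illustrative assumptions, not the project's documented API; consult the TensorFusion docs for the real field names:

```shell
# Hypothetical example only: the tensor-fusion.ai/* annotation keys below
# are illustrative assumptions, not TensorFusion's documented API.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: fractional-gpu-demo
  annotations:
    tensor-fusion.ai/tflops-request: "10"   # a fraction of one GPU's compute
    tensor-fusion.ai/vram-request: "4Gi"    # a slice of the GPU's VRAM
spec:
  containers:
    - name: app
      image: nvcr.io/nvidia/pytorch:24.06-py3
EOF
```

The idea is that workloads declare compute in TFLOPS and memory in bytes rather than whole-GPU counts, which is what lets the scheduler pack multiple apps onto one physical device.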
Pooling & Scheduling & Management
- GPU/NPU pool management in Kubernetes
- GPU-first resource scheduler based on virtual TFlops/VRAM capacity
- GPU-first auto provisioning and bin-packing
- Seamless onboarding experience for PyTorch, TensorFlow, llama.cpp, vLLM, TensorRT, SGLang, and all popular AI training/serving frameworks
- Basic management console and dashboards
- Basic autoscaling policies, automatically setting requests/limits/replicas
- GPU Group scheduling for LLMs
- Support different QoS levels
Enterprise Features
- GPU live migration, fastest in the world
- Preloading and P2P distribution of container images, AI models, GPU snapshots etc.
- Advanced auto-scaling policies, scale to zero, rebalance of hot GPUs
- Advanced observability features, detailed metrics & tracing/profiling of CUDA calls
- Multi-tenancy billing based on actual usage
- Enterprise level high availability and resilience, support topology aware scheduling, GPU node auto failover etc.
- Enterprise level security, complete on-premise deployment support, encryption in-transit & at-rest
- Enterprise level compliance, SSO/SAML support, advanced audit, ReBAC control, SOC2 and other compliance reports available
- Run on Linux Kubernetes clusters
- Run on Linux VMs or Bare Metal (one-click onboarding to Edge K3S)
- Run on Windows (Docs not ready, contact us for support)
- Run on macOS (imagine mounting a virtual NVIDIA GPU device on macOS!)
See the open issues for a full list of proposed features (and known issues).
Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Top contributors
License
- This repo is open sourced under the Apache 2.0 License, which covers the GPU pooling, scheduling, and management features; you can use and modify them for free.
- The GPU virtualization and GPU-over-IP features are also free to use as part of the Community Plan, though their implementation is not fully open sourced.
- The "Enterprise Features" listed above are paid; licensed users unlock them automatically.