
TensorFusion.AI
Next-Generation GPU Virtualization and Pooling for Enterprises
Fewer GPUs, More AI Apps.
Explore the docs »
View Demo | Report Bug | Request Feature
Tensor Fusion

Tensor Fusion is a state-of-the-art GPU virtualization and pooling solution designed to maximize GPU cluster utilization.
Highlights
- Fractional GPU with Single TFlops/MiB Precision
- Battle-tested GPU-over-IP Remote GPU Sharing
- GPU-first Scheduling and Auto-scaling
- Compute Oversubscription and GPU VRAM Expansion
- GPU Pooling, Monitoring, Live Migration, AI Model Preloading and more
Demo
Fractional vGPU & GPU-over-IP & Distributed Allocation

AI Infra Console

GPU Live-migration [End-to-end feature WIP]
https://cdn.tensor-fusion.ai/GPU_Content_Migration.mp4
Quick Start
Onboard Your Own AI Infra
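For a rough sense of what onboarding can look like, below is a minimal Helm-based install sketch for an existing Kubernetes cluster. The chart repository URL, chart name, and namespace are illustrative assumptions, not the project's confirmed commands; follow the docs linked above for the actual procedure.

```sh
# Hypothetical install sketch -- the repo URL, chart name, and namespace are
# assumptions for illustration; see the official docs for the real commands.
helm repo add tensor-fusion https://example.com/tensor-fusion-charts   # assumed chart repo
helm repo update
helm install tensor-fusion tensor-fusion/tensor-fusion \
  --namespace tensor-fusion-sys --create-namespace
kubectl get pods -n tensor-fusion-sys   # verify the control plane started
```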
Discussion
Features & Roadmap
Core GPU Virtualization Features
- Fractional GPU and flexible oversubscription (see the request sketch after this list)
- Remote GPU sharing with state-of-the-art GPU-over-IP technology, with less than 4% performance loss
- GPU VRAM expansion and hot/warm/cold tiering
- Non-NVIDIA GPU/NPU vendor support
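To make the fractional-GPU model concrete, the sketch below shows how a pod might request a GPU slice with independent compute (TFlops) and VRAM (MiB/GiB) bounds. The label and annotation keys are hypothetical placeholders, not Tensor Fusion's documented API; they only illustrate the idea of requesting less (or, via limits, more) than one physical GPU.

```sh
# Hypothetical fractional-vGPU request -- the label/annotation keys below are
# assumptions for illustration, not the documented API.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: fractional-gpu-demo
  labels:
    tensor-fusion.ai/enabled: "true"        # assumed opt-in label
  annotations:
    tensor-fusion.ai/tflops-request: "10"   # assumed: guaranteed compute slice in TFlops
    tensor-fusion.ai/tflops-limit: "20"     # assumed: burst ceiling, enabling oversubscription
    tensor-fusion.ai/vram-request: "4096Mi" # assumed: VRAM slice with MiB precision
spec:
  containers:
  - name: app
    image: pytorch/pytorch:latest
    command: ["python", "-c", "import torch; print(torch.cuda.is_available())"]
EOF
```

With requests below limits, a scheduler can bin-pack many such slices onto one physical GPU and let idle capacity absorb bursts.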
Pooling & Scheduling & Management
- GPU/NPU pool management in Kubernetes
- GPU-first scheduling and allocation, with single TFlops/MiB precision
- GPU node auto provisioning/termination
- GPU compaction/bin-packing
- Seamless onboarding experience for PyTorch, TensorFlow, llama.cpp, vLLM, TensorRT, SGLang and all popular AI training/serving frameworks
- Centralized Dashboard & Control Plane
- GPU-first autoscaling policies, auto-set requests/limits/replicas
- Request multiple vGPUs with group scheduling for large models (see the sketch after this list)
- Support different QoS levels
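As a sketch of how group scheduling and QoS could be expressed (see the bullet above), the manifest below asks for four vGPUs that must be placed together for one large model. The API group, resource kind, and field names are assumptions for illustration, not a confirmed CRD schema.

```sh
# Hypothetical group-scheduling claim -- the CRD kind and fields are
# assumptions, not a confirmed schema.
kubectl apply -f - <<'EOF'
apiVersion: tensor-fusion.ai/v1   # assumed API group/version
kind: GPUResourceClaim            # assumed resource kind
metadata:
  name: llm-serving-claim
spec:
  gpuCount: 4               # all four vGPUs are scheduled together (gang/group scheduling)
  perGPU:
    tflopsRequest: "40"     # per-vGPU compute request
    vramRequest: "24Gi"     # per-vGPU VRAM request
  qosLevel: critical        # assumed QoS tier; lower tiers could be throttled or preempted first
EOF
```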
Enterprise Features
- GPU live migration: snapshot, distribute, and restore GPU context across clusters, the fastest in the world
- AI model registry and preloading: build your own private MaaS (Model-as-a-Service)
- Advanced auto-scaling policies: scale to zero, rebalancing of hot GPUs
- Advanced observability features: detailed metrics & tracing/profiling of CUDA calls
- Monetize your GPU cluster with multi-tenant usage metering & billing reports
- Enterprise-level high availability and resilience: topology-aware scheduling, GPU node auto-failover, etc.
- Enterprise-level security: complete on-premise deployment support, encryption in transit & at rest
- Enterprise-level compliance: SSO/SAML support, advanced auditing, ReBAC control, SOC 2 and other compliance reports available
- Run on Linux Kubernetes clusters
- Run on Linux VMs or bare metal (one-click onboarding to edge K3s)
- Run on Windows (docs not ready; contact us for support)
- Run on macOS (imagine mounting a virtual NVIDIA GPU device on macOS!)
See the open issues for a full list of proposed features (and known issues).
Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Top contributors
License
- This repo is open sourced under the Apache 2.0 License, which covers the GPU pooling, scheduling, and management features; you can use and modify it for free.
- The GPU virtualization and GPU-over-IP features are also free to use as part of the Community Plan, but their implementation is not fully open sourced.
- The features listed under "Enterprise Features" above are paid; licensed users unlock them automatically.
