Tokilake

v0.5.6 · Published: Mar 21, 2026 · License: Apache-2.0

Tokilake & Tokiame

Control your own GPUs like OpenRouter.

Tokilake is a decentralized Large Language Model (LLM) API scheduling gateway built on the One-API ecosystem. It inverts the traditional API gateway model: rather than the gateway acting as a client that reaches out to upstream servers with public IPs, any GPU worker node (Tokiame) sitting behind NAT or an intranet connects outward to the central gateway (Hub) through a reverse WebSocket tunnel.

Tokilake is built on top of MartialBE/one-hub and the broader One-API ecosystem that evolved around it.

📖 Quick Start

You can visit the Tokilake Demo to explore the core features. For total data privacy and distribution control, we highly recommend self-hosting following our End-to-End Deployment Guide. Whether you're a first-time user or ready for a full deployment, the guides below are here to help.

🌟 Core Concept

Traditional API proxies typically act as clients, routing requests to servers with public IP addresses. If your high-performance GPUs (like an RTX 4090) are sitting quietly on a local home network, or scattered across temporary Spot Instances from various cloud providers, unifying them into a stable, accessible API is a major challenge.

Tokiame changes the game. Operating as a lightweight daemon, it actively "dials out" to connect to the cloud-based Tokilake gateway. Upon a successful connection, Tokilake seamlessly maps the worker internally to a standard Channel. This means you don't need any tricky intranet penetration tools (like FRP or Ngrok). You get to enjoy the gateway's enterprise-grade load balancing, high-concurrency traffic shaping, authentication, and billing systems right out of the box.

🚀 Perfect Use Cases

1. Distributed GPU Pooling for Individuals & Studios (NAT Penetration)

Tailor-made for home broadband or campus networks without public IPs. Just run the Tokiame process locally and it instantly establishes a tunnel to the cloud gateway; the LLMs you serve locally with Ollama or vLLM can then securely expose standard OpenAI-compatible API services to the outside world.

2. Hybrid Cloud Orchestration

Purchased scattered GPU instances across different compute platforms (e.g., AWS, AliCloud, AutoDL, RunPod)? Skip the complex SD-WAN setups. Simply attach the Tokiame startup script to your new instances, and they automatically register into the load-balancing pool. When instances are destroyed or shut down, the heartbeat mechanism safely takes the node offline, drastically reducing DevOps overhead.

3. Enterprise Data Privacy & "Bring Your Own Model" (BYOM)

SaaS providers handle the business logic frontend, while clients provide the compute backend. Clients only need to deploy Tokiame within their highly secure private server rooms, initiating a one-way outbound connection to the SaaS gateway. The client's server room exposes absolutely zero inbound ports, yet perfectly completes the business scheduling of private models, satisfying the most stringent security audit requirements.

4. Community Compute Sharing & C2C API Trading

Built around native Private Group and Invite Code mechanisms. User A hooks up their compute node and generates an invite code; User B redeems the code, gains access to User A's private multi-tenant environment, and invokes the compute power. The gateway handles all centralized billing and authentication, making it effortless to build your very own "OpenRouter."

🛠 Architecture Design

```mermaid
graph TB
    subgraph Users ["🌐 API Consumers"]
        U1["Apps / SDKs"]
        U2["curl / ChatUI"]
    end

    subgraph Gateway ["☁️ Tokilake Gateway (Hub)"]
        GIN["Gin HTTP Server"]
        RELAY["Relay Router"]
        PROV["Tokiame Provider"]
        SM["Session Manager"]
        DB[("DB / Channel Table")]
        GIN --> RELAY --> PROV
        PROV -->|"Lookup Session"| SM
        SM -->|"R/W Virtual Channel"| DB
    end

    subgraph Tunnel ["🔒 Multiplexed Reverse Tunnel"]
        direction LR
        CTRL["Control Stream<br/>register / heartbeat / models_sync"]
        DATA["Data Streams<br/>TunnelRequest ↔ TunnelResponse"]
    end

    subgraph Workers ["🖥️ Tokiame Edge Nodes (Behind NAT)"]
        W1["Tokiame Client A"]
        W2["Tokiame Client B"]
        B1["Ollama / vLLM<br/>Local GPU"]
        B2["SGLang / ComfyUI<br/>Local GPU"]
        W1 --> B1
        W2 --> B2
    end

    U1 & U2 -->|"Standard OpenAI HTTP API"| GIN
    PROV <-->|"Multiplexed Tunnel"| Tunnel
    Tunnel <-->|"Outbound-Only Connection"| W1 & W2
```
- Tokilake (Gateway/Hub level): the unified traffic ingress. It receives standard HTTP API requests from end users and multiplexes them to the corresponding edge nodes.
- Tokiame (Node/Worker level): the lightweight client on the edge. It maintains an ultra-low-latency reverse WebSocket tunnel via the xtaci/smux multiplexing protocol.
Simplified Workflow
  1. The Tokiame client initiates a WebSocket connection to Tokilake using a standard user API token.
  2. On successful verification, the gateway automatically creates or binds a virtual Channel (type=100) in the database and assigns it to a specific Private Group.
  3. When a user sends an LLM HTTP request through the gateway, the gateway treats the virtual Channel like any normal channel, transparently streaming the request to the edge node via the smux tunnel.
  4. Liveness relies on real-time heartbeat keepalives: if an edge node loses its connection, the gateway automatically disables its virtual Channel, enabling zero-downtime failover.

Acknowledgements

Legacy README (historical introduction and compatibility notes)


Directories

| Path | Synopsis |
| ---- | -------- |
| cmd | |
| tokiame | command |
| mcp | |
| tools/calculator | Package calculator provides a basic calculator tool |
| tools/current_time | Package current_time provides a tool to get the current time |
| pkg | |
| log | |
| ali | |
| xAI | |
