Tokilake

command module
v0.12.1
Published: Apr 25, 2026 License: Apache-2.0 Imports: 34 Imported by: 0

README

Tokilake & Tokiame

Control your own GPUs like OpenRouter.

Tokilake is a decentralized Large Language Model (LLM) API scheduling gateway built on the One-API ecosystem. It inverts the traditional API gateway model: instead of the gateway acting as a client that calls servers with public IPs, any GPU worker node (Tokiame) behind NAT or on an intranet dials out to the central gateway (Hub) over a reverse tunnel (WebSocket or QUIC).

Tokilake is built on top of MartialBE/one-hub and the broader One-API ecosystem that evolved around it.

📖 Quick Start

You can explore the core features in the Tokilake Demo. For full control over data privacy and distribution, we recommend self-hosting by following the End-to-End Deployment Guide. Whether you're a first-time user or ready for a full deployment, the guides below are here to help.

The fastest way to deploy your own Tokilake Hub with automatic HTTPS and QUIC support:

# 1. Clone the repository
git clone https://github.com/Tokimorphling/Tokilake.git
cd Tokilake

# 2. Production deploy with host nginx + Docker + Let's Encrypt
sudo ./deploy/bootstrap-nginx-letsencrypt.sh \
  --domain api.example.com \
  --email admin@example.com \
  --sql-dsn 'postgres://user:password@127.0.0.1:5432/tokilake'

Access your dashboard at https://api.example.com.

To update the image later:

sudo ./deploy/bootstrap-nginx-letsencrypt.sh \
  --domain api.example.com \
  --update

For a local-only smoke test without nginx or certificates:

docker run -d \
  --name tokilake-local \
  --restart unless-stopped \
  -p 19981:19981 \
  -e TZ=UTC \
  -e PORT=19981 \
  -e GIN_MODE=release \
  -e SERVER_ADDRESS="http://localhost:19981" \
  -e USER_TOKEN_SECRET="$(openssl rand -hex 32)" \
  -e SESSION_SECRET="$(openssl rand -hex 32)" \
  -v tokilake-local-data:/data \
  ghcr.io/tokimorphling/tokilake:latest

Then open http://localhost:19981.
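
Once the Hub is running, clients talk to it over a standard OpenAI-style HTTP API. The following is a minimal Go sketch of building such a request, not code from this repository. It assumes the gateway serves the usual `/v1/chat/completions` path, and the token `sk-your-token` and model name `your-model` are placeholders you would replace with values from your own dashboard.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest mirror the common OpenAI chat-completions
// request shape; only the fields needed for this sketch are included.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest builds (but does not send) a chat-completions request.
// baseURL, token, and model are placeholders supplied by the caller.
func newChatRequest(baseURL, token, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	// Assumption: the gateway exposes the usual /v1/chat/completions path.
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:19981", "sk-your-token", "your-model", "Hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

The request is only constructed here so the sketch runs without a live gateway; sending it requires a valid token and at least one connected worker serving the named model.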

🌟 Core Concept

Traditional API proxies typically act as clients, routing requests to servers with public IP addresses. If your high-performance GPUs (like an RTX 4090) are sitting quietly on a local home network, or scattered across temporary Spot Instances from various cloud providers, unifying them into a stable, accessible API is a major challenge.

Tokiame removes that obstacle. Running as a lightweight daemon, it dials out to the cloud-based Tokilake gateway. Once connected, Tokilake maps the worker internally to a standard Channel, so you don't need intranet penetration tools such as FRP or Ngrok. The worker then benefits from the gateway's existing load balancing, traffic shaping, authentication, and billing systems without extra setup.

🚀 Perfect Use Cases

1. Distributed GPU Pooling for Individuals & Studios (NAT Penetration)

Tailor-made for home broadband or campus network environments without public IPs. Run the Tokiame process locally and it establishes a tunnel with the cloud gateway. The LLMs you deploy locally using Ollama or vLLM can then securely provide standard OpenAI-compatible API services to the outside world.

2. Hybrid Cloud Orchestration

Purchased scattered GPU instances across different compute platforms (e.g., AWS, AliCloud, AutoDL, RunPod)? Skip the complex SD-WAN setups. Simply attach the Tokiame startup script to your new instances, and they automatically register into the load-balancing pool. When instances are destroyed or shut down, the heartbeat mechanism safely takes the node offline, drastically reducing DevOps overhead.

3. Enterprise Data Privacy & "Bring Your Own Model" (BYOM)

SaaS providers handle the business logic frontend, while clients provide the compute backend. Clients only need to deploy Tokiame inside their private server rooms, where it initiates a one-way outbound connection to the SaaS gateway. The client's server room exposes no inbound ports, yet the gateway can still schedule requests to the private models, which helps satisfy strict security audit requirements.

4. Community Compute Sharing & C2C API Trading

Built around native Private Group and Invite Code mechanisms. User A hooks up their compute node and generates an invite code; User B redeems the code, gains access to User A's private multi-tenant environment, and invokes the compute power. The gateway handles all centralized billing and authentication, making it effortless to build your very own "OpenRouter."

🛠 Architecture Design

graph TB
    subgraph Users ["🌐 API Consumers"]
        U1["Apps / SDKs"]
        U2["curl / ChatUI"]
    end

    subgraph Gateway ["☁️ Tokilake Gateway (Hub)"]
        GIN["Gin HTTP Server"]
        RELAY["Relay Router"]
        PROV["Tokiame Provider"]
        SM["Session Manager"]
        DB[("DB / Channel Table")]
        GIN --> RELAY --> PROV
        PROV -->|"Lookup Session"| SM
        SM -->|"R/W Virtual Channel"| DB
    end

    subgraph Tunnel ["🔒 Multiplexed Reverse Tunnel"]
        direction LR
        CTRL["Control Stream<br/>register / heartbeat / models_sync"]
        DATA["Data Streams<br/>TunnelRequest ↔ TunnelResponse"]
    end

    subgraph Workers ["🖥️ Tokiame Edge Nodes (Behind NAT)"]
        W1["Tokiame Client A"]
        W2["Tokiame Client B"]
        B1["Ollama / vLLM<br/>Local GPU"]
        B2["SGLang / ComfyUI<br/>Local GPU"]
        W1 --> B1
        W2 --> B2
    end

    U1 & U2 -->|"Standard OpenAI HTTP API"| GIN
    PROV <-->|"Multiplexed Tunnel"| Tunnel
    Tunnel <-->|"Outbound-Only Connection"| W1 & W2

  • Tokilake (Gateway/Hub Level): The unified ingress for traffic. It receives standard HTTP API requests from end-users and forwards each one over the multiplexed tunnel to the corresponding edge node.
  • Tokiame (Node/Worker Level): The lightweight client on the edge. It maintains a low-latency reverse tunnel via either WebSocket (with xtaci/smux multiplexing) or QUIC.
  • tokilake-core: The standalone protocol, tunnel, session, and gateway core. It has no onehub database dependency.
  • tokilake-onehub: The onehub adapter that maps connected workers into channels, providers, and video tasks.

Transport Options

Tokiame supports two transport protocols:

Protocol | Description
WebSocket (default) | Uses xtaci/smux for stream multiplexing over WebSocket. Compatible with standard HTTP/HTTPS gateways.
QUIC | Native QUIC protocol (quic-go) with built-in multiplexing and 0-RTT connection establishment. Requires TLS and a dedicated QUIC-enabled gateway endpoint.

Transport Mode Selection (set via TOKIAME_TRANSPORT_MODE):

  • auto (default): Attempts QUIC first and falls back to WebSocket if the connection fails
  • quic: QUIC only
  • websocket: WebSocket only

QUIC is ideal for scenarios requiring lower latency and better connection resilience, especially on unreliable networks. It also supports server-side connection migration.

Simplified Workflow

  1. The Tokiame client initiates a WebSocket or QUIC connection request to Tokilake using a standard user API token.
  2. Once the gateway has verified the token, it automatically creates or binds a virtual Channel (type=100) in the database and assigns it to a specific Private Group.
  3. When a user sends an LLM HTTP request through the gateway, the gateway treats the virtual Channel like any normal channel and transparently streams the request through the tunnel to the edge node for processing.
  4. The tunnel relies on real-time heartbeat keepalives. If an edge node loses its connection, the gateway automatically disables its virtual Channel, so traffic fails over to the remaining channels.

Acknowledgements

Legacy README

Legacy README (historical introduction and compatibility notes)

Documentation

There is no documentation for this package.

Directories

Path Synopsis
cmd
tokiame command
mcp
tools/calculator
Package calculator provides a basic calculator tool
tools/current_time
Package current_time provides a tool to get the current time
pkg
log
ali
xAI
tokiame module
