ai

package module
v1.785.0
Published: May 19, 2026 License: Apache-2.0 Imports: 5 Imported by: 0

README

ai

LLM control plane, RAG, and model hub for the Hanzo platform. Native Go model routing with no Python middlemen.

Quick start

# Pull the image
docker pull ghcr.io/hanzoai/ai:latest

# Send a test request
curl -H "Authorization: Bearer hk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  https://api.hanzo.ai/v1/chat/completions \
  -d '{"model":"zen4-pro","messages":[{"role":"user","content":"Hello"}]}'
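The same request can be built from Go. This is a minimal sketch, not part of the ai package: the helper name newChatRequest and the struct names are hypothetical, and only the two request fields used in the curl example (model, messages) are modelled. The request is printed rather than sent, so the sketch runs without a real API key.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest mirror the JSON body from the curl example.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest builds (but does not send) a POST to /v1/chat/completions.
// The name and parameters are illustrative only.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("https://api.hanzo.ai", "hk-YOUR-API-KEY", "zen4-pro", "Hello")
	if err != nil {
		panic(err)
	}
	// Printing instead of sending keeps the sketch runnable offline.
	fmt.Println(req.Method, req.URL.String())
}
```

To send the request, pass it to http.DefaultClient.Do and decode the response body; the response schema is not shown in this README, so it is left out here.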

What this is

ai is the canonical LLM control plane for the Hanzo platform: model routing, RAG, model hub, and MCP management. It provides:

  • OpenAI-compatible JSON over HTTP (/v1/chat/completions, /v1/models)
  • Three auth modes: IAM API key (hk-*), JWT, and provider key (sk-*)
  • Static model routing: 66+ models mapped to upstream providers in pure Go
  • Per-request usage tracking, sent to IAM asynchronously (fire-and-forget)
  • KMS-resolved provider secrets, scoped per organization

The repository was renamed from hanzoai/cloud on 2026-05-19, when the unified binary took the cloud name per HIP-0106. ai now mounts as the ai subsystem inside hanzoai/cloud.
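Fire-and-forget usage tracking is commonly built as a buffered queue drained by a background worker, so the request path never waits on IAM. The sketch below shows that pattern under stated assumptions: the usageEvent fields, the usageTracker type, and the drop-when-full policy are all hypothetical, and the real IAM payload and delivery code are not described in this README.

```go
package main

import (
	"fmt"
	"sync"
)

// usageEvent is a hypothetical record; the real IAM payload is not documented here.
type usageEvent struct {
	OrgID  string
	Model  string
	Tokens int
}

// usageTracker queues events on a buffered channel and delivers them in the background.
type usageTracker struct {
	events chan usageEvent
	wg     sync.WaitGroup
}

// newUsageTracker starts one worker goroutine that passes each queued event to sink.
func newUsageTracker(buffer int, sink func(usageEvent)) *usageTracker {
	t := &usageTracker{events: make(chan usageEvent, buffer)}
	t.wg.Add(1)
	go func() {
		defer t.wg.Done()
		for ev := range t.events {
			sink(ev)
		}
	}()
	return t
}

// Track never blocks the caller: if the queue is full, the event is dropped
// and false is returned (an assumed policy for this sketch).
func (t *usageTracker) Track(ev usageEvent) bool {
	select {
	case t.events <- ev:
		return true
	default:
		return false
	}
}

// Close stops accepting events and waits for queued ones to be delivered.
func (t *usageTracker) Close() {
	close(t.events)
	t.wg.Wait()
}

func main() {
	total := 0
	t := newUsageTracker(16, func(ev usageEvent) { total += ev.Tokens })
	t.Track(usageEvent{OrgID: "acme", Model: "zen4-pro", Tokens: 42})
	t.Close() // after Close returns, the worker has finished and total is safe to read
	fmt.Println("tokens recorded:", total)
}
```

Dropping on a full queue trades accuracy for latency; a service that bills on usage might instead spill to disk or retry, which this sketch does not attempt.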

Specs

Implements:

  • HIP-0037 AI Cloud Platform
  • HIP-0106 Unified Cloud Binary (ai subsystem)

Architecture

   user / api key  ->  hanzoai/gateway  ->  hanzoai/ai (zip.App, Go)
                                                  |
                              auth: hk-* | JWT | sk-* (provider passthrough)
                                                  |
                              model routing: 66+ models -> upstream providers
                                                  |
                          +--------+--------+--------+--------+
                          |        |        |        |        |
                       DO-AI  Fireworks  OpenAI   Anthropic   ...
                          |        |        |        |        |
                              KMS-resolved provider secrets
                                                  |
                              IAM usage tracking (async)

Hanzo Cloud

AI Cloud OS — Native Go model routing, IAM-integrated auth, usage tracking, and KMS secrets management. Zero middlemen, pure performance.

Architecture

Hanzo Cloud implements the ZAP (Zero-overhead API Protocol) — a native Go model routing layer that connects users directly to upstream AI providers with no intermediaries.

User → cloud-api (Go, ZAP gateway) → upstream providers (DO-AI, Fireworks, OpenAI)
                ↕                              ↕
           Hanzo IAM                      Hanzo KMS
      (auth, billing, usage)          (multi-tenant secrets)

Component   Description                                 Technology
Gateway     ZAP-native model routing, auth, billing     Go + Beego
Frontend    Admin UI, chat, knowledge base              React + Next.js
IAM         Identity, API keys (hk-*), usage tracking   hanzoai/iam
KMS         Multi-tenant secrets, org-scoped projects   hanzoai/kms
Engine      Local inference (mistral.rs fork)           hanzoai/engine

ZAP Protocol

ZAP defines the fast native path from API gateway to AI inference:

  • OpenAI-compatible JSON over HTTP (/v1/chat/completions, /v1/models)
  • Three auth modes: IAM API key (hk-*), JWT (hanzo.id OAuth), Provider key (sk-*)
  • Static model routing — 66+ models mapped to 3 upstream providers in pure Go
  • Per-request usage tracking — async fire-and-forget to IAM
  • KMS-resolved secrets — provider API keys from Infisical with org-scoped projects
  • Zero Python — no legacy proxy middleware, no extra hops
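The three auth modes in the list above can be told apart from the bearer token alone. The following is a rough sketch of that classification, assuming the prefixes hk- and sk- are the only discriminators and that anything with three dot-separated segments is a JWT candidate. The names authMode and classifyAuth are hypothetical, and no signature or key validation is performed.

```go
package main

import (
	"fmt"
	"strings"
)

// authMode labels the three documented auth modes plus "none".
type authMode string

const (
	authIAMKey      authMode = "iam-api-key"  // hk-* keys issued by IAM
	authProviderKey authMode = "provider-key" // sk-* keys passed through to a provider
	authJWT         authMode = "jwt"          // tokens with three dot-separated segments
	authNone        authMode = "none"
)

// classifyAuth inspects an Authorization header value and guesses the auth mode.
// It only looks at the token's shape; real verification must happen afterwards.
func classifyAuth(header string) authMode {
	token, ok := strings.CutPrefix(header, "Bearer ")
	token = strings.TrimSpace(token)
	if !ok || token == "" {
		return authNone
	}
	switch {
	case strings.HasPrefix(token, "hk-"):
		return authIAMKey
	case strings.HasPrefix(token, "sk-"):
		return authProviderKey
	case strings.Count(token, ".") == 2:
		return authJWT
	default:
		return authNone
	}
}

func main() {
	for _, h := range []string{"Bearer hk-abc123", "Bearer sk-xyz", "Bearer aaa.bbb.ccc", ""} {
		fmt.Printf("%q -> %s\n", h, classifyAuth(h))
	}
}
```

The sketch matches the scheme name "Bearer" case-sensitively for brevity; a production parser would likely be more lenient.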

Supported Models

Free Tier (DO-AI) — 28 models

gpt-4o, gpt-5, gpt-5-mini, claude-opus-4-6, claude-sonnet-4-5, claude-haiku-4-5, o3, o3-mini, qwen3-32b, deepseek-r1-distill-70b, llama-3.3-70b, and more.

Premium (Fireworks) — 17 models

fireworks/deepseek-r1, fireworks/deepseek-v3, fireworks/kimi-k2, fireworks/qwen3-235b-a22b, fireworks/qwen3-coder-480b, fireworks/cogito-671b, and more.

Premium (OpenAI Direct) — 5 models

openai-direct/gpt-4o, openai-direct/gpt-5, openai-direct/o3, openai-direct/o3-mini, openai-direct/gpt-4o-mini

Zen Models (Premium, 3X) — 8 models

zen4-mini, zen4-pro, zen4-max, zen4-ultra, zen4-coder-flash, zen4-coder-pro, zen-vl, zen3-omni

Full model list: GET /api/models
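Static routing of this kind usually reduces to a lookup table from model name to upstream provider. A minimal sketch, using a handful of the model names listed above: the provider type, the routes map, and resolve are hypothetical names, the table is deliberately incomplete, and the Zen models are omitted because their upstream is not stated here.

```go
package main

import "fmt"

// provider names an upstream, as grouped in the model lists above.
type provider string

const (
	providerDOAI      provider = "DO-AI"
	providerFireworks provider = "Fireworks"
	providerOpenAI    provider = "OpenAI"
)

// routes is a small illustrative subset of the static model table.
var routes = map[string]provider{
	"gpt-4o":                providerDOAI,
	"claude-sonnet-4-5":     providerDOAI,
	"fireworks/deepseek-r1": providerFireworks,
	"fireworks/kimi-k2":     providerFireworks,
	"openai-direct/gpt-4o":  providerOpenAI,
	"openai-direct/o3":      providerOpenAI,
}

// resolve returns the upstream for a model, or false if the model is unknown.
func resolve(model string) (provider, bool) {
	p, ok := routes[model]
	return p, ok
}

func main() {
	for _, m := range []string{"gpt-4o", "fireworks/kimi-k2", "openai-direct/o3", "no-such-model"} {
		if p, ok := resolve(m); ok {
			fmt.Printf("%s -> %s\n", m, p)
		} else {
			fmt.Printf("%s -> unknown model\n", m)
		}
	}
}
```

Because the table is fixed at compile time, lookups need no locking and an unknown model can be rejected before any upstream call is made.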

Quick Start

# Build
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags="-w -s" -o cloud-api-server .

# Run
./cloud-api-server

# Test
curl -H "Authorization: Bearer hk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  https://api.hanzo.ai/v1/chat/completions \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'

Configuration

Set via conf/app.conf or environment variables:

Variable            Description
HANZO_API_KEY       Unified service token for internal IAM + KMS operations (billing/usage/auth support)
iamEndpoint         IAM service URL (production: http://iam.hanzo.svc.cluster.local:8000)
clientId            IAM OAuth client ID for cloud
clientSecret        IAM OAuth client secret for cloud
dataSourceName      Database DSN (do not commit; inject via KMS-managed secret)
KMS_CLIENT_ID       Infisical Universal Auth client ID
KMS_CLIENT_SECRET   Infisical Universal Auth client secret
KMS_PROJECT_ID      Default KMS project ID
KMS_ENVIRONMENT     KMS environment (default: production)
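As an illustration of how the KMS_* variables and the documented default might be read, here is a small sketch. It is not the package's actual configuration code: kmsConfig, loadKMSConfig, and hasUniversalAuth are hypothetical names, and only the "production" default for KMS_ENVIRONMENT comes from the table above. The lookup function is injected so the logic can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
)

// kmsConfig holds the four KMS_* settings from the configuration table.
type kmsConfig struct {
	ClientID     string
	ClientSecret string
	ProjectID    string
	Environment  string
}

// loadKMSConfig reads the KMS_* variables through getenv and applies the
// documented default for KMS_ENVIRONMENT.
func loadKMSConfig(getenv func(string) string) kmsConfig {
	cfg := kmsConfig{
		ClientID:     getenv("KMS_CLIENT_ID"),
		ClientSecret: getenv("KMS_CLIENT_SECRET"),
		ProjectID:    getenv("KMS_PROJECT_ID"),
		Environment:  getenv("KMS_ENVIRONMENT"),
	}
	if cfg.Environment == "" {
		cfg.Environment = "production" // default stated in the table
	}
	return cfg
}

// hasUniversalAuth reports whether both Universal Auth credentials are present.
func (c kmsConfig) hasUniversalAuth() bool {
	return c.ClientID != "" && c.ClientSecret != ""
}

func main() {
	cfg := loadKMSConfig(os.Getenv)
	fmt.Println("KMS environment:", cfg.Environment)
	fmt.Println("Universal Auth configured:", cfg.hasUniversalAuth())
}
```

Secrets such as KMS_CLIENT_SECRET should never be printed; the sketch only reports whether they are set.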

Deploy

# Docker
docker pull ghcr.io/hanzoai/cloud:latest

# Kubernetes (production)
kubectl apply -f k8s/kms-secrets.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml

The GitHub Actions production deploy workflow (.github/workflows/deploy-production.yml) resolves deployment credentials from Hanzo KMS. The preferred setup is a single GitHub secret, HANZO_API_KEY; Universal Auth (KMS_CLIENT_ID + KMS_CLIENT_SECRET) remains available as a fallback.

License

Apache-2.0

Documentation

Overview

Package ai mounts the Hanzo AI subsystem (LLM control plane, RAG, model hub, MCP management) into the unified cloud binary per HIP-0106.

The legacy entry point at ~/work/hanzo/ai/main.go registers the existing beego ControllerRegister tree. Mount adapts that same ControllerRegister onto a zip.App via zip.AdaptNetHTTP so the routes continue to operate unchanged while running under the canonical zip-driven cloud entry.

All ~309 X-Org-Id call-sites inside controllers/* continue to read gateway-minted identity headers (X-Org-Id, X-User-Id, X-User-Email) per HIP-0026 — the adapter does not strip headers; zip middleware in the cloud binary already mints them from the JWT before forwarding.
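A call-site that reads the gateway-minted identity might look roughly like the sketch below. The identity struct and identityFromHeaders are hypothetical and do not reproduce the code in controllers/*; treating a missing X-Org-Id as an error is also an assumption. These headers are only trustworthy when the gateway has minted them, as described above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// identity groups the three gateway-minted headers named above.
type identity struct {
	OrgID     string
	UserID    string
	UserEmail string
}

// identityFromHeaders extracts the identity from request headers.
// Rejecting a request without X-Org-Id is an assumption of this sketch.
func identityFromHeaders(h http.Header) (identity, error) {
	id := identity{
		OrgID:     h.Get("X-Org-Id"),
		UserID:    h.Get("X-User-Id"),
		UserEmail: h.Get("X-User-Email"),
	}
	if id.OrgID == "" {
		return identity{}, errors.New("missing X-Org-Id header")
	}
	return id, nil
}

func main() {
	h := http.Header{}
	h.Set("X-Org-Id", "acme")
	h.Set("X-User-Id", "u-123")
	h.Set("X-User-Email", "dev@example.com")

	id, err := identityFromHeaders(h)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id.OrgID, id.UserID, id.UserEmail)
}
```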

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Mount

func Mount(app *zip.App, deps cloud.Deps) error

Mount registers AI's HTTP surface per HIP-0106.

Routes under /v1/ai/* are forwarded to the registered handler (the beego ControllerRegister built by routers/router.go). If no handler is registered yet, those routes return 503. This lets the cloud binary boot the ai subsystem progressively: load model config, initialize providers, then call SetHandler.

The standalone cmd/ai/main.go shim calls SetHandler(beego.BeeApp.Handlers) after object.InitDb and routers/init. The unified binary calls the same SetHandler in its bootstrap.

func SetHandler

func SetHandler(h http.Handler)

SetHandler registers the ai runtime's public HTTP handler (typically beego.BeeApp.Handlers after routers/router.go init). Safe for concurrent use; pass nil to deactivate.
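The documented behaviour (a handler that can be swapped at runtime, 503 until one is registered, nil to deactivate) can be sketched with the standard library alone. This is not the package's implementation: setHandler, serve, probe, and okHandler are hypothetical names, the zip.App mounting is left out, and /v1/ai/models is just an example path under /v1/ai/*.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

var (
	mu      sync.RWMutex
	current http.Handler // nil means "not ready"
)

// setHandler swaps the active handler; passing nil deactivates it.
func setHandler(h http.Handler) {
	mu.Lock()
	defer mu.Unlock()
	current = h
}

// serve forwards to the active handler, or answers 503 when none is registered.
func serve(w http.ResponseWriter, r *http.Request) {
	mu.RLock()
	h := current
	mu.RUnlock()
	if h == nil {
		http.Error(w, "ai subsystem not ready", http.StatusServiceUnavailable)
		return
	}
	h.ServeHTTP(w, r)
}

// probe issues an in-memory GET against serve and returns the status code.
func probe(path string) int {
	rec := httptest.NewRecorder()
	serve(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

// okHandler stands in for beego.BeeApp.Handlers in this sketch.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
}

func main() {
	fmt.Println(probe("/v1/ai/models")) // no handler registered yet
	setHandler(okHandler())
	fmt.Println(probe("/v1/ai/models")) // handler active
	setHandler(nil)
	fmt.Println(probe("/v1/ai/models")) // deactivated again
}
```

Copying the handler out under a read lock keeps the lock from being held while a request is served, so a slow request does not block a later setHandler call.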

Types

This section is empty.

Directories

Path Synopsis
cmd
aid command
Package routers @APIVersion 1.70.0 @Title Hanzo Cloud RESTful API @Description Swagger Docs of Hanzo Cloud Backend API @Contact cloud@hanzo.ai @SecurityDefinition AccessToken apiKey Authorization header @Schemes https,http @ExternalDocs Find out more about Hanzo Cloud @ExternalDocsUrl https://hanzo.ai/cloud
