gomodel-lib

Published: Apr 28, 2026 License: MIT


GoModel - AI Gateway in Go


A fast and lightweight AI gateway written in Go, providing a unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, Ollama, and more.

*Screenshot: GoModel dashboard with AI usage analytics, observability panel, token and cost tracking, and estimated cost monitoring.*

Quick Start with Docker

Step 1: Start GoModel container

docker run --rm -p 8080:8080 \
  -e LOGGING_ENABLED=true \
  -e LOGGING_LOG_BODIES=true \
  -e LOG_FORMAT=text \
  -e LOGGING_LOG_HEADERS=true \
  -e OPENAI_API_KEY="your-openai-key" \
  enterpilot/gomodel

Pass only the provider credentials or base URL you need (at least one required):

docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  -e ANTHROPIC_API_KEY="your-anthropic-key" \
  -e GEMINI_API_KEY="your-gemini-key" \
  -e GROQ_API_KEY="your-groq-key" \
  -e OPENROUTER_API_KEY="your-openrouter-key" \
  -e ZAI_API_KEY="your-zai-key" \
  -e XAI_API_KEY="your-xai-key" \
  -e AZURE_API_KEY="your-azure-key" \
  -e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
  -e AZURE_API_VERSION="2024-10-21" \
  -e ORACLE_API_KEY="your-oracle-key" \
  -e ORACLE_BASE_URL="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1" \
  -e ORACLE_MODELS="openai.gpt-oss-120b,xai.grok-3" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
  -e VLLM_BASE_URL="http://host.docker.internal:8000/v1" \
  enterpilot/gomodel

⚠️ Avoid passing secrets via `-e` on the command line: they can leak through shell history and process listings. For production, use `docker run --env-file .env` to load API keys from a file instead.

Step 2: Make your first API call

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-chat-latest",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

That's it! GoModel automatically detects which providers are available based on the credentials you supply.

Supported LLM Providers

Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Per-feature support (chat, /responses, embeddings, files, batches, passthrough) reflects gateway API support, not every individual model capability exposed by an upstream provider.

| Provider | Credential | Example Model |
|---|---|---|
| OpenAI | `OPENAI_API_KEY` | `gpt-5.5` |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-20250514` |
| Google Gemini | `GEMINI_API_KEY` | `gemini-2.5-flash` |
| Groq | `GROQ_API_KEY` | `llama-3.3-70b-versatile` |
| OpenRouter | `OPENROUTER_API_KEY` | `google/gemini-2.5-flash` |
| Z.ai | `ZAI_API_KEY` (`ZAI_BASE_URL` optional) | `glm-5.1` |
| xAI (Grok) | `XAI_API_KEY` | `grok-4` |
| Azure OpenAI | `AZURE_API_KEY` + `AZURE_BASE_URL` (`AZURE_API_VERSION` optional) | `gpt-5` |
| Oracle | `ORACLE_API_KEY` + `ORACLE_BASE_URL` | `openai.gpt-oss-120b` |
| Ollama | `OLLAMA_BASE_URL` | `llama3.2` |
| vLLM | `VLLM_BASE_URL` (`VLLM_API_KEY` optional) | `meta-llama/Llama-3.1-8B-Instruct` |

  • For Z.ai's GLM Coding Plan, set `ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4`.
  • Configured model lists are available for every provider via `<PROVIDER>_MODELS`, for example `OPENROUTER_MODELS=openai/gpt-oss-120b,anthropic/claude-sonnet-4` or `ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3`.
  • By default, `CONFIGURED_PROVIDER_MODELS_MODE=fallback` uses those lists only when upstream `/models` is unavailable or empty. Set `CONFIGURED_PROVIDER_MODELS_MODE=allowlist` to expose only configured models for providers that define a list, skipping their upstream `/models` calls.
  • For vLLM, set `VLLM_API_KEY` only if the upstream server was started with `--api-key`.
  • To register multiple instances of the same provider type without `config.yaml`, use suffixed env vars such as `OPENAI_EAST_API_KEY` and `OPENAI_EAST_BASE_URL`; add `OPENAI_EAST_MODELS` to configure that instance's model list. This registers provider `openai-east` with type `openai`.
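As an illustration of the suffixed-variable pattern, a `.env` file for a second OpenAI-type instance might look like this (the key values, base URL, and model names are placeholders):

```
# Primary OpenAI provider
OPENAI_API_KEY=sk-your-primary-key

# Second OpenAI-type instance, registered as provider "openai-east"
OPENAI_EAST_API_KEY=sk-your-east-key
OPENAI_EAST_BASE_URL=https://east.example.com/v1
OPENAI_EAST_MODELS=gpt-5.5,gpt-5-chat-latest

# Treat configured model lists as an allowlist instead of a fallback
CONFIGURED_PROVIDER_MODELS_MODE=allowlist
```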


Alternative Setup Methods

Running from Source

Prerequisites: Go 1.26.2+

  1. Create a .env file:

    cp .env.template .env
    
  2. Add your API keys to .env (at least one required).

  3. Start the server:

    make run
    

Docker Compose

Infrastructure only (Redis, PostgreSQL, MongoDB, Adminer; no image build):

docker compose up -d
# or: make infra

Full stack (adds GoModel + Prometheus; builds the app image):

cp .env.template .env
# Add your API keys to .env
docker compose --profile app up -d
# or: make image
| Service | URL |
|---|---|
| GoModel API | http://localhost:8080 |
| Adminer (DB UI) | http://localhost:8081 |
| Prometheus | http://localhost:9090 |

Building the Docker Image Locally

docker build -t gomodel .
docker run --rm -p 8080:8080 --env-file .env gomodel

API Endpoints

OpenAI-Compatible API

| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat/completions` | POST | Chat completions (streaming supported) |
| `/v1/responses` | POST | OpenAI Responses API |
| `/v1/embeddings` | POST | Text embeddings |
| `/v1/models` | GET | List available models |
| `/v1/files` | POST | Upload a file (OpenAI-compatible multipart) |
| `/v1/files` | GET | List files |
| `/v1/files/{id}` | GET | Retrieve file metadata |
| `/v1/files/{id}` | DELETE | Delete a file |
| `/v1/files/{id}/content` | GET | Retrieve raw file content |
| `/v1/batches` | POST | Create a native provider batch (OpenAI-compatible schema; inline requests supported where provider-native) |
| `/v1/batches` | GET | List stored batches |
| `/v1/batches/{id}` | GET | Retrieve one stored batch |
| `/v1/batches/{id}/cancel` | POST | Cancel a pending batch |
| `/v1/batches/{id}/results` | GET | Retrieve native batch results when available |

Provider Passthrough

| Endpoint | Method | Description |
|---|---|---|
| `/p/{provider}/...` | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |

Admin Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/admin/dashboard` | GET | Admin dashboard UI |
| `/admin/api/v1/dashboard/config` | GET | Dashboard configuration |
| `/admin/api/v1/cache/overview` | GET | Cache statistics overview |
| `/admin/api/v1/usage/summary` | GET | Aggregate token usage statistics |
| `/admin/api/v1/usage/daily` | GET | Per-period token usage breakdown |
| `/admin/api/v1/usage/models` | GET | Usage breakdown by model |
| `/admin/api/v1/usage/user-paths` | GET | Usage breakdown by user path |
| `/admin/api/v1/usage/log` | GET | Paginated usage log entries |
| `/admin/api/v1/audit/log` | GET | Paginated audit log entries |
| `/admin/api/v1/audit/conversation` | GET | Conversation thread around one audit entry |
| `/admin/api/v1/providers/status` | GET | Provider availability status |
| `/admin/api/v1/runtime/refresh` | POST | Refresh runtime configuration |
| `/admin/api/v1/models` | GET | List models with provider type |
| `/admin/api/v1/models/categories` | GET | List model categories |
| `/admin/api/v1/model-overrides` | GET | List model overrides |
| `/admin/api/v1/model-overrides/:selector` | PUT | Create/update model override |
| `/admin/api/v1/model-overrides/:selector` | DELETE | Remove model override |
| `/admin/api/v1/auth-keys` | GET | List authentication keys |

Operations Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/metrics` | GET | Prometheus metrics (experimental, when enabled) |
| `/swagger/index.html` | GET | Swagger UI (when enabled) |

Gateway Configuration

GoModel is configured through environment variables and an optional config.yaml. Environment variables override YAML values. See .env.template and config/config.example.yaml for the available options.

Key settings:

| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | Server port |
| `BASE_PATH` | `/` | Mount the gateway under a path prefix such as `/g` |
| `GOMODEL_MASTER_KEY` | (none) | API key for authentication |
| `ENABLE_PASSTHROUGH_ROUTES` | `true` | Enable provider-native passthrough routes under `/p/{provider}/...` |
| `ALLOW_PASSTHROUGH_V1_ALIAS` | `true` | Allow `/p/{provider}/v1/...` aliases while keeping `/p/{provider}/...` canonical |
| `ENABLED_PASSTHROUGH_PROVIDERS` | `openai,anthropic,openrouter,zai,vllm` | Comma-separated list of enabled passthrough providers |
| `STORAGE_TYPE` | `sqlite` | Storage backend (`sqlite`, `postgresql`, `mongodb`) |
| `METRICS_ENABLED` | `false` | Enable Prometheus metrics (experimental) |
| `LOGGING_ENABLED` | `false` | Enable audit logging |
| `GUARDRAILS_ENABLED` | `false` | Enable the configured guardrails pipeline |

Quick Start - Authentication: GOMODEL_MASTER_KEY is unset by default. Without it, the API endpoints are unprotected and anyone can call them, which is insecure for production. Set a strong secret before exposing the service: add GOMODEL_MASTER_KEY to your .env file or environment for production deployments.


Response Caching

GoModel has a two-layer response cache that reduces LLM API costs and latency for repeated or semantically similar requests.

Layer 1 - Exact-match cache

Hashes the full request (path + workflow + body) and returns the stored response for byte-identical requests, with sub-millisecond lookups. Enable it by setting the RESPONSE_CACHE_SIMPLE_ENABLED and REDIS_URL environment variables.

Responses served from this layer carry X-Cache: HIT (exact).

Layer 2 - Semantic cache

Embeds the last user message via your configured provider’s OpenAI-compatible /v1/embeddings API (cache.response.semantic.embedder.provider must name a key in the top-level providers map) and performs a KNN vector search. Semantically equivalent queries - e.g. "What's the capital of France?" vs "Which city is France's capital?" - can return the same cached response without an upstream LLM call.

Expected hit rates: ~60–70% in high-repetition workloads vs. ~18% for exact-match alone.

Responses served from this layer carry X-Cache: HIT (semantic).

Supported vector backends: qdrant, pgvector, pinecone, weaviate (set cache.response.semantic.vector_store.type and the matching nested block).
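Putting those key paths together, a config.yaml fragment might look like the following. Only `cache.response.semantic.embedder.provider` and `vector_store.type` are taken from the text above; the remaining fields are placeholders, so check config/config.example.yaml for the exact schema.

```yaml
cache:
  response:
    semantic:
      embedder:
        provider: openai   # must name a key in the top-level providers map
      vector_store:
        type: qdrant       # qdrant | pgvector | pinecone | weaviate
        qdrant:
          url: http://localhost:6333   # placeholder
providers:
  openai:
    api_key: ${OPENAI_API_KEY}
```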

Both cache layers run after guardrail/workflow patching so they always see the final prompt. Use Cache-Control: no-cache or Cache-Control: no-store to bypass caching per-request.


See DEVELOPMENT.md for testing, linting, and pre-commit setup.


Roadmap to 0.2.0

Must Have

  • Intelligent routing
  • Broader provider support: Oracle model configuration via environment variables, plus Cohere, Command A, Operational, and DeepSeek V3
  • Budget management with limits per user_path and/or API key
  • Editable model pricing for accurate cost tracking and budgeting
  • Full support for the OpenAI /responses and /conversations lifecycle
  • Prompt cache visibility showing how much of each prompt was cached by the provider
  • Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
  • Passthrough for all providers, beyond the current OpenAI and Anthropic beta
  • Fix failover charts in the dashboard

Should Have

  • Cluster mode

Community

Join our Discord to connect with other GoModel users.


Directories

Package auditlog provides audit logging for the AI gateway.
Package batch provides persistence for OpenAI-compatible batch lifecycle endpoints.
Package cache provides a generic key-value store abstraction.
modelcache: Package modelcache provides model-specific cache types and interfaces.
Package config provides configuration management for the application.
Package core provides core types and interfaces for the LLM gateway.
docs
2026-03-23_benchmark_scripts/gateway-comparison/mock-backend (command): Mock OpenAI-compatible backend server for benchmarking AI gateways.
2026-03-23_benchmark_scripts/gateway-comparison/stream-bench (command): Streaming SSE benchmark tool.
Package gateway contains transport-independent gateway use cases.
Package guardrails provides a pluggable pipeline for request-level guardrails.
Package httpclient provides a centralized HTTP client factory with unified configuration.
Package llmclient provides a base HTTP client for LLM providers with: request marshaling/unmarshaling; retries with exponential backoff and jitter; standardized error parsing (429, 502, 503, 504); circuit breaking with half-open state protection.
Package modeldata provides fetching, parsing, and merging of the external AI model metadata registry (models.json) for enriching GoModel's model data.
Package observability provides instrumentation for metrics, tracing, and logging.
Package providers provides a factory for creating provider instances.
anthropic: Package anthropic provides Anthropic API integration for the LLM gateway.
gemini: Package gemini provides Google Gemini API integration for the LLM gateway.
groq: Package groq provides Groq API integration for the LLM gateway.
minimax: Package minimax provides MiniMax API integration for the LLM gateway.
ollama: Package ollama provides Ollama API integration for the LLM gateway.
openai: Package openai provides OpenAI API integration for the LLM gateway.
vllm: Package vllm provides vLLM OpenAI-compatible API integration for the LLM gateway.
xai: Package xai provides xAI (Grok) API integration for the LLM gateway.
zai: Package zai provides Z.ai API integration for the LLM gateway.
Package responsestore provides persistence for OpenAI-compatible Responses lifecycle endpoints.
Package storage provides shared database connections for all features.
tests/contract: Package contract provides contract tests that validate API response structures against recorded golden files.
Package tools pins developer tooling dependencies used by this repository.
Package usage provides token usage tracking for the AI gateway.
