gomodel-lib

Published: Apr 28, 2026 License: MIT


GoModel - AI Gateway in Go


A fast and lightweight AI gateway written in Go, providing a unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, Ollama, and more.

*Screenshot: GoModel dashboard with AI usage analytics, observability panel, token and cost tracking, and estimated cost monitoring.*

Quick Start with Docker

Step 1: Start GoModel container

docker run --rm -p 8080:8080 \
  -e LOGGING_ENABLED=true \
  -e LOGGING_LOG_BODIES=true \
  -e LOG_FORMAT=text \
  -e LOGGING_LOG_HEADERS=true \
  -e OPENAI_API_KEY="your-openai-key" \
  enterpilot/gomodel

Pass only the provider credentials or base URL you need (at least one required):

docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  -e ANTHROPIC_API_KEY="your-anthropic-key" \
  -e GEMINI_API_KEY="your-gemini-key" \
  -e GROQ_API_KEY="your-groq-key" \
  -e OPENROUTER_API_KEY="your-openrouter-key" \
  -e ZAI_API_KEY="your-zai-key" \
  -e XAI_API_KEY="your-xai-key" \
  -e AZURE_API_KEY="your-azure-key" \
  -e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
  -e AZURE_API_VERSION="2024-10-21" \
  -e ORACLE_API_KEY="your-oracle-key" \
  -e ORACLE_BASE_URL="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1" \
  -e ORACLE_MODELS="openai.gpt-oss-120b,xai.grok-3" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
  -e VLLM_BASE_URL="http://host.docker.internal:8000/v1" \
  enterpilot/gomodel

⚠️ Avoid passing secrets via `-e` on the command line: they can leak through shell history and process listings. For production, use `docker run --env-file .env` to load API keys from a file instead.

Step 2: Make your first API call

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-chat-latest",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

That's it! GoModel automatically detects which providers are available based on the credentials you supply.

Supported LLM Providers

Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Per-feature support (chat, /responses, embeddings, files, batches, passthrough) reflects gateway API support, not every individual model capability exposed by an upstream provider.

| Provider | Credential | Example Model |
|---|---|---|
| OpenAI | `OPENAI_API_KEY` | `gpt-5.5` |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-20250514` |
| Google Gemini | `GEMINI_API_KEY` | `gemini-2.5-flash` |
| Groq | `GROQ_API_KEY` | `llama-3.3-70b-versatile` |
| OpenRouter | `OPENROUTER_API_KEY` | `google/gemini-2.5-flash` |
| Z.ai | `ZAI_API_KEY` (`ZAI_BASE_URL` optional) | `glm-5.1` |
| xAI (Grok) | `XAI_API_KEY` | `grok-4` |
| Azure OpenAI | `AZURE_API_KEY` + `AZURE_BASE_URL` (`AZURE_API_VERSION` optional) | `gpt-5` |
| Oracle | `ORACLE_API_KEY` + `ORACLE_BASE_URL` | `openai.gpt-oss-120b` |
| Ollama | `OLLAMA_BASE_URL` | `llama3.2` |
| vLLM | `VLLM_BASE_URL` (`VLLM_API_KEY` optional) | `meta-llama/Llama-3.1-8B-Instruct` |

  • For Z.ai's GLM Coding Plan, set `ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4`.
  • Configured model lists are available for every provider via `<PROVIDER>_MODELS`, for example `OPENROUTER_MODELS=openai/gpt-oss-120b,anthropic/claude-sonnet-4` or `ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3`.
  • By default, `CONFIGURED_PROVIDER_MODELS_MODE=fallback` uses those lists only when upstream `/models` is unavailable or empty. Set `CONFIGURED_PROVIDER_MODELS_MODE=allowlist` to expose only configured models for providers that define a list, skipping their upstream `/models` calls.
  • For vLLM, set `VLLM_API_KEY` only if the upstream server was started with `--api-key`.
  • To register multiple instances of the same provider type without `config.yaml`, use suffixed env vars such as `OPENAI_EAST_API_KEY` and `OPENAI_EAST_BASE_URL`; add `OPENAI_EAST_MODELS` to configure that instance's model list. This registers provider `openai-east` with type `openai`.
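As an illustration of the suffixed-variable pattern, a `.env` file for a second OpenAI-type instance might look like this (the key values, base URL, and model names are placeholders):

```
# Primary OpenAI provider
OPENAI_API_KEY=sk-your-primary-key

# Second OpenAI-type instance, registered as provider "openai-east"
OPENAI_EAST_API_KEY=sk-your-east-key
OPENAI_EAST_BASE_URL=https://east.example.com/v1
OPENAI_EAST_MODELS=gpt-5.5,gpt-5-chat-latest

# Treat configured model lists as an allowlist instead of a fallback
CONFIGURED_PROVIDER_MODELS_MODE=allowlist
```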


Alternative Setup Methods

Running from Source

Prerequisites: Go 1.26.2+

  1. Create a .env file:

    cp .env.template .env
    
  2. Add your API keys to .env (at least one required).

  3. Start the server:

    make run
    

Docker Compose

Infrastructure only (Redis, PostgreSQL, MongoDB, Adminer; no image build):

docker compose up -d
# or: make infra

Full stack (adds GoModel + Prometheus; builds the app image):

cp .env.template .env
# Add your API keys to .env
docker compose --profile app up -d
# or: make image
| Service | URL |
|---|---|
| GoModel API | http://localhost:8080 |
| Adminer (DB UI) | http://localhost:8081 |
| Prometheus | http://localhost:9090 |

Building the Docker Image Locally

docker build -t gomodel .
docker run --rm -p 8080:8080 --env-file .env gomodel

API Endpoints

OpenAI-Compatible API

| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat/completions` | POST | Chat completions (streaming supported) |
| `/v1/responses` | POST | OpenAI Responses API |
| `/v1/embeddings` | POST | Text embeddings |
| `/v1/models` | GET | List available models |
| `/v1/files` | POST | Upload a file (OpenAI-compatible multipart) |
| `/v1/files` | GET | List files |
| `/v1/files/{id}` | GET | Retrieve file metadata |
| `/v1/files/{id}` | DELETE | Delete a file |
| `/v1/files/{id}/content` | GET | Retrieve raw file content |
| `/v1/batches` | POST | Create a native provider batch (OpenAI-compatible schema; inline requests supported where provider-native) |
| `/v1/batches` | GET | List stored batches |
| `/v1/batches/{id}` | GET | Retrieve one stored batch |
| `/v1/batches/{id}/cancel` | POST | Cancel a pending batch |
| `/v1/batches/{id}/results` | GET | Retrieve native batch results when available |

Provider Passthrough

| Endpoint | Method | Description |
|---|---|---|
| `/p/{provider}/...` | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |

Admin Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/admin/dashboard` | GET | Admin dashboard UI |
| `/admin/api/v1/dashboard/config` | GET | Dashboard configuration |
| `/admin/api/v1/cache/overview` | GET | Cache statistics overview |
| `/admin/api/v1/usage/summary` | GET | Aggregate token usage statistics |
| `/admin/api/v1/usage/daily` | GET | Per-period token usage breakdown |
| `/admin/api/v1/usage/models` | GET | Usage breakdown by model |
| `/admin/api/v1/usage/user-paths` | GET | Usage breakdown by user path |
| `/admin/api/v1/usage/log` | GET | Paginated usage log entries |
| `/admin/api/v1/audit/log` | GET | Paginated audit log entries |
| `/admin/api/v1/audit/conversation` | GET | Conversation thread around one audit entry |
| `/admin/api/v1/providers/status` | GET | Provider availability status |
| `/admin/api/v1/runtime/refresh` | POST | Refresh runtime configuration |
| `/admin/api/v1/models` | GET | List models with provider type |
| `/admin/api/v1/models/categories` | GET | List model categories |
| `/admin/api/v1/model-overrides` | GET | List model overrides |
| `/admin/api/v1/model-overrides/:selector` | PUT | Create/update model override |
| `/admin/api/v1/model-overrides/:selector` | DELETE | Remove model override |
| `/admin/api/v1/auth-keys` | GET | List authentication keys |

Operations Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/metrics` | GET | Prometheus metrics (experimental, when enabled) |
| `/swagger/index.html` | GET | Swagger UI (when enabled) |

Gateway Configuration

GoModel is configured through environment variables and an optional config.yaml. Environment variables override YAML values. See .env.template and config/config.example.yaml for the available options.

Key settings:

| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | Server port |
| `BASE_PATH` | `/` | Mount the gateway under a path prefix such as `/g` |
| `GOMODEL_MASTER_KEY` | (none) | API key for authentication |
| `ENABLE_PASSTHROUGH_ROUTES` | `true` | Enable provider-native passthrough routes under `/p/{provider}/...` |
| `ALLOW_PASSTHROUGH_V1_ALIAS` | `true` | Allow `/p/{provider}/v1/...` aliases while keeping `/p/{provider}/...` canonical |
| `ENABLED_PASSTHROUGH_PROVIDERS` | `openai,anthropic,openrouter,zai,vllm` | Comma-separated list of enabled passthrough providers |
| `STORAGE_TYPE` | `sqlite` | Storage backend (`sqlite`, `postgresql`, `mongodb`) |
| `METRICS_ENABLED` | `false` | Enable Prometheus metrics (experimental) |
| `LOGGING_ENABLED` | `false` | Enable audit logging |
| `GUARDRAILS_ENABLED` | `false` | Enable the configured guardrails pipeline |

Quick Start - Authentication: GOMODEL_MASTER_KEY is unset by default. Without it, the API endpoints are unprotected and anyone can call them, which is insecure for production. Set a strong secret before exposing the service: add GOMODEL_MASTER_KEY to your .env file or environment for production deployments.


Response Caching

GoModel has a two-layer response cache that reduces LLM API costs and latency for repeated or semantically similar requests.

Layer 1 - Exact-match cache

Hashes the full request (path + workflow + body) and returns the stored response for byte-identical requests, with sub-millisecond lookups. Enable it by setting the RESPONSE_CACHE_SIMPLE_ENABLED and REDIS_URL environment variables.

Responses served from this layer carry X-Cache: HIT (exact).

Layer 2 - Semantic cache

Embeds the last user message via your configured provider’s OpenAI-compatible /v1/embeddings API (cache.response.semantic.embedder.provider must name a key in the top-level providers map) and performs a KNN vector search. Semantically equivalent queries - e.g. "What's the capital of France?" vs "Which city is France's capital?" - can return the same cached response without an upstream LLM call.

Expected hit rates: ~60–70% in high-repetition workloads vs. ~18% for exact-match alone.

Responses served from this layer carry X-Cache: HIT (semantic).

Supported vector backends: qdrant, pgvector, pinecone, weaviate (set cache.response.semantic.vector_store.type and the matching nested block).
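Putting those key paths together, a config.yaml fragment might look like the following. Only `cache.response.semantic.embedder.provider` and `vector_store.type` are taken from the text above; the remaining fields are placeholders, so check config/config.example.yaml for the exact schema.

```yaml
cache:
  response:
    semantic:
      embedder:
        provider: openai   # must name a key in the top-level providers map
      vector_store:
        type: qdrant       # qdrant | pgvector | pinecone | weaviate
        qdrant:
          url: http://localhost:6333   # placeholder
providers:
  openai:
    api_key: ${OPENAI_API_KEY}
```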

Both cache layers run after guardrail/workflow patching so they always see the final prompt. Use Cache-Control: no-cache or Cache-Control: no-store to bypass caching per-request.


See DEVELOPMENT.md for testing, linting, and pre-commit setup.


Roadmap to 0.2.0

Must Have

  • Intelligent routing
  • Broader provider support: Oracle model configuration via environment variables, plus Cohere, Command A, Operational, and DeepSeek V3
  • Budget management with limits per user_path and/or API key
  • Editable model pricing for accurate cost tracking and budgeting
  • Full support for the OpenAI /responses and /conversations lifecycle
  • Prompt cache visibility showing how much of each prompt was cached by the provider
  • Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
  • Passthrough for all providers, beyond the current OpenAI and Anthropic beta
  • Fix failover charts in the dashboard

Should Have

  • Cluster mode

Community

Join our Discord to connect with other GoModel users.


Directories

Package auditlog provides audit logging for the AI gateway.
Package batch provides persistence for OpenAI-compatible batch lifecycle endpoints.
Package cache provides a generic key-value store abstraction.
modelcache: Package modelcache provides model-specific cache types and interfaces.
Package config provides configuration management for the application.
Package core provides core types and interfaces for the LLM gateway.
docs
2026-03-23_benchmark_scripts/gateway-comparison/mock-backend (command): Mock OpenAI-compatible backend server for benchmarking AI gateways.
2026-03-23_benchmark_scripts/gateway-comparison/stream-bench (command): Streaming SSE benchmark tool.
Package gateway contains transport-independent gateway use cases.
Package guardrails provides a pluggable pipeline for request-level guardrails.
Package httpclient provides a centralized HTTP client factory with unified configuration.
Package llmclient provides a base HTTP client for LLM providers with: request marshaling/unmarshaling; retries with exponential backoff and jitter; standardized error parsing (429, 502, 503, 504); circuit breaking with half-open state protection.
Package modeldata provides fetching, parsing, and merging of the external AI model metadata registry (models.json) for enriching GoModel's model data.
Package observability provides instrumentation for metrics, tracing, and logging.
Package providers provides a factory for creating provider instances.
anthropic: Package anthropic provides Anthropic API integration for the LLM gateway.
gemini: Package gemini provides Google Gemini API integration for the LLM gateway.
groq: Package groq provides Groq API integration for the LLM gateway.
minimax: Package minimax provides MiniMax API integration for the LLM gateway.
ollama: Package ollama provides Ollama API integration for the LLM gateway.
openai: Package openai provides OpenAI API integration for the LLM gateway.
vllm: Package vllm provides vLLM OpenAI-compatible API integration for the LLM gateway.
xai: Package xai provides xAI (Grok) API integration for the LLM gateway.
zai: Package zai provides Z.ai API integration for the LLM gateway.
Package responsestore provides persistence for OpenAI-compatible Responses lifecycle endpoints.
Package storage provides shared database connections for all features.
tests/contract: Package contract provides contract tests that validate API response structures against recorded golden files.
Package tools pins developer tooling dependencies used by this repository.
Package usage provides token usage tracking for the AI gateway.
