Envoy AI Gateway CLI (aigw)

Quick Start

docker-compose.yml builds and runs aigw, which serves OpenAI chat completion requests on port 1975 and routes them to Ollama.

  • aigw (port 1975): Envoy AI Gateway CLI (standalone mode)
  • chat-completion: curl command making a simple chat completion

The simplest way to get started is to have aigw generate a configuration for your OpenAI-compatible backend. This happens when there is no configuration file and at least the OPENAI_API_KEY environment variable is set.

Here are values we use for Ollama:

  • OPENAI_API_KEY=unused (Ollama does not require an API key)
  • OPENAI_BASE_URL=http://localhost:11434/v1 (use host.docker.internal instead of localhost when aigw runs inside Docker)
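
You can also run aigw directly on your host instead of through Docker Compose. The sketch below is illustrative only: it assumes the run subcommand auto-generates the configuration described above and listens on the same port 1975 used by the compose stack.

# Illustrative sketch: run aigw on the host against a local Ollama.
# Assumes the aigw binary is on your PATH.
OPENAI_API_KEY=unused \
OPENAI_BASE_URL=http://localhost:11434/v1 \
aigw run
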
  1. Start Ollama on your host machine:

    Start Ollama on all interfaces with a large context window, so that it is reachable from Docker containers and can handle large tasks, such as requests from Goose.

    OLLAMA_CONTEXT_LENGTH=131072 OLLAMA_HOST=0.0.0.0 ollama serve
    
  2. Run the example minimal stack:

    docker compose up builds aigw from source and starts the stack, waiting for health checks to pass.

    docker compose up --wait -d
    
  3. Make requests to Envoy AI Gateway:

    The following services use curl to send requests to the AI Gateway CLI (aigw), which routes them to Ollama (an equivalent direct curl request is sketched after these steps):

    • Chat completion:

      docker compose run --rm chat-completion
      
    • Completion (legacy):

      docker compose run --rm completion
      
    • Embeddings:

      docker compose run --rm embeddings
      
    • MCP (Model Context Protocol) tool call:

      docker compose run --rm mcp
      

      This calls the kiwi MCP server through aigw's MCP Gateway at /mcp.

  4. Shutdown the example stack:

    docker compose down stops the stack, removing its containers and networks; --remove-orphans also removes containers for services no longer defined in the Compose file.

    docker compose down --remove-orphans
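
For reference, while the stack is running, the bundled chat-completion service is roughly equivalent to the request below sent from your host. The model name is only an example; substitute a model you have pulled in Ollama.

curl http://localhost:1975/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen3:0.6b",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}]
      }'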
    

Quick Start with OpenTelemetry

docker-compose-otel.yaml includes OpenTelemetry metrics and tracing.

All profiles below use at least these Docker services:

  • aigw (port 1975): Envoy AI Gateway CLI with OpenAI endpoints at /v1/* and MCP endpoint at /mcp
  • chat-completion: OpenAI Python client for chat completions, instrumented with OpenTelemetry
  • completion: OpenAI Python client for completions (legacy), instrumented with OpenTelemetry
  • create-embeddings: OpenAI Python client for embeddings, instrumented with OpenTelemetry

Prerequisites
  1. Start Ollama on your host machine:

    Start Ollama on all interfaces with a large context window, so that it is reachable from Docker containers and can handle large tasks, such as requests from Goose.

    OLLAMA_CONTEXT_LENGTH=131072 OLLAMA_HOST=0.0.0.0 ollama serve
    
Configure OpenTelemetry Export

Choose how you want to export telemetry data (traces and metrics). We provide pre-configured .env files for common scenarios:

Console (Default - no external dependencies)

Export telemetry directly to the console for debugging. The .env.otel.console file is already provided and will be used by default when no profile is specified or when you set COMPOSE_PROFILES=console.

This outputs traces and metrics to stdout/stderr, which is useful for debugging without requiring any external services.
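
The exact contents ship with the repository; purely as an illustration, a console configuration expressed with the standard OpenTelemetry SDK environment variables would look roughly like this:

# Illustrative sketch of .env.otel.console, not the file's verbatim contents
OTEL_TRACES_EXPORTER=console
OTEL_METRICS_EXPORTER=console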

Arize Phoenix (LLM-specific observability)

Arize Phoenix is an open-source LLM tracing and evaluation system with UX features for spans formatted with OpenInference semantics.

The .env.otel.phoenix file is already provided and will be used automatically when you set COMPOSE_PROFILES=phoenix. This also starts the Phoenix service.

This configures:

  • OTLP endpoint to Phoenix on port 6006
  • Metrics disabled (Phoenix only supports traces)
  • Reduced batch delay for demo purposes
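
The exact values live in .env.otel.phoenix; purely as an illustration, the settings above map onto standard OpenTelemetry SDK environment variables roughly like this:

# Illustrative sketch of .env.otel.phoenix, not the file's verbatim contents
OTEL_EXPORTER_OTLP_ENDPOINT=http://phoenix:6006   # OTLP endpoint on the Phoenix service
OTEL_METRICS_EXPORTER=none                        # Phoenix only supports traces
OTEL_BSP_SCHEDULE_DELAY=100                       # shorter batch delay (in ms) for the demo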

otel-tui (Terminal UI for OpenTelemetry)

otel-tui provides a terminal-based UI for viewing OpenTelemetry traces and metrics in real-time.

The .env.otel.otel-tui file is already provided and will be used automatically when you set COMPOSE_PROFILES=otel-tui. This also starts the otel-tui service.

This configures the OTLP endpoint to otel-tui on port 4318.

Run the Stack
  1. Start the services:

    COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml up --build --wait -d
    

    Where <profile> is:

    • console - Export to console for debugging (default if omitted)
    • otel-tui - Export to otel-tui Terminal UI (also starts otel-tui service)
    • phoenix - Export to Phoenix (also starts Phoenix service)
  2. Send test requests:

    COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm chat-completion
    COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm create-embeddings
    COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm completion
    COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm mcp
    
  3. Check telemetry output:

    For Console export
    # View traces and metrics in aigw logs
    docker compose -f docker-compose-otel.yaml logs aigw | grep -E "(SpanContext|gen_ai)"
    
    For Phoenix export

    If you configured Phoenix as your OTLP endpoint, you can view detailed traces showing the OpenAI Python client joining a trace with the Envoy AI Gateway CLI (aigw), including the LLM inputs and outputs served by Ollama:

    [Phoenix screenshot]

    # Verify Phoenix is receiving traces
    docker compose -f docker-compose-otel.yaml logs phoenix | grep "POST /v1/traces"
    
    # Open Phoenix UI
    open http://localhost:6006
    
    For otel-tui export
    # Show TUI in your current terminal session
    docker compose -f docker-compose-otel.yaml attach otel-tui
    
    # Detach by pressing Ctrl+p -> Ctrl+q
    

    Access logs with GenAI fields (always available):

    docker compose -f docker-compose-otel.yaml logs aigw | grep "genai_model_name"
    
Shutdown

Stop the services:

docker compose -f docker-compose-otel.yaml down --remove-orphans
