Envoy AI Gateway CLI (aigw)
Quick Start
docker-compose.yml builds and runs aigw, which listens on port 1975 and routes OpenAI chat completion requests to Ollama. A sketch of the compose file follows the service list below.
- aigw (port 1975): Envoy AI Gateway CLI (standalone mode)
- chat-completion: curl command making a simple chat completion
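For orientation, here is a sketch of what such a compose file could look like. The build context, networking, model name, and request payload are assumptions, not the actual file:

```yaml
# Hypothetical sketch of docker-compose.yml; details are assumptions.
services:
  aigw:
    build: .                                 # build aigw from source
    ports:
      - "1975:1975"                          # OpenAI-compatible listener
    extra_hosts:
      - "host.docker.internal:host-gateway"  # lets aigw reach Ollama on the host

  chat-completion:
    image: curlimages/curl
    profiles: ["test"]                       # only runs via `docker compose run`
    command:
      - curl
      - -s
      - http://aigw:1975/v1/chat/completions
      - -H
      - "Content-Type: application/json"
      - -d
      - '{"model": "qwen3:4b", "messages": [{"role": "user", "content": "Hello!"}]}'
```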
Start Ollama on your host machine:
Start Ollama on all interfaces with a large context window. This makes it reachable from Docker containers and able to handle large tasks, such as those from Goose.
```bash
OLLAMA_CONTEXT_LENGTH=131072 OLLAMA_HOST=0.0.0.0 ollama serve
```
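Before starting the stack, you can confirm Ollama is reachable; its HTTP API listens on port 11434 by default:

```bash
# Lists locally pulled models; any JSON response confirms Ollama is up
curl -s http://localhost:11434/api/tags
```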
Run the example minimal stack:
`up` builds `aigw` from source and starts the stack, awaiting health checks.

```bash
docker compose up --wait -d
```
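To see what came up, `docker compose ps` lists the services and their health status:

```bash
docker compose ps
```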
Create a simple OpenAI chat completion:
The `chat-completion` service uses `curl` to send a simple chat completion request to the AI Gateway CLI (aigw), which routes it to Ollama.

```bash
docker compose run --rm chat-completion
```
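You can also send an equivalent request directly from your host, since aigw exposes an OpenAI-compatible endpoint on port 1975. The model name below is an assumption; substitute one you have pulled in Ollama:

```bash
curl -s http://localhost:1975/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3:4b", "messages": [{"role": "user", "content": "Say hello in one sentence."}]}'
```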
Shut down the example stack:
`down -v` stops the containers and removes the volumes used by the stack.

```bash
docker compose down -v
```
Quick Start with OpenTelemetry
docker-compose-otel.yaml includes OpenTelemetry tracing, visualized with Arize Phoenix, an open-source LLM tracing and evaluation system with UX features for LLM spans formatted with OpenInference semantics. A sketch of the OTLP wiring follows the service list below.
- aigw (port 1975): Envoy AI Gateway CLI (standalone mode) with OTEL tracing
- Phoenix (port 6006): OpenTelemetry trace viewer UI for LLM observability
- chat-completion: OpenAI Python client instrumented with OpenTelemetry
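Wiring like this is typically done with standard OpenTelemetry environment variables. The excerpt below is a sketch of how the services might point their OTLP exporters at Phoenix (service names from the list above; values are assumptions about this stack):

```yaml
# Hypothetical excerpt from docker-compose-otel.yaml; values are assumptions.
services:
  aigw:
    environment:
      - OTEL_SERVICE_NAME=aigw
      # Phoenix accepts OTLP traces over HTTP (see the /v1/traces log line below)
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://phoenix:6006
  chat-completion:
    environment:
      - OTEL_SERVICE_NAME=chat-completion
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://phoenix:6006
```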
Start Ollama on your host machine:
Start Ollama on all interfaces with a large context window. This makes it reachable from Docker containers and able to handle large tasks, such as those from Goose.
```bash
OLLAMA_CONTEXT_LENGTH=131072 OLLAMA_HOST=0.0.0.0 ollama serve
```
Run the example OpenTelemetry stack:
`up` builds `aigw` from source and starts the stack, awaiting health checks.

```bash
docker compose -f docker-compose-otel.yaml up --wait -d
```
Create a simple OpenAI chat completion:
`chat-completion` uses the OpenAI Python CLI to send a simple chat completion to the AI Gateway CLI (aigw), which routes it to Ollama. Notably, this app uses OpenTelemetry Python to send traces transparently.

```bash
# Invoke the OpenTelemetry instrumented chat completion
docker compose -f docker-compose-otel.yaml run --build --rm chat-completion

# Verify traces are being received by Phoenix
docker compose -f docker-compose-otel.yaml logs phoenix | grep "POST /v1/traces"
```
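For reference, "transparent" instrumentation like this usually amounts to configuring an OTLP exporter and instrumenting the OpenAI client before use. A minimal sketch, assuming the OpenInference OpenAI instrumentor and Phoenix's OTLP HTTP endpoint (package choices, endpoint, and model name are assumptions, not necessarily what this stack uses):

```python
# Minimal sketch: OpenAI client with OpenTelemetry tracing exported to Phoenix.
from openai import OpenAI
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from openinference.instrumentation.openai import OpenAIInstrumentor

# Export spans to Phoenix, which accepts OTLP over HTTP at /v1/traces
provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:6006/v1/traces"))
)
trace.set_tracer_provider(provider)

# Patch the OpenAI client so every call emits OpenInference-formatted LLM spans
OpenAIInstrumentor().instrument()

# Point the client at aigw, which forwards the request to Ollama
client = OpenAI(base_url="http://localhost:1975/v1", api_key="unused")
resp = client.chat.completions.create(
    model="qwen3:4b",  # assumption: use any model pulled in Ollama
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```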
View traces in Phoenix:
Open your browser and navigate to the Phoenix UI to view the traces.

```bash
open http://localhost:6006
```

You should see a trace like this, which shows the OpenAI CLI (Python) joining a trace with the Envoy AI Gateway CLI (aigw), including the inputs and outputs of the LLM (served by Ollama). The Phoenix screenshot is annotated to highlight the key parts of the trace:

Shut down the example stack:
`down -v` stops the containers and removes the volumes used by the stack.

```bash
docker compose -f docker-compose-otel.yaml down -v
```