examples/

directory
v0.0.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 10, 2026 License: GPL-3.0

README

Examples

This directory contains runnable examples for github.com/ethpandaops/vllm-agent-sdk-go.

API Overview

The SDK supports two primary styles:

  1. Top-level one-shot APIs:
  • Query(ctx, content, ...opts)
  • QueryStream(ctx, messages, ...opts)
  1. Stateful client API:
  • NewClient() + Start()
  • Query() + ReceiveResponse() / ReceiveMessages()
  • Interrupt() / SetModel() / SetPermissionMode()

Environment

The core SDK resolves VLLM_BASE_URL, VLLM_API_KEY, and VLLM_MODEL automatically (see Configuration). Examples add additional overrides:

Variable Description Default
VLLM_BASE_URL vLLM server base URL http://127.0.0.1:8000/v1
VLLM_API_KEY Bearer auth token (if your server enforces auth) (none)
VLLM_MODEL Model name QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ (example default)
VLLM_IMAGE_MODEL Image-capable model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ (example default)
VLLM_VISION_MODEL Vision model for multimodal input Falls back to VLLM_IMAGE_MODEL, then VLLM_MODEL
VLLM_IMAGE_OUTPUT_DIR Directory for saving generated images (none)

Use VLLM_MODEL to pin a different served model when you need deterministic capability coverage.

Core SDK Examples

These focus on the sibling-style SDK contract that downstream code consumes directly.

Example Description
quick_start Basic one-shot query with low-cost model defaults.
query_stream Streaming input via QueryStream and MessagesFromSlice.
client_multi_turn Stateful Client usage over multiple turns.
model_discovery List models and inspect free/tool/structured-output/image capabilities.
structured_output Structured JSON output with WithOutputFormat.
sdk_tools In-process SDK tools via WithSDKTools(...).
on_user_input SDK-owned user input prompts via WithOnUserInput(...).
permissions Tool permission denial handling via WithCanUseTool(...).
hooks Hook callbacks around tool execution.
sessions_local Local session persistence, listing, stats, and message inspection.
interrupt Client cancellation via Interrupt().
error_handling Typed UnsupportedControlError and ErrSessionNotFound handling.
system_prompt System prompt configuration (default vs custom string vs preset).
extended_thinking Extended thinking with WithThinking and WithEffort.
include_partial_messages Real-time streaming of partial message deltas.
max_budget_usd API cost control with WithMaxBudgetUSD.
cancellation Context cancellation and graceful client shutdown.
parallel_queries Concurrent Query() calls with errgroup.
pipeline Multi-step LLM orchestration (Generate → Evaluate → Refine).
mcp_calculator In-process MCP server with calculator tools via CreateSdkMcpServer.
mcp_status Query MCP server connection status via GetMCPStatus.
memory_tool Filesystem-backed persistent memory via MCP tools.
custom_logger Bridge any logging library (logrus) to WithLogger via slog.Handler.

VLLM-Native Advanced Examples

These focus on VLLM-specific routing and request-shape controls.

Example Description
vllm_chat_controls Sampling/tool controls (top_p, penalties, seed, stop, logprobs).
vllm_routing Provider/plugins/route/session/trace controls.
vllm_responses /responses mode with instructions/text config/service tier/truncation.
vllm_responses_chaining Responses chaining with previous_response_id and prompt cache key.
vllm_multimodal_input Multimodal chat-completions input with block-based text + image parts.
vllm_multimodal_image Multimodal/image generation with generated image blocks saved to disk.
vllm_extra Escape-hatch payload overrides via WithVLLMExtra.

Running

# Run any example
go run ./examples/quick_start
go run ./examples/vllm_responses
go run ./examples/vllm_multimodal_input

# Examples with sub-examples accept a name argument
go run ./examples/extended_thinking all
go run ./examples/cancellation graceful_shutdown

Testing

# Run all examples and verify output with VLLM
scripts/test_examples.sh

# Run specific examples
scripts/test_examples.sh -f quick_start,pipeline

# Keep going on failure
scripts/test_examples.sh -k

# Override the model
VLLM_MODEL=QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ scripts/test_examples.sh

# Image-generation example with explicit image model and output directory
VLLM_IMAGE_MODEL=QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ \
VLLM_IMAGE_OUTPUT_DIR=/tmp/vllm-images \
go run ./examples/vllm_multimodal_image

# Multimodal input example with a vision-capable model
VLLM_VISION_MODEL=QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ \
go run ./examples/vllm_multimodal_input

Directories

Path Synopsis
Package main demonstrates cancellation and graceful shutdown patterns.
Package main demonstrates cancellation and graceful shutdown patterns.
Package main demonstrates extended thinking capabilities with VLLM.
Package main demonstrates extended thinking capabilities with VLLM.
Package main demonstrates partial message streaming where incremental assistant updates are received as the model generates responses.
Package main demonstrates partial message streaming where incremental assistant updates are received as the model generates responses.
internal
Package main demonstrates API cost control with budget limits.
Package main demonstrates API cost control with budget limits.
Package main demonstrates how to create calculator tools using MCP servers.
Package main demonstrates how to create calculator tools using MCP servers.
Package main demonstrates querying MCP server connection status.
Package main demonstrates querying MCP server connection status.
Package main demonstrates a filesystem-backed memory tool for agent state persistence.
Package main demonstrates a filesystem-backed memory tool for agent state persistence.
Package main demonstrates running multiple Query() calls concurrently.
Package main demonstrates running multiple Query() calls concurrently.
Package main demonstrates multi-step LLM orchestration with Go control flow.
Package main demonstrates multi-step LLM orchestration with Go control flow.
Package main demonstrates configuring system prompts.
Package main demonstrates configuring system prompts.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL