vllmsdk

v0.0.1
Published: Apr 10, 2026 License: GPL-3.0 Imports: 25 Imported by: 0

README

vllm-agent-sdk-go

Go SDK for building agentic applications backed by a local or self-hosted vLLM OpenAI-compatible server.

  • Package: vllmsdk
  • Default backend: http://127.0.0.1:8000/v1

Install

go get github.com/ethpandaops/vllm-agent-sdk-go

Configuration

The SDK resolves configuration from explicit options first, then environment variables, then defaults.

Environment Variables

Variable | Description | Default
VLLM_BASE_URL | vLLM server base URL | http://127.0.0.1:8000/v1
VLLM_API_KEY | Bearer auth token (optional, only if your server enforces auth) | (none)
VLLM_MODEL | Model name | (none; must be set via env or WithModel())
VLLM_AGENT_SESSION_STORE_PATH | Local session store directory | (none)

Example-only variables (not resolved by the core SDK):

Variable | Description | Default
VLLM_IMAGE_MODEL | Image-capable model for multimodal examples | QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ
VLLM_VISION_MODEL | Vision model for multimodal input examples | Falls back to VLLM_IMAGE_MODEL, then VLLM_MODEL
VLLM_IMAGE_OUTPUT_DIR | Directory for saving generated images | (none)

Option Precedence

All settings follow the same resolution order:

  1. Explicit option (e.g. WithBaseURL(...), WithAPIKey(...), WithModel(...))
  2. Environment variable (VLLM_BASE_URL, VLLM_API_KEY, VLLM_MODEL)
  3. Built-in default (where applicable)
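The resolution order above can be sketched with a small helper. This is only an illustration under stated assumptions: resolveSetting and the injected getenv function are hypothetical names, not part of the SDK, and the SDK's real resolution logic may differ in detail (for example in how it treats empty strings).

```go
package main

import "fmt"

// resolveSetting is a hypothetical helper, not an SDK function. It applies the
// documented order: explicit option, then environment variable, then default.
// getenv is injected so the sketch can run without touching the real environment.
func resolveSetting(explicit, envKey, fallback string, getenv func(string) string) string {
	if explicit != "" {
		return explicit
	}
	if v := getenv(envKey); v != "" {
		return v
	}
	return fallback
}

func main() {
	// Simulated environment; a real program would pass os.Getenv instead.
	env := map[string]string{"VLLM_MODEL": "model-from-env"}
	getenv := func(key string) string { return env[key] }

	// 1. An explicit value wins over the environment.
	fmt.Println(resolveSetting("model-from-option", "VLLM_MODEL", "", getenv))
	// 2. Without an explicit value, the environment variable is used.
	fmt.Println(resolveSetting("", "VLLM_MODEL", "", getenv))
	// 3. With neither, the built-in default applies.
	fmt.Println(resolveSetting("", "VLLM_BASE_URL", "http://127.0.0.1:8000/v1", getenv))
}
```

Injecting the lookup function keeps the sketch deterministic; it is a testing convenience, not a claim about how the SDK reads its environment.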

Developer Workflow

The repo ships a Makefile that follows the same conventions as its sibling SDKs:

  • make test runs race-enabled package tests with coverage output.
  • make test-integration runs ./integration/... with -tags=integration.
  • make audit runs the aggregate quality gate.

Integration setup:

  • Set VLLM_BASE_URL, or leave it unset to use the default http://127.0.0.1:8000/v1.
  • Set VLLM_MODEL to the model served by your vLLM instance.
  • Set VLLM_API_KEY if your vLLM server enforces bearer auth.
  • Integration tests skip when the local vLLM server is unavailable.

Quick Start

package main

import (
	"context"
	"fmt"
	"time"

	vllmsdk "github.com/ethpandaops/vllm-agent-sdk-go"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	// Model resolved from VLLM_MODEL env var, or set explicitly:
	for msg, err := range vllmsdk.Query(
		ctx,
		vllmsdk.Text("Write a two-line haiku about Go concurrency."),
		// vllmsdk.WithModel("QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ"),
	) {
		if err != nil {
			panic(err)
		}

		if result, ok := msg.(*vllmsdk.ResultMessage); ok && result.Result != nil {
			fmt.Println(*result.Result)
		}
	}
}

Surface

  • Query(ctx, content, ...opts) and QueryStream(...) return iter.Seq2[Message, error].
  • NewClient() exposes Start, StartWithContent, StartWithStream, Query, ReceiveMessages, ReceiveResponse, Interrupt, SetPermissionMode, SetModel, ListModels, ListModelsResponse, GetMCPStatus, RewindFiles, and Close.
  • Unsupported peer-parity controls such as ReconnectMCPServer, ToggleMCPServer, StopTask, and SendToolResult are present on Client and return typed UnsupportedControlErrors.
  • UserMessageContent is the canonical input shape. Use Text(...) for text-only calls and Blocks(...) with ImageInput(...), FileInput(...), AudioInput(...), or VideoInput(...) for multimodal chat-completions requests.
  • WithSDKTools(...) registers high-level in-process tools under mcp__sdk__<name>.
  • WithOnUserInput(...) handles SDK-owned user-input prompts built on top of tool calling.
  • ListModels(...) and ListModelsResponse(...) use vLLM model discovery via /v1/models.
  • StatSession(...), ListSessions(...), and GetSessionMessages(...) operate on the SDK's local persisted session store.

Model Discovery

  • Discovery uses /v1/models.
  • Returned ModelInfo values are projected from the OpenAI-compatible model cards that vLLM serves, so rich provider-specific metadata is not guaranteed to be present.
  • ModelInfo still exposes helper methods such as CostTier(), SupportsToolCalling(), SupportsStructuredOutput(), SupportsReasoning(), SupportsImageInput(), SupportsImageOutput(), SupportsWebSearch(), SupportsPromptCaching(), MaxContextLength(), and parsed pricing helpers.

Image Output

  • Generated images are surfaced as *ImageBlock values inside AssistantMessage.Content.
  • ImageBlock.Decode() returns raw bytes plus media type for data-URL-backed images.
  • ImageBlock.Save(path) writes generated images to disk.
  • Live image-generation coverage is available behind the integration build tag when VLLM_IMAGE_MODEL is set.
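To make the Decode() behavior above concrete, here is a standard-library sketch of what decoding a data-URL-backed image involves. decodeDataURL is a hypothetical stand-in written for this illustration; it is not the SDK's implementation, and it only handles the base64 form.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// decodeDataURL is a hypothetical stand-in for a Decode-style helper.
// It splits "data:<media type>;base64,<payload>" and returns the raw
// bytes together with the media type.
func decodeDataURL(u string) ([]byte, string, error) {
	rest, ok := strings.CutPrefix(u, "data:")
	if !ok {
		return nil, "", errors.New("not a data URL")
	}
	meta, payload, ok := strings.Cut(rest, ",")
	if !ok {
		return nil, "", errors.New("data URL has no payload")
	}
	mediaType, ok := strings.CutSuffix(meta, ";base64")
	if !ok {
		return nil, "", errors.New("only base64 data URLs are handled in this sketch")
	}
	data, err := base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return nil, "", fmt.Errorf("decode payload: %w", err)
	}
	return data, mediaType, nil
}

func main() {
	// The payload is the base64 encoding of "hello", used here as placeholder bytes.
	data, mediaType, err := decodeDataURL("data:image/png;base64,aGVsbG8=")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s, %d bytes\n", mediaType, len(data))
}
```

Images referenced by a plain URL cannot be decoded this way, which is presumably why the documentation limits Decode() to data-URL-backed images.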

Multimodal Input

Multimodal input in this SDK is block-based and targets the vLLM OpenAI-compatible chat surface.

content := vllmsdk.Blocks(
	vllmsdk.TextInput("Compare these two screenshots and the attached spec file."),
	vllmsdk.ImageInput("https://example.com/before.png"),
	vllmsdk.ImageInput("data:image/png;base64,..."),
	vllmsdk.FileInput("spec.pdf", "data:application/pdf;base64,..."),
)

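for msg, err := range vllmsdk.Query(ctx, content,
	// vllmsdk.WithModel("QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ"),
) {
	if err != nil {
		// Handle the error (for example, log it) and stop iterating.
		break
	}

	_ = msg // process the message, for example by type-switching on it
}

  • ImageInput(...) accepts a normal URL or a base64 data URL.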
  • FileInput(...) accepts a filename plus a file_data value, given as a URL or a data URL.
  • AudioInput(...) accepts base64 audio data plus a format.
  • VideoInput(...) accepts a normal URL or a data URL.
  • Responses mode is routed to the vLLM /v1/responses surface when selected.
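Several of the inputs above accept a base64 data URL. As a minimal sketch, assuming the bytes are already in memory, such a URL can be built with the standard library alone. toDataURL is a hypothetical helper name, not an SDK function.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// toDataURL is a hypothetical helper that wraps raw bytes in a base64 data URL
// of the form "data:<media type>;base64,<payload>".
func toDataURL(mediaType string, data []byte) string {
	return "data:" + mediaType + ";base64," + base64.StdEncoding.EncodeToString(data)
}

func main() {
	// Placeholder bytes; a real program would read an image or PDF from disk.
	fmt.Println(toDataURL("image/png", []byte("hello")))
}
```

The resulting string could then be passed wherever the list above says a data URL is accepted, for example as the argument to ImageInput(...).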

Session Semantics

Session APIs are local SDK APIs, not remote vLLM server sessions.

  • They read from the SDK session store configured with WithSessionStorePath(...) or VLLM_AGENT_SESSION_STORE_PATH.
  • They do not derive from chat session_id.
  • They do not derive from Responses previous_response_id.

Unsupported Controls

vLLM does not have meaningful backend equivalents for some sibling control-plane methods. The SDK exposes those methods where peer parity matters, but they fail explicitly with UnsupportedControlError instead of faking semantics.
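Callers usually want to treat an unsupported control as a skippable condition rather than a hard failure. The sketch below shows the standard errors.Is / errors.As pattern using local stand-in types. The Method field and the Unwrap relationship to the sentinel are assumptions made for this example; the SDK's real UnsupportedControlError and ErrUnsupportedControl may be shaped differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for illustration only.
var errUnsupportedControl = errors.New("unsupported control")

type unsupportedControlError struct {
	Method string // hypothetical field naming the rejected control
}

func (e *unsupportedControlError) Error() string {
	return "unsupported control: " + e.Method
}

// Unwrap links the typed error to the sentinel so errors.Is also matches.
func (e *unsupportedControlError) Unwrap() error { return errUnsupportedControl }

// stopTask mimics a control method that fails explicitly instead of pretending to work.
func stopTask(taskID string) error {
	return &unsupportedControlError{Method: "StopTask"}
}

// describe shows the two usual checks: errors.As for the typed error,
// errors.Is for the sentinel.
func describe(err error) string {
	var uce *unsupportedControlError
	switch {
	case errors.As(err, &uce):
		return "skip " + uce.Method
	case errors.Is(err, errUnsupportedControl):
		return "skip unknown control"
	case err != nil:
		return "fail: " + err.Error()
	default:
		return "ok"
	}
}

func main() {
	fmt.Println(describe(stopTask("task-1")))
}
```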

Examples

Runnable examples live under the examples directory.

Documentation

Overview

Package vllmsdk provides agent ergonomics backed by VLLM.

Index

Constants

const (
	// VLLMAPIModeChatCompletions uses /chat/completions.
	VLLMAPIModeChatCompletions = config.VLLMAPIModeChatCompletions
	// VLLMAPIModeResponses uses /responses.
	VLLMAPIModeResponses = config.VLLMAPIModeResponses
)
const (
	// EffortLow uses minimal thinking.
	EffortLow = config.EffortLow
	// EffortMedium uses moderate thinking.
	EffortMedium = config.EffortMedium
	// EffortHigh uses deep thinking.
	EffortHigh = config.EffortHigh
	// EffortMax uses maximum thinking depth.
	EffortMax = config.EffortMax
)
const (
	// AssistantMessageErrorAuthFailed indicates authentication failure.
	AssistantMessageErrorAuthFailed = message.AssistantMessageErrorAuthFailed
	// AssistantMessageErrorBilling indicates a billing error.
	AssistantMessageErrorBilling = message.AssistantMessageErrorBilling
	// AssistantMessageErrorRateLimit indicates rate limiting.
	AssistantMessageErrorRateLimit = message.AssistantMessageErrorRateLimit
	// AssistantMessageErrorInvalidReq indicates an invalid request.
	AssistantMessageErrorInvalidReq = message.AssistantMessageErrorInvalidReq
	// AssistantMessageErrorServer indicates a server error.
	AssistantMessageErrorServer = message.AssistantMessageErrorServer
	// AssistantMessageErrorUnknown indicates an unknown error.
	AssistantMessageErrorUnknown = message.AssistantMessageErrorUnknown
)
const (
	// BlockTypeText contains plain text content.
	BlockTypeText = message.BlockTypeText
	// BlockTypeImage contains generated image output content.
	BlockTypeImage = message.BlockTypeImage
	// BlockTypeImageURL contains input image URL or data-url content.
	BlockTypeImageURL = message.BlockTypeImageURL
	// BlockTypeFile contains input file content.
	BlockTypeFile = message.BlockTypeFile
	// BlockTypeInputAudio contains input audio content.
	BlockTypeInputAudio = message.BlockTypeInputAudio
	// BlockTypeVideoURL contains input video URL or data-url content.
	BlockTypeVideoURL = message.BlockTypeVideoURL
	// BlockTypeThinking contains model reasoning content.
	BlockTypeThinking = message.BlockTypeThinking
	// BlockTypeToolUse contains tool-call content.
	BlockTypeToolUse = message.BlockTypeToolUse
	// BlockTypeToolResult contains tool-result content.
	BlockTypeToolResult = message.BlockTypeToolResult
)
const (
	// HookEventPreToolUse is triggered before a tool is used.
	HookEventPreToolUse = hook.EventPreToolUse
	// HookEventPostToolUse is triggered after a tool is used.
	HookEventPostToolUse = hook.EventPostToolUse
	// HookEventUserPromptSubmit is triggered when a user submits a prompt.
	HookEventUserPromptSubmit = hook.EventUserPromptSubmit
	// HookEventStop is triggered when a session stops.
	HookEventStop = hook.EventStop
	// HookEventSubagentStop is triggered when a subagent stops.
	HookEventSubagentStop = hook.EventSubagentStop
	// HookEventPreCompact is triggered before compaction.
	HookEventPreCompact = hook.EventPreCompact
	// HookEventPostToolUseFailure is triggered after a tool use fails.
	HookEventPostToolUseFailure = hook.EventPostToolUseFailure
	// HookEventNotification is triggered when a notification is sent.
	HookEventNotification = hook.EventNotification
	// HookEventSubagentStart is triggered when a subagent starts.
	HookEventSubagentStart = hook.EventSubagentStart
	// HookEventPermissionRequest is triggered when a permission is requested.
	HookEventPermissionRequest = hook.EventPermissionRequest
)
const (
	// PermissionModeDefault uses standard permission prompts.
	PermissionModeDefault = permission.ModeDefault
	// PermissionModeAcceptEdits automatically accepts file edits.
	PermissionModeAcceptEdits = permission.ModeAcceptEdits
	// PermissionModePlan enables plan mode for implementation planning.
	PermissionModePlan = permission.ModePlan
	// PermissionModeBypassPermissions bypasses all permission checks.
	PermissionModeBypassPermissions = permission.ModeBypassPermissions
)
const (
	// PermissionUpdateTypeAddRules adds new permission rules.
	PermissionUpdateTypeAddRules = permission.UpdateTypeAddRules
	// PermissionUpdateTypeReplaceRules replaces existing permission rules.
	PermissionUpdateTypeReplaceRules = permission.UpdateTypeReplaceRules
	// PermissionUpdateTypeRemoveRules removes permission rules.
	PermissionUpdateTypeRemoveRules = permission.UpdateTypeRemoveRules
	// PermissionUpdateTypeSetMode sets the permission mode.
	PermissionUpdateTypeSetMode = permission.UpdateTypeSetMode
	// PermissionUpdateTypeAddDirectories adds accessible directories.
	PermissionUpdateTypeAddDirectories = permission.UpdateTypeAddDirectories
	// PermissionUpdateTypeRemoveDirectories removes accessible directories.
	PermissionUpdateTypeRemoveDirectories = permission.UpdateTypeRemoveDirectories
)
const (
	// PermissionUpdateDestUserSettings stores in user-level settings.
	PermissionUpdateDestUserSettings = permission.UpdateDestUserSettings
	// PermissionUpdateDestProjectSettings stores in project-level settings.
	PermissionUpdateDestProjectSettings = permission.UpdateDestProjectSettings
	// PermissionUpdateDestLocalSettings stores in local-level settings.
	PermissionUpdateDestLocalSettings = permission.UpdateDestLocalSettings
	// PermissionUpdateDestSession stores in the current session only.
	PermissionUpdateDestSession = permission.UpdateDestSession
)
const (
	// PermissionBehaviorAllow automatically allows the operation.
	PermissionBehaviorAllow = permission.BehaviorAllow
	// PermissionBehaviorDeny automatically denies the operation.
	PermissionBehaviorDeny = permission.BehaviorDeny
	// PermissionBehaviorAsk prompts the user for permission.
	PermissionBehaviorAsk = permission.BehaviorAsk
)
const (
	// MCPServerTypeStdio uses stdio for communication.
	MCPServerTypeStdio = mcp.ServerTypeStdio
	// MCPServerTypeSSE uses Server-Sent Events.
	MCPServerTypeSSE = mcp.ServerTypeSSE
	// MCPServerTypeHTTP uses HTTP for communication.
	MCPServerTypeHTTP = mcp.ServerTypeHTTP
	// MCPServerTypeSDK uses the SDK interface.
	MCPServerTypeSDK = mcp.ServerTypeSDK
)
const Version = "0.1.0"

Version is the SDK version.

Variables

var (
	// ErrClientNotConnected indicates the client is not connected.
	ErrClientNotConnected = internalerrors.ErrClientNotConnected

	// ErrClientAlreadyConnected indicates the client is already connected.
	ErrClientAlreadyConnected = internalerrors.ErrClientAlreadyConnected

	// ErrClientClosed indicates the client has been closed and cannot be reused.
	ErrClientClosed = internalerrors.ErrClientClosed

	// ErrTransportNotConnected indicates the transport is not connected.
	ErrTransportNotConnected = internalerrors.ErrTransportNotConnected

	// ErrRequestTimeout indicates a request timed out.
	ErrRequestTimeout = internalerrors.ErrRequestTimeout

	// ErrSessionNotFound indicates a requested local session does not exist.
	ErrSessionNotFound = internalerrors.ErrSessionNotFound

	// ErrUnsupportedFeature indicates an API-compatible feature that is not implemented by this backend.
	ErrUnsupportedFeature = errors.New("unsupported feature in VLLM backend")

	// ErrUnsupportedControl indicates a control-plane operation is not supported by backend.
	ErrUnsupportedControl = controlplane.ErrUnsupportedControl

	// ErrNoCheckpoint indicates rewind was requested without an available checkpoint.
	ErrNoCheckpoint = session.ErrNoCheckpoint
)

Re-export sentinel errors from internal package.

var NewUserMessageContent = message.NewUserMessageContent

NewUserMessageContent creates UserMessageContent from a string.

var NewUserMessageContentBlocks = message.NewUserMessageContentBlocks

NewUserMessageContentBlocks creates UserMessageContent from blocks.

Functions

func ErrorResult

func ErrorResult(message string) *mcp.CallToolResult

ErrorResult creates a CallToolResult indicating an error.

func ImageResult

func ImageResult(data []byte, mimeType string) *mcp.CallToolResult

ImageResult creates a CallToolResult with image content.

func MessagesFromChannel

func MessagesFromChannel(ch <-chan StreamingMessage) iter.Seq[StreamingMessage]

MessagesFromChannel creates a MessageStream from a channel. This is useful for dynamic message generation where messages are produced over time. The iterator completes when the channel is closed.

func MessagesFromContent

func MessagesFromContent(content UserMessageContent) iter.Seq[StreamingMessage]

MessagesFromContent creates a single-message stream from user content.

func MessagesFromSlice

func MessagesFromSlice(msgs []StreamingMessage) iter.Seq[StreamingMessage]

MessagesFromSlice creates a MessageStream from a slice of StreamingMessages. This is useful for sending a fixed set of messages in streaming mode.

func NewMcpTool

func NewMcpTool(name, description string, inputSchema *jsonschema.Schema) *mcp.Tool

NewMcpTool creates an mcp.Tool with the given parameters. This is useful when you need direct access to the MCP Tool type.

func NopLogger

func NopLogger() *slog.Logger

NopLogger returns a logger that discards all output. Use this when you want silent operation with no logging overhead.

func ParseArguments

func ParseArguments(req *mcp.CallToolRequest) (map[string]any, error)

ParseArguments unmarshals CallToolRequest arguments into a map. This is a convenience function for extracting tool input.

func Query

func Query(
	ctx context.Context,
	content UserMessageContent,
	opts ...Option,
) iter.Seq2[Message, error]

Query executes a one-shot query and returns a message iterator.

func QueryStream

func QueryStream(
	ctx context.Context,
	messages iter.Seq[StreamingMessage],
	opts ...Option,
) iter.Seq2[Message, error]

QueryStream executes a one-shot query from a stream of user messages.

func SimpleSchema

func SimpleSchema(props map[string]string) *jsonschema.Schema

SimpleSchema creates a jsonschema.Schema from a simple type map.

Input format: {"a": "float64", "b": "string"}

Type mappings:

  • "string" → {"type": "string"}
  • "int", "int64" → {"type": "integer"}
  • "float64", "float" → {"type": "number"}
  • "bool" → {"type": "boolean"}
  • "[]string" → {"type": "array", "items": {"type": "string"}}
  • "any", "object" → {"type": "object"}
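The mapping table above can be mirrored with plain maps, as a rough sketch. It uses map[string]any rather than the *jsonschema.Schema type the real function returns, and two details are assumptions of this example rather than documented behavior: wrapping the properties in a top-level object schema, and falling back to "string" for unknown type names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schemaFor mirrors the documented type mappings using plain maps.
func schemaFor(typeName string) map[string]any {
	switch typeName {
	case "string":
		return map[string]any{"type": "string"}
	case "int", "int64":
		return map[string]any{"type": "integer"}
	case "float64", "float":
		return map[string]any{"type": "number"}
	case "bool":
		return map[string]any{"type": "boolean"}
	case "[]string":
		return map[string]any{"type": "array", "items": map[string]any{"type": "string"}}
	case "any", "object":
		return map[string]any{"type": "object"}
	default:
		// Assumption: the docs do not say how unknown names are handled.
		return map[string]any{"type": "string"}
	}
}

// simpleSchemaSketch builds an object schema from a name-to-type map.
func simpleSchemaSketch(props map[string]string) map[string]any {
	properties := make(map[string]any, len(props))
	for name, typeName := range props {
		properties[name] = schemaFor(typeName)
	}
	return map[string]any{"type": "object", "properties": properties}
}

func main() {
	schema := simpleSchemaSketch(map[string]string{"a": "float64", "tags": "[]string"})
	out, err := json.MarshalIndent(schema, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```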

func SingleMessage

func SingleMessage(content UserMessageContent) iter.Seq[StreamingMessage]

SingleMessage creates a MessageStream with a single user message.

func TextResult

func TextResult(text string) *mcp.CallToolResult

TextResult creates a CallToolResult with text content.

func Validate

func Validate(schema, input map[string]any) error

Validate checks if the input matches the schema requirements. Returns an error if required fields are missing.
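A required-field check of this kind might look like the following sketch. It assumes the schema map carries a JSON-Schema-style "required" list, which the documentation does not spell out, and validateRequired is a hypothetical name, not the SDK function.

```go
package main

import "fmt"

// validateRequired reports the first required field missing from input.
// Assumption: schema["required"] holds the field names, either as []string
// or as []any (the shape produced by encoding/json).
func validateRequired(schema, input map[string]any) error {
	var required []string
	switch r := schema["required"].(type) {
	case []string:
		required = r
	case []any:
		for _, v := range r {
			if s, ok := v.(string); ok {
				required = append(required, s)
			}
		}
	}
	for _, name := range required {
		if _, ok := input[name]; !ok {
			return fmt.Errorf("missing required field %q", name)
		}
	}
	return nil
}

func main() {
	schema := map[string]any{"required": []string{"a", "b"}}

	fmt.Println(validateRequired(schema, map[string]any{"a": 1.0, "b": 2.0}))
	fmt.Println(validateRequired(schema, map[string]any{"a": 1.0}))
}
```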

func WithClient

func WithClient(ctx context.Context, fn func(Client) error, opts ...Option) error

WithClient manages client lifecycle with automatic cleanup.

This helper creates a client, starts it with the provided options, executes the callback function, and ensures proper cleanup via Close() when done.

The callback receives a fully initialized Client that is ready for use. If the callback returns an error, it is returned to the caller. If Close() fails, a warning is logged but does not override the callback's error.

Example usage:

err := vllmsdk.WithClient(ctx, func(c vllmsdk.Client) error {
    if err := c.Query(ctx, vllmsdk.Text("Hello")); err != nil {
        return err
    }
    for msg, err := range c.ReceiveResponse(ctx) {
        if err != nil {
            return err
        }
        _ = msg // process message...
    }
    return nil
},
    vllmsdk.WithLogger(log),
    vllmsdk.WithPermissionMode("acceptEdits"),
)
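The lifecycle rules described for WithClient (start, run the callback, always close, and log rather than return a Close failure) can be sketched with stand-in types. The resource interface, withResource, and fakeClient below are hypothetical names for this illustration and are not part of the SDK.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
)

// resource is a stand-in for the parts of a client that matter to the lifecycle.
type resource interface {
	Start() error
	Close() error
}

// withResource starts r, runs fn, and always closes r afterwards.
// A Close failure is logged as a warning and never replaces fn's result.
func withResource(r resource, fn func(resource) error) error {
	if err := r.Start(); err != nil {
		return fmt.Errorf("start: %w", err)
	}
	defer func() {
		if err := r.Close(); err != nil {
			slog.Warn("close failed", "error", err)
		}
	}()
	return fn(r)
}

// fakeClient records whether Close ran and always reports a Close error.
type fakeClient struct{ closed bool }

func (f *fakeClient) Start() error { return nil }
func (f *fakeClient) Close() error {
	f.closed = true
	return errors.New("close failed")
}

func main() {
	c := &fakeClient{}
	err := withResource(c, func(resource) error {
		return errors.New("callback failed")
	})
	// The callback's error is returned; the Close error only appears in the log.
	fmt.Println("returned:", err, "closed:", c.closed)
}
```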

Types

type AssistantMessage

type AssistantMessage = message.AssistantMessage

AssistantMessage represents a message from the model.

type AssistantMessageError

type AssistantMessageError = message.AssistantMessageError

AssistantMessageError represents error types from the assistant.

type AsyncHookJSONOutput

type AsyncHookJSONOutput = hook.AsyncJSONOutput

AsyncHookJSONOutput represents an async hook output.

type BaseHookInput

type BaseHookInput = hook.BaseInput

BaseHookInput contains common fields for all hook inputs.

type CallToolRequest

type CallToolRequest = mcp.CallToolRequest

CallToolRequest is the request passed to tool handlers.

type CallToolResult

type CallToolResult = mcp.CallToolResult

CallToolResult is the server's response to a tool call. Use TextResult, ErrorResult, or ImageResult helpers to create results.

type ChatRequest

type ChatRequest = config.ChatRequest

ChatRequest is the normalized VLLM request sent through a Transport.

type Client

type Client interface {
	// Start initializes the client runtime.
	// Must be called before any other methods.
	// Returns a transport/runtime error on failure.
	Start(ctx context.Context, opts ...Option) error

	// StartWithContent initializes the runtime and immediately sends initial user content.
	// Equivalent to calling Start() followed by Query(ctx, content).
	// The content is sent to the "default" session.
	// Returns a transport/runtime error on failure.
	StartWithContent(ctx context.Context, content UserMessageContent, opts ...Option) error

	// StartWithStream initializes the runtime and consumes the provided input iterator
	// as the initial message stream for the active session.
	// The iterator is consumed in a separate goroutine; use context cancellation to abort.
	// Returns a transport/runtime error on failure.
	StartWithStream(ctx context.Context, messages iter.Seq[StreamingMessage], opts ...Option) error

	// Query sends user content to the active session.
	// Returns immediately after sending; use ReceiveMessages() or ReceiveResponse() to get responses.
	// Optional sessionID defaults to "default" for multi-session support.
	Query(ctx context.Context, content UserMessageContent, sessionID ...string) error

	// ReceiveMessages returns an iterator that yields messages indefinitely.
	// Messages are yielded as they arrive until EOF, an error occurs, or context is cancelled.
	// Unlike ReceiveResponse, this iterator does not stop at ResultMessage.
	// Use iter.Pull2 if you need pull-based iteration instead of range.
	ReceiveMessages(ctx context.Context) iter.Seq2[Message, error]

	// ReceiveResponse returns an iterator that yields messages until a ResultMessage is received.
	// Messages are yielded as they arrive for streaming consumption.
	// The iterator stops after yielding the ResultMessage.
	// Use iter.Pull2 if you need pull-based iteration instead of range.
	// To collect all messages into a slice, use slices.Collect or a simple loop.
	ReceiveResponse(ctx context.Context) iter.Seq2[Message, error]

	// Interrupt cancels the current in-flight request.
	Interrupt(ctx context.Context) error

	// SetPermissionMode changes the permission mode during conversation.
	// Valid modes: "default", "acceptEdits", "plan", "bypassPermissions"
	SetPermissionMode(ctx context.Context, mode string) error

	// SetModel changes the AI model during conversation.
	// Pass nil to use the default model.
	SetModel(ctx context.Context, model *string) error

	// ListModels returns the available VLLM models using the current client options.
	ListModels(ctx context.Context) ([]ModelInfo, error)

	// ListModelsResponse returns the full VLLM model discovery payload.
	ListModelsResponse(ctx context.Context) (*ModelListResponse, error)

	// GetServerInfo returns runtime metadata for the active client session.
	// Returns nil when the client is not connected.
	GetServerInfo() map[string]any

	// GetMCPStatus returns MCP server connection status for the current runtime.
	// Returns the status of all configured MCP servers.
	GetMCPStatus(ctx context.Context) (*MCPStatus, error)

	// ReconnectMCPServer reconnects a disconnected or failed MCP server.
	ReconnectMCPServer(ctx context.Context, serverName string) error

	// ToggleMCPServer enables or disables an MCP server.
	ToggleMCPServer(ctx context.Context, serverName string, enabled bool) error

	// StopTask stops a running task by task ID.
	StopTask(ctx context.Context, taskID string) error

	// RewindFiles rewinds tracked files to their state at a specific user message.
	// The userMessageID should be the ID of a previous user message in the conversation.
	// Requires EnableFileCheckpointing=true in VLLMAgentOptions.
	RewindFiles(ctx context.Context, userMessageID string) error

	// SendToolResult sends a tool result for a pending tool call.
	SendToolResult(ctx context.Context, toolUseID, content string, isError bool) error

	// Close terminates the session and cleans up resources.
	// After Close(), the client cannot be reused. Safe to call multiple times.
	Close() error
}

Client provides an interactive, stateful interface for multi-turn VLLM conversations.

Unlike the one-shot Query() function, Client maintains session state across multiple exchanges. It supports interruption, tool loops, and local session state.

Lifecycle: clients are single-use. After Close(), create a new client with NewClient().

Example usage:

client := NewClient()
defer func() { _ = client.Close() }()

err := client.Start(ctx,
    WithLogger(slog.Default()),
    WithPermissionMode("acceptEdits"),
)
if err != nil {
    log.Fatal(err)
}

// Send a query
err = client.Query(ctx, NewUserMessageContent("What is 2+2?"))
if err != nil {
    log.Fatal(err)
}

// Receive all messages for this response (stops at ResultMessage)
for msg, err := range client.ReceiveResponse(ctx) {
    if err != nil {
        log.Fatal(err)
    }
    _ = msg // Process message...
}

// Or receive messages indefinitely (for continuous streaming)
for msg, err := range client.ReceiveMessages(ctx) {
    if err != nil {
        break
    }
    _ = msg // Process message...
}

func NewClient

func NewClient() Client

NewClient creates a new interactive client.

Call Start() with options to begin a session:

client := NewClient()
err := client.Start(ctx,
    WithLogger(slog.Default()),
    WithPermissionMode("acceptEdits"),
)

type ContentBlock

type ContentBlock = message.ContentBlock

ContentBlock represents a block of content within a message.

type Effort

type Effort = config.Effort

Effort controls thinking depth.

type HookCallback

type HookCallback = hook.Callback

HookCallback is the function signature for hook callbacks.

type HookContext

type HookContext = hook.Context

HookContext provides context for hook execution.

type HookEvent

type HookEvent = hook.Event

HookEvent represents the type of event that triggers a hook.

type HookInput

type HookInput = hook.Input

HookInput is the interface for all hook input types.

type HookJSONOutput

type HookJSONOutput = hook.JSONOutput

HookJSONOutput is the interface for hook output types.

type HookMatcher

type HookMatcher = hook.Matcher

HookMatcher configures which tools/events a hook applies to.

type HookSpecificOutput

type HookSpecificOutput = hook.SpecificOutput

HookSpecificOutput is the interface for hook-specific outputs.

type ImageBlock

type ImageBlock = message.ImageBlock

ImageBlock contains a generated image reference.

type InputAudioBlock

type InputAudioBlock = message.InputAudioBlock

InputAudioBlock contains an input audio payload for multimodal prompts.

func AudioInput

func AudioInput(format, data string) *InputAudioBlock

AudioInput creates an input audio block from base64 audio plus its format.

type InputAudioRef

type InputAudioRef = message.InputAudioRef

InputAudioRef contains base64-encoded input audio plus its format.

type InputFileBlock

type InputFileBlock = message.InputFileBlock

InputFileBlock contains an input file reference for multimodal prompts.

func FileInput

func FileInput(filename, fileData string) *InputFileBlock

FileInput creates an input file block from a URL or data URL.

type InputFileRef

type InputFileRef = message.InputFileRef

InputFileRef identifies an input file URL or data URL.

type InputImageBlock

type InputImageBlock = message.InputImageBlock

InputImageBlock contains an input image reference for multimodal prompts.

func ImageInput

func ImageInput(url string) *InputImageBlock

ImageInput creates an input image block from a URL or data URL.

type InputImageRef

type InputImageRef = message.InputImageRef

InputImageRef identifies an input image URL or data URL.

type InputVideoBlock

type InputVideoBlock = message.InputVideoBlock

InputVideoBlock contains an input video reference for multimodal prompts.

func VideoInput

func VideoInput(url string) *InputVideoBlock

VideoInput creates an input video block from a URL or data URL.

type InputVideoRef

type InputVideoRef = message.InputVideoRef

InputVideoRef identifies an input video URL or data URL.

type MCPHTTPServerConfig

type MCPHTTPServerConfig = mcp.HTTPServerConfig

MCPHTTPServerConfig configures an HTTP-based MCP server.

type MCPSSEServerConfig

type MCPSSEServerConfig = mcp.SSEServerConfig

MCPSSEServerConfig configures a Server-Sent Events MCP server.

type MCPSdkServerConfig

type MCPSdkServerConfig = mcp.SdkServerConfig

MCPSdkServerConfig configures an SDK-provided MCP server.

func CreateSdkMcpServer

func CreateSdkMcpServer(name, version string, tools ...*SdkMcpTool) *MCPSdkServerConfig

CreateSdkMcpServer creates an in-process MCP server configuration with SdkMcpTool tools.

This function creates an MCP server that runs within your application, providing better performance than external MCP servers.

The returned config can be used directly in VLLMAgentOptions.MCPServers:

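addTool := vllmsdk.NewSdkMcpTool("add", "Add two numbers",
    vllmsdk.SimpleSchema(map[string]string{"a": "float64", "b": "float64"}),
    func(ctx context.Context, req *vllmsdk.CallToolRequest) (*vllmsdk.CallToolResult, error) {
        args, err := vllmsdk.ParseArguments(req)
        if err != nil {
            return vllmsdk.ErrorResult(err.Error()), nil
        }
        // Checked assertions avoid a panic when an argument is missing or not a number.
        a, okA := args["a"].(float64)
        b, okB := args["b"].(float64)
        if !okA || !okB {
            return vllmsdk.ErrorResult("a and b must be numbers"), nil
        }
        return vllmsdk.TextResult(fmt.Sprintf("Result: %v", a+b)), nil
    },
)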

calculator := vllmsdk.CreateSdkMcpServer("calculator", "1.0.0", addTool)

options := &vllmsdk.VLLMAgentOptions{
    MCPServers: map[string]vllmsdk.MCPServerConfig{
        "calculator": calculator,
    },
    AllowedTools: []string{"mcp__calculator__add"},
}

Parameters:

  • name: Server name (also used as key in MCPServers map, determines tool naming: mcp__<name>__<toolName>)
  • version: Server version string
  • tools: SdkMcpTool instances to register with the server
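The naming rule mcp__<name>__<toolName> is what ties the server name to entries such as "mcp__calculator__add" in AllowedTools. A small sketch of building and splitting such names follows; qualifiedToolName and splitToolName are hypothetical helpers, not SDK functions, and the split assumes the server name itself contains no double underscore.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifiedToolName builds the documented mcp__<server>__<tool> form.
func qualifiedToolName(server, tool string) string {
	return "mcp__" + server + "__" + tool
}

// splitToolName reverses qualifiedToolName.
// Assumption: the server name does not itself contain "__".
func splitToolName(name string) (server, tool string, ok bool) {
	rest, found := strings.CutPrefix(name, "mcp__")
	if !found {
		return "", "", false
	}
	server, tool, ok = strings.Cut(rest, "__")
	if !ok || server == "" || tool == "" {
		return "", "", false
	}
	return server, tool, true
}

func main() {
	name := qualifiedToolName("calculator", "add")
	fmt.Println(name)

	server, tool, ok := splitToolName(name)
	fmt.Println(server, tool, ok)
}
```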

type MCPServerConfig

type MCPServerConfig = mcp.ServerConfig

MCPServerConfig is the interface for MCP server configurations.

type MCPServerStatus

type MCPServerStatus = mcp.ServerStatus

MCPServerStatus represents the connection status of a single MCP server.

type MCPServerType

type MCPServerType = mcp.ServerType

MCPServerType represents the type of MCP server.

type MCPStatus

type MCPStatus = mcp.Status

MCPStatus represents the connection status of all configured MCP servers.

type MCPStdioServerConfig

type MCPStdioServerConfig = mcp.StdioServerConfig

MCPStdioServerConfig configures a stdio-based MCP server.

type McpAudioContent

type McpAudioContent = mcp.AudioContent

McpAudioContent represents audio content in a tool result.

type McpContent

type McpContent = mcp.Content

McpContent is the interface for content types in tool results.

type McpImageContent

type McpImageContent = mcp.ImageContent

McpImageContent represents image content in a tool result.

type McpTextContent

type McpTextContent = mcp.TextContent

McpTextContent represents text content in a tool result.

type McpTool

type McpTool = mcp.Tool

McpTool represents an MCP tool definition from the official SDK.

type McpToolAnnotations

type McpToolAnnotations = mcp.ToolAnnotations

McpToolAnnotations describes optional hints about tool behavior. Fields include ReadOnlyHint, DestructiveHint, IdempotentHint, OpenWorldHint, and Title.

type McpToolHandler

type McpToolHandler = mcp.ToolHandler

McpToolHandler is the function signature for low-level tool handlers.

type Message

type Message = message.Message

Message represents any message in the conversation.

func GetSessionMessages

func GetSessionMessages(ctx context.Context, sessionID string, opts ...Option) ([]Message, error)

GetSessionMessages returns persisted local session messages.

type MessageParseError

type MessageParseError = internalerrors.MessageParseError

MessageParseError indicates message parsing failed.

type MessageStream

type MessageStream = iter.Seq[StreamingMessage]

MessageStream is an iterator that yields streaming messages.

type Model

type Model = model.Model

Model is a stable provider-neutral projection of a discovered model.

type ModelArchitecture

type ModelArchitecture = model.Architecture

ModelArchitecture describes the underlying model family where available.

type ModelEndpoint

type ModelEndpoint = model.Endpoint

ModelEndpoint identifies an endpoint a model supports.

type ModelInfo

type ModelInfo = model.Info

ModelInfo describes a model available from VLLM discovery endpoints.

func ListModels

func ListModels(ctx context.Context, opts ...Option) ([]ModelInfo, error)

ListModels returns the available vLLM-served models.

type ModelListResponse

type ModelListResponse = model.ListResponse

ModelListResponse contains the full model discovery payload.

func ListModelsResponse

func ListModelsResponse(ctx context.Context, opts ...Option) (*ModelListResponse, error)

ListModelsResponse returns the full vLLM model discovery payload.

type ModelPerRequestLimits

type ModelPerRequestLimits = model.PerRequestLimits

ModelPerRequestLimits captures request limits where available.

type ModelPricing

type ModelPricing = model.Pricing

ModelPricing contains VLLM pricing fields as returned by discovery endpoints.

type ModelSupportedParameters

type ModelSupportedParameters = model.SupportedParameters

ModelSupportedParameters records provider-supported request parameters.

type ModelTopProvider

type ModelTopProvider = model.TopProvider

ModelTopProvider describes provider-side limits and moderation behavior.

type NotificationHookInput

type NotificationHookInput = hook.NotificationInput

NotificationHookInput is the input for Notification hooks.

type NotificationHookSpecificOutput

type NotificationHookSpecificOutput = hook.NotificationSpecificOutput

NotificationHookSpecificOutput is the hook-specific output for Notification.

type Option

type Option func(*VLLMAgentOptions)

Option configures VLLMAgentOptions using the functional options pattern. This is the primary option type for configuring clients and queries.
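The functional options pattern can be sketched with the standard library alone. This is an illustration under stated assumptions, not the SDK's implementation: agentOptions, its fields, and the lower-case helpers are hypothetical stand-ins for VLLMAgentOptions and the With... constructors, and only the default base URL is taken from the README.

```go
package main

import (
	"fmt"
	"time"
)

// agentOptions is a hypothetical stand-in for VLLMAgentOptions.
type agentOptions struct {
	BaseURL string
	Model   string
	Timeout time.Duration
}

// option mirrors the shape of vllmsdk.Option: a function that mutates options.
type option func(*agentOptions)

// withModel is a stand-in for WithModel.
func withModel(model string) option {
	return func(o *agentOptions) { o.Model = model }
}

// withRequestTimeout is a stand-in for WithRequestTimeout.
func withRequestTimeout(d time.Duration) option {
	return func(o *agentOptions) { o.Timeout = d }
}

// applyOptions starts from defaults and applies each option in order,
// so a later option overrides an earlier one for the same field.
func applyOptions(opts ...option) agentOptions {
	o := agentOptions{BaseURL: "http://127.0.0.1:8000/v1"}
	for _, opt := range opts {
		if opt != nil {
			opt(&o)
		}
	}
	return o
}

func main() {
	o := applyOptions(withModel("my-model"), withRequestTimeout(30*time.Second))
	fmt.Printf("%+v\n", o)
}
```

In this sketch, options that are never passed leave the defaults untouched, which is what allows variadic opts ...Option parameters to stay backward compatible as new settings are added.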

func WithAPIKey

func WithAPIKey(apiKey string) Option

WithAPIKey sets the vLLM API key directly.

func WithAllowedTools

func WithAllowedTools(tools ...string) Option

WithAllowedTools sets pre-approved tools that can be used without prompting.

func WithBackground

func WithBackground(background bool) Option

WithBackground sets responses.background.

func WithBaseURL

func WithBaseURL(baseURL string) Option

WithBaseURL overrides the vLLM base URL.

func WithCanUseTool

func WithCanUseTool(callback ToolPermissionCallback) Option

WithCanUseTool sets a callback for permission checking before each tool use.

func WithCwd

func WithCwd(cwd string) Option

WithCwd sets the working directory used by local session and tool features.

func WithDisallowedTools

func WithDisallowedTools(tools ...string) Option

WithDisallowedTools sets tools that are explicitly blocked.

func WithEffort

func WithEffort(effort config.Effort) Option

WithEffort sets the thinking effort level.

func WithEnableFileCheckpointing

func WithEnableFileCheckpointing(enable bool) Option

WithEnableFileCheckpointing enables file change tracking and rewinding.

func WithFallbackModel

func WithFallbackModel(model string) Option

WithFallbackModel specifies a model to use if the primary model fails.

func WithForkSession

func WithForkSession(fork bool) Option

WithForkSession indicates whether to fork the resumed session to a new ID.

func WithFrequencyPenalty

func WithFrequencyPenalty(v float64) Option

WithFrequencyPenalty sets the frequency penalty.

func WithHTTPReferer

func WithHTTPReferer(referer string) Option

WithHTTPReferer sets the HTTP-Referer header for vLLM requests.

func WithHooks

func WithHooks(hooks map[HookEvent][]*HookMatcher) Option

WithHooks configures event hooks for tool interception.

func WithImageConfig

func WithImageConfig(cfg map[string]any) Option

WithImageConfig sets provider-specific image config.

func WithInclude

func WithInclude(include ...string) Option

WithInclude sets responses.include entries.

func WithIncludePartialMessages

func WithIncludePartialMessages(include bool) Option

WithIncludePartialMessages enables streaming of partial message updates.

func WithInstructions

func WithInstructions(instructions string) Option

WithInstructions sets responses.instructions.

func WithLogger

func WithLogger(logger *slog.Logger) Option

WithLogger sets the logger for debug output. If not set, logging is disabled (silent operation).

func WithLogprobs

func WithLogprobs(enable bool) Option

WithLogprobs enables token log probabilities where supported.

func WithMCPConfig

func WithMCPConfig(config string) Option

WithMCPConfig sets a path to an MCP config file or a raw JSON string. If set, this takes precedence over WithMCPServers.

func WithMCPServers

func WithMCPServers(servers map[string]MCPServerConfig) Option

WithMCPServers configures external MCP servers to connect to. Map key is the server name, value is the server configuration.

func WithMaxBudgetUSD

func WithMaxBudgetUSD(budget float64) Option

WithMaxBudgetUSD sets a cost limit for the session in USD.

func WithMaxOutputTokens

func WithMaxOutputTokens(max int) Option

WithMaxOutputTokens sets responses.max_output_tokens.

func WithMaxTokens

func WithMaxTokens(max int) Option

WithMaxTokens sets chat max_tokens.

func WithMaxToolCalls

func WithMaxToolCalls(max int) Option

WithMaxToolCalls sets responses.max_tool_calls.

func WithMaxToolIterations

func WithMaxToolIterations(max int) Option

WithMaxToolIterations sets maximum tool-call loops per query.

func WithMaxTurns

func WithMaxTurns(maxTurns int) Option

WithMaxTurns limits the maximum number of conversation turns.

func WithModalities

func WithModalities(modalities ...string) Option

WithModalities sets output modalities.

func WithModel

func WithModel(model string) Option

WithModel specifies which vLLM model to use.

func WithModels

func WithModels(models ...string) Option

WithModels sets candidate fallback models for vLLM routing.

func WithOnUserInput

func WithOnUserInput(callback UserInputCallback) Option

WithOnUserInput sets a callback for handling SDK user-input tool prompts.

func WithOutputFormat

func WithOutputFormat(format map[string]any) Option

WithOutputFormat specifies a JSON schema for structured output.

The canonical format uses a wrapper object:

vllmsdk.WithOutputFormat(map[string]any{
    "type": "json_schema",
    "schema": map[string]any{
        "type":       "object",
        "properties": map[string]any{...},
        "required":   []string{...},
    },
})

Raw JSON schemas (without the wrapper) are also accepted and auto-wrapped:

vllmsdk.WithOutputFormat(map[string]any{
    "type":       "object",
    "properties": map[string]any{...},
    "required":   []string{...},
})

Structured output is available on ResultMessage.StructuredOutput (parsed) or ResultMessage.Result (JSON string).
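The auto-wrapping described above can be pictured with a small helper. This is a sketch only: normalizeOutputFormat is a hypothetical name, and the detection rule (a "type" of "json_schema" plus a "schema" key means the value is already wrapped) is an assumption, not the SDK's confirmed logic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeOutputFormat is a hypothetical helper, not the SDK's implementation.
// It returns the canonical wrapper form for either accepted input shape.
func normalizeOutputFormat(format map[string]any) map[string]any {
	// Assumption: a wrapper is recognized by type == "json_schema" and a "schema" key.
	if t, _ := format["type"].(string); t == "json_schema" {
		if _, ok := format["schema"]; ok {
			return format
		}
	}
	// Anything else is treated as a raw JSON Schema and wrapped.
	return map[string]any{"type": "json_schema", "schema": format}
}

func main() {
	raw := map[string]any{
		"type": "object",
		"properties": map[string]any{
			"answer": map[string]any{"type": "string"},
		},
		"required": []string{"answer"},
	}
	out, err := json.MarshalIndent(normalizeOutputFormat(raw), "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Normalizing once at the boundary means the rest of a program only has to handle the wrapper form.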

func WithParallelToolCalls

func WithParallelToolCalls(enable bool) Option

WithParallelToolCalls sets parallel_tool_calls.

func WithPermissionMode

func WithPermissionMode(mode string) Option

WithPermissionMode controls how permissions are handled. Valid values: "default", "acceptEdits", "plan", "bypassPermissions".

func WithPermissionPromptToolName

func WithPermissionPromptToolName(name string) Option

WithPermissionPromptToolName specifies the tool name to use for permission prompts.

func WithPlugins

func WithPlugins(plugins ...*SdkPluginConfig) Option

WithPlugins configures plugins to load.

func WithPresencePenalty

func WithPresencePenalty(v float64) Option

WithPresencePenalty sets the presence penalty.

func WithPreviousResponseID

func WithPreviousResponseID(responseID string) Option

WithPreviousResponseID sets responses.previous_response_id.

func WithPrompt

func WithPrompt(prompt map[string]any) Option

WithPrompt sets responses.prompt payload.

func WithPromptCacheKey

func WithPromptCacheKey(key string) Option

WithPromptCacheKey sets responses.prompt_cache_key.

func WithProvider

func WithProvider(provider map[string]any) Option

WithProvider sets vLLM provider preferences.

func WithReasoning

func WithReasoning(reasoning map[string]any) Option

WithReasoning sets the vLLM reasoning configuration.

func WithRequestTimeout

func WithRequestTimeout(timeout time.Duration) Option

WithRequestTimeout sets the HTTP request timeout for vLLM calls.

func WithResponseText

func WithResponseText(text map[string]any) Option

WithResponseText sets responses.text payload directly.

func WithResume

func WithResume(sessionID string) Option

WithResume sets a session ID to resume from.

func WithRoute

func WithRoute(route string) Option

WithRoute sets the vLLM route preference.

func WithSDKTools

func WithSDKTools(tools ...Tool) Option

WithSDKTools registers high-level Tool instances as an in-process MCP server. Tools are exposed under the "sdk" MCP server name (tool names: mcp__sdk__<name>). Each tool is automatically added to AllowedTools.
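The mcp__sdk__<name> naming scheme matters when a tool registered this way has to be referenced by its full name, for example in an allow or deny list. The sketch below builds and parses such names; qualifiedToolName and splitQualifiedToolName are hypothetical helpers written for illustration, and the general mcp__<server>__<tool> form is inferred from the single example given above.

```go
package main

import (
	"fmt"
	"strings"
)

// sdkServerName is the MCP server name that WithSDKTools is documented to use.
const sdkServerName = "sdk"

// qualifiedToolName builds a name of the form mcp__<server>__<tool>.
// Hypothetical helper; the SDK may not expose an equivalent.
func qualifiedToolName(server, tool string) string {
	return "mcp__" + server + "__" + tool
}

// splitQualifiedToolName reverses qualifiedToolName. It reports ok=false
// when the name does not follow the mcp__<server>__<tool> form.
func splitQualifiedToolName(name string) (server, tool string, ok bool) {
	rest, found := strings.CutPrefix(name, "mcp__")
	if !found {
		return "", "", false
	}
	server, tool, ok = strings.Cut(rest, "__")
	if !ok || server == "" || tool == "" {
		return "", "", false
	}
	return server, tool, true
}

func main() {
	name := qualifiedToolName(sdkServerName, "calculator")
	server, tool, ok := splitQualifiedToolName(name)
	fmt.Println(name, server, tool, ok)
}
```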

func WithSafetyIdentifier

func WithSafetyIdentifier(safetyID string) Option

WithSafetyIdentifier sets responses.safety_identifier.

func WithSeed

func WithSeed(seed int64) Option

WithSeed sets a deterministic seed where supported.

func WithServiceTier

func WithServiceTier(tier string) Option

WithServiceTier sets responses.service_tier.

func WithSessionID

func WithSessionID(sessionID string) Option

WithSessionID sets request session_id.

func WithSessionStorePath

func WithSessionStorePath(path string) Option

WithSessionStorePath enables durable session persistence at a JSON file path. When set, resume/fork state can survive process restarts.

func WithStop

func WithStop(stop ...string) Option

WithStop sets stop sequences.

func WithStore

func WithStore(store bool) Option

WithStore sets responses.store.

func WithSystemPrompt

func WithSystemPrompt(prompt string) Option

WithSystemPrompt sets the system message to send to the model.

func WithSystemPromptPreset

func WithSystemPromptPreset(preset *SystemPromptPreset) Option

WithSystemPromptPreset sets a preset system prompt configuration. If set, this takes precedence over WithSystemPrompt.

func WithTemperature

func WithTemperature(temperature float64) Option

WithTemperature sets the sampling temperature.

func WithThinking

func WithThinking(thinking config.ThinkingConfig) Option

WithThinking sets the thinking configuration.

func WithToolChoice

func WithToolChoice(choice any) Option

WithToolChoice sets the tool_choice payload.

func WithTools

func WithTools(tools config.ToolsConfig) Option

WithTools specifies which tools are available. Accepts ToolsList (tool names) or *ToolsPreset.

func WithTopK

func WithTopK(topK float64) Option

WithTopK sets top-k sampling (responses API).

func WithTopLogprobs

func WithTopLogprobs(v int) Option

WithTopLogprobs sets top_logprobs where supported.

func WithTopP

func WithTopP(topP float64) Option

WithTopP sets nucleus sampling probability.

func WithTrace

func WithTrace(enable bool) Option

WithTrace enables vLLM tracing where supported.

func WithTransport

func WithTransport(transport Transport) Option

WithTransport injects a custom transport implementation. The transport must implement the Transport interface.

func WithTruncation

func WithTruncation(truncation string) Option

WithTruncation sets responses.truncation.

func WithUser

func WithUser(user string) Option

WithUser sets a user identifier for tracking purposes.

func WithVLLMAPIMode

func WithVLLMAPIMode(mode config.VLLMAPIMode) Option

WithVLLMAPIMode selects the vLLM API surface.

func WithVLLMExtra

func WithVLLMExtra(extra map[string]any) Option

WithVLLMExtra merges raw request fields into the outgoing payload.

func WithVLLMMetadata

func WithVLLMMetadata(metadata map[string]any) Option

WithVLLMMetadata sets request metadata.

func WithVLLMPlugins

func WithVLLMPlugins(plugins ...map[string]any) Option

WithVLLMPlugins sets vLLM plugin payloads.

func WithXTitle

func WithXTitle(title string) Option

WithXTitle sets the X-Title header for vLLM requests.

type PermissionBehavior

type PermissionBehavior = permission.Behavior

PermissionBehavior represents the permission behavior for a rule.

type PermissionMode

type PermissionMode = permission.Mode

PermissionMode represents different permission handling modes.

type PermissionRequestHookInput

type PermissionRequestHookInput = hook.PermissionRequestInput

PermissionRequestHookInput is the input for PermissionRequest hooks.

type PermissionRequestHookSpecificOutput

type PermissionRequestHookSpecificOutput = hook.PermissionRequestSpecificOutput

PermissionRequestHookSpecificOutput is the hook-specific output for PermissionRequest.

type PermissionResult

type PermissionResult = permission.Result

PermissionResult is the interface for permission decision results.

type PermissionResultAllow

type PermissionResultAllow = permission.ResultAllow

PermissionResultAllow represents an allow decision.

type PermissionResultDeny

type PermissionResultDeny = permission.ResultDeny

PermissionResultDeny represents a deny decision.

type PermissionRuleValue

type PermissionRuleValue = permission.RuleValue

PermissionRuleValue represents a permission rule.

type PermissionUpdate

type PermissionUpdate = permission.Update

PermissionUpdate represents a permission update request.

type PermissionUpdateDestination

type PermissionUpdateDestination = permission.UpdateDestination

PermissionUpdateDestination represents where permission updates are stored.

type PermissionUpdateType

type PermissionUpdateType = permission.UpdateType

PermissionUpdateType represents the type of permission update.

type PostToolUseFailureHookInput

type PostToolUseFailureHookInput = hook.PostToolUseFailureInput

PostToolUseFailureHookInput is the input for PostToolUseFailure hooks.

type PostToolUseFailureHookSpecificOutput

type PostToolUseFailureHookSpecificOutput = hook.PostToolUseFailureSpecificOutput

PostToolUseFailureHookSpecificOutput is the hook-specific output for PostToolUseFailure.

type PostToolUseHookInput

type PostToolUseHookInput = hook.PostToolUseInput

PostToolUseHookInput is the input for PostToolUse hooks.

type PostToolUseHookSpecificOutput

type PostToolUseHookSpecificOutput = hook.PostToolUseSpecificOutput

PostToolUseHookSpecificOutput is the hook-specific output for PostToolUse.

type PreCompactHookInput

type PreCompactHookInput = hook.PreCompactInput

PreCompactHookInput is the input for PreCompact hooks.

type PreToolUseHookInput

type PreToolUseHookInput = hook.PreToolUseInput

PreToolUseHookInput is the input for PreToolUse hooks.

type PreToolUseHookSpecificOutput

type PreToolUseHookSpecificOutput = hook.PreToolUseSpecificOutput

PreToolUseHookSpecificOutput is the hook-specific output for PreToolUse.

type ResultMessage

type ResultMessage = message.ResultMessage

ResultMessage represents the final result of a query.

type Schema

type Schema = jsonschema.Schema

Schema is a JSON Schema object for tool input validation.

type SchemaBuilder

type SchemaBuilder struct {
	// contains filtered or unexported fields
}

SchemaBuilder provides a fluent interface for building JSON schemas.

func NewSchemaBuilder

func NewSchemaBuilder() *SchemaBuilder

NewSchemaBuilder creates a new SchemaBuilder.

func (*SchemaBuilder) Build

func (b *SchemaBuilder) Build() map[string]any

Build returns the complete JSON Schema.

func (*SchemaBuilder) OptionalProperty

func (b *SchemaBuilder) OptionalProperty(name, goType string) *SchemaBuilder

OptionalProperty adds an optional property (not in required list).

func (*SchemaBuilder) OptionalPropertyWithDescription

func (b *SchemaBuilder) OptionalPropertyWithDescription(name, goType, description string) *SchemaBuilder

OptionalPropertyWithDescription adds an optional property with description.

func (*SchemaBuilder) Property

func (b *SchemaBuilder) Property(name, goType string) *SchemaBuilder

Property adds a property with the given name and Go type. The property is marked as required by default.

func (*SchemaBuilder) PropertyWithDescription

func (b *SchemaBuilder) PropertyWithDescription(name, goType, description string) *SchemaBuilder

PropertyWithDescription adds a property with type and description.
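The difference between Property and OptionalProperty shows up in the schema's "required" list. The stand-in builder below is a sketch written for illustration, not the SDK's SchemaBuilder: the lower-case type and methods are hypothetical, and the mapping from Go type names to JSON Schema types is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonTypeFor maps a Go type name to a JSON Schema type.
// The mapping is an assumption made for this sketch.
func jsonTypeFor(goType string) string {
	switch goType {
	case "string":
		return "string"
	case "int", "int64":
		return "integer"
	case "float64":
		return "number"
	case "bool":
		return "boolean"
	default:
		return "object"
	}
}

// schemaBuilder is a hypothetical stand-in for vllmsdk.SchemaBuilder.
type schemaBuilder struct {
	properties map[string]any
	required   []string
}

func newSchemaBuilder() *schemaBuilder {
	return &schemaBuilder{properties: map[string]any{}}
}

// property adds a property and records it as required.
func (b *schemaBuilder) property(name, goType string) *schemaBuilder {
	b.properties[name] = map[string]any{"type": jsonTypeFor(goType)}
	b.required = append(b.required, name)
	return b
}

// optionalProperty adds a property without touching the required list.
func (b *schemaBuilder) optionalProperty(name, goType string) *schemaBuilder {
	b.properties[name] = map[string]any{"type": jsonTypeFor(goType)}
	return b
}

// build assembles the final object schema.
func (b *schemaBuilder) build() map[string]any {
	return map[string]any{
		"type":       "object",
		"properties": b.properties,
		"required":   b.required,
	}
}

func main() {
	schema := newSchemaBuilder().
		property("query", "string").
		optionalProperty("limit", "int").
		build()
	out, err := json.MarshalIndent(schema, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```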

type SdkMcpServerInstance

type SdkMcpServerInstance = mcp.ServerInstance

SdkMcpServerInstance is the interface that SDK MCP servers must implement.

type SdkMcpTool

type SdkMcpTool struct {
	ToolName        string
	ToolDescription string
	ToolSchema      *jsonschema.Schema
	ToolHandler     SdkMcpToolHandler
	ToolAnnotations *mcp.ToolAnnotations
}

SdkMcpTool represents a tool created with NewSdkMcpTool.

func NewSdkMcpTool

func NewSdkMcpTool(
	name, description string,
	inputSchema *jsonschema.Schema,
	handler SdkMcpToolHandler,
	opts ...SdkMcpToolOption,
) *SdkMcpTool

NewSdkMcpTool creates an SdkMcpTool with optional configuration.

The inputSchema should be a *jsonschema.Schema. Use SimpleSchema for convenience or create a full Schema struct for more control.

Use WithAnnotations to set MCP tool annotations (hints about tool behavior).

Example with SimpleSchema:

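addTool := vllmsdk.NewSdkMcpTool("add", "Add two numbers",
    vllmsdk.SimpleSchema(map[string]string{"a": "float64", "b": "float64"}),
    func(ctx context.Context, req *vllmsdk.CallToolRequest) (*vllmsdk.CallToolResult, error) {
        args, err := vllmsdk.ParseArguments(req)
        if err != nil {
            return vllmsdk.ErrorResult(err.Error()), nil
        }
        // Checked assertions avoid a panic on missing or non-numeric arguments.
        a, aOK := args["a"].(float64)
        b, bOK := args["b"].(float64)
        if !aOK || !bOK {
            return vllmsdk.ErrorResult("a and b must be numbers"), nil
        }
        return vllmsdk.TextResult(fmt.Sprintf("Result: %v", a+b)), nil
    },
    vllmsdk.WithAnnotations(&vllmsdk.McpToolAnnotations{
        ReadOnlyHint: true,
    }),
)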

func (*SdkMcpTool) Annotations

func (t *SdkMcpTool) Annotations() *mcp.ToolAnnotations

Annotations returns the tool annotations, or nil if not set.

func (*SdkMcpTool) Description

func (t *SdkMcpTool) Description() string

Description returns the tool description.

func (*SdkMcpTool) Handler

func (t *SdkMcpTool) Handler() SdkMcpToolHandler

Handler returns the tool handler function.

func (*SdkMcpTool) InputSchema

func (t *SdkMcpTool) InputSchema() *jsonschema.Schema

InputSchema returns the JSON Schema for the tool input.

func (*SdkMcpTool) Name

func (t *SdkMcpTool) Name() string

Name returns the tool name.

type SdkMcpToolHandler

type SdkMcpToolHandler = mcp.ToolHandler

SdkMcpToolHandler is the function signature for SdkMcpTool handlers. It receives the context and request, and returns the result.

Use ParseArguments to extract input as map[string]any from the request. Use TextResult, ErrorResult, or ImageResult helpers to create results.

Example:

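func(ctx context.Context, req *vllmsdk.CallToolRequest) (*vllmsdk.CallToolResult, error) {
    args, err := vllmsdk.ParseArguments(req)
    if err != nil {
        return vllmsdk.ErrorResult(err.Error()), nil
    }
    // Use a checked assertion so a missing or non-numeric "a" does not panic.
    a, ok := args["a"].(float64)
    if !ok {
        return vllmsdk.ErrorResult("a must be a number"), nil
    }
    return vllmsdk.TextResult(fmt.Sprintf("Result: %v", a)), nil
}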

type SdkMcpToolOption

type SdkMcpToolOption func(*SdkMcpTool)

SdkMcpToolOption configures an SdkMcpTool during construction.

func WithAnnotations

func WithAnnotations(annotations *mcp.ToolAnnotations) SdkMcpToolOption

WithAnnotations sets MCP tool annotations (hints about tool behavior). Annotations describe properties like whether a tool is read-only, destructive, idempotent, or operates in an open world.

type SdkPluginConfig

type SdkPluginConfig = config.PluginConfig

SdkPluginConfig configures a plugin to load.

type SessionStat

type SessionStat struct {
	SessionID           string
	CreatedAt           string
	UpdatedAt           string
	MessageCount        int
	UserTurns           int
	CheckpointCount     int
	FileCheckpointCount int
}

SessionStat contains metadata about a locally persisted SDK session.

func ListSessions

func ListSessions(ctx context.Context, opts ...Option) ([]SessionStat, error)

ListSessions returns local SDK sessions from the configured session store.

func StatSession

func StatSession(ctx context.Context, sessionID string, opts ...Option) (*SessionStat, error)

StatSession returns metadata for a locally persisted SDK session.

type StopHookInput

type StopHookInput = hook.StopInput

StopHookInput is the input for Stop hooks.

type StreamEvent

type StreamEvent = message.StreamEvent

StreamEvent represents a raw streaming event from the vLLM backend.

type StreamingMessage

type StreamingMessage = message.StreamingMessage

StreamingMessage represents a message sent in streaming mode.

func NewUserMessage

func NewUserMessage(content UserMessageContent) StreamingMessage

NewUserMessage creates a StreamingMessage with type "user". This is a convenience constructor for creating user messages.

type StreamingMessageContent

type StreamingMessageContent = message.StreamingMessageContent

StreamingMessageContent represents the content of a streaming message.

type SubagentStartHookInput

type SubagentStartHookInput = hook.SubagentStartInput

SubagentStartHookInput is the input for SubagentStart hooks.

type SubagentStartHookSpecificOutput

type SubagentStartHookSpecificOutput = hook.SubagentStartSpecificOutput

SubagentStartHookSpecificOutput is the hook-specific output for SubagentStart.

type SubagentStopHookInput

type SubagentStopHookInput = hook.SubagentStopInput

SubagentStopHookInput is the input for SubagentStop hooks.

type SyncHookJSONOutput

type SyncHookJSONOutput = hook.SyncJSONOutput

SyncHookJSONOutput represents a sync hook output.

type SystemMessage

type SystemMessage = message.SystemMessage

SystemMessage represents a system message.

type SystemPromptPreset

type SystemPromptPreset = config.SystemPromptPreset

SystemPromptPreset defines a system prompt preset configuration.

type TextBlock

type TextBlock = message.TextBlock

TextBlock contains plain text content.

func TextInput

func TextInput(text string) *TextBlock

TextInput creates a text block for block-based multimodal user content.

type ThinkingBlock

type ThinkingBlock = message.ThinkingBlock

ThinkingBlock contains model reasoning or thinking output.

type ThinkingConfig

type ThinkingConfig = config.ThinkingConfig

ThinkingConfig controls extended thinking behavior.

type ThinkingConfigAdaptive

type ThinkingConfigAdaptive = config.ThinkingConfigAdaptive

ThinkingConfigAdaptive enables adaptive thinking mode.

type ThinkingConfigDisabled

type ThinkingConfigDisabled = config.ThinkingConfigDisabled

ThinkingConfigDisabled disables extended thinking.

type ThinkingConfigEnabled

type ThinkingConfigEnabled = config.ThinkingConfigEnabled

ThinkingConfigEnabled enables thinking with a specific token budget.

type Tool

type Tool interface {
	// Name returns the unique identifier for this tool.
	Name() string

	// Description returns a human-readable description for the model.
	Description() string

	// InputSchema returns a JSON schema describing expected input.
	// The schema should follow JSON Schema Draft 7 specification.
	InputSchema() map[string]any

	// Execute runs the tool with the provided input.
	// The input will be validated against InputSchema before execution.
	Execute(ctx context.Context, input map[string]any) (map[string]any, error)
}

Tool represents a custom tool that the SDK can invoke through vLLM tool calling.

Tools allow users to extend agent capabilities with domain-specific functionality. When registered, the model can discover and execute these tools during a session.

Example:

tool := vllmsdk.NewTool(
    "calculator",
    "Performs basic arithmetic operations",
    map[string]any{
        "type": "object",
        "properties": map[string]any{
            "operation": map[string]any{
                "type": "string",
                "enum": []string{"add", "subtract", "multiply", "divide"},
            },
            "a": map[string]any{"type": "number"},
            "b": map[string]any{"type": "number"},
        },
        "required": []string{"operation", "a", "b"},
    },
    func(ctx context.Context, input map[string]any) (map[string]any, error) {
        op := input["operation"].(string)
        a := input["a"].(float64)
        b := input["b"].(float64)

        var result float64
        switch op {
        case "add":
            result = a + b
        case "subtract":
            result = a - b
        case "multiply":
            result = a * b
        case "divide":
            if b == 0 {
                return nil, fmt.Errorf("division by zero")
            }
            result = a / b
        }

        return map[string]any{"result": result}, nil
    },
)

func NewTool

func NewTool(name, description string, schema map[string]any, fn ToolFunc) Tool

NewTool creates a Tool from a function.

This is a convenience constructor for creating tools without implementing the full Tool interface.

Parameters:

  • name: Unique identifier for the tool (e.g., "calculator", "search_database")
  • description: Human-readable description of what the tool does
  • schema: JSON Schema defining the expected input structure
  • fn: Function that executes the tool logic

type ToolFunc

type ToolFunc func(ctx context.Context, input map[string]any) (map[string]any, error)

ToolFunc is a function-based tool implementation.
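A function with the ToolFunc signature can be written and exercised without the rest of the SDK. The sketch below declares a local toolFunc type with the same signature and uses checked type assertions, so malformed input yields an error instead of a panic. numberArg and runDivide are hypothetical helpers introduced only for this illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// toolFunc has the same signature as vllmsdk.ToolFunc.
type toolFunc func(ctx context.Context, input map[string]any) (map[string]any, error)

// numberArg is a hypothetical helper that reads a numeric argument.
// JSON numbers decode to float64, so that is the type checked here.
func numberArg(input map[string]any, key string) (float64, error) {
	v, ok := input[key].(float64)
	if !ok {
		return 0, fmt.Errorf("argument %q must be a number", key)
	}
	return v, nil
}

// divide returns a/b and rejects cancelled contexts, bad input, and b == 0.
var divide toolFunc = func(ctx context.Context, input map[string]any) (map[string]any, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	a, err := numberArg(input, "a")
	if err != nil {
		return nil, err
	}
	b, err := numberArg(input, "b")
	if err != nil {
		return nil, err
	}
	if b == 0 {
		return nil, errors.New("division by zero")
	}
	return map[string]any{"result": a / b}, nil
}

// runDivide calls divide with a background context; it exists for demos and tests.
func runDivide(a, b any) (map[string]any, error) {
	return divide(context.Background(), map[string]any{"a": a, "b": b})
}

func main() {
	out, err := runDivide(6.0, 3.0)
	fmt.Println(out, err)
}
```

A function shaped like divide could then be passed as the fn argument of NewTool together with a matching schema.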

type ToolPermissionCallback

type ToolPermissionCallback = permission.Callback

ToolPermissionCallback is called before each tool use for permission checking.

type ToolPermissionContext

type ToolPermissionContext = permission.Context

ToolPermissionContext provides context for tool permission callbacks.

type ToolPermissionDeniedError

type ToolPermissionDeniedError = internalerrors.ToolPermissionDeniedError

ToolPermissionDeniedError indicates a tool execution was denied by permission policy.

type ToolResultBlock

type ToolResultBlock = message.ToolResultBlock

ToolResultBlock contains the result of a tool execution.

type ToolUseBlock

type ToolUseBlock = message.ToolUseBlock

ToolUseBlock represents the model invoking a tool.

type ToolsConfig

type ToolsConfig = config.ToolsConfig

ToolsConfig is an interface for configuring available tools. It represents either a list of tool names or a preset configuration.

type ToolsList

type ToolsList = config.ToolsList

ToolsList is a list of tool names to make available.

type ToolsPreset

type ToolsPreset = config.ToolsPreset

ToolsPreset represents a preset configuration for available tools.

type Transport

type Transport interface {
	Start(ctx context.Context) error
	CreateStream(ctx context.Context, req *ChatRequest) (<-chan map[string]any, <-chan error)
	Close() error
}

Transport defines the runtime transport interface.
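A custom transport is mostly useful for tests, where canned events replace a live server. The following is a minimal sketch under several assumptions: chatRequest is a hypothetical stand-in for the SDK's ChatRequest, the "delta" event key is invented for illustration, and the convention that both channels are closed when the stream ends is a guess about the contract rather than documented behavior.

```go
package main

import (
	"context"
	"fmt"
)

// chatRequest is a hypothetical stand-in for the SDK's ChatRequest.
type chatRequest struct{ Prompt string }

// transport mirrors the method set of vllmsdk.Transport, using the stand-in request.
type transport interface {
	Start(ctx context.Context) error
	CreateStream(ctx context.Context, req *chatRequest) (<-chan map[string]any, <-chan error)
	Close() error
}

// fakeTransport replays fixed chunks instead of calling a server.
type fakeTransport struct{ chunks []string }

func (f *fakeTransport) Start(ctx context.Context) error { return ctx.Err() }

func (f *fakeTransport) CreateStream(ctx context.Context, req *chatRequest) (<-chan map[string]any, <-chan error) {
	events := make(chan map[string]any)
	errs := make(chan error, 1) // buffered so the producer never blocks on an error
	go func() {
		defer close(events)
		defer close(errs)
		for _, c := range f.chunks {
			select {
			case events <- map[string]any{"delta": c}:
			case <-ctx.Done():
				errs <- ctx.Err()
				return
			}
		}
	}()
	return events, errs
}

func (f *fakeTransport) Close() error { return nil }

// drain reads events until the channel closes, then reports any stream error.
func drain(t transport, prompt string) ([]string, error) {
	ctx := context.Background()
	if err := t.Start(ctx); err != nil {
		return nil, err
	}
	defer t.Close()
	events, errs := t.CreateStream(ctx, &chatRequest{Prompt: prompt})
	var out []string
	for ev := range events {
		if s, ok := ev["delta"].(string); ok {
			out = append(out, s)
		}
	}
	return out, <-errs
}

func main() {
	chunks, err := drain(&fakeTransport{chunks: []string{"Hel", "lo"}}, "say hello")
	fmt.Println(chunks, err)
}
```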

type UnsupportedControlError

type UnsupportedControlError = controlplane.UnsupportedControlError

UnsupportedControlError indicates a control-plane operation is unsupported by this backend.

type UnsupportedHookEventError

type UnsupportedHookEventError = internalerrors.UnsupportedHookEventError

UnsupportedHookEventError indicates a configured hook event is not supported by this backend.

type UnsupportedHookOutputError

type UnsupportedHookOutputError = internalerrors.UnsupportedHookOutputError

UnsupportedHookOutputError indicates a hook output field is unsupported by this backend.

type Usage

type Usage = message.Usage

Usage contains token usage information.

type UserInputAnswer

type UserInputAnswer = userinput.Answer

UserInputAnswer contains the user's response(s) to a single question.

type UserInputCallback

type UserInputCallback = userinput.Callback

UserInputCallback handles user-input requests and returns answers.

type UserInputQuestion

type UserInputQuestion = userinput.Question

UserInputQuestion represents a single user-input prompt.

type UserInputQuestionOption

type UserInputQuestionOption = userinput.QuestionOption

UserInputQuestionOption represents a selectable choice in a user-input question.

type UserInputRequest

type UserInputRequest = userinput.Request

UserInputRequest contains parsed user-input payload data.

type UserInputResponse

type UserInputResponse = userinput.Response

UserInputResponse contains answers keyed by question ID.

type UserMessage

type UserMessage = message.UserMessage

UserMessage represents a message from the user.

type UserMessageContent

type UserMessageContent = message.UserMessageContent

UserMessageContent represents content that can be either a string or []ContentBlock.

func Blocks

func Blocks(blocks ...ContentBlock) UserMessageContent

Blocks creates block-based user content.

func Text

func Text(text string) UserMessageContent

Text creates text-only user content.

type UserPromptSubmitHookInput

type UserPromptSubmitHookInput = hook.UserPromptSubmitInput

UserPromptSubmitHookInput is the input for UserPromptSubmit hooks.

type UserPromptSubmitHookSpecificOutput

type UserPromptSubmitHookSpecificOutput = hook.UserPromptSubmitSpecificOutput

UserPromptSubmitHookSpecificOutput is the hook-specific output for UserPromptSubmit.

type VLLMAPIMode

type VLLMAPIMode = config.VLLMAPIMode

VLLMAPIMode selects which vLLM API surface to use.

type VLLMAgentOptions

type VLLMAgentOptions = config.Options

VLLMAgentOptions configures SDK and vLLM request behavior.

type VLLMSDKError

type VLLMSDKError = internalerrors.VLLMSDKError

VLLMSDKError is the base interface for all SDK errors.

Directories

Path Synopsis
examples
cancellation command
Package main demonstrates cancellation and graceful shutdown patterns.
error_handling command
extended_thinking command
Package main demonstrates extended thinking capabilities with vLLM.
hooks command
include_partial_messages command
Package main demonstrates partial message streaming where incremental assistant updates are received as the model generates responses.
interrupt command
max_budget_usd command
Package main demonstrates API cost control with budget limits.
mcp_calculator command
Package main demonstrates how to create calculator tools using MCP servers.
mcp_status command
Package main demonstrates querying MCP server connection status.
memory_tool command
Package main demonstrates a filesystem-backed memory tool for agent state persistence.
model_discovery command
on_user_input command
parallel_queries command
Package main demonstrates running multiple Query() calls concurrently.
permissions command
pipeline command
Package main demonstrates multi-step LLM orchestration with Go control flow.
query_stream command
quick_start command
sdk_tools command
sessions_local command
system_prompt command
Package main demonstrates configuring system prompts.
vllm_extra command
vllm_responses command
vllm_routing command
internal
hook
Package hook provides hook types for intercepting runtime events.
mcp
message
Package message provides internal message and content block types.
permission
Package permission provides permission handling types.
scripts
