vllmsdk

Version: v0.0.1 (latest)
Published: Apr 10, 2026
License: GPL-3.0
Imports: 25
Imported by: 0

Repository

github.com/ethpandaops/vllm-agent-sdk-go

README ¶

vllm-agent-sdk-go

Go SDK for building agentic applications backed by a local or self-hosted vLLM OpenAI-compatible server.

Package: vllmsdk
Default backend: http://127.0.0.1:8000/v1

Install

go get github.com/ethpandaops/vllm-agent-sdk-go

Configuration

The SDK resolves configuration from explicit options first, then environment variables, then defaults.

Environment Variables

Variable	Description	Default
`VLLM_BASE_URL`	vLLM server base URL	`http://127.0.0.1:8000/v1`
`VLLM_API_KEY`	Bearer auth token (optional, only if your server enforces auth)	(none)
`VLLM_MODEL`	Model name	(none — must be set via env or `WithModel()`)
`VLLM_AGENT_SESSION_STORE_PATH`	Local session store directory	(none)

Example-only variables (not resolved by the core SDK):

Variable	Description	Default
`VLLM_IMAGE_MODEL`	Image-capable model for multimodal examples	`QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ`
`VLLM_VISION_MODEL`	Vision model for multimodal input examples	Falls back to `VLLM_IMAGE_MODEL`, then `VLLM_MODEL`
`VLLM_IMAGE_OUTPUT_DIR`	Directory for saving generated images	(none)

Option Precedence

All settings follow the same resolution order:

1. Explicit option (e.g. WithBaseURL(...), WithAPIKey(...), WithModel(...))
2. Environment variable (VLLM_BASE_URL, VLLM_API_KEY, VLLM_MODEL)
3. Built-in default (where applicable)
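
The resolution order above can be sketched with the standard library alone. This is a minimal illustration, not the SDK's implementation: resolveSetting and its lookup parameter are hypothetical names, and treating an empty string as "not set" is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// resolveSetting is a hypothetical helper (not part of vllmsdk) that applies
// the documented order: explicit option, then environment variable, then default.
// Assumption: an empty string means "not set" at every level.
func resolveSetting(explicit, envKey, fallback string, lookup func(string) (string, bool)) string {
	if explicit != "" {
		return explicit
	}
	if v, ok := lookup(envKey); ok && v != "" {
		return v
	}
	return fallback
}

func main() {
	// No explicit option here, so VLLM_BASE_URL wins if set, else the default.
	baseURL := resolveSetting("", "VLLM_BASE_URL", "http://127.0.0.1:8000/v1", os.LookupEnv)
	fmt.Println("base URL:", baseURL)

	// VLLM_MODEL has no built-in default, so the result may be empty.
	model := resolveSetting("", "VLLM_MODEL", "", os.LookupEnv)
	fmt.Println("model set:", model != "")
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the helper easy to exercise with a fake environment.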

Developer Workflow

The repo ships a sibling-style Makefile:

make test runs race-enabled package tests with coverage output.
make test-integration runs ./integration/... with -tags=integration.
make audit runs the aggregate quality gate.

Integration setup:

Set VLLM_BASE_URL, or leave it unset to use the default http://127.0.0.1:8000/v1.
Set VLLM_MODEL to the model served by your vLLM instance.
Set VLLM_API_KEY if your vLLM server enforces bearer auth.
Integration tests skip when the local vLLM server is unavailable.
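
A reachability probe of the kind an integration suite might run before deciding to skip could look like the sketch below. It is an illustration only: serverReachable is a hypothetical helper, the two-second timeout is an arbitrary choice, and the repo's actual skip logic may differ. It relies on the /models endpoint mentioned under Model Discovery and sends no bearer token, so a server that enforces auth would be reported as unreachable.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// serverReachable is a hypothetical helper (not part of vllmsdk). It reports
// whether GET <baseURL>/models answers 200 OK within two seconds.
func serverReachable(baseURL string) bool {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	url := strings.TrimRight(baseURL, "/") + "/models"
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return false // malformed base URL
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false // connection refused, timeout, DNS failure, ...
	}
	defer resp.Body.Close()

	return resp.StatusCode == http.StatusOK
}

func main() {
	if !serverReachable("http://127.0.0.1:8000/v1") {
		fmt.Println("vLLM not reachable; integration tests would be skipped")
		return
	}
	fmt.Println("vLLM reachable")
}
```

Inside a Go test, the same check would typically guard a call to t.Skip.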

Quick Start

package main

import (
	"context"
	"fmt"
	"time"

	vllmsdk "github.com/ethpandaops/vllm-agent-sdk-go"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	// Model resolved from VLLM_MODEL env var, or set explicitly:
	for msg, err := range vllmsdk.Query(
		ctx,
		vllmsdk.Text("Write a two-line haiku about Go concurrency."),
		// vllmsdk.WithModel("QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ"),
	) {
		if err != nil {
			panic(err)
		}

		if result, ok := msg.(*vllmsdk.ResultMessage); ok && result.Result != nil {
			fmt.Println(*result.Result)
		}
	}
}

Surface

Query(ctx, content, ...opts) and QueryStream(...) return iter.Seq2[Message, error].
NewClient() exposes Start, StartWithContent, StartWithStream, Query, ReceiveMessages, ReceiveResponse, Interrupt, SetPermissionMode, SetModel, ListModels, ListModelsResponse, GetMCPStatus, RewindFiles, and Close.
Unsupported peer-parity controls such as ReconnectMCPServer, ToggleMCPServer, StopTask, and SendToolResult are present on Client and return typed UnsupportedControlErrors.
UserMessageContent is the canonical input shape. Use Text(...) for text-only calls and Blocks(...) with ImageInput(...), FileInput(...), AudioInput(...), or VideoInput(...) for multimodal chat-completions requests.
WithSDKTools(...) registers high-level in-process tools under mcp__sdk__<name>.
WithOnUserInput(...) handles SDK-owned user-input prompts built on top of tool calling.
ListModels(...) and ListModelsResponse(...) use vLLM model discovery via /v1/models.
StatSession(...), ListSessions(...), and GetSessionMessages(...) operate on the SDK's local persisted session store.

Model Discovery

Discovery uses /v1/models.
Returned ModelInfo values are projected from the OpenAI-compatible model cards that vLLM serves, so provider-rich metadata is not guaranteed to be present.
ModelInfo still exposes helper methods such as CostTier(), SupportsToolCalling(), SupportsStructuredOutput(), SupportsReasoning(), SupportsImageInput(), SupportsImageOutput(), SupportsWebSearch(), SupportsPromptCaching(), MaxContextLength(), and parsed pricing helpers.
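
For orientation, an OpenAI-compatible model list is a JSON object with a data array of model cards. The sketch below decodes only the fields it needs. modelCard, modelList, and parseModelIDs are hypothetical names, the sample payload is invented, and the SDK's own ModelInfo projection is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelCard and modelList mirror the general OpenAI-compatible list shape.
// They are hypothetical stand-ins, not the SDK's ModelInfo or ModelListResponse.
type modelCard struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	OwnedBy string `json:"owned_by"`
}

type modelList struct {
	Object string      `json:"object"`
	Data   []modelCard `json:"data"`
}

// parseModelIDs extracts the model IDs from a /v1/models response body.
func parseModelIDs(body []byte) ([]string, error) {
	var list modelList
	if err := json.Unmarshal(body, &list); err != nil {
		return nil, fmt.Errorf("decode model list: %w", err)
	}

	ids := make([]string, 0, len(list.Data))
	for _, m := range list.Data {
		ids = append(ids, m.ID)
	}
	return ids, nil
}

func main() {
	// Invented sample payload for illustration.
	body := []byte(`{"object":"list","data":[{"id":"example-model","object":"model","owned_by":"vllm"}]}`)

	ids, err := parseModelIDs(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids)
}
```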

Image Output

Generated images are surfaced as *ImageBlock values inside AssistantMessage.Content.
ImageBlock.Decode() returns raw bytes plus media type for data-URL-backed images.
ImageBlock.Save(path) writes generated images to disk.
Live image-generation coverage is available behind the integration build tag when VLLM_IMAGE_MODEL is set.
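
To make "raw bytes plus media type" concrete, here is a standard-library sketch of decoding a base64 data URL. decodeDataURL is a hypothetical function, not ImageBlock.Decode itself, and it deliberately handles only the ;base64 form.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// decodeDataURL is a hypothetical helper that splits "data:<media type>;base64,<payload>"
// into decoded bytes and the media type. Non-base64 data URLs are rejected in this sketch.
func decodeDataURL(u string) ([]byte, string, error) {
	rest, ok := strings.CutPrefix(u, "data:")
	if !ok {
		return nil, "", errors.New("not a data URL")
	}

	meta, payload, ok := strings.Cut(rest, ",")
	if !ok {
		return nil, "", errors.New("data URL has no comma separator")
	}

	mediaType, isBase64 := strings.CutSuffix(meta, ";base64")
	if !isBase64 {
		return nil, "", errors.New("only base64 data URLs are handled in this sketch")
	}

	data, err := base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return nil, "", fmt.Errorf("decode payload: %w", err)
	}
	return data, mediaType, nil
}

func main() {
	// "aGVsbG8=" is the base64 encoding of "hello"; a real image payload would be much longer.
	data, mediaType, err := decodeDataURL("data:image/png;base64,aGVsbG8=")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(data), mediaType)
}
```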

Multimodal Input

Multimodal input in this SDK is block-based and targets the vLLM OpenAI-compatible chat surface.

content := vllmsdk.Blocks(
	vllmsdk.TextInput("Compare these two screenshots and the attached spec file."),
	vllmsdk.ImageInput("https://example.com/before.png"),
	vllmsdk.ImageInput("data:image/png;base64,..."),
	vllmsdk.FileInput("spec.pdf", "data:application/pdf;base64,..."),
)

for msg, err := range vllmsdk.Query(ctx, content,
	// vllmsdk.WithModel("QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ"),
) {
	_ = msg
	_ = err
}

ImageInput(...) accepts a normal URL or a base64 data URL.
FileInput(...) accepts a filename plus file_data URL/data URL.
AudioInput(...) accepts base64 audio data plus a format.
VideoInput(...) accepts a normal URL or a data URL.
Responses mode is routed to the vLLM /v1/responses surface when selected.
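
When the image or file lives in memory rather than at a URL, a base64 data URL has to be built before it can be passed to ImageInput(...) or FileInput(...). A minimal sketch, assuming the caller already knows the media type; toDataURL is a hypothetical helper, not an SDK function.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// toDataURL is a hypothetical helper that wraps raw bytes in a base64 data URL
// of the form "data:<media type>;base64,<payload>".
func toDataURL(mediaType string, data []byte) string {
	return "data:" + mediaType + ";base64," + base64.StdEncoding.EncodeToString(data)
}

func main() {
	// Placeholder bytes; in practice these would come from os.ReadFile or similar.
	raw := []byte("hello")
	fmt.Println(toDataURL("image/png", raw))
}
```

The resulting string has the same shape as the "data:image/png;base64,..." argument in the example above.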

Session Semantics

Session APIs are local SDK APIs, not remote vLLM server sessions.

They read from the SDK session store configured with WithSessionStorePath(...) or VLLM_AGENT_SESSION_STORE_PATH.
They do not derive from chat session_id.
They do not derive from Responses previous_response_id.

Unsupported Controls

vLLM does not have meaningful backend equivalents for some sibling control-plane methods. The SDK exposes those methods where peer parity matters, but they fail explicitly with UnsupportedControlError instead of faking semantics.
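
Callers can treat such failures as a distinct, expected case with errors.As or errors.Is. The sketch below uses hypothetical stand-in types: the field name, the pointer receiver, and the Unwrap link to a sentinel are assumptions, so check the real UnsupportedControlError and ErrUnsupportedControl definitions before relying on either form.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupportedControl and unsupportedControlError are stand-ins for the SDK's
// ErrUnsupportedControl and UnsupportedControlError. Their shape here is assumed.
var errUnsupportedControl = errors.New("unsupported control")

type unsupportedControlError struct{ Control string }

func (e *unsupportedControlError) Error() string { return "unsupported control: " + e.Control }
func (e *unsupportedControlError) Unwrap() error { return errUnsupportedControl }

// stopTask imitates a control method that always fails explicitly.
func stopTask(taskID string) error {
	return fmt.Errorf("stop task %q: %w", taskID, &unsupportedControlError{Control: "StopTask"})
}

// unsupportedControl extracts the control name when err wraps the typed error.
func unsupportedControl(err error) (string, bool) {
	var uce *unsupportedControlError
	if errors.As(err, &uce) {
		return uce.Control, true
	}
	return "", false
}

// isUnsupported matches on the sentinel instead of the concrete type.
func isUnsupported(err error) bool { return errors.Is(err, errUnsupportedControl) }

func main() {
	err := stopTask("task-1")
	if name, ok := unsupportedControl(err); ok {
		fmt.Println("skipping unsupported control:", name)
	}
	fmt.Println("sentinel match:", isUnsupported(err))
}
```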

Examples

Runnable examples live under the examples directory of the repository.

Documentation ¶

Overview ¶

Package vllmsdk provides agent ergonomics backed by VLLM.

Index ¶

Constants
Variables
func ErrorResult(message string) *mcp.CallToolResult
func ImageResult(data []byte, mimeType string) *mcp.CallToolResult
func MessagesFromChannel(ch <-chan StreamingMessage) iter.Seq[StreamingMessage]
func MessagesFromContent(content UserMessageContent) iter.Seq[StreamingMessage]
func MessagesFromSlice(msgs []StreamingMessage) iter.Seq[StreamingMessage]
func NewMcpTool(name, description string, inputSchema *jsonschema.Schema) *mcp.Tool
func NopLogger() *slog.Logger
func ParseArguments(req *mcp.CallToolRequest) (map[string]any, error)
func Query(ctx context.Context, content UserMessageContent, opts ...Option) iter.Seq2[Message, error]
func QueryStream(ctx context.Context, messages iter.Seq[StreamingMessage], opts ...Option) iter.Seq2[Message, error]
func SimpleSchema(props map[string]string) *jsonschema.Schema
func SingleMessage(content UserMessageContent) iter.Seq[StreamingMessage]
func TextResult(text string) *mcp.CallToolResult
func Validate(schema, input map[string]any) error
func WithClient(ctx context.Context, fn func(Client) error, opts ...Option) error
type AssistantMessage
type AssistantMessageError
type AsyncHookJSONOutput
type BaseHookInput
type CallToolRequest
type CallToolResult
type ChatRequest
type Client
- func NewClient() Client
type ContentBlock
type Effort
type HookCallback
type HookContext
type HookEvent
type HookInput
type HookJSONOutput
type HookMatcher
type HookSpecificOutput
type ImageBlock
type InputAudioBlock
- func AudioInput(format, data string) *InputAudioBlock
type InputAudioRef
type InputFileBlock
- func FileInput(filename, fileData string) *InputFileBlock
type InputFileRef
type InputImageBlock
- func ImageInput(url string) *InputImageBlock
type InputImageRef
type InputVideoBlock
- func VideoInput(url string) *InputVideoBlock
type InputVideoRef
type MCPHTTPServerConfig
type MCPSSEServerConfig
type MCPSdkServerConfig
- func CreateSdkMcpServer(name, version string, tools ...*SdkMcpTool) *MCPSdkServerConfig
type MCPServerConfig
type MCPServerStatus
type MCPServerType
type MCPStatus
type MCPStdioServerConfig
type McpAudioContent
type McpContent
type McpImageContent
type McpTextContent
type McpTool
type McpToolAnnotations
type McpToolHandler
type Message
- func GetSessionMessages(ctx context.Context, sessionID string, opts ...Option) ([]Message, error)
type MessageParseError
type MessageStream
type Model
type ModelArchitecture
type ModelEndpoint
type ModelInfo
- func ListModels(ctx context.Context, opts ...Option) ([]ModelInfo, error)
type ModelListResponse
- func ListModelsResponse(ctx context.Context, opts ...Option) (*ModelListResponse, error)
type ModelPerRequestLimits
type ModelPricing
type ModelSupportedParameters
type ModelTopProvider
type NotificationHookInput
type NotificationHookSpecificOutput
type Option
- func WithAPIKey(apiKey string) Option
- func WithAllowedTools(tools ...string) Option
- func WithBackground(background bool) Option
- func WithBaseURL(baseURL string) Option
- func WithCanUseTool(callback ToolPermissionCallback) Option
- func WithCwd(cwd string) Option
- func WithDisallowedTools(tools ...string) Option
- func WithEffort(effort config.Effort) Option
- func WithEnableFileCheckpointing(enable bool) Option
- func WithFallbackModel(model string) Option
- func WithForkSession(fork bool) Option
- func WithFrequencyPenalty(v float64) Option
- func WithHTTPReferer(referer string) Option
- func WithHooks(hooks map[HookEvent][]*HookMatcher) Option
- func WithImageConfig(cfg map[string]any) Option
- func WithInclude(include ...string) Option
- func WithIncludePartialMessages(include bool) Option
- func WithInstructions(instructions string) Option
- func WithLogger(logger *slog.Logger) Option
- func WithLogprobs(enable bool) Option
- func WithMCPConfig(config string) Option
- func WithMCPServers(servers map[string]MCPServerConfig) Option
- func WithMaxBudgetUSD(budget float64) Option
- func WithMaxOutputTokens(max int) Option
- func WithMaxTokens(max int) Option
- func WithMaxToolCalls(max int) Option
- func WithMaxToolIterations(max int) Option
- func WithMaxTurns(maxTurns int) Option
- func WithModalities(modalities ...string) Option
- func WithModel(model string) Option
- func WithModels(models ...string) Option
- func WithOnUserInput(callback UserInputCallback) Option
- func WithOutputFormat(format map[string]any) Option
- func WithParallelToolCalls(enable bool) Option
- func WithPermissionMode(mode string) Option
- func WithPermissionPromptToolName(name string) Option
- func WithPlugins(plugins ...*SdkPluginConfig) Option
- func WithPresencePenalty(v float64) Option
- func WithPreviousResponseID(responseID string) Option
- func WithPrompt(prompt map[string]any) Option
- func WithPromptCacheKey(key string) Option
- func WithProvider(provider map[string]any) Option
- func WithReasoning(reasoning map[string]any) Option
- func WithRequestTimeout(timeout time.Duration) Option
- func WithResponseText(text map[string]any) Option
- func WithResume(sessionID string) Option
- func WithRoute(route string) Option
- func WithSDKTools(tools ...Tool) Option
- func WithSafetyIdentifier(safetyID string) Option
- func WithSeed(seed int64) Option
- func WithServiceTier(tier string) Option
- func WithSessionID(sessionID string) Option
- func WithSessionStorePath(path string) Option
- func WithStop(stop ...string) Option
- func WithStore(store bool) Option
- func WithSystemPrompt(prompt string) Option
- func WithSystemPromptPreset(preset *SystemPromptPreset) Option
- func WithTemperature(temperature float64) Option
- func WithThinking(thinking config.ThinkingConfig) Option
- func WithToolChoice(choice any) Option
- func WithTools(tools config.ToolsConfig) Option
- func WithTopK(topK float64) Option
- func WithTopLogprobs(v int) Option
- func WithTopP(topP float64) Option
- func WithTrace(enable bool) Option
- func WithTransport(transport Transport) Option
- func WithTruncation(truncation string) Option
- func WithUser(user string) Option
- func WithVLLMAPIMode(mode config.VLLMAPIMode) Option
- func WithVLLMExtra(extra map[string]any) Option
- func WithVLLMMetadata(metadata map[string]any) Option
- func WithVLLMPlugins(plugins ...map[string]any) Option
- func WithXTitle(title string) Option
type PermissionBehavior
type PermissionMode
type PermissionRequestHookInput
type PermissionRequestHookSpecificOutput
type PermissionResult
type PermissionResultAllow
type PermissionResultDeny
type PermissionRuleValue
type PermissionUpdate
type PermissionUpdateDestination
type PermissionUpdateType
type PostToolUseFailureHookInput
type PostToolUseFailureHookSpecificOutput
type PostToolUseHookInput
type PostToolUseHookSpecificOutput
type PreCompactHookInput
type PreToolUseHookInput
type PreToolUseHookSpecificOutput
type ResultMessage
type Schema
type SchemaBuilder
- func NewSchemaBuilder() *SchemaBuilder
- func (b *SchemaBuilder) Build() map[string]any
- func (b *SchemaBuilder) OptionalProperty(name, goType string) *SchemaBuilder
- func (b *SchemaBuilder) OptionalPropertyWithDescription(name, goType, description string) *SchemaBuilder
- func (b *SchemaBuilder) Property(name, goType string) *SchemaBuilder
- func (b *SchemaBuilder) PropertyWithDescription(name, goType, description string) *SchemaBuilder
type SdkMcpServerInstance
type SdkMcpTool
- func NewSdkMcpTool(name, description string, inputSchema *jsonschema.Schema, ...) *SdkMcpTool
- func (t *SdkMcpTool) Annotations() *mcp.ToolAnnotations
- func (t *SdkMcpTool) Description() string
- func (t *SdkMcpTool) Handler() SdkMcpToolHandler
- func (t *SdkMcpTool) InputSchema() *jsonschema.Schema
- func (t *SdkMcpTool) Name() string
type SdkMcpToolHandler
type SdkMcpToolOption
- func WithAnnotations(annotations *mcp.ToolAnnotations) SdkMcpToolOption
type SdkPluginConfig
type SessionStat
- func ListSessions(ctx context.Context, opts ...Option) ([]SessionStat, error)
- func StatSession(ctx context.Context, sessionID string, opts ...Option) (*SessionStat, error)
type StopHookInput
type StreamEvent
type StreamingMessage
- func NewUserMessage(content UserMessageContent) StreamingMessage
type StreamingMessageContent
type SubagentStartHookInput
type SubagentStartHookSpecificOutput
type SubagentStopHookInput
type SyncHookJSONOutput
type SystemMessage
type SystemPromptPreset
type TextBlock
- func TextInput(text string) *TextBlock
type ThinkingBlock
type ThinkingConfig
type ThinkingConfigAdaptive
type ThinkingConfigDisabled
type ThinkingConfigEnabled
type Tool
- func NewTool(name, description string, schema map[string]any, fn ToolFunc) Tool
type ToolFunc
type ToolPermissionCallback
type ToolPermissionContext
type ToolPermissionDeniedError
type ToolResultBlock
type ToolUseBlock
type ToolsConfig
type ToolsList
type ToolsPreset
type Transport
type UnsupportedControlError
type UnsupportedHookEventError
type UnsupportedHookOutputError
type Usage
type UserInputAnswer
type UserInputCallback
type UserInputQuestion
type UserInputQuestionOption
type UserInputRequest
type UserInputResponse
type UserMessage
type UserMessageContent
- func Blocks(blocks ...ContentBlock) UserMessageContent
- func Text(text string) UserMessageContent
type UserPromptSubmitHookInput
type UserPromptSubmitHookSpecificOutput
type VLLMAPIMode
type VLLMAgentOptions
type VLLMSDKError

Constants ¶

const (
	// VLLMAPIModeChatCompletions uses /chat/completions.
	VLLMAPIModeChatCompletions = config.VLLMAPIModeChatCompletions
	// VLLMAPIModeResponses uses /responses.
	VLLMAPIModeResponses = config.VLLMAPIModeResponses
)

const (
	// EffortLow uses minimal thinking.
	EffortLow = config.EffortLow
	// EffortMedium uses moderate thinking.
	EffortMedium = config.EffortMedium
	// EffortHigh uses deep thinking.
	EffortHigh = config.EffortHigh
	// EffortMax uses maximum thinking depth.
	EffortMax = config.EffortMax
)

const (
	// AssistantMessageErrorAuthFailed indicates authentication failure.
	AssistantMessageErrorAuthFailed = message.AssistantMessageErrorAuthFailed
	// AssistantMessageErrorBilling indicates a billing error.
	AssistantMessageErrorBilling = message.AssistantMessageErrorBilling
	// AssistantMessageErrorRateLimit indicates rate limiting.
	AssistantMessageErrorRateLimit = message.AssistantMessageErrorRateLimit
	// AssistantMessageErrorInvalidReq indicates an invalid request.
	AssistantMessageErrorInvalidReq = message.AssistantMessageErrorInvalidReq
	// AssistantMessageErrorServer indicates a server error.
	AssistantMessageErrorServer = message.AssistantMessageErrorServer
	// AssistantMessageErrorUnknown indicates an unknown error.
	AssistantMessageErrorUnknown = message.AssistantMessageErrorUnknown
)

const (
	// BlockTypeText contains plain text content.
	BlockTypeText = message.BlockTypeText
	// BlockTypeImage contains generated image output content.
	BlockTypeImage = message.BlockTypeImage
	// BlockTypeImageURL contains input image URL or data-url content.
	BlockTypeImageURL = message.BlockTypeImageURL
	// BlockTypeFile contains input file content.
	BlockTypeFile = message.BlockTypeFile
	// BlockTypeInputAudio contains input audio content.
	BlockTypeInputAudio = message.BlockTypeInputAudio
	// BlockTypeVideoURL contains input video URL or data-url content.
	BlockTypeVideoURL = message.BlockTypeVideoURL
	// BlockTypeThinking contains model reasoning content.
	BlockTypeThinking = message.BlockTypeThinking
	// BlockTypeToolUse contains tool-call content.
	BlockTypeToolUse = message.BlockTypeToolUse
	// BlockTypeToolResult contains tool-result content.
	BlockTypeToolResult = message.BlockTypeToolResult
)

const (
	// HookEventPreToolUse is triggered before a tool is used.
	HookEventPreToolUse = hook.EventPreToolUse
	// HookEventPostToolUse is triggered after a tool is used.
	HookEventPostToolUse = hook.EventPostToolUse
	// HookEventUserPromptSubmit is triggered when a user submits a prompt.
	HookEventUserPromptSubmit = hook.EventUserPromptSubmit
	// HookEventStop is triggered when a session stops.
	HookEventStop = hook.EventStop
	// HookEventSubagentStop is triggered when a subagent stops.
	HookEventSubagentStop = hook.EventSubagentStop
	// HookEventPreCompact is triggered before compaction.
	HookEventPreCompact = hook.EventPreCompact
	// HookEventPostToolUseFailure is triggered after a tool use fails.
	HookEventPostToolUseFailure = hook.EventPostToolUseFailure
	// HookEventNotification is triggered when a notification is sent.
	HookEventNotification = hook.EventNotification
	// HookEventSubagentStart is triggered when a subagent starts.
	HookEventSubagentStart = hook.EventSubagentStart
	// HookEventPermissionRequest is triggered when a permission is requested.
	HookEventPermissionRequest = hook.EventPermissionRequest
)

const (
	// PermissionModeDefault uses standard permission prompts.
	PermissionModeDefault = permission.ModeDefault
	// PermissionModeAcceptEdits automatically accepts file edits.
	PermissionModeAcceptEdits = permission.ModeAcceptEdits
	// PermissionModePlan enables plan mode for implementation planning.
	PermissionModePlan = permission.ModePlan
	// PermissionModeBypassPermissions bypasses all permission checks.
	PermissionModeBypassPermissions = permission.ModeBypassPermissions
)

const (
	// PermissionUpdateTypeAddRules adds new permission rules.
	PermissionUpdateTypeAddRules = permission.UpdateTypeAddRules
	// PermissionUpdateTypeReplaceRules replaces existing permission rules.
	PermissionUpdateTypeReplaceRules = permission.UpdateTypeReplaceRules
	// PermissionUpdateTypeRemoveRules removes permission rules.
	PermissionUpdateTypeRemoveRules = permission.UpdateTypeRemoveRules
	// PermissionUpdateTypeSetMode sets the permission mode.
	PermissionUpdateTypeSetMode = permission.UpdateTypeSetMode
	// PermissionUpdateTypeAddDirectories adds accessible directories.
	PermissionUpdateTypeAddDirectories = permission.UpdateTypeAddDirectories
	// PermissionUpdateTypeRemoveDirectories removes accessible directories.
	PermissionUpdateTypeRemoveDirectories = permission.UpdateTypeRemoveDirectories
)

const (
	// PermissionUpdateDestUserSettings stores in user-level settings.
	PermissionUpdateDestUserSettings = permission.UpdateDestUserSettings
	// PermissionUpdateDestProjectSettings stores in project-level settings.
	PermissionUpdateDestProjectSettings = permission.UpdateDestProjectSettings
	// PermissionUpdateDestLocalSettings stores in local-level settings.
	PermissionUpdateDestLocalSettings = permission.UpdateDestLocalSettings
	// PermissionUpdateDestSession stores in the current session only.
	PermissionUpdateDestSession = permission.UpdateDestSession
)

const (
	// PermissionBehaviorAllow automatically allows the operation.
	PermissionBehaviorAllow = permission.BehaviorAllow
	// PermissionBehaviorDeny automatically denies the operation.
	PermissionBehaviorDeny = permission.BehaviorDeny
	// PermissionBehaviorAsk prompts the user for permission.
	PermissionBehaviorAsk = permission.BehaviorAsk
)

const (
	// MCPServerTypeStdio uses stdio for communication.
	MCPServerTypeStdio = mcp.ServerTypeStdio
	// MCPServerTypeSSE uses Server-Sent Events.
	MCPServerTypeSSE = mcp.ServerTypeSSE
	// MCPServerTypeHTTP uses HTTP for communication.
	MCPServerTypeHTTP = mcp.ServerTypeHTTP
	// MCPServerTypeSDK uses the SDK interface.
	MCPServerTypeSDK = mcp.ServerTypeSDK
)

const Version = "0.1.0"

Version is the SDK version.

Variables ¶

var (
	// ErrClientNotConnected indicates the client is not connected.
	ErrClientNotConnected = internalerrors.ErrClientNotConnected

	// ErrClientAlreadyConnected indicates the client is already connected.
	ErrClientAlreadyConnected = internalerrors.ErrClientAlreadyConnected

	// ErrClientClosed indicates the client has been closed and cannot be reused.
	ErrClientClosed = internalerrors.ErrClientClosed

	// ErrTransportNotConnected indicates the transport is not connected.
	ErrTransportNotConnected = internalerrors.ErrTransportNotConnected

	// ErrRequestTimeout indicates a request timed out.
	ErrRequestTimeout = internalerrors.ErrRequestTimeout

	// ErrSessionNotFound indicates a requested local session does not exist.
	ErrSessionNotFound = internalerrors.ErrSessionNotFound

	// ErrUnsupportedFeature indicates an API-compatible feature that is not implemented by this backend.
	ErrUnsupportedFeature = errors.New("unsupported feature in VLLM backend")

	// ErrUnsupportedControl indicates a control-plane operation is not supported by backend.
	ErrUnsupportedControl = controlplane.ErrUnsupportedControl

	// ErrNoCheckpoint indicates rewind was requested without an available checkpoint.
	ErrNoCheckpoint = session.ErrNoCheckpoint
)

Sentinel errors, mostly re-exported from internal packages.

var NewUserMessageContent = message.NewUserMessageContent

NewUserMessageContent creates UserMessageContent from a string.

var NewUserMessageContentBlocks = message.NewUserMessageContentBlocks

NewUserMessageContentBlocks creates UserMessageContent from blocks.

Functions ¶

func ErrorResult ¶

func ErrorResult(message string) *mcp.CallToolResult

ErrorResult creates a CallToolResult indicating an error.

func ImageResult ¶

func ImageResult(data []byte, mimeType string) *mcp.CallToolResult

ImageResult creates a CallToolResult with image content.

func MessagesFromChannel ¶

func MessagesFromChannel(ch <-chan StreamingMessage) iter.Seq[StreamingMessage]

MessagesFromChannel creates a MessageStream from a channel. This is useful for dynamic message generation where messages are produced over time. The iterator completes when the channel is closed.

func MessagesFromContent ¶

func MessagesFromContent(content UserMessageContent) iter.Seq[StreamingMessage]

MessagesFromContent creates a single-message stream from user content.

func MessagesFromSlice ¶

func MessagesFromSlice(msgs []StreamingMessage) iter.Seq[StreamingMessage]

MessagesFromSlice creates a MessageStream from a slice of StreamingMessages. This is useful for sending a fixed set of messages in streaming mode.

func NewMcpTool ¶

func NewMcpTool(name, description string, inputSchema *jsonschema.Schema) *mcp.Tool

NewMcpTool creates an mcp.Tool with the given parameters. This is useful when you need direct access to the MCP Tool type.

func NopLogger ¶

func NopLogger() *slog.Logger

NopLogger returns a logger that discards all output. Use this when you want silent operation with no logging overhead.

func ParseArguments ¶

func ParseArguments(req *mcp.CallToolRequest) (map[string]any, error)

ParseArguments unmarshals CallToolRequest arguments into a map. This is a convenience function for extracting tool input.

func Query ¶

func Query(
	ctx context.Context,
	content UserMessageContent,
	opts ...Option,
) iter.Seq2[Message, error]

Query executes a one-shot query and returns a message iterator.

func QueryStream ¶

func QueryStream(
	ctx context.Context,
	messages iter.Seq[StreamingMessage],
	opts ...Option,
) iter.Seq2[Message, error]

QueryStream executes a one-shot query from a stream of user messages.

func SimpleSchema ¶

func SimpleSchema(props map[string]string) *jsonschema.Schema

SimpleSchema creates a jsonschema.Schema from a simple type map.

Input format: {"a": "float64", "b": "string"}

Type mappings:

"string" → {"type": "string"}
"int", "int64" → {"type": "integer"}
"float64", "float" → {"type": "number"}
"bool" → {"type": "boolean"}
"[]string" → {"type": "array", "items": {"type": "string"}}
"any", "object" → {"type": "object"}

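The type mappings above can be expressed as a small lookup that produces plain JSON Schema maps. This sketch uses map[string]any instead of *jsonschema.Schema so that it stays standard-library only. schemaForType and objectSchema are hypothetical names, and rejecting unmapped types is an assumption, since the documentation does not say how SimpleSchema treats them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schemaForType is a hypothetical helper that maps a Go type name to a JSON
// Schema fragment, following the documented table. Unmapped names return false.
func schemaForType(goType string) (map[string]any, bool) {
	switch goType {
	case "string":
		return map[string]any{"type": "string"}, true
	case "int", "int64":
		return map[string]any{"type": "integer"}, true
	case "float64", "float":
		return map[string]any{"type": "number"}, true
	case "bool":
		return map[string]any{"type": "boolean"}, true
	case "[]string":
		return map[string]any{"type": "array", "items": map[string]any{"type": "string"}}, true
	case "any", "object":
		return map[string]any{"type": "object"}, true
	}
	return nil, false
}

// objectSchema builds an object schema from a name-to-type map such as
// {"a": "float64", "b": "string"}.
func objectSchema(props map[string]string) (map[string]any, error) {
	properties := make(map[string]any, len(props))
	for name, goType := range props {
		s, ok := schemaForType(goType)
		if !ok {
			return nil, fmt.Errorf("property %q: unmapped type %q", name, goType)
		}
		properties[name] = s
	}
	return map[string]any{"type": "object", "properties": properties}, nil
}

func main() {
	schema, err := objectSchema(map[string]string{"a": "float64", "b": "string"})
	if err != nil {
		panic(err)
	}
	out, _ := json.MarshalIndent(schema, "", "  ")
	fmt.Println(string(out))
}
```
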
func SingleMessage ¶

func SingleMessage(content UserMessageContent) iter.Seq[StreamingMessage]

SingleMessage creates a MessageStream with a single user message.

func TextResult ¶

func TextResult(text string) *mcp.CallToolResult

TextResult creates a CallToolResult with text content.

func Validate ¶

func Validate(schema, input map[string]any) error

Validate checks if the input matches the schema requirements. Returns an error if required fields are missing.
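
A required-field check of this kind might look like the sketch below. It assumes the schema lists required names under a "required" key, as JSON Schema does. validateRequired and requiredFields are hypothetical names, and the SDK's Validate may check more than this.

```go
package main

import "fmt"

// requiredFields reads the "required" list from a schema map. It accepts both
// []string (built in Go code) and []any (the shape encoding/json produces).
func requiredFields(schema map[string]any) []string {
	switch req := schema["required"].(type) {
	case []string:
		return req
	case []any:
		out := make([]string, 0, len(req))
		for _, v := range req {
			if s, ok := v.(string); ok {
				out = append(out, s)
			}
		}
		return out
	}
	return nil
}

// validateRequired is a hypothetical stand-in that returns an error naming the
// first required field absent from input.
func validateRequired(schema, input map[string]any) error {
	for _, name := range requiredFields(schema) {
		if _, ok := input[name]; !ok {
			return fmt.Errorf("missing required field %q", name)
		}
	}
	return nil
}

func main() {
	schema := map[string]any{
		"type":       "object",
		"properties": map[string]any{"city": map[string]any{"type": "string"}},
		"required":   []string{"city"},
	}

	fmt.Println(validateRequired(schema, map[string]any{"city": "Oslo"}))
	fmt.Println(validateRequired(schema, map[string]any{}))
}
```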

func WithClient ¶

func WithClient(ctx context.Context, fn func(Client) error, opts ...Option) error

WithClient manages client lifecycle with automatic cleanup.

This helper creates a client, starts it with the provided options, executes the callback function, and ensures proper cleanup via Close() when done.

The callback receives a fully initialized Client that is ready for use. If the callback returns an error, it is returned to the caller. If Close() fails, a warning is logged but does not override the callback's error.

Example usage:

// logger is a *slog.Logger supplied by the caller.
err := vllmsdk.WithClient(ctx, func(c vllmsdk.Client) error {
    if err := c.Query(ctx, vllmsdk.Text("Hello")); err != nil {
        return err
    }
    for msg, err := range c.ReceiveResponse(ctx) {
        if err != nil {
            return err
        }
        _ = msg // process message...
    }
    return nil
},
    vllmsdk.WithLogger(logger),
    vllmsdk.WithPermissionMode("acceptEdits"),
)
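
The lifecycle described above (start, run the callback, always close, and never let a Close() failure mask the callback's error) can be sketched independently of the SDK. lifecycle, withLifecycle, and fakeRuntime are hypothetical names, and context handling is omitted to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
)

// lifecycle is a hypothetical stand-in for the parts of Client that matter here.
type lifecycle interface {
	Start() error
	Close() error
}

// withLifecycle starts rt, runs fn, and always closes rt afterwards.
// A Close failure is only logged, so the callback's error is what the caller sees.
func withLifecycle(rt lifecycle, fn func(lifecycle) error) error {
	if err := rt.Start(); err != nil {
		return fmt.Errorf("start: %w", err)
	}
	defer func() {
		if err := rt.Close(); err != nil {
			slog.Warn("close failed", "error", err)
		}
	}()
	return fn(rt)
}

// fakeRuntime records whether Close was called and can simulate a Close failure.
type fakeRuntime struct {
	closed   bool
	closeErr error
}

func (f *fakeRuntime) Start() error { return nil }
func (f *fakeRuntime) Close() error { f.closed = true; return f.closeErr }

var errCallback = errors.New("callback failed")

func main() {
	rt := &fakeRuntime{closeErr: errors.New("close failed")}
	err := withLifecycle(rt, func(lifecycle) error { return errCallback })

	// The callback's error survives even though Close also failed.
	fmt.Println(err == errCallback, rt.closed)
}
```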

Types ¶

type AssistantMessage ¶

type AssistantMessage = message.AssistantMessage

AssistantMessage represents a message from the model.

type AssistantMessageError ¶

type AssistantMessageError = message.AssistantMessageError

AssistantMessageError represents error types from the assistant.

type AsyncHookJSONOutput ¶

type AsyncHookJSONOutput = hook.AsyncJSONOutput

AsyncHookJSONOutput represents an async hook output.

type BaseHookInput ¶

type BaseHookInput = hook.BaseInput

BaseHookInput contains common fields for all hook inputs.

type CallToolRequest ¶

type CallToolRequest = mcp.CallToolRequest

CallToolRequest is the request passed to tool handlers.

type CallToolResult ¶

type CallToolResult = mcp.CallToolResult

CallToolResult is the server's response to a tool call. Use TextResult, ErrorResult, or ImageResult helpers to create results.

type ChatRequest ¶

type ChatRequest = config.ChatRequest

ChatRequest is the normalized VLLM request sent through a Transport.

type Client ¶

type Client interface {
	// Start initializes the client runtime.
	// Must be called before any other methods.
	// Returns a transport/runtime error on failure.
	Start(ctx context.Context, opts ...Option) error

	// StartWithContent initializes the runtime and immediately sends initial user content.
	// Equivalent to calling Start() followed by Query(ctx, content).
	// The content is sent to the "default" session.
	// Returns a transport/runtime error on failure.
	StartWithContent(ctx context.Context, content UserMessageContent, opts ...Option) error

	// StartWithStream initializes the runtime and consumes the provided input iterator
	// as the initial message stream for the active session.
	// The iterator is consumed in a separate goroutine; use context cancellation to abort.
	// Returns a transport/runtime error on failure.
	StartWithStream(ctx context.Context, messages iter.Seq[StreamingMessage], opts ...Option) error

	// Query sends user content to the active session.
	// Returns immediately after sending; use ReceiveMessages() or ReceiveResponse() to get responses.
	// Optional sessionID defaults to "default" for multi-session support.
	Query(ctx context.Context, content UserMessageContent, sessionID ...string) error

	// ReceiveMessages returns an iterator that yields messages indefinitely.
	// Messages are yielded as they arrive until EOF, an error occurs, or context is cancelled.
	// Unlike ReceiveResponse, this iterator does not stop at ResultMessage.
	// Use iter.Pull2 if you need pull-based iteration instead of range.
	ReceiveMessages(ctx context.Context) iter.Seq2[Message, error]

	// ReceiveResponse returns an iterator that yields messages until a ResultMessage is received.
	// Messages are yielded as they arrive for streaming consumption.
	// The iterator stops after yielding the ResultMessage.
	// Use iter.Pull2 if you need pull-based iteration instead of range.
	// To collect all messages into a slice, use a simple loop; slices.Collect accepts
	// iter.Seq, not iter.Seq2.
	ReceiveResponse(ctx context.Context) iter.Seq2[Message, error]

	// Interrupt cancels the current in-flight request.
	Interrupt(ctx context.Context) error

	// SetPermissionMode changes the permission mode during conversation.
	// Valid modes: "default", "acceptEdits", "plan", "bypassPermissions"
	SetPermissionMode(ctx context.Context, mode string) error

	// SetModel changes the AI model during conversation.
	// Pass nil to use the default model.
	SetModel(ctx context.Context, model *string) error

	// ListModels returns the available VLLM models using the current client options.
	ListModels(ctx context.Context) ([]ModelInfo, error)

	// ListModelsResponse returns the full VLLM model discovery payload.
	ListModelsResponse(ctx context.Context) (*ModelListResponse, error)

	// GetServerInfo returns runtime metadata for the active client session.
	// Returns nil when the client is not connected.
	GetServerInfo() map[string]any

	// GetMCPStatus returns MCP server connection status for the current runtime.
	// Returns the status of all configured MCP servers.
	GetMCPStatus(ctx context.Context) (*MCPStatus, error)

	// ReconnectMCPServer reconnects a disconnected or failed MCP server.
	ReconnectMCPServer(ctx context.Context, serverName string) error

	// ToggleMCPServer enables or disables an MCP server.
	ToggleMCPServer(ctx context.Context, serverName string, enabled bool) error

	// StopTask stops a running task by task ID.
	StopTask(ctx context.Context, taskID string) error

	// RewindFiles rewinds tracked files to their state at a specific user message.
	// The userMessageID should be the ID of a previous user message in the conversation.
	// Requires EnableFileCheckpointing=true in VLLMAgentOptions.
	RewindFiles(ctx context.Context, userMessageID string) error

	// SendToolResult sends a tool result for a pending tool call.
	SendToolResult(ctx context.Context, toolUseID, content string, isError bool) error

	// Close terminates the session and cleans up resources.
	// After Close(), the client cannot be reused. Safe to call multiple times.
	Close() error
}

Client provides an interactive, stateful interface for multi-turn VLLM conversations.

Unlike the one-shot Query() function, Client maintains session state across multiple exchanges. It supports interruption, tool loops, and local session state.

Lifecycle: clients are single-use. After Close(), create a new client with NewClient().

Example usage:

client := NewClient()
defer func() { _ = client.Close() }()

err := client.Start(ctx,
    WithLogger(slog.Default()),
    WithPermissionMode("acceptEdits"),
)
if err != nil {
    log.Fatal(err)
}

// Send a query
err = client.Query(ctx, NewUserMessageContent("What is 2+2?"))
if err != nil {
    log.Fatal(err)
}

// Receive all messages for this response (stops at ResultMessage)
for msg, err := range client.ReceiveResponse(ctx) {
    if err != nil {
        log.Fatal(err)
    }
    // Process message...
}

// Or receive messages indefinitely (for continuous streaming)
for msg, err := range client.ReceiveMessages(ctx) {
    if err != nil {
        break
    }
    // Process message...
}

func NewClient ¶

func NewClient() Client

NewClient creates a new interactive client.

Call Start() with options to begin a session:

client := NewClient()
err := client.Start(ctx,
    WithLogger(slog.Default()),
    WithPermissionMode("acceptEdits"),
)

type ContentBlock ¶

type ContentBlock = message.ContentBlock

ContentBlock represents a block of content within a message.

type Effort ¶

type Effort = config.Effort

Effort controls thinking depth.

type HookCallback ¶

type HookCallback = hook.Callback

HookCallback is the function signature for hook callbacks.

type HookContext ¶

type HookContext = hook.Context

HookContext provides context for hook execution.

type HookEvent ¶

type HookEvent = hook.Event

HookEvent represents the type of event that triggers a hook.

type HookInput ¶

type HookInput = hook.Input

HookInput is the interface for all hook input types.

type HookJSONOutput ¶

type HookJSONOutput = hook.JSONOutput

HookJSONOutput is the interface for hook output types.

type HookMatcher ¶

type HookMatcher = hook.Matcher

HookMatcher configures which tools/events a hook applies to.

type HookSpecificOutput ¶

type HookSpecificOutput = hook.SpecificOutput

HookSpecificOutput is the interface for hook-specific outputs.

type ImageBlock ¶

type ImageBlock = message.ImageBlock

ImageBlock contains a generated image reference.

type InputAudioBlock ¶

type InputAudioBlock = message.InputAudioBlock

InputAudioBlock contains an input audio payload for multimodal prompts.

func AudioInput ¶

func AudioInput(format, data string) *InputAudioBlock

AudioInput creates an input audio block from base64 audio plus its format.

type InputAudioRef ¶

type InputAudioRef = message.InputAudioRef

InputAudioRef contains base64-encoded input audio plus its format.

type InputFileBlock ¶

type InputFileBlock = message.InputFileBlock

InputFileBlock contains an input file reference for multimodal prompts.

func FileInput ¶

func FileInput(filename, fileData string) *InputFileBlock

FileInput creates an input file block from a URL or data URL.

type InputFileRef ¶

type InputFileRef = message.InputFileRef

InputFileRef identifies an input file URL or data URL.

type InputImageBlock ¶

type InputImageBlock = message.InputImageBlock

InputImageBlock contains an input image reference for multimodal prompts.

func ImageInput ¶

func ImageInput(url string) *InputImageBlock

ImageInput creates an input image block from a URL or data URL.

type InputImageRef ¶

type InputImageRef = message.InputImageRef

InputImageRef identifies an input image URL or data URL.

type InputVideoBlock ¶

type InputVideoBlock = message.InputVideoBlock

InputVideoBlock contains an input video reference for multimodal prompts.

func VideoInput ¶

func VideoInput(url string) *InputVideoBlock

VideoInput creates an input video block from a URL or data URL.

type InputVideoRef ¶

type InputVideoRef = message.InputVideoRef

InputVideoRef identifies an input video URL or data URL.

type MCPHTTPServerConfig ¶

type MCPHTTPServerConfig = mcp.HTTPServerConfig

MCPHTTPServerConfig configures an HTTP-based MCP server.

type MCPSSEServerConfig ¶

type MCPSSEServerConfig = mcp.SSEServerConfig

MCPSSEServerConfig configures a Server-Sent Events MCP server.

type MCPSdkServerConfig ¶

type MCPSdkServerConfig = mcp.SdkServerConfig

MCPSdkServerConfig configures an SDK-provided MCP server.

func CreateSdkMcpServer ¶

func CreateSdkMcpServer(name, version string, tools ...*SdkMcpTool) *MCPSdkServerConfig

CreateSdkMcpServer creates an in-process MCP server configuration with SdkMcpTool tools.

This function creates an MCP server that runs inside your application's process, so tool calls avoid the subprocess and transport overhead of external MCP servers.

The returned config can be used directly in VLLMAgentOptions.MCPServers:

addTool := vllmsdk.NewSdkMcpTool("add", "Add two numbers",
    vllmsdk.SimpleSchema(map[string]string{"a": "float64", "b": "float64"}),
    func(ctx context.Context, req *vllmsdk.CallToolRequest) (*vllmsdk.CallToolResult, error) {
        args, _ := vllmsdk.ParseArguments(req)
        a, b := args["a"].(float64), args["b"].(float64)
        return vllmsdk.TextResult(fmt.Sprintf("Result: %v", a+b)), nil
    },
)

calculator := vllmsdk.CreateSdkMcpServer("calculator", "1.0.0", addTool)

options := &vllmsdk.VLLMAgentOptions{
    MCPServers: map[string]vllmsdk.MCPServerConfig{
        "calculator": calculator,
    },
    AllowedTools: []string{"mcp__calculator__add"},
}

Parameters:

name: Server name (also used as key in MCPServers map, determines tool naming: mcp__<name>__<toolName>)
version: Server version string
tools: SdkMcpTool instances to register with the server
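The naming rule in the `name` parameter (`mcp__<name>__<toolName>`) is what ties a server key to the entries in AllowedTools. A small sketch of building and splitting such names follows; `qualifiedToolName` and `splitToolName` are hypothetical helpers written for illustration, and the split assumes server names never contain a double underscore.

```go
package main

import (
	"fmt"
	"strings"
)

const mcpPrefix = "mcp__"

// qualifiedToolName builds the documented form: mcp__<server>__<tool>.
func qualifiedToolName(server, tool string) string {
	return mcpPrefix + server + "__" + tool
}

// splitToolName reverses qualifiedToolName.
// Assumption: the server name itself contains no "__".
func splitToolName(name string) (server, tool string, ok bool) {
	rest, found := strings.CutPrefix(name, mcpPrefix)
	if !found {
		return "", "", false
	}
	server, tool, ok = strings.Cut(rest, "__")
	if !ok || server == "" || tool == "" {
		return "", "", false
	}
	return server, tool, true
}

func main() {
	name := qualifiedToolName("calculator", "add")
	fmt.Println(name)

	server, tool, ok := splitToolName(name)
	fmt.Println(server, tool, ok)
}
```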

type MCPServerConfig ¶

type MCPServerConfig = mcp.ServerConfig

MCPServerConfig is the interface for MCP server configurations.

type MCPServerStatus ¶

type MCPServerStatus = mcp.ServerStatus

MCPServerStatus represents the connection status of a single MCP server.

type MCPServerType ¶

type MCPServerType = mcp.ServerType

MCPServerType represents the type of MCP server.

type MCPStatus ¶

type MCPStatus = mcp.Status

MCPStatus represents the connection status of all configured MCP servers.

type MCPStdioServerConfig ¶

type MCPStdioServerConfig = mcp.StdioServerConfig

MCPStdioServerConfig configures a stdio-based MCP server.

type McpAudioContent ¶

type McpAudioContent = mcp.AudioContent

McpAudioContent represents audio content in a tool result.

type McpContent ¶

type McpContent = mcp.Content

McpContent is the interface for content types in tool results.

type McpImageContent ¶

type McpImageContent = mcp.ImageContent

McpImageContent represents image content in a tool result.

type McpTextContent ¶

type McpTextContent = mcp.TextContent

McpTextContent represents text content in a tool result.

type McpTool ¶

type McpTool = mcp.Tool

McpTool represents an MCP tool definition from the official SDK.

type McpToolAnnotations ¶

type McpToolAnnotations = mcp.ToolAnnotations

McpToolAnnotations describes optional hints about tool behavior. Fields include ReadOnlyHint, DestructiveHint, IdempotentHint, OpenWorldHint, and Title.

type McpToolHandler ¶

type McpToolHandler = mcp.ToolHandler

McpToolHandler is the function signature for low-level tool handlers.

type Message ¶

type Message = message.Message

Message represents any message in the conversation.

func GetSessionMessages ¶

func GetSessionMessages(ctx context.Context, sessionID string, opts ...Option) ([]Message, error)

GetSessionMessages returns persisted local session messages.

type MessageParseError ¶

type MessageParseError = internalerrors.MessageParseError

MessageParseError indicates message parsing failed.

type MessageStream ¶

type MessageStream = iter.Seq[StreamingMessage]

MessageStream is an iterator that yields streaming messages.

type Model ¶

type Model = model.Model

Model is a stable provider-neutral projection of a discovered model.

type ModelArchitecture ¶

type ModelArchitecture = model.Architecture

ModelArchitecture describes the underlying model family where available.

type ModelEndpoint ¶

type ModelEndpoint = model.Endpoint

ModelEndpoint identifies an endpoint a model supports.

type ModelInfo ¶

type ModelInfo = model.Info

ModelInfo describes a model available from VLLM discovery endpoints.

func ListModels ¶

func ListModels(ctx context.Context, opts ...Option) ([]ModelInfo, error)

ListModels returns the available vLLM-served models.

type ModelListResponse ¶

type ModelListResponse = model.ListResponse

ModelListResponse contains the full model discovery payload.

func ListModelsResponse ¶

func ListModelsResponse(ctx context.Context, opts ...Option) (*ModelListResponse, error)

ListModelsResponse returns the full vLLM model discovery payload.

type ModelPerRequestLimits ¶

type ModelPerRequestLimits = model.PerRequestLimits

ModelPerRequestLimits captures request limits where available.

type ModelPricing ¶

type ModelPricing = model.Pricing

ModelPricing contains VLLM pricing fields as returned by discovery endpoints.

type ModelSupportedParameters ¶

type ModelSupportedParameters = model.SupportedParameters

ModelSupportedParameters records provider-supported request parameters.

type ModelTopProvider ¶

type ModelTopProvider = model.TopProvider

ModelTopProvider describes provider-side limits and moderation behavior.

type NotificationHookInput ¶

type NotificationHookInput = hook.NotificationInput

NotificationHookInput is the input for Notification hooks.

type NotificationHookSpecificOutput ¶

type NotificationHookSpecificOutput = hook.NotificationSpecificOutput

NotificationHookSpecificOutput is the hook-specific output for Notification.

type Option ¶

type Option func(*VLLMAgentOptions)

Option configures VLLMAgentOptions using the functional options pattern. This is the primary option type for configuring clients and queries.
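For readers new to the functional options pattern, here is a minimal sketch of how such options can combine with the documented precedence (explicit option, then environment variable, then built-in default). The `agentOptions` struct, the lowercase `withBaseURL` and `withModel` helpers, and `resolve` are hypothetical and only mirror the shape of the real API; the SDK's internal resolution logic may differ.

```go
package main

import "fmt"

// agentOptions is a hypothetical stand-in for VLLMAgentOptions with two fields.
type agentOptions struct {
	BaseURL string
	Model   string
}

// option mirrors the documented shape: a function that mutates the options struct.
type option func(*agentOptions)

func withBaseURL(u string) option { return func(o *agentOptions) { o.BaseURL = u } }
func withModel(m string) option   { return func(o *agentOptions) { o.Model = m } }

const defaultBaseURL = "http://127.0.0.1:8000/v1"

// resolve applies explicit options first, then falls back to the environment
// (via getenv, injectable for tests), then to built-in defaults.
func resolve(getenv func(string) string, opts ...option) agentOptions {
	var o agentOptions
	for _, apply := range opts {
		apply(&o)
	}
	if o.BaseURL == "" {
		o.BaseURL = getenv("VLLM_BASE_URL")
	}
	if o.BaseURL == "" {
		o.BaseURL = defaultBaseURL
	}
	if o.Model == "" {
		o.Model = getenv("VLLM_MODEL") // no default: may stay empty
	}
	return o
}

func main() {
	env := map[string]string{"VLLM_MODEL": "env-model"}
	getenv := func(k string) string { return env[k] }

	o := resolve(getenv, withModel("explicit-model"))
	fmt.Println(o.BaseURL, o.Model)
}
```

In a real program `os.Getenv` would be passed as `getenv`; injecting it keeps the precedence logic testable without touching the process environment.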

func WithAPIKey ¶

func WithAPIKey(apiKey string) Option

WithAPIKey sets the VLLM API key directly.

func WithAllowedTools ¶

func WithAllowedTools(tools ...string) Option

WithAllowedTools sets pre-approved tools that can be used without prompting.

func WithBackground ¶

func WithBackground(background bool) Option

WithBackground sets responses.background.

func WithBaseURL ¶

func WithBaseURL(baseURL string) Option

WithBaseURL overrides the VLLM base URL.

func WithCanUseTool ¶

func WithCanUseTool(callback ToolPermissionCallback) Option

WithCanUseTool sets a callback for permission checking before each tool use.

func WithCwd ¶

func WithCwd(cwd string) Option

WithCwd sets the working directory used by local session and tool features.

func WithDisallowedTools ¶

func WithDisallowedTools(tools ...string) Option

WithDisallowedTools sets tools that are explicitly blocked.

func WithEffort ¶

func WithEffort(effort config.Effort) Option

WithEffort sets the thinking effort level.

func WithEnableFileCheckpointing ¶

func WithEnableFileCheckpointing(enable bool) Option

WithEnableFileCheckpointing enables file change tracking and rewinding.

func WithFallbackModel ¶

func WithFallbackModel(model string) Option

WithFallbackModel specifies a model to use if the primary model fails.

func WithForkSession ¶

func WithForkSession(fork bool) Option

WithForkSession indicates whether to fork the resumed session to a new ID.

func WithFrequencyPenalty ¶

func WithFrequencyPenalty(v float64) Option

WithFrequencyPenalty sets frequency penalty.

func WithHTTPReferer ¶

func WithHTTPReferer(referer string) Option

WithHTTPReferer sets the HTTP-Referer header for VLLM requests.

func WithHooks ¶

func WithHooks(hooks map[HookEvent][]*HookMatcher) Option

WithHooks configures event hooks for tool interception.

func WithImageConfig ¶

func WithImageConfig(cfg map[string]any) Option

WithImageConfig sets provider-specific image config.

func WithInclude ¶

func WithInclude(include ...string) Option

WithInclude sets responses.include entries.

func WithIncludePartialMessages ¶

func WithIncludePartialMessages(include bool) Option

WithIncludePartialMessages enables streaming of partial message updates.

func WithInstructions ¶

func WithInstructions(instructions string) Option

WithInstructions sets responses.instructions.

func WithLogger ¶

func WithLogger(logger *slog.Logger) Option

WithLogger sets the logger for debug output. If not set, logging is disabled (silent operation).

func WithLogprobs ¶

func WithLogprobs(enable bool) Option

WithLogprobs enables token log probabilities where supported.

func WithMCPConfig ¶

func WithMCPConfig(config string) Option

WithMCPConfig sets a path to an MCP config file or a raw JSON string. If set, this takes precedence over WithMCPServers.

func WithMCPServers ¶

func WithMCPServers(servers map[string]MCPServerConfig) Option

WithMCPServers configures external MCP servers to connect to. Map key is the server name, value is the server configuration.

func WithMaxBudgetUSD ¶

func WithMaxBudgetUSD(budget float64) Option

WithMaxBudgetUSD sets a cost limit for the session in USD.

func WithMaxOutputTokens ¶

func WithMaxOutputTokens(max int) Option

WithMaxOutputTokens sets responses.max_output_tokens.

func WithMaxTokens ¶

func WithMaxTokens(max int) Option

WithMaxTokens sets chat max_tokens.

func WithMaxToolCalls ¶

func WithMaxToolCalls(max int) Option

WithMaxToolCalls sets responses.max_tool_calls.

func WithMaxToolIterations ¶

func WithMaxToolIterations(max int) Option

WithMaxToolIterations sets maximum tool-call loops per query.

func WithMaxTurns ¶

func WithMaxTurns(maxTurns int) Option

WithMaxTurns limits the maximum number of conversation turns.

func WithModalities ¶

func WithModalities(modalities ...string) Option

WithModalities sets output modalities.

func WithModel ¶

func WithModel(model string) Option

WithModel specifies which VLLM model to use.

func WithModels ¶

func WithModels(models ...string) Option

WithModels sets candidate fallback models for VLLM routing.

func WithOnUserInput ¶

func WithOnUserInput(callback UserInputCallback) Option

WithOnUserInput sets a callback for handling SDK user-input tool prompts.

func WithOutputFormat ¶

func WithOutputFormat(format map[string]any) Option

WithOutputFormat specifies a JSON schema for structured output.

The canonical format uses a wrapper object:

vllmsdk.WithOutputFormat(map[string]any{
    "type": "json_schema",
    "schema": map[string]any{
        "type":       "object",
        "properties": map[string]any{...},
        "required":   []string{...},
    },
})

Raw JSON schemas (without the wrapper) are also accepted and auto-wrapped:

vllmsdk.WithOutputFormat(map[string]any{
    "type":       "object",
    "properties": map[string]any{...},
    "required":   []string{...},
})

Structured output is available on ResultMessage.StructuredOutput (parsed) or ResultMessage.Result (JSON string).
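The auto-wrapping behavior can be pictured with a short sketch. `normalizeOutputFormat` is a hypothetical function, and the detection rule it uses (a top-level `"type"` of `"json_schema"` means the map is already wrapped) is an assumption; the SDK may decide differently.

```go
package main

import "fmt"

// normalizeOutputFormat returns the wrapped form.
// Assumption: a map whose "type" is "json_schema" is already wrapped;
// anything else is treated as a raw schema and placed under "schema".
func normalizeOutputFormat(format map[string]any) map[string]any {
	if t, _ := format["type"].(string); t == "json_schema" {
		return format
	}
	return map[string]any{"type": "json_schema", "schema": format}
}

func main() {
	raw := map[string]any{
		"type":       "object",
		"properties": map[string]any{"answer": map[string]any{"type": "string"}},
		"required":   []string{"answer"},
	}

	wrapped := normalizeOutputFormat(raw)
	fmt.Println(wrapped["type"])

	// Normalizing twice must not wrap twice.
	again := normalizeOutputFormat(wrapped)
	fmt.Println(again["type"], len(again))
}
```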

func WithParallelToolCalls ¶

func WithParallelToolCalls(enable bool) Option

WithParallelToolCalls sets parallel_tool_calls.

func WithPermissionMode ¶

func WithPermissionMode(mode string) Option

WithPermissionMode controls how permissions are handled. Valid values: "default", "acceptEdits", "plan", "bypassPermissions".
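Because the mode is a plain string, a typo is not caught at compile time. One way to guard against that before calling WithPermissionMode is sketched below; `checkPermissionMode` is a hypothetical helper, and treating the comparison as case-sensitive is an assumption.

```go
package main

import (
	"fmt"
	"slices"
)

// validPermissionModes lists the values named in the documentation.
var validPermissionModes = []string{"default", "acceptEdits", "plan", "bypassPermissions"}

// checkPermissionMode rejects unknown modes.
// Assumption: matching is case-sensitive.
func checkPermissionMode(mode string) error {
	if !slices.Contains(validPermissionModes, mode) {
		return fmt.Errorf("unknown permission mode %q (valid: %v)", mode, validPermissionModes)
	}
	return nil
}

func main() {
	fmt.Println(checkPermissionMode("plan"))
	fmt.Println(checkPermissionMode("acceptedits"))
}
```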

func WithPermissionPromptToolName ¶

func WithPermissionPromptToolName(name string) Option

WithPermissionPromptToolName specifies the tool name to use for permission prompts.

func WithPlugins ¶

func WithPlugins(plugins ...*SdkPluginConfig) Option

WithPlugins configures plugins to load.

func WithPresencePenalty ¶

func WithPresencePenalty(v float64) Option

WithPresencePenalty sets presence penalty.

func WithPreviousResponseID ¶

func WithPreviousResponseID(responseID string) Option

WithPreviousResponseID sets responses.previous_response_id.

func WithPrompt ¶

func WithPrompt(prompt map[string]any) Option

WithPrompt sets responses.prompt payload.

func WithPromptCacheKey ¶

func WithPromptCacheKey(key string) Option

WithPromptCacheKey sets responses.prompt_cache_key.

func WithProvider ¶

func WithProvider(provider map[string]any) Option

WithProvider sets VLLM provider preferences.

func WithReasoning ¶

func WithReasoning(reasoning map[string]any) Option

WithReasoning sets VLLM reasoning configuration.

func WithRequestTimeout ¶

func WithRequestTimeout(timeout time.Duration) Option

WithRequestTimeout sets HTTP request timeout for VLLM calls.

func WithResponseText ¶

func WithResponseText(text map[string]any) Option

WithResponseText sets responses.text payload directly.

func WithResume ¶

func WithResume(sessionID string) Option

WithResume sets a session ID to resume from.

func WithRoute ¶

func WithRoute(route string) Option

WithRoute sets VLLM route preference.

func WithSDKTools ¶

func WithSDKTools(tools ...Tool) Option

WithSDKTools registers high-level Tool instances as an in-process MCP server. Tools are exposed under the "sdk" MCP server name (tool names: mcp__sdk__<name>). Each tool is automatically added to AllowedTools.

func WithSafetyIdentifier ¶

func WithSafetyIdentifier(safetyID string) Option

WithSafetyIdentifier sets responses.safety_identifier.

func WithSeed ¶

func WithSeed(seed int64) Option

WithSeed sets deterministic seed where supported.

func WithServiceTier ¶

func WithServiceTier(tier string) Option

WithServiceTier sets responses.service_tier.

func WithSessionID ¶

func WithSessionID(sessionID string) Option

WithSessionID sets request session_id.

func WithSessionStorePath ¶

func WithSessionStorePath(path string) Option

WithSessionStorePath enables durable session persistence at a JSON file path. When set, resume/fork state can survive process restarts.

func WithStop ¶

func WithStop(stop ...string) Option

WithStop sets stop sequences.

func WithStore ¶

func WithStore(store bool) Option

WithStore sets responses.store.

func WithSystemPrompt ¶

func WithSystemPrompt(prompt string) Option

WithSystemPrompt sets the system message to send to the model.

func WithSystemPromptPreset ¶

func WithSystemPromptPreset(preset *SystemPromptPreset) Option

WithSystemPromptPreset sets a preset system prompt configuration. If set, this takes precedence over WithSystemPrompt.

func WithTemperature ¶

func WithTemperature(temperature float64) Option

WithTemperature sets sampling temperature.

func WithThinking ¶

func WithThinking(thinking config.ThinkingConfig) Option

WithThinking sets the thinking configuration.

func WithToolChoice ¶

func WithToolChoice(choice any) Option

WithToolChoice sets the tool_choice payload.

func WithTools ¶

func WithTools(tools config.ToolsConfig) Option

WithTools specifies which tools are available. Accepts ToolsList (tool names) or *ToolsPreset.

func WithTopK ¶

func WithTopK(topK float64) Option

WithTopK sets top-k sampling (responses API).

func WithTopLogprobs ¶

func WithTopLogprobs(v int) Option

WithTopLogprobs sets top_logprobs where supported.

func WithTopP ¶

func WithTopP(topP float64) Option

WithTopP sets nucleus sampling probability.

func WithTrace ¶

func WithTrace(enable bool) Option

WithTrace enables VLLM trace where supported.

func WithTransport ¶

func WithTransport(transport Transport) Option

WithTransport injects a custom transport implementation. The transport must implement the Transport interface.

func WithTruncation ¶

func WithTruncation(truncation string) Option

WithTruncation sets responses.truncation.

func WithUser ¶

func WithUser(user string) Option

WithUser sets a user identifier for tracking purposes.

func WithVLLMAPIMode ¶

func WithVLLMAPIMode(mode config.VLLMAPIMode) Option

WithVLLMAPIMode selects the VLLM API surface.

func WithVLLMExtra ¶

func WithVLLMExtra(extra map[string]any) Option

WithVLLMExtra merges raw request fields into the outgoing payload.

func WithVLLMMetadata ¶

func WithVLLMMetadata(metadata map[string]any) Option

WithVLLMMetadata sets request metadata.

func WithVLLMPlugins ¶

func WithVLLMPlugins(plugins ...map[string]any) Option

WithVLLMPlugins sets VLLM plugin payloads.

func WithXTitle ¶

func WithXTitle(title string) Option

WithXTitle sets the X-Title header for VLLM requests.

type PermissionBehavior ¶

type PermissionBehavior = permission.Behavior

PermissionBehavior represents the permission behavior for a rule.

type PermissionMode ¶

type PermissionMode = permission.Mode

PermissionMode represents different permission handling modes.

type PermissionRequestHookInput ¶

type PermissionRequestHookInput = hook.PermissionRequestInput

PermissionRequestHookInput is the input for PermissionRequest hooks.

type PermissionRequestHookSpecificOutput ¶

type PermissionRequestHookSpecificOutput = hook.PermissionRequestSpecificOutput

PermissionRequestHookSpecificOutput is the hook-specific output for PermissionRequest.

type PermissionResult ¶

type PermissionResult = permission.Result

PermissionResult is the interface for permission decision results.

type PermissionResultAllow ¶

type PermissionResultAllow = permission.ResultAllow

PermissionResultAllow represents an allow decision.

type PermissionResultDeny ¶

type PermissionResultDeny = permission.ResultDeny

PermissionResultDeny represents a deny decision.

type PermissionRuleValue ¶

type PermissionRuleValue = permission.RuleValue

PermissionRuleValue represents a permission rule.

type PermissionUpdate ¶

type PermissionUpdate = permission.Update

PermissionUpdate represents a permission update request.

type PermissionUpdateDestination ¶

type PermissionUpdateDestination = permission.UpdateDestination

PermissionUpdateDestination represents where permission updates are stored.

type PermissionUpdateType ¶

type PermissionUpdateType = permission.UpdateType

PermissionUpdateType represents the type of permission update.

type PostToolUseFailureHookInput ¶

type PostToolUseFailureHookInput = hook.PostToolUseFailureInput

PostToolUseFailureHookInput is the input for PostToolUseFailure hooks.

type PostToolUseFailureHookSpecificOutput ¶

type PostToolUseFailureHookSpecificOutput = hook.PostToolUseFailureSpecificOutput

PostToolUseFailureHookSpecificOutput is the hook-specific output for PostToolUseFailure.

type PostToolUseHookInput ¶

type PostToolUseHookInput = hook.PostToolUseInput

PostToolUseHookInput is the input for PostToolUse hooks.

type PostToolUseHookSpecificOutput ¶

type PostToolUseHookSpecificOutput = hook.PostToolUseSpecificOutput

PostToolUseHookSpecificOutput is the hook-specific output for PostToolUse.

type PreCompactHookInput ¶

type PreCompactHookInput = hook.PreCompactInput

PreCompactHookInput is the input for PreCompact hooks.

type PreToolUseHookInput ¶

type PreToolUseHookInput = hook.PreToolUseInput

PreToolUseHookInput is the input for PreToolUse hooks.

type PreToolUseHookSpecificOutput ¶

type PreToolUseHookSpecificOutput = hook.PreToolUseSpecificOutput

PreToolUseHookSpecificOutput is the hook-specific output for PreToolUse.

type ResultMessage ¶

type ResultMessage = message.ResultMessage

ResultMessage represents the final result of a query.

type Schema ¶

type Schema = jsonschema.Schema

Schema is a JSON Schema object for tool input validation.

type SchemaBuilder ¶

type SchemaBuilder struct {
	// contains filtered or unexported fields
}

SchemaBuilder provides a fluent interface for building JSON schemas.

func NewSchemaBuilder ¶

func NewSchemaBuilder() *SchemaBuilder

NewSchemaBuilder creates a new SchemaBuilder.

func (*SchemaBuilder) Build ¶

func (b *SchemaBuilder) Build() map[string]any

Build returns the complete JSON Schema.

func (*SchemaBuilder) OptionalProperty ¶

func (b *SchemaBuilder) OptionalProperty(name, goType string) *SchemaBuilder

OptionalProperty adds an optional property (not in required list).

func (*SchemaBuilder) OptionalPropertyWithDescription ¶

func (b *SchemaBuilder) OptionalPropertyWithDescription(name, goType, description string) *SchemaBuilder

OptionalPropertyWithDescription adds an optional property with description.

func (*SchemaBuilder) Property ¶

func (b *SchemaBuilder) Property(name, goType string) *SchemaBuilder

Property adds a property with the given name and Go type. The property is marked as required by default.

func (*SchemaBuilder) PropertyWithDescription ¶

func (b *SchemaBuilder) PropertyWithDescription(name, goType, description string) *SchemaBuilder

PropertyWithDescription adds a property with type and description.
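To show how a fluent builder of this kind fits together, here is a self-contained re-creation of the pattern. It is not the SDK's SchemaBuilder: `schemaBuilder`, `newSchemaBuilder`, and especially the Go-type-to-JSON-type mapping in `jsonType` are assumptions made for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schemaBuilder is a hypothetical re-creation of the fluent pattern, not the SDK type.
type schemaBuilder struct {
	properties map[string]any
	required   []string
}

func newSchemaBuilder() *schemaBuilder {
	return &schemaBuilder{properties: map[string]any{}}
}

// jsonType maps a Go type name to a JSON Schema type. The mapping is an assumption.
func jsonType(goType string) string {
	switch goType {
	case "string":
		return "string"
	case "int", "int64":
		return "integer"
	case "float64":
		return "number"
	case "bool":
		return "boolean"
	default:
		return "object"
	}
}

// Property adds a required property and returns the builder for chaining.
func (b *schemaBuilder) Property(name, goType string) *schemaBuilder {
	b.properties[name] = map[string]any{"type": jsonType(goType)}
	b.required = append(b.required, name)
	return b
}

// OptionalProperty adds a property without listing it as required.
func (b *schemaBuilder) OptionalProperty(name, goType string) *schemaBuilder {
	b.properties[name] = map[string]any{"type": jsonType(goType)}
	return b
}

// Build assembles the final object schema.
func (b *schemaBuilder) Build() map[string]any {
	return map[string]any{
		"type":       "object",
		"properties": b.properties,
		"required":   b.required,
	}
}

func main() {
	schema := newSchemaBuilder().
		Property("query", "string").
		OptionalProperty("limit", "int").
		Build()

	out, _ := json.MarshalIndent(schema, "", "  ")
	fmt.Println(string(out))
}
```

Returning the receiver from each method is what makes the chained call style possible; only properties added through `Property` end up in the `required` list.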

type SdkMcpServerInstance ¶

type SdkMcpServerInstance = mcp.ServerInstance

SdkMcpServerInstance is the interface that SDK MCP servers must implement.

type SdkMcpTool ¶

type SdkMcpTool struct {
	ToolName        string
	ToolDescription string
	ToolSchema      *jsonschema.Schema
	ToolHandler     SdkMcpToolHandler
	ToolAnnotations *mcp.ToolAnnotations
}

SdkMcpTool represents a tool created with NewSdkMcpTool.

func NewSdkMcpTool ¶

func NewSdkMcpTool(
	name, description string,
	inputSchema *jsonschema.Schema,
	handler SdkMcpToolHandler,
	opts ...SdkMcpToolOption,
) *SdkMcpTool

NewSdkMcpTool creates an SdkMcpTool with optional configuration.

The inputSchema should be a *jsonschema.Schema. Use SimpleSchema for convenience or create a full Schema struct for more control.

Use WithAnnotations to set MCP tool annotations (hints about tool behavior).

Example with SimpleSchema:

addTool := vllmsdk.NewSdkMcpTool("add", "Add two numbers",
    vllmsdk.SimpleSchema(map[string]string{"a": "float64", "b": "float64"}),
    func(ctx context.Context, req *vllmsdk.CallToolRequest) (*vllmsdk.CallToolResult, error) {
        args, _ := vllmsdk.ParseArguments(req)
        a, b := args["a"].(float64), args["b"].(float64)
        return vllmsdk.TextResult(fmt.Sprintf("Result: %v", a+b)), nil
    },
    vllmsdk.WithAnnotations(&vllmsdk.McpToolAnnotations{
        ReadOnlyHint: true,
    }),
)

func (*SdkMcpTool) Annotations ¶

func (t *SdkMcpTool) Annotations() *mcp.ToolAnnotations

Annotations returns the tool annotations, or nil if not set.

func (*SdkMcpTool) Description ¶

func (t *SdkMcpTool) Description() string

Description returns the tool description.

func (*SdkMcpTool) Handler ¶

func (t *SdkMcpTool) Handler() SdkMcpToolHandler

Handler returns the tool handler function.

func (*SdkMcpTool) InputSchema ¶

func (t *SdkMcpTool) InputSchema() *jsonschema.Schema

InputSchema returns the JSON Schema for the tool input.

func (*SdkMcpTool) Name ¶

func (t *SdkMcpTool) Name() string

Name returns the tool name.

type SdkMcpToolHandler ¶

type SdkMcpToolHandler = mcp.ToolHandler

SdkMcpToolHandler is the function signature for SdkMcpTool handlers. It receives the context and request, and returns the result.

Use ParseArguments to extract input as map[string]any from the request. Use TextResult, ErrorResult, or ImageResult helpers to create results.

Example:

func(ctx context.Context, req *vllmsdk.CallToolRequest) (*vllmsdk.CallToolResult, error) {
    args, err := vllmsdk.ParseArguments(req)
    if err != nil {
        return vllmsdk.ErrorResult(err.Error()), nil
    }
    a := args["a"].(float64)
    return vllmsdk.TextResult(fmt.Sprintf("Result: %v", a)), nil
}
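The handler example uses an unchecked type assertion, which panics if the model omits an argument or sends the wrong type. A more defensive extraction might look like the sketch below. `floatArg` and `add` are hypothetical helpers, and the sketch assumes arguments arrive as a `map[string]any` decoded from JSON, where numbers are `float64`.

```go
package main

import (
	"errors"
	"fmt"
)

// floatArg reads a numeric argument, reporting a missing key or a wrong
// type as an error rather than panicking.
func floatArg(args map[string]any, key string) (float64, error) {
	v, ok := args[key]
	if !ok {
		return 0, fmt.Errorf("missing argument %q", key)
	}
	f, ok := v.(float64)
	if !ok {
		return 0, fmt.Errorf("argument %q: want number, got %T", key, v)
	}
	return f, nil
}

// add validates both operands before using them and reports every problem at once.
func add(args map[string]any) (float64, error) {
	a, errA := floatArg(args, "a")
	b, errB := floatArg(args, "b")
	if err := errors.Join(errA, errB); err != nil {
		return 0, err
	}
	return a + b, nil
}

func main() {
	sum, err := add(map[string]any{"a": 1.5, "b": 2.0})
	fmt.Println(sum, err)

	_, err = add(map[string]any{"a": "oops"})
	fmt.Println(err)
}
```

Inside a real handler, the error from `add` could be passed to ErrorResult so the model sees a readable message instead of the process crashing.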

type SdkMcpToolOption ¶

type SdkMcpToolOption func(*SdkMcpTool)

SdkMcpToolOption configures an SdkMcpTool during construction.

func WithAnnotations ¶

func WithAnnotations(annotations *mcp.ToolAnnotations) SdkMcpToolOption

WithAnnotations sets MCP tool annotations (hints about tool behavior). Annotations describe properties like whether a tool is read-only, destructive, idempotent, or operates in an open world.

type SdkPluginConfig ¶

type SdkPluginConfig = config.PluginConfig

SdkPluginConfig configures a plugin to load.

type SessionStat ¶

type SessionStat struct {
	SessionID           string
	CreatedAt           string
	UpdatedAt           string
	MessageCount        int
	UserTurns           int
	CheckpointCount     int
	FileCheckpointCount int
}

SessionStat contains metadata about a locally persisted SDK session.

func ListSessions ¶

func ListSessions(ctx context.Context, opts ...Option) ([]SessionStat, error)

ListSessions returns local SDK sessions from the configured session store.

func StatSession ¶

func StatSession(ctx context.Context, sessionID string, opts ...Option) (*SessionStat, error)

StatSession returns metadata for a locally persisted SDK session.

type StopHookInput ¶

type StopHookInput = hook.StopInput

StopHookInput is the input for Stop hooks.

type StreamEvent ¶

type StreamEvent = message.StreamEvent

StreamEvent represents a raw streaming event from the VLLM backend.

type StreamingMessage ¶

type StreamingMessage = message.StreamingMessage

StreamingMessage represents a message sent in streaming mode.

func NewUserMessage ¶

func NewUserMessage(content UserMessageContent) StreamingMessage

NewUserMessage creates a StreamingMessage with type "user". This is a convenience constructor for creating user messages.

type StreamingMessageContent ¶

type StreamingMessageContent = message.StreamingMessageContent

StreamingMessageContent represents the content of a streaming message.

type SubagentStartHookInput ¶

type SubagentStartHookInput = hook.SubagentStartInput

SubagentStartHookInput is the input for SubagentStart hooks.

type SubagentStartHookSpecificOutput ¶

type SubagentStartHookSpecificOutput = hook.SubagentStartSpecificOutput

SubagentStartHookSpecificOutput is the hook-specific output for SubagentStart.

type SubagentStopHookInput ¶

type SubagentStopHookInput = hook.SubagentStopInput

SubagentStopHookInput is the input for SubagentStop hooks.

type SyncHookJSONOutput ¶

type SyncHookJSONOutput = hook.SyncJSONOutput

SyncHookJSONOutput represents a sync hook output.

type SystemMessage ¶

type SystemMessage = message.SystemMessage

SystemMessage represents a system message.

type SystemPromptPreset ¶

type SystemPromptPreset = config.SystemPromptPreset

SystemPromptPreset defines a system prompt preset configuration.

type TextBlock ¶

type TextBlock = message.TextBlock

TextBlock contains plain text content.

func TextInput ¶

func TextInput(text string) *TextBlock

TextInput creates a text block for block-based multimodal user content.

type ThinkingBlock ¶

type ThinkingBlock = message.ThinkingBlock

ThinkingBlock contains model reasoning or thinking output.

type ThinkingConfig ¶

type ThinkingConfig = config.ThinkingConfig

ThinkingConfig controls extended thinking behavior.

type ThinkingConfigAdaptive ¶

type ThinkingConfigAdaptive = config.ThinkingConfigAdaptive

ThinkingConfigAdaptive enables adaptive thinking mode.

type ThinkingConfigDisabled ¶

type ThinkingConfigDisabled = config.ThinkingConfigDisabled

ThinkingConfigDisabled disables extended thinking.

type ThinkingConfigEnabled ¶

type ThinkingConfigEnabled = config.ThinkingConfigEnabled

ThinkingConfigEnabled enables thinking with a specific token budget.

type Tool ¶

type Tool interface {
	// Name returns the unique identifier for this tool.
	Name() string

	// Description returns a human-readable description for the model.
	Description() string

	// InputSchema returns a JSON schema describing expected input.
	// The schema should follow JSON Schema Draft 7 specification.
	InputSchema() map[string]any

	// Execute runs the tool with the provided input.
	// The input will be validated against InputSchema before execution.
	Execute(ctx context.Context, input map[string]any) (map[string]any, error)
}

Tool represents a custom tool that the SDK can invoke through VLLM tool calling.

Tools allow users to extend agent capabilities with domain-specific functionality. When registered, the model can discover and execute these tools during a session.

Example:

tool := vllmsdk.NewTool(
    "calculator",
    "Performs basic arithmetic operations",
    map[string]any{
        "type": "object",
        "properties": map[string]any{
            "operation": map[string]any{
                "type": "string",
                "enum": []string{"add", "subtract", "multiply", "divide"},
            },
            "a": map[string]any{"type": "number"},
            "b": map[string]any{"type": "number"},
        },
        "required": []string{"operation", "a", "b"},
    },
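    func(ctx context.Context, input map[string]any) (map[string]any, error) {
        // Checked assertions return an error instead of panicking on malformed input.
        op, ok := input["operation"].(string)
        if !ok {
            return nil, fmt.Errorf("operation must be a string")
        }
        a, aOK := input["a"].(float64)
        b, bOK := input["b"].(float64)
        if !aOK || !bOK {
            return nil, fmt.Errorf("a and b must be numbers")
        }

        var result float64
        switch op {
        case "add":
            result = a + b
        case "subtract":
            result = a - b
        case "multiply":
            result = a * b
        case "divide":
            if b == 0 {
                return nil, fmt.Errorf("division by zero")
            }
            result = a / b
        default:
            // Reject unknown operations rather than silently returning 0.
            return nil, fmt.Errorf("unsupported operation %q", op)
        }

        return map[string]any{"result": result}, nil
    },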
)

func NewTool ¶

func NewTool(name, description string, schema map[string]any, fn ToolFunc) Tool

NewTool creates a Tool from a function.

This is a convenience constructor for creating tools without implementing the full Tool interface.

Parameters:

name: Unique identifier for the tool (e.g., "calculator", "search_database")
description: Human-readable description of what the tool does
schema: JSON Schema defining the expected input structure
fn: Function that executes the tool logic

type ToolFunc ¶

type ToolFunc func(ctx context.Context, input map[string]any) (map[string]any, error)

ToolFunc is a function-based tool implementation.

type ToolPermissionCallback ¶

type ToolPermissionCallback = permission.Callback

ToolPermissionCallback is called before each tool use for permission checking.

type ToolPermissionContext ¶

type ToolPermissionContext = permission.Context

ToolPermissionContext provides context for tool permission callbacks.

type ToolPermissionDeniedError ¶

type ToolPermissionDeniedError = internalerrors.ToolPermissionDeniedError

ToolPermissionDeniedError indicates a tool execution was denied by permission policy.

type ToolResultBlock ¶

type ToolResultBlock = message.ToolResultBlock

ToolResultBlock contains the result of a tool execution.

type ToolUseBlock ¶

type ToolUseBlock = message.ToolUseBlock

ToolUseBlock represents the model invoking a tool.

type ToolsConfig ¶

type ToolsConfig = config.ToolsConfig

ToolsConfig is an interface for configuring available tools. It represents either a list of tool names or a preset configuration.

type ToolsList ¶

type ToolsList = config.ToolsList

ToolsList is a list of tool names to make available.

type ToolsPreset ¶

type ToolsPreset = config.ToolsPreset

ToolsPreset represents a preset configuration for available tools.

type Transport ¶

type Transport interface {
	Start(ctx context.Context) error
	CreateStream(ctx context.Context, req *ChatRequest) (<-chan map[string]any, <-chan error)
	Close() error
}

Transport defines the runtime transport interface.

type UnsupportedControlError ¶

type UnsupportedControlError = controlplane.UnsupportedControlError

UnsupportedControlError indicates a control-plane operation is unsupported by this backend.

type UnsupportedHookEventError ¶

type UnsupportedHookEventError = internalerrors.UnsupportedHookEventError

UnsupportedHookEventError indicates a configured hook event is not supported by this backend.

type UnsupportedHookOutputError ¶

type UnsupportedHookOutputError = internalerrors.UnsupportedHookOutputError

UnsupportedHookOutputError indicates a hook output field is unsupported by this backend.

type Usage ¶

type Usage = message.Usage

Usage contains token usage information.

type UserInputAnswer ¶

type UserInputAnswer = userinput.Answer

UserInputAnswer contains the user's response(s) to a single question.

type UserInputCallback ¶

type UserInputCallback = userinput.Callback

UserInputCallback handles user-input requests and returns answers.

type UserInputQuestion ¶

type UserInputQuestion = userinput.Question

UserInputQuestion represents a single user-input prompt.

type UserInputQuestionOption ¶

type UserInputQuestionOption = userinput.QuestionOption

UserInputQuestionOption represents a selectable choice in a user-input question.

type UserInputRequest ¶

type UserInputRequest = userinput.Request

UserInputRequest contains parsed user-input payload data.

type UserInputResponse ¶

type UserInputResponse = userinput.Response

UserInputResponse contains answers keyed by question ID.

type UserMessage ¶

type UserMessage = message.UserMessage

UserMessage represents a message from the user.

type UserMessageContent ¶

type UserMessageContent = message.UserMessageContent

UserMessageContent represents content that can be either a string or []ContentBlock.

func Blocks ¶

func Blocks(blocks ...ContentBlock) UserMessageContent

Blocks creates block-based user content.

func Text ¶

func Text(text string) UserMessageContent

Text creates text-only user content.

type UserPromptSubmitHookInput ¶

type UserPromptSubmitHookInput = hook.UserPromptSubmitInput

UserPromptSubmitHookInput is the input for UserPromptSubmit hooks.

type UserPromptSubmitHookSpecificOutput ¶

type UserPromptSubmitHookSpecificOutput = hook.UserPromptSubmitSpecificOutput

UserPromptSubmitHookSpecificOutput is the hook-specific output for UserPromptSubmit.

type VLLMAPIMode ¶

type VLLMAPIMode = config.VLLMAPIMode

VLLMAPIMode selects which vLLM API surface to use.

type VLLMAgentOptions ¶

type VLLMAgentOptions = config.Options

VLLMAgentOptions configures SDK and vLLM request behavior.

type VLLMSDKError ¶

type VLLMSDKError = internalerrors.VLLMSDKError

VLLMSDKError is the base interface for all SDK errors.

Directories ¶

Path	Synopsis
examples
cancellation command	Package main demonstrates cancellation and graceful shutdown patterns.
client_multi_turn command
error_handling command
extended_thinking command	Package main demonstrates extended thinking capabilities with VLLM.
hooks command
include_partial_messages command	Package main demonstrates partial message streaming where incremental assistant updates are received as the model generates responses.
internal/exampleutil
interrupt command
max_budget_usd command	Package main demonstrates API cost control with budget limits.
mcp_calculator command	Package main demonstrates how to create calculator tools using MCP servers.
mcp_status command	Package main demonstrates querying MCP server connection status.
memory_tool command	Package main demonstrates a filesystem-backed memory tool for agent state persistence.
model_discovery command
on_user_input command
parallel_queries command	Package main demonstrates running multiple Query() calls concurrently.
permissions command
pipeline command	Package main demonstrates multi-step LLM orchestration with Go control flow.
query_stream command
quick_start command
sdk_tools command
sessions_local command
structured_output command
system_prompt command	Package main demonstrates configuring system prompts.
vllm_chat_controls command
vllm_extra command
vllm_multimodal_image command
vllm_multimodal_input command
vllm_responses command
vllm_responses_chaining command
vllm_routing command
internal
config
controlplane
errors
hook	Package hook provides hook types for intercepting runtime events.
mcp
message	Package message provides internal message and content block types.
model
permission	Package permission provides permission handling types.
runtime
session
tools
userinput
util
vllm
scripts
example_verifier command
