Documentation
Index ¶
- Constants
- type ChatRequest
- type Effort
- type Options
- type PluginConfig
- type QueryLifecycleNotifier
- type SessionMetricsRecorder
- type SystemPromptPreset
- type ThinkingConfig
- type ThinkingConfigAdaptive
- type ThinkingConfigDisabled
- type ThinkingConfigEnabled
- type ToolsConfig
- type ToolsList
- type ToolsPreset
- type Transport
- type VLLMAPIMode
Constants ¶
const DefaultBaseURL = "http://127.0.0.1:8000/v1"
DefaultBaseURL is the default vLLM OpenAI-compatible base URL.
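A common first step is joining the base URL with an endpoint path. The sketch below is self-contained (it redeclares the documented constant locally) and shows one careful way to do the join without doubling slashes; the `chat/completions` path matches the endpoint named under VLLMAPIMode.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// DefaultBaseURL mirrors the documented constant.
const DefaultBaseURL = "http://127.0.0.1:8000/v1"

// endpoint joins a base URL and an API path, normalizing the slashes
// between them so "/v1" + "/chat/completions" yields "/v1/chat/completions".
func endpoint(base, path string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/" + strings.TrimLeft(path, "/")
	return u.String(), nil
}

func main() {
	ep, _ := endpoint(DefaultBaseURL, "chat/completions")
	fmt.Println(ep) // http://127.0.0.1:8000/v1/chat/completions
}
```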
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type ChatRequest ¶
type ChatRequest struct {
Model string
Models []string
Messages []map[string]any
Tools []map[string]any
Stream bool
ToolChoice any
MaxTokens *int
MaxOutputTokens *int
Temperature *float64
TopP *float64
TopK *float64
PresencePenalty *float64
FrequencyPenalty *float64
Seed *int64
Stop []string
Logprobs *bool
TopLogprobs *int
ParallelToolCalls *bool
ResponseFormat map[string]any
ResponseText map[string]any
Metadata map[string]any
Provider map[string]any
Plugins []map[string]any
Route string
Reasoning map[string]any
SessionID string
Trace *bool
Modalities []string
ImageConfig map[string]any
User string
Instructions string
PreviousResponseID string
PromptCacheKey string
MaxToolCalls *int
ServiceTier string
Truncation string
Include []string
Background *bool
SafetyIdentifier string
Store *bool
Prompt map[string]any
Extra map[string]any
}
ChatRequest is the normalized vLLM chat request used by transports.
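Note that many ChatRequest fields are pointers (`*int`, `*float64`, `*bool`) so that "unset" is distinguishable from an explicit zero value. The sketch below uses a trimmed local mirror of the struct (a few of the documented fields only, for illustration) and a generic pointer helper:

```go
package main

import "fmt"

// ChatRequest here is a trimmed local mirror of the documented struct,
// reduced to a handful of fields for illustration.
type ChatRequest struct {
	Model       string
	Messages    []map[string]any
	Stream      bool
	MaxTokens   *int
	Temperature *float64
}

// ptr returns a pointer to v, letting optional fields distinguish
// "explicitly set to zero" from "not set at all".
func ptr[T any](v T) *T { return &v }

func main() {
	req := ChatRequest{
		Model: "my-model", // hypothetical model name
		Messages: []map[string]any{
			{"role": "user", "content": "Hello"},
		},
		Stream:      true,
		MaxTokens:   ptr(256),
		Temperature: ptr(0.2),
	}
	fmt.Println(req.Model, *req.MaxTokens, *req.Temperature)
}
```

A nil pointer field is simply omitted from the outgoing request, whereas `ptr(0)` sends an explicit zero.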
type Options ¶
type Options struct {
Logger *slog.Logger
SystemPrompt string
SystemPromptPreset *SystemPromptPreset
Model string
PermissionMode string
MaxTurns int
Cwd string
User string
Hooks map[hook.Event][]*hook.Matcher
Thinking ThinkingConfig
Effort *Effort
IncludePartialMessages bool
MaxBudgetUSD *float64
MCPServers map[string]mcp.ServerConfig
MCPConfig string
Tools ToolsConfig
AllowedTools []string
DisallowedTools []string
CanUseTool permission.Callback
OnUserInput userinput.Callback
Resume string
ForkSession bool
SessionStorePath string
FallbackModel string
PermissionPromptToolName string
Plugins []*PluginConfig
OutputFormat map[string]any
EnableFileCheckpointing bool
Transport Transport
// Observability
MeterProvider metric.MeterProvider
TracerProvider trace.TracerProvider
PrometheusRegisterer prometheus.Registerer
// MetricsRecorder is the internal observability recorder created from OTel providers.
// This field is set by the SDK at runtime; users should not set it directly.
MetricsRecorder SessionMetricsRecorder
// Observer is the shared observability helper used for SDK-level span and
// duration instrumentation beyond message-based recording (hook dispatch,
// explicit tool spans, etc.). Set by the SDK at runtime alongside
// MetricsRecorder; consumers should not set this directly.
Observer *observability.Observer
// VLLM specific
APIKey string
BaseURL string
VLLMAPIMode VLLMAPIMode
HTTPReferer string
XTitle string
RequestTimeout *time.Duration
MaxToolIterations int
// VLLM request fields
VLLMTopP *float64
VLLMTemperature *float64
VLLMMaxTokens *int
VLLMTopK *float64
VLLMPresencePenalty *float64
VLLMFrequencyPenalty *float64
VLLMSeed *int64
VLLMStop []string
VLLMLogprobs *bool
VLLMTopLogprobs *int
VLLMParallelToolCalls *bool
VLLMToolChoice any
VLLMProvider map[string]any
VLLMPlugins []map[string]any
VLLMRoute string
VLLMReasoning map[string]any
VLLMSessionID string
VLLMTrace *bool
VLLMModalities []string
VLLMImageConfig map[string]any
VLLMModels []string
VLLMMetadata map[string]any
VLLMInstructions string
VLLMPreviousResponseID string
VLLMPromptCacheKey string
VLLMPrompt map[string]any
VLLMText map[string]any
VLLMMaxOutputTokens *int
VLLMMaxToolCalls *int
VLLMServiceTier string
VLLMTruncation string
VLLMInclude []string
VLLMBackground *bool
VLLMSafetyIdentifier string
VLLMStore *bool
VLLMExtra map[string]any
}
Options contains all SDK options.
func (*Options) ApplyDefaults ¶
func (o *Options) ApplyDefaults()
ApplyDefaults fills missing option defaults.
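The exact defaults are not documented beyond DefaultBaseURL, but the fill-if-zero pattern can be sketched with a trimmed local mirror of Options. The MaxToolIterations default of 10 below is an assumption for illustration, not a documented value:

```go
package main

import "fmt"

// DefaultBaseURL mirrors the documented constant.
const DefaultBaseURL = "http://127.0.0.1:8000/v1"

// Options is a trimmed local mirror; the real struct has many more fields.
type Options struct {
	BaseURL           string
	MaxToolIterations int
}

// ApplyDefaults sketches the documented fill-missing-defaults behavior:
// only zero-valued fields are overwritten, so caller-supplied values win.
func (o *Options) ApplyDefaults() {
	if o.BaseURL == "" {
		o.BaseURL = DefaultBaseURL
	}
	if o.MaxToolIterations == 0 {
		o.MaxToolIterations = 10 // assumed default for illustration
	}
}

func main() {
	o := Options{MaxToolIterations: 3}
	o.ApplyDefaults()
	fmt.Println(o.BaseURL, o.MaxToolIterations) // BaseURL filled, 3 kept
}
```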
type PluginConfig ¶
PluginConfig configures a plugin to load.
type QueryLifecycleNotifier ¶ added in v0.0.2
type QueryLifecycleNotifier interface {
MarkQueryStart()
}
QueryLifecycleNotifier is optionally implemented by SessionMetricsRecorder implementations that need query lifecycle notifications for TTFT tracking.
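This is Go's optional-interface upgrade pattern: the SDK type-asserts the recorder and calls MarkQueryStart only when supported. A self-contained sketch, using local mirrors of the two interfaces (the real SessionMetricsRecorder has methods not shown here):

```go
package main

import "fmt"

// SessionMetricsRecorder stands in for the documented recorder interface;
// its real methods are omitted in this local mirror.
type SessionMetricsRecorder interface{}

// QueryLifecycleNotifier mirrors the documented optional interface.
type QueryLifecycleNotifier interface {
	MarkQueryStart()
}

// recorder implements both interfaces.
type recorder struct{ started int }

func (r *recorder) MarkQueryStart() { r.started++ }

// notifyQueryStart shows the upgrade: assert for the optional interface
// and only call MarkQueryStart when the recorder supports it.
func notifyQueryStart(rec SessionMetricsRecorder) bool {
	if n, ok := rec.(QueryLifecycleNotifier); ok {
		n.MarkQueryStart()
		return true
	}
	return false
}

func main() {
	r := &recorder{}
	fmt.Println(notifyQueryStart(r), r.started) // true 1
}
```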
type SessionMetricsRecorder ¶ added in v0.0.2
SessionMetricsRecorder is the narrow observability interface used by the SDK runtime. When configured via WithMeterProvider or WithTracerProvider, the SDK creates a recorder that emits OpenTelemetry metrics and traces at existing observation points. The context parameter enables trace correlation and exemplar propagation.
type SystemPromptPreset ¶
type SystemPromptPreset struct {
Type string `json:"type"` // "preset"
Preset string `json:"preset"` // backend-defined preset identifier
Append *string `json:"append,omitempty"`
}
SystemPromptPreset defines a system prompt preset configuration.
type ThinkingConfig ¶
type ThinkingConfig interface {
// contains filtered or unexported methods
}
ThinkingConfig is a marker interface for thinking settings.
type ThinkingConfigAdaptive ¶
type ThinkingConfigAdaptive struct{}
ThinkingConfigAdaptive enables adaptive thinking.
type ThinkingConfigDisabled ¶
type ThinkingConfigDisabled struct{}
ThinkingConfigDisabled disables thinking.
type ThinkingConfigEnabled ¶
type ThinkingConfigEnabled struct {
BudgetTokens int
}
ThinkingConfigEnabled enables thinking with a token budget.
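The three variants are selected with a type switch on the marker interface. A self-contained sketch using local mirrors (the real package uses an unexported method; `isThinking` stands in for it here):

```go
package main

import "fmt"

// ThinkingConfig mirrors the documented marker interface; isThinking
// stands in for the package's unexported method.
type ThinkingConfig interface{ isThinking() }

type ThinkingConfigDisabled struct{}
type ThinkingConfigAdaptive struct{}
type ThinkingConfigEnabled struct{ BudgetTokens int }

func (ThinkingConfigDisabled) isThinking() {}
func (ThinkingConfigAdaptive) isThinking() {}
func (ThinkingConfigEnabled) isThinking()  {}

// describe shows how a consumer might dispatch on the concrete variant.
func describe(c ThinkingConfig) string {
	switch t := c.(type) {
	case ThinkingConfigDisabled:
		return "disabled"
	case ThinkingConfigAdaptive:
		return "adaptive"
	case ThinkingConfigEnabled:
		return fmt.Sprintf("enabled (budget %d tokens)", t.BudgetTokens)
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(ThinkingConfigEnabled{BudgetTokens: 2048}))
}
```

The marker-interface design means the set of variants is closed: only types in the package can implement the unexported method.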
type ToolsConfig ¶
type ToolsConfig interface {
// contains filtered or unexported methods
}
ToolsConfig is an interface for configuring available tools.
type ToolsPreset ¶
type ToolsPreset struct {
Type string `json:"type"` // "preset"
Preset string `json:"preset"` // backend-defined preset identifier
}
ToolsPreset represents a preset configuration for available tools.
type Transport ¶
type Transport interface {
Start(ctx context.Context) error
CreateStream(ctx context.Context, req *ChatRequest) (<-chan map[string]any, <-chan error)
Close() error
}
Transport defines the runtime transport interface.
type VLLMAPIMode ¶
type VLLMAPIMode string
VLLMAPIMode is retained for public compatibility. The vLLM backend serves chat-completions requests and may emulate responses-mode behavior locally where practical.
const (
	// VLLMAPIModeChatCompletions uses /chat/completions.
	VLLMAPIModeChatCompletions VLLMAPIMode = "chat_completions"
	// VLLMAPIModeResponses uses /responses.
	VLLMAPIModeResponses VLLMAPIMode = "responses"
)
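A sketch of mapping a mode to its documented endpoint path (type and constants mirrored locally so the example is self-contained; unknown modes fall back to chat completions, which matches the backend's documented default behavior):

```go
package main

import "fmt"

// VLLMAPIMode mirrors the documented string type and constants.
type VLLMAPIMode string

const (
	VLLMAPIModeChatCompletions VLLMAPIMode = "chat_completions"
	VLLMAPIModeResponses       VLLMAPIMode = "responses"
)

// pathFor maps a mode to its documented endpoint path, defaulting
// to /chat/completions for anything else.
func pathFor(m VLLMAPIMode) string {
	if m == VLLMAPIModeResponses {
		return "/responses"
	}
	return "/chat/completions"
}

func main() {
	fmt.Println(pathFor(VLLMAPIModeChatCompletions), pathFor(VLLMAPIModeResponses))
}
```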