Documentation ¶
Overview ¶
Package llm provides a provider-neutral interface for LLM interactions across Anthropic, OpenAI (Responses API), OpenRouter, MiniMax, and Xiaomi MiMo.
Index ¶
- Constants
- Variables
- func CloseIdleHTTPConnections()
- func DefaultEffortFromSpec(spec string) string
- func DurStr(a, b time.Time) string
- func EnvDuration(name string, fallback time.Duration) time.Duration
- func NewLoggingTransport(base http.RoundTripper) http.RoundTripper
- func NewPluginHTTPClient(pc PluginConfig) *http.Client
- func NewRequestID() string
- func RequestIDFromContext(ctx context.Context) string
- func SharedHTTPClient() *http.Client
- func Spec(c Client) string
- func StreamDebugVerbose() bool
- func WithRequestID(ctx context.Context, id string) context.Context
- type BedrockHTTPError
- type CacheControl
- type Client
- type Config
- type ContentBlock
- func NewImageBlock(mediaType, data string) ContentBlock
- func NewTextBlock(text string) ContentBlock
- func NewThinkingBlock(text, signature string) ContentBlock
- func NewToolResultBlock(toolUseID, output string, isError bool) ContentBlock
- func NewToolUseBlock(id, name string, input map[string]any) ContentBlock
- type ContentBlockType
- type Message
- type MessageParam
- type PluginConfig
- type PluginSource
- type ProviderID
- type Role
- type StopReason
- type StreamOpts
- type SystemBlock
- type ThinkingStallError
- type ToolCall
- type ToolParam
- type Usage
Constants ¶
const DefaultMaxTokens int64 = 32768
DefaultMaxTokens is the fallback per-call output token cap when a Client has no explicit MaxTokens override.
const DefaultStreamIdleTimeout = 60 * time.Second
DefaultStreamIdleTimeout bounds the silence between any two SSE events.
const DefaultThinkingStallTimeout = 120 * time.Second
DefaultThinkingStallTimeout bounds the time spent inside a single reasoning/thinking block. Past this, the adapter cancels the stream and the retry layer nudges the model to conclude.
Variables ¶
var ErrNoCredential = errors.New("no credential")
ErrNoCredential is wrapped by NewFromModel when the selected model's provider has no resolvable credential. Callers use errors.Is to distinguish this (recoverable: the user just needs to add a key) from other construction failures such as an unknown provider prefix.
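A minimal sketch of how a caller might branch on this sentinel. The sentinel is redeclared locally so the snippet compiles on its own, and `buildClient` is a hypothetical stand-in for NewFromModel that only mimics the wrapping behavior described above.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoCredential is a local stand-in for the package's ErrNoCredential.
var errNoCredential = errors.New("no credential")

// buildClient is a hypothetical constructor. It wraps the sentinel with %w
// so callers can still match it with errors.Is.
func buildClient(spec string, haveKey bool) error {
	if !haveKey {
		return fmt.Errorf("build client for %q: %w", spec, errNoCredential)
	}
	return nil
}

// needsKey reports whether err is the recoverable "add a key" case.
func needsKey(err error) bool {
	return errors.Is(err, errNoCredential)
}

func main() {
	err := buildClient("anthropic/some-model", false)
	if needsKey(err) {
		fmt.Println("no key configured; ask the user to add one:", err)
	}
}
```

Matching with errors.Is rather than comparing error strings keeps the check stable if the wrapping message changes.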
var ErrStreamIdleTimeout = errors.New("stream idle timeout")
ErrStreamIdleTimeout is returned when no SSE events arrive within the idle-timeout window. Anthropic sends ping events every ~15-30s even during extended thinking, and OpenAI streams typed events at similar cadence, so prolonged silence indicates a dead connection.
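The idle-timeout idea can be sketched as a timer that is reset on every event. This is an illustration under assumptions, not the adapters' actual code: events are modeled as a plain channel of strings, and `drain` is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStreamIdleTimeout is a local stand-in for ErrStreamIdleTimeout.
var errStreamIdleTimeout = errors.New("stream idle timeout")

// drain consumes events until the channel closes. It fails if the gap
// between any two events exceeds idle.
func drain(events <-chan string, idle time.Duration) (int, error) {
	timer := time.NewTimer(idle)
	defer timer.Stop()
	n := 0
	for {
		select {
		case _, ok := <-events:
			if !ok {
				return n, nil // stream ended normally
			}
			n++
			// Restart the idle window after every event.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(idle)
		case <-timer.C:
			return n, errStreamIdleTimeout
		}
	}
}

// silentStream simulates a connection that sends one event and goes quiet.
func silentStream() error {
	events := make(chan string, 1)
	events <- "ping"
	_, err := drain(events, 20*time.Millisecond)
	return err
}

// closedStream simulates a stream that delivers two events and ends.
func closedStream() (int, error) {
	events := make(chan string, 2)
	events <- "a"
	events <- "b"
	close(events)
	return drain(events, time.Second)
}

func main() {
	fmt.Println(silentStream())
	fmt.Println(closedStream())
}
```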
var ErrThinkingStall = errors.New("thinking stall")
ErrThinkingStall is the sentinel unwrap target for ThinkingStallError. Callers use errors.Is(err, ErrThinkingStall) to branch on the stall path; errors.As extracts the concrete ThinkingStallError when the accumulated summary text is needed.
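The errors.Is / errors.As split described above might look like the following sketch. The types are local stand-ins: only the Summary field and the Unwrap target are taken from this page, and the Error message format and the `summaryOf` helper are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// errThinkingStall is a local stand-in for ErrThinkingStall.
var errThinkingStall = errors.New("thinking stall")

// thinkingStallError is a stand-in for ThinkingStallError.
type thinkingStallError struct {
	Summary string // text accumulated from thinking-delta events
}

func (e *thinkingStallError) Error() string { return "thinking stall" }

// Unwrap makes errors.Is(err, errThinkingStall) succeed.
func (e *thinkingStallError) Unwrap() error { return errThinkingStall }

// summaryOf reports whether err is a stall and, if the concrete type is
// available, returns the accumulated summary text.
func summaryOf(err error) (string, bool) {
	if !errors.Is(err, errThinkingStall) {
		return "", false
	}
	var stall *thinkingStallError
	if errors.As(err, &stall) {
		return stall.Summary, true
	}
	return "", true
}

// wrappedStall builds a stall error wrapped one level, as a retry layer might.
func wrappedStall() error {
	return fmt.Errorf("attempt 2: %w", &thinkingStallError{Summary: "partial reasoning"})
}

func main() {
	summary, stalled := summaryOf(wrappedStall())
	fmt.Println(stalled, summary)
}
```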
Functions ¶
func CloseIdleHTTPConnections ¶
func CloseIdleHTTPConnections()
CloseIdleHTTPConnections drops all pooled connections in the shared transport. It is called between retries so that a poisoned connection is not reused on the next attempt.
func DefaultEffortFromSpec ¶
func DefaultEffortFromSpec(spec string) string
DefaultEffortFromSpec returns the default reasoning effort for the given model spec, per the provider's effort policy. Anthropic and MiniMax default to "adaptive"; the OpenAI-style providers default to "medium" for reasoning-capable models and "" otherwise.
func DurStr ¶
func DurStr(a, b time.Time) string
DurStr formats the time between a and b as "<n>ms", or "—" when either is zero (event never observed).
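A minimal sketch of the documented behavior, assuming a is the earlier timestamp and b the later one (the page does not state the order). `durStr` is a local reimplementation for illustration, not the package's source.

```go
package main

import (
	"fmt"
	"time"
)

// durStr formats b-a in whole milliseconds, or "—" when either timestamp
// is the zero time (the event was never observed).
func durStr(a, b time.Time) string {
	if a.IsZero() || b.IsZero() {
		return "—"
	}
	return fmt.Sprintf("%dms", b.Sub(a).Milliseconds())
}

// demo returns one observed interval and one with a missing end time.
func demo() (observed, missing string) {
	start := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	return durStr(start, start.Add(1500*time.Millisecond)), durStr(start, time.Time{})
}

func main() {
	observed, missing := demo()
	fmt.Println(observed, missing) // 1500ms —
}
```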
func EnvDuration ¶
func EnvDuration(name string, fallback time.Duration) time.Duration
EnvDuration reads a Go duration (e.g. "30s", "2m") from the given env var. Returns fallback when unset; logs and falls back on parse error.
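The fallback rules can be sketched with the standard library alone. This is an illustrative reimplementation: treating an empty value the same as unset is an assumption, and `EXAMPLE_IDLE_TIMEOUT` is a hypothetical variable name.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"time"
)

// envDuration parses a Go duration from the named env var and returns
// fallback when the variable is unset (or empty) or fails to parse.
func envDuration(name string, fallback time.Duration) time.Duration {
	raw := os.Getenv(name)
	if raw == "" {
		return fallback
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		log.Printf("invalid duration %q in %s, using %s: %v", raw, name, fallback, err)
		return fallback
	}
	return d
}

// demoSeconds exercises the unset, valid, and invalid cases.
func demoSeconds() (unset, valid, invalid int) {
	const name = "EXAMPLE_IDLE_TIMEOUT" // hypothetical variable name
	fallback := 60 * time.Second

	os.Unsetenv(name)
	unset = int(envDuration(name, fallback).Seconds())

	os.Setenv(name, "2m")
	valid = int(envDuration(name, fallback).Seconds())

	os.Setenv(name, "soon")
	invalid = int(envDuration(name, fallback).Seconds())

	os.Unsetenv(name)
	return unset, valid, invalid
}

func main() {
	fmt.Println(demoSeconds()) // 60 120 60
}
```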
func NewLoggingTransport ¶
func NewLoggingTransport(base http.RoundTripper) http.RoundTripper
NewLoggingTransport returns a RoundTripper that attaches an httptrace.ClientTrace and logs a compact lifecycle summary per request. When VIX_STREAM_DEBUG=1 it also wraps the response body to count bytes and log when the first body byte arrives.
This is the provider-agnostic replacement for streamDebugMiddleware (formerly typed for anthropic-sdk-go's option.Middleware).
func NewPluginHTTPClient ¶
func NewPluginHTTPClient(pc PluginConfig) *http.Client
NewPluginHTTPClient returns an *http.Client whose Transport applies the plugin's header set/strip rules to every outgoing request, then delegates to the shared transport. The lifecycle-logging transport is composed on the outside (see NewLoggingTransport) so the log ordering is:
request → loggingTransport → headerStripperTransport → sharedHTTPTransport
Returns SharedHTTPClient() unchanged when pc has no headers.
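The header set/strip step can be sketched as a RoundTripper that rewrites a clone of each request before delegating. The type names here are hypothetical and the rules map mirrors the nil-means-strip convention of PluginConfig.Headers; a recording transport stands in for the shared transport so nothing touches the network.

```go
package main

import (
	"fmt"
	"net/http"
)

// headerTransport applies set/strip rules, then delegates to base.
// A nil value strips the header; a non-nil value sets or overrides it.
type headerTransport struct {
	rules map[string]*string
	base  http.RoundTripper
}

func (t *headerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper should not mutate the caller's request, so clone first.
	out := req.Clone(req.Context())
	for name, value := range t.rules {
		if value == nil {
			out.Header.Del(name)
		} else {
			out.Header.Set(name, *value)
		}
	}
	return t.base.RoundTrip(out)
}

// captureTransport records the headers it receives instead of sending them.
type captureTransport struct{ seen http.Header }

func (c *captureTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	c.seen = req.Header.Clone()
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
}

// demo sends one request through the wrapper and reports what arrived.
func demo() (userAgent, internal string, err error) {
	ua := "example-agent/1.0"
	capture := &captureTransport{}
	client := &http.Client{Transport: &headerTransport{
		rules: map[string]*string{"User-Agent": &ua, "X-Internal": nil},
		base:  capture,
	}}
	req, err := http.NewRequest(http.MethodGet, "http://example.invalid/v1", nil)
	if err != nil {
		return "", "", err
	}
	req.Header.Set("X-Internal", "secret")
	resp, err := client.Do(req)
	if err != nil {
		return "", "", err
	}
	resp.Body.Close()
	return capture.seen.Get("User-Agent"), capture.seen.Get("X-Internal"), nil
}

func main() {
	fmt.Println(demo())
}
```

Placing a logging wrapper outside this transport, as the ordering above describes, would mean the logger sees the request before the header rules are applied.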
func NewRequestID ¶
func NewRequestID() string
NewRequestID returns a short random hex ID for correlating one logical LLM turn. The retry layer uses "<turnID>.<attempt>" so all attempts share a prefix.
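One way such IDs could be generated and extended per attempt is sketched below. The 4-byte (8 hex character) length and the helper names are assumptions; the page only says the ID is short, random, and hex.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newRequestID returns 8 random hex characters. The length is an assumption.
func newRequestID() string {
	buf := make([]byte, 4)
	if _, err := rand.Read(buf); err != nil {
		panic(err) // the system randomness source is unavailable
	}
	return hex.EncodeToString(buf)
}

// attemptID builds the "<turnID>.<attempt>" form so that every attempt of
// one logical turn shares a prefix.
func attemptID(turnID string, attempt int) string {
	return fmt.Sprintf("%s.%d", turnID, attempt)
}

func main() {
	turn := newRequestID()
	fmt.Println(attemptID(turn, 1), attemptID(turn, 2))
}
```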
func RequestIDFromContext ¶
func RequestIDFromContext(ctx context.Context) string
RequestIDFromContext returns the request correlation ID stamped on ctx by WithRequestID, or "" if none.
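The WithRequestID / RequestIDFromContext pair follows the usual context-value pattern. The sketch below is a local reimplementation under that assumption; the unexported key type is hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// requestIDKey is an unexported key type, so no other package can collide
// with this context value.
type requestIDKey struct{}

// withRequestID stamps id on a derived context.
func withRequestID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, requestIDKey{}, id)
}

// requestIDFromContext returns the stamped ID, or "" if none was set.
func requestIDFromContext(ctx context.Context) string {
	id, _ := ctx.Value(requestIDKey{}).(string)
	return id
}

// demo looks up the ID on a stamped context and on a bare one.
func demo() (stamped, missing string) {
	base := context.Background()
	return requestIDFromContext(withRequestID(base, "ab12cd34.1")), requestIDFromContext(base)
}

func main() {
	stamped, missing := demo()
	fmt.Printf("%q %q\n", stamped, missing)
}
```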
func SharedHTTPClient ¶
func SharedHTTPClient() *http.Client
SharedHTTPClient returns the package-wide HTTP client. Adapters install per-instance wrappers (header strip/set, lifecycle logging) by composing transports on top of this client's Transport, then passing the wrapped client to their SDK via that SDK's WithHTTPClient option.
func Spec ¶
func Spec(c Client) string
Spec returns the full prefixed model spec for a Client (e.g. "anthropic/claude-opus-4-8"). Useful for cost calculation and logging where the bare Client.Model() alone is ambiguous across providers.
func StreamDebugVerbose ¶
func StreamDebugVerbose() bool
StreamDebugVerbose returns true when VIX_STREAM_DEBUG=1, enabling per-response-body-byte tracing in addition to always-on HTTP lifecycle logging.
Types ¶
type BedrockHTTPError ¶ added in v0.4.3
BedrockHTTPError is a typed error for non-2xx Bedrock HTTP responses. Exported so classifyError in the daemon layer can use errors.As for robust classification without string matching.
func (*BedrockHTTPError) Error ¶ added in v0.4.3
func (e *BedrockHTTPError) Error() string
type CacheControl ¶
type CacheControl struct {
Type string `json:"type"` // currently always "ephemeral"
}
CacheControl marks a block as eligible for the provider's prompt cache. Currently only Anthropic honors these; OpenAI/MiniMax do passive caching and adapters drop the marker. OpenRouter forwards it when routing to Anthropic-family models.
type Client ¶
type Client interface {
// StreamMessage runs a streaming request with default options.
StreamMessage(
ctx context.Context,
system []SystemBlock,
messages []MessageParam,
tools []ToolParam,
onDelta func(string),
onThinkingDelta func(string),
) (*Message, time.Duration, error)
// StreamMessageWith runs a streaming request honoring per-call
// overrides from opts (currently just EffortOverride).
StreamMessageWith(
ctx context.Context,
system []SystemBlock,
messages []MessageParam,
tools []ToolParam,
onDelta func(string),
onThinkingDelta func(string),
opts StreamOpts,
) (*Message, time.Duration, error)
// Provider identifies which upstream this client talks to.
Provider() ProviderID
// Model returns the bare model name (no provider prefix).
Model() string
// Credential returns the credential this client was built with.
Credential() config.Credential
// MaxTokens returns the per-call output token cap configured on this
// client. Zero means "use the default" (32768).
MaxTokens() int64
// Effort returns the reasoning effort configured at construction time.
Effort() string
}
Client is the provider-neutral LLM interface. One Client is bound to a single (provider, model, credential, effort, maxTokens, pluginCfg) tuple and is safe for concurrent calls (the underlying SDKs handle request locking themselves).
func NewAnthropic ¶
NewAnthropic constructs an Anthropic adapter from cfg.
func NewBedrock ¶ added in v0.4.3
func NewFromModel ¶
func NewFromModel(spec string, plugins PluginSource, effort string, maxTokens int64) (Client, error)
NewFromModel parses a vix-style model spec, resolves the right credential via config.ResolveProviderCredentialFresh, and constructs the matching adapter by dispatching on the provider's wire_format. All endpoint/header/query data comes from the providers registry (providers.json).
type Config ¶
type Config struct {
Credential config.Credential
Model string // bare model name (no provider prefix)
Effort string // "", "low", "medium", "high", "max", "adaptive"
MaxTokens int64 // 0 = use DefaultMaxTokens
PluginCfg PluginConfig
HTTPClient *http.Client // optional override; nil = use NewPluginHTTPClient(PluginCfg)
// BaseURL overrides the adapter's default API endpoint. Empty means
// use the provider's default. Set from a credential's endpoint override
// (e.g. the Codex backend) or by tests redirecting to httptest servers.
BaseURL string
StreamIdle time.Duration // 0 = read from env or use DefaultStreamIdleTimeout
ThinkingStall time.Duration // 0 = read from env or use DefaultThinkingStallTimeout
}
Config is the shared input set every wire builder takes.
type ContentBlock ¶
type ContentBlock struct {
Type ContentBlockType `json:"type"`
Text string `json:"text,omitempty"` // BlockText, BlockThinking
ID string `json:"id,omitempty"` // BlockToolUse
Name string `json:"name,omitempty"` // BlockToolUse
Input map[string]any `json:"input,omitempty"` // BlockToolUse — already-parsed; never a raw JSON string
ToolUseID string `json:"tool_use_id,omitempty"` // BlockToolResult
Output string `json:"output,omitempty"` // BlockToolResult
IsError bool `json:"is_error,omitempty"` // BlockToolResult
MediaType string `json:"media_type,omitempty"` // BlockImage (e.g. "image/png")
Data string `json:"data,omitempty"` // BlockImage (base64-encoded payload)
Signature string `json:"signature,omitempty"` // BlockThinking — Anthropic signature or OpenAI reasoning-item ID
CacheControl *CacheControl `json:"cache_control,omitempty"`
}
ContentBlock is one element of message content. The fields used depend on Type — see the const docs for each variant.
func NewImageBlock ¶
func NewImageBlock(mediaType, data string) ContentBlock
NewImageBlock builds an image content block. data is the base64-encoded payload.
func NewTextBlock ¶
func NewTextBlock(text string) ContentBlock
NewTextBlock builds a text content block.
func NewThinkingBlock ¶
func NewThinkingBlock(text, signature string) ContentBlock
NewThinkingBlock builds an assistant thinking block. signature is the Anthropic block signature (or the OpenAI reasoning-item ID) used to re-feed the block on the next turn.
func NewToolResultBlock ¶
func NewToolResultBlock(toolUseID, output string, isError bool) ContentBlock
NewToolResultBlock builds a user tool_result block.
func NewToolUseBlock ¶
func NewToolUseBlock(id, name string, input map[string]any) ContentBlock
NewToolUseBlock builds an assistant tool_use block.
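To show how the omitempty tags keep each variant's JSON compact, here is a sketch using a trimmed local copy of the struct. Only a subset of the fields is reproduced, and the constructors are reimplemented on the assumption that they just fill the fields for their variant.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// contentBlock is a trimmed stand-in for ContentBlock with the same tags.
type contentBlock struct {
	Type  string         `json:"type"`
	Text  string         `json:"text,omitempty"`
	ID    string         `json:"id,omitempty"`
	Name  string         `json:"name,omitempty"`
	Input map[string]any `json:"input,omitempty"`
}

// newTextBlock fills only the fields used by the text variant.
func newTextBlock(text string) contentBlock {
	return contentBlock{Type: "text", Text: text}
}

// newToolUseBlock fills only the fields used by the tool_use variant.
func newToolUseBlock(id, name string, input map[string]any) contentBlock {
	return contentBlock{Type: "tool_use", ID: id, Name: name, Input: input}
}

// encode marshals a block, returning the error text on failure.
func encode(b contentBlock) string {
	raw, err := json.Marshal(b)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(raw)
}

func main() {
	fmt.Println(encode(newTextBlock("hi")))
	fmt.Println(encode(newToolUseBlock("call_1", "read_file", map[string]any{"path": "a.txt"})))
}
```

Fields that do not belong to a variant stay at their zero value and are dropped from the output.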
type ContentBlockType ¶
type ContentBlockType string
ContentBlockType discriminates the union shape of ContentBlock.
const (
	BlockText       ContentBlockType = "text"
	BlockThinking   ContentBlockType = "thinking"
	BlockToolUse    ContentBlockType = "tool_use"
	BlockToolResult ContentBlockType = "tool_result"
	BlockImage      ContentBlockType = "image"
)
type Message ¶
type Message struct {
StopReason StopReason
TextContent string // concatenated text blocks (convenience for extractTextFromMessage)
Content []ContentBlock // full ordered content for replay
ToolCalls []ToolCall // convenience extraction; duplicates Content's tool_use blocks
Usage Usage
Raw any // raw provider response, retained for LogLLMCall debugging
}
Message is the provider-neutral result of one LLM turn.
func (*Message) ToParam ¶
func (m *Message) ToParam() MessageParam
ToParam reconstructs an assistant MessageParam from this Message so the turn can be appended to a conversation history and re-sent on the next turn. Preserves all content (text, thinking with signature, tool_use) so providers that need full round-trip (Anthropic, OpenAI Responses) keep working across turns.
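The round trip described above can be sketched with simplified stand-in types. Whether the real ToParam copies the content slice is an assumption; the point is only that thinking blocks keep their signature when the turn is appended to the history.

```go
package main

import "fmt"

// block, messageParam, and message are simplified stand-ins for
// ContentBlock, MessageParam, and Message.
type block struct {
	Type, Text, Signature string
}

type messageParam struct {
	Role    string
	Content []block
}

type message struct {
	Content []block
}

// toParam rebuilds an assistant turn from a response. The content is copied
// so later edits to the history cannot alias the response.
func (m *message) toParam() messageParam {
	content := make([]block, len(m.Content))
	copy(content, m.Content)
	return messageParam{Role: "assistant", Content: content}
}

// demoHistory appends one assistant reply and a follow-up user turn.
func demoHistory() []messageParam {
	history := []messageParam{
		{Role: "user", Content: []block{{Type: "text", Text: "What is 2+2?"}}},
	}
	reply := &message{Content: []block{
		{Type: "thinking", Text: "simple arithmetic", Signature: "sig-123"},
		{Type: "text", Text: "4"},
	}}
	history = append(history, reply.toParam())
	history = append(history, messageParam{
		Role:    "user",
		Content: []block{{Type: "text", Text: "And 3+3?"}},
	})
	return history
}

func main() {
	for _, turn := range demoHistory() {
		fmt.Println(turn.Role, len(turn.Content))
	}
}
```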
type MessageParam ¶
type MessageParam struct {
Role Role `json:"role"`
Content []ContentBlock `json:"content"`
}
MessageParam is one turn in the conversation history.
func NewAssistantMessage ¶
func NewAssistantMessage(blocks ...ContentBlock) MessageParam
NewAssistantMessage builds a MessageParam with role=assistant from the given blocks.
func NewUserMessage ¶
func NewUserMessage(blocks ...ContentBlock) MessageParam
NewUserMessage builds a MessageParam with role=user from the given blocks.
type PluginConfig ¶
type PluginConfig struct {
// Headers maps HTTP header name → value. A nil pointer value means
// "strip this header from every outgoing API request". A non-nil
// pointer means "set (or override) this header to the given string".
Headers map[string]*string `json:"headers"`
// SystemPrefix is prepended as the first system-prompt text block on
// every StreamMessage call. Empty means no-op.
SystemPrefix string `json:"system_prefix"`
}
PluginConfig is the merged output of running all discovered .vix/plugins/ executables on daemon startup. The plugin loader lives in package daemon; the struct lives here so every adapter can apply it without importing daemon (which would cycle).
type PluginSource ¶ added in v0.5.0
type PluginSource func(provider, model string, cred config.Credential) PluginConfig
PluginSource produces the PluginConfig to apply to a client being built for the given provider id, bare model name, and resolved credential. It runs at client-construction time so plugins can react to the actual target provider and credential kind (e.g. only spoof headers for anthropic + OAuth). A nil PluginSource means no plugins.
type ProviderID ¶
type ProviderID string
ProviderID identifies one of the supported upstream providers.
const (
	ProviderAnthropic  ProviderID = "anthropic"
	ProviderBedrock    ProviderID = "bedrock"
	ProviderOpenAI     ProviderID = "openai"
	ProviderOpenRouter ProviderID = "openrouter"
	ProviderMiniMax    ProviderID = "minimax"
	ProviderMiMo       ProviderID = "mimo"
)
func ParseModel ¶
func ParseModel(spec string) (ProviderID, string, error)
ParseModel maps a vix-style model spec (with a mandatory provider prefix) to (provider id, bare model name) via the providers registry; the first matching prefix wins. A bare, unprefixed name returns an explicit error. ParseModel is a thin wrapper over providers.Default().ParseModel so that existing callers keep the ProviderID return type.
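First-match prefix parsing might be sketched as follows. The `registry` slice is a hypothetical stand-in for the providers registry, and the "provider/" prefix shape is inferred from the example spec on this page rather than from providers.json.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixEntry pairs a spec prefix with a provider id.
type prefixEntry struct {
	prefix, provider string
}

// registry is a hypothetical, ordered stand-in for the providers registry.
var registry = []prefixEntry{
	{"anthropic/", "anthropic"},
	{"openai/", "openai"},
	{"openrouter/", "openrouter"},
}

// parseModel returns the provider and bare model name for the first
// registry prefix that matches, or an error for unprefixed specs.
func parseModel(spec string) (provider, model string, err error) {
	for _, e := range registry {
		if strings.HasPrefix(spec, e.prefix) {
			return e.provider, strings.TrimPrefix(spec, e.prefix), nil
		}
	}
	return "", "", fmt.Errorf("model spec %q has no known provider prefix", spec)
}

func main() {
	fmt.Println(parseModel("openrouter/vendor/model-x"))
	fmt.Println(parseModel("model-x"))
}
```

Only the leading prefix is consumed, so a spec such as "openrouter/vendor/model-x" keeps "vendor/model-x" as its bare model name.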
func Providers ¶ added in v0.4.0
func Providers() []ProviderID
Providers returns every supported provider id, in registry order.
func (ProviderID) CredentialName ¶
func (p ProviderID) CredentialName() string
CredentialName returns the name used for credential resolution and keyring lookups for this provider.
type StopReason ¶
type StopReason string
StopReason is the normalized reason the model stopped producing output. Adapters map provider-specific values into this enum.
const (
	StopEndTurn       StopReason = "end_turn"
	StopToolUse       StopReason = "tool_use"
	StopMaxTokens     StopReason = "max_tokens"
	StopStopSequence  StopReason = "stop_sequence"
	StopContentFilter StopReason = "content_filter"
	StopError         StopReason = "error"
	StopOther         StopReason = "other"
)
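As an illustration of the normalization step, the sketch below maps a few finish-reason strings into the enum. The input strings ("stop", "length", "tool_calls", "content_filter") are illustrative values in the style of OpenAI-compatible chat APIs; the actual adapters may see and map different values.

```go
package main

import "fmt"

// stopReason is a local stand-in for StopReason.
type stopReason string

const (
	stopEndTurn       stopReason = "end_turn"
	stopToolUse       stopReason = "tool_use"
	stopMaxTokens     stopReason = "max_tokens"
	stopContentFilter stopReason = "content_filter"
	stopOther         stopReason = "other"
)

// normalize maps an illustrative provider-specific finish reason into the
// shared enum. Unknown values fall through to stopOther.
func normalize(providerValue string) stopReason {
	switch providerValue {
	case "stop":
		return stopEndTurn
	case "tool_calls":
		return stopToolUse
	case "length":
		return stopMaxTokens
	case "content_filter":
		return stopContentFilter
	default:
		return stopOther
	}
}

func main() {
	for _, v := range []string{"stop", "length", "something_new"} {
		fmt.Println(v, "->", normalize(v))
	}
}
```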
type StreamOpts ¶
type StreamOpts struct {
// EffortOverride, when non-nil, replaces Client.Effort() for this call
// only. Empty string disables reasoning entirely. Used by the retry
// loops to force a non-thinking response on the final attempt after
// repeated thinking stalls.
EffortOverride *string
}
StreamOpts carries per-call overrides for StreamMessageWith. The zero value preserves the instance-level defaults.
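The pointer type is what lets "no override" and "override with the empty string" be told apart. A minimal sketch, with `effectiveEffort` as a hypothetical helper:

```go
package main

import "fmt"

// streamOpts is a local stand-in for StreamOpts.
type streamOpts struct {
	EffortOverride *string
}

// effectiveEffort returns the override when one is set, including the empty
// string (reasoning disabled), and the client's effort otherwise.
func effectiveEffort(clientEffort string, opts streamOpts) string {
	if opts.EffortOverride != nil {
		return *opts.EffortOverride
	}
	return clientEffort
}

// demo resolves the effort for the zero value and for an empty override.
func demo() (defaulted, disabled string) {
	none := ""
	return effectiveEffort("high", streamOpts{}),
		effectiveEffort("high", streamOpts{EffortOverride: &none})
}

func main() {
	defaulted, disabled := demo()
	fmt.Printf("%q %q\n", defaulted, disabled)
}
```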
type SystemBlock ¶
type SystemBlock struct {
Text string
CacheControl *CacheControl
}
SystemBlock is one block of the system prompt.
type ThinkingStallError ¶
ThinkingStallError is returned when a single reasoning/thinking block runs past the stall timeout. Summary holds the text collected from thinking-delta events so the retry layer can feed it back to the model on the next attempt. Only adapters that surface discrete reasoning events (Anthropic, OpenAI Responses) can produce this error.
func (*ThinkingStallError) Error ¶
func (e *ThinkingStallError) Error() string
func (*ThinkingStallError) Unwrap ¶
func (e *ThinkingStallError) Unwrap() error
type ToolCall ¶
ToolCall is one tool invocation extracted from the model's response. Duplicates the BlockToolUse entries in Message.Content for convenience.
type ToolParam ¶
type ToolParam struct {
Name string
Description string
InputSchema map[string]any // raw JSON Schema object
}
ToolParam describes one tool exposed to the model.
type Usage ¶
type Usage struct {
InputTokens int64
OutputTokens int64
CacheCreationTokens int64
CacheReadTokens int64
ReasoningTokens int64 // openai o-series, gpt-5-thinking
CostUSD float64 // openrouter when usage.include=true
ProviderExtra map[string]any // raw provider blob for future fields
}
Usage holds token counts and provider extras from one LLM response.