package llm

v0.5.0
Published: Jun 14, 2026 License: AGPL-3.0 Imports: 30 Imported by: 0

Documentation

Overview

Package llm provides a provider-neutral interface for LLM interactions across Anthropic, Bedrock, OpenAI (Responses API), OpenRouter, MiniMax, and Xiaomi MiMo.

Constants

const DefaultMaxTokens int64 = 32768

DefaultMaxTokens is the fallback per-call output token cap when a Client has no explicit MaxTokens override.

const DefaultStreamIdleTimeout = 60 * time.Second

DefaultStreamIdleTimeout bounds the silence between any two SSE events.

const DefaultThinkingStallTimeout = 120 * time.Second

DefaultThinkingStallTimeout bounds the time spent inside a single reasoning/thinking block. Past this, the adapter cancels the stream and the retry layer nudges the model to conclude.

Variables

var ErrNoCredential = errors.New("no credential")

ErrNoCredential is wrapped by NewFromModel when the selected model's provider has no resolvable credential. Callers use errors.Is to distinguish this (recoverable: the user just needs to add a key) from other construction failures such as an unknown provider prefix.
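The errors.Is check described above can be sketched as follows. This is a minimal illustration, not the package's code: errNoCredential is a local stand-in for ErrNoCredential, and newClient is a hypothetical constructor that wraps the sentinel the way NewFromModel is documented to.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoCredential is a local stand-in for llm.ErrNoCredential.
var errNoCredential = errors.New("no credential")

// newClient is a hypothetical constructor (not part of the package). It
// wraps the sentinel with %w instead of returning it bare, which is what
// "wrapped by NewFromModel" implies.
func newClient(spec string, haveKey bool) error {
	if !haveKey {
		return fmt.Errorf("build client for %q: %w", spec, errNoCredential)
	}
	return nil
}

// recoverable reports whether the failure can be fixed by adding a key.
func recoverable(err error) bool {
	return errors.Is(err, errNoCredential)
}

func main() {
	err := newClient("anthropic/claude-opus-4-8", false)
	fmt.Println(err)
	fmt.Println("recoverable:", recoverable(err))
}
```

Comparing with == would miss the wrapped error, which is why the docs point callers at errors.Is.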

var ErrStreamIdleTimeout = errors.New("stream idle timeout")

ErrStreamIdleTimeout is returned when no SSE events arrive within the idle-timeout window. Anthropic sends ping events every ~15-30s even during extended thinking, and OpenAI streams typed events at similar cadence, so prolonged silence indicates a dead connection.
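One common way to enforce an idle window is a timer that is reset on every event. The sketch below assumes events arrive on a channel; the adapters' real SSE plumbing is not shown in these docs, so drain, demo, and errStreamIdleTimeout are illustrative names only.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStreamIdleTimeout is a local stand-in for llm.ErrStreamIdleTimeout.
var errStreamIdleTimeout = errors.New("stream idle timeout")

// drain collects events until the channel closes, or fails once no event
// has arrived for a full idle window.
func drain(events <-chan string, idle time.Duration) ([]string, error) {
	timer := time.NewTimer(idle)
	defer timer.Stop()
	var got []string
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return got, nil // stream ended normally
			}
			got = append(got, ev)
			// Restart the idle window; drain a stale tick if one is pending.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(idle)
		case <-timer.C:
			return got, errStreamIdleTimeout
		}
	}
}

// demo feeds two events and then either closes the stream or goes silent.
func demo(closeStream bool) (int, error) {
	events := make(chan string, 2)
	events <- "ping"
	events <- "content_block_delta"
	if closeStream {
		close(events)
	}
	got, err := drain(events, 50*time.Millisecond)
	return len(got), err
}

func main() {
	n, err := demo(false)
	fmt.Println(n, err)
}
```

Because keep-alive pings count as events, a healthy but slow stream keeps resetting the timer and only true silence trips it.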

var ErrThinkingStall = errors.New("thinking stall")

ErrThinkingStall is the sentinel unwrap target for ThinkingStallError. Callers use errors.Is(err, ErrThinkingStall) to branch on the stall path; errors.As extracts the concrete ThinkingStallError when the accumulated summary text is needed.
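The two-step pattern (errors.Is to branch, errors.As to extract) might look like the sketch below. The types are local stand-ins that mirror the documented fields; the Error() text and the stallSummary helper are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errThinkingStall is a local stand-in for llm.ErrThinkingStall.
var errThinkingStall = errors.New("thinking stall")

// thinkingStallError mirrors the documented ThinkingStallError fields.
type thinkingStallError struct {
	Elapsed time.Duration
	Summary string
}

// Error uses an assumed message format; the real text is not documented.
func (e *thinkingStallError) Error() string {
	return fmt.Sprintf("thinking stall after %s", e.Elapsed)
}

// Unwrap exposes the sentinel so errors.Is can match it.
func (e *thinkingStallError) Unwrap() error { return errThinkingStall }

// stallSummary reports whether err is a stall and, if so, returns the
// accumulated thinking text.
func stallSummary(err error) (string, bool) {
	if !errors.Is(err, errThinkingStall) {
		return "", false
	}
	var stall *thinkingStallError
	if errors.As(err, &stall) {
		return stall.Summary, true
	}
	return "", true
}

// demoErr simulates a retry layer wrapping the concrete error.
func demoErr() error {
	return fmt.Errorf("attempt 1: %w", &thinkingStallError{
		Elapsed: 2 * time.Minute,
		Summary: "partial reasoning",
	})
}

func main() {
	summary, stalled := stallSummary(demoErr())
	fmt.Println(stalled, summary)
}
```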

Functions

func CloseIdleHTTPConnections

func CloseIdleHTTPConnections()

CloseIdleHTTPConnections drops all pooled connections in the shared transport. It is called between retries so that a poisoned connection is not reused on the next attempt.

func DefaultEffortFromSpec

func DefaultEffortFromSpec(spec string) string

DefaultEffortFromSpec returns the default reasoning effort for the given model spec, per the provider's effort policy. Anthropic and MiniMax default to "adaptive"; the OpenAI-style providers default to "medium" for reasoning-capable models and "" otherwise.

func DurStr

func DurStr(a, b time.Time) string

DurStr formats the time between a and b as "<n>ms", or "—" when either is zero (event never observed).
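A function with this contract could be written as below. This is a sketch of the documented behavior, not the package source; durStr and demo are local names, and the placeholder is spelled as a Unicode escape.

```go
package main

import (
	"fmt"
	"time"
)

// durStr formats b-a in whole milliseconds, or returns the placeholder
// when either timestamp was never recorded (zero time.Time).
func durStr(a, b time.Time) string {
	if a.IsZero() || b.IsZero() {
		return "\u2014"
	}
	return fmt.Sprintf("%dms", b.Sub(a).Milliseconds())
}

// demo returns one formatted interval and one "never observed" result.
func demo() (string, string) {
	start := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	return durStr(start, start.Add(1500*time.Millisecond)), durStr(start, time.Time{})
}

func main() {
	observed, missing := demo()
	fmt.Println(observed, missing)
}
```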

func EnvDuration

func EnvDuration(name string, fallback time.Duration) time.Duration

EnvDuration reads a Go duration (e.g. "30s", "2m") from the given env var. Returns fallback when unset; logs and falls back on parse error.
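The read, parse, and fall back behavior can be sketched with the standard library alone. The variable name EXAMPLE_STREAM_IDLE is hypothetical, and treating an empty value the same as unset is an assumption the docs do not confirm.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"time"
)

// envDuration is a local sketch of the documented EnvDuration contract.
func envDuration(name string, fallback time.Duration) time.Duration {
	raw, ok := os.LookupEnv(name)
	if !ok || raw == "" { // assumption: empty is treated like unset
		return fallback
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		log.Printf("%s=%q is not a valid duration, using %s: %v", name, raw, fallback, err)
		return fallback
	}
	return d
}

// demo sets a hypothetical variable, reads it back with a 60s fallback,
// and returns the result in Duration.String form.
func demo(value string) string {
	const name = "EXAMPLE_STREAM_IDLE" // hypothetical variable name
	os.Setenv(name, value)
	defer os.Unsetenv(name)
	return envDuration(name, 60*time.Second).String()
}

func main() {
	fmt.Println(demo("2m"), demo("soon"))
}
```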

func NewLoggingTransport

func NewLoggingTransport(base http.RoundTripper) http.RoundTripper

NewLoggingTransport returns a RoundTripper that attaches an httptrace.ClientTrace and logs a compact lifecycle summary per request. When VIX_STREAM_DEBUG=1 it also wraps the response body to count bytes and log when the first body byte arrives.

This is the provider-agnostic replacement for streamDebugMiddleware (formerly typed for anthropic-sdk-go's option.Middleware).

func NewPluginHTTPClient

func NewPluginHTTPClient(pc PluginConfig) *http.Client

NewPluginHTTPClient returns an *http.Client whose Transport applies the plugin's header set/strip rules to every outgoing request, then delegates to the shared transport. The lifecycle-logging transport is composed on the outside (see NewLoggingTransport) so the log ordering is:

request → loggingTransport → headerStripperTransport → sharedHTTPTransport

Returns SharedHTTPClient() unchanged when pc has no headers.
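The header set/strip layer in that chain can be approximated with a small http.RoundTripper wrapper. Everything below is illustrative: headerRuleTransport and roundTripperFunc are local names, the in-memory base transport stands in for the shared transport, and the header values are made up.

```go
package main

import (
	"fmt"
	"net/http"
)

// roundTripperFunc adapts a function to http.RoundTripper.
type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// headerRuleTransport applies set/strip rules, then delegates to next.
// A nil value strips the header; a non-nil value sets or overrides it.
type headerRuleTransport struct {
	rules map[string]*string
	next  http.RoundTripper
}

func (t headerRuleTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	out := req.Clone(req.Context()) // a RoundTripper must not mutate the caller's request
	for name, value := range t.rules {
		if value == nil {
			out.Header.Del(name)
		} else {
			out.Header.Set(name, *value)
		}
	}
	return t.next.RoundTrip(out)
}

// demo sends one request through the wrapper and returns the X-Debug and
// User-Agent values that reached the base transport.
func demo() (string, string, error) {
	var seen http.Header
	base := roundTripperFunc(func(r *http.Request) (*http.Response, error) {
		seen = r.Header.Clone()
		return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Header: http.Header{}}, nil
	})
	agent := "example-agent/1.0"
	client := &http.Client{Transport: headerRuleTransport{
		rules: map[string]*string{"X-Debug": nil, "User-Agent": &agent},
		next:  base,
	}}
	req, err := http.NewRequest(http.MethodGet, "http://example.invalid/v1", nil)
	if err != nil {
		return "", "", err
	}
	req.Header.Set("X-Debug", "1")
	resp, err := client.Do(req)
	if err != nil {
		return "", "", err
	}
	resp.Body.Close()
	return seen.Get("X-Debug"), seen.Get("User-Agent"), nil
}

func main() {
	stripped, set, err := demo()
	fmt.Printf("X-Debug=%q User-Agent=%q err=%v\n", stripped, set, err)
}
```

Wrapping a logging transport around this one would reproduce the documented ordering, with the logger seeing the request before the header rules are applied.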

func NewRequestID

func NewRequestID() string

NewRequestID returns a short random hex ID for correlating one logical LLM turn. The retry layer uses "<turnID>.<attempt>" so all attempts share a prefix.

func RequestIDFromContext

func RequestIDFromContext(ctx context.Context) string

RequestIDFromContext returns the request correlation ID stamped on ctx by WithRequestID, or "" if none.

func SharedHTTPClient

func SharedHTTPClient() *http.Client

SharedHTTPClient returns the package-wide HTTP client. Adapters install per-instance wrappers (header strip/set, lifecycle logging) by composing transports on top of this client's Transport, then passing the wrapped client to their SDK via that SDK's WithHTTPClient option.

func Spec

func Spec(c Client) string

Spec returns the full prefixed model spec for a Client (e.g. "anthropic/claude-opus-4-8"). Useful for cost calculation and logging where the bare Client.Model() alone is ambiguous across providers.

func StreamDebugVerbose

func StreamDebugVerbose() bool

StreamDebugVerbose returns true when VIX_STREAM_DEBUG=1, enabling per-response-body-byte tracing in addition to always-on HTTP lifecycle logging.

func WithRequestID

func WithRequestID(ctx context.Context, id string) context.Context

WithRequestID returns a new context that carries the given request correlation ID. The HTTP transport stamps this ID on lifecycle log lines so all events for one attempt can be grepped together.
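Taken together, NewRequestID, WithRequestID, and RequestIDFromContext follow the usual context-value pattern. The sketch below uses local lowercase stand-ins; the 4-byte ID length and the attemptID helper are assumptions, since the docs only say "short random hex" and give the "<turnID>.<attempt>" shape.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// requestIDKey is an unexported key type so other packages cannot collide.
type requestIDKey struct{}

// newRequestID returns 8 hex characters (the length is an assumption).
func newRequestID() string {
	buf := make([]byte, 4)
	if _, err := rand.Read(buf); err != nil {
		panic(err)
	}
	return hex.EncodeToString(buf)
}

func withRequestID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, requestIDKey{}, id)
}

// requestIDFromContext returns "" when no ID was stamped on ctx.
func requestIDFromContext(ctx context.Context) string {
	id, _ := ctx.Value(requestIDKey{}).(string)
	return id
}

// attemptID builds the per-attempt ID so all attempts share the turn prefix.
func attemptID(turnID string, attempt int) string {
	return fmt.Sprintf("%s.%d", turnID, attempt)
}

// roundTrip stores id on a fresh context and reads it back.
func roundTrip(id string) string {
	return requestIDFromContext(withRequestID(context.Background(), id))
}

// emptyLookup reads from a context that never had an ID attached.
func emptyLookup() string { return requestIDFromContext(context.Background()) }

func main() {
	turn := newRequestID()
	for attempt := 1; attempt <= 2; attempt++ {
		fmt.Println(roundTrip(attemptID(turn, attempt)))
	}
}
```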

Types

type BedrockHTTPError added in v0.4.3

type BedrockHTTPError struct {
	Code int
	Msg  string
}

BedrockHTTPError is a typed error for non-2xx Bedrock HTTP responses. It is exported so that classifyError in the daemon layer can classify failures with errors.As instead of string matching.

func (*BedrockHTTPError) Error added in v0.4.3

func (e *BedrockHTTPError) Error() string

type CacheControl

type CacheControl struct {
	Type string `json:"type"` // currently always "ephemeral"
}

CacheControl marks a block as eligible for the provider's prompt cache. Currently only Anthropic honors these; OpenAI/MiniMax do passive caching and adapters drop the marker. OpenRouter forwards it when routing to Anthropic-family models.

type Client

type Client interface {
	// StreamMessage runs a streaming request with default options.
	StreamMessage(
		ctx context.Context,
		system []SystemBlock,
		messages []MessageParam,
		tools []ToolParam,
		onDelta func(string),
		onThinkingDelta func(string),
	) (*Message, time.Duration, error)

	// StreamMessageWith runs a streaming request honoring per-call
	// overrides from opts (currently just EffortOverride).
	StreamMessageWith(
		ctx context.Context,
		system []SystemBlock,
		messages []MessageParam,
		tools []ToolParam,
		onDelta func(string),
		onThinkingDelta func(string),
		opts StreamOpts,
	) (*Message, time.Duration, error)

	// Provider identifies which upstream this client talks to.
	Provider() ProviderID

	// Model returns the bare model name (no provider prefix).
	Model() string

	// Credential returns the credential this client was built with.
	Credential() config.Credential

	// MaxTokens returns the per-call output token cap configured on this
	// client. Zero means "use the default" (32768).
	MaxTokens() int64

	// Effort returns the reasoning effort configured at construction time.
	Effort() string
}

Client is the provider-neutral LLM interface. One Client is bound to a single (provider, model, credential, effort, maxTokens, pluginCfg) tuple and is safe for concurrent calls (the underlying SDKs handle request locking themselves).

func NewAnthropic

func NewAnthropic(cfg Config) (Client, error)

NewAnthropic constructs an Anthropic adapter from cfg.

func NewBedrock added in v0.4.3

func NewBedrock(cfg Config) (Client, error)

func NewFromModel

func NewFromModel(spec string, plugins PluginSource, effort string, maxTokens int64) (Client, error)

NewFromModel parses a vix-style model spec, resolves the right credential via config.ResolveProviderCredentialFresh, and constructs the matching adapter by dispatching on the provider's wire_format. All endpoint/header/query data comes from the providers registry (providers.json).

func NewOpenAI

func NewOpenAI(cfg Config) (Client, error)

NewOpenAI constructs the OpenAI Responses adapter.

type Config

type Config struct {
	Credential config.Credential
	Model      string // bare model name (no provider prefix)
	Effort     string // "", "low", "medium", "high", "max", "adaptive"
	MaxTokens  int64  // 0 = use DefaultMaxTokens
	PluginCfg  PluginConfig
	HTTPClient *http.Client // optional override; nil = use NewPluginHTTPClient(PluginCfg)

	// BaseURL overrides the adapter's default API endpoint. Empty means
	// use the provider's default. Set from a credential's endpoint override
	// (e.g. the Codex backend) or by tests redirecting to httptest servers.
	BaseURL string

	StreamIdle    time.Duration // 0 = read from env or use DefaultStreamIdleTimeout
	ThinkingStall time.Duration // 0 = read from env or use DefaultThinkingStallTimeout
}

Config is the shared input set every wire builder takes.

type ContentBlock

type ContentBlock struct {
	Type         ContentBlockType `json:"type"`
	Text         string           `json:"text,omitempty"`        // BlockText, BlockThinking
	ID           string           `json:"id,omitempty"`          // BlockToolUse
	Name         string           `json:"name,omitempty"`        // BlockToolUse
	Input        map[string]any   `json:"input,omitempty"`       // BlockToolUse — already-parsed; never a raw JSON string
	ToolUseID    string           `json:"tool_use_id,omitempty"` // BlockToolResult
	Output       string           `json:"output,omitempty"`      // BlockToolResult
	IsError      bool             `json:"is_error,omitempty"`    // BlockToolResult
	MediaType    string           `json:"media_type,omitempty"`  // BlockImage (e.g. "image/png")
	Data         string           `json:"data,omitempty"`        // BlockImage (base64-encoded payload)
	Signature    string           `json:"signature,omitempty"`   // BlockThinking — Anthropic signature or OpenAI reasoning-item ID
	CacheControl *CacheControl    `json:"cache_control,omitempty"`
}

ContentBlock is one element of message content. The fields used depend on Type — see the const docs for each variant.

func NewImageBlock

func NewImageBlock(mediaType, data string) ContentBlock

NewImageBlock builds an image content block. data is the base64-encoded payload.

func NewTextBlock

func NewTextBlock(text string) ContentBlock

NewTextBlock builds a text content block.

func NewThinkingBlock

func NewThinkingBlock(text, signature string) ContentBlock

NewThinkingBlock builds an assistant thinking block. signature is the Anthropic block signature (or the OpenAI reasoning-item ID) used to re-feed the block on the next turn.

func NewToolResultBlock

func NewToolResultBlock(toolUseID, output string, isError bool) ContentBlock

NewToolResultBlock builds a user tool_result block.

func NewToolUseBlock

func NewToolUseBlock(id, name string, input map[string]any) ContentBlock

NewToolUseBlock builds an assistant tool_use block.
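To show how a tool call and its result pair up on the wire, here is a sketch using a local stand-in that copies a subset of the ContentBlock fields and JSON tags. The constructors are assumed to set only the fields their variant uses, and the tool name and ID are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// contentBlock copies a subset of the documented ContentBlock fields.
type contentBlock struct {
	Type      string         `json:"type"`
	Text      string         `json:"text,omitempty"`
	ID        string         `json:"id,omitempty"`
	Name      string         `json:"name,omitempty"`
	Input     map[string]any `json:"input,omitempty"`
	ToolUseID string         `json:"tool_use_id,omitempty"`
	Output    string         `json:"output,omitempty"`
	IsError   bool           `json:"is_error,omitempty"`
}

// newToolUseBlock mirrors NewToolUseBlock (assistant side).
func newToolUseBlock(id, name string, input map[string]any) contentBlock {
	return contentBlock{Type: "tool_use", ID: id, Name: name, Input: input}
}

// newToolResultBlock mirrors NewToolResultBlock (user side). toolUseID
// must equal the ID of the tool_use block it answers.
func newToolResultBlock(toolUseID, output string, isError bool) contentBlock {
	return contentBlock{Type: "tool_result", ToolUseID: toolUseID, Output: output, IsError: isError}
}

// encode marshals a block; omitempty drops the fields a variant leaves unset.
func encode(b contentBlock) string {
	raw, err := json.Marshal(b)
	if err != nil {
		panic(err)
	}
	return string(raw)
}

func main() {
	call := newToolUseBlock("toolu_1", "read_file", map[string]any{"path": "go.mod"})
	fmt.Println(encode(call))
	fmt.Println(encode(newToolResultBlock(call.ID, "module example", false)))
}
```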

type ContentBlockType

type ContentBlockType string

ContentBlockType discriminates the union shape of ContentBlock.

const (
	BlockText       ContentBlockType = "text"
	BlockThinking   ContentBlockType = "thinking"
	BlockToolUse    ContentBlockType = "tool_use"
	BlockToolResult ContentBlockType = "tool_result"
	BlockImage      ContentBlockType = "image"
)

type Message

type Message struct {
	StopReason  StopReason
	TextContent string         // concatenated text blocks (convenience for extractTextFromMessage)
	Content     []ContentBlock // full ordered content for replay
	ToolCalls   []ToolCall     // convenience extraction; duplicates Content's tool_use blocks
	Usage       Usage
	Raw         any // raw provider response, retained for LogLLMCall debugging
}

Message is the provider-neutral result of one LLM turn.

func (*Message) ToParam

func (m *Message) ToParam() MessageParam

ToParam reconstructs an assistant MessageParam from this Message so the turn can be appended to a conversation history and re-sent on the next turn. Preserves all content (text, thinking with signature, tool_use) so providers that need full round-trip (Anthropic, OpenAI Responses) keep working across turns.
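Appending a reply to the history could look like this. The types are trimmed local stand-ins, and toParam is a guess at the behavior described above (assistant role, content copied in order), not the package's implementation.

```go
package main

import "fmt"

type role string

// contentBlock keeps only the fields this sketch needs.
type contentBlock struct {
	Type      string
	Text      string
	Signature string
}

type messageParam struct {
	Role    role
	Content []contentBlock
}

type message struct {
	Content []contentBlock
}

// toParam rebuilds an assistant turn, copying the blocks so later edits
// to the history cannot alias the original message.
func (m *message) toParam() messageParam {
	return messageParam{
		Role:    "assistant",
		Content: append([]contentBlock(nil), m.Content...),
	}
}

// demo appends one assistant reply to a one-message history.
func demo() []messageParam {
	history := []messageParam{
		{Role: "user", Content: []contentBlock{{Type: "text", Text: "hi"}}},
	}
	reply := &message{Content: []contentBlock{
		{Type: "thinking", Text: "greeting", Signature: "sig-123"},
		{Type: "text", Text: "hello"},
	}}
	return append(history, reply.toParam())
}

func main() {
	for _, turn := range demo() {
		fmt.Println(turn.Role, len(turn.Content))
	}
}
```

Keeping the thinking block and its signature in the replayed turn is the point: dropping it would break providers that need the full round trip.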

type MessageParam

type MessageParam struct {
	Role    Role           `json:"role"`
	Content []ContentBlock `json:"content"`
}

MessageParam is one turn in the conversation history.

func NewAssistantMessage

func NewAssistantMessage(blocks ...ContentBlock) MessageParam

NewAssistantMessage builds a MessageParam with role=assistant from the given blocks.

func NewUserMessage

func NewUserMessage(blocks ...ContentBlock) MessageParam

NewUserMessage builds a MessageParam with role=user from the given blocks.

type PluginConfig

type PluginConfig struct {
	// Headers maps HTTP header name → value. A nil pointer value means
	// "strip this header from every outgoing API request". A non-nil
	// pointer means "set (or override) this header to the given string".
	Headers map[string]*string `json:"headers"`

	// SystemPrefix is prepended as the first system-prompt text block on
	// every StreamMessage call. Empty means no-op.
	SystemPrefix string `json:"system_prefix"`
}

PluginConfig is the merged output of running all discovered .vix/plugins/ executables on daemon startup. The plugin loader lives in package daemon; the struct lives here so every adapter can apply it without importing daemon (which would cycle).
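The nil-versus-non-nil distinction in Headers maps naturally onto JSON null. The sketch below assumes plugin output is JSON shaped like the struct tags suggest, which these docs do not state outright; pluginConfig and describe are local names and the header values are invented.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// pluginConfig copies the documented PluginConfig fields and tags.
type pluginConfig struct {
	Headers      map[string]*string `json:"headers"`
	SystemPrefix string             `json:"system_prefix"`
}

// describe decodes raw and lists the header rules in name order.
func describe(raw string) ([]string, error) {
	var pc pluginConfig
	if err := json.Unmarshal([]byte(raw), &pc); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(pc.Headers))
	for name := range pc.Headers {
		names = append(names, name)
	}
	sort.Strings(names)
	rules := make([]string, 0, len(names))
	for _, name := range names {
		if v := pc.Headers[name]; v == nil {
			rules = append(rules, "strip "+name) // JSON null decodes to a nil pointer
		} else {
			rules = append(rules, "set "+name+"="+*v)
		}
	}
	return rules, nil
}

func main() {
	rules, err := describe(`{"headers":{"X-Debug":null,"User-Agent":"example/1.0"}}`)
	fmt.Println(rules, err)
}
```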

type PluginSource added in v0.5.0

type PluginSource func(provider, model string, cred config.Credential) PluginConfig

PluginSource produces the PluginConfig to apply to a client being built for the given provider id, bare model name, and resolved credential. It runs at client-construction time so plugins can react to the actual target provider and credential kind (e.g. only spoof headers for anthropic + OAuth). A nil PluginSource means no plugins.

type ProviderID

type ProviderID string

ProviderID identifies one of the supported upstream providers.

const (
	ProviderAnthropic  ProviderID = "anthropic"
	ProviderBedrock    ProviderID = "bedrock"
	ProviderOpenAI     ProviderID = "openai"
	ProviderOpenRouter ProviderID = "openrouter"
	ProviderMiniMax    ProviderID = "minimax"
	ProviderMiMo       ProviderID = "mimo"
)

func ParseModel

func ParseModel(spec string) (ProviderID, string, error)

ParseModel maps a vix-style model spec (with mandatory provider prefix) to (provider id, bare model name) via the providers registry; the first matching prefix wins. Bare unprefixed names return an error. It is a thin wrapper over providers.Default().ParseModel so existing callers keep the ProviderID return type.
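First-match prefix parsing can be sketched as below. The two-entry registry is hypothetical and stands in for providers.json, whose real contents are not shown here; parseModel is a local name.

```go
package main

import (
	"fmt"
	"strings"
)

// registry is a hypothetical, ordered prefix table. Order matters because
// the first matching prefix wins.
var registry = []struct{ prefix, provider string }{
	{"anthropic/", "anthropic"},
	{"openai/", "openai"},
}

// parseModel splits a prefixed spec into provider and bare model name.
func parseModel(spec string) (string, string, error) {
	for _, entry := range registry {
		if strings.HasPrefix(spec, entry.prefix) {
			model := strings.TrimPrefix(spec, entry.prefix)
			if model == "" {
				break // a prefix with no model name is not a usable spec
			}
			return entry.provider, model, nil
		}
	}
	return "", "", fmt.Errorf("model spec %q has no known provider prefix", spec)
}

func main() {
	fmt.Println(parseModel("anthropic/claude-opus-4-8"))
	fmt.Println(parseModel("claude-opus-4-8"))
}
```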

func Providers added in v0.4.0

func Providers() []ProviderID

Providers returns every supported provider id, in registry order.

func (ProviderID) CredentialName

func (p ProviderID) CredentialName() string

CredentialName returns the name used for credential resolution and keyring lookups for this provider.

type Role

type Role string

Role identifies the author of a message.

const (
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
)

type StopReason

type StopReason string

StopReason is the normalized reason the model stopped producing output. Adapters map provider-specific values into this enum.

const (
	StopEndTurn       StopReason = "end_turn"
	StopToolUse       StopReason = "tool_use"
	StopMaxTokens     StopReason = "max_tokens"
	StopStopSequence  StopReason = "stop_sequence"
	StopContentFilter StopReason = "content_filter"
	StopError         StopReason = "error"
	StopOther         StopReason = "other"
)

type StreamOpts

type StreamOpts struct {
	// EffortOverride, when non-nil, replaces Client.Effort() for this call
	// only. Empty string disables reasoning entirely. Used by the retry
	// loops to force a non-thinking response on the final attempt after
	// repeated thinking stalls.
	EffortOverride *string
}

StreamOpts carries per-call overrides for StreamMessageWith. The zero value preserves the instance-level defaults.
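EffortOverride is a *string so that "no override" (nil) and "override to empty" (pointer to "") stay distinct. A minimal sketch of that resolution, with effectiveEffort as a hypothetical helper:

```go
package main

import "fmt"

// effectiveEffort picks the effort for one call: nil keeps the client's
// configured effort, any non-nil pointer replaces it, and a pointer to ""
// turns reasoning off for that call.
func effectiveEffort(clientEffort string, override *string) string {
	if override == nil {
		return clientEffort
	}
	return *override
}

// demo resolves the effort for a client configured with "high" under
// three overrides: none, empty, and "low".
func demo() (string, string, string) {
	none := ""
	low := "low"
	return effectiveEffort("high", nil),
		effectiveEffort("high", &none),
		effectiveEffort("high", &low)
}

func main() {
	kept, disabled, lowered := demo()
	fmt.Printf("%q %q %q\n", kept, disabled, lowered)
}
```

A plain string field could not express this, because its zero value would be indistinguishable from a request to disable reasoning.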

type SystemBlock

type SystemBlock struct {
	Text         string
	CacheControl *CacheControl
}

SystemBlock is one block of the system prompt.

type ThinkingStallError

type ThinkingStallError struct {
	Elapsed time.Duration
	Summary string
}

ThinkingStallError is returned when a single reasoning/thinking block runs past the stall timeout. Summary holds the text collected from thinking-delta events so the retry layer can feed it back to the model on the next attempt. Only adapters that surface discrete reasoning events (Anthropic, OpenAI Responses) can produce this error.

func (*ThinkingStallError) Error

func (e *ThinkingStallError) Error() string

func (*ThinkingStallError) Unwrap

func (e *ThinkingStallError) Unwrap() error

type ToolCall

type ToolCall struct {
	ID    string
	Name  string
	Input map[string]any
}

ToolCall is one tool invocation extracted from the model's response. Duplicates the BlockToolUse entries in Message.Content for convenience.

type ToolParam

type ToolParam struct {
	Name        string
	Description string
	InputSchema map[string]any // raw JSON Schema object
}

ToolParam describes one tool exposed to the model.

type Usage

type Usage struct {
	InputTokens         int64
	OutputTokens        int64
	CacheCreationTokens int64
	CacheReadTokens     int64
	ReasoningTokens     int64          // openai o-series, gpt-5-thinking
	CostUSD             float64        // openrouter when usage.include=true
	ProviderExtra       map[string]any // raw provider blob for future fields
}

Usage holds token counts and provider extras from one LLM response.
