Documentation ¶
Overview ¶
Package llm provides LLM client implementations.
Index ¶
- Constants
- func ApplyTextToolCallFallback(resp *ChatResponse, validToolNames []string, profile ToolCallTextProfile)
- func CacheHitRate(cacheReadInputTokens, cacheCreationInputTokens int) float64
- func EstimateTokens(text string) int
- func ExtractToolNames(tools []map[string]any) []string
- func LooksLikeHallucinatedToolCall(content string, profile ToolCallTextProfile) bool
- func LooksLikeTextToolCall(content string, profile ToolCallTextProfile) bool
- func StripTopLevelCompositionKeywords(schema map[string]any) (map[string]any, []string)
- func StripTrailingToolCallText(content string, validTools []string, profile ToolCallTextProfile) string
- type AmbiguousModelError
- type ChatResponse
- type Client
- type ContextRenderStyle
- type DynamicClient
- func (c *DynamicClient) Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)
- func (c *DynamicClient) ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, ...) (*ChatResponse, error)
- func (c *DynamicClient) Ping(ctx context.Context) error
- func (c *DynamicClient) Swap(next Client) error
- type ImageContent
- type Message
- type ModelInteractionProfile
- type ModelProfileInput
- type MultiClient
- func (m *MultiClient) AddAlias(alias, target string)
- func (m *MultiClient) AddModel(modelName, providerName string)
- func (m *MultiClient) AddProvider(name string, client Client)
- func (m *MultiClient) AddRoute(target, providerName, modelName string)
- func (m *MultiClient) Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)
- func (m *MultiClient) ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, ...) (*ChatResponse, error)
- func (m *MultiClient) MarkAmbiguous(alias string, targets []string)
- func (m *MultiClient) Ping(ctx context.Context) error
- type PromptSection
- type ReadyWatcher
- type StreamCallback
- type StreamEvent
- type StreamEventKind
- type ToolCall
- type ToolCallStyle
- type ToolCallTextProfile
Constants ¶
const LevelTrace = slog.Level(-8)
LevelTrace is below Debug, used for wire-level payload logging.
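A minimal sketch of guarding wire-level payload logging behind this level, assuming the package is imported as llm (ctx and payload are placeholders; uses the standard library context and log/slog):

func logWirePayload(ctx context.Context, payload []byte) {
	// Only pay the formatting cost when trace logging is actually enabled.
	if slog.Default().Enabled(ctx, llm.LevelTrace) {
		slog.Default().Log(ctx, llm.LevelTrace, "llm wire payload", "bytes", len(payload))
	}
}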
Variables ¶
This section is empty.
Functions ¶
func ApplyTextToolCallFallback ¶
func ApplyTextToolCallFallback(resp *ChatResponse, validToolNames []string, profile ToolCallTextProfile)
ApplyTextToolCallFallback upgrades raw-text tool call emissions into structured ToolCalls, suppresses obvious hallucinated tool-call shapes, and strips trailing tool-call payloads from mixed responses.
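A usage sketch, assuming the package is imported as llm (tools and resp are placeholders for the request's tool definitions and the provider response):

names := llm.ExtractToolNames(tools) // tools: the OpenAI-style []map[string]any passed to the provider
llm.ApplyTextToolCallFallback(resp, names, llm.DefaultToolCallTextProfile())
if len(resp.Message.ToolCalls) > 0 {
	// raw-text emissions were upgraded to structured tool calls
}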
func CacheHitRate ¶
func CacheHitRate(cacheReadInputTokens, cacheCreationInputTokens int) float64
CacheHitRate returns the cache-read share for a single interaction's token counts, in [0, 1]. Zero when both counts are zero so callers never have to guard against division by zero.
func EstimateTokens ¶
func EstimateTokens(text string) int
EstimateTokens returns a rough token count estimate for English text. Rule of thumb: ~4 characters per token.
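For example (the prompt text is illustrative):

prompt := "Summarize the following document in three bullet points."
approx := llm.EstimateTokens(prompt) // roughly len(prompt)/4, here about 14
_ = approx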
func ExtractToolNames ¶
func ExtractToolNames(tools []map[string]any) []string
ExtractToolNames extracts tool names from the OpenAI-style tool definitions passed to providers.
func LooksLikeHallucinatedToolCall ¶
func LooksLikeHallucinatedToolCall(content string, profile ToolCallTextProfile) bool
LooksLikeHallucinatedToolCall reports whether content has the shape of a tool call but does not match any valid tool.
func LooksLikeTextToolCall ¶
func LooksLikeTextToolCall(content string, profile ToolCallTextProfile) bool
LooksLikeTextToolCall reports whether content appears to be a raw-text tool call and should be buffered until the full response is available.
func StripTopLevelCompositionKeywords ¶
func StripTopLevelCompositionKeywords(schema map[string]any) (map[string]any, []string)
StripTopLevelCompositionKeywords returns a deep-copied schema with unsupported top-level composition keywords removed and their object properties merged into the root when possible.
This is a compatibility helper for downstream consumers that accept regular object schemas but reject top-level oneOf/allOf/anyOf. The returned schema is intentionally permissive: root-level required fields are preserved, but composition-derived required constraints are not re-encoded because doing so would often overconstrain the tool contract.
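A sketch of the flattening described above, assuming the package is imported as llm (the schema literal is illustrative):

schema := map[string]any{
	"oneOf": []any{
		map[string]any{"type": "object", "properties": map[string]any{"path": map[string]any{"type": "string"}}},
		map[string]any{"type": "object", "properties": map[string]any{"url": map[string]any{"type": "string"}}},
	},
}
flattened, removed := llm.StripTopLevelCompositionKeywords(schema)
// flattened is a plain object schema with the variant properties merged into
// the root; removed presumably names the stripped keywords (here "oneOf").
_ = flattened
_ = removed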
func StripTrailingToolCallText ¶
func StripTrailingToolCallText(content string, validTools []string, profile ToolCallTextProfile) string
StripTrailingToolCallText removes trailing tool-call payloads that a model appended after prose.
Types ¶
type AmbiguousModelError ¶
AmbiguousModelError reports that a model selector matches multiple qualified route targets and must be disambiguated by the caller.
func (*AmbiguousModelError) Error ¶
func (e *AmbiguousModelError) Error() string
type ChatResponse ¶
type ChatResponse struct {
	Model     string
	CreatedAt time.Time
	Message   Message
	Done      bool

	// UpstreamRequestID is the provider-side request identifier when the
	// provider exposes one (e.g. Anthropic's `x-request-id` response
	// header). Empty when the provider does not return one. Captured
	// for support escalation, billing reconciliation, and correlating
	// our local r_* request IDs to upstream invoice line items.
	UpstreamRequestID string

	// StopReason is the provider-side termination signal in
	// provider-neutral form. Anthropic emits "end_turn", "tool_use",
	// "max_tokens", "stop_sequence", or "pause_turn" (the latter is
	// the server-side context-pressure signal that warrants operator
	// attention). Empty when the provider doesn't expose one or the
	// stream ended unexpectedly.
	StopReason string

	// Token usage (provider-neutral)
	InputTokens              int
	OutputTokens             int
	CacheCreationInputTokens int
	CacheReadInputTokens     int

	// Per-TTL breakdown of cache-write tokens. Populated by providers
	// that return a structured cache_creation breakdown (Anthropic).
	// Zero when the provider doesn't expose the breakdown, in which
	// case callers should fall back to CacheCreationInputTokens and
	// treat the TTL mix as unknown (typically charged at the 5m rate
	// for cost estimation, since that's the default).
	CacheCreation5mInputTokens int
	CacheCreation1hInputTokens int

	// Timing (populated when available)
	TotalDuration time.Duration
	LoadDuration  time.Duration
	EvalDuration  time.Duration
}
ChatResponse is the unified response from any LLM provider. All fields use proper Go types — wire format conversion happens at provider boundaries (ollama.go, anthropic.go).
func (*ChatResponse) CacheHitRate ¶
func (r *ChatResponse) CacheHitRate() float64
CacheHitRate returns the fraction of cache-eligible input tokens on this response that were served from cache, in [0, 1]. Zero when there were no cache-eligible tokens at all. Matches the Anthropic-recommended observability metric: cache_read / (cache_read + cache_creation).
Exposed on ChatResponse (and as the package-level CacheHitRate function for bare counts) so providers and loggers can surface the metric without importing the usage package, which would create an import cycle.
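A logging sketch (client, ctx, model, messages, and tools are placeholders):

resp, err := client.Chat(ctx, model, messages, tools)
if err == nil {
	slog.Info("chat complete",
		"cache_hit_rate", resp.CacheHitRate(),
		"input_tokens", resp.InputTokens,
		"output_tokens", resp.OutputTokens)
}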
type Client ¶
type Client interface {
	// Chat sends a chat completion request and returns the response.
	Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)

	// ChatStream sends a streaming chat request. If callback is non-nil, tokens are streamed to it.
	ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)

	// Ping checks if the provider is reachable.
	Ping(ctx context.Context) error
}
Client is the interface that all LLM providers must implement.
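A minimal stub implementation sketch, e.g. for tests (stubClient is illustrative, not part of the package):

type stubClient struct{}

var _ llm.Client = stubClient{}

func (stubClient) Chat(ctx context.Context, model string, messages []llm.Message, tools []map[string]any) (*llm.ChatResponse, error) {
	return &llm.ChatResponse{Model: model, Done: true, Message: llm.Message{Role: "assistant", Content: "ok"}}, nil
}

func (stubClient) ChatStream(ctx context.Context, model string, messages []llm.Message, tools []map[string]any, cb llm.StreamCallback) (*llm.ChatResponse, error) {
	return stubClient{}.Chat(ctx, model, messages, tools)
}

func (stubClient) Ping(ctx context.Context) error { return nil }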
type ContextRenderStyle ¶
type ContextRenderStyle string
ContextRenderStyle describes how runtime-generated context should be shaped for a model family.
const (
	ContextRenderStyleJSONFirst ContextRenderStyle = "json_first"
)
type DynamicClient ¶
type DynamicClient struct {
	// contains filtered or unexported fields
}
DynamicClient is a concurrency-safe wrapper around a swappable underlying llm.Client. In-flight requests continue using the client they started with while future requests see the new client after Swap.
func NewDynamicClient ¶
func NewDynamicClient(initial Client) *DynamicClient
NewDynamicClient wraps the initial client.
func (*DynamicClient) Chat ¶
func (c *DynamicClient) Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)
Chat delegates to the current client.
func (*DynamicClient) ChatStream ¶
func (c *DynamicClient) ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)
ChatStream delegates to the current client.
func (*DynamicClient) Ping ¶
func (c *DynamicClient) Ping(ctx context.Context) error
Ping delegates to the current client.
func (*DynamicClient) Swap ¶
func (c *DynamicClient) Swap(next Client) error
Swap replaces the underlying client used for future requests.
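A hot-swap sketch (primary and replacement are placeholder llm.Client values):

dyn := llm.NewDynamicClient(primary)
// ... serve requests; in-flight calls keep using primary ...
if err := dyn.Swap(replacement); err != nil {
	// handle a rejected swap
}
// subsequent Chat/ChatStream/Ping calls use replacement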
type ImageContent ¶
type ImageContent struct {
	Data      string // base64-encoded image data (no data URI prefix)
	MediaType string // MIME type: "image/jpeg", "image/png", etc.
}
ImageContent holds a base64-encoded image for multimodal messages. Each provider serializes images differently (Ollama uses a flat base64 array, Anthropic uses typed content blocks), so the Images field on Message is excluded from default JSON marshaling.
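A sketch of attaching an image to a message (pngBytes is a placeholder; encoding/base64 is the standard library):

msg := llm.Message{
	Role:    "user",
	Content: "What does this diagram show?",
	Images: []llm.ImageContent{{
		Data:      base64.StdEncoding.EncodeToString(pngBytes), // raw base64, no data URI prefix
		MediaType: "image/png",
	}},
}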
type Message ¶
type Message struct {
	Role       string          `json:"role"`
	Content    string          `json:"content"`
	Images     []ImageContent  `json:"-"` // multimodal images; marshaled per-provider
	Sections   []PromptSection `json:"-"` // system-prompt sections; provider-specific
	ToolCalls  []ToolCall      `json:"tool_calls,omitempty"`
	ToolCallID string          `json:"tool_call_id,omitempty"` // For tool responses
}
Message represents a chat message for the LLM.
type ModelInteractionProfile ¶
type ModelInteractionProfile struct {
	Name            string
	ContextStyle    ContextRenderStyle
	ToolCallStyle   ToolCallStyle
	TextToolProfile ToolCallTextProfile
}
ModelInteractionProfile captures model-family defaults for model-facing context and tool-call compatibility.
func DefaultModelInteractionProfile ¶
func DefaultModelInteractionProfile() ModelInteractionProfile
DefaultModelInteractionProfile returns the generic Thane default.
func ProfileForModel ¶
func ProfileForModel(input ModelProfileInput) ModelInteractionProfile
ProfileForModel selects the best current model-interaction profile from provider/model-family hints. The current default is conservative: stay JSON-first for context, but switch local open-model families to a raw-text tool-call contract when they commonly emit text instead of native tool-call structures.
func (ModelInteractionProfile) ToolCallingContract ¶
func (p ModelInteractionProfile) ToolCallingContract() string
ToolCallingContract returns a short model-facing instruction for runtimes that need to recover tool calls from raw assistant text.
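A selection sketch (the metadata values and systemPrompt are illustrative):

profile := llm.ProfileForModel(llm.ModelProfileInput{
	Provider:          "ollama",
	Model:             "qwen2.5:14b",
	Family:            "qwen2",
	TrainedForToolUse: true,
})
if profile.ToolCallStyle == llm.ToolCallStyleRawTextJSON {
	// append the raw-text contract to the model-facing instructions
	systemPrompt += "\n\n" + profile.ToolCallingContract()
}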
type ModelProfileInput ¶
type ModelProfileInput struct {
	Provider          string
	Model             string
	Family            string
	Families          []string
	TrainedForToolUse bool
}
ModelProfileInput is the normalized metadata used to choose a model-family interaction profile.
type MultiClient ¶
type MultiClient struct {
	// contains filtered or unexported fields
}
MultiClient routes requests to the appropriate provider based on model name.
func NewMultiClient ¶
func NewMultiClient(fallback Client) *MultiClient
NewMultiClient creates a client that routes to multiple providers.
func (*MultiClient) AddAlias ¶
func (m *MultiClient) AddAlias(alias, target string)
AddAlias maps an alternate selector to a concrete route target.
func (*MultiClient) AddModel ¶
func (m *MultiClient) AddModel(modelName, providerName string)
AddModel maps a model name to a provider.
func (*MultiClient) AddProvider ¶
func (m *MultiClient) AddProvider(name string, client Client)
AddProvider registers a client for a provider name.
func (*MultiClient) AddRoute ¶
func (m *MultiClient) AddRoute(target, providerName, modelName string)
AddRoute maps a route target to a provider/resource and upstream model name.
func (*MultiClient) Chat ¶
func (m *MultiClient) Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)
Chat sends a request to the appropriate provider for the model.
func (*MultiClient) ChatStream ¶
func (m *MultiClient) ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)
ChatStream sends a streaming request to the appropriate provider.
func (*MultiClient) MarkAmbiguous ¶
func (m *MultiClient) MarkAmbiguous(alias string, targets []string)
MarkAmbiguous records that an alias maps to multiple route targets and must be qualified by the caller.
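A wiring sketch (the provider clients, model names, and alias are placeholders):

mc := llm.NewMultiClient(ollamaClient) // fallback for unrouted models
mc.AddProvider("anthropic", anthropicClient)
mc.AddModel("claude-sonnet", "anthropic") // send this model name to the anthropic provider
mc.AddAlias("sonnet", "claude-sonnet")    // alternate selector for the same target

resp, err := mc.Chat(ctx, "sonnet", messages, nil)
if err == nil {
	fmt.Println(resp.Message.Content)
}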
type PromptSection ¶
type PromptSection struct {
	Name     string
	Content  string
	CacheTTL string // optional provider hint, for example "1h" or "5m"
}
PromptSection preserves the semantic sections of a system prompt so providers can apply transport-specific optimizations such as prompt caching without changing the prompt text itself.
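A sketch of a sectioned system message (section names, contents, and the TTL hint are illustrative):

system := llm.Message{
	Role: "system",
	Sections: []llm.PromptSection{
		{Name: "identity", Content: identityText, CacheTTL: "1h"}, // stable text; long cache hint
		{Name: "runtime_context", Content: contextText},           // changes per turn; no cache hint
	},
}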
type ReadyWatcher ¶
type ReadyWatcher interface {
	IsReady() bool
}
ReadyWatcher is satisfied by connection watchers that can report whether a provider resource is currently reachable.
type StreamCallback ¶
type StreamCallback func(event StreamEvent)
StreamCallback receives streaming events. For backward compatibility, pure-text consumers can check event.Kind == KindToken.
type StreamEvent ¶
type StreamEvent struct {
	Kind StreamEventKind

	// Token is set for KindToken events.
	Token string

	// ToolCall is set for KindToolCallStart events.
	ToolCall *ToolCall

	// ToolName and ToolResult are set for KindToolCallDone events.
	ToolName   string
	ToolResult string
	ToolError  string

	// Response is set for KindDone events (final summary).
	Response *ChatResponse

	// Data carries optional extensible metadata for events that need
	// more than the typed fields above. Used by KindLLMStart to
	// forward router decisions and context estimates.
	Data map[string]any
}
StreamEvent represents a single event in a streaming response. Consumers switch on Kind to determine what data is available.
type StreamEventKind ¶
type StreamEventKind int
StreamEventKind identifies the type of stream event.
const (
	// KindToken is an incremental text token from the model.
	KindToken StreamEventKind = iota
	// KindToolCallStart fires when the model invokes a tool.
	KindToolCallStart
	// KindToolCallDone fires when a tool execution completes.
	KindToolCallDone
	// KindDone signals the stream is complete. Response carries final metadata.
	KindDone
	// KindLLMResponse fires when an LLM response is received (before
	// tool execution begins). Response carries the model name and
	// token counts at the earliest point they become available.
	KindLLMResponse
	// KindLLMStart fires immediately before an LLM API call begins.
	// Response.Model carries the selected model name so consumers
	// can display it before the call completes.
	KindLLMStart
)
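A callback sketch that switches on Kind (client, ctx, model, messages, and tools are placeholders):

cb := llm.StreamCallback(func(ev llm.StreamEvent) {
	switch ev.Kind {
	case llm.KindToken:
		fmt.Print(ev.Token)
	case llm.KindToolCallStart:
		fmt.Printf("\n[calling %s]\n", ev.ToolCall.Function.Name)
	case llm.KindToolCallDone:
		fmt.Printf("[%s finished]\n", ev.ToolName)
	case llm.KindDone:
		fmt.Printf("\n(%d output tokens)\n", ev.Response.OutputTokens)
	}
})
resp, err := client.ChatStream(ctx, model, messages, tools, cb)
if err == nil {
	fmt.Printf("stop reason: %s\n", resp.StopReason)
}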
type ToolCall ¶
type ToolCall struct {
	ID       string `json:"id,omitempty"` // Provider-assigned ID (required by Anthropic for tool_result correlation)
	Function struct {
		Name      string         `json:"name"`
		Arguments map[string]any `json:"arguments"`
	} `json:"function"`
}
ToolCall represents a tool call from the model.
func ParseTextToolCalls ¶
func ParseTextToolCalls(content string, validTools []string, profile ToolCallTextProfile) []ToolCall
ParseTextToolCalls attempts to extract structured tool calls from raw assistant text.
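For example (the raw text is illustrative, and its shape is only a guess at one of the accepted formats; the tool names are placeholders):

raw := `{"name": "read_file", "arguments": {"path": "main.go"}}`
calls := llm.ParseTextToolCalls(raw, []string{"read_file", "write_file"}, llm.DefaultToolCallTextProfile())
for _, c := range calls {
	fmt.Println(c.Function.Name, c.Function.Arguments)
}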
func ParseTextToolCallsForRepair ¶
func ParseTextToolCallsForRepair(content string, profile ToolCallTextProfile) []ToolCall
ParseTextToolCallsForRepair extracts tool-shaped JSON payloads even when the tool names do not currently match the valid tool list. This lets later runtime layers repair aliases such as forge_capability or list_capabilities instead of dropping them as hallucinated text.
type ToolCallStyle ¶
type ToolCallStyle string
ToolCallStyle describes the primary tool-calling contract we expect a model family to follow.
const (
	ToolCallStyleNative      ToolCallStyle = "native"
	ToolCallStyleRawTextJSON ToolCallStyle = "raw_text_json"
)
type ToolCallTextProfile ¶
type ToolCallTextProfile struct {
	AcceptTaggedToolCalls    bool
	AcceptMarkdownFences     bool
	AcceptConcatenatedJSON   bool
	AcceptToolNameJSONArgs   bool
	SuppressHallucinatedText bool
}
ToolCallTextProfile captures the raw-text tool-call formats the runtime is willing to parse for a model family.
func DefaultToolCallTextProfile ¶
func DefaultToolCallTextProfile() ToolCallTextProfile
DefaultToolCallTextProfile accepts the common raw-text tool-call formats emitted by local/open models behind OpenAI-compatible runtimes.