llm

package
v0.9.2 Latest
Published: May 1, 2026 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Overview

Package llm provides LLM client implementations.

Index

Constants

View Source
const LevelTrace = slog.Level(-8)

LevelTrace is below Debug, used for wire-level payload logging.

Variables

This section is empty.

Functions

func ApplyTextToolCallFallback

func ApplyTextToolCallFallback(resp *ChatResponse, validToolNames []string, profile ToolCallTextProfile)

ApplyTextToolCallFallback upgrades raw-text tool call emissions into structured ToolCalls, suppresses obvious hallucinated tool-call shapes, and strips trailing tool-call payloads from mixed responses.

func CacheHitRate

func CacheHitRate(cacheReadInputTokens, cacheCreationInputTokens int) float64

CacheHitRate returns the cache-read share for a single interaction's token counts, in [0, 1]. Zero when both counts are zero so callers never have to guard against division by zero.
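
The metric is the ratio cache_read / (cache_read + cache_creation). A minimal sketch of that formula (an illustrative reimplementation, not the package source):

```go
package main

import "fmt"

// cacheHitRate mirrors the documented formula: the cache-read share of
// all cache-eligible input tokens, returning 0 when both counts are zero.
func cacheHitRate(cacheRead, cacheCreation int) float64 {
	total := cacheRead + cacheCreation
	if total == 0 {
		return 0 // zero guard, as the package promises
	}
	return float64(cacheRead) / float64(total)
}

func main() {
	fmt.Println(cacheHitRate(900, 100)) // 0.9: most eligible tokens came from cache
	fmt.Println(cacheHitRate(0, 0))     // 0: nothing cache-eligible
}
```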

func EstimateTokens

func EstimateTokens(text string) int

EstimateTokens returns a rough token count estimate for English text. Rule of thumb: ~4 characters per token.
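
A sketch of the documented heuristic (the real implementation may differ in rounding and edge cases):

```go
package main

import "fmt"

// estimateTokens applies the rule of thumb for English text:
// roughly 4 characters per token.
func estimateTokens(text string) int {
	return len(text) / 4
}

func main() {
	fmt.Println(estimateTokens("The quick brown fox jumps over the lazy dog."))
}
```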

func ExtractToolNames

func ExtractToolNames(tools []map[string]any) []string

ExtractToolNames extracts tool names from the OpenAI-style tool definitions passed to providers.

func LooksLikeHallucinatedToolCall

func LooksLikeHallucinatedToolCall(content string, profile ToolCallTextProfile) bool

LooksLikeHallucinatedToolCall reports whether content has the shape of a tool call but does not match any valid tool.

func LooksLikeTextToolCall

func LooksLikeTextToolCall(content string, profile ToolCallTextProfile) bool

LooksLikeTextToolCall reports whether content appears to be a raw-text tool call and should be buffered until the full response is available.

func StripTopLevelCompositionKeywords

func StripTopLevelCompositionKeywords(schema map[string]any) (map[string]any, []string)

StripTopLevelCompositionKeywords returns a deep-copied schema with unsupported top-level composition keywords removed and their object properties merged into the root when possible.

This is a compatibility helper for downstream consumers that accept regular object schemas but reject top-level oneOf/allOf/anyOf. The returned schema is intentionally permissive: root-level required fields are preserved, but composition-derived required constraints are not re-encoded because doing so would often overconstrain the tool contract.
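
To make the merge behavior concrete, here is a simplified sketch of the described transformation. It is illustrative only: unlike the package, it does not deep-copy nested maps, and it drops branch-level required constraints exactly as the doc comment describes.

```go
package main

import "fmt"

// mergeComposition removes top-level oneOf/allOf/anyOf and, where a
// branch is a plain object schema, folds its properties into the root.
// Root-level "required" is untouched; branch-level "required" is dropped,
// matching the permissive contract described above.
func mergeComposition(schema map[string]any) (map[string]any, []string) {
	out := map[string]any{}
	for k, v := range schema {
		out[k] = v
	}
	var stripped []string
	props, _ := out["properties"].(map[string]any)
	if props == nil {
		props = map[string]any{}
	}
	for _, kw := range []string{"oneOf", "allOf", "anyOf"} {
		branches, ok := out[kw].([]any)
		if !ok {
			continue
		}
		stripped = append(stripped, kw)
		delete(out, kw)
		for _, b := range branches {
			branch, _ := b.(map[string]any)
			if bp, ok := branch["properties"].(map[string]any); ok {
				for name, p := range bp {
					props[name] = p
				}
			}
		}
	}
	if len(props) > 0 {
		out["properties"] = props
	}
	return out, stripped
}

func main() {
	schema := map[string]any{
		"type": "object",
		"oneOf": []any{
			map[string]any{"properties": map[string]any{"path": map[string]any{"type": "string"}}},
		},
	}
	merged, stripped := mergeComposition(schema)
	fmt.Println(merged["oneOf"] == nil, stripped)
}
```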

func StripTrailingToolCallText

func StripTrailingToolCallText(content string, validTools []string, profile ToolCallTextProfile) string

StripTrailingToolCallText removes trailing tool-call payloads that a model appended after prose.

Types

type AmbiguousModelError

type AmbiguousModelError struct {
	Model   string
	Targets []string
}

AmbiguousModelError reports that a model selector matches multiple qualified route targets and must be disambiguated by the caller.

func (*AmbiguousModelError) Error

func (e *AmbiguousModelError) Error() string

type ChatResponse

type ChatResponse struct {
	Model     string
	CreatedAt time.Time
	Message   Message
	Done      bool

	// UpstreamRequestID is the provider-side request identifier when the
	// provider exposes one (e.g. Anthropic's `x-request-id` response
	// header). Empty when the provider does not return one. Captured
	// for support escalation, billing reconciliation, and correlating
	// our local r_* request IDs to upstream invoice line items.
	UpstreamRequestID string

	// StopReason is the provider-side termination signal in
	// provider-neutral form. Anthropic emits "end_turn", "tool_use",
	// "max_tokens", "stop_sequence", or "pause_turn" (the latter is
	// the server-side context-pressure signal that warrants operator
	// attention). Empty when the provider doesn't expose one or the
	// stream ended unexpectedly.
	StopReason string

	// Token usage (provider-neutral)
	InputTokens              int
	OutputTokens             int
	CacheCreationInputTokens int
	CacheReadInputTokens     int
	// Per-TTL breakdown of cache-write tokens. Populated by providers
	// that return a structured cache_creation breakdown (Anthropic).
	// Zero when the provider doesn't expose the breakdown, in which
	// case callers should fall back to CacheCreationInputTokens and
	// treat the TTL mix as unknown (typically charged at the 5m rate
	// for cost estimation, since that's the default).
	CacheCreation5mInputTokens int
	CacheCreation1hInputTokens int

	// Timing (populated when available)
	TotalDuration time.Duration
	LoadDuration  time.Duration
	EvalDuration  time.Duration
}

ChatResponse is the unified response from any LLM provider. All fields use proper Go types — wire format conversion happens at provider boundaries (ollama.go, anthropic.go).

func (*ChatResponse) CacheHitRate

func (r *ChatResponse) CacheHitRate() float64

CacheHitRate returns the fraction of cache-eligible input tokens on this response that were served from cache, in [0, 1]. Zero when there were no cache-eligible tokens at all. Matches the Anthropic-recommended observability metric: cache_read / (cache_read + cache_creation).

Exposed on ChatResponse (and as CacheHitRate for bare counts) so providers and loggers can surface the metric without importing the usage package, which would cycle.

type Client

type Client interface {
	// Chat sends a chat completion request and returns the response.
	Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)

	// ChatStream sends a streaming chat request. If callback is non-nil, tokens are streamed to it.
	ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)

	// Ping checks if the provider is reachable.
	Ping(ctx context.Context) error
}

Client is the interface that all LLM providers must implement.

type ContextRenderStyle

type ContextRenderStyle string

ContextRenderStyle describes how runtime-generated context should be shaped for a model family.

const (
	ContextRenderStyleJSONFirst ContextRenderStyle = "json_first"
)

type DynamicClient

type DynamicClient struct {
	// contains filtered or unexported fields
}

DynamicClient is a concurrency-safe wrapper around a swappable underlying llm.Client. In-flight requests continue using the client they started with while future requests see the new client after Swap.
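
One common way to get these semantics is an atomic pointer swap: in-flight calls keep the client value they loaded, later calls observe the new one. A sketch of the pattern (not the package's actual implementation; the stub types are illustrative):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// client stands in for the package's Client interface.
type client interface{ Name() string }

type stub struct{ name string }

func (s stub) Name() string { return s.name }

// dynamicClient holds the current client behind an atomic pointer.
type dynamicClient struct {
	cur atomic.Pointer[client]
}

func newDynamicClient(initial client) *dynamicClient {
	d := &dynamicClient{}
	d.cur.Store(&initial)
	return d
}

// Swap makes next visible to future loads without locking.
func (d *dynamicClient) Swap(next client) { d.cur.Store(&next) }

func (d *dynamicClient) current() client { return *d.cur.Load() }

func main() {
	d := newDynamicClient(stub{"ollama"})
	before := d.current() // a request "in flight" keeps this value
	d.Swap(stub{"anthropic"})
	fmt.Println(before.Name(), d.current().Name())
}
```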

func NewDynamicClient

func NewDynamicClient(initial Client) *DynamicClient

NewDynamicClient wraps the initial client.

func (*DynamicClient) Chat

func (c *DynamicClient) Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)

Chat delegates to the current client.

func (*DynamicClient) ChatStream

func (c *DynamicClient) ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)

ChatStream delegates to the current client.

func (*DynamicClient) Ping

func (c *DynamicClient) Ping(ctx context.Context) error

Ping delegates to the current client.

func (*DynamicClient) Swap

func (c *DynamicClient) Swap(next Client) error

Swap replaces the underlying client used for future requests.

type ImageContent

type ImageContent struct {
	Data      string // base64-encoded image data (no data URI prefix)
	MediaType string // MIME type: "image/jpeg", "image/png", etc.
}

ImageContent holds a base64-encoded image for multimodal messages. Each provider serializes images differently (Ollama uses a flat base64 array, Anthropic uses typed content blocks), so the Images field on Message is excluded from default JSON marshaling.

type Message

type Message struct {
	Role       string          `json:"role"`
	Content    string          `json:"content"`
	Images     []ImageContent  `json:"-"` // multimodal images; marshaled per-provider
	Sections   []PromptSection `json:"-"` // system-prompt sections; provider-specific
	ToolCalls  []ToolCall      `json:"tool_calls,omitempty"`
	ToolCallID string          `json:"tool_call_id,omitempty"` // For tool responses
}

Message represents a chat message for the LLM.
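
To illustrate how ToolCalls and ToolCallID correlate across turns, here is a sketch of a tool round trip using minimal local mirrors of the types above (the IDs, tool name, and arguments are made up; real wire formats are handled per provider):

```go
package main

import "fmt"

// Minimal mirrors of the documented Message/ToolCall shapes.
type toolCall struct {
	ID       string
	Function struct {
		Name      string
		Arguments map[string]any
	}
}

type message struct {
	Role       string
	Content    string
	ToolCalls  []toolCall
	ToolCallID string
}

// toolResult builds the tool-role reply, echoing the call's ID so the
// provider (Anthropic in particular) can correlate tool_result blocks.
func toolResult(call toolCall, content string) message {
	return message{Role: "tool", Content: content, ToolCallID: call.ID}
}

func main() {
	call := toolCall{ID: "toolu_123"} // hypothetical provider-assigned ID
	call.Function.Name = "get_weather"
	call.Function.Arguments = map[string]any{"city": "Oslo"}

	convo := []message{
		{Role: "user", Content: "What's the weather in Oslo?"},
		{Role: "assistant", ToolCalls: []toolCall{call}},
		toolResult(call, `{"temp_c": 4}`),
	}
	fmt.Println(convo[2].ToolCallID == convo[1].ToolCalls[0].ID)
}
```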

type ModelInteractionProfile

type ModelInteractionProfile struct {
	Name            string
	ContextStyle    ContextRenderStyle
	ToolCallStyle   ToolCallStyle
	TextToolProfile ToolCallTextProfile
}

ModelInteractionProfile captures model-family defaults for model-facing context and tool-call compatibility.

func DefaultModelInteractionProfile

func DefaultModelInteractionProfile() ModelInteractionProfile

DefaultModelInteractionProfile returns the generic Thane default.

func ProfileForModel

func ProfileForModel(input ModelProfileInput) ModelInteractionProfile

ProfileForModel selects the best-matching model-interaction profile from provider/model-family hints. The current default is conservative: stay JSON-first for context, but switch local open-model families to a raw-text tool-call contract when they commonly emit text instead of native tool-call structures.

func (ModelInteractionProfile) ToolCallingContract

func (p ModelInteractionProfile) ToolCallingContract() string

ToolCallingContract returns a short model-facing instruction for runtimes that need to recover tool calls from raw assistant text.

type ModelProfileInput

type ModelProfileInput struct {
	Provider          string
	Model             string
	Family            string
	Families          []string
	TrainedForToolUse bool
}

ModelProfileInput is the normalized metadata used to choose a model-family interaction profile.

type MultiClient

type MultiClient struct {
	// contains filtered or unexported fields
}

MultiClient routes requests to the appropriate provider based on model name.
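
The routing idea can be sketched as two lookups with a fallback: aliases normalize to route targets, targets map to providers, and anything unresolved goes to the fallback. All names below are illustrative; the package's MultiClient builds this up via AddAlias/AddModel/AddProvider with ambiguity tracking on top.

```go
package main

import "fmt"

// resolve returns the provider name for a model selector.
func resolve(model string, aliases, models map[string]string, fallback string) string {
	if target, ok := aliases[model]; ok {
		model = target // alias -> concrete route target
	}
	if provider, ok := models[model]; ok {
		return provider
	}
	return fallback // unknown models go to the fallback provider
}

func main() {
	aliases := map[string]string{"sonnet": "claude-sonnet-4-5"}
	models := map[string]string{
		"claude-sonnet-4-5": "anthropic",
		"qwen3:8b":          "ollama",
	}
	fmt.Println(resolve("sonnet", aliases, models, "ollama"))
	fmt.Println(resolve("unknown-model", aliases, models, "ollama"))
}
```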

func NewMultiClient

func NewMultiClient(fallback Client) *MultiClient

NewMultiClient creates a client that routes to multiple providers.

func (*MultiClient) AddAlias

func (m *MultiClient) AddAlias(alias, target string)

AddAlias maps an alternate selector to a concrete route target.

func (*MultiClient) AddModel

func (m *MultiClient) AddModel(modelName, providerName string)

AddModel maps a model name to a provider.

func (*MultiClient) AddProvider

func (m *MultiClient) AddProvider(name string, client Client)

AddProvider registers a client for a provider name.

func (*MultiClient) AddRoute

func (m *MultiClient) AddRoute(target, providerName, modelName string)

AddRoute maps a route target to a provider/resource and upstream model name.

func (*MultiClient) Chat

func (m *MultiClient) Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)

Chat sends a request to the appropriate provider for the model.

func (*MultiClient) ChatStream

func (m *MultiClient) ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)

ChatStream sends a streaming request to the appropriate provider.

func (*MultiClient) MarkAmbiguous

func (m *MultiClient) MarkAmbiguous(alias string, targets []string)

MarkAmbiguous records that an alias maps to multiple route targets and must be qualified by the caller.

func (*MultiClient) Ping

func (m *MultiClient) Ping(ctx context.Context) error

Ping checks the fallback provider.

type PromptSection

type PromptSection struct {
	Name     string
	Content  string
	CacheTTL string // optional provider hint, for example "1h" or "5m"
}

PromptSection preserves the semantic sections of a system prompt so providers can apply transport-specific optimizations such as prompt caching without changing the prompt text itself.

type ReadyWatcher

type ReadyWatcher interface {
	IsReady() bool
}

ReadyWatcher is satisfied by connection watchers that can report whether a provider resource is currently reachable.

type StreamCallback

type StreamCallback func(event StreamEvent)

StreamCallback receives streaming events. For backward compatibility, pure-text consumers can check event.Kind == KindToken.

type StreamEvent

type StreamEvent struct {
	Kind StreamEventKind

	// Token is set for KindToken events.
	Token string

	// ToolCall is set for KindToolCallStart events.
	ToolCall *ToolCall

	// ToolName and ToolResult are set for KindToolCallDone events.
	ToolName   string
	ToolResult string
	ToolError  string

	// Response is set for KindDone events (final summary).
	Response *ChatResponse

	// Data carries optional extensible metadata for events that need
	// more than the typed fields above. Used by KindLLMStart to
	// forward router decisions and context estimates.
	Data map[string]any
}

StreamEvent represents a single event in a streaming response. Consumers switch on Kind to determine what data is available.

type StreamEventKind

type StreamEventKind int

StreamEventKind identifies the type of stream event.

const (
	// KindToken is an incremental text token from the model.
	KindToken StreamEventKind = iota

	// KindToolCallStart fires when the model invokes a tool.
	KindToolCallStart

	// KindToolCallDone fires when a tool execution completes.
	KindToolCallDone

	// KindDone signals the stream is complete. Response carries final metadata.
	KindDone

	// KindLLMResponse fires when an LLM response is received (before
	// tool execution begins). Response carries the model name and
	// token counts at the earliest point they become available.
	KindLLMResponse

	// KindLLMStart fires immediately before an LLM API call begins.
	// Response.Model carries the selected model name so consumers
	// can display it before the call completes.
	KindLLMStart
)
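
A consumer callback is a switch on Kind, handling only the events it cares about. A self-contained sketch with minimal local mirrors of the event types (the real package carries more kinds and fields):

```go
package main

import (
	"fmt"
	"strings"
)

// Minimal mirrors of the documented event types.
type eventKind int

const (
	kindToken eventKind = iota
	kindToolCallDone
	kindDone
)

type event struct {
	Kind  eventKind
	Token string
}

// collectText accumulates token events and stops at the done event,
// ignoring kinds it doesn't handle -- the documented consumption pattern.
func collectText(events []event) string {
	var b strings.Builder
	for _, ev := range events {
		switch ev.Kind {
		case kindToken:
			b.WriteString(ev.Token)
		case kindToolCallDone:
			// e.g. log tool completion; ignored here
		case kindDone:
			return b.String()
		}
	}
	return b.String()
}

func main() {
	fmt.Println(collectText([]event{
		{Kind: kindToken, Token: "Hello, "},
		{Kind: kindToken, Token: "world."},
		{Kind: kindDone},
	}))
}
```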

type ToolCall

type ToolCall struct {
	ID       string `json:"id,omitempty"` // Provider-assigned ID (required by Anthropic for tool_result correlation)
	Function struct {
		Name      string         `json:"name"`
		Arguments map[string]any `json:"arguments"`
	} `json:"function"`
}

ToolCall represents a tool call from the model.

func ParseTextToolCalls

func ParseTextToolCalls(content string, validTools []string, profile ToolCallTextProfile) []ToolCall

ParseTextToolCalls attempts to extract structured tool calls from raw assistant text.
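
For intuition, here is a sketch of one shape such a parser must accept: a single JSON object, possibly inside a markdown fence, with the name checked against the valid tool list. This is a simplification; per ToolCallTextProfile, the real parser also handles tagged calls, concatenated JSON, and other formats.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseTextToolCall extracts {"name": ..., "arguments": ...} from raw
// assistant text, stripping an optional markdown fence. Unknown tool
// names are rejected (the hallucination-suppression case).
func parseTextToolCall(content string, validTools []string) (string, map[string]any, bool) {
	s := strings.TrimSpace(content)
	s = strings.TrimPrefix(s, "```json")
	s = strings.TrimPrefix(s, "```")
	s = strings.TrimSuffix(s, "```")
	var call struct {
		Name      string         `json:"name"`
		Arguments map[string]any `json:"arguments"`
	}
	if err := json.Unmarshal([]byte(strings.TrimSpace(s)), &call); err != nil {
		return "", nil, false
	}
	for _, t := range validTools {
		if t == call.Name {
			return call.Name, call.Arguments, true
		}
	}
	return "", nil, false // unknown tool: treat as hallucinated
}

func main() {
	raw := "```json\n{\"name\": \"read_file\", \"arguments\": {\"path\": \"main.go\"}}\n```"
	name, args, ok := parseTextToolCall(raw, []string{"read_file"})
	fmt.Println(ok, name, args["path"])
}
```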

func ParseTextToolCallsForRepair

func ParseTextToolCallsForRepair(content string, profile ToolCallTextProfile) []ToolCall

ParseTextToolCallsForRepair extracts tool-shaped JSON payloads even when the tool names do not currently match the valid tool list. This lets later runtime layers repair aliases such as forge_capability or list_capabilities instead of dropping them as hallucinated text.

type ToolCallStyle

type ToolCallStyle string

ToolCallStyle describes the primary tool-calling contract we expect a model family to follow.

const (
	ToolCallStyleNative      ToolCallStyle = "native"
	ToolCallStyleRawTextJSON ToolCallStyle = "raw_text_json"
)

type ToolCallTextProfile

type ToolCallTextProfile struct {
	AcceptTaggedToolCalls    bool
	AcceptMarkdownFences     bool
	AcceptConcatenatedJSON   bool
	AcceptToolNameJSONArgs   bool
	SuppressHallucinatedText bool
}

ToolCallTextProfile captures the raw-text tool-call formats the runtime is willing to parse for a model family.

func DefaultToolCallTextProfile

func DefaultToolCallTextProfile() ToolCallTextProfile

DefaultToolCallTextProfile accepts the common raw-text tool-call formats emitted by local/open models behind OpenAI-compatible runtimes.
