llm

package
v0.9.1 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 23, 2026 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Overview

Package llm provides LLM client implementations.

Index

Constants

View Source
const LevelTrace = slog.Level(-8)

LevelTrace is below Debug, used for wire-level payload logging.

Variables

This section is empty.

Functions

func ApplyTextToolCallFallback added in v0.9.1

func ApplyTextToolCallFallback(resp *ChatResponse, validToolNames []string, profile ToolCallTextProfile)

ApplyTextToolCallFallback upgrades raw-text tool call emissions into structured ToolCalls, suppresses obvious hallucinated tool-call shapes, and strips trailing tool-call payloads from mixed responses.

func CacheHitRate added in v0.9.1

func CacheHitRate(cacheReadInputTokens, cacheCreationInputTokens int) float64

CacheHitRate returns the cache-read share for a single interaction's token counts, in [0, 1]. It returns zero when both counts are zero, so callers never need to guard against division by zero.
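The documented contract is simple enough to sketch as a standalone function (an illustrative reimplementation for clarity, not the package source):

```go
package main

import "fmt"

// cacheHitRate mirrors the documented behavior: the cache-read share of
// all cache-eligible input tokens, and zero when both counts are zero.
func cacheHitRate(cacheReadInputTokens, cacheCreationInputTokens int) float64 {
	total := cacheReadInputTokens + cacheCreationInputTokens
	if total == 0 {
		return 0 // no cache-eligible tokens: no division-by-zero guard needed by callers
	}
	return float64(cacheReadInputTokens) / float64(total)
}

func main() {
	fmt.Println(cacheHitRate(900, 100)) // 0.9
	fmt.Println(cacheHitRate(0, 0))     // 0
}
```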

func EstimateTokens added in v0.8.0

func EstimateTokens(text string) int

EstimateTokens returns a rough token count estimate for English text. Rule of thumb: ~4 characters per token.
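Under the stated rule of thumb, the estimate reduces to integer division by four; a minimal sketch (the package's exact rounding may differ):

```go
package main

import "fmt"

// estimateTokens applies the documented heuristic of roughly four
// characters per token for English text.
func estimateTokens(text string) int {
	return len(text) / 4
}

func main() {
	// 44 characters -> about 11 tokens under the heuristic.
	fmt.Println(estimateTokens("The quick brown fox jumps over the lazy dog."))
}
```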

func ExtractToolNames added in v0.9.1

func ExtractToolNames(tools []map[string]any) []string

ExtractToolNames extracts tool names from the OpenAI-style tool definitions passed to providers.

func LooksLikeHallucinatedToolCall added in v0.9.1

func LooksLikeHallucinatedToolCall(content string, profile ToolCallTextProfile) bool

LooksLikeHallucinatedToolCall reports whether content has the shape of a tool call but does not match any valid tool.

func LooksLikeTextToolCall added in v0.9.1

func LooksLikeTextToolCall(content string, profile ToolCallTextProfile) bool

LooksLikeTextToolCall reports whether content appears to be a raw-text tool call and should be buffered until the full response is available.

func StripTopLevelCompositionKeywords added in v0.9.1

func StripTopLevelCompositionKeywords(schema map[string]any) (map[string]any, []string)

StripTopLevelCompositionKeywords returns a deep-copied schema with unsupported top-level composition keywords removed and their object properties merged into the root when possible.

This is a compatibility helper for downstream consumers that accept regular object schemas but reject top-level oneOf/allOf/anyOf. The returned schema is intentionally permissive: root-level required fields are preserved, but composition-derived required constraints are not re-encoded because doing so would often overconstrain the tool contract.
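One plausible shape of that transform, sketched under the assumptions above (branch object properties merged into the root, root-level "required" kept, branch-level "required" dropped); the real helper deep-copies and handles more edge cases:

```go
package main

import "fmt"

// stripComposition is an illustrative sketch of the documented transform,
// not the package implementation.
func stripComposition(schema map[string]any) (map[string]any, []string) {
	out := map[string]any{}
	for k, v := range schema {
		out[k] = v // shallow copy for brevity; the real helper deep-copies
	}
	props, _ := out["properties"].(map[string]any)
	if props == nil {
		props = map[string]any{}
	}
	var removed []string
	for _, kw := range []string{"oneOf", "allOf", "anyOf"} {
		branches, ok := out[kw].([]any)
		if !ok {
			continue
		}
		removed = append(removed, kw)
		delete(out, kw)
		for _, b := range branches {
			branch, ok := b.(map[string]any)
			if !ok {
				continue
			}
			if bp, ok := branch["properties"].(map[string]any); ok {
				for name, def := range bp {
					props[name] = def // merge branch properties into the root
				}
			}
			// branch-level "required" is intentionally not re-encoded,
			// per the doc comment: it would overconstrain the contract.
		}
	}
	if len(props) > 0 {
		out["properties"] = props
	}
	return out, removed
}

func main() {
	schema := map[string]any{
		"type":     "object",
		"required": []any{"mode"},
		"oneOf": []any{
			map[string]any{"properties": map[string]any{"a": map[string]any{"type": "string"}}},
			map[string]any{"properties": map[string]any{"b": map[string]any{"type": "number"}}},
		},
	}
	flat, removed := stripComposition(schema)
	fmt.Println(removed)          // [oneOf]
	fmt.Println(flat["required"]) // [mode]
}
```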

func StripTrailingToolCallText added in v0.9.1

func StripTrailingToolCallText(content string, validTools []string, profile ToolCallTextProfile) string

StripTrailingToolCallText removes trailing tool-call payloads that a model appended after prose.

Types

type AmbiguousModelError added in v0.9.1

type AmbiguousModelError struct {
	Model   string
	Targets []string
}

AmbiguousModelError reports that a model selector matches multiple qualified route targets and must be disambiguated by the caller.

func (*AmbiguousModelError) Error added in v0.9.1

func (e *AmbiguousModelError) Error() string

type ChatResponse

type ChatResponse struct {
	Model     string
	CreatedAt time.Time
	Message   Message
	Done      bool

	// Token usage (provider-neutral)
	InputTokens              int
	OutputTokens             int
	CacheCreationInputTokens int
	CacheReadInputTokens     int
	// Per-TTL breakdown of cache-write tokens. Populated by providers
	// that return a structured cache_creation breakdown (Anthropic).
	// Zero when the provider doesn't expose the breakdown, in which
	// case callers should fall back to CacheCreationInputTokens and
	// treat the TTL mix as unknown (typically charged at the 5m rate
	// for cost estimation, since that's the default).
	CacheCreation5mInputTokens int
	CacheCreation1hInputTokens int

	// Timing (populated when available)
	TotalDuration time.Duration
	LoadDuration  time.Duration
	EvalDuration  time.Duration
}

ChatResponse is the unified response from any LLM provider. All fields use proper Go types — wire format conversion happens at provider boundaries (ollama.go, anthropic.go).

func (*ChatResponse) CacheHitRate added in v0.9.1

func (r *ChatResponse) CacheHitRate() float64

CacheHitRate returns the fraction of cache-eligible input tokens on this response that were served from cache, in [0, 1]. Zero when there were no cache-eligible tokens at all. Matches the Anthropic-recommended observability metric: cache_read / (cache_read + cache_creation).

Exposed on ChatResponse (and as CacheHitRate for bare counts) so providers and loggers can surface the metric without importing the usage package, which would create an import cycle.

type Client

type Client interface {
	// Chat sends a chat completion request and returns the response.
	Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)

	// ChatStream sends a streaming chat request. If callback is non-nil, tokens are streamed to it.
	ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)

	// Ping checks if the provider is reachable.
	Ping(ctx context.Context) error
}

Client is the interface that all LLM providers must implement.

type ContextRenderStyle added in v0.9.1

type ContextRenderStyle string

ContextRenderStyle describes how runtime-generated context should be shaped for a model family.

const (
	ContextRenderStyleJSONFirst ContextRenderStyle = "json_first"
)

type DynamicClient added in v0.9.1

type DynamicClient struct {
	// contains filtered or unexported fields
}

DynamicClient is a concurrency-safe wrapper around a swappable underlying llm.Client. In-flight requests continue using the client they started with while future requests see the new client after Swap.
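The swap semantics described above are commonly built on an atomic pointer; a minimal sketch of the pattern with a stand-in one-method interface (illustrative, not the package's implementation):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pinger stands in for the package's Client interface in this sketch.
type pinger interface{ Ping() string }

type named string

func (n named) Ping() string { return string(n) }

// dynamic holds the current client in an atomic.Pointer: a call loads the
// pointer once and keeps using that client, while Swap publishes the next
// client for all future loads.
type dynamic struct {
	cur atomic.Pointer[pinger]
}

func newDynamic(initial pinger) *dynamic {
	d := &dynamic{}
	d.cur.Store(&initial)
	return d
}

func (d *dynamic) Ping() string { return (*d.cur.Load()).Ping() }

func (d *dynamic) Swap(next pinger) { d.cur.Store(&next) }

func main() {
	d := newDynamic(named("ollama"))
	fmt.Println(d.Ping()) // ollama
	d.Swap(named("anthropic"))
	fmt.Println(d.Ping()) // anthropic
}
```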

func NewDynamicClient added in v0.9.1

func NewDynamicClient(initial Client) *DynamicClient

NewDynamicClient wraps the initial client.

func (*DynamicClient) Chat added in v0.9.1

func (c *DynamicClient) Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)

Chat delegates to the current client.

func (*DynamicClient) ChatStream added in v0.9.1

func (c *DynamicClient) ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)

ChatStream delegates to the current client.

func (*DynamicClient) Ping added in v0.9.1

func (c *DynamicClient) Ping(ctx context.Context) error

Ping delegates to the current client.

func (*DynamicClient) Swap added in v0.9.1

func (c *DynamicClient) Swap(next Client) error

Swap replaces the underlying client used for future requests.

type ImageContent added in v0.8.4

type ImageContent struct {
	Data      string // base64-encoded image data (no data URI prefix)
	MediaType string // MIME type: "image/jpeg", "image/png", etc.
}

ImageContent holds a base64-encoded image for multimodal messages. Each provider serializes images differently (Ollama uses a flat base64 array, Anthropic uses typed content blocks), so the Images field on Message is excluded from default JSON marshaling.
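Populating the struct follows from the field comments: standard base64, no data URI prefix. A hedged sketch with a local copy of the type and a hypothetical helper:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// imageContent mirrors the documented struct for this sketch.
type imageContent struct {
	Data      string // base64-encoded image data (no data URI prefix)
	MediaType string // MIME type
}

// newImage encodes raw bytes as the doc describes; this helper is
// illustrative and not part of the package.
func newImage(raw []byte, mediaType string) imageContent {
	return imageContent{
		Data:      base64.StdEncoding.EncodeToString(raw),
		MediaType: mediaType,
	}
}

func main() {
	// First bytes of a PNG header, for demonstration only.
	img := newImage([]byte{0x89, 'P', 'N', 'G'}, "image/png")
	fmt.Println(img.MediaType, img.Data) // image/png iVBORw==
}
```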

type Message

type Message struct {
	Role       string          `json:"role"`
	Content    string          `json:"content"`
	Images     []ImageContent  `json:"-"` // multimodal images; marshaled per-provider
	Sections   []PromptSection `json:"-"` // system-prompt sections; provider-specific
	ToolCalls  []ToolCall      `json:"tool_calls,omitempty"`
	ToolCallID string          `json:"tool_call_id,omitempty"` // For tool responses
}

Message represents a chat message for the LLM.

type ModelInteractionProfile added in v0.9.1

type ModelInteractionProfile struct {
	Name            string
	ContextStyle    ContextRenderStyle
	ToolCallStyle   ToolCallStyle
	TextToolProfile ToolCallTextProfile
}

ModelInteractionProfile captures model-family defaults for model-facing context and tool-call compatibility.

func DefaultModelInteractionProfile added in v0.9.1

func DefaultModelInteractionProfile() ModelInteractionProfile

DefaultModelInteractionProfile returns the generic Thane default.

func ProfileForModel added in v0.9.1

func ProfileForModel(input ModelProfileInput) ModelInteractionProfile

ProfileForModel selects the most suitable model-interaction profile from provider/model-family hints. The current default is conservative: stay JSON-first for context, but switch local open-model families to a raw-text tool-call contract when they commonly emit text instead of native tool-call structures.

func (ModelInteractionProfile) ToolCallingContract added in v0.9.1

func (p ModelInteractionProfile) ToolCallingContract() string

ToolCallingContract returns a short model-facing instruction for runtimes that need to recover tool calls from raw assistant text.

type ModelProfileInput added in v0.9.1

type ModelProfileInput struct {
	Provider          string
	Model             string
	Family            string
	Families          []string
	TrainedForToolUse bool
}

ModelProfileInput is the normalized metadata used to choose a model-family interaction profile.

type MultiClient

type MultiClient struct {
	// contains filtered or unexported fields
}

MultiClient routes requests to the appropriate provider based on model name.
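At its core this is model-name lookup with a fallback; a minimal sketch of the routing idea using hypothetical model and provider names (the package adds aliases, routes, and ambiguity handling on top):

```go
package main

import "fmt"

// route resolves a model name to a provider name, falling back when the
// model is not registered. Illustrative only.
func route(models map[string]string, fallback, model string) string {
	if provider, ok := models[model]; ok {
		return provider
	}
	return fallback
}

func main() {
	models := map[string]string{
		"claude-model": "anthropic", // hypothetical registrations
		"llama-model":  "ollama",
	}
	fmt.Println(route(models, "ollama", "claude-model"))  // anthropic
	fmt.Println(route(models, "ollama", "unknown-model")) // ollama
}
```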

func NewMultiClient

func NewMultiClient(fallback Client) *MultiClient

NewMultiClient creates a client that routes to multiple providers.

func (*MultiClient) AddAlias added in v0.9.1

func (m *MultiClient) AddAlias(alias, target string)

AddAlias maps an alternate selector to a concrete route target.

func (*MultiClient) AddModel

func (m *MultiClient) AddModel(modelName, providerName string)

AddModel maps a model name to a provider.

func (*MultiClient) AddProvider

func (m *MultiClient) AddProvider(name string, client Client)

AddProvider registers a client for a provider name.

func (*MultiClient) AddRoute added in v0.9.1

func (m *MultiClient) AddRoute(target, providerName, modelName string)

AddRoute maps a route target to a provider/resource and upstream model name.

func (*MultiClient) Chat

func (m *MultiClient) Chat(ctx context.Context, model string, messages []Message, tools []map[string]any) (*ChatResponse, error)

Chat sends a request to the appropriate provider for the model.

func (*MultiClient) ChatStream

func (m *MultiClient) ChatStream(ctx context.Context, model string, messages []Message, tools []map[string]any, callback StreamCallback) (*ChatResponse, error)

ChatStream sends a streaming request to the appropriate provider.

func (*MultiClient) MarkAmbiguous added in v0.9.1

func (m *MultiClient) MarkAmbiguous(alias string, targets []string)

MarkAmbiguous records that an alias maps to multiple route targets and must be qualified by the caller.

func (*MultiClient) Ping

func (m *MultiClient) Ping(ctx context.Context) error

Ping checks the fallback provider.

type PromptSection added in v0.9.1

type PromptSection struct {
	Name     string
	Content  string
	CacheTTL string // optional provider hint, for example "1h" or "5m"
}

PromptSection preserves the semantic sections of a system prompt so providers can apply transport-specific optimizations such as prompt caching without changing the prompt text itself.

type ReadyWatcher added in v0.9.1

type ReadyWatcher interface {
	IsReady() bool
}

ReadyWatcher is satisfied by connection watchers that can report whether a provider resource is currently reachable.

type StreamCallback

type StreamCallback func(event StreamEvent)

StreamCallback receives streaming events. For backward compatibility, pure-text consumers can check event.Kind == KindToken.

type StreamEvent

type StreamEvent struct {
	Kind StreamEventKind

	// Token is set for KindToken events.
	Token string

	// ToolCall is set for KindToolCallStart events.
	ToolCall *ToolCall

	// ToolName and ToolResult are set for KindToolCallDone events.
	ToolName   string
	ToolResult string
	ToolError  string

	// Response is set for KindDone events (final summary).
	Response *ChatResponse

	// Data carries optional extensible metadata for events that need
	// more than the typed fields above. Used by KindLLMStart to
	// forward router decisions and context estimates.
	Data map[string]any
}

StreamEvent represents a single event in a streaming response. Consumers switch on Kind to determine what data is available.

type StreamEventKind

type StreamEventKind int

StreamEventKind identifies the type of stream event.

const (
	// KindToken is an incremental text token from the model.
	KindToken StreamEventKind = iota

	// KindToolCallStart fires when the model invokes a tool.
	KindToolCallStart

	// KindToolCallDone fires when a tool execution completes.
	KindToolCallDone

	// KindDone signals the stream is complete. Response carries final metadata.
	KindDone

	// KindLLMResponse fires when an LLM response is received (before
	// tool execution begins). Response carries the model name and
	// token counts at the earliest point they become available.
	KindLLMResponse

	// KindLLMStart fires immediately before an LLM API call begins.
	// Response.Model carries the selected model name so consumers
	// can display it before the call completes.
	KindLLMStart
)
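The switch-on-Kind consumption pattern described under StreamEvent can be sketched with local stand-ins for the types (the real events carry more fields, such as Response and Data):

```go
package main

import "fmt"

// streamEventKind and streamEvent are local stand-ins for this sketch.
type streamEventKind int

const (
	kindToken streamEventKind = iota
	kindToolCallStart
	kindDone
)

type streamEvent struct {
	Kind     streamEventKind
	Token    string
	ToolName string
}

// render switches on Kind to decide which fields of the event are
// meaningful, as the StreamEvent docs describe.
func render(ev streamEvent) string {
	switch ev.Kind {
	case kindToken:
		return ev.Token // incremental text
	case kindToolCallStart:
		return "[calling " + ev.ToolName + "]"
	case kindDone:
		return "\n" // final metadata would be read from the response here
	}
	return ""
}

func main() {
	events := []streamEvent{
		{Kind: kindToken, Token: "Hello"},
		{Kind: kindToolCallStart, ToolName: "search"},
		{Kind: kindDone},
	}
	for _, ev := range events {
		fmt.Print(render(ev))
	}
}
```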

type ToolCall

type ToolCall struct {
	ID       string `json:"id,omitempty"` // Provider-assigned ID (required by Anthropic for tool_result correlation)
	Function struct {
		Name      string         `json:"name"`
		Arguments map[string]any `json:"arguments"`
	} `json:"function"`
}

ToolCall represents a tool call from the model.

func ParseTextToolCalls added in v0.9.1

func ParseTextToolCalls(content string, validTools []string, profile ToolCallTextProfile) []ToolCall

ParseTextToolCalls attempts to extract structured tool calls from raw assistant text.

func ParseTextToolCallsForRepair added in v0.9.1

func ParseTextToolCallsForRepair(content string, profile ToolCallTextProfile) []ToolCall

ParseTextToolCallsForRepair extracts tool-shaped JSON payloads even when the tool names do not currently match the valid tool list. This lets later runtime layers repair aliases such as forge_capability or list_capabilities instead of dropping them as hallucinated text.

type ToolCallStyle added in v0.9.1

type ToolCallStyle string

ToolCallStyle describes the primary tool-calling contract we expect a model family to follow.

const (
	ToolCallStyleNative      ToolCallStyle = "native"
	ToolCallStyleRawTextJSON ToolCallStyle = "raw_text_json"
)

type ToolCallTextProfile added in v0.9.1

type ToolCallTextProfile struct {
	AcceptTaggedToolCalls    bool
	AcceptMarkdownFences     bool
	AcceptConcatenatedJSON   bool
	AcceptToolNameJSONArgs   bool
	SuppressHallucinatedText bool
}

ToolCallTextProfile captures the raw-text tool-call formats the runtime is willing to parse for a model family.

func DefaultToolCallTextProfile added in v0.9.1

func DefaultToolCallTextProfile() ToolCallTextProfile

DefaultToolCallTextProfile accepts the common raw-text tool-call formats emitted by local/open models behind OpenAI-compatible runtimes.
