llm

package
v0.16.11
Published: Apr 10, 2026 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation


Constants

This section is empty.

Variables

var ErrHardLimitExceeded = errors.New("session hard cost limit exceeded")

ErrHardLimitExceeded is returned when the session's hard cost limit is reached.

var ErrSoftLimitExceeded = errors.New("session soft cost limit exceeded")

ErrSoftLimitExceeded is returned when the session's soft cost limit is reached.

Functions

func TokenCost added in v0.15.0

func TokenCost(resp *ChatResponse, reg *pricing.Registry) (float64, string)

TokenCost returns the USD cost for a chat response using the pricing registry. Priority: provider-reported cost > registry lookup > fallback rate > $0. Also returns a pricing source string ("provider"/"registry"/"fallback"/"unknown").
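The documented priority order can be sketched as a standalone function. The names below are illustrative, not the package's internals, and the rate pointers stand in for *pricing.Registry lookups:

```go
package main

import "fmt"

// tokenCostSketch mirrors the documented priority: a provider-reported cost
// wins, then a registry rate, then a fallback rate, then $0 with source
// "unknown". Rates are hypothetical USD-per-million-token values.
func tokenCostSketch(providerCost float64, registryRate, fallbackRate *float64, totalTokens int) (float64, string) {
	switch {
	case providerCost > 0:
		return providerCost, "provider"
	case registryRate != nil:
		return float64(totalTokens) / 1e6 * *registryRate, "registry"
	case fallbackRate != nil:
		return float64(totalTokens) / 1e6 * *fallbackRate, "fallback"
	default:
		return 0, "unknown"
	}
}

func main() {
	rate := 3.0 // $3 per million tokens
	cost, src := tokenCostSketch(0, &rate, nil, 500_000)
	fmt.Printf("%.2f %s\n", cost, src) // 1.50 registry
}
```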

Types

type AgentStats added in v0.12.0

type AgentStats struct {
	Agent        string  `json:"agent"`
	Cost         float64 `json:"cost"`
	InputTokens  int     `json:"input_tokens"`
	OutputTokens int     `json:"output_tokens"`
	Messages     int     `json:"messages"`
	Sessions     int     `json:"sessions"`
}

AgentStats holds aggregated per-agent cost and token data.

type BalanceProvider

type BalanceProvider interface {
	FundsRemaining(ctx context.Context) (float64, error)
}

BalanceProvider is an optional interface implemented by providers that can report their remaining credit balance. FundsRemaining returns -1 if the balance is unlimited.

type ChatRequest

type ChatRequest struct {
	Model       string
	Messages    []Message
	MaxTokens   int
	Temperature *float64
	Tools       []ToolDef
	OnStream    StreamCallback // if non-nil, provider streams chunks via this callback
}

type ChatResponse

type ChatResponse struct {
	Content      string
	ToolCalls    []ToolCall
	TokensUsed   TokenUsage
	Model        string
	FinishReason string
	CostUSD      float64 // provider-reported or estimated cost in USD
}

type CostTracker

type CostTracker struct {
	// contains filtered or unexported fields
}

CostTracker tracks token usage and estimated costs per session and globally.

func NewCostTracker

func NewCostTracker(defaults SessionLimits, agentOverrides map[string]SessionLimits) *CostTracker

NewCostTracker creates a CostTracker with default limits and optional per-agent overrides.

func (*CostTracker) AgentCosts added in v0.12.0

func (ct *CostTracker) AgentCosts() []AgentStats

AgentCosts returns per-agent aggregated stats. Agent name is extracted from session IDs which have the format "agentname:adapter:externalid".
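Given that ID format, extracting the agent name is a prefix split before the first colon. A standalone sketch of the documented format (the package's own parsing may handle edge cases differently):

```go
package main

import (
	"fmt"
	"strings"
)

// agentFromSessionID extracts the agent name from the documented
// "agentname:adapter:externalid" session ID format.
func agentFromSessionID(id string) string {
	if i := strings.Index(id, ":"); i >= 0 {
		return id[:i]
	}
	return id // no colon: treat the whole ID as the agent name
}

func main() {
	fmt.Println(agentFromSessionID("support-bot:telegram:12345")) // support-bot
}
```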

func (*CostTracker) AllSessionCosts

func (ct *CostTracker) AllSessionCosts() map[string]float64

AllSessionCosts returns a copy of all session costs.

func (*CostTracker) AllSessionStats added in v0.12.0

func (ct *CostTracker) AllSessionStats() map[string]SessionStats

AllSessionStats returns a copy of all session stats.

func (*CostTracker) DefaultLimits added in v0.16.5

func (ct *CostTracker) DefaultLimits() SessionLimits

DefaultLimits returns the default cost limits.

func (*CostTracker) ExceedsBudget

func (ct *CostTracker) ExceedsBudget(sessionID string) bool

ExceedsBudget checks if a session has exceeded its hard budget. Deprecated: use ExceedsHardLimit.

func (*CostTracker) ExceedsHardLimit added in v0.16.5

func (ct *CostTracker) ExceedsHardLimit(sessionID string) bool

ExceedsHardLimit checks if a session has exceeded its hard cost limit. Returns false if the hard limit is disabled (zero).

func (*CostTracker) ExceedsSoftLimit added in v0.16.5

func (ct *CostTracker) ExceedsSoftLimit(sessionID string) bool

ExceedsSoftLimit checks if a session has exceeded its soft cost limit. Returns false if the soft limit is disabled (zero).

func (*CostTracker) GlobalCost

func (ct *CostTracker) GlobalCost() float64

GlobalCost returns the total cost across all sessions.

func (*CostTracker) MaxBudgetPerSession

func (ct *CostTracker) MaxBudgetPerSession() float64

MaxBudgetPerSession returns the default hard cost cap. Deprecated: use DefaultLimits().Hard.

func (*CostTracker) Record

func (ct *CostTracker) Record(sessionID string, cost float64) bool

Record adds cost for a session. Returns true if within hard budget.

func (*CostTracker) RecordWithTokens added in v0.12.0

func (ct *CostTracker) RecordWithTokens(sessionID string, cost float64, inputTokens, outputTokens int, pricingSource ...string) bool

RecordWithTokens adds cost and token usage for a session. Returns true if within budget. The optional pricingSource parameter records which pricing method was used.

func (*CostTracker) SessionCost

func (ct *CostTracker) SessionCost(sessionID string) float64

SessionCost returns the total cost for a session.

func (*CostTracker) SetAgentLimits added in v0.16.5

func (ct *CostTracker) SetAgentLimits(agent string, limits SessionLimits)

SetAgentLimits sets per-agent cost limit overrides.

type FallbackRule

type FallbackRule struct {
	Trigger    string  // "error" | "rate_limit" | "low_funds"
	Action     string  // "switch_provider" | "switch_model" | "wait_and_retry"
	Provider   string  // provider name — for switch_provider
	Model      string  // model name — for switch_model; optional for switch_provider
	Threshold  float64 // remaining credit threshold in USD — for low_funds
	MaxRetries int     // number of retry attempts — for wait_and_retry
	Backoff    string  // "exponential" | "constant" — for wait_and_retry
}

FallbackRule describes a single fallback step the router will attempt.
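A plausible ordered chain of the kind passed to Router.SetFallbacks might look like this. The struct is mirrored from the documentation so the example is self-contained, and the provider and model names are purely illustrative:

```go
package main

import "fmt"

// FallbackRule mirrors the documented struct.
type FallbackRule struct {
	Trigger    string
	Action     string
	Provider   string
	Model      string
	Threshold  float64
	MaxRetries int
	Backoff    string
}

func main() {
	// Retry on rate limits first, then switch provider on hard errors,
	// then drop to a cheaper model when remaining credit falls below $1.
	rules := []FallbackRule{
		{Trigger: "rate_limit", Action: "wait_and_retry", MaxRetries: 3, Backoff: "exponential"},
		{Trigger: "error", Action: "switch_provider", Provider: "anthropic"},
		{Trigger: "low_funds", Action: "switch_model", Model: "gpt-4o-mini", Threshold: 1.00},
	}
	fmt.Println(len(rules)) // 3
}
```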

type FunctionCall

type FunctionCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

FunctionCall is the function name and JSON-encoded arguments within a ToolCall.

type FunctionDef

type FunctionDef struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Parameters  map[string]any `json:"parameters"` // JSON Schema object
}

FunctionDef describes the function signature within a ToolDef.

type LLMError

type LLMError struct {
	StatusCode int
	Message    string
}

LLMError is returned by providers for API-level failures. Use errors.As to unwrap from wrapped errors.

func (*LLMError) Error

func (e *LLMError) Error() string

func (*LLMError) Retryable

func (e *LLMError) Retryable() bool

Retryable reports whether the error is worth retrying. Non-retryable: 400 (bad request), 401 (auth), 402 (payment), 404 (not found), 422 (unprocessable). Retryable: 429 (rate limit), 5xx (server errors), and any unrecognised status.

type Message

type Message struct {
	Role       string     `json:"role"`
	Content    string     `json:"content"`
	ToolCalls  []ToolCall `json:"tool_calls,omitempty"`
	ToolCallID string     `json:"tool_call_id,omitempty"`
}

type ModelDetailLister added in v0.16.11

type ModelDetailLister interface {
	ListModelDetails(ctx context.Context) ([]ModelInfo, error)
}

ModelDetailLister is an optional interface for providers that can return enriched model metadata (pricing, capabilities). Providers that implement this are preferred over the static heuristic in Router.ListModelDetails.

type ModelInfo added in v0.16.11

type ModelInfo struct {
	ID            string   `json:"id"`
	Name          string   `json:"name"`
	Provider      string   `json:"provider"`
	InputPerMTok  *float64 `json:"input_per_mtok"`  // nil = pricing unknown
	OutputPerMTok *float64 `json:"output_per_mtok"` // nil = pricing unknown
	SupportsTools bool     `json:"supports_tools"`
	WeeklyTokens  int64    `json:"weekly_tokens"` // 0 = unknown; used for popularity sort
}

ModelInfo holds enriched metadata about an available LLM model.

type ModelLister added in v0.15.1

type ModelLister interface {
	ListModels(ctx context.Context) ([]string, error)
}

ModelLister is an optional interface implemented by providers that can enumerate their available models.

type OAIPromptTokenDetail added in v0.16.9

type OAIPromptTokenDetail struct {
	CachedTokens int `json:"cached_tokens"`
}

OAIPromptTokenDetail holds cached token info from the usage block.

type OAIStreamResult added in v0.16.9

type OAIStreamResult struct {
	Content      string
	ToolCalls    []ToolCall
	Model        string
	FinishReason string
	Usage        *OAIStreamUsage
	// ReasoningContent captures reasoning_content deltas (OpenRouter).
	ReasoningContent string
}

OAIStreamResult holds the accumulated data from an OpenAI-compatible streaming response. Providers call ReadOAIStream to parse the SSE body and then convert this into an llm.ChatResponse.

func ReadOAIStream added in v0.16.9

func ReadOAIStream(body io.Reader, onStream StreamCallback) (*OAIStreamResult, error)

ReadOAIStream reads an OpenAI-compatible SSE stream body and calls onStream for each content/reasoning delta. It accumulates tool calls and usage and returns the full result. onStream may be nil.

type OAIStreamUsage added in v0.16.9

type OAIStreamUsage struct {
	PromptTokens        int                   `json:"prompt_tokens"`
	CompletionTokens    int                   `json:"completion_tokens"`
	TotalTokens         int                   `json:"total_tokens"`
	PromptTokensDetails *OAIPromptTokenDetail `json:"prompt_tokens_details,omitempty"`
	Cost                float64               `json:"cost"` // OpenRouter reports cost here
}

OAIStreamUsage mirrors the usage block in the final SSE chunk.

type Provider

type Provider interface {
	ChatCompletion(ctx context.Context, req ChatRequest) (*ChatResponse, error)
	Name() string
	HealthCheck(ctx context.Context) error
}

Provider defines the interface for LLM backends.

type ProviderMetadata

type ProviderMetadata struct {
	Name    string
	BaseURL string
	Models  []string
}

type Router

type Router struct {
	// contains filtered or unexported fields
}

Router selects the appropriate LLM provider for a request.

func NewRouter

func NewRouter(defaultProvider, defaultModel string, costTracker *CostTracker) *Router

func (*Router) Complete

func (r *Router) Complete(ctx context.Context, sessionID string, messages []Message) (*ChatResponse, error)

func (*Router) CompleteStream added in v0.16.9

func (r *Router) CompleteStream(ctx context.Context, sessionID string, messages []Message, onStream StreamCallback) (*ChatResponse, error)

CompleteStream is like Complete but enables real-time streaming of content chunks via the onStream callback. If the active provider does not support streaming, it falls back to the non-streaming path transparently.

func (*Router) CostTracker added in v0.16.5

func (r *Router) CostTracker() *CostTracker

CostTracker returns the router's cost tracker for soft-limit checks.

func (*Router) DefaultModel

func (r *Router) DefaultModel() string

DefaultModel returns the router's default model name.

func (*Router) HealthCheck

func (r *Router) HealthCheck(ctx context.Context) error

func (*Router) ListModelDetails added in v0.16.11

func (r *Router) ListModelDetails(ctx context.Context) []ModelInfo

ListModelDetails queries all providers and returns enriched model metadata including pricing and tool support. Providers implementing ModelDetailLister supply authoritative data; others fall back to the pricing registry and a static tool-support heuristic.

func (*Router) ListModels added in v0.15.1

func (r *Router) ListModels(ctx context.Context) []string

ListModels queries all registered providers that implement ModelLister and returns a de-duplicated sorted list of available model names.

func (*Router) RegisterProvider

func (r *Router) RegisterProvider(p Provider)

func (*Router) SetDefaultModel added in v0.11.0

func (r *Router) SetDefaultModel(model string)

SetDefaultModel changes the router's default model for subsequent requests.

func (*Router) SetFallbacks

func (r *Router) SetFallbacks(rules []FallbackRule)

SetFallbacks configures the ordered list of fallback rules.

func (*Router) SetPricing added in v0.15.1

func (r *Router) SetPricing(reg *pricing.Registry)

SetPricing configures the model pricing registry used by TokenCost.

func (*Router) SetTools

func (r *Router) SetTools(source func() []ToolDef)

SetTools configures a dynamic tool definition source. The function is called on every LLM request so that tools added at runtime are visible immediately.
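Because the source function is re-evaluated on every request, a mutex-guarded registry works as a dynamic source. A sketch with the tool types mirrored from the documentation (toolRegistry itself is hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

// FunctionDef and ToolDef mirror the documented types.
type FunctionDef struct {
	Name        string
	Description string
	Parameters  map[string]any
}

type ToolDef struct {
	Type     string
	Function FunctionDef
}

// toolRegistry is a hypothetical runtime-mutable tool source. Its Defs
// method is the kind of function you would pass to Router.SetTools: called
// on every request, so tools added later are visible immediately.
type toolRegistry struct {
	mu   sync.Mutex
	defs []ToolDef
}

func (t *toolRegistry) Add(d ToolDef) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.defs = append(t.defs, d)
}

func (t *toolRegistry) Defs() []ToolDef {
	t.mu.Lock()
	defer t.mu.Unlock()
	return append([]ToolDef(nil), t.defs...) // defensive copy
}

func main() {
	reg := &toolRegistry{}
	// r.SetTools(reg.Defs) // wiring into a real Router
	reg.Add(ToolDef{Type: "function", Function: FunctionDef{Name: "get_weather"}})
	fmt.Println(len(reg.Defs())) // 1
}
```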

type SSEEvent added in v0.16.9

type SSEEvent struct {
	Type string // the "event:" field (empty if not present)
	Data string // the "data:" field payload
}

SSEEvent represents a single Server-Sent Event with an optional event type.

type SSEScanner added in v0.16.9

type SSEScanner struct {
	// contains filtered or unexported fields
}

SSEScanner reads Server-Sent Events from an io.Reader. It yields one SSEEvent per event block, stopping when the reader is exhausted or a "data: [DONE]" sentinel is encountered.

func NewSSEScanner added in v0.16.9

func NewSSEScanner(r io.Reader) *SSEScanner

NewSSEScanner creates a scanner over the given reader.

func (*SSEScanner) Err added in v0.16.9

func (s *SSEScanner) Err() error

Err returns the first non-EOF error encountered by the scanner.

func (*SSEScanner) Event added in v0.16.9

func (s *SSEScanner) Event() SSEEvent

Event returns the most recently scanned SSE event.

func (*SSEScanner) Next added in v0.16.9

func (s *SSEScanner) Next() bool

Next advances to the next SSE event. Returns false when no more events are available (either EOF or [DONE] sentinel).

type SessionLimits added in v0.16.5

type SessionLimits struct {
	Soft float64 `json:"soft"`
	Hard float64 `json:"hard"`
}

SessionLimits holds cost limit thresholds for a session. A zero value means the corresponding limit is disabled.
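The zero-disables rule used by ExceedsSoftLimit and ExceedsHardLimit can be sketched as a one-line check (whether the real comparison is strict or inclusive at the boundary is an assumption here):

```go
package main

import "fmt"

// SessionLimits mirrors the documented struct.
type SessionLimits struct {
	Soft float64
	Hard float64
}

// exceeds applies the documented zero-means-disabled rule.
func exceeds(cost, limit float64) bool {
	return limit > 0 && cost >= limit
}

func main() {
	limits := SessionLimits{Soft: 0.50, Hard: 2.00}
	fmt.Println(exceeds(0.75, limits.Soft)) // true
	fmt.Println(exceeds(0.75, limits.Hard)) // false
	fmt.Println(exceeds(100, 0))            // false: zero disables the limit
}
```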

type SessionStats added in v0.12.0

type SessionStats struct {
	Cost           float64        `json:"cost"`
	InputTokens    int            `json:"input_tokens"`
	OutputTokens   int            `json:"output_tokens"`
	Messages       int            `json:"messages"`
	PricingSources map[string]int `json:"pricing_sources,omitempty"`
}

SessionStats holds per-session cost and token tracking.

type StreamCallback added in v0.16.9

type StreamCallback func(chunk StreamChunk)

StreamCallback is invoked for each chunk during a streaming LLM call. It is called synchronously from the provider's HTTP response reader.

type StreamChunk added in v0.16.9

type StreamChunk struct {
	ContentDelta  string // incremental text content
	ThinkingDelta string // incremental thinking/reasoning content
}

StreamChunk carries a single incremental piece of a streaming response.

type StreamingProvider added in v0.16.9

type StreamingProvider interface {
	Provider
	SupportsStreaming() bool
}

StreamingProvider is an optional interface. Providers that implement it honour the OnStream callback field on ChatRequest.

type TokenUsage

type TokenUsage struct {
	Prompt       int
	Completion   int
	CachedPrompt int // tokens served from cache (Anthropic cache_read, OpenAI cached_tokens)
	Total        int
}

type ToolCall

type ToolCall struct {
	ID       string       `json:"id"`
	Type     string       `json:"type"` // "function"
	Function FunctionCall `json:"function"`
}

ToolCall represents a tool invocation requested by the LLM (OpenAI format).

type ToolDef

type ToolDef struct {
	Type     string      `json:"type"` // "function"
	Function FunctionDef `json:"function"`
}

ToolDef describes a tool available for the LLM to call (OpenAI format).

Directories

Path Synopsis
Package anthropic implements the llm.Provider interface against the Anthropic Messages API (https://docs.anthropic.com/en/api/messages).
