llm

package
v0.16.9 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 10, 2026 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

View Source
var ErrHardLimitExceeded = errors.New("session hard cost limit exceeded")

ErrHardLimitExceeded is returned when the session's hard cost limit is reached.

View Source
var ErrSoftLimitExceeded = errors.New("session soft cost limit exceeded")

ErrSoftLimitExceeded is returned when the session's soft cost limit is reached.

Functions

func TokenCost added in v0.15.0

func TokenCost(resp *ChatResponse, reg *pricing.Registry) (float64, string)

TokenCost returns the USD cost for a chat response using the pricing registry. Priority: provider-reported cost > registry lookup > fallback rate > $0. Also returns a pricing source string ("provider"/"registry"/"fallback"/"unknown").
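The priority chain can be sketched with local stand-ins. The `resp` struct and the per-million-token rate maps below are illustrative assumptions, not the real `llm.ChatResponse` or `pricing.Registry` API; only the documented ordering (provider-reported cost > registry lookup > fallback rate > $0) is taken from the source.

```go
package main

import "fmt"

// resp is a hypothetical stand-in for llm.ChatResponse; field names are
// assumptions for illustration only.
type resp struct {
	costUSD      float64 // provider-reported cost, 0 if absent
	model        string
	inputTokens  int
	outputTokens int
}

// tokenCost mirrors the documented priority:
// provider-reported cost > registry lookup > fallback rate > $0.
func tokenCost(r resp, perMTok map[string]float64, fallbackPerMTok float64) (float64, string) {
	if r.costUSD > 0 {
		return r.costUSD, "provider"
	}
	tokens := float64(r.inputTokens + r.outputTokens)
	if rate, ok := perMTok[r.model]; ok {
		return tokens / 1e6 * rate, "registry"
	}
	if fallbackPerMTok > 0 {
		return tokens / 1e6 * fallbackPerMTok, "fallback"
	}
	return 0, "unknown"
}

func main() {
	reg := map[string]float64{"gpt-x": 2.0}
	// 1e6 total tokens at the registry rate of $2/MTok.
	cost, src := tokenCost(resp{model: "gpt-x", inputTokens: 500000, outputTokens: 500000}, reg, 1.0)
	fmt.Println(cost, src) // registry rate wins over the fallback rate
}
```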

Types

type AgentStats added in v0.12.0

type AgentStats struct {
	Agent        string  `json:"agent"`
	Cost         float64 `json:"cost"`
	InputTokens  int     `json:"input_tokens"`
	OutputTokens int     `json:"output_tokens"`
	Messages     int     `json:"messages"`
	Sessions     int     `json:"sessions"`
}

AgentStats holds aggregated per-agent cost and token data.

type BalanceProvider

type BalanceProvider interface {
	FundsRemaining(ctx context.Context) (float64, error)
}

BalanceProvider is an optional interface implemented by providers that can report their remaining credit balance. FundsRemaining returns -1 if the balance is unlimited.

type ChatRequest

type ChatRequest struct {
	Model       string
	Messages    []Message
	MaxTokens   int
	Temperature *float64
	Tools       []ToolDef
	OnStream    StreamCallback // if non-nil, provider streams chunks via this callback
}

type ChatResponse

type ChatResponse struct {
	Content      string
	ToolCalls    []ToolCall
	TokensUsed   TokenUsage
	Model        string
	FinishReason string
	CostUSD      float64 // provider-reported or estimated cost in USD
}

type CostTracker

type CostTracker struct {
	// contains filtered or unexported fields
}

CostTracker tracks token usage and estimated costs per session and globally.

func NewCostTracker

func NewCostTracker(defaults SessionLimits, agentOverrides map[string]SessionLimits) *CostTracker

NewCostTracker creates a CostTracker with default limits and optional per-agent overrides.

func (*CostTracker) AgentCosts added in v0.12.0

func (ct *CostTracker) AgentCosts() []AgentStats

AgentCosts returns per-agent aggregated stats. The agent name is extracted from session IDs, which have the format "agentname:adapter:externalid".

func (*CostTracker) AllSessionCosts

func (ct *CostTracker) AllSessionCosts() map[string]float64

AllSessionCosts returns a copy of all session costs.

func (*CostTracker) AllSessionStats added in v0.12.0

func (ct *CostTracker) AllSessionStats() map[string]SessionStats

AllSessionStats returns a copy of all session stats.

func (*CostTracker) DefaultLimits added in v0.16.5

func (ct *CostTracker) DefaultLimits() SessionLimits

DefaultLimits returns the default cost limits.

func (*CostTracker) ExceedsBudget

func (ct *CostTracker) ExceedsBudget(sessionID string) bool

ExceedsBudget checks if a session has exceeded its hard budget. Deprecated: use ExceedsHardLimit.

func (*CostTracker) ExceedsHardLimit added in v0.16.5

func (ct *CostTracker) ExceedsHardLimit(sessionID string) bool

ExceedsHardLimit checks if a session has exceeded its hard cost limit. Returns false if the hard limit is disabled (zero).

func (*CostTracker) ExceedsSoftLimit added in v0.16.5

func (ct *CostTracker) ExceedsSoftLimit(sessionID string) bool

ExceedsSoftLimit checks if a session has exceeded its soft cost limit. Returns false if the soft limit is disabled (zero).

func (*CostTracker) GlobalCost

func (ct *CostTracker) GlobalCost() float64

GlobalCost returns the total cost across all sessions.

func (*CostTracker) MaxBudgetPerSession

func (ct *CostTracker) MaxBudgetPerSession() float64

MaxBudgetPerSession returns the default hard cost cap. Deprecated: use DefaultLimits().Hard.

func (*CostTracker) Record

func (ct *CostTracker) Record(sessionID string, cost float64) bool

Record adds cost for a session. Returns true if within hard budget.

func (*CostTracker) RecordWithTokens added in v0.12.0

func (ct *CostTracker) RecordWithTokens(sessionID string, cost float64, inputTokens, outputTokens int, pricingSource ...string) bool

RecordWithTokens adds cost and token usage for a session. Returns true if within budget. The optional pricingSource parameter records which pricing method was used.

func (*CostTracker) SessionCost

func (ct *CostTracker) SessionCost(sessionID string) float64

SessionCost returns the total cost for a session.

func (*CostTracker) SetAgentLimits added in v0.16.5

func (ct *CostTracker) SetAgentLimits(agent string, limits SessionLimits)

SetAgentLimits sets per-agent cost limit overrides.

type FallbackRule

type FallbackRule struct {
	Trigger    string  // "error" | "rate_limit" | "low_funds"
	Action     string  // "switch_provider" | "switch_model" | "wait_and_retry"
	Provider   string  // provider name — for switch_provider
	Model      string  // model name — for switch_model; optional for switch_provider
	Threshold  float64 // remaining credit threshold in USD — for low_funds
	MaxRetries int     // number of retry attempts — for wait_and_retry
	Backoff    string  // "exponential" | "constant" — for wait_and_retry
}

FallbackRule describes a single fallback step the router will attempt.

type FunctionCall

type FunctionCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

FunctionCall is the function name and JSON-encoded arguments within a ToolCall.

type FunctionDef

type FunctionDef struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Parameters  map[string]any `json:"parameters"` // JSON Schema object
}

FunctionDef describes the function signature within a ToolDef.

type LLMError

type LLMError struct {
	StatusCode int
	Message    string
}

LLMError is returned by providers for API-level failures. Use errors.As to unwrap from wrapped errors.

func (*LLMError) Error

func (e *LLMError) Error() string

func (*LLMError) Retryable

func (e *LLMError) Retryable() bool

Retryable reports whether the error is worth retrying. Non-retryable: 400 (bad request), 401 (auth), 402 (payment), 404 (not found), 422 (unprocessable). Retryable: 429 (rate limit), 5xx (server errors), and any unrecognised status.

type Message

type Message struct {
	Role       string     `json:"role"`
	Content    string     `json:"content"`
	ToolCalls  []ToolCall `json:"tool_calls,omitempty"`
	ToolCallID string     `json:"tool_call_id,omitempty"`
}

type ModelLister added in v0.15.1

type ModelLister interface {
	ListModels(ctx context.Context) ([]string, error)
}

ModelLister is an optional interface implemented by providers that can enumerate their available models.

type OAIPromptTokenDetail added in v0.16.9

type OAIPromptTokenDetail struct {
	CachedTokens int `json:"cached_tokens"`
}

OAIPromptTokenDetail holds cached token info from the usage block.

type OAIStreamResult added in v0.16.9

type OAIStreamResult struct {
	Content      string
	ToolCalls    []ToolCall
	Model        string
	FinishReason string
	Usage        *OAIStreamUsage
	// ReasoningContent captures reasoning_content deltas (OpenRouter).
	ReasoningContent string
}

OAIStreamResult holds the accumulated data from an OpenAI-compatible streaming response. Providers call ReadOAIStream to parse the SSE body and then convert this into an llm.ChatResponse.

func ReadOAIStream added in v0.16.9

func ReadOAIStream(body io.Reader, onStream StreamCallback) (*OAIStreamResult, error)

ReadOAIStream reads an OpenAI-compatible SSE stream body and calls onStream for each content/reasoning delta. It accumulates tool calls and usage and returns the full result. onStream may be nil.

type OAIStreamUsage added in v0.16.9

type OAIStreamUsage struct {
	PromptTokens        int                   `json:"prompt_tokens"`
	CompletionTokens    int                   `json:"completion_tokens"`
	TotalTokens         int                   `json:"total_tokens"`
	PromptTokensDetails *OAIPromptTokenDetail `json:"prompt_tokens_details,omitempty"`
	Cost                float64               `json:"cost"` // OpenRouter reports cost here
}

OAIStreamUsage mirrors the usage block in the final SSE chunk.

type Provider

type Provider interface {
	ChatCompletion(ctx context.Context, req ChatRequest) (*ChatResponse, error)
	Name() string
	HealthCheck(ctx context.Context) error
}

Provider defines the interface for LLM backends.

type ProviderMetadata

type ProviderMetadata struct {
	Name    string
	BaseURL string
	Models  []string
}

type Router

type Router struct {
	// contains filtered or unexported fields
}

Router selects the appropriate LLM provider for a request.

func NewRouter

func NewRouter(defaultProvider, defaultModel string, costTracker *CostTracker) *Router

func (*Router) Complete

func (r *Router) Complete(ctx context.Context, sessionID string, messages []Message) (*ChatResponse, error)

func (*Router) CompleteStream added in v0.16.9

func (r *Router) CompleteStream(ctx context.Context, sessionID string, messages []Message, onStream StreamCallback) (*ChatResponse, error)

CompleteStream is like Complete but enables real-time streaming of content chunks via the onStream callback. If the active provider does not support streaming, it falls back to the non-streaming path transparently.

func (*Router) CostTracker added in v0.16.5

func (r *Router) CostTracker() *CostTracker

CostTracker returns the router's cost tracker for soft-limit checks.

func (*Router) DefaultModel

func (r *Router) DefaultModel() string

DefaultModel returns the router's default model name.

func (*Router) HealthCheck

func (r *Router) HealthCheck(ctx context.Context) error

func (*Router) ListModels added in v0.15.1

func (r *Router) ListModels(ctx context.Context) []string

ListModels queries all registered providers that implement ModelLister and returns a de-duplicated sorted list of available model names.

func (*Router) RegisterProvider

func (r *Router) RegisterProvider(p Provider)

func (*Router) SetDefaultModel added in v0.11.0

func (r *Router) SetDefaultModel(model string)

SetDefaultModel changes the router's default model for subsequent requests.

func (*Router) SetFallbacks

func (r *Router) SetFallbacks(rules []FallbackRule)

SetFallbacks configures the ordered list of fallback rules.

func (*Router) SetPricing added in v0.15.1

func (r *Router) SetPricing(reg *pricing.Registry)

SetPricing configures the model pricing registry used by TokenCost.

func (*Router) SetTools

func (r *Router) SetTools(source func() []ToolDef)

SetTools configures a dynamic tool definition source. The function is called on every LLM request so that tools added at runtime are visible immediately.

type SSEEvent added in v0.16.9

type SSEEvent struct {
	Type string // the "event:" field (empty if not present)
	Data string // the "data:" field payload
}

SSEEvent represents a single Server-Sent Event with an optional event type.

type SSEScanner added in v0.16.9

type SSEScanner struct {
	// contains filtered or unexported fields
}

SSEScanner reads Server-Sent Events from an io.Reader. It yields one SSEEvent per event block, stopping when the reader is exhausted or a "data: [DONE]" sentinel is encountered.

func NewSSEScanner added in v0.16.9

func NewSSEScanner(r io.Reader) *SSEScanner

NewSSEScanner creates a scanner over the given reader.

func (*SSEScanner) Err added in v0.16.9

func (s *SSEScanner) Err() error

Err returns the first non-EOF error encountered by the scanner.

func (*SSEScanner) Event added in v0.16.9

func (s *SSEScanner) Event() SSEEvent

Event returns the most recently scanned SSE event.

func (*SSEScanner) Next added in v0.16.9

func (s *SSEScanner) Next() bool

Next advances to the next SSE event. Returns false when no more events are available (either EOF or [DONE] sentinel).

type SessionLimits added in v0.16.5

type SessionLimits struct {
	Soft float64 `json:"soft"`
	Hard float64 `json:"hard"`
}

SessionLimits holds cost limit thresholds for a session. A zero value means the corresponding limit is disabled.

type SessionStats added in v0.12.0

type SessionStats struct {
	Cost           float64        `json:"cost"`
	InputTokens    int            `json:"input_tokens"`
	OutputTokens   int            `json:"output_tokens"`
	Messages       int            `json:"messages"`
	PricingSources map[string]int `json:"pricing_sources,omitempty"`
}

SessionStats holds per-session cost and token tracking.

type StreamCallback added in v0.16.9

type StreamCallback func(chunk StreamChunk)

StreamCallback is invoked for each chunk during a streaming LLM call. It is called synchronously from the provider's HTTP response reader.

type StreamChunk added in v0.16.9

type StreamChunk struct {
	ContentDelta  string // incremental text content
	ThinkingDelta string // incremental thinking/reasoning content
}

StreamChunk carries a single incremental piece of a streaming response.

type StreamingProvider added in v0.16.9

type StreamingProvider interface {
	Provider
	SupportsStreaming() bool
}

StreamingProvider is an optional interface. Providers that implement it honour the OnStream callback field on ChatRequest.

type TokenUsage

type TokenUsage struct {
	Prompt       int
	Completion   int
	CachedPrompt int // tokens served from cache (Anthropic cache_read, OpenAI cached_tokens)
	Total        int
}

type ToolCall

type ToolCall struct {
	ID       string       `json:"id"`
	Type     string       `json:"type"` // "function"
	Function FunctionCall `json:"function"`
}

ToolCall represents a tool invocation requested by the LLM (OpenAI format).

type ToolDef

type ToolDef struct {
	Type     string      `json:"type"` // "function"
	Function FunctionDef `json:"function"`
}

ToolDef describes a tool available for the LLM to call (OpenAI format).

Directories

Path Synopsis
Package anthropic implements the llm.Provider interface against the Anthropic Messages API (https://docs.anthropic.com/en/api/messages).
