llm

package
v0.2.0 Latest
Published: Apr 27, 2026 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Overview

Package llm defines the provider-agnostic interface and types used throughout Cassandra. No package outside llm/ should import a provider sub-package directly; they interact exclusively through the Model interface defined here.

Index

Constants

const (
	// DefaultRetryAttempts is the number of total attempts (1 initial + 2 retries).
	DefaultRetryAttempts = 3
	// DefaultRetryBaseDelay is the starting back-off delay between attempts.
	DefaultRetryBaseDelay = time.Second
)
const DefaultMaxTokens = 8192

DefaultMaxTokens is the fallback max-tokens budget for LLM calls when the caller does not specify one — covering both GenerateContent (via core.Agent.RunReview) and GenerateStructuredContent (via StructuredConfig.Resolve). Kept in sync with the CLI's --max-tokens default (see cmd/ai_reviewer) so every pass has consistent headroom.

Variables

This section is empty.

Functions

This section is empty.

Types

type FinishReason added in v0.2.0

type FinishReason string

FinishReason identifies why the model stopped generating content.

const (
	FinishReasonStop   FinishReason = "stop"   // normal termination
	FinishReasonLength FinishReason = "length" // hit max tokens
	FinishReasonOther  FinishReason = "other"  // safety filters, errors, etc.
)

type Message

type Message struct {
	Role             Role
	Text             string
	ToolCalls        []ToolCall
	ToolResults      []ToolResult
	Reasoning        string         // Internal reasoning/thought process from the model
	ProviderMetadata map[string]any // Opaque provider-specific data (e.g. thought signatures)
	// CacheBreakpoint, when true on a RoleSystem message, marks the end of the
	// stable cacheable prefix. Providers that support prompt caching (e.g.
	// Anthropic) use this to inject a cache-control marker; all other providers
	// ignore it.
	CacheBreakpoint bool
}

Message is a single turn in a conversation. Fields are zero-valued when not applicable to the Role:

  • RoleSystem / RoleUser / RoleAssistant (text-only): only Text is set.
  • RoleAssistant (tool requests): ToolCalls is set (Text may also be set).
  • RoleTool: ToolResults is set.

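The role rules above can be sketched with a minimal prompt builder. The types here are local mirrors of the documented `llm.Role` and `llm.Message` fields (reduced to what this example needs), and the system-prompt text is a hypothetical placeholder:

```go
package main

import "fmt"

// Local mirrors of llm.Role and llm.Message, reduced to the fields used here.
type Role string

const (
	RoleSystem Role = "system"
	RoleUser   Role = "user"
)

type Message struct {
	Role            Role
	Text            string
	CacheBreakpoint bool
}

// buildPrompt assembles a minimal conversation: a stable system prefix
// (marked as the end of the cacheable region) followed by the user's request.
func buildPrompt(system, question string) []Message {
	return []Message{
		{Role: RoleSystem, Text: system, CacheBreakpoint: true},
		{Role: RoleUser, Text: question},
	}
}

func main() {
	msgs := buildPrompt("You are a code reviewer.", "Review this diff.")
	for _, m := range msgs {
		fmt.Printf("%s: %s\n", m.Role, m.Text)
	}
}
```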
type Model

type Model interface {
	GenerateContent(ctx context.Context, messages []Message, tools []ToolDef, maxTokens int) (*Response, error)
	// GenerateStructuredContent requests the model to produce output adhering to
	// the provided JSON Schema. The schema should be a map[string]any following
	// the JSON Schema specification.
	GenerateStructuredContent(ctx context.Context, messages []Message, schema map[string]any, config StructuredConfig) (*Response, error)
}

Model is the only interface core.Agent depends on. Implementations live in llm/anthropic, llm/google, and llm/openai.

type Response

type Response struct {
	Text             string         // set when the model produced a final answer
	ToolCalls        []ToolCall     // set when the model wants to invoke tools
	Reasoning        string         // set when the model provides internal reasoning
	ProviderMetadata map[string]any // opaque data to be echoed in subsequent turns
	Usage            Usage          // token usage for this interaction
	FinishReason     FinishReason   // why the model stopped generating
}

Response is what the model returns from a single GenerateContent call. At least one of Text or ToolCalls will be non-empty; providers that support mixed streaming turns may populate both simultaneously.
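An agent loop typically branches on exactly these fields. The following is a sketch (with local mirrors of the relevant `Response` fields, not the package's actual dispatch logic): run requested tools first, flag a truncated answer, otherwise treat Text as final.

```go
package main

import "fmt"

type FinishReason string

const (
	FinishReasonStop   FinishReason = "stop"
	FinishReasonLength FinishReason = "length"
)

// Local mirror of the llm.Response fields relevant to dispatch.
type ToolCall struct{ Name string }

type Response struct {
	Text         string
	ToolCalls    []ToolCall
	FinishReason FinishReason
}

// nextStep decides what a caller should do with a response: execute the
// requested tools before reading Text, surface truncation, or finish.
func nextStep(r *Response) string {
	switch {
	case len(r.ToolCalls) > 0:
		return "run-tools" // mixed turns may also carry Text
	case r.FinishReason == FinishReasonLength:
		return "truncated" // answer hit the max-tokens budget
	default:
		return "done"
	}
}

func main() {
	fmt.Println(nextStep(&Response{Text: "LGTM", FinishReason: FinishReasonStop}))
}
```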

type RetryingModel added in v0.1.0

type RetryingModel struct {
	// contains filtered or unexported fields
}

RetryingModel wraps any Model and transparently retries on any error (network failures, rate limits, server errors, etc.) using exponential back-off. It implements the Model interface.

func NewRetryingModel added in v0.1.0

func NewRetryingModel(inner Model, maxAttempts int, baseDelay time.Duration) *RetryingModel

NewRetryingModel returns a Model that makes up to maxAttempts total attempts per call (i.e. 1 initial attempt + maxAttempts-1 retries), doubling the delay after each failure starting from baseDelay.

The wrapper respects context cancellation: if ctx is cancelled between attempts, the last error is returned immediately without further retries.

func (*RetryingModel) GenerateContent added in v0.1.0

func (r *RetryingModel) GenerateContent(ctx context.Context, messages []Message, tools []ToolDef, maxTokens int) (*Response, error)

GenerateContent calls the underlying model, retrying on any error.

func (*RetryingModel) GenerateStructuredContent added in v0.1.0

func (r *RetryingModel) GenerateStructuredContent(ctx context.Context, messages []Message, schema map[string]any, config StructuredConfig) (*Response, error)

GenerateStructuredContent calls the underlying model, retrying on any error.

type Role

type Role string

Role identifies the author of a message in a conversation.

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type StructuredConfig

type StructuredConfig struct {
	// ModelOverride allows using a different model for the structured pass.
	ModelOverride string
	// MaxTokens limits the length of the LLM response.
	MaxTokens int
}

StructuredConfig provides options for structured output generation.

func (StructuredConfig) Resolve added in v0.1.0

func (c StructuredConfig) Resolve(defaultModel string) (string, int)

Resolve returns the effective model and max-tokens values, applying the provider's default model when no override is set and DefaultMaxTokens when MaxTokens is non-positive.

type ToolCall

type ToolCall struct {
	ID        string
	Name      string
	Arguments string // raw JSON
}

ToolCall is a tool invocation requested by the model in an assistant turn.

func (*ToolCall) UnmarshalArguments

func (tc *ToolCall) UnmarshalArguments(dest any) error

UnmarshalArguments unmarshals the raw JSON Arguments into the given destination. It returns a formatted error if the unmarshaling fails.
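A minimal sketch of this behavior, assuming the error wrapping is a plain json.Unmarshal wrapped with the tool name (the exact error format and the "read_file" tool are assumptions for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirror of llm.ToolCall.
type ToolCall struct {
	ID        string
	Name      string
	Arguments string // raw JSON
}

// UnmarshalArguments decodes the raw JSON Arguments into dest, wrapping any
// decode error with the tool name for context.
func (tc *ToolCall) UnmarshalArguments(dest any) error {
	if err := json.Unmarshal([]byte(tc.Arguments), dest); err != nil {
		return fmt.Errorf("unmarshal arguments for tool %q: %w", tc.Name, err)
	}
	return nil
}

func main() {
	tc := ToolCall{ID: "call-1", Name: "read_file", Arguments: `{"path":"main.go"}`}
	var args struct {
		Path string `json:"path"`
	}
	if err := tc.UnmarshalArguments(&args); err != nil {
		panic(err)
	}
	fmt.Println(args.Path)
}
```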

type ToolDef

type ToolDef struct {
	Name        string
	Description string
	Parameters  map[string]any // full JSON Schema object
}

ToolDef describes a tool the model may call. Parameters is a JSON Schema object (same shape accepted by all providers).
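For example, a hypothetical "read_file" tool (the tool itself is illustrative, not part of this package) would carry its JSON Schema directly in Parameters:

```go
package main

import "fmt"

// Local mirror of llm.ToolDef.
type ToolDef struct {
	Name        string
	Description string
	Parameters  map[string]any // full JSON Schema object
}

// readFileTool defines a hypothetical tool whose single required parameter
// is a repository-relative file path.
func readFileTool() ToolDef {
	return ToolDef{
		Name:        "read_file",
		Description: "Read a file from the repository.",
		Parameters: map[string]any{
			"type": "object",
			"properties": map[string]any{
				"path": map[string]any{
					"type":        "string",
					"description": "Repository-relative file path.",
				},
			},
			"required": []string{"path"},
		},
	}
}

func main() {
	fmt.Println(readFileTool().Name)
}
```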

type ToolResult

type ToolResult struct {
	ToolCallID string
	Name       string
	Content    string
}

ToolResult is the response to a ToolCall, bundled into a RoleTool message.

type Usage

type Usage struct {
	PromptTokens   int // tokens in the input prompt
	OutputTokens   int // tokens in the generated response (excluding thinking)
	ThinkingTokens int // tokens used for model internal reasoning/thinking
	CachedTokens   int // tokens served from a cache
}

Usage captures the number of tokens consumed in an interaction. If a provider does not support a specific count, its value will be 0. If the provider does not report token counts at all, PromptTokens and OutputTokens will be -1 (see UnknownUsage).

func UnknownUsage added in v0.1.0

func UnknownUsage() Usage

UnknownUsage returns a Usage with PromptTokens and OutputTokens set to -1, indicating the provider did not report any token counts. ThinkingTokens and CachedTokens remain 0 (the zero value) which providers overwrite when they do have data.

func (*Usage) Add added in v0.1.0

func (u *Usage) Add(other Usage)

Add accumulates other's token counts into u, ignoring sentinel fields (values <= 0). Intended for callers that sum per-iteration Usage into a running session total without letting UnknownUsage() sentinels corrupt the aggregate.
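The sentinel-aware accumulation can be sketched as follows; this is a local re-implementation of the documented behavior, not the package source:

```go
package main

import "fmt"

// Local mirror of llm.Usage.
type Usage struct {
	PromptTokens   int
	OutputTokens   int
	ThinkingTokens int
	CachedTokens   int
}

// Add accumulates other's counts into u, skipping non-positive fields so the
// -1 sentinels from UnknownUsage() never corrupt a running total.
func (u *Usage) Add(other Usage) {
	if other.PromptTokens > 0 {
		u.PromptTokens += other.PromptTokens
	}
	if other.OutputTokens > 0 {
		u.OutputTokens += other.OutputTokens
	}
	if other.ThinkingTokens > 0 {
		u.ThinkingTokens += other.ThinkingTokens
	}
	if other.CachedTokens > 0 {
		u.CachedTokens += other.CachedTokens
	}
}

func main() {
	var total Usage
	total.Add(Usage{PromptTokens: 100, OutputTokens: 50})
	total.Add(Usage{PromptTokens: -1, OutputTokens: -1}) // UnknownUsage sentinel: ignored
	fmt.Println(total.PromptTokens, total.OutputTokens)
}
```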

func (Usage) TotalInput

func (u Usage) TotalInput() int

TotalInput returns the total number of input-side tokens (prompt + cached).

func (Usage) TotalOutput

func (u Usage) TotalOutput() int

TotalOutput returns the total number of output-side tokens (output + thinking).

Directories

Path	Synopsis
anthropic	Package anthropic implements llm.Model using the official Anthropic Go SDK.
factory	Package factory constructs llm.Model instances for the supported providers.
google	Package google implements llm.Model using the official Google Gen AI Go SDK.
internal
openai	Package openai implements llm.Model using the official OpenAI Go SDK.
