llm

package
v0.2.0 Latest
Published: Apr 27, 2026 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Overview

Package llm defines the provider-agnostic interface and types used throughout Cassandra. No package outside llm/ should import a provider sub-package directly; they interact exclusively through the Model interface defined here.

Index

Constants

const (
	// DefaultRetryAttempts is the number of total attempts (1 initial + 2 retries).
	DefaultRetryAttempts = 3
	// DefaultRetryBaseDelay is the starting back-off delay between attempts.
	DefaultRetryBaseDelay = time.Second
)
const DefaultMaxTokens = 8192

DefaultMaxTokens is the fallback max-tokens budget for LLM calls when the caller does not specify one — covering both GenerateContent (via core.Agent.RunReview) and GenerateStructuredContent (via StructuredConfig.Resolve). Kept in sync with the CLI's --max-tokens default (see cmd/ai_reviewer) so every pass has consistent headroom.

Variables

This section is empty.

Functions

This section is empty.

Types

type FinishReason added in v0.2.0

type FinishReason string

FinishReason identifies why the model stopped generating content.

const (
	FinishReasonStop   FinishReason = "stop"   // normal termination
	FinishReasonLength FinishReason = "length" // hit max tokens
	FinishReasonOther  FinishReason = "other"  // safety filters, errors, etc.
)

type Message

type Message struct {
	Role             Role
	Text             string
	ToolCalls        []ToolCall
	ToolResults      []ToolResult
	Reasoning        string         // Internal reasoning/thought process from the model
	ProviderMetadata map[string]any // Opaque provider-specific data (e.g. thought signatures)
	// CacheBreakpoint, when true on a RoleSystem message, marks the end of the
	// stable cacheable prefix. Providers that support prompt caching (e.g.
	// Anthropic) use this to inject a cache-control marker; all other providers
	// ignore it.
	CacheBreakpoint bool
}

Message is a single turn in a conversation. Fields are zero-valued when not applicable to the Role:

  • RoleSystem / RoleUser / RoleAssistant (text-only): only Text is set.
  • RoleAssistant (tool requests): ToolCalls is set (Text may also be set).
  • RoleTool: ToolResults is set.

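The role rules above can be sketched with a minimal prompt builder. The types here are local mirrors of the documented `llm.Role` and `llm.Message` fields (reduced to what this example needs), and the system-prompt text is a hypothetical placeholder:

```go
package main

import "fmt"

// Local mirrors of llm.Role and llm.Message, reduced to the fields used here.
type Role string

const (
	RoleSystem Role = "system"
	RoleUser   Role = "user"
)

type Message struct {
	Role            Role
	Text            string
	CacheBreakpoint bool
}

// buildPrompt assembles a minimal conversation: a stable system prefix
// (marked as the end of the cacheable region) followed by the user's request.
func buildPrompt(system, question string) []Message {
	return []Message{
		{Role: RoleSystem, Text: system, CacheBreakpoint: true},
		{Role: RoleUser, Text: question},
	}
}

func main() {
	msgs := buildPrompt("You are a code reviewer.", "Review this diff.")
	for _, m := range msgs {
		fmt.Printf("%s: %s\n", m.Role, m.Text)
	}
}
```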
type Model

type Model interface {
	GenerateContent(ctx context.Context, messages []Message, tools []ToolDef, maxTokens int) (*Response, error)
	// GenerateStructuredContent requests the model to produce output adhering to
	// the provided JSON Schema. The schema should be a map[string]any following
	// the JSON Schema specification.
	GenerateStructuredContent(ctx context.Context, messages []Message, schema map[string]any, config StructuredConfig) (*Response, error)
}

Model is the only interface core.Agent depends on. Implementations live in llm/anthropic, llm/google, and llm/openai.

type Response

type Response struct {
	Text             string         // set when the model produced a final answer
	ToolCalls        []ToolCall     // set when the model wants to invoke tools
	Reasoning        string         // set when the model provides internal reasoning
	ProviderMetadata map[string]any // opaque data to be echoed in subsequent turns
	Usage            Usage          // token usage for this interaction
	FinishReason     FinishReason   // why the model stopped generating
}

Response is what the model returns from a single GenerateContent call. At least one of Text or ToolCalls will be non-empty; providers that support mixed streaming turns may populate both simultaneously.
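An agent loop typically branches on exactly these fields. The following is a sketch (with local mirrors of the relevant `Response` fields, not the package's actual dispatch logic): run requested tools first, flag a truncated answer, otherwise treat Text as final.

```go
package main

import "fmt"

type FinishReason string

const (
	FinishReasonStop   FinishReason = "stop"
	FinishReasonLength FinishReason = "length"
)

// Local mirror of the llm.Response fields relevant to dispatch.
type ToolCall struct{ Name string }

type Response struct {
	Text         string
	ToolCalls    []ToolCall
	FinishReason FinishReason
}

// nextStep decides what a caller should do with a response: execute the
// requested tools before reading Text, surface truncation, or finish.
func nextStep(r *Response) string {
	switch {
	case len(r.ToolCalls) > 0:
		return "run-tools" // mixed turns may also carry Text
	case r.FinishReason == FinishReasonLength:
		return "truncated" // answer hit the max-tokens budget
	default:
		return "done"
	}
}

func main() {
	fmt.Println(nextStep(&Response{Text: "LGTM", FinishReason: FinishReasonStop}))
}
```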

type RetryingModel added in v0.1.0

type RetryingModel struct {
	// contains filtered or unexported fields
}

RetryingModel wraps any Model and transparently retries on any error (network failures, rate limits, server errors, etc.) using exponential back-off. It implements the Model interface.

func NewRetryingModel added in v0.1.0

func NewRetryingModel(inner Model, maxAttempts int, baseDelay time.Duration) *RetryingModel

NewRetryingModel returns a Model that makes up to maxAttempts total attempts per call (i.e. 1 initial attempt + maxAttempts-1 retries), doubling the delay after each failure starting from baseDelay.

The wrapper respects context cancellation: if ctx is cancelled between attempts, the last error is returned immediately without further retries.

func (*RetryingModel) GenerateContent added in v0.1.0

func (r *RetryingModel) GenerateContent(ctx context.Context, messages []Message, tools []ToolDef, maxTokens int) (*Response, error)

GenerateContent calls the underlying model, retrying on any error.

func (*RetryingModel) GenerateStructuredContent added in v0.1.0

func (r *RetryingModel) GenerateStructuredContent(ctx context.Context, messages []Message, schema map[string]any, config StructuredConfig) (*Response, error)

GenerateStructuredContent calls the underlying model, retrying on any error.

type Role

type Role string

Role identifies the author of a message in a conversation.

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type StructuredConfig

type StructuredConfig struct {
	// ModelOverride allows using a different model for the structured pass.
	ModelOverride string
	// MaxTokens limits the length of the LLM response.
	MaxTokens int
}

StructuredConfig provides options for structured output generation.

func (StructuredConfig) Resolve added in v0.1.0

func (c StructuredConfig) Resolve(defaultModel string) (string, int)

Resolve returns the effective model and max-tokens values, applying the provider's default model when no override is set and DefaultMaxTokens when MaxTokens is non-positive.

type ToolCall

type ToolCall struct {
	ID        string
	Name      string
	Arguments string // raw JSON
}

ToolCall is a tool invocation requested by the model in an assistant turn.

func (*ToolCall) UnmarshalArguments

func (tc *ToolCall) UnmarshalArguments(dest any) error

UnmarshalArguments unmarshals the raw JSON Arguments into the given destination. It returns a formatted error if the unmarshaling fails.
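A minimal sketch of this behavior, assuming the error wrapping is a plain json.Unmarshal wrapped with the tool name (the exact error format and the "read_file" tool are assumptions for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirror of llm.ToolCall.
type ToolCall struct {
	ID        string
	Name      string
	Arguments string // raw JSON
}

// UnmarshalArguments decodes the raw JSON Arguments into dest, wrapping any
// decode error with the tool name for context.
func (tc *ToolCall) UnmarshalArguments(dest any) error {
	if err := json.Unmarshal([]byte(tc.Arguments), dest); err != nil {
		return fmt.Errorf("unmarshal arguments for tool %q: %w", tc.Name, err)
	}
	return nil
}

func main() {
	tc := ToolCall{ID: "call-1", Name: "read_file", Arguments: `{"path":"main.go"}`}
	var args struct {
		Path string `json:"path"`
	}
	if err := tc.UnmarshalArguments(&args); err != nil {
		panic(err)
	}
	fmt.Println(args.Path)
}
```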

type ToolDef

type ToolDef struct {
	Name        string
	Description string
	Parameters  map[string]any // full JSON Schema object
}

ToolDef describes a tool the model may call. Parameters is a JSON Schema object (same shape accepted by all providers).
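For example, a hypothetical "read_file" tool (the tool itself is illustrative, not part of this package) would carry its JSON Schema directly in Parameters:

```go
package main

import "fmt"

// Local mirror of llm.ToolDef.
type ToolDef struct {
	Name        string
	Description string
	Parameters  map[string]any // full JSON Schema object
}

// readFileTool defines a hypothetical tool whose single required parameter
// is a repository-relative file path.
func readFileTool() ToolDef {
	return ToolDef{
		Name:        "read_file",
		Description: "Read a file from the repository.",
		Parameters: map[string]any{
			"type": "object",
			"properties": map[string]any{
				"path": map[string]any{
					"type":        "string",
					"description": "Repository-relative file path.",
				},
			},
			"required": []string{"path"},
		},
	}
}

func main() {
	fmt.Println(readFileTool().Name)
}
```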

type ToolResult

type ToolResult struct {
	ToolCallID string
	Name       string
	Content    string
}

ToolResult is the response to a ToolCall, bundled into a RoleTool message.

type Usage

type Usage struct {
	PromptTokens   int // tokens in the input prompt
	OutputTokens   int // tokens in the generated response (excluding thinking)
	ThinkingTokens int // tokens used for model internal reasoning/thinking
	CachedTokens   int // tokens served from a cache
}

Usage captures the number of tokens consumed in an interaction. If a provider does not support a specific count, its value will be 0. If the provider does not report token counts at all, PromptTokens and OutputTokens will be -1 (see UnknownUsage).

func UnknownUsage added in v0.1.0

func UnknownUsage() Usage

UnknownUsage returns a Usage with PromptTokens and OutputTokens set to -1, indicating the provider did not report any token counts. ThinkingTokens and CachedTokens remain 0 (the zero value) which providers overwrite when they do have data.

func (*Usage) Add added in v0.1.0

func (u *Usage) Add(other Usage)

Add accumulates other's token counts into u, ignoring sentinel fields (values <= 0). Intended for callers that sum per-iteration Usage into a running session total without letting UnknownUsage() sentinels corrupt the aggregate.
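The sentinel-aware accumulation can be sketched as follows; this is a local re-implementation of the documented behavior, not the package source:

```go
package main

import "fmt"

// Local mirror of llm.Usage.
type Usage struct {
	PromptTokens   int
	OutputTokens   int
	ThinkingTokens int
	CachedTokens   int
}

// Add accumulates other's counts into u, skipping non-positive fields so the
// -1 sentinels from UnknownUsage() never corrupt a running total.
func (u *Usage) Add(other Usage) {
	if other.PromptTokens > 0 {
		u.PromptTokens += other.PromptTokens
	}
	if other.OutputTokens > 0 {
		u.OutputTokens += other.OutputTokens
	}
	if other.ThinkingTokens > 0 {
		u.ThinkingTokens += other.ThinkingTokens
	}
	if other.CachedTokens > 0 {
		u.CachedTokens += other.CachedTokens
	}
}

func main() {
	var total Usage
	total.Add(Usage{PromptTokens: 100, OutputTokens: 50})
	total.Add(Usage{PromptTokens: -1, OutputTokens: -1}) // UnknownUsage sentinel: ignored
	fmt.Println(total.PromptTokens, total.OutputTokens)
}
```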

func (Usage) TotalInput

func (u Usage) TotalInput() int

TotalInput returns the total number of input-side tokens (prompt + cached).

func (Usage) TotalOutput

func (u Usage) TotalOutput() int

TotalOutput returns the total number of output-side tokens (output + thinking).

Directories

Path	Synopsis
anthropic	Package anthropic implements llm.Model using the official Anthropic Go SDK.
factory	Package factory constructs llm.Model instances for the supported providers.
google	Package google implements llm.Model using the official Google Gen AI Go SDK.
internal
openai	Package openai implements llm.Model using the official OpenAI Go SDK.
