Documentation ¶
Overview ¶
Package llm defines the provider-agnostic interface and types used throughout Cassandra. No package outside llm/ should import a provider sub-package directly; they interact exclusively through the Model interface defined here.
Index ¶
Constants ¶
const (
	// DefaultRetryAttempts is the total number of attempts (1 initial + 2 retries).
	DefaultRetryAttempts = 3

	// DefaultRetryBaseDelay is the starting back-off delay between attempts.
	DefaultRetryBaseDelay = time.Second
)
const DefaultMaxTokens = 8192
DefaultMaxTokens is the fallback max-tokens budget for LLM calls when the caller does not specify one — covering both GenerateContent (via core.Agent.RunReview) and GenerateStructuredContent (via StructuredConfig.Resolve). Kept in sync with the CLI's --max-tokens default (see cmd/ai_reviewer) so every pass has consistent headroom.
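The fallback described above can be sketched as follows. `resolveMaxTokens` is a hypothetical helper for illustration, not part of the package API:

```go
package main

import "fmt"

// DefaultMaxTokens mirrors the package constant documented above.
const DefaultMaxTokens = 8192

// resolveMaxTokens illustrates the fallback rule: a caller that does
// not specify a budget (non-positive value) gets DefaultMaxTokens.
func resolveMaxTokens(requested int) int {
	if requested <= 0 {
		return DefaultMaxTokens
	}
	return requested
}

func main() {
	fmt.Println(resolveMaxTokens(0))    // fallback applies
	fmt.Println(resolveMaxTokens(4096)) // caller-specified budget wins
}
```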
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Message ¶
type Message struct {
Role Role
Text string
ToolCalls []ToolCall
ToolResults []ToolResult
Reasoning string // Internal reasoning/thought process from the model
ProviderMetadata map[string]any // Opaque provider-specific data (e.g. thought signatures)
// CacheBreakpoint, when true on a RoleSystem message, marks the end of the
// stable cacheable prefix. Providers that support prompt caching (e.g.
// Anthropic) use this to inject a cache-control marker; all other providers
// ignore it.
CacheBreakpoint bool
}
Message is a single turn in a conversation. Fields are zero-valued when not applicable to the Role:
- RoleSystem / RoleUser / RoleAssistant (text-only): only Text is set.
- RoleAssistant (tool requests): ToolCalls is set (Text may also be set).
- RoleTool: ToolResults is set.
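A minimal sketch of how the field rules above play out across one tool round trip. The types are reproduced locally and abbreviated; the Role string values and the ToolCall/ToolResult fields shown are illustrative assumptions, not the package's actual definitions:

```go
package main

import "fmt"

// Local stand-ins mirroring the documented types (abbreviated).
type Role string

const (
	RoleSystem    Role = "system" // illustrative values
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type ToolCall struct{ Name string }
type ToolResult struct{ Content string }

type Message struct {
	Role        Role
	Text        string
	ToolCalls   []ToolCall
	ToolResults []ToolResult
}

// buildConversation shows one turn of each kind: text-only turns set
// only Text, the assistant's tool request sets ToolCalls, and the
// tool turn answers with ToolResults.
func buildConversation() []Message {
	return []Message{
		{Role: RoleSystem, Text: "You review Go code."},
		{Role: RoleUser, Text: "Review this diff."},
		{Role: RoleAssistant, ToolCalls: []ToolCall{{Name: "read_file"}}},
		{Role: RoleTool, ToolResults: []ToolResult{{Content: "package main ..."}}},
	}
}

func main() {
	for _, m := range buildConversation() {
		fmt.Println(m.Role)
	}
}
```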
type Model ¶
type Model interface {
GenerateContent(ctx context.Context, messages []Message, tools []ToolDef, maxTokens int) (*Response, error)
// GenerateStructuredContent requests the model to produce output adhering to
// the provided JSON Schema. The schema should be a map[string]any following
// the JSON Schema specification.
GenerateStructuredContent(ctx context.Context, messages []Message, schema map[string]any, config StructuredConfig) (*Response, error)
}
Model is the only interface core.Agent depends on. Implementations live in llm/anthropic and llm/google.
type Response ¶
type Response struct {
Text string // set when the model produced a final answer
ToolCalls []ToolCall // set when the model wants to invoke tools
Reasoning string // set when the model provides internal reasoning
ProviderMetadata map[string]any // opaque data to be echoed in subsequent turns
Usage Usage // token usage for this interaction
}
Response is what the model returns from a single GenerateContent call. At least one of Text or ToolCalls will be non-empty; providers that support mixed streaming turns may populate both simultaneously.
type RetryingModel ¶ added in v0.1.0
type RetryingModel struct {
// contains filtered or unexported fields
}
RetryingModel wraps any Model and transparently retries on any error (network failures, rate limits, server errors, etc.) using exponential back-off. It implements the Model interface.
func NewRetryingModel ¶ added in v0.1.0
func NewRetryingModel(inner Model, maxAttempts int, baseDelay time.Duration) *RetryingModel
NewRetryingModel returns a Model that retries failed calls up to maxAttempts times total (i.e. 1 initial attempt + maxAttempts-1 retries), doubling the delay after each failure starting from baseDelay.
The wrapper respects context cancellation: if ctx is cancelled between attempts, the last error is returned immediately without further retries.
func (*RetryingModel) GenerateContent ¶ added in v0.1.0
func (r *RetryingModel) GenerateContent(ctx context.Context, messages []Message, tools []ToolDef, maxTokens int) (*Response, error)
GenerateContent calls the underlying model, retrying on any error.
func (*RetryingModel) GenerateStructuredContent ¶ added in v0.1.0
func (r *RetryingModel) GenerateStructuredContent(ctx context.Context, messages []Message, schema map[string]any, config StructuredConfig) (*Response, error)
GenerateStructuredContent calls the underlying model, retrying on any error.
type StructuredConfig ¶
type StructuredConfig struct {
// ModelOverride allows using a different model for the structured pass.
ModelOverride string
// MaxTokens limits the length of the LLM response.
MaxTokens int
}
StructuredConfig provides options for structured output generation.
type ToolCall ¶
ToolCall is a tool invocation requested by the model in an assistant turn.
func (*ToolCall) UnmarshalArguments ¶
UnmarshalArguments unmarshals the raw JSON Arguments into the given destination. It returns a formatted error if the unmarshaling fails.
type ToolDef ¶
type ToolDef struct {
Name string
Description string
Parameters map[string]any // full JSON Schema object
}
ToolDef describes a tool the model may call. Parameters is a JSON Schema object (same shape accepted by all providers).
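A hypothetical tool definition showing the Parameters shape. The tool name and schema contents are invented for illustration; only the ToolDef struct itself comes from the package:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolDef is reproduced locally for illustration.
type ToolDef struct {
	Name        string
	Description string
	Parameters  map[string]any
}

// readFileTool is a hypothetical tool definition; Parameters is a
// plain JSON Schema object, as the documentation describes.
func readFileTool() ToolDef {
	return ToolDef{
		Name:        "read_file",
		Description: "Read a file from the repository under review.",
		Parameters: map[string]any{
			"type": "object",
			"properties": map[string]any{
				"path": map[string]any{
					"type":        "string",
					"description": "Repository-relative file path.",
				},
			},
			"required": []string{"path"},
		},
	}
}

func main() {
	b, err := json.Marshal(readFileTool().Parameters)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```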
type ToolResult ¶
ToolResult is the response to a ToolCall, bundled into a RoleTool message.
type Usage ¶
type Usage struct {
PromptTokens int // tokens in the input prompt
OutputTokens int // tokens in the generated response (excluding thinking)
ThinkingTokens int // tokens used for model internal reasoning/thinking
CachedTokens int // tokens served from a cache
}
Usage captures the number of tokens consumed in an interaction. If a provider does not support a specific count, its value will be 0. If the provider does not support token counting at all, PromptTokens and OutputTokens will be -1 (see UnknownUsage).
func UnknownUsage ¶ added in v0.1.0
func UnknownUsage() Usage
UnknownUsage returns a Usage with PromptTokens and OutputTokens set to -1, indicating the provider did not report any token counts. ThinkingTokens and CachedTokens remain 0 (the zero value) which providers overwrite when they do have data.
func (*Usage) Add ¶ added in v0.1.0
Add accumulates other's token counts into u, ignoring sentinel fields (values <= 0). Intended for callers that sum per-iteration Usage into a running session total without letting UnknownUsage() sentinels corrupt the aggregate.
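The sentinel-ignoring behaviour can be sketched as follows; Usage and Add are reimplemented locally from the documented semantics for illustration:

```go
package main

import "fmt"

// Usage mirrors the documented struct; Add accumulates other's
// counts into u, ignoring sentinel fields (values <= 0) so that
// UnknownUsage() sentinels cannot corrupt a running total.
type Usage struct {
	PromptTokens   int
	OutputTokens   int
	ThinkingTokens int
	CachedTokens   int
}

func (u *Usage) Add(other Usage) {
	if other.PromptTokens > 0 {
		u.PromptTokens += other.PromptTokens
	}
	if other.OutputTokens > 0 {
		u.OutputTokens += other.OutputTokens
	}
	if other.ThinkingTokens > 0 {
		u.ThinkingTokens += other.ThinkingTokens
	}
	if other.CachedTokens > 0 {
		u.CachedTokens += other.CachedTokens
	}
}

func main() {
	var total Usage
	total.Add(Usage{PromptTokens: 100, OutputTokens: 50, CachedTokens: 20})
	total.Add(Usage{PromptTokens: -1, OutputTokens: -1}) // UnknownUsage sentinel: ignored
	fmt.Printf("%+v\n", total)
}
```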
func (Usage) TotalInput ¶
TotalInput returns the total number of input-side tokens (prompt + cached).
func (Usage) TotalOutput ¶
TotalOutput returns the total number of output-side tokens (output + thinking).
Directories ¶
| Path | Synopsis |
|---|---|
| anthropic | Package anthropic implements llm.Model using the official Anthropic Go SDK. |
| factory | Package factory constructs llm.Model instances for the supported providers. |
| google | Package google implements llm.Model using the official Google Gen AI Go SDK. |
| internal | |