iterate

package
v0.9.2
Published: May 1, 2026 License: Apache-2.0 Imports: 10 Imported by: 0

Documentation

Overview

Package iterate provides a shared model iteration engine for agentic LLM workflows. An Engine repeatedly calls an LLM, inspects the response for tool calls, executes those tools, and feeds results back until the model produces a text-only response or a budget is exhausted.

Both the primary agent loop and the lightweight delegate executor consume this engine, each using Config callbacks to layer its own streaming, archival, timeout, and budget logic on top of the shared iteration core.
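
A minimal sketch of the wiring follows, using placeholder import paths (the real module paths are not shown on this page) and taking the llm.Client and initial messages as parameters, since their construction is not documented here:

import (
	"context"

	"example.com/agent/iterate" // placeholder paths; substitute the
	"example.com/agent/llm"     // real module paths
)

// runOnce wires a minimal Config and executes one engine run. The toy
// Executor and ToolDefs are illustrative; real callers dispatch to
// actual tool handlers and return real tool definitions.
func runOnce(ctx context.Context, client llm.Client, model string, msgs []llm.Message) (*iterate.Result, error) {
	cfg := iterate.Config{
		Model: model,
		LLM:   client,
		Executor: &iterate.DirectExecutor{
			Exec: func(ctx context.Context, name, argsJSON string) (string, error) {
				return `{"ok":true}`, nil // stub: every tool "succeeds"
			},
		},
		ToolDefs: func(iteration int) []map[string]any {
			return nil // nil means no tools this iteration
		},
	}

	var e iterate.Engine
	return e.Run(ctx, cfg, msgs)
}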

Index

Constants

const (
	DefaultMaxIterations     = 50
	DefaultMaxIllegalStrikes = 2
	DefaultMaxToolRepeat     = 3
)

Default limits applied when Config fields are zero-valued.

const (
	ExhaustMaxIterations = "max_iterations"
	ExhaustTokenBudget   = "token_budget"
	ExhaustWallClock     = "wall_clock"
	ExhaustNoOutput      = "no_output"
	ExhaustIllegalTool   = "illegal_tool"
)

Exhaustion reason constants describe why the engine stopped iterating.
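
A caller might branch on these after a run; a small sketch (the log lines are illustrative):

import (
	"log"

	"example.com/agent/iterate" // placeholder path
)

// logExhaustion reports why a run stopped without a text response.
func logExhaustion(res *iterate.Result) {
	if !res.Exhausted {
		return
	}
	switch res.ExhaustReason {
	case iterate.ExhaustTokenBudget:
		log.Printf("token budget exhausted after %d iterations", res.IterationCount)
	case iterate.ExhaustMaxIterations:
		log.Printf("iteration cap reached; engine forced final text")
	case iterate.ExhaustWallClock:
		log.Printf("wall-clock limit reached")
	default:
		log.Printf("stopped: %s", res.ExhaustReason)
	}
}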

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {

	// MaxIterations is the maximum number of LLM call iterations.
	// After this many iterations without a text-only response, the
	// engine forces a final text call. Zero uses [DefaultMaxIterations].
	MaxIterations int

	// MaxIllegalStrikes is the number of consecutive iterations
	// containing illegal tool calls before the engine forces text.
	// The first strike allows one recovery iteration. Zero uses
	// [DefaultMaxIllegalStrikes].
	MaxIllegalStrikes int

	// MaxToolRepeat is the maximum number of times the same tool
	// may be called with the same arguments before the engine
	// injects a loop-break error. Zero uses [DefaultMaxToolRepeat].
	MaxToolRepeat int

	// Model is the model name passed to [llm.Client.ChatStream].
	Model string

	// LLM is the client used for chat completions.
	LLM llm.Client

	// Stream receives streaming events. Nil disables streaming.
	Stream llm.StreamCallback

	// ToolDefs returns the tool definitions for a given iteration.
	// It is called at the top of each iteration, allowing per-iteration
	// tool filtering (e.g., capability tags). Returning nil means no
	// tools are available for that iteration.
	ToolDefs func(iteration int) []map[string]any

	// Executor runs individual tool calls. If nil, the engine panics.
	Executor ToolExecutor

	// OnIterationStart fires at the top of each iteration before the
	// LLM call. Receives the iteration index, the active model name at
	// the start of the iteration (may differ from Config.Model if a
	// prior OnLLMError changed it), the current message history
	// (including assistant and tool messages appended by prior
	// iterations), and the tool definitions for this iteration.
	OnIterationStart func(ctx context.Context, iteration int, currentModel string, msgs []llm.Message, toolDefs []map[string]any)

	// OnLLMResponse fires after each successful LLM call, before tool
	// execution. Use it for logging and emitting stream events.
	OnLLMResponse func(ctx context.Context, resp *llm.ChatResponse, iteration int)

	// OnLLMError is called when an LLM ChatStream call fails. It may
	// implement retry, failover, or recovery logic. Returning a non-nil
	// ChatResponse and nil error means recovery succeeded; the returned
	// model name replaces the current model for subsequent iterations.
	// Returning a nil response and non-nil error propagates the failure.
	// If OnLLMError is nil, errors are returned immediately.
	OnLLMError func(ctx context.Context, err error, model string,
		msgs []llm.Message, toolDefs []map[string]any,
		stream llm.StreamCallback) (resp *llm.ChatResponse, newModel string, retErr error)

	// OnBeforeToolExec fires before each tool execution. The returned
	// context is passed to the tool executor. Use it to inject
	// conversation IDs, session IDs, tool call IDs, etc.
	OnBeforeToolExec func(ctx context.Context, iteration int, tc llm.ToolCall) context.Context

	// OnToolCallStart fires when a tool call begins execution. Use it
	// for stream events and tool call recording.
	OnToolCallStart func(ctx context.Context, tc llm.ToolCall)

	// OnToolCallDone fires after a tool call completes. The errMsg is
	// empty on success.
	OnToolCallDone func(ctx context.Context, name, result, errMsg string)

	// OnTextResponse fires when the model produces a text-only response
	// (no tool calls). Use it for memory storage, fact extraction, etc.
	OnTextResponse func(ctx context.Context, content string, msgs []llm.Message)

	// CheckBudget is called after each LLM response with the cumulative
	// output token count. Return true if the budget is exhausted and the
	// engine should force a text response. Nil means no budget.
	CheckBudget func(totalOutput int) bool

	// CheckToolAvail reports whether a tool is available in the current
	// iteration. Return false if the tool should be treated as illegal.
	// Nil means all tools are available.
	CheckToolAvail func(name string) bool

	// NormalizeToolCall can rewrite or repair a model-emitted tool call
	// before availability checks and execution. Use it for runtime
	// compatibility shims such as aliasing common hallucinated names to
	// the exact supported tool contract.
	NormalizeToolCall func(ctx context.Context, iteration int, tc llm.ToolCall) llm.ToolCall

	// DeferMixedText controls whether text content from mixed
	// (text + tool_call) responses is stripped from the message context
	// and deferred for later use. This prevents the model from restating
	// already-streamed text after tool execution (issue #347).
	// Agent sets true; delegate sets false.
	DeferMixedText bool

	// NudgeOnEmpty enables empty-response nudging: when the model
	// returns no content after tool iterations, inject NudgePrompt to
	// give it one more chance to produce text. Agent sets true.
	NudgeOnEmpty bool

	// NudgePrompt is the user-role message injected on empty responses.
	NudgePrompt string

	// FallbackContent is the static text returned when the model fails
	// to produce content even after nudging.
	FallbackContent string
}

Config controls an Engine.Run execution. Callbacks are optional; nil callbacks are silently skipped.
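
As a sketch, an agent-style caller might layer budget and nudge behavior onto a base Config like this (the token ceiling, prompt strings, and registry are illustrative, not values from this package):

import "example.com/agent/iterate" // placeholder path

// agentConfig decorates a base Config (Model, LLM, Executor, ToolDefs
// already set by the caller) with agent-side policy.
func agentConfig(base iterate.Config, registry map[string]bool) iterate.Config {
	cfg := base

	cfg.DeferMixedText = true // agent behavior, per the field docs
	cfg.NudgeOnEmpty = true
	cfg.NudgePrompt = "Summarize the tool results for the user."  // illustrative
	cfg.FallbackContent = "I could not produce a final response." // illustrative

	cfg.CheckBudget = func(totalOutput int) bool {
		return totalOutput > 32_000 // illustrative output-token ceiling
	}
	cfg.CheckToolAvail = func(name string) bool {
		return registry[name] // unknown tools are treated as illegal
	}
	return cfg
}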

type DeadlineExecutor

type DeadlineExecutor struct {
	Exec func(ctx context.Context, name, argsJSON string) (string, error)
}

DeadlineExecutor wraps tool execution in a goroutine with deadline enforcement. If the handler does not respect context cancellation, the goroutine leaks but the caller is unblocked. This is the executor used by the delegate system for per-tool timeouts.
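
A sketch of the intended usage under an assumed 30-second per-tool budget (the dispatch function stands in for real delegate tool dispatch):

import (
	"context"
	"time"

	"example.com/agent/iterate" // placeholder path
)

// callWithDeadline bounds one tool call. Execute returns at the
// deadline even if dispatch ignores ctx; in that case the worker
// goroutine leaks, as the type documentation notes.
func callWithDeadline(ctx context.Context,
	dispatch func(context.Context, string, string) (string, error),
	name, argsJSON string) (string, error) {

	exec := &iterate.DeadlineExecutor{Exec: dispatch}
	ctx, cancel := context.WithTimeout(ctx, 30*time.Second) // illustrative budget
	defer cancel()
	return exec.Execute(ctx, name, argsJSON)
}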

func (*DeadlineExecutor) Execute

func (d *DeadlineExecutor) Execute(ctx context.Context, name, argsJSON string) (string, error)

Execute implements ToolExecutor.

type DirectExecutor

type DirectExecutor struct {
	Exec func(ctx context.Context, name, argsJSON string) (string, error)
}

DirectExecutor calls Execute on the underlying function directly. This is the default executor used by the agent loop.

func (*DirectExecutor) Execute

func (d *DirectExecutor) Execute(ctx context.Context, name, argsJSON string) (string, error)

Execute implements ToolExecutor.

type Engine

type Engine struct{}

Engine runs the model iteration loop. It is stateless; all configuration is passed via Config and all per-run state lives on the stack inside Engine.Run.
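
Because the zero value carries no state, a single Engine can plausibly be shared across concurrent runs, each with its own Config and message history; a sketch:

import (
	"context"

	"example.com/agent/iterate" // placeholder paths
	"example.com/agent/llm"
)

var engine iterate.Engine // zero value is ready to use

// runInBackground starts one run and delivers its Result on out.
func runInBackground(ctx context.Context, cfg iterate.Config, msgs []llm.Message, out chan<- *iterate.Result) {
	go func() {
		res, err := engine.Run(ctx, cfg, msgs)
		if err != nil {
			out <- nil // error handling elided for brevity
			return
		}
		out <- res
	}()
}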

func (*Engine) Run

func (e *Engine) Run(ctx context.Context, cfg Config, messages []llm.Message) (*Result, error)

Run executes the iteration loop: call the LLM, execute any tool calls, feed results back, and repeat until the model produces a text-only response or a budget is exhausted.

The caller provides the initial message history (including system prompt). Run appends assistant and tool messages during execution and returns the final state in Result.

type IterationRecord

type IterationRecord struct {
	Index             int
	Model             string
	UpstreamRequestID string
	// StopReason mirrors the provider's termination signal in the form
	// llm.ChatResponse.StopReason exposes (Anthropic: "end_turn",
	// "tool_use", "max_tokens", "stop_sequence", "pause_turn").
	// "pause_turn" in particular signals server-side context pressure
	// and warrants operator visibility — surfacing it here lets agent
	// loops and trace dashboards react without reaching into provider
	// internals.
	StopReason                 string
	InputTokens                int
	OutputTokens               int
	CacheCreationInputTokens   int
	CacheCreation5mInputTokens int
	CacheCreation1hInputTokens int
	CacheReadInputTokens       int
	ToolCallIDs                []string
	ToolsOffered               []string
	StartedAt                  time.Time
	DurationMs                 int64
	HasToolCalls               bool
	BreakReason                string
}

IterationRecord collects per-iteration trace data. This replaces the identical iterationRecord structs that were independently defined in the agent and delegate packages.
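
A sketch of dumping these records as trace lines (the format is illustrative; only documented fields are used):

import (
	"log"

	"example.com/agent/iterate" // placeholder path
)

// logTrace emits one line per iteration for dashboards or archives.
func logTrace(recs []iterate.IterationRecord) {
	for _, r := range recs {
		log.Printf("iter=%d model=%s stop=%s in=%d out=%d cache_read=%d tools=%d dur=%dms break=%q",
			r.Index, r.Model, r.StopReason,
			r.InputTokens, r.OutputTokens, r.CacheReadInputTokens,
			len(r.ToolCallIDs), r.DurationMs, r.BreakReason)
	}
}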

type Result

type Result struct {
	// Content is the final text response from the model.
	Content string

	// Model is the model that produced the final response. It may
	// differ from the initially configured model if error recovery
	// or failover changed it.
	Model string

	// UpstreamRequestID is the provider-side request ID of the final
	// iteration when the provider returned one (e.g. Anthropic's
	// `x-request-id` header). Empty when the provider doesn't expose
	// one or the run had no successful iterations. Per-iteration IDs
	// remain available on Iterations[i].UpstreamRequestID for cases
	// where billing escalation or correlation needs the full chain.
	UpstreamRequestID string

	// InputTokens is the cumulative input token count across all iterations.
	InputTokens int

	// OutputTokens is the cumulative output token count across all iterations.
	OutputTokens int

	// CacheCreationInputTokens is the cumulative prompt-cache write token
	// count across all iterations when the provider reports it.
	CacheCreationInputTokens int

	// CacheCreation5mInputTokens and CacheCreation1hInputTokens break
	// down cache-write tokens by TTL bucket when the provider exposes
	// the split (Anthropic). The sum is ≤ CacheCreationInputTokens;
	// any shortfall reflects writes the provider didn't attribute.
	CacheCreation5mInputTokens int
	CacheCreation1hInputTokens int

	// CacheReadInputTokens is the cumulative prompt-cache read token
	// count across all iterations when the provider reports it.
	CacheReadInputTokens int

	// ToolsUsed maps tool name → invocation count.
	ToolsUsed map[string]int

	// Exhausted is true when the engine stopped due to a budget
	// (iterations, tokens, wall clock) rather than a text response.
	Exhausted bool

	// ExhaustReason is set when Exhausted is true.
	ExhaustReason string

	// Iterations records per-iteration trace data for archival and
	// dashboard display.
	Iterations []IterationRecord

	// Messages is the final message history including all tool
	// results. Consumers use it for archival and context replay.
	Messages []llm.Message

	// IterationCount is the number of iterations completed.
	IterationCount int
}

Result is the outcome of an Engine.Run execution.
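
The cache-write fields obey the invariant documented above; a small helper can surface the unattributed remainder (a sketch using only documented fields):

import "example.com/agent/iterate" // placeholder path

// unattributedCacheWrites returns cache-write tokens the provider did
// not assign to a TTL bucket. Per the documented invariant
// (5m + 1h <= total), the result is never negative.
func unattributedCacheWrites(res *iterate.Result) int {
	return res.CacheCreationInputTokens -
		(res.CacheCreation5mInputTokens + res.CacheCreation1hInputTokens)
}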

type ToolExecutor

type ToolExecutor interface {
	Execute(ctx context.Context, name, argsJSON string) (string, error)
}

ToolExecutor runs a single tool call. Implementations control timeout enforcement and error wrapping.
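
Custom implementations can wrap the built-in executors; for example, a sketch that truncates oversized results (the byte limit is illustrative; this package imposes none):

import (
	"context"

	"example.com/agent/iterate" // placeholder path
)

// truncatingExecutor wraps an inner ToolExecutor and caps result size.
// Truncation is byte-based and may split a multi-byte rune, which is
// acceptable for a sketch.
type truncatingExecutor struct {
	inner iterate.ToolExecutor
	max   int // max result bytes, e.g. 64 << 10
}

func (t *truncatingExecutor) Execute(ctx context.Context, name, argsJSON string) (string, error) {
	out, err := t.inner.Execute(ctx, name, argsJSON)
	if err != nil || len(out) <= t.max {
		return out, err
	}
	return out[:t.max] + "\n[truncated]", nil
}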
