agent

package
v0.39.1
Published: Jun 12, 2026 License: MIT Imports: 20 Imported by: 4

Documentation

Overview

ABOUTME: Turn-budget checkpoint evaluation for the agent session loop. ABOUTME: Returns the message to inject (if any) when a turn threshold is exactly hit.

ABOUTME: Context compaction logic that replaces old tool results with short summaries. ABOUTME: Reduces context window consumption by summarizing stale tool outputs.

ABOUTME: Configuration for agent sessions including turn limits, timeouts, and loop detection. ABOUTME: Provides sensible defaults via DefaultConfig() and validation via Validate().

ABOUTME: Tracks context window utilization using the latest turn's input token count against a limit. ABOUTME: Provides utilization calculation and one-shot warning when approaching the limit.

ABOUTME: Event types emitted by the agent session for UI rendering and logging. ABOUTME: Defines EventType constants, Event struct, EventHandler interface, and multi-handler fan-out.

ABOUTME: Human-readable formatting helpers for live agent events. ABOUTME: Parses tool JSON input to show clean command/path displays instead of raw JSON.

ABOUTME: Repository localization pre-processing that identifies files relevant to a task prompt via pure text analysis and a filesystem scan (no LLM calls).

ABOUTME: SessionResult captures the outcome of a completed agent session. ABOUTME: Tracks turns, tool calls, file changes, token usage, and provides pretty-print formatting.

ABOUTME: Agent session that runs the agentic loop: LLM call -> tool execution -> loop. ABOUTME: Manages conversation state, tool dispatch, event emission, and result collection.

ABOUTME: Run-loop helpers for Session.Run — extracted to reduce session.go file size. ABOUTME: Contains conversation init, steering, LLM calls, usage tracking, tool execution, and loop detection.

ABOUTME: Mid-session steering allows injecting instructions into an active agent loop. ABOUTME: Steering messages are checked between turns via a non-blocking channel read.

ABOUTME: Per-session cache for tool results, keyed on (tool name, canonicalized arguments JSON). ABOUTME: Supports store, get, invalidateAll, and tracks hit/miss stats.

ABOUTME: Verify-after-edit loop: auto-detects build system, runs tests after file edits, and injects repair prompts. ABOUTME: Transparent inner loop — verification turns do not count against session MaxTurns.

Constants

const (
	DefaultModel    = "claude-sonnet-4-6"
	DefaultProvider = "anthropic"
)

Variables

This section is empty.

Functions

func FormatEventLine

func FormatEventLine(evt Event) string

FormatEventLine formats selected live agent events for console/TUI rendering. It parses tool input JSON to extract meaningful fields for a clean, chat-like display.

func ParseEpisodeSummaries

func ParseEpisodeSummaries(raw string) []string

ParseEpisodeSummaries decodes a JSON-array string into episode summaries.

func SerializeEpisodeSummaries

func SerializeEpisodeSummaries(summaries []string) string

SerializeEpisodeSummaries encodes episode summaries as a JSON array.
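The two helpers above describe a JSON-array round trip. The following is a minimal sketch of that idea using only encoding/json. The names serialize and parse are hypothetical stand-ins, not the package's functions, and the behavior on malformed input (returning nil) is an assumption, since the documentation does not state it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serialize is a hypothetical stand-in for SerializeEpisodeSummaries.
// Assumption: a nil slice is encoded as "[]" rather than "null".
func serialize(summaries []string) string {
	if summaries == nil {
		summaries = []string{}
	}
	b, err := json.Marshal(summaries)
	if err != nil {
		return "[]"
	}
	return string(b)
}

// parse is a hypothetical stand-in for ParseEpisodeSummaries.
// Assumption: malformed input yields nil instead of an error.
func parse(raw string) []string {
	var out []string
	if err := json.Unmarshal([]byte(raw), &out); err != nil {
		return nil
	}
	return out
}

func main() {
	raw := serialize([]string{"tried go test: failed", "patched parser"})
	fmt.Println(raw)
	fmt.Println(len(parse(raw)))
}
```

A string form like this is convenient when summaries have to travel through a string-only channel before being handed back via PriorEpisodeSummaries.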

Types

type BreachVerifyState added in v0.36.0

type BreachVerifyState int

BreachVerifyState records the result of the verify-on-breach pass (#303). The zero value is BreachVerifyNotRun, so an unset field is always the safe "could not verify" state — never mistaken for a pass.

const (
	BreachVerifyNotRun BreachVerifyState = iota // no verify ran (not a breach, no explicit command, or loop detected)
	BreachVerifyPassed                          // verify command exited 0
	BreachVerifyFailed                          // verify failed (non-zero exit) or errored
)
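To illustrate why the zero value matters, here is a small sketch with locally defined types that mirror the constants above. The result struct and the verified helper are hypothetical. The point is that a result whose field was never set reads as "not run", so it cannot be mistaken for a pass.

```go
package main

import "fmt"

// Local mirror of the documented enum; iota makes "not run" the zero value.
type breachVerifyState int

const (
	breachVerifyNotRun breachVerifyState = iota
	breachVerifyPassed
	breachVerifyFailed
)

// result is a hypothetical stand-in for a struct carrying the state.
type result struct {
	BreachVerify breachVerifyState
}

// verified reports true only for an explicit pass.
func verified(r result) bool {
	return r.BreachVerify == breachVerifyPassed
}

func main() {
	var unset result // field never assigned
	fmt.Println(verified(unset))
	fmt.Println(verified(result{BreachVerify: breachVerifyPassed}))
	fmt.Println(verified(result{BreachVerify: breachVerifyFailed}))
}
```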

type Checkpoint

type Checkpoint struct {
	Fraction float64 // 0.0–1.0 fraction of MaxTurns
	Message  string  // message injected as a user message
}

Checkpoint defines a message to inject at a specific turn-budget fraction.
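A rough sketch of how a turn-budget fraction could map to a concrete turn follows. The rounding rule is an assumption (the threshold turn is taken as the ceiling of Fraction times MaxTurns), and thresholdTurn and evalCheckpoints are hypothetical names. The package's actual evaluation may differ.

```go
package main

import (
	"fmt"
	"math"
)

// checkpoint mirrors the documented Checkpoint fields.
type checkpoint struct {
	Fraction float64
	Message  string
}

// thresholdTurn converts a fraction of the budget into a turn number.
// Assumption: round up, so 0.25 of 10 turns fires on turn 3.
func thresholdTurn(fraction float64, maxTurns int) int {
	return int(math.Ceil(fraction * float64(maxTurns)))
}

// evalCheckpoints returns the message of the first checkpoint whose
// threshold equals the current turn, or "" if none matches. Because the
// comparison is an exact match, each checkpoint fires at most once.
func evalCheckpoints(cps []checkpoint, turn, maxTurns int) string {
	for _, cp := range cps {
		if thresholdTurn(cp.Fraction, maxTurns) == turn {
			return cp.Message
		}
	}
	return ""
}

func main() {
	cps := []checkpoint{
		{Fraction: 0.5, Message: "Half of the turn budget is used."},
		{Fraction: 1.0, Message: "Last turn: wrap up."},
	}
	for turn := 1; turn <= 10; turn++ {
		if msg := evalCheckpoints(cps, turn, 10); msg != "" {
			fmt.Printf("turn %d: %s\n", turn, msg)
		}
	}
}
```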

type CompactionMode

type CompactionMode string

const (
	CompactionNone CompactionMode = "none"
	CompactionAuto CompactionMode = "auto"
)

type Completer

type Completer = tools.Completer

Completer is the interface needed from the LLM client. It's an alias of tools.Completer so both packages refer to the same type — preventing silent divergence if either side grows new methods.

type ContextWindowTracker

type ContextWindowTracker struct {
	Limit            int
	WarningThreshold float64

	WarningEmitted bool
	// contains filtered or unexported fields
}

ContextWindowTracker monitors context window utilization against a configured limit. InputTokens from each LLM response already represents the full conversation context for that turn, so utilization reflects the latest input token count, not a cumulative sum.

func NewContextWindowTracker

func NewContextWindowTracker(limit int, threshold float64) *ContextWindowTracker

NewContextWindowTracker creates a tracker with the given token limit and warning threshold. The threshold is a fraction (e.g. 0.8 means warn at 80% utilization).

func (*ContextWindowTracker) MarkWarned

func (t *ContextWindowTracker) MarkWarned()

MarkWarned records that the warning has been emitted, preventing further warnings.

func (*ContextWindowTracker) ShouldWarn

func (t *ContextWindowTracker) ShouldWarn() bool

ShouldWarn returns true if utilization meets or exceeds the warning threshold and a warning has not yet been emitted for this session.

func (*ContextWindowTracker) Update

func (t *ContextWindowTracker) Update(usage llm.Usage)

Update records the latest turn's input token count for utilization tracking. InputTokens from the LLM provider already includes the full conversation context.

func (*ContextWindowTracker) Utilization

func (t *ContextWindowTracker) Utilization() float64

Utilization returns the fraction of the context window currently consumed, based on the latest turn's input token count.
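The tracker's documented behavior (latest input token count divided by the limit, plus a one-shot warning) can be sketched with a local type. This is an illustration under stated assumptions, not the package's implementation: the tracker type, its unexported fields, and the zero-limit guard are all hypothetical.

```go
package main

import "fmt"

// tracker is a local sketch of the documented behavior.
type tracker struct {
	limit     int
	threshold float64
	latest    int  // input tokens from the most recent turn only
	warned    bool // set once the warning has been emitted
}

// update overwrites the previous count; it does not accumulate.
func (t *tracker) update(inputTokens int) { t.latest = inputTokens }

// utilization is latest/limit. Assumption: a non-positive limit yields 0.
func (t *tracker) utilization() float64 {
	if t.limit <= 0 {
		return 0
	}
	return float64(t.latest) / float64(t.limit)
}

// shouldWarn is true at or above the threshold until markWarned is called.
func (t *tracker) shouldWarn() bool {
	return !t.warned && t.utilization() >= t.threshold
}

func (t *tracker) markWarned() { t.warned = true }

func main() {
	tr := &tracker{limit: 1000, threshold: 0.8}
	for _, tokens := range []int{300, 850, 900} {
		tr.update(tokens)
		if tr.shouldWarn() {
			fmt.Printf("warning at %d tokens\n", tokens)
			tr.markWarned()
		}
	}
}
```

Since each update replaces the stored count, utilization can drop again after a compaction shrinks the conversation, while the warning stays one-shot.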

type EpisodeEntry

type EpisodeEntry struct {
	Tool    string `json:"tool"`
	Args    string `json:"args"`
	Success bool   `json:"success"`
	Summary string `json:"summary"`
}

EpisodeEntry captures one tool attempt inside a session.

type EpisodeLog

type EpisodeLog struct {
	Entries []EpisodeEntry `json:"entries"`
}

EpisodeLog stores structured tool-attempt records for one session.

func (*EpisodeLog) Record

func (l *EpisodeLog) Record(tool, args, output string, isError bool)

Record appends a tool-call episode.

func (EpisodeLog) Summary

func (l EpisodeLog) Summary() string

Summary renders a compact multiline summary for injection into future sessions.

type Event

type Event struct {
	Type               EventType
	Timestamp          time.Time
	SessionID          string
	NodeID             string // Pipeline node that owns this session (empty for standalone sessions).
	Turn               int
	ToolName           string
	ToolInput          string
	ToolOutput         string
	ToolError          string
	Text               string
	Err                error
	ContextUtilization float64
	Provider           string
	Model              string
	Preview            string
	ProviderEvent      string
	FinishReason       string
	Usage              llm.Usage
	Metrics            *TurnMetrics
	ToolDuration       time.Duration
}

Event carries data about something that happened during an agent session.

type EventHandler

type EventHandler interface {
	HandleEvent(evt Event)
}

EventHandler receives events emitted by the agent session.

var NoopHandler EventHandler = noopHandler{}

NoopHandler silently discards all events.

func MultiHandler

func MultiHandler(handlers ...EventHandler) EventHandler

MultiHandler returns an EventHandler that fans out each event to all provided handlers.

func NodeScopedHandler

func NodeScopedHandler(nodeID string, inner EventHandler) EventHandler

NodeScopedHandler wraps an EventHandler and stamps every event with a pipeline NodeID. This lets parallel branches identify their events without the agent layer needing to know about pipeline concepts.

type EventHandlerFunc

type EventHandlerFunc func(evt Event)

EventHandlerFunc is an adapter to allow the use of ordinary functions as EventHandlers.

func (EventHandlerFunc) HandleEvent

func (f EventHandlerFunc) HandleEvent(evt Event)
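The handler helpers in this section follow a familiar Go pattern: a one-method interface, a function adapter, and wrappers that compose handlers. The sketch below reproduces that pattern with local types so it runs on its own. The event struct is trimmed to three fields, and multi and nodeScoped are hypothetical stand-ins for MultiHandler and NodeScopedHandler.

```go
package main

import "fmt"

// event is a trimmed-down local version of the documented Event struct.
type event struct {
	Type   string
	NodeID string
	Text   string
}

type handler interface {
	HandleEvent(evt event)
}

// handlerFunc lets an ordinary function satisfy handler.
type handlerFunc func(evt event)

func (f handlerFunc) HandleEvent(evt event) { f(evt) }

// multi fans each event out to every handler, in order.
type multi []handler

func (m multi) HandleEvent(evt event) {
	for _, h := range m {
		h.HandleEvent(evt)
	}
}

// nodeScoped stamps a node ID on each event before forwarding it.
// The event is passed by value, so the caller's copy is not modified.
func nodeScoped(nodeID string, inner handler) handler {
	return handlerFunc(func(evt event) {
		evt.NodeID = nodeID
		inner.HandleEvent(evt)
	})
}

func main() {
	logger := handlerFunc(func(evt event) {
		fmt.Printf("[%s] %s: %s\n", evt.NodeID, evt.Type, evt.Text)
	})
	count := 0
	counter := handlerFunc(func(event) { count++ })

	h := nodeScoped("build", multi{logger, counter})
	h.HandleEvent(event{Type: "turn_start", Text: "turn 1"})
	fmt.Println("events seen:", count)
}
```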

type EventType

type EventType string

EventType identifies the kind of event emitted during an agent session.

const (
	EventSessionStart         EventType = "session_start"
	EventSessionEnd           EventType = "session_end"
	EventTurnStart            EventType = "turn_start"
	EventTurnEnd              EventType = "turn_end"
	EventToolCallStart        EventType = "tool_call_start"
	EventToolCallEnd          EventType = "tool_call_end"
	EventTextDelta            EventType = "text_delta"
	EventError                EventType = "error"
	EventContextWindowWarning EventType = "context_window_warning"
	EventSteeringInjected     EventType = "steering_injected"
	EventLLMRequestStart      EventType = "llm_request_start"
	EventLLMReasoning         EventType = "llm_reasoning"
	EventLLMText              EventType = "llm_text"
	EventLLMToolPrepare       EventType = "llm_tool_prepare"
	EventLLMFinish            EventType = "llm_finish"
	EventLLMProviderRaw       EventType = "llm_provider_raw"
	EventToolCacheHit         EventType = "tool_cache_hit"
	EventContextCompaction    EventType = "context_compaction"
	EventTurnMetrics          EventType = "turn_metrics"
	EventLLMRequestPreparing  EventType = "llm_request_preparing"
	// EventVerify is emitted for verify-after-edit status updates (pass/fail/retry).
	// Use EventError only for infrastructure failures (binary not found, etc.).
	EventVerify EventType = "verify"
	// EventCheckpoint is emitted when a turn-budget checkpoint fires and its
	// message is injected into the conversation as a user message.
	EventCheckpoint EventType = "checkpoint"
)

type Session

type Session struct {
	// contains filtered or unexported fields
}

Session holds the state for a single agent conversation loop. A Session is single-use: Run must only be called once.

func NewSession

func NewSession(client Completer, config SessionConfig, opts ...SessionOption) (*Session, error)

NewSession creates a new agent session with the given LLM client, config, and options. Returns an error if the config is invalid.

func (*Session) ID

func (s *Session) ID() string

ID returns the session's unique identifier.

func (*Session) Run

func (s *Session) Run(ctx context.Context, userInput string) (SessionResult, error)

Run executes the agentic loop: send user input to the LLM, execute any tool calls, feed results back, and repeat until the LLM stops or max turns is reached.
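The loop that Run describes can be sketched in a few dozen lines. Everything below is hypothetical scaffolding written for illustration: the reply, toolCall, completer, and outcome types, the string-based history, and the scripted fake model are not the package's types, and the real session also handles events, steering, caching, and verification, which this sketch omits.

```go
package main

import (
	"context"
	"fmt"
)

type toolCall struct{ Name, Input string }

type reply struct {
	Text  string
	Calls []toolCall // empty means the model has finished
}

type completer func(ctx context.Context, history []string) (reply, error)

type outcome struct {
	Turns        int
	MaxTurnsUsed bool
	ToolCalls    map[string]int
}

// runLoop calls the model, executes requested tools, feeds results back,
// and repeats until the model stops or maxTurns is reached.
func runLoop(ctx context.Context, llm completer, tools map[string]func(string) string,
	input string, maxTurns int) (outcome, error) {
	res := outcome{ToolCalls: map[string]int{}}
	history := []string{"user: " + input}
	for res.Turns < maxTurns {
		if err := ctx.Err(); err != nil {
			return res, err
		}
		res.Turns++
		rep, err := llm(ctx, history)
		if err != nil {
			return res, err
		}
		history = append(history, "assistant: "+rep.Text)
		if len(rep.Calls) == 0 {
			return res, nil // model stopped on its own
		}
		for _, c := range rep.Calls {
			res.ToolCalls[c.Name]++
			out := "unknown tool: " + c.Name
			if fn, ok := tools[c.Name]; ok {
				out = fn(c.Input)
			}
			history = append(history, "tool("+c.Name+"): "+out)
		}
	}
	res.MaxTurnsUsed = true // budget exhausted while the model still wanted tools
	return res, nil
}

// scriptedLLM is a fake model: it requests the "echo" tool until its
// stopAfter-th call, which answers with plain text instead.
func scriptedLLM(stopAfter int) completer {
	calls := 0
	return func(ctx context.Context, history []string) (reply, error) {
		calls++
		if calls >= stopAfter {
			return reply{Text: "done"}, nil
		}
		return reply{Calls: []toolCall{{Name: "echo", Input: "hi"}}}, nil
	}
}

// runDemo wires the fake model and a single echo tool together.
func runDemo(stopAfter, maxTurns int) outcome {
	tools := map[string]func(string) string{"echo": func(in string) string { return in }}
	res, _ := runLoop(context.Background(), scriptedLLM(stopAfter), tools, "say hi", maxTurns)
	return res
}

func main() {
	fmt.Printf("%+v\n", runDemo(2, 5))
	fmt.Printf("%+v\n", runDemo(10, 3))
}
```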

type SessionConfig

type SessionConfig struct {
	MaxTurns                      int
	CommandTimeout                time.Duration
	MaxCommandTimeout             time.Duration
	LoopDetectionThreshold        int
	ContextWindowLimit            int
	ContextWindowWarningThreshold float64
	ToolOutputLimits              map[string]int
	WorkingDir                    string
	SystemPrompt                  string
	Model                         string
	Provider                      string
	CacheToolResults              bool
	ContextCompaction             CompactionMode
	CompactionThreshold           float64
	ReasoningEffort               string // OpenAI reasoning effort: "low", "medium", "high"
	ResponseFormat                string // "json_object" or "json_schema" — forces structured output
	ResponseSchema                string // JSON schema string when ResponseFormat is "json_schema"
	// ReflectOnError injects a structured reflection prompt after tool call
	// errors to help the LLM reason about what went wrong before retrying.
	// Default: true.
	ReflectOnError bool

	// VerifyAfterEdit enables automatic test/lint verification after turns that
	// include file writes or edits. If verification fails, the error is fed back
	// to the LLM with a repair prompt. Default: false (opt-in).
	VerifyAfterEdit bool

	// VerifyCommand is the explicit verification command to run. When empty,
	// auto-detection is used (looks for go.mod → "go test ./...", Cargo.toml →
	// "cargo test", package.json → "npm test", Makefile with test target →
	// "make test", pytest markers → "pytest").
	VerifyCommand string

	// MaxVerifyRetries is the maximum number of verify→repair cycles per edit
	// turn before giving up and proceeding. Default: 2.
	MaxVerifyRetries int

	// VerifyOnBreach, when true, makes the session run one verify pass after
	// the turn loop exhausts (MaxTurns reached without a detected loop), using
	// VerifyCommand only (never auto-detection — see resolveBreachVerifier).
	// The pipeline layer sets this to (turn_breach_policy != "fail") so the
	// opt-out path pays no verify cost. Independent of VerifyAfterEdit. (#303)
	VerifyOnBreach bool

	// Checkpoints are messages injected at specific turn-budget fractions.
	// Each checkpoint fires exactly once, on the turn where the fraction is
	// first reached. Fraction is in [0, 1] — e.g. 0.6 means "at 60% of MaxTurns".
	Checkpoints []Checkpoint

	// VerifyBroadCommand is an optional second verification command run after
	// the focused VerifyCommand passes. Use this for regression detection
	// (e.g. run the full test module without -x). Empty means disabled.
	VerifyBroadCommand string

	// Localize enables a pre-processing localization phase that scans the
	// working directory for files relevant to the task prompt and injects a
	// structured context block before the first LLM turn. Pure text analysis
	// plus filesystem scan — no LLM calls. Default: false.
	Localize bool

	// PriorEpisodeSummaries carries summaries from earlier attempts so retries
	// can avoid repeating known-failing approaches.
	PriorEpisodeSummaries []string
	// PlanBeforeExecute inserts one planning-only LLM call before the main turn
	// loop and keeps that plan in conversation context for subsequent turns.
	// Default: false.
	PlanBeforeExecute bool

	// ToolAccess restricts the agent's tool surface. When non-empty (any value),
	// the session registers zero tools, sets ToolChoice=none on LLM requests,
	// scrubs the built-in tool-naming prefix from the system prompt, and rejects
	// Params bypass keys (allowed_tools, disallowed_tools, permission_mode).
	//
	// Defends the v0.28.2 single-agent multi-tool-call vector: an LLM emitting
	// multiple tool calls in one response cannot execute any of them because
	// the registry is empty by construction.
	//
	// Canonical: case-insensitive, whitespace-trimmed. Only recognized spelling
	// is "none"; any other non-empty value still disables tools (fail-closed for
	// typos). Default: "" (unrestricted).
	//
	// System-prompt scope: tracker only scrubs its own built-in basePrompt
	// (which names "read", "write", etc. for path-relative semantics). A
	// caller-supplied SystemPrompt is appended verbatim — if it names tools,
	// the assembled prompt will still contain those tokens. The registry +
	// ToolChoice + dispatch-shortcircuit defenses do not depend on the prompt
	// scrub; the scrub is defense-in-depth against the LLM noticing tool
	// affordances. Callers who need a fully scrubbed assembled prompt should
	// audit their own SystemPrompt under restriction.
	//
	// Issue: github.com/2389-research/tracker#258.
	ToolAccess string

	// WritablePaths is the author-declared write-scope glob list resolved
	// against WorkingDir. Empty/absent = unbounded; non-empty = jail enforced
	// by the runtime (Linux Landlock for Bash subprocess + openat2 for
	// in-process tools). Empty values, malformed globs, working_dir escapes,
	// unsupported backends, and Landlock-unavailable hosts all refuse-to-start
	// at session creation via pipeline/handlers/codergen_jail.go's
	// configureJail gate (Task 14). See issue #272.
	WritablePaths []string

	// WritablePathsSet records whether the writable_paths attr was specified
	// on the originating node, even if the parsed slice is empty. Allows
	// configureJail to distinguish "absent" (Set=false, jail disabled) from
	// "present but parses to no entries" (Set=true, fail-CLOSED). Mirrors
	// pipeline.AgentNodeConfig.WritablePathsSet so the signal carries
	// through the codergen buildConfig handoff intact.
	WritablePathsSet bool

	// Backend names the execution backend for this session. Carried from
	// pipeline.AgentNodeConfig.Backend so configureJail can refuse
	// out-of-process backends (claude-code, acp) and unknown backends
	// (fail-closed) before wiring the writable_paths fs-jail. Empty string
	// is treated as "native" by configureJail. See issue #272.
	Backend string
}
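The VerifyCommand comment above lists marker files that drive auto-detection. A minimal sketch of that lookup follows, covering only the three plain marker files. The Makefile-target and pytest checks are left out because the documentation does not say how they are detected, and the priority order, the verifyMarkers table, and detectVerifyCommand are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// verifyMarkers maps a marker file to the command the doc comment names.
// Assumption: earlier entries win when several markers are present.
var verifyMarkers = []struct{ file, command string }{
	{"go.mod", "go test ./..."},
	{"Cargo.toml", "cargo test"},
	{"package.json", "npm test"},
}

// detectVerifyCommand returns the command for the first marker that exists,
// or "" when nothing matches. The exists callback keeps the lookup testable
// without touching the filesystem.
func detectVerifyCommand(exists func(name string) bool) string {
	for _, m := range verifyMarkers {
		if exists(m.file) {
			return m.command
		}
	}
	return ""
}

func main() {
	dir, err := os.Getwd()
	if err != nil {
		fmt.Println("getwd:", err)
		return
	}
	onDisk := func(name string) bool {
		_, statErr := os.Stat(filepath.Join(dir, name))
		return statErr == nil
	}
	fmt.Printf("detected verify command: %q\n", detectVerifyCommand(onDisk))
}
```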

func DefaultConfig

func DefaultConfig() SessionConfig

func (SessionConfig) IsToolAccessRestricted

func (c SessionConfig) IsToolAccessRestricted() bool

IsToolAccessRestricted reports whether ToolAccess is set to any non-empty canonical value. Used by the session to gate tool registration, ToolChoice, and system-prompt assembly. Fail-closed: any non-empty value (including typos) returns true.
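The fail-closed rule can be shown in a few lines. This sketch assumes canonicalization means trimming whitespace and lowercasing, as the ToolAccess comment suggests. canonicalToolAccess and isRestricted are hypothetical names, and treating a whitespace-only value as unset is an assumption that follows from trimming first.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalToolAccess trims whitespace and lowercases the value.
func canonicalToolAccess(v string) string {
	return strings.ToLower(strings.TrimSpace(v))
}

// isRestricted is fail-closed: any non-empty canonical value disables
// tools, including misspellings of "none".
func isRestricted(v string) bool {
	return canonicalToolAccess(v) != ""
}

func main() {
	for _, v := range []string{"", "none", " None ", "nnone"} {
		fmt.Printf("%q restricted=%v\n", v, isRestricted(v))
	}
}
```

The design choice is that a typo can only make the session more restrictive, never less.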

func (SessionConfig) Validate

func (c SessionConfig) Validate() error

type SessionOption

type SessionOption func(*Session)

SessionOption configures a Session.
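SessionOption is an instance of the functional-options pattern: each option is a function that mutates the session being built, and the constructor applies them in order. A self-contained sketch of the pattern with hypothetical local types (session, option, withHandler, withTools, newSession) is:

```go
package main

import "fmt"

// session is a stand-in with two configurable fields.
type session struct {
	handler func(msg string)
	tools   []string
}

// option mutates a session during construction.
type option func(*session)

func withHandler(h func(msg string)) option {
	return func(s *session) { s.handler = h }
}

func withTools(names ...string) option {
	return func(s *session) { s.tools = append(s.tools, names...) }
}

// newSession sets defaults first, then applies options in the order given.
func newSession(opts ...option) *session {
	s := &session{handler: func(string) {}} // default: discard events
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	s := newSession(
		withTools("read", "write"),
		withHandler(func(msg string) { fmt.Println("event:", msg) }),
	)
	s.handler("session_start")
	fmt.Println(s.tools)
}
```

The pattern keeps the constructor signature stable as new options are added, which fits a constructor like NewSession that takes a variadic list of options.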

func WithEnvironment

func WithEnvironment(env exec.ExecutionEnvironment) SessionOption

WithEnvironment sets the execution environment and registers built-in tools.

func WithEventHandler

func WithEventHandler(h EventHandler) SessionOption

WithEventHandler attaches an event handler to receive session lifecycle events.

func WithSessionRunner

func WithSessionRunner(runner tools.SessionRunner) SessionOption

WithSessionRunner sets the session runner used by the spawn_agent tool to create child sessions.

func WithSteering

func WithSteering(ch <-chan string) SessionOption

WithSteering attaches a steering channel to receive mid-session instructions.
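The package overview says steering messages are checked between turns with a non-blocking channel read. In Go that is usually a select with a default case. The sketch below shows the idiom. pollSteering is a hypothetical helper, not part of the package, and treating a closed channel as "no message" is an assumption.

```go
package main

import "fmt"

// pollSteering returns a pending steering message without blocking.
// The default case runs when the channel is empty or nil.
func pollSteering(ch <-chan string) (string, bool) {
	select {
	case msg, ok := <-ch:
		if !ok {
			return "", false // channel closed: nothing to inject
		}
		return msg, true
	default:
		return "", false
	}
}

func main() {
	steer := make(chan string, 1)

	// Between turns with nothing queued: the loop continues immediately.
	if _, ok := pollSteering(steer); !ok {
		fmt.Println("no steering message")
	}

	// The caller queues an instruction; the next poll picks it up.
	steer <- "focus on the failing test"
	if msg, ok := pollSteering(steer); ok {
		fmt.Println("inject:", msg)
	}
}
```

A buffered channel on the sending side lets the caller queue an instruction without waiting for the agent to reach the next turn boundary.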

func WithTools

func WithTools(tt ...tools.Tool) SessionOption

WithTools registers additional tools into the session's tool registry.

type SessionResult

type SessionResult struct {
	SessionID          string
	Provider           string
	Duration           time.Duration
	Turns              int
	MaxTurnsUsed       bool
	LoopDetected       bool
	BreachVerify       BreachVerifyState // #303: result of the verify-on-breach pass
	ToolCalls          map[string]int
	FilesModified      []string
	FilesCreated       []string
	Usage              llm.Usage
	ContextUtilization float64
	ToolCacheHits      int
	ToolCacheMisses    int
	ToolTimings        map[string]time.Duration
	CompactionsApplied int
	LongestTurn        time.Duration
	EpisodeSummary     string
	Error              error
}

SessionResult holds summary statistics and metadata from a completed session.

func (SessionResult) String

func (r SessionResult) String() string

String returns a human-readable summary of the session result.

func (SessionResult) TotalToolCalls

func (r SessionResult) TotalToolCalls() int

TotalToolCalls returns the sum of all tool call counts.

type TurnMetrics

type TurnMetrics struct {
	InputTokens        int
	OutputTokens       int
	CacheReadTokens    int
	CacheWriteTokens   int
	ContextUtilization float64
	ToolCacheHits      int
	ToolCacheMisses    int
	TurnDuration       time.Duration
	EstimatedCost      float64
}

TurnMetrics captures per-turn token and performance data.

Directories

Path Synopsis
ABOUTME: ExecutionEnvironment interface abstracting where agent tools run.
ABOUTME: Bash tool executes shell commands in the working directory.
