Documentation ¶
Overview ¶
Package agent is the application-layer surface for FlowCraft agents.
Position in the layering ¶
┌──────────────────────────────────────────────────────────┐
│  application layer        sdk/agent        ← this pkg    │
│                                                          │
│        ┌──────────────────────┐                          │
│        │  Agent / Request /   │                          │
│        │  Result / Observer / │                          │
│        │  Decider /           │                          │
│        │  BoardSeeder /       │                          │
│        │  agent.Run(...)      │                          │
│        └──────────────────────┘                          │
│                  ↓ drives                                │
│  core layer               sdk/engine                     │
│        Engine / Host / Run / Board / Checkpoint          │
│                  ↑ implemented by                        │
│  concrete engines         sdk/graph, sdk/script          │
└──────────────────────────────────────────────────────────┘
agent owns "what the user sees": agents, conversations, request / result envelopes, lifecycle observation, lifecycle decisions (disposition, moderation), board seeding policy. It deliberately does NOT own "how a turn is executed" — execution is delegated to an engine.Engine passed at run time:
agent.Run(ctx, ag, eng, req, opts...)
The same agent identity can be driven by different engines (graph for rich decision trees, script for simple flows, A2A-remote for federation) without changing its definition. This is the central design point that supersedes sdk/workflow's Agent-owns-Strategy coupling.
Memory / history / recall integration ¶
agent does NOT define a History interface. The reason: not every engine speaks "graph + node" or stores its working state on the engine.Board the same way, so a single contract for "load before" and "append after" leaks engine assumptions into the application layer. Instead, agent exposes four orthogonal extension points:
BoardSeeder (via WithBoardSeed) builds the initial board. Use it to load conversation history, run retrieval, materialise system prompts, or whatever the engine expects to find on the board at start.
Observer (via WithObserver or [Agent.Observers]) reacts to run lifecycle events with no return value. Use it to append the produced messages to a transcript on completion, emit metrics, snapshot board state on interrupt, etc.
Decider (via WithDecider or [Agent.Deciders]) influences the run's classification at boundary points. Round B exposes [Decider.BeforeFinalize], which sets [Result.Committed] — transcript / archival Observers gate on that flag, so a barge-in or moderation hit can opt the run out of persistence without rewriting any persistence wiring.
engine.Host (via WithEngineHost) is the bag of host-side capabilities the engine reaches for during execution: event publishing, interrupt injection, user prompting, checkpoint persistence, token-usage reporting. Build one struct that embeds engine.NoopHost and override the methods you actually need; share metric clients / tracers / loggers across methods the way only a struct can. agent never wraps or decorates the supplied host — what you pass in is exactly what the engine sees.
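The embed-and-override pattern described for engine.Host can be sketched with plain Go embedding. This is only an illustration: the two-method host interface, noopHost, and meteredHost below are local stand-ins, and the method names Publish and ReportUsage are assumptions, not the real engine.Host method set.

```go
package main

import "fmt"

// host is a hypothetical two-method stand-in for engine.Host.
// The real interface is larger and its method names may differ.
type host interface {
	Publish(topic string, payload any)
	ReportUsage(tokens int)
}

// noopHost plays the role of engine.NoopHost: every method does nothing.
type noopHost struct{}

func (noopHost) Publish(string, any) {}
func (noopHost) ReportUsage(int)     {}

// meteredHost embeds the no-op defaults and overrides only ReportUsage.
// Shared state (here a counter) lives on the struct.
type meteredHost struct {
	noopHost
	totalTokens int
}

func (h *meteredHost) ReportUsage(tokens int) { h.totalTokens += tokens }

func main() {
	h := &meteredHost{}
	var eng host = h                // the engine sees exactly this value
	eng.Publish("run.started", nil) // inherited no-op
	eng.ReportUsage(120)
	eng.ReportUsage(30)
	fmt.Println(h.totalTokens) // 150
}
```

Because the override has a pointer receiver, the host is passed as a pointer so the accumulated state is visible to the caller after the run.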
Concrete history / recall / archival integrations are intentionally the caller's problem: they are 5–10 lines of glue around any transcript store and live with the application that owns the store, not in sdk/agent. See example_multiturn_test.go for the canonical wiring shape.
Allowed dependencies ¶
- sdk/engine (Engine, Host, Run, Board, Checkpoint, …)
- sdk/model (Message, Part, TokenUsage)
- sdk/errdefs
- standard library
agent MUST NOT import sdk/history, sdk/recall, sdk/agent/strategy (when added), sdk/graph, sdk/script, sdk/workflow, sdk/voice, sdk/event. Anything that needs an event bus (Publish wiring, OTel span linking, telemetry sinks) lives in the caller-supplied engine.Host (see WithEngineHost); agent itself does not own any event-routing convention.
What lives here ¶
Agent, AgentCard, Skill — agent identity and capability description. Agent is a *plain struct* (no Strategy-method on it): execution wiring is the caller's concern.
Request / Result / Status / Artifact — one-turn input/output.
BoardSeeder / BoardSeederFunc — the data-injection extension point that runs once before engine.Execute.
Observer / BaseObserver / RunInfo — the read-only lifecycle hooks fired around engine.Execute.
Decider / BaseDecider / FinalizeDecision — the decision-making counterpart of Observer. BeforeFinalize fires after every engine.Execute attempt; the merged decision drives [Result.Committed], records the [FinalizeDecision.Reason] in Result.State["finalize_reason"], and (when WithMaxRevise is enabled) gates the revise loop.
DiscardOnInterruptCauses — the canonical disposition Decider for voice / streaming UX. Constructs a Decider that marks Result.Committed=false on barge-in causes.
Run — the entry point that wires Request + Agent + Engine + observers + deciders + seeder together for one turn, returning a Result. Honours WithResumeFrom for checkpoint replay and WithMaxRevise for Decider-driven re-attempts (see "Resume / Revise" below).
RunOption and the WithXxx helpers — plumb optional behaviours into Run without making the function signature unwieldy.
Resume / Revise ¶
Two attempt-shaping options compose with everything else:
WithResumeFrom(cp) replays a previous run from cp by setting engine.Run.ResumeFrom and overriding the run id to cp.ExecID. Engines without engine.Resumer surface NotAvailable; engines with it (graph runner, future script engine) restore board state from cp.Board and continue from cp.Step. ResumeFrom applies to attempt 1 only — see Revise.
WithMaxRevise(n) opts in to the [FinalizeDecision.Revise] loop. When n>=2, deciders that return Revise=true on a completed attempt trigger another engine.Execute pass with a freshly-seeded board (revise is a fresh retry, not a checkpoint replay — ResumeFrom is dropped after attempt 1). The loop exits when no decider asks for revise OR the attempt counter reaches n. Failed / interrupted / canceled / aborted attempts NEVER consume budget; transient infrastructure errors surface immediately. [Result.Attempts] reports the actual count; [Observer.OnRunRevise] fires once per attempt transition.
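The attempt budget described above can be sketched as a small loop. This is a simplified model, not the Run implementation: attemptFn is a hypothetical stand-in for "one completed engine.Execute pass plus the Decider chain", and failed or interrupted attempts (which the text says surface immediately) are not modelled.

```go
package main

import "fmt"

// attemptFn is a hypothetical stand-in for one completed engine.Execute pass
// followed by the Decider chain. It reports whether any Decider asked for a
// revise on that attempt.
type attemptFn func(attempt int) (revise bool)

// runWithRevise returns how many attempts were made. The loop stops when no
// Decider asks for a revise or when the attempt counter reaches maxRevise.
// With maxRevise below 2 exactly one attempt is made.
func runWithRevise(maxRevise int, attempt attemptFn) int {
	attempts := 0
	for {
		attempts++
		revise := attempt(attempts)
		if !revise || attempts >= maxRevise {
			return attempts
		}
	}
}

func main() {
	alwaysRevise := func(int) bool { return true }
	fmt.Println(runWithRevise(3, alwaysRevise)) // 3
	fmt.Println(runWithRevise(0, alwaysRevise)) // 1
}
```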
What does NOT live here yet (later) ¶
RunHandle / ResumeToken for in-flight run management (deferred until vessel-level handle plumbing matures).
Strategy adapter for compiled engines (sdk/agent/strategy will host it once we know what shape it should have).
Index ¶
- Constants
- func DefaultHandoffToolName(agentID string) string
- func HandoffTool(h Handoff) tool.Tool
- func HandoffTools(ctx context.Context, req *Request, hs []Handoff) []tool.Tool
- type Agent
- type AgentCapabilities
- type AgentCard
- type Artifact
- type BaseDecider
- type BaseObserver
- type BoardSeeder
- type BoardSeederFunc
- type Decider
- type DiscardOnInterruptCauses
- type FinalizeDecision
- type Handoff
- type HandoffArgs
- type HandoffEvent
- type Observer
- type Request
- type RequestConfig
- type Result
- type RunInfo
- type RunOption
- func WithArtifactChannels(channels ...string) RunOption
- func WithAttributes(extra map[string]string) RunOption
- func WithBoardSeed(s BoardSeeder) RunOption
- func WithDecider(d Decider) RunOption
- func WithDependencies(d *engine.Dependencies) RunOption
- func WithEngineHost(h engine.Host) RunOption
- func WithMaxRevise(n int) RunOption
- func WithObserver(o Observer) RunOption
- func WithParentRunID(id string) RunOption
- func WithResumeFrom(cp *engine.Checkpoint) RunOption
- type Skill
- type Status
Constants ¶
const HandoffFinalizeReason = "handoff:"
HandoffFinalizeReason is the conventional [FinalizeDecision.Reason] prefix used by HandoffDecider. Format: "handoff:<to_agent_id>". Telemetry consumers can branch on the prefix without parsing the full state map.
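A telemetry consumer might branch on the prefix roughly as follows. The helper name handoffTarget is hypothetical, and the constant is redeclared locally so the sketch compiles without the SDK.

```go
package main

import (
	"fmt"
	"strings"
)

// handoffFinalizeReason mirrors agent.HandoffFinalizeReason.
const handoffFinalizeReason = "handoff:"

// handoffTarget is a hypothetical helper. It returns the destination agent id
// when reason follows the "handoff:<to_agent_id>" convention.
func handoffTarget(reason string) (string, bool) {
	if !strings.HasPrefix(reason, handoffFinalizeReason) {
		return "", false
	}
	return strings.TrimPrefix(reason, handoffFinalizeReason), true
}

func main() {
	if id, ok := handoffTarget("handoff:billing"); ok {
		fmt.Println("hand-off to", id) // hand-off to billing
	}
}
```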
const HandoffStateKey = "handoff"
HandoffStateKey is the [Result.State] map key under which HandoffDecider writes its HandoffEvent. Exposed as a constant so hosts and Observers can probe the state without depending on the decider's import path.
Functions ¶
func DefaultHandoffToolName ¶ added in v0.2.9
func DefaultHandoffToolName(agentID string) string
DefaultHandoffToolName produces the canonical LLM-facing tool name from an agent id: "transfer_to_<sanitised>" where the sanitiser lowercases and replaces any non-[a-z0-9_] rune with "_". Mirrors what HandoffTool uses when [Handoff.ToolName] is empty so callers writing prompt templates can compute the same name without booting the SDK.
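For prompt templates that need the same name without importing the SDK, the documented rule can be re-implemented in a few lines. The function below is an illustrative re-implementation of the rule as stated, not the SDK's source.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultToolName follows the documented rule: lowercase the agent id, then
// replace every rune outside [a-z0-9_] with "_", and prefix "transfer_to_".
func defaultToolName(agentID string) string {
	sanitised := strings.Map(func(r rune) rune {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '_' {
			return r
		}
		return '_'
	}, strings.ToLower(agentID))
	return "transfer_to_" + sanitised
}

func main() {
	fmt.Println(defaultToolName("Billing-EU")) // transfer_to_billing_eu
}
```

Note that ids differing only in case or punctuation map to the same name, which is the collision case [Handoff.ToolName] exists to resolve.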
func HandoffTool ¶ added in v0.2.9
func HandoffTool(h Handoff) tool.Tool
HandoffTool returns one tool.Tool that exposes h to the LLM. The tool's Execute body is intentionally a no-op (returns a short confirmation string) — actual dispatch is the host's job, after observing HandoffEvent in [Result.State]. This separation keeps the LLM round happy (it sees a successful tool result and finishes its turn cleanly) without giving the tool surface any control over the receiving agent's wiring.
HandoffTool panics if h.ToAgentID is empty: a hand-off without a destination is a programming error best caught at registration time.
func HandoffTools ¶ added in v0.2.9
func HandoffTools(ctx context.Context, req *Request, hs []Handoff) []tool.Tool
HandoffTools converts a slice of Handoff into the matching slice of [tool.Tool]s, applying any per-Handoff Filter against req. Hand-offs whose Filter rejects req are silently omitted — the tool simply does not appear to the LLM that turn.
Pass req=nil to skip filtering (e.g. for static analysis or admin UI listings).
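The filtering rule can be modelled as below. The request and handoff types are trimmed local stand-ins, and visibleHandoffs is a hypothetical name; the sketch only shows which hand-offs survive, not how tools are built.

```go
package main

import (
	"context"
	"fmt"
)

// request and handoff are minimal stand-ins for agent.Request and agent.Handoff.
type request struct{ Tenant string }

type handoff struct {
	ToAgentID string
	Filter    func(ctx context.Context, req *request) bool
}

// visibleHandoffs keeps a hand-off unless its Filter rejects req.
// A nil req skips filtering entirely; a nil Filter always passes.
func visibleHandoffs(ctx context.Context, req *request, hs []handoff) []handoff {
	out := make([]handoff, 0, len(hs))
	for _, h := range hs {
		if req != nil && h.Filter != nil && !h.Filter(ctx, req) {
			continue // silently omitted for this turn
		}
		out = append(out, h)
	}
	return out
}

// demoHandoffs has one unconditional hand-off and one gated by tenant.
var demoHandoffs = []handoff{
	{ToAgentID: "billing"},
	{ToAgentID: "vip", Filter: func(_ context.Context, r *request) bool { return r.Tenant == "gold" }},
}

func main() {
	ctx := context.Background()
	fmt.Println(len(visibleHandoffs(ctx, &request{Tenant: "free"}, demoHandoffs))) // 1
	fmt.Println(len(visibleHandoffs(ctx, nil, demoHandoffs)))                      // 2
}
```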
Types ¶
type Agent ¶
type Agent struct {
// ID is the stable identifier for this agent. It flows into
// telemetry, history conversation keys, and any A2A federation
// envelope. MUST be non-empty.
ID string `json:"id"`
// Card describes the agent's capabilities for discovery (A2A,
// dashboards, …). Optional.
Card AgentCard `json:"card,omitempty"`
// Tools is the list of tool ids the agent is permitted to call.
// The engine looks tools up by id in its dependency container at
// run time; this list is the policy gate, not the wiring.
Tools []string `json:"tools,omitempty"`
// Observers are agent-scoped lifecycle observers. They fire on
// every [Run] of this agent value, before any observers added
// via [WithObserver] for the specific call. JSON-skipped because
// observers carry runtime state (channels, stores, …) that does
// not round-trip through serialisation.
Observers []Observer `json:"-"`
// Deciders are agent-scoped decision hooks (see [Decider]). They
// run before any Decider added via [WithDecider] for the
// specific call. Same JSON-skip rationale as Observers.
Deciders []Decider `json:"-"`
}
Agent is the application-layer description of one logical agent. It is a plain data struct, NOT a "live object" that knows how to run itself: execution is performed by passing an engine.Engine to Run alongside the Agent. The same Agent value can therefore be driven by different engines without re-construction.
This is the central distinction from sdk/workflow.Agent, which internally owned a Strategy and so could only execute one way.
type AgentCapabilities ¶
type AgentCapabilities struct {
// Streaming reports whether the agent emits server-sent events
// during a turn.
Streaming bool `json:"streaming,omitempty"`
// PushNotifications reports whether the agent can push update
// notifications back to the client.
PushNotifications bool `json:"pushNotifications,omitempty"`
// StateTransitionHistory reports whether the agent exposes its
// task state-transition history.
StateTransitionHistory bool `json:"stateTransitionHistory,omitempty"`
}
AgentCapabilities declares which optional A2A features the agent supports. JSON keys exactly match the A2A spec — note the plural PushNotifications and the longer StateTransitionHistory.
type AgentCard ¶
type AgentCard struct {
// Name is a human-readable name for the agent.
Name string `json:"name"`
// Description explains what the agent does.
Description string `json:"description,omitempty"`
// Skills enumerates the capability units the agent can perform.
Skills []Skill `json:"skills,omitempty"`
// DefaultInputModes lists MIME types the agent accepts when a
// skill does not override them. Per the A2A spec, this field is
// keyed "defaultInputModes" — singular "inputModes" lives on
// individual skills.
DefaultInputModes []string `json:"defaultInputModes,omitempty"`
// DefaultOutputModes lists MIME types the agent emits when a
// skill does not override them.
DefaultOutputModes []string `json:"defaultOutputModes,omitempty"`
// Capabilities declares which optional A2A protocol features the
// agent supports.
Capabilities AgentCapabilities `json:"capabilities,omitempty"`
}
AgentCard describes an agent's capabilities for discovery. Field names and JSON tags are a *proper subset* of the A2A AgentCard specification: every field marshals to the exact key A2A readers expect, and AgentCard never collides with an A2A field by using a different name.
What is intentionally NOT here:
url / version / provider / documentationUrl / authentication — these belong to "how the agent is exposed as a service", a concern owned by sdk/a2a (when added). The intended layering is that sdk/a2a's expose-time card embeds AgentCard and adds the deployment fields:
type Card struct {
	agent.AgentCard
	URL              string          `json:"url"`
	Version          string          `json:"version"`
	Provider         *Provider       `json:"provider,omitempty"`
	Authentication   *Authentication `json:"authentication,omitempty"`
	DocumentationURL string          `json:"documentationUrl,omitempty"`
}
This keeps sdk/agent's runtime-identity surface stable when the A2A spec evolves its deployment metadata.
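The reason embedding works for this layering is that encoding/json flattens the exported fields of an embedded struct into the outer object. The sketch below uses trimmed local copies of AgentCard and Card (sdk/a2a does not exist yet per the text above, so the outer type is hypothetical).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AgentCard is a trimmed local copy of agent.AgentCard.
type AgentCard struct {
	Name        string `json:"name"`
	Description string `json:"description,omitempty"`
}

// Card is a hypothetical expose-time card that embeds AgentCard and adds
// deployment fields.
type Card struct {
	AgentCard
	URL     string `json:"url"`
	Version string `json:"version"`
}

// marshalCard returns the JSON encoding, or "" if encoding fails.
func marshalCard(c Card) string {
	b, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(marshalCard(Card{
		AgentCard: AgentCard{Name: "billing"},
		URL:       "https://example.invalid/a2a",
		Version:   "1.0.0",
	}))
	// {"name":"billing","url":"https://example.invalid/a2a","version":"1.0.0"}
}
```

The embedded fields appear at the top level next to the deployment fields, so an A2A reader sees a single flat card.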
Reference: https://agent2agent.info/docs/concepts/agentcard/
type Artifact ¶
Artifact is a named bundle of typed parts produced during a run (e.g. "summary", "tool_output_image"). Engines that write artifacts store them in a board channel; agent collects channel contents into Artifacts on the way out.
type BaseDecider ¶
type BaseDecider struct{}
BaseDecider provides a no-op default implementation of every Decider method. Embed it when only a subset of decision points matter.
func (BaseDecider) BeforeFinalize ¶
func (BaseDecider) BeforeFinalize(context.Context, RunInfo, *Request, *Result) (FinalizeDecision, error)
BeforeFinalize returns the zero-value FinalizeDecision (no opinion).
type BaseObserver ¶
type BaseObserver struct{}
BaseObserver provides no-op default implementations of every Observer method. Embed it in custom observers that only care about a subset of the lifecycle:
type historyAppender struct {
	agent.BaseObserver
	store sdk_history.History
}

// OnRunEnd persists the turn only when Run marked it committed, so a
// Decider that discards output (barge-in, moderation) also skips the append.
func (h *historyAppender) OnRunEnd(ctx context.Context, info agent.RunInfo, res *agent.Result) {
	if !res.Committed {
		return
	}
	_ = h.store.Append(ctx, info.ContextID, res.Messages)
}
func (BaseObserver) OnInterrupt ¶
func (BaseObserver) OnInterrupt(context.Context, RunInfo, engine.Interrupt)
OnInterrupt is a no-op.
func (BaseObserver) OnRunEnd ¶
func (BaseObserver) OnRunEnd(context.Context, RunInfo, *Result)
OnRunEnd is a no-op.
func (BaseObserver) OnRunRevise ¶ added in v0.3.4
func (BaseObserver) OnRunRevise(context.Context, RunInfo, *Result, int)
OnRunRevise is a no-op.
func (BaseObserver) OnRunStart ¶
func (BaseObserver) OnRunStart(context.Context, RunInfo, *Request)
OnRunStart is a no-op.
type BoardSeeder ¶
type BoardSeeder interface {
SeedBoard(ctx context.Context, info RunInfo, req *Request) (*engine.Board, error)
}
BoardSeeder builds the initial engine.Board for a run.
It is the single extension point for "anything that should be on the board before the engine sees it":
- conversation history (load from sdk/history, summarise, window);
- retrieved long-term memory (sdk/recall results, knowledge-base hits);
- system prompts and persona text;
- request-scoped board vars (form fields, parameters, tool allow-lists);
- any combination of the above.
Run guarantees:
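- SeedBoard is called exactly once per attempt, before engine.Execute and before any Observer's OnRunStart. A Run makes a single attempt unless WithMaxRevise triggers a revise, in which case the board is re-seeded for each new attempt.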
- The returned board is mutated by the engine; SeedBoard must therefore return a fresh value each call (do NOT cache and re-yield a single Board).
- The returned board MUST be non-nil. Returning nil is a Run infrastructure error.
Implementations are expected to be cheap and synchronous; long async work (retrieval, IO) belongs in a wrapper that resolves before Run.
type BoardSeederFunc ¶
BoardSeederFunc is the function-typed adapter for BoardSeeder.
Useful when the seed logic is a single closure over a transcript loader or retriever:
agent.WithBoardSeed(agent.BoardSeederFunc(func(ctx context.Context, info agent.RunInfo, req *agent.Request) (*engine.Board, error) {
	prior, err := store.Load(ctx, info.ContextID)
	if err != nil {
		return nil, err
	}
	b := engine.NewBoard()
	b.SetChannel(engine.MainChannel, prior)
	b.AppendChannelMessage(engine.MainChannel, req.Message)
	return b, nil
}))
type Decider ¶
type Decider interface {
// BeforeFinalize fires after engine.Execute returns. The Decider
// inspects res (read-only) and the original req, and returns a
// FinalizeDecision that Run merges with other Deciders' decisions.
//
// info carries the immutable identification fields agreed for
// this turn. The Decider MUST NOT mutate res; agent will surface
// the merged decision via [Result.Committed] and (when a Reason
// was supplied) [Result.State]["finalize_reason"].
BeforeFinalize(ctx context.Context, info RunInfo, req *Request, res *Result) (FinalizeDecision, error)
}
Decider is a decision-making lifecycle hook that can influence what agent.Run does at well-defined boundaries. It is the read-write counterpart of Observer:
- Observers see what happened and emit side effects (logs, metrics, transcript persistence).
- Deciders return a structured decision agent.Run interprets.
Round B exposes one decision point — [Decider.BeforeFinalize] — which fires after engine.Execute returned but before Run commits the produced messages to history (i.e., before any Observer's OnRunEnd). This covers two real cases:
Disposition: a barge-in cause means the assistant was cut off mid-thought; the half-baked output should not appear in the persistent transcript. A Decider returns FinalizeDecision{DiscardOutput: true}.
Revise: the natural answer fails some quality bar (no citations, policy violation, refusal-without-reason); the Decider asks for one more model pass by returning FinalizeDecision{Revise: true}. Run honours the request only when the caller opted in via WithMaxRevise and the attempt budget allows another pass; otherwise the Reason is recorded and no further engine call is made.
Composition ¶
Multiple Deciders may be registered (Agent-scoped + per-call). They run in registration order. The merged decision is the OR over boolean fields: any Decider asking to discard wins; same for revise. The first non-empty Reason wins, so callers can attribute the decision in logs.
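The merge rule above can be written down directly. This is a minimal sketch assuming the three documented fields; finalizeDecision is a local mirror of FinalizeDecision and mergeDecisions is a hypothetical name.

```go
package main

import "fmt"

// finalizeDecision mirrors the documented FinalizeDecision fields.
type finalizeDecision struct {
	DiscardOutput bool
	Revise        bool
	Reason        string
}

// mergeDecisions folds decisions in registration order: boolean fields are
// OR-ed together and the first non-empty Reason is kept.
func mergeDecisions(ds ...finalizeDecision) finalizeDecision {
	var merged finalizeDecision
	for _, d := range ds {
		merged.DiscardOutput = merged.DiscardOutput || d.DiscardOutput
		merged.Revise = merged.Revise || d.Revise
		if merged.Reason == "" {
			merged.Reason = d.Reason
		}
	}
	return merged
}

func main() {
	got := mergeDecisions(
		finalizeDecision{}, // no opinion
		finalizeDecision{DiscardOutput: true, Reason: "barge-in"},
		finalizeDecision{Revise: true, Reason: "no citations"},
	)
	fmt.Printf("%+v\n", got) // {DiscardOutput:true Revise:true Reason:barge-in}
}
```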
Error contract ¶
A Decider returning a non-nil error short-circuits the merge and causes Run to return (Result, decider-error). agent does NOT swap the error class — it surfaces the Decider's own error so callers can classify with errdefs. The Result is still populated (including the engine's output) so the caller can decide what to do next.
Embed BaseDecider to satisfy the interface with no-op defaults.
func HandoffDecider ¶ added in v0.2.9
HandoffDecider returns a Decider that scans Result.Messages for the FIRST tool call whose name matches one of hs and, when found, records the hand-off in Result.State and emits a FinalizeDecision with Reason = HandoffFinalizeReason + ToAgentID.
Behaviour:
Detection is read-only: the decider does not modify Result.Messages. The hand-off tool's Execute already produced a tool_result the LLM will see; suppressing the rest of the turn is the host's job (typically by *not* committing the result and dispatching the next agent).
DiscardOutput is left at false. Hosts that want to drop the LLM's pre-handoff prose should layer their own decider downstream — the choice depends on whether the assistant's "Sure, transferring you now…" line should appear in the transcript.
Multiple hand-offs in one turn are deduped to the first match (see Handoff doc).
The returned Decider stores the HandoffEvent under HandoffStateKey; consumers retrieve it via HandoffFromResult which performs the type assertion + map initialisation safely.
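The first-match rule can be illustrated with a small scan. How the real decider walks Result.Messages is not shown here, so the sketch assumes the turn's tool calls have already been flattened into a slice; toolCall, handoffEvent, and detectHandoff are local, hypothetical names.

```go
package main

import "fmt"

// toolCall and handoffEvent are local stand-ins for the SDK types.
type toolCall struct {
	ID   string
	Name string
}

type handoffEvent struct {
	ToAgentID  string
	ToolName   string
	ToolCallID string
}

// detectHandoff returns the event for the FIRST call whose name is a
// registered hand-off tool. toolToAgent maps tool name to receiving agent id.
// The input slice is only read, never modified.
func detectHandoff(calls []toolCall, toolToAgent map[string]string) (handoffEvent, bool) {
	for _, c := range calls {
		if agentID, ok := toolToAgent[c.Name]; ok {
			return handoffEvent{ToAgentID: agentID, ToolName: c.Name, ToolCallID: c.ID}, true
		}
	}
	return handoffEvent{}, false
}

func main() {
	calls := []toolCall{
		{ID: "c1", Name: "search_orders"},
		{ID: "c2", Name: "transfer_to_billing"},
		{ID: "c3", Name: "transfer_to_tech"},
	}
	tools := map[string]string{"transfer_to_billing": "billing", "transfer_to_tech": "tech"}
	ev, ok := detectHandoff(calls, tools)
	fmt.Println(ev.ToAgentID, ev.ToolCallID, ok) // billing c2 true
}
```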
type DiscardOnInterruptCauses ¶
type DiscardOnInterruptCauses struct {
// contains filtered or unexported fields
}
DiscardOnInterruptCauses is a Decider that asks Run to discard the produced output whenever the engine reported an interrupt with any of the listed causes. It is the canonical disposition policy for voice / streaming UX — a barge-in shouldn't leave half-baked assistant text in the transcript.
Construct it with NewDiscardOnInterruptCauses; the zero value is not useful (no causes match).
Default behaviour without this Decider ¶
agent.Run already sets Result.Committed=false on every non-completed outcome by default, so installing DiscardOnInterruptCauses purely for "discard on barge-in" is technically redundant — the default would discard anyway. The reason it is still a useful Decider:
it sets Result.State["finalize_reason"] to a caller-supplied attribution string, which the default policy cannot do;
it makes the policy explicit at the call site so a future change to the default (e.g. "commit interrupted runs by default") would not silently change voice's behaviour.
func NewDiscardOnInterruptCauses ¶
func NewDiscardOnInterruptCauses(reason string, causes ...engine.Cause) *DiscardOnInterruptCauses
NewDiscardOnInterruptCauses returns a Decider that discards output for the given engine.Cause set. Reason is recorded in Result.State["finalize_reason"] when the decider fires.
Common preset:
agent.NewDiscardOnInterruptCauses("barge-in",
	engine.CauseUserInput, engine.CauseUserCancel)
func (*DiscardOnInterruptCauses) BeforeFinalize ¶
func (d *DiscardOnInterruptCauses) BeforeFinalize(_ context.Context, _ RunInfo, _ *Request, res *Result) (FinalizeDecision, error)
BeforeFinalize implements Decider.
type FinalizeDecision ¶
type FinalizeDecision struct {
// DiscardOutput, when true, instructs Run to mark Result.Committed
// = false regardless of Status. Observers reading Committed
// (notably history-append observers) skip persistence on a
// discarded run.
//
// Setting DiscardOutput on a StatusCompleted run is allowed and
// useful for moderation hooks ("the answer violates policy, do
// not persist it").
DiscardOutput bool
// Revise asks agent.Run to discard this attempt's output and
// re-invoke engine.Execute with a fresh board (re-seeded from
// the original Request). Honoured ONLY when the per-call
// [WithMaxRevise] budget allows another attempt; the option
// defaults to 0, so by default Revise is recorded as a
// finalize_reason but does NOT trigger another engine call —
// callers must opt in explicitly to avoid runaway loops on
// faulty Deciders.
//
// When honoured the lifecycle is:
//
// 1. Decider returns Revise=true (and optionally Reason).
// 2. Run fires [Observer.OnRunRevise] with the about-to-be-
// discarded Result and the next attempt index.
// 3. Board is re-seeded via the configured BoardSeeder; the
// same engine.Run identifier is reused so observers that
// key by run id can correlate attempts.
// 4. engine.Execute runs again. The Decider chain runs again
// on the new Result.
// 5. The loop exits when either Revise=false or the attempt
// counter reaches WithMaxRevise. The final Result.Attempts
// reflects how many engine.Execute calls were made.
//
// Revise interacts with [WithResumeFrom]: ResumeFrom applies
// to the FIRST attempt only. Revise restarts are fresh runs
// (the engine should be re-entered from the start), so
// subsequent attempts drop ResumeFrom — replaying a checkpoint
// repeatedly would defeat the purpose of asking for a revision.
Revise bool
// Reason is a free-form short string explaining the decision.
// Agent stores the first non-empty Reason in Result.State under
// "finalize_reason" so logs / metrics can attribute the
// outcome.
Reason string
}
FinalizeDecision is the return type of [Decider.BeforeFinalize]. The zero value means "no opinion" — agent applies its defaults.
Defaults agent.Run uses when no Decider returns a directive:
- StatusCompleted runs are committed.
- StatusInterrupted / StatusCanceled / StatusAborted / StatusFailed runs are NOT committed (their partial output is dropped from the transcript view). This matches the conservative behaviour Round A had hard-coded; Round B simply makes it overridable.
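Taken together with DiscardOutput, the defaults reduce to a small truth table. The sketch below assumes DiscardOutput is the only directive that affects the flag; the status type and its string values are local placeholders, not the SDK's definitions.

```go
package main

import "fmt"

// status is a local placeholder for agent.Status; the string values are
// illustrative only.
type status string

const (
	statusCompleted   status = "completed"
	statusInterrupted status = "interrupted"
	statusFailed      status = "failed"
)

// committed models the documented defaults: only completed runs are
// committed, and a merged DiscardOutput forces the flag to false.
func committed(s status, discardOutput bool) bool {
	return s == statusCompleted && !discardOutput
}

func main() {
	fmt.Println(committed(statusCompleted, false))   // true
	fmt.Println(committed(statusCompleted, true))    // false
	fmt.Println(committed(statusInterrupted, false)) // false
}
```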
type Handoff ¶ added in v0.2.9
type Handoff struct {
// ToAgentID is the stable identifier of the receiving agent.
// The host is responsible for resolving this id to a runnable
// Agent + Engine pair; the SDK stays out of routing.
ToAgentID string
// Description is shown to the LLM as the tool description so
// it knows when to invoke the hand-off. Keep it short and
// behavioural ("Refunds, invoices, plans"). Empty falls back
// to a generic "Transfer the conversation to <id>".
Description string
// ToolName overrides the LLM-facing tool name. Default is
// "transfer_to_<sanitised_ToAgentID>" (lowercased,
// non-alphanumeric → "_"). Override when two hand-offs would
// otherwise collide on the default name (e.g. multiple
// agents whose IDs differ only in case).
ToolName string
// Filter, when set, is consulted by [HandoffTools] to allow
// per-request gating without rebuilding the slice. A Filter
// returning false hides the hand-off from this turn's LLM.
// Typical use: tenant-aware gating, permission checks.
Filter func(ctx context.Context, req *Request) bool
// OnInvoke fires synchronously inside the hand-off tool's
// Execute call. Use it for lightweight observability (a log
// line, a metric) — heavy work should be done by the host
// after Run returns. Returning an error from OnInvoke fails
// the tool call and the LLM may retry; the hand-off is NOT
// recorded in that case.
OnInvoke func(ctx context.Context, args HandoffArgs) error
}
Handoff describes a controlled hand-off from the *current* agent to a *target* agent. Hand-offs are exposed to the LLM as ordinary tools — calling the tool is the LLM's way of saying "I want to transfer this conversation to <target>". The agent layer detects that call after Run returns and writes a structured HandoffEvent into [Result.State] under HandoffStateKey so the host can dispatch the next turn (typically via sdk/kanban).
The DSL is deliberately data-shaped: a Handoff value can be declared next to the Agent, marshalled to JSON for an admin UI, and inspected without booting the runtime. Attaching hand-offs to a turn is two lines:
hs := []agent.Handoff{
	{ToAgentID: "billing", Description: "Refunds, invoices, plans"},
	{ToAgentID: "tech", Description: "Bugs, errors, integrations"},
}
// HandoffTools takes the context and the current request so that each
// Handoff's Filter can be applied.
tools := append(baseTools, agent.HandoffTools(ctx, req, hs)...)
deciders := []agent.Decider{agent.HandoffDecider(hs)}
Recommended host loop after Run returns:
if ev, ok := agent.HandoffFromResult(res); ok {
	next := dispatch(ev.ToAgentID, ev.Args.Note) // kanban / direct Run / queue
	return next, nil
}
Hand-offs do not fork: only one hand-off per turn is honoured. When the LLM calls multiple hand-off tools in the same turn, the FIRST call wins; subsequent hand-off tool calls are dropped at detection time. Honouring exactly one call avoids ambiguous double-dispatch; note that a mid-turn correction ("transfer me to billing, actually no, tech") therefore resolves to the first tool call, not the last.
type HandoffArgs ¶ added in v0.2.9
type HandoffArgs struct {
// Reason is a short rationale for the transfer. Surfaced to
// the receiving agent and to telemetry. Example:
// "User asked about invoice for order #1234".
Reason string `json:"reason,omitempty"`
// Note is an optional free-form message the LLM wants the
// receiving agent to read first. Avoid stuffing the entire
// transcript here; the receiving side should reload history
// via the host's normal mechanisms.
Note string `json:"note,omitempty"`
}
HandoffArgs is the LLM-supplied JSON-decoded argument bundle for a hand-off tool call. Only Reason / Note are exposed — the SDK purposefully refuses richer parameter shapes to keep the schema uniform across hand-offs (different shapes per hand-off encourage the LLM to leak structured data into a free-form field). Hosts that need richer dispatch data should attach it via Request metadata before scheduling the next turn, not via the hand-off tool itself.
type HandoffEvent ¶ added in v0.2.9
type HandoffEvent struct {
// ToAgentID mirrors [Handoff.ToAgentID].
ToAgentID string `json:"to_agent_id"`
// ToolName is the tool name the LLM actually invoked.
ToolName string `json:"tool_name"`
// ToolCallID echoes the ToolCall.ID from the model, so the
// host can correlate downstream events back to the originating
// call.
ToolCallID string `json:"tool_call_id"`
// Args carries the LLM-supplied reason / note.
Args HandoffArgs `json:"args,omitempty"`
}
HandoffEvent is the structured record placed in [Result.State] under HandoffStateKey when HandoffDecider detects a hand-off tool call. The host consumes this to dispatch the next turn.
func HandoffFromResult ¶ added in v0.2.9
func HandoffFromResult(res *Result) (HandoffEvent, bool)
HandoffFromResult extracts the HandoffEvent previously written by HandoffDecider. Returns (zero, false) when no hand-off happened or when the state slot was overwritten with an unexpected type.
Hosts use this in their dispatch loop:
res, _ := agent.Run(ctx, current, eng, req, opts...)
if ev, ok := agent.HandoffFromResult(res); ok {
	return dispatchTo(ev.ToAgentID, ev.Args.Note)
}
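The "(zero, false)" behaviour falls out of Go's comma-ok type assertion. The sketch below assumes Result.State is a map[string]any and that the event is stored by value; both are assumptions, and result, handoffEvent, and handoffFromResult are local stand-ins.

```go
package main

import "fmt"

// handoffStateKey mirrors agent.HandoffStateKey.
const handoffStateKey = "handoff"

// handoffEvent and result are trimmed local stand-ins.
type handoffEvent struct{ ToAgentID string }

type result struct{ State map[string]any }

// handoffFromResult returns the stored event, or (zero, false) when the
// result is nil, the key is absent, or the slot holds another type.
func handoffFromResult(res *result) (handoffEvent, bool) {
	if res == nil {
		return handoffEvent{}, false
	}
	// Indexing a nil map is safe and yields a nil interface value, for
	// which the comma-ok assertion reports false.
	ev, ok := res.State[handoffStateKey].(handoffEvent)
	return ev, ok
}

func main() {
	res := &result{State: map[string]any{handoffStateKey: handoffEvent{ToAgentID: "billing"}}}
	ev, ok := handoffFromResult(res)
	fmt.Println(ev.ToAgentID, ok) // billing true
}
```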
type Observer ¶
type Observer interface {
// OnRunStart fires after Run prepared the engine inputs but
// before engine.Execute is invoked. info carries the immutable
// identification fields agreed for this turn.
OnRunStart(ctx context.Context, info RunInfo, req *Request)
// OnInterrupt fires only when the engine returned an interrupt
// error. It runs before OnRunEnd. intr carries the structured
// reason supplied by the host.
OnInterrupt(ctx context.Context, info RunInfo, intr engine.Interrupt)
// OnRunRevise fires when a Decider asked agent.Run to re-invoke
// engine.Execute (FinalizeDecision{Revise: true}) AND the
// per-call WithMaxRevise budget allows another attempt. It
// runs after the discarded attempt's classification but BEFORE
// the next OnRunStart, so observers see the lifecycle as:
//
// OnRunStart → engine.Execute → OnRunRevise → OnRunStart → engine.Execute → OnRunEnd
//
// prevRes is the (about-to-be-replaced) Result from the discarded
// attempt; observers MUST treat it as read-only. nextAttempt
// is the 1-indexed attempt number the next engine.Execute will
// be (== prevRes.Attempts + 1).
//
// OnRunRevise is the canonical hook for "log how many times the
// answer needed revision" / "page on excessive revise loops" /
// "snapshot intermediate boards before they are discarded". It
// fires zero times for runs that complete on the first attempt
// or whose Decider never asks for revise.
OnRunRevise(ctx context.Context, info RunInfo, prevRes *Result, nextAttempt int)
// OnRunEnd fires after engine.Execute returned and Run finished
// classifying the outcome. res is the same pointer Run is about
// to return; observers MUST treat it as read-only.
OnRunEnd(ctx context.Context, info RunInfo, res *Result)
}
Observer is a read-only lifecycle hook that lets callers react to stages of a Run without affecting its outcome. It is the plumbing behind agent's "history append on completion", "metric emit on start", "transcript snapshot on interrupt", and similar patterns, none of which agent hard-codes any more.
Design rules:
Observers MUST NOT change the Result returned by Run. agent intentionally exposes the Result to OnRunEnd by pointer because it is the same value the caller will receive — observers may stash references to it (for logging, async append, …) but mutating it leaves agent's caller staring at the mutation. Treat this surface as advisory.
Observer methods MUST NOT return an error. Failures inside an observer are the observer's problem; they MUST NOT propagate into Run. When an observer needs to fail or alter a turn (guard hooks, moderation, disposition), use a Decider instead — its explicit decision semantics keep the flow auditable.
Observer methods are called synchronously from Run on the caller's goroutine. Blocking inside them blocks the run. Long-running side effects MUST be dispatched asynchronously by the observer itself.
Run guarantees the call sequence: OnRunStart fires exactly once per attempt, before that attempt's engine.Execute; OnInterrupt fires at most once and ONLY when the engine returned an engine.InterruptedError (foreign-shape errors that merely satisfy errdefs.IsInterrupted still classify the run as interrupted but skip OnInterrupt); OnRunRevise fires once per attempt transition when WithMaxRevise is in effect; OnRunEnd fires exactly once after the final engine.Execute returns, regardless of outcome.
Embed BaseObserver to satisfy the interface with no-op defaults when only a subset of the methods are interesting.
type Request ¶
type Request struct {
// TaskID identifies a long-lived task the request is part of.
// Empty when the caller is not tracking tasks. Maps to A2A's
// "taskId".
TaskID string `json:"taskId,omitempty"`
// ContextID identifies the conversation / session. BoardSeeders
// and Observers can use it as the conversation key when loading
// or appending a transcript. Empty means "no persistent
// transcript for this turn". Maps to A2A's "contextId".
ContextID string `json:"contextId,omitempty"`
// RunID is the host-supplied execution id. When empty Run mints
// one. The same value is propagated as engine.Run.ID and as the
// run id attribute on emitted events. Not part of the A2A wire
// schema — it is an internal correlation key, kept camelCase for
// stylistic consistency.
RunID string `json:"runId,omitempty"`
// Message is the user's turn input (text, parts, attachments).
Message model.Message `json:"message"`
// Inputs are arbitrary structured inputs the engine reads off
// the Board (form fields, parameters, …). They are written under
// their map keys as Board vars before the engine starts.
Inputs map[string]any `json:"inputs,omitempty"`
// Config carries per-request preferences (output modes, …).
// Maps to A2A's "configuration".
Config *RequestConfig `json:"configuration,omitempty"`
// Extensions is host-passed-through metadata.
//
// Deprecated: agent has never interpreted this field — the
// previous godoc only documented "engines may read it from
// Run.Attributes if the host chose to forward it", and no
// such forwarding was ever implemented (contract-audit #8).
// The map[string]any → map[string]string type mismatch with
// engine.Run.Attributes also forces an ad-hoc serialisation
// strategy on every host that wants to forward.
//
// Use [WithAttributes] instead: it is a typed map[string]string
// the caller controls, the wire format is uniform with the
// canonical attribute bag (sdk/telemetry.Attr* dot-keys), and
// engines find it under engine.Run.Attributes via the same
// codepath as agent_id / run_id / task_id / context_id.
//
// This field is scheduled for removal in sdk v0.5.0. Migrate
// callers by serialising any non-string values at the call
// site and passing them through agent.WithAttributes(...) on
// the agent.Run option list. Engines that need to introspect
// caller-supplied metadata read engine.Run.Attributes today.
Extensions map[string]any `json:"extensions,omitempty"`
}
Request is one agent turn submitted to Run.
Field names and JSON tags mirror the A2A protocol's MessageSendParams schema (camelCase: taskId, contextId, …) so requests can be serialised across the protocol without translation. The notable absence vs sdk/workflow.Request is that Request does not carry a RuntimeID (Run is now a stateless function) and does not carry a Strategy hint (the engine is supplied directly to Run).
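To illustrate the camelCase tags and the effect of omitempty, here is a small sketch that marshals a cut-down Request. The Message type is a hypothetical stand-in for model.Message, whose real shape is not shown in this section, and only a subset of the fields is copied.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical stand-in for model.Message.
type Message struct {
	Role string `json:"role"`
	Text string `json:"text"`
}

// Request copies a subset of the fields and JSON tags documented above.
type Request struct {
	TaskID    string  `json:"taskId,omitempty"`
	ContextID string  `json:"contextId,omitempty"`
	RunID     string  `json:"runId,omitempty"`
	Message   Message `json:"message"`
}

// encode marshals a request into its JSON wire form.
func encode(req Request) (string, error) {
	b, err := json.Marshal(req)
	return string(b), err
}

func main() {
	out, err := encode(Request{
		ContextID: "conv-42",
		Message:   Message{Role: "user", Text: "hi"},
	})
	if err != nil {
		panic(err)
	}
	// Empty TaskID and RunID are omitted from the output.
	fmt.Println(out)
}
```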
type RequestConfig ¶
type RequestConfig struct {
// AcceptedOutputModes constrains what modalities the caller can
// receive (e.g. ["text/plain"], ["audio/wav"], …). Engines that
// can produce multiple modalities consult it to pick one. Maps
// to A2A's "acceptedOutputModes".
AcceptedOutputModes []string `json:"acceptedOutputModes,omitempty"`
}
RequestConfig holds per-request preferences. Optional knobs are added here rather than on Request to keep Request stable across minor versions. JSON keys mirror the A2A MessageSendConfiguration schema so requests can flow across the protocol without translation.
type Result ¶
type Result struct {
// TaskID echoes the input Request.TaskID for correlation.
// Matches the A2A taskId casing.
TaskID string `json:"taskId,omitempty"`
// RunID echoes the (possibly auto-generated) execution id Run
// used to drive the engine.
RunID string `json:"runId,omitempty"`
// Status classifies the outcome.
Status Status `json:"status"`
// Cause is set when Status == StatusInterrupted: it carries the
// engine.Cause the host signalled. Empty otherwise.
Cause engine.Cause `json:"cause,omitempty"`
// Messages is the slice of NEW messages produced this turn,
// excluding the input request and any history loaded before the
// turn. Suitable for streaming to a UI or appending to a
// persistent transcript (Run does not append it itself; that is
// an Observer's job).
Messages []model.Message `json:"messages,omitempty"`
// Artifacts collects named, multi-modal bundles the engine
// emitted via dedicated board channels.
Artifacts []Artifact `json:"artifacts,omitempty"`
// Committed reports whether agent considered this turn's output
// suitable for downstream commit (transcript append, archival,
// …). It is determined by the Round B Decider chain
// (BeforeFinalize) on top of agent's defaults:
//
// - StatusCompleted defaults to Committed=true.
// - All non-completed statuses default to Committed=false.
// - Any Decider returning DiscardOutput=true forces
// Committed=false.
//
// Observers that persist transcript / artifact data are
// expected to short-circuit when Committed is false:
//
// if !res.Committed { return }
//
// Independent of Committed, Result.Messages always reflects the
// engine's actual output; Committed is the *policy* signal, not
// a content flag.
Committed bool `json:"committed"`
// State is a free-form bag carrying run-specific metadata. agent
// puts a few well-known keys (run_id, board, interrupted_node,
// …) here but does not enforce a schema beyond that.
State map[string]any `json:"state,omitempty"`
// Err is the engine's underlying error when Status indicates a
// non-completed outcome. Callers that want classification call
// errdefs.IsXxx on it; the JSON tag is "-" because errors do not
// JSON-marshal usefully.
Err error `json:"-"`
// LastBoard is the engine's final Board (possibly partial when
// Status != StatusCompleted). agent does not persist it; the
// host can choose to checkpoint via engine's Checkpointer.
LastBoard *engine.Board `json:"-"`
// Attempts is the number of engine.Execute invocations Run made
// before settling on this Result. 1 for fresh (single-shot)
// runs; >1 only when WithMaxRevise was enabled and at least
// one Decider returned FinalizeDecision{Revise: true}.
//
// Attempts is the post-loop count, not "remaining budget":
// Attempts == 2 means the engine was invoked twice. Observers
// reading res.Attempts in OnRunEnd see the final value.
//
// Zero is reserved for "Run never reached engine.Execute"
// (infrastructure error). Real runs always have Attempts >= 1.
Attempts int `json:"attempts,omitempty"`
}
Result is what Run returns after one turn. The contract:
Run() returns (res, nil) for ALL business outcomes — completion, interrupt, cancel, abort, failure. Caller inspects Status to branch.
Run() returns (nil, err) ONLY for infrastructure failures the caller cannot reasonably recover from (e.g. nil engine, empty Agent.ID, a BoardSeeder or Decider that returned an error).
This mirrors sdk/workflow.Result's "W-5" rule and avoids the double-encoding pattern where errors are also carried by Status.
func Run ¶
func Run(
ctx context.Context,
ag Agent,
eng engine.Engine,
req Request,
opts ...RunOption,
) (*Result, error)
Run executes one turn of ag against eng with the given req.
Run is intentionally minimalist: it owns identifier minting, board assembly, observer dispatch, and result classification — and nothing else. Anything that looks like "policy" (load conversation history, run RAG retrieval, write transcripts after a turn, emit metrics, route engine envelopes to a bus, accumulate token usage, …) lives outside Run on:
- Observer / Decider for lifecycle hooks;
- BoardSeeder for engine-input shaping;
- the caller-supplied engine.Host (see WithEngineHost) for every host-side capability the engine needs (event publishing, interrupt injection, user prompting, checkpoint persistence, usage reporting). When omitted, agent falls back to engine.NoopHost — which is fine for trivial / test runs but gives up every observability and HITL capability.
Wiring sequence ¶
- Mint a RunID (req.RunID wins, else autogenerate).
- Build an engine.Board using either a caller-supplied BoardSeeder (WithBoardSeed) or the default seeder, which simply appends req.Message to MainChannel and copies req.Inputs to board vars.
- Resolve the engine.Host — caller-supplied via WithEngineHost, else engine.NoopHost.
- Build a RunInfo and notify all registered Observers via OnRunStart.
- Call eng.Execute. The engine mutates the board in place per its contract.
- If the engine returned an interrupt, fire OnInterrupt with the destructured cause/detail.
- Translate the engine outcome into Status and assemble Result.
- Run the Decider chain ([Decider.BeforeFinalize]) and merge the decisions; this fixes [Result.Committed] and any finalize_reason metadata.
- Fire OnRunEnd before returning. Observers that persist data (transcript appenders, artifact archivers) MUST inspect Result.Committed and short-circuit when it is false.
Error contract ¶
Run returns (res, nil) for every business outcome — completion, interrupt, cancel, abort, failure. (nil, err) is reserved for infrastructure failures the caller cannot reasonably recover from: nil engine, empty Agent.ID, a BoardSeeder that returned an error, or a Decider that returned an error.
Observers MUST NOT cause Run to return an error; they are advisory. Deciders may return errors that surface back to the caller — agent does not swap the error class so callers can classify with errdefs.
type RunInfo ¶
type RunInfo struct {
// AgentID is the running [Agent.ID].
AgentID string
// RunID is the execution id assigned by Run (req.RunID when
// supplied, else the auto-generated one).
RunID string
// TaskID echoes [Request.TaskID]. Empty when the caller did not
// scope this turn to a long-running task.
TaskID string
// ContextID echoes [Request.ContextID]. Empty when the turn is
// not part of a persistent conversation.
ContextID string
}
RunInfo is the immutable identification bundle threaded through observer callbacks. It is small on purpose: anything beyond identification (board contents, request payload, result) is passed as a separate, typed argument so observers cannot accidentally hold onto a snapshot they aren't supposed to.
func RunInfoFromAttributes ¶ added in v0.3.4
RunInfoFromAttributes reconstructs a RunInfo from the engine.Run.Attributes map Run populates on every attempt. runID is taken from the caller-supplied argument (typically engine.Run.ID, the canonical source) rather than the attribute copy, since some downstream contexts (e.g. graph.ExecutionContext) expose RunID as a dedicated field separate from the attributes bag.
Missing keys yield empty strings — RunInfoFromAttributes never errors. This matches Run's "promote when non-empty" write policy: a missing key just means the upstream Request did not carry that identifier, not that something went wrong.
Typical caller (a graph node bridging RunInfo into a script runtime):
info := agent.RunInfoFromAttributes(ec.RunID, ec.Attributes)
bindings.NewRunInfoBridge(info)
This closes contract-audit #12: nodes used to construct RunInfo{RunID: ec.RunID} verbatim, leaving AgentID / TaskID / ContextID empty even though Run had written them upstream into engine.Run.Attributes.
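The "missing keys yield empty strings" behaviour follows naturally from Go map lookups, as this sketch shows. It is an illustration, not the library's implementation: the attribute key names are assumptions (the real keys are defined under sdk/telemetry and may be dot-separated).

```go
package main

import "fmt"

// RunInfo mirrors the four identification fields documented above.
type RunInfo struct{ AgentID, RunID, TaskID, ContextID string }

// Assumed attribute keys; the real constants may differ.
const (
	keyAgentID   = "agent_id"
	keyTaskID    = "task_id"
	keyContextID = "context_id"
)

// runInfoFromAttributes takes RunID from the explicit argument and the
// rest from attrs. A lookup on a missing key (or a nil map) returns "",
// so the function cannot fail.
func runInfoFromAttributes(runID string, attrs map[string]string) RunInfo {
	return RunInfo{
		AgentID:   attrs[keyAgentID],
		RunID:     runID,
		TaskID:    attrs[keyTaskID],
		ContextID: attrs[keyContextID],
	}
}

func main() {
	info := runInfoFromAttributes("run-7", map[string]string{
		keyAgentID: "support-bot",
		"run_id":   "stale-copy", // ignored: the argument is canonical
	})
	fmt.Printf("%+v\n", info)
}
```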
type RunOption ¶
type RunOption func(*runConfig)
RunOption configures one Run invocation. Options are stateless and may be reused across calls.
func WithArtifactChannels ¶ added in v0.3.4
WithArtifactChannels names the engine.Board channels Run should harvest into [Result.Artifacts] on the way out. One Artifact per listed channel: Artifact.Name = channel name; Artifact.Parts = flat concatenation of every Part across every Message in the channel (in board-write order). Channels that hold no messages after the run produce no Artifact entry — empty channels do not pollute the result with empty bundles.
This is the writer side of the Result.Artifacts contract that contract-audit #6 flagged: the godoc had been promising "engines store them in a board channel; agent collects channel contents into Artifacts on the way out" since v0.1, but no agent.Run code path actually performed the collection so the field was permanently nil for every caller.
MUST NOT include engine.MainChannel — Run already promotes that channel into [Result.Messages] and a duplicate harvest would surface the same payload twice with confusing semantics (Messages keeps role + tool metadata; Artifacts is the modality-bundle view). Run silently skips MainChannel if the caller mistakenly passes it.
Example: an engine that writes a "summary" markdown blob and an "audio" TTS clip to dedicated channels:
res, _ := agent.Run(ctx, ag, eng, req,
agent.WithArtifactChannels("summary", "audio"))
for _, a := range res.Artifacts {
switch a.Name { ... }
}
Multiple WithArtifactChannels calls accumulate (deduped at collection time) so per-agent and per-call lists compose. nil / empty input is a no-op.
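The harvesting rules described above (one Artifact per channel, parts flattened in write order, empty channels and MainChannel skipped, duplicates collapsed) can be sketched like this. The board is modelled as a plain map and the channel name "main" is an assumption; the real engine.Board API is not shown in this section.

```go
package main

import "fmt"

// Simplified stand-ins for the real message and artifact types.
type Part struct{ Text string }
type Message struct{ Parts []Part }
type Artifact struct {
	Name  string
	Parts []Part
}

// mainChannel is an assumed value for engine.MainChannel.
const mainChannel = "main"

// harvest builds one Artifact per requested channel.
func harvest(board map[string][]Message, channels []string) []Artifact {
	seen := make(map[string]bool)
	var out []Artifact
	for _, ch := range channels {
		if ch == mainChannel || seen[ch] {
			continue // main channel is already in Messages; dedupe repeats
		}
		seen[ch] = true
		msgs := board[ch]
		if len(msgs) == 0 {
			continue // empty channels produce no Artifact
		}
		var parts []Part
		for _, m := range msgs {
			parts = append(parts, m.Parts...) // flatten in write order
		}
		out = append(out, Artifact{Name: ch, Parts: parts})
	}
	return out
}

func main() {
	board := map[string][]Message{
		"main":    {{Parts: []Part{{Text: "reply"}}}},
		"summary": {{Parts: []Part{{Text: "# Title"}}}, {Parts: []Part{{Text: "body"}}}},
		"audio":   {},
	}
	for _, a := range harvest(board, []string{"summary", "main", "audio", "summary"}) {
		fmt.Println(a.Name, len(a.Parts))
	}
}
```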
func WithAttributes ¶
WithAttributes adds extra attributes that flow into engine.Run.Attributes alongside the well-known agent_id / run_id / task_id / context_id keys. Caller-supplied keys win on conflict; agent does not overwrite.
This is also the canonical replacement for the deprecated [Request.Extensions] (contract-audit #8): engines that need caller-supplied metadata read engine.Run.Attributes via the same codepath as the well-known keys, with no map[string]any → map[string]string serialisation guesswork. Hosts that previously wrote into req.Extensions should serialise the values at the call site and pass the resulting map[string]string here.
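One way to do the call-site serialisation when migrating from Extensions is sketched below. The toAttributes helper is hypothetical and the JSON encoding of non-string values is just one possible convention; the resulting map is what a host would pass to WithAttributes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toAttributes converts a map[string]any (the shape of the deprecated
// Extensions field) into a map[string]string. Strings pass through
// unchanged; every other value is JSON-encoded.
func toAttributes(ext map[string]any) (map[string]string, error) {
	out := make(map[string]string, len(ext))
	for k, v := range ext {
		if s, ok := v.(string); ok {
			out[k] = s
			continue
		}
		b, err := json.Marshal(v)
		if err != nil {
			return nil, fmt.Errorf("attribute %q: %w", k, err)
		}
		out[k] = string(b)
	}
	return out, nil
}

func main() {
	attrs, err := toAttributes(map[string]any{
		"tenant":  "acme",
		"retries": 3,
		"beta":    true,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(attrs["tenant"], attrs["retries"], attrs["beta"])
}
```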
func WithBoardSeed ¶
func WithBoardSeed(s BoardSeeder) RunOption
WithBoardSeed installs a custom BoardSeeder for this run. Use it to inject conversation history, RAG-retrieved context, system prompts, or any other board state the engine needs at start.
When omitted, agent uses [defaultSeeder] which appends req.Message to MainChannel and copies req.Inputs into board vars.
func WithDecider ¶
WithDecider registers a Decider for this run. Multiple deciders can be registered; they fire in registration order, after any [Agent.Deciders] declared on the agent value. Their decisions are merged via OR over boolean fields; the first non-empty Reason wins.
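The merge rule (OR over boolean fields, first non-empty Reason wins) can be sketched as below. FinalizeDecision here is a stand-in containing only the fields this page mentions (DiscardOutput, Revise, Reason); the real struct may carry more.

```go
package main

import "fmt"

// FinalizeDecision is a stand-in limited to the fields named on this page.
type FinalizeDecision struct {
	DiscardOutput bool
	Revise        bool
	Reason        string
}

// mergeDecisions folds decisions in registration order.
func mergeDecisions(ds ...FinalizeDecision) FinalizeDecision {
	var out FinalizeDecision
	for _, d := range ds {
		out.DiscardOutput = out.DiscardOutput || d.DiscardOutput
		out.Revise = out.Revise || d.Revise
		if out.Reason == "" {
			out.Reason = d.Reason // first non-empty Reason wins
		}
	}
	return out
}

func main() {
	merged := mergeDecisions(
		FinalizeDecision{},
		FinalizeDecision{DiscardOutput: true, Reason: "moderation hit"},
		FinalizeDecision{Revise: true, Reason: "too short"},
	)
	fmt.Printf("%+v\n", merged)
}
```

Under this rule a single Decider can force DiscardOutput or Revise, but no Decider can clear a flag another one set.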
func WithDependencies ¶
func WithDependencies(d *engine.Dependencies) RunOption
WithDependencies passes a dependency container to the engine via engine.Run.Deps. Engines look up named clients (LLM, retriever, tool registry, …) in there.
func WithEngineHost ¶
WithEngineHost installs the engine.Host passed to the engine.
Host is the single extension point for every host-side capability the engine needs: event publishing (Publisher), interrupt injection (Interrupter), user prompting (UserPrompter), checkpoint persistence (Checkpointer), and token-usage reporting (UsageReporter). Composing your own host is how you wire any of those — agent does not provide narrow shortcuts because most non-trivial deployments share state across capabilities (a single metric client, a single OTel tracer, a single request-scoped logger) and a host implementation is the cleanest place to keep that state.
Embed engine.NoopHost in your host struct and override only the methods you actually need:
type myHost struct {
engine.NoopHost
bus event.Bus
intrCh <-chan engine.Interrupt
}
func (h *myHost) Publish(ctx context.Context, e event.Envelope) error {
return h.bus.Publish(ctx, e)
}
func (h *myHost) Interrupts() <-chan engine.Interrupt { return h.intrCh }
When omitted, agent falls back to engine.NoopHost, which silently drops envelopes, never fires interrupts, refuses AskUser, drops checkpoints, and discards usage. That default is appropriate for fire-and-forget batch runs and tests — anything else needs a real host.
func WithMaxRevise ¶ added in v0.3.4
WithMaxRevise sets the upper bound on engine.Execute invocations per Run call when a Decider returns FinalizeDecision{Revise: true}.
n <= 1 (default 0) disables the revise loop entirely. A Decider asking to revise records its Reason but Run still returns after the first attempt — the safe default avoids surprise infinite loops on misconfigured Deciders.
n >= 2 caps total attempts at n. The loop exits as soon as either no Decider asks for revise OR the attempt counter reaches n. The final Result.Attempts is the actual number of engine.Execute calls made.
Revise restarts re-seed the board from the original Request via the configured BoardSeeder, so the engine sees fresh inputs. engine.Run.ResumeFrom is dropped after the first attempt — Revise means "retry from scratch", not "replay a checkpoint".
Negative values are treated as 0 (disabled). Callers that want the engine to drive its own retry policy (rate-limit backoff, transient LLM errors, …) MUST keep WithMaxRevise at the default; the revise loop is the agent-policy layer, not the engine-transport one.
func WithObserver ¶
WithObserver registers an Observer for this run. Multiple observers can be registered; they fire in registration order, after any [Agent.Observers] declared on the agent value. Panics inside an observer are caught and dropped.
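"Caught and dropped" can be pictured as a per-callback recover, as in this sketch. It shows the general Go technique only; notify and its callbacks are hypothetical and say nothing about how Run wires it internally.

```go
package main

import "fmt"

// notify runs callbacks in registration order. A panic in one callback
// is recovered so the remaining callbacks still run.
func notify(callbacks []func()) (dropped int) {
	for _, cb := range callbacks {
		func() {
			defer func() {
				if r := recover(); r != nil {
					dropped++ // swallow the panic; count it for visibility
				}
			}()
			cb()
		}()
	}
	return dropped
}

func main() {
	var order []string
	dropped := notify([]func(){
		func() { order = append(order, "first") },
		func() { panic("observer bug") },
		func() { order = append(order, "third") },
	})
	fmt.Println(order, dropped)
}
```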
func WithParentRunID ¶ added in v0.3.4
WithParentRunID stamps every engine.Run this call dispatches with the supplied parent run id (engine.Run.ParentRunID). Use it when one agent.Run is spawned by another (multi-agent call chain, handoff, sub-agent dispatch) so dashboards / pod controllers can reconstruct the call tree and apply loop-detection / depth budgets against a stable correlation key.
The empty string is a no-op; passing the parent's runID (typically obtained from agent.RunInfo.RunID inside an Observer / Decider on the parent run) is the canonical use. agent.Run does NOT auto-derive ParentRunID from any ambient context — explicit is the only contract that survives ctx propagation rewrites and cross-process dispatch (vessel, A2A bridge).
Engines / hosts that don't read ParentRunID are unaffected. The field is also surfaced under telemetry.AttrParentRunID by observers that emit run-summary spans (sdk/telemetry/run_summary).
func WithResumeFrom ¶ added in v0.3.4
func WithResumeFrom(cp *engine.Checkpoint) RunOption
WithResumeFrom replays an interrupted run from a previously captured engine.Checkpoint. The agent threads cp into engine.Run.ResumeFrom and overrides the run id to cp.ExecID so the underlying engine's Resumer.CanResume sees ExecID == Run.ID (cross-run checkpoints are programmer errors and surface as errdefs.Validation from the engine).
Typical use: a host loaded a checkpoint via its CheckpointStore, possibly after a process restart, and wants the agent to keep going from that point rather than start fresh. The host still passes the ORIGINAL agent.Request (same task id, same inputs); the engine restores board state from the checkpoint so the re-seeded inputs are effectively overwritten by the resumed state. Engines without engine.Resumer surface NotAvailable (per the engine.Engine contract); resume against an unsupported engine is a configuration error, not silent fall-through.
nil cp is a no-op (= fresh start). Multiple WithResumeFrom calls last-write-wins; agent does not attempt to merge checkpoints.
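The option semantics (nil is a no-op, last write wins, run id overridden to the checkpoint's ExecID) fit the functional-options shape of RunOption. The sketch below uses stand-in types; the runConfig fields and the point at which the override happens are assumptions, since runConfig is unexported and not documented here.

```go
package main

import "fmt"

// Checkpoint is a stand-in for engine.Checkpoint with only ExecID.
type Checkpoint struct{ ExecID string }

// runConfig and its fields are hypothetical.
type runConfig struct {
	runID      string
	resumeFrom *Checkpoint
}

type RunOption func(*runConfig)

// withResumeFrom mimics the documented behaviour of WithResumeFrom.
func withResumeFrom(cp *Checkpoint) RunOption {
	return func(c *runConfig) {
		if cp == nil {
			return // no-op: fresh start, earlier settings untouched
		}
		c.resumeFrom = cp     // last write wins; no merging
		c.runID = cp.ExecID   // keep ExecID == run id for the engine
	}
}

// apply folds options over a config seeded with the request's run id.
func apply(requestRunID string, opts ...RunOption) runConfig {
	cfg := runConfig{runID: requestRunID}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	a := &Checkpoint{ExecID: "exec-a"}
	b := &Checkpoint{ExecID: "exec-b"}
	cfg := apply("req-run", withResumeFrom(a), withResumeFrom(b), withResumeFrom(nil))
	fmt.Println(cfg.runID, cfg.resumeFrom.ExecID)
}
```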
type Skill ¶
type Skill struct {
// ID is the unique identifier for this skill within the agent.
ID string `json:"id"`
// Name is a human-readable label for the skill.
Name string `json:"name"`
// Description explains what the skill does.
Description string `json:"description,omitempty"`
// Tags categorises the skill (e.g. "cooking", "support").
Tags []string `json:"tags,omitempty"`
// Examples lists illustrative prompts the skill can handle.
Examples []string `json:"examples,omitempty"`
// InputModes overrides AgentCard.DefaultInputModes for this
// specific skill. Empty means "use the agent default".
InputModes []string `json:"inputModes,omitempty"`
// OutputModes overrides AgentCard.DefaultOutputModes for this
// specific skill. Empty means "use the agent default".
OutputModes []string `json:"outputModes,omitempty"`
}
Skill is a single capability unit declared on an AgentCard. Field names mirror A2A's skill object so cards round-trip cleanly through /.well-known/agent-card.json.
type Status ¶
type Status string
Status is the terminal classification of a Run outcome. agent does NOT use Status as a control-flow signal — once Run returns, the caller decides what to do based on Status. The values mirror the A2A task-status enum so they can be serialised across protocol boundaries without translation.
const (
// StatusCompleted means the engine finished cleanly and produced
// the messages / artifacts in [Result].
StatusCompleted Status = "completed"
// StatusInterrupted means the engine was stopped by a cooperative
// interrupt (host-injected). Result.Cause carries the reason.
// By default the partial output is NOT committed (Result.Committed
// is false); register a [Decider] (or rely on the default
// disposition) to override.
StatusInterrupted Status = "interrupted"
// StatusCanceled means ctx was cancelled before the engine
// finished.
StatusCanceled Status = "canceled"
// StatusFailed means the engine returned a domain error not
// classified as interrupted / aborted.
StatusFailed Status = "failed"
// StatusAborted means the engine reported errdefs.IsAborted,
// an unrecoverable internal halt. Distinguished from
// StatusFailed so callers can apply different retry policy.
StatusAborted Status = "aborted"
)
Directories ¶
| Path | Synopsis |
|---|---|
| agenttest | Package agenttest provides reusable contract-test machinery for the interfaces declared in sdk/agent (agent.Decider and agent.Observer today, more if the agent package grows). |