Documentation ¶
Overview ¶
Package workflow provides workflow scenario support for Arena.
Workflow scenarios test multi-state flows where an agent transitions between prompt_tasks. Transitions are LLM-initiated: the LLM calls the workflow__transition tool with an event and context, and the driver processes the transition internally.
Example YAML:
id: support-escalation
pack: ./support.pack.json
description: "Test escalation from intake to specialist"
steps:
  - type: input
    content: "I need help with billing"
    assertions:
      - type: content_includes
        params: { substring: "billing" }
  - type: input
    content: "My invoice is wrong"
    assertions:
      - type: transitioned_to
        params: { state: specialist }
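To make the shape of the example concrete, here is a minimal sketch that decodes the same scenario from JSON using only the standard library. The `scenario`, `step`, and `assertion` types are local stand-ins inferred from the YAML keys above, not the package's real types, and `parseScenario` is a hypothetical helper. JSON is used only because it needs no third-party decoder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// assertion mirrors the "type" + "params" shape seen in the YAML example.
// It is a local stand-in, not the real asrt.AssertionConfig.
type assertion struct {
	Type   string         `json:"type"`
	Params map[string]any `json:"params,omitempty"`
}

// step and scenario are reduced local mirrors of the documented fields.
type step struct {
	Type       string      `json:"type"`
	Content    string      `json:"content,omitempty"`
	Assertions []assertion `json:"assertions,omitempty"`
}

type scenario struct {
	ID          string `json:"id"`
	Pack        string `json:"pack"`
	Description string `json:"description,omitempty"`
	Steps       []step `json:"steps"`
}

// exampleJSON is the YAML example above, rewritten as JSON.
const exampleJSON = `{
  "id": "support-escalation",
  "pack": "./support.pack.json",
  "description": "Test escalation from intake to specialist",
  "steps": [
    {"type": "input", "content": "I need help with billing",
     "assertions": [{"type": "content_includes", "params": {"substring": "billing"}}]},
    {"type": "input", "content": "My invoice is wrong",
     "assertions": [{"type": "transitioned_to", "params": {"state": "specialist"}}]}
  ]
}`

// parseScenario is a hypothetical helper that decodes a scenario document.
func parseScenario(data []byte) (scenario, error) {
	var s scenario
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	s, err := parseScenario([]byte(exampleJSON))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for i, st := range s.Steps {
		fmt.Printf("step %d: %s %q (%d assertions)\n", i, st.Type, st.Content, len(st.Assertions))
	}
}
```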
Constants ¶
This section is empty.
Variables ¶
var (
    ErrMissingID   = errors.New("workflow scenario: missing id")
    ErrMissingPack = errors.New("workflow scenario: missing pack path")
    ErrNoSteps     = errors.New("workflow scenario: no steps defined")
)
Sentinel errors for scenario validation.
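Sentinel errors are typically matched with `errors.Is`, which keeps working after a caller wraps the error with `%w`. The sketch below illustrates that pattern. The error values are local copies of the ones listed above, while the `scenario` type and the `validate` function are hypothetical and only assume that validation checks the ID, the pack path, and the steps in that order.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented sentinel errors so the sketch compiles alone.
var (
	ErrMissingID   = errors.New("workflow scenario: missing id")
	ErrMissingPack = errors.New("workflow scenario: missing pack path")
	ErrNoSteps     = errors.New("workflow scenario: no steps defined")
)

// scenario is a reduced, hypothetical stand-in for the documented Scenario.
type scenario struct {
	ID        string
	Pack      string
	StepCount int
}

// validate is a hypothetical check; the real package may validate differently.
func validate(s scenario) error {
	switch {
	case s.ID == "":
		return ErrMissingID
	case s.Pack == "":
		return ErrMissingPack
	case s.StepCount == 0:
		return ErrNoSteps
	}
	return nil
}

func main() {
	err := validate(scenario{ID: "support-escalation"})
	// Wrapping with %w adds context while keeping the sentinel matchable.
	wrapped := fmt.Errorf("load scenario: %w", err)
	fmt.Println(errors.Is(wrapped, ErrMissingPack)) // prints true
}
```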
Functions ¶
This section is empty.
Types ¶
type Driver ¶
type Driver interface {
    // Send sends a user message and returns the assistant response text.
    // If the LLM calls the workflow__transition tool, the driver processes
    // the transition internally and the result is available via Transitions().
    Send(ctx context.Context, message string) (string, error)

    // Transitions returns the transitions that occurred during the last Send() call.
    // The LLM initiates transitions by calling the workflow__transition tool.
    Transitions() []TransitionRecord

    // CurrentState returns the current workflow state name.
    CurrentState() string

    // IsComplete returns true if the workflow reached a terminal state.
    IsComplete() bool

    // Close releases resources.
    Close() error
}
Driver is the interface that a workflow engine must implement. The SDK's WorkflowConversation satisfies this through an adapter.
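For tests that should not call a real LLM, a scripted test double can satisfy the interface. The following is a minimal sketch, not the SDK adapter: `scriptedDriver`, `newScriptedDriver`, `sendAll`, the trigger word, and the `escalate` event name are all hypothetical. The `Driver` and `TransitionRecord` declarations are local copies of the documented ones.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Local copies of the documented types so the sketch is self-contained.
type TransitionRecord struct {
	From, To, Event, Context string
}

type Driver interface {
	Send(ctx context.Context, message string) (string, error)
	Transitions() []TransitionRecord
	CurrentState() string
	IsComplete() bool
	Close() error
}

// scriptedDriver is a hypothetical test double: it moves from "intake" to
// "specialist" when a message mentions "invoice", standing in for an LLM
// that would call the workflow__transition tool.
type scriptedDriver struct {
	state string
	last  []TransitionRecord
}

func newScriptedDriver() *scriptedDriver { return &scriptedDriver{state: "intake"} }

func (d *scriptedDriver) Send(ctx context.Context, message string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	d.last = nil // Transitions() reports only the most recent Send call.
	if d.state == "intake" && strings.Contains(strings.ToLower(message), "invoice") {
		d.last = append(d.last, TransitionRecord{From: "intake", To: "specialist", Event: "escalate"})
		d.state = "specialist"
	}
	return "ack: " + message, nil
}

func (d *scriptedDriver) Transitions() []TransitionRecord { return d.last }
func (d *scriptedDriver) CurrentState() string            { return d.state }
func (d *scriptedDriver) IsComplete() bool                { return false }
func (d *scriptedDriver) Close() error                    { return nil }

// Compile-time check that the double satisfies the interface.
var _ Driver = (*scriptedDriver)(nil)

// sendAll is a hypothetical helper that sends each message in order.
func sendAll(d Driver, msgs ...string) error {
	for _, m := range msgs {
		if _, err := d.Send(context.Background(), m); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	d := newScriptedDriver()
	defer d.Close()
	for _, msg := range []string{"I need help with billing", "My invoice is wrong"} {
		if err := sendAll(d, msg); err != nil {
			fmt.Println("send failed:", err)
			return
		}
		fmt.Println(d.CurrentState(), len(d.Transitions()))
	}
}
```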
type DriverFactory ¶
type DriverFactory func(packPath string, variables map[string]string, carryForward bool) (Driver, error)
DriverFactory creates a Driver for a given scenario. The factory receives the pack path, variables, and whether to enable context carry-forward.
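As an illustration of the factory signature, the sketch below builds a `DriverFactory` that returns a stub driver. Everything except the `Driver`, `DriverFactory`, and `TransitionRecord` shapes is hypothetical: `stubDriver`, `newStubFactory`, the `"start"` state name, and the empty-path check are assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Local copies of the documented types so the sketch is self-contained.
type TransitionRecord struct {
	From, To, Event, Context string
}

type Driver interface {
	Send(ctx context.Context, message string) (string, error)
	Transitions() []TransitionRecord
	CurrentState() string
	IsComplete() bool
	Close() error
}

type DriverFactory func(packPath string, variables map[string]string, carryForward bool) (Driver, error)

// stubDriver is a hypothetical driver that only remembers its configuration.
type stubDriver struct {
	packPath     string
	variables    map[string]string
	carryForward bool
}

func (d *stubDriver) Send(ctx context.Context, message string) (string, error) {
	return fmt.Sprintf("pack=%s carry=%t vars=%d", d.packPath, d.carryForward, len(d.variables)), nil
}
func (d *stubDriver) Transitions() []TransitionRecord { return nil }
func (d *stubDriver) CurrentState() string            { return "start" }
func (d *stubDriver) IsComplete() bool                { return false }
func (d *stubDriver) Close() error                    { return nil }

// newStubFactory returns a hypothetical factory for use in tests.
func newStubFactory() DriverFactory {
	return func(packPath string, variables map[string]string, carryForward bool) (Driver, error) {
		if packPath == "" {
			return nil, errors.New("stub factory: empty pack path")
		}
		// Copy the map so later caller mutations do not leak into the driver.
		vars := make(map[string]string, len(variables))
		for k, v := range variables {
			vars[k] = v
		}
		return &stubDriver{packPath: packPath, variables: vars, carryForward: carryForward}, nil
	}
}

func main() {
	factory := newStubFactory()
	d, err := factory("./support.pack.json", map[string]string{"tier": "gold"}, true)
	if err != nil {
		fmt.Println("factory error:", err)
		return
	}
	defer d.Close()
	reply, _ := d.Send(context.Background(), "hello")
	fmt.Println(reply)
}
```

A factory of this shape could then be passed to `NewExecutor`, since that constructor accepts any `DriverFactory`.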
type Executor ¶
type Executor struct {
    // contains filtered or unexported fields
}
Executor runs workflow scenarios against a Driver.
func NewExecutor ¶
func NewExecutor(factory DriverFactory) *Executor
NewExecutor creates a workflow scenario executor with the given driver factory.
func (*Executor) WithTurnEvalRunner ¶ added in v1.3.2
func (e *Executor) WithTurnEvalRunner(runner TurnEvalRunner, sessionID string) *Executor
WithTurnEvalRunner sets the optional eval runner for dual-write support during step assertions.
type Result ¶
type Result struct {
    // ScenarioID identifies which scenario was run.
    ScenarioID string `json:"scenario_id"`
    // Steps contains per-step results.
    Steps []StepResult `json:"steps"`
    // FinalState is the workflow state after all steps.
    FinalState string `json:"final_state"`
    // Duration is the total scenario execution time.
    Duration time.Duration `json:"duration"`
    // Failed is true if any step errored.
    Failed bool `json:"failed"`
    // Error is a summary error message (first failure).
    Error string `json:"error,omitempty"`
}
Result is the complete outcome of executing a workflow scenario.
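When a Result is serialized with `encoding/json`, `time.Duration` fields are written as integer nanoseconds and the `omitempty` fields disappear when empty. The sketch below shows this with reduced local mirrors of `Result` and `StepResult`. The assertion results field is left out because its type lives in another package, and `sample` and `encode` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// StepResult is a reduced local mirror of the documented type.
type StepResult struct {
	Index    int           `json:"index"`
	Type     string        `json:"type"`
	Response string        `json:"response,omitempty"`
	State    string        `json:"state"`
	Duration time.Duration `json:"duration"`
	Error    string        `json:"error,omitempty"`
}

// Result is a local mirror of the documented type.
type Result struct {
	ScenarioID string        `json:"scenario_id"`
	Steps      []StepResult  `json:"steps"`
	FinalState string        `json:"final_state"`
	Duration   time.Duration `json:"duration"`
	Failed     bool          `json:"failed"`
	Error      string        `json:"error,omitempty"`
}

// sample builds a hypothetical successful result with one step.
func sample() Result {
	return Result{
		ScenarioID: "support-escalation",
		Steps: []StepResult{
			{Index: 0, Type: "input", State: "specialist", Duration: 250 * time.Millisecond},
		},
		FinalState: "specialist",
		Duration:   1500 * time.Millisecond,
	}
}

// encode marshals a result to a JSON string.
func encode(r Result) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	out, err := encode(sample())
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	// Durations appear as nanosecond integers, e.g. 1.5s becomes 1500000000.
	fmt.Println(out)
}
```

Consumers that read this JSON outside Go need to divide by 1e9 to recover seconds.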
type Scenario ¶
type Scenario struct {
    // ID uniquely identifies this workflow scenario.
    ID string `json:"id" yaml:"id"`
    // Pack is the path to the prompt pack file.
    Pack string `json:"pack" yaml:"pack"`
    // Description is a human-readable summary of what this scenario tests.
    Description string `json:"description,omitempty" yaml:"description,omitempty"`
    // Steps is the ordered sequence of actions to execute.
    Steps []Step `json:"steps" yaml:"steps"`
    // Providers optionally restricts which providers to run this scenario against.
    Providers []string `json:"providers,omitempty" yaml:"providers,omitempty"`
    // Variables are injected into the pack's template variables.
    Variables map[string]string `json:"variables,omitempty" yaml:"variables,omitempty"`
    // ContextCarryForward enables conversation context hand-off between states.
    ContextCarryForward bool `json:"context_carry_forward,omitempty" yaml:"context_carry_forward,omitempty"`
}
Scenario defines a workflow test scenario that drives a WorkflowConversation through multiple states.
type Step ¶
type Step struct {
    // Type is "input" or "event".
    Type StepType `json:"type" yaml:"type"`
    // Content is the user message text (only for input steps).
    Content string `json:"content,omitempty" yaml:"content,omitempty"`
    // Event is the transition event name (only for event steps).
    Event string `json:"event,omitempty" yaml:"event,omitempty"`
    // ExpectState is the expected state after an event transition.
    // Validated only for event steps.
    ExpectState string `json:"expect_state,omitempty" yaml:"expect_state,omitempty"`
    // Assertions are evaluated against the assistant response (input steps only).
    Assertions []asrt.AssertionConfig `json:"assertions,omitempty" yaml:"assertions,omitempty"`
}
Step is a single action in a workflow scenario.
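The field comments imply that some fields only make sense for one step type. A minimal sketch of a check based on that reading follows. The constant names `stepInput` and `stepEvent`, the reduced `Step` mirror, and the `checkStep` rules are assumptions drawn from the comments above, not the package's actual validation.

```go
package main

import (
	"errors"
	"fmt"
)

// StepType and its values follow the documented "input" / "event" strings.
// The constant names are hypothetical.
type StepType string

const (
	stepInput StepType = "input"
	stepEvent StepType = "event"
)

// Step is a reduced local mirror without the external assertions field.
type Step struct {
	Type        StepType
	Content     string
	Event       string
	ExpectState string
}

// checkStep is a hypothetical consistency check derived from the field comments.
func checkStep(s Step) error {
	switch s.Type {
	case stepInput:
		if s.Content == "" {
			return errors.New("input step: content is required")
		}
		if s.Event != "" || s.ExpectState != "" {
			return errors.New("input step: event and expect_state apply to event steps only")
		}
	case stepEvent:
		if s.Event == "" {
			return errors.New("event step: event is required")
		}
		if s.Content != "" {
			return errors.New("event step: content applies to input steps only")
		}
	default:
		return fmt.Errorf("unknown step type %q", s.Type)
	}
	return nil
}

func main() {
	steps := []Step{
		{Type: stepInput, Content: "I need help with billing"},
		{Type: stepInput, Event: "escalate"},
		{Type: "pause"},
	}
	for i, s := range steps {
		fmt.Printf("step %d: %v\n", i, checkStep(s))
	}
}
```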
type StepResult ¶
type StepResult struct {
    // Index is the step position (0-based).
    Index int `json:"index"`
    // Type is "input".
    Type StepType `json:"type"`
    // Response is the assistant text (input steps only).
    Response string `json:"response,omitempty"`
    // State is the workflow state after this step.
    State string `json:"state"`
    // Duration is how long the step took.
    Duration time.Duration `json:"duration"`
    // AssertionResults are the per-step assertion outcomes.
    AssertionResults []asrt.ConversationValidationResult `json:"assertion_results,omitempty"`
    // Error is non-empty if the step failed.
    Error string `json:"error,omitempty"`
}
StepResult captures the outcome of a single step execution.
type TransitionRecord ¶
type TransitionRecord struct {
    From    string `json:"from"`
    To      string `json:"to"`
    Event   string `json:"event"`
    Context string `json:"context,omitempty"`
}
TransitionRecord captures a single workflow state transition.
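A check in the spirit of the `transitioned_to` assertion from the overview can be expressed as a scan over the records returned by `Transitions()`. The helper below is a hypothetical sketch of that idea, not the package's assertion implementation. `TransitionRecord` is a local copy of the documented type, and the `escalate` event name is made up for the example.

```go
package main

import "fmt"

// TransitionRecord is a local copy of the documented type.
type TransitionRecord struct {
	From    string `json:"from"`
	To      string `json:"to"`
	Event   string `json:"event"`
	Context string `json:"context,omitempty"`
}

// transitionedTo is a hypothetical helper that reports whether any record
// ends in the given state.
func transitionedTo(records []TransitionRecord, state string) bool {
	for _, r := range records {
		if r.To == state {
			return true
		}
	}
	return false
}

func main() {
	records := []TransitionRecord{
		{From: "intake", To: "specialist", Event: "escalate", Context: "invoice dispute"},
	}
	fmt.Println(transitionedTo(records, "specialist")) // prints true
	fmt.Println(transitionedTo(records, "closed"))     // prints false
}
```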
type TurnEvalRunner ¶ added in v1.3.2
type TurnEvalRunner interface {
    RunAssertionsAsEvals(
        ctx context.Context,
        assertionConfigs []asrt.AssertionConfig,
        messages []types.Message,
        turnIndex int,
        sessionID string,
        trigger evals.EvalTrigger,
    ) []evals.EvalResult
}
TurnEvalRunner is an optional interface for running assertions as evals during workflow steps. PackEvalHook in the engine package implements this interface.