Overview ¶
Package stages provides arena-specific pipeline stages for test execution.
Index ¶
- type ArenaAssertionStage
- type ArenaStateStoreSaveStage
- type CompletionInstructionStage
- type GuardrailEvalStage
- type HistoryInjectionStage
- type MetadataInjectionStage
- type MockScenarioContextStage
- type PersonaAssemblyStage
- type ScenarioContextExtractionStage
- type SelfPlayUserTurnContextStage
- type SkillInstructionStage
- type StripToolMessagesStage
- type TurnEvalRunner
- type TurnIndexStage
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type ArenaAssertionStage ¶
ArenaAssertionStage validates assertions after LLM responses.
func NewArenaAssertionStage ¶
func NewArenaAssertionStage(assertionConfigs []assertions.AssertionConfig) *ArenaAssertionStage
NewArenaAssertionStage creates a new assertion stage.
func (*ArenaAssertionStage) Process ¶
func (s *ArenaAssertionStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process validates assertions on the stream elements.
func (*ArenaAssertionStage) WithTurnEvalRunner ¶ added in v1.3.2
func (s *ArenaAssertionStage) WithTurnEvalRunner(runner TurnEvalRunner, sessionID string) *ArenaAssertionStage
WithTurnEvalRunner sets the eval runner for assertion execution.
type ArenaStateStoreSaveStage ¶
ArenaStateStoreSaveStage saves conversation state with telemetry to ArenaStateStore. This stage captures validation results, turn metrics, and cost information for Arena testing and analysis.
func NewArenaStateStoreSaveStage ¶
func NewArenaStateStoreSaveStage(config *pipeline.StateStoreConfig) *ArenaStateStoreSaveStage
NewArenaStateStoreSaveStage creates a new Arena state store save stage.
func (*ArenaStateStoreSaveStage) Process ¶
func (s *ArenaStateStoreSaveStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process collects messages and saves them incrementally to the Arena state store. Messages are saved after each turn completes (that is, when an element contains a Message), so conversation state is captured in real time rather than only at the end of the run.
type CompletionInstructionStage ¶ added in v1.3.16
type CompletionInstructionStage struct {
	stage.BaseStage
	// contains filtered or unexported fields
}
CompletionInstructionStage appends natural termination instructions to the system prompt assembled by PersonaAssemblyStage.
func NewCompletionInstructionStage ¶ added in v1.3.16
func NewCompletionInstructionStage(instruction string) *CompletionInstructionStage
NewCompletionInstructionStage creates a stage that appends the given instruction to the "system_prompt" metadata key.
func (*CompletionInstructionStage) Process ¶ added in v1.3.16
func (s *CompletionInstructionStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process appends the completion instruction to the system_prompt metadata.
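The append behavior described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name appendToSystemPrompt, the use of a plain map for element metadata, and the blank-line separator are all assumptions made for the example. Only the "system_prompt" key comes from the documentation.

```go
package main

import "fmt"

// appendToSystemPrompt is a hypothetical helper that appends instruction to
// the "system_prompt" entry of metadata. The caller must pass a non-nil map.
// The blank-line separator is an assumption, not documented behavior.
func appendToSystemPrompt(metadata map[string]interface{}, instruction string) {
	if instruction == "" {
		return // nothing to append
	}
	existing, _ := metadata["system_prompt"].(string)
	if existing == "" {
		metadata["system_prompt"] = instruction
		return
	}
	metadata["system_prompt"] = existing + "\n\n" + instruction
}

func main() {
	metadata := map[string]interface{}{
		"system_prompt": "You are a customer asking about a refund.",
	}
	appendToSystemPrompt(metadata, "End the conversation once your goal is met.")
	fmt.Println(metadata["system_prompt"])
}
```

Because the stage appends to a prompt that PersonaAssemblyStage has already assembled, it presumably needs to run after that stage in the pipeline.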
type GuardrailEvalStage ¶ added in v1.3.2
GuardrailEvalStage evaluates guardrail hooks against assistant messages and records pass/fail results in message.Validations. When a guardrail denies and FailOnViolation is true (the default), this stage also enforces the guardrail by modifying the message content (e.g., truncating for length validators, replacing content for content blockers).
This stage reads "validator_configs" from element metadata (set by PromptAssemblyStage) and uses guardrails.NewGuardrailHook to instantiate each hook, then calls AfterCall to evaluate the response.
func NewGuardrailEvalStage ¶ added in v1.3.2
func NewGuardrailEvalStage() *GuardrailEvalStage
NewGuardrailEvalStage creates a new guardrail evaluation stage.
func (*GuardrailEvalStage) Process ¶ added in v1.3.2
func (s *GuardrailEvalStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process collects all elements, evaluates guardrail hooks against the last assistant message, and forwards all elements with updated Validations.
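To make the evaluate-then-enforce flow concrete, here is a small sketch using a length check. Everything in it is a stand-in: Message, ValidationResult, and applyMaxLength are hypothetical types and helpers written for this example, and the real stage builds its hooks through guardrails.NewGuardrailHook instead of hard-coding a check.

```go
package main

import "fmt"

// ValidationResult and Message are hypothetical stand-ins for the real types.
type ValidationResult struct {
	Name   string
	Passed bool
}

type Message struct {
	Role        string
	Content     string
	Validations []ValidationResult
}

// lastAssistantIndex returns the index of the last assistant message, or -1.
func lastAssistantIndex(msgs []Message) int {
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "assistant" {
			return i
		}
	}
	return -1
}

// applyMaxLength records a pass/fail result on the last assistant message and,
// when failOnViolation is true, truncates the content to maxChars runes.
// maxChars is assumed to be non-negative.
func applyMaxLength(msgs []Message, maxChars int, failOnViolation bool) {
	i := lastAssistantIndex(msgs)
	if i < 0 {
		return
	}
	m := &msgs[i]
	runes := []rune(m.Content)
	passed := len(runes) <= maxChars
	m.Validations = append(m.Validations, ValidationResult{Name: "max_length", Passed: passed})
	if !passed && failOnViolation {
		m.Content = string(runes[:maxChars])
	}
}

func main() {
	msgs := []Message{
		{Role: "user", Content: "Summarize the policy."},
		{Role: "assistant", Content: "The policy covers returns within thirty days."},
	}
	applyMaxLength(msgs, 10, true)
	fmt.Printf("%q %+v\n", msgs[1].Content, msgs[1].Validations)
}
```

Recording the result before enforcing keeps the failure visible in Validations even though the content has been modified.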
type HistoryInjectionStage ¶
HistoryInjectionStage prepends conversation history to the pipeline stream. This is useful for self-play scenarios where history needs to be explicitly provided before the LLM generates the next user message.
The stage emits all history messages first, then forwards any incoming elements.
When swapRoles is true, user↔assistant roles are swapped so that the self-play LLM sees its own prior outputs as "assistant" and the target's responses as "user". This prevents the self-play LLM from confusing itself with the target assistant.
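The role swap can be illustrated with a short sketch. Message here is a hypothetical stand-in for types.Message with only the fields needed for the example, and swapRoles is an illustrative helper, not a function exported by this package.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for types.Message.
type Message struct {
	Role    string
	Content string
}

// swapRoles returns a copy of history with "user" and "assistant" swapped.
// Other roles (for example "system" or "tool") are left unchanged, and the
// input slice is not modified.
func swapRoles(history []Message) []Message {
	out := make([]Message, len(history))
	for i, m := range history {
		switch m.Role {
		case "user":
			m.Role = "assistant"
		case "assistant":
			m.Role = "user"
		}
		out[i] = m
	}
	return out
}

func main() {
	history := []Message{
		{Role: "user", Content: "I need help with my order."},
		{Role: "assistant", Content: "Sure, what is the order number?"},
	}
	for _, m := range swapRoles(history) {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

After the swap, the simulated user's earlier turns appear under "assistant", which is the role the self-play LLM itself generates.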
func NewHistoryInjectionStage ¶
func NewHistoryInjectionStage(history []types.Message) *HistoryInjectionStage
NewHistoryInjectionStage creates a new history injection stage.
func NewHistoryInjectionStageSwapped ¶ added in v1.3.16
func NewHistoryInjectionStageSwapped(history []types.Message) *HistoryInjectionStage
NewHistoryInjectionStageSwapped creates a history injection stage that swaps user↔assistant roles. Use this for self-play generation so the LLM sees the conversation from its own perspective (its prior outputs as "assistant").
func (*HistoryInjectionStage) Process ¶
func (s *HistoryInjectionStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process emits history messages first, then forwards all incoming elements.
type MetadataInjectionStage ¶
MetadataInjectionStage injects metadata into stream element metadata.
func NewMetadataInjectionStage ¶
func NewMetadataInjectionStage(metadata map[string]interface{}) *MetadataInjectionStage
NewMetadataInjectionStage creates a new metadata injection stage.
func (*MetadataInjectionStage) Process ¶
func (s *MetadataInjectionStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process adds metadata to all elements.
type MockScenarioContextStage ¶
MockScenarioContextStage adds scenario context to the stream elements so that MockProvider can return scenario-specific responses.
This stage should be placed before ProviderStage in the pipeline when using MockProvider to ensure scenario context is available.
func NewMockScenarioContextStage ¶
func NewMockScenarioContextStage(scenario *config.Scenario) *MockScenarioContextStage
NewMockScenarioContextStage creates a stage that adds scenario context to stream elements for MockProvider scenario-specific responses.
func (*MockScenarioContextStage) Process ¶
func (s *MockScenarioContextStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process adds scenario context to all elements.
type PersonaAssemblyStage ¶
PersonaAssemblyStage assembles persona prompts using the same fragment/template system as PromptAssemblyStage. It enriches elements with the persona's assembled prompt and variables.
This stage mirrors the behavior of PromptAssemblyMiddleware but for personas:
- Uses persona's BuildSystemPrompt() which handles fragment assembly
- Supports template variable substitution with {{variable}} syntax
- Injects persona-specific variables (goals, constraints, style)
- Sets base variables for downstream template stage
func NewPersonaAssemblyStage ¶
func NewPersonaAssemblyStage(persona *config.UserPersonaPack, region string, baseVariables map[string]string) *PersonaAssemblyStage
NewPersonaAssemblyStage creates a new persona assembly stage.
func (*PersonaAssemblyStage) Process ¶
func (s *PersonaAssemblyStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process assembles the persona prompt and enriches all elements with it.
type ScenarioContextExtractionStage ¶
type ScenarioContextExtractionStage struct {
	stage.BaseStage
	// contains filtered or unexported fields
}
ScenarioContextExtractionStage extracts context using scenario metadata and conversation history. This is designed for Arena use where rich scenario metadata is available.
When scenario metadata is present, it uses:
- Scenario metadata variables (domain, user role from scenario definition)
- Scenario description and task type
- Message analysis as fallback
Extracted variables are merged into element metadata, allowing templates to use them.
func NewScenarioContextExtractionStage ¶
func NewScenarioContextExtractionStage(scenario *config.Scenario) *ScenarioContextExtractionStage
NewScenarioContextExtractionStage creates a new scenario context extraction stage.
func (*ScenarioContextExtractionStage) Process ¶
func (s *ScenarioContextExtractionStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process extracts scenario context and adds it to elements.
type SelfPlayUserTurnContextStage ¶
type SelfPlayUserTurnContextStage struct {
	stage.BaseStage
	// contains filtered or unexported fields
}
SelfPlayUserTurnContextStage adds scenario context for the NEXT user turn (completed user turns + 1). Intended only for self-play user generation.
This stage enriches elements with metadata that MockProvider uses to select the appropriate mock response based on scenario and turn number.
func NewSelfPlayUserTurnContextStage ¶
func NewSelfPlayUserTurnContextStage(scenario *config.Scenario) *SelfPlayUserTurnContextStage
NewSelfPlayUserTurnContextStage creates a new self-play context stage.
func NewSelfPlayUserTurnContextStageWithHint ¶
func NewSelfPlayUserTurnContextStageWithHint(scenario *config.Scenario, turnIndexHint int) *SelfPlayUserTurnContextStage
NewSelfPlayUserTurnContextStageWithHint creates a self-play context stage with an explicit turn index. turnIndexHint should be the 1-indexed self-play turn number (the first self-play turn is 1). Use this when the scenario mixes file-based and self-play turns.
func (*SelfPlayUserTurnContextStage) Process ¶
func (s *SelfPlayUserTurnContextStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process adds next-turn self-play context metadata to all elements.
type SkillInstructionStage ¶ added in v1.4.5
SkillInstructionStage appends preloaded skill instructions to the system_prompt metadata so the model sees skills marked preload: true from turn 1 without having to call skill__activate.
func NewSkillInstructionStage ¶ added in v1.4.5
func NewSkillInstructionStage(instructions string) *SkillInstructionStage
NewSkillInstructionStage creates a stage that appends the given preloaded skill instructions block to the "system_prompt" metadata key.
func (*SkillInstructionStage) Process ¶ added in v1.4.5
func (s *SkillInstructionStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process appends the preloaded skill instructions to the system_prompt metadata.
type StripToolMessagesStage ¶
StripToolMessagesStage removes tool role messages from the stream. This is used in self-play scenarios before calling the self-play provider.
func NewStripToolMessagesStage ¶
func NewStripToolMessagesStage() *StripToolMessagesStage
NewStripToolMessagesStage creates a new strip tool messages stage.
func (*StripToolMessagesStage) Process ¶
func (s *StripToolMessagesStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process filters out elements with tool role messages.
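The filtering can be sketched as a channel-to-channel function with the same shape as Process. Element and Message below are hypothetical stand-ins for stage.StreamElement and types.Message, and the convention that the stage closes its output channel when done is an assumption of this sketch, not something this page confirms.

```go
package main

import (
	"context"
	"fmt"
)

// Message and Element are hypothetical stand-ins for the real types.
type Message struct {
	Role    string
	Content string
}

type Element struct {
	Message *Message
}

// stripToolMessages forwards every element except those carrying a message
// with role "tool". It closes output on return (an assumed convention).
func stripToolMessages(ctx context.Context, input <-chan Element, output chan<- Element) error {
	defer close(output)
	for elem := range input {
		if elem.Message != nil && elem.Message.Role == "tool" {
			continue // drop tool results before the self-play provider sees them
		}
		select {
		case output <- elem:
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}

// runStrip feeds elems through stripToolMessages using buffered channels
// large enough that the call cannot block.
func runStrip(elems []Element) ([]Element, error) {
	in := make(chan Element, len(elems))
	out := make(chan Element, len(elems))
	for _, e := range elems {
		in <- e
	}
	close(in)
	if err := stripToolMessages(context.Background(), in, out); err != nil {
		return nil, err
	}
	var kept []Element
	for e := range out {
		kept = append(kept, e)
	}
	return kept, nil
}

func main() {
	kept, err := runStrip([]Element{
		{Message: &Message{Role: "user", Content: "What is the weather?"}},
		{Message: &Message{Role: "tool", Content: `{"temp": 21}`}},
		{Message: &Message{Role: "assistant", Content: "It is 21 degrees."}},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, e := range kept {
		fmt.Println(e.Message.Role)
	}
}
```

Elements without a message are passed through untouched in this sketch; whether the real stage does the same is not stated here.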
type TurnEvalRunner ¶ added in v1.3.2
type TurnEvalRunner interface {
	RunAssertionsAsEvals(
		ctx context.Context,
		assertionConfigs []assertions.AssertionConfig,
		messages []types.Message,
		turnIndex int,
		sessionID string,
		trigger evals.EvalTrigger,
	) []evals.EvalResult
}
TurnEvalRunner is an interface for running assertions as evals during turn execution. EvalOrchestrator in the engine package implements this interface.
type TurnIndexStage ¶
TurnIndexStage computes role-specific turn counters from accumulated messages. It sets clear, role-specific metadata keys that other stages can consume:
- arena_user_completed_turns: number of completed user messages
- arena_user_next_turn: completed user messages + 1 (next user turn to generate)
- arena_assistant_completed_turns: number of completed assistant messages
- arena_assistant_next_turn: completed assistant messages + 1
This stage enriches all elements with turn count metadata.
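The counter arithmetic can be sketched as below. The metadata key names are taken from the description above. The Message type and the turnCounters helper are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for types.Message.
type Message struct {
	Role string
}

// turnCounters counts user and assistant messages and derives the four
// role-specific values described for TurnIndexStage.
func turnCounters(messages []Message) map[string]int {
	users, assistants := 0, 0
	for _, m := range messages {
		switch m.Role {
		case "user":
			users++
		case "assistant":
			assistants++
		}
	}
	return map[string]int{
		"arena_user_completed_turns":      users,
		"arena_user_next_turn":            users + 1,
		"arena_assistant_completed_turns": assistants,
		"arena_assistant_next_turn":       assistants + 1,
	}
}

func main() {
	counters := turnCounters([]Message{
		{Role: "system"},
		{Role: "user"},
		{Role: "assistant"},
		{Role: "user"},
	})
	// Print in a fixed order, since Go map iteration order is unspecified.
	for _, key := range []string{
		"arena_user_completed_turns",
		"arena_user_next_turn",
		"arena_assistant_completed_turns",
		"arena_assistant_next_turn",
	} {
		fmt.Printf("%s=%d\n", key, counters[key])
	}
}
```

Messages with other roles, such as system or tool, do not affect any counter in this sketch.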
func NewTurnIndexStage ¶
func NewTurnIndexStage() *TurnIndexStage
NewTurnIndexStage creates a new turn index stage.
func (*TurnIndexStage) Process ¶
func (s *TurnIndexStage) Process(ctx context.Context, input <-chan stage.StreamElement, output chan<- stage.StreamElement) error
Process computes role-specific turn counters and enriches elements with metadata.