results

package
v1.2.0 Latest
Warning: This package is not in the latest version of its module.
Published: Feb 15, 2026 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Overview

Package results provides an abstract result output layer for Arena. It implements the Repository Pattern so that results can be written in multiple output formats (JSON, JUnit XML, HTML, TAP) simultaneously, while keeping execution cleanly separated from output.

Functions

func AllAssertionsPassed added in v1.1.5

func AllAssertionsPassed(result *engine.RunResult) bool

AllAssertionsPassed checks if all assertions in the result passed. This includes both turn-level assertions (in message metadata) and conversation-level assertions.

func CalculatePerformanceMetrics

func CalculatePerformanceMetrics(results []engine.RunResult) (totalCost float64, totalTokens int, totalDuration time.Duration)

CalculatePerformanceMetrics returns the total cost, total token count, and total duration across all results.
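As a rough illustration of the aggregation described above, the sketch below sums three per-run values. The `runResult` type and its `Cost`, `Tokens`, and `Duration` fields are assumptions made for this example; the fields of the real `engine.RunResult` are not shown on this page.

```go
package main

import (
	"fmt"
	"time"
)

// runResult is a hypothetical stand-in for engine.RunResult.
// The real struct's field names are not documented here.
type runResult struct {
	Cost     float64
	Tokens   int
	Duration time.Duration
}

// totals mirrors the return shape of CalculatePerformanceMetrics.
func totals(results []runResult) (totalCost float64, totalTokens int, totalDuration time.Duration) {
	for _, r := range results {
		totalCost += r.Cost
		totalTokens += r.Tokens
		totalDuration += r.Duration
	}
	return totalCost, totalTokens, totalDuration
}

func main() {
	cost, tokens, dur := totals([]runResult{
		{Cost: 0.25, Tokens: 1200, Duration: 2 * time.Second},
		{Cost: 0.5, Tokens: 800, Duration: 500 * time.Millisecond},
	})
	fmt.Println(cost, tokens, dur) // prints "0.75 2000 2.5s"
}
```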

func CountResultsByStatus

func CountResultsByStatus(results []engine.RunResult) (passed, failed int)

CountResultsByStatus counts successful and failed results. A result is considered successful if:

  1. There are no errors AND no violations, OR
  2. There are no errors AND violations occurred AND there are assertions AND all assertions passed. This allows tests that EXPECT guardrails to trigger to pass when they do.

A result is considered failed if:

  1. There are errors, OR
  2. There are violations AND no assertions (violations are unexpected), OR
  3. There are violations AND some assertions fail
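The pass/fail rules above can be sketched as a small decision function. This follows the documented rules literally and is not the package's actual implementation; `runResult` and its fields are hypothetical stand-ins for `engine.RunResult`.

```go
package main

import "fmt"

// runResult is a hypothetical stand-in for engine.RunResult.
type runResult struct {
	Err        string // non-empty when the run errored
	Violations int    // number of guardrail violations
	Assertions []bool // outcome of each assertion, true = passed
}

// passed applies the documented rules to a single result.
func passed(r runResult) bool {
	if r.Err != "" {
		return false // errors always fail
	}
	if r.Violations == 0 {
		return true // no errors and no violations
	}
	// Violations occurred: pass only if assertions exist and all passed.
	if len(r.Assertions) == 0 {
		return false // violations were unexpected
	}
	for _, ok := range r.Assertions {
		if !ok {
			return false
		}
	}
	return true
}

// countByStatus mirrors the return shape of CountResultsByStatus.
func countByStatus(results []runResult) (passedN, failedN int) {
	for _, r := range results {
		if passed(r) {
			passedN++
		} else {
			failedN++
		}
	}
	return passedN, failedN
}

func main() {
	p, f := countByStatus([]runResult{
		{}, // clean run
		{Violations: 1, Assertions: []bool{true}}, // expected guardrail
		{Violations: 1},  // unexpected guardrail
		{Err: "timeout"}, // error
		{Violations: 2, Assertions: []bool{true, false}}, // failed assertion
	})
	fmt.Println(p, f) // prints "2 3"
}
```

Note that the documented rules do not say how a result with no errors, no violations, and a failing assertion is counted; the sketch treats it as passed because rule 1 of the success list matches.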

func ExtractRunIDs

func ExtractRunIDs(results []engine.RunResult) []string

ExtractRunIDs extracts all run IDs from results

func ExtractUniqueValues

func ExtractUniqueValues(results []engine.RunResult, extractor func(engine.RunResult) string) []string

ExtractUniqueValues extracts unique values using the provided extractor function
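A minimal sketch of extractor-based deduplication follows. It keeps values in first-seen order and does not skip empty strings; both choices are assumptions, since the documentation does not state the ordering or how empty values are handled. `runResult` is a hypothetical stand-in for `engine.RunResult`.

```go
package main

import "fmt"

// runResult is a hypothetical stand-in for engine.RunResult.
type runResult struct {
	Provider string
	Region   string
}

// uniqueValues returns each distinct extracted value once,
// in the order it was first seen (an assumption of this sketch).
func uniqueValues(results []runResult, extractor func(runResult) string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, r := range results {
		v := extractor(r)
		if _, dup := seen[v]; dup {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	results := []runResult{
		{Provider: "provider-a", Region: "us"},
		{Provider: "provider-b", Region: "us"},
		{Provider: "provider-a", Region: "eu"},
	}
	providers := uniqueValues(results, func(r runResult) string { return r.Provider })
	fmt.Println(providers) // prints "[provider-a provider-b]"
}
```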

func HasAssertions added in v1.1.5

func HasAssertions(result *engine.RunResult) bool

HasAssertions checks if the result has any assertions defined. This is used to determine if violations should be treated as failures:

  - Violations WITH passing assertions = test passed (guardrails were expected)
  - Violations WITHOUT any assertions = test failed (guardrails were unexpected)

func IsUnsupportedOperation

func IsUnsupportedOperation(err error) bool

IsUnsupportedOperation checks if an error is an UnsupportedOperationError
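One common way to write such a check in Go is `errors.As`, sketched below with a local stand-in type. Whether the real IsUnsupportedOperation also matches wrapped errors is not stated here, and the message format is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// unsupportedOperationError is a local stand-in with the same fields as
// UnsupportedOperationError; the message format is an assumption.
type unsupportedOperationError struct {
	Operation string
	Reason    string
}

func (e *unsupportedOperationError) Error() string {
	return fmt.Sprintf("unsupported operation %q: %s", e.Operation, e.Reason)
}

// isUnsupported reports whether err is, or wraps, the stand-in type.
func isUnsupported(err error) bool {
	var target *unsupportedOperationError
	return errors.As(err, &target)
}

func main() {
	base := &unsupportedOperationError{Operation: "LoadResults", Reason: "write-only format"}
	wrapped := fmt.Errorf("report generation: %w", base)

	fmt.Println(isUnsupported(wrapped))                 // prints "true"
	fmt.Println(isUnsupported(errors.New("disk full"))) // prints "false"
}
```

A caller would typically use a check like this to skip a repository that cannot perform an operation rather than treating the error as fatal.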

func ValidateResults

func ValidateResults(results []engine.RunResult) error

ValidateResults performs basic validation on results before processing

Types

type CompositeError

type CompositeError struct {
	Operation string
	Errors    []error
}

CompositeError represents multiple errors from repository operations

func NewCompositeError

func NewCompositeError(operation string, errs []error) *CompositeError

NewCompositeError creates a new composite error

func (*CompositeError) Error

func (e *CompositeError) Error() string

Error implements the error interface

func (*CompositeError) Unwrap

func (e *CompositeError) Unwrap() error

Unwrap returns the first error for compatibility with errors.Unwrap
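Because Unwrap returns only the first error, `errors.Is` and `errors.As` applied to the composite would normally see just that first entry. The sketch below shows the effect with a local stand-in type. The message format is an assumption, and the real type may define additional matching behavior not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var (
	errDisk = errors.New("disk full")
	errPerm = errors.New("permission denied")
)

// compositeError is a local stand-in with the same fields as CompositeError.
type compositeError struct {
	Operation string
	Errors    []error
}

// Error joins all messages; the exact format is an assumption.
func (e *compositeError) Error() string {
	msgs := make([]string, len(e.Errors))
	for i, err := range e.Errors {
		msgs[i] = err.Error()
	}
	return fmt.Sprintf("%s: %d error(s): %s", e.Operation, len(e.Errors), strings.Join(msgs, "; "))
}

// Unwrap returns the first error, or nil when there are none.
func (e *compositeError) Unwrap() error {
	if len(e.Errors) == 0 {
		return nil
	}
	return e.Errors[0]
}

func main() {
	ce := &compositeError{Operation: "SaveResults", Errors: []error{errDisk, errPerm}}
	fmt.Println(ce)
	fmt.Println(errors.Is(ce, errDisk)) // prints "true": reachable through Unwrap
	fmt.Println(errors.Is(ce, errPerm)) // prints "false": only the first error is unwrapped
}
```

To inspect every failure, a caller can range over the exported Errors field instead of relying on unwrapping.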

type CompositeResultRepository

type CompositeResultRepository struct {
	// contains filtered or unexported fields
}

CompositeResultRepository writes to multiple repositories simultaneously. This allows Arena to output results in multiple formats (JSON + JUnit + HTML) with a single call, ensuring consistency across all outputs.

func NewCompositeRepository

func NewCompositeRepository(repos ...ResultRepository) *CompositeResultRepository

NewCompositeRepository creates a new composite repository that writes to all provided repositories in order.

func (*CompositeResultRepository) AddRepository

func (r *CompositeResultRepository) AddRepository(repo ResultRepository)

AddRepository adds a new repository to the composite

func (*CompositeResultRepository) GetRepositories

func (r *CompositeResultRepository) GetRepositories() []ResultRepository

GetRepositories returns all repositories in the composite

func (*CompositeResultRepository) LoadResults

func (r *CompositeResultRepository) LoadResults() ([]engine.RunResult, error)

LoadResults loads from the first repository that supports loading

func (*CompositeResultRepository) SaveResult

func (r *CompositeResultRepository) SaveResult(result *engine.RunResult) error

SaveResult saves a single result to all streaming-capable repositories

func (*CompositeResultRepository) SaveResults

func (r *CompositeResultRepository) SaveResults(results []engine.RunResult) error

SaveResults saves results to all repositories. If any repository fails, it continues with others and returns a composite error containing all failures.
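The continue-on-failure behavior described above can be sketched as a loop that collects errors instead of returning at the first one. The `saver` interface and `memorySaver` type are hypothetical and reduced for illustration; results are plain strings rather than `engine.RunResult` values.

```go
package main

import (
	"errors"
	"fmt"
)

var errDiskFull = errors.New("disk full")

// saver is a hypothetical, reduced version of ResultRepository
// with only the batch-save method.
type saver interface {
	SaveResults(results []string) error
}

// memorySaver records what it was asked to save, or fails if failWith is set.
type memorySaver struct {
	name     string
	failWith error
	saved    []string
}

func (m *memorySaver) SaveResults(results []string) error {
	if m.failWith != nil {
		return fmt.Errorf("%s: %w", m.name, m.failWith)
	}
	m.saved = append(m.saved, results...)
	return nil
}

// saveAll calls every saver in order and collects all failures.
func saveAll(savers []saver, results []string) []error {
	var errs []error
	for _, s := range savers {
		if err := s.SaveResults(results); err != nil {
			errs = append(errs, err) // keep going with the remaining savers
		}
	}
	return errs
}

func main() {
	jsonRepo := &memorySaver{name: "json"}
	junitRepo := &memorySaver{name: "junit", failWith: errDiskFull}
	htmlRepo := &memorySaver{name: "html"}

	errs := saveAll([]saver{jsonRepo, junitRepo, htmlRepo}, []string{"run-1", "run-2"})
	fmt.Println(len(errs), len(jsonRepo.saved), len(htmlRepo.saved)) // prints "1 2 2"
}
```

In the real package the collected errors are presumably wrapped in a CompositeError; the sketch returns the raw slice to stay self-contained.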

func (*CompositeResultRepository) SaveSummary

func (r *CompositeResultRepository) SaveSummary(summary *ResultSummary) error

SaveSummary saves summary to all repositories

func (*CompositeResultRepository) SupportsStreaming

func (r *CompositeResultRepository) SupportsStreaming() bool

SupportsStreaming returns true if any repository supports streaming

type ResultRepository

type ResultRepository interface {
	// SaveResults saves test execution results in the repository's format
	SaveResults(results []engine.RunResult) error

	// SaveSummary saves a summary of all test results
	SaveSummary(summary *ResultSummary) error

	// LoadResults loads previously saved results (for report generation)
	// Returns error if the repository format doesn't support loading
	LoadResults() ([]engine.RunResult, error)

	// SupportsStreaming returns true if repository can write results incrementally
	SupportsStreaming() bool

	// SaveResult saves a single result (for streaming support)
	// Returns error if streaming is not supported
	SaveResult(result *engine.RunResult) error
}

ResultRepository provides abstract access to test result storage and output formatting. Implementations handle specific formats like JSON, JUnit XML, HTML, or TAP.
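To make the interface contract concrete, here is a sketch of a write-only, batch-only implementation. Everything in it is hypothetical: `repository` is a reduced copy of the interface (SaveSummary omitted), `runResult` stands in for `engine.RunResult`, and `errUnsupported` stands in for UnsupportedOperationError.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// runResult is a hypothetical stand-in for engine.RunResult.
type runResult struct {
	RunID string
}

// repository is a reduced copy of ResultRepository for this sketch.
type repository interface {
	SaveResults(results []runResult) error
	LoadResults() ([]runResult, error)
	SupportsStreaming() bool
	SaveResult(result *runResult) error
}

// errUnsupported stands in for the package's UnsupportedOperationError.
var errUnsupported = errors.New("operation not supported")

// listRepository renders run IDs as a plain-text list. It cannot load
// results or stream them, so both operations report errUnsupported.
type listRepository struct {
	out strings.Builder
}

func (r *listRepository) SaveResults(results []runResult) error {
	for _, res := range results {
		fmt.Fprintf(&r.out, "- %s\n", res.RunID)
	}
	return nil
}

func (r *listRepository) LoadResults() ([]runResult, error) {
	return nil, fmt.Errorf("LoadResults: %w", errUnsupported)
}

func (r *listRepository) SupportsStreaming() bool { return false }

func (r *listRepository) SaveResult(result *runResult) error {
	return fmt.Errorf("SaveResult: %w", errUnsupported)
}

// Compile-time check that listRepository satisfies the interface.
var _ repository = (*listRepository)(nil)

func main() {
	repo := &listRepository{}
	if err := repo.SaveResults([]runResult{{RunID: "run-1"}, {RunID: "run-2"}}); err != nil {
		panic(err)
	}
	fmt.Print(repo.out.String())

	_, err := repo.LoadResults()
	fmt.Println(errors.Is(err, errUnsupported)) // prints "true"
}
```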

type ResultSummary

type ResultSummary struct {
	// Test execution counts
	TotalTests int `json:"total_tests"`
	Passed     int `json:"passed"`
	Failed     int `json:"failed"`
	Errors     int `json:"errors"`
	Skipped    int `json:"skipped"`

	// Performance metrics
	TotalDuration time.Duration `json:"total_duration"`
	AverageCost   float64       `json:"average_cost"`
	TotalCost     float64       `json:"total_cost"`
	TotalTokens   int           `json:"total_tokens"`

	// Execution metadata
	Timestamp  time.Time `json:"timestamp"`
	ConfigFile string    `json:"config_file"`

	// CI/CD integration metadata (optional)
	GitCommit string `json:"git_commit,omitempty"`
	GitBranch string `json:"git_branch,omitempty"`
	CIBuildID string `json:"ci_build_id,omitempty"`
	CIJobURL  string `json:"ci_job_url,omitempty"`

	// Arena-specific metadata
	RunIDs      []string `json:"run_ids"`
	PromptPacks []string `json:"prompt_packs"`
	Scenarios   []string `json:"scenarios"`
	Providers   []string `json:"providers"`
	Regions     []string `json:"regions"`
}

ResultSummary contains aggregate information about test runs. It provides metadata and statistics that can be used across different output formats.
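The struct tags above determine the JSON shape. The sketch below marshals a reduced copy of the struct (four of its fields, with tags copied from the definition) to show three standard `encoding/json` behaviors: a `time.Duration` is written as integer nanoseconds, `omitempty` string fields disappear when empty, and a nil slice without `omitempty` is written as `null`. The `summary` type and `encode` helper are names invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// summary is a reduced copy of ResultSummary with tags taken from the
// definition above.
type summary struct {
	TotalTests    int           `json:"total_tests"`
	TotalDuration time.Duration `json:"total_duration"`
	GitCommit     string        `json:"git_commit,omitempty"`
	RunIDs        []string      `json:"run_ids"`
}

// encode is a small helper that returns the JSON form as a string.
func encode(s summary) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(summary{TotalTests: 2, TotalDuration: 1500 * time.Millisecond})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints {"total_tests":2,"total_duration":1500000000,"run_ids":null}
}
```

Consumers that expect an array for `run_ids` may therefore need the slice to be initialized as empty rather than left nil.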

type SummaryBuilder

type SummaryBuilder struct {
	// contains filtered or unexported fields
}

SummaryBuilder helps build ResultSummary from RunResult slices

func NewSummaryBuilder

func NewSummaryBuilder(configFile string) *SummaryBuilder

NewSummaryBuilder creates a new summary builder

func (*SummaryBuilder) BuildSummary

func (b *SummaryBuilder) BuildSummary(results []engine.RunResult) *ResultSummary

BuildSummary creates a ResultSummary from the provided results

func (*SummaryBuilder) SetCIMetadata

func (b *SummaryBuilder) SetCIMetadata(buildID, jobURL string) *SummaryBuilder

SetCIMetadata sets CI/CD-related metadata

func (*SummaryBuilder) SetGitMetadata

func (b *SummaryBuilder) SetGitMetadata(commit, branch string) *SummaryBuilder

SetGitMetadata sets Git-related metadata for CI/CD integration

func (*SummaryBuilder) SetTimestamp

func (b *SummaryBuilder) SetTimestamp(ts time.Time) *SummaryBuilder

SetTimestamp sets a custom timestamp for the summary
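Each setter returns the `*SummaryBuilder`, which suggests the methods are meant to be chained. The sketch below reproduces that fluent pattern with local stand-in types. The `builder` and `summary` types, the default-to-now timestamp, the file name, and the URL are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// summary is a reduced, hypothetical version of ResultSummary.
type summary struct {
	ConfigFile string
	Timestamp  time.Time
	GitCommit  string
	GitBranch  string
	CIBuildID  string
	CIJobURL   string
	TotalTests int
}

// builder is a local stand-in for SummaryBuilder.
type builder struct {
	s summary
}

func newBuilder(configFile string) *builder {
	return &builder{s: summary{ConfigFile: configFile}}
}

func (b *builder) SetGitMetadata(commit, branch string) *builder {
	b.s.GitCommit, b.s.GitBranch = commit, branch
	return b
}

func (b *builder) SetCIMetadata(buildID, jobURL string) *builder {
	b.s.CIBuildID, b.s.CIJobURL = buildID, jobURL
	return b
}

func (b *builder) SetTimestamp(ts time.Time) *builder {
	b.s.Timestamp = ts
	return b
}

// Build copies the accumulated metadata and fills in the counts.
// Defaulting the timestamp to time.Now is an assumption of this sketch.
func (b *builder) Build(runIDs []string) *summary {
	out := b.s
	out.TotalTests = len(runIDs)
	if out.Timestamp.IsZero() {
		out.Timestamp = time.Now()
	}
	return &out
}

func main() {
	s := newBuilder("arena.yaml"). // hypothetical config file name
					SetGitMetadata("abc123", "main").
					SetCIMetadata("42", "https://ci.example.com/jobs/42").
					Build([]string{"run-1", "run-2"})
	fmt.Println(s.ConfigFile, s.GitBranch, s.CIBuildID, s.TotalTests)
	// prints "arena.yaml main 42 2"
}
```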

type UnsupportedOperationError

type UnsupportedOperationError struct {
	Operation string
	Reason    string
}

UnsupportedOperationError represents an operation not supported by a repository

func NewUnsupportedOperationError

func NewUnsupportedOperationError(operation, reason string) *UnsupportedOperationError

NewUnsupportedOperationError creates a new unsupported operation error

func (*UnsupportedOperationError) Error

func (e *UnsupportedOperationError) Error() string

Error implements the error interface

type ValidationError

type ValidationError struct {
	Field   string
	Value   interface{}
	Message string
}

ValidationError represents a validation failure in result processing

func NewValidationError

func NewValidationError(field string, value interface{}, message string) *ValidationError

NewValidationError creates a new validation error

func (*ValidationError) Error

func (e *ValidationError) Error() string

Error implements the error interface

Directories

Path Synopsis
Package html implements the ResultRepository interface for HTML report generation.
Package json provides JSON file-based result storage for Arena.
Package junit provides JUnit XML result output for Arena.
Package markdown provides Markdown file-based result storage for Arena.
