Documentation ¶
Overview ¶
Package results provides an abstract result output layer for Arena. It implements the Repository pattern so that results can be written in multiple output formats (JSON, JUnit XML, HTML, TAP) simultaneously, while keeping test execution cleanly separated from output.
Index ¶
- func AllAssertionsPassed(result *engine.RunResult) bool
- func CalculatePerformanceMetrics(results []engine.RunResult) (totalCost float64, totalTokens int, totalDuration time.Duration)
- func CountResultsByStatus(results []engine.RunResult) (passed, failed int)
- func ExtractRunIDs(results []engine.RunResult) []string
- func ExtractUniqueValues(results []engine.RunResult, extractor func(engine.RunResult) string) []string
- func HasAssertions(result *engine.RunResult) bool
- func IsUnsupportedOperation(err error) bool
- func ValidateResults(results []engine.RunResult) error
- type CompositeError
- type CompositeResultRepository
- func (r *CompositeResultRepository) AddRepository(repo ResultRepository)
- func (r *CompositeResultRepository) GetRepositories() []ResultRepository
- func (r *CompositeResultRepository) LoadResults() ([]engine.RunResult, error)
- func (r *CompositeResultRepository) SaveResult(result *engine.RunResult) error
- func (r *CompositeResultRepository) SaveResults(results []engine.RunResult) error
- func (r *CompositeResultRepository) SaveSummary(summary *ResultSummary) error
- func (r *CompositeResultRepository) SupportsStreaming() bool
- type ResultRepository
- type ResultSummary
- type SummaryBuilder
- func (b *SummaryBuilder) BuildSummary(results []engine.RunResult) *ResultSummary
- func (b *SummaryBuilder) SetCIMetadata(buildID, jobURL string) *SummaryBuilder
- func (b *SummaryBuilder) SetGitMetadata(commit, branch string) *SummaryBuilder
- func (b *SummaryBuilder) SetTimestamp(ts time.Time) *SummaryBuilder
- type UnsupportedOperationError
- type ValidationError
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func AllAssertionsPassed ¶ added in v1.1.5
func AllAssertionsPassed(result *engine.RunResult) bool
AllAssertionsPassed checks if all assertions in the result passed. This includes both turn-level assertions (in message metadata) and conversation-level assertions.
func CalculatePerformanceMetrics ¶
func CalculatePerformanceMetrics(results []engine.RunResult) (totalCost float64, totalTokens int, totalDuration time.Duration)
CalculatePerformanceMetrics calculates cost, token, and duration totals
func CountResultsByStatus ¶
func CountResultsByStatus(results []engine.RunResult) (passed, failed int)
CountResultsByStatus counts successful and failed results. A result is considered successful if:
- There are no errors AND no violations, OR
- There are no errors AND violations occurred AND there are assertions AND all assertions passed. This allows tests that EXPECT guardrails to trigger to pass when they do.
A result is considered failed if:
- There are errors, OR
- There are violations AND no assertions (violations are unexpected), OR
- There are violations AND some assertions fail
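The rules above reduce to a small decision function. The sketch below encodes only the rules as listed; the `outcome` type and its boolean fields are hypothetical, a flattened view of what the real function presumably derives from each engine.RunResult.

```go
package main

import "fmt"

// outcome is a hypothetical, flattened view of one run. The real function
// works on engine.RunResult, whose fields are not shown here.
type outcome struct {
	HasError            bool
	HasViolations       bool
	HasAssertions       bool
	AllAssertionsPassed bool
}

// passed applies the documented rules in order.
func passed(o outcome) bool {
	if o.HasError {
		return false // errors always fail
	}
	if !o.HasViolations {
		return true // no errors and no violations
	}
	// Violations only pass when assertions exist and all of them passed,
	// i.e. the guardrail was expected to trigger.
	return o.HasAssertions && o.AllAssertionsPassed
}

// countByStatus tallies passing and failing outcomes.
func countByStatus(outcomes []outcome) (passedCount, failedCount int) {
	for _, o := range outcomes {
		if passed(o) {
			passedCount++
		} else {
			failedCount++
		}
	}
	return passedCount, failedCount
}

func main() {
	p, f := countByStatus([]outcome{
		{}, // clean run
		{HasViolations: true, HasAssertions: true, AllAssertionsPassed: true}, // expected guardrail
		{HasViolations: true}, // unexpected guardrail
		{HasError: true},      // execution error
	})
	fmt.Println(p, f) // prints "2 2"
}
```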
func ExtractRunIDs ¶
func ExtractRunIDs(results []engine.RunResult) []string
ExtractRunIDs extracts all run IDs from results
func ExtractUniqueValues ¶
func ExtractUniqueValues(results []engine.RunResult, extractor func(engine.RunResult) string) []string
ExtractUniqueValues extracts unique values using the provided extractor function
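As an illustration of the extractor-function approach, here is a minimal sketch of deduplication driven by a caller-supplied function. The `runResult` type and its fields are hypothetical, and preserving first-seen order is an assumption; the documentation does not state the order of the returned values.

```go
package main

import "fmt"

// runResult is a hypothetical stand-in for engine.RunResult.
type runResult struct {
	RunID    string
	Provider string
}

// uniqueValues applies extractor to every result and returns each distinct
// value once, in first-seen order (an assumption of this sketch).
func uniqueValues(results []runResult, extractor func(runResult) string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, r := range results {
		v := extractor(r)
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	results := []runResult{
		{RunID: "run-1", Provider: "alpha"},
		{RunID: "run-2", Provider: "beta"},
		{RunID: "run-3", Provider: "alpha"},
	}
	providers := uniqueValues(results, func(r runResult) string { return r.Provider })
	fmt.Println(providers) // prints "[alpha beta]"
}
```

Passing the extractor as a function keeps one helper reusable for providers, scenarios, regions, or any other string-valued attribute.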
func HasAssertions ¶ added in v1.1.5
func HasAssertions(result *engine.RunResult) bool
HasAssertions checks if the result has any assertions defined. This is used to determine if violations should be treated as failures:
- Violations WITH passing assertions = test passed (guardrails were expected)
- Violations WITHOUT any assertions = test failed (guardrails were unexpected)
func IsUnsupportedOperation ¶
func IsUnsupportedOperation(err error) bool
IsUnsupportedOperation checks if an error is an UnsupportedOperationError
func ValidateResults ¶
func ValidateResults(results []engine.RunResult) error
ValidateResults performs basic validation on results before processing
Types ¶
type CompositeError ¶
CompositeError represents multiple errors from repository operations
func NewCompositeError ¶
func NewCompositeError(operation string, errs []error) *CompositeError
NewCompositeError creates a new composite error
func (*CompositeError) Error ¶
func (e *CompositeError) Error() string
Error implements the error interface
func (*CompositeError) Unwrap ¶
func (e *CompositeError) Unwrap() error
Unwrap returns the first error for compatibility with errors.Unwrap
type CompositeResultRepository ¶
type CompositeResultRepository struct {
// contains filtered or unexported fields
}
CompositeResultRepository writes to multiple repositories simultaneously. This allows Arena to output results in multiple formats (JSON + JUnit + HTML) with a single call, ensuring consistency across all outputs.
func NewCompositeRepository ¶
func NewCompositeRepository(repos ...ResultRepository) *CompositeResultRepository
NewCompositeRepository creates a new composite repository that writes to all provided repositories in order.
func (*CompositeResultRepository) AddRepository ¶
func (r *CompositeResultRepository) AddRepository(repo ResultRepository)
AddRepository adds a new repository to the composite
func (*CompositeResultRepository) GetRepositories ¶
func (r *CompositeResultRepository) GetRepositories() []ResultRepository
GetRepositories returns all repositories in the composite
func (*CompositeResultRepository) LoadResults ¶
func (r *CompositeResultRepository) LoadResults() ([]engine.RunResult, error)
LoadResults loads from the first repository that supports loading
func (*CompositeResultRepository) SaveResult ¶
func (r *CompositeResultRepository) SaveResult(result *engine.RunResult) error
SaveResult saves a single result to all streaming-capable repositories
func (*CompositeResultRepository) SaveResults ¶
func (r *CompositeResultRepository) SaveResults(results []engine.RunResult) error
SaveResults saves results to all repositories. If any repository fails, it continues with others and returns a composite error containing all failures.
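The continue-on-failure behavior can be sketched as a loop that collects errors instead of returning at the first one. Everything named below (`saver`, `memorySaver`, `failingSaver`, `saveAll`) is hypothetical; the real method returns a composite error rather than a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// runResult is a hypothetical stand-in for engine.RunResult.
type runResult struct{ RunID string }

// saver is a reduced, hypothetical slice of ResultRepository.
type saver interface {
	SaveResults(results []runResult) error
}

// memorySaver records what it was asked to save.
type memorySaver struct{ saved []runResult }

func (m *memorySaver) SaveResults(results []runResult) error {
	m.saved = append(m.saved, results...)
	return nil
}

// failingSaver always fails, to show that later repositories still run.
type failingSaver struct{}

func (failingSaver) SaveResults([]runResult) error { return errors.New("disk full") }

// saveAll writes to every repository in order and returns all failures.
func saveAll(repos []saver, results []runResult) []error {
	var errs []error
	for _, repo := range repos {
		if err := repo.SaveResults(results); err != nil {
			errs = append(errs, err) // record and keep going
		}
	}
	return errs
}

func main() {
	mem := &memorySaver{}
	errs := saveAll([]saver{failingSaver{}, mem}, []runResult{{RunID: "run-1"}})
	fmt.Println(len(errs), len(mem.saved)) // prints "1 1"
}
```

The design choice is that one broken output format does not prevent the remaining formats from being written.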
func (*CompositeResultRepository) SaveSummary ¶
func (r *CompositeResultRepository) SaveSummary(summary *ResultSummary) error
SaveSummary saves summary to all repositories
func (*CompositeResultRepository) SupportsStreaming ¶
func (r *CompositeResultRepository) SupportsStreaming() bool
SupportsStreaming returns true if any repository supports streaming
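Taken together, SaveResult and SupportsStreaming describe a filter: incremental writes go only to repositories that opt in, and the composite reports streaming support if at least one member does. A minimal sketch under those assumptions follows; `streamer`, `fakeRepo`, and the choice of which formats stream are hypothetical.

```go
package main

import "fmt"

// streamer is a reduced, hypothetical slice of ResultRepository.
type streamer interface {
	SupportsStreaming() bool
	SaveResult(runID string) error
}

// fakeRepo is a test double whose streaming support is configurable.
type fakeRepo struct {
	name      string
	streaming bool
	saved     []string
}

func (f *fakeRepo) SupportsStreaming() bool { return f.streaming }

func (f *fakeRepo) SaveResult(runID string) error {
	if !f.streaming {
		return fmt.Errorf("%s: streaming not supported", f.name)
	}
	f.saved = append(f.saved, runID)
	return nil
}

// anySupportsStreaming is true if at least one repository streams.
func anySupportsStreaming(repos []streamer) bool {
	for _, r := range repos {
		if r.SupportsStreaming() {
			return true
		}
	}
	return false
}

// saveToStreaming writes one result to streaming-capable repositories
// only, skipping the rest instead of treating them as failures.
func saveToStreaming(repos []streamer, runID string) []error {
	var errs []error
	for _, r := range repos {
		if !r.SupportsStreaming() {
			continue
		}
		if err := r.SaveResult(runID); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	incremental := &fakeRepo{name: "incremental", streaming: true}
	batchOnly := &fakeRepo{name: "batch-only", streaming: false}
	repos := []streamer{incremental, batchOnly}
	errs := saveToStreaming(repos, "run-1")
	fmt.Println(anySupportsStreaming(repos), len(errs), len(incremental.saved), len(batchOnly.saved)) // prints "true 0 1 0"
}
```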
type ResultRepository ¶
type ResultRepository interface {
// SaveResults saves test execution results in the repository's format
SaveResults(results []engine.RunResult) error
// SaveSummary saves a summary of all test results
SaveSummary(summary *ResultSummary) error
// LoadResults loads previously saved results (for report generation)
// Returns error if the repository format doesn't support loading
LoadResults() ([]engine.RunResult, error)
// SupportsStreaming returns true if repository can write results incrementally
SupportsStreaming() bool
// SaveResult saves a single result (for streaming support)
// Returns error if streaming is not supported
SaveResult(result *engine.RunResult) error
}
ResultRepository provides abstract access to test result storage and output formatting. Implementations handle specific formats like JSON, JUnit XML, HTML, or TAP.
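To make the contract concrete, here is a sketch of an in-memory implementation. It copies the method set shown above but substitutes hypothetical local types (`runResult`, `resultSummary`) for engine.RunResult and ResultSummary so that it compiles on its own; it is not one of the package's repositories.

```go
package main

import (
	"errors"
	"fmt"
)

// runResult and resultSummary are hypothetical stand-ins for
// engine.RunResult and ResultSummary, reduced to one field each.
type runResult struct{ RunID string }
type resultSummary struct{ TotalTests int }

// resultRepository copies the documented method set onto the stand-ins.
type resultRepository interface {
	SaveResults(results []runResult) error
	SaveSummary(summary *resultSummary) error
	LoadResults() ([]runResult, error)
	SupportsStreaming() bool
	SaveResult(result *runResult) error
}

// memoryRepository keeps everything in memory, so it can support both
// loading and streaming.
type memoryRepository struct {
	results []runResult
	summary *resultSummary
}

// Compile-time check that memoryRepository satisfies the interface.
var _ resultRepository = (*memoryRepository)(nil)

func (m *memoryRepository) SaveResults(results []runResult) error {
	m.results = append(m.results, results...)
	return nil
}

func (m *memoryRepository) SaveSummary(summary *resultSummary) error {
	if summary == nil {
		return errors.New("nil summary")
	}
	m.summary = summary
	return nil
}

func (m *memoryRepository) LoadResults() ([]runResult, error) { return m.results, nil }

func (m *memoryRepository) SupportsStreaming() bool { return true }

func (m *memoryRepository) SaveResult(result *runResult) error {
	if result == nil {
		return errors.New("nil result")
	}
	m.results = append(m.results, *result)
	return nil
}

func main() {
	var repo resultRepository = &memoryRepository{}
	_ = repo.SaveResults([]runResult{{RunID: "run-1"}})
	_ = repo.SaveResult(&runResult{RunID: "run-2"})
	loaded, _ := repo.LoadResults()
	fmt.Println(len(loaded), repo.SupportsStreaming()) // prints "2 true"
}
```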
type ResultSummary ¶
type ResultSummary struct {
// Test execution counts
TotalTests int `json:"total_tests"`
Passed int `json:"passed"`
Failed int `json:"failed"`
Errors int `json:"errors"`
Skipped int `json:"skipped"`
// Performance metrics
TotalDuration time.Duration `json:"total_duration"`
AverageCost float64 `json:"average_cost"`
TotalCost float64 `json:"total_cost"`
TotalTokens int `json:"total_tokens"`
// Execution metadata
Timestamp time.Time `json:"timestamp"`
ConfigFile string `json:"config_file"`
// CI/CD integration metadata (optional)
GitCommit string `json:"git_commit,omitempty"`
GitBranch string `json:"git_branch,omitempty"`
CIBuildID string `json:"ci_build_id,omitempty"`
CIJobURL string `json:"ci_job_url,omitempty"`
// Arena-specific metadata
RunIDs []string `json:"run_ids"`
PromptPacks []string `json:"prompt_packs"`
Scenarios []string `json:"scenarios"`
Providers []string `json:"providers"`
Regions []string `json:"regions"`
}
ResultSummary contains aggregate information about test runs. It provides metadata and statistics that can be used across different output formats.
type SummaryBuilder ¶
type SummaryBuilder struct {
// contains filtered or unexported fields
}
SummaryBuilder helps build ResultSummary from RunResult slices
func NewSummaryBuilder ¶
func NewSummaryBuilder(configFile string) *SummaryBuilder
NewSummaryBuilder creates a new summary builder
func (*SummaryBuilder) BuildSummary ¶
func (b *SummaryBuilder) BuildSummary(results []engine.RunResult) *ResultSummary
BuildSummary creates a ResultSummary from the provided results
func (*SummaryBuilder) SetCIMetadata ¶
func (b *SummaryBuilder) SetCIMetadata(buildID, jobURL string) *SummaryBuilder
SetCIMetadata sets CI/CD-related metadata
func (*SummaryBuilder) SetGitMetadata ¶
func (b *SummaryBuilder) SetGitMetadata(commit, branch string) *SummaryBuilder
SetGitMetadata sets Git-related metadata for CI/CD integration
func (*SummaryBuilder) SetTimestamp ¶
func (b *SummaryBuilder) SetTimestamp(ts time.Time) *SummaryBuilder
SetTimestamp sets a custom timestamp for the summary
type UnsupportedOperationError ¶
UnsupportedOperationError represents an operation not supported by a repository
func NewUnsupportedOperationError ¶
func NewUnsupportedOperationError(operation, reason string) *UnsupportedOperationError
NewUnsupportedOperationError creates a new unsupported operation error
func (*UnsupportedOperationError) Error ¶
func (e *UnsupportedOperationError) Error() string
Error implements the error interface
type ValidationError ¶
ValidationError represents a validation failure in result processing
func NewValidationError ¶
func NewValidationError(field string, value interface{}, message string) *ValidationError
NewValidationError creates a new validation error
func (*ValidationError) Error ¶
func (e *ValidationError) Error() string
Error implements the error interface
Directories ¶
| Path | Synopsis |
|---|---|
| html | Package html implements the ResultRepository interface for HTML report generation. |
| json | Package json provides JSON file-based result storage for Arena. |
| junit | Package junit provides JUnit XML result output for Arena. |
| markdown | Package markdown provides Markdown file-based result storage for Arena. |