eval

package
v0.9.11

Note: This package is not in the latest version of its module.
Published: Jan 27, 2026 License: MIT Imports: 14 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewBenchmarkCommand (added in v0.9.10)

func NewBenchmarkCommand() *cobra.Command

NewBenchmarkCommand creates the benchmark command.

func NewCreateEvaluatorCommand (added in v0.9.10)

func NewCreateEvaluatorCommand() *cobra.Command

NewCreateEvaluatorCommand creates a command that creates a new evaluator from a template.

func NewDatasetsCommand

func NewDatasetsCommand() *cobra.Command

NewDatasetsCommand creates the datasets command.

func NewDatasetsCreateCommand

func NewDatasetsCreateCommand() *cobra.Command

NewDatasetsCreateCommand creates the datasets create command.

func NewDatasetsListCommand

func NewDatasetsListCommand() *cobra.Command

NewDatasetsListCommand creates the datasets list command.

func NewDatasetsShowCommand

func NewDatasetsShowCommand() *cobra.Command

NewDatasetsShowCommand creates the datasets show command.

func NewDatasetsValidateCommand

func NewDatasetsValidateCommand() *cobra.Command

NewDatasetsValidateCommand creates the datasets validate command.

func NewEvalCommand

func NewEvalCommand() *cobra.Command

NewEvalCommand creates the eval command.

func NewListCommand

func NewListCommand() *cobra.Command

NewListCommand creates the eval list command.

func NewListEvaluatorsCommand (added in v0.9.10)

func NewListEvaluatorsCommand() *cobra.Command

NewListEvaluatorsCommand creates a command that lists the available custom evaluators.

func NewRunCommand

func NewRunCommand() *cobra.Command

NewRunCommand creates the eval run command.

func NewShowCommand

func NewShowCommand() *cobra.Command

NewShowCommand creates the eval show command.

func NewValidateEvaluatorCommand (added in v0.9.10)

func NewValidateEvaluatorCommand() *cobra.Command

NewValidateEvaluatorCommand creates a command that validates an evaluator definition.

Types

type BenchmarkComparison (added in v0.9.10)

type BenchmarkComparison struct {
	DatasetName string
	Collection  string
	Results     []BenchmarkResult
}

BenchmarkComparison stores a comparison of benchmark results across multiple agents.

type BenchmarkResult (added in v0.9.10)

type BenchmarkResult struct {
	AgentName string
	RunID     string
	Summary   evaluation.EvaluationSummary
}

BenchmarkResult stores the results for one agent.
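
The two types above nest naturally: a BenchmarkComparison holds one BenchmarkResult per agent. The following is a minimal sketch of how such a comparison might be populated and ranked. It is not taken from the package: the fields of evaluation.EvaluationSummary are not shown in this documentation, so the sketch uses a local stand-in struct with a hypothetical PassRate field, and rankAgents is a hypothetical helper written for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// EvaluationSummary is a local stand-in for evaluation.EvaluationSummary.
// The real type's fields are not documented here; PassRate is an assumption.
type EvaluationSummary struct {
	PassRate float64 // hypothetical field, for illustration only
}

// BenchmarkResult mirrors the documented struct, using the stand-in summary.
type BenchmarkResult struct {
	AgentName string
	RunID     string
	Summary   EvaluationSummary
}

// BenchmarkComparison mirrors the documented struct.
type BenchmarkComparison struct {
	DatasetName string
	Collection  string
	Results     []BenchmarkResult
}

// rankAgents (hypothetical helper) returns agent names ordered by the
// assumed PassRate, highest first, without modifying c.Results.
func rankAgents(c BenchmarkComparison) []string {
	sorted := make([]BenchmarkResult, len(c.Results))
	copy(sorted, c.Results)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Summary.PassRate > sorted[j].Summary.PassRate
	})
	names := make([]string, 0, len(sorted))
	for _, r := range sorted {
		names = append(names, r.AgentName)
	}
	return names
}

func main() {
	// All values below are made-up sample data.
	cmp := BenchmarkComparison{
		DatasetName: "sample-dataset",
		Collection:  "sample-collection",
		Results: []BenchmarkResult{
			{AgentName: "agent-a", RunID: "run-1", Summary: EvaluationSummary{PassRate: 0.5}},
			{AgentName: "agent-b", RunID: "run-2", Summary: EvaluationSummary{PassRate: 0.9}},
		},
	}
	fmt.Println(cmp.DatasetName, rankAgents(cmp))
}
```

Sorting a copy keeps the order of Results intact, which may matter if that order reflects the order in which the agents were run.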
