Documentation ¶
Index ¶
- func NewBenchmarkCommand() *cobra.Command
- func NewCreateEvaluatorCommand() *cobra.Command
- func NewDatasetsCommand() *cobra.Command
- func NewDatasetsCreateCommand() *cobra.Command
- func NewDatasetsListCommand() *cobra.Command
- func NewDatasetsShowCommand() *cobra.Command
- func NewDatasetsValidateCommand() *cobra.Command
- func NewEvalCommand() *cobra.Command
- func NewListCommand() *cobra.Command
- func NewListEvaluatorsCommand() *cobra.Command
- func NewRunCommand() *cobra.Command
- func NewShowCommand() *cobra.Command
- func NewValidateEvaluatorCommand() *cobra.Command
- type BenchmarkComparison
- type BenchmarkResult
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func NewBenchmarkCommand ¶ added in v0.9.10
NewBenchmarkCommand creates the benchmark command
func NewCreateEvaluatorCommand ¶ added in v0.9.10
NewCreateEvaluatorCommand creates a command to create a new evaluator from a template
func NewDatasetsCommand ¶
NewDatasetsCommand creates the datasets command
func NewDatasetsCreateCommand ¶
NewDatasetsCreateCommand creates the datasets create command
func NewDatasetsListCommand ¶
NewDatasetsListCommand creates the datasets list command
func NewDatasetsShowCommand ¶
NewDatasetsShowCommand creates the datasets show command
func NewDatasetsValidateCommand ¶
NewDatasetsValidateCommand creates the datasets validate command
func NewListCommand ¶
NewListCommand creates the eval list command
func NewListEvaluatorsCommand ¶ added in v0.9.10
NewListEvaluatorsCommand creates a command to list available custom evaluators
func NewShowCommand ¶
NewShowCommand creates the eval show command
func NewValidateEvaluatorCommand ¶ added in v0.9.10
NewValidateEvaluatorCommand creates a command to validate an evaluator definition
Types ¶
type BenchmarkComparison ¶ added in v0.9.10
type BenchmarkComparison struct {
	DatasetName string
	Collection  string
	Results     []BenchmarkResult
}
BenchmarkComparison stores a comparison of results across multiple agents
type BenchmarkResult ¶ added in v0.9.10
type BenchmarkResult struct {
	AgentName string
	RunID     string
	Summary   evaluation.EvaluationSummary
}
BenchmarkResult stores results for one agent
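The two types above fit together as one comparison holding one result per agent. The following is a minimal sketch of how such a comparison might be ranked, not the package's own logic. The fields of evaluation.EvaluationSummary are not shown in this documentation, so the EvaluationSummary type below is a hypothetical stand-in with assumed Total and Passed fields. The passRate and rankByPassRate helpers, along with the dataset, agent, and run names, are likewise illustrative only.

```go
package main

import (
	"fmt"
	"sort"
)

// EvaluationSummary is a HYPOTHETICAL stand-in for evaluation.EvaluationSummary.
// The real type's fields are not documented here; Total and Passed are assumptions.
type EvaluationSummary struct {
	Total  int // assumed: number of evaluated cases
	Passed int // assumed: number of cases that passed
}

// BenchmarkResult mirrors the documented struct, using the stand-in summary.
type BenchmarkResult struct {
	AgentName string
	RunID     string
	Summary   EvaluationSummary
}

// BenchmarkComparison mirrors the documented struct.
type BenchmarkComparison struct {
	DatasetName string
	Collection  string
	Results     []BenchmarkResult
}

// passRate returns Passed/Total, or 0 when no cases were evaluated.
func passRate(s EvaluationSummary) float64 {
	if s.Total == 0 {
		return 0
	}
	return float64(s.Passed) / float64(s.Total)
}

// rankByPassRate returns a copy of the results sorted best-first,
// leaving the comparison's own Results slice untouched.
func rankByPassRate(c BenchmarkComparison) []BenchmarkResult {
	ranked := make([]BenchmarkResult, len(c.Results))
	copy(ranked, c.Results)
	sort.SliceStable(ranked, func(i, j int) bool {
		return passRate(ranked[i].Summary) > passRate(ranked[j].Summary)
	})
	return ranked
}

func main() {
	cmp := BenchmarkComparison{
		DatasetName: "example-dataset",
		Collection:  "example-collection",
		Results: []BenchmarkResult{
			{AgentName: "agent-a", RunID: "run-1", Summary: EvaluationSummary{Total: 10, Passed: 6}},
			{AgentName: "agent-b", RunID: "run-2", Summary: EvaluationSummary{Total: 10, Passed: 9}},
		},
	}
	for i, r := range rankByPassRate(cmp) {
		fmt.Printf("%d. %s (%s): %.0f%%\n", i+1, r.AgentName, r.RunID, passRate(r.Summary)*100)
	}
}
```

Sorting a copy keeps the original run order available in Results, and the stable sort preserves that order among agents with equal pass rates.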