metrics

package
v0.28.0 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 21, 2026 License: MIT Imports: 2 Imported by: 0

Documentation


Constants

This section is empty.

Variables

This section is empty.

Functions

func ConfidenceInterval95

func ConfidenceInterval95(values []float64) (float64, float64)

ConfidenceInterval95 returns the 95% confidence interval (low, high) using the normal approximation (z=1.96). Returns (mean, mean) when fewer than 2 data points are available.
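The normal approximation described above amounts to mean ± 1.96 · σ/√n. The following is a minimal sketch, not the package's source: confidenceInterval95 is a hypothetical stand-in, and it assumes the standard error is built from the population standard deviation (dividing by n), which the documentation does not confirm.

```go
package main

import "math"

// confidenceInterval95 is a hypothetical stand-in for ConfidenceInterval95.
// Assumption: the standard error uses the population standard deviation
// (divide by n); the real function may use the sample form (n-1) instead.
func confidenceInterval95(values []float64) (low, high float64) {
	if len(values) == 0 {
		// Mean is documented to return 0 for empty input, so (mean, mean) is (0, 0).
		return 0, 0
	}
	n := float64(len(values))
	mean := 0.0
	for _, v := range values {
		mean += v
	}
	mean /= n
	if len(values) < 2 {
		// Documented behavior: fewer than 2 data points yields (mean, mean).
		return mean, mean
	}
	sumSq := 0.0
	for _, v := range values {
		d := v - mean
		sumSq += d * d
	}
	stdDev := math.Sqrt(sumSq / n)
	margin := 1.96 * stdDev / math.Sqrt(n) // z = 1.96 for a 95% interval
	return mean - margin, mean + margin
}
```

For the eight values 2, 4, 4, 4, 5, 5, 7, 9 (mean 5, population standard deviation 2), this sketch gives a margin of 1.96 · 2/√8, roughly 1.386, so the interval is approximately (3.614, 6.386).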

func IsFlaky

func IsFlaky(passRate float64) bool

IsFlaky returns true when the pass rate is strictly between 0 and 1, meaning the task sometimes passes and sometimes fails.
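The documented rule is a strict double inequality. As a minimal sketch (isFlaky is a hypothetical stand-in, not the package's source):

```go
package main

// isFlaky is a hypothetical stand-in for IsFlaky: it reports whether the
// pass rate lies strictly between 0 and 1.
func isFlaky(passRate float64) bool {
	return passRate > 0 && passRate < 1
}
```

Under this rule a task that always fails (0) or always passes (1) is not flaky, while any rate in between, such as 0.5, is.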

func Mean

func Mean(values []float64) float64

Mean computes the arithmetic mean of a float64 slice. Returns 0 for empty input.
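A minimal sketch of the documented behavior, including the empty-input case. The name mean is a hypothetical stand-in; the package's actual implementation is not shown on this page.

```go
package main

// mean is a hypothetical stand-in for Mean: the arithmetic mean of values,
// returning 0 for empty input as the documentation states.
func mean(values []float64) float64 {
	if len(values) == 0 {
		return 0 // avoids dividing by zero
	}
	sum := 0.0
	for _, v := range values {
		sum += v
	}
	return sum / float64(len(values))
}
```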

func StdDev

func StdDev(values []float64) float64

StdDev computes the population standard deviation.

func Variance

func Variance(values []float64) float64

Variance computes the population variance of a float64 slice. Returns 0 for empty input.
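"Population" here means the sum of squared deviations is divided by n rather than by n-1 as in the sample form. The sketch below illustrates that definition with hypothetical stand-ins variance and stdDev. It assumes StdDev is the square root of Variance, which follows from the definition of population standard deviation but is not confirmed by the package source.

```go
package main

import "math"

// variance is a hypothetical stand-in for Variance: the population variance
// (divide by n, not n-1), returning 0 for empty input.
func variance(values []float64) float64 {
	if len(values) == 0 {
		return 0
	}
	n := float64(len(values))
	mean := 0.0
	for _, v := range values {
		mean += v
	}
	mean /= n
	sumSq := 0.0
	for _, v := range values {
		d := v - mean
		sumSq += d * d
	}
	return sumSq / n
}

// stdDev is a hypothetical stand-in for StdDev.
// Assumption: it is the square root of the population variance.
func stdDev(values []float64) float64 {
	return math.Sqrt(variance(values))
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5, the squared deviations sum to 32, and so the population variance is 32/8 = 4 and the standard deviation is 2.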

Types

type BehaviorMetrics

type BehaviorMetrics struct {
	ToolCallCount           int      `json:"tool_call_count"`
	IterationCount          int      `json:"iteration_count"`
	MaxToolCallsAllowed     int      `json:"max_tool_calls_allowed,omitempty"`
	MaxToolCallsPassed      bool     `json:"max_tool_calls_passed"`
	MaxIterations           int      `json:"max_iterations,omitempty"`
	MaxIterationsPassed     bool     `json:"max_iterations_passed"`
	MaxResponseTimeMs       int64    `json:"max_response_time_ms,omitempty"`
	ActualResponseTimeMs    int64    `json:"actual_response_time_ms"`
	MaxResponseTimeMsPassed bool     `json:"max_response_time_ms_passed"`
	RequiredTools           []string `json:"required_tools,omitempty"`
	RequiredToolsUsed       []string `json:"required_tools_used,omitempty"`
	RequiredToolsMissed     []string `json:"required_tools_missed,omitempty"`
	RequiredToolsPassed     bool     `json:"required_tools_passed"`
	ForbiddenTools          []string `json:"forbidden_tools,omitempty"`
	ForbiddenToolsUsed      []string `json:"forbidden_tools_used,omitempty"`
	ForbiddenToolsPassed    bool     `json:"forbidden_tools_passed"`
	EfficiencyScore         float64  `json:"efficiency_score"`
}

BehaviorMetrics captures quality metrics for agent behavior during a run.
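The struct tags determine how a BehaviorMetrics value serializes: fields tagged omitempty disappear from the JSON when they hold a zero value or an empty slice, while the untagged booleans and counters are always written. The sketch below shows this with a hypothetical local type, metricsSubset, that copies five of the fields and their tags; it is an illustration, not the package's type.

```go
package main

import "encoding/json"

// metricsSubset is a hypothetical local copy of a few BehaviorMetrics fields,
// with the same JSON tags as in the type definition above.
type metricsSubset struct {
	ToolCallCount       int      `json:"tool_call_count"`
	MaxToolCallsAllowed int      `json:"max_tool_calls_allowed,omitempty"`
	MaxToolCallsPassed  bool     `json:"max_tool_calls_passed"`
	RequiredTools       []string `json:"required_tools,omitempty"`
	EfficiencyScore     float64  `json:"efficiency_score"`
}

// encode marshals the subset to a JSON string.
func encode(m metricsSubset) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Encoding metricsSubset{ToolCallCount: 3, MaxToolCallsPassed: true, EfficiencyScore: 0.75} yields {"tool_call_count":3,"max_tool_calls_passed":true,"efficiency_score":0.75}. The zero limit and the nil tool list are omitted, so a consumer cannot distinguish a limit of 0 from a missing one.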

func ComputeBehaviorMetrics

func ComputeBehaviorMetrics(run *models.RunResult, rules *models.BehaviorRules) *BehaviorMetrics

ComputeBehaviorMetrics analyzes a RunResult against BehaviorRules and returns quality metrics including compliance checks and an efficiency score.
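The page does not show how the compliance checks are computed, and the fields of models.RunResult and models.BehaviorRules are not documented here. As a rough illustration only, the field names RequiredToolsUsed, RequiredToolsMissed, and ForbiddenToolsUsed suggest set comparisons between the tools a run called and the tools the rules list. The helpers below are hypothetical and operate on plain string slices rather than the real model types.

```go
package main

// partitionRequired is a hypothetical helper: it splits the required tool
// names into those that appear among the called tools and those that do not.
func partitionRequired(required, called []string) (used, missed []string) {
	seen := make(map[string]bool, len(called))
	for _, name := range called {
		seen[name] = true
	}
	for _, name := range required {
		if seen[name] {
			used = append(used, name)
		} else {
			missed = append(missed, name)
		}
	}
	return used, missed
}

// forbiddenUsed is a hypothetical helper: it returns the forbidden tool
// names that appear among the called tools.
func forbiddenUsed(forbidden, called []string) []string {
	seen := make(map[string]bool, len(called))
	for _, name := range called {
		seen[name] = true
	}
	var hits []string
	for _, name := range forbidden {
		if seen[name] {
			hits = append(hits, name)
		}
	}
	return hits
}
```

Under this reading, a required-tools check would pass when missed is empty and a forbidden-tools check would pass when forbiddenUsed returns nothing. Whether the package defines the checks this way is an assumption.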

func (*BehaviorMetrics) AllConstraintsPassed

func (m *BehaviorMetrics) AllConstraintsPassed() bool

AllConstraintsPassed returns true when every behavioral constraint is met.
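One plausible reading of "every behavioral constraint" is the conjunction of the five *Passed fields in BehaviorMetrics. The sketch below follows that reading using a hypothetical local type, passFlags; the package's method may consider other conditions.

```go
package main

// passFlags is a hypothetical local copy of the five *Passed fields of
// BehaviorMetrics.
type passFlags struct {
	MaxToolCallsPassed      bool
	MaxIterationsPassed     bool
	MaxResponseTimeMsPassed bool
	RequiredToolsPassed     bool
	ForbiddenToolsPassed    bool
}

// allConstraintsPassed is a hypothetical stand-in for AllConstraintsPassed.
// Assumption: the result is the logical AND of the five flags.
func (p passFlags) allConstraintsPassed() bool {
	return p.MaxToolCallsPassed &&
		p.MaxIterationsPassed &&
		p.MaxResponseTimeMsPassed &&
		p.RequiredToolsPassed &&
		p.ForbiddenToolsPassed
}
```

With this definition a single failed check, such as a forbidden tool being used, makes the overall result false.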
