criteria

package
v0.1.5 Latest
Warning

This package is not in the latest version of its module.
Published: Mar 23, 2026 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Overview

Package criteria parses and evaluates success criteria from Errata recipe files.

Criteria are defined as bullet points in the recipe's ## Success Criteria section and evaluated against each model's ModelResponse after a headless task run.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func PassCount

func PassCount(results []Result) int

PassCount returns how many results passed.
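
The counting described above can be sketched as follows. This is a minimal stand-in, not the package's source: `passCount` is a hypothetical local helper, `Result` is redeclared so the snippet compiles on its own, and the criterion strings are illustrative placeholders.

```go
package main

import "fmt"

// Result mirrors the documented struct; redeclared here so the sketch is self-contained.
type Result struct {
	Criterion string `json:"criterion"`
	Passed    bool   `json:"passed"`
	Detail    string `json:"detail,omitempty"`
}

// passCount is a hypothetical stand-in for PassCount:
// it counts the entries whose Passed field is true.
func passCount(results []Result) int {
	n := 0
	for _, r := range results {
		if r.Passed {
			n++
		}
	}
	return n
}

func main() {
	// Criterion values below are placeholders, not real recipe syntax.
	results := []Result{
		{Criterion: "first criterion", Passed: true},
		{Criterion: "second criterion", Passed: false, Detail: "over budget"},
		{Criterion: "third criterion", Passed: true},
	}
	fmt.Printf("%d/%d criteria passed\n", passCount(results), len(results)) // prints "2/3 criteria passed"
}
```

A summary line such as "2/3 criteria passed" is one plausible use of the count when reporting per-model results.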

func TailLines

func TailLines(s string, n int) string

TailLines returns the last n lines of s. If s has more than n lines, the result is prefixed with a truncation notice.
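
A rough sketch of this behaviour is shown below. The wording of the truncation notice, the handling of `n <= 0`, and the treatment of a trailing newline are not documented, so the choices here are assumptions; `tailLines` is a hypothetical local helper, not the package function.

```go
package main

import (
	"fmt"
	"strings"
)

// tailLines is a hypothetical stand-in for TailLines.
// Assumptions: lines are separated by "\n", n <= 0 yields an empty string,
// and the notice format is invented for illustration.
func tailLines(s string, n int) string {
	if n <= 0 {
		return ""
	}
	lines := strings.Split(s, "\n")
	if len(lines) <= n {
		return s // nothing to truncate
	}
	omitted := len(lines) - n
	tail := strings.Join(lines[len(lines)-n:], "\n")
	return fmt.Sprintf("... (%d lines truncated)\n%s", omitted, tail)
}

func main() {
	out := "step 1\nstep 2\nstep 3\nFAIL"
	fmt.Println(tailLines(out, 2))
	// prints:
	// ... (2 lines truncated)
	// step 3
	// FAIL
}
```

Keeping only the tail is useful for command output, where the last few lines usually carry the failure message.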

Types

type Criterion

type Criterion struct {
	Raw     string // original string from the recipe
	Type    string // "no_errors" | "has_writes" | "contains" | "files_written" | "run" | "max_cost" | "max_latency" | "tool_used" | "max_tool_calls" | "protected" | "unknown"
	Arg     string // comparison value when applicable
	Timeout int    // seconds; used only by "run" type (0 = default 60s)
}

Criterion is a single parsed success criterion.
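
The `Timeout` comment says that 0 means a default of 60 seconds and that the field applies only to the "run" type. One way a caller might resolve the effective timeout is sketched below; `runTimeout` and `defaultRunTimeout` are hypothetical names, and the `Arg` value is an illustrative command, not documented syntax.

```go
package main

import (
	"fmt"
	"time"
)

// Criterion mirrors the documented struct; redeclared so the sketch is self-contained.
type Criterion struct {
	Raw     string
	Type    string
	Arg     string
	Timeout int // seconds; 0 = default 60s, per the field comment
}

// defaultRunTimeout reflects the documented 60s default (the constant name is an assumption).
const defaultRunTimeout = 60 * time.Second

// runTimeout is a hypothetical helper: it returns the effective timeout for a
// "run" criterion and 0 for every other type, which ignores the field.
func runTimeout(c Criterion) time.Duration {
	if c.Type != "run" {
		return 0
	}
	if c.Timeout <= 0 {
		return defaultRunTimeout
	}
	return time.Duration(c.Timeout) * time.Second
}

func main() {
	fmt.Println(runTimeout(Criterion{Type: "run", Arg: "go test ./..."}))              // prints "1m0s"
	fmt.Println(runTimeout(Criterion{Type: "run", Arg: "go test ./...", Timeout: 5})) // prints "5s"
	fmt.Println(runTimeout(Criterion{Type: "contains", Arg: "ok", Timeout: 5}))       // prints "0s"
}
```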

func Parse

func Parse(raw []string) []Criterion

Parse converts raw criterion strings (from a recipe) into typed Criterion values. Unknown formats are returned with Type "unknown" and will always pass evaluation.
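
Because criteria of Type "unknown" always pass, a typo in a recipe could go unnoticed. A caller might therefore want to list them after parsing. The sketch below assumes hand-built `Criterion` values in place of real `Parse` output, since the recipe syntax is not documented here; `unknownCriteria` is a hypothetical helper and the `Raw` strings are placeholders.

```go
package main

import "fmt"

// Criterion mirrors the documented struct; redeclared so the sketch is self-contained.
type Criterion struct {
	Raw     string
	Type    string
	Arg     string
	Timeout int
}

// unknownCriteria is a hypothetical helper that collects the Raw text of
// every criterion whose Type is "unknown", so it can be reported to the user.
func unknownCriteria(cs []Criterion) []string {
	var raws []string
	for _, c := range cs {
		if c.Type == "unknown" {
			raws = append(raws, c.Raw)
		}
	}
	return raws
}

func main() {
	// Stand-ins for what Parse might return; the Raw strings are placeholders.
	parsed := []Criterion{
		{Raw: "placeholder A", Type: "no_errors"},
		{Raw: "placeholder B", Type: "unknown"},
	}
	for _, raw := range unknownCriteria(parsed) {
		fmt.Printf("warning: unrecognised criterion %q will always pass\n", raw)
	}
}
```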

type EvalContext

type EvalContext struct {
	WorkDir string // absolute path to model's worktree; "" if unavailable
}

EvalContext provides environmental data for criterion evaluation.

type Result

type Result struct {
	Criterion string `json:"criterion"`
	Passed    bool   `json:"passed"`
	Detail    string `json:"detail,omitempty"`
}

Result is the evaluation of one criterion against one model response.
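
The struct tags determine the JSON shape: `detail` is dropped when empty because of `omitempty`. The sketch below redeclares `Result` locally and uses a hypothetical `encode` helper together with placeholder criterion strings to show both cases.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result mirrors the documented struct, including its JSON tags.
type Result struct {
	Criterion string `json:"criterion"`
	Passed    bool   `json:"passed"`
	Detail    string `json:"detail,omitempty"`
}

// encode is a hypothetical helper that marshals one Result to a JSON string.
// It returns "" on error, which cannot occur for this plain struct.
func encode(r Result) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(Result{Criterion: "example", Passed: true}))
	// prints {"criterion":"example","passed":true}
	fmt.Println(encode(Result{Criterion: "example", Passed: false, Detail: "2 errors"}))
	// prints {"criterion":"example","passed":false,"detail":"2 errors"}
}
```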

func Evaluate

func Evaluate(criteria []Criterion, resp models.ModelResponse, ectx EvalContext) []Result

Evaluate runs all criteria against a single ModelResponse and returns the results.

func RedactSensitiveDetails

func RedactSensitiveDetails(results []Result) []Result

RedactSensitiveDetails returns a copy of results with Detail cleared for criteria whose output may contain sensitive data (error messages, command output). The Detail of safe criteria (max_cost, has_writes, etc.) is preserved.
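
The copy-then-clear behaviour can be pictured with the sketch below. How the package decides which criteria are sensitive is not documented, so the predicate `looksSensitive` is purely an assumption, as are the helper name `redact` and the criterion strings.

```go
package main

import (
	"fmt"
	"strings"
)

// Result mirrors the documented struct; redeclared so the sketch is self-contained.
type Result struct {
	Criterion string `json:"criterion"`
	Passed    bool   `json:"passed"`
	Detail    string `json:"detail,omitempty"`
}

// looksSensitive is a hypothetical classifier: it treats any criterion whose
// text starts with "run" as one that may carry command output.
func looksSensitive(r Result) bool {
	return strings.HasPrefix(r.Criterion, "run")
}

// redact is a hypothetical stand-in for RedactSensitiveDetails. It copies the
// slice first, so the caller's results are left untouched.
func redact(results []Result, sensitive func(Result) bool) []Result {
	out := make([]Result, len(results))
	copy(out, results)
	for i := range out {
		if sensitive(out[i]) {
			out[i].Detail = ""
		}
	}
	return out
}

func main() {
	in := []Result{
		{Criterion: "run placeholder", Passed: false, Detail: "token=abc123 leaked in output"},
		{Criterion: "max_cost placeholder", Passed: true, Detail: "cost within limit"},
	}
	out := redact(in, looksSensitive)
	fmt.Printf("%q %q\n", out[0].Detail, out[1].Detail) // prints "" "cost within limit"
	fmt.Printf("%q\n", in[0].Detail)                    // original is unchanged
}
```

Returning a copy lets the full results stay available for local inspection while only the redacted version is shared or persisted.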
