Documentation
Overview
Package criteria parses and evaluates success criteria from Errata recipe files.
Criteria are defined as bullet points in the recipe's ## Success Criteria section and evaluated against each model's ModelResponse after a headless task run.
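To illustrate the bullet-point convention described above, here is a minimal sketch that collects the bullets under a "## Success Criteria" heading. The helper name extractCriteria, the accepted bullet markers, and the sample recipe text are assumptions made for illustration; they are not the package's actual parser or recipe syntax.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// extractCriteria is a hypothetical helper, not part of the package API.
// It returns the text of each bullet found under "## Success Criteria".
func extractCriteria(recipe string) []string {
	var out []string
	inSection := false
	sc := bufio.NewScanner(strings.NewReader(recipe))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case strings.HasPrefix(line, "## "):
			// Every level-2 heading either opens or closes the section.
			title := strings.TrimSpace(line[3:])
			inSection = strings.EqualFold(title, "Success Criteria")
		case inSection && (strings.HasPrefix(line, "- ") || strings.HasPrefix(line, "* ")):
			out = append(out, strings.TrimSpace(line[2:]))
		}
	}
	return out
}

func main() {
	// The bullet texts below are placeholders, not confirmed recipe syntax.
	recipe := "# Demo\n\n## Task\nDo something.\n\n" +
		"## Success Criteria\n- no_errors\n- has_writes\n\n" +
		"## Notes\n- not a criterion\n"
	for _, c := range extractCriteria(recipe) {
		fmt.Println(c)
	}
}
```

Bullets under any other heading are ignored, which is why the "Notes" bullet in the sample does not appear in the output.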
Constants
This section is empty.
Variables
This section is empty.
Functions
Types
type Criterion
type Criterion struct {
    Raw     string // original string from the recipe
    Type    string // "no_errors" | "has_writes" | "contains" | "files_written" | "run" | "max_cost" | "max_latency" | "tool_used" | "max_tool_calls" | "protected" | "unknown"
    Arg     string // comparison value when applicable
    Timeout int    // seconds; used only by "run" type (0 = default 60s)
}
Criterion is a single parsed success criterion.
type EvalContext
type EvalContext struct {
    WorkDir string // absolute path to model's worktree; "" if unavailable
}
EvalContext provides environmental data for criterion evaluation.
type Result
type Result struct {
    Criterion string `json:"criterion"`
    Passed    bool   `json:"passed"`
    Detail    string `json:"detail,omitempty"`
}
Result is the evaluation of one criterion against one model response.
func Evaluate
func Evaluate(criteria []Criterion, resp models.ModelResponse, ectx EvalContext) []Result
Evaluate runs all criteria against a single ModelResponse and returns the results.
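Since Evaluate returns one Result per criterion, a caller will often want to reduce that slice to a pass count and a list of failures. The following is a sketch of such a reduction. It mirrors Result locally rather than importing the package, and summarize is a hypothetical helper, not something the package is known to provide.

```go
package main

import "fmt"

// Result mirrors the documented struct so this sketch is self-contained.
type Result struct {
	Criterion string `json:"criterion"`
	Passed    bool   `json:"passed"`
	Detail    string `json:"detail,omitempty"`
}

// summarize is a hypothetical helper: it counts passing results and
// collects the Criterion text of each failing one.
func summarize(results []Result) (passed int, failed []string) {
	for _, r := range results {
		if r.Passed {
			passed++
		} else {
			failed = append(failed, r.Criterion)
		}
	}
	return passed, failed
}

func main() {
	// Stand-in for the slice that Evaluate would return.
	results := []Result{
		{Criterion: "no_errors", Passed: true},
		{Criterion: "has_writes", Passed: false, Detail: "no files written"},
	}
	passed, failed := summarize(results)
	fmt.Printf("%d of %d passed; failed: %v\n", passed, len(results), failed)
}
```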
func RedactSensitiveDetails
RedactSensitiveDetails returns a copy of results with Detail cleared for criteria whose output may contain sensitive data, such as error messages or command output. Detail is preserved for safe criteria (max_cost, has_writes, etc.).
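The copy-and-clear behavior described above can be sketched as follows. This is not the package's implementation: the function name redact, the isSafe callback, and the way a criterion is classified as safe are all assumptions, because the documentation does not say how the real function decides.

```go
package main

import (
	"fmt"
	"strings"
)

// Result mirrors the documented struct so this sketch is self-contained.
type Result struct {
	Criterion string `json:"criterion"`
	Passed    bool   `json:"passed"`
	Detail    string `json:"detail,omitempty"`
}

// redact is a hypothetical stand-in for RedactSensitiveDetails. It copies
// the slice and clears Detail wherever isSafe reports false, leaving the
// caller's slice untouched.
func redact(results []Result, isSafe func(criterion string) bool) []Result {
	out := make([]Result, len(results))
	copy(out, results)
	for i := range out {
		if !isSafe(out[i].Criterion) {
			out[i].Detail = ""
		}
	}
	return out
}

func main() {
	// Assumed classification rule: only these two prefixes count as safe.
	isSafe := func(c string) bool {
		return strings.HasPrefix(c, "max_cost") || strings.HasPrefix(c, "has_writes")
	}
	in := []Result{
		{Criterion: "max_cost", Passed: true, Detail: "within budget"},
		{Criterion: "run", Passed: false, Detail: "command output here"},
	}
	for _, r := range redact(in, isSafe) {
		fmt.Printf("%s detail=%q\n", r.Criterion, r.Detail)
	}
}
```

Returning a copy rather than mutating in place means the unredacted results remain available to callers that are allowed to see them.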