baseline

package

v0.21.0 (latest) Published: Mar 12, 2026 License: MIT Imports: 2 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/microsoft/waza

Documentation

Index

type BaselineResult
- func ComputeFromOutcomes(withSkill, baseline *models.TestOutcome) *BaselineResult
type ImprovementBreakdown
- func ComputeImprovement(baseline, withSkill *models.RunResult) (float64, ImprovementBreakdown)

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type BaselineResult

type BaselineResult struct {
	TaskName    string               `json:"task_name"`
	Baseline    *models.RunResult    `json:"baseline"`
	WithSkill   *models.RunResult    `json:"with_skill"`
	Improvement float64              `json:"improvement"`
	Breakdown   ImprovementBreakdown `json:"improvement_breakdown"`
}

BaselineResult pairs a task's baseline (no skill) and skill-enabled results with computed improvement metrics.
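To make the JSON shape implied by the struct tags concrete, here is a minimal sketch that marshals a result. The types below are local mirrors written for illustration, not imports from the package, and `RunResult` with its `Score` field is a hypothetical stand-in because the fields of `models.RunResult` are not shown on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RunResult is a hypothetical stand-in for models.RunResult; its real
// fields are not documented here, so Score is an assumption.
type RunResult struct {
	Score float64 `json:"score"`
}

// ImprovementBreakdown mirrors the documented struct and its JSON tags.
type ImprovementBreakdown struct {
	QualityDelta   float64 `json:"quality_delta"`
	TokenReduction float64 `json:"token_reduction"`
	TurnReduction  float64 `json:"turn_reduction"`
	TimeReduction  float64 `json:"time_reduction"`
	TaskCompletion float64 `json:"task_completion"`
}

// BaselineResult mirrors the documented struct, using the stand-in RunResult.
type BaselineResult struct {
	TaskName    string               `json:"task_name"`
	Baseline    *RunResult           `json:"baseline"`
	WithSkill   *RunResult           `json:"with_skill"`
	Improvement float64              `json:"improvement"`
	Breakdown   ImprovementBreakdown `json:"improvement_breakdown"`
}

// encode returns the compact JSON form of r.
func encode(r BaselineResult) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Both run pointers are left nil on purpose to show how they serialize.
	out, err := encode(BaselineResult{TaskName: "example-task", Improvement: 0.25})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because none of the tags use `omitempty`, a nil `Baseline` or `WithSkill` pointer is emitted as `null` rather than dropped, so consumers of the JSON should be prepared for null run entries.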

func ComputeFromOutcomes

func ComputeFromOutcomes(withSkill, baseline *models.TestOutcome) *BaselineResult

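ComputeFromOutcomes computes a BaselineResult from a paired skill-enabled TestOutcome and baseline TestOutcome. Note the argument order: withSkill comes first, the reverse of ComputeImprovement.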

type ImprovementBreakdown

type ImprovementBreakdown struct {
	QualityDelta   float64 `json:"quality_delta"`
	TokenReduction float64 `json:"token_reduction"`
	TurnReduction  float64 `json:"turn_reduction"`
	TimeReduction  float64 `json:"time_reduction"`
	TaskCompletion float64 `json:"task_completion"`
}

ImprovementBreakdown captures per-dimension deltas between baseline and skill runs. Positive values mean the skill run was better; negative means worse.

func ComputeImprovement

func ComputeImprovement(baseline, withSkill *models.RunResult) (float64, ImprovementBreakdown)

ComputeImprovement calculates the overall improvement score and a per-dimension breakdown between a baseline run (no skill) and a skill-enabled run. The score lies in [-1, 1]; a positive value means the skill helped.

Source Files

baseline.go
