Documentation ¶
Index ¶
- type AgentArtifacts
- type AgentEvaluator
- type AgentExample
- type ArtifactKey
- type ComparisonResult
- type DeterministicEvaluator
- type EvalResult
- type ExactMatchComparator
- type GEPAAdapterConfig
- type GEPAAgentOptimizer
- func (o *GEPAAgentOptimizer) CandidateArtifacts(candidate *optimizers.GEPACandidate) (AgentArtifacts, error)
- func (o *GEPAAgentOptimizer) EvaluateCandidate(ctx context.Context, candidate *optimizers.GEPACandidate, ...) (*GEPACandidateEvaluation, error)
- func (o *GEPAAgentOptimizer) MaterializeAgent(artifacts AgentArtifacts) (OptimizableAgent, error)
- func (o *GEPAAgentOptimizer) Optimize(ctx context.Context, req GEPAOptimizeRequest) (*GEPAOptimizeResult, error)
- func (o *GEPAAgentOptimizer) SeedCandidate(seed AgentArtifacts) (*optimizers.GEPACandidate, error)
- func (o *GEPAAgentOptimizer) WithFactory(factory func(AgentArtifacts) (OptimizableAgent, error)) *GEPAAgentOptimizer
- type GEPACandidateEvaluation
- type GEPAOptimizeRequest
- type GEPAOptimizeResult
- type Harness
- type HarnessExampleResult
- type HarnessRunResult
- type IntMutationConfig
- type OptimizableAgent
- type OutputComparator
- type OutputComparatorFunc
- type SideInfo
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AgentArtifacts ¶
AgentArtifacts groups mutable agent configuration surfaced to optimizers.
func (AgentArtifacts) Clone ¶
func (a AgentArtifacts) Clone() AgentArtifacts
Clone returns a deep copy of the artifact maps so parent and child candidates cannot accidentally share mutable state.
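The deep-copy guarantee can be illustrated with a minimal sketch. The fields of AgentArtifacts are not listed in this reference, so the `Artifacts` type below and its `Text` and `Int` maps are hypothetical stand-ins, not the package's actual layout.

```go
package main

import "fmt"

// Artifacts is a hypothetical stand-in for AgentArtifacts; the real field
// names are not shown in this reference.
type Artifacts struct {
	Text map[string]string
	Int  map[string]int
}

// Clone copies each map entry by entry so the copy owns its own storage.
func (a Artifacts) Clone() Artifacts {
	out := Artifacts{
		Text: make(map[string]string, len(a.Text)),
		Int:  make(map[string]int, len(a.Int)),
	}
	for k, v := range a.Text {
		out.Text[k] = v
	}
	for k, v := range a.Int {
		out.Int[k] = v
	}
	return out
}

func main() {
	parent := Artifacts{
		Text: map[string]string{"planner_prompt": "plan step by step"},
		Int:  map[string]int{"max_iterations": 4},
	}
	child := parent.Clone()
	child.Text["planner_prompt"] = "plan briefly"
	child.Int["max_iterations"] = 8

	// Edits to the child leave the parent untouched.
	fmt.Println(parent.Text["planner_prompt"], parent.Int["max_iterations"])
}
```

A plain struct assignment would copy only the map headers, so parent and child would still write to the same underlying maps; copying entry by entry is what prevents that.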
type AgentEvaluator ¶
type AgentEvaluator interface {
Evaluate(ctx context.Context, agent OptimizableAgent, ex AgentExample) (*EvalResult, error)
}
AgentEvaluator evaluates an agent on a single example and returns a score plus side information (ASI; see SideInfo).
type AgentExample ¶
type AgentExample struct {
ID string
Inputs map[string]interface{}
Outputs map[string]interface{}
Metadata map[string]interface{}
}
AgentExample describes a single evaluation task for an optimizable agent.
type ArtifactKey ¶
type ArtifactKey string
ArtifactKey identifies a mutable artifact on an optimizable agent.
const (
ArtifactSkillPack ArtifactKey = "skill_pack"
ArtifactPlannerPrompt ArtifactKey = "planner_prompt"
ArtifactToolPolicy ArtifactKey = "tool_policy"
ArtifactMemoryTemplate ArtifactKey = "memory_template"
ArtifactReflectionPrompt ArtifactKey = "reflection_prompt"
ArtifactContextPolicy ArtifactKey = "context_policy"
ArtifactRLMOuterPrompt ArtifactKey = "rlm_outer_prompt"
ArtifactRLMIterationPrompt ArtifactKey = "rlm_iteration_prompt"
)
type ComparisonResult ¶
type ComparisonResult struct {
Score float64
Scores map[string]float64
Diagnostics map[string]interface{}
PassedTests []string
FailedTests []string
}
ComparisonResult captures deterministic output-comparison details.
type DeterministicEvaluator ¶
type DeterministicEvaluator struct {
Comparator OutputComparator
}
DeterministicEvaluator runs an agent on concrete examples and attaches structured side information suitable for downstream optimization.
func NewDeterministicEvaluator ¶
func NewDeterministicEvaluator(comparator OutputComparator) *DeterministicEvaluator
NewDeterministicEvaluator builds a deterministic evaluator with the provided comparator. When comparator is nil, it falls back to exact key/value output matching.
func (*DeterministicEvaluator) Evaluate ¶
func (e *DeterministicEvaluator) Evaluate(ctx context.Context, agent OptimizableAgent, ex AgentExample) (*EvalResult, error)
Evaluate executes the agent once and converts the outcome into a score plus ASI. Agent execution failures are treated as candidate failures, not evaluator failures: a failing agent lowers the candidate's result rather than aborting the evaluation with an error.
type EvalResult ¶
EvalResult is the output of an AgentEvaluator.
type ExactMatchComparator ¶
type ExactMatchComparator struct{}
ExactMatchComparator scores outputs by exact per-key equality against the example's Outputs map. If no expected outputs are provided, any successful execution receives a score of 1.0.
Matching uses reflect.DeepEqual, so callers should expect Go's normal type sensitivity here. For example, int(1) and float64(1) do not match.
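The type sensitivity is easiest to hit when actual outputs come from decoded JSON, because encoding/json decodes numbers into float64 when the target is interface{}. The sketch below shows the effect with a hypothetical `exactMatch` helper; it is not the package's comparator, only the same reflect.DeepEqual check applied per key.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// exactMatch reports whether every expected key is present in actual with a
// reflect.DeepEqual-equal value. It is an illustrative helper only.
func exactMatch(expected, actual map[string]interface{}) bool {
	for k, want := range expected {
		got, ok := actual[k]
		if !ok || !reflect.DeepEqual(want, got) {
			return false
		}
	}
	return true
}

func main() {
	expected := map[string]interface{}{"count": 1} // int

	// JSON numbers decode to float64 when the target is interface{}.
	var actual map[string]interface{}
	if err := json.Unmarshal([]byte(`{"count": 1}`), &actual); err != nil {
		panic(err)
	}

	fmt.Println(exactMatch(expected, actual)) // false: int(1) vs float64(1)

	expected["count"] = float64(1)
	fmt.Println(exactMatch(expected, actual)) // true
}
```

When expected outputs are written as Go literals and actual outputs are decoded from JSON, declaring numeric expectations as float64 (or normalizing both sides in a custom comparator) avoids spurious mismatches.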
func (ExactMatchComparator) Compare ¶
func (ExactMatchComparator) Compare(ex AgentExample, actual map[string]interface{}) (*ComparisonResult, error)
Compare implements OutputComparator.
type GEPAAdapterConfig ¶
type GEPAAdapterConfig struct {
PopulationSize int
MaxGenerations int
ReflectionFreq int
SearchBatchSize int
StagnationLimit int
ValidationSplit float64
ArtifactKeys []ArtifactKey
EvalConcurrency int
PassThreshold float64
PrimaryArtifact ArtifactKey
IntMutationPlans map[string]IntMutationConfig
}
GEPAAdapterConfig configures the agent-to-GEPA bridge layer.
func DefaultGEPAAdapterConfig ¶
func DefaultGEPAAdapterConfig() GEPAAdapterConfig
DefaultGEPAAdapterConfig returns a conservative default adapter config.
type GEPAAgentOptimizer ¶
type GEPAAgentOptimizer struct {
// contains filtered or unexported fields
}
GEPAAgentOptimizer holds the shared translation logic between agent evaluation and GEPA's candidate, trace, and fitness types.
func NewGEPAAgentOptimizer ¶
func NewGEPAAgentOptimizer(baseAgent OptimizableAgent, evaluator AgentEvaluator, cfg GEPAAdapterConfig) *GEPAAgentOptimizer
NewGEPAAgentOptimizer creates a new adapter scaffold around an optimizable agent and evaluator.
func (*GEPAAgentOptimizer) CandidateArtifacts ¶
func (o *GEPAAgentOptimizer) CandidateArtifacts(candidate *optimizers.GEPACandidate) (AgentArtifacts, error)
CandidateArtifacts reconstructs the full artifact set represented by a GEPA candidate.
func (*GEPAAgentOptimizer) EvaluateCandidate ¶
func (o *GEPAAgentOptimizer) EvaluateCandidate(ctx context.Context, candidate *optimizers.GEPACandidate, examples []AgentExample) (*GEPACandidateEvaluation, error)
EvaluateCandidate runs the adapter's evaluator over examples and converts the result into GEPA fitness and trace records.
func (*GEPAAgentOptimizer) MaterializeAgent ¶
func (o *GEPAAgentOptimizer) MaterializeAgent(artifacts AgentArtifacts) (OptimizableAgent, error)
MaterializeAgent creates a concrete agent instance for the provided artifact set.
func (*GEPAAgentOptimizer) Optimize ¶
func (o *GEPAAgentOptimizer) Optimize(ctx context.Context, req GEPAOptimizeRequest) (*GEPAOptimizeResult, error)
Optimize runs GEPA against agent artifacts using the mainline whole-program engine.
func (*GEPAAgentOptimizer) SeedCandidate ¶
func (o *GEPAAgentOptimizer) SeedCandidate(seed AgentArtifacts) (*optimizers.GEPACandidate, error)
SeedCandidate encodes an artifact set into a GEPA candidate record.
func (*GEPAAgentOptimizer) WithFactory ¶
func (o *GEPAAgentOptimizer) WithFactory(factory func(AgentArtifacts) (OptimizableAgent, error)) *GEPAAgentOptimizer
WithFactory registers a fallback constructor used when clone-based materialization is unavailable.
type GEPACandidateEvaluation ¶
type GEPACandidateEvaluation struct {
Candidate *optimizers.GEPACandidate
Artifacts AgentArtifacts
Run *HarnessRunResult
Fitness *optimizers.MultiObjectiveFitness
Traces []optimizers.ExecutionTrace
AverageScore float64
}
GEPACandidateEvaluation captures the GEPA-shaped output of evaluating one agent candidate.
type GEPAOptimizeRequest ¶
type GEPAOptimizeRequest struct {
SeedArtifacts AgentArtifacts
TrainingExamples []AgentExample
ValidationExamples []AgentExample
ProgressReporter core.ProgressReporter
}
GEPAOptimizeRequest configures one end-to-end GEPA optimization run for an agent artifact set.
SeedArtifacts and examples are treated as trusted harness inputs. They may be embedded into model prompts during optimization, so callers should source them from trusted corpora or explicitly sanitize them before invoking Optimize.
type GEPAOptimizeResult ¶
type GEPAOptimizeResult struct {
BestCandidate *optimizers.GEPACandidate
BestArtifacts AgentArtifacts
BestValidationEvaluation *GEPACandidateEvaluation
TrainingExampleCount int
ValidationExampleCount int
OptimizationState *optimizers.GEPAState
}
GEPAOptimizeResult captures the best candidate and resulting artifacts from a GEPA run.
type Harness ¶
type Harness struct {
Evaluator AgentEvaluator
PassThreshold float64
}
Harness runs an evaluator across a fixed example set while isolating each run behind a fresh agent clone.
func (*Harness) Run ¶
func (h *Harness) Run(ctx context.Context, baseAgent OptimizableAgent, examples []AgentExample) (*HarnessRunResult, error)
Run evaluates each example sequentially using a fresh clone of the base agent.
type HarnessExampleResult ¶
type HarnessExampleResult struct {
ExampleID string
Result *EvalResult
}
HarnessExampleResult records one evaluator outcome for one example.
type HarnessRunResult ¶
type HarnessRunResult struct {
Results []HarnessExampleResult
AverageScore float64
PassedExamples int
FailedExamples int
CompletedExamples int
EvaluationErrors int
}
HarnessRunResult aggregates a deterministic evaluation run.
type IntMutationConfig ¶
IntMutationConfig defines a bounded deterministic search neighborhood for an integer artifact that GEPA itself does not mutate natively.
type OptimizableAgent ¶
type OptimizableAgent interface {
agents.Agent
GetArtifacts() AgentArtifacts
SetArtifacts(AgentArtifacts) error
Clone() (OptimizableAgent, error)
}
OptimizableAgent is a parallel interface that exposes mutable artifacts without widening the base agents.Agent contract.
Implementations that also expose LastExecutionTrace() *agents.ExecutionTrace allow evaluators to attach richer step-level side information without forcing that method into the base interface.
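Detecting an optional method like this is usually done with a type assertion against a small, locally declared interface. The sketch below uses hypothetical stand-ins (`trace`, `baseAgent`, `tracer`) rather than the real agents package types.

```go
package main

import "fmt"

// trace is a hypothetical stand-in for agents.ExecutionTrace.
type trace struct{ Steps []string }

// baseAgent is a reduced stand-in for the required agent contract.
type baseAgent interface{ Name() string }

// tracer is the optional capability, declared by the consumer.
type tracer interface{ LastExecutionTrace() *trace }

type plainAgent struct{}

func (plainAgent) Name() string { return "plain" }

type tracingAgent struct{ last *trace }

func (t *tracingAgent) Name() string               { return "tracing" }
func (t *tracingAgent) LastExecutionTrace() *trace { return t.last }

// stepCount returns the number of recorded steps, or -1 when the agent does
// not expose a trace (or exposes a nil one).
func stepCount(a baseAgent) int {
	if t, ok := a.(tracer); ok {
		if tr := t.LastExecutionTrace(); tr != nil {
			return len(tr.Steps)
		}
	}
	return -1
}

func main() {
	fmt.Println(stepCount(plainAgent{}))
	fmt.Println(stepCount(&tracingAgent{last: &trace{Steps: []string{"plan", "act"}}}))
}
```

Agents that do not implement the extra method still satisfy the required interface and simply produce less detailed side information.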
type OutputComparator ¶
type OutputComparator interface {
Compare(ex AgentExample, actual map[string]interface{}) (*ComparisonResult, error)
}
OutputComparator compares an agent execution result against an example's expectations.
type OutputComparatorFunc ¶
type OutputComparatorFunc func(ex AgentExample, actual map[string]interface{}) (*ComparisonResult, error)
OutputComparatorFunc adapts a function to the OutputComparator interface.
func (OutputComparatorFunc) Compare ¶
func (f OutputComparatorFunc) Compare(ex AgentExample, actual map[string]interface{}) (*ComparisonResult, error)
Compare implements OutputComparator.
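The function adapter follows the same pattern as http.HandlerFunc in the standard library. The sketch below redeclares the documented types locally so it compiles on its own; the body of Compare (a direct call to f) is the conventional implementation and is assumed here, and `caseInsensitiveAnswer` is a hypothetical comparator.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the documented types so the sketch is self-contained.
type AgentExample struct {
	ID       string
	Inputs   map[string]interface{}
	Outputs  map[string]interface{}
	Metadata map[string]interface{}
}

type ComparisonResult struct {
	Score       float64
	Scores      map[string]float64
	Diagnostics map[string]interface{}
	PassedTests []string
	FailedTests []string
}

type OutputComparator interface {
	Compare(ex AgentExample, actual map[string]interface{}) (*ComparisonResult, error)
}

type OutputComparatorFunc func(ex AgentExample, actual map[string]interface{}) (*ComparisonResult, error)

// Compare calls f, which lets a bare function satisfy OutputComparator.
func (f OutputComparatorFunc) Compare(ex AgentExample, actual map[string]interface{}) (*ComparisonResult, error) {
	return f(ex, actual)
}

// caseInsensitiveAnswer is a hypothetical comparator: it passes when the
// "answer" strings match ignoring case.
func caseInsensitiveAnswer(ex AgentExample, actual map[string]interface{}) (*ComparisonResult, error) {
	want, _ := ex.Outputs["answer"].(string)
	got, _ := actual["answer"].(string)
	res := &ComparisonResult{}
	if strings.EqualFold(want, got) {
		res.Score = 1.0
		res.PassedTests = []string{"answer"}
	} else {
		res.FailedTests = []string{"answer"}
	}
	return res, nil
}

func main() {
	var cmp OutputComparator = OutputComparatorFunc(caseInsensitiveAnswer)
	ex := AgentExample{ID: "ex-1", Outputs: map[string]interface{}{"answer": "Paris"}}
	res, err := cmp.Compare(ex, map[string]interface{}{"answer": "paris"})
	if err != nil {
		panic(err)
	}
	fmt.Println(res.Score, res.PassedTests)
}
```

A comparator built this way could be passed to NewDeterministicEvaluator in place of the exact-match default when outputs need looser matching.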
type SideInfo ¶
type SideInfo struct {
Trace *agents.ExecutionTrace
Diagnostics map[string]interface{}
Scores map[string]float64
Cost float64
LatencyMS float64
Tokens map[string]int64
PassedTests []string
FailedTests []string
}
SideInfo carries diagnostic information beyond the scalar evaluation score.