reflexion

package
v0.26.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 13, 2026 License: Apache-2.0 Imports: 8 Imported by: 0

README

Reflexion Strategy

The Reflexion strategy implements the Reflexion framework from the NeurIPS 2023 paper: "Reflexion: Language Agents with Verbal Reinforcement Learning".

Overview

Reflexion enables language agents to learn from their mistakes through verbal feedback and self-reflection. Unlike traditional reinforcement learning, Reflexion uses natural language feedback stored in episodic memory to improve performance across multiple trials.

Key Components

Actor

The LLM agent that executes tasks using the available tools and context.

Evaluator

Determines whether a trial successfully completed the task. You can use:

  • LLMEvaluator (default): Uses an LLM to evaluate success
  • Custom Evaluator: Provide a custom function of type Evaluator

Self-Reflection

Generates verbal feedback after failed trials, which is stored in episodic memory and used to guide future attempts.

Episodic Memory

A bounded FIFO buffer storing reflections from past trials (default: 3 most recent).
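To make the "bounded FIFO" behavior concrete, here is a minimal sketch of such a buffer. The package's real memory type is internal and not shown on this page, so `episodicMemory`, `newEpisodicMemory`, `Add`, and `Items` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// episodicMemory is a hypothetical stand-in for the package's internal buffer.
// It keeps at most `capacity` reflections and drops the oldest ones first.
type episodicMemory struct {
	capacity    int
	reflections []string
}

// newEpisodicMemory treats a negative capacity as zero.
func newEpisodicMemory(capacity int) *episodicMemory {
	if capacity < 0 {
		capacity = 0
	}
	return &episodicMemory{capacity: capacity}
}

// Add appends a reflection and evicts the oldest entries beyond capacity.
func (m *episodicMemory) Add(reflection string) {
	m.reflections = append(m.reflections, reflection)
	if over := len(m.reflections) - m.capacity; over > 0 {
		m.reflections = m.reflections[over:]
	}
}

// Items returns a copy of the stored reflections, oldest first.
func (m *episodicMemory) Items() []string {
	return append([]string(nil), m.reflections...)
}

func main() {
	mem := newEpisodicMemory(3)
	for i := 1; i <= 5; i++ {
		mem.Add(fmt.Sprintf("reflection %d", i))
	}
	fmt.Println(mem.Items()) // [reflection 3 reflection 4 reflection 5]
}
```

With a capacity of 3, the reflections from trials 1 and 2 are evicted once trials 4 and 5 add theirs, which matches the "3 most recent" default described above.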

Usage

Basic Example

import (
    "context"

    "github.com/gollem-dev/gollem"
    "github.com/gollem-dev/gollem/llm/openai"
    "github.com/gollem-dev/gollem/strategy/reflexion"
)

ctx := context.Background()

// Create LLM client (apiKey holds your OpenAI API key)
client, err := openai.New(ctx, apiKey)
if err != nil {
    // Handle error
}

// Create Reflexion strategy with default LLM evaluator
strategy := reflexion.New(client,
    reflexion.WithMaxTrials(3),
    reflexion.WithMemorySize(3),
)

// Create agent
agent := gollem.New(client, gollem.WithStrategy(strategy))

// Execute task
response, err := agent.Execute(ctx, gollem.Text("Your task here..."))
if err != nil {
    // Handle error
}

Custom Evaluator

// Simple function-based evaluator
evaluator := reflexion.Evaluator(func(ctx context.Context, t *reflexion.Trajectory) (*reflexion.EvaluationResult, error) {
    response := strings.Join(t.FinalResponse, " ")

    // Check if response meets your criteria
    if containsExpectedOutput(response) {
        return &reflexion.EvaluationResult{
            Success: true,
            Score:   1.0,
        }, nil
    }

    return &reflexion.EvaluationResult{
        Success:  false,
        Feedback: "Response missing expected content",
    }, nil
})

strategy := reflexion.New(client,
    reflexion.WithEvaluator(evaluator),
)

Using Hooks

Hooks allow you to observe and log the Reflexion process:

type myHooks struct{}

func (h *myHooks) OnTrialStart(ctx context.Context, trialNum int) error {
    log.Printf("Starting trial %d", trialNum)
    return nil
}

func (h *myHooks) OnTrialEnd(ctx context.Context, trialNum int, evaluation *reflexion.EvaluationResult) error {
    log.Printf("Trial %d ended - Success: %v", trialNum, evaluation.Success)
    return nil
}

func (h *myHooks) OnReflectionGenerated(ctx context.Context, trialNum int, reflection string) error {
    log.Printf("Reflection: %s", reflection)
    return nil
}

strategy := reflexion.New(client,
    reflexion.WithHooks(&myHooks{}),
)

Configuration Options

WithMaxTrials(n int)

Sets the maximum number of trials (default: 3).

reflexion.WithMaxTrials(5)

WithMemorySize(n int)

Sets the size of episodic memory (default: 3).

reflexion.WithMemorySize(5)

WithEvaluator(evaluator Evaluator)

Sets a custom evaluator (default: LLMEvaluator).

reflexion.WithEvaluator(myEvaluator)

WithHooks(hooks Hooks)

Sets lifecycle hooks for observability.

reflexion.WithHooks(&myHooks{})

How It Works

  1. Trial Execution: The agent executes the task with the current context
  2. Evaluation: The evaluator determines if the trial succeeded
  3. Success Path: If successful, return the final response
  4. Failure Path:
    • Generate a reflection on what went wrong
    • Store reflection in episodic memory
    • Start next trial with reflections as additional context
  5. Retry: Continue until success or max trials reached
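The steps above can be sketched as a plain loop. This is not the package's implementation: `actFunc`, `evaluateFunc`, `reflectFunc`, `result`, and `runTrials` are hypothetical stand-ins for the LLM-driven actor, evaluator, and self-reflection steps. What the real strategy returns once trials are exhausted is not stated here, so returning an error is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// result mirrors the Success/Feedback idea of EvaluationResult (local stand-in).
type result struct {
	Success  bool
	Feedback string
}

// Hypothetical function types for the three roles; the real package uses an LLM.
type (
	actFunc      func(task string, reflections []string) string
	evaluateFunc func(response string) result
	reflectFunc  func(response string, r result) string
)

// errMaxTrials is an assumption: the sketch reports exhaustion as an error.
var errMaxTrials = errors.New("max trials reached without success")

// runTrials makes up to maxTrials attempts and passes the most recent
// memorySize reflections into each new attempt.
func runTrials(task string, maxTrials, memorySize int, act actFunc, evaluate evaluateFunc, reflectOn reflectFunc) (string, int, error) {
	var memory []string
	for trial := 1; trial <= maxTrials; trial++ {
		response := act(task, memory) // 1. trial execution
		r := evaluate(response)       // 2. evaluation
		if r.Success {
			return response, trial, nil // 3. success path
		}
		// 4. failure path: reflect, store, and trim to the newest entries.
		memory = append(memory, reflectOn(response, r))
		if memorySize >= 0 && len(memory) > memorySize {
			memory = memory[len(memory)-memorySize:]
		}
	}
	return "", maxTrials, errMaxTrials // 5. retries exhausted
}

func main() {
	act := func(task string, reflections []string) string {
		if len(reflections) == 0 {
			return "draft"
		}
		return "final answer"
	}
	evaluate := func(response string) result {
		if response == "final answer" {
			return result{Success: true}
		}
		return result{Feedback: "too vague"}
	}
	reflectOn := func(response string, r result) string {
		return "be more specific: " + r.Feedback
	}

	response, trials, err := runTrials("task", 3, 3, act, evaluate, reflectOn)
	fmt.Println(response, trials, err) // final answer 2 <nil>
}
```

The toy actor only succeeds once it sees a reflection, so the second trial passes and the loop stops early.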

Examples

See examples/reflexion/ for a complete working example.

Comparison with Other Strategies

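| Strategy  | Use Case                    | Learning              | Trials   |
|-----------|-----------------------------|-----------------------|----------|
| Simple    | Basic tasks                 | No                    | Single   |
| ReAct     | Step-by-step reasoning      | No                    | Single   |
| Reflexion | Tasks requiring improvement | Yes (episodic memory) | Multiple |
| PlanExec  | Goal-oriented planning      | No                    | Single   |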

Documentation

Overview

Package reflexion implements the Reflexion strategy for gollem.

Reflexion is a framework for language agents that learn through verbal feedback and self-reflection. It enables agents to improve their performance across multiple trials by maintaining episodic memory of past reflections.

Basic usage:

strategy := reflexion.New(llmClient,
    reflexion.WithMaxTrials(3),
    reflexion.WithMemorySize(3),
)

agent := gollem.New(llmClient, gollem.WithStrategy(strategy))
response, err := agent.Execute(ctx, gollem.Text("Solve this task..."))

The strategy will:

  1. Execute a trial with the given input
  2. Evaluate the result using the configured evaluator
  3. If failed, generate a reflection and try again with the reflection as context
  4. Repeat until success or max trials reached

Constants

const (
	// DefaultMaxTrials is the default maximum number of trials
	DefaultMaxTrials = 3
	// DefaultMemorySize is the default maximum size of episodic memory
	DefaultMemorySize = 3
)

Variables

This section is empty.

Functions

This section is empty.

Types

type EvaluationResult

type EvaluationResult struct {
	Success  bool    // Whether the trial successfully completed the task
	Score    float64 // Optional score (0.0-1.0), 0 if not used
	Feedback string  // Optional feedback or explanation
}

EvaluationResult represents the result of evaluating a trajectory. It indicates whether the trial was successful and optionally provides additional feedback and scoring information.

type Evaluator

type Evaluator func(context.Context, *Trajectory) (*EvaluationResult, error)

Evaluator evaluates a trajectory to determine if it successfully completed the task. Users can provide custom evaluation logic as a function.

Example:

evaluator := reflexion.Evaluator(func(ctx context.Context, t *reflexion.Trajectory) (*reflexion.EvaluationResult, error) {
    // Custom evaluation logic
    if containsExpectedOutput(strings.Join(t.FinalResponse, " ")) {
        return &reflexion.EvaluationResult{Success: true}, nil
    }
    return &reflexion.EvaluationResult{
        Success: false,
        Feedback: "Output does not match expected result",
    }, nil
})
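For a version that runs on its own, the sketch below defines local copies of `Trajectory` (trimmed to the one field it uses), `EvaluationResult`, and `Evaluator`, so it compiles without gollem. The `requireKeywords` helper and its partial-credit scoring are hypothetical choices for illustration, not something the package provides.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Local copies of the package types, trimmed to what this sketch needs.
type Trajectory struct {
	FinalResponse []string
}

type EvaluationResult struct {
	Success  bool
	Score    float64
	Feedback string
}

type Evaluator func(context.Context, *Trajectory) (*EvaluationResult, error)

// requireKeywords is a hypothetical helper. The returned evaluator succeeds
// only when every keyword appears (case-insensitively) in the joined final
// response; otherwise it lists the missing keywords as feedback.
func requireKeywords(keywords ...string) Evaluator {
	return func(ctx context.Context, t *Trajectory) (*EvaluationResult, error) {
		response := strings.ToLower(strings.Join(t.FinalResponse, " "))
		var missing []string
		for _, k := range keywords {
			if !strings.Contains(response, strings.ToLower(k)) {
				missing = append(missing, k)
			}
		}
		if len(missing) == 0 {
			return &EvaluationResult{Success: true, Score: 1.0}, nil
		}
		// Partial credit: the fraction of keywords that were found.
		score := float64(len(keywords)-len(missing)) / float64(len(keywords))
		return &EvaluationResult{
			Success:  false,
			Score:    score,
			Feedback: "Response missing: " + strings.Join(missing, ", "),
		}, nil
	}
}

func main() {
	eval := requireKeywords("Paris", "France")
	res, err := eval(context.Background(), &Trajectory{
		FinalResponse: []string{"The capital is Paris."},
	})
	if err != nil {
		fmt.Println("evaluation failed:", err)
		return
	}
	fmt.Println(res.Success, res.Score, res.Feedback) // false 0.5 Response missing: France
}
```

Specific feedback such as the list of missing keywords gives the self-reflection step something concrete to work with on the next trial.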

func NewLLMEvaluator

func NewLLMEvaluator(client gollem.LLMClient) Evaluator

NewLLMEvaluator creates a new LLM-based evaluator. It asks the LLM to determine if the task was successfully completed based on the original task, execution history, and final response.

type Hooks

type Hooks interface {
	// OnTrialStart is called when a new trial begins.
	OnTrialStart(ctx context.Context, trialNum int) error

	// OnTrialEnd is called when a trial completes (success or failure).
	OnTrialEnd(ctx context.Context, trialNum int, evaluation *EvaluationResult) error

	// OnReflectionGenerated is called when a reflection is generated after a failed trial.
	OnReflectionGenerated(ctx context.Context, trialNum int, reflection string) error
}

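Hooks provides lifecycle hooks for observing the Reflexion strategy's execution. Setting hooks is optional. Because Hooks is a Go interface, a type passed to WithHooks must define all three methods; methods you do not need can simply return nil.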
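Since Hooks is a Go interface, a concrete type has to define all three methods before it can be passed to WithHooks. One common way to write only the hooks you care about is to embed a no-op base type. The sketch below uses local copies of `Hooks` and `EvaluationResult` so it compiles standalone; `NoopHooks` and `failureCounter` are hypothetical names, not part of the package.

```go
package main

import (
	"context"
	"fmt"
)

// Local copies of the package types so the sketch compiles without gollem.
type EvaluationResult struct {
	Success  bool
	Score    float64
	Feedback string
}

type Hooks interface {
	OnTrialStart(ctx context.Context, trialNum int) error
	OnTrialEnd(ctx context.Context, trialNum int, evaluation *EvaluationResult) error
	OnReflectionGenerated(ctx context.Context, trialNum int, reflection string) error
}

// NoopHooks is a hypothetical base type whose methods all do nothing.
type NoopHooks struct{}

func (NoopHooks) OnTrialStart(context.Context, int) error                  { return nil }
func (NoopHooks) OnTrialEnd(context.Context, int, *EvaluationResult) error { return nil }
func (NoopHooks) OnReflectionGenerated(context.Context, int, string) error { return nil }

// failureCounter overrides only OnTrialEnd; the other two methods are
// promoted from the embedded NoopHooks.
type failureCounter struct {
	NoopHooks
	failures int
}

func (f *failureCounter) OnTrialEnd(_ context.Context, _ int, evaluation *EvaluationResult) error {
	if !evaluation.Success {
		f.failures++
	}
	return nil
}

// Compile-time check that *failureCounter satisfies Hooks.
var _ Hooks = (*failureCounter)(nil)

func main() {
	h := &failureCounter{}
	ctx := context.Background()
	_ = h.OnTrialStart(ctx, 1) // promoted no-op
	_ = h.OnTrialEnd(ctx, 1, &EvaluationResult{Success: false})
	_ = h.OnTrialEnd(ctx, 2, &EvaluationResult{Success: true})
	fmt.Println("failed trials:", h.failures) // failed trials: 1
}
```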

type Option

type Option func(*Strategy)

Option is a function that configures a Strategy.
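The `func(*Strategy)` shape is the functional options pattern: a constructor sets defaults, then applies each option in order. A minimal sketch follows. The real Strategy's fields are unexported and not shown on this page, so `maxTrials` and `memorySize` are assumed field names, and this local `New` omits the LLM client argument that the real constructor takes.

```go
package main

import "fmt"

// Strategy is a local stand-in; maxTrials and memorySize are assumed names.
type Strategy struct {
	maxTrials  int
	memorySize int
}

// Option has the same shape as the package's Option type.
type Option func(*Strategy)

// WithMaxTrials returns an option that overrides the trial limit.
func WithMaxTrials(n int) Option {
	return func(s *Strategy) { s.maxTrials = n }
}

// WithMemorySize returns an option that overrides the memory size.
func WithMemorySize(n int) Option {
	return func(s *Strategy) { s.memorySize = n }
}

// New starts from the documented defaults (3 trials, memory size 3) and then
// applies the options in the order given, so later options win.
func New(options ...Option) *Strategy {
	s := &Strategy{maxTrials: 3, memorySize: 3}
	for _, opt := range options {
		opt(s)
	}
	return s
}

func main() {
	s := New(WithMaxTrials(5))
	fmt.Println(s.maxTrials, s.memorySize) // 5 3
}
```

Options that are not passed leave the defaults untouched, which is why callers can set only the values they want to change.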

func WithEvaluator

func WithEvaluator(evaluator Evaluator) Option

WithEvaluator sets a custom evaluator. If not set, LLMEvaluator is used by default.

func WithHooks

func WithHooks(hooks Hooks) Option

WithHooks sets lifecycle hooks.

func WithMaxTrials

func WithMaxTrials(n int) Option

WithMaxTrials sets the maximum number of trials. Default is 3.

func WithMemorySize

func WithMemorySize(n int) Option

WithMemorySize sets the maximum size of episodic memory. Default is 3.

type ReflectionGeneratedEvent

type ReflectionGeneratedEvent struct {
	TrialNumber int    `json:"trial_number"`
	Reflection  string `json:"reflection"`
}

ReflectionGeneratedEvent is recorded when a reflection is generated.

type Strategy

type Strategy struct {
	// contains filtered or unexported fields
}

Strategy is the main Reflexion strategy implementation. It manages multiple trials, evaluates their results, generates reflections, and maintains episodic memory to improve performance across trials.

func New

func New(client gollem.LLMClient, options ...Option) *Strategy

New creates a new Reflexion strategy with the given LLM client and options. By default, it uses LLMEvaluator for evaluation, max 3 trials, and memory size of 3.

func (*Strategy) Handle

Handle determines the next input for the LLM based on the current state. It manages the trial loop: execution, evaluation, reflection, and retry.

func (*Strategy) Init

func (s *Strategy) Init(ctx context.Context, inputs []gollem.Input) error

Init initializes the strategy with initial inputs. This is called once when Agent.Execute is invoked.

func (*Strategy) Tools

func (s *Strategy) Tools(ctx context.Context) ([]gollem.Tool, error)

Tools returns additional tools provided by this strategy. Reflexion strategy does not provide additional tools.

type Trajectory

type Trajectory struct {
	TrialNum      int             // Trial number (1-indexed)
	UserInputs    []gollem.Input  // Initial user inputs for the task
	History       *gollem.History // Complete conversation history
	FinalResponse []string        // Final response texts from LLM
	StartTime     time.Time       // Trial start time
	EndTime       time.Time       // Trial end time
}

Trajectory captures the execution trace of a trial. It contains the complete execution history including user inputs, LLM responses, tool executions, and the final response.

type TrialEndEvent

type TrialEndEvent struct {
	TrialNumber int    `json:"trial_number"`
	Success     bool   `json:"success"`
	Feedback    string `json:"feedback,omitempty"`
}

TrialEndEvent is recorded when a trial ends.
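How and where these events are recorded is not described on this page. The sketch below only shows what the struct tags imply when a value is marshalled with the standard `encoding/json` package, using a local copy of the struct; `encode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TrialEndEvent is a local copy with the same fields and JSON tags.
type TrialEndEvent struct {
	TrialNumber int    `json:"trial_number"`
	Success     bool   `json:"success"`
	Feedback    string `json:"feedback,omitempty"`
}

// encode marshals an event to a JSON string, or returns "" on error.
func encode(e TrialEndEvent) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(TrialEndEvent{TrialNumber: 1, Feedback: "Response missing expected content"}))
	// {"trial_number":1,"success":false,"feedback":"Response missing expected content"}

	fmt.Println(encode(TrialEndEvent{TrialNumber: 2, Success: true}))
	// {"trial_number":2,"success":true}
}
```

Because of `omitempty`, the `feedback` key disappears when Feedback is empty, while `success` is always present, even when false.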

type TrialStartEvent

type TrialStartEvent struct {
	TrialNumber int `json:"trial_number"`
}

TrialStartEvent is recorded when a trial begins.
