reflexion

package

Version: v0.26.0 (latest) Published: Jun 13, 2026 License: Apache-2.0 Imports: 8 Imported by: 0

Repository

github.com/gollem-dev/gollem

README ¶

Reflexion Strategy

The Reflexion strategy implements the Reflexion framework from the NeurIPS 2023 paper: "Reflexion: Language Agents with Verbal Reinforcement Learning".

Overview

Reflexion enables language agents to learn from their mistakes through verbal feedback and self-reflection. Unlike traditional reinforcement learning, Reflexion uses natural language feedback stored in episodic memory to improve performance across multiple trials.

Key Components

Actor

The LLM agent that executes tasks using the available tools and context.

Evaluator

Determines whether a trial successfully completed the task. You can use:

- LLMEvaluator (default): Uses an LLM to evaluate success
- Custom Evaluator: Provide a custom function of type Evaluator

Self-Reflection

Generates verbal feedback after failed trials, which is stored in episodic memory and used to guide future attempts.

Episodic Memory

A bounded FIFO buffer storing reflections from past trials (default: 3 most recent).
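
The bounded FIFO behaviour can be illustrated with a small standalone sketch. The `episodicMemory` type and its `add` method are hypothetical names chosen for this illustration; they are not the package's internal implementation, which is unexported.

```go
package main

import "fmt"

// episodicMemory is a hypothetical bounded FIFO buffer of reflections.
// It only illustrates the eviction rule; it is not the package's own type.
type episodicMemory struct {
	limit int
	items []string
}

// add appends a reflection and drops the oldest entries beyond the limit.
func (m *episodicMemory) add(reflection string) {
	m.items = append(m.items, reflection)
	if over := len(m.items) - m.limit; over > 0 {
		m.items = m.items[over:]
	}
}

func main() {
	mem := &episodicMemory{limit: 3}
	for i := 1; i <= 5; i++ {
		mem.add(fmt.Sprintf("reflection %d", i))
	}
	fmt.Println(mem.items) // [reflection 3 reflection 4 reflection 5]
}
```

With a limit of 3, the fourth reflection evicts the first, so later trials only see the most recent feedback.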

Usage

Basic Example

import (
    "context"

    "github.com/gollem-dev/gollem"
    "github.com/gollem-dev/gollem/llm/openai"
    "github.com/gollem-dev/gollem/strategy/reflexion"
)

ctx := context.Background()

// Create LLM client (apiKey holds your OpenAI API key)
client, err := openai.New(ctx, apiKey)
if err != nil {
    // Handle error
}

// Create Reflexion strategy with default LLM evaluator
strategy := reflexion.New(client,
    reflexion.WithMaxTrials(3),
    reflexion.WithMemorySize(3),
)

// Create agent
agent := gollem.New(client, gollem.WithStrategy(strategy))

// Execute task
response, err := agent.Execute(ctx, gollem.Text("Your task here..."))
if err != nil {
    // Handle error
}

Custom Evaluator

// Simple function-based evaluator
evaluator := reflexion.Evaluator(func(ctx context.Context, t *reflexion.Trajectory) (*reflexion.EvaluationResult, error) {
    response := strings.Join(t.FinalResponse, " ")

    // Check if response meets your criteria
    if containsExpectedOutput(response) {
        return &reflexion.EvaluationResult{
            Success: true,
            Score:   1.0,
        }, nil
    }

    return &reflexion.EvaluationResult{
        Success:  false,
        Feedback: "Response missing expected content",
    }, nil
})

strategy := reflexion.New(client,
    reflexion.WithEvaluator(evaluator),
)

Using Hooks

Hooks allow you to observe and log the Reflexion process:

type myHooks struct{}

func (h *myHooks) OnTrialStart(ctx context.Context, trialNum int) error {
    log.Printf("Starting trial %d", trialNum)
    return nil
}

func (h *myHooks) OnTrialEnd(ctx context.Context, trialNum int, evaluation *reflexion.EvaluationResult) error {
    log.Printf("Trial %d ended - Success: %v", trialNum, evaluation.Success)
    return nil
}

func (h *myHooks) OnReflectionGenerated(ctx context.Context, trialNum int, reflection string) error {
    log.Printf("Reflection: %s", reflection)
    return nil
}

strategy := reflexion.New(client,
    reflexion.WithHooks(&myHooks{}),
)

Configuration Options

WithMaxTrials(n int)

Sets the maximum number of trials (default: 3).

reflexion.WithMaxTrials(5)

WithMemorySize(n int)

Sets the size of episodic memory (default: 3).

reflexion.WithMemorySize(5)

WithEvaluator(evaluator Evaluator)

Sets a custom evaluator (default: LLMEvaluator).

reflexion.WithEvaluator(myEvaluator)

WithHooks(hooks Hooks)

Sets lifecycle hooks for observability.

reflexion.WithHooks(&myHooks{})

How It Works

1. Trial Execution: The agent executes the task with the current context.
2. Evaluation: The evaluator determines whether the trial succeeded.
3. Success Path: If the trial succeeded, the final response is returned.
4. Failure Path:
   - Generate a reflection on what went wrong
   - Store the reflection in episodic memory
   - Start the next trial with the stored reflections as additional context
5. Retry: Continue until a trial succeeds or the maximum number of trials is reached.
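
The control flow above can be sketched without the library. Everything in this block is an assumption made for illustration: `runTrials`, `result`, and the three callbacks are hypothetical, and returning an error once the trials run out is a choice of this sketch, since the text above does not say what the strategy returns in that case.

```go
package main

import (
	"errors"
	"fmt"
)

// result mirrors the success/feedback idea of EvaluationResult.
type result struct {
	success  bool
	feedback string
}

// runTrials is a simplified stand-in for the trial loop.
// act, evaluate, and reflectOn are hypothetical callbacks.
func runTrials(
	maxTrials, memorySize int,
	act func(trial int, reflections []string) string,
	evaluate func(response string) result,
	reflectOn func(response string, r result) string,
) (string, int, error) {
	var memory []string
	for trial := 1; trial <= maxTrials; trial++ {
		response := act(trial, memory)
		r := evaluate(response)
		if r.success {
			return response, trial, nil
		}
		// Failure path: store a reflection, keeping only the newest entries.
		memory = append(memory, reflectOn(response, r))
		if over := len(memory) - memorySize; over > 0 {
			memory = memory[over:]
		}
	}
	return "", maxTrials, errors.New("max trials reached without success")
}

func main() {
	// The toy actor succeeds only once it has a reflection to draw on.
	act := func(trial int, reflections []string) string {
		if len(reflections) > 0 {
			return "answer: 42"
		}
		return "answer: unknown"
	}
	evaluate := func(response string) result {
		if response == "answer: 42" {
			return result{success: true}
		}
		return result{feedback: "the answer was not a number"}
	}
	reflectOn := func(response string, r result) string {
		return fmt.Sprintf("%q failed: %s", response, r.feedback)
	}

	response, trials, err := runTrials(3, 3, act, evaluate, reflectOn)
	fmt.Println(response, trials, err) // answer: 42 2 <nil>
}
```

The toy actor fails on the first trial, receives one reflection, and succeeds on the second, which is the behaviour the loop is meant to encourage.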

Examples

See examples/reflexion/ for a complete working example.

Comparison with Other Strategies

Strategy	Use Case	Learning	Trials
Simple	Basic tasks	No	Single
ReAct	Step-by-step reasoning	No	Single
Reflexion	Tasks requiring improvement	Yes (episodic memory)	Multiple
PlanExec	Goal-oriented planning	No	Single

Documentation ¶

Overview ¶

Package reflexion implements the Reflexion strategy for gollem.

Reflexion is a framework for language agents that learn through verbal feedback and self-reflection. It enables agents to improve their performance across multiple trials by maintaining episodic memory of past reflections.

Basic usage:

strategy := reflexion.New(llmClient,
    reflexion.WithMaxTrials(3),
    reflexion.WithMemorySize(3),
)

agent := gollem.New(llmClient, gollem.WithStrategy(strategy))
response, err := agent.Execute(ctx, gollem.Text("Solve this task..."))

The strategy will:

1. Execute a trial with the given input
2. Evaluate the result using the configured evaluator
3. If the trial failed, generate a reflection and try again with the reflection as context
4. Repeat until success or max trials reached

Index ¶

Constants
type EvaluationResult
type Evaluator
- func NewLLMEvaluator(client gollem.LLMClient) Evaluator
type Hooks
type Option
type ReflectionGeneratedEvent
type Strategy
- func New(client gollem.LLMClient, options ...Option) *Strategy
type Trajectory
type TrialEndEvent
type TrialStartEvent

Constants ¶

const (
	// DefaultMaxTrials is the default maximum number of trials
	DefaultMaxTrials = 3
	// DefaultMemorySize is the default maximum size of episodic memory
	DefaultMemorySize = 3
)

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type EvaluationResult ¶

type EvaluationResult struct {
	Success  bool    // Whether the trial successfully completed the task
	Score    float64 // Optional score (0.0-1.0), 0 if not used
	Feedback string  // Optional feedback or explanation
}

EvaluationResult represents the result of evaluating a trajectory. It indicates whether the trial was successful and optionally provides additional feedback and scoring information.

type Evaluator ¶

type Evaluator func(context.Context, *Trajectory) (*EvaluationResult, error)

Evaluator evaluates a trajectory to determine if it successfully completed the task. Users can provide custom evaluation logic as a function.

Example:

evaluator := reflexion.Evaluator(func(ctx context.Context, t *reflexion.Trajectory) (*reflexion.EvaluationResult, error) {
    // FinalResponse is a []string, so join it before matching.
    // containsExpectedOutput is a placeholder for your own check.
    if containsExpectedOutput(strings.Join(t.FinalResponse, " ")) {
        return &reflexion.EvaluationResult{Success: true}, nil
    }
    return &reflexion.EvaluationResult{
        Success:  false,
        Feedback: "Output does not match expected result",
    }, nil
})

func NewLLMEvaluator ¶

func NewLLMEvaluator(client gollem.LLMClient) Evaluator

NewLLMEvaluator creates a new LLM-based evaluator. It asks the LLM to determine if the task was successfully completed based on the original task, execution history, and final response.

type Hooks ¶

type Hooks interface {
	// OnTrialStart is called when a new trial begins.
	OnTrialStart(ctx context.Context, trialNum int) error

	// OnTrialEnd is called when a trial completes (success or failure).
	OnTrialEnd(ctx context.Context, trialNum int, evaluation *EvaluationResult) error

	// OnReflectionGenerated is called when a reflection is generated after a failed trial.
	OnReflectionGenerated(ctx context.Context, trialNum int, reflection string) error
}

Hooks provides lifecycle hooks for observing the Reflexion strategy's execution. Because Hooks is a Go interface, a value passed to WithHooks must implement all three methods; a hook you do not need can simply return nil.

type Option ¶

type Option func(*Strategy)

Option is a function that configures a Strategy.
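
Option follows the common functional-options pattern. The sketch below shows the general mechanics with a stand-in `strategy` type. The field names and the apply-in-order behaviour are assumptions of this illustration, since the real Strategy keeps its fields unexported.

```go
package main

import "fmt"

// strategy is a stand-in holding two of the documented settings.
type strategy struct {
	maxTrials  int
	memorySize int
}

// option configures a strategy, in the same spirit as reflexion.Option.
type option func(*strategy)

func withMaxTrials(n int) option  { return func(s *strategy) { s.maxTrials = n } }
func withMemorySize(n int) option { return func(s *strategy) { s.memorySize = n } }

// newStrategy sets the defaults first, then applies each option in order,
// so a later option overrides an earlier one.
func newStrategy(options ...option) *strategy {
	s := &strategy{maxTrials: 3, memorySize: 3}
	for _, opt := range options {
		opt(s)
	}
	return s
}

func main() {
	s := newStrategy(withMaxTrials(5))
	fmt.Println(s.maxTrials, s.memorySize) // 5 3
}
```

The appeal of this pattern is that callers name only the settings they want to change and the constructor signature stays stable as options are added.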

func WithEvaluator ¶

func WithEvaluator(evaluator Evaluator) Option

WithEvaluator sets a custom evaluator. If not set, LLMEvaluator is used by default.

func WithHooks ¶

func WithHooks(hooks Hooks) Option

WithHooks sets lifecycle hooks.

func WithMaxTrials ¶

func WithMaxTrials(n int) Option

WithMaxTrials sets the maximum number of trials. Default is 3.

func WithMemorySize ¶

func WithMemorySize(n int) Option

WithMemorySize sets the maximum size of episodic memory. Default is 3.

type ReflectionGeneratedEvent ¶

type ReflectionGeneratedEvent struct {
	TrialNumber int    `json:"trial_number"`
	Reflection  string `json:"reflection"`
}

ReflectionGeneratedEvent is recorded when a reflection is generated.

type Strategy ¶

type Strategy struct {
	// contains filtered or unexported fields
}

Strategy is the main Reflexion strategy implementation. It manages multiple trials, evaluates their results, generates reflections, and maintains episodic memory to improve performance across trials.

func New ¶

func New(client gollem.LLMClient, options ...Option) *Strategy

New creates a new Reflexion strategy with the given LLM client and options. By default, it uses LLMEvaluator for evaluation, max 3 trials, and memory size of 3.

func (*Strategy) Handle ¶

func (s *Strategy) Handle(ctx context.Context, state *gollem.StrategyState) ([]gollem.Input, *gollem.ExecuteResponse, error)

Handle determines the next input for the LLM based on the current state. It manages the trial loop: execution, evaluation, reflection, and retry.

func (*Strategy) Init ¶

func (s *Strategy) Init(ctx context.Context, inputs []gollem.Input) error

Init initializes the strategy with initial inputs. This is called once when Agent.Execute is invoked.

func (*Strategy) Tools ¶

func (s *Strategy) Tools(ctx context.Context) ([]gollem.Tool, error)

Tools returns additional tools provided by this strategy. Reflexion strategy does not provide additional tools.

type Trajectory ¶

type Trajectory struct {
	TrialNum      int             // Trial number (1-indexed)
	UserInputs    []gollem.Input  // Initial user inputs for the task
	History       *gollem.History // Complete conversation history
	FinalResponse []string        // Final response texts from LLM
	StartTime     time.Time       // Trial start time
	EndTime       time.Time       // Trial end time
}

Trajectory captures the execution trace of a trial. It contains the complete execution history including user inputs, LLM responses, tool executions, and the final response.

type TrialEndEvent ¶

type TrialEndEvent struct {
	TrialNumber int    `json:"trial_number"`
	Success     bool   `json:"success"`
	Feedback    string `json:"feedback,omitempty"`
}

TrialEndEvent is recorded when a trial ends.
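
The struct tags determine how the event looks when serialized as JSON. The sketch below redeclares the struct locally with the same tags and encodes it with the standard library; where the strategy actually records these events is not described here, and `encode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TrialEndEvent is a local copy of the documented struct, tags included.
type TrialEndEvent struct {
	TrialNumber int    `json:"trial_number"`
	Success     bool   `json:"success"`
	Feedback    string `json:"feedback,omitempty"`
}

// encode marshals an event to its JSON text.
func encode(e TrialEndEvent) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	failed, _ := encode(TrialEndEvent{TrialNumber: 1, Feedback: "missing units"})
	fmt.Println(failed) // {"trial_number":1,"success":false,"feedback":"missing units"}

	passed, _ := encode(TrialEndEvent{TrialNumber: 2, Success: true})
	fmt.Println(passed) // {"trial_number":2,"success":true}
}
```

Because of `omitempty`, the feedback key is omitted when the string is empty, while `success` is always present, even when false.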

type TrialStartEvent ¶

type TrialStartEvent struct {
	TrialNumber int `json:"trial_number"`
}

TrialStartEvent is recorded when a trial begins.
