agent

package
v0.8.0 Latest
Published: Mar 29, 2026 License: MIT Imports: 15 Imported by: 0

Documentation

Overview

claude_agent.go implements a minimal agent loop that calls the Anthropic Messages API with tool_use. It supports two modes: baseline (file-level tools only) and synapses (file tools plus Synapses MCP tools). The SWE-bench benchmark uses it to measure the Pass@1 delta that Synapses provides.

Package agent provides the HTTP client that calls a running Synapses daemon via its REST transport (/v1/tools/{tool}?project=...).

The REST protocol is:

POST /v1/tools/{tool_name}?project={absolute_project_path}
Content-Type: application/json
Body: JSON object of tool arguments

Response 200: {"content": [{"type":"text","text":"..."}], ...}
Response 4xx/5xx: {"error": "..."}

All tool calls are logged for Context F1 tracking (ContextBench).
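
The request and response shapes above can be sketched with the standard library alone. This is an illustrative sketch, not the package's client code: newToolRequest, parseToolResponse, and toolResponse are hypothetical names, and the endpoint http://localhost:8080 is a placeholder. The response type decodes only the fields the protocol description names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// toolResponse covers the documented 200 body and the documented error body.
// Any other fields the daemon sends are ignored.
type toolResponse struct {
	Content []struct {
		Type string `json:"type"`
		Text string `json:"text"`
	} `json:"content"`
	Error string `json:"error"`
}

// newToolRequest (hypothetical helper) builds
// POST /v1/tools/{tool}?project={absolute_project_path} with a JSON body.
func newToolRequest(endpoint, project, tool string, args map[string]any) (*http.Request, error) {
	body, err := json.Marshal(args)
	if err != nil {
		return nil, err
	}
	u := strings.TrimRight(endpoint, "/") + "/v1/tools/" + url.PathEscape(tool) +
		"?project=" + url.QueryEscape(project)
	req, err := http.NewRequest(http.MethodPost, u, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// parseToolResponse (hypothetical helper) joins the text blocks of a 200
// response and turns any other status into an error carrying the "error" field.
func parseToolResponse(status int, body []byte) (string, error) {
	var tr toolResponse
	if err := json.Unmarshal(body, &tr); err != nil {
		return "", fmt.Errorf("status %d: decode body: %w", status, err)
	}
	if status != http.StatusOK {
		return "", fmt.Errorf("status %d: %s", status, tr.Error)
	}
	var sb strings.Builder
	for _, c := range tr.Content {
		if c.Type == "text" {
			sb.WriteString(c.Text)
		}
	}
	return sb.String(), nil
}

func main() {
	// Placeholder endpoint and project path; nothing is sent over the network here.
	req, err := newToolRequest("http://localhost:8080", "/repos/demo", "search",
		map[string]any{"query": "parser"})
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())

	text, err := parseToolResponse(http.StatusOK, []byte(`{"content":[{"type":"text","text":"hit"}]}`))
	fmt.Println(text, err)

	_, err = parseToolResponse(http.StatusNotFound, []byte(`{"error":"unknown tool"}`))
	fmt.Println(err)
}
```

Splitting request construction from response parsing keeps both halves testable without a running daemon; sending the request is then a single http.Client.Do call.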

tools.go defines tool schemas and executors for the SWE-bench agent. Baseline tools operate on a checked-out git repo (read/grep/list/write). Synapses tools extend baseline with Synapses MCP tools (search, context, impact).

Types

type AgentConfig

type AgentConfig struct {
	APIKey      string  // Anthropic API key (env ANTHROPIC_API_KEY)
	Model       string  // e.g. "claude-sonnet-4-6"
	MaxTurns    int     // max agent loop iterations (default 25)
	MaxTokens   int     // max tokens per response (default 4096)
	Temperature float64 // 0.0 for deterministic
}

AgentConfig holds Claude API parameters.

func DefaultAgentConfig

func DefaultAgentConfig() AgentConfig

DefaultAgentConfig returns sensible defaults (MaxTurns 25, MaxTokens 4096 per the field comments). APIKey is read from the ANTHROPIC_API_KEY environment variable.
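
A typical pattern is to start from the defaults and override individual fields per run. The sketch below mirrors AgentConfig locally so it compiles on its own; defaultAgentConfig is a hypothetical stand-in that assumes the defaults stated in the field comments (25 turns, 4096 tokens, temperature 0.0) and leaves Model empty, because the source does not say which model the real function selects.

```go
package main

import (
	"fmt"
	"os"
)

// AgentConfig mirrors the documented struct so this sketch is self-contained.
type AgentConfig struct {
	APIKey      string
	Model       string
	MaxTurns    int
	MaxTokens   int
	Temperature float64
}

// defaultAgentConfig is a hypothetical stand-in for DefaultAgentConfig.
// The numeric defaults come from the field comments; Model is left unset.
func defaultAgentConfig() AgentConfig {
	return AgentConfig{
		APIKey:      os.Getenv("ANTHROPIC_API_KEY"),
		MaxTurns:    25,
		MaxTokens:   4096,
		Temperature: 0.0,
	}
}

func main() {
	cfg := defaultAgentConfig()
	cfg.Model = "claude-sonnet-4-6" // example value taken from the field comment
	cfg.MaxTurns = 10               // tighter cap for a quick smoke run

	// Avoid printing the key itself; report only whether it was found.
	fmt.Printf("model=%s turns=%d tokens=%d keySet=%t\n",
		cfg.Model, cfg.MaxTurns, cfg.MaxTokens, cfg.APIKey != "")
}
```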

type AgentMode

type AgentMode string

AgentMode selects the tool set.

const (
	ModeBaseline AgentMode = "baseline"
	ModeSynapses AgentMode = "synapses"
)

Supported agent modes.

type AgentResult

type AgentResult struct {
	Patch string     `json:"patch"`
	Stats AgentStats `json:"stats"`
	Error string     `json:"error,omitempty"`
}

AgentResult is the outcome of running the agent on a single task.

func RunAgent

func RunAgent(cfg AgentConfig, systemPrompt, taskPrompt string,
	tools []ToolDef, executor ToolExecutor, patchExtractor func() (string, error)) (*AgentResult, error)

RunAgent executes the agent loop for one task. It sends the system+task prompt to Claude, handles tool_use responses, and returns the final patch (extracted via patchExtractor callback).
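
The control flow described above can be sketched without any network access by replacing Claude with a scripted function. Everything here except the ToolExecutor interface is hypothetical: toolCall, modelFunc, loopStats, and runLoop are illustration names, and feeding a tool error back to the model as text is an assumed policy, not something the source confirms.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolExecutor matches the interface documented in this package.
type ToolExecutor interface {
	Execute(toolName string, input json.RawMessage) (string, error)
}

// toolCall is a hypothetical stand-in for one tool_use block.
type toolCall struct {
	Name  string
	Input json.RawMessage
}

// modelFunc is a hypothetical stand-in for one Messages API round trip.
// It sees the previous turn's tool results; returning no calls means "done".
type modelFunc func(turn int, lastResults []string) []toolCall

type loopStats struct {
	TotalTurns int
	ToolCalls  map[string]int
}

// runLoop sketches the loop: ask the model, run the requested tools, repeat
// until the model stops calling tools or maxTurns is reached, then extract the patch.
func runLoop(maxTurns int, model modelFunc, exec ToolExecutor,
	patchExtractor func() (string, error)) (string, loopStats, error) {

	st := loopStats{ToolCalls: map[string]int{}}
	var results []string
	for turn := 0; turn < maxTurns; turn++ {
		st.TotalTurns++
		calls := model(turn, results)
		if len(calls) == 0 {
			break
		}
		results = nil
		for _, c := range calls {
			st.ToolCalls[c.Name]++
			out, err := exec.Execute(c.Name, c.Input)
			if err != nil {
				out = "error: " + err.Error() // assumed: errors go back to the model as text
			}
			results = append(results, out)
		}
	}
	patch, err := patchExtractor()
	return patch, st, err
}

// echoExecutor is a toy executor that echoes its arguments.
type echoExecutor struct{}

func (echoExecutor) Execute(name string, input json.RawMessage) (string, error) {
	return name + ":" + string(input), nil
}

// scriptedModel requests one tool call on each of the first two turns, then stops.
func scriptedModel(turn int, _ []string) []toolCall {
	if turn < 2 {
		return []toolCall{{Name: "read_file", Input: json.RawMessage(`{"path":"a.go"}`)}}
	}
	return nil
}

func main() {
	extract := func() (string, error) { return "PATCH", nil }
	patch, st, err := runLoop(25, scriptedModel, echoExecutor{}, extract)
	fmt.Println(patch, st.TotalTurns, st.ToolCalls, err)
}
```

Taking the patch from a callback rather than from the model's final message means the result reflects what is actually on disk after the tool calls, which is presumably why RunAgent accepts patchExtractor.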

type AgentStats

type AgentStats struct {
	TotalTurns   int            `json:"total_turns"`
	ToolCalls    map[string]int `json:"tool_calls"`    // tool_name → count
	InputTokens  int            `json:"input_tokens"`  // cumulative
	OutputTokens int            `json:"output_tokens"` // cumulative
	SynapsesUsed bool           `json:"synapses_used"` // true if any synapses tool was called
	Duration     time.Duration  `json:"duration"`
}

AgentStats tracks resource usage for a single task.

type BaselineExecutor

type BaselineExecutor struct {
	RepoDir string
}

BaselineExecutor provides file-level tools on a checked-out repo.

func (*BaselineExecutor) Execute

func (e *BaselineExecutor) Execute(toolName string, input json.RawMessage) (string, error)

Execute dispatches baseline tool calls (read_file, grep_search, etc.).

type ContextAccess

type ContextAccess struct {
	TaskID    string
	File      string
	LineStart int
	LineEnd   int
	Tool      string // "prepare_context", "search", "get_impact", "recall"
	Timestamp time.Time
}

ContextAccess records a single file+line-range access made via a Synapses tool. Used for ContextBench Context F1 calculation.
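
To make the role of the line ranges concrete, here is one way an F1 score could be computed from recorded accesses. This is an assumed definition for illustration, not ContextBench's documented metric: it treats every (file, line) pair as one item, then takes precision and recall of the accessed set against a gold set. The span type and contextF1 function are hypothetical.

```go
package main

import "fmt"

// span is a hypothetical reduction of ContextAccess to the fields the metric needs.
type span struct {
	File      string
	LineStart int
	LineEnd   int // inclusive
}

type lineKey struct {
	File string
	Line int
}

// lineSet expands spans into a set of (file, line) pairs, deduplicating overlaps.
func lineSet(spans []span) map[lineKey]bool {
	set := map[lineKey]bool{}
	for _, s := range spans {
		for l := s.LineStart; l <= s.LineEnd; l++ {
			set[lineKey{s.File, l}] = true
		}
	}
	return set
}

// contextF1 returns line-level precision, recall, and F1 (assumed definition).
func contextF1(accessed, gold []span) (precision, recall, f1 float64) {
	a, g := lineSet(accessed), lineSet(gold)
	if len(a) == 0 || len(g) == 0 {
		return 0, 0, 0
	}
	hits := 0
	for k := range a {
		if g[k] {
			hits++
		}
	}
	if hits == 0 {
		return 0, 0, 0
	}
	precision = float64(hits) / float64(len(a))
	recall = float64(hits) / float64(len(g))
	f1 = 2 * precision * recall / (precision + recall)
	return precision, recall, f1
}

func main() {
	accessed := []span{{File: "a.go", LineStart: 10, LineEnd: 19}} // 10 lines
	gold := []span{{File: "a.go", LineStart: 15, LineEnd: 24}}     // 10 lines, 5 shared
	p, r, f := contextF1(accessed, gold)
	fmt.Println(p, r, f) // prints 0.5 0.5 0.5
}
```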

type ContextResult

type ContextResult struct {
	Raw  string
	Text string
}

ContextResult is returned by PrepareContext.

type ImpactResult

type ImpactResult struct {
	Raw  string
	Text string
}

ImpactResult is returned by GetImpact.

type RankedCandidate

type RankedCandidate struct {
	Index int     // original index in the candidates slice
	Score float64 // higher = more relevant
}

RankedCandidate is one entry in a ranked candidate list.

type RecallResult

type RecallResult struct {
	Raw  string
	Text string
}

RecallResult is returned by Recall.

type SearchResult

type SearchResult struct {
	Raw  string
	Text string
}

SearchResult is returned by Search.

type SynapsesClient

type SynapsesClient struct {
	// contains filtered or unexported fields
}

SynapsesClient calls a running Synapses daemon over HTTP. Its fields are unexported, so for a control run where all tools return empty results, construct the client with NewDisabledClient.

func NewClient

func NewClient(endpoint, project string) *SynapsesClient

NewClient creates a live client that calls the Synapses daemon. No auth token is passed in: when the daemon requires Bearer auth, the token is read from ~/.synapses/auth_token.

func NewDisabledClient

func NewDisabledClient() *SynapsesClient

NewDisabledClient creates a no-op client for control runs. All tool calls succeed but return empty results.

func (*SynapsesClient) DrainAccesses

func (c *SynapsesClient) DrainAccesses() []ContextAccess

DrainAccesses returns all recorded context accesses and clears the log.
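
The "return everything and clear" behaviour is a small pattern worth seeing in isolation. The sketch below is an assumption about how such a log might be built, using a mutex so that recording and draining are safe from multiple goroutines; access and accessLog are hypothetical types, and the source does not say whether the real client locks.

```go
package main

import (
	"fmt"
	"sync"
)

// access is a hypothetical, trimmed-down ContextAccess.
type access struct {
	File string
	Tool string
}

// accessLog is a hypothetical append-only log with drain semantics.
type accessLog struct {
	mu      sync.Mutex
	entries []access
}

// record appends one access under the lock.
func (l *accessLog) record(a access) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.entries = append(l.entries, a)
}

// drain hands the accumulated entries to the caller and resets the log,
// so each access is reported exactly once.
func (l *accessLog) drain() []access {
	l.mu.Lock()
	defer l.mu.Unlock()
	out := l.entries
	l.entries = nil
	return out
}

func main() {
	var log accessLog
	log.record(access{File: "a.go", Tool: "search"})
	log.record(access{File: "b.go", Tool: "get_impact"})

	fmt.Println(len(log.drain())) // prints 2
	fmt.Println(len(log.drain())) // prints 0: the first drain cleared the log
}
```

Draining once per task keeps accesses from one task out of the next task's score.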

func (*SynapsesClient) GetContextJSON

func (c *SynapsesClient) GetContextJSON(taskID, entity, detailLevel string) (string, error)

GetContextJSON calls get_context with format=json, returning the raw JSON string. This gives structured callees, callers, and related nodes, which is far more reliable than regex-parsing the Markdown returned by prepare_context.

func (*SynapsesClient) GetImpact

func (c *SynapsesClient) GetImpact(taskID, entity string) (*ImpactResult, error)

GetImpact calls the get_impact tool.

func (*SynapsesClient) GetImpactWithDepth

func (c *SynapsesClient) GetImpactWithDepth(taskID, entity string, depth int) (*ImpactResult, error)

GetImpactWithDepth calls get_impact with an explicit depth parameter.

func (*SynapsesClient) PrepareContext

func (c *SynapsesClient) PrepareContext(taskID, entity, intent string) (*ContextResult, error)

PrepareContext calls get_context with mode=intent (the tool was named prepare_context before Sprint 24).

func (*SynapsesClient) RankCandidates

func (c *SynapsesClient) RankCandidates(query string, candidates []string) ([]RankedCandidate, error)

RankCandidates ranks a list of candidate snippets against a query. It calls the Synapses embedder via the embed_snippet tool when that tool is available, and otherwise asks the daemon to rank the candidates. Used by RepoBench-R (Approach B: embedding-based ranking that does not need a live repo index).

Returns a ranked list of (index, score) pairs, highest score first.
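
For readers unfamiliar with embedding-based ranking, the sketch below shows the general technique: score each candidate by cosine similarity to the query vector and sort by score, highest first. It is not the daemon's implementation. The vectors are made-up two-dimensional stand-ins for real embeddings, and cosine and rank are hypothetical helpers; only the RankedCandidate shape comes from this package.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// RankedCandidate mirrors the documented struct.
type RankedCandidate struct {
	Index int     // original index in the candidates slice
	Score float64 // higher = more relevant
}

// cosine returns the cosine similarity of two vectors, or 0 when the
// lengths differ or either vector has zero norm.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// rank scores every candidate against the query and sorts by descending score.
// The sort is stable, so ties keep their original order.
func rank(query []float64, candidates [][]float64) []RankedCandidate {
	out := make([]RankedCandidate, len(candidates))
	for i, c := range candidates {
		out[i] = RankedCandidate{Index: i, Score: cosine(query, c)}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	query := []float64{1, 0}
	candidates := [][]float64{
		{0, 1}, // orthogonal to the query
		{1, 0}, // same direction as the query
		{1, 1}, // 45 degrees from the query
	}
	for _, rc := range rank(query, candidates) {
		fmt.Printf("index=%d score=%.3f\n", rc.Index, rc.Score)
	}
}
```

Keeping the original index in each result lets the caller map ranks back to the candidates slice without copying the snippets.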

func (*SynapsesClient) Recall

func (c *SynapsesClient) Recall(taskID, query string) (*RecallResult, error)

Recall calls the memory tool with action=search (the tool was named recall before Sprint 24).

func (*SynapsesClient) Search

func (c *SynapsesClient) Search(taskID, query string) (*SearchResult, error)

Search calls the search tool.

func (*SynapsesClient) SearchWithMode

func (c *SynapsesClient) SearchWithMode(query, mode string) (string, error)

SearchWithMode calls the search tool with an explicit mode (e.g. "vector").

func (*SynapsesClient) WithProject

func (c *SynapsesClient) WithProject(project string) *SynapsesClient

WithProject returns a shallow copy of the client with a different project path. Used for per-repo routing in RepoBench-R: each sample gets routed to its own indexed project directory.

type SynapsesExecutor

type SynapsesExecutor struct {
	BaselineExecutor
	Client *SynapsesClient
	TaskID string
}

SynapsesExecutor extends BaselineExecutor with Synapses MCP tools.

func (*SynapsesExecutor) Execute

func (e *SynapsesExecutor) Execute(toolName string, input json.RawMessage) (string, error)

Execute dispatches Synapses-augmented tool calls, falling back to baseline tools.
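
Because SynapsesExecutor embeds BaselineExecutor, its own Execute method shadows the promoted one, and the fallback is an explicit call to the embedded value. The sketch below shows that mechanism with hypothetical types (baseExec, synExec) and canned return strings; which tool names the real executor intercepts is an assumption here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolExecutor matches the interface documented in this package.
type ToolExecutor interface {
	Execute(toolName string, input json.RawMessage) (string, error)
}

// baseExec is a hypothetical stand-in for BaselineExecutor.
type baseExec struct {
	RepoDir string
}

func (b *baseExec) Execute(toolName string, input json.RawMessage) (string, error) {
	switch toolName {
	case "read_file", "grep_search":
		return "baseline handled " + toolName, nil
	}
	return "", fmt.Errorf("unknown tool %q", toolName)
}

// synExec is a hypothetical stand-in for SynapsesExecutor.
type synExec struct {
	baseExec // embedded: fields and methods are promoted
	TaskID   string
}

// Execute shadows baseExec.Execute. Tools it does not recognise are passed
// to the embedded executor explicitly.
func (s *synExec) Execute(toolName string, input json.RawMessage) (string, error) {
	switch toolName {
	case "search", "get_impact": // assumed set of intercepted tools
		return "synapses handled " + toolName + " for " + s.TaskID, nil
	}
	return s.baseExec.Execute(toolName, input)
}

// Compile-time checks that both types satisfy the interface.
var (
	_ ToolExecutor = (*baseExec)(nil)
	_ ToolExecutor = (*synExec)(nil)
)

func main() {
	var exec ToolExecutor = &synExec{baseExec: baseExec{RepoDir: "/repos/demo"}, TaskID: "task-1"}

	out, _ := exec.Execute("search", nil)
	fmt.Println(out) // synapses handled search for task-1

	out, _ = exec.Execute("read_file", nil)
	fmt.Println(out) // baseline handled read_file

	_, err := exec.Execute("nope", nil)
	fmt.Println(err) // unknown tool "nope"
}
```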

type ToolDef

type ToolDef struct {
	Name        string                 `json:"name"`
	Description string                 `json:"description"`
	InputSchema map[string]interface{} `json:"input_schema"`
}

ToolDef matches the Claude API tool schema.

func BaselineTools

func BaselineTools() []ToolDef

BaselineTools returns tool definitions for the baseline agent mode.

func SynapsesTools

func SynapsesTools() []ToolDef

SynapsesTools returns baseline + Synapses MCP tool definitions.

type ToolExecutor

type ToolExecutor interface {
	Execute(toolName string, input json.RawMessage) (string, error)
}

ToolExecutor executes a tool by name and returns the text result.
