Documentation ¶
Overview ¶
claude_agent.go implements a minimal agent loop that calls the Anthropic Messages API with tool_use. It supports two modes: baseline (file-level tools only) and synapses (file tools + Synapses MCP tools). The SWE-bench benchmark uses it to measure the Pass@1 delta that Synapses provides.
Package agent provides the HTTP client that calls a running Synapses daemon via its REST transport (/v1/tools/{tool}?project=...).
The REST protocol is:
    POST /v1/tools/{tool_name}?project={absolute_project_path}
    Content-Type: application/json
    Body: JSON object of tool arguments
    Response 200: {"content": [{"type":"text","text":"..."}], ...}
    Response 4xx/5xx: {"error": "..."}
All tool calls are logged for Context F1 tracking (ContextBench).
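The REST protocol above can be exercised with a few lines of net/http. The following is a minimal sketch, not the package's client: callTool, toolResponse, and newFakeDaemon are hypothetical names, and the fake daemon exists only so the example runs without a real Synapses instance. It assumes the text of all "text" content items should be concatenated.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// toolResponse mirrors the documented bodies: "content" on 200, "error" on 4xx/5xx.
type toolResponse struct {
	Content []struct {
		Type string `json:"type"`
		Text string `json:"text"`
	} `json:"content"`
	Error string `json:"error"`
}

// callTool is a hypothetical helper that POSTs tool arguments to
// /v1/tools/{tool}?project=... and returns the concatenated text content.
func callTool(endpoint, project, tool string, args map[string]any) (string, error) {
	body, err := json.Marshal(args)
	if err != nil {
		return "", err
	}
	u := endpoint + "/v1/tools/" + url.PathEscape(tool) + "?project=" + url.QueryEscape(project)
	resp, err := http.Post(u, "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	var tr toolResponse
	if err := json.Unmarshal(raw, &tr); err != nil {
		return "", fmt.Errorf("decode response (status %d): %w", resp.StatusCode, err)
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("tool %s failed (status %d): %s", tool, resp.StatusCode, tr.Error)
	}
	var out string
	for _, c := range tr.Content {
		if c.Type == "text" {
			out += c.Text
		}
	}
	return out, nil
}

// newFakeDaemon stands in for a Synapses daemon so the sketch runs offline.
// It only knows the "search" tool and echoes the project query parameter.
func newFakeDaemon() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		if r.URL.Path != "/v1/tools/search" {
			w.WriteHeader(http.StatusNotFound)
			fmt.Fprint(w, `{"error":"unknown tool"}`)
			return
		}
		fmt.Fprintf(w, `{"content":[{"type":"text","text":"project=%s"}]}`, r.URL.Query().Get("project"))
	}))
}

func main() {
	srv := newFakeDaemon()
	defer srv.Close()
	text, err := callTool(srv.URL, "/repos/demo", "search", map[string]any{"query": "parse"})
	fmt.Println(text, err)
}
```

Bearer authentication, timeouts, and access logging are left out of this sketch.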
tools.go defines tool schemas and executors for the SWE-bench agent. Baseline tools operate on a checked-out git repo (read/grep/list/write). Synapses tools extend baseline with Synapses MCP tools (search, context, impact).
Index ¶
- type AgentConfig
- type AgentMode
- type AgentResult
- type AgentStats
- type BaselineExecutor
- type ContextAccess
- type ContextResult
- type ImpactResult
- type RankedCandidate
- type RecallResult
- type SearchResult
- type SynapsesClient
- func (c *SynapsesClient) DrainAccesses() []ContextAccess
- func (c *SynapsesClient) GetContextJSON(taskID, entity, detailLevel string) (string, error)
- func (c *SynapsesClient) GetImpact(taskID, entity string) (*ImpactResult, error)
- func (c *SynapsesClient) GetImpactWithDepth(taskID, entity string, depth int) (*ImpactResult, error)
- func (c *SynapsesClient) PrepareContext(taskID, entity, intent string) (*ContextResult, error)
- func (c *SynapsesClient) RankCandidates(query string, candidates []string) ([]RankedCandidate, error)
- func (c *SynapsesClient) Recall(taskID, query string) (*RecallResult, error)
- func (c *SynapsesClient) Search(taskID, query string) (*SearchResult, error)
- func (c *SynapsesClient) SearchWithMode(query, mode string) (string, error)
- func (c *SynapsesClient) WithProject(project string) *SynapsesClient
- type SynapsesExecutor
- type ToolDef
- type ToolExecutor
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AgentConfig ¶
type AgentConfig struct {
	APIKey      string  // Anthropic API key (env ANTHROPIC_API_KEY)
	Model       string  // e.g. "claude-sonnet-4-6"
	MaxTurns    int     // max agent loop iterations (default 25)
	MaxTokens   int     // max tokens per response (default 4096)
	Temperature float64 // 0.0 for deterministic
}
AgentConfig holds Claude API parameters.
func DefaultAgentConfig ¶
func DefaultAgentConfig() AgentConfig
DefaultAgentConfig returns sensible defaults. APIKey is read from env.
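To make the documented defaults concrete, here is a sketch of what a defaults constructor could look like. exampleDefaults is a hypothetical stand-in, not the real DefaultAgentConfig. The values for MaxTurns, MaxTokens, and Temperature come from the field comments above; the Model value is an assumption, since the documentation gives it only as an example.

```go
package main

import (
	"fmt"
	"os"
)

// AgentConfig is redeclared here so the sketch is self-contained.
type AgentConfig struct {
	APIKey      string
	Model       string
	MaxTurns    int
	MaxTokens   int
	Temperature float64
}

// exampleDefaults is a hypothetical stand-in for DefaultAgentConfig.
func exampleDefaults() AgentConfig {
	return AgentConfig{
		APIKey:      os.Getenv("ANTHROPIC_API_KEY"), // empty if the variable is unset
		Model:       "claude-sonnet-4-6",            // assumption: documented only as an example value
		MaxTurns:    25,
		MaxTokens:   4096,
		Temperature: 0.0,
	}
}

func main() {
	cfg := exampleDefaults()
	cfg.MaxTurns = 10 // callers can override individual fields after taking the defaults
	fmt.Printf("model=%s turns=%d tokens=%d temp=%.1f\n", cfg.Model, cfg.MaxTurns, cfg.MaxTokens, cfg.Temperature)
}
```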
type AgentResult ¶
type AgentResult struct {
	Patch string     `json:"patch"`
	Stats AgentStats `json:"stats"`
	Error string     `json:"error,omitempty"`
}
AgentResult is the outcome of running the agent on a single task.
func RunAgent ¶
func RunAgent(cfg AgentConfig, systemPrompt, taskPrompt string, tools []ToolDef, executor ToolExecutor, patchExtractor func() (string, error)) (*AgentResult, error)
RunAgent executes the agent loop for one task. It sends the system+task prompt to Claude, handles tool_use responses, and returns the final patch (extracted via patchExtractor callback).
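The shape of such a loop can be sketched without touching the network. In the example below, runLoop, toolCall, modelFunc, echoExecutor, and scriptedModel are all hypothetical: modelFunc stands in for the Messages API call, and a turn with no tool calls is treated as the final answer. How the real RunAgent behaves when the turn limit is reached is not documented here, so returning an error in that case is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ToolExecutor matches the interface documented in this package.
type ToolExecutor interface {
	Execute(toolName string, input json.RawMessage) (string, error)
}

// toolCall is a hypothetical stand-in for one tool_use block.
type toolCall struct {
	Name  string
	Input json.RawMessage
}

// modelFunc stands in for the API call: it sees the transcript so far and
// returns the requested tool calls; an empty result means the model is done.
type modelFunc func(transcript []string) []toolCall

// runLoop alternates between the model and the executor until the model stops
// requesting tools or maxTurns is exhausted.
func runLoop(model modelFunc, exec ToolExecutor, maxTurns int, extractPatch func() (string, error)) (string, map[string]int, error) {
	counts := map[string]int{}
	transcript := []string{"task prompt"}
	for turn := 0; turn < maxTurns; turn++ {
		calls := model(transcript)
		if len(calls) == 0 {
			patch, err := extractPatch()
			return patch, counts, err
		}
		for _, c := range calls {
			counts[c.Name]++
			out, err := exec.Execute(c.Name, c.Input)
			if err != nil {
				out = "error: " + err.Error() // report the failure to the model instead of aborting
			}
			transcript = append(transcript, out)
		}
	}
	return "", counts, errors.New("max turns reached without a final answer")
}

// echoExecutor returns the tool name and raw input as the tool result.
type echoExecutor struct{}

func (echoExecutor) Execute(name string, input json.RawMessage) (string, error) {
	return name + ":" + string(input), nil
}

// scriptedModel requests one read_file call, then stops.
func scriptedModel(transcript []string) []toolCall {
	if len(transcript) == 1 {
		return []toolCall{{Name: "read_file", Input: json.RawMessage(`{"path":"a.go"}`)}}
	}
	return nil
}

func main() {
	extract := func() (string, error) { return "diff --git a/a.go b/a.go", nil }
	patch, counts, err := runLoop(scriptedModel, echoExecutor{}, 25, extract)
	fmt.Println(patch, counts, err)
}
```

The per-tool counter corresponds to the ToolCalls map in AgentStats; token accounting and timing are omitted.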
type AgentStats ¶
type AgentStats struct {
	TotalTurns   int            `json:"total_turns"`
	ToolCalls    map[string]int `json:"tool_calls"`    // tool_name → count
	InputTokens  int            `json:"input_tokens"`  // cumulative
	OutputTokens int            `json:"output_tokens"` // cumulative
	SynapsesUsed bool           `json:"synapses_used"` // true if any synapses tool was called
	Duration     time.Duration  `json:"duration"`
}
AgentStats tracks resource usage for a single task.
type BaselineExecutor ¶
type BaselineExecutor struct {
	RepoDir string
}
BaselineExecutor provides file-level tools on a checked-out repo.
func (*BaselineExecutor) Execute ¶
func (e *BaselineExecutor) Execute(toolName string, input json.RawMessage) (string, error)
Execute dispatches baseline tool calls (read_file, grep_search, etc.).
type ContextAccess ¶
type ContextAccess struct {
	TaskID    string
	File      string
	LineStart int
	LineEnd   int
	Tool      string // "prepare_context", "search", "get_impact", "recall"
	Timestamp time.Time
}
ContextAccess records a single file+line-range access made via a Synapses tool. Used for ContextBench Context F1 calculation.
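As an illustration of how file+line-range records can feed an F1 score, the sketch below expands each range into individual (file, line) pairs and computes F1 = 2PR / (P + R) against a gold set. This line-level granularity is an assumption made for the example; the actual ContextBench scoring may differ. contextF1, expand, and lineKey are hypothetical names.

```go
package main

import "fmt"

// ContextAccess keeps only the fields this sketch needs.
type ContextAccess struct {
	File      string
	LineStart int
	LineEnd   int
}

// lineKey identifies a single line in a file.
type lineKey struct {
	File string
	Line int
}

// expand turns inclusive line ranges into a set of individual lines.
func expand(accesses []ContextAccess) map[lineKey]bool {
	set := map[lineKey]bool{}
	for _, a := range accesses {
		for l := a.LineStart; l <= a.LineEnd; l++ {
			set[lineKey{a.File, l}] = true
		}
	}
	return set
}

// contextF1 computes line-level F1 between retrieved and gold ranges.
// It returns 0 when either set is empty or there is no overlap.
func contextF1(retrieved, gold []ContextAccess) float64 {
	r, g := expand(retrieved), expand(gold)
	if len(r) == 0 || len(g) == 0 {
		return 0
	}
	hits := 0
	for k := range r {
		if g[k] {
			hits++
		}
	}
	if hits == 0 {
		return 0
	}
	precision := float64(hits) / float64(len(r))
	recall := float64(hits) / float64(len(g))
	return 2 * precision * recall / (precision + recall)
}

func main() {
	retrieved := []ContextAccess{{File: "a.go", LineStart: 1, LineEnd: 10}}
	gold := []ContextAccess{{File: "a.go", LineStart: 6, LineEnd: 15}}
	// 5 of 10 retrieved lines are gold, and 5 of 10 gold lines were retrieved.
	fmt.Println(contextF1(retrieved, gold))
}
```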
type ContextResult ¶
ContextResult is returned by PrepareContext.
type ImpactResult ¶
ImpactResult is returned by GetImpact.
type RankedCandidate ¶
type RankedCandidate struct {
	Index int     // original index in the candidates slice
	Score float64 // higher = more relevant
}
RankedCandidate is one entry in a ranked candidate list.
type RecallResult ¶
RecallResult is returned by Recall.
type SearchResult ¶
SearchResult is returned by the Search tool.
type SynapsesClient ¶
type SynapsesClient struct {
	// contains filtered or unexported fields
}
SynapsesClient calls a running Synapses daemon over HTTP. For a control run where all tools return empty results, create the client with NewDisabledClient, which sets the unexported disabled flag.
func NewClient ¶
func NewClient(endpoint, project string) *SynapsesClient
NewClient creates a live client that calls the Synapses daemon. The auth token is optional: it is read from ~/.synapses/auth_token when the daemon requires Bearer auth.
func NewDisabledClient ¶
func NewDisabledClient() *SynapsesClient
NewDisabledClient creates a no-op client for control runs. All tool calls succeed but return empty results.
func (*SynapsesClient) DrainAccesses ¶
func (c *SynapsesClient) DrainAccesses() []ContextAccess
DrainAccesses returns all recorded context accesses and clears the log.
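A drain operation like this is usually a swap under a lock: return the current slice and replace it with an empty one. The sketch below shows that pattern with a hypothetical accessLog type; the real client's fields are unexported and may be organised differently.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ContextAccess is redeclared here so the sketch is self-contained.
type ContextAccess struct {
	TaskID    string
	File      string
	LineStart int
	LineEnd   int
	Tool      string
	Timestamp time.Time
}

// accessLog is a hypothetical mutex-guarded log of context accesses.
type accessLog struct {
	mu      sync.Mutex
	entries []ContextAccess
}

// record appends one access; safe for concurrent use.
func (l *accessLog) record(a ContextAccess) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.entries = append(l.entries, a)
}

// drain returns everything recorded so far and clears the log.
func (l *accessLog) drain() []ContextAccess {
	l.mu.Lock()
	defer l.mu.Unlock()
	out := l.entries
	l.entries = nil // later records go into a fresh slice, so out is never mutated
	return out
}

func main() {
	var log accessLog
	log.record(ContextAccess{TaskID: "t1", File: "a.go", LineStart: 1, LineEnd: 20, Tool: "search", Timestamp: time.Now()})
	log.record(ContextAccess{TaskID: "t1", File: "b.go", LineStart: 5, LineEnd: 9, Tool: "get_impact", Timestamp: time.Now()})
	fmt.Println(len(log.drain()), len(log.drain()))
}
```

Draining once per task keeps accesses from one task out of the next task's score.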
func (*SynapsesClient) GetContextJSON ¶
func (c *SynapsesClient) GetContextJSON(taskID, entity, detailLevel string) (string, error)
GetContextJSON calls get_context with format=json and returns the raw JSON string. The JSON gives structured callees, callers, and related nodes, which is far more reliable than regex-parsing the Markdown from prepare_context.
func (*SynapsesClient) GetImpact ¶
func (c *SynapsesClient) GetImpact(taskID, entity string) (*ImpactResult, error)
GetImpact calls the get_impact tool.
func (*SynapsesClient) GetImpactWithDepth ¶
func (c *SynapsesClient) GetImpactWithDepth(taskID, entity string, depth int) (*ImpactResult, error)
GetImpactWithDepth calls get_impact with an explicit depth parameter.
func (*SynapsesClient) PrepareContext ¶
func (c *SynapsesClient) PrepareContext(taskID, entity, intent string) (*ContextResult, error)
PrepareContext calls get_context with mode=intent (was prepare_context before Sprint 24).
func (*SynapsesClient) RankCandidates ¶
func (c *SynapsesClient) RankCandidates(query string, candidates []string) ([]RankedCandidate, error)
RankCandidates calls the Synapses embedder via the embed_snippet tool (if available) or falls back to requesting the daemon to rank a list of candidate snippets against a query. Used by RepoBench-R (Approach B: embedding-based ranking without needing a live repo index).
Returns a ranked list of (index, score) pairs, highest score first.
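Embedding-based ranking generally means scoring each candidate vector against the query vector and sorting by score. The sketch below uses cosine similarity over precomputed vectors; this metric, and the cosine and rank helpers, are assumptions for illustration, since the daemon's actual scoring is not documented here.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// RankedCandidate is redeclared here so the sketch is self-contained.
type RankedCandidate struct {
	Index int     // original index in the candidates slice
	Score float64 // higher = more relevant
}

// cosine returns the cosine similarity of a and b, or 0 for mismatched
// lengths or zero vectors.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// rank scores every candidate embedding against the query embedding and
// returns (index, score) pairs, highest score first.
func rank(query []float64, candidates [][]float64) []RankedCandidate {
	ranked := make([]RankedCandidate, len(candidates))
	for i, c := range candidates {
		ranked[i] = RankedCandidate{Index: i, Score: cosine(query, c)}
	}
	// Stable sort keeps the original order among equal scores.
	sort.SliceStable(ranked, func(i, j int) bool { return ranked[i].Score > ranked[j].Score })
	return ranked
}

func main() {
	query := []float64{1, 0}
	candidates := [][]float64{{0, 1}, {1, 0}, {1, 1}}
	for _, r := range rank(query, candidates) {
		fmt.Printf("index=%d score=%.3f\n", r.Index, r.Score)
	}
}
```

Keeping the original index in each result lets the caller map ranks back to the candidates slice it passed in.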
func (*SynapsesClient) Recall ¶
func (c *SynapsesClient) Recall(taskID, query string) (*RecallResult, error)
Recall calls memory(action=search) (was the recall tool before Sprint 24).
func (*SynapsesClient) Search ¶
func (c *SynapsesClient) Search(taskID, query string) (*SearchResult, error)
Search calls the search tool.
func (*SynapsesClient) SearchWithMode ¶
func (c *SynapsesClient) SearchWithMode(query, mode string) (string, error)
SearchWithMode calls the search tool with an explicit mode (e.g. "vector").
func (*SynapsesClient) WithProject ¶
func (c *SynapsesClient) WithProject(project string) *SynapsesClient
WithProject returns a shallow copy of the client with a different project path. Used for per-repo routing in RepoBench-R: each sample gets routed to its own indexed project directory.
type SynapsesExecutor ¶
type SynapsesExecutor struct {
	BaselineExecutor
	Client *SynapsesClient
	TaskID string
}
SynapsesExecutor extends BaselineExecutor with Synapses MCP tools.
func (*SynapsesExecutor) Execute ¶
func (e *SynapsesExecutor) Execute(toolName string, input json.RawMessage) (string, error)
Execute dispatches Synapses-augmented tool calls, falling back to baseline tools.
type ToolDef ¶
type ToolDef struct {
	Name        string                 `json:"name"`
	Description string                 `json:"description"`
	InputSchema map[string]interface{} `json:"input_schema"`
}
ToolDef matches the Claude API tool schema.
func BaselineTools ¶
func BaselineTools() []ToolDef
BaselineTools returns tool definitions for the baseline agent mode.
func SynapsesTools ¶
func SynapsesTools() []ToolDef
SynapsesTools returns baseline + Synapses MCP tool definitions.
type ToolExecutor ¶
type ToolExecutor interface {
	Execute(toolName string, input json.RawMessage) (string, error)
}
ToolExecutor executes a tool by name and returns the text result.