Documentation ¶
Overview ¶
Package tools provides tool/function calling infrastructure for LLM testing.
This package implements a flexible tool execution system with:
- Tool descriptor registry with JSON Schema validation
- Mock executors for testing (static and template-based)
- HTTP executor for live API calls
- Type coercion and result validation
- Adapter for prompt registry integration
Tools can be loaded from YAML/JSON files and executed with argument validation, result schema checking, and automatic type coercion for common mismatches.
Index ¶
- type AsyncToolExecutor
- type Coercion
- type Executor
- type FileToolResponseRepository
- type HTTPConfig
- type InMemoryToolResponseRepository
- type MCPExecutor
- type MockScriptedExecutor
- type MockStaticExecutor
- type MockToolErrorConfig
- type MockToolResponseConfig
- type PendingToolInfo
- type PredictMessage
- type PredictionRequest
- type PredictionResponse
- type Registry
- func (r *Registry) Execute(toolName string, args json.RawMessage) (*ToolResult, error)
- func (r *Registry) ExecuteAsync(toolName string, args json.RawMessage) (*ToolExecutionResult, error)
- func (r *Registry) Get(name string) *ToolDescriptor
- func (r *Registry) GetTool(name string) (*ToolDescriptor, error)
- func (r *Registry) GetTools() map[string]*ToolDescriptor
- func (r *Registry) GetToolsByNames(names []string) ([]*ToolDescriptor, error)
- func (r *Registry) List() []string
- func (r *Registry) LoadToolFromBytes(filename string, data []byte) error
- func (r *Registry) Register(descriptor *ToolDescriptor) error
- func (r *Registry) RegisterExecutor(executor Executor)
- type RepositoryToolExecutor
- type SchemaValidator
- func (sv *SchemaValidator) CoerceResult(descriptor *ToolDescriptor, result json.RawMessage) (json.RawMessage, []Coercion, error)
- func (sv *SchemaValidator) ValidateArgs(descriptor *ToolDescriptor, args json.RawMessage) error
- func (sv *SchemaValidator) ValidateResult(descriptor *ToolDescriptor, result json.RawMessage) error
- type ToolCall
- type ToolConfig
- type ToolDescriptor
- type ToolErrorData
- type ToolExecutionResult
- type ToolExecutionStatus
- type ToolGuidance
- type ToolPolicy
- type ToolRepository
- type ToolResponseData
- type ToolResponseRepository
- type ToolResult
- type ToolStats
- type ValidationError
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AsyncToolExecutor ¶
type AsyncToolExecutor interface {
Executor // Still implements the basic Executor interface
// ExecuteAsync may return immediately with a pending status
ExecuteAsync(descriptor *ToolDescriptor, args json.RawMessage) (*ToolExecutionResult, error)
}
AsyncToolExecutor is a tool that can return pending status instead of blocking. Tools that require human approval or external async operations should implement this.
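A minimal sketch of an executor that reports a pending status instead of blocking. Because the package import path is not shown on this page, the snippet declares trimmed local stand-ins for the documented types; ApprovalExecutor and its approval map are hypothetical and only illustrate the pattern.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins mirroring the types documented on this page (trimmed).
type ToolDescriptor struct{ Name string }

type ToolExecutionStatus string

const (
	ToolStatusComplete ToolExecutionStatus = "complete"
	ToolStatusPending  ToolExecutionStatus = "pending"
)

type PendingToolInfo struct {
	Reason   string          `json:"reason"`
	Message  string          `json:"message"`
	ToolName string          `json:"tool_name"`
	Args     json.RawMessage `json:"args"`
}

type ToolExecutionResult struct {
	Status      ToolExecutionStatus `json:"status"`
	Content     json.RawMessage     `json:"content,omitempty"`
	PendingInfo *PendingToolInfo    `json:"pending_info,omitempty"`
}

// ApprovalExecutor is hypothetical: it only runs tools listed in Approved.
type ApprovalExecutor struct{ Approved map[string]bool }

func (e *ApprovalExecutor) Name() string { return "approval" }

// Execute is the blocking path required by the Executor interface.
func (e *ApprovalExecutor) Execute(d *ToolDescriptor, args json.RawMessage) (json.RawMessage, error) {
	if !e.Approved[d.Name] {
		return nil, fmt.Errorf("tool %q has not been approved", d.Name)
	}
	return json.RawMessage(`{"ok":true}`), nil
}

// ExecuteAsync returns a pending result for unapproved tools instead of failing.
func (e *ApprovalExecutor) ExecuteAsync(d *ToolDescriptor, args json.RawMessage) (*ToolExecutionResult, error) {
	if !e.Approved[d.Name] {
		return &ToolExecutionResult{
			Status: ToolStatusPending,
			PendingInfo: &PendingToolInfo{
				Reason:   "requires_approval",
				Message:  "waiting for a human reviewer",
				ToolName: d.Name,
				Args:     args,
			},
		}, nil
	}
	content, err := e.Execute(d, args)
	if err != nil {
		return nil, err
	}
	return &ToolExecutionResult{Status: ToolStatusComplete, Content: content}, nil
}

func main() {
	ex := &ApprovalExecutor{Approved: map[string]bool{"read_file": true}}
	for _, name := range []string{"read_file", "delete_file"} {
		res, _ := ex.ExecuteAsync(&ToolDescriptor{Name: name}, json.RawMessage(`{}`))
		fmt.Println(name, res.Status)
	}
}
```

The caller decides what to do with a pending result, for example notifying a reviewer using the fields in PendingToolInfo.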
type Coercion ¶
type Coercion struct {
Path string `json:"path"`
From interface{} `json:"from"`
To interface{} `json:"to"`
}
Coercion represents a type coercion that was performed
type Executor ¶
type Executor interface {
Execute(descriptor *ToolDescriptor, args json.RawMessage) (json.RawMessage, error)
Name() string
}
Executor interface defines how tools are executed
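As an illustration, a custom executor only needs the two methods above. The sketch below uses local stand-ins for ToolDescriptor and Executor so that it compiles on its own; EchoExecutor is a hypothetical implementation, not part of this package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolDescriptor is a local stand-in carrying only the field this sketch needs.
type ToolDescriptor struct {
	Name string `json:"name"`
}

// Executor mirrors the interface documented above.
type Executor interface {
	Execute(descriptor *ToolDescriptor, args json.RawMessage) (json.RawMessage, error)
	Name() string
}

// EchoExecutor is hypothetical: it returns the arguments it was given,
// wrapped together with the tool name.
type EchoExecutor struct{}

func (EchoExecutor) Name() string { return "echo" }

func (EchoExecutor) Execute(d *ToolDescriptor, args json.RawMessage) (json.RawMessage, error) {
	if !json.Valid(args) {
		return nil, fmt.Errorf("tool %q: arguments are not valid JSON", d.Name)
	}
	type echoed struct {
		Tool string          `json:"tool"`
		Args json.RawMessage `json:"args"`
	}
	return json.Marshal(echoed{Tool: d.Name, Args: args})
}

func main() {
	var ex Executor = EchoExecutor{}
	out, err := ex.Execute(&ToolDescriptor{Name: "lookup"}, json.RawMessage(`{"id":7}`))
	fmt.Println(string(out), err)
}
```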
type FileToolResponseRepository ¶ added in v1.1.0
type FileToolResponseRepository struct {
// contains filtered or unexported fields
}
FileToolResponseRepository implements ToolResponseRepository using the provider's MockConfig YAML structure. This allows Arena scenarios to define tool responses alongside LLM responses.
func NewFileToolResponseRepository ¶ added in v1.1.0
func NewFileToolResponseRepository(scenarioID string, toolResponses map[string][]MockToolResponseConfig) *FileToolResponseRepository
NewFileToolResponseRepository creates a repository from scenario tool responses. This is typically used by Arena to provide tool mocking from YAML scenarios.
func (*FileToolResponseRepository) GetToolResponse ¶ added in v1.1.0
func (r *FileToolResponseRepository) GetToolResponse(toolName string, args map[string]interface{}, contextKey string) (*ToolResponseData, error)
GetToolResponse implements ToolResponseRepository. It finds the first matching response based on argument comparison.
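The exact comparison rule is not documented here. The sketch below shows one plausible first-match strategy, assuming a configured response matches when every key in its call_args is present in the call with a deeply equal value. argsMatch and firstMatch are hypothetical helpers, and MockToolResponseConfig is a trimmed local copy of the documented struct.

```go
package main

import (
	"fmt"
	"reflect"
)

// MockToolResponseConfig is a trimmed local copy of the documented struct.
type MockToolResponseConfig struct {
	CallArgs map[string]interface{}
	Result   interface{}
}

// argsMatch is a hypothetical rule: every expected key must be present in the
// actual arguments with a deeply equal value. Extra arguments are ignored, so
// an empty CallArgs map matches any call.
func argsMatch(expected, actual map[string]interface{}) bool {
	for k, want := range expected {
		got, ok := actual[k]
		if !ok || !reflect.DeepEqual(want, got) {
			return false
		}
	}
	return true
}

// firstMatch returns the first configured response whose CallArgs match,
// or nil when none does.
func firstMatch(configs []MockToolResponseConfig, args map[string]interface{}) *MockToolResponseConfig {
	for i := range configs {
		if argsMatch(configs[i].CallArgs, args) {
			return &configs[i]
		}
	}
	return nil
}

func main() {
	configs := []MockToolResponseConfig{
		{CallArgs: map[string]interface{}{"city": "Paris"}, Result: "sunny"},
		{CallArgs: map[string]interface{}{}, Result: "unknown"}, // catch-all entry
	}
	fmt.Println(firstMatch(configs, map[string]interface{}{"city": "Paris"}).Result)
	fmt.Println(firstMatch(configs, map[string]interface{}{"city": "Oslo"}).Result)
}
```

Under this assumed rule the order of entries matters: a catch-all entry placed first would shadow every later, more specific entry.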
type HTTPConfig ¶
type HTTPConfig struct {
URL string `json:"url" yaml:"url"`
Method string `json:"method" yaml:"method"`
HeadersFromEnv []string `json:"headers_from_env,omitempty" yaml:"headers_from_env,omitempty"`
TimeoutMs int `json:"timeout_ms" yaml:"timeout_ms"`
Redact []string `json:"redact,omitempty" yaml:"redact,omitempty"`
Headers map[string]string `json:"headers,omitempty" yaml:"headers,omitempty"`
}
HTTPConfig defines configuration for live HTTP tool execution
type InMemoryToolResponseRepository ¶ added in v1.1.0
type InMemoryToolResponseRepository struct {
// contains filtered or unexported fields
}
InMemoryToolResponseRepository implements ToolResponseRepository using in-memory storage. This is useful for SDK unit tests and programmatic configuration of tool responses.
func NewInMemoryToolResponseRepository ¶ added in v1.1.0
func NewInMemoryToolResponseRepository() *InMemoryToolResponseRepository
NewInMemoryToolResponseRepository creates a new in-memory tool response repository.
func (*InMemoryToolResponseRepository) AddResponse ¶ added in v1.1.0
func (r *InMemoryToolResponseRepository) AddResponse(contextKey, toolName string, response *ToolResponseData)
AddResponse adds a tool response for a specific context and tool name. This method supports simple responses where argument matching is not needed.
func (*InMemoryToolResponseRepository) GetToolResponse ¶ added in v1.1.0
func (r *InMemoryToolResponseRepository) GetToolResponse(toolName string, args map[string]interface{}, contextKey string) (*ToolResponseData, error)
GetToolResponse implements ToolResponseRepository. For simplicity, this implementation only matches by tool name and context, not by arguments. For argument-based matching, use FileToolResponseRepository or implement a custom repository.
type MCPExecutor ¶
type MCPExecutor struct {
// contains filtered or unexported fields
}
MCPExecutor executes tools using MCP (Model Context Protocol) servers
func NewMCPExecutor ¶
func NewMCPExecutor(registry mcp.Registry) *MCPExecutor
NewMCPExecutor creates a new MCP executor
func (*MCPExecutor) Execute ¶
func (e *MCPExecutor) Execute(descriptor *ToolDescriptor, args json.RawMessage) (json.RawMessage, error)
Execute executes a tool using an MCP server
type MockScriptedExecutor ¶
type MockScriptedExecutor struct{}
MockScriptedExecutor executes tools using templated mock data
func NewMockScriptedExecutor ¶
func NewMockScriptedExecutor() *MockScriptedExecutor
NewMockScriptedExecutor creates a new scripted mock executor
func (*MockScriptedExecutor) Execute ¶
func (e *MockScriptedExecutor) Execute(descriptor *ToolDescriptor, args json.RawMessage) (json.RawMessage, error)
Execute executes a tool using templated mock data
func (*MockScriptedExecutor) Name ¶
func (e *MockScriptedExecutor) Name() string
Name returns the executor name
type MockStaticExecutor ¶
type MockStaticExecutor struct{}
MockStaticExecutor executes tools using static mock data
func NewMockStaticExecutor ¶
func NewMockStaticExecutor() *MockStaticExecutor
NewMockStaticExecutor creates a new static mock executor
func (*MockStaticExecutor) Execute ¶
func (e *MockStaticExecutor) Execute(descriptor *ToolDescriptor, args json.RawMessage) (json.RawMessage, error)
Execute executes a tool using static mock data
func (*MockStaticExecutor) Name ¶
func (e *MockStaticExecutor) Name() string
Name returns the executor name
type MockToolErrorConfig ¶ added in v1.1.0
MockToolErrorConfig represents an error configuration.
type MockToolResponseConfig ¶ added in v1.1.0
type MockToolResponseConfig struct {
CallArgs map[string]interface{} `yaml:"call_args"`
Result interface{} `yaml:"result,omitempty"`
Error *MockToolErrorConfig `yaml:"error,omitempty"`
}
MockToolResponseConfig represents a single tool response configuration.
type PendingToolInfo ¶
type PendingToolInfo struct {
// Reason for pending (e.g., "requires_approval", "waiting_external_api")
Reason string `json:"reason"`
// Human-readable description
Message string `json:"message"`
// Tool details (for middleware to use in notifications)
ToolName string `json:"tool_name"`
Args json.RawMessage `json:"args"`
// Optional: expiration, callback URL, etc.
ExpiresAt *time.Time `json:"expires_at,omitempty"`
CallbackURL string `json:"callback_url,omitempty"`
// Arbitrary metadata for custom middleware
Metadata map[string]interface{} `json:"metadata,omitempty"`
}
PendingToolInfo provides context for middleware (email templates, notifications)
type PredictMessage ¶ added in v1.1.0
type PredictMessage struct {
Role string `json:"role"`
Content string `json:"content"`
ToolCalls []ToolCall `json:"tool_calls,omitempty"`
ToolCallResponseID string `json:"tool_call_id,omitempty"` // For tool result messages
}
PredictMessage represents a predict message (simplified version for tool context)
type PredictionRequest ¶ added in v1.1.0
type PredictionRequest struct {
System string `json:"system"`
Messages []PredictMessage `json:"messages"`
Temperature float32 `json:"temperature"`
TopP float32 `json:"top_p"`
MaxTokens int `json:"max_tokens"`
Seed *int `json:"seed,omitempty"`
}
PredictionRequest represents a predict request (extending existing type)
type PredictionResponse ¶ added in v1.1.0
type PredictionResponse struct {
Content string `json:"content"`
TokensIn int `json:"tokens_in"`
TokensOut int `json:"tokens_out"`
Latency time.Duration `json:"latency"`
Raw []byte `json:"raw,omitempty"`
ToolCalls []ToolCall `json:"tool_calls,omitempty"` // Tools called in this response
}
PredictionResponse represents a predict response (extending existing type)
type Registry ¶
type Registry struct {
// contains filtered or unexported fields
}
Registry manages tool descriptors and provides access to executors
func NewRegistry ¶
func NewRegistry() *Registry
NewRegistry creates a new tool registry without a repository backend (legacy mode)
func NewRegistryWithRepository ¶
func NewRegistryWithRepository(repository ToolRepository) *Registry
NewRegistryWithRepository creates a new tool registry with a repository backend
func (*Registry) Execute ¶
func (r *Registry) Execute(toolName string, args json.RawMessage) (*ToolResult, error)
Execute executes a tool with the given arguments
func (*Registry) ExecuteAsync ¶
func (r *Registry) ExecuteAsync(toolName string, args json.RawMessage) (*ToolExecutionResult, error)
ExecuteAsync executes a tool with async support, checking if it implements AsyncToolExecutor. Returns ToolExecutionResult with status (complete/pending/failed).
func (*Registry) Get ¶
func (r *Registry) Get(name string) *ToolDescriptor
Get retrieves a tool descriptor by name with repository fallback
func (*Registry) GetTool ¶
func (r *Registry) GetTool(name string) (*ToolDescriptor, error)
GetTool retrieves a tool descriptor by name
func (*Registry) GetTools ¶
func (r *Registry) GetTools() map[string]*ToolDescriptor
GetTools returns all loaded tool descriptors
func (*Registry) GetToolsByNames ¶
func (r *Registry) GetToolsByNames(names []string) ([]*ToolDescriptor, error)
GetToolsByNames returns tool descriptors for the specified names
func (*Registry) LoadToolFromBytes ¶
func (r *Registry) LoadToolFromBytes(filename string, data []byte) error
LoadToolFromBytes loads a tool descriptor from raw bytes. This is useful when tool data has already been read from a file or received from another source, avoiding redundant file I/O. The filename parameter is used only for error reporting.
func (*Registry) Register ¶
func (r *Registry) Register(descriptor *ToolDescriptor) error
Register adds a tool descriptor to the registry with validation
func (*Registry) RegisterExecutor ¶
func (r *Registry) RegisterExecutor(executor Executor)
RegisterExecutor registers a tool executor
type RepositoryToolExecutor ¶ added in v1.1.0
type RepositoryToolExecutor struct {
// contains filtered or unexported fields
}
RepositoryToolExecutor wraps existing tool executors to provide repository-backed mock responses with fallback to real execution. This enables deterministic tool testing while maintaining the ability to fall back to real tool execution when needed.
func NewRepositoryToolExecutor ¶ added in v1.1.0
func NewRepositoryToolExecutor(baseExecutor Executor, repo ToolResponseRepository, contextKey string) *RepositoryToolExecutor
NewRepositoryToolExecutor creates a new repository-backed tool executor. The executor will first check the repository for configured responses, and fall back to the base executor if no match is found.
func (*RepositoryToolExecutor) Execute ¶ added in v1.1.0
func (e *RepositoryToolExecutor) Execute(descriptor *ToolDescriptor, args json.RawMessage) (json.RawMessage, error)
Execute executes a tool, first checking the repository for mock responses. If a matching response is found in the repository, it returns that response. Otherwise, it falls back to the base executor for real execution.
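The lookup order described above can be sketched as follows. All types are local stand-ins so the example compiles on its own; mapRepo, liveExecutor, and executeWithFallback are hypothetical, and the way a configured error is surfaced (as a Go error built from its type and message) is an assumption rather than documented behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed local stand-ins for the documented types.
type ToolDescriptor struct{ Name string }

type ToolErrorData struct{ Type, Message string }

type ToolResponseData struct {
	Result interface{}
	Error  *ToolErrorData
}

type Executor interface {
	Execute(d *ToolDescriptor, args json.RawMessage) (json.RawMessage, error)
	Name() string
}

type ToolResponseRepository interface {
	GetToolResponse(toolName string, args map[string]interface{}, contextKey string) (*ToolResponseData, error)
}

// mapRepo is a hypothetical repository keyed by tool name only.
type mapRepo map[string]*ToolResponseData

func (m mapRepo) GetToolResponse(tool string, _ map[string]interface{}, _ string) (*ToolResponseData, error) {
	return m[tool], nil // a nil response means "nothing configured"
}

// liveExecutor stands in for a real base executor.
type liveExecutor struct{}

func (liveExecutor) Name() string { return "live" }
func (liveExecutor) Execute(*ToolDescriptor, json.RawMessage) (json.RawMessage, error) {
	return json.RawMessage(`"from base executor"`), nil
}

// executeWithFallback checks the repository first and falls back to base.
func executeWithFallback(base Executor, repo ToolResponseRepository, ctxKey string,
	d *ToolDescriptor, args json.RawMessage) (json.RawMessage, error) {
	var parsed map[string]interface{}
	if len(args) > 0 {
		if err := json.Unmarshal(args, &parsed); err != nil {
			return nil, fmt.Errorf("parse args for %q: %w", d.Name, err)
		}
	}
	resp, err := repo.GetToolResponse(d.Name, parsed, ctxKey)
	if err != nil {
		return nil, err
	}
	if resp == nil {
		return base.Execute(d, args) // no mock configured: real execution
	}
	if resp.Error != nil {
		return nil, fmt.Errorf("%s: %s", resp.Error.Type, resp.Error.Message)
	}
	return json.Marshal(resp.Result)
}

func main() {
	repo := mapRepo{"get_weather": {Result: "sunny"}}
	for _, name := range []string{"get_weather", "get_time"} {
		out, err := executeWithFallback(liveExecutor{}, repo, "scenario-1",
			&ToolDescriptor{Name: name}, json.RawMessage(`{}`))
		fmt.Println(name, string(out), err)
	}
}
```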
func (*RepositoryToolExecutor) Name ¶ added in v1.1.0
func (e *RepositoryToolExecutor) Name() string
Name returns the executor name with repository suffix.
type SchemaValidator ¶
type SchemaValidator struct {
// contains filtered or unexported fields
}
SchemaValidator handles JSON schema validation for tool inputs and outputs
func NewSchemaValidator ¶
func NewSchemaValidator() *SchemaValidator
NewSchemaValidator creates a new schema validator
func (*SchemaValidator) CoerceResult ¶
func (sv *SchemaValidator) CoerceResult(descriptor *ToolDescriptor, result json.RawMessage) (json.RawMessage, []Coercion, error)
CoerceResult attempts to coerce simple type mismatches in tool results
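Which mismatches CoerceResult repairs is not listed on this page. The sketch below illustrates the general idea under stated assumptions: a hypothetical coerceTopLevel helper converts top-level string values to numbers or booleans when a caller-supplied type map asks for them, and records each change as a Coercion. The real method works from the descriptor's output schema rather than a type map, and the "/name" path format used here is also an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
)

// Coercion mirrors the struct documented on this page.
type Coercion struct {
	Path string      `json:"path"`
	From interface{} `json:"from"`
	To   interface{} `json:"to"`
}

// coerceTopLevel is hypothetical. wantTypes maps a top-level property name to
// "number" or "boolean"; string values are converted when they parse cleanly
// and left untouched otherwise.
func coerceTopLevel(result json.RawMessage, wantTypes map[string]string) (json.RawMessage, []Coercion, error) {
	var obj map[string]interface{}
	if err := json.Unmarshal(result, &obj); err != nil {
		return nil, nil, err
	}
	// Visit keys in sorted order so the list of coercions is deterministic.
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var applied []Coercion
	for _, k := range keys {
		s, isString := obj[k].(string)
		if !isString {
			continue
		}
		var converted interface{}
		switch wantTypes[k] {
		case "number":
			f, err := strconv.ParseFloat(s, 64)
			if err != nil {
				continue
			}
			converted = f
		case "boolean":
			b, err := strconv.ParseBool(s)
			if err != nil {
				continue
			}
			converted = b
		default:
			continue
		}
		obj[k] = converted
		applied = append(applied, Coercion{Path: "/" + k, From: s, To: converted})
	}
	out, err := json.Marshal(obj)
	return out, applied, err
}

func main() {
	out, applied, err := coerceTopLevel(
		json.RawMessage(`{"count":"42","ok":"true","name":"x"}`),
		map[string]string{"count": "number", "ok": "boolean"},
	)
	fmt.Println(string(out), len(applied), err)
}
```

Returning the list of coercions alongside the result lets a test report flag tools whose output only passes validation after repair.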
func (*SchemaValidator) ValidateArgs ¶
func (sv *SchemaValidator) ValidateArgs(descriptor *ToolDescriptor, args json.RawMessage) error
ValidateArgs validates tool arguments against the input schema
func (*SchemaValidator) ValidateResult ¶
func (sv *SchemaValidator) ValidateResult(descriptor *ToolDescriptor, result json.RawMessage) error
ValidateResult validates tool result against the output schema
type ToolCall ¶
type ToolCall struct {
Name string `json:"name"`
Args json.RawMessage `json:"args"`
ID string `json:"id"` // Provider-specific call ID
}
ToolCall represents a tool invocation request
type ToolConfig ¶
type ToolConfig struct {
APIVersion string `yaml:"apiVersion"`
Kind string `yaml:"kind"`
Metadata metav1.ObjectMeta `yaml:"metadata,omitempty"`
Spec ToolDescriptor `yaml:"spec"`
}
ToolConfig represents a K8s-style tool configuration manifest
type ToolDescriptor ¶
type ToolDescriptor struct {
Name string `json:"name" yaml:"name"`
Description string `json:"description" yaml:"description"`
InputSchema json.RawMessage `json:"input_schema" yaml:"input_schema"` // JSON Schema Draft-07
OutputSchema json.RawMessage `json:"output_schema" yaml:"output_schema"` // JSON Schema Draft-07
Mode string `json:"mode" yaml:"mode"` // "mock" | "live"
TimeoutMs int `json:"timeout_ms" yaml:"timeout_ms"`
// Static mock data (in-memory)
MockResult json.RawMessage `json:"mock_result,omitempty" yaml:"mock_result,omitempty"`
// Template for dynamic mocks (inline or file)
MockTemplate string `json:"mock_template,omitempty" yaml:"mock_template,omitempty"`
MockResultFile string `json:"mock_result_file,omitempty" yaml:"mock_result_file,omitempty"`
MockTemplateFile string `json:"mock_template_file,omitempty" yaml:"mock_template_file,omitempty"`
HTTPConfig *HTTPConfig `json:"http,omitempty" yaml:"http,omitempty"` // Live HTTP configuration
}
ToolDescriptor represents a normalized tool definition
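For illustration, a JSON descriptor can be decoded using the field tags shown above. The sketch uses a trimmed local copy of ToolDescriptor; the get_weather tool, its schemas, and the parseDescriptor helper are made-up examples, and the registry's own loader may apply additional validation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolDescriptor is a trimmed local copy of the documented struct.
type ToolDescriptor struct {
	Name         string          `json:"name"`
	Description  string          `json:"description"`
	InputSchema  json.RawMessage `json:"input_schema"`
	OutputSchema json.RawMessage `json:"output_schema"`
	Mode         string          `json:"mode"` // "mock" | "live"
	TimeoutMs    int             `json:"timeout_ms"`
	MockResult   json.RawMessage `json:"mock_result,omitempty"`
}

// weatherTool is a made-up descriptor used only for this example.
const weatherTool = `{
  "name": "get_weather",
  "description": "Returns the current weather for a city",
  "input_schema": {"type":"object","properties":{"city":{"type":"string"}},"required":["city"]},
  "output_schema": {"type":"object","properties":{"temp_c":{"type":"number"}}},
  "mode": "mock",
  "timeout_ms": 2000,
  "mock_result": {"temp_c":21}
}`

// parseDescriptor is a hypothetical helper that decodes and sanity-checks
// a descriptor; it does not validate the embedded schemas.
func parseDescriptor(data []byte) (*ToolDescriptor, error) {
	var d ToolDescriptor
	if err := json.Unmarshal(data, &d); err != nil {
		return nil, fmt.Errorf("decode tool descriptor: %w", err)
	}
	if d.Name == "" {
		return nil, fmt.Errorf("tool descriptor has no name")
	}
	return &d, nil
}

func main() {
	d, err := parseDescriptor([]byte(weatherTool))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d.Name, d.Mode, d.TimeoutMs, string(d.MockResult))
}
```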
type ToolErrorData ¶ added in v1.1.0
type ToolErrorData struct {
Type string `json:"type"` // Error type/category
Message string `json:"message"` // Error message
}
ToolErrorData represents an error response for tool execution.
type ToolExecutionResult ¶
type ToolExecutionResult struct {
Status ToolExecutionStatus `json:"status"`
Content json.RawMessage `json:"content,omitempty"`
Error string `json:"error,omitempty"`
// Present when Status == ToolStatusPending
PendingInfo *PendingToolInfo `json:"pending_info,omitempty"`
}
ToolExecutionResult includes status and optional pending information
type ToolExecutionStatus ¶
type ToolExecutionStatus string
ToolExecutionStatus represents whether a tool completed or needs external input
const (
// ToolStatusComplete indicates the tool finished executing
ToolStatusComplete ToolExecutionStatus = "complete"
// ToolStatusPending indicates the tool is waiting for external input (e.g., human approval)
ToolStatusPending ToolExecutionStatus = "pending"
// ToolStatusFailed indicates the tool execution failed
ToolStatusFailed ToolExecutionStatus = "failed"
)
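A caller would typically branch on the three status values. The sketch below uses trimmed local copies of the documented types; summarize is a hypothetical helper, not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ToolExecutionStatus string

const (
	ToolStatusComplete ToolExecutionStatus = "complete"
	ToolStatusPending  ToolExecutionStatus = "pending"
	ToolStatusFailed   ToolExecutionStatus = "failed"
)

// Trimmed local copies of the documented result types.
type PendingToolInfo struct {
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

type ToolExecutionResult struct {
	Status      ToolExecutionStatus `json:"status"`
	Content     json.RawMessage     `json:"content,omitempty"`
	Error       string              `json:"error,omitempty"`
	PendingInfo *PendingToolInfo    `json:"pending_info,omitempty"`
}

// summarize is hypothetical: it turns a result into a one-line description.
func summarize(res *ToolExecutionResult) string {
	switch res.Status {
	case ToolStatusComplete:
		return "done: " + string(res.Content)
	case ToolStatusPending:
		if res.PendingInfo != nil {
			return "pending: " + res.PendingInfo.Reason
		}
		return "pending"
	case ToolStatusFailed:
		return "failed: " + res.Error
	default:
		return fmt.Sprintf("unknown status %q", string(res.Status))
	}
}

func main() {
	fmt.Println(summarize(&ToolExecutionResult{
		Status:      ToolStatusPending,
		PendingInfo: &PendingToolInfo{Reason: "requires_approval"},
	}))
	fmt.Println(summarize(&ToolExecutionResult{Status: ToolStatusFailed, Error: "timeout"}))
}
```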
type ToolGuidance ¶
type ToolGuidance struct {
Support string `json:"support,omitempty"`
Assistant string `json:"assistant,omitempty"`
Generic string `json:"generic,omitempty"`
}
ToolGuidance provides hints for different interaction modes. This is a flexible structure that can be extended with task-specific guidance.
type ToolPolicy ¶
type ToolPolicy struct {
ToolChoice string `json:"tool_choice"` // "auto" | "required" | "none"
MaxToolCallsPerTurn int `json:"max_tool_calls_per_turn"`
MaxTotalToolCalls int `json:"max_total_tool_calls"`
Blocklist []string `json:"blocklist,omitempty"`
}
ToolPolicy defines constraints for tool usage in scenarios
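How the policy is enforced is not described on this page. The sketch below shows one way a caller might check a call against it, assuming a limit of 0 means "no limit" and reporting violations as a ValidationError of type "policy_violation". checkCall, the counters passed to it, and the Error message format are all hypothetical; the structs are trimmed local copies.

```go
package main

import (
	"errors"
	"fmt"
)

// Trimmed local copies of the documented structs.
type ToolPolicy struct {
	ToolChoice          string // "auto" | "required" | "none"
	MaxToolCallsPerTurn int
	MaxTotalToolCalls   int
	Blocklist           []string
}

type ValidationError struct {
	Type, Tool, Detail string
}

// Error uses an assumed message format.
func (e *ValidationError) Error() string {
	return fmt.Sprintf("%s: %s: %s", e.Type, e.Tool, e.Detail)
}

// checkCall is hypothetical. callsThisTurn and totalCalls are the numbers of
// calls already made; a limit of 0 is treated as "no limit".
func checkCall(p ToolPolicy, tool string, callsThisTurn, totalCalls int) error {
	violation := func(detail string) error {
		return &ValidationError{Type: "policy_violation", Tool: tool, Detail: detail}
	}
	if p.ToolChoice == "none" {
		return violation("tool calls are disabled")
	}
	for _, blocked := range p.Blocklist {
		if blocked == tool {
			return violation("tool is blocklisted")
		}
	}
	if p.MaxToolCallsPerTurn > 0 && callsThisTurn >= p.MaxToolCallsPerTurn {
		return violation("per-turn call limit reached")
	}
	if p.MaxTotalToolCalls > 0 && totalCalls >= p.MaxTotalToolCalls {
		return violation("total call limit reached")
	}
	return nil
}

func main() {
	policy := ToolPolicy{ToolChoice: "auto", MaxToolCallsPerTurn: 2, Blocklist: []string{"delete_file"}}

	err := checkCall(policy, "delete_file", 0, 0)
	var ve *ValidationError
	if errors.As(err, &ve) {
		fmt.Println(ve.Type, "-", ve.Detail)
	}
	fmt.Println(checkCall(policy, "search", 1, 1))
}
```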
type ToolRepository ¶
type ToolRepository interface {
LoadTool(name string) (*ToolDescriptor, error)
ListTools() ([]string, error)
SaveTool(descriptor *ToolDescriptor) error
}
ToolRepository provides abstract access to tool descriptors (local interface to avoid import cycles)
type ToolResponseData ¶ added in v1.1.0
type ToolResponseData struct {
Result interface{} `json:"result,omitempty"` // Successful response data
Error *ToolErrorData `json:"error,omitempty"` // Error response
}
ToolResponseData represents a configured tool response with optional error.
type ToolResponseRepository ¶ added in v1.1.0
type ToolResponseRepository interface {
// GetToolResponse retrieves a mock response for a tool execution.
// Returns nil if no matching response is configured (not an error).
GetToolResponse(toolName string, args map[string]interface{}, contextKey string) (*ToolResponseData, error)
}
ToolResponseRepository defines the interface for repositories that can provide mock tool responses based on tool name, arguments, and context.
type ToolResult ¶
type ToolResult struct {
Name string `json:"name"`
ID string `json:"id"` // Matches ToolCall.ID
Result json.RawMessage `json:"result"`
LatencyMs int64 `json:"latency_ms"`
Error string `json:"error,omitempty"`
}
ToolResult represents the result of a tool execution
type ToolStats ¶
type ToolStats struct {
TotalCalls int `json:"total_calls"`
ByTool map[string]int `json:"by_tool"`
}
ToolStats tracks tool usage statistics
type ValidationError ¶
type ValidationError struct {
Type string `json:"type"` // "args_invalid" | "result_invalid" | "policy_violation"
Tool string `json:"tool"`
Detail string `json:"detail"`
Path string `json:"path,omitempty"`
}
ValidationError represents a tool validation failure
func (*ValidationError) Error ¶
func (e *ValidationError) Error() string
Error implements the error interface