Documentation ¶
Overview ¶
Package runtime provides the execution runtime and isolation boundaries for code-oriented orchestration. It sits underneath toolcode and provides:
- Backend-agnostic runtime interface for executing code in sandboxed environments
- Pluggable sandbox backends (from unsafe development mode to hardened isolation)
- Clean trust boundary for running untrusted code that can still call tools
- ToolGateway abstraction for exposing tool discovery and execution to sandboxes
The runtime enforces security through SecurityProfiles that determine which backends are allowed and what resource limits apply. The ToolGateway provides a proxy interface for sandboxed code to discover and execute tools without direct access to host resources.
Architecture ¶
The main types are:
- Runtime: Main execution interface that routes requests to backends
- Backend: Sandbox implementation (see Backend Kinds below)
- ToolGateway: Interface for tool operations exposed to sandboxed code
- ExecuteRequest/ExecuteResult: Request/response types for execution
Security Profiles ¶
Three security profiles are supported:
- ProfileDev: Development mode with minimal restrictions (unsafe)
- ProfileStandard: Standard isolation (no network, read-only rootfs)
- ProfileHardened: Maximum isolation with seccomp, gVisor/Kata/microVM
Backend Kinds ¶
The following execution backends are supported:
- BackendUnsafeHost: Direct host execution (dev only, no isolation)
- BackendDocker: Docker containers with cgroups and seccomp
- BackendContainerd: Containerd for infrastructure-native deployments
- BackendKubernetes: Short-lived pods/jobs with scheduling
- BackendGVisor: Strong isolation via gVisor/runsc
- BackendKata: VM-level isolation via Kata Containers
- BackendFirecracker: MicroVM isolation (strongest)
- BackendWASM: WebAssembly in-process isolation
- BackendTemporal: Workflow orchestration (composes with sandbox backends)
- BackendRemote: Generic remote execution service
Security Requirements ¶
All non-unsafe backends MUST:
- Run as non-root
- Enforce timeouts and cancellation
- Enforce tool call and chain step limits
- Deny host filesystem access by default
- Deny network egress by default
- Provide resource controls where available
- Treat tool schemas/docs/annotations as untrusted input
Backends that cannot enforce a given limit must report that clearly via the LimitsEnforced field in ExecuteResult.
Index ¶
- Variables
- func RunBackendContractTests(t *testing.T, contract BackendContract)
- func RunGatewayContractTests(t *testing.T, contract GatewayContract)
- type Backend
- type BackendContract
- type BackendInfo
- type BackendKind
- type DefaultRuntime
- type ExecuteRequest
- type ExecuteResult
- type GatewayContract
- type Limits
- type LimitsEnforced
- type Logger
- type Runtime
- type RuntimeConfig
- type RuntimeError
- type SecurityProfile
- type ToolCallRecord
- type ToolGateway
Variables ¶
var (
	ErrRuntimeUnavailable = errors.New("runtime unavailable")

	// ErrBackendDenied is returned when a backend is denied by security policy.
	ErrBackendDenied = errors.New("backend denied by policy")

	// ErrSandboxViolation is returned when sandboxed code violates security policy.
	ErrSandboxViolation = errors.New("sandbox policy violation")

	// ErrTimeout is returned when execution exceeds the configured timeout.
	ErrTimeout = errors.New("execution timeout")

	// ErrResourceLimit is returned when a resource limit is exceeded.
	ErrResourceLimit = errors.New("resource limit exceeded")

	// ErrMissingGateway is returned when ExecuteRequest has no Gateway.
	ErrMissingGateway = errors.New("gateway is required")

	// ErrMissingCode is returned when ExecuteRequest has no Code.
	ErrMissingCode = errors.New("code is required")

	// ErrInvalidLimits is returned when Limits validation fails.
	ErrInvalidLimits = errors.New("invalid limits")
)
Sentinel errors for toolruntime operations.
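Because these are sentinel values, callers would normally match them with errors.Is rather than by comparing message strings. The following is a minimal sketch of that pattern. It redeclares two of the errors locally so that it compiles on its own, and the classify helper and its "retry"/"reject"/"fail" labels are hypothetical, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for two of the package's sentinel errors.
var (
	ErrTimeout       = errors.New("execution timeout")
	ErrBackendDenied = errors.New("backend denied by policy")
)

// classify maps an error to a coarse handling decision.
// It is a hypothetical helper; the labels are illustrative only.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrTimeout):
		return "retry"
	case errors.Is(err, ErrBackendDenied):
		return "reject"
	default:
		return "fail"
	}
}

func main() {
	// errors.Is still matches after the sentinel has been wrapped with %w.
	wrapped := fmt.Errorf("execute: %w", ErrTimeout)
	fmt.Println(classify(wrapped))
	fmt.Println(classify(ErrBackendDenied))
}
```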
Functions ¶
func RunBackendContractTests ¶
func RunBackendContractTests(t *testing.T, contract BackendContract)
RunBackendContractTests runs all contract tests for a Backend implementation.
func RunGatewayContractTests ¶
func RunGatewayContractTests(t *testing.T, contract GatewayContract)
RunGatewayContractTests runs all contract tests for a ToolGateway implementation.
Types ¶
type Backend ¶
type Backend interface {
// Kind returns the backend kind identifier.
Kind() BackendKind
// Execute runs code with the given request parameters.
// It validates the request, executes the code, and returns the result.
Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
}
Backend is the interface for code execution backends. Each backend provides a different level of isolation and security.
Contract:
- Concurrency: implementations must be safe for concurrent use.
- Context: must honor cancellation/deadlines and return ctx.Err() when canceled.
- Errors: validation errors should use ErrInvalidRequest; runtime errors should return ErrExecutionFailed (see errors.go) where applicable.
- Ownership: requests are read-only; results are caller-owned snapshots.
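The context clause of the contract can be illustrated with a toy backend. This is a sketch under simplifying assumptions: ExecuteRequest and ExecuteResult are reduced to one field each, Kind is omitted, and sleepBackend and runWithDeadline are hypothetical names that simulate work with a timer instead of running code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Simplified stand-ins for the package's request and result types.
type ExecuteRequest struct{ Code string }
type ExecuteResult struct{ Stdout string }

// sleepBackend is a hypothetical backend that simulates execution by waiting.
type sleepBackend struct{ work time.Duration }

// Execute returns ctx.Err() if the context ends before the simulated work does.
func (b sleepBackend) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error) {
	timer := time.NewTimer(b.work)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		return ExecuteResult{}, ctx.Err()
	case <-timer.C:
		return ExecuteResult{Stdout: "ran: " + req.Code}, nil
	}
}

// runWithDeadline executes with a deadline (both arguments in milliseconds)
// and returns the error from Execute, or nil on success.
func runWithDeadline(workMs, deadlineMs int) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(deadlineMs)*time.Millisecond)
	defer cancel()
	b := sleepBackend{work: time.Duration(workMs) * time.Millisecond}
	_, err := b.Execute(ctx, ExecuteRequest{Code: "1+1"})
	return err
}

func main() {
	fmt.Println(runWithDeadline(1, 1000)) // work finishes well before the deadline
	fmt.Println(runWithDeadline(1000, 1)) // deadline expires first
}
```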
type BackendContract ¶
type BackendContract struct {
// NewBackend creates a fresh backend instance for testing.
NewBackend func() Backend
// NewGateway creates a gateway for testing.
// The gateway should allow at least basic tool operations.
NewGateway func() ToolGateway
// ExpectedKind is the BackendKind the backend should return.
ExpectedKind BackendKind
// SkipExecutionTests skips tests that require actual code execution.
// Useful for backends that need complex setup (e.g., Docker).
SkipExecutionTests bool
}
BackendContract defines tests that any Backend implementation must pass. Use RunBackendContractTests to test an implementation.
type BackendInfo ¶
type BackendInfo struct {
// Kind identifies the type of backend.
Kind BackendKind
// Details contains backend-specific information.
Details map[string]any
}
BackendInfo contains information about the execution backend.
type BackendKind ¶
type BackendKind string
BackendKind identifies the type of execution backend.
const (
	// BackendUnsafeHost runs code directly on the host.
	// WARNING: No isolation - use only for trusted code in development.
	BackendUnsafeHost BackendKind = "unsafe_host"

	// BackendDocker runs code in Docker containers.
	// Good default isolation with cgroups, read-only rootfs, user remapping, and seccomp.
	BackendDocker BackendKind = "docker"

	// BackendContainerd runs code via containerd directly.
	// Similar to Docker but more infrastructure-native for servers/agents.
	BackendContainerd BackendKind = "containerd"

	// BackendKubernetes executes snippets in short-lived pods/jobs.
	// Isolation depends on configured runtime class; best for scheduling and multi-tenant controls.
	BackendKubernetes BackendKind = "kubernetes"

	// BackendGVisor runs code with gVisor (runsc) for stronger isolation.
	// Appropriate for untrusted multi-tenant execution.
	BackendGVisor BackendKind = "gvisor"

	// BackendKata runs code in Kata Containers for VM-level isolation.
	// Stronger isolation than plain containers.
	BackendKata BackendKind = "kata"

	// BackendFirecracker runs code in Firecracker microVMs.
	// Strongest isolation; higher complexity and operational cost.
	BackendFirecracker BackendKind = "firecracker"

	// BackendWASM runs code compiled to WebAssembly.
	// Strong in-process isolation; requires constrained SDK surface.
	BackendWASM BackendKind = "wasm"

	// BackendTemporal treats snippet execution as a Temporal workflow/activity.
	// Useful for long-running or resumable executions.
	// Note: Temporal is orchestration, not isolation - must compose with sandbox backends.
	BackendTemporal BackendKind = "temporal"

	// BackendRemote executes code on a remote runtime service.
	// Generic target for dedicated runtime services, batch systems, or job runners.
	BackendRemote BackendKind = "remote"
)
type DefaultRuntime ¶
type DefaultRuntime struct {
// contains filtered or unexported fields
}
DefaultRuntime is the default implementation of Runtime. It routes requests to backends based on security profiles.
func NewDefaultRuntime ¶
func NewDefaultRuntime(cfg RuntimeConfig) *DefaultRuntime
NewDefaultRuntime creates a new DefaultRuntime with the given configuration.
func (*DefaultRuntime) Execute ¶
func (r *DefaultRuntime) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
Execute implements the Runtime interface.
func (*DefaultRuntime) RegisterBackend ¶
func (r *DefaultRuntime) RegisterBackend(profile SecurityProfile, backend Backend)
RegisterBackend registers a backend for a security profile. It is safe for concurrent use and may be called after the runtime has started handling requests.
func (*DefaultRuntime) UnregisterBackend ¶
func (r *DefaultRuntime) UnregisterBackend(profile SecurityProfile)
UnregisterBackend removes the backend for a security profile. It is safe for concurrent use and may be called after the runtime has started handling requests.
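One common way to get this kind of thread safety is a map guarded by a sync.RWMutex. The sketch below shows that pattern only; the registry type, its method names, and the one-method Backend stand-in are assumptions for illustration, and the source does not say how DefaultRuntime is implemented internally.

```go
package main

import (
	"fmt"
	"sync"
)

type SecurityProfile string

// Backend is reduced to a single method for this sketch.
type Backend interface{ Kind() string }

type fakeBackend string

func (f fakeBackend) Kind() string { return string(f) }

// registry is a hypothetical mutex-guarded profile-to-backend map.
type registry struct {
	mu       sync.RWMutex
	backends map[SecurityProfile]Backend
}

func newRegistry() *registry {
	return &registry{backends: make(map[SecurityProfile]Backend)}
}

// Register takes the write lock, so it can race safely with Lookup.
func (r *registry) Register(p SecurityProfile, b Backend) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.backends[p] = b
}

func (r *registry) Unregister(p SecurityProfile) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.backends, p)
}

// Lookup takes only the read lock, so concurrent lookups do not block each other.
func (r *registry) Lookup(p SecurityProfile) (Backend, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	b, ok := r.backends[p]
	return b, ok
}

// registerConcurrently registers n distinct profiles from n goroutines
// and returns how many entries the registry holds afterwards.
func registerConcurrently(n int) int {
	r := newRegistry()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			r.Register(SecurityProfile(fmt.Sprintf("p%d", i)), fakeBackend("docker"))
		}(i)
	}
	wg.Wait()
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.backends)
}

func main() {
	fmt.Println(registerConcurrently(8))
}
```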
type ExecuteRequest ¶
type ExecuteRequest struct {
// Language specifies the programming language of the code.
// If empty, the backend's default language is used.
Language string
// Code is the source code to execute.
// Required.
Code string
// Timeout specifies the maximum duration for execution.
// If zero, the backend's default timeout is used.
Timeout time.Duration
// Limits specifies resource limits for execution.
Limits Limits
// Profile specifies the security profile to use.
// If empty, the runtime's default profile is used.
Profile SecurityProfile
// Gateway is the tool gateway exposed to the executed code.
// Required.
Gateway ToolGateway
// Metadata contains arbitrary metadata for the execution.
Metadata map[string]any
}
ExecuteRequest specifies the parameters for code execution.
func (ExecuteRequest) Validate ¶
func (r ExecuteRequest) Validate() error
Validate checks that the request is valid.
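The source does not list the individual checks, but the sentinel errors suggest what a validator along these lines might look like. In the sketch below, the order of the checks and the rule that a negative limit is invalid are assumptions, and the types are trimmed-down local stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the relevant sentinel errors.
var (
	ErrMissingGateway = errors.New("gateway is required")
	ErrMissingCode    = errors.New("code is required")
	ErrInvalidLimits  = errors.New("invalid limits")
)

// ToolGateway is left empty here; the real interface has six methods.
type ToolGateway interface{}

type Limits struct{ MaxToolCalls int }

type ExecuteRequest struct {
	Code    string
	Limits  Limits
	Gateway ToolGateway
}

// validate is a guess at the kind of checks Validate performs.
// Check order and the negative-limit rule are assumptions.
func validate(r ExecuteRequest) error {
	if r.Code == "" {
		return ErrMissingCode
	}
	if r.Gateway == nil {
		return ErrMissingGateway
	}
	if r.Limits.MaxToolCalls < 0 {
		return ErrInvalidLimits
	}
	return nil
}

// noopGateway is a placeholder gateway value for the demonstration.
type noopGateway struct{}

func main() {
	fmt.Println(validate(ExecuteRequest{}))
	fmt.Println(validate(ExecuteRequest{Code: "x"}))
	fmt.Println(validate(ExecuteRequest{Code: "x", Gateway: noopGateway{}}))
}
```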
type ExecuteResult ¶
type ExecuteResult struct {
// Value is the final result of the code execution.
// Typically captured via the __out variable convention.
Value any
// Stdout contains any output written to stdout.
Stdout string
// Stderr contains any output written to stderr.
Stderr string
// ToolCalls records all tool invocations made during execution.
ToolCalls []ToolCallRecord
// Duration is the total execution time.
Duration time.Duration
// Backend contains information about the backend that executed the code.
Backend BackendInfo
// LimitsEnforced reports which limits the backend was able to enforce.
// Backends that cannot enforce a given limit should set that field to false.
// This allows callers to know when limits degraded gracefully.
LimitsEnforced LimitsEnforced
}
ExecuteResult contains the outcome of code execution.
type GatewayContract ¶
type GatewayContract struct {
// NewGateway creates a fresh gateway instance for testing.
// The gateway should be configured with at least one tool for search/describe tests.
NewGateway func() ToolGateway
}
GatewayContract defines tests that any ToolGateway implementation must pass. Use RunGatewayContractTests to test an implementation.
type Limits ¶
type Limits struct {
// MaxToolCalls limits the number of tool invocations.
// Zero means unlimited.
MaxToolCalls int
// MaxChainSteps limits the number of steps in a tool chain.
// Zero means unlimited.
MaxChainSteps int
// CPUQuotaMillis limits CPU time in milliseconds.
// Zero means unlimited.
CPUQuotaMillis int64
// MemoryBytes limits memory usage in bytes.
// Zero means unlimited.
MemoryBytes int64
// PidsMax limits the number of processes/threads.
// Zero means unlimited.
PidsMax int64
// DiskBytes limits disk usage in bytes.
// Zero means unlimited.
DiskBytes int64
}
Limits specifies resource limits for execution. Zero values represent "unlimited" for that resource.
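The zero-means-unlimited convention matters when writing enforcement checks, because a naive `used > limit` comparison would reject every request whose limit is unset. A minimal sketch of a check that respects the convention follows; the exceeds helper is hypothetical and the Limits type is cut down to two fields.

```go
package main

import "fmt"

// Limits is a two-field stand-in for the package's Limits type.
type Limits struct {
	MaxToolCalls  int
	MaxChainSteps int
}

// exceeds reports whether used is over limit, treating a zero limit as unlimited.
func exceeds(used, limit int) bool {
	return limit > 0 && used > limit
}

func main() {
	l := Limits{MaxToolCalls: 3} // MaxChainSteps stays zero, meaning unlimited
	fmt.Println(exceeds(4, l.MaxToolCalls))
	fmt.Println(exceeds(1000, l.MaxChainSteps))
}
```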
type LimitsEnforced ¶
type LimitsEnforced struct {
// Timeout indicates whether the timeout was enforced.
Timeout bool
// ToolCalls indicates whether tool call limits were enforced.
ToolCalls bool
// ChainSteps indicates whether chain step limits were enforced.
ChainSteps bool
// Memory indicates whether memory limits were enforced.
Memory bool
// CPU indicates whether CPU limits were enforced.
CPU bool
// Pids indicates whether process limits were enforced.
Pids bool
// Disk indicates whether disk limits were enforced.
Disk bool
}
LimitsEnforced reports which resource limits were actually enforced by the backend. Backends that cannot enforce a limit should set that field to false.
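A caller that cares about degraded enforcement could inspect this struct after each execution. The sketch below collects the names of the fields that came back false; the unenforced helper is hypothetical, and the struct is redeclared locally with the documented field names.

```go
package main

import "fmt"

// LimitsEnforced mirrors the documented fields.
type LimitsEnforced struct {
	Timeout, ToolCalls, ChainSteps, Memory, CPU, Pids, Disk bool
}

// unenforced returns the names of limits reported as not enforced, in field order.
func unenforced(le LimitsEnforced) []string {
	checks := []struct {
		name string
		ok   bool
	}{
		{"Timeout", le.Timeout},
		{"ToolCalls", le.ToolCalls},
		{"ChainSteps", le.ChainSteps},
		{"Memory", le.Memory},
		{"CPU", le.CPU},
		{"Pids", le.Pids},
		{"Disk", le.Disk},
	}
	var out []string
	for _, c := range checks {
		if !c.ok {
			out = append(out, c.name)
		}
	}
	return out
}

func main() {
	// A backend that enforces only the gateway-level limits and the timeout.
	le := LimitsEnforced{Timeout: true, ToolCalls: true, ChainSteps: true}
	fmt.Println(unenforced(le))
}
```

A caller might log the returned names as a warning, or refuse to accept results from a hardened profile when the list is non-empty; that policy choice is outside what the source specifies.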
type Logger ¶
type Logger interface {
Info(msg string, args ...any)
Warn(msg string, args ...any)
Error(msg string, args ...any)
}
Logger is an optional interface for logging.
Contract:
- Concurrency: implementations must be safe for concurrent use.
- Errors: logging must be best-effort and must not panic.
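As an illustration, the interface can be satisfied by a thin adapter over the standard library's log.Logger, which is documented as safe for use from multiple goroutines. The stdLogger type, the level prefixes, and the logLine helper below are assumptions for this sketch; the source does not say how args are meant to be interpreted, so they are simply appended to the message.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// Logger mirrors the documented interface.
type Logger interface {
	Info(msg string, args ...any)
	Warn(msg string, args ...any)
	Error(msg string, args ...any)
}

// stdLogger is a hypothetical adapter around *log.Logger.
type stdLogger struct{ l *log.Logger }

// emit writes the level, message, and args as one space-separated line.
func (s stdLogger) emit(level, msg string, args ...any) {
	s.l.Println(append([]any{level, msg}, args...)...)
}

func (s stdLogger) Info(msg string, args ...any)  { s.emit("INFO", msg, args...) }
func (s stdLogger) Warn(msg string, args ...any)  { s.emit("WARN", msg, args...) }
func (s stdLogger) Error(msg string, args ...any) { s.emit("ERROR", msg, args...) }

// logLine logs one Info message into a buffer and returns what was written.
func logLine(msg string, args ...any) string {
	var buf bytes.Buffer
	var lg Logger = stdLogger{l: log.New(&buf, "", 0)}
	lg.Info(msg, args...)
	return buf.String()
}

func main() {
	fmt.Print(logLine("backend selected", "kind", "docker"))
}
```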
type Runtime ¶
type Runtime interface {
// Execute runs code with the given request parameters.
// It selects the appropriate backend based on the security profile
// and delegates execution.
Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
}
Runtime is the main interface for code execution. It manages backends and routes execution requests to the appropriate backend based on the security profile.
Contract:
- Concurrency: implementations must be safe for concurrent use.
- Context: must honor cancellation/deadlines and return ctx.Err() when canceled.
- Errors: request validation should return ErrMissingGateway/ErrInvalidRequest; backend selection failures return ErrRuntimeUnavailable/ErrBackendDenied.
- Ownership: requests are read-only; results are caller-owned snapshots.
type RuntimeConfig ¶
type RuntimeConfig struct {
// Backends maps security profiles to their backend implementations.
Backends map[SecurityProfile]Backend
// DenyUnsafeProfiles lists profiles that cannot use the unsafe backend.
// If a profile is listed here and only the unsafe backend is available,
// execution will be denied.
DenyUnsafeProfiles []SecurityProfile
// DefaultProfile is the profile to use when none is specified in the request.
// If empty and no profile is specified, execution will fail.
DefaultProfile SecurityProfile
// Logger is an optional logger for runtime events.
Logger Logger
}
RuntimeConfig configures a DefaultRuntime instance.
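How these fields might interact during backend selection can be sketched as a small pure function. This is an interpretation of the field comments, not the actual routing code: selectBackend, config, and demoConfig are hypothetical, BackendKind stands in for the Backend interface, and returning ErrRuntimeUnavailable for a missing profile or backend is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrRuntimeUnavailable = errors.New("runtime unavailable")
	ErrBackendDenied      = errors.New("backend denied by policy")
)

type SecurityProfile string
type BackendKind string

const BackendUnsafeHost BackendKind = "unsafe_host"

// config mirrors three RuntimeConfig fields, with BackendKind standing in for Backend.
type config struct {
	Backends           map[SecurityProfile]BackendKind
	DenyUnsafeProfiles []SecurityProfile
	DefaultProfile     SecurityProfile
}

// selectBackend resolves the profile, finds its backend, and applies the deny list.
func selectBackend(cfg config, requested SecurityProfile) (BackendKind, error) {
	profile := requested
	if profile == "" {
		profile = cfg.DefaultProfile // fall back to the configured default
	}
	kind, ok := cfg.Backends[profile]
	if profile == "" || !ok {
		return "", ErrRuntimeUnavailable // assumption: which error applies here is not documented
	}
	if kind == BackendUnsafeHost {
		for _, p := range cfg.DenyUnsafeProfiles {
			if p == profile {
				return "", ErrBackendDenied
			}
		}
	}
	return kind, nil
}

// demoConfig maps both profiles to the unsafe backend but denies it for "standard".
func demoConfig() config {
	return config{
		Backends: map[SecurityProfile]BackendKind{
			"dev":      BackendUnsafeHost,
			"standard": BackendUnsafeHost,
		},
		DenyUnsafeProfiles: []SecurityProfile{"standard"},
		DefaultProfile:     "dev",
	}
}

func main() {
	cfg := demoConfig()
	for _, p := range []SecurityProfile{"", "standard", "hardened"} {
		kind, err := selectBackend(cfg, p)
		fmt.Printf("profile=%q kind=%q err=%v\n", p, kind, err)
	}
}
```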
type RuntimeError ¶
type RuntimeError struct {
// Err is the underlying error.
Err error
// Op is the operation that failed (e.g., "execute", "container_create").
Op string
// Backend is the backend kind that was in use when the error occurred.
Backend BackendKind
// Retryable indicates whether the operation can be retried.
// True for transient errors like timeouts; false for policy violations.
Retryable bool
}
RuntimeError wraps an error with execution context information. It provides the operation that failed and the backend that was in use.
func (*RuntimeError) Error ¶
func (e *RuntimeError) Error() string
Error returns the error message with context.
func (*RuntimeError) Unwrap ¶
func (e *RuntimeError) Unwrap() error
Unwrap returns the underlying error for errors.Is and errors.As.
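Because RuntimeError implements Unwrap, callers can combine errors.Is (to match a sentinel) with errors.As (to read the Retryable flag). The sketch below redeclares the type locally; the message layout in Error is an assumption, since the source does not document it, and shouldRetry and isTimeout are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrTimeout = errors.New("execution timeout")

type BackendKind string

// RuntimeError mirrors the documented fields.
type RuntimeError struct {
	Err       error
	Op        string
	Backend   BackendKind
	Retryable bool
}

// Error's exact format is not documented; this layout is an assumption.
func (e *RuntimeError) Error() string {
	return fmt.Sprintf("%s [%s]: %v", e.Op, e.Backend, e.Err)
}

// Unwrap exposes the underlying error to errors.Is and errors.As.
func (e *RuntimeError) Unwrap() error { return e.Err }

// shouldRetry reports whether err carries a RuntimeError marked retryable.
func shouldRetry(err error) bool {
	var re *RuntimeError
	return errors.As(err, &re) && re.Retryable
}

// isTimeout reports whether ErrTimeout appears anywhere in err's chain.
func isTimeout(err error) bool { return errors.Is(err, ErrTimeout) }

func main() {
	inner := &RuntimeError{Err: ErrTimeout, Op: "execute", Backend: "docker", Retryable: true}
	err := fmt.Errorf("run snippet: %w", inner)
	fmt.Println(isTimeout(err), shouldRetry(err))
}
```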
type SecurityProfile ¶
type SecurityProfile string
SecurityProfile determines the security level for execution. Higher security profiles impose more restrictions but provide better isolation.
const (
	// ProfileDev is development mode with minimal restrictions.
	// WARNING: This profile runs code with host access - use only for development.
	ProfileDev SecurityProfile = "dev"

	// ProfileStandard provides standard isolation.
	// Includes: no network access, read-only rootfs, resource limits.
	ProfileStandard SecurityProfile = "standard"

	// ProfileHardened provides maximum isolation.
	// Includes: seccomp profiles, stricter resource limits, additional syscall filtering.
	ProfileHardened SecurityProfile = "hardened"
)
func (SecurityProfile) IsValid ¶
func (p SecurityProfile) IsValid() bool
IsValid returns true if the SecurityProfile is a known valid value.
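A check like this is typically a switch over the known constants. The sketch below redeclares the three documented profiles and shows one plausible shape; whether the real method is implemented this way, and whether it is case-sensitive, is an assumption.

```go
package main

import "fmt"

type SecurityProfile string

const (
	ProfileDev      SecurityProfile = "dev"
	ProfileStandard SecurityProfile = "standard"
	ProfileHardened SecurityProfile = "hardened"
)

// isValid sketches the check as an exact match against the three documented constants.
func isValid(p SecurityProfile) bool {
	switch p {
	case ProfileDev, ProfileStandard, ProfileHardened:
		return true
	}
	return false
}

func main() {
	for _, p := range []SecurityProfile{"dev", "hardened", "", "Hardened"} {
		fmt.Printf("%q valid=%v\n", p, isValid(p))
	}
}
```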
type ToolCallRecord ¶
type ToolCallRecord struct {
// ToolID is the canonical identifier of the tool that was called.
ToolID string
// BackendKind indicates which backend executed the tool.
BackendKind string
// Duration is the execution time for this tool call.
Duration time.Duration
// ErrorOp indicates the operation that failed, if any.
ErrorOp string
}
ToolCallRecord captures information about a single tool invocation.
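Records like these lend themselves to simple post-execution accounting. The sketch below totals tool time and counts failures; treating a non-empty ErrorOp as a failed call is an assumption drawn from the field comment, and summarize and sampleCalls are hypothetical helpers with made-up sample data.

```go
package main

import (
	"fmt"
	"time"
)

// ToolCallRecord mirrors the documented fields.
type ToolCallRecord struct {
	ToolID      string
	BackendKind string
	Duration    time.Duration
	ErrorOp     string
}

// summarize returns total tool time in milliseconds and the number of failed calls.
// A call counts as failed when ErrorOp is non-empty (an assumption).
func summarize(calls []ToolCallRecord) (totalMs int64, failed int) {
	var total time.Duration
	for _, c := range calls {
		total += c.Duration
		if c.ErrorOp != "" {
			failed++
		}
	}
	return total.Milliseconds(), failed
}

// sampleCalls returns illustrative data; the tool IDs are invented.
func sampleCalls() []ToolCallRecord {
	return []ToolCallRecord{
		{ToolID: "fs:read", BackendKind: "local", Duration: 20 * time.Millisecond},
		{ToolID: "http:get", BackendKind: "local", Duration: 30 * time.Millisecond, ErrorOp: "run"},
	}
}

func main() {
	totalMs, failed := summarize(sampleCalls())
	fmt.Printf("total=%dms failed=%d\n", totalMs, failed)
}
```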
type ToolGateway ¶
type ToolGateway interface {
// SearchTools searches for tools matching the query.
SearchTools(ctx context.Context, query string, limit int) ([]index.Summary, error)
// ListNamespaces returns all available tool namespaces.
ListNamespaces(ctx context.Context) ([]string, error)
// DescribeTool returns documentation for a tool at the specified detail level.
DescribeTool(ctx context.Context, id string, level tooldoc.DetailLevel) (tooldoc.ToolDoc, error)
// ListToolExamples returns up to maxExamples usage examples for a tool.
ListToolExamples(ctx context.Context, id string, maxExamples int) ([]tooldoc.ToolExample, error)
// RunTool executes a single tool and returns the result.
RunTool(ctx context.Context, id string, args map[string]any) (run.RunResult, error)
// RunChain executes a sequence of tool calls.
RunChain(ctx context.Context, steps []run.ChainStep) (run.RunResult, []run.StepResult, error)
}
ToolGateway is the interface for tool operations exposed to sandboxed code. It provides a proxy for tool discovery and execution while maintaining the trust boundary between the sandbox and the host.
Contract:
- Concurrency: implementations must be safe for concurrent use.
- Context: methods must honor cancellation/deadlines and return ctx.Err() when canceled.
- Ownership: args are read-only; results are caller-owned snapshots.
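Since the gateway is the only path from sandboxed code to tools, it is a natural place to enforce limits such as MaxToolCalls. The sketch below shows a decorator that does so. It is not the package's implementation: toolRunner is a one-method stand-in for ToolGateway with the result type simplified to any, and limitedRunner, echoRunner, callN, and the choice of ErrResourceLimit for this case are all assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

var ErrResourceLimit = errors.New("resource limit exceeded")

// toolRunner is a one-method stand-in for ToolGateway.
type toolRunner interface {
	RunTool(ctx context.Context, id string, args map[string]any) (any, error)
}

// echoRunner returns the tool ID as its result and honors cancellation.
type echoRunner struct{}

func (echoRunner) RunTool(ctx context.Context, id string, args map[string]any) (any, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	return id, nil
}

// limitedRunner is a hypothetical decorator that caps the number of RunTool calls.
type limitedRunner struct {
	next  toolRunner
	limit int // zero means unlimited, matching the Limits convention

	mu    sync.Mutex
	calls int
}

func (l *limitedRunner) RunTool(ctx context.Context, id string, args map[string]any) (any, error) {
	l.mu.Lock()
	if l.limit > 0 && l.calls >= l.limit {
		l.mu.Unlock()
		return nil, ErrResourceLimit
	}
	l.calls++
	l.mu.Unlock()
	// The lock is released before delegating so slow tools do not serialize callers.
	return l.next.RunTool(ctx, id, args)
}

// callN issues n calls through a runner capped at limit and returns how many succeeded.
func callN(n, limit int) int {
	r := &limitedRunner{next: echoRunner{}, limit: limit}
	ok := 0
	for i := 0; i < n; i++ {
		if _, err := r.RunTool(context.Background(), "ns:tool", nil); err == nil {
			ok++
		}
	}
	return ok
}

func main() {
	fmt.Println(callN(5, 3))
	fmt.Println(callN(5, 0))
}
```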
Directories ¶
| Path | Synopsis |
|---|---|
| backend | |
| backend/containerd | Package containerd provides a backend that executes code via containerd. |
| backend/docker | Package docker provides a backend that executes code in Docker containers with configurable security profiles and resource limits. |
| backend/firecracker | Package firecracker provides a backend that executes code in Firecracker microVMs. |
| backend/gvisor | Package gvisor provides a backend that executes code with gVisor (runsc). |
| backend/kata | Package kata provides a backend that executes code in Kata Containers. |
| backend/kubernetes | Package kubernetes provides a backend that executes code in Kubernetes pods/jobs. |
| backend/remote | Package remote provides a backend that executes code on a remote runtime service. |
| backend/temporal | Package temporal provides a backend that treats snippet execution as a Temporal workflow/activity. |
| backend/unsafe | Package unsafe provides a backend that executes code directly on the host. |
| backend/wasm | Package wasm provides a backend that executes code compiled to WebAssembly. |
| gateway | |
| gateway/direct | Package direct provides a gateway that implements ToolGateway by directly delegating to toolindex, tooldocs, and toolrun components. |
| gateway/proxy | Package proxy provides a gateway that implements ToolGateway by serializing requests over a connection (for cross-process/container communication). |
| toolcodeengine | Package toolcodeengine provides an adapter that implements code.Engine using runtime.Runtime for execution. |