runtime

package

Version: v0.1.1 | Published: Feb 1, 2026 | License: MIT | Imports: 9 | Imported by: 0

Repository

github.com/jonwraymond/toolexec

Documentation ¶

Overview ¶

Package runtime provides execution runtime and isolation boundaries for code-oriented orchestration. It sits underneath toolcode and provides:

Backend-agnostic runtime interface for executing code in sandboxed environments
Pluggable sandbox backends (from unsafe development mode to hardened isolation)
Clean trust boundary for running untrusted code that can still call tools
ToolGateway abstraction for exposing tool discovery and execution to sandboxes

The runtime enforces security through SecurityProfiles that determine which backends are allowed and what resource limits apply. The ToolGateway provides a proxy interface for sandboxed code to discover and execute tools without direct access to host resources.

Architecture ¶

The main types are:

Runtime: Main execution interface that routes requests to backends
Backend: Sandbox implementation (see Backend Kinds below)
ToolGateway: Interface for tool operations exposed to sandboxed code
ExecuteRequest/ExecuteResult: Request/response types for execution
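
The way these types fit together can be sketched with a few reduced stand-ins. This is only an illustration under stated assumptions: miniRuntime, echoBackend, and route are hypothetical names, the request and result structs are trimmed to two fields each, and the package's real routing logic may differ.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Reduced stand-ins for the documented types. The real ExecuteRequest and
// ExecuteResult carry more fields (Gateway, Limits, ToolCalls, and so on).
type (
	SecurityProfile string
	BackendKind     string
)

type ExecuteRequest struct {
	Code    string
	Profile SecurityProfile
}

type ExecuteResult struct {
	Stdout  string
	Backend BackendKind
}

type Backend interface {
	Kind() BackendKind
	Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
}

var ErrRuntimeUnavailable = errors.New("runtime unavailable")

// echoBackend is a hypothetical backend that echoes the code as stdout.
type echoBackend struct{ kind BackendKind }

func (b echoBackend) Kind() BackendKind { return b.kind }

func (b echoBackend) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error) {
	if err := ctx.Err(); err != nil {
		return ExecuteResult{}, err
	}
	return ExecuteResult{Stdout: req.Code, Backend: b.kind}, nil
}

// miniRuntime routes each request to the backend registered for its profile.
type miniRuntime struct {
	backends map[SecurityProfile]Backend
}

func (r miniRuntime) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error) {
	b, ok := r.backends[req.Profile]
	if !ok {
		return ExecuteResult{}, ErrRuntimeUnavailable
	}
	return b.Execute(ctx, req)
}

// route runs a fixed snippet under the given profile and reports which
// backend kind handled it.
func route(profile SecurityProfile) (BackendKind, error) {
	rt := miniRuntime{backends: map[SecurityProfile]Backend{
		"dev":      echoBackend{kind: "unsafe_host"},
		"standard": echoBackend{kind: "docker"},
	}}
	res, err := rt.Execute(context.Background(), ExecuteRequest{Code: "1+1", Profile: profile})
	return res.Backend, err
}

func main() {
	fmt.Println(route("standard"))
	fmt.Println(route("hardened")) // no backend registered for this profile
}
```

The caller only ever talks to the runtime; which sandbox handles the request is decided by the profile-to-backend table.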

Security Profiles ¶

Three security profiles are supported:

ProfileDev: Development mode with minimal restrictions (unsafe)
ProfileStandard: Standard isolation (no network, read-only rootfs)
ProfileHardened: Maximum isolation with seccomp, gVisor/Kata/microVM

Backend Kinds ¶

The following execution backends are supported:

BackendUnsafeHost: Direct host execution (dev only, no isolation)
BackendDocker: Docker containers with cgroups and seccomp
BackendContainerd: Containerd for infrastructure-native deployments
BackendKubernetes: Short-lived pods/jobs with scheduling
BackendGVisor: Strong isolation via gVisor/runsc
BackendKata: VM-level isolation via Kata Containers
BackendFirecracker: MicroVM isolation (strongest)
BackendWASM: WebAssembly in-process isolation
BackendTemporal: Workflow orchestration (composes with sandbox backends)
BackendRemote: Generic remote execution service

Security Requirements ¶

All non-unsafe backends MUST:

Run as non-root
Enforce timeouts and cancellation
Enforce tool call and chain step limits
Deny host filesystem access by default
Deny network egress by default
Provide resource controls where available
Treat tool schemas/docs/annotations as untrusted input

Backends that cannot enforce a given limit must report that clearly via the LimitsEnforced field in ExecuteResult.
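
One plausible way to enforce the tool call limit is to wrap the gateway handed to sandboxed code in a counting decorator. The sketch below assumes a reduced, hypothetical toolRunner interface with a single RunTool-like method that returns a string; the real ToolGateway has more methods and richer result types, and the package may enforce limits differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

var ErrResourceLimit = errors.New("resource limit exceeded")

// toolRunner is a reduced, hypothetical stand-in for ToolGateway that keeps
// only a RunTool-like method; the result type is simplified to a string.
type toolRunner interface {
	RunTool(ctx context.Context, id string, args map[string]any) (string, error)
}

// stubRunner is a hypothetical gateway that always succeeds.
type stubRunner struct{}

func (stubRunner) RunTool(ctx context.Context, id string, args map[string]any) (string, error) {
	return "ok:" + id, nil
}

// limitedRunner counts calls and rejects any call beyond limit.
// A limit of zero means unlimited, matching the Limits convention.
type limitedRunner struct {
	inner toolRunner
	limit int

	mu    sync.Mutex
	calls int
}

func (l *limitedRunner) RunTool(ctx context.Context, id string, args map[string]any) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	l.mu.Lock()
	if l.limit > 0 && l.calls >= l.limit {
		l.mu.Unlock()
		return "", ErrResourceLimit
	}
	l.calls++
	l.mu.Unlock()
	return l.inner.RunTool(ctx, id, args)
}

// callN issues n tool calls through a runner capped at limit and returns how
// many succeeded plus the first error encountered, if any.
func callN(limit, n int) (int, error) {
	r := &limitedRunner{inner: stubRunner{}, limit: limit}
	ok := 0
	for i := 0; i < n; i++ {
		if _, err := r.RunTool(context.Background(), "demo.tool", nil); err != nil {
			return ok, err
		}
		ok++
	}
	return ok, nil
}

func main() {
	fmt.Println(callN(2, 5)) // the third call is rejected
	fmt.Println(callN(0, 5)) // zero means unlimited
}
```

A backend that wraps its gateway this way could truthfully set the ToolCalls field of LimitsEnforced to true.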

Index ¶

Variables
func RunBackendContractTests(t *testing.T, contract BackendContract)
func RunGatewayContractTests(t *testing.T, contract GatewayContract)
type Backend
type BackendContract
type BackendInfo
type BackendKind
type DefaultRuntime
- func NewDefaultRuntime(cfg RuntimeConfig) *DefaultRuntime
- func (r *DefaultRuntime) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
- func (r *DefaultRuntime) RegisterBackend(profile SecurityProfile, backend Backend)
- func (r *DefaultRuntime) UnregisterBackend(profile SecurityProfile)
type ExecuteRequest
- func (r ExecuteRequest) Validate() error
type ExecuteResult
type GatewayContract
type Limits
- func (l Limits) Validate() error
type LimitsEnforced
type Logger
type Runtime
type RuntimeConfig
type RuntimeError
- func (e *RuntimeError) Error() string
- func (e *RuntimeError) Unwrap() error
type SecurityProfile
- func (p SecurityProfile) IsValid() bool
type ToolCallRecord
type ToolGateway

Constants ¶

This section is empty.

Variables ¶

var (
	// ErrRuntimeUnavailable is returned when no suitable runtime/backend is available.
	ErrRuntimeUnavailable = errors.New("runtime unavailable")

	// ErrBackendDenied is returned when a backend is denied by security policy.
	ErrBackendDenied = errors.New("backend denied by policy")

	// ErrSandboxViolation is returned when sandboxed code violates security policy.
	ErrSandboxViolation = errors.New("sandbox policy violation")

	// ErrTimeout is returned when execution exceeds the configured timeout.
	ErrTimeout = errors.New("execution timeout")

	// ErrResourceLimit is returned when a resource limit is exceeded.
	ErrResourceLimit = errors.New("resource limit exceeded")

	// ErrMissingGateway is returned when ExecuteRequest has no Gateway.
	ErrMissingGateway = errors.New("gateway is required")

	// ErrMissingCode is returned when ExecuteRequest has no Code.
	ErrMissingCode = errors.New("code is required")

	// ErrInvalidLimits is returned when Limits validation fails.
	ErrInvalidLimits = errors.New("invalid limits")
)

Sentinel errors for runtime operations.
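
Callers would normally match these sentinels with errors.Is so that wrapped errors still compare equal. The sketch below redeclares three of the sentinels locally so it compiles on its own; classify and wrapTimeout are hypothetical helpers, and the handling labels are illustrative only.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of a few documented sentinels so the sketch compiles alone.
var (
	ErrTimeout       = errors.New("execution timeout")
	ErrBackendDenied = errors.New("backend denied by policy")
	ErrMissingCode   = errors.New("code is required")
)

// classify maps an error to a coarse handling decision. errors.Is is used
// rather than ==, so wrapped sentinels still match.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrTimeout):
		return "retry"
	case errors.Is(err, ErrBackendDenied):
		return "policy"
	case errors.Is(err, ErrMissingCode):
		return "bad-request"
	default:
		return "unknown"
	}
}

// wrapTimeout returns ErrTimeout wrapped with extra context via %w.
func wrapTimeout() error {
	return fmt.Errorf("docker backend: %w", ErrTimeout)
}

func main() {
	fmt.Println(classify(wrapTimeout()))      // retry
	fmt.Println(classify(ErrBackendDenied))   // policy
	fmt.Println(classify(errors.New("boom"))) // unknown
}
```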

Functions ¶

func RunBackendContractTests ¶

func RunBackendContractTests(t *testing.T, contract BackendContract)

RunBackendContractTests runs all contract tests for a Backend implementation.

func RunGatewayContractTests ¶

func RunGatewayContractTests(t *testing.T, contract GatewayContract)

RunGatewayContractTests runs all contract tests for a ToolGateway implementation.

Types ¶

type Backend ¶

type Backend interface {
	// Kind returns the backend kind identifier.
	Kind() BackendKind

	// Execute runs code with the given request parameters.
	// It validates the request, executes the code, and returns the result.
	Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
}

Backend is the interface for code execution backends. Each backend provides a different level of isolation and security.

Contract:

Concurrency: implementations must be safe for concurrent use.
Context: must honor cancellation/deadlines and return ctx.Err() when canceled.
Errors: validation errors should use ErrInvalidRequest; runtime errors should return ErrExecutionFailed (see errors.go) where applicable.
Ownership: requests are read-only; results are caller-owned snapshots.
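
The context rule in this contract can be illustrated with a toy backend. This is a sketch under assumptions: sleepBackend and timedOut are hypothetical, the request and result structs are reduced, and the "execution" is just a timer so that cancellation is easy to observe.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

type BackendKind string

// Reduced stand-ins for the documented request and result types.
type ExecuteRequest struct {
	Code    string
	Timeout time.Duration
}

type ExecuteResult struct {
	Stdout   string
	Duration time.Duration
}

// sleepBackend is a hypothetical backend whose "execution" only waits for
// the configured amount of work.
type sleepBackend struct{ work time.Duration }

func (sleepBackend) Kind() BackendKind { return "demo" }

func (b sleepBackend) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error) {
	// Apply the per-request timeout on top of the caller's context.
	if req.Timeout > 0 {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, req.Timeout)
		defer cancel()
	}
	start := time.Now()
	timer := time.NewTimer(b.work)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		// Contract: return ctx.Err() when canceled or past the deadline.
		return ExecuteResult{}, ctx.Err()
	case <-timer.C:
		return ExecuteResult{Stdout: "done: " + req.Code, Duration: time.Since(start)}, nil
	}
}

// timedOut runs the toy backend and reports whether it hit the deadline.
func timedOut(workMs, timeoutMs int) bool {
	b := sleepBackend{work: time.Duration(workMs) * time.Millisecond}
	req := ExecuteRequest{Code: "x", Timeout: time.Duration(timeoutMs) * time.Millisecond}
	_, err := b.Execute(context.Background(), req)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(timedOut(5, 500)) // work finishes before the deadline
	fmt.Println(timedOut(500, 5)) // deadline fires first
}
```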

type BackendContract ¶

type BackendContract struct {
	// NewBackend creates a fresh backend instance for testing.
	NewBackend func() Backend

	// NewGateway creates a gateway for testing.
	// The gateway should allow at least basic tool operations.
	NewGateway func() ToolGateway

	// ExpectedKind is the BackendKind the backend should return.
	ExpectedKind BackendKind

	// SkipExecutionTests skips tests that require actual code execution.
	// Useful for backends that need complex setup (e.g., Docker).
	SkipExecutionTests bool
}

BackendContract defines tests that any Backend implementation must pass. Use RunBackendContractTests to test an implementation.

type BackendInfo ¶

type BackendInfo struct {
	// Kind identifies the type of backend.
	Kind BackendKind

	// Details contains backend-specific information.
	Details map[string]any
}

BackendInfo contains information about the execution backend.

type BackendKind ¶

type BackendKind string

BackendKind identifies the type of execution backend.

const (
	// BackendUnsafeHost runs code directly on the host.
	// WARNING: No isolation - use only for trusted code in development.
	BackendUnsafeHost BackendKind = "unsafe_host"

	// BackendDocker runs code in Docker containers.
	// Good default isolation with cgroups, read-only rootfs, user remapping, and seccomp.
	BackendDocker BackendKind = "docker"

	// BackendContainerd runs code via containerd directly.
	// Similar to Docker but more infrastructure-native for servers/agents.
	BackendContainerd BackendKind = "containerd"

	// BackendKubernetes executes snippets in short-lived pods/jobs.
	// Isolation depends on configured runtime class; best for scheduling and multi-tenant controls.
	BackendKubernetes BackendKind = "kubernetes"

	// BackendGVisor runs code with gVisor (runsc) for stronger isolation.
	// Appropriate for untrusted multi-tenant execution.
	BackendGVisor BackendKind = "gvisor"

	// BackendKata runs code in Kata Containers for VM-level isolation.
	// Stronger isolation than plain containers.
	BackendKata BackendKind = "kata"

	// BackendFirecracker runs code in Firecracker microVMs.
	// Strongest isolation; higher complexity and operational cost.
	BackendFirecracker BackendKind = "firecracker"

	// BackendWASM runs code compiled to WebAssembly.
	// Strong in-process isolation; requires constrained SDK surface.
	BackendWASM BackendKind = "wasm"

	// BackendTemporal treats snippet execution as a Temporal workflow/activity.
	// Useful for long-running or resumable executions.
	// Note: Temporal is orchestration, not isolation - must compose with sandbox backends.
	BackendTemporal BackendKind = "temporal"

	// BackendRemote executes code on a remote runtime service.
	// Generic target for dedicated runtime services, batch systems, or job runners.
	BackendRemote BackendKind = "remote"
)

type DefaultRuntime ¶

type DefaultRuntime struct {
	// contains filtered or unexported fields
}

DefaultRuntime is the default implementation of Runtime. It routes requests to backends based on security profiles.

func NewDefaultRuntime ¶

func NewDefaultRuntime(cfg RuntimeConfig) *DefaultRuntime

NewDefaultRuntime creates a new DefaultRuntime with the given configuration.

func (*DefaultRuntime) Execute ¶

func (r *DefaultRuntime) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)

Execute implements the Runtime interface.

func (*DefaultRuntime) RegisterBackend ¶

func (r *DefaultRuntime) RegisterBackend(profile SecurityProfile, backend Backend)

RegisterBackend registers a backend for a security profile. This is thread-safe and can be called at runtime.

func (*DefaultRuntime) UnregisterBackend ¶

func (r *DefaultRuntime) UnregisterBackend(profile SecurityProfile)

UnregisterBackend removes a backend for a security profile. This is thread-safe and can be called at runtime.
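
The source states only that registration and removal are thread-safe, not how. A common way to get that property is a map guarded by sync.RWMutex, sketched below. The registry type, its method names, and the reduced Backend interface are all hypothetical and are not taken from DefaultRuntime's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

type SecurityProfile string

// Backend is reduced to its Kind method for this sketch.
type Backend interface{ Kind() string }

type namedBackend string

func (n namedBackend) Kind() string { return string(n) }

// registry is a hypothetical profile-to-backend table. Writers take the
// exclusive lock; lookups share a read lock.
type registry struct {
	mu       sync.RWMutex
	backends map[SecurityProfile]Backend
}

func newRegistry() *registry {
	return &registry{backends: make(map[SecurityProfile]Backend)}
}

func (r *registry) Register(p SecurityProfile, b Backend) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.backends[p] = b
}

func (r *registry) Unregister(p SecurityProfile) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.backends, p)
}

func (r *registry) Lookup(p SecurityProfile) (Backend, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	b, ok := r.backends[p]
	return b, ok
}

// churn registers, looks up, and unregisters the same profile from many
// goroutines, then reports whether the profile is still present. Every
// goroutine ends with Unregister, so the final state is always empty.
func churn(workers int) bool {
	r := newRegistry()
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.Register("standard", namedBackend("docker"))
			r.Lookup("standard")
			r.Unregister("standard")
		}()
	}
	wg.Wait()
	_, ok := r.Lookup("standard")
	return ok
}

func main() {
	fmt.Println(churn(16))
}
```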

type ExecuteRequest ¶

type ExecuteRequest struct {
	// Language specifies the programming language of the code.
	// If empty, the backend's default language is used.
	Language string

	// Code is the source code to execute.
	// Required.
	Code string

	// Timeout specifies the maximum duration for execution.
	// If zero, the backend's default timeout is used.
	Timeout time.Duration

	// Limits specifies resource limits for execution.
	Limits Limits

	// Profile specifies the security profile to use.
	// If empty, the runtime's default profile is used.
	Profile SecurityProfile

	// Gateway is the tool gateway exposed to the executed code.
	// Required.
	Gateway ToolGateway

	// Metadata contains arbitrary metadata for the execution.
	Metadata map[string]any
}

ExecuteRequest specifies the parameters for code execution.

func (ExecuteRequest) Validate ¶

func (r ExecuteRequest) Validate() error

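Validate checks that the request is valid. Code and Gateway are marked Required above; ErrMissingCode and ErrMissingGateway are the sentinels documented for those two cases.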
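
A minimal sketch of such a check, assuming it covers only the two required fields: the validate function, the order of the checks, and the reduced struct are assumptions, and the real method may validate more (for example Limits or Profile).

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrMissingGateway = errors.New("gateway is required")
	ErrMissingCode    = errors.New("code is required")
)

// ToolGateway is an empty stand-in; only nil-ness matters here.
type ToolGateway interface{}

// ExecuteRequest is reduced to the two fields documented as required.
type ExecuteRequest struct {
	Code    string
	Gateway ToolGateway
}

// validate is a hypothetical free-function version of Validate.
func validate(req ExecuteRequest) error {
	if req.Code == "" {
		return ErrMissingCode
	}
	if req.Gateway == nil {
		return ErrMissingGateway
	}
	return nil
}

func main() {
	fmt.Println(validate(ExecuteRequest{}))
	fmt.Println(validate(ExecuteRequest{Code: "print(1)"}))
	fmt.Println(validate(ExecuteRequest{Code: "print(1)", Gateway: struct{}{}}))
}
```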

type ExecuteResult ¶

type ExecuteResult struct {
	// Value is the final result of the code execution.
	// Typically captured via the __out variable convention.
	Value any

	// Stdout contains any output written to stdout.
	Stdout string

	// Stderr contains any output written to stderr.
	Stderr string

	// ToolCalls records all tool invocations made during execution.
	ToolCalls []ToolCallRecord

	// Duration is the total execution time.
	Duration time.Duration

	// Backend contains information about the backend that executed the code.
	Backend BackendInfo

	// LimitsEnforced reports which limits the backend was able to enforce.
	// Backends that cannot enforce a given limit should set that field to false.
	// This allows callers to know when limits degraded gracefully.
	LimitsEnforced LimitsEnforced
}

ExecuteResult contains the outcome of code execution.

type GatewayContract ¶

type GatewayContract struct {
	// NewGateway creates a fresh gateway instance for testing.
	// The gateway should be configured with at least one tool for search/describe tests.
	NewGateway func() ToolGateway
}

GatewayContract defines tests that any ToolGateway implementation must pass. Use RunGatewayContractTests to test an implementation.

type Limits ¶

type Limits struct {
	// MaxToolCalls limits the number of tool invocations.
	// Zero means unlimited.
	MaxToolCalls int

	// MaxChainSteps limits the number of steps in a tool chain.
	// Zero means unlimited.
	MaxChainSteps int

	// CPUQuotaMillis limits CPU time in milliseconds.
	// Zero means unlimited.
	CPUQuotaMillis int64

	// MemoryBytes limits memory usage in bytes.
	// Zero means unlimited.
	MemoryBytes int64

	// PidsMax limits the number of processes/threads.
	// Zero means unlimited.
	PidsMax int64

	// DiskBytes limits disk usage in bytes.
	// Zero means unlimited.
	DiskBytes int64
}

Limits specifies resource limits for execution. Zero values represent "unlimited" for that resource.

func (Limits) Validate ¶

func (l Limits) Validate() error

Validate checks that all limit values are valid (non-negative).
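
The non-negative rule can be sketched as a loop over the fields. The field set below mirrors the Limits struct above; validateLimits and isInvalid are hypothetical helpers, and wrapping ErrInvalidLimits with the offending field name is an assumption about the error shape rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrInvalidLimits = errors.New("invalid limits")

// Limits mirrors the documented struct.
type Limits struct {
	MaxToolCalls   int
	MaxChainSteps  int
	CPUQuotaMillis int64
	MemoryBytes    int64
	PidsMax        int64
	DiskBytes      int64
}

// validateLimits rejects any negative value. Zero is valid and means
// "unlimited" for that resource.
func validateLimits(l Limits) error {
	fields := []struct {
		name  string
		value int64
	}{
		{"MaxToolCalls", int64(l.MaxToolCalls)},
		{"MaxChainSteps", int64(l.MaxChainSteps)},
		{"CPUQuotaMillis", l.CPUQuotaMillis},
		{"MemoryBytes", l.MemoryBytes},
		{"PidsMax", l.PidsMax},
		{"DiskBytes", l.DiskBytes},
	}
	for _, f := range fields {
		if f.value < 0 {
			return fmt.Errorf("%w: %s must be non-negative, got %d", ErrInvalidLimits, f.name, f.value)
		}
	}
	return nil
}

// isInvalid reports whether l fails validation with ErrInvalidLimits.
func isInvalid(l Limits) bool {
	return errors.Is(validateLimits(l), ErrInvalidLimits)
}

func main() {
	fmt.Println(validateLimits(Limits{MemoryBytes: 256 << 20}))
	fmt.Println(validateLimits(Limits{PidsMax: -1}))
}
```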

type LimitsEnforced ¶

type LimitsEnforced struct {
	// Timeout indicates whether the timeout was enforced.
	Timeout bool

	// ToolCalls indicates whether tool call limits were enforced.
	ToolCalls bool

	// ChainSteps indicates whether chain step limits were enforced.
	ChainSteps bool

	// Memory indicates whether memory limits were enforced.
	Memory bool

	// CPU indicates whether CPU limits were enforced.
	CPU bool

	// Pids indicates whether process limits were enforced.
	Pids bool

	// Disk indicates whether disk limits were enforced.
	Disk bool
}

LimitsEnforced reports which resource limits were actually enforced by the backend. Backends that cannot enforce a limit should set that field to false.
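
A caller can compare what it requested against what the backend reports to detect limits that were silently not applied. The sketch below uses local copies of the two structs; unenforced is a hypothetical helper, and it skips the Timeout flag because the timeout lives on ExecuteRequest rather than on Limits.

```go
package main

import "fmt"

// Limits and LimitsEnforced mirror the documented structs.
type Limits struct {
	MaxToolCalls   int
	MaxChainSteps  int
	CPUQuotaMillis int64
	MemoryBytes    int64
	PidsMax        int64
	DiskBytes      int64
}

type LimitsEnforced struct {
	Timeout    bool
	ToolCalls  bool
	ChainSteps bool
	Memory     bool
	CPU        bool
	Pids       bool
	Disk       bool
}

// unenforced returns the names of limits that were requested (non-zero)
// but that the backend reports it did not enforce.
func unenforced(want Limits, got LimitsEnforced) []string {
	var missing []string
	check := func(requested, enforced bool, name string) {
		if requested && !enforced {
			missing = append(missing, name)
		}
	}
	check(want.MaxToolCalls > 0, got.ToolCalls, "MaxToolCalls")
	check(want.MaxChainSteps > 0, got.ChainSteps, "MaxChainSteps")
	check(want.CPUQuotaMillis > 0, got.CPU, "CPUQuotaMillis")
	check(want.MemoryBytes > 0, got.Memory, "MemoryBytes")
	check(want.PidsMax > 0, got.Pids, "PidsMax")
	check(want.DiskBytes > 0, got.Disk, "DiskBytes")
	return missing
}

func main() {
	want := Limits{MaxToolCalls: 10, MemoryBytes: 1 << 20, PidsMax: 32}
	got := LimitsEnforced{Timeout: true, ToolCalls: true}
	fmt.Println(unenforced(want, got)) // [MemoryBytes PidsMax]
}
```

A caller running untrusted code might treat a non-empty result as a reason to reject the output or to retry on a stricter backend.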

type Logger ¶

type Logger interface {
	Info(msg string, args ...any)
	Warn(msg string, args ...any)
	Error(msg string, args ...any)
}

Logger is an optional interface for logging.

Contract:

Concurrency: implementations must be safe for concurrent use.
Errors: logging must be best-effort and must not panic.

type Runtime ¶

type Runtime interface {
	// Execute runs code with the given request parameters.
	// It selects the appropriate backend based on the security profile
	// and delegates execution.
	Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
}

Runtime is the main interface for code execution. It manages backends and routes execution requests to the appropriate backend based on the security profile.

Contract:

Concurrency: implementations must be safe for concurrent use.
Context: must honor cancellation/deadlines and return ctx.Err() when canceled.
Errors: request validation should return ErrMissingGateway/ErrInvalidRequest; backend selection failures return ErrRuntimeUnavailable/ErrBackendDenied.
Ownership: requests are read-only; results are caller-owned snapshots.

type RuntimeConfig ¶

type RuntimeConfig struct {
	// Backends maps security profiles to their backend implementations.
	Backends map[SecurityProfile]Backend

	// DenyUnsafeProfiles lists profiles that cannot use the unsafe backend.
	// If a profile is listed here and only the unsafe backend is available,
	// execution will be denied.
	DenyUnsafeProfiles []SecurityProfile

	// DefaultProfile is the profile to use when none is specified in the request.
	// If empty and no profile is specified, execution will fail.
	DefaultProfile SecurityProfile

	// Logger is an optional logger for runtime events.
	Logger Logger
}

RuntimeConfig configures a DefaultRuntime instance.
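
How these fields might interact during backend selection is sketched below. This is an interpretation of the field comments, not the package's code: selectBackend, config, and demoConfig are hypothetical, Backends maps to a kind rather than a Backend to keep the example short, and the error returned when no profile can be resolved is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrRuntimeUnavailable = errors.New("runtime unavailable")
	ErrBackendDenied      = errors.New("backend denied by policy")
)

type (
	SecurityProfile string
	BackendKind     string
)

const BackendUnsafeHost BackendKind = "unsafe_host"

// config mirrors RuntimeConfig, except that Backends maps to a kind
// instead of a Backend implementation.
type config struct {
	Backends           map[SecurityProfile]BackendKind
	DenyUnsafeProfiles []SecurityProfile
	DefaultProfile     SecurityProfile
}

// selectBackend resolves the profile, looks up its backend, and applies
// the unsafe-backend deny list.
func selectBackend(cfg config, requested SecurityProfile) (BackendKind, error) {
	profile := requested
	if profile == "" {
		profile = cfg.DefaultProfile
	}
	if profile == "" {
		// Assumption: the source only says execution fails in this case.
		return "", ErrRuntimeUnavailable
	}
	kind, ok := cfg.Backends[profile]
	if !ok {
		return "", ErrRuntimeUnavailable
	}
	if kind == BackendUnsafeHost {
		for _, denied := range cfg.DenyUnsafeProfiles {
			if denied == profile {
				return "", ErrBackendDenied
			}
		}
	}
	return kind, nil
}

// demoConfig deliberately misconfigures "hardened" with the unsafe backend
// to show the deny list taking effect.
func demoConfig() config {
	return config{
		Backends: map[SecurityProfile]BackendKind{
			"dev":      BackendUnsafeHost,
			"standard": "docker",
			"hardened": BackendUnsafeHost,
		},
		DenyUnsafeProfiles: []SecurityProfile{"standard", "hardened"},
		DefaultProfile:     "standard",
	}
}

func main() {
	cfg := demoConfig()
	for _, p := range []SecurityProfile{"", "dev", "hardened", "unknown"} {
		kind, err := selectBackend(cfg, p)
		fmt.Printf("%q -> %q, %v\n", p, kind, err)
	}
}
```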

type RuntimeError ¶

type RuntimeError struct {
	// Err is the underlying error.
	Err error

	// Op is the operation that failed (e.g., "execute", "container_create").
	Op string

	// Backend is the backend kind that was in use when the error occurred.
	Backend BackendKind

	// Retryable indicates whether the operation can be retried.
	// True for transient errors like timeouts; false for policy violations.
	Retryable bool
}

RuntimeError wraps an error with execution context information. It provides the operation that failed and the backend that was in use.

func (*RuntimeError) Error ¶

func (e *RuntimeError) Error() string

Error returns the error message with context.

func (*RuntimeError) Unwrap ¶

func (e *RuntimeError) Unwrap() error

Unwrap returns the underlying error for errors.Is and errors.As.
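
Because Unwrap exposes the underlying error, both errors.Is and errors.As work through a RuntimeError even when it is wrapped again. The sketch below redeclares the struct locally; the Error message format, sampleErr, isTimeout, and shouldRetry are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrTimeout = errors.New("execution timeout")

type BackendKind string

// RuntimeError mirrors the documented struct.
type RuntimeError struct {
	Err       error
	Op        string
	Backend   BackendKind
	Retryable bool
}

// Error's exact message format is an assumption of this sketch.
func (e *RuntimeError) Error() string {
	return fmt.Sprintf("%s (%s): %v", e.Op, e.Backend, e.Err)
}

func (e *RuntimeError) Unwrap() error { return e.Err }

// sampleErr builds a RuntimeError around ErrTimeout and wraps it once more,
// as calling code often does when adding context.
func sampleErr() error {
	inner := &RuntimeError{Err: ErrTimeout, Op: "execute", Backend: "docker", Retryable: true}
	return fmt.Errorf("run snippet: %w", inner)
}

// isTimeout walks the chain down to the sentinel.
func isTimeout(err error) bool { return errors.Is(err, ErrTimeout) }

// shouldRetry extracts the RuntimeError, if any, and reads its Retryable flag.
func shouldRetry(err error) bool {
	var re *RuntimeError
	return errors.As(err, &re) && re.Retryable
}

func main() {
	err := sampleErr()
	fmt.Println(err)
	fmt.Println(isTimeout(err), shouldRetry(err))
	fmt.Println(shouldRetry(ErrTimeout)) // bare sentinel carries no Retryable flag
}
```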

type SecurityProfile ¶

type SecurityProfile string

SecurityProfile determines the security level for execution. Higher security profiles impose more restrictions but provide better isolation.

const (
	// ProfileDev is development mode with minimal restrictions.
	// WARNING: This profile runs code with host access - use only for development.
	ProfileDev SecurityProfile = "dev"

	// ProfileStandard provides standard isolation.
	// Includes: no network access, read-only rootfs, resource limits.
	ProfileStandard SecurityProfile = "standard"

	// ProfileHardened provides maximum isolation.
	// Includes: seccomp profiles, stricter resource limits, additional syscall filtering.
	ProfileHardened SecurityProfile = "hardened"
)

func (SecurityProfile) IsValid ¶

func (p SecurityProfile) IsValid() bool

IsValid returns true if the SecurityProfile is a known valid value.

type ToolCallRecord ¶

type ToolCallRecord struct {
	// ToolID is the canonical identifier of the tool that was called.
	ToolID string

	// BackendKind indicates which backend executed the tool.
	BackendKind string

	// Duration is the execution time for this tool call.
	Duration time.Duration

	// ErrorOp indicates the operation that failed, if any.
	ErrorOp string
}

ToolCallRecord captures information about a single tool invocation.

type ToolGateway ¶

type ToolGateway interface {
	// SearchTools searches for tools matching the query.
	SearchTools(ctx context.Context, query string, limit int) ([]index.Summary, error)

	// ListNamespaces returns all available tool namespaces.
	ListNamespaces(ctx context.Context) ([]string, error)

	// DescribeTool returns documentation for a tool at the specified detail level.
	DescribeTool(ctx context.Context, id string, level tooldoc.DetailLevel) (tooldoc.ToolDoc, error)

	// ListToolExamples returns up to maxExamples usage examples for a tool.
	ListToolExamples(ctx context.Context, id string, maxExamples int) ([]tooldoc.ToolExample, error)

	// RunTool executes a single tool and returns the result.
	RunTool(ctx context.Context, id string, args map[string]any) (run.RunResult, error)

	// RunChain executes a sequence of tool calls.
	RunChain(ctx context.Context, steps []run.ChainStep) (run.RunResult, []run.StepResult, error)
}

ToolGateway is the interface for tool operations exposed to sandboxed code. It provides a proxy for tool discovery and execution while maintaining the trust boundary between the sandbox and the host.

Contract:

Concurrency: implementations must be safe for concurrent use.
Context: methods must honor cancellation/deadlines and return ctx.Err() when canceled.
Ownership: args are read-only; results are caller-owned snapshots.

Directories ¶

Path	Synopsis
backend
containerd	Package containerd provides a backend that executes code via containerd.
docker	Package docker provides a backend that executes code in Docker containers with configurable security profiles and resource limits.
firecracker	Package firecracker provides a backend that executes code in Firecracker microVMs.
gvisor	Package gvisor provides a backend that executes code with gVisor (runsc).
kata	Package kata provides a backend that executes code in Kata Containers.
kubernetes	Package kubernetes provides a backend that executes code in Kubernetes pods/jobs.
remote	Package remote provides a backend that executes code on a remote runtime service.
temporal	Package temporal provides a backend that treats snippet execution as a Temporal workflow/activity.
unsafe	Package unsafe provides a backend that executes code directly on the host.
wasm	Package wasm provides a backend that executes code compiled to WebAssembly.
gateway
direct	Package direct provides a gateway that implements ToolGateway by directly delegating to toolindex, tooldocs, and toolrun components.
proxy	Package proxy provides a gateway that implements ToolGateway by serializing requests over a connection (for cross-process/container communication).
toolcodeengine	Package toolcodeengine provides an adapter that implements code.Engine using runtime.Runtime for execution.