runtime

package
v0.1.1
Published: Feb 1, 2026 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package runtime provides the execution runtime and isolation boundaries for code-oriented orchestration. It sits underneath toolcode and offers:

  • Backend-agnostic runtime interface for executing code in sandboxed environments
  • Pluggable sandbox backends (from unsafe development mode to hardened isolation)
  • Clean trust boundary for running untrusted code that can still call tools
  • ToolGateway abstraction for exposing tool discovery and execution to sandboxes

The runtime enforces security through SecurityProfiles that determine which backends are allowed and what resource limits apply. The ToolGateway provides a proxy interface for sandboxed code to discover and execute tools without direct access to host resources.

Architecture

The main types are:

  • Runtime: Main execution interface that routes requests to backends
  • Backend: Sandbox implementation (see Backend Kinds below)
  • ToolGateway: Interface for tool operations exposed to sandboxed code
  • ExecuteRequest/ExecuteResult: Request/response types for execution

Security Profiles

Three security profiles are supported:

  • ProfileDev: Development mode with minimal restrictions (unsafe)
  • ProfileStandard: Standard isolation (no network, read-only rootfs)
  • ProfileHardened: Maximum isolation with seccomp, gVisor/Kata/microVM

Backend Kinds

The following execution backends are supported:

  • BackendUnsafeHost: Direct host execution (dev only, no isolation)
  • BackendDocker: Docker containers with cgroups and seccomp
  • BackendContainerd: Containerd for infrastructure-native deployments
  • BackendKubernetes: Short-lived pods/jobs with scheduling
  • BackendGVisor: Strong isolation via gVisor/runsc
  • BackendKata: VM-level isolation via Kata Containers
  • BackendFirecracker: MicroVM isolation (strongest)
  • BackendWASM: WebAssembly in-process isolation
  • BackendTemporal: Workflow orchestration (composes with sandbox backends)
  • BackendRemote: Generic remote execution service

Security Requirements

All non-unsafe backends MUST:

  1. Run as non-root
  2. Enforce timeouts and cancellation
  3. Enforce tool call and chain step limits
  4. Deny host filesystem access by default
  5. Deny network egress by default
  6. Provide resource controls where available
  7. Treat tool schemas/docs/annotations as untrusted input

Backends that cannot enforce a given limit must report that clearly via the LimitsEnforced field in ExecuteResult.

Constants

This section is empty.

Variables

var (
	// ErrRuntimeUnavailable is returned when no suitable runtime/backend is available.
	ErrRuntimeUnavailable = errors.New("runtime unavailable")

	// ErrBackendDenied is returned when a backend is denied by security policy.
	ErrBackendDenied = errors.New("backend denied by policy")

	// ErrSandboxViolation is returned when sandboxed code violates security policy.
	ErrSandboxViolation = errors.New("sandbox policy violation")

	// ErrTimeout is returned when execution exceeds the configured timeout.
	ErrTimeout = errors.New("execution timeout")

	// ErrResourceLimit is returned when a resource limit is exceeded.
	ErrResourceLimit = errors.New("resource limit exceeded")

	// ErrMissingGateway is returned when ExecuteRequest has no Gateway.
	ErrMissingGateway = errors.New("gateway is required")

	// ErrMissingCode is returned when ExecuteRequest has no Code.
	ErrMissingCode = errors.New("code is required")

	// ErrInvalidLimits is returned when Limits validation fails.
	ErrInvalidLimits = errors.New("invalid limits")
)

Sentinel errors for runtime operations.
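Callers would typically match these sentinels with errors.Is rather than by string comparison, since backends may wrap them with extra context. The following is a minimal sketch under that assumption: two of the sentinels are redeclared locally so the example compiles on its own, and the helper names (wrapExec, classify) and the action strings are hypothetical, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for two of the package's sentinel errors.
var (
	ErrTimeout       = errors.New("execution timeout")
	ErrBackendDenied = errors.New("backend denied by policy")
)

// wrapExec adds context while keeping the sentinel reachable via %w.
func wrapExec(err error) error {
	return fmt.Errorf("execute snippet: %w", err)
}

// classify maps an error to a coarse action (hypothetical names).
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrTimeout):
		return "retry"
	case errors.Is(err, ErrBackendDenied):
		return "reject"
	default:
		return "fail"
	}
}

func main() {
	fmt.Println(classify(wrapExec(ErrTimeout)))
	fmt.Println(classify(wrapExec(ErrBackendDenied)))
	fmt.Println(classify(errors.New("boom")))
}
```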

Functions

func RunBackendContractTests

func RunBackendContractTests(t *testing.T, contract BackendContract)

RunBackendContractTests runs all contract tests for a Backend implementation.

func RunGatewayContractTests

func RunGatewayContractTests(t *testing.T, contract GatewayContract)

RunGatewayContractTests runs all contract tests for a ToolGateway implementation.

Types

type Backend

type Backend interface {
	// Kind returns the backend kind identifier.
	Kind() BackendKind

	// Execute runs code with the given request parameters.
	// It validates the request, executes the code, and returns the result.
	Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
}

Backend is the interface for code execution backends. Each backend provides a different level of isolation and security.

Contract:

  • Concurrency: implementations must be safe for concurrent use.
  • Context: must honor cancellation/deadlines and return ctx.Err() when canceled.
  • Errors: validation errors should use ErrInvalidRequest; runtime errors should return ErrExecutionFailed (see errors.go) where applicable.
  • Ownership: requests are read-only; results are caller-owned snapshots.
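The context clause of this contract can be illustrated with a toy backend. This is a sketch only: the types below are trimmed local stand-ins for the ones documented here, and fakeBackend (which "executes" by waiting for a fixed delay) is hypothetical. It shows one way to return ctx.Err() as soon as the context is canceled or its deadline passes.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Trimmed local stand-ins for the documented types.
type BackendKind string

type ExecuteRequest struct {
	Code    string
	Timeout time.Duration
}

type ExecuteResult struct {
	Stdout   string
	Duration time.Duration
}

type Backend interface {
	Kind() BackendKind
	Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
}

// fakeBackend is hypothetical: it simulates work by waiting for a delay.
type fakeBackend struct{ work time.Duration }

var _ Backend = fakeBackend{}

func (b fakeBackend) Kind() BackendKind { return "fake" }

func (b fakeBackend) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error) {
	if req.Timeout > 0 {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, req.Timeout)
		defer cancel()
	}
	start := time.Now()
	timer := time.NewTimer(b.work)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		// Honor cancellation and deadlines by surfacing ctx.Err().
		return ExecuteResult{}, ctx.Err()
	case <-timer.C:
		return ExecuteResult{Stdout: "ran: " + req.Code, Duration: time.Since(start)}, nil
	}
}

// isCanceled runs the backend with an already-canceled context.
func isCanceled() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := fakeBackend{work: time.Second}.Execute(ctx, ExecuteRequest{Code: "x"})
	return errors.Is(err, context.Canceled)
}

// okRun runs the backend to completion and returns its stdout.
func okRun() string {
	res, err := fakeBackend{work: time.Millisecond}.Execute(context.Background(), ExecuteRequest{Code: "x"})
	if err != nil {
		return "error: " + err.Error()
	}
	return res.Stdout
}

func main() {
	fmt.Println(isCanceled())
	fmt.Println(okRun())
}
```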

type BackendContract

type BackendContract struct {
	// NewBackend creates a fresh backend instance for testing.
	NewBackend func() Backend

	// NewGateway creates a gateway for testing.
	// The gateway should allow at least basic tool operations.
	NewGateway func() ToolGateway

	// ExpectedKind is the BackendKind the backend should return.
	ExpectedKind BackendKind

	// SkipExecutionTests skips tests that require actual code execution.
	// Useful for backends that need complex setup (e.g., Docker).
	SkipExecutionTests bool
}

BackendContract defines tests that any Backend implementation must pass. Use RunBackendContractTests to test an implementation.

type BackendInfo

type BackendInfo struct {
	// Kind identifies the type of backend.
	Kind BackendKind

	// Details contains backend-specific information.
	Details map[string]any
}

BackendInfo contains information about the execution backend.

type BackendKind

type BackendKind string

BackendKind identifies the type of execution backend.

const (
	// BackendUnsafeHost runs code directly on the host.
	// WARNING: No isolation - use only for trusted code in development.
	BackendUnsafeHost BackendKind = "unsafe_host"

	// BackendDocker runs code in Docker containers.
	// Good default isolation with cgroups, read-only rootfs, user remapping, and seccomp.
	BackendDocker BackendKind = "docker"

	// BackendContainerd runs code via containerd directly.
	// Similar to Docker but more infrastructure-native for servers/agents.
	BackendContainerd BackendKind = "containerd"

	// BackendKubernetes executes snippets in short-lived pods/jobs.
	// Isolation depends on configured runtime class; best for scheduling and multi-tenant controls.
	BackendKubernetes BackendKind = "kubernetes"

	// BackendGVisor runs code with gVisor (runsc) for stronger isolation.
	// Appropriate for untrusted multi-tenant execution.
	BackendGVisor BackendKind = "gvisor"

	// BackendKata runs code in Kata Containers for VM-level isolation.
	// Stronger isolation than plain containers.
	BackendKata BackendKind = "kata"

	// BackendFirecracker runs code in Firecracker microVMs.
	// Strongest isolation; higher complexity and operational cost.
	BackendFirecracker BackendKind = "firecracker"

	// BackendWASM runs code compiled to WebAssembly.
	// Strong in-process isolation; requires constrained SDK surface.
	BackendWASM BackendKind = "wasm"

	// BackendTemporal treats snippet execution as a Temporal workflow/activity.
	// Useful for long-running or resumable executions.
	// Note: Temporal is orchestration, not isolation - must compose with sandbox backends.
	BackendTemporal BackendKind = "temporal"

	// BackendRemote executes code on a remote runtime service.
	// Generic target for dedicated runtime services, batch systems, or job runners.
	BackendRemote BackendKind = "remote"
)

type DefaultRuntime

type DefaultRuntime struct {
	// contains filtered or unexported fields
}

DefaultRuntime is the default implementation of Runtime. It routes requests to backends based on security profiles.

func NewDefaultRuntime

func NewDefaultRuntime(cfg RuntimeConfig) *DefaultRuntime

NewDefaultRuntime creates a new DefaultRuntime with the given configuration.

func (*DefaultRuntime) Execute

func (r *DefaultRuntime) Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)

Execute implements the Runtime interface.

func (*DefaultRuntime) RegisterBackend

func (r *DefaultRuntime) RegisterBackend(profile SecurityProfile, backend Backend)

RegisterBackend registers a backend for a security profile. It is safe for concurrent use and may be called while the runtime is serving requests.

func (*DefaultRuntime) UnregisterBackend

func (r *DefaultRuntime) UnregisterBackend(profile SecurityProfile)

UnregisterBackend removes the backend registered for a security profile. It is safe for concurrent use and may be called while the runtime is serving requests.
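The documentation does not show how DefaultRuntime guards its backend map. One common way to get this behavior is a map protected by a sync.RWMutex, sketched below. The registry type, its method names, and the reduced Backend interface are all hypothetical stand-ins, not the package's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

type SecurityProfile string

// Backend is reduced to a name for this sketch.
type Backend interface{ Name() string }

type named string

func (n named) Name() string { return string(n) }

// registry is a hypothetical profile-to-backend table.
type registry struct {
	mu       sync.RWMutex
	backends map[SecurityProfile]Backend
}

func newRegistry() *registry {
	return &registry{backends: make(map[SecurityProfile]Backend)}
}

// Register takes the write lock, so it can race safely with lookups.
func (r *registry) Register(p SecurityProfile, b Backend) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.backends[p] = b
}

func (r *registry) Unregister(p SecurityProfile) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.backends, p)
}

// Lookup takes only the read lock, so concurrent executions do not block each other.
func (r *registry) Lookup(p SecurityProfile) (Backend, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	b, ok := r.backends[p]
	return b, ok
}

// demo registers concurrently, then reports presence before and after removal.
func demo() (before, after bool) {
	r := newRegistry()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.Register("standard", named("docker"))
		}()
	}
	wg.Wait()
	_, before = r.Lookup("standard")
	r.Unregister("standard")
	_, after = r.Lookup("standard")
	return before, after
}

func main() {
	fmt.Println(demo())
}
```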

type ExecuteRequest

type ExecuteRequest struct {
	// Language specifies the programming language of the code.
	// If empty, the backend's default language is used.
	Language string

	// Code is the source code to execute.
	// Required.
	Code string

	// Timeout specifies the maximum duration for execution.
	// If zero, the backend's default timeout is used.
	Timeout time.Duration

	// Limits specifies resource limits for execution.
	Limits Limits

	// Profile specifies the security profile to use.
	// If empty, the runtime's default profile is used.
	Profile SecurityProfile

	// Gateway is the tool gateway exposed to the executed code.
	// Required.
	Gateway ToolGateway

	// Metadata contains arbitrary metadata for the execution.
	Metadata map[string]any
}

ExecuteRequest specifies the parameters for code execution.

func (ExecuteRequest) Validate

func (r ExecuteRequest) Validate() error

Validate checks that the request is valid.
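The exact checks Validate performs are not spelled out here. A plausible sketch, based only on the "Required" field comments and the ErrMissingCode and ErrMissingGateway sentinels, is shown below. The order of the checks and the trimmed struct are assumptions; the real method may also validate Limits or Profile.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the documented sentinels.
var (
	ErrMissingGateway = errors.New("gateway is required")
	ErrMissingCode    = errors.New("code is required")
)

// ToolGateway is reduced to an empty interface for this sketch.
type ToolGateway interface{}

// ExecuteRequest is trimmed to the two required fields.
type ExecuteRequest struct {
	Code    string
	Gateway ToolGateway
}

// Validate rejects requests that lack a required field.
// The check order (code first, then gateway) is an assumption.
func (r ExecuteRequest) Validate() error {
	if r.Code == "" {
		return ErrMissingCode
	}
	if r.Gateway == nil {
		return ErrMissingGateway
	}
	return nil
}

func main() {
	fmt.Println(ExecuteRequest{}.Validate())
	fmt.Println(ExecuteRequest{Code: "__out = 1"}.Validate())
	fmt.Println(ExecuteRequest{Code: "__out = 1", Gateway: struct{}{}}.Validate())
}
```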

type ExecuteResult

type ExecuteResult struct {
	// Value is the final result of the code execution.
	// Typically captured via the __out variable convention.
	Value any

	// Stdout contains any output written to stdout.
	Stdout string

	// Stderr contains any output written to stderr.
	Stderr string

	// ToolCalls records all tool invocations made during execution.
	ToolCalls []ToolCallRecord

	// Duration is the total execution time.
	Duration time.Duration

	// Backend contains information about the backend that executed the code.
	Backend BackendInfo

	// LimitsEnforced reports which limits the backend was able to enforce.
	// Backends that cannot enforce a given limit should set that field to false.
	// This allows callers to know when limits degraded gracefully.
	LimitsEnforced LimitsEnforced
}

ExecuteResult contains the outcome of code execution.

type GatewayContract

type GatewayContract struct {
	// NewGateway creates a fresh gateway instance for testing.
	// The gateway should be configured with at least one tool for search/describe tests.
	NewGateway func() ToolGateway
}

GatewayContract defines tests that any ToolGateway implementation must pass. Use RunGatewayContractTests to test an implementation.

type Limits

type Limits struct {
	// MaxToolCalls limits the number of tool invocations.
	// Zero means unlimited.
	MaxToolCalls int

	// MaxChainSteps limits the number of steps in a tool chain.
	// Zero means unlimited.
	MaxChainSteps int

	// CPUQuotaMillis limits CPU time in milliseconds.
	// Zero means unlimited.
	CPUQuotaMillis int64

	// MemoryBytes limits memory usage in bytes.
	// Zero means unlimited.
	MemoryBytes int64

	// PidsMax limits the number of processes/threads.
	// Zero means unlimited.
	PidsMax int64

	// DiskBytes limits disk usage in bytes.
	// Zero means unlimited.
	DiskBytes int64
}

Limits specifies resource limits for execution. Zero values represent "unlimited" for that resource.

func (Limits) Validate

func (l Limits) Validate() error

Validate checks that all limit values are valid (non-negative).
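A non-negativity check over these fields might look like the sketch below. The struct mirrors the documented Limits type; wrapping ErrInvalidLimits with the offending field name is an assumption about the error shape, and isInvalid is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrInvalidLimits = errors.New("invalid limits")

// Limits mirrors the documented struct.
type Limits struct {
	MaxToolCalls   int
	MaxChainSteps  int
	CPUQuotaMillis int64
	MemoryBytes    int64
	PidsMax        int64
	DiskBytes      int64
}

// Validate reports the first negative field, wrapped in ErrInvalidLimits.
func (l Limits) Validate() error {
	checks := []struct {
		name  string
		value int64
	}{
		{"MaxToolCalls", int64(l.MaxToolCalls)},
		{"MaxChainSteps", int64(l.MaxChainSteps)},
		{"CPUQuotaMillis", l.CPUQuotaMillis},
		{"MemoryBytes", l.MemoryBytes},
		{"PidsMax", l.PidsMax},
		{"DiskBytes", l.DiskBytes},
	}
	for _, c := range checks {
		if c.value < 0 {
			return fmt.Errorf("%w: %s must be non-negative, got %d", ErrInvalidLimits, c.name, c.value)
		}
	}
	return nil
}

// isInvalid reports whether validation failed with ErrInvalidLimits.
func isInvalid(l Limits) bool {
	return errors.Is(l.Validate(), ErrInvalidLimits)
}

func main() {
	fmt.Println(Limits{MaxToolCalls: 10}.Validate())
	fmt.Println(Limits{MemoryBytes: -1}.Validate())
}
```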

type LimitsEnforced

type LimitsEnforced struct {
	// Timeout indicates whether the timeout was enforced.
	Timeout bool

	// ToolCalls indicates whether tool call limits were enforced.
	ToolCalls bool

	// ChainSteps indicates whether chain step limits were enforced.
	ChainSteps bool

	// Memory indicates whether memory limits were enforced.
	Memory bool

	// CPU indicates whether CPU limits were enforced.
	CPU bool

	// Pids indicates whether process limits were enforced.
	Pids bool

	// Disk indicates whether disk limits were enforced.
	Disk bool
}

LimitsEnforced reports which resource limits were actually enforced by the backend. Backends that cannot enforce a limit should set that field to false.
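A caller that cares about degraded limits can compare what it requested with what the backend reports. The sketch below mirrors the two documented structs locally; the unenforced helper and the label strings are hypothetical. Timeout is left out because it is requested through ExecuteRequest.Timeout rather than Limits.

```go
package main

import (
	"fmt"
	"strings"
)

// Local mirrors of the documented structs.
type Limits struct {
	MaxToolCalls   int
	MaxChainSteps  int
	CPUQuotaMillis int64
	MemoryBytes    int64
	PidsMax        int64
	DiskBytes      int64
}

type LimitsEnforced struct {
	Timeout, ToolCalls, ChainSteps, Memory, CPU, Pids, Disk bool
}

// unenforced lists limits that were requested (non-zero) but not enforced.
func unenforced(req Limits, got LimitsEnforced) []string {
	var missing []string
	check := func(requested, enforced bool, name string) {
		if requested && !enforced {
			missing = append(missing, name)
		}
	}
	check(req.MaxToolCalls > 0, got.ToolCalls, "tool_calls")
	check(req.MaxChainSteps > 0, got.ChainSteps, "chain_steps")
	check(req.CPUQuotaMillis > 0, got.CPU, "cpu")
	check(req.MemoryBytes > 0, got.Memory, "memory")
	check(req.PidsMax > 0, got.Pids, "pids")
	check(req.DiskBytes > 0, got.Disk, "disk")
	return missing
}

func main() {
	req := Limits{MaxToolCalls: 20, MemoryBytes: 256 << 20, PidsMax: 64}
	got := LimitsEnforced{ToolCalls: true, Pids: true}
	fmt.Println(strings.Join(unenforced(req, got), ","))
}
```

A caller running untrusted code could treat a non-empty result as a reason to reject the output or to retry on a stricter backend.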

type Logger

type Logger interface {
	Info(msg string, args ...any)
	Warn(msg string, args ...any)
	Error(msg string, args ...any)
}

Logger is an optional interface for logging.

Contract:

  • Concurrency: implementations must be safe for concurrent use.
  • Errors: logging must be best-effort and must not panic.
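The three method signatures match those of *slog.Logger from the standard library's log/slog package (Go 1.21 or later), so a slog logger can be passed wherever this interface is expected without an adapter. The sketch below redeclares the interface locally to demonstrate that; logOnce and logged are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// Logger mirrors the documented interface.
type Logger interface {
	Info(msg string, args ...any)
	Warn(msg string, args ...any)
	Error(msg string, args ...any)
}

// Compile-time check that *slog.Logger satisfies Logger.
var _ Logger = (*slog.Logger)(nil)

// logOnce writes one warning through the interface and returns the output.
func logOnce() string {
	var buf bytes.Buffer
	var l Logger = slog.New(slog.NewTextHandler(&buf, nil))
	l.Warn("backend degraded", "backend", "docker")
	return buf.String()
}

// logged reports whether the level and attribute reached the handler.
func logged() bool {
	out := logOnce()
	return strings.Contains(out, "level=WARN") && strings.Contains(out, "backend=docker")
}

func main() {
	fmt.Print(logOnce())
	fmt.Println(logged())
}
```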

type Runtime

type Runtime interface {
	// Execute runs code with the given request parameters.
	// It selects the appropriate backend based on the security profile
	// and delegates execution.
	Execute(ctx context.Context, req ExecuteRequest) (ExecuteResult, error)
}

Runtime is the main interface for code execution. It manages backends and routes execution requests to the appropriate backend based on the security profile.

Contract:

  • Concurrency: implementations must be safe for concurrent use.
  • Context: must honor cancellation/deadlines and return ctx.Err() when canceled.
  • Errors: request validation should return ErrMissingGateway/ErrInvalidRequest; backend selection failures return ErrRuntimeUnavailable/ErrBackendDenied.
  • Ownership: requests are read-only; results are caller-owned snapshots.

type RuntimeConfig

type RuntimeConfig struct {
	// Backends maps security profiles to their backend implementations.
	Backends map[SecurityProfile]Backend

	// DenyUnsafeProfiles lists profiles that cannot use the unsafe backend.
	// If a profile is listed here and only the unsafe backend is available,
	// execution will be denied.
	DenyUnsafeProfiles []SecurityProfile

	// DefaultProfile is the profile to use when none is specified in the request.
	// If empty and no profile is specified, execution will fail.
	DefaultProfile SecurityProfile

	// Logger is an optional logger for runtime events.
	Logger Logger
}

RuntimeConfig configures a DefaultRuntime instance.
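How these fields might interact during backend selection is sketched below. This is an illustration built from the field comments, not the package's actual routing code: selectBackend, stub, and outcome are hypothetical, and mapping a missing profile or backend to ErrRuntimeUnavailable is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

type SecurityProfile string
type BackendKind string

const BackendUnsafeHost BackendKind = "unsafe_host"

var (
	ErrRuntimeUnavailable = errors.New("runtime unavailable")
	ErrBackendDenied      = errors.New("backend denied by policy")
)

// Backend is reduced to Kind for this sketch.
type Backend interface{ Kind() BackendKind }

type stub BackendKind

func (s stub) Kind() BackendKind { return BackendKind(s) }

// RuntimeConfig is trimmed to the fields used for selection.
type RuntimeConfig struct {
	Backends           map[SecurityProfile]Backend
	DenyUnsafeProfiles []SecurityProfile
	DefaultProfile     SecurityProfile
}

// selectBackend resolves the profile, looks up its backend, and refuses
// the unsafe host backend for profiles on the deny list.
func selectBackend(cfg RuntimeConfig, requested SecurityProfile) (Backend, error) {
	profile := requested
	if profile == "" {
		profile = cfg.DefaultProfile
	}
	b, ok := cfg.Backends[profile]
	if profile == "" || !ok {
		return nil, ErrRuntimeUnavailable
	}
	if b.Kind() == BackendUnsafeHost {
		for _, p := range cfg.DenyUnsafeProfiles {
			if p == profile {
				return nil, ErrBackendDenied
			}
		}
	}
	return b, nil
}

func demoConfig() RuntimeConfig {
	return RuntimeConfig{
		Backends: map[SecurityProfile]Backend{
			"dev":      stub(BackendUnsafeHost),
			"standard": stub("docker"),
			"hardened": stub(BackendUnsafeHost), // deliberate misconfiguration
		},
		DenyUnsafeProfiles: []SecurityProfile{"standard", "hardened"},
		DefaultProfile:     "standard",
	}
}

// outcome returns the selected backend kind or the error text.
func outcome(cfg RuntimeConfig, p SecurityProfile) string {
	b, err := selectBackend(cfg, p)
	if err != nil {
		return err.Error()
	}
	return string(b.Kind())
}

func main() {
	cfg := demoConfig()
	for _, p := range []SecurityProfile{"", "dev", "hardened", "other"} {
		fmt.Printf("%q -> %s\n", p, outcome(cfg, p))
	}
}
```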

type RuntimeError

type RuntimeError struct {
	// Err is the underlying error.
	Err error

	// Op is the operation that failed (e.g., "execute", "container_create").
	Op string

	// Backend is the backend kind that was in use when the error occurred.
	Backend BackendKind

	// Retryable indicates whether the operation can be retried.
	// True for transient errors like timeouts; false for policy violations.
	Retryable bool
}

RuntimeError wraps an error with execution context information. It provides the operation that failed and the backend that was in use.

func (*RuntimeError) Error

func (e *RuntimeError) Error() string

Error returns the error message with context.

func (*RuntimeError) Unwrap

func (e *RuntimeError) Unwrap() error

Unwrap returns the underlying error for errors.Is and errors.As.
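Because RuntimeError implements Unwrap, callers can use errors.As to read the Retryable flag and errors.Is to match the underlying sentinel, even through further wrapping. The sketch below redeclares the type locally; the Error() message format and the helper names (shouldRetry, isTimeout, sample) are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

type BackendKind string

var ErrTimeout = errors.New("execution timeout")

// RuntimeError mirrors the documented struct.
type RuntimeError struct {
	Err       error
	Op        string
	Backend   BackendKind
	Retryable bool
}

// Error's exact format is an assumption for this sketch.
func (e *RuntimeError) Error() string {
	return fmt.Sprintf("%s [%s]: %v", e.Op, e.Backend, e.Err)
}

func (e *RuntimeError) Unwrap() error { return e.Err }

// shouldRetry finds a RuntimeError anywhere in the chain and reads its flag.
func shouldRetry(err error) bool {
	var re *RuntimeError
	return errors.As(err, &re) && re.Retryable
}

// isTimeout matches the sentinel through RuntimeError.Unwrap.
func isTimeout(err error) bool {
	return errors.Is(err, ErrTimeout)
}

// sample builds a wrapped, retryable timeout error.
func sample() error {
	return fmt.Errorf("run snippet: %w", &RuntimeError{
		Err:       ErrTimeout,
		Op:        "execute",
		Backend:   "docker",
		Retryable: true,
	})
}

func main() {
	err := sample()
	fmt.Println(err)
	fmt.Println(shouldRetry(err), isTimeout(err))
}
```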

type SecurityProfile

type SecurityProfile string

SecurityProfile determines the security level for execution. Higher security profiles impose more restrictions but provide better isolation.

const (
	// ProfileDev is development mode with minimal restrictions.
	// WARNING: This profile runs code with host access - use only for development.
	ProfileDev SecurityProfile = "dev"

	// ProfileStandard provides standard isolation.
	// Includes: no network access, read-only rootfs, resource limits.
	ProfileStandard SecurityProfile = "standard"

	// ProfileHardened provides maximum isolation.
	// Includes: seccomp profiles, stricter resource limits, additional syscall filtering.
	ProfileHardened SecurityProfile = "hardened"
)

func (SecurityProfile) IsValid

func (p SecurityProfile) IsValid() bool

IsValid returns true if the SecurityProfile is a known valid value.
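A check of this kind is usually a switch over the known constants. The sketch below redeclares the type and constants locally; treating the comparison as case-sensitive and the empty string as invalid are assumptions.

```go
package main

import "fmt"

type SecurityProfile string

const (
	ProfileDev      SecurityProfile = "dev"
	ProfileStandard SecurityProfile = "standard"
	ProfileHardened SecurityProfile = "hardened"
)

// IsValid reports whether p is one of the three known profiles.
func (p SecurityProfile) IsValid() bool {
	switch p {
	case ProfileDev, ProfileStandard, ProfileHardened:
		return true
	}
	return false
}

func main() {
	for _, p := range []SecurityProfile{"dev", "hardened", "Hardened", ""} {
		fmt.Printf("%q valid: %v\n", p, p.IsValid())
	}
}
```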

type ToolCallRecord

type ToolCallRecord struct {
	// ToolID is the canonical identifier of the tool that was called.
	ToolID string

	// BackendKind indicates which backend executed the tool.
	BackendKind string

	// Duration is the execution time for this tool call.
	Duration time.Duration

	// ErrorOp indicates the operation that failed, if any.
	ErrorOp string
}

ToolCallRecord captures information about a single tool invocation.

type ToolGateway

type ToolGateway interface {
	// SearchTools searches for tools matching the query.
	SearchTools(ctx context.Context, query string, limit int) ([]index.Summary, error)

	// ListNamespaces returns all available tool namespaces.
	ListNamespaces(ctx context.Context) ([]string, error)

	// DescribeTool returns documentation for a tool at the specified detail level.
	DescribeTool(ctx context.Context, id string, level tooldoc.DetailLevel) (tooldoc.ToolDoc, error)

	// ListToolExamples returns up to maxExamples usage examples for a tool.
	ListToolExamples(ctx context.Context, id string, maxExamples int) ([]tooldoc.ToolExample, error)

	// RunTool executes a single tool and returns the result.
	RunTool(ctx context.Context, id string, args map[string]any) (run.RunResult, error)

	// RunChain executes a sequence of tool calls.
	RunChain(ctx context.Context, steps []run.ChainStep) (run.RunResult, []run.StepResult, error)
}

ToolGateway is the interface for tool operations exposed to sandboxed code. It provides a proxy for tool discovery and execution while maintaining the trust boundary between the sandbox and the host.

Contract:

  • Concurrency: implementations must be safe for concurrent use.
  • Context: methods must honor cancellation/deadlines and return ctx.Err() when canceled.
  • Ownership: args are read-only; results are caller-owned snapshots.
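Since all tool traffic from the sandbox passes through the gateway, it is a natural place to enforce a tool call limit. The sketch below wraps a simplified, hypothetical one-method interface (toolRunner, with an any result instead of run.RunResult) rather than the full ToolGateway, whose dependencies are not available here. It counts calls atomically and returns ErrResourceLimit once the cap is exceeded, treating zero as unlimited in line with the Limits documentation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
)

var ErrResourceLimit = errors.New("resource limit exceeded")

// toolRunner is a simplified stand-in for the RunTool part of ToolGateway.
type toolRunner interface {
	RunTool(ctx context.Context, id string, args map[string]any) (any, error)
}

// limitedRunner is a hypothetical wrapper that caps the number of tool calls.
type limitedRunner struct {
	next  toolRunner
	max   int64 // zero means unlimited
	calls atomic.Int64
}

func (l *limitedRunner) RunTool(ctx context.Context, id string, args map[string]any) (any, error) {
	// Honor cancellation before doing any work.
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	// Add is atomic, so concurrent callers cannot both take the last slot.
	if l.max > 0 && l.calls.Add(1) > l.max {
		return nil, ErrResourceLimit
	}
	return l.next.RunTool(ctx, id, args)
}

// echoRunner returns the tool ID as its result.
type echoRunner struct{}

func (echoRunner) RunTool(_ context.Context, id string, _ map[string]any) (any, error) {
	return id, nil
}

// runN makes n calls through a runner capped at max and counts successes.
func runN(max int64, n int) int {
	r := &limitedRunner{next: echoRunner{}, max: max}
	ok := 0
	for i := 0; i < n; i++ {
		if _, err := r.RunTool(context.Background(), "demo.echo", nil); err == nil {
			ok++
		}
	}
	return ok
}

func main() {
	fmt.Println(runN(2, 5))
	fmt.Println(runN(0, 5))
}
```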

Directories

Path Synopsis

backend
  • containerd: Package containerd provides a backend that executes code via containerd.
  • docker: Package docker provides a backend that executes code in Docker containers with configurable security profiles and resource limits.
  • firecracker: Package firecracker provides a backend that executes code in Firecracker microVMs.
  • gvisor: Package gvisor provides a backend that executes code with gVisor (runsc).
  • kata: Package kata provides a backend that executes code in Kata Containers.
  • kubernetes: Package kubernetes provides a backend that executes code in Kubernetes pods/jobs.
  • remote: Package remote provides a backend that executes code on a remote runtime service.
  • temporal: Package temporal provides a backend that treats snippet execution as a Temporal workflow/activity.
  • unsafe: Package unsafe provides a backend that executes code directly on the host.
  • wasm: Package wasm provides a backend that executes code compiled to WebAssembly.

gateway
  • direct: Package direct provides a gateway that implements ToolGateway by directly delegating to toolindex, tooldocs, and toolrun components.
  • proxy: Package proxy provides a gateway that implements ToolGateway by serializing requests over a connection (for cross-process/container communication).

Package toolcodeengine provides an adapter that implements code.Engine using runtime.Runtime for execution.
