clrk

module
v0.0.0-...-c5f7cd3 Latest
Published: Jun 16, 2026 License: AGPL-3.0

README

CLRK

CLRK (we pronounce it as "clerk") is a Kubernetes-native runtime for LLM agents. It runs each agent in a gVisor sandbox and transparently intercepts all egress - LLM APIs, MCP, tool calls - without modifying agent code. That interception point gives you observability, policy enforcement, and routing-based cost control over agents you don't otherwise get to see inside.

How it works

CLRK runs untrusted, framework-agnostic agent workloads in gVisor sandboxes. You describe an agent declaratively - a container image + a trigger + an egress policy - and CLRK schedules it onto a pool of sandbox workers. It brings its own scheduler, so agent startup isn't gated on pod-creation latency. Every byte in or out of the sandbox passes through a transparent proxy CLRK controls, so the platform sees and governs all LLM API calls, MCP traffic, and outbound tool calls without the agent code being aware of it. Yes, that includes TLS-encrypted connections.

The agent inside can be anything that makes HTTP(S) calls - a Python script using the OpenAI or Anthropic SDK, a Node MCP client, a shell one-liner. There is no required agent library; CLRK intercepts at the network and process boundary. See _examples/ for runnable agents (openai-bot, gemini-bot, cron-bot, jq-bot, MITM variants, ...).
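To make "no required agent library" concrete, here is a minimal sketch of what an agent's outbound call might look like in Go. It is an illustration, not one of the agents shipped in _examples/: the newChatRequest helper, the payload types, and the model name are placeholders chosen for this sketch. Note what is absent: there is no CLRK import and no Authorization header.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest model only the payload fields this sketch needs.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest builds an ordinary HTTPS request to the provider.
// It sets no credentials; the agent is assumed to hold none.
func newChatRequest(prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    "gpt-4o-mini", // placeholder model name
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost,
		"https://api.openai.com/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("hello")
	if err != nil {
		panic(err)
	}
	// Inside a sandbox the agent would now call http.DefaultClient.Do(req).
	fmt.Println(req.Method, req.URL.Host, req.Header.Get("Authorization") == "")
}
```

The same shape applies to any language or SDK: the agent speaks plain HTTP(S) to the provider's usual hostname and leaves everything else to the platform.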

Motivation

Running agents in production raises problems that general-purpose container orchestration does not solve on its own. CLRK is built to address them directly:

  • Observability. All I/O in and out of a sandbox is intercepted and logged, so LLM, MCP, and remote tool-call telemetry is auto-instrumented rather than bolted on per-framework.
  • Governance. Prevent sandbox escape and apply organization-wide policy (where an agent may connect, what credentials it may use) at the egress boundary.
  • Attribution. Tie agent loops back to the customer request or trigger that started them, captured as first-class Invocation records.
  • Connectivity. Give agents audited, authorized access to internal services instead of all-or-nothing network access.
  • Scalability. One model for both serverless bursts and long-lived "on-prem" Kubernetes fleets.
  • Reliability. Simple, robust retries and load-shedding, all handled outside the Kubernetes control plane.

A deliberate design choice that follows from governance: credentials never live in the agent. API keys for AI providers, MCP servers, and internal services are injected by the egress MITM at request time, never via pod env, mounts, or args - so a compromised sandbox cannot exfiltrate them.

Architecture

CLRK ships two long-running binaries plus a CLI:

  • cmd/controller-manager - the control plane. Runs the controller-runtime reconcilers for the CRDs below and embeds an aggregated API server for the clrk.apoxy.dev group. Deployed as a Deployment on Kubernetes but can be run standalone.

  • cmd/worker - the sandbox worker. Manages sandbox lifecycle via gVisor/runsc and sets up per-sandbox network interception through our custom sentrystack plugin, so that sandbox traffic is routed through the interception path. Linux-only (//go:build linux, CGO).

  • cmd/clrk - the operator/developer CLI: install, upgrade, dev, apply, get, logs, traces, status, run-task, context management, and a local-cluster dev loop.

Egress interception. Outbound traffic is captured transparently and sent through an EgressGateway - an Envoy-based data plane with TLS termination (MITM) and a custom filter. This is where telemetry is recorded, credentials are injected, and routing/governance policies (EgressL4Route, MCPRoute, AIProviderRoute, egress/credential/fallback-routing/logging/rate-limit policies) are applied.
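As a rough illustration of the kind of decision made at this interception point, the sketch below checks an outbound hostname against an allow-list. The hostAllowed function and its pattern syntax (exact names plus a leading "*." wildcard) are hypothetical and are not CLRK's policy API; real policies are expressed through the CRDs named above.

```go
package main

import (
	"fmt"
	"strings"
)

// hostAllowed reports whether host matches any pattern. A pattern is either
// an exact hostname or "*.suffix", which matches any subdomain of suffix but
// not suffix itself. Matching is case-insensitive.
func hostAllowed(patterns []string, host string) bool {
	host = strings.ToLower(host)
	for _, p := range patterns {
		p = strings.ToLower(p)
		if strings.HasPrefix(p, "*.") {
			suffix := p[1:] // keep the leading dot, e.g. ".openai.azure.com"
			if len(host) > len(suffix) && strings.HasSuffix(host, suffix) {
				return true
			}
			continue
		}
		if host == p {
			return true
		}
	}
	return false
}

func main() {
	allow := []string{"api.anthropic.com", "*.openai.azure.com"}
	for _, h := range []string{
		"api.anthropic.com",
		"team.openai.azure.com",
		"openai.azure.com",
		"example.com",
	} {
		fmt.Printf("%-24s allowed=%v\n", h, hostAllowed(allow, h))
	}
}
```

Keeping the leading dot in the suffix comparison matters: it prevents a lookalike such as evilopenai.azure.com from matching the wildcard.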

Telemetry storage and export. Intercepted I/O becomes Invocation records backed by ClickHouse (via the ch-go driver); it can be consumed through the /logs and /traces subresources or re-exported to an OpenTelemetry sink.
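The package synopses in the Directories section describe these subresources as taskagents/{name}/{logs,traces} and daemonagents/{name}/{logs,traces} in the clrk.apoxy.dev v1alpha1 group. The sketch below builds a request path from those pieces. Two things are assumptions rather than facts from this README: that the resources are namespaced, and that they are served under the conventional /apis/<group>/<version> prefix used by Kubernetes aggregated APIs. The telemetryPath helper itself is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// telemetryPath builds the request path for an agent's logs or traces
// subresource. Assumed, not confirmed: namespaced resources and the
// standard /apis/<group>/<version> URL layout.
func telemetryPath(namespace, resource, name, signal string) (string, error) {
	switch resource {
	case "taskagents", "daemonagents":
	default:
		return "", fmt.Errorf("unsupported resource %q", resource)
	}
	switch signal {
	case "logs", "traces":
	default:
		return "", fmt.Errorf("unsupported signal %q", signal)
	}
	return path.Join("/apis", "clrk.apoxy.dev", "v1alpha1",
		"namespaces", url.PathEscape(namespace),
		resource, url.PathEscape(name), signal), nil
}

func main() {
	p, err := telemetryPath("default", "taskagents", "openai-bot", "traces")
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
}
```

In practice the clrk logs and clrk traces commands are the intended way to read this data; the path is shown only to make the subresource layout tangible.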

TaskAgent vs DaemonAgent. TaskAgent is for triggered, run-to-completion work (HTTP request or cron), multiplexed across shared worker pods. DaemonAgent is for long-lived agent processes with a restart policy.
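The lifecycle difference can be sketched as a single decision taken when an agent process exits. Everything in this example is illustrative: the agentKind and restartPolicy types, the policy names (borrowed from Kubernetes pod restart policies), and the shouldRestart function are assumptions, not the fields of the actual CRDs. Retry handling for failed task runs is out of scope here.

```go
package main

import "fmt"

// agentKind distinguishes the two workload shapes described above.
type agentKind int

const (
	taskAgent   agentKind = iota // triggered, run-to-completion
	daemonAgent                  // long-lived, governed by a restart policy
)

// restartPolicy values are hypothetical and mirror Kubernetes naming.
type restartPolicy string

const (
	restartAlways    restartPolicy = "Always"
	restartOnFailure restartPolicy = "OnFailure"
	restartNever     restartPolicy = "Never"
)

// shouldRestart decides what happens after the agent process exits.
// In this sketch a task run simply ends with its process, whatever the
// exit code; a daemon is restarted according to its policy.
func shouldRestart(kind agentKind, policy restartPolicy, exitCode int) bool {
	if kind == taskAgent {
		return false
	}
	switch policy {
	case restartAlways:
		return true
	case restartOnFailure:
		return exitCode != 0
	default:
		return false
	}
}

func main() {
	fmt.Println(shouldRestart(taskAgent, restartAlways, 1))      // task run is over
	fmt.Println(shouldRestart(daemonAgent, restartOnFailure, 1)) // daemon crashed
	fmt.Println(shouldRestart(daemonAgent, restartOnFailure, 0)) // daemon exited cleanly
}
```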

APIs

| CRD | Purpose |
| --- | --- |
| TaskAgent | Short-lived agent execution (request → sandbox → response) |
| DaemonAgent | Long-lived agent process with restart policy |
| WorkerPool | Fleet of worker pods (Deployment + Service) |
| EgressGateway | Transparent egress proxy with TLS termination modes |
| EgressL4Route | L4 egress routing rules |
| MCPRoute | MCP protocol routing |
| AIProviderRoute | AI-provider-specific egress routing |
| Invocation | Attributed record of an intercepted agent call (ClickHouse-backed) |


Repository layout

| Path | Contents |
| --- | --- |
| api/clrk/v1alpha1/ | CRD types (Apache-2.0) |
| client/ | Generated Kubernetes clientset, listers, informers (Apache-2.0) |
| internal/controller/ | controller-runtime reconcilers |
| internal/worker/, internal/sandbox/ | sandbox lifecycle (Linux-only) |
| internal/eg*, internal/extproc/, internal/egress/ | Envoy egress data plane + interception |
| internal/clickhouse/, internal/chwriter/, internal/otel* | telemetry storage and export |
| internal/install/, cmd/clrk/ | installer and CLI |
| codegen/ | code-generator config (update.sh, header boilerplate) |

FAQ

Does my agent need to use a specific framework or SDK?

No. CLRK intercepts at the network and process boundary, so any agent that makes HTTP/TLS calls works. The provided examples use the OpenAI and Gemini SDKs, plain shell tools, and MCP clients.

Where do API keys live?

Not in the agent. Credentials are injected by the egress MITM at request time via a credential-injection policy - never in pod env, mounts, or args. A compromised sandbox has no secrets to leak.

How is the sandbox isolated?

Sandboxes run via gVisor (runsc) for a stronger syscall boundary, each in its own network namespace with all egress forced through the interception path.

Can I run it locally or do I need a Kubernetes cluster?

Batteries are included: clrk dev brings up a local cluster and dev loop, so you can run CLRK without an existing Kubernetes cluster. clrk install / clrk upgrade manage a Kubernetes-based deployment.

Why are api/ and client/ licensed differently from the rest?

So you can build against the API and use the generated client without AGPL copyleft obligations. See License.

Where is CONTRIBUTING.md?

Currently, external contributions are not accepted. If you encounter a bug or have a feature request, please open an issue on the GitHub repository.

Was this tested? I can't find any tests!

We have tests, we swear! Currently they are coupled with our private build/test infrastructure and are not publicly available. We try to maintain a minimum of 70% unit test coverage and have integration tests for the public API.

Was this vibe-coded?

We rely on AI assistance, but every line of output is carefully reviewed and tested before being committed.

License

CLRK is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0); see LICENSE.

Exception: the api/ and client/ directories are licensed under the Apache License 2.0; see api/LICENSE and client/LICENSE. These cover the public API types (api/clrk/v1alpha1) and the generated Kubernetes client/SDK, so they can be imported and used without AGPL copyleft obligations.

Directories

Path Synopsis
api
clrk/v1alpha1
Package v1alpha1 contains API Schema definitions for the clrk.apoxy.dev v1alpha1 API group.
client
versioned/fake
This package has the automatically generated fake clientset.
versioned/scheme
This package contains the scheme of the automatically generated clientset.
versioned/typed/clrk/v1alpha1
This package has the automatically generated typed clients.
versioned/typed/clrk/v1alpha1/fake
Package fake has the automatically generated clients.
cmd
clrk command
worker command
docs generates the Markdown reference for the clrk CLI.
internal
admin
Package admin hosts the controller-manager's system-state HTTP surface — a single mux under /admin/* that other subsystems register handlers on for runtime introspection.
apiserver/invocation
Package invocation backs the Invocation API kind with a ClickHouse read model materialized from a JetStream event stream.
apiserver/invoke
Package invoke serves the taskagents/{name}/invoke connect subresource: the write-side door the run-task CLI POSTs to.
apiserver/telemetry
Package telemetry serves the read API for an agent's OTel logs and traces: the taskagents/{name}/{logs,traces} and daemonagents/{name}/{logs,traces} connect subresources.
certprovider
Package certprovider implements the envoy.service.tls.v3.
chwriter
Package chwriter persists OTLP logs and traces received by the controller-manager into the embedded ClickHouse engine (internal/clickhouse) using the ch-go columnar driver.
clickhouse
Package clickhouse supervises an embedded clickhouse-server subprocess inside the controller-manager container.
cloudevents
Package cloudevents builds CloudEvents (CNCF) attribute sets for TaskAgent dispatches.
cmd
Package cmd implements the `clrk` CLI built on top of cobra.
cmd/devagents
Package devagents tracks the live state of TaskAgent and DaemonAgent objects in the dev cluster and rolls up per-agent telemetry from the in-process OTLP receiver.
cmd/devotel
Package devotel implements an in-process OTLP/HTTP receiver that `clrk dev` runs to capture egress ext_proc telemetry without having to bundle a real otel-collector container.
cmd/devtui
Package devtui renders the live status of `clrk dev` (agents, k3s, controller-manager, workers) and their per-component logs in a Bubble Tea TUI.
cmd/spangraph
Package spangraph renders OTLP spans as a lazygit-style hierarchy graph: each trace drawn as a tree by parent/child with colored lanes, a status-colored node per span, a compact metadata tail, and a detail block (attributes plus event bodies) the caller expands or collapses for the whole graph at once via a global expand level.
crds
Package crds embeds the upstream Gateway API and Envoy Gateway CRD bundles and applies them via server-side apply, replacing the `kubectl apply -f https://...` step that the dev driver used to run.
drivers
Package drivers orchestrates the docker-side state of `clrk dev`: the shared bridge network the k3d cluster + local registry sit on, and the k3d-library-driven cluster controller that brings them up.
eg
Package eg pins the upstream Envoy Gateway release this clrk build is compatible with.
egcontrolplane
Package egcontrolplane spawns and supervises the upstream `envoy-gateway server` binary as a child of clrk-controller-manager.
egextension
Package egextension implements the Envoy Gateway extension server that rewrites xDS resources on the way out of EG's translator.
egidentity
Package egidentity carries the EgressGateway identity from a calling Envoy proxy to the controller-manager's gRPC services (certprovider, ext_proc) over the HTTP/2 :authority pseudo-header.
egress
Package egress implements CRD-driven egress routing and policy enforcement for sandbox network traffic.
egress/proxyproto
Package proxyproto encodes PROXY protocol v2 frames with clrk-specific TLVs that carry agent identity into the egress data plane.
extproc
Network ext_proc server.
extproc/awssigv4
Package awssigv4 implements AWS Signature Version 4 request signing for credential injection at the egress proxy.
extproc/ingress
Package ingress implements the TaskAgent ingress ext_proc filter.
extproc/invocationctx
Package invocationctx is the controller-manager-local map that carries a TaskAgent invocation's W3C trace parent context from the ingress edge to every outbound LLM/MCP call the agent makes.
extproc/jsonx
Package jsonx pins the egress data path to a single JSON engine: bytedance/sonic in std-compatible mode.
extproc/llmcall
Package llmcall defines clrk's canonical intermediate representation (IR) for LLM API traffic and the provider registry that converts wire schemas (Anthropic, OpenAI, Google, ...) to and from it.
extproc/llmcall/providers/all
Package all registers every built-in llmcall provider via blank imports.
extproc/llmcall/providers/anthropic
Package anthropic is the llmcall provider plugin for api.anthropic.com (the Anthropic Messages API).
extproc/llmcall/providers/azureopenai
Package azureopenai is the llmcall provider plugin for Azure OpenAI (*.openai.azure.com).
extproc/llmcall/providers/bedrock
Package bedrock is the llmcall provider plugin for AWS Bedrock's Converse API (bedrock-runtime.<region>.amazonaws.com).
extproc/llmcall/providers/google
Package google is the llmcall provider plugin for generativelanguage.googleapis.com (Google's consumer Gemini API; "google_genai" in OTel gen_ai semconv).
extproc/llmcall/providers/openai
Package openai is the llmcall provider plugin for api.openai.com (the OpenAI Chat Completions API surface).
extproc/parsers
Package parsers extracts AI-provider-specific facts from buffered HTTP request/response pairs captured by ext_proc.
extproc/tracectx
Package tracectx is the shared W3C trace-context utility used by both the ingress and egress ext_proc filters: it owns the singleton TextMapPropagator, the sensitive-header redaction list, and the helpers that extract a parent context from inbound headers / build a child traceparent for outbound injection.
healthcheck
Package healthcheck maintains the controller-manager's view of every worker pod's routing state and exposes Pick for the ingress ext_proc to choose a per-execution worker.
install
Package install holds the shared control-plane bootstrap engine used by both `clrk dev` (against a local k3d cluster) and `clrk install`/`clrk upgrade` (against a customer's existing cluster).
invevent
Package invevent is the shared JetStream pub/sub layer for Invocation lifecycle events.
llmroute
Package llmroute holds the LLM-rule identity and candidate-resolution logic shared between the EG extension server (which synthesizes the per-rule Envoy routes and multi-endpoint clusters) and the egress ext_proc (which pins matched requests onto those routes and adapts each upstream attempt).
nats
Package nats embeds a nats-server with JetStream enabled inside the controller-manager so workers and in-process producers can publish and subscribe to Invocation lifecycle events without standing up a separate NATS deployment.
otelemit
Package otelemit is the shared OTLP/HTTP TracerProvider+LoggerProvider bootstrap used by both the controller-manager (extproc capture sink) and the worker (sandbox/egress events).
otelforward
Package otelforward re-exports OTLP/HTTP signal bytes that the cm OTLP receiver captured to a customer-configured external endpoint.
otelreceiver
Package otelreceiver is the controller-manager's OTLP/HTTP receiver and the home of OTLP/protobuf decode helpers shared with the `clrk dev` TUI receiver in internal/cmd/devotel.
ports
Package ports holds TCP/gRPC port and resource-name constants shared across the worker, the controller, and the clrk CLI.
sandbox/metadata
Package metadata implements the IMDS-style HTTP service exposed inside each sandbox.
sentrystack
The DNS cache is pure Go and shared between the Sentry-side UDP forwarder (writer) and the Sentry-side TCP forwarder (reader).
version
Package version exposes the clrk binary's own build version.
worker
Package worker implements the sandbox runtime for CLRK worker pods.
workerlog
Package workerlog holds the on-disk layout of per-agent stdio logs teed by the worker.
workerpod
Package workerpod assembles the worker pod template that hosts agent sandboxes.
pkg
sandbox
Package sandbox is the public, tenant-neutral gVisor/runsc sandbox spine of the clrk worker.
sandbox/sentrystack
The InitStr envelope is pure JSON with no gvisor dependencies — leave it without a //go:build constraint so cross-platform unit tests can exercise Encode/Decode without pulling in linux-only gvisor packages.
