substrate

module
v0.0.0-...-00a90a8 Latest
Published: Jun 17, 2026 License: Apache-2.0

README

Agent Substrate

NOTE: This is not an officially supported Google product. This project is not eligible for the Google Open Source Software Vulnerability Rewards Program.

What is Agent Substrate?

Agent Substrate is a system built on top of Kubernetes that manages agent-like workloads at higher scale and efficiency, and with lower latency, than Kubernetes alone can offer. It builds on Kubernetes features such as Pods and Pod autoscaling, but takes the Kubernetes control plane out of the critical path to reduce latency.

It can run on any Kubernetes cluster and does not inhibit “regular” use of Kubernetes in any way. Kubernetes provides the infrastructure provisioning and management for all types of workloads, while Agent Substrate provides agent-specific scheduling and control.

At its core, Agent Substrate maps a larger set of “actors” (applications such as agents) onto a smaller set of ready “workers” (Kubernetes Pods), relying on the fact that agent-like applications tend to be idle most of the time to achieve heavy multiplexing. It provides functionality to manage an actor’s lifecycle (e.g. create/destroy, suspend/resume), to assign actors to workers in real time, and to route incoming traffic to them.
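
The actor-to-worker mapping described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `Pool`, `Resume`, `Suspend`, and `ErrNoWorker` are hypothetical names invented for illustration and are not Agent Substrate's API, and real suspend/resume involves snapshotting state rather than just updating a map.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoWorker is returned when every worker is busy (hypothetical name).
var ErrNoWorker = errors.New("no free worker")

// Pool is an illustrative model of multiplexing many actors onto few workers.
// It is NOT Agent Substrate's API.
type Pool struct {
	free     []string          // idle workers ready to host an actor
	assigned map[string]string // actor name -> worker name
}

// NewPool creates a pool in which every given worker starts out idle.
func NewPool(workers ...string) *Pool {
	return &Pool{
		free:     append([]string(nil), workers...),
		assigned: map[string]string{},
	}
}

// Resume binds an actor to a free worker, or returns the worker it already
// occupies if the actor is currently running.
func (p *Pool) Resume(actor string) (string, error) {
	if w, ok := p.assigned[actor]; ok {
		return w, nil
	}
	if len(p.free) == 0 {
		return "", ErrNoWorker
	}
	w := p.free[len(p.free)-1]
	p.free = p.free[:len(p.free)-1]
	p.assigned[actor] = w
	return w, nil
}

// Suspend releases the actor's worker back to the idle set.
func (p *Pool) Suspend(actor string) {
	if w, ok := p.assigned[actor]; ok {
		delete(p.assigned, actor)
		p.free = append(p.free, w)
	}
}

func main() {
	p := NewPool("worker-a")
	w1, _ := p.Resume("actor-1")
	_, err := p.Resume("actor-2") // the only worker is busy
	fmt.Println(w1, err)

	p.Suspend("actor-1") // an idle actor gives up its worker
	w2, _ := p.Resume("actor-2")
	fmt.Println(w2)
}
```

The model shows why idleness matters: the pool can serve more actors than it has workers only because suspended actors hold no worker at all.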

Agent Substrate is intended to be a low-opinion system. The workloads it manages don't have to be literal AI agents, but those are the best example of the kind of applications it is designed for. It is not an SDK for building agents, but rather a system for running them at scale.

Demo

Agent Substrate Demo

Watch the Agent Substrate cluster multiplex ~250 stateful actor sessions across just 8 physical pods.

This demo highlights the core developer experience and "Agentic Infrastructure" capabilities of Substrate:

  1. Instant Session Teleport: High-performance suspend and resume of actors onto any available worker in the pool with sub-second activation.
  2. State Persistence: Working memory (volatile RAM) and filesystem state are preserved across hibernation cycles via full-state snapshots.
  3. Agent Swarm Multiplexing: Demonstrates 30x+ oversubscription by "juggling" a large registry of stateful actors onto a small pool of shared physical pods.
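
As a quick check on the figures quoted above, the oversubscription ratio is simply actors divided by pods. The helper name `OversubscriptionRatio` is an assumption made for this sketch; the inputs are the approximate demo numbers (~250 sessions, 8 pods).

```go
package main

import "fmt"

// OversubscriptionRatio returns how many actors share each worker pod on average.
// Hypothetical helper for illustration only.
func OversubscriptionRatio(actors, pods int) float64 {
	return float64(actors) / float64(pods)
}

func main() {
	// 250 / 8 = 31.25, consistent with the "30x+" claim.
	fmt.Printf("%.2fx\n", OversubscriptionRatio(250, 8))
}
```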

To reproduce this demo in your own cluster, please refer to the detailed walkthroughs in the Counter Demo and Secret Agent Demo.

For more videos and walkthroughs, visit our YouTube channel: agent-substrate.

Framework Agnostic & Compatibility

Agent Substrate is designed to be framework and agent harness agnostic. Because it manages standard OCI containers at the kernel level (via gVisor), it can host agents built on any stack.

  • Agent Development Kit (ADK): Native support for ADK-compatible session identity and persistent working memory.
  • LangChain: Ideal execution environment for long-running, stateful LangChain agents and sandboxed tool-calling.
  • Claude Code & Codex: Support for high-density, stateful coding environments that preserve terminal and filesystem state across sessions.
  • Model Context Protocol (MCP): Deploy secure, sandboxed MCP servers as Substrate Actors to provide durable tools for any LLM.

Ecosystem & Examples

Status and compatibility

Agent Substrate is currently in VERY early development. It is not ready for production use, and the APIs are almost guaranteed to change. We are not making any guarantees about backward compatibility at this stage, and everything in this project may be changed.

Supported Kubernetes Releases

Currently we aim to support the latest stable release of Kubernetes, and the previous minor release.
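
The support window above (latest stable minor plus the previous minor) can be expressed as a small check. This is a sketch, not project tooling: `InSupportWindow` and `parseVersion` are hypothetical names, the caller supplies the latest stable version, and the version strings in `main` are arbitrary examples rather than statements about current Kubernetes releases.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion extracts major and minor from strings like "v1.33.2" or "1.33".
func parseVersion(v string) (major, minor int, err error) {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("malformed version %q", v)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, err
	}
	if minor, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

// InSupportWindow reports whether cluster is on the latest stable minor
// release or the one immediately before it.
func InSupportWindow(latest, cluster string) (bool, error) {
	lMaj, lMin, err := parseVersion(latest)
	if err != nil {
		return false, err
	}
	cMaj, cMin, err := parseVersion(cluster)
	if err != nil {
		return false, err
	}
	return cMaj == lMaj && (cMin == lMin || cMin == lMin-1), nil
}

func main() {
	// Example inputs only.
	ok, err := InSupportWindow("v1.33.0", "v1.32.4")
	fmt.Println(ok, err)
}
```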

Community

For announcements, technical discussions, and community support, please join the ate-dev Google Group.

We host a weekly community meeting every Thursday from 10:00am - 11:00am PST.

We also have channels in the CNCF Slack; request an invite here if you don't have access.

Developing

Please see CONTRIBUTING.md for guidelines on contributing to the project. We welcome contributions of all kinds, but the project is VERY young. Our immediate focus is on building out the core system and demos, so we may not be able to review or merge contributions that don't align with those goals in the near term.

Quickstart (Development)

To quickly set up the complete environment:

  1. Make sure you have Go, kubectl, and Docker installed and configured on your dev machine. Other dependencies, including kind, are managed automatically via Go.

  2. Run the following steps:

# create cluster and local registry
hack/create-kind-cluster.sh

# install ate, valkey, rustfs
hack/install-ate-kind.sh --deploy-ate-system

# install counter demo
hack/install-ate-kind.sh --deploy-demo-counter

# install kubectl-ate
go install ./cmd/kubectl-ate

# create a counter actor and demo it
kubectl ate create actor my-counter-1 --template ate-demo-counter/counter

# port-forward the network router to bind to local port `8000`
kubectl port-forward -n ate-system svc/atenet-router 8000:80

  3. In a separate terminal, send an HTTP request to increment the counter:

curl -X POST -H "Host: my-counter-1.actors.resources.substrate.ate.dev" -i http://localhost:8000/
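
The same request can be sent from Go. The key detail is that the `Host` header must name the actor while the connection goes to the port-forwarded router; in `net/http` that is done by setting `req.Host`. This is a minimal sketch: `NewActorRequest` is a hypothetical helper, and the URL and host name are the ones from the curl command above.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// NewActorRequest builds a POST to routerURL whose Host header is actorHost.
// Hypothetical helper, not part of the project.
func NewActorRequest(routerURL, actorHost string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, routerURL, nil)
	if err != nil {
		return nil, err
	}
	// req.Host overrides the Host header that would otherwise be derived
	// from the URL (localhost:8000 here).
	req.Host = actorHost
	return req, nil
}

func main() {
	req, err := NewActorRequest(
		"http://localhost:8000/",
		"my-counter-1.actors.resources.substrate.ate.dev",
	)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Requires the port-forward from the previous step to be running.
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```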

GKE Quickstart (Development)

  1. Create and configure your environment file:

    cp hack/ate-dev-env.sh.example .ate-dev-env.sh
    
    # Edit .ate-dev-env.sh to match your project and preferences, then source it:
    source .ate-dev-env.sh
    
  2. Enable application-default credentials for gcloud:

    gcloud auth application-default login --project=${PROJECT_ID}
    
  3. Provision the required GCP resources (GKE cluster, Redis, GCS, and IAM bindings):

    go run ./tools/setup-gcp --all
    
  4. Deploy the Agent Substrate system to your cluster (remember to navigate back to the root directory of this repo before running the following commands):

    ./hack/install-ate.sh --deploy-ate-system
    
  5. You can then deploy the sample applications. See demos/counter/README.md or demos/sandbox/README.md for detailed walkthroughs.

    ./hack/install-ate.sh --deploy-demo-counter
    
Custom Setup and Deployment

You can run individual setup steps to create GCP resources as needed. See go run ./tools/setup-gcp --help for available options. For example:

go run ./tools/setup-gcp --create-cluster
go run ./tools/setup-gcp --create-gvisor-node-pool

Similarly, you can deploy or clean up specific Agent Substrate components using the installation script. See ./hack/install-ate.sh --help for all options.

# Re-deploy only ate-apiserver of the ATE system
./hack/install-ate-kind.sh --deploy-ate-apiserver

# Delete everything (core system and all demos)
./hack/install-ate-kind.sh --delete-all

Tearing down resources (GCP)

If you need to delete the resources created by the setup script, you can use the provided script hack/teardown.sh. This script will delete resources in the reverse order of creation and handles partial failures gracefully.

./hack/teardown.sh --all

Or run individual teardown steps as needed (see ./hack/teardown.sh for available options).

Tearing down local kind resources

To delete the local kind cluster and its registry (if it was created by hack/create-kind-cluster.sh), run:

./hack/delete-kind-cluster.sh

Demos

We provide several sample applications demonstrating Agent Substrate's capabilities:

  1. Counter Demo: A stateful Go HTTP server demonstrating state preservation across suspends/resumes, and dynamic CRD routing.
  2. Sandbox Demo (Antigravity): A secure, sandboxed execution environment (running Alpine Linux) that allows arbitrary shell execution while preserving filesystem state across sessions.
  3. Claude Code Multiplex: Demonstrates oversubscribing physical hardware by multiplexing multiple Claude Code agents onto a limited pool of workers.
  4. Secret Agent: Highlights Substrate's "Zero-Idle" self-suspension and re-animation of volatile process memory.

Documentation & Guides

Tour

Commands

  • cmd/ateapi: The core control plane API server exposing gRPC endpoints to manage actor and worker lifecycles.
  • cmd/atelet: A node-level DaemonSet that supervises physical worker pods, coordinates snapshotting, and manages state transfers.
  • cmd/atecontroller: A Kubernetes controller that reconciles WorkerPool and ActorTemplate custom resources.
  • cmd/atenet: A combined networking controller providing DNS, Envoy routing, and proxy sidecars.
  • cmd/ateom-gvisor: An interior-pod helper running inside sandboxed worker pods to execute runsc checkpoint and restore commands.
  • cmd/podcertcontroller: A "polyfill" that provides Pod Certificate signers that will eventually ship in upstream Kubernetes (with different names).
  • cmd/kubectl-ate: A CLI tool for managing Agent Substrate resources. See its README.
  • tools/setup-gcp: A provisioning utility to set up the necessary GCP infrastructure resources (GKE, GCS, IAM).
  • demos/: Sample applications demonstrating Agent Substrate capabilities.

Directories

Path Synopsis
LICENSES
cmd
ateapi command
ateapi/internal/credbundle
Package credbundle handles credential bundle files written by Kubernetes Pod Certificates.
ateapi/internal/k8sjwt
Package k8sjwt provides a JWT verifier tailored to Kubernetes.
ateapi/internal/store
Package store contains common types for the persistence layer.
ateapi/internal/store/ateredis
Package ateredis is an ate storage backend built on Redis.
atecontroller command
atelet command
atenet command
ateom-gvisor command
kubectl-ate command
podcertcontroller command
Command podcertcontroller is a pod certificate controller that implements two signers.
podcertcontroller/internal/rendezvous
Package rendezvous uses rendezvous hashing to help multiple controller replicas agree on which replica should handle an item.
demos
agent-secret command
claude-code-multiplex/ui command
Demo UI server: substrate multiplex visualization.
counter command
Command counter is a simple server that will be used as a worker pod.
multi-template/fspersist command
Command fspersist is a simple server used as an actor workload.
sandbox command
sandbox/client command
internal
ateompath
Ateom and atelet need to agree on many filesystem paths.
e2e
e2e/fixtures/probe command
Command probe is a minimal introspection actor used by the e2e suites.
envtestbins
Package envtestbins resolves the shared envtest (kubebuilder) binary assets
localca
Package localca implements a CA whose state can be stored in a local file or Kubernetes secret.
localjwtauthority
Package localjwtauthority implements a simple "CA" for JWTs.
serverboot
Package serverboot collects the startup boilerplate shared by the long-running substrate server binaries (ateapi, atelet, ateom-gvisor): slog wiring, OTel tracer + meter providers, a Prometheus + /readyz HTTP surface, and a couple of small helpers for startup fail-fast.
version
Package version exposes build-time identity for substrate binaries.
pkg
api/v1alpha1
Package v1alpha1 contains API Schema definitions for the agents v1alpha1 API group.
client/clientset/versioned/fake
This package has the automatically generated fake clientset.
client/clientset/versioned/scheme
This package contains the scheme of the automatically generated clientset.
client/clientset/versioned/typed/api/v1alpha1
This package has the automatically generated typed clients.
client/clientset/versioned/typed/api/v1alpha1/fake
Package fake has the automatically generated clients.
tools
setup-gcp command
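
The rendezvous package in the listing above is described as using rendezvous hashing so that controller replicas agree on which one handles an item. The general technique (highest-random-weight hashing) can be sketched as follows. This is not the package's implementation: `Owner` and `score` are hypothetical names, and the choice of FNV-1a as the hash is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score hashes a (replica, item) pair to a pseudo-random weight.
func score(replica, item string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(replica))
	h.Write([]byte{0}) // separator so ("ab","c") differs from ("a","bc")
	h.Write([]byte(item))
	return h.Sum64()
}

// Owner returns the replica with the highest score for item, or "" if
// replicas is empty. Ties are broken by the lexically smaller name, so the
// result does not depend on the order of the slice.
func Owner(replicas []string, item string) string {
	best := ""
	var bestScore uint64
	for i, r := range replicas {
		s := score(r, item)
		if i == 0 || s > bestScore || (s == bestScore && r < best) {
			best, bestScore = r, s
		}
	}
	return best
}

func main() {
	replicas := []string{"replica-0", "replica-1", "replica-2"}
	for _, item := range []string{"pod-a", "pod-b", "pod-c"} {
		fmt.Printf("%s -> %s\n", item, Owner(replicas, item))
	}
}
```

Each replica can run `Owner` independently over the same membership list and reach the same answer without coordination; when a replica leaves, only the items it owned move to a new owner.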
