Headless Agent

A Go-based headless agent for workflow automation using LLMs.

⚠️ Disclaimer: This entire codebase was gramcoded — 20 hours of free time across 5 days, texting an AI through Telegram. No IDE. No terminal. Just a couch, a phone, and OpenClaw + Claude Opus 4.5. Review before production use.

Quick Start

Option A: Setup Wizard

# Build
make build

# Run the interactive setup wizard
./bin/agent setup

The setup wizard guides you through configuration for any deployment scenario — from local experimentation to enterprise Kubernetes deployments.

Option B: Manual Setup
1. Build

Prerequisites (for semantic memory with sqlite-vec):

# Debian/Ubuntu
sudo apt-get install libsqlite3-dev

# macOS
brew install sqlite3

# Or build without semantic memory (CGO_ENABLED=0)
# Using Make (recommended)
make build          # Build to ./bin/agent
make install        # Install to ~/.local/bin/agent

# Or manually
cd src
CGO_CFLAGS="-I/usr/include" go build -o agent ./cmd/agent
2. Create a Config File

Create agent.toml:

# Agent Configuration

[agent]
id = "my-agent"
workspace = "/path/to/your/workspace"

[llm]
model = "claude-sonnet-4-20250514"
max_tokens = 4096
thinking = "auto"  # auto|off|low|medium|high

[profiles.reasoning-heavy]
model = "claude-opus-4-20250514"

[profiles.fast]
model = "gpt-4o-mini"

[profiles.code-generation]
model = "gemini-1.5-pro"

# Embedding providers for semantic memory.
#
# Supported:
#   - openai:  text-embedding-3-small, text-embedding-3-large
#   - google:  text-embedding-004
#   - mistral: mistral-embed
#   - cohere:  embed-english-v3.0, embed-multilingual-v3.0
#   - voyage:  voyage-2, voyage-large-2, voyage-code-2
#   - ollama:  nomic-embed-text, mxbai-embed-large (local)
#   - none:    Disables semantic memory (KV still works)
#
# NOT supported (no embedding endpoints):
#   - anthropic (Claude) - use voyage instead
#   - openrouter - chat completions only
#   - groq - chat completions only
#
[embedding]
provider = "openai"              # see supported list above
model = "text-embedding-3-small"

[storage]
path = "~/.local/grid"           # Base directory for sessions, memory, logs
persist_memory = true            # true = memory survives across runs

[telemetry]
enabled = false

Note: The provider field is optional — it's automatically inferred from the model name:

  • claude-* → Anthropic
  • gpt-*, o1-*, o3-* → OpenAI
  • gemini-*, gemma-* → Google
  • mistral-*, mixtral-*, codestral-* → Mistral

Set provider explicitly for:

  • Ambiguous model names
  • OpenAI-compatible endpoints (Groq, OpenRouter, LiteLLM)
  • Local Ollama (ollama-local, optionally with base_url)
  • Ollama Cloud (ollama-cloud)
3. Set API Keys

API keys are loaded in this priority order (highest to lowest):

  1. ~/.config/grid/credentials.toml (if present, with secure permissions)
  2. Environment variables
  3. .env file in current directory

Option A — Environment variables (recommended for production):

export ANTHROPIC_API_KEY="your-key-here"
export OPENAI_API_KEY="your-key-here"

Option B — .env file (auto-loaded from current directory):

# .env
ANTHROPIC_API_KEY=your-key-here
OPENAI_API_KEY=your-key-here

Option C — ~/.config/grid/credentials.toml (shared credentials file):

# ~/.config/grid/credentials.toml

# Simple: single key for all providers
[llm]
api_key = "your-api-key"

# Or provider-specific (optional, overrides [llm])
[anthropic]
api_key = "anthropic-specific-key"

[openai]
api_key = "openai-specific-key"

Priority: [provider] section → [llm] section → environment variable.

This file is shared with other Grid tools and agents.

Security Considerations

For development and small deployments: Using a credentials file is convenient. You can package it into a container for a small set of agents.

For production and larger deployments: Use environment variable injection instead of copying credential files. In Kubernetes, use Secrets mounted as env vars. Credential files:

  • Can accidentally be committed to version control
  • Are harder to rotate across multiple deployments
  • Create copies of secrets that may persist on disk

Never commit credentials to git. Add credentials.toml, .env, and *.pem to your .gitignore.
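
For example:

# .gitignore
credentials.toml
.env
*.pem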

4. Create an Agentfile

Create Agentfile:

NAME hello-world
INPUT topic DEFAULT "Go programming"
GOAL research "Research $topic and list 3 key facts"
GOAL summarize "Summarize the research in one sentence"
RUN main USING research, summarize
5. Run
# Validate (uses ./Agentfile by default)
./agent validate

# Or specify a different file
./agent validate -f path/to/MyAgentfile

# Run the workflow
./agent run --config agent.toml

# Run with custom input
./agent run --config agent.toml --input topic="Rust programming"

# Run a specific Agentfile
./agent run -f examples/hello.agent --config agent.toml
Quick Test (Validate Examples)
cd src

# Build
go build -o agent ./cmd/agent

# Validate all examples
for f in ../examples/*.agent; do ./agent validate "$f"; done

# Run tests
go test ./...

Capability Profiles

Agents can declare capability requirements using REQUIRES. The config maps these to specific LLM providers/models:

Agentfile:

AGENT critic FROM agents/critic.md REQUIRES "reasoning-heavy"
AGENT helper FROM agents/helper.md REQUIRES "fast"

Config profiles:

{
  "profiles": {
    "reasoning-heavy": {
      "provider": "anthropic",
      "model": "claude-opus-4-20250514"
    },
    "fast": {
      "model": "claude-haiku-20240307"
    }
  }
}

Benefits:

  • Workflow declares intent (what capability is needed)
  • Config controls implementation (which model provides it)
  • Same Agentfile works in different environments
  • Ops can control costs without editing workflows

Profile inheritance: Profiles inherit from the default llm config. Only specify what differs.
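
For example, the fast profile below inherits max_tokens from [llm] and overrides only the model:

[llm]
model = "claude-sonnet-4-20250514"
max_tokens = 4096

[profiles.fast]
model = "gpt-4o-mini"  # max_tokens = 4096 inherited from [llm]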

Built-in LLM Roles
Config Section Purpose When Used
[llm] Primary model Main workflow execution, goals, sub-agents without REQUIRES
[small_llm] Fast/cheap model web_fetch summarization, security triage fallback
[profiles.<name>] Capability-specific Sub-agents with REQUIRES "<name>"
[security] triage_llm Security triage Tier 2 verification (points to a profile name)

Example config:

[llm]
model = "claude-sonnet-4-20250514"
max_tokens = 4096

[small_llm]
model = "claude-haiku-20240307"
max_tokens = 1024

[profiles.reasoning-heavy]
model = "claude-opus-4-20250514"

[profiles.fast]
model = "gpt-4o-mini"

[profiles.code-generation]
model = "claude-sonnet-4-20250514"

[security]
triage_llm = "fast"  # Use the "fast" profile for security triage

Adaptive Thinking

The agent supports thinking/reasoning modes that enable models to show their work on complex problems. Thinking is controlled per-request using a heuristic classifier — zero extra LLM calls, just pattern matching.

Configuration
[llm]
model = "claude-sonnet-4-20250514"
thinking = "auto"  # auto|off|low|medium|high
Value Behavior
auto (default) Heuristic classifier decides per-request
off Never use thinking
low Light reasoning (simple questions, few tools)
medium Moderate reasoning (code, planning, analysis)
high Deep reasoning (proofs, debugging, architecture)
How Auto Works

The classifier analyzes each request before sending:

High complexity (→ high):

  • Math expressions, proofs, equations
  • "debug", "why is", "root cause"
  • "design system", "architecture", "trade-off"
  • "security analysis", "threat model"
  • Very long context (>3000 chars)
  • Many tools (>10)

Medium complexity (→ medium):

  • "implement", "refactor", "optimize"
  • "step by step", "explain how"
  • Code-related keywords
  • Moderate context (>1000 chars)
  • Several tools (>5)

Low complexity (→ low):

  • "how to", "what is the best"
  • "summarize", "list"
  • A few tools (>2)

Simple queries (→ off):

  • Greetings, short questions
  • No complexity indicators
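
For intuition, here are a few sample prompts and the level these heuristics would likely select (illustrative, not exhaustive):

"hi, quick question"                        → off
"summarize this article"                    → low
"refactor this function step by step"       → medium
"debug this deadlock, find the root cause"  → high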
Provider Support
Provider Thinking Support Notes
Anthropic ✅ Extended thinking Budget tokens auto-scaled by level
OpenAI ✅ Reasoning effort For o1/o3 models
Ollama Cloud ✅ Think API GPT-OSS uses levels, others use bool
Google ❌ Not yet supported
Groq/Mistral ❌ Not yet supported

Semantic Memory

The agent supports persistent memory across sessions with two components:

  • Photographic memory (KV) — exact key-value recall, always available
  • Semantic memory — insights and decisions with vector embeddings for meaning-based search
Configuration
# Embedding providers for semantic memory.
#
# Supported:
#   - openai:  text-embedding-3-small, text-embedding-3-large
#   - google:  text-embedding-004
#   - mistral: mistral-embed
#   - cohere:  embed-english-v3.0, embed-multilingual-v3.0
#   - voyage:  voyage-2, voyage-large-2, voyage-code-2
#   - ollama:  nomic-embed-text, mxbai-embed-large (local)
#   - none:    Disables semantic memory (KV still works)
#
# NOT supported (no embedding endpoints):
#   - anthropic (Claude) - use voyage instead
#   - openrouter - chat completions only
#   - groq - chat completions only
#
[embedding]
provider = "openai"
model = "text-embedding-3-small"

[storage]
path = "~/.local/grid"           # Base directory for all persistent data
persist_memory = true            # true = memory survives across runs
                                 # false = session-scoped (scratchpad)
Directory Structure
{storage.path}/
├── sessions/       # Session state (always persisted)
├── kv.json         # Photographic memory (if persist_memory=true)
├── semantic.db     # Semantic memory (if persist_memory=true)
└── logs/           # Audit logs
Memory Tools

Photographic (KV) — always available:

Tool Purpose
memory_read Get value by exact key
memory_write Store key-value pair
memory_list List keys by prefix
memory_search Substring search across keys/values

Semantic — requires embedding provider:

Tool Purpose
memory_remember Store content with embeddings for semantic search
memory_recall Find relevant memories by meaning
memory_forget Delete a memory by ID

Example workflow:

# Store an insight
memory_remember(
  content: "We decided to use PostgreSQL for better JSON support",
  importance: 0.8,
  tags: ["architecture", "database"]
)

# Later, recall it semantically
memory_recall(query: "database decision")
# Returns the PostgreSQL insight even without exact keyword match
Disabling Semantic Memory

For resource-constrained environments, disable semantic memory:

[embedding]
provider = "none"

The agent will still have KV (photographic) memory, but the semantic tools (memory_remember, memory_recall, memory_forget) will be unavailable.

See docs/memory/semantic-memory.md for details.

CLI Commands

Command Description
agent run <file> Execute a workflow
agent validate <file> Check syntax without running
agent inspect <file> Show workflow/package structure
agent pack <dir> Create a signed package
agent verify <pkg> Verify package signature
agent install <pkg> Install a package
agent keygen Generate signing key pair
agent setup Interactive setup wizard
agent help Show help
agent version Show version

Makefile Targets

make build          # Build to ./bin/agent
make install        # Install to ~/.local/bin/agent
make install-system # Install to /usr/local/bin (requires sudo)
make test           # Run all tests
make test-cover     # Run tests with coverage report
make docker-build   # Build Docker image
make clean          # Remove build artifacts
make help           # Show all available targets

Flags

Flag Description
--config <path> Config file (default: agent.json)
--input key=value Provide input (repeatable)
--policy <path> Security policy file
--workspace <path> Override workspace directory
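
For example, combining several flags in one invocation (file names and input keys are illustrative):

./agent run -f Agentfile \
  --config agent.toml \
  --input topic="Rust programming" \
  --input style="concise" \
  --policy policy.toml \
  --workspace ./sandbox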

Packaging

Create distributable, signed agent packages.

Generate Signing Keys
./agent keygen -o my-key
# Creates: my-key.pem (private, keep secret!) and my-key.pub (share for verification)
Create a Package
./agent pack ./my-agent \
  --sign my-key.pem \
  --author "Your Name" \
  --email "you@example.com" \
  --license MIT \
  -o my-agent-1.0.0.agent
Package Structure
my-agent/
├── Agentfile           # Required: workflow definition
├── manifest.json       # Optional: additional metadata (outputs, deps)
├── agents/             # Agent persona files
├── goals/              # Goal prompt files
└── policy.toml         # Default security policy
Manifest (Optional)

Create manifest.json for additional metadata not in Agentfile:

{
  "name": "my-agent",
  "version": "1.0.0",
  "description": "What this agent does",
  "outputs": {
    "report": {"type": "string", "description": "Generated report"}
  },
  "dependencies": {
    "helper-agent": "^1.0.0"
  }
}

Note: name, version, inputs, and requires are auto-extracted from Agentfile.

Verify a Package
./agent verify my-agent-1.0.0.agent --key author.pub
Inspect a Package
./agent inspect my-agent-1.0.0.agent
Install a Package
# Install with dependencies
./agent install my-agent-1.0.0.agent

# Skip dependencies (for A2A setups)
./agent install my-agent-1.0.0.agent --no-deps

# Preview what would be installed
./agent install my-agent-1.0.0.agent --dry-run

Packages install to ~/.agent/packages/<name>/<version>/.

Security Policy

Create policy.toml to restrict tool access:

default_deny = true
workspace = "/path/to/workspace"

[tools.read]
enabled = true
allow = ["$WORKSPACE/**"]
deny = ["**/.env", "**/*.key"]

[tools.write]
enabled = true
allow = ["$WORKSPACE/**"]

[tools.bash]
enabled = true
allowlist = ["ls *", "cat *", "grep *"]
denylist = ["rm *", "sudo *"]

[tools.web_fetch]
enabled = true
allow_domains = ["api.github.com", "*.example.com"]
rate_limit = 10

Security Framework

The agent includes a comprehensive security framework to defend against prompt injection and other attacks when processing untrusted content.

Key Features
  • Trust-Tagged Content Blocks: All content is tagged with trust levels (trusted, vetted, untrusted) and types (instruction, data)
  • Tiered Verification Pipeline: Three-tier verification for tool calls involving untrusted content
  • Encoded Content Detection: Detects Base64, hex, and URL encoding that could hide malicious payloads
  • Pattern Detection: Identifies suspicious patterns like command injection, path traversal, and privilege escalation attempts
  • Cryptographic Audit Trail: Ed25519 signatures for all security decisions
Verification Tiers
Tier Method Cost When Used
Tier 1 Deterministic checks Free Always (patterns, entropy, trust)
Tier 2 Cheap model triage Low If T1 flags concerns (default mode)
Tier 3 Full supervisor High If T2 flags concerns, or paranoid mode
Security Modes
  • default: Efficient tiered verification (T1 → T2 → T3)
  • paranoid: Skip T2, all flagged content goes directly to T3 supervision
  • research: Security research mode with scope-aware supervision (see below)
Configuration

Add to agent.toml:

[security]
mode = "default"          # or "paranoid"
user_trust = "untrusted"  # trust level for user messages
Security Research Mode

For legitimate security research (pentesting, vulnerability assessment, malware analysis), use research mode in your Agentfile:

SECURITY research "authorized pentest of internal lab network 192.168.100.0/24"

Research mode:

  • Uses scope-aware supervision (permissive within scope, strict at boundaries)
  • Injects defensive framing into prompts
  • Maintains full audit trail
  • Still blocks actions outside declared scope

See examples/security-research/ for security research workflow examples.

High-Risk Tools

These tools receive extra scrutiny when untrusted content is in context:

  • bash - Shell command execution
  • write - File system writes
  • web_fetch - External HTTP requests
  • spawn_agent - Sub-agent creation

See examples/security/ for example configurations and docs/security/ for detailed documentation.

Agentfile Syntax

# Comments start with #

NAME workflow-name

# Inputs with optional defaults
INPUT required_param
INPUT optional_param DEFAULT "value"

# Global setting - execution is supervised
SUPERVISED

# Security mode (optional - defaults to "default")
SECURITY default    # or: SECURITY paranoid
SECURITY research "scope description"  # for security research

# Agents reference prompts, skills, or packages (with optional structured output)
AGENT researcher FROM agents/researcher.md -> findings, sources
AGENT critic FROM skills/code-review -> issues, recommendations
AGENT helper "Inline prompt for helper" -> result
AGENT scanner FROM security.agent REQUIRES "reasoning-heavy"

# Goals define what to accomplish (with optional structured output)
# Supervision modifiers: SUPERVISED (LLM) or SUPERVISED HUMAN
GOAL name "Inline description with $variables" -> output1, output2
GOAL name FROM path/to/goal.md -> outputs
GOAL name "Description" -> summary, recommendations USING agent1, agent2
GOAL sensitive_task "Handle with care" SUPERVISED           # LLM supervision
GOAL critical_task "Requires human approval" SUPERVISED HUMAN

# Mark goal as UNSUPERVISED when SUPERVISED is applied globally.
GOAL fast_task "No supervision needed" UNSUPERVISED

# Steps execute goals
RUN step_name USING goal1, goal2, goal3
LOOP step_name USING goal1 WITHIN 5           # Loop max 5 times
LOOP step_name USING goal1 WITHIN $max_iter   # Variable limit

Structured Output

Use -> to declare output fields. The LLM returns JSON with those fields, which become variables for subsequent goals.

Simple structured output
GOAL research "Research $topic" -> findings, sources, confidence
GOAL report "Write report using $findings" -> summary, action_items
  • research goal returns: {"findings": "...", "sources": [...], "confidence": 0.8}
  • Fields become variables: $findings, $sources, $confidence
  • report goal can use $findings in its prompt
Multi-agent with synthesis
AGENT researcher "Research $topic" -> findings, sources
AGENT critic "Find biases in $topic" -> issues, concerns

GOAL analyze "Analyze $topic" -> summary, recommendations USING researcher, critic
  1. researcher and critic run in parallel
  2. Each returns structured JSON
  3. An implicit synthesizer transforms their outputs into the goal's fields
  4. This is essentially: findings, sources, issues, concerns -> summary, recommendations

AGENT FROM Resolution

The FROM clause in AGENT declarations supports smart resolution:

FROM Value What Happens
agents/critic.md File path → loads as prompt text
skills/code-review Directory with SKILL.md → loads as skill
testing Name → searches skills.paths for matching skill directory

Resolution order:

  1. Check if path exists relative to Agentfile
  2. If file → load as prompt (must be .md)
  3. If directory → must have SKILL.md, loads as skill
  4. If not found → search configured skills.paths
  5. If still not found → error

Configure skill paths in agent.json:

{
  "skills": {
    "paths": ["./skills", "~/.agent/skills", "/opt/agent-skills"]
  }
}

Sub-Agents

Both static (AGENT/USING) and dynamic (spawn_agent) sub-agents use the same execution path with identical capabilities:

  • Parent's tools — Full access to the parent's tool registry
  • MCP tools — All configured MCP servers
  • Security verification — 3-tier verification on all tool calls
  • Supervision phases — COMMIT/EXECUTE/RECONCILE/SUPERVISE when enabled
  • Agentic loop — Full tool-calling loop, not just a single LLM call
  • No nesting — spawn_agent/spawn_agents excluded (depth=1)
Static Sub-Agents (AGENT/USING)

Declare agents in Agentfile with personas and model requirements:

AGENT optimist FROM agents/optimist.md REQUIRES "reasoning-heavy"
AGENT critic FROM agents/devils-advocate.md REQUIRES "fast"

GOAL evaluate "Analyze this decision" USING optimist, critic

The FROM path loads a persona prompt. The REQUIRES profile selects which LLM to use.

When the goal runs:

  1. Spawns both agents in parallel (each with full tool access)
  2. Each runs a complete agentic loop with tool calls
  3. Waits for both to complete
  4. Synthesizes their outputs
Dynamic Sub-Agents (spawn_agent)

The LLM can spawn sub-agents at runtime:

spawn_agent(role: "researcher", task: "Find facts about {topic}")
spawn_agents(agents: [
  {role: "optimist", task: "Make the case for..."},
  {role: "pessimist", task: "Make the case against..."}
])

The optional outputs parameter enables structured output parsing.
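
For example (the exact shape of outputs is an assumption, mirroring the -> fields on goals):

spawn_agent(
  role: "researcher",
  task: "Find facts about {topic}",
  outputs: ["findings", "sources"]  # ask for structured JSON with these fields
)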

Unified Execution

Both paths flow through the same code:

spawnAgentWithPrompt()
  └── subAgentExecutePhaseWithProvider()
        └── executeToolsParallel()
              └── executeTool()
                    └── verifyToolCall()  ← security
When to Use

Static (AGENT/USING):

  • Known agents at design time
  • Persona defined in files
  • Different model per agent
  • Explicit control

Dynamic (spawn_agent):

  • LLM decides what's needed
  • Ad-hoc specialists
  • Same model as parent
  • Flexible problem decomposition

Both approaches can be used in the same workflow.

Docker

# Build
docker build -t headless-agent src/

# Run
docker run -it --rm \
  -v $(pwd):/workspace \
  -e ANTHROPIC_API_KEY \
  headless-agent run /workspace/Agentfile

Supported LLM Providers

The agent uses official SDKs where available:

Provider SDK Notes
Anthropic anthropic-sdk-go Official SDK
OpenAI openai-go Official SDK
Google generative-ai-go Official Gemini SDK
Mistral Native HTTP OpenAI-compatible API
Groq Native HTTP OpenAI-compatible API
xAI Native HTTP OpenAI-compatible API
OpenRouter Native HTTP OpenAI-compatible API
Ollama Cloud Native HTTP Native Ollama API
Ollama Local Native HTTP OpenAI-compatible API
LMStudio Native HTTP OpenAI-compatible API
Provider provider value api_key_env Notes
Anthropic anthropic ANTHROPIC_API_KEY Claude models
OpenAI openai OPENAI_API_KEY GPT, o1, o3 models
Google google GOOGLE_API_KEY Gemini models
Groq groq GROQ_API_KEY Fast inference
Mistral mistral MISTRAL_API_KEY Mistral models
xAI xai XAI_API_KEY Grok models
OpenRouter openrouter OPENROUTER_API_KEY Multi-provider routing
Ollama Local ollama-local Local, no API key
LMStudio lmstudio Local, no API key
Ollama Cloud ollama-cloud OLLAMA_API_KEY Hosted models
Generic openai-compat Any OpenAI-compatible
OpenAI-Compatible Endpoints

Any service implementing the OpenAI chat completions API can be used. Built-in providers have default base URLs:

Provider Default Base URL
groq api.groq.com/openai/v1
mistral api.mistral.ai/v1
xai api.x.ai/v1
openrouter openrouter.ai/api/v1
ollama-local localhost:11434/v1
lmstudio localhost:1234/v1

For services without built-in support, use openai-compat with a custom base_url:

# Together AI
[llm]
provider = "openai-compat"
model = "meta-llama/Llama-3-70b-chat-hf"
max_tokens = 4096
base_url = "https://api.together.xyz/v1"

# Fireworks AI
[llm]
provider = "openai-compat"
model = "accounts/fireworks/models/llama-v3-70b-instruct"
max_tokens = 4096
base_url = "https://api.fireworks.ai/inference/v1"

# Anyscale
[llm]
provider = "openai-compat"
model = "meta-llama/Llama-3-70b-chat-hf"
max_tokens = 4096
base_url = "https://api.endpoints.anyscale.com/v1"

# Perplexity
[llm]
provider = "openai-compat"
model = "llama-3.1-sonar-large-128k-online"
max_tokens = 4096
base_url = "https://api.perplexity.ai"

# DeepSeek
[llm]
provider = "openai-compat"
model = "deepseek-chat"
max_tokens = 4096
base_url = "https://api.deepseek.com/v1"

# vLLM self-hosted
[llm]
provider = "openai-compat"
model = "meta-llama/Llama-3-70b"
max_tokens = 4096
base_url = "http://localhost:8000/v1"

# LiteLLM proxy
[llm]
provider = "openai-compat"
model = "gpt-4"
max_tokens = 4096
base_url = "http://localhost:4000"
Provider Examples
# xAI (Grok)
[llm]
provider = "xai"
model = "grok-2"
max_tokens = 4096

# OpenRouter - route to any model
[llm]
provider = "openrouter"
model = "anthropic/claude-3.5-sonnet"
max_tokens = 4096

# Local Ollama
[llm]
provider = "ollama-local"
model = "llama3:70b"
max_tokens = 4096

# LMStudio local server
[llm]
provider = "lmstudio"
model = "loaded-model"
max_tokens = 4096
Ollama Cloud

Ollama Cloud provides hosted models accessible via API. Use the native ollama-cloud provider:

# agent.toml
[llm]
provider = "ollama-cloud"
model = "gpt-oss:120b"  # or any model from ollama.com/search?c=cloud
# credentials.toml
[ollama-cloud]
api_key = "your-ollama-api-key"  # from ollama.com/settings/keys

Available models: ollama.com/search?c=cloud

Note: This is different from local Ollama. For local Ollama, use provider = "ollama-local" (the base_url defaults to http://localhost:11434/v1).

Proxying Native Providers

You can also use base_url with native providers to proxy requests:

[llm]
provider = "openai"
model = "gpt-4"
base_url = "https://my-company-proxy.com/v1"
Model Discovery

The provider field is optional for standard models — the agent automatically determines the provider using pattern inference from model name prefixes:

  • claude-* → anthropic
  • gpt-*, o1-*, o3-* → openai
  • gemini-*, gemma-* → google
  • mistral-*, mixtral-*, codestral-* → mistral
  • grok-* → xai

Set provider explicitly for custom or ambiguous model names.
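
For example, a hosted Llama model has no recognizable prefix, so name the provider explicitly (the model ID is illustrative):

[llm]
provider = "groq"
model = "llama-3.1-70b-versatile"
max_tokens = 4096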

Required Configuration
Field Description
model Model identifier (e.g., "claude-sonnet-4-20250514")
max_tokens Maximum tokens for response generation

Built-in Tools

File Operations
  • read — Read file contents
  • write — Write/create files
  • edit — Edit files (find/replace)
  • glob — Find files by pattern
  • grep — Search file contents
  • ls — List directory contents
Shell
  • bash — Execute shell commands (policy-controlled)
Web
  • web_search — Search the web
  • web_fetch — Fetch and summarize URL content
Memory
  • memory_read — Read from agent memory
  • memory_write — Write to agent memory
  • Additional memory tools (memory_list, memory_search, memory_remember, memory_recall, memory_forget) are described under Semantic Memory above
Dynamic Agents
  • spawn_agent — Spawn sub-agents at runtime (see Dynamic Sub-Agent Spawning)
Web Search Providers

web_search supports multiple providers with automatic fallback:

Priority Provider API Key Required Notes
1 Brave Search BRAVE_API_KEY Best quality
2 Tavily TAVILY_API_KEY Good for research
3 DuckDuckGo None Zero-config fallback

Configure in ~/.config/grid/credentials.toml:

[brave]
api_key = "your-key"

[tavily]
api_key = "your-key"

If no API keys are configured, web_search automatically uses DuckDuckGo's HTML endpoint.

Security

Credentials

File permissions: credentials.toml must have mode 0400 (owner read-only). The agent will refuse to load credentials with insecure permissions.

chmod 400 ~/.config/grid/credentials.toml

Priority order:

  1. credentials.toml (if found with secure permissions)
  2. Environment variables (fallback)
Protected Config Files

The agent cannot modify its own configuration files:

  • agent.toml — agent configuration
  • policy.toml — security policy
  • credentials.toml — API keys

This prevents privilege escalation attacks where an agent tries to grant itself additional permissions.

MCP Tool Security

MCP servers run with the agent's permissions. For production, restrict which tools can be called:

# policy.toml
[mcp]
default_deny = true  # Block all MCP tools by default
allowed_tools = [
  "filesystem:read_file",
  "filesystem:list_directory",
  "memory:*",  # Allow all tools from memory server
]

If [mcp] is not configured, the agent logs a security warning and allows all MCP tools (development mode).

Web Tool Security

Restrict which domains can be accessed:

# policy.toml
[web_fetch]
enabled = true
allow_domains = ["github.com", "*.github.io", "docs.python.org"]

Without allow_domains, all domains are allowed.

Shell Command Security

Restrict shell commands with allowlists/denylists:

[bash]
enabled = true
allowlist = ["ls *", "cat *", "go build *", "go test *"]
denylist = ["rm -rf *", "sudo *", "curl * | bash"]

Examples

See the examples/ directory for 38 example workflows covering:

  • Programming (code review, testing, refactoring, security audits)
  • Writing (essays, stories, translations)
  • Planning (travel, fitness, learning paths)
  • Analysis (decisions, debates, dependencies)
  • Multi-agent collaboration and parallel specialists
  • Skill-based agents and mixed sources

Protocol Support

MCP (Model Context Protocol)

Connect to external MCP tool servers for additional capabilities:

{
  "mcp": {
    "servers": {
      "filesystem": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
      },
      "memory": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-memory"]
      },
      "github": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-github"],
        "env": {"GITHUB_TOKEN": "${GITHUB_TOKEN}"}
      }
    }
  }
}

Tools from MCP servers are automatically discovered and made available to workflows.

Agent Skills (agentskills.io)

Load skills from directories containing SKILL.md:

{
  "skills": {
    "paths": ["./skills", "~/.agent/skills"]
  }
}

Skill structure:

my-skill/
├── SKILL.md           # Required: frontmatter + instructions
├── scripts/           # Optional: executable scripts
├── references/        # Optional: additional docs
└── assets/            # Optional: templates, data

SKILL.md format:

---
name: my-skill
description: What this skill does and when to use it.
license: MIT
---

# Instructions

Step-by-step instructions for the agent...
ACP (Agent Client Protocol)

The agent can run as an ACP server for editor integration:

./agent acp --config agent.json

This enables communication with code editors (VS Code, JetBrains, Zed) that support ACP.

Supervision

Enable drift detection and course correction for critical workflows. The supervision system monitors agent execution and intervenes when the agent's understanding diverges from the original goal.

Enabling Supervision

Global supervision (all steps):

SUPERVISED
NAME my-workflow
GOAL analyze "Analyze data"
GOAL report "Generate report"
RUN main USING analyze, report

Per-step supervision:

NAME my-workflow
GOAL analyze "Analyze data"
GOAL deploy "Deploy to production" SUPERVISED HUMAN
RUN main USING analyze, deploy

Opt-out for specific steps:

SUPERVISED
NAME my-workflow
GOAL analyze "Analyze data"
GOAL trivial "Quick cleanup" UNSUPERVISED
RUN main USING analyze, trivial
Supervision Modes
Mode Behavior
SUPERVISED Autonomous — supervisor corrects drift automatically
SUPERVISED HUMAN Requires human approval — fails if no human available
UNSUPERVISED Skips supervision for trusted/fast operations
Four-Phase Execution

When supervision is enabled, each step executes in four phases:

  1. COMMIT — Agent declares intent before execution

    • Interpretation of the goal
    • Planned approach
    • Expected output
    • Assumptions made
  2. EXECUTE — Agent performs the work

    • Tools called
    • Actual output
    • Self-assessment of whether commitment was met
  3. RECONCILE — Static pattern matching (fast, no LLM)

    • Concerns raised?
    • Commitment met?
    • Deviations from plan?
    • Low confidence?
  4. SUPERVISE — LLM evaluation (only if reconcile triggers)

    • Evaluates drift against original goal
    • Decides: CONTINUE, REORIENT, or PAUSE
Pre-flight Check

Before execution begins, the system checks if:

  • Any step requires SUPERVISED HUMAN
  • A human connection (ACP/A2A) is available

If human required but unavailable → hard fail before execution starts.

This prevents wasted compute on workflows that would fail mid-run.

Example: Deployment Pipeline
SUPERVISED
NAME deployment-pipeline
INPUT service_name

# Build and test — supervised for drift detection
GOAL build "Build $service_name"
GOAL test "Run test suite"

# Security scan — important, keep supervised
GOAL security "Scan for vulnerabilities"

# Config validation — fast and trusted
GOAL config "Validate config" UNSUPERVISED

# Deployment — critical, requires human approval
GOAL deploy "Deploy $service_name" SUPERVISED HUMAN

# Post-deploy verification
GOAL verify "Run smoke tests"

RUN pipeline USING build, test, security, config, deploy, verify

This workflow:

  • Supervises most steps (inherited from global)
  • Skips supervision for trivial config validation
  • Requires human approval before deployment
  • Fails at startup if no human connection available

See examples/39-supervision/ for more examples.

XML Context Format

The agent uses XML-structured prompts to communicate context between goals. This provides clear boundaries that don't collide with LLM-generated markdown:

<workflow name="recipe-creator">

<context>
  <goal id="brainstorm">
Here are 3 distinct dishes:

## 1. Classic South Indian Coconut Chutney
A fresh, vibrant accompaniment...
  </goal>
</context>

<current-goal id="select">
Choose the best recipe based on: ingredient usage, flavor balance.
</current-goal>

</workflow>

Benefits:

  • Clear boundaries — XML tags separate system context from LLM output
  • No header collisions — LLM's ## headers don't conflict with context headers
  • Machine-parseable — Tools can extract specific goals from logs
  • Session logs — The XML structure appears in session logs for debugging

See docs/execution/06-xml-context-format.md for full documentation.

Directories

Path Synopsis
cmd
agent command
Package main is the entry point for the headless agent CLI.
replay command
Package main is the entry point for the agent-replay CLI.
internal
acp
Package acp provides Agent Client Protocol support.
agentfile
Package agentfile provides lexer, parser, and AST for the Agentfile DSL.
checkpoint
Package checkpoint provides checkpoint creation and management for supervised execution.
config
Package config provides configuration loading and management.
credentials
Package credentials loads API keys from standard locations.
executor
Package executor provides workflow and goal execution.
llm
Package llm provides LLM provider interfaces and implementations.
logging
Package logging provides real-time log output derived from session events.
mcp
Package mcp provides MCP (Model Context Protocol) client support.
memory
Package memory provides semantic memory storage with vector embeddings.
packaging
Package packaging handles agent package creation, verification, and installation.
policy
Package policy provides security policy loading and enforcement.
replay
Package replay provides session replay and visualization.
security
Package security provides prompt injection defense through trust-tagged content blocks, tiered verification, and cryptographic audit trails.
session
Package session provides session management and persistence.
setup
Package setup provides the interactive setup wizard for the agent.
skills
Package skills provides Agent Skills (agentskills.io) support.
supervision
Package supervision provides drift detection and course correction for agent execution.
telemetry
Package telemetry provides telemetry export functionality.
tools
Package tools provides the tool registry and built-in tools.
transport
Package transport provides JSON-RPC 2.0 stdio transport.
