golem

Version: v0.6.4
Published: Mar 3, 2026 License: MIT

Golem (גּוֹלֶם)

Your AI agent. Your terminal. Your rules.

Golem is a terminal-first personal AI assistant built with Go and Eino. It can chat, call tools, run shell commands, manage files, search and fetch web content, keep long-term memory, schedule cron jobs, and run as a background service across multiple channels. It also supports provider auth login and audio transcription for channel messages.

Golem (גולם): In Jewish folklore, a golem is an animated being made from inanimate matter, created to serve.

Why Golem

  • One binary, zero runtime dependency bloat (no Python/Node/Docker required).
  • Provider-agnostic model access through a unified OpenAI-compatible layer.
  • Real agent loop with tool calling, not just plain text chat.
  • Works both interactively (golem chat) and as a long-running service (golem run).
  • Built-in channels, gateway API, cron scheduler, heartbeat service, and skill system.
  • Built-in auth commands, voice transcription pipeline, and restart-safe heartbeat routing.

Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│                           Golem Architecture                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────────┐    │
│  │   Channels   │     │    Agent     │     │     Providers    │    │
│  │  (Telegram,  │────▶│    Loop      │────▶│  (Claude, OpenAI,│    │
│  │  Discord,    │     │              │     │   DeepSeek...)   │    │
│  │  Slack...)   │     └──────┬───────┘     └──────────────────┘    │
│  └──────────────┘            │                                       │
│         │                    │                                       │
│         ▼                    ▼                                       │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                         Message Bus                          │    │
│  │           (Inbound/Outbound async message queue)             │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                              │                                       │
│         ┌────────────────────┼────────────────────┐                 │
│         ▼                    ▼                    ▼                 │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────────┐       │
│  │   Session   │     │   Skills    │     │     Tools       │       │
│  │  (History)  │     │ (Prompts)   │     │(exec, file, web)│       │
│  └─────────────┘     └─────────────┘     └─────────────────┘       │
│         │                    │                    │                 │
│         └────────────────────┼────────────────────┘                 │
│                              ▼                                       │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                     Supporting Services                      │    │
│  │    (Memory | Cron | Heartbeat | Gateway | Skills)           │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Core Components

Component       Path                 Description
Agent Loop      internal/agent/      Main processing loop with tool calling, max 20 iterations
Message Bus     internal/bus/        Event-driven message routing via Go channels
Channel System  internal/channel/    Multi-platform integrations (Telegram, Discord, Slack, etc.)
Provider        internal/provider/   Unified LLM interface via Eino's OpenAI wrapper
Session         internal/session/    Persistent JSONL-based conversation history
Tools           internal/tools/      Built-in tools: file, shell, memory, web, cron, message, subagent, workflow
Memory          internal/memory/     Long-term memory and daily diary system
Skills          internal/skills/     Extensible Markdown-based prompt packs
Cron            internal/cron/       Scheduled job management
Heartbeat       internal/heartbeat/  Periodic health probe and status reporting
Gateway         internal/gateway/    HTTP API server (/health, /version, /chat)

Core Features

Interaction Modes
  • Terminal TUI chat (golem chat)
  • Multi-channel bot mode (golem run): Telegram, WhatsApp, Feishu, Discord, Slack, QQ, DingTalk, MaixCam
  • Gateway HTTP API (/health, /version, /chat)

Latest Additions
  • Auth workflow commands: golem auth login, golem auth logout, golem auth status
  • Heartbeat target persistence across restarts (last active channel/chat is restored automatically)
  • Audio transcription in Telegram/Discord/Slack with fallback placeholders when transcription fails
  • File mutation tools edit_file and append_file for safer incremental edits
  • Outbound channel reliability policy (channels.outbound): retry, rate-limit, dedup window, and bounded send concurrency

Built-in Tools

Tool                                              Description
exec                                              Run shell commands (workspace restriction supported)
read_file / write_file / edit_file / append_file  File read/write/edit/append in workspace
list_dir                                          List directory contents
read_memory / write_memory                        Persistent memory access
append_diary                                      Append daily notes
web_search                                        Web search (Brave when API key exists; fallback available)
web_fetch                                         Fetch and extract web page content
manage_cron                                       Manage scheduled jobs
message                                           Send messages to channels
spawn / subagent / workflow                       Delegate tasks to subagents and orchestrated workflows

LLM Providers

OpenRouter, Claude, OpenAI, DeepSeek, Gemini, Ark, Qianfan, Qwen, Ollama.

Subagent System

Golem supports delegating tasks to subagents for parallel processing:

  • spawn: Asynchronous subagent, returns task ID immediately, notifies via message bus
  • subagent: Synchronous subagent, blocks until completion, returns result directly
  • workflow: Built-in workflow orchestration tool (decompose task, run sequential/parallel subtasks, aggregate per-step results)

All modes use isolated sessions and propagate origin channel/chat for result delivery.
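The difference between the synchronous and asynchronous modes can be sketched in a few lines of Go. This is only an illustration under stated assumptions: Task, Result, Spawner, and runSync are hypothetical names, and a buffered Go channel stands in for the message bus. It is not Golem's internal API.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical unit of delegated work.
type Task func() string

// Result is what an asynchronous task reports back when it finishes.
type Result struct {
	ID     string
	Output string
}

// runSync mirrors the "subagent" mode: the caller blocks until the task returns.
func runSync(t Task) string { return t() }

// Spawner mirrors the "spawn" mode: Spawn returns a task ID immediately and
// the result arrives later on Bus (a stand-in for the message bus).
type Spawner struct {
	mu   sync.Mutex
	next int
	Bus  chan Result
}

func (s *Spawner) Spawn(t Task) string {
	s.mu.Lock()
	s.next++
	id := fmt.Sprintf("task-%d", s.next)
	s.mu.Unlock()

	go func() { s.Bus <- Result{ID: id, Output: t()} }()
	return id
}

func main() {
	fmt.Println(runSync(func() string { return "sync done" }))

	s := &Spawner{Bus: make(chan Result, 1)}
	id := s.Spawn(func() string { return "async done" })
	fmt.Println("spawned", id)

	r := <-s.Bus // the notification that would arrive via the bus
	fmt.Println(r.ID, r.Output)
}
```

The synchronous call is simpler when the parent needs the answer before continuing; the asynchronous form suits long-running work whose result can be delivered to the originating chat later.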

Memory System

Two-tier memory architecture:

  1. Long-term Memory: Single MEMORY.md file for persistent knowledge
  2. Daily Diary: YYYY-MM-DD.md files for timestamped journal entries
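As a rough sketch of the diary tier, the snippet below derives the YYYY-MM-DD.md file name from a date and appends one entry to it. The entry format ("- HH:MM note"), the helper names, and the use of a temporary directory are assumptions made for illustration; the README does not specify where or how Golem writes these files.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// exampleDay is a fixed date used so the example is reproducible.
var exampleDay = time.Date(2026, time.March, 3, 9, 30, 0, 0, time.UTC)

// diaryFileName returns the YYYY-MM-DD.md name for the given day.
func diaryFileName(day time.Time) string {
	return day.Format("2006-01-02") + ".md"
}

// appendDiary appends one timestamped line to that day's diary file,
// creating the file if it does not exist yet. The line format is hypothetical.
func appendDiary(dir string, now time.Time, note string) error {
	path := filepath.Join(dir, diaryFileName(now))
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = fmt.Fprintf(f, "- %s %s\n", now.Format("15:04"), note)
	return err
}

func main() {
	dir, err := os.MkdirTemp("", "golem-memory-example")
	if err != nil {
		fmt.Println("temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	if err := appendDiary(dir, exampleDay, "Reviewed deployment checklist"); err != nil {
		fmt.Println("append:", err)
		return
	}
	data, _ := os.ReadFile(filepath.Join(dir, diaryFileName(exampleDay)))
	fmt.Printf("%s:\n%s", diaryFileName(exampleDay), data)
}
```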

Heartbeat Service

When enabled, server mode can periodically run a health probe and send heartbeat output to the latest active channel/session. The latest target is persisted in workspace state, so routing survives process restarts.

Installation

Option A: Download Binary

Download Windows/Linux binaries from Releases.

Option B: Install from Source
go install github.com/MEKXH/golem/cmd/golem@latest

Quick Start

1. Initialize config
golem init

This creates ~/.golem/config.json and workspace directories.

2. Bootstrap with the example config

Use the provided template as your starting point:

cp config/config.example.json ~/.golem/config.json

PowerShell:

Copy-Item config/config.example.json "$HOME/.golem/config.json"

Then edit ~/.golem/config.json and set at least one provider key (for example providers.openai.api_key).

Create an environment file from template (recommended for local/staging/production separation):

cp .env.example .env.local

PowerShell:

Copy-Item .env.example .env.local

Fill required secrets in .env.local (at least one provider key, and GOLEM_GATEWAY_TOKEN for exposed deployments).

Optional (token/OAuth auth store):

golem auth login --provider openai --token "$OPENAI_API_KEY"

3. Run smoke checks
make smoke

Without make:

go test ./...
go run ./cmd/golem status
go run ./cmd/golem chat "ping"

4. Start chatting
golem chat

One-shot:

golem chat "Analyze the current directory structure"

5. Start server mode
golem run

CLI Commands

Command                                                                           Description
golem init                                                                        Initialize config and workspace
golem chat [message]                                                              Start TUI chat or send one-shot message
golem run                                                                         Start server mode
golem status [--json]                                                             Show system status summary (human-readable or JSON)
golem auth login --provider <name> [--token <token> | --device-code | --browser]  Save provider credentials via token or OAuth
golem auth logout [--provider <name>]                                             Remove one provider credential or all credentials
golem auth status                                                                 Show current auth credential status
golem channels list                                                               List configured channels
golem channels status                                                             Show detailed channel status
golem channels start <channel>                                                    Enable one channel in config
golem channels stop <channel>                                                     Disable one channel in config
golem cron list                                                                   List scheduled jobs
golem cron add -n <name> -m <msg> [--every <sec> | --cron <expr> | --at <ts>]     Add a job
golem cron run <job_id>                                                           Run a job immediately
golem cron remove <job_id>                                                        Remove a job
golem cron enable <job_id>                                                        Enable a job
golem cron disable <job_id>                                                       Disable a job
golem approval list                                                               List pending approval requests
golem approval approve <id> --by <name> [--note <text>]                           Approve a pending request
golem approval reject <id> --by <name> [--note <text>]                            Reject a pending request
golem skills list                                                                 List installed skills
golem skills install <owner/repo>                                                 Install skill from GitHub
golem skills remove <name>                                                        Remove installed skill
golem skills show <name>                                                          Show skill content
golem skills search [keyword]                                                     Search remote skill index

Authentication

Credentials are stored in ~/.golem/auth.json. Provider clients can use auth-store tokens as API credentials when config keys are empty.

Examples:

golem auth login --provider openai --device-code
golem auth status
golem auth logout --provider openai

Cron Scheduling

Schedule types:

  • --every <seconds>: fixed interval
  • --cron "<expr>": standard 5-field cron expression
  • --at "<RFC3339>": one-shot execution

Examples:

golem cron add -n "hourly-check" -m "Check system status and report" --every 3600
golem cron add -n "morning-brief" -m "Give me a morning briefing" --cron "0 9 * * *"
golem cron add -n "meeting-reminder" -m "Remind me about the team meeting" --at "2026-02-14T09:00:00Z"
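To make the three schedule types concrete, here is a minimal Go sketch that validates one of each. The Schedule type and parseSchedule function are hypothetical, the cron check only counts fields rather than parsing them, and the rule that exactly one flag must be set is an assumption, not documented CLI behavior.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// Schedule holds exactly one of the three schedule kinds.
type Schedule struct {
	Every time.Duration // from --every <seconds>
	Cron  string        // from --cron "<expr>"
	At    time.Time     // from --at "<RFC3339>"
}

// parseSchedule validates the flag values. It assumes exactly one must be set.
func parseSchedule(everySec int, cronExpr, at string) (Schedule, error) {
	set := 0
	if everySec > 0 {
		set++
	}
	if cronExpr != "" {
		set++
	}
	if at != "" {
		set++
	}
	if set != 1 {
		return Schedule{}, errors.New("exactly one of --every, --cron, --at is required")
	}

	switch {
	case everySec > 0:
		return Schedule{Every: time.Duration(everySec) * time.Second}, nil
	case cronExpr != "":
		// Field-count check only; a real scheduler would parse each field.
		if n := len(strings.Fields(cronExpr)); n != 5 {
			return Schedule{}, fmt.Errorf("cron expression needs 5 fields, got %d", n)
		}
		return Schedule{Cron: cronExpr}, nil
	default:
		t, err := time.Parse(time.RFC3339, at)
		if err != nil {
			return Schedule{}, fmt.Errorf("invalid --at timestamp: %w", err)
		}
		return Schedule{At: t}, nil
	}
}

func main() {
	hourly, _ := parseSchedule(3600, "", "")
	fmt.Println("every:", hourly.Every)

	morning, _ := parseSchedule(0, "0 9 * * *", "")
	fmt.Println("cron:", morning.Cron)

	once, _ := parseSchedule(0, "", "2026-02-14T09:00:00Z")
	fmt.Println("at:", once.At.Format(time.RFC3339))

	_, err := parseSchedule(0, "0 9 * *", "")
	fmt.Println("error:", err)
}
```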

Skills System

Skills are Markdown instruction packs loaded into the agent prompt.

Skill discovery precedence:

  1. workspace/skills
  2. ~/.golem/skills
  3. builtin skills directory (default: ~/.golem/builtin-skills, override via GOLEM_BUILTIN_SKILLS_DIR)
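The precedence order above amounts to a first-match search over three directories. The sketch below shows one way to express it; skillDirs and findSkill are hypothetical helpers, and the assumption that a skill lives at <dir>/<name> is for illustration only.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// skillDirs returns the search directories in precedence order.
// GOLEM_BUILTIN_SKILLS_DIR overrides the default builtin location.
func skillDirs(workspace, home string) []string {
	builtin := os.Getenv("GOLEM_BUILTIN_SKILLS_DIR")
	if builtin == "" {
		builtin = filepath.Join(home, ".golem", "builtin-skills")
	}
	return []string{
		filepath.Join(workspace, "skills"),
		filepath.Join(home, ".golem", "skills"),
		builtin,
	}
}

// findSkill returns the first existing <dir>/<name> path. The exists
// callback is injected so the lookup can be exercised without a filesystem.
func findSkill(name string, dirs []string, exists func(string) bool) (string, bool) {
	for _, dir := range dirs {
		candidate := filepath.Join(dir, name)
		if exists(candidate) {
			return candidate, true
		}
	}
	return "", false
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("home dir:", err)
		return
	}
	dirs := skillDirs(filepath.Join(home, ".golem", "workspace"), home)

	onDisk := func(p string) bool {
		_, err := os.Stat(p)
		return err == nil
	}
	if path, ok := findSkill("weather", dirs, onDisk); ok {
		fmt.Println("using skill at", path)
	} else {
		fmt.Println("skill not found in", dirs)
	}
}
```

Because the search stops at the first hit, a skill placed in the workspace shadows a same-named user or builtin skill.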

Install from GitHub:

golem skills install owner/repo

Search remote skills:

golem skills search
golem skills search weather

Configuration

Main file: ~/.golem/config.json

Template file in repo: config/config.example.json

{
  "agents": {
    "defaults": {
      "workspace_mode": "default",
      "workspace": "",
      "model": "anthropic/claude-sonnet-4-5",
      "max_tokens": 8192,
      "temperature": 0.7,
      "max_tool_iterations": 20
    },
    "subagent": {
      "timeout_seconds": 300,
      "retry": 1,
      "max_concurrency": 3
    }
  },
  "channels": {
    "telegram": {
      "enabled": false,
      "token": "",
      "allow_from": []
    },
    "outbound": {
      "max_concurrent_sends": 16,
      "retry_max_attempts": 3,
      "retry_base_backoff_ms": 200,
      "retry_max_backoff_ms": 2000,
      "rate_limit_per_second": 20,
      "dedup_window_seconds": 30
    }
  },
  "providers": {
    "claude": {
      "api_key": ""
    },
    "openai": {
      "api_key": ""
    },
    "ollama": {
      "base_url": "http://localhost:11434"
    }
  },
  "policy": {
    "mode": "strict",
    "off_ttl": "",
    "allow_persistent_off": false,
    "require_approval": ["exec"]
  },
  "mcp": {
    "servers": {
      "localfs": {
        "enabled": true,
        "transport": "stdio",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
      }
    }
  },
  "tools": {
    "exec": {
      "timeout": 60,
      "restrict_to_workspace": true
    },
    "web": {
      "search": {
        "api_key": "",
        "max_results": 5
      }
    },
    "voice": {
      "enabled": false,
      "provider": "openai",
      "model": "gpt-4o-mini-transcribe",
      "timeout_seconds": 30
    }
  },
  "gateway": {
    "host": "0.0.0.0",
    "port": 18790,
    "token": ""
  },
  "heartbeat": {
    "enabled": true,
    "interval": 30,
    "max_idle_minutes": 720
  },
  "log": {
    "level": "info",
    "file": ""
  }
}

workspace_mode values:

  • default: use ~/.golem/workspace
  • cwd: use current working directory
  • path: use agents.defaults.workspace
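A small Go sketch of how these three modes could map to a directory follows. resolveWorkspace is a hypothetical helper; treating an empty mode as default and rejecting an empty path are assumptions rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveWorkspace maps workspace_mode to a directory.
//   mode       - value of agents.defaults.workspace_mode
//   configured - value of agents.defaults.workspace
func resolveWorkspace(mode, configured, home, cwd string) (string, error) {
	switch mode {
	case "", "default": // assumption: an empty mode behaves like "default"
		return filepath.Join(home, ".golem", "workspace"), nil
	case "cwd":
		return cwd, nil
	case "path":
		if configured == "" {
			return "", fmt.Errorf("workspace_mode %q requires agents.defaults.workspace", mode)
		}
		return configured, nil
	default:
		return "", fmt.Errorf("unknown workspace_mode %q", mode)
	}
}

func main() {
	home, _ := os.UserHomeDir()
	cwd, _ := os.Getwd()

	for _, mode := range []string{"default", "cwd", "path", "bogus"} {
		dir, err := resolveWorkspace(mode, "/srv/golem", home, cwd)
		if err != nil {
			fmt.Println(mode, "->", "error:", err)
			continue
		}
		fmt.Println(mode, "->", dir)
	}
}
```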

agents.subagent runtime values:

  • timeout_seconds: delegated subtask timeout (default 300)
  • retry: retry count per subtask (default 1, total attempts = retry + 1)
  • max_concurrency: max concurrent subtask executions across spawn/subagent/workflow (default 3)
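The retry and concurrency semantics can be illustrated with a short Go sketch: retry = 1 yields at most two attempts, and a buffered channel acts as a semaphore capping concurrent subtasks. runWithRetry and runAll are hypothetical names, and the per-subtask timeout is left out for brevity (a context.WithTimeout around each task would be one way to add it).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errTransient is a placeholder failure used by the example.
var errTransient = errors.New("transient failure")

// runWithRetry runs task up to retry+1 times and reports how many
// attempts were made along with the last error (nil on success).
func runWithRetry(retry int, task func() error) (int, error) {
	if retry < 0 {
		retry = 0
	}
	var err error
	for attempt := 1; attempt <= retry+1; attempt++ {
		if err = task(); err == nil {
			return attempt, nil
		}
	}
	return retry + 1, err
}

// runAll executes tasks with at most maxConcurrency running at once.
func runAll(tasks []func() error, maxConcurrency, retry int) []error {
	if maxConcurrency < 1 {
		maxConcurrency = 1
	}
	sem := make(chan struct{}, maxConcurrency) // counting semaphore
	errs := make([]error, len(tasks))
	var wg sync.WaitGroup

	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() error) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			_, errs[i] = runWithRetry(retry, task)
		}(i, task)
	}
	wg.Wait()
	return errs
}

func main() {
	calls := 0
	flaky := func() error {
		calls++
		if calls == 1 {
			return errTransient
		}
		return nil
	}
	attempts, err := runWithRetry(1, flaky)
	fmt.Println("attempts:", attempts, "err:", err)

	tasks := []func() error{
		func() error { return nil },
		func() error { return errTransient },
		func() error { return nil },
		func() error { return nil },
	}
	fmt.Println(runAll(tasks, 3, 1))
}
```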

channels.outbound reliability values:

  • max_concurrent_sends: max concurrent outbound sends (default 16)
  • retry_max_attempts: max attempts per outbound message on retriable channels (default 3)
  • retry_base_backoff_ms / retry_max_backoff_ms: exponential backoff window in milliseconds
  • rate_limit_per_second: global outbound send rate limit (default 20)
  • dedup_window_seconds: dedup window for same channel+chat_id+request_id (default 30)
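To show how the backoff and dedup settings interact with the sample values (200 ms base, 2000 ms cap, 30 s window), here is a minimal Go sketch. It assumes the delay doubles per attempt and that the dedup key is simply the three identifiers joined together; backoffMs and Deduper are hypothetical names, and Golem's actual policy may differ.

```go
package main

import "fmt"

// backoffMs returns the delay before the given retry attempt (1-based),
// doubling from baseMs and never exceeding maxMs.
func backoffMs(attempt, baseMs, maxMs int) int {
	delay := baseMs
	for i := 1; i < attempt; i++ {
		delay *= 2
		if delay >= maxMs {
			return maxMs
		}
	}
	if delay > maxMs {
		return maxMs
	}
	return delay
}

// Deduper drops repeats of the same channel+chat_id+request_id that
// arrive within windowSeconds of the first accepted send.
type Deduper struct {
	windowSeconds int64
	seen          map[string]int64 // key -> unix time of last accepted send
}

func NewDeduper(windowSeconds int64) *Deduper {
	return &Deduper{windowSeconds: windowSeconds, seen: make(map[string]int64)}
}

// Allow reports whether the message should be sent at time nowUnix.
func (d *Deduper) Allow(channel, chatID, requestID string, nowUnix int64) bool {
	key := channel + "\x00" + chatID + "\x00" + requestID
	if last, ok := d.seen[key]; ok && nowUnix-last < d.windowSeconds {
		return false
	}
	d.seen[key] = nowUnix
	return true
}

func main() {
	for attempt := 1; attempt <= 5; attempt++ {
		fmt.Printf("attempt %d: wait %d ms\n", attempt, backoffMs(attempt, 200, 2000))
	}

	d := NewDeduper(30)
	fmt.Println(d.Allow("telegram", "42", "req-1", 100)) // first send
	fmt.Println(d.Allow("telegram", "42", "req-1", 120)) // inside the window
	fmt.Println(d.Allow("telegram", "42", "req-1", 131)) // window elapsed
}
```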

policy.mode values:

  • strict: enforce require_approval list before tool execution
  • relaxed: allow execution without approval gate
  • off: disable policy checks (use off_ttl for temporary bypass)
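Read literally, the three modes reduce to a single check before each tool call. The Go sketch below captures that reading; the Policy type and NeedsApproval method are hypothetical, and off_ttl handling is deliberately left out because its semantics are not described here.

```go
package main

import "fmt"

// Policy mirrors the "policy" section of the config shown earlier.
type Policy struct {
	Mode            string   // "strict", "relaxed", or "off"
	RequireApproval []string // tool names gated in strict mode
}

// NeedsApproval reports whether a tool call must wait for human approval.
// Only strict mode consults the require_approval list.
func (p Policy) NeedsApproval(tool string) bool {
	if p.Mode != "strict" {
		return false
	}
	for _, gated := range p.RequireApproval {
		if gated == tool {
			return true
		}
	}
	return false
}

func main() {
	strict := Policy{Mode: "strict", RequireApproval: []string{"exec"}}
	relaxed := Policy{Mode: "relaxed", RequireApproval: []string{"exec"}}

	fmt.Println("strict/exec:", strict.NeedsApproval("exec"))
	fmt.Println("strict/read_file:", strict.NeedsApproval("read_file"))
	fmt.Println("relaxed/exec:", relaxed.NeedsApproval("exec"))
}
```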

Approval and audit state files:

  • workspace/state/approvals.json
  • workspace/state/audit.jsonl

Environment Variables

Every config key can be overridden with a GOLEM_-prefixed environment variable:

export GOLEM_PROVIDERS_OPENROUTER_APIKEY="your-key"
export GOLEM_PROVIDERS_CLAUDE_APIKEY="your-key"
export GOLEM_LOG_LEVEL=debug

Recommended profile files:

  • .env.local: local development defaults
  • .env.staging: pre-release integration environment
  • .env.production: production deployment

You can start from .env.example and keep policy.mode=strict / policy.allow_persistent_off=false as safe defaults.

Minimum required secrets:

  • At least one provider API key (or use golem auth login --provider <name>).
  • GOLEM_GATEWAY_TOKEN for staging/production where gateway is network-accessible.

Gateway API

Available in server mode (golem run):

  • GET /health
  • GET /version
  • POST /chat

POST /chat example:

{
  "message": "Summarize the latest logs",
  "session_id": "ops-room",
  "sender_id": "api-client"
}

If gateway.token is configured, include:

Authorization: Bearer <token>
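A client for this endpoint might be assembled as in the Go sketch below. The JSON field names and the bearer header come from the example above; the base URL uses the sample port 18790, the Content-Type header is an assumption, and newChatRequest is a hypothetical helper. The response format is not documented here, so the sketch stops at building the request.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// ChatRequest mirrors the JSON body of the POST /chat example.
type ChatRequest struct {
	Message   string `json:"message"`
	SessionID string `json:"session_id"`
	SenderID  string `json:"sender_id"`
}

// newChatRequest builds the HTTP request. The Authorization header is only
// set when a token is supplied, matching the gateway.token note.
func newChatRequest(baseURL, token string, body ChatRequest) (*http.Request, error) {
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/chat", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json") // assumed, not documented
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:18790", "example-token", ChatRequest{
		Message:   "Summarize the latest logs",
		SessionID: "ops-room",
		SenderID:  "api-client",
	})
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println("Authorization:", req.Header.Get("Authorization"))
	// To send it against a running gateway: resp, err := http.DefaultClient.Do(req)
}
```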

Data Flow

User Input (CLI/Telegram/Discord/Slack...)
         │
         ▼
    Channel (receives & validates message)
         │
         ▼
    Bus.PublishInbound() ──▶ MessageBus.inbound
         │
         ▼
    Agent Loop (processes message)
         │
    ┌────┴────┐
    ▼         ▼         ▼
Session  Context  LLM Generate
(History) Builder  (with tools bound)
              │           │
              │           ▼
              │      Tools.Execute()
              │      (tool calls)
              │           │
              └─────┬─────┘
                    ▼
         Bus.PublishOutbound()
                    │
                    ▼
         Channel Manager (routes)
                    │
                    ▼
         Channel.Send() ──▶ User
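The inbound/outbound flow in the diagram can be approximated with two buffered Go channels. In this sketch the method names PublishInbound and PublishOutbound are taken from the diagram, but their signatures, the Message fields, and the agentLoop function are assumptions; the real loop also involves sessions, context building, and tool calls.

```go
package main

import "fmt"

// Message carries the origin channel and chat so replies can be routed back.
type Message struct {
	Channel string
	ChatID  string
	Content string
}

// Bus is a minimal stand-in for the message bus: two async queues.
type Bus struct {
	inbound  chan Message
	outbound chan Message
}

func NewBus(size int) *Bus {
	return &Bus{
		inbound:  make(chan Message, size),
		outbound: make(chan Message, size),
	}
}

func (b *Bus) PublishInbound(m Message)  { b.inbound <- m }
func (b *Bus) PublishOutbound(m Message) { b.outbound <- m }

// agentLoop consumes inbound messages until the queue is closed and
// publishes each reply to the channel/chat the message came from.
func agentLoop(b *Bus, handle func(Message) string) {
	for m := range b.inbound {
		b.PublishOutbound(Message{
			Channel: m.Channel,
			ChatID:  m.ChatID,
			Content: handle(m),
		})
	}
}

func main() {
	bus := NewBus(4)
	done := make(chan struct{})

	go func() {
		agentLoop(bus, func(m Message) string { return "echo: " + m.Content })
		close(done)
	}()

	bus.PublishInbound(Message{Channel: "cli", ChatID: "local", Content: "ping"})
	close(bus.inbound) // no more input; lets agentLoop return
	<-done
	close(bus.outbound)

	// A channel manager would route each outbound message to Channel.Send().
	for m := range bus.outbound {
		fmt.Printf("[%s/%s] %s\n", m.Channel, m.ChatID, m.Content)
	}
}
```

Buffered queues keep channel adapters and the agent decoupled: a slow model call does not block a channel from accepting the next message until the buffer fills.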

Bootstrap Files

The agent's system prompt is built from these files (searched in workspace):

  1. IDENTITY.md - Agent identity and persona
  2. SOUL.md - Core beliefs and values
  3. USER.md - User-specific context
  4. TOOLS.md - Custom tool descriptions
  5. AGENTS.md - Subagent definitions

Operations

For incident handling, restart/rollback flow, and production guidance:

Development

Common commands:

make build
make test
make lint
make smoke

Without make, run before pushing:

go test ./...
go test -race ./...
go vet ./...

Build:

go build -o golem ./cmd/golem

License

MIT

Directories

Path                       Synopsis
cmd/golem (command)        Package main is the main program entry point for Golem.
cmd/golem/commands         Package commands provides the implementations of the Golem CLI subcommands.
internal/agent             Package agent implements the core AI assistant logic, including the message loop, tool calling, and context management.
internal/approval          Package approval implements Golem's human approval workflow, used for admission control over sensitive tool executions.
internal/audit             Package audit implements Golem's runtime audit logging.
internal/auth              Package auth handles Golem's authentication logic, supporting OAuth 2.0 flows and credential persistence.
internal/bus               Package bus implements Golem's message bus, supporting asynchronous communication between channels and the agent.
internal/channel           Package channel defines the interfaces and base implementations for Golem's interaction with chat platforms (such as Telegram and Feishu).
internal/channel/dingtalk  Package dingtalk implements the DingTalk bot integration, using DingTalk's Stream Mode protocol for message delivery.
internal/channel/feishu    Package feishu implements the Feishu bot integration, supporting sending and receiving messages in WebSocket mode.
internal/channel/telegram  Package telegram implements the Telegram bot integration, supporting text messages and voice transcription.
internal/command           Package command defines the interfaces, registration, and dispatch logic for Golem slash commands.
internal/config            Package config handles loading, validation, and persistence of Golem's global configuration.
internal/cron              Package cron implements Golem's scheduling system for cron jobs.
internal/gateway           Package gateway implements Golem's API gateway, allowing interaction with the agent over HTTP.
internal/heartbeat         Package heartbeat implements the periodic heartbeat service, used to monitor system status and send activity reminders to the user.
internal/mcp               Package mcp implements client management for the Model Context Protocol.
internal/memory            Package memory implements Golem's memory management system, including long-term memory (MEMORY.md) and diary-based short-term memory.
internal/metrics           Package metrics implements Golem's runtime metrics, with observability statistics for tool execution, channel sends, and memory recall.
internal/policy            Package policy implements Golem's runtime security policy evaluation engine.
internal/provider          Package provider handles LLM provider adaptation, authentication, and instantiation of chat models (ChatModel).
internal/render            Package render provides rendering helpers for processing and prettifying model output.
internal/session           Package session implements session management, used to store and retrieve chat history between users and the agent.
internal/skills            Package skills implements Golem's enhanced skill system, allowing complex instructions and workflows to be defined in external Markdown files.
internal/state             Package state provides lightweight runtime state persistence.
internal/tools             Package tools implements Golem's built-in tool system and provides tool registration, guarding, and execution management.
