agentcomms

v0.3.0
Published: Mar 16, 2026 License: MIT

README

AgentComms

Documentation | Getting Started | MCP Tools

An MCP plugin that enables voice calls and chat messaging for AI coding assistants. Start a task, walk away. Your phone rings when the AI is done, stuck, or needs a decision. Or get notified via Discord, Telegram, or WhatsApp.

Supports: Claude Code, AWS Kiro CLI, Gemini CLI

Built with the plexusone stack - showcasing a complete voice and chat AI architecture in Go.

Features

  • 📞 Phone Calls: Real voice calls to your phone via Twilio; works with smartphones, smartwatches, landlines, or VoIP
  • 💬 Chat Messaging: Send messages via Discord, Telegram, or WhatsApp
  • 🔄 Multi-turn Conversations: Back-and-forth discussions, not just one-way notifications
  • ⚡ Smart Triggers: Hooks that suggest calling/messaging when the AI is stuck or done with its work
  • 🔀 Mix and Match: Use voice, chat, or both based on your needs
  • 🧠 Parallel Execution: AI continues working while waiting for your response: searching code, running tests, preparing next steps

How It Works

AgentComms provides bidirectional communication between humans and AI agents:

                           AgentComms
                    ┌──────────────────────┐
                    │                      │
  ┌──────────┐      │   ┌────────────┐     │      ┌──────────┐
  │ AI Agent │ ────▶│   │ MCP Server │     │◀──── │  Human   │
  │ Claude / │      │   │ (OUTBOUND) │     │      │ (Discord │
  │ Codex    │ ◀────│   └────────────┘     │────▶ │  Phone)  │
  └──────────┘      │                      │      └──────────┘
                    │   ┌────────────┐     │
                    │   │   Daemon   │     │
                    │   │ (INBOUND)  │     │
                    │   └────────────┘     │
                    │         │            │
                    │    ┌────┴────┐       │
                    │    │  tmux   │       │
                    │    │  pane   │       │
                    │    └─────────┘       │
                    └──────────────────────┘

Two communication modes:

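| Mode     | Direction     | Use Case                                                       |
|----------|---------------|----------------------------------------------------------------|
| OUTBOUND | Agent → Human | AI needs input, reports completion, escalates blockers         |
| INBOUND  | Human → Agent | Interrupt agent, send instructions, coordinate multiple agents |

OUTBOUND (MCP Server)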
  1. AI needs input → Calls your phone or sends a chat message
  2. You respond → Voice is transcribed, chat is read directly
  3. AI continues → Uses your input to complete the task
INBOUND (Daemon) - Preview
  1. You send a message → Type in Discord channel or send SMS
  2. Daemon receives → Routes to the correct agent via tmux
  3. Agent sees it → Message appears in agent's terminal
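
The last INBOUND step, making a message appear in the agent's terminal, can be illustrated with tmux's `send-keys` command. This is a minimal sketch that assumes the tmux adapter shells out to `tmux send-keys`; the function names `sendKeysArgs` and `interruptArgs` are hypothetical and are not taken from the agentcomms source.

```go
package main

import (
	"fmt"
	"os/exec"
)

// sendKeysArgs builds the tmux invocations that type text into a pane and
// then press Enter. The -l flag makes tmux treat the text literally instead
// of interpreting words such as "Enter" inside the message as key names.
func sendKeysArgs(target, text string) [][]string {
	return [][]string{
		{"send-keys", "-t", target, "-l", text},
		{"send-keys", "-t", target, "Enter"},
	}
}

// interruptArgs builds the tmux invocation that sends Ctrl-C to a pane.
func interruptArgs(target string) []string {
	return []string{"send-keys", "-t", target, "C-c"}
}

func main() {
	// Print the commands instead of running them, so the sketch works
	// on machines without tmux installed.
	for _, args := range sendKeysArgs("claude-code", "Please also add unit tests") {
		fmt.Println(exec.Command("tmux", args...).String())
	}
	fmt.Println(exec.Command("tmux", interruptArgs("claude-code")...).String())
}
```

Typing the text and pressing Enter as two separate invocations keeps the message body literal while still submitting it; the `claude-code` target matches the `tmux_session` value used in the configuration example.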

Architecture

┌───────────────────────────────────────────────────────────────────────────┐
│                           agentcomms                                      │
├───────────────────────────────────────────────────────────────────────────┤
│  OUTBOUND (MCP Server) - Agent → Human                                    │
│  ├── Voice Tools: initiate_call, continue_call, speak_to_user, end_call   │
│  ├── Chat Tools:  send_message, list_channels, get_messages               │
│  ├── Voice Manager - Orchestrates calls via omnivoice                     │
│  └── Chat Manager  - Routes messages via omnichat                         │
├───────────────────────────────────────────────────────────────────────────┤
│  INBOUND (Daemon) - Human → Agent                                         │
│  ├── Router       - Actor-style event dispatcher (goroutine per agent)    │
│  ├── AgentBridge  - Adapters for tmux, process, etc.                      │
│  ├── Event Store  - SQLite database via Ent ORM                           │
│  └── Transports   - Discord, Twilio (receives human messages)             │
├───────────────────────────────────────────────────────────────────────────┤
│  Shared Infrastructure                                                    │
│  ├── omnivoice    - Voice abstraction (TTS, STT, Transport, CallSystem)   │
│  ├── omnichat     - Chat abstraction (Discord, Telegram, WhatsApp)        │
│  ├── mcpkit       - MCP server with ngrok integration                     │
│  └── Ent          - Database ORM with SQLite/PostgreSQL support           │
├───────────────────────────────────────────────────────────────────────────┤
│  Provider Implementations                                                 │
│  ├── Voice: ElevenLabs, Deepgram, OpenAI, Twilio                          │
│  └── Chat:  Discord, Telegram, WhatsApp                                   │
└───────────────────────────────────────────────────────────────────────────┘
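
The Router entry describes an actor-style dispatcher with one goroutine per agent. The sketch below shows that general pattern only; the `Event`, `Router`, `Dispatch`, and `Close` names are hypothetical, and the real router in `internal/router` may be organized differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical inbound message addressed to one agent.
type Event struct {
	AgentID string
	Text    string
}

// Router owns one mailbox channel per agent. Each mailbox is drained by
// its own goroutine, so events for a single agent are handled in order
// while different agents proceed independently.
type Router struct {
	mu        sync.Mutex
	mailboxes map[string]chan Event
	wg        sync.WaitGroup
	handle    func(Event)
}

func NewRouter(handle func(Event)) *Router {
	return &Router{mailboxes: make(map[string]chan Event), handle: handle}
}

// Dispatch delivers an event to the agent's actor, starting the actor on
// first use. It must not be called after Close.
func (r *Router) Dispatch(ev Event) {
	r.mu.Lock()
	box, ok := r.mailboxes[ev.AgentID]
	if !ok {
		box = make(chan Event, 16)
		r.mailboxes[ev.AgentID] = box
		r.wg.Add(1)
		go func() {
			defer r.wg.Done()
			for e := range box {
				r.handle(e)
			}
		}()
	}
	r.mu.Unlock()
	box <- ev
}

// Close closes every mailbox and waits for the actors to drain them.
func (r *Router) Close() {
	r.mu.Lock()
	for _, box := range r.mailboxes {
		close(box)
	}
	r.mu.Unlock()
	r.wg.Wait()
}

func main() {
	var mu sync.Mutex
	seen := map[string][]string{}
	r := NewRouter(func(e Event) {
		mu.Lock()
		seen[e.AgentID] = append(seen[e.AgentID], e.Text)
		mu.Unlock()
	})
	r.Dispatch(Event{"backend", "run the tests"})
	r.Dispatch(Event{"frontend", "rebuild the UI"})
	r.Dispatch(Event{"backend", "then open a PR"})
	r.Close()
	fmt.Println(seen["backend"], seen["frontend"])
	// prints: [run the tests then open a PR] [rebuild the UI]
}
```

A per-agent mailbox keeps a slow agent from blocking delivery to the others until its buffer fills, which is the usual reason to prefer this shape over a single shared queue.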

The plexusone Stack

This project demonstrates the plexusone voice and chat AI stack:

| Package            | Role              | Description                                                     |
|--------------------|-------------------|-----------------------------------------------------------------|
| omnivoice          | Voice Abstraction | Batteries-included TTS/STT with registry-based provider lookup  |
| omnichat           | Chat Abstraction  | Provider-agnostic chat messaging interface                      |
| elevenlabs-go      | Voice Provider    | ElevenLabs streaming TTS and STT                                |
| omnivoice-deepgram | Voice Provider    | Deepgram streaming TTS and STT                                  |
| omnivoice-openai   | Voice Provider    | OpenAI TTS and STT                                              |
| omnivoice-twilio   | Phone Provider    | Twilio transport and call system                                |
| mcpkit             | Server            | MCP server runtime with ngrok and multiple transport modes      |

Installation

Prerequisites
  • Go 1.25+
  • For voice: Twilio account + ngrok account
  • For chat: Discord/Telegram bot token (optional)
Build
cd /path/to/agentcomms
go mod tidy
go build -o agentcomms ./cmd/agentcomms

Configuration

AgentComms uses a unified JSON configuration file that combines all settings.

Quick Setup
# Generate configuration file
./agentcomms config init

# Or generate minimal config (chat only, no voice)
./agentcomms config init --minimal

# Set environment variables for secrets
export DISCORD_TOKEN=your_discord_bot_token
export TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
export TWILIO_AUTH_TOKEN=your_auth_token
export ELEVENLABS_API_KEY=your_elevenlabs_key
export DEEPGRAM_API_KEY=your_deepgram_key
export NGROK_AUTHTOKEN=your_ngrok_authtoken

# Validate configuration
./agentcomms config validate
Configuration File

The config file at ~/.agentcomms/config.json supports environment variable substitution:

{
  "version": "1",
  "server": { "port": 3333 },
  "agents": [
    { "id": "claude", "type": "tmux", "tmux_session": "claude-code" }
  ],
  "voice": {
    "phone": {
      "account_sid": "${TWILIO_ACCOUNT_SID}",
      "auth_token": "${TWILIO_AUTH_TOKEN}",
      "number": "+15551234567",
      "user_number": "+15559876543"
    },
    "tts": { "provider": "elevenlabs", "api_key": "${ELEVENLABS_API_KEY}" },
    "stt": { "provider": "deepgram", "api_key": "${DEEPGRAM_API_KEY}" },
    "ngrok": { "auth_token": "${NGROK_AUTHTOKEN}" }
  },
  "chat": {
    "discord": { "enabled": true, "token": "${DISCORD_TOKEN}" },
    "channels": [
      { "channel_id": "discord:YOUR_CHANNEL_ID", "agent_id": "claude" }
    ]
  }
}
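
To illustrate what `${VAR}` substitution in a JSON config can look like, here is a minimal sketch built on the standard library's `os.Expand`. The helper name `expandConfig` is hypothetical, and the actual loader in `pkg/config` may expand variables differently (for example, after parsing rather than before).

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// expandConfig replaces ${VAR} references in raw config text using lookup.
// Note that os.Expand also accepts the bare $VAR form, which the real
// loader may or may not support.
func expandConfig(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	raw := `{"chat": {"discord": {"enabled": true, "token": "${DISCORD_TOKEN}"}}}`

	os.Setenv("DISCORD_TOKEN", "example-token") // stand-in value for the demo
	expanded := expandConfig(raw, os.Getenv)

	var cfg struct {
		Chat struct {
			Discord struct {
				Enabled bool   `json:"enabled"`
				Token   string `json:"token"`
			} `json:"discord"`
		} `json:"chat"`
	}
	if err := json.Unmarshal([]byte(expanded), &cfg); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(cfg.Chat.Discord.Token) // prints "example-token"
}
```

Expanding the raw text before parsing is the simplest approach, but a secret containing a double quote or backslash would produce invalid JSON; expanding string values after parsing avoids that.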

See the Configuration Guide for full documentation.

Usage

Commands

AgentComms provides two main commands:

# Run MCP server (OUTBOUND - spawned by AI assistant)
./agentcomms serve

# Run daemon (INBOUND - background service for human messages)
./agentcomms daemon

Running ./agentcomms without a subcommand defaults to serve for backwards compatibility.

Running the MCP Server (OUTBOUND)
./agentcomms serve

Output:

Starting agentcomms MCP server...
Using plexusone stack:
  - omnivoice (voice abstraction)
  - omnichat (chat abstraction)
  - mcpkit (MCP server)
Voice providers: tts=elevenlabs stt=deepgram
Chat providers: [discord telegram]
MCP server ready
  Local:  http://localhost:3333/mcp
  Public: https://abc123.ngrok.io/mcp
Running the Daemon (INBOUND) - Preview

The daemon enables human-to-agent communication. It runs as a background service and routes messages from Discord/Twilio to agents running in tmux.

./agentcomms daemon

Output:

INFO starting daemon data_dir=/Users/you/.agentcomms socket=/Users/you/.agentcomms/daemon.sock
INFO database initialized path=/Users/you/.agentcomms/data.db
INFO router initialized
INFO daemon started

Data storage: ~/.agentcomms/

  • config.json - Unified configuration file
  • data.db - SQLite database (events, agents)
  • daemon.sock - Unix socket for CLI/API
Daemon CLI Commands

Once the daemon is running, use these CLI commands to interact with it:

# Check daemon status
./agentcomms status

# List configured agents
./agentcomms agents

# Send a message to an agent (appears in tmux pane)
./agentcomms send <agent-id> "Your message here"

# Send an interrupt (Ctrl-C) to an agent
./agentcomms interrupt <agent-id>

# View recent events for an agent
./agentcomms events <agent-id> --limit 20

# Send a reply to a chat channel (outbound from agent)
./agentcomms reply discord:123456789 "Task completed!"

# List configured chat channels
./agentcomms channels

# Validate configuration
./agentcomms config validate

# Show current configuration
./agentcomms config show
Daemon Configuration

Generate and edit the configuration:

# Generate config file
./agentcomms config init

# Edit ~/.agentcomms/config.json with your settings

# Validate configuration
./agentcomms config validate

See the Configuration Guide for full details.

Multi-Tool Support

AgentComms supports multiple AI coding assistants. Generate configuration files for your preferred tool:

# Generate for a specific tool
go run ./cmd/generate-plugin claude .   # Claude Code
go run ./cmd/generate-plugin kiro .     # AWS Kiro CLI
go run ./cmd/generate-plugin gemini .   # Gemini CLI

# Generate for all tools
go run ./cmd/generate-plugin all ./plugins
Claude Code Integration

Option 1: Use generated plugin files

go run ./cmd/generate-plugin claude .

This creates:

  • .claude-plugin/plugin.json - Plugin manifest
  • skills/phone-input/SKILL.md - Voice calling skill
  • skills/chat-messaging/SKILL.md - Chat messaging skill
  • commands/call.md - /call slash command
  • commands/message.md - /message slash command
  • .claude/settings.json - Lifecycle hooks

Option 2: Manual MCP configuration

Add to ~/.claude/settings.json or .claude/settings.json:

{
  "mcpServers": {
    "agentcomms": {
      "command": "/path/to/agentcomms",
      "env": {
        "TWILIO_ACCOUNT_SID": "ACxxx",
        "TWILIO_AUTH_TOKEN": "xxx",
        "NGROK_AUTHTOKEN": "xxx",
        "DISCORD_TOKEN": "xxx",
        "ELEVENLABS_API_KEY": "xxx",
        "DEEPGRAM_API_KEY": "xxx",
        "AGENTCOMMS_AGENT_ID": "claude"
      }
    }
  }
}

MCP Tools

Voice Tools
initiate_call

Start a new call to the user.

{
  "message": "Hey! I finished implementing the feature. Want me to walk you through it?"
}

Returns:

{
  "call_id": "call-1-1234567890",
  "response": "Sure, go ahead and explain what you built."
}
continue_call

Continue an active call with another message.

{
  "call_id": "call-1-1234567890",
  "message": "I added authentication using JWT. Should I also add refresh tokens?"
}
speak_to_user

Speak without waiting for a response (useful for status updates).

{
  "call_id": "call-1-1234567890",
  "message": "Let me search for that in the codebase. Give me a moment..."
}
end_call

End the call with an optional goodbye message.

{
  "call_id": "call-1-1234567890",
  "message": "Perfect! I'll get started on that. Talk soon!"
}
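
The four voice tools exchange small JSON payloads built from `call_id`, `message`, and `response`. As a sketch of how a client might model them in Go, the types and helpers below mirror the field names shown in the examples; the Go names themselves (`CallRequest`, `CallResponse`, `encodeRequest`, `decodeResponse`) are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CallRequest mirrors the continue_call, speak_to_user, and end_call inputs.
// initiate_call sends only Message, so CallID is omitted when empty.
type CallRequest struct {
	CallID  string `json:"call_id,omitempty"`
	Message string `json:"message"`
}

// CallResponse mirrors the documented initiate_call result.
type CallResponse struct {
	CallID   string `json:"call_id"`
	Response string `json:"response"`
}

// encodeRequest renders a request as JSON tool arguments.
func encodeRequest(req CallRequest) (string, error) {
	b, err := json.Marshal(req)
	return string(b), err
}

// decodeResponse parses the JSON result of initiate_call.
func decodeResponse(raw string) (CallResponse, error) {
	var res CallResponse
	err := json.Unmarshal([]byte(raw), &res)
	return res, err
}

func main() {
	res, err := decodeResponse(`{"call_id": "call-1-1234567890", "response": "Sure, go ahead and explain what you built."}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// Reuse the returned call_id for the follow-up continue_call request.
	next, err := encodeRequest(CallRequest{CallID: res.CallID, Message: "Should I also add refresh tokens?"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(next)
	// prints {"call_id":"call-1-1234567890","message":"Should I also add refresh tokens?"}
}
```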
Chat Tools
send_message

Send a message to a chat channel.

{
  "provider": "discord",
  "chat_id": "123456789",
  "message": "I've finished the PR! Here's the link: https://github.com/..."
}
list_channels

List available chat channels and their status.

{}

Returns:

{
  "channels": [
    {"provider_name": "discord", "status": "connected"},
    {"provider_name": "telegram", "status": "connected"}
  ]
}
get_messages

Get recent messages from a chat conversation.

{
  "provider": "telegram",
  "chat_id": "987654321",
  "limit": 5
}
Inbound Tools

These tools allow Claude Code to poll for messages sent by humans via the daemon.

check_messages

Check for new messages sent to this agent from humans via chat.

{
  "agent_id": "claude",
  "limit": 10
}

Returns:

{
  "messages": [
    {
      "id": "evt_01ABC123",
      "channel_id": "discord:123456789",
      "provider": "discord",
      "text": "Hey, can you also add unit tests?",
      "timestamp": "2024-01-15T10:30:00Z",
      "type": "human_message"
    }
  ],
  "agent_id": "claude",
  "has_more": false
}
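
Inbound messages carry a combined `channel_id` such as `discord:123456789`, while `send_message` takes `provider` and `chat_id` separately. Replying to an inbound message therefore needs the two halves split apart. The sketch below assumes the ID is always the provider name, a colon, and the chat ID; `splitChannelID` is a hypothetical helper, not part of the documented tool set.

```go
package main

import (
	"fmt"
	"strings"
)

// splitChannelID separates "discord:123456789" into provider and chat ID.
// It splits on the first colon only, so chat IDs that contain colons stay
// intact. The boolean is false when either half is missing.
func splitChannelID(channelID string) (string, string, bool) {
	provider, chatID, found := strings.Cut(channelID, ":")
	if !found || provider == "" || chatID == "" {
		return "", "", false
	}
	return provider, chatID, true
}

func main() {
	provider, chatID, ok := splitChannelID("discord:123456789")
	fmt.Println(provider, chatID, ok) // prints: discord 123456789 true
}
```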
get_agent_events

Get all recent events for an agent (messages, interrupts, status changes).

{
  "agent_id": "claude",
  "since_id": "evt_01ABC123",
  "limit": 20
}
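
The `since_id` and `limit` parameters suggest cursor-style paging. The sketch below shows one way such paging can work over an in-memory log. It assumes that events are ordered oldest first and that `since_id` is exclusive; neither is stated in this README, and `eventsAfter` is a hypothetical helper.

```go
package main

import "fmt"

// Event is a hypothetical stored event; only the ID matters for paging.
type Event struct {
	ID   string
	Text string
}

// eventsAfter returns up to limit events that follow sinceID in a log
// ordered oldest first, plus a flag reporting whether more events remain.
// An empty or unknown sinceID starts from the beginning of the log.
func eventsAfter(log []Event, sinceID string, limit int) ([]Event, bool) {
	if limit < 0 {
		limit = 0
	}
	start := 0
	for i, e := range log {
		if e.ID == sinceID {
			start = i + 1
			break
		}
	}
	end := start + limit
	if end > len(log) {
		end = len(log)
	}
	return log[start:end], end < len(log)
}

func main() {
	log := []Event{{"evt_1", "a"}, {"evt_2", "b"}, {"evt_3", "c"}, {"evt_4", "d"}}
	cursor := ""
	for {
		page, more := eventsAfter(log, cursor, 3)
		for _, e := range page {
			fmt.Println(e.ID, e.Text)
		}
		if !more {
			break
		}
		// Advance the cursor to the last event seen on this page.
		cursor = page[len(page)-1].ID
	}
}
```

Passing the last seen event ID back as the cursor means a poller never has to track offsets, and events appended between polls are picked up on the next call.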
daemon_status

Check if the agentcomms daemon is running.

{}

Returns:

{
  "running": true,
  "started_at": "2024-01-15T09:00:00Z",
  "agents": 1,
  "providers": ["discord", "telegram"]
}
Multi-Agent Tools

These tools enable agent-to-agent communication for task delegation and coordination.

list_agents

List all available agents and their status.

{
  "include_offline": false
}

Returns:

{
  "agents": [
    {"id": "backend", "type": "tmux", "status": "online", "target": "tmux:dev:0"},
    {"id": "frontend", "type": "tmux", "status": "online", "target": "tmux:dev:1"}
  ]
}
send_agent_message

Send a message to another agent.

{
  "to_agent_id": "backend",
  "message": "Can you help me with the API implementation?"
}

Messages arrive at the destination agent with source prefix:

[from: frontend] Can you help me with the API implementation?
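
The source prefix is plain text, so a receiving agent or a test harness can produce and parse it with a few string operations. A small sketch, using the hypothetical helpers `withSource` and `parseSource` and assuming the prefix always has the exact `[from: <id>] ` form shown here:

```go
package main

import (
	"fmt"
	"strings"
)

// withSource prepends the sender's agent ID in the "[from: <id>] " form.
func withSource(fromAgentID, message string) string {
	return fmt.Sprintf("[from: %s] %s", fromAgentID, message)
}

// parseSource is the inverse: it extracts the sender and the body, or
// reports ok=false when the line carries no source prefix.
func parseSource(line string) (from, body string, ok bool) {
	rest, found := strings.CutPrefix(line, "[from: ")
	if !found {
		return "", line, false
	}
	from, body, ok = strings.Cut(rest, "] ")
	if !ok {
		return "", line, false
	}
	return from, body, true
}

func main() {
	line := withSource("frontend", "Can you help me with the API implementation?")
	fmt.Println(line)
	// prints: [from: frontend] Can you help me with the API implementation?

	from, body, ok := parseSource(line)
	fmt.Println(from, ok, len(body) > 0) // prints: frontend true true
}
```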

Use Cases

Phone calls are ideal for:

  • Reporting significant task completion
  • Requesting urgent clarification when blocked
  • Discussing complex decisions
  • Walking through code changes
  • Multi-step processes needing back-and-forth

Chat messaging is ideal for:

  • Asynchronous status updates
  • Sharing links, code, or formatted content
  • Non-urgent notifications
  • Follow-up summaries

Development

Project Structure
agentcomms/
├── cmd/
│   └── agentcomms/
│       ├── main.go          # CLI entry point (serve, daemon)
│       └── commands.go      # CLI commands (send, interrupt, reply, etc.)
├── internal/                # INBOUND infrastructure
│   ├── daemon/
│   │   ├── daemon.go        # Background daemon service
│   │   ├── server.go        # Unix socket server
│   │   ├── client.go        # Client library for IPC
│   │   ├── protocol.go      # JSON-RPC style protocol
│   │   └── config.go        # Daemon configuration (YAML)
│   ├── router/
│   │   ├── router.go        # Event dispatcher
│   │   └── actor.go         # Per-agent actor (goroutine)
│   ├── bridge/
│   │   ├── adapter.go       # Agent adapter interface
│   │   └── tmux.go          # tmux adapter
│   ├── transport/
│   │   └── chat.go          # Chat transport (omnichat)
│   └── events/
│       └── id.go            # Event ID generation
├── ent/                     # Database schema (Ent ORM)
│   └── schema/
│       ├── event.go         # Event entity
│       └── agent.go         # Agent entity
├── pkg/                     # OUTBOUND infrastructure
│   ├── voice/
│   │   └── manager.go       # Voice call orchestration
│   ├── chat/
│   │   └── manager.go       # Chat message routing
│   ├── config/
│   │   ├── config.go        # Legacy configuration
│   │   └── unified.go       # Unified JSON configuration
│   └── tools/
│       └── tools.go         # MCP tool definitions
├── examples/
│   └── config.json          # Example JSON configuration
├── docs/
│   └── design/              # Architecture documentation
│       ├── FEAT_INBOUND_PRD.md
│       ├── FEAT_INBOUND_TRD.md
│       └── FEAT_INBOUND_PLAN.md
├── go.mod
└── README.md
Dependencies
  • github.com/plexusone/omnivoice - Batteries-included voice abstraction
  • github.com/plexusone/omnichat - Chat messaging abstraction
  • github.com/plexusone/omnivoice-twilio - Twilio transport and call system
  • github.com/plexusone/mcpkit - MCP server runtime
  • github.com/modelcontextprotocol/go-sdk - MCP protocol SDK
  • entgo.io/ent - Entity framework for Go (database ORM)
  • modernc.org/sqlite - Pure Go SQLite driver

Cost Estimate

| Service               | Cost                                  |
|-----------------------|---------------------------------------|
| Twilio outbound calls | ~$0.014/min                           |
| Twilio phone number   | ~$1.15/month                          |
| ElevenLabs TTS        | $0.30/1K chars ($0.03/min of speech)  |
| ElevenLabs STT        | ~$0.10/min (Scribe)                   |
| Deepgram TTS          | ~$0.015/1K chars                      |
| Deepgram STT          | ~$0.0043/min (Nova-2)                 |
| OpenAI TTS            | ~$0.015/1K chars                      |
| OpenAI STT            | ~$0.006/min (Whisper)                 |
| Discord/Telegram      | Free                                  |
| ngrok (free tier)     | $0                                    |

Provider Recommendations:

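| Priority     | TTS Provider | STT Provider | Total Cost/min | Notes                              |
|--------------|--------------|--------------|----------------|------------------------------------|
| Lowest Cost  | Deepgram     | Deepgram     | ~$0.03         | Best value, good quality           |
| Best Quality | ElevenLabs   | Deepgram     | ~$0.05         | Premium voices, fast transcription |
| Balanced     | OpenAI       | OpenAI       | ~$0.04         | Single API key, consistent quality |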

Costs are approximate and exclude Twilio phone charges (~$0.014/min).
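
As a worked example of the arithmetic, the sketch below adds up the per-minute figures listed in the cost table for one combination (Twilio call, ElevenLabs TTS, Deepgram STT). It assumes TTS and STT are both billed for every minute of the call, which overstates real usage since the two rarely run at the same time; `perMinute` is a hypothetical helper.

```go
package main

import "fmt"

// perMinute sums per-minute rates (USD) for the services used during a call.
func perMinute(rates ...float64) float64 {
	total := 0.0
	for _, r := range rates {
		total += r
	}
	return total
}

func main() {
	const (
		twilioCall    = 0.014  // Twilio outbound call, per minute
		elevenLabsTTS = 0.03   // ElevenLabs TTS, per minute of speech
		deepgramSTT   = 0.0043 // Deepgram STT (Nova-2), per minute
	)
	rate := perMinute(twilioCall, elevenLabsTTS, deepgramSTT)
	fmt.Printf("per minute: $%.4f, 10-minute call: $%.2f\n", rate, rate*10)
	// prints: per minute: $0.0483, 10-minute call: $0.48
}
```

The monthly Twilio number fee is a fixed cost and is left out of the per-call figure.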

License

MIT

Credits

Inspired by ZeframLou/call-me (TypeScript).

Built with the plexusone stack.

Directories

| Path                          | Synopsis                                                                       |
|-------------------------------|--------------------------------------------------------------------------------|
| cmd/agentcomms (command)      | Package main is the entry point for the agentcomms CLI.                        |
| cmd/generate-plugin (command) | Package main generates AI assistant integration files using assistantkit bundle. |
| cmd/publish (command)         | Package main publishes the agentcomms plugin to the Claude Code marketplace.   |
| ent                           | Package ent provides the generated Ent client and types.                       |
| ent/schema                    | Package schema contains the Ent schema definitions for AgentComms.             |
| internal/bridge               | Package bridge provides adapters for connecting to agent runtimes.             |
| internal/daemon               | Package daemon provides the AgentComms daemon configuration.                   |
| internal/events               | Package events provides event-related utilities.                               |
| internal/router               | Package router provides the actor-style event router for agent communication.  |
| internal/transport            | Package transport provides inbound message transports for AgentComms.          |
| pkg/chat                      | Package chat provides chat channel integration using omnichat.                 |
| pkg/config                    | Package config provides configuration management for agentcomms.               |
| pkg/tools                     | Package tools defines the MCP tools for agentcomms.                            |
| pkg/voice                     | Package voice orchestrates voice calls using the omnivoice stack.              |
