> ⚠️ **Warning**
>
> **Early Development Stage**: This project is in its early development stage, and breaking changes are expected until it reaches a stable version. Always pin a specific version tag when downloading binaries or using the install scripts.
## Features

- **Automatic Gateway Management**: Automatically downloads and runs the Inference Gateway binary (no Docker required!)
- **Zero-Configuration Setup**: Start chatting immediately with just your API keys in a `.env` file
- **Interactive Chat**: Chat with models using an interactive interface
- **Status Monitoring**: Check gateway health and resource usage
- **Conversation History**: Store and retrieve past conversations with multiple storage backends
- **Configuration Management**: Manage gateway settings via YAML config
- **Project Initialization**: Set up local project configurations
- **Tool Execution**: LLMs can execute whitelisted commands and tools - See all tools →
- **Tool Approval System**: User approval workflow for sensitive operations with real-time diff visualization
- **Agent Modes**: Three operational modes for different workflows:
  - **Standard Mode** (default): Normal operation with all configured tools and approval checks
  - **Plan Mode**: Read-only mode for planning and analysis without execution
  - **Auto-Accept Mode**: All tools auto-approved for rapid execution (YOLO mode)
  - Toggle between modes with `Shift+Tab`
- **Token Usage Tracking**: Accurate token counting, with polyfill support for providers that don't return usage metrics
- **Inline History Auto-Completion**: Smart command history suggestions with inline completion
- **Customizable Keybindings**: Fully configurable keyboard shortcuts for the chat interface
- **Extensible Shortcuts System**: Create custom commands with AI-powered snippets - Learn more →
- **MCP Server Support**: Direct integration with Model Context Protocol servers for extended tool capabilities - Learn more →
## Installation

### Using Go Install

```bash
go install github.com/inference-gateway/cli@latest
```

This installs the binary as `cli`. To rename it to `infer`:

```bash
mv "$(go env GOPATH)/bin/cli" "$(go env GOPATH)/bin/infer"
```

Or use an alias:

```bash
alias infer="$(go env GOPATH)/bin/cli"
```
### Using Container Image

```bash
# Create the network and deploy the Inference Gateway first
docker network create inference-gateway
docker run -d --name inference-gateway --network inference-gateway \
  --env-file .env \
  ghcr.io/inference-gateway/inference-gateway:latest

# Pull and run the CLI
docker pull ghcr.io/inference-gateway/cli:latest
docker run -it --rm --network inference-gateway ghcr.io/inference-gateway/cli:latest chat
```
### Using Install Script

```bash
# Latest version
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash

# Specific version
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --version v0.77.0

# Custom installation directory
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --install-dir "$HOME/.local/bin"
```
### Manual Download

Download the latest release binary for your platform from the releases page.

Verify the binary (recommended for security):

```bash
# Download the binary and checksums
curl -L -o infer-darwin-amd64 \
  https://github.com/inference-gateway/cli/releases/latest/download/infer-darwin-amd64
curl -L -o checksums.txt \
  https://github.com/inference-gateway/cli/releases/latest/download/checksums.txt

# Verify the checksum
shasum -a 256 infer-darwin-amd64
grep infer-darwin-amd64 checksums.txt

# Install
chmod +x infer-darwin-amd64
sudo mv infer-darwin-amd64 /usr/local/bin/infer
```
For advanced verification with Cosign signatures, see Binary Verification Guide.
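The two commands above print the computed and expected checksums for manual comparison; the check can also be automated by letting the checksum tool do the comparison. A minimal sketch, assuming `sha256sum` (use `shasum -a 256` on macOS) and a `checksums.txt` with standard `<hash>  <filename>` lines; the file contents here are stand-ins for illustration:

```shell
# Work in a scratch directory and create a stand-in "binary" plus its
# checksum list; substitute the real downloaded files in practice.
cd "$(mktemp -d)"
printf 'demo binary contents\n' > infer-darwin-amd64
sha256sum infer-darwin-amd64 > checksums.txt

# Select the matching line and let sha256sum verify and report.
grep 'infer-darwin-amd64$' checksums.txt | sha256sum -c -
# prints: infer-darwin-amd64: OK
```

A non-zero exit status (and a `FAILED` line) indicates the binary does not match the published checksum.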
### Build from Source

```bash
git clone https://github.com/inference-gateway/cli.git
cd cli
go build -o infer cmd/infer/main.go
sudo mv infer /usr/local/bin/
```
## Quick Start

1. Initialize your project:

   ```bash
   infer init
   ```

   This creates a `.infer/` directory with configuration and shortcuts.

2. Set up your environment (create a `.env` file):

   ```
   ANTHROPIC_API_KEY=your_key_here
   OPENAI_API_KEY=your_key_here
   DEEPSEEK_API_KEY=your_key_here
   ```

3. Start chatting:

   ```bash
   infer chat
   ```
## Next Steps

Now that you're up and running, the sections below cover commands, configuration, tools, and shortcuts in more detail.
## Commands

The CLI provides several commands for different workflows. For detailed documentation, see the Commands Reference.

### Core Commands

`infer init` - Initialize a new project with configuration and shortcuts

```bash
infer init             # Initialize project configuration
infer init --userspace # Initialize user-level configuration
```

`infer chat` - Start an interactive chat session with model selection

```bash
infer chat
```

Features: model selection, real-time streaming, scrollable history, three agent modes (Standard/Plan/Auto-Accept).

`infer agent` - Execute autonomous tasks in background mode

```bash
infer agent "Please fix the github issue 38"
infer agent --model "openai/gpt-4" "Implement feature from issue #42"
infer agent "Analyze this UI issue" --files screenshot.png
```

Features: autonomous execution, multimodal support (images/files), parallel tool execution.
### Configuration Commands

`infer config` - Manage CLI configuration settings

```bash
# Agent configuration
infer config agent set-model "deepseek/deepseek-chat"
infer config agent set-system "You are a helpful assistant"
infer config agent set-max-turns 100
infer config agent verbose-tools enable

# Tool management
infer config tools enable
infer config tools bash enable
infer config tools safety enable

# Export configuration
infer config export set-model "anthropic/claude-4.1-haiku"
```

See the Commands Reference for all configuration options.
### Agent Management

`infer agents` - Manage A2A (Agent-to-Agent) agent configurations

```bash
infer agents init                    # Initialize agents configuration
infer agents add browser-agent       # Add an agent from the registry with defaults
infer agents add custom https://...  # Add a custom agent
infer agents list                    # List all agents
```

For detailed A2A setup, see A2A Agents Configuration.
### Utility Commands

`infer status` - Check gateway health and resource usage

```bash
infer status
```

`infer conversation-title` - Manage AI-powered conversation titles

```bash
infer conversation-title generate # Generate titles for all conversations
infer conversation-title status   # Show generation status
```

`infer version` - Display CLI version information

```bash
infer version
```
## Tools

When tool execution is enabled, LLMs can use various tools to interact with your system. Below is a summary of available tools. For detailed documentation, parameters, and examples, see the Tools Reference.

| Tool | Purpose | Approval Required | Documentation |
| --- | --- | --- | --- |
| Bash | Execute whitelisted shell commands | Optional | Details |
| Read | Read file contents with line ranges | No | Details |
| Write | Write content to files | Yes | Details |
| Edit | Exact string replacements in files | Yes | Details |
| MultiEdit | Multiple atomic edits to files | Yes | Details |
| Delete | Delete files and directories | Yes | Details |
| Tree | Display directory structure | No | Details |
| Grep | Search files with regex (ripgrep/Go) | No | Details |
| WebSearch | Search the web (DuckDuckGo/Google) | No | Details |
| WebFetch | Fetch content from URLs | No | Details |
| Github | Interact with GitHub API | No | Details |
| TodoWrite | Create and manage task lists | No | Details |
| A2A_SubmitTask | Submit tasks to A2A agents | No | Details |
| A2A_QueryAgent | Query A2A agent capabilities | No | Details |
| A2A_QueryTask | Check A2A task status | No | Details |
| A2A_DownloadArtifacts | Download A2A task outputs | No | Details |
**Tool Configuration:**

Tools can be enabled/disabled and configured individually:

```bash
# Enable/disable specific tools
infer config tools bash enable
infer config tools write enable

# Configure tool settings
infer config tools grep set-backend ripgrep
infer config tools web-fetch add-domain "example.com"
```

See the Tools Reference for complete documentation.
## Configuration

The CLI uses a layered configuration system with environment variable support.

### Configuration Quick Start

Create a minimal configuration:

```yaml
# .infer/config.yaml
gateway:
  url: http://localhost:8080
  docker: true # Use Docker mode (or false for binary mode)
tools:
  enabled: true
  bash:
    enabled: true
agent:
  model: "deepseek/deepseek-chat"
  max_turns: 50
chat:
  theme: tokyo-night
```
### Configuration Layers

1. **Environment Variables** (`INFER_*`) - Highest priority
2. **Command Line Flags**
3. **Project Config** (`.infer/config.yaml`)
4. **Userspace Config** (`~/.infer/config.yaml`)
5. **Built-in Defaults** - Lowest priority

Example:

```bash
# Set via environment variable (highest priority)
export INFER_AGENT_MODEL="openai/gpt-4"

# Or via config file
infer config agent set-model "deepseek/deepseek-chat"

# Or via command flag
infer chat --model "anthropic/claude-4"
```
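The precedence between layers can be pictured with ordinary shell default-expansion. This is a simplified sketch of the idea, not the CLI's actual resolution code; the config-file and default values are stand-ins:

```shell
# Sketch of layered resolution: an environment variable (highest priority)
# wins; otherwise fall back to a config-file value, then a built-in default.
config_file_model="deepseek/deepseek-chat"  # as if read from .infer/config.yaml
default_model="builtin-default-model"       # hypothetical built-in default

resolve_model() {
  echo "${INFER_AGENT_MODEL:-${config_file_model:-$default_model}}"
}

unset INFER_AGENT_MODEL
resolve_model        # falls through to the config-file value

INFER_AGENT_MODEL="openai/gpt-4"
resolve_model        # the environment variable now wins
```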
### Key Configuration Options

- `gateway.url` - Gateway URL (default: `http://localhost:8080`)
- `gateway.docker` - Use Docker mode vs binary mode (default: `true`)
- `tools.enabled` - Enable/disable all tools (default: `true`)
- `agent.model` - Default model for agent operations
- `agent.max_turns` - Maximum turns for agent sessions (default: `50`)
- `chat.theme` - Chat interface theme (default: `tokyo-night`)
### Environment Variables

All configuration can be set via environment variables with the `INFER_` prefix:

```bash
export INFER_GATEWAY_URL="http://localhost:8080"
export INFER_AGENT_MODEL="deepseek/deepseek-chat"
export INFER_TOOLS_BASH_ENABLED=true
export INFER_CHAT_THEME="tokyo-night"
```

Format: `INFER_<PATH>`, where dots become underscores and names are uppercased.
Example: `agent.model` → `INFER_AGENT_MODEL`
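The path-to-variable mapping can be sketched in shell (`to_env_var` is an illustrative helper, not part of the CLI):

```shell
# Sketch: derive the INFER_* variable name from a dotted config path
# (dots become underscores, letters are uppercased).
to_env_var() {
  printf 'INFER_%s\n' "$(printf '%s' "$1" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
}

to_env_var agent.model   # prints: INFER_AGENT_MODEL
to_env_var gateway.url   # prints: INFER_GATEWAY_URL
```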
For complete configuration documentation, including all options and environment variables, see Configuration Reference.
## Tool Approval System

The CLI includes a comprehensive approval system for sensitive tool operations, providing security and visibility into the actions LLMs take.

### How It Works

When a tool requiring approval is executed:

1. **Validation**: Tool arguments are validated
2. **Approval Prompt**: The user sees tool details, including:
   - Tool name and parameters
   - A real-time diff preview (for file modifications)
   - Approve/Reject/Auto-approve options
3. **Execution**: The tool runs only if approved
### Default Approval Requirements

| Tool | Requires Approval | Reason |
| --- | --- | --- |
| Write | Yes | Creates/modifies files |
| Edit | Yes | Modifies file contents |
| MultiEdit | Yes | Multiple file modifications |
| Delete | Yes | Removes files/directories |
| Bash | Optional | Executes system commands |
| Read, Grep, Tree | No | Read-only operations |
| WebSearch, WebFetch | No | External read-only |
| A2A Tools | No | Agent delegation |
### Approval Configuration

Configure approval requirements per tool:

```bash
# Enable/disable approval for specific tools
infer config tools safety enable # Global approval
infer config tools bash enable   # Enable bash tool
```

Or via the configuration file:

```yaml
tools:
  safety:
    require_approval: true # Global default
  write:
    require_approval: true
  bash:
    require_approval: false # Override for bash
```
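The override semantics are: a tool-specific `require_approval` takes precedence over the global `safety` default. A shell sketch of that lookup, purely for illustration (variable names mirror the config keys, not the CLI's internals):

```shell
# Illustrative sketch of per-tool overrides: a tool-specific setting
# wins over the global safety default when present.
global_require_approval=true
bash_require_approval=false   # per-tool override, as in the YAML above

requires_approval() {
  # $1 = tool name; look up "<tool>_require_approval", else the global.
  local override
  eval "override=\${${1}_require_approval:-}"
  echo "${override:-$global_require_approval}"
}

requires_approval bash    # prints: false (per-tool override)
requires_approval write   # prints: true  (falls back to global)
```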
### Approval UI Controls

- `y` / `Enter` - Approve execution
- `n` / `Esc` - Reject execution
- `a` - Auto-approve (disables approval for the session)
## Shortcuts

The CLI provides an extensible shortcuts system for quickly executing common commands with `/shortcut-name` syntax.

### Built-in Shortcuts

**Core:**

- `/clear` - Clear conversation history
- `/exit` - Exit chat session
- `/help [shortcut]` - Show available shortcuts
- `/switch [model]` - Switch to a different model
- `/theme [name]` - Switch chat theme
- `/compact` - Compact the conversation
- `/export [format]` - Export the conversation

**Git Shortcuts** (created by `infer init`):

- `/git-status` - Show working tree status
- `/git-commit` - Generate an AI commit message from staged changes
- `/git-push` - Push commits to the remote
- `/git-log` - Show commit logs

**SCM Shortcuts** (GitHub integration):

- `/scm-issues` - List GitHub issues
- `/scm-issue <number>` - Show issue details
- `/scm-pr-create [context]` - Generate an AI-powered PR plan
### AI-Powered Snippets

Create shortcuts that use LLMs to transform data:

````yaml
# .infer/shortcuts/custom-example.yaml
shortcuts:
  - name: analyze-diff
    description: "Analyze git diff with AI"
    command: bash
    args:
      - -c
      - |
        diff=$(git diff)
        jq -n --arg diff "$diff" '{diff: $diff}'
    snippet:
      prompt: |
        Analyze this diff and suggest improvements:
        ```diff
        {diff}
        ```
      template: |
        ## Analysis
        {llm}
````
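The moving parts fit together like this: the shortcut's command emits JSON, its keys fill `{placeholders}` in `prompt`, and the LLM's answer fills `{llm}` in `template`. A simplified sketch of that substitution using bash pattern replacement (not the CLI's actual templating engine; all values are stand-ins):

```shell
# Simplified sketch of snippet placeholder substitution.
diff_output='- old line
+ new line'
llm_output='Consider extracting this change into a helper.'

prompt_template='Analyze this diff and suggest improvements: {diff}'
result_template='## Analysis
{llm}'

# Bash ${var/pattern/replacement} stands in for the CLI templating.
prompt="${prompt_template/\{diff\}/$diff_output}"
result="${result_template/\{llm\}/$llm_output}"

echo "$prompt"
echo "$result"
```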
### Custom Shortcuts

Create custom shortcuts by adding YAML files to `.infer/shortcuts/`:

```yaml
# .infer/shortcuts/custom-dev.yaml
shortcuts:
  - name: tests
    description: "Run all tests"
    command: go
    args:
      - test
      - ./...
  - name: build
    description: "Build the project"
    command: go
    args:
      - build
      - -o
      - infer
      - .
```

Use with `/tests` or `/build`.

For complete shortcuts documentation, including advanced features and examples, see the Shortcuts Guide.
## Global Flags

- `-v, --verbose`: Enable verbose output
- `--config <path>`: Specify a custom config file path
## Examples

### Basic Workflow

```bash
# Initialize project
infer init

# Start interactive chat
infer chat

# Execute autonomous task
infer agent "Fix the bug in issue #42"

# Check gateway status
infer status
```

### Working on a GitHub Issue

```bash
# Start chat
infer chat

# In chat, use shortcuts to get context
/scm-issue 123

# Discuss with AI, let it use tools to:
# - Read files
# - Search codebase
# - Make changes
# - Run tests

# Generate PR plan when ready
/scm-pr-create Fixes the authentication timeout issue
```

### Configuration Example

```bash
# Set default model
infer config agent set-model "deepseek/deepseek-chat"

# Enable bash tool
infer config tools bash enable

# Configure web search
infer config tools web-search enable

# Check current configuration
infer config show
```
## Development

For development, use Task for build automation:

```bash
task dev   # Format, build, and test
task build # Build binary
task test  # Run tests
```

See CLAUDE.md for detailed development documentation.
## License

MIT License - see the LICENSE file for details.