shinobi
A terminal chat interface for local LLMs.
File over app at the core. Point Shinobi at your existing agent or skill markdown files; there is no need to modify your filesystem or make duplicates. Configure Shinobi to use any folder or Obsidian vault, with no MCP required. Use the /config command to customize Shinobi's settings to your liking.
Talks to any OpenAI-compatible backend (LM Studio, Ollama, etc.) over a streaming API. Stores conversations in SQLite.
your files
your brain
your inference
shinobi
Screenshots
All screenshots and demo media live in assets/.

Watch full streaming demo (mp4)
Install
Requirements: Go 1.21+
go install github.com/liampierc3/shinobi-cli/cmd/shinobi@latest
If needed, ensure your Go bin path is on PATH (usually ~/go/bin).
To build from source instead:
git clone <repo-url> shinobi
cd shinobi
make install
On macOS, Gatekeeper may block the binary. Sign it after install:
codesign --force --deep --sign - $(which shinobi)
Run
shinobi
The first run walks you through connecting to a backend. Config is saved to ~/.shinobi/config.yaml.
Backends
Shinobi has built-in settings for LM Studio and Ollama, and it can be pointed at any other OpenAI-compatible endpoint.
LM Studio (default): http://127.0.0.1:1234/v1
Ollama: http://127.0.0.1:11434/v1
Both can be configured simultaneously — Shinobi will ask which to use on first run and remember your choice.
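For orientation, OpenAI-compatible backends stream chat replies as server-sent events: each `data:` line carries a JSON chunk with a text fragment in `choices[0].delta.content`, and the stream ends with `data: [DONE]`. The sketch below shows one way a client could reassemble a reply from such a stream. The helper names `deltaFromLine` and `collectStream` are hypothetical and are not taken from Shinobi's source.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// chunk mirrors only the fields of an OpenAI-style streaming chunk that carry text.
type chunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// deltaFromLine (hypothetical helper) extracts the text fragment from one
// server-sent-event line. done is true once "data: [DONE]" is seen.
func deltaFromLine(line string) (text string, done bool, err error) {
	line = strings.TrimSpace(line)
	if !strings.HasPrefix(line, "data:") {
		return "", false, nil // blank separators, comments, other SSE fields
	}
	payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
	if payload == "[DONE]" {
		return "", true, nil
	}
	var c chunk
	if err := json.Unmarshal([]byte(payload), &c); err != nil {
		return "", false, err
	}
	if len(c.Choices) == 0 {
		return "", false, nil
	}
	return c.Choices[0].Delta.Content, false, nil
}

// collectStream (hypothetical helper) joins every fragment in a streamed body
// and stops at the [DONE] terminator.
func collectStream(body string) (string, error) {
	var out strings.Builder
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		text, done, err := deltaFromLine(sc.Text())
		if err != nil {
			return "", err
		}
		if done {
			break
		}
		out.WriteString(text)
	}
	return out.String(), sc.Err()
}

func main() {
	sample := "data: {\"choices\":[{\"delta\":{\"content\":\"Hel\"}}]}\n\n" +
		"data: {\"choices\":[{\"delta\":{\"content\":\"lo\"}}]}\n\n" +
		"data: [DONE]\n"
	reply, err := collectStream(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply) // prints "Hello"
}
```

A real client would read from the HTTP response body rather than a string, but the per-line handling would be the same.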
Config
~/.shinobi/config.yaml — see config.example.yaml for a full reference.
lmstudio_url: "http://127.0.0.1:1234/v1"
ollama_url: ""
default_model: "" # Leave empty to auto-select first available model
default_agent: "" # Agent name to load on startup (optional)
brave_api_key: "" # Brave Search API key — enables web search tool when set
Model Runtime Tuning
If replies are inconsistent (long stalls, random unloads, or generated tokens not appearing), tune runtime settings before changing prompts.
These setting names are LM Studio/llama.cpp-style and may differ in Ollama or other backends.
General starting points for larger local models:
- Max Concurrent Predictions: 1
- Evaluation Batch Size: 128 to 256 (start at 128 for stability)
- Context Length: 8192 (increase only after stable runs)
- KV Cache Offload: ON (if available)
- Flash Attention: ON (if supported by your backend/model)
- Keep Model in Memory: ON
- mmap: ON (if available)
- MoE Experts / Layers: use model defaults first
Troubleshooting:
- Resource/unload errors: lower context length and concurrent predictions.
- Stutters or very uneven token rate: lower evaluation batch size.
- Works in the backend UI but chat in Shinobi looks inconsistent: update Shinobi, and set backend/model settings per model rather than globally.
Agents
Agents are markdown files with YAML frontmatter. Drop them in any directory listed under agent_dirs in your config.
---
name: myagent
description: What this agent does
model: "" # optional — override the active model for this agent
---
You are Shinobi, a local-first assistant.
Built-in example agents ship inside the binary and appear under the shinobi group in /agent.
Switch agents with /agent or press Tab when the input is empty.
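To make the file format concrete, here is a minimal sketch of separating the frontmatter from the prompt body in an agent file. It assumes flat `key: value` lines only and uses just the standard library; Shinobi itself most likely uses a full YAML parser, and the helper name `splitFrontmatter` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter (hypothetical helper) separates a leading "---" delimited
// block from the body. Only flat "key: value" lines are read; this is not a
// YAML parser. Files without a complete frontmatter block are returned as body.
func splitFrontmatter(doc string) (map[string]string, string) {
	lines := strings.Split(strings.ReplaceAll(doc, "\r\n", "\n"), "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return map[string]string{}, doc
	}
	meta := map[string]string{}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == "---" {
			body := strings.Join(lines[i+1:], "\n")
			return meta, strings.TrimSpace(body)
		}
		key, val, ok := strings.Cut(lines[i], ":")
		if !ok {
			continue
		}
		// Drop a trailing " # comment", then surrounding spaces and quotes.
		if j := strings.Index(val, " #"); j >= 0 {
			val = val[:j]
		}
		val = strings.Trim(strings.TrimSpace(val), "\"'")
		meta[strings.TrimSpace(key)] = val
	}
	// No closing delimiter: treat the whole file as body.
	return map[string]string{}, doc
}

func main() {
	agent := "---\n" +
		"name: myagent\n" +
		"description: What this agent does\n" +
		"model: \"\" # optional\n" +
		"---\n" +
		"You are Shinobi, a local-first assistant.\n"
	meta, body := splitFrontmatter(agent)
	fmt.Println(meta["name"]) // prints "myagent"
	fmt.Println(body)         // prints "You are Shinobi, a local-first assistant."
}
```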
Skills
Skills are markdown files loaded as system context. They extend or modify the model's behavior for a session.
---
name: my-skill
description: What this skill does
---
When responding, always do X.
Skills live in any directory you configure under skills_dir. Auto-load skills at startup with auto_load_skills. Apply mid-session with /skill.
Supported skill layouts under skills_dir:
skills_dir/<skill-name>/SKILL.md (existing format)
skills_dir/<skill-name>.md (flat-file compatibility)
If both exist for the same skill, skills_dir/<skill-name>/SKILL.md takes precedence.
For flat files, frontmatter is optional. A missing name falls back to the filename, and a missing description falls back to the file's heading, or to the name if there is no heading.
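The lookup and fallback rules for skills can be sketched as follows. This is an illustration under stated assumptions, not Shinobi's actual code: the helper names are hypothetical, "heading" is taken to mean the first Markdown `#` heading in the file, and the file-existence check is passed in as a function so the logic can run without touching the disk.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resolveSkillPath (hypothetical helper) prefers <dir>/<name>/SKILL.md and
// falls back to the flat file <dir>/<name>.md.
func resolveSkillPath(skillsDir, name string, exists func(string) bool) (string, bool) {
	nested := filepath.Join(skillsDir, name, "SKILL.md")
	if exists(nested) {
		return nested, true
	}
	flat := filepath.Join(skillsDir, name+".md")
	if exists(flat) {
		return flat, true
	}
	return "", false
}

// firstHeading returns the text of the first Markdown "#" heading, if any.
// Treating "heading" as the first ATX heading is an assumption.
func firstHeading(body string) string {
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "#") {
			return strings.TrimSpace(strings.TrimLeft(line, "#"))
		}
	}
	return ""
}

// skillMeta (hypothetical helper) fills in missing frontmatter fields for a
// flat-file skill: name from the filename, description from the heading,
// then from the name.
func skillMeta(filename, name, description, body string) (string, string) {
	if name == "" {
		base := filepath.Base(filename)
		name = strings.TrimSuffix(base, filepath.Ext(base))
	}
	if description == "" {
		description = firstHeading(body)
	}
	if description == "" {
		description = name
	}
	return name, description
}

func main() {
	onDisk := map[string]bool{
		filepath.Join("skills", "review", "SKILL.md"): true,
		filepath.Join("skills", "review.md"):          true,
	}
	exists := func(p string) bool { return onDisk[p] }

	// Both layouts exist, so the nested SKILL.md is chosen.
	path, _ := resolveSkillPath("skills", "review", exists)
	fmt.Println(path)

	name, desc := skillMeta("skills/terse.md", "", "", "# Keep answers short\nAnswer briefly.")
	fmt.Println(name, "-", desc) // prints "terse - Keep answers short"
}
```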
Web Search
Configure any one search provider in your config and web search is automatically available to the model as a tool.
| Provider | Config key | Notes |
|---|---|---|
| Brave | brave_api_key | |
| Tavily | tavily_api_key | Built for AI agents, generous free tier |
| SerpAPI | serpapi_key | Google results, paid |
| DuckDuckGo | duckduckgo_enabled: true | No key required, instant answers only |
If multiple keys are set, priority is Brave → Tavily → SerpAPI → DuckDuckGo.
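The priority order above amounts to a first-match check. A minimal sketch, assuming a hypothetical `searchConfig` struct whose fields mirror the config keys in the table (the struct, the function name, and the returned labels are illustrative, not Shinobi's API):

```go
package main

import "fmt"

// searchConfig is a hypothetical struct mirroring the documented config keys.
type searchConfig struct {
	BraveAPIKey       string // brave_api_key
	TavilyAPIKey      string // tavily_api_key
	SerpAPIKey        string // serpapi_key
	DuckDuckGoEnabled bool   // duckduckgo_enabled
}

// pickProvider returns the first configured provider in the documented
// priority order, or "" when web search is not configured at all.
func pickProvider(c searchConfig) string {
	switch {
	case c.BraveAPIKey != "":
		return "brave"
	case c.TavilyAPIKey != "":
		return "tavily"
	case c.SerpAPIKey != "":
		return "serpapi"
	case c.DuckDuckGoEnabled:
		return "duckduckgo"
	}
	return ""
}

func main() {
	cfg := searchConfig{TavilyAPIKey: "example-key", DuckDuckGoEnabled: true}
	fmt.Println(pickProvider(cfg)) // prints "tavily"
}
```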
Commands
| Command | Description |
|---|---|
| /agent | Switch agent |
| /skill | Apply a skill |
| /new | Start a fresh chat |
| /history | Browse past conversations |
| /principles | Show design principles |
| /help | Toggle help panel |
| /menu | Open command menu |
Keyboard
| Key | Action |
|---|---|
| Enter | Send message |
| / | Open command menu (press / again to close open menus/panels) |
| Tab | Cycle agents (when input is empty) |
| Esc | Cancel streaming / close menu |
| Ctrl+C | Quit |
| ↑ / ↓ | Scroll chat |
| PgUp / PgDn | Page scroll |
Environment Variables
| Variable | Description |
|---|---|
| SHINOBI_BACKEND_URL | Override backend URL |
| SHINOBI_BACKEND_API_KEY | Override API key |
| SHINOBI_AGENT_DIR | Override agent directories (: separated) |
| SHINOBI_REQUEST_TIMEOUT | Request timeout in seconds (default: 120) |
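As an illustration of how two of these overrides could be interpreted, the sketch below splits `SHINOBI_AGENT_DIR` on `:` and reads `SHINOBI_REQUEST_TIMEOUT` with the documented 120-second default. The function names are hypothetical, and falling back to the default on an invalid or non-positive value is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// agentDirs (hypothetical helper) splits SHINOBI_AGENT_DIR on ":" and skips
// empty entries. getenv is injected so the logic is easy to exercise.
func agentDirs(getenv func(string) string) []string {
	var dirs []string
	for _, d := range strings.Split(getenv("SHINOBI_AGENT_DIR"), ":") {
		if d = strings.TrimSpace(d); d != "" {
			dirs = append(dirs, d)
		}
	}
	return dirs
}

// requestTimeout (hypothetical helper) reads SHINOBI_REQUEST_TIMEOUT as whole
// seconds. Unset, invalid, or non-positive values yield the 120 s default
// (the handling of invalid values is an assumption).
func requestTimeout(getenv func(string) string) time.Duration {
	const defaultSeconds = 120
	n, err := strconv.Atoi(strings.TrimSpace(getenv("SHINOBI_REQUEST_TIMEOUT")))
	if err != nil || n <= 0 {
		n = defaultSeconds
	}
	return time.Duration(n) * time.Second
}

func main() {
	fmt.Println("agent dirs:", agentDirs(os.Getenv))
	fmt.Println("timeout:", requestTimeout(os.Getenv))
}
```

For example, running with `SHINOBI_AGENT_DIR=~/agents:~/vault/agents` would yield two directories under this sketch.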
Storage
Conversations are saved to ~/.shinobi/conversations.db (SQLite).
Dependencies