Published: Mar 8, 2026 License: MIT


shinobi

A terminal chat interface for local LLMs.

File over app is the core principle. Point Shinobi at your existing agent or skill markdown files; there is no need to restructure your filesystem or keep duplicates. It works against any folder or Obsidian vault, no MCP server required. Use the /config command to customize all Shinobi settings to your liking.

Talks to any OpenAI-compatible backend (LM Studio, Ollama, etc.) over a streaming API. Stores conversations in SQLite.

your files
your brain
your inference

shinobi

Screenshots

All screenshots and demo media live in assets/.

  • Shinobi chat UI
  • Command menu
  • Streaming demo

Watch full streaming demo (mp4)


Install

Requirements: Go 1.21+

go install github.com/liampierc3/shinobi-cli/cmd/shinobi@latest

If needed, ensure your Go bin path is on PATH (usually ~/go/bin).
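
If the shinobi command is not found after install, the Go bin directory is likely missing from PATH. A minimal fix, assuming the default GOPATH layout:

```shell
# Append the default Go bin directory to PATH for the current shell.
# Add this line to ~/.zshrc or ~/.bashrc to make it permanent.
export PATH="$PATH:$HOME/go/bin"
```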

Or build from source:

git clone <repo-url> shinobi
cd shinobi
make install

On macOS, Gatekeeper may block the binary. Sign it after install:

codesign --force --deep --sign - $(which shinobi)

Run

shinobi

First run walks you through connecting to a backend. Config is saved to ~/.shinobi/config.yaml.


Backends

Shinobi supports LM Studio and Ollama. Point it at any OpenAI-compatible endpoint.

LM Studio (default): http://127.0.0.1:1234/v1 Ollama: http://127.0.0.1:11434/v1

Both can be configured simultaneously — Shinobi will ask which to use on first run and remember your choice.
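
Before first run, you can sanity-check that a backend is actually listening. This is just a curl probe against the standard OpenAI-compatible /v1/models endpoint (the LM Studio default URL is shown; swap in the Ollama URL as needed):

```shell
# Probe the LM Studio endpoint (use http://127.0.0.1:11434/v1/models for Ollama).
# -s silences progress output; -f makes curl exit nonzero on HTTP errors.
if curl -sf http://127.0.0.1:1234/v1/models > /dev/null 2>&1; then
  STATUS=up
else
  STATUS=down
fi
echo "backend is $STATUS"
```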


Config

~/.shinobi/config.yaml — see config.example.yaml for a full reference.

lmstudio_url: "http://127.0.0.1:1234/v1"
ollama_url: ""

default_model: ""       # Leave empty to auto-select first available model
default_agent: ""       # Agent name to load on startup (optional)

brave_api_key: ""       # Brave Search API key — enables web search tool when set
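
A fuller sketch combining the keys mentioned elsewhere in this README (agent_dirs, skills_dir, auto_load_skills); the directory values are illustrative, and exact key names should be checked against config.example.yaml:

```yaml
lmstudio_url: "http://127.0.0.1:1234/v1"
ollama_url: ""

default_model: ""
default_agent: ""

# Directories scanned for agent markdown files (see Agents below)
agent_dirs:
  - ~/notes/agents

# Directory scanned for skills (see Skills below)
skills_dir: ~/notes/skills

# Skills applied automatically at startup
auto_load_skills:
  - my-skill
```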

Model Runtime Tuning

If replies are inconsistent (long stalls, random unloads, or generated tokens not appearing), tune runtime settings before changing prompts.

These setting names are LM Studio/llama.cpp-style and may differ in Ollama or other backends.

General starting points for larger local models:

  • Max Concurrent Predictions: 1
  • Evaluation Batch Size: 128 to 256 (start at 128 for stability)
  • Context Length: 8192 (increase only after stable runs)
  • KV Cache Offload: ON (if available)
  • Flash Attention: ON (if supported by your backend/model)
  • Keep Model in Memory: ON
  • mmap: ON (if available)
  • MoE Experts / Layers: use model defaults first
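
For Ollama, some of these knobs map to different mechanisms. As one hedged example, context length is set per-model with a Modelfile PARAMETER (the base model name here is illustrative):

```
# Hypothetical Modelfile raising the context window for a local model
FROM llama3
PARAMETER num_ctx 8192
```

Flash attention, where supported, is enabled on the server side via OLLAMA_FLASH_ATTENTION=1 when starting ollama serve.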

Troubleshooting:

  • Resource/unload errors: lower context length and concurrent predictions.
  • Stutters or very uneven token rate: lower evaluation batch size.
  • Works in backend UI but chat looks inconsistent: update Shinobi and keep backend/model settings per-model instead of global.

Agents

Agents are markdown files with YAML frontmatter. Drop them in any directory listed under agent_dirs in your config.

---
name: myagent
description: What this agent does
model: ""          # optional — override the active model for this agent
---
You are Shinobi, a local-first assistant.

Built-in example agents ship inside the binary and appear under the shinobi group in /agent. Switch agents with /agent or press Tab when the input is empty.


Skills

Skills are markdown files loaded as system context. They extend or modify the model's behavior for a session.

---
name: my-skill
description: What this skill does
---
When responding, always do X.

Skills live in any directory you configure under skills_dir. Auto-load skills at startup with auto_load_skills. Apply mid-session with /skill.

Supported skill layouts under skills_dir:

  • skills_dir/<skill-name>/SKILL.md (existing format)
  • skills_dir/<skill-name>.md (flat-file compatibility)

If both exist for the same skill, skills_dir/<skill-name>/SKILL.md takes precedence. For flat files, frontmatter is optional: a missing name falls back to the filename, and a missing description falls back to the heading (or the name).
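
Concretely, a skills_dir containing both layouts might look like this (skill names are illustrative):

```
skills/
├── research/
│   └── SKILL.md      # directory layout: wins if both exist for "research"
└── summarize.md      # flat-file layout
```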


Web Search

Set any one search provider key in config and web search is automatically available to the model as a tool.

Provider Config key Notes
Brave brave_api_key
Tavily tavily_api_key Built for AI agents, generous free tier
SerpAPI serpapi_key Google results, paid
DuckDuckGo duckduckgo_enabled: true No key required, instant answers only

If multiple keys are set, priority is Brave → Tavily → SerpAPI → DuckDuckGo.
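
For example, with this config both Brave and DuckDuckGo are enabled, and Brave is used because it sits first in the priority order (the key value is an illustrative placeholder, not a real key):

```yaml
brave_api_key: "BSA-example-key"
duckduckgo_enabled: true
```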


Commands

Command Description
/agent Switch agent
/skill Apply a skill
/new Start a fresh chat
/history Browse past conversations
/principles Show design principles
/help Toggle help panel
/menu Open command menu

Keyboard

Key Action
Enter Send message
/ Open command menu (press / again to close open menus/panels)
Tab Cycle agents (when input is empty)
Esc Cancel streaming / close menu
Ctrl+C Quit
↑ / ↓ Scroll chat
PgUp / PgDn Page scroll

Environment Variables

Variable Description
SHINOBI_BACKEND_URL Override backend URL
SHINOBI_BACKEND_API_KEY Override API key
SHINOBI_AGENT_DIR Override agent directories (: separated)
SHINOBI_REQUEST_TIMEOUT Request timeout in seconds (default: 120)
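
Environment variables override the config file for a single invocation, which is handy when testing an alternate backend. A sketch (the URL and timeout values are illustrative):

```shell
# Point this run at Ollama and allow slower responses.
export SHINOBI_BACKEND_URL="http://127.0.0.1:11434/v1"
export SHINOBI_REQUEST_TIMEOUT=300

# shinobi   # uncomment to launch with these overrides
echo "$SHINOBI_BACKEND_URL"
```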

Storage

Conversations are saved to ~/.shinobi/conversations.db (SQLite).

