docsiq

command module
v0.1.1
Published: May 3, 2026 License: MIT Imports: 1 Imported by: 0

A single-binary GraphRAG knowledge base: index documents, extract an entity graph, ask questions across it, and browse the result in an embedded React UI or query it over MCP.

Three-minute onboarding

# 1. Install (Linux amd64 shown; macOS arm64 is published alongside)
VERSION=$(curl -s https://api.github.com/repos/RandomCodeSpace/docsiq/releases/latest | grep tag_name | cut -d '"' -f4)
curl -LO "https://github.com/RandomCodeSpace/docsiq/releases/latest/download/docsiq-${VERSION}-linux-amd64"
chmod +x "docsiq-${VERSION}-linux-amd64" && sudo mv "docsiq-${VERSION}-linux-amd64" /usr/local/bin/docsiq

# 2. Index the sample corpus
git clone https://github.com/RandomCodeSpace/docsiq && cd docsiq
docsiq init && docsiq index docs/samples/

# 3. Ask a question
docsiq search "What are the main themes in this corpus?"

For a UI session:

docsiq serve
# → http://localhost:8080 (or the server.port set in ~/.docsiq/config.yaml)

Full walk-through with expected output: docs/quickstart.md.

What it does

docsiq is a GraphRAG-powered knowledge base that runs as a single Go binary. It ingests unstructured documents, builds a knowledge graph with community detection, persists wikilinked markdown notes, and exposes the whole thing over MCP + an embedded React SPA on one port.

Inspired by Microsoft GraphRAG; storage is CGO-backed SQLite (mattn/go-sqlite3 with FTS5) + the sqlite-vec extension for ANN vector search.

Features

  • GraphRAG pipeline — load → chunk → embed → extract entities / relationships / claims → detect communities, all in one docsiq index run.
  • Notes subsystem — markdown on disk with [[wikilinks]], project scopes, cross-project references, and a live note graph view. Works without any LLM configured.
  • Interactive graph — SVG force-directed viz with d3-zoom (pinch/wheel pan/zoom 0.1×–40×), hover-to-highlight neighbourhood, degree-scaled nodes.
  • Community detection — pure-Go Louvain, hierarchical, no external deps.
  • Three LLM providers — Azure OpenAI, OpenAI, Ollama — via tmc/langchaingo. Set provider: "none" to run the server in notes-only mode with no LLM.
  • MCP server — 12+ tools (local/global search, graph walk, community reports, note read/write, …) exposed at /mcp via Streamable HTTP transport with session handshake.
  • Embedded SPA — React 19 + Tailwind 4 + shadcn/ui, served from //go:embed ui/dist. PWA-installable with manifest + service worker.
  • Per-repo projects — each scope has its own SQLite store + notes directory, addressable by slug.
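
The notes subsystem links markdown files through [[wikilinks]]. The README does not show how docsiq parses them, so the following is only a minimal sketch of the idea: extractWikilinks is a hypothetical helper, and the [[target|alias]] form is an assumption borrowed from common wikilink dialects rather than something the project documents.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wikilinkRe matches [[target]] and [[target|alias]].
// The alias form is an assumption, not confirmed for docsiq.
var wikilinkRe = regexp.MustCompile(`\[\[([^\[\]]+)\]\]`)

// extractWikilinks returns the distinct link targets found in md,
// in the order they first appear.
func extractWikilinks(md string) []string {
	seen := make(map[string]bool)
	var targets []string
	for _, m := range wikilinkRe.FindAllStringSubmatch(md, -1) {
		// Keep only the part before an optional "|alias".
		target, _, _ := strings.Cut(m[1], "|")
		target = strings.TrimSpace(target)
		if target == "" || seen[target] {
			continue
		}
		seen[target] = true
		targets = append(targets, target)
	}
	return targets
}

func main() {
	note := "See [[architecture]] and [[roadmap|the plan]]; again [[architecture]]."
	fmt.Println(extractWikilinks(note)) // [architecture roadmap]
}
```

Each distinct target becomes one outgoing edge from the note, which is the raw material a note graph view needs.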

UI

  • Stack: React 19, Vite 6, Tailwind 4, shadcn/ui primitives, Geist typography, Lucide icons.
  • Architecture: CSS lives in a single globals.css with an @layer components section; JSX uses semantic class names only; shadcn primitives are the only place Tailwind utilities live inline.
  • Navigation: labelled sidebar (Home · Notes · Documents · Graph · MCP) with ⌘K command palette.
  • Responsiveness: mobile drawer via shadcn Sheet; iOS safe-area respected; inputs forced to 16px below sm: to kill Safari auto-zoom.
  • PWA: manifest + 192/512 PNG icons + minimal service worker, installable on Android/iOS.
  • Hard reload: refresh button in the header purges service worker + CacheStorage and reloads from network — mobile-friendly ⌘⇧R substitute.

Keyboard shortcuts

Key            Action
⌘K / Ctrl+K    Command palette
G H            Home
G N            Notes
G D            Documents
G G            Graph
G M            MCP console
⌘/             Toggle tree drawer (Notes)
⌘L             Toggle links drawer (Notes)

MCP

docsiq speaks the MCP Streamable HTTP transport at POST /mcp. The UI's MCP Console (inspector-style) shows the same tool list that external clients see, with typed argument forms. For external clients (Claude Desktop, Cursor, etc.), register the server URL directly or use the hooks helper:

docsiq hooks install --client claude-desktop
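
For a client that talks to /mcp directly, a Streamable HTTP session opens with a JSON-RPC initialize request. The sketch below only builds that request. It assumes the server is reachable at a base URL such as http://localhost:8080, that /mcp honours the same Bearer key as the rest of the API, and that protocol revision 2025-03-26 is acceptable to the server; none of these are confirmed by this README. newInitializeRequest and the client name are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newInitializeRequest builds the first request of an MCP Streamable HTTP
// session. baseURL and apiKey are placeholders supplied by the caller.
func newInitializeRequest(baseURL, apiKey string) (*http.Request, error) {
	payload := map[string]any{
		"jsonrpc": "2.0",
		"id":      1,
		"method":  "initialize",
		"params": map[string]any{
			"protocolVersion": "2025-03-26", // assumed revision
			"capabilities":    map[string]any{},
			"clientInfo":      map[string]any{"name": "example-client", "version": "0.0.1"},
		},
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/mcp", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	// Streamable HTTP clients accept either a JSON body or an SSE stream.
	req.Header.Set("Accept", "application/json, text/event-stream")
	if apiKey != "" {
		// Assumption: /mcp uses the same Bearer scheme as server.api_key.
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	req, err := newInitializeRequest("http://localhost:8080", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String()) // POST http://localhost:8080/mcp
}
```

In the Streamable HTTP transport, a server that tracks sessions returns an Mcp-Session-Id response header on initialize, and the client echoes it on every later request.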

Architecture

cmd/            CLI commands (cobra): index, serve, search, projects, init, hooks, vec
internal/
  api/          REST API + /mcp handler
  chunker/      Text splitting (textsplitter.RecursiveCharacter)
  community/    Louvain detection + summaries
  config/       Viper YAML config + env override
  crawler/      Web page crawler
  embedder/     Batched text → vector (nil-safe when provider=none)
  extractor/    LLM-based entity / relationship / claim extraction
  llm/          Provider abstraction (Azure, OpenAI, Ollama, none)
  loader/       Document loaders (PDF, DOCX, TXT, MD, web)
  mcp/          Streamable HTTP MCP server (12+ tools)
  notes/        Per-project markdown + wikilinks + graph builder
  pipeline/     5-phase indexing pipeline
  project/      Project registry (git-remote-scoped slugs)
  search/       Query engine (local + global + hybrid)
  store/        SQLite + FTS5 + vector index
  vectorindex/  HNSW ANN vector search
ui/             React 19 + Vite 6 SPA, embedded at compile time
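
The community/ package is described as Louvain detection. Louvain works by greedily moving nodes between communities to increase modularity, so the quantity it optimises is easy to illustrate even without the project's code. The following is a minimal sketch for an undirected, unweighted graph; the edge type and the modularity function are hypothetical names, not docsiq's API.

```go
package main

import "fmt"

// edge is an undirected, unweighted edge between two node indices.
type edge struct{ a, b int }

// modularity computes Q = sum over communities c of (L_c/m - (d_c/(2m))^2),
// where m is the number of edges, L_c the number of edges with both
// endpoints in c, and d_c the total degree of the nodes in c.
// community[i] is the community label of node i.
func modularity(edges []edge, community []int) float64 {
	m := float64(len(edges))
	if m == 0 {
		return 0
	}
	internal := map[int]float64{} // L_c per community
	degree := map[int]float64{}   // d_c per community
	for _, e := range edges {
		ca, cb := community[e.a], community[e.b]
		degree[ca]++
		degree[cb]++
		if ca == cb {
			internal[ca]++
		}
	}
	q := 0.0
	for c, d := range degree {
		frac := d / (2 * m)
		q += internal[c]/m - frac*frac
	}
	return q
}

func main() {
	// Two triangles joined by a single bridge edge (2-3).
	edges := []edge{{0, 1}, {1, 2}, {0, 2}, {3, 4}, {4, 5}, {3, 5}, {2, 3}}
	split := []int{0, 0, 0, 1, 1, 1}  // one community per triangle
	merged := []int{0, 0, 0, 0, 0, 0} // everything in one community
	fmt.Printf("split:  %.4f\n", modularity(edges, split))  // split:  0.3571
	fmt.Printf("merged: %.4f\n", modularity(edges, merged)) // merged: 0.0000
}
```

The split assignment scores 5/14, higher than the merged one, which is why a modularity-driven method keeps the two triangles apart.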

Configuration

Config lives at ~/.docsiq/config.yaml; every key can be overridden by an env var with prefix DOCSIQ_ (dots → underscores, uppercased). A fully annotated reference with every option, default, and env var is at configs/docsiq.example.yaml.

server:
  host: 0.0.0.0
  port: 37778
  api_key: ""          # if set, UI + API require Authorization: Bearer <key>

llm:
  provider: ollama     # azure | openai | ollama | none
  ollama:
    base_url: http://localhost:11434
    chat_model: llama3.2
    embed_model: nomic-embed-text
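
The env override rule (prefix DOCSIQ_, dots to underscores, uppercased) can be sketched as a small mapping function. envKey is a hypothetical helper for illustration only; the project's config/ package is Viper-based, where the same effect is normally achieved with an env prefix plus a key replacer.

```go
package main

import (
	"fmt"
	"strings"
)

// envKey maps a dotted config key to the env var that would override it,
// following the rule stated above: DOCSIQ_ prefix, dots become underscores,
// and the whole name is uppercased.
func envKey(configKey string) string {
	return "DOCSIQ_" + strings.ToUpper(strings.ReplaceAll(configKey, ".", "_"))
}

func main() {
	for _, k := range []string{"server.port", "llm.provider", "llm.ollama.base_url"} {
		fmt.Printf("%s -> %s\n", k, envKey(k))
	}
	// server.port -> DOCSIQ_SERVER_PORT
	// llm.provider -> DOCSIQ_LLM_PROVIDER
	// llm.ollama.base_url -> DOCSIQ_LLM_OLLAMA_BASE_URL
}
```

Under this rule, DOCSIQ_LLM_PROVIDER=none would switch the server to notes-only mode without editing the YAML file.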

No LLM? Set provider: none. The server still runs notes, wikilinks, graph, tree, and notes-search. Endpoints that need the model (POST /api/search, POST /api/upload, /mcp tool calls that embed or extract) return 503 {"code": "llm_disabled"}.
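
A client can tell notes-only mode apart from a genuine outage by inspecting the 503 body. The sketch below assumes the error body is a JSON object whose code field carries the value shown above; checkLLMResponse and ErrLLMDisabled are hypothetical names, and any other fields in the real response are unknown.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// ErrLLMDisabled signals that the server runs with provider: none.
var ErrLLMDisabled = errors.New("docsiq: LLM provider is disabled")

// apiError mirrors only the field documented in the README.
type apiError struct {
	Code string `json:"code"`
}

// checkLLMResponse classifies a response by status code and body.
// It returns nil for anything other than a 503, ErrLLMDisabled for the
// documented llm_disabled code, and a descriptive error otherwise.
func checkLLMResponse(status int, body []byte) error {
	if status != http.StatusServiceUnavailable {
		return nil
	}
	var e apiError
	if err := json.Unmarshal(body, &e); err != nil {
		return fmt.Errorf("503 with unreadable body: %w", err)
	}
	if e.Code == "llm_disabled" {
		return ErrLLMDisabled
	}
	return fmt.Errorf("503 with code %q", e.Code)
}

func main() {
	err := checkLLMResponse(503, []byte(`{"code": "llm_disabled"}`))
	fmt.Println(errors.Is(err, ErrLLMDisabled)) // true
}
```

A caller that sees ErrLLMDisabled can fall back to the notes endpoints instead of retrying.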

Build from source

Prerequisites: Go ≥ 1.25, Node ≥ 22, and a working C toolchain (build-essential on Debian/Ubuntu, xcode-select --install on macOS, MinGW-w64 / MSYS2 on Windows). Without gcc on PATH, Go silently disables CGO. Because internal/sqlitevec/load.go is gated by //go:build cgo, the build then fails at the call site with a misleading undefined: sqlitevec.LoadInto rather than a clear toolchain error. Full prerequisite list: docs/getting-started.md.

# First time on a connected machine
npm --prefix ui ci                          # install UI deps
go mod download                             # Go deps

# Build
npm --prefix ui run build                   # produces ui/dist/
CGO_ENABLED=1 go build -tags sqlite_fts5 -o docsiq ./

CI builds the UI first and passes ui/dist/ to each Go job as an artifact. ui/dist/ is not committed; only a tiny placeholder ui/dist/index.html exists in the repo to keep //go:embed ui/dist happy at compile time.
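
To illustrate why a lone index.html is enough for the embed to work, here is a sketch of how a single-page app is often served from an embedded filesystem: known files are returned as they are, and every other path falls back to index.html. The fallback behaviour is an assumption about docsiq, not something this README states. demoDist uses an in-memory filesystem as a stand-in for the embed.FS that //go:embed ui/dist would produce, and resolveAsset is a hypothetical helper.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// demoDist stands in for the embedded ui/dist tree.
func demoDist() fs.FS {
	return fstest.MapFS{
		"index.html":    {Data: []byte("<!doctype html>")},
		"assets/app.js": {Data: []byte("console.log('hi')")},
	}
}

// resolveAsset maps a URL path to a file inside dist. Paths that do not
// name a regular file resolve to index.html so client-side routes load
// the app shell (assumed behaviour).
func resolveAsset(dist fs.FS, urlPath string) string {
	name := strings.TrimPrefix(urlPath, "/")
	if name == "" {
		return "index.html"
	}
	if info, err := fs.Stat(dist, name); err == nil && !info.IsDir() {
		return name
	}
	return "index.html"
}

func main() {
	dist := demoDist()
	for _, p := range []string{"/", "/assets/app.js", "/notes/42"} {
		fmt.Printf("%s -> %s\n", p, resolveAsset(dist, p))
	}
	// / -> index.html
	// /assets/app.js -> assets/app.js
	// /notes/42 -> index.html
}
```

With only the placeholder present, every path resolves to index.html, so the binary still compiles and serves a page before the real UI build is dropped in.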

Tests

# Go
CGO_ENABLED=1 go test -tags sqlite_fts5 ./...
# Go integration tests with the race detector
CGO_ENABLED=1 go test -tags "sqlite_fts5 integration" -race -timeout 1200s ./...

# UI
npm --prefix ui run typecheck
npm --prefix ui test -- --run --coverage
npm --prefix ui run build

License

MIT. See LICENSE.

Directories

Path             Synopsis
internal/
  api
  buildinfo      Package buildinfo resolves the running binary's version metadata.
  crawler        Package crawler discovers and fetches pages from documentation websites.
  hookinstaller  Package hookinstaller registers docsiq's SessionStart hook with the various AI clients (Claude Code, Cursor, GitHub Copilot, Codex CLI).
  llm            Package llm: HTTP client pooling per provider (Block 3.5).
  llm/mock       Package mock provides a deterministic llm.Provider implementation for tests.
  mcp
  notes          Package notes implements the disk-backed notes subsystem ported from kgraph.
  obs            Package obs wires Prometheus metrics for docsiq.
  project        Package project provides per-project identity, registry, and per-project SQLite storage for docsiq.
  sqlitevec      Package sqlitevec ships the asg017/sqlite-vec C extension embedded in the binary and extracts it to $DATA_DIR/ext/ at runtime so the mattn driver can LOAD EXTENSION it.
  vectorindex    Package vectorindex provides an in-memory HNSW vector index for fast approximate nearest-neighbor search over chunk/entity embeddings.
  workq          Package workq is a minimal bounded worker pool for fire-and-forget background work (e.g.
