# code-index

Semantic code search for AI coding assistants. Parses your source code, generates LLM-powered summaries, and builds a searchable vector index exposed via MCP (Model Context Protocol).
## What it does

`code-index` lets AI assistants like Claude Code search your codebase by concept instead of exact keywords:

- "check if string is in slice" → finds `StringInSlice` in `util.go`
- "how does authentication work" → finds auth packages, token handlers, middleware
- "database transaction management" → finds `BeginTransaction`, `CommitTransaction` across packages

It works by generating natural-language summaries of every function, file, and package, then embedding them as vectors for similarity search.
## Supported languages

| Language | Parser |
|---|---|
| Go | Native `go/ast` |
| TypeScript / Vue | tree-sitter |
| Python | tree-sitter |
| C / C++ | tree-sitter |
| R | Native via `Rscript` (regex fallback) |
| Markdown / Quarto | Regex |
## Quick start

### 1. Install

Download a pre-built binary from Releases, or install from source:

```sh
go install github.com/posit-dev/code-index/cmd/code-index@latest
```
### 2. Configure

Create `.code-index.json` in your repository root (see Configuration):
```json
{
  "project": "my-project",
  "sources": [
    {"path": "src", "language": "go"},
    {"path": "frontend", "language": "typescript"}
  ],
  "llm": {
    "provider": "openai",
    "base_url": "http://localhost:11434/v1",
    "function_model": "llama3.2",
    "summary_model": "llama3.2"
  },
  "embeddings": {
    "provider": "openai",
    "base_url": "http://localhost:11434/v1",
    "model": "nomic-embed-text"
  }
}
```
This example uses Ollama for fully local operation. You can also use OpenAI, AWS Bedrock, or any OpenAI-compatible API.
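The same schema can point at a hosted provider instead. A sketch of the `llm` and `embeddings` fragments for OpenAI's public API (the model names here are illustrative, and this assumes the client falls back to the provider's default endpoint and environment API key when `base_url` is omitted):

```json
{
  "llm": {
    "provider": "openai",
    "function_model": "gpt-4o-mini",
    "summary_model": "gpt-4o"
  },
  "embeddings": {
    "provider": "openai",
    "model": "text-embedding-3-small"
  }
}
```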
### 3. Build the index

```sh
code-index all   # parse → generate → build → embed
```
### 4. Add to Claude Code

Add the MCP server to your project's `.mcp.json`:

```json
{
  "mcpServers": {
    "code-index": {
      "command": "npx",
      "args": ["@jonyoder/code-index-mcp"]
    }
  }
}
```
Claude Code will use the `code_search` tool proactively when working in your codebase.
## How it works

```
parse → generate → build → embed  → search
 AST    LLM docs   JSON   vectors   query
```
- **Parse** — extracts functions, types, and classes from source files using language-specific parsers
- **Generate** — creates LLM summaries for every function (fast model) and file/package (quality model) via your configured LLM provider
- **Build** — combines AST data and summaries into a searchable index
- **Embed** — generates vector embeddings, stored in a SQLite database with sqlite-vec
- **Search** — embeds your query, finds the closest vectors, and returns results with signatures, summaries, and source locations
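At its core, the search step is a nearest-neighbor lookup over embedding vectors. A minimal in-memory sketch of that idea in Go (the names `doc`, `cosine`, and `search` are hypothetical; the real tool keeps vectors in SQLite via sqlite-vec rather than sorting them in memory):

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// doc pairs a summarized code entity with its embedding (hypothetical type).
type doc struct {
	Name      string
	Embedding []float64
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// search ranks docs by similarity to the query embedding, most similar first.
func search(query []float64, docs []doc) []doc {
	sort.Slice(docs, func(i, j int) bool {
		return cosine(query, docs[i].Embedding) > cosine(query, docs[j].Embedding)
	})
	return docs
}

func main() {
	// Toy 3-dimensional embeddings; real models produce hundreds of dimensions.
	index := []doc{
		{"StringInSlice", []float64{0.9, 0.1, 0.0}},
		{"BeginTransaction", []float64{0.0, 0.2, 0.9}},
	}
	query := []float64{0.8, 0.2, 0.1} // embedding of "check if string is in slice"
	fmt.Println(search(query, index)[0].Name)
}
```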
## Documentation
## License

MIT — see `LICENSE`.